Hi folks, Eric Bidstrup here.
We interrupt our regular schedule of blog postings to offer this special post for “Super Tuesday” given the subject matter. Hope you enjoy…
This year is a presidential election year in the United States. Selecting a new president is perhaps the ultimate example of the importance of having a trustworthy election process. There have been some well chronicled examples of elections with extremely close results, where the winner’s margin of victory was perhaps smaller than the election system’s margin of error. The term “hanging chads,” from the 2000 U.S. presidential election, is now part of the American vocabulary, and locally here in Washington State our last gubernatorial election in 2004 required three recounts, with the final winner being determined by a margin of only 129 votes, or 0.0045% of the popular vote.
The populace demands confidence that, even in close elections, the election result accurately reflects the voters’ intent. In theory, such precision can be improved by using computers and technology.
However, it seems that every recent election season brings stories in the media about security concerns regarding voting machines and their software. A recent New York Times article provides a good overview of voting machine security concerns, and academic studies on voting systems last year in California, Connecticut, Florida, and Ohio provide some interesting insights about security concerns and vulnerabilities in voting systems from several vendors.
These analyses are fascinating to us, because they offer an opportunity to see how a set of experts look at products other than ours. Applied security researchers often analyze our products, and often share their processes and tools with us, but it’s rare to see a top-to-bottom product review released. In California, there was both white and black box testing done by different teams, and we’ve studied these reports to see the perceptions of development practices from other vendors and results of a different type of review process.
Something my colleagues and I find very interesting is that many of the vulnerabilities noted in these reports could have been prevented by following the requirements in Microsoft’s Security Development Lifecycle. The studies performed in California (prepared at UC Berkeley but created by teams of academics from across the United States) included detailed source code analysis. I’ll pick out a few examples from those studies and describe them here. (Note: I’m deliberately picking a few examples from each vendor assessed in the study. I am not attempting to criticize any specific vendor, but rather am trying to illustrate areas where application of the SDL could help contribute towards society’s need for trustworthy computing in a very visible and important application.)
Let’s start with the Source Code Review of the Sequoia Voting System. Two examples from the executive summary are interesting:
“Cryptography. …Many cryptographic functions are implemented incorrectly, based on weak algorithms with known flaws, or used in an ineffective or insecure manner. Of particular concern is the fact that virtually all cryptographic key material is permanently hardcoded in the system (and is apparently identical in all Sequoia hardware shipped to different jurisdictions)…
Software Engineering. …The software suffers from numerous programming errors, many of which have a high potential to introduce or exacerbate security weaknesses. These include buffer overflows, format string vulnerabilities, and type mismatch errors….”
A deeper reading of the cryptographic concerns (page 29 of the report) notes, amongst other issues, the use of a flawed implementation of the SHA hash algorithm and the use of the Data Encryption Standard (DES) algorithm. The SDL has specific policies outlining appropriate selection of cryptographic algorithms. For example, DES is prohibited except for backwards compatibility. The SDL also requires that applications use operating system cryptographic functions and libraries. The cryptography team in the operating systems group is supported by world-class cryptographers who carefully scrutinize the implementation of crypto algorithms, and additionally these operating system functions are formally reviewed and certified by the National Institute of Standards and Technology (NIST) Cryptographic Module Validation Program (CMVP), which validates that cryptographic modules meet Federal Information Processing Standards (FIPS). Most application developers are not cryptographers and hence are unlikely to implement crypto algorithms correctly. The SDL requires the use of standard crypto functions and outlines requirements on algorithm selection, key length, and key management.
Moving to the software engineering concerns: while several common coding and design concerns are noted (e.g., input validation), I want to select one with a bit more subtlety: running code from USB sticks (page 37 of the report). From the report, it appears the code present on the USB sticks is used to program a component (HAAT) of their client (WinEDS) to prepare for a specific election. The valid concern noted by the study is that USB sticks used by WinEDS to configure the HAAT are implicitly trusted to have appropriate authorization to program the voting devices for an election, and that a formal authorization framework didn’t appear to be present. The implication, as stated in the report, is: “If such a stick is used in a HAAT that has been compromised by an attacker, or an attacker can provide a maliciously modified USB stick in place of a legitimate one, the attacker could surreptitiously take complete control over the WinEDS client”. Basically, this is a potential “rootkit” for election systems. A threat model, a fundamental design requirement of the SDL, could help uncover such design issues and illustrate the need for mitigations.
Now, let’s turn to the Source Code Review of the Hart InterCivic Voting System. I’ll try to keep my commentary balanced by selecting two examples here as well:
From the executive summary:
“Unsecured network interfaces … Voters can connect to unsecured network links in a polling place to subvert eSlates, as well as to eavesdrop on cast votes and to inject new votes. Poll workers can connect to JBCs or eScans over the management interfaces and perform back-office functions such as modifying the device software. The impact of this is that a malicious voter could potentially take over one or more eSlates in a precinct and a malicious poll worker could potentially take over all the devices in a precinct. …
Failure to protect ballot secrecy Hart’s system fails to adequately protect ballot secrecy…”
The concerns about unsecured network interfaces are discussed in the context of authentication and least privilege (pages 24-25). While that is certainly a reasonable perspective, with the SDL we take a broader view and require all teams to threat model the attack surface of the software being developed. Attack surface is the enumeration of all possible entry points that an attacker could use to compromise software (code listening on network interfaces, code that accepts data from external sources, etc.). The SDL requires development teams both to minimize attack surface in the software they are building and to consider attacks from each entry point on the attack surface to ensure that mitigations are present. These examples suggest that the development teams didn’t adopt such a systematic approach, or failed to think about mitigations for each possible attack if they did.
Ballot secrecy is an example where security and privacy concerns intersect. Many people confuse security and privacy, and both are fundamental to trust. Privacy addresses a wide variety of concerns about many types of data (such as Personally Identifiable Information (PII), ballot data, etc.), how it’s handled (gathered, transmitted, stored, and disposed of), and what rights and expectations different stakeholders may have regarding that data. (Tina Knutson gave a great overview of these issues in a previous blog posting, “Privacy is not just about data security“). Security provides the mechanisms, policies, and practices to enforce privacy requirements. Given the intertwined nature of these issues, both are addressed in the SDL.
The concerns about vote storage (section 6.8, page 58 of the report) illustrate some classic challenges in software security and privacy arising from weak random number generation. Randomization is important here since it controls how votes are stored in memory, and weak randomization enables someone to reverse engineer how individual voters voted by examining the aggregate tally of votes (which can be found on the Mobile Ballot Boxes, “MBB”) in conjunction with the audit log. The MBB has mitigations in place to protect the integrity of votes (against tampering), but doesn’t appear to protect against information disclosure. The SDL cryptographic policies also cover correct random number generation. Fully considering all the ways in which data can be reverse engineered, contextualized (the order of log entries providing information that can be linked to individuals’ choices), and correlated with other data sources is a growing challenge. The SDL privacy policies call attention to these issues, but it’s still a challenge.
Next, let’s look at the Source Code Review of the Diebold Voting System. Again, I’ll pick two subjects.
“Vulnerability to malicious software: The Diebold software contains vulnerabilities that could allow an attacker to install malicious software on voting machines or on the election management system…
Vulnerability to malicious insiders: The Diebold system lacks adequate controls to ensure that county workers with access to the GEMS central election management system do not exceed their authority….”
Let’s look at the “Malicious Software” concern first. While there’s a lot of discussion of general concerns with viruses and malicious payloads, I’d like to drill down on a specific case noted in section 4.2.3 (page 29). The typical concerns around string handling in C/C++ and buffer overflows are mentioned. What is interesting is that in many places this system uses the Microsoft Foundation Classes (MFC) CString class to help mitigate such concerns. The problem noted is that this practice is not consistently followed; in fact, one function makes calls to both CString *and* a standard C string library. So it appears the engineering team had the right idea in trying to remove calls to potentially risky C string library functions (just as required in the SDL), but they weren’t able to apply it consistently and completely.
Regarding the executive summary concern about malicious insiders, I’m inclined to attribute it to what’s described in section 4.3 on page 30: “No formal threat model or security plan” and “No formal security training”. Both of these are pivotal elements of the SDL. Several comments are offered to the effect that “security measures that are in place appeared to be ad hoc” and “When new developers arrive at the company, they do not receive any kind of security training”. We’ve blogged here in the past about the importance of both areas, so I won’t repeat that here. (See Adam’s Threat Modeling series and Dave’s “Security Education v. Security Training” posts respectively for more info.)
Is the SDL enough to ensure trustworthy voting systems?
When I offered this blog post for the review of my colleagues, it generated some very interesting discussion. Some of my colleagues were worried that I would misrepresent the SDL as a panacea for creating perfectly trustworthy voting systems. Let me be clear: this is absolutely NOT the case. While the SDL could help prevent many of the problems identified in these studies, it’s worth noting that election systems have a number of unusual and unique requirements. For example, voters cannot review their voting records as they would their banking records to ensure that no fraud has been committed, since the ability to do so would typically enable vote-selling and coercion. Alternate techniques are therefore required to allow voters to verify that their votes have been properly counted. Such requirements force the adoption of “extraordinary” techniques that go beyond those of secure software engineering. Furthermore, society’s expectations of the trustworthiness of voting systems are much greater than for other types of software (for example, the latest Xbox game title). I’ll further explore differences in how different people think about “degrees of trustworthiness” (aka “assurance” or “robustness”) in a future posting.
Let me wrap up by saying this: building secure software is difficult. Prior to the advent of Trustworthy Computing and the Security Development Lifecycle here at Microsoft, I’d bet that many of the issues noted in these reports would have applied to earlier Microsoft products too. Some might think I’m throwing stones while living in a glass house, but that is not my intent. While Microsoft products are not vulnerability free, we continue to systematically analyze the sources of vulnerabilities in our software. We continue to modify our engineering practices and tools to better identify potential vulnerabilities and mitigate them before software is released. With increasing awareness of and concerns over the trustworthiness of computers in general, the entire industry needs to improve. Given the importance of how we choose to organize ourselves as a society and elect representatives to govern us, voting systems are a great place to step up, both in the context of the computing industry and to better serve society.
I believe many of the issues found in these voting systems would not have entered them if the SDL had been used to design and build the systems.