April 20, 2014


The Silver Effect: What We Can Learn from Poll Aggregators

For those who now think Nate Silver is god, here’s a question: Can Nate Silver make a prediction so accurate that Nate Silver himself doesn’t believe it?

Yes, he can–and he did. Silver famously predicted the results of Election 2012 correctly in every state. Yet while his per-state predictions added up to the 332 electoral votes that Obama won, Silver himself predicted that Obama’s expected electoral vote total was only 313. Why? Because Silver predicted that Silver would get some states wrong. Unpacking this (pseudo-)paradox can help us understand what we can and can’t learn from the performance of poll aggregators like Nate Silver and Princeton’s Sam Wang in this election.
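The gap between the sum of the per-state calls and the expected total can be reproduced with a toy calculation. The sketch below uses an invented five-"state" map; the probabilities are hypothetical and are not Silver's forecasts. It also assumes state outcomes are independent, which real forecasts do not.

```python
# Hypothetical toy map: name -> (electoral votes, probability the candidate wins).
# All numbers are invented for illustration only.
states = {
    "Safe A": (200, 0.99),
    "Lean B": (60, 0.80),
    "Lean C": (40, 0.70),
    "Tossup D": (32, 0.55),
    "Opponent E": (206, 0.02),
}


def called_total(states):
    """Electoral votes if every state goes to its more likely winner."""
    return sum(ev for ev, p in states.values() if p > 0.5)


def expected_total(states):
    """Probability-weighted average of the candidate's electoral votes."""
    return sum(ev * p for ev, p in states.values())


def prob_all_calls_correct(states):
    """Chance that every call is right, ASSUMING independent states."""
    prob = 1.0
    for _, p in states.values():
        prob *= max(p, 1 - p)
    return prob


print("sum of per-state calls:", called_total(states))
print("expected electoral votes:", round(expected_total(states), 2))
print("chance all calls are right:", round(prob_all_calls_correct(states), 3))
```

With these made-up inputs the calls add up to 332 while the expected total is about 295.7, and the chance of getting every call right is only about 30%. A forecaster who trusts their own probabilities should expect to miss somewhere, which is why the expected total sits below the sum of the calls.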


Voting technology issues in Virginia on election day

I spent Election Day in one of the command centers for the 866-OUR-VOTE hotline. The command center was accepting calls from New Jersey, Maryland, DC, and Virginia, but 95% of the technology issues were from Virginia. I was the designated “technology guy”, so pretty much everything that came through that center came to me. This gave me a pretty good perspective on the scope of issues. (I don’t know about the non-technology issues, although I heard discussions of issues like demanding more ID than is required, voter intimidation, etc.)

Following is a summary of what I saw. What’s most interesting is that if you divide things into “easy to solve” and “hard to solve”, the “easy to solve” ones are all in places using optical scan, and the “hard to solve” are all in places using DREs (colloquially known as “touch screens”, although not all of them are).


Get Out the Vote, Cee-Lo Style?

This semester, Ed Felten and I are teaching a Freshman Seminar called “Facebook: The Social Impact of Social Networks.” This week, the class is discussing a recent article published in the journal Nature, entitled “A 61-Million-Person Experiment in Social Influence and Political Mobilization”. The study reveals that if Facebook shows you a list of your closest friends who have voted, you are more likely to do so yourself. It is a fascinating read both because it is probably the first very-large-scale controlled test of social influence via online social networks, and because it appears that without much work the company was able to spur about 340,000 extra people to vote in the 2010 midterm elections.

I confess that last night I watched some of the wildly popular reality TV competition The Voice. What can I say? The pyrotechnics were more calming than the amped-up CNN spin-zoners. It was the first day that the at-home audience began voting for their favorites. Carson Daly mentioned that the show would take the requisite break on Election Night, but return in force on Wednesday. (Incidentally, I can’t decide whether or not this video urging us to “vote Team Cee-Lo” is too clever by half).


Grading the absentee-in-person experience in Virginia

[Each year, I write a "my day as a pollworker" report. This year, I'm not a pollworker, or election officer in Virginia parlance, for a variety of reasons, so I decided to write about my voting experience.]

I just got back from “in-person absentee voting”. This is similar to but not the same as early voting – in Virginia, it’s still absentee voting, but you do it by going to a central polling place (there are almost a dozen in Fairfax, which is a very geographically large and populous county). And you have to have one of a dozen reasons (e.g., you’ll be out of the county on business or pleasure, you’re disabled, pregnant, incarcerated awaiting trial, …) – you can’t just do it because it’s more convenient. See Code of Virginia 24.2-700 for all of the acceptable reasons.

My goal, besides the actual act of voting, was twofold. First, Virginia has new voter ID laws, and I wanted to see whether pollworkers had been trained to know what the new laws are. And second, Fairfax County by policy is supposed to offer voters the choice of “paper or plastic” – optical scan or DRE, and I wanted to see how that happened. (I know how it has happened in the past in my precinct, because I was responsible for ensuring that we followed the rules, but wanted to see how it was done in this environment.)


New Jersey Voting in the Aftermath of Hurricane Sandy

Hurricane Sandy has disrupted many aspects of life here in New Jersey. Even beyond the physical destruction, the state’s infrastructure is still coming back on line. Many homes are still without power and heat, and some roads are closed. Schools were closed all of last week, and some will be closed for longer.

Sandy has also disrupted plans for Tuesday’s election. The election cannot be rescheduled, so we have to find a way to let people vote. Here in Princeton, 63% of the voting districts will vote in temporary, relocated polling places.

In response to the electoral challenges, New Jersey Lieutenant Governor Kim Guadagno has issued three orders (1, 2, 3), decreeing changes in voting procedures:


NJ Lt. Governor invites voters to submit invalid ballots

On November 3rd, the Lieutenant Governor of New Jersey issued a directive, well covered in the media, permitting storm-displaced New Jersey voters to vote by e-mail.  The voter is to call or e-mail the county clerk to request an absentee ballot by e-mail or fax, then the voter returns the ballot by e-mail or fax:

“The voter must transmit the signed waiver of secrecy along with the voted ballot by fax or e-mail for receipt by the applicable county board of election no later than November 6, 2012 at 8 p.m.”

We see already one problem:  The loss of the secret ballot.  At many times in the 20th century, NJ political machines put such intense pressure on voters that the secret ballot was an important protection.  In 2012 it’s in the news that some corporations are pressuring their employees to vote in certain ways.  The secret ballot is still critical to the functioning of democracy.

But there’s a much bigger problem with Lt. Gov. Kim Guadagno’s directive:  If voters and county clerks follow her instructions, their votes will be invalid.


Oak Ridge, spear phishing, and i-voting

Oak Ridge National Labs (one of the US national energy labs, along with Sandia, Livermore, Los Alamos, etc.) had a bunch of people fall for a spear phishing attack (see articles in Computerworld and many other descriptions). For those not familiar with the term, spear phishing means sending emails targeted at specific recipients and designed to get them to take an action (e.g., click on a link) that will install some form of software (e.g., to allow stealing information from their computers). This is distinct from spam, where the goal is primarily to get you to purchase pharmaceuticals, or maybe install software, but which in any case is sent widely and not targeted at particular victims. Spear phishing is the same technique used in the Google Aurora (and related) cases last year, the RSA case earlier this year, Epsilon a few weeks ago, and doubtless many others that we haven’t heard about. Targets of spear phishing might be particular people within an organization (e.g., executives, or people on a particular project).

In this posting, I’m going to connect this attack to Internet voting (i-voting), by which I mean casting a ballot from the comfort of your home using your personal computer (i.e., not a dedicated machine in a precinct or government office). My contention is that in addition to all the other risks of i-voting, one of the problems is that people will click links targeted at them by political parties, and will try to cast their vote on fake web sites. The scenario is that operatives of the Orange party send messages to voters who belong to the Purple party claiming to be from the Purple party’s candidate for president and giving a link to a look-alike web site for i-voting, encouraging voters to cast their votes early. The goal of the Orange party is to either prevent Purple voters from voting at all, or to convince them that their vote has been cast and then use their credentials (i.e., username and password) to have software cast their vote for Orange candidates, without the voter ever knowing.

The percentage of users who fall prey to targeted attacks has been a subject of some controversy. While the percentage of users who click on spam emails has fallen significantly over the years as more people are aware of them (and as spam filtering has improved and mail programs have improved to no longer fetch images by default), spear phishing attacks have been assumed to be more effective. The result from Oak Ridge is one of the most significant pieces of hard data in that regard.

According to an article in The Register, of the 530 Oak Ridge employees who received the spear phishing email, 57 (about 11%) fell for the attack by clicking on a link (which silently installed software on their computers using a security vulnerability in Internet Explorer that was patched earlier this week – but presumably the patch wasn’t installed yet on their computers). Oak Ridge employees are likely to be well-educated scientists (but not necessarily computer scientists) – and hence not representative of the population as a whole. The fact that this was a spear phishing attack means that it was probably targeted at people with access to sensitive information, whether administrative staff, senior scientists, or executives (but probably not the person running the cafeteria, for example). Whether the level of education and access to sensitive information makes them more or less likely to click on links is something for social scientists to assess – I’m going to take it as a data point and assume a range of 5% to 20% of targets will click on a link in a spear phishing attack (i.e., that it’s not off by more than a factor of two).
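As a quick check on where the 5% to 20% range comes from, here is a minimal sketch that computes the observed click rate from the two reported figures and then halves and doubles it. The variable names are mine; only the counts 530 and 57 come from the article.

```python
recipients = 530  # employees who received the email, per the article
clicked = 57      # employees who clicked on the link

observed = clicked / recipients
low, high = observed / 2, observed * 2  # "off by no more than a factor of two"

print(f"observed click rate: {observed:.1%}")
print(f"factor-of-two band:  {low:.1%} to {high:.1%}")
```

The band comes out slightly wider than 5% to 20%, so the range used in this post is a rounded version of it.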

So as a working hypothesis based on this actual result, I propose that a spear phishing attack designed to draw voters to a fake web site to cast their votes will succeed with 5-20% of the targeted voters. With UOCAVA (military and overseas voters) representing around 5% of the electorate, I propose that a target of impacting 0.25% to 1% of the votes is not an unreasonable assumption. Now if we presume that the race is close and half of them would have voted for the “preferred” candidate anyway, this allows a spear phishing attack to capture an additional 0.125% to 0.5% of the vote.

If i-voting were to become more widespread – for example, to be available to any absentee voter – then these numbers double, because absentee voters are typically 10% of all voters. If i-voting becomes available to all voters, then we can guess that 5% to 20% of ALL votes can be coerced this way. At that point, we might as well give up elections, and go to coin tossing.
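The arithmetic behind these estimates is easy to lay out. The sketch below multiplies the share of the electorate able to use i-voting by the assumed click rate, and then halves the result for voters who would have backed the attacker's candidate anyway. The function and scenario names are hypothetical; the percentages are the ones assumed in this post.

```python
CLICK_RATES = (0.05, 0.20)  # assumed spear phishing success range
SCENARIOS = {               # share of the electorate able to use i-voting
    "UOCAVA only": 0.05,
    "all absentee voters": 0.10,
    "all voters": 1.00,
}


def impacted(eligible_share, click_rate):
    """Fraction of all votes cast through the fake site."""
    return eligible_share * click_rate


def captured(eligible_share, click_rate, already_favorable=0.5):
    """Fraction of all votes actually gained, after removing voters
    who would have chosen the attacker's candidate anyway."""
    return impacted(eligible_share, click_rate) * (1 - already_favorable)


for name, share in SCENARIOS.items():
    i_lo, i_hi = (impacted(share, r) for r in CLICK_RATES)
    c_lo, c_hi = (captured(share, r) for r in CLICK_RATES)
    print(f"{name}: impacted {i_lo:.2%} to {i_hi:.2%}, "
          f"captured {c_lo:.3%} to {c_hi:.3%}")
```

The 50% split is the close-race assumption from above; changing `already_favorable` shows how sensitive the captured share is to it.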

Considering the vast sums spent on advertising to influence voters, even for the very limited UOCAVA population, spear phishing seems like a very worthwhile investment for a candidate in a close race.


Tinkering with Disclosed Source Voting Systems

As Ed pointed out in October, Sequoia Voting Systems, Inc. (“Sequoia”) announced then that it intended to publish the source code of their voting system software, called “Frontier”, currently under development. (Also see EKR‘s post: “Contrarianism on Sequoia’s Disclosed Source Voting System”.)

Yesterday, Sequoia made good on this promise and you can now pull the source code they’ve made available from their Subversion repository here:
http://sequoiadev.svn.beanstalkapp.com/projects/

Sequoia refers to this move in its release as “the first public disclosure of source code from a voting systems manufacturer”. Carefully parsed, that’s probably correct: there have been unintentional disclosures of source code (e.g., Diebold in 2003) and I know of two other voting industry companies that have disclosed source code (VoteHere, now out of business, and Everyone Counts), but these were either not “voting systems manufacturers” or the disclosures were not available publicly. Of course, almost all of the research systems (like VoteBox and Helios) have been truly open source. Groups like OSDV and OVC have released or will soon release voting system source code under open source licenses.

I wrote a paper ages ago (2006) on the use of open and disclosed source code for voting systems and I’m surprised at how well that analysis and set of recommendations has held up (the original paper is here, an updated version is in pages 11–41 of my PhD thesis).

The purpose of my post here is to highlight one point of that paper in a bit of detail: disclosed source software licenses need to have a few specific features to be useful to potential voting system evaluators. I’ll start by describing three examples of disclosed source software licenses and then talk about what I’d like to see, as a tinkerer, in these agreements.

The definition of an open source software product is relatively simple: for all practical purposes, anything released under an OSI-approved software license is open source, especially in the sense that one who downloads the source code will have wide latitude to copy, distribute, modify, perform, etc. the source code. What we refer to as disclosed source software is publicly released under a more restrictive license.

Three Disclosed Source Licenses

We have at least three examples of these kinds of licenses from voting systems applications:

  • Sequoia: Sequoia’s license is a good place to start, due to its relative simplicity. It grants the user a limited copyright and patent license for “reference use”, which is defined as (emphasis added):

    “Reference use” means use of the software within your company as a reference, in read only form, for the sole purposes of reviewing and inspecting the code. In addition, this license allows inspection of the code to enhance, or develop ancillary software or hardware products to enhance interoperability of your products with the software. Distribution of this source is allowed provided that this license, and all copyright notices remain in-tact.

    (If you’re a developer and you suspect that you’ve seen this license before, you might have. It appears identical to Microsoft’s Reference Source License (Ms-RSL), which is used to distribute things like the .NET Framework under their shared source program. That makes a good deal of sense since Frontier is written in .NET!)

  • Everyone Counts [1]: Everyone Counts’ (E1C) license is curious. (Unfortunately, I don’t see a copy of this license posted publicly, so I’ll just quote from it.) I say curious mostly because of the flowery language it includes, such as (emphasis added):

    The sources are provided for scrutiny in the same spirit as any member of the public may apply and obtain escorted access to a public election, and is entitled free and open access to election processes to satisfy his or herself of the veracity of the claims of electoral officers and that the process upholds democratic principles. In the same way, the elections observer is not permitted to capture and publish or otherwise make public the processes of the physical count at an election, the same applies for these sources.

    I’m not sure what that last part is about; it’s pretty well accepted—in the U.S.—that “making public the processes of the physical count” is a basic requirement of democratic elections.

    Anyway, on to the substance: It’s a pretty simple license, although more complex than the Sequoia license. The core of this license allows the user to “examine, compile and execute the resulting bytecodes” from the Java source code. It specifies that the user is allowed these rights for the purpose of “analysis forming electoral scrutiny”, which is a difficult phrase to parse. The license suffers from a lot of such wording problems, which make it pretty hard to understand.

  • VoteHere [2]: The VoteHere license agreement is considerably more complex and looks more like a commercial software license agreement. My favorite part, of course, is:

    TO AVOID ANY DOUBT, THIS SOFTWARE IS NOT BEING LICENSED ON AN OPEN SOURCE BASIS.

    The central component is that VoteHere restricts all your rights, other than copying and modifying the source code for evaluation purposes, and owns any derivative works you create from the source code. It has some other quirks; for example, the license, despite being a click-wrap license, has a hard term of 60 days after which all copies and such must be destroyed. (Presumably, you could click through again after 60 days, and restart the term.)

What Does an Evaluator Need in a License?

Each of these licenses has its strengths and weaknesses. The Sequoia license doesn’t seem to permit modification of the source code but is relatively simple and allows distribution of Sequoia’s unmodified code. The VoteHere and E1C licenses, however, understand that modification may be necessary to evaluate the software (VoteHere even includes a license to the system’s documentation, an essential but often overlooked part of evaluating source code). The VoteHere license is extremely onerous in that it is very strict and places heavy burdens on the evaluator. The E1C license is flowery and hard to understand, but seems simple at the core and seems to understand what evaluators might need in terms of modifying the code during evaluation.

This raises a good question: What rights, exactly, do evaluators need to examine source code? Practically, it depends on what they want to do. If all they want to do is human line-by-line source code analysis, then a “read-only” license like Sequoia’s is probably fine. However, what about compiling the source code with debugging flags set? What about modifying the software to see how another piece of it performs?

Listed (sort of) in terms of the rights granted by U.S. copyright law, here are some thoughts:

  • Examining: If it doesn’t require making a copy, simply looking at the source code with your eyeballs is not covered by copyright law. So licenses that allow you to “examine” the source code aren’t really granting much, unless they define “examine” to mean something that implicates exclusive rights (which are listed in 17 USC 106).

  • Copying: Of course, downloading and loading source code in an IDE or text editor will make a number of copies, so at the most basic level, evaluators will need to be able to do this. Backup copies are also a necessity these days and VoteHere’s license contemplates this by allowing “reproduc[tion] for temporary archive purposes”.

  • Modification: Evaluators will need some ability to modify the code. This may mean simply compiling it in order to execute it, or taking the next logical step of compiling it with “debugging flags” set (this includes special flags and metadata in the compiled code that allow debugging tools to step through the program, set break points, etc.). Some types of evaluation require minor modifications to integrate the code into an analysis method; a simple example is just changing pieces of the code to see how other pieces will respond or inserting code that prints to the screen (which is a very primitive but useful form of debugging!). Each of these actions creates a derivative work, so that should be explicitly allowed in these licenses.

  • Distribution: At first, you might not think that evaluators would need much in the way of rights to distribute the source code. However, distributing modified works, such as a patch that fixes a bug, could be very useful. Also, being able to share the code if the official means of getting the code goes dark is often useful; for example, having a portal of voting system source code would provide this “mirroring” capability and could also allow creating a “one-stop shop” for tinkerers and researchers who would point research tools at these code repositories for analysis.

  • Performance: Both reading the source code out loud and showing it publicly are also implicated by copyright law. Why would an evaluator want to do this? Well, imagine that you’ve completed an analysis of a disclosed source code product and you want to write up your findings. It’s often useful to include snippets of code. Small snippets would likely be covered by fair use, but it’s always nice to not have to worry about that and have explicit permission to at least “display” and possibly “read aloud” source code in these contexts (think accessible or podcast versions of a report!).

  • Executing: There is a line of legal cases saying that “executing” a program is covered by copyright law because a copy of the object code is loaded into memory at run time. Hopefully, permission to make copies, the first and most basic need of evaluators highlighted above, would be read to include this interpretation of “executing” the code.
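One way to keep track of how the three licenses stack up against this checklist is a small lookup table. The sketch below is only my shorthand for the readings given in this post; it is not taken from the license texts themselves, and anything the post doesn't address is marked as unknown. All names in it are hypothetical.

```python
from typing import Dict, List, Optional

# The six evaluator needs, in the order of the list above.
NEEDS: List[str] = ["examine", "copy", "modify", "distribute", "perform", "execute"]

# True  = this post reads the license as granting the right
# False = this post reads the license as withholding it
# None  = this post does not say
LICENSES: Dict[str, Dict[str, Optional[bool]]] = {
    "Sequoia": {"examine": True, "copy": None, "modify": False,
                "distribute": True, "perform": None, "execute": None},
    "E1C": {"examine": True, "copy": None, "modify": True,
            "distribute": None, "perform": None, "execute": True},
    "VoteHere": {"examine": None, "copy": True, "modify": True,
                 "distribute": False, "perform": False, "execute": None},
}


def gaps(license_name: str) -> List[str]:
    """Needs that the named license does not clearly grant."""
    rights = LICENSES[license_name]
    return [need for need in NEEDS if rights[need] is not True]


for name in LICENSES:
    print(f"{name}: not clearly granted -> {', '.join(gaps(name))}")
```

Treating "unknown" the same as "withheld" is a deliberately cautious choice: an evaluator generally should not rely on a right the license does not clearly grant.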

Outside of these types of exclusive rights, there’s also something to be said for simplicity. The simplicity of the BSD license is a great example: it’s widely used and understood to be very generous and easy to understand. The Sequoia license (being the Ms-RSL license) is very simple and easy to understand. The E1C license is not particularly complex, but its substance is hard to understand (again, apologies that I cannot post the text of that license). The VoteHere license is easy to understand but very complex and extremely onerous in terms of the burden it places on evaluators.

As I finally finish writing this, I’m told Sequoia might be interested in modifying their license. That would be a wonderful idea and I hope these thoughts are useful for modifying it. I do wonder how they’ll be able to modify the license and still distribute parts of the .NET Framework under a new license. Perhaps they’ll specify that the .NET parts are under the Ms-RSL and any Sequoia-sourced source code is under Sequoia’s new license. We’ll see!

[1] Everyone Counts sells internet voting solutions, which are scary as hell to a lot of us.

[2] VoteHere was a company, now seemingly out of business, that made a number of products including a cryptographic add-on module, Sentinel, for the Diebold/Premier AccuVote-TS voting system and an absentee ballot tracking system.