April 20, 2014


New Research Result: Bubble Forms Not So Anonymous

Today, Joe Calandrino, Ed Felten and I are releasing a new result regarding the anonymity of fill-in-the-bubble forms. These forms, popular for their use with standardized tests, require respondents to select answer choices by filling in a corresponding bubble. Contradicting a widespread implicit assumption, we show that individuals create distinctive marks on these forms, allowing use of the marks as a biometric. Using a sample of 92 surveys, we show that an individual’s markings enable unique re-identification within the sample set more than half of the time. The potential impact of this work is as diverse as use of the forms themselves, ranging from cheating detection on standardized tests to identifying the individuals behind “anonymous” surveys or election ballots.

If you’ve taken a standardized test or voted in a recent election, you’ve likely used a bubble form. Filling in a bubble doesn’t provide much room for inadvertent variation. As a result, the marks on these forms superficially appear to be largely identical, and minor differences may look random and not replicable. Nevertheless, our work suggests that individuals may complete bubbles in a sufficiently distinctive and consistent manner to allow re-identification. Consider the following bubbles from two different individuals:

These individuals have visibly different stroke directions, suggesting a means of distinguishing between the two. While variation between bubbles may be limited, stroke direction and other subtle features permit differentiation between respondents. If we can learn an individual’s characteristic features, we may use those features to identify that individual’s forms in the future.

To test the limits of our analysis approach, we obtained a set of 92 surveys and extracted 20 bubbles from each of those surveys. We set aside 8 bubbles per survey to test our identification accuracy and trained our model on the remaining 12 bubbles per survey. Using image processing techniques, we identified the unique characteristics of each training bubble and trained a classifier to distinguish between the surveys’ respondents. We applied this classifier to the remaining test bubbles from a respondent. The classifier orders the candidate respondents based on the perceived likelihood that they created the test markings. We repeated this test for each of the 92 respondents, recording where the correct respondent fell in the classifier’s ordered list of candidate respondents.
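To make the evaluation protocol concrete, here is a minimal sketch of the rank-based test described above. It is not our actual pipeline: the feature vectors are synthetic, the nearest-centroid ranking is a stand-in for the real classifier, and every name in it (`make_synthetic`, `rank_of_correct`, `top_k_accuracy`, the `dim` and `noise` parameters) is hypothetical. Only the shape of the experiment follows the post: 92 respondents, 20 bubbles each, 12 for training and 8 held out.

```python
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_of_correct(train, test, respondent):
    """1-based position of `respondent` when all candidates are ordered by
    how close their training centroid is to the held-out bubbles' centroid."""
    probe = centroid(test[respondent])
    ordered = sorted(train, key=lambda r: sq_dist(centroid(train[r]), probe))
    return ordered.index(respondent) + 1


def top_k_accuracy(ranks, k):
    """Fraction of trials in which the correct respondent is in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def make_synthetic(n_respondents=92, n_bubbles=20, n_train=12,
                   dim=8, noise=1.0, seed=0):
    """ASSUMPTION: each respondent has a fixed 'style' vector and every
    bubble is that style plus Gaussian noise. Real bubble features differ."""
    rng = random.Random(seed)
    train, test = {}, {}
    for r in range(n_respondents):
        style = [rng.gauss(0, 1) for _ in range(dim)]
        bubbles = [[s + rng.gauss(0, noise) for s in style]
                   for _ in range(n_bubbles)]
        train[r], test[r] = bubbles[:n_train], bubbles[n_train:]
    return train, test


train, test = make_synthetic()
ranks = [rank_of_correct(train, test, r) for r in train]
for k in (1, 3, 10):
    print(f"top-{k} accuracy: {top_k_accuracy(ranks, k):.2f}")
```

The numbers this prints depend entirely on the synthetic noise level and say nothing about real forms. The point is only the bookkeeping: one ranked list per respondent, and a top-k accuracy computed from where the correct respondent falls in each list.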

If bubble marking patterns were completely random, a classifier could do no better than randomly guessing a test set’s creator, with an expected accuracy of 1/92 ≈ 1%. Our classifier achieves over 51% accuracy. The classifier is rarely far off: the correct answer falls in the classifier’s top three guesses 75% of the time (vs. 3% for random guessing) and its top ten guesses more than 92% of the time (vs. 11% for random guessing). We conducted a number of additional experiments exploring the information available from marked bubbles and potential uses of that information. See our paper for details.
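The random-guessing baselines quoted above follow from simple arithmetic: if the ordering of 92 candidates is uniformly random, the correct respondent lands in the top k with probability k/92. A short sketch of that calculation (the function name `random_top_k` is just for illustration):

```python
from fractions import Fraction

N_RESPONDENTS = 92  # size of the survey sample in the post


def random_top_k(k, n=N_RESPONDENTS):
    """Probability that the correct respondent appears in the top k
    of a uniformly random ordering of n candidates."""
    return Fraction(min(k, n), n)


for k in (1, 3, 10):
    print(f"top-{k}: {float(random_top_k(k)):.1%}")
# top-1: 1.1%
# top-3: 3.3%
# top-10: 10.9%
```

Rounded to whole percentages, these are the 1%, 3%, and 11% figures given in the paragraph above.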

Additional testing—particularly using forms completed at different times—is necessary to assess the real-world impact of this work. Nevertheless, the strength of these preliminary results suggests both positive and negative implications depending on the application. For standardized tests, the potential impact is largely positive. Imagine that a student takes a standardized test, performs poorly, and pays someone to repeat the test on his behalf. Comparing the bubble marks on both answer sheets could provide evidence of such cheating. A similar approach could detect third-party modification of certain answers on a single test.

The possible impact on elections using optical scan ballots is more mixed. One positive use is to detect ballot box stuffing—our methods could help identify whether someone replaced a subset of the legitimate ballots with a set of fraudulent ballots completed by herself. On the other hand, our approach could help an adversary with access to the physical ballots or scans of them to undermine ballot secrecy. Suppose an unscrupulous employer uses a bubble form employment application. That employer could test the markings against ballots from an employee’s jurisdiction to locate the employee’s ballot. This threat is more realistic in jurisdictions that release scans of ballots.

Appropriate mitigation of this issue is somewhat application specific. One option is to treat surveys and ballots as if they contain identifying information and avoid releasing them more widely than necessary. Alternatively, modifying the forms to mask marked bubbles can remove identifying information but, among other risks, may remove evidence of respondent intent. Any application demanding anonymity requires careful consideration of options for preventing creation or disclosure of identifying information. Election officials in particular should carefully examine trade-offs and mitigation techniques if releasing ballot scans.

This work provides another example in which implicit assumptions resulted in a failure to recognize a link between the output of a system (in this case, bubble forms or their scans) and potentially sensitive input (the choices made by individuals completing the forms). Joe discussed a similar link between recommendations and underlying user transactions two weeks ago. As technologies advance or new functionality is added to systems, we must explicitly re-evaluate these connections. The release of scanned forms combined with advances in image analysis raises the possibility that individuals may inadvertently tie themselves to their choices merely by how they complete bubbles. Identifying such connections is a critical first step in exploiting their positive uses and mitigating negative ones.

This work will be presented at the 2011 USENIX Security Symposium in August.

Comments

  1. Anonymous says:

    Here in Hong Kong we are required to complete election ballots by using rubber stamps instead of pens or pencils. I guess it is good for anonymity.

  2. Anonymous says:

    Why not use bingo daubers? As long as these were all the same colour and so on, I would assume that each form would become less distinguishable.

  3. JRD says:

    This sounds like the kind of research that could have won Senator Proxmire’s Golden Fleece Award if any public money was used to finance it. Surely of all the ways to corrupt an election, the thought that an employer would dig out his old application forms and use them to determine how his workers voted is the least of our worries.

    I’ll grant your uses of verifying a test taker, or of spotting ballot box stuffing, but 92 votes would be a pretty small precinct. Anyone with time to spend trying to identify voters this way surely has time to think up a better way to swing elections.

    • RonK says:

      You obviously don’t have any idea how academic research works, and how it benefits the public good. Even a totally negative result would actually be interesting for use by people wanting to make public policy decisions (like what kind of election protocols might be more or less secure).

      Proxmire’s award was the ultimate in hypocrisy, given that the amount of public money wasted by politicians has to be an order of magnitude (or two) greater than that wasted by academic researchers. Interestingly enough, it’s often academic researchers who are pointing out waste and inefficiency in public policy legislated by politicians, this blog being a prime example.

    • brink says:

      “Anyone with time to spend trying to identify voters this way…”

      Computers have all the time in the world to work endlessly on boring, repetitive, monotonous work which the Programmer only had to enter *once*…

    • emk says:

      In something like an election all that would be required for fraud to work is the potential for your vote to be identifiable.

      If a nasty character gives people money to vote a particular way and there is a credible possibility that their ballots are identifiable, those voters would likely feel compelled to vote the way they are directed.

      The nasty character doesn’t have to actually identify the voter’s ballot. Just make a credible threat.

      emk

  4. rp says:

    So in any release of bubble-form images it would make perfect sense to replace filled-in bubbles with generics at least part of the time.
    I’d also like to see this work repeated with bubbles that are filled in by ink instead of pencil.

  5. Carolyn Crnich says:

    The promise of a verifiable election far outweighs the remote possibility that a voter’s anonymity could be compromised by the examination of their marks. The employer going to the trouble of applying this identification process would probably have to examine several hundred specimens in the hope of locating just one ballot. Further, California law does not allow for identifying marks on ballots, whereas some states require a serial number on each ballot. California Election Code 14287: No voter shall place any mark upon a ballot that will make that ballot identifiable.
    According to this research, every voter in California who has voted using a paper ballot has violated the law. The Courts think our correctional facilities are overcrowded now…

    • joehall says:

      Hi Carolyn, Joe Hall here from UC Berkeley and Princeton. We’ll have to disagree as to whether or not a verifiable election requires publishing scanned ballot images that would allow this kind of bubble-identification. But I take the point that many citizens appreciate having public images of scanned ballots.

      I don’t think it’s as clear-cut as you say (that this finding means that California voters who fill-in opscans are violating CA EC 14287 as they are making distinguishing marks). Of course, you quote the statute correctly, but there are a series of legal cases that have interpreted this section of CA law as requiring the voter to *intend* to make a distinguishing mark in order to identify themselves before their ballot can be invalidated under this statute. A case from 1904 came to a decision interpreting exactly this: In Chubbuck v. Wilson (1904) (142 Cal 599, 76 P 1126, 1904 Cal LEXIS 986), the CA Supreme Court upheld an Appeals court ruling that held a distinguishing mark could not invalidate a ballot unless it appeared that the voter made such a mark with “the purpose of identifying the ballot”. So, ordinary voters shouldn’t worry too much about this law unless they’re intending to identify their ballots with a distinguishing mark.

  6. PrometheeFeu says:

    I would assume you need a fairly high resolution scan in order to do better than 50%. We could just scan at a relatively low resolution.

    • joehall says:

      I’ll let Joe and Will answer more definitively if they’d like (not sure they have results on varying dpi vs. reidentification success), but there’s quite a bit of structure in the images above and they’re not terribly high resolution. One definitely can do useful things with low-res images (e.g., count dots) and lower-resolution might also make paper-fingerprinting much more challenging.

    • jcalandr says:

      We expected the same thing, but accuracy seems to decrease only marginally and gradually with decreased resolution. Surprisingly, our first guess is correct more than 35% of the time even at 48 DPI, a resolution at which the text can be hard to read. The paper has more numbers at other resolutions—we used 1200 DPI for the numbers specified in this post but performance is nearly as strong at 300 DPI or even 150 DPI.

  7. brink says:

    Enter the Bingo Marker to the world of standardized testing!

    • rp says:

      I just voted in a local special election yesterday and was reminded that around here the voting booths are supplied with fairly fat-nosed deep black felt tip pens. There’s no question that there would be enough of a market for election-form-compatible markers.

      You could probably still get some identifying characteristics if you really worked on it.

      Meanwhile, pencil will keep being used for exams, where erasability is important (and, coincidentally where being able to identify the person who made the marks may be a useful feature).