July 14, 2024

Finding a randomly numbered ballot

In my previous posts, I’ve been discussing ballot-level comparison audits, a form of risk-limiting audit. Ballots are imprinted with serial numbers (after they leave the voter’s hands); during the audit, a person must find a particular numbered ballot in a batch of a thousand (more or less).

If the ballot papers are numbered consecutively, that’s not too difficult. But if the serial numbers are in random order, it’s very time-consuming.

An answer to the second puzzle.

So here’s my next idea. Likely I’m not the first to think of it, so I can’t claim much credit. And this idea may or may not be practical; it would need to be tested in practice.

Problem: You have a batch of serial-numbered ballots, and you need to find the one numbered 0236000482.

Take the pile of ballots, and feed them through a high-volume scanner. Scanners that can do 140 pages per minute cost about $6000. The computer attached to the scanner can use OCR (optical character recognition) software just on the corners of the page, to find and recognize the serial number. When it finds the right number, the computer commands the scanner to stop.
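
For a rough sense of how long one search takes, here is a back-of-the-envelope sketch. It assumes a batch of 1000 ballots, a scanner that sustains its rated 140 pages per minute, and a target ballot equally likely to be at any position in the pile; it ignores pauses for misreads. The function name `scan_minutes` is only illustrative.

```python
def scan_minutes(batch_size: int, pages_per_minute: float) -> tuple[float, float]:
    """Return (expected, worst-case) minutes to reach the target ballot.

    Assumes the target is equally likely to be at any position, so on
    average (batch_size + 1) / 2 pages are scanned before the stop.
    """
    worst = batch_size / pages_per_minute
    expected = (batch_size + 1) / 2 / pages_per_minute
    return expected, worst


expected, worst = scan_minutes(1000, 140)
print(f"expected {expected:.1f} min, worst case {worst:.1f} min")
# prints: expected 3.6 min, worst case 7.1 min
```

Under these assumptions each lookup costs a few minutes of scanner time, so whether this is practical depends on how many ballots the audit samples from each batch.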

Then the human auditor can pick up the last-scanned page, and examine it to make sure it’s the right number.

If the OCR software produces a false positive, no harm is done: the human sees that it's the wrong number and resumes the scanner. False negatives are more annoying but still recoverable: the human would have to search through the entire pile by hand. Because we don't rely on the scanner to work perfectly, and because the scanner is not counting or tabulating votes, there's no need to put this equipment through an EAC certification process.
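
The stop-and-check procedure can be sketched as a simple loop. This is a simulation under stated assumptions, not real scanner control code: each page is represented by its true serial number as a string, and `ocr` and `human_confirms` are hypothetical callables standing in for the OCR software and the auditor's visual check.

```python
from typing import Callable, Iterable, Optional


def find_ballot(
    pages: Iterable[str],
    target: str,
    ocr: Callable[[str], str],
    human_confirms: Callable[[str, str], bool],
) -> Optional[int]:
    """Scan pages in order; stop when OCR reports the target and a human agrees.

    Returns the 0-based position of the confirmed ballot, or None if the
    scanner never stopped on it (a false negative: search the pile by hand).
    """
    for position, page in enumerate(pages):
        if ocr(page) != target:
            continue  # no match reported, the scanner keeps running
        # The scanner stops here and the auditor inspects the printed number.
        if human_confirms(page, target):
            return position
        # False positive: nothing is lost, the auditor resumes the scanner.
    return None


# Hypothetical flaky OCR that misreads ...480 as the target (a false positive).
def flaky_ocr(page: str) -> str:
    return "0236000482" if page == "0236000480" else page


def exact_eye(page: str, target: str) -> bool:
    return page == target


batch = ["0236000480", "0236000481", "0236000482", "0236000483"]
print(find_ballot(batch, "0236000482", flaky_ocr, exact_eye))  # prints 2
```

The human check is the only step that is trusted; the OCR result merely decides where the scanner pauses.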

The serial number on these ballots is printed in fairly low-quality, hard-to-read print. This might pose problems for the OCR software. Better-quality printing would help the OCR, but it would help the humans too, and might be worth doing in any case.

Another variant of this solution is to print the serial number as a barcode in addition to human-readable digits. That would be easier for the scanner to recognize. If the PCOS (precinct-count optical scanner) tries to cheat in some way by making the barcode mismatch the human-readable number, this will be detected immediately by the human auditor.

Puzzle number 3: The solution I propose in this article might work; but surely a creative person can find even better ways to support ballot-level comparison audits of PCOS machines.


  1. Related to Boothe’s comment

    If the ballots were serialized by edge notching, then a desired ballot could be extracted relatively quickly manually using a few dollars worth of tools.

    See https://en.wikipedia.org/wiki/Edge-notched_card and https://www.youtube.com/watch?time_continue=4&v=-mQ5p1pL-M0&feature=emb_logo

    Problems with this include:
    The most efficient notching requires converting the serial number to binary, which may confuse some poll workers.
    Ballots could easily tear, rendering the tool less effective.
    Serialization equipment would be more expensive.

    An alternative would be to notch only the low-order 3 digits of the serial number (10 binary notches). This would allow rapid selection of a ballot at locations with no more than a few thousand ballots to search.
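
As an illustration of the low-order-digits idea, here is a minimal simulation. It assumes one hole per bit along the ballot edge, with a notch meaning the bit is 1, and one needle pass per bit in which the auditor keeps whichever pile (dropped or retained) matches the target. The encoding and the names `notch_pattern` and `needle_select` are assumptions made for this sketch, not a description of any real equipment.

```python
def notch_pattern(serial: str, bits: int = 10) -> list[bool]:
    """Notches for the low-order 3 decimal digits, least-significant bit first."""
    low = int(serial[-3:])  # 000..999 fits in 10 bits (2**10 = 1024)
    return [bool((low >> i) & 1) for i in range(bits)]


def needle_select(batch: list[str], target: str, bits: int = 10) -> list[str]:
    """Simulate one needle pass per bit, keeping the pile that matches the target."""
    want = notch_pattern(target, bits)
    pile = batch
    for i in range(bits):
        # A needle through hole i lifts un-notched ballots; notched ones drop.
        # Keep the dropped pile if the target is notched here, else the lifted pile.
        pile = [s for s in pile if notch_pattern(s, bits)[i] == want[i]]
    return pile


batch = [f"0236{n:06d}" for n in range(1000)]
print(needle_select(batch, "0236000482"))  # prints ['0236000482']
```

With batches larger than 1000, several ballots share the same low-order digits, so the ten passes would leave a short pile to check by eye rather than a single ballot.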

  2. Peter Boothe says

    We once knew how to sort and/or randomize punch cards. We should use those same techniques here, for ballots.

    A physical radix sort, iirc.
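
For readers who haven't seen one, a least-significant-digit radix sort can be sketched as follows. This is a software analogy for dealing ballots into ten bins by one digit and restacking, repeated for each digit position; it assumes fixed-width decimal serial numbers, and `radix_sort_serials` is a name chosen for this sketch.

```python
def radix_sort_serials(serials: list[str]) -> list[str]:
    """LSD radix sort on fixed-width decimal strings, one pass per digit."""
    width = len(serials[0]) if serials else 0
    pile = list(serials)
    for pos in range(width - 1, -1, -1):  # rightmost digit first
        bins: list[list[str]] = [[] for _ in range(10)]
        for s in pile:  # deal each ballot into the bin for its digit
            bins[int(s[pos])].append(s)
        # Restack bins 0..9 in order, preserving the order within each bin.
        pile = [s for b in bins for s in b]
    return pile


print(radix_sort_serials(["0236000482", "0236000007", "0236000130"]))
# prints ['0236000007', '0236000130', '0236000482']
```

If only the last three digits vary within a batch, three passes would be enough to put the pile in serial order, after which finding a given number is easy.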