December 6, 2022

Why the voting machines failed in Mercer County

On Election Day, November 8, 2022, every voting machine in every polling place in Mercer County, New Jersey failed to work.  Voters in each precinct filled in the ovals in their preprinted optical-scan paper ballots, but the voting machines couldn’t read them.  So voters were instructed to put their ballots into “slot 3” of the voting machines, that is, directly into the ballot box.  The Mercer County Board of Elections collected the ballots at the close of the polls on election night, using their usual chain-of-custody procedures.  Then they counted those ballots using the county’s central-count optical-scan voting machines, which are normally used for mail-in ballots.  This took two or three days.  All the votes got counted – but it’s still an embarrassing screw-up that deserves scrutiny.

Between 2002 and 2018, Mercer County used paperless full-face touchscreen voting machines.  That was an untrustworthy technology: if the computer miscounted the votes because of hacking or malfunction, there were no paper ballots that could be recounted, and we’d never know.  So I was glad to see those machines go, and glad to see them replaced by hand-marked optical-scan paper ballots, counted by precinct-count optical scanners.  This is the most securable technology I know of.  And that method of vote-counting is robust, meaning even if the voting machines fail to operate, voters can deposit their ballots in a ballot box for counting later.  That’s how all the votes got counted in the November 2022 election. Still, we don’t expect every voting machine in the whole county to fail at once!  So what happened exactly?

Optical scan voting machines are “told” what candidates are on the ballot using a Ballot Definition File. It is a computer file that “defines” how to count the marks on the ballot–a mark at this position is for candidate Smith, and a mark at that position is for Jones.  Meanwhile, the printing contractor prints all the optical-scan ballots that the voters will mark.  The layout of candidates on the preprinted ballot must match the Ballot Definition File.  If the ballot does not match what the voting machine is expecting, either the ballot cannot be read (what happened in Mercer County); or the ballot could be read incorrectly (what happened in Antrim County, Michigan in 2020). 
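To make the idea concrete, here is a minimal sketch in Python of how a scanner might use a ballot definition. The data layout, the oval coordinates, and the function name `read_ballot` are my own illustration and not Dominion's actual file format; the sketch only shows the two failure modes described above.

```python
# Hypothetical ballot definition: ballot ID -> {oval position -> candidate}.
# Positions are (row, column) pairs invented for this illustration.
BALLOT_DEFINITIONS = {
    730: {(1, 1): "Smith", (1, 2): "Jones"},
}

def read_ballot(ballot_id, marked_positions, definitions):
    """Return the candidates voted for, or None if the ballot cannot be read."""
    layout = definitions.get(ballot_id)
    if layout is None:
        # The scanner has no definition for this ballot ID: ballot unreadable.
        return None
    return [layout[pos] for pos in marked_positions if pos in layout]

# Printed ballot and definition agree: the mark is counted for Smith.
matching = read_ballot(730, [(1, 1)], BALLOT_DEFINITIONS)

# The printed ballot carries an ID the scanner was never told about.
unreadable = read_ballot(505, [(1, 1)], BALLOT_DEFINITIONS)

# The ID is known, but the definition lists candidates in a different order
# than the printed ballot: the same mark is silently counted for Jones.
stale_definitions = {730: {(1, 1): "Jones", (1, 2): "Smith"}}
misread = read_ballot(730, [(1, 1)], stale_definitions)
```

The unreadable case is the noisy failure; the misread case is the quiet one, which is why a paper ballot that can be recounted matters.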

In Mercer County, New Jersey, because of an error in preparing the Ballot Definition Files, the printed ballots did not match the ballot definition file, and the ballots could not be read.  Here is the detailed explanation:

In order that the County Clerk can report precinct-by-precinct totals, New Jersey ballots are labeled with a precinct-number; and each vote-by-mail ballot, each provisional ballot, each early-voting ballot, each regular election-day ballot is labeled with a distinct precinct number.   In Mercer County, all these ballots are printed in advance of the election, except the early-voting ballots which are produced on-demand by ballot-marking devices.

[Figure: Ballot from Mercer County election showing barcode indicating ballot ID]

The County Clerk is responsible for ballot printing, which she contracts out to private companies such as Royal Printing Services;  and voting-machine “programming” is contracted to Dominion Voting Systems.  The vote-counting program in the voting machines does not change from election to election, but for each election a “ballot definition file” is prepared that tells the vote-counting program what candidates are on the ballot, in each ballot style.

In September 2022 the County Clerk sent to Royal Printing Services the list of candidates on the ballot (in each town) for the November election.  Royal created all the ballot layouts; then sent the ballot styles for approval by the County Clerk, as follows:

  • Ballot styles 1-243:  vote-by-mail ballot definitions for each precinct
  • Ballot styles 244-486:  early voting ballot definitions for each precinct
  • Ballot styles 487-729:  provisional ballot definitions for each precinct
  • Ballot styles 730-972:  election-day polling-place ballot definitions for each precinct.

This file was sent on October 5 to Dominion, to the County Clerk, and to the Superintendent of Elections.  On each ballot, the “ballot ID” (a number between 1 and 972) is printed at the bottom of the page, in plain text and as a barcode that the voting machines can read. Later that morning, employees at Dominion started to encode this file of ballot styles into the Ballot Definition Files that the Dominion voting machines could accept.

The County Clerk did not approve this set of ballot styles but asked to have a phone call with Royal and Dominion that same day.  In that conference call, held mid-day on October 5th, she pointed out that it was not necessary to print 243 different forms of provisional ballot: it would suffice to have one style of provisional ballot for each of the 18 different layouts (4 Trenton wards + 11 towns + Spanish-language ballots in 3 of those towns), and doing it this way would save money for the county.  It was agreed on the call that Royal would produce a new set of ballot styles.   In an e-mail on the afternoon of October 5th, Dominion was formally notified of this change, and Dominion acknowledged by e-mail that they had received the change.

Royal Printing’s new file of ballot styles looked like this:

  • Ballot styles 1-243:  vote-by-mail ballot definitions for each precinct
  • Ballot styles 244-486: early voting ballot definitions
  • Ballot styles 487-504:  provisional ballot definitions for each town
  • Ballot styles 505-747:  election-day polling-place ballot definitions for each precinct.

For example, an East Windsor precinct #1 election-day ballot that was numbered 730 in the old list was numbered 505 in the new list.
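The renumbering is simple arithmetic, and a short Python sketch shows why every election-day ballot ID moved. The category sizes come from the two lists above; the function `build_ranges` and the category labels are hypothetical names chosen for illustration, assuming IDs are assigned consecutively starting at 1.

```python
def build_ranges(category_sizes):
    """Assign consecutive ballot-ID ranges to each category, starting at 1."""
    ranges = {}
    next_id = 1
    for name, size in category_sizes:
        ranges[name] = (next_id, next_id + size - 1)
        next_id += size
    return ranges

# Original list: 243 styles in every category.
old_ranges = build_ranges([
    ("vote_by_mail", 243),
    ("early_voting", 243),
    ("provisional", 243),
    ("election_day", 243),
])

# Revised list: provisional ballots reduced to 18 layouts.
new_ranges = build_ranges([
    ("vote_by_mail", 243),
    ("early_voting", 243),
    ("provisional", 18),
    ("election_day", 243),
])

# Every election-day ID drops by the number of provisional styles removed.
shift = old_ranges["election_day"][0] - new_ranges["election_day"][0]
print(old_ranges["election_day"], new_ranges["election_day"], shift)
```

Removing 225 provisional styles (243 minus 18) shifts every later ID down by 225, so changing one category necessarily renumbers everything listed after it.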

The Dominion employee who was tasked with updating the Ballot Definition Files redid the coding of all the provisional ballots.  Unfortunately this employee failed to understand that all the election-day ballots would have new numbers as well.  So in Dominion’s files, the voting machines were still programmed to interpret number 730 as East Windsor precinct 1 – but the ballots printed by Royal Printing for East Windsor #1 had Ballot ID 505 printed at the bottom.

Logic and Accuracy Testing.  Before each election, it is routine for an election official to perform “logic and accuracy testing” (LAT), to make sure the voting machines are correctly counting the votes.  The Mercer County Superintendent of Elections performed LAT between October 21 and 24, by feeding a “deck” of about 5000 optical-scan paper ballots through the voting machines.  This test deck was created by Dominion and printed by Royal, with votes already marked (ovals filled in).  The test deck covered all the contests in all the ballot styles for regular election-day ballots.

Dominion created the LAT test deck based on their own list of ballot styles: ballot IDs 730-972.  This matched what the voting machines expected, and the LAT test “passed.”  In hindsight, it’s clear that a more reliable test would use the actual preprinted optical-scan ballots that Royal was preparing for the real election.
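A small Python sketch illustrates why this test could not catch the problem. It assumes, for illustration only, that the machine in precinct k accepts exactly one election-day ballot ID, that Dominion's files map precinct k to ID 729 + k, and that Royal's printed ballots carry ID 504 + k (consistent with the East Windsor #1 example, 730 versus 505). The function names are hypothetical.

```python
def programmed_id(precinct):
    """Ballot ID the machine expects (Dominion's stale list: 730-972)."""
    return 729 + precinct

def printed_id(precinct):
    """Ballot ID on the real preprinted ballots (Royal's new list: 505-747)."""
    return 504 + precinct

def run_lat(deck):
    """Feed (precinct, ballot_id) pairs; return precincts that reject a ballot."""
    return [p for p, ballot_id in deck if ballot_id != programmed_id(p)]

precincts = range(1, 244)  # 243 precincts

# Test deck generated from the same list that programmed the machines.
vendor_deck = [(p, programmed_id(p)) for p in precincts]

# Test deck drawn from the ballots actually printed for Election Day.
production_deck = [(p, printed_id(p)) for p in precincts]

vendor_failures = run_lat(vendor_deck)
production_failures = run_lat(production_deck)
print(len(vendor_failures), len(production_failures))
```

A deck derived from the vendor's own list agrees with the machines by construction, so it passes in every precinct; a deck of real printed ballots would have failed in all 243.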

Royal Printing Services delivered to Mercer County, 243 styles of preprinted vote-by-mail ballots (ballot IDs 1-243), as well as packets of pre-printed provisional ballots (ballot IDs 487-504), and 243 batches of pre-printed regular Election-Day ballots (ballot IDs 505-747).  Those ballots were then delivered in November to the polling places, along with the precinct-count optical-scan voting machines.

On Election Day, in each and every precinct, the preprinted ballots had a different ballot ID number than the voting machine expected:  #505 instead of #730, #506 instead of #731, and so on.  The only people eligible to vote in an election-day polling place are the voters registered in that precinct, so each voting machine is set up to accept only its own precinct’s ballot ID, and the machines rejected every ballot whose ID they did not expect.

Soon after the polls opened at 6am there were urgent calls from poll workers to the Superintendent and the County Clerk, who consulted with Royal Printing and with Dominion and pretty soon figured out what had happened.   Poll workers throughout the county were instructed to have voters mark their regular Election-Day ballots  (not provisional ballots), and deposit those in the voting machine’s ballot boxes for counting later.   The Board of Elections owns high-speed central-count optical-scan voting machines, normally used for counting just the mail-in ballots.  In order to use those same “central-count” machines to count the polling-place ballots (after the polls closed), the Superintendent and County Clerk reprogrammed the machines with the correct list of ballot-IDs.

Missing ballots?   In a normal election, immediately upon the close of the polls all the paper ballots from each precinct are put into a red security bag with a tamper-evident seal.  Then a County Board of Elections employee drives this bag to the Board of Elections office where the bag is put into the vault.  The vault has two locks: to open it requires both a Republican and a Democratic member of the Board of Elections – in principle.   And in addition, the flash drives with the electronic totals from the voting machines are put into a blue security bag and transported to the town’s municipal clerk; these are collected on election night by the County Clerk from each town’s municipal clerk.

The “chain of custody” of those paper ballots is very important.  If there is ever a recount, it is those paper ballots that will be counted.  State law mandates that a random audit be conducted on those paper ballots, as a partial check against hacks or errors in the voting machines.  And, as we learned, if the voting machines fail entirely, we can still count those paper ballots.  So it’s very important that the paper ballots be safeguarded against tampering, starting from when they are removed from the ballot boxes (voting machines) at the close of the polls, all the way through the last time they are counted.

Unfortunately, Mercer County’s chain-of-custody procedures are not perfect.  In one Robbinsville precinct, the red bag with paper ballots stayed overnight at the Robbinsville clerk’s office, so it was missing from the County Clerk’s initial tally.  The Mayor of Robbinsville was very concerned about this, and justifiably so.

In three precincts in Princeton, the paper ballots were left inside the voting machines overnight; the red bags delivered to the Board of Elections were empty.  County election officials retrieved those ballots from the voting machines the next morning.

Many of the red bags from precincts throughout the county arrived at the Board of Elections office without their tamper-evident seals.  I presume that imperfectly trained poll workers forgot that step of the process, and simply put the ballots into the red bags without sealing them.  (We can wish for perfectly trained poll workers, but remember that the Superintendent has to hire and train hundreds of people to work a single 14-hour day for low pay – not so easy!) It is also possible to imagine that someone tampered with the ballots in those bags.

Tamper-evident security seals are not perfectly secure: it can be possible to remove and replace them without evidence of tampering.  Given that some bags have forgotten seals, and even the sealed bags can be vulnerable to tampering, the ballots should be transported by teams of two Board of Elections employees instead of just one, and those two should belong to different political parties.   This would be a logistical hassle:  whose car would they use, and how would the other worker get back to her own car still parked at the polling place?  But these logistics can be sorted out.  Chain of custody is compromised when only one person is in charge of ballots. 

In summary: although there is no concrete evidence of any tampering with the ballots, or any permanently lost ballots, there are many imperfections in the chain of custody.   Many poll watchers who witnessed these problems were angry about the sloppy chain of custody, and for good reason.  This is something the Superintendent should improve.

Conclusion.  This was an embarrassing failure of our county election system.  Voters were angry that the voting machines didn’t work, and had an uncomfortable feeling depositing the ballots in a slot where who-knows-what would happen to them.  For over a decade I have been advocating for preprinted hand-marked paper ballots, counted by precinct-count optical scanners, so it was embarrassing for me too.  

But I still advocate for preprinted hand-marked ballots, because all of the alternatives are much, much worse: if a touchscreen ballot-marking device makes a mistake or is hacked, you might never know that the vote totals are wrong.  With preprinted hand-marked paper ballots, even if there’s intentional computer hacking, those hand-marked paper ballots can be recounted.  In Mercer County, the system worked.  We had the paper ballots and we counted them, so we can be confident our results reflect the will of the voters.  Even with these mistakes, this election was more secure and more trustworthy than previous elections that had no paper ballots.

Our election administrators have some work to do – and they know it – in improving communications with vendors,  logic and accuracy testing, and chain of custody protocols. I feel confident that they’re on it.  

[This article is based on presentations made to the Mercer County Board of Commissioners at a meeting on November 21, 2022 by the Mercer County Prosecutor, the County Clerk, the Superintendent of Elections, an officer of Royal Printing Services, a vice president of Dominion Voting Systems, and several Mercer County citizens who witnessed chain-of-custody issues.]

CITP Seeks Postdocs for Fellows Program

Those with a background in information integrity or in precision health are especially encouraged to apply.

As part of our Fellows program, CITP is hiring a Postdoctoral Research Associate. This position is designed for people who have recently received or are about to receive a Ph.D.

Applicants should have experience conducting research in at least one of our three focus areas:

  • Data Science and the intersection of Artificial Intelligence and Society
  • Privacy and Security
  • Digital Platforms and Infrastructure

We are especially interested in hearing from postdoc candidates who specialize in “information integrity,” as part of our privacy and security focus.

We are also seeking postdocs who work at the intersection of precision health, data-driven medicine and public policy, as part of our “Data Science and AI and Society” focus area.

CITP is Hiring a Professor

We are seeking an Assistant, Associate, or Full professor whose work aligns with one or more of our three focus areas.

  • Data Science and the intersection of Artificial Intelligence and Society
  • Privacy and Security
  • Digital Platforms and Infrastructure

Please visit the Princeton University open positions page for more details about the position and the application.

Both CITP and Princeton University seek for our research communities to be diverse and inclusive. This commitment informs our approach to recruiting and hiring faculty with a strong commitment to teaching, mentoring, and research.

The deadline to apply is December 1, 2022.

Princeton CITP Launches the Digital Witness Lab to Help Journalists Track Bad Actors on Platforms

Read the full announcement and Q & A with Investigative Data Journalist and Engineer, Surya Mattu.

Princeton University’s Center for Information Technology Policy (CITP) is excited to announce the launch of the Digital Witness Lab — an innovative research laboratory where engineers will design software and hardware tools to track the inner workings of social media platforms, and help journalists expose how they exploit users’ privacy and aid in the spread of misinformation and injustices globally.

Based at CITP’s Sherrerd Hall office, the Lab is led by Surya Mattu, an award-winning data engineer and journalist whose most recent project with The Markup resulted in “Facebook Is Receiving Sensitive Medical Information from Hospital Websites,” an investigative news story that revealed 33 hospital websites and seven health system patient portals were collecting patients’ sensitive data through Facebook’s Meta Pixel code.

“In our digital world, injustice often lurks in the shadows of digital platforms,” Mattu said. “This could be a Facebook post for housing that excludes people based on their demographic group. It could be an algorithm used to sort employment resumes where only one type of person passes the screening. In criminal justice, it could be a risk assessment algorithm that penalizes black defendants more than white defendants in criminal sentencing.

“Algorithmic decisions in systems like these take place in proprietary software and apps,” he explained. “To bypass this barrier, the lab builds custom software and hardware to capture data from these platforms.”

Mattu’s first undertaking at CITP is WhatsApp Watch — a research project in which engineers will monitor public WhatsApp groups to document the spread of misinformation.

“We are excited to welcome Surya into the Princeton CITP community,” said Tithi Chattopadhyay, the Center’s executive director. “We look forward to building relationships with journalists and newsrooms that don’t have access to the types of digital tools Surya has a record of developing to support the critical work of investigative reporters. We are excited about the real world impact his work will have.”

The Center for Information Technology Policy is a nonprofit, nonpartisan, interdisciplinary hub where researchers study the impact of digital technologies on society with the mission of informing policymakers, journalists, researchers, and the public for the good of society. CITP’s research priorities are Platforms & Digital Infrastructure, Privacy & Security, and Data Science, AI & Society.

An Introduction to My Project: Algorithmic Amplification and Society

This article was originally published on the Knight Institute website at Columbia University.

The distribution of online speech today is almost wholly algorithm-mediated. To talk about speech, then, we have to talk about algorithms. In computer science, the algorithms driving social media are called recommendation systems, and they are the secret sauce behind Facebook and YouTube, with TikTok more recently showing the power of an almost purely algorithm-driven platform.

Relatively few technologists participate in the debates on the societal and legal implications of these algorithms. As a computer scientist, that makes me excited about the opportunity to help fill this gap by collaborating with the Knight First Amendment Institute at Columbia University as a visiting senior research scientist — I’m on sabbatical leave from Princeton this academic year. Over the course of the year, I’ll lead a major project at the Knight Institute focusing on algorithmic amplification.

This is a new topic for me, but it is at the intersection of many that I’ve previously worked on. My broad area is the societal impact of AI (I’m wrapping up a book on machine learning and fairness, and writing one about AI’s limits). I’ve done technical work to understand the social implications of recommender systems. And finally, I’ve done extensive research on platform accountability, including privacy, misleading content, and ads.

Much of my writing will be about algorithmic amplification: roughly, the fact that algorithms increase the reach of some speech and suppress other speech. The term amplification is caught up in a definitional thicket. It’s tempting to define amplification with respect to some imagined neutral, but there is no neutral, because today’s speech platforms couldn’t exist in a recognizable form without algorithms at their core. Having previously worked on privacy and fairness, two terms that notoriously resist a consensus definition, I don’t see this as a problem. There are many possible definitions of amplification, and the most suitable definition will vary depending on the exact question one wants to answer.

It’s important to talk about amplification and to explore its variations. Much of the debate over online speech, particularly the problem of mis/disinformation, reduces the harms of social media platforms to a false binary—what should be taken down, what should be left up. However, the logic of engagement optimization rewards the production of content that may not be well-aligned with societal benefit, even if it’s not harmful enough to be taken down. This manifests differently in different areas. Instagram and TikTok have changed the nature of travel, with hordes of tourists in some cases trampling on historic or religious sites to make videos in the hopes of going viral. In science, facts or figures in papers can be selectively quoted out of context in service of a particular narrative.

Speech platforms are complex systems whose behavior emerges from feedback loops involving platform design and user behavior. As such, our techniques for studying them are in their infancy, and are further scuttled by the limited researcher access that platforms provide. I hope to both advocate for more access and transparency, and to push back against oversimplified narratives that have sometimes emerged from the research literature.

My output, in collaboration with the Knight Institute and others at Columbia, will take three forms: an essay series, a set of videos and interactives to illustrate technical concepts, and a major symposium in spring 2023. An announcement about the symposium is coming shortly.