April 26, 2019

Archives for 2019

OpenPrecincts: Can Citizen Science Improve Gerrymandering Reform?

How can the American public understand gerrymandering and collect data that could lead to fairer, more representative voting districts across the US?

Speaking today at CITP are Ben Williams and Hannah Wheelen of the Princeton Gerrymandering Project, part of a team with Sam Wang, William Adler, Steve Birnbaum, Rick Ober, and James Turk. Ben is the lead coordinator for the Princeton Gerrymandering Project’s research and organizational partnerships. Hannah coordinates the collection of voting precinct boundary information.

What’s Gerrymandering and Why Does it Matter?

Ben opens by explaining what gerrymandering is and why it matters. Reapportionment is the process by which congressional districts are allocated to the states after each decennial census. The process of redrawing district lines is called redistricting. When redistricting happens, politicians sometimes engage in gerrymandering, the practice of redrawing the lines to benefit a particular party, which is common behavior by all parties.

Who has the power to redistrict federal lines? Depending on state law, redistricting is done by different bodies:

  • independent commissions who make the decisions, independently from the politicians affected by it
  • advisory commissions who advise a legislature but have no decision-making power
  • politicians or political appointees
  • state legislatures

Ben tells us that gerrymandering has been part of US democracy ever since the First Congress. He tells us about Patrick Henry, governor of Virginia, who redrew the lines to try to favor James Monroe over James Madison. The term came into use in the 19th century, and it has remained common since then.

Why Do People Care About Gerrymandering And What Can We Do About It?

Ben tells us about the Tea Party Wave in 2010, when Republicans announced in the Wall Street Journal a systematic plan, called REDMAP, to redraw districts to establish a majority for Republicans in the US for a decade. Democrats have also done similar things on a smaller scale. Since then, the designer of the REDMAP plan has become an advocate for reform, says Ben.

How do we solve gerrymandering if the point is that politicians use it to establish their power and are unlikely to give it up? Ben describes three structures:

  • Create independent commissions to draw the lines. Ballot initiatives in MI, CO, UT, and MO and state legislative action (VA) have put commissions in place.
  • Require governors to approve the plan, and give the governor the capacity to refer district lines to courts (WI, MD).
  • State supreme courts (PA, NC?)

These structures have been achieved in some states through a variety of means, including litigation and political campaigns. Ben also hopes that if citizens can learn to recognize gerrymandering, they can spot it and organize to respond as needed.

Decisions and controversies about gerrymandering need reliable evidence, especially at times when different sides bring their own experts to a conversation. Ben describes the projects that have been done so far and summarizes the recent paper, “An Antidote for Gobbledygook: Organizing the Judge’s Partisan Gerrymandering Toolkit into a Two-Part Framework.” He also mentions the Metric Geometry and Gerrymandering Group at Tufts and MIT and work by Procaccia and Pegden at Carnegie Mellon.

Citizen Science Solutions to the Bad Data Problem in Redistricting Accountability

These many tools have opened new capacities for citizens to have an informed voice in redistricting conversations. Unfortunately, all of these projects rely on precinct-level data: the geography of voting precincts and vote counts at the precinct level. Hannah talks to us about the challenge of contacting thousands of counties for precinct-level voting data. In many cases, national datasets of voter behavior are actually wrong: when you check the paper records held by local areas, you find that the boundaries are often wrong. Worse, errors are so common that gerrymandering datasets could easily produce mistaken outcomes. With too many errors for researchers to untangle, how can these data tools be useful?
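To see why a single boundary error matters, here is a minimal sketch of how precinct-level counts roll up into district-level results. The precinct names, vote counts, and district assignments are all invented for illustration and do not come from the talk or from any real dataset; the point is only that recording one precinct in the wrong district can change which party appears to have won a seat.

```python
from collections import defaultdict


def district_totals(precinct_votes, precinct_to_district):
    """Sum precinct-level vote counts into district-level totals."""
    totals = defaultdict(lambda: defaultdict(int))
    for precinct, votes in precinct_votes.items():
        district = precinct_to_district[precinct]
        for party, count in votes.items():
            totals[district][party] += count
    return {district: dict(votes) for district, votes in totals.items()}


def winners(totals):
    """Return the party with the most votes in each district."""
    return {district: max(votes, key=votes.get) for district, votes in totals.items()}


# Hypothetical precinct returns (assumed values, for illustration only).
precinct_votes = {
    "P1": {"A": 600, "B": 400},
    "P2": {"A": 300, "B": 600},
    "P3": {"A": 450, "B": 550},
    "P4": {"A": 200, "B": 300},
}

# Assumed "true" assignment of precincts to districts.
correct_map = {"P1": "D1", "P2": "D1", "P3": "D2", "P4": "D2"}

# The same map with one boundary error: P2 is recorded in D2 instead of D1.
faulty_map = dict(correct_map, P2="D2")

print(winners(district_totals(precinct_votes, correct_map)))
print(winners(district_totals(precinct_votes, faulty_map)))
```

With the correct map, party B carries both districts; with the faulty map, D1 appears to go to party A. Any gerrymandering metric computed from district totals would inherit that error.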

Might local citizens be able to contribute to a high quality national dataset about voting precincts, and then use that data to hold politicians accountable? Hannah tells us about OpenPrecincts, a citizen science project by the Princeton Gerrymandering Project to organize the public to create accurate datasets about voter records. Hannah tells us about the many grassroots organizations that they are hoping to empower to collect data for their entire state.

BMDs are not meaningfully auditable

This paper has just been released on SSRN. In it we ask: suppose a BMD were hacked to cheat by printing rigged votes onto the paper ballot; suppose voters carefully inspected their ballots (which most voters don’t do); and suppose a voter noticed that the wrong vote was printed. What then? To assess this question, we characterize under what circumstances a voting system is “contestable” or “defensible.” Voting systems must be contestable and defensible in order to be meaningfully audited, and unfortunately BMDs are neither contestable nor defensible. Hand-marked paper ballots, counted by an optical-scan voting machine, are both contestable and defensible.

Ballot-Marking devices (BMDs) cannot assure the will of the voters

by Andrew W. Appel, Richard A. DeMillo, and Philip B. Stark


Computers, including all modern voting systems, can be hacked and misprogrammed. The scale and complexity of U.S. elections may require the use of computers to count ballots, but election integrity requires a paper-ballot voting system in which, regardless of how they are initially counted, ballots can be recounted by hand to check whether election outcomes have been altered by buggy or hacked software. Furthermore, secure voting systems must be able to recover from any errors that might have occurred.

However, paper ballots provide no assurance unless they accurately record the vote as the voter expresses it. Voters can express their intent by hand-marking a ballot with a pen, or using a computer called a ballot-marking device (BMD), which generally has a touchscreen and assistive interfaces. Voters can make mistakes in expressing their intent in either technology, but only the BMD is also subject to systematic error from computer hacking or bugs in the process of recording the vote on paper, after the voter has expressed it. A hacked BMD can print a vote on the paper ballot that differs from what the voter expressed, or can omit a vote that the voter expressed.

It is not easy to check whether BMD output accurately reflects how one voted in every contest. Research shows that most voters do not review paper ballots printed by BMDs, even when clearly instructed to check for errors. Furthermore, most voters who do review their ballots do not check carefully enough to notice errors that would change how their votes were counted. Finally, voters who detect BMD errors before casting their ballots can correct only their own ballots, not systematic errors, bugs, or hacking. There is no action that a voter can take to demonstrate to election officials that a BMD altered their expressed votes, and thus no way voters can help deter, detect, contain, and correct computer hacking in elections. That is, not only is it inappropriate to rely on voters to check whether BMDs alter expressed votes, it doesn’t work.

Risk-limiting audits of a trustworthy paper trail can check whether errors in tabulating the votes as recorded altered election outcomes, but there is no way to check whether errors in how BMDs record expressed votes altered election outcomes. The outcomes of elections conducted on current BMDs therefore cannot be confirmed by audits. This paper identifies two properties of voting systems, contestability and defensibility, that are necessary conditions for any audit to confirm election outcomes. No commercially available EAC-certified BMD is contestable or defensible.
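For readers unfamiliar with risk-limiting audits, the following is a minimal sketch of one well-known variant, a BRAVO-style ballot-polling audit for a two-candidate contest. It is offered only as an illustration; the abstract above does not prescribe this particular method, and the function name, parameters, and ballot encoding ("W" for the reported winner, "L" for the reported loser) are assumptions made for the example. The audit draws paper ballots at random and updates a likelihood ratio until there is strong evidence that the reported winner really won, or until it gives up and a full hand count would be needed.

```python
import random


def ballot_polling_audit(paper_ballots, reported_winner_share,
                         risk_limit=0.05, max_draws=10_000, seed=0):
    """BRAVO-style sequential test on hand-read paper ballots.

    paper_ballots: list of "W" (reported winner) or "L" (reported loser),
        as read by a human from the paper record.
    reported_winner_share: the winner's share of the vote according to
        the machine tabulation (must be above one half).
    Returns (confirmed, ballots_examined).
    """
    if not 0.5 < reported_winner_share < 1.0:
        raise ValueError("reported winner share must be between 0.5 and 1")

    rng = random.Random(seed)
    threshold = 1.0 / risk_limit  # stop once the ratio reaches this value
    ratio = 1.0

    for draws in range(1, max_draws + 1):
        mark = rng.choice(paper_ballots)  # sample with replacement
        if mark == "W":
            ratio *= reported_winner_share / 0.5
        elif mark == "L":
            ratio *= (1.0 - reported_winner_share) / 0.5
        if ratio >= threshold:
            return True, draws  # reported outcome confirmed at this risk limit

    # Not enough evidence: in practice this would escalate to a full hand count.
    return False, max_draws


# Hypothetical contest: 1,000 paper ballots, reported winner share of 70%.
ballots = ["W"] * 700 + ["L"] * 300
confirmed, examined = ballot_polling_audit(ballots, reported_winner_share=0.7)
print(confirmed, examined)
```

Note what the sketch takes for granted: the list of paper ballots is treated as ground truth. If a BMD printed something other than what the voter expressed, the audit would faithfully confirm the wrong result, which is exactly the gap the paper describes.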

To reduce the risk that computers undetectably alter election results by printing erroneous votes on the official paper audit trail, the use of BMDs should be limited to voters who require assistive technology to vote independently.

CITP’s OpenWPM privacy measurement tool moves to Mozilla

As part of my PhD at Princeton’s Center for Information Technology Policy (CITP), I led the development of OpenWPM, a tool for web privacy measurement, with the help of many contributors. My co-authors and I first released OpenWPM in 2014 with the goal of lowering the technical costs of large-scale web privacy measurement. The tool’s success exceeded our expectations; it has been used by over 30 academic studies since its release, in research areas ranging from computer science to law.

OpenWPM has a new home at Mozilla. After graduating in 2018, I joined Mozilla’s security engineering team to work on strengthening Firefox’s tracking protection. We’re committed to ensuring users are protected from tracking by default. To that end, we’ve migrated OpenWPM to Mozilla, where it will remain open source to ensure researchers have the tools required to discover privacy-infringing practices on the web. We are also using it ourselves to understand the implications of our new anti-tracking features, to discover fingerprinting scripts and add them to our tracking protection lists, as well as to collect data for a number of ongoing privacy research projects.

Over the past six months we’ve started a number of efforts to significantly improve OpenWPM:

1. Cloud-friendly data storage. OpenWPM has long used SQLite to store crawl data. This makes it easy for anyone to install the tool, run a small measurement, and inspect the dataset locally. However, this is very limiting for large-scale measurements. OpenWPM can now save data directly to Amazon S3 in Parquet format, making it possible to launch crawls on a cluster of machines.

2. Support for modern versions of Firefox. We are in the process of migrating all of OpenWPM’s instrumentation to WebExtensions, which is necessary to run measurements with Firefox 57+.

3. Modular instrumentation. OpenWPM’s instrumentation was previously deeply embedded in the crawler, making it difficult to use outside of a crawling context. We’ve now refactored the instrumentation into a separate npm package that can easily be imported by any Firefox WebExtension. In fact, we’ve already used the module to collect data in one of our user studies.

4. A standard set of analysis utilities. To further ease analyses on OpenWPM datasets, we’ve bundled the many small utility functions we’ve developed over the years into a single utilities package available on PyPI.

5. Data collection and release. Since 2015, CITP has collected monthly 1-million-site web measurements using OpenWPM. All of this data is available for download, but once Gunes Acar moves on from CITP in a few months, the CITP measurements will end. At Mozilla, we are exploring options to regularly collect and release new measurements.
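To give a flavor of what "inspect the dataset locally" means for a small SQLite crawl, here is a minimal sketch using only Python's standard library. The two tables and their columns are a simplified schema assumed for this example; they are not guaranteed to match the schema OpenWPM actually writes, and the URLs are made up. The comparison of hostnames is also deliberately naive (a real analysis would compare registrable domains).

```python
import sqlite3
from collections import Counter
from urllib.parse import urlsplit


def third_party_hosts(conn):
    """Count, for each third-party host, how many visited sites requested it."""
    rows = conn.execute(
        "SELECT DISTINCT v.site_url, r.url "
        "FROM site_visits AS v "
        "JOIN http_requests AS r ON r.visit_id = v.visit_id"
    )
    seen = set()
    for site_url, request_url in rows:
        site_host = urlsplit(site_url).hostname
        request_host = urlsplit(request_url).hostname
        # Naive first-party check: exact hostname match only.
        if request_host and request_host != site_host:
            seen.add((site_host, request_host))
    return Counter(host for _, host in seen)


# Build a tiny in-memory stand-in for a crawl database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE site_visits (visit_id INTEGER PRIMARY KEY, site_url TEXT);
    CREATE TABLE http_requests (visit_id INTEGER, url TEXT);
    """
)
conn.executemany(
    "INSERT INTO site_visits VALUES (?, ?)",
    [(1, "https://news.example"), (2, "https://shop.example")],
)
conn.executemany(
    "INSERT INTO http_requests VALUES (?, ?)",
    [
        (1, "https://news.example/index.html"),
        (1, "https://tracker.example/pixel.gif"),
        (2, "https://shop.example/cart"),
        (2, "https://tracker.example/pixel.gif"),
        (2, "https://cdn.example/lib.js"),
    ],
)

for host, site_count in third_party_hosts(conn).most_common():
    print(host, site_count)
```

A single-file database like this is convenient on a laptop, but it has one writer and lives on one machine, which is the limitation that motivates writing Parquet files to shared storage for cluster-scale crawls.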

All of these efforts are still underway, and we welcome community involvement as we continue to build upon them. You can find us hanging out in #openwpm on irc.mozilla.org.