July 13, 2020

Fair Elections During a Crisis

Even before the crisis of COVID-19, which will have severe implications for the conduct of the 2020 elections, the United States faced another crisis of election legitimacy: Americans can no longer take for granted that election losers will concede a closely fought election after election authorities (or courts) have declared a winner.

Along with two dozen other scholars (in Tech, Law, Political Science, and Media), I joined an ad hoc working group convened by Professor Rick Hasen of the U.C. Irvine Law School to make recommendations on steps that American election administrators (and others) can take this year to deal with these two overlapping crises. Our report has just been released:

Fair Elections During a Crisis: Urgent Recommendations in Law, Media, Politics, and Tech to Advance the Legitimacy of, and the Public Confidence in, the November 2020 U.S. Elections.

We make 14 specific recommendations:

  • Law: absentee ballots, emergency plans, COVID-19, and vote-counting dispute-resolution protocols.
  • Media: how the media can provide accurate information to voters about the election process and about expectations for the timing of election results (slower this year than before).
  • Politics and Norms: funding for COVID-19 costs, a bipartisan Election Crisis Commission, principles for fair elections, and the responsibilities of social media.
  • Tech: paper ballots and audits, resilient election infrastructure, .gov domains for election officials, and monitoring and auditing of voter-registration databases.

Can Legislatures Safely Vote by Internet?

It is a well-understood scientific fact that Internet voting in public elections is not securable: “the Internet should not be used for the return of marked ballots. … [N]o known technology guarantees the secrecy, security, and verifiability of a marked ballot transmitted over the Internet.”

But can legislatures (city councils, county boards, or the U.S. Congress) safely vote by Internet? Perhaps they can. To understand why, let’s examine two important differences between legislature votes and public elections:

  1. Public elections require the secret ballot; legislatures can vote by public roll-call vote.
  2. Internet voting requires digital credentials; the U.S. has no effective way to distribute digital credentials to the public, but it is feasible to provide credentials to members of a legislature.

The cyberthreats facing any kind of Internet voting include:

  • (A) hackers impersonating a voter,
  • (B) hackers exploiting server vulnerabilities to fraudulently change the software that counts votes,
  • (C) hackers exploiting client vulnerabilities (in voters’ phones and laptops) to fraudulently change the software that transmits votes, and
  • (D) other attacks, such as denial of service: preventing some legislators from accessing the Internet.

(Blockchain can’t solve these problems; see pages 103-105.)

But suppose a legislative body wished to avoid meeting in person during a pandemic. Could these threats be mitigated sufficiently?

(A) It is feasible to distribute security tokens to the 15 members of a county commission or the 435 members of the House of Representatives, in a way that’s not feasible for 235 million registered voters. Even without security tokens, a Member who is personally known to the clerk of the legislature could vote by video chat, in an emergency. (Caveats: Security tokens are highly secure but not perfect; video chat could be subject to deep fakes; but see below for mitigations.)

(B,C) Attacks that compromise the client or server computers can be detected and corrected, if everyone’s vote is displayed on a “public bulletin board.” That is, each member of the legislature would transmit his or her vote and then check the public roll-call display to make sure the vote was reported and recorded accurately.

Checking the public roll-call display isn’t so simple, since hackers could alter the member’s client device (e.g., laptop computer or phone) to make it lie about what’s downloaded from the roll-call display. A Member should check the roll-call display from a variety of devices in a variety of locations, or (perhaps) coordinate with other Members to make sure they’re all seeing a consistent report.
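To make the bulletin-board check concrete, here is a minimal sketch in Python (hypothetical class and field names, no real cryptography or networking) of the clerk posting a vote and a member verifying the public display from more than one device; a real system would also need authenticated, signed postings and a formal dispute procedure.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PostedVote:
    member: str    # which legislator
    measure: str   # which bill or motion
    choice: str    # "yea", "nay", or "present"

class RollCallBoard:
    """Hypothetical public bulletin board maintained by the clerk."""
    def __init__(self):
        self.posts: List[PostedVote] = []

    def post(self, vote: PostedVote) -> None:
        self.posts.append(vote)

    def lookup(self, member: str, measure: str) -> Optional[PostedVote]:
        # The vote as publicly displayed for this member on this measure.
        return next((v for v in self.posts
                     if v.member == member and v.measure == measure), None)

def member_verifies(board_views, member: str, measure: str, intended: str) -> bool:
    """Check the public display as seen from several independent devices and
    locations (board_views); if any view is missing the vote or shows a
    different choice, the member should contest the roll call before the
    contest window closes."""
    for view in board_views:
        posted = view.lookup(member, measure)
        if posted is None or posted.choice != intended:
            return False
    return True

# Example: the clerk records a vote; the member checks it (here the two
# "views" are the same object, but in practice each device would fetch
# the display independently).
board = RollCallBoard()
board.post(PostedVote(member="Rep. X", measure="Motion 1", choice="yea"))
assert member_verifies([board, board], "Rep. X", "Motion 1", "yea")
```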

This remote workaround would not be simple or easy. Careful protocols must be designed to limit the amount of time members have to contest their votes; one must consider what happens if Members game the system (by falsely claiming their vote was altered); and one must consider what happens if lobbyists are literally sitting next to the member during voting (which is less likely when members are gathered in a public place for a traditional vote). What do the legislature’s quorum rules mean in this context? And many legislatures prefer to take many votes by “voice vote,” in which each member’s individual vote is not recorded.

And just because Internet roll-call votes may be feasible to secure, that doesn’t mean they’re automatically a good idea, or legal: see this report by the Majority staff of the House of Representatives.

Conclusion: we know that Internet voting by the public is impossible to secure, and thus we must not vote by Internet even during the COVID-19 epidemic. But Internet voting by legislatures is not necessarily impossible to secure, and could reasonably be considered. If legislative bodies desire to meet and vote remotely, there is still plenty of work to do to actually secure the process. And that’s difficult to do in a hurry.

Building a Bridge with Concrete… Examples

Thanks to Annette Zimmermann and Arvind Narayanan for their helpful feedback on this post.

Algorithmic bias is currently generating a lot of lively public and scholarly debate, especially amongst computer scientists and philosophers. But do these two groups really speak the same language—and if not, how can they start to do so?

I noticed at least two different ways of thinking about algorithmic bias during a recent research workshop on the ethics of algorithmic decision-making at Princeton University’s Center for Human Values, organized by political philosopher Dr. Annette Zimmermann. Philosophers are thinking about algorithmic bias in terms of things like the inherent value of explanation, the fairness and accountability rights afforded to humans, and whether groups that have been systematically affected by unfair systems should bear the burden for integration when transitioning to a uniform system. Computer scientists, by contrast, are thinking about algorithmic bias in terms of things like running a gradient backwards to visualize a heat map, projecting features into various subspaces devoid of protected attributes, and tuning hyperparameters to better satisfy a new loss function. Of course these are vast generalizations about the two fields, and there are plenty of researchers doing excellent work at the intersection. But it seems that, for the most part, while philosophers are debating which sets of ethical axioms ought to underpin algorithmic decision-making systems, computer scientists are in the meantime already deploying these systems into the real world.

In formulating loss functions, consequentialists might prioritize maximizing accurate outcomes for the largest possible number of people, even if that comes at the cost of fair treatment, whereas deontologists might prioritize treating everyone fairly, even if that comes at the cost of optimality. But there isn’t a definitive “most moral” answer, and if something like equalizing false positive rates were the key to fairness, we would not be seeing the alarming headlines about algorithmic bias that we see today.
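As a rough illustration only (hypothetical function names, toy numpy arrays, not a claim about how any deployed system is trained), here is a sketch of how those two outlooks might surface as different training objectives: one that minimizes average prediction error alone, and one that adds a penalty for unequal false positive rates between two groups.

```python
import numpy as np

def log_loss(y_true, y_score, eps=1e-9):
    # "Consequentialist-flavored" objective: minimize average prediction error.
    return -np.mean(y_true * np.log(y_score + eps)
                    + (1 - y_true) * np.log(1 - y_score + eps))

def false_positive_rate(y_true, y_pred, mask):
    # Fraction of true negatives (within the masked group) predicted positive.
    negatives = (y_true == 0) & mask
    return y_pred[negatives].mean() if negatives.any() else 0.0

def fairness_penalized_loss(y_true, y_score, group, lam=1.0, threshold=0.5):
    # "Deontologist-flavored" objective: the same prediction error plus a
    # penalty (weighted by lam) for unequal false positive rates between
    # two groups, trading some accuracy for more equal treatment.
    y_pred = (y_score >= threshold).astype(float)
    gap = abs(false_positive_rate(y_true, y_pred, group == 0)
              - false_positive_rate(y_true, y_pred, group == 1))
    return log_loss(y_true, y_score) + lam * gap
```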

Inundated with various conflicting definitions of fairness, scientists are often optimizing for metrics they believe to be best and proceeding onwards. For example, one might reasonably think that the way to ensure fairness of an algorithm between different racial groups could be to enforce predictive parity (equal likelihood of accurate positive predictions), or to equalize false error rates, or just to treat similar individuals similarly. However, it is actually mathematically impossible to simultaneously satisfy seemingly reasonable fairness criteria like these in most real world settings. It is unclear how to choose amongst the criteria, and even more unclear how one would go about translating complex ideas that may require consideration, such as systematic oppression, into a world of optimizers and gradients.
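To see concretely why such criteria pull apart, here is a toy sketch (made-up numbers, hypothetical helper names) that computes predictive parity, the false positive rate, and the false negative rate for two groups whose true base rates differ: once the error rates are equalized, predictive parity breaks, and no threshold adjustment can rescue all three at once short of a perfect classifier.

```python
import numpy as np

def group_metrics(y_true, y_pred):
    """Return (PPV, false positive rate, false negative rate) for one group."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    ppv = tp / (tp + fp) if (tp + fp) else 0.0  # predictive parity compares this
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return ppv, fpr, fnr

rng = np.random.default_rng(0)
# Two groups with different base rates of the true outcome (50% vs. 20%).
y_a = rng.binomial(1, 0.5, 100_000)
y_b = rng.binomial(1, 0.2, 100_000)

def noisy_classifier(y):
    # Flip 20% of the true labels to simulate the same imperfect classifier
    # applied to both groups.
    return np.where(rng.random(len(y)) < 0.2, 1 - y, y)

print("group A (PPV, FPR, FNR):", group_metrics(y_a, noisy_classifier(y_a)))
print("group B (PPV, FPR, FNR):", group_metrics(y_b, noisy_classifier(y_b)))
# Error rates come out roughly equal (about 0.2 each), yet PPV differs
# (about 0.8 vs. 0.5) because the base rates differ.
```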

Since concrete mappings between a mathematical loss function and moral concepts are likely impossible to dictate, and philosophers are unlikely to settle on an ultimate theory of fairness, perhaps for now we can adopt a strategy that is, at least, not impossible to implement: a purposefully created, context- and application-specific validation/test set. The motivation is that even if philosophers and ethicists cannot decisively articulate a set of general, static fairness desiderata, perhaps they can make more domain-specific, dynamic judgements: for instance, whether a person A, with a given set of attributes and features, should be granted a loan or not, and likewise for persons B, C, and so on. Of course there will not be unanimous agreement, but there can at least be a general consensus that one outcome is preferable to the other. One could then create a whole set of such examples.

Concepts like the idea that similar people should be treated similarly in a given decision scenario—the ‘like cases maxim’ in legal philosophy—could be encoded into this test set by having groups of people that differ only in a protected attribute be given the same result. Even concepts like equal accuracy rates across protected groups could be encoded by composing the test set of equal numbers of people from each group, rather than in proportion to real-world majority/minority representation. However, the test set is not a construct-valid way to enforce these fairness constraints, and it need not be: the very reason such a test set would exist is that the right fairness criteria are not actually known; if they were, they would simply be formulated explicitly into the loss function.
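As a sketch of what such a purpose-built test set could look like in code (hypothetical fields, groups, and labels, chosen purely for illustration), the like-cases idea can be encoded as counterfactual twins that differ only in the protected attribute, and equal group representation can be built in by construction:

```python
from dataclasses import dataclass, replace
from typing import Dict, List

@dataclass(frozen=True)
class LoanCase:
    """A hypothetical expert-curated example: ethicists supply the label."""
    income: float
    credit_history_years: int
    protected_group: str   # e.g. "A" or "B"
    expert_label: int      # 1 = ethicists judge the loan should be granted

def like_cases_pair(case: LoanCase, other_group: str) -> List[LoanCase]:
    # 'Like cases maxim': a twin that differs only in the protected attribute
    # carries the same expert label.
    return [case, replace(case, protected_group=other_group)]

def balanced_test_set(cases_by_group: Dict[str, List[LoanCase]]) -> List[LoanCase]:
    # Equal representation by construction: the same number of cases from
    # every group, rather than real-world majority/minority proportions.
    n = min(len(cases) for cases in cases_by_group.values())
    return [c for cases in cases_by_group.values() for c in cases[:n]]

# Example: one labeled case and its counterfactual twin enter the test set.
seed = LoanCase(income=42_000.0, credit_history_years=7,
                protected_group="A", expert_label=1)
curated = like_cases_pair(seed, other_group="B")
```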

At this juncture, ethicists and computer scientists could usefully engage in complementary work: ethicists could identify difficult edge cases that challenge what we think about moral questions and incorporate them into the test set, and computer scientists could work on optimizing accuracy rates on the resulting validation set. There are a few crucial differences, however, from similar collaborative approaches in other domains, such as when doctors are called on to provide expert labels on medical data so that models can be trained to detect things like eye diseases. One is that the distribution of the test set, not just the labels, is to be decided upon by domain experts. Another is that the collaboration would last beyond just the labeling of the data: failure cases should be critically investigated earlier in the machine learning pipeline, in an iterative and reflective way, to ensure that things like overfitting are not happening. Whether performing well on the hidden test set requires learning fairer representations in the feature space or thresholding different groups differently, scientists will build context-specific models that encompass certain moral values defined by ethicists, who ground the test set in examples of realizations of such values.
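Continuing the hypothetical LoanCase sketch above, the engineering side of that loop might be as simple as an evaluation routine that reports per-group agreement with the expert labels and hands the disagreements back for joint review:

```python
def evaluate_on_curated_set(predict, test_cases):
    """Report per-group agreement with the expert labels on the curated test
    set (LoanCase examples from the sketch above) and collect the failure
    cases for ethicists and engineers to examine together in the next
    iteration."""
    per_group, failures = {}, []
    for case in test_cases:
        correct = (predict(case) == case.expert_label)
        per_group.setdefault(case.protected_group, []).append(correct)
        if not correct:
            failures.append(case)
    summary = {g: sum(hits) / len(hits) for g, hits in per_group.items()}
    return summary, failures
```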

But does this proposal mean adopting a potentially dangerous, ethically objectionable “the ends justify the means” logic? Not necessarily. With algorithm developers working in conjunction with ethicists to ensure the means are not unsavory, this could be a way to bridge the divide between abstract notions of fairness and concrete ways of implementing systems.

This may not be an ideal long-term way to deal with the problem of algorithmic fairness: it is hard to generalize between applications, and in situations where creating an expert-curated test set is too expensive or does not scale, it may not be preferable to satisfying one of the many mathematical definitions of fairness. But it could be one possible way to incorporate philosophical notions of fairness into the development of algorithms. Because technologists are not going to hold off on deploying machine learning systems until they reach a state of fairness everyone agrees on, finding a way to incorporate philosophical views about central moral values like fairness and justice into algorithmic systems right now is an urgent problem.

Supervised machine learning has traditionally been focused on predicting based on historical and existing data, but maybe we can structure our data in a way that is a model not of the society we actually live in, but of the one we hope to live in. Translating complex philosophical values into representative examples is not an easy task, but it is one that ethicists have been doing a version of for centuries in order to investigate moral concepts—and perhaps it can also be the way to convey some sense of our morals to machines.