September 26, 2018

Securing the Vote — National Academies report

In this November’s election, could a computer hacker, foreign or domestic, alter votes (in the voting machine) or prevent people from voting (by altering voter registrations)?  What should we do to protect ourselves?

The National Academies of Sciences, Engineering, and Medicine have released a report, Securing the Vote: Protecting American Democracy, about the cybervulnerabilities in U.S. election systems and how to defend them.  The committee was chaired by the presidents of Indiana University and Columbia University, and the members included five computer scientists, a mathematician, two social scientists, a law professor, and three state and local election administrators.  I served on this committee, and I am confident that the report presents the clear consensus of the scientific community, as represented not only by the members of the committee but also by the 14 external reviewers (election officials, computer scientists, and experts on elections) who were part of the National Academies’ process.

The 124-page report, available for free download, lays out the scientific basis for our conclusions and our 55 recommendations.  We studied primarily the voting process; we did not address voter-ID laws, gerrymandering, social-media disinformation, or campaign financing.

There is no single national election system in the U.S.; each state or county runs its own elections.  But in the 21st century, state and local election administrators face new kinds of threats.  In the 19th and 20th centuries, elections did not face the threat of vote manipulation and voter-registration tampering by highly sophisticated adversaries operating from anywhere in the world.  Most state and local election administrators know they must improve their cybersecurity and adopt best practices, and the federal government can (and should) offer assistance.  But it’s impossible to prevent all attacks; we must be able to run elections even if the computers might be hacked, and we must be able to detect and correct errors in the computer tabulation.

Therefore, our key recommendations are:

4.11.  Elections should be conducted with human-readable paper ballots.  These may be marked by hand or by machine (using a ballot-marking device); they may be counted by hand or by machine (using an optical scanner).  Recounts and audits should be conducted by human inspection of the human-readable portion of the paper ballots.  Voting machines that do not provide the capacity for independent auditing (e.g., machines that do not produce a voter-verifiable paper audit trail) should be removed from service as soon as possible.

In our report, we explain why: voting machines can never be completely hack-proof, but with paper ballots we can, if we have to, count the votes independently of possibly hacked computers.

4.12.  Every effort should be made to use human-readable paper ballots in the 2018 federal election.  All local, state, and federal elections should be conducted using human-readable paper ballots by the 2020 presidential election.

5.8.  States should mandate risk-limiting audits prior to the certification of election results.  With current technology, this requires the use of paper ballots.  States and local jurisdictions should implement risk-limiting audits within a decade.  They should begin with pilot programs and work toward full implementation.  Risk-limiting audits should be conducted for all federal and state election contests, and for local contests where feasible. 

In our report, we explain why: examining a small random sample of the paper ballots and comparing it with the results claimed by the computers can give high statistical confidence that the computers have not been hacked to produce an incorrect outcome, or else can provide clear evidence that a recount is needed.

5.11.  At the present time, the Internet (or any network connected to the Internet)  should not be used for the return of marked ballots.  Further, Internet voting should not be used in the future until and unless very robust guarantees of security and verifiability are developed and in place, as no known technology guarantees the secrecy, security, and verifiability of a marked ballot transmitted over the Internet.

4.1.  Election administrators should routinely assess the integrity of voter registration databases and the integrity of voter registration databases connected to other applications.  They should develop plans that detail security procedures for assessing voter registration database integrity and put in place systems that detect efforts to probe, tamper with, or interfere with voter registration systems.  States should require election administrators to report any detected compromises or vulnerabilities in voter registration systems to the U.S. Department of Homeland Security, the U.S. Election Assistance Commission, and state officials.

Many of these recommendations are not controversial in most states.  Almost all the states use paper ballots counted by machine; the few remaining states that use paperless touchscreens are taking steps to move to paper ballots; the states have not adopted internet voting (except for scattered ill-advised experiments); and many, many election administrators nationwide are professionals who are working hard to come up to speed on cybersecurity.

But many election administrators are not sure about risk-limiting audits (RLAs).  They ask, “Can’t we just audit the digital ballot images that the machines provide?”  No, that won’t work: if the machine is hacked to lie about the vote totals, it can just as easily be hacked to provide fake digital pictures of the ballots themselves.  The good news is that well-designed risk-limiting audits, added to well-designed administrative processes for keeping track of batches of ballots, can be efficient and practical.  But it will take some time and effort to get things going: the design of those processes, the design of the audits themselves, training of staff, and state legislation where necessary.  And it can’t be a one-size-fits-all design: different states vote in different ways, and the risk-limiting audit must be designed to fit each state’s election systems and methods.  That’s why we recommend pilots of RLAs as soon as possible, but a 10-year period for full adoption.
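To make the statistics concrete, here is a deliberately simplified back-of-the-envelope sketch in Python. It is not the arithmetic of any real RLA method (methods such as BRAVO or ballot-comparison audits handle sampling without replacement, overstatement sizes, and escalation properly); it only shows why a surprisingly small random sample of paper ballots can carry a lot of evidence. The function and the specific numbers are purely illustrative.

```python
import math

def ballots_to_sample(reported_margin, risk_limit):
    """Roughly how many randomly chosen paper ballots must match the
    machine's records before we stop, under a very simplified model.

    reported_margin: winner's margin as a fraction of ballots cast (e.g. 0.05)
    risk_limit: maximum acceptable chance of confirming a wrong outcome (e.g. 0.05)

    Simplification: to flip the outcome, at least margin/2 of the ballots
    would have to be mis-tabulated.  If that were true, each sampled ballot
    (drawn with replacement) would reveal a discrepancy with probability f,
    so the chance of seeing *no* discrepancy in n draws is (1 - f)**n.
    We choose n so that this chance falls below the risk limit.
    """
    f = reported_margin / 2          # minimum fraction of altered ballots
    n = math.log(risk_limit) / math.log(1 - f)
    return math.ceil(n)

for margin in (0.10, 0.05, 0.01):
    print(f"margin {margin:.0%}: sample about {ballots_to_sample(margin, 0.05)} ballots")
```

The pattern is the point: with a comfortable margin, a few hundred randomly sampled ballots that all match the machine’s records already make a hacked outcome very unlikely, which is why an RLA is so much cheaper than a full recount; with a razor-thin margin, the audit naturally escalates toward a full hand count.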

Many other findings and recommendations are in the report itself.  For example: Congress should fully fund the Election Assistance Commission (EAC) to perform its mission and should authorize it to set standards for voter-registration systems and e-pollbooks, not just voting machines; and the President should nominate, and Congress should confirm, EAC commissioners.

But the real bottom line is:  there are specific things we can do, at the state level and at the national level; and we must do these things to secure our elections so that we are confident that they reflect the will of the voters.

Why PhD experiences are so variable and what you can do about it

People who do PhDs seem to have either strongly positive or strongly negative experiences — for some, it’s the best time of their lives, while others regret the decision to do a PhD. Few career choices involve such a colossal time commitment, so it’s worth thinking carefully about whether a PhD is right for you, and what you can do to maximize your chances of having a good experience. Here are four suggestions. Like all career advice, your mileage may vary.

1. A PhD should be viewed as an end in itself, not a means to an end. Some people find that they are not enjoying their PhD research, but decide to stick with it, seeing it as a necessary route to research success and fulfillment. This is a trap. If you’re not enjoying your PhD research, you’re unlikely to enjoy a research career as a professor. Besides, we professors spend the majority of our time on administrative and other unrewarding activities. (And if you don’t plan to be a professor, then you have even less of a reason to stick with an unfulfilling PhD.)

If you feel confident that you’d be happier at some other job than in your PhD, jumping ship is probably the right decision. If possible, structure your program at the outset so that you can leave with a Master’s degree in about two years if the PhD isn’t working out. And consider deferring your PhD for a year or two after college, so that you’ll have a point of comparison for job satisfaction.

2. A PhD is a terrible financial decision. Doing a PhD incurs an enormous financial opportunity cost. If maximizing your earning potential is anywhere near the top of your life goals, you probably want to stay away from a PhD. While earning prospects vary substantially by discipline, a PhD is unlikely to improve your career earnings, regardless of area.

3. The environment matters. PhD programs can be welcoming and nurturing, or toxic and dysfunctional, or anywhere in between. The institution, department, your adviser, and your peers all make a big difference to your experience. But these differences are not reflected in academic rankings. When you’re deciding between programs, you might want to weigh factors like support structures for mental health, the incidence of harassment, location, and extra-curricular activities more strongly than rankings. It is extremely common for graduate researchers to face mental health challenges. During my own PhD, I benefited greatly from professional mental health support.

4. Manage risk. Like viral videos, acting careers, and startups, the distribution of success in research is wildly skewed. Most research papers gather dust while a few get all the credit — and the process that sorts papers involves a degree of luck and circumstance that researchers often don’t like to admit. This contributes to the high variance in PhD outcomes and experiences. Even for the eventual “winners”, the uncertainty is a source of stress.

Perhaps counterintuitively, the role of luck means that you should embrace risky projects, because if a project is low-risk the upside will probably be relatively insignificant as well. How, then, to manage risk? One way is to diversify — maintain a portfolio of independent research agendas. Also, if the success of research projects is not purely meritocratic, it follows that selling your work makes a big difference. Many academics find this distasteful, but it’s simply a necessity. Still, at the end of the day, be mentally prepared for the possibility that your objectively best work languishes while a paper that you cranked out as a hack job ends up being your most highly cited.

Conclusion. Many people embark on a PhD for the wrong reasons, such as their professors talking them into it. But a PhD only makes sense if you strongly value the intrinsic reward of intellectual pursuit and the chance to make an impact through research, with financial considerations being of secondary importance. This is an intensely personal decision. Even if you decide it’s right for you, you might want to leave yourself room to re-evaluate your choice. You should pick your program carefully and have a strategy in place for managing the inherent riskiness of research projects and the somewhat lonely nature of the journey.

A note on terminology. I don’t use the terms grad school and PhD student. The “school” frame is utterly at odds with what PhD programs are about. Its use misleads prospective PhD applicants and does doctoral researchers a disservice. Besides, Master’s and PhD programs have little in common, so the umbrella term “grad school” is doubly unhelpful.

Thanks to Ian Lundberg and Veena Rao for feedback on a draft.

What Are Machine Learning Models Hiding?

Machine learning is eating the world. The abundance of training data has helped ML achieve amazing results for object recognition, natural language processing, predictive analytics, and all manner of other tasks. Much of this training data is very sensitive, including personal photos, search queries, location traces, and health-care records.

In a recent series of papers, we uncovered multiple privacy and integrity problems in today’s ML pipelines, especially in (1) online services such as Amazon ML and Google Prediction API that create ML models on demand for non-expert users, and (2) federated learning, aka collaborative learning, which lets multiple users create a joint ML model while keeping their data private (imagine millions of smartphones jointly training a predictive keyboard on users’ typed messages).
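For readers unfamiliar with the federated setting, here is a minimal sketch of the kind of aggregation involved, a toy version of federated averaging written for this post. The linear-regression clients, learning rate, and round counts are all made up for illustration; real systems train neural networks and add protections such as secure aggregation.

```python
import numpy as np

def local_update(global_weights, local_data, lr=0.1, epochs=5):
    """Each client refines the global model on its own private data.
    Toy example: linear regression trained by gradient descent."""
    w = global_weights.copy()
    X, y = local_data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server combines client models, weighted by how much data each client has."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Simulate a few clients, each holding private data the server never sees.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

global_w = np.zeros(2)
for _ in range(10):                              # ten rounds of federated training
    updates = [local_update(global_w, data) for data in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])
print("learned weights:", global_w)              # should approach [2, -1]
```

The privacy promise is that raw data never leaves the clients; only model updates do. Several of the attacks described below exploit exactly those updates.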

Our Oakland 2017 paper, which has just received the PET Award for Outstanding Research in Privacy Enhancing Technologies, concretely shows how to perform membership inference, i.e., determine if a certain data record was used to train an ML model.  Membership inference has a long history in privacy research, especially in genetic privacy and generally whenever statistics about individuals are released.  It also has beneficial applications, such as detecting inappropriate uses of personal data.

We focus on classifiers, a popular type of ML model. Apps and online services use classifier models to recognize which objects appear in images, to categorize consumers based on their purchase patterns, and to perform other similar tasks.  We show that if a classifier is open to public access (via an online API, or indirectly via an app or service that uses it internally), an adversary can query it and tell from its output whether a certain record was used during training.  For example, if a classifier based on a patient study is used for predictive health care, membership inference can leak whether or not a certain patient participated in the study. If a (different) classifier categorizes mobile users based on their movement patterns, membership inference can leak which locations were visited by a certain user.
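The attack in our paper trains “shadow models” that mimic the target in order to learn how its outputs differ on members versus non-members. A much cruder baseline, sketched below, already conveys the intuition: models tend to be more confident on records they were trained on. The sketch is illustrative only (synthetic data, an arbitrary threshold), not the attack from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for sensitive records (e.g., data from a patient study).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           random_state=0)
X_train, X_out, y_train, y_out = train_test_split(X, y, test_size=0.5,
                                                  random_state=0)

# A deliberately overfit target model, which the attacker can only query.
target = RandomForestClassifier(n_estimators=50, max_depth=None, random_state=0)
target.fit(X_train, y_train)

def confidence(model, records):
    """Attacker's view: the highest class probability the API returns."""
    return model.predict_proba(records).max(axis=1)

# Guess "member" whenever the model is more confident than some threshold.
threshold = 0.9
guess_members = confidence(target, X_train) > threshold      # true members
guess_nonmembers = confidence(target, X_out) > threshold     # non-members

tpr = guess_members.mean()        # fraction of members correctly identified
fpr = guess_nonmembers.mean()     # fraction of non-members falsely accused
print(f"true positive rate {tpr:.2f}, false positive rate {fpr:.2f}")
```

If the printed true positive rate is noticeably higher than the false positive rate, the model’s confidence alone is leaking membership, which is precisely the kind of disclosure described above.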

There are several technical reasons why ML models are vulnerable to membership inference, including “overfitting” and “memorization” of the training data, but these are symptoms of a bigger problem. Modern ML models, especially deep neural networks, are massive computation and storage systems with millions of high-precision floating-point parameters. They are typically evaluated solely by their test accuracy, i.e., how well they classify data that they did not train on.  Yet they can achieve high test accuracy without using all of their capacity.  In addition to asking whether a model has learned its task well, we should ask: what else has the model learned? What does this “unintended learning” mean for the privacy and integrity of ML models?

Deep networks can learn features that are unrelated – even statistically uncorrelated! – to their assigned task.  For example, here are the features learned by a binary gender classifier trained on the “Labeled Faces in the Wild” dataset.

While the upper layer of this neural network has learned to separate inputs by gender (circles and triangles), the lower layers have also learned to recognize race (red and blue), a property uncorrelated with the task.

Our more recent work on property inference attacks shows that even simple binary classifiers trained for generic tasks – for example, determining if a review is positive or negative or if a face is male or female – internally discover fine-grained features that are much more sensitive. This is especially important in collaborative and federated learning, where the internal parameters of each participant’s model are revealed during training, along with periodic updates to these parameters based on the training data.

We show that a malicious participant in collaborative training can tell if a certain person appears in another participant’s photos, who has written the reviews used by other participants for training, which types of doctors are being reviewed, and other sensitive information. Notably, this leakage of “extra” information about the training data has no visible effect on the model’s test accuracy.
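To illustrate the shape of this attack (not the actual attack from our paper, which is considerably more sophisticated), here is a toy sketch with synthetic data: an observer collects gradient updates from batches it has labeled with an auxiliary property, trains a meta-classifier on those gradients, and then applies it to an update it merely observes. Every dataset, feature index, and model here is invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
dim = 20
w = rng.normal(size=dim)            # current shared model (a logistic regression)

def make_batch(has_property, n=32):
    """Synthetic data: the 'sensitive property' shifts one feature's mean,
    but the training label is independent of it."""
    X = rng.normal(size=(n, dim))
    if has_property:
        X[:, 7] += 1.5              # arbitrary feature correlated with the property
    y = (X[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(float)   # the task label
    return X, y

def gradient(w, X, y):
    """Gradient of the logistic loss: what a participant would send to the server."""
    p = 1 / (1 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

# The observer builds labeled gradients from its own auxiliary data...
grads, labels = [], []
for _ in range(400):
    flag = rng.integers(2)
    X, y = make_batch(flag)
    grads.append(gradient(w, X, y))
    labels.append(flag)
meta = LogisticRegression(max_iter=1000).fit(np.array(grads), labels)

# ...then infers the property from a victim's update that it merely observes.
victim_X, victim_y = make_batch(has_property=True)
victim_grad = gradient(w, victim_X, victim_y)
print("inferred P(victim batch has property):",
      meta.predict_proba([victim_grad])[0, 1])
```

Note that the “property” (the shift in feature 7) has nothing to do with the task label, yet it leaves a detectable statistical footprint in the gradients.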

A clever adversary who has access to the ML training software can exploit the unused capacity of ML models for nefarious purposes. In our CCS 2017 paper, we show that a simple modification to the data pre-processing, without changing the training procedure at all, can cause the model to memorize its training data and leak it in response to queries. Consider a binary gender classifier trained in this way.  By submitting special inputs to this classifier and observing whether they are classified as male or female, the adversary can reconstruct the actual images on which the classifier was trained (the top row is the ground truth):

Federated learning, where models are crowd-sourced from hundreds or even millions of users, is an even juicier target. In a recent paper, we show that a single malicious participant in federated learning can completely replace the joint model with another one that has the same accuracy but also incorporates backdoor functionality. For example, it can intentionally misclassify images with certain features or suggest adversary-chosen words to complete certain sentences.
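How can one participant “completely replace” the joint model? Because the server averages the submissions, a single attacker who knows roughly how many participants there are, and can assume that benign updates are small once training has converged, can scale its own submission so that the average lands almost exactly on its backdoored model. Here is a bare-bones numpy sketch of that arithmetic only; it ignores everything that makes the real attack harder, such as the server’s learning rate, norm clipping, and anomaly detection, and the “backdoored model” is just a random stand-in.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 10
n_clients = 100

global_model = rng.normal(size=dim)                     # G: current joint model
backdoored_model = global_model + rng.normal(size=dim)  # X: attacker's target model

# Benign clients send models close to G (training has mostly converged).
benign_updates = [global_model + 0.01 * rng.normal(size=dim)
                  for _ in range(n_clients - 1)]

# Scale the malicious submission so that plain averaging of all n submissions
# lands (almost) exactly on the backdoored model.
malicious_update = n_clients * backdoored_model - (n_clients - 1) * global_model

new_global = np.mean(benign_updates + [malicious_update], axis=0)
print("distance to backdoored model:", np.linalg.norm(new_global - backdoored_model))
print("distance to old global model:", np.linalg.norm(new_global - global_model))
```

In the real attack the malicious model is also trained to keep the joint model’s accuracy on its main task, so the replacement is invisible in ordinary evaluation, which is exactly the point made above.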

When training ML models, it is not enough to ask if the model has learned its task well.  Creators of ML models must ask what else their models have learned. Are they memorizing and leaking their training data? Are they discovering privacy-violating features that have nothing to do with their learning tasks? Are they hiding backdoor functionality? We need least-privilege ML models that learn only what they need for their task – and nothing more.

This post is based on joint research with Eugene Bagdasaryan, Luca Melis, Reza Shokri, Congzheng Song, Emiliano de Cristofaro, Deborah Estrin, Yiqing Hua, Thomas Ristenpart, Marco Stronati, and Andreas Veit.

Thanks to Arvind Narayanan for feedback on a draft of this post.