April 19, 2019

AI Ethics: Seven Traps

By Annette Zimmermann and Bendert Zevenbergen

            The question of how to ensure that technological innovation in machine learning and artificial intelligence leads to ethically desirable—or, more minimally, ethically defensible—impacts on society has generated much public debate in recent years. Most of these discussions have been accompanied by a strong sense of urgency: as more and more studies about algorithmic bias have shown, the risk that emerging technologies will not only reflect, but also exacerbate structural injustice in society is significant.

            So which ethical principles ought to govern machine learning systems in order to prevent morally and politically objectionable outcomes? In other words: what is AI Ethics? And indeed, “is ethical AI even possible?”, as a recent New York Times article asks.

            Of course, that depends. What does ‘ethical AI’ mean? One particularly demanding possible view would be the following: ‘ethical AI’ means that (a hypothetical, extremely sophisticated, fully autonomous) artificial intelligence itself makes decisions which are ethically justifiable, all things considered. But, as philosopher Daniel Dennett argues in a recent piece in Wired, “AI in its current manifestations is parasitic on human intelligence. It quite indiscriminately gorges on whatever has been produced by human creators and extracts the patterns to be found there – including some of our most pernicious habits. These machines do not (yet) have the goals or strategies or capacities for self-criticism and innovation to permit them to transcend their databases by reflectively thinking about their own thinking and their own goals”. Of course, reflecting on the kinds of ethical principles that should underpin decisions by future, much more sophisticated artificial intelligence (‘strong AI’) is an important task for researchers and policy-makers. But it is also important to think about how ethical principles ought to constrain ‘weak AI’, such as algorithmic decision-making, here and now. Doing so is part of what AI ethics is: which values ought we to prioritise when we (partially) automate decisions in criminal justice, law enforcement, hiring, credit scoring, and other areas of contemporary life? Fairness? Equality? Transparency? Privacy? Efficiency?

            As it turns out, the pursuit of AI Ethics—even in its ‘weak’ form—is subject to a range of possible pitfalls. Many of the current discussions on the ethical dimensions of AI systems do not actively include ethicists, nor do they include experts working in relevant adjacent disciplines, such as political and legal philosophers. Therefore, a number of inaccurate assumptions about the nature of ethics have permeated the public debate, which leads to several flawed assessments of why and how ethical reasoning is important for evaluating the larger social impact of AI.

            In what follows, we outline seven ‘AI ethics traps’. In doing so, we hope to provide a resource for readers who want to understand and navigate the public debate on the ethics of AI better, who want to contribute to ongoing discussions in an informed and nuanced way, and who want to think critically and constructively about ethical considerations in science and technology more broadly. Of course, not everybody who contributes to the current debate on AI Ethics is guilty of endorsing any or all of these traps: the traps articulate extreme versions of a range of possible misconceptions, formulated in a deliberately strong way to highlight the ways in which one might prematurely dismiss ethical reasoning about AI as futile.

1. The reductionism trap:

“Doing the morally right thing is essentially the same as acting in a fair way (or: transparent, or egalitarian, or <substitute any other value>). So ethics is the same as fairness (or transparency, or equality, etc.). If we’re being fair, then we’re being ethical.”

            Even though the problem of algorithmic bias and its unfair impact on decision outcomes is an urgent problem, it does not exhaust the ethical problem space. As important as algorithmic fairness is, it is crucial to avoid reducing ethics to a fairness problem alone. Instead, it is important to pay attention to how the ethically valuable goal of optimizing for a specific value like fairness interacts with other important ethical goals. Such goals could include—amongst many others—the goal of creating transparent and explainable systems which are open to democratic oversight and contestation, the goal of improving the predictive accuracy of machine learning systems, the goal of avoiding paternalistic infringements of autonomy rights, or the goal of protecting the privacy interests of data subjects. Sometimes, these different values may conflict: we cannot always optimize for everything at once. This makes it all the more important to adopt a sufficiently rich, pluralistic view of the full range of relevant ethical values at stake—only then can one reflect critically on what kinds of ethical trade-offs one may have to confront.

2. The simplicity trap:

“In order to make ethics practical and action-guiding, we need to distill our moral framework into a user-friendly compliance checklist. After we’ve decided on a particular path of action, we’ll go through that checklist to make sure that we’re being ethical.”

            Given the high visibility and urgency of ethical dilemmas arising in the context of AI, it is not surprising that there are more and more calls to develop actionable AI ethics checklists. For instance, a 2018 draft report by the European Commission’s High-Level Expert Group on Artificial Intelligence specifies a preliminary ‘assessment list’ for ‘trustworthy AI’. While the report plausibly acknowledges that such an assessment list must be context-sensitive and that it is not exhaustive, it nevertheless identifies a list of ten fixed ethical goals, including privacy and transparency. But can and should ethical values be articulated in a checklist in the first place? It is worth examining this underlying assumption critically. After all, a checklist implies a one-off review process: on that view, developers or policy-makers could determine whether a particular system is ethically defensible at a specific moment in time, and then move on without confronting any further ethical concerns once the checklist criteria have been satisfied. But ethical reasoning cannot be a static one-off assessment: it requires an ongoing process of reflection, deliberation, and contestation. Simplicity is good, but the willingness to reconsider simple frameworks, when required, is better. Setting a fixed ethical agenda ahead of time risks obscuring new ethical problems that may arise at a later point in time, or ongoing ethical problems that become apparent to human decision-makers only later.

3. The relativism trap:

“We all disagree about what is morally valuable, so it’s pointless to imagine that there is a universal baseline that we can use to evaluate moral choices. Nothing is objectively morally good: things can only be morally good relative to each person’s individual value framework.”

            Public discourse on the ethics of AI frequently produces little more than an exchange of personal opinions or institutional positions. In light of pervasive moral disagreement, it is easy to conclude that ethical reasoning can never stand on firm ground: it always seems to be relative to a person’s views and context. But this does not mean that ethical reasoning about AI and its social and political implications is futile: some ethical arguments about AI may ultimately be more persuasive than others. While it may not always be possible to determine ‘the one right answer’, it is often possible to identify at least some paths of action that are clearly wrong, and some paths of action that are comparatively better (if not optimal all things considered). If that is the case, comparing the respective merits of ethical arguments can be action-guiding for developers and policy-makers, despite the presence of moral disagreement. Thus, it is possible and indeed constructive for AI ethics to welcome value pluralism, without collapsing into extreme value relativism.

4. The value alignment trap:

“If relativism is wrong (see #3), there must be one morally right answer. We need to find that right answer, and ensure that everyone in our organisation acts in alignment with that answer. If our ethical reasoning leads to moral disagreement, that means that we have failed.”

            The flipside of the relativist position is the view that ethical reasoning necessarily means advocating for one morally correct answer, to which everyone must align their values. This view is as misguided as relativism itself, and it is particularly dangerous to (in our view, falsely) attribute this view to everyone engaged in the pursuit of AI ethics. A recent Forbes article (ominously titled “Does AI Ethics Have A Bad Name?”) argues, “[p]eople are going to disagree about the best way to obtain the benefits of AI and minimise or eliminate its harms. […] But if you think your field is about ethics rather than about what is most effective there is a danger that you start to see anyone who disagrees with you as not just mistaken, but actually morally bad. You are in danger of feeling righteous and unwilling or unable to listen to people who take a different view. You are likely to seek the company of like-minded people and to fear and despise the people who disagree with you. This is again ironic as AI ethicists are generally (and rightly) keen on diversity.” AI ethics skepticism on the grounds that AI ethics prohibits constructive disagreement means attacking a straw man. By contrast, any plausible approach to AI ethics will avoid the value alignment trap as much as it will avoid relativism.

5. The dichotomy trap:

“The goal of ethical reasoning is to ‘be(come) ethical’.”

            Using ‘ethical’ as an adjective, such as when people speak of ‘ethical AI’, risks suggesting that there are exactly two options: AI is either ‘ethical’ or ‘unethical’; or we (as policy-makers, technologists, or society as a whole) are ‘ethical’ or ‘unethical’. We can see this kind of language in recent contributions to the public debate: “we need to be ethical enough to be trusted to make this technology on our own, and we owe it to the public to define our ethics clearly” or “building ethical artificial intelligence is an enormously complex task”. But this, again, is too simplistic. Rather than thinking of ethics as an attribute of a person or a technology, we should think of it as an activity: a type of reasoning about what the right and wrong thing to do is, and about what the world ought to look like. AI ethics (and ethics more generally) is therefore best construed as something that people think about, and something that people do. It is not something that people, or technologies, can simply be, or not be.

6. The myopia trap:

“The ethical trade-offs that we identify within one context are going to be the same ethical trade-offs that we are going to face in other contexts and moments in time, both with respect to the nature of the trade-off and with respect to the scope of the trade-off.”

            Empirical evidence and public discussion can present a clear picture of the value trade-offs and consequences of introducing an AI technology in a particular context, and that picture may inform governance decisions. However, artificial intelligence is an umbrella term for a wide range of technologies that can be used in many different contexts. The same ethical trade-offs and priorities therefore do not necessarily translate across contexts and technologies, and are indeed unlikely to.

7. The rule of law trap:

“Ethics is essentially the same as the rule of law. When we lack appropriate legal categories for the governance of AI, ethics is a good substitute. And when we do have sufficient legal frameworks, we don’t need to think about ethics.”

            To illustrate this view, consider the following point from the aforementioned NYT article: “Some activists – and even some companies – are beginning to argue that the only way to ensure ethical practices is through government regulation.” While it is true that realizing ethical principles in the real world usually requires people and institutions to advocate for their enforcement, it would be too quick to conclude that engaging in ethical reasoning is the same as establishing frameworks for legal compliance. It is misguided to frame ethics as a substitute for the rule of law, legislation, human rights, institutions, or democratically legitimate authorities. Claims to that effect, whether made to encourage the use of ethics or to criticize the discipline in the governance of technology, should be rejected because they misrepresent ethics as a discipline. Ethical and legal reasoning pursue related but distinct questions. The issue of discriminatory outcomes in algorithmic decision-making provides a useful example. From an ethical perspective, we might ask: what makes (algorithmic) discrimination morally wrong? Is the problem that it wrongly generalizes from judgments about one set of people to another set of people, thus failing to respect them as individuals? Is it that it violates individual rights, or that it exacerbates existing structures of inequality in society? On the other hand, we might ask a set of legal questions: how should democratic states enforce principles of non-discrimination and due process when algorithms support our decision-making processes? How should we interpret, apply, and expand our existing legal frameworks? Who is legally liable for disparate outcomes? Thus, ethics versus the law is not an ‘either/or’ question: sometimes, laws might fall short of enforcing important ethical values, and some ethical arguments might simply not be codifiable in law.


            This blog post responds critically to some recent trends in the public debate about AI Ethics. The seven traps which we have identified here are the following: (1) the reductionism trap, (2) the simplicity trap, (3) the relativism trap, (4) the value alignment trap, (5) the dichotomy trap, (6) the myopia trap, and (7) the rule of law trap. We will soon publish a white paper clarifying the role of ethics as a discipline in the assessment of AI system design and deployment in society, which addresses these points in more detail.

Bridging Tech-Military AI Divides in an Era of Tech Ethics: Sharif Calfee at CITP

In a time when U.S. tech employees are organizing against corporate-military collaborations on AI, how can the ethics and incentives of military, corporate, and academic research be more closely aligned on AI and lethal autonomous weapons?

Speaking today at CITP was Captain Sharif Calfee, a U.S. Naval Officer who serves as a surface warfare officer. He is a graduate of the U.S. Naval Academy and U.S. Naval Postgraduate School and a current MPP student at the Woodrow Wilson School.

Afloat, Sharif most recently served as the commanding officer, USS McCAMPBELL (DDG 85), an Aegis guided missile destroyer. Ashore, Sharif was most recently selected for the Federal Executive Fellowship program and served as the U.S. Navy fellow to the Center for Strategic & Budgetary Assessments (CSBA), a non-partisan, national security policy analysis think-tank in Washington, D.C.

Sharif spoke to CITP today with some of his own views (not speaking for the U.S. government) about how research and defense can more closely collaborate on AI.

Over the last two years, Sharif has been working on ways for the Navy to accelerate AI and adopt commercial systems to get more unmanned systems into the fleet. Toward this goal, he recently interviewed 160 people at 50 organizations. His talk today is based on that research.

Sharif next tells us about a rift between the U.S. government and companies/academia in AI. This rift is a symptom, he tells us, of a growing “civil-military divide” in the US. In previous generations, big tech companies worked closely with the U.S. military, and a majority of elected representatives in Congress had prior military experience. That’s no longer true. There is also a bifurcation between the experiences of Americans who serve in the military and those who have not. This lack of familiarity, he says, complicates moments when companies and academics discuss the potential of working with and for the U.S. military.

Next, Sharif says that conversations about tech ethics in the technology industry are creating a conflict that is making it difficult for the U.S. military to work with them. He tells us about Project Maven, a project that Google and the Department of Defense worked on together to analyze drone footage using AI. Their purpose was to reduce the number of casualties among civilians who are not considered battlefield combatants. This project, which wasn’t secret, burst into public awareness after a New York Times article and a letter from over three thousand employees. Google declined to renew the DOD contract and updated their motto.

U.S. Predator Drone (via Wikimedia Commons)

On the heels of their Project Maven decision, Google also faced criticism for working with the Chinese government to provide services in China in ways that enabled certain kinds of censorship. Suddenly, Google found themselves answering questions about why they were collaborating with China on AI and not with the U.S. military.

How do we resolve this impasse in collaboration?

Companies have reasons not to work with the government:

  • The defense acquisition process is hard for small, nimble companies to engage in
  • Defense contracts are too slow, too expensive, too bureaucratic, and not profitable
  • Companies aren’t necessarily interested in the same type of R&D products that the DOD wants
  • National security partnerships with the government might affect opportunities in other international markets
  • The Cold War is “ancient history” for the current generation
  • Global, international corporations don’t want to take sides on conflicts
  • Companies and employees seek to create good. Government R&D may conflict with that ethos

Academics also have reasons not to work for the government:

  • Worried about how their R&D will be utilized
  • Schools or faculty may philosophically disagree with the government
  • Universities are incubators of international talent, and government R&D could be divisive, not inclusive
  • Government R&D is sometimes kept secret, which hurts academic careers

Faced with this, according to Sharif, the U.S. government is sometimes baffled by people’s ideological concerns. Many in the government remember the Cold War and knew people who lived and fought in World War Two. They can sometimes be resentful about a cold shoulder from academics and companies, especially since the military funded the foundational work in computer science and AI.

Sharif tells us that R&D reached an inflection point in the 1990s. During the Cold War, new technologies were developed through defense funding (the internet, GPS, nuclear technology) and then they reached industry. Now the reverse happens. Now technologies like AI are being developed by the commercial sector and reaching government. That flow is not very nimble. DOD acquisition systems are designed for projects that take 91 months to complete (like a new airplane), while companies adopt AI technologies in 6-9 months (see this report by the Congressional Research Service).

Conversations about policy and law also constrain the U.S. government from developing and adopting lethal autonomous weapons systems, says Sharif. Even as we have important questions about the ethical risks of AI, Sharif tells us that other governments don’t have the same restrictions. He asks us to imagine what would have happened if nuclear weapons weren’t developed first by the U.S.

How can divides between the U.S. government and companies/academia be bridged? Sharif suggests:

  • The U.S. government must substantially increase R&D funding to help regain influence
  • Establish a prestigious one-year DOD/Government R&D fellowship program for top-notch STEM grads prior to joining the commercial sector
  • Expand on the Defense Innovation Unit
  • Elevate the Defense Innovation Board in prominence and expand the project to create conversations that bridge ideological divides. Organize conversations at high levels and middle management levels to accelerate this familiarization.
  • Increase DARPA and other collaborations with commercial and academic sectors
  • Establish joint DOD and Commercial Sector exchange programs
  • Expand the number of DOD research fellows and scientists present on university campuses in fellowship programs
  • Continue to reform DOD acquisition processes to streamline for sectors like AI

Sharif has also recommended to the U.S. Navy that they create an Autonomy Project Office to enable the Navy to better leverage R&D. The U.S. Navy has used structures like this for previous technology transformations on nuclear propulsion, the Polaris submarine missiles, naval aviation, and the Aegis combat system.

At the end of the day, says Sharif, what happens in a conflict where the U.S. does not have the technological overmatch and is overmatched by someone else? What are the real-life consequences? That’s what’s at stake in collaborations between researchers, companies, and the U.S. Department of Defense.

What Are Machine Learning Models Hiding?

Machine learning is eating the world. The abundance of training data has helped ML achieve amazing results for object recognition, natural language processing, predictive analytics, and all manner of other tasks. Much of this training data is very sensitive, including personal photos, search queries, location traces, and health-care records.

In a recent series of papers, we uncovered multiple privacy and integrity problems in today’s ML pipelines, especially (1) online services such as Amazon ML and Google Prediction API that create ML models on demand for non-expert users, and (2) federated learning, aka collaborative learning, that lets multiple users create a joint ML model while keeping their data private (imagine millions of smartphones jointly training a predictive keyboard on users’ typed messages).

Our Oakland 2017 paper, which has just received the PET Award for Outstanding Research in Privacy Enhancing Technologies, concretely shows how to perform membership inference, i.e., determine if a certain data record was used to train an ML model.  Membership inference has a long history in privacy research, especially in genetic privacy and generally whenever statistics about individuals are released.  It also has beneficial applications, such as detecting inappropriate uses of personal data.

We focus on classifiers, a popular type of ML model. Apps and online services use classifier models to recognize which objects appear in images, categorize consumers based on their purchase patterns, and perform other similar tasks. We show that if a classifier is open to public access – via an online API or indirectly via an app or service that uses it internally – an adversary can query it and tell from its output if a certain record was used during training. For example, if a classifier based on a patient study is used for predictive health care, membership inference can leak whether or not a certain patient participated in the study. If a (different) classifier categorizes mobile users based on their movement patterns, membership inference can leak which locations were visited by a certain user.
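To make the idea concrete, here is a minimal sketch of membership inference against a deliberately overfit toy classifier. It is not the attack from our paper: it uses a much simpler confidence-threshold baseline, and every name in it (`train_logistic`, `run_membership_demo`, the 0.95 threshold, the random data and random labels) is an assumption made purely for illustration. The adversary only sees the model's output confidence and guesses "member" when that confidence is very high.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_logistic(xs, ys, steps=100, lr=0.1):
    """Plain gradient descent on the logistic loss, with no regularization."""
    n, d = len(xs), len(xs[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(xs, ys):
            err = sigmoid(dot(w, x)) - y
            for k in range(d):
                grad[k] += err * x[k]
        w = [wk - lr * g / n for wk, g in zip(w, grad)]
    return w


def confidence(w, x):
    """Confidence the model assigns to its own predicted class."""
    p = sigmoid(dot(w, x))
    return max(p, 1.0 - p)


def run_membership_demo(n=20, d=400, threshold=0.95, seed=0):
    """Hypothetical setup: far more features than records, and random
    labels, so the only way to fit the training set is to memorize it."""
    rng = random.Random(seed)

    def sample():
        return [rng.gauss(0.0, 1.0) for _ in range(d)]

    members = [sample() for _ in range(n)]      # records used for training
    non_members = [sample() for _ in range(n)]  # records the model never saw
    labels = [rng.randint(0, 1) for _ in range(n)]

    w = train_logistic(members, labels)
    member_conf = [confidence(w, x) for x in members]
    non_member_conf = [confidence(w, x) for x in non_members]

    # Attack: guess "member" whenever the model is very confident.
    correct = sum(c >= threshold for c in member_conf)
    correct += sum(c < threshold for c in non_member_conf)
    return correct / (2 * n), member_conf, non_member_conf


if __name__ == "__main__":
    accuracy, member_conf, non_member_conf = run_membership_demo()
    print("attack accuracy:", accuracy)
    print("mean confidence on members:", sum(member_conf) / len(member_conf))
    print("mean confidence on non-members:", sum(non_member_conf) / len(non_member_conf))
```

In this toy setting the gap between confidence on training records and confidence on unseen records is what gives membership away. Real models generalize better than this one, so real attacks need more care than a single fixed threshold.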

There are several technical reasons why ML models are vulnerable to membership inference, including “overfitting” and “memorization” of the training data, but they are a symptom of a bigger problem. Modern ML models, especially deep neural networks, are massive computation and storage systems with millions of high-precision floating-point parameters. They are typically evaluated solely by their test accuracy, i.e., how well they classify the data that they did not train on. Yet they can achieve high test accuracy without using all of their capacity. In addition to asking if a model has learned its task well, we should ask: what else has the model learned? What does this “unintended learning” mean for the privacy and integrity of ML models?

Deep networks can learn features that are unrelated – even statistically uncorrelated! – to their assigned task.  For example, here are the features learned by a binary gender classifier trained on the “Labeled Faces in the Wild” dataset.

While the upper layer of this neural network has learned to separate inputs by gender (circles and triangles), the lower layers have also learned to recognize race (red and blue), a property uncorrelated with the task.

Our more recent work on property inference attacks shows that even simple binary classifiers trained for generic tasks – for example, determining if a review is positive or negative or if a face is male or female – internally discover fine-grained features that are much more sensitive. This is especially important in collaborative and federated learning, where the internal parameters of each participant’s model are revealed during training, along with periodic updates to these parameters based on the training data.

We show that a malicious participant in collaborative training can tell if a certain person appears in another participant’s photos, who has written the reviews used by other participants for training, which types of doctors are being reviewed, and other sensitive information. Notably, this leakage of “extra” information about the training data has no visible effect on the model’s test accuracy.

A clever adversary who has access to the ML training software can exploit the unused capacity of ML models for nefarious purposes. In our CCS 2017 paper, we show that a simple modification to the data pre-processing, without changing the training procedure at all, can cause the model to memorize its training data and leak it in response to queries. Consider a binary gender classifier trained in this way.  By submitting special inputs to this classifier and observing whether they are classified as male or female, the adversary can reconstruct the actual images on which the classifier was trained (the top row is the ground truth):

Federated learning, where models are crowd-sourced from hundreds or even millions of users, is an even juicier target. In a recent paper, we show that a single malicious participant in federated learning can completely replace the joint model with another one that has the same accuracy but also incorporates backdoor functionality. For example, it can intentionally misclassify images with certain features or suggest adversary-chosen words to complete certain sentences.
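This post does not spell out how the replacement works, so the following is only a sketch under stated assumptions: the server uses plain federated averaging with no clipping or anomaly detection, the benign participants are close to convergence, and the attacker knows the number of participants and the server learning rate. Under those assumptions, scaling the malicious update by `n / server_lr` cancels the averaging step. All function names and parameters here are hypothetical.

```python
import random


def federated_round(global_model, local_models, server_lr=1.0):
    """One round of federated averaging: move the global model by the
    average difference between each local model and the global model."""
    n = len(local_models)
    new_model = list(global_model)
    for local in local_models:
        for k, (l, g) in enumerate(zip(local, global_model)):
            new_model[k] += server_lr / n * (l - g)
    return new_model


def malicious_submission(global_model, target_model, n, server_lr=1.0):
    """Scale the attacker's update by n / server_lr so it survives averaging."""
    scale = n / server_lr
    return [g + scale * (t - g) for g, t in zip(global_model, target_model)]


def run_replacement_demo(n=10, dim=5, seed=0):
    rng = random.Random(seed)
    global_model = [rng.uniform(-1, 1) for _ in range(dim)]
    # Stand-in for the attacker's backdoored model (just random numbers here).
    target_model = [rng.uniform(-1, 1) for _ in range(dim)]
    # Benign participants near convergence: local models barely move.
    benign = [
        [g + rng.uniform(-0.01, 0.01) for g in global_model]
        for _ in range(n - 1)
    ]
    attacker = malicious_submission(global_model, target_model, n)
    new_global = federated_round(global_model, benign + [attacker])
    return new_global, target_model


if __name__ == "__main__":
    new_global, target = run_replacement_demo()
    gap = max(abs(a - b) for a, b in zip(new_global, target))
    print("largest gap between new global model and attacker's target:", gap)
```

After a single round, the new global model differs from the attacker's target only by the small averaged benign updates. The sketch covers only the replacement arithmetic. It does not show how the attacker trains a model that keeps its accuracy while carrying a backdoor.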

When training ML models, it is not enough to ask if the model has learned its task well.  Creators of ML models must ask what else their models have learned. Are they memorizing and leaking their training data? Are they discovering privacy-violating features that have nothing to do with their learning tasks? Are they hiding backdoor functionality? We need least-privilege ML models that learn only what they need for their task – and nothing more.

This post is based on joint research with Eugene Bagdasaryan, Luca Melis, Reza Shokri, Congzheng Song, Emiliano de Cristofaro, Deborah Estrin, Yiqing Hua, Thomas Ristenpart, Marco Stronati, and Andreas Veit.

Thanks to Arvind Narayanan for feedback on a draft of this post.