May 22, 2022

How the National AI Research Resource can steward the datasets it hosts

Last week I participated in a panel about the National AI Research Resource (NAIRR), a proposed computing and data resource for academic AI researchers. The NAIRR’s goal is to subsidize the spiraling costs of many types of AI research, which have put such research out of reach of most academic groups.

My comments on the panel were based on a recent study (NeurIPS ’21) by Kenny Peng, Arunesh Mathur, and me on the potential harms of AI. We looked at almost 1,000 research papers to analyze how they used datasets, which are the engine of AI.

Let me briefly mention just two of the many things we found, and then I’ll present some ideas for NAIRR based on our findings. First, we found that “derived datasets” are extremely common. For example, there’s a popular facial recognition dataset called Labeled Faces in the Wild, and there are at least 20 new datasets that incorporate the original data and extend it in some way. One of them adds race and gender annotations. This means that a dataset may enable new harms over time. For example, once race annotations are available, the dataset can be used to build a model that tracks the movement of ethnic minorities through surveillance cameras, which some governments seem to be doing.

We also found that dataset creators are aware of this potential for misuse, so they often attach licenses that permit research use but prohibit commercial use. Unfortunately, we found evidence that many companies simply get around this by downloading a model pre-trained on the dataset (in a research context) and using that model in commercial products.

Stepping back, the main takeaway from our paper is that dataset creators can sometimes — but not always — anticipate the ways in which a dataset might be used or misused in harmful ways. So we advocate for what we call dataset stewarding, which is a governance process that lasts throughout the lifecycle of a dataset. Note that some prominent datasets see active use for decades.

I think NAIRR is ideally positioned to be the steward of the datasets that it hosts, and to perform a vital governance role over datasets and, in turn, over AI research. Here are a few specific things NAIRR could do, starting with the most lightweight.

1. NAIRR should support a communication channel between a dataset creator and the researchers who use that dataset. For example, if ethical problems — or even scientific problems — are uncovered in a dataset, it should be possible to notify its users. As trivial as this sounds, it is not always possible today. Prominent datasets have been retracted over ethical concerns without any way to notify the people who had downloaded them.

2. NAIRR should standardize dataset citation practices, for example, by providing Digital Object Identifiers (DOIs) for datasets. We found that citation practices are chaotic, and there is currently no good way to find all the papers that use a dataset in order to check for misuse. (A sketch of how DOI records could be queried for this purpose appears after this list.)

3. NAIRR could publish standardized dataset licenses. Dataset creators aren’t legal experts, and most existing licenses don’t accomplish what their creators want them to accomplish, which enables misuse.

4. NAIRR could require some analog of broader impact statements as part of an application for data or compute resources. Writing a broader impact statement could encourage ethical reflection by the authors. (A recent study found evidence that the NeurIPS broader impact requirement did result in authors reflecting on the societal consequences of their technical work.) Such reflection is valuable even if the statements are not actually used for decision making about who is approved. 

5. NAIRR could require some sort of ethical review of proposals. This goes beyond broader impact statements by making successful review a condition of acceptance. One promising model is the Ethics and Society Review instituted at Stanford. Most ethical issues that arise in AI research fall outside the scope of Institutional Review Boards (IRBs), so even a lightweight ethical review process could help prevent obvious-in-hindsight ethical lapses.

6. If researchers want to use a dataset to build and release a derivative dataset or pretrained model, then there should be an additional layer of scrutiny, because these involve essentially republishing the dataset. In our research, we found that this is the start of an ethical slippery slope, because data and models can be recombined in various ways and the intent of the original dataset can be lost.

7. There should be a way for people to report suspected ethics violations to NAIRR. The current model, for lack of anything better, is vigilante justice: journalists, advocates, or researchers sometimes identify ethical issues in datasets, and if the resulting outcry is loud enough, dataset creators feel compelled to retract or modify them.

8. NAIRR could effectively partner with other entities that have emerged as ethical regulators. For example, conference program committees have started to incorporate ethics review. If NAIRR made it easy for peer reviewers to check the policies for any given data or compute resource, that would let them verify that a submitted paper is compliant with those policies.
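As a minimal illustration of item 2 above: if NAIRR registered each hosted dataset with a DOI, finding the papers that use it could become a simple metadata query. The sketch below assumes registration with DataCite and uses a hypothetical DOI; field names follow my understanding of the DataCite JSON:API schema, so treat the details as an assumption rather than a specification.

```python
"""Sketch: look up a dataset's DOI record and list works recorded as citing it.

Assumes the dataset is registered with DataCite; the DOI below is
hypothetical, and field names follow the DataCite JSON:API schema.
"""
import requests

DATACITE_API = "https://api.datacite.org/dois/"
DATASET_DOI = "10.12345/nairr.example-dataset"  # hypothetical DOI


def fetch_doi_record(doi: str) -> dict:
    """Fetch the DataCite metadata record for a DOI."""
    resp = requests.get(DATACITE_API + doi, timeout=30)
    resp.raise_for_status()
    return resp.json()["data"]["attributes"]


def papers_citing(doi: str) -> list[str]:
    """Return identifiers of works recorded as citing the dataset."""
    attrs = fetch_doi_record(doi)
    return [
        rel["relatedIdentifier"]
        for rel in attrs.get("relatedIdentifiers", [])
        if rel.get("relationType") == "IsCitedBy"
    ]


if __name__ == "__main__":
    for citing_doi in papers_citing(DATASET_DOI):
        print(citing_doi)
```

With consistent DOI registration, a check along these lines could be run by peer reviewers or by NAIRR itself when auditing a dataset’s downstream uses.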

There is no single predominant model for ethical review of AI research analogous to the IRB model for biomedical research. It is unlikely that one will emerge in the foreseeable future. Instead, a patchwork is taking shape. The NAIRR is set up to be a central player in AI research in the United States and, as such, bears responsibility for ensuring that the research that it supports is aligned with societal values.

——–

I’m grateful to the NAIRR task force for inviting me and to my fellow panelists and moderators for a stimulating discussion.  I’m also grateful to Sayash Kapoor and Mihir Kshirsagar, with whom I previously submitted a comment on this topic to the relevant federal agencies, and to Solon Barocas for helpful discussions.

A final note: the aims of the NAIRR have themselves been contested and are not self-evidently good. However, my comments (and the panel overall) assumed that the NAIRR will be implemented largely as currently conceived, and focused on harm mitigation.

CITP Case Study on Regulating Facial Recognition Technology in Canada

Canada, like many jurisdictions in the United States, is grappling with the growing use of facial recognition technology in the private and public sectors. The technology is being deployed at a rapid pace in airports, retail stores, and social media platforms, and by law enforcement – with little oversight from the government.

To help address this challenge, I organized a tech policy case study on the regulation of facial recognition technology with two Canadian Members of Parliament – the Honourable Greg Fergus and Matthew Green. Both sit on the House of Commons’ Standing Committee on Access to Information, Privacy and Ethics (ETHI), and I served as a legislative aide to them through the Parliamentary Internship Programme before joining CITP. Our goal for the session was to put policymakers in conversation with subject matter experts.

The core problem is that there is a lack of accountability in the use of facial recognition technology, which exacerbates historical forms of discrimination and puts marginalized communities at risk of a wide range of harms. For instance, a recent story describes the fate of three Black men who were wrongfully arrested after being misidentified by facial recognition software. As the Canadian Civil Liberties Association argues, the police’s use of facial recognition technology, notably that provided by the New York-based company Clearview AI, “points to a larger crisis in police accountability when acquiring and using emerging surveillance tools.”

A number of academics and researchers – such as the DAIR Institute’s Timnit Gebru and the Algorithmic Justice League’s Joy Buolamwini, who documented the misclassification of darker-skinned women in a recent paper – are bringing attention to the discriminatory algorithms associated with facial recognition, which have put racialized people, women, and members of the LGBTIQ community at greater risk of false identification.

Meanwhile, Canadian officials are beginning to tackle the real-world consequences of the use of facial recognition. A year ago, the Office of the Privacy Commissioner found that Clearview AI had scraped billions of images of people from the internet in what “represented mass surveillance and was a clear violation of the privacy rights of Canadians.”

Following that investigation, Clearview AI stopped providing services to the Canadian market, including the Royal Canadian Mounted Police. In light of these findings and the absence of dedicated legislation, the ETHI Committee began studying the uses of facial recognition technology in May 2021, and has recently resumed this work, focusing on its use by various levels of government in Canada, law enforcement agencies, and private corporations.

The CITP case study session on March 24 began with a presentation by Angelina Wang, a graduate affiliate of CITP, who provided a technical overview explaining the different functions and harms associated with this technology. Following Wang’s presentation, I provided a regulatory overview of how U.S. lawmakers have addressed facial recognition, noting the different legislative strategies deployed for law enforcement, private, and public sector uses. We then had a substantive, free-flowing discussion with CITP researchers and the policymakers about the challenges and opportunities of different regulatory strategies.

Following CITP’s case study session, Wang and Dr. Elizabeth Anne Watkins, a CITP Fellow, were invited to testify before the ETHI Committee in an April 4 hearing. Wang discussed the different tasks facial recognition technology can and cannot perform, how the models are created, why they are susceptible to adversarial attacks, and the ethical implications of creating this technology. Dr. Watkins’ testimony provided an overview of the privacy, security, and safety concerns related to private industry’s use of facial verification on workers, as informed by her research. The committee is expected to report its findings by the end of May 2022.

We continue to do research on how Canada might regulate facial recognition technology and will publish those analyses in the coming months.

Calling for Investing in Equitable AI Research in Nation’s Strategic Plan

By Solon Barocas, Sayash Kapoor, Mihir Kshirsagar, and Arvind Narayanan

In response to the Request for Information on the update of the National Artificial Intelligence Research and Development Strategic Plan (“Strategic Plan”), we submitted comments suggesting how the Strategic Plan should focus government funding priorities to address societal issues such as equity, especially in communities that have traditionally been underserved.

The Strategic Plan highlights the importance of investing in research on developing trust in AI systems, which includes requirements for robustness, fairness, explainability, and security. We argue that the Strategic Plan should go further by explicitly committing to investments in research that examines how AI systems affect the equitable distribution of resources. Without such a commitment, there is a risk that investments in AI research will marginalize communities that are already disadvantaged, or, even where there is no direct harm to a community, that research support will focus on classes of problems that benefit already advantaged communities rather than on problems facing disadvantaged ones.

We make five recommendations for the Strategic Plan:  

First, we recommend that the Strategic Plan outline a mechanism for broader impact review when funding AI research. The challenge is that the existing mechanisms for ethics review of research projects – Institutional Review Boards (“IRBs”) – do not adequately identify downstream harms stemming from AI applications. For example, on privacy issues, an IRB review focuses on the data collection and management process. This is also reflected in the Strategic Plan’s focus on two notions of privacy: (i) ensuring the privacy of data collected for creating models via strict access controls, and (ii) ensuring the privacy of the data and information used to create models via differential privacy when the models are shared publicly.

But both of these approaches are focused on the privacy of the people whose data has been collected to facilitate the research process, not the people to whom research findings might be applied. 

Take, for example, the potential impact of face recognition for detecting ethnic minorities. Even if the researchers who developed such techniques had obtained IRB approval for their research plan, secured the informed consent of participants, applied strict access control to the data, and ensured that the model was differentially private, the resulting model could still be used without restriction for the surveillance of entire populations. That is because institutional mechanisms for ethics review such as IRBs do not consider downstream harms in their appraisal of research projects.
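To make concrete what notion (ii) does and does not protect, here is a minimal sketch of the Laplace mechanism, the textbook construction behind differential privacy. The query, data, and parameter values are illustrative, not drawn from any particular system.

```python
"""Sketch: the Laplace mechanism from differential privacy.

Protects the people *in* the data: a count query with sensitivity 1
(adding or removing one person changes the count by at most 1) is
released with Laplace noise of scale sensitivity/epsilon.
"""
import numpy as np


def private_count(records: list, predicate, epsilon: float) -> float:
    """Release a differentially private count of records matching predicate."""
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)  # sensitivity 1
    return true_count + noise


# Illustrative use: count survey respondents over 65, with epsilon = 0.5.
ages = [23, 41, 67, 70, 35, 68]
print(private_count(ages, lambda age: age > 65, epsilon=0.5))
```

Note what the sketch leaves unaddressed: nothing in it constrains how a differentially private statistic or model is later applied, which is precisely the downstream gap that IRB-style review also misses.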

We recommend that the Strategic Plan include as a research priority supporting the development of alternative institutional mechanisms to detect and mitigate the potentially negative downstream effects of AI systems. 

Second, we recommend that the Strategic Plan include provisions for funding research that would help us understand the impact of AI systems on communities, and how AI systems are used in practice. Such research can also provide a framework for informing decisions on which research questions and AI applications are too harmful to pursue and fund. 

We recognize that it may be challenging to determine what kind of impact AI research might have, as it affects a broad range of potential applications. In fact, many AI research findings will have dual use: some applications of these findings may promise exciting benefits, while others seem likely to cause harm. While it is worthwhile to weigh these costs and benefits, decisions about where to invest resources should also depend on distributional considerations: who is likely to bear the costs, and who will enjoy the benefits?

While there have been recent efforts to incorporate ethics review into the publishing processes of the AI research community, adding similar considerations to the Strategic Plan would help to highlight these concerns much earlier in the research process. Evaluating research proposals according to these broader impacts would help to ensure that ethical and societal considerations are incorporated from the beginning of a research project, instead of remaining an afterthought.

Third, our comments highlight the reproducibility crisis in fields adopting machine learning methods and the need for the government to support the creation of computational reproducibility infrastructure and a reproducibility clearinghouse that sets up benchmark datasets for measuring progress in scientific research that uses AI and ML. We suggest that the Strategic Plan borrow from the NIH’s practices and make government funding conditional on disclosing the research materials, such as code and data, necessary to replicate a study.
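As a sketch of what a disclosure condition might ask for in practice, the snippet below pins random seeds and emits a machine-readable record of the computational environment alongside the code and data. The file name and fields are illustrative, not a proposed standard.

```python
"""Sketch: a minimal reproducibility manifest of the kind a funding
condition might require. Field names and the file name are illustrative."""
import json
import platform
import random
import sys

import numpy as np

SEED = 20220522


def set_seeds(seed: int = SEED) -> None:
    """Pin random seeds so reruns of the analysis are deterministic."""
    random.seed(seed)
    np.random.seed(seed)


def write_manifest(path: str = "reproducibility_manifest.json") -> None:
    """Record the environment details needed to replicate the run."""
    manifest = {
        "python_version": sys.version,
        "platform": platform.platform(),
        "numpy_version": np.__version__,
        "random_seed": SEED,
    }
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)


if __name__ == "__main__":
    set_seeds()
    write_manifest()
```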

Fourth, we focus attention on the industry phenomenon of using a veneer of AI to lend credibility to pseudoscience, which we call AI snake oil. We see evaluating validity as a core component of ethical and responsible AI research and development. The Strategic Plan could support such efforts by prioritizing funding for setting standards, and making tools available to independent researchers, for validating claims about the effectiveness of AI applications.

Fifth, we document the need to address the phenomenon of “runaway datasets” — the practice of broadly releasing datasets used for AI applications without mechanisms of oversight or accountability for how that information can be used. Such datasets raise serious privacy concerns, and they may be used to support research that is counter to the intent of the people who contributed to them. The Strategic Plan can play a pivotal role in mitigating these harms by establishing and supporting appropriate data stewardship models, which could include supporting the development of centralized data clearinghouses to regulate access to datasets.
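To make the clearinghouse idea concrete, here is a toy sketch of the approval-and-audit core such a system might have. All identifiers (dataset names, purposes, class names) are hypothetical, and a real clearinghouse would rely on human review rather than a lookup table.

```python
"""Sketch: the access-gating core of a hypothetical data clearinghouse.
All names and the approval rule are illustrative, not a real system."""
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AccessRequest:
    requester: str
    dataset_id: str
    stated_purpose: str
    approved: bool = False
    decided_at: Optional[datetime] = None


@dataclass
class Clearinghouse:
    """Grants dataset access only for permitted uses, keeping an audit trail."""
    allowed_purposes: dict  # dataset_id -> set of permitted uses
    audit_log: list = field(default_factory=list)

    def request_access(self, req: AccessRequest) -> bool:
        permitted = self.allowed_purposes.get(req.dataset_id, set())
        req.approved = req.stated_purpose in permitted
        req.decided_at = datetime.now(timezone.utc)
        self.audit_log.append(req)  # every request is recorded, approved or not
        return req.approved


# Illustrative use: a bias audit is permitted, surveillance is not.
ch = Clearinghouse({"faces-v1": {"bias-audit", "accessibility-research"}})
print(ch.request_access(AccessRequest("lab-42", "faces-v1", "bias-audit")))    # True
print(ch.request_access(AccessRequest("lab-42", "faces-v1", "surveillance")))  # False
```

Even this toy version captures the two properties the comments call for: access is conditioned on a stated, reviewable purpose, and every request, whether granted or denied, leaves an accountability trail.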