December 9, 2022

AI Nation podcast, from CITP and WHYY

I’m excited to introduce AI Nation: a podcast about AI, everyday life, and what happens when we delegate vital decisions to machines. It’s a collaboration, born at CITP, between Princeton University and WHYY, Philadelphia’s famous NPR station. The first episode drops on April 1.

Tune in, and you’ll hear a variety of voices. You’ll hear my co-host, Malcolm Burnley, a journalist who reports on culture and social justice, a non-scientist sci-fi enthusiast who is much hipper than me. (A low bar, I know, but you get my point.) You’ll hear voices of people who have been impacted by AI problems, such as being arrested due to bad facial recognition. And you’ll hear from a diverse group of experts on the tech and its implications.

We spent a long time figuring out how to make a podcast that is compelling without being superficial, and connects everyday life to the deep and important issues raised by the AI and computing revolution. There were several false starts and some pilots that got progressively closer to the vision. Then we connected to the team at WHYY, and found the recipe.

I hope you like it. Whatever you think, let us know!

Huge thanks to everyone who has made this possible. At Princeton, that starts with Olga Russakovsky who helped to hatch the original vision, Tithi Chattopadhyay who shepherded the process from beginning to end, Margaret Koval who advised us and made vital connections, and Daniel Kearns for his peerless audio engineering. At WHYY, the thanks start with our producer Alex Stern (now I know and more importantly appreciate everything a producer does!), John Sheehan, and of course my co-host Malcolm Burnley.

Enhancing the Security of Data Breach Notifications and Settlement Notices

[This post was jointly written by Ryan Amos, Mihir Kshirsagar, Ed Felten, and Arvind Narayanan.]

We couldn’t help noticing that the recent Yahoo and Equifax data breach settlement notifications look a lot like phishing emails. The notifications make it hard for users to distinguish real settlement notifications from scams. For example, they direct users to URLs on unfamiliar domains that are not clearly owned by the breached company or by any other trusted entity. Practices like this lower the bar for scammers to create fake phishing emails, potentially victimizing users twice. To illustrate the severity of this problem, Equifax mixed up domain names and posted a link to a phishing website to their Twitter account. Our discussion paper presents two recommendations to stakeholders to address this issue.

First, we recommend creating a centralized database of settlements and breaches, with an authoritative URL for each one, so that users have a way to verify the notices distributed. Such a database has precedent in the Consumer Product Safety Commission (CPSC) consumer recall list. When users receive notice of a data breach, this database would serve as a reliable authority to verify the information included in the notice. A centralized database has additional value outside the data breach context as courts and government agencies increasingly turn to electronic notices to inform the public, and scammers (predictably) respond by creating false notices.

Second, we recommend that no settlement or breach notice include a URL to a new domain. Instead, such notices should include a URL to a page on a trusted, recognizable domain, such as a government-run domain or the breached party’s domain. That page, in turn, can redirect users to a dedicated domain for breach information, if desired. This helps users avoid phishing by allowing them to safely ignore links to unrecognized domains. After the settlement period is over, any redirections should be automatically removed, to prevent abandoned domains from being reused by scammers.

CITP to Launch Tech Policy Clinic; Hiring Clinic Lead

We’re excited to announce the CITP technology policy clinic, a first-of-its-kind interdisciplinary project to engage students and scholars directly in the policy process. The clinic will be supported by a generous alumni gift.

The technology policy clinic will adapt the law school clinic model to involve scholars at all levels in real-world policy activities related to technology—preparing written comments and briefs, working with startup companies, and collaborating with public-interest law groups. As an outgrowth of this work, CITP could provide federal, state and local policy makers with briefings on emerging technologies and could also create simple non-partisan guides to action for citizens and small businesses.

We’re looking to hire a Clinic Lead, an experienced policy professional to lead the clinic. For more information, go to https://citp.princeton.edu/clinic-lead/

CITP was founded as Princeton’s initiative to support research and education on technology policy issues. Over the years, CITP’s voice grew stronger as it leveraged its unique strength: world-class computer scientists and engineers working alongside leading policy experts at the Woodrow Wilson School of Public Policy. The center has now established a recognized national voice in areas including AI policy, privacy and security, technology for governance and civil liberties, broadband policy, big data, cryptocurrencies, and the internet of things. As the national debate over technology and its impact on democracy has come to the forefront in recent times, the demand for technology policy experts has surged. CITP recognizes a need to take on a larger role in tackling some of these technology policy problems by providing on-the-ground training to Princeton’s extraordinary students. We’re eager to hire a Clinic Lead and get started!

Blockchain: What is it good for?

Blockchain and cryptocurrencies are surrounded by world-historic levels of hype and snake oil. For people like me who take the old-fashioned view that technical claims should be backed by sound arguments and evidence, it’s easy to fall into the trap of concluding that there is no there there–and that blockchain and cryptocurrencies are fundamentally useless. This post is my attempt to argue that if we strip away the fluff, some valuable computer science ideas remain.

Let’s start by setting aside the currency part, for now, and focusing on blockchains. The core idea goes back to at least the 1990s: replicate a system’s state across a set of machines; use some kind of distributed consensus algorithm to agree on an append-only log of events that change the state; and use cryptographic hash-chaining to make the log tamper-evident. Much of the legitimate excitement about “blockchain” is driven by the use of this approach to enhance transparency and accountability, by making certain types of actions in a system visible. If an action is recorded in your blockchain, everyone can see it. If it is not in your blockchain, it is ignored as invalid.
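The hash-chaining idea in particular is simple enough to sketch in a few lines of code. Here is a toy tamper-evident log (illustrative only; a real blockchain adds a consensus protocol, signatures, and replication across machines):

```python
import hashlib
import json

def block_hash(prev_hash, event):
    """Hash the previous entry's hash together with the new event."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class HashChainedLog:
    """An append-only event log made tamper-evident by hash-chaining."""
    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []  # list of (event, hash) pairs

    def append(self, event):
        prev = self.entries[-1][1] if self.entries else self.GENESIS
        self.entries.append((event, block_hash(prev, event)))

    def verify(self):
        """Recompute the chain; altering any entry breaks every later hash."""
        prev = self.GENESIS
        for event, h in self.entries:
            if block_hash(prev, event) != h:
                return False
            prev = h
        return True

log = HashChainedLog()
log.append("alice pays bob 5")
log.append("bob pays carol 2")
assert log.verify()

# Rewriting an earlier event without redoing the whole chain is detected:
log.entries[0] = ("alice pays bob 500", log.entries[0][1])
assert not log.verify()
```

Because each entry's hash covers the previous entry's hash, anyone holding the most recent hash can detect tampering anywhere earlier in the log.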

An example of this basic approach is certificate transparency, in which certificate authorities (“CAs,” which vouch for digital certificates connecting a cryptographic key to the owner of a DNS name) must publish the certificates they issue on a public list, and systems refuse to accept certificates that are not on the list. This ensures that if a CA issues a certificate without permission from a name’s legitimate owner, the bogus certificate cannot be used without publishing it and thereby enabling the legitimate owner to raise an alarm, potentially leading to public consequences for the misbehaving CA.

In today’s world, with so much talk about the policy advantages of technological transparency, the use of blockchains for transparency can be an important tool.

What about cryptocurrencies? There is a lot of debate about whether systems like Bitcoin are genuinely useful as a money transfer technology. Bitcoin has many limitations: transactions take a long time to confirm, and the mining-based consensus mechanism burns a lot of energy. Whether and how these limitations can be overcome is a subject of current research.

Cryptocurrencies are most useful when coupled with “smart contracts,” which allow parties to define the behavior of a virtual actor in code, and have the cryptocurrency’s consensus system enforce that the virtual actor behaves according to its code. The name “smart contract” is misleading, because these mechanisms differ significantly from legal contracts.  (A legal contract is an explicit agreement among an enumerated set of parties that constrains the behavior of those parties and is enforced by ex post remedies. A “smart contract” doesn’t require explicit agreement from parties, doesn’t enumerate participating parties, doesn’t constrain behavior of existing parties but instead creates a new virtual party whose behavior is constrained, and is enforced by ex ante prevention of deviations.) It is precisely these differences that make “smart contracts” useful.

From a computer science standpoint, what is exciting about “smart contracts” is that they let us make conditional payments an integral part of the toolbox for designing distributed protocols. A party can be required to escrow a deposit as a condition of participating in some process, and the return of that deposit, in part or in whole, can be conditioned on the party performing arbitrary required steps, as long as compliance can be checked by a computation.

Another way of viewing the value of “smart contracts” is by observing that we often define correctness for a new distributed protocol by postulating a hypothetical trusted third party who “referees” the protocol, and then proving some kind of equivalence between the new referee-free protocol we have designed and the notional refereed protocol. It sure would be nice if we could just turn the notional referee into a smart contract and let the consensus system enforce correctness.

But all of this requires a “smart contract” system that is efficient and scalable–otherwise the cost of using “smart contracts” will be excessive. Existing systems like Ethereum scale poorly. This too is a problem that will need to be overcome by new research. (Spoiler alert: We’ll be writing here about a research solution in the coming months.)

These are not the only things that blockchain and cryptocurrencies are good for. But I hope they are convincing examples. It’s sad that the hype and snake oil have gotten so extreme that it can be hard to see the benefits. The benefits do exist.


Singularity Skepticism 4: The Value of Avoiding Errors

[This is the fourth in a series of posts. The other posts in the series are here: 1 2 3.]

In the previous post, we did a deep dive into chess ratings, as an example of a system to measure a certain type of intelligence. One of the takeaways was that the process of numerically measuring intelligence, in order to support claims such as “intelligence is increasing exponentially”, is fraught with complexity.

Today I want to wrap up the discussion of quantifying AI intelligence by turning to a broad class of AI systems whose performance is measured as an error rate, that is, the percentage of examples from a population for which the system gives a wrong answer. These applications include facial recognition, image recognition, and so on.

For these sorts of problems, the error rate tends to change over time as shown on this graph:

The human error rate doesn’t change, but the error rate for the AI system tends to fall exponentially, crossing the human error rate at a time we’ll call t*, and continuing to fall after that.

How does this reduction in error rate translate into outcomes? We can get a feel for this using a simple model, where a wrong answer is worth W and a right answer is worth R, with R>W, naturally.

In this model, the value created per decision changes over time as shown in this graph:

Before t*, humans perform better, and the value is unchanged. At t*, AI becomes better and the graph takes a sharp turn upward. After that, the growth slows as the value approaches its asymptote of R.

This graph has several interesting attributes. First, AI doesn’t help at all until t*, when it catches up with people. Second, the growth rate of value (i.e., the slope of the curve) is zero while humans are better, then it lurches upward at t*, then the growth rate falls exponentially back to zero. And third, most of the improvement that AI can provide will be realized in a fairly short period after t*.
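The step-like shape is easy to reproduce numerically. Here is a minimal version of the model, with made-up parameter values (R, W, error rates, and the decay half-life are all illustrative):

```python
import math

R, W = 1.0, 0.0      # value of a right / wrong answer (illustrative units)
human_error = 0.05   # constant human error rate
ai_error_0 = 0.40    # AI error rate at t = 0
half_life = 2.0      # AI error rate halves every 2 time units

def ai_error(t):
    """Exponentially falling AI error rate."""
    return ai_error_0 * 0.5 ** (t / half_life)

def value_per_decision(t):
    # Use whichever of human or AI currently has the lower error rate.
    e = min(human_error, ai_error(t))
    return (1 - e) * R + e * W

# t* is when the AI error rate crosses the human error rate.
t_star = half_life * math.log2(ai_error_0 / human_error)  # = 6.0 here

# Flat before t*, a kink upward at t*, then an approach to the asymptote R:
assert value_per_decision(0) == value_per_decision(t_star - 0.1)
assert value_per_decision(t_star + 1) > value_per_decision(t_star)
assert abs(value_per_decision(t_star + 20) - R) < 1e-3
```

The assertions capture the three attributes above: no improvement before t*, a jump in the growth rate at t*, and nearly all of the achievable gain realized within a few half-lives after t*.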

Viewed over a long time-frame, this graph looks a lot like a step function: the effect of AI is a sudden step up in the value created for this task. The step happens in a brief interval after AI passes human performance. Before and after that interval, the value doesn’t change much at all.

Of course, this simple model can’t be the whole story. Perhaps a better solution to this task enables other tasks to be done more effectively, multiplying the improvement. Perhaps people consume more of this task’s output because it is better. For these and other reasons, things will probably be somewhat better than this model predicts. But the model is still a long way from establishing that any kind of intelligence explosion or Singularity is going to happen.

Next time, we’ll dive into the question of how different AI tasks are connected, and how to think about the Singularity in a world where task-specific AI is all we have.