October 31, 2024

21st Century Wiretapping: Risk of Abuse

Today I’m returning, probably for the last time, to the public policy questions surrounding today’s wiretapping technology. Thus far in the series (1, 2, 3, 4, 5, 6, 7, 8) I have described how technology enables wiretapping based on automated recognition of certain features of a message (rather than individualized suspicion of a person), I have laid out the argument in favor of allowing such content-triggered wiretaps given a suitable warrant, and I have addressed some arguments against allowing them. These counterarguments, I think, show that content-triggered wiretaps must be used carefully and with suitable oversight, but they do not justify forgoing such wiretaps entirely.

The best argument against content-triggered wiretaps is the risk of abuse. By “abuse” I mean the use of wiretaps, or information gleaned from wiretaps, illegally or for the wrong reasons. Any wiretapping regime is subject to some kind of abuse – even if we ban all wiretapping by the authorities, they could still wiretap illegally. So the risk of abuse is not a new problem in the high-tech world.

But it is a worse problem than it was before. The reason is that to carry out content-triggered wiretaps, we have to build an infrastructure that makes all communications available to devices managed by the authorities. This infrastructure enables new kinds of abuse, for example the use of content-based triggers to detect political dissent or, given enough storage space, the recording of every communication for later (mis)use.

Such serious abuses are not likely, but given the harm they could do, even a tiny chance that they could occur must be taken seriously. The infrastructure of content-triggered wiretaps is the infrastructure of a police state. We don’t live in a police state, but we should worry about building police state infrastructure. To make matters worse, I don’t see any technological way to limit such a system to justified uses. Our only real protections would be oversight and the threat of legal sanctions against abusers.

To sum up, the problem with content-triggered wiretaps is not that they are bad policy by themselves. The problem is that doing them requires some very dangerous infrastructure.

Given this, I think the burden should be on the advocates of content-triggered wiretaps to demonstrate that they are worth the risk. I won’t be convinced by hypotheticals, even vaguely plausible ones. I won’t be convinced, either, by vague hindsight claims that such wiretaps coulda-woulda-shoulda captured some specific bad guy. I’m willing to be convinced, but you’ll have to show me some evidence.

Syndromic Surveillance: 21st Century Data Harvesting

[This article was written by a pseudonymous reader who calls him/herself Enigma Foundry. I’m publishing it here because I think other readers would find it interesting. – Ed Felten]

The recent posts about 21st Century Wiretapping described a government program which captured, stored, filtered, and analyzed large quantities of information, information to which the government had not previously had access without special court permission. On reading these posts, it struck me that there are other government programs, now in the process of being implemented, that will also capture, store, filter, and analyze large quantities of information that had not previously been available to governmental authorities.

In contrast to the NSA wiretap program described in previous posts, the program I am going to describe has not yet generated any significant amount of public controversy, although its development has taken place in nearly full public view for the past decade. Also, unlike the NSA program, this program is still hypothetical, although a pilot project is underway.

The systems that have been used to detect disease outbreaks to date rely primarily on the recognition and reporting of health statistics that fit recognized disease patterns. (See, e.g., the summary for the CDC’s Morbidity and Mortality Weekly Report.) These disease surveillance systems work well enough for outbreaks of recognized and ‘reportable’ diseases, which, by virtue of having a long clinically described history, have distinct and well-known symptoms and, in almost all cases, definitive diagnostic tests. But what if an emerging infectious disease or a bio-terrorist attack used an agent that did not fit a recognized pattern, so that there existed no well-defined set of symptoms, let alone a clinically meaningful test for identifying it?

If the initial symptoms are severe enough, as in the case of S.A.R.S., the disease will quickly come to light. (Although it is important to note that this did not happen in China, where the press was tightly controlled.) If the initial symptoms are not severe, however, the recognition that an attack has even occurred may be delayed many months (or, with certain types of agents, conceivably even years) after the event. To give Public Health Authorities the ability to see events outside the set of diseases that are required to be reported, a large database could be created to collate information such as workplace and school absenteeism, prescription and OTC (over the counter) medicine sales, symptoms reported at schools, numbers of doctor and Emergency Department visits, and even weather patterns and reported veterinary conditions. Such a database could serve a very useful function in identifying a disease outbreak and bringing it to the attention of Public Health Authorities. Such a data monitoring system has been given the name ‘Syndromic Surveillance,’ to distinguish it from traditional ‘Disease Surveillance’ programs.

You don’t need to invoke the specter of bioterrorism to make a strong case for the value of such a system. The example frequently cited is a 1993 outbreak in Milwaukee of cryptosporidium (an intestinal parasite) which eventually affected over 400,000 people. In that case, sales of anti-diarrhea medicines spiked some three weeks before officials became aware of the outbreak. If the sales of OTC medications had been monitored, perhaps officials could have been alerted to the outbreak earlier.
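The Milwaukee example suggests the core mechanism: watch a data stream against its historical baseline and raise a flag when it spikes. A minimal sketch of that idea, with invented sales figures and an assumed three-standard-deviation threshold:

```python
# Toy version of the monitoring idea: flag a day when OTC anti-diarrheal
# sales sit far above the historical baseline. All sales figures here are
# invented for illustration; real systems use far richer models.
import statistics

baseline = [100, 97, 103, 99, 101, 98, 102, 100, 96, 104]  # normal daily sales
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(todays_sales: float, threshold: float = 3.0) -> bool:
    """Return True when today's sales exceed mean + threshold * stdev."""
    return todays_sales > mean + threshold * stdev

print(is_anomalous(101))  # an ordinary day
print(is_anomalous(180))  # a spike like the one preceding the Milwaukee outbreak
```

The same check, run independently over many data series (absenteeism, ED visits, and so on), is essentially what a syndromic surveillance system does; the hard part, as the next sections discuss, is what all those independent checks do to the false-alarm rate.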

Note that this system, as currently proposed, does not necessarily create or require records that can be tied to particular individuals, although certain data about each individual (such as place of work and residence, occupation, and recent travel) are all of interest. The data would probably tie individual reports to a census tract, or perhaps a census block. So the concerns about violations of individual privacy seem less serious than in the case of the NSA data mining of telephone records, since the information is not tied to an individual and the type of information is very different from that harvested by the NSA program.

There are three interesting problems created by the database used by a Syndromic Surveillance system: (1) the problem of false positives, (2) issues relating to access to and control of the database, and (3) what to do if the Syndromic Surveillance system actually works.

First, with regard to false positives: even a very minor error rate can lead to many false alarms, and the consequences of a false alarm are much greater than in the case of the NSA data filtering program:

For instance, thousands of syndromic surveillance systems soon will be running simultaneously in cities and counties throughout the United States. Each might analyze data from 10 or more data series—symptom categories, separate hospitals, OTC sales, and so on. Imagine if every county in the United States had in place a single syndromic surveillance system with a 0.1 percent false-positive rate; that is, the alarm goes off inappropriately only once in a thousand days. Because there are about 3,000 counties in the United States, on average three counties a day would have a false-positive alarm. The costs of excessive false alarms are both monetary, in terms of resources needed to respond to phantom events, and operational, because too many false events desensitize responders to real events….
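The arithmetic in the passage above is worth making explicit. The 3,000-county figure, the 0.1 percent rate, and the 10-series-per-county figure come from the quote; the rest is a straightforward expected-value calculation:

```python
# Expected false alarms per day across many independent surveillance systems,
# using the figures from the quoted passage: ~3,000 U.S. counties, each with
# a 0.1% daily false-positive rate.

num_counties = 3000
false_positive_rate = 0.001  # one spurious alarm per thousand days, per county

print(num_counties * false_positive_rate)  # → 3.0 false alarms per day

# The quote also notes each county might monitor 10 or more data series;
# if each series can trigger its own alarm, the expected count scales up:
series_per_county = 10
print(num_counties * series_per_county * false_positive_rate)  # → 30.0
```

So the "three counties a day" figure is actually the optimistic case: with multiple data series per county, the expected daily false-alarm count grows proportionally.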

There are obviously many public policy issues relating to access to and dissemination of information generated by such a public health database, but there are two particular items, providing contradictory information, that I’d like to present in the hope of hearing your reactions and thoughts:

Livingston, NJ -When news of former President Bill Clinton’s experience with chest pains and his impending cardiac bypass surgery hit the streets, hospital emergency departments and urgent care centers in the Northeast reportedly had an increase in cardiac patients. Referred to as “the Bill Clinton Effect,” the talked-about increase in cardiac patients seeking care has now been substantiated by Emergency Medical Associates’ (EMA) bio-surveillance system.

Reports of Clinton’s health woes were first reported on September 3rd, with newspaper accounts appearing nationally in September 4th editions. On September 6th, EMA’s bio-surveillance noted an 11% increase in emergency department visits with patients complaining of chest pain (over the historical average for that date), followed by a 76% increase in chest pain visits on September 7th, and a 53% increase in chest pain visits on September 8th.

The second story has to do with my own personal experience and observation of the Public Health authorities’ actions in Warsaw immediately following the Chernobyl accident. In Warsaw, the authorities had prepared for the event, and children were immediately given iodine to prevent the uptake of radioactive iodine. This has been widely credited with preventing many deaths due to cancer. In Warsaw, the Public Health Authorities also very promptly informed the public about the level of ambient radiation. Certainly, there was great concern among the populace but panic was largely averted. My empirical evidence is of course limited, but my gut feeling is that much dislocation was averted by (1) the obvious signs of organized preparation for such an event, and (2) the transparency with which data concerning public health were disseminated.

Links:
article summarizing ‘Syndromic Surveillance’
CDC article
epi-x, CDC’s epidemic monitoring program

The Exxon Valdez of Privacy

Recently I moderated a panel discussion, at Princeton Reunions, about “Privacy and Security in the Digital Age”. When the discussion turned to public awareness of privacy and data leaks, one of the panelists said that the public knows about this issue but isn’t really mobilized, because we haven’t yet seen “the Exxon Valdez of privacy” – the singular, dramatic event that turns a known area of concern into a national priority.

Scott Craver has an interesting response:

An audience member asked what could possibly comprise such a monumental disaster. One panelist said, “Have you ever been a victim of credit card fraud? Well, multiply that by 500,000 people.”

This is very corporate thinking: take a loss and multiply it by a huge number. Sure that’s a nightmare scenario for a bank, but is that really a national crisis that will enrage the public? Especially since cardholders are somewhat sheltered from fraud. Also consider how many people are already victims of identity theft, and how much money it already costs. I don’t see any torches and pitchforks yet.

Here’s what I think: the “Exxon Valdez” of privacy won’t be $100 of credit card fraud multiplied by a half million people. It will instead be the worst possible privacy disruption that can befall a single individual, and it doesn’t have to happen to a half million people, or even ten thousand. The number doesn’t matter, as long as it’s big enough to be reported on CNN …

[…]

So back to the question: what is the worst, the most sensational privacy disaster that can befall an individual – that in a batch of, oh say 500-5,000 people, will terrify the general public? I’m not thinking of a disaster that is tangentially aided by a privacy loss, like a killer reading my credit card statement to find out what cafe I hang out at. I’m talking about a direct abuse of the private information being the disaster itself.

What would be the Exxon Valdez of privacy? I’m not sure. I don’t think it will just be a loss of money – Scott explained why it won’t be many small losses, and it’s hard to imagine a large loss where the privacy harm doesn’t seem incidental. So it will have to be a leak of information so sensitive as to be life-shattering. I’m not sure exactly what that is.

What do you think?

Twenty-First Century Wiretapping: Content-Based Suspicion

Yesterday I argued that allowing police to record all communications that are flagged by some automated algorithm might be reasonable, if the algorithm is being used to recognize the voice of a person believed (for good reason) to be a criminal. My argument, in part, was that that kind of wiretapping would still be consistent with the principle of individualized suspicion, which says that we shouldn’t wiretap someone unless we have strong enough reason to suspect them, personally, of criminality.

Today, I want to argue that there are cases where even individualized suspicion isn’t necessary. I’ll do so by introducing yet another hypothetical.

Suppose we have reliable intelligence that al Qaeda operatives have been instructed to use a particular verbal handshake to identify each other. Operatives will prove they were members of al Qaeda by carrying out some predetermined dialog that is extremely unlikely to occur naturally. Like this, for instance:

First Speaker: The Pirates will win the World Series this year.
Second Speaker: Yes, and Da Vinci Code is the best movie ever made.

The police ask us for permission to run automated voice recognition algorithms on all phone conversations, and to record all conversations that contain this verbal handshake. Is it reasonable to give permission?

If the voice recognition is sufficiently accurate, this could be reasonable – even though the wiretapping is not based on advance suspicion of any particular individual. Suspicion is based not on the identity of the individuals speaking, but on the content of the communication. (You could try arguing that the content causes individualized suspicion, at the moment it is analyzed, but if you go that route the individualized suspicion principle doesn’t mean much anymore.)

Obviously we wouldn’t give the police carte blanche to use any kind of content-based suspicion whenever they wanted. What makes this hypothetical different is that the suspicion, though content-based, is narrowly aimed and is based on specific evidence. We have good reason to believe that we’ll be capturing some criminal conversations, and that we won’t be capturing many noncriminal ones. This, I think, is the general principle: intercepted communications may only be made known to a human based on narrowly defined triggers (whether individual-based or content-based), and those triggers must be justified based on specific evidence that they will be fruitful but not overbroad.
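The narrow trigger described above is simple enough to sketch. The handshake phrases are the hypothetical exchange from the post; the transcript format and the exact matching rule are illustrative assumptions (a real system would work on recognized speech, not clean text):

```python
# Sketch of a content-based trigger: a conversation is surfaced for human
# review only if its transcript contains both handshake lines, in order.
# Everything here is a toy stand-in for real speech recognition output.

HANDSHAKE = (
    "the pirates will win the world series this year",
    "yes, and da vinci code is the best movie ever made",
)

def is_flagged(transcript: list[str]) -> bool:
    """Flag a conversation only if both handshake lines appear in order."""
    normalized = [line.lower().strip(" .!?") for line in transcript]
    try:
        first = normalized.index(HANDSHAKE[0])
    except ValueError:
        return False  # first handshake line never spoken
    return HANDSHAKE[1] in normalized[first + 1:]

innocent = ["Hi, how are you?", "Fine, thanks."]
suspect = ["The Pirates will win the World Series this year.",
           "Yes, and Da Vinci Code is the best movie ever made."]
print(is_flagged(innocent), is_flagged(suspect))  # → False True
```

The point of the sketch is the shape of the rule, not the string matching: the trigger is narrow and evidence-based, so a conversation that doesn’t contain the full handshake is never exposed to a human.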

You might argue that if the individualized suspicion principle has been good enough for the past [insert large number] years, it should be good enough for the future too. But I think this argument misses an important consequence of changing technology.

Back before the digital revolution, there were only two choices: give the police narrow warrants to search or wiretap specific individuals or lines, or give the police broad discretion to decide whom to search or wiretap. Broad discretion was problematic because the police might search too many people, or might search people for the wrong reasons. Content-based triggering, where a person got to overhear the conversation only if its content satisfied specific trigger rules, was not possible, because the only way to tell whether the trigger was satisfied was to have a person listen to the conversation. And there was no way to unlisten to that conversation if the trigger wasn’t present. Technology raises the possibility that automated algorithms can implement triggering rules, so that content-based triggers become possible – in theory at least.

Given that content-based triggering was infeasible in the past, the fact that traditional rules don’t make provision for it does not, in itself, end the argument. This is the kind of situation that needs to be evaluated anew, with proper respect for traditional principles, but also with an open mind about how those principles might apply to our changed circumstances.

By now I’ve convinced you, I hope, that there is a plausible argument in favor of allowing government to wiretap based on content-based triggers. There are also plausible arguments against. The strongest ones, I think, are (1) that content-based triggers are inconsistent with the current legal framework, (2) that content-based triggers will necessarily make too many false-positive errors and thereby capture too many innocent conversations, and (3) that the infrastructure required to implement content-based triggers creates too great a risk of abuse. I’ll wrap up this series with three more posts, discussing each of these arguments in turn.

Twenty-First Century Wiretapping: Recognition

For the past several weeks I’ve been writing, on and off, about how technology enables new types of wiretapping, and how public policy should cope with those changes. Having laid the groundwork (1; 2; 3; 4; 5), we’re now ready to bite into the most interesting question. Suppose the government is running, on every communication, some algorithm that classifies messages as suspicious or not, and that every conversation labeled suspicious is played for a government agent. When, if ever, is government justified in using such a scheme?

Many readers will say the answer is obviously “never”. Today I want to argue that that is wrong – that there are situations where automated flagging of messages for human analysis can be justified.

A standard objection to this kind of algorithmic triggering is that authority to search or wiretap must be based on individualized suspicion, that is, that there must be sufficient cause to believe that a specific individual is involved in illegal activity, before that individual can be wiretapped. To the extent that that is an assertion about current U.S. law, it doesn’t answer my question – recall that I’m writing here about what the legal rules should be, not what they are. Any requirement of individualized suspicion must be justified on the merits. I understand the argument for it on the merits. All I’m saying is that that argument doesn’t win by default.

One reason it shouldn’t win by default is that individualized suspicion is sometimes consistent with algorithmic recognition. Suppose that we have strong cause to believe that Mr. A is planning to commit a terrorist attack or some other serious crime. This would justify tapping Mr. A’s phone. And suppose we know Mr. A is visiting Chicago but we don’t know exactly where in the city he is, and we expect him to make calls on random hotel phones, pay phones, and throwaway cell phones. Suppose further that the police have good audio recordings of Mr. A’s voice.

The police propose to run automated voice recognition software on all phone calls in the Chicago area. When the software flags a recording as containing Mr. A’s voice, that recording will be played for a police analyst, and if the analyst confirms the voice as Mr. A’s, the call will be recorded. The police ask us, as arbiters of the public good, for clearance to do this.

If we knew that the voice recognition algorithm would be 100% accurate, then it would be hard to object to this. Using an automated algorithm would be more consistent with the principle of individualized suspicion than would be the traditional approach of tapping Mr. A’s home phone. His home phone, after all, might be used by an innocent family member or roommate, or by a plumber working in his house.

But of course voice recognition is not 100% accurate. It will miss some of Mr. A’s calls, and it will incorrectly flag some calls by others. How serious a problem is this? It depends on how many errors the algorithm makes. The traditional approach sometimes records innocent people – others might use Mr. A’s phone, or Mr. A might turn out to be innocent after all – and these errors make us cautious about wiretapping but don’t preclude wiretapping if our suspicion of Mr. A is strong enough. The same principle ought to hold for automated voice recognition. We should be willing to accept some modest number of errors, but if errors are more frequent we ought to require a very strong argument that recording Mr. A’s phone calls is of critical importance.

In practice, we would want to set out crisply defined criteria for making these determinations, but we don’t need to do that exercise here. It’s enough to observe that given sufficiently accurate voice recognition technology – which might exist some day – algorithmically triggered recording can be (a) justified, and (b) consistent with the principle of individualized suspicion.

But can algorithmic triggering be justified, even if not based on individualized suspicion? I’ll argue next time that it can.