April 23, 2014

Twenty-First Century Wiretapping: Recording

Yesterday I started a thread on new wiretapping technologies, and their policy implications. Today I want to talk about how we should deal with the ability of governments to record and store huge numbers of intercepted messages.

In the old days, before there were huge, cheap digital storage devices, government would record an intercepted message only if it was likely to listen to that message eventually. Human analysts’ time was scarce, but recording media were relatively scarce too. The cost of storage tended to limit the amount of recording.

Before too much longer, Moore’s Law will enable government to record every email and phone call it knows about, and to keep the recordings forever. The cost of storage will no longer be a factor. Indeed, if storage is free but analysts’ time is costly, then the cost-minimizing strategy is to record everything and sort it out later, rather than spending analyst time figuring out what to record. Cost is minimized by doing lots of recording.

Of course the government’s cost is not the only criterion that wiretap policy should consider. We also need to consider the effect on citizens.

Any nontrivial wiretap policy will sometimes eavesdrop on innocent citizens. Indeed, there is a plausible argument that a well-designed wiretap policy will mostly eavesdrop on innocent citizens. If we knew in advance, with certainty, that a particular communication would be part of a terrorist plot, then of course we would let government listen to that communication. But such certainty only exists in hypotheticals. In practice, the best we can hope for is that, based on the best available information, there is some known probability that the message will be part of a terrorist plot. If that probability is just barely less than 100%, we’ll be comfortable allowing eavesdropping on that message. If the probability is infinitesimal, we won’t allow eavesdropping. Somewhere in the middle there is a threshold probability, just high enough that we’re willing to allow eavesdropping. We’ll make the decision by weighing the potential benefit of hearing the bad guys’ conversations, against the costs and harms imposed by wiretapping, in light of the probability that we’ll overhear real bad guys. The key point here is that even the best wiretap policy will sometimes listen in on innocent people.
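
One way to make the threshold idea concrete is a simple expected-value rule. The sketch below is only an illustration under stated assumptions, not a description of how any real wiretap policy is set: it assumes the benefit of overhearing a real plot and the harm of a single eavesdrop can each be reduced to one number, and the names `threshold_probability` and `should_eavesdrop` are hypothetical.

```python
def threshold_probability(benefit: float, harm: float) -> float:
    """Lowest probability at which the expected benefit covers the harm.

    Assumed model: eavesdropping always incurs `harm` (cost, privacy
    damage), and yields `benefit` only if the message really is part
    of a plot, which happens with probability p. Eavesdrop when
    p * benefit >= harm, so the threshold is harm / benefit.
    """
    if benefit <= 0:
        raise ValueError("benefit must be positive")
    if harm < 0:
        raise ValueError("harm must be non-negative")
    # A threshold above 1 means eavesdropping is never justified;
    # cap it at 1 so the result is still a valid probability.
    return min(1.0, harm / benefit)


def should_eavesdrop(p: float, benefit: float, harm: float) -> bool:
    """Apply the expected-value rule to one message."""
    return p * benefit >= harm


if __name__ == "__main__":
    # Illustrative numbers only: benefit is 20 times the harm.
    print(threshold_probability(benefit=100.0, harm=5.0))
    print(should_eavesdrop(0.10, benefit=100.0, harm=5.0))
    print(should_eavesdrop(0.01, benefit=100.0, harm=5.0))
```

Under these made-up numbers the threshold sits at 5%, so a message that just clears it is still innocent 95% of the time, which is the sense in which a well-designed policy can mostly listen to innocent people.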

(For now, I’m assuming that “we” have access to the best possible information, so that “we” can make these decisions. In practice the relevant information may be closely held (perhaps with good reason) and it matters greatly who does the deciding. I know these issues are important. But please humor me and let me set them aside for a bit longer.)

The drawbacks of wiretapping come in several flavors:
(1) Cost: Wiretapping costs money.
(2) Mission Creep: The scope of wiretapping programs (arguably) tends to increase over time, so today’s reasonable, well-balanced program will lead to tomorrow’s overreach.
(3) Abuse: Wiretaps can be (and have been) misused, by improperly spying on innocent people such as political opponents of the wiretappers, and by misusing information gleaned from wiretaps.
(4) Privacy Threat: Ordinary citizens will feel less comfortable and will feel compelled to speak more cautiously, due to the knowledge that wiretappers might be listening.

Cheap, high capacity storage reduces the first drawback (cost) but increases all the others. The risk of abuse seems particularly serious. If government stores everything from now on, corrupt government officials, especially a few years down the road, will have tremendous power to peer into the lives of people they don’t like.

This risk is reason enough to insist that recording be limited, and that there be procedural safeguards against overzealous recording. What limits and safeguards are appropriate? That’s the topic of my next post.

Comments

  1. William McGeveran says:

    I suppose one possible architectural limitation on the scary scenario here is the accuracy of indexing. Even if the cost of storing recordings digitally has become trivial (or will soon), they still must be catalogued accurately. Take the case of a corrupt government type going back to find past private conversations of a political enemy. Presumably, the original index will not have been made with that purpose in mind. How easy will it be to rescan the whole database looking for different speech snippets than those originally intended?

    I hope the answer is that it would be difficult; I suspect the answer is that it would be simple.

  2. V says:

    Sidenote: Since wiretapping now covers any electronic communication (email, VoIP, etc.), it is also easier to use encryption schemes, which take more computing power to crack as data transfer gets faster and encryption codes get more robust, and which may grow more popular as identity theft increases. How does this affect the situation?

  3. cynic says:

    Ed writes:

    “If government stores everything from now on, corrupt government officials, especially a few years down the road, will have tremendous power to peer into the lives of people they don’t like.”

    I feel obligated to point out a couple of additional potential targets:

    1) People they like a little too much: think of the possibilities for sexual harassment for a petty government official with access to private conversations.

    2) Information for sale: think of the money to be made by petty government officials selling private conversations to all sorts of consumers of this type of information, such as tabloids, private investigators, stalkers, identity theft perpetrators, etc. Experience teaches us that there’s a market for this sort of thing, and that government data is sold by insiders.

  4. Steve R. says:

    I would add “Security” to the list of wiretapping drawbacks. We would be collecting massive amounts of data. Though the intent is to collect information regarding threats to our national security, the massive wholesale collection of data could result in the unintentional collection of our own security secrets. There is always the potential for a “mole” or disgruntled person to sift the data for our security secrets and give/sell it to others.

  5. paul says:

    As far as I can see, every possible safeguard one can envision for keeping the contents of a universal (yes, thus far for incomplete values of “universal”) surveillance archive in the right hands will fail as a result of human factors. No one is going to put in access controls that require a key known to be held only by the judiciary, or even built-in notification channels so that independent entities will be alerted when the data is mined. So at best you’ll be relying on the integrity and hack-resistance of the entire intelligence and law-enforcement communities, which is to say you’re up the creek.

    There are plenty of situations where liberty and civilized behavior are incompatible with the most cost-effective solution to a given problem (even assuming for the moment that a universal surveillance archive would actually solve any problems — something that the NSA program has not made clear). I think this is pretty clearly one of them. The entire idea of “collect now, analyze later” should be considered unacceptable.

  6. Steve R. says:

    Based on Paul’s post, add “Data Integrity” as another item to the list of wiretapping drawbacks. Given the degree of existing sophistication in manipulating data, how will we ever know if the retrieved data is actually “real”? The movie industry is giving us pretty realistic fantasy. In court, how would NSA document the “chain of control” to certify that the data is unaltered?

  7. Neo says:

    This stuff could never be used in court — only law enforcement taps resulting from a warrant. Its only use is to detect and prevent the next 9/11, and for that purpose it gets stale very fast. So limit recording lifetime. Past a few years its only use would be for blackmail or similar.

    As for misuses, you forget selling to PIs and to large corporate marketing departments.

  8. Jeff Epler says:

    At 4 kilobytes / second, recording 4 billion audio streams and storing them on $100 USD 300GB hard drives, you only need around 420 million drives per year, or 42 billion USD/year for the storage. These drives use 8 W while running, or $3B/year at ten cents per kWh. Since the true number of streams to be recorded is much lower, probably fewer than 400 million average simultaneous conversations, the costs would be correspondingly lower.

    Who would say “no” to the ability to “stop the next 9/11” if it only costs $5B/year? $10B/year? Certainly not the present administration.
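
The back-of-the-envelope estimate in the comment above can be reproduced with a few lines of arithmetic. This is a sketch, not the commenter's own calculation: the function names are hypothetical, and decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes) are an assumption. With these units, the quoted figure of about 420 million drives per year corresponds to roughly 1 KB/s per stream across 4 billion streams (or 4 KB/s across 1 billion); 4 KB/s across 4 billion streams comes out about four times higher. The $3B/year power figure matches 420 million drives at 8 W each.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds, ignoring leap years
HOURS_PER_YEAR = 365 * 24


def drives_per_year(streams, bytes_per_second, drive_bytes):
    """Number of drives filled by one year of continuous recording."""
    total_bytes = streams * bytes_per_second * SECONDS_PER_YEAR
    return total_bytes / drive_bytes


def yearly_power_cost_usd(drives, watts_per_drive, usd_per_kwh):
    """Electricity cost of keeping `drives` running for a full year."""
    kwh = drives * watts_per_drive * HOURS_PER_YEAR / 1000
    return kwh * usd_per_kwh


if __name__ == "__main__":
    drive_bytes = 300 * 10**9  # 300 GB drive, decimal gigabytes (assumption)
    drive_price_usd = 100      # price per drive, from the comment
    for rate in (1_000, 4_000):  # bytes per second per stream
        n = drives_per_year(4 * 10**9, rate, drive_bytes)
        print(f"{rate} B/s per stream: {n / 1e6:.0f} million drives, "
              f"${n * drive_price_usd / 1e9:.0f}B storage, "
              f"${yearly_power_cost_usd(n, 8, 0.10) / 1e9:.1f}B power")
```

Note that the power estimate covers only one year's worth of drives; if older recordings are kept online, the number of running drives grows every year.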

  9. paul says:

    Intercepts talking about 9/11 plans were, as of that morning, lying on virtual desktops waiting to be translated, read by analysts and passed up the chain of command to someone who might or might not have done anything as a result. To me, that suggests that spending a huge (or even middling) pile of gigabucks on interception and storage with a counterbalancing pile of time and money on translation and analysis will just mean that, a month after the next major terrorist attack we will have a much more detailed picture of the lives of the people who carried out the attack and any of their co-conspirators who were foolish enough not to pass handwritten notes or talk in person.

    But think of all the dirty SMS messages analysts will be able to pass around.

  10. Rich Gibbs says:

    Some earlier comments have already referred to the risks of the collected information being sold to marketers, private investigators, and so on. Beyond that, the mere existence of a large, comprehensive database of intercepted private communications would constitute an extremely attractive cracking target. Think of how convincing an identity thief could be with the information contained in, say, six months’ worth of your private conversations, E-mails, etc. And the thief’s problem is simpler than that of an official who wants to get back at an enemy: the thief doesn’t need to find any particular target, just one or more useful ones.

    All of this is in addition to the most basic point, which Bruce Schneier has made often: putting into place the apparatus of a police state is very poor “civic hygiene”.

  11. enigma_foundry says:

    “(2) Mission Creep: The scope of wiretapping programs (arguably) tends to increase over time, so today’s reasonable, well-balanced program will lead to tomorrow’s overreach.”

    Or the other question would be: Who will be in charge tomorrow? We are laying the foundation for a catastrophic accident. Think of Richard Nixon–or even worse–the Nazi Party (recall that the Nazis were elected) somehow coming into power and finding this incredibly elaborate and flexible instrumentality of total control. How exactly would they be overthrown?

    “(3) Abuse: Wiretaps can be (and have been) misused, by improperly spying on innocent people such as political opponents of the wiretappers, and by misusing information gleaned from wiretaps.”

    I would be extremely surprised if this has not already been going on. Of course it has–why else do you think Bush was so eager to get wiretapping going without a warrant? Bush is almost certainly spying right now on Democrats, trying to get their plans for upcoming elections, or dig up dirt on opponents. It has already been documented that anti-war groups were infiltrated by government agents. George Soros and others like him almost certainly are now spied on by NSA.

  12. Govt Skeptic says:

    I think the new title of this series should be:
    “Twenty-First Century (Bulk) Wiretapping: The Only Explanation Is Nefarious Use”

    After only two installments, I think it’s pretty easy to see where this is going. Basically, from a technology point of view, it’s hard to believe that any of this wiretapping leads anywhere good. Humans simply can’t process that much data, and computers simply won’t do a good enough job to make it worthwhile (see Schneier’s blog for analysis on false positives in these systems). The only explanation is that this is simply an excuse to set up a zero-accountability program that can be used to monitor specific individuals for purposes other than national security (e.g., pure politics, campaign tactics, personal attacks, etc).

    Okay, enjoy your digital day, everyone!

    P.S. enigma_foundry, your post constitutes thoughtcrime and as such is unacceptable. Please report to reeducation immediately.

  13. Tarek says:

    “If that probability is less than 100%, we’ll be comfortable allowing eavesdropping on that message.”

    Did you mean “if that probability is 100%”? (The way it’s written, this first option includes the infinitesimal and the “somewhere in the middle” zones, which I presume shouldn’t overlap.)

    I’m also wondering about how much of the recordings get archived, and what standard gets applied to old recordings. The place we draw the line on what’s appropriate and what’s not is going to change with circumstance (popular opinion, who’s in power, state of the world, etc).

    Should (or do) we have a rule that archived recordings can be searched only using the criteria that were in effect at the time of their recording?

    I think there’s a legal concept similar to this that says that you can’t be charged with a crime for an act that wasn’t illegal when you committed it, but the name escapes me…

    (Aha! wikipedia to the rescue: Ex post facto)

  14. fourleggedant says:

    as the likelihood of any particular form of communication being intercepted (and stored) approaches 1 so the likelihood of that form being chosen as the vehicle for secure communication approaches 0. it obviously follows that the likelihood of any given target of wiretapping being innocent is equivalent to the ease of interception.

    . -ant

  15. Peter says:

    I think a lot of this also falls under the idea “google can do it, why can’t we?” The information gleaned from just pen traps (who you call, and who calls you) tells a lot about who you are. That information is of value to marketers: “well, Betty bought this frobinator2006, why won’t you say yes?” Just dropping the names of the people you communicate with, without the claim (the sale to Betty) even being true, will fool a lot of prospective customers into saying “yes.”

    I agree with the other posters who claim it will be used for blackmail.

  17. Stephen Purpura says:

    I’m a believer in limited government on wiretapping, but I am not as pessimistic about human controls. But this assumes that we’re not focused on trying to protect against the “World War II Euro Scenario” when historical information was taken over by the Nazis and used to exterminate citizens.

    The problem with FISA today is that it doesn’t take into consideration the reality of new technology. It needs to be re-written to address it. But completely preventing the use of this new technology to defend ourselves seems foolish.

    These days, I seem to hear a lot of complaints which dismiss the usefulness of collecting this information unless it would be used to investigate every person. There seems to be a deficit in explanations which address why collecting “non-terrorist” samples for comparison against “terrorist” samples helps to eliminate type 1 and type 2 errors within terrorism analysis systems.

  18. Old Grouch says:

    And of course the mere existence of such a data store will attract a horde of would-be “legitimate” users, each with his own “good reasons” for gaining access: the historian, sociologist, psychologist, missing-child investigator, graduate student, bluenose, busybody, health activist, the antismoking-Nazi – everybody but the average citizen.

    Gee, what if we take the DMCA, apply it to personal communications, and say they couldn’t listen to my conversations without my permission (or a court order) until 75 years after my death!

  19. Edward Kuns says:

    The fact that they already had the information they needed before the 9/11 attack, but didn’t have the analysis ability to process it quickly enough, tells me that adding an enormous new mountain of data even bigger than what you originally started with will make that problem worse. That is, it will take even LONGER than before for the intelligence agencies to process information. They are doing something that actually makes the situation worse in the guise of doing something that makes it better.

    And I keep hearing the specious argument, “Ordinary citizens [who happen to work for the phone company] have access to this information. It isn’t private. We have no reasonable expectation of privacy. We have no reason to deny the government this data.” By that reasoning, why don’t they look at our medical records, library records (I know, Patriot Act), grocery store purchase records as recorded by courtesy cards, purchase history of our credit cards, and so on? After all, any ordinary citizen (who happens to work for your credit card company) can look up your entire history of purchases (at least as allowed by the software at the credit card company), so why shouldn’t the government have access to this same information?

    And again, what amazes me is that only 53% of Americans think this phone call record sifting goes too far and that so many people are incapable of seeing any problem with it. To many people, this falls into, “If you haven’t done anything wrong, you have nothing to be afraid of.” To which I cry BS.

    But I know that here I’m preaching to the choir so I’ll shut up now.

  20. Jim Harper says:

    “Before too much longer, Moore’s Law will enable government to record every email and phone call it knows about, and to keep the recordings forever. The cost of storage will no longer be a factor.”

    Storage may become cheap, but given the quantities of information involved, it will not be free and probably not even inexpensive if the data is to be accessible and usable. You’ve been careful to note that cost “will no longer be a factor.” By this, do you mean that storage will actually be cheap? Or do you presume that Congress and the American people will continue indefinitely writing a blank check to security agencies?

    Ed, more thought and care around the “storage is free” notion would be welcome from an influential computer scientist like yourself.

  21. PJ says:

    …but how long before enduser devices can all be encryption-enabled, thereby making wiretapping expensive again?

  22. Ed Felten says:

    Jim,

    I’m assuming that the budget that the intelligence agencies have to spend on this stuff won’t change by a huge factor. Cutting that budget in half buys you an extra eighteen months, i.e., what they can do today for $X, they can do eighteen months from now for $X/2. Unless we postulate a huge budget cut, the budget doesn’t change the basic issue, it only delays the problem a bit.

    My main point is that the day will come, not too far in the future, when the intelligence agencies will be able to store nearly everything, unless the law stops them.
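
The budget argument in the comment above generalizes beyond a cut by half. The sketch below assumes, as that comment does, that the cost of a fixed storage capability halves every eighteen months; the function name `delay_months` is hypothetical, and the smooth exponential decline is a simplification.

```python
import math


def delay_months(budget_cut_factor: float,
                 halving_period_months: float = 18.0) -> float:
    """Months until a reduced budget buys what the full budget buys today.

    If cost halves every `halving_period_months`, a budget divided by
    `budget_cut_factor` catches up after log2(factor) halving periods.
    """
    if budget_cut_factor <= 0:
        raise ValueError("budget_cut_factor must be positive")
    return halving_period_months * math.log2(budget_cut_factor)


if __name__ == "__main__":
    for factor in (2, 4, 10, 100):
        print(f"budget cut by {factor}x: "
              f"about {delay_months(factor):.0f} months of delay")
```

Because the delay grows only with the logarithm of the cut, even a tenfold budget reduction postpones the same capability by roughly five years under this assumption.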

  23. Neo says:

    Storage per buck and the worldwide flow of information will both exponentiate, and will tend to limit each other too (storage capacity being one factor influencing transmission capacity, as data has to lay over in buffers between forwardings, and has to originate somewhere; meanwhile, storage isn’t very useful unless you receive or generate a lot of data to store).

    It seems to me to be quite likely that in ten years, you’ll be able to store the same fraction of global communications traffic per $1000 as you can store now, in constant dollars. If the intelligence budgets remain roughly the same in constant dollars, it follows that they’ll be able to store the same fraction of total traffic in ten years as now. Of course the amount they store will be maybe 100 times bigger, but the amount they miss will be 100 times bigger as well.

    Encryption strength and brute force decryption will also stay around parity, with the bigger intelligence agencies able to break a message if they turn all their resources to the one message and it’s long enough, and the crypto available (if you know where on the net to look) strong enough to stop any brute forcing by a lesser effort. This assumes quantum cryptography and cryptanalysis don’t change the whole landscape. (It seems likely quantum computers would easily crack classical codes based on NP problems like the discrete logarithm, but quantum encryption will be uncrackable except maybe with a quantum computer inserted as a wiretap, and even then, if one party sends one of an entangled pair of particles down the line and both parties perform joint measurements on their respective particles, any middleman presence can be detected.)

  24. billybob says:

    i say sure yup that what i think too…

  25. Trish says:

    “If government stores everything from now on, corrupt government officials, especially a few years down the road, will have tremendous power to peer into the lives of people they don’t like.”

    As a person who is not well-liked by the community I live in, it’s already happening on a daily basis, and believe me, it’s not fun. It’s creepy.