In his NSA reform speech last week, the President called for a study of how to re-architect the NSA’s phone call data program to change where the data is stored. This raises a bunch of interesting computer science questions, which I’m planning to explore in a series of posts here.
Here is the relevant part of the President’s speech:
For all these reasons, I believe we need a new approach. I am therefore ordering a transition that will end the Section 215 bulk metadata program as it currently exists, and establish a mechanism that preserves the capabilities we need without the government holding this bulk metadata.
This will not be simple. The review group recommended that our current approach be replaced by one in which the providers or a third party retain the bulk records, with government accessing information as needed. Both of these options pose difficult problems. Relying solely on the records of multiple providers, for example, could require companies to alter their procedures in ways that raise new privacy concerns. On the other hand, any third party maintaining a single, consolidated database would be carrying out what is essentially a government function but with more expense, more legal ambiguity, potentially less accountability — all of which would have a doubtful impact on increasing public confidence that their privacy is being protected.
I have instructed the intelligence community and the Attorney General to use this transition period to develop options for a new approach that can match the capabilities and fill the gaps that the Section 215 program was designed to address without the government holding this metadata itself.
This will kick off a process that might not be pretty. Redesigning this program is essentially an exercise in computer system design: how to create a system that has reasonable cost, serves the NSA’s essential intelligence requirements, and offers better privacy protection and accountability than the old system did. And this system is going to be designed by … the Washington policy community, which is not known for its technical savvy.
Saturday’s Washington Post ran a story about this issue, featuring quotes like this one:
The problem is that phone companies are used to receiving law enforcement requests to search for customers’ records. If they are handed a number that does not belong to a customer, say a number in Yemen, the task becomes much harder.
“It would be an incredibly long process, because basically we would be setting a computer running to search through billions of numbers,” said one industry official who was not authorized to speak on the record. “It would probably take days to comb through the database.”
Reading this, I am reminded of the scene in Austin Powers where Dr. Evil, in exchange for not destroying the world, demands the staggering sum of “… one MILLION dollars.” In the year 2014, billions of records is not a particularly large database, and searching through billions of records is not an onerous requirement. The metadata for a billion calls would fit on one of those souvenir thumb drives they give away at conferences; or if you want more secure, backed-up storage, Amazon will rent you what you need for $3 a month. Searching through a billion records looking for a particular phone number took a few minutes on my everyday laptop, but that is only because I didn’t bother to build a simple index, which would have made the search much faster. This is not rocket science.
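To make the point concrete, here is a minimal Python sketch of the two approaches: scanning every record versus looking a number up in a simple index. The record layout (two 8-byte phone numbers plus a 4-byte start time and a 4-byte duration), the function names, and the sample phone numbers are all assumptions made up for illustration; this is not the format any phone company or the NSA actually uses, and it is not the experiment I ran on my laptop.

```python
import struct
from collections import defaultdict

# Hypothetical fixed-width record layout (an assumption for illustration only):
# caller and callee as 8-byte unsigned ints, start time and duration as 4-byte unsigned ints.
RECORD = struct.Struct("<QQII")


def storage_gb(num_records):
    """Raw size, in gigabytes (10**9 bytes), of num_records fixed-width records."""
    return num_records * RECORD.size / 1e9


def linear_search(records, number):
    """Scan every record; the work grows with the size of the whole database."""
    return [r for r in records if r[0] == number or r[1] == number]


def build_index(records):
    """One pass over the data: map each phone number to the positions of its records."""
    index = defaultdict(list)
    for pos, (caller, callee, _start, _duration) in enumerate(records):
        index[caller].append(pos)
        if callee != caller:
            index[callee].append(pos)
    return index


def indexed_search(records, index, number):
    """Look the number up directly; the work grows only with the number of matches."""
    return [records[pos] for pos in index.get(number, [])]


if __name__ == "__main__":
    # Tiny made-up data set: (caller, callee, start time, duration in seconds).
    records = [
        (9671234567, 2025550101, 1390000000, 60),
        (2025550101, 2025550102, 1390000100, 120),
        (2025550103, 9671234567, 1390000200, 30),
    ]
    index = build_index(records)
    print(storage_gb(10**9), "GB for a billion records under this layout")
    print(linear_search(records, 9671234567))
    print(indexed_search(records, index, 9671234567))
```

Both searches return the same records. The difference is that the index is built once, and after that each query touches only the matching records rather than the whole database. A real deployment would presumably use an on-disk database index rather than an in-memory dictionary, but the principle is the same.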
Fortunately, a former high-ranking government lawyer is willing to speculate that the necessary technology might someday be feasible: “The United States has the best technologists and innovators in the world. I’m confident that if the intelligence community focuses on it and works with companies in the private sector, they can solve that problem.”
As the Washington policy establishment takes up the debate about how to structure the NSA program, I fear that we’re going to see a lot of these sorts of weak pseudo-computer science arguments. It’s up to those of us who understand the issues to speak up, in the hope of fostering a fact-based dialogue.
As it turns out, redesigning the NSA metadata program does raise some interesting computer science questions. I’ll start to unpack them in the next post.