Two weeks ago I started a series of posts (so far: 1, 2) about how new technologies change the policy issues around government wiretapping. I argued that technology changed the policy equation in two ways: by making storage much cheaper, and by enabling fancy computerized analyses of intercepted communications.
My plan was to work my way around to a carefully constructed hypothetical that I designed to highlight these two issues – a hypothetical in which the government gathered a giant database of everybody’s phone call records and then did data mining on the database to identify suspected bad guys. I had to lay a bit more groundwork first, but I was planning to get to the hypothetical after a few more posts.
Events intervened – the “hypothetical” turned out, apparently, to be true – which makes my original plan moot. So let’s jump directly to the NSA call-database program. Today I’ll explain why it’s a perfect illustration of the policy issues in 21st-century surveillance. In the next post I’ll start unpacking the larger policy issues, using the call record program as a running example.
The program illustrates the cheap-storage trend for obvious reasons: according to some sources, the NSA’s call record database is the biggest database in the world. This part of the program probably would not have been possible, within the NSA’s budget, until the last few years.
The data stored in the database is among the least sensitive (that is, least private) communications data around. This is not to say that it has no privacy value at all – all I mean is that other information, such as the full contents of calls, would be much more sensitive. But even if information about who called whom is not particularly sensitive for most individual calls, the government might, in effect, make it up on volume. Modestly sensitive data, in enormous quantities, can add up to a big privacy problem – an issue that is much more important now that huge databases are feasible.
The other relevant technology trend is the use of automated algorithms, rather than people, to analyze communications traffic. With so many call records, and relatively few analysts, simple arithmetic dictates that the overwhelming majority of call records will never be seen by a human analyst. It’s all about what the automated algorithms do, and which information gets forwarded to a person.
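To make that arithmetic concrete, here is a back-of-the-envelope sketch in Python. Every number in it is a made-up placeholder, not a figure from any report about the program, and the function name and parameters are my own, chosen only for illustration.

```python
def fraction_seen_by_humans(records_per_day, analysts, records_per_analyst_per_day):
    """Upper bound on the share of each day's call records a person could look at."""
    if records_per_day <= 0:
        raise ValueError("records_per_day must be positive")
    # Total number of records all analysts together could review in one day.
    human_capacity = analysts * records_per_analyst_per_day
    # The share cannot exceed 100%, even if capacity exceeds the record count.
    return min(1.0, human_capacity / records_per_day)


# Hypothetical placeholders, NOT reported figures.
RECORDS_PER_DAY = 1_000_000_000       # assumed new call records per day
ANALYSTS = 1_000                      # assumed number of human analysts
RECORDS_PER_ANALYST_PER_DAY = 1_000   # assumed records one analyst can review

share = fraction_seen_by_humans(RECORDS_PER_DAY, ANALYSTS, RECORDS_PER_ANALYST_PER_DAY)
print(f"At most {share:.4%} of records could be seen by a person")
```

With these placeholder inputs, people could look at no more than about a tenth of one percent of the records. The exact figure doesn’t matter; plug in any plausible numbers and the conclusion is the same, which is why the algorithms’ choices about what to forward carry so much weight.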
I’ll start unpacking these issues in the next post, beginning with the storage question. In the meantime, let me add my small voice to the public complaints about the NSA call record program. They ruined my beautiful hypothetical!