June 24, 2024

Crowdsourcing State Secrets

Those who regularly listen to Fresh Air may have heard a recent interview with journalist Dana Priest about the dramatic expansion of the intelligence community over the past ten years. Priest mentioned that the government had paid contractors several times what its own intelligence officials would be paid to perform the same analysis tasks. She also described how unwieldy the massive network of contractors had become (to the point where even deciding who gets top secret clearance had been contracted out). At the same time, in this age of Wikileaks and #Antisec, leaks and break-ins are becoming all the more common. It’s only a matter of time before thousands of military intelligence reports show up on Pastebin.

However, what if we didn’t have to pay this mass of analysts? What if we stopped worrying so much about leaks and embraced them? What if we could bring in anyone who wanted to analyze the insane amount of information by simply dumping large amounts of the raw data to a publicly accessible location? What if we crowdsourced intelligence analysis?

Granted, we wouldn’t be able to just dump everything, as some items (such as “al-Qaeda’s number 5 may be in house X in Waziristan, according to informant Y who lives in Taliban-controlled territory”) would be damaging if released. But (at least according to the interview) many of the items which are classified as top secret actually wouldn’t cause “exceptionally grave damage.” As for particularly sensitive information in such documents that could still benefit from analysis, we could simply use pseudonyms and keep the mapping from pseudonyms to real names top secret.

Adversaries would almost certainly attempt to piece together false analyses. This simply becomes an instance of the Byzantine generals problem, but with a twist: because the mainstream media is always looking for the next sensational story, it would be performing much of the analysis. Because this creates a common goal between the public and the news outlets, there would be some level of trust that other (potentially adversarial) actors would not necessarily have.

In an era when the talking heads in Washington and the media want to cut everything from the tiny National Endowment for the Arts to gigantic Social Security, the last thing we need is to pay people to do work that many would do for free. Applying open government principles to data that do not necessarily need to be kept secret could go a long way toward reducing the part of government that most politicians are unwilling to touch.


  1. Because, while the raw information may not cause any harm if released, the results of the analysis may be valuable only if secret.

    The fact that the *analysis* could be reproduced by the enemy doesn’t mean it will be, so it can still be a valuable asset itself.

  2. Just using pseudonyms / phony identifiers won’t work, for the same reasons anonymized data are often identifiable. (Paul Ohm wrote a paper on the legal problems with anonymization.) So some reports will have to stay secret; since dealing with the volume of data is a big part of the problem, publishing some is probably not as big a win as we’d like, once the work of sorting is accounted for. But I think the underlying idea is a good one, worth pursuing.

  3. People always seem to find ways to piece together identities from supposedly anonymized data. I don’t think the pseudonym idea will fly.

    That said, I like the train of thought quite a bit.