There have been rumors for years that the NSA can decrypt a significant fraction of encrypted Internet traffic. In 2012, James Bamford published an article quoting anonymous former NSA officials stating that the agency had achieved a “computing breakthrough” that gave them “the ability to crack current public encryption.” The Snowden documents also hint at some extraordinary capabilities: they show that NSA has built extensive infrastructure to intercept and decrypt VPN traffic and suggest that the agency can decrypt at least some HTTPS and SSH connections on demand.
However, the documents do not explain how these breakthroughs work, and speculation about possible backdoors or broken algorithms has been rampant in the technical community. Yesterday at ACM CCS, one of the leading security research venues, we and twelve coauthors presented a paper that we think solves this technical mystery.
The key is, somewhat ironically, Diffie-Hellman key exchange, an algorithm that we and many others have advocated as a defense against mass surveillance. Diffie-Hellman is a cornerstone of modern cryptography used for VPNs, HTTPS websites, email, and many other protocols. Our paper shows that, through a confluence of number theory and bad implementation choices, many real-world users of Diffie-Hellman are likely vulnerable to state-level attackers.
For the nerds in the audience, here’s what’s wrong: If a client and server are speaking Diffie-Hellman, they first need to agree on a large prime number with a particular form. There seemed to be no reason why everyone couldn’t just use the same prime, and, in fact, many applications tend to use standardized or hard-coded primes. But there was a very important detail that got lost in translation between the mathematicians and the practitioners: an adversary can perform a single enormous computation to “crack” a particular prime, then easily break any individual connection that uses that prime.
How enormous a computation, you ask? Possibly a technical feat on a scale (relative to the state of computing at the time) not seen since the Enigma cryptanalysis during World War II. Even estimating the difficulty is tricky, due to the complexity of the algorithm involved, but our paper gives some conservative estimates. For the most common strength of Diffie-Hellman (1024 bits), it would cost a few hundred million dollars to build a machine, based on special purpose hardware, that would be able to crack one Diffie-Hellman prime every year.
Would this be worth it for an intelligence agency? Since a handful of primes are so widely reused, the payoff, in terms of connections they could decrypt, would be enormous. Breaking a single, common 1024-bit prime would allow NSA to passively decrypt connections to two-thirds of VPNs and a quarter of all SSH servers globally. Breaking a second 1024-bit prime would allow passive eavesdropping on connections to nearly 20% of the top million HTTPS websites. In other words, a one-time investment in massive computation would make it possible to eavesdrop on trillions of encrypted connections.
NSA could afford such an investment. The 2013 “black budget” request, leaked as part of the Snowden cache, states that NSA has prioritized “investing in groundbreaking cryptanalytic capabilities to defeat adversarial cryptography and exploit internet traffic.” It shows that the agency’s budget is on the order of $10 billion a year, with over $1 billion dedicated to computer network exploitation, and several subprograms in the hundreds of millions a year.
Based on the evidence we have, we can’t prove for certain that NSA is doing this. However, our proposed Diffie-Hellman break fits the known technical details about their large-scale decryption capabilities better than any competing explanation. For instance, the Snowden documents show that NSA’s VPN decryption infrastructure involves intercepting encrypted connections and passing certain data to supercomputers, which return the key. The design of the system goes to great lengths to collect particular data that would be necessary for an attack on Diffie-Hellman but not for alternative explanations, like a break in AES or other symmetric crypto. While the documents make it clear that NSA uses other attack techniques, like software and hardware “implants,” to break crypto on specific targets, these don’t explain the ability to passively eavesdrop on VPN traffic at a large scale.
Since weak use of Diffie-Hellman is widespread in standards and implementations, it will be many years before the problems go away, even given existing security recommendations and our new findings. In the meantime, other large governments potentially can implement similar attacks, if they haven’t already.
Our findings illuminate the tension between NSA’s two missions, gathering intelligence and defending U.S. computer security. If our hypothesis is correct, the agency has been vigorously exploiting weak Diffie-Hellman, while taking only small steps to help fix the problem. On the defensive side, NSA has recommended that implementors should transition to elliptic curve cryptography, which isn’t known to suffer from this loophole, but such recommendations tend to go unheeded absent explicit justifications or demonstrations. This problem is compounded because the security community is hesitant to take NSA recommendations at face value, following apparent efforts to backdoor cryptographic standards.
This state of affairs puts everyone’s security at risk. Vulnerability on this scale is indiscriminate—it impacts everybody’s security, including American citizens and companies—but we hope that a clearer technical understanding of the cryptanalytic machinery behind government surveillance will be an important step towards better security for everyone.
For more details, see our research paper: Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice. (Update: We just received the Best Paper Award at CCS 2015!)
J. Alex Halderman is an associate professor of Computer Science and Engineering at the University of Michigan and director of Michigan’s Center for Computer Security and Society.
Nadia Heninger is an assistant professor of Computer and Information Science at the University of Pennsylvania.