November 21, 2024

New research: There's no need to panic over factorable keys–just mind your Ps and Qs

You may have seen the preprint posted today by Lenstra et al. about entropy problems in public keys. Zakir Durumeric, Eric Wustrow, Alex Halderman, and I have been waiting to talk about some similar results. We will be publishing a full paper after the relevant manufacturers have been notified. Meanwhile, we’d like to give a more complete explanation of what’s really going on.

We have been able to remotely compromise about 0.4% of all the public keys used for SSL web site security. The keys we were able to compromise were generated incorrectly–using predictable “random” numbers that were sometimes repeated. There were two kinds of problems: keys that were generated with predictable randomness, and a subset of these, where the lack of randomness allows a remote attacker to efficiently factor the public key and obtain the private key. With the private key, an attacker can impersonate a web site or possibly decrypt encrypted traffic to that web site. We’ve developed a tool that can factor these keys and give us the private keys to all the hosts vulnerable to this attack on the Internet in only a few hours.

However, there’s no need to panic as this problem mainly affects various kinds of embedded devices such as routers and VPN devices, not full-blown web servers. (It’s certainly not, as suggested in the New York Times, any reason to have diminished confidence in the security of web-based commerce.) Unfortunately, we’ve found vulnerable devices from nearly every major manufacturer and we suspect that more than 200,000 devices, representing 4.1% of the SSL keys in our dataset, were generated with poor entropy. Any weak keys found to be generated by a device suggests that the entire class of devices may be vulnerable upon further analysis.


We’re not going to announce every device we think is vulnerable until we’ve contacted their manufacturers, but the attack is fairly easy to reproduce from material already known. That’s why we are working on putting up a web site that you can use to determine whether your device is immediately vulnerable.

Read on for more details, and watch for our full paper soon.

Don’t worry, the key for your bank’s web site is probably safe

SSL is used to authenticate every major web site on the Internet, but in our analysis, these were not the keys that were vulnerable to the problems outlined in this blog post.

So which systems are vulnerable? Almost all of the vulnerable keys were generated by and are used to secure embedded hardware devices such as routers and firewalls, not to secure popular web sites such as your bank or email provider. Only one of the factorable SSL keys was signed by a trusted certificate authority and it has already expired. There are signed certificates using repeated keys; some of them are generated by vulnerable devices, some of them are due to website owners submitting known weak keys to be signed, and for some of them we have no good explanation.

Embedded devices are well known to have entropy problems. However, until now it wasn’t apparent how widespread these problems were in real, Internet-connected devices.

Background: key generation

Websites and networked computers use public-key cryptography for authentication. The kind of authentication that we will be talking about here is a server certifying to a client that it really is the server that the client intended to connect to. An attacker who knows the private key to one of these systems would be able to impersonate the real system to a client or in many cases decrypt encrypted traffic between the client and server.

The most widely used cryptosystem for this purpose is RSA. The RSA cryptosystem is intended to be based on the difficulty of factoring large numbers. An RSA public key consists of a pair of integers: an encryption exponent e and a modulus N, which is a large integer that itself is the product of two large primes, p and q. If an adversary can factor this integer N back into its prime factors p and q, then the adversary can decrypt any messages encrypted using this public key. However, even using the fastest known factoring algorithm, to public knowledge nobody has yet been able to factor a 1024-bit RSA modulus.

It is vitally important to the security of the keys that they are generated using random inputs. If the inputs used to generate the keys were not random, then an adversary may be able to guess those inputs and thus recover the keys without having to laboriously factor N.

On modern computers and servers, key generation software attempts to collect random information from physical sources (often through the underlying operating system): the movements of the mouse, keyboard, hard drive, network events, and other external sources of unpredictable information. However, if the keys are generated from a small set of possibilities, that is, using too little entropy, then the keys may be vulnerable to an attacker. Gathering strong entropy and verifying its strength is a very difficult problem that has given rise to multiple vulnerabilities over the years.

Two versions of the problem

We decided to investigate the prevalence of this issue by scanning the Internet for all SSL and SSH public keys. We scanned every IPv4 address on the Internet, collecting a copy of each SSL certificate and SSH host key. We were able to complete both scans in less than a day: we first used a standard tool called nmap to find hosts with the relevant ports open, and then used our own optimized software to query those hosts. In our SSL scan, we collected 5.8 million certificates. In our SSH scan, we collected 10 million host keys.

We found that entropy problems resulted in two different types of weaknesses:

Repeated public keys. We found that 1% of the RSA keys in our SSL scan data were repeated, apparently due to entropy problems. When two different devices have the same public key, it means they also have the same private key. In effect, the devices that share keys are “in the same boat” as one another–an attacker would only need to compromise the weakest one of these devices, in order to obtain the repeated private key that protects all of the devices. This has long been a known problem, but until now, none of the publicly available security literature has documented how widespread the problem was.

We manually verified that 59,000 duplicate keys were repeated due to entropy problems, representing 1% of all certificates, or 2.6% of self-signed certificates. We also found that 585,000 certificates, or 4.6% of all devices used the default certificates pre-installed on embedded devices. While these devices are not using keys generated with poor entropy, they are suspectible to the same attack as their private keys are found on every device of a given model. We manually verified these keys because a large number of websites may utilize repeated keys for legitimate reason; these provide no risk to users.

Factorable public keys. More surprisingly, we discovered that entropy problems can allow a remote attacker with no special access to factor a significant fraction of the RSA keys in use on the Internet. We were able to factor 0.4% of the RSA keys in our SSL scan. We did this by computing the greatest common divisor (GCD) of all pairs of moduli from RSA public keys on the Internet.

We identified 1724 common factors which allowed us to factor 24,816 SSL keys, and 301 common factors which allowed us to factor 2422 SSH host keys. This means we were able to calculate the private keys for almost half of 1% of the RSA keys in use for SSL. We will explain how we did this calculation below.

Specific vulnerable devices

Embedded devices often generate cryptographic keys on first boot, when their entire state may have been pre-determined in the factory. This can result in the kinds of entropy problems we observe in this study.

We were able to use information from the SSL certificates to identify classes of devices that are prone to generating weak keys. Many more devices than the ones whose keys we factored are probably also producing weak keys that could be compromised by a determined attacker. The list of vulnerable devices that we have already identified includes more than thirty different manufacturers, including almost all of the biggest names in the computer hardware industry. The kinds of products that we identified include firewalls, routers, VPN devices, remote server administration devices, printers, projectors, and VOIP phones.

We’re not going to list specific devices or brands until we’ve told the manufacturers, but here are some examples:

Firewall product X:
52,055 hosts in our SSL dataset
6,730 share public keys
12,880 have factorable keys

Consumer-grade router Y:
26,952 hosts in our SSL dataset
9,345 share public keys
4,792 have factorable keys

Enterprise remote access solution Z:
1,648 hosts in our SSL dataset
24 share public keys
0 factorable

How could this happen?

It wasn’t obvious at first how these types of entropy problems might result in keys that could be factored. We’ll explain now for the geekier readers.

Here’s one way a programmer might generate an RSA modulus:

prng.seed(seed)
p = prng.generate_random_prime()
q = prng.generate_random_prime()
N = p*q

If the pseudorandom number generator is seeded with a predictable value, then that would likely result in different devices generating the same modulus N, but we would not expect a good pseudorandom number generator to produce different moduli that share a single factor.

However, some implementations add additional randomness between generating the primes p and q, with the intention of increasing security:

prng.seed(seed)
p = prng.generate_random_prime()
prng.add_randomness(bits)
q = prng.generate_random_prime()
N = p*q

If the initial seed to the pseudorandom number generator is generated with low entropy, this could result in multiple devices generating different moduli which share the prime factor p and have different second factors q. Then both moduli can be easily factored by computing their GCD: p = gcd(N1, N2).

OpenSSL’s RSA key generation functions this way: each time random bits are produced from the entropy pool to generate the primes p and q, the current time in seconds is added to the entropy pool. Many, but not all, of the vulnerable keys were generated by OpenSSL and OpenSSH, which calls OpenSSL’s RSA key generation code.

Computing the GCDs of all pairs of keys

If any pair of RSA moduli N1 and N2 share, say, the same prime factor p in common, but have different second factors q1 and q2, then we can easily factor the moduli by computing their greatest common divisor. On my desktop computer, computing the GCD of two 1024-bit RSA moduli took about 17µs.

For the mathematically inclined, I’ll explain how we were able to use this idea to factor a large collection of keys.

The simplest way that one might try to factor keys is by computing the GCD of each pair of RSA moduli. A back of the envelope calculation shows that doing a GCD computation for all pairs of moduli in our data sets would take 24 years of computation time on my computer.

Instead, we used an idea Dan Bernstein published in the Journal of Algorithms in 2005 for factoring a group of integers into coprimes which allowed us to do the computation in a few hours on a desktop computer, in a few lines of Python. The algorithm is no great secret: a long stream of published papers has worked on improving these ideas.

The main mathematical insight is that one can compute the GCD of a single modulus N1 with every other modulus N2,…,Nm using the following equation:

gcd(N1,N2…Nm) = gcd(N1, (N1*N2*…*Nm mod N12)/N1)

The secret sauce is in making this run fast–note that the first step is to compute the product of all the keys, a 729 million digit number. We were able to factor the SSL data in eighteen hours on a desktop computer using a single core, and the SSH data in about four hours using four cores.

The bottom line

This is a problem, but it’s not something that average users need to worry about just yet. However, embedded device manufacturers have a lot of work to do, and some system administrators should be concerned. This is a wake-up call to the security community, and a reminder to all of how security vulnerabilities can sometimes be hiding in plain sight.

Comments

  1. Is ssh-keygen (on say, any recent ubuntu distribution) affected by this? I mean is it still safe to use, or is there another safer/better ssh key generator available somewhere?

  2. It is really very cool observations. Probing using NMAP to single IPv4 adress is a cyber offence. You probed almost all adresses!!!

  3. It’s late February and still no predictions for 2012, or 2011 predictions scorecard. What gives?

  4. enjoy

  5. I have learned that some inexpensive embedded devices don’t have true calendar date/time function due to the missing battery backup. This means that the calendar time function on these devices work only when it is powered on (and after it is set automatically or manually to the external time reference, the time reflects real calendar date/time.).
    And that many implementations don’t bother to keep a time stamp of its operation from one session (powered on state) to the other even to offer the concept of advancing time stamp. (On these devices, the time is repeated again and again across each power-up.)

    For example, my old telco internet router’s log always starts at 1970 1/1. (That is the epoch time of POSIX-like time function.) Eventually once the router got connected to the Internet and synced to a time source, the log printed out the current date and time.

    Imagine that a random generator is seeded with such “time” after a power up BEFORE any external time source is referenced. I am not surprised that the value of time() function in the resolution of 50Hz or 60 Hz may be the same on careless implementation across different devices, thus ending up with the same value of Ps.

    Use of 50Hz or 60Hz timing resolution (or similar values) with today’s > 100MHz clock CPUs makes it quite likely that same time() values are seen by different units. Even if there is a long processing after the power up and until this invocation of random numer seeding happens using time(), the variation of execution time may not be that large across different units, and if so, values of time() returned at the crucial point during prime number generation then may fall only on several consecutive values in 50Hz or 60Hz resolution. This may explain the same Ps.

    Even on the cheap devices like these, the MAC address, say, of the ethernet interface is recorded differently on different units and can be used to add a little bit of entropy to the initial seed to the rand function.Given that the P,Q generation probably happens only at the initial invocation, there is not much other data we can use. User-specific data is probably not available yet then. Oh year, we can ask the user to type in some phrases like OpenPGP does. But again, for some embedded devices that is impossible…)

    So this brings up the question: I have no idea why such certification creation is done on headless/non-interactive server like embedded device if my assumptions are correct. (Even worse maybe without network connection? Could it be these were created at the factory before shipment?)

    Someone wondered why the security expert at these companies didn’t realize this problem. The fact is many development teams never have that type of expert at all, and security review is done by amateurs in that sense.
    (Look at WEP fiasco few years ago. That was supposed to be created by experts in communication. Sadly, that group didn’t have security experts. Ordinary development groups at manufacturers certainly don’t.)

    Or is the problem partially based on sinister conspiracy?

    But, the particular devices cited seem to suffer from the problem I mention above.
    Such concentrated problem for a few companies is too conspicuous for a conspiracy.

  6. In fact OpenSSL add the current time in second BEFORE generating a new random number and not after.
    Just look at the code of BN_rand :
    bnrand function, line 142 in bn_rand.c :

    time(&tim);
    RAND_add(&tim,sizeof(tim),0.0);
    […]
    if (RAND_bytes(buf, bytes) <= 0)

    so I really don't see why they should share the same p if the key hasn't been generated the same time (or the clock is not initialized and the key is automatically generated at the first boot but I doubt of this) …
    (or it's a really old version of OpenSSL or a modified one)

  7. Clive Robinson says

    I guess it should go without saying,

    What you see may not be entropy.

    However it does not always appear to be the case.

    An example of how you can get confused is that of clocking data in and out of systems, where the clocks run independently of each other (normal with most computer systems).

    In effect the system can be viewed as a D-type latch where that data clock goes in on the D input and the system clock drives the latch clock. If the two clocks are not of the same frequency or phase then the output on the latches Q output may well appear random on an oscillascope or other measuring device.

    I can assure you it’s not, it’s a digitised sinewave at the difference frequency or of some sub frequency that has been “folded back”. and. is thus very predictable.

    The only real entropy might be due to the drift rates of the two clocks, and this may be swamped out compleatly by the difference frequency effects.

    Now if you look at the direct output of the latch you might just see this (the frequency change is related to what part of the sinewave is being digitized and is highest at the zero crossing point) you might not. But what about after a von Newman de-bias circuit (clock two bits into a shift register and use the xor of the two bits) or other de-bias circuit or firmware such as a hash function?

    Always always do your testing of an entropy source as close to the actuall source of entropy as you can. And if you cannot work out where or what the source of entropy is and what the source of bias is then don’t use it because in the majority of cases you will end up assuming the bias is entropy and using it to “shoot yourself in the foot and in some cases your system users in the head”.

  8. The problem with computing gcd(N1,N2…Nm) is that it won’t always deliver a useful factorisation of the key. If there is a set of keys who’s factors form a disconnected clique (e.g. each of their factors always and only occur in other keys in the same set) then the GCD for each key will be the key itself, since both factors will always occur on both sides in the GCD test. There is at least one such clique in the EFF SSL Observatory set. Fortunately there are other ways to address this.

  9. Thanks for this accessible technical summary.

    Can you expand upon this:

    > OpenSSL’s RSA key generation functions this way: each time random bits are produced from the entropy pool to generate the primes p and q, the current time in seconds is added to the entropy pool. Many, but not all, of the vulnerable keys were generated by OpenSSL

    Are you saying there is a flaw regarding insufficient entropy in the standard OpenSSL code? (But… maybe… one that only becomes pertintent when you do something ELSE wrong?) Or am I misunderstanding, and OpenSSL is not to blame at all here?

    • The entropy (it may help to think about it as “unpredictability”) is not in the code (alternatively stated, code doesn’t have entropy). It must come from outside the code. Usually this comes from keyboard presses, mouse movements, timing of network traffic, etc. On embedded devices, it can be hard to get enough to generate random enough numbers.

      I’m not familiar with how OpenSSL gets its initial seed (hopefully another commenter is), but the suspicion here is that it not unpredictable enough (ie. doesn’t have enough entropy). Likely the problem is related to how other developers use OpenSSL, if it were inherent to OpenSSL, we would probably be seeing far more collisions.

      • Clive Robinson says

        @ Shopt,

        On many embbeded systems the entropy only comes from three places,

        1, System clock to real world time drift.
        2, Interupts from I/O
        3, Overhead from data dependant I/O handling.

        On some “Real time” systems it might only be from clock drift and that may not be measurable in any realistic fashion.

        Because for some systems that perform exactly the same function on I/O data (think DSP) there is no entropy to be gained from I/O data handling because it uses exactly the same number of CPU cycles to handle each action. Further on systems (think telecoms) where the data is clocked in and out on network master time and the CPU clock is either very very high precision (think atomic clock) or locked to the network master time the entropy could be nonexistant in real world terms. And for the likes of say high security mobile telecoms devices might just be down to doppler shift and relativity.

  10. Some shared public keys could be deliberate – different members of the same cluster that are supposed to be smoothly interchangeable.

  11. “Embedded devices often generate cryptographic keys on first boot, when their entire state may have been pre-determined in the factory. This can result in the kinds of entropy problems we observe in this study.”

    I have to wonder if VMs cloned from an existing OS image (including widely available “virtual appliances”) could run into similar problems.

    • Clive Robinson says

      @ Matt,

      Entropy is a valuable comodity, and like fine wine it needs time to mature.

      One consequence of this is that some CS-PRNG periodicaly store their entropy to a file. This is so that only a small amount of entropy is lost if the system is rebooted for some reason.

      So when you save your VM master all copies of it will have the same entropy stored in file.

      Now depending on how you use your VM master and copies depends on what happens.

      Usually your VM master is “read only” and when copied can either be “read only” or “writable”.

      For many “security functions” such as Honeynets the copies will be “read only” for “shared hosting” they will be writable.

      Now some may know that one of the authors of the preprint article is also the author on a book on virtual honeynets. A number of years ago I wrote him an email about “enumerating” honeynets from TCP time stamps. Basicaly if you can measure the time drift or delta function then you can tell the difference between a network of real machines and a single machine pretending via VM’s to be a network of real machines. And as I noted to him such a sophisticated enumeration could not from the honeynet end be told appart from a “braindead script kiddy” attack without adding some significant checking that was not being done. However a sophisticated cracker looking to use their latest “zero day” would not want it to be caught in a honeynet, therefore it was worth their while carefully enumerating target networks to look for a honeynet…

      Now, the point is that the reason the TCP timestamp enumeration worked was because the underlying hardware where the clock drift came from was the same for all the VM’s running on it.

      Fast forward to your question, you will see that nearly all the entropy gathered by CS-PRNG’s on machines is taken as “close to the metal” as possible. So for all the VM’s running on the same physical machine their low level source of entropy is the same…

      Thus for those designing “entropy pools” you need to consider that atleast some of the “entropy” you collect comes from a non hardware source but something that is OS or process dependant. One such thing on an ordinary use multitasking system is the “in process time” or CPU cycles used compared to either file system time or real world time. However this does not work well for very low latency, non blocking or time triggered designs for “real Time” operations for obvious reasons.

      Obviously for security systems like Honeynet VM’s there is still a problem because the only source of entropy they get is from the overhead of network traffic sent to the indivdual VM’s….

  12. Just to make sure I understand: you are not implying that OpenSSL’s choice to inject extra randomness between calls is a bad choice, but rather that the fact that it does that might have masked the bad quality of the source used for the initial seeding, is that right? That is, a developer of an embedded device may have missed the problem, because the add_randomness() step made the keys seem ok on superficial inspection?

    • It also makes the factorization here possible — having different q values means you have two moduli with same “p”. If q was same, you’d instead have identical keys, but they would be as hard to factor as any single key (unless one could guess p or q due to bad RNG).

  13. There are so many cool ways to build solid inexpensive entropy into embedded devices that manufacturers that fail at it have no excuse. I had to produce good random numbers in a VPN device that had already been designed once, so it had to be a software-only hack. I stared at the schematics for awhile and considered everything I could see that could be read from or written to and did a few experiments until one panned out: It was impossible to predict exactly how long unused blocks in the flash RAM chips would take to write and erase. The bit rate wasn’t particularly high, but the quality was quantum. Felt good.

    • Peter Gutmann says

      There are many ways to build cool RNGs in the lab, but it’s decidedly more tricky in the real world. You’re dealing with highly unpredictable (deliberately so) circuits that need to work in varying temperature, pressure, and humdity environments, with apparently-identical devices that may exhibit subtle differences due to changes in manufacturing processes in the exact region where you’re using them to generate random data, or with component substitutions (“we’ll swap this aluminium electrolytic for a polymer film cap, saving is 3 cents per unit and making the circuit far more stable, the engineers will thank us for it”) and many other things. This is why commercial dedicated RNGs like a RBG1210 were so expensive, they had to be designed and qualified to perform as expected in all sorts of environments.

  14. So, a back of the envelope calculation on those examples suggests collision entropy of something like 16 bits for the first one, 14 for the second, and 22 bits for the third. (The first one has about a 2/3 probability of getting more entropy between calls, the second has about a 1/3 probability, and the last has about a 0 probability). This is surely simpler than the real situation, but it leaves me with this mental image of a process to gather entropy that’s probably getting almost nothing except the number of clock ticks since startup or some such thing.

    An interesting thing about both your numbers and the ones from the paper on eprint is that if the implementers had bothered to roll, say, the device’s ethernet address or IP address or serial number into the PRNG seed, they’d be just as weak, but you’d never see it by looking for duplicate N or shared prime factors. I wonder how many of the devices of this kind that you don’t see this problem for are avoiding it by rolling in some unique-to-the-device information like this.

    Do you know what’s being done to generate a seed inside these devices? It’s clearly not any good, but it would be interesting to know if they’re really seeding it with a time() call or something, or if there’s some kind of more subtle problem.

    • Peter Gutmann says

      One obvious unique value to use is the preconfigured WPA2 key that’s printed on the label on the bottom of many 802.11 devices, that has a fair bit of entropy and isn’t publicly (or at least semi-publicly) known like a MAC address.

      Having said that, it’d be interesting to take a batch of devices from the same job lot and see if you can find any correlations between the preset WPA2 keys. If they’re doing keygen badly then the preconfigured keys could be just as weak.

      So many things to break, so little time :-).

    • Clive Robinson says

      @ John,

      Your question,

      “… if the implementers had bothered to roll, say, the device’s ethernet address or IP address or serial number into the PRNG seed, they’d be just as weak, but you’d never see it by looking for duplicate N or shared prime factors.”

      Raises a couple of issues with regards “unique device data”.

      Firstly the unique device data is obviously programed into the device some how during production so why do the dessigners not just put in other random unique data or even certificates etc, the result would be better than they currently appear to be getting (more on this later).

      Secondly is the current unique device data actually suitable for using for “whitening” or “seeding” a PRNG. That is does the “unique device data” have sufficient entropy (for instance the manufacturer portion of a network MAC has no entropy as it remains the same not just on the product range but all the manufactures ranges of product, likewise a serial number may contain a pre or post product identifier etc.).

      To be usefull we need to extract what is unique from each piece of data, and then importantly use it to provide the same level of uncertainty across each bit of what is to become the “whitening” or “seed” for the PRNG. Most of the unique device data is quite biased in that the low bits change more rapidly than the high bits as you would expect in what are effectivly counters (and in some older designs where actualy derived from a single number).

      In cryptography we have the the concept of the “Avalanche” criteria which effectivly says for any single bit change to the input half of the bits at the output should change, and preferably the output bits that have changed should be not predictable from observing the output alone. Also it should not be possible to work the change in output bits back to the input bit concerned. Whilst such functions can be difficult to design they are available “off the shelf”, “fully certified” as either block cipher or hash functions that are almost certainly going to be in the device for other reasons (otherwise their would be no point it generating a PK Cert in the firsst place 😉

      So “in theory” unique device data has some entropy that can be made suitable to be used as either “whitening” or “seeding” the product of a PRNG. But it might only be as little as that you would expect from a simple limited range counter giving between 15 and 25 bits of entropy…

      However that does not make it suitable for a CS-PRNG. For a couple of reasons.

      Firstly because of the very limited entropy expected of unique device data. Secondly is the issues of “Public -v- Private” data. For a CS-PRNG the data must be private in all respects because if it can leak in some way in part or full then it becomes public data and is thus potentialy predictable to an attacker in part or full. And it is this that is the problem, as it is reasonable to assume that the manufacturer “knows” the unique device data from the serial number etc.

      So from a theoretical approach unique device data in it’s current form is unsuitable for use in a CS-PRNG with any kind of real security criteria in mind (ie against a reasonably well resourced attacker who is targeting a specific device).

      But what about from a practical point of view, in the majority of cases users don’t want or need what is entailed with “High Security” they just want the kit to work. And the unsaid part of “work” is “only for them”. That is the device is sufficiently secure that the “work factor” for an attacker is sufficiently high that there is two much cost involved for the attacker to be reasonably expected to persue an attack against the device of a non specific target.

      So we need to look again at the attack, it does not work against individual devices nor does it help an attacker attack a specific target. It simply tells an attacker when two or more effectivly random and unknown devices have a PRNG output number in common. Has this helped an attacker? Almost certainly not currently, but could it in the future?

      Well the answer is yes in a similar way to that rainbow tables do for breaking passwords. It appears from the data collected that some shared PRNG output numbers are very common whilst others are not. This brings the threashold of the cost of attacking those devices down quite a lot.

      However one thing is known is that the size of the data set to this attack is actually bounded by the number of certificates that can be seen on the Internet which is still mainly IPV4 based and thus less than 2^32.

      Which brings us around to your subsiquent question,

      “I wonder how many of the devices of this kind that you don’t see this problem for are avoiding it by rolling in some unique-to-the-device information like this”

      The answer is known only to the respective device manufactures.

      But we do know that if they all had put in around 5 or 6 bytes of “entropy” into each device then we would not be having this conversation. And the entropy they would have needed could hkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk

      • John Kelsey says

        Someone whose job it was to eavesdrop on a lot of encrypted traffic going over the internet would presumably buy one of each of these classes of device, and recover enough information to generate all or most of the keys that this class of device would ever be able to generate. Each individual device isn’t all *that* expensive, after all. My guess is that there is a lot of overlap, too–probably lots of people are using OpenSSL or some other common library to do everything, starting from a fairly standard stripped down version of Linux, copying and pasting some example code for RSA key generation from the documentation, etc. (One interesting question is whether there were any collisions between keys of different devices.)

  15. Another way to compute the large products is to store partial product and reuse as many as possible each time. A*B*C*D*E*F*G*H = ((A*B)*(C*D))*((E*F)*(G*H)) So if all N products are stored and you decide to drop one term, you only need to compute log(N) products to get the new result. For example, eliminating E means computing F*(GH) and then ABCD*FGH – only 2 multiplications. Different partitioning could be done as well to reuse larger products less often – ABCDEF * either G or H for example.

  16. Peter Gutmann says

    You may also want to check keys coming from ILOM (integrated lights-out management) systems as well, these won’t (or at least shouldn’t) be visible on the public Internet and so probably wouldn’t appear in your original scan. The state of these is generally pretty appalling, and they’re present more or less invisibly in large numbers of servers, workstations, and PCs.

  17. Jan Schejbal says

    Do you know how may valid certificates signed by an accepted CA vs. self-signed ones are affected?

    • Jan Schejbal says

      Sorry, just saw three fields in a blog comment form and filled them out as usual – missing that the third was subject, not web site. Feel free to delete the URL of course.

  18. I would like to know about the setup you used to check billions of addresses twice (ports 22 & 443)! Looking Forward to it, seems like an insane accomplishment in itself.

  19. “We scanned every IPv4 address on the Internet” … really?

    • 4 billion is quite a tractable number especially considering you can use multiple machines and be issuing many probes at the same time from one machine.

      • any idea of the net traffic you can produce?

        • I’m sure they just started an SSL handshake and gave up after the public key exchange. That’s not very expensive when you only do it once per destination.

  20. Those devices could simply have used a Zener diod which is well know to generate lots of noise as a viable source of startup entropy or just randomness. Add to that that the production of diod is not 100% exact there’s always tolerance and the chance of having 2 identical Zener diods that startup at the exact same conditions (temperature, voltage, parasite resistances on the soldering pads etc….) is slim to none.
    Now making a reader that can sense all this small deltas from the output of a Zener rather than simply sample it’s a different issue, but the point is we do have random/noisy/cheap hardware sources we could use.

    • The reason why Zener diods can generate a high quality random is not that they are never identical, but that with proper calibration, the main source of the measured noise should be two phenomena that produces real white (so perfectly ideal) noise : shot noise which is a quantum physic phenomena, and thermal noise results from the Brownian motion of ionized molecules, which also when properly isolated should be resulting of quantum physic determined moves of the molecules. The trouble is that it’s hard not to have those two effects polluted by other, not actually random things. But if you don’t use every measured noise of the diod as one full bit of alea, mix them together with some hash function, you will get a really, really strong random result.

  21. Great research ! congratulations !!

  22. Looking forward to the final paper! The problem of generating keys in an entropy-starved situation is real, but is more a result from failure to make the right trade-off between convenience and security. For example, it can be mostly if /dev/random is used as entropy source. The downside is that this may cause long initial boot time or will require a user to enter some random keystrokes. But that already fixes the issue!

    So my guess is that device vendors just did want to avoid this inconvenience and made the wrong choice in order to not irritate customers.

  23. @H King
    If you do it that way, you will be making m-1 multiplications for each modulus. Their method does a constant number of operations for each modulus, though admittedly using huge integers.

  24. I am confused by your last formula involving (N1*N2*…*Nm mod N1^2)/N1 since this is apparently the simpler N2*…*Nm mod N1 readily calculated for example by:
    x=1
    for i=2 to m
    x = x * Ni mod N1

    presumably calculated twice faster since N1 has half as many multiprecision digits as N1^2.
    Unless I am being stupid this morning, or am unaware of some clever optimizations possible for calculations mod N1^2 versus N1.

    Skimming Bernsteins 2005 paper doesn’t help me understand where this might come from, but there is a lot there.

    Great work by the way, many thanks for your work and lucid explanation.

    • The first step of computing gcd(N1,N2*N3*N4*…*Nm) is to compute N2*N3*N4*…*Nm%N1. That first step is where the majority of the time would be spend, because after that step, you will be left with two “small” numbers.

      Having computed the product of all moduli instead of multiplying all those numbers, you “just” have to do one division. N1*N2*N3*N4*…*Nm/N1.

      With that approach you would need to do two slow operations. First the division, then the first modulus step. After that the remaining calculations would be fast.

      So in order to compute (N1*N2*N3*N4*…*Nm/N1)%N1 you need two slow operations. However that calculation can be transformed into another calculation: (N1*N2*N3*N4*…*Nm%(N1^2))/N1

      In this last calculation only the calculation N1*N2*N3*N4*…*Nm%(N1^2) is slow. Once that modulus is computed you will be dividing two “small” numbers.

      Computing a%b or a/b takes about the same time. In most cases the algorithm to compute one of those two numbers will compute the other one as well as a side effect. The only speedup you get from computing just one of the two is that you just don’t store the other result.

      So the optimization is not due to any performance difference between the / operation and the % operation, but rather due to only needing a single operation working with the very large number instead of two.

    • The key insight is that you don’t want to compute JUST gcd(N1, N2..Nm) — you also want to compute gcd(N2, N1N3..Nm) etc (m total gcds). So by using the formula, you need only compute the huge value N1*N2*..*Nm once (which takes awhile), but then you can use it to quickly compute all the gcds.