January 15, 2025

Report from Crypto 2004

Here’s the summary of events from last night’s work-in-progress session at the Crypto conference. [See previous entries for backstory.] (I’ve reordered the sequence of presentations to simplify the explanation.)

Antoine Joux re-announced the /msg02554.html">collision he had found in SHA-0.

One of the Chinese authors (Wang, Feng, Lai, and Yu) reported a family of collisions in MD5 (fixing the previous bug in their analysis), and also reported that their method can efficiently (2^40 hash steps) find a collision in SHA-0. This speaker received a standing ovation, from at least part of the audience, at the end of her talk.

Eli Biham announced new results in cryptanalyzing SHA-1, including a collision in a reduced-round version of SHA-1. The full SHA-1 algorithm does 80 rounds of scrambling. At present, Biham and Chen can break versions of SHA-1 that use up to about 40 rounds, and they seem confident that their attacks can be extended to more rounds. This is a significant advance, but it’s well short of the dramatic full break that was rumored.

Where does this leave us? MD5 is fatally wounded; its use will be phased out. SHA-1 is still alive but the vultures are circling. A gradual transition away from SHA-1 will now start. The first stage will be a debate about alternatives, leading (I hope) to a consensus among practicing cryptographers about what the substitute will be.

Comments

  1. MD5算法已经被破解 电子签名法2005-04-01起施行

    Ping Back来自:blog.csdn.net

  2. 中国人破解MD5(转载)

    Ping Back来自:blog.csdn.net

  3. 密码学领域重大发现:山东大学王小云教授成功破解MD5

    Ping Back来自:blog.csdn.net

  4. 密码学领域重大发现:山东大学王小云教授成功破解MD5

    Ping Back来自:blog.csdn.net

  5. On August 25 the NIST released brief comments on the issues raised at the recent Crypto conference.

    Check their SHA site here.
    A direct link to their statement here.

    In brief: although full-strength SHA-1 was not compromised, they plan on phasing out SHA-1 in favour of stronger hash functions (SHA-224, SHA-256, SHA-384 and SHA-512) by the year 2010.

  6. Hm —

    Right now it seems that you need two inputs to generate a hash:

    – The original source document
    – The original digest of the source document

    … this allows you to create a new source document with the same digest as the original.

    Doesn’t this mean that in cases where you don’t have the original source document (ie: a password), you are still safe?

  7. Jörn Nettingsmeier says

    for the record, since people asked for the consequences of hash collisions, here is a nice introductory-level article:
    http://www.unixwiz.net/techtips/iguide-crypto-hashes.html

  8. and whats the problem, its clear that for some 2 values have to be the checksum the same. It can’t be injective mapping, because set of codes to sum is much m0re bigger than set that contains sums. (Set of codes is uncountable and infinite and the set of sums contains “only” 16^32=340282366920938463463374607431768211456 elements)
    Simply: there are many this fucking examples of this “collision”.

  9. Not being able to see the difference in the bytestreams, I decided to run the bytes through SHA1. When I saw that the SHA1s were the same hash… AND realized that the bytestreams were different… Well I almost poo’ed my pants.

    Then I looked more closely at the perl code and realized that it has a fatal flaw in it. The program as written above finds the md5sum of ‘128’, not the md5sum of the hex stream.

    It should be rewritten as such:

    #!/usr/bin/perl -w

    use strict;
    use Digest::MD5 qw(md5_hex);
    use Digest::SHA1 qw(sha1_hex);

    # Create a stream of bytes from hex.
    my @bytes1 = map {chr(hex($_))} qw(
    d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
    2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
    55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
    08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
    d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
    dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
    e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
    c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
    );

    # Create a second stream of bytes from hex.
    my @bytes2 = map {chr(hex($_))} qw(
    d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
    2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
    55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
    08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
    d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
    dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
    e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
    c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
    );

    # Print MD5 hashes
    print “MD5 Hash:n”;
    print md5_hex(@bytes1), “n”;
    print md5_hex(@bytes2), “n”;

    # Print SHA1 hashes
    print “SHA1 Hash:n”;
    print sha1_hex(@bytes1), “n”;
    print sha1_hex(@bytes2), “n”;

  10. mort–

    Close. They can’t match an arbitrary hash; they can only manipulate two datasets such that eventually their “shadows” align.

    –Dan

  11. Remember, any hash is a many to one onto mapping. It is much the same thing as a 3D object casting a shadow onto a 2D plane: it is a projection. It is a reduction of basis. By analogy with the shadow illustration, a collision is the same thing as finding 2 different 3D objects that cast the same shadow. You cannot tell them apart by looking at only the shadow, even though they are different objects.

    As already pointed out, the real issue is that given one shadow, can an attacker devise an object that casts the same shadow, and can he generate this object in a reasonable amount of time.

  12. 8=======|)~~~~~~~~~

  13. How hard is it to find a collision in both MD5 and SHA-1 in the same place where data is changed so that both hashes remain the same

    Roughly speaking, if I can find a MD5 collision in 2^40 steps, and a SHA-1 collision in 2^30 steps, it’s gonna be 2^70 steps to find something that collides in both.

    Roughly speaking.

  14. Perhaps the absolute paranoid can offer both, MD5 and SHA-1 hashes, to [files|certificates|whatever data]. Even if the data is subverted so that one hash is the same as before, the other hash surely would change.

    That begs the question: How hard is it to find a collision in both MD5 and SHA-1 in the same place where data is changed so that both hashes remain the same?

  15. How does this effect the security of APOP for POP3 and CRAM-MD5 for SMTP?

    Compared to the alternatives of USER/PASS for POP3 and AUTH LOGIN/PLAIN for SMTP, APOP and CRAM-MD5 provide a more secure method of authentication.

  16. Kelsey Requiem says

    Actually how severe IS this prolem? I understand the lot but it doesn’t seem to me that it would mean too mch trouble, as it’s not one of the easiest attacks… (I guess…)

  17. Just to clarify a some things…

    Mathew: Their method is doing some very specific transformations and (apparently) yielding a comparatively high rate of success. While the chances of there being an exploit that ends SSL as we know it appearing tomorrow are pretty slim, the apparent regularity of the transformations implies that a very real flaw exists and that it will probably be exploitable within a year.

    x: You’re right – their method is (at best) a type of known plaintext attack, and is therefore not going to be useful for sniffing out passwords from a *nix/*BSD shadow file.

    Foobar: You’re right too, though technically you misspoke. MD5 isn’t typically used as part of the algorithm, but as part of the key exchange protocol. It’s a subtle distinction, but an important one.

    Consistent MD5 collision has no effect on the crypto itself, but it may have a large effect on modern key distribution schemes. Most notably, it has potentially disasterous effects on the trust aspect of SSL if any certificate authority is using it that is also trusted by default by popular web browsers.

    A description of one potential attack can be found here.

  18. Finding a collisions by chance is easy due to birthday paradoxon.
    Compared to finding collisions to given data.
    Which is in turn easier than adjusting reasonable data to collide with some given data.

    What we’ve seen here are just two pieces of data (that are both not reasonable) to have a collision.

  19. Many public key algorithms use MD5 for key verification. In this case an attacker knows the plaintext (public key) and the hash value (MD5 of the key). If he can generate a collision MITM attacks are very hard to detect. It is necessary to note that the fake public key doesn’t need to have any special properties (e.g. prime, rel. prime, …) as long as the asymmetric cipher happily swallows it.

  20. Laddy, I do not think they are finding collisions when given a hash – but finding collisions with two different plaintexts. Thus your example is flawed, as when given the md5sum of a password, you do not know the plaintext, and thus generating a collision using this new discovered method isnt possible.

  21. Meta News II

    This week was a big news week. Here’s all the important stuff: Elmer Bernstein died. He was one of the great film composers, writing the scores for such classics as The Magnificent Seven, Airplane!, and Animal House. Ron Rivest’s hash…

  22. somebody in particular says

    The perl script posted by “no one in particular” doesn’t work properly. The variables “bytes1” and “bytes2” both contain the value 128, instead of the intended stream of bytes.

  23. Hi, does somebody knows where I could find description of the attack? With mathematical description? (prove …) thanks

  24. The importance is quite simple. I type in ‘password’ which generates md5 ‘x’. Then whenever someone types in a possible password, I run md5, and compare the results.

    What they are hinting at, is that it is now possible to take ‘x’, and find a comparable string which generates the sum ‘x’. The problem is that instead of taking 2^128 operations to find this collision, you can do it in 2^40 operations. The difference between those two numbers is several days, versus several million years.

    How is this important? For example, core internet routers use md5 to exchange passwords. I simply sniff the md5sum, and if I can find a string that generates the same sum, easily, I can send my own routing update that takes down the internet. More examples, since a LOT of applications use md5, but you get the idea.

    Obviously the above attack isn’t quite so simple, but this research makes it *possible*. Before, it was believed to be sufficiently difficult to find a collision, that nobody worried about it. Now they are saying its feasible to do it in hours.

    The question hanging around right now is that these researchers managed to find collisions easily, but not for an artbitrary string. The questions is how long before someone modifies this method to find any colllision. That is how much time the world has to move away.

  25. SHA/MD5 obsolete?

    Freedom to Tinker: Report from Crypto 2004…

  26. So what do we need? Just stronger versions of SHA/MD5 that make it harder for number crunchers to find ‘holes’, or do we need completly new/other functions that replace SHA/MD5?

  27. Actually, you can take any of the lines in the example data, append it and the md5 hashsums of both records are still identical.

  28. Max Kington says

    Can the same attacks can be applied to SHA-256, SHA-384 and SHA-512?

  29. depesz: it’s easy to show that hash functions must have collisions, but what’s news is that people have worked out a way to actually *find* these collisions.

    An example of what you might do with this. You could request an SSL certificate (for your real identity) from a certificate authority. After the response comes back, you can then use that response (which is based on the MD5 of your identity+key) to “authenticate” a carefully chosen different certificate, one which claims that you are LargeBankOrSoftwareCorp., but which has the same MD5 as your real identity. You can then present this to other people in order to convince them that you are someone whom you are not.

  30. Similar to SHA-0 case, which have to be replaced by SHA-1 after finding that collisions can be made with reasonable computing resource.

  31. Jordan Vance says

    n00b question/comment:
    there seems to be a lot of chicken littling going on here. how serious is the fact that collisions can be made? I know it breaks one of the rules of CHF as you mentioned before, but what are these used in? I know some are used to check files for tampering (what I’ve seen of md5). Can someone give a nice little mini-lecture type thing about how they are applied and how they work?

  32. i don’t quite understand why there is so much buzz about finding collisions in hash algorithm.
    everything sane *knew* that md5 has collisions. so does EVERY hash algorithm in the world. it’s a simple matter of calculation.

    so, what’s the buzz about? really.

    depesz

  33. $bytes1 should be @bytes1

  34. Oh no! SHA-0 is insecure?! And it’s only 9 years obsolete! Shock! Horror!

  35. The bytestreams are NOT the same.

    Check out the 4th value on the second line of the bytestreams, and the 3rd last value on the end of the third line, and the 5th last value on the fourth line.

  36. your bytestreams are the same.

  37. your bytestreams are the same.

  38. MD5 is broken, but what does that mean?

    I’ll apologize right up front for the “techie” post, but I thought this was important enough to share.

    MD5, a computer algorithm used to for security purposes, has been broke. To quote Symantec, “MD5 is a one-way operation that transforms a data st…

  39. no one in particular says

    Here’s a simple shift of greg’s code to make it simpler and run on any platform with a perl installation:

    #!/usr/bin/perl -w

    use strict;
    use Digest::MD5 qw(md5_hex);

    # Create a stream of bytes from hex.
    my $bytes1 = map {chr(hex($_))} qw(
      d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
      2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
      55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
      08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
      d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
      dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
      e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
      c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
    );

    # Create a second stream of bytes from hex.
    my $bytes2 = map {chr(hex($_))} qw(
      d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
      2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
      55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
      08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
      d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
      dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
      e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
      c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
    );

    # Print MD5 hashes
    print md5_hex($bytes1), “n”;
    print md5_hex($bytes2), “n”;

  40. no one in particular says

    Here’s a simple shift of greg’s code to make it simpler and run on any platform with a perl installation:

    #!/usr/bin/perl -w

    use strict;
    use Digest::MD5 qw(md5_hex);

    # Create a stream of bytes from hex.
    my $bytes1 = map {chr(hex($_))} qw(
      d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
      2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
      55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
      08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
      d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
      dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
      e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
      c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
    );

    # Create a second stream of bytes from hex.
    my $bytes2 = map {chr(hex($_))} qw(
      d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
      2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
      55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
      08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
      d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
      dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
      e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
      c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
    );

    # Print MD5 hashes
    print md5_hex($bytes1), “n”;
    print md5_hex($bytes2), “n”;

  41. Has anyone heard any comment from all of the digital certificate vendors out there yet? I called ours (Geotrust) and the sales person there was unaware of any problems.

    I certainly wouldn’t buy any certs that had a lifespan of longer than the minimum now until this all gets sorted out.

  42. Greg Buchholz says

    Here’s a down and dirty perl script to show the results (runs under unix, i.e. needs md5sum program)…

    #!/usr/bin/perl -w
    use strict;

    my $v1=<<END_V1;
    d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
    2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
    55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
    08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
    d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
    dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
    e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
    c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
    END_V1

    my $v2=<<END_V2;
    d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
    2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
    55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
    08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
    d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
    dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
    e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
    c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
    END_V2

    my $p=join(“”,map {chr(hex($_))} split /s+/, $v1);
    my $q=join(“”,map {chr(hex($_))} split /s+/, $v2);

    print `echo -n ‘$p’|md5sum`;
    print `echo -n ‘$q’|md5sum`;

  43. Does anyone have a recording of the webcast? The IACR web page hasn’t been updated with a recording yet.

  44. Were there any implementation details of the attacks by Wang et al published?

    I’m really surprised that this sudden cluster of breaks all turned up in the rump session, to be published in the space of 45 minutes or so.