[sclug] MD5 is compromised

Thu Aug 19 20:17:05 UTC 2004

Will Dickson wrote:

> Although the full paper has not yet been published, this
> doesn't seem to be the case. There were problems verifying
> the first collision (turned out to be due to the wrong
> endianness on the MD5 IV) at which point they produced a
> revised version, with a different collision, overnight.

Interesting; this is newer information than I had received. It is
certainly a step closer to MD5's being rendered useless for cryptographic
purposes than what had been mentioned 24 hours earlier. That it was done
so promptly suggests a big step, but we are yet to see whether this is so.
(Roughly, if the speed with which the updated clash was derived indicates
that the researchers have come up with a general-purpose clash generator
for MD5 then it's a huge step; if they merely did something akin to
correcting the endian-ness of their inputs to match the correction in the
endian-ness of the IV, then the speed of release of the new clash is not
so impressive. I've not yet seen enough information to determine this
either way.)

> The entire point of a cryptologic hash function such as MD5
> is that it's supposed to be computationally infeasible, ie.
> impossible for all practical purposes, to find even one such
> collision. Obviously they exist, but you're *not* supposed
> to be able to find them. If somebody mananges to find even
> one such collision, the hash is broken. MD5 is now broken.

It no longer has its ideal properties (no known clashes), but few users
depend upon this property. Most depend upon weaker properties, e.g. that
it is not possible to generate an alternate message with the same hash as
some _existing_ message upon whose integrity they depend (e.g. a
download). So:

- In the very narrow sense that a single known clash exists, it could
indeed be described as "broken", however for all practical purposes
(notably the signing of digital messages, e.g. downloads, financial
transactions, ...) it is in and of itself an uninteresting fact because
those practical purposes are currently unaffected.

- When/if the observed fact switches from "here is one known clash" to
"here is a general purpose technique for generating clashing pairs" then
MD5 becomes useless in many cryptographic applications.

- When/if the observed fact switches from "here is one known clash" to
"here is a general purpose technique for generating large numbers of
clashes with the hash for an existing message" (i.e. the ability to trojan
downloads, forge transactions, ...) then MD5 becomes useless in
essentially all cryptographic applications.

The point I was trying to make was that while the first of these three
things has now occurred, the second and third have not and that therefore,
at the present moment, there is less cause for panic than there would be
if either or both of the latter things had occured.

There have been cases in the past where a single (group of) researcher(s)
has made both the initial break and the complete undermining of the same
algorithm at the same time, but as far as I can recall, it has never
occurred with a widely-used algorithm. These researchers may have done
this but (a) it would be a prestige-worthy first and (b) they don't appear
to have announced this prestigious feat; consequently I assume that they
have not achieved it.

I was also trying to make clear that I wasn't suggesting that there was no
problem at all by pointing out that, frequently, a narrow break of this
type is a precursor to more general breaks and that, therefore, while
there was no apparent need for immediate panic, using something else
instead or as well as MD5 is warranted.

> No. Hashcash works on the basis of *partial* collisions,
> which are known to be creatable with not much work. This
> result is a *full* collision, and that's supposed to be
> computationally infeasible.

My mistake.

> True. However, it's still not uncommon: until yesterday, the
> best current practice was that MD5 was unsuitable for new
> designs but still OK for backwards compatibility. The

It still is. Any "best current practice" that mandates suspending
MD5-dependent applications immediately (before the end of today) until MD5
can be replaced with something else is not a sound practice, it is an
academic fantasy. BCPs are about the best available trade-offs between
ideal solutions and the reality of deployed systems. In many cases, until
it actually becomes possible to generate at will a fraudulent message to
match an existing signature, the costs of instant suspension until a
change can be made will overwhelm the costs of exposure associated with
allowing MD5-dependent applications to run until another algorithm is
inserted.

Certainly, the existence of an actual known clash will serve to put more
pressure on laggards to switch.

- Raz