Yesterday, a BlackHat Europe presentation on Office 2003 encryption was brought to my attention. Seems that Eric Filiol has done quite a bit of work to recover RC4 encrypted Office documents using an issue that was brought to our attention in 2004. Eric's paper can be found at this link: BlackHat-EU-2010-Filiol-Office-Encryption-wp.pdf. The paper really just shows how an attack discovered by Hongjun Wu, which exploits our error of reusing the key stream, can actually be implemented.
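For readers who have not seen why key stream reuse is such a serious error, here is a minimal sketch in Python. It uses a textbook RC4 implementation, not the Office code, and the key and both plaintexts are made up for illustration. When two messages are encrypted with the same key stream, XORing the two ciphertexts cancels the key stream entirely, so anyone who knows or can guess one plaintext recovers the other without ever touching the key.

```python
def rc4_keystream(key: bytes, length: int) -> bytes:
    """Return `length` bytes of RC4 key stream (textbook KSA followed by PRGA)."""
    # Key-scheduling algorithm: permute the state array under the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Pseudo-random generation algorithm: emit one byte per step.
    i = j = 0
    out = bytearray()
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(state[(state[i] + state[j]) % 256])
    return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical key and plaintexts, chosen only for illustration.
key = b"same key used twice"
plain_a = b"Quarterly numbers: DO NOT SHARE"
plain_b = b"Lunch menu for the team offsite"

# The mistake: both messages are encrypted with the same key stream.
cipher_a = xor_bytes(plain_a, rc4_keystream(key, len(plain_a)))
cipher_b = xor_bytes(plain_b, rc4_keystream(key, len(plain_b)))

# The key stream cancels: cipher_a XOR cipher_b == plain_a XOR plain_b.
xor_of_ciphertexts = xor_bytes(cipher_a, cipher_b)

# An attacker who knows plain_b recovers plain_a without the key.
recovered_a = xor_bytes(xor_of_ciphertexts, plain_b)
print(recovered_a)
```

Even without known plaintext, the XOR of two natural-language or structured documents leaks a great deal, which is what makes a practical attack possible.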
While Eric's done quite a bit of interesting work in terms of cryptanalysis, I'd like to point out some criticisms I have of his paper. First and most importantly, if one is publishing a scholarly paper, it is critically important to find the most important references in the area. If you search on Office RC4 encryption, the top 5 hits contain MS-OFFCRYPTO. If we skip all the gory detail of exactly how everything is implemented and go straight to section 4.3.3, which is security considerations for Office Binary Document Encryption, we find that we've explicitly stated (emphasis mine):
The Office binary document RC4 encryption method is not recommended, and ought to be used only when backward compatibility is required.
Passwords are limited to 255 Unicode characters.
Office binary document RC4 encryption has the following known cryptographic weaknesses:
The key derivation algorithm is not an iterated hash, as recommended in [RFC2898], which allows brute-force attacks against the password to be performed rapidly.
Encryption begins with the first byte, and does not throw away an initial range as is recommended to overcome a known weakness in the RC4 pseudorandom number generator.
No provision is made for detecting corruption within the encryption stream, which exposes encrypted data to bit-flipping attacks.
While the derived encryption key is actually 128 bits, the input used to derive the key is fixed at 40 bits, and current hardware enables brute-force attacks on the encryption key without knowing the password in a relatively short period of time, so that even if the password cannot easily be recovered, the information could still be disclosed.
Some streams might not be encrypted.
Depending on the application, key stream reuse could occur, potentially with known plaintext, implying that certain portions of encrypted data could be either directly extracted or easily retrieved.
Document properties might not be encrypted, which could result in information leakage.
Because of the cryptographic weaknesses of the Office binary document RC4 encryption, it is considered easily reversible, and therefore is not recommended when storing sensitive materials.
I really do not know how to say this more plainly. The RC4 encryption is the poster child for why the SDL now requires all cryptographic implementations to be approved by the crypto board. This is what happens when people who do not understand cryptography try to do encryption. I detailed this in the Office Crypto Follies, and apparently Mr. Filiol has not read that, either. While the BlackHat presentation rightly points out a serious flaw, there are more serious flaws that we've already documented which were not covered.
Some further corrections to the paper: the analysis was done against the 128-bit CAPI RC4 encryption, which he believes to be the default. Excepting PowerPoint, the older 40-bit RC4 encryption is the default for these documents. There is no need to do complicated cryptanalysis on 40-bit encryption. Just try all 2^40 keys, which can happen on a modern system in a matter of hours. 40-bit encryption amounts to a "Boy Scout decoder ring", as my friend Mike Warfield aptly puts it.
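As a rough check on the "matter of hours" figure, the arithmetic can be sketched in Python. The trial rate below is purely an assumption picked for illustration, not a measured benchmark; real throughput depends on the hardware and on how much work each trial decryption takes.

```python
def exhaustive_search_hours(key_bits: int, keys_per_second: float) -> float:
    """Worst-case hours needed to try every key of the given size."""
    keyspace = 2 ** key_bits
    return keyspace / keys_per_second / 3600


# Assumed rate, for illustration only: 100 million key trials per second.
ASSUMED_KEYS_PER_SECOND = 100_000_000

print(2 ** 40)  # 1099511627776 possible 40-bit keys
print(round(exhaustive_search_hours(40, ASSUMED_KEYS_PER_SECOND), 2))  # 3.05
```

On average the right key turns up after searching half the key space, so the expected time is half the worst case.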
If 128-bit encryption is used, then some of the more egregious key stream reuse issues are not present, but the critical flaw becomes the very poor key derivation function, which uses only one hashing operation. The single hashing operation makes a password brute force something that can be run on a GPU in parallel, and you can easily get tens of millions of cracking attempts per second. Unless the password is extremely strong, the document will fall to this flaw much more easily than with Mr. Filiol's technique. Adobe recently ran afoul of this problem.
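To show why a single hashing operation matters so much, here is a minimal sketch in Python contrasting a one-pass derivation with an iterated one in the style recommended by RFC 2898 (PBKDF2). This is not the actual MS-OFFCRYPTO derivation: the use of SHA-1, the way salt and password are combined, the salt value, and the iteration count are all assumptions made for illustration. The point is only that every password guess costs the attacker one hash in the first case and many thousands in the second.

```python
import hashlib

KEY_BYTES = 16  # 128-bit key


def single_hash_key(password: bytes, salt: bytes) -> bytes:
    """Illustrative one-pass derivation: one SHA-1 call per password guess."""
    return hashlib.sha1(salt + password).digest()[:KEY_BYTES]


def iterated_key(password: bytes, salt: bytes, iterations: int) -> bytes:
    """PBKDF2-HMAC-SHA1: each guess costs roughly `iterations` times more work."""
    return hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dklen=KEY_BYTES)


# Hypothetical inputs; a real salt would be random and stored with the document.
password = b"correct horse"
salt = bytes(range(16))
ASSUMED_ITERATIONS = 100_000

weak_key = single_hash_key(password, salt)
strong_key = iterated_key(password, salt, ASSUMED_ITERATIONS)

print(weak_key.hex())
print(strong_key.hex())
```

Both functions produce a 128-bit key, so the output looks equally random either way. The difference is entirely in how expensive each brute-force attempt is, which is why a weak password falls so quickly when the derivation is a single hash.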
There was also a suggestion that there could be a "trap door". It isn't a very good trap if there are neon signs surrounding it saying "WARNING! BAD ENCRYPTION! DO NOT USE!" That's just ridiculous. If Mr. Filiol had read MS-OFFCRYPTO, he'd have seen that not only are the exact techniques documented, but the flaws in these techniques are also explicitly called out.
If you need to encrypt an Office document, then use the new file format, and get real encryption as we've documented in more than one place. If you need to encrypt an older file format, then use a third-party tool that will do proper encryption. If you merely need obfuscation, perhaps to keep your kids out of the Christmas list, the old RC4 encryption might suffice for that, but not if you have a really bright kid.
Next time, I'll get into some of the really cool new stuff we're doing with signatures – we have full XAdES-X-L support.