The issue of obfuscation and decompiling .NET code comes up on a fairly regular basis, so I thought I’d explore it in some more depth, and try to address some of the common questions that arise.
This is not a new topic. Ever since the first high-level compilers appeared, people have explored the idea of reverse engineering programs to reveal their source code. Back in the 1980s I used to program in C, and one of the language’s nice features was the predictable way the compiler generated assembly code from the C source. This made it possible to write high-level code, yet still feel in control of the assembly it produced. As newer programming languages introduced ever greater levels of abstraction, the mapping has become more complex, so decompiling an .exe file from a modern language is a non-trivial task. (If you doubt this, open an .exe in a debugger and try to make sense of it.)
In contrast, .NET assemblies carry a lot of metadata with them. This gives the runtime a number of important benefits but, as a side effect, makes assemblies easier to decompile and the programmer’s original intent easier to discover. The good news is that this is a well-understood issue, and it can be addressed relatively easily.
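To give a feel for how much that metadata reveals, here is a minimal sketch that uses reflection to list the types and members of its own assembly. The LicenseChecker class and its member names are hypothetical, invented purely for illustration. A real decompiler goes much further and reconstructs the method bodies from the IL as well.

```csharp
using System;
using System.Reflection;

// Hypothetical application type; the names are illustrative only.
public class LicenseChecker
{
    private string registrationKey = "";

    public bool IsRegistered(string userName)
    {
        return registrationKey.Length > 0 && userName.Length > 0;
    }
}

public static class MetadataDemo
{
    public static void Main()
    {
        // Include private and static members, but only those declared on the type itself.
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
                                   BindingFlags.Instance | BindingFlags.Static |
                                   BindingFlags.DeclaredOnly;

        // Walk every type in this assembly and print each member by kind and name.
        foreach (Type type in Assembly.GetExecutingAssembly().GetTypes())
        {
            Console.WriteLine(type.FullName);
            foreach (MemberInfo member in type.GetMembers(flags))
            {
                Console.WriteLine("  " + member.MemberType + ": " + member.Name);
            }
        }
    }
}
```

Even the private field name survives compilation, which is exactly the kind of hint that renaming obfuscators set out to remove.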
There are two really good articles that I urge you to read to better understand this topic. The first is a very readable MSDN Magazine article, “Thwart Reverse Engineering of Your Visual Basic .NET or C# Code”, and the other is the Visual Studio documentation on MSDN, “Goals of Obfuscation” (written by PreEmptive Solutions, who wrote the obfuscator included in Visual Studio 2003). Together these two articles should answer most of your questions.
I love the food analogy in the MSDN article, which likens obfuscation to putting a six-course meal into a blender, so that no one can identify the ingredients, whilst still delivering those contents to the recipient. It is not a strict computer analogy, but a nice one just the same!
One of the common questions is: why is .NET assembly code obfuscated instead of encrypted? The MSDN article addresses this nicely: “You could encrypt .NET assemblies to make them completely unreadable. However, this methodology suffers from a classic dilemma—since the runtime must execute unencrypted code, the decryption key must be kept with the encrypted program. Therefore, an automated utility could be created to recover the key, decrypt the code, and then write out the IL to disk in its original form. Once that happens, the program is fully exposed to decompilation.”
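The dilemma in that quote can be sketched in a few lines. This is only an illustration under stated assumptions: the "IL" here is a made-up text string standing in for real code, and the Aes class (available from .NET Framework 3.5 onwards) is simply a convenient cipher, not the scheme any particular product uses. The point is that whatever the legitimate loader does to decrypt, a third party holding the same shipped files can do too.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EncryptionDilemmaDemo
{
    // Runs the input through an encryptor or decryptor and disposes it afterwards.
    public static byte[] Transform(byte[] input, ICryptoTransform transform)
    {
        using (transform)
        {
            return transform.TransformFinalBlock(input, 0, input.Length);
        }
    }

    public static void Main()
    {
        // Hypothetical stand-in for the IL of a protected method.
        byte[] originalIl = Encoding.UTF8.GetBytes("ldarg.0; call CheckLicense; ret");

        using (Aes aes = Aes.Create())
        {
            // The vendor encrypts the code before shipping it.
            byte[] key = aes.Key;
            byte[] iv = aes.IV;
            byte[] shipped = Transform(originalIl, aes.CreateEncryptor(key, iv));

            // The loader must decrypt before the runtime can execute anything,
            // so the key and IV have to travel with the program.
            // Anyone who locates them can perform exactly the same step.
            byte[] recovered = Transform(shipped, aes.CreateDecryptor(key, iv));

            Console.WriteLine(Encoding.UTF8.GetString(recovered));
        }
    }
}
```

Hiding the key more cleverly only raises the effort needed to find it, which is why obfuscation, which never needs to be reversed at run time, is the more common approach.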
Another common question is “shouldn’t the code be obfuscated by default?”. The danger of default obfuscation is that someone would crack whatever algorithm was used and publish a decompiler tuned to it, so the apparent safety of the default solution would turn out to be deceptive. It’s better to decouple the obfuscation process and have an after-market of ISVs whose focus is creating ever smarter obfuscators, keeping ahead of the decompiler writers. PreEmptive, for example, have a range of offerings, each with different capabilities.
Most people who are interested in obfuscation are so for one of a few specific reasons. First, the code implements some form of copy protection (such as checking for the presence of a CD, or requiring a registration number), and if the relevant algorithm could be decoded, it would be possible to bypass the protection. Second, the code uses some Intellectual Property (IP) that needs to be kept secret and that is core to the way the program works. A slightly different variant is that the software handles sensitive information (financial, personal and so on), and it might be possible to hack or spoof the system if the details of how that information is handled could be decoded. The final reason, and perhaps the most common, is that people just plain don’t want others to see inside their code, as a general principle rather than because of any of the other factors. Given some of the poor code I have seen over the years, it may be just to avoid embarrassment!