Obfuscated code…

I recently ran into this post from Alex Papadimoulis’s “Daily WTF”, and it reminded me of one company’s response to mandatory source disclosure (no, this isn’t another open source discussion, really – I’ve learned my lesson 🙂).

This company (which will remain nameless) licensed the source code for its product to Microsoft for integration in a Microsoft product (no, I’m not going to name names).

As a matter of fact, giving away the source code was one of the selling points of their product.  They licensed the source code to anyone and everyone who bought the product.  This was important because some of their customers were government agencies with source code availability requirements.  It also allowed their code to run on lots of different platforms: all you needed was a compiler (and of course the work to adapt the program to your platform, which they were more than happy to provide).

But of course, if you’re giving away the source code to your product, how do you prevent the people who have your source code from using it?  How do you continue to make money off the product once your customers have the source code?  What’s to prevent them from making the bug fixes for you?  Why should they continue to pay you lucrative contracting fees?  And more importantly, how do you prevent your customers from making an incompatible (or incorrect) change to their server?  If your customers have the source, you lose the ability to ensure the quality of fixes.  This last issue is very real, btw.  I see it all the time on the IETF IMAP mailing list.  About once a semester or so, someone posts a “fix” for the U.W. IMAP server, and Mark Crispin immediately jumps on the fix, explaining how the guy got it wrong.  So it’s important to make sure that your customers, who have the source code to your product, only make the fixes that you authorize.

Well, this company hit on what I think is a novel solution to the problem.  Since their code had to be platform independent, they already had a restriction that none of their identifiers could be more than 6 characters in length (to work around limitations in the linkers on some of their supported platforms).  So they took this one step further and purposely obfuscated their entire source code.

Every single function name in the source code took up exactly 6 letters.  So did all the structures and local variables.  And they stripped most of the comments out of the code.  They had a book (on paper) that translated the obfuscated names to their human-readable equivalents, and their support guys (and internal development) all had copies of the book.
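To make the scheme concrete, here is a minimal sketch in Python of what a six-letter codebook could look like. This is my own illustration, not the vendor's method: as far as I know they kept the mapping by hand in a paper book. The names `shroud_name`, `build_codebook`, and `shroud_source` are hypothetical, and so is the hash-based naming rule.

```python
import hashlib
import re
import string


def shroud_name(name: str, salt: int = 0) -> str:
    """Derive a six-letter lowercase name from a readable identifier.

    Hypothetical rule: hash the identifier and map six bytes to letters.
    """
    digest = hashlib.sha256(f"{salt}:{name}".encode()).digest()
    return "".join(string.ascii_lowercase[b % 26] for b in digest[:6])


def build_codebook(identifiers):
    """Map each readable identifier to a unique six-letter name."""
    codebook = {}
    used = set()
    for name in identifiers:
        salt = 0
        short = shroud_name(name, salt)
        while short in used:  # re-salt on the (rare) collision
            salt += 1
            short = shroud_name(name, salt)
        used.add(short)
        codebook[name] = short
    return codebook


def shroud_source(source: str, codebook: dict) -> str:
    """Replace whole-word occurrences of each identifier in one pass.

    Limitation: this is a plain text substitution, so it also rewrites
    matches inside string literals and comments, and it does not check
    a generated name against keywords or identifiers outside the codebook.
    """
    if not codebook:
        return source
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, codebook)) + r")\b")
    return pattern.sub(lambda m: codebook[m.group(1)], source)


if __name__ == "__main__":
    book = build_codebook(["open_mailbox", "message_count"])
    print(shroud_source("int message_count = open_mailbox(name);", book))
    # The "book" the support staff carried is just the reverse mapping.
    print({short: readable for readable, short in book.items()})
```

Whoever holds the reverse mapping can read the code; everyone else sees six-letter noise that still compiles.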

The customers weren’t allowed to have the book; only employees of the company got it.

So the customers couldn’t really figure out what was going on inside the source code; the only thing they could do was call support and have them look at the code.

A clever solution to the problem, if a bit difficult for the customer 🙂

Oh, and before you ask, no, this is NOT what Microsoft does when it licenses the source to someone.  If you license the source to a Microsoft product, as far as I know, you get the real source.


Comments (16)

  1. RichB says:

    "we have an extensive process to scrub the Rotor source"

    – Brian Harry, July 2002

  2. Do you have a reference Rich? "scrub the Rotor source" is actually highly ambiguous.

    We scrub the Windows source, for instance, before we release it – I believe we pull swear words out of the comments, for example.

    But we don’t obfuscate it.

  3. Doesn’t that eliminate the real world benefit of receiving the source in the first place? While unauthorized fixes may not be what the vendor wants it can save a company millions if they *can* do it.

    I’d be hesitant to deal with a company that acted in such a manner. To me receiving obfuscated source code is worse than receiving binaries only – you’re a vendor I’ve paid money to, we agreed upon what benefits would be provided, and you lied to me. "The source is still compilable!" Whatever.

    The distinction between ethics and legalities has plagued the technology sector for a long time. What is legal is not ethical, and what is ethical is not always legal.

    As for Microsoft source: never seen obfuscated source. For all the physical copy protections and mechanisms that Microsoft uses I’ve rarely found those mechanisms to extend into the software itself.

    Heck, grab Reflector and take a look at the Microsoft assemblies.

  4. Mike Dimmick says:

    Or, go grab the Rotor sources: http://sharedsourcecli.sscli.net/. It’s very readable (and interesting to see that some of the source dates back to 1997!)

    I’d have to say that Dinkumware, suppliers of the C++ run-time library supplied with Visual Studio, are also protecting their investment. That code is almost completely incomprehensible. By contrast, the C run-time code supplied by Microsoft is quite readable – for example, malloc.c – although it could stand to have a few more comments.

    Of course, the length of a header file has a direct impact on the compile time of a C program.

  5. Apparently PJ’s done a better job of documenting the C++ run-time library in the more recent releases.

    Although you’re right, it’s pretty darned hard to read 🙂

  6. This is also what another company (or perhaps the same company, being nameless and all) did to try and get away with lifting and altering GPLed code. Their little ploy didn’t hold much water, though, thanks to the rather comprehensive definition of what counts as ‘source code’ under the GPL, i.e., the code actually used by the developers to create / modify the program. Releasing unreadable code that technically does compile is, IMO, the equivalent of releasing byte-code, and not really within what should be expected when you are required to release "source code".

  7. I very much doubt it was the same company actually. I’ve never even vaguely heard a hint that the code wasn’t their own.

    This was in the mid 1990’s btw, and there wasn’t that much GPL code out there.

  8. runtime says:

    Are you sure the company’s developers actually coded using the obfuscated source files? There are numerous tools to obfuscate C/C++ source files before distributing them to the world. That is an easy job to automate.

  9. Yes, they really did work on it in the obfuscated form. They had to because they typically had to make fixes at the customer site. As far as I know, even development in-house at that vendor was done using the obfuscated versions.

  10. Keith Moore [exmsft] says:

    Another phrase for intentionally obfuscated source code is "shrouded source". Some searching on Google shows that, while not common, it’s not exactly rare.

  11. runtime says:

    Instead of forcing their developers to use a confusing codebook, the company should just have hired developers who don’t speak English. Then all the identifiers in their code would be just as obfuscated to English-speaking Americans! 😉

  12. Centaur says:


    Any sensible coding standard forces you to code as if your native language was English. And you really don’t want to hire those who do not.

  13. Kumar says:

    Can someone recommend a few `free’ tools for C++ code obfuscation?

    Google didn’t help much with `obfuscation’ or `shrouding’.

    Thank you,


  14. Norman Diamond says:

    Regarding postings by 6/1/2004 9:49 AM runtime

    and 6/1/2004 10:09 AM Centaur,

    Most of the world DOES hire developers who don’t speak English. Usually the comments and string constants end up appearing obfuscated to English-speakers, but the identifiers are a different story. Programmers who know the advantages of making identifier names say what they mean do name identifiers that way, and programmers who don’t know it don’t do it.

    Though even when identifiers resemble English words, there can be nuisances. For example one way to write the word "information" in Japanese is to write phonetics that sound almost like the English word "information". One programmer knew this but didn’t know the correct English spelling, so part of a function name was "infomation". I didn’t notice when reading it, but did some hunting when a link error came up, and made the caller match the actual defined name.
