Do not change program semantics in the debug build


What you don't want is a problem that goes away when you debug it.

It is expected that a program will have additional debugging code inside #ifdef DEBUG blocks. After all, that's why it's a debug build. But what you definitely don't want is for that debugging code to change the program's behavior fundamentally. You can perform additional validation. You can raise assertion failures. You can track resources. The debug build can be slower and consume additional resources, but it had better not alter code flow.

// This is wrong
BOOL DoSomething(Thing *p)
{
#ifdef DEBUG
 // Do some extra parameter checking
 if (!p) {
  Log("Error! p parameter must not be NULL.");
  return FALSE; // WRONG!
 }
#endif
  ... remainder of function ...
}

This code is wrong: The debug version behaves fundamentally differently from the retail version. If somebody calls this function with NULL for the p parameter, the retail version of the program will crash but the debug build will trap the error and fail the call.

Do not change the function's semantics in the debug build. If the retail build crashes, then the debug build must also crash in the same way. Sure, you can log the failure before you crash, but you still need to crash.
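For contrast, here is one way to keep the debug checks without changing the outcome. This is a hedged sketch, not code from the article: the Thing, Log, and BOOL definitions are stand-ins invented so the fragment compiles, and the function body is a placeholder.

```cpp
#include <cassert>
#include <cstdio>

// Stand-in definitions so the sketch is self-contained
// (assumptions, not part of the original example).
typedef int BOOL;
#define TRUE  1
#define FALSE 0
struct Thing { int value; };
static void Log(const char* msg) { std::fprintf(stderr, "%s\n", msg); }

// Right: the debug build may log and assert, but it must not change
// the outcome. A NULL argument crashes in both builds.
BOOL DoSomething(Thing* p)
{
#ifdef DEBUG
    if (!p) {
        Log("Error! p parameter must not be NULL.");
        assert(p != NULL);  // fail loudly, but do not return early
    }
#endif
    // Placeholder body: both builds dereference p, so both builds
    // crash the same way on a NULL argument.
    return p->value != 0 ? TRUE : FALSE;
}
```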

An analogous mistake in the C# world might go like this:

// This is wrong
void DoSomething()
{
#if DEBUG
  try {
#endif
   ... guts of function ...
#if DEBUG
  } catch (Exception ex) {
     LogException(ex);
  }
#endif
}

In this C# example, the debug build logs and swallows exceptions, while the retail version allows them to escape.
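The behavior-preserving variant of that pattern logs and then rethrows, so the exception escapes identically in both builds. Sketched here in C++ rather than C# for concreteness; LogException and the thrown error are invented stand-ins:

```cpp
#include <exception>
#include <iostream>
#include <stdexcept>

// Stand-in logger (an assumption, not from the original example).
static void LogException(const std::exception& ex) {
    std::cerr << "caught: " << ex.what() << '\n';
}

void DoSomething()
{
#ifdef DEBUG
    try {
#endif
        // ... guts of function ... (placeholder failure)
        throw std::runtime_error("guts of function failed");
#ifdef DEBUG
    } catch (const std::exception& ex) {
        LogException(ex);
        throw;  // rethrow: callers see the same exception in both builds
    }
#endif
}
```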

If you mess up and write code like this, where the retail and debug versions behave in some fundamentally different way, you will eventually get yourself into this situation: The retail version has some problem, but the debug version works okay. Your customer can't figure out what the difference is, so they switch to the debug version on their production servers. It runs twice as slow and consumes three times as much memory, requiring significant capital expenditure to scale up to their previous level of service. But it's the best they can do, because the problem doesn't occur on the debug version (and therefore cannot be debugged there).

I have seen reports of software getting into this predicament, and it reflects very poorly on the developers.

Comments (36)
  1. Tom says:

    Not having different semantics in debug and release builds is important and should (hopefully) be common knowledge.  What gets me is when the optimizer does something in the release build to cause an error.

    For example, I recently had a bug that only appeared in one machine that was a quad-Xeon box running Windows Server 2003.  Win 2K and XP ran the code fine.  In fact, I developed the software on a Dual Xeon Win2K machine.  It turned out that the optimizer reordered some floating point operations that resulted in my DLL generating a most-of-the-time increasing series instead of a strictly-increasing series.  Fortunately my code checked for this condition and returned an error code, but it was certainly confusing as to why it worked everywhere but on that machine.

    Since I didn’t have direct access to the Server 2K3 box, I had to use WinDbg on a core dump of the process on a 2K box.  That was fun!

  2. Andy says:

    Some days I read your "don’t do this" type of posts and I think "wow that is a really good tip I never would have thought to check for that". Other days like today I look at it and go "OMG please tell me people aren’t doing that in real life code". A while back you mentioned that all of your "don’t do that" posts stem from things you have actually seen or know for certain were done. That fact coupled with code like what you showed above just makes me pray that these people aren’t working on anything critical like navigation software for airplanes or if they are that they will read your blog and will stop writing horrendous code.

  3. Stu says:

    This kind of debug-code-changes-behavior is quite easy to spot, but then there are the times when the debug code has unintended consequences, like printfs taking long enough to make thread synchronisation work…

  4. Tom says:

    It can get much worse than this.

    I’ve heard of embedded systems that used the UART for debug messages. Most of the testing was done in debug mode, but just before release the UART was disabled since it might reveal secret information. Also the system was pretty slow. Lots of fatal bugs were uncovered, some that could not be fixed in time. Most were due to the changed timing from not having prints running – some of the drivers only worked because of the delays as a side effect of the UART interrupt service routine running every few ms.

    In the end they ended up shipping a version with the UART pins disabled, but all the printfs still in the code. Ouch.

    The lesson I think is to aggressively minimise the printfs before testing – only really essential info should go out to the port. That way you have more time to find timing bugs in the code while you still have time to fix them.

  5. Raymond Chen’s excellent blog has a post today warning of the dangers of changing program behaviour between…

  6. Matt Lee says:

    A coworker once did something like

    assert (someFunc() != null);

    Where someFunc() actually did some necessary operation.  Of course it wasn’t being called at all in the release version.  This one didn’t make it into the wild but it was tricky to track down.

    Another problem with debug versions is the memory guards that the memory manager puts at the beginning and end of memory blocks allowing your debug version to survive off by a few pointer calculations without trashing other memory.  Especially when you use the guard character (ASCII 253 I believe) for an important delimiter in your strings like a company I previously worked for.

  7. Andrew Feldstein says:

    What about assertions that get turned off by compiler options in the release build?

  8. JamesW says:

    @Matt Lee

    I won’t blame a coworker – I’ve done the assert() one myself. It’s easy to do if you’re not thinking too hard at the time, because the #ifdef DEBUGs are hidden from view. It tends not to be found immediately because everything works in your debug project. It blows up once the program is tested in release a while later…

  9. Tyler Reddun says:

    Andy, it’s worse than that. When I was in college we were taught that we had to put our param checks in the #ifdefs because they were clearly debug-only code. In fact, we got marked down if we didn’t do it. I suspect some people from that class still do it that way as well.

  10. oPossum says:

    VERIFY(someFunc()!=NULL); will keep someFunc()!=NULL in the release build and behave like ASSERT(someFunc()!=NULL) in the debug build.
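    The usual shape of those macros makes the difference visible. A simplified sketch (the real MFC definitions differ in detail; Bump and g_calls are made-up names for observing evaluation):

```cpp
#include <cassert>

#ifdef DEBUG
#define ASSERT(expr) assert(expr)
#define VERIFY(expr) assert(expr)
#else
#define ASSERT(expr) ((void)0)       // expression compiled out entirely
#define VERIFY(expr) ((void)(expr))  // expression still evaluated; result ignored
#endif

// A call counter so we can observe whether the expression ran.
static int g_calls = 0;
static int Bump() { return ++g_calls; }

// VERIFY(Bump() > 0) calls Bump() in every build;
// ASSERT(Bump() > 0) calls it only when DEBUG is defined.
```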

  11. richard says:

    Hmmm … I have never understood why people want release builds to be different from debug builds. As far as I am concerned, if it is important enough to check during development, it is important enough to check at release.

    For this reason, I dislike ASSERT() – not to mention it can also be a source of introducing subtle behavioural differences like:

    ASSERT (i++ > 0);

    or ASSERT (some_modifying_function_call(i));

    All of which I have seen in code.
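    The standard assert macro has exactly this property: defining NDEBUG removes the whole expression, side effects included. A minimal sketch:

```cpp
#include <cassert>

int CountedIncrement()
{
    int i = 0;
    assert(i++ >= 0);  // with NDEBUG defined, this line compiles to nothing
    return i;          // 1 in a checked build, 0 in an NDEBUG build
}
```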

  12. Doug says:

    IMHO, #ifdef DEBUG, ASSERT(), and their brethren should not exist.

    They just lead to issues like this.

    The only semi valid reason to have them is for performance reasons.  And for most code, that is not a good reason.

  13. KJK::Hyperion says:

    Assertions are invaluable, and they’re strictly debug code too, in my opinion. If you are afraid of altering the environment with your debug assertions, then use them in a sensible pattern. For example, for data structures in C++ you can define a const member function called AssertInvariants, put all your asserts in it (which cannot alter the state, since "this" points to a constant object), and call it every time execution enters or leaves a function in a public interface. I can’t count the times this pattern has saved my ass against subtle corruption, off-by-one errors, or faulty logic.

  15. Jonathan says:

    And let’s not forget the ever-popular deubg-code-stomping-on-GetLastError…

  16. DriverDude says:

    "I’ve heard of embedded systems that used the UART for debug messages. Most of the testing was done in debug mode,…"

    There’s one problem: most of the development may be done in debug mode, but TESTING must be done with release code. Otherwise the product being tested is not what the customer has. Doh!

    Given that we know about these pitfalls, we should either 1) avoid using those constructs, or 2) design/review code to catch those mistakes. Even the best of us make mistakes. In my experience, precious few have learned from their (and their peers’) mistakes.

    Understanding the system is also necessary. Perhaps an extra printf isn’t a big deal for a multi-threaded app, but the time it takes to send stuff over a UART in an embedded system is a big deal. Not realizing that leads to design mistakes and hard-to-find bugs.

    I’d like to see more of this in literature, taught in classes, and peer-reviewed. Every C book says something along the lines of "beware: the min macro can eval its args twice."  But few warn about the pitfalls in:

    ASSERT(i++ < 10);

    if (expr)  DEBUG_PRINTF_MACRO(("…"));

  17. Here’s a post I read on The Old New Thing: Do not change program semantics…

  18. You’ve been kicked (a good thing) – Trackback from DotNetKicks.com

  19. silkio says:

    god those ‘trackbacks’ are annoying.

    Anyway, I had an annoying issue with MS SAPI 5.1, where you couldn’t actually use breakpoints in the app because it would freeze. I’m guessing it’s due to some strangeness in the SAPI lib itself or the com interop-y stuff between .net and the dlls.

  20. asmguru62 says:

    In my opinion the DEBUG build should alert developers and the RELEASE one should also not crash, so there should be some redundancy of a sort:

    HRESULT f (ISomething* pISomething)
    {
      ASSERT (pISomething != NULL);
      if (pISomething == NULL)
      {
        return E_FAIL;
      }
      …
    }

  21. jmstall says:

    Amen!

    As a debugger developer, I see these show up when people say "The debugger broke my app" and blame the debugger tool.

    While you’re at it, avoid kernel32!IsDebuggerPresent and Debugger.IsAttached.

    Unfortunately, Winforms has a notorious violation of the "don’t change behavior under a debugger" rule by having a different wndproc for when under a debugger that has an additional exception catcher.

  22. Dewi Morgan says:

    Jonathan wrote: "And let’s not forget the ever-popular deubg-code-stomping-on-GetLastError…"

    Yup. It’s the flip case of what was discussed in the article, where debug code breaks code, rather than having the code rely on debug code.

    But I have to hold up my hands and say I was recently guilty of writing something like the following wrapper in PHP (paraphrased heavily for brevity):

    function my_exec_sql($query) {
      global $DEBUG;
      if (builtin_exec_sql($query)) {
        // Store last insert id, so debug won't clobber it.
        my_set_last_id(builtin_get_last_id());
        if ($DEBUG) {
          if (!builtin_exec_sql("
            INSERT INTO log_table
            VALUES ('" . escape($query) . "')
          ")) {
            die("failed to save debug info");
          }
        }
      }
      else { die("failed to insert"); }
    }

    Now, we’ll ignore the fact that relying on a DB to log debug info about the DB is fairly bad…

    builtin_exec_sql() causes builtin_get_last_id() to return the index of the last inserted item.

    Unfortunately, a function that called my_exec_sql() subsequently called builtin_get_last_id() directly, instead of calling the my_get_last_id() function. This gave it the ids from log_table, instead of whatever table it was inserting into.

    Worse, on a virgin test system, the ids were "correct", so long as that function was only called for that one query. So it wasn’t reproducible no matter how many times you ran the query!

    Not my fault, but a real pain to debug!

    I changed it to insert to the log *before* doing the action, and then modify the log afterwards, on fail/success. This still has knock-on effects, and affects speed, but I felt this was less serious and more obvious than the subtle bug.

    More appropriately to the topic of debug code making the system work, in another app I worked on (multithreaded java 1.0, back in the day), I found that removing a certain apparently-useless print("") call caused NullPointerExceptions, in an apparently unrelated thread.

    Why? Both threads did stuff with some static variables in an object they both instantiated, and the crashing thread was assuming it was the first to instantiate the object. And as System.out.print() was a slow operation, it *was* the first, so long as that print existed.

    Rather than track down the problem, the original programmers had instead just decided to keep the "magic" print(), and just remove all the content from it.

    The problem was eventually resolved more correctly, but this has served me for years as a powerful lesson on the kind of mess not to get into with multithreading.

  23. Subtle behavior changes are harder to catch.

    We had some debug code that checked an STL map<>, which caused crashes in the release build.

    Why?

    Because the map<>::operator[] implicitly creates an item in the map.

    When the debug code was present, the map was always guaranteed to have at least one item in it.

    I guess this is why some people think C++ is evil…
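    To make the trap concrete (a sketch with invented names): operator[] on a std::map default-constructs a value for a missing key, so a debug check that "just reads" an entry can grow the container.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Looks like a read-only debug peek, but m["sentinel"] inserts
// {"sentinel", 0} if the key is absent, mutating the map.
std::size_t DebugPeek(std::map<std::string, int>& m)
{
    int value = m["sentinel"];
    (void)value;
    return m.size();
}
```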

  24. Nick Lamb says:

    I’ll be the devil’s advocate for a moment here…

    Depending on your situation it /may/ be better for your program to keep running despite certain types of error, thus changing the semantics. In the DEBUG build you want these errors reported, and if you’re in a debugger or have crash dumping you probably want it to crash immediately so that you can try to understand how it happened. But in the production build it might be better to carry on and cross our fingers.

    Raymond’s justification is that customers will use your slow and undesirable DEBUG version, but actually in his example what the customer wants (and gets) is a version that crashes less. By putting most asserts and forced crashes into the DEBUG path only, you reduce the chance of the customer’s production version crashing. It might give bad results, but it probably won’t crash. For a lot of customers, a lot of the time, that’s actually the preferred outcome.

    There has been some interesting research done on changing the behaviour of illegal memory accesses (a really important bit of semantics) in production systems. A lot of crashes found in ordinary application software go away with minimal side effects if the OS simply ignores illegal writes and provides zero values for illegal reads.

    Obviously if you’re monitoring a nuclear reactor, you’d like to know if anything goes wrong with the software you’re using, and it is probably more acceptable for the software to fail safely, altogether, than for it to try to continue past a fault. For a word processor or a video game it’s much less certain that crashing is the right thing to do.

    Here’s a really trivial example (please don’t bother shooting this down unless you have a real point to make too). Suppose you are playing World of Warcraft, and you’re the Main Tank for AQ20. Would you rather that a memory allocation problem in the WoW executable caused a crash, disconnecting you from the game and therefore almost inevitably resulting in a "wipe" of the raid or would it be better if it just corrupted the audio and video of the game, making it harder to play until you could tell everyone else and get a rest break to reboot your computer ?

    If you’re the developer, I hope you’d choose the crash, the game is fun but hopefully you’re enthusiastic about actually finding and fixing bugs. I expect everyone else would choose the corruption.

  25. Adam says:

    Nick : That sounds extraordinarily dangerous, even for non-nuclear-reactor scenarios.

    If you’re doing illegal reads or writes, it’s because your pointers are badly messed up and you’re attempting to read/write a page that’s not even mapped into your address space. Double-free()s, walking off either end of an array, dereferencing dangling pointers; most of these will not produce a SEGV, they’ll just trash something else in your address space silently. (Remember, most of the time free() does not return memory to the OS; it just returns it to malloc().)

    If you end up trashing some other pointer and/or malloc’s housekeeping data then yes, you’re likely to get a SEGV further down the line when you try to deref a trashed pointer. But that’s just a symptom. The cause is deeper, and I’ve found that I’m relatively likely to have messed up some other data in the process. I’d rather my process stopped right there before things go more badly wrong, instead of continuing and increasing the chances that some bad data gets written somewhere (a file containing an important report, a DB, etc…) permanently.

    What if the corruption is due to an attacker trying to exploit an error (e.g. to steal your legendary gear)? Crash, and they’re gone. Keep going, and you give them more of a chance to get what they’re after.

    Once you’ve started reading/writing random memory, you can’t really be sure of anything anymore. Fail early.

  26. Neil says:

    A lot of crashes found in ordinary application software go away with minimal side effects if the OS simply ignores illegal writes and provides zero values for illegal reads.

    I used to use a database server called SQLBase running under Novell Netware 3.12; it seems that this OS performed read and write fault emulation. Netware 4.11 disabled emulation by default but fortunately to run older versions of SQLBase you could turn it back on again. (Latest versions of Netware don’t support that option, they simply suspend the faulting process).

    I use Mozilla 1.6 (yeah, I should upgrade) which occasionally crashes during garbage collection and I have found that I can use WinDbg to set the EIP to the function epilog to keep it running.

  27. Charlie Tangora says:

    Funny, I just ran across this one today. It’s from a simple stack allocator in a console videogame’s memory management, edited for brevity:

    U8* Allocate(size_t numBytes)
    {
      U8* ptr = m_current;
      m_current += numBytes;
      ASSERT(m_current <= m_topOfHeap);
      return ptr;
    }

    Yup, in release it gave you a pointer regardless of whether there was enough memory for you! Allocate more than your share, and you just start using other people’s memory.

    This bug has been there for MONTHS. The person who wrote that code isn’t even at our company anymore. Nobody has any idea how it managed to work for so long.
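    A behavior-preserving fix keeps the bounds check in every build and fails the allocation instead of handing out someone else's memory. A sketch reusing the member names from the snippet above (the class wrapper and constructor are invented scaffolding):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint8_t U8;

class StackAllocator {
public:
    StackAllocator(U8* base, std::size_t size)
        : m_current(base), m_topOfHeap(base + size) {}

    U8* Allocate(std::size_t numBytes)
    {
        // Checked in ALL builds; the assert just makes debug louder.
        if (numBytes > static_cast<std::size_t>(m_topOfHeap - m_current)) {
            assert(!"StackAllocator out of memory");
            return nullptr;  // same failure path in debug and release
        }
        U8* ptr = m_current;
        m_current += numBytes;
        return ptr;
    }

private:
    U8* m_current;
    U8* m_topOfHeap;
};
```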

    Speaking of differences between debug and release, there’s also a wonderful "feature" in certain ports of gcc. In this sort of case:

    float Stuff(float input)
    {
      float output = input;
      /* do stuff with output */
      /* but don’t create any other locals */
      /* then forget to return output */
    }

    VC++ of course complains that you didn’t return anything. Well done, Microsoft!

    But gcc in certain configurations (I think 4.0.2 on PowerPC is an example, but my memory may be failing me here) will not only not warn you, it will put your local variable on the stack where the return value should go. So the function works UNLESS it gets optimized.

    You really feel warm and fuzzy once you’ve spent half a day trying to figure out if you’re seeing a compiler bug, then finding that mistake.

  28. James says:

    On the hardware side, I recall a production error in a batch of computers some years ago. One capacitor’s value was very slightly too low. Of course, attaching an oscilloscope to the component *fixed* the problem, because the ‘scope itself had some capacitance, making the machine behave perfectly again…

    I’ve heard at least one person suggest that the released code should have *more* checks than debug, since it’s running in a less known/controlled environment – I can’t remember who that was, but I tend to agree to some extent.

  29. Vipin says:

    Andy:"That fact coupled with code like what you showed above just makes me pray that these people aren’t working on anything critical like navigation software for airplanes or if they are that they will read your blog and will stop writing horrendous code"

    These types of code exist, and some people won’t even change their attitude towards making the debug and release builds share the same code flow. I was recently looking at the code of an SSL VPN product, and the guy (designated a senior staff engineer) has written code that doesn’t need a certain condition to be satisfied to run properly in the debug build, but requires the condition in the release build; this library links statically into the modules other people write. Imagine the catastrophe here: the user of the module would go nuts at this behaviour, but he will not budge on making it proper, dictated by his ego. The bottom line is that the release code flow never gets executed in the debug build, until everyone is caught by surprise later on when the product ships. Coders have to keep their egos down, admit mistakes like this in the code, and fix them.

  30. Ken Hagan says:

    I’m joining this late, but Tom’s initial response raised an important related issue…

    "What gets me is when the optimizer does something in the release build to cause an error."

    Use full optimisation in debug builds. You’ll need ‘#pragma optimize("",off)’ occasionally, but rip it out as soon as the bug is found. Debug what you intend to ship. Ship what you debugged.

    Actually, in fairness to the compiler folks, I’ve been doing this for over a decade and found only a handful of optimiser bugs. (I optimise for size, not speed, so your mileage may vary.)

  31. Vipin says:

    The problem is that buffer overruns which would go unnoticed in a debug build (because of the extra padding, etc.) won’t work on a sensitive stack when the release binary runs, because all the padding and extra spacing that exist in a debug binary are stripped out; things like FPO also happen. But the latest compilers catch buffer overruns via the /GZ compiler option, which brings the debug/release difference a lot closer in that respect. But I guess Raymond’s post had to do with code flow rather than compiler-introduced differences.

  32. Neil says:

    "What gets me is when the optimizer does something in the release build to cause an error."

    Or indeed the JIT compiler… the IE(3-5+) VM for Java JIT compiler miscompiles some of my code and throws an impossible null pointer exception.

  33. Steve Loughran says:

    I seem to recall that MFC zeroes out newed memory in debug builds but not in release builds, so all MFC-based apps have a fundamentally different assumption about the state of newly allocated memory.

    I now believe in running release-only code, with debug info compiled in.

    [“Well fix MFC then.” Was that directed to me? I don’t have the authority to make changes to MFC. -Raymond]
  34. Norman Diamond says:

    Just imagine if airplanes were allowed to have black boxes only while the manufacturer was testing them with test pilots operating them.  Just imagine if airplanes sold to airlines with ordinary crew and lots of passengers were prohibited from using black boxes so the causes of crashes couldn’t be investigated.

    Just imagine if cars were allowed to have seatbelts during testing by manufacturers, but if it would be illegal for end users to install or use seatbelts.

    Certain corporate policies that prohibit redistribution of debug builds have the same kind of effect.  Yeah when we build verification into our own code we can make sure that it’s doing the same operations in release build as in debug build, but we’re still missing our black boxes.  This kind of corporate policy is something that ought to be illegal.

    Sure there are situations where debug tools need to be stripped out.  If the program is running too slowly then do a performance test.  If debug tools are the third biggest consumer of CPU time (or page faults or whatever) and the two biggest consumers are parts of the program that you can’t optimize any further, then it’s time to start looking at throwing your black box overboard.

  35. Steve Loughran says:

    Raymond: The "well fix MFC then" comment was directed at, well, whoever maintains MFC. I don’t know the org chart. I suspect that MFC is on care-and-maintenance rather than active dev, and probably has three interns and a new PM every month, as MFC maintenance is viewed as one of those trouble-without-recognition postings.

    I do know the version of new we use in our code is patched to always zero stuff; we take the hit because we do at least know everything is in a known state after allocation. That is one nice thing about MFC: as long as you don’t try shipping your own MFC DLLs, you can patch bits and rebuild it.

  36. Eric says:

    I get your point but your C# version is not that similar.  A more similar example would be:

    bool DoSomething(Thing p)
    {
    #if DEBUG
      // Do some extra parameter checking
      if (p == null) {
        Log("Error! p parameter must not be NULL.");
        return false; // WRONG!
      }
    #endif
      … remainder of function …
      System.Console.Out.WriteLine( p.ToString() ); // This will cause NullReferenceException.
    }

Comments are closed.