Why is managed debugging different than native-debugging?

People ask “why can’t a native debugger debug managed code?”.

The reason is that the CLR provides a lot of cool services beyond what you get in a typical native C++ app, such as: running on a Virtual Machine / JITing, Dynamic class layout, the type-system, garbage-collection, reflection-emit, and more. Each of these imposes special challenges on a debugger. Put another way, a native app that did all these things would not be at all debuggable with conventional native debuggers.

I’ll explore the impact of  these things on a 10k foot level. (I’m intentionally omitting tons of detail for brevity’s sake. Maybe I should fill in the details and just convert this to an article instead of a blog entry?)

1) Native debugging can be abstracted at the hardware level but managed debugging needs to be abstracted at the IL level. Managed code can not just be shoehorned into C/C++ native debugging paradigms. One reason is this may restrict the CLR’s options for executing the IL. For example, although it currently (as of v2.0) JITs IL, we’d like to leave the door open for things like interpreting the IL, pitching rarely used jitted code, or even rejitting code. If ICorDebug used native code offsets for everything, it would be unable to debug interpreted IL.   

2) Managed debugging needs a lot of information not available until runtime. With managed code, the compilers only produce IL and the real debugging information is not resolved until runtime. For example, the JIT will compile IL to native code at runtime, and the loader will dynamically determine most class layout at runtime. The type-system may create new types at runtime (from reflection-emit or from System.Activator). For native code, this is all determined at compile time. A managed debugger needs some way to get all this information at runtime. Some solutions include:

a. Have the CLR create auxiliary PDBs at runtime as the information is determined. This could be a huge perf hit, and we’d hate to do it if the debugger is not attached. But if we don’t do it when a debugger is not attached, it may not be available if a debugger does attach later.

b. Have the managed debugger inspect the pertinent CLR data structures (either directly from out-of-process or via a “helper” thread running in-process). A big caveat here is ensuring the debugger doesn’t request such information when the CLR data structures are in an inconsistent state. The CLR currently uses a helper thread.

3) A managed debugger needs to coordinate with the Garbage Collector (GC) .  The CLR has a mark-sweep-compact GC.  This means the GC will move objects to defragment the heap and update all references (“gc roots”) throughout the process accordingly. This impacts debugging in several ways:

a. The debuggee is temporarily in an inconsistent state during the GC. The debugger must coordinate with the GC to ensure that it doesn’t inspect the debuggee during this window. 

b. Debuggers can let users change the values of variables. This updating must be coordinated with the GC’s updating.

c. There’s no convenient object identity. In native code, the raw pointer value to an object uniquely identifies that object since objects don’t get moved around.