Why interop debugging is difficult

The Visual Studio debugger supports debugging both .NET code and native code at the same time. We call this ‘interop’. At first glance, this might not seem like much of an accomplishment. After all, we support debugging .NET code well enough, and we support debugging native code. What’s so hard about doing both at the same time?

The problem is this: the CLR's debugger for .NET code comes in two pieces. One piece sits inside the debuggee process and understands all the nitty-gritty details of the CLR. The other piece runs inside the debugger process. It knows how to talk to the code inside the debuggee, and it exposes an interface (ICorDebug) that Visual Studio consumes to debug .NET code.

Now, the way the Win32 native debugging API works is that the OS tells the debugger when an event happens. Once the OS has reported the event, every thread in the debuggee is frozen. Normally, all of these threads stay frozen until the debugger is done processing the event, or until the user decides to continue the process (depending on the kind of event).

This causes a big problem. In order to do .NET debugging, we need to run code in the process. But the native debugger can't let code run in the process. This has been a very hard problem to solve. We partially make it work by keeping one thread inside the debuggee running. The difficulty of this problem is why we don't simply enable interop debugging all the time, and why interop debugging is generally slower and less reliable.
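
To make the conflict concrete, here is a toy model in Python. It is only a sketch and does not use any real debugging API. The thread names (`main`, `worker`, `clr_helper`) and the request string are hypothetical, and it assumes that the one thread left running is the thread hosting the in-process debugger piece, which the text above does not actually state.

```python
class ToyDebuggee:
    """Toy stand-in for a debuggee process; all thread names are hypothetical."""

    def __init__(self):
        # Maps thread name -> True if the thread is currently frozen.
        self.frozen = {"main": False, "worker": False, "clr_helper": False}

    def native_stop(self):
        """Model a native debug event: every thread in the process freezes."""
        for name in self.frozen:
            self.frozen[name] = True

    def resume_thread(self, name):
        """Let a single thread run again while the others stay frozen."""
        self.frozen[name] = False

    def ask_in_process_piece(self, request):
        """The in-process debugger piece can answer only if its thread runs."""
        if self.frozen["clr_helper"]:
            return None  # nothing is running that could service the request
        return "handled: " + request


def demo():
    debuggee = ToyDebuggee()
    debuggee.native_stop()
    # With every thread frozen, the managed debugging request goes unanswered.
    blocked = debuggee.ask_in_process_piece("get managed call stack")
    # Assumed workaround: keep just the helper thread running.
    debuggee.resume_thread("clr_helper")
    answered = debuggee.ask_in_process_piece("get managed call stack")
    return blocked, answered, debuggee.frozen


if __name__ == "__main__":
    print(demo())
```

In this model the first request comes back as `None` because nothing in the process is running, and the second succeeds while `main` and `worker` remain frozen. The real difficulty is everything the model leaves out, such as locks held by the frozen threads.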

Someday, we would like to make this easier by having the CLR examine data structures from inside the debugger process. We call this a ‘hard mode’ debugger because the debuggee is hard stopped. However, there will still be tricky issues to deal with. One problem is that the current .NET debugger can’t stop the process at an arbitrary instruction. Instead, it tries to get the debuggee into a ‘safe place’. This has the downside that sometimes we can’t attach to a process. But on the upside:

  • Evaluating properties (and other functions) is safer because it is less likely that another thread will be holding an important lock (the heap lock is a good example).
  • We don’t need to worry about debugging while the garbage collector is moving references around.

Comments (12)

  1. bk says:

    what about IL debugging? is it being considered to be integrated in VS.NET or does this feature have difficulties like those mentioned above?

  2. Gregg Miskelly says:

    Can you explain what you mean by ‘IL debugging’? If you mean debugging code written in MSIL, then that is what I meant by ‘.NET code’.

  3. bk says:

    ‘IL debugging’ as in stepping over the disassembled (or rather decompiled? or ILDASMed?) IL of code written in higher-level .NET languages like C# in VS.NET. It is analogous to stepping over assembly code in unmanaged code. This isn’t quite right, but just for comprehension, if you think IL:ManagedLanguages=Assembly:UnmanagedLanguage, you’ll get the picture of what I’m saying 🙂

  4. Gregg Miskelly says:

    Yes, now I understand what you mean. This problem is actually fairly unrelated to interop debugging, and it isn’t too difficult.

    One issue is that we aren’t sure how valuable this information would be. The only real use that I know of is for code that you don’t have the source to (since IL is easier to understand than x86 disassembly). However, usually when you don’t have source, you are debugging retail code. With retail code, the CLR doesn’t always track IL->native maps, which would be required for IL debugging.

    Another issue is where to put this information. One option would be the disassembly window. But this is a problem for two reasons. First, the CLR doesn’t exactly map from CLR instructions to native instructions. Second, in the disassembly window, stepping is always done on native instructions.

    Still, I bet we will do this someday.

  5. bk says:

    There could be scenarios like stepping over the BCL IL to find out what’s happening underneath. Well, you can use ILDASM or Reflector, but I see advantages in being able to do things on the fly at runtime, inspect the IL values, etc.

    For example, I tried to look into the mysteries of marshalling in P/Invoke because I had some weird, unexpected results (it turned out to be my mistake, though). Using ILDASM and Reflector didn’t work because the core functions had native instructions or whatever that prevented the disILing. So I had to look at the SSCLI code, which made me study the whole thing, which wasn’t what I intended. Perhaps this isn’t a good example for showing the need for ‘IL Debugging’, but I just wanted to say it would be useful if it were implemented 🙂

  6. John Schroedl says:

    I can appreciate that this is difficult for the debugger/CLR team to pull off. Nice job! Thanks for writing this up…

    I recently hit a roadblock debugging a problem. I KNOW for a fact that some code somewhere is calling SetWindowPos on my .NET code’s MainForm. Specifically, the WindowState changes, but not via set_WindowState on the Form class: I can break there and see other calls. I cannot figure out the correct syntax to break on USER32:SetWindowPos like I could in Win32-land. Is this possible? The modules window doesn’t even list the Win32 DLLs like KERNEL32, USER32, etc. any more. Is this on purpose? I have enabled Native and Managed debugging for the project. A writeup or tips for debugging something like this would help me understand the Win32/Managed relationships.


  7. Gregg Miskelly says:

    If you have symbols for user32.dll, it should be possible. You want {,,user32}_SetWindowPos@28. By default, the modules window only shows modules for the current program. You need to either switch to the ‘native’ program, or right click on the modules window and enable seeing modules for all programs. See http://blogs.msdn.com/greggm/archive/2004/02/07/69330.aspx.
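
The `_SetWindowPos@28` part of that breakpoint location follows the 32-bit x86 `__stdcall` naming convention: a leading underscore, the function name, then `@` and the total size of the arguments in bytes. SetWindowPos takes seven arguments of 4 bytes each on 32-bit x86, hence 28. The sketch below rebuilds the string in Python. The helper names are made up for illustration and are not part of any debugger API.

```python
def stdcall_decorated_name(function_name, arg_sizes_in_bytes):
    """Build the 32-bit x86 __stdcall decorated name: _name@total_arg_bytes."""
    total = sum(arg_sizes_in_bytes)
    return "_{}@{}".format(function_name, total)


def breakpoint_location(module, symbol):
    """Wrap a symbol in the {,,module} context form used in the reply above."""
    return "{,," + module + "}" + symbol


# SetWindowPos(HWND, HWND, int, int, int, int, UINT): seven 4-byte arguments.
symbol = stdcall_decorated_name("SetWindowPos", [4] * 7)
location = breakpoint_location("user32", symbol)
print(location)  # prints {,,user32}_SetWindowPos@28
```

The same arithmetic applies to other `__stdcall` exports when debugging 32-bit x86 code, provided each argument's size is rounded up to a multiple of 4 bytes.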