What is Interop-Debugging?

(This is an excerpt from an internal document I wrote explaining what Interop-Debugging (aka Mixed Mode) is and how it works under the covers.)

 

General Debugging background.

When a process is being debugged, it generates debug-events which a debugger can listen for and respond to. These events include things like CreateProcess, LoadModule, Exception, ExitThread, Breakpoint, etc.

When a debug event is dispatched, the debuggee is stopped until the debugger continues the debug event. The debugger can inspect the debuggee during the window while the debuggee is stopped. Once the debuggee is continued, it runs free until the next debug event.

Debug events come on a per-thread basis.
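To make the stop/continue cycle concrete, here is a minimal toy model in Python. It is only a sketch: ToyDebuggee, DebugEvent and run_debugger are hypothetical names made up for illustration, and nothing here talks to a real OS or to the CLR. It just shows that each event arrives on a specific thread, that the debuggee stays stopped until the event is continued, and that the debugger's inspection window is the gap between the two.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DebugEvent:
    thread_id: int  # debug events come on a per-thread basis
    kind: str       # e.g. "CreateProcess", "LoadModule", "Breakpoint"


class ToyDebuggee:
    """Stands in for a process being debugged; not a real OS process."""

    def __init__(self, events):
        self._pending = deque(events)
        self.stopped = False

    def wait_for_event(self):
        """Return the next event and stop the debuggee, or None when done."""
        if self.stopped:
            raise RuntimeError("the previous event has not been continued")
        if not self._pending:
            return None
        self.stopped = True
        return self._pending.popleft()

    def continue_event(self):
        """Let the debuggee run free until its next debug event."""
        if not self.stopped:
            raise RuntimeError("the debuggee is not stopped")
        self.stopped = False


def run_debugger(debuggee):
    """Toy debugger loop: wait for an event, inspect, continue."""
    log = []
    while (event := debuggee.wait_for_event()) is not None:
        # Inspection window: the debuggee is stopped here.
        log.append((event.thread_id, event.kind, debuggee.stopped))
        debuggee.continue_event()
    return log
```

Calling run_debugger on a ToyDebuggee built from a few events returns one log entry per event, each recorded while the debuggee was stopped.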

 

Managed + Native debugging both share this conceptual model, though they have different sets of debug events and different methods of cycling between stopped and running.

A debugger uses the debug events combined with inspection APIs to implement all debugging operations.  This document is only concerned with explaining how Interop debugging routes the managed + native debug events. How to build a debugger on top of these events is outside the scope of this document.

 

Native Debugging:

Native debugging is implemented by the OS. The OS provides the debugging API for listening to and continuing debug events (WaitForDebugEvent and ContinueDebugEvent).

At a debug event, the OS freezes the debuggee.

The native debugging API is very small. There are only a few native debug events.

Key properties of Native Debugging:

- Native debugging is completely Out-Of-Process (oop).  The debugger requires no extra cooperation from the debuggee (outside of OS support).

- We’ll call the native debuggee stop state ‘frozen’. A frozen process has been stopped by the OS and executes no user code until it is resumed.

- Native debug events can come at any time the process is running free.

- All calls to the Win32 debug API must be made on the same thread. We call this thread the W32ET (Win32 Event Thread), and it becomes very special because it must never block.

Managed Debugging:

Managed debugging is implemented entirely in user mode by the CLR, thus the OS has no knowledge of when managed-only debugging is occurring.

The CLR has a special thread (called the Helper thread) in every managed process that services requests from the managed debugging API. The portion of the CLR dedicated to managed debugging is called the Left-Side (LS). The implementation of ICorDebug residing in the debugger process is called the Right-Side (RS). The LS and RS communicate via various user-mode interprocess-communication (IPC) mechanisms such as named events and shared memory blocks.

 

The managed debugging interfaces (ICorDebug) are much richer than their native counterparts.

Key properties:

- Managed debugging is an In-Process model. The helper thread must be present and running in the debuggee’s process in order for managed-debugging to function.

- We’ll call the managed debuggee stop state ‘synchronized’. A synchronized process is live from the OS perspective, but all managed threads are stopped by the CLR.

- Managed debug events are entirely created and dispatched in user-mode.

- Managed debug events may be built on top of native debug events.

- Managed debug operations can only occur at a managed stopped state (which requires that the helper thread be running).
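The dependence on the helper thread can be illustrated with a small toy model. This is a sketch under loud assumptions: ToyLeftSide and ToyRightSide are hypothetical classes, in-process Python queues stand in for the real IPC (named events and shared memory), and the can_run flag is a made-up stand-in for "the OS is letting this process execute". The only point is that a request from the debugger side gets an answer only while the helper thread is able to run.

```python
import queue
import threading


class ToyLeftSide:
    """In-process side: a helper thread that services debugger requests."""

    def __init__(self):
        self.requests = queue.Queue()
        self.can_run = threading.Event()  # cleared means the process is frozen
        self.can_run.set()
        self._thread = threading.Thread(target=self._helper, daemon=True)
        self._thread.start()

    def _helper(self):
        while True:
            request, reply = self.requests.get()
            # A frozen process executes no code, the helper thread included.
            self.can_run.wait()
            if request == "shutdown":
                reply.put("bye")
                return
            reply.put(f"handled {request}")


class ToyRightSide:
    """Debugger side: sends a request and waits for the helper's reply."""

    def __init__(self, left_side):
        self.left_side = left_side

    def send(self, request, timeout=1.0):
        reply = queue.Queue()
        self.left_side.requests.put((request, reply))
        try:
            return reply.get(timeout=timeout)
        except queue.Empty:
            return None  # nobody answered: the helper thread is not running
```

While can_run is set, send returns the helper's reply; once it is cleared, the same call times out and returns None. That is the situation an interop debugger has to deal with when the debuggee is frozen at a native debug event.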

 

Managed vs. Native operations.

Managed and Native debugging are two different worlds. For any debugging operation, it is very important to know whether it is happening in the managed world or the native world.

The CLR Debugging Services only implement managed debugging operations, and they do not implement any native debug operations. Likewise, the native debugging API doesn’t provide any managed debugging support.

 

For example, an end-user just thinks of “stepping”, but there are actually 2 discrete operations, “Managed Stepping” and “Native Stepping”, with two completely disjoint implementations. Managed stepping is implemented via the CLR’s ICorDebug API, while native stepping is implemented via a non-CLR native debug library that consumes Win32 debug events.

 

The native debug API is very low level whereas the managed debug API is very rich. For example, native execution control (such as stepping and breakpoints) is implemented entirely on top of exceptions and has no explicit support in the native debugging API. The managed debugging API explicitly has breakpoint and stepper functionality.
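As an illustration of what "implemented entirely on top of exceptions" means, here is a toy sketch of a native breakpoint. The assumptions: ToyNativeBreakpoints is a hypothetical class, a Python bytearray stands in for debuggee memory, and the exception is passed in by hand rather than raised by a CPU. The two constants are real: 0xCC is the x86 one-byte int 3 instruction and 0x80000003 is the Win32 EXCEPTION_BREAKPOINT code. The debugger patches the instruction byte, and later recognizes the resulting exception event as one of its breakpoints.

```python
INT3 = 0xCC                        # x86 one-byte breakpoint instruction
EXCEPTION_BREAKPOINT = 0x80000003  # Win32 exception code for a breakpoint


class ToyNativeBreakpoints:
    """Breakpoints layered on exception events; memory is just a bytearray."""

    def __init__(self, memory):
        self.memory = memory
        self.saved = {}  # address -> original byte that the patch replaced

    def set(self, address):
        """Patch an int 3 over the instruction byte at this address."""
        if address not in self.saved:
            self.saved[address] = self.memory[address]
            self.memory[address] = INT3

    def clear(self, address):
        """Restore the original byte."""
        self.memory[address] = self.saved.pop(address)

    def on_exception(self, code, address):
        """Decide whether an exception debug event is one of our breakpoints."""
        if code == EXCEPTION_BREAKPOINT and address in self.saved:
            return "breakpoint-hit"
        return "unrelated-exception"
```

Nothing in this model is a "breakpoint event" as such. The debugger only ever sees an exception and has to work out for itself that it came from a patch it made earlier.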

 

This difference in abstraction levels prevents code-sharing between managed + native debugging operations.

 

So what is Interop-Debugging?

An end-user’s view of interop-debugging is the ability to debug the managed + native portions of an app in a single debug-session. This includes the ability to step between managed + native code and to view mixed callstacks.

 

The managed + native debug events are completely disjoint and are handled by different components in the debugger (which we’ll call the managed debug engine and native debug engine).

If a managed debugger and a native debugger were just naively attached to the same process at the same time, they would interfere with each other. The interop-debugger ensures these two disjoint models cooperate.

 

From an interface perspective, Interop-Debugging exposes both managed and native debugging interfaces on the same process to a single debugger. This implies the debugger’s native debug engine and managed debug engine can run simultaneously with little modification. In theory, this means that a debugger capable of managed-only debugging and native-only debugging can be easily extended to do interop-debugging.

Ideally, the native + managed debug engines would cooperate with each other enough to present a unified model to the end-user. In practice, this communication between debug engines may require significant planning and changes to a debugger’s design.

 

Interop-debugging can be broken down into the following sub-areas:

  1. Avoid interference between Managed + Native
    1. Filtering native debug events: Any native debug events which are intermediate parts of a managed operation must be given back to the managed debugger. For example, managed breakpoints are built on top of native breakpoints. Thus when the debuggee hits a managed breakpoint, a native debug event for the breakpoint will be generated. The interop debugger must recognize that this native debug event is actually for the managed debugger and must not dispatch it to the native debugger.
    2. Filtering native debug APIs: The managed debugging API provides special versions of key native debugging APIs because the CLR’s interference in the app would confuse the native debugging APIs. (See hijacking below).
  2. Combining debug event streams: Interop debugging must combine managed + native debug events into a single stream. This means that when a debuggee is stopped at a managed debug event, the debugger can do native debugging operations and vice versa.
  3. Unifying stopping states: When a debuggee is stopped at a native debug event, it’s completely frozen by the OS. However, when a debuggee is stopped at a managed debug event, it’s GC-suspended but still live from the OS perspective. In particular, the helper thread is running and servicing managed debug requests. This means that the interop-debugger must magically take a frozen debuggee and get the helper thread running again.
  4. Unifying operations: Interop debugging needs to allow a debugger to unify managed + native operations into a single operation. This includes both mixed callstacks and unified stepping operations.
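The event-filtering sub-area (1.1 above) can be sketched as a small routing step. This is a toy model under stated assumptions, not how the CLR actually decides ownership: NativeEvent and ToyInteropRouter are hypothetical names, and "the address is in a known set of managed breakpoint addresses" is a deliberately simplified stand-in for the real test. It shows the shape of the idea: every native debug event is inspected first, and only the ones that are not part of a managed operation reach the native debug engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NativeEvent:
    thread_id: int
    kind: str         # e.g. "Breakpoint", "LoadModule"
    address: int = 0  # only meaningful for breakpoint events in this sketch


class ToyInteropRouter:
    """Sends each native debug event to the engine that owns it."""

    def __init__(self, managed_breakpoints):
        # Assumption: the addresses of managed breakpoints are known up front.
        self.managed_breakpoints = set(managed_breakpoints)
        self.to_managed = []
        self.to_native = []

    def route(self, event):
        # A native breakpoint event at a managed breakpoint address is an
        # intermediate step of a managed operation: keep it from the native engine.
        if event.kind == "Breakpoint" and event.address in self.managed_breakpoints:
            self.to_managed.append(event)
            return "managed"
        self.to_native.append(event)
        return "native"
```

For example, with managed_breakpoints={0x1000}, a Breakpoint event at 0x1000 is routed to the managed side, while a Breakpoint at any other address, or a LoadModule event, goes to the native side.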