Stop the debuggee to poke at it

In ICorDebug, most operations are only available when the debuggee is stopped. (This was asked here.) Many calls will fail with CORDBG_E_PROCESS_NOT_SYNCHRONIZED if you make them while the process is running. The motivation is:
1) Correctness: trying to query a running debuggee is not safe and may produce very unintuitive results.
2) Simplicity: querying a running debuggee introduces race issues that complicate the implementation.

Example: taking a callstack while running
For example, it's not clear what the correct behavior is when taking a callstack of a moving thread, particularly because ICorDebug (ICD) provides such intimate inspection. Even if we suspend the thread, take the callstack, and then resume, there could still be problems. Imagine the CLR had code-pitching (which was originally planned) and pitched a method on the stack while the debugger was in the middle of taking the stack trace. So even if we let ICD clients take callstacks while running, there's a decent chance that clients would not handle all the innate corner cases and would misuse the information. We also judged it more likely that a client querying while the debuggee is running is doing so by accident (perhaps there's a race in their code). So we made the judgment to fail the query operations. This is an example of rejecting behavior that's only correct 90% of the time.

Backwards compat + the Stop Count:
Unfortunately, we didn't really start enforcing this until V2, so there were cases where this enforcement was an unacceptable breaking change. Breakpoints are one example. Visual Studio lets you add breakpoints even while the process is running. This put pressure on ICorDebug to let you add breakpoints (call ICorDebugBreakpoint::Activate) while the process is running. Our approach here is to Stop things under the covers, do the operation, and then resume things. This is feasible because of the stop-count. If the debuggee is already stopped, then the extra stop/go just increments and decrements a counter. If the debuggee is running, then at least the operation now has clear semantics (it's a snapshot).
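To make the stop-count idea concrete, here is a minimal sketch in C++. It is a mock, not the real ICorDebug implementation: `MockDebuggee`, `ScopedStop`, and `ActivateBreakpoint` are hypothetical names invented for illustration, and the "real" suspend and resume are just counters. It only shows why wrapping an operation in an extra stop/go is cheap when the debuggee is already stopped.

```cpp
#include <cassert>

// Hypothetical stand-in for a debuggee process; not the real ICorDebugProcess.
class MockDebuggee {
public:
    // Nested stops are allowed: only the first one actually halts the process.
    void Stop() {
        if (stopCount_ == 0) {
            ++realSuspends_;  // the expensive part: really synchronize
        }
        ++stopCount_;
    }

    // Only the Continue that balances the outermost Stop resumes the process.
    void Continue() {
        assert(stopCount_ > 0 && "Continue without a matching Stop");
        --stopCount_;
        if (stopCount_ == 0) {
            ++realResumes_;
        }
    }

    bool IsSynchronized() const { return stopCount_ > 0; }
    int RealSuspends() const { return realSuspends_; }
    int RealResumes() const { return realResumes_; }

private:
    int stopCount_ = 0;     // number of outstanding Stop calls
    int realSuspends_ = 0;  // how many times the process was really halted
    int realResumes_ = 0;   // how many times the process was really resumed
};

// RAII helper: stop on entry, continue on exit.
class ScopedStop {
public:
    explicit ScopedStop(MockDebuggee& d) : d_(d) { d_.Stop(); }
    ~ScopedStop() { d_.Continue(); }
    ScopedStop(const ScopedStop&) = delete;
    ScopedStop& operator=(const ScopedStop&) = delete;

private:
    MockDebuggee& d_;
};

// Hypothetical operation that may be called whether or not the debuggee is running.
// Returns whether the debuggee was synchronized while the operation ran.
bool ActivateBreakpoint(MockDebuggee& d) {
    ScopedStop guard(d);
    return d.IsSynchronized();  // always true inside the guard
}
```

Calling `ActivateBreakpoint` on a running `MockDebuggee` performs one real suspend and one real resume. Calling it while the debuggee is already stopped only bumps the counter up and back down, and the debuggee stays stopped afterwards.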

Thus the ICD APIs fall into the following categories:
1) Must-be-synchronized: The debuggee must be synchronized in order to inspect. This is our preferred API type and is true for most inspection APIs.
2) Should-be-synchronized: This will do a Stop/Go around the API so that it degenerates into the 'Must-be-synchronized' case. This is mainly for backwards compat.
3) Can-be-live: This is for a small set of APIs where it doesn't matter whether the debuggee is synchronized. For example, read-only APIs like getting the process PID.

If you're looking at the Rotor sources, these correspond to the ATT_* macros you see defined in rspriv.h and present at the top of most ICD public functions.
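The three categories above can be modeled as an entry check that runs before the body of each API. The sketch below is a mock under stated assumptions: `SyncPolicy`, `Result`, `FakeProcess`, and `CallApi` are hypothetical names, `Result::NotSynchronized` merely stands in for CORDBG_E_PROCESS_NOT_SYNCHRONIZED, and none of this is taken from the actual ATT_* macro definitions.

```cpp
#include <cassert>

// Hypothetical policy names mirroring the three categories in the text.
enum class SyncPolicy { MustBeSynchronized, ShouldBeSynchronized, CanBeLive };

// NotSynchronized stands in for CORDBG_E_PROCESS_NOT_SYNCHRONIZED.
enum class Result { Ok, NotSynchronized };

// Hypothetical debuggee: synchronized whenever the stop count is positive.
struct FakeProcess {
    int stopCount = 0;
    bool IsSynchronized() const { return stopCount > 0; }
};

// Runs op under the given policy, the way an entry check at the top of a
// public API function might.
template <typename Op>
Result CallApi(FakeProcess& p, SyncPolicy policy, Op op) {
    switch (policy) {
    case SyncPolicy::MustBeSynchronized:
        // Fail fast rather than inspect a moving target.
        if (!p.IsSynchronized()) {
            return Result::NotSynchronized;
        }
        op(p);
        return Result::Ok;

    case SyncPolicy::ShouldBeSynchronized:
        // Stop/Go around the call so the body always sees a stopped debuggee.
        ++p.stopCount;
        assert(p.IsSynchronized());
        op(p);
        --p.stopCount;
        return Result::Ok;

    case SyncPolicy::CanBeLive:
        // No requirement either way.
        op(p);
        return Result::Ok;
    }
    return Result::Ok;  // not reached; keeps compilers quiet
}
```

With a running `FakeProcess`, a must-be-synchronized call returns `Result::NotSynchronized` without running the body, a should-be-synchronized call runs the body against a temporarily stopped process, and a can-be-live call runs the body as-is.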

What about MDbg?
MDbg only lets you run commands while the debuggee is stopped, so it mostly avoids the issue. This fits naturally with a command-line debugger's UI, and is also consistent with WinDbg's UI.
