MDbg UI Threading, round 2

01/27/2006

The STA/MTA threading problem is (hopefully) fixed in the latest MDbg WinForms GUI, which also has some cool new features, including an IL disassembly window.

The solution implemented in the original GUI (described here) had the problem that the UI thread was MTA! This is very bad because UI threads are supposed to be STA. In fact, certain UI things, like the OpenFile dialog, must operate on an STA thread. (WinForms started enforcing this in Beta 2, leading to the problems many people identified here.) We originally made the UI thread MTA because ICorDebug is MTA, and we wanted the UI thread to be able to access MDbg / ICorDebug directly, which greatly simplifies the code.

Basic properties of this solution:
Here are the basic properties of this new solution:
1) UI thread is STA (as it should be), and so it never touches MDbg objects (which are effectively MTA because they call on the MTA ICorDebug)
2) We have a worker thread dedicated to talking to MDbg. All GUI elements (including the text box used to input commands) post to this worker thread.
3) UI thread has a variety of ways of accessing the worker thread. See below for details.
4) The worker thread can post asynchronous (non-blocking) requests to update UI elements. The worker thread never blocks on the UI thread.

Consequences of those decisions:
Here's an explanation of why this system is correct and doesn't deadlock.
MDbg has a worker thread that invokes a ReadLine() callback when its command line is ready for input. The worker returns from the callback, passing out a string to execute, and then MDbg executes that command. Once the command is done, MDbg invokes the ReadLine() callback for the next command. When MDbg runs at the command prompt, this callback just forwards the call to Console.ReadLine(). The MDbg GUI provides another callback that coordinates ReadLine() with the main GUI WinForm, which gets the command content from a TextBox.

The threading implications are:
1) The worker thread is the only thread that can resume the debuggee, which it does by just returning from the ReadLine() callback.
2) The worker thread is blocked (inside of MDbg) while the debuggee is running. It will invoke the ReadLine() callback when the debuggee stops (eg, at a breakpoint or some other event).
Note that it doesn't matter whether the command actually resumes the debuggee (like "go", or "next") or whether it's just a quick command like "print".
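To make the ReadLine() handshake concrete, here's a rough model in Python. It is not the MDbg C# code: `shell_loop`, `gui_read_line`, and the "quit" sentinel are hypothetical names made up for illustration. The point it tries to show is that the shell only depends on a callback, so swapping the console callback for a GUI one just changes where the worker blocks while it waits for text.

```python
import queue
import threading

def shell_loop(read_line, log):
    # Models the shell: ask for a command, execute it, then ask again.
    # At a command prompt, read_line could simply be the built-in input().
    while True:
        command = read_line()
        if command == "quit":          # hypothetical sentinel for this demo
            break
        log.append("executed " + command)

# GUI flavor of the callback: the worker blocks here until the UI posts text.
pending = queue.Queue()

def gui_read_line():
    return pending.get()               # blocks the worker, never the UI

log = []
worker = threading.Thread(target=shell_loop, args=(gui_read_line, log))
worker.start()

# The "UI" side: each post returns immediately.
for text in ["print x", "go", "quit"]:
    pending.put(text)

worker.join()
print(log)
```

Commands run one at a time, in the order they were posted, and the next one is only read after the previous one has finished.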

The UI thread can attempt to post non-blocking text commands to the worker thread. Eg, press F5 in the UI, and it posts a "go" command to the worker thread. The UI will not wait for this command to finish.
The UI thread can also post blocking work requests to get information used to fill in tool windows. Eg, the UI has a modeless callstack tool window, and it can have the worker thread produce a collection of frames, which the UI thread then puts into the callstack window. These work requests can only gather information and cannot resume the debuggee. All UI actions that could resume the debuggee must be done via a text command.
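A blocking work request can be sketched like this, again as a Python model rather than the real implementation. The names `requests`, `worker_loop`, and `post_blocking` are assumptions, and the list of frame names stands in for whatever the debugger would actually return.

```python
import queue
import threading
from concurrent.futures import Future

requests = queue.Queue()

def worker_loop():
    # The only thread that touches the (stand-in) debugger data.
    frames = ["Main", "Foo", "Bar"]
    while True:
        item = requests.get()
        if item is None:               # demo-only shutdown signal
            break
        func, future = item
        try:
            future.set_result(func(frames))
        except Exception as exc:       # report failures instead of hanging the UI
            future.set_exception(exc)

def post_blocking(func):
    # UI side: post a query, then block until the worker has run it.
    future = Future()
    requests.put((func, future))
    return future.result()

worker = threading.Thread(target=worker_loop)
worker.start()

# The query only gathers information; it cannot resume anything.
callstack = post_blocking(lambda frames: list(reversed(frames)))

requests.put(None)
worker.join()
print(callstack)
```

The query runs entirely on the worker thread, and the UI thread only ever sees the finished copy of the data.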

Enough background, here's the proof:
I must emphasize that I'm not a UI expert, so here's my best attempt at proving safety:

1) The only way we can deadlock is if the UI thread blocks on something. The UI thread calls modal dialogs, which are safe since the UI is STA and the dialogs will pump. The only other thing it even considers blocking on is the worker thread.

2) So this simplifies to: The only way we can deadlock is if the UI thread blocks on the worker thread, and then the worker blocks on something.
The worker only makes non-blocking calls (Control.BeginInvoke) to the UI thread, and so it can never block on that. The worker doesn't call any blocking function on MDbg. So the only remaining thing the worker can block on is waiting for the debuggee to hit a debug event. In other words, the worker only blocks while the debuggee is running.

3) So this simplifies to: The only way we can deadlock is if the UI thread blocks on the worker while the debuggee is running.

4) The only way the UI blocks on the worker is by invoking a synchronous worker delegate to get the worker to fill out data for a tool window.
As a safety check, this blocking wait could use a timeout that pops an "Unresponsive" dialog when it expires. That dialog could even include a button to trigger an async break.

So this means we just need to ensure that the debuggee is stopped from the time the UI issues the worker delegate until the time the UI gets the delegate results.

5) The only way the worker thread can resume the debuggee is by returning from its callback. It can only do this if the UI tells it to via the UI posting a text command (eg, "Go" or "Step"). Thus the debuggee can't be resumed without the UI thread issuing the order. Also, the worker thread can notify the UI thread once the debuggee has stopped again. Thus, the UI thread has a well-defined, thread-safe bound on the window in which the debuggee is running. (This is tracked via the MainForm.IsProcessStopped property in GUIMainForm.cs).

6) The same UI thread issues worker delegates and triggers resuming the debuggee. The UI thread has no reentrancy while issuing the delegate, and so there is guaranteed mutual exclusion between these two actions.

7) Thus the debuggee is indeed stopped while the UI thread is blocked on the delegate. That means the worker thread is free to process the delegate, and thus the worker completes and the UI thread doesn't deadlock.

Q.E.D.
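Steps 4 through 6 of the proof boil down to a small amount of UI-thread bookkeeping, which can be modeled as follows. This is a hedged sketch: `UiState` and its methods are hypothetical and only loosely mirror the IsProcessStopped idea, the worker is faked with a stub, and returning a status tuple is my own choice for the demo. A real UI might pop the "Unresponsive" dialog in the timeout branch.

```python
import concurrent.futures
from concurrent.futures import Future

class UiState:
    """Hypothetical bookkeeping that is touched by the UI thread alone."""

    def __init__(self, send_command, start_query):
        self.is_process_stopped = True      # loosely mirrors IsProcessStopped
        self._send_command = send_command   # posts text to the worker, non-blocking
        self._start_query = start_query     # posts a query, returns a Future

    def post_command(self, text):
        # Any command may resume the debuggee, so assume "running" until the
        # worker reports that it is ready for input again.
        self.is_process_stopped = False
        self._send_command(text)

    def on_worker_ready(self):
        # Runs on the UI thread when the worker's notification is processed.
        self.is_process_stopped = True

    def query(self, func, timeout_seconds=1.0):
        if not self.is_process_stopped:
            return ("running", None)        # never block on a busy worker
        future = self._start_query(func)
        try:
            return ("ok", future.result(timeout=timeout_seconds))
        except concurrent.futures.TimeoutError:
            return ("timeout", None)        # safety net from step 4

sent = []

def fake_start_query(func):
    # Stand-in for the worker: completes the query immediately.
    future = Future()
    future.set_result(func())
    return future

ui = UiState(sent.append, fake_start_query)
first = ui.query(lambda: ["Main"])      # debuggee stopped: query runs
ui.post_command("go")
second = ui.query(lambda: ["Main"])     # debuggee may be running: refused
ui.on_worker_ready()
third = ui.query(lambda: ["Main"])      # stopped again: query runs
print(first, second, third)
```

Because the flag is only read and written on the UI thread, it needs no lock, which is the mutual exclusion argument from step 6.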

If we really wanted to be thorough, we could ensure that the UI thread truly never blocks by having the UI post the delegate asynchronously; the delegate would then asynchronously post back to the UI thread to finish.
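That fully asynchronous round trip might look roughly like the following Python model. The two queues stand in for the worker's request queue and the UI message queue, and `post_async` is a hypothetical helper, not something from MDbg.

```python
import queue
import threading

ui_queue = queue.Queue()       # stand-in for the UI thread's message queue
worker_queue = queue.Queue()   # requests for the worker thread
shown = []                     # stand-in for the contents of a tool window

def worker_loop():
    while True:
        item = worker_queue.get()
        if item is None:                   # demo-only shutdown signal
            break
        query, on_done = item
        result = query()
        # Post the completion back to the UI instead of returning it.
        ui_queue.put(lambda r=result, cb=on_done: cb(r))

def post_async(query, on_done):
    worker_queue.put((query, on_done))     # returns immediately

worker = threading.Thread(target=worker_loop)
worker.start()

post_async(lambda: ["Main", "Foo"], shown.append)

# A real UI's message loop would pick this up; the demo pumps one message.
ui_queue.get(timeout=5)()

worker_queue.put(None)
worker.join()
print(shown)
```

The tradeoff is that the code that fills the tool window moves into a completion callback, so the UI has to cope with results arriving later than the request.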

Ways that the MDbg worker thread may need to access the UI thread:
The worker thread posts non-blocking requests to the UI thread. These get queued and executed in the order they were posted. This is done by the worker thread calling Control.BeginInvoke (and not just Control.Invoke, which blocks).

1) Enable / Disable input box:
The UI has a text box that lets the user type in text commands and send them to the MDbg shell (think the GUI version of Console.ReadLine).
When the worker thread is ready for text input, it will tell the UI to enable the input edit box so that the user can type in a text command. When the worker thread is then about to block (actually, give control back to MDbg) to execute the command, it signals the UI thread to disable the input box so that the user can't enter any new commands.

2) Writing output:
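The worker thread needs to notify the UI if any commands write output. It's extremely significant that the write operations are queued: queued requests execute in the order they were posted, so output shows up in the order it was written, and the worker never blocks on the UI while writing it.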
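Both kinds of worker-to-UI requests can be modeled with one queue. In this Python sketch, `begin_invoke` is a hypothetical stand-in that mimics the enqueue-and-return behavior of Control.BeginInvoke; the two lists stand in for the output window and the enabled state of the input box.

```python
import queue
import threading

ui_queue = queue.Queue()    # stand-in for the UI thread's message queue
output_box = []             # stand-in for the output window
input_enabled = []          # history of the input box's enabled state

def begin_invoke(action):
    # Enqueue and return at once; the worker never waits on the UI.
    ui_queue.put(action)

def worker():
    begin_invoke(lambda: input_enabled.append(False))   # about to run a command
    for line in ["line 1", "line 2", "line 3"]:
        begin_invoke(lambda text=line: output_box.append(text))
    begin_invoke(lambda: input_enabled.append(True))    # ready for input again
    begin_invoke(None)                                  # demo-only stop signal

thread = threading.Thread(target=worker)
thread.start()

# The "UI thread" drains its queue in posting order.
while True:
    action = ui_queue.get()
    if action is None:
        break
    action()

thread.join()
print(output_box, input_enabled)
```

Since everything goes through the same first-in, first-out queue, the input box is re-enabled only after all of the command's output has been displayed.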

Ways the UI thread (STA) needs to access the MDbg layer (MTA):
The UI needs to access MDbg in several different ways.

1) Execute a shell command: I discussed here that the GUI will want to implement many UI commands via shell commands. For example, pressing F10 maps to the "next" command, and pressing F5 maps to the "go" command.
Shell commands are significant because the debuggee may resume execution (such as "next", "go"), and that means the command may block indefinitely. Clearly, the UI can't block as soon as the user presses F5. So the UI issues shell commands asynchronously. The worker thread then goes off and executes the command, and it will notify the UI once it hits a stopping event. Meanwhile, the UI thread is free to continue painting.

2) Collect MDbg information to populate a tool window: The UI has tool windows (such as callstack, module, etc) and it needs information from MDbg in order to fill these windows in. For this, the UI thread will post synchronous (blocking) work requests to the worker thread. The UI thread will block waiting for the worker to complete. This is safe because these query operations are guaranteed to not block. But to be safe, the UI thread could block with a timeout and then pop an error dialog on timeout. Anonymous delegates provide a convenient way for the UI thread to make a cross-thread call to the worker thread.

3) Async break: This is a special case. All other UI access occurs when the debuggee is stopped. However, the UI also has a command to break into a running debuggee. For this, the UI thread can just spin up an extra thread to do the async break. The UI thread doesn't need to wait on the extra thread because as soon as the async break finishes, the worker thread will become free and notify the UI thread.
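The async-break case can be modeled in a few lines of Python. Everything here is a stand-in: the event represents "the debuggee hit a debug event", and `async_break` represents whatever debugger call actually stops the process. The demo joins the helper thread only to clean up; in the scheme above the UI would not wait on it.

```python
import queue
import threading

debuggee_stopped = threading.Event()   # stand-in for "a debug event arrived"
ui_queue = queue.Queue()               # stand-in for the UI message queue

def worker_run_go():
    # The worker is blocked for as long as the debuggee is running.
    debuggee_stopped.wait()
    ui_queue.put("stopped")            # non-blocking notification to the UI

def async_break():
    # Stand-in for the debugger's async-break call.
    debuggee_stopped.set()

worker = threading.Thread(target=worker_run_go)
worker.start()

# UI thread: fire and forget. It learns about the stop from the worker's
# notification, not from the helper thread.
breaker = threading.Thread(target=async_break)
breaker.start()

notification = ui_queue.get(timeout=5)

worker.join()
breaker.join()
print(notification)
```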
