One of the challenges in building a UI application is using the UI thread correctly. It is a shared resource, so any misuse can affect the whole application.
The most common form of misuse is occupying the UI thread for a long period of time. In severe cases the application hangs and Windows detects that it is not responding. In minor cases it can interfere with animations or cause transitional visual updates to be missed.
Unfortunately, most UI technologies make it really easy to make this mistake. UI programs are very event driven, and these technologies call back into the application on the UI thread to handle those events. That naturally leads people to write their event handling code on the UI thread, unless they do additional work to move the logic to a background thread.
Components like ThreadPool and BackgroundWorker can help with this task, but it is still easy to make a mistake and occasionally block the UI thread.
Operations that are always slow are easy for developers and testers to detect during development, so they can be fixed. But real applications don’t always run at the same speed. Network connections, the responsiveness of remote computers, and local thread timing can all make UI thread violations hard to reproduce or even detect in the first place, especially if the symptom is a subtle paint issue.
This makes detecting the issues almost random, which can cause a steady stream of bugs to come in. Worse, it is hard to gain confidence that the state of the application is improving.
Wouldn’t it be nice if there was a way to detect these issues deterministically? That would help identify all existing issues and help prevent new ones from being added.
The answer is something I call the UI Thread Watchdog. Its responsibility is to detect misuse of the UI thread and complain. It has two approaches to achieve that.
The first way is to provide methods that components can call to validate whether they are executing on the UI thread. These methods are easy to implement: they just compare the current Thread with the Thread used for the UI, which the Dispatcher object knows.
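The validation methods can be sketched in a few lines. This is a minimal, language-neutral illustration in Python rather than the Dispatcher-based version; the class name `UiThreadWatchdog` and its method names are hypothetical, and it assumes the watchdog is created on the UI thread.

```python
import threading


class UiThreadWatchdog:
    """Hypothetical helper: remembers the UI thread and validates callers against it."""

    def __init__(self, ui_thread=None):
        # Assumption: the watchdog is constructed on the UI thread unless one is passed in.
        self._ui_thread = ui_thread or threading.current_thread()

    def is_ui_thread(self):
        # The whole check is a comparison of the current thread with the UI thread.
        return threading.current_thread() is self._ui_thread

    def verify_on_ui_thread(self):
        if not self.is_ui_thread():
            raise RuntimeError(
                "expected the UI thread, got " + threading.current_thread().name)

    def verify_not_on_ui_thread(self):
        if self.is_ui_thread():
            raise RuntimeError("this code must not run on the UI thread")
```

Raising an exception is only one way to "complain"; an assertion or a log entry would fit the same shape.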
Using these methods is also easy: just call the right watchdog method from your code. One could take a brute-force approach and call the watchdog from just about everywhere, but one often gets more “bang for the buck” by picking strategic locations to “instrument”. Key places are the ones that may take a while, such as:
- Code which interacts with networking
- Code which interacts with other processes
- Code which interacts with the file system
- Code which acquires locks
- Code which makes blocking calls, like waiting on handles
- Code which interacts with the hardware
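Instrumenting one of these locations might look like the sketch below. It is written in Python for brevity, the decorator `not_on_ui_thread` and the function `load_settings` are hypothetical names, and it assumes the module is first imported on the UI thread.

```python
import functools
import threading

# Assumption: this module is imported on the UI thread, so that thread is recorded here.
UI_THREAD = threading.current_thread()


def not_on_ui_thread(func):
    """Hypothetical decorator: complain if a potentially slow function runs on the UI thread."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if threading.current_thread() is UI_THREAD:
            raise RuntimeError(
                func.__name__ + " may block and must not run on the UI thread")
        return func(*args, **kwargs)
    return wrapper


@not_on_ui_thread
def load_settings(path):
    # File system access: one of the strategic locations listed above.
    with open(path, encoding="utf-8") as f:
        return f.read()
```

Because the check runs before the slow work starts, the violation is reported every time the code path is exercised, even on a machine where the file happens to load instantly.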
The second way the UI Thread Watchdog works is that it occasionally polls the UI thread to see if it is responsive. Implementing this isn’t as easy, but it isn’t hard either.
On a background thread, one occasionally does an Invoke to the UI thread and then checks whether the invoked method gets called within a certain amount of time. If it doesn’t, something time consuming is using the UI thread and preventing it from processing other messages.
The hard part is deciding how often to poll and how long to wait before declaring the UI thread blocked. The balance is to make the times short enough to detect issues, but long enough to avoid false positives. I suggest starting with small values and increasing them if you hit a large number of false positives.
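The polling approach can be sketched as follows. This is an illustration in Python, not the attached implementation: `UiLoop` is a hypothetical stand-in for a real UI message loop, and `watch`, `poll_interval`, `timeout`, and `on_blocked` are names invented for the example.

```python
import queue
import threading
import time


class UiLoop:
    """Stand-in for a UI thread's message loop (hypothetical, not a real toolkit API)."""

    def __init__(self):
        self._messages = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, callback):
        # Queue a callback to run on the simulated UI thread.
        self._messages.put(callback)

    def _run(self):
        while True:
            self._messages.get()()  # run the next queued callback


def watch(ui, poll_interval, timeout, on_blocked, stop):
    """Poll the UI loop from a background thread until `stop` is set."""
    while not stop.is_set():
        responded = threading.Event()
        ui.post(responded.set)           # the ping only runs once the loop is free
        if not responded.wait(timeout):  # not processed in time: the UI thread is busy
            on_blocked()
            responded.wait()             # let the backlog clear before polling again
        stop.wait(poll_interval)
```

A caller would run `watch` on its own thread, for example `threading.Thread(target=watch, args=(ui, 0.05, 0.2, report, stop), daemon=True).start()`, where `report` is whatever the application uses to complain. Waiting for the stuck ping to complete before polling again keeps one long stall from producing a flood of reports.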
This isn’t deterministic like the validation methods, but can improve detection of these issues significantly – and works on code not “instrumented.”
Attached is code which implements a simple thread watchdog using these techniques to demonstrate the ideas.
Some ways this can be improved are:
- Freeze the UI thread when the responsiveness check fails, so the culprit is still on the stack when the issue is debugged
- Disable the checks in shipping builds so users don’t see the issue
- Log the violations to the application’s log
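The last two ideas could be combined along these lines. This is a rough Python sketch under stated assumptions: the `ViolationReporter` class, its `enabled` flag, and the `ui_watchdog` logger name are all hypothetical, and how a shipping build sets the flag is left to the application.

```python
import logging
import threading

log = logging.getLogger("ui_watchdog")  # hypothetical logger name


class ViolationReporter:
    """Hypothetical reporter: logs violations, and does nothing when disabled."""

    def __init__(self, enabled=True):
        # Assumption: a shipping build would construct this with enabled=False.
        self.enabled = enabled

    def report(self, message):
        if not self.enabled:
            return False
        log.warning("UI thread violation: %s (reported from thread %s)",
                    message, threading.current_thread().name)
        return True
```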