Some hangs come from code that makes the problem easy to spot: a Sleep() call with a long duration, a WaitForSingleObject() on a handle that is never signaled, or an EnterCriticalSection() while a busy background thread holds the critical section. However, many hangs are not so obvious. In fact, some code may behave just fine in a test environment, yet customers end up reporting hangs in that same code.
One such class of hangs that may not be readily apparent in a code review is a Device I/O hang. These are hangs that occur while waiting for I/O to complete on a device.
It's generally a bad idea to do any synchronous I/O on a UI thread. I/O depends on external factors, and those can lead to unresponsiveness. Even a local disk can be unresponsive in some scenarios: for example, a hard drive that has powered down and needs to start back up, or a disk that has gone bad.
However, in practice, people write code in their UI threads that performs I/O all the time. You may find yourself debating whether to do a quick local disk write or to implement a complex threading architecture with synchronization and cancellation. This is a difficult issue to debate because a multi-threaded application is complex and can be very bug-prone.
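To make the trade-off a little more concrete, here is a minimal sketch of the simplest version of the "threading" side: hand the blocking write to a worker and let the UI thread poll for completion. It uses only the standard C++ library rather than any particular Win32 mechanism, and the names WriteFileBlocking, BeginWrite and IsWriteDone are hypothetical helpers invented for this illustration. It deliberately leaves out the cancellation and synchronization that a real design would need.

```cpp
#include <chrono>
#include <fstream>
#include <future>
#include <string>
#include <utility>

// Hypothetical helper: the blocking write itself. This is the part that
// can stall on a slow device, so it is meant to run on a worker thread.
bool WriteFileBlocking(const std::string& path, const std::string& data) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;  // could not open the file
    }
    out << data;
    out.flush();
    return out.good();
}

// Hypothetical helper: starts the write on another thread and returns
// immediately, so the calling (UI) thread is not blocked by the I/O.
std::future<bool> BeginWrite(std::string path, std::string data) {
    return std::async(std::launch::async, WriteFileBlocking,
                      std::move(path), std::move(data));
}

// Hypothetical helper: non-blocking check, suitable for calling from a
// timer tick or idle handler on the UI thread.
bool IsWriteDone(std::future<bool>& pending) {
    return pending.wait_for(std::chrono::seconds(0)) ==
           std::future_status::ready;
}
```

The UI thread would keep the returned future somewhere long-lived, call IsWriteDone() periodically, and call get() only once it reports true. One caveat with this sketch: a future returned by std::async blocks in its destructor until the work finishes, so letting it go out of scope early would bring the hang right back. That is a small taste of why the "complex" option really is complex.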
The data Microsoft gets from customers reporting hangs indicates that certain scenarios lead to hangs much more often than others. For example, hangs happen much more frequently on network I/O than on local I/O. In fact, the data indicates that the following forms of I/O are significantly more hang-prone:
Devices Prone To Hangs:
- Remote (network)
- Flash (thumb drive, flash card reader, etc.)
- RAM disk
Thus, if you find it difficult to decide between a complex multi-threaded architecture and performing I/O on your UI thread, ask whether the I/O could ever target one of the devices listed above.
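One way to ask that question in code is to classify the drive a path lives on. The sketch below assumes the Win32 GetDriveType API is an acceptable way to do so and maps its documented return values onto the list above. The enum DriveKind and the functions IsHangProneDriveKind and IsPathOnHangProneDevice are hypothetical names made up for this illustration. Treating "removable" as a stand-in for flash is also my assumption, and it is imperfect, since some flash devices may report themselves as fixed disks.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Mirrors the documented return values of GetDriveType (DRIVE_UNKNOWN
// through DRIVE_RAMDISK) so the classification logic also compiles on
// platforms without <windows.h>.
enum DriveKind : unsigned {
    kDriveUnknown   = 0,
    kDriveNoRootDir = 1,
    kDriveRemovable = 2,  // assumption: used here as a proxy for flash media
    kDriveFixed     = 3,
    kDriveRemote    = 4,
    kDriveCdRom     = 5,
    kDriveRamDisk   = 6
};

// Hypothetical helper: returns true for the device classes listed above
// as significantly more hang-prone (remote, flash, RAM disk).
bool IsHangProneDriveKind(unsigned kind) {
    switch (kind) {
        case kDriveRemote:
        case kDriveRemovable:
        case kDriveRamDisk:
            return true;
        default:
            return false;
    }
}

#ifdef _WIN32
// Hypothetical helper: rootPath is a drive root such as L"C:\\".
bool IsPathOnHangProneDevice(const wchar_t* rootPath) {
    return IsHangProneDriveKind(GetDriveTypeW(rootPath));
}
#endif
```

A check like this is only a hint, not a guarantee. A path that sits on a fixed disk today can end up on a network share on a customer's machine, so the safer reading of the data is still "could this I/O ever land on one of those devices?" rather than "is it on one right now?".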
In the next posting, I expect to enumerate some Win32 APIs that are known to be hang-prone because of device I/O.