Managed blocking

What’s the difference between WaitHandle.WaitOne/WaitAny/WaitAll and just PInvoke’ing to WaitForSingleObject or WaitForMultipleObjects directly? Plenty.

There are several reasons why we prefer you to use managed blocking through WaitHandle or similar primitives, rather than calling out to the operating system via PInvoke.

First, we can blur any platform differences for you

Do you know the differences between Windows 95 and Windows Server 2003 when you have duplicate handles in the list you are waiting on? You certainly shouldn’t have to!

Second, we can do any pumping that is appropriate

While a thread in a Single-Threaded Apartment (STA) blocks, we will pump certain messages for you. Message pumping during blocking is one of the black arts at Microsoft. Pumping too much can cause reentrancy that invalidates assumptions made by your application. Pumping too little causes deadlocks. Starting with Windows 2000, OLE32 exposes CoWaitForMultipleHandles so that you can pump “just the right amount.” On lower operating systems, the CLR uses MsgWaitForMultipleObjects / PeekMessage / MsgWaitForMultipleObjectsEx and whatever else is available on that version of the operating system to try to mirror the behavior of CoWaitForMultipleHandles. The net effect is that we will always pump COM calls waiting to get into your STA, and any SendMessages to any windows will be serviced. But most PostMessages will be delayed until you have finished blocking.

The degree of pumping that’s happening has been painfully tuned to be appropriate to WindowsForms, non-GUI console apps, ASP compatibility mode using an STA threadpool on the server, and all the other traditional STA scenarios. However, in the future we know we’re going to be revisiting this. The underlying operating system is evolving and there are some big changes underway in this area. Believe me, you don’t want to be doing this stuff yourself. The CLR should be insulating you from this pain.

Third, the CLR can make wise decisions about activity

The CLR threadpool monitors CPU utilization to guide its heuristics about thread injection and retirement. It also notices GC activity, since there’s little reason to inject a thread that will immediately be suspended until a non-concurrent GC is complete. The threadpool also notices whenever one of its threads is blocked or emerges from a blocking operation. We can do this accurately if you use managed blocking. If you PInvoke to unmanaged blocking services, everything is opaque.

Fourth, we can ensure that your thread can be controlled

The operating system provides a TerminateThread() service. It should never be used under any circumstances. It will corrupt the process. The CLR provides services like Thread.Abort and Thread.Interrupt. They can take control of your thread in a reasonably safe manner. By reasonably safe, I mean that the process and the CLR remain consistent. Your application state might not remain consistent. In particular, if a thread is Aborted while it is executing a .cctor method, I’ve explained in another blog how this leaves your class in an “off limits” situation. Another example of this is that your thread might be Aborted in the middle of executing some backout code like a finally or catch clause. Once again, your application state might be corrupt.

(We’re careful to allow finally and catch clauses to execute once an Abort has been induced on your thread. But that’s subtly different from never inducing an Abort in the middle of a finally or catch execution).

Over time, we hope to provide ways for your application to remain consistent – even in the face of Thread.Abort and other asynchronous exceptions, including resource failures like OutOfMemoryException and StackOverflowException.

Until then, there are only two completely safe uses of Thread.Abort:

1) You can abort your own thread via Thread.CurrentThread.Abort.

2) You can perform an AppDomain.Unload, which internally uses Thread.Abort to unwind threads out of the doomed AppDomain.

The first usage is safe because the Abort isn’t induced asynchronously. You are inducing it directly on your own thread – almost as if you had called “throw new ThreadAbortException();”

The second usage is safe because all the application state is being discarded after the thread has been Aborted. That application state might be inconsistent, but it’s all going away anyway.

However, if you PInvoke to WaitForMultipleObjects, then Thread.Abort is powerless. We cannot take control of threads that are in unmanaged code. The operating system provides no safe way to do this. A thread in unmanaged code could be holding arbitrary locks (the OS loader lock and the OS heap lock are two particularly troublesome ones).
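To make "taking control of a blocked thread" concrete, here is a minimal sketch, not a definitive implementation. It uses Thread.Interrupt rather than Thread.Abort, on the assumption that you may be running on a newer .NET runtime where Thread.Abort is not supported. The names InterruptDemo and InterruptBlockedWaiter are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Threading;

static class InterruptDemo
{
    // Returns true if a thread blocked in a managed wait was woken by Thread.Interrupt.
    public static bool InterruptBlockedWaiter()
    {
        using (var neverSignaled = new ManualResetEvent(false))
        {
            bool interrupted = false;

            var waiter = new Thread(() =>
            {
                try
                {
                    // Managed blocking: the CLR knows this thread is waiting.
                    neverSignaled.WaitOne();
                }
                catch (ThreadInterruptedException)
                {
                    // The CLR took control of the thread while it was blocked.
                    interrupted = true;
                }
            });

            waiter.Start();

            // Takes effect immediately if the thread is already blocked,
            // otherwise the next time it begins to block.
            waiter.Interrupt();

            waiter.Join();
            return interrupted;
        }
    }

    static void Main()
    {
        Console.WriteLine(InterruptBlockedWaiter());
    }
}
```

Had the worker blocked through a PInvoke to WaitForSingleObject instead of WaitOne, the wait would be opaque to the CLR in exactly the way described above.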

So there are several good reasons why you should favor managed blocking over a PInvoke to unmanaged blocking. Examples of managed blocking are:

  • Thread.Join
  • WaitHandle.WaitOne/WaitAny/WaitAll
  • GC.WaitForPendingFinalizers
  • Monitor.Enter if there is enough contention for us to give up on spinning and block
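As a rough sketch of what this looks like in practice, the following waits for a worker using only managed primitives. The class and method names (ManagedBlockingDemo, WaitForWorker) and the second, never-signaled handle are assumptions made up for this example.

```csharp
using System;
using System.Threading;

static class ManagedBlockingDemo
{
    // Starts a worker that signals 'done', then blocks using managed primitives only.
    // Returns the array index reported by WaitAny.
    public static int WaitForWorker()
    {
        using (var done = new ManualResetEvent(false))
        using (var cancel = new ManualResetEvent(false)) // hypothetical second handle, never signaled here
        {
            var worker = new Thread(() => done.Set());
            worker.Start();

            // WaitHandle.WaitAny instead of a PInvoke to WaitForMultipleObjects.
            int index = WaitHandle.WaitAny(
                new WaitHandle[] { done, cancel },
                TimeSpan.FromSeconds(5));

            // Thread.Join instead of a PInvoke to WaitForSingleObject on the thread handle.
            worker.Join();

            return index; // index of the signaled handle, or WaitHandle.WaitTimeout
        }
    }

    static void Main()
    {
        Console.WriteLine(WaitForWorker());
    }
}
```

Because both waits go through the CLR, they get the platform smoothing, pumping, threadpool accounting and thread control discussed above.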

Thread.Sleep is a little unusual. We can take control of threads that are inside this service. But, following the tradition of Sleep on the underlying Windows operating system, we perform no pumping.

If you need to Sleep on an STA thread, but you want to perform the standard COM and SendMessage pumping, consider Thread.CurrentThread.Join(timeout) as a replacement.
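A minimal sketch of that replacement follows. The helper name PumpingSleep is hypothetical. The idea is that a Join on your own thread can never succeed while you are running, so the call simply blocks in a managed wait until the timeout expires and then returns false.

```csharp
using System;
using System.Threading;

static class StaSleep
{
    // Blocks the calling thread for roughly 'timeout' using a managed wait.
    // Always returns false, because the current thread cannot terminate
    // while it is the one waiting.
    public static bool PumpingSleep(TimeSpan timeout)
    {
        return Thread.CurrentThread.Join(timeout);
    }

    static void Main()
    {
        bool joined = PumpingSleep(TimeSpan.FromMilliseconds(100));
        Console.WriteLine(joined);
    }
}
```

On a non-STA thread this behaves much like Thread.Sleep. The difference only matters on an STA thread, where the managed wait allows the standard pumping described earlier.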

Comments (9)

  1. Matt says:

    Can you provide some insight into Monitor.Wait? In one application we have, every once in a while, a call to Monitor.Wait causes 99% CPU usage.


  2. Chris Brumme says:

    I’m just guessing, but…

    If you are running on a multi-proc machine, Monitor.Wait will attempt to spin for a while if there is contention. First we busy-spin, then we spin using SwitchToThread so we don’t consume a CPU, and then we eventually block efficiently on an OS primitive. On a uniproc, we obviously dispense with all this spinning since it would simply take the CPU away from the thread holding the lock.

    We are routinely revisiting those spin counts, since CPUs keep getting faster / the cost of bus activity changes / etc. Unfortunately, tuning the spin counts is an inexact science, since the optimum values also depend on how long the application holds the lock, the number of arriving waiters, etc.

    In a perfect world, the system would be self-tuning. We would adjust the spinning parameters for each lock based on historical hold times, the performance profile of your machine, etc. In reality, it’s not clear whether this would be worth the effort.

    Instead, we assume that none of your locks are hot. If we are waiting for a lock, we expect to get it right now or very, very soon. And — unlike the operating system — we are prepared to sacrifice fairness to get better throughput on that lock.

    If you have hot locks, then we don’t have the right design. But the solution isn’t for us to change our design. The solution is for you to reduce the heat on your lock, perhaps by doing less work in the lock or by using finer grained locks to partition the work or some other standard technique.

    At least, that’s our design point.

    Incidentally, I think you will find our lock performance is much better in the release of the CLR we just shipped. We did a bunch of work to improve the case where your lock never experiences any contention at all!

  3. Matt says:

    Thanks. We actually found the problem only seemed to occur when debugging with VS .NET 2002.

  4. Chad says:

    I have been tasked with writing a multithreaded .NET app that spawns several threads and interacts with an STA COM object and processes results from that object. The threads should be STA.

    I need to spawn these threads and wait for them all to complete and then return/exit.

    What’s the best way to do this? WaitHandle.WaitAll() won’t work because I’m in an STA thread and all the pooled threads are STA.

    I’ve tried having a thread counter and using Interlocked.Inc/Decrement and then checking to see if the count was 0 and firing a manual reset event, but it doesn’t seem to be working. The event fires somehow before the count is 0. I can’t observe it during debugging because it’s a race condition which is eliminated by the slowness of the debugger.

    Any thoughts?



  5. Chris Brumme says:

    What a coincidence. Yesterday I posted a new blog that talks about pumping and apartments. That blog contains a section that explains why you can’t do WaitAll on an STA, and suggests 3 different workarounds. In each case, the workaround has drawbacks. But you might find one of them works for your scenario. The full article is at
