Sabotaging yourself: Closing a handle and then using it

A customer reported a problem with the WaitForSingleObject function:

I have a DLL with an Initialize() function and an Uninitialize() function. The code goes like this:

HANDLE FooMutex;

BOOL Initialize()
 ... unrelated initialization stuff ...
 FooMutex = CreateMutex(NULL, FALSE, "FooMutex");
 ... error checking removed ...
 return TRUE;

BOOL Uninitialize()
 // fail if never initialized
 if (FooMutex == NULL) return FALSE;

 // fail if already uninitialized
 if (WaitForSingleObject(FooMutex, INFINITE) == WAIT_FAILED)
  return FALSE;

 ... unrelated cleanup stuff ...
 return TRUE;

Under certain conditions, the Initialize() function is called twice, and the Uninitialize() function is correspondingly called twice. Under these conditions, if I run the code on a single-processor system with hyperthreading disabled, then everything works fine. But if I enable hypethreading, then the second call to Uninitialize() hangs in the WaitForSingleObject call. (As you can see, it's waiting for a mutex handle which was closed by the previous call to Uninitialize().)

Why does this happen only on a hyperthreaded machine? Shouldn't the WaitForSingleObject return WAIT_FAILED because the parameter is invalid? Is this a bug in Windows hyperthreading support?

Remember, don't immediately jump to the conspiracy theory:¹ Hyperthreading may be the trigger for your problem, but it's probably not a bug in hyperthreading.

While it's true that WaitForSingleObject returns WAIT_FAILED when given an invalid parameter, handle recycling means that any invalid handle can suddenly become valid again (but refer to an unrelated object).

The problem with hyperthreading will probably recur if you run the scenario on a multiprocessor machine. Hyperthreading (and multi-core processing) means that two threads can be executing simultaneously, which means that the opportunity for race conditions grows significantly.

What's probably happening is that between the two calls to Uninitialize(), another thread called CreateThread or CreateEvent or some other function which creates a waitable kernel object. That new kernel object was coincidentally assigned the same numerical handle value that was previously assigned to your FooMutex. (This is perfectly legitimate since you closed the handle, thereby making it available for re-use.) Now when you call WaitForSingleObject(FooMutex), you are actually waiting on that other object. And since that other object is not signaled, the wait call waits.

Okay, so how do we fix the problem? The simple fix is to null out FooMutex after closing the handle. This assumes however that your DLL design imposes the restriction on clients that all calls to Initialize() and Uninitialize() take place on the same thread, and that the DLL is uninitialized on the first call to Uninitialize().

Oh, and you may have noticed another bug: When Initialize() is called the second time, the DLL reinitializes itself. In particular, it re-creates the mutex and overwrites the handle from the previous call to Initialize(). Now you have a handle leak. I suspect that's why the call to CreateMutex explicitly passes "FooMutex" as the mutex name. The previous version passed NULL, creating an anonymous mutex, but then the author discovered that the mutex "stopped working" because some code was using the old handle (using the mutex created by the first call to Initialize()) and some code was using the new handle (using the mutex created by the second call to Initialize()). Using a named mutex "fixes" the problem by forcing the two calls to Initialize() to create a handle to the same underlying object.

To fix the handle leak, you can just stick a if (FooMutex != NULL) return TRUE; at the top. The DLL has already been initialized; don't need to initialize it again.

The design as I inferred it is somewhat unusual, however. Usually, when a component exposes Initialize() and Uninitialize() and supports multiple initialization, the convention is that the DLL initialization remains valid until the last call to Uninitialize(), not the first one. If that was the design intention of this DLL, then a different fix is called for, one which I leave as an exercise, since we've drifted pretty far from the original question.

¹The authors of The Pragmatic Programmer call this principle 'select' isn't broken.

Comments (19)
  1. Anonymous says:

    How many bugs like this would be hidden by not reusing handle values until they overflow?  I.e. each handle to a new kernel object would be higher than the previous one.  How likely would that be to introduce new compatibility bugs by causing programs relying on handle reuse to no longer function?

    To support multiple initialization, keep the initialization count in a variable that gets InterlockedIncrement'ed in Initialize and InterlockedDecrement'ed in Uninitialize.  If it was 0 before Initialize, do the actual initialization.  If it becomes 0 in Uninitialize, do the actual uninitialization.

    [It would catch/mask a lot of bugs. (Mostly mask, since nobody actually checks return values.) The cost is that the handle table now has to be a sparse table, which increases the complexity of handle lookup. And it would break programs that rely on handle recycling. -Raymond]
  2. @Medinoc:

    I don't understand what was the purpose of closing the handle. Was it to kick ReadConsole?

  3. Anonymous says:

    [And it would break programs that rely on handle recycling. -Raymond]

    Ah yes dup2. Very handy, but very painful to implement on Windows.

  4. Medinoc says:

    @alrgr1: It was to add a timeout to ReadConsole(). I couldn't simply use WaitForSingleObject()+ReadConsole() because I had ENABLE_LINE_INPUT set, and the synchronization routines ignore it: Even when it's set, the handle becomes signaled as soon as a character is in the buffer instead of waiting for an end of line.

  5. Anonymous says:

    That's a lot of paint to cover the 'live and learn' lesson.

  6. @Medinoc:

    Consoles are tricky. Their handles seem special, not the generic kernel handles. I'm not even sure one can open the console handle as overlapped and use async IO. For a regular file handle, CloseHandle would not cause an IO to be canceled, if there is a duplicate handle.

  7. [And it would break programs that rely on handle recycling. -Raymond]

    Wait, how do you recycle a handle in Windows? Isn't CRTL under Windows using fake descriptors instead of handles?

    [I'm sure there's somebody out there who is relying on handle recycling, assuming they work the same way as unix. -Raymond]
  8. Anonymous says:

    @alegrl: You stop all runnable threads, close the handle you want to allocate to, then call DuplicateHandle until you get the right value, close all the junk ones you just opened, then restart all threads you stopped.

  9. Anonymous says:

    > I'm sure there's somebody out there who is relying on handle recycling, assuming they work the same way as unix.

    On Unix, it makes a lot of sense, since file descriptors are sequential numbers starting from 0.

    But on Win32, where handles are pointer-looking things with arbitrary values, expecting them to be recycled makes as much sense as expecting that malloc() will return the exact same pointer passed to the preceding call to free().

    Thinking about handles as if they were pointers to opaque objects in memory is a good mental model (even if it is not the case), and AFAIK it is how they are declared in the headers (as opaque pointers, either "void *" or pointers to a struct).

    [You're assuming that people who make this assumption are as smart as you are. -Raymond]
  10. Anonymous says:

    [You're assuming that people who make this assumption are as smart as you are. -Raymond]

    I didn't make the assumption, but tested it and found it to be true. While I have written oneshot code that depended on it, I have not written any code that depends on it that I expect to work in the next Windows version. Changing it in a service pack would have disturbed me but I'm not using it anymore.

    But yeah, somebody might have made the assumption it's never gonna change. Back in the Win98-XP days I'd have made the same stack of assumptions, but then I saw just how much Microsoft was willing to break in Vista.

  11. Anonymous says:

    Cesar: There are probably tons of programs out there that assume a malloc will return memory at the address that was last freed. Odds are they free some data structure, then allocate a new one and forget to initialized it, unwittingly relying on the fact that the memory still contains an initialized data structure from before.

    These apps will break when you change the heap allocator, requiring a compatibility shim if the heap behavior changes enough.

  12. Anonymous says:

    In spite of the largely sound "select isn't broken" principle, I did I have a time when I found, in rapid succession, a bug in GCC, the C++ standard library that ships with a rather older version of GCC, and the 1998 C++ standard. The first I was never able to figure out what was going on well enough to describe, but the last two other people had found before me. (There was even a defect report accepted for the standards bug, which was something like there being no conversion from a reverse_iterator to a const_reverse_iterator.) That was an amusing week and a half.

  13. Medinoc says:

    I have to confess I once *purposely* used the trick of closing a handle while another thread was waiting on it, and for that particular implementation at that particular time it worked: I closed a duplicate console input handle while ReadConsole() was waiting on the same duplicate handle (the duplication allowed me to repeat this without destroying the console input buffer).

  14. @joshua:

    That's too much to bother.

    I suspect handle recycle is not even remotely on the list of concerns of kernel guys.

  15. Medinoc says:

    And like Cesar said, a lot of people don't even know their program relies on handle recycling: This kind of thing is more likely to be a bug than a conscious choice.

  16. Anonymous says:

    Assuming you set FooMutex to null, doesn't that save you from having to wait on it to check whether it's uninitialised? (Although given the possibility of uninitialising the DLL on two threads at once, it might make more sense just to avoid closing the handle.)

  17. Anonymous says:

    Neil: Waiting on it isn't to check whether it's uninitialized; it's to protect the critical section in the cleanup code.

    The ReleaseMutex is what's unnecessary — in fact, I'd call it a bug. You want the second thread to fail, not get the mutex.

  18. Anonymous says:

    "'select' isn't broken" reminds me of another phrase I like: "When you hear hoof beats, think 'horses', not 'zebras'".

  19. Anonymous says:

    As Raymond once pointed out in TechNet, this principle is why it's very dangerous to "unlock" stuck file handles..…/2009.04.windowsconfidential.aspx

Comments are closed.