A process shutdown puzzle: Answers


Last week, I posed a process shutdown puzzle in honor of National Puzzle Day. Let's see how we did.

Part One asked us to explain why the ThreadFunction thread no longer exists. That's easy. One of the things that happen inside ExitProcess is that all threads (other than the one calling ExitProcess) are forcibly terminated in the nastiest way possible. This happens before the DLL_PROCESS_DETACH notification is sent. Therefore, the code in StopWorkerThread that waits for the thread completion event waits forever because the ThreadFunction is no longer running. There is nobody around to see the shutdown event and respond by setting the completion event.

Okay, that was the easy part. Part Two asked us to criticize the replacement solution, which swapped the completion event for a call to FreeLibraryAndExitThread and changed the StopWorkerThread function to wait for the thread handle to become signaled. This solution is also flawed.

Consider the case where the DLL is receiving its DLL_PROCESS_DETACH notification because the DLL is being unloaded by a call to FreeLibrary, rather than due to process termination. In that case, StopWorkerThread sets the shutdown event, and the ThreadFunction proceeds to clean up and call FreeLibraryAndExitThread. But one of the steps in thread shutdown is sending DLL_THREAD_DETACH notifications, and since DLL notifications are serialized, that will not happen until the DLL_PROCESS_DETACH notifications are complete. The WaitForSingleObject in StopWorkerThread waits indefinitely because it won't complete until the thread exits, but the thread won't exit until StopWorkerThread returns. Deadlock.

Finally, Part Three asked us to explain why the code doesn't cause a problem in practice even though the code is flawed. The call to FreeLibraryAndExitThread implies that the code follows the "Worker thread retains its own reference on the DLL" model. After all, that's why the last thing the thread does is free the library. But if that's the case, then a call to FreeLibrary coming from the application won't actually unload the DLL, because the DLL reference count is still nonzero: There is one reference still being held by the worker thread. Therefore, the DLL will never actually unload outside of process termination. All the flaws in the dynamic unload case are masked by the fact that the code never executes.
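
To make the reference counting argument concrete, here is a toy model in C. It is not the real loader: ToyModule and the toy_ functions are made-up names for illustration, and they only mimic the way LoadLibrary and FreeLibrary adjust a module's reference count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a loaded DLL; the real loader tracks far more. */
typedef struct ToyModule {
    int  refCount;   /* outstanding references on the module */
    bool unloaded;   /* set once the count drops to zero */
} ToyModule;

/* Models LoadLibrary: each call adds one reference. */
void toy_LoadLibrary(ToyModule *m)
{
    m->refCount++;
}

/* Models FreeLibrary: drops one reference and "unloads" only at zero.
   Returns true if this call triggered the unload. */
bool toy_FreeLibrary(ToyModule *m)
{
    assert(m->refCount > 0);
    if (--m->refCount == 0) {
        m->unloaded = true;   /* where DLL_PROCESS_DETACH would be sent */
        return true;
    }
    return false;
}

/* Plays out the scenario from the puzzle: the application loads the DLL,
   the worker thread takes its own reference, and then the application
   frees its reference. Returns true if that last call unloaded the DLL. */
bool toy_AppFreeUnloadsDll(void)
{
    ToyModule dll = { 0, false };
    toy_LoadLibrary(&dll);          /* the application's reference */
    toy_LoadLibrary(&dll);          /* the worker thread's own reference */
    return toy_FreeLibrary(&dll);   /* the application is done with the DLL */
}
```

In this model, toy_AppFreeUnloadsDll returns false: the count goes from 2 to 1, so the unload path, flaws and all, never runs.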

Led astray: Some of us mentioned that waiting on ThreadHandle returned immediately because the handle to a thread is automatically closed when the thread exits. This is wrong. Handles do not self-close. You have to call CloseHandle to close them. This is "obvious" if you apply the "imagine if the world actually worked this way" rule: Suppose thread handles were invalidated (and eligible for re-use) when a thread exited. Then how could you use a thread handle at all? Any time you use a thread handle, there would be an unavoidable race condition where the thread might have exited just before you used the handle. And it would be impossible to call GetExitCodeThread at all! (Since it only does anything interesting if you pass the handle to a thread that has exited.)

A handle to a thread remains valid until you close it. If the thread has exited, then a wait on the thread handle completes, but the handle is still valid because if it went invalid, programming would become impossible.
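
The "imagine if the world actually worked this way" exercise can also be played out in a toy model. Everything here is invented for illustration: ToyHandleTable and the toy_ functions are hypothetical, and handing out the lowest free slot is an assumption made to keep the example small, not a description of how the real handle table allocates values.

```c
#include <assert.h>

enum ToyKind { TOY_FREE, TOY_THREAD, TOY_FILE };

#define TOY_SLOTS 4

/* Hypothetical per-process handle table: the handle value is the slot index. */
typedef struct ToyHandleTable {
    enum ToyKind slot[TOY_SLOTS];
} ToyHandleTable;

/* Hands out the lowest free slot (an assumption of this sketch). */
int toy_Open(ToyHandleTable *t, enum ToyKind kind)
{
    for (int i = 0; i < TOY_SLOTS; i++) {
        if (t->slot[i] == TOY_FREE) {
            t->slot[i] = kind;
            return i;
        }
    }
    return -1;   /* table full */
}

/* Models CloseHandle: the slot becomes eligible for re-use. */
void toy_Close(ToyHandleTable *t, int h)
{
    assert(h >= 0 && h < TOY_SLOTS && t->slot[h] != TOY_FREE);
    t->slot[h] = TOY_FREE;
}

/* What does the caller's thread handle refer to once the thread has exited
   and some unrelated code has opened a file? A nonzero autoClose selects
   the imaginary world where thread exit closes the handle for you. */
enum ToyKind toy_WhatIsMyThreadHandleNow(int autoClose)
{
    ToyHandleTable t = { { TOY_FREE } };
    int hThread = toy_Open(&t, TOY_THREAD);
    /* ... the thread runs and exits here ... */
    if (autoClose) {
        toy_Close(&t, hThread);     /* imaginary self-closing behavior */
    }
    (void)toy_Open(&t, TOY_FILE);   /* unrelated code opens a file */
    return t.slot[hThread];
}
```

With autoClose set, the function returns TOY_FILE: the caller's "thread handle" now names somebody else's file. With autoClose clear, it returns TOY_THREAD, and the handle stays good until the caller closes it.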

Comments (16)
  1. Mark says:

    They were being profligate with HMODULE, so who’s to say they hadn’t already closed the thread handle?

  2. Doug says:

    The fun is making Thread/DLL code that survives both process shutdown and VB DLL unload when VB unloads an OCX…

  3. dave says:

    > because if it went invalid, programming would become impossible.

    From what I recall, that didn’t stop the implementers of the C RTL function _beginthread from trying to do the impossible.

    (The _beginthread function returns the thread handle; the C RTL obligingly closed the handle when the thread main function returned. Hence one reason for the addition of the _beginthreadex function).

  4. porter says:

    > From what I recall, that didn’t stop the implementers of the C RTL function _beginthread from trying to do the impossible.

    There is no problem closing a thread’s handle if you don’t want to either wait for it or find its exit code. Compare with pthread_detach.

  5. dave says:

    re: porter

    Sure – but there’s a logical disjunction between (a) returning the handle to the caller, and then (b) arranging so that caller’s use of the handle is subject to uncertainty about whether it is still valid.

    Either give the caller the handle and let him dispose of it, or don’t give the caller the handle at all because he can’t trust it to be valid.

    _beginthreadex takes the former approach, in recognition that _beginthread had a flawed design.

  6. porter says:

    > _beginthreadex takes the former approach, in recognition that _beginthread had a flawed design.

    Or rather an interesting heritage, it came from OS/2 where it would have returned a thread id.

  7. Nawak says:

    Programming is impossible, let’s go shopping!

  8. 640k says:

    The concept of associating a thread with a handle value is flawed. This is why Windows leaks handles all over the place. How fast are handles reused anyway?

  9. porter says:

    > The concept of associating a thread with a handle value is flawed.

    Always a problem when GetCurrentThread()  returns -2.

    Try TIDs and TlsGetValue()…

    However handles are the only way to wait for a thread termination, so they have their place.

  10. Brian says:

    I think the people who said the handle was closed really meant the handle is signaled (which afaik is true in this case).

  11. steveshe says:

    @ 640k – OK, I’ll bite, what are you talking about when you say "This is why windows leaks handles all over the place"? I have boxes that run for months and don’t leak any noticeable number of handles.

  12. John says:

    Won’t a handle / id remain valid until all references to the underlying object have been released?  I’m pretty sure that’s the way it works, so as long as you track your references properly you should be fine.

  13. Mark says:

    Thank you Raymond for your puzzles: they’re always illuminating.

  14. porter says:

    > Won’t a handle / id remain valid until all references to the underlying object have been released?  I’m pretty sure that’s the way it works, so as long as you track your references properly you should be fine.

    I would not have thought so, DuplicateHandle() returns a new handle, not the same handle with bumped up reference count.

    Try… open file, duplicate handle, close original file, try original handle, close duplicate handle

  15. Alexandre Grigoriev says:

    John, porter

    A kernel object can have multiple handles associated with it, and also other references (each handle holds a reference as well). An object stays alive as long as there are references to it. When the last reference is released (which does not necessarily mean the last handle is closed), the object is gone.
