Another reason not to do anything scary in your DllMain: Inadvertent deadlock


Your DllMain function runs inside the loader lock, one of the few times the OS lets you run code while one of its internal locks is held. This means that you must be extra careful not to violate a lock hierarchy in your DllMain; otherwise, you are asking for a deadlock.

(You do have a lock hierarchy in your DLL, right?)

The loader lock is taken by any function that needs to access the list of DLLs loaded into the process. This includes functions like GetModuleHandle and GetModuleFileName. If your DllMain enters a critical section or waits on a synchronization object, and that critical section or synchronization object is owned by some code that is in turn waiting for the loader lock, you just created a deadlock:

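// global variable, shared by ordinary code and by DllMain
CRITICAL_SECTION g_csGlobal;

// some code somewhere: takes g_csGlobal first, and then
// (inside GetModuleFileName) the loader lock
EnterCriticalSection(&g_csGlobal);
... GetModuleFileName(MyInstance, ..);
LeaveCriticalSection(&g_csGlobal);

// DllMain is called with the loader lock already held,
// and then takes g_csGlobal: the opposite order
BOOL WINAPI
DllMain(HINSTANCE hinstDLL, DWORD fdwReason,
        LPVOID lpvReserved)
{
  switch (fdwReason) {
  ...
  case DLL_THREAD_DETACH:
    EnterCriticalSection(&g_csGlobal);
    ...
  }
  ...
}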

Now imagine that some thread is happily executing the first code fragment and enters g_csGlobal, then gets pre-empted. During this time, another thread exits. The exiting thread enters the loader lock and sends out DLL_THREAD_DETACH notifications while the loader lock is still held.

You receive the DLL_THREAD_DETACH and attempt to enter your DLL’s g_csGlobal. This blocks on the first thread, which owns the critical section. That thread then resumes execution and calls GetModuleFileName. This function requires the loader lock (since it’s accessing the list of DLLs loaded into the process), so it blocks, since the loader lock is owned by somebody else.

Now you have a deadlock:

  • g_csGlobal owned by first thread, waiting on loader lock.

  • Loader lock owned by second thread, waiting on g_csGlobal.

I have seen this happen. It’s not pretty.
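If you want to convince yourself that the two bullet points really form a cycle, here is a minimal sketch that replays the scenario in a toy model. It is plain C, not Win32 code: the LockState structure and the model_acquire, model_is_deadlocked, and replay_article_scenario names are all made up for illustration, and the model only records who owns what and who is blocked on what.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not Win32 code: two locks and two threads. */
enum { LOCK_LOADER, LOCK_CS_GLOBAL, LOCK_COUNT };
enum { THREAD_NONE = -1, THREAD_FIRST, THREAD_SECOND, THREAD_COUNT };
enum { WAIT_NONE = -1 };

typedef struct {
    int owner[LOCK_COUNT];         /* thread holding each lock, or THREAD_NONE */
    int waiting_for[THREAD_COUNT]; /* lock each thread is blocked on, or WAIT_NONE */
} LockState;

void lock_state_init(LockState *s)
{
    for (int l = 0; l < LOCK_COUNT; l++) s->owner[l] = THREAD_NONE;
    for (int t = 0; t < THREAD_COUNT; t++) s->waiting_for[t] = WAIT_NONE;
}

/* Thread t asks for lock l. Returns true if t now owns it (the lock was
   free, or t already owned it); otherwise records that t is blocked. */
bool model_acquire(LockState *s, int t, int l)
{
    assert(t >= 0 && t < THREAD_COUNT && l >= 0 && l < LOCK_COUNT);
    if (s->owner[l] == THREAD_NONE || s->owner[l] == t) {
        s->owner[l] = t;
        return true;
    }
    s->waiting_for[t] = l;
    return false;
}

/* Follow the chain "start waits for a lock whose owner waits for a lock
   whose owner ...". Arriving back at start means nobody can make progress. */
bool model_is_deadlocked(const LockState *s, int start)
{
    int t = start;
    for (int steps = 0; steps < THREAD_COUNT; steps++) {
        int l = s->waiting_for[t];
        if (l == WAIT_NONE) return false;   /* this thread can still run */
        t = s->owner[l];
        if (t == THREAD_NONE) return false; /* the lock is actually free */
        if (t == start) return true;        /* the chain closed into a cycle */
    }
    return false;
}

/* Replays the sequence of events described in the article. */
bool replay_article_scenario(void)
{
    LockState s;
    lock_state_init(&s);
    model_acquire(&s, THREAD_FIRST, LOCK_CS_GLOBAL);  /* first thread enters g_csGlobal */
    model_acquire(&s, THREAD_SECOND, LOCK_LOADER);    /* second thread exits, takes loader lock */
    model_acquire(&s, THREAD_SECOND, LOCK_CS_GLOBAL); /* DllMain blocks on g_csGlobal */
    model_acquire(&s, THREAD_FIRST, LOCK_LOADER);     /* GetModuleFileName blocks on loader lock */
    return model_is_deadlocked(&s, THREAD_FIRST);
}
```

Calling replay_article_scenario() reports a deadlock. Stop the replay one step earlier, before the first thread asks for the loader lock, and the first thread is still free to run, which is why the hang only shows up when the timing is just wrong.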

Moral of the story: Respect the loader lock. Include it in your lock hierarchy rules if you take any locks in your DllMain.
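One way to make such a rule checkable is to give every lock a rank and refuse to take a lock while holding a higher-ranked one. The following is only a sketch under assumptions: the rank values and the hierarchy_* helpers are hypothetical names, the bookkeeping uses C11 _Thread_local, and the loader lock is given the lowest rank because DllMain is entered with it already held. Your code cannot take the loader lock itself, so the loader-lock calls here stand for "I am inside DllMain" or "I am about to call something that takes the loader lock".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks: a thread may take a lock only if it holds no lock
   of a higher rank. The loader lock comes first in the hierarchy. */
enum { RANK_LOADER_LOCK = 0, RANK_CS_GLOBAL = 1, RANK_COUNT };

/* Per-thread count of how many times each rank is currently held. */
static _Thread_local int t_held[RANK_COUNT];

/* Returns true if taking a lock of this rank respects the hierarchy.
   Re-entering a rank already held is allowed. */
bool hierarchy_may_acquire(int rank)
{
    assert(rank >= 0 && rank < RANK_COUNT);
    for (int r = rank + 1; r < RANK_COUNT; r++) {
        if (t_held[r] > 0) return false; /* already holding a higher rank */
    }
    return true;
}

/* Call just before taking the real lock (or before calling a function
   that takes the loader lock). Returns false on a hierarchy violation. */
bool hierarchy_note_acquire(int rank)
{
    if (!hierarchy_may_acquire(rank)) return false;
    t_held[rank]++;
    return true;
}

/* Call just after releasing the real lock. */
void hierarchy_note_release(int rank)
{
    assert(rank >= 0 && rank < RANK_COUNT && t_held[rank] > 0);
    t_held[rank]--;
}
```

With these ranks, the DllMain path (loader lock, then g_csGlobal) passes the check, while the "some code somewhere" path (g_csGlobal, then a call that needs the loader lock) fails it. The fix the check points you toward is to make the GetModuleFileName call before entering g_csGlobal, or to keep g_csGlobal out of DllMain entirely.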

Comments (17)
  1. BTannenbaum says:

    We’ve been bitten by this *many* times before.

    The problem with the loader lock is that there’s no documentation on which functions require it, so it jumps out and bites you when you’re making an apparently innocent call. And to the best of my knowledge, there’s no way for me to grab the loader lock in my code, so I could add it to my hierarchy.

    And of course, this is all timing dependent, so it will work perfectly in the lab and hang randomly at a customer site.

  3. keithmo says:

    I hate it when my threads get all "pre-empty"… :-)

  4. To B. Tannenbaum: Since there’s no way to know when you need the loader lock, the following is useless advice, but theoretically it would solve your deadlocks.

    Define one more lock in your own design which serves no purpose by itself except to waste resources.

    When you know that the loader lock is going to be grabbed by something you’re going to call, lock your lock as just described. When you know that the thing you called is already finished with the loader lock, release your lock as just described.

    Include your extra lock in your hierarchy.

  5. Sven G. Ali says:

    Don’t static objects get destructed in DllMain? I think I ran into the loader lock when I had a static object whose job it was to manage a background thread. Its destructor set an event to break the thread out of its loop, then waited on the thread handle. A deadlock happened, and I think the reason was "resource inversion" between the loader lock and the thread handle. My quick "solution" was to bypass DLL_THREAD_DETACH by having the thread terminate itself instead of exiting cleanly. Could this be a valid use of TerminateThread?

  6. Mike Dimmick says:

    Yes, static objects are destroyed in _DllMainCRTStartup, and yes, this can cause problems. I’ve been bitten by this, too.

    Our solution was to remove any setup/teardown code from DllMain and provide InitLibrary/TermLibrary functions separately.

    It can be quite frustrating to try to debug a process beyond the end of main().

  7. Raymond Chen says:

    This ties into the comment from Norman Diamond on the other DllMain thread.

    DllMain is a placeholder name for the actual DLL entry point you choose to use for your DLL. If you choose to use the C runtimes, then the entry point is really _DllMainCRTStartup. That function does some other stuff (like managing global destructors/constructors) and then calls a function called (confusingly coincidentally) DllMain.

    In this case, then, the function you write called "DllMain" is NOT the actual DLL entry point, but an incredible simulation.

  8. I’ve been spelunking (http://www.larkware.com/Articles/SomeSpelunkingHelp.html) around DLL loading recently for a pet project. It’s been an interesting journey and this evening I solved the final piece of the puzzle and, when I did, I suddenly wondered, not for the first time, why Windows holds the loader lock (http://blogs.msdn.com/oldnewthing/archive/2004/01/28/63880.aspx, http://blogs.msdn.com/cbrumme/archive/2003/08/20/51504.aspx) when calling DllMain()…
