The NT DLL Loader: DLL callouts (DllMain) – DLL_PROCESS_ATTACH deadlocks


The Windows DLL loader (I wasn't around then but I assume some of this even comes from the days of 16-bit Windows) has a feature where a DLL may have an "entry point".


If a DLL has an entry point, the loader calls into it on certain significant events.  These events have identifiers associated with them:



  • DLL_PROCESS_ATTACH
  • DLL_PROCESS_DETACH
  • DLL_THREAD_ATTACH
  • DLL_THREAD_DETACH

I'm not going to talk about the thread callouts any time soon; they probably don't do what you expect them to do, so for the most part you should call the function DisableThreadLibraryCalls() in your DLL_PROCESS_ATTACH to save extra page faults when threads come and go.


The apparent contract for DLL_PROCESS_ATTACH is that it is called before any code in your DLL can be called.  Sounds like a nice place to do one-time initialization that couldn't be static for some reason.


Note that I said "apparent".  As the previous articles showed, if your DLL is involved in a dependency cycle, your code can be called before your DLL_PROCESS_ATTACH callout has been issued.  Maybe you're lucky and you've never been hit by this.  There are a lot of lucky people out there.


I'm going to paint a contractual picture here, assuming no cycles are involved.


Presumably if one thread is in the middle of calling your DLL_PROCESS_ATTACH, any other thread that wants to access the exports of your DLL has to block waiting for you to finish your initialization.  Let's call this the "mythical loader lock".  Maybe it's not so mythical and we can discuss/debate the scope of the synchronization (all DLLs, only the DLLs waiting to initialize, what about cycles?) but let's work out the invariants of the contract before we get too hung up on implementation details.


Thus the first problem with DLL_PROCESS_ATTACH processing is deadlocks.  A great example of this is calling CreateThread() inside your own DLL_PROCESS_ATTACH to start a thread running code for your DLL.  Clearly the new thread can't start running your code until you have finished your DLL_PROCESS_ATTACH because otherwise we would be breaking the contract.  Thus the new thread has to wait for you to exit your DllMain().  Maybe that's OK: the thread fires up and immediately blocks waiting for you to finish, and as long as that doesn't happen too frequently, you can survive it.


But now imagine that you do something nifty and useful in that worker thread.  Someday someone comes along and wants to queue a work item to that thread and wait for it to complete, and they do it from inside a DllMain.  Boom.  Insta-deadlock.
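The worker-thread scenario can be modeled without any Win32 calls. What follows is a minimal portable C++ sketch, not loader code: a plain std::mutex stands in for the loader lock, and every name in it (g_loader_lock, worker, process_attach_waits_for_worker, wait_after_attach_returns) is made up for illustration. It also uses a timed wait where real code would typically wait forever, so the example returns instead of hanging.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the loader lock; the real one lives in the loader.
static std::mutex g_loader_lock;

// Models a new thread: it must get the loader lock before it may run any
// code in the DLL, and only then can it signal that its work is done.
static void worker(std::promise<void> done) {
    std::lock_guard<std::mutex> attach(g_loader_lock);
    done.set_value();
}

// Models DLL_PROCESS_ATTACH that starts a worker and waits for it while the
// loader lock is still held. Returns true if the worker finished in time.
bool process_attach_waits_for_worker(std::chrono::milliseconds timeout) {
    std::promise<void> done;
    std::future<void> finished = done.get_future();
    std::thread t;
    bool ok;
    {
        std::lock_guard<std::mutex> loader(g_loader_lock);  // "inside DllMain"
        t = std::thread(worker, std::move(done));
        // The worker is stuck behind the lock we hold, so this cannot succeed.
        ok = finished.wait_for(timeout) == std::future_status::ready;
    }  // "DllMain returns": the lock is released and the worker can run
    t.join();
    return ok;
}

// Same worker, but the wait happens only after the lock has been released.
bool wait_after_attach_returns(std::chrono::milliseconds timeout) {
    std::promise<void> done;
    std::future<void> finished = done.get_future();
    std::thread t;
    {
        std::lock_guard<std::mutex> loader(g_loader_lock);
        t = std::thread(worker, std::move(done));
    }
    bool ok = finished.wait_for(timeout) == std::future_status::ready;
    t.join();
    return ok;
}
```

In this model the first function reports failure however long the timeout is, because the worker cannot make progress until the caller lets go of the lock. Replace the timed wait with an unbounded one and you have the hang.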


That's an easy example, but basically the rule is this:


Calling any function from within your DLL_PROCESS_ATTACH which requires synchronization can deadlock.


Obviously it doesn't have to deadlock; a lot of folks get away with a lot of bad stuff.  They're getting lucky for the most part.


A great example is the process heap.  Did you know that you can lock it?  You can!  You can probably have a lot of fun by calling HeapLock(GetProcessHeap()).  Why would you do that?  I don't know!  Who can know?  Can we stop it?  People want to, but just wait for the black helicopter crowd to show up saying that it's really a collusion/conspiracy to get people to upgrade software on Windows.


If someone locks it (or maybe calls HeapWalk on the process heap which I assume locks it for the duration of the walk) and then calls into the loader while another thread is sitting in its DLL_PROCESS_ATTACH trying to allocate from that heap... well... boom.  You're deadlocked.
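The heap case is a classic lock-order inversion: one thread holds the heap lock and wants the loader lock, the other holds the loader lock and wants the heap lock. Here is a small portable C++ model of that shape, assuming two std::timed_mutex objects as stand-ins. None of these names are real Win32 or loader identifiers, and the timed try is only there so the sketch terminates.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-ins; neither is the real lock.
static std::timed_mutex g_loader_lock_model;
static std::timed_mutex g_heap_lock_model;

// Spin until `n` threads have reached the same checkpoint.
static void arrive_and_wait(std::atomic<int>& counter, int n) {
    counter.fetch_add(1);
    while (counter.load() < n) std::this_thread::yield();
}

// Takes `first`, then tries to take `second` while still holding `first`.
// Returns true if the second lock was obtained within `timeout`.
static bool take_in_order(std::timed_mutex& first, std::timed_mutex& second,
                          std::atomic<int>& holding, std::atomic<int>& tried,
                          std::chrono::milliseconds timeout) {
    std::lock_guard<std::timed_mutex> hold(first);
    arrive_and_wait(holding, 2);   // both threads now hold their first lock
    bool got = second.try_lock_for(timeout);
    if (got) second.unlock();
    arrive_and_wait(tried, 2);     // keep `first` until both attempts are over
    return got;
}

// Returns how many of the two threads managed to take both locks.
int count_threads_that_got_both_locks(std::chrono::milliseconds timeout) {
    std::atomic<int> holding{0}, tried{0};
    bool heap_walker_ok = false, dll_main_ok = false;
    // Thread 1: locked the heap, then calls into the loader.
    std::thread heap_walker([&] {
        heap_walker_ok = take_in_order(g_heap_lock_model, g_loader_lock_model,
                                       holding, tried, timeout);
    });
    // Thread 2: in DLL_PROCESS_ATTACH (loader lock held), then allocates.
    std::thread dll_main([&] {
        dll_main_ok = take_in_order(g_loader_lock_model, g_heap_lock_model,
                                    holding, tried, timeout);
    });
    heap_walker.join();
    dll_main.join();
    return (heap_walker_ok ? 1 : 0) + (dll_main_ok ? 1 : 0);
}
```

Neither thread gets its second lock in this model, whatever the timeout. With untimed acquisitions, which is what real code does, both would simply sit there.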


Those are two easy cases.  Clearly you can deadlock in additional ways (RPC calls to another process or machine which have to reenter your process on a different thread, which then might need the Mythical Loader Lock), and by being creative with things like the thread pool, window messages, etc. you can come up with a million variations on the theme.


Thus, DLL_PROCESS_ATTACH rule #1:


Don't do anything that requires synchronization.  Currently, even heap allocation is suspect.
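One common way to live within this rule, which the post itself does not prescribe, is to leave DLL_PROCESS_ATTACH empty and run the one-time setup lazily on the first call into the library. The sketch below shows the pattern in portable C++ with std::call_once. The names initialize_library, library_greeting, and library_init_runs are hypothetical, and a real DLL would have to route every export through the same guard.

```cpp
#include <mutex>
#include <string>

namespace {
std::once_flag g_init_once;   // guards the one-time setup
int g_init_runs = 0;          // counts how often the setup actually ran
std::string g_greeting;       // hypothetical state that needs initialization

// The work that might otherwise have been put in DLL_PROCESS_ATTACH.
void initialize_library() {
    ++g_init_runs;
    g_greeting = "hello";
}
}  // namespace

// Hypothetical exported function: the setup runs on the first call,
// outside of any loader callout.
std::string library_greeting() {
    std::call_once(g_init_once, initialize_library);
    return g_greeting;
}

// Exposed only so the example can show that the setup ran exactly once.
int library_init_runs() { return g_init_runs; }
```

The guard is itself synchronization, of course. The point is only that it is taken on an ordinary call path instead of inside the loader's callout.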


Comments (14)


  1. A good rule of thumb is to not do anything. Windows Script has suffered a few times because automation objects weren’t released before the host shuts down the script engine. GC’ing those objects may also unload DLLs, like we see in managed code. Voilà – deadlock. It’s all tricky business when considering what can and can’t be done in DllMain, so as little as possible should be done – if anything.

  2. Although a tad "dated", people might want to check out my notes on the loader lock in Windows XP (SP1) located at http://www.smidgeonsoft.com. This is a collection of notes and references to the loader lock found by searching the system DLLs. It is interesting to note that there are two exported NTDLL functions, LdrLockLoaderLock and LdrUnlockLoaderLock, that make life "interesting".

  3. *Makes a note to self: File a bug to have those exports removed*

    I hope to heaven above that nobody EVER uses those exports.

    This is a HIDEOUS, HORRIBLE, AWFUL, HEINOUS idea.

    It’s also not recommended.

    Please don’t go there.

  4. MGrier says:

    I added those exports because instead, people were using an even worse hack to lock the loader lock in various DLLs around the system.

    At least this way we can set a breakpoint on the lock call.

    These are, of course, undocumented and for internal use only and will go away someday. It was better than the pointer swizzling people were doing.

  5. Mike Dimmick says:

    I’ll add, don’t put global objects of a class that has a constructor or destructor in a DLL. Global objects get constructed by DllMainCRTStartup (the actual entry point if you’re going to use the C/C++ runtime) in response to DLL_PROCESS_ATTACH and destroyed in response to DLL_PROCESS_DETACH.

    So the same rules apply to these global objects as to DllMain itself.

  6. MGrier says:

    I’m gettin’ there… I’m gettin’ there… people usually do not believe that the restrictions are real, so I’m making sure that there is plenty of either obvious behavior or behavior which can be justified based strictly on documented public behavior, which will quantify exactly what the dangerous patterns are.

  7. Norman Diamond says:

    Wednesday, June 22, 2005 4:33 PM by Mike Dimmick

    > I’ll add, don’t put global objects of a class that has a constructor or destructor in a DLL.

    I have a feeling I’ve seen that warning before. But one place I _haven’t_ seen it is in the output of a VC++ compiler. Anyone know why not? Compilers know when they’re building DLLs and they know when the source code has global objects with constructors or destructors, so compilers ought to know that the programmer needs to be warned.

  8. MGrier says:

    There is hope for C++ global object construction if we could nix the idea of data exports.

    When we get to the phase about talking about the ramifications of DLL_PROCESS_DETACH, we’ll see that it’s actually the rundown that’s harder to solve.

  9. Ron Hancock says:

    Apologies but I really don’t have that much idea of the detail that is being discussed. However I have just experienced a problem that I can not remove and it appears relevant to the general discussion. I use Win 2000 Pro (with all updates) and I am now getting the following error message during boot:

    "The procedure entry point LdrLockLoaderLock could not be located in the dynamic link library ntdll.dll"

    The header on the error mentions ctfmon.exe at this point but I also get a similar error from all sorts of other applications.

    And now the question. Having got this how do I get rid of it? I was running synchronization but have turned this off.

    If you can help I’d be very grateful. I’m fairly clued in on using PCs but I am not a programmer.

  10. MGrier says:

    That executable requires a feature that is present only on Windows XP and later. It cannot run on Windows 2000.

  11. Seth McCarus says:

    "Don’t do anything that requires synchronization. Currently, even heap allocation is suspect."

    MSDN explicitly allows InitializeCriticalSection in DllMain, but I’m pretty sure that allocates heap memory…

  12. MGrier says:

    Re: InitializeCriticalSection() allocating heap:

    Only on chk builds or if the app verifier is on. On the other hand, do the docs say "does not take any additional synchronization or make callbacks forever into the future?"?

    I’m just saying it’s suspect. As long as someone hasn’t called HeapWalk(), most of the process default heap’s allocations actually come from lookaside lists which don’t acquire the heap’s lock in any case (which possibly violates the HeapLock() contract…) and so would never deadlock.

    As in the theme for my blog, this probably works 99.9% of the time. Is that OK? Do you want to fly on a plane that’s run by a control system that’s 99.9% reliable or do you want to pay for a few more 9s? How about your banking? Email? online RPG? Online picture archives?

  13. Eugene Gershnik says:

    > for the most part you should call the function DisableThreadLibraryCalls() in your DLL_PROCESS_ATTACH to save extra page faults when threads come and go.

    Isn’t this bad advice for folks who use the statically linked C runtime? AFAIK it cleans up its per-thread data structures on thread detach notifications.
