The NT DLL Loader: DLL callouts (DllMain) - DLL_PROCESS_ATTACH deadlocks

The Windows DLL loader (I wasn't around then, but I assume some of this dates back to the days of 16-bit Windows) has a feature where a DLL may have an "entry point".

If a DLL has an entry point, the loader calls into it on certain significant events.  These events have identifiers associated with them:

  • DLL_PROCESS_ATTACH
  • DLL_PROCESS_DETACH
  • DLL_THREAD_ATTACH
  • DLL_THREAD_DETACH

I'm not going to talk about the thread callouts any time soon.  They probably don't do what you expect them to do, so for the most part you should call DisableThreadLibraryCalls() on your module handle in your DLL_PROCESS_ATTACH to save extra page faults when threads come and go.

The apparent contract for DLL_PROCESS_ATTACH is that it is called before any code in your DLL can be called.  Sounds like a nice place to do one-time initialization that couldn't be static for some reason.

Note that I said "apparent".  As the previous articles showed, if you are involved in a cycle, you can have your code called before your DLL_PROCESS_ATTACH callout has been issued.  Maybe you're lucky and you've never been hit by this.  There are a lot of lucky people out there.

I'm going to paint a contractual picture here, assuming no cycles are involved.

Presumably if one thread is in the middle of calling your DLL_PROCESS_ATTACH, any other thread that wants to access the exports of your DLL has to block waiting for you to finish your initialization.  Let's call this the "mythical loader lock".  Maybe it's not so mythical and we can discuss/debate the scope of the synchronization (all DLLs, only the DLLs waiting to initialize, what about cycles?) but let's work out the invariants of the contract before we get too hung up on implementation details.

Thus the first problem with DLL_PROCESS_ATTACH processing is deadlocks.  A great example of this is calling CreateThread() inside your own DLL_PROCESS_ATTACH to start a thread running code for your DLL.  Clearly the new thread can't start running your code until you have finished your DLL_PROCESS_ATTACH because otherwise we would be breaking the contract.  Thus the new thread has to wait for you to exit your DllMain().  Maybe that's OK: the thread fires up and immediately has to block waiting for synchronization, but if that doesn't happen too frequently, you can survive this.

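But now imagine that you do something nifty and useful in that worker thread.  Someday someone comes along and, while still inside DLL_PROCESS_ATTACH, wants to queue a work item to that thread and wait for it to complete.  The worker can't run until DllMain() returns, and DllMain() won't return until the worker finishes.  Boom.  Insta-deadlock.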
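The worker-thread deadlock can be modeled without touching the real loader.  The following is a minimal sketch, not the NT implementation: g_mythical_loader_lock and attach_and_wait_for_worker are hypothetical names, an ordinary mutex stands in for whatever the loader really uses, and a timeout replaces the infinite wait so that the sketch reports the problem instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the loader's synchronization.  The real lock is
// internal to the loader; this mutex only models "one attach at a time".
static std::mutex g_mythical_loader_lock;

// Models a DLL_PROCESS_ATTACH that starts a worker and then waits on it.
// Returns true if the worker finished while "DllMain" was still running.
bool attach_and_wait_for_worker(std::chrono::milliseconds patience) {
    std::promise<void> work_done;
    std::future<void> work_done_future = work_done.get_future();
    std::thread worker;
    bool finished_in_time = false;
    {
        // "Entering DllMain": the lock is held across the whole callout.
        std::lock_guard<std::mutex> in_dllmain(g_mythical_loader_lock);

        worker = std::thread([&work_done] {
            // The new thread must get past the loader before it may run
            // code in the DLL, so it blocks here until "DllMain" returns.
            std::lock_guard<std::mutex> thread_start(g_mythical_loader_lock);
            work_done.set_value();  // the queued "work item" completes
        });

        // A real DllMain would wait forever here.  The timeout only lets
        // this sketch observe the stall and move on.
        finished_in_time =
            work_done_future.wait_for(patience) == std::future_status::ready;
    }  // "DllMain returns": the lock is released and the worker can proceed.

    assert(worker.joinable());
    worker.join();
    return finished_in_time;
}
```

Calling attach_and_wait_for_worker(std::chrono::milliseconds(50)) returns false: the worker cannot make progress until the attach scope ends, no matter how long the wait is.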

That's an easy example, but basically the rule is this:

Calling any function from within your DLL_PROCESS_ATTACH which requires synchronization can deadlock.

Obviously it doesn't have to deadlock; a lot of folks get away with a lot of bad stuff.  They're getting lucky for the most part.

A great example is the process heap.  Did you know that you can lock it?  You can!  You can probably have a lot of fun by calling HeapLock(GetProcessHeap()).  Why would you do that?  I don't know!  Who can know?  Can we stop it?  People want to, but just wait for the black helicopter crowd to show up saying that it's really a collusion/conspiracy to get people to upgrade software on Windows.

If someone locks it (or maybe calls HeapWalk on the process heap, which I assume locks it for the duration of the walk) and then calls into the loader while another thread is inside its DLL_PROCESS_ATTACH trying to allocate... well... boom.  You're deadlocked.
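The heap case is a classic lock-order inversion, and it can be sketched the same way.  Everything here is an assumption made for illustration: loader_lock and heap_lock are plain timed mutexes standing in for the real locks, simulate_heap_loader_inversion is a hypothetical name, and the timeouts exist only so that the sketch returns instead of hanging the way the real thing would.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

struct InversionResult {
    bool attach_got_heap;    // did "DllMain" manage to allocate?
    bool walker_got_loader;  // did the heap-locking thread enter the loader?
};

InversionResult simulate_heap_loader_inversion(std::chrono::milliseconds patience) {
    std::timed_mutex loader_lock;  // hypothetical stand-in for the loader lock
    std::timed_mutex heap_lock;    // hypothetical stand-in for the heap lock
    std::promise<void> heap_is_locked;
    std::promise<void> attach_gave_up;
    std::future<void> heap_is_locked_f = heap_is_locked.get_future();
    std::future<void> attach_gave_up_f = attach_gave_up.get_future();
    InversionResult result{false, false};

    // This thread is "inside DllMain": it owns the loader lock throughout.
    std::unique_lock<std::timed_mutex> in_dllmain(loader_lock);

    std::thread walker([&] {
        // Models HeapLock(GetProcessHeap()) followed by a call into the loader.
        std::unique_lock<std::timed_mutex> heap_held(heap_lock);
        heap_is_locked.set_value();
        std::unique_lock<std::timed_mutex> into_loader(loader_lock, patience);
        result.walker_got_loader = into_loader.owns_lock();
        // Keep the heap locked until the other side has finished trying,
        // as a genuinely blocked thread would.
        attach_gave_up_f.wait();
    });

    heap_is_locked_f.wait();
    {
        // "DllMain" now tries to allocate from the locked heap.
        std::unique_lock<std::timed_mutex> allocating(heap_lock, patience);
        result.attach_got_heap = allocating.owns_lock();
    }
    attach_gave_up.set_value();
    walker.join();
    return result;
}
```

Both fields come back false: each thread holds the lock the other one needs, so neither attempt can succeed however long the timeout is.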

Those are two easy cases.  Clearly you can deadlock in additional ways (RPC calls to another process or machine which have to reenter your process on a different thread, which then might need the Mythical Loader Lock), and by being creative with things like the thread pool, window messages, etc., you can come up with a million variations on the theme.

Thus, DLL_PROCESS_ATTACH rule #1:

Don't do anything that requires synchronization.  Currently, even heap allocation is suspect.
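A minimal sketch of one way to stay within this rule, assuming your one-time initialization can wait until the first real call into the DLL: keep DLL_PROCESS_ATTACH trivial and run the setup lazily from your entry points.  All of the names below (initialize_library_state, library_table_size, library_init_runs) are hypothetical, and the "setup" is a placeholder for whatever lock-taking work you would otherwise have been tempted to do in DllMain().

```cpp
#include <cassert>
#include <mutex>

namespace {
std::once_flag g_init_once;
int g_init_runs = 0;   // hypothetical: counts how often the setup ran
int g_table_size = 0;  // hypothetical state that needs one-time setup

void initialize_library_state() {
    ++g_init_runs;
    g_table_size = 64;  // placeholder for the real initialization work
}
}  // namespace

// Hypothetical exported function.  Every entry point makes sure the setup
// has happened before it touches the shared state.
int library_table_size() {
    std::call_once(g_init_once, initialize_library_state);
    return g_table_size;
}

// Exposed only so that the sketch can be checked.
int library_init_runs() { return g_init_runs; }
```

The synchronization has not gone away, since std::call_once still blocks concurrent callers, but it now happens on an ordinary call path rather than inside the attach callout.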