The NT DLL Loader: DLL_PROCESS_ATTACH reentrancy - step 2 - GetProcAddress()

Last time we pondered what LoadLibrary() does when called inside a DLL_PROCESS_ATTACH callout.  The answer was pretty simple and predictable; the only nuance is that the newly loaded module's initializers are not run before LoadLibrary() returns.

Now place yourself in the position of mythical developer Weve Stoods, who evidently did most of the loader development in the early 90s.  You cleverly avoided the whole recursive initialization problem.  Then a bug report comes in: calling GetProcAddress() and then calling through the returned pointer doesn't work.  I've never met Weve myself and I can't say for sure what was going on at the time, but I can imagine two scenarios.  In the first, it does work in some cases (after all, the library may already have been initialized due to static imports), so the team in question "suddenly" has a break where calling through the function pointer sometimes works and sometimes does not.  The second scenario is "it's so easy to make it work, why can't you just make it work?"

The road to Hades is paved with good intentions...

In both cases, pushing back and saying "GetProcAddress should never be called during DLL_PROCESS_ATTACH" would be a very strange response, even if we might wish now that that was the actual result.

So someone made it work.  How did they make it work?  Well, we already established that there is an in-initialization-order list that's built up in the loader's internal tables.  I believe that this is not an implementation detail - you have to have the single linear/global order because uninitialization on unload has to occur in the reverse order of initialization.

So the simple answer is that within the context of GetProcAddress(), the loader runs all the initializers that have not yet been run up to and including the target module of the GetProcAddress() function.

And you know what?  This very often works.  Let's assume that we're in a.dll's DLL_PROCESS_ATTACH and it's loading b.dll and calling GetProcAddress() to get the address of the foo() function.  If b.dll depends only on, let's say, kernel32.dll, then kernel32.dll's initializer has already run to completion before b.dll's initializer is run.  So b.dll's initializer runs and, lo and behold, the b!foo function is ready for business.
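To make the happy path concrete, here is a minimal toy model in Python.  It is only a sketch of the behavior described above, not the loader's real data structures: ToyLoader, init_order, started, load_library and get_proc_address are all names I made up for illustration.

```python
class ToyLoader:
    """Toy model of the loader's initialization-order list (not real internals)."""

    def __init__(self):
        self.init_order = []   # modules, in the order they must be initialized
        self.started = set()   # modules whose initializer has already been entered
        self.log = []          # initializers run by get_proc_address, in order

    def load_library(self, name):
        # Inside DLL_PROCESS_ATTACH the module is only queued; its
        # initializer is not run before load_library returns.
        if name not in self.init_order:
            self.init_order.append(name)
        return name

    def get_proc_address(self, module, symbol):
        # Run every initializer not yet started, up to and including `module`.
        last = self.init_order.index(module)
        for name in self.init_order[: last + 1]:
            if name not in self.started:
                self.started.add(name)
                self.log.append(name)
        return f"{module}!{symbol}"


def demo():
    loader = ToyLoader()
    # Assumed starting point: kernel32.dll finished initializing earlier,
    # and a.dll is in the middle of its own DLL_PROCESS_ATTACH.
    loader.init_order = ["kernel32.dll", "a.dll"]
    loader.started = {"kernel32.dll", "a.dll"}
    loader.load_library("b.dll")      # queued, initializer not run yet
    address = loader.get_proc_address("b.dll", "foo")
    return address, loader.log
```

In this model, demo() queues b.dll without initializing it, and the get_proc_address call is what finally runs b.dll's initializer.  Modules queued after b.dll would stay uninitialized.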

But let's say that b.dll, unbeknownst to a.dll, now has a dependency on a.dll.  The loader will not attempt to rerun a.dll's initializer (it is already in progress) and will instead just call b.dll's initializer, which may call into a.dll.  But a.dll didn't finish its initialization!  And a.dll had no idea that b.dll depended on it, so it wasn't particularly careful about doing all the initialization that b.dll might need before loading b.dll.  Boom.
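The failure can be sketched with another toy model in Python.  Again, this is an illustration under my own assumptions rather than anything the real loader does literally: state, events, a_helper, a_init, b_init and get_proc_address_b are hypothetical names, and the RuntimeError stands in for whatever crash or corruption the half-initialized module would really produce.

```python
state = {"a_ready": False}   # set only once a.dll's initializer completes
events = []                  # trace of what ran, in order


def a_helper():
    """An export of a.dll that assumes a.dll has finished initializing."""
    if not state["a_ready"]:
        raise RuntimeError("a.dll used before its initializer completed")
    return "ok"


def b_init():
    """b.dll's initializer; b.dll has quietly grown a dependency on a.dll."""
    events.append("b_init start")
    a_helper()               # calls back into the half-initialized a.dll
    events.append("b_init done")


def get_proc_address_b():
    """Models GetProcAddress() on b.dll: a.dll's initializer is already in
    progress, so it is skipped and b.dll's pending initializer is run."""
    b_init()
    return "b!foo"


def a_init():
    """a.dll's initializer, which calls into the loader before it is done."""
    events.append("a_init start")
    get_proc_address_b()
    state["a_ready"] = True  # too late: b_init already needed this
    events.append("a_init done")
```

Calling a_init() in this model raises from inside b_init(), before a_ready is ever set, which is the "Boom" above in miniature.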

As usual, sprinkle dependency cycles into the mix, stand back and watch the fun.

Next time, we'll look at quality problems with DLL_PROCESS_ATTACH implementations which directly or indirectly call back into the loader.  This next topic is a very big part of why I'm so stuck on quality and reliability.  Very innocuous errors which people would generally ignore can amplify into terrible problems which are unbelievably difficult to debug.