The case of the DLL that refuses to load

A customer reported that they had a problem that occurred only on some machines but not others. Their application called


and the call succeeded on some machines, but failed on others with error ERROR_MOD_NOT_FOUND ("The specified module could not be found"). The path was a fully-qualified path to a file that was confirmed to exist and be readable.

strModule = 0x09e875b4 "C:\Users\Bob\Desktop\CopyAndRun\Contoso.dll"

When the sxe ld Contoso.dll command was used in the debugger to break when the DLL loads, the breakpoint fired, but a breakpoint on Contoso's DllMain was never hit. "I think this means that the problem is not that Contoso failed to initialize, but what does it mean?"

If you get a break from sxe ld but no breakpoint on DllMain, then it means that the DLL was found but couldn't be loaded. Loader snaps will tell you what went wrong. "My psychic powers tell me that a dependent DLL could not be found or initialized."

The customer replied, "Ah, of course. We'll look into that."

A short while later, they confirmed the diagnosis. "The Contoso DLL was dependent on a version of the C runtime library that was not installed on the machines where it failed to load. But as a follow-up question: I would have expected the standard 'The program can't start because XYZ is missing from your computer' dialog to appear in this case. Why doesn't it?"

The reason is there in the error message: The "missing file" error message is shown only when a program cannot start due to a missing file. Specifically, it is raised by the loader only during the initial DLL resolution phase that occurs as part of process initialization. These are the DLLs linked implicitly via the module header, for example because you linked against kernel32.lib. DLLs loaded explicitly via LoadLibrary do not display this error message; instead, the error is returned to the program, which is expected to take appropriate recovery steps. By comparison, if DLL resolution fails during process initialization, there is nowhere to return the failure code. You can't return it to the program since the program isn't running yet. The only place to put the error is on the screen.
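
To make the explicit-load case concrete, here is a minimal sketch in Python that uses ctypes to call LoadLibraryW and read the resulting error code. The helper names try_load_library and describe_load_failure, the wording of the descriptions, and the DLL path are hypothetical and exist only for illustration; the numeric error codes are the values from winerror.h. It is not a recommended implementation, just a way to see that the failure comes back to the caller as a code and that the caller chooses the recovery step.

```python
import ctypes
import sys

# Win32 error codes (values from winerror.h).
ERROR_MOD_NOT_FOUND = 126     # the module, or one of its dependencies, was not found
ERROR_BAD_EXE_FORMAT = 193    # not a valid image for this process (often a bitness mismatch)
ERROR_DLL_INIT_FAILED = 1114  # a DLL initialization routine failed


def describe_load_failure(code: int) -> str:
    """Hypothetical helper: turn a LoadLibrary error code into a hint."""
    hints = {
        ERROR_MOD_NOT_FOUND: "the DLL or one of its dependencies could not be found",
        ERROR_BAD_EXE_FORMAT: "the DLL is not a valid image for this process",
        ERROR_DLL_INIT_FAILED: "a DLL initialization routine failed",
    }
    return hints.get(code, f"unrecognized error code {code}")


def try_load_library(path: str):
    """Hypothetical helper: return (handle, error_code); Windows only."""
    if sys.platform != "win32":
        raise NotImplementedError("LoadLibraryW is only available on Windows")
    from ctypes import wintypes

    # use_last_error=True makes ctypes capture GetLastError() after each call.
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.LoadLibraryW.argtypes = [wintypes.LPCWSTR]
    kernel32.LoadLibraryW.restype = wintypes.HMODULE

    handle = kernel32.LoadLibraryW(path)
    if not handle:
        # No dialog appears; the error is simply handed back to the caller.
        return None, ctypes.get_last_error()
    return handle, 0


if __name__ == "__main__" and sys.platform == "win32":
    # Hypothetical path to an optional plug-in.
    handle, err = try_load_library(r"C:\Example\Optional.dll")
    if handle is None:
        print(f"Optional feature disabled: {describe_load_failure(err)}")
```

Note that the error code alone does not say which dependency was missing, which is why loader snaps were needed to finish the diagnosis.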

Comments (17)
  1. MC says:

    I use the Sysinternals Process Monitor utility to diagnose this sort of problem at least once a month. The last time, it was the Oracle libraries needing a version of the MS C runtime that wasn't installed (in the path).

  2. Thorsten says:

    Somewhat annoying is that when the loader shows the "missing file" dialog during the initial loading of the application, it uses internal knowledge that is simply not available when calling LoadLibrary.

    If the start of the application fails because a DLL is missing that's 4 dependencies away from the application image, it still tells you exactly what file is missing. But if you call LoadLibrary and a DLL is missing that's 3 dependencies away from the DLL you are trying to load, all you get is an error code telling you it couldn't load the DLL, not why. There probably should be a version of LoadLibrary that returns a more detailed description of the nature of the error than just an error code.

  3. Gabe says:

    Having just read the headline, my psychic debugging powers said the problem was that the program was a .NET executable compiled with AnyCPU that tries to load a 32-bit DLL. The systems where it fails would be 64-bit systems.

    Are there any other likely scenarios that cause these symptoms?

  4. Joker_vD says:

    @Gabe: The needed version of the MS Visual C runtime is not installed, and the program was linked against it dynamically (with the /MD switch).

  5. cheong00 says:

    I found it's easier to just run depwalk (the dependency walker) on the EXE to check all dependencies at once. Of course it doesn't work for delay-loaded libraries, but it usually does the job. And the best thing is that it doesn't require much explanation of how it works, so it's a decent tool to ask junior colleagues to carry when they're sent out for on-site work.

  6. Gabe says:

    Joker_vD: As Raymond said, if your program is dynamically linked against a version of the CRT that isn't installed, you get an error dialog and your program won't even start.

    In this case the program dynamically loaded a DLL that was linked against a version of the CRT that wasn't installed. The program started but got an error when it tried to load that DLL.

  7. Neil says:

    @cheong00: I don't know about depwalk but depends.exe found a DLL loading issue for me quite recently. (A dependent DLL needed to be relinked when two DLLs it depended on were merged but I had skipped the relevant dependency checking.)

  8. John Doe says:

    @cheong00, Dependency Walker has a "profiling" feature that runs an executable and catches runtime library loading errors, so it's really the best tool.

  9. Medinoc says:

    @Thorsten: I fully agree.

    A shame the loader doesn't maintain/expose a per-thread variable that could be used for such diagnostics, with a function such as GetLastModuleNotFound(LPTSTR, int)…

  10. Escaper says:

    Is it a legacy issue of the LoadLibrary function of sorts? Can it not use exceptions or something? If it can, then what is preventing it from throwing an exception, with the name of the missing file specified in it, all the way up to the process trying to load a library?

    [So you're saying that everybody who calls LoadLibrary needs to wrap it inside a try/except? "Why did Microsoft design the LoadLibrary function to raise errors two different ways simultaneously? It raises an exception and returns an error code. Worst of both worlds!" -Raymond]
  11. Escaper says:

    > So you're saying that everybody who calls LoadLibrary needs to wrap it inside a try/except?

    Whoever wants to handle the missing file exception should catch it, yes. Or am I missing something?

    [And I presume that this means that LoadLibrary never returns NULL. I guess you could do that, though this breaks the general rule that in Win32, exceptions are used only for catastrophic failures (for the most part nonrecoverable). Also, it breaks backward compatibility with Win16, but I assume you knew that. -Raymond]
  12. Gabe says:

    Presumably when Win32 came about with structured exception handling, they could have made a new version of LoadLibrary (LoadLibraryEx?) that throws exceptions instead of returning errors. Of course, if you're just going to introduce a new API, you can introduce a function called GetLoadLibraryLastError and then not even have to create a new contract.

  13. Escaper says:

    Failure to load a library (which is presumably required to run an application) does seem a bit catastrophic to me as a developer of the application. :) As for Win16, well, if I understand it right, it is old and not used extensively nowadays at all. Keeping backward compatibility has to have its limits so that (inefficient) legacy code does not impact modern development, I think. In any case, recently I tried to run some old 16-bit application on Windows 7 and it didn't start. So backward compatibility is already somewhat broken for old applications.

    [I think you're missing the big picture. It's a chain of compatibility. Compatibility with Win16 is important to Windows NT 3.1. Compatibility with Windows NT 3.1 is important to Windows NT 4.0. Compatibility with Windows NT 4.0 is important to Windows 2000. And so on to the present day. -Raymond]
  14. Escaper says:

    >  And so on to the present day.

    I understand. But this chain has to break at some points in time I believe. Otherwise having to adhere to this paradigm will hold new development back and make it less efficient. The example we just saw – a function not throwing exception when it's appropriate to do so because there were no exceptions mechanism the day the function has been written. What I am saying is sometimes it's worth abandoning something old in favor of letting something new evolve faster. But that's just my particular developer's view on it, I am far away from the business perspective of all this stuff. :)

    [Check out WinRT. -Raymond]
  15. Joker_vD says:

    "The example we just saw – a function not throwing exception when it's appropriate to do so because there were no exceptions mechanism the day the function has been written."

    Have you ever used streams from the STL of C++? I just love that when input/output on a stream fails, the stream becomes all sad and emotional and refuses to do anything unless you cheer it up with clear(). And of course, you technically can enable exceptions on streams — but that breaks such a huge amount of third-party library code that you definitely don't want to do it.

    And the absence of any way to know what exactly went wrong with that DLL is indeed quite stupid. Which reminds me of a related gotcha: if you start a console program with CREATE_NO_WINDOW, the "The program can't start because XYZ is missing from your computer" dialog won't appear. Amazing, isn't it?

  16. Gabe says:

    Escaper: If a DLL is required to run an application, why would you call LoadLibrary on it? Your DLL would be implicitly linked and your app wouldn't start without it. You use LoadLibrary to enable optional functionality, whether a plug-in, a COM object, or perhaps to avoid paying for a feature until you know the user wants it.

    I'd hate to be unable to use some app just because a plug-in I don't even intend to use fails to load when the app initializes.

  17. Escaper says:

    Gabe, ok, we could have two versions of LoadLibrary following the well-known pattern: LoadLibrary throws an exception, while TryLoadLibrary returns a value. But that is not the point. The point is that LoadLibrary could return the name of the missing file one way or another. That would save a lot of time for a lot of people who are trying to find out what particular file is missing.

    [This assumes that the only reason LoadLibrary could fail is a missing file. -Raymond]

