Understanding errors in classical linking: The delay-load catch-22


Wrapping up our week of understanding the classical model for linking, we’ll put together all the little pieces we’ve learned this week to puzzle out a linker problem: The delay-load catch-22.

You do some code cleanup, then rebuild your project, and you get

LNK4199: /DELAYLOAD:SHLWAPI ignored; no imports found from SHLWAPI

What does this warning mean?

It means that you passed a DLL to the /DELAYLOAD command-line switch, but your program doesn’t actually import anything from that DLL, so the linker is saying, “Um, you said to treat this DLL as special, but I don’t see any imports from it.”

“Oh, right,” you say to yourself. “I got rid of a call to Hash­String, and that was probably the last remaining function with a dependency on SHLWAPI.DLL. The linker is complaining that I asked to delay-load a DLL that I wasn’t even loading!”

You fix the problem by deleting SHLWAPI.DLL from the /DELAYLOAD list and removing SHLWAPI.LIB from the list of import libraries. And then you rebuild, and now you get

LNK2019: unresolved external '__imp__HashData' referenced in function 'HashString'

“Wait a second, I stopped calling that function. What’s going on!”

What’s going on is that the Hash­String function got taken along for the ride by another function. The order of operations in the linker is

  • Perform classical linking
  • Perform nonclassical post-processing
    • Remove unused functions (if requested)
    • Apply DELAYLOAD (if requested)

The linker doesn’t have a crystal ball that lets it say, “I see that in the future, the ‘remove unused functions’ step is going to delete this function, so I can throw it away right now during the classical linking phase.”
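
The pass ordering can be mimicked with a toy model. Everything here is invented for illustration (the OBJ name strutils.obj, the StrTrim function, Python dicts standing in for COFF objects); the point is only the ordering of the passes, not the real linker's data structures.

```python
# Toy model of the linker's pass ordering. OBJ and function names are
# invented for the sketch; real linking works on COFF objects and
# import libraries, not Python dicts.

# Classical rule: referencing ANY symbol in an OBJ pulls in the WHOLE
# OBJ. HashString rides along with StrTrim because they share an OBJ.
STATIC_LIB = {
    "strutils.obj": {
        "StrTrim":    [],                  # still called by the program
        "HashString": ["__imp__HashData"], # no longer called, same OBJ
    },
}

def classical_link(referenced):
    """Pass 1: pull in whole OBJs and collect the imports they need."""
    linked = {}
    for funcs in STATIC_LIB.values():
        if referenced & funcs.keys():
            linked.update(funcs)
    needed = {dep for deps in linked.values() for dep in deps}
    return linked, needed

refs = {"StrTrim"}  # the call to HashString was cleaned up
linked, needed_imports = classical_link(refs)

# Pass 1 must resolve __imp__HashData right now, so SHLWAPI.LIB is
# still required; remove it and you get LNK2019.
assert "HashString" in linked
assert "__imp__HashData" in needed_imports

# Pass 2 (remove unused functions): HashString is unused, discard it.
linked = {f: deps for f, deps in linked.items() if f in refs}

# Pass 3 (apply DELAYLOAD) runs last and finds no SHLWAPI imports left
# to rewrite, hence warning LNK4199.
shlwapi_imports = {dep for deps in linked.values() for dep in deps}
assert "__imp__HashData" not in shlwapi_imports
```

Running this shows the catch-22: the classical pass needs SHLWAPI.LIB to resolve an import for a function that a later pass throws away, and by the time the DELAYLOAD pass runs, no SHLWAPI import remains for it to rewrite.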

You have a few solutions available to you.

If you can modify the library, you can split the Hash­String function out so that it doesn’t come along for the ride.
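
A toy sketch of why the split helps (OBJ and function names are invented for illustration): once HashString lives in its own OBJ, the classical pull-in-the-whole-OBJ rule never drags it into the link in the first place.

```python
# Invented names again: HashString now lives in its own OBJ, so the
# classical pull-in-the-whole-OBJ rule never drags it into the link.
STATIC_LIB = {
    "strutils.obj":   {"StrTrim": []},
    "hashstring.obj": {"HashString": ["__imp__HashData"]},
}

refs = {"StrTrim"}  # the program no longer calls HashString
linked = {}
for funcs in STATIC_LIB.values():
    if refs & funcs.keys():   # classical rule: pull the whole OBJ
        linked.update(funcs)

# No SHLWAPI import ever exists at any pass, so both SHLWAPI.LIB and
# /DELAYLOAD:SHLWAPI can be removed without warnings or errors.
assert "StrTrim" in linked
assert "HashString" not in linked
```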

If you cannot modify the library, then you’ll have to use the /IGNORE flag to explicitly ignore the warning.
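
A hedged sketch of what that command line might look like (file names are hypothetical; 4199 is the number of the warning being suppressed, and as noted in the comments below, suppressing linker warnings this way is not guaranteed to keep working across linker versions):

```shell
rem Hypothetical command line: keep the delay-load setup but tell the
rem linker to suppress warning LNK4199 by number.
link /DELAYLOAD:shlwapi.dll /IGNORE:4199 main.obj delayimp.lib shlwapi.lib
```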

Exercise: Another option is to leave SHLWAPI.LIB in the list of import libraries, but remove it from the DELAYLOAD list. Why is this a dangerous option? What can you do to make it less dangerous?

Comments (25)
  1. configurator says:

    Why delayload it at all? What happens if you statically link it? I'm sure all hell breaks loose, but I'm not so sure why.

  2. Joshua says:

It means the linker was dumb. When it gets to the Apply DELAYLOAD step and a listed library no longer has any imports, even though it was referenced earlier, it should just silently drop it.

    @configurator: You can't statically link to system DLL files.

  3. Bob says:

    @configurator:

Documentation says to use /DELAYLOAD if you might not use the DLL at all, or not until late in the program's execution.  So, I figure you are saving memory and load time.

    One other possibility is for optional features.  If you didn't install feature X (maybe you didn't pay for it) then you won't even have the DLL that feature X depends on.  Could avoid the "why do I have to waste xx MB of disk space on DLLs which support hardware options I don't even have?" problem.

  4. ChrisR says:

    @Joshua:  It seems fairly obvious from the context that configurator means why not place a static dependency on the DLL.

  5. Matt says:

    @Joshua: "@configurator: You can't statically link to system DLL files."

No, but you can put in a hard dependency (i.e. a proper, not-delay-loaded dependency), which I think is what configurator actually meant.

  6. Maurits says:

    If I understand this correctly: the linker's left hand doesn't know what its right hand is doing, so you have to pass an /IGNORE switch to prevent it from confusing itself.

  7. GregM says:

    'Why delayload it at all?'

    You pay the cost of loading it when your EXE is started (or DLL is loaded) even if you won't use it for a while, if at all.

    'You can't statically link to system DLL files.'

    Don't get caught up in the specifics of the example and miss the entire point.  Ignore the fact that it's a system DLL.  The same thing applies to any other DLL you might use.

  8. ultramage says:

    "Why is this a dangerous option? What can you do to make it less dangerous?"

    I think this is what Configurator is wondering about, and me as well. What is so dangerous about the standard default implib approach?

    [The danger is that you are now loading the target DLL at startup rather than on demand, which slows down your app's startup time and increases its memory usage. -Raymond]
  9. Is the danger specific to SHLWAPI?

    [SHLWAPI was merely an example. (What an odd question. You may as well refine it to "Is the danger specific to the HashData function?") -Raymond]
  10. Ofek Shilon says:

    Still hoping for some elaboration on function level linking. Wasn't /GL intended to be the silver bullet to solve all these cases?

    [As noted in the article, function level linking is applied after the classical link pass. This is important because of the tricks that people use that rely on the classical model. -Raymond]
  11. alexcohn says:

    [The danger is that you are now loading the target DLL at startup rather than on demand, which slows down your app's startup time and increases its memory usage]

    – but only after another subtle change makes this DLL used again. Otherwise, would the 'Remove unused functions' step not remove the import lib entirely?

    [And that subtle change is the danger. -Raymond]
  12. Ofek Shilon says:

    Is function level linking indeed a synonym for removing unused functions?  /GL and /OPT:REF are different switches, one could reasonably hope that function level linking does more than that.

    [As explained in the documentation for /Gy, function-level linking allows functions to be discardable during the "unused function" pass, if you ask for it via /OPT:REF. It does not alter the actual classical model for linking. The flag name is misleading. It's not "perform function-level linking". It merely enables it by telling the linker where functions begin and end. And it's not so much function-level linking as it is function-level unlinking. -Raymond]
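
The pair of switches Raymond describes can be sketched on a command line (file names hypothetical): /Gy at compile time packages each function as its own COMDAT, and /OPT:REF at link time performs the "unused function" pass that discards the ones nothing references.

```shell
rem Hypothetical files. /Gy marks each function as a discardable COMDAT;
rem /OPT:REF then lets the linker drop the functions nothing references.
cl /c /Gy strutils.cpp
link /OPT:REF main.obj strutils.obj
```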
  13. GregM says:

    "As noted in the article, function level linking is applied after the classical link pass."

    Raymond, the phrase "function level linking" does not appear in the article.  I assume that Ofek simply didn't realize that "Remove unused functions" was "function level linking".  I know I didn't realize that, but I also wasn't specifically looking for a discussion of it, having forgotten about yesterday's question.

    Alex, the main problem here seems to be that the function level linking pass doesn't provide to the delayload pass the list of DLLs for which it removed references, so that the delayload pass doesn't produce warnings for them.

    Unfortunately, the workaround of using /IGNORE also has the problem that it is relying on undocumented functionality, and it can (and has in the past) change at any time (such as the inability to disable LNK4099 in recent linker versions).

  14. You pay the cost of loading it when your EXE is started (or DLL is loaded) even if you won't use it for a while, if at all.

    Cost (time) of DLL (and EXE) loading could be decreased a lot if the whole module were touched (up to a reasonable size limit) to force it to page in. A great part of an application's startup delay is caused by disk thrashing, which is why an SSD is so helpful. By changing the page-in pattern from a single page on demand to the whole module, you can reduce the number of disk seeks.

    One day I'll try this by writing an add-on driver that uses PsSetLoadImageNotifyRoutine to intercept module load and force them in.

  15. Joshua says:

    @alegr1: I eagerly await the result of your experiment. If it passes, I'll just block-read the DLLs with a ReadFile command and then throw away the result. (Now they're in cache.)

  16. OfekSh says:

    I was under the impression that /Gy (not /GL of course, sorry) also enables resolution of duplicate definitions of *used* functions, via /OPT:ICF, but perhaps you were referring to such unification of duplicates as removal of unused functions? (it probably happens on the same stage anyway).

    /OPT:ICF can probably cover half the cases that otherwise require manually separating functions to individual obj files.  Regarding the other cases – I understand (now, thanks to your posts) that the linker still tries to resolve imports even for *unused* COMDATs, if they're included in an obj file that got pulled in.

  17. alexcohn says:

    @GregM: the DLL will be factored out altogether. The main problem is that the linker will not warn you when changes in your (or linked) code require this DLL again.

    Still, I believe it's safer to keep the import lib and check the load behaviour regularly, than to use /IGNORE.

  18. Matt says:

    @alegr1: That already happens due to the prefetcher, and even before that, on ASLRed modules this happens because most pages in a text segment need relocations.

    Actually the problem is the reverse of what you're stating. The reason that startup is slow is BECAUSE the module gets touched when loading (and all of its dependencies, recursively), incurring an extra penalty of a couple of million cycles between when you double click your app and when it begins executing its "main" function.

  19. Neil says:

    @alegr1 @Joshua @Matt Firefox does this with its master DLLs. Since they are almost always fully paged in, it actually performs a dummy read on them in order to persuade Windows to read them 2M at a time instead of 32K or 4K or whatever you normally get. This apparently saves up to 50% off cold startup time, which is better than prefetch (which unfortunately tends to prefetch the wrong files). See blog.mozilla.org/…/firefox-7-cheating-the-operating-system-to-start-faster for example.
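
A toy version of the dummy sequential read Neil describes can be sketched in a few lines: reading a file in large chunks warms the OS file cache so that later demand paging hits memory instead of seeking the disk. The chunk size mirrors the 2M figure from the Firefox example, and a scratch file stands in for a DLL; both are illustrative choices.

```python
import os
import tempfile

# Chunk size mirrors the 2 MB figure from the Firefox example.
CHUNK = 2 * 1024 * 1024

def warm_file_cache(path, chunk=CHUNK):
    """Sequentially read the whole file in large chunks, discarding
    the data; the side effect is that the OS file cache is warmed."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total  # bytes touched

# Demo on a 3 MB scratch file standing in for a DLL.
with tempfile.NamedTemporaryFile(delete=False) as tf:
    tf.write(b"\0" * (3 * 1024 * 1024))
    scratch = tf.name
assert warm_file_cache(scratch) == 3 * 1024 * 1024
os.remove(scratch)
```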

  20. To make it less dangerous: Have tooling that catches changes in the number of entries in the IAT? (or some other kind of profiling). I'm not sure if that's the answer you're looking for though.

  21. ulric says:

    I don't see how loading an extra DLL is "dangerous".

    Most apps end up loading dozens, maybe hundreds of DLLs. Trying to shave one DLL off the link line is effectively missing the forest for the trees, and the equivalent of optimizing early without first profiling.

    [You know why most apps load dozens, maybe hundreds of DLLs? Because nobody was watching when the count went up by one. And it happened dozens, maybe hundreds of times. It's like "I don't see what's so dangerous about not balancing my checkbook. My checkbook hasn't been balanced for years!" -Raymond]
  22. alexcohn says:

    @Anil: to make it less dangerous, it's enough to check the list of loaded DLLs before release.

    @ulric: I don't think that loading an extra DLL is "dangerous".

    But it may easily become a startup time disaster if the "extra" DLL in turn pulls dozens, maybe hundreds of DLLs with it. OTOH, /DELAYLOAD should be, IMVHO, a last-minute optimization. Unlike dynamic linking, the DELAYLOAD magic can rarely get broken by the user code (and vice versa). Introducing it early in the development cycle may make debugging harder, and due to subtle changes in the code and environment the effects of DELAYLOAD may vary greatly.

    Therefore, I would recommend introducing /DELAYLOAD only after the application and the environment have stabilized, during the performance tuning and testing stage.

    [Actually, delay loading can introduce its own problems. For example, if you trigger a delay-load while holding a lock, you may inadvertently create a deadlock. -Raymond]
  23. Joshua says:

    [Actually, delay loading can introduce its own problems. For example, if you trigger a delay-load while holding a lock, you may inadvertently create a deadlock. -Raymond]

    Repeat after me: Do not do interesting things in DllMain. That way lies madness.

  24. Deduplicator says:

    And never forget when all those static initializers run.

  25. alexcohn says:

    It would probably be nice to have a /DELAYLOAD_ALL flag, plus a way to list exceptions.

Comments are closed.