How does delay-loading use binding information?


In the documentation for delay-loading, there's a remark that says that the call to GetProcAddress can be avoided if there is binding information. A customer who received the explanation of why you can't delay-load kernel32 pointed out that paragraph and asked whether this means that you can delay-load kernel32 if you bind to it. (Getting around to answering this question was the point of the past few days.)

Let's take another look at what that GetProcAddress-avoidance optimization does. Actually, it's just another look at what the module loader does when it's time to resolve imports to a bound DLL: At build time, the actual function pointers are precomputed and cached, along with the timestamp of the DLL those precomputed values came from. At run time, the delay-load stubs check the timestamp of the target DLL and compare it against the timestamp that was cached. If the two match, the stubs skip the call to GetProcAddress and use the cached value.

In other words, the delay-load stubs use binding information in exactly the same way the module loader does.
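
Here's a rough model of that timestamp check, written in Python for brevity. It is only a sketch of the idea, not the real loader or delay-load helper code: the names Dll, BoundImport, and resolve are made up for illustration, and a real DLL keeps its timestamp and exports in the PE headers rather than in a dictionary.

```python
from dataclasses import dataclass


@dataclass
class Dll:
    """Hypothetical stand-in for a DLL that is already loaded."""
    timestamp: int   # stamp recorded in the DLL when it was built
    base: int        # address the DLL is loaded at
    exports: dict    # function name -> offset from the base

    def get_proc_address(self, name):
        # Models the slow path: look the name up in the export table.
        return self.base + self.exports[name]


@dataclass
class BoundImport:
    """Values cached in the importing module when it was bound."""
    name: str
    cached_timestamp: int
    cached_address: int


def resolve(imp, dll):
    """Return (address, used_cache) for one imported function."""
    if dll.timestamp == imp.cached_timestamp:
        # Timestamps match: trust the precomputed pointer.
        return imp.cached_address, True
    # Timestamps differ: the cached pointer is stale, so ask the DLL.
    return dll.get_proc_address(imp.name), False
```

Calling resolve with a BoundImport whose cached timestamp matches the Dll hands back the cached address without touching the export table. Change the timestamp on either side and the same call falls through to get_proc_address.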

Does this mean that you can now delay-load kernel32?

No. First of all, if the timestamps don't match or if the target DLL was not loaded at its preferred address, then the binding information is of no use—you have a cache miss. In that case, the module loader (and the delay-load stubs) must obtain the function pointers the old-fashioned way. You can't assume that your binding information will always be accurate. (For example, after your module was bound to kernel32, there may have been a security update which modified kernel32, which invalidates your binding information.)

And besides, even if the binding information were used, you still have to call LoadLibrary to get the target DLL loaded in the first place. Even though binding may have optimized away one call to kernel32, you still have that LoadLibrary to deal with.
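
To make both points concrete, here's a small Python model of a delay-load stub that records which kernel32 services it needed. Everything in it is hypothetical (FakeKernel32, delay_load, and the dictionary layout are invented for this sketch, and the real stub is generated by the linker and its helper library), but the shape of the logic follows the description above.

```python
class FakeKernel32:
    """Hypothetical model that records which kernel32 services a stub uses."""

    def __init__(self, modules):
        self.modules = modules   # DLL name -> description of that DLL
        self.calls = []          # names of the services requested so far

    def load_library(self, dll_name):
        self.calls.append("LoadLibrary")
        return self.modules[dll_name]

    def get_proc_address(self, module, func_name):
        self.calls.append("GetProcAddress")
        return module["loaded_base"] + module["exports"][func_name]


def delay_load(kernel32, dll_name, func_name, bound):
    """Resolve one delay-loaded import; `bound` is cached binding data or None."""
    # The target DLL has to be loaded no matter what the binding data says.
    module = kernel32.load_library(dll_name)

    # The binding data is usable only if the DLL is unchanged since bind time
    # and it landed at the address the cached pointers were computed for.
    usable = (
        bound is not None
        and bound["timestamp"] == module["timestamp"]
        and module["loaded_base"] == module["preferred_base"]
    )
    if usable:
        return bound["address"]

    # Cache miss: get the pointer the old-fashioned way.
    return kernel32.get_proc_address(module, func_name)
```

With valid binding data, the recorded calls list contains only LoadLibrary. With a stale timestamp or a relocated DLL, it contains LoadLibrary followed by GetProcAddress. It is never empty, which is the problem when the DLL you are trying to delay-load is the one that provides those services.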

Comments (10)
  1. DAEngh says:

    Raymond, thank you for this past series.  I’ve almost always used tools that insulated me from really having to know things in this level of detail, and it’s interesting to hear imagined details explained.

  2. Leo Davidson says:

    “or if the target DLL was not loaded at its preferred address, then the binding information is of no use”

    Couldn’t the binding information still be used? The loader just has to shift the precomputed addresses by (actual_base - preferred_base), unless I’ve missed a detail.

    Either way, the rest still stands, so the overall message is still correct; I’m just wondering about that detail.

    [The main purpose of binding is to prevent the page from becoming dirty, and optimizing out the GetProcAddress will still dirty the page. It just creates another code path that needs to be debugged, and it doesn’t give you much of a benefit over hinting. -Raymond]
  3. Leo Davidson says:

    “The main purpose of binding is to prevent the page from becoming dirty, and optimizing out the GetProcAddress will still dirty the page.”

    Fair enough. I was thinking it could still be useful to avoid the GetProcAddress overhead.

    (Adding an offset still being much quicker than a linear string-table search, but it’s not like it takes that long either way so I can definitely accept your argument against making the code more complex.)

    [Since the hint is guaranteed to be correct, there is no linear search. All you’re saving is a single strcmp. -Raymond]
  4. Leo Davidson says:

    "Since the hint is guaranteed to be correct, there is no linear search"

    True and makes sense. I was wrongly reading/assuming that all the optimisations would be dropped, including the hinting info, but you hadn’t said that at all.

    Apologies & thanks for clearing up my confusion.

  5. waleri says:

    What are the chances that the DLL in question remains the same?

    Every Windows update changes a bunch of DLLs. Different users run different versions of the OS. And on top of that, sometimes when a DLL is changed, its entry points remain the same but its timestamp is different.

  6. yuhong2 says:

    "Then again, people would most likely still want to try to delay load kernel32 because they are too lazy to deal with multiple Windows versions properly, relying on delay loading to get around the initial symbol check."

    In this case, why not delay-load just the kernel32 symbols in question?

  7. Philip says:

    Crescens2k:

    The real reason to delay-load kernel32 is to make the usage of functions like CreateFileTransactedW nicer by providing a failure stub in the case of the OS not having the function (XP, etc.)

    What I’m describing is to both statically and delay load kernel32. You statically pick up the functions you want to use in delay loading (LoadLibrary, GetProcAddress, heap functions, no-fail function, etc) and you delay-load the functions you want to have OS-specific behavior for.

  8. Philip says:

    As noted before, you *can* delay-load kernel32 with some PE editing. You just need to link against it for GetProcAddress, LoadLibrary, GetModuleHandle, and the interlocked functions.

  9. Crescens2k says:

    Philip:

    Even though you *can* delay-load kernel32 with some PE editing, it should also be noted that it is a very daft thing to do.

    Not only do you end up in the catch-22 situation where you can’t load kernel32 because you need LoadLibrary from kernel32, but there are other very important functions in there. For example, VirtualAlloc and the other virtual memory functions are in there, as are the Heap functions. So you would lose memory management too.

    Then again, people would most likely still want to try to delay load kernel32 because they are too lazy to deal with multiple Windows versions properly, relying on delay loading to get around the initial symbol check.

  10. Worf says:

    If you want to do tricks like that, GetProcAddress() on kernel32.dll is the supported way. And with ASLR, the only way.

    Or are people too lazy to do it this way?
