Issues related to forcing a stub to be created for an imported function


I noted last time that you can concoct situations that force the creation of a stub for an imported function. For example, if you declare a global function pointer variable:

DWORD (WINAPI *g_pGetVersion)() = GetVersion;

then the C compiler is forced to generate the stub and assign the address of the stub to the g_pGetVersion variable. That's the best it can do, since the loader will patch up only the imported function address table; it won't patch up anything else in the data segment.

The C++ compiler, on the other hand, can take advantage of some C++ magic and secretly generate a "pseudo global constructor" (I just made up that term so don't go around using it like it's official or something) that copies the value from the imported function address table to the g_pGetVersion variable at runtime. Note, however, that since this happens at runtime, mixed in with all the other global constructors, the variable might not yet be set if you call through it from any code that runs during construction of global objects. Consider the following buggy program made up of two files.

// file1.cpp
#include <windows.h>

EXTERN_C DWORD (WINAPI *g_pGetVersion)();
class Oops {
  public: Oops() { g_pGetVersion(); }
} g_oops;

int __cdecl main(int argc, char **argv)
{
  return 0;
}

// file2.cpp
#include <windows.h>

EXTERN_C DWORD (WINAPI *g_pGetVersion)() = GetVersion;

The rule for C++ construction of global objects is that global objects within a single translation unit are constructed in the order they are declared (and destructed in reverse order), but there is no enforced order for global objects from separate translation units. But notice that there is an order-of-construction dependency here. The construction of the g_oops object requires that the g_pGetVersion object be fully constructed, because it's going to call through the pointer when the Oops constructor runs.
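
The single-translation-unit half of that rule is enough to reproduce the hazard in portable C++, with no DLL imports involved. What follows is only a sketch under stated assumptions: RealFunction, LookUpAtRuntime, Probe, and g_forceRuntime are hypothetical stand-ins, the volatile read exists solely to keep the compiler from turning the initializer into a compile-time constant, and the constructor tests the pointer for null instead of calling through it so that the problem is recorded rather than crashing.

```cpp
#include <cassert>

using Fn = int (*)();

// Hypothetical stand-in for the imported function.
int RealFunction() { return 42; }

// Reading a volatile forces the initializer of g_pFn to run at startup
// instead of being evaluated at compile time.
volatile int g_forceRuntime = 1;

Fn LookUpAtRuntime() { return g_forceRuntime ? &RealFunction : nullptr; }

extern Fn g_pFn;  // defined below, after g_probe

struct Probe {
    bool pointerWasReady;
    // Records whether the pointer had been set when this constructor ran.
    Probe() : pointerWasReady(g_pFn != nullptr) {}
};

Probe g_probe;                 // declared first, so constructed first
Fn g_pFn = LookUpAtRuntime();  // declared second, so still null while g_probe is built

// Once startup has finished, the pointer is set, but the probe already ran.
bool ProbeRanTooEarly() {
    assert(g_pFn != nullptr);
    return !g_probe.pointerWasReady;
}
```

Swapping the two definitions at the bottom makes the probe see a valid pointer. Across separate files there is no such declaration order to lean on, which is where the two-file program above gets into trouble.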

It so happens that the Microsoft linker constructs global objects in the order in which the corresponding OBJ files are listed in the linker's command line. (I don't know whether this is guaranteed behavior or merely an implementation detail, so I wouldn't rely on it.) Consequently, if you tell the linker to link file1.obj + file2.obj, you will crash because the linker will generate a call to the Oops::Oops() constructor before it gets around to constructing g_pGetVersion. On the other hand, if you list them in the order file2.obj + file1.obj, you will run fine.

Even stranger: If you rename file2.cpp to file2.c, then the program will run fine regardless of what order you give the OBJ files to the linker, because the C compiler will use the stub instead of trying to copy the imported function address at runtime.
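
If the dependency has to stay, one common C++ idiom for taking link order out of the picture is to hide the pointer behind a function with a function-local static, so that it is initialized the first time anybody asks for it. This is a portable sketch of that idiom, not a statement about what any particular compiler generates for imported functions; RealFunction, LookUpAtRuntime, GetFn, and Caller are hypothetical names, and the volatile read is again only there to force the lookup to happen at runtime.

```cpp
#include <cassert>

using Fn = int (*)();

// Hypothetical stand-in for the imported function.
int RealFunction() { return 42; }

// Forces the lookup below to happen at runtime.
volatile int g_forceRuntime = 1;

Fn LookUpAtRuntime() { return g_forceRuntime ? &RealFunction : nullptr; }

// The function-local static is initialized the first time control reaches
// its declaration, regardless of where the caller's file sits in the link.
Fn GetFn() {
    static Fn s_pFn = LookUpAtRuntime();
    assert(s_pFn != nullptr);
    return s_pFn;
}

struct Caller {
    int result;
    // Safe to call during global construction: GetFn initializes on demand.
    Caller() : result(GetFn()()) {}
};

Caller g_caller;
```

The cost is a function call and an "already initialized?" check on every use, which may or may not matter for the code in question.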

But what happens if you mess up and declare a function as dllimport when it isn't, or vice versa? We'll look at that next time.

Comments (16)
  1. Mike Dimmick says:

    The linker doesn’t really know anything about constructing C++ global objects. All it does is link together sections of the .obj files to form the same-named sections of the final file. One of the special features it has is that if a section in the .obj has a $ sign in it, it sorts the objects by the text appearing after the $ and combines them together to form the section with the name on the left. So, if sections named .CRT$XCA, .CRT$XCU, and .CRT$XCZ exist in the whole link job, they are sorted so that XCA comes first, then XCU, then XCZ. The order of objects with identical trailers (e.g. if multiple .obj files have .CRT$XCU sections) is not specified.

    I’m not explaining this well.

    Anyway, in crt0init.c, you can see that the C run-time library declares global variables __xc_a in the .CRT$XCA section and __xc_z in the .CRT$XCZ section. Then there’s a linker directive to tell it to merge the .CRT section into the .data section. If you use a global object with a constructor, the compiler generates a .CRT$XCU section containing a pointer to that constructor. The linker’s magic with the $ sections causes a function pointer table to be constructed.

    In [w]{Win|Dll}MainCRTStartup, there’s a call to cinit, which is implemented in crt0dat.c. This calls _initterm, which simply iterates through the table calling every function pointer. Theoretically, if there were CRT global objects that needed construction before being used by user global objects, they could be given a letter earlier than U in the alphabet. In practice, there aren’t any (at least in VC6). This magic is used (with $XI) to initialize and clean up the standard I/O library, for example.

    The fact that this executes in DllMainCRTStartup means that you have to be careful of the loader lock in any global object (with a constructor) created in a DLL. It’s best not to use them.
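
The table walk described in the comment above can be sketched in portable C++. This is an illustration of the idea only, not the CRT source: RunInitializers, g_table, InitFirst, and InitSecond are hypothetical names, and the null entries at each end are an assumption standing in for the __xc_a and __xc_z markers that bracket the real table.

```cpp
#include <cassert>
#include <cstddef>

using InitFn = void (*)();

// Records the order in which the initializers ran.
int g_order[4];
std::size_t g_count = 0;

void InitFirst()  { g_order[g_count++] = 1; }
void InitSecond() { g_order[g_count++] = 2; }

// Stand-in for the table the linker assembles by sorting and merging the
// $-suffixed sections; the null entries play the role of the end markers.
InitFn g_table[] = { nullptr, InitFirst, InitSecond, nullptr };

// Walks the table from first up to (not including) last,
// calling every non-null entry in order.
void RunInitializers(InitFn* first, InitFn* last) {
    assert(first != nullptr && last != nullptr);
    for (InitFn* p = first; p != last; ++p) {
        if (*p != nullptr) {
            (*p)();
        }
    }
}
```

Calling RunInitializers(g_table, g_table + 4) runs InitFirst and then InitSecond. Whatever order the entries have in the table is the order the "constructors" run in, which is why the order of same-named sections matters.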

  2. Andrew says:

    Fascinating as always! However I didn’t think Microsoft shipped anything other than a C++ compiler these days? Is the mention of a C compiler just a historical detail or is there actually a Microsoft pure C compiler still available somewhere?

  3. Brian says:

    If you name your file .c rather than .cpp it runs the C compiler rather than the C++ compiler.

  4. Andrew says:

    Cool! Thanks Brian.

  5. Daniel Colascione says:

    Is there a way to force the compiler to use a stub instead of creating a global constructor?

    [See Brian’s comment above or the response to Norman Diamond below. -Raymond]
  6. Dmitry Streblechenko says:

    Do LoadLibrary and GetProcAddress have any thread affinity at all?

    I replaced a piece of code that loads msmapi32.dll dynamically as soon as my own dll is loaded (you cannot statically link to it since it can live in several places, none of them in the search path) with something like the following (Delphi):

    var _MAPIInitialize : pointer = nil;

    function MAPIInitialize;
    begin
     GetMAPIProcedureAddress(_MAPIInitialize, 'MAPIInitialize');
     asm
       mov esp, ebp
       pop ebp
       jmp [_MAPIInitialize]
     end;
    end;

    GetMAPIProcedureAddress() would load msmapi32.dll if necessary and call GetProcAddress if GetProcAddress has never been called for the given function.

    In other words, the dll would get loaded as soon as an attempt is made to call any of its functions.

    This works perfectly 99.9% of the time, but as soon as this code is used in a multithreaded environment, weird things start happening: some time after the code runs and loads msmapi32.dll on a secondary thread (which later terminates, but my dll stays up), I get access violations either in msmapi32.dll itself or in one of the dlls that also use msmapi32.dll (such as mspst32.dll). This includes both true multithreaded applications written in C++ as well as the VB IDE, which runs the code in its own address space when debugging.

    Yes, I do wrap the code that calls LoadLibrary and GetProcAddress in critical sections… The crashes occur seemingly out of the blue and I cannot see the call stack…

  7. Doug says:

    GetProcAddress probably doesn’t have any thread affinity, but MAPI is probably making some assumptions about the thread that is used to initialize it.  Also, MAPIInitialize takes a parameter which you aren’t passing in your call.  The documentation for MAPIInitialize specifies how the parameter should be initialized for multithreaded programs.

    Don’t play these tricks just to save an extra "ret" instruction.  Call MAPIInitialize according to the documentation and things will work a lot better.

  8. steveg says:

    Dmitry Streblechenko: (shrug) Send/PostMessage to your main thread and let it do the work. Sounds like you’ve spent enuf time on this one.

  9. Mike Dimmick says:

    Andrew, Brian: The C++ and C compiler share a lot of the same code, the difference is that for modules compiled as C, C1.DLL is used to produce a parse tree, while for C++, C1XX.DLL is used. As you say, normally for .c files the C compiler is used while for .cpp, .cxx files the C++ compiler is used; however, you can override this behaviour by using the /Tc, /TC, /Tp, /TP switches (respectively ‘compile this file as C’, ‘compile all files as C’, ‘compile this file as C++’, ‘compile all files as C++’).

  10. Doug,

    MAPIInitialize is still called on the same thread (and it must be called on every thread that uses MAPI, much like CoInit); it is a question of when LoadLibrary() is called, which apparently makes a difference.

    As for the missing parameter, it is there – just a Delphi shortcut: it allows you to omit the parameter list in the implementation section if the function definition in the interface section lists the parameters:

    function  MAPIInitialize(lpMapiInit : Pointer) : HResult; stdcall;

  11. Steve,

    I don’t have a main thread – my code is in a COM dll which is called by other executables; so main thread is a relative term here – there might be a main thread MAPI-wise (that does all the MAPI related work) as opposed to the main UI thread.

  12. Todd Greer says:

    It’s not clear to me how the behavior you describe conforms to the C++ standard. From my reading, g_pGetVersion is required to be initialized before any dynamic initialization takes place:

    EXTERN_C DWORD (WINAPI *g_pGetVersion)() = GetVersion;

    This is a non-local static of POD type that is initialized by an address constant expression. g_pGetVersion is therefore initialized before g_oops is constructed.

    Unless I’m misreading the standard, you’ve just described a very interesting compiler bug.

    Thank you for the very interesting post.

    [Interesting point. But whether it’s conforming or not, it’s what happens, and your choices are either to accommodate this behavior or to vote with your wallet and buy a different compiler. I tend to discuss things as they actually are rather than how they ought to be in an ideal world, because it turns out we don’t live in an ideal world. -Raymond]
  13. Jim Howard says:

    "Interesting point. But whether it’s conforming or not, it’s what happens, and your choices are either to accommodate this behavior or to vote with your wallet and buy a different compiler. I tend to discuss things as they actually are rather than how they ought to be in an ideal world, because it turns out we don’t live in an ideal world. -Raymond"

    Raymond, just be glad you’re not me.  I have to sit next to this language lawyer all day long!

  14. Norman Diamond says:

    To Todd Greer:

    (1)  Is the initializer really considered to be a constant when its value comes from an extern declaration?  The external definition is not part of this translation unit.

    (2)  When a translation unit contains a #pragma, whether or not brought about from doing #include <windows.h>, doesn’t the standard recuse itself entirely from defining the meaning (or absence thereof) of a program?

    (3)  If nonetheless it’s a compiler bug, you’ll be glad to know that this isn’t the only place where you can report it without paying a fee.  Visual Studio is different from Windows.  For Visual Studio you can go to the following web site, report a bug, and get a “won’t fix” resolution for free:
    http://connect.microsoft.com/feedback/default.aspx?SiteID=210

    [I think (2) is the operative condition here. In order to get this behavior, you have to use a Microsoft language extension (__declspec), at which point the rules are allowed to change. If you don’t use __declspec(dllimport), then you get the naive behavior which is standard-conforming. -Raymond]
  15. Todd Greer says:

    Indeed, __declspec is key here. As this is a documented MS extension, it is to be expected that it changes the rules regarding what it modifies.

    Is this particular way in which it changes the rules documented anywhere (other than here)? I was unable to find any such documentation. I did find http://msdn2.microsoft.com/en-us/library/twa2aw10.aspx, which mentions assigning the address of a dllimport function to a global or static variable, but it did not mention this caveat.

    It would appear that I should still report a bug, but that it is a documentation bug.

  16. Todd Greer says:

    Norman,

    (1) Whether the declaration is from this translation unit or not is irrelevant.

    (2) Good point. A #pragma causes the implementation to behave in an implementation-defined manner, but the standard does require that the implementation-defined behavior be documented. As Raymond pointed out though, you’ve more or less hit it with the point that it involves a Microsoft extension.

    (3) Thank you. I didn’t know that. I’ll file a documentation bug there.

Comments are closed.
