Why do we have import libraries anyway?


Last time we looked at the classical model for linking as groundwork for answering Adam's question: why do we need import libraries? Why can't all the type information be encoded in the export table?

At the time the model for DLLs was being developed, the classical model was still the primary means by which linking was performed. Men were men and women were women. Compilers generated object modules, and linkers resolved symbols, connecting the loose ends of the object modules, in order to produce an executable. The linker didn't care what language the object modules were written in; in fact it had no way of finding out, since by the time you had an object module, all that was left was a bunch of code with placeholders, and some metadata describing how those placeholders should be filled in.

"If the answer to this is 'because the .dll doesn't contain all the information the linker needs' - the question becomes why don't .dlls contain all the information needed to link against them?'"

Because the linker doesn't have that information when it generates the DLL file. It just has a bunch of code with placeholders. The type information was lost a long time ago. (And if you're writing code in assembly language, your concept of type information is radically different from that of a C or C++ programmer. What is the type of the ECX register if it contains the size of the buffer on entry to the function and the number of unused bytes in the buffer on exit?)

Many years later, the OLE automation folks decided to encode type information in metadata in the form of a type library. And if that's not good enough, you can turn to the CLR, which records enough type information in its assemblies that you can, in principle, call the methods exported from it.

Mind you, this principle is pretty useless, because even with this type information, all you get are the function parameter types and return type. You still have no idea what that bool parameter means, for example. All that you have is a little more information to shoot yourself in the foot with.

Comments (43)
  1. Random832 says:

    Wait a minute… why is type information (outside of the header files or whatever other language-specific thing that the compiler used to call the functions the right way) needed for normal use of a DLL?

  2. Adam says:

    I’ll also take this opportunity to apologize for framing my question in an unclear and ambiguous fashion. I do appreciate the information in the posts you have made, even if they’re not what I intended to ask for! Thanks.

  3. Medinoc says:

    I’m with Random832 here: From what I understood of what you said yesterday, all the linker needed to do its job was the names of the symbols to resolve; and that’s precisely what DLLs have, don’t they? (Unless the symbol is exported under a different name, such as the undecorated one. If that’s the flaw, my reasoning will have to admit defeat.)

  4. porter says:

    > undecorated

    That’s the nub of it. Even a simple "int foo(int)" can appear in the export table under a different name, "foo" or "_foo@4"; the correct information gets put in the .lib for the "__imp__foo" entry, which points to the correct one.

    It used to be that given a DEF file you could produce an import library (think Win16 or OS/2); with Win32, even for simple C entries you need to write a stub piece of C code with the correct number of arguments and correct calling convention to build the import library.

  5. Matthew says:

    @Random832:

    "Wait a minute… why is type information (outside of the header files or whatever other language-specific thing that the compiler used to call the functions the right way) needed for normal use of a DLL?"

    It’s not. Normal use – i.e., loading and accessing exported items, a la LoadLibrary and GetProcAddress – of a DLL doesn’t require import libraries either.

    But if you want to hook up directly to a DLL, so that your code treats the imported functions the same as those internal to the current module, you will need that type information.

    DLLs contain the ordinal, name, and location of exported symbols. They do not contain type information about those symbols. Thus, if you want to be able to use that type information, you need something more – like an import library, which is created with access to the original type information used to create the DLL in the first place.

    If you want to run things yourself – using LoadLibrary et al – then it’s all up to you to make sure you know what you’re doing. If you want to use the language to ensure correctness (etc.), then you need to be able to provide it (compiler/linker) with the extra information to perform such checks. Which isn’t available from the DLL itself.

  6. rs says:

    Chiming in with Random832 and Medinoc: If I can call LoadLibrary and GetProcAddress to get all I need to be able to execute a DLL function, why can’t the linker?

    [Please re-read yesterday’s article about the classical model for linking. Code injection is not part of the classical model. -Raymond]
  7. Matthew says:

    @rs:

    To properly execute it, you need outside knowledge. What form of function pointer is being returned by GetProcAddress? You have no idea what arguments it is expecting – if any – and zero clue about the return value – if any.

    To properly use the function, you would need to know the specification of the function – its type information. Information that doesn’t exist in the export header of a DLL.

    Same with the linker; if the linker is supposed to match potentially overloaded functions, it needs to know about the arguments. Which it cannot possibly gather from the DLL.

  8. rs says:

    Matthew, Raymond: Thanks for the comments. I had been under the impression that for C functions the library didn’t include any information not in the DLL. But it actually does include the size of the parameter block, allowing the compiler to do some minimal type checking.

  9. Random832 says:

    “[…you don’t have the parameters, their types, or any documentation. -Raymond]” yeah, but the linker doesn’t (or shouldn’t) need those, and as I understand it they aren’t present in import libraries (as opposed to type libraries). That’s the compiler’s concern, and it can get them from the header file (or whatever other language-specific mechanism).

    It seems like the question you’re answering isn’t really the same one that was asked.

    If the issue is that the decorated name (either with type info for C++, or with the @N for C stdcall functions) isn’t included in the dll (and just what name is included in the dll if you have multiple overloaded functions by the same name), the question remains: why not?

    [All the linker sees is that CALLFOO.OBJ is looking for the function _foo@12. If it looks at FOO.DLL it sees a function called “Foo”. Since the strcmp fails, the linker says “No match.” The import library’s job is to provide the missing link to the linker: “When somebody asks for _foo@12, send them to FOO.DLL function Foo.” Why isn’t the decoration included in the DLL? (1) Why should it? The information isn’t needed at run time, and the principle of don’t keep track of information you don’t need applies. (2) If you really want to, go ahead and export the decorated names. And then your DLL compiled with Compiler X cannot be used with Compiler Y because they have different decoration algorithms. -Raymond]
  10. Matthew says:

    @Random832

    "and just what name is included in the dll if you have multiple overloaded functions by the same name"

    Anything. Including nothing.

    The external symbol name (the exported one visible in the DLL) is not restricted by the name of the internal function that it references. In fact, as Raymond has already mentioned, you don’t need any name – you can export something with only an ordinal.

    To perform ANY type of checking, the build system NEEDS to have a way to translate between the internal symbols and the exports.

    Headers only (typically) declare the internal symbology of some exported feature. The mapping from internal to external is still missing, hence the need for additional information.

    For example, you can declare a function as exported in a header, but later give it an alias during actual export. There is no way to convey this information afterwards with only the output (DLL). Maybe the original internal function was named "Func1" but then you alias-ed it to "Func123" on export. The linker wouldn’t identify the two as being related if you only had the externally-visible DLL name ("Func123") and were comparing against the internal name ("Func1"): unresolved external, since the only externally-visible name doesn’t match the now ‘missing’ function.

  11. Clinton L. Warren says:

    Tiny C for Windows (http://bellard.org/tcc/) lets you link against a .def file. It also includes a utility to create a .def file from a .dll file.

    So, it’d seem like it’s possible to omit the .lib step.  I myself have linked against sqlite3.dll using only a tcc-generated .def file.

  12. arnshea says:

    God I love this blog, thanks for the compiler linker refresh.

    Preconditions and post-conditions move us a little further along in the direction of "What does ‘bool foo’ mean?" but at some point you bump up against Gödel’s Incompleteness Theorem. Given that programs are clearly more expressive than simple algebra, there’s a limit to how much meaning the system can self-describe.

  13. Al Urker says:

    From the category ‘the more things change, the more they stay the same’:  The clr dream allows the same assembly to be used for the entire lifecycle: you can code, design, link, and run against the same CLR assembly. But many .NET SDKs include asmmeta assemblies for you to code, design and link against. Asmmeta assemblies contain type info and no code – similar to a lib & header file.

  14. Adam says:

    OK, I realise it’s a bit late for this now as you will have written these posts a couple of years ago, but my main concern was not with *types* not being stored in DLLs, but with things as basic as *function names* not being stored by default. You generally can’t link to a DLL if you don’t have the LIB because there’s generally not even a symbol table available.

    As referenced in my question, your July 18 post talks about problems arising from *this* issue, and doesn’t have much to do with types.

    http://blogs.msdn.com/oldnewthing/archive/2006/07/18/669668.aspx

    [Exported function names are included by default, but you can remove them with the NONAME attribute. But even if you have the name, that still doesn’t tell you enough – you don’t have the parameters, their types, or any documentation. -Raymond]
  15. Miral says:

    “(2) If you really want to, go ahead and export the decorated names. And then your DLL compiled with Compiler X cannot be used with Compiler Y because they have different decoration algorithms.”

    Isn’t that one of the ideas behind stdcall, though? That it mandates a particular decoration algorithm? (And parameter/return/stack conventions, of course.)

    (Of course, stdcall didn’t show up until Win32, which is fairly late in the game as far as DLLs are concerned.)

    [Another problem is that stdcall decorates differently on different platforms. -Raymond]
  16. nathan_works says:

    Shoot yourself in the foot? Wasn’t (is*) there a huge market for the old "windows xx api uncovered" books where various folks found, poked/prodded/debugged those functions to guess at the parameters? Like the guy who wrote up all the ZwXxx functions (vis-à-vis the NtXxx flavors), etc.?

    *is, in the sense that said books, and the consumers of those books writing code to use the undocumented functions, give Raymond plenty of topics to post about..

  17. porter says:

    > For there was a platform that had it all:  the Code Fragment Manager on classic Mac OS (post-PowerPC and pre-OS X).

    Not quite all; you are missing AIX and its shared libraries. It supports two types: one which is simply a COFF object file, and the other a library containing object files, where you can link to a specific object within that library. So the library can contain both 32-bit and 64-bit objects; it can also contain an exports definition file.

  18. ulric says:

    One issue I have with this entry is that I believe all the C++ decoration issues were nowhere on the radar when the design decisions about import libraries were made. So it looks to me like this is "retcon"-ing history.

    Way back in the early 90s, there were only the C and Pascal calling conventions, and I don’t recall that there was any decoration in the export names. Therefore, "how do you encode the types?" never came up. We never had the types, just the symbol names.

    Also, it seems a bit of a sidetrack to bring in the CLR and typelibs, since they don’t use import libraries either. Import libraries are something specific to Microsoft C.

    Finally, as you know, there are no import libraries on Unix; the .so has everything.

    It works just fine!  Can you mix C++ .so with other languages? No! But import libraries do not solve that problem either.

    Unix .so implements what this entry suggests the DLLs could not.

  19. Matt G says:

    Right (to ulric).

    All you need from the import library (or the .so, or the .dll itself if things worked that way) should be the symbol->address mappings.

    Sure, that doesn’t work for name-mangled symbols, but not everyone needs name mangling, and if they do, there are multiple ways around it (predict the mangling, use the same compiler everywhere, or import libraries).

    Sure, that doesn’t give you the type information for the parameters and return values, but neither does the import library! In the C calling convention the calling code sets up and cleans up the stack and it’s on that code to get it right/wrong. Typically that information is encoded in the header file. In any case, the import library wasn’t helping with this part of the problem anyway.

    Sure, that doesn’t work for functions exported by ordinal only with NONAME (to save a few bytes back when raptors roamed the earth), but ordinals aren’t magic, they’re just a different (cheaper) way of encoding the name. You can’t use GetProcAddress with those, either. Oh wait, yes you can — you just pass the name in the magic way that encodes the ordinal. The import library does help here (retaining a name->ordinal mapping that the .dll itself doesn’t encode), but if you know the ordinal (which is necessary to use GetProcAddress) you could still get by without the import library.

    The classic linker model only needs the symbol->address mapping, and AFAIK that’s all the import library provides — import libraries are a lot less magic than this post, and about half the comments, imply.

  20. Ken Hagan says:

    This whole thread has got me very confused. Raymond has answered “Why can’t all the type information be encoded in the export table?”. However, that was Adam’s second question and it is not actually relevant to his first (and main) question.

    You don’t need type information to link. What you need to link is a description of the DLL’s interface (entry points) and the traditional model of programming separated interface from implementation. In fact, if memory serves, there was an IMPLIB tool that would generate a LIB file from a (text format) DEF file, so you could link against a DLL that didn’t actually exist yet.

    That’s actually a surprisingly common state of affairs. Any DLL that is updated “in the field” falls into this category, as does any DLL that has to expose a particular set of entry points to conform to a plug-in or driver architecture. Between them, these scenarios cover *most* reasons for using a DLL.

    You do need type information to compile but the exact information you need is language specific. Embedding type information in the actual DLL is only useful if you can be sure that all your clients are using a compatible language and it is wasteful when you have more than one DLL with the same interface.

    [Note also when a function is exported both by name and by ordinal, it’s not clear which one the author intended for you to use. (I wonder what algorithm Tiny C uses to decide.) And if you guess wrong and link by ordinal when the author intended for you to link by name, you end up calling the wrong function. The DirectX team curses people who generate their own import library instead of using the one provided in the SDK. -Raymond]
  21. Alexandre Grigoriev says:

    In the old times, the exports were encoded with ordinals, to save on the file size. The DLL might not even have names. This was already a good enough reason to use import libraries.

    You say: "Because the linker doesn’t have that information when it generates the DLL file". When the linker generates the DLL file, it *has* ALL that information, because it has ALL the object files, and also an optional .DEF file which describes a mapping of OBJ symbols to export names and/or ordinals.

  22. Leif Strand says:

    As I see it, all this talk about type information and decorated names is beside the point.

    If I may take the liberty of rephrasing Adam’s original question:  Why can’t I put a DLL on the linker’s command line?  More precisely, why don’t DLLs and LIBs share the same file format?  This is not crazy; in fact, this is precisely how it works on ELF platforms:  “-lfoo” on the link line will, by default, search for libfoo.so before it searches for libfoo.a — i.e., it will search for the very same .so file that is used by the dynamic linker at runtime, and use that .so file to resolve symbols (decorated or not).  Indeed, in my experience, this question comes from people with a Unix/ELF background.

    So the real question is, why doesn’t Windows work the way ELF platforms do?  There is only one other platform — that I know of — that works the same way Windows does, where there are both “import libraries” and separate “shared libraries”, and they are not in the same format and therefore not interchangeable:  that platform is AIX.

    The reasons for import libraries were neatly summarized in the responses to Raymond’s previous post.  But, none of these reasons justify the use of distinct file formats, and the oil and water separation between import libraries and DLLs.

    For there was a platform that had it all:  the Code Fragment Manager on classic Mac OS (post-PowerPC and pre-OS X).

    CFM shared libraries and CFM import libraries were in the same file format. In fact, a CFM “import library” was simply a shared library stripped of its code, so that only the linker’s symbols remained. There were .def files as well… presumably you could do most of the things you can do with import libraries on Windows, like separate public interfaces from private, or create cyclic dependencies.

    I’ve always assumed there must be historical reasons why AIX and Windows don’t work in the same, sane fashion that CFM did.  But it must be a reason (or set of reasons) other than the ones suggested so far.

    My theory — for Windows specifically — is twofold:

    1) OBJ and LIB files came first, and the current DLL/EXE format (PE) was developed later. As I recall, they are all COFF at the bottom, so this part is hard to justify, but it is not that hard to imagine that PE was optimized for the runtime loader, and it was considered too much trouble, or a waste of time, to make the dev tool chain read the new PE files.

    2) The formats of OBJ and LIB files are MSVC-specific anyway. The idea that there is a single format for libraries and object files is a very Unix-minded point of view. On pre-OS-X Mac and Windows, there is no single “system” compiler; there are multiple compilers from multiple vendors, and (potentially) multiple formats for OBJ and LIB files. Since there was no single linker format that PE could adhere to, they didn’t bother adhering to any of them.

    I’m not satisfied with this theory, and am eager to hear better ones.  But my point is, DLLs *could* be in a format recognized by the linker, and they *could* contain “redundant” symbolic information for use only by the build-time linker.  It has been done before.

    [They may all be COFF at the bottom today, but that’s not how they were back in 1983 when these rules were invented. (Where did I put my time machine?) To link directly against a DLL, you’d have to break the classical rules: The linker would do some pattern matching to realize that the unresolved symbol _foo@12 corresponds with FOO.DLL!foo (breaking the classical rule that all the linker does is strcmp), and then it needs to autogenerate a stub function _foo@12 that does a “call [__imp__foo]”, and then it needs to autogenerate an import variable __imp_foo, and then it has to decide whether __imp_foo should bind by name or ordinal (and it had better not guess wrong). Classically speaking, the linker does not use fuzzy logic or autogenerate code or data. LIBs are for the code and data that glue the two parts together (and which also remove the need for fuzzy logic). And of course you don’t want to have to include all your DLLs in your SDK when all that’s needed to link is the import information. (Would the Platform SDK have to include an XP copy of shell32.dll, a Vista copy of shell32.dll, and a windows 7 copy of shell32.dll? Would security patches also have to patch your SDK?) -Raymond]
  23. Hm, I could have sworn that the DLLs I build export things with their fully decorated names (*). I know I have caught instances of forgotten extern "C" by noticing that there were decorated names in the export table.

    (*) And wouldn’t doing things otherwise require the linker to know about all programming languages such that it could un-decorate the names correctly while creating the DLL?

    The point that (in a more frugal time) DLLs used to export things solely by ordinal makes sense, though.

  24. Neil says:

    On Win16 you could import directly on your DEF file e.g. ENABLESCROLLBAR=USER.482 if you were targeting Windows 3.1 using a Windows 3.0 SDK.

    I don’t know what might be involved in making it possible to link without import libraries on Win32 but possibly you could create an extension e.g. __declspec(dllimport(user32,EnableScrollBar)) which would direct the compiler to create the import stub.

  25. Medinoc says:

    Regarding your answer to Ken Hagan: Is the format of a Visual Studio import library documented?

    If it is, then it should be no trouble for any other compiler vendor to write a converter and build its own import library from the official ones instead of from the DLL. But if it’s not, that makes it a no-win situation…

  26. asd says:

    <i>And if that’s not good enough, you can turn to the CLR</i>

    And if that’s still not good enough, you can turn to pascal/delphi, where linking a library is just “function Whatever(params, params): result; stdcall; external ‘libname.dll’;”. If only things were like this in C++ world, at least for function imports…

    After all, it seems pretty logical. If you can LoadLibrary, GetProcAddress and cast it to whatever function pointer you want, why do you have to go to all the troubles to tell compiler to do just that, only automatically?

    [I think you’re confusing the compiler and the linker. Remember, under the classical model, the linker does not generate code. (The linker also doesn’t know what language you wrote your program in.) -Raymond]
  27. Random832 says:

    “because you’ve made it possible to link C++ modules from different vendors.”

    In the article you just linked, you made “or if you intend them to be able to use your DLL from a language other than C/C++ or use a C++ compiler different from Microsoft Visual Studio” sound like a worthy – or at least achievable – goal.

    And for that matter wasn’t that the entire point of having the undecorated names exported from a dll, which was the entire point of having an import library to map the decorated name used in the object module to the undecorated name exported from the DLL, which led us to this question?

    If this shouldn’t be possible, then the objection that decorated names are compiler-specific and so shouldn’t be exported evaporates.

    And there already is a way to define undecorated [or at least, undecorated enough. There’s still that weird stdcall thing] function names in C++. It’s called extern “C”.

    [By convention, the standardized name is the undecorated name, and these undecorated names follow Win32 ABI rules (not C++ rules or Delphi rules or anything else). The job of the import library is to bridge the gap between the decorated name imported by the compiler and the undecorated name exported by the DLL. Think of it as a thunk library. Why doesn’t the linker auto-generate these thunks? Because under the classical model, linkers do not generate code. -Raymond]
  28. Random832 says:

    Why is it necessary to generate code to bridge a gap between a stdcall function in a dll that takes an int and returns an int, and code outside the dll that calls such a function? Even with an import library and using undecorated names in the dll, all you’re doing is mapping foo to _foo@4 – that’s not code generation. At most it’s “breaking the classical rule that all the linker does is strcmp”, but putting _foo@4 in the dll in the first place would eliminate even that.

    If some language doesn’t provide a language-specific (i.e. put in the header file or equivalent for the compiler to understand, not an import library) way to call a stdcall function, that’s that language’s problem. Maybe they need something like an import library, to put thunks in.

    And “by convention” automatically raises the question of why again. What benefit is there to having “foo” in the dll when any code, compiled from any language that knows how to call a stdcall function, is going to ask for “_foo@4” (ignoring, for the moment until you bring up again the fact that stdcall decoration is different on different platforms, the question of why stdcall actually needs the size in the symbol)

    [A call to an imported function isn’t hooking up the _foo@4 placeholder to foo. You may need to generate a stub, and you definitely need to generate the IAT and related data. -Raymond]
  29. anonymous says:

    "The type information was lost a long time ago."

    The linker generates two outputs – one is the DLL and the other is the import library – so it has all the information needed (the decorated symbol and the exported symbol).

    The decoration needed to link with Win32 is standardized for the platform (otherwise you would need an SDK for Visual C and an SDK for Borland C, etc.).

    So why isn’t the decorated name exported? Probably so that you could call GetProcAddress(handle, "foo") in a portable way. Decorations on different platforms are different.

  30. Jared says:

    The predominant mainframe architecture’s linker knew what compiler generated the object modules — the compiler included that information in the object module.

    The decision to ignore it was deliberate — "special cases" for a specific compiler would not be supported but instead a general mechanism would be provided instead.

    Assistance was provided so that a compiler could pass information about the compilation through the linker for use by the linked code at run time.

    Again, there’s very little under the sun which is truly new.

  31. Matthew says:

    @anonymous:

    "So why isn’t the decorated name exported?  Probably so that you could call GetProcAddress(handle, "foo") in a portable way.   Decorations on different platforms are different."

    The exported name can be ANYTHING you want, including nothing (ordinal only).

    It could be the decorated name. It could be an undecorated name. It could be a completely different name.

    There is absolutely no reason I can’t have an export-marked function called ‘foo’ (internally) and then alias it to ‘gobbledygook’ when it actually gets exported (as in what is visible from the DLL).

    (Go look up the EXPORTS syntax for module definition files on MSDN. You’ll find that it will tell you exactly how to arbitrarily map internal to external symbol names.)

  32. Don Munsil says:

    I remember having exactly this discussion back in 1990.

    Import libraries exist specifically because you may need (in fact, often need) to generate two DLLs that call each other. Without some sort of intermediate external dependencies file, you can’t have DLL 1 and DLL 2 have dependencies on each other, because neither can be completed until the other is completed. The import library, on the other hand, can be generated before the DLL is fully linked.

    Given that you’ve got to create some kind of imports file, it made a lot of sense to produce it in the same format the linker uses, name decoration and all. Doing that was lots easier than changing the existing linker to understand a new format like DEF or something similar.

    So the specifics of the format are to some extent accidents of history. The linker had a format already for linking stuff. DLLs needed import libraries to solve the chicken and egg problem. Voila.

    And of course you could change the linker to allow linking directly to finished DLLs. There’s just not a lot of incentive to do so, since that wouldn’t remove the necessity for import libraries.

  33. Matthew says:

    The big point is still being lost for the most part.

    DLLs don’t contain any kind of mappings primarily because they don’t need them.

    The mappings are completely meaningless and entirely useless at runtime. They are only useful for compile/link time. As such, you are bloating your modules by including elements which serve absolutely no purpose at runtime, just so that you don’t have to have an additional file during compile/link time.

    The only thing that matters at runtime is that ordinal X or *externally-visible* name Y corresponds to address A. Nobody at runtime cares that ‘Y’ is an alias for internal name ‘ABCDEFG@$#BBQ’. It doesn’t matter what it used to be called during development; at runtime, you only care about what it is called *right now*.

    As Raymond says, you don’t keep track of information that you don’t need.

    Also, @Matt G:

    "…but if you know the ordinal (which is necessary to use GetProcAddress) you could still get by without the import library."

    This isn’t about explicit linking. Explicit linking doesn’t require squat. All responsibility is on the developer to ensure that everything they do makes sense (e.g., that if a symbol they expect doesn’t exist, they know what to do, and that if it does, that they know the appropriate form to cast to).

    Import libraries are only used for implicit dynamic linking, whereby you are already tying in to an external symbol during link instead of waiting until runtime to determine anything and everything.

  34. Random832 says:

    “[Another problem is that stdcall decorates differently on different platforms. -Raymond]”

    “The linker would do some pattern matching to realize that the unresolved symbol _foo@12 corresponds with FOO.DLL!foo”

    Again it comes back to the question of why. It is clear enough now that a number of decisions* have been made that make it necessary to have an import library; what is not clear is what benefit those decisions bring (or were perceived to bring) that is worth the extra complexity, as compared to standardizing name decoration** and putting the decorated names in the dll and using the decorated names for linking to the DLL at both compile time and runtime.

    And having a different name on a different platform isn’t really insurmountable when you need both a different dll and a different exe to run on a different platform, but in principle you could standardize it to the point where this isn’t necessary.

    *including, for that matter, decorating stdcall at all. I’ve always taken this for granted, but the reason this should be necessary is not obvious.

    **you don’t need to use the same compiler everywhere to decorate names the same way, just like you don’t need to use the same compiler to pass arguments the right way round on the stack.

    Ditto @Henning Malkolm: How exactly does the DLL get the undecorated names anyway, when the object modules it is built from only contain decorated names?

    [Standardized decoration would have been great, but which vendor’s decoration scheme do you standardize on? And once you have standardized decoration, you have to have standardized C++ object layout, because you’ve made it possible to link C++ modules from different vendors. Your Win32 ABI now encompasses all of C++. (And now every other language compiler needs to understand C++ in order to decorate its symbols correctly. But what if a language [e.g. Managed C++] has a feature that doesn’t exist in C++? How do you decorate that?) -Raymond]
  35. Matthew says:

    @asd:

    "After all, it seems pretty logical. If you can LoadLibrary, GetProcAddress and cast it to whatever function pointer you want, why do you have to go to all the troubles to tell compiler to do just that, only automatically?"

    That’s what import libraries are for. To provide all of the necessary translation – including from internal symbols to the exported, visible symbols, the necessary casting and calling conventions to apply to the address when it is actually called, etc.

    Without such added information, it would be downright magical. There is no way of knowing how your internally-defined external reference maps along the symbol chain, or whether it needs to be accessed by ordinal. If it could do all of this, then yes, it could apply your prototype to cast the function pointer appropriately – but you can’t get to that point without knowing the absolute mapping, which is not contained in the DLL.

    See Raymond’s response to the comment by Leif Strand.

    Another general comment:

    People need to realize that *why* they get certain output is more important than the fact that they got it. What do I mean? Well, take the comment on how they get decorated symbols as export names. That’s all fine and dandy, but why? Because the spec requires it? NO. It’s because the default options in that particular build environment happen to output it that way.

    It’s like those who assert that ‘++i’ and ‘i++’ are always equivalent in efficiency when used in a standalone statement, simply because the machine code generated by their build environment is identical. Just because your compiler works in a certain way (in this case, optimizing postfix increment into prefix increment when it doesn’t matter) doesn’t mean that all compilers are mandated by a standard to operate in that way.

  36. porter says:

    > Import libraries exist specifically because you may need (in fact, often need) to generate two DLLs that call each other.

    Another excellently layered architecture!

    NetBSD will happily take a segmentation trap over circular library references; try using ldd on one.

    If you have a circular reference, how do you unload them?

  37. HagenP says:

    "[…] all you get are the function parameter types and return type […]"

    With COM, you also get the names of the parameters. So you can better aim (for shooting yourself in the foot).

    (Side note: After studying MSDN for three days, exporting a COM interface from a .NET public interface becomes really simple. What happened to one-page samples that teach you the same thing in 15 minutes?)

    "[…] Would security patches also have to patch your SDK?"

    Well, yes. That’s exactly what KMDF/UMDF are trying to do with their co-installer system. Newer framework libraries shall be able to replace older ones – even for already installed drivers. One major reason for this is security fixes for drivers.

  38. asd says:

    >> [I think you’re confusing the compiler and the linker. Remember, under the classical model, the linker does not generate code. (The linker also doesn’t know what language you wrote your program in.) -Raymond]

    Isn’t it possible for the compiler to automatically generate an additional “auto-imports” file, which would contain references to the imported functions mentioned in code? Then the linker would link everything just fine, not even knowing about the trick. This way you don’t break the two-layered architecture.

    Of course that probably won’t work with exported classes and such, as the compiler has nowhere to get their link info from, but they’re rarely exported this way. So if anybody uses them, they might as well use import libraries.

    [How does the compiler know that the imported name “_bar” is coming from FOO.DLL and not from another file BAR.C in your project? And how does it know that _bar maps to FOO.DLL? Remember, we’re in the old days before __declspec. Oh, and if FOO.C and BAZ.C both imported FOO.DLL!_bar, then the linker will raise a “multiply defined symbol” error. (Under the classical model, a symbol could be defined only once. COMDAT didn’t exist back then.) -Raymond]
  39. asd says:

    >> That’s what import libraries are for. To provide all of the necessary translation – including from internal symbols to the exported, visible symbols, the necessary casting and calling conventions to apply to the address when it is actually called, etc.

    No, wait, sorry, I still don’t get it. Are you saying that it is impossible to implement auto-imports for functions specifically in a separate compiler/linker architecture? Because we already have a working example of how a mixed compiler/linker actually manages to auto-import functions perfectly fine – that’s Delphi:

    function abcd(params): result; call_convention; external 'dllname' name 'exportname'/index export_index;

    (I’m sure there are more examples, I’m just too lazy to google for them)

    This works. In other words, just as one would think after learning that we have LoadLibrary/GetProcAddress, it’s possible to teach the compiler or the linker or both (as in Delphi) to do function imports automatically, from only header info and just a bit more (an import name or import ordinal).

    I bet it’s possible to implement this even in a separate compiler/linker scheme. I suggested one way in a previous post, and even if it’s flawed, there could be other ways to do the same without heavily modifying the linker.

    The problem with the import library scheme is just that you have to have both a header and an import library. This seems like overkill. After all, you don’t need an import library to do GetProcAddress and cast the result to your type, right? You only need the function type, which can be declared in a header just fine, and the function name, which can be declared there too. So the requirement to have an additional file to do the same thing automatically seems strange.

    Just to make things clear, I don’t dispute that you need an import library for accessing things in a dll other than exported functions. Also, if your/Raymond’s point was just to explain “why things went this way” rather than to prove this was the only way to go, I have nothing to object to either.

    (p.s. Oh, but then, another reason why things might have gone that way is probably compatibility with C++ standards. Although I’m not sure what in particular could have caused this. Attributes? I believe Microsoft already has Microsoft-specific C++ attributes anyway.)

    [Now you’re proposing changes to the language, which was something the original design specifically avoided. That way you didn’t need a special “Windows-enabled” version of your compiler; you could keep using your old one and link it to the import LIBs. Important when there didn’t exist a “Windows-enabled” version of your compiler in the first place. If we were allowed to change the language, then I could add a keyword dllimport(“FOO.DLL”). Oh, and then I’d go back in time and invent COMDAT so I wouldn’t get “multiply defined” errors. -Raymond]
  40. ikk says:

    [quote]That way you didn’t need a special “Windows-enabled” version of your compiler;[/quote]

    Now that’s a whole different story. We are used to Windows-enabled compilers, with __declspec, __stdcall and whatever.

    If there was a requirement of keeping the same linker AND the same compiler, I can say I understand. On the other hand, if you can change the compiler, it is possible to push the parameters as the DLL requires (no need for a stub, and no need for a linker that generates code) and call the function directly, even if some name remapping is needed.

    [Well, except that functions imported from DLLs are not called directly, remember? (How can they, since the address is not known until runtime.) Direct calls work great for statically-linked functions; not so great for dynamically-linked functions.]

    [quote]How does the compiler know that the imported name “_bar” is coming from FOO.DLL and not another file BAR.C in your project?

    And how does it know that _bar maps to FOO.DLL? Remember, we’re in the old days before __declspec.[/quote]

    Again, I guess most readers didn’t get it (including me) because we are taking __declspec for granted. FOO.DLL could be treated just like BAR.OBJ, that is, just another object file. If there are functions with the same name, there will be an error like “multiply defined symbol”.

    [quote]Oh, and if FOO.C and BAZ.C both imported FOO.DLL!_bar, then the linker will raise a “multiply defined symbol” error.[/quote]

    Why? The symbol is defined only once, that is in the DLL. You seem to assume that a stub is necessary. Just think of the DLL as a special object file.

    I think that with __declspec and maybe other compiler features, import libraries are not needed. You can specify everything in the header file and the compiler does the rest.

    [The symbol _bar doesn’t actually exist in FOO.DLL (because FOO.DLL is not an object library). The compiler would have to autogenerate the import table entry (in Win32, the __imp_bar variable), and it’s that metadata that would be multiply defined. -Raymond]
  41. Random832 says:

    The problem is, we’re *not* in the old days before – well for my argument it’s mainly __stdcall. Compilers today *are* windows-enabled. When that happened, import libraries could have been removed from the model (since we’re all more or less in agreement that the contents of import libraries are compiler-specific, right?).

    So the question is, why didn’t that happen.

    In the classical model, as you’ve said the linker *resolves symbols* (between multiple object modules and static libraries).

    In the current model on unix systems (which is where several people in this discussion are getting their point of view on this), the linker resolves the symbols *against the .so* at link time, and once again at runtime. It’s not clear [other than decorated names in some cases] what information a .so has in it that a .dll does not.

    [Yes, it could have changed, but one of the principles of Win32 was to be as similar to Win16 as possible to ease the transition. It just strikes me as awfully heavyweight to carry a 23MB DLL around just because you need 12KB of information from it. I guess the SDK could come with “stub DLLs” that are good only for linking. (And woe unto you if you run a program which accidentally dynamically loads the stub DLL instead of the real one!) And then modify the linker so it does the fuzzy logic to map _Foo@12 to the export named “Foo”. That’s an awful lot of work (and risk) to change something that had been working fine up until now. -Raymond]
  42. Random832 says:

    I guess that’s the core difference in philosophy here. On linux the link-time linker actually resolves against the *real* .so files – that is, the same ones that are used at runtime, the ones installed on the system. The equivalent would be for the SDK to not ship with any dlls, but rather to just use the installed ones (in System32 for the windows API itself, or installed wherever you normally install third-party dlls otherwise).

    I guess the windows model would be more space-efficient for something like a cross-compilation scenario, where the system compiling a program is not expected to be capable of running it.

    Thanks, I think this answers the question.

    ‘And then modify the linker so it does the fuzzy logic to map _Foo@12 to the export named "Foo".’ Well, that or name the export _Foo@12. But I do see your point, one way or another that’s another aspect that would have to be changed.

  43. ulric says:

    @Random832

    >linux the link-time linker actually resolves against the *real* .so files

    you can link with the .so, but you don’t have to.

    On unix you can compile your application or .so so that symbols will only be resolved at run time. This is how you can compile two .so files that use each other.

    The runtime linker will pick up the symbols wherever they are exported from, in any shared library.
