The classical model for linking


Commenter Adam wonders why we need import libraries anyway. Why can't all the type information be encoded in the export table?

This goes back to the classical model for linking. This model existed for decades before Microsoft was even founded, so at least this time you don't have Bill Gates to kick around. (Though I'm sure you'll find a way anyway.)

Back in the days when computer programs fit into a single source file, there was only one step in producing an executable file: You compile it. The compiler takes the source code, parses it according to the rules of the applicable language, generates machine code, and writes it to a file, ready to be executed.

This model had quite a following, in large part because it was ridiculously fast if you had a so-called one-pass compiler.

When programs got complicated enough that you couldn't fit them all into a single source file, the job was split into two parts. The compiler still did all the heavy lifting: It parsed the source code and generated machine code, but if the source code referenced a symbol that was defined in another source file, the compiler didn't know what memory address to generate for that reference. The compiler instead generated a placeholder address and included some metadata that said, "Hey, there is a placeholder address at offset XYZ for symbol ABC." And for each symbol in the file, it also generated some metadata that said, "Hey, in case anybody asks, I have a symbol called BCD at offset WXY." These "99%-compiled" files were called object modules.

The job of the linker was to finish that last 1%. It took all the object modules and glued the dangling bits together. If one object module said "I have a placeholder for symbol ABC," it went and looked for any other object module that said "I have a symbol called ABC," and it filled in the placeholder with the information about ABC, a process known as resolving the external reference.

When all the placeholders got filled in, the linker could then write out the finished executable module. And if there were any placeholders left over, you got the dreaded unresolved external error.
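
To make the mechanics concrete, here is a minimal sketch of the classical model in C++. Everything in it is invented for illustration: the ObjectModule layout, the one-word-per-slot stand-in for machine code, and the linkModules function. Real object formats carry a great deal more, but the bookkeeping is roughly this.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, heavily simplified object module (not any real file format).
struct Definition  { std::string name; std::size_t offset; };  // "I have a symbol called BCD at offset WXY"
struct Placeholder { std::string name; std::size_t offset; };  // "placeholder at offset XYZ for symbol ABC"

struct ObjectModule {
    std::vector<std::size_t> code;          // one "word" per slot, standing in for machine code
    std::vector<Definition>  definitions;
    std::vector<Placeholder> placeholders;
};

// Concatenate the modules, then fill in every placeholder with the final
// address of the symbol it names. Note that names are all we have to go on.
std::vector<std::size_t> linkModules(const std::vector<ObjectModule>& modules) {
    std::vector<std::size_t> image;
    std::map<std::string, std::size_t> addresses;
    std::vector<Placeholder> pending;

    for (const ObjectModule& m : modules) {
        const std::size_t base = image.size();  // where this module lands in the image
        image.insert(image.end(), m.code.begin(), m.code.end());
        for (const Definition& d : m.definitions)
            addresses[d.name] = base + d.offset;
        for (const Placeholder& p : m.placeholders)
            pending.push_back({p.name, base + p.offset});
    }

    for (const Placeholder& p : pending) {
        assert(p.offset < image.size());        // placeholder must lie inside the image
        const auto found = addresses.find(p.name);
        if (found == addresses.end())
            throw std::runtime_error("unresolved external symbol " + p.name);
        image[p.offset] = found->second;        // resolve the external reference
    }
    return image;
}
```

Linking a module that has a placeholder for ABC together with a module that defines ABC patches the placeholder with ABC's final address. Linking the first module alone throws, which is this sketch's version of the unresolved external error.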

Notice that the only information about symbols that is provided in the object module is the symbol name. Older languages trusted the programmer to get everything else right. If your FORTRAN program defined a common block with two integers and a real, and you referenced it from another source file, it was simply a language requirement that when you access the common block, you must treat it as having two integers and a real. The compiler was not under any obligation to verify that your uses of the common block were consistent. Similarly, if your C program took a function returning long and redeclared it as a function returning int, the compiler merely agreed to your little subterfuge, and you were on the hook for the consequences.

Given the classical model for linking, that's pretty much all the language specification could do. All that was shared between object modules was symbol names. And back in the old days, symbol names were restricted to a maximum of eight characters consisting of uppercase letters or digits.

The C++ language came up with a workaround: They encoded the type information in the symbol name, a technique known as decoration. Your function which is named Resolve in the source code ends up with the name ?Resolve@@YG_NPAGI_N@Z in the object module, so that it can be matched up against the placeholders which ask for a function named ?Resolve@@YG_NPAGI_N@Z. The C++ language folks could get away with this because by the time the C++ language rolled around, the maximum length for a symbol was far greater than 8, and the repertoire of valid characters had grown significantly. And if you were one of the dinosaurs using an older system with the 8-character uppercase-only limitation, then you were just out of luck.
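
As a rough illustration of the idea, here is a toy decoration function. The decorate name and the `$`-separated encoding are made up for this sketch; this is not the scheme Visual C++ or any other real compiler uses. The point is only that once the parameter types are folded into the name, a linker that compares nothing but names ends up checking types as a side effect.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical decoration scheme, for illustration only.
// Appends each parameter type to the function name, separated by '$'.
std::string decorate(const std::string& name,
                     const std::vector<std::string>& parameterTypes) {
    assert(!name.empty());  // a symbol needs a base name
    std::string symbol = name;
    for (const std::string& type : parameterTypes) {
        symbol += '$';
        symbol += type;
    }
    return symbol;
}
```

With this scheme, a definition of Resolve taking an int and a reference to Resolve taking a long produce different symbol names, so the mismatch shows up at link time as an unresolved external instead of going unnoticed.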

But even the greater symbol name length doesn't solve all type mismatches. For example, symbols for structures and unions are not decorated with the members of the structure or union. You can have one C++ file declare a structure called S as

struct S {
 int i;
 float f;
};

and have another C++ file declare it as

struct S {
 float f;
 int i;
};

and most compilers won't catch the mismatch.
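
Here is a small sketch of why that mismatch matters. Two namespaces stand in for the two source files, since a single file cannot define S twice; the names file1, file2, and layoutsAgree are invented for the example.

```cpp
#include <cstddef>

// Stand-ins for the two conflicting definitions in separate source files.
namespace file1 { struct S { int i; float f; }; }
namespace file2 { struct S { float f; int i; }; }

// True only if both definitions put each member at the same offset.
constexpr bool layoutsAgree() {
    return offsetof(file1::S, i) == offsetof(file2::S, i) &&
           offsetof(file1::S, f) == offsetof(file2::S, f);
}
```

The function returns false: the first definition puts i at offset zero, and the second puts it after f. Code compiled against one definition would read the wrong bytes from an object created by the other, and nothing in the symbol names would alert the linker.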

With that historical background, we can begin addressing Adam's question next time.

Sidebar: For those interested in nonclassical linking, there's this article on changes to linker scalability in Visual C++ 2010.

Comments (12)
  1. Mark says:

    There’s something very screwy going on with your autoposter (in case you hadn’t noticed).

    [This is all fallout from the systems upgrade last Friday. Hopefully the dust will settle soon. -Raymond]
  2. Alexandre Grigoriev says:

    The answer "why you need import libraries":

    1. You can combine multiple import libraries into one LIB, along with "real" OBJs.

    2. You can make an import library from a DEF file, even if you don’t have a real DLL at hand.

    3. You could provide only public exports in a LIB, leaving private exports out.

  3. Alexandre Grigoriev says:
    1. You can make 2 DLLs cross-link each other, without trying to solve egg-chicken dilemma.
  4. Yuhong Bao says:

    "This goes back to the classical model for linking. This model existed for decades before Microsoft was even founded, so at least this time you don’t have Bill Gates to kick around. "

    And thus is not even unique to Windows.

  5. SuperKoko says:

    C++ compilers use name decoration to avoid name collisions of overloaded functions. That’s the main reason.

    Some compilers (e.g. Borland C++) don’t include the return type in the decorated name, because functions cannot be overloaded on their return types.

  6. I love your mentioning one-pass compiler – yeah: it is Delphi Programming for ever and ever (now with Win7 and multitouch in the VCL):-)

  7. David M says:

    Delphi still has a one-pass compiler, and is still blazingly fast.  It now includes a linker step, but the object files (DCUs) are smarter than OBJs (pretty much raw compiler output, the format changes with every compiler revision, and the linker can load them straight in) and so the performance there is good too.  It’s still not uncommon for a large program to build in a few seconds.

  8. Adrian says:

    Good post.

    External names used to be limited to SIX characters (at least when C was first standardized by ANSI), which is why strncpy wasn’t called strcpyn.

  9. And even C decorates names these days, for functions with stdcall (aka WINAPI) calling convention.

  10. Worf says:

    Actually, C name decoration is an x86 artifact. Other architectures let you get away without WINAPI and other stuff like PASCAL or cdecl because they have a single calling convention. (I’ve had code that compiles fine on ARM and MIPS but dies on x86 because of a missed WINAPI). x86 doesn’t, which leaves us with a million ways of calling a function.

    Anyhow, a peculiarity of Win32 is that all executables (DLLs, EXEs and others of the sort) must have symbols resolved, even if the symbol is in another file. ELF doesn’t, and lets you do weird things like have unresolved symbols in dynamic libraries. (Executables can’t, for obvious reasons, but the linker just needs proof that there’s something somewhere during linking that will provide it). Fun things happen when you link with one library, but provide another during runtime (cross-compiling – you need a "proof" library during linking, but the actual one used during execution can be different).

    But since an ELF shared object can have dangling references, you don’t find out until you run and load… (this way stuff like C libraries are linked on load).

    It does have a fun aspect. I wrote a utility that built in a library of functions client libraries had to use. Those libraries linked fine, and they linked back to the executable when the executable loaded them. When someone ported it to win32, they had to break out those functions as a DLL on its own linked by both.

  11. nitpicker says:

    > the compiler merely agreed

    … “merrily agreed”?

    [Not a typo, but now that you mention it, I wish I had written “merrily.” -Raymond]
  12. Jared says:

    Just for the record, the major mainframe architecture had no limitations on character set, and there were products which took advantage of that fact to encode limited information into external names in the early 1970s, if not before.

    (FWIW, I was one of the owners of the IEWLxxx code.)

    Virtually everything I’ve seen in the PC world has an analogue in early computing.  Today’s problems were recognized and addressed and the solutions "lost" to the newer generation who have re-invented the wheel.  

    Pioneers recognized the consequences of some choices and deliberately avoided them (cf. null terminated strings discussion) not because of resource restrictions, but because of foresight.

    Many, many "software patents" have been granted since the internet took off which are not "new art" — pioneers already used the idea in systems long forgotten by history and now reinvented for the PC.

    If Google can’t find it, it must not exist, right?
