If you get a procedure address by ordinal, you had better be absolutely sure it’s there, because the failure mode is usually indistinguishable from success


A customer reported that the Get­Proc­Address function was behaving strangely.

We have this code in one of our tests:

typedef int (CALLBACK *T_FOO)(int);

void TestFunctionFoo(HINSTANCE hDLL)
{
  // Function Foo is ordinal 1 in our DLL
  T_FOO pfnFoo = (T_FOO)GetProcAddress(hDLL, (PCSTR)1);
  if (pfnFoo) {
    // ... run tests on pfnFoo ...
  }
}

Recently, this test started failing in bizarre ways. When we stepped through the code, we discovered that pfnFoo ends up calling Bar instead of Foo. The first time we try to test pfnFoo, we get stack corruption because Bar has a different function prototype from Foo, and of course on top of that the test fails horribly because it's calling the wrong function!

When trying to narrow down the problem, we found that the issue began when the test was run against a version of the DLL that was missing the Foo function entirely. The line

    Foo @1

was removed from the DEF file. Why did the call to Get­Proc­Address succeed and return the wrong function? We expected it to fail.

Let's first consider the case where a DLL exports no functions by ordinal.

EXPORTS
    Foo
    Bar
    Plugh

The linker builds a list of all the exported functions (in an unspecified order) and fills in two arrays based on that list. If you look in the DLL image, you'll see something like this:

Exported Function Table

00049180 address of Bar
00049184 address of Foo
00049188 address of Plugh

Exported Names

00049190 address of the string "Bar"
00049194 address of the string "Foo"
00049198 address of the string "Plugh"

There are two parallel arrays, one with function addresses and one with function names. The string "Bar" is the first entry in the exported names table, and the function Bar is the first entry in the exported function table. In general, the string in the Nth entry in the exported names table corresponds to the function in the Nth entry of the exported function table.

Since it is only the relative position that matters, let's replace the addresses with indices.

Exported Function Table

[1] address of Bar
[2] address of Foo
[3] address of Plugh

Exported Names

[1] address of the string "Bar"
[2] address of the string "Foo"
[3] address of the string "Plugh"
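
The parallel-array relationship can be sketched in portable C. This is only a toy model of the simplified tables shown above, not the loader's actual code: the function bodies, the array names g_functions and g_names, and the helper lookup_by_name are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*T_FN)(int);

/* Hypothetical stand-ins for the three exported functions. */
int Bar(int x)   { return x + 1; }
int Foo(int x)   { return x * 2; }
int Plugh(int x) { return -x; }

/* Two parallel arrays: the Nth name describes the Nth function. */
const T_FN g_functions[]    = { Bar,   Foo,   Plugh   };
const char *const g_names[] = { "Bar", "Foo", "Plugh" };
enum { EXPORT_COUNT = 3 };

/* Toy lookup by name: find the name, then return the function
   stored at the same position in the other array. */
T_FN lookup_by_name(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < EXPORT_COUNT; i++) {
        if (strcmp(g_names[i], name) == 0) {
            return g_functions[i];
        }
    }
    return NULL; /* no export with that name */
}
```

In this model, lookup_by_name("Foo") returns the function in the same position as the string "Foo", and a name that is not in the table produces NULL.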

Okay, now let's introduce functions exported by ordinal. When you do that, you're telling the linker, "Make sure this function goes into the NNth slot in the exported function table." Suppose your DEF file went like this:

EXPORTS
    Foo @1
    Bar
    Plugh

This says "First thing we do is put Foo in slot 1. Once that's done, fill in the rest arbitrarily."

The linker says, "Okay, I have a total of three functions, so let me build two tables with three entries each."

Exported Function Table

[1] address of ?
[2] address of ?
[3] address of ?

Exported Names

[1] address of ?
[2] address of ?
[3] address of ?

"Now I place Foo in slot 1."

Exported Function Table

[1] address of Foo
[2] address of ?
[3] address of ?

Exported Names

[1] address of the string "Foo"
[2] address of ?
[3] address of ?

"Now I fill in the rest arbitrarily."

Exported Function Table

[1] address of Foo
[2] address of Bar
[3] address of Plugh

Exported Names

[1] address of the string "Foo"
[2] address of the string "Bar"
[3] address of the string "Plugh"

Since you explicitly placed Foo in slot 1, when you do a Get­Proc­Address(hDLL, 1), you will get Foo. On the other hand, if you do a Get­Proc­Address(hDLL, 2), you will get Bar, or at least you will with this build. With the next build, you may get something else, because the linker just fills in the slots arbitrarily, and next time, it may choose to fill them arbitrarily in some other order. Furthermore, if you do a Get­Proc­Address(hDLL, 6), you will get NULL because the table has only three functions in it.
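
Here is a rough sketch of that two-pass process in portable C. It follows the simplified model in this article (a table with exactly as many slots as there are exports), and every name in it (ExportRequest, ExportTable, build_export_table, lookup_by_ordinal, demo_lookup) is hypothetical. The toy fills the leftover slots in DEF-file order purely to keep the example deterministic; a real linker promises no particular order.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*T_FN)(int);

/* Hypothetical stand-ins for the DLL's functions. */
int Foo(int x)   { return x * 2; }
int Bar(int x)   { return x + 1; }
int Plugh(int x) { return -x; }

/* One line of the EXPORTS section; ordinal 0 means "no explicit ordinal". */
typedef struct {
    T_FN     fn;
    unsigned ordinal;
} ExportRequest;

enum { MAX_EXPORTS = 16 };

typedef struct {
    T_FN     slots[MAX_EXPORTS + 1]; /* 1-based, slots[0] is unused */
    unsigned count;
} ExportTable;

void build_export_table(ExportTable *t, const ExportRequest *req, unsigned n)
{
    unsigned i, next = 1;

    assert(n <= MAX_EXPORTS);
    t->count = n;
    for (i = 0; i <= MAX_EXPORTS; i++) {
        t->slots[i] = NULL;
    }

    /* Pass 1: honor the explicit "@N" requests. */
    for (i = 0; i < n; i++) {
        if (req[i].ordinal != 0) {
            assert(req[i].ordinal <= n);
            assert(t->slots[req[i].ordinal] == NULL);
            t->slots[req[i].ordinal] = req[i].fn;
        }
    }

    /* Pass 2: fill the remaining slots. This toy uses DEF-file order;
       a real linker is free to use any order. */
    for (i = 0; i < n; i++) {
        if (req[i].ordinal == 0) {
            while (t->slots[next] != NULL) {
                next++;
            }
            t->slots[next] = req[i].fn;
        }
    }
}

/* Toy lookup by ordinal: NULL only when the ordinal is out of range. */
T_FN lookup_by_ordinal(const ExportTable *t, unsigned ordinal)
{
    if (ordinal < 1 || ordinal > t->count) {
        return NULL;
    }
    return t->slots[ordinal];
}

/* Builds the table for "Foo @1 / Bar / Plugh" and looks up one ordinal. */
T_FN demo_lookup(unsigned ordinal)
{
    const ExportRequest def[] = { { Foo, 1 }, { Bar, 0 }, { Plugh, 0 } };
    ExportTable table;

    build_export_table(&table, def, 3);
    return lookup_by_ordinal(&table, ordinal);
}
```

With this table, demo_lookup(1) yields Foo because it was pinned there, demo_lookup(2) yields Bar only because of the fill order this toy happens to use, and demo_lookup(6) yields NULL because the table has three slots.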

I hope you see where this is going.

If you delete Foo from the EXPORTS section, this stops exporting Foo but says nothing about what goes into slot 1. As a result, the linker is free to put anything it wants into that slot.

Exported Function Table

[1] address of Bar
[2] address of Plugh

Exported Names

[1] address of the string "Bar"
[2] address of the string "Plugh"

Now, when you do a Get­Proc­Address(hDLL, 1), you get Bar, since that's the function that happened to fall into slot 1 this time.
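
The customer's situation can be reproduced with a small portable C model of the two-entry table above. This is an illustrative sketch, not the real loader: the function bodies, g_functions, lookup_by_ordinal, test_would_run and got_foo are invented names, and the slot order is simply the one shown in the table.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*T_FN)(int);

/* Hypothetical stand-ins. Foo still exists in the source code,
   but it is no longer listed in the EXPORTS section. */
int Foo(int x)   { return x * 2; }
int Bar(int x)   { return x + 1; }
int Plugh(int x) { return -x; }

/* Export table of the build without Foo; index 0 holds slot 1. */
const T_FN g_functions[] = { Bar, Plugh };
enum { EXPORT_COUNT = 2 };

/* Toy lookup by ordinal: NULL only when the ordinal is out of range. */
T_FN lookup_by_ordinal(unsigned ordinal)
{
    assert(sizeof g_functions / sizeof g_functions[0] == (size_t)EXPORT_COUNT);
    if (ordinal < 1 || ordinal > EXPORT_COUNT) {
        return NULL;
    }
    return g_functions[ordinal - 1];
}

/* Mirrors the "if (pfnFoo)" guard in the customer's test:
   returns 1 when the guard would let the test body run. */
int test_would_run(void)
{
    T_FN pfnFoo = lookup_by_ordinal(1);
    return pfnFoo != NULL;
}

/* Returns 1 only if slot 1 really holds Foo. */
int got_foo(void)
{
    return lookup_by_ordinal(1) == Foo;
}
```

In this model test_would_run() returns 1 while got_foo() returns 0: the null check passes, and nothing about the returned pointer reveals that it belongs to Bar.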

The moral of the story is that if you try to obtain a function by ordinal, then it had better be there, because there is no reliable way of being sure that the function you got is one that was explicitly placed there, as opposed to some other function that happened to be assigned that slot arbitrarily.

Related reading: How are DLL functions exported in 32-bit Windows?

Comments (18)
  1. Joshua says:

    So mingw allows linking directly to DLLs (possibly because LD can't open .lib files) which I've had to use due to the import libraries being super out of date.

    [I hope it links by name and not ordinal. -Raymond]
  2. Peter says:

    I'd say another take away from this story is: if you're going to export any one function by ordinal, then export all of them by ordinal.

  3. alegr1 says:

    @Peter:

    No, the real take is: DON'T import functions by ordinal. This is a rudiment of the times when RAM was scarce and processors were slow.

  4. Joshua says:

    [I hope it links by name and not ordinal. -Raymond]

    It does. Manually verified by strings (locating the function name in the import table).

  5. Andre says:

    So the remaining question is: how is this not obvious?

    I didn't know anything about importing functions by ordinals and just from the customer report figured out that, well, that can't possibly work. Why would you expect it to? (It would happen to work if the removed function was the last one exported and thus the deleted entry frees up that slot. But that surely isn't something you'd rely on)

  6. Kemp says:

    I might be missing something, but why would any person anywhere expect "GetProcAddress(hDLL, (PCSTR)1)" to magically know you wanted the Foo function and to fail if it was a different one? Foo is not mentioned at all in that call. Should that call fail for every other developer in the world who wanted function 1 (non-Foo) in a different DLL?

  7. Roger Lipscombe says:

    When I have called GetProcAddress with an ordinal, I've always used MAKEINTRESOURCE, rather than a (PCSTR) cast. This is mentioned in a comment on the MSDN page for GetProcAddress, but is it actually a more correct way to do it?

    I almost showed my age there: I originally wrote "(LPCSTR) cast".

  8. JJJ says:

    @Andre, @Kemp:  The developer probably thought there was a separate "Exported by ordinal" table that gets filled in when you use the "@x" directive.  When the developer removed "Foo @1", they probably expected this table to be empty (or at least slot 1 to be empty), and therefore GetProcAddress would fail.

    Not an unreasonable line of thinking, but unfortunately wrong.

    [I agree, that's probably what they were thinking. And in 16-bit Windows, they'd have been correct! -Raymond]
  9. Cesar says:

    Also note that other common operating systems don't have an equivalent to this "ordinal" business; it's all done by name. Exporting/importing by ordinal is the kind of micro-optimization which doesn't make much sense nowadays.

    (Besides Win32, I only have experience with operating systems of the Unix sort; AFAIK, all modern Unix-style systems link by name. Win32 is the odd one here, and even on Win32 most programmers pretend linking by ordinal doesn't exist.)

  10. John says:

    @Andre

    If I understood the post above it's possible it was just poor testing to not catch unexpected behavior or that the "lucky" function that was called had no side effect that was apparent?

    Another scary thought is that the linker *appears* to be consistent but in reality because there is no guarantee an update/hotfix comes down the pipe and everything stops working on the next build.

  11. Myria says:

    @Roger: MAKEINTRESOURCE is actually incorrect.  If you #define UNICODE, MAKEINTRESOURCE is defined as MAKEINTRESOURCEW, which casts an integer to const wchar_t *.  GetProcAddress is one of the few Win32 functions that does not have a wchar_t version – its string parameter is always const char *.  If you use MAKEINTRESOURCE, your code will fail to compile under #define UNICODE.  Either use the "A" version MAKEINTRESOURCEA explicitly, or just reinterpret_cast to LPCSTR/PCSTR.

    The reason GetProcAddress only takes "thin" chars as a parameter is because that's how they're stored in the image.

  12. Asking for things using a compiled in ordinal always just strikes me as asking for trouble.

  13. Henke37 says:

    This reminds me of the story of when the DirectX team had to add explicit export ordinals because programs were linking against ordinals instead of names.

  14. For a library with over 10000 exports like MFC with C++ mangled names I imagine export by ordinal would matter more. Another reason to use export by ordinal is for obfuscation purposes.

  15. Neil says:

    [And in 16-bit Windows, they'd have been correct!]

    Which begs the question, which ordinals would the 16-bit linker have used…

    [I don't see how this begs the question (since 16-bit Windows wasn't part of the question), so I'm not sure what question you're asking. Maybe this page answers your question somehow. -Raymond]
  16. Anonymous Coward says:

    This is really a comment for the ‘A lie repeated often enough becomes the truth: The continuing saga of the Windows 3.1 blue screen of death (which, by the way, was never called that)’ article, but you closed that comment thread, so I have to post it here. Feel free to move it.

    Back when we were still using Windows 3.11 we always called the blue error screens Blue Screens of Death. Because when you got them the system never returned to a state stable enough to save your files. Now, we didn't have internet at the time, but someone else in the time period did and posted the term on Usenet. In 1993. (Note: the poster was talking about NT, but my family and friends all knew and used the term and most used 3.11 rather than NT.)

  17. SD says:

    AmigaOS used jump tables for shared library calls, but libraries were expected to stay backwards compatible.  If anything DLL hell was worse because there was no way to install multiple versions of a library and many of the system libraries were in ROM.

  18. Anonymous Coward 1.1 says:

    @Chris Crowther

    There are many things in the Win32 API (and most C APIs) that use compiled-in ordinals – just usually they're #define'd.

