The Intel 80386, part 13: Calling conventions

I covered calling conventions for x86 some time ago. Consider that information incorporated by reference.

Nonstandard calling conventions are permitted, provided the caller and callee agree what the convention is. Such nonstandard calling conventions may arise as the result of link-time code generation, because the linker can create a custom calling convention and simultaneously alter all the call sites to conform to it.

In order to access stack-based inbound parameters, it is conventional (but not required) to build a stack frame, using the ebp register as the frame pointer. The ebp register points to the slot where the caller's ebp was saved, and that saved value in turn points to the corresponding slot in the previous stack frame. The result is a linked list of stack frames headed by the ebp register.

    for caller's caller
    previous ebp ←───┐
                     │
    for caller       │
┌→  previous ebp ────┘
│
│   for current function
└─  previous ebp ← current ebp
                 ← current esp

Use of ebp as a frame pointer is not mandatory, and it was fashionable at the time not to do so in order to permit general-purpose use of an additional register. This technique is known as frame pointer omission, or FPO.¹ If code was compiled with FPO, then the debugger's k command will require additional debugging information in the PDB file in order to follow the stack trace past an FPO frame.

As of this writing, the guidance is not to use FPO. This permits Watson to generate full stack traces and allows more intelligent bucketing of crashes on the back end.

If you end up debugging a module that was compiled with FPO, but for which you do not have debugging information that includes FPO information, then stack traces will unceremoniously stop when they reach an FPO function. Next time, we'll look at how to rescue those stack traces.

¹ Not to be confused with the other FPO.

Comments (14)
  1. Joker_vD says:

    > Nonstandard calling conventions are permitted, provided the caller and callee agree what the convention is.
    Well, that’s what the word “convention” means. After all, the processors’ implementation of CALL/BL/whatever is merely “copy the value of the IP register somewhere, then jump” — the parameter passing and frame linking is but a mental construction of the programmer (or the compiler).

    1. On all the other architectures, the calling convention is dictated by the OS. Private calling conventions would preclude unwinding during exceptions.

      1. Sometimes compilers use private calling conventions even on targets where the platform ABI doesn’t allow it, and the typical result is that things appear to work at first but interesting breakages surface later in scenarios the compiler writers didn’t envision.

        1. Paradice says:

          That link appears to be suggesting that Intel wrote a compiler that didn’t comply with Intel 64’s __regcall convention.

          That’s… rather impressive, actually.

          1. The compiler complied with its own calling convention just fine, but the code didn’t always end up calling what it thinks it’s calling.

            There’s a call to “foo”, which is marked up as “regcall”. So the compiler generates code to perform that call according to the regcall calling convention. Only, in this particular instance, “foo” happens to be in a shared object (=Unix equivalent to a DLL, essentially). The Linux dynamic linker defaults to “lazy binding”. That means that if your program depends on “foo” in a shared library, the dynamic linker doesn’t immediately look up the location of “foo” inside that library when your program is loaded. Instead it initially points calls to foo (and every other imported symbol) at a magic thunk function.

            Whenever the magic thunk gets called, the dynamic linker goes “oh”, figures out what function you were actually meaning to call, looks it up in the shared library, changes the function pointer for “foo” to point at the correct location (instead of the magic thunk), and then finally resumes by jumping to the just-resolved function. If everything goes right, this is completely transparent to the application, and it means only the symbols you actually use get resolved. (The idea being that if you say launch a program that depends on 200MB of libraries to see its command-line help, the dynamic linker will not spend a lot of time resolving all the symbols just to not use any of them.)

            If things go wrong, it’s not so transparent. For example, if the dynamic linker discovers during lazy binding that a function you’re trying to call doesn’t actually exist in its version of the library, there is embarrassment all around, and the dynamic linker prints an error message then kills the process.

            In this particular instance, the thing that went wrong is that the code for the magic thunk changed from one version to the next, and ended up using XMM registers that it’s allowed to overwrite as per the platform ABI, but that an Intel-custom “regcall” calling convention function is supposed to preserve. The magic thunk does not know about regcall, certainly has no idea that the function that it eventually ends up calling (post-resolve) is regcall, and merrily stomps over registers that the calling code thinks are preserved. Oops.

            The compiler thought of “bar() calls foo()” as a direct call; but with lazy binding and foo in a shared library, the actual sequence ends up being “bar() calls magic_thunk() which eventually jumps to foo()” and magic_thunk() is generic system code with no idea about the secret handshake calling convention that bar and foo agreed on between themselves. This happened to work accidentally for a while (by virtue of “magic_thunk()” not doing something it wouldn’t be allowed to do under regcall), but that was pure luck, and said luck eventually ran out.

      2. Joker_vD says:

        Um, on x86 the calling convention is also dictated by the OS — Windows and Linux use different ones, which doesn’t stop GCC from using the Linux-like ABI even when compiling for Windows (with a layer of shims for calling into Win32 API). But, GCC uses Linux-like ABI even for x64, which leads to rather amusing disassembly listings comparison for GCC vs MSVC — the used/preserved registers differ, stack layouts are different, etc.

        And how does exception handling even enter into this? Are we talking about OS- or language-level exceptions? Sure, language implementations with differently implemented exceptions interoperate poorly but again, that’s what “conventions” are for — that still doesn’t mean they’re dictated or enforced by anything.

        1. Naturally you need to conform to the ABI when interfacing with external code. The question is how much of the ABI is in effect outside of that point. For example, the stack pointer register must always point to the stack; you can’t use it as a general-purpose register. The OS needs this to be true so that it can do things like dispatch OS exceptions (“signals” in Unix-speak).

          1. Torkelly says:

            I vaguely recall a blog post on using esp as a general-purpose register, lemme see if I can find it…

            Ah, here it is. The downside is if a SEH exception occurs, Windows penalizes you for trying to sneak around the ABI by firing your process into the sun.

  2. J-Nelson says:

    In a prior calling conventions post, a commenter asked “Weren’t there also some fun stuff with the stack pointer sometimes being odd and sometimes even in WIN16?” and you responded “Egads I thought everybody had forgotten the odd/even rule. I’ll make another blog entry on that piece of Windows trivia.”

    If you made that post, I’m unable to find it. Could you follow up on this?

    1. Father Chris says:

      Thanks for asking that as I was also intrigued by the reference. I remember writing a GPF handler that walked the stack for DOS16M back in the early 90’s and had vague memories of the odd stack pointer being something I worried about.

  3. Aren’t the parameters just accessed by looking back down the stack and the stack frame is for declaring space for local variables so that if you need to call another function that stack space is preserved?

  4. anyfoo says:

    The confusion between FPO as “for placement/position only” and the Austrian party “FPÖ” in your linked post is very bizarre, since “O” and “Ö” are different letters in German. They sound different, are not interchangeable, and while there is a transliteration if you don’t have “Ö” itself available, that transliteration is “OE” or “Oe” and never “O”.

    So as a native German speaker, that confusion is pretty absurd to me. In fact, any instance where foreign speakers mistakenly transliterate Umlauts by just removing the two dots immediately jumps out as wrong/typoish. So my guess is that the person who asked just meant to be annoying.

  5. GreenCat says:

    > As of this writing, the guidance is not to use FPO.
    Why does the default MSVC project setting for x86/Arm/Arm64 release builds enable FPO, then?

Comments are closed.