Although the x64 calling convention reserves spill space for parameters, you don’t have to use them as such


Although the x64 calling convention reserves space on the stack as spill locations for the first four parameters (passed in registers), there is no requirement that the spill locations actually be used for spilling. They're just 32 bytes of memory available for scratch use by the function being called.

We have a test program that works okay when optimizations are disabled, but when compiled with full optimizations, everything appears to be wrong right off the bat. It doesn't get the correct values for argc and argv:

int __cdecl
wmain( int argc, WCHAR** argv ) { ... }

With optimizations disabled, the code is generated correctly:

        mov         [rsp+10h],rdx  // argv
        mov         [rsp+8],ecx    // argc
        sub         rsp,158h       // local variables
        mov         [rsp+130h],0FFFFFFFFFFFFFFFEh
        ...

But when we compile with optimizations, everything is completely messed up:

        mov         rax,rsp 
        push        rsi  
        push        rdi  
        push        r13  
        sub         rsp,0E0h 
        mov         qword ptr [rsp+78h],0FFFFFFFFFFFFFFFEh 
        mov         [rax+8],rbx    // ??? should be ecx (argc)
        mov         [rax+10h],rbp  // ??? should be rdx (argv)

When compiler optimizations are disabled, the Visual C++ x64 compiler will spill all register parameters into their corresponding slots. A nice side effect is that debugging is a little easier, but really it happens simply because you disabled optimizations, so the compiler generates simple, straightforward code and makes no attempt to be clever.
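To make the "corresponding slots" concrete, here is a minimal sketch of where each home slot sits relative to rsp at function entry, before the prologue adjusts the stack pointer. The names home_space_size and home_slot_offset are hypothetical helpers invented for illustration; the only facts assumed are that [rsp] holds the return address at entry and that the four slots follow it in parameter order.

```c
#include <assert.h>
#include <stddef.h>

/* Four register parameters (rcx, rdx, r8, r9), each with an 8-byte slot. */
enum { REGISTER_PARAM_COUNT = 4, SLOT_SIZE = 8 };

/* Total spill space reserved by the caller: 4 * 8 = 32 bytes. */
size_t home_space_size(void)
{
    return REGISTER_PARAM_COUNT * SLOT_SIZE;
}

/*
 * Offset of a parameter's home slot from rsp at function entry.
 * [rsp+0] holds the return address, so parameter 0 lives at [rsp+8],
 * parameter 1 at [rsp+10h], and so on.
 * (Hypothetical helper, for illustration only.)
 */
size_t home_slot_offset(size_t param_index)
{
    assert(param_index < REGISTER_PARAM_COUNT);
    return SLOT_SIZE * (param_index + 1);
}
```

This matches the unoptimized listing: argc (parameter 0) goes to [rsp+8] and argv (parameter 1) goes to [rsp+10h]. In the optimized listing, rax holds the entry value of rsp, so [rax+8] and [rax+10h] are those same two slots, now holding rbx and rbp.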

When optimizations are enabled, then the compiler becomes more aggressive about removing redundant operations and using memory for multiple purposes when variable lifetimes don't overlap. If it finds that it doesn't need to save argc into memory (maybe it puts it into a register), then the spill slot for argc can be used for something else; in this case, it's being used to preserve the value of rbx.

You see the same thing even in x86 code, where the memory used to pass parameters can be re-used for other purposes once the value of the parameter is no longer needed in memory. (The compiler might load the value into a register and use the value from the register for the remainder of the function, at which point the memory used to hold the parameter becomes unused and can be redeployed for some other purpose.)

Whatever problem you're having with your test program, there is nothing obviously wrong with the code generation provided in the purported defect report. The problem lies elsewhere. (And it's probably somewhere in your program. Don't immediately assume that the reason for your problem is a compiler bug.)

Bonus chatter: In a (sadly rare) follow-up, the customer confessed that the problem was indeed in their program. They put a function call inside an assert, and in the nondebug build, they disabled assertions (by passing /DNDEBUG to the compiler), which means that in the nondebug build, the function was never called.
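The customer's actual code isn't shown, so what follows is only a sketch of the general mistake, with made-up names (initialize, g_initialized, and the three run_ functions are all hypothetical). It relies on the standard behavior that the assert macro is redefined according to the current state of NDEBUG each time assert.h is included, which lets one source file imitate both builds.

```c
#include <stdio.h>

/* Hypothetical routine whose side effect the program depends on. */
int g_initialized = 0;
int initialize(void) { g_initialized = 1; return 1; }

/* Imitate the nondebug build, as if /DNDEBUG had been passed. */
#ifndef NDEBUG
#define NDEBUG
#endif
#include <assert.h>
int run_with_assertions_disabled(void)
{
    g_initialized = 0;
    assert(initialize());   /* compiled out: initialize() is never called */
    return g_initialized;
}

/* Imitate the debug build: re-including assert.h re-evaluates NDEBUG. */
#undef NDEBUG
#include <assert.h>
int run_with_assertions_enabled(void)
{
    g_initialized = 0;
    assert(initialize());   /* the call happens, and the check passes */
    return g_initialized;
}

/* One way to fix it: make the call unconditionally, assert on the result. */
int run_fixed(void)
{
    g_initialized = 0;
    int ok = initialize();
    assert(ok);
    (void)ok;               /* avoids an unused-variable warning if assert is compiled out */
    return g_initialized;
}
```

The first function returns 0 because the call vanished along with the assertion; the other two return 1. The fix keeps the side effect outside the assert so that it survives no matter how the build is configured.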

Extra reading: Challenges of debugging optimized x64 code. That .frame /r command is a real time-saver.

Comments (18)
  1. Adam Rosenfield says:

    The application programmer says "It must be a bug in the library."  The library programmer says "It must be a bug in the kernel."  The kernel programmer says "It must be a bug in the hardware."  They're nearly always wrong.

  2. Sunil Joshi says:

    @Adam Rosenfield

    And what does Intel say? It's a bug in Physics?

  3. Dan Bugglin says:

    @Sunil you should already know the answer to that: en.wikipedia.org/…/Pentium_FDIV_bug

  4. Just Dave says:

    I used to work with a guy who blamed nearly all of his bugs on the .NET framework.  He never found a bug in the framework.

    I had an irritating issue that I assumed was my own bug, but I couldn't find it.  I opened a support case, and it was eventually determined to be a bug in the .NET Sql Provider (very much a corner case).  Neat!

  5. MazeGen says:

    It's interesting how "x86" became a synonym for "32-bit".

  6. configurator says:

    MazeGen: It didn't. x86 is a specific type of 32-bit processor; others don't have x86 code, so it would be wrong (or undefined) to say you're talking about 32-bit code.

  7. lixiong says:

    ".frame /r" !!!!!! Thanks~~~

    Before reading this blog, I thought I was clever because I could read the ASM code to find out the parameters…

  8. Joshua says:

    I've found a nasty bug in .NET Framework.

    Pooled connections inherit the transaction isolation level from the previous user of the connection.

    It's no fun when a data saver that must be thread-safe gets READ UNCOMMITTED from a long-running report.

  9. MazeGen says:

    configurator: The term "x86" is vague. Intel architecture manuals don't even mention it so one could argue it's "a specific type of 32-bit processor". For example, Wikipedia says that it is derived "from the fact that early successors to the 8086 also had names ending in "86"". Neither 8086 nor 80186 is a 32-bit processor.

  10. kinokijuf says:

    Microsoft calls it officially i386.

  11. ErikF says:

    @MazeGen: It is doubtful that anyone would be discussing the 16-bit varieties today, just as it is understood that a "Windows program" is going to be Win32 or Win64 nowadays. I prefer the term "x86" because it is less ambiguous than "32-bit"; 32-bit could mean anything from a VAX to a Motorola 68xxx to a MIPS!

  12. Alex Grigoriev says:

    I'm not surprised anymore when I find a bug or a bottleneck in the kernel mode. Mostly with storport.sys, but sometimes in another component.

  13. @ErikF: Take a look at comp.lang.asm.x86 where 16-bit code is still regularly discussed.  I tend to call the 32-bit variant IA-32: en.wikipedia.org/…/IA-32

  14. RobThree says:

    "They're just 32 bytes of memory available"… wouldn't that be 32 BITS?

  15. Alex Grigoriev says:

    @RobThree:

    4 of 8-byte registers=32 bytes.

  16. @Alex Grigoriev says:

    Yep; thought of that as soon as I posted… D'uh….

  17. Bob says:

    Processor folks call it a soft error:  en.wikipedia.org/…/Soft_error

  18. Myria says:

    I call them x86-16, x86-32, and x86-64, specifying "real mode" or "protected mode" in the 16-bit case.  That avoids the annoying ambiguity of the names, and is more reflective of what the machine code actually is.

    As for bugs in Windows, I have found a total of two actual Windows API bugs *ever*.  And I've done some crazy stuff.

    1. KiUserExceptionDispatcher didn't have a "cld", which could break vectored handler functions if the exception occurred during a memmove call.  Fixed in Vista.

    2. WOW64's NtQuerySystemInformation(SystemHandleInformation) forgot to increment an index when converting the 64-bit function's result to the 32-bit format, meaning that the first element of the array was the last handle, and all the other elements were uninitialized.  Fixed in 7.

    By the way, you can use this Win64 x86-64 parameter shadow space from leaf functions, not just frame functions.  This gives leaf functions 32 bytes of stack space to work with.
