What happens if you call VirtualAlloc to MEM_COMMIT a page you never MEM_RESERVE?


A customer reported that while trying to solve a problem with their program, they noticed that they had been calling VirtualAlloc incorrectly for years. They were able to reduce it to a simple program:

#include <windows.h>
#include <stdio.h>
#include <tchar.h>

int _tmain(int argc, _TCHAR* argv[])
{
 LPVOID base = VirtualAlloc(NULL, 4096, MEM_COMMIT, PAGE_READWRITE);
 _tprintf(TEXT("Allocated at %p\n"), base);
 return 0;
}

First of all, thank you for reducing your program. That really focuses the investigation.

The customer noted that their code was passing the MEM_COMMIT flag without the MEM_RESERVE flag, a scenario that is specifically called out in the documentation:

The function fails if you attempt to commit a page that has not been reserved. The resulting error code is ERROR_INVALID_ADDRESS.

But their call to VirtualAlloc was succeeding! The customer suspected that this was not actually the source of their problem, but they wanted to double-check whether their incorrect use of VirtualAlloc was somehow indirectly contributing to it. Specifically, they were wondering if what they're doing is okay, or whether they should always use MEM_RESERVE | MEM_COMMIT.

What the customer found is a compatibility hack. A lot of applications forget to set the MEM_RESERVE flag when they MEM_COMMIT, so the memory manager lets it slide if they also pass lpAddress = NULL, indicating that they are requesting a new allocation rather than modifying an existing one.

The problem is that MSDN fell into the trap of over-documenting. Instead of documenting the contract, MSDN documented the implementation. The contract is "A page being committed must also be reserved." If you try to commit a page that is not also reserved, then the behavior is unspecified. It is therefore valid for the implementation to treat the violation as "Sorry, you lose," or "Okay, I'll let you do it, but just this time."

It appears that some time after this issue was identified, the MSDN documentation was revised. But they didn't revise it by documenting the contract. They revised it by documenting the implementation more precisely.

Attempting to commit a specific address range by specifying MEM_COMMIT without MEM_RESERVE and a non-NULL lpAddress fails unless the entire range has already been reserved. The resulting error code is ERROR_INVALID_ADDRESS.

My recommendation to the customer was to switch to MEM_RESERVE | MEM_COMMIT, since that is the preferred behavior and therefore the one least likely to trigger compatibility behavior. But the fact that they were accidentally omitting the MEM_RESERVE was not related to their problem, and they should keep looking.

Comments (12)
  1. dirk gently says:

    So, doesn't the implementation become the contract as soon as you document it?

  2. Adrian says:

    I wish there were an easy, low-overhead way to know when your code has triggered a compatibility shim.  Like if AppVerifier screamed or if there were notices in the debug output when running your app in the debugger.  Developers used to have checked versions of Windows, with more thorough parameter checking on the APIs.  Without such things, more and more code will be developed that becomes dependent on these shims, and there will inevitably be cases where the shim won't cover all the cases or Microsoft will have to abandon the shim in order to move forward.

  3. Random User 230985712 says:

    Adrian,

    I haven't had an opportunity to look at an MSDN subscription lately. Do they no longer provide checked builds?

  4. @Random User 230985712: Not for Windows 10, apparently.  I can find checked builds for the earlier releases though.

  5. Myria says:

    I would dare say that *most* programs calling VirtualAlloc pass MEM_COMMIT without MEM_RESERVE.  This compatibility functionality is embedded into NtAllocateVirtualMemory, rather than a compatibility shim.  I think it's too late to change it.

  6. alegr1 says:

    I think the implementation worked that way from day 1. The flags specify what should be the result of the function, not what should be all the intermediate steps to get to that result. If you ask for new committed memory, you don't specify that you want the MM to first reserve it and then immediately commit it. That must have been the developers' intention all along.

  7. It wouldn't be that difficult to scream if the program is being run under a debugger. That way it forces the developers to fix it (they won't be able to debug otherwise). I believe the heap manager does that if you start under a debugger and cause a memory problem.

  8. cheong00 says:

    [It wouldn't be that difficult to scream if the program is being run under a debugger.]

    The kernel mode code inside APIs doesn't know what the calling process is, so there is a problem.

    On the other hand, "Just throw when run in Checked build" is much much easier.

  9. David Totzke says:

    I see this listed in my subscription:

    Windows 10 Symbols Debug/Checked (x64) – (English)

    en_windows_10_symbols_debug_checked_x64_6903166.msi

    It's only 859 Mb so maybe you just install it over a normal Windows 10 installation?  Others may know.

  10. quacka6 says:

    @David, That's just the symbols

  11. Zack says:

    "if (addr == 0 && (flags & MEM_COMMIT)) flags |= MEM_RESERVE;" compiles to less machine code than "if ((flags & (MEM_COMMIT|MEM_RESERVE)) == MEM_COMMIT && CallingProcessWantsToBeLinted()) { ReportError(); }", and adding the second bit doesn't eliminate the need for the first bit, and the API is more ergonomic with this feature anyway.

  12. Joshua says:

    I just happened to discover I'm among those who made this mistake (NULL address, MEM_COMMIT but no MEM_RESERVE) in working on some code yesterday. One program fixed, millions to go. Really fortuitous discovery while this blog post is still open for comments as this code doesn't even get opened every year.

