Some remarks on VirtualAlloc and MEM_LARGE_PAGES


If you try to run the sample program demonstrating how to create a file mapping using large pages, you'll probably run into the error ERROR_NOT_ALL_ASSIGNED ("Not all privileges or groups referenced are assigned to the caller") when calling AdjustTokenPrivileges. What is going on?

The AdjustTokenPrivileges function enables privileges that you already have (but which are currently masked). Sort of like how a super hero can't use super powers while disguised as a normal mild-mannered citizen. In order to enable SeLockMemoryPrivilege, you must already have it. But where do you get it?

You get it by using the group policy editor. The list of privileges says that SeLockMemoryPrivilege corresponds to "Lock pages in memory".
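
A minimal sketch of the enabling step (error handling abbreviated, and assuming the account has already been granted "Lock pages in memory" through group policy). Note that AdjustTokenPrivileges "succeeds" even when it enables nothing, which is why the GetLastError check afterward is essential:

    #include <windows.h>

    // Enable SeLockMemoryPrivilege on the current process token.
    BOOL EnableLockMemoryPrivilege(void)
    {
        HANDLE token;
        TOKEN_PRIVILEGES tp = { 0 };
        BOOL ok = FALSE;

        if (OpenProcessToken(GetCurrentProcess(),
                             TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &token)) {
            tp.PrivilegeCount = 1;
            tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
            if (LookupPrivilegeValue(NULL, SE_LOCK_MEMORY_NAME,
                                     &tp.Privileges[0].Luid)) {
                AdjustTokenPrivileges(token, FALSE, &tp, 0, NULL, NULL);
                // ERROR_NOT_ALL_ASSIGNED here means the privilege was
                // never granted to this account in the first place.
                ok = (GetLastError() == ERROR_SUCCESS);
            }
            CloseHandle(token);
        }
        return ok;
    }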

Why does allocating very large pages require permission to lock pages in memory?

Because very large pages are not pageable. This is not an inherent limitation of large pages; the processor is happy to page them in or out, but you have to do it all or nothing. In practice, you don't want a single page-out or page-in operation to consume 4MB or 16MB of disk I/O; that's a thousand times more I/O than your average paging operation. And in practice, the programs which use these large pages are "You paid $40,000 for a monster server whose sole purpose is running my one application and nothing else" type applications, like SQL Server. Those applications don't want this memory to be pageable anyway, so adding code to allow them to be pageable is not only a bunch of work, but it's a bunch of work to add something nobody who uses the feature actually wants.

What's more, allocating very large pages can be time-consuming. All the physical pages which make up a very large page must be contiguous (and must be aligned on a large page boundary). Prior to Windows XP, allocating a very large page could take 15 seconds or more if your physical memory was fragmented. (And even machines with as much as 2GB of memory would probably have highly fragmented physical memory once they had been running for a little while.) Internally, allocating the physical pages for a very large page is performed by the kernel function which allocates physically contiguous memory, which is something device drivers need to do quite often for I/O transfer buffers. Some drivers behave "highly unfavorably" if their request for contiguous memory fails, so the operating system tries very hard to scrounge up the memory, even if it means shuffling megabytes of memory around and performing a lot of disk I/O to get it. (It's essentially performing a time-critical defragmentation.)
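
The documented face of that facility is MmAllocateContiguousMemory. As a purely illustrative kernel-mode sketch (this is the driver-side request, not the memory manager's internal path):

    #include <wdm.h>

    // A driver asking for a physically contiguous I/O transfer buffer.
    // Requests like this are the ones the memory manager tries very
    // hard not to fail, even if it has to shuffle pages around.
    PVOID AllocateDmaBuffer(SIZE_T bytes)
    {
        PHYSICAL_ADDRESS highest;
        highest.QuadPart = 0xFFFFFFFF;  // device can only address 4GB

        // The buffer is freed later with MmFreeContiguousMemory.
        return MmAllocateContiguousMemory(bytes, highest);
    }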

If you followed the discussion so far, you'll see another reason why large pages aren't paged out: When they need to be paged back in, the system may not be able to find a suitable chunk of contiguous physical memory!

In Windows Vista, the memory manager folks recognized that these long delays made very large pages less attractive for applications, so they changed the behavior: requests from applications for very large pages go through the "easy parts" of looking for contiguous physical memory, but give up before the memory manager goes into desperation mode, preferring instead simply to fail. (In Windows Vista SP1, this part of the memory manager was rewritten so that the really expensive stuff is never needed at all.)

Note that the MEM_LARGE_PAGES flag triggers an exception to the general principle that MEM_RESERVE only reserves address space, that MEM_COMMIT makes the memory manager guarantee that physical pages will be there when you need them, and that the physical pages aren't actually allocated until you access the memory. Since very large pages have special physical memory requirements, the physical allocation is done up front so that the memory manager knows that when it comes time to produce the memory on demand, it can actually do so.
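
Putting the pieces together, a hedged sketch of an application-side large-page allocation (assuming SeLockMemoryPrivilege was enabled as above). MEM_RESERVE and MEM_COMMIT must be passed together, and the size must be a multiple of GetLargePageMinimum():

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        // Zero means the processor doesn't support large pages.
        SIZE_T large = GetLargePageMinimum();
        if (large == 0) {
            printf("Large pages not supported.\n");
            return 1;
        }

        // Reserve and commit in one shot; the physical pages are
        // allocated up front, as described above.
        void *p = VirtualAlloc(NULL, 16 * large,
                               MEM_RESERVE | MEM_COMMIT | MEM_LARGE_PAGES,
                               PAGE_READWRITE);
        if (p == NULL) {
            // Fails if the privilege isn't enabled, or if enough
            // contiguous physical memory can't be found.
            printf("VirtualAlloc failed: %lu\n", GetLastError());
            return 1;
        }

        /* ... use the memory ... */

        VirtualFree(p, 0, MEM_RELEASE);
        return 0;
    }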

Comments (41)
  1. Aaron.E says:

    If you have a "monster server whose sole purpose is running my one application and nothing else," which has some large amount of memory n GB (where n >= 16), and the application does in fact use large pages and is configured to use up to (n – 4) GB of physical memory, does it still make sense to leave the page file on for the rest of the system, granting that there is never going to be more than 4GB of memory pressure outside of the "sole purpose" application?

  2. mxn says:

    > does it still make sense to leave the page file on for the rest of the system

    And does it make sense to turn it off? Either you have no memory pressure outside, in which case you waste some GB of disk space on a $40,000 machine but no performance (it stays unused), or you do have pressure and it's used. I don't think you will find those GB on the system disk to be so valuable in that use case.

    As an aside, it also depends on how SQL Server uses large pages; it may still use normal pages for some data structures, etc., so there will probably still be some memory pressure outside of that (but I really don't know, just guessing).

  3. Alex Grigoriev says:

    I wonder what the point is in requesting large pages. To reduce TLB thrashing? Anything else?

  4. Aaron.E says:

    mxm: I don't know if it makes sense to turn it off, but it's not wasted disk space I'm worried about (though I think one could reasonably worry about that if it were an SSD-only server), but rather the IO cost of doing unnecessary disk operations.

    Gabe: I wasn't thinking there would be no other user-mode applications running on the system. I was just thinking that if the other user-mode applications that are running are restricted to a known set, the memory pressure on the system could be analyzed, and the large application (everyone is assuming SQL Server, so fine, let's say SQL Server) could be configured to leave enough physical memory that the other applications would never need to page out to disk. Then you could just disable paging and save the IO overhead, so that when SQL Server needs to do something IO-intensive like rebuilding an index, it doesn't have as much other IO to contend with.

    Now, maybe it's the case that SQL Server would benefit more from having that memory I'm proposing to set aside for the rest of the system all to itself than it would from not having the IO contention; I don't know. I'm a developer and not a sysadmin, so server configuration isn't my area of expertise. It just seemed like if there was ever a time when disabling the page file might be helpful, this scenario would be a contender. Then again, maybe there isn't.

    [Disk access to the pagefile is not "unnecessary"; the memory manager is going to the pagefile because it has run out of RAM. The amount of physical memory you need to leave for other applications would have to be equal to their theoretical peak memory usage, which can be quite large. Seems wasteful to reserve that much memory for a rare scenario. If you were the head of a country's central bank, it sounds like you'd declare the bank reserve rate to be 100%. -Raymond]
  5. mxm says:

    > but rather the IO cost of doing unnecessary disk operations.

    If there is no pressure, there is no swapping, so no disk ops.

    As an aside, I wonder about the fact that 4KB pages were deemed optimal ~25 years ago (as a balance of the number of pages to swap versus the time to swap a page). Is it still the optimal value? It seems strange considering all the hardware changes, but I wonder :)

  6. Joshua says:

    [Disk access to the pagefile is not "unnecessary"; the memory manager is going to the pagefile because it has run out of RAM. The amount of physical memory you need to leave for other applications would have to be equal to their theoretical peak memory usage, which can be quite large.]

    You're gonna laugh, but I had a machine that ran badly until I turned paging off. The disk was so slow that it was better to let the memory pressure shrink the disk cache to zero rather than opportunistically page out idling programs.

  7. yuhong2 says:

    "Prior to Windows XP, allocating a very large page can take 15 seconds or more if your physical memory is fragmented. (And even machines with as much as 2GB of memory will probably have highly fragmented physical memory once they're running for a little while.)"

    MEM_LARGE_PAGES was actually introduced in Windows Server 2003.

    [The MEM_LARGE_PAGES flag may have been introduced in Windows Server 2003, but large pages have been in use since Windows 2000. Wikipedia says so, so it must be true. -Raymond]
  8. 640k says:

    Windows starts swapping a long time before memory is full.

    Put the page file on a ram disk. Problem solved.

  9. Ian says:

    @640k – I see the humour in your comment :) At least I hope it was intended to be humorous – someone will take it as a good idea though.

  10. 640k says:

    Because Windows' memory manager is optimized to do lots of swapping (instead of using actual unused memory), it gets very slow if you turn off the page file (newer Windows versions cannot even disable it). Better is to let Windows think it's still swapping, but have it "swap" to a ramdisk instead.

  11. 640k says:

    Also, with 32-bit Windows the kernel can allocate the ram disk at 2–4GB physical addresses, address space which user apps cannot use anyway.

  12. Gabe says:

    Even if you *wanted* to swap out large pages, they would take orders of magnitude more time, and once paged out, they would have orders of magnitude more chance of having to get paged back in. One interesting optimization, though, would be to satisfy requests for huge blocks of memory with large pages whenever possible, then break them up into small pages as necessary when they need to get paged out.

    Alex: Not only does a large page require 3 or 4 orders of magnitude fewer TLB entries, it also requires commensurately fewer page table entries and of course requires fewer resources to create and maintain those PTEs.

    Aaron: Even if you have a large computer dedicated to SQL Server, you'll still have other processes like backup agents, antivirus, report generators, administrative tools, and so on vying for memory. There's probably no reason to limit their abilities by restricting the available page file.

  13. Billy O'Neal says:

    @640k: Please get sources before spewing mountains of incorrect information.

    1. Why on earth would the memory manager be optimized to do lots of swapping? The memory manager is optimized to perform best for end user programs, not to make some internal operation run quickly. If making swapping fast makes the memory manager as a whole fast, that might be optimized, but not at the expense of being performant for other programs.

    2. Hmm. Windows 7 x64 seems to have no problem disabling the page file (at least on my box).

    3. Your comment about the kernel using the 2-4GB memory range shows a lack of understanding of how virtual memory operates. User programs may use all of a system's memory. The difference is that on a 32 bit machine, they may only use 2GB of it per process (or 3GB if the appropriate switch is turned on at boot). Physical memory is not subject to the 2GB split — that happens in the virtual address space, not the physical one.

  14. 640k says:

    1.

    Read again. –> Windows (NT) starts swapping a long time before memory is full.

    I think it's supposed to be a feature. Maybe disk cache for app1 is more important than keeping app2 in memory. "Windows knows what's best for you," and that's to start swapping before all memory is allocated. Raymond has also suggested that apps get swapped out when their window is minimized.

    2.

    Usually physical addresses at ~2–4GB are allocated to I/O (depending on installed hardware). To use more memory in 32-bit Windows, the kernel has to map it with PAE, which a 32-bit app usually doesn't use. But a ram disk using PAE-allocated memory can help performance considerably when ordinary 32-bit apps get swapped out to it.

  15. Anton says:

    @640K: actually on 32-bit Windows the physical address space is also 32-bit, and it is heavily reserved by graphics and other hardware. If 4GB of physical memory is installed, not all of it will ever be accessed; google it. In this case, if you set up a 2GB RAM disk, it will leave nothing for the rest of the system.

    [Sigh. Windows can use more than 4GB of physical memory on a 32-bit machine. Otherwise, what's the point of PAE? I spent two whole weeks trying to explain this. Apparently I need to set aside another two weeks to try to explain it again. -Raymond]
  16. Joseph Koss says:

    @640K

    If you are commonly swapping out pages because your physical memory really is full (rather than being used for caching and other assorted benefits), then you simply do not have enough memory, and that's all there is to it. Reserving some of this precious memory on a starved system, for a ramdisk, is NOT normally beneficial. Basically, 1990 called and it wants its ramdisk back.

  17. Nawak says:

    Having an old laptop with XP and only 512MB of RAM, I also have noticed that the machine is better off with the pagefile disabled. Less disk thrashing (and those laptop disks were slow), better Windows startup time, better application startup time. Sometimes when I used the laptop for some unusual task, I got the "low virtual memory" popup, so eventually I decided on a small memory upgrade (now 1.5GB). Now, no more "low virtual memory" warning (and still no thrashing).

    Windows is supposed to swap only when necessary, but evidence shows that this is not true (or that Windows can adapt very well to low-memory conditions, so well that giving it slow "disk memory" is more a poison than a present).

    I would advise people with old XP systems and disk-thrashing problems to at least try it.

    Because Windows pages more when less memory is available, the less memory you have, the more you can benefit from restricting Windows' memory usage even further! Of course, if you try it and Windows complains about virtual memory during your usual tasks, you lost this "game"… In that case I would advise removing unnecessary startup programs, etc. This relative "feature downgrade" (of often-useless features) will be a lot more bearable than the constant disk thrashing.

    On my win7 desktop, I didn't feel the need to try this strange "optimization", maybe it is the new memory manager, maybe it is the larger RAM, maybe the faster disk…

  18. Jim says:

    "[32-bit] Windows can use more than 4GB of physical memory on a 32-bit machine."

    It could, up until XP SP2 I think (if not, SP3).  Then it was disabled in client versions of Windows for "driver compatibility reasons" (it's still enabled in 32-bit server versions of Windows).  There is a registry key that controls this behaviour, but it is in the license section of the registry, which is digitally signed, so in practice this could only be enabled by Microsoft.  (This can be worked around, but only by violating your EULA.)  So yes, 32-bit client versions can only use 4GB of physical address space, which amounts to somewhat less than 4GB of physical memory.

    "Otherwise, what's the point of PAE?"

    This is left on purely to allow the NX bit.

  19. Dylan says:

    I've had my share of issues from Windows swapping out too much, but I find that running an app to restrict the disk cache to about a tenth of RAM works much better than trying to disable swap.

  20. Luc Rooijakkers says:

    Slightly OT… The page frame numbers returned by AllocateUserPhysicalPages are in user space and must later be passed back to MapUserPhysicalPages. In principle, modifying these numbers would allow you to access any page in the system (the documentation explicitly warns against modifying these numbers). Does anybody know if Windows contains checks against such behaviour?
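
    (For reference, the documented AWE pattern being described looks roughly like this; error handling omitted. The point is that the pfns array sits in ordinary user-writable memory between the two calls:)

        #include <windows.h>

        // Sketch of the AWE flow; requires SeLockMemoryPrivilege,
        // just like large pages.
        void AweSketch(void)
        {
            ULONG_PTR count = 16;
            ULONG_PTR pfns[16];  // page frame numbers, handed to user mode

            if (!AllocateUserPhysicalPages(GetCurrentProcess(), &count, pfns))
                return;

            SYSTEM_INFO si;
            GetSystemInfo(&si);

            // Reserve a region to map the physical pages into.
            void *view = VirtualAlloc(NULL, count * si.dwPageSize,
                                      MEM_RESERVE | MEM_PHYSICAL,
                                      PAGE_READWRITE);

            // Nothing stops user code from scribbling on pfns[] here,
            // which is exactly the behaviour being asked about.
            if (view)
                MapUserPhysicalPages(view, count, pfns);
        }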

  21. yuhong2 says:

    On PAE and Windows, Geoff Chappell has an article about what happened. It is not 100% accurate, but here it is:

    http://www.geoffchappell.com/viewer.htm

  22. yuhong2 says:

    "This is left on purely to allow the NX bit."

    Yea, it was XP SP2 that introduced support for the NX bit in the first place.

  23. Joshua says:

    [Sigh. Windows can use more than 4GB of physical memory on a 32-bit machine. Otherwise, what's the point of PAE? I spent two whole weeks trying to explain this. Apparently I need to set aside another two weeks to try to explain it again. -Raymond]

    It's because Microsoft went out of their way to make it not work on 32 bit consumer OSes and for years claimed it was a hardware limitation. Tell a lie long enough and people start to believe it.

    [It's a driver compatibility limitation. But server-class machines will work (because driver compatibility on servers is a smaller problem – nobody runs strange drivers on servers). -Raymond]
  24. yuhong2 says:

    [It's a driver compatibility limitation. But server-class machines will work (because driver compatibility on servers is a smaller problem – nobody runs strange drivers on servers). -Raymond]

    The frustrating thing about it is there is no 32-bit version of Server 2008 R2, yet there is a 32-bit version of Windows 7.

  25. Troll says:

    "It's a driver compatibility limitation. But server-class machines will work (because driver compatibility on servers is a smaller problem – nobody runs strange drivers on servers"

    Let driver manufacturers solve it and produce a 32-bit version without the crippled 4GB limit. Microsoft has worded the 4GB limit for 32-bit Windows so cunningly on their websites that I initially thought it was a technical limitation, but it's not. It's a licensing limit imposed by MS to prevent BSODs due to drivers accessing the reserved area above 3GB, thinking they know best. I expect at least one last 32-bit Windows to be produced without this artificial limit.

  26. Stefan Kanthak says:

    "Wikipedia says so, so it must be true."

    Even the MSKB says it: <support.microsoft.com/…/en-us>

    (<support.microsoft.com/…/en-us> is not available any more).

  27. yuhong2 says:

    "Let driver manufacturers solve it and produce a 32-bit version without the crippled limit of 4 GB. "

    Yea, I once was thinking of defaulting to /MAXMEM:4096, with a /NOMAXMEM option to lift the limit.

  28. yuhong2 says:

    "As for the memory confusion issues, the "wasted memory" that some ramdisk drivers are capable of using is remapped memory. "

    Yep: after introducing PAE with the Pentium Pro, Intel added PSE-36 in the Pentium II era, which allowed large pages to be mapped above the 4GB boundary without changing the page-table format significantly.

  29. f0dder says:

    640k: it's actually *older* versions of Windows that can't run without a pagefile – on Windows 2000, you're allowed to disable the pagefile, but the system then creates a small one on startup. For XP and later, there's no problem in disabling the pagefile as long as you have enough RAM.

    Putting the pagefile on a ramdisk is moronic – better to turn it off.

    As for the memory confusion issues, the "wasted memory" that some ramdisk drivers are capable of using is remapped memory. XP SP2 doesn't limit you to 4GB of memory; it limits you to the lower 4GB of physical memory addresses (exactly for the driver compatibility reasons Raymond mentions – 32-bit driver developers who didn't expect to see PHYSICAL_ADDRESSes above 4GB, even though we've had PAE since the PPro).

  30. Cheong says:

    @Nawak: Windows (starting from WinXP, I think) in fact attempts to write a copy of pageable pages to disk as it sees fit, so when you need more memory, it can just mark those memory blocks as "paged" and hand the memory over to new allocations, instead of requiring you to wait for slow I/O to page the memory blocks out to disk. So it's trying to do you a better service (if you can feel startup become sluggish because of this, you'd find it even more sluggish when your application needs more RAM and paging is needed but Windows hasn't chosen this strategy).

  31. 640k says:

    @Joseph

    > If you are commonly swapping out pages because your physical memory really is full (rather than being used for caching and other assorted benefits), then you simply do not have enough memory, and that's all there is to it.

    This is NOT true. Windows starts swapping out apps BEFORE memory is full:

    1. A large disk cache for app1 is more important than keeping app2 in memory. This leads to several problems, the most prominent being with virus scanners which read lots of files; then all other programs get swapped out, and the computer becomes super slow until the next reboot.

    2. When a process' window is minimized, the process gets swapped out.

    > Reserving some of this precious memory on a starved system, for a ramdisk, is NOT normally beneficial.

    Reserving PAE-allocated memory for a ram disk is beneficial, because 32-bit apps cannot (usually) use that memory anyway.

    [Shouldn't the virus scanner be doing a FILE_FLAG_NO_BUFFERING to say "do not pollute the cache with this data"? -Raymond]
  32. Alex Grigorief says:

    I've been complaining about the terrible file cache performance in XP for a long time. It just wants to discard VM pages (executables, etc.) in favor of a big file being worked on. Example: do a compare of large files between a DVD-R and the disk. Other processes will become terribly starved of working set and will fall into a page-in loop.

  33. Simon says:

    2MB pages won't take a thousand times longer than 4KB pages to page out/in. Let's assume we have an old HD (no SSD) with an 80MB/s read/write burst rate and a 10ms random access time. This gives us total times of ~35ms for 2MB pages and ~10.05ms for 4KB pages.

    [Oh, wait, now you're requiring that the pages be contiguous in the pagefile too? In addition to defragmenting memory, you also have to defragment the pagefile. Now instead of 15 seconds to allocate your large page, it takes 15 minutes. -Raymond]
  34. Gabe says:

    Simon: Even if your large page is contiguous in the pagefile, the pagefile still might not be contiguous on disk. You have no reason to assume that a 4MB page-in would take any less time than 1000 simultaneous 4k page-ins.

  35. James Schend says:

    @Alex Grigorief: You complain about so many hundreds of things in every post, that I guess we just kind of lose track of the specifics. Let's just assume that if you're posting in this blog, you're complaining about Windows in some way and not worry about the specifics?

    In any case, what are you expecting to happen? Do you think Microsoft is going to just bin Windows 7 and go back to XP development? What's the use of complaining about a product that's only alive because huge inflexible corporations take a long time to upgrade?

  36. Alex Grigoriev says:

    @James Schend:

    OK. Now can I complain about Win7 indexing service starting in the middle of the boot and slowing it down every time, and going through all the unchanged files at every boot? And also allocating (together with Security Essentials) 1.5 GB of kernel paged pool in the process of doing that? I'm not kidding you.

    Oh, by the way, the problem with XP cache manager was clear since early 200x, way before Vista was out.

    [Why does every article eventually turn into a "why Windows sucks" comment thread? -Raymond]
  37. James Schend says:

    Sorry Raymond. I over-estimated Alex Grigoriev's ability to get my point. (Which, to spell it out, was: "please stop griping, we're all sick of it.")

    I didn't anticipate in a million years that he'd reply to that with even MORE griping even MORE off-topic!

  38. Pavel Lebedinsky says:

    @640k: Vista and win7 no longer empty the working set of the process when its main window is minimized.

  39. Simon says:

    > [Oh, wait, now you're requiring that the pages be contiguous in the pagefile too? In addition to defragmenting memory, you also have to defragment the pagefile. Now instead of 15 seconds to allocate your large page, it takes 15 minutes. -Raymond]

    Microsoft's programmers wouldn't be that stupid. The easiest solution to explain: they could create a second pagefile that only holds 2MB pages. Coming up with better solutions is left as an exercise for the reader.

    [Sure, let's rearchitect the pagefile model to handle a case nobody cares about. -Raymond]
  40. Geoff Chappell says:

    It is fascinating that it's even thought worth writing that paging large pages would be a lot of work for no good reason. I mean, that large pages aren't paged should be written somewhere, but as an incidental detail in the documentation, not as any sort of news to anyone.

    Now, some people will much more easily run into a different frustration when trying to use large pages. It may be that large-page support has been disabled, whatever your security privileges, because you have an unlicensed physical processor. Will you tell us something of the thinking behind that?

  41. Nik says:

    Raymond, I feel your pain. Every time the page file is mentioned, it brings out the conspiracy theorists. It would be interesting to make a complete catalog of Windows memory conspiracy theories. Then you could save time by replying "you're a type 3 kook, see this table".

    In the past you used to get irritated to the point of saying "I will never write about topic X again", so I'm surprised you still talk about virtual memory.

    I'm also a bit surprised that you mentioned the "backwards compatibility time machine" topic again, since it's also troll bait.

    [Yeah, what was I thinking? -Raymond]

Comments are closed.