A 32-bit application can allocate more than 4GB of memory, and you don’t need 64-bit Windows to do it

Commenter Herb wondered how a 32-bit program running on 64-bit Windows can allocate more than 4GB of memory. Easy: The same way it allocates more than 4GB of memory on 32-bit Windows!

Over a year before Herb asked the question, I had already answered it in the tediously boring two-week series on the myths surrounding the /3GB switch. Here's a page that shows how you can allocate more than 2GB of memory by using shared memory (which Win32 confusingly calls file mappings). That code fragment allocated 4GB of memory at one go, and then accessed it in pieces (because a 32-bit program can't map an entire 4GB memory block at one go). To allocate more, either make the number bigger in the call to CreateFileMapping or just call CreateFileMapping multiple times.
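As a rough illustration of the "allocate it all, then access it in pieces" idea, here is a minimal sketch in C. The names `view_base`, `create_big_section`, and `poke_byte` are hypothetical helpers invented for this example; only `CreateFileMappingW`, `MapViewOfFile`, `UnmapViewOfFile`, and `GetSystemInfo` are real Win32 calls. The sketch assumes the section size is a multiple of the allocation granularity (so the last window never runs past the end) and that the system has enough commit available to back the section. The Win32 part is guarded so that the offset arithmetic can be compiled and checked anywhere.

```c
#include <assert.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Round a 64-bit offset down to a multiple of the allocation granularity.
   MapViewOfFile requires view offsets to be aligned this way. */
uint64_t view_base(uint64_t offset, uint32_t granularity)
{
    assert(granularity != 0);
    return offset - (offset % granularity);
}

#ifdef _WIN32
/* Hypothetical helper: create a pagefile-backed section of `bytes` bytes.
   The size is passed as two 32-bit halves, so it may exceed 4GB. */
HANDLE create_big_section(uint64_t bytes)
{
    return CreateFileMappingW(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                              (DWORD)(bytes >> 32),
                              (DWORD)(bytes & 0xFFFFFFFFu), NULL);
}

/* Hypothetical helper: write one byte at an arbitrary 64-bit offset by
   mapping only the window that contains it, then unmapping the window.
   Assumes the section size is a multiple of the allocation granularity. */
BOOL poke_byte(HANDLE section, uint64_t offset, unsigned char value)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    uint64_t base = view_base(offset, si.dwAllocationGranularity);
    unsigned char *view = MapViewOfFile(section, FILE_MAP_WRITE,
                                        (DWORD)(base >> 32),
                                        (DWORD)(base & 0xFFFFFFFFu),
                                        si.dwAllocationGranularity);
    if (view == NULL)
        return FALSE;

    view[offset - base] = value;   /* a normal pointer, but only inside the window */
    return UnmapViewOfFile(view);
}
#endif
```

Under those assumptions, a 32-bit program could call `create_big_section(5ULL << 30)` and then `poke_byte(section, 0x100012345ULL, 42)` to touch a byte beyond the 4GB mark. A real program would map much larger windows and keep them mapped for a while rather than remapping for every byte.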

The following week, I talked about how you can use AWE to allocate physical pages. Again, you can allocate as much memory as you like, but if you allocate enormous amounts of memory, you will probably not be able to map them all in at once.
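The AWE approach might look something like the following sketch. `pages_for` and `awe_demo` are hypothetical names made up for this example; `AllocateUserPhysicalPages`, `MapUserPhysicalPages`, `FreeUserPhysicalPages`, `VirtualAlloc`, and `VirtualFree` are the real Win32 calls. It assumes the caller already holds the "Lock pages in memory" privilege (`SeLockMemoryPrivilege`), without which the allocation fails. Only a small window of the allocated pages is mapped at any moment, which is the "not able to map them all in at once" limitation in practice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Number of pages needed to hold `bytes`, rounding up. */
uint64_t pages_for(uint64_t bytes, uint32_t page_size)
{
    assert(page_size != 0);
    return bytes / page_size + (bytes % page_size != 0);
}

#ifdef _WIN32
/* Hypothetical demo: allocate enough physical pages for total_bytes, map
   the first window_pages of them, touch one byte, then clean up.
   Assumes the caller holds SeLockMemoryPrivilege. */
BOOL awe_demo(uint64_t total_bytes, ULONG_PTR window_pages)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    ULONG_PTR page_count = (ULONG_PTR)pages_for(total_bytes, si.dwPageSize);
    ULONG_PTR *frames = malloc(page_count * sizeof *frames);
    if (frames == NULL)
        return FALSE;

    ULONG_PTR got = page_count;
    if (!AllocateUserPhysicalPages(GetCurrentProcess(), &got, frames)) {
        free(frames);
        return FALSE;
    }

    /* The system may hand back fewer pages than requested. */
    if (window_pages > got)
        window_pages = got;

    BOOL ok = FALSE;
    /* Reserve an address range that physical pages can be mapped into. */
    void *window = VirtualAlloc(NULL, window_pages * si.dwPageSize,
                                MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);
    if (window != NULL) {
        if (MapUserPhysicalPages(window, window_pages, frames)) {
            ((unsigned char *)window)[0] = 1;
            /* Passing NULL unmaps; another batch of frames could go here. */
            MapUserPhysicalPages(window, window_pages, NULL);
            ok = TRUE;
        }
        VirtualFree(window, 0, MEM_RELEASE);
    }

    FreeUserPhysicalPages(GetCurrentProcess(), &got, frames);
    free(frames);
    return ok;
}
#endif
```

To reach the rest of the memory, a program would call `MapUserPhysicalPages` again with a different slice of the frame array, for example `frames + window_pages`, instead of unmapping with `NULL`.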

The claims of the program are true, but 64-bit Windows wasn't necessary for the program to accomplish what it claims. It's like Dumbo and the magic feather. "Dumbo can fly with the magic feather in his trunk." Well, yeah, but he didn't actually need the feather.

(On the other hand, 64-bit Windows certainly makes it more convenient to use more than 4GB of memory, since you can map the memory into your address space all at once and use normal pointers to access it.)

Comments (20)
  1. Sunil Joshi says:

    I think the point of the claim was that client versions of 32-bit Windows only support 4GB physical memory as a license restriction.

    If you have a 64-bit system, you can have more physical memory so it’s more worthwhile going through the contortions to allocate greater than 4GB memory.

  2. Alexandre Grigoriev says:

    Yea, AWE: XMS/EMS all over again!

  3. Rick C says:

    It's nice to know we can allocate that much memory.

  4. Krunch says:

    This trick is not specific to Windows (well, yes I know, this is a Microsoft blog) and there is a similar technique in Linux that can be used by Oracle (among other things): http://www.puschitz.com/TuningLinuxForOracle.shtml#ConfiguringVeryLargeMemory

    Now, the interest of doing that seems a bit dubious to me. Better switch to 64 bits and make your application code simpler (and stop confusing sysadmins/users who don’t understand this method and its limitations).

  5. Mark says:

    Karellen: have you read http://blogs.msdn.com/ericlippert/9628808.aspx?  Sometimes the "allocation is storage" analogy is useful, sometimes "allocation is mapping" is more appropriate.  I’d say both are true at the same time.

  6. Mark says:

    Make that "most of the time, both are true".

  7. Karellen says:

    Hmmm….I suppose that it depends on what you mean by "allocate".

    If I created a regular 8Gb file, a regular heap-allocated 500Mb buffer, and then wrote a function called MapChunk() which was responsible for writing the old contents of the buffer to where it was originally read from in the file, and then read a different portion of the file into the buffer, would that count as having "allocated" 8Gb of (very, very slow) memory?

    It feels like you’re playing semantics and stretching the meaning of the word "allocated". Now, I don’t think you *are* doing that and think you are actually correct, but that’s what it feels like.

    Oh, the perils of language. Even highly technical terms can have widely used but incorrect colloquialisms that are used by techies.

    To me, after a good few years of C programming, memory "allocated" to my process is memory mapped into my process’ virtual address space – i.e. that which I am allowed to access immediately with a simple pointer dereference.

  8. Dean Harding says:

    Karellen: The difference is if you use CreateFileMapping and you’ve got more than 4GB of physical memory, Windows can keep the whole lot in physical memory. So as you map different sections of the file, you’re not actually copying stuff from disk to physical memory, just mapping a different physical location into your virtual address space.

    Using a manual buffer, you HAVE TO copy the file contents to and from disk each time you map a new "view".

  9. Krunch: Yep, it’s a horrible hack from back when 32bit Xeons with physical address extension (generally 36bit) were all the rage. I doubt anyone really does it now.

  10. Karellen says:

    Dean: Yes, but that’s an implementation detail which varies according to the hardware you’re running on.

    If you’ve got more than 4Gb of physical memory, Windows could still keep the whole file in a physical memory page cache, and you’d "only" actually be copying stuff between regions of memory during my horribly inefficient "map a new view" API.

    But again, that’s an implementation detail.

    The process of allocating memory and using that memory is an abstraction defined by an API. In C, that API revolves around pointers. "Allocated memory" is accessed through pointers by dereferencing them, and pointers are obtained either from "taking the address" of things on "the stack", or they’re managed by various families of functions which return pointers or take pointer arguments to allocate or deallocate regions of memory, such as {m,c,re}alloc()/free(), mmap()/munmap(), HeapAlloc()/HeapFree(), CreateFileMapping()/… etc…

    The case could certainly be made that, in the C "abstract machine", if you can’t access the storage right now through a plain pointer, then it’s not currently "allocated".

    Of course, the difference between the C abstract machine and concrete implementations is that in practice, there /is/ a difference between theory and practice.

  11. Worf says:

    Basically, you do bank switching, which is how you get 64kiB of memory in a useful 8-bit computer, or access 32MiB of data when your processor can only map 16MiB of physical address space… Or in DOS, we called them overlays and mapped code in and out as we needed it.

    Hrm, fun trick would be to try to execute code and bypassing DEP…

  12. Mike says:

    "Hrm, fun trick would be to try to execute code and bypassing DEP…"

    4GB of executable code? give it a few more years…

  13. Krunch: I have used this technique a few times because it is considerably cheaper to extend the life of a well trusted 32-bit application that "needs a little more headroom" than to port it to 64-bits.

  14. Karellen says:

    @Chris: Really? What are you typically needing to do to port your applications to 64-bit other than simply recompile them and then pass them through whatever QA process you normally have in place (if any)?

  15. Karellen: In theory that’s all there is to it. In practice you have umpteen 3rd party libs and in-proc COM objects that may or may not be available in 64-bit variants. One hopes the code is 64-bit aware, but I can guarantee that wherever serialization is used, either by file or network etc, there will be issues. What about mixing 32-bit clients with 64-bit servers? What about the costs of bringing in new 64-bit servers instead of adding a little more RAM to a bedded in 32-bit server? Even the 64-bit servers themselves behave differently and can highlight new issues.

    This all takes time and is risky, especially with old legacy systems like the ones I was dealing with. Try selling that option to "The Business" over some smaller, possibly more localised change.

  16. Cooney says:

    Last time I looked, windows didn’t support shared memory. At best, you can map a file and mess with that, but the guarantees required for it to really be shared just aren’t there.

    [MapViewOfFile describes the conditions for coherency. If you create a mapping from the pagefile, you have shared memory. -Raymond]
  17. Cooney says:

    [MapViewOfFile describes the conditions for coherency. If you create a mapping from the pagefile, you have shared memory. -Raymond]

    What if I have some memory and I map that? No file involved at all, and if I set a value in it, it’s available on the next tick of the clock to anyone who’s mapped it. As far as I can tell, your method is a VM hack and not shared memory

    [You’re reading too much into “backed by the pagefile”, assuming that this makes the memory somehow different from “normal” memory. It turns out that normal memory is backed by the pagefile. If it’s a VM hack, then so is any other type of virtual memory. -Raymond]
  18. cooney says:

    [You’re reading too much into “backed by the pagefile”, assuming that this makes the memory somehow different from “normal” memory.

    Not at all. I’m going with what I found online stating that I can’t rely on writes being visible without a flush (on a pagefile of all things). If this were shared memory, then you would have each process sharing the memory writing to the same actual pages.

    The fact is, mapped files are different from normal memory, and so is shared memory, but they also differ from each other, even when the mapped memory is mapping the pagefile.

    Besides, if I’m mapping the pagefile, won’t that break if I turn off paging to disk?

    [Like I already pointed out, the conditions for ensuring coherency are spelled out in the documentation for MapViewOfFile. Memory “from the page file” meets the criteria and are therefore coherent (because it’s the same as normal memory). Here’s an example. -Raymond]
  19. Cooney says:

    I see your examples, but you’re missing the point:

    1. shared memory must use the same pages across processes. Your example used a single process.

    2. The VM hack you suggest to simulate shared mem fails if I map 1G and only have 500M in the pagefile (assume for argument that I have 8G sitting around on a system).

    3. MapViewOfFile is silent on coherence with regards to the pagefile. According to rumor on the internet, flushes are required for the coherence to actually work.

    4. shared memory is automatically coherent, by design. If you need to do extra work for shared mem, it isn’t shared mem ™

    I’m trying to educate you on how shared memory works and how ms differs in its usage, but you must realize that just because MS calls something shared mem doesn’t make it so.

    [Shared memory uses the same pages across processes. Feel free to use the kernel debugger to confirm. The physical pages are shared. If you want to call that something other than “shared memory” (say you want to call it “Fred”), then MapViewOfFile(INVALID_HANDLE_VALUE) gives you “Fred”. -Raymond]
  20. Marthinus says:

    This reminds me of the 16bit days where you could only allocate 64kb at a time, so you started playing with segments and offsets and what else, but in all honesty it sucked! The first 32bit protected mode compiler I could lay my hands on I think was Watcom C++ and it was brilliant. Allocating 8mb of memory just for the hell of it was just so satisfying.
