Myth: The /3GB switch lets me map one giant 3GB block of memory


Just because the virtual address space is 3GB doesn't mean that you can map one giant 3GB block of memory. The standard holes in the virtual address space are still there: 64K at the bottom, and 64K near the 2GB boundary.

Moreover, the system DLLs continue to load at their preferred virtual addresses, which lie just below the 2GB boundary. The process heap and other typical process bookkeeping also take their bites out of your virtual address space.

As a result, even though the user-mode virtual address space is nearly 3GB, it is not the case that all of the available space is contiguous. The holes near the 2GB boundary prevent you from getting even 2GB of contiguous address space.
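To make the arithmetic concrete, here is a minimal sketch that models the address space as a list of reserved ranges and finds the largest free gap. It is a toy model, not a query against a real process: the 3GB upper limit is rounded, the hole positions are taken from the description above, and the 128MB range standing in for the system DLLs is a made-up number chosen purely for illustration.

```python
GB = 1 << 30
MB = 1 << 20
KB = 1 << 10

def largest_free_block(space_end, reserved):
    """Return (start, size) of the largest gap in [0, space_end)
    that is not covered by any (start, end) range in `reserved`."""
    best_start, best_size = 0, 0
    cursor = 0  # lowest address not yet known to be reserved
    for start, end in sorted(reserved):
        if start - cursor > best_size:
            best_start, best_size = cursor, start - cursor
        cursor = max(cursor, end)
    if space_end - cursor > best_size:
        best_start, best_size = cursor, space_end - cursor
    return best_start, best_size

# The standard holes: 64K at the bottom, 64K just below the 2GB boundary.
holes = [(0, 64 * KB), (2 * GB - 64 * KB, 2 * GB)]

# Hypothetical: pretend the system DLLs occupy the 128MB below the upper hole.
system_dlls = [(2 * GB - 128 * MB, 2 * GB - 64 * KB)]

with_dlls = largest_free_block(3 * GB, holes + system_dlls)
holes_only = largest_free_block(3 * GB, holes)

print("with system DLLs:", with_dlls[1] / GB, "GB")
print("holes only:      ", holes_only[1] / GB, "GB")
```

In this model the largest block falls short of 2GB either way. Even with the DLL range removed entirely, the 64K hole near the 2GB boundary splits the space into a piece just under 2GB and a piece of 1GB.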

Some people may try to relocate the system DLLs to alternate addresses in order to create more room, but that won't work for multiple reasons. First, of course, is that it doesn't get rid of the 64K gap near the 2GB boundary. Second, the system allocates other items such as thread information blocks and the process environment variables before your program gets a chance to start running, so by the time your program gets around to allocating memory, the space it wanted may already have been claimed.

Third, the system really needs certain key DLLs to be loaded at the same address in all processes. For example, the syscall trap must reside at a fixed location so that the kernel-mode trap handler will recognize it as a valid syscall trap and not as an illegal instruction. The debugger requires this as well, so that it can use CreateRemoteThread to inject a breakpoint into the process when you tell it to break into the process you are debugging.

Comments (20)
  1. Anonymous says:

    "Third, the system really needs certain key DLLs to be loaded at the same address in all processes. For example, the syscall trap must reside at a fixed location so that the kernel-mode trap handler will recognize it as a valid syscall trap "

    The handler can’t look for two addresses instead of one?

    It’s still not clear to me why the top bad pointer catching hole can’t be removed entirely (the reason for it to exist no longer makes so much sense with a 3 GiB address space), and why system .dlls can’t be loaded at just below 3 GiB. Indeed, it’s not clear why they can’t be loaded at just below 3 GiB even for 2 GiB applications.

    Of course, it’s still not going to stop boneheaded third-party hook libraries loading at somewhere idiotic like 256 or 512 MiB thereby leaving the address space horribly fragmented before we’ve even begun.

    And even Windows itself is pretty crappy in this regard, as various system libraries will load as low as about 1.5 GiB.

    It’s a wonder the CLR manages to load as often as it does.

  2. Anonymous says:

    You seriously do not want to slow down the syscall code path. That is critical to system performance. I’ll try to remember to tell a story about syscall performance later.

  3. Anonymous says:

    As Skywing noted, there’s more than one ‘fixed address’ in ntdll.

    http://weblogs.asp.net/oldnewthing/archive/2004/08/12/213468.aspx#215160

    See remarks there for other considerations (like the enormous cost of rebasing).

  4. Anonymous says:

    (Notice that Pietrek didn’t measure the memory cost, only the speed cost.)

    Sure all of these things could have been done, I’m not denying that. But you have to balance the benefit (to a comparatively limited set of programs) against the cost (lots of changes in the kernel that affect all programs) and the schedule (time spent doing this is time not spent doing something else). Somebody did that balance and decided that in the grand scheme of things, the benefit did not outweigh the cost.

    (Indeed, the Win95 team implemented rebasing in an entirely different way which is focused on minimizing memory requirements. Different teams have different design priorities.)

  5. Anonymous says:

    As an idea for an article, could you lay out the various "special" virtual addresses in a win32 application? I for one had no idea what the "64K gap around 2GB" was for.

  6. Anonymous says:

    Raymond covered this issue here <http://weblogs.asp.net/oldnewthing/archive/2003/10/08/55239.aspx>

    It’s a workaround for an artifact of how you build a 32-bit address on Alpha processors.

  7. Anonymous says:

    Two address checks (if two checks are even necessary; you could just have two interrupts; one for libraries loaded just below 2 GiB, one for libraries loaded just below 3 GiB, which would have no performance cost and a *tiny* (fraction of a page) memory cost) would really be that damaging to performance?

  8. Anonymous says:

    "(like the enormous cost of rebasing). "

    Um, as pathological examples such as http://msdn.microsoft.com/msdnmag/issues/0500/hood/default.aspx show, the actual cost of rebasing is very small (about a 12% load cost in something that’s approaching a worst-case scenario; 1000 imported functions is a lot); there is an additional memory cost, but that could be mitigated in many circumstances without much additional code (it seems to me it would be simple enough to have one memory footprint *per base address* and to prefer to use pre-existing rebased base addresses if they exist).

  9. Anonymous says:

    Will these /3GB posts never end?

  10. Anonymous says:

    "Sure all of these things could have been done, I’m not denying that. But you have to balance the benefit (to a comparatively limited set of programs) against the cost (lots of changes in the kernel that affect all programs) and the schedule (time spent doing this is time not spent doing something else). Somebody did that balance and decided that in the grand scheme of things, the benefit did not outweigh the cost. "

    But the thing is, the benefit could apply to *any* program, because *any* program could have to rebase some libraries. Even if the developer has picked different offsets for all his .dlls, because in practice he has no guarantee that they’ll load at their preferred address.

  11. Anonymous says:

    2^31 bit boundary, sorry.

  12. Anonymous says:

    It’s probably there to encourage more uniform behavior across all platforms, which theoretically means more portable applications.

  13. Anonymous says:

    Skywing: " Raymond covered this issue here http://weblogs.asp.net/oldnewthing/archive/2003/10/08/55239.aspx "

    That article explains why even on the x86, address space allocation granularity is 64K (to keep Windows’ behaviour consistent among processor architectures; specifically the Alpha AXP). It also explains why on the Alpha AXP, there’s a 64 kb "hole" in the address space near 2^32 boundary.

    It does *not* explain why the same hole should exist on x86 processors, or — even more astonishing — on 64-bit x86 processors.

  14. Anonymous says:

    Or *byte*, for that matter. I urgently need some sleep. Or caffeine. Preferably both…

  15. Anonymous says:

    Skywing wrote: "It’s probably there to encourage more uniform behavior across all platforms, which theoretically means more portable applications."

    It seems to me that it is likely to do the exact opposite. By making undocumented behaviour consistent across all architectures, even where there are benefits to tuning it for each architecture, MS fails to challenge the assumptions of programmers who "know too much" about the underlying system. If things like this were to vary between architectures then programmers would learn sooner not to write unportable code. Er, I think.

  16. Anonymous says:

    If they varied between architectures then programmers would still write unportable code, because – let’s be honest – raise your hand if you test your programs on Alpha AXP or ia64 before you release it…

    By keeping the quirks the same, you increase the chances that a program written for one architecture will run on another architecture "entirely by accident".

  17. Anonymous says:

    "By keeping the quirks the same, you increase the chances that a program written for one architecture will run on another architecture "entirely by accident". "

    But you also increase the chances that a program which doesn’t *actually* work properly on another platform (say, because it truncates a pointer or something like that) *appears* to work *most* of the time.

    That’s not a good thing.

  18. Anonymous says:

    As Evan already mentioned on his blog, Raymond Chen has a great series on the /3GB switch on his blog. What is really cool is that Raymond takes on some myths about the /3GB switch and the fact that he…

  19. Anonymous says:

    From time to time I wonder who comes to read this blog and why, but those kinds of questions are very…
