Stupid memory-mapping tricks


Shared memory is not just for sharing memory with other processes. It also lets you share memory with yourself in sneaky ways.

For example, this sample program (all error checking and cleanup deleted for expository purposes) shows how you can map the same shared memory into two locations simultaneously. Since they are the same memory, modifications to one address are reflected at the other.

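#include <windows.h>
#include <stdio.h>

int __cdecl main(int argc, char **argv)
{
    /* Pagefile-backed section just big enough for one DWORD */
    HANDLE hfm = CreateFileMapping(INVALID_HANDLE_VALUE, NULL,
                    PAGE_READWRITE, 0, sizeof(DWORD), NULL);

    /* Two independent views of the same section */
    LPDWORD pdw1 = (LPDWORD)MapViewOfFile(hfm, FILE_MAP_WRITE,
                                          0, 0, sizeof(DWORD));

    LPDWORD pdw2 = (LPDWORD)MapViewOfFile(hfm, FILE_MAP_WRITE,
                                          0, 0, sizeof(DWORD));

    /* Cast the pointers to integers so the format matches on
       both 32-bit and 64-bit builds */
    printf("Mapped to %llx and %llx\n",
           (unsigned long long)(ULONG_PTR)pdw1,
           (unsigned long long)(ULONG_PTR)pdw2);

    /* DWORD is an unsigned long, hence %lu */
    printf("*pdw1 = %lu, *pdw2 = %lu\n", *pdw1, *pdw2);

    /* Now watch this */
    *pdw1 = 42;
    printf("*pdw1 = %lu, *pdw2 = %lu\n", *pdw1, *pdw2);

    return 0;
}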

This program prints


Mapped to 280000 and 290000
*pdw1 = 0, *pdw2 = 0
*pdw1 = 42, *pdw2 = 42

(Missing asterisks added, 8am - thanks to commenter Tom for pointing this out.)

The addresses may vary from run to run, but observe that the memory did get mapped to two different addresses, and changing one value to 42 magically changed the other.

This is a nifty consequence of the way shared memory mapping works. I stumbled across it while investigating how I could copy large amounts of memory without actually copying it. The solution: Create a shared memory block, map it at one location, write to it, then unmap it from the old location and map it into the new location. Presto: The memory instantly "moved" to the new location. This is a major win if the memory block is large, since you didn't have to allocate a second block, copy it, then free the old block - the memory block doesn't even get paged in.

It turns out I never actually got around to using this trick, but it was a fun thing to discover anyway.

Comments (9)
  1. Tom says:

    On my system, this gives "Mapped to 340000 and 350000". Why do they seem to be separated by 10000? I’m not familiar w/ these calls – I’ll take this as a pointer to MSDN or Richter.

    Thanks.

    p.s. Your output is missing a couple of "*"s.

  2. Gregor says:

    Do you know if this will work in all versions of Windows? How about in the future? This would truly make some things easier.

  3. phaeron says:

    The 0x10000 offset is because Windows reserves address space in 64K increments, even though it commits memory in 4K pages.

    I was going to attempt this technique for ring buffers to handle the wrap nicely, but decided against it for two reasons. One is address space — you need at least 128K of address space per buffer and in Win95/98 they’re crammed into the 1GB shared arena with everyone else. The other is that it may not be portable, depending on the cache architecture. Linux-kernel recently tested memory aliasing on a number of platforms and found it failed on a couple of RISC platforms — I think one was SPARC.

    There is another hidden way this aliasing can occur: when CreateDIBSection() is called with a handle to a mapped file that already has a view in the process. I would think that aliasing being broken on a platform would cause problems for GDI in this case.

  4. Yeep says:

    Just curious, but how is this different compared to:
    {
    char* ptr1 = (char *)malloc(1000);
    char* ptr2 = ptr1;

    printf("Mapped to %x and %x\n", ptr1, ptr2);
    printf("*ptr1 = %d, *ptr2 = %d\n", *ptr1, *ptr2);
    *ptr1 = 42;
    printf("*ptr1 = %d, *ptr2 = %d\n", *ptr1, *ptr2);
    }

    All that’s different is that the two pointer vars contain the same address. For rest it does the same. Or am I missing something here?

  5. Raymond Chen says:

    Right, the point is that the two pointers have different addresses. In my case, I wanted to allocate a huge chunk of memory, fill it with stuff, then allocate a second huge chunk of memory, fill the second chunk with stuff based on what was in the first chunk, then "magically" swap the two chunks in memory. (Those who are playing at home will recognize this as a two-space garbage collector.)

    I was never going to use the "map it twice to get two copies" trick; that was just fallout from the "move by remapping" trick.

    I find it hard to believe that a CPU would have trouble with aliasing. When you pass an I/O buffer to kernel mode, kernel mode locks the memory and creates an alias for it, then operates on the alias from then on. That way your app can’t crash kernel mode by freeing the memory while the I/O is still in progress. (And because at the next context switch, your user mode memory gets unmapped, which is kind of a bummer if the hardware generates an interrupt when your app isn’t current – which it probably won’t be since it’s blocked on I/O!)

  6. phaeron says:

    From what I heard, some CPUs have their data cache lines indexed by virtual address, so if the cache size is large enough relative to the page size, it’s possible for the same byte to get mapped into the cache twice via the two windows, and after that updating one cached alias doesn’t update the other. IA-32 uses physical addresses so this isn’t a problem. Presumably if you actually invoke VM calls, though, the kernel does whatever flushing is necessary to avoid problems with stale data.

    The window swapping trick is neat, but isn’t there a potential problem with another thread, perhaps an OS worker thread, mapping something in that area while you’re trying to remap, since you’ll have to temporarily unmap it? IIRC, you can’t memory map into a region reserved by VirtualAlloc(), and if you get switched out it’s possible for someone to sneak in and allocate the address range you’re trying to MapViewOfFileEx() back into. Of course, you’re obviously much more knowledgeable about Win32 than I am, so maybe I’ve missed some way that the mapping addresses of the windows can be changed atomically.

    By the way, I love the blog so far — lots of good tidbits and history. Have you thought of writing a book? :)

  7. RJ says:

    Phaeron said "Have you thought of writing a book? :)"

    This is a fantastic idea, please write a book you’d sell at least 2 copies and that’s gotta be more than most of the "Java in 15"* minutes books sold …. actually I think it’d do very well, and you wouldn’t have to go far to find a publisher :)

    *I only mention this because I went to my local bookstore last night and there were hordes of Java books – still – even after the hype has supposedly died down. Could I find a single book on embedded systems? Yes one – Embedded Systems Programming with Java.

  8. Virtual memory is not virtual address space (part 2).
