Memory-mapped files and how they work

A key Windows facility that’s been available since NT shipped is the support of memory-mapped files.  A memory-mapped file is a file that has been mapped (i.e., not copied) into virtual memory such that it looks as though it has been loaded into memory.  Rather than actually being copied into virtual memory, a range of virtual memory addresses are simply marked off for use by the file.  You (or Windows) can then access the file as though it were memory, freely changing parts of it without having to make file I/O calls.  Windows transparently loads parts of the file into physical memory as you access them.  You don’t have to concern yourself with which parts are and are not in memory at any given time.

 

As each page of the file is accessed and copied into memory so that the CPU can access it, it bypasses the paging file.  In this sense, the file takes the place of the paging file as the backing storage for the particular range of virtual memory addresses into which it has been mapped.  Typically, the backing storage for virtual memory is the system paging file.  But with memory-mapped files, things change.  The files themselves act as virtual extensions of the paging file and serve as the backing storage for their associated virtual memory address range for unmodified pages.

 

How are memory-mapped files used?  Most commonly by Windows itself.  Every time a binary image (an .EXE or .DLL, for example) is loaded into memory, it is actually loaded there as a memory-mapped file.  The DLLs you see loaded into a process’s virtual memory address space are actually mapped there as memory-mapped files by Windows. 

 

As I’ve mentioned, memory-mapped files are also used to process files as though they were data.  If you’d rather search a file or perform minor in-place modifications of it using simple pointer dereferencing and value assignment rather than invoking file I/O operations, you can certainly do so.  Have a look at the MapViewOfFile() Win32 API for some basic info and some pointers on how to get going with this.  Something to be careful of, though, is depleting virtual memory.  When mapping large files into virtual memory to perform I/O on them, be cognizant of the fact that every address you burn in virtual memory is another than can’t be used by your app.  It’s usually more efficient to use regular file I/O routines to perform read/write operations on large files.

 

I’ll continue tomorrow with a discussion of how this relates to SQL Server.