Are views of memory-mapped files coherent within a single process? (And how this was the wrong question.)

A customer wanted some clarification. An MSDN article says

This means that changes made to a page in the memory-mapped file via one process's view are automatically reflected in a common view of the memory-mapped file in another process.

"The above paragraph says that the views are coherent across processes, but what about within a single process?"

Yes, the views are coherent within a single process, too. The documentation for MapViewOfFile says

... file views derived from any file mapping object that is backed by the same [local] file are coherent or identical at a specific time. Coherency is guaranteed for views within a process and for views that are mapped by different processes.

The customer explained that they had a large file that they wanted to update from multiple threads. No two threads operate on the same bytes of the file, but the pieces are very close together and sometimes are interleaved. (As an extreme example: Thread 1 is operating on all the bytes at odd file offsets, and thread 2 is operating on all the bytes at even file offsets.) Right now, they are using critical sections to make sure that only one thread updates the file at a time. They were thinking of using a memory-mapped file and letting each of the threads party on the portion of the view that corresponds to the work that that thread is doing. Since the threads are operating on disjoint portions of the file, they wouldn't need to synchronize access with each other. They just want to make sure there are no hidden gotchas with coherency.

If you look at their problem description, you'll see that view coherency is totally not the issue at hand. They aren't creating multiple views. They are using a single view and sharing it among multiple threads. This is not about view coherency: There is only one view, so it is trivially coherent with itself. This is really about memory coherency. So let's take the view out of the picture. The actual question is "I have a chunk of memory that multiple threads are updating without locking. No two threads are updating the same byte of the memory. How safe is that?"

The answer is "Sort of, but you need to be careful."

The solution is to use atomic memory operations. But we're not really interested in the atomicity, seeing as each byte is operated on by only one thread. What we really care about is that one thread's writes don't incidentally write to bytes that belong to another thread.

In practice, this means operating on objects that can be updated atomically by the processor: Win32 guarantees atomic access for properly-aligned 32-bit values and properly-aligned pointer-sized values.

Therefore, you can slice up the file into chunks so that everything you operate on is either a properly-aligned 32-bit value or a properly-aligned pointer-sized value (though why you are putting pointers in a file eludes me). Then you can have all the threads access the memory directly, and they won't step on each other. It's okay if a single slice consists of, say, four 32-bit values that are individually aligned, although the slice as a whole is not 32-byte aligned. The point is that when you access the slice, you access it in pieces that are atomically updatable.
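The slicing rule can be sketched in a few lines of C. This is only an illustration, not the customer's code: the names slice_is_word_granular and fill_slice are hypothetical, an ordinary buffer stands in for the mapped view, and the sketch assumes the base address of the view is at least 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a slice can be handed to one thread if it starts
   and ends on a 32-bit boundary, so that every store into it is an
   aligned 32-bit store that cannot touch a neighboring slice's bytes. */
int slice_is_word_granular(size_t offset, size_t length)
{
    return offset % sizeof(uint32_t) == 0 && length % sizeof(uint32_t) == 0;
}

/* Fill a slice using only aligned 32-bit stores. `view` stands in for the
   base address of the mapped view (assumed to be 4-byte aligned). */
void fill_slice(void *view, size_t offset, size_t length, uint32_t value)
{
    assert(slice_is_word_granular(offset, length));

    /* volatile asks the compiler to emit each 32-bit store as written. */
    volatile uint32_t *p =
        (volatile uint32_t *)((unsigned char *)view + offset);

    for (size_t i = 0; i < length / sizeof(uint32_t); i++) {
        p[i] = value;
    }
}
```

A slice of four 32-bit values at byte offset 4 passes the check, even though the slice as a whole sits on nothing stronger than a 4-byte boundary. A slice that starts at offset 3, or that is 6 bytes long, does not.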

But if you are slicing up the file into bytes, like in our example above, then you cannot just have them all modifying bytes freely because byte-granular access is not guaranteed to be atomic. Windows can run on systems that do not support byte-granular access, such as the original Alpha AXP. Writing a byte on the original Alpha AXP was a multi-step affair: Load the eight-byte-aligned chunk of memory surrounding the byte into a single 64-bit register, update the single byte in that register, then write all eight bytes back to memory. Observe that this update is not atomic: If two threads try to write different bytes that happen to reside in the same eight-byte-aligned chunk, the writes may collide, and one of the writes will appear to be lost. (The Alpha AXP also has a notoriously weak memory model.)
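To make the collision concrete, here is a small C sketch that replays one unlucky interleaving on a single thread. It is a simulation, not real Alpha AXP code: the functions load_quadword, store_byte_via_quadword, and lost_update_demo are hypothetical names, and a uint64_t variable plays the role of the eight-byte-aligned chunk of memory.

```c
#include <assert.h>
#include <stdint.h>

/* Step 1 of a simulated byte store: load the surrounding quadword. */
uint64_t load_quadword(const uint64_t *mem)
{
    return *mem;
}

/* Steps 2 and 3: patch one byte in the register copy, then write all
   eight bytes back, including the seven this "thread" does not own. */
void store_byte_via_quadword(uint64_t *mem, uint64_t loaded,
                             unsigned index, uint8_t value)
{
    assert(index < 8);
    unsigned shift = index * 8;              /* position of the byte lane */
    loaded &= ~((uint64_t)0xFF << shift);    /* clear the target byte */
    loaded |= (uint64_t)value << shift;      /* insert the new byte */
    *mem = loaded;                           /* write back the whole quadword */
}

/* Replays the interleaving "A loads, B loads, A stores, B stores" and
   returns the final contents of the quadword. */
uint64_t lost_update_demo(void)
{
    uint64_t mem = 0;
    uint64_t a = load_quadword(&mem);            /* thread A loads */
    uint64_t b = load_quadword(&mem);            /* thread B loads before A stores */
    store_byte_via_quadword(&mem, a, 0, 0x11);   /* A writes byte lane 0 */
    store_byte_via_quadword(&mem, b, 1, 0x22);   /* B writes lane 1 from a stale copy */
    return mem;
}
```

In this interleaving, thread B's write-back carries the stale zero for byte lane 0, so lost_update_demo returns 0x2200 and thread A's 0x11 is gone, even though the two threads never targeted the same byte.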

Bonus chatter: If you wanted to avoid having to rewrite the program too much, you could use the WriteFile function with an OVERLAPPED structure that provides the explicit offset for each write. This avoids having to synchronize the threads so that two threads don't try to move the file pointer at the same time. On the other hand, you aren't avoiding the synchronization entirely because I/O to a synchronous file handle is serialized. To get truly parallel writes, you need to open the file in FILE_FLAG_OVERLAPPED mode, but that is potentially a bigger change to the program.

Comments (13)
  1. To me, aside from the emphasis on the View, I would tend to assume that anything which is guaranteed across processes, would also be guaranteed within a process… which begs a bigger question: what might be guaranteed to work ACROSS processes, that cannot be guaranteed WITHIN a process?

  2. pc says:

    Would the answer be any different if the threads did each use a different view?

    In practice, as long as you aren’t running your code on Alpha AXP (or a future similar platform), are byte-granular accesses atomic?

    1. Oh, you mean like the original ARM, which also didn’t have byte-granular access? At any rate, it’s a CPU question, not a Windows question. You’d have the same issues in any operating system.

  3. Off-topic: Is something going on with the blog software again today? has the old style again (yay!) and incidentally redirects back to the /b/ URL, but it’s missing this week’s posts from it, and the recent posts it does have are missing some/all comments. Smells like it’s connecting to an old backup of the database. Yet, this week’s posts, which I still have open, are still displaying in the new white style.

    1. mikeb says:

      The old URL I have for rss.aspx didn’t update with this article. There doesn’t seem to be an RSS feed URL for the new blog page. I might be jumping the gun, but I hope MS isn’t abandoning RSS as a way to get updates on MSDN blogs.

      1. Ramón Sola says:

        You can reach the feed at (actually an Atom feed, not a RSS feed, but readers usually can parse both). The feed URL is not exposed via “link” tags yet.

      2. Neil says:

        This blog now appears to be running on WordPress, so I tried the permalink-style feed URL which is simply and that does appear to work. (I believe the default format is RSS2 but if you prefer e.g. atom, then works for that.)

      3. Neil says:

        The comment I wrote mentioning that the WordPress default URL appeared to work is still pending moderation but since then the original feed URL has started working again so it doesn’t matter.

      4. you are correct that rss.aspx is no longer… if you open the page in a browser, you’ll be redirected to /feed, which does work.

  4. sense says:

    Is “32-byte aligned” a typo? And if yes, is it a typo of “32-bit aligned” or a typo of “16-byte aligned” (Because four 32-bit values will be 16-bytes long)?

    Either interpretation of the sentence makes no sense to me:
    1- a 32 or 16 byte alignment is just too much for CPUs I know,
    2- and the slice cannot be “not 32-bit aligned”, if all values in it are 32-bit aligned

  5. Lev says:

    There will also be performance issues. Threads won’t be able to use the cache efficiently because it loads more than 8-byte chunks.

  6. Ben Voigt says:

    @Scott: Some examples of things that work well across processes but are prone to failure within a process: DebugActiveProcess(), MiniDumpWriteDump()
