Visual Studio 2005 gives you acquire and release semantics for free on volatile memory access

If you are using Visual Studio 2005 or later, then you don't need the weird Interlocked­Read­Acquire function, because Visual Studio 2005 and later automatically impose acquire semantics on reads from volatile locations. They also impose release semantics on writes to volatile locations. In other words, you can replace the old Interlocked­Read­Acquire function with the following:

#if _MSC_VER >= 1400
LONG InterlockedReadAcquire(__in volatile LONG *pl)
{
    return *pl; // Acquire imposed by volatility
}
#endif

This is a good thing because it expresses your intentions more clearly to the compiler. The old method that overloaded Interlocked­Compare­Exchange­Acquire forced the compiler to perform the actual compare-and-exchange even though we really didn't care about the operation; we just wanted the side effect of the Acquire semantics. On some architectures, this forces the cache line dirty even if the comparison fails.
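For comparison, here is a minimal sketch of the same acquire-read and release-write pattern written with C++11 std::atomic instead of the Visual Studio volatile extension. It is not from the original article: the names g_payload, g_ready, Publish, ReadAcquire, and TryConsume are hypothetical, and the sketch assumes a C++11 compiler rather than any particular version of Visual Studio.

```cpp
#include <atomic>

// Hypothetical names for illustration only; none appear in the original article.
static int g_payload = 0;             // ordinary (non-atomic) data
static std::atomic<long> g_ready{0};  // flag that publishes the payload

// Writer side: store the data, then set the flag with release semantics,
// so the payload write cannot be reordered after the flag write.
void Publish(int value)
{
    g_payload = value;
    g_ready.store(1, std::memory_order_release);
}

// Reader side: a plain load with acquire semantics. Unlike the
// compare-and-exchange trick, this expresses a read and nothing else.
long ReadAcquire(const std::atomic<long>& flag)
{
    return flag.load(std::memory_order_acquire);
}

// Returns true and copies the payload out if it has been published.
bool TryConsume(int* out)
{
    if (ReadAcquire(g_ready) == 0) return false;
    *out = g_payload; // ordered after the acquire load of the flag
    return true;
}
```

In a real program Publish would run on one thread and TryConsume on another. The acquire load pairs with the release store, so a reader that sees the flag set also sees the payload written before it.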

Comments (10)
  1. Zan Lynx says:

    Of course, writing your code relying on volatile in this way results in mysterious failure conditions if you port it to any other compiler and/or operating system.

    [That's why I put the #if around it. -Raymond]
  2. Alex Grigoriev says:

    Original meaning of volatile was for memory locations that can change their state outside of "C virtual machine". The closest to that definition are device registers.

    I don't know if later C/C++ standards extended volatile definition for multi-processing.

  3. Billy O'Neal says:

    @Alex, the C and C++ standards do not know the existence of multithreading (oops… C++11 just got released… well, even there it's just a library). They therefore cannot impose any threading semantics on volatile. This is MSVC++ specific.

  4. Ben Hutchings says:

    @Billy: Threading is not 'just a library' in C++11. The abstract machine semantics were almost entirely rewritten to cover the behaviour of multithreaded programs.

  5. mikeb says:

    I think the barrier semantics MS added in VS 2005 are sensible defaults.  But one situation I'd like to experiment with is how the implied barriers might impact something like the following (on a platform that requires barriers):

       void CopyToVolatileBuf( char volatile* vdst, char* src)
       {
           while (*vdst++ = *src++) {};
       }

    In most situations, I'd expect that you'd want to perform the entire transfer before issuing  a barrier, but I expect with the volatile semantics in MSVC there would be a barrier after each character is copied.

    I'll have to experiment (with an ia64 build?).  I'd bet to get only a single barrier, you'd have to do something like:

       void CopyToVolatileBuf( char volatile* vdst, char* src)
       {
           char* dst = (char*) vdst;

           while (*dst++ = *src++) {};

           // perform some operation to get a memory barrier
       }

    However, I think Microsoft's volatile semantics are what should be done – I'd prefer correct behavior over performance by default.  And I think having volatile imply an appropriate barrier is the right thing to do.  Even if the standard doesn't require it, the standard also doesn't require much of any other expected behavior in the face of multiple threads (excluding the recently ratified C++0x standard), but compilers have been going 'beyond the standard' to support things sensibly with threads for years.

  6. LR says:

    @mikeb: As far as I understood the MSDN documentation about this semantics of "volatile", this is ONLY taken as a hint to the compiler to perform all data access in the order written in the C code, and without using registers for the "volatile" value. It has no implication whatsoever on reordering of memory access by the CPU in the face of multi-core systems. There is no memory barrier.

    So, your C loop examples are not affected by using "volatile" one way or the other.

    Or am I wrong?

  7. Anon says:

    On the Itanium, acquire/release semantics are achieved through the use of special load/store instructions (ld.acq, st.rel). Explicit acquire/release "barriers" do not exist. (There is a full memory barrier instruction (mf), though, which is generated by the non-Acquire/Release versions of Interlocked*.)

    [Yuhong, is that you? -Raymond]
  8. Anon says:

    No. ;) Was addressing the comments posted by mikeb and LR, actually. (Acquire/release isn't implemented via barriers inserted at the end of statements or blocks, and the instructions generated do enforce proper ordering by the CPU.)

  9. LR says:

    @Anon: Thanks.

    I have no experience with that, but I'm a bit confused by the mix of two different things: opcode generation and ordering (by the compiler) vs. memory-access reordering (by CPU/cache/whatever).

    From the description, I take that VS2005 only deals with the first issue, not the second.

  10. Anon says:

    @LR: Yeah, the VC docs could be clearer…

    On x86/x64, it only actually has to deal with the first issue (compiler reordering), because in effect all reads and writes already have acquire and release semantics, respectively. See:…/volatile-acquire-release-memory-fences-and-vc2005.aspx

    On the Itanium, the compiler deals with the second issue (CPU memory-access reordering) by replacing ordinary load/store instructions with special acquire/release versions.

    On non-"Windows" platforms, however, such as Xbox 360, you evidently can't count on "volatile" to prevent CPU reordering. Under "Volatile Variables and Reordering",…/ee418650%28v=VS.85%29.aspx states: "on Xbox 360 the compiler does not insert any instructions to prevent the CPU from reordering reads and writes."
