Looking through some of the additions to the WDK, I found an interesting structure, RTL_RUN_ONCE. After a little more digging, I found that it is also exposed in user mode under a different typedef, INIT_ONCE. Both offer the same functionality, but with naming patterns that match their type names: the Rtl functions follow the pattern RtlRunOnceXxx, while the user mode equivalents are InitOnceXxx. (Currently these APIs are only supported on Windows Vista. If you would like to see support for previous Windows releases, please let me know.)
Great, but what does this structure do? It helps you implement the double-checked locking pattern in a safe, race-free way that has been tested by the Windows kernel team. So why was this added? The Windows org found that a lot of components were implementing this pattern (and yes, I am aware that languages like Java and C# provide language support directly), but few were implementing it correctly and accounting for the entire set of race conditions that could occur. This functionality was created and exposed so that there could be one well-written, well-tested implementation of the pattern.
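To make this concrete, here is a minimal sketch of one-time initialization through the user mode API, using InitOnceExecuteOnce with a statically initialized INIT_ONCE. The Settings structure, its field values, and the helper names (fill_settings, get_settings, settings_init_count) are made up for illustration. The non-Windows branch is not part of the API at all: it swaps in pthread_once purely so the sketch can be compiled elsewhere.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <pthread.h>   /* stand-in so the sketch also builds off Windows */
#endif

/* Hypothetical state: several fields that must all be set before first use. */
typedef struct Settings {
    int max_connections;
    int timeout_ms;
} Settings;

static Settings g_settings;
static int g_init_calls;   /* how many times the initializer ran (demo only) */

/* The actual initialization work; the values are placeholders. */
static void fill_settings(void)
{
    g_settings.max_connections = 64;
    g_settings.timeout_ms = 5000;
    g_init_calls++;
}

#ifdef _WIN32
static INIT_ONCE g_once = INIT_ONCE_STATIC_INIT;

/* Runs at most once; other callers block until it returns. */
static BOOL CALLBACK init_settings(PINIT_ONCE once, PVOID param, PVOID *context)
{
    (void)once;
    (void)param;
    fill_settings();
    *context = &g_settings;   /* stored in the INIT_ONCE when we return TRUE */
    return TRUE;
}

const Settings *get_settings(void)
{
    PVOID context = NULL;
    if (!InitOnceExecuteOnce(&g_once, init_settings, NULL, &context))
        return NULL;          /* initialization failed */
    return (const Settings *)context;
}
#else
static pthread_once_t g_once = PTHREAD_ONCE_INIT;

const Settings *get_settings(void)
{
    if (pthread_once(&g_once, fill_settings) != 0)
        return NULL;
    return &g_settings;
}
#endif

/* Exposed only so a caller can confirm the initializer ran once. */
int settings_init_count(void)
{
    return g_init_calls;
}
```

Every caller just calls get_settings() and uses the returned pointer. No caller needs to know whether it was the one that triggered the initialization or merely waited for it to finish.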
If you are using this pattern to initialize a single pointer-sized value, you can still implement it with relative ease using InterlockedCompareExchangePointer (or InterlockedCompareExchange for a 32-bit value), but if you are initializing multiple fields, you should look at this new functionality. Initializing multiple fields is quite difficult to do correctly when two threads are each trying to perform the initialization, or one is waiting for the other to finish touching all of the fields. (My personal fix for this problem has been to put all the fields into a structure and access them through a pointer to that structure, where the pointer is set by calling InterlockedCompareExchangePointer.)
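The structure-behind-a-pointer fix described above might look roughly like the sketch below. To keep it compilable anywhere, it uses the C11 atomic_compare_exchange_strong as a stand-in for the Interlocked call, so this is the same idea rather than the Windows code itself. The Config structure, its values, and get_config are hypothetical names.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical group of fields that must become visible together. */
typedef struct Config {
    int cache_size;
    int retry_limit;
} Config;

/* NULL until some thread publishes a fully initialized Config. */
static _Atomic(Config *) g_config = NULL;

const Config *get_config(void)
{
    /* Fast path: somebody already published the structure. */
    Config *current = atomic_load(&g_config);
    if (current != NULL)
        return current;

    /* Build a private copy; no other thread can see it yet. */
    Config *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return NULL;
    fresh->cache_size = 256;    /* placeholder values */
    fresh->retry_limit = 3;

    /* Publish only if the pointer is still NULL. */
    Config *expected = NULL;
    if (atomic_compare_exchange_strong(&g_config, &expected, fresh))
        return fresh;           /* we won the race */

    /* Another thread won: discard our copy and use the published one. */
    free(fresh);
    return expected;            /* the failed exchange stored the winner here */
}
```

On Windows the publish step would correspond to InterlockedCompareExchangePointer((PVOID volatile *)&g_config, fresh, NULL), which returns the previous value of the pointer. The tradeoff of this approach is that two racing threads may both do the initialization work and one result is thrown away, so it only suits initialization that is cheap and free of side effects.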
I have been told, and have read online, that the Visual C++ 2005 compiler will also help you implement this pattern because it formalized what the volatile keyword means in terms of reading and writing to a memory location. I haven't seen a formal statement to that effect, but from what I can read, it does help in implementing the pattern.