Why don't critical sections work cross process?

I could have sworn this was answered in a previous blog by someone else (Raymond, Eric Lippert, etc), but...

Someone sent me feedback asking:

Q> Why can't critical section objects be used across processes compared to mutexes?

Originally, I thought "Man, that's a silly question, it's obvious".

But then I realized that it's not obvious, because critical sections are different from every other external synchronization mechanism in Windows (there are some internal synchronization mechanisms that share characteristics with critical sections, but they're not public).

You see, a critical section isn't a native object type in Windows.  All the other synchronization primitives (mutexes, events, semaphores, etc) are native objects - the user mode semaphore (or mutex, or event) is an object maintained by NT's object manager.  As such, it has an ACL, a name, all the things that go with being a native object. 

And since these synchronization primitives are maintained by the object manager, they can be shared across processes - another process can open a handle to a named object, or you can dup the handle into another process, or you can have the handle be inherited by a child process.

But critical sections are special.

You see, the flexibility that you get by being maintained by the NT object manager has a cost associated with it - every operation that's performed on the semaphore/mutex/event requires a user mode to kernel mode transition, as does waiting on the object.

Sometimes that cost is too high - there's a need for a highly performant lock structure that can be used to protect a region of code.  That's where the critical section comes into play.

A critical section is just a structure that contains a whole bunch of opaque fields.  Inside the EnterCriticalSection routine is code that uses interlocked instructions to acquire the structure without entering the kernel when there's no contention - that's what makes the critical section so fast.
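
To make the "acquire without entering the kernel" idea concrete, here's a rough sketch in portable C++.  It is not the real CRITICAL_SECTION layout or the real EnterCriticalSection code - SketchLock, sketch_enter, sketch_leave and sketch_demo are names I made up for illustration, and the yield loop is only a stand-in for the kernel wait that a real critical section falls back to under contention.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a critical section: just a chunk of memory.
// This is NOT the real CRITICAL_SECTION layout.
struct SketchLock {
    std::atomic<bool> held{false};           // the word the interlocked op works on
    std::atomic<long> slow_path_entries{0};  // how often the fast path failed
};

inline void sketch_enter(SketchLock& lock) {
    bool expected = false;
    // Fast path: a single atomic compare-and-swap, no system call.
    if (lock.held.compare_exchange_strong(expected, true,
                                          std::memory_order_acquire)) {
        return;
    }
    // Slow path: somebody else owns the lock. A real critical section
    // would wait on a kernel object here; this sketch just yields
    // and retries (an assumption made to keep the example portable).
    lock.slow_path_entries.fetch_add(1, std::memory_order_relaxed);
    for (;;) {
        std::this_thread::yield();
        expected = false;
        if (lock.held.compare_exchange_weak(expected, true,
                                            std::memory_order_acquire)) {
            return;
        }
    }
}

inline void sketch_leave(SketchLock& lock) {
    assert(lock.held.load());  // leaving a lock that isn't held is a bug
    lock.held.store(false, std::memory_order_release);
}

// Several threads in ONE process bump a plain counter under the lock.
inline long sketch_demo(int threads, int iterations) {
    SketchLock lock;
    long counter = 0;  // protected by lock
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                sketch_enter(lock);
                ++counter;
                sketch_leave(lock);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Calling sketch_demo(4, 10000) should hand back 40000, and in the uncontended case every acquire is just that one compare-and-swap on memory the thread can already see.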

Of course, the fact that the critical section is just a structure is also why it can't be shared between processes - since it's just a chunk of memory that's in the process's address space, it's not accessible to other processes.

The clever observer now realizes that this raises the question: What happens if I initialize a critical section in a shared memory region?  After all, it's just a chunk of memory - I can share a memory region between two processes, and just initialize a critical section in that shared region.

This might actually work, for a while.  But the thing about critical sections is that they're more than just a spin lock.  There's also a semaphore that's acquired when the critical section has contention.  And that semaphore isn't shared between processes (in fact, the semaphore isn't even "allocated" until there's contention, and it's not really allocated, per se).  If that weren't enough, there are also fields within the critical section that point to other external data structures - those structures won't exist in the process that didn't initialize the critical section.  There's no way of knowing what will happen if the other process enters the critical section.  If you're lucky, the process will crash.  If you're not, you might "just" corrupt memory.
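
Here's a toy model in C++ of why that per-process state bites.  Nothing in it is a real Windows API: ToyProcess, ToySharedLock and toy_contended_enter are invented for illustration, and the "handle table" is just a map.  The only point is that a handle value written into shared memory by one process means something different (or nothing at all) in another.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy model of a process: handle values only mean something inside
// the table that issued them. (Hypothetical, not a real Windows API.)
struct ToyProcess {
    std::map<std::uint32_t, std::string> handles;
    std::uint32_t next_handle = 4;

    std::uint32_t create_object(const std::string& what) {
        std::uint32_t h = next_handle;
        next_handle += 4;
        handles[h] = what;
        return h;
    }

    // Returns nullptr if this process has no such handle.
    const std::string* resolve(std::uint32_t h) const {
        auto it = handles.find(h);
        return it == handles.end() ? nullptr : &it->second;
    }
};

// Toy stand-in for a critical section placed in shared memory.
struct ToySharedLock {
    std::uint32_t wait_handle = 0;  // 0 means "not created yet"
};

// On contention, the entering process lazily creates the wait object
// and stores ITS OWN handle value into the shared structure.
inline std::uint32_t toy_contended_enter(ToyProcess& self, ToySharedLock& lock) {
    if (lock.wait_handle == 0) {
        lock.wait_handle = self.create_object("wait object for lock");
    }
    assert(lock.wait_handle != 0);
    return lock.wait_handle;
}
```

If process A hits contention first, it stores its handle in the shared structure.  When process B later reads that same number, it looks it up in its own table and gets either nothing or some completely unrelated object that happens to have the same handle value.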

This is a really long answer to a really short question, but sometimes it's worth digging into it a bit.