If you protect a write with a critical section, you may also want to protect the read


It is common to have a critical section which protects against concurrent writes to a variable or collection of variables. But if you protect a write with a critical section, you may also want to protect the read, because the read might race against the write.

Adam Rosenfield shared his experience with this issue in a comment from a few years back. I'll reproduce the example here in part to save you the trouble of clicking, but also to make this entry look longer and consequently make it seem like I'm actually doing some work (when in fact Adam did nearly all of the work):

class X {
 volatile int mState;
 CRITICAL_SECTION mCrit;
 HANDLE mEvent;
};

X::Wait() {
 while(mState != kDone) {
   WaitForSingleObject(mEvent, INFINITE);
 }
}

X::~X() {
 DestroyCriticalSection(&mCrit);
}

X::SetState(int state) {
 EnterCriticalSection(&mCrit);
 // do some state logic
 mState = state;
 SetEvent(mEvent);
 LeaveCriticalSection(&mCrit);
}

Thread1()
{
 X x;
 ... spawn off thread 2 ...
 x.Wait();
}

Thread2(X* px)
{
 ...
 px->SetState(kDone);
 ...
}

There is a race condition here:

  • Thread 1 calls X::Wait and waits.
  • Thread 2 calls X::SetState.
  • Thread 2 gets pre-empted immediately after calling Set­Event.

  • Thread 1 wakes up from the Wait­For­Single­Object call, notices that mState == kDone, and therefore returns from the X::Wait method.

  • Thread 1 then destructs the X object, which destroys the critical section.

  • Thread 2 finally runs and tries to leave a critical section that has been destroyed.

The fix was to enclose the read of mState inside a critical section:

X::Wait() {
 while(1) {
   EnterCriticalSection(&mCrit);
   int state = mState;
   LeaveCriticalSection(&mCrit);
   if(state == kDone)
     break;
   WaitForSingleObject(mEvent, INFINITE);
 }
}

Forgetting to enclose the read inside a critical section is a common oversight. I've made it myself more than once. You say to yourself, "I don't need a critical section here. I'm just reading a value which can safely be read atomically." But you forget that the critical section isn't just for protecting the write to the variable; it's also to protect all the other actions that take place under the critical section.

And just to make it so I actually did some work today, I leave you with this puzzle based on an actual customer problem:

class BufferPool {
public:
 BufferPool() { ... }
 ~BufferPool() { ... }

 Buffer *GetBuffer()
 {
  Buffer *pBuffer = FindFreeBuffer();
  if (pBuffer) {
   pBuffer->mIsFree = false;
  }
  return pBuffer;
 }

 void ReturnBuffer(Buffer *pBuffer)
 {
  pBuffer->mIsFree = true;
 }

private:
 Buffer *FindFreeBuffer()
 {
  EnterCriticalSection(&mCrit);
  Buffer *pBuffer = NULL;
  for (int i = 0; i < 8; i++) {
   if (mBuffers[i].mIsFree) {
    pBuffer = &mBuffers[i];
    break;
   }
  }
  LeaveCriticalSection(&mCrit);
  return pBuffer;
 }
private:
 CRITICAL_SECTION mCrit;
 Buffer mBuffers[8];
};

The real class was significantly more complicated than this, but I've distilled the problem to its essence.

The customer added, "I tried declaring mIs­Free as a volatile variable, but that didn't seem to help."

Comments (30)
  1. tsaf says:

    pBuffer->mIsFree = false;

    is not in the critical section

  2. SI says:

    Setting it to true also is not in the crit section, which is much worse.

  3. SI says:

    Um forget what I said, I thought the var was called IsUsed. Then you are right and setting it to false is a race condition while setting it to true is irrelevent.

  4. slow says:

    Yeah, FindFreeBuffer can return the same buffer if called from multiple threads before GetBuffer sets the state variable. I think ReturnBuffer might be ok if the assignment is atomic and the class is used correctly (ie: no double-returns, etc), though I've been wrong before and this isn't the way I'd write that kind of code and so haven't been bitten.

  5. slow says:

    Also, don't forget to flush your pBuffers. har har.

  6. Diego F. says:

    What a mess! Multiple threads could acquire the same buffer! The FindFreeBuffer should NEVER leave the critical section without setting mIsFree to false first.

  7. Joshua says:

    FindFreeBuffer can allocate the same buffer twice (already discussed). ReturnBuffer is safe assuming nobody screws up. With modern coding standards, ReturnBuffer should check null first, but older coding standards dind't require that.

  8. voo says:

    @slow Yep I don't see any problems with ReturnBuffer as long as the write is atomic – if not we have a nasty race condition though. But that's highly unlikely without a programming error.

  9. Chris B says:

    I won't even pretend that I understand the problem well enough to fix it, but I will say that when I hear people say things like "I tried X, but that didn't seem to help.", it scares the dickens out of me.  I generally find the "stab in the dark" and "guess and test" methods to be poor proofs of correctness.

  10. Deduplicator says:

    Wouldn't this work with even less overhead?

    X::Wait() {

    while (mState!=kDone)

      WaitForSingleObject(mEvent,INFINITE);

    EnterCriticalSection(&mCrit);// Just make sure the last user left

    LeaveCriticalSection(&mCrit);// Can one just leave a critical section locked if it will just be destroyed? Or will the fact that it is still owned bugger things?

    }

    It's not really that we need the read protected as I understand it, we just need to ensure that noone uses our mCrit any more. Did I forget anything?

  11. bd_ says:

    It should be noted that there are techniques to avoid the need to lock on read – RCU for one. I haven't heard of RCU being implemented on userspace in Windows however; an efficient implementation needs a way to cause a CPU memory barrier on a remote thread, which is difficult without unix signals.

  12. "And just to make it so I actually did some work today…"

    Alright, with that last puzzle you've earned your blog money.

  13. azth says:

    The first example is basically a condition variable right? Why not just use those? :)

  14. Deduplicator says:

    I see two ways to fix things, and as FindFreeBuffer() is a private method and my gut says its name implies the doing, that's where it should be done, quarantining all the nitty-gritty:

      if (mBuffers[i].mIsFree) {

       pBuffer = &mBuffers[i];

    /*Added*/ pBuffer->mIsFree=false;

       break;

      }

    One can do it in by changing GetBuffer() instead, if one so desires:

    Buffer *GetBuffer()

    {

    /*Added*/ EnterCriticalSection(&mCrit);

     Buffer *pBuffer = FindFreeBuffer();

     if (pBuffer) {

      pBuffer->mIsFree = false;

     }

    /*Added*/ LeaveCriticalSection(&mCrit);

     return pBuffer;

    }

    But that's more work to no useful end, as far as I can see, and doesn't clear up the behavior of FindFreeBuffer().

  15. 640k says:

    When i doubt, serialize execution. Thinking of concurrency hurts brain. And even if YOU get it right, the next dude which maintain the code will of course introduce bugs the first thing he do.

  16. Martin J says:

    Doesn't developers have any knowledge of basic CS algorithms any more?

    en.wikipedia.org/…/Readers%E2%80%93writer_lock

  17. John Doe says:

    Read-Write Lock in Windows API has implemented since Vista.

    It will be buggy if implement is written by himself.

  18. asdbsd says:

    @Deduplicator > "Can one just leave a critical section locked if it will just be destroyed?"

    blogs.msdn.com/…/9940330.aspx

    One can, but you're introducing race condition.

    Thread1 (Worker): EnterCriticalSection(cs);

    Thread2 (Destructor): EnterCriticalSection(cs); //waits

    Thread1: LeaveCriticalSection(cs);

    Thread2: //enters cs

    Thread3: EnterCriticalSection(cs); //waits

    Thread2: DeleteCriticalSection(cs);

    Thread3: //wtf dude?! I'm still waiting on it

  19. Deduplicator says:

    @asdbsd:

    I don't see where you get the idea that anyone could want to handle the cs after the point the destructor destroys it. Looks like you want to keep the cs around forever.

  20. asdbsd says:

    @Deduplicator: If nobody can, then there's no point to do EnterCriticalSection(cs); before destroying it in the first place.

  21. Adam Rosenfield says:

    @azth: Yes, it's basically a condition variable, but Xbox 360 SDK does not have condition variables.

    @Martin J, @John Doe: Likewise, the Xbox 360 SDK does not have reader-writer locks (but I won't fault you for not knowing that, even if you didn't click through to the original post and scroll down to find it, since anchors still seem to be broken)

    Thanks for the call-out, Raymond.

  22. Peeter Joot says:

    How does a generic programming error like this end up serviced by Microsoft support? Do you offer general debugging support contracts? I can't imagine how this can be portrayed as a Windows bug.

    [My guess is that if you offer enough money, Microsoft Consulting Services will even wash your car for you. And don't forget, some people think that any bug that occurs on Windows is a Windows bug. -Raymond]
  23. Deduplicator says:

    asdbsd wrote: @Deduplicator: If nobody can [want to enter the critical section after it's destroyed], then there's no point to do EnterCriticalSection(cs); before destroying it in the first place.

    The point of the exercise is not to synchronize anything but the critical section itself. You don't want a ReleaseCriticalSection() in a different thread after you already destroyed it. So we wait till the last thread to lock it is done, by using EnterCriticalSection(). How else would you guarantee that?

  24. Jon says:

    @Deduplicator

    Your code is basically the same as:

    X::Wait() {

    while(1) {

      EnterCriticalSection(&mCrit);

      LeaveCriticalSection(&mCrit);

      int state = mState;

      if(state == kDone)

        break;

      WaitForSingleObject(mEvent, INFINITE);

    }

    }

    I can't comment on efficiency or correctness though. Anyone care to comment?

  25. Axel says:

    volatile doesn't guarantee atomic access in C/C++ anyway! That's a common error, because in Java and C# it *does* actually do that.

  26. Jon says:

    Is it safe to leave ReturnBuffer() as it is, unlocked? I've written and read plenty of code like that on x86 and x64, so wanted to be sure.

  27. azth says:

    @Adam Rosenfield: Isn't the 'fixed' version of X::Wait() also broken? Consider this scenario:

    * Thread 1 enters the critical section, saves mState (which at this point is != kDone), and just before it calls WaitForSingleObject(), it gets preempted.

    * Thread 2 starts running, does it's thing, and calls SetEvent(); however, Thread 1 did not call WaitForSingleObject() yet, so this signal is lost.

    * Thread 1 runs again, and enters WaitForSingleObject(). It will forever remain in this state as SetEvent() will not be called again.

    Am I missing anything? :)

  28. Jon Daley says:

    @azth, the behavior of SetEvent depends on how the event is created, as to whether it will be auto-reset or not.

    But, yes, this code is pretty strange.  The "fixed" version doesn't do anything for protecting the critical section from being used after it is deleted.  Sounds like a code redesign is needed, or else (as other people have said) the critical section shouldn't be deleted here.

  29. asdbsd says:

    @Deduplicator> "So we wait till the last thread to lock it is done, by using EnterCriticalSection()."

    How can you say that this was the last thread? That is the real problem. If you can guarantee that at most one thread is at worst case INSIDE of critical section right now then the worst part is already solved.

    There could be several threads approaching EnterCriticalSection at this moment. Some of them could be waiting in EnterCriticalSection, so no warning flags you set up are going to help: they're already there! Taking the section won't work either because you might get sorted before those other waiters:

    Thread1: EnterCriticalSection(cs); //enters

    Thread2: EnterCriticalSection(cs); //waits

    Thread3 (destructor): EnterCriticalSection(cs); //waits

    Thread1: LeaveCriticalSection(cs);

    Thread3: //enters critical section and destroys it

    Thread2: //huh?

    Even with only one other thread Enter/Leave can't protect anything:

    Thread1: bCanEnterSection==true? Okay, I'll enter it.

    Thread2: //commence destruction

    bCanEnterSection=false;

    EnterCriticalSection(cs); //make sure no one is inside

    Thread1: EnterCriticalSection(cs); //waits

    Thread2:

    //no one is inside so we're destroying it

    DestroyCriticalSection(cs);

    Thread1: //hey, wait a bit…

  30. Jules says:

    Java has an interesting take on this problem: you can't call Object.wait() (equivalent to WaitForSingleObject()) unless you hold the mutex that's associated with that object; it throws an exception otherwise.  This means that the code

    X::Wait() {

    while(mState != kDone) {

      WaitForSingleObject(mEvent, INFINITE);

    }

    }

    basically has to be translated in Java idiomatic usage to:

    class X {

     void Wait () {

       synchronized (event) {

         while (mState != kDone) {

           mEvent.wait ();

         }

       }

     }

    }

    Sure, you *could* put the synchronized block inside the while loop, but by doing so the bug is made obvious…

Comments are closed.