Concurrency, Part 4 - Deadlocks

Yesterday I talked about critical sections and their use in protecting application data.

But critical sections are like duct-tape - they hold applications together but they also have their dark side.  A couple of commenters have touched on this, but the biggest issue with critical sections is "deadlocks".

The simplest way of generating a deadlock is what is called a "lock order inversion".  Basically if you have two objects, call them "Thing1" and "Thing2".  Each is protected by a critical section.

So what happens when you have:

class Thing1 {    CriticalSection _Thing1Lock;    class Thing2 *_Thing2;public:    void Lock() { EnterCriticalSection(&_Thing1Lock); }    void Unlock() { LeaveCriticalSection(&_Thing1Lock); }    void Thing1Method1()    {        Lock();        ...        _Thing2->Thing2Method2();        ...        Unlock();    }    Thing1Method2()    {        Lock();        ...        Unlock();    }};class Thing2 {    CriticalSection _Thing2Lock;    class Thing1 *_Thing1;public:    void Lock() { EnterCriticalSection(&_Thing2Lock); }    void Unlock() { LeaveCriticalSection(&_Thing2Lock); }    Thing2Method1()    {        Lock();        ...        _Thing1->Thing1Method2();        ...        Unlock();    }    Thing2Method2()    {        Lock();        ...        Unlock();    }};

So what happens when Thread1 calls:

        thing1->Thing1Method1();

And Thread2 calls:

        thing2->Thing2Method1();

?

Thread 1 Thread 2
thing1->Thing1Method1() thing2->Thing2Method1()
EnterCriticalSection(&_Thing1Lock) EnterCriticalSection(&_Thing2Lock)
... ...
_thing2->Thing2Method2() _thing1->Thing1Method2()
EnterCriticalSection(&_Thing2Lock) EnterCriticalSection(&_Thing1Lock)

At this point, you've deadlocked the process.  Thread 1 is blocked waiting for Thread2 to release the _Thing2Lock.  Thread2 is blocked waiting on Thread1 to release the _Thing1Lock.

And there's no way of getting out of this situation, you're totally stuck.

So how do you avoid deadlocks?  Well, this is where the whole "concurrency discussion" starts getting tricky.

The simplest solution is once again, to avoid the problem.  Define a strict ordering for your locks and stick with it.

So this would change Thing2::Thing2Method1() to be:
    Thing2Method1()    {        _Thing1->Lock();        Lock();        ...        _Thing1->Thing1Method2();        ...        Unlock();        _Thing1->Unlock();    } 

And now, the flow of control is:

Thread 1 Thread 2
thing1->Thing1Method1() thing2->Thing2Method1()
EnterCriticalSection(&_Thing1Lock) EnterCriticalSection(_Thing1Lock)
   
...  
_thing2->Thing2Method2()  
EnterCriticalSection(&_Thing2Lock)  
_Thing1->Thing1Method2()  
EnterCriticalSection(&_Thing1Lock)  
...  
LeaveCriticalSection(&_Thing1Lock)  
...  
LeaveCriticalSection(&_Thing2Lock)  
LeaveCriticalSection(&_Thing1Lock)  
  EnterCriticalSection(&_Thing2Lock)
  ...
  _Thing1->Thing1Method2()
  EnterCriticalSection(&_Thing1Lock)
  ...
  LeaveCriticalSection(&_Thing1Lock)
  ...
  LeaveCriticalSection(&_Thing2Lock)
  LeaveCriticalSection(&_Thing1Lock)

Voila, we've avoided the deadlock.

So that's the third principle of concurrent programming: Know your lock order and never, ever violate it.

But what happens if you can't avoid violating the lock order?  It does happen, especially if your code calls out to other components.

Tomorrow, I'll talk about some ways of avoiding the problem.

 

Edit: Fixed typo in Thing2::Lock and Unlock, thanks ac.