Locking Hierarchy

Many hangs occur within the Win32 API EnterCriticalSection.  Often, these hangs indicate a violation of the locking hierarchy.  A locking hierarchy is a graph depicting the order in which threads acquire locks (such as critical sections).  In general, every thread should acquire locks in the same order and, by convention, release them in the reverse order.  Two threads that take the same pair of locks in opposite orders can deadlock.
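
As a minimal sketch of a two-level hierarchy, the following uses std::mutex in place of a Win32 critical section so that it stays portable.  The lock names, the levels, and the Deposit and Withdraw functions are hypothetical and exist only for illustration.  The point is that every path that needs both locks takes them in the same order.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical hierarchy: g_accountsLock (level 1) is always acquired
// before g_logLock (level 2). No code path takes them in the other order.
std::mutex g_accountsLock;  // level 1
std::mutex g_logLock;       // level 2

int g_balance = 0;      // protected by g_accountsLock
int g_logEntries = 0;   // protected by g_logLock

void Deposit(int amount)
{
    assert(amount >= 0);
    std::lock_guard<std::mutex> accounts(g_accountsLock);  // level 1 first
    std::lock_guard<std::mutex> log(g_logLock);            // then level 2
    g_balance += amount;
    ++g_logEntries;
}   // the guards release in reverse order: level 2, then level 1

void Withdraw(int amount)
{
    assert(amount >= 0);
    // Same order as Deposit. Taking g_logLock first here would invert the
    // hierarchy and could deadlock against a concurrent Deposit.
    std::lock_guard<std::mutex> accounts(g_accountsLock);
    std::lock_guard<std::mutex> log(g_logLock);
    g_balance -= amount;
    ++g_logEntries;
}
```

If a third function ever needed g_logLock before g_accountsLock, that would be the signal to restructure the code rather than to add an exception to the ordering.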

A locking hierarchy is a design-time practice that few teams consider.  It makes sense that people do not design a locking hierarchy: it is an expensive analytical task that is difficult to maintain throughout a product's development cycle.

To help teams that do not work with a locking hierarchy, here are some loose coding guidelines that can help prevent hangs on locks:

  1. Minimize the duration that locks are held.  Acquire late and release early.
  2. Do not call into code you don't control while holding a lock.
  3. Do not call SendMessage or DispatchMessage while holding a lock (a special case of #2): both run window procedure code before they return.
  4. Check error paths to ensure locks are properly released.
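
Guidelines 1 and 4 can be sketched with a scope-based guard, so that no exit path can leave the lock held.  This is only an illustration under stated assumptions: CountingLock is a hypothetical stand-in for a critical section (built on std::mutex so the sketch is portable), and AutoLock and CBoundedQueue are invented names, not part of any Win32 API.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a CRITICAL_SECTION, built on std::mutex so the
// sketch stays portable. It counts holders so the lock state is observable.
class CountingLock
{
public:
    void Enter() { m_mutex.lock(); ++m_held; }
    void Leave()
    {
        assert(m_held.load() > 0);  // Leave() without a matching Enter() is a bug
        --m_held;
        m_mutex.unlock();
    }
    int HeldCount() const { return m_held.load(); }
private:
    std::mutex m_mutex;
    std::atomic<int> m_held{0};
};

// RAII guard: Leave() runs in the destructor on every exit path,
// including early returns and exceptions.
class AutoLock
{
public:
    explicit AutoLock(CountingLock& lock) : m_lock(lock) { m_lock.Enter(); }
    ~AutoLock() { m_lock.Leave(); }
    AutoLock(const AutoLock&) = delete;
    AutoLock& operator=(const AutoLock&) = delete;
private:
    CountingLock& m_lock;
};

// Hypothetical bounded queue used only to show the pattern.
class CBoundedQueue
{
public:
    explicit CBoundedQueue(std::size_t capacity) : m_capacity(capacity) {}

    // Returns false on failure. No path can leave m_lock held.
    bool Push(int value)
    {
        if (value < 0)
            return false;   // validate before acquiring (acquire late)

        AutoLock guard(m_lock);
        if (m_items.size() >= m_capacity)
            return false;   // error path: the guard still releases the lock
        m_items.push_back(value);
        return true;
    }

    int LockHeldCount() const { return m_lock.HeldCount(); }

private:
    CountingLock m_lock;
    std::size_t m_capacity;
    std::vector<int> m_items;
};
```

With raw EnterCriticalSection and LeaveCriticalSection calls, each early return would need its own release call.  The guard removes that bookkeeping, at the cost of making the lock's lifetime depend on scope.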

To demonstrate, here is some sample code:

    // Wrong way to maintain locking hierarchy
    void CMyObject::Release()
    {
        EnterCriticalSection(&this->m_Lock);
        delete this->m_pMyObject;    // destructor runs while m_Lock is held
        this->m_pMyObject = NULL;
        LeaveCriticalSection(&this->m_Lock);
    }

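Deleting the object while holding the lock runs its destructor, and unless we control that destructor, we don't know which locks it may acquire or what other code it may call (see guideline 2). Here is an alternative that tries to preserve the locking hierarchy by detaching the pointer under the lock and deleting the object only after the lock is released: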

    // Better way to maintain locking hierarchy
    void CMyObject::Release()
    {
        MyObject* pMyObject = NULL;
        EnterCriticalSection(&this->m_Lock);
        pMyObject = this->m_pMyObject;    // detach the pointer under the lock
        this->m_pMyObject = NULL;
        LeaveCriticalSection(&this->m_Lock);
        delete pMyObject;                 // destructor runs with no lock held
    }
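
The same hand-off can also be sketched with standard C++ types, which makes the scope of the lock explicit.  This variant is an assumption-laden illustration rather than a drop-in replacement: it swaps the critical section for std::mutex and the raw pointer for std::unique_ptr, and the g_destroyed counter and HasObject method are hypothetical additions that only make the effect observable.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

int g_destroyed = 0;   // hypothetical counter so the effect is observable

struct MyObject
{
    ~MyObject() { ++g_destroyed; }   // stands in for a destructor we don't control
};

class CMyObject
{
public:
    CMyObject() : m_pMyObject(new MyObject) {}

    void Release()
    {
        std::unique_ptr<MyObject> doomed;
        {
            std::lock_guard<std::mutex> guard(m_Lock);
            doomed = std::move(m_pMyObject);   // detach under the lock
            assert(m_pMyObject == nullptr);    // a moved-from unique_ptr is empty
        }   // m_Lock is released here
        // doomed is destroyed when the function returns, so ~MyObject
        // runs after m_Lock has been released.
    }

    bool HasObject()
    {
        std::lock_guard<std::mutex> guard(m_Lock);
        return m_pMyObject != nullptr;
    }

private:
    std::mutex m_Lock;
    std::unique_ptr<MyObject> m_pMyObject;
};
```

Calling Release a second time is harmless in this sketch, because the member is already empty and there is nothing left to destroy.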

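There may be some drawbacks to changing the code as in these examples.  For instance, other threads can observe m_pMyObject as NULL while the old object is still being destroyed.  It's up to each individual developer to determine how to preserve the locking hierarchy for their code.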