Concurrency, part 6 - Reference counting is hard :)

The other day, I talked about using reference counting as a way of ensuring object when enumerating a list containing objects.
Reference counting is an incredibly useful, but incredibly dangerous technique.  If you use it right, it can be a huge help, if you get it wrong, it's a nightmare.
For example, what's wrong with this code:
class Thing1 : public RefCountedObject<br>{<br>    CriticalSection _Thing1Lock;<br>    class Thing2 *_Thing2;<br>public:<br>    void Lock() { EnterCriticalSection(&_Thing1Lock); }<br>    void Unlock() { LeaveCriticalSection(&_Thing1Lock); }<br>    void SetThing2(Thing2 thing2)<br>    {<br>        Lock();<br>        _Thing2 = thing2;<br>        _Thing2->AddRef();<br>        Unlock();<br>    }<br>};class Thing2 : public RefCountedObject<br>{<br>    CriticalSection _Thing2Lock;<br>    class Thing1 *_Thing1;<br>public:<br>    void Lock() { EnterCriticalSection(&_Thing2Lock); }<br>    void Unlock() { LeaveCriticalSection(&_Thing2Lock); }<br>    void SetThing1(Thing1 *thing1)<br>    {<br>        Lock();<br>        _Thing1 = thing1;<br>        _Thing1->AddRef();<br>        Unlock();<br>    }<br>};
If you call thing2->SetThing1 and thing1->SetThing2, you've just introduced a circular dependency - neither thing1 or thing2 will ever be released.  And, while this example is utterly silly, there are lots of examples where you can get circular references. 
I think a real world example fits in quite nicely here.  We've got a notification class for some of our stuff that had a deadlock in it.  The original class looked like:
class NotificationManager: public IUnknown<br>{<br>    struct NotificationBlock <br>    {<br>        CALLBACKFUNCTION _Function;<br>        LPVOID _Context;<br>        NotificationBlock(CALLBACKFUNCTION Function, LPVOID Context) :<br>            _Function(Function),<br>            _Context(Context);<br>        {<br>        }<br>    }<br>    CAtlList<NotificationBlock> _NotificationListList;<br>    CRITICAL_SECTION _NotificationListLock;<br>    RegisterForNotifications(CALLBACKFUNCTION Function, LPVOID Context)<br>    {<br>        NotificationBlock block;<br>        block._Function = Function; block._Context = Context;<br>        <br>        EnterCriticalSection(&_NotificationListLock);<br>        _NotificationList.Add(block);<br>        LeaveCriticalSection(&_NotificationListLock);<br>    }<br>    UnregisterNotifications(CALLBACKFUNCTION Function, LPVOID Context)<br>    {<br>        NotificationBlock block(Function, Context);<br>        EnterCriticalSection(&_NotificationListLock);<br>        pos = _NotificationList.Find(block);<br>        _NotificationList.RemoveAt(pos);<br>        LeaveCriticalSection(&_NotificationListLock);<br>    }<br>    GenerateNotification(LPVOID NotificationBlock)<br>    {<br>        NotificationBlock block;<br>        EnterCriticalSection(&_NotificiationListLock)<br>        pos = _NotificationList.GetHeadList(block);<br>        while (block = _NotificationList->GetNext(pos))<br>        {<br>            block._Function(block._Context, NotificationBlock);<br>        }<br>        LeaveCriticalSection(&_NotificationListLock);<br>    }<br>}
Unfortunately, because the callback function was called with the list locked, there was a huge potential for deadlock (and in fact, one was observed).
My first thought was to change the notification block using a temporary list..
    GenerateNotification(LPVOID NotificationBlock)<br>    {<br>        NotificationBlock block;<br>        CAtlList<NotificationBlock> temporaryList;<br>        EnterCriticalSection(&_NotificiationListLock)<br>        pos = _NotificationList.GetHeadList(block);<br>        while (block = _NotificationList->GetNext(pos))<br>        {<br>            NotificationBlock tempBlock(block._Function, block._Context);<br>            temporaryList.Add(tempBlock);<br>        }<br>        LeaveCriticalSection(&_NotificationListLock);<br>        while (block = temporaryList.RemoveHead())<br>        {<br>            block._Function(block._Context, NotificationBlock);<br>        }        <br>    }
The problem with this is that it's possible for someone to have removed the block from the list as soon as the lock was released.  So I needed to have a way of protecting the blocks.  Well, the fourth principle comes into play: I reference counted the blocks.
Now the function looks like:class Thing1 : public INotification<br>{<br>    CriticalSection _Thing1Lock;<br>    Thing1Container *_MyContainer;<br>public:<br>    void Lock() { EnterCriticalSection(&_Thing1Lock); }<br>    void Unlock() { LeaveCriticalSection(&_Thing1Lock); }<br>    void FinalConstruct(NotificationContainer *MyContainer)<br>    {<br>        _MyContainer = MyContainer;<br>        _MyContainer->AddRef();<br>        _MyContainer->RegisterForNotifications(CallbackFunction, this); <br>    }<br>    void FinalRelease()<br>    {<br>        _MyContainer->UnregisterNotifications(CallbackFunction, this); <br>        _MyContainer->Release();<br>    }<br>};class NotificationManager: public IUnknown<br>{<br>    CAtlList<INotification *> _NotificationListList;<br>    CRITICAL_SECTION _NotificationListLock;<br>    RegisterForNotifications(INotification* Notify)<br>    {<br>        EnterCriticalSection(&_NotificationListLock);<br>        _NotificationList.Add(Notify);<br>        LeaveCriticalSection(&_NotificationListLock);<br>    }<br>    UnregisterNotifications(INotification *Notify)<br>    {<br>        EnterCriticalSection(&_NotificationListLock);<br>        pos = _NotificationList.Find(Notify);<br>        _NotificationList.RemoveAt(pos);<br>        LeaveCriticalSection(&_NotificationListLock);<br>    }<br>    GenerateNotification(LPVOID NotificationBlock)<br>    {<br>        INotification *notify;<br>        CAtlList<INotification *> temporaryList;<br>        EnterCriticalSection(&_NotificiationListLock)<br>        pos = _NotificationList.GetHeadList(block);<br>        while (notify = _NotificationList->GetNext(pos))<br>        {<br>            notify->AddRef();<br>            temporaryList.Add(notify);<br>        }<br>        LeaveCriticalSection(&_NotificationListLock);<br>        while (notify = temporaryList.RemoveHead())<br>        {<br>            notify->OnNotification(NotificationBlock);<br>            notify->Release();<br>        }<br>    }}
Unfortunately, this introduces a circular reference.  Since thing1 is added to the list in its final construct, the call to UnregisterNotifications never happens - the notification block keeps a reference to thing1 and the reference count never goes away.  Unless there's an external mechanism for removing thing1 from the container, neither object will ever be freed.
The easiest way of solving this is to introduce a tiny "delegator" object between thing1 and the notification manager.  The notification manager references the delegator, not the Thing1 object, and thus the lifetime of Thing1 is unaffected by the notification manager.
The good news is that managed environments, like the CLR make most of this reference counting stuff pretty much unnecessary, which is a very good thing.
Oh, and the fifth principle: "Reference counting is hard if you're not really careful".
So much for framework involving concurrency.  Believe it or not, I'm almost done :)  Tomorrow, I want to talk about opportunities for concurrency.
 
Edit: One of these days, I'll remember that your principal is your friend.