Reference counting is hard.


One of the big advantages of managed code is that you don’t have to worry about managing object lifetimes. Here’s an example of some unmanaged code that tries to manage reference counts and doesn’t quite get it right. Even a seemingly-simple function has a reference-counting bug.

template <class T>
T *SetObject(T **ppt, T *ptNew)
{
 if (*ppt) (*ppt)->Release(); // Out with the old
 *ppt = ptNew; // In with the new
 if (ptNew) (ptNew)->AddRef();
 return ptNew;
}

The point of this function is to take a pointer to a variable that points to one object and replace it with a pointer to another object. This is a function that sits at the bottom of many “smart pointer” classes. Here’s an example use:

template <class T>
class SmartPointer {
public:
 SmartPointer(T* p = NULL)
   : m_p(NULL) { *this = p; }
 ~SmartPointer() { *this = NULL; }
 T* operator=(T* p)
   { return SetObject(&m_p, p); }
 operator T*() { return m_p; }
 T** operator&() { return &m_p; }
private:
 T* m_p;
};

void Sample(IStream *pstm)
{
  SmartPointer<IStream> spstm(pstm);
  SmartPointer<IStream> spstmT;
  if (SUCCEEDED(GetBetterStream(&spstmT))) {
   spstm = spstmT;
  }
  ...
}

Oh why am I explaining this? You know how smart pointers work.

Okay, so the question is, what’s the bug here?

Stop reading here and don’t read ahead until you’ve figured it out or you’re stumped or you’re just too lazy to think about it.


The bug is that the old object is Release()d before the new object is AddRef()’d. Consider:

  SmartPointer<IStream> spstm;
  CreateStream(&spstm);
  spstm = spstm;

This assignment statement looks harmless (albeit wasteful). But is it?

The “smart pointer” is constructed with NULL, then the CreateStream creates a stream and assigns it to the “smart pointer”. The stream’s reference count is now one. Now the assignment statement is executed, which turns into

 SetObject(&spstm.m_p, spstm.m_p);

Inside the SetObject function, ppt points to spstm.m_p, and ptNew equals the original value of spstm.m_p.

The first thing that SetObject does is release the old pointer, which now drops the reference count of the stream to zero. This destroys the stream object. Then the ptNew parameter (which now points to a freed object) is assigned to spstm.m_p, and finally the ptNew pointer (which still points to a freed object) is AddRef()d. Oops, we’re invoking a method on an object that has been freed; no good can come of that.

If you’re lucky, the AddRef() call crashes brilliantly so you can debug the crash and see your error. If you’re not lucky (and you’re usually not lucky), the AddRef() call interprets the freed memory as if it were still valid and increments a reference count somewhere inside that block of memory. Congratulations, you’ve now corrupted memory. If that’s not enough to induce a crash (at some unspecified point in the future), when the “smart pointer” goes out of scope or otherwise changes its referent, the invalid m_p pointer will be Release()d, corrupting memory yet another time.

This is why “smart pointer” assignment functions must AddRef() the incoming pointer before Release()ing the old pointer.

template <class T>
T *SetObject(T **ppt, T *ptNew)
{
 if (ptNew) (ptNew)->AddRef();
 if (*ppt) (*ppt)->Release();
 *ppt = ptNew;
 return ptNew;
}

If you look at the source code for the ATL function AtlComPtrAssign, you can see that it exactly matches the above (corrected) function.

[Raymond is currently on vacation; this message was pre-recorded.]

Comments (48)
  1. Ben Hutchings says:

    There’s another question in my mind about assignment of raw pointers to smart pointers. If raw pointers are always converted immediately into smart pointers then no reference is added and AddRef should not be called. OTOH sometimes the smart pointer will be an additional reference alongside the raw pointer, and AddRef should be called. It might be better not to allow assignment from raw pointers and instead to have named functions that explicitly provide one or both of those behaviours.

  2. project says:

    Self-assignment is something any assignment function should be aware of. It’s so obvious :)

  3. Petr Kadlec says:

    Yes, the main topic I see here is not in the reference counting, but in the assignment function itself (see e.g. http://www.parashift.com/c++-faq-lite/assignment-operators.html#faq-12.2).

  4. "One of the big advantages of managed code is that you don’t have to worry about managing object lifetimes."

    Bullshit. If some objects in your data graph reference expensive or scarce resources, then you are back to reference counting for the whole graph.

    Except that in managed code you don’t have smart pointers at your disposal.

  5. Petr, if you fix the SetObject() function then you fix the problem with the assignment operator as well. Fix the assignment operator and you still have a potential bug in the SetObject() function.

  6. Centaur says:

    Another problem is that this smart pointer has a smart destructor (it will Release() the object) but a dumb copy constructor and a dumb copy assignment operator (they both will bitwise-copy m_p and forget to AddRef() the victim). Thus, if a scope has several SmartPointers that are initialized using each other, the victim will be AddRef()ed once (when assigned to the first SmartPointer) and Released() as many times as there are SmartPointers. Then it will meet an untimely death, and someone who knew it in life (had a pointer) will one day come up to it in hopes of asking it to do things (invoke a method) but will just be met by a lifeless corpse.

  7. JCAB says:

    A word of warning: (from the article Petr linked to) "if (this == &f) return *this; // Gracefully handle self assignment".

    This is not quite enough. Think, for instance, of object hierarchies, where the object the smartptr holds owns a pointer to another object. If you call SetObject (or the assignment operator) with this owned pointer, then the original assignment also breaks, even if you check for self-assignment, even though the object assigned is different than the existing one.

    JCAB

  8. John Brown says:

    In Open Source code, of course, this could never be an issue: The end users are familiar with the code in the applications they use, and know enough not to do anything that would send the application down that code path.

    In fact, I and my colleagues at the Open Source Users’ Forum are supporting the passage of H.R. 4014 in the House of Representatives, a bill which will effectively criminalize the use of computer hardware (not software; that would infringe the 1st Amendment) by "citizens or legally resident aliens lacking in technical sophistication", where "technical sophistication" is defined as a working knowledge of ANSI C. This is a compromise, of course, as Hackers prefer K&R C[1], but the idiot-friendly nature of our government dictates that compromises must be made. The penalties are inadequate as well, but we’re confident that the threat of hard time at Leavenworth will serve to deter at least some malefactors from abusing the technological gifts of their betters.

    [1] "Further Bloviations and Glittering Generalities About Hacking", E. S. Raymond, E. S. Raymond Press, E. S. Raymond Memorial Edition (signed, blessed, and baptized by E. S. Raymond), 2003

  9. Daniel says:

    Reference counting is easy. You use boost::shared_ptr.

  10. DrPizza says:

    "One of the big advantages of managed code is that you don’t have to worry about managing object lifetimes"

    No, that’s not true. You don’t have to worry about memory deallocation. That’s it. That’s what GC gives us.

    One still has to worry about every other resource one might care to use. And the big problem with GC is that it’s other resources that I care about a lot more than memory. I have oodles of memory. 2 or more gigabytes per process. More memory than I know what to do with. And as displeasing as it may be, I can afford to leak a few kilobytes of memory now and then. I don’t, but if I did, it wouldn’t kill me.

    This makes memory entirely unlike file handles, or thread handles, or GDI objects, or database connections, or sockets, or damn near any other resource I have to deal with. These objects all have much much lower limits than my memory limit, and they all do much much nastier things when they run out. If I run out of DB connections my DB starts refusing connections from any application; if I run out of sockets the OS starts refusing connections from anyone trying to connect to me; if I run out of GDI objects my desktop stops drawing properly and I discover that a great many programs crash when they can’t draw to the screen any more.

    This is all bad stuff.

    It’s bad enough in languages with manual memory management. At least a language like C++ has the RAII idiom to save us. GC takes even that away from us.

    "But we have finalizers" you say. Well, yes, we do. But we don’t know when they’ll be run (if at all). And because garbage collections are usually due to memory pressure, they probably won’t run between now and running out of handles anyway. A handle doesn’t take up much memory at all. I can easily consume all the GDI handles in a system without making an appreciable dent in the amount of free memory. So the GC will remain completely ignorant of the dire situation the system is in.

    In C++ I don’t have to worry about managing object lifetimes the vast majority of time. Mostly resources have a lifetime that’s tied to their scope, and mostly they don’t leak outside that scope. With this in mind I can use RAII, and bingo, no object lifetime management worries; I know everything will be cleaned up in an orderly fashion. And on those rare occasions where I do have to do something special (to let an object outlive its scope) I’m fully aware of this — I’m using C++ so I know I have to take care of these things from time to time, and I’ve not been poisoned by any GC-style propaganda about not having to care at all.

    One of the big disadvantages of managed code is that you do have to worry about managing object lifetimes.

  11. Loathsome Echinoderm says:

    DrPizza: Regarding your concern about memory pressure not forcing de-allocation of GDI resources, there’s a workaround which ought to be obvious to any experienced programmer: When you call CreateSolidBrush(), for example, simply allocate ten or twenty MB off the heap and stash it away in a private pointer in the smart pointer object that holds the brush. Memory pressure is thus tied to the availability of other resources and the garbage collector will do its job correctly.

    All you need to do to support this is find out at runtime how much RAM is available, and calculate the amount you’ll need to allocate for each brush/device context/whatever. The details of this operation are trivial; I’ll leave the actual implementation as a healthful exercise for lesser minds.

  12. Orbit says:

    Whoops! Thanks Raymond, you inspired me to look at my smart pointer class. It exhibited the same (incorrect) behaviour. It’s fixed now, cheers!

  13. asdf says:

    JCAB: What do you mean by "then the original assignment also breaks"?

  14. Q says:

    asdf: What JCAB meant (I believe) is this arrangement (pointer syntax omitted):

    A => B => C

    ("X => Y" means "X holds a smart pointer to Y".)

    Assume you have a smart pointer S. S points to A. Assume S is the only reference to A, A has the only reference to B and so on.

    You reassign S to point to C. Now in the original (buggy) implementation, first A’s refcount becomes zero, so A, B and C are destructed before you try to increase the refcount of C.

  15. asdf says:

    I didn’t realize he was talking about smart pointers to smart pointers. Thanks for clearing that one up.

    On a side note. Raymond can you get the documentation for ChildWindowFromPoint (and the Ex version) cleaned up. Specifically this sentence: "The search is restricted to immediate child windows, grandchildren, and deeper descendant windows are not searched."

  16. Raymond Chen says:

    What specifically needs cleaning? Do you want the first comma changed to a semicolon?

  17. asdf says:

    Yes.

  18. Norman Diamond says:

    4/6/2004 10:47 AM DrPizza:

    > I have oodles of memory. […] I can afford to leak a few kilobytes of memory now and then. I don’t, but if I did, it wouldn’t kill me.

    Suppose your car or airplane has 2 GB of memory, and it repeats some computation every 100 microseconds. 1 time in 20 it leaks 4 KB of memory. How long until it’s out of memory? It won’t kill you to turn the ignition off and on again to restart with 2 GB of free memory, right?

    On the other hand, please enlighten me:

    > At least a language like C++ has the RAII idiom to save us. GC takes even that away from us.

    What does RAII stand for?

  19. Tom Higgins says:

    RAII means Resource Acquisition Is Initialization: http://www.hackcraft.net/raii/

    The big idea is to use C++ constructors and destructors to acquire and release resources — thus relying on the C++ implementation and object lifetime rules to manage resources for you. It helps simplify error-handling code quite a bit.

  20. Edward says:

    Looks like "Resource Acquisition Is Initialisation"

    http://www.hackcraft.net/raii/

    Ahh, so much to learn, so little time. How do you guys pick up so much so fast?

  21. Edward says:

    oh dear, still too slow. Same minute though.

  22. Pavel Lebedinsky says:

    > At least a language like C++ has the RAII idiom to save us. GC takes even that away from us.

    It also takes away the ability to corrupt memory by freeing the same pointer twice etc.

    All things considered, lack of deterministic finalization doesn’t outweigh the benefits of managed code. At least in my opinion.

  23. Dan Maas says:

    I wonder if a language could seamlessly implement both deterministic destruction for stack objects and (lazy) garbage collection for heap objects?

    Python seems to do this pretty well.

    I don’t like languages with GC but no deterministic destruction. RAII is just too useful :)

  24. asdf says:

    Or you can just use http://www.hpl.hp.com/personal/Hans_Boehm/gc/ if not for the gc then for help with leak detection.

  25. JCAB says:

    asdf: sorry, I was in a hurry, and the explanation wasn’t as clear as it should have been.

    Q: You nailed it for me, thanx!

  26. Raymond Chen says:

    In order to get deterministic destruction of stack objects you need to be able to detect when a reference has leaked out, which means refcounted GC rather than tracing GC, and then you still have the cyclic reference problem.

  27. Florian says:

    > "Resource Acquisition Is Initialisation"

    Nice little article, but I miss a discussion of how to handle errors during resource allocation or release. He doesn’t touch that, and seems to imply that you would have your constructors and destructors throw exceptions when a resource cannot be allocated or released. But what if you can’t use exceptions? Application of RAII seems rather limited to special environments to me. (Yes, I know this blog is about programming for Windows PCs, but that Hackcraft article wasn’t, was it?)

  28. DrPizza says:

    "DrPizza: Regarding your concern about memory pressure not forcing de-allocation of GDI resources, there’s a workaround which ought to be obvious to any experienced programmer: When you call CreateSolidBrush(), for example, simply allocate ten or twenty MB off the heap and stash it away in a private pointer in the smart pointer object that holds the brush. Memory pressure is thus tied to the availability of other resources and the garbage collector will do its job correctly. "

    You have got to be kidding me. That is a horrible workaround. It still doesn’t force timely deallocation. Nor does it scale at all well. If I have lots of objects that need timely deallocation (such as GDI handles, where an application may have several hundred) then the memory attached to each object will itself be insignificant.

  29. DrPizza says:

    "Suppose your car or airplane has 2 GB of memory, and it repeats some computation every 100 microseconds. 1 time in 20 it leaks 4 KB of memory. How long until it’s out of memory? It won’t kill you to turn the ignition off and on again to restart with 2 GB of free memory, right?"

    Suppose it leaks a file handle at the same rate. Which problem’s gonna make it fall over quicker?

    Hint: it won’t be the memory leak.

  30. asdf says:

    Florian: Then you have to manually check for errors instead of depending on exceptions (and you should never throw exceptions from destructors; that terminates your program!). This type of RAII is used to avoid code like the following (assuming var1 and var2 are dumb objects like regular pointers):

        var1 = get1();
        if (!got1(var1))
            return;
        var2 = get2();
        if (!got2(var2)) {
            unget1(var1);
            return;
        }

    etc. It gets really mundane and unmaintainable quickly. And I bet you’ve seen a lot of code like this. (Almost every single piece of DirectX/COM code I’ve come across does crap like this in its initialization routine: create object, check for HRESULTs, and if one failed, release all previous objects, set them to NULL, and return an error code.)

    So, making var1 and var2 smart objects, we get code like:

        var1 = get1();
        if (!got1(var1))
            return;
        var2 = get2();
        if (!got2(var2))
            return;

    where the destructor for varN looks like:

        varN::~varN()
        {
            if (gotN(this->obj))
                ungetN(this->obj);
        }

  31. DrPizza says:

    "Destruction is trickier though, especially in the case of VisualC++ and other compilers where the uncaught_exception() function always returns false (a clear bug, it’s documented but not everywhere that uncaught_exception() is documented). For the record, programming for Windows PCs is the only environment I’ve much experience in."

    In spite of the documentation, uncaught_exception appears to work properly on VC++ 7 and up. If one looks at the source, one sees that, unlike VC++ 6, it’s no longer hard-coded to return false.

  32. Jon Hanna says:

    Woo hoo!

    I just tested and yes VC++7 does have correct behaviour here, which means throwing errors from destructors has been improved from "impossible" to "difficult, be very careful".

    I was pretty sure that I’d already tested that though and found it hadn’t changed. Perhaps this was a change made with the 2003 version (I had the previous version but don’t recall which I’d tested with).

  33. Daniel says:

    You can sort of do RAII with the using statement in C#, then your Dispose method can act like a destructor. Unfortunately you do have to remember to use ‘using’ each time so it’s not so automatic and the code is a little more complicated.

    D is also a garbage-collected language which has RAII features. See: http://www.digitalmars.com/d/cpptod.html

  34. Daniel says:

    In case you don’t know how to use the using statement in C#, here’s some code:

        class RaiiExample : System.IDisposable
        {
            RaiiExample() {
                System.Console.WriteLine("Resource acquired.");
            }

            public void Dispose() {
                System.Console.WriteLine("Resource disposed.");
            }

            public static void Main() {
                try {
                    using (RaiiExample resource = new RaiiExample()) {
                        System.Console.WriteLine("Do stuff with resource.");
                        throw new System.Exception("This failure demonstrates cleanup.");
                    }
                }
                catch (System.Exception e) {
                    System.Console.WriteLine(e.Message);
                }
            }
        }

    I think this works for structs as well as classes.

  35. Jon Hanna says:

    "Nice little article, but I miss a discussion of how to handle errors during resource allocation or release. He doesn’t touch that and seems to imply that you would have your constructors and destructors throw exceptions when a resource cannot be allocated or released."

    Yes. Throw errors during construction. The point of a constructor is to set an object into a valid state so constructors should throw errors when they fail to do so. If that wasn’t clear I need to revisit that article. Destruction is trickier though, especially in the case of VisualC++ and other compilers where the uncaught_exception() function always returns false (a clear bug, it’s documented but not everywhere that uncaught_exception() is documented). For the record, programming for Windows PCs is the only environment I’ve much experience in.

    "I wonder if a language could seamlessly implement both deterministic destruction for stack objects and (lazy) garbage collection for heap objects?"

    Yes, indeed this has been done as an extension to C++.

    My article is often referenced in threads like this on the pros and cons of different garbage collection models which is beginning to annoy me since I don’t like reference counting much!

    I do think that an advantage of reference counting is that you can use RAII with it, as someone who had to suffer coding VB6 applications (yuck) the ability to use RAII was one of the few things it had going for it.

    However, reference counting has more than a few issues. For one thing, even with the well-written "correct answer" smart pointer above, there can be difficulties where circular references between objects mean that none of the objects will get that final Release() call to destroy them, and little circularly-referencing nuggets of memory (and other resources) are lost forever. In VB you had a choice of:

    1. Forcing the user to call a tear-down function to manually break the circular references (they won’t, they’ll forget).

    2. Not having the circular reference, but instead if (say) a "parent" has a "child" then parent.firstChild() could be obtained by parent having a reference to child, but child.parent() could discover the object to return by some other means (generally this is extremely inefficient, when it’s possible at all).

    3. Using undocumented VB6 features to copy bytes into pointer variables without AddRef() or Release() being called – obvious to a C++ hacker but exactly the sort of detail that VB was meant to hide (this works very well when it works, when it doesn’t you’ve just introduced a new CorruptMemory() feature into VB!).

    Personally, I like RAII that comes from having a difference between automatic and free-store variables. RAII with reference-counting is a silver lining.

    As for garbage collection I will tolerate having none, I will generally tolerate having lazy GC but I would prefer to have a combination of automatic variables with free-store objects which are deleted by a lazy GC *if I don’t elect to delete them myself*.

    For those who are missing RAII when it comes to managed code I recommend you take a look at "RAII Idiom and Managed C++" by Nemanja Trifunovic http://www.codeproject.com/managedcpp/managedraii.asp

  36. Peter Evans says:

    So what’s the status of adding a deterministic finalization feature to the next release of the Visual Studio languages? I heard mention that some (Sells, et al.) were looking into trying to add it to the C# language, but ran into performance issues.

    I don’t think ref counting is hard; it’s the limitations of language constructs that make it easy to mess up. It seems like the hard problems all revolve around treating stack and heap memory equivalently in code constructs even though semantically they function very differently.

    I would prefer to see both GC-based memory management and stack-based destruction as features available from .NET CLI-compliant languages. I think it would give them a leg up on Java.

    Anyone have a better scenario they’d like to share/see for future memory management implementations?

  37. Eugene Gershnik says:

    Reference counting may be hard, but using C++ correctly isn’t. If you were to use the canonical form of the assignment operator, the problem would not arise:

        X& operator=(const X& src)
        {
            X temp(src);
            swap(temp);
            return *this;
        }

        void swap(X& src)
        {
            T* pTemp = src.m_p;
            src.m_p = m_p;
            m_p = pTemp;
        }

    This is safe from any angle.

    In addition this is a correct form for any class assignment operator.

  38. Ben Hutchings says:

    Deterministic finalisation is a contradiction in terms. I think what you mean is deterministic destruction. (See Boehm’s paper "Destructors, Finalizers, and Synchronization" <http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html>.) Now if C# would automatically call Dispose on references to certain types as they go out of scope, without the need for "using", that would be great. This is what C++/CLI does with objects that are declared as automatic but are actually allocated on the managed heap.

  39. Norman Diamond says:

    4/7/2004 2:27 AM DrPizza:

    >> Suppose your car or airplane has 2 GB of memory, and it repeats some computation every 100 microseconds. 1 time in 20 it leaks 4 KB of memory.
    >
    > Suppose it leaks a file handle at the same rate. Which problem’s gonna make it fall over quicker?

    Of course a program that kills you in 2 minutes kills you faster than a program that kills you in 20 minutes. Now, my opinion is that all programs that kill you kill you, and all kinds of leaks are dangerous. You are the one who said the memory leak version doesn’t kill you. I sure hope you aren’t programming for cars or airplanes.

    (Of course I also hope Microsoft never does. There have been enough failures of Microsoft-powered bank machines and phones already. And there have already been enough virus infections of airline reservation systems and electrical power generating plants. Sometimes it’s better to be paranoid.)

  40. DrPizza says:

    "Of course a program that kills you in 2 minutes kills you faster than a program that kills you in 20 minutes. Now, my opinion is that all programs that kill you kill you, and all kinds of leaks are dangerous."

    In some situations, perhaps. In general? No, simply not true.

    "You are the one who said the memory leak version doesn’t kill you. I sure hope you aren’t programming for cars or airplanes. "

    I said that memory is rarely the most scarce resource, the implication being that, if you’re leaking resources, your application isn’t going to die due to a memory leak; it’ll die due to running out of some other resource first. As such, making memory management [slightly] easier at the expense of making management of other resources much harder is a poor trade-off.

  41. Norman Diamond says:

    4/10/2004 12:11 PM DrPizza:

    > making memory management [slightly] easier at the expense of making management of other resources much harder is a poor trade-off.

    Oh, we’re in huge agreement on this. If this is the point you were trying to make in your two previous postings, I’m sorry I didn’t see it then.

  42. Daniel says:

    Eugene Gershnik wrote about copy and swap:

    > In addition this is a correct form for any class assignment operator.

    Actually, it isn’t. It requires that you have a swap function which doesn’t throw. Which means that every member variable, and every base class of your class has to have a swap which doesn’t throw. Often this isn’t the case.

    Sometimes it’s not practical because of your class’s relations to other classes.

    And finally, this method can be expensive or impractical because any resources that the class requires need to be allocated and the old ones released. An old-fashioned copy constructor might not require this. Sometimes that can be very important.

    Copy and swap is often a useful technique, but it isn’t always the right thing to do.

  43. Jon Hanna says:

    "Actually, it isn’t. It requires that you have a swap function which doesn’t throw."

    No, it means you have to have a swap function that either doesn’t throw or is exception-safe.

    "And finally, this method can be expensive or impractical because any resources that the class requires need to be allocated and the old ones released."

    That’s an inherent requirement to assignment. You can’t get away from this (in general) you can only move it around.

  44. Daniel says:

    "No, it means you have to have a swap function that is either doesn’t throw or which is exception-safe."

    True, but this can make things harder. The whole point of this technique is to make life easy. Now you have to fully think through what you’re doing, but this gives the impression that you don’t have to.

    For example, if a new class is created which has two member classes whose swaps can throw, your exception-safe swap isn’t so easy. It also means that your assignment is not going to be strongly exception-safe, which may or may not be important. You’ve moved the difficulty from the assignment into the swap.

    "You can’t get away from this (in general) you can only move it around."

    Yes you can. For example, if you implement a vector class, you can reuse the array in the object you’re assigning to, as long as it’s bigger than the size of the source. You can reuse the existing resource, instead of creating a new one.

  45. Bauq says:

    Agreed. The benefit of GC in removing complexity far outweighs the cost, for the majority of programmers.

  46. Raymond Chen says:

    Commenting on this entry has been closed.

Comments are closed.