Smart Pointers Are Too Smart


Joel’s
law of leaky abstractions rears its ugly head once more.  "urn:schemas-microsoft-com:office:office" />I
try to never use smart pointers because… I’m not smart enough.

 

COM
programmers are of course intimately familiar with
AddRef and Release.  The
designers of COM decided to use reference counting as the mechanism for implementing
storage management in COM, and decided to put the burden of implementing and calling
these methods upon the users.

 

Now,
it is very natural when using a language like C++ to say that perhaps we can encapsulate
these semantics into an object, and let the C++ compiler worry about calling the
AddRefs
and
Releases
in the appropriate constructors, copy constructors and destructors.  It
is very natural and very tempting, and I avoid template libraries that do so like
the plague.

 

Everything
usually works fine until there is some weird, unforeseen interaction between the various
parts.  Let me give you an example:

 

Suppose
you have the following situation: you have a C++ template library
map mapping IFoo pointers
onto
IBar pointers,
where the pointers to the
IFoo and IBar are
in smart pointers.  You want the map to
take ownership of the pointers.  Does
this code look correct?

 

map[srpFoo.Disown()]
= srpBar.Disown();

 

It
sure looks correct, doesn’t it?

 

Look
again.  Is there a memory leak there? 

 

I
found code like this in a library I was maintaining once, and since I had never used
smart pointer templates before, I decided to look at what exactly this was doing at
the assembly level. A glance at the generated assembly shows that the order of operations
is:

 

1)
call
srpFoo.Disown() to
get the
IFoo*

2)
call
srpBar.Disown() to
get the
IBar*

3)
call map’s
operator[] passing
in the
IFoo* ,
returning an
IBar**

4)
do the assignment of the
IBar* to
the address returned in (3)
.

 

So
where is the leak? This library had C++ exception handling for out-of-memory exceptions
turned on.  If the
operator[] throws
an out-of-memory exception then the not-smart  
IFoo* and IBar* presently
on the argument stack are both going to leak.
 

 

The
correct code is to copy the pointers before you disown them:

 

map[srpFoo]
= srpBar;

srpFoo.Disown();

srpBar.Disown();

 

Before
the day that I found this I had never been in a situation before where I had
to think about the C++ order of operations for assigning and subscripting in order
to get the error handling right!  The
fact that you have to know these picky details about C++ operator semantics in order
to get the error handling right is an indication that people are going to get it wrong.

 

Let
me give you another example.  One day
I was adding a feature to this same library, and I noticed that I had caused a memory
leak.  Clearly I must have forgotten to
call
Release somewhere,
or perhaps some smart pointer code was screwed up.  I
figured I’d just put a breakpoint on the
AddRef and Release for
the object, and figure out who was calling the extra
AddRef.

 

Here
— and I am not making this up! — is the call stack at the point of the first
AddRef:

 

ATL::CComPolyObject<CLayMgr>::AddRef

ATL::CComObjectRootBase::OuterAddRef

ATL::CComContainedObject<CLayMgr>::AddRef

ATL::AtlInternalQueryInterface

ATL::CComObjectRootBase::InternalQueryInterface

CLayMgr::_InternalQueryInterface

ATL::CComPolyObject<CLayMgr>::QueryInterface

ATL::CComObjectRootBase::OuterQueryInterface

ATL::CComContainedObject<CLayMgr>::QueryInterface

CDDS::FinalConstruct

ATL::CComPolyObject<CDDS>::FinalConstruct

ATL::CComCreator<ATL::CComPolyObject<CDDS>
>::CreateInstance

CTryAssertComCreator<ATL::CComPolyObject<CDDS>
>::CreateInstance

ATL::CComClassFactory2<CDLic>::CreateInstance(ATL::CComClassFactory2

CTryAssertClassFactory2<CDLic>::CreateInstance(CTryAssertClassFactory2<CDLic>

 

Good
heavens! 

 

Now,
maybe you ATL programmers out there are smarter than me.  In
fact, I am almost certain you are, because I have not the faintest idea what the differences
between a
CComContainedObject,
a
CComObjectRootBase and
a
CComPolyObject are!
It took me hours to debug this memory
leak.  So much for smart pointers saving
time!

 

I
am too dumb to understand that stuff, so when I write COM code, my implementation
of
AddRef is one
line long
, not hundreds of lines of dense macro-ridden, templatized cruft winding
its way through half a dozen wrapper classes.

Comments (15)

  1. Sergei says:

    Hi!

    > map[srpFoo.Disown()] = srpBar.Disown();
    Do you really think this is a normal way of using smart pointers?
    It is not.

    We can discuss that over email. Send me a lettre if you like.

  2. Ex Pointy Head says:

    I don’t think the ATL programmers need to be smarter than you 🙂

    A few years ago, I took over a project which did not use smart pointers and was being developed by C++ beginners. You would not believe the bugs! I coerced them to use smart pointers and the product quality got much better. Soon, they figured out the rules of the game and got along with smart pointers just fine.

    I think once you disconnect the smart pointer from the object it is managing, it the programmer’s responsibility to handle the raw pointer. I would lay the blame at the code rather than smart pointers.

  3. Pavel says:

    I agree that ATL can be a bit intimidating. But since this post is about smart pointers, I have to point out that CComPtr doesn’t really depend on the rest of ATL. You don’t have to use or understand things like CComObjectRootBase if all you want is a smart pointer.

    Also, I think that if somebody else had to read and understand your code it would have been easier for him if you had used ATL to implement your COM objects. ATL is the de-facto standard in this area and most people are at least somewhat familiar with it.

  4. D. Brian Ellis says:

    Smart Pointers are a pain but bearable and useful. ATL however is the devil… Some people never got over vietnam. For me it was ATL. It sends shivers down my spine.

  5. Brian says:

    Neither example illustrates a problem with smart pointers. The first is a good example of why exceptions are not as good a fit in the C++ world as they are in some other languages. The second is just pointing out some ugliness in ATL and C++ template syntax.

    I think in both cases you’re pointing out some of the ugliness inherent in "modern" C++. Exceptions were bolted on late in the day, and most libraries and programmers have yet to become comfortable with them. C++ template syntax is ugly, especially when the debugger shoves them in your face.

    Whatever gotchas you might find when using smart pointers, using them is almost always a better idea than trying to manage the refcounts "by hand". I’ve used them for years and they’ve been the least of my concerns. Trying to maintain code that attempts to manage reference counts explicitly is a pain, if only because the actual meat of the code is lost in a blizzard of AddRefs, QueryInterfaces and Releases…

  6. Eric Lippert says:

    > using them is almost always a better idea than trying to manage the refcounts "by hand". I’ve used them for years and they’ve been the least of my concerns.

    I disagree. I’ve debugged ref count bugs caused by me forgetting an AddRef, and I’ve debugged ref count bugs like the one above with the fifteen-deep call stack of meaningless (to me) gibberish, and the former are a whole lot easier to track down.

    I agree that COM code looks gross, and that it tends to emphasize the error paths and cleanup code rather than the program logic. Two points: First, given that the error paths and cleanup code are where most of the bugs are, I want them to be as explicit as possible. Those ARE the program logic.

    Second, I understand the urge to abstract them away into classes that take care of the details for you. But in my experience, the abstraction is both complex and leaky. The abstraction is easy to use for simple cases, and thereby affords creation of much more subtle, hard-to-debug bugs.

    My point, simply, is that AddRef/Release/QI are a pain in the rear to USE, but they are EASY to understand. Smart pointers are easy to use and hard to understand, and I prefer to understand the code I write.

  7. The Sim says:

    so basically you are saying smart pointers suck, because you don’t know how to use them properly?

  8. The Sim says:

    Never mind, I reread the post, that seems to be exactly what you said. C# 4ever, whoop!

  9. Eric Lippert says:

    No, I’m saying that smart pointers suck because to use them properly, you have to understand _exactly_ what they are doing.

    An abstraction is supposed to relieve you of the burden of understanding the abstracted thing, but smart pointers do a poor job of that. In order to use them correctly, you have to understand exactly what they’re doing to ensure that you don’t use smart pointers to violate the very rules they are designed to abstract away.

    That’s a lousy abstraction. It is easier for me to learn the rules of COM and apply them than to learn the rules of COM AND learn the entire smart pointer framework.

    The .NET framework does a much better job of abstracting away the details of how the underlying system works, because it was designed to do that from day one. Smart pointers are a kludge written post hoc.

  10. josep says:

    That’s because you have bad smart pointers! Don’t blame a concept because of a lame implementation!

    Look at boost smart pointers and you’ll find referenced count smart pointers which can be inserted in any STL container (shared_ptr).

  11. indranil banerjee says:

    I agree with Brian that neither example illustrates a problem with smart pointers.

    The first is just an example of the unpredictable order of evaluation in C++ program.

    This code has exactly the same problem, and no smart pointers:-

    map[new Foo(‘A’)] = new Bar("Aarcvark");

    The second is just the horrendous complexity of ATL. But then again anything to do with COM is horrendously complex.

  12. Milan Sekler says:

    I can not beleive I found somebody that thinks like I do about the smart pointers. I agree with the author 100% and I spend 10 hours a day looking at C++ code. How do I convince myself to use this &%^@#:

    template

    <

      typename T,

      template <class> class OwnershipPolicy = RefCounted,

      class ConversionPolicy = DisallowConversion,

      template <class> class CheckingPolicy = AssertCheck,

      template <class> class StoragePolicy = DefaultSPStorage

    >

    class SmartPtr;

  13. sbi says:

    What does "smart" stand for in a "smart pointer" when you have to call a Disown() function for it to get ownership semantics right? A good smart pointer would do ownership transfer at assignment itself, when the possibly failing operations are sorted out.

    Yes, you will have to think about the stuff at the low level (this is C++, after all), but only when you are _implementing_ a smart pointer. If you need to think about it when you are _using_ it, it#s the wrong one.