We’re using a smart pointer, so we can’t possibly be the source of the leak


A customer reported that there was a leak in the shell, and they included the output from Application Verifier as proof. And yup, the memory that was leaked was in fact allocated by the shell:

VERIFIER STOP 00000900 : pid 0x3A4: A heap allocation was leaked.

        497D0FC0 : Address of the leaked allocation.
        002DB580 : Adress to the allocation stack trace.
        0D65CFE8 : Address of the owner dll name.
        6F560000 : Base of the owner dll.

1: kd> du 0D65CFE8
0d65cfe8  "SHLWAPI.dll"

1: kd> !heap -p -a 497D0FC0
...
    ntdll!RtlpAllocateHeap+0x0003f236
    ntdll!RtlAllocateHeap+0x0000014f
    Kernel32!LocalAlloc+0x0000007c
    shlwapi!CreateMemStreamEx+0x00000043
    shlwapi!CreateMemStream+0x00000012
    <Unloaded_xyz.dll>+0x000642de
    <Unloaded_xyz.dll>+0x0005e2af
    <Unloaded_xyz.dll>+0x0002d49a
    <Unloaded_xyz.dll>+0x0002a0fd
    <Unloaded_xyz.dll>+0x000289cb
    <Unloaded_xyz.dll>+0x0002a25c
    <Unloaded_xyz.dll>+0x00027225
    <Unloaded_xyz.dll>+0x0002252b
    <Unloaded_xyz.dll>+0x00025394
    <Unloaded_xyz.dll>+0x0004d70f
    Kernel32!BaseThreadInitThunk+0x0000000d
    ntdll!RtlUserThreadStart+0x0000001d

1: kd> dps 002DB580
shlwapi!CreateMemStreamEx+0x43
shlwapi!CreateMemStream+0x12
<Unloaded_xyz.dll>+0x642de
<Unloaded_xyz.dll>+0x5e2af
<Unloaded_xyz.dll>+0x2d49a
<Unloaded_xyz.dll>+0x2a0fd
<Unloaded_xyz.dll>+0x289cb
<Unloaded_xyz.dll>+0x2a25c
<Unloaded_xyz.dll>+0x27225
<Unloaded_xyz.dll>+0x2252b
<Unloaded_xyz.dll>+0x25394
<Unloaded_xyz.dll>+0x4d70f
Kernel32!BaseThreadInitThunk+0xd
ntdll!RtlUserThreadStart+0x1d

On the other hand, SHCreateMemStream is an object creation function, so it's natural that the function allocate some memory. The responsibility for freeing the memory belongs to the caller.

We suggested that the customer appears to have leaked the interface pointer. Perhaps there's a hole where they called AddRef and managed to avoid the matching Release.

"Oh no," the customer replied, "that's not possible. We call this function in only one place, and we use a smart pointer, so a leak is impossible." The customer was kind enough to include a code snippet and even highlighted the lines that proved they weren't leaking.

CComPtr<IStream> pMemoryStream;
CComPtr<IXmlReader> pReader;
UINT nDepth = 0;

//Open read-only input stream
pMemoryStream = ::SHCreateMemStream(utf8Xml, cbUtf8Xml);

The exercise for today is to identify the irony in the highlighted lines.

Hint. Answers (and more discussion) tomorrow.

Comments (29)
  1. John says:

    I don’t know what irony you are looking for, but I assume the assignment operator for CComPtr calls AddRef(); there is probably an Attach() method that does what the customer really wants to do.

  2. Sunil Joshi says:

    SHCreateMemStream must AddRef before it returns the stream. operator= must also AddRef before it stores the pointer in the CComPtr.

    Hence, you end up with an extra AddRef = Memory Leak.

    Solution would be to use whatever member function allows you to attach to a CComPtr without AddRefing.

  3. Mathew Grabau says:

    According to MSDN the SHCreateMemStream function acts like a less functional CreateStreamOnHGlobal. The notes don’t include anything on modifying the freeing procedure, which in CreateStreamOnHGlobal is controlled by the parameter BOOL fDeleteOnRelease. My guess is that SHCreateMemStream adopts the functionality as if fDeleteOnRelease (which can’t be changed) is set to FALSE. This would mean that the caller is responsible for freeing the memory once it is used. The smart pointer, while taking care of AddRef/Release I guess it doesn’t take care of freeing the memory block used for the interface call itself.

  4. Erik says:

    Mathew, the documentation says that SHCreateMemStream "does not allow direct access to the stream contents" (no  GetHGlobalFromStream), so the caller has no way to free the memory.

    Btw., this function is another one with the ominous "might be altered or unavailable in subsequent versions of Windows" warning. I’m wondering about the policy behind these warnings, is Microsoft really planning a break from the old routine and going to risk breaking backward compatibility just to save a few bytes and a bit of a maintenance burden?

  5. Mathew Grabau says:

    Erik thanks for clarifying that – I misread the documentation for the first parameter to that function. Now looking at it closer I see that it’s the initialization for the buffer. My bad.

  6. Laura T. says:

    SHCreateMemStream returns IStream with refcount +1

    The = operator of CComPtr addrefs again (via AtlComPtrAssign ).. IStream refcount is now +2

    The destructor of the CComPtr releases the IStream.. refcount is now +1 and it will leak.

    CComPtr.Attach(::SHCreateMemStream(utf8Xml, cbUtf8Xml); would work.. Attach does not increment refcount.

    Smart pointers need smart users :)

  7. Laura T. says:

    .. and for the irony.. for me it is that without smart pointers no memory leak. Too smart.

  8. This is a good rebuttal to 640k’s insistence that programmers need not understand the libraries they use. I won’t use any smart-pointer class that I haven’t stepped through in a debugger. I’ve been burned by too many of them.

  9. Joe S. says:

    This is one of those issues that will never come up if you stick to defining variables at the point of initialization instead of at top of block:

    CComPtr<IStream> pMemoryStream = ::SHCreateMemStream(utf8Xml, cbUtf8Xml);

    [Assuming it’s possible. It might be a member variable, or the variable might be used outside the scope in which it is initialized. In these cases you’re forced to declare and initialize separately. -Raymond]
  10. anonymous says:

    From looking at the MSDN documentation for SHCreateMemStream, it looks like a function that was not intended to be a public API.

    A good API design would be to return an HRESULT and set an out parameter.  That would also not confuse users of smart-pointers who do not understand smart pointers.

  11. Leo Davidson says:

    I know it uglies-up the code a bit but I prefer my smart wrapper objects to be explicit about what they’re doing.

    Rather than use operator= and attach(), I’d rather have methods called SetAndTakeRef() and SetAndAddRef() or similar.

    Then you don’t have to look at the docs or implementation to know what’s going on; you can’t be confused when switching between different classes/behaviours; and you’re reminded of what’s going on whenever you use the methods.

    I think it’s better to have slightly uglier source than to end up with ugly behaviour like leaks and double-releases.

    Of course, in a managed languages this problem doesn’t exist with basic references, so I’m not saying C# should get rid of its assignment operator. C++ ain’t such a language, though, and trying to make it look like one seems more dangerous & obfuscating than clever & simple.

  12. Kyle Sluder says:

    The fact that this:

    Foo f = Bar();

    is completely unequivalent to this:

    Foo f;

    f = Bar();

    is one of the worst misfeatures of C++.  I am very glad that modern language design (including C#) instead favors one pattern of assignment.

  13. @Kyle Sluder: It’s not really any different in C#. This line:

    int x = 123;

    is different from these lines:

    int x;

    x = 123;

    At least, they are unless the optimizer steps in. Ditto reference types:

    Foo f = new Foo();

    f = olderFoo;

    But ignoring that, how else could C++ have been designed?

  14. I was a bit hasty… immutable value types in C# are different than scalar types in C++. My first example isn’t really applicable. There is still a difference between initialization and assignment, though, and the question of how else C++ would have been designed is still valid. I’ve probably already gone too far off into the weeds, anyway.

  15. NT says:

    The annoying thing about smart pointers is that even if you eschew them because the complexity just isn’t worth the benefit, you still have to share code with people who 1) don’t agree, so they use them, and 2) think they’re a magic bullet that lets them stop thinking.  Life would be simpler if my less-talented coworkers just leaked regular pointers. :)

  16. Dean Harding says:

    "Assuming it’s possible. It might be a member variable, or the variable might be used outside the scope in which it is initialized. In these cases you’re forced to declare and initialize separately. -Raymond"

    I use a lot of boost::shared_ptr (now std::tr1::shared_ptr) and the standard usage for that kind of thing is like this:

    shared_ptr<MyClass> m_SomeMemberVariable;

    shared_ptr<MyClass> temp(new MyClass());

    m_SomeMemberVariable(temp);

    In fact, shared_ptr doesn’t even have an implicit constructor that takes a T* – you can only explicity construct it (like I’ve done here). That means something like:

    m_SomeMemberVariable = new MyClass();

    doesn’t even compile. It requires extra typing, and sometimes an extra variable, but it helps to avoid this particular bug. Of course, it’s too late for CComPtr, now, though…

  17. KJK::Hyperion says:

    Erik: functions with that disclaimer are previously undocumented APIs that Microsoft used in Internet Explorer, Office or other applications, and was forced to document as part of some antitrust settlement or another

  18. quixver says:

    " Laura T.

    Smart pointers need smart users :)

    "

    +1 for the post.

    in addition – i move we bring in idiot proof pointers as well.

  19. ulric says:

    In my experience, programmers are always tempted to write functions that return COM pointers, instead of returning them in a parameter and returning an HRESULT, often to save lines of code.  The problem then is the ambuiguity; did the programmer addref the pointer or not?  I’ve seen both.

    The correct code for this case here of course is

    pMemoryStream.p = ::SHCreateMemStream(utf8Xml, cbUtf8Xml);

    but, that’s just something you need to learn.

    In an ideal world, smart pointer would be part of the API

  20. peterchen says:

    @ulric: The rule is "the function returning the pointer must addref". Of course, that requires people playing by the rules… (CWnd::GetControlUnknown being the most prominet violator, but also without much company).

    @NT: you have to think about reference count when you cross the raw/smart border – this also requires that you understand reference counting in general.

    Beyond that, raw pointers just require to much boilerplate code. And yes, I am one of those people who insist on using them, and I stop thinking about reference counts once I have a smart pointer and I’ve ruled out circles.

  21. Maurits says:

    > CComPtr<IStream> pMemoryStream = ::SHCreateMemStream(utf8Xml, cbUtf8Xml);

    Actually this leaks for the same reason that naked assignment does.  CComPtr<IFoo>::CComPtr(IFoo *pFoo) calls pFoo->AddRef() just as CComPtr<IFoo>::operator=(IFoo

    *) does.

  22. Marvin says:

    CComPtr author(s) made an idiotic choice to have a direct conversion from a raw pointer and AddRef there. This is wrong because there isn’t any agreeable default of what such a conversion should do. Some people need an AddRef an some don’t. As a result many people are bitten by bugs like the one above. There is also a reverse bug — crashing because of an extra Release(). A good smart pointer would disallow the conversion and require the caller to specify the exact intent in one way or another.

    It isn’t hard to find a decent COM compatible smart pointer that doesn’t suffer from this problem. I wish the monkeys who wrote and maintain ATL were replaced by some at least half decent C++ library designers. Preferably from the industry instead of the usual ex-interns that bake in MS for many years without learning anything not in MS source base.

  23. Duke of New York says:

    Marvin: If you want to make the programmer specify the intent in every potentially ambiguous situation, then you can just tell him to use raw pointers.

    For example, suppose the pointer is going out of scope. Should it be automatically released, or not? Decisions, decisions…

  24. President Pleasence says:

    Y-You are the… Duke of New… New York. You’re A-Number One.

  25. piersh says:

    this is just BAD API DESIGN.

    seriously, what kind of COM API is this:

    IStream *SHCreateMemStream(      

       const BYTE *pInit,

       UINT cbInit

    );

    it should have been

    HRESTUL SHCreateMemStream(      

       const BYTE *pInit,

       UINT cbInit,

       [out] IStream **ppStream

    );

    and there would have been NO confusion.

    I understand that you guys couldn’t load OLE32.DLL into explorer in win95 because of the whole 4MB thing but you could at least have followed the idioms.

  26. Neil says:

    Gecko tends to require that a raw T* does not have an extra reference and introduces a type already_AddRefed<T> for when you want to return the sole reference to an object. The generated code seems to be about the same as for a T** outparam.

  27. Marvin says:

    Duke of New York, you are mistaken. Forcing somebody to *choose* automatic behavior is not the same as forcing somebody to manually do everything. If you want an automatic transmission car to go backward you have to put the stick into R, if you want to go forward you have to use D. However you don’t need worry about switching between 1,2,3,4 and 5. See the difference?

    A decent smart pointer would require user to write something like this:

    DecentPtr pStream = AddRefPtr(SHCreateBlahBlah(….));

    or

    DecentPtr pStream = AttachPtr(SHCreateBlahBlah(….));

    but will *not* compile

    DecentPtr pStream = SHCreateBlahBlah(….);

    These are your R and D. After you made your choice everything else proceeds automatically.

  28. Greg says:

    The *real* lesson here is that Microsoft APIs are so insane and poorly designed that writing to them is a complete nightmare.

Comments are closed.