Caches are nice, but they confuse memory leak detection tools


Knowledge Base article 139071 has the technically correct but easily misinterpreted title "FIX: OLE Automation BSTR caching will cause memory leak sources in Windows 2000". The title is misleading because it makes you think, "Oh, this is a fix for a memory leak in OLE Automation," but that's not what it is.

The BSTR is the string type used by OLE Automation, and since strings are used a lot, OLE Automation maintains a cache of recently-freed strings which it can re-use when somebody allocates a new one. Caches are nice (though you need to make sure you have a good replacement policy), but they confuse memory leak detection tools, because the memory leak detection tool will not be able to match up the allocator with the deallocator. What the memory leak detection tool sees is not the creation and freeing of strings but rather the allocation and deallocation of memory. And if there is a string cache (say, of just one entry, for simplicity), what the memory leak detection tool sees is only a part of the real story.

  • Program (line 1): Creates string 1.
  • String manager: Allocates memory block A for string 1.
  • Program (line 2): Frees string 1.
  • String manager: Puts memory block A into cache.
  • Program (line 3): Creates string 2.
  • String manager: Re-uses memory block A for string 2.
  • Program (line 4): Creates string 3.
  • String manager: Allocates memory block B for string 3.
  • Program (line 5): Frees string 3.
  • String manager: Puts memory block B into cache.
  • Program (line 6): Frees string 2.
  • String manager: Deallocates memory block A since there is no room in the cache.

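Your program sees only the lines marked Program:, and the memory leak detection tool sees only the actual memory operations, namely the steps where the string manager allocates or deallocates a memory block. Putting a block into the cache and re-using a block from the cache are invisible to it. As a result, the memory leak detection tool sees a warped view of the program's string usage: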

  • Line 1 of your program allocates memory block A.
  • Line 4 of your program allocates memory block B.
  • Line 6 of your program deallocates memory block A.

Notice that the memory leak detection tool thinks that line 6 freed the memory allocated by line 1, even though the two lines of the program are unrelated. Line 6 is freeing string 2, and line 1 is creating string 1!

Notice also that the memory leak detection tool will report a memory leak, because it sees that you allocated two memory blocks but deallocated only one of them. The memory leak detection tool will say, "Memory allocated at line 4 is never freed." And you stare at line 4 of your program and insist that the memory leak detection tool is on crack because there, you freed it right at the very next line! You chalk this up as "Stupid memory leak detection tool, it has all these useless false positives."
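
The sequence above can be reproduced with a toy model. This is only a sketch: the StringManager and LeakTracker classes are made up for illustration, the cache holds a single entry as in the example, and the tracker blames whichever program line was running when a real allocation or deallocation happened.

```python
class LeakTracker:
    """Stands in for the leak detection tool: it sees only raw memory calls."""

    def __init__(self):
        self.live = {}   # block name -> program line blamed for allocating it
        self.log = []

    def on_alloc(self, block, line):
        self.live[block] = line
        self.log.append(f"Line {line} allocates memory block {block}.")

    def on_free(self, block, line):
        del self.live[block]
        self.log.append(f"Line {line} deallocates memory block {block}.")

    def leaks(self):
        return [f"Memory allocated at line {line} is never freed."
                for line in sorted(self.live.values())]


class StringManager:
    """Hypothetical string manager with a one-entry cache of freed blocks."""

    def __init__(self, tracker):
        self.tracker = tracker
        self.cache = None
        self.names = iter("ABCDEFGH")

    def create(self, line):
        if self.cache is not None:
            block, self.cache = self.cache, None   # re-use: tracker sees nothing
            return block
        block = next(self.names)
        self.tracker.on_alloc(block, line)         # real allocation
        return block

    def free(self, block, line):
        if self.cache is None:
            self.cache = block                     # cached: tracker sees nothing
        else:
            self.tracker.on_free(block, line)      # no room in the cache


tracker = LeakTracker()
strings = StringManager(tracker)
s1 = strings.create(line=1)
strings.free(s1, line=2)
s2 = strings.create(line=3)
s3 = strings.create(line=4)
strings.free(s3, line=5)
strings.free(s2, line=6)

print("\n".join(tracker.log))
print("\n".join(tracker.leaks()))
```

The log contains only the three events listed earlier, and the sole leak report points at line 4, even though the program freed every string it created.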

Even worse: Suppose somebody deletes line 6 of your program, thereby introducing a genuine memory leak. Now the memory leak detection tool will report two leaks:

  • Memory allocated at line 1 is never freed.
  • Memory allocated at line 4 is never freed.

You already marked the second report as bogus during your last round of investigation. Now you look at the first report, and decide that it too is bogus; I mean look, we free the string right there at line 2!

Result: A memory leak is introduced, the memory leak detection tool finds it, but you discard it as another bug in the memory leak detection tool.

When you're doing memory leak detection, it helps to disable your caches. That way, the high-level object creation and destruction performed in your program maps more directly to the low-level memory allocation and deallocation functions tracked by the memory leak detection tool. In our example, if there were no cache, then every Create string would map directly to an Allocate memory call, and every Free string would map directly to a Deallocate memory call.
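
To see why disabling the cache helps, here is a small sketch that runs the example program with and without the one-entry cache and returns the lines the tool would blame for leaks. The run function, the cache_slots parameter, and the step tuples are all invented for this illustration; they do not model how OLE Automation is actually implemented.

```python
def run(program, cache_slots):
    """Run (line, op, string) steps; return the lines blamed for leaks.

    The tool sees only real allocations and deallocations, and attributes
    each one to the program line that was executing at the time.
    """
    cache = []        # freed blocks held for re-use
    block_of = {}     # string name -> memory block backing it
    live = {}         # block -> line the tool thinks allocated it
    next_block = 0
    for line, op, string in program:
        if op == "create":
            if cache:
                block = cache.pop()        # re-use, invisible to the tool
            else:
                block = next_block         # real allocation, tool sees it
                next_block += 1
                live[block] = line
            block_of[string] = block
        else:  # "free"
            block = block_of.pop(string)
            if len(cache) < cache_slots:
                cache.append(block)        # cached, invisible to the tool
            else:
                del live[block]            # real deallocation, tool sees it
    return sorted(live.values())


correct = [
    (1, "create", "string 1"), (2, "free", "string 1"),
    (3, "create", "string 2"), (4, "create", "string 3"),
    (5, "free", "string 3"), (6, "free", "string 2"),
]
buggy = correct[:-1]   # somebody deleted line 6

print(run(correct, cache_slots=1))   # false positive at line 4
print(run(buggy, cache_slots=1))     # lines 1 and 4, neither is the culprit
print(run(correct, cache_slots=0))   # nothing reported
print(run(buggy, cache_slots=0))     # line 3, where string 2 was created
```

With the cache off, the correct program produces no reports, and the buggy one produces a single report that points at the line which created the string that was never freed.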

What KB article 139071 is trying to say is "FIX: OLE Automation BSTR cache cannot be disabled in Windows 2000." Windows XP already contains support for the OANOCACHE environment variable, which disables the BSTR cache so you can investigate those BSTR leaks more effectively. The hotfix adds support for OANOCACHE to Windows 2000.
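
One way to apply this is to launch the program under test from a small wrapper that sets the variable in the child's environment. The sketch below makes a few assumptions: the value "1" is the commonly used setting (the text above names only the variable), the variable is assumed to need setting before the process starts, and the command line passed to run_under_leak_check is whatever you normally use to start your program or leak tool.

```python
import os
import subprocess


def no_bstr_cache_env(base_env=None):
    """Return a copy of the environment with OANOCACHE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OANOCACHE"] = "1"   # assumed value; the article names only the variable
    return env


def run_under_leak_check(command):
    """Start the given command (a list of arguments) with the BSTR cache off."""
    return subprocess.run(command, env=no_bstr_cache_env(), check=False)
```

The variable is set only for the child process, so the rest of the session keeps the normal caching behavior.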

Bonus chatter: Why do we have BSTR anyway? Why not just use null-terminated strings everywhere?

The BSTR data type was introduced by Visual Basic. They couldn't use null-terminated strings because Basic permits nulls to be embedded in strings. Whereas Win32 is based on the K&R C way of doing things, OLE Automation is based on the Basic way of doing things.
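
The difference shows up as soon as a string contains an embedded null. The sketch below builds a buffer laid out like a BSTR (a four-byte length prefix giving the byte count, then UTF-16 characters, then a null terminator) and reads it back both ways. The helper names are invented for illustration, and a real BSTR pointer points at the character data rather than at the prefix.

```python
import struct


def make_counted(text):
    """Lay out text like a BSTR: byte-count prefix, UTF-16 data, terminator."""
    data = text.encode("utf-16-le")
    return struct.pack("<I", len(data)) + data + b"\x00\x00"


def read_counted(buf):
    """Trust the length prefix, as Basic does."""
    (nbytes,) = struct.unpack_from("<I", buf, 0)
    return buf[4:4 + nbytes].decode("utf-16-le")


def read_null_terminated(buf):
    """Ignore the prefix and stop at the first null, as C does."""
    chars = buf[4:].decode("utf-16-le")
    return chars.split("\x00", 1)[0]


buf = make_counted("ab\x00cd")
print(repr(read_counted(buf)))
print(repr(read_null_terminated(buf)))
```

The counted reader recovers all five characters, while the null-terminated reader stops after "ab".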

Comments (16)
  1. AsmGuru62 says:

    Also, the multiple heaps approach confuses the Leak Detection Tools. For example, if I need to load some data into some memory structures – I create a class with a HANDLE initialized with HeapCreate(…) and I’ll allocate all my data from this allocator. That’s what MSDN suggests. And that works wonders for performance – in the class destructor I simply call HeapDestroy() instead of thousands of calls to free()/delete, etc. Clean and fast! However, I am constantly getting "bugs" assigned to me which say: "A lot of leaks reported in module XYZ! (my module)". I am tired of explaining the situation to SQA guys, actually…

  2. Koro says:

    Today’s post is right on what I am currently working on.

    I wasted a day trying to figure out why STLport seemed to leak each and every allocation to end up discovering it does its own pooling/caching in the back. Why can’t these guys just use the OS’s allocator and let it do its job is beyond me. It has been optimized for this.

  3. Leo Davidson says:

    This is one of several reasons I find the memory-leak tools to be really frustrating.

    Even the ones that cost thousands of dollars — at least the two or three high-end ones I’ve tried — throw up so many false-positives that it’s very time consuming to sift through them looking for the legitimate problems.

    (And that’s if they don’t fall over completely on a large/complex project.)

    It’s a very difficult problem to solve, to be fair to them, but when they ask $$$$ for solutions that don’t really work I don’t have much sympathy.

  4. Anonymous Coward says:

    I fully agree Teo. Microsoft should expose APIs using counted strings and deprecate the old ones.

    As to whether the BSTR way of implementing counted strings is a panacea, well I’ve had my doubts on that, but it’s a million times better than the C way.

    Some weirder parts of the BSTR way are obviously there for some compatibility with the C way, which is funny since from VB you can’t actually call a DLL function that needs a C string without some conversion taking place (for the …A functions this is conversion to the local code page, for the …W functions this is copying the contents to a Byte array). Unless you write a typelib, but most people don’t bother.

  5. JohnQPublic says:

    Was it not a design goal of BSTR to have a common allocator/deallocator function that everyone could agree on when working with strings?  I suppose everyone could have relied on CoTaskMemAlloc/CoTaskMemFree for strings, but BSTR seems like it was tailor made for this purpose.

  6. porter says:

    > I fully agree Teo. Microsoft should expose APIs using counted strings and deprecate the old ones.

    Why bother? Support for OLE and even the Win32 API is archaic legacy stuff. .NET is the way forward. Surely? :)

  7. @porter: +1 (although perhaps not the most popular stance at this blog).

  8. Teo says:

    I find it quite amusing and embarrassing at the same time that, on the lowest levels, NT uses counted strings in the form of the UNICODE_STRING structure, and basically every high-level language uses counted strings, be they std::basic_string<>, BSTR, Delphi’s string, .NET String etc. But they *must* talk to each other using the null-terminated interface of the Win32 API middle-man :(

  9. Friday says:

    Can OANOCACHE be used for non-debug purposes?

    To improve stability and/or speed?

    Don’t look at me like that, I’m not a programmer. :) I just haven’t seen it in a non-debugging context and am curious.

  10. Ivo says:

    I’m having similar problems with SHGetKnownFolderIDList and other shell functions. They create multiple memory allocations, probably for caching purposes.

    Anybody knows how to make the shell free its caches on shutdown? So I can see what the true leaks are? OANOCACHE has no effect in this case.

    [I believe the checked build of Windows does what you want. (Sometimes I wonder why we make a checked build since it seems nobody runs it.) -Raymond]
  11. Yuhong Bao says:

    [I believe the checked build of Windows does what you want. (Sometimes I wonder why we make a checked build since it seems nobody runs it.) -Raymond]

    Well, only MSDN subscribers even have access to checked builds of Windows.

    [Then subscribe to MSDN. There are many benefits to being an MSDN subscriber. For example, you get to be one of those “customers” I often write about. -Raymond]
  12. steveg says:

    [I believe the checked build of Windows does what you want. (Sometimes I wonder why we make a checked build since it seems nobody runs it.) -Raymond]

    Rename it to "Windows Debug Build"? AFAIK the term "checked" isn’t a common synonym for "debug".

  13. Miral says:

    > Rename it to "Windows Debug Build"? AFAIK the term "checked" isn’t a common synonym for "debug".

    IIRC, it’s not actually a debug build, though.  It’s just the release build with assertions and similar diagnostic *checks* enabled.  The debug build is something else.

  14. Yuhong Bao says:

    "IIRC, it’s not actually a debug build, though.  It’s just the release build with assertions and similar diagnostic *checks* enabled.  The debug build is something else."

    Larry Osterman has the story:

    http://blogs.msdn.com/larryosterman/archive/2005/08/31/458572.aspx

  15. Neo says:

    BSTRs themselves can have null characters which makes them different from regular C style null terminated strings. This could have been a requirement to create this new string type?

  16. Teo says:

    >> Sometimes I wonder why we make a checked build since it seems nobody runs it

    Well, for user-mode stuff it is not that helpful but for kernel-mode it’s a life saver. Most certainly *somebody does* run it, like us :-P
