Hey, why am I leaking all my BSTR's?

IMHO, every developer should have a recent copy of the debugging tools for windows package installed on their machine (it's updated regularly, so check to see if there's a newer version).

One of the most useful leak tracking tools around is a wonderfully cool tool that's included in this package, UMDH.  UMDH allows you to take a snapshot of the heaps in a process, and perform a diff of the heap over time - basically you run it once to take a snapshot, then run it a second time after running a particular test and it allows you to compare the differences in the heaps.

This tool can be unbelievably useful when debugging services, especially shared services.  The nice thing about it is that it provides a snapshot of the heap usage, there are often times when that's the only way to determine the cause of a memory leak.

As a simple example of this the Exchange 5.5 IMAP server cached user logons.  It did this for performance reasons, it could take up to five seconds for a call to LogonUser to complete, and that affected our ability to service large numbers of clients - all of the server threads ended up being blocked waiting on the domain controllers to respond.  So we put in a logon cache.  The cache took the users credentials, performed a LogonUser with those credentials, and put the results into a heap.  On subsequent logons, the cache took the users credentials, looked them up in the heap, and if they were found, it just reused the token from the cache (and no, it didn't do the lookup in clear text, I'm not that stupid).  Unfortunately, when I first wrote the cache implementation, I had an uninitialized variable in the hash function used to lookup the user in the cache, and as a result, every user logon occupied a different slot in the hash table.  As a result, when run over time, I had a hideous memory leak (hundreds of megabytes of VM).  But, since the cache was purged on exit, the built-in leak tracking logic in the Exchange store didn't detect any memory leaks. 

We didn't have UMDH at the time, but UMDH would have been a perfect solution to the problem.

I recently went on a tear trying to find memory leaks in some of the new functionality we've added to the Windows Audio Service, and used UMDH to try to catch them.

I found a bunch of the leaks, and fixed them, but one of the leaks I just couldn't figure out showed up every time we allocated a BSTR object.

It drove me up the wall trying to figure out how we were leaking BSTR objects, nothing I did found the silly things.  A bunch of the leaks were in objects allocated with CComBSTR, which really surprised me, since I couldn't see how on earth they would leak memory.

And then someone pointed me to this KB article (KB139071).  KB1239071 describes the OLE caching of BSTR objects.  It also turns out that this behavior is described right on the MSDN page for the string manipulation functions, proving once again that I should have looked at the documentation :).

Basically, OLE caches all BSTR objects allocated in a process to allow it to pool together strings.  As a result, these strings are effectively leaked “on purposeâ€.  The KB article indicates that the cache is cleared when the OLEAUT32.DLL's DLL_PROCESS_DETACH logic is run, which is good to know, but didn't help me to debug my BSTR leak - I could still be leaking BSTRs.

Fortunately, there's a way of disabling the BSTR caching, simply set the OANOCACHE environment variable to 1 before launching your application.  If your application is a service, then you need to set OANOCACHE as a system environment variable (the bottom set of environment variables) and reboot.

I did this and all of my memory leaks mysteriously vanished.  And there was much rejoicing.