Memory Leak Detection and MSXML

File this one under "I had no idea so I best share what I learned."

Over a couple of days during the past two weeks I have been working with a customer to isolate a memory leak in an ATL/C++ application.  This particular customer application uses the msxml3 component to do some document translations on imported data.  The imported data can be fairly large, and the memory allocated for it does not appear to be released.

I am not the foremost authority on ATL/C++ but I'm not a newbie either.  I know my way around the basics of WinDbg to isolate problems.

It turns out that MSXML gives tools like Rational's Purify and Microsoft Customer Support Services (CSS) LeakDiag false positives because of the way it manages memory.  Based on the documentation I have read, MSXML uses five memory managers, all designed for performance and scalability.  To start, the garbage collector in MSXML allocates a pool of memory for managing cached objects.  Wrapped around the OS heap is a multi-processor optimized heap manager, which also caches memory for performance.  In addition, there are two more memory managers: one designed for large memory allocations, and the COM allocator.  Both either cache memory or allocate it in large amounts.

So you can see how easy it would be for these memory tools to incorrectly identify MSXML as a memory-leak culprit.  Microsoft KB 304227 is recommended reading if you suspect MSXML garbage collection needs to be tuned further.

On the other hand, if you really suspect your own code, then I recommend the often-overlooked Debug C Run-Time Library.  John Robbins does a great job laying this out in Chapter 17 of his Debugging Applications book.  You can find details in MSDN as well.  And if you are a regular Bugslayer reader then this is all old hat.
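
For anyone who has not used the debug CRT heap before, here is a minimal sketch of the two calls I reach for first.  The helper names EnableCrtLeakReport and LeaksMemory are my own inventions for illustration and are not part of any library.  Only the _Crt* functions and flags come from the debug CRT, and they only do anything in a debug build with Visual C++ (where _DEBUG is defined).  On any other compiler or in a release build, this sketch deliberately falls back to doing nothing.

```cpp
// Assumption: Visual C++ debug build (/MDd or /MTd). Elsewhere the helpers are no-ops.
#define _CRTDBG_MAP_ALLOC   // must come before the includes so malloc/free map to debug versions
#include <stdlib.h>
#if defined(_MSC_VER)
#include <crtdbg.h>
#endif

// Hypothetical helper: ask the debug heap to dump any unfreed blocks when
// the process exits. Returns true only if the debug heap is available.
inline bool EnableCrtLeakReport() {
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);   // read the current flags
    _CrtSetDbgFlag(flags | _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: run fn between two heap checkpoints and report
// whether the debug heap saw a net change in allocated blocks.
template <typename Fn>
bool LeaksMemory(Fn fn) {
#if defined(_MSC_VER) && defined(_DEBUG)
    _CrtMemState before, after, diff;
    _CrtMemCheckpoint(&before);
    fn();
    _CrtMemCheckpoint(&after);
    if (_CrtMemDifference(&diff, &before, &after)) {
        _CrtMemDumpStatistics(&diff);   // summary goes to the debug output
        return true;
    }
    return false;
#else
    fn();            // no debug heap available, so nothing can be measured
    return false;
#endif
}
```

Call EnableCrtLeakReport() once at startup, then wrap a suspect piece of your own code in LeaksMemory to narrow things down.  Keep in mind the point above: a block that calls into MSXML may show a difference simply because MSXML is caching memory, so this is most useful around code that does not touch the parser.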