I got an array with plenty of nuthin’


A customer reported a memory leak in the function PropVariantClear:

We found the following memory leak in the function PropVariantClear. Please fix it immediately because it causes our program to run out of memory.

If the PROPVARIANT's type is VT_ARRAY, then the corresponding SAFEARRAY is leaked and not cleaned up.

SAFEARRAY* psa = SafeArrayCreateVector(VT_UNKNOWN, 0, 1);
PROPVARIANT v;
v.vt = VT_ARRAY;
v.parray = psa;
PropVariantClear(&v);

// The psa is leaked

Right now, we are temporarily working around this in our program by inserting code before all calls to PropVariantClear to free the SAFEARRAY, but this is clearly an unsatisfactory solution because it will merely result in double-free bugs once you fix the bug. Please give this defect your highest priority as it is holding up deployment of our system.

The VT_ARRAY value is not a variant type in and of itself; it is a type modifier. There are other type modifiers, such as VT_VECTOR and VT_BYREF. The thing about modifiers is that they need to modify something.

The line v.vt = VT_ARRAY is incorrect. You have to say what you have a safe array of. In this case, you want v.vt = VT_ARRAY | VT_UNKNOWN. Once you change that, you'll find the memory leak is fixed.

The customer didn't believe this explanation.

I find this doubtful for several reasons.

  1. While this would explain why the IUnknowns in the SAFEARRAY are not released, it doesn't explain why the SAFEARRAY itself is leaked.

  2. The SAFEARRAY already contains this information, so it should already know that destroying it entails releasing the IUnknown pointers.

  3. If I manually call SafeArrayDestroy, then the IUnknowns are correctly released, confirming point 2.

  4. The function SafeArrayDestroy is never called; that is the root cause of the problem.

The customer's mental model of PropVariantDestroy appeared to be that it should go something like this:

if (pvt->vt & VT_ARRAY) {
 switch (pvt->vt & VT_TYPEMASK) {
 ...
 case VT_UNKNOWN:
  ... release the IUnknowns in the SAFEARRAY...
  break;
 ...
 }
 InternalFree(pvt->psa->pvData);
 InternalFree(pvt->psa);
 return S_OK;
}

In fact what's really going on is that the value of VT_ARRAY is interpreted as VT_ARRAY | VT_EMPTY, because (1) VT_ARRAY is a modifier, so it has to modify something, and (2) the numeric value of zero happens to be equal to VT_EMPTY. In other words, you told OLE automation that your PROPVARIANT holds a SAFEARRAY filled with VT_EMPTY.

It also happens that a SAFEARRAY of VT_EMPTY is illegal. Only certain types can be placed in a SAFEARRAY, and VT_EMPTY is not one of them.

The call to PropVariantClear was returning the error DISP_E_BADVARTYPE. It was performing parameter validation and rejecting the property variant as invalid, because you can't have an array of nothing. The customer's response to this explanation was very terse.

Tx. Interesting.

Comments (24)
  1. nathan_works says:

    They need to level up their psychic debugging skills. Oh yeah, and check yer return values you knuckleheads..

  2. Pierre B. says:

    nathan: normally, I’d agree with you, but almost all clean up functions, and certainly all memory deallocation functions I’ve used return no error value. So I’m not surprised at all they didn’t check the return value that they probably didn’t anticipate.

    It does show that good naming is once again central to good and clear APIs. If a bit field contains two types of values, then their naming should reflect that. In this case, it would be clearer if the modifiers were called VT_MODIFIER_ARRAY, VT_MODIFIER_BYREF, etc.

    Hindsight still at 100%.

  3. Adrian says:

    Humility is a virtue many programmers lack.  If you want to be right most of the time, then always suspect your own code first.  Even once you’re 100% convinced your stuff is right, proceed delicately.

    [I’m not claiming sainthood in this regard, but I try.]

  4. Alexandre Grigoriev says:

    One more reason to crash and burn on invalid arguments.

  5. Anonymous says:

    @Pierre CloseHandle() can fail.  As can fclose() in stdio.  I would say it’s still good to check return codes of functions that release resources.

  6. Gabe says:

    At least the customer’s response wasn’t "tl;dr"!

  7. Michael H. says:

    @Pierre

    In testing yes, in distribution, maybe. Because what the hell do you do, other than log an error, when close fails?

    http://blogs.msdn.com/oldnewthing/archive/2008/01/07/7011066.aspx

  8. Michael H. says:

    My apologies, that was meant for the Anonymous commenter a few comments up, not Pierre.

  9. C Garth says:

    A friend of mine, every time he encounters a bug in his Javascript/HTML code, tells me excitedly that he found a bug in the web browser. (Occam’s razor tells me that it’s a bug in the impossible-to-read code he wrote at 3 in the morning, and that he doesn’t actually encounter a new browser bug every night)

  10. Tim Dawson says:

    You can always spot a real winner when they can’t be bothered to type "thank you" in full.

  11. Mike Dimmick says:

    I’m kind of with your customer on this one. From the MSDN documentation of SAFEARRAY:

    “The fFeatures flags describe array attributes that can affect how the array is released. This allows freeing the array without referencing its containing variant. The bits are accessed using the following constants.”

    There’s no need for PropVariantDestroy to switch on the interior type of the SAFEARRAY – the array *already knows what it is*, at least as far as calling the right deallocation function goes, and can destroy itself happily just by calling SafeArrayDestroy. The call to SafeArrayCreateVector sets fFeatures appropriately (to FADF_UNKNOWN | FADF_HAVEIID).

    On the other hand, it’s necessary to tell VARIANT what kind of SAFEARRAY it is, for regular use, because SAFEARRAY doesn’t record the type information for straightforward integral types, as opposed to pointers to other structures.

    Still, PropVariantDestroy could probably do a better job.

    [Said “better job” comes at the cost of consistency. “Sometimes you have to specify the inner type of the array and sometimes you don’t, and if the two disagree then some functions trust the variant type code and others trust the SAFEARRAY.” -Raymond]
  12. Scott says:

    Really? To me the ‘winner’ detector should have triggered with "Please fix it immediately".

    That statement just oozed ego to me.

    Is it just my perception, or are a significant portion of all Microsoft employee blog postings customers having issues with allocation or deallocation of COM data?

  13. Marvin says:

    There are two problems here. First the whole VARIANT and SAFEARRAY APIs are a giant mess that very few people understand. The second is that some people still (it was 2009 last time I checked) don’t know that they need to check error codes. Yes this is a ‘destructor’ so no errors can safely propagate out of the check but still one ought to check and log/alert/assert if it fails.

    With regards to VARIANT/SAFEARRAY APIs, yes they are old, which is a fine excuse for why they are the way they are. What is not excusable is that MS still haven’t bothered to create a good wrapper for it. Various half baked CYadaYadaVariant classes in ATL and elsewhere are pathetic.

  14. ulric says:

    The customer’s attitude is just plain standard for an engineer.  

    The real fault that I see here is that a developer should not think that there is a leak in PropVariantClear after 15 years and he would be the first one to run into it.  It’s just not rational.

    Since they are not reported by the standard C Run Time debug output, there is tons of COM leaks out there..  people are doing terrible things with Variants and BSR.  This client probably only saw the leak because it leaked a COM object in his code.  Perhaps he had to update a lot of code after this discovery.  

  15. It’s interesting because the root cause of these issues is really API documentation.  The area of "user assistance" in regards to API’s is fascinating.  The .NET Framework did some usability studies with regard to how the class library was presented and made a number of changes based on the feedback.

    COM on the other is old enough that the original developers coded to their own perceptions and assumed others would think similarly.

    Although it’s a slow process, Microsoft does do a good job of updating documentation to address common implementer mistakes.

    Like other commenters, I rarely think the problem is in Microsoft’s code – particularly memory leaks – as I know that’s something given a lot of attention to.  Given the age of COM and OLE Automation, I’m sure any memory leak has long since been discovered and fixed.

    Shouldn’t the complier and/or some Lint/Code-correctness tool be warning the implementer whenever a function returns a value that is not evaluated?

  16. will says:

    I kind of wonder how many messages of this type companies get.  

    I would of been easy for the guy to try the fix solution and see if that works vs replying back with that message.

    This would of been a reasonable conversation if it was taking place with the people talking(by phone or in person) but when using email try the solution before you reply.

  17. Marquess says:

    @Charles Oppermann: “Shouldn’t the complier and/or some Lint/Code-correctness tool be warning the implementer whenever a function returns a value that is not evaluated?”

    There’s a lot of functions where evaluating the return value is is rather the exception than the norm. Like printf, MessageBox (with only one button) or SendMessage (for certain messages). You’d be swamped in false positives.

  18. Marc K says:

    MS never did bother to fix the leak that was present in TransparentBlt in Windows 95/98.  So it is possible for 15-year old MS code to leak resources.  Though the odds of being the first one to happen upon the leak aren’t so good.

  19. Ken Hagan says:

    "@Pierre CloseHandle() can fail.  As can fclose() in stdio.  I would say it’s still good to check return codes of functions that release resources."

    As can GDI’s DeleteObject() and COM’s Release(). In fact, if your API’s clean-up functions *don’t* return an error (or throw, or crash), how are they supposed to indicate that you’ve given them a corrupted object or an already-cleaned-up one?

    It is true that if you feed them a well-formed object, nearly all clean-up functions *won’t* fail, but that’s all the more reason to assert that they didn’t, and there are some that have implicit "finish what you are doing first and then die" semantics (like fclose) and these really can fail with just about any error.

    I must confess I probably fall short of my own high standards here, but I’m certain that I’d have checked the return value before demanding that MS fix the bug that has been lurking in their API for 15 years and I’m the first person to notice. The original correspondent is nothing like as smart as they think they are.

  20. Dean Harding says:

    I’ll never understand how these queries get all the way to Raymond before somebody thinks to ask “did you check the return value?”

    [Checking the return value would have changed the question, but the question would still be there. “Why is PropVariantClear reporting an invalid type for my array?” -Raymond]
  21. Peter Kay says:

    There really isn’t any excuse not to check return  values, especially once memory errors are encountered.

    Having said that, this post has reminded me of the horrors of exchange gateway programming, and having to recode parts of a gateway after finding out that one particular (Microsoft) function appeared to corrupt parts of memory one time in six.

    It’s possible that this was my fault but a) the documentation is awful b) there were (are?) no debug symbols – assembly level debugging isn’t fun c) it uses craploads of hard to use structures and d) it teaches the programmer the hard way that reading the function descriptions in sample code is not sufficient, as the actual code has unexpected side effects. Plus add a bonus e) no validation functions, and no errors returned up until and including the point of corruption.

  22. Gabe says:

    Ken: Be careful what you wish for! If you insist that programmers assert that all resource release functions don’t return errors, you’re going to get code littered with "ASSERT(resource.Release() != ERROR)". Then you’ll just have different random memory leaks.

  23. Yuhong Bao says:

    "b) there were (are?) no debug symbols – assembly level debugging isn’t fun“

    Exchange was one of the MS products where at least some symbols were public, I think.

  24. Peter Kay says:

    There may be some Exchange debug symbols, but usually the problem is in MAPI – and that doesn’t have debug symbols.

    I can’t speak for Exchange 2007/2010 development, though, because the work I did was for 2000/2003. There were plenty of oddities, though – unexpected outcomes and instances where the reality didn’t manage the documentation (such as strongly implying messages *always* had an RTF component, yet then messages without one came through).

    I don’t rule out some of the errors being mine, though.

Comments are closed.

Skip to main content