If you say that your buffer can hold 200 characters, then it had better hold 200 characters


A security vulnerability report claimed that there was a vulnerability in the GetDoodadName function (not the actual function name):

There is a buffer overflow bug in the GetDoodadName function. If the doodad's name is 10 characters long, and the caller provides a buffer of size 11 characters, but specifies a buffer size of 200, then the GetDoodadName function will write more than 11 characters (10 characters for the name, plus the null terminator). Even though the caller passed an incorrect buffer size, the overflow should not happen because the caller's buffer was large enough to hold the actual result.

The original report was difficult to understand, probably because English was not the finder's native language, and there were parts of the report where I couldn't figure out what the finder was trying to say.

Going back to the issue at hand: If you pass a buffer to a function and say that it can hold up to 200 characters, then the function is welcome to use the entire buffer, even if the final result doesn't require full use of the buffer. This is just part of the basic ground rules for programming:

A function is permitted to write to the full extent of the buffer provided by the caller, even if not all of the buffer is required to hold the result.
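
Concretely, here is a sketch of what this rule permits and what it forbids, using the hypothetical GetDoodadName from the report:

// A sketch; GetDoodadName and the doodad handle are the
// hypothetical ones from this article.
void Example(HANDLE doodad)
{
 UINT actualSize;

 // Correct: the buffer really is 200 characters, and all 200 are
 // expendable until the function returns.
 wchar_t big[200];
 GetDoodadName(doodad, big, 200, &actualSize);

 // Incorrect: only 11 characters exist, but the call claims 200.
 // GetDoodadName may legitimately write to all 200 of them.
 wchar_t small[11];
 GetDoodadName(doodad, small, 200, &actualSize); // buffer overflow in the caller
}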

In this case, what happens is that the GetDoodadName function relies upon an internal function, let's call it GetDoodadFullName, which returns a fully-qualified name. GetDoodadName then removes the unnecessary qualifications, resulting in the final doodad name for the caller.

A simple implementation would go something like this:

DWORD GetDoodadName(
    HANDLE doodad,
    PWSTR buffer, UINT bufferSize,
    UINT* actualSize)
{
 *actualSize = 0;

 UINT actualFullSize;
 DWORD result = GetDoodadFullName(doodad,
   nullptr, 0, &actualFullSize);

 // If something went wrong other than "buffer too small",
 // then give up.
 if (result != ERROR_MORE_DATA) return result;

 // Sizes are in characters, so scale to bytes for the allocator.
 PWSTR fullName = (PWSTR)HeapAlloc(
    GetProcessHeap(), 0, actualFullSize * sizeof(wchar_t));
 if (!fullName) return ERROR_NOT_ENOUGH_MEMORY;
 result = GetDoodadFullName(doodad,
    fullName, actualFullSize, &actualFullSize);
 if (result == ERROR_SUCCESS) {
  *actualSize = ExtractLocalNameFromFullName(
    fullName, buffer, bufferSize);
 }
 HeapFree(GetProcessHeap(), 0, fullName);
 return result;
}

Since you're going to have to call GetDoodadFullName anyway, you may as well see if you're lucky and the full name fits inside the caller-provided buffer. In that case, you need only call GetDoodadFullName once, and you don't need to allocate the temporary buffer either. That saves you a memory allocation and a round trip to the doodad server.

DWORD GetDoodadName(HANDLE doodad, PWSTR buffer, UINT bufferSize,
    UINT* actualSize)
{
 *actualSize = 0;

 UINT actualFullSize;
 DWORD result = GetDoodadFullName(doodad,
    buffer, bufferSize, &actualFullSize);
 if (result == ERROR_SUCCESS) {
  // The caller's buffer is big enough to hold the full name.
  *actualSize = ExtractLocalNameFromFullName(
    buffer, buffer, bufferSize);
  return result;
 }

 // If something went wrong other than "buffer too small",
 // then give up.
 if (result != ERROR_MORE_DATA) return result;

 // Sizes are in characters, so scale to bytes for the allocator.
 PWSTR fullName = (PWSTR)HeapAlloc(
    GetProcessHeap(), 0, actualFullSize * sizeof(wchar_t));
 if (!fullName) return ERROR_NOT_ENOUGH_MEMORY;
 result = GetDoodadFullName(doodad,
    fullName, actualFullSize, &actualFullSize);
 if (result == ERROR_SUCCESS) {
  *actualSize = ExtractLocalNameFromFullName(
    fullName, buffer, bufferSize);
 }
 HeapFree(GetProcessHeap(), 0, fullName);
 return result;
}

This is a legitimate optimization because the function has free use of the caller-provided buffer, for the full extent of the caller-specified size of the buffer, until the function returns. And this function takes advantage of this freedom by using the caller-provided buffer as a temporary buffer for holding the full name.

If the caller provides a buffer of size 200, then that buffer had better be 200 characters in size, and all 200 of those characters had better be expendable.

What's even more dangerous about this is that the caller cannot guarantee the length of the doodad's local name. A doodad's name can be changed by anybody who can find the doodad in the system doodad table, so you can't say, "Well, my code created the doodad with a local name whose length I know to be exactly 10, so it's safe to overstate the buffer size because I know that the result won't use more than 11 characters." Some other program may have changed the doodad's name, and your "knowledge" that the result will require only 11 of those characters is no longer valid.

Curiously, the finder even acknowledged the fact that the name could change for reasons outside the program's control, and noted that if the new name requires more than 11 characters, then more than 11 characters will be modified.

So I don't know what the finder was trying to say. Since the length of the name cannot be known ahead of time, the caller doesn't know how much of the putatively 200-character buffer will be used, so the caller needs to be prepared for the case that all of it will be used. If the caller had important data at character 12, the caller may be in for an unpleasant surprise.
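
The defensive pattern, then, is to tell the truth about the buffer size and retry if the name grows. A minimal sketch, assuming the hypothetical GetDoodadName follows the usual Win32 convention of returning ERROR_MORE_DATA and reporting the required size in actualSize:

DWORD GetDoodadNameAlloc(HANDLE doodad, PWSTR* name)
{
 *name = nullptr;
 UINT size = 64; // initial guess, in characters
 for (;;) {
  PWSTR buffer = (PWSTR)HeapAlloc(
     GetProcessHeap(), 0, size * sizeof(wchar_t));
  if (!buffer) return ERROR_NOT_ENOUGH_MEMORY;
  UINT actualSize;
  DWORD error = GetDoodadName(doodad, buffer, size, &actualSize);
  if (error == ERROR_SUCCESS) { *name = buffer; return error; }
  HeapFree(GetProcessHeap(), 0, buffer);
  if (error != ERROR_MORE_DATA) return error;
  size = actualSize; // the name may have grown; try again with the new size
 }
}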

The buffer overflow in the report is not a vulnerability in the GetDoodadName function. It is a vulnerability in the caller, for passing the wrong buffer size.

We asked the finder to clarify why they considered this a flaw in the GetDoodadName function, but the response was not readily comprehensible. They seemed to be more interested in the change in behavior when the incorrectly-specified buffer size is large enough to hold the full name, as opposed to when it isn't.

One part that was sort of understandable went like this (after correcting grammar):

According to the design principle, even if I have provided an incorrect buffer size, a crash should not happen, because the user provided an actual buffer large enough to hold the (undecorated) doodad name.

I'm not sure what design principle says that if a caller provides an incorrect buffer size, we should somehow detect and avoid overflowing the buffer. If that were possible, then why have buffer size parameters at all? Just detect the correct buffer size automatically for all callers!

The changing or missing antecedents make the clarification hard to decipher as well. Sometimes the invalid parameter came from "me", sometimes it was provided by "the user", and sometimes the agent that crashed was left unspecified.

The crash is interesting if it occurs at a different security level from the caller who passed invalid (or malicious) parameters. In this case, the invalid parameters are coming from the calling application, and the buffer overflow occurs in the calling application, and the crash occurs in the calling application. So everything happens at the same security level, and there is no elevation. What you found is a way for a malicious caller to corrupt its own memory in a very roundabout way.

Comments (35)
  1. florian says:

    It’s a good pattern for Win32 functions returning strings to call them the first time with a likely-to-fit buffer, and the second time (only if necessary) with a tailored buffer, as sketched below. But you need to watch closely whether the NULL terminator is already included in the copied or required sizes that are returned; this is not always consistent.

    One such function I have been using recently was ExpandEnvironmentStrings, and MSDN says:

    The size of the lpSrc and lpDst buffers is limited to 32K.

    But on Windows 10 1803, the function seems to accept larger buffers. Where can I file the security vulnerability, I mean, hey, this could eat up all my computer memory!
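
    A minimal sketch of that two-call pattern with ExpandEnvironmentStringsW, whose return value is the required size in characters, including the terminator (a fully robust version would loop, since the variables can change between the two calls):

    #include <windows.h>
    #include <string>

    std::wstring ExpandEnv(PCWSTR source)
    {
        wchar_t local[256]; // likely-to-fit first attempt
        DWORD needed = ExpandEnvironmentStringsW(source, local, 256);
        if (needed == 0) return {};      // hard failure
        if (needed <= 256) return local; // it fit; "needed" counts the terminator
        std::wstring result(needed, L'\0'); // tailored buffer
        DWORD written = ExpandEnvironmentStringsW(source, result.data(), needed);
        result.resize(written ? written - 1 : 0); // drop the terminator
        return result;
    }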

    1. Peter Doubleday says:

      Eating up all of your computer’s memory is not actually a security flaw. After all, in this case, it’s what you asked the system to do. It’s not a security vulnerability, it’s a self-inflicted local DoS.

      It’s also, in terms of environment strings, incredibly difficult to see how this would happen in real life. But if you have an example of it happening, then by all means, write a simple program, explain the environment, and submit a report. (I’m going to get hell from Microsoft for suggesting this, but then again I really don’t see it as a likely eventuality.)

      Interestingly, on an old (Unix, not MS) system, I did once trip over a security vulnerability to do with expanding environment strings, but in fact it was the absolute inverse of this. The API call in question took the 32KB buffer (it was about that size) and blithely expanded all the strings using the abbreviations in .rc files until they overran the 32KB…

      That’s the sort of “security vulnerability” you need to worry about. (And thankfully it was solved by the Unix system in question.)

    2. Gee Law says:

      The other thing is that you want to call these functions in a loop. For example, call RegQueryValueEx in a loop to avoid copying partial data to the buffer.
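
      A sketch of such a loop with RegQueryValueExW (the value can change between calls, so the only safe approach is to retry on ERROR_MORE_DATA):

      #include <windows.h>
      #include <vector>

      std::vector<BYTE> QueryValue(HKEY key, PCWSTR name)
      {
          std::vector<BYTE> data(64); // initial guess
          for (;;) {
              DWORD size = (DWORD)data.size();
              LSTATUS status = RegQueryValueExW(key, name, nullptr,
                                                nullptr, data.data(), &size);
              if (status == ERROR_SUCCESS) { data.resize(size); return data; }
              if (status != ERROR_MORE_DATA) return {}; // give up (sketch)
              data.resize(size); // grow to the reported size and retry
          }
      }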

    3. Darran Rowe says:

      About the size of the buffers for ExpandEnvironmentStrings, this may also just be a documentation bug. You see, prior to Windows Vista the environment block had a 32k limit, which meant that the environment variables had to be small enough to fit into a 32k block.
      With Windows Vista, each environment variable has a maximum size of 32k and the block no longer has a fixed maximum size. So if the documentation for ExpandEnvironmentStrings hasn’t been touched in a long time, it may simply no longer be true. It’s not like the documentation on MSDN is perfect.

  2. kantos says:

    Not only would I assume as a caller that the buffer for the passed-in size is mutable, I’d assume it’s almost certainly mutated. I know quite a few reasons a function would deliberately memset the buffer to zero after checking the initial arguments, just to ensure that it doesn’t have to go back and null-terminate if the value fits within bufferSize – 1. It might be worth the documentation explicitly specifying whether the function will fail if it can’t write a null terminator, though.

    1. Torsten Kammer says:

      That doesn’t really apply here, though. As far as the function knows, it can write a null terminator. The person in charge of providing the space said the space is there, after all.

    2. Peter Doubleday says:

      That would be an awful lot of documentation updates, to cover the one single thing that any C/C++ programmer should know instinctively. If you are dealing with strings, you need to allow (at least*) one extra byte for the null. And no, supplying a malloced/newed byte buffer of length N with a length parameter of N+1 doesn’t really absolve you of a charge of very basic and utter incompetence. And even if it did, I doubt the sort of programmer who does this (or submits a “bug report”) would bother to read the documentation.

      * There are some (I don’t know how many) API calls that deal with arrays-of-arrays, which I believe are terminated by a double null in all cases. These are a bit trickier and not necessarily what a C/C++ programmer would expect (although they should be able to spot the necessity). If the documentation for said API calls doesn’t mention the double null, then I’d agree — it should.

      1. Erik F says:

        Minor observation: that should be one extra *element*, as strings are generally UTF-16 nowadays in Windows (and an array of pointers would have 4- or 8-byte elements if NULL terminated.) That said, I always have to double-check after writing array-handling code (and string handling code in particular) because it’s easy to miscount (on a good day, the program will crash after reading past the end, but it often doesn’t), and misinterpret (did that length include the terminator, or didn’t it?), and C doesn’t help with functions like strncpy(), which has a hidden gotcha if you’re not careful! I do love programming in C, but sometimes I hate it.
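
        The strncpy() gotcha, for the record: if the source fills the buffer, no terminator is written.

        #include <string.h>

        char buf[4];
        strncpy(buf, "abcd", sizeof(buf)); // copies 'a','b','c','d', no '\0'
        // buf is now NOT a C string; printf("%s", buf) would read past the end.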

  3. The MAZZTer says:

    > Just detect the correct buffer size automatically for all callers!

    This is the whole point of .NET, and other managed languages, yes? If you want .NET, code in .NET (or your language of choice that does this).

    1. Clockwork-Muse says:

      Yes and no. Being a managed language is more about automatically handling buffer (and other object) allocation/deallocation – but since you generally need to know the size of such a buffer to free it correctly, size is handled “for free” (and, since most of these languages were trying to eliminate foot-guns, this makes detecting such index-out-of-bounds errors much easier for programmers).

      C++ has an Array class type: http://en.cppreference.com/w/cpp/container/array – this brings the size along with it. What’s often useful in the case of buffers, however, is _slicing_ the buffer; that is, providing a view of a section of the buffer as if it were the entire thing. For the C++ Array type, you’d do that with start/end iterators, but other languages (such as C#) provide explicit Slice objects.

      1. Bret Kuhns says:

        > For the C++ Array type, you’d do that with start/end iterators, but other languages (such as C#) provide explicit Slice objects.

        Span (http://en.cppreference.com/w/cpp/container/span) is coming to C++20 or can be used today via the “Guideline Support Library” (https://github.com/microsoft/gsl). A span acts like a “slice object” rather than having to deal with iterator pairs.

        Example: https://godbolt.org/g/dqneAM
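
        A small illustration (std::span shown; gsl::span works the same way), with FillBuffer standing in for any hypothetical callee that writes into the view it is given:

        #include <cstddef>
        #include <span>

        void FillBuffer(std::span<wchar_t> out); // hypothetical callee

        void Example(wchar_t* buffer, std::size_t size)
        {
            std::span<wchar_t> whole(buffer, size);
            FillBuffer(whole.first(size / 2)); // a slice: the first half only
            // The callee sees out.size(); there is no separate size
            // parameter to get wrong.
        }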

      2. cheong00 says:

        Let’s not forget there is BSTR that is a mix of both Pascal string (length denoted) and null terminated string.

  4. Joshua says:

    Today’s example illustrates an unintended poor design. Not because the caller specified a buffer shorter than the declared buffer, but because something that doesn’t look like a bottom-level API function expects the caller to allocate the buffer.

    The correct design choice is almost always to have the API itself allocate and return the buffer, and have the caller free it. (Ok. I’ll make an exception for the File I/O and Registry functions.) But unfortunately on Windows the only good functions for allocating memory in one DLL and freeing it in another are LocalAlloc and GlobalAlloc, both of which are marked as for backwards compatibility.
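
    FormatMessageW is one existing example of that design: with FORMAT_MESSAGE_ALLOCATE_BUFFER, the system allocates the buffer with LocalAlloc and the caller frees it with LocalFree.

    PWSTR message = nullptr;
    DWORD length = FormatMessageW(
        FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM |
        FORMAT_MESSAGE_IGNORE_INSERTS,
        nullptr, ERROR_MORE_DATA, 0,
        (PWSTR)&message, 0, nullptr); // the system allocates; we just free
    if (length) {
        // ... use message ...
        LocalFree(message);
    }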

    1. kantos says:

      CoTaskMemAlloc isn’t deprecated and uses the COM heap; not sure if it requires CoInitialize, though.

    2. Clockwork-Muse says:

      While that’s nice (and would allow more “immutable language”-like behavior), the problem is that it means you’re allocating additional memory. If the API is sufficiently old, using pre-allocated buffers was mostly the default. It also supposes the possibility of dynamic memory allocation, which isn’t always possible (embedded systems, for example).
      Even in modern, managed languages there’s some effort to write into existing buffers, especially if those buffers can be reused.

    3. Ben Voigt says:

      FTFY: The correct design choice is NEVER to have the API itself allocate the buffer unless the function’s one and only purpose is allocation.

      Proof: Apply the Single Responsibility Principle.

      1. Joshua says:

        I know you do C# programming a lot. Besides, don’t you get tired of the call twice pattern?

    4. Denis Silvestre says:

      Well no. I want MY result allocated in MY memory space, which could be a member of a structure or a variable on the stack (thus free of allocation/deallocation costs).

      Having the API allocate its own memory is far heavier on resources and time, plus it requires a lot more complexity on the API side to track object handles (you *do not* want static memory allocation in a multithreaded environment).

      1. Joshua says:

        How do you allocate unknown sized objects on the stack? I get really tired of the ERROR_MORE_DATA loop.

  5. Yuri Khan says:

    In other news, if you say your buffer can hold 200 characters, then it had better hold at least 800 bytes, in case every one of those 200 is an astral plane Unicode character such as U+1F4A9 PILE OF POO that is represented with 4 bytes in UTF-8 or a surrogate pair in UTF-16.

    1. The measurement unit is “code units”, not “code points”. A 400-byte buffer holds 200 UTF-16 code units.

      1. Maggan says:

        Actually a 400-byte buffer isn’t enough for both 200 utf-16 and ucs-2 characters. Windows can use either.

        1. Darran Rowe says:

          Considering UCS-2 is UTF-16 without the surrogates, isn’t mentioning UCS-2 redundant? If something can handle UTF-16 then it can handle UCS-2 by default.
          Also, Raymond didn’t say UTF-16 characters, he said code units. So basically a UTF-8 code unit is 8 bits/1 byte, a UTF-16 code unit is 16 bits/2 bytes. So this means that a 400 byte buffer is large enough to hold 200 UTF-16 code units. Whether those 200 code units represent 200 characters depends on the code units in the string.
          Sometimes it helps to actually read what someone writes before you comment.
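
          For instance, on Windows, where wchar_t is a UTF-16 code unit:

          const wchar_t poo[] = L"\U0001F4A9"; // U+1F4A9 PILE OF POO
          static_assert(sizeof(poo) == 3 * sizeof(wchar_t), "surrogate pair");
          // Two code units (a surrogate pair) plus the terminator,
          // yet only one code point and one user-perceived character.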

    2. florian says:

      Many functions* that cut strings to a certain length can produce split surrogate pairs. But from a user perspective, with “graphemes” as the superior concept possibly consisting of multiple “characters” (code points), this doesn’t even matter: you can still end up with a split “grapheme” even without split surrogates. And yes, there are even combining characters beyond the basic multilingual plane.**

      For example, programming a simple console utility with column-aligned (and truncated if necessary) output can be quite challenging with Unicode support. Sometimes I’m tempted to believe the best approach to abbreviate a string is to measure its display length …

      * I have tested: _snwprintf lstrcpynW PathCompactPathExW StrCatBuffW StrCatChainW StrCpyNW StringCchCopyW StringCchPrintfW wnsprintfW

      ** http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B:M:%5D&g=&i=

      1. Yuri Khan says:

        Re: combining astral plane characters

        I’m stumped that the U+1F607 SMILING FACE WITH HALO was included in Unicode only as a precomposed character. Imagine if it was SMILING FACE + COMBINING HALO. You could then use the aforementioned U+1F4A9 with the combining halo to mean “holy crap”.

      2. florian says:

        What looks like relaxed associativity was just my thoughts about measuring buffer lengths in code units.

        By the way, CharNextW * is not helpful for detecting “grapheme” boundaries, as the underlying GetStringTypeW merely returns C3_HIGHSURROGATE or C3_LOWSURROGATE for non-BMP code points, without any more in-depth hints about the code point type.

        * Apart from being broken:

        Michael S. Kaplan, 2008/12/16 – UCS-2 to UTF-16, Part 9: The torrents of breaking CharNext/CharPrev – http://archives.miloush.net/michkap/archive/2008/12/16/9223301.html

  6. ZLB says:

    Raymond, have these sorts of optimizations caused security issues/data leakage in the past that had to be fixed?

    Eg: API returns a list of items that are subject to access controls, API grabs all of the items and uses the user-supplied buffer as a scratch space then processes and filters the items before returning.

    The caller could then copy off the intermediate data on another thread.

    1. I don’t see it making a big difference, unless the kernel was using a caller supplied buffer as scratch space. I would be quite concerned if it was. Any API that’s doing security checks in userspace is just asking for trouble from the beginning. You could just directly call the offending internal function that’s returning the unchecked secured objects instead of having to mess around with watching a buffer changing while the security checks are being done.

      1. ZLB says:

        It was this post that twigged the thought. https://blogs.msdn.microsoft.com/oldnewthing/20110512-00/?p=10683

        Which is basically the same thing over the user/kernel fence.

  7. Scarlet Manuka says:

    The real question is – why did the caller want to tell the GetDoodadName function that the buffer size was 200, if the real buffer size was 11? Why not just tell the truth?

    1. laonianren says:

      Possibly someone had a financial liability for bugs in the code (e.g. a contractor or outsourcer) so they were desperately casting round for an explanation of how this failure wasn’t their fault.

      1. Chris Iverson says:

        That would explain why they went to the trouble of making a report with Microsoft over this, but not why they lied about the buffer in the first place.

        1. laonianren says:

          If they had a non-technical motive to defend the code there is no reason to believe the code was deliberate. It was probably just a bug.

  8. I am not sure I understand. What happened when the customer wrote 11 characters to the supposedly 200-character buffer? Did his/her app crash?

    Also, how did the customer attack anything here?

    1. Aged .Net Guy says:

      The customer didn’t write anything anywhere. The customer passed the Windows API function an 11 char buffer expecting a 10 char+NULL response. But the customer / caller told the Win API the buffer was 200 chars long and so the function clobbered some or all of the next 189 chars of memory after the first 11. Which surprised the customer / caller.

      In other words, he enticed the Win API to scribble into his other data. What happened next in his program is undefined. But almost certainly not what he wanted.

      Then the customer doubled down on the mistake by thinking that somehow this was a vulnerability that malicious code could exploit. Nope. The customer / caller could have trashed his own memory without the API’s help.

Comments are closed.
