Fixing security holes in other programs


Any crash report that involves a buffer overrun quickly escalates in priority. The last few that came my way were actually bugs in other programs that were detected by Windows.

For example, there were a few programs that responded to the LVN_GETDISPINFO notification by overflowing the LVITEM.pszText buffer, writing more than LVITEM.cchTextMax characters.

Another responded to IContextMenu::GetContextMenu by overflowing the pszName buffer, writing more than cchMax characters.

Fortunately, in both cases, the overflow was only one character, so we were able to fix it by over-allocating the buffer by one and underreporting its size. That way, if the program overflows the buffer by one, it doesn't corrupt anything.

Another one overflows one of its own stack buffers if you right-click on a file whose name is longer than MAX_PATH. (These files are legal but are hard to create or manipulate.) Not much we can do to prevent that one.

So remember folks, watch those buffer sizes and don't overflow them. Security is everybody's job. We're all in this together.

Comments (36)
  1. asdf says:

    It looks like whoever wrote the documentation for those didn’t mention if the size included the terminating ‘’ or if the terminating ‘’ was optional if the number of characters is cch(Text)Max like it does for other structures/functions on msdn.

  2. Jack Mathews says:

    For files that go over MAX_PATH, does Windows even attempt to use the short file names as a substitute or anything? I mean, I know that there’s not ALWAYS short versions, but it would be nice.

  3. Raymond Chen says:

    Well, cchMax says "Size of the buffer, in characters, to receive the null-terminated string" and cchTextMax says "Number of characters in the buffer pointed to by pszText". The terminating null is part of the buffer, but you’re right it’s not explicitly called out. I’ll have it added. (People should know by now that when you have a buffer, everything must fit inside the buffer, and that includes the terminating null. In the same way that a pointer variable implicitly has the documentation "This pointer must be valid".)

  4. Raymond Chen says:

    Jack: Converting to short names (when possible) to squeeze under MAX_PATH is new for Longhorn.

    That still wouldn’t help the shell extension I’m thinking about, though. It has an "operate on entire folder" command which might start with a less-than-MAX_PATH directory name but in the course of its enumeration run across a greater-than-MAX_PATH filename, so giving a short directory name only postpones the inevitable.

  5. Andreas Häber says:

    "For files that go over MAX_PATH, does Windows even attempt to use the short file names as a substitute or anything?"

    If you have path > MAX_PATH you should name it with a "\?" prefix, and Windows will bypass a lot of checks (from my understanding) and let you use it. Note that you have to use UNICODE paths to use this, and that each component of the path has a limit of MAX_PATH characters.

    The shell uses MAX_PATH and can’t handle those, so it seems like you’re quite on your own if you choose to create something like this :).

    Please correct me if I’m wrong. I’ve got most of this information from <a href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/naming_a_file.asp">Naming a File</a> in MSDN Library.

  6. Mike Dunn says:

    On the subject of copying into OS-supplied buffers: lstrcpyn() is the call I always use because, unlike _tcsncpy(), it ensures that the destination buffer is zero-terminated. With _tcsncpy() you have to terminate it yourself just in case the source string length >= the destination buffer size.

    So if you have an LVN_GETDISPINFO handler, the safe way to return your text is:

    CString str = /*build a string…*/;

    lstrcpyn ( pDispinfo->item.pszText, str, pDispinfo->item.cchTextMax );

  7. Michael says:

    Mike: lstrcpyn is nice, but StringCchCopy is nice. Everyone writing C++ code on Windows should be using the string handling functions in strsafe.h.

    strsafe has replacements for strcat, strcpy, printf, explicit count of bytes and count of characters version, and will always null-terminate the destination buffer, even if the string is truncated.

    More info is available here: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsecure/html/strsafe.asp.

  8. Centaur says:

    Wouldn’t it be nice if the functions strcpy, strcat and sprintf were __declspec(deprecated) and #ifdef’ed out of the standard C library headers?

  9. Michael says:

    strsafe.h with the Platform SDK deprecates the old API’s by default.

  10. Mike Dunn says:

    Michael, thanks for the pointer to strsafe.h, I’d never looked at those functions before.

    However, since I do most of my work with WTL these days, I use WTL::CString (or occasionally STL’s string classes), so I don’t run into potential overflow problems with the CRT string functions.

    I also use a class I wrote that’s derived from CString which has wrappers for the shlwapi functions like PathAppend(). That makes it very easy to deal with building paths and filenames, again with no overflow problems.

  11. tom says:

    Raymond,

    When navigating to the Platform SDK page as a non-administrator user on Windows XP, there is this:

    —————————

    Microsoft Internet Explorer

    —————————

    ‘Versions’ is undefined

    —————————

    OK

    —————————

    Realizing this isn’t a product support blog, can I ask if you would you pass that along to your inside contacts?

  12. Raymond Chen says:

    Yeah I hit this too. The problem is that the Platform SDK download site uses an ActiveX control (for what reason I don’t know), and nonadmins cannot installed ActiveX controls.

  13. Here’s another reason why programs have to lie about buffer sizes. I posted this in vc.atl on Dec. 18, 2003 10:39 PM Redmond time and finally have replies of sorts. Nonetheless my program still has to lie about its buffer size, because otherwise it loses data.

    In Windows XP SP1 with VC++6 SP5, at least one CComBSTR constructor loses data.

    In the MSDN Library for October 2001, page CComBSTR::CComBSTR describes how CComBSTR constructors should work in Visual Studio 6. One form is:

    CComBSTR( int nSize, LPCSTR sz );

    […]

    nSize

    [in] The number of characters to copy from sz.

    […]

    The other four constructors set m_str to a copy of

    the specified string; however, if you pass a value

    for nSize, then only nSize characters will be copied,

    followed by a terminating null character.

    In fact, only (nSize – 1) characters get copied, followed by a terminating null character.

    For this bug there is enough source code distributed in Visual Studio 6 to prove that this bug is a bug.

    File ATLBASE.H contains this code:

    #ifndef OLE2ANSI

    […]

    CComBSTR(int nSize, LPCSTR sz)

    {

    m_str = A2WBSTR(sz, nSize);

    }

    File ATLCONV.H contains this code:

    inline BSTR A2WBSTR(LPCSTR lp, int nLen = -1)

    {

    […]

    int nConvertedLen = MultiByteToWideChar(_acp,

    0, lp,

    nLen, NULL, NULL)-1;

    The subtraction -1 at the end of that computation causes the CComBSTR constructor to lose the last byte of data that is passed to it.

    This bug can be fixed by deleting the -1 at the end of that computation.

    Whether this fix would generate other random unknown bugs in unknown programs that currently depend on this bug-ridden behavior, who knows. But the bad effects resulting from this bug are not particularly trival either. It needs fixing.

  14. Raymond Chen says:

    You’ll have to take that up with the ATL folks.

  15. ilm says:

    The strsafe.h functions are great. I’d use them everywhere, if only they were available for other compilers as well.

  16. asdf says:

    Raymond, you also may want to get "number of characters" replaced by "number of TCHARs" in LVITEM. But you’re right, I didn’t see "null terminated string" or the use of "psz" Hungarian notation due to lack of sleep.

    I can’t believe you guys are recomending the use of strsafe. Excessive syntax doesn’t make it automagically more "safer" than the C string functions no matter what msdn keeps telling you. I put "safer" in quotes because I can’t remember a time I made an error with any C string function.

  17. Raymond Chen says:

    I might point out that I am not aware of any function that asks you to fill a buffer (as opposed to one where you give it a buffer from which data will be extracted, like the ATL example above) where the buffer size doesn’t include the terminating null; i.e., I am not aware of any "please fill this buffer" function where you are actually allowed to overflow the buffer by one.

    Good point about "number of TCHARs". This was a shift in writing style at the Platform SDK started a few years ago (at my suggestion) – the old text would say "number of bytes if ANSI or number of characters if UNICODE" and I tried to nudge them towards "number of TCHARs". Even though I nouned a typedef, it struck me as clearer. I think all new documentation uses the "number of TCHARs" style but the old ones may still linger.

    asdf: Maybe you like strsafe maybe you don’t. But sometimes you just have to do things you don’t like. I can’t remember making any C string function errors either, but I use strsafe anyway because it’s just one less thing to worry about.

  18. Sam says:

    Huh? Why is anybody advocating using strsafe.h for C++ instead of something like std::string?

  19. Raymond Chen says:

    It’s great if you don’t have to interop with plain C (you can’t pass a std::string to GetWindowText, for example) and if you don’t mind being tied to a particular compiler (the OS needs to be not only compiler-agnostic but language agnostic too), and I’m not sure if you can use the C++ runtime libraries in your kernel driver.

  20. Andreas Magnusson says:

    Re: std::string vs. strsafe… Perhaps because it is really lousy when it comes to I18N support? It might be better now, but last time I was in VC6 there wasn’t a Unicode version of std::string *and supporting classes*. Haven’t looked in the new improved C++ lib in VC2003. I’m from Sweden so I take strange characters seriously.

  21. Henk Devos says:

    Re: "Wouldn’t it be nice if the functions strcpy, strcat and sprintf were __declspec(deprecated) and #ifdef’ed out of the standard C library headers?"

    Some of us like to write portable code. There is usually an operating system abstraction layer that is platform specific and does user interface stuff and the like, while most of the software is platform independent. For me, most of the code i write nowadays runs both on Mac and on Windows. In a previous life, most of the code i wrote worked both on Windows and on an embedded system.

    This means i can’t use these safe functions in most of my code.

    Re: "you can’t pass a std::string to GetWindowText, for example"

    Use the c_str member function.

  22. Henk Devos says:

    Oops sorry i was wrong about the c_str, GetWindowText is a function that modifies the string…

  23. Tim Robinson says:

    You used to be able to cheat and do this:

    std::string str;

    int length;

    length = GetWindowTextLength(hwnd);

    str.resize(length + 1);

    GetWindowText(hwnd, str.begin(), length + 1);

    str.resize(length);

    But this relied on VC++ 6’s definition of std::string::iterator being the same as char*.

    You could use a vector<char> instead of a string, then do return std::string(vector.begin(), vector.end()).

  24. asdf says:

    I think the string/vector implementation is allowed to add more stuff to the elements and have different (but fixed) padding between them as long as the interface and complexity requirements are met, so this might not work for all STL implementations (I know RogueWave’s does weird stuff like this [for debugging and stricter conformance issues] if you set the right #defines).

  25. Mike Dimmick says:

    The C++ standard says something like ‘vector is intended as a direct replacement for a C-style array’, IIRC.

    A standards-compliant C++ library must basically implement vector<char> as a char[] – it must have the same alignment requirements (which for character types means that the array must be a contiguous sequence of bytes).

  26. runtime says:

    Raymond, what exactly does the ‘T’ in TCHAR stand for? The header files and MSDN docs never really say. I’ve debated this with co-workers and no one really knows. T = Text?

  27. Mike Dunn says:

    I would guess "text" because of the macro TEXT() which does the same thing as _T()

    And in one of those unintentionally funny moments, _T has other equivalents: __T(), _TEXT() and __TEXT(). Looks like some people couldn’t agree on a name ;)

  28. Catatonic says:

    Andreas: Visual C++ 6 has the std::wstring class for Unicode. Is there something wrong with it that I’m not aware of?

  29. Shane King says:

    "But this relied on VC++ 6’s definition of std::string::iterator being the same as char*."

    As you mention, it’s better to use a vector<char>. If you use

    &*vector.begin()

    you’re guaranteed to get a pointer to the first character of a contiguous buffer on any standards compliant compiler and library, no matter what type it declares its iterators as.

    This assumption probably holds true for the string class on any compiler you care to name too, but I don’t believe that the standard makes any promises.

    One thing to note is that if performance really matters, the std::vector based approach isn’t the way to go, as each element is initialized to 0. In that case, it might be best to use a char[] wrapped in a smart pointer from a library like boost.

  30. asdf says:

    I don’t think vector is guaranteed to be contiguous either. There was some discussion about that in 1998/1999 (search google groups) but I never followed the outcome. Even if it was guaranteed, an implementation is still free to do something like store it in an array but index the elements from the reverse end of it.

    Use &vector.front() instead of &*vector.begin(). I’m not so sure the latter is even required to support the & operator after the dereference. Plus you’ll save on the construction of a vector::iterator.

  31. brian says:

    re: vector,

    My impression from c.l.c++.m is that the explicit requirement of contiguous storage was accidentally left out of the standard, but that a Defect Report has been files and it will be included in the next version. Also, there are no known implementations that don’t already comply.

    std::string was intentionally not required to use contiguous storage. I can’t name any implementations that take advantage of the flexibility, but I believe they exist. The pointer returned by c_str() is of course required to hold its data contiguously, but it’s not required that the storage be permanent – it could just be a temporary copy, so that changes to it are not reflected by the contents of the string. That’s why it’s const.

  32. Shane King says:

    Yeah that’s correct, it’s in the defect report, not the standard.

    &*vector.begin() is legal for all stl containers according to Herb Sutter, so I’m inclined to believe it is. That said, using .front() is effectively the same thing, and may be more efficient (although most implementations would optimise them to the same thing I’d think).

    I have no idea why I use begin instead of front in light of this. Old habits die hard I guess. ;)

  33. "The problem is that the Platform SDK download site uses an ActiveX control (for what reason I don’t know), and nonadmins cannot installed ActiveX controls."

    Actally, nonadmins *can* install ActiveX controls on Windows 2000 and XP, if the controls are written properly.

    For a nonadmin install, the control simply needs to register itself under HKEY_CURRENT_USERSOFTWAREClasses instead of HKEY_LOCAL_MACHINESOFTWAREClasses or HKEY_CLASSES_ROOT. It also needs to install its executables in a directory the user can write to, e.g. in the user’s profile.

    2000 and XP treat HKEY_CLASSES_ROOT specially to make this work. When you write to HKEY_CLASSES_ROOT, you’re actually writing to HKEY_LOCAL_MACHINESOFTWAREClasses. But when you *read* from HKEY_CLASSES_ROOT, Windows first looks in HKEY_LOCAL_MACHINESOFTWAREClasses and then in HKEY_CURRENT_USERSOFTWAREClasses. So code that expects to get COM registration data from HKEY_CLASSES_ROOT works whether the data is actually stored in HKEY_LOCAL_MACHINE or HKEY_CURRENT_USER.

  34. The tragedy of the commons.

Comments are closed.