It is not unreasonable to expect uninitialized garbage to change at any time, you don’t need to ask for an explanation

A customer admitted that they had a bug in their code:

#define UNICODE
#define _UNICODE
#include <windows.h>

// error checking removed for expository purposes

// code that writes out the data
RegSetValueEx(hkey, pszValue, 0, REG_SZ, (const BYTE *)pszData,
              _tcslen(pszData) * sizeof(TCHAR) + 1);

// code that reads the data
DWORD dwType, cbData;
RegQueryValueEx(hkey, pszValue, NULL, &dwType, NULL, &cbData);
TCHAR *pszData = new TCHAR[cbData / sizeof(TCHAR)];
RegQueryValueEx(hkey, pszValue, NULL, &dwType, pszData, &cbData);

One bug in the above code is in the final parameter passed to Reg­Set­Value­Ex: It's supposed to be the count in bytes, but the calculation appends only one byte for the terminating null instead of a full TCHAR. In other words, it should be

RegSetValueEx(hkey, pszValue, 0, REG_SZ, (const BYTE *)pszData,
              _tcslen(pszData) * sizeof(TCHAR) + sizeof(TCHAR));

For concreteness, let's say the original string was five TCHARs in length, not counting the terminating null. Therefore, the correct buffer size is 12 bytes, but they passed only 11 to Reg­Set­Value­Ex.

This error is compounded in the code that reads the value back: The code happily divides cbData / sizeof(TCHAR) without checking that the division is even. In our example, the call returns a length of 11 bytes. They divide by sizeof(TCHAR) (which is 2, since the code is compiled as Unicode), leaving 5 (remainder discarded), causing them to allocate a 5-TCHAR buffer.

That error would have been okay by itself except for another error, which is calling Reg­Query­Value­Ex a second time with an invalid buffer size: The cbData variable remains the original value of 11 even though they allocated only 10 bytes. The subsequent Reg­Query­Value­Ex call reads 11 bytes into a 10-byte buffer.

The customer conceded that the code that writes the value is buggy, but points out that the code "worked" on Windows XP, in the sense that the string read back from the registry was correct. But Windows Vista "broke" their program, because the string read back now contained garbage at the end. Instead of returning "Hello", it returned "HelloЀ╅۞". The customer wanted to know what change to Windows Vista broke their program.

The change to Windows Vista that broke their program is known as "luck running out." The program contained three bugs, which combined to form a heap buffer write overflow. The uninitialized garbage at the end of the heap block they allocated happened to be zero on Windows XP due to a coincidence in the way their program allocated and freed memory. Consequently, when the data was read from the registry, the "string" ended in a single null byte instead of two. The extra null byte that "happened to be there already" combined with the single null byte read from the registry to form a proper null terminator.

When run on Windows Vista, that happy coincidence no longer took place, and the uninitialized garbage was nonzero, resulting in the subsequent attempt to use the string to read past the end of the buffer and pick up heap garbage. (Yay, bug number four: read overflow.) Why was the uninitialized garbage different?

It's different because there was nothing forcing it to be the same. The internals of the heap manager change all the time. (Look-aside lists, low fragmentation heap, and fault-tolerant heap are just a few recent examples.) Any of these changes will result in heap memory being used and reused differently. Plus, changes in other parts of Windows may have allocated and freed memory differently between Windows XP and Windows Vista. Heck, the program itself may have allocated and freed memory differently due to the change in operating system. (For one thing, the length of the string "Windows Vista" is different from the length of the string "Windows XP".)

Uninitialized garbage will contain unpredictable values. There's no point asking why you got a different unpredictable value this time.

Comments (21)
  1. SimonRev says:

    It is amazing the number of C questions that can be answered: You engaged in undefined behavior.  One possible value of undefined is "It works as expected." Another is "It doesn't."

    Nitpickers corner: Even if coding in C++, anytime you use the Windows API, you are engaging in C programming because Windows uses a C API and very few people I have seen bother to do full on RAII around Windows API calls.

  2. John says:

    "Uninitialized garbage will contain unpredictable values. There's no point asking why you got a different unpredictable value this time."

    True, but Microsoft has written compatibility shims for worse things in the past. I guess this guy just wasn't important enough.

    [Whether a compatibility shim exists for this issue is independent of the question "Why did this change?", and "Why did this change?" was the topic for today, not "Can we cajole the heap into behaving like it used to?" -Raymond]
  3. Tanveer Badar says:

    I always want to complain when something stops exhibiting normal levels of abnormalness. There's a limit to one's tolerance.

  4. Mathieu Garstecki says:

    I have found that many developers around me translate "undefined" as "random behavior", as in "will not behave the same in each run". I think that's why so many people rely on implementation details: it works every time for them so "obviously" it is not undefined !

    I try to correct this misconception every time I encounter it, but it is very common unfortunately.

  5. Joshua says:

    Sometimes I think everybody would be a lot better off if at some point Microsoft decided to shrug their shoulders and sell (but not support) every version of every Microsoft product that ever existed for something like its market price near the time of its obsolescence.

  6. Guillaume says:

    Raymond, I read and love your blog. And this quote is the best : The change to Windows Vista that broke their program is known as "luck running out."

    I don't about code reuse, but I known there's going to be a lot of quote reuse !

  7. NB says:

    @SimonRev: No need for nitpicking, C++ also comes with undefined behavior!

  8. WhoDoesThis says:

    The correct resolution for this would have been for the customer to fix their bug in both the XP and Vista versions of their software and then admit the mistake and move on. To place blame on Microsoft is garbage.

    Too bad names aren't named, I would really like to stop supporting development houses that engage in this type of behavior (it was Microsoft's Fault our software is broken!). I understand your blog ground rules that forbid you from doing this, but one would hope that there is a site out there that would call these people out by name.

  9. WendyHome says:

    For a moment I thought the title was an extract from a US Home Owners Association set of rules and regulations…

  10. PhilW says:

    Maybe the customer had that attitude "if it isn't broke, don't fix it" on XP, since it apparently worked. The proper attitude should be "if it isn't broke, you haven't looked hard enough", and that would have prompted them to correct it before they got burnt on Vista.

  11. Jonathan says:

    PhilW: Specifically, "if it isn't broke, use AppVerifier to break it". AppVerifier would've detected this problem right away.

  12. Adam Rosenfield says:

    @SimonRev: I think Bjarne Stroustrup said it best: "C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do, it blows away your whole leg."

  13. JohnL says:

    I'm always wondering why the National Lottery never seems to randomly select the same numbers I did.  I guess it must be a bug in their ball selection machines?

  14. Anonymous Coward says:

    So what if Bjarne Stroustrup said it, it's a stupid argument from authority. Besides, in its original context is was probably meant as a joking poke at something particular, that no one can find again since if you look up the phrase now you only get inane quotation websites.

  15. SimonRev says:

    @NB, Yeah I know.  But the recent MS meme has been that *modern* C++ is safe. Which is true — if you use modern techniques and ignore the C-with-classes techniques it is quite easy to avoid undefined behavior in C++.

    Unfortunately, I find that a lot of that flies out the window when you have to interact with a C API.  People forget all the lessons they have learned for the past 15 years.

  16. pete.d says:

    I'd prefer "(_tcslen(pszData) + 1) * sizeof(TCHAR)" also.  Seems like a better expression of the code's intent.

    I'm amused, but not really that surprised, that anyone would after learning their code had these bugs in it, still complain that they were lucky enough in one OS version for the bug to not hurt them, but not in some other version.

    It's the classic "always someone else's fault" attitude.  All too common in so many facets of daily life.  :(

  17. Dave says:

    Here's a fifth bug in the code: Apparently the RegQueryValueEx() function can never fail (unless error checking code was removed from the sample that was posted to keep it simple).

    (That's not bad, that's more than one serious bug per line of code.  A friend of mine once had to security-audit code with those characteristics.  His recommendation was "the only way to secure this code is to run it in an isolated sandbox, since they can churn out buggy crap far faster than we can ever fix it").

  18. Damien says:

    @Dave – I'm guessing you missed the comment that frequently appears at the top of code samples here (as it does on today's):

    // error checking removed for expository purposes

  19. Personally, I would prefer (_tcslen(pszData) + 1) * sizeof(TCHAR) to _tcslen(pszData) * sizeof(TCHAR) + sizeof(TCHAR).

  20. Dave says:

    @Damien: Code comments?  I've heard of those… :-).  In this case I'll attribute it to inattentional blindness,…/Inattentional_blindness.

  21. Zan Lynx says:

    Undefined behavior like this is why building on different systems and architectures is so helpful. Windows, OSX or Linux; x86, x86_64, ARM, PPC and Itanium; GCC, MSVC, ICC combinations for example.

    It wouldn't have helped this guy, because the code in question only runs on Windows, but even then he might have detected the problem if he'd tried it on Windows 2000 during development.

Comments are closed.