Undefined behavior can result in time travel (among other things, but time travel is the funkiest)


The C and C++ languages are notorious for the very large section of the map labeled "here be dragons", or more formally, undefined behavior.

When undefined behavior is invoked, anything is possible. For example, a variable can be both true and false. John Regehr has a list of interesting examples, as well as some winners of the ensuing contest.
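For instance, here is a minimal sketch (invented for illustration, not one of Regehr's examples) of how a variable can appear to be both true and false: reading an uninitialized variable is undefined behavior, so the compiler might evaluate the two tests from different sources, and both branches can end up executing.

#include <stdio.h>

void schroedinger(void)
{
    int b; // never initialized: reading it is undefined behavior
    if (b)  printf("b is true\n");  // the compiler might read b from a register here...
    if (!b) printf("b is false\n"); // ...and from memory (or assume whatever it likes) here
}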

Consider the following function:

int table[4];
bool exists_in_table(int v)
{
    for (int i = 0; i <= 4; i++) {
        if (table[i] == v) return true;
    }
    return false;
}

What does this have to do with time travel, you ask? Hang on, impatient one.

First of all, you might notice the off-by-one error in the loop control. The result is that the function reads one past the end of the table array before giving up. A classical compiler wouldn't particularly care. It would just generate the code to read the out-of-bounds array element (despite the fact that doing so is a violation of the language rules), and it would return true if the memory one past the end of the array happened to match.

A post-classical compiler, on the other hand, might perform the following analysis:

  • The first four times through the loop, the function might return true.

  • When i is 4, the code performs undefined behavior. Since undefined behavior lets me do anything I want, I can totally ignore that case and proceed on the assumption that i is never 4. (If the assumption is violated, then something unpredictable happens, but that's okay, because undefined behavior grants me permission to be unpredictable.)

  • The case where i is 5 never occurs, because in order to get there, I first have to get through the case where i is 4, which I have already assumed cannot happen.

  • Therefore, all legal code paths return true.

As a result, a post-classical compiler can optimize the function to

bool exists_in_table(int v)
{
    return true;
}

Okay, so that's already kind of weird. A function got optimized to basically nothing due to undefined behavior. Note that even if the value isn't in the table (not even in the illegal-to-access fifth element), the function will still return true.

Now we can take this post-classical behavior one step further: Since the compiler can assume that undefined behavior never occurs (because if it did, then the compiler is allowed to do anything it wants), the compiler can use undefined behavior to guide optimizations.

int value_or_fallback(int *p)
{
 return p ? *p : 42;
}

The above function accepts a pointer to an integer and either returns the pointed-to value or (if the pointer is null) returns the fallback value 42. So far so good.

Let's add a line of debugging to the function.

int value_or_fallback(int *p)
{
 printf("The value of *p is %d\n", *p);
 return p ? *p : 42;
}

This new line introduces a bug: It dereferences the pointer p without checking if it is null. This tiny bug actually has wide-ranging consequences. A post-classical compiler will optimize the function to

int value_or_fallback(int *p)
{
 printf("The value of *p is %d\n", *p);
 return *p;
}

because it observes that the null pointer check is no longer needed: If the pointer were null, then the printf already engaged in undefined behavior, so the compiler is allowed to do anything in the case the pointer is null (including acting as if it weren't).

Okay, so that's not too surprising. That may even be an optimization you expect from a compiler. (For example, if the ternary operator was hidden inside a macro, you would have expected the compiler to remove the test that is provably false.)

But a post-classical compiler can now use this buggy function to start doing time travel.

void unwitting(bool door_is_open)
{
 if (door_is_open) {
  walk_on_in();
 } else {
  ring_bell();

  // wait for the door to open using the fallback value
  fallback = value_or_fallback(nullptr);
  wait_for_door_to_open(fallback);
 }
}

A post-classical compiler can optimize this entire function to

void unwitting(bool door_is_open)
{
 walk_on_in();
}

Huh?

The compiler observed that the call value_or_fallback(nullptr) invokes undefined behavior on all code paths. Propagating this analysis backward, the compiler then observes that if door_is_open is false, then the else branch invokes undefined behavior on all code paths. Therefore, the entire else branch can be treated as unreachable.²

Okay, now here comes the time travel:

void keep_checking_door()
{
 for (;;) {
  printf("Is the door open? ");
  fflush(stdout);
  char response;
  if (scanf("%c", &response) != 1) return;
  bool door_is_open = response == 'Y';
  unwitting(door_is_open);
 }
}

A post-modern compiler may propagate the analysis that "if door_is_open is false, then the behavior is undefined" and rewrite this function to

void keep_checking_door()
{
 for (;;) {
  printf("Is the door open? ");
  fflush(stdout);
  char response;
  if (scanf("%c", &response) != 1) return;
  bool door_is_open = response == 'Y';
  if (!door_is_open) abort();
  walk_on_in();
 }
}

Observe that even though the original code rang the bell before crashing, the rewritten function skips over ringing the bell and just crashes immediately. You might say that the compiler went back in time and unrung the bell.

This "going back in time" is possible even for objects with external visibility like files, because the standard allows for anything at all to happen when undefined behavior is encountered. And that includes hopping in a time machine and pretending you never called fwrite.

Even if you claim that the compiler is not allowed to perform time travel,¹ it's still possible to see earlier operations become undone. For example, it's possible that the undefined operation resulted in the file buffers being corrupted, so the data never actually got written. Even if the buffers were flushed, the undefined operation may have resulted in a call to ftruncate to logically remove the data you just wrote. Or it may have resulted in a Delete­File to delete the file you thought you had created.

All of these behaviors have the same observable effect, namely that the earlier action appears not to have occurred. Whether they actually occurred and were reversed or never occurred at all is moot from a compiler-theoretic point of view.

The compiler may as well have propagated the effect of the undefined operation backward in time.

¹ For the record, the standard explicitly permits time travel in the face of undefined behavior:

However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

(Emphasis mine.)

² Another way of looking at this transformation is that the compiler saw that the else branch invokes undefined behavior on all code paths, so it rewrote the code as

void unwitting(bool door_is_open)
{
 if (door_is_open) {
  walk_on_in();
 } else {
  walk_on_in();
 }
}

taking advantage of the rule that undefined behavior allows anything to happen, so in this case, it decided that "anything" was "calling walk_on_in by mistake."

Bonus chatter: Note that there are some categories of undefined behavior which may not be obvious. For example, dereferencing a null pointer is undefined behavior even if you try to counteract the dereference before it does anything dangerous.

int *p = nullptr;
int& i = *p;
foo(&i); // undefined

You might think that the & and the * cancel out and the result is as if you had written foo(p), but the fact that you created a reference to a nonexistent object, even if you never carried through on it, invokes undefined behavior (§8.5.3(1)).

Related reading: What Every C Programmer Should Know About Undefined Behavior, Part 1, Part 2, Part 3.

Update: Broke the &* into two lines because it is the lone * that is the problem.

Comments (73)
  1. Karellen says:

    "Here be dragons"? What happened to the nasal demons?

    catb.org/…/nasal-demons.html

  2. VinDuv says:

    […] Therefore, the entire else branch can be treated as unreachable.² […] Even if you claim that the compiler is not allowed to perform time travel,¹ […]

    I see what you did there.

  3. Liquorice says:

    What do we want? Undefined behavior! When do we want it? It's irrelevant.

  4. zd says:

    Raymond: the C99 standard explicitly states: "83) Thus, &*E is equivalent to E (even if E is a null pointer)" (6.5.3.3, page 89, not sure about C89 though)

    I don't know where you got your knowledge?

  5. Joshua says:

The omit-null-check-after-deref optimization annoys me if there's no option to turn it off, as kernel mode code might have to handle a null passed from user space to a system call.

    It's too late in general to unring that bell. Windows got lucky so few programs depended on it.

  6. zd says:

    I think I read somewhere though that in C++, &*E is indeed undefined behavior and not equivalent to E. So the bonus chatter may be true for C++. Someone confirm or disprove it please, because I know nothing about C++

  7. Pete says:

    Something something c++ operator overloading something something, I'd imagine. After all, the dereference operator might have side-effects! (Shudders)

    [Assume no operator overloading. (I can't believe I had to say that.) -Raymond]
  8. Pete says:

    Raymond, I meant that operator overloading could be one of the reasons that &*E is not necessarily equal to E in C++, where it would necessarily be in C.

    [I think you're missing the point of the article. -Raymond]
  9. Pete says:

    Sorry Raymond, I wasn't talking about the article, I was trying to provide an explanation to the comment above mine about the behavior in c++ vs c.  I see now that they're all off topic, including mine.

  10. mikeb says:

    The undefined behavior explained in the `value_or_fallback()` example resulted in an actual Linux kernel exploit a while back (the TUN/TAP driver exploit).  

    See lwn.net/…/342330

  11. SimonRev says:

    A good example of the undefined behavior is if you do something like

    std::vector<int> vect;
    int *x = &vect[0];

    I got bit by that before vector had a .data() method.  My initial expectation was that x would contain NULL.  Fortunately no time travel was invoked and the program simply crashed.

  12. zd says:

    @Pete, No, they are not off topic. Raymond claims that &*nullptr is undefined behavior in both C and C++, but I pointed out it is not undefined behavior in C, but I'm not sure about C++. You pointed out that it is undefined behavior in C++ for the reason you gave.

  13. Anders Munch says:

    Nit: In the unwitting function, the optimisation can only be done if the compiler can prove that ring_bell terminates or invokes undefined behaviour.

  14. Kemp says:

    I would simplify the explanation for the first example to: When executing the loop, at some point i will equal 4. This invokes undefined behaviour due to accessing beyond the end of the array, allowing the function to return true. The final iteration of the loop will thus always return true, and so the loop can be optimised to a "return true" statement.

  15. Liquorice says:

    SimonRev: Maybe the undefined behavior invoked a time travel and modified everyone's memory, as if the program simply crashed? Wait, if a time travel is invoked and no one remembers it, does it make a real time travel?

  16. Crescens2k says:

    I always question this kind of compiler behaviour, especially when you see these kinds of results. Since these invocations of undefined behaviour are most likely unintentional, wouldn't it be better to find better ways of telling the user that the code is wrong, rather than optimising it away and causing lots of confusion when things break?

    [Check out Part 3 of the Related Reading. -Raymond]
  17. VinDuv says:

    @David: Are you complaining that compilers turn buggy code into buggier code?

  18. Kay says:

    Can you please give an example of "A post-modern compiler" that does such crazy optimizations exploiting undefined behaviors? Is this theoretical reasoning? Or can it really happen?

  19. John Doe says:

    @VinDuv, compilers certainly make buggy code hard to de-bug.

  20. mikeb says:

    The `exists_in_table()` example (the first one) is a variation of a paradox that a professor in one of my university classes presented way back when – before anyone talked about undefined behavior on computers.

    The paradox was presented as:

    A teacher of a class on logic announces on a Friday that sometime during the next week a surprise quiz will be given, and that no one will know before the quiz is given when it will be. The students deduce that if the quiz hasn't been given by Friday, then it would have to be given on Friday. But that couldn't happen, because they would know before Friday's class that the quiz would be given that day. Since Friday is eliminated as a candidate for quiz day, induction eliminates all the other days of the week, and the conclusion is that the quiz can't be given. The students spend the weekend partying instead of studying.

    The students show up for class on Monday, and find that they have to take a surprise quiz.

  21. mikeb says:

    @Kay:

    Follow the links Raymond gave to John Regehr's articles on real world undefined behavior.  These aren't just theoretical.

  22. Anon says:

    @John Doe

    s/hard/nigh-unto-impossible/

  23. Shafik Yaghmour says:

    This StackOverflow question, where gcc turns a loop with four iterations into an infinite loop due to signed integer overflow, is one of the best I have seen:

     stackoverflow.com/…/why-does-this-code-output-more-than-4-lines
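
    (A hedged reconstruction from memory of the kind of loop in that question; the details may differ. Once i reaches 3, the multiplication overflows a 32-bit int, and a compiler that assumes signed overflow never happens can transform the loop so that its exit condition is never satisfied.)

    #include <stdio.h>

    int main(void)
    {
        // i * 1000000000 overflows int at i == 3; assuming that never happens,
        // the optimizer may strength-reduce the multiplication into a counter
        // whose exit comparison can never be satisfied, so the loop keeps printing.
        for (int i = 0; i < 4; i++) {
            printf("%d\n", i * 1000000000);
        }
        return 0;
    }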

  24. Edwin says:

    "You might think that the &* cancel out and the result is as if you had written foo(p), but the fact that you performed the dereference at all (even if you never carried through on it) invokes undefined behavior."

    Is this really true? The C99 standard says:

    "The unary & operator yields the address of its operand. If the operand has type ‘‘type’’,

    the result has type ‘‘pointer to type’’. If the operand is the result of a unary * operator,

    neither that operator nor the & operator is evaluated and the result is as if both were

    omitted, except that the constraints on the operators still apply and the result is not an

    lvalue."

    (the constraints are basically that the * still has to be used on a pointer, so you can't write &*1)

    [Point taken. I've updated the example. This is what happens when I try to condense an example and go too far. -Raymond]
  25. noMAD says:

    I think your statement `When i is 4, the code performs undefined behavior. Since undefined behavior lets me do anything I want, I can totally ignore that case and proceed on the assumption that i is never 4.` is paradoxical. If you are going to ignore the case and proceed on the assumption that i is never 4, then you would never have reached the state of undefined behavior in the first place where you would want to not ignore the case.

  26. Chris says:

    About the whole &*p thing, it is indeed legal in C++: stackoverflow.com/…/dereferencing-an-invalid-pointer-then-taking-the-address-of-the-result

    It becomes undefined behaviour when there is an lvalue-to-rvalue conversion required for *p.

  27. David says:

    This is great. Post-classical compilers are very good at producing code that very efficiently does the wrong* thing. Progress!

    * allowed by the standard but obviously wrong anyway.

  28. Evan says:

    @Kay: "Can you please give an example of "A post-modern compiler" that does such crazy optimizations exploiting undefined behaviors?"

    Yes, there have been several security holes "caused" by compiler optimizations. This can be as simple as a "char* password = malloc(size); …; memset(password, '\0', size); free(password);" where the memset is optimized away, to much more complex scenarios. "What Every C Programmer Should Know About Undefined Behavior" Part 2 (and I think somewhere in Regehr's blogs) shows a simplified example of a real, exploitable security hole caused by the fact that the Linux kernel dereferenced a pointer before checking for null — the compiler made optimizations based on the dereference that said "since this pointer is dereferenced, it cannot be null*, hence this check for null is redundant and can be removed." But removing the null check left the function exploitable. (The correct fix was presumably to move the dereference after the check.)

    * More precisely, the compiler says "since this pointer is dereferenced, I can do whatever the heck I want when it is NULL, including bypass the null check."
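
    (To sketch the shape of that kernel bug with invented names, not the actual TUN/TAP code: the dereference comes before the check, so the compiler is entitled to assume the pointer is non-null and delete the check.)

    struct device { int flags; };

    int get_flags(struct device *dev)
    {
        int flags = dev->flags;   // dereference first: dev is now assumed non-null
        if (dev == NULL)          // ...so this check may be removed as redundant
            return -1;
        return flags;
    }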

  29. Evan says:

    @David: "This is great. Post-classical compilers are very good at producing code that very efficiently does the wrong* thing. Progress!"

    The problem is, one person's idea of "incorrect code" is another person's idea of an optimization.

    Take the following program:

     void foo() {
         int x;
         x = 5;
     }

     void bar() {
         int y;
         printf("%d\n", y);
     }

     int main() {
         foo();
         bar();
         return 0;
     }

    It's certainly conceivable that you could find someone (probably from 1975) who would argue that program REALLY should always print 5, because after all, bar's stack frame will just overlay foo's, and that something else is obviously incorrect. You can probably find programs that do what I'd call a moral equivalent of that, expecting things to work out, today. But hopefully this example seems pretty ridiculous, because the compiler can do things like register allocation; maybe y doesn't even *have* a memory address!

    Similarly, take the optimization Raymond presented of

     int value_or_fallback(int *p)
     {
       printf("The value of *p is %d\n", *p);
       return p ? *p : 42; // optimized to 'return *p'
     }

    I'm actually very happy with the compiler doing that. It makes perfect sense to me! In this example the benefit is not so clear, but imagine that the different lines are from different functions that were inlined, and the conditional was written because the function was more general than it needs to be for the context in which it is used.

  30. Rick C says:

    @Anders Munch: that's not a nit, that's the entire point:  the compiler *has* determined that undefined behavior occurs on all code paths.  Obviously if it cannot do that, none of the rest of the article follows.

  31. Crescens2k says:

    [Check out Part 3 of the Related Reading. -Raymond]

    Yea, well aware of that. But I can always hope that one day, instead of finding optimisation opportunities based off of incorrect code, they spend that time to figure out how to detect and warn about the code itself.

    [As noted in Part 3, a big problem is avoiding false positives in such reports. Consider:
    int *GetFroobie() { static int froobie; return &froobie; }
    int GetValueOrDefault(int *p) { return p ? *p : 42; }
    char *GetColor(int *p) { return colorTable[GetValueOrDefault(p)]; }
    char *UpdateFroobieAndGetColor(int v) { int *p = GetFroobie(); *p = v; return GetColor(p); }
    Do you want the compiler to warn you, "Possible bug in GetValueOrDefault: testing for "p != nullptr" is redundant because p is dereferenced in UpdateFroobieAndGetColor prior to the test"? -Raymond
    ]
  32. Myria says:

    I'm curious what the official stance is on Visual C++'s support for signed integer overflow.  By the C/C++ standard, signed integer overflow is undefined, but it may be defined in a given implementation.  In C++11, std::numeric_limits<signed T>::is_modulo was reworded to clarify that is_modulo is true if and only if the implementation considers signed integer overflow to be fully defined, and that the definition is to wrap modulo whatever power of 2, as if it were unsigned.

    Visual C++ sets is_modulo to true, and experimentally I've never been able to get VC's optimizer to assume that signed integers can't overflow.  VC acts much as GCC with -fwrapv does, rather than GCC without -fwrapv.

    However, I did find a case in which Visual C++'s optimizer handles signed integer overflow incorrectly.  When reading about a Visual C++ bug that's new to 2013's Update 2 on Microsoft Connect, I noticed that the Update 1 code, from before the bug, doesn't handle signed integer overflow correctly.

    I've only tried on x64, but this code on VC 2013 Update 2 breaks spectacularly (Connect bug link at end of post):

    void func (int *b, int n)
    {
     for (int i = 0; i < n; i++)
       b[i * (n + 1)] = 1;
    }

    In Update 1, it works much more properly.  However, there's a very subtle issue: if "n" is exactly 0x7FFFFFFF, VC's optimizer acts as if the "* (n + 1)" expression means "* 0x0000000080000000LL" rather than the correct "* 0xFFFFFFFF80000000LL".  This obviously is probably not what is desired, and it's very unlikely that you'd pass such a large array, but it does show that there is some issue with signed integer overflow in the optimizer.

    Besides the Update 2 bug, the Update 1 version is also a bug in Visual C++: either the signed integer overflow handling is wrong, or the definition of std::numeric_limits<int>::is_modulo is wrong.

    connect.microsoft.com/…/wrong-loop-optimization

  33. Evan says:

    @noMAD: "If you are going to ignore the case and proceed on the assumption that i is never 4 then you would never have reached the state of undefined behavior in the first place where you would want to not ignore the case."

    That's where the time travel comes into play. The generated code is a version with a naive implementation of the undefined behavior, except that at runtime it detects the UB, goes back in time, and does something else leading up to it. :-)

    (Obviously you should not take that description literally.)

    The compiler is not required to leave the undefined behavior in the program. It can do whatever it wants to it, provided that it preserves the observable semantics of runs that would *not* invoke undefined behavior in the abstract machine defined by the appropriate standard.

  34. Dzmitry says:

    So why then do C geeks consider Java inferior? In Java, unreachable code doesn't compile, and undefined behavior cannot exist.

  35. xor88 says:

    See stackoverflow.com/…/can-branches-with-undefined-behavior-be-assumed-unreachable-and-optimized-as-dea ("Can branches with undefined behavior be assumed unreachable and optimized as dead code?"). This post answers that question nicely.

  36. David says:

    @VinDuv: Are you complaining that compilers turn buggy code into buggier code?

    Yes. Yes I am. See, it turns what should have been a relatively simple problem with straightforward local (if any) consequences, into an impossible to diagnose "infects the whole program" nightmare with observable behaviour that essentially lies about what the problem is. In the name of speed, which is useless because the code is broken anyway.

  37. Muzer_ says:

    Through the links you posted, I'm most amazed that *loops can be assumed to terminate* if they have no side-effects in C++11. Even in cases like a function to test something that either halts or loops forever, a C++ compiler can assume it halts. Link (linked to from one of Raymond's links): blog.regehr.org/…/161
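
    (A small illustrative sketch, not taken from the linked post: the loop below has no side effects, so a C++11 compiler is allowed to assume it terminates and may compile the function as if it simply returned true, even for inputs where the loop would in fact run forever.)

    bool reaches_one(unsigned long long n)
    {
        while (n != 1) {                          // no I/O, volatile, or atomics inside
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        }
        return true;                              // possibly all that remains after optimization
    }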

  38. Neil says:

    [Do you want the compiler to warn you, "Possible bug in in GetValueOrDefault: testing for "p != nullptr" is redundant because p is dereferenced in UpdateFroobieAndGetColor prior to the test"? -Raymond]

    If UpdateFroobieAndGetColor can be shown to be the only caller to GetValueOrDefault, then yes, I want the compiler to warn me. (And the reverse is true if I remove the null-check from GetValueOrDefault and then call it from somewhere that might pass nullptr.)

  39. Joey says:

    Dzmitry: While Java has no UB in the actual language (in stark contrast to C/C++) there are still things undefined. One example I stumbled over was memory-mapping files, where creating a mapping beyond the end of the file is undefined (most likely due to differences in how different operating systems handle that case). But there *are* in fact undefined things in the Java standard lib and they are explicitly called out. The definition of undefined behaviour isn't as hostile as in C as well, granted.

    And C geeks consider Java inferior because its *performance* is in most cases worse than C's. They are willing to live with the plethora of land mines C puts before their feet because they need the features C is able to provide. Well, but then there are those who choose languages not by merits and requirements but more by hype or beliefs and you can't reason about those. (To me at least there are suitable uses for many different languages and I probably wouldn't try writing a CRUD web application in C.)

  40. T says:

    Are you sure that dereferencing a pointer without actually looking at the value is undefined behaviour? The following snippet is taken from stddef.h in VS 2012:

    #define offsetof(s,m)   (size_t)( (ptrdiff_t)&reinterpret_cast<const volatile char&>((((s *)0)->m)) )

    Also, when trying to access an array item out of array size bounds, is that also undefined behaviour if the array is part of the struct? It's a common technique to have an array of size 1 at the end of a struct and just allocate as much memory as you need its length to be. For example, taken out of WinBase.h:

    typedef struct _FILE_NAME_INFO {
       DWORD FileNameLength;
       WCHAR FileName[1];
    } FILE_NAME_INFO, *PFILE_NAME_INFO;

    Following your post, accessing FileName[5] is undefined, but it totally depends on how much memory is allocated. I would expect the compiler to actually look up the uninitialized memory in this or any other case.

  41. Crescens2k says:

    @Raymond:

    I totally agree, but maybe I'm being a bit optimistic in thinking that there should be a way to satisfy both demands now. For example, implementing a compiler option called /imwillingtoputupwiththeextratimeandfalsepositives so that it takes the extra compile time, disk space and memory to do these extra checks so that by the time the optimisations are being done, it has all the information needed.

    Well, there are already a lot of great tools to help figure these things out (obligatory product placement time, VC's /analyze option), I also know that the poor compiler devs would be the ones getting all of the complaints for these changes too.

    @Neil:

    The problem is, quite often the compiler doesn't have the information needed. The problem is that the compiler strips out a lot of information as it parses the code. The thing which would help is if the compiler could keep/store the extra information for the build, but there be dragons too. Developers always seem to prioritise build times, if you couple that together with the fact that the C++ standard keeps getting more complex then that has a negative impact on compilation times. So as Raymond was pointing out to me, the choice is between false positives or slower build times, and the general consensus is that both suck. (False positives because of options like /Wx and slower build times because that time could be spent doing more work).

    @T:

    This was covered near the start of the post comments. Originally Raymond had something similar in the bonus chatter section. As the C99 standard states, if & and * appear in the same statement, neither would be evaluated. Now, before the reinterpret_cast there is a &, and the ((s*)0)->m is essentially (*(s*)0).m, so with the address of operator, and without the casts, you have &*0.m. There is no invocation of undefined behaviour there. For the array, well, unless it is a class type that has an overloaded operator[], the standard defines Array[Elem] as *((Array)+(Elem)) and I don't think there is any mention of array bounds at all. So for the undefined behaviour, that would arise with the dereferencing of the array element, not with going past the end. This is where the code fragment in the sample and this structure are different. When you allocate it, it could be allocated as

    PFILE_NAME_INFO info = (PFILE_NAME_INFO)malloc(sizeof(FILE_NAME_INFO) + MAX_PATH - 1);
    GetFileInformationByHandleEx(handle, FileNameInfo, info, sizeof(FILE_NAME_INFO) + MAX_PATH - 1);

    So you can see from that, there is enough memory even with the array only being defined as length 1. On the other hand Raymond's example has

    int table[4];

    So right from the start you know that it has length 4, so table[4] will be the same as *(table + 4), and that dereferences memory that has not been explicitly allocated, and this is where the undefined behaviour is.

  42. voo says:

    @Neil: You may think you do, but in reality no you really don't. For example Python does reference counting for objects manually using Py_XDECREF/Py_XINCREF macros. Those are some general macros that check if their argument is zero before working on it. Now clearly those will be used in situations where the compiler can deduce that the argument will never be null – do you really want to generate a warning for each of those situations? You could use another macro version that doesn't do the null checks, but that's dangerous – and worst of all, since the compiler can do the deduction for you, it won't even give you a performance benefit.

    Now, warning about missing null checks? That's probably a good idea; there's still the chance of false positives, but in general those are going to be way lower than the opposite.

  43. @voo says:

    "Now clearly those will be used in situations where the compiler can deduce that the argument will never be null – do you really want to generate a warning for each of those situations?"

    Take Raymond's example with the array: WHEN the compiler is able to see an access to non-allocated memory, THEN it would be much better to produce a compile-time error (because that is clearly an error in the source code) than to generate crazy code. When the compiler cannot prove that, then it should of course generate neither an error nor the crazy code.

    If you talk about unreachable code in the true sense, deduced by compile-time evaluation of expressions, not over undefined behavior, then I'm with you: There should be no warning.

  44. Joshua says:

    Undefined behavior can and does exist in Java. It's just harder to find how to cause it.

  45. Hans says:

    @voo: Pretty much sums it up.

    But instead of a warning (which programmers tend to ignore) the compiler should actually _replace_ the source code, so that Raymond's example will list as:

    bool exists_in_table(int v)
    {
       return true;
    }

    It would then be no surprise that the generated code matches the source code ;-)

  46. voo says:

    @@voo: I think I misunderstood Neil's example. I agree that these kinds of optimizations, which rely on undefined behavior to generate optimized code, are at first glance annoying – if the compiler can deduce such a thing, why shouldn't it generate a warning instead? But having worked on compilers a bit, I find that generally those kinds of optimizations happen deep down the pipeline, where it's no longer apparent that the information was caused by undefined behavior and is a bug and not some valid optimization.

    llvm is doing much better than MSVC/gcc in general in this regard but there's still much to do.

    @Joshua: That I doubt, if you exclude JNI and co – since Java's sandbox was originally planned to make sure code is securely executed, having undefined behavior would be a gigantic problem. C#/CLR does have some, but I don't think the Java spec actually does. What examples do you have in mind?

  47. mk says:

    "This new line introduces a bug: It dereferences the pointer p without checking if it is null. This tiny bug actually has wide-ranging consequences. A post-classical compiler will optimize the function to"

    This is complete nonsense; the code works fine if p is never null, and the compiler can't break it when it isn't. With a claim so blazingly inane and incompetent, I stopped reading at that point.

  48. mk says:

    Argh, I misread … your optimization does return the right value when p is not null. Ignore my inane and incompetent comment … carry on.

  49. GWO says:

    I ran this entire article through an optimising English compiler.  The output was:

    "The behaviour of a program that has undefined behaviour is undefined."

  50. Crazy Town says:

    IF SO-CALLED "PROFESSIONALS" AND "SOFTWARE ENGINEERS" WOULD MAKE SO MANY *SIMPLE* ARITHMATIC COUNTING AND LOGIC ERRORS ETC IN THE FIRST PLACE WE WOULD NOT BE IN THESE SITUATION!!!

  51. Azarien says:

    @Karellen: "here be dragons" sounds cool. "nasal demons" sounds ugly.

  52. hm says:

    "The compiler observed that the call value_or_fallback(nullptr) invokes undefined behavior on all code paths. Propagating this analysis backward, the compiler then observes that if door_is_open is false, then the else branch invokes undefined behavior on all code paths. Therefore, the entire else branch can be treated as unreachable.² "

    The code of value_or_fallback() imposes an implicit  <ptr != NULL> constraint, which the compiler can act upon in a sensible way by eliminating the ternary operator.

    If value_or_fallback() is called with a pointer proven to be NULL, then this constraint is violated and the only sensible behavior of the compiler would be to raise an error.

    In this case (and also in the array example), if the compiler is confident about the code error (using an invalid array index, using an invalid pointer) what is the value in trying to optimize broken code? The compiler should refuse to compile it.

  53. Katie says:

    @hm

    Things like macros, template definitions and inline functions interact in all sorts of ways in a complicated enough project. There may be situations where an expansion ends up dereferencing a null pointer, but the developer knows that that branch of the expression will never be evaluated. The compiler should compile that code in that case. Although the code could be rewritten to avoid the issue it would make the code more complicated and would possibly duplicate logic.

  54. Hm says:

    "in all sorts of ways in a complicated enough project. There may be situations where an expansion ends up dereferencing a null pointer, but the developer knows that that branch of the expression will never be evaluated."

    If the code is complicated enough to fit your claim, how would the developer know for sure if the right or the wrong parts of such a complicated construct will be executed?

    I think this whole approach has no place in a programming tool that is meant to be used by humans.

  55. Katie says:

    Here's a simple example with a macro from the Linux kernel:

    #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))

    int foo()
    {
       return DMA_BIT_MASK(64);
    }

    A 64-bit shift is undefined, but the macro creates one in dead code.

  56. hm says:

    "64-bit shift is undefined"

    So what? This statement in itself is non-sensical. This is a completely artificial and counter-intuitive definition. Shift operations should always simply compute the expected result. What compiled code do you expect if you use this macro with a variable instead of a constant?

    [It may be an artificial and counter-intuitive definition, but it's also the law. [expr.shift] "The behavior is undefined if the right operand is negative or greater than or equal to the length in bits of the promoted left operand." (It's not artificial if you understand the CPU instruction set. Most architectures limit the number of positions that can be shifted in a single instruction.) -Raymond]
  57. Hm says:

    lwn.net/…/278137

    Very telling: "If you use GCC and its C frontend, you want performance, not security."

    So please please don't use GCC to compile an operating system.

  58. Deduplicator says:

    @Hm: So, you want Windows, Linux, Java, .Net, (fill in your platform of choice) to use three times the memory and run a tenth as fast, just so there are no undefined behaviors?

    There's a reason non-PoC Operating Systems, Programming platforms and many big projects are written in languages which are not strongly-typed and have undefined behavior: Performance.

  59. Joker_vD says:

    Remember: doing wrong things fast is more important and better for business than doing correct things slowly. You can always work around wrong behaviour with a slight performance hit (provided you can't stand the wrong behaviour, otherwise just let it be), while you can't optimize and speed up a correct but slow program.

  60. ariel says:

    The optimisation removing ring_bell is a little weird – ring_bell probably ends up invoking at least one system call, and the compiler shouldn't be able to prove they all return – and if ring_bell fails to return then the code is fine (in C11, ring_bell can be assumed to not diverge, but that's a different issue – a system call not returning != diverging, as the former can happen if, say, one of the system calls is _exit, or a futex_wait that is part of a deadlock, or a blocking read from an empty pipe, etc.).

  61. @Hm says:

    I expect that the compiler will use the target CPU's shift instruction, where the result of the shift can vary across architectures. It may break it into a few instructions to handle targets that don't support 63 bit shifts.

    However I also understand that it may realize that the variable will never be greater than 64 because of the undefined behavior and wrap it into a clever optimization based on that knowledge. I don't really care that the optimization may have strange results when it gets a value of 70 because that means my code is doing something wrong anyways. If my driver thinks it needs a 70-bit DMA mask in a 64-bit variable then why should I expect anything after that request to work right anyways?

  62. Katie says:

    @Hm

    That was me above… not sure why I put your name like that…

    I also just realized that your question implies that any time a shift operator is used with a variable the compiler must be able to verify that the argument to it will always be within a valid range, otherwise it should refuse to compile it because it would trigger undefined behavior in the language spec.

  63. hm says:

    @Katie: I omitted some text from my answer to your example:

    #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))

    First, as long as <n> is in the range 0 .. 64, there is no undefined behavior whatsoever, because the code path with the shift operation has the constraint that it will not be executed with <n> == 64. "1ULL  << 64" in itself is not undefined behavior until it is proven that this code will be executed.

    Second, imagine the following: Some driver module using this macro is changed. After several levels of inlining and macro expansion etc etc (as you said), the compiler can prove that this code fragment will be used with a constant (or variable) outside of the range 0..64, let's say, 128.

    What would you like more?

    (a) The compiler silently turns this undefined behavior into a random DMA bit mask.

    (b) The compiler emits a compile-time error, because it had seen your error.

    [But what if the code that passed 128 is itself dead in a way the compiler cannot detect? Now the compiler refuses to compile dead code. In a sense, intentionally executing undefined behavior is how you tell the compiler that code is dead. -Raymond]
  64. Katie says:

    (c) The compiler emits a warning letting me know that I have an expression with undefined behavior. This is what GCC does with default configuration.

    I selected this example because in some situations it does generate a compiler warning because of the "1ULL<<(64)" when using it with a value of 64. The compiler isn't always smart enough to recognize that it is dead code and suppress the warning. You're suggesting that in this case it should assume the programmer made a mistake and refuse to compile, even though in that situation there is no error and no matter what code it generates for that expression it will never actually run it.

    The warning brings the potential problem to the programmer's attention and allows them to evaluate the situation, and allows them to fix the bug in the case of it using 128, or ignore it in the case that they can see it is in dead code.

  65. Deduplicator says:

    @Joker: I'm not sure whether you are living up to your name there (too subtle for me then) or are actually serious (Thus demonstrating you didn't read the material or misunderstood it).

    @ariel: You are right, he actually didn't explicitly say that ring_bell provably returns (Doing system-calls is no valid indication for might never return though).

    @hm: The meat of the article is that the compiler uses Undefined Behavior as a hook to understanding what code can be called with which values, so it can properly exploit all those things which are "intuitively obvious" to a human reader, without having to solve the halting problem. Thus, you left out the real options:

    A: The compiler warns whenever you might have sinned by invoking UB, swamping you with spurious warnings and errors (Your choice: Allow only provable code or invent an un-consumer which distills the 1000 optimizations the compiler did to a human-readable diagnostic.).

    B: The compiler produces mediocre code (half as fast as your competition or even worse => You are done for) because you define all the corner-cases.

    C: The compiler augments its reasoning about your code by supposing you don't do such evil things and thus gets a much better view.

    Seems like you want A, but without false positives (and without missing anything), as well as C. Not gonna happen, the sufficiently smart compiler is still a pipe-dream.

  66. Katie says:

    @Deduplicator

    And even case B isn't very reasonable. The compiler would have to define the behavior for every case of undefined behavior, essentially creating a new language. And then bugs would still happen – if a value of 128 made it into my DMA macro the code would go along and do whatever the compiler says a shift should do in that case, but whatever value it returns still won't make sense in the context of code written with a maximum of 64 in mind.

  67. . says:

    "Windows, Linux, Java, .Net, (fill in your platform of choice) to use three times the memory and run a tenth as fast, just so there are no undefined behaviors?"

    I'm sure you overestimate these factors by a very large scale.

    The point is that reliable operation is much more important than 0.1 (or 5%, if you like) performance gains. Remember that we talk here about "general use" software, not about high-performance computing like weather simulation.

    The compiler folks deny this responsibility for weak reasons.

  68. ariel says:

    @Deduplicator

    I'm quite sure that no compiler tries to analyse whether system calls return (gcc, at least, doesn't: system calls are either within cross-module calls or asm statements, and it has no always_returns attribute)  – that would be way too complicated (in Linux, for example, there could be a seccomp-bpf filter that SIGKILLs your program when it tries to make a specific system call with specific arguments – and IIRC the filter could be imposed by the process that exec(2)'d you so even WPO won't help you).

  69. Deduplicator says:

    @Katie: I never said it was reasonable or even useful. Specifically, I never suggested it would magically fix a wrong program.

    @.: I might be underestimating it there. Though in signal-processing, defined signed overflow alone purportedly makes things last 50% longer. And if your computing system gets even 5% slower, will you really stay loyal and not go to the competition which does not hobble you thus?

    @ariel: Didn't want to imply the compiler analyzing sys-calls. It just generally assumes calls return, unless marked not to or sometimes if it can trivially prove it cannot return. (It has decorations for imposing reordering restrictions)

  70. Katie says:

    @Deduplicator

    I know, I was agreeing with you that it was a bad solution, just adding my own reason why it still doesn't solve the kinds of problems we've been discussing.

    The signed overflow is an interesting example as well. If there is a signed integer overflow somewhere in my signal processing flow then the system will produce incorrect output and the equipment my software controls could destroy a multimillion dollar prototype. So in critical places I prove out that my algorithm will never produce an overflow and put simulations into my test suite that try the typical and extreme cases and verify that it doesn't overflow and produces the correct output. If I'm confident that the value will never overflow, why not let the compiler assume that it won't overflow and make some nice optimizations as a result?

    If for some reason I do expect it to overflow or am not confident enough in my analysis/testing then I can put my own checks in to handle the situation (probably letting it saturate instead of roll over) and it is still safe for the compiler to assume it won't overflow.
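
    (A minimal sketch of that kind of pre-check, written here for illustration: test the operands first so the signed addition itself can never overflow, and saturate instead of rolling over.)

    #include <limits.h>

    // The comparisons use only values that are representable, so the
    // final a + b is always defined.
    int add_saturating(int a, int b)
    {
        if (b > 0 && a > INT_MAX - b) return INT_MAX;
        if (b < 0 && a < INT_MIN - b) return INT_MIN;
        return a + b;
    }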

  71. ariel says:

    @Deduplicator

    If the compiler assumes that calls return, even if they don't, then this is a bug in the compiler and should be fixed – not returning is a perfectly valid action (again, C11-diverging isn't relevant here), for example, take the following code:

    do_security_check(data);
    data->p[data->i] = data->v;

    Assume that do_security_check can abort via some weird path if the condition was false. Now if the compiler thinks that it always returns, it can move the write ahead of the security check, turning correct (but aborting-on-invalid-data, which is completely fine in application context) code into exploitable code.

    Note that some versions of gcc have (had) a bug where

    unsigned int f(unsigned int z) {
     unsigned int a = 100;
     // trigger loop-invariant optimization
     for(int i=0;i<100;i++) {
         // this isn't loop-invariant - but prevents divide-by-zero
         if((z+1)*(a+1) == 1) return 0;
         // but this is:
         a += a*(100/z);
     }
     return a;
    }

    could end up failing when z==0. But that was a different problem – gcc didn't understand that division could crash.

  72. Joker_vD says:

    @Deduplicator: It was a joke, don't worry. The main problem of UB in C/C++ is, in my opinion, not that it exists, no, but that it looks nearly indistinguishable from legal code, and worse, your tools don't usually have a "look for probable subtle errors" mode of operation.

    As for signed integer overflow, gee, that's why C# has "unchecked" keyword (I'd assume). It's like that plastic cover on the "LAUNCH THE MISSILES" buttons — you don't want to press one by accident, but sometimes you do actually want to launch some missiles.

  73. Adam Rosenfield says:

    @ariel: No, the compiler *cannot* move the write ahead of the security check.  There's a sequence point after the function call, and the write has observable side effects, so the compiler is not allowed to reorder those unless it can prove that the two orderings are equivalent, which it won't be able to do if do_security_check() is an external function whose code it doesn't have at the time.

    The loop-invariant code motion example you gave with GCC is, as you said, a bug.

Comments are closed.