Adventures in undefined behavior: The premature downcast


A customer encountered the following problem:

class Shape
{
public:
    virtual bool Is2D() { return false; }
};

class Shape2D : public Shape
{
public:
    virtual bool Is2D() { return true; }
};

Shape *FindShape(Cookie cookie);

void BuyPaint(Cookie cookie)
{
    Shape2D *shape = static_cast<Shape2D *>(FindShape(cookie));
    if (shape->Is2D()) {
       .. do all sorts of stuff ...
    }
}

The Buy­Paint function converts the cookie back to a Shape object, and then checks if the object is a Shape2D object by calling Is­2D. If so, then it does some more stuff to figure out what type of paint to buy.

(Note to nitpickers: The actual scenario was not like this, but I presented it this way to illustrate the point. If you say "You should've used RTTI" or "You should've had a BuyPaint method on the Shape class", then you're missing the point.)

The programmers figured they'd save some typing by casting the result of Find­Shape to a Shape2D right away, because after all, since Is­2D is a virtual method, it will call the right version of the function, either Shape::Is­2D or Shape2D::Is­2D, depending on the actual type of the underlying object.

But when compiler optimizations were turned on, they discovered that the call to Is­2D was optimized away, and the Buy­Paint function merely assumed that it was always operating on a Shape2D object. It then ended up trying to buy paint even for one-dimensional objects like points and lines.

Compiler optimization bug? Nope. Code bug due to reliance on undefined behavior.

The C++ language says (9.3.1) "If a nonstatic member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined." In other words, if you are invoking a method on an object of type X, then you are promising that it really is of type X, or a class derived from it.

The code above violates this rule, because it is invoking the Is­2D method on a Shape2D*, which therefore comes with the promise "This really is a Shape2D object (or something derived from it)." But this is a promise the code cannot deliver on, because the object returned by Find­Shape might be a simple Shape.

The compiler ran with the (false) promise and said, "Well, since you are guaranteeing that the object is at least a Shape2D, and since I have studied your code and determined that no classes which further derive from Shape2D override the Is­2D method, I have therefore proved that the final overrider is Shape2D::Is­2D and can therefore inline that method."

Result: The compiler inlines Shape2D::Is­2D, which returns true, so the "if" test can be optimized out. The compiler can assume that Buy­Paint is always called with cookies that represent two-dimensional objects.

The fix is to do the annoying typing that the original authors were trying to avoid:

void BuyPaint(Cookie cookie)
{
    Shape *shape = FindShape(cookie);
    if (shape->Is2D()) {
      Shape2D *shape2d = static_cast<Shape2D *>(shape);
       .. do all sorts of stuff (with shape2d) ...
    }
}
Comments (45)
  1. Anonymous says:

    That sort of optimization would have to be done in the linker (e.g. using /LTCG with the VS compiler), since the compiler has no way of knowing if another translation unit defines a subclass that overrides that method.

  2. Anonymous says:

    Last I checked (MSVC6) dynamic_cast used string comparisons of the class name. How many would depend on the inheritance hierarchy.

    I imagine that would be slower than a single virtual call.

  3. Anonymous says:

    Can the compiler ( or linker ) even decide whether there is a derived class or not? One could be linked during dynamic linking, couldn't it? Most plugin systems work that way.

    The compiler has to prove, that there is no way, that the object would be a derived type, and only then it can optimize that code.

    E.g.:

    Shape* shape = new Shape2D;

    shape->Is2D(); // could be optimized

  4. kinokijuf says:

    one-dimensional objects like points and lines

    Objection! Points are zero-dimensional!

  5. Anonymous says:

    @Christian Vetter: Can you do dynamic linking of C++ classes?

    If you can, maybe the compiler undoes such optimizations as there if that happens :)

  6. MSVC6 is 14 years old.  I'm sure they have better/faster ways of dynamic casting than string comparisons today!!  What about, say, MSVC 2010?  GCC?

    @Christian Vetter: if the class isn't exported from the project then how could something use it from dynamic linking? (e.g. DLL plug-in)

  7. Anonymous says:

    "and since I have studied your code and determined that no classes which further derive from Shape2D override the Is­2D method"

    How the heck could it know that?  That isn't even known at link time if the code contains a plugin mechanism.

  8. Anonymous says:

    LTCG + building an exe or LTCG + not exporting the vtable of the class = being able to state categorically no deriving classes.

  9. Anonymous says:

    @JamesJohnston: Right.  If the class isn't tagged with __declspec(dllexport) or __declspec(dllimport), then the compiler knows it can't be used for dynamic linking.

  10. Anonymous says:

    This is an optimization along the lines of "sufficiently smart compiler", and unfortunately, IMHO, starts to encroach on "smart-ass compiler".

    Predictability is more valuable than performance. Predictability increases the correspondence between the programmer's mental model and the runtime behaviour; this reduces bugs. Performance, on the other hand, should be diagnosed with a profiler and targeted specifically if it is a problem.

    There's a trade-off of course; really common idioms that would otherwise have poor performance but for a simple optimization may well be better off with slightly confusing behaviour, but such idioms should be common enough that most developers know about them. But this should be a relatively high bar, because bugs are typically a bigger problem than poor performance.

    But overall, this is a cultural problem with C++. The spec authors try hard to come up with ways to let vendor compilers be extra smart-ass, and being language lawyers, they often get a perverse kind of pleasure out of counter-intuitive behaviour, because it lets them stroke their egos as they explain what's going on to the peons in the massed ranks. Not saying Raymond is like this, but quite a few asocial engineers are.

    tl;dr: this is a bug in the spec, and no healthy language culture would prefer this kind of behaviour.

  11. Anonymous says:

    @Barry Kelly: I fail to see how it's smart-ass. A static_cast is effectively telling the compiler "I know about this type than you do, and I'm telling you it's this particular derived type". The compiler accepts that and, running from that assumption, inlines the function.

    How is that counter-intuitive?

  12. Anonymous says:

    IMHO what the programmer did was counter-intuitive. I don't quite get how anyone could even think it would work, when it's Shape2D that inherits Shape, not the other way around?

  13. @Barry Kelly: then go use C#/Java and/or don't compile with optimizations.  C++ is a language designed to let you compile one step away from assembly if you write your code well with a good optimizing compiler.  In order to do that, it has to let you shoot yourself in the foot.  If that's not acceptable, then use a different language.

    At the end of the day, my C++ code would probably be faster than your equivalent C# code.  But my code would have a much higher risk of undefined behavior if I screwed up.

  14. Anonymous says:

    @Joshua:

    "How the heck could it know that?  That isn't even known at link time if the code contains a plugin mechanism."

    What is going on is that the compiler's Whole Program Optimisation pass is attempting to remove virtual calls, and has come across a virtual Shape2D::Is2D call. What it then does is build up a graph of everywhere this could end up – that is to say it could either land at InvalidAddress (if Shape is NULL), Shape2D::Is2D, Shape::Is2D or ExternalDerivesFromShape2D::Is2D.

    The compiler then attempts to reduce this set as much as possible. Firstly it says – using this undefined behaviour in the C++ definition, we can remove Shape::Is2D and InvalidAddress from the set – if you're relying on undefined behaviour, well, it doesn't matter if the optimisation screws up. You're on your own in undefined-behaviour-land.

    But what about external calls? Well, to see that, WPO looks at where the Shape2D came from – in this case from FindShape. Where did FindShape get it from? Well, WPO is able to look at FindShape and see that FindShape is from raymond_example_lib.lib and that it doesn't make any virtual calls (i.e. virtual/abstract calls in C++ or pointer calls via GetProcAddress/COM etc). Consequently, FindShape's shape MUST come from a type known to raymond_example_lib. Since raymond_example_lib doesn't know about any other object that derives from Shape2D, ExternalDerivesFromShape2D::Is2D CANNOT be a valid target for the virtual call.

    Having done all of this, the compiler has reduced this all down to the single target Shape::Is2D, and hence it can convert virtcall Shape2D -> Is2D into instcall Shape2D -> Is2D.

    Now since Shape2D::Is2D is known and is an instance call, it can be inlined, so it can replace

    instcall Shape2D::Is2D($shape) returns into $resultreg

    into

    $resultreg = TRUE

  15. Anonymous says:

    So the point of all that is that the compiler MIGHT make this optimisation under certain conditions, and only if it is able to confirm that the object really didn't come from a plugin.

    The good thing about optimisations is that the default state of "leave it alone" is always there. The compiler's first job is to produce CORRECT rather than FAST code.

    Consequently if any of your respective special cases are applicable, or the optimisation can't PROVE it will be safe, or can't be bothered to prove it will be safe (i.e. compilation is taking too long) within the defined behaviour of C++ (even if it usually is), then the optimisation won't remove the virtual call and you'll end up with slower, but correct code.

  16. Anonymous says:

    its an interesting gotcha but the standard bits you quoted are not relevant. The Is2d method is a method of the shape class so the results are not undefined as per that part of the spec.  There is presumably a bit in the spec that says something like 'static cast of a pointer to class X to a pointer to class Y where X!=Y and X not subclass of Y results in undefined'

  17. Anonymous says:

    If you were following this code in your head, you would look at the "shape->Is2D()" call and say "well, shape is defined as Shape2D*, and I know that there are no subclasses of Shape2D, so this must be calling Shape2D::Is2D, which returns true".

    What reasonable person reading the code would see "shape->Is2D" and think "maybe Shape::Is2D will get called instead"? I would look at the code, naively assume that only Shape2D pointers will get returned from FindShape based on the existence of the static cast, and then wonder why the redundant Is2D() call is there.

  18. Anonymous says:

    Mmm…. Compiler optimizations: always nasty, always correct. Love 'em!

  19. Anonymous says:

    So, this is one of the cases where it is NOT easier to ask for forgiveness than to ask for permission.

    (i.e., you should check whether an object has type T before you go around claiming that it has type T).

  20. Anonymous says:

    If you want to learn more about undefined behavior, here is a set of seven useful links: lwn.net/…/511767 (as a comment to an article on yet another way undefined behavior can hurt you).

  21. Anonymous says:

    The compiler is doing the right thing. And who would build a map when you trust the programmers. But yea, never understood the idea of trusting a magic number in the object's memory layout. Crashing rightfully is cool. Plus, spherical cow.

  22. Anonymous says:

    @Cesar:

    The undefined behaviour isn't the cast. It's the use of the pointer afterwards, hence my second example is safe (since the pointer is valid because of the prior call to Is2D on the base object).

    This is made clear in the spec that Raymond quoted:

    The C++ language says (9.3.1) "If a nonstatic member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined." In other words, if you are invoking a method on an object of type X, then you are promising that it really is of type X, or a class derived from it.

    C/C++ explicitly allow you to hold invalid pointer values and keep going – just not to dereference them. If it didn't, you'd immediately break on the statement after a "delete" – since the variable you just deleted is now in an undefined state. Indeed – you'd also break whenever you entered a function that declares a pointer-type local variable, since the local-variable has undefined value until you initialize it. C/C++ is fine with letting you hang on to undefined values. It just doesn't have defined behaviour when you try and USE the value.

    [5.2.9(11) says that "the result of the cast is undefined." Undefined means "anything can happen", as opposed to "indeterminate" which means "not an error, but the result is not predictable." Uninitialized local variables have indeterminate value, not undefined (8.5). -Raymond]
  23. Anonymous says:

    It's been a few years since I did any C++, so I'm confused. Shouldn't static_cast<Shape2D *>(shape) fail at compile-time since there is no guarantee that a Shape * object can be casted down to a Shape2D *? I'd expect reinterpret_cast<Shape2D *> to cause the behaviour in the exemple, but static_cast to cause a compile time error.

  24. @DrkMatter: static_cast is perfectly legal here; you're telling the compiler that you know in advance that the type can be casted as specified and that the downcast is legal.  That saves the program from checking at runtime to see whether Shape is actually a Shape2D; it's faster to just assume it and carry on.  reinterpret_cast shouldn't really be used here; that cast is for dirty pointer casts like casting a pointer to/from an integer, for example.

    What I would like to know is why didn't they use dynamic_cast and eliminate the whole Is2D method?  Was this a performance-sensitive bit of code and it turns out dynamic_cast was slow?  (How slow is dynamic_cast anyway compared to the Is2D virtual method?  I use dynamic_cast all the time but never worry since I rarely do anything remotely performance-sensitive and dynamic_cast seems safer since a null pointer is better than a pointer to an undefined object of the wrong type.)

    Shape2D *shape = dynamic_cast<Shape2D*>(shape)

    if (shape) {

       // carry on…

    }

    [See "Note to nitpickers." Developing an example even with dynamic_cast is left as an exercise. -Raymond]
  25. Anonymous says:

    I suppose another way of looking at it is that if FindShape()'s result is not ALWAYS a Shape2D, then the line

       Shape2D *shape = static_cast<Shape2D *>(FindShape(cookie));

    makes shape an meaningless pointer, in much the same way that

       IUnknown *obj = (IUnknown *)(FindShape(cookie));

    would make obj an invalid pointer if FindShape doesn't return something which is actually an IUnknown.

    If you then ever dereference obj or shape – particularly to call a virtual function, then frankly you shouldn't be surprised if you end up in invalid memory.

    The solution therefore is to either do

    Shape* shape = FindShape(cookie);

    if(shape->Is2D())

    {

     Shape2D* shape2D = static_cast<Shape2D*>(shape);

     DoAwesome2DStuff(shape2D);

    }

    shape->Release();

    or if you're feeling particularly brave:

    Shape* shape = FindShape(cookie);

    Shape2D* shape2D = static_cast<Shape2D*>(shape); // FOR GOD'S SAKE DON'T USE THIS YET

    if(shape->Is2D())

    {

     DoAwesome2DStuff(shape2D); // Oh, thank god we're safe

    }

    else

    {

     shape2D = NULL; // BAIL BAIL BAIL before someone is stupid enough to try anything with this pointer!

    }

    shape->Release();

  26. GregM says:

    "its an interesting gotcha but the standard bits you quoted are not relevant."

    They are exactly relevant.  Try reading it this way:

    "If a nonstatic member function of a class Shape2D is called for an object that is not of type Shape2D, or of a type derived from Shape2D, the behavior is undefined."

    Since you don't know yet whether or not the object is of type Shape2D, or derived from Shape2D, you can get undefined behavior.  It doesn't matter at all that the function you are calling is also virtual in the base class.  The compiler searches Shape2D *first*, and finds the virtual function there.

    "The Is2d method is a method of the shape class so the results are not undefined as per that part of the spec."

    Except that you're not calling Shape::Is2D(), you're calling Shape2D::Is2D().

    "There is presumably a bit in the spec that says something like 'static cast of a pointer to class X to a pointer to class Y where X!=Y and X not subclass of Y results in undefined'"

    That's true, this program can invoke undefined behavior two times on the same line of code.

    [Yup. 5.2.9(8) says that if you static_cast to a derived type that the object is not a sub-object of, then the result is undefined. -Raymond]
  27. MItaly says:

    @Barry Kelly: actually, in my opinion the problem here is that *the developer* is trying to be a smartass: "I know that the vtable will still be there after the cast, so I'll just cast to Shape2D and check later; I avoided two lines of code, what a smart developer I am!".

    Inlining of virtual methods can be a blessing in performance-sensitive areas, and I for one want it enabled (especially if this stops people to roll out enormous switch statements because "virtual functions are expensive"). If some smartass developer wants to trade speed for conformance to a mental model just to avoid a few lines of code here and there he's welcome to disable all optimizations.

  28. Anonymous says:

    @Matt: your second example is a bad one. The compiler _knows_ shape is in fact a Shape2D*, since you did a static_cast of it to a Shape2D*, and the only way the cast could be valid is if shape was a Shape2D. Then the compiler makes a mental note, "shape is in fact a Shape2D*" (even while its type is still Shape*), and when it sees the shape->is2D(), it knows that it can inline to "true".

    After you do something undefined, your whole program state is undefined, so the compiler can assume you did NOT do the undefined thing and optimize accordingly. In fact, even *before* you do something undefined, your program state is already undefined! Therefore, if you do something like "if (shape->is2D()) { … } Shape2D* shape2D = static_cast<Shape2D*>(shape);", the compiler can also assume shape->is2D() inlines to "true".

    It might be better to think of "undefined" as "impossible". As in, the compiler assumes anything which is undefined will not happen, so any path leading to or from it will not be taken. In the original example, any path where shape is not a Shape2D* is impossible, because if it were possible, you would be executing undefined behavior, which is something that from the point of view of the compiler no programmer is going to do.

  29. Anonymous says:

    I don't really understand that if we are using virtual functions then why is there a need to know the specific type to which an object belongs. The whole point of using virtual functions (and polymorphism) is that the client code needs to know only the interface provided by base class and then let the derived classes take care by overriding the interface functions according.

    For example suppose in this code we have another virtual function Shape::FindArea() which makes sense only for 2D objects then this function would have appropriate definition only for Shape2D class, but say for Shape1D this function will just return 0 without performing any calculation.

    Thus the point of checking whether the object is 2D or not through the function Shape::Is2D is unnecessary. May be sometimes we do need to know the type of an object, but in general I prefer to avoid such code.

  30. Anonymous says:

    @cesar: you mean like GCC's now famous optimization of assuming null pointer dereferences crash?

  31. Anonymous says:

    In a abc I would prefer:

    class Shape

    {

    public:

       virtual double GetDimensions();

    }

  32. Anonymous says:

    At the very least I would replace virtual bool Is2D() { return false; } with virtual Shape2D* As2D() { return nullptr; } thus allowing you to write Shape2D *shape = FindShape(cookie)->as2D();

  33. Anonymous says:

    @Joshua: yes, precisely.

    And null pointer references do crash, unless you mmap()ed something at the first page of the virtual address space. Modern distributions do not allow that by default (mmap_min_addr), but that is configurable. So, dereferencing a null pointer without crashing can be possible.

    But gcc is still right that dereferencing a null pointer is something no sane programmer is going to do. Which leads gcc to assume that the pointer cannot possibly be null at that point. After all, programmers never make mistakes.

  34. Anonymous says:

    I think the solution to the problem is a better way to code. The main benefit is being able to easily debug what's going on. The resulting code is easier to understand, which means you can get away with less thinking. I hate code that makes me think unnecessarily.

  35. Anonymous says:

    Cesar, thanks for that link. It made for very interesting reading and should convince anyone that C(++) is terrible language family.

  36. RicarDog says:

    @JamesJohnston Regarding dynamic_cast, VS2010 still uses the same slow method of comparing strings. I would recommend against using RTTI in general if your application's performance is critical.

    @Barry Kelly "Predictability is more valuable than performance." This is a case of "my scenario is better than yours". Some applications do have high performance requirements, and the C++ language was designed with them in mind. It's not an ego issue, it's just that computers don't have brains yet so they have to rely on yours.

    As a side note, please NEVER cast base to derived pointers using reinterpret_cast or C-style cast such as Shape2D* shape2d = (Shape2D*)shape, even if you have guarantee that the cast is valid, because it results in undefined behavior when the type is incomplete (i.e. Shape2D is a forward-declared type) and the compiler won't complain about it. On the other hand, static_cast does generate a compile error in such cases.

  37. Anonymous says:

    @Cesar: It got famous when it was discovered that optimization was a bad idea in kernel mode.

    If I were compiling for 32 bit nommu mode (enable 32 bit protected mode and set the segment selectors up for the famous unreal mode but stay in protected mode), I'd definitely disable this optimization as the interrupt vector table (which I will be accessing) is still at NULL.

  38. Anonymous says:

    @Joshua

    I'd disagree. The Interrupt Vector Table is at address 0. Just because there's no way to distinguish between "pointer to address 0" and "NULL pointer" doesn't mean that the two distinct concepts should be conflated.

  39. Anonymous says:

    Thanks for the cheat sheet.  Lesson learned.

  40. @POKE53280,0:  Wrong; you get a null pointer if the cast is invalid, which is then noticed by the following "if" statement.

    bad_cast is for references because there is no such thing as a null reference; for example:

    Shape &shape;

    Shape2D &shape2D = dynamic_cast<Shape2D&>(shape); // may throw bad_cast

  41. Anonymous says:

    I've failed the "example even with dynamic_cast" exercise, and I don't see it posted anywhere in the comments.  Does anyone have a teacher's edition of this blog with this exercise completed?

    [Shape2D *shape = dynamic_cast<Shape2D*>(shape);
    if (shape) { … }
    shape->Something(); /* tells the compiler that the dynamic_cast succeeded, so it can assume that shape is a Shape2D. */ -Raymond
    ]
  42. bad_cast is thrown only is you cast into a reference type. dynamic_cast to a pointer doesn't throw.

  43. Anonymous says:

    A simple note to Raymond's code posted on mbrierst comment:

    [Shape2D *shape = dynamic_cast<Shape2D*>(shape);

    if (shape) { … }]

    I think if dynamic_cast<> fails, a std::bad_cast C++ exception is thrown, so probably the simple "if (shape)…" check is not correct.

  44. Anonymous says:

    Shape &shape;

    will not compile. You cannot declare a reference without assigning to it, since, well, there is no such things as a null reference ;)

  45. Yeah I know…. it won't work in a function.  I think that should work if it's a class member, though.  Just the constructors need to initialize it in the initializer lists. :)  I probably should have clarified it more.

Comments are closed.