Cast operators do not obey the distributive law

Another interesting question from StackOverflow. Consider the following unfortunate situation:

object result;
bool isDecimal = GetAmount(out result);
decimal amount = (decimal)(isDecimal ? result : 0);

The developer who wrote this code was quite surprised to discover that it compiles and then throws “invalid cast exception” if the alternative branch is taken.

Anyone see why?

In regular algebra, multiplication is “distributive” over addition. That is, q * (r + s) is the same as q * r + q * s. The developer here was probably expecting that casting was distributive over the conditional operator. It is not. The code above is not the same as

decimal amount = isDecimal ? (decimal)result : (decimal)0;

which is in fact the correct code here. Or, better still:

decimal amount = isDecimal ? (decimal)result : 0.0m;

The problem faced by the compiler is that the type of the conditional expression must be consistent for both branches; the language rules do not allow you to return object on one branch and int on the other.

We choose the best type based on the types we have in the expression itself, not on the basis of types that are outside the expression, like the cast. Therefore the choices are object and int. Every int is convertible to object but not every object is convertible to int, so the compiler chooses object. Therefore this is the same as

decimal amount = (decimal)(isDecimal ? result : (object)0);

And therefore the zero returned is a boxed int. The cast then attempts to unbox the boxed int to decimal. As we’ve already discussed at length, it is illegal to unbox a boxed int to decimal. That throws an invalid cast exception, and there you go.
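To make the failure concrete, here is a minimal sketch of both forms side by side. The GetAmount body is a hypothetical stand-in (the article never shows it) that always reports failure, so the alternative branch is always taken; the class and method names are likewise invented for illustration.

```csharp
using System;

static class CastDemo
{
    // Hypothetical stand-in for the article's GetAmount: always reports failure.
    static bool GetAmount(out object result)
    {
        result = null;
        return false;
    }

    // The article's original expression. The conditional has type object,
    // so the literal 0 is boxed as an int before the cast to decimal runs.
    public static decimal Broken()
    {
        object result;
        bool isDecimal = GetAmount(out result);
        return (decimal)(isDecimal ? result : 0);
    }

    // The corrected form: both branches are already decimal.
    public static decimal Fixed()
    {
        object result;
        bool isDecimal = GetAmount(out result);
        return isDecimal ? (decimal)result : 0.0m;
    }

    static void Main()
    {
        try
        {
            Broken();
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Broken() threw InvalidCastException");
        }

        Console.WriteLine(Fixed() == 0m); // the corrected form yields zero
    }
}
```

Both methods compile without complaint; only Broken() fails, and only at run time when the alternative branch is taken.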

Comments (23)

  1. Shiva says:

    I would prefer the compiler to raise a compilation error instead of inserting a type conversion for me in the conditional operator. You may say that asking devs to put explicit type conversions in all assignments may be overkill. But at least in the case of the conditional operator it should have asked me to make my intentions clear.

  2. Phil says:

    Thanks, dude. You just made me a little smarter (which suffices as your good deed for the day). I have noticed this problem too but never understood it until now.

  3. snarfblam says:

    This is something that has always bothered me about C#'s casting operators. Boxing is intentionally largely transparent in C#. When you want to cast an object to a value type, however, the boxing is suddenly completely opaque. Nowhere else do we have to worry about whether a cast is a simple type-cast or conversion. I understand why it works this way, but it certainly isn't obvious until it's gotten you a couple of times.

  4. pminaev says:

    VB can handle this, however:

           Dim x As Object = 12.3

           Console.WriteLine(CType(x, Integer)) ' okay – 12

    but, of course, you pay the runtime penalty for all the extra type checks it has to do to make this work.

    Still, I like the overall approach better. Especially the named cast operators with distinct semantics that is reflected in the names. So you wouldn't expect DirectCast to do the above in VB, because, well, it's not "direct" – you have to go from Object to Double, and then from Double to Integer.

    Better yet would be to have altogether separate syntaxes for casting references up/down/across the inheritance hierarchy, for unboxing, and for conversions. They are, after all, different operations with noticeable semantical differences – up/down/cross-casting is identity-preserving while unboxing is not, for example; and data conversions can simply lose some of the data (even widening ones – think int->float). F# fares relatively well there:

    upcast: (x :> T)

    downcast: (x :?> T)

    cross-cast: (x :> obj :?> T), i.e. upcast followed by downcast – no special syntax

    boxing: (box x) – is not implicit

    unboxing: (unbox<T> x), though T is normally inferred from context, then it's just (unbox x)

    conversions: (int x), (float x), (string x) etc

  5. josheinstein says:

    I am shaken. I can pretty much guarantee over the past 10 years or so, the longest I've gone without using C# was about 48 hours and yes that includes my wedding in the Bahamas. Yet I have never noticed or have been impacted by the fact that a boxed T can only be cast to T. It's kinda like finding out you were adopted.

  6. mike says:

    I would point out that algebra is algebra. There is no 'regular' algebra.

    I would point out that there are infinitely many algebras. An "algebra" is by the pure mathematician's definition simply the combination of a field with a multiplication operator closed over the field such that the operator has certain attractive properties, such as distributivity. So, yes, I did not need to say "regular algebra" here, since by definition a multiplication operator in any algebra is distributive.

    A computer scientist would define "algebra" very differently than a pure mathematician. To a computer scientist, an algebraic system is any system that affords certain symbolic manipulations. There does not have to be a vector space, or a multiplication operator that distributes over addition of vectors. As a computer scientist I think of the type system of C# as forming an algebra because it is a bunch of stuff I can manipulate symbolically. Let's call such algebras "symbolic algebras", and the pure mathematician's algebras "vector algebras".

    What I am calling out here is that our shared understanding of a particular vector algebra (namely elementary algebra over the vector space of real numbers) leads us all to have intuitions about the symbolic algebra of the C# type system, intuitions which are not accurate. Something that looks textually like a multiplication in a vector algebra does not actually have the distributive property in a symbolic algebra. And thus the title of the blog post: cast operators do not obey the distributive law.

    I decided that it would be a major digression to explain the difference between two kinds of algebras (and I note that these are just two out of many possible definitions of "algebra" used by academics), so I didn't bother to put all this verbiage in the original text. But regardless, there are infinitely many vector algebras, and there are many different definitions of the word "algebra". Perhaps "regular algebra" was not the best choice of words, but I felt that "elementary algebra over the field of real numbers" was a bit excessively verbose for what ought to be a pretty straightforward concept.

    – Eric

  7. Peter J Fraser says:

    Now that dynamic exists, wouldn't converting to dynamic be a better choice?

    (I realize that it may be too late, i.e. the difference between

     var x =  a ? b : 0; )

  8. jsrfc58 says:

    Wait…whatever happened to the concept of a simple if/then statement? Sometimes a compact notation actually degrades legibility.

    Better yet, why not fix the "GetAmount" method so that it actually returns a decimal value (result or zero)?

  9. Jonathan says:

    I think it is more accurate to say that compile-time resolution of cast syntax into the contextually appropriate cast operator does not obey the distributive law. When it is really the same cast operator applied to both values, the distributive law works.

  10. John M Kerr says:

    Okay, so whose bright idea was this?

       const int x = 1, y = 2, z = 4, answer = x + y + z;

       Console.WriteLine("the answer is " + answer);

       Console.WriteLine(x + y + z + " is the answer");

       Console.WriteLine("the answer is " + x + y + z);

       Console.WriteLine("the answer is " + (x + y) + z);

       Console.WriteLine("the answer is " + x + (y + z));

       Console.WriteLine("the answer is " + (x + y + z));



       the answer is 7

       7 is the answer

       the answer is 124

       the answer is 34

       the answer is 16

       the answer is 7

    Associativity of '+' in C# –…/associativity-of-in-c.html

  11. Joren says:

    Evaluation order in C# is left to right.…/4374222.aspx

  12. Pavel Minaev [MSFT] says:

    @Joren: John's complaint has nothing to do with evaluation order; it's about operator precedence and associativity.

  14. Jon Skeet says:

    @Josh: In fact, you *can* cast from a boxed T to other types – sometimes.

    In particular, you can cast from a boxed integral type to an enum which has that type as its underlying type – and vice versa.

    I can't remember whether that's guaranteed by the language spec or not – I seem to remember that at one point both the CLI spec and the language spec made some attempt to talk about it, but both ended up being slightly nonsensical in different ways.
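The exception described in this comment can be sketched as follows. This shows behavior observed on the CLR rather than something the comment confirms the specification guarantees; System.DayOfWeek is used only because it is a convenient built-in enum whose underlying type is int, and the class and method names are invented for illustration.

```csharp
using System;

static class EnumUnboxDemo
{
    // Unbox a boxed int as an enum whose underlying type is int.
    public static DayOfWeek IntBoxToEnum(object boxedInt)
    {
        return (DayOfWeek)boxedInt;
    }

    // Unbox a boxed enum as its underlying integral type.
    public static int EnumBoxToInt(object boxedEnum)
    {
        return (int)boxedEnum;
    }

    static void Main()
    {
        Console.WriteLine(IntBoxToEnum(1));                 // prints "Monday"
        Console.WriteLine(EnumBoxToInt(DayOfWeek.Tuesday)); // prints "2"
    }
}
```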

  15. Rayviewer says:

    I agree with Shiva. The compiler should do a better job of choosing the type for the “0”, or refuse to compile the code.

    If the compiler only evaluates the expression “isDecimal ? result : 0”, it is true that the best type for the “0” is “object”. However, if the compiler evaluates the whole expression “decimal amount = (decimal)(isDecimal ? result : 0)”, the best type to choose is “decimal”.

    The runtime exception is generated by the compiler, not the person who wrote the code.

    OK, what about this?

    void M(decimal x) { }
    void M(int x) { }
    void M(object x) { }

    M(isDecimal ? result : 0);

    Describe the exact semantics you would like to see here. Now we have three conversions, one to decimal, one to object, and one to int. Which one is correct? How should the compiler decide?

    Once you've solved that one, try this one:

    void N(Func<decimal> f) { }
    void N(Func<object> f) { }
    void N(Func<int> f) { }

    N(()=>isDecimal ? result : 0);

    Is that a function to int, to decimal, or to object?

    Reasoning from outside to inside is very difficult; we try to ensure that expressions in C# can be analyzed independently of their contexts, because the semantic analysis of the context could be the thing we're trying to figure out. We don't want to add more chicken-and-egg problems to C#.

    – Eric
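For reference, here is a sketch of what the current inside-out rule does with the first puzzle above. The overloads return a string instead of void purely so the choice can be observed; that change, and the class and method names, are assumptions made for illustration.

```csharp
using System;

static class OverloadDemo
{
    // Same parameter types as the M overloads above, but returning a label
    // so we can see which overload the compiler bound to.
    static string M(decimal x) { return "decimal"; }
    static string M(int x) { return "int"; }
    static string M(object x) { return "object"; }

    public static string Pick(bool isDecimal, object result)
    {
        // The conditional is typed on its own as object, before overload
        // resolution ever looks at it.
        return M(isDecimal ? result : 0);
    }

    static void Main()
    {
        Console.WriteLine(Pick(true, 1.5m));  // prints "object"
        Console.WriteLine(Pick(false, null)); // prints "object"
    }
}
```

Because the conditional expression is given the type object in isolation, only M(object) is applicable, regardless of which branch runs or what the boxed value happens to be.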

  16. Gabe says:

    Given the (I hope) unlikeliness that somebody is depending on that invalid exception being thrown, and the fact it should be fairly deterministic when this happens, this seems like a good candidate for a compiler warning. When you see that a boxed value is going to be unboxed as an incompatible type, there should be a warning.

    Pavel: The problem with F#'s approach is that when you have an operator for everything, you end up with 80 operators. And the problem with having 80 operators is that every developer has to know all of them to be fluent in the language because most symbols aren't very intuitive (knowing the -> operator doesn't help you know the <- operator; knowing :> and :?> doesn't tell you what :? does).

    jsrfc58: There are plenty of places where you need an expression, and a statement is not valid, like in LINQ expressions or expression lambdas. And who's to say that GetAmount is "broken" in the first place? Eric has presented it as an analog to the Decimal.TryParse function, which would arguably be much more broken if it returned 0 when it couldn't parse its input.

  17. Rayviewer says:

    I would prefer C# to have a syntax like M(isDecimal?result:0 as int) …  or M((Decimal) ()=>isDecimal?result:0) to resolve the ambiguity, instead of the compiler generating code that throws an exception that may blow up at the customer site.

    I guess I will make sure I am using suffixes – 0f, 0m for now.

  18. pminaev says:

    >> And the problem with having 80 operators is that every developer has to know all of them to be fluent in the language because most symbols aren't very intuitive

    Arguably, it's better to see/use an unfamiliar operator and go look it up to see what it _actually_ does, than see/use a (seemingly) familiar operator in a new context, think that you know what it does, and guess wrongly because you hit some corner case where the semantical similarity which was the original reasoning for using identical syntax suddenly vanishes – such as in Eric's example.

    In any case, operators don't have to be symbolic, as demonstrated by VB's "DirectCast", C++'s "static_cast", or C#'s own "is" and "as". We could just as well have "convert", "upcast", "downcast" etc.

  19. Phil Koop says:

    I like your term "regular algebra", though some would have preferred "elementary" but your explanation is a case of a cure worse than its disease.

    An algebra is not "a multiplication operator closed over a field", because a field is already an algebraic structure that includes addition and multiplication operators. The real numbers are a set; combining them with + and * using the "regular" rules defines a field because they satisfy all the field axioms, including not only closure but also inverse elements, identity elements, associativity, commutativity and distributivity.

    You are quite right that there are many algebras; a field is only one example. The reals plus only * would constitute a group, which is also an algebraic structure. In your casting example, a group would be a more parsimonious analogy than a field.

    Indeed. You are, I'm sure, correct. Keep in mind that it has been 18 years since I last took linear algebra, and even then I had a hard time remembering what the differences amongst a ring, group, field and algebra are. – Eric

  20. Phil Koop says:

    Oh dear! I have created a false impression, for I am not a "real" mathematician. And I am older than you, so my school days are correspondingly more distant.

    All the same, that academic grounding was useful when the time came for me to learn sigma algebra (another algebra!), martingale probability, stochastic calculus. The theories we held as students – that the stuff we were learning could never possibly be useful for anything – proved to be wrong.

  21. Bent Rasmussen says:

    Nice factoid!

    The source example is horrible in more ways than one though. I'd definitely write that code as such:

    decimal? value = GetAmount(result);

    decimal amount = value.Value;


    decimal amount = GetAmount(result) ?? 0.0m;

  22. Bent Rasmussen says:

    I had to make a blog post about it:…/an-anti-pattern-denied

  23. Eduard Dumitru says:

    I think some of you guys are reading these blogs for the wrong reason.

    I personally had no profound revelation learning that "cast operators do not obey the distributive law" but still I find it to be a materialization of a belief I could not have enumerated if asked: tell me what you know about C#.

    I really appreciate Eric's mathematical approach. I'm much younger than he is and have already forgotten much of the little vector-algebra I once knew. I still consider this post as being a *Fabulous* one.

    And Bent, *An-Anti-Pattern-Denied* sounds a bit like a double negation :). Make sure your readers don't do the exact opposite.