Why does the assignment operator in C# evaluate left to right instead of right to left?


When I noted some time ago that the compound assignment operator is guaranteed to evaluate its left-hand side before its right-hand side, there was much commenting about why the language chose that model instead of the more "obvious" evaluation order of right-then-left.

In other words, instead of rewriting E1 += E2 as E1 = E1 + E2, rewrite it as

    temp = E2;
    E1 = E1 + temp;

(Or as

    ((System.Func<T2, T1>)((e2) => E1 = E1 + e2))(E2)

if you want to keep it as a single statement.)

Thank goodness you can't overload the += operator, because it would require that the operator overload for += be declared backward:

    operator+=(T1 y, T2 x)
    {
        return x = x + y;
    }

in order to ensure that the right-hand side is evaluated first. (Because function arguments are evaluated left to right.)
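
You can watch the left-to-right argument evaluation happen. Here is a minimal sketch (the Trace and Add helpers are invented for illustration):

    using System;

    class ArgumentOrderDemo
    {
        // Hypothetical helper that logs when it is evaluated.
        static int Trace(string name, int value)
        {
            Console.WriteLine(name);
            return value;
        }

        static int Add(int a, int b) => a + b;

        static void Main()
        {
            // Arguments are evaluated left to right,
            // so this prints "first" and then "second".
            Add(Trace("first", 1), Trace("second", 2));
        }
    }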

(We'll come back to the rewrite rules later.)

One reason for the existing rule is that it keeps the rules simple. In C#, all expressions are evaluated left to right, and they are combined according to associativity. Changing the rules for assignment operators complicates the rules, and complicated rules create confusion. (See, for example, pretty much every article I've written about Win32 programming titled "Why does...?")

In particular, for compound assignment, it means that E1 += E2 and E1 = E1 + E2 are no longer equivalent if E1 and E2 have interacting side effects. Collapsing x = x + y into x += y would no longer be something you could do without having to think really hard first. Like hoisting closed-over variables, this would create another case where something that at first appears to be purely an issue of style turns into a correctness issue.
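
To see the divergence concretely, here is a minimal sketch comparing today's behavior with a hypothetical right-to-left rule for compound assignment (the Index and Value helpers are invented for illustration):

    using System;

    class SideEffectDemo
    {
        static int[] a = new int[1];

        static int Index() { Console.WriteLine("Index"); return 0; }
        static int Value() { Console.WriteLine("Value"); return 1; }

        static void Main()
        {
            // Under today's rules, both statements evaluate Index before
            // Value (the expanded form calls Index twice).
            a[Index()] += Value();
            a[Index()] = a[Index()] + Value();

            // Under a hypothetical right-to-left rule for compound
            // assignment, the first statement would print "Value" before
            // "Index", so collapsing the second form into the first would
            // reorder the side effects.
        }
    }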

One argument for making a special rule is that any code which relied on E1 being evaluated before E2 is probably broken already and at best is working by sheer luck. After all, this is the rationale behind changing the variable lifetime rules for closures that involve the loop variable.

But it's not as cut-and-dried that anybody who relied on order of evaluation was "already broken".

Consider a byte code interpreter for a virtual machine. Let's say that the PokeByte opcode is followed by a 16-bit unsigned integer (the address to poke) and an 8-bit unsigned integer (the value to poke).

  // switch on opcode
  switch (NextUnsigned8())
  {
  ...
  case Opcode.PokeByte:
    memory[NextUnsigned16()] = NextUnsigned8();
    break;
  ...
  }

The C# order of evaluation guarantees that the left-hand side is evaluated before the right-hand side. Therefore, the 16-bit unsigned integer is read first, and that value is used to determine which element of the memory array is being assigned. Then the 8-bit unsigned integer is read next, and that value is stored into the array element.
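
For concreteness, here is a self-contained sketch of such an interpreter. The opcode value, the little-endian operand order, and the helper implementations are assumptions of this sketch, not part of the example above:

    using System;

    class MiniVm
    {
        const byte PokeByte = 0x01;   // hypothetical opcode value

        byte[] memory = new byte[0x10000];
        byte[] program = { PokeByte, 0x34, 0x12, 0xAB }; // poke 0xAB into 0x1234
        int pc;

        byte NextUnsigned8() => program[pc++];

        ushort NextUnsigned16()       // low byte first (assumed)
        {
            int lo = NextUnsigned8();
            int hi = NextUnsigned8();
            return (ushort)(lo | (hi << 8));
        }

        void Step()
        {
            switch (NextUnsigned8())
            {
            case PokeByte:
                // Left-to-right: the 16-bit address is read before the
                // 8-bit value, exactly as the opcode stream requires.
                memory[NextUnsigned16()] = NextUnsigned8();
                break;
            }
        }

        static void Main()
        {
            var vm = new MiniVm();
            vm.Step();
            Console.WriteLine($"{vm.memory[0x1234]:X2}");   // prints AB
        }
    }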

Therefore, this code is perfectly well defined and does what the author intended. Changing the order of evaluation for the assignment operator (and compound assignment operators) would break this code.

You can't say that this code is "already broken" because it's not. It does exactly what its author intended, it does it correctly, and the behavior it relies on is guaranteed by the language standard.

Okay, you could have come up with something similar for capturing the loop variable: some code which captures the loop variable and genuinely wants to capture the shared variable. So maybe it's not fair to show code which relies on the feature correctly, because one could argue that any such code is contrived, or at least too subtle for its own good.

But as it happens, most people implicitly expect that everything is evaluated left to right. You can see many instances of this on StackOverflow. They don't actually verbalize this assumption, but it is implicit in their attempt to explain the situation.

The C# language tries to avoid undefined behavior, so given that it must define a particular order of evaluation, and given that everywhere else in the language, left-to-right evaluation is used, and given that naïve programmers expect left-to-right evaluation here too, it makes sense that the evaluation order here also be left-to-right. It may not be the best style, but it at least offers no surprises.

With the right-to-left rule, you get a different surprise:

    x.value += Calculate();
    x[index] += Calculate();

If x is null, or if index is out of bounds, the corresponding exception is not raised until after Calculate has run. Some people may find this surprising.
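
You can verify the current behavior with a minimal sketch (the Holder class and Calculate helper are invented for illustration); under today's left-to-right rule, the null check fires before Calculate ever runs:

    using System;

    class Holder { public int value; }

    class NullCheckDemo
    {
        static int Calculate()
        {
            Console.WriteLine("Calculate ran");   // never printed below
            return 1;
        }

        static void Main()
        {
            Holder x = null;
            try
            {
                // Left to right: x is dereferenced (and found to be null)
                // before Calculate is invoked.
                x.value += Calculate();
            }
            catch (NullReferenceException)
            {
                Console.WriteLine("NullReferenceException came first");
            }
        }
    }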

Okay, so maybe we can still salvage this by changing the rewrite rule so that E1 is still evaluated before E2, but only to the extent needed to identify the value to be modified (an lvalue, in C terminology). Then we evaluate E2, and only then do we combine it with the value of E1. In other words, the rewrite rule is that E1 += E2 becomes

    System.CompoundAssignment.AddInPlace(ref E1, E2)

where

    static T1 AddInPlace<T1, T2>(ref T1 x, T2 y)
    {
        // Because x is passed by reference, its value is not fetched
        // until after y has been evaluated.
        return x = x + y;
    }

This still preserves most of the left-to-right evaluation, but delays the fetch of the initial value until as late as possible. I can see some sense to this rule, but it does come at a relatively high cost to language complexity. It's going to be one of those things that "nobody really understands".
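
Here is a sketch of the observable difference between today's rule and the hypothetical AddInPlace rewrite (the Bump helper is invented for illustration):

    using System;

    class DelayedFetchDemo
    {
        static int[] x = { 10 };

        static int Bump()
        {
            x[0] = 100;   // mutate the element the left-hand side names
            return 1;
        }

        static void Main()
        {
            // Today's rule fetches x[0] (10) before calling Bump, so the
            // result is 10 + 1 = 11. Under the AddInPlace rewrite, the
            // fetch would be delayed until after Bump, giving 100 + 1 = 101.
            x[0] += Bump();
            Console.WriteLine(x[0]);   // prints 11 under current C#
        }
    }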

Bonus chatter: Java also follows the "always evaluate left to right" rule. Dunno if that makes you more or less angry. See, for example, Example 15.26.2-2: Value Of Left-Hand Side Of Compound Assignment Is Saved Before Evaluation Of Right-Hand Side. However, for some reason, Java has a special exception for direct (non-compound) assignment to an array element: in the case of x[index] = y, the range check on the index occurs after the right-hand side is evaluated. Again, this may make you more or less angry. You decide.

I have a second bonus chatter, but writing it up got rather long, so I'll turn it into a separate post.

Comments (30)
  1. Karellen says:

    “The C# language tries to avoid undefined behavior, so given that it must define a particular order of evaluation,”

    I think “undefined behavior” is an unfortunate choice of words here, talking about a member of the “C”-derived family of languages, given that in that context it’s a well-defined term with a very specific meaning.

    (After all, C doesn’t define the order of evaluation, but statements with multiple subexpressions to be evaluated result in either “unspecified” or “implementation-defined” behaviour rather than “undefined behaviour”.)

    1. Kevin says:

      > but statements with multiple subexpressions to be evaluated result in either “unspecified” or “implementation-defined” behaviour rather than “undefined behaviour”.

      No, unsequenced side-effects are undefined, not just unspecified. The full set of nasal demon machinery is in play if you do something horrible like foo[i++] = ++i;

      1. Joshua says:

        That’s still only unspecified. It will write, to an index of foo within 2 steps of the original i, a value within 2 steps of the original i.

        Now if foo only had one more index left in the array, the unspecified becomes undefined.

        1. Karellen says:

          No, that’s undefined behaviour. See http://c-faq.com/expr/evalorder1.html and questions 3.2, 3.3, 3.8 to 3.12b and 11.33.

          I was sloppy in my wording though; I meant to imply the order of evaluation of subexpressions in a statement like:
          foo[i++] = ++j;

          Which is well defined, but where the order of evaluation of “i++” and “++j” is either unspecified or implementation-defined (can’t remember which offhand, but probably the first)

  2. Antonio Rodríguez says:

    IMHO, it would have been better to preserve the right-to-left order. Veteran programmers are used to taking the unusual order of the assignment operator into account while writing, optimizing, and debugging. Changing the order makes things harder for your primary target (the very developers you are trying to sell your new language to), and for what benefit? Making life a bit easier for novice programmers, and for those who after two decades in the field still aren’t worth their salaries. In other words: upsetting your current customers to try to lure would-bes.

    I don’t much care about Java, but Microsoft, IMHO, made a few key strategic mistakes in the 2000s, an era when, for the first time in almost 20 years, Microsoft was facing some real alternatives (Palm OS, iOS, Android, web apps…).

    1. xcomcmdr says:

      If you have the chance to remove an unusual order, why not get rid of it?

      Enabling programmers to focus on their core task (i.e. making software for themselves or their clients) instead of having to remember annoying details is good.

      It’s why real programmers document their code, write unit tests, etc., instead of trying to preserve unusual orders of assignment for the sake of tradition.

      1. Antonio Rodríguez says:

        If we want to eliminate the quirks of programming languages, then why do we keep semicolon statement termination, three-expression for loops, and curly-brace block delimiters in most modern languages? They all cause a lot of confusion for beginners, and even for expert programmers it is a pain to get an error at the last line of the file just because you forgot a brace in a for loop some fifty lines above (now go and find it; it sometimes takes a lot of careful reading of the code).

        Pascal and Basic may be outdated, but by using distinct block-closing keywords matched to the opening ones (and forcing you to write them even for one-statement blocks), you get the error at the next block closing, usually not more than a handful of lines below the mistake. This also makes it easier to keep indentation right, which is a plus when teaching beginners.

        We keep using curly braces and C-like syntax in most modern languages (C#, Java, PHP, Rust, Swift…) out of tradition. There are so many veteran programmers used to them (and convinced that non-curly languages are script-like or, even worse, Basic-like) that you have no chance of promoting a serious language that doesn’t use them, even with all the problems they carry. Of JavaScript and Python, which one is seen as more professional? And which one gets the most bashing?

        On the other hand, consider that it is a lot more difficult to write a parser that takes into account the right precedence of the assignment operator. Yet most procedural, non-stack-based languages from the 60s, 70s and 80s (when computing power was far more limited than now) implemented that. Because it is, in fact, more intuitive once you hear about the concept. Anyone with some background in Mathematics will agree. StackOverflow copypasters may disagree, but I’m talking about professional programmers here.

        In other words: we keep using curly braces (and C syntax) out of tradition. Why not do the same for the assignment operator?

        1. Someone says:

          “Pascal and Basic may be outdated”

          Why? In this paragraph, you have just named perfect reasons to mistrust languages which seem to be defined with a “hey, cool” mindset, but without considering usefulness. (C++ is just unreadable for anyone not very deeply intimate with its strange constructs, plus the nonsense of Undefined Behavior.)

        2. xcomcmdr says:

          > If we want to eliminate the quirks of programming languages, then why do we keep semicolon statement termination, three-expression for loops, and curly-brace block delimiters in most modern languages?

          You might have heard of Python, which already does a lot of what you say.

          Despite that, it’s quite popular.

        3. xcomcmdr says:

          > Because it is, in fact, more intuitive once you hear about the concept. Anyone with some background in Mathematics will agree.

          You are confused. We are talking about order of evaluation, NOT associativity.

          Also, you might want to add arguments to your story, because arguments from authority won’t cut it.

        4. DWalker07 says:

          I disagree that modern VB.Net is outdated. I believe it has all, or almost all, of the constructs and semantics of C#. Just different syntax without curly braces.

        5. Joshua Schaeffer says:

          C-style for-loops aren’t hard for beginners. No loop is hard for beginners because loops are when their programs start actually doing something interesting. If you didn’t write dumb newline text in an infinite loop, you’re lying. Nobody told you to do that. You were having early fun with loops, not struggling.

        6. voo says:

          “Because it is, in fact, more intuitive once you hear about the concept. Anyone with some background in Mathematics will agree. StackOverflow copypasters may disagree, but I’m talking about professional programmers here.”
          So why would a mathematician have an opinion about the order of assignments? If anything the mathematician would expect a side-effect free language where such a distinction would be meaningless.

    2. Mike Caron says:

      Hi, Target Audience here! I’ve been coding for 20 years, since I was very young, across a handful of languages (QBasic, Visual Basic, dabbling in C/C++, PHP, and now C#), and this is the first time I’ve ever heard of, or even considered, the order of evaluation of the right side vs the left side.

      I already knew that everything else in C# was strictly left-to-right, so I guess if you had asked me, I would have given that answer. Honestly though, I can’t think of very many cases where you would want to rely on this behaviour. The sample that Raymond posted above is a decent example, though I would probably prefer to make the variables explicit, and let the compiler/jitter do the work of optimising them away (or not, if that’s what it feels is best).

      Definitely, when optimizing a piece of code, I would never, ever decide to introduce a dependency on order of operations. Surely code that is so finely hand-tuned would be written in IL directly, if not removed from C# altogether.

      1. Antonio Rodríguez says:

        You are right: in most cases, you should not be relying on this behavior. Most of the time when the precedence of the assignment operator matters, it is because of side effects between the expressions on the right and on the left. Which borders on relying on undocumented features.

        The only reasonable case I can think of is, precisely, optimization. Nowadays, processors are powerful and compilers are good at optimization, so it rarely makes sense to write assembler or IL by hand. It is a cost/benefit trade-off: your code may have to be read and maintained by other programmers, who may not know assembler or IL. Because of that, it is best to write the most efficient possible code in the same high-level language as the rest of the project. That way, with the help of detailed comments, any programmer can understand what is going on without having to take a course in x86 assembler.

        1. Alex Cohn says:

          > it is best to write the most efficient possible code in the same high-level language as the rest of the project.

            Better: write simple and clean code in a high-level language, and let the optimizing compiler take care of it.

          1. Simon Farnsworth says:

            Or write a code generator in a high level language (Python, for example), that generates the efficient code in the compiled language (C++, for example). That way, when the compilers improve, you can simplify your code generator.

    3. Ben Voigt says:

      There’s no “preserve the right-to-left order”, because there is no existing order (C, C++), or the pre-existing order was left-to-right (Java).

      Methinks you’re talking about associativity (which IS right-to-left) rather than order of evaluation.

    4. zboot says:

      There was no “right to left” order to preserve. So you’re complaining today, that “veteran programmers” would shy away from C#, because it works differently from what they’re used to? I think at this point, if there are veteran programmers who can’t adapt to C#, they are clearly *not* the target market for this language. They’re too old. C# has been out so long, there are younger and more numerous “veterans” who won’t have that issue.

  3. littlealex says:

    I have to very strongly disagree on this one:
    “You can’t say that this code is “already broken” because it’s not. “
    I very much can, and will, call this code “broken”. “It works” is absolutely not a synonym for “it is not broken”. Code that even at first glance screams “I am going to cause problems” is broken, even if it is working.

    All code is either “easy to maintain (as much as possible, of course)” or it is “broken”. It can be broken in different ways and, most importantly, in different cost categories, where “absolutely does not work at all” is rather cheap, while “does work (at least right now)” tends to be broken in very costly ways. By the time this already broken code finally stops working, probably no one will even remember the name of the one who wrote it. No one will know which side effects of the code are required, which were accepted because the one responsible asserted (or thought he asserted) that the side effects would not hurt, and which of the side effects actually do hurt and since when they have been part of the code, making it work in subtly wrong ways, and so on.

    Every code which relies on the order of evaluation is broken. It is absolutely worth the effort to do:

        case Opcode.PokeByte:
          ushort index = NextUnsigned16();
          byte value = NextUnsigned8();
          memory[index] = value;
          break;

    To be broken in a slightly less expensive way, the code must at the very least be heavily commented, with CAPITALS and exc!amat!on marks, to ensure everybody will be extremely careful when poking their head into the sheer horror of this language-specific, side-effect-based ‘code’. And writing the appropriate comments would be much more work than just fixing the whole thing.

    1. That’s like saying any code that uses STL or lambdas is verboten, because only language experts really know what they’re doing under the hood, and they’ll work differently in other languages. The C# spec says that “memory[NextUnsigned16()] = NextUnsigned8();” and “n16 = NextUnsigned16(); n8 = NextUnsigned8(); memory[n16] = n8;” mean exactly the same thing, therefore it isn’t broken in any way. It will never change, it will never break.

      If you port any code to another language without knowing the differences in languages, or wrongly assuming similar expressions will always act the same, then of course your code will be broken.

      1. Someone says:

        “that uses STL or lambdas is verboten,…”

        I agree with that.

    2. Someone says:

      “Every code which relies on the order of evaluation is broken”

      This, but with one exception: constructs with “&&” and “||”. Short-circuiting of such constructs is a very, very useful feature, and hopefully every programming language on earth will perform it left-to-right.

      1. 640k says:

        Except the equivalent operator (AND/OR) in BASIC of course.

        1. DWalker07 says:

          VB.NET has AndAlso and OrElse, which short-circuit.

  4. Neil says:

    While in C++ you can evaluate x[index] = y; as auto &t = x[index]; t = y;, you don’t have that luxury in Java, which explains that behaviour. Meanwhile, in JavaScript you can write nonsense such as a = [0]; a += ++a[0];, which results in a’s value becoming “11”.

  5. IanBoyd says:

    Chalk up one more for the “left-to-right” camp.

    C, C++, Java, C#, Delphi.

  6. Bryce Wagner says:

    “In particular, for compound assignment, it means that E1 += E2 and E1 = E1 + E2 are no longer equivalent if E1 and E2 have interacting side effects.”

    It wasn’t completely clear whether this statement was referring to how things currently exist, or to what would be different if the order of operations were different. In C#, x[i++] += 1; has different behavior from x[i++] = x[i++] + 1;: the first version evaluates i++ only once, even if you’re dealing with a this[] indexer and not an array [].

  7. Peter Doubleday says:

    I wonder if LTR associativity for the assignment operator is “natural” for someone whose first written language is RTL?

    Aaaand … no, boustrophedonics are best left to one side for now.

    I think this is one of those cases where clarity trumps purity (assuming that purity is the rationale for RTL associativity). Me, I find the invisible fix for lambda capture of a foreach variable to be obvious, natural, and valuable — even though it is the Impure Work Of The Devil.

    A part of me, however, still wants lvalues to be lvalues, and I see RTL evaluation as enforcing this meaning. But, as I say — rules is rules, and somebody has to make the choice when the compiler team starts with -100 points for either feature.

  8. MarcK4096 says:

    I followed the loop closure link without realizing it isn’t an article written by Raymond, and was shocked when I found that the article was soliciting opinions on purposefully making a breaking change to C#. After reading all the articles about the questionable bugs that have been left in Windows for the sake of “backwards compatibility”, I could never see Raymond writing something like this.

Comments are closed.
