Do people write insane code with multiple overlapping side effects with a straight face?

On an internal C# discussion list, a topic that comes up every so often is asking about the correct interpretation of statements like

    a -= a *= a;
    p[x++] = ++x;

I asked,

Who writes code like that with a straight face? It's one thing to write it because you're trying to win the IOCCC or you're writing a puzzle, but in both cases, you know that you're doing something bizarre. Are there people who write a -= a *= a and p[x++] = ++x; and think, "Gosh, I'm writing really good code?"

Eric Lippert replied "Yes, there are most certainly such people." He gave as one example a book from an apparently-successful author (sales of over four million and counting) who firmly believed that the terser your code, the faster it ran. The author crammed multiple side effects into a single expression, used ternary operators like they were going out of style, and generally believed that run time was proportional to the number of semicolons executed, and every variable killed a puppy.

Sure, with enough effort, you could do enough flow analysis to have the compiler emit a warning like "The result of this operation may vary depending upon the order of evaluation", but then you have to deal with other problems.

First of all, there will be a lot of false positives. For example, you might write

    total_cost = p->base_price + p->calculate_tax();

This would raise the warning because the compiler observes that the calculate_tax method is not const, so it is worried that executing the method may modify the base_price, in which case it matters whether you add the tax to the original base price or the updated one. Now, you may know (by using knowledge not available to the compiler) that the calculate_tax method updates the tax locale for the object, but does not update the base price, so you know that this is a false alarm.

The problem is that there are going to be an awful lot of these false alarms, and people are just going to disable the warning.

Okay, so you dial things back and warn only for more blatant cases, where a variable is modified and evaluated within the same expression. "Warning: Expression relies on the order of evaluation."

Super-Confident Joe Expert programmer knows that his code is awesome and the compiler is just being a wuss. "Well, obviously the variable is incremented first, and then it is used to calculate the array index, and then the result of the array lookup is stored back to the variable. There's no order of evaluation conflict here. Stupid compiler." Super-Confident Joe Expert turns off the warning. But then again, Super-Confident Joe Expert is probably a lost cause, so maybe we don't worry about him.

Joe Beginner programmer doesn't really understand the warning. "Well, let's see. I compiled this function five times, and I got the same result each time. The result looks reliable to me. Looks like a spurious warning." The people who would benefit from the warning don't have the necessary background to understand it.

Sure enough, some time later, it came up again. Somebody asked why x ^= y ^= x ^= y doesn't work in C#, even though it works in C++. More proof that people write code that relies upon multiple side effects and passionately believe that what they are doing is obvious and guaranteed.

Comments (68)
  1. French Guy says:

    That’s intended to swap the values of x and y without using a temporary variable, right? Basically, a one-expression version of “x ^= y; y ^= x; x ^= y;”

    1. The MAZZTer says:

      I think in C#7 you could use tuples. I don’t have access to VS2017 at the moment but this might work:

      (y, x) = (x, y);

      Not sure.

      1. David Totzke says:

        You are correct sir.

      2. Matt Warren says:

        Tuples for the win…

        Of course, there are now two temp variables (a tuple of them) that are constructed on the stack and then read from during the assignment.

        But it is clearer.

        Unless your understanding of tuple assignments is that they are just a grouping of individual assignments, with a mere translation to x = y; y = x, which is obviously correct because it’s more optimal and doesn’t waste all that unnecessary stack, which means the simple-looking assignment is actually wrong in a horribly subtle way, so you freak out and start sending hate mail to your fellow developer for making that noob mistake…

        Until you realize that you were wrong.. and that the C# design team (or at least one pushy member) thought it would be really cool if the assignment allowed swap to work, so they personally would never have to remember how to do that swap trick or need to name a local variable ‘temp’ anymore..

        But maybe that’s just me.

        1. poizan42 says:

          The CLR stack != the processor stack. If enregistration works as it is supposed to in RyuJIT, then the tuple struct should never exist on the stack. After that, the flow analysis should be able to figure out that it can be compiled to an exchange instruction.

    2. DWalker07 says:

      Sure, it’s possible to swap the values of two variables without using a temp variable.

      But, in the real world (outside of contests), is it worth it? Sure, variables might be expensive (it depends), but programming errors and confusion by the programmer-hired-later-to-update-code are much more expensive. Clarity is key.

      1. In the real world, the CPU has these temporary variables called “registers”.

        1. Joshua says:

          It occurred to me the warning should say “statement depends on the compiler version” when the same expression writes to the same variable twice. Joe Newbie has a decent chance of understanding that.

        2. The_Assimilator says:

          The kind of person who writes this sort of code probably considers using a register an unnecessary performance overhead.

      2. Joe says:

        This is where I wonder whether the compiler would figure out to optimize it away, or not even use a memory address at all, if it’s a temp variable.

        There’s also the 80/20, 90/10, 95/5 rules where most of the program is in a small amount of the code.

        Sometimes newer developers overthink the RAM cost, and it’s an easy thing to do. However, math can save you from insanity. Let’s say you have 160 int temp variables in your file. That’s 320 bytes on a 64-bit CPU, assuming int in C++. That’s bytes, not kilobytes. On today’s devices, where you are expected to have at least 512 MB of RAM on a mobile device (32 MB available for programs on lower-end Android devices), those 320 bytes are more than likely not what’s wrong with your program if something’s going wrong.

        I also see a push to minimize lines of code. In many cases, this is alright. However, it sometimes gets taken to an extreme by cramming many lines of code or statements on a single line that will expand to a bunch of lines anyways once it gets compiled down. If you are in a language without the Elvis operator, pray that you don’t have a null value in that chain.

        1. Joe says:

          … and I can’t edit the math on the previous comment. I thought it was 20 int vars. 160 int vars on 64-bit C++ is 1280 bytes, 1.2k.

    3. Darran Rowe says:

      The funny thing is, a compiler with optimisations enabled should always remove the temporary variable anyway, so for the majority of cases using the temporary variable is the simpler option. This is especially true if you can’t guarantee that the variables are not aliased.
      But this is another problem with less experienced programmers: they never take into account what the compiler itself is capable of doing, and then use these “tricks” to try to speed things up, not realising that what they are doing will probably be the same speed, or maybe even slower.

      1. John says:

        I have been witness to rare circumstances on both military and medical projects where the output of different compilers was required to be identical, regardless of the haphazard omission/modification of compiler settings, to ensure that binaries sourced from different parties weren’t tampered with (think shared codebases between different nations’ militaries, where trust cannot be assumed). More often than not, much of this code was written in assembly to avoid trouble, but there were a ton of creative workarounds utilized in higher-level languages when working at a lower level of code could not be done for whatever reason. Readability and maintenance were nightmarish, but sometimes that’s just what the project requires.

        1. Joshua says:

          See reproducible-builds for the modern way.

    4. Antonio Rodríguez says:

      Even if the compiler doesn’t remove the temporary variable, it isn’t a great loss. The variable, being short-lived, can be stored in a register, avoiding touching memory. And even if it is stack-based (or if it spills into the stack), the top of the stack will probably be in the level 1 cache, so it shouldn’t cause a cache miss.

  2. xcomcmdr says:

    I hate such code. It’s obfuscation, plain and simple.

  3. Darran Rowe says:

    I always find it fun when you have the kind of people who think that writing less code is faster, and then you show them seemingly more code that does the same thing in the same amount of time, or even faster.

  4. Ivan K says:

    But sir, there is only so much space on the drum for my environment, build tools, code, and resulting text file.

    1. SimonRev says:

      You laugh, but I once interviewed a software developer who stated in all seriousness that since his work PC’s hard drive was too small, he would uninstall Visual Studio version X and install Visual Studio version Y every time he had to switch projects. I normally try to avoid delving into the work environment of a previous employer, but I seriously had to follow up and nail down that his employer would rather pay him to reinstall Visual Studio twice a day than buy him a bigger hard drive.

  5. Antonio Rodríguez says:

    Writing less code (i.e., fewer tokens) can be slightly faster in *interpreted* languages like PHP (a script and all of its includes are parsed each and every time it is invoked, at least in versions prior to PHP 7). But in those languages, the biggest gains come from avoiding including (and parsing) files you know you don’t need (in PHP, include() is a statement, which makes it easy to do “lazy includes”). Playing tricks to save one or two lines of code isn’t clever if the code isn’t readable and causes trouble when debugging or adding new features. Once again, it’s a matter of using common sense and trying to strike a balance.

    Of course, in compiled languages with modern compilers, compressing the code does not make much sense, unless you really know what you are doing and you are optimizing an inner loop that runs thousands of times a second. And even then, you have to put yourself in the compiler’s shoes, and think in terms of machine operations, not high-level language statements.

  6. Kevin Fee says:

    I once wrote code similar to p[x++] = x++;. It made sense as I wrote it: two discrete blocks of everything moving forward. The instant after I wrote it, I read it, at which point I said, “That is completely unreadable,” and I created a local variable for each of the increments to make it absolutely clear which order I was expecting. I didn’t even run it. I now suspect it may not have evaluated in the order I expected, so either way it’s a good thing I did.

  7. user says:

    I wonder what would be a positive example of an expression making good usage of side effects? It seems to me that the concept of “a *= b” or “a++” evaluating to a number just opens the door for this kind of abuse. If they were “void” expressions that only change the variable but don’t evaluate to anything, these examples with multiple overlapping side effects would not be possible.

    1. pc says:

      I think the standard argument for assignments returning a value making any sort of sense is initializations like “a = b = c = 1”. This syntax may derive from a more math-like notation used in descriptions of algorithms. (That is, I could see that line being used on a whiteboard, and it seeming like a good idea to allow it directly in code.)

      But yes, I think trading away the ability to do “tricks” like that, so we could have assignments be statements instead of expressions and do away with some nonsense, would be worthwhile. Keep that in mind if you’re ever designing a new programming language.

      1. Antonio Rodríguez says:

        Pascal and Basic (among others) make assignments first-class statements. Basic even had a keyword for that, LET, which got dropped later. Both Pascal and Basic predate C (but IMHO are better suited for application development).

        1. Python too. Assignments are statements, not expressions, and you can’t chain them together at all.

          1. (That last phrase is wrong. You can chain simple assignments like x=y=z=0. But you cannot chain augmented assignments like x^=y^=x^=y)

      2. Kevin says:

        Python splits the difference: chained assignments are legal because the language grammar makes it so, but assignment is a statement, not an expression. You can’t do “clever” things like while (data = …), which I suppose might tend to annoy some people. But the upshot is that you *never* confuse = with ==, because the former is a syntax error in expression context.

        (I suppose you could write x == 5 when you intended to initialize x to 5, and Python would interpret this as an expression-statement, but I don’t see that error very often in practice.)

      3. Timothy Byrd says:

        In Ruby, everything is an expression. All statements return a value. So this is valid:
        x = if 1 > 2 then 3 else 4 end

        And it has parallel assignment built-in, so this works to swap variables:
        x,y = y,x

  8. mikeb says:

    I would have expected that C#’s strict rules on order of evaluation would require the expression x ^= y ^= x ^= y to ‘work’. I see now that C#’s rules on order of evaluation actually have two parts:

    1) operators are evaluated according to the rules of precedence and associativity, but
    2) operands are evaluated strictly left to right

    Item 2 means that both times `x` is used in the expression as an operand, it has the value that `x` started out with. This ruins the xor-swap trick.

    However, I’m going to confess that I still don’t feel that I understand how the “operands are evaluated strictly left to right” rule works when an operand is a sub-expression that is more complicated than a variable name. I’m not sure how important it is to understand that (which I think is the point of the article).

    In this case the C++ situation is much simpler: all bets are off, the compiler can do whatever it wants. So there’s no point even trying to understand how the expression should work. Though there is a point in understanding that even though you might be getting the result you expect, there is no guarantee of that and it’ll likely break at some point. The solution is the same: don’t do that.

  9. IanBoyd says:

    Eric Brumer [MSFT] has an excellent talk on Channel 9 about the thing that nobody considers much today: only 1% of your CPU is for computation. 99% of your CPU is dedicated to dealing with slow-memory, and re-writing your code (changing the order of instructions) so that it has to wait on memory less.

    Eric, if you’re around, your presentations are **great**! I wish there were more.

  10. Nathan Reed says:

    Didn’t you just write such code with a straight face yourself only yesterday? ;) In your bytecode interpreter example:

    memory[NextUnsigned16()] = NextUnsigned8();

    Presumably the NextUnsigned8/16 functions have side effects of advancing a buffer pointer, so this is nearly the equivalent of p[x++] = ++x.

    1. Gene Hamilton says:

      The C# order of evaluation guarantees that the left hand side is evaluated before the right hand side. Therefore, the 16-bit unsigned integer is read first, and that value is used to determine which element of the memory array is being assigned. Then the 8-bit unsigned integer is read next, and that value is stored into the array element.

      1. Nathan Reed says:

        Yes, I know, because I read Raymond’s post yesterday. :) The point is, that was effectively the same as the “insane code” Raymond’s complaining about today!

        1. The_Assimilator says:

          The difference is that the behaviour is now contained in those methods, so callers theoretically do not have to worry about side effects. Plus the methods can now be tested to ensure they behave as expected, including not having unwanted side effects.

    2. Matt Warren says:

      Yes, this is the same thing. :-) It just doesn’t seem as obfuscated because it’s got words instead of punctuation.

  11. alegr1 says:

    >every variable killed a puppy
    Believe it or not, *every* local variable in Visual C (independent of scope) takes a separate location on the function stack, even in the release build. At least that’s what the VC code analysis thinks. There was that time when I had to go through a big IOCTL switch statement and prune the local pointers in the case blocks (replacing them with a common variable at the top of the function); otherwise the code analysis and SDV would not be happy because of the stack size.

  12. alegr1 says:

    I once tried to convince my fellow programmer to fix this abomination in C code:

    i = i++;

    He kept telling me that the compiler does what he wants (increments i) in this case, even though he had previously gotten a lecture from me on the evils of writing code with undefined behavior.

    I had to sneak a drive-by fix in another commit.

    1. Viila says:

      …….uhhh… even if it did do exactly what he wanted, why on Earth would he write that over just “i++;”?

      1. Brian says:

        As someone who cut my teeth on really slow processors (Z80s back in the 80s) and who worked in many tight nested loops, I know that ++i as a simple statement is just about always preferable to i++.
        I’m sure that a modern optimizer probably optimizes a simple post-increment statement to just an increment statement, but pre-increment never generates more code than a post-increment (back in the day on my completely non-optimizing Z80 compiler, it was like the difference between one instruction and five instructions).
        I’ve had many people remark that I always pre-increment the loop variable in a for-loop statement. I worry when some of my colleagues don’t understand my explanation.

        1. BZ says:

          I just write ++i, not i++ because that’s what K&R has in its earliest examples

        2. waitingForTheStorm says:

          But you are mistaken in some cases. On old Univac computers (1110, 1100/80, 1100/90), the instruction set included a hardware-based post-increment operator that was free: it happened as an integral part of the instruction interpretation in the central processor. If memory serves, it happened on practically every instruction that included a reference to a memory location through one of the index registers (16 of said registers, 4 overlapped with the arithmetic register set). It has been a long time, but I know that I used post-incrementers all the time on these engines because the math was free.

          1. Brian says:

            That’s why I said “just about always preferable”; I’ve worked on oddball instruction set computers before (Data General Nova was perhaps the weirdest I worked on – I think that’s what I’m remembering). Did your Univac post-increment really run faster than a pre-incrementing equivalent? That would surprise me.

  13. cheong00 says:

    “Well, let’s see. I compiled this function five times, and I got the same result each time. The result looks reliable to me. Looks like a spurious warning.”

    Oh yeah. I’ve had a hard time trying to convince my ex-colleague that “if you try to update any UI element on a spawned thread, you need to check if Invoke is required first”, back in the days when everyone was still running WinXP. He said it works and always works on his computer, so directly updating the UI must be okay. // facepalm

    And no, I left the company, so I don’t know whether any customers were hit by this.

    1. Ben Voigt says:

      But you *don’t* need to check if Invoke is required. You just use BeginInvoke for all UI updates (unless they’re directly written in UI event handlers which are guaranteed always to run on the UI thread). Control.Invoke already checks InvokeRequired, so for you to do it again is a waste.

      1. cheong00 says:

        Well, my point is he’s not doing .Invoke() at all when trying to update UI from code that could be running on another thread.

        And btw, I remember that the .NET v1.x documentation still used the

        if (control.InvokeRequired) { control.Invoke(somemethod); } else { somemethod(); }

        pattern at that time.

  14. Josh says:

    Just wait until you see what Knuth does in Assembler…

  15. Eric Lippert says:

    Thanks for the shout-out. I’m glad to see the return of CLR week!

    I remember that book very well; it was by far the worst C# book I’ve ever edited. Tragically, I cannot find my review notes, otherwise I would share a few choice morsels.

    1. Matt Warren says:

      Of course, you only said all that to justify your own abusive coding practices!

  16. I wish compilers with undefined behaviour would add a switch that randomized the undefined behaviour. If order of evaluation is undefined, let me compile with the order actively randomized. Then the issues will become apparent. Any dependency on the undefined behaviour will be more obvious. In that case, also emit the warnings so as to help guide people to the right places to fix.

  17. There is no shame in writing simple, dumb code.

    If you exhaust all your cleverness writing the cleverest code possible, then you are, by definition, not smart enough to debug that code, because debugging the code takes even more cleverness than writing it did.

    Don’t write code that you can’t debug. That’s my motto.

    1. That’s paraphrased Brian Kernighan.

    2. alegr1 says:

      But can God of Programming write a routine He can’t debug?

  18. Kerrash says:

    Entire article on how developers focus on the wrong things; comments focus on the closing statement… :D

  19. Jeff Moden says:

    BWAAAA-HAAAA!!! And then those same people say that SQL is difficult! ;-)

  20. penguinapricot says:

    As someone who programmed 8-bit opcodes directly into a 256-byte RAM connected to a Z80,
    I laugh in a jovial and patronizing way at anybody who thinks the length or amount of C# lines they type has any relationship to execution speed or efficiency.
    You are using a super-high-level language, on top of an intelligent interpreter, on top of a highly optimized API, on a supercomputer with endless amounts of memory and multiple cores.

    1. Put every operation, including ‘x++’, on a separate line.
    2. Write long, meaningful comments.
    3. Give variables super long and meaningful names, such as ‘user_surname_abbreviated’.

    This is so when other programmers need to fix your highly efficient, not-working code, we can understand what it’s meant to do.

  21. Ricky says:

    That’s illegal code. Not necessarily for the compiler but for our company. Since we sell devices that emit up to 3000 volts, any software engineer that tried to do this kind of programming and get it around our safety checks would be fired. I’m sure he’d have a straight face when he was led out the door by Security.

  22. Nick Humphries says:

    This will sound crazy, but I once met a ‘developer’ who wrote some code to remove the comments from the source code prior to compilation, honestly believing it would result in a smaller, faster executable.

  23. waitingForTheStorm says:

    I have been a developer for a very long time (40+ years). I dislike code with side effects. My coding style is a clear, top-to-bottom, simplistic flow of logic with simple declarative statements and straightforward logical constructs. Why? Much more of the life cycle is support, and even 6 months after writing code, you will be like a newbie to the code. If you have to eat your own dog food, and I have most of my life, it engenders a simplistic discipline. On almost every team I have joined, my colleagues tell me that they know immediately when they walk into my code, because it is a model of simplicity. I have never had any complaints.

    1. MikeB says:

      I couldn’t agree more. Simplicity is the first virtue to strive for. I’ve said before that you should read my code and think I’m a bit of a simpleton. Support is most of the life cycle, and we all do more of that than anything else.

  24. Patrick Thurmond says:

    Death to hard to read code!!!

    I am a code formatting Nazi. If I saw a pull request with crap like that come through, I would reject it out of hand and force the developer to clean up their act.

    Given that I now manage a team of 12 developers, I can be outright ruthless about it. But I do so with reason, and I explain the reasons to the devs that make these mistakes. They have just gotten used to it and for the most part do it right. I let them slide on missing line breaks and spaces here and there though.

    1. Darran Rowe says:

      But do you go insane on things like:
      int *i; vs. int* i;?
      That is the real teller.

  25. Paulo Zemek says:

    About this case: p[x++] = ++x;
    I don’t know if this actually happens, but I can see this being interpreted in two different ways (depending on the compiler, optimizations, etc.):

    int value = ++x;
    p[x++] = value;

    or:

    int index = x++;
    p[index] = ++x;

    Which would definitely yield different results.

    1. alegr1 says:

      A third (legal) possible option is where ‘x’ gets incremented only once.

    2. alegr1 says:

      (I’m talking about C, not C#, though)

  26. In my 1975 PL/I textbook (“PL/I for Programmers”) my second rule of quality was “A program must be obvious” (the first was “A program must be correct”). Later these became one rule “A program must be obviously correct”. Efficiency is unimportant if this rule is broken: I don’t care how quickly you can do something that might be wrong.
    a -= a *= a;
    p[x++] = ++x;
    fail the “obvious” test, so they’re bad code.

  27. Mark W Allen says:

    Some people do it just to be obtuse on purpose

  28. 640k says:

    This has been going on for ages at Microsoft; when will you learn? Don’t optimize the experience for developers who cannot be bothered to learn the profession; you should not care about these stereotypical developers who write bad code. Instead, optimize the experience for professional developers who write good code.

Comments are closed.
