# The Root Of All Evil, Part One

People often quote Knuth's famous statement that premature optimization is the root of all evil. Boy, has that ever been the theme of my life these last few weeks as I've been banging my head against the compiler trying to figure out how we're going to make it work with LINQ without breaking backwards compatibility. Some seemingly harmless premature optimizations in the compiler are eating up precious time dealing with how to work around them.

Before I can explain exactly what the premature optimization is that's driving me nuts, a short quiz. Suppose you've got an enumerated type E with value X:

E e1 = E.X;
E e2 = 0 | E.X;
E e3 = E.X | 0;
E e4 = E.X | 0 | 0;
E e5 = 0 | E.X | 0;
E e6 = 0 | 0 | E.X;

All of these are legal according to the compiler. One of these is illegal according to the specification. Before you read on, take a guess as to which one, and why.

Now, producing sensible behaviour when we should be producing an error is the least bad of all possible spec violations, but still, it's irksome. It means that we can never bring the compiler into compliance with the spec, because if we suddenly turned this into an error, we might break an existing program.

Which of the above should be illegal? The first one is obviously good. The second and third are legal because section 13.1.3 says that literal zero is implicitly convertible to any enum, and section 14.10.2 says that you can binary-or two enumerated type values. We'll therefore convert the zero to the enumerated type and use the enum binary-or operator.

Since binary-or groups on the left, the fourth and fifth are legal. The left-hand part of the expression is evaluated first, and the right-hand zero is converted to the enumerated type.
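A quick check of the legal cases, with the BCL enum `DayOfWeek` standing in for the hypothetical `E` (a sketch assuming a modern C# compiler; the post's `E` and `X` are placeholders):

```csharp
using System;

// DayOfWeek stands in for the post's hypothetical enum E; Monday plays E.X.
DayOfWeek e1 = DayOfWeek.Monday;
DayOfWeek e2 = 0 | DayOfWeek.Monday;      // literal zero converts to the enum type
DayOfWeek e3 = DayOfWeek.Monday | 0;
DayOfWeek e4 = DayOfWeek.Monday | 0 | 0;  // groups as (Monday | 0) | 0
DayOfWeek e5 = 0 | DayOfWeek.Monday | 0;  // groups as (0 | Monday) | 0

Console.WriteLine(e1 == e2 && e2 == e3 && e3 == e4 && e4 == e5); // True
```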

The sixth case is illegal according to the spec, because this is the same as (0|0)|E.X. (0|0) is not a literal zero; it's a compile-time-constant expression that happens to equal zero, and there's no implicit conversion from that thing to any enumerated type!

And there's the premature optimization. After we've got a complete parse tree, we walk through the parse tree making sure that all the types work out. Unfortunately, the initial type binding pass, bizarrely enough, does arithmetic optimizations. It detects 0 | something and aggressively replaces it with just something, so as far as the compiler is concerned, the sixth case is the same as the second, which is legal. Argh!

This kind of optimization is way, way premature. It's the kind of thing you want to do very late in the game, possibly even during the code generation stage. Someone tried to do too much in one pass – the type annotation stage is no place for arithmetic optimizations, particularly when they screw up your type analysis in subtle ways like this!
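To make the ordering problem concrete, here is a toy sketch (hypothetical names throughout; nothing like the real compiler's internals) that models the operands of a left-grouped `|` chain as type tags. Folding `0 | x` down to `x` before type analysis makes the illegal chain indistinguishable from a legal one:

```csharp
using System;
using System.Linq;

// Toy model: operands of a left-grouped |-chain as type tags.
// "lit0" = the literal token 0, "enum" = E.X, "int" = any other int constant.

// The spec's rule for a single a | b: a *literal* zero implicitly converts
// to the enum type, but a constant that merely equals zero does not.
string Or(string a, string b)
{
    if (a == "enum" || b == "enum")
    {
        bool operandsOk = (a == "enum" || a == "lit0")
                       && (b == "enum" || b == "lit0");
        return operandsOk ? "enum" : "error";
    }
    return "int"; // lit0 | lit0 yields a constant 0 that is no longer a literal
}

// Type analysis: fold the chain from the left, the way | groups.
string TypeCheck(string[] chain) => chain.Aggregate(Or);

// The premature optimization: delete "0 |" operands before type analysis runs.
string[] Fold(string[] chain) => chain.Where(t => t != "lit0").ToArray();

var e5 = new[] { "lit0", "enum", "lit0" };  // 0 | E.X | 0
var e6 = new[] { "lit0", "lit0", "enum" };  // 0 | 0 | E.X

Console.WriteLine(TypeCheck(e5));        // enum  (legal)
Console.WriteLine(TypeCheck(e6));        // error (the spec's answer)
Console.WriteLine(TypeCheck(Fold(e6)));  // enum  (the optimizer hid the error)
```

Run the optimizer first and the type checker never sees the `(0|0)` subexpression at all, which is exactly the bug described above.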

Next time I'll talk about how these same premature optimizations screw up definite assignment checking. There is a closely related situation in which there is a variable which the compiler treats as definitely assigned, and it IS definitely assigned, but the definite assignment specification requires us to produce an error. Bonus points to anyone who can suss it out from that vague description!


1. Wesner Moise says:

When I try to create unions using explicit struct layout, (for example, to union a double and a long) the compiler complains that I need to provide assignments to both the double and long.

The following compiles and runs without producing an exception, because of aggressive optimization. The second line doesn’t produce an error because x is a dead store.

string s = null;

int x = s.Length;

Console.Write(s);

My guess for the situation where the compiler treats a variable as definitely assigned against the spec is the following.

string s;

if (true)

s = null;

int x = s.Length;

Console.Write(s);

2. Personally, I think you should change the spec to match the existing behavior here. Constant folding should have absolutely no bearing on whether the program matches the spec or not, and if the spec says it does, the spec should be changed.

3. Carlos says:

The simplest example I could think of:

int a;

int b = (a * 0) + 1; // Should be an error: (a * 0) is not a constant expression and a is not definitely assigned

P.S. Welcome back!

4. Eric Lippert says:

Wes: send me a repro of your bug, I’ll have a look at it.

Anthony:  The spec specifies precisely what constant foldings must be performed so that folded constants can be used for switch labels, constant fields, etc.  So it’s a non sequitur to say that "constant folding should have no bearing on whether the program matches the spec".  The spec is all about constant folding.

The problem with changing the spec to match the behaviour is that the existing behaviour is a complicated and weird interaction between an optimizer (which is NOT specified by the specification) and the constant folding code (which is specified.)  For example, (0-0) | E.X  also fails to produce an error, but (7-7)|E.X does correctly produce an error.  Basically we have a few choices:

1) Do nothing — document the spec violations internally and ensure that we maintain them

2) Rewrite the spec so that it is a description of the current complex and bizarre behaviour

3) Extend the spec even further, so that, say, any integer constant could be implicitly converted to enum.

Option (2) is gross, option (3) would make unfortunate expressions like (6 + 8 - 2 * 7) | E.X legal, so (1) is what we’re stuck with.

Hence, root of all evil.  We wouldn’t have this evil choice to make if someone had made the optimization pass the LAST thing that runs rather than the FIRST.

5. Eric Lippert says:

Carlos: This is in fact a bug that you’ve found, so, good work.  However, it’s not the bug I asked for.  I asked for a bug where the compiler treats a var as definitely assigned, it IS definitely assigned, but the definite assignment spec says that it should be an error.

Your bug is one where the compiler treats a var as definitely assigned, it is NOT definitely assigned, and it should be an error.

But thanks for pointing this out, because this means I’ll need to ensure that this case is maintained as well!

6. Eric Lippert says:

Wes: the compiler is smart enough to treat if(true) as something which always executes the consequence, so s will be definitely assigned and this is not in violation of the spec.
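Eric's point can be checked directly: the definite-assignment rules special-case a constant true condition, so the branch is treated as always taken (a sketch, assuming any spec-conforming C# compiler):

```csharp
using System;

string s;
if (true)           // constant true: the branch is treated as always executed
    s = "assigned";

Console.WriteLine(s.Length); // s is definitely assigned here; prints 8
```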

7. Carlos says:

int y = 2;

int x;

if ((y * 0) == 0)

x = 1;

int z = x + 1; // Should be an error: (y * 0) is not a constant expression so the compiler shouldn’t consider the "if" as always taken.

8. Eric Lippert says:

Ding!  Give that man a cigar!
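For reference, a spec-conforming compiler rejects Carlos's snippet with CS0165 (use of unassigned local variable), even though x is in fact always assigned at runtime; giving x an initializer is the usual fix (a sketch assuming a modern C# compiler):

```csharp
using System;

int y = 2;
int x = 0;                // initializer required: the spec does not treat
                          // (y * 0) == 0 as a constant-true condition
if ((y * 0) == 0)
    x = 1;

Console.WriteLine(x + 1); // 2
```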

9. Stupid question, but…why not break non-compliant programs? They’re non-compliant. I admit that it’s cold comfort to the broken people when you tell them that they were wrong, but otherwise it rather degrades the relevance of a spec to say that, well, we’re never going to comply with it.

10. Eric Lippert says:

Look at it from the point of view of the customer.  You have a codebase of hundreds of thousands of lines of working code.  Microsoft comes along and says "hey, we’ve got a new compiler that has new language features that will enable you to do even better work."

New language features raise productivity, and thereby increase the profitability of companies which provide .NET solutions.  Those companies are better able to provide software to their customers, which encourages those customers to see the .NET platform as attractive, which means that they buy Windows, which makes me happy as a MSFT shareholder.

Now we come along and say "but there’s a catch.  We were bozos and allowed you to write a perfectly sensible, straightforward program which for obscure reasons just happened to be noncompliant with the holy specification in this one little way.  If you try to compile your old code against our new compiler, it might break.  Which means that you have a choice of either (a) don’t use the new compiler, and forego all the new productivity benefits, or (b) take your developers off of whatever they’re doing now so that they can insert the right parentheses in the right places to bring the code up to compliance with the specification.

Both of those are costs to the ISV, costs which are associated with ZERO increase in code quality.  It makes upgrading less attractive, it makes people less productive, blah blah blah.

Remember, the Holy Specification is a means to an end, not an end in itself.  It’s there so that the language can be understood by the people who are using it.  But "stable from version to version" is, for most real-world applications, about a million times more important than "word-for-word compliant to the spec".  Would we like to be totally compliant to the spec, all other things being equal?  Of course.  But given that we’ve screwed up in the past, it is way more important to keep those mistakes stable and documented and predictable than to get people to upgrade and discover that they have to "fix" previously working code.

11. Eric Lippert says:

Incidentally, the way that I discovered this issue today was that I moved the premature optimization to run after constant folding, not before.  When I went to recompile the .NET Base Class Library with the new compiler, sure enough, somewhere in there someone had 0 | 0 | E.X, and it broke right away.  This is in real-world production code, and if I were to check in a compiler that broke the BCL build, I would be in so much trouble it would make your head spin. So many _thousands_ of people depend on being able to make a fresh build of the BCL every day that we have to show that there is a HUGE benefit that outweighs the massive cost of making breaking changes. Merely coming into line with the spec is nowhere near important enough to justify stopping thousands of people from being able to compile for a day.

12. In the past VS has always seemed to flag discovered violations as warnings for a version, then promote them to errors, giving at least some lead time. Heck, VS2005 broke badly a number of apps that relied too much on VS6 and VS2003’s bad scoping rules. (Alternately, a special subparser designed to scan imported code for specific patterns like this could correct them, since you’ll be documenting them anyway. It won’t work for everything, but it’s something. I know you already have a lot of code in upgrading project files, but as far as I know, not for actual code.)

13. zzz says:

Very simple:

a) Some validation thing, possibly upon first import of "legacy" code will detect situations where there is possibility for a break to occur and warn about them

b) a compiler switch could be used or a define added that tells the compiler to use non-conforming, legacy behaviour on these kinds of special cases. A validation/import tool could then automatically add this to old projects/files.

14. Eric Lippert says:

Indeed, one of the things we are researching for the next release of the compiler is what breaking changes are appropriate for this kind of "switched compiler" treatment.   Again, though, just having a switch is not a magic panacea.  Additional switches greatly complicate the test matrix, which means that we have less confidence that we’re giving you a quality product.

Adding warnings is a good technique.  I’ll soon be talking in this space about some of the warnings I’ve added to the compiler in the last few weeks.

15. "IS definitely assigned, but the definite assignment specification requires us to produce an error."

E e6 = 0 | 0 | E.X;

16. Eric.

Your answer is roughly what I expected you to say about maintaining application compatibility, and roughly what I’d have said were I in your job rather than my job. However, since I don’t work for Microsoft, I can’t code to what your programs do: I have to code to the spec. So, when I code to the spec, my programs don’t work. This makes the point of a spec a touch nebulous!

As a side note, why does "breaking the app causes customer pain and updates" get to be a good argument when "breaking the spec by changing it, causing customer pain and updates" not get to be?

I suspect the answer here is: changing the spec doesn’t break anyone’s program, because no-one’s program can be spec-compliant, because spec-compliant programs don’t work. (I appreciate that your example above is a massive edge-case, but you’re viewing this, I believe, as a specific example of a generic problem and discussing it as such.) However, when us poor external developers clamour for documentation (which MSFT is pretty darn good at providing) and then use it, only to find that it doesn’t work and the truth is embedded in your code rather than the spec, it causes people to throw books at walls.

Naturally, "never make any mistakes or write code with bugs in again" is not an answer, even for you :-) But it would be nice to see programs move toward the specs sometimes, rather than the specs move toward the programs, meaning that I can work from the docs rather than have to check interoperability by constantly testing and upgrading to the latest releases.

17. Eric Lippert says:

Clearly there is a spectrum here, and there are definitely times when we do make breaking changes and implement the spec.  For example, I earlier wrote about a breaking change from 1.1 to 2.0 where we fixed a bug in the implementation of enumeration subtraction.  That ended up breaking an important customer who was most put out, but understood when we explained what had happened.

As for the question of whether this is a specific example of a generic issue — yes it is, but the generic issue I was intending to address was the pain of premature optimization.  We haven’t yet decided whether we’re going to fix this one or not, but either way, I will have to write wodges and wodges of code in order to make LINQ work around this premature optimization.

18. Good call, and I do completely and utterly agree with you about premature optimisation :-)

I must have missed the earlier note about a breaking change. Thanks for the pointer!

19. Eric, you know more than I do about compilers, being a pro at this and all while I’m a mere undergrad, but I’ve been writing a compiler for a small C-like language with type inference for my thesis, and I couldn’t agree with you more. In fact, I’d take it farther and say that prematurely doing *anything* is the root of all evil – put off doing things as long as you can!

20. bmm6o says:

What was the context for "0 | 0 | E.X" appearing in code?  The first two are actual zero numeric literals?  It seems like a lot of extra typing for nothing.

21. Eric Lippert says:

The context was a table mapping one kind of flag to another.  A particular set of flags had five "flavours", and was mapped to a different set with four flavours, so to make it clear which mapped to which:

PMtoAF[(int)( PM.F | PM.CF |   0   |   0   |   0   )] = AF.CI |   0   |   0   | AF.NP;

PMtoAF[(int)(   0  | PM.CF |   0   | PM.GF |   0   )] = AF.CI |   0   | AF.IO |   0  ;

PMtoAF[(int)(   0  | PM.CF |   0   |   0   |   0   )] = AF.CI |   0   | AF.IO | AF.NP;

etc.  One of these happened to have 0 | 0 | enumtype in it.

22. bmm6o says:

Thanks for clarifying…  My initial reaction to the post was "forget back-compat, that’s a dumb thing to type", but in context it’s pretty reasonable.  I guess one lesson is that it’s hard to spec out a syntax in the abstract – you have to decide on the legality of various constructs in a vacuum.  It’s hard to argue that 0 | 0 | E.X should be illegal with such a reasonable code sample (and when such very similar things are legal).  And part of it is aesthetic (not wanting warts), which makes it even harder.

23. Reader Larry Lard asks a follow-up question regarding the subject of Monday’s blog entry. Why is it

24. David says:

I understand the difficulties involved with changing the spec and with fixing the bug, but how does it make sense to just do nothing? If your attitude towards standards compliance (the C# spec is an ECMA standard, after all) is "Being compliant is not as important as not pissing people off", what do you think that does to my trust level in your commitment to standards compliance in the future?

You mention the cost to ISVs of either not upgrading or having to fix their illegal code, but you fail to factor in the productivity cost when the implementation doesn’t match the spec, and developers have to waste time finding the mismatch and determining what to do about it. As Stuart said, all we have is the spec; if it’s "wrong" (in the sense that it doesn’t match reality), we’re sunk. Granted, in this case, I don’t see that being a big problem, but as a general issue, I think you are missing an important part of the equation.

25. Eric Lippert says:

> how does it make sense to just do nothing?

Ultimately we decided to take option 3.

However, "just do nothing" is often the correct course of action when faced with a bug. I have fixed a great many specification violations in the last twelve years, and rolled back a number of those fixes when it turned out that fixing the violation broke an important customer’s code.

We constantly strive to be both consistent and correct, but it is not always possible to do both in an environment where well-meaning humans have produced erroneous code. When faced with a situation in which consistency and correctness are opposites, my team acts like professional engineers. We evaluate the specific situation, consult with the affected customers when possible and make the technical calls that we believe best serve the interests of our customers and therefore our business.

If you honestly believe that we do not take the costs you mention into account, then frankly you’re not reading this blog very carefully.  And frankly, there is probably no developer on this planet whose productivity is more deeply impacted by mismatches between the spec and the implementation than _me_. I understand that cost at a far more visceral level than the vast majority of users will ever need to, and thank goodness for that.

If my pulling the curtain back and showing what factors go into these hard decisions decreases your trust level, that’s unfortunate. I write this blog because I firmly believe that a lack of transparency has bred mistrust in the past. I’m seeking to remedy it in part by being brutally honest about our mistakes and the difficult situations which are their consequences.

I am deeply committed to engineering a solid product that advances the state of the industry and adds value to the economy; being standards compliant is an important part of that commitment but it is far, far from the only part. Standards are a means to an end — a tool. They’re not an end in themselves. I am committed to making the "standard tool" a useful tool, but I am more committed to making the compiler itself a useful tool since that’s what actually gets the customer’s job done.

26. David says:

First of all, I appreciate you taking the time to respond intelligently to my scattershot comments all over your blog. Although you should probably know that your willingness to respond is what prompts me to continue posting, so if I’m keeping you from doing something more important… :-)

1. "If my pulling the curtain back and showing what factors go into these hard decisions decreases your trust level"

It certainly does not. The decision itself is what concerns me; at least by explaining the reasoning behind it, I know that these decisions are not being made randomly by monkeys with typewriters :-) In fact, I very much appreciate your willingness to be so transparent about the tough decisions that have to be made. Frankly it’s refreshing after what I have perceived to be a disturbing lack of transparency by all but a few members of the C#, CLR, and BCL development teams (Mads being one of the notable exceptions). In fact it is that transparency which has prompted me to respond so vigorously; I actually feel like at least someone is listening! I apologize if the flood of contrary opinions has given you the impression that I do not appreciate your honesty. Please, keep it up!

2. "If you honestly believe that we do not take the costs you mention into account…"

I suppose I don’t really believe that. My previous comment was prompted by the fact that you did not mention the impedance mismatch between documentation and reality as a factor in your decision in the post itself. But then, I can’t ask you to transcribe your entire train of thought about a particular issue over what I assume was at least a several week period for my perusal, can I? Point taken.

3. "We constantly strive to be both consistent and correct, but it is not always possible to do both in an environment where well-meaning humans have produced erroneous code."

I understand this perfectly. I also understand that you are aware of the significant responsibility you have in developing a platform that is used by millions of developers, and that you accordingly hold yourself to a very high standard.

4. "When faced with a situation in which consistency and correctness are opposites…[we] consult with the affected customers…"

This to me is the crux of the problem. Of course you cannot possibly consult with all of the affected customers; that’s ridiculous. So when you make this statement, you obviously mean "we consult with a sample of the affected customers we know about." These customers are likely to be companies with which you have existing relationships (and probably support contracts). These, by definition, are likely to be large companies that you deal with repeatedly over a long period of time.

But I don’t work for one of those companies. Their needs are not my needs, nor are they the needs of the majority of developers who use this platform. As I have just finished commenting on another post, the needs of these companies will naturally lean toward maintaining legacy systems, which means they are more likely to value consistency (i.e. backward compatibility) over correctness than the average developer. But that does not mean that most users would make that decision, nor does it make it the right decision for the long term.

Let me be clear: I don’t for a moment believe that when you encounter a potential problem like this, you just call up your 10 biggest customers and ask "Would this break you?", and base your decision off of their answer. I know you and your team put a lot of thought into this, and that you want to do the right thing. But it is becoming clear to me, through your posts, the posts of other MS bloggers, and my own observations of the decisions the various .NET teams have been making over the last several years, that the ratio of the value that Microsoft as a whole places on backward compatibility versus correctness and forward momentum is much larger than that of the majority of their users, regardless of where that influence is coming from. When I project that value system into the future, what can I conclude but that .NET as a platform will lag behind and become obsolete before its time? And seeing that, what compelling reason is there for me to stick with it, rather than start looking for a platform which will continue to serve me well far into the future?

27. Eric Lippert says:

Here’s something I’ve noticed over the last decade at Microsoft which surprised me at first, but now I’m totally used to it.

When your decisions have as much impact on the industry as ours do, any time we do some action X, there will be people who complain that we did X and did not do not-X.  That’s unsurprising. The surprising bit is that frequently when we do X, there are people who complain that we really ought to do X.

I am not sure what is up with these people; basically I’ve come to the conclusion that people are strange.

Case in point: I am trying to decide whether to make a breaking change to the product, and if so, what breaking change. In the past we might have made a beta or a CTP and shipped it out to high-end customers to see what their feedback is. But for the Orcas release I decided, yes, we will keep doing that, but in addition I will document not just the putative breaking change, but the decision process that goes into it. I’ll do this before the change is final, in a public forum with 10000+ readers in the community. That way I get feedback not just from high end customers, but from affected community members who have opinions one way or the other.

In this specific case, I was initially leaning towards not making any breaking change at all. But based on further analysis and careful consideration of the feedback that I received on this and other posts, we decided to take a breaking change which affects overload resolution rules. This change is a real breaking change — at least one customer has since reported the break.  But we decided that this particular premature optimization was holding us back from deeper optimizations and, more importantly, the ability to cleanly design and implement new language features like expression trees. So we took the community feedback and took the break.

And for my trouble, I get accused by members of the very community that I’m attempting to serve of being at best insensitive, at worst disingenuous.  Clare Boothe Luce was right.

To address the concern of your last paragraph: what you are describing is called the "value proposition". I am committed to producing a product with a compelling value proposition for C# developers, with an emphasis on pro devs working in corporate environments, but with attention to other market segments as well. This should not come as a surprise.

My larger team is committed to providing a value proposition for not just C# developers, but for F#, JScript, VB, Python, Ruby, etc, developers, who have very different market segmentations.

The developer division as a whole is committed to providing a value proposition for all developers on Windows-based platforms.

It’s up to you to decide whether the value proposition of the platform actually has net value to you. However, the argument that we are not driving the C# language forward adequately due to backwards compat issues strikes me as specious. We have just added query comprehensions, lambda expressions, extension methods, auto properties, object initializers, expression trees, collection initializers, local variable type inference, and method type inference into anonymous function bodies to the language. This does not strike me as the actions of a team that is afraid to take big bets, and I don’t think you’ll be seeing many of these features showing up in Java any time soon.