How Many Passes?

Large bodies of code written in the C/C++ languages typically divide up the code into “header” files, which are just declarations of methods and types (and definitions of macros). The actual bodies of those methods and types are in completely different files.

People sometimes ask me why doesn’t C# need header files? which is a bit of an odd way to phrase the question; I would have asked the equivalent question why does C++ need header files? Header files seem like a huge potential point of failure; all the time I edit C++ code and change the signature of a method; if I forget to update the header file, then the code doesn’t compile and often gives some cryptic error message. Hopefully this large cost actually buys you something.

It buys the compiler writer one thing, and the user one thing.

What it buys the user is that you can compile each individual “cpp” file into a “obj” file independently, provided that you have all the necessary headers. All the information necessary to generate the bodies that are in a given cpp file can be gleaned from the set of headers. This means that the build system can recompile just those cpp files that changed, provided that no header changed.

What it buys the compiler writer is that every file can be compiled in “one pass”. Because every type and method declaration is processed before its first usage, the compiler can simply start from the top of the file, pull in all the included headers, and proceed from the top down, spitting out the obj file as it goes, never having to go back and revisit something its seen already.

The C# language does not require that declarations occur before usages, which has two impacts, again, on the user and on the compiler writer. The impact on the user is that you don’t get to recompile just the IL that changed when you change a file; the whole assembly is recompiled. Fortunately the C# compiler is sufficiently fast that this is seldom a huge issue. (Another way to look at this is that the “granularity” of recompilation in C# is at the project level, not the file level.)

The impact on the compiler writer is that we have to have a “two pass” compiler. In the first pass, we look for declarations and ignore bodies. Once we have gleaned all the information from the declarations that we would have got from the headers in C++, we take a second pass over the code and generate the IL for the bodies.

A good way to think about this is that the first pass computes metadata: the “top level” stuff, like namespaces, classes, structs, enums, interfaces, delegates, methods, type parameters, formal parameters, constructors, events, attributes, and so on. The second pass computes the IL: the code that goes in the method bodies, constructor bodies, and so on.

In practice, though we only do “two passes” over the raw text, we do many, many passes over the resulting data structures. I thought I might lay out for you just what passes we do in order to implement the C# language.

The first “metadata” phase goes like this:

The first thing we do is take the text of the sources and break it up into a stream of tokens. That is, we do lexical analysis to determine that “class c : b { }” is class, identifier, colon, identifier, left curly, right curly.

We then do a “top level parse” where we verify that the token streams define a grammatically-correct C# program. However, we skip parsing method bodies.  When we hit a method body, we just blaze through the tokens until we get to the matching close curly. We’ll come back to it later; we only care about getting enough information to generate metadata at this point.

We then do a “declaration” pass where we make notes about the location of every namespace and type declaration in the program. At this point we’re done looking at the source code for the first phase; every subsequent pass is over the set of “symbols” deduced from the declarations.

We then do a pass where we verify that all the types declared have no cycles in their base types. We need to do this first because in every subsequent pass we need to be able to walk up type hierarchies without having to deal with cycles.

We then do a pass where we verify that all generic parameter constraints on generic types are also acyclic.

We then do a pass where we check whether every member of every type — methods of classes, fields of structs, enum values, and so on — is consistent. No cycles in enums, every overriding method overrides something that is actually virtual, and so on. At this point we can compute the “vtable” layouts of all interfaces, classes with virtual methods, and so on.

We then do a pass where we work out the values of all “const” fields.

At this point we have enough information to emit almost all the metadata for this assembly. We still do not have information about the metadata for iterator/anonymous function closures or anonymous types; we do those late.

We can now start generating IL. For each method body (and properties, indexers, constructors, and so on), we rewind the lexer to the point where the method body began and parse the method body.

[UPDATE: A number of people have asked me about how the passes below are optimized. During the initial binding pass we make notes on things like “there’s nullable arithmetic in here”, or “there’s an object initializer in here” and so on. When we go to do object initializer rewriting, say, we first check to see if we even have to, and skip the pass entirely if we know it will be a no-op. (And testing runs a special version of the compiler that does the pass anyway, and reports back whether it had any effect or not, so that we can test that our pass-skipping logic is correct.) ]

Once the method body is parsed, we do an initial “binding” pass, where we attempt to determine the types of every expression in every statement. We then do a whole pile of passes over each method body. Again, at this point we are done looking at the source code; we’re now passing over the “annotated parse tree” many times, and rewriting it as we go.

We first run a pass to transform loops into gotos and labels.

(The next few passes look for bad stuff.)

Then we run a pass to look for use of deprecated types, for warnings.

Then we run a pass that searches for uses of anonymous types that we haven’t emitted metadata for yet, and emit those.

Then we run a pass that searches for bad uses of expression trees. For example, using a ++ operator in an expression tree.

Then we run a pass that looks for all local variables in the body that are defined, but not used, to report warnings.

Then we run a pass that looks for illegal patterns inside iterator blocks.

Then we run the reachability checker, to give warnings about unreachable code, and tell you when you’ve done something like forgotten the return at the end of a non-void method.

Then we run a pass that verifies that every goto targets a sensible label, and that every label is targeted by a reachable goto.

Then we run a pass that checks that all locals are definitely assigned before use, notes which local variables are closed-over outer variables of an anonymous function or iterator, and which anonymous functions are in reachable code. (This pass does too much. I have been meaning to refactor it for some time now.)

At this point we’re done looking for bad stuff, but we still have way more passes to go before we sleep.

Next we run a pass that detects missing ref arguments to calls on COM objects and fixes them. (This is a new feature in C# 4.)

Then we run a pass that looks for stuff of the form “new MyDelegate(Foo)” and rewrites it into a call to CreateDelegate.

Then we run a pass that transforms expression trees into the sequence of factory method calls necessary to create the expression trees at runtime.

Then we run a pass that rewrites all nullable arithmetic into code that tests for HasValue, and so on.

Then we run a pass that finds all references of the form base.Blah() and rewrites them into code which does the non-virtual call to the base class method.

Then we run a pass which looks for object and collection initializers and turns them into the appropriate property sets, and so on.

Then we run a pass which looks for dynamic calls (in C# 4) and rewrites them into dynamic call sites that use the DLR.

Then we run a pass that looks for calls to removed methods. (That is, partial methods with no actual implementation, or conditional methods that don’t have their conditional compilation symbol defined.) Those are turned into no-ops.

Then we look for unreachable code and remove it from the tree. No point in codegenning IL for it.

Then we run an optimization pass that rewrites trivial “is” and “as” operators.

Then we run an optimization pass that looks for switch(constant) and rewrites it as a branch directly to the correct case.

Then we run a pass which turns string concatenations into calls to the correct overload of String.Concat.

(Ah, memories. These last two passes were the first things I worked on when I joined the compiler team.)

Then we run a pass which rewrites uses of named and optional parameters into calls where the side effects all happen in the correct order.

Then we run a pass which optimizes arithmetic; for example, if we know that M() returns an int, and we have 1 * M(), then we just turn it into M().

Then we do generation of the code for anonymous types first used by this method.

Then we transform anonymous functions in this body into methods of closure classes.

Finally, we transform iterator blocks into switch-based state machines.

Then we emit the IL for the transformed tree that we’ve just computed.

UPDATE: I’ve glossed over a number of additional passes that happen during IL generation. Basically, the IL generator turns the program tree into a set of “basic blocks” — that is, sequences of code where nothing “in the middle” of the block leaves the sequence, and no other sequences branch to “the middle” of any other sequence. Once we have the code in basic block form we can start doing additional analysis on it: rewriting branches to be more efficient, removing basic blocks that the arithmetic optimizer has proven to be dead code, and so on. See my article on the optimize switch for more details of what we do during this pass.

Easy as pie!

Comments (50)

  1. Interesting to hear Eric! I remember the first time I heard that

    string superBigString = bigStringIHave + anotherBigString + yetAnotherBigString;

    …was actually optimized to a String.Concat call. Made using StringBuilder unnecessary. It would be interesting to hear about more optimizations done and how it affects the daily code developers write. So here’s a vote for such a blog entry =)

  2. What pass takes a base.X() call in an iterator or anonymous method and generates an appropriate accessor method on the containing class?

    That is done by both the interator rewriter and the anonymous function rewriter. In the anonymous method rewriter, that pass is run after all other transformations have been performed on the anonymous function. The iterator rewriter is kind of bizarre; first we rewrite the body to figure out where the switch needs to branch to, then we rewrite the base calls, then we generate the switch and the branches, and then we generate the returns. I need to refactor that code one of these days; it is nigh impenetrable. — Eric

  3. Pavel Minaev says:

    > What it buys the compiler writer is that every file can be compiled in “one pass”. Because every type and method declaration is processed before its first usage, the compiler can simply start from the top of the file, pull in all the included headers, and proceed from the top down, spitting out the obj file as it goes, never having to go back and revisit something its seen already.

    It’s true for C, but not really so for C++, because of inline functions, and body of class being fully accessible from within method bodies. I.e.:

      class foo {

           void bar() { baz(); /* forward reference! */ }

           void baz() { bar(); }


    why they didn’t go all the way there, and didn’t also provide the same facility on namespace level, is beyond me, but oh well. Ultimately it just means that compiler writer now has two headaches – he has to write a two-pass compiler because of classes, but he also has to insert checks in that compiler so that forward references on namespace level are reported as errors (because language spec requires them to be ill-formed).

    That aside, headers actually provide one other advantage to the user: they (ideally) separate interface of the module from its implementation. This allows several things.

    For one, I can do interface-first design – write the header first, carefully designing the public surface of my API, and then implement it. If I forget or mismatch something, the compiler will complain. And if something isn’t in the header, then it’s not exposed (well, not quite true with C/C++, in practice…).

    As a side note, doc comments also normally go into headers, so they stick with interface where they belong – not with implementation.

    For another, as a library client, I can use header as formalized documentation – just open it, and look at the definitions (hopefully with doc comments).

    I suspect those two reasons are why F# went for its separate module interface (.fsi) + module implementation (.fs) approach, since they don’t actually use .fsi files in the same way C/C++ does for separate compilation.

  4. Pavel Minaev says:

    > What pass takes a base.X() call in an iterator or anonymous method and generates an appropriate accessor method on the containing class?

    It’s implicit in the algorithm due to the ordering of the steps:

    – Then we run a pass that finds all references of the form base.Blah() and rewrites them into code which does the non-virtual call to the base class method.

    – Then we transform anonymous functions in this body into methods of closure classes.

    – Finally, we transform iterator blocks into switch-based state machines.

    As I understand it, the structures between those steps don’t map 1-to-1 to C# code, which is why this is possible. So at step #1, base.Foo() is mapped to something which isn’t representable directly in C# syntax, but is analogous to ClassName::Foo() in C++.

  5. Pavel, that’s true in C# 2 and 3, but it generates unverifiable code (which therefore produces a warning, because there wasn’t time to fix it in those versions of the compilers). Because verifiability prohibits non-virtual calls on any object other than ‘this’, I believe.

    In C#4, this bug is fixed by having the compiler actually generate a method accordingly, so that:

    public IEnumerable<int> foo() {

     yield return base.Bar();


    is compiled into something equivalent to:

    public int baseBar() { return base.Bar(); }

    public Func<int> foo() {

     yield return baseBar();


    I’m curious as to which pass of the compiler does that rewriting.

  6. Pavel Minaev says:

    > Pavel, that’s true in C# 2 and 3, but it generates unverifiable code (which therefore produces a warning, because there wasn’t time to fix it in those versions of the compilers). Because verifiability prohibits non-virtual calls on any object other than ‘this’, I believe.

    Indeed, I’ve found the corresponding section in the CLI spec – PIII 3.19. It’s actually a little bit more subtle – it prohibits non-virtual calls not via “this” only for non-final virtual methods – so base.GetType() is fine, but base.ToString() is not. Something new for me, and I’ve never ran into this VC# warning before, but overall it makes sense.

  7. Chris B says:

    Thanks a lot for putting this out here. I have been wondering about the process the C# compiler used to get from a loose string of text into IL for a long time.

    What is the reason for turning partial methods with no implementation into no-ops as opposed to completely eliminating them as if they were unreachable code?  Does the compiler do this the same way regardless of the state of the debug info and optimization options?

    I believe the no-ops are removed by the code generator if in “optimize” mode but I am not certain of that; I haven’t looked at that code for a while. — Eric

  8. Aaron says:

    Very enlightening.  It’s interesting that you mention your intent to refactor one of the passes that does too many things, presumably, into more passes.  How important is the cohesion of the passes compared to the performance of the compiler (or are they somehow orthogonal)?  Do you consider the impact of adding passes to the compiler when designing new language features, or do you simply design the feature as you will and then do what it takes to make the compiler performant?  Obviously cramming functionality together into a single pass violates the "Do Not Make Premature Optimizations" principle, but since your team likely has a good understanding of how much penalty must be paid for each additional pass, I’m curious as to the process for determining how each pass is justified.

  9. Pavel Minaev says:

    > What is the reason for turning partial methods with no implementation into no-ops as opposed to completely eliminating them as if they were unreachable code?

    I suspect it is so that one can set a breakpoint at such a method in debug build, whether it’s implemented or not. VC# likes to put NOPs in the code elsewhere as well, with /optimize-, and probably for the same reason.

  10. Stilgar says:

    Great post!

    I am curious how the IDE communicates with the compiler and which passes are included when the IDE checks the code. Also how do you do “edit and continiue” and integrate with the debugger in general. Now when I read my questions I realize that these may require a full post or more than one full post.

    The interaction between the design-time IDE, the run-time debugger, and the edit-and-continue layer are all very complex, and none of them are my area of expertise. — Eric

  11. Jason Whittington says:

    Given the large number of passes many of which don’t seem to rely much on one another what opportunites do you see for doing some of this concurrently?  Or compiling multiple projects concurrently?  Or is disk I/O the big bottleneck, which would make concurrent project compilation pointless?

    MSBuild should already be pretty smart about doing project compilation on separate processors, but again, that is not my area of expertise.

    I would like very much to be able to take advantage of the potential parallel execution here. Doing the passes in parallel is probably a bad idea because the rewriting passes all act on the output of the previous pass, and many of the analysis passes depend on the rewriting passes having run. (For example, today’s reachability checking pass assumes that all loops have been lowered into ifs and gotos.)

    We could certainly run analysis of method bodies in parallel. However, that then raises the question of what to do about emitting the IL; that seems like a poor candidate for parallelization. — Eric

  12. Patrick Kristiansen says:

    I’m curious to know whether the basic lexer and parser is hand-written or generated by tools analogous to lex and yacc?

    The lexer and parser are both hand-written; the parser is your basic recursive descent parser. Though it is easy to quickly get a language going with a machine-generated parser, and they do tend to be accurate, there are enough oddities and unusual cases in C# that its not easy to do a great job with a machine-generated parser. We also want to be able to easily extend and debug the parser, add custom error recovery, add heuristics that make it more useful for IDE scenarios, and so on. I’m sure that if we had an expert on machine-generated parsers we could get something going; we’ve done experiments with them in the past. Trouble is, when something breaks really horribly, you kinda need someone with an advanced degree in parser theory to suss it out. We don’t want to have to go running to Steve Lucco every time our parser has a glitch. — Eric

  13. Jay Bazuzi says:

    Wow, that brings back a lot of memories. Of being confused, mostly: I was a manager at the time.

    One implication of this architecture is if you have errors in your program, the compiler will stop on the failing pass. That means if you have cycles in your base types, and cycles in your generic parameter types, only the first class of errors will be reported. Only when the first class of errors is resolved will the second class appear.  

    We had been writing our C++ in a way the reflected our love of C#. We tried to use idioms that would bring some of the power, safety, and expressiveness of C# in to C++. At one point we tried to port the C++ to C# by renaming all the source files to .cs, compiling, and working through the errors. We figured this might be doable since we were writing C++ like it was C#.

    We’d compile and have 10,000 errors, then work our way down to 1, then fix that last 1, and then there’d be 7000 new errors, as the compiler made it on to the next pass.

    Indeed; if you cannot know that your base classes are acyclic then it makes it very difficult to, say, determine whether a given method overrides something; you could look through the base types forever. We make it easier on ourselves by skipping later phases if important earlier phases did not complete without error. This is then complicated by the desire to give good intellisense in the IDE even when the program does have problems at a more fundamental level. It’s a hard problem. — Eric


  14. Gilles says:

    Having this internal glimpse of the mechanics is very, very interesting!

    And it’s simply all kinds of awesome to blog about it.

    Eric, a hearty thank-you for the blog in general and this one in particular.

  15. Jake says:

    Thank you for this post! I’ve been wondering about many of these stages for a while.

    How does the compiler handle emitting of IL? Does it use reflection.emit or something more sinister?


    The compiler is written in unmanaged C++, so the managed ref emit library is impractical. We use the unmanaged metadata and IL emitting code provided by the CLR team.

    Even if we had a compiler written in managed code, we would not use ref emit; its metadata emitter was not designed with full-fledged compilers in mind. It is not difficult to find crazy type topologies that ref emit is unable to handle. If you want to write a compiler in managed code that can emit arbitrarily complex type topologies then consider something like CCI. — Eric

  16. Jesper says:

    Which pass expands query comprehensions? Since that’s fairly mechanical, is it an effect of the top level parse?

    The initial method body binding pass does syntactic query rewrites on the parse tree. We could in theory do it in the method body parser; for a variety of reasons it turns out to simply be easier to do it post parsing. — Eric

  17. Derek says:

    > The impact on the user is that you don’t get to recompile just the IL that changed when you change a file; the whole assembly is recompiled.

    Is this part literally true?  Does a single change require the whole assembly to be recompiled?  Or is the system a bit smarter, only recompiling those files that depend on the changes?

    Yep. We recompile the whole thing. The compiler is pretty fast! — Eric

  18. Vlad says:

    What I really want to know very much, is C# compiler faster on multicore CPU (can it use multiple threads)? For how much? Can’t find any benchmarks.

    If not, are there plans to made it so?

    I would like that very much, but no guarantees. See above. — Eric

  19. TheMG says:

    Many of those passes are optimizations. Why do they need to run in different passes, and why do they not run inside a common backend for all .Net compilers?

    They don’t need to run in different passes. Doing so is convenient because it means that we can (1) write classes that have a single responsibility, and (2) write later passes to depend on postconditions of earlier passes.

    They don’t run in a common backend because that would mean every .NET language team would have to agree on a common format for the internal representation of a program and then of course it would not be “internal” anymore. Remember, the compiler fundamentally is manipulating parse trees at this stage, and every language parses differently.

    Could we build a library of lowest-common-denominator objects representing things like “if”, “local variable”, “assignment”, “goto” and so on? Then lower the parse trees to those and run a common backend analyzer on them. Yes, we could. The expression tree library could be used as a start on such a thing. But is there any cost savings in getting every compiler team to agree on such a format? It seems like a lot of work for very little gain; the vast majority of the analysis and transformation is language-specific and happens before IL gen. This is an idea we’ve considered, but I don’t think much will come of it any time soon. — Eric

  20. Dean Harding says:

    Aaron: I imagine having one pass that does three things is about the same in terms of performance as having three passes that do one thing each.

    In any case, the C# grammar is simple enough that compiling even very large projects takes a very short amount of time. Our software is ~1M loc, spread out over dozens of solutions and projects and takes about 3 minutes to compile (and that’s including the time our build server takes to do an ‘svn update’). That’s impressive!

    Nice. And my guess would be that much of that time is not being spent in the analysis. A million lines of code is not that much to throw at the compiler. Running all the metadata to and from disk for each project can be expensive. — Eric

  21. Lord Dust says:


    Sweet! Thank you for this.


    Given Eric’s comment regarding how "impenetrable" the code is, I’ll guess that the driving criteria for Eric’s desire to refactor that pass is maintainability, not performance. I always look at it like this:

    Rule #1: The program must be able to successfully do the work you want it to.

    Rule #2: The program must be able to be changed in case you find out that you don’t really want it to do that, unless this should conflict with the first rule or you’re intentionally ruling out using this code for any other than the original application.

    Rule #3: The program should run as quickly as possible, unless this should conflict with the first or second rules.

    I’ll also go out on a limb and speculate wildly that there have been no recent, major features that have touched on what that too-big pass does, or the rewrite would have been more of a priority. Then again, I could be wrong.

  22. Eric, thanks for the clarification of the base method rewriting.

    I’m curious as to how you might approach refactoring the iterator-rewriting pass. I’ve generally been happy to consider that functionality as “magic” but I’ve never quite had a good understanding of the process by which that magic is made to do the right thing.

    The code is a bit of a confusing mess right now. The way it builds up the switch that drives the state machine is hard to understand and hard to maintain. I would want to refactor it until every step along the way was very clear. I’d probably divide it up into multiple passes: first, hoist the locals onto a closure class. Then identify every point in the code that is a branch target and insert a label. Then rewrite the try blocks to ensure that the finallys are run when they ought to be. And last, build a switch that branches to the labels. Put it all together and emit the closure class, and you’re done. — Eric

    I’d also like to second Jesper’s question about query comprehensions; don’t see an obvious place for them, either. Maybe the “loops into gotos and labels” pass?

    Query comprehensions are a syntactic sugar for method calls, and we work out the type of every method call in the first “binding” pass — that’s when we do overload resolution. So the rewrite of query comprehensions into method calls needs to happen very early. — Eric

  23. Doug E. Cook says:

    As Pavel mentioned, headers separate interface from implementation.

    The separation of interface and implementation can create some significant disadvantages. The interface and implementation are mostly redundant. This means time is spent writing the same thing twice and keeping the interface and the implementation in sync. Even worse, they can get out of sync. And they aren’t completely redundant, so you have to look in two places to really know the answer to some questions.

    The separation of interface and implementation can create some significant advantages. For those who simply need the interface (compilers of client code or human readers of the code), the information density of a header is much higher than the information density of the implementation. In addition, the implementation can change without requiring a change in the implementation (i.e. clients don’t have to recompile unless the header changes). These advantages are not currently available in C#.

    The part that really bugs me is that C# could have its cake and eat it too. With a bit of additional effort, it could get all of the benefits of separated interface and implementation but avoid most or all disadvantages. Unfortunately, instead of being addressed, the disadvantages have been ignored or downplayed.

    To be honest, C# does have header files, but they go by a different name. Instead of .h, you have metadata (embedded in the assembly, in a .metadata_dll file, or in an .asmmeta file) and documentation (those .xml doc files). This is actually pretty cool because it means that the headers are automatically generated from the implementation — the programmer just writes the .cs file and the headers are generated by magic! Of course, the programmer might have to put more effort into the implementation (add XML-doc comments), but nothing that he/she wouldn’t have had to put into a header in the .H model.

    The problem is that there isn’t any good support in the VS IDE or the C# compiler to take advantage of the automatically-generated headers. The IDE could have a mode that filters out the implementation and shows a nice high-level view of the .CS file (just the metadata with XML comments included optionally included), but there is no such mode (there is for viewing .DLL files, but not for viewing .CS files). The C# compiler (with help from the build system) could generate .metadata_dll files automatically, but instead you have to use messy external tools like asmmeta (along with a bunch of ad-hoc additional processing to filter out "insignificant" changes). The build system could automatically detect changes in the .metadata_dll file and only rebuild the downstream assemblies if the .metadata_dll changed significantly.

  24. didroe says:

    dcook, if you wish to separate method signatures from implementation you can choose to do so by using C# interfaces.

  25. Aaron R says:

    Sounds like there are a lot more passes now than what’s in the 2.0 compiler source in Rotor, but this helped explain some of the stuff I saw in there.  I’ve been looking through that source for the past couple weeks and one thing I’ve wondered is why it does two passes through the lex data.  Why not do one pass to build the AST and then walk the AST to pull out “header” information as a first pass?    My guess is so the parser could also be used for Intellisense or other metadata only needs.

    Will the 4.0 source every be released in a new Rotor release?

    This is highly unlikely for C# 4. However, we are evaluating whether to release hypothetical future versions of the compiler under a shared-source sort of license. No promises, but I would like this very much. — Eric

  26. Jesse says:

    I was about to ask the same question as Aaron R. The compilers I’ve worked on have done just what he suggested: lexing and parsing happens exactly once, then all further operations happen on in-memory trees (including matching up identifiers that aren’t defined until some point after they’re used). I don’t understand why the C# compiler needs to parse the source code multiple times.

  27. Anon says:

    @dcook, there’s nothing stopping you from writing a visualizer into Visual Studio (or standalone if you prefer).

    Or, for the lazy like myself, just collapse to definitions and scroll.

    There are many add-in’s out there that supplement the look and feel and give you some wicked productivity tools. e.g. Resharper. To be honest, I’ve never figured out a system by looking at a .h file.

    I’d also argue that the metadata is consumed by the most important client, and that’s the compiler itself. It’s funny to see, however many years later, that there are people now lamenting the fact of no separate .h files and no sync issues.

    The manifest being included in the assembly, and concretely describing every type is a godsend and is what makes it possible to do so much more that you could do with C++. In theory with a correctly compiled C++ system, all you need are .h files, but developing with .h files was a nuisance, and the theory often failed in practice.

    You can argue there’s still the same issue but in a different guise: instead of instead of .h proliferation, there is dll proliferation, but at least you can be assured that if you have the working dll, you will have the working code. You can’t say the same about a .h.

  28. Gabe says:

    What are trivial “is” and “as” operators? Is that where an expression being converted is already known to be of or not be of the type being converted to?

    Correct. Unfortunately, introducing co and contravariant conversions changed the set of reference conversions which are known to be trivial, and I accidentally introduced some bugs in this pass as a result. Hopefully the fixes will make it into the final C# 4 release. Apologies for the inconvenience. — Eric

  29. Joe White says:

    I found it interesting that you check for nonreachable code twice (once early on, to give warnings; once late, on an already-substantially-rewritten parse tree, to remove the dead code). The only reason I can think of for separating them is if the intervening rewrites could actually introduce unreachable code that wasn’t in the original source. Is that the case?

    I found it even more interesting that the dead-code removal won’t necessarily remove all the dead code. For example, if I had a switch(const), it sounds like you would codegen all of the cases, even though only one was reachable — because you do the switch(const) optimization after you remove the dead code, instead of before.

    I’d be curious to hear how the decisions were made (and by whom) about where to add some of these passes, and how you weighed the tradeoffs. I’m sure there are a few interesting stories there.

    Good question. We have two kinds of dead code. First, dead code that the *specification* says is unreachable. The spec is very clear about the formal definition of unreachable. For example: 

        int x = 10;
        int y, z;
        if (false) M(y);
        if (x * 0 != 0) N(z);

    The call to M(y) is unreachable, and therefore the definite assignment checker skips reporting the fact that y is used before it is assigned. The call to N(z) is reachable according to the spec, and therefore the usage of z is an error. The spec says that the consequence of a conditional is unreachable only if the condition is the compile time constant false. A compile time constant never contains a variable. Therefore x * 0 != 0 is not a compile-time constant, and therefore the consequence is reachable, and must be checked for definite assignment errors.

    Now, you and I know that x * 0 != 0 is always false, and the arithmetic optimizer knows that. Assuming that we initialize z so the program becomes error free and we get to code gen, we can safely remove the call to N(z) during code gen. 

    In short: we need to compute these two different kinds of reachability at different times, and therefore have two different passes to remove the two kinds of unreachable code. — Eric

  30. Andrew says:

    This is very different from the Java world right?

    The Eclipse IDE scales to very large projects excellently as it seems to retain a pre-computed dependency graph between compilation units within a single project. That way a recompile of the entire project is unnecessary because you changed one file – you just need to recompile what really needs recompiling.

    I don’t think its enough to say "the C# compiler is really fast" because it simply is not fast enough for day to day coding on a large commercial project (I’m working a project where individual assemblies were developed that contain thousands of files totally hundreds of thousands of lines of code). The speed of the compiler is a significant problem in my day to day workflow as I must sit around feeling frustrated while my box churns away. I work in finance so have a very powerful dev. box too.

    The advice "split up your assemblies" is good and we are moving in that direction, but ultimately I feel that C# and Visual Studio have taken a large retrograde step by not meeting the bar in this area.

  31. Carl Sixsmith says:

    Might seem like a silly question but would writing code like;

    string message = string.Concat("A string", "B string", "C string");

    be more efficient (compiler time wise)


    string message = "A string" + "B string" + "C string";

  32. Mark Knell says:

    quoth E. Lippert

    > Running all the metadata to and from disk for each project can be expensive.

    So this provokes two questions: do you develop on an SSD-juiced rig? And, what fraction of your team/build platform does?  In other words, has the SSD tipping point tipped, at least for the C# compiler cognoscenti.

  33. Michael O'Brien says:

    I’m not sure I understand the pass that checks generics are acyclic, could you explain? For example, this compiles:

    public class CyclicGeneric<T> where T : CyclicGeneric<T> { }

    Isn’t that cyclic?

    Nope. The base classes have to be cycle free, so “class C : B {} class B : C{}” is a cycle. And the constraints have to be cycle free, so “class D<T, U> where T : U where U : T {}” is a cycle. But the code you gave is perfectly legal. Weird, but legal. Similarly, “struct E : IComparable<E>” is not a cycle, and obviously is a desirable and common pattern. However, there are numerous bugs in our existing cycle detector that accidentally detect some of these weird cases as cycles incorrectly. — Eric

  34. Eric, would you please walk across the hall and "poke" someone on the VB team to see if they’d be willing to write up some details on their current and future compiler design. 🙂 I’m curious to hear whether they made similar choices.

    Do they plan to support incremental compilation?

    Are they really rewriting the compiler in VB?

    If so, what VB features are they using/avoiding to do so, and did they have to resort to non-VB for parts, or add hidden special features such as support for unsafe code?

    Have they considered large messy multi-million LOC single-assembly applications as a possible scenario?

    My team is currently in the midst of re-architecting/designing/writing a large mess of a Java application, and the available compilers are inadequate. The Eclipse compiler is the best available, but frequently gets confused, requiring a full rebuild. My dream is to one day move this app to the .NET world (primarily to take advantage of WPF and language features), but compilation speed is critical.

  35. Tom says:

    @Michael:  no, I’ve used that approach to create a hierarchical class, e.g.:

    public abstract class Node<T> where T : Node<T>


        protected T Parent;

        protected List<T> Children;


  36. Tom says:

    Wait, misread the first part of your statement.  I’m not exactly sure what constitutes "cyclic," but I’m sure one more venerably wise than I will shed some light.

  37. Tom says:

    Ugh, today is my multi-post day.

    > Running all the metadata to and from disk for each project can be expensive.

    Avoiding repeated disk writes is precisely why I like copying projects to a RAM disk until I’m done working on them for the day.  Barring catastrophic system failure and loss of resident data, it’s a good way to speed up . . . well, everything.  Have you perhaps maybe considered the possibility of compiling (or providing an option to compile) temporary files to memory before writing the final output?  I mean, you can already do this with CodeDOM.

  38. tobi says:

    Serch for "Then we run a pass that checks" on Google…^^

  39. Niall Connaughton says:

    On the build times and parallelization, MSBuild does use multiple cores, and it speeds build time up a bit. But the vast majority of build time is held up in disk access. Copy local references are a killer, especially in a solution with many projects with lots of references because when an assembly is copied local, all of its dependencies are copied local as well. So it fans out quite quickly and becomes painful.

    I saw numbers on SSD read and write times a while ago and it seemed that they weren’t really that much faster. I would think compiling to a ramdisk would be a massive speedup. I’ve seen a 5 disk RAID 0 setup turn a 20-25 minute build of 4 million lines of code into 3-5 minutes.

    I doubt parallelization of the CPU work in the compiler would have a significant effect on the build times of projects that are actually suffering from slow builds.

  40. Nice blog!  You mention that the C# compiler is pretty darn fast – it certainly is!  I can understand why the linker portion is significantly faster than the VS C++, but can you explain why the compile process is so much faster?

    Nope. First off, I don’t know that it actually is faster; I’ve never once run a head-to-head comparison of C# vs C++. (Though my experience with compiling large bodies of both C# and C++ makes it seem plausible that C# is faster.) And second, supposing for the sake of argument that C# is faster, in order to determine why it is faster, I’d need to do a comparison of each *stage* of compilation in order to find the one for which C# is faster. Could be that C# is easier to parse, or that overload resolution is easier in C#, or that loading metadata is faster than loading predefined headers, or that its the backend code generator that is faster; could be anything. I wouldn’t like to say without actually crunching the numbers. — Eric

  41. Silverhalide says:

    Following up on Joe White’s comment, I too was wondering why the dead code removal wasn’t done after the switch(constant) rewrite. I would think that if you swapped the order, you could actually even remove the branch that the switch is converted to, since the branch statement would only go to the code following itself.

    I’m sure that many of these passes have dependencies on earlier passes; I’d be interested to know what some of those are.  Why isn’t X done later, why isn’t Y done earlier?

  42. Mark Rendle says:

    Is it bad that since I found out the C# compiler optimizes + operator string concatenations into string.Concat calls, I stopped actually specifying string.Concat?

  43. BlindWanderer says:

    Now I understand why you might be loath to add features. It’s complicated!

    Now I’m left wondering what the code looks like. A method for each pass, or maybe #regions… probably methods in partial classes; less likely to stomp other dev’s changes that way and easier to do unit testing.

    Don’t forget, the compiler is written in C++, not in C#.

    We have a whole bunch of visitor pass base classes — one that is useful only for its side effects (like producing errors), one that is used for tree rewriting, and one that is used to query the annotated AST for facts about it. Most of the passes are just derived classes from one of the visitor base classes. Some of them need a lot more custom logic — like, say, definite assignment checking — and those are built by hand to recursively descend through the tree. — Eric

  44. Pavel Minaev says:

    > Now I’m left wondering what the code looks like. A method for each pass, or maybe #regions… probably methods in partial classes; less likely to stomp other dev’s changes that way and easier to do unit testing.

    From Eric’s comment above:

    "The compiler is written in unmanaged C++"

  45. Dave Sexton says:

    “Then we run a pass that verifies that every goto targets a sensible label, and that every label is targeted by a reachable goto”

    Since this happens after the pass that transforms loops into gotos, does it mean that the transformed loops are also checked instead of only user code?

    Label checking determines that every user-defined label is targetted by some goto. Compiler-generated labels are not checked; we always internally generate a label for the “break” and “continue” locations even if there is no break/continue statement, but having a loop with no break or continue statement is not something you should be warned about.

    A compiler-generated goto never does anything stupid; it never branches out of a lambda or out of a finally, for example. So we skip checking those as well. — Eric


  46. Gabe says:

    David Cunningham: I would assume that the C# compiler is faster than a C++ compiler because it does less work. A C# compiler only has to look at each source file once, while a C++ compiler has to look at each header file once for each main file that includes it. Using the STL could easily make the compiler do 10 times as much work as if you had written the code yourself (for example just doing #include <string> could pull in two dozen more STL files totalling over 13,000 lines). Even if you use precompiled headers, that’s still significant overhead that has to be repeated for each file. While a C# generic needs to be generated once, using C++ templates requires expansion of a template for each different type that it’s instantiated for.

    Furthermore, the C# compiler generates IL, leaving the native code generation and optimization to somebody else (ngen or runtime), while a C++ compiler usually has to output optimized native machine code.

  47. ShuggyCoUk says:

    Interesting, at what stage are partial classes ‘merged’. You speak of when the optimizations for unimplemented methods occur but not when their merge happens.

    Partial classes are merged immediately after the top-level parse completes. Since two halves of a partial class must have consistent base classes, we need to have all the “declarations” for a particular class known before base class checking. Namespaces are of course always “partial”; their contents are also merged together early. — Eric

    Also in the same vein at what stage do you convert things like foreach/using into their more verbose equivalents?

    In the C# 2 code all the “lowering” from loops to gotos happened in the initial binding pass. In C# 3 and 4, we’ve been gradually moving to a model where those lowerings happen later; right now the lowerings for many constructs happen before reachability and definite assignment checking because those passes only understand gotos, not loops. Eventually we would like to be able to easily implement things like “statement trees”, so statements would not be lowered until just before IL generation, so that the statement tree generator could generate the right thing. In today’s code, the lowering of a foreach to its try-finally-with-a-loop-in-it form happens early; I’d like to see that happen later. — Eric

  48. Benjol says:

    Makes me think of that cartoon with two guys at the blackboard doing maths, with "then a miracle happens" in the middle.

    I just hit F5, and somebody else does the magic.

  49. M. McCulloch says:

    Eric, if I was almost half as smart as you and your team, I’d be no less than twice as smart as I’m not now!

  50. CodeInChaos says:

    Which parts of your AST are mutable and which are immutable?