Anonymous method (and Iterator) perf

probably a few of you hav asked how to make iterators and anonymous methods fast.  Well hopefully if you understand how the compiler translates them, it should be relatively easy to optimize them.  The first thing to remember is that for all practical purposes the C# compiler does not have an optimizer.  Thus the code that you write is fairly directly translated into IL.  The best way to optimize the IL is to optimize your code.  In the case of iterators and anonymous methods, I've tried (and I think this was the language designers' intent as well) to make the transformations farily transparent, so that the programmer can easily write educated and efficient code while taking advantage of these great features.  So here's my rules for making effecient anonymous methods and iterators (in no particular order):

  • Think carefully about which locals you use (a.k.a. capture).  As soon as you touch a local in an anonyous method, that local becomes heap based and goes through an extra level of indirection (and prevent enregistering).
  • Think carefully about where locals are declared.  In order to maintain proper scoping semantics the compiler creates a new heap object on entry of each scope that has captured locals.  If you can move all the captured locals into one scope you'll probably get better perf.  If possible move that scope 'inward' to avoid an GC object creation on a hot path (assuming the anonymous method is not on the hot path).  If possible move that scope out of loops to avoid GC object creations on each iteration of the loop.
  • The captured locals do not have to be in the same scope as the anonymous method.  The delegate that holds the anonymous method will be cached at the scope of the innermost captured local.  Thus if the locals are outside of a loop, but the anonymous method is inside, that's OK.
  • Add extra scopes as hints to the compiler about lifetimes.  For iterators, any scopes without a 'yield return' statement will be mostly left intact, including using real locals for everytihg in that scope instead of heap-based fields.
  • For iterators, if a try/finally block contains a yield statement, the finally block must be cloned from the MoveNext method into the Dispose method (so the same code is in two places).
  • For iterators, calling GetEnumerator twice is just as expensive as calling the original iterator method/property/indexer twice.
  • Iterators are designed to help amortize the cost of enumerations, so if you only need the first few elements, you don't need to pay the price for enumerating everything.  Follow that pattern in your own iterators where possible.
  • Don't enumerate something twice when once will do.  Most enumerations have another method to obtain the count of elements without iterating.  So use that instead of iterating once just to count the elements, and then again to extract the values.
  • Anonymous methods make some algorithms clearer, and iterators tend to make more correct IEnumerable/IEnumerator implementations, but neither of them will never be as efficient as hand written code (or hand written assembler, but who has time for that anymore).  So write well documented, clear, clean code, and then use a profiler to see if you need a speed improvement.  And remember that a faster algorithm will probably trump any micro-optimizations in your code (like counting down instead of up to avoid that one extra opcode on th eloop test)

I'm sure I think of more later, but that's most of them for now.  Enjoy, and feel free to correct me.  I can't wait until you guys can get a newer build and see the stuff we've been working on for almost the last year.

--Grant