“Generic” advice on using language features in library


One of the hard things about working on the CLR is that your code is likely to be used in a wide variety of situations and so from a performance perspective you have to be a lot more careful than if you’re just writing tools for a specific context that you can understand.

We recently had a very long internal discussion about those kinds of responsibilities; this is one of my comments that I thought was generally useful.

This thread was about using Generics (as found in C# 2.0) in WinFX code, but this response is, um, more generic than Generics 🙂  I was responding to the comment marked with “>>”.


>>If one human can follow some simple boring rules to write repetitive boilerplate code and the outcome beats what the CLR does, then it’s a pretty sure sign the CLR had better become smarter about it…

It’s very hard to beat the CLR’s overall implementation if you were going to code up something that’s as nice as what we give you.  List<T> supports all the right base types and interoperates well with other collections and so forth.  If you don’t have access to the source, you can get a feel for what’s included by using ildasm on mscorlib and expanding it out, for instance.  It’s fairly robust.

If we were to ship something less robust, basically our customers would hate it and with good reason.  It would be weird and confusing.

In order to “beat the CLR” you have to know something about your problem that we can’t reasonably know.  Such as “I only do this downcast in two places and only on Sundays anyway”.  If you “only do the downcast in two places and only on Sundays”, all that beautiful typesafe code is hardly justifying its existence now, is it?  Maybe you’d be better off just doing it the old-fashioned way.  Heck, maybe collections are overkill altogether; maybe you should use a nice yummy array.
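To make those three options concrete, here is a minimal sketch that computes the same sum over a plain array, a List<int>, and an old-style ArrayList with a cast on every read. The class and method names (SumSketch, SumArray, and so on) are hypothetical and exist only for illustration. The sketch shows what each choice looks like, not which one is cheapest; that still has to be measured in your own scenario.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical illustration; none of these names come from the original post.
public static class SumSketch
{
    // The "nice yummy array": no collection object wrapped around the data.
    public static int SumArray(int[] values)
    {
        int total = 0;
        foreach (int v in values)
        {
            total += v;
        }
        return total;
    }

    // List<int>: typesafe and no casts, but you take on the List<T> machinery.
    public static int SumGenericList(List<int> values)
    {
        int total = 0;
        foreach (int v in values)
        {
            total += v;
        }
        return total;
    }

    // The "old-fashioned way": elements are stored as object, so each int
    // is boxed when added and needs a cast (unbox) when read back.
    public static int SumArrayList(ArrayList values)
    {
        int total = 0;
        foreach (object o in values)
        {
            total += (int)o;
        }
        return total;
    }
}
```

All three return the same answer for the same data. If the cast really does happen in only two places, the third version may be perfectly reasonable, and the first may be all you need.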

That’s the reason for all this fuss:  You can’t expect to just toss in List<T> instantiations any time you feel the urge and expect to get good perf.  You have to think about whether you’re getting good value out of that construct just like any other construct.  It isn’t “free”, and even though it’s well implemented it still may be overkill.

Our admonitions are to remind people to consider the costs — including the static costs — when making a choice.

Let me give you the following universal advice — which happens to work for generics too.

For all features X one of the following must be true:

  1. I am not using feature X
  2. I understand the costs of using feature X and it has a good cost/value proposition for my customers as I intend to use it.

Note, you may not replace (2) with any of:

  • Feature X is nifty keen and I like my code to look cool so I’m using it
  • Feature X would solve my problem nicely if only it didn’t cost so much (but this is good feedback for us)
  • Feature X is the only way I know how to solve this problem
  • On my last project we used Feature X for some other code and it was fine for that
  • I don’t understand the costs of using it but SomePerfGuyLikeRico said it’s “fine”

Seriously, if you don’t have a decent understanding of the costs, what on earth are you doing using the darn thing?  You have to do your homework first (and then please yell at us to make it better, because we aspire to be usable in lots of places where we’re currently falling short).  XML, Reflection, and Generics are my “favorite” examples of X.
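As one small piece of that homework, here is a rough sketch of a measurement helper. The names CostSketch, Cost, and Measure are hypothetical; the only library calls assumed are Stopwatch and GC.GetTotalMemory. It captures elapsed time and an approximate change in managed heap size, and nothing else. In particular it says nothing about static costs such as code size or working set, and the heap number is only indicative because a collection can run while the work executes.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; not part of any library.
public static class CostSketch
{
    public struct Cost
    {
        public long ElapsedTicks;   // Stopwatch ticks spent in the timed loop
        public long HeapDeltaBytes; // rough change in managed heap size
    }

    public static Cost Measure(Action work, int iterations)
    {
        if (work == null) throw new ArgumentNullException("work");
        if (iterations <= 0) throw new ArgumentOutOfRangeException("iterations");

        // Run once up front so JIT compilation is not charged to the timed loop.
        work();

        // Force a collection first so the baseline is as clean as we can get it.
        long before = GC.GetTotalMemory(true);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            work();
        }
        sw.Stop();

        // No forced collection here: we want to see what the loop left behind.
        long after = GC.GetTotalMemory(false);

        Cost cost = new Cost();
        cost.ElapsedTicks = sw.ElapsedTicks;
        cost.HeapDeltaBytes = after - before;
        return cost;
    }
}
```

A call such as CostSketch.Measure(() => BuildMyList(), 1000), where BuildMyList stands in for whatever you are evaluating, gives a first number to argue about. Treat it as a starting point for a proper profile rather than a verdict.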

See also:  http://blogs.msdn.com/ricom/archive/2003/12/02/40779.aspx  and http://msdn.microsoft.com/perf (especially chapter 5, mandatory reading!)

The less you intend to re-use the code, or the more you know about how the program will be used the easier it is to get #2 right — it may take no more than a passing thought for a one-off command line tool for instance.

I know it’s not easy to understand the costs of all the systems you intend to use but, you know, gee whiz, sometimes being a systems engineer is hard and stuff.  The lower you are in the stack, the harder it is (because it’s harder to be sure of #2, among other reasons).

Comments (7)

  1. Minh says:

    Performance-wise, what should we watch out for when using generics? Is it like templates? Where it’s a compile-time issue? In case of a List<T>, it would surely be better than a List of objects? And comparable to a custom List of T?

  2. Generics are not compile-time like templates in C++. It wouldn’t be an issue then, except if you worry about what work the compiler has to do 🙂

    But .NET gives generics runtime costs, and that’s where the perf issue comes in. If you only use a generic type for one type, then it costs you more than what an equivalent C++ template definition would, because you need to add the costs of the runtime support for the generic type, instantiation of the ‘non-generic type’ (err, there must be a better word for this?), etc.

    For a more detailed explanation of generics in .NET you can google for the Microsoft PDC2003 slides. On a slide for new stuff in the runtime you’ll see how it works…

  3. Ken Cowan says:

    This is great advice:

    "… You have to think about whether you’re getting good value out of that construct just like any other construct. …"

    We deliver this same message when demoing our profiler, and it is why our profiler gives you visibility into what the CLR is doing on your behalf. It might take only 8 lines to write a really cool web service, but somebody else had to write those thousands of lines of code you didn’t. You still pay a performance price.

    KC

  4. "Generics are not compile-time like templates in C++…"

    This is good and bad. It’s bad because it’s harder to understand and visualize the static costs. I ship code that uses C++ templatized collections and we have to be very careful to avoid bloat even with ICF (code merging when the generated code is identical) turned on. The bloat is still there, it’s just not as easily visible in the DLLs and EXEs you ship.

    At the same time, it’s good because there is some hope that someday more generated code can be shared at run-time. (Which of course then is typically bad for debugging – anyone who’s debugged ICFd C++ templates knows the joy of stepping into a function and finding yourself somewhere random that you didn’t expect.)

  5. Feature X is usually bloated and overengineered, as well as being overused, which is a bad combination. XML processing could be handled by a very small library at first.

    A common reason for using it: "Feature X looks good on my resume." Very few will admit this.