Very high cpu utilization on idle

We got an interesting ProductFeedback bug from Oren Novotny who says:

While working with a C# solution consisting of seven projects, I noticed that devenv was pegging one of my two CPUs.  The IDE didn't feel sluggish though and was still responsive despite the cpu usage. 

I wanted to talk about this because I thought it was interesting how this behavior was viewed by our customers.  As it turns out, this behavior is considered "by design".  To help explain that, I'm going to delve a little into our design for the C# Language Service and show that the behavior Oren is seeing is a direct consequence of that design.

The C# Language Service is a component that implements many Visual Studio services in order to provide C#-specific capabilities to VS users.  It analyzes your source code, builds up an internal representation of it, and uses that to provide many services: colorization, completion lists, parameter help, squiggles, population of the task list, debugger interaction support (like breakpoint resolution), navigation bar population, and so on.  The number of services that the C# Language Service implements is quite large, but it is almost completely specified by the interfaces that Visual Studio provides.  That means that you could rip out the C# Language Service yourself and insert your own MyC# Language Service that does everything we do.  (And, in fact, that's what some customers out there do!)

So, as I've described it, there are really two different kinds of behavior that the C# Language Service has.  The first is simply code analysis.  The second is interaction with the rest of the VS shell on behalf of the user sitting at the keyboard.  Now, what's interesting is that these two behaviors have vastly different requirements and performance characteristics.  Let's start with the second type of work that the language service does.  We're interacting with the user, so it is *vital* that we be performant.  Consider something like a user typing "this<dot>".  We need to bring up a completion list and have it populated in the span of milliseconds.  If we're slow to do this then we're going to screw up your typing.  And if there's anything we know, it's that if you screw with typing you're going to have people ripping their hair out and screaming at you.  Typing absolutely must be performant no matter what.  Similarly, when colorizing your text we have to be fast, fast, fast!  If colorization doesn't happen instantly it's very disorienting and will cause users to think they're doing something wrong.

Now, let's talk about the first job that the Language Service has: code analysis.  Consider this: before we are able to bring up a completion list after you type "DateTime<dot>", we need to do several things.  We need to try to bind the name on the left of the <dot>.  In order to do that we need to understand the method that it's in (because it might be the name of a parameter), we need to understand the class it's in (because it might be a property or field or even a nested type), and we need to understand the namespace it's in (because it might be brought into scope by one of your "using" directives).  Not only that, but we need to understand the references your project has so that we'll know about the types that are accessible (for example, System.DateTime comes from mscorlib.dll).  In essence we're compiling your code, except that unlike a standard compiler we need to compile fast, we need to be incredibly error tolerant, and we have to deal with the constant changes that you're making to your code.  This is actually a whole heck of a lot of work.
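To make that lookup order concrete, here is a minimal sketch of binding a name by walking outward from the method to the class, the namespace, and finally the project references.  It is written in Python for brevity and is not how the Language Service is actually implemented; the `Scope` class, the `bind` function, and the sample symbols are all hypothetical names invented for this illustration, and real "using" handling is considerably more involved.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scope:
    """One level of the lookup chain (hypothetical structure for illustration)."""
    kind: str
    symbols: dict = field(default_factory=dict)
    parent: Optional["Scope"] = None


def bind(name: str, scope: Optional[Scope]) -> Optional[tuple]:
    """Walk outward from the innermost scope; return (scope kind, symbol) or None."""
    while scope is not None:
        if name in scope.symbols:
            return scope.kind, scope.symbols[name]
        scope = scope.parent
    # Error tolerance: an unresolved name yields None instead of raising.
    return None


# Outermost to innermost: references -> namespace -> class -> method.
references = Scope("reference", {"DateTime": "System.DateTime (mscorlib.dll)"})
namespace = Scope("namespace", {}, references)
klass = Scope("class", {"count": "field int count"}, namespace)
method = Scope("method", {"when": "parameter DateTime when"}, klass)

print(bind("when", method))      # found in the method scope
print(bind("DateTime", method))  # found only after falling through to references
print(bind("missing", method))   # unresolved name
```

Note that "DateTime" is only found after every inner scope has been consulted, which is why a change in an outer scope (such as a namespace) can affect how names bind everywhere inside it.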

So we could try to do all this work after any change the user makes.  After you type a character we would simply do all the work to keep our internal symbol model up to date.  However, it turns out that we simply weren't smart enough to figure out how to make that work while satisfying all the requirements out there.  We absolutely could do the work in between your keystrokes, but we don't know how to do it fast enough.  Consider renaming a namespace that sits in your root namespace.  The amount of code that needs to be recompiled at that point is enormous.  Chances are that every file of yours will have many references to types from mscorlib that now need to be rebound in case they might bind differently with this namespace name change.  Now realize that you have about 1 millisecond to do all that work!  It's an extremely complicated task and we decided that trying to do it in between keystrokes would be too difficult.

So what did we do instead?  We decided that instead of analyzing your code and keeping our internal symbol model up to date in between user changes, we would instead have another thread whose sole purpose was to keep that symbol information up to date in the background.  This thread receives notifications that changes have been made and will do the work to re-lex, re-parse, and re-bind the new code.  Now, while that's happening it's possible for user requests to come in and be serviced without delay.
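As a rough sketch of that arrangement (in Python, and emphatically not the real implementation), the example below has a worker thread that drains a queue of change notifications and updates a shared symbol table, while the foreground reads whatever the table currently holds without waiting.  The names `SymbolService`, `notify_changed`, `lookup`, and `wait_until_current` are hypothetical, and the "analysis" is just a stand-in that splits text into identifiers rather than a real re-lex, re-parse, and re-bind.

```python
import queue
import threading


class SymbolService:
    """Toy model: a background thread keeps a symbol table up to date."""

    def __init__(self):
        self._changes = queue.Queue()
        self._lock = threading.Lock()
        self._symbols = {}  # file name -> list of identifiers seen in it
        worker = threading.Thread(target=self._analyze_loop, daemon=True)
        worker.start()

    def notify_changed(self, file_name, text):
        """Called by the foreground on every edit; returns immediately."""
        self._changes.put((file_name, text))

    def lookup(self, file_name):
        """Foreground query: answers from the current table, stale or not."""
        with self._lock:
            return list(self._symbols.get(file_name, []))

    def wait_until_current(self):
        """Block until every queued change has been analyzed (demo only)."""
        self._changes.join()

    def _analyze_loop(self):
        while True:
            file_name, text = self._changes.get()
            # Stand-in for the real analysis: collect identifier-like words.
            cleaned = text.replace("(", " ").replace(")", " ")
            words = [w for w in cleaned.split() if w.isidentifier()]
            with self._lock:
                self._symbols[file_name] = words
            self._changes.task_done()


service = SymbolService()
service.notify_changed("a.cs", "void Run(int foo)")
print(service.lookup("a.cs"))   # may be empty or already populated
service.wait_until_current()
print(service.lookup("a.cs"))   # now reflects the edit
```

The first `lookup` deliberately has no guaranteed result: it may run before the worker has caught up, in which case the foreground simply gets the older view of the file.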

Now, as a consequence of this, a user request might come in and be serviced before the background thread has finished its work.  What happens in that case?  Well, the "primary" thread will simply access a symbol table that is slightly out of date.  And, for pretty much all cases, that suffices.  In 99% of cases, by the time you need access to something you've just changed, it will be understood and contained in the symbol table.  For example: say you rename a method parameter 'foo' (or change its type).  You then move down and try to access "foo<dot>".  In the short time it takes you to move down to the usage of "foo" we will almost certainly be up to date.

Now, there is one case where the 99% rule won't apply.  If you're opening a large solution then it's going to take time for the background analysis to happen.  When the solution opens we will know nothing about your code, and as time goes forward we will be madly analyzing your code to get the initial symbol table population done.  If you happen to use some service (like IntelliSense(tm)) during that time, it is absolutely the case that we might not have all information ready for you.  However, we are extremely fast with our analysis, and on a fast machine you can usually expect that we'll know about all your code fairly quickly.  A good rule of thumb is about 1 second per megabyte of source code you have.  That is, if you have 35 megs of source, we'll take 35-40 seconds until we know *everything* about your code.  Of course, after 20 seconds we'll know roughly 50%, so as you get closer to full analysis the accuracy gets better and better.
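The arithmetic behind that rule of thumb can be sketched in a few lines of Python.  This assumes that analysis progresses linearly with elapsed time, which is my simplification and not something the Language Service guarantees; the function name `estimated_coverage` and its parameters are hypothetical.

```python
def estimated_coverage(source_mb, elapsed_s, seconds_per_mb=1.0):
    """Estimate the fraction of source analyzed so far.

    Assumes linear progress at `seconds_per_mb` (a simplification).
    """
    if source_mb <= 0:
        return 1.0  # nothing to analyze
    total_s = source_mb * seconds_per_mb
    return min(1.0, elapsed_s / total_s)


# 35 MB of source, using the upper end of the 35-40 second range.
rate = 40 / 35
print(round(estimated_coverage(35, 20, rate), 2))  # about half done
print(round(estimated_coverage(35, 60, rate), 2))  # capped at fully done
```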

So what does this mean for the user?  Well, as you're changing your code, it's quite likely that you'll see a spike of CPU as the background thread goes about its business analyzing code.  This thread runs at lower priority than our foreground interaction thread and so while we're using a lot of CPU, you'll still be able to type and you won't find the IDE to be sluggish.  This is absolutely by design and is the expected behavior for the C# language service.

It's great to see people concerned about and wanting to let us know about these things in case there is a problem.  But it's even better to be able to know that things are working as normal and there's nothing that even needs to be fixed :-)

Any questions or comments on the design decisions we've made here?