Performance issues Part One: GetSuperType

So I said on Saturday that I would write a future post about the performance issues I've been seeing.  Instead I'm going to write two separate posts.  First because, I don't entirely understand both issues, and second, because I think this one might get a little bit long.  The first issue I noticed you won't find in either Beta1 or Beta2, because it was recently introduced.  The issue is that after you open a file, or make an edit outside of a method/property body, the next time that you try to bring up a completion list, it takes a really long time.  (about 5 seconds on my dev machine, which is pretty powerful).

The reason for this is a little complicated, so I'm going to go into some more background about the C# language service, much like Cyrus does in this post.  One of the things we do when bringing up a completion list is determine what methods and types to display in it.  We generally do that by walking up the scopes from the cursor position, and adding everything we see.  However, sometimes we do more than that.  For example, if you are in a bases and interfaces list, and not on the first type in the list, we'll only show namespaces and interfaces or types that contain nested interfaces.  In order to do that, we ask each type: are you an interface, or do you have a nested interface, or a nested type that contains a nested interface.  We need to show types that have nested types that have interfaces, because you may want to type that nested interface, and if we don't show you the containing type, the completion list is just getting in your way.  Pretty simple really. 

The problem is that our logic wasn't quite right.  The issue is that it's possible to access a nested type in one of your super types, and we didn't put those nested types in the list.  Type for an example I think:

class Base {

public interface IBase {

}

}

class Derived : Base {

}

class C {}

class D : C, // HERE {

}

At the comment, it's valid to type "Derived.IBase", but we wouldn't show "Derived" in the completion list. In order to fix this issue, we added code to an internal function "GetSuperType()", and checked to see if it has any nested types we would want to show.

Okay, so why does that create a perf problem you might ask. The problem has to do with our implementation of GetSuperType. Getting the super type of an object is a fairly expensive operation for us, because we essentially need to reparse the file to find it's bases and interfaces list, and then determine the new super type. Once we find the super type, we then cache it for future lookups, which helps out performance most of the time. The problem is that every time we reparse an entire file, we clear the cache of every super type. That's because adding or removing a type may cause what the super type is to change (a new type closer in scope with the same name might be added, meaning we should re-bind to that type for example).

In my example above, when you open a file, we re-parse the file, in order to do things like populate the error-list, calculate the collapsible regions, and populate the navigation bar (those two drop-downs at the top of the file). This causes our cache to be invalidated, and the next time you need a completion list, we have to go and re-bind the super type for every type in your containment hierarchy, and any nested types inside them, which causes quite a hit.

Okay, so how are we going to address the situation? Well, we have three ideas.

  1. We can keep the parse tree's for all the files we need to parse alive while we calculate all of the super types, instead of re-parsing for each type we find. Hopefully this will be enough.
  2. If it's not enough, we think we have a way to clear the cache must less aggressively. Basically, when a type is removed from our knowledge (which happens when we parse, we remove it, then re-add it), we currently flush the cache of every supertype in the system. In actuality, we only need to flush the cache of the types of which it is a supertype. The problem is that currently a type doesn't know what types derive from it in our architecture. We may need to add this information, but changing a core area like our type system is not something we would like to do.
  3. We could regress to the behaviour we have in Beta1 and Beta2, and just not show derived classes whose supertypes have nested classes we want to show.

We're currently investigating option number 1. What do you all think.