Push/Pull architecture


I did a little work today on a significant perf issue that cropped up and was preventing us from meeting our Beta2 exit criteria.  Basically, when reloading an extremely large solution we were seeing the whole UI hang while we did some CPU-intensive work.  After doing some profiling with our internal tools we discovered it was in my code.  Sigh…

Before explaining the issue I thought I’d discuss a little bit about the portion of the architecture that was affected.  To start with, consider this code and the following sequence of events:

public class MyClass {
    public SomeType someField;
}

  1. We start compiling the type so that we’ll be able to provide intellisense on it
  2. We start compiling the field of that type
  3. We attempt to resolve the type (SomeType) of the field.  (and say we find it in A.DLL)
  4. Later you then remove A.DLL and add B.DLL (which also contains a type ‘SomeType’)
  5. You then try to use some bit of intellisense (like ‘goto definition’) on ‘SomeType’

In order to make sure that works we need to correctly bind ‘SomeType’ to the type in B.DLL.   In our old model we used a ‘push’-based approach: when the set of imported DLLs changed we’d “push” that information out to all interested parties.  In this case the interested parties were all the types we’d parsed out of your source code that had bound to some type from a DLL.   Places where those types would bind to a type from the DLL include (as above) the type of a field, the return type or parameter types of a method, the base type of a class, and so on.  When a listener received notification that a DLL had been added or removed, it would go and unbind/rebind any previously bound types.
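In rough sketch form, the push model looked something like this (the names here are hypothetical and purely for illustration; the real compiler’s interfaces are considerably more involved):

using System.Collections.Generic;

// Hypothetical sketch of the push model.  Every parsed type registers
// as a listener; when the project's references change, the project
// pushes the event to all of them immediately.
interface IReferenceListener
{
    // Called once per DLL added or removed; the listener unbinds and
    // rebinds everything it had previously resolved from metadata.
    void OnReferencesChanged();
}

class Project
{
    private readonly List<IReferenceListener> listeners = new List<IReferenceListener>();

    public void Subscribe(IReferenceListener listener)
    {
        listeners.Add(listener);
    }

    public void AddReference(string dllPath)
    {
        // ...record the new reference...

        // Push: notify everyone right now.  Add ten DLLs during a
        // solution load and this loop runs ten times, even though only
        // the final unbind/rebind pass is actually necessary.
        foreach (IReferenceListener listener in listeners)
        {
            listener.OnReferencesChanged();
        }
    }
}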

As you can imagine there are a *lot* of places that bind to types from metadata (just think about your own code), and pushing this information to all of them was quite costly.  What’s worse, we were compounding that cost.  When you load or reload a solution we get notified of all the DLLs your project imports, and in the case of this uber-solution that meant many tens of DLLs.  So not only were we pushing the information about each DLL add or remove, we were doing it many times over: we’d unbind/rebind on the addition of the first DLL, then again on the addition of the second, then the third, and so on.  Of course, all but the last unbind/rebind was unnecessary.

We had a few ideas about how to make things better.  The first was to just push this work into a task that our background thread would process.  This would let us start handling user interaction on the foreground thread again, which would make the experience much better in terms of user perception.  We decided against that for two reasons.  The first was that we’re very wary about moving work to the background thread, especially this late in the game.  We spent a lot of time in Whidbey trying to eliminate deadlocks from C#, and moving work to the background thread is just inviting a critical hang to occur right as we’re shutting down.  The second reason was that even if we did move this to the background thread we’d still be doing a lot of unnecessary, duplicated work.  Our background thread would just be chewing up CPU when it could be doing important things like updating the core intellisense understanding of your code as you’re typing.

Another idea was to batch up all the changes and then dispatch them all at once, so if three DLLs were added to the project we’d only issue the ‘push’ once at the end instead of three times.  We decided against this because it would involve changing the interfaces we have with the VS project system, which would require a lot of coordination amongst all the different languages to make sure they would all support the new interfaces.
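Purely for illustration, the batching idea would have amounted to something like the following (a hypothetical API, not the project system’s actual interfaces):

// Hypothetical batched variant: callers bracket a group of reference
// changes so the push happens once at the end rather than once per DLL.
class BatchingProject
{
    private bool inBatch;
    private bool notificationPending;

    public void BeginReferenceChanges()
    {
        inBatch = true;
    }

    public void AddReference(string dllPath)
    {
        // ...record the new reference...
        if (inBatch)
            notificationPending = true;    // defer the push
        else
            NotifyListeners();             // old per-DLL behavior
    }

    public void EndReferenceChanges()
    {
        inBatch = false;
        if (notificationPending)
        {
            notificationPending = false;
            NotifyListeners();             // one push for the whole batch
        }
    }

    private void NotifyListeners()
    {
        // push the change out to all interested parties, as before
    }
}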

So, instead we decided to reverse the operation entirely and move from a push model to a pull model.  Now the project stores a version number, and each parsed type stores the version of the project it last sync’ed with.  Whenever references are added to or removed from the project, that version is incremented.  Then when you ask for something (like the type of ‘someField’) we compare the parsed type’s version against the project’s version.  If they disagree we know we can’t trust the current type, so we recompile it, get the right information, and store the version of the world we’re now consistent with.
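In the same sketch form (hypothetical names again), the pull model replaces the listener list with a version check at every access:

// Hypothetical sketch of the pull model.  The project bumps a version
// number whenever references change; consumers compare versions on
// access and lazily recompile themselves only when they're stale.
class Project
{
    private int version;

    public int Version
    {
        get { return version; }
    }

    public void AddOrRemoveReference(string dllPath)
    {
        // ...update the reference list...
        version++;   // nothing is pushed to anyone
    }
}

class ParsedType
{
    private readonly Project project;
    private int syncedVersion = -1;   // project version we last compiled against
    private BoundType fieldType;      // e.g. the resolved type of someField

    public ParsedType(Project project)
    {
        this.project = project;
    }

    public BoundType GetFieldType()
    {
        // Pull: pay a cheap version check on every access, and only
        // rebind when the project has actually changed underneath us.
        if (syncedVersion != project.Version)
        {
            fieldType = Recompile();
            syncedVersion = project.Version;
        }
        return fieldType;
    }

    private BoundType Recompile()
    {
        // Re-resolve 'SomeType' against the current set of references.
        return new BoundType();
    }
}

class BoundType { }

Note that adding ten DLLs just bumps the version counter ten times; each consumer still rebinds at most once, on its next access.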

This change solved a couple of issues for us.  First, it dropped the perf impact of solution load/reload to almost nothing (in terms of the time spent in C#, at least).  Second, it let us drastically simplify our code, since we were able to cut out all the push logic that had been threaded through a fair amount of code.

What’s interesting is that if you consider the simple one-DLL case, the total amount of CPU time spent doing this work is actually greater after my change.  Why?  Well, before the change we had a set amount of work to do and it was accomplished in one pass (i.e. pushing the information to all interested parties).  After the change the same work still needs to be done, *plus* a little more: the interested parties now check versions and pull in information every time they’re accessed.  Of course, in the multiple-DLL case we should be a lot better off, since instead of pushing the information to each listener n times, each listener pulls it only once.

Of course, even in the single-DLL case, while our overall performance might be worse, our perceived performance will be much better.  Operations that used to stall the UI are now demand driven, and they run fast enough to be performed in the idle time between your keystrokes.

When I get into work I’ll send out some information on the project that we use on our team to measure performance on large solutions.  In return I’d like to hear from you guys about the kinds of projects you work on: number of projects in your solution, number of files in your project/solution, total size of all your source, largest single source file.  Those stats would be invaluable in making sure our performance scales to the projects that you’ll be working on.

Edit: The solution I work with and routinely do perf testing on has 8 projects, 1400 .cs files (the largest being 352KB), totaling 25 MB of source.
I’ve worked on fixing bugs in VS7.1 for enterprises with larger solutions than that (e.g. 250 projects), but I’m curious about you guys.

Edit #2: When loading a project there are actually two components interacting: the “project system” and the “C# compiler/language service”.  Scalability across many projects (e.g. handling 250 projects) is mainly dealt with on the project system side.  Scalability across many files and large files is mainly dealt with on the C# side.  Also, for the most part loading all the projects happens on the main thread, whereas loading all the files happens on the background thread.  This should let us load each project very quickly and let you start editing and working right away, while we churn away in the background to bring ourselves up to a full understanding of your code.


Comments (12)

  1. Marc Bernard says:

    The main solution we’re working on has 71 projects, 582 files, 217,000 lines of code. It takes several minutes to open the solution.

    Good to see you guys eating your own dogfood, sorry it took so long to happen.

  2. Kai Iske says:

    Same for us over here.

    The main solution we’re working on is 61 projects, 715 files and 350,000 lines of code. It takes ages and is especially annoying when you have to restart VS.NET 2k3 after a crash 🙂

    Looking forward to getting my hands on the new release.

    …but your 8 project 1400 files solution sounds interesting though 🙂

  3. damien morton says:

    27 projects, 1200 files, 8MB source code

  4. damien morton says:

    27 projects, 1200 files, 8MB source code, 250KLOC

  5. Yeah, we definitely had issues scaling to a very large number of projects in 7.1. Loading the 250 projects took >20 minutes when I was investigating and fixing a customer bug one time. (That’s a story for another time, but trust me when I say it was one of the hardest bugs to find and fix.)

    Hopefully we’ll be a lot better for 2k5

  6. Ron says:

    For the development effort I work on the core is 17 projects, 434 files, 12.8MB

    These are loaded at all times with additional projects loaded as needed (anywhere from 1 to 15 supplementary projects depending on the specific subsystem).

    The overall development effort is 665 projects, ~16.5K files, 300MB. None of it is ever loaded at one time.

    Unfortunately, I don’t have accurate SLOCs, but I expect you get the idea. 🙂

  7. Steve Hall says:

    The last big thang I worked on was 21 C++ projects, 1500 files, 35MB source code, ~1.5MLOC. (You read that right: M…not K!) Under VC6 it took about a minute to load; under VS7 it takes over 5 minutes… (same machine). Since this product was discontinued, I decided to orphan it on VC6 for this very reason… (Don’t want to go through the wringer just to diagnose the occasional bug in order to produce a patch…)

  8. Ron: that’s very disturbing 🙂

    Is it all C#?

  9. This is a really serious problem with VS2003.

    It’s much, much worse though when you load a Setup & Deployment project, like the ones we use to generate our MSIs.

    Even on a smallish 50KLOC project, it takes more than three minutes to load. It’s gotten so bad that I’ve made two solution files, one without the MSI project; that one only takes about a minute to load.

    WE SHOULD ALL BE DOGFOODING STUDIO! This kind of problem shouldn’t become apparent this late in the dev cycle. Load time has been a problem since 1.0 RTM’ed.

  10. Ron says:

    Cyrus: No, it’s a mixture of just about everything under the sun. 🙂 Most of it is under Visual Studio, but there’s a fair bit of Java and Perl thrown in for good measure. You should see the install process involved for the product. It takes a couple hours, though to be fair this includes a goodly amount of support data. 🙂

  11. Eric Newton says:

    when you’re dealing with a solution with more than 10 projects in it, something is wrong, probably due to too much code subdivision ["i wanna be able to use this here, here, and here, but without that…"]

    In a sense, I get irritated with, say System.Web having so many types in it, however, since they all basically tightly depend on each other to be basically the same version, it works well… one could argue that the Hosting aspect could be separated from the Component/Rendering aspect of ASP.net, but the cons outweigh the pros…

    I’d like to see VS2005 use a more solution based Referencing approach. For example, I have my "Ensoft.DataComponents" project, that I’ve stabilized at v1.0. I just recently decided to add some functionality and types to the assembly, and upped it to v1.1. However, some of my class libraries/app projects reference the SOURCE and some only reference the DLL…

    It would be nice to have the reference use the SOURCE if the SOURCE project is loaded, otherwise, use the browsed reference… [am i making any sense here?]

  12. Eric: If I understand you correctly, then this is something we’ve done for VS2005.

    If you add a reference from one project to another, then it’s a "source" reference, and changes in one are immediately visible in another.

    If you add a reference from one project to a dll, then it’s a metadata reference, and we’ll use that instead.

    So we should be fine in your situation…

    Have you tried out VS2005 to check if this is so?