In truth, this won’t apply to very many of you and it is with some trepidation that I share it. The last thing in the world I want is for everyone to blindly follow it. This, like many things in software, involves trade-offs that you need to think hard about.
VS 2010 includes more managed code than any previous version of VS. Also more of the managed code is NGENed (precompiled and saved on disk to avoid just in time compilation cost). As we’ve investigated the VM exhaustion issues (approaching the 2GB limit) we’ve seen some effects from this.
Before I go into the effects, let me set one more piece of context. In the early planning and development for this release, we were toying with significant changes to the VS deployment model to make the vast majority of it “xcopy deployable”. We ultimately decided not to try to deliver that for this release but it’s still in our aspirations. As part of our initial foray into this, we moved a whole bunch of our assemblies that had previously been placed in the Global Assembly Cache (GAC) out of it to better facilitate xcopy deployment.
When we started looking at our VM usage more closely in the Beta 2 timeframe, we noticed that a whole lot of files were mapped into memory twice (for instance: Microsoft.VisualStudio.TeamFoundation.Client.dll and Microsoft.VisualStudio.TeamFoundation.Client.ni.dll). The first one is the image with the IL in it and the second one is the image with native code in it that has been NGENed. Double loading so many DLLs uses up dozens of MBs of VM in an application of this size.
Upon further investigation, we learned that the CLR will often load both the IL image and the NGEN’d image when the IL image is stored outside the GAC. Moving the assembly inside the GAC eliminates the load of the IL image – loading only the NGEN’d image.
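If you want to check whether your own process shows this pattern, here is a minimal sketch in Python. It assumes you already have a list of the module paths mapped into the process (exported from whatever memory or process inspection tool you prefer) and simply looks for assemblies that appear both as Name.dll and Name.ni.dll. The function name and the sample paths are hypothetical and only there for illustration.

```python
from pathlib import PureWindowsPath


def find_double_mapped(module_paths):
    """Return sorted base names that appear as both an IL image (Name.dll)
    and an NGEN image (Name.ni.dll) in the given list of module paths."""
    il_images = set()
    native_images = set()
    for raw in module_paths:
        # Compare on the lower-cased file name only; directories differ.
        name = PureWindowsPath(raw).name.lower()
        if name.endswith(".ni.dll"):
            native_images.add(name[: -len(".ni.dll")])
        elif name.endswith(".dll"):
            il_images.add(name[: -len(".dll")])
    return sorted(il_images & native_images)


# Hypothetical module list; the directories are made up for the example.
loaded = [
    r"C:\App\Microsoft.VisualStudio.TeamFoundation.Client.dll",
    r"C:\NativeImages\Microsoft.VisualStudio.TeamFoundation.Client.ni.dll",
    r"C:\NativeImages\System.Core.ni.dll",
]

for assembly in find_double_mapped(loaded):
    print("double mapped:", assembly)
```

An assembly that shows up only as a .ni.dll (like the last entry in the sample) is the case you get when the IL image was not mapped, so it is not reported.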
Now, hearkening back to my first paragraph, this actually sounds much worse than it is for most applications. The IL image (although mapped into memory) is not actually used for anything. The performance impact of this “double mapping” is nearly non-existent. Further, it’s much better in .NET 4 than it was in .NET 3.5 because there actually used to be a performance hit in 3.5 (and earlier). In .NET 3.5 and before, if the assembly was strong name signed and NGENed, not only did it map both images, it actually verified the strong name on the IL, which involves both CPU and I/O cost. In .NET 4.0 the security model has been updated to only verify the strong names on images coming from untrusted locations.
This issue is something that people should think about if they have an app where the 2GB VM limit is a significant consideration, but be aware that moving stuff into the GAC comes with its own issues. For one thing, it means your install can be more “impactful” on the system than otherwise because your components are in a shared place where other applications can take dependencies on them. It also means you can’t just copy your app onto the computer – you have to install things in the GAC with GACUtil. There are installation permission implications and more. I don’t intend to turn this post into a “best practices for what to GAC and what not to” – I suspect there’s plenty of material out on the web for that – just be aware that it’s not a magic bullet that everyone should shoot.
When we first realized the impact this was having on the VS virtual memory situation, we spent several weeks with the .NET Framework team trying to determine if they could change this behavior and eliminate the “double loading”. After quite a bit of prototyping, we concluded that WAY too much code needed to be changed in the Framework and that this is a better fit for the next release. The loader is one of the most delicate parts of the CLR and even VERY subtle changes in semantics, ordering, timing, etc can result in 3rd party applications breaking.
Instead, we chose to pursue the path of analyzing our VS assemblies and determining which ones made sense to move back into the GAC. That work is largely complete and we’ve found that the savings are on the order of 50-70MB of VM across a sampling of scenarios. It’s not a huge amount (compared to 2GB) but it’s a very nice step in the right direction given that most of our scenarios were only over by a couple of hundred MB.
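To put those numbers in perspective, here is a rough back-of-the-envelope sketch. The 2GB limit and the 50-70MB savings come from above; treating “a couple of hundred MB” as exactly 200MB is my assumption, purely to make the arithmetic concrete.

```python
LIMIT_MB = 2 * 1024   # the 2GB VM limit discussed above
OVERAGE_MB = 200      # assumption: "a couple of hundred MB" taken as 200MB


def share(savings_mb, total_mb):
    """Return savings as a percentage of total."""
    return 100.0 * savings_mb / total_mb


for savings in (50, 70):
    print(
        f"{savings}MB saved: "
        f"{share(savings, LIMIT_MB):.1f}% of the 2GB limit, "
        f"{share(savings, OVERAGE_MB):.0f}% of a {OVERAGE_MB}MB overage"
    )
```

Under that assumption the savings are only a few percent of the whole address space but roughly a quarter to a third of the overage, which is why the change is worth making.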