Over the last ten years of building the .NET runtime, quite a number of assumptions have changed. Early on we could assume that most computer users only had one processor. Today, the assumption is that you have at least two processors. Adding parallelism to an app for performance is a challenge for most developers, but what if that parallelism came for free? That’s exactly what we’ve done with our newest CLR performance feature. Today, Dan Taylor, a program manager from the CLR performance team, shares how Multicore JIT can make your app start faster. The best part: you just have to include two lines of code to try it out. Super easy! –Brandon
In this post, I will provide an in-depth review of how the Multicore JIT technology works, and then show you how easy it is to use in your .NET Framework 4.5 apps.
App launch is faster with Multicore JIT
On the .NET Framework performance team, we spend a lot of time looking at the launch performance of managed applications. Large managed applications require JIT (just-in-time) compilation at launch time, so improving launch performance can be challenging. .NET Framework developers have been able to use Ngen.exe (Native Image Generator) to move code generation from application startup time to installation time. However, for the most part, this pre-compilation option is available only for large .NET Framework applications that also happen to have an installer.
As developers continue to take advantage of the great productivity benefits of the .NET Framework, they are using managed code in places where there is no installer and where Ngen is not available. To address the needs of these developers and to round out our portfolio of performance technologies in the .NET Framework 4.5, we have introduced Multicore JIT, which uses parallelization to reduce the JIT compilation time during application startup.
With Multicore JIT, methods are compiled on two cores in parallel. The more code you execute on your startup path, the more effective Multicore JIT will be at reducing startup time. Improvements of 20% to 50% are typical, which is great news to anyone developing medium to large .NET Framework applications that are not able to take advantage of Ngen. You can reduce the startup time of your application by up to 50% with very little work, even if it runs off of a USB stick.
Real-world benefit of Multicore JIT
Let’s take a look at how this works in practice with a few real-world applications. Bing.com recently moved to Windows Server 2012 and the .NET Framework 4.5. Because of Multicore JIT, their ASP.NET-based services now start up in roughly half the time, going from an average of around 155 seconds to just under 80 seconds. You can read more about the Bing.com results with Multicore JIT in their recent blog post.
Multicore JIT can also yield significant improvements to desktop Windows Presentation Foundation (WPF) applications. The graph below shows startup times with and without Multicore JIT for three desktop WPF applications. In terms of code executed on startup, these applications are small to medium-sized, and are certainly much smaller than Bing.com’s ASP.NET applications.
A comparison of startup time with and without Multicore JIT
Even though these applications are small to medium-sized, the startup improvements from Multicore JIT range from 16% to 35%. In the case of Windows Performance Analyzer, the startup path included loading a performance trace and displaying a number of graphs. This shows that even with a large amount of non-JIT related work, Multicore JIT can result in a big improvement in the overall startup time of the application.
Let’s take a look at the CPU characteristics of Multicore JIT by looking at Paint.NET. This application normally uses Ngen, but for this analysis the native images were removed and the application’s source was modified to enable Multicore JIT. The following graph compares CPU utilization during the launch of Paint.NET with and without Multicore JIT.
Paint.NET startup improvements from Multicore JIT
In the Multicore JIT scenario, instead of JIT-compiling methods on one CPU, two CPUs were used and the application was able to reach the end of its startup execution more quickly. An 8-core computer was used, so 12.5% on the % CPU axis is one CPU at full utilization, and 25% is two CPUs at full utilization.
Take a look at the following Channel 9 interview with a developer from the Windows team, who used Multicore JIT in the apps that we looked at above.
How Multicore JIT works
Multicore JIT uses two modes of operation: recording mode and playback mode. During recording mode, the JIT compiler records every method it is asked to compile. Once the CLR determines that startup is complete, it saves to disk a profile of all the methods that were executed.
Multicore JIT recording mode
When Multicore JIT is enabled, recording mode is used the first time your application is launched. For subsequent launches, playback mode is used. Playback mode loads the profile from disk and uses the information to compile methods in the background, before they are needed by the main thread.
Multicore JIT playback mode
As a result, the main thread doesn’t need to do as much compilation, and your application launches faster. The recording and playback features are turned on only for multicore machines, since single-core machines do not benefit from parallelization.
Using Multicore JIT
We’ve made it simple for you to use Multicore JIT from your application.
In a .NET Framework desktop application, all you need to do is use the System.Runtime.ProfileOptimization class to start profiling at the entry point of your application; the rest happens automatically. You enable Multicore JIT by inserting two method calls, SetProfileRoot and StartProfile, in your application constructor:
Starting Multicore JIT in an app constructor
The SetProfileRoot call tells the runtime where to store JIT profiles, and the StartProfile call enables Multicore JIT by using the provided profile name. The first time your application is launched, the profile does not exist, so Multicore JIT operates in recording mode and writes out a profile to the specified location. The second time your application launches, the CLR loads the profile from the previous launch, and Multicore JIT operates in playback mode.
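The two calls described above can be sketched as follows. This is a minimal sketch rather than the original listing: the `MulticoreJitStartup` class, the `Enable` helper, and the choice of the per-user local application data folder as the profile root are assumptions made for illustration. Only `ProfileOptimization.SetProfileRoot` and `ProfileOptimization.StartProfile` come from the .NET Framework.

```csharp
using System;
using System.IO;
using System.Runtime;

// Hypothetical helper class; the name and shape are illustrative only.
public static class MulticoreJitStartup
{
    // Enables Multicore JIT for the current process and returns the
    // folder in which the runtime will store the JIT profile.
    public static string Enable(string appName, string profileName)
    {
        // Assumption: a per-user, writable folder is a reasonable profile root.
        string profileRoot = Path.Combine(
            Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
            appName);

        // Create the folder up front as a precaution, so the runtime
        // has somewhere to write the profile on the first launch.
        Directory.CreateDirectory(profileRoot);

        // Tell the runtime where JIT profiles are stored.
        ProfileOptimization.SetProfileRoot(profileRoot);

        // Record a new profile (first launch) or play back an existing one.
        ProfileOptimization.StartProfile(profileName);

        return profileRoot;
    }
}
```

In a WPF app, a call such as `MulticoreJitStartup.Enable("MyApp", "Startup.Profile")` (both names are placeholders) would go at the top of the `App` constructor. In a console app, it would go at the top of `Main`, before any other startup work.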
If your application has a multi-stage startup, you can call StartProfile at any point in your application to take advantage of parallel compilation. For example, after your initial startup sequence, you might display a menu that enables the user to navigate into different parts of your app, with each navigation loading new code paths and causing more JIT compilation. In this case, you could use one JIT profile for the main menu and another profile for the various items in the main menu.
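A multi-stage startup like the one described above might be sketched as follows. The `StagedProfiles` class, the stage names, and the `<stage>.Profile` naming scheme are hypothetical. The sketch also assumes that `ProfileOptimization.SetProfileRoot` has already been called once at application startup.

```csharp
using System;
using System.Runtime;

// Hypothetical helper that keeps one JIT profile per startup stage.
public static class StagedProfiles
{
    // Illustrative naming scheme: "MainMenu" -> "MainMenu.Profile".
    public static string ProfileNameFor(string stage)
    {
        return stage + ".Profile";
    }

    // Call this when the app enters a new stage, for example when the
    // main menu is shown or when the user navigates to a menu item.
    // Assumes SetProfileRoot was called earlier in the process.
    public static string EnterStage(string stage)
    {
        string profileName = ProfileNameFor(stage);

        // Start recording or playing back the profile for this stage.
        ProfileOptimization.StartProfile(profileName);

        return profileName;
    }
}
```

With this shape, the app could call `StagedProfiles.EnterStage("MainMenu")` once the initial startup sequence finishes, and then `StagedProfiles.EnterStage("Editor")` (again, a placeholder name) when the user picks that item, so each code path gets its own profile.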
When we developed Multicore JIT, we took into consideration that ASP.NET applications run in a hosted environment, so we turned on Multicore JIT for these applications automatically. If you’re running ASP.NET 4.5, you don’t have to do any extra work to turn on Multicore JIT. To make your ASP.NET application start up faster, simply upgrade your server to ASP.NET 4.5.
If you want to turn Multicore JIT off in your ASP.NET 4.5 applications, use the new profileGuidedOptimizations flag in the web.config file as follows:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <system.web>
    <!-- ... -->
    <compilation profileGuidedOptimizations="None" />
    <!-- ... -->
  </system.web>
</configuration>
XML code for turning off Multicore JIT in ASP.NET applications
Multicore JIT is an easy-to-use performance feature for applications that do not use Ngen. You can use this feature to speed up your application launch time by up to 50% with very little work. If you are developing an ASP.NET application, you will see automatic benefits by moving to ASP.NET 4.5. If you are developing a desktop application, you can turn Multicore JIT on with only a few lines of code. If you’d like to know more about Multicore JIT, you can take a look at our in-depth Channel 9 interview above or see the System.Runtime.ProfileOptimization class topic in the MSDN Library.
We provide performance improvements with each version of the .NET Framework, so you should always try out your applications on the latest version. To read about some of the other improvements we have made in the latest version, be sure to check out Ashwin Kamath’s article Overview of Performance Improvements in .NET 4.5.