HTML5, and Real World Site Performance: Seventh IE9 Platform Preview Available for Developers

Performance on the web is multi-dimensional. Real websites call on many different browser subsystems to deliver amazing experiences. Each browser subsystem in turn has its own performance characteristics. The chart here and the diagrams in this post describe the specific subsystems (JavaScript, graphics rendering, and more) in detail.

Over the last few weeks, we’ve been tuning the JavaScript engine for more of the patterns we’ve found in real world sites. Based on the progress since the last platform preview, we’re releasing an updated platform preview build today so developers can try it out and provide feedback about the changes. Here’s a video of the changes in action:

Stepping back from these striking differences, let’s look at the big picture. JavaScript performance is one part of one aspect of performance. In addition to the other parts of the browser platform (like graphics rendering), performance involves other aspects as well: how well does the browser guard the user from slow add-ons, or from unreliable sites that crash? Enabling users to pin websites to the Windows taskbar means that users can go directly to sites without having to launch the browser and navigate. Making the most of your device for web browsing is significant as well. Taking advantage of the whole PC, and using the specialized graphics hardware and the many cores that modern PCs typically include, offers huge performance gains.

Tuning the JavaScript Engine for Real-World Patterns

Real-world websites use JavaScript to respond to user input, manipulate strings, move objects around the screen, and much more. We looked at common patterns and made adjustments to IE9’s script engine. You can try these examples (and more) at www.ietestdrive.com.
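
To make that concrete, here is a small, hypothetical snippet (the element IDs and handler are illustrative, not taken from any Test Drive sample) showing the kind of pattern we mean: respond to input, manipulate a string, and move an element on screen.

// Hypothetical example of a common real-world script pattern.
// Assumes a page containing elements with the IDs "search" and "box",
// and that "box" is absolutely positioned so style.left has an effect.
var field = document.getElementById("search");
var box = document.getElementById("box");

field.onkeyup = function () {
    // String manipulation: normalize the user's input.
    var query = field.value.replace(/\s+/g, " ").toLowerCase();

    // Move an element in response: shift the box based on the input length.
    box.style.left = (query.length * 10) + "px";
};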

Running these samples in other browsers, like the latest Google Chrome or Firefox 4, shows significant performance differences. In Shakespeare’s Tag Cloud, you see the performance advantage of compiled JavaScript and ECMAScript5 support for script-intensive operations. IE9 can run through a large body of text and build a tag cloud much more quickly. That same ECMAScript5 support, combined with HTML5 Canvas, delivers an HTML5 Sudoku solver that can solve puzzles faster than other browsers running the same site. With Galactic, you see fast JavaScript and hardware-accelerated graphics together creating a 3D interactive experience. The frame rate in IE9 is significantly better than what other browsers deliver today.

The WebKit SunSpider Microbenchmark

One measure of some aspects of JavaScript performance is the WebKit SunSpider microbenchmark. Remember that JavaScript is just one component of many that define browser performance on real world sites. Here is a chart of the latest results:

WebKit SunSpider JavaScript Benchmark Results. IE9 Platform Preview 7 is fastest, followed by Chrome 8 Beta, Opera 11 Alpha, Opera 10.63, Chrome 7, FF 4 Pre-Release, Safari 5.0.2, FF 3.6.12, and IE8.

You may notice that the relative positions of browsers on this chart have changed with the latest IE9 Platform Preview.

We’ve gotten some questions after the last few Platform Previews about this chart. The results in the graph are from running actual browsers (in a consistent lab setup), just as we have since we started publishing the chart. Reporting results that “rely on a ‘shell’ JS engine that runs in a command line” is odd because those results don’t reflect the user’s experience in a browser. Similarly, because the point of a browser is to run actual websites, not just benchmarks, the chart we publish continues to include two versions of each other browser. You can read more on this choice at the end of this post.

You may also notice that the differences between browsers on this microbenchmark are converging to within thousandths of a second on tests that repeat operations many, many times in order to find any differences at all. Some have suggested that this benchmark no longer matters and that it’s time for another JavaScript microbenchmark.

We’ve been consistent in our point of view that these tests are at best not very useful, and at worst misleading. Even with the most recent results in the chart above, our motivations and our point of view remain unchanged. We’ve focused on improving real world site performance. We’ve made progress on some microbenchmarks as a side effect. Focusing on another subsystem microbenchmark is not very useful.

Microbenchmarks and Real-World Web Patterns Have Little in Common

We think people should evaluate browser performance with real-world scenarios. Real-world scenarios involve using all the subsystems in the browser together rather than looking at single subsystems in isolation. Using a narrow slice of features to assess the big picture makes as little sense here as using the “Acid” tests to understand standards compliance.

Microbenchmarks isolate very narrow and specific aspects of a system. That’s the opposite of real-world websites, which use the different subsystems together to deliver something useful or entertaining. Microbenchmark results often have little in common with the actual user experience of sites; comparing JavaScript ray tracing or Fourier transforms, for example, is a good measure only if that’s what the sites you rely on actually do. Performance on the Test Drive site samples has not, to date, looked like the WebKit SunSpider graph because there’s more to real-world performance than JavaScript. The work at the W3C with other browser developers on a web standard to help developers measure and understand the performance of their websites is a powerful statement about how much more there is to performance than JavaScript.
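
As a rough sketch of what that kind of standard enables (the property names follow the W3C Web Timing draft and may differ from what any shipping browser exposes today), a page could measure its own end-to-end load time rather than relying on a microbenchmark:

// Sketch of page-level measurement using the draft Navigation Timing
// interface (window.performance.timing). Treat the names as illustrative.
window.addEventListener("load", function () {
    // Wait one tick so loadEventEnd has been recorded.
    setTimeout(function () {
        if (!window.performance || !window.performance.timing) {
            return; // timing interface not available in this browser
        }
        var t = window.performance.timing;
        var network = t.responseEnd - t.navigationStart;   // time to fetch the page
        var pageLoad = t.loadEventEnd - t.navigationStart; // time to fully load it
        // A real site would report these numbers back to its server.
        alert("network: " + network + "ms, total load: " + pageLoad + "ms");
    }, 0);
}, false);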

This Channel 9 interview with Jason Weber (a Lead Program Manager on the IE team focused on performance) is worth watching if you’re interested in hearing more about the difference between real-world performance and microbenchmarks.

The IE Test Drive site offers samples that run in all browsers and deliberately represent real-world site patterns rather than microbenchmark-style samples. The visualization of the real-world samples is often more fun than a graph, as with Galactic, Browser Hunt, and Speed Reading. The performance differences between browsers can be striking. They reflect how we’ve designed for real-world, end-to-end performance rather than tuning subsystems for microbenchmarks.

Previews with Meaning

Back in March, we committed to delivering public Platform Preview Builds to the developer community on a regular schedule leading up to the IE9 beta. Feedback about these previews from developers has made both the IE9 development process and the IE9 release itself significantly different from previous IE releases.

This latest Platform Preview comes about two and a half weeks after the most recent Preview. Leading up to the IE9 beta, we released Platform Previews approximately every eight weeks. We produce internal builds of IE at least once a day, often more. Why not release public “dailies” and “nightlies” of IE?

The cadence of IE Platform Previews reflects our point of view: the point of a browser is to run actual websites, not just benchmarks or samples that are hardwired for one browser. Our point of view starts with providing consistent quality to respect customers’ time and includes delivering meaningful progress with each Platform Preview.

Twenty-four hours of elapsed time is rarely meaningful. “Nightlies” vary widely in performance and quality, and may not run actual websites successfully. Those daily builds are of some interest to a small audience of “insider” enthusiasts who often take activity (even incrementing the version number) as progress. The gap between the IE Platform Preview download numbers (in the millions) and the other browsers’ pre-release offerings reflects this difference.

Looking ahead, we will continue to improve IE9’s performance. Making the full power of the PC available to websites is part of our focus on real-world sites and real-world performance. We’ll continue to work with the W3C and other browser developers to deliver on the goal of the same markup (the same HTML, CSS, and script) working across browsers. We’ll continue to take a holistic approach to the browser experience and platform and safety. We’ll also continue to release at a cadence that provides meaningful builds for the community to provide meaningful feedback.

Dean Hachamovitch

Update 1:10pm – There have been some questions about how a particular optimization in the Chakra engine affects IE9’s WebKit SunSpider results. The addition below addresses these questions.

Dead Code Elimination in JavaScript

One of the changes we made to the IE9 JavaScript engine, codenamed Chakra, to improve performance on real-world web sites involves dead code elimination. Yesterday afternoon, someone posted a question on the Microsoft Connect feedback site (“What sorts of code does the analysis work on, other than the exact [math-cordic test] function included in SunSpider?”). Given the recent interest in this topic, this post answers that question.

Briefly, the IE9 JavaScript engine includes many different changes to improve the performance of real-world Web sites and applications. You can see this in action by visiting www.ietestdrive.com and trying the samples there with IE9 and other browsers. The behavior of the IE9 JavaScript engine is not a “special case optimization” for any benchmark and not a bug.

Some of the optimizations we’ve made to the JavaScript interpreter/compiler in IE9 are of a type known in the compiler world as dead code elimination. Dead code elimination optimizations look for code that has no effect on a running program and remove that code from the program. This has the benefit of both reducing the size of the compiled program in memory and making the program run faster.

Here is a very simple example of JavaScript code that is a candidate for dead code elimination. Because the conditional will always evaluate to false, the JavaScript engine can eliminate this code altogether.

 function func() {
    var x = 1;
    var y = 3;
    var w = x + y;

    if (w != 4) {
        // dead code 
    }
}

Dead code elimination is especially effective when code would be executed many times, as in a loop. In the following example, the code in the loop repeatedly overwrites the same variable (what is known in computer science as a dead store), so the loop can be reduced to a single assignment.

 function func(a, b) {
    var x;
    var i = 300;
    while (i--) {
        x = a + b; // dead store
    }
}
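
Conceptually (this is a sketch of the effect, not the engine’s literal output), the function above collapses to a single evaluation of the expression; one addition is kept because, if a or b were objects, the + operator could have observable side effects:

function func(a, b) {
    // After dead store elimination: the loop and the repeated stores to x
    // are gone; a single evaluation of a + b remains in case the addition
    // has observable side effects (for example, a custom valueOf).
    var x = a + b;
}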

In the example below, the result computed in the loop is not used anywhere in the program outside the loop, so the entire loop can be safely removed.

 function sum() {
    var a = [1, 2, 3, 4, 5];
    var sum = 0.0;
    
    // dead loop elimination
    for (var i = 0; i < 5; i++) {
        sum += a[i];
    }
}

Developers often write dead code without knowing it and can rely on compilers to optimize the code. Most modern compilers today run an extensive set of dead code elimination optimizations.
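
For example (a hypothetical snippet, not taken from any particular site), it’s easy to leave behind a computed value that nothing ever reads, which is exactly the kind of dead store a compiler can optimize away:

function formatName(user) {
    // This value is computed but never used anywhere: a dead store that
    // dead code elimination can remove.
    var initials = user.first.charAt(0) + user.last.charAt(0);

    return user.first + " " + user.last;
}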

So, why does this affect the math-cordic benchmark in the WebKit SunSpider suite? Let’s take a look at the inner function in the test.

function cordicsincos() {
    var X;
    var Y;
    var TargetAngle;
    var CurrAngle;
    var Step;

    X = FIXED(AG_CONST);          /* AG_CONST * cos(0) */
    Y = 0;                        /* AG_CONST * sin(0) */

    TargetAngle = FIXED(28.027);
    CurrAngle = 0;
    for (Step = 0; Step < 12; Step++) {
        var NewX;
        if (TargetAngle > CurrAngle) {
            NewX = X - (Y >> Step);
            Y = (X >> Step) + Y;
            X = NewX;
            CurrAngle += Angles[Step];
        } else {
            NewX = X + (Y >> Step);
            Y = -(X >> Step) + Y;
            X = NewX;
            CurrAngle -= Angles[Step];
        }
    }
}

The benchmark runs an expensive loop and then does nothing with the results; it is written in exactly the way that triggers this general optimization.

Of course, the benchmark could be rewritten to avoid triggering this optimization, which would bring our performance on this specific benchmark in line with other browsers.
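
One way to do that (a sketch of the general pattern, not a proposal for the official suite) is to make the loop’s result observable, for example by returning it and accumulating it where the test harness can check it:

// Sketch: keep benchmark work observable so dead code elimination
// cannot legally remove it.
var checksum = 0;

function busyWork(count) {
    var total = 0;
    for (var i = 0; i < count; i++) {
        total += (i >> 1) + (i << 1); // stand-in for the cordic arithmetic
    }
    return total; // the result escapes the function, so the loop is live
}

for (var run = 0; run < 1000; run++) {
    checksum += busyWork(500);
}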

The interest in this issue is a great example of why these microbenchmarks fail to represent the real-world web. WebKit SunSpider uses an expensive JavaScript loop to approximate sine and cosine. Real-world sites would instead use the much faster, CPU-optimized functions already available in JavaScript engines, such as Math.sin and Math.cos.
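
For instance, a page that actually needed these values would simply call the built-ins (assuming the benchmark’s 28.027 is an angle in degrees):

// Real-world code would use the engine's built-in, CPU-optimized math
// functions instead of approximating them in a JavaScript loop.
var radians = 28.027 * Math.PI / 180; // convert the target angle to radians
var sine = Math.sin(radians);
var cosine = Math.cos(radians);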

These optimizations are relatively new to the world of JavaScript runtimes, even though there are many examples of dead code in real-world JavaScript on the web. They often require significant flow analysis of the code, and on a real-world site, spending too much time analyzing code can reduce the responsiveness of the page. The Chakra engine strikes a balance between code quality and analysis time and performs only a small set of dead code optimizations. Bugs reported via Microsoft Connect are examples of where the optimization could do more, and we continue to tune these and other optimizations for the final release.

This kind of dead code elimination is one of many optimizations that Chakra makes to reduce unnecessary work. Over the next few days we’ll post about some of the other techniques Chakra uses to deliver great performance.