Designing for Add-on Performance

As we worked towards the recent release of Internet Explorer 8 Beta 1, the IE team focused hard on performance. As part of our effort to improve IE, our investigations have revealed several add-on performance problems. In this post, I want to share some of the common themes that we have discovered.

First, I would like to thank those of you who have provided feedback on this blog, in the IE Beta newsgroup, and around the web. The Internet Explorer team has been working hard on performance in IE8 and it is great to see the results of some of our early investments. We still have room (and plans) to improve, but for now you can find out more about the performance improvements in IE8 Beta 1 from our developer whitepapers.

If you are new to the world of developing IE add-ons and want some background material, here are a few great links to get you started:

Broadly speaking, add-on performance issues typically impact IE users in two areas:

  1. Opening/Closing the IE window or individual tabs
  2. Browser responsiveness

Opening and closing speeds are largely impacted by add-ons performing lots of expensive work every time they are created. One particularly common problem is that add-ons check for updates during either browser startup or shutdown.

Registry misuse has been a common problem leading to poor responsiveness. Many add-ons perform expensive registry operations that can reduce Internet Explorer’s responsiveness.

In the sections below I discuss these two areas and provide some guidance for designing performance into add-ons.

Add-on Initialization and Checking for Updates

  • Principle 1: Be lazy – give hard work to another thread
  • Principle 2: Don’t pay a toll every time you start the car

During startup, Internet Explorer checks the registry for installed add-ons. When IE detects an installed Browser Helper Object (BHO) or toolbar, it calls CoCreateInstance to instantiate each installed and enabled add-on. Essentially, Internet Explorer creates add-ons as in-process servers, executing in the context of IE’s main UI thread. For backwards compatibility, Internet Explorer follows these steps for every opened tab. This behavior is important for several reasons, and you’ll see why as I discuss some of the most common problems encountered by add-ons.

Be lazy – give hard work to another thread

One common trend in many of the popular add-ons today is integration with online content. Maintaining this integration with live data invariably entails some update mechanism. In many of the cases we have investigated, add-ons perform synchronous update checks when IE hands control over to the add-on’s SetSite implementation during initialization.

From my description of how add-ons are initialized in Internet Explorer, you can guess what the potential impact is from these types of update checks. Consider the following flow:

  1. IE begins initialization
  2. IE detects that the Foo Toolbar has been installed
  3. IE calls the Foo Toolbar’s SetSite method
  4. Foo Toolbar contacts https://foo.example.com to check for updated content
  5. Foo Toolbar returns control to IE
  6. IE continues initialization and displays the user’s homepage

See the problem yet? Consider step 4 above – what happens if the Foo Toolbar finds lots of content that needs to be updated, if the user’s connection to the content server is slow, or if the user is working offline? The answer is that, because add-ons execute in the context of the UI thread, the toolbar can cause IE to become unresponsive for long periods of time, or can inflate IE’s startup and shutdown times faster than a balloon at a clown convention.

A better approach is to create a worker thread that can perform the content update asynchronously. The preferred way is to use SHCreateThread (when developing an add-on in C++) as follows:

// Forward declarations for the helpers used by SetSite.
DWORD WINAPI Update(LPVOID pParam);
BOOL IsUpdateRequired();

STDMETHODIMP SetSite(IUnknown* pUnkSite)
{
    if (pUnkSite != NULL)
    {
        if (IsUpdateRequired())
        {
            // Run the update on a worker thread so the UI thread is not blocked.
            SHCreateThread(Update, NULL, CTF_COINIT | CTF_PROCESS_REF, NULL);
        }
    }
    else
    {
        // Release cached pointers and other resources here.
    }

    // Return the base class implementation
    return IObjectWithSiteImpl<CHelloWorldBHO>::SetSite(pUnkSite);
}

DWORD WINAPI Update(LPVOID pParam)
{
    DWORD dw = 1;
    // Perform update here
    return dw;
}

BOOL IsUpdateRequired()
{
    // Perform a low-cost check here to verify that an update should be
    // performed. This can be accomplished by checking a registry key.
    return TRUE;
}

Notice that in the above example SetSite creates a new thread to execute the Update method. Using this approach, SetSite does not run the risk of blocking the UI thread for extended periods of time, and the add-on is still able to update its content. Also notice that by establishing a suitable frequency for update checks (for example, every 2 or 3 days), add-ons can be updated quickly without forcing users to pay the price of the update check with every browser or tab opening.

Adopting this approach can help move long-running operations off of IE’s main UI thread and can lead to better perceived performance. It is important to remember, however, that moving to a worker thread is not a panacea. There are many potential issues, including the possibility that numerous expensive cross-thread COM calls could outweigh the benefit of moving to a worker thread.
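The low-cost check behind IsUpdateRequired could, for instance, compare a stored timestamp of the last update check against the current time. The following is a minimal sketch under that assumption. The function name IsUpdateDue, the constant, and the choice of Unix seconds for the timestamps are all illustrative and not part of the original sample.

```cpp
#include <cstdint>

// Illustrative constant: number of seconds in one day.
constexpr std::uint64_t kSecondsPerDay = 24ULL * 60ULL * 60ULL;

// Hypothetical helper (not from the original sample): returns true when at
// least intervalDays have elapsed since the last recorded update check.
// Both timestamps are assumed to be seconds since the Unix epoch.
bool IsUpdateDue(std::uint64_t lastCheckSeconds,
                 std::uint64_t nowSeconds,
                 std::uint64_t intervalDays)
{
    // If the clock moved backwards, the stored value cannot be trusted,
    // so err on the side of checking again.
    if (nowSeconds < lastCheckSeconds)
    {
        return true;
    }
    return (nowSeconds - lastCheckSeconds) >= intervalDays * kSecondsPerDay;
}
```

In an add-on, IsUpdateRequired might read the stored timestamp once, then call a helper like this with the current time and an interval of 2 or 3 days.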

Pay the toll when you get to the booth

Handing off long-running operations to a worker thread helps avoid UI hangs. Nevertheless, users may still pay an avoidable up-front cost every time your add-on is initialized. Users often start IE without taking advantage of the updated content. In these cases both the users and content providers are paying extra costs associated with the update checks without any commensurate dividend.

When performing content updates, an extreme approach would be to pay the costs only when users have explicitly announced that they want new content – by clicking on the “Check for Updates” menu item, for example. That solution is, however, unrealistic in many cases because it could compromise the add-on’s performance. For example, consider a user clicking on a drop-down menu and having to wait a second to view the associated drop-down while updated content is downloaded – yikes!

There are a variety of techniques that more effectively balance user experience and up-front costs. For example, toolbar developers might want to consider moving their update checks out of SetSite entirely and performing them either the first time the user mouses over the toolbar or on a fixed schedule. Exact solutions will vary from add-on to add-on, so it’s important to stay creative and try to avoid forcing fixed costs on users whenever possible.
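One way to defer the update check until the first interaction is a small run-once guard. The sketch below is an assumption-laden illustration rather than a prescribed design: the class name LazyUpdateTrigger and the injected startUpdate callback are hypothetical, and in a real toolbar the callback would hand the work to a worker thread instead of running it inline.

```cpp
#include <atomic>
#include <functional>
#include <utility>

// Hypothetical helper (not from the original post): starts the update the
// first time the user interacts with the add-on, and never again afterwards.
class LazyUpdateTrigger
{
public:
    explicit LazyUpdateTrigger(std::function<void()> startUpdate)
        : startUpdate_(std::move(startUpdate)) {}

    // Call from the first-interaction handler (for example, a mouse-over).
    // Returns true only for the single call that started the update.
    bool OnFirstInteraction()
    {
        bool expected = false;
        // Atomically flip the flag so concurrent callers start at most one update.
        if (!started_.compare_exchange_strong(expected, true))
        {
            return false;
        }
        startUpdate_();
        return true;
    }

private:
    std::function<void()> startUpdate_;
    std::atomic<bool> started_{false};
};
```

With this shape, SetSite only constructs the trigger, which is cheap, and users who never touch the toolbar never pay for the update check.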

In almost every case there is a way to avoid doing lots of work in either SetSite or in an OnDocumentComplete handler. Taking the extra time to push work out of these areas is a great way to avoid performance problems and ensure that users are happy to install your add-on.

Using the Registry

  • Principle 3: Caching is your friend
  • Principle 4: Break the habit – Don’t flush!

Caching is your friend

Using the registry is sometimes reminiscent of the Macarena circa 1996 – a few people knew the steps, fewer people were actually good at it, but neither of those facts prevented everyone else from taking part. Registry overuse is common among Windows applications, and we have been working hard to reduce our registry accesses with IE8.

Overusing the registry is discouraged because the overhead of registry operations can be significant – opening, reading, and closing a cached key can cost tens of thousands of cycles. Since it is relatively common for individual add-ons to perform hundreds, thousands, or even tens of thousands of registry accesses during startup, these costs can quickly add up to a noticeably slower browser.

Fortunately, it is possible to reduce the cost of using the registry. First and foremost, optimize for the common case. It is very likely that most registry values are not going to be changed during the course of an add-on’s execution, so reading the value once and then maintaining a cache can significantly reduce the number of individual registry accesses.
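The read-once-then-cache idea can be sketched as a small lookup table in front of whatever code actually reads the registry. This is a minimal illustration under stated assumptions: the class name SettingsCache and the injected loader callback are hypothetical, the loader stands in for the real registry read, and the sketch is not thread-safe as written.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical cache (not from the original post): each setting is loaded at
// most once, and later lookups are served from memory.
class SettingsCache
{
public:
    // The loader performs the expensive read (for example, a registry query).
    using Loader = std::function<std::wstring(const std::wstring&)>;

    explicit SettingsCache(Loader loader) : loader_(std::move(loader)) {}

    const std::wstring& Get(const std::wstring& name)
    {
        auto it = values_.find(name);
        if (it == values_.end())
        {
            // First request for this name: pay the cost once and remember it.
            it = values_.emplace(name, loader_(name)).first;
        }
        return it->second;
    }

private:
    Loader loader_;
    std::map<std::wstring, std::wstring> values_;
};
```

A cache like this suits values that do not change while the add-on is running. Settings that can change underneath the add-on would need some invalidation scheme, which this sketch leaves out.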

Where it is not possible to eliminate registry accesses, you can often reduce the cost of the remaining operations. It turns out that accessing keys using full registry paths (e.g. HKEY_LOCAL_MACHINE\Foo\Bar) can be two to three times as expensive as using relative paths, depending on the number of levels separating the target key from the provided root. Add-ons typically have the vast majority of their settings available under a key or a small set of keys. For example, suppose an add-on wanted to retrieve the associations used by IE. The following registry keys would need to be accessed (under HKEY_LOCAL_MACHINE):

SOFTWARE\Microsoft\Internet Explorer\Capabilities\FileAssociations
SOFTWARE\Microsoft\Internet Explorer\Capabilities\MIMEAssociations
SOFTWARE\Microsoft\Internet Explorer\Capabilities\UrlAssociations

Using the Win32 function RegOpenKey, each of the registry keys could be accessed with the following snippet of code (using FileAssociations as an example):

HKEY hk;

RegOpenKey(HKEY_LOCAL_MACHINE, L"SOFTWARE\\Microsoft\\Internet Explorer\\Capabilities\\FileAssociations", &hk);

The remaining keys could be accessed in a similar fashion using HKEY_LOCAL_MACHINE as the root. However, a better approach in these cases is to create a handle to the Capabilities key and then perform additional relative-path RegOpenKey operations to retrieve the remaining values, as follows (again, using FileAssociations as an example):

HKEY hkRoot;

// Open the shared parent key once.
RegOpenKey(HKEY_LOCAL_MACHINE, L"SOFTWARE\\Microsoft\\Internet Explorer\\Capabilities", &hkRoot);

// Open each child key relative to the parent handle.
HKEY hkFileAssoc;
RegOpenKey(hkRoot, L"FileAssociations", &hkFileAssoc);

Break the habit - Don’t flush!

Lastly, in the past we have seen add-ons using RegFlushKey to ensure that their registry values were in fact pushed out to disk. In some cases this is done in an attempt to maintain state between two instances of an add-on running in separate tabs or windows.

As noted in the MSDN documentation for RegFlushKey, there is rarely a need to use this API. Furthermore, calling RegFlushKey can be surprisingly expensive as it will ensure that all the data backing the registry has been written to disk. That activity may take hundreds of milliseconds to return control to the calling program. Even worse, accesses to the registry will be blocked while it completes.

As a result, calls to RegFlushKey can have an impact not only on IE but can reduce performance throughout the system. Rather than flushing the registry, add-ons using the registry for synchronization between instances can use RegNotifyChangeKeyValue to maintain state. Larry Osterman and Raymond Chen have blog posts on (mis)use of the registry that are worth reading for more detail:

I hope my guidelines on improving add-on performance help you understand some of the common problem areas we have encountered. Thanks for contributing great add-ons to the Internet Explorer ecosystem, and I look forward to your comments.

Christian Stockwell
Program Manager
Performance Geek

Edit: Added "Root" to this line of code: RegOpenKey(HKEY_LOCAL_MACHINE, L"SOFTWARE\\Microsoft\\Internet Explorer\\Capabilities", &hkRoot);