Add-on Performance Part 3: Optimizing Add-on Startup Performance


In the first post of this series, we described how add-ons can decrease IE’s performance during tab creation. Many users with add-ons enabled have noticed a performance improvement when they open new tabs after disabling their add-ons. We also walked you through how to measure add-on performance and identify areas of impact using the Windows Performance Tools.

In this post, we delve into methods and best practices developers can use to improve their add-ons’ start-up performance, first time and every time. A great place to start is by minimizing the number of DLLs and dependencies associated with an add-on and by delay-loading these DLLs where possible. We show the bare minimum of code an add-on needs to run during initialization for the fastest performance; using the SetSite() implementation in the example below, we built a toolbar with an average load time of only 0.01 seconds in IE. We’ll also show you how to investigate add-on performance problems using the Windows Performance Tools, and use the data to highlight the performance footprints that result from several common add-on coding mistakes.

Once again, we ask add-on developers to use the information from this post and start applying performance optimizations today. Improving your add-on’s load time also sets it apart from slower add-ons when users compare load times in the Manage Add-ons window.

Revisiting Add-on Tab Creation

To bring readers up to speed, let’s first review how add-ons affect IE during tab creation. Every time the user creates a new tab, IE initializes add-ons to make sure that they can interact with a webpage properly. In fact, when the user launches a new IE window one of the first things IE does is create a new tab that, in turn, initializes all of the user’s add-ons in sequence. The time it takes each add-on to initialize is its load time.

When IE initializes an add-on, it first calls the CoCreateInstance() function on the add-on’s ClsID, which in turn invokes the add-on module’s DllGetClassObject() function to create an object in memory. Add-ons do not typically incur a performance delay during this function call. It’s nevertheless important to focus on this function call while optimizing startup performance since we’ve found examples of slow performance in some of the most popular add-ons.

Once the add-on object has been instantiated, IE passes a pointer to the IWebBrowser2 object to the add-on via the SetSite() function. This function facilitates the add-on’s initial communications with Internet Explorer and is exposed by the IObjectWithSite interface, which all IE add-ons must implement. Add-ons typically run their initialization routines in this function, such as displaying toolbar UI or loading other modules.

If the add-on being initialized is a Toolbar or Explorer Bar, IE calls the add-on’s ShowDW() function to make the add-on visible on the browser window. Some add-ons choose to run their UI rendering code within this function so it can also impact startup performance.

A Simple SetSite Implementation

The majority of performance issues during add-on startup occur within SetSite(). We encourage developers to do as little work as possible in this function; this is the best way to optimize for a fast load time. The following code shows a sample ATL implementation of a toolbar’s SetSite() function. There are several key points worth noting:

  1. This implementation is written with IE-native technologies (C++, Win32). We recommend building IE add-ons using these technologies instead of platforms that have a high up-front initialization or working set cost, such as Silverlight and managed code.
  2. This implementation performs the minimum amount of work required to complete the initialization, ensuring that the toolbar returns from the SetSite() call as soon as possible. Add-on developers should minimize the number of DLLs and external dependencies associated with an add-on and delay-load DLLs where possible.
  3. The toolbar doesn’t perform any of the network or disk-intensive operations (such as registry accesses) commonly observed in the SetSite() calls of certain add-ons.
  4. When you close a tab or the IE window, IE calls SetSite(NULL) on the add-on. Minimize the work in this part of the function as well to ensure that tabs close quickly.
  5. Some add-ons need to handle DWebBrowserEvents2 events (such as DocumentComplete) and run code in response. If your add-on doesn’t need to handle any events, you don’t need to sink them in SetSite().
// IObjectWithSite --------------------------------------------------------
STDMETHODIMP CContosoBand::SetSite(IUnknown *punkSite)
{
    if (punkSite != NULL)
    {
        // Initialize the toolbar.
        CComPtr<IOleWindow> spOleWindow;
        HRESULT hr = punkSite->QueryInterface(IID_PPV_ARGS(&spOleWindow));
        if (SUCCEEDED(hr))
        {
            hr = spOleWindow->GetWindow(&_hwndParent);
            if (SUCCEEDED(hr))
            {
                // Create a standard Windows comctl32 toolbar control and add a few buttons to it.
                hr = _toolbar.Init(_hwndParent);
            }
        }

        // Store off IWebBrowser2 and sink DWebBrowserEvents2.
        if (SUCCEEDED(hr))
        {
            CComPtr<IServiceProvider> spServiceProvider;
            hr = punkSite->QueryInterface(IID_PPV_ARGS(&spServiceProvider));
            if (SUCCEEDED(hr))
            {
                hr = spServiceProvider->QueryService(SID_SWebBrowserApp, IID_PPV_ARGS(&_spBrowser));
                if (SUCCEEDED(hr))
                {
                    // Establish a connection to the DWebBrowserEvents2 event source.
                    hr = DispEventAdvise(_spBrowser);
                    if (SUCCEEDED(hr))
                    {
                        _fAdvised = true;
                    }
                }
            }
        }
    }
    else
    {
        // Terminate the connection to the browser event source.
        if (_fAdvised && _spBrowser != NULL)
        {
            DispEventUnadvise(_spBrowser);
            _fAdvised = false;
        }

        // Tear down the toolbar window and clear out the pointers.
        _toolbar.Destroy();

        _spBrowser.Release();
    }

    return IObjectWithSiteImpl<CContosoBand>::SetSite(punkSite);
}

A toolbar that we built using the above SetSite() implementation incurred an average load time of only 0.01 seconds in IE. If you model your add-on’s implementation on the one above, you will be able to minimize your add-on’s impact on IE’s tab creation performance.
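One way to keep initialization this small as an add-on grows is to defer expensive setup until the first time it is actually needed. The following is a minimal sketch of that idea in portable C++, not part of the toolbar sample above. The names Lazy, Settings and ToolbarModel are hypothetical, and Initialize() only stands in for the role SetSite() plays in a real add-on.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper: builds the wrapped object on first access,
// not when the owner is constructed or initialized.
template <typename T>
class Lazy {
public:
    explicit Lazy(std::function<std::unique_ptr<T>()> factory)
        : factory_(std::move(factory)) {}

    // Returns the object, creating it on the first call only.
    T& Get() {
        if (!value_) {
            value_ = factory_();
            assert(value_ && "factory must return an object");
        }
        return *value_;
    }

    bool IsCreated() const { return value_ != nullptr; }

private:
    std::function<std::unique_ptr<T>()> factory_;
    std::unique_ptr<T> value_;
};

// Hypothetical stand-in for something costly to set up, such as
// settings read from the registry or a helper module.
struct Settings {
    std::string homepage;
};

// Hypothetical add-on skeleton. Initialize() plays the role of SetSite().
class ToolbarModel {
public:
    ToolbarModel()
        : settings_([] {
              // The expensive work would run here, on first use.
              auto s = std::make_unique<Settings>();
              s->homepage = "about:blank";
              return s;
          }) {}

    // Cheap: does no expensive work and returns immediately.
    void Initialize() { initialized_ = true; }

    // Called later, for example when the user clicks a toolbar button.
    const std::string& OnHomeButtonClicked() { return settings_.Get().homepage; }

    bool Initialized() const { return initialized_; }
    bool SettingsLoaded() const { return settings_.IsCreated(); }

private:
    bool initialized_ = false;
    Lazy<Settings> settings_;
};
```

With this shape, the cost of loading the settings is paid on the first button click rather than on every tab creation. Whether that trade-off suits a given feature depends on how soon after startup the user is likely to need it.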

Investigating Add-on Performance using Windows Performance Tools

As you add more functionality to your add-on, you may need to perform additional operations during add-on initialization. We encourage you to continually monitor your add-on’s performance using the Windows Performance Tools (xperf) or IE’s Manage Add-ons window. If you find performance regressions, don’t rely on guesswork to find the root cause. Instead, use the Windows Performance Tools to profile your add-on, investigate the traces, and identify the regressions.

In the following examples we’ll show you how to profile the startup performance for your add-on. The first step involves collecting a performance trace of the add-on during tab creation.

  1. Make sure your add-on is enabled (and visible if your add-on is a toolbar). For the best results, disable the other add-ons installed in the browser.
  2. From an elevated command prompt, execute the following command (assuming you’ve added the xperf folder to your system path) to start the trace:
    xperf -on latency+dispatcher -stackwalk profile+cswitch+readythread -buffersize 128 -minbuffers 300 -maxbuffers 300 -start browse -on Microsoft-IEFRAME:0x100
  3. Launch IE and open several new tabs. Be sure to wait a sufficient time between actions to ensure that all the necessary code executes successfully. You can open more new tabs to get more datapoints out of the trace.
  4. Stop the trace and log the results to a file (such as toolbarprofile.etl):
    xperf -stop browse -stop -d toolbarprofile.etl
  5. Once you’ve collected a trace, you can use the Windows Performance Analyzer (xperfview) to display several graphs showing a timeline of data covering the period recorded by the trace:
    xperfview toolbarprofile.etl

Main page for Windows Performance Tools

The Windows Performance Analyzer contains many different graphs based on the data from your trace. You can strip it down to only the relevant graphs for more convenient analysis.

Let’s zoom into the events that correspond to the add-on’s initialization (and un-initialization). We recommend developers analyze add-on performance for all of the following events:

Scenario                  Function                                Start event                                   End event

Tab Creation (load time)  CoCreateInstance()                      Microsoft-IEFRAME/ExtensionCreate/Start       Microsoft-IEFRAME/ExtensionCreate/Stop
                          SetSite()                               Microsoft-IEFRAME/ExtensionSetSite/Start      Microsoft-IEFRAME/ExtensionSetSite/Stop
                          ShowDW() (Toolbars/Explorer bars only)  Microsoft-IEFRAME/ExtensionShowDW/Start       Microsoft-IEFRAME/ExtensionShowDW/Stop

Tab Close                 SetSite(NULL)                           Microsoft-IEFRAME/ExtensionSetSiteNull/Start  Microsoft-IEFRAME/ExtensionSetSiteNull/Stop

For this example we’ll focus on analyzing the events that account for the load time of our toolbar (ExtensionCreate, ExtensionSetSite, ExtensionShowDW).

Right-click the GenericEvents graph and select Summary Table. You’ll see a window that shows when your events fired in the trace and how long they took. Use the selector on the left to arrange your columns in the following order for easier reading. If you find this column sequence helpful, you can save it as a summary table profile via the Save View… option in the View menu:

Process Name, Provider Name, Field1, Yellow Bar, Count, Time (s) (sort entries by this column), Task Name, Opcode Name

Expand the field for the iexplore.exe process, then the Microsoft-IEFRAME provider, and finally the field corresponding to the ClsID for your add-on. Scroll down until you find the win:Start and win:Stop event pairs for each of the initialization events. The overall time delta from ExtensionCreate-Start to ExtensionShowDW-Stop is the add-on’s load time, which in this case is only 0.01 seconds:

Summary table view from Windows Performance Tools

You can select the proper time region in the main window that is relevant to your add-on’s load time. Go to the main window, right-click on any graph and choose “Select Interval…”. Enter the values from the ExtensionCreate/Start and ExtensionShowDW/Stop events in the Summary Table (hint: right click on the cell containing the time for each event and select “Copy Cell”).

Once you hit enter on the Select Interval window, the specified time range will be selected in the main window. You can replicate that selection to all of the graphs by right-clicking on the selection and clicking “Clone Selection”. You can then zoom into that range for all graphs.
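If you export or copy the event rows out of the Summary Table, the load-time calculation described above is easy to automate. The sketch below is one way to do it in C++; the TraceEvent layout, the ComputeLoadTime name and the exact task and opcode strings are assumptions about how you copied the rows, not a format produced by the tools.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row copied by hand from the Summary Table (hypothetical layout).
struct TraceEvent {
    std::string task;    // assumed form: "ExtensionCreate", "ExtensionSetSite", "ExtensionShowDW"
    std::string opcode;  // assumed form: "win:Start" or "win:Stop"
    double timeSeconds;  // value of the Time (s) column
};

// Computes the load time of a toolbar or Explorer bar as the delta between
// the first ExtensionCreate start and the following ExtensionShowDW stop.
// Returns false if either event is missing or the times are out of order.
bool ComputeLoadTime(const std::vector<TraceEvent>& events, double* loadTimeSeconds) {
    assert(loadTimeSeconds != nullptr);

    bool haveStart = false;
    bool haveStop = false;
    double start = 0.0;
    double stop = 0.0;

    for (const TraceEvent& e : events) {
        if (!haveStart && e.task == "ExtensionCreate" && e.opcode == "win:Start") {
            start = e.timeSeconds;
            haveStart = true;
        } else if (haveStart && !haveStop &&
                   e.task == "ExtensionShowDW" && e.opcode == "win:Stop") {
            stop = e.timeSeconds;
            haveStop = true;
        }
    }

    if (!haveStart || !haveStop || stop < start) {
        return false;
    }
    *loadTimeSeconds = stop - start;
    return true;
}
```

For add-ons that are not toolbars or Explorer bars there is no ShowDW() call, so a variant of this function would end the interval at the ExtensionSetSite stop event instead.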

With the time interval set up, you can start looking at the individual views to investigate performance. For example, you can look at the CPU Usage by Process graph to see how much CPU time iexplore.exe was using during your add-on’s startup. You may need to use the selector on the left to enable the views. Here’s what the CPU Usage by Process graph looks like for the toolbar that we just profiled. Note that a value of 25% usage corresponds to using 100% of one CPU in a four-CPU system:

Graph that shows the CPU usage for the iexplore.exe processes when a typical add-on is initializing.

CPU Usage by Process (iexplore.exe processes only)
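The relationship between the percentage plotted on the graph and the load on a single CPU is simple arithmetic. As a small illustration (the function name PercentOfOneCpu is hypothetical), the conversion can be written as:

```cpp
#include <cassert>

// Converts a system-wide CPU percentage, as plotted on the graph, into the
// equivalent share of a single logical CPU. The result can exceed 100 when
// a process keeps more than one CPU busy at the same time.
double PercentOfOneCpu(double systemWidePercent, int logicalCpuCount) {
    assert(logicalCpuCount > 0);
    return systemWidePercent * logicalCpuCount;
}
```

On the four-CPU system used here, PercentOfOneCpu(25.0, 4) gives 100.0, so a flat line at 25% means one thread is fully occupying a CPU.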

Graphical View of Common Performance Problems

For comparison, we’ve captured traces for two add-ons that have known performance problems. You can see the differences in the graphs and the length of the initializations.

Graph showing the CPU usage if an add-on is waiting for a network call to return.

Sample Add-on 1 – Network calls during initialization (iexplore.exe CPU usage only)

This add-on takes around 0.27 seconds to load, but most of the time is spent waiting for a network call to respond, as evidenced by lack of CPU activity throughout the initialization.

Graph showing the CPU usage for an add-on performing lots of registry operations during initialization

Sample Add-on 2 – Registry operations during initialization (iexplore.exe CPU usage only)

This add-on takes around 0.3 seconds to load. IE consistently uses 25% of the CPU throughout the initialization accessing and writing values to the registry.
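Both sample add-ons do slow work directly on the initialization path. One possible remedy, sketched here in portable C++ as an assumption rather than as anything these add-ons or the toolbar sample actually do, is to start the slow operation in the background and let initialization return immediately. The SuggestionsLoader class and its fetch callback are hypothetical. A real add-on would also have to respect COM threading rules and hand results back to the UI thread, which this sketch does not attempt.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <string>
#include <utility>

// Hypothetical add-on component that needs data from a slow source.
class SuggestionsLoader {
public:
    // fetch stands in for a network call or a registry-heavy routine.
    explicit SuggestionsLoader(std::function<std::string()> fetch)
        : fetch_(std::move(fetch)) {}

    // Plays the role of SetSite(): starts the slow work on another thread
    // and returns without waiting for it.
    void Initialize() {
        assert(fetch_ && "a fetch callback is required");
        pending_ = std::async(std::launch::async, fetch_);
    }

    // Called later, when the data is actually needed. Blocks only if the
    // background work has not finished yet.
    const std::string& Result() {
        if (pending_.valid()) {
            result_ = pending_.get();
        }
        return result_;
    }

    // Note: destroying a future returned by std::async waits for the task,
    // so teardown can still block if the work is slow and was never collected.

private:
    std::function<std::string()> fetch_;
    std::future<std::string> pending_;
    std::string result_;
};
```

The design choice is to move the wait from tab creation to the moment the data is first used. If the user never uses the feature, nothing waits on the result until teardown, which is why the destructor behavior noted in the comment matters for keeping tab close fast.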

Summary

Tab creation performance is a great measure of users’ overall browsing satisfaction. Because add-ons are so commonly installed today, improving add-on startup performance is critical to ensuring that our users can continue to enjoy the benefits of add-ons while IE stays fast.

We encourage add-on developers to take the guidance from this post and apply it to their add-ons. Design the initialization routines for your add-ons from the ground up with performance in mind. Start by minimizing the number of DLLs that an add-on needs to load and delay-loading these DLLs where possible. Use the Windows Performance Analyzer to profile your add-on’s performance and spot the issues to optimize, instead of guessing at the problem areas in code. From measurement to investigation to optimization, add-on performance engineering is an iterative process.

Developers can use these tools and optimizations for other scenarios as well, such as navigation, add-on un-initialization, etc. If you find useful optimization methods or have any questions about analysis steps, feel free to post comments here. Start optimizing add-on performance today!

Herman Ng
Program Manager



Comments (17)

  1. Beatrix Kiddo says:

    Love how the IE team recommends VC++ over managed code.

  2. xer says:

    @Beatrix, I Agree. I've wasted a ton of time trying to get a sample like that working in C# never got it working properly.

    @MSFT, You should provide a working sample for a C# Com object using that code. Can you do it?

  3. tobi says:

    The new focus on addon performance is a great strategic move. It is not about technology but about forcing behavior.

    For IE9 you nailed performance pretty much. For IE10 we need the most convenient browser possible, because chrome did such a good job here. Surely you can crack it.

  4. Harry Richter says:

    … if people were to read the relevant information (including some on this blog):

    blogs.msdn.com/…/recap-of-add-on-con.aspx

    blogs.msdn.com/…/1317290.aspx

    then they'd know why using .Net is a bad choice of technology for IE add-ons, and suddenly it would become clear even to those why there are no examples available.

    Cheers

    Harry

  5. Steve says:

    Except in .NET 4.0 this isn't true anymore as it can host multiple .NET versions in the same process.

  6. Guy says:

    VC++ is managed….

  7. Mario says:

    Will McAfee site Advisor on My Windows 7 computer inside of IE8 work on the ie9? and will delay startup time if so how long?      and when will IE9 be out of Beta?!  please answer

  8. Help says:

    Hey IE team, with IE7/IE8, you broke the "What's this" help (little question mark) of Internet Options (especially Advanced tab) which worked in IE6 and replaced it with a CHM help file that does not describe or explain the advanced options at all. Now there's no hope of getting this fixed on XP but at least on Windows 7 and Vista, clicking the little arrow should behave like What's this help and not just open a useless help file which doesn't have any explanation of advanced options. The currently supported HTML Help also supports What's this type of help, so please update the IE CHM file to include all advanced options explanation and make clicking the ? reference it.

  9. CvP says:

    @IE Team

    a bit off topic but…

    I'm all for your "same markup" slogan however I have two concerns.

    1. It seems .clearfix {overflow: auto} is not working on IE9 (and opera) when html5 tags (header, nav etc) are used where it works for latest FF/Chrome. Why's that?

    I have to use .clearfix:after for these which is not something i want to use.

    2. What are you planning to do about useragent stylesheet? as long as each browser has its own different default css, without "css reset", same markup is impossible.

    (btw i like IE9's default stylesheet far better than any other browsers)

  10. Mike says:

    Also off-topic, but can we have updated versions of the IE 6 VHDs please ? The current ones expired 1st October 2010.

  11. Andy says:

    please look at bugid 603668 on microsoft connect in regards to add-on performance notification and the recommendation I put on there…

    Posted by jawz101 on 9/23/2010 at 12:37 PM

    I think I'd like to add a comment on what should happen.

    I have an add-on, LastPass, which I love. I know it takes longer than .2 seconds to load and I am willing to accept that. Now, I could adjust the add-on performance check's threshold to be something like .5 seconds, however, I'd like to still be notified of other add-ons that exceed the .2 second threshold.

    What I think should happen is to have the threshold check be per add-on or have the option to exclude add-ons from the performance check- ie – mark an add-on as 'Disable performance checks against this add-on'.

  12. Rob^_^ says:

    @CvP

    add

    /* HTML5 new tags */

           article, aside, details, figcaption, figure, footer, header, hgroup, menu, nav, section

           {

               display: block;

           }

    to your reset.css… IE9 renders them like <spans>, the other browsers <div>s…. theres a ticket on connect.

  13. xer says:

    @Harry Richter

    #1 For your first link, Microsoft should figure out a way to make it fast to load .net apps in the browser because its their flagship development platform, they own the browser, the framework, and the OS… so I'm sure they can make it work.

    #2 Second link, I never knew this, but it still does not change the fact that .Net should be backward and forward compadble, so loading the latest .net framework to host the .net code should be used. If the .net code requires a newer version of the framework, the browser should handle prompting and installing the latest framework.

    If I remember correctly, I think the problem I ran into with implementing the ISite was that you couldnt use it with safe for scripting, it forces you to pick one or the other. Thats if I remember correctly ;o)

  14. Harry Richter says:

    @ xer

    .Net is the flagship development platform for DESKTOP applications (VB.Net and C#) and WEB-Apps (ASP.Net). It is NOT the development platform for browser add-ons. That is what C++ is for. For each task its own tool! The nature of .net does not permit it to load fast enough for a browser add-on, a penalty that is acceptable for a desktop application, but NOT for browser add-ons, where the core libs will have to be loaded maybe dozens of times during a session (once for each tab created, with process isolation making it even more difficult). To use your metaphor: even if you own the world, you’ll have problems abolishing gravity!

    Cheers

    Harry

  15. xer says:

    @harry

    I must disagree .net is not for desktop applications, its basically for any platform you want to provide an application for. Runs on desktops, mobile phones, webservers, linux, mac, etc.

    You should be able to write some code and use it anywhere. I wish they would actually allow C# as a scripting language to replace javascript. Would be great too if they didnt force it to be straight text anyone could steal… Why don't we have precompiled javascript files?

  16. Matt says:

    @Steve: Side-by-side problems aren't the only reason you shouldn't be building browser add-ons in .NET. IE's multi-process architecture means you end up loading the framework into every tab, bloating its working set and at significant CPU cost. Here's a real-world example of why you don't build add-ons in .NET: blogs.msdn.com/…/agcore-addon-hangs-internet-explorer.aspx

    @xer: You need to understand the difference between what you wish were so, and what is actually so.

    @Guy: VC++ can compile to native, or managed.

  17. xer says:

    @Matt

    No problem understanding the difference, as I said in my post ther "I wish".

    To comment on your article you mentioned to Steve… I was reading it over and I saw in the comments by EricLaw (MSFT) the following

    "The Office team made a major investment in .NET programmability, and IE has not prioritized such an investment to date (as we have other, higher priorities)."

    Thats exactly my problem which creates this argument, .NET should be a high priority to the IE team as it is in all other Micrsoft programs/devices. Ofcourse I would agree they do have other higher priorities since they took X amount of years off. The statement also shows they understand they have a lack of .NET support and someday it should be worked on… they should put at least a little bit of time into it before IE9 is released.

    It was probably the lack of the IE teams support for .NET that caused Microsoft to create Silverlight