Brain Dump: Shims, Detours, and other “magic”

Note: The “brain dump” series is akin to what the support.microsoft.com team calls “Fast Publish” articles—namely, things that are published quickly, without the usual level of polish, triple-checking, etc. I expect that these posts will contain errors, but I also expect them to be mostly correct. I’m writing these up this way now because they’ve been in my “Important things to write about” queue for ~5 years. Alas, these topics are so broad and intricate that a proper treatment would take far more time than I have available at the moment.

Since IE6, Internet Explorer has implemented major architectural changes without accompanying breaking changes to its binary extension model. While new extension features have been introduced (e.g. Search Providers, Web Slices, and Accelerators), they are all based on markup rather than code and have been relatively straightforward to keep working from version to version.

In contrast, Internet Explorer’s binary extension models (ActiveX, Browser Helper Objects (BHOs), Toolbars, etc.) are all architected such that 3rd-party COM code runs within the Internet Explorer process. In many cases, extensions originally designed for IE6 (and earlier) continue to run without modification even in IE9 and IE10 on the Desktop. That’s despite the fact that virtually everything else around these extensions has changed: tabbed browsing and Protected Mode were introduced in IE7, Loosely-Coupled IE was added in IE8, Hang Resistance was introduced in IE9, and IE10 introduced Enhanced Protected Mode and other major changes throughout Windows. Each of these architectural shifts would have broken the majority of binary extensions if not for a corresponding set of investments in compatibility features undertaken in each release of the browser.

Windows Vista’s introduction of the Integrity Level system was accompanied by the UAC Virtualization system, designed to help accommodate applications that expected to be running with Administrative privileges. If a 32-bit executable’s manifest lacks a requestedExecutionLevel element (e.g. iexplore.exe’s embedded manifest doesn’t have one), then UAC Virtualization will be applied for file and registry operations. Browser extensions running in Internet Explorer benefit from this virtualization, enabling legacy add-ons that expect to be able to read or write to protected locations to continue working. Virtualization works by redirecting write operations from read-only areas to a per-user “virtualized” location. For instance, attempting to write a file to the Desktop from Low Integrity would ordinarily fail, but virtualization permits the operation to succeed by writing the file to a hidden folder elsewhere in the file system. (IE’s Low Integrity virtualization uses a shim to redirect writes to %USERPROFILE%\AppData\Local\Microsoft\Windows\Temporary Internet Files\Virtualized\, while UAC virtualization writes to %USERPROFILE%\AppData\Local\VirtualStore).
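To make the redirection idea concrete, here is a minimal toy model of it in Python. It is not how Windows implements virtualization: the user name, the list of protected roots, and the `redirect_write` helper are all assumptions made up for illustration. It only shows the mapping from a protected path to a per-user location.

```python
from pathlib import PureWindowsPath

# Hypothetical locations, chosen only for illustration.
PROTECTED_ROOTS = [
    PureWindowsPath(r"C:\Program Files"),
    PureWindowsPath(r"C:\Windows"),
]
VIRTUAL_STORE = PureWindowsPath(r"C:\Users\alice\AppData\Local\VirtualStore")


def redirect_write(path: str) -> PureWindowsPath:
    """Return the path a virtualized write would land on in this toy model."""
    target = PureWindowsPath(path)
    for root in PROTECTED_ROOTS:
        try:
            relative = target.relative_to(root)
        except ValueError:
            continue  # target is not under this protected root
        # Keep the folder structure (minus the drive) under the per-user store.
        return VIRTUAL_STORE.joinpath(*root.parts[1:], *relative.parts)
    return target  # unprotected locations are written in place
```

In this model, a write aimed at `C:\Program Files\Acme\settings.ini` lands under the per-user store, while a write to the user's own Documents folder is left alone.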

However, virtualization alone isn’t enough to ensure compatibility. When tabbed browsing was introduced in IE7 and Hang Resistance was introduced in IE9, the behavior of windows and dialogs needed to be updated to be compatible with these features. For instance, when an extension in a background tab attempts to show a prompt, that prompt must be suppressed until the tab is activated (otherwise, a confusing experience would result). To accommodate that behavior, a system of shims and detours is used.
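The background-tab behavior can be pictured with a small Python sketch. The `Tab` class and its method names are hypothetical and say nothing about how IE is actually structured. The point is only that a wrapper standing in for the add-on's prompt call can hold the prompt until the tab comes to the foreground.

```python
class Tab:
    """Toy model of a tab that defers prompts while in the background."""

    def __init__(self, name, active=False):
        self.name = name
        self.active = active
        self.pending = []  # prompts held back while the tab is in the background
        self.shown = []    # prompts the user has actually seen

    def request_prompt(self, message):
        """Called in place of the add-on's direct prompt call."""
        if self.active:
            self.shown.append(message)
        else:
            self.pending.append(message)

    def activate(self):
        """Bring the tab to the foreground and release any held prompts."""
        self.active = True
        # Flush held prompts in the order they were requested.
        self.shown.extend(self.pending)
        self.pending.clear()
```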

These two technologies are similar:

  • Shims work by rewriting a module’s import address table at runtime to point to a different target function
  • MSR’s Detours work by rewriting the start of one or more target functions at runtime to point to a wrapper function
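The difference between the two approaches can be sketched with a toy model in Python. Nothing here touches real machine code or a real import address table: `NativeFunction`, `Module`, `apply_shim`, `apply_detour`, and `compat_wrapper` are hypothetical names invented for this sketch. A dictionary stands in for the import table and a swappable `body` attribute stands in for a function's first instructions.

```python
class NativeFunction:
    """Stands in for a function's machine code at a fixed address."""

    def __init__(self, body):
        self.body = body

    def __call__(self, *args):
        return self.body(*args)


class Module:
    """Stands in for a loaded DLL with its own import table."""

    def __init__(self, imports):
        self.imports = dict(imports)  # toy import address table

    def call(self, name, *args):
        return self.imports[name](*args)


def apply_shim(module, name, make_wrapper):
    """Shim: repoint one module's import entry at a wrapper."""
    original = module.imports[name]
    module.imports[name] = make_wrapper(original)


def apply_detour(function, make_wrapper):
    """Detour: rewrite the target itself, so every caller is affected."""
    original_body = function.body  # plays the role of the trampoline
    function.body = make_wrapper(original_body)


def compat_wrapper(original):
    return lambda title: "wrapped:" + original(title)


create_window = NativeFunction(lambda title: f"window:{title}")
addon = Module({"CreateWindow": create_window})
saved_pointer = create_window  # an address obtained outside the import table

apply_shim(addon, "CreateWindow", compat_wrapper)
via_table = addon.call("CreateWindow", "a")  # goes through the wrapper
via_pointer = saved_pointer("a")             # bypasses the shim entirely

apply_detour(create_window, compat_wrapper)
via_pointer_after_detour = saved_pointer("a")  # now wrapped as well
```

The takeaway from the model is that a shim only catches calls that go through the patched import entry, whereas a detour catches every caller of the target function, however that caller obtained the address.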

These technologies allow Internet Explorer to intercept calls to important functions (e.g. CreateProcess, CoCreateInstance, CreateWindow, etc.) and modify the behavior of those calls to improve compatibility with the restrictions and desired behaviors of the tab/content process in which HTML and add-ons run. For instance, the CreateProcess and CoCreateInstance APIs are wrapped such that the Protected Mode Elevation Policies can be applied. Similarly, CreateWindow is wrapped to accommodate the creation of new windows by background tabs, and to parent those windows to the correct window handle even though the window hierarchy was changed by the Hang Resistance feature.

In IE10, we’ve moved most functionality away from Detours to Shims for enhanced compatibility and because we’re shipping to a new platform (Windows RT) to which we otherwise would have had to port the IE version of Detours. In most cases, this was a seamless change, but we recently ran into one ancient toolbar that was impacted by the change.

The toolbar in question was a simple one that offered a standard search box, a few notification icons, and a short set of menus that would launch dialog boxes to configure the toolbar and show information about it. Our compatibility testing team noticed that in IE10, the dialog boxes from the toolbar would never come up. Debugging native code extensions without source or symbols is never fun, but I decided to take a look anyway. I ran the installer and verified that the dialog boxes didn’t come up. Knowing nothing about the technology (e.g. maybe the dialogs were written in HTML), I took a quick look at the installation folder. I got an idea of how old the code was when I saw that the installation folder contained unicows.dll, an ancient library designed to help enable compatibility with pre-Unicode versions of Windows (e.g. 95/98).

I next ran through the repro with IE10 running under the debugger and found that a nested function deep inside a call to CreateWindow() was returning Access Denied. I then ran the same repro in IE9 under the debugger and found that CreateWindow succeeded. The difference was visible in the stack traces: in IE9, the detoured compatibility wrappers were present, but in IE10 no wrappers appeared at all. Why weren’t the IE10 shims being applied?

I spent several hours pondering this question and aimlessly touring around in the debugger. I was whining about this scenario to a colleague, complaining about code so ancient that it was shipping with unicows.dll, when I realized that I’d never used this library myself, and in fact I’d never seen a toolbar use it before. While trying to explain to the colleague what it did, I decided that I should probably stop hand-waving, and pulled up unicows on Wikipedia. And bam, there it was, plain as day:

By adding the UNICOWS.LIB to the link command-line [ ... ] the linker will resolve referenced symbols with the one provided by UNICOWS.LIB instead. When a wide-character function is called for the first time at runtime, the function stub in UNICOWS.LIB first receives control and [ ... ] if the OS natively supports the W version (i.e. Windows NT/2000/XP/2003), then the function stub updates the in-memory import table so that future calls will directly invoke the native W version without any more overhead.

…and there’s the problem!

When IE first loads a toolbar, the shims run against the module and wrap all calls to CreateWindow with a call to the compatibility wrapper function. But when IE loaded this toolbar, it didn’t find any calls to CreateWindow, because those calls had been pointed at a function inside unicows.dll instead of at the original function in user32.dll. As a result, the compatibility shim wasn’t applied, and the function call failed.

Now, this wouldn’t have happened if unicows did its import-table fixup the “normal” way, using the GetProcAddress function. That’s because the compatibility shims are applied to GetProcAddress as well, so the wrapper would have been returned at the time that unicows updated the import table. However, for reasons that I thought were lost to the mists of time (see below), the implementers of unicows instead copied the source code of GetProcAddress from kernel32 into their own DLL, so the shims had no way to recognize it. While we could add a new shim to handle unicows.dll, the obscurity and low priority of this scenario mean that we instead decided to reach out to the vendor and request that they update their build process to remove the long-defunct support for Windows 9x.
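The whole failure can be replayed in a toy Python model. Everything in it is an assumption made for illustration: dictionaries stand in for export and import tables, and names such as `shimmed_get_proc_address`, `private_lookup`, and `load_toolbar` are hypothetical rather than anything found in IE or unicows. It shows why a load-time shim pass misses an import that points at a stub, and why a stub that resolves the native function through a shimmed lookup would have picked up the wrapper anyway.

```python
def native_create_window(title):
    return f"window:{title}"


def compat_create_window(title):
    # Hypothetical compatibility wrapper installed by the host.
    return "compat:" + native_create_window(title)


USER32 = {"CreateWindowW": native_create_window}          # toy export table
WRAPPERS = {native_create_window: compat_create_window}   # target -> wrapper


def shimmed_get_proc_address(exports, name):
    """A lookup the host has shimmed: returns the wrapper when one exists."""
    target = exports[name]
    return WRAPPERS.get(target, target)


def private_lookup(exports, name):
    """A private copy of the lookup logic, invisible to the host's shims."""
    return exports[name]


def apply_shims(import_table):
    """Load-time pass: wrap only entries that already point at a known target."""
    for name, target in list(import_table.items()):
        import_table[name] = WRAPPERS.get(target, target)


def make_stub(import_table, name, lookup):
    """On first call, resolve the native function and rewrite the import entry."""
    def stub(*args):
        resolved = lookup(USER32, name)
        import_table[name] = resolved  # later calls skip the stub
        return resolved(*args)
    return stub


def load_toolbar(lookup):
    imports = {}
    imports["CreateWindowW"] = make_stub(imports, "CreateWindowW", lookup)
    apply_shims(imports)               # sees only the stub, so nothing is wrapped
    imports["CreateWindowW"]("about")  # first call triggers the fixup
    return imports["CreateWindowW"]("about")
```

In this model, `load_toolbar(private_lookup)` ends up calling the unwrapped native function, while `load_toolbar(shimmed_get_proc_address)` ends up calling the compatibility wrapper.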

-Eric

Update: Over on his blog, Michael Kaplan provided a history of why unicows.dll works the way it does.

PS: This MSDN article is a great resource that explains the PE file format and how linking and delay loading features work.