Fixing Activation Context Pollution

As the number of apps in the world that use side-by-side activation (as a result of depending on the new Visual C++ Runtime v8.0) increases, providers of callable code (libraries, control packs, whatever) may start seeing odd and potentially unexpected behavior. Typically it's hard to diagnose. Somewhere deep inside your publicly exposed surface area, a LoadLibrary("gdiplus.dll") fails, or CreateWindow(ComCtlv6WindowClass) fails. You might have fallen victim to context pollution inadvertently caused by your caller. There is hope, however: you can control your own destiny even in the face of your callers' context.

The activation context stack is what it sounds like: a per-thread stack (on Windows Server 2003 it's per thread or per fiber) of contexts that have been activated. When the process is launched, an initial context may be created and pushed into the process as the process default context. When thread A creates thread B, the currently-active context for A will flow onto B as the top of B's thread context stack. Sometimes, this flow of contexts works in your favor. The windowing system captures the current stack top along with a SendMessage/PostMessage, then ensures that the context is activated before it invokes the target window procedure. Similarly, sending an APC or making a cross-apartment COM call both capture the stack top and push it onto the stack of whatever thread ends up servicing the request before calling whatever code is to be invoked. The important point is that the stack top at the target is the same as the stack top at the call site. When whatever target code was invoked returns, the top of the stack is popped so that the thread returns to the state it was in before the call was performed.
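To make the push and pop rules above concrete, here is a toy model in portable C++. It calls no Windows API at all: ToyContext, ToyContextStack, and FlowCall are made-up names for illustration, and treating the inherited context as the bottom entry of a thread's stack is a simplification, not a description of how the loader stores it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an activation context: only a name, for illustration.
struct ToyContext {
    std::string name;
};

// Toy model of one thread's activation context stack.
class ToyContextStack {
public:
    // A new thread starts with whatever context its creator had active.
    explicit ToyContextStack(const ToyContext* inherited) {
        if (inherited) frames_.push_back(inherited);
    }

    // Push a context; the returned cookie identifies the pushed frame.
    std::size_t Activate(const ToyContext* ctx) {
        frames_.push_back(ctx);
        return frames_.size();
    }

    // Pop the frame named by the cookie; it must be the current top.
    void Deactivate(std::size_t cookie) {
        assert(cookie == frames_.size() && "deactivation must match the top frame");
        frames_.pop_back();
    }

    // The context that lookups on this thread would consult right now.
    const ToyContext* Top() const {
        return frames_.empty() ? nullptr : frames_.back();
    }

private:
    std::vector<const ToyContext*> frames_;
};

// Models a flowed call (window message, APC, cross-apartment COM call):
// the caller's stack top is pushed on the target thread for the duration
// of the call and popped again when the call returns.
template <typename Fn>
void FlowCall(const ToyContextStack& caller, ToyContextStack& target, Fn fn) {
    const std::size_t cookie = target.Activate(caller.Top());
    fn(target);
    target.Deactivate(cookie);
}
```

In this model, code run through FlowCall sees the caller's top context while it executes, and the target stack is back to its earlier top afterwards. A plain function call goes through none of this and simply sees whatever Top() happens to return.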

This automatic flow of contexts ensures that the call executes as if it were running in the context that was active when the call was initiated. But it covers only a small subset of the possible ways code can be invoked, and it certainly doesn't help the usual case - the "call" instruction generated by the compiler. Unless you (the provider of a library) take special action, your library code will execute using the current context stack top - whether that's what you wanted or not!

Let's go back to our example to see what this might mean. Assume your library depends on some other side-by-side components that use registration-free COM activation - maybe RTC as an example. When control transfers into your function, the first thing you do is CoCreateInstance(CLSID_SomeRTCObject). Your tests work - you authored "testharness.exe.manifest" which contains a reference to Microsoft.Windows.Networking.RtcDll, so that the process default context created at process launch always contains component registration information for CLSID_SomeRTCObject. Your clients are writing a rendering engine on top of your functionality that uses GDI+, and complain that your function always fails with "class not registered." Doing a little debugging, it turns out that CoCreateInstance is returning that error.

Your library has just fallen prey to context pollution. The rendering client executable had its own "client.exe.manifest", but it only referred to the GDI+ assembly that it knew it needed. The context it has on the stack when it calls into your code does not have the RtcDll assembly in it, so it can't have any of the COM registration information present either. One immediate fix is to demand that client executables always reference the RtcDll assembly as well, so that the activation information will be present at the time of your CoCreateInstance call. Fortunately, the client balks at this - their app has worked forever, at least until they started using your communication DLL.
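The failure described above can be reduced to a few lines of portable C++. This is only a toy: ToyContext, ToyLibraryCall, HarnessContext, and ClientContext are hypothetical names, "GDI+" is a placeholder rather than a real assembly identity, and the sketch assumes the class has no registry entry to fall back on.

```cpp
#include <set>
#include <string>

// Toy activation context: the assemblies named by the manifest it came from.
using ToyContext = std::set<std::string>;

// Toy stand-in for the library entry point. It succeeds only when the
// context active at the time of the call carries the RtcDll assembly.
inline std::string ToyLibraryCall(const ToyContext& active) {
    const bool hasRtc = active.count("Microsoft.Windows.Networking.RtcDll") != 0;
    return hasRtc ? "ok" : "class not registered";
}

// Context built from testharness.exe.manifest, which references RtcDll.
inline ToyContext HarnessContext() {
    return {"Microsoft.Windows.Networking.RtcDll"};
}

// Context built from client.exe.manifest, which references only GDI+.
inline ToyContext ClientContext() {
    return {"GDI+"};
}
```

The library code is identical in both cases. Only the context the caller happened to have active differs, which is why the test harness never caught the problem.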

The way out of this mess is to be Master Of Your Own Destiny. Don't like the context you're handed? Create and activate another one! Creating an activation context is easy, if you've burned your manifest into your resource section (a topic for another day). Activating a context is easy. Deactivating a context is easy. The only "hard" part is remembering to activate & deactivate in the right places and at the right times.

Assume that you have a finely crafted manifest that refers to the correct set of components. You've gone through the right steps, and this manifest is now baked into your DLL's resource section with type RT_MANIFEST at resource ID 1000. To create a new activation context from this manifest, use CreateActCtx:

 HANDLE g_hMyActCtx = INVALID_HANDLE_VALUE;
 ACTCTX Request = { sizeof(Request) }; // cbSize must be set
 // Flag the optional fields we fill in so CreateActCtx reads them.
 Request.dwFlags = ACTCTX_FLAG_HMODULE_VALID | ACTCTX_FLAG_RESOURCE_NAME_VALID;
 Request.hModule = g_hMyModuleHandle; // the DLL carrying the manifest
 Request.lpResourceName = MAKEINTRESOURCE(1000); // manifest resource ID
 g_hMyActCtx = CreateActCtx(&Request);

On success, g_hMyActCtx will be something other than INVALID_HANDLE_VALUE. You can now use this with ActivateActCtx and DeactivateActCtx:

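 HRESULT MyPublicFunction(...) {
   ULONG_PTR ulpCookie = 0;
   // Push our own context; bail out if activation fails.
   if (!ActivateActCtx(g_hMyActCtx, &ulpCookie))
     return HRESULT_FROM_WIN32(GetLastError());
   // Binds against our context, not the caller's. Remaining arguments elided.
   HRESULT hr = CoCreateInstance(CLSID_SomeRTCObject, ...);
   // Pop our context so the caller gets back the one it had.
   DeactivateActCtx(0, ulpCookie);
   return hr;
 }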

Now, when CoCreateInstance looks at the active context, it'll see the one you activated rather than the one your client had activated. Pollution solved - you now control your own destiny. Go nuts - have different manifests for different top-level entrypoints. Have one mondo context that contains everything you'll need. No matter what context your caller might have stuck you with, you have complete control over the context you decide to bind code with.

Next time - making your life easier with C++ RAII around activation and deactivation, "racy initialization" of the activation context, and more.