Using metadata interfaces from managed code

The metadata APIs are unmanaged COM-classic interfaces declared in Cor.h in the SDK (look for IMetaData*). In this blog entry, I’ll wander over some random trivia about using the metadata interfaces from managed code. We run into this on the debugging side because the debugging interfaces are intimately tied to the metadata interfaces, so you hit it as soon as you write a debugger in managed code.


Creating the managed wrappers:

The first instinct is to use tlbimp to automatically generate managed wrappers. This is what we did to get the managed wrappers for ICorDebug, which is also an unmanaged COM-classic interface. However, there’s no idl file for the metadata interfaces, and no tlb to use. That aside, you can always write the IL wrappers by hand (this is what we did for MDbg).
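To make "by hand" concrete, here is a minimal sketch of what such a wrapper can look like in C#. It is not the MDbg source. The GUIDs, the method order, and the `ofRead` value of 0 are transcribed from my reading of cor.h and should be checked against the SDK header before you rely on them. `MetadataSketch` and `OpenImporter` are hypothetical names, and the code assumes a Windows machine where the CLR's metadata dispenser coclass is registered.

```csharp
using System;
using System.Runtime.InteropServices;

// Hand-written managed declaration of the unmanaged IMetaDataDispenser interface.
// Methods must be declared in the same order as the native vtable.
[ComImport]
[Guid("809C652E-7396-11D2-9771-00A0C9B4D50C")] // IID_IMetaDataDispenser (assumed; verify in cor.h)
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IMetaDataDispenser
{
    // HRESULT DefineScope(REFCLSID, DWORD, REFIID, IUnknown**)
    void DefineScope(ref Guid rclsid, uint dwCreateFlags, ref Guid riid,
                     [MarshalAs(UnmanagedType.IUnknown)] out object ppIUnk);

    // HRESULT OpenScope(LPCWSTR, DWORD, REFIID, IUnknown**)
    void OpenScope([MarshalAs(UnmanagedType.LPWStr)] string szScope, uint dwOpenFlags,
                   ref Guid riid, [MarshalAs(UnmanagedType.IUnknown)] out object ppIUnk);

    // OpenScopeOnMemory occupies the last vtable slot and is left out of this sketch.
}

public static class MetadataSketch
{
    // CLSID_CorMetaDataDispenser and IID_IMetaDataImport (assumed; verify in cor.h).
    public static readonly Guid DispenserClsid = new Guid("E5CB7A31-7512-11D2-89CE-0080C792E5D8");
    public static readonly Guid ImportIid = new Guid("7DAC8207-D3AE-4C75-9B67-92801A497D44");

    // Opens a module read-only and returns the raw importer COM object.
    // A fuller wrapper would cast the result to a hand-written IMetaDataImport.
    public static object OpenImporter(string modulePath)
    {
        Type dispenserType = Type.GetTypeFromCLSID(DispenserClsid);
        IMetaDataDispenser dispenser = (IMetaDataDispenser)Activator.CreateInstance(dispenserType);
        Guid iid = ImportIid;
        object importer;
        dispenser.OpenScope(modulePath, 0 /* ofRead */, ref iid, out importer);
        return importer;
    }
}
```

The painful part of doing this for IMetaDataImport itself is that the interface is large and every method needs its marshaling spelled out correctly, which is where the "random marshaling problems" come from.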


The versioning problem:

However, even once you have a set of managed wrappers that overcomes the random marshaling problems, you’re hit with a strange versioning problem: a managed app can only load the version of the metadata implementation that matches the runtime the app itself is running against. In other words, you can’t create a managed application that runs against the v1.1 CLR but uses the CLR’s v2.0 metadata coclasses.


This stems from some seemingly innocent facts:

1) The CLR only allows one version of the runtime (e.g., mscorwks) in a single process.

2) Your managed app has already bound to some version of the runtime.

3) The CLR’s metadata implementation (e.g., the coclasses specified in cor.h) is also implemented in mscorwks. Contrast this with the debugging interfaces, which are implemented in a separate dll from the rest of the runtime (mscordbi).


Each of these by itself is harmless. But combine them and it means that the metadata implementation and the app’s current runtime compete over a single version of mscorwks to load. If an app runs against the v1.1 CLR, then it must have loaded the v1.1 mscorwks. Using the v2.0 metadata interfaces would require it to also load the v2.0 mscorwks. But we can only load a single version of mscorwks.
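If you want to see which runtime your own process has already bound to (fact 2 above), a small sketch like the following will tell you. `BoundRuntime` and `Describe` are hypothetical names; the two framework calls it uses, `Environment.Version` and `RuntimeEnvironment.GetRuntimeDirectory()`, are standard.

```csharp
using System;
using System.Runtime.InteropServices;

public static class BoundRuntime
{
    // Describes the runtime this process is already running on.
    // Whatever this reports is also the only metadata implementation
    // the process can get from the CLR's coclasses.
    public static string Describe()
    {
        return "version " + Environment.Version +
               ", loaded from " + RuntimeEnvironment.GetRuntimeDirectory();
    }
}
```

Calling `BoundRuntime.Describe()` from a tool built against v1.1 reports the v1.1 runtime and its install directory, which is exactly the mscorwks that the metadata coclasses would have to share.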


You can avoid this by implementing the metadata interfaces yourself and thus avoiding the coclasses that depend on mscorwks. That’s a ton of work, and it’s usually not what people have in mind when they talk about calling the metadata interfaces from managed code. It also won’t help other applications that are still bound to the CLR’s coclasses.


Assuming you use the CLR’s metadata coclasses, this means that managed tools that inspect v2.0 metadata must themselves run on v2.0. So you couldn’t write an MDbg that runs on the v1.1 CLR but debugs v2.0 apps. Debugging Vx apps requires loading Vx metadata. This is one reason that MDbg requires the v2.0 runtime to even run.



What about reflection?

I can’t talk about managed metadata interfaces in good conscience without at least mentioning reflection. Perhaps you can use reflection instead of the metadata interfaces? Chris King (a CLR Reflection guru developer) points out that there are definitely scenarios where this won’t work. For example, reflection needs to resolve assembly binding, whereas the metadata interfaces don’t.

Regardless, it’s worth pointing out that the two have closely related functionality. Reflection has several advantages for a managed client:

- The reflection API is managed (check out System.Type), and exposes the same kind of information as metadata.

- Reflection is already integrated into the runtime type systems of many .NET languages. For example, C# has the “typeof” keyword to get a System.Type instance.

- In the v2.0 CLR, there’s an “inspection only” mode of reflection (the reflection-only load context) that lets you load and inspect the types in a module without actually loading the module for execution. This allows you to inspect modules across platforms.


In fact, Chris’s implementation of the CLR’s Reflection in v2.0 makes extensive use of private versions of a wrapper around the metadata interfaces.
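For comparison with the COM route, here is a minimal sketch of the reflection side. `ReflectionSketch` and its method names are hypothetical. `Assembly.ReflectionOnlyLoadFrom` is the v2.0 inspection-only entry point on the .NET Framework; newer .NET Core-based runtimes don’t support it, so treat the second method as Framework-only.

```csharp
using System;
using System.Reflection;

public static class ReflectionSketch
{
    // Lists the public method names of a type through the managed reflection API.
    public static string[] PublicMethodNames(Type t)
    {
        MethodInfo[] methods = t.GetMethods(
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.Static);
        string[] names = new string[methods.Length];
        for (int i = 0; i < methods.Length; i++)
        {
            names[i] = methods[i].Name;
        }
        return names;
    }

    // Loads a module for inspection only, without making its code executable.
    // GetTypes() can still fail if the module's dependencies cannot be resolved,
    // which is the assembly-binding requirement mentioned above.
    public static Type[] InspectTypes(string path)
    {
        Assembly asm = Assembly.ReflectionOnlyLoadFrom(path);
        return asm.GetTypes();
    }
}
```

For example, `ReflectionSketch.PublicMethodNames(typeof(string))` uses the C# typeof keyword to get a System.Type and then walks its methods, with no COM interop or hand-written wrappers involved.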


How does MDbg do it?

MDbg handles the managed metadata issue by:

- having a small hand-baked set of wrappers for the COM-classic interfaces (see IMetaDataImport.cs in the MDbg beta 1 sample sources).

- using these to implement our own versions of the reflection objects that derive from the reflection interfaces (see CorMetadata.cs in the MDbg beta 1 sample sources).

This is functionally very similar to what we would have gotten from the inspection-only functionality added in v2.0. (Inspection-only loading didn’t exist at the time we created MDbg, so it was a moot point to ponder.) Chris strongly recommends against the pattern of deriving your own reflection objects, and wishes the objects were sealed. He explains that the original motivation for allowing System.Type and friends to be polymorphic was Reflection.Emit, and that they didn’t plan on users deriving their own. In retrospect, this part of MDbg was probably not our best idea.


Comments (7)

  1. ? says:

    Why can’t the clr load two versions of mscorwks.dll side by side in the same dll?

  2. Dan Golick says:

    Does this mean I can port MDbg to 1.1 given that I don’t want to debug 2.0 apps?

  3. David  Srbecky asked:

    Can an EnC-capable compiler work on top of System.Reflection.Emit? (i.e., If…
