Describing Types in .NET

There are several different APIs for handling Types in .NET. 

For each category I want to call out:
Managed/Unmanaged – is the API managed or unmanaged?
Audience – who uses this API?
Binding – A type is defined within a module. So how does it refer to types in different modules or types that haven’t yet been loaded?
Unloading – can you unload the modules you’re inspecting types in? This is closely related to the isolation between the host doing the inspection and the target being inspected.
Generating code – can you use these inspection APIs to generate new code?

Different APIs

  1. Metadata: IMetaDataImport and tokens (mdTypeDef).  
    This is the fundamental API for types in .NET. MSDN has an overview here, and online technical specs are available here.
    Unmanaged COM-classic API: This is a low-level unmanaged API. You can use COM-interop to import it into managed code, but there are some problems using the metadata APIs from managed code.
    Audience: This is what unmanaged tools such as ILDasm use, and it forms the basis for describing types in the other unmanaged tool APIs described below. This is pretty much a rocket science API.
    Binding: Metadata scopes are explicitly single files and do not resolve across files. Binding is determined at runtime, and since the metadata APIs inspect raw files rather than executing code, they can’t know for certain how binding across modules will work.
    Unloading: Metadata allows inspecting executables without actually executing or binding them, so you can unload the files you’re inspecting. You can verify this by opening a file in ILDasm and noticing that the managed dll you’re inspecting is not loaded into the ILDasm process.
    Generating code: Metadata supports the IMetaDataEmit interfaces for generating new code.
  2. Reflection: System.Type.
    Managed: Reflection is a managed API, mostly contained in the System.Reflection namespace. Unmanaged tools can’t use this (unless they call into managed code, of course).
    Audience: Reflection is nicely integrated with languages (e.g., C#’s typeof operator returns a System.Type object) and is easy to program against. It’s good for a program inspecting itself. Most tools written in managed code would use this.
    Binding: Since Reflection inspects its own process, it knows how the types in other modules could be loaded. Reflection could also eagerly trigger assembly loads to answer binding questions. Ultimately, inspecting via reflection taints the process with the target module being inspected.
    Unloading: Currently, the only granularity for unloading managed code is AppDomain unload. Since Reflection actually loads the managed code, it is subject to this restriction, and you must unload the AppDomain. This is unfortunately true even for code loaded into the reflection-only context.
    Generating code: Reflection is very closely tied to the System.Reflection.Emit namespace for generating new code. See an example of emit here.
  3. Debugging: ICorDebugType.
    ICorDebug is the debugging API for managed code.
    Unmanaged COM-classic API: This is an unmanaged COM-classic API, designed as an extension to the metadata APIs, although MDbg provides a set of managed wrappers that make it easy to consume from managed code. In .NET 2.0, ICorDebug represents types in the debuggee as ICorDebugType. Prior to .NET 2.0, it used ICorDebugClass, which could not represent generics.
    Audience: Debugger authors. This is used to inspect types in the debuggee process and is completely isolated from the debugger process.
    Unloading:  Debuggers and debuggees are separate processes, so they’re well isolated. 
    Generating code: The debugging APIs are for debugging existing code, and not targeted at writing new code. As trivia, debugging APIs expose Edit-and-Continue which cooperates extensively with the Metadata APIs to allow debuggers to add and change code on the fly. However, this is not a recommended technique for code generation.
  4. Profiling:  ClassId
    The ICorProfiler APIs support loading profilers into a CLR process. Profilers are very intimate with the runtime. See comparison of profiler vs. debugging for more details.
    Unmanaged COM API: Profilers are unmanaged components that get loaded into the same process as the CLR and receive callbacks directly from the runtime. A profiler is tightly coupled to the runtime and thus must be unmanaged code.
    Audience: Extreme rocket scientists writing profilers.
    Binding:  Like debugging, profiling simply describes the current bindings in the process and does not actually force any binding to occur.
    Unloading: As of .NET 2.0, neither the runtime nor profilers can be unloaded until the process actually exits. The managed code that the profiler inspects, however, can still be unloaded via AppDomain unloading.
    Generating code: The profiling API supports rejit. See Dave Broman’s blog for more details about writing an IL-rewriting profiler.
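
To make the Reflection entry (item 2) a little more concrete, here is a minimal C# sketch of a program inspecting a type in its own process. The class and method names (ReflectionSketch, Describe, IsLoadedInCurrentDomain) are made up for illustration; only the System.Type, System.Reflection and AppDomain members they call are real. It is meant to show the "taint" described above: the assembly that defines the inspected type ends up loaded in the inspecting AppDomain.

```csharp
using System;
using System.Reflection;

static class ReflectionSketch
{
    // Hypothetical helper: summarizes where a type lives.
    public static string Describe(Type t)
    {
        return t.FullName
            + " is defined in module " + t.Module.Name
            + " of assembly " + t.Assembly.GetName().Name;
    }

    // Hypothetical helper: checks whether an assembly is loaded
    // in the AppDomain that is doing the inspection.
    public static bool IsLoadedInCurrentDomain(Assembly a)
    {
        Assembly[] loaded = AppDomain.CurrentDomain.GetAssemblies();
        return Array.IndexOf(loaded, a) >= 0;
    }

    public static void Main()
    {
        // typeof yields a System.Type for a type known at compile time.
        Type t = typeof(Uri);
        Console.WriteLine(Describe(t));

        // Inspecting the type via Reflection means its assembly is
        // loaded into this process, and stays loaded with the AppDomain.
        Console.WriteLine(IsLoadedInCurrentDomain(t.Assembly));
    }
}
```

The module and assembly names printed for a given type depend on the runtime version, so treat the output as informational rather than fixed.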

Conversions between APIs:
Metadata is the fundamental type description in .NET, so converting between the type APIs ultimately needs to go through metadata. The non-metadata APIs have methods to convert to and from metadata. For example, Reflection exposes the tokens via the MemberInfo.MetadataToken property.
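
As a rough sketch of that conversion on the Reflection side, the C# below goes from a System.Type to its metadata token and back again. TokenRoundTrip and RoundTrip are hypothetical names used only for this example; MemberInfo.MetadataToken, Type.Module and Module.ResolveType are the actual Reflection members. It assumes a simple non-generic type, and it resolves the token against the type's own module because a token is only meaningful within the metadata scope that defines it.

```csharp
using System;
using System.Reflection;

static class TokenRoundTrip
{
    // Hypothetical helper: converts a Type to its metadata token and
    // resolves that token back to a Type within the same module.
    public static Type RoundTrip(Type t)
    {
        int token = t.MetadataToken;   // the type's token within its module
        Module scope = t.Module;       // tokens are scoped to a single module
        return scope.ResolveType(token);
    }

    public static void Main()
    {
        Type t = typeof(string);

        // Print the token in hex; the high byte identifies the token kind.
        Console.WriteLine("{0}: token 0x{1:X8}", t.FullName, t.MetadataToken);

        // Resolving the token in the defining module should give back the same type.
        Console.WriteLine(RoundTrip(t) == t);
    }
}
```

Resolving the same token against a different module would name an unrelated type or fail, which is the single-file scoping described under Binding for the metadata APIs.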

Comments (2)

  1. Many of the .NET docs use the phrase "TypeDef or TypeRef". What’s the difference? Both refer to metadata