Read IL from MethodBody

Reflection in .NET 2.0 ships with a new class, MethodBody, which "provides access
to information about the local variables and exception-handling clauses in a method
body, and to the Microsoft intermediate language (MSIL) that makes up the method
body". (Thanks to Glenn, who wrote this in the MSDN doc; I simply copied and pasted
it.) The only method in this class (the other members are properties) is:

public byte[] GetILAsByteArray()

which returns a byte array containing the IL for this method body.
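As a quick illustration of that call, here is a minimal sketch that fetches the IL bytes of a method and prints them as hex. The ILDump class and its Add method are hypothetical names made up for this example; the exact bytes printed depend on how the compiler was invoked (debug or release), so no output is shown.

```csharp
using System;
using System.Reflection;

public static class ILDump
{
    // Returns the raw IL of a method, or an empty array when the method
    // has no IL body (GetMethodBody returns null in that case).
    public static byte[] GetIL(MethodBase method)
    {
        MethodBody body = method.GetMethodBody();
        return body == null ? new byte[0] : body.GetILAsByteArray();
    }

    // A trivial method whose IL we can look at.
    public static int Add(int a, int b) { return a + b; }

    public static void Main()
    {
        byte[] il = GetIL(typeof(ILDump).GetMethod("Add"));
        // Prints the bytes as hex pairs separated by dashes.
        Console.WriteLine(BitConverter.ToString(il));
    }
}
```

Whatever the compiler settings, the last byte should be 0x2A, the encoding of the ret instruction.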

Recently we got some questions about reflecting over the IL stream: users want to check
where a specific method is used, or try to build a call graph inside an assembly, and
so on. Reflection currently does not support this directly. Lutz Roeder's awesome tool,
.NET Reflector, can indeed show the caller/callee graph for each method. I used to have
the source code for his ILReader, and noticed it did not use Reflection APIs (so we can
certainly read IL without Reflection APIs).

The code sketch below shows how we can use classes in the Reflection and Reflection.Emit
namespaces (and other useful APIs provided in .NET 2.0) to read IL instructions.
The ECMA-335 standard is the authoritative resource for understanding IL, the method
body format and other CLI topics, if you want to know more about them.

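using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public class ILReader : IEnumerable<ILInstruction>
{
   Byte[] m_byteArray;
   Int32 m_position;
   MethodBase m_enclosingMethod;

   // Lookup tables from an opcode byte to its OpCode: one for single-byte
   // opcodes, one for two-byte opcodes (those prefixed with 0xFE).
   static OpCode[] s_OneByteOpCodes = new OpCode[0x100];
   static OpCode[] s_TwoByteOpCodes = new OpCode[0x100];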

   static ILReader()
   {
     foreach (FieldInfo fi in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
     {
       OpCode opCode = (OpCode)fi.GetValue(null);
       UInt16 value = (UInt16)opCode.Value;
       if (value < 0x100)
         s_OneByteOpCodes[value] = opCode;
       else if ((value & 0xff00) == 0xfe00)
         s_TwoByteOpCodes[value & 0xff] = opCode;
     }
   }

   public ILReader(MethodBase enclosingMethod)
   {
     this.m_enclosingMethod = enclosingMethod;
     MethodBody methodBody = m_enclosingMethod.GetMethodBody();
     this.m_byteArray = (methodBody == null) ? new Byte[0] : methodBody.GetILAsByteArray();
     this.m_position = 0;
   }

   public IEnumerator<ILInstruction> GetEnumerator()
   {
     while (m_position < m_byteArray.Length)
       yield return Next();

     m_position = 0;
     yield break;
   }

   System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() { return this.GetEnumerator(); }

   ILInstruction Next()
   {
     Int32 offset = m_position;
     OpCode opCode = OpCodes.Nop;
     Int32 token = 0;

     // read first 1 or 2 bytes as opCode
     Byte code = ReadByte();
     if (code != 0xFE)
       opCode = s_OneByteOpCodes[code];
     else
     {
       code = ReadByte();
       opCode = s_TwoByteOpCodes[code];
     }

     switch (opCode.OperandType)
     {
       case OperandType.InlineNone:
         return new InlineNoneInstruction(m_enclosingMethod, offset, opCode);

       case OperandType.ShortInlineBrTarget:
         SByte shortDelta = ReadSByte();
         return new ShortInlineBrTargetInstruction(m_enclosingMethod, offset, opCode, shortDelta);

       case OperandType.InlineBrTarget: Int32 delta = ReadInt32();      return new ...;
       case OperandType.ShortInlineI:   Byte int8 = ReadByte();         return new ...;
       case OperandType.InlineI:        Int32 int32 = ReadInt32();      return new ...;
       case OperandType.InlineI8:       Int64 int64 = ReadInt64();      return new ...;
       case OperandType.ShortInlineR:   Single float32 = ReadSingle();  return new ...;
       case OperandType.InlineR:        Double float64 = ReadDouble();  return new ...;
       case OperandType.ShortInlineVar: Byte index8 = ReadByte();       return new ...;
       case OperandType.InlineVar:      UInt16 index16 = ReadUInt16();  return new ...;
       case OperandType.InlineString:   token = ReadInt32();            return new ...;
       case OperandType.InlineSig:      token = ReadInt32();            return new ...;
       case OperandType.InlineField:    token = ReadInt32();            return new ...;
       case OperandType.InlineType:     token = ReadInt32();            return new ...;
       case OperandType.InlineTok:      token = ReadInt32();            return new ...;

       case OperandType.InlineMethod:
         token = ReadInt32();
         return new InlineMethodInstruction(m_enclosingMethod, offset, opCode, token);

       case OperandType.InlineSwitch:
         Int32 cases = ReadInt32();
         Int32[] deltas = new Int32[cases];
         for (Int32 i = 0; i < cases; i++) deltas[i] = ReadInt32();
         return new InlineSwitchInstruction(m_enclosingMethod, offset, opCode, deltas);

       default:
         throw new BadImageFormatException("unexpected OperandType " + opCode.OperandType);
     }
   }

   Byte ReadByte() { return (Byte)m_byteArray[m_position++]; }
   SByte ReadSByte() { return (SByte)ReadByte(); }

   UInt16 ReadUInt16() { m_position += 2; return BitConverter.ToUInt16(m_byteArray, m_position - 2); }
   UInt32 ReadUInt32() { m_position += 4; return BitConverter.ToUInt32(m_byteArray, m_position - 4); }
   UInt64 ReadUInt64() { m_position += 8; return BitConverter.ToUInt64(m_byteArray, m_position - 8); }

   Int32 ReadInt32() { m_position += 4; return BitConverter.ToInt32(m_byteArray, m_position - 4); }
   Int64 ReadInt64() { m_position += 8; return BitConverter.ToInt64(m_byteArray, m_position - 8); }

   Single ReadSingle() { m_position += 4; return BitConverter.ToSingle(m_byteArray, m_position - 4); }
   Double ReadDouble() { m_position += 8; return BitConverter.ToDouble(m_byteArray, m_position - 8); }
}

A few comments here:

  • Definitions for ILInstruction and the derived XXXInstruction classes are not included
    here. Imagine that ILInstruction knows its Offset, OpCode, ...; the derived instruction
    classes may contain other information. If only OpCodes.Call or OpCodes.Callvirt (which
    belong to the OperandType.InlineMethod case) were interesting to me, I could define a
    special CallInstruction or CallVirtInstruction class and create those explicitly.
    Module.ResolveMethod can then help to get the MethodBase from the operand (token).
    Yiru's blog has some posts related to this topic.
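  • The static constructor of ILReader uses Reflection to initialize 2 static OpCode
    arrays from the public static fields of the OpCodes class. Note that OpCodes is a
    class with one static OpCode field per instruction, not an enum, so Enum.GetValues
    (which I mentioned here) cannot enumerate it.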
  • The public constructor accepts a MethodBase, gets its MethodBody, and then calls
    GetILAsByteArray() to set m_byteArray. Note that GetMethodBody returns null for
    methods without an IL body, such as interface and abstract methods; the reader
    then works on an empty byte array.
  • The Next() method takes advantage of OpCode.OperandType to decide what kind of operand
    is expected and how many bytes should be read.
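  • BitConverter.ToXXX can read a number of bytes starting at a given index in the array
    and convert them to a primitive value. Very handy! (It uses the machine's byte order,
    which matches the little-endian IL stream on little-endian platforms.)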
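To make the first comment more concrete, here is a minimal sketch of what ILInstruction and InlineMethodInstruction might look like. These definitions are my assumption, not the original ones: only the constructor shape (enclosing method, offset, opcode, token) is taken from the ILReader code above, and the property names, the IsCall helper and the Sample class are hypothetical. The simple Module.ResolveMethod(Int32) overload is used, which assumes the enclosing method is not in a generic context.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical base class: every instruction knows where it lives and what it is.
public abstract class ILInstruction
{
    readonly MethodBase m_enclosingMethod;
    readonly int m_offset;
    readonly OpCode m_opCode;

    protected ILInstruction(MethodBase enclosingMethod, int offset, OpCode opCode)
    {
        m_enclosingMethod = enclosingMethod;
        m_offset = offset;
        m_opCode = opCode;
    }

    public MethodBase EnclosingMethod { get { return m_enclosingMethod; } }
    public int Offset { get { return m_offset; } }
    public OpCode OpCode { get { return m_opCode; } }
}

// Hypothetical instruction whose operand is a method token (call, callvirt, newobj, ...).
public class InlineMethodInstruction : ILInstruction
{
    readonly int m_token;
    MethodBase m_method;

    public InlineMethodInstruction(MethodBase enclosingMethod, int offset, OpCode opCode, int token)
        : base(enclosingMethod, offset, opCode)
    {
        m_token = token;
    }

    public int Token { get { return m_token; } }

    // Resolves the token lazily, in the module that contains the enclosing method.
    // Assumption: no generic context; otherwise the overload taking generic
    // type and method arguments would be needed.
    public MethodBase Method
    {
        get
        {
            if (m_method == null)
                m_method = EnclosingMethod.Module.ResolveMethod(m_token);
            return m_method;
        }
    }

    public bool IsCall
    {
        get { return OpCode == OpCodes.Call || OpCode == OpCodes.Callvirt; }
    }
}

public static class Sample
{
    public static void Callee() { }
    public static void Caller() { Callee(); }

    public static void Main()
    {
        MethodBase caller = typeof(Sample).GetMethod("Caller");
        int token = typeof(Sample).GetMethod("Callee").MetadataToken;
        // The offset 0 is illustrative only; a real reader would pass the actual offset.
        InlineMethodInstruction call = new InlineMethodInstruction(caller, 0, OpCodes.Call, token);
        Console.WriteLine(call.Method.Name);
    }
}
```

Resolving the token lazily keeps the reader itself cheap; the cost of ResolveMethod is only paid for instructions someone actually inspects.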

Then I can consume the ILInstruction sequence in C# as follows:

foreach (ILInstruction il in new ILReader(method))
{
    /* do something with il */
}
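Coming back to the question that started this post (which methods does a given method call?), the following self-contained sketch walks the IL bytes directly and resolves every InlineMethod operand. It is a compact variant of the same technique rather than the ILReader above: CallSiteScanner, OperandSize, GetCallees and Demo are hypothetical names, and instead of building instruction objects it only skips operands by size. Note that InlineMethod also covers newobj, ldftn and jmp, so the result is slightly broader than call and callvirt alone.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class CallSiteScanner
{
    static readonly OpCode[] s_oneByte = new OpCode[0x100];
    static readonly OpCode[] s_twoByte = new OpCode[0x100];

    static CallSiteScanner()
    {
        // Same table-building trick as in ILReader: one static field per opcode.
        foreach (FieldInfo fi in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            OpCode op = (OpCode)fi.GetValue(null);
            ushort value = unchecked((ushort)op.Value);
            if (value < 0x100) s_oneByte[value] = op;
            else if ((value & 0xff00) == 0xfe00) s_twoByte[value & 0xff] = op;
        }
    }

    // Size in bytes of a fixed-length operand; InlineSwitch is handled by the caller.
    static int OperandSize(OperandType t)
    {
        switch (t)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            default: return 4; // branch targets, int32, float32 and metadata tokens
        }
    }

    public static List<MethodBase> GetCallees(MethodBase method)
    {
        List<MethodBase> callees = new List<MethodBase>();
        MethodBody body = method.GetMethodBody();
        if (body == null) return callees; // no IL body (abstract, interface, ...)

        // Generic context for token resolution; null means "not generic".
        Type declaring = method.DeclaringType;
        Type[] typeArgs = (declaring != null && declaring.IsGenericType) ? declaring.GetGenericArguments() : null;
        Type[] methodArgs = method.IsGenericMethod ? method.GetGenericArguments() : null;

        byte[] il = body.GetILAsByteArray();
        int pos = 0;
        while (pos < il.Length)
        {
            byte b = il[pos++];
            OpCode op = (b != 0xFE) ? s_oneByte[b] : s_twoByte[il[pos++]];

            if (op.OperandType == OperandType.InlineSwitch)
            {
                // A count followed by that many 4-byte jump offsets.
                int cases = BitConverter.ToInt32(il, pos);
                pos += 4 + 4 * cases;
                continue;
            }
            if (op.OperandType == OperandType.InlineMethod)
            {
                int token = BitConverter.ToInt32(il, pos);
                callees.Add(method.Module.ResolveMethod(token, typeArgs, methodArgs));
            }
            pos += OperandSize(op.OperandType);
        }
        return callees;
    }
}

public static class Demo
{
    public static int Callee() { return 42; }
    public static int Caller() { return Callee() + 1; }

    public static void Main()
    {
        foreach (MethodBase m in CallSiteScanner.GetCallees(typeof(Demo).GetMethod("Caller")))
            Console.WriteLine(m.DeclaringType + "::" + m.Name);
    }
}
```

Running GetCallees over every method of every type in an assembly and inverting the result would give the "where is this method used" view; that part is left out to keep the sketch short.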