Tool to allow inline IL in C# / VB.Net

C# doesn’t support inline IL. As an experiment, I wrote a post-compiler tool that allows primitive IL inlining for C# / VB.Net (or any .NET language).

(My real goal was to try out FxCop, and I needed a pet project to try it on, but that’s another story.)

 

Examples

Here are some examples of inputs that it can compile. It also handles the pdbs properly, so you can even debug the IL snippets alongside the rest of the source code.

 

1) A trivial example of inline IL in C#.

using System;

class Program

{

    static void Main()

    {

        int x = 3;

        int y = 4;

        int z = 5;

        // Here's some inline IL; “x=x+y+z”

#if IL

        ldloc x

        ldloc y

        add

        ldloc z

        add

        stloc x

#endif

        Console.WriteLine(x);

    }

}

 

 

2) A similarly trivial example of inline IL in VB.Net

' VB demo with inline IL

module m1

sub Main()

        Console.WriteLine("Hi!")

        Dim x As Integer

        x = 3

#If IL Then

        // Here's some inline IL

        ldloc x

        dup

        add

        stloc x

#End If

        Console.WriteLine(x)

    End Sub

end module

 

 

3) An example of using inline IL to inject a filter in C#. This is more interesting because C# doesn’t support filters, so we’re actually adding new functionality here.

using System;

class Foo

{

    static void ThrowMe()

    {

        throw new ArgumentException();

    }

    static void Main()

    {

        string x;

        object ex = null;

#if IL

        // declare a new local.

        .locals init (int32 ghi)

     .try

        {

#endif

            x = "a";

            ThrowMe();

#if IL

            leave.s IL_ExitTryCatch

        } // end try block

        filter

        {

            // Exception object is on the stack.

            stloc ex

#endif

            Console.WriteLine("Inside Filter. Object=" + ex);

            x += "b";

#if IL

            ldc.i4.1 // true - execute handler

            endfilter

        } // end filter

        { // begin handler

#endif

            Console.WriteLine("Yow! In handler now!");

            x += "c";

#if IL

            leave.s IL_ExitTryCatch

        } // end handler

        IL_ExitTryCatch: nop

#endif

        Console.WriteLine("Back in C#");

        Console.WriteLine(x);

    } // end Main

} // end Class Foo

 

 

 

How does it work?

This IL inliner is really just a parlor trick. It’s a post-compile tool that round-trips the compiled output through ildasm / ilasm and injects the new IL in the middle.

Specifically

1) It invokes the high-level compiler (such as csc for C# or vbc for VB.Net) as a separate process to compile the original source. E.g.:

csc %input_file% /debug+ /out:%temp_output%

You’ll notice the IL snippets are conveniently wrapped in an #if block so that the compiler doesn’t get confused by them. This also introduces a key limitation which I’ll discuss below.

2) Run ildasm on the compiled output. E.g.:

ildasm %temp_output% /linenum /out=%temp_il_output%

3) Find the inline IL snippets in the original source. It turns out that ilasm is smart enough (for example, it resolves local variable names itself) that this is mainly just text processing. You could do it with a perl script.

4) Inject these snippets back into the ildasm output from step #2. Be sure to add ‘.line’ directives so that the pdb maps the snippets back to the original source.

5) Run ilasm on the newly merged file. E.g.:

ilasm %temp_il_output% /output=%output_file% /optimize /debug
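To make step 3 concrete, here is a minimal sketch of the snippet extraction in Python (not the C# the actual tool is written in). It assumes snippets are delimited exactly as in the examples above ("#if IL" / "#endif" in C#, "#If IL Then" / "#End If" in VB.Net) and that such blocks are never nested. The function name extract_il_snippets is made up for illustration.

```python
import re

# Assumed markers, taken from the examples above:
#   C#:     "#if IL" ... "#endif"
#   VB.Net: "#If IL Then" ... "#End If"
START = re.compile(r"^\s*#if\s+IL(\s+then)?\s*$", re.IGNORECASE)
END = re.compile(r"^\s*#end\s*if\s*$", re.IGNORECASE)


def extract_il_snippets(source_text):
    """Return a list of snippets; each snippet is a list of
    (1-based source line number, stripped text) pairs."""
    snippets = []
    current = None  # lines of the snippet being collected, or None
    for number, line in enumerate(source_text.splitlines(), start=1):
        if current is None:
            if START.match(line):
                current = []
        elif END.match(line):
            snippets.append(current)
            current = None
        elif line.strip():  # skip blank lines inside a snippet
            current.append((number, line.strip()))
    return snippets
```

Keeping the original source line number with every IL line is what later makes it possible to decide where the snippet belongs in the ildasm output and to emit matching ‘.line’ directives.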

 

Here’s a quick demo. For the first C# example, the main() function from the IL file before injecting the snippet (%temp_il_output%) is:

  .method private hidebysig static void Main() cil managed

  {

    .entrypoint

    // Code size 14 (0xe)

    .maxstack 1

    .locals init ([0] int32 x,

             [1] int32 y,

             [2] int32 z)

    .language '{3F5162F8-07C6-11D3-9053-00C04FA302A1}', '{994B45C4-E6E9-11D2-903F-00C04FA302A1}', '{5A869D0B-6611-11D3-BD2A-0000F80849BD}'

    .line 6,6 : 9,19 'c:\\temp\\y2.cs'

    IL_0000: ldc.i4.3

    IL_0001: stloc.0

    .line 7,7 : 9,19 ''

    IL_0002: ldc.i4.4

    IL_0003: stloc.1

    .line 8,8 : 9,19 ''

    IL_0004: ldc.i4.5

    IL_0005: stloc.2

    .line 19,19 : 9,30 '' // <-- snippet goes before this line

    IL_0006: ldloc.0

    IL_0007: call void [mscorlib]System.Console::WriteLine(int32)

    IL_000c: nop

    .line 20,20 : 5,6 ''

    IL_000d: ret

  } // end of method Program::Main

 

 

 

After injecting the IL snippet, that function looks like:

  .method private hidebysig static void Main() cil managed

  {

    .entrypoint

    // Code size 14 (0xe)

// removed .maxstack declaration

    .locals init ([0] int32 x,

             [1] int32 y,

             [2] int32 z)

    .language '{3F5162F8-07C6-11D3-9053-00C04FA302A1}', '{994B45C4-E6E9-11D2-903F-00C04FA302A1}', '{5A869D0B-6611-11D3-BD2A-0000F80849BD}'

    .line 6,6 : 9,19 'c:\\temp\\y2.cs'

    IL_0000: ldc.i4.3

    IL_0001: stloc.0

    .line 7,7 : 9,19 ''

    IL_0002: ldc.i4.4

    IL_0003: stloc.1

    .line 8,8 : 9,19 ''

    IL_0004: ldc.i4.5

    IL_0005: stloc.2

.line 12,12 : 1,16 'c:\\temp\\y2.cs'

        ldloc x

.line 13,13 : 1,16 'c:\\temp\\y2.cs'

        ldloc y

.line 14,14 : 1,12 'c:\\temp\\y2.cs'

        add

.line 15,15 : 1,16 'c:\\temp\\y2.cs'

        ldloc z

.line 16,16 : 1,12 'c:\\temp\\y2.cs'

        add

.line 17,17 : 1,16 'c:\\temp\\y2.cs'

        stloc x

    .line 19,19 : 9,30 ''

    IL_0006: ldloc.0

    IL_0007: call void [mscorlib]System.Console::WriteLine(int32)

    IL_000c: nop

    .line 20,20 : 5,6 ''

    IL_000d: ret

  } // end of method Program::Main
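The before/after listings suggest a simple placement rule: put the snippet in front of the ‘.line’ directive for the closest source line that follows the snippet. A rough Python sketch of that rule is below. It assumes a single source file, and the name inject_snippet is invented for illustration; the real tool may well choose the spot differently.

```python
import re

# Captures the start line of a '.line' directive, e.g. ".line 19,19 : 9,30 ''".
LINE_DIRECTIVE = re.compile(r"^\s*\.line\s+(\d+)")


def inject_snippet(il_lines, snippet_lines, snippet_last_source_line):
    """Insert snippet_lines before the '.line' directive that maps to the
    closest source line after the snippet. Returns a new list."""
    best_index = None
    best_line = None
    for index, text in enumerate(il_lines):
        match = LINE_DIRECTIVE.match(text)
        if not match:
            continue
        source_line = int(match.group(1))
        if source_line > snippet_last_source_line and (
            best_line is None or source_line < best_line
        ):
            best_index, best_line = index, source_line
    if best_index is None:
        raise ValueError("no '.line' directive follows the snippet")
    return il_lines[:best_index] + list(snippet_lines) + il_lines[best_index:]
```

For the first C# example the snippet ends on source line 17, so the sketch would pick the ‘.line 19,19’ directive, which matches where the snippet lands in the listing above.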

 

 

 

What sort of limitations are there?

On the bright side, this inliner is pretty cheap. It’s under 1000 lines of C#. It operates on the IL level and so can work cross-language without a compiler change. That said, it works well enough on the samples above but it does have some key limitations, including:

1) The compiler (e.g. csc.exe) is completely ignorant of the IL snippets. This greatly simplifies the model but also introduces some issues:

a. The compiler doesn’t know about any locals declared in the IL snippets.

b. The compiler can’t do any analysis on the IL snippets. This can be critical in dead-code detection. If the only reference to C# code is via the IL snippet, CSC will think it’s dead code and remove it, and then the code will be unavailable for the inliner. This is why the C# filter example above puts the throw in its own function.

2) There are limitations to stitching the high-level source code and the IL together. For example, you can’t share labels across the boundary. Also, the compiler doesn’t know about declarations from the IL snippets.

3) The inliner only supports IL statements. It doesn’t support IL expressions, members, or types. Supporting expressions would require real integration with the compiler, and would also provide little value since they can trivially be converted into statements. Supporting members would also require real integration with the compiler so that the rest of the compiler could see the newly declared member. Supporting types doesn’t make sense since the type could just be in its own IL file.

 

Overall, I think this is a cute academic exercise, but not fit for commercial purposes.

 

 

Random issues with ILAsm

There are a few ilasm/ildasm issues to be aware of when doing this:

1) ILasm/ildasm and the symbol store.

At first, I thought I’d need to use the managed symbol store to do things like

- determine where in the IL code to inject the IL snippet. This would be mapping the source-line of the snippet back to a function and IL offset, and then injecting the snippet there.

- resolve variable names to IL variable indices. This would mean transforming ‘ldloc x’ into ‘ldloc 0’ if x was IL var 0.

 

It turns out that you can view ilasm / ildasm as an alternative (text-based) symbol store interface. ILdasm will automatically load the pdbs and emit the local variable names in the ‘.locals’ declaration, and ilasm will resolve those names back to indices when it reassembles. Thus you can blindly inject “ldloc x” into the IL stream and ilasm will resolve the variable ‘x’ for you.

ILdasm will also inject ‘.line’ directives for the managed sequence points if you pass the /linenum switch. This lets you preserve sequence points when round-tripping. You can also determine where to inject the IL snippets by parsing the IL text for ‘.line’ directives instead of actually cracking the symbol store.
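As an illustration of treating the IL text as the symbol store, here is a small Python sketch that reads the ‘.line’ directives out of ildasm output. The directive layout (start line, end line, start column, end column, quoted file name) is inferred from the listings in this post, as is the assumption that an empty file name '' means "same file as the previous directive". The function name parse_line_directives is hypothetical.

```python
import re

# Layout inferred from the listings: .line 6,6 : 9,19 'c:\\temp\\y2.cs'
LINE_DIRECTIVE = re.compile(
    r"^\s*\.line\s+(\d+),(\d+)\s*:\s*(\d+),(\d+)\s+'([^']*)'"
)


def parse_line_directives(il_text):
    """Return (il_line_index, start_line, end_line, source_file) tuples,
    one per '.line' directive, in the order they appear."""
    results = []
    current_file = None
    for index, line in enumerate(il_text.splitlines()):
        match = LINE_DIRECTIVE.match(line)
        if not match:
            continue
        if match.group(5):
            # ildasm doubles the backslashes in the path; undo that.
            current_file = match.group(5).replace("\\\\", "\\")
        # Assumption: '' reuses the file named by an earlier directive.
        results.append(
            (index, int(match.group(1)), int(match.group(2)), current_file)
        )
    return results
```

With a table like this, mapping a snippet's source line to a position in the IL file is a plain lookup, with no need to open the pdb directly.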

 

2) Mapping IL snippets back to source.

Just as we single-step through the high-level source, we want to be able to debug the inline IL. There are two things we need to do here. First, we need to map the lines of the IL snippet back to the source by injecting our own ‘.line’ directives. If we don’t do this, our IL snippets will map back to either a random ‘.line’ directive occurring earlier in the file, or the temporary %temp_il_output% file that we created. ILasm can only place ‘.line’ directives before statements, so we need to do a little hacky parsing to determine that. Make sure the filenames in the ‘.line’ directives have consistent casing, or that may confuse debuggers.

Second, we need to run ilasm with the /debug flag (instead of /debug=impl), which tells the jitter to use exactly the sequence points we specified as opposed to inferring sequence points from various heuristics (such as stack-empty points). The downside is that there’s a perf hit to using explicit sequence points.
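Here is a sketch of what the "hacky parsing" for the first part might look like, in Python. The is_statement heuristic is my guess at a workable rule (skip comments, braces, directives, and handler keywords), not a description of the actual tool. The column range 1 to line length + 1 and the doubled backslashes in the path mirror the directives shown in the injected listing earlier in this post.

```python
# Keywords that open exception-handling blocks rather than execute code.
NON_STATEMENTS = {"{", "}", "filter", "catch", "finally", "fault"}


def is_statement(il_line):
    """Heuristic: True if the line looks like an executable IL instruction."""
    code = il_line.split("//", 1)[0].strip()  # drop any trailing comment
    if not code or code.startswith("."):      # blank, comment, or directive
        return False
    return code.split()[0] not in NON_STATEMENTS


def add_line_directives(snippet, source_path):
    """snippet: list of (1-based source line number, raw source line).
    Returns IL lines with a '.line' directive before each statement."""
    quoted = source_path.replace("\\", "\\\\")  # ilasm-style escaped path
    out = []
    for number, raw in snippet:
        if is_statement(raw):
            out.append(
                ".line %d,%d : 1,%d '%s'" % (number, number, len(raw) + 1, quoted)
            )
        out.append(raw)
    return out
```

Always passing the same source_path string for every snippet is one easy way to keep the filename casing consistent.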

 

3) Removing the ‘.maxstack’ directive.

When we ran ildasm, it emitted ‘.maxstack’ directives describing the stack space needed based on just the high-level source code. That will change when we add the new IL. If we just strip these directives from %temp_il_output%, ilasm will recompute the maxstack when it reassembles.
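Stripping the directive is a one-line filter. A minimal Python sketch, which leaves a comment behind in the same way as the injected listing earlier in this post (the function name strip_maxstack is made up):

```python
import re

# Matches a whole-line directive such as "    .maxstack  1".
MAXSTACK = re.compile(r"^\s*\.maxstack\s+\d+\s*$")


def strip_maxstack(il_lines):
    """Replace every '.maxstack' directive with an IL comment so that
    ilasm recomputes the value when it reassembles."""
    return [
        "// removed .maxstack declaration" if MAXSTACK.match(line) else line
        for line in il_lines
    ]
```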

 

I think it would also be cool if there were an IL codedom that allowed you to easily load, manipulate, and save assemblies at the IL level, without having to resort to parsing ildasm text output.

 

Now what?

This was mostly an experiment. I didn’t find a good use for inlining IL in C#. Inlined IL is also very fragile and seems to be more pain than it’s worth. Just look at the C# filter example.

I have the demo files from above at: https://mikewinisp.members.winisp.net/blog/InlineIL/InlineIl.zip. [Update:] This also includes the source for the tool.

 

Another takeaway is that I think there are interesting opportunities for extremely cheap tools that take advantage of IL roundtripping. For example, Serge Lidin (the author of ilasm) wrote a short (under 1000 lines) utility, ILlink, which merges managed assemblies by basically running ildasm on the assemblies, concatenating the outputs, and then running ilasm on the result. Like my IL inliner, he had to do some basic touchups (such as removing multiple assembly declarations).

I could imagine another such tool that does code-coverage: ILdasm an assembly, blindly inject code coverage probes before each  ‘.line’ statement, and then ilasm it back up. Note that the officially correct way to do code-coverage is with the CLR profiling APIs, which is the most scalable and would let you reinstrument the code on the fly (thus avoiding a separate code-coverage build), but ildasm roundtripping would certainly be simpler to implement.
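To give a feel for how small such a tool could be, here is a Python sketch of the probe injection step. Everything about the probe is hypothetical: the [CoverageRuntime] assembly and the Coverage.Probes::Hit(int32) method do not exist, and a real tool would also have to reference that assembly in the IL file. Being blind, the sketch also ignores cases such as branches that jump straight to a label after the probe.

```python
import re

LINE_DIRECTIVE = re.compile(r"^\s*\.line\s+\d+")

# Hypothetical probe method, named here only for illustration.
PROBE_CALL = "call void [CoverageRuntime]Coverage.Probes::Hit(int32)"


def inject_probes(il_lines):
    """Return (new_lines, probe_count), adding one numbered probe
    before every '.line' directive."""
    out = []
    probe_id = 0
    for line in il_lines:
        if LINE_DIRECTIVE.match(line):
            out.append("ldc.i4 %d" % probe_id)  # push this probe's id
            out.append(PROBE_CALL)              # report the hit
            probe_id += 1
        out.append(line)
    return out, probe_id
```

The probe count tells the reporting side how many probes exist in total, so the ids that never fire at run time identify the uncovered lines.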