Small changes can prevent optimizations.

Here’s an example where adding a very useful (and seemingly innocent) method could prevent some useful optimizations.

The ILGenerator class provides a reasonably safe way to emit a stream of IL. It has a bunch of overloaded Emit methods that emit various IL opcodes. The actual opcodes don’t have to be written until the generator is done and the type gets “baked”.

However, it doesn’t expose a way to get the current IL offset in the stream. This would be useful because although Reflection.Emit lets you generate a PDB and thus make the emitted code debuggable, sometimes it’s nice to track your own IL-to-source mapping if you don’t want a PDB.

So Rick and I discussed adding a property to ILGenerator like “CurrentILOffset” that would return the current IL offset for the emitter. The property we want to expose is just sitting there as a private field, so it’s got to be safe, right? Expected usage would be something like this:

    int[] offsetsIL = new int[lines + 1];

    offsetsIL[0] = gen.CurrentILOffset; // IL offset before source line 1
    gen.Emit(…); // emit IL opcodes for source line 1

    offsetsIL[1] = gen.CurrentILOffset; // IL offset at the end of line 1 and before line 2
    gen.Emit(…); // emit IL opcodes for source line 2

    offsetsIL[2] = gen.CurrentILOffset; // final IL offset

And now the offsetsIL array contains a mapping from source line to IL offset. A compiler could then store that in some random arbitrary non-pdb format (such as via Custom Attributes). (See here for the “right” way to do this)

What’s the problem?
Adding this property seems nice and cute. But being able to provide the IL offset means that you always know the size of code emitted, and that prevents certain optimizations.
Some IL operations can be encoded in different ways. For example, “ldc.i4” takes a 4-byte integer argument and pushes it onto the IL stack. That’s 5 bytes total: 1 for the opcode and 4 for the argument. But there are also short forms of the “load constant” opcode. For example, “ldc.i4.1” pushes the constant ‘1’ onto the stack, and thus shaves off the 4 bytes for the integer argument.

Branch instructions also have long and short forms. If the branch target is close, the offset to it fits in fewer bytes, and thus the branch can use the short form of the opcode (bge.s rather than bge).

ILAsm (at least in Whidbey) has a “/optimize” switch that will pick shorter encodings when appropriate. For example, there’s no need to emit “ldc.i4 1” (5 bytes) when you could say “ldc.i4.1” (1 byte).
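The size difference between these encodings can be checked against the opcode metadata that Reflection.Emit already exposes. The following is a minimal sketch, not anything ILGenerator or ILAsm does internally: `EncodingSizes`, `OperandSize`, and `EncodedSize` are names made up for this illustration, and the operand-size table only covers the operand kinds used here.

```csharp
using System;
using System.Reflection.Emit;

static class EncodingSizes
{
    // Operand size in bytes for the operand kinds this sketch needs.
    // Hypothetical helper; other operand kinds are deliberately not handled.
    static int OperandSize(OperandType kind)
    {
        switch (kind)
        {
            case OperandType.InlineNone:          return 0; // constant is baked into the opcode
            case OperandType.ShortInlineI:        return 1; // 1-byte integer
            case OperandType.ShortInlineBrTarget: return 1; // 1-byte branch offset
            case OperandType.InlineI:             return 4; // 4-byte integer
            case OperandType.InlineBrTarget:      return 4; // 4-byte branch offset
            default: throw new NotSupportedException(kind.ToString());
        }
    }

    // Total encoded size: the opcode bytes plus the operand bytes.
    static int EncodedSize(OpCode op) => op.Size + OperandSize(op.OperandType);

    static void Main()
    {
        OpCode[] ops =
        {
            OpCodes.Ldc_I4, OpCodes.Ldc_I4_S, OpCodes.Ldc_I4_1,
            OpCodes.Bge, OpCodes.Bge_S
        };

        foreach (OpCode op in ops)
            Console.WriteLine($"{op.Name}: {EncodedSize(op)} byte(s)");
    }
}
```

Run as a console program, this should report 5 bytes for ldc.i4 and 1 byte for ldc.i4.1, matching the figures above, with the long and short branch forms differing in the same way.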

So here’s where it breaks down: since an opcode can be encoded in different ways, you may not know the number of IL bytes it takes, and thus may not be able to accurately provide the IL offset. For example, suppose you had a method like “EmitBranch” that took a branch target and would emit the short form of the opcode if possible. Now suppose you had pseudocode like this:

    Label l = gen.DefineLabel();
    gen.EmitBranch(…, l); // emit a branch to unknown label l
    int offset = gen.CurrentILOffset; // needs the size of that branch instruction, which isn't known yet

    gen.EmitOtherStuff(…); // emit the code between the branch and its target
    gen.MarkLabel(l); // label gets marked here.
    gen.Emit(…); // emit more stuff at end

If EmitOtherStuff only emits a few bytes, then EmitBranch could emit a short branch. If it emits a lot of bytes, then EmitBranch would need to emit a long branch instruction. Thus the behavior of EmitOtherStuff affects the result of CurrentILOffset on the previous line. So exposing the IL offset prevents this sort of optimization.
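To make the dependency concrete, here is a small sketch of the arithmetic. It does not use ILGenerator at all: `BranchSize` and `OffsetAfterBranch` are hypothetical helpers, and the sketch assumes the branch is the first instruction in the method and jumps forward. A short branch is a 1-byte opcode plus a 1-byte signed offset, so a forward target can be at most 127 bytes past the end of the branch; anything farther needs the 4-byte offset.

```csharp
using System;

static class BranchSizeDemo
{
    // Hypothetical helper: encoded size of a forward branch whose target lies
    // `distance` bytes past the end of the branch instruction.
    // Short form = 1-byte opcode + 1-byte offset; long form = 1-byte opcode + 4-byte offset.
    static int BranchSize(int distance) => distance <= sbyte.MaxValue ? 2 : 5;

    // IL offset immediately after the branch, assuming the branch is the
    // first instruction in the method.
    static int OffsetAfterBranch(int bytesBetweenBranchAndLabel) =>
        BranchSize(bytesBetweenBranchAndLabel);

    static void Main()
    {
        // Same branch, same position; only the later code differs.
        Console.WriteLine(OffsetAfterBranch(10));   // few bytes follow: short branch fits
        Console.WriteLine(OffsetAfterBranch(1000)); // many bytes follow: long branch needed
    }
}
```

The value a CurrentILOffset-style property would have to return right after the branch differs between the two calls, even though nothing before that point changed.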

So I admit this isn’t the best example. ILGenerator today does not do this optimization, and the optimization in question is difficult. And for the record, there are ways to abstract exposing the IL offset in a way that allows these optimizations. That aside, it’s a real-life example of how innocent and useful changes can cause grief such as preventing optimizations. This contributes to the “There’s no such thing as a ‘small’ change” philosophy, which is a topic for another blog entry.
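One way such an abstraction might look, purely as an illustration: hand out opaque markers while emitting, and only resolve them to byte offsets once the method is finished and all instruction sizes are final. The `ToyEmitter` class below is a made-up stand-in that tracks instruction sizes only; it is not ILGenerator and does not reflect any planned API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy emitter: buffers instruction sizes and defers offsets.
sealed class ToyEmitter
{
    private readonly List<int> _sizes = new List<int>();   // size of each buffered instruction
    private readonly List<int> _markers = new List<int>(); // index of the instruction each marker precedes

    // Buffers an instruction of the given size and returns its index.
    public int Emit(int sizeInBytes)
    {
        _sizes.Add(sizeInBytes);
        return _sizes.Count - 1;
    }

    // Lets an optimizer pick a different encoding before the method is finished.
    public void Resize(int index, int newSizeInBytes) => _sizes[index] = newSizeInBytes;

    // Returns a marker id; the byte offset is not available until Finish().
    public int MarkOffset()
    {
        _markers.Add(_sizes.Count);
        return _markers.Count - 1;
    }

    // Resolves every marker to a byte offset using the final sizes.
    public int[] Finish()
    {
        var start = new int[_sizes.Count + 1];
        for (int i = 0; i < _sizes.Count; i++)
            start[i + 1] = start[i] + _sizes[i];

        var offsets = new int[_markers.Count];
        for (int m = 0; m < _markers.Count; m++)
            offsets[m] = start[_markers[m]];
        return offsets;
    }
}

static class Program
{
    static void Main()
    {
        var emitter = new ToyEmitter();
        int branch = emitter.Emit(5);           // pessimistically a long branch
        int afterBranch = emitter.MarkOffset(); // marker, not a number yet
        emitter.Emit(10);                       // code between branch and target
        emitter.Resize(branch, 2);              // target turned out to be close: use the short form

        int[] offsets = emitter.Finish();
        Console.WriteLine(offsets[afterBranch]); // offset reflects the final, shortened branch
    }
}
```

Because callers never see a number until Finish(), the emitter stays free to shrink the branch after the marker was taken.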

Comments (2)

  1. ILGenerator shouldn’t do any "optimizations", it’s a pain in the ass. Exposing IL offset is a critical feature (although it would be acceptable to me to do it based on labels, in a way that the offset only becomes available after the method is finished) and at the moment I’ve reverse engineered ILGenerator and taken a dependency on the implementation details of ILGenerator to be able to map IL offsets to source line numbers.

  2. jmstall says:

    Jeroen – This is more of an abstract point.

    1) It’s reasonable to want some sort of IL emitter that uses short form encodings when possible.

    2) exposing something like ILOffset would prevent that.

    We could pull up other examples where "innocent" changes prevent optimizations.

    In reality, ILGenerator isn’t the optimizing emitter. I expect (no promises) a future version will expose the IL offset. Your situation is a specific scenario for us.