IL offset 0 vs. Native offset 0

Within a function, offset 0 into the native code stream corresponds to the very first native instruction in that function. Since the function is ultimately executed as native code (and not as interpreted IL), it’s safe to say that native offset 0 corresponds to the very start of the function. When native-debugging, a function-level breakpoint is placed at native offset 0.
Likewise, offset 0 into the IL stream corresponds to the very first IL instruction in that function. However, since the IL doesn’t describe the prolog, IL offset 0 starts after the prolog. (There’s no IL for the epilog either.)
Thus a breakpoint placed at IL offset 0 will skip the prolog. In practice, this only matters if you want to debug the prolog. Since the prolog has only one exit point, IL offset 0 is always guaranteed to be hit.
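
One way to convince yourself that the IL stream starts at the first instruction of the method body, with nothing in it for the prolog or epilog, is to dump a method's raw IL bytes through reflection. This is only a minimal sketch: the IlDump class and its helper are made up for illustration, and the exact bytes you see depend on your compiler and build settings.

```csharp
using System;
using System.Reflection;

public static class IlDump
{
    // Hypothetical sample method; static, so that x is argument 0.
    public static int Add(int x, int y)
    {
        int z = x + y;
        return z;
    }

    // Returns the raw IL bytes of the named public static method on IlDump.
    public static byte[] GetIl(string methodName)
    {
        MethodInfo method = typeof(IlDump).GetMethod(methodName);
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    public static void Main()
    {
        byte[] il = GetIl("Add");

        // Index 0 of this array is IL offset 0. This prints raw bytes,
        // one per line; it does not decode them into instructions.
        for (int offset = 0; offset < il.Length; offset++)
        {
            Console.WriteLine("IL byte {0:x4}: 0x{1:x2}", offset, il[offset]);
        }
    }
}
```

The array contains only the method body's instructions, ending with ret (opcode 0x2A). The pushes and pops that make up the native prolog and epilog have no counterpart in it.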

Take a trivial function that compiles to IL:

int Add(int x, int y)
{
    int z = x+y;
    return z;
}

Here’s a merged view of the IL (lines labeled IL_xxxx), the native x86 code (lines starting with an 8-digit hex native offset) and the source (unlabeled lines).
[update:] Note that this is specifically fully unoptimized, debuggable code. That way nothing gets inlined, breakpoints all work, you can inspect all locals, etc. Once you enable optimizations, everything gets folded into a single add instruction (see comments for details).

int Add(int x, int y)
00000000 push edi <– start of prolog, Native Offset 0
00000001 push esi
00000002 push ebx
00000003 push ebp
00000004 mov ebx,ecx
00000006 mov esi,edx
00000008 cmp dword ptr ds:[001AA30Ch],0
0000000f je 00000016
00000011 call 769AF339 <– End of prolog

00000016 xor edi,edi <– zero out local #0
00000018 xor ebp,ebp <– zero local #1
    int z = x+y;
IL_0000: ldarg.0
IL_0001: ldarg.1
IL_0002: add
0000001a lea eax,[ebx+esi] <– Here’s the native code for IL offset 0.

IL_0003: stloc.1

0000001d mov ebp,eax

return z;
IL_0004: ldloc.1
IL_0005: stloc.0

0000001f mov edi,ebp

IL_0006: ldloc.0
IL_0007: ret

00000021 mov eax,edi

00000023 pop ebp <– epilog and return (return value is in eax).
00000024 pop ebx
00000025 pop esi
00000026 pop edi
00000027 ret

I often find this 3-way view convenient. As another pet project, I’d love to add a debugger tool window that automatically stitches these 3 views together.
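
The correspondence in the listing can be written down as a small IL-to-native lookup table. The sketch below hard-codes the offset pairs read off the listing above; the OffsetMap class and IlToNative method are hypothetical names for illustration, not a CLR API, and a real debugger would obtain this map from the runtime instead of hard-coding it.

```csharp
using System;

public static class OffsetMap
{
    // (IL offset, native offset) pairs read off the listing, sorted by IL offset.
    static readonly int[] IlOffsets     = { 0x00, 0x03, 0x04, 0x06 };
    static readonly int[] NativeOffsets = { 0x1a, 0x1d, 0x1f, 0x21 };

    // The IL stream in the listing spans IL_0000 through IL_0007.
    const int IlSize = 8;

    // Returns the native offset where the code covering ilOffset begins.
    public static int IlToNative(int ilOffset)
    {
        if (ilOffset < 0 || ilOffset >= IlSize)
        {
            throw new ArgumentOutOfRangeException(nameof(ilOffset));
        }

        // Pick the last entry whose IL offset is <= the requested one.
        int result = NativeOffsets[0];
        for (int i = 0; i < IlOffsets.Length; i++)
        {
            if (IlOffsets[i] > ilOffset)
            {
                break;
            }
            result = NativeOffsets[i];
        }
        return result;
    }

    public static void Main()
    {
        // IL offset 0 lands after the prolog, not at native offset 0.
        Console.WriteLine("IL 0 -> native 0x{0:x}", IlToNative(0));
        Console.WriteLine("IL 5 -> native 0x{0:x}", IlToNative(5));
    }
}
```

With this table, IL offset 0 maps to native offset 0x1a. Native offsets 0x00 through 0x19 (the prolog and the zeroing of locals) and 0x23 onward (the epilog) have no IL offset that maps to them.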

Comments (5)

  1. OT: Is the JIT-compiler that bad? 20 instructions for a simple "return x+y"?

  2. Eric W says:

    I understand most of that x86 assembly, but what exactly are these three lines in the prolog doing?

    00000008 cmp dword ptr ds:[001AA30Ch],0

    0000000f je 00000016

    00000011 call 769AF339

  3. I should have clarified: this is *fully-debuggable* code with *all* optimizations disabled (even the simple ones).

    For example, you’ll notice it refrained from inlining anything; and all the locals are still available, and it eagerly zero-initialized things, etc.

    When I throw the switch and run as optimized, it folds everything. If I call it with constants, like:

    int z2 = Add(5,6);


    It optimizes very nicely to:

    00000058 mov ecx,0Bh

    0000005d call 75AD2A98

    Even with vars, it’s still smart and produces this code:

    int z2 = Add(x1,y1);

    0000007a mov eax,dword ptr [ebp-4Ch]

    0000007d add esi,eax


    0000007f mov ecx,esi

    00000081 call 75AD2A98

  4. Eric W – those lines are basically some instrumentation at the start of the method (like a "Function-Enter hook" for the CLR). They only appear in non-optimized code.