3rd-parties and Edit And Continue (Part 2: Debuggers)

I recently blogged about what’s involved for 3rd-parties doing Edit-and-Continue (EnC), where I explained EnC is an IDE (at least debugger+editor+compiler) wide feature, not just a debugger feature. In that entry, I alluded to some basic work a 3rd-party would need to do to their editor and compiler to support EnC. In this entry, I’ll focus on EnC requirements on the debugger.

 

I’ll first look at what the basic rules are for EnC at the ICorDebug level, and then I’ll walk through a simple EnC demo at the ICorDebug level.

 

Basic ICorDebug EnC Rules

There are a few basic rules about EnC and ICorDebug:

  1. The only valid edits are:
    1. Adding new private members (field, method, property) to a type, and
    2. Replacing an entire function.  If a user edits a single line within a function, the IDE still needs to replace the entire function. There is an ICorDebugFunction object for each version of the function.
  2. There are many random restrictions on top of these. We basically restricted anything that we didn’t think was a core scenario for EnC. This freed up resources to make sure the core-scenarios worked well. For example:
    1. you can’t add any members to a value-type or change anything in a generic type.
    2. there are additional restrictions on the legal edits for an active function (currently on a thread’s stack) versus an inactive function (not on any thread’s stack).
    3. you can’t edit dynamic code.
    4. you can’t remove locals from a function or change its signature.
  3. The design of ICorDebug EnC is to offload as much as possible to the debugger / IDE. For example, the debugger’s responsibilities include:
    1. compiling delta files for the IL and metadata for each edit. (The debugger presumably obtains these from the compiler).
    2. knowing how a thread in a stale function should get remapped to a new version of that function. (The debugger presumably obtains this from the editor).
    3. remapping all execution control, such as breakpoints and steppers, from the stale function to the new function.
    4. handling any language policy issues. For example, what should be done if an end-user puts an “if (false) {  }” around the current IP?
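
As a rough illustration of how an IDE might pre-check an edit against the rules listed above before compiling a delta, here is a minimal sketch. Everything in it (ProposedEdit, EditKind, EditRules and their fields) is a hypothetical name invented for this example, not part of ICorDebug, and it covers only the restrictions named in this entry.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical description of one proposed edit; none of these names come from ICorDebug.
public enum EditKind { AddPrivateMember, ReplaceFunctionBody, Other }

public sealed class ProposedEdit
{
    public EditKind Kind;
    public bool TargetIsValueType;    // the edit adds a member to a value type
    public bool TargetIsGeneric;      // the edit touches a generic type
    public bool TargetIsDynamicCode;  // the edit touches dynamically generated code
    public bool RemovesLocals;        // the new body drops a local of the old body
    public bool ChangesSignature;     // the new body has a different signature
}

public static class EditRules
{
    // Returns the list of rule violations; an empty list means the edit passes these checks.
    public static List<string> Check(ProposedEdit e)
    {
        var problems = new List<string>();
        if (e.Kind == EditKind.Other)
            problems.Add("only adding private members or replacing a whole function is allowed");
        if (e.Kind == EditKind.AddPrivateMember && e.TargetIsValueType)
            problems.Add("cannot add members to a value type");
        if (e.TargetIsGeneric)
            problems.Add("cannot change anything in a generic type");
        if (e.TargetIsDynamicCode)
            problems.Add("cannot edit dynamic code");
        if (e.Kind == EditKind.ReplaceFunctionBody && (e.RemovesLocals || e.ChangesSignature))
            problems.Add("cannot remove locals or change the signature");
        return problems;
    }
}
```

A real IDE would also need the extra checks for active functions, which this sketch leaves out.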

 

 

An EnC walkthrough at the ICorDebug level

The MDbg sample has a demo of Edit-And-Continue (see the $\readme.htm and $\demo\EditAndContinue directory in the MDbg sample) which demonstrates EnC at the ICorDebug level by editing IL. This sample is great for understanding the underlying mechanics of EnC. I’ll walk through that demo here.

 

The Overview

Here’s the test program (HelloWorld.cs) in C# (we’ll eventually ildasm this to produce our starting IL code):

using System;

public class HelloWorld {
     public static void Main()
     {
          Console.Write("Hello, ");
          PrintWorld();
     }

     static void PrintWorld()
     {
          Console.WriteLine("World");
     }
}

 

We’d like to step into the PrintWorld() function and then edit it to be:

     static void PrintWorld()
     {
          Console.WriteLine("new World");
     }

 

It sounds simple enough. Now let’s see what’s involved…

 

Preparation:

In order to do EnC, we need to produce compiled delta files, which requires an EnC-capable compiler. ILasm has primitive EnC support that can be used for academic and testing purposes. Its critical limitation is that it must compile all EnC deltas at initial compile time; it can’t compile new deltas in the middle of a debug session. That’s fine for testing and demo purposes but obviously useless for anybody who actually wants to do EnC in a real scenario. After all, if you knew the edits beforehand, why not just compile it that way initially?

 

We compile HelloWorld.cs and run ildasm on the result to get our starting IL. Here’s the full program in IL (HelloWorld.il). The method we’re going to edit is PrintWorld().

// Microsoft (R) .NET Framework IL Disassembler. Version 1.2.30711.0

// Copyright (C) Microsoft Corporation 1998-2003. All rights reserved.

// Metadata version: v1.2.30711

.assembly extern legacy library mscorlib

{

  auto

}

.assembly legacy library HelloWorld

{

  // --- The following custom attribute is added automatically, do not uncomment -------

  // .custom instance void [mscorlib]System.Diagnostics.DebuggableAttribute::.ctor(bool,

  // bool) = ( 01 00 00 01 00 00 )

  .hash algorithm 0x00008004

  .ver 0:0:0:0

}

.module HelloWorld.exe

// MVID: {9F4A2EAC-B19E-4D09-B988-5A948172960E}

.imagebase 0x00400000

.file alignment 0x00000200

.stackreserve 0x00100000

.subsystem 0x0003 // WINDOWS_CUI

.corflags 0x00000001 // ILONLY

// Image base: 0x03EA0000

// =============== CLASS MEMBERS DECLARATION ===================

.class public auto ansi beforefieldinit HelloWorld

       extends [mscorlib]System.Object

{

  .method public hidebysig static void Main() cil managed

  {

    .entrypoint

    // Code size 16 (0x10)

   .maxstack 8

    IL_0000: ldstr "Hello "

    IL_0005: call void [mscorlib]System.Console::Write(string)

    IL_000a: call void HelloWorld::PrintWorld()

    IL_000f: ret

  } // end of method HelloWorld::Main

  .method private hidebysig static void PrintWorld() cil managed

  {

    // Code size 11 (0xb)

    .maxstack 8

    IL_0000: ldstr "World"

    IL_0005: call void [mscorlib]System.Console::WriteLine(string)

    IL_000a: ret

  } // end of method HelloWorld::PrintWorld

  .method public hidebysig specialname rtspecialname

          instance void .ctor() cil managed

  {

    // Code size 7 (0x7)

    .maxstack 8

    IL_0000: ldarg.0

    IL_0001: call instance void [mscorlib]System.Object::.ctor()

    IL_0006: ret

  } // end of method HelloWorld::.ctor

} // end of class HelloWorld

// =============================================================

//*********** DISASSEMBLY COMPLETE ***********************

// WARNING: Created Win32 resource file HelloWorld.res

 

 

Now in the demo we edit the PrintWorld() function. We need to replace the entire function. The IL for the new function is:

 

HelloWorld_v1.il:

.class public auto ansi beforefieldinit HelloWorld
{
  .method private hidebysig static void PrintWorld() cil managed
  {
    // Code size 11 (0xb)
    .maxstack 8
    IL_0000: ldstr "new World"
    IL_0005: call void [mscorlib]System.Console::WriteLine(string)
    IL_000a: ret
  } // end of method HelloWorld::PrintWorld
} // end of class HelloWorld

 

ILasm’s EnC support can take in snippets of IL that contain just the new EnC items and produce the compiled IL and metadata for them. We pass the delta IL file via the /enc switch.

Go to the demo directory (<your MDbg install root>\demo\EditAndContinue), and run:

ilasm /debug=IMPL HelloWorld.il /ENC=HelloWorld_v1.il /out=HelloWorld.exe

 

That produces several files:

HelloWorld.exe, HelloWorld.pdb – these are the standard exe + pdb you’d expect.

 

It also produces these additional files:

· HelloWorld.exe.1.dil – the compiled delta IL file (this is analogous to the .exe)

· HelloWorld.exe.1.dmeta – the compiled metadata file.

· HelloWorld.exe.1.pdb – the updated pdb for the new edit.
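
A debugger has to find these files again when it applies the edit. The sketch below builds the three names from the pattern shown above. DeltaFiles and NamesFor are hypothetical names for this example, and the assumption that later edits are numbered 2, 3, and so on is mine rather than something stated here.

```csharp
using System;

public static class DeltaFiles
{
    // Builds the three file names for edit number `edit` of `moduleFile`,
    // following the pattern seen above (HelloWorld.exe.1.dil and friends).
    // Numbering beyond 1 is an assumption of this sketch.
    public static string[] NamesFor(string moduleFile, int edit)
    {
        if (edit < 1) throw new ArgumentOutOfRangeException(nameof(edit));
        string stem = moduleFile + "." + edit;
        return new[] { stem + ".dil", stem + ".dmeta", stem + ".pdb" };
    }
}
```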

 

Now we’re all set to run the EnC demo:

 

A walkthrough of running the sample:

Here’s a breakdown of the “editing an active function” demo from the run.bat file in the demo directory:

 

1) First, run MDbg from the beta 1 sample.

..\..\bin\debug\mdbg

 

2) Now load the EnC and ILDasm extension DLLs.

MDbg> load enc

trying to load: C:\dev\mdbg\bin\debug\.\enc.dll

Extension EnC loaded

MDbg> load ildasm

trying to load: C:\dev\mdbg\bin\debug\.\ildasm.dll

Extension ildasm loaded

ildasm command loaded. For help type 'help'

 

3) Now load the debuggee. We pass the “-enc” flag because ICorDebug needs to know at startup whether we’re doing EnC, since it affects codegen. (There’s a brief discussion of that here.)

MDbg> run -enc HelloWorld.exe

STOP: Breakpoint Hit

42: IL_0000: ldstr "Hello "

4) Now we step into the function that we’re going to edit (PrintWorld)

[p#:0, t#:0] MDbg> next

Hello 44: IL_000a: call void HelloWorld::PrintWorld()

[p#:0, t#:0] MDbg> step

52: IL_0000: ldstr "World"

[p#:0, t#:0] MDbg> w

Thread [#:0]

*0. HelloWorld.PrintWorld (HelloWorld.il:52)

 1. HelloWorld.Main (HelloWorld.il:45)

5) Just for kicks, we can view the IL and verify that it’s really the original IL:

[p#:0, t#:0] MDbg> ildasm

code size: 11

current IL-IP: 0

mapping: MAPPING_EXACT

URL: HelloWorld.PrintWorld

* IL_0 : ldstr "World"

  IL_5 : call System.Console.WriteLine

   IL_A : ret

 

6) Now we actually apply the edit. A fancy IDE (like VS) would do this under the covers in response to a user editing the source file. For MDbg, we have to explicitly apply the edit. This maps to calling ICorDebugModule2::ApplyChanges, which just notifies the CLR of the edit request but does not actually move any threads or change anything on the stack.

[p#:0, t#:0] MDbg> enc HelloWorld.exe HelloWorld_v1.il

ENC successfully applied.
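
In outline, applying an edit means reading the compiled delta metadata and delta IL and handing both blobs to the runtime. The sketch below shows only that shape. EditApplier is a hypothetical name, and the applyDelta delegate is a stand-in for whatever call the debugger actually makes on the module; the real ICorDebug call is not reproduced here.

```csharp
using System;
using System.IO;

public static class EditApplier
{
    // Reads the compiled delta metadata and delta IL for one edit and hands
    // both blobs to `applyDelta`, a stand-in for the debugger's real call on
    // the module. Nothing here remaps a thread or touches the stack.
    public static void Apply(Action<byte[], byte[]> applyDelta, string dmetaPath, string dilPath)
    {
        if (applyDelta == null) throw new ArgumentNullException(nameof(applyDelta));
        byte[] metadata = File.ReadAllBytes(dmetaPath);
        byte[] il = File.ReadAllBytes(dilPath);
        applyDelta(metadata, il);
    }
}
```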

7) New function calls to PrintWorld will call the most recent version, but threads already in the old version will not be automatically remapped. We need to let the thread keep running until it reaches a point where it can be remapped to the new version of the function. The CLR will fire an ICorDebugManagedCallback2::FunctionRemapOpportunity callback when this happens.

[p#:0, t#:0] MDbg> go

STOP RemapOpportunityReached

52: IL_0000: ldstr "World"

 

8) Once we reach a “remap opportunity”, we need to actually remap. The debugger provides the IL offset into the latest function for where to resume. This means that the debugger needs to be keeping source-mapping tables between the old version of the function and the new one. For our demo, we’ll just start at the beginning (IL offset 0). MDbg will then call ICorDebugILFrame2::RemapFunction here.

[p#:0, t#:0] MDbg> remap 0

Remap successful.
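
The demo sidesteps the mapping tables by always resuming at offset 0. A debugger that keeps such tables needs some way to look up the resume offset for a given old offset. Here is a minimal sketch of one possible lookup. RemapTable is a hypothetical type, and the "closest entry at or before the old offset" policy is an assumption chosen for illustration, not a rule from ICorDebug.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical remap table for one edit of one function: each entry says
// "old IL offset X corresponds to new IL offset Y".
public sealed class RemapTable
{
    private readonly SortedList<int, int> oldToNew = new SortedList<int, int>();

    public void Add(int oldOffset, int newOffset) { oldToNew[oldOffset] = newOffset; }

    // Returns the new offset for the closest entry at or before oldOffset,
    // or 0 (the start of the new function) when no entry applies.
    public int Lookup(int oldOffset)
    {
        int result = 0;
        foreach (KeyValuePair<int, int> entry in oldToNew)
        {
            if (entry.Key > oldOffset) break;
            result = entry.Value;
        }
        return result;
    }
}
```

Falling back to 0 mirrors what the demo does by hand with "remap 0".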

 

If we don’t remap, then the thread will just continue executing in the stale function until it hits the next remap opportunity (in which case we repeat step 8). ICorDebug never gets the actual remap table, and thus can’t remap any IL-based objects such as steppers (which step across an IL range) or breakpoints (which are set at an IL offset). The debugger also needs to remap these by creating new objects in the updated function.
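
Since the debugger owns this remapping, it might carry breakpoints over roughly as follows. This is only a sketch: IlBreakpoint and BreakpointRemapper are hypothetical debugger-side types, and dropping breakpoints that have no counterpart in the new version is one possible policy among several.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical debugger-side record of an IL breakpoint; not an ICorDebug type.
public sealed class IlBreakpoint
{
    public int FunctionVersion;
    public int IlOffset;
}

public static class BreakpointRemapper
{
    // Creates a fresh breakpoint in the new version for every old breakpoint
    // whose offset has a counterpart in the map; the rest are dropped.
    public static List<IlBreakpoint> Remap(
        IEnumerable<IlBreakpoint> oldBreakpoints,
        IDictionary<int, int> oldToNewOffset,
        int newVersion)
    {
        var result = new List<IlBreakpoint>();
        foreach (IlBreakpoint bp in oldBreakpoints)
        {
            int newOffset;
            if (oldToNewOffset.TryGetValue(bp.IlOffset, out newOffset))
                result.Add(new IlBreakpoint { FunctionVersion = newVersion, IlOffset = newOffset });
        }
        return result;
    }
}
```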

 

9) Now that the CLR knows where the thread should resume in the new function, we need to let the debuggee resume so that the CLR can remap the thread. The CLR will fire an ICorDebugManagedCallback2::FunctionRemapComplete callback here.

[p#:0, t#:0] MDbg> go

STOP FunctionRemapComplete

7: IL_0000: ldstr "new World"
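
Steps 7 through 9 amount to a small per-thread state machine, which a debugger could track along these lines. The state and event names are made up for this sketch; only the ordering of the remap opportunity, the remap command, and the remap-complete notification comes from the walkthrough.

```csharp
using System;

// Hypothetical names; only the ordering of events comes from the walkthrough.
public enum EncThreadState { InStaleCode, AtRemapOpportunity, RemapRequested, InNewCode }
public enum EncEvent { RemapOpportunity, RemapCommand, Continue, RemapComplete }

public static class EncStateMachine
{
    // Returns the next state, or throws if the event makes no sense in this state.
    public static EncThreadState Next(EncThreadState state, EncEvent e)
    {
        switch (state)
        {
            case EncThreadState.InStaleCode:
                if (e == EncEvent.RemapOpportunity) return EncThreadState.AtRemapOpportunity;
                if (e == EncEvent.Continue) return state;
                break;
            case EncThreadState.AtRemapOpportunity:
                if (e == EncEvent.RemapCommand) return EncThreadState.RemapRequested;
                // Declining the remap: the thread keeps running the stale code.
                if (e == EncEvent.Continue) return EncThreadState.InStaleCode;
                break;
            case EncThreadState.RemapRequested:
                if (e == EncEvent.RemapComplete) return EncThreadState.InNewCode;
                // Resuming lets the runtime carry out the remap.
                if (e == EncEvent.Continue) return state;
                break;
            case EncThreadState.InNewCode:
                if (e == EncEvent.Continue) return state;
                break;
        }
        throw new InvalidOperationException("unexpected " + e + " in state " + state);
    }
}
```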

 

10) We’re now in the new function. We can verify that with the ildasm command again:

[p#:0, t#:0] MDbg> ildasm

code size: 11

current IL-IP: 0

mapping: MAPPING_EXACT

URL: HelloWorld.PrintWorld

* IL_0 : ldstr "new World"

   IL_5 : call System.Console.WriteLine

   IL_A : ret

 

11) We can also run a callstack and see the pdb has been updated and that our new function is indeed pointing at the new source:

[p#:0, t#:0] MDbg> w

Thread [#:0]

*0. HelloWorld.PrintWorld (HelloWorld_v1.il:7)

 1. HelloWorld.Main (HelloWorld.il:45)

 

12) The last bit of proof is to run the current function and verify that it does indeed execute the new code:

[p#:0, t#:0] MDbg> out

new World

45: IL_000f: ret

 

That’s edit-and-continue with IL and MDbg.

 

The Next Step:

The above example is mostly a proof of concept to illustrate EnC at the raw ICorDebug level. I can’t imagine any end-user using MDbg + EnC like this for anything beyond academic interest. There are several additional things a debugger and IDE would need to do to make the EnC experience more useful to end users. This work could be factored between a debugging engine (which talks to ICorDebug) and a “language service” component, which would then let the debugging portion be reusable across languages. This is how VS is architected, which is what allowed VS’s C# to add EnC support after VS already had VB EnC support.

 

Things the language service would have to do may include:

  • have the compilers be able to compile delta IL files on the fly, instead of having ilasm’s restriction of requiring them all up front. Unfortunately for 3rd parties, neither the VBC nor the Csc compiler publicly exposes its EnC functionality.
  • automatically generate the delta IL files based on source changes in the high-level language.
  • automatically track IL mappings (remap tables) between different versions of functions. This is important to provide the new IL offset for the remap command in step 8.
  • handle all the “goofy” source-level cases. For example, what if somebody puts a foreach loop around the current IP?
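
One way a language service might derive a remap table is to combine each version's IL-offset-to-source-line information with the editor's record of where each old line ended up. The sketch below assumes exactly that input shape. RemapBuilder and its parameters are hypothetical, and real compilers may track this at a finer grain than whole lines.

```csharp
using System;
using System.Collections.Generic;

public static class RemapBuilder
{
    // oldPoints / newPoints: IL offset -> source line, one table per version.
    // lineMap: old source line -> new source line, as tracked by the editor.
    // Result: old IL offset -> new IL offset, for lines that survive the edit.
    public static Dictionary<int, int> Build(
        IDictionary<int, int> oldPoints,
        IDictionary<int, int> newPoints,
        IDictionary<int, int> lineMap)
    {
        // Invert the new table: source line -> smallest IL offset on that line.
        var newLineToOffset = new Dictionary<int, int>();
        foreach (KeyValuePair<int, int> p in newPoints)
        {
            int existing;
            if (!newLineToOffset.TryGetValue(p.Value, out existing) || p.Key < existing)
                newLineToOffset[p.Value] = p.Key;
        }

        var result = new Dictionary<int, int>();
        foreach (KeyValuePair<int, int> p in oldPoints)
        {
            int newLine, newOffset;
            if (lineMap.TryGetValue(p.Value, out newLine) &&
                newLineToOffset.TryGetValue(newLine, out newOffset))
                result[p.Key] = newOffset;
        }
        return result;
    }
}
```

Old offsets whose line was deleted get no entry, which leaves the policy for those cases to the caller.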

 

Things the debugging engine would have to do may include:

  • handle the remaps automatically. This means applying the remap table provided by the language service to do steps 6, 7, 8, and 9 automatically under the covers for the end-user. The debugger can queue up all the source changes, and then make a single batch call to ApplyChanges once the user resumes after making the source changes.
  • build a source-level experience on top of this raw-IL experience. This includes:
    • mapping breakpoints from stale functions to new functions.
    • remapping steppers so that a source-level step-operation completes across remap events.
    • handling stale-code cases (e.g., what if you hit a breakpoint in stale code before you hit a remap opportunity?)
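
The queue-then-batch idea from the first bullet could look something like the sketch below. PendingEdits is a hypothetical type, and the apply delegate stands in for whatever the debugging engine does to push the batch to the runtime.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical queue of compiled deltas waiting to be sent to the runtime.
public sealed class PendingEdits
{
    private readonly List<byte[]> deltas = new List<byte[]>();

    public int Count { get { return deltas.Count; } }

    // Called each time the language service hands over a compiled delta.
    public void Enqueue(byte[] delta) { deltas.Add(delta); }

    // Called when the user resumes: hands a snapshot of every queued delta to
    // `apply` in one call, in the order received, then empties the queue.
    public void FlushOnResume(Action<IReadOnlyList<byte[]>> apply)
    {
        if (deltas.Count == 0) return;
        apply(deltas.ToArray());
        deltas.Clear();
    }
}
```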