Extension Methods and the Debugger


One source of confusion I find myself clearing up a lot is evaluating extension methods in the debugger windows.  Users report that evaluation works sometimes but not others for the exact same piece of code.  Such flaky behavior can only be the result of a poorly implemented feature or subtle user error.  Right?

Unfortunately no.  In this case the behavior described is very possible and “By Design”[1].  It’s an unfortunate fallout from how the debugger works.

Quick review.  Expression evaluators strive to have evaluation parity with the compiler.  So if expression expr is valid at the place the debugger is stopped, expr should also be a valid expression in the immediate, watch, and other debugger windows.  This holds true for extension methods.  For example:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Diagnostics;

namespace ConsoleApplication1 {
    class ExtensionMethodExample {
        public static void Example() {
            var col = new List<int>();
            col.Add(42);
            Debugger.Break();
        }
    }
}

At the Debugger.Break line expressions like ‘col.First()’ are valid and legal (assuming System.Core is referenced).  Hence they should also be available in the debugger windows.  But with extension methods developers will occasionally see the following.

image

Clearly it failed to evaluate but is legal in code so users often interpret this as a bug (and I can’t blame them, the behavior is odd). 

Expression evaluators host the compiler in order to evaluate expressions.  In order to semantically interpret an expression, compilers need symbols to bind to.  In a debugging session symbols are acquired by reading the metadata from the DLLs loaded into the debuggee process.  So the evaluation essentially occurs by referencing the set of DLLs loaded into the debuggee process.

For most expressions this poses no problem.  In order to even have a given value to run an expression off of, its DLL must be loaded, and hence symbols for the type of the value and its members are available.  Extension methods are quite different though in that the target method can, and often does, live in a separate DLL.  So unlike normal members, simply having a value in the debugger does not guarantee that the symbols for its extension methods are loaded into the process.  When they are not, binding fails.

This is the case for the above sample.  The symbols for the value ‘col’ are in mscorlib while the ‘First’ extension method is in System.Core.dll.  If System.Core.dll is not loaded into the process then its symbols are not available and attempts to bind to the LINQ extension methods will fail.
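If you want to confirm that this is what you are hitting, one option is to ask the runtime which DLLs it has loaded.  The helper below is a minimal sketch: the class and method names are my own invention for illustration, and it relies only on AppDomain.CurrentDomain.GetAssemblies().  It deliberately avoids LINQ so that running it does not itself pull in the DLL you are checking for.

```csharp
using System;
using System.Reflection;

static class LoadedAssemblies {
    // Hypothetical helper (not a framework API): returns true when an
    // assembly with the given simple name, e.g. "System.Core", is already
    // loaded into the current AppDomain.
    public static bool IsLoaded(string simpleName) {
        // A plain loop instead of LINQ, so this check does not force
        // the LINQ assembly to load as a side effect.
        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies()) {
            string name = assembly.GetName().Name;
            if (string.Equals(name, simpleName, StringComparison.OrdinalIgnoreCase)) {
                return true;
            }
        }
        return false;
    }
}
```

If LoadedAssemblies.IsLoaded("System.Core") comes back false at the point where you are stopped, that would be consistent with the binding failure described above.  The Modules window in Visual Studio shows the same information without any code.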

This is what makes the behavior appear to be flaky.  The ability to call an extension method is directly related to whether or not its DLL is currently loaded in the debuggee process.  When there is a disconnect between the DLLs available at compile time and those loaded in the process, there is a gap in what can be evaluated.  DLLs are loaded on demand, and if no extension method or other type in the DLL has been used up to the current point in the process, the DLL will not be loaded and hence its extension methods will not be available.
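The on demand loading can be observed from inside the program.  The following is a rough sketch, assuming the AppDomain.AssemblyLoad event is a good enough signal; the class and method names are made up for the example.  Exactly when the runtime decides to load a DLL is an implementation detail, so treat the output as illustrative rather than guaranteed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;

static class OnDemandLoadDemo {
    public static int Run() {
        // Report every assembly the runtime loads from this point on.
        AppDomain.CurrentDomain.AssemblyLoad += OnAssemblyLoad;
        try {
            // Nothing in Run itself refers to a LINQ type; the first
            // use is inside FirstViaLinq.
            return FirstViaLinq();
        } finally {
            AppDomain.CurrentDomain.AssemblyLoad -= OnAssemblyLoad;
        }
    }

    static void OnAssemblyLoad(object sender, AssemblyLoadEventArgs args) {
        Console.WriteLine("Loaded: " + args.LoadedAssembly.GetName().Name);
    }

    // NoInlining keeps the LINQ call out of Run, so the runtime should not
    // need to resolve Enumerable.First until this method is first called.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int FirstViaLinq() {
        var col = new List<int>();
        col.Add(42);
        return col.First();
    }
}
```

If the DLL containing Enumerable was already in the process (for example because the hosting process or earlier code loaded it), no "Loaded:" line appears for it.  Either way, a breakpoint placed before the call to FirstViaLinq is the kind of spot where evaluation of ‘col.First()’ can fail, and one placed after it is where it tends to work.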

What complicates this discussion even further is that the hosting process in Visual Studio has the side effect of hiding this problem for certain DLLs (primarily System.Core).  One of the features of the hosting process is that it preloads a set of DLLs into the debuggee process, including System.Core.dll.  As a result LINQ extension methods are readily available in most projects.  For a normal console application the above won’t ever fail with F5 unless you specifically disable the hosting process in the debug tab of the project properties page.

image

This further adds to the perception that extension methods in the debugger are flaky, since LINQ works but user defined extension methods fail.  It creates additional confusion because the hosting process does not work for all project types (devices, certain types of web projects, etc.) and does not come into play in attach scenarios.
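To make the user defined case concrete, here is a small sketch of an extension method that lives in its own class library.  The DLL name, namespace, class, and method are all hypothetical; the only point is that nothing preloads this DLL for you.

```csharp
using System;
using System.Collections.Generic;

// Imagine this compiled into its own class library,
// e.g. MyCompany.Extensions.dll (a hypothetical name).
namespace MyCompany.Extensions {
    public static class ListExtensions {
        // Hypothetical extension method: returns the last element of the
        // list, or the supplied fallback when the list is empty.
        public static int LastOr(this IList<int> list, int fallback) {
            if (list == null) throw new ArgumentNullException("list");
            return list.Count > 0 ? list[list.Count - 1] : fallback;
        }
    }
}
```

With a using directive for MyCompany.Extensions in scope, ‘col.LastOr(-1)’ is legal at the Debugger.Break line of the earlier sample.  In the debugger windows, though, it should only bind once something in the program has caused that DLL to load, which is the gap described above.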


[1] I do hate using the “By Design” tag to describe a feature as successfully failing but such is life. 

Comments (6)

  1. James Curran says:

    In the example you give, would using the "long form" (Enumerable.First(col)) in the watch window work around the problem?

  2. jaredpar says:

    @James,  

    No.  Even typing the super long form System.Linq.Enumerable.First(col) won't fix the problem.  The reason is that the DLL isn't loaded at all, so that type doesn't exist as far as the expression evaluator is concerned.

  3. Domenic says:

    I don't get it; why doesn't the debugger just load the DLL when it sees the reference to System.Linq.Enumerable.First? Or better still, load all of the referenced assemblies for the current project?

  4. jaredpar says:

    @Domenic,

    There are a couple of reasons why.  

    The main one being that loading a DLL introduces a huge side effect into the process.  The effect is visible to the program and can be acted upon by user code.  So executing a .First() suddenly causes other parts of your program to begin executing.  Probably not what the user expects.

    The next big problem is where do we load the DLL from?  Certainly we know the set of DLLs which have extension methods used in the project and where they were when the compilation happened.  But that does not allow us to load them because:

    1. The DLL referenced at compilation time could be a reference only DLL and have no code.  Loading it into the user process would be simply incorrect and cause the program to function incorrectly

    2. Compilation and execution can happen on different machines so the DLL location is irrelevant

    3. The DLL on disk could be different than the one that was compiled.  

    The other problem is a bit of a catch 22.  Until we have the DLL in memory we cannot bind First because we don't have any symbols for it.  Hence we can't even discover what DLL we're looking for.

  5. terry says:

    @Jared

    For #2, in order for it to compile or execute, the DLL location had to be known, right?

    For #3, the location of the right DLL had to be found or else it would never have compiled, much less executed, so isn't there a way to piggyback on this process to resolve which DLLs need to be loaded?

  6. jaredpar says:

    @terry

    Yes we definitely know where the DLL was when it was compiled.  But execution and compilation occur at two very different times.  Arbitrarily loading it from the compilation path is not a good idea because the program may have a dependency on it loading from the execution location.  Additionally it's very possible the compilation DLL was a reference only DLL and hence has no code.  There is no way to detect this and loading a reference only DLL into the AppDomain would cause the program to stop functioning.