The dark beauty of function evaluation

A recent post in the Visual C# IDE forum (seems like I start a lot of blogs this way :) ) got me thinking about function evaluation (“FuncEval”) while debugging. I don’t think there is any scarier term for a debugger developer – seriously, I’ve seen them cower in fear. FuncEval is, quite simply, the process of invoking a method while stopped in the debugger. This feature is invaluable for managed code debugging for a number of reasons; chief among them is the very simple desire to see the value of properties in the various data windows. Keep in mind that a property is backed by getter and setter methods, so showing its value means calling its getter. FuncEval enables not only property evaluation but also features like debugger visualizers, design-time expression evaluation (and expression evaluation in general), the Object Test Bench, and many others that make the managed world a fun place to be.
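To make the point about getters and setters concrete, here is a small sketch that uses reflection to list the accessor methods the compiler generates for a property. The Person class and its Name property are made up for illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration.
class Person
{
    private string name;

    public string Name
    {
        get { return name; }
        set { name = value; }
    }
}

static class Demo
{
    static void Main()
    {
        PropertyInfo property = typeof(Person).GetProperty("Name");

        // The property is just a pair of ordinary methods under the covers.
        Console.WriteLine(property.GetGetMethod().Name);   // get_Name
        Console.WriteLine(property.GetSetMethod().Name);   // set_Name
    }
}
```

Showing Name in a data window therefore means running get_Name in the program being debugged.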

So, why would a debugger developer hide whimpering in a corner every time a new feature based on FuncEval is introduced? The Visual Studio debugger is designed to allow deep inspection of data at a given point in time in an application’s execution. That ‘point in time’ can be reached via a breakpoint, stepping, an exception being thrown, etc. Now, let’s suppose that the problem is an exception. The debugger may stop the program when the exception is thrown, when it enters user code, or when it goes unhandled – all generally good places to understand what the problem is. At that moment the state of the program reflects the exceptional situation, and the debugger’s job is to give the developer enough information about that state to find out how to prevent it in the future. The problem is that function evaluations can modify that state. This is often referred to as a side effect of the evaluation though, in truth, it could easily be the desired behavior of the evaluation and only a side effect when evaluated in the debugger. An example will probably clarify this a bit:

Code example

class Handle
{
    public bool CheckSomething() { return true; }
}

class BehavingBadly
{
    private Handle handle;

    // Lazily allocates the handle: reading this property changes the object's state.
    public Handle Handle
    {
        get
        {
            if (handle == null)
                handle = new Handle();
            return handle;
        }
    }

    public void DoSomething()
    {
        // Bug: uses the field directly, which is still null at this point.
        if (handle.CheckSomething()) { }
    }
}

class Program
{
    static void Main(string[] args)
    {
        BehavingBadly bb = new BehavingBadly();
        bb.DoSomething();
    }
}

Run this program under the debugger. An exception is thrown in BehavingBadly.DoSomething because handle is null. Let’s suppose that the first thing the developer does is open the locals window and expand ‘this’ to examine the state of the object. At that point the debugger executes the Handle property getter, which sees that the field is null and allocates a new Handle for it. The result is confusing: the program stopped on a NullReferenceException, yet handle now clearly has a value!
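The same effect can be reproduced without a debugger. The sketch below is only a stand-in for what the locals window does: it reads the private field through reflection, invokes the property getter the way an implicit FuncEval would, and reads the field again. The Demo class and its two helper methods are hypothetical names, not part of any debugger API.

```csharp
using System;
using System.Reflection;

class Handle
{
    public bool CheckSomething() { return true; }
}

class BehavingBadly
{
    private Handle handle;

    // Reading the property allocates the field as a side effect.
    public Handle Handle
    {
        get
        {
            if (handle == null)
                handle = new Handle();
            return handle;
        }
    }
}

static class Demo
{
    // Reads a private instance field directly, without running any user code.
    public static object ReadField(object target, string name)
    {
        FieldInfo field = target.GetType().GetField(
            name, BindingFlags.Instance | BindingFlags.NonPublic);
        return field.GetValue(target);
    }

    // Invokes a public property getter, standing in for an implicit FuncEval.
    public static object EvaluateProperty(object target, string name)
    {
        PropertyInfo property = target.GetType().GetProperty(name);
        return property.GetValue(target, null);
    }

    static void Main()
    {
        BehavingBadly bb = new BehavingBadly();

        Console.WriteLine(ReadField(bb, "handle") == null);   // True
        EvaluateProperty(bb, "Handle");                       // the "debugger" looks at the property
        Console.WriteLine(ReadField(bb, "handle") == null);   // False
    }
}
```

Merely looking at the object changed it, which is exactly what makes the state in the locals window misleading.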

It turns out that it’s even more insidious. Function evaluations are order dependent. If an ‘implicit’ function evaluation, like a property, changes state that another field depends on then the value shown in the debugger may not actually reflect the current value of the field. For example:

Code example

class Program
{
    int a;
    int b;

    public int B
    {
        get { a++; return b; }
    }

    static void Main(string[] args)
    {
        Program p = new Program();
    }
}

Set a breakpoint on the closing brace of Main and start debugging. Examine p in the watch window. Notice that the debugger says that the current value of p.a is 0. Now type p.a in the watch window. It says the value of p.a is 1: the debugger read a before it evaluated B, and evaluating B incremented a. This becomes even more interesting from a UI perspective because you may have multiple windows showing the same data (like the locals and watch windows in this example). In that case, should the property be called twice, or should the debugger attempt to cache the value and show that? How long does the value stay cached – what if there is an explicit call, say, in the immediate window that affects its value? The debugger currently reevaluates properties for different views (in VS 2003 a single view, like the watch window, could actually cause a property to be evaluated many times).
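The order dependence can be seen in plain code too. This is a rough sketch that performs the reads in the order described above (field a, then property B, then a again); the Sample class mirrors Program from the example, with a made public only so the demo can read it.

```csharp
using System;

// Mirrors the Program class above; 'a' is public here only for the demo.
class Sample
{
    public int a;
    private int b = 0;

    public int B
    {
        get { a++; return b; }
    }
}

static class Demo
{
    static void Main()
    {
        Sample p = new Sample();

        int shownA = p.a;       // first row of the expansion: a is read as 0
        int shownB = p.B;       // next row: evaluating B silently increments a
        int refreshedA = p.a;   // a later, separate read of a now sees 1

        Console.WriteLine(shownA);       // 0
        Console.WriteLine(shownB);       // 0
        Console.WriteLine(refreshedA);   // 1
    }
}
```

The first value of a was already on screen when B ran, so the two rows for the same field disagree.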

I’m sure it’s becoming clear why debugger developers aren’t huge fans of implicit function evaluation (that is, function evaluation that happens without an explicit user action). It turns out that there is even more bad news. Function evaluation is slow, hundreds of times slower than simply examining a field. It’s also not guaranteed to finish executing; a property evaluation may simply spin forever. Therefore the debugger imposes a 10-second timeout, after which it attempts to abort the function evaluation. The CLR can’t actually guarantee much about function evaluation aborts, so it’s not an operation we want to happen very often. In addition, function evaluation runs in the debuggee itself (which is how it modifies the debuggee’s state). That means that while a function evaluation is occurring the debuggee has an active thread. This can cause problems that are difficult to understand and debug, particularly in multi-threaded applications.
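As a loose analogy for the timeout (this is not how the debugger or the CLR implements it), the sketch below runs an evaluation on a worker thread and gives up waiting after a deadline. FuncEvalTimeout, TryEvaluate and SpinForever are hypothetical names, and the timeouts are shortened so the demo finishes quickly.

```csharp
using System;
using System.Threading;

static class FuncEvalTimeout
{
    // Lets the demo release the runaway worker at the end.
    static volatile bool stop;

    // Stands in for a property getter that never returns on its own.
    static int SpinForever()
    {
        while (!stop) { Thread.Sleep(1); }
        return 0;
    }

    // Runs the evaluation on a worker thread and waits up to 'timeout'.
    // Returns true only if the evaluation finished in time.
    public static bool TryEvaluate(Func<int> evaluation, TimeSpan timeout, out int result)
    {
        int value = 0;
        Thread worker = new Thread(() => { value = evaluation(); });
        worker.IsBackground = true;
        worker.Start();

        bool finished = worker.Join(timeout);
        result = finished ? value : 0;
        return finished;
    }

    static void Main()
    {
        int result;

        bool ok = TryEvaluate(() => 42, TimeSpan.FromSeconds(1), out result);
        Console.WriteLine(ok + " " + result);   // True 42

        bool hung = TryEvaluate(SpinForever, TimeSpan.FromMilliseconds(100), out result);
        Console.WriteLine(hung);                // False

        stop = true;   // the worker was still running after the timeout
    }
}
```

Note what happens in the second call: giving up on the wait does not stop the worker. It keeps running inside the same process until something makes it stop, which hints at why aborting an evaluation is the hard part.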

It may seem like I’m making a case for eliminating FuncEval (or at least implicit FuncEval) entirely. On the contrary, I’m as much to blame as anyone for the increased usage of implicit FuncEval in C# debugging in VS 2005. In VS 2005 I was the PM for the expression evaluator, which is the part of the debugger that understands a particular language (such that it’s possible to evaluate arbitrary expressions like 1+Foo() in the immediate and watch windows). A number of the features that improve data visualization require function evaluation to work. For example, the VS 2005 C# debugger will implicitly call ToString on objects that are displayed in the debugger variable windows. This often gives a significantly better immediate view of the data. Debugger visualizers and type proxies, which are extensible systems for providing your own data visualization, also require function evaluation. Type proxies are used to greatly enhance the display for collection types like Dictionary<TKey, TValue> (where the underlying store isn’t even close to the client’s conceptual model).
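Here is a minimal sketch of both features on a made-up Inventory type: a ToString override for the one-line view, and a type proxy attached with DebuggerTypeProxyAttribute for the expanded view. Inventory, InventoryDebugView and Snapshot are hypothetical names; the attributes come from System.Diagnostics. Both views rely on the debugger running code in the debuggee.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical collection type; when expanded, the debugger shows the proxy instead.
[DebuggerTypeProxy(typeof(InventoryDebugView))]
class Inventory
{
    private readonly Dictionary<string, int> counts = new Dictionary<string, int>();

    public void Add(string item, int quantity)
    {
        int current;
        counts.TryGetValue(item, out current);   // current stays 0 for a new item
        counts[item] = current + quantity;
    }

    // Copies the entries into a flat array, closer to the client's mental model.
    public KeyValuePair<string, int>[] Snapshot()
    {
        return new List<KeyValuePair<string, int>>(counts).ToArray();
    }

    // Called implicitly to produce the one-line value in the variable windows.
    public override string ToString()
    {
        return "Inventory (" + counts.Count + " distinct items)";
    }
}

// The debugger constructs the proxy from the real object and shows its public members.
class InventoryDebugView
{
    private readonly Inventory inventory;

    public InventoryDebugView(Inventory inventory)
    {
        this.inventory = inventory;
    }

    // RootHidden shows the array elements directly, without an extra "Items" node.
    [DebuggerBrowsable(DebuggerBrowsableState.RootHidden)]
    public KeyValuePair<string, int>[] Items
    {
        get { return inventory.Snapshot(); }
    }
}

static class Demo
{
    static void Main()
    {
        Inventory inventory = new Inventory();
        inventory.Add("bolt", 3);
        inventory.Add("nut", 5);
        inventory.Add("bolt", 1);

        Console.WriteLine(inventory);   // Inventory (2 distinct items)
    }
}
```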

The trick, of course, is to harness the power of function evaluation in such a way that, for the vast majority of cases, it makes a program easier to debug instead of harder. That’s why we try to limit implicit function evaluation to methods that should not have side effects. Property getters are very often safe to call because a large number simply return the underlying field. ToString methods should virtually never modify the state of the object.

I’ve got a lot more to say on the subject, including common examples of performance problems that may be introduced by function evaluations, customization options, workarounds for situations where better data visualization is important but you can’t use ToString, when function evaluation doesn’t work, and of course a more convincing explanation as to why implicit function evaluation is a good thing. Unfortunately I’m exhausted so it’ll have to wait for another day…