A day in the life of compiler bugfixing

The VB/C# team is hard at work on the Async feature... You might already have downloaded the Async CTP yourself, or tried out async in the VS11 Developer Preview, and you might wonder: "The feature seems complete already -- so what are they still working on?" This post is about an example bug that I dealt with last week.

Most programmers trust their compilers. If their program's not behaving right, they think their own code is at fault. Even if they narrow it down to what looks like a compiler bug, they still believe it's their own fault -- maybe they think they misunderstand the language, or aren't using it right. It's a big expectation that we in the compiler team have to live up to.

Here's one bug I worked on last week. I discovered this bug because, as part of unrelated performance investigations, I tested out using "Await" in every possible position allowed by the VB language.

Imports System.Threading.Tasks

Module Module1
    Sub Main()
        f().Wait()
    End Sub

    Async Function f() As Task
        Dim c As New C
        Dim t = Task.FromResult(c)

        For (Await t).i = 1 To 10
            Console.WriteLine(c.i)
        Next

        ' WHAT I EXPECT: It should print out the numbers 1 to 10
        ' WHAT I GET: it doesn't print out anything at all, and fails PEVerify
    End Function

    Class C
        Public i As Integer
    End Class
End Module

Loop control variables with side-effects

It looks weird to have an expression as the loop control variable. Nevertheless, as a historical legacy, VB allows it -- it stems from the days when the For loop didn't declare its own local variable and so always had to refer to some pre-existing variable. The important question to ask is when and how often this expression gets executed. Alas the VB language specification (c:\Program Files (x86)\Microsoft Visual Studio 10.0\VB\Specifications\1033) doesn't say. So let's resort to experiment, by making the expression have some observable side-effects:

    ' Declared at module level:
    Dim c As New C
    Function test() As C
        Console.Write("* ")
        Return c
    End Function

    ' Inside a method, with constant loop bounds:
    For test().i = 1 To 2
        Console.Write(c.i & " ")
    Next
    ' WHAT I GET: * 1 * * 2 * *

    ' Inside a method, with variable loop bounds:
    Dim min = 1
    Dim max = 2
    For test().i = min To max
        Console.Write(c.i & " ")
    Next
    ' WHAT I GET: * * 1 * * 2 * *

This is unexpected! It's unexpected that the compiler executes the expression so many times, and unexpected that the exact number of times varies depending on whether the loop bounds were constants or not.

Looking at the compiler implementation, here is what happens: at the start of each iteration it assigns to the loop control variable, and then it reads from the loop control variable to check whether to do another iteration. Both of these steps involve evaluating the expression. But in the first code snippet, with constants for minimum and maximum, it knows it can skip that check the first time around.
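To make the counting concrete, here is a hand-written approximation of the two loops. It is only a sketch: the module name LoopSketch and the Do/Loop shapes are illustrative assumptions, not the compiler's actual lowering, and class C is declared outside the module so that it cannot clash with the field c. Each call to test() prints one star, so the stars line up with the assignment, increment, and check steps described above.

```vb
Imports System

Class C
    Public i As Integer
End Class

Module LoopSketch
    Private c As New C

    Private Function test() As C
        Console.Write("* ")
        Return c
    End Function

    Sub Main()
        ' Approximation of: For test().i = 1 To 2
        ' Constant bounds, so the first check is skipped.
        test().i = 1                    ' initial assignment: one evaluation
        Do
            Console.Write(c.i & " ")    ' loop body
            test().i += 1               ' increment: one evaluation
        Loop While test().i <= 2        ' check: one more evaluation
        Console.WriteLine()

        ' Approximation of: For test().i = min To max
        ' Variable bounds, so the check also runs before the first iteration.
        Dim min = 1
        Dim max = 2
        test().i = min
        Do While test().i <= max
            Console.Write(c.i & " ")
            test().i += 1
        Loop
        Console.WriteLine()
    End Sub
End Module
```

Tracing this by hand gives "* 1 * * 2 * *" for the first loop and "* * 1 * * 2 * *" for the second, which matches the experiment.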

Another strange thing is that the original code used to work fine in the Async CTP, but stopped working in the VS11 Developer Preview. The reason is that the CTP played fast-and-loose with evaluation order. It basically factored out all "Await" expressions from a statement, did them all and assigned the results to temporary variables, and then performed the rest of the statement. This was a quick-and-dirty hack to get the CTP out as soon as possible. But it results in the wrong evaluation order -- for instance, "Console.WriteLine(f() & Await g())" would evaluate g() first, then Await it, then evaluate f().
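The difference in evaluation order can be sketched by writing both orders out by hand. This is an illustration only: the module name EvalOrderSketch, the method names CtpStyle and LeftToRight, and the bodies of f and g are assumptions made up for the example, and the real CTP rewrite happened inside the compiler rather than in source code.

```vb
Imports System
Imports System.Threading.Tasks

Module EvalOrderSketch
    Function f() As String
        Console.WriteLine("f evaluated")
        Return "f"
    End Function

    Function g() As Task(Of String)
        Console.WriteLine("g evaluated")
        Return Task.FromResult("g")
    End Function

    ' Roughly the CTP behaviour: the Await is hoisted into a temporary
    ' before the rest of the statement runs, so g() goes first.
    Async Function CtpStyle() As Task
        Dim awaited = Await g()
        Console.WriteLine(f() & awaited)
    End Function

    ' Left-to-right order: f() is evaluated before g() is called and awaited.
    Async Function LeftToRight() As Task
        Dim first = f()
        Dim second = Await g()
        Console.WriteLine(first & second)
    End Function

    Sub Main()
        CtpStyle().Wait()
        LeftToRight().Wait()
    End Sub
End Module
```

Both methods print "fg" at the end, but CtpStyle reports "g evaluated" before "f evaluated", while LeftToRight reports them the other way round. Once f and g have side effects, that difference is observable.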

The VS11 Developer Preview got correct evaluation order in most cases, but it seems to have missed this edge case of For loops.

Fixing the bug?

It looked like it would take me about 5 days to fix this bug -- which is very long, considering that I'd like to be fixing 2 bugs a day. It would take this long because the loop comparison logic is all in the codegen phase of the compiler, well after the "async transformation". It hardly seemed worth the effort -- no one in their right mind would use this strange corner of the compiler. And VB would probably have been a nicer language if it didn't allow any side effects in its loop control variable expressions.

But we can't just leave the bug unfixed. Instead, we'll add a new paragraph to the language spec and a new error to the compiler to say "You can't use Await in a loop control expression".

Adding this new error felt like an easy decision. There was no need to bring it to the "VB-Insiders", a set of elite users under Non-Disclosure Agreements with whom we discuss ideas. There was no need even to bring it to a regular VB Language Design Meeting. My call (as the VB Language Lead) was that it was sufficient just to send an email around the compiler team. Everyone agreed to just add the error.

The next step was to finalize the wording of the error message with the User Experience team, who are also in charge of writing MSDN documentation about the language. They have to pick a wording that's similar to the existing error messages, and one that can be translated into different world languages. They settled on this:

BC37060: 'Await' cannot be used in a loop control variable expression
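If someone does hit this error, one possible manual rewrite is to do the Await once before the loop and use the result as the loop control variable. This is a sketch under assumptions, not an officially recommended fix: the module name WorkaroundSketch and the local name target are made up, and awaiting once up front is not the same as evaluating the expression repeatedly, as the experiment earlier showed the compiler does.

```vb
Imports System
Imports System.Threading.Tasks

Module WorkaroundSketch
    Class C
        Public i As Integer
    End Class

    Async Function f() As Task
        Dim c As New C
        Dim t = Task.FromResult(c)

        ' Await once, before the loop, instead of inside the control variable.
        Dim target = Await t
        For target.i = 1 To 10
            Console.WriteLine(c.i)
        Next
    End Function

    Sub Main()
        f().Wait()
    End Sub
End Module
```

Because target and c refer to the same object here, this should print the numbers 1 to 10, which is what the original example expected.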

After this, the VB IDE team asked: should we make a "quick-fix" for this error, i.e. a one-click way to automatically fix the user's code if they wrote this? The answer was "definitely not".

There we have it. Bug fixed in half a day, on schedule.