Testing the C# Compiler #2

Lots of questions were sent my way after the last post about testing the compiler. Many of them can be answered with a few example tests that show real code and demonstrate how a test works in our world. So that's what I'll attempt to do in this post. I'm keeping the tests as simple as possible, since complicating the code in the test would serve no purpose here.

Also, I'd love to get some feedback on how others are testing compilers. I know there are a lot of people out there who work on similar projects and I'm hoping to find ways of sharing our best practices, ideas, processes, etc. So, please contribute ideas, questions, thoughts, etc. either via email or through the comments (which I'm staying with for now).

Positive & Negative Examples
Let's start off with a very simple positive test. Here's what the code would look like for what may be one of the simplest tests we'd have:

// <Title>A simple demonstration test (positive)</Title>
// <Expects Status=Success/>
class Test
{
    static int Main()
    {
        int a = 3, b = 4;
        if (a + b == 7)
            return 0;
        return 1;
    }
}

Things to note:

  • We use a sort-of-xml-syntax within comments to include test metadata in the actual test's source code. I might at some point go into why we do this if there's more than the obvious to talk about.
  • The Expects element has an attribute called "Status." That's what makes this a positive test. A value of "Success" tells our harness we're expecting this code to compile without errors and produce an executable that returns zero on exit.
  • The actual code is a very simple class that contains only a Main method. The method declares two int variables, adds them in an if statement, and checks whether the sum equals the literal 7. If it does, the app exits with an exit code of zero. Otherwise it returns one, representing failure.
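To make the metadata format concrete, here is a minimal sketch of how a harness could pull the Title and Expects elements out of a test's header comments. It is not the real harness code: the TestMetadata and MetadataParser names, and the choice to parse with regular expressions, are assumptions made purely for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical container for the metadata found in a test's header comments.
sealed class TestMetadata
{
    public string Title = "";
    public string Status = "";   // "Success" or "Error"
    public string Pattern = "";  // regex for the expected compiler output; empty for positive tests
}

// Hypothetical parser; the real harness may read the metadata quite differently.
static class MetadataParser
{
    // Matches a line such as: // <Title>...</Title>
    static readonly Regex TitleRx = new Regex(
        @"^\s*//\s*<Title>(.*?)</Title>", RegexOptions.Multiline);

    // Matches either: // <Expects Status=X/>
    //             or: // <Expects Status=X>pattern</Expects>
    static readonly Regex ExpectsRx = new Regex(
        @"^\s*//\s*<Expects\s+Status=(\w+)\s*(?:/>|>(.*?)</Expects>)", RegexOptions.Multiline);

    public static TestMetadata Parse(string source)
    {
        var meta = new TestMetadata();

        Match title = TitleRx.Match(source);
        if (title.Success)
            meta.Title = title.Groups[1].Value;

        Match expects = ExpectsRx.Match(source);
        if (expects.Success)
        {
            meta.Status = expects.Groups[1].Value;
            // Group 2 is empty when the self-closing form is used.
            meta.Pattern = expects.Groups[2].Value;
        }
        return meta;
    }
}

static class Program
{
    static void Main()
    {
        string source =
            "// <Title>A simple demonstration test (positive)</Title>\n" +
            "// <Expects Status=Success/>\n" +
            "class Test { static int Main() { return 0; } }\n";

        TestMetadata meta = MetadataParser.Parse(source);
        Console.WriteLine(meta.Title + " | " + meta.Status);
    }
}
```

Because the metadata lives in ordinary comments, the compiler ignores it completely and the test file stays a valid C# program on its own.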

Here's a simple example of a test that is expected to generate a compiler error:

// <Title>A simple demonstration test (negative)</Title>
// <Expects Status=Error>CS0029.*string.*int</Expects>
class Test
{
    static int Main()
    {
        return "fail";
    }
}

The Expects element now has an "Error" value for its Status attribute. This tells our harness we're expecting a compiler error and no generated binary.

The error in the actual code will generate the following output from the compiler: "error CS0029: Cannot implicitly convert type 'string' to 'int'." The string "CS0029.*string.*int" in the Expects element is a regular expression that our harness matches against the compiler's actual output.

If the compiler does not generate an error or generates an error but the error's text does not match the given regular expression, the test fails.
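The pass/fail rules for both kinds of test can be summarized in a few lines. The sketch below is only an illustration of the rules described above. The Verdict class, its parameters, and the idea of passing in the compiler result as plain values are all assumptions, not the harness's real interface.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper that applies the rules described in the text.
static class Verdict
{
    // status:            value of the Expects element's Status attribute
    // pattern:           regex from the Expects element (unused for "Success")
    // compilerSucceeded: true if the compiler reported no errors
    // compilerOutput:    text the compiler printed
    // programExitCode:   exit code of the generated executable, or null if nothing was run
    public static bool Passes(
        string status,
        string pattern,
        bool compilerSucceeded,
        string compilerOutput,
        int? programExitCode)
    {
        if (status == "Success")
        {
            // Positive test: clean compile, and the executable exits with zero.
            return compilerSucceeded && programExitCode == 0;
        }

        if (status == "Error")
        {
            // Negative test: the compile must fail, and the output must match the pattern.
            return !compilerSucceeded && Regex.IsMatch(compilerOutput, pattern);
        }

        throw new ArgumentException("Unknown status: " + status);
    }
}

static class Program
{
    static void Main()
    {
        string output = "error CS0029: Cannot implicitly convert type 'string' to 'int'";
        bool ok = Verdict.Passes("Error", "CS0029.*string.*int", false, output, null);
        Console.WriteLine(ok ? "PASS" : "FAIL");
    }
}
```

Note that a negative test fails in two distinct ways here: when the code unexpectedly compiles, and when it fails to compile for a reason other than the one the test names.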

One thing to note is that the only text that is matched for this test is the actual error code (e.g. CS0029) and the values of all "fill-in" strings (e.g. "string" and "int"). There are a few reasons for this:

  • While under development, the actual text we use for errors can change significantly. The developer who adds the error to the compiler usually comes up with text that he feels best describes the error. Usually, that text sticks. But sometimes our QA team will find a scenario where the same error gets generated but the original text no longer describes the problem clearly. At that point, we either come up with new text that covers all scenarios or add a new, more specific error for the new scenario. As our internal partners start updating their toolsets with our new compiler, they also sometimes find error messages that they feel could be improved. The point is that the text may change many times throughout a product cycle. So to reduce the tests' maintenance cost, we check only the parts of an error message that are least likely to change.
  • We have localized versions of the compiler also. That means we have versions of the compiler that generate errors in Japanese, German, Spanish, etc. The error code and fill-in strings are always the same, no matter what language the errors are translated into. This allows us to run the same exact tests on all localized versions.
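A small experiment shows why matching only the error code and the fill-in strings holds up when the wording changes. In the sketch below, the first message is the one quoted earlier. The reworded message and the bracketed "translated text" placeholder are invented for illustration and are not real compiler output. The MessageMatchDemo class is likewise hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class MessageMatchDemo
{
    // The pattern from the negative test's Expects element.
    public const string Pattern = "CS0029.*string.*int";

    public static bool Matches(string compilerOutput)
    {
        return Regex.IsMatch(compilerOutput, Pattern);
    }

    static void Main()
    {
        string[] outputs =
        {
            // Message quoted in the post.
            "error CS0029: Cannot implicitly convert type 'string' to 'int'",
            // Invented rewording: same code and fill-ins, different text.
            "error CS0029: No implicit conversion exists from 'string' to 'int'",
            // Invented stand-in for a localized message.
            "error CS0029: [translated text] 'string' [translated text] 'int'",
            // Same error code but different fill-ins: should not match.
            "error CS0029: Cannot implicitly convert type 'double' to 'bool'",
            // A different error entirely: should not match.
            "error CS0103: The name 'x' does not exist in the current context",
        };

        foreach (string output in outputs)
            Console.WriteLine((Matches(output) ? "match    " : "no match ") + output);
    }
}
```

The first three messages match and the last two do not, so the same test keeps passing through rewording while still catching a wrong error or wrong types.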

That's where I'll stop this time. There are a lot of details I'm not getting into, and this is somewhat oversimplified compared to what our tests really look like. But it should provide the overall idea of how things work. Also, I should point out that we have several types of tests in our suite other than these two. There are cases where we have to verify the actual IL that gets emitted, verify other types of output (.DLLs, .NETMODULES, XML Doc contents, debug info (PDB files)), and many others. But these two types make up the majority of the tests we use today, and knowing them will make it easier for me to talk about compiler testing going forward.