Managing a Huge Test Matrix with Automation

Hi, my name is Matt Travis and I’m an SDET on the Visual C++ Front End team.  In addition to my work on the Front End, I’ve been working on an automated rolling test system to help manage the huge test matrix we have in Visual C++. 

Our next release, Orcas, will have to run on the usual slew of Windows versions, processors, and languages. We also have to take into account things like running as Normal User as opposed to Admin, running side by side with older versions of Visual Studio, and running different SKUs of Visual Studio (like Express or Pro or VSTS).  We have many test suites comprising hundreds of thousands of individual test cases that have to be run in all these different configurations, and these suites often have several different modes they need to be run in as well.  All these test suites and configurations have to be matched up, somehow, on a finite set of lab machines.  As you might guess, doing a full test pass can be a time-consuming, resource-intensive process.  In the past, it’s taken us a month or more to complete a full test pass, and one of our goals for Orcas is to reduce that time.

To help achieve this goal, I’ve been working on a test scheduler that ties together many of our test tools and automates as much of the process as possible.  It’s part of our rolling test system, which keeps our tests constantly running against the current build of Orcas so that we have a (semi) real-time assessment of the quality of our product.  Previously, we used a tool which would generate a tracking entry for each suite we wanted to run.  Then a person would have to look at each of these entries and use a second tool to generate a test run in yet another tool, which would go out and grab a lab machine, re-image it, install the product on it, and then run the tests.  Once that was done, someone would have to update the tracking entry with the results, and then the test suite owner would have to investigate those results and close the tracking entry.  This process was very prone to delay because of all the manual intervention required. It also didn’t make very efficient use of our lab resources, because we would usually end up re-imaging and reinstalling everything on a machine for each individual suite.

The new scheduler generates the list of test cases we want to run and creates tracking entries much as we did before. It then analyzes this list, buckets together the test suites that can be run on the same OS, hardware, language, etc., and automatically creates batch runs in a format that our test running tool can understand.  This allows us to use our lab machines more efficiently, since a machine only has to be re-imaged and installed once per batch rather than once per suite, and it greatly speeds up the process of creating test runs since no manual intervention is required.  Also, when we find a problem with our automation, we can fix it in one place and all the rest of the suites benefit from that fix.

This alone helped us greatly increase our test throughput, but we also added an automated test retirement queue to the end of the cycle to speed things up even further.  The retirement queue waits for our test running tool to generate results.  It then compares the results to prior results for that particular test suite and, if the results match within an acceptable threshold, it automatically dumps the results into our test history and closes the tracking entry without any human intervention.  If the results are not good enough, it notifies the suite owner and automatically reserves one of the test run machines for investigation, so the owner doesn’t have to spend a ton of time just getting the right machine configuration set up.
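
To make those two ideas a bit more concrete, here is a rough Python sketch of the bucketing step and the retirement check.  This is not our actual scheduler code: the class names, the function names, the fields that make up a machine configuration, and the 2% default threshold are all assumptions I made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineConfig:
    """Hypothetical description of what a lab machine must look like."""
    os: str
    arch: str
    language: str
    sku: str


@dataclass
class SuiteRun:
    """Hypothetical pairing of a test suite with the config it needs."""
    suite: str
    config: MachineConfig


def bucket_by_config(runs):
    """Group suites that need an identical machine configuration.

    Each bucket can then become one batch run, so the machine is
    imaged and installed once for the whole bucket.
    """
    buckets = defaultdict(list)
    for run in runs:
        buckets[run.config].append(run.suite)
    return dict(buckets)


def can_auto_retire(current, baseline, threshold=0.02):
    """Decide whether a finished run can be closed without a human.

    `current` and `baseline` map test case name -> outcome string.
    The run is retired when the fraction of baseline test cases whose
    outcome changed (or went missing) is at most `threshold`.
    The threshold value here is an illustrative assumption.
    """
    if not baseline:
        # No prior results to compare against, so a person should look.
        return False
    changed = sum(
        1 for name, outcome in baseline.items()
        if current.get(name) != outcome
    )
    return changed / len(baseline) <= threshold
```

In this sketch, a run that fails `can_auto_retire` is the point where the real system would notify the suite owner and reserve a machine for investigation.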


The combination of these two improvements has cut our test pass times from over a month to two or three weeks.  It’s also allowed us to run our tests throughout the Orcas development cycle. In previous products we would typically ramp up our test runs at the end of the cycle, which left us with a giant backlog of test run investigations right when we could least afford it.  And it’s allowed people to spend less time on some of the more mundane tasks and more time doing more interesting stuff, like blogging!