Testing Systems

This is the third in my series on test harnesses. In this post, I'll talk about systems that do much more than simple test harnesses. A test harness provides a framework for writing and executing test cases, and its focus is on the actual execution of those test cases. For complex testing tasks, this is insufficient. A test system can enhance a test harness by supplying a back end for results tracking and a mechanism to automatically run the tests across multiple machines.

A test harness provides a lot of functionality to make writing tests easier, but it still requires a lot of manual work to run them. Typically a test application will contain dozens or even hundreds of test cases. When run, it will automatically execute all of them. It still requires a person to set up the system under test, copy the executable and any support files to the system, execute them, record results, and then analyze those results. A test system can be used to automate these tasks.

The most basic service provided by a testing system is that of a database. The test harness will log test results to a database instead of (or in addition to) a file on the drive. The advantages to having a database to track test results are numerous. The results can be compared over time. The results of multiple machines running the same tests can be combined. The aggregate pass/fail rate can be easily determined. An advanced system might even have the ability to send mail or otherwise alert users when a complete set of tests is finished or when certain tests fail.
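To make the idea concrete, here is a minimal sketch of a results back end, assuming SQLite as the store. The table layout and the names open_results_db, log_result, pass_rate, and failures are all hypothetical, invented for illustration; a real test system would have its own schema and would likely use a shared database server rather than a local file.

```python
import sqlite3


def open_results_db(path=":memory:"):
    """Open (or create) a results database. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               run_id    TEXT,
               machine   TEXT,
               test_name TEXT,
               passed    INTEGER,
               logged_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def log_result(conn, run_id, machine, test_name, passed):
    """Called by the harness after each test case instead of writing to a file."""
    conn.execute(
        "INSERT INTO results (run_id, machine, test_name, passed) "
        "VALUES (?, ?, ?, ?)",
        (run_id, machine, test_name, int(passed)),
    )
    conn.commit()


def pass_rate(conn, run_id):
    """Aggregate pass rate across every machine in a run (None if no results)."""
    row = conn.execute(
        "SELECT AVG(passed) FROM results WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row[0]


def failures(conn, run_id):
    """List (machine, test_name) pairs that failed, e.g. to drive a mail alert."""
    return conn.execute(
        "SELECT machine, test_name FROM results "
        "WHERE run_id = ? AND passed = 0 ORDER BY machine, test_name",
        (run_id,),
    ).fetchall()
```

Because every machine logs to the same table under the same run_id, combining results from multiple machines is just a query, and comparing results over time is a matter of querying more than one run_id.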

It is imperative that any database used to track testing data have good reporting capabilities. The first database I was forced to use to track test results was, unfortunately, not strong in the reporting area. It was easy to log results to the database, but trying to mine for information later was very difficult. You basically had to write your own ASP page which made your own SQL calls to the database and did your own analysis. I lovingly called this system the "black hole of data." A good system has a query builder built in (probably on a web page) which lets users get at any data they want without the necessity of knowing the database schema and the subtleties of SQL. The data mining needs to go beyond simple pass/fail results. It is often interesting to see data grouped by a piece of hardware on a machine or a particular OS. The querying mechanism needs to be flexible enough to handle pivoting on many different fields.
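As a rough sketch of what pivoting on a field might look like behind such a query builder, the function below groups pass rates by whichever column the user picks, so the user never writes SQL. The schema, the column names (os, gpu, and so on), and the function names are hypothetical, chosen only for this example.

```python
import sqlite3

# Columns a user is allowed to pivot on (hypothetical schema). Checking the
# field against this set keeps arbitrary text out of the SQL string.
PIVOT_FIELDS = {"machine", "os", "gpu", "test_name"}


def make_results_db():
    """Create an in-memory results table for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results ("
        "machine TEXT, os TEXT, gpu TEXT, test_name TEXT, passed INTEGER)"
    )
    return conn


def pass_rate_by(conn, field):
    """Return {field value: pass rate} for the chosen pivot field."""
    if field not in PIVOT_FIELDS:
        raise ValueError(f"cannot pivot on {field!r}")
    query = (
        f"SELECT {field}, COUNT(*), SUM(passed) "
        f"FROM results GROUP BY {field} ORDER BY {field}"
    )
    return {key: passed / total for key, total, passed in conn.execute(query)}
```

A web front end would only need to offer the entries of PIVOT_FIELDS in a drop-down and render the returned dictionary as a table.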

Another feature often provided by a test system is the ability to automatically run tests across a pool of machines. For this to work, a specified set of machines is set aside for use by the testing system. Upon a specified event, the test system invokes the test harness on specific machines, where it executes the tests and records the results back to the database. These triggering events might be a specified time, the readiness of a build of the software, or simply a person manually scheduling a test.
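The dispatch step can be sketched in a few lines. This is only an illustration under simplifying assumptions: the Job record, the round-robin assignment, and the invoke_harness callback are all hypothetical, and a real system would launch the harness remotely and run machines in parallel rather than looping in-process.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Job:
    machine: str   # machine from the reserved pool
    suite: str     # test application to run there
    trigger: str   # what caused the run: a time, a new build, a manual request


def schedule(trigger, suites, pool):
    """Assign each suite to a pool machine in round-robin order."""
    if not pool:
        raise ValueError("no machines have been set aside for testing")
    machines = cycle(pool)
    return [Job(next(machines), suite, trigger) for suite in suites]


def run_jobs(jobs, invoke_harness):
    """Invoke the harness for each job and collect (machine, suite, passed).

    invoke_harness(machine, suite) is a stand-in for however the test
    system actually starts the harness on a remote machine.
    """
    return [
        (job.machine, job.suite, bool(invoke_harness(job.machine, job.suite)))
        for job in jobs
    ]
```

The same schedule function serves all three kinds of trigger; only the code that decides when to call it differs.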

Part of this distributed testing feature is preparing the machines for testing. This may involve restoring a drive image, copying down the test binaries and any supporting files they might need, and conducting setup tasks. Setup tasks might be setting registry entries, registering files, mapping drives, and installing drivers. After the tests are run, the testing system will execute tasks to clean up and restore the machine to a state ready to run more tests.
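One way to structure the prepare, run, and clean up sequence is a context manager, so that cleanup happens even when a test run blows up partway through. The sketch below assumes each setup or cleanup task is a plain callable taking a machine name; prepared_machine and run_on_machine are hypothetical names, and the real tasks (restoring an image, copying binaries, setting registry entries) are left to whatever the test system provides.

```python
from contextlib import contextmanager


@contextmanager
def prepared_machine(machine, setup_tasks, cleanup_tasks):
    """Run setup tasks, hand over the machine, then always run cleanup."""
    try:
        for task in setup_tasks:      # e.g. restore image, copy test binaries
            task(machine)
        yield machine
    finally:
        for task in cleanup_tasks:    # restore a state ready for more tests
            task(machine)


def run_on_machine(machine, setup_tasks, run_tests, cleanup_tasks):
    """Prepare the machine, run the tests on it, and clean up afterwards."""
    with prepared_machine(machine, setup_tasks, cleanup_tasks):
        return run_tests(machine)
```

Putting cleanup in the finally clause is a design choice: a crashed test run should not leave a machine in the pool in an unknown state for the next job.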

Having a testing system with these capabilities can be invaluable on a large project. It can be used to automate what we at Microsoft call BVTs or Build Verification Tests. These are tests that are run at each build and verify basic functionality before more extensive manual testing is done. Through the automation of distributed testing, substantial time can be saved setting up machines and executing tests. People can spend more time analyzing results and investigating failures instead of executing tests.

It is important that I note here the downside of testing systems. Once you have a full-featured testing system, it is tempting to try to automate everything. Rather than spending money having humans run tests, it is possible to just use the test system to run everything. This is fine to a point, but beyond that it is dangerous. Over-automating has two downsides. First, it means you'll miss bugs. Remember, once you have run your automated tests the first time, they will never, ever find a new bug. They may find a regression, but if they missed a bug the first time, they will keep missing it. As I've discussed before, it is imperative to have someone manually exploring a feature. Second, it means that your testers will not develop a solid understanding of the feature and thus will be less able to find bugs and help with investigation. When a system is overly automated, testers tend to spend all of their time working with the test system and not with the product. This is a recipe for disaster.

When used properly, a good test harness coupled with a good test system can save substantial development time, increase the amount of test coverage you can achieve in a given period of time, and make understanding your results much easier. When used poorly, they can lull you into a false sense of security.