Code Coverage

Measuring code coverage is often perceived as a good measure of test quality. It is not. Good tests find problems when changes are made to the code. But if you just want high code coverage you can easily write a number of tests that call all your methods without checking the results. The only thing high code coverage values really tell you is that the code at least does not crash with the given input.
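
To make that concrete, here is a minimal sketch of such a test (the Account class is made up for illustration, and I'm assuming MSTest attributes):

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical class under test.
public class Account
{
    public decimal Balance { get; private set; }

    public Account(decimal balance)
    {
        Balance = balance;
    }

    public void Withdraw(decimal amount)
    {
        Balance -= amount;
    }
}

[TestClass]
public class AccountCoverageTest
{
    // Calling Withdraw marks all of its lines as covered...
    [TestMethod]
    public void Withdraw_JustForCoverage()
    {
        var account = new Account(100m);
        account.Withdraw(30m);
        // ...but with no assertion on account.Balance, this test
        // passes even if Withdraw is completely wrong.
    }
}
```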

If you are using BDD/TDD, however, code coverage values might be of interest. For example, if you do not have 100% function coverage (i.e. not 100% of the functions are called), then you aren't really using BDD/TDD, are you? Because how could you write a method that is not called by anyone? Well, actually you might have created methods that are never called by your tests. Many TDD practitioners use a rule called "too simple to test" which applies to really simple, property-like methods. I don't really like that philosophy, but more on that in a later post.
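
To illustrate what "property-like" means, here is a hypothetical example of the kind of method that rule would exempt:

```csharp
// A hypothetical "property-like" method: no branching, no logic,
// it just exposes state. The "too simple to test" rule says a
// dedicated test for it adds nothing.
public class Order
{
    private readonly decimal total;

    public Order(decimal total)
    {
        this.total = total;
    }

    public decimal Total
    {
        get { return total; }
    }
}
```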

So now you think that with 100% function coverage you will also have 100% line coverage with BDD/TDD, right? Well, yes and no. Typically you don't, since you will add error handling with logging that never executes because the errors never occur in your tests. It might be that you open a file, and if that fails you log an error and exit your application. This never happens in the tests since you always manage to open the file there. So how did those lines get in there if they are never called? Well, another rule often used by TDD practitioners is that you "should not do stupid things": if a system call may fail, you check the result whether there is a test for it or not. With dependency injection you'll probably get close to 100%, but there is no point in bending over backwards to achieve high code coverage.
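
As a sketch of the kind of lines that stay uncovered (the config-loading scenario here is just an example of mine):

```csharp
using System;
using System.IO;

public static class ConfigLoader
{
    public static string LoadConfig(string path)
    {
        try
        {
            // Every test provides a file that exists, so the happy
            // path is always the one that runs.
            return File.ReadAllText(path);
        }
        catch (IOException ex)
        {
            // These lines are never reached in the tests, which is
            // why line coverage stops short of 100%.
            Console.Error.WriteLine("Could not open config file: " + ex.Message);
            Environment.Exit(1);
            return null; // unreachable, but required by the compiler
        }
    }
}
```

With dependency injection, the file access would sit behind an interface that a test can replace with a fake that throws, which is how you get close to covering even the catch block.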

Does this mean there is no point in measuring code coverage? I think it is great to measure code coverage if you use the result correctly. You should not add more tests just to increase coverage, since tests added only to increase coverage tend to merely exercise code rather than test anything interesting. But low code coverage when using BDD/TDD is definitely a warning signal that something is wrong: the team is not using the methodology correctly. So what are OK coverage levels? From personal experience, anything below the levels listed in the table below should be considered bad, since you should have no problem at all achieving those values.

Coverage type    Without dependency injection    With dependency injection
Function         90%                             99%
Line             75%                             95%

But sometimes there is someone (usually a manager) who thinks coverage should be above some level. So even though you know those tests will not really be useful, you have to add more tests for coverage. What do you do? Either you try to ignore the coverage aspect and just add more tests that test interesting things, or you could try using Pex. Pex is an automated exploratory testing tool from Microsoft Research. It can be used to create a small test suite with high code coverage, and from only a few simple examples I get the impression it is quite good at finding border cases in your code. It will not, however, replace the traditional tests/specifications written as part of your TDD/BDD process. But it can help you test some cases you did not think of, and that way increase code coverage even more without any extra effort from you. At least it is better than adding coverage tests by hand.
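
As a rough sketch of what a Pex parameterized test can look like (I'm going from the Microsoft.Pex.Framework attribute names as I understand them; the property being checked over string.Substring is just an example of mine):

```csharp
using Microsoft.Pex.Framework;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
[PexClass(typeof(string))]
public partial class SubstringTest
{
    // Pex generates the input values itself, exploring the branches
    // of Substring to find border cases (empty string, start == Length, ...).
    [PexMethod]
    public void SubstringKeepsTheTail(string text, int start)
    {
        PexAssume.IsNotNull(text);
        PexAssume.IsTrue(start >= 0 && start <= text.Length);

        string tail = text.Substring(start);

        // The tail must contain exactly the characters after start.
        Assert.AreEqual(text.Length - start, tail.Length);
    }
}
```

The test is partial because Pex generates the concrete, input-specific test cases in a separate file, so they show up in your ordinary test runner alongside your hand-written tests.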