Newsflash: misuse of quality metrics


I’ve seen a few posts recently on bad software metrics. What I find interesting is that the points made in the posts are the same points that I’ve been reading (and teaching) for years. I guess the old saying about history, learning and doom is spot on.


Metrics have always been a passion of mine. I co-designed (and sometimes teach) a course on metrics at MS, and speak about metrics occasionally at conferences. In a way, this is a “me too” post, but the other postings on the subject seem to miss the mark a bit. I’m not linking to protect the innocent.


Let’s take two examples I’ve read about this week – code coverage and test pass rates. These are both metrics that show up on the short list of just about every team I work with. For the record, neither has anything to do with quality – using coverage alone as a measure of test quality, or test pass rates as a measure of code quality, is a silly thing to do. But you still should measure both – I just don’t really care what the numbers are. I see teams with goals of reaching X% code coverage and Y% test pass rates, but those goals are the wrong way to use these numbers. I’ve said many times before that all 80% statement coverage tells you is that 20% of your code is completely untested (not to mention that there are plenty of bugs left in the 80% you have covered). Test pass rates are just as useless on their own. It is not uncommon for teams at MS to run a million test cases, and on a million test cases a 99% pass rate still leaves 10,000 failures. Those failures could include known (punted) bugs, bugs in the tests themselves, and perhaps even a showstopper or two.
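
To make that arithmetic concrete, here is a quick back-of-the-envelope sketch in Python (the suite size and pass rate are hypothetical, chosen to match the example above):

    # A large suite with a "good looking" pass rate still leaves a big pile
    # of failures that someone has to investigate.
    total_tests = 1_000_000
    pass_rate = 0.99

    failures = round(total_tests * (1 - pass_rate))
    print(f"{pass_rate:.0%} pass rate on {total_tests:,} tests leaves {failures:,} failures")
    # -> 99% pass rate on 1,000,000 tests leaves 10,000 failures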


If you are measuring code coverage and test pass rates, here is how I suggest you use them. For code coverage, set a goal of reviewing and understanding 100% of the uncovered code blocks. Uncovered code can reveal where additional testing may be needed. Similarly, for test pass rates, your goal should be to investigate and understand the cause of 100% of the failures. For the nitpickers: of course you still need to test the covered part of the code, and do some work to confirm that your “passing” tests are indeed passing and are testing real user scenarios.
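
As a rough sketch of what that review-driven approach might look like in practice, here is some Python; the data shapes and names are hypothetical placeholders, since every coverage tool and test harness emits something different:

    # Sketch: turn raw coverage and test-run data into "things a person must
    # review" lists. The input structures below are hypothetical.

    def uncovered_blocks(coverage_report):
        """Yield (file, block) for every code block the tests never executed."""
        for file_name, data in coverage_report.items():
            for block in data["uncovered_blocks"]:
                yield file_name, block

    def unexplained_failures(test_results, known_issues):
        """Yield failing tests that nobody has investigated or explained yet."""
        for test in test_results:
            if test["outcome"] == "fail" and test["name"] not in known_issues:
                yield test["name"]

The point is to drive both lists to “reviewed and understood”, not to push a percentage toward an arbitrary target.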


All I ask is that when you decide to measure something, you do it for the right reasons. One good litmus test for any metric expressed as a percentage is whether you can justify the target. When I ask teams why they chose 75% as a code coverage goal, in just about every case (if they know at all), the answer is “it just seemed like a good number”. If the goal doesn’t make sense, or is just a “feel good” number, you’re probably measuring the wrong thing.


Comments (4)

  1. Adam Goucher says:

    When I talk about code coverage in my class, I do it in the following ways:

    • measure it, but use it not as a target number, but as a means to answer ‘where are we now?’ and ‘are we okay with that?’ Coverage dropped 5% last week: why? It increased by 5%: way to go!

    • 100% coverage all the time is a myth, and often you have to do way too many shenanigans to achieve it. Aim for the highest bang-for-buck number, and no higher. What that number is will vary from team to team, and no one can predict what it will be before they achieve it.

  2. Srinivas says:

    Hi Alan,

    I read your article on the misuse of quality metrics and I have some doubts. Could you please clarify: you say to measure code coverage and test pass rates, but how do we identify the uncovered code blocks?

    I am looking forward to your reply.

  3. gstaneff says:

    Test Pass Rates tell you nothing about the severity of the defects discovered by failures, the importance of the scenarios blocked by failures, or the number of bugs or scenarios that are not being guarded (for or against) because those tests just don’t exist. For instance, if the test population is dominated by one kind of test (e.g. most testing was performed against localhost, with very few tests run against a remote machine), there can be a very high pass rate despite an entire piece of critical functionality being broken. Maximizing Test Pass Rate does not necessarily lead to maximizing product quality.

    Code Coverage is a measure of existence, not correctness. Using code coverage as part of test’s readiness-to-ship calculus is an improper use of this metric. Using Code Coverage to identify product code that isn’t needed to satisfy the necessary user scenarios is a proper use of the metric, but it is thwarted when test teams artificially inflate their code coverage in order to drive to some arbitrary coverage target.

    Test pass rates are unable to tell you what you have not tested, what you have not tested well, or how important the failures are relative to the successes.

    Code Coverage is unable to tell you what you have not tested well, nor can it identify unnecessary code in cases where tests were written with the intent of maximizing code coverage.

  4. Alan Page says:

    Srinivas – Geoff kind of answered your question already. Basically what I’m saying is that measuring code coverage is good so that you can find what areas of code haven’t been touched, but setting a goal of reaching some number rarely drives the right behavior.
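
    For example, on a Python codebase you could use coverage.py to list the lines your tests never executed; this is only a minimal sketch (run_my_tests is a stand-in for your real test entry point, and the exact API may vary between coverage.py versions), and other languages have analogous tools:

        import coverage

        cov = coverage.Coverage()
        cov.start()
        run_my_tests()                   # stand-in for invoking your test suite
        cov.stop()
        cov.save()
        cov.report(show_missing=True)    # the "Missing" column lists untouched lines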
