Perceived vs. Objective Quality

I recently heard this story, but I can't recall who told it to me. I have no proof of its veracity, so it may be apocryphal. Nevertheless, it illustrates an important point that I believe holds whether or not the story itself is true.

As the story goes, in the late 1990s, several Microsoft researchers set about trying to understand the quality of various operating system codebases. The systems in question were Linux, Solaris, and Windows NT. The perception among the IT crowd was that Solaris and Linux were of high quality and Windows NT was not. The researchers wanted to test that perception objectively and understand why NT was considered worse.

They used many objective measures of code quality to assess the three operating systems: things like cyclomatic complexity, depth of inheritance, static analysis tools such as lint, and measurements of coupling. Without debating the exact value of this sort of approach, there are reasons to believe these sorts of measurements are at least loosely correlated with defect density and code quality.
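
To make one of those metrics concrete, here is a minimal sketch of cyclomatic complexity in Python, using only the standard library: start at 1 and add a point for every branch in the code. Real tools count a few more constructs, so treat this as an illustration rather than a reference implementation.

```python
import ast

# Node types that introduce a decision point in the control flow.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.Assert)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra decision points.
            complexity += len(node.values) - 1
    return complexity

snippet = """
def classify(x):
    if x < 0 and x != -1:
        return "negative"
    for _ in range(3):
        if x > 10:
            return "large"
    return "small"
"""
print(cyclomatic_complexity(snippet))  # 5 = base 1 + two ifs + one for + one 'and'
```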

What the researchers found was that Solaris came out on top. It was the highest quality, which matched the common perception. Windows NT came next, close behind Solaris. The surprise was Linux, which was far behind the other two. Why, then, the sense that it was high quality? The perceived quality of both NT and Linux did not match their objectively measured quality.

The researchers' speculation was that while Linux had a lot of rough edges, its most used paths were well polished. The primary scenarios were close to 100% right, whereas the rest were only at, say, 60%. NT, on the other hand, was at 80 or 90% everywhere. That made for high objective quality, but not high experienced quality.
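
One way to see why the perceptions diverge is to weight each scenario's polish by how often users actually hit it. The numbers below are made up, not from the study, but they reproduce the effect.

```python
# Hypothetical numbers in the spirit of the story: (usage share, polish).
linux = [(0.95, 1.00),   # primary paths: heavily used, fully polished
         (0.05, 0.60)]   # edge cases: rarely used, rough
nt    = [(0.95, 0.90),   # uniformly 90% right everywhere
         (0.05, 0.90)]

def objective_quality(scenarios):
    # Unweighted average polish: roughly what code metrics see.
    return sum(polish for _, polish in scenarios) / len(scenarios)

def perceived_quality(scenarios):
    # Usage-weighted polish: roughly what users experience.
    return sum(share * polish for share, polish in scenarios)

for name, system in [("linux", linux), ("nt", nt)]:
    print(name, round(objective_quality(system), 2),
          round(perceived_quality(system), 2))
# linux 0.8 0.98
# nt 0.9 0.9
```

On these made-up numbers, Linux loses on the unweighted average (0.80 vs. 0.90) but wins on the usage-weighted one (0.98 vs. 0.90), which is exactly the gap between objective and perceived quality the story describes.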

Think about it. If everything you do is 90% right, you will run into small problems all the time. On the other hand, if you stay within the expected lanes on something like Linux, you will rarely experience issues.
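
The compounding effect is easy to quantify. Assuming, purely for illustration, a session of twenty independent steps:

```python
# Probability that a session of n independent steps goes flawlessly
# when each step succeeds with probability p.
def flawless(p: float, n: int) -> float:
    return p ** n

print(round(flawless(0.90, 20), 2))   # 0.12: "90% right" fails most sessions
print(round(flawless(0.999, 20), 2))  # 0.98: polished hot paths rarely disappoint
```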

This coincides well with the definition of quality as fitness for function. For the functions it was actually being used for, Linux was very fit. NT supported a wider variety of functions, but was less fit for each of them, and was thus perceived as being of lower quality.

The moral of the tale: Quality is not the absence of defects. Quality is the absence of the right kinds of defects. The way to achieve higher quality is not to scour the code for every possible defect. That may even have a negative effect on quality, because it randomizes the team's attention across issues users may never hit. Instead, it is better to understand the usage patterns and ensure that those are free of bugs. Data Driven Quality gives the team a chance to understand both these usage patterns and the bugs that are impeding them.
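
In concrete terms, that might mean instrumenting the product, counting how often each scenario is exercised, and ranking open bugs by the usage they block rather than by raw severity alone. Everything below, the scenario names, the telemetry counts, the hit rates, is hypothetical; it sketches the prioritization step, not any particular pipeline.

```python
# Hypothetical telemetry: sessions per week exercising each scenario.
usage = {"open_file": 120_000, "print": 15_000, "export_pdf": 2_500}

# Hypothetical open bugs: (id, scenario it breaks, chance a session hits it).
bugs = [("BUG-101", "export_pdf", 0.50),
        ("BUG-102", "open_file", 0.01),
        ("BUG-103", "print", 0.10)]

# Rank bugs by expected user sessions impeded per week.
impact = sorted(((bug_id, usage[scenario] * hit_rate)
                 for bug_id, scenario, hit_rate in bugs),
                key=lambda pair: pair[1], reverse=True)

for bug_id, sessions in impact:
    print(f"{bug_id}: ~{sessions:,.0f} impeded sessions/week")
# BUG-103: ~1,500 > BUG-101: ~1,250 > BUG-102: ~1,200
```

Note that neither the most-used scenario nor the buggiest one automatically wins; it is the product of usage and failure rate that approximates experienced quality, and that is what gets fixed first.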