My last three posts have explained how test lost its way. It evolved from an advocate for the user into a highly efficient machine for producing test results, verifying correctness as determined by a specification. Today, test can find itself drowning in a sea of results that aren’t correlated with any discernible user activity. If only there were a way to put the user back at the center, scale testing, and handle the deluge of results. It turns out there is. The path to a solution has been blazed by our web services brethren. That solution is data. Data-driven quality is the fourth wave of testing.
There is a lot to be said for manual testing, but it doesn’t scale. It takes too many people, too often. They are too expensive and too easily bored doing the same thing over and over. There is also the problem of representativeness. A tester is not like most of the population. We would need testers from all walks of life to truly represent the audience. Is it possible to hire a tester that represents how my grandmother uses a computer? It turns out, it is. For free. Services do this all the time.
If software can be released to customers early, they will use it. In using it, they will inevitably stumble across all of the important issues. If there were a way to gather and analyze their experiences, much of what test does today could be done away with. This might be called the crowdsourcing of testing. The difficulty is in the collection and analysis.
Big Data and Data Science are the hot buzzwords of the moment. Despite the hype, there is a lot of value to be had in the increased use of data analysis. What were once gut feels or anecdotal decisions can be made using real information. Instead of understanding a few of our customers one at a time, we can understand them by the thousands.
A big web service like Bing ships changes to its software out to a small subset of users and then watches them use the product. If the users stop using the product, or a part of the product, this can indicate a problem. This problem can then be investigated and fixed.
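To make the idea concrete, here is a minimal sketch of how such a comparison might look. It is not a description of how Bing or any other service actually does this. It assumes we can count, for both the small group that received the change and everyone else, how many users were exposed and how many of them used the feature. The function names, the two-proportion z-test, and the threshold of -2.0 are all my own illustrative choices.

```python
import math

def engagement_change(control_users, control_active, treated_users, treated_active):
    """Compare feature usage between control users and users who got the change.

    Returns (difference in usage rate, z score) from a pooled two-proportion test.
    A negative difference means the treated group used the feature less.
    """
    p_control = control_active / control_users
    p_treated = treated_active / treated_users
    # Pooled usage rate under the assumption that the change made no difference.
    pooled = (control_active + treated_active) / (control_users + treated_users)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_users + 1 / treated_users))
    z = (p_treated - p_control) / std_err if std_err > 0 else 0.0
    return p_treated - p_control, z

def looks_like_regression(z, threshold=-2.0):
    """Flag the change for investigation when usage dropped well beyond noise.

    The threshold is an arbitrary illustrative cutoff, not a recommendation.
    """
    return z < threshold

# Hypothetical numbers: 10,000 control users, 1,000 users on the new build.
diff, z = engagement_change(10_000, 4_000, 1_000, 340)
if looks_like_regression(z):
    print(f"Usage changed by {diff:.1%} (z = {z:.2f}); worth investigating.")
```

A flag like this does not say what is broken, only that users who got the change are behaving differently. Someone still has to investigate.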
The advantage of this approach is that it represents real users. Each data point is a real person, doing what really matters to them. Because they are using the product for real, they don’t get bored. They don’t miss bugs. If the product is broken, their behavior will change. That is, if they experience the issue. If they don’t, is it really a bug? (More on this in another post.) This approach scales. It can cover all types of users. It doesn’t cost more as the coverage increases.
Using data aggregated across many users, it should be possible to spot trends and anomalies. It can be as simple as looking at what features are most used, but it can quickly grow from there. Where are users failing to finish a task? What parts of the system don’t work in certain geographies? What kinds of changes most improve usage?
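As an illustration of the "where are users failing to finish a task?" question, here is a small sketch of a funnel analysis. It assumes usage events arrive as (user id, step name) pairs and that the steps of the task are known in order. The event format, the step names, and the function names are hypothetical. As a simplification, it counts every user who recorded a step, without checking that they also recorded the earlier ones.

```python
from collections import defaultdict

def funnel_counts(events, steps):
    """Count distinct users who recorded each step of a task, in step order."""
    reached = defaultdict(set)
    for user_id, step in events:
        reached[step].add(user_id)
    return [(step, len(reached[step])) for step in steps]

def biggest_drop(counts):
    """Return (from_step, to_step, fraction of users lost) for the worst transition."""
    worst = None
    for (step_a, n_a), (step_b, n_b) in zip(counts, counts[1:]):
        if n_a == 0:
            continue  # nobody reached this step, so there is nothing to lose
        lost = (n_a - n_b) / n_a
        if worst is None or lost > worst[2]:
            worst = (step_a, step_b, lost)
    return worst

# Hypothetical events for a three-step task.
events = [
    (1, "open"), (2, "open"), (3, "open"), (4, "open"),
    (1, "edit"), (2, "edit"), (3, "edit"),
    (1, "save"),
]
counts = funnel_counts(events, ["open", "edit", "save"])
print(counts)
print(biggest_drop(counts))
```

The transition that loses the largest share of users is a reasonable first place to look for a quality problem, though the data alone won't say whether the cause is a bug, a confusing design, or users simply changing their minds.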
If quality is the fitness of a feature for a particular function, then watching whether customers use a feature, for how long, and in what ways can give us a good sense of quality. By watching users use the product, quality can begin to be driven by data instead of pass/fail rates.
Moving toward data-driven quality is not simple. It operates very differently from traditional testing. It will feel uncomfortable at first. It requires new organizational and technical capabilities. But the payoff in the end is high. Software quality will, by definition, improve. If users are driving the testing and the team is fixing issues to increase user engagement, the fitness for the function users demand of software must go up.
Over the next few posts, I will explore some of the changes necessary to start driving quality with data.