beta testing vs usability testing

I said in an earlier post that usability testing is not beta testing. The line between beta and usability can get blurred when someone like 37 Signals tells you that you should test in the wild instead of doing a usability study. Here's what they say about it:

Formal usability testing is too stiff. Lab settings don't reflect reality. If you stand over someone's shoulder, you'll get some idea of what's working or not but people generally don't perform well in front of a camera. When someone else is watching, people are especially careful not to make mistakes — yet mistakes are exactly what you're looking for.

This paragraph shows a fundamental misunderstanding of the value of usability testing, and it overstates the value of beta testing for finding user experience issues. They're absolutely right that formal usability testing can be too stiff, and that there's a potential bias built into the system because people are more likely to be hyper-aware of what they're doing. That doesn't mean that it's without value -- after all, if someone is hyper-aware of what they're doing and they still stumble, then you've definitely got an issue that needs to be addressed.

Relying solely on a beta for finding user experience issues is fraught with danger. By the time you hit beta, you're pretty much locked down. You've made all of the decisions about what's going into your product, not to mention what workflows you're going to support and what scenarios you're going to optimise for [1]. If you wait until beta to gather user feedback, and you learn during your beta that some of your early assumptions were wrong, your options are limited: either you go out the door with a product that you now know doesn't meet user needs because you've already committed to it, or you change direction at a very late date (which likely means that you have to delay your release). In either case, you've wasted a lot of time and resources on something that won't be accepted by your intended users.

Much of my research happens long before beta, and one of its goals is to reduce the risk that the software won't meet its goals. My research happens early enough in the cycle that we can go back to the drawing board and fix issues long before they're in code. I iterate on important features to make sure that we've really nailed it -- in this release, this is underscored by the number of hours that my team has spent in the lab with the MacRibbon.

Relying solely on a beta for finding user experience issues means that you're also relying on the user to report issues, and to be able to articulate the issues that they're experiencing. When a user experiences an issue in the lab, there is someone there observing what is happening. There are plenty of times when the user experiences a problem and doesn't even notice it themselves; it's only later, when you ask about it, that they reflect back on the experience and recall that they had an issue. In the moment, they were so focused on doing what they wanted to do that they just moved past it as quickly as possible. Skip usability testing, and you miss out on issues like that.

Last year, I conducted a usability test for an important group of features in one of our applications. If you were to simply look at my data, the study went well: the participants were able to do everything that I asked them to do. The process was complex, but they were able to tell me what was happening at every step along the way, and everyone said that they were satisfied with their experience. But as I observed the study, I knew that there was a big problem. While all of the participants said and did all the right things, their body language and tone told me something entirely different. They understood the complex process, but they weren't comfortable with it. We went back to the drawing board, tweaked some things, and then did another usability study with our new design. This time, their words matched their body language. We never would have caught that if we had been relying only on self-reported information.

This isn't to say that there aren't user experience issues that can be uncovered during beta. In the usability lab, I get data about usability, but I don't get data about usage -- that is, I learn how users initially react to something and what happens over the course of a couple of hours as they use it, but I don't learn how users react to something over an extended period of time. I monitor our betas to determine what kind of usage issues exist, and my team also conducts plenty of research throughout the software lifecycle to learn more about usage issues.

Standard usability testing is not, and should not be, the only method that you use to capture user feedback. It's one method of many. It has its strengths and weaknesses. Assuming that its weaknesses mean that it's useless is short-sighted and dangerous.


[1] While all of the decisions have probably been made by the time your application is in beta, that doesn't actually mean that they've all been implemented by the time you're in beta. It's beta because there's still work to be done. Work that tends to happen late in the game includes icons, since you don't want to waste resources on putting a lot of polish on icons that might or might not actually make it into the final product, and you don't want to have to keep on going back to your icon designers and saying, "oops, we need another one".