“If Microsoft is so good at testing, why does your software suck?”

What a question! I only wish I could convey the way that question is normally asked. The tone of voice is either partially apologetic (because many people remember that I was a major ask-er of that same question long before I became an ask-ee) or it’s condescending to the point that I find myself smiling as I fantasize about the ask-er’s computer blue-screening right before that crucial save. (Ok, so I took an extra hit of the kool-aid today. It was lime and I like lime.)

After 27 months on the inside I have a few insights. The first few are, I readily concede, downright defensive, but as I’ve come to experience firsthand, they are true nonetheless. The last one, though, is really at the heart of the matter: talent notwithstanding, testers at Microsoft do have some work to do.

I’m not going down the obvious path: claiming that testing isn’t responsible for quality and directing the question to a developer/designer/architect instead. (I hate the phrase ‘you can’t test quality in’; it’s a deflection of blame, and as a tester I take quality directly as my responsibility.)

But I am getting ahead of myself. I’ll take up that baton at the end of this post. Let’s begin with the defensive points:

1. Microsoft builds applications that are among the world’s most complex. No one is going to argue that Windows, SQL Server, Exchange and so forth aren’t complex, and the fact that they are in such widespread use means that our biggest competitors are often our own prior versions. We end up doing what we call “brown field” development (as opposed to ‘green field’ or version 1 development) in that we are building on top of existing functionality. That means testers have to deal with existing features, formats, and protocols along with all the new functionality and integration scenarios, which makes it very difficult to build a big-picture test plan that is actually doable. Testing real end-to-end scenarios must share the stage with integration and compatibility tests. Legacy sucks and functionality is only part of it…as testers, we all know what is really making that field brown! Be careful where you step. Dealing with yesterday’s bugs keeps part of our attention away from today’s bugs.

(Aside: Have you heard that old CS creationist joke: “Why did it take God only seven days to create the universe?” The answer: “No installed base.” There’s nothing to screw up, no existing users to piss off or prior functionality and crappy design decisions to tiptoe around. God got lucky; us…not so much.)

2. Our user-to-tester ratio sucks, leaving us hopelessly outnumbered. How many testers does it take to run the same number of test cases that the user base of, say, Microsoft Word can run in the first hour after it is released? The answer: far more than we have or could hire, even if we could find enough qualified applicants. There are enough users to virtually ensure that every feature gets used in every way imaginable within the first hour (day, week, fortnight, month, pick any timescale you want and it’s still scary) after release. This is a lot of stress to put our testers under. It’s one thing to know you are testing software that is important. It’s quite another to know that any failure to do it well will be mercilessly exposed soon after release. Testing our software is hard; only the brave need apply.

3. On a related point, our installed base makes us a target. Our bugs affect so many people that they are newsworthy. There are a lot of people watching for us to fail. If David Beckham wears plaid with stripes to fetch his morning paper, it’s scandalous; if I wore my underpants on the outside of my jeans for a week, few people would even notice (in their defense though, my fashion sense is obtuse enough that they could be readily forgiven for overlooking it). Becks is a successful man, but when it comes to the ‘bad with the good’ I’m betting he’s liking the good a whole lot more. You’re in good company, David.

But none of that matters. We’ll take our installed base and our market position any day. No trades offered. But still, we are always ready to improve. I think testers should step up and do a better job of testing quality in. That’s my fourth point.

4. Our testers don’t play a strong enough role in the design of our apps. We have this “problem” at Microsoft that we have a whole lot of wicked smart people. We have these creatures called Technical Fellows and Distinguished Engineers who have really big brains and use them to dream really big dreams. Then they take these big dreams of theirs and convince General Managers and VPs (in addition to being smart, they are also articulate and passionate) that they should build this thing they dreamt about. Then another group of wicked smart people called Program Managers start designing the hell out of these dreams and Developers start developing the hell out of them, and a few dozen geniuses later this thing has a life of its own, and then someone asks ‘how are we going to test this thing?’ and of course it’s A LITTLE LATE TO BE ASKING THAT QUESTION NOW, ISN’T IT?

Smart people who dream big inspire me. Smart people who don’t understand testing and dream big scare the hell out of me. We need to do a better job of getting the word out. There’s another group of wicked smart people at Microsoft and we’re getting involved a wee bit late in the process. We’ve got things to say and contributions to make, not to mention posteriors to save. There’s a part of our job we aren’t doing as well as we should: pushing testing forward into the design and development process and educating the rest of the company on what quality means and how it is attained.

We can test quality in; we just have to start testing a lot sooner. That means that everyone from TF/DE through the entire pipeline needs to have test as part of their job. We have to show them how to do that. We have to educate these smart people about what quality means and take what we know about testing and apply it not just to binaries/assemblies, but to designs, user stories, specs and every other artifact we generate. How can it be the case that what we know about quality doesn’t apply to these early-stage artifacts? It does apply. We need to lead the way in applying it.
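
To make that concrete, here’s a minimal sketch of what it can look like to apply test thinking to a user story before a single line of product code exists. It’s purely hypothetical (not an Office example; the story, the Editor class, and the autosave parameter are all invented for illustration), but it shows the idea: turn the words in the spec into checks that can fail.

```python
# Hypothetical user story: "If the app crashes, I can recover my document
# with no more than one minute of lost work."
#
# None of this is real product code; Editor and its methods are stand-ins
# invented for this sketch. The checks are written against the design, so
# most of them fail until the feature is actually built -- which is the point.

import pytest


class Editor:
    """Stub pinned down during design review; no implementation yet."""

    def __init__(self, autosave_interval_seconds: int):
        if autosave_interval_seconds <= 0:
            raise ValueError("autosave interval must be positive")
        raise NotImplementedError("feature not built yet")

    def type_text(self, text: str) -> None:
        raise NotImplementedError

    def simulate_crash(self) -> None:
        raise NotImplementedError

    def recover_unsaved(self) -> str:
        raise NotImplementedError


def test_autosave_bounds_data_loss():
    # The spec's "no more than one minute of lost work" becomes a claim
    # that can actually fail, long before there is a binary to test.
    editor = Editor(autosave_interval_seconds=60)
    editor.type_text("first paragraph")
    editor.simulate_crash()
    assert editor.recover_unsaved() == "first paragraph"


def test_autosave_interval_must_be_positive():
    # Writing the check surfaces a question the spec never answered:
    # what does an autosave interval of zero even mean?
    with pytest.raises(ValueError):
        Editor(autosave_interval_seconds=0)
```

The code itself is throwaway; the value is that writing it forces fuzzy spec language into arguments a Program Manager and a tester can have while the design is still cheap to change.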

I think that ask-ers of the good-tester/crappy-software question would be surprised to learn exactly how we are doing this right now. Fortunately, you’ll get a chance because Tara Roth, one of the Directors of Test for Office, is speaking at STAR West in November. Office has led the way in pushing testing forward and she’s enjoyed a spot as a leader of that effort. I think you’ll enjoy hearing what she has to say.