Google v. Microsoft, and the Dev:Test Ratio Debate


Ever since I gave a talk at Google’s GTAC event here in Seattle this past October, I’ve had the chance to interact with a number of Google testers, comparing and contrasting our two companies’ approaches to testing. It’s been a good exchange.


Now it seems that, their toilets notwithstanding, Google focuses on testing with an intensity that is in the same general ballpark as ours. We both take the discipline and the people who do it seriously. But I think that there are some insights into the differences that are worth pondering.


Specifically, the disparity between our respective developer-to-tester ratios is worth a deeper look. At Microsoft the dev:test ratio varies somewhat, from near 1:1 in some groups to double or triple that in others. At Google just the opposite seems to be the case, with a single tester responsible for a larger number of bug-writing devs (clearly we have that in common).


So which is better? You tell me, but here are my thoughts (without admission of any guilt on Microsoft’s part or accusations against Google):


1. 1:1 is good. It shows the importance we place on the test profession and frees developers to think about development tasks and getting the in-the-small programming right. It maximizes the number of people on a project actively thinking about quality. It speeds feature development because much of the last-minute perfecting of a program can be done by testers. And it emphasizes tester independence, minimizing the bias that keeps developers from effectively testing their own code.


2. 1:1 is bad. It’s an excuse for developers to drop all thoughts of quality because that is someone else’s job. Devs can just build the mainline functionality and leave the error checking and boring parts to the testers.


It’s interesting to note that Microsoft testers tend to be very savvy developers and are often just as capable of fixing bugs as they are of finding bugs. But when they do so, do devs really learn from their mistakes when they have someone else cleaning up after them? Are testers, when talented and plentiful, an excuse for devs to be lazy? That’s the other side of this debate:


3. Many:1 is good. When testers are scarce, it forces developers to take a more active role in quality and increases the testability and initial quality of the code they write. We can have fewer testers because our need is less.


4. Many:1 is bad. It stretches testers too thin. Developers are creators by nature, and you need a certain number of people to take the negative viewpoint or you’re going to miss things. Testing is simply too complicated for such a small number of testers. Developers approach testing with the wrong, creationist attitude and are doomed to be ineffective.


So where’s the sweet spot? Clearly there are application-specific influences, in that big server apps require more specialized and numerous testers. But is there some general way to get the mix of testers, developers, unit testing, automated testing and manual testing right? I think it is important that we start paying attention to how much work there really is in quality assurance, and which roles have the most impact where. Test managers should be trying to find that sweet spot.

Comments (18)

  1. dgerodim says:

    In Many:1 IT shops I have been involved with, testers are just a formality. They don’t have the skills to code or script; they mostly do manual UI tests.

    To many current dev managers, employing capable testers that can write code and build harnesses to automate the process is different, and different is bad.

    A colleague once said that he gauges social maturity by people’s ability to patiently stand in line, and gauges a team’s engineering maturity by its willingness to invest in testing.

  2. Philk says:

    Is finding the sweet spot of numbers really that important? Or is fostering a mindset that everyone is involved in quality more important? (A cliché that is often used, but how can it be put into practice?)

  3. strazzerj says:

    James,

    There is no universal answer to the question "where’s the sweet spot?" concerning dev:test ratios.

    The "sweet spot" is the one that works well for your individual company – be that Microsoft, Google, or Joe’s Company.

    Clearly, the time and number of people required to test something is contextual – it depends on factors that may have little or nothing to do with how long it took to develop that feature. It will vary by company and project, and may vary over time.

    Also consider:

    – what counts as Development?

    – what counts as Testing?

    see:

    http://www.sqablogs.com/jstrazzere/150/What+is+the+%22Correct%22+Ratio+of+Development+Time+to+Test+Time%3F.html

  4. MSDNArchive says:

    Excellent points, all. I love the quote about innovation in testing.

    In my ‘future’ series I pointed out that quality will eventually be everyone’s job. The question here is: what proportion of quality-oriented people takes us to that future faster?

  5. dannyR says:

    I don’t think you can talk about dev:test ratio without talking about engineering process. For example, listen to the Channel 9 interview with Nachi Nagappan of the RISE group about his team’s findings on TDD (test-driven development; a rough sketch of the loop is below). In his study, MS showed a 60-90% reduction in code defects at a cost of making the dev cycle 15-35% longer. Employing such a development process would likely change the ideal dev:test ratio.

    One thing I have little to no understanding of is how Test/QA is structured in other engineering disciplines (EE, Civil, etc).

    http://channel9.msdn.com/posts/Peli/Experimental-study-about-Test-Driven-Development/
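    For readers who haven’t seen it in practice, here is a rough sketch of the red/green/refactor loop the study measured. It’s a minimal illustration in Python; the function and its behavior are made up for the example, not taken from the study.

    # 1. Red: write a small failing test that pins down the next behavior.
    def test_parse_price_handles_currency_symbol():
        assert parse_price("$19.99") == 1999   # price kept in cents

    # 2. Green: write just enough production code to make the test pass.
    def parse_price(text: str) -> int:
        digits = text.strip().lstrip("$")
        dollars, _, cents = digits.partition(".")
        return int(dollars) * 100 + int(cents or "0")

    # 3. Refactor: clean up with the tests as a safety net, then repeat for the next behavior.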

  6. snuchia@statsoft.com says:

    On test/dev in other engineering disciplines:

    Most other disciplines’ day-to-day work is more similar to line-of-business software development than to product development. Real innovation is a relatively small part of the job, with ensuring that requirements are captured and covered by the design and implementation taking a much larger portion of the engineer’s time and attention.

    Building a new chip is much more like developing a software product than it is like other electrical engineering tasks.

    Much testing in other engineering situations is focused on risks unrelated to the design: faulty components, mis-assembly, etc.

    I’ll give a scenario that I’m intimately familiar with from Electrical Engineering. I was the test engineer on commissioning jobs for new electrical distribution equipment in chemical plants on several occasions. The firm I worked for specialized in test and repair of industrial electrical system components; we were retained to do the commissioning tests for a new power control building in a refinery expansion project.

    The PCB was designed by an engineering firm based on detailed requirements from the client’s chemical engineering people. We had the requirements and the design documentation, and we developed the commissioning test plan. There was a row of 13,800V metal-clad indoor switchgear, including two vacuum circuit breakers configured as synchronous motor starters and some air-break switches feeding a row of seven outdoor pad-mounted transformers. There was a row of 4160V motor starters with air-break switches feeding them from two of the transformers, and a big 480V MCC fed from the other transformers. We were concerned only with the medium voltage stuff and the main breakers on the MCCs.

    The 13,800V protection scheme used GE "UR" series digital protective relays and a fairly complex scheme designed to "tie" the busses together intelligently in the event of a feed loss from one of the two sources. Typical scheme, unusually complex implementation.

    In the process of testing all this equipment we found miswiring of the main-tie-main control scheme. We found blast barriers missing in several of the air-break switches. We found miswired current transformers on one of the 13,800V breaker cabinets. There were minor problems in other areas, but I don’t remember all of them; I was mainly doing the relays. There was something wrong with the potential transformers in the protective relaying setup for the 4160V lineup as well, but I don’t recall exactly what that was now. Potential transformer outputs were permuted at the inputs to the relays in the 13,800V lineup, and there was a design flaw involving visibility of the backup source’s potential signals from the opposite main breaker relay; I don’t remember the details now. There was also a flaw in the mechanical interlocking scheme on some of the air-break switches.

    Except for the last two, none of these were defects in the drawings as received from the "programmers" at the electrical engineering firm.  Compilers do not misinterpret the programmer’s instructions the way the wiring guys on the shop floor misinterpret the drawings.  Testing software is qualitatively different from testing physical artifacts.

    In line-of-business software, the developer’s role is closer to that of the wire monkeys.  Take a stack of business rule changes and implement them using routine coding idioms.  The defects that result are similar to wiring errors: a certain rule is not properly implemented in certain circumstances.

    There’s still no need to test each copy of the program before it is commissioned, of course.

    There is also no need to repeat tests on software. Every year or two, that electrical equipment will need to be retested. Not the full commissioning test plan, but a lot of it. Clean the switch contacts and measure their resistance. Check insulation resistance on the transformers and their feed cables. Check contact wear, trip times and vacuum integrity on the breakers. Nothing analogous happens for software.

    I believe that a useful discussion of the role of a testing activity in any engineering context should begin with a characterization of the risks being addressed.  The tester/dev ratio discussion is focused on risks specific to new software product code development: conformance to specifications, usability, crashing bugs, platform compatibility, etc.  Different contexts and different risk portfolios lead to very different test activities.

    -swn

  7. pencildot says:

    In today’s world it’s extremely difficult to find an efficient tester. Automation is a method to test a scenario. The scenario is what we need to be testing, not the tool. It is good that a tester learns the tool.

    Testing is a talent, an art. All of us can draw, but da Vincis are rare.

    A tester should see ahead of a developer, to help him find the loopholes that he missed while creating his work. A developer can test what he has created, but a tester should know what the developer should be creating. The success of any product depends on how good their working relationship is. If they work against each other, the product is chaos.

    There is no way to determine the ratio between the two. If they both are good then 1:1 is great. Unfortunately, finding a good developer and an equally good tester is like finding a needle in a haystack.

    I am saying this based on the testers I see around me. Most of them in the field are trying to become developers, or they had no other option and hence chose this field. There is no passion for the profession. You have got to love what you do. In today’s world I would suggest 1:3 (I know many of you would be shocked, but this is reality).

    To test a product you need to think ahead of not one developer but a million customers. How many of us out there can do that?

  8. pablo_fung@hotmail.com says:

    Testers can only test code. What about bugs in the design or architecture? Who’s responsible for vetting those artifacts?

    One design/architecture error in a Microsoft product that replicates content from one farm to another: if content is deleted and then inserted again, and replication happens afterwards, the replication process fails because the newly inserted item already exists in the destination (it just tries to do an update or insert).

  9. I believe that this post misses the point entirely: It is not a question of x is good vs y is bad. It is a question of servicing: Microsoft’s servicing model is a relic of the 80s, with a major software release coming every couple of years. In that case, you need an immense amount of test effort because bugs are extraordinarily expensive and difficult to fix.

    Google’s online-centric model means that bug fixes can be rolled out rapidly. Look at Gmail – it’s still in Beta after all these years, but it has also been generating revenue for all that time.

    Microsoft needs a new servicing model that allows software to be rapidly and efficiently patched at low risk and with high speed and security. This is a monumental task but it’s what must be done.

  10. JimAtASH says:

    I’m not sure of the "optimal" ratio of Dev:QA, but I know what it is not. Anything over 4 to 1 in my experience became unmanageable from the QA standpoint. Meaningful testing never seemed to get done enough because of the turnaround and release pace of the software. QA was seen as "too slow" when in fact they were overwhelmed, not slow. It made QA a true bottleneck on projects, which also made the people in QA less happy.

    I’ve never worked in a 1 to 1 environment but it definitely sounds good.  I’m sure it gives the QA engineer the ability to accomplish the "should do"s and not just the "must do"s.

    – Jim

  11. davidvthokie says:

    I’d love to know how closely the practices of the G v. MS testers align. What does each spend the day doing – and at what abstraction level? How much time is spent on various tasks relative to each other? What does the Venn diagram of responsibilities look like? etc…

  12. Alun Jones says:

    You missed commenting on the 1:many environment – every development project goes through this phase, when substantially more people are using the software than ever worked to create it in the first place. In the 1:1 or Many:1 situations, one tester is testing the output, usually against a set of described use cases. Once the product reaches alpha, beta, or release stages, however, it gets tested by people who have no interest in the original designers’ use-cases, only their own.

  13. I was triggered to read a blog post by James Whittaker, software architect at Microsoft, through an article

  14. [Nacsa Sándor, January 13 – February 3, 2009] The subject of quality assurance is hardly known at all

  15. [Nacsa Sándor, February 6, 2009] This Team System edition is for testing web applications and services

  16. izdelava spletnih strani says:

    Exactly why I like your blog. You tell it like it is, no matter what, whether it’s a Microsoft or a Google product.

  17. Test and measurement instrument says:

    Taurus Powertronics- The products driving the company’s dynamic growth in high performance technology include: Battery Ground Fault Locator, Test and Measurement Instrument, Fault Passage Indicators, Earth Tester, Insulation Tester, Fault Locator Systems, Corona Imaging Camera, TBD Later India.

    http://www.tauruspowertronics.com

  18. The ratio will be found naturally says:

    If you break every story down and have a task for each part of the story (design, develop, unit test, automated integration test, build, install, localize):

    Your team works on a story until it is done. The tester should take the task of writing the automated integration test for each story, and a team of four to five people should only do one or two stories at a time, with each person taking a different task.

    If the tester can keep up, you have enough testers. If he is good yet can't keep up, you need another one. Sometimes you might find you need one dedicated automated integration test developer and one developer who is a feature developer 50% of the time and an automated integration test developer 50% of the time.

    In a four- or five-person team, you will see the ratio naturally arrive at 3 devs to 1-2 developers for automated test.