Old World vs. New World

As software has become increasingly complex and the demands placed on the test discipline have escalated, it is clear that legacy approaches to testing are not always adequate.  Sometimes test teams get caught up in running the same types of tests and taking the same approaches not because those things keep providing great value, but simply because “that’s the way it has always been done” :)


Years ago, while thinking about this phenomenon with a colleague, we drew up a small playbook which described the values that our test team “aspired to”.  A small part of it was the “Old World, New World” chart below, which tried to summarize some of the tenets we were trying to move towards (and some of the things we were trying to move away from…).  We wanted to take a hard look at the things we were investing a good deal of time in and see whether we could make any tweaks to get more bang for the buck.  Looking at the chart now, it is interesting to see that many of the points are still relevant and we still haven’t met all of our goals…but some incremental progress has been made :)


Let’s look at the chart and then briefly comment on some of the bullet points: 

Old World → New World

    • Doing the same old testing that we have always done (grandfathered in)  →  Killing old tests, procedures, efforts, etc. that have little value or whose value does not measure up to the cost; expanding/adding tests with creative additions; investing in exploratory/scenario testing
    • Test spending time running regression tests on private binaries  →  Developers running regressions using a centralized system; test adding coverage on new functionality
    • Spending time triaging regression results because of test issues  →  Spending time getting tests to 100% in all configurations
    • 3-4 week Full Test Pass  →  1 week Full Test Pass
    • Adding, Adding, Adding, Adding  →  Removing, Removing, Innovating, Adding
    • “Classical SDET” deliverables are rewarded  →  Radical thinking and results are rewarded
    • Finding bugs after check-in  →  Finding bugs before check-in
    • Test serving Development  →  Test serving Quality

“Doing same old testing that we have always done (grandfathered in)/Killing old tests, procedures, efforts….”:   The idea behind this row is as described above: test teams frequently keep doing the same old things again and again without analyzing the return on investment for those tasks.  As an example, one of my test teams maintained an extremely large and complex test suite which tested a particular type of authentication behavior.  The suite had thousands upon thousands of variations and was not particularly stable.  Analysis showed that its maintenance costs were quite high, and that it was not catching many regressions or finding new issues (certainly not enough to justify its existence).  The decision was made to retire the old suite (at least from regular automated runs) and replace it with a much simpler version which was stable and which provided similar coverage.  Although changing something like this was scary, in the long run it was the right thing to do for the team.  Because of the reduction in maintenance costs, the testers who previously owned the suite were able to spend more time on end-to-end scenario testing and ad-hoc testing, and as a result they prevented more defects from reaching the customer.


“Test spending time running regression tests on private binaries…/Having Developers run regressions using centralized system.”:   Whenever possible, centralized systems should be put in place so that the test team is not a bottleneck for development.  Instead of developers tossing binaries over the fence to test and waiting for the test team’s “blessing”, a centralized system that can be leveraged by everyone across the disciplines lets developers verify their private binaries and protect against regressions.  In the meantime, test can be off tracking down the tough bugs :)
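To make this concrete, here is a minimal sketch (in Python, purely illustrative) of what a self-serve regression gate could look like; the suite definitions, paths, and the BUILD_UNDER_TEST variable are assumptions made for the example, not details of any actual system:

```python
# Illustrative sketch of a self-serve regression gate that anyone on the team can
# run against a private build. Suite contents, paths, and the BUILD_UNDER_TEST
# variable are assumptions made for this example.
import os
import subprocess
import sys
from pathlib import Path

# Hypothetical mapping of suite name -> command; a real system would pull this
# from a single central definition shared by dev and test.
REGRESSION_SUITES = {
    "auth_smoke": ["python", "-m", "pytest", "tests/auth", "-q"],
    "io_smoke":   ["python", "-m", "pytest", "tests/io", "-q"],
}

def run_gate(private_build: Path) -> bool:
    """Run the shared regression suites against a developer's private build."""
    if not private_build.is_dir():
        print(f"build directory not found: {private_build}")
        return False
    # Point the tests at the private binaries via an environment variable.
    env = {**os.environ, "BUILD_UNDER_TEST": str(private_build)}
    failures = [name for name, cmd in REGRESSION_SUITES.items()
                if subprocess.run(cmd, env=env).returncode != 0]
    if failures:
        print("regression gate FAILED:", ", ".join(failures))
        return False
    print("regression gate passed")
    return True

if __name__ == "__main__":
    sys.exit(0 if run_gate(Path(sys.argv[1])) else 1)
```

The important property is that the suite definition lives in one shared place, so developers and test run exactly the same checks and nobody has to wait for a manual “blessing”.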


“Spending time triaging regression results because of test issues/Spending time getting tests to 100% in all configurations”:  In my estimation, there is no greater drain on an SDET’s work life than unstable automation.  When SDETs spend too much time triaging test automation bugs, it takes away from high-value work such as moving quality upstream or finding hard-to-track-down bugs in end-to-end flows.  The “Golden Standard” that our test teams try to evangelize is that every time a test fails, it should be an indicator of an actual product bug.  While this is sometimes difficult to achieve in practice, it is certainly a good result to aspire to.  Another problem with unstable automation is that it causes a loss of credibility within the work group.  Unstable tests and false alarms reduce our clout as testers and can cause “Boy who cried wolf” syndrome, in which people simply stop being alarmed when tests fail.  Apathy like this in a team is dangerous.
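To illustrate what “a failing test should mean a product bug” implies in practice, here is a rough sketch of a flakiness report; the history format, threshold, and test names are invented for the example.  A test that both passes and fails on the same build is flapping on something other than the product and is a candidate for quarantine or repair:

```python
# Rough sketch of a flakiness report built from historical test results.
# The (test, build, passed) tuple format and the threshold are assumptions.
from collections import defaultdict

def flakiness_report(history, threshold=0.05):
    """history: iterable of (test_name, build_id, passed) tuples."""
    outcomes = defaultdict(lambda: defaultdict(set))
    for test, build, passed in history:
        outcomes[test][build].add(passed)
    report = {}
    for test, builds in outcomes.items():
        # A build with both True and False outcomes means the test flip-flopped
        # without any product change.
        flaky_builds = sum(1 for results in builds.values() if len(results) > 1)
        rate = flaky_builds / len(builds)
        if rate > threshold:
            report[test] = rate
    return report

history = [
    ("test_login", "build_101", True),
    ("test_login", "build_101", False),   # same build, different outcome
    ("test_checkout", "build_101", True),
    ("test_checkout", "build_102", True),
]
print(flakiness_report(history))  # {'test_login': 1.0}
```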


“3-4 week Full Test Pass/1 week Full Test Pass”:  This speaks to the results a test team can realize when it tightens up its automation battery by increasing test stability and removing tests and procedures which no longer provide benefit.  These particular numbers came from a test team which worked on a component with a huge number of system interdependencies, so please take the particular example with a grain of salt :)


“Adding, Adding, Adding, Adding/Removing, Removing, Innovating, Adding”:    The trap of continuously adding is something I have encountered in multiple test teams, and it is a fairly common occurrence.  When a new feature comes along, tests are added.  When a test hole is uncovered, tests are added.  When a new test technique is researched, tests are added.  The problem with this approach is that every new test, process, or procedure has a cost, and that cost compounds over its lifetime, creating a maintenance burden that eventually becomes unmanageable.  Thankfully there are many ways out of the trap.  One is to enforce review and removal of old test variations which are no longer relevant at the same time that new tests are added.   Another is to combine and consolidate test cases so that coverage is maximized with a minimal set of tests.
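As a sketch of the consolidation idea, the greedy pass below (a standard set-cover heuristic, not something from the original playbook) picks a small subset of tests whose combined coverage matches the whole suite, leaving the redundant variations as candidates for removal; the coverage data and test names are invented for illustration:

```python
# Greedy set-cover sketch for test suite minimization; data is illustrative.
def minimize_suite(coverage):
    """coverage: dict mapping test name -> set of covered requirements/blocks."""
    remaining = set().union(*coverage.values())
    kept = []
    while remaining:
        # Pick the test that covers the most still-uncovered items.
        best = max(coverage, key=lambda t: len(coverage[t] & remaining))
        gained = coverage[best] & remaining
        if not gained:
            break
        kept.append(best)
        remaining -= gained
    return kept

suite = {
    "test_basic_auth":      {"login", "logout"},
    "test_auth_variations": {"login", "logout", "token_refresh"},
    "test_token_only":      {"token_refresh"},
}
print(minimize_suite(suite))  # ['test_auth_variations'] covers everything here
```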


“‘Classical SDET’ deliverables are rewarded/Radical thinking and results are rewarded”:  Along the same lines as the comments above about questioning the same old testing that was grandfathered in, it is important for SDETs to feel safe trying new techniques that augment or potentially replace some of the old approaches.  This requires management to be “on board” with SDETs who can see past the cloudy veil of the past and take risks, with the expectation that some of those risks will not pay off.  We will never escape the past if we are not receptive to innovation and the risk that comes with it.


“Finding bugs after check-in/Finding bugs before check-in”:    Although testers are frequently characterized by the issues they raise while testing the actual product (i.e., after the code has made its way into the build), the most efficient time to find bugs is before the code makes its way into the build, and ideally before a single line of code has been written.  This is all part of the “Push Quality Upstream” war cry.  There are many ways to go about it, and each method probably deserves its own series of blog posts :)  Some examples:

  • Ensuring test is plugged into the design and spec phase
  • Test actively involved in pre-check-in code inspection
  • Creating a check-in battery of tests which run automatically before new code is committed to source control, and whose failure prevents the code from being checked in (a sketch follows this list)
  • Etc.
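
Below is one possible shape for such a check-in battery, sketched as a source-control hook in Python; the hook mechanism (a git pre-commit hook here), the battery contents, and the paths are assumptions for illustration rather than a description of any particular team’s system:

```python
#!/usr/bin/env python3
# Sketch of a check-in gate: run a small battery of tests and block the commit
# if any fail. Installing it as .git/hooks/pre-commit is one possible mechanism;
# the battery contents below are placeholders.
import subprocess
import sys

CHECKIN_BATTERY = [
    ["python", "-m", "pytest", "tests/smoke", "-q"],   # fast smoke tests
    ["python", "-m", "pytest", "tests/unit", "-q"],    # unit tests
]

def main() -> int:
    for cmd in CHECKIN_BATTERY:
        if subprocess.run(cmd).returncode != 0:
            print("check-in battery failed; commit blocked:", " ".join(cmd))
            return 1   # non-zero exit causes the hook to reject the commit
    print("check-in battery passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```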


“Test serving Development/Test serving Quality”:    Sometimes the test discipline is seen as a henchman for the development discipline, with test standing by, waiting for development to throw the code over the fence.  It is important for test to assert that the master we serve is quality, and that may not always mean being the safety net for development.  For example, sometimes it means teaching development to be their own safety net :)


Future blog posts will dive deeper into the above shifts in behavior and describe some of the processes that can be used to facilitate the transitions.


Comments/Questions are always welcome.  Thanks for reading!


-Liam Price, Microsoft