Software Testing Problems Part 3: Scheduling and Testing

In my last two blogs about software testing (Customers, Talents)
I tackled issues that I felt were better understood.  In
this entry I want to throw out some ideas my team and I started thinking about today:
estimating a true testing cost for new features or potential design changes.  IMO
developer costs are well understood and relatively controllable over time, but if
careful attention is not paid you can easily under- or overestimate the amount of time
and resources needed to test a product.


For argument’s sake, let’s assume I’m talking about a new feature being added to the
product you are testing.  I’m also going
to assume that you have moved out of the dark ages of relying solely on testing by
hand and have some way to automate some percentage of your testing.  Finally,
I’m going to assume that you don’t start the QA costing process until you have a good
understanding of the feature, including potential developer implementation designs.   No
sense starting off blind. :) How
can you estimate the costs involved over time? Here is what we came up with so far:

 

I would propose that there are no truly non-subjective measurements that give you
the whole picture when it comes to the cost of releasing high quality software.  If
you have one, let me know.   The following
subjective (1 through 10) measures can be used up front to rate the test costs of
potential features against one another.  Once
you start writing a test plan, once you have a more detailed spec, and once coding
begins, you should try to refine these numbers based on more concrete evidence over
time.

 

Has Dependency Scale (1-10): Does this
feature stand on its own, or is it highly dependent on the features that surround it
for success?   Think
of a 10 as a feature that requires something from every other feature in the product
in order to work at all.   Think
of a 1 as a feature that depends on next to nothing.  With
these scales I’ll leave figuring out the middle ground as an exercise for the reader.

 

Is Dependency (1-10): If this code malfunctions,
will it bring the entire product screeching to a halt, or does faulty code here only
bring about its own demise?  Are teams
outside of your own dependent on this feature? How much handholding are they going
to require?  How much testing coordination
are you going to have to put in with those teams?  Over time
you can refine this with actual information about the # of references to this code,
the # of calls made into the code while the product is running, etc.

 

Importance (1-10): I know, I know, you
would never code a feature if it wasn’t important.  But
just like everything that has a price, I’m sure you could stack rank features in order
of relative importance to the end users.  For
example: you may really like intellisense features, but if you had a choice between
them and being able to run/compile/debug your code in VS, I’m sure you would pick being
able to run/compile/debug your code over those features.   So
get over yourself and think about how relatively important feature Y really is.  If
you want to ship on a schedule, knowing this will help you determine where to apply
that “extra” testing effort in the long run.

 

Testability Scale (1-10): How hard is
it to reproduce bugs with this feature?   Can
you automate all the tests, or will every test have to be done by hand?   If
the feature first requires you to create a large amount of dummy data and then simulate
a crash (assume you are testing Office document recovery), then this feature is going
to cost you more to test in an ad-hoc fashion and to find bugs in over time.

 

Complexity (1-10): How complex is this
feature?  Is the implementation going
to be complex?  After you have more information
about LOC, # of code paths, or whatever your favorite complexity metrics are, you can
nail this one down more.

 

Security Risk (1-10): Using whatever criteria
you like, determine whether the feature represents a large security risk for your product.  Security
testing is both important to get right and costly to apply to a feature.  Giving
a high number here will mean you will want to spend a lot of time thinking like a
hacker and using exploratory methods to try to break through, so that in the real world
you’re satisfied with the reduced risk.

 

X-Factor (1-10): Ok, maybe this is a cop
out, but there needs to be a way to take an educated guess at even more uncostables.  This
could include your estimate of how well understood the problem space is.  For
example: a completely new, innovative, or uncharted feature will present much higher
costs than one based on existing work.  Another
reason for a high number here could be that you think the feature will present problems
on international systems.  I’m asking
for a best guess here.

 

What does this all add up to?

Add up the numbers, divide by the number of scales, and you get a relative test cost
on a scale of 1-10.  :) We’re
just starting to think about this, so I can’t tell you how many features one person
can take on, or how much time it takes to test a feature with a cost of 7; that’s
something I’m hoping to figure out over time, combined with the historical data we
have.  I guess I’m really searching for
a numerical way to represent the things that product planners should be thinking of
when they evaluate choices between features from a test perspective.
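For concreteness, here’s a minimal sketch of that arithmetic in Python. The scale names and the example scores below are hypothetical; any set of 1-10 ratings works the same way.

```python
# The seven subjective scales described above.
SCALES = [
    "has_dependency", "is_dependency", "importance",
    "testability", "complexity", "security_risk", "x_factor",
]

def relative_test_cost(scores):
    """Average the subjective 1-10 ratings into one relative cost on a 1-10 scale."""
    missing = [s for s in SCALES if s not in scores]
    if missing:
        raise ValueError(f"missing scales: {missing}")
    return sum(scores[s] for s in SCALES) / len(SCALES)

# Hypothetical feature: hard to test and security-sensitive.
feature = {
    "has_dependency": 4, "is_dependency": 7, "importance": 9,
    "testability": 8, "complexity": 6, "security_risk": 9, "x_factor": 5,
}
print(round(relative_test_cost(feature), 1))  # 6.9
```

Nothing stops you from weighting the scales differently once historical data suggests some matter more than others; a straight average is just the simplest starting point.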

 

I’m also searching for a useful acronym with which to represent these.  I
guess I’d like to find out if I’m missing something before deciding on STICHIX or
CHIXIST.  (I’ll also take suggestions
on this.)

 

What can I cost with real numbers?

“Your pseudo science frightens me, what can I cost with real numbers?”  Ok,
so the above scales aren’t based on exact complex differential equations, but then
again neither was Moore’s Law.  If you
are looking for real costs, then you’ll just have to itemize and cost in # of real
days, compressed hours, monkeys required to cover everything, or whatever suits
the procedures followed by your team best.  I’d
like to recommend you hold off on getting these numbers until you have written a basic
test plan for the feature, so you can get relatively low level and say, for example,
“Feature A will require testing functionality Z with about X number of tests that
take Q time to develop per test.” You could also try out old standbys like “If it
takes someone 1 day to write it then you’ll spend 2 days testing it.” However you
want to do it, these tangible, static costs are also important to have in front of you
when looking at a feature and planning your resources.
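As a rough sketch of the itemized approach (all of the numbers below are made up for illustration; plug in your own counts and times from the test plan):

```python
def itemized_test_days(num_tests, dev_days_per_test, run_days_per_pass, passes):
    """Days to author all the tests, plus days to execute them over several passes."""
    return num_tests * dev_days_per_test + passes * run_days_per_pass

def rule_of_thumb_days(coding_days, ratio=2.0):
    """The old standby: N days of coding implies ratio * N days of testing."""
    return coding_days * ratio

# Hypothetical feature: 40 tests at a quarter day each, three 1.5-day test passes.
print(itemized_test_days(num_tests=40, dev_days_per_test=0.25,
                         run_days_per_pass=1.5, passes=3))  # 14.5
print(rule_of_thumb_days(5))  # 10.0
```

Comparing the itemized number against the rule-of-thumb number is a cheap sanity check: when they disagree wildly, one of your inputs probably deserves a second look.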

 

I’ll make no determination just yet on where the pseudo science and the tangible costs
cross, other than knowing they are both important.  I’m
just thinking out loud here, looking for some feedback on these thoughts.

- josh