Black Box Testing

   I attended a talk on campus yesterday discussing various aspects of testing.  Part of the talk discussed the need for testers to become better versed in the formalities of testing.  I'll leave that subject for another day.  A portion of the talk, however, discussed an experiment done with some inexperienced testers.  They were asked to create test cases for the Myers Triangle Test.  A lot of the test cases they came up with were not useful.  By that I mean they didn't test the algorithm or they were redundant with other tests.  Some would try inputting something like an 'A', which is invalid and won't get past the string->int conversion, or they would try lots of different numbers that all went down the same code path.  If you look at the underlying code, it is obvious why these tests don't make sense.  Too often, though, test plans are full of cases like these.  Why is that?
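
   To make that concrete, here is a minimal sketch of what a triangle-test implementation might look like.  This is my own illustration, not the code from the exercise; the names and details are made up, but the shape is typical:

      #include <string>

      enum class TriangleKind { Invalid, Scalene, Isosceles, Equilateral };

      TriangleKind ClassifyTriangle(const std::string& aStr,
                                    const std::string& bStr,
                                    const std::string& cStr)
      {
          // std::stoi throws std::invalid_argument for input like "A", so a
          // test that passes 'A' never reaches the classification logic at all.
          int a = std::stoi(aStr);
          int b = std::stoi(bStr);
          int c = std::stoi(cStr);

          // Sides that cannot form a triangle at all.
          if (a <= 0 || b <= 0 || c <= 0 ||
              a + b <= c || b + c <= a || a + c <= b)
              return TriangleKind::Invalid;

          // Every set of three equal, valid sides takes this same branch, so
          // piling more numbers onto this path adds nothing new.
          if (a == b && b == c)
              return TriangleKind::Equilateral;

          if (a == b || b == c || a == c)
              return TriangleKind::Isosceles;

          return TriangleKind::Scalene;
      }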

   I contend that we often test things only at the surface level and don't consider the implementation.  At some point in time I was told that black box testing was a good idea because if you looked at the underlying code, you might make the same flawed assumptions that the developer made.  This is probably also where we got the notion that you shouldn't test your own code.  I never really agreed with the concept of purposeful black box testing but didn't fully challenge the assumption in my mind.  After some reflection though, I am pretty sure that black box testing is almost always less useful than white box testing. 

   Just in case you don't follow, let me define some terms.  Black box testing is testing where you don't understand the implementation details of the item you are testing.  It is a black box.  You put in data, you get out different data, and how it transforms the data is unknown.  White box testing is testing where you have the source code available (and look at it).  You can see that there are 3 distinct comparisons going on in the Myers Triangle Test.
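
   With an implementation like the sketch above in view, a white box tester can aim one test at each distinct comparison instead of guessing.  The selection might look something like this (again just an illustration, assuming the ClassifyTriangle sketch from earlier):

      #include <cassert>

      // Assumes the ClassifyTriangle sketch shown earlier.
      void RunTriangleTests()
      {
          // One case per distinct code path, chosen by reading the comparisons.
          assert(ClassifyTriangle("3", "3", "3") == TriangleKind::Equilateral);
          assert(ClassifyTriangle("3", "3", "5") == TriangleKind::Isosceles);
          assert(ClassifyTriangle("3", "4", "5") == TriangleKind::Scalene);
          assert(ClassifyTriangle("1", "2", "9") == TriangleKind::Invalid);  // fails the triangle inequality
          // Adding ("4","4","4") or ("5","5","5") would exercise nothing new.
      }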

   Black box testing can be useful if we don't have the time or the ability to understand what we are testing, but if we do, it is always better to take advantage of it.  Without knowing the details, I have to try every potential input to a program to verify that all of the outputs are correct.  If I can see the implementation, however, I can just test each code path.  If all triangles are checked with A == B == C, I don't need to test Triangle(3,3,3) and Triangle(4,4,4) and Triangle(5,5,5).  After the first one, I'm not trying anything new.  Without looking at the code, however, I don't know that.

   Not only does white box testing allow you to see where you don't need to test, it lets you see where you do.  Some years ago I was testing our DirectShow DVD Navigator software.  There was a function for fast forward that took a floating point number.  From a black box perspective, one would have no idea what numbers to pass.  Just try some and call it good.  In this particular implementation, however, there were different behaviors depending on which number you passed in.  For a certain range of numbers, all frames were decoded and just played quickly.  For a higher range, only I-frames were played.  For everything above that range, the navigator started playing only some of the I-frames.  Without looking at the code, I could not have known which test cases were interesting, and I couldn't guarantee that I tried something from every range.
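
   To illustrate, here is a rough sketch of that kind of rate handling.  The thresholds and names here are invented (the real values were only visible in the navigator source), but it shows why the ranges matter and where the interesting test cases come from:

      enum class PlaybackMode { DecodeAllFrames, IFramesOnly, SkipSomeIFrames };

      // Hypothetical thresholds; the real boundaries lived in the code.
      constexpr double kMaxFullDecodeRate = 2.0;
      constexpr double kMaxIFrameOnlyRate = 8.0;

      PlaybackMode ModeForRate(double rate)
      {
          if (rate <= kMaxFullDecodeRate)
              return PlaybackMode::DecodeAllFrames;   // every frame decoded, just played faster
          if (rate <= kMaxIFrameOnlyRate)
              return PlaybackMode::IFramesOnly;       // only I-frames are played
          return PlaybackMode::SkipSomeIFrames;       // only some of the I-frames are played
      }

      // Reading this, the interesting rates are one from inside each range plus
      // the values at each boundary, e.g. 1.5, 2.0, 2.5, 8.0, and 8.5.  A black
      // box tester has no way of knowing those boundaries exist.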

   What about making wrong assumptions if you look at the code?  Won't that cause you to miss things?  Perhaps.  However, test driven development, unit testing, etc. have proven that testing done by the developer is quite effective.  Testers should also have a spec outlining what proper behavior should be.  If the code deviates from that spec, you found a bug (somewhere; it might be in the spec).  If you use common sense, you are unlikely to miss a bug because you make the same assumption as the developer.  If you do, the trade-off for greater test efficiency is probably worth it.  You'll have found many new bugs for each one you miss.