Quality of Testing and Quality of Code

I see a good number of Tweets in my Twitter stream that just flash by without leaving any impression. Chatty comments between people I barely know (if at all), links to things that don’t really interest me, and other items that really don’t have a lifetime beyond a few minutes after they are sent. But enough them really do make me think or take action that it makes Twitter  pretty useful for me. Recently I saw the following Tweet from Eugene Wallingford.

image

Student developers do “interesting” things. For example many of them try to write the whole project before starting to test. Oh they may run the compile to check for syntax errors – as if a program that compiles cleanly will automatically work correctly. And then they start to test. This seldom goes well because there is just much too much to test all at once. Especially for a beginner finding logic errors in a (theoretically) complete program is just too complicated. What makes it worse than just the quantity of code (which let’s face it is not all that much compared to a professional program) but the number of code paths (i.e.. places for things to go wrong) that is just more than a beginner can trace in their heads. Yes, in fact, a beginner writing code this way means that the quality of the code may just not be good enough for easy testing. The result is a long and painful testing process.

Experienced programmers test small pieces of code. The fewer code paths the easier it is to design a set of test data to test them all. This, test design, is not something we tend to spend much time on in introductory courses. Or perhaps in anything other than a software engineering course which is a shame. I think we could save a lot of students a lot of heartache and perhaps reduce attrition (theory of mine) if we taught them how to test earlier.

We ask students to do a number of things when creating their own software. I tend to start with three large items:

  1. Understand the problem
  2. Design a solution (algorithm)
  3. Code up the solution

Many start the testing thinking during (or worse after) the code up the solution part. Not a good idea. At the latest we need to start thinking about testing no later than design a solution. In fact if we understand the problem we should already have a handle of testing. At that point we should know  - what is the right answer for various inputs? What values should be outside the range of what we want to handle and how should we deal with them? In short, testing data is all part of understanding the problem. This understanding should directly lead to test data. Much of this test data would should manually run through our algorithm before we code solutions. When we test our data with the code the results we achieved during manual testing of the algorithm should match.

Reducing the pain of testing requires several things of which having good code is just one. It’s one important piece of course but not the whole piece. Having good, solid testing data with known expectations of results is a piece. So is testing small pieces of code so that fewer things can go wrong. The fewer the places for things to go badly the easier it is to find and repair them. By coincidence (not) breaking ones code into smaller, more manageable and testable pieces almost invariably results in better code. Sections of code that are too long and that try to do too many things in one module are the huge run-on sentences of programming. In natural language writing run-on sentences are a “bad thing” and make understanding difficult for the reader. Modules of code with too many lines make testing and understanding difficult. They also make modification, expansion, reuse, maintenance and all sorts of other things difficult if not impossible.

The first programming course is a difficult course to design. Students have to learn a whole new way of thinking as well as a new language in which to express their ideas. That alone makes it difficult. And then there are so many things that we could make an argument belong in the first course. I could see a first course that is four years long if we tried to get it all in. So by necessity things get left out. I am thinking, at this point in my life, that testing is not one of those things we can afford to leave out though. Not that we spend forever on it of course.  I also don’t think it is a separate unit either. Rather I think it is something we talk about as we talk about other things. Include it is the discussion of understanding the problem. Include it with designing algorithms. And include it when we talk about writing the code as something more than just having the compiler check the syntax. It it means that something else waits until a second or third course so be it. On the other hand I wonder if doing better testing and including testing thinking in the whole process might just reduce the amount of time student projects take to complete. It’s a theory. What do you think?