Struggling with unit tests (again)

I am having a mental hard time today because of unit tests. Not tests that fail... tests that don't exist. And tests I’m not sure that should exist.

It is very natural, right, and highly necessary to test methods which do a non-trivial transformation of data. Here is an example: given a list of events, output a list of the timespans between the events, excluding empty timespans.

The implementation of this method will do conditional tests, looping, and handle edge cases in the input such as an empty or single-item list. It not only feels right to test such code, it is also very, very easy to actually test such code as long as you do the appropriate factoring - so that test code can supply/mock out dependencies.

In my scenario this feels like mostly testing the implementation of the system. It is not a true end-to-end scenario, and while it helps satisfy some requirement of the system, and verifying its logic helps provide reassurance that the system overall meets its requirements, it isn’t meeting a requirement per se.

However, there’s a couple weird things I noticed while thinking about this.

I ended up writing and rewriting code a bit so that this code was easily testable, and found that I introduced new data classes that capture more specifically the input/output required of this particular piece of code. Which meant I wrote a little bit of glue code that runs before and after to squeeze data into the exact right shape to easily process, and then resqueeze it into a final output format.

I have the following doubts about what I just did:

· That glue code on the input side shouldn’t really be necessary – I should be able to get or process the input data in the exact right form in the first place

· That glue code on the output side is somewhat necessary, because it also has other functionality - it’s merging yet more data together adding them to the timespans, as it simultaneously puts things into a slightly different format required by another system. However, this could have been done without as many types into the picture, if I was generating the output type from the code under test.

· What I actually built was a pipeline – reformat data, process data, reformat data yet again, merge in more data. The goodness parts of my code, the bits of the application that MUST survive in some form no matter how I refactor it, and probably the best bits to test are the actual changes to the data that happen, not the ‘data contracts’ or ‘schemas of data’ in between those pieces of pipeline.

Where am I going with this? Testing the trivial reformatting of data from one input form to another intermediate form doesn’t feel that valuable. Testing more trivial reformatting of data from a second intermediate form to the final output form also doesn’t feel valuable. Yes there can be bugs in this code. But we’re testing something that doesn’t matter to the customer. It’s better to avoid writing such a test if you can write a test of the same code that also proves something that does matter to the customer.

Actually… the input form and output format of the data don’t actually matter to the customer. They are just conventions of code for dealing with a database and some another service… what matters is that actual data flows through the system.

OK. And now to the second part of my dilemma… I also have a manager asking me if I can get 100% code coverage and OK, if not, why not 80%?

So I feel caught in a pinch between two basic issues: 1) it is necessary to prove every single piece of my code is correct 2) most of my code is so arbitrary, that testing it can’t be said to provide end user value. In fact if done in the wrong way it will provide negative value, since it slows down my ability to change arbitrary code into different arbitrary code that solves some new issue.

How do I get out of this pickle?

Here is my initial idea of how to move forward:

1. write my first tests of some code I know matters

2. measure code coverage, and try to find some more code that isn’t covered at all, but definitely matters to final correctness

3. refactor that code to make it a testable independent piece that can, once tested survive unchanged forever – adding some glue code if necessary

4. figure out some way to remove the glue code? Or write some very stupid throwaway tests that prove this code is correct right now, but the test should probably be deleted as soon as the test fails.

I’ll let you know if that experiment turns up anything interesting…