Coming up with estimates for test work

This week has been busy, and one of the tasks on our plate is defining estimates for the test work needed for the different items on our list. Getting test estimates correct is always a challenge since, in a very real sense, the test matrix for any (non-trivial) feature is infinite. This means that the test work is also infinite, and we need to deal with this.

When we want to add a new feature to OneNote, a useful technique is to first define all the testing that is possible. This includes all automation, tools, and test cases. I want to focus on test cases for today, and for this, we will define a test case as a set of test steps that need to be run to verify OneNote produces the expected output. A first attempt at this could lead you to come up with cases that produce the behavior the user expects - these are the happy cases. For instance, if I were testing spell check for a new language we are supporting, I would have a case to check that a correctly spelled word is not flagged and a case to check that an incorrectly spelled word is flagged.

Moving on from here, I want to check failure cases - times at which OneNote is expected to fail to run spell check for the new language. I would have a case to change the language from the new language to a different language - the new language spell check should fail to run. Other errors that I may want to test would be updates to the custom dictionary - an obvious case would be a failure to update the dictionary file if the hard drive is full. And so on…

After this I would finish the test matrix and get it ready for review. There are whole books written on completing a test matrix, so I will move past that here and get to the estimating piece.

For each test case in the matrix, I need an estimate of how long it will take me to run. I really want automation to run the case for me, so my estimate will reflect how long it will take me to create the automation for that particular case. Let's say it is one hour for the test case in which I check for a correctly spelled word. That may include overhead to get my machine in the correct state, enlist in the new code, build, etc. If the new feature is complicated, it may take me more time to learn the code and come up with a method to validate the results. So my initial estimate may include some "learning" time as well as the coding time.

The estimate will depend on the complexity of the new code. Am I verifying page content? If so, validation is somewhat easy. If we are changing our sync mechanism, I may need to verify some aspect of network traffic. I may need a day (or days) to examine existing tools to see what can be used, or I may need to write my own. Writing my own would require a design, a design review and time to test my tool, which in turn would be used for testing.

But for this simple case of spell check, I can validate that the word is on the page in the expected state, so I can quickly come up with a validation routine. Assuming I do that and get my first test done, the estimate for checking for an incorrectly spelled word may now be less than a minute - if I wrote my first test correctly, I should be able to complete that next test very quickly since I could reuse most of the code.

I do that for each test case and then total the number of hours needed. That's my first estimate. After that, I go do the work, track how long it actually takes, and update our status accordingly. When it is done, I compare my estimate to the actual time and use that to assess how accurate I was.
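
To make the bookkeeping concrete, here is a minimal sketch in Python of totaling the per-case estimates and then comparing them to the actual time. This is only an illustration, not the tool we use: the names CaseEstimate, total_estimate and estimate_error are hypothetical, and the hours in the example are made up apart from the one-hour figure mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CaseEstimate:
    """One row of the test matrix (hypothetical structure)."""
    name: str
    estimated_hours: float
    actual_hours: Optional[float] = None  # filled in once the work is done


def total_estimate(cases: List[CaseEstimate]) -> float:
    """Sum the estimated hours for every case in the matrix."""
    return sum(c.estimated_hours for c in cases)


def estimate_error(cases: List[CaseEstimate]) -> float:
    """Relative error over the finished cases: (actual - estimate) / estimate.

    A positive value means the work took longer than estimated.
    """
    done = [c for c in cases if c.actual_hours is not None]
    estimated = sum(c.estimated_hours for c in done)
    if estimated == 0:
        raise ValueError("no finished cases with a non-zero estimate")
    actual = sum(c.actual_hours for c in done)
    return (actual - estimated) / estimated


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not measured).
    matrix = [
        CaseEstimate("correctly spelled word is not flagged", 1.0, 1.5),
        CaseEstimate("incorrectly spelled word is flagged", 0.25, 0.25),
    ]
    print("total estimate (hours):", total_estimate(matrix))
    print("relative error:", estimate_error(matrix))
```

Tracking the error per feature over time is one way to see whether the estimates are getting better.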

A similar process exists for cases I cannot automate. I have to estimate how long it will take to manually run each case. That usually is a much higher number than one hour to start - I need to get a test machine set up, install a version of OneNote to test, start the application, manually go through the steps and then log results. Even with tools to perform most of the setup for me, I estimate a minimum of one hour to run a very simple test. And if I have to run these tests 5 times per month, that adds up to a large number of hours very quickly. Automation wins here - it runs automatically in a lab as often as needed and does not require this large, recurring time commitment.
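
The recurring cost is easy to put numbers on. The sketch below uses the figures from this post (one hour per manual run, 5 runs per month, one hour to automate a simple case). The function names are hypothetical, and it assumes an automated run costs me no time once it is in the lab, which ignores maintenance.

```python
def manual_hours(hours_per_run: float, runs_per_month: int, months: int) -> float:
    """Total time spent running one case by hand over a number of months."""
    return hours_per_run * runs_per_month * months


def break_even_months(automation_hours: float,
                      hours_per_run: float,
                      runs_per_month: int) -> float:
    """Months of manual runs after which the automation has paid for itself.

    Assumes an automated run takes none of the tester's time (a simplification).
    """
    return automation_hours / (hours_per_run * runs_per_month)


if __name__ == "__main__":
    # One hour per manual run, 5 runs per month, over a year.
    print("manual hours per year:", manual_hours(1.0, 5, 12))
    # One hour to write the automation for the same simple case.
    print("break-even (months):", break_even_months(1.0, 1.0, 5))
```

Even if the automation took several times longer to write than my estimate, it would still pay for itself within a few months under these assumptions.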

The key point here is that I need to estimate all aspects of the testing needed, from machine setup, to automation creation, to the manual steps needed, and then check my estimates to see if they were accurate. It is a process of continual improvement, and it really helps with knowing when new features will be done.

Questions, comments, concerns and criticisms always welcome,

John