To clean up or not to clean up

One attribute typically considered important for automated test cases (or almost any other test case for that matter) is cleanup.

The test shall leave the system in the same state it was in before the test ran.

This is a good theory, but I wonder how practical it is. Anything a test does has the potential to affect the system in ways I am unaware of. Even if I close the application I was testing, remove any files my test may have created, and remove any other artifacts, the fact that some amount of code ran leaves the system in an unknown state. Can you really leave the system in the state it was in before the test ran?

When I was on the CE team, we just threw this rule out the window. Every test suite (in this context, a collection of tests for a particular functionality or attribute, as defined by the tester) ran from the exact same starting point - because we automatically re-flashed a new OS onto the device before every test suite. Flashing a new OS took a minute or two, but guaranteed a known environment at the beginning of each test run. Short of a solution like this, I don't know how this rule could be accurately observed.
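For anyone who wants to picture the "reset before every suite" idea without a device lab, here is a minimal sketch in Python. It is not what the CE team ran; it assumes a directory copy can stand in for re-flashing, and the names fresh_environment and run_suites are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def fresh_environment(baseline: Path) -> Path:
    """Copy a pristine baseline into a new temp dir (a stand-in for re-flashing)."""
    workdir = Path(tempfile.mkdtemp(prefix="suite_"))
    # dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(baseline, workdir, dirs_exist_ok=True)
    return workdir


def run_suites(baseline: Path, suites: dict) -> dict:
    """Run each suite against its own fresh copy of the baseline."""
    results = {}
    for name, suite in suites.items():
        env = fresh_environment(baseline)  # known starting point for this suite
        try:
            results[name] = suite(env)
        finally:
            # The environment is thrown away, not cleaned up by the tests.
            shutil.rmtree(env, ignore_errors=True)
    return results
```

Each suite gets a callable that takes the environment path, so nothing one suite leaves behind is visible to the next. The cost is the same one described above: whatever state the suite left behind is discarded without being looked at.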

To be clear, I'm not saying I liked this approach. One advantage of not restoring the system to a known state is the exact scenario that this rule tries to avoid. Running code causes the system to change. One of the things I always worried about was "What if a test is causing an OS memory leak or corruption or some other badness? What if the badness is severe, but just not severe enough to cause a crash in the tests? The system may be in an awful state after the tests complete, but we'll never know, because we throw away the system state after the test executes." (Yes, you could run a memory check or other diagnostics after tests run and maybe catch stuff like this...)
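As a rough illustration of the "run a memory check after tests" idea, here is a small Python sketch. It only measures allocations inside the Python process using the standard tracemalloc module, so it is a stand-in for real OS-level diagnostics, not a replacement. The function name run_with_memory_check and the 1 MB threshold are assumptions picked for the example.

```python
import tracemalloc


def run_with_memory_check(test, threshold_bytes=1_000_000):
    """Run a test and report how much traced memory is still held afterwards.

    Returns (growth_in_bytes, suspicious), where suspicious is True when the
    growth exceeds threshold_bytes. The threshold is an arbitrary example value.
    """
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    try:
        test()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    growth = after - before
    return growth, growth > threshold_bytes
```

A test that passes but leaves a couple of megabytes allocated would come back flagged as suspicious, which is exactly the kind of badness that disappears if the system state is thrown away without a look.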

The point is that I don't know if the goal of cleanup is attainable, or even if it is something desirable. Sure, if you create a thousand files, maybe you need to clean them up, but maybe we need to embrace change rather than trick ourselves into avoiding it.

Let me put it this way: Forget clean up. Don't bother. Unless it's necessary to clean something up, leave the system in the dirtiest state you can. Heck - it's a lot closer to the way customers run the software anyway. My parents never delete files, and they certainly don't clean up reg keys. I just can't think of many good reasons to put effort into cleaning up after tests. Of course, you are welcome to let me know if I've forgotten something.

Comments (4)

  1. scyost says:

    When we test on real devices we actually can’t do that anymore. Clearing the flash on a retail device takes several minutes and we can’t afford to have the lab spending so much time cleaning up devices.

    You can probably go either way with that rule. If you have a cheap way to restore your environment then you don’t need to spend so much on cleaning up after a test.

  2. Adam Goucher says:

    I think it bears mention that you would be well served to return things to a clean state at least a couple of times during the cycle. We used to do this at the beginning of the cycle, 2/3 of the way through and on one of the RC builds.

    Here is a story to illustrate why…

    When I was at a startup, we got bought by someone who promptly bought us brand new Dell machines to replace our mis-matched bootstrap financing machines. When the machines arrived we were in the midst of the re-badging release so didn’t have time to set up the new machines in our ‘lab’* but on the final build someone found one of the new machines and put in the cd to see the speed improvement (more on a lark than anything else). When we ran the product it blue screen o’ deathed. Turns out that we had inadvertently created a dependency on one of Office’s DLLs. Had we wiped a machine clean, we would likely have found this problem much sooner.

    I agree that cleaning things every time might be overkill, but there is value to it. Yes, users’ machines are a mess, but it is often not the mess your application is creating.

    * a bunch of too-slow-for-everyday machines sitting in a corner in a mess of cables and monitors

  3. Alan says:

    Great comments.

    Scott – I think your second paragraph is the exact right answer – but one I think a lot of people miss.

    Adam – I’ve seen a lot of similar situations. I think the right approach may be a combination of solutions. Cleanup is nice, because it makes failures easy to diagnose – if something failed, it is very likely something that the test did. Following Scott’s advice, you could clean as often as time constraints allow.

    At the same time, you may want to just let one build run and run and run (with some sort of "long-haul" tests running) and see if the system degrades at all. I’ve seen situations where a system worked perfectly in test, but failed in the field – the software was a service that nobody on the test team ran for longer than a weekend without upgrading. Customers, unfortunately, found out that the service ground itself to a halt after running for about two weeks.
