It's all about systems - even if you're testing only a component

Earlier this year I bought a brand new computer and wanted to play what turned out to be one of my favorite games of all time. However, it also turned out that this new computer came with a pretty bad graphics card. It did work, but the performance was so poor that even full-screen DVD playback was affected. So I bought a new card, installed it, and was ready to play, except that the game, which had run fine before, now crashed during initialization. A quick search on the web turned up reports from other users hitting a similar, if not the same, issue. I thought about sending an email to the developer, but since others had already posted about this on the support forum it would have been nothing more than a "me too" mail unless I provided some additional information. Fortunately, I had already installed Visual Studio on that machine, so I started the debugger and tried to gather some information about the crash. The first thing I saw was this:

    Unhandled exception at 0x6e75f0ec (d3d8.dll) in SamMax101.exe: 0xC0000005: Access violation reading location 0x00000000.

It looked like someone was dereferencing a null pointer, that is, trying to read from memory address 0. I did not pay enough attention to the module in which this occurred. If I had, I could already have been done at this point, but I kept going and checked the stack traces. There were seven threads, of which only the main thread did something useful, and it was also the thread on which the access violation occurred.

    d3d8.dll!CD3DHal::CreateVertexShaderI() + 0x24c bytes
    d3d8.dll!CD3DBase::CreateVertexShader() + 0x33e bytes
    SamMax101.exe!005cc3a6()
    [Frames below may be incorrect and/or missing, no symbols loaded for SamMax101.exe]
    SamMax101.exe!0070a4b8()
    SamMax101.exe!005cc79d()
    SamMax101.exe!007a4aed()
    SamMax101.exe!00556619()
    SamMax101.exe!00730061()
    kernel32.dll!_FindFirstFileA@8()
    7ffdfbf8()
    ntdll.dll!@RtlpAllocateHeap@20() + 0x3ce bytes
    SamMax101.exe!00592208()
    user32.dll!_NtUserPeekMessage@20() + 0xc bytes
    user32.dll!__PeekMessage@24() + 0x2d bytes

The two topmost frames on the stack (that is, the most recent calls) refer to functions in Direct3D 8 (d3d8.dll), suggesting that the game might be innocent after all (and it was). According to the trace, the access violation occurred in CreateVertexShaderI(). Even without knowing all the details, it's fairly obvious that creating shaders involves the graphics card driver, so there actually was another potential source for the issue that wasn't even in the stack trace. An inner voice finally said "update the graphics card driver, Luke", and after doing so the game worked flawlessly again (and, with the new card, at a very high frame rate).
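The check I skipped at first, looking at which module owns the faulting frame before reading anything else, can be sketched in a few lines of Python. This is only an illustration: the function names and the regular expression are made up for this post, and the parser assumes frames in the `module!symbol` form shown above. Frames without a module name (such as `7ffdfbf8()`) and debugger notes in square brackets are simply skipped.

```python
import re

# Assumption: frames look like "module!symbol() + 0x24c bytes" (Visual Studio)
# or "module!symbol+0x88" (WinDbg). Anything else is ignored.
FRAME_RE = re.compile(r"^\s*(?P<module>[^!\s]+)!(?P<symbol>[^\s(+]+)")


def parse_frames(trace):
    """Return (module, symbol) pairs, topmost frame first."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group("module"), match.group("symbol")))
    return frames


def faulting_module(trace):
    """Return the module of the topmost parsed frame, or None if there is none."""
    frames = parse_frames(trace)
    return frames[0][0] if frames else None


def modules_in_order(trace):
    """Return each module once, in the order it first appears in the trace."""
    seen = []
    for module, _symbol in parse_frames(trace):
        if module not in seen:
            seen.append(module)
    return seen
```

Called on the trace above, `faulting_module` returns `d3d8.dll`, which is the hint that the crash happened inside Direct3D rather than in the game's own code. `modules_in_order` gives a quick overview of every component that shows up on the stack.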

A couple of months later I received an email about Visual Studio crashing with an access violation when attempting to create a new Silverlight or WPF project. That sounded a bit strange so I asked for a stack trace and the trace I got back contained the following frame:

    d3d9!CD3DDDIDX10TL::CreateVertexShaderFunc+0x88

With this piece of the puzzle the issue started looking similar to the game crashing because of a bad driver. So I suggested updating the graphics card driver, and that once again fixed the issue. But why am I telling you all this? Well...

  • It's all about systems
    First, let's go back to the title of this post. As testers we usually own a specific feature (area) that we focus on, and there is nothing wrong with that: a complex product can't be tested by one person alone. But it is important to understand the components that your feature relies on, and the components those components rely on, and so on all the way down to the operating system. Of course you cannot have a deep understanding of everything in a software stack. The "further away" a component is from the ones you own, the more basic and general your understanding of it will be. Still, any knowledge you have is an advantage. Going back to the example of Visual Studio crashing, I have to admit I was puzzled at first. But if you happen to know that WPF uses DirectX for rendering, it's no longer surprising that the WPF designer (which starts automatically when you create a new WPF application) can bring down VS when the driver is bad. Ultimately this is about being able to determine the root cause of an issue, which is essential for logging good bugs. Otherwise a tester will log the issue against the feature through which it surfaced, and that in turn puts an unnecessary burden on the developer to determine the actual cause. In my example above, the developer might not even be able to repro the issue on their own machine if it has a different graphics driver.
  • Good testers need broad debugging skills
    I'm a "managed developer/tester". I love the .NET Framework and Silverlight, and I try not to touch unmanaged code unless I really have to. But when you look at the software we use every day, starting with the OS itself, there is still a lot of unmanaged code in the mix that hosts our managed solutions and that we interop with. Again, my point is not that you should, or even can, be an expert in every field. But given this mix, you need to be able to debug into both the managed and the unmanaged parts of a system to determine the root cause of an issue, or at least narrow it down. Just being able to figure out the area in which an issue originally occurred is pretty useful, since it allows you to take all the information you have already gathered and hand it over to the person who actually owns the corresponding feature. More generally, and this includes managed code that is simply owned by someone else, testers need to be comfortable debugging into "foreign" codebases, or parts of them, in order to write detailed and accurate bug reports even when the issue extends beyond the features assigned to them.
  • Noticing patterns in defects
    Granted, this post may not contain the best example. Either way, recognizing patterns is important because it can save you a lot of time. In my examples it was just an access violation happening somewhere during vertex shader creation, and your recurring issues may be more complex. The point is that if you have similar-looking issues and you've already found a solution for one of them, it's definitely worth checking whether the same solution applies to the other ones.
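To make the last point a bit more concrete, here is a minimal Python sketch of a "known issues" lookup. Everything in it is hypothetical: the `KNOWN_ISSUES` table, the `suggest_fixes` function, and the regular expression are invented for this illustration and only cover the one pattern from this post, namely a Direct3D frame involved in vertex shader creation.

```python
import re

# Hypothetical table: a pattern for a stack frame, plus the fix that
# resolved an earlier issue showing that pattern.
KNOWN_ISSUES = [
    (
        re.compile(r"^d3d\d+(\.dll)?!.*CreateVertexShader", re.IGNORECASE),
        "update the graphics card driver",
    ),
]


def suggest_fixes(trace):
    """Return previously successful fixes whose pattern matches any frame."""
    suggestions = []
    for line in trace.splitlines():
        frame = line.strip()
        for pattern, fix in KNOWN_ISSUES:
            # Report each fix only once, even if several frames match.
            if pattern.search(frame) and fix not in suggestions:
                suggestions.append(fix)
    return suggestions
```

With this table, both the d3d8.dll frames from the game crash and the d3d9 frame from the Visual Studio crash lead to the same suggestion, while a trace that contains only frames from the game itself yields an empty list. A match is a hint worth trying first, not a diagnosis.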

This posting is provided "AS IS" with no warranties, and confers no rights.