Do you have a bug story? Everyone has stories of being affected by bugs. However, the stories from software folks are fascinating because the annoyance of the bug is atmosphere rather than the whole story. For us the story is about defeating the bug, or uncovering it, or sometimes about being its infamous architect.
My favorite personal bug story is from a college internship working on an autonomous robot. I was working on the mapping system and a friend was working on the guidance. Each of us was attempting to complete a master's thesis. The problem was that the underlying control system for the robot would hang for no apparent reason when the robot moved quickly. We didn't build the robot, but if it couldn't move reliably, neither of us would be able to graduate.
Over a period of weeks we narrowed the problem down to something passing in front of the infrared proximity sensors while the robot was moving at high speed. We determined this key piece of information by scientifically running next to the robot, stooped over and waving our hands in front of the sensors. Fortunately, another friend on the project built us a platform that allowed the robot to go full speed on a set of rollers, preserving our dignity and allowing for more controlled tests.
To determine the underlying problem we wrote a sound driver that emitted a tone from the speaker and littered the control code with calls to it, each call using a different frequency. Now the robot sounded like a modem trying to connect. We would repro the hang, and based on the pitch at 'flatline' we knew the point of the last successful call to the sound driver. After a few runs and repositioning the calls, we found the last successful call was just before an attempt to read the digital compass. The digital compass was a black box for us. We didn't control the hardware or the software. The call that was hanging was a call out to a provided library. We couldn't go deeper into the software, but the clue allowed us to find the underlying issue.
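For anyone who wants to try the idea, here is a minimal Python sketch of tone-per-checkpoint logging. It is not the code we ran on the robot. The names (ToneLogger, control_step, read_compass), the frequencies, and the exception standing in for the hang are all made up for illustration, and nothing here drives a real speaker.

```python
class ToneLogger:
    """Gives each checkpoint its own pitch and remembers the last one played."""

    def __init__(self, base_hz=440, step_hz=110):
        self.base_hz = base_hz
        self.step_hz = step_hz
        self.pitches = {}    # checkpoint name -> frequency in Hz
        self.last_hz = None  # the pitch still sounding if the code stalls

    def checkpoint(self, name):
        # Assign the next unused pitch the first time a checkpoint is seen.
        if name not in self.pitches:
            self.pitches[name] = self.base_hz + self.step_hz * len(self.pitches)
        # On real hardware this is where the speaker would be driven.
        self.last_hz = self.pitches[name]

    def name_for(self, hz):
        # Translate the pitch heard at "flatline" back to a checkpoint name.
        for name, pitch in self.pitches.items():
            if pitch == hz:
                return name
        return None


class SimulatedHang(Exception):
    """Stand-in for a call that never returns (hypothetical)."""


def read_compass(interrupt_overflow):
    # Hypothetical black-box library call; it "hangs" when the flag is set.
    if interrupt_overflow:
        raise SimulatedHang("compass library never returned")
    return 90.0


def control_step(tone, interrupt_overflow):
    # Checkpoints are sprinkled between the steps of the control loop.
    tone.checkpoint("read_ir_sensors")
    tone.checkpoint("update_motors")
    tone.checkpoint("before_compass_read")
    heading = read_compass(interrupt_overflow)
    tone.checkpoint("after_compass_read")
    return heading


def find_flatline(interrupt_overflow):
    """Run one control step and report the last checkpoint that sounded."""
    tone = ToneLogger()
    try:
        control_step(tone, interrupt_overflow)
    except SimulatedHang:
        pass  # the real robot simply stopped; the tone kept playing
    return tone.name_for(tone.last_hz), tone.last_hz


if __name__ == "__main__":
    print(find_flatline(interrupt_overflow=True))
    print(find_flatline(interrupt_overflow=False))
```

The useful property is that the log needs no console, no storage, and no working return path: whatever pitch is left playing is the answer. Moving the checkpoints closer together between runs narrows the search, much like a manual bisection.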
The issue was a hardware problem. A low resistor value was used in the circuit attached to the infrared sensor. Instead of providing one clean, fast transition when something was detected, it created a slow rise. When the robot was moving, the slow rise was noisy and would cause multiple transitions into and out of the 'something's been detected' level. Each transition sent a hardware interrupt, and the burst of interrupts overflowed the interrupt buffer. The overflowed interrupt buffer caused the digital compass code to hang. Once we knew the problem, we verified it with an oscilloscope and swapped out the resistor.
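To see why a slow, noisy rise produces a burst of interrupts while a fast one produces a single interrupt, here is a small Python simulation. It is only a sketch: the millivolt levels, the alternating noise pattern, the threshold, and the function names are invented for illustration and are not measurements from the robot.

```python
def noisy_ramp(rise_samples, total_samples=200, high_mv=1000, noise_mv=50):
    """Return a signal that ramps from 0 to high_mv with alternating noise.

    All values are integers in millivolts (assumed, illustrative units).
    """
    samples = []
    for i in range(total_samples):
        level = min(high_mv, i * high_mv // rise_samples)
        # Deterministic stand-in for noise: +noise on even samples, -noise on odd.
        noise = noise_mv if i % 2 == 0 else -noise_mv
        samples.append(level + noise)
    return samples


def count_rising_edges(samples, threshold_mv=500):
    """Count low-to-high crossings; each one would raise an interrupt."""
    edges = 0
    was_high = False
    for value in samples:
        is_high = value >= threshold_mv
        if is_high and not was_high:
            edges += 1
        was_high = is_high
    return edges


if __name__ == "__main__":
    fast = count_rising_edges(noisy_ramp(rise_samples=2))
    slow = count_rising_edges(noisy_ramp(rise_samples=100))
    print("fast rise interrupts:", fast)
    print("slow rise interrupts:", slow)
```

With these made-up numbers the fast ramp crosses the threshold once, while the slow ramp lingers near the threshold long enough for the noise to push it across several times. Every extra crossing is another interrupt for a single real detection.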
I love that story. In college I was a slave to 'printf' debugging, but in this instance there was no way to use printf. However, a simplified 'logging' strategy using sound did work.
If you have a favorite bug war story, I'd love to hear it.