How embarrassing! How did that time bomb get into Vista anyway?

As promised, here is the story of how something like a time bomb slips through.

 

Two words: Dual Projects

 

Let’s set some basics on software development and define a few terms. I’m figuring most of you know this already but a baseline never hurt anyone.

 

When developing software you normally write code and check it into what is called a source tree. Think of the source tree like a library. Each source file is a book. You can check a book out and check it in.

 

The source tree keeps track of a lot of things:

 

  • Who checked something in
  • What were the differences
  • When was it done
  • History of the file

 

There is a lot more that a source tree does or can do, but that is the gist.

 

Once all the code is checked into a source tree you do what is called building. A build is when the source code gets compiled into something that can be used by the computer itself. A lot of things happen at build time but the main point is to think of this step as what makes the application. The best explanation for this is that compiling (building for the sake of this post) is what translates the code written by people into a form that can be understood by computers. Way oversimplified but enough to tell the story.

 

So rewind to last summer (2005). One part of the Media Center team was working furiously to finish Update Rollup 2 for Media Center 2005. All of the code was being checked into the Media Center 2005 source tree so that it could be built to make the update.

 

At the same time another part of the team was working on the Vista Media Center project and checking into a different source tree. Makes sense, right? Two projects. Two source trees.

 

Well, as we were finishing up the Update Rollup for Media Center 2005, we did what is called an “integrate,” often also called “merging.” This means that we took the code from the Media Center 2005 source tree and added it to the Vista Media Center source tree.

 

Merging is done so that the work that was done on the current project isn’t lost for the new project. The tricky part is that the same file might have been worked on in both projects already – so a merge has to be managed to resolve any conflicts. Basically you have to decide which of two files to keep – or beyond that which parts of the two files to keep.

 

So when we got to this merge point we still had a time bomb in the Update Rollup for Media Center 2005. This made sense. We weren’t done with the project and were sending betas out. We want to give people time to test – but we don’t want beta software to run forever, thus the time bomb. We make it expire after a certain period.
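
For readers who have never seen one, a time bomb can be as small as a date comparison. The sketch below is only an illustration of the idea, not the actual Media Center code; the names `BUILD_DATE`, `BETA_LIFETIME`, and `is_expired`, along with the specific date and the 180-day lifetime, are all made up for this example.

```python
from datetime import date, timedelta

# Hypothetical values for illustration only; not the real product's dates.
BUILD_DATE = date(2005, 8, 1)
BETA_LIFETIME = timedelta(days=180)


def is_expired(today, build_date=BUILD_DATE, lifetime=BETA_LIFETIME):
    """Return True once the beta has outlived its allowed lifetime."""
    return today > build_date + lifetime


if __name__ == "__main__":
    if is_expired(date.today()):
        print("This beta has expired.")
    else:
        print("Beta still valid.")
```

Notice that the expiry is hard-coded in the source. That is what makes it easy for a check like this to ride along in a merge and end up in another project.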

 

Well, as you might guess, the time bomb code got merged into the Windows Vista Media Center tree. Normally that is not a big deal. After a merge towards the end of a project, a developer is required to check in any fixes to both source trees. This avoids the need to do another expensive merge and helps make sure that fixes get into both projects. Somehow the removal of this particular time bomb was missed in the Vista tree.

 

I have a few guesses about how this happened:

 

  • The developer who removed the time bomb for Update Rollup for Media Center 2005 forgot to remove it in Windows Vista.
  • There was a merge conflict in the file containing the time bomb, and the time bomb code was kept instead of removed.
  • A file with the time bomb was reverted for some reason and the merged fix was lost.
  • We didn’t manage the check-ins tightly enough.

 

The worst part, though, is that we normally would have caught this in testing. Before we release major milestones such as betas or final releases, we run what is called a “Media Verification Test” to ensure that the product doesn’t contain anything like a time bomb, along with a lot of other last-minute checks. The build that had the time bomb was a minor release for us, so the media verification test pass wasn’t run.

 

What’s the final outcome of all of this?

 

The most important thing that can come out of this from a project management perspective is an answer to the question, “How can this never happen again?”

 

Well, a few things.

 

  • We will add the time bomb test to a more routinely run test pass.
  • I am going off to investigate how we can do something when we build (remember that a lot happens when we build) to set a dynamic time bomb so it will never be missed again.
  • Likely on the next round of dual projects we’ll add more stringent requirements on checking into both source trees.
  • We will review time bombs entirely to see if there is a better way to make sure beta copies don’t last forever.
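
To make the first two ideas a little more concrete, here is a rough sketch of what a build-time time bomb plus a routine check could look like. This is an assumption about one possible design, not something we have built; `expiry_for_build`, `verify_release`, the `is_final_release` flag, and the 180-day default are all hypothetical names and values.

```python
from datetime import date, timedelta
from typing import Optional


def expiry_for_build(build_date: date, is_final_release: bool,
                     beta_days: int = 180) -> Optional[date]:
    """Compute the expiry at build time instead of hard-coding it.

    Final release builds get no expiry at all (None); every other
    build expires beta_days after the day it was built.
    """
    if is_final_release:
        return None
    return build_date + timedelta(days=beta_days)


def verify_release(expiry: Optional[date], is_final_release: bool) -> None:
    """Routine test-pass check: a final release must not carry a time bomb."""
    if is_final_release and expiry is not None:
        raise AssertionError(f"final release still expires on {expiry}")


if __name__ == "__main__":
    # A beta build stamped today expires 180 days from now.
    beta_expiry = expiry_for_build(date.today(), is_final_release=False)
    verify_release(beta_expiry, is_final_release=False)
    print("beta expires on", beta_expiry)
```

The point of the design is that nobody has to remember to remove anything: the expiry is derived from how the build was made, and the check is cheap enough to run on every build rather than only at major milestones.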

 

A while back we talked about bureaucracy and process being taxes on being creative. From the above, I hope you can see that we need some of this to avoid mistakes and to help manage an incredibly complex and very difficult project.