It's on the whiteboard

10/12/2005

Way back when, when we were first shipping NT 3.1, checking files into the source tree was pretty easy. You made your changes and checked them in. Not a big deal, since there were only 20 or so people working on the code base - the chances of collision were relatively small, and the code base was pretty manageable. There was a small team of people who had the job of doing nightly builds; it was their responsibility to ensure that a build was done every day, and that the build worked and passed BVTs (the team was something like 5 people, if I recall).

At some point, a number of groups joined the core NT team, and the NT team grew to a couple of hundred developers. Not surprisingly, the system that had worked for the 20 or so people didn't scale to the hundreds of people who were now using the system. It got so bad that we often went for days at a time without being able to have a good build.

We tried community shame (if I had a scanner here at work, I'd scan in the picture of me from back in those days wearing goat horns), but it didn't work (I have no shame). We tried staged checkins (each team gets a dedicated hour to check in). We tried darned near everything, but the problem was that our old system simply didn't scale to the size of the group.

Eventually things got so bad that Dave Cutler ended up moving into the NT build lab to directly supervise the builds. It was a variant of the "community shame" solution, but instead of being forced to wear a silly costume, you had to explain why you screwed up to Dave directly, and it was FAR more effective (there's nothing like being grilled by Dave Cutler to instill fear into a developer).

In order to manage the volume of changes, Dave instituted the "Whiteboard". Basically he got Microsoft to buy a 4 foot by 12 foot whiteboard, and had them mount it vertically in the build lab across from where he sat. When you had a change ready to check in, you went to the whiteboard and wrote your name, the bug #, the module being changed, and a contact number. Dave would then periodically run down the board and call individuals to get them to check in their changes. The cool thing about this mechanism was that Dave could control the build process - he could do sanity builds after individuals (like me) who had a propensity for breaking the build, he could batch changes from the same group together, etc.

It also provided a dramatic visual representation of the state of NT - when the whiteboard was full, the product had lots of bugs; when it was clear, we were close to being done. And when it was empty, we had shipped the product.

Of course, the whiteboard didn't really scale, even to a project the size of NT 3.1. And today, Vista is vastly more complicated - there are several thousand developers contributing code into a bazillion binaries composed of a gajillion source files (I don't know how many, but there are a lot of them). There's no way that the whiteboard could conceivably scale today. Instead, we have a main build lab (which produces the final bits of the product) and a series of "virtual build labs", each of which is responsible for aggregating changes from a set of Windows developers. It's far more scalable than the old system, and significantly more flexible (at a minimum, it doesn't require that a Senior Distinguished Engineer spend all his time making sure that the build completes successfully).
