Over the past several months my blog has transformed into more of an announcement forum. And while that’s a service I’m happy to provide, I’m not satisfied with it. I want to exchange ideas and share thoughts about topics all (ok, maybe many) software developers and teams care about.
It’ll probably take the form of talking about the way we do things but only time will tell. We’ve been spending the last several months working on refining the way we measure and report on the quality of Team Foundation Server. We use a great many reports at many different levels of detail. We’re just about done with our “dashboard” reports – high level reports that give the 10,000 foot view on each of our major measuring points. I thought this would make a good topic to start off with.
There are many things we measure to assess quality. One of the biggest is build quality. It’s a high-level assessment of how useful a build is, and we measure it every day. Build health is very often a leading indicator of things going wrong on the team, so it’s a very important metric to watch. Of course, consistently bad builds are a sure sign of a problem. Consistently good builds are a good sign but not sufficient, as you will see in future posts.
We have a set of automated “scouting” steps that install a build and do a basic end-to-end run-through of the product: creating a team project, checking in files, adding/modifying work items, creating a build definition and running it, viewing reports, etc. These tests can run in a couple of hours. The outcome of the scouting steps is a “build rating”. The ratings are:
- Self Test – The build is good and ready for in depth testing.
- Self Test with Work arounds – The build is good but requires manual workarounds. With these workarounds, all functional areas pass scouting tests.
- Partially Testable – Some functional areas may fail, and further testing may be blocked for those areas, but most areas work and none of the failures block overall progress.
- Self Toast – The build is no good. It won’t install, can’t create Team Projects, or other major features are not working properly.
- Build failure – There was a break and a build could not be produced.
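To make the scale concrete, here is a minimal sketch of how scouting results might be mapped onto the five ratings. This is not our actual tooling. The `AreaResult` fields, the `rate_build` function, and the rule that "most areas work" means more than half of them pass are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Sequence


class BuildRating(IntEnum):
    """The five ratings from the list above, ordered worst to best."""
    BUILD_FAILURE = 0
    SELF_TOAST = 1
    PARTIALLY_TESTABLE = 2
    SELF_TEST_WITH_WORKAROUNDS = 3
    SELF_TEST = 4


@dataclass
class AreaResult:
    """Hypothetical scouting outcome for one functional area."""
    area: str
    passed: bool
    needed_workaround: bool = False        # passed only after a manual workaround
    blocks_overall_progress: bool = False  # e.g. the build won't install at all


def rate_build(build_produced: bool, results: Sequence[AreaResult]) -> BuildRating:
    """Map one day's scouting results to a rating (assumed rules, see lead-in)."""
    if not build_produced:
        return BuildRating.BUILD_FAILURE
    if not results:
        raise ValueError("a produced build needs at least one scouting result")

    failures = [r for r in results if not r.passed]
    if any(r.blocks_overall_progress for r in failures):
        return BuildRating.SELF_TOAST
    if failures:
        # Assumption: "most areas work" means fewer than half of them failed.
        if len(failures) * 2 < len(results):
            return BuildRating.PARTIALLY_TESTABLE
        return BuildRating.SELF_TOAST
    if any(r.needed_workaround for r in results):
        return BuildRating.SELF_TEST_WITH_WORKAROUNDS
    return BuildRating.SELF_TEST
```

Ordering the enum from worst to best makes it easy to compare ratings or sort a run of builds. In practice a person still makes the final call, so treat the thresholds here as placeholders.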
Every day we rate the build based on the assessment with a report that looks like this (this was a build that was not very good):
Build 20201.00 – Partially Testable
1) Team Build SKU fails to install (Product Bug 188908) causing the Setup and Reporting features to be partially testable and Team Build to be self-toast
Of course it’s easy to get lost in that much daily detail so we have created some dashboard reports to capture Build Quality trend information. Here’s a recent report:
Each row in the chart is a different branch of the code that we maintain. We are currently building/testing 3 branches:
- Orcas – The active development branch (what we call a product unit branch or PU Branch for short) for the Orcas release.
- Rosario – Yes, I know I’m not supposed to use that name but it’s in the chart and I didn’t want to cut it out. This is the active development for the Rosario release. One of the reasons this is so much better quality than the Orcas branch right now is that we have a lot fewer people working on it.
- Main – This is the branch from which all official Visual Studio builds are produced. Each product unit (or in some cases a group of product units) has branches off of Main (the Orcas branch above is a PU Branch for the Team System team) and periodically “reverse integrates” its work into Main when it is ready to be shared with the entire division.
From this report, you can see that since early December the Orcas branch has been rocky. In early January we got very concerned about this and started jumping up and down about fixing the build quality. This report is a great way to see that things are headed for the ditch.
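If you want to build a similar trend view from your own daily ratings, a rough sketch could look like the following. It is not how our report is actually generated. The `summarize_by_branch` function, the tuple layout, and the choice to treat only the two "Self Test" ratings as good builds are assumptions for illustration.

```python
from collections import defaultdict

# Assumption: only these two ratings count as a "good" build for the trend.
GOOD_RATINGS = {"Self Test", "Self Test with Work arounds"}


def summarize_by_branch(daily_ratings):
    """Summarize build quality per branch.

    daily_ratings: iterable of (branch, day, rating) tuples in any order,
    where day is an ISO date string such as "YYYY-MM-DD" so that plain
    string sorting is also chronological.
    """
    by_branch = defaultdict(list)
    for branch, day, rating in daily_ratings:
        by_branch[branch].append((day, rating))

    summary = {}
    for branch, entries in by_branch.items():
        entries.sort()  # oldest build first
        good = sum(1 for _, rating in entries if rating in GOOD_RATINGS)

        # Count how many of the most recent builds in a row were not good.
        bad_streak = 0
        for _, rating in reversed(entries):
            if rating in GOOD_RATINGS:
                break
            bad_streak += 1

        summary[branch] = {
            "builds": len(entries),
            "good_share": good / len(entries),
            "bad_streak": bad_streak,
        }
    return summary
```

A long `bad_streak` on one branch is the kind of signal that should get people jumping up and down, while `good_share` gives a rough feel for how rocky a branch has been over the whole period.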
Sometimes you want to drill in even further and see more detail for build issues. Here’s a report with one more level of drill down. Here you can see which components are having issues and which are not.
Build quality is our first line of defense. It’s the first thing we measure and the first place we look for an assessment of overall quality health. If it’s bad none of the other stuff I’m going to show you in the coming days will matter.
Well, that’s an overview of our build quality assessment. I can see now this is going to be a really long series of posts. This has only scratched the surface of how we manage quality on the TFS team. I’ll try to post them as rapidly as I can and maybe follow up with a summary that shows one full daily dashboard report.
I know some of you are likely to ask me for these reports. I’ll see what I can do, but some of these are tied into our methodology enough that I don’t know that I can get them easily separated out. I’m not going to even try until I’m done with the series. When it’s all done we can have a talk about what you find most compelling and whether or not there’s anything we can share for your own use.
You might ask, “Hey, don’t you use the reports in the process templates you ship?” The answer is yes, some of them. But our process (just like yours) is evolving all of the time. Many of these reports have been created in the past six months. We’ll take the ones that work best and incorporate them into the product in future releases.