Thanks for the comments/feedback I received. A few additional comments (I'll try to keep this one short, or at least shorter).
Josh Ledgard questioned the need for every/daily builds. I use the word “every” to make a couple points, one internally and one externally.
The internal point is that “shipping” these builds, with all the necessary extra info like content and quality, has to be automatic/free. If it’s not, i.e. if we have to decide which build to ship, or do extra work to “polish” the build before it ships, there will be resistance to doing them. The daily build is pretty ingrained into how we do things, and keeping forks (and dev machines set up) for previous builds adds lots of work. We also get into gnarly conversations about whether to fix some hideous bug in one area before we ship a build. Those conversations are gnarly because, up until we get close to shipping, there’s almost always at least one hideous, embarrassing bug in every build. The choice is really between locking down the build while we fix the hideous bugs, or fixing them in the next build and risking that the new crop of hideous bugs is worse. The longer we stay locked down, the longer most devs go without having their work subjected to the next testing gauntlet: they can’t check in, can’t build against all the other changes, etc. Someone commented that Whidbey is complex; that is absolutely reflected here at the daily, product development level.
Externally, I want to make sure that we set expectations that the build from today is not necessarily better than the build from yesterday and that the casual observer should think twice before installing. Our builds do not have a consistent incremental increase in quality. There are spikes and valleys. There are valleys even late in the release cycle, e.g. we send a beta out, get feedback, check in improvements across the product, and break things (which we then fix). I want customers to see lots of builds coming out so they understand the risk, vs. seeing fewer builds and thinking (erroneously) that each build is good and “guaranteed” to be better than the last one.
All that said, I’m more committed to frequent, automatic, and public than to “every build”. I just think aiming for every build is the best way to get there; otherwise we’ll think incrementally. It’s easy to scale back, e.g. from every day to once a week, or to whatever we end up seeing customers do.
I do not mean to imply that there is zero order within our teams. Over time the builds absolutely get better. We have a set of build verification suites that are run for every build, we have tools in place to help make “breaking changes” easier to manage across the division, and there are always people looking at process/efficiency improvements. In many ways, what our teams do is amazing. But the complexity is very high, the percentage of tests we can run against each build is relatively small, and bad bugs get through.
The feedback about “must always pass” tests is good. That is what our build verification tests (BVTs) are meant to show, but reading this prompted me to think about three things: 1) Maybe we’re not looking at the right things in the BVTs. 2) Maybe we need a more comprehensive set that takes longer to run but still completes within a day or so. 3) Can we get community help here? That is, perhaps we augment what we’re doing with a suite that the community provides. That would be a great rallying data point for the division: do our daily builds at least reach the bar for what customers are willing to install? I’m not looking to pass work off on others, just thinking about where I can blur the line between internal and external so that everyone ends up better off in the end. I don’t remember if I’ve written about this before, but I’m also interested in the possibility of developing many test cases in public workspaces so people can at least see them. Maybe BVTs are the pilot we’ve been seeking.
Thanks for the comments re: encouraging people to try this. That’s exactly where we are (although we’re also lucky to have strong support from Eric Rudder on down to help the division get started and survive through the inevitable glitches). People are pretty excited about being more aligned with customers’ needs and wants. I haven’t met a person who thinks this is a bad thing. The concerns I have heard are very legitimate: build quality is variable; we need to do this in a way that doesn’t cause customers to waste their time or become dissatisfied; and it’s extra work to gather/publish some of this information. Those are all things to work through, not reasons for not doing it.