One of the major goals of the Pioneer Team Foundation Server is to upgrade it to early builds so that we can get some “bake time” with them and feel confident with what we’re releasing to customers. Since we spun up the server, we’ve done two successful upgrades and we’re already planning the next one:
- July 1st – Initial installation of Beta 1 build
- August 29 – Upgrade to 8/22 build from the VSTS branch
- September 17 – Upgrade to 9/17 build from the Beta 2 branch (yes, the build dropped at 5PM and we used it to upgrade at 6PM)
- November – Upgrade to a post-Beta2 build
The process of testing the upgrades on a copy of our database is just as important as running the upgrade itself. Our dogfood topology is an unusual one (a dedicated SQL server, NLB with multiple application tiers, a dedicated MOSS server, virtual machines and virtual DNS names), so we find some interesting bugs that we’re able to fix before things are locked down for shipping.
There’s no whitepaper on “How to Dogfood” at Microsoft, so our process for deploying perpetual upgrades is something that just kind of happened. Here’s basically what we do:
Depending on where we are in the cycle and the amount of churn in the code base, we might add another iteration or skip one. What’s most interesting about this process is that it’s driven by the end date. Based on the product schedule, we know when our windows of opportunity for upgrading are, and we work back from there.
Tips for upgrading
Here are some tips to consider when planning your own upgrades:
- Hardware – Get your production & pre-production servers in place early. Getting new hardware or re-configuring existing hardware usually has a long lead time. Make sure that you’ve also got enough storage space.
- Start fresh – If at all possible, give yourself the best chance at success by starting with a fresh box. Sure, plenty of products support in-place upgrades & clean uninstalls – but in my experience, flattening the box and starting fresh lays a good foundation for the future.
- Backups – Organize a copy of your production backups that you can test the upgrade with. Not only does it give you peace of mind that your backups work, it also helps you line up the DBAs for when it’s time to do the real upgrade.
- Start with the latest OS – If you start with Windows 2008 R2 now, it’s one less OS upgrade (and downtime) you’ll have to do in the future. It also means you’ll be on 64-bit, which is a very good thing.
- Virtualization – We run our three application tiers as virtual machines with no problems whatsoever. This is great for when we’re installing pre-beta copies of the .NET Framework that don’t have clean uninstalls yet. In our last upgrade we just turned off the old application tiers and spun up three fresh ones. You can treat ATs as throwaways in a TFS2010 NLB topology, since there is no state stored on them.
- Dry-Runs – Run through your plan and make sure everything is going to work. Review it with others and get their feedback.
- Have a test plan – Once you’ve done the upgrade, you need to be sure that everything is back online and working as expected. For example: check-in, check-out, open work item, save work item, Web Access, SharePoint, reports, cube, alerts, monitoring, builds, etc.
- Do pre-requisites ahead of time – For example, to install SQL2008 you need to have .NET3.5 and Windows Installer 4.5 both installed first and both of these require a reboot. If you plan to do one on the first weekend and the other on the second weekend, it means that you don’t waste time during your more important SQL upgrade waiting for reboots.
- Do one thing at a time – This is an extension of the last point. Let’s say you’re moving from a single-server TFS2008 SP1 on SQL2005 and Windows 2003 to a dual-server TFS2010 on SQL2008 SP1 and Windows 2008 R2. If it were me, I’d take it in steps: upgrade SQL, make sure everything still works, move to dual-server, check again, upgrade to TFS2010. The more steps you try and bundle together, the higher your risk of failure.
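The "Have a test plan" tip lends itself to a small script. Here is a minimal sketch in Python, assuming you can express each item on your list as a function that returns True or False. The `Check` class, `run_test_plan` and `failed_checks` are hypothetical names I made up for illustration, and the lambdas are placeholders rather than real probes of a TFS server.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Check:
    """One line item from the post-upgrade test plan (hypothetical helper)."""
    name: str
    run: Callable[[], bool]  # should return True when the feature works


def run_test_plan(checks: List[Check]) -> List[Tuple[str, bool]]:
    """Run every check in order; an exception counts as a failure.

    The loop never stops early, so one broken feature does not hide
    the status of the checks that come after it.
    """
    results = []
    for check in checks:
        try:
            passed = bool(check.run())
        except Exception:
            passed = False
        results.append((check.name, passed))
    return results


def failed_checks(results: List[Tuple[str, bool]]) -> List[str]:
    """Return the names of the checks that did not pass."""
    return [name for name, passed in results if not passed]


if __name__ == "__main__":
    # Placeholder checks: swap each lambda for a real probe of your own server.
    plan = [
        Check("Check-in", lambda: True),
        Check("Open work item", lambda: True),
        Check("Web Access", lambda: True),
    ]
    results = run_test_plan(plan)
    for name, passed in results:
        print(f"{'PASS' if passed else 'FAIL'}  {name}")
    print("Failures:", failed_checks(results))
```

Keeping the list in a script means the same checks get run after the dry-run and after the real upgrade, so nothing gets forgotten at 2AM.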
I’ve been involved with at least 5 major upgrades this year and here are some things I’ve learnt from running them:
- Book a conference room
- This is the single most valuable thing you can do. Try and get anybody who is actually doing any steps in the upgrade (DBA, IT, Helpdesk, etc.) to come to the room.
- Have a spare computer or project your own desktop on the screen. This is useful for synchronizing with others in the room, running ‘ping -t’ when a server reboots and doing “group debugging” sessions when problems happen.
- Set up a “party line” conference call. Invite anybody who wants to “listen in” to join. Invite your manager.
- It’s also useful for anybody who you might need help from to have these details. If things go south, you don’t waste time trying to set up a conference call – it’s already there.
- Set up a live meeting / desktop sharing session. Invite anybody who wants to be a “fly on the wall” to join.
- Sure, you can log a bug that says “when I clicked on X on screen Y, setup blew up” and people might believe you. But if you’re sharing your desktop and other people see it happen, then you have some people to back you up.
- (I’ve been meaning to do a screen recording of our upgrades, but various technology problems have prevented it each time – I think it would be useful for reviewing and improving the process for next time)
- Set up an “on call / escalation” list. Find out who you need to call if you have problems with one of your dependent systems. The last thing you want to be doing is sitting on hold to an L1 helpdesk if your server doesn’t reboot cleanly or something.
- Send at least 3 notifications to users of the downtime
- 1st Notice – Give as much notice as possible so people can plan milestones around the date. Even if it’s not set in stone yet, people would rather know about it and have it cancelled than to find out at the last minute.
- 2nd Notice – 7 days or on the Monday before a weekend upgrade is good for this.
- 3rd Notice – On the day of the upgrade (or 24 hours before if it’s a weekend), send another one.
- Guidelines: Make sure you describe what the impact to users is, any actions they need to take and where they can go to get future status updates and support.
- Provide regular status to stakeholders. Start an email thread and trickle through progress as it happens. By keeping them informed, you buy their trust and their support when things don’t go as planned.
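The three-notice rule above is simple enough to work out with a few lines of date arithmetic. This is a rough sketch in Python, not a tool we actually use: `notification_dates` is a hypothetical function, and the 28-day lead time for the first notice is purely an assumption standing in for "as much notice as possible".

```python
from datetime import date, timedelta
from typing import Dict


def notification_dates(upgrade_day: date,
                       first_notice_lead_days: int = 28) -> Dict[str, date]:
    """Work back from the upgrade day to the three notice dates.

    first_notice_lead_days is an assumed default; pick whatever
    "as much notice as possible" means for your team.
    """
    is_weekend = upgrade_day.weekday() >= 5  # Saturday is 5, Sunday is 6
    if is_weekend:
        # 2nd notice: the Monday before the weekend upgrade.
        second = upgrade_day - timedelta(days=upgrade_day.weekday())
        # 3rd notice: 24 hours before a weekend upgrade.
        third = upgrade_day - timedelta(days=1)
    else:
        # 2nd notice: 7 days before a weekday upgrade.
        second = upgrade_day - timedelta(days=7)
        # 3rd notice: on the day of the upgrade.
        third = upgrade_day
    first = upgrade_day - timedelta(days=first_notice_lead_days)
    return {"first": first, "second": second, "third": third}


if __name__ == "__main__":
    # Example date only; it is not one of the real Pioneer upgrade dates.
    for label, day in notification_dates(date(2009, 11, 14)).items():
        print(label, day.isoformat(), day.strftime("%A"))
```

Putting the dates in the deployment plan up front makes it harder to forget the second and third notices once the upgrade prep gets busy.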
Planning your own upgrade
Brian Keller has a great post on how to Get ready to “go live” with Team Foundation Server 2010 beta 2! He includes two useful documents:
- Overview presentation (.PPTX)
- Detailed checklist (.DOCX)
I’m also going to share with you a copy of the generic deployment plan template that I’ve been using for the Pioneer upgrades.
- Deployment Plan Template (.XLSX)
Planning, testing and upgrading to these new releases has been a fun experience and it’s a bonus when everything goes smoothly. It’s great to see our experiences influence the product and make it the best release yet.