Before I start enumerating the features of TFS 2010, I need to start with some of the big conceptual changes that we’ve made. This post sets some architectural groundwork and defines some terms that I’ll use in subsequent posts.
Team Project Collections
The first important concept to understand is what we call Team Project Collections (TPCs). In TFS 2008 a TFS server contains a set of Team Projects. Those Team Projects are mostly but not 100% independent of each other. For instance, one project can be a branch of another project. Also, work item types in two projects share the same underlying database schema, making it very easy to run work item tracking queries across projects on the server. All checkins on a server share the same monotonically increasing changeset number, all work item IDs are allocated from the same monotonically increasing sequence, etc. Because of these subtle dependencies, certain things people want to do are difficult (or nearly impossible) – for example, consolidating two Team Foundation Servers into one, or backing up and restoring an individual project.
In TFS 2010 we wanted to solve a long list of scenarios like those and our solution is called Team Project Collections (TPCs). In TFS 2010 a TFS farm (notice I said farm, not server – more on that in a minute) hosts Team Project Collections and not just Team Projects. A Team Project Collection is a group of related Team Projects and a TFS farm can host many Team Project Collections. To try to make an analogy with TFS 2008, it’s as if TFS 2008 could host exactly 1 Team Project Collection per physical TFS server. Just about any statement you might make about a TFS 2008 server would apply to a TFS 2010 Team Project Collection (TPC). For instance, everything I said above about changeset numbers, work item IDs, etc. is true of a TFS 2010 Team Project Collection. The key, however, is that Team Project Collections are completely independent of each other. Two Team Project Collections can each have a changeset with the same changeset number (but very different contents). They can each have work items with the same work item ID. You used to identify things in TFS by server URL + ID. Now you identify them by server URL + Team Project Collection + ID.
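To make the new identity rule concrete, here is a minimal sketch in Python. It is a toy model and not TFS code: the `ArtifactRef` and `Collection` classes, the collection names, and the URL are all hypothetical, chosen only to show that each collection keeps its own counter and that an ID is only meaningful together with its collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRef:
    """Hypothetical fully qualified reference to a changeset or work item."""
    server_url: str   # URL of the farm (placeholder value below)
    collection: str   # Team Project Collection name
    artifact_id: int  # changeset number or work item ID


class Collection:
    """Toy model of a TPC: each one owns its own increasing changeset counter."""

    def __init__(self, server_url: str, name: str) -> None:
        self.server_url = server_url
        self.name = name
        self._next_changeset = 1

    def check_in(self) -> ArtifactRef:
        # Allocate the next changeset number within this collection only.
        ref = ArtifactRef(self.server_url, self.name, self._next_changeset)
        self._next_changeset += 1
        return ref


farm_url = "http://tfs.example.com:8080/tfs"  # placeholder, not a real server
collection_a = Collection(farm_url, "CollectionA")
collection_b = Collection(farm_url, "CollectionB")

ref_a = collection_a.check_in()
ref_b = collection_b.check_in()

# Same changeset number, but different artifacts because the collection differs.
print(ref_a.artifact_id == ref_b.artifact_id)
print(ref_a == ref_b)
```

The two references share a changeset number yet compare as different, which is the point of adding the collection to the identity.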
Let me try to make this concrete with a screen shot. When you connect to a TFS server in TFS 2008, you get a screen that looks like this. As you can see – you pick the server and then one or more Team Projects to work on.
However, in TFS 2010, the Connect to TFS dialog looks like this:
As you can see on the left hand side, there is now a list of Team Project Collections (currently labeled “Directory”) and on the right hand side, you can see a familiar looking list of Team Projects within the selected Collection. The client will only allow you to connect to projects in one TPC at a time.
If all of this sounds vaguely similar to the notions around Sharepoint Site Collections and Sharepoint Sites, that’s because it is. The concepts, benefits and limitations are all quite similar. If you are familiar with the Sharepoint architecture, it will help you understand the TFS 2010 architecture.
The introduction of Team Project Collections has brought with it changes to the organization of the TFS databases. TFS 2008 was composed of 5-7 databases partitioned by subsystem. There was one for Version Control, one for Work Item Tracking, Work Item Tracking attachments, Project Management, Build, Integration, … With the introduction of Team Project Collections, we wanted to consolidate the various subsystem data to make Team Project Collections easier to manage. As a result, the TFS 2010 database architecture is as follows:
TFS_Config – The “root” database that contains centralized TFS configuration data, including the list of all Team Project Collections that this TFS farm is responsible for. If you look at Beta 1 (to be released in the not too distant future), you will see this is currently called TFS_Application but we are changing the name to TFS_Configuration after Beta 1 to make it more consistent with Sharepoint terminology (WSS_Config).
TFS_Warehouse – The TFS 2010 data warehouse database that contains reporting data from all Team Project Collections served by this Farm. This means that the data warehouse provides reporting capabilities across all Team Project Collections in the farm.
TFS_* – One database for each Team Project Collection managed by the TFS farm. For example the “default” one would be TFS_DefaultCollection. Each database contains all of the operational data regardless of sub system (version control, work item tracking, build, etc) for a given Team Project Collection.
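As a rough illustration of that layout, the sketch below lists the databases a farm would hold for a given set of collections. The function name and the `Tools` collection are hypothetical, and the shared database names are simply the ones used in the list above, which may still change before release.

```python
def database_names(collections):
    """Return the TFS databases for a farm hosting the given collections.

    Assumes the naming described above: two shared databases plus one
    TFS_<collection> database per Team Project Collection.
    """
    shared = ["TFS_Config", "TFS_Warehouse"]
    per_collection = [f"TFS_{name}" for name in collections]
    return shared + per_collection


# "Tools" is a made-up second collection for illustration.
names = database_names(["DefaultCollection", "Tools"])
for name in names:
    print(name)
```

Adding a collection adds exactly one database, whereas in the TFS 2008 layout the data for a project was spread across the per-subsystem databases.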
Of course, there are still databases for Sharepoint and Report Server wherever you install those components.
I share this information not because I particularly want people mucking around in the databases, but rather because I think it helps people understand the changes, and because large scale TFS administrators will want to understand the SQL characteristics of their operations.
The introduction of the notion of a TFS farm is another big architectural change in TFS 2010. In TFS 2008, we talked about TFS “Servers”. Even then it was a bit of a misnomer since you can install all TFS 2008 capabilities (TFS, SQL, Sharepoint, Reporting Services, …) on a single physical (or virtual) server or distribute them across multiple servers.
However, it gets even more flexible with TFS 2010 and as such, it’s now really awkward to talk about a TFS “server”. That said, it is still possible (and will likely be common) to install all of the TFS components on a single server.
The big changes that constitute “TFS farms” are the following:
NLB support for TFS application tiers – With TFS 2010, you can configure multiple TFS application tier machines (ATs) to serve the same set of Team Project Collections. The primary purpose of NLB support is to enable a cleaner and more complete high availability story than in TFS 2008. Any application tier in the farm can fail and the farm will automatically continue to work with hardly any indication to end users of a problem. It also improves things like the operating system patching story (ATs in the farm can individually be taken offline for patching without shutting users out of the system). And more.
Scale out for SQL data tiers – TFS 2010 now supports the use of as many SQL Servers as you like. Each database can be configured to be on any SQL Server and because each TPC is an independent database, this gives administrators a great deal of flexibility to manage their SQL Server installations. These features can be used to load balance databases across SQL Servers, manage capacity, retire old SQL Servers, etc. A project collection can easily be suspended while it is moved between SQL Servers without affecting the operation of any other collections.
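The two farm capabilities above can be sketched as a small toy model. This is not how TFS or NLB is implemented: the `Farm` class, the round-robin selection, and the machine names are all assumptions made for illustration. It only shows the shape of the idea: any online application tier can serve any collection, and each collection's database can sit on a different SQL Server and be moved independently.

```python
from itertools import cycle


class Farm:
    """Toy model of a TFS farm (hypothetical, for illustration only)."""

    def __init__(self, app_tiers, placement):
        self.online = {at: True for at in app_tiers}  # AT name -> is it up?
        self._ring = cycle(app_tiers)                 # simple round-robin order
        self._count = len(app_tiers)
        self.placement = dict(placement)              # collection -> SQL Server
        self.suspended = set()                        # collections being moved

    def pick_app_tier(self):
        # Try each AT at most once, skipping the ones that are offline.
        for _ in range(self._count):
            at = next(self._ring)
            if self.online[at]:
                return at
        raise RuntimeError("no application tier is online")

    def handle(self, collection):
        # A suspended collection rejects requests; other collections do not care.
        if collection in self.suspended:
            raise RuntimeError(f"{collection} is suspended")
        return self.pick_app_tier(), self.placement[collection]

    def move_collection(self, collection, new_sql_server):
        self.suspended.add(collection)
        try:
            # A real move would copy or restore the database at this point.
            self.placement[collection] = new_sql_server
        finally:
            self.suspended.discard(collection)


farm = Farm(["AT1", "AT2"], {"DefaultCollection": "SQL1", "Tools": "SQL1"})

farm.online["AT1"] = False                   # AT1 taken offline for patching
served_by, database_host = farm.handle("DefaultCollection")

farm.move_collection("Tools", "SQL2")        # rebalance one collection
tools_route = farm.handle("Tools")
default_route = farm.handle("DefaultCollection")
```

In this sketch, requests keep flowing through `AT2` while `AT1` is offline, and moving `Tools` to `SQL2` leaves `DefaultCollection` on `SQL1` untouched.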
Just to reiterate – you don’t have to do any of this. You can still run TFS on a single server and not have to think about multiple ATs or multiple SQL servers. You can stick with one Team Project Collection if you like.
I believe that these new capabilities will significantly change the way enterprises manage their TFS installations in the future. With Team Project Collections and TFS farms, you can create a single, arbitrarily large TFS installation. You can grow it incrementally by adding ATs and SQL Servers as needed. You can evolve it as you need to reconfigure hardware. It provides straightforward high availability. TPCs provide a complete delegation model, allowing you to provide services to different groups without them knowing about each other. TFS 2008 servers can be consolidated as independent Project Collections on TFS 2010 farms.
My intent with this post was to cover the key architectural changes and concepts and not cover the specific features/benefits (but I had to touch on some of them to make it make sense). Subsequent posts will cover this from more of a feature perspective but hopefully you now have the concepts that will help it all make sense.