TFS Version Control Concepts 3: Item Versions and the two meanings of 'Changeset'

Let's recap Items one more time.

  1. Items are unique.  They have an ID that no other item does.
  2. Items are versioned.  Like all version control systems, TFVC is about making it easy to store successive versions of the same item and retrieve old ones when necessary.
  3. Items have names.  Yes, plural.  At a rudimentary level, there's a server path showing where it lives on the server and a local path showing where it lives on your filesystem.  More on this later.  Suffice to say names are not unique.
  4. Items have a type.  Either they are an ANSI, UTF-16, Binary, etc. file; or they are a directory.

I think #1 and #4 are self-explanatory.  We covered #3 in great detail last time.  Time to tackle #2.  Versions aren't nearly as tricky as names, but there are some historical and terminology issues to cover.

[Image: the repository drawn as a 2D plane of slots over time]

One way to look at the repository is to watch each "slot" over time.  ("Slot" is our slang for "server path in committed space.")  Every time you check in a new version of an item, two things happen: a unit of work is recorded, and the contents of that slot going forward are established.

So far all of this is common to every version control system ever written.  In addition, modern version control systems support so-called "atomic checkins."  That's actually shorthand for a few related concepts:

  • Multiple files can be checked in during the same transaction.
  • If any file fails to check in, the entire operation is aborted.
  • Version numbers - or at least some types of version numbers - are consistent across the repository.
  • Metadata is associated with the entire collection of files being committed.
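
To make those four properties concrete, here is a minimal sketch of an atomic checkin against a toy in-memory repository.  It is not how TFVC is implemented: the Repository class, its fields, and the "missing contents" failure rule are all made up for illustration.

```python
class Repository:
    """Toy in-memory repository (hypothetical; not TFVC's real design)."""

    def __init__(self):
        self.latest_changeset = 0
        self.history = {}   # server path -> list of (changeset number, contents)
        self.metadata = {}  # changeset number -> metadata for the whole checkin

    def checkin(self, pending, comment):
        """Commit every pending file under one number, or commit nothing."""
        # Validate everything first, so one bad file aborts the whole operation.
        # (Assumed failure rule: a file with no contents cannot be checked in.)
        for path, contents in pending.items():
            if contents is None:
                raise ValueError(f"cannot check in {path}: no contents")

        # Nothing failed: all files share one repository-wide version number.
        number = self.latest_changeset + 1
        for path, contents in pending.items():
            self.history.setdefault(path, []).append((number, contents))

        # The metadata belongs to the collection of files, not to any one file.
        self.metadata[number] = {"comment": comment, "paths": sorted(pending)}
        self.latest_changeset = number
        return number
```

Validating before writing is the simplest way to get all-or-nothing behavior in a sketch like this; a real server would rely on a database transaction instead.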

Older systems like CVS and SourceSafe didn't have atomic checkins, so a version was a version (or "revision" in CVS terms).  Individual items in a modern version control system still have versions, but when we introduced atomic checkins to TFS we needed another word to encapsulate all the new concepts above.  As you probably know, we picked "changeset."  Unfortunately, the word doesn't help distinguish between the "two things" described earlier (a unit of work being recorded, and the new contents of a slot being established) that happen to the repository upon checkin.  These two distinct meanings sometimes cause confusion:

  1. [adj] A point in time.  The state of the entire repository immediately after a unit of work was committed.  Whenever you specify a "changeset versionspec" in a TFVC command, it takes on this meaning.  In this sense changesets are equivalent to dates.  (Old posts from Adam and James have more details on versionspecs and date specs in particular.) 
  2. [noun] A unit of work.  The files you change during one checkin, plus any associated metadata.  Whenever you view the contents of a changeset (tf changeset /i, the Changeset dialog, the output of Checkin) you're seeing this meaning.  Same for work items: when you link bugs to changesets, you're linking them to that specific unit of work.  Unfortunately, we don't have an easy way to specify meaning #2 in TFVC commands, except for the 'tfpt getcs' power tool.

That's a lot of words for a really simple distinction.  Math terms to the rescue once again.  If you think of the repository as a 2D plane like the one pictured above:

  • meaning #1: all points on the line time = <constant>
  • meaning #2: the set of revisions that lie on such a line

By now it should be obvious that a TFVC operation on "changeset versionspec 20" will affect both a.cs and b.cs, while "changeset #20" only contains b.cs.  Unfortunately I don't have a special trick for demarcating which meaning of the word I intend, so I'll try to be unambiguous in future discussions of changesets.
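
If it helps to see the distinction as code, here is a rough sketch of both meanings over a made-up history.  The server paths, the changeset number 10, and the function names state_at and contents_of are assumptions for illustration only; they are not TFVC APIs.

```python
# Hypothetical history: server path -> list of (changeset number, contents).
HISTORY = {
    "$/proj/a.cs": [(10, "a v1")],
    "$/proj/b.cs": [(10, "b v1"), (20, "b v2")],
}


def state_at(history, changeset):
    """Meaning #1: the whole repository as of a changeset (a point in time)."""
    state = {}
    for path, versions in history.items():
        # Keep only the revisions committed at or before that point in time.
        visible = [v for v in versions if v[0] <= changeset]
        if visible:
            # The newest visible revision is what the slot holds at that time.
            state[path] = max(visible, key=lambda v: v[0])[1]
    return state


def contents_of(history, changeset):
    """Meaning #2: only the revisions committed in that changeset (a unit of work)."""
    return {
        path: contents
        for path, versions in history.items()
        for number, contents in versions
        if number == changeset
    }
```

With this data, state_at(HISTORY, 20) covers both a.cs and b.cs, while contents_of(HISTORY, 20) holds only b.cs.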