TFS Version Control Concepts 1: Items

We use version control to store stuff.  What stuff?  A bunch of items, the most basic elements of TFVC.  In everyday parlance, an item is a file or a folder.  TFS rarely makes a distinction between files and folders; they are stored as rows in the same table.  But implementation details aren't the point.  In our model we will almost always talk about items because files and folders behave so similarly.

(In other words, anticipate lots of posts about version control that never say anything about files.  Weird, eh?  That's how I rationalized writing two long intro posts.)

Ok, so items are important.  What are they?

  • Items are unique.  They have an ID that no other item does.
  • Items are versioned.  Like all version control systems, TFVC is about making it easy to store successive versions of the same item and retrieve old ones when necessary.
  • Items have names.  Yes, plural.  At a rudimentary level, each item has a server path showing where it lives on the server and a local path showing where it lives on your filesystem.  More on this later.  Suffice it to say that names are not unique.
  • Items have a type.  An item is either a file with a particular encoding (ANSI, UTF-16, binary, etc.) or a folder.
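
To make those four properties concrete, here is a toy sketch in Python.  It is not how TFS stores anything; the names Item, ItemType, check_in and get are made up for illustration, and the example path is hypothetical.  It only shows that identity comes from the ID, not the name.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count
from typing import List, Optional


class ItemType(Enum):
    FILE = "file"
    FOLDER = "folder"


# Hypothetical ID source: every new item gets an ID no other item has.
_next_id = count(1)


@dataclass
class Item:
    server_path: str                  # a name, not an identity; may be shared
    item_type: ItemType
    encoding: Optional[str] = None    # e.g. "ANSI", "UTF-16", "binary"; None for folders
    item_id: int = field(default_factory=lambda: next(_next_id))
    versions: List[bytes] = field(default_factory=list)

    def check_in(self, content: bytes) -> int:
        """Store a new version and return its 1-based version number."""
        self.versions.append(content)
        return len(self.versions)

    def get(self, version: int) -> bytes:
        """Retrieve an old version by its 1-based version number."""
        return self.versions[version - 1]


# Two items may carry the same name and still be different items.
first = Item("$/proj/foo.txt", ItemType.FILE, "UTF-16")
second = Item("$/proj/foo.txt", ItemType.FILE, "UTF-16")
```

Here first and second share a server path but have different IDs, so each keeps its own version history.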

Those are the important points for today. As an addendum, let's think through the consequences of having folders be first-class items.

  • Any operation over items has the potential to be recursive.  As a result, many operations are more complicated than they appear at first glance.  The advantage is usability. 

    Consider a user who wants to delete a folder.  The UI will represent this in the most natural way: a single recursive operation.  The fact that the backend also considers it a single recursive operation means that we can be consistent throughout the item's lifecycle.  Displaying the changeset's history will show the same recursive format; merging the changeset to another branch will pend a single "merge, delete" change.  If only files were considered items, things wouldn't be as clean, at least not without a big perf cost (e.g. automatically grouping file operations on the client).

  • All the fun ambiguities around non-unique names will apply to folders.  With files, it's relatively easy to accept the idea that foo.txt @ time 1 might be a different item from foo.txt @ time 2 -- not just a different version of the same item.  The same situation arises with folders but is harder to reconcile. 

    That is, with folders it's tempting to argue that add -> delete -> re-add (creating 2 items) is identical to add -> delete -> undelete (different versions, same item).  Folders don't store any data themselves; they're just placeholders, so what difference does it make?  Users of SourceSafe or Perforce would be absolutely correct in this analysis.  With TFS, though, you have to remember the first bullet point about recursion.  When you delete a folder, you always delete its children, and it's considered one operation.  Undelete does the same in reverse: you always get the children back.  Obviously you don't expect that to happen when you create a new item with the same name.
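
The difference between re-add and undelete is easy to see in a toy model.  The sketch below is an assumption-laden illustration, not TFS's implementation: ToyRepo, its methods, and the file names are all hypothetical, and items are tracked by ID with a parent pointer so that delete and undelete can each act on a whole subtree as one operation.

```python
from itertools import count


class ToyRepo:
    """Toy model: items are keyed by ID; a name is just an attribute."""

    def __init__(self):
        self._ids = count(1)
        self.items = {}  # item_id -> {"name", "parent", "deleted"}

    def add(self, name, parent=None):
        """Create a brand-new item, even if the name was used before."""
        item_id = next(self._ids)
        self.items[item_id] = {"name": name, "parent": parent, "deleted": False}
        return item_id

    def _subtree(self, item_id):
        """Return the item's ID plus the IDs of all its descendants."""
        result = [item_id]
        for child_id, item in self.items.items():
            if item["parent"] == item_id:
                result.extend(self._subtree(child_id))
        return result

    def _set_deleted(self, item_id, deleted):
        for i in self._subtree(item_id):
            self.items[i]["deleted"] = deleted

    def delete(self, item_id):
        """One recursive operation: the folder and its children go together."""
        self._set_deleted(item_id, True)

    def undelete(self, item_id):
        """The reverse: the same item comes back, children included."""
        self._set_deleted(item_id, False)

    def live_children(self, item_id):
        return sorted(it["name"] for it in self.items.values()
                      if it["parent"] == item_id and not it["deleted"])


def make_deleted_folder():
    """Build a repo holding a folder with two files, then delete the folder."""
    repo = ToyRepo()
    src = repo.add("src")
    repo.add("a.txt", parent=src)
    repo.add("b.txt", parent=src)
    repo.delete(src)
    return repo, src


# add -> delete -> re-add: a second item with the same name, and no children.
repo, old_src = make_deleted_folder()
new_src = repo.add("src")
readded_children = repo.live_children(new_src)

# add -> delete -> undelete: the same item, with its children restored.
repo2, src2 = make_deleted_folder()
repo2.undelete(src2)
restored_children = repo2.live_children(src2)
```

In this model the re-added folder gets a fresh ID and starts empty, while the undeleted folder keeps its ID and brings a.txt and b.txt back with it.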

Next time: more on names.