Diffing two trees in TFS version control

I often need to answer seemingly simple questions such as:

 

What has changed between these two builds?

How do the contents of these two branches deviate?

What % of the files in a child branch have deltas from the parent branch?

 

To answer these, I’ve traditionally had to map both paths into a workspace, get everything, and use a diffing tool like Beyond Compare to compare the tree structures.  The history command does not give me the level of detail I want (it reports changeset-level detail, not file-level detail), and the recursive diff command is painfully slow.

 

Yesterday Buck demonstrated a project diffing tool useful for comparing a tree at two different points in time.  This was really close to what I needed but did not operate on arbitrary (and possibly unrelated) locations.

 

My goal was to extend that sample into something that worked on arbitrary paths and also integrated well into a scripted environment.  To that end, every line of output is prefixed with the change type or a comment/summary indicator:

 

left-only: <item>

right-only: <item>

diff: "<item>" - "<item>"

comment: <free text>

summary: <free text>

 

All normal output is sent to stdout.  All error output is sent to stderr.  If there is any error output, the contents of stdout cannot be considered complete.
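To show how a script might consume this format, here is a rough sketch of a parser that splits each line at the first ": " and tallies lines by prefix. The class and method names (TreeDiffOutput, TryParseLine, CountByKind) are made up for illustration and are not part of treediff.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of treediff): tallies output lines by their prefix.
static class TreeDiffOutput
{
    // Splits "prefix: payload" at the first ": ".
    // Returns false when the line has no recognizable prefix.
    public static bool TryParseLine(string line, out string kind, out string payload)
    {
        kind = null;
        payload = null;
        if (line == null) return false;

        int idx = line.IndexOf(": ", StringComparison.Ordinal);
        if (idx <= 0) return false;

        kind = line.Substring(0, idx);       // e.g. "diff", "left-only", "summary"
        payload = line.Substring(idx + 2);   // everything after the first ": "
        return true;
    }

    // Counts how many lines carry each prefix; lines without a prefix are ignored.
    public static Dictionary<string, int> CountByKind(IEnumerable<string> lines)
    {
        Dictionary<string, int> counts = new Dictionary<string, int>(StringComparer.Ordinal);
        foreach (string line in lines)
        {
            string kind, payload;
            if (!TryParseLine(line, out kind, out payload)) continue;

            int n;
            counts.TryGetValue(kind, out n); // n stays 0 when the key is missing
            counts[kind] = n + 1;
        }
        return counts;
    }
}
```

A caller would feed it the captured stdout lines and, per the rule above, treat the counts as unreliable if anything was written to stderr.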

 

The following example diffs a project at a specific changeset with a branch of that project at the tip.  It shows that five files have changed and none were added or deleted.

 

> treediff.exe "$/MyTeamProject/Project1;C30006" "$/MyTeamProject/Project1Branch;T"

comment: Downloading items for path $/MyTeamProject/Project1

comment: Downloading items for path $/MyTeamProject/Project1Branch

comment: Diffing item sets

diff: "$/MyTeamProject/Project1/Src/Core/Client.cs" - "$/MyTeamProject/Project1Branch/Src/Core/Client.cs"

diff: "$/MyTeamProject/Project1/Src/Core/Command.cs" - "$/MyTeamProject/Project1Branch/Src/Core/Command.cs"

diff: "$/MyTeamProject/Project1/Src/Results/Changes.cs" - "$/MyTeamProject/Project1Branch/Src/Results/Changes.cs"

diff: "$/MyTeamProject/Project1/Src/Results/Describe.cs" - "$/MyTeamProject/Project1Branch/Src/Results/Describe.cs"

diff: "$/MyTeamProject/Project1/Src/Results/Integrated.cs" - "$/MyTeamProject/Project1Branch/Src/Results/Integrated.cs"

summary: Compared 50 items

summary: Same: 45

summary: Diff: 5

summary: Right Only: 0

summary: Left Only: 0
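The summary lines are enough to answer the "what % has deltas" question from the top of the post. Here is a rough sketch that reads the "summary: <name>: <count>" lines and computes the percentage of compared items that differ or exist on only one side. The class and method names are hypothetical. Note that the counts include folders as well as files, so the result is only an approximation of a per-file percentage.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper (not part of treediff): derives a percentage from the summary lines.
static class TreeDiffSummary
{
    // Returns 100 * (Diff + Left Only + Right Only) / (all counted items), or 0 when nothing was counted.
    public static double PercentChanged(IEnumerable<string> lines)
    {
        const string prefix = "summary: ";
        int same = 0;
        int changed = 0;

        foreach (string line in lines)
        {
            if (!line.StartsWith(prefix, StringComparison.Ordinal)) continue;

            string rest = line.Substring(prefix.Length);
            int sep = rest.LastIndexOf(": ", StringComparison.Ordinal);
            if (sep < 0) continue; // "Compared 50 items" is not a name/count pair

            string name = rest.Substring(0, sep);
            int value;
            if (!int.TryParse(rest.Substring(sep + 2), NumberStyles.Integer,
                              CultureInfo.InvariantCulture, out value))
            {
                continue;
            }

            if (name == "Same")
            {
                same += value;
            }
            else if (name == "Diff" || name == "Left Only" || name == "Right Only")
            {
                changed += value;
            }
        }

        int total = same + changed;
        return total == 0 ? 0.0 : 100.0 * changed / total;
    }
}
```

For the run above, that works out to 5 changed items out of 50, or 10%.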

 

Left as exercises for the reader: support for detecting encoding changes and case-only renames (more specialized rename support could also be added).  The usual disclaimers apply: YMMV.
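As a starting point for the case-only rename exercise, here is a minimal sketch. It assumes you already have each item's path relative to its root, which is how the tool matches items (the lookup ignores case, which is exactly why a case-only rename goes unnoticed). The class and method names are hypothetical and nothing here touches the TFS API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of treediff): finds paths that differ only by case.
static class CaseOnlyRenames
{
    // True when two root-relative paths match ignoring case but not exactly.
    public static bool IsCaseOnlyRename(string lhsRelativePath, string rhsRelativePath)
    {
        return string.Equals(lhsRelativePath, rhsRelativePath, StringComparison.OrdinalIgnoreCase)
            && !string.Equals(lhsRelativePath, rhsRelativePath, StringComparison.Ordinal);
    }

    // Returns (left path, right path) pairs whose only difference is letter case.
    public static List<KeyValuePair<string, string>> Find(
        IEnumerable<string> lhsPaths, IEnumerable<string> rhsPaths)
    {
        // Index the right side case-insensitively, keeping the original spelling as the value.
        Dictionary<string, string> rhsByPath =
            new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string path in rhsPaths)
        {
            rhsByPath[path] = path;
        }

        List<KeyValuePair<string, string>> renames = new List<KeyValuePair<string, string>>();
        foreach (string lhsPath in lhsPaths)
        {
            string rhsPath;
            if (rhsByPath.TryGetValue(lhsPath, out rhsPath) && IsCaseOnlyRename(lhsPath, rhsPath))
            {
                renames.Add(new KeyValuePair<string, string>(lhsPath, rhsPath));
            }
        }
        return renames;
    }
}
```

In the tool itself, the same ordinal comparison could be applied to the two matched items' relative paths before counting them as the same.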

 

The code…

 

using System;

using System.Collections.Generic;

using Microsoft.TeamFoundation;

using Microsoft.TeamFoundation.Client;

using Microsoft.TeamFoundation.VersionControl.Client;

using Microsoft.TeamFoundation.VersionControl.Common;

namespace treediff

{

    class Program

    {

        static void Main(string[] args)

        {

            if (args.Length != 2)

            {

                Console.Error.WriteLine("Usage: treediff <itemspec> <itemspec>");

                Environment.Exit(1);

            }

            try

            {

                ProcessDiffs(args[0], args[1]);

            }

            catch (TeamFoundationServerException e)

            {

                Console.Error.WriteLine(e.Message);

                Environment.Exit(1);

            }

            catch (TreeDiffException e)

            {

                Console.Error.WriteLine(e.Message);

                Environment.Exit(1);

            }

        }

        private static void ProcessDiffs(string lhsSpecStr, string rhsSpecStr)

        {

            VersionControlServer tfsClient = GetVersionControlServer();

            ItemSpec lhsSpec, rhsSpec;

            VersionSpec lhsVersion, rhsVersion;

            string lhsRoot = LoadItemAndVersionSpec(lhsSpecStr, out lhsSpec, out lhsVersion);

            string rhsRoot = LoadItemAndVersionSpec(rhsSpecStr, out rhsSpec, out rhsVersion);

            Console.WriteLine("comment: Downloading items for path {0}", lhsRoot);

            ItemSet lhsItems = tfsClient.GetItems(lhsSpec, lhsVersion, DeletedState.NonDeleted, ItemType.Any, false);

            Console.WriteLine("comment: Downloading items for path {0}", rhsRoot);

            ItemSet rhsItems = tfsClient.GetItems(rhsSpec, rhsVersion, DeletedState.NonDeleted, ItemType.Any, false);

            Console.WriteLine("comment: Diffing item sets");

            DiffItemSets(lhsItems, rhsItems, lhsRoot, rhsRoot);

        }

        private static VersionControlServer GetVersionControlServer()

        {

            // Figure out the workspace information based on the local cache.

            WorkspaceInfo wsInfo = Workstation.Current.GetLocalWorkspaceInfo(Environment.CurrentDirectory);

            if (wsInfo == null)

            {

                throw new TreeDiffException("The current directory is not mapped.");

            }

            // Now we can connect to the server that hosts the workspace.

            TeamFoundationServer tfs = new TeamFoundationServer(wsInfo.ServerUri.AbsoluteUri);

            return (VersionControlServer)tfs.GetService(typeof(VersionControlServer));

        }

        private static string LoadItemAndVersionSpec(string spec, out ItemSpec itemSpec, out VersionSpec versionSpec)

        {

            string fileName;

            VersionSpec[] versions;

            VersionSpec.ParseVersionedFileSpec(spec, null, out fileName, out versions);

            switch (versions.Length)

            {

                case 0:

                    versionSpec = VersionSpec.Latest;

                    break;

                case 1:

                    versionSpec = versions[0];

                    break;

                default:

                    throw new TreeDiffException(string.Format("Expected 0 or 1 version spec - found {0}", versions.Length));

            }

            if (!string.IsNullOrEmpty(fileName))

            {

                itemSpec = new ItemSpec(fileName, RecursionType.Full);

            }

            else

            {

                throw new TreeDiffException("Item name was null or empty");

            }

            return fileName;

        }

        private static void DiffItemSets(ItemSet lhs, ItemSet rhs, string lhsRoot, string rhsRoot)

        {

            int same = 0;

            int lonly = 0;

            int ronly = 0;

            int diff = 0;

            Dictionary<string, Item> lhsItemHash = CreateHash(lhs, lhsRoot);

            Dictionary<string, Item> rhsItemHash = CreateHash(rhs, rhsRoot);

            foreach (string lhsPath in lhsItemHash.Keys)

            {

                Item lhsItem = lhsItemHash[lhsPath];

                Item rhsItem;

                if (rhsItemHash.TryGetValue(lhsPath, out rhsItem))

                {

                    // Verify that the item types match and, for files, that the contents are identical.

                    // This could be extended to also verify that the case of the item name and the

                    // encoding have not changed.

                    if (lhsItem.ItemType == rhsItem.ItemType &&

                        ((lhsItem.ItemType == ItemType.File && EqualFileContents(lhsItem, rhsItem)) ||

                         lhsItem.ItemType == ItemType.Folder))

                    {

                        same++;

                    }

                    else

                    {

                        Console.WriteLine("diff: \"{0}\" - \"{1}\"", lhsItem.ServerItem, rhsItem.ServerItem);

                        diff++;

                    }

                    // By removing each matched RHS item as we go, rhsItemHash will contain

                    // only the RHS orphans once this loop is done.

                    rhsItemHash.Remove(lhsPath);

                }

                else

                {

                    Console.WriteLine("left-only: {0}", lhsItem.ServerItem);

                    lonly++;

                }

            }

            foreach (Item rhsItem in rhsItemHash.Values)

            {

                Console.WriteLine("right-only: {0}", rhsItem.ServerItem);

                ronly++;

            }

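            // Prefix the totals with "summary: " so they match the documented output format.

            Console.WriteLine("summary: Compared {0} items", ronly + lonly + diff + same);

            Console.WriteLine("summary: Same: {0}", same);

            Console.WriteLine("summary: Diff: {0}", diff);

            Console.WriteLine("summary: Right Only: {0}", ronly);

            Console.WriteLine("summary: Left Only: {0}", lonly);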

        }

        // taken from Buck's post https://blogs.msdn.com/buckh/archive/2006/04/06/project_diff.aspx

        static bool EqualFileContents(Item item1, Item item2)

        {

            if (item1.ContentLength != item2.ContentLength)

            {

                return false;

            }

            // If the two hash values have different lengths or both have a length of zero,

            // the files are not the same. The only time this would happen would be for

            // files uploaded by clients that have FIPS enforcement enabled (rare).

            // Those clients can't compute the MD5 hash, so it has a length of zero in that

            // case. To do this right with FIPS, the code would need to compare file

            // contents (call item.DownloadFile()).

            // For information on FIPS enforcement and MD5, see the following link.

            // https://blogs.msdn.com/shawnfa/archive/2005/05/16/417975.aspx

            if (item1.HashValue.Length != item2.HashValue.Length ||

                item1.HashValue.Length == 0)

            {

                return false;

            }

            for (int i = 0; i < item1.HashValue.Length; i++)

            {

                if (item1.HashValue[i] != item2.HashValue[i])

                {

                    return false;

                }

            }

            return true;

        }

        private static Dictionary<string, Item> CreateHash(ItemSet items, string root)

        {

            Dictionary<string, Item> itemHash =

                new Dictionary<string, Item>(items.Items.Length, StringComparer.OrdinalIgnoreCase);

            foreach (Item item in items.Items)

            {

                itemHash.Add(item.ServerItem.Substring(root.Length), item);

            }

            return itemHash;

        }

    }

    internal class TreeDiffException : Exception

    {

        public TreeDiffException(string msg)

            : base(msg)

        {

        }

    }

}
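
The FIPS comment in EqualFileContents notes that getting this right without an MD5 hash means comparing the file contents themselves. Here is a minimal sketch of that fallback, assuming both versions have already been downloaded to local temporary files (the comment suggests item.DownloadFile() for that step; I have left it out here). The class and method names are hypothetical.

```csharp
using System.IO;

// Hypothetical helper (not part of treediff): byte-by-byte content comparison.
static class ContentCompare
{
    // Compares two streams from their current positions to the end.
    public static bool StreamsEqual(Stream a, Stream b)
    {
        while (true)
        {
            int x = a.ReadByte();
            int y = b.ReadByte();
            if (x != y) return false;  // differing bytes, or one stream ended early
            if (x == -1) return true;  // both streams ended at the same point
        }
    }

    // Compares two local files, checking lengths first to avoid reading when they differ.
    public static bool FilesEqual(string pathA, string pathB)
    {
        using (FileStream a = File.OpenRead(pathA))
        using (FileStream b = File.OpenRead(pathB))
        {
            if (a.Length != b.Length) return false;
            return StreamsEqual(a, b);
        }
    }
}
```

This is much slower than comparing hashes because it has to pull both files down, so it probably only makes sense as a fallback when either hash has a length of zero.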