Open XML diff tools

When learning about Open XML or developing Open XML solutions, it's very common to find yourself wondering "what's the difference between these two documents?" For example, you may see something in a document that you'd like to recreate programmatically, so you want to know what markup would be required. Or perhaps you've modified a document manually (using Word, say) and you want to know what markup changes were caused by your edits.

In those situations, a diff utility can save a lot of time. I'll cover two good options for comparing Open XML documents below: Eric White's free command-line tool OpenXmlDiff, which comes with source code and can be useful in automated workflows, and Altova's commercial GUI tool DiffDog, which offers a variety of interactive capabilities for analyzing the differences between Open XML documents.

Eric White's OpenXmlDiff

Eric White recently had a need for an Open XML diff utility, and he decided to create a tool from scratch. The result was OpenXmlDiff, a simple and straightforward command-line tool that generates a report of all the differences between two Open XML documents. The diff report is written to console output, so you can easily redirect it to a text file or another program. Eric has put together a screencast that provides a concise 3-minute overview of how to download and use OpenXmlDiff.

OpenXmlDiff uses the XML Diff and Patch Utility (a free download on MSDN) to analyze the differences between the same XML part in two different Open XML documents. That tool identifies the specific changes that would be needed to transform one XML document (i.e., OPC part) into another. OpenXmlDiff handles the details of the OPC package and generates a well-organized report that summarizes differences at the package level and then shows the specific details for each part that differs.
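To make the two-level idea concrete (package-level summary first, then per-part details), here is a rough Python sketch. It is not OpenXmlDiff's code and does not use the XML Diff and Patch Utility. It treats the package as a plain ZIP file and substitutes a simple line-based diff from the standard library for a real XML diff. The function names (make_package, compare_packages, part_diff) are made up for this illustration.

```python
import difflib
import io
import zipfile


def make_package(parts):
    """Demo helper (hypothetical): build an in-memory ZIP from {part name: text}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, text in parts.items():
            zf.writestr(name, text)
    return buf


def compare_packages(pkg_a, pkg_b):
    """Package-level summary: parts only in A, parts only in B, parts whose bytes differ.

    pkg_a and pkg_b may be file paths or file-like objects.
    """
    with zipfile.ZipFile(pkg_a) as za, zipfile.ZipFile(pkg_b) as zb:
        names_a = set(za.namelist())
        names_b = set(zb.namelist())
        only_in_a = sorted(names_a - names_b)
        only_in_b = sorted(names_b - names_a)
        changed = sorted(
            name for name in names_a & names_b
            if za.read(name) != zb.read(name)
        )
    return only_in_a, only_in_b, changed


def part_diff(pkg_a, pkg_b, name):
    """Part-level detail: unified diff of one part, one tag per line.

    This is a text diff, not an XML-aware diff; it stands in for a real XML diff engine.
    """
    with zipfile.ZipFile(pkg_a) as za, zipfile.ZipFile(pkg_b) as zb:
        text_a = za.read(name).decode("utf-8", errors="replace")
        text_b = zb.read(name).decode("utf-8", errors="replace")
    # Open XML parts are often written as one long line, so break between tags.
    lines_a = text_a.replace("><", ">\n<").splitlines()
    lines_b = text_b.replace("><", ">\n<").splitlines()
    return list(
        difflib.unified_diff(lines_a, lines_b, "a/" + name, "b/" + name, lineterm="")
    )
```

With two real documents you could pass file paths such as "before.docx" and "after.docx" (placeholder names) to compare_packages, then call part_diff for each name in the changed list and print the resulting lines.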

OpenXmlDiff is free, and it's a good option if you want to study the source code or extend the tool yourself. For those who prefer a polished GUI tool for comparing Open XML documents, there's another good option ...

Altova's DiffDog

I had the pleasure of meeting Alexander Falk in person at TechEd two weeks ago, and we had lunch and talked about our mutual interests, including XML standards, Open XML tools, and (most of all) photography. Ironically, we got so busy talking about photography that I forgot to take a picture of Alex, but I did snap a couple of photos of the Altova booth, where a variety of Altova employees (including Tara and Erin, pictured) were on hand to answer questions and do demos.

Altova's suite of XML tools has been evolving rapidly, and one of the areas where they've added quite a bit of functionality lately is Open XML support. For example, Alex blogged recently about how to use Altova's MapForce to auto-generate C# code that creates an Open XML spreadsheet, and their XMLSpy and StyleVision products also provide built-in support for the Open XML formats.

Another Altova tool that can be very useful to Open XML developers is DiffDog, a full-featured general-purpose diff/merge utility that supports any type of text file and also offers XML-aware differencing and support for Open XML documents (i.e., OPC packages) and ZIP files.

DiffDog's "XML-aware" approach means that it's smart about how to organize differences in XML documents for various visualizations (text view, grid view), and it also provides options for how to handle whitespace, CDATA, ordering of attributes (semantically meaningless, but sometimes important to a developer) and many other XML-specific details. And with full support for parts in ZIP packages, you can easily use DiffDog on Open XML documents. Download the free 30-day trial version and check it out.
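If you're curious what "XML-aware" means in practice, here is a small Python sketch of the general idea. It has nothing to do with how DiffDog is actually implemented. It assumes Python 3.8 or later, where the standard library's xml.etree.ElementTree.canonicalize function rewrites XML into a canonical form with attributes in a fixed order. The function name xml_equivalent and its ignore_whitespace parameter are made up for this illustration.

```python
from xml.etree.ElementTree import canonicalize


def xml_equivalent(xml_a, xml_b, ignore_whitespace=True):
    """Compare two XML strings after canonicalization.

    Attribute order never matters after canonicalization. When
    ignore_whitespace is true, leading and trailing whitespace in text
    content (including indentation between elements) is ignored as well.
    """
    canon_a = canonicalize(xml_a, strip_text=ignore_whitespace)
    canon_b = canonicalize(xml_b, strip_text=ignore_whitespace)
    return canon_a == canon_b
```

One caution: whitespace can be significant in Open XML text runs, so ignoring it may hide real changes. That is presumably why a tool like DiffDog makes whitespace handling an option rather than a fixed behavior.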