Source Code Available: Complete Implementation of 'Accept All Changes (Tracked Revisions) in Open XML Documents'

As most Open XML developers know, Word 2007 can track changes while you edit a document.  These tracked revisions are recorded in the Open XML markup.  It is often desirable to use the Open XML SDK to process tracked revisions and produce a new Open XML document that contains none.  For instance, you may want to accept tracked revisions programmatically while checking in a document to a SharePoint document library, thereby ensuring that no documents in the library contain tracked revisions.  Another important use is accepting revisions before a complex transform, such as one from Open XML to XHTML.  The transform is significantly easier to write if you first accept all tracked revisions.

(Update December 21, 2009 - See Accept Revisions in Open XML WordprocessingML Documents for complete semantics of accepting tracked revisions in WordprocessingML documents.)

Today, I've posted code in the PowerTools for Open XML project on CodePlex that contains a more complete implementation of Open XML SDK code to accept tracked revisions.  To download the code, go to CodePlex.com/PowerTools, click on the Downloads tab, and download RevisionAccepter.zip.  The code is released under the Microsoft Public License (Ms-PL), which gives you wide latitude in how you use the code.

Using the code is simple.  You call a static method in the RevisionAccepter class, passing an open WordprocessingDocument.  When the method returns, the document contains no tracked revisions.

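// Requires: using DocumentFormat.OpenXml.Packaging;
// Open the document for editing (the second argument, true, means editable).
using (WordprocessingDocument doc =
    WordprocessingDocument.Open("DocumentWithRevisions.docx", true))
{
    // Accept all tracked revisions in the open document.
    RevisionAccepter.AcceptRevisions(doc);
}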

There have been a couple of earlier implementations of accepting tracked revisions.  Some time ago, I posted some LINQ to XML code that accepts tracked revisions.  In addition, the Open XML Format SDK 2.0 Code Snippets for Visual Studio 2008 contains a snippet for this purpose.  Neither of those implementations is complete.  For example, you can track revisions while inserting and deleting content controls, and those revisions won't be accepted by either of the above implementations.  In addition, neither of them processes deleted paragraph marks properly.  There are a number of other areas where neither implementation is correct.

Accepting tracked revisions is a non-trivial problem.  There are more than 40 markup elements that you must process.  There are 23 elements that must simply be removed.  There are four elements that must be collapsed (the element is removed, and child elements are promoted in the XML hierarchy to the level of the removed element).  These are the easy ones.  In addition to these, there are a handful of elements that can be processed in a few lines of code, and a handful that require somewhat extensive processing, although 'extensive' is relative.  The RevisionAccepter class contains around 1700 lines of code, which is still in the category of an example program, not a real development effort.
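To make the difference between "removed" and "collapsed" concrete, here is a minimal LINQ to XML sketch.  It is not the RevisionAccepter code, and it handles only two run-level elements: w:del is treated as an element to remove, and w:ins as an element to collapse.  The class name AcceptSketch and the method name AcceptSimpleRevisions are made up for this illustration, and the sketch ignores all of the harder cases (paragraph marks, content controls, moved text, and so on).

```csharp
using System.Linq;
using System.Xml.Linq;

// Illustration only: hypothetical names, not part of PowerTools for Open XML.
public static class AcceptSketch
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Mutates the given WordprocessingML fragment in place.
    public static void AcceptSimpleRevisions(XElement root)
    {
        // "Remove": drop each w:del element together with everything inside it.
        root.Descendants(W + "del").Remove();

        // "Collapse": replace each w:ins element with its own child nodes,
        // promoting them to the level of the removed element.
        // Reverse document order processes inner elements before outer ones.
        foreach (XElement ins in root.Descendants(W + "ins").Reverse().ToList())
        {
            ins.ReplaceWith(ins.Nodes());
        }
    }
}
```

If you load a paragraph that contains an ordinary run, a run wrapped in w:del, and a run wrapped in w:ins, then call AcceptSketch.AcceptSimpleRevisions on it, the deleted run disappears and the inserted run becomes a direct child of the paragraph.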

Currently, the PowerShell cmdlet for accepting tracked revisions still uses the old approach.  I'll be updating that cmdlet in a couple of months to use the new version.  I've posted the AcceptRevisions code so that C# developers can take advantage of it now.