Merging Comments from Multiple Open XML Documents into a Single Document

Microsoft Word 2007 allows you to lock a document so that users cannot change its content but can still add comments. If we have multiple documents that have the same content yet different comments, we can merge those comments into a single document. One possible use would be a specification review system. After the specification writer finishes a specification, she could send it to other members of her team for review. As each reviewer returns the specification, she could merge all comments into a single document, making it simpler to integrate those comments. (This example was inspired by an email thread with Sergey Solyanik, and his need for a comment merger for his very cool code (and soon-to-be spec) review system. Thanks also to Sergey for doing a code review; the code is better for it.) The code to do comment merging is available in a zip file named CommentMerger.zip in the downloads tab at www.codeplex.com/powertools.

Note: The CommentMerger class is part of the Power Tools for Open XML project. In the future, I’m going to build a new PowerShell cmdlet to do comment merging. Power Tools for Open XML is an open source project on CodePlex that makes it easy to create and modify Open XML documents using PowerShell scripts. It’s important to note that Power Tools for Open XML is not a supported Microsoft product and doesn’t necessarily represent future product direction. We think it will serve as inspiration for customers who need to create and modify Open XML documents programmatically. Power Tools for Open XML is published under the Microsoft Public License (Ms-PL), which gives you wide latitude in how you use the code.

The CommentMerger.MergeComments method uses the Open XML SDK. Using the CommentMerger class is pretty simple: you call the method, passing two open WordprocessingDocument objects. The destination document is opened for editing, and the source document is opened read-only:

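using DocumentFormat.OpenXml.Packaging;

// Open the destination for editing (true) and the source read-only (false).
using (WordprocessingDocument destinationDocument =
    WordprocessingDocument.Open("Test1a.docx", true))
using (WordprocessingDocument sourceDocument =
    WordprocessingDocument.Open("Test1b.docx", false))
{
    // Merge the comments from sourceDocument into destinationDocument.
    CommentMerger.MergeComments(destinationDocument, sourceDocument);
}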

Upon return, the comments in the source document are merged into the destination document. To merge comments from multiple documents, call the method once for each source document, using the same destination document each time.
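
Merging several reviewed copies could look something like the following. This is only a sketch: the MergeAll helper and its parameter names are hypothetical, and it assumes that the Open XML SDK and the CommentMerger class from the download are referenced by the project (depending on how you include CommentMerger, you may also need a using directive for its namespace).

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;

static class MergeAllExample
{
    // Hypothetical helper: merges the comments from each reviewed copy
    // into the destination document, one source at a time.
    public static void MergeAll(string destinationPath,
        IEnumerable<string> reviewedPaths)
    {
        // Open the destination once, for editing.
        using (WordprocessingDocument destination =
            WordprocessingDocument.Open(destinationPath, true))
        {
            foreach (string path in reviewedPaths)
            {
                // Open each source read-only and close it before the next one.
                using (WordprocessingDocument source =
                    WordprocessingDocument.Open(path, false))
                {
                    CommentMerger.MergeComments(destination, source);
                }
            }
        }
    }
}
```

A call might look like MergeAllExample.MergeAll("Spec.docx", new[] { "Spec-Reviewer1.docx", "Spec-Reviewer2.docx" }), where the file names are made up for illustration.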

I wrote the CommentMerger.MergeComments method in the pure functional style. All methods are written without side effects, and no variable is mutated after it is initialized. For a detailed explanation of this approach, see Recursive Approach to Pure Functional Transformations of XML.
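
To give a feel for the style, here is a minimal sketch of a recursive, side-effect-free transform in LINQ to XML. It is not the code from CommentMerger; the class name, method name, and the "drop one element name" rule are made up for illustration. The point is that the method builds and returns a new tree and never modifies the tree it was given.

```csharp
using System.Linq;
using System.Xml.Linq;

static class PureTransformExample
{
    // Returns a transformed copy of the node. The input tree is not modified,
    // and no variable is reassigned after it is initialized.
    public static XNode Transform(XNode node, XName nameToDrop)
    {
        XElement element = node as XElement;
        if (element == null)
            return node; // Text and other nodes are cloned when added to the new parent.

        // Build a new element from the old name, the old attributes,
        // and the recursively transformed child nodes.
        return new XElement(element.Name,
            element.Attributes(),
            element.Nodes()
                .Where(n => !(n is XElement && ((XElement)n).Name == nameToDrop))
                .Select(n => Transform(n, nameToDrop)));
    }
}
```

Because LINQ to XML clones attributes and nodes that already have a parent when they are added to a new element, the original tree stays intact and can still be queried after the transform.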

The CommentMerger.MergeComments method is an example of a ‘Common-vocabulary document-centric transform’.  For an overview of these types of transforms, see Document-Centric Transforms using LINQ to XML.

Before merging comments, the CommentMerger.MergeComments method validates that the two documents contain the same content.  For more info, see Comparing Two Open XML Documents using the Zip Extension Method.
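
As a rough illustration of the idea (not the actual validation in CommentMerger), the sketch below compares the paragraph text of two w:body elements pairwise. It assumes the Enumerable.Zip method that ships with .NET 4 and later rather than a hand-written Zip extension method, and the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class SameContentExample
{
    static readonly XNamespace WordMl =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // The concatenated text of the w:t elements in each paragraph.
    static IEnumerable<string> ParagraphTexts(XElement body)
    {
        return body.Descendants(WordMl + "p")
            .Select(p => string.Concat(
                p.Descendants(WordMl + "t").Select(t => (string)t)));
    }

    // True if both bodies have the same number of paragraphs
    // and each pair of paragraphs has the same text.
    public static bool HaveSameText(XElement body1, XElement body2)
    {
        List<string> first = ParagraphTexts(body1).ToList();
        List<string> second = ParagraphTexts(body2).ToList();

        // Zip stops at the end of the shorter sequence, so check the counts first.
        return first.Count == second.Count
            && first.Zip(second, (a, b) => a == b).All(same => same);
    }
}
```

Comparing text only means that two documents whose runs are split differently (for example, around comment range markers) still count as having the same content.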

This code is based on the code I presented in Splitting Runs in Open XML Word Processing Document Paragraphs.

This code pre-atomizes XName objects.  See A More Robust Approach for Handling XName Objects in LINQ to XML.
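
Pre-atomizing means creating each XName once, in a static field, instead of building it from a namespace and a string every time a query runs. A minimal sketch, assuming a hypothetical holder class named W and a small, illustrative subset of WordprocessingML element names:

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical holder class: each XName is created once, when the class is initialized.
static class W
{
    public static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static readonly XName p = w + "p";
    public static readonly XName r = w + "r";
    public static readonly XName t = w + "t";
    public static readonly XName comment = w + "comment";
}

static class XNameExample
{
    // Queries use the pre-built names rather than w + "p" at each call site.
    public static int CountParagraphs(XElement body)
    {
        return body.Descendants(W.p).Count();
    }
}
```

Besides avoiding repeated name lookups, this catches misspelled element names at compile time, because W.paragrph does not compile whereas w + "paragrph" silently matches nothing.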
