Reusable Methods for Manipulating Paragraphs in WordprocessingML

In a previous post, I showed you an easy way to merge multiple Word documents into one final document by taking advantage of altChunks. One issue with altChunks is that, in order to view the final merged document, you need an application, like Word, that understands altChunks and can actually perform the complex merge. What happens if you don't have the luxury of using Word, or any other application that understands altChunks? Well, then you have to merge the documents together manually.

Manually merging content, whether within one document or across documents, is possible, but there are a number of things you need to consider before you can call your merge task complete. Here are a few of the complexities:

  • Styles
    • Does your content reference any styles?
    • Does your destination document also reference those styles?
    • Are there any style conflicts between the source and destination documents? For example, does the source document specify bold for style "Foo" while the destination document specifies italics for style "Foo"?
    • Are the document defaults and Normal style definitions different between the source and destination files?
  • Numbering
    • Does your content reference any numbering?
    • Does your source document reference a numbering definition that already exists in the destination document?
    • Do you want to continue numbering or restart numbering for copied content?
  • References
    • Does your content reference other parts, like images, comments, headers/footers, etc.?
  • Range Elements
    • Does your content contain range-based elements, like bookmarks, content controls, custom XML, etc.?
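
To make the checklist above a bit more concrete, here is a minimal sketch in Python (standard library only) that scans a chunk of WordprocessingML and reports which of these concerns it touches. This is not how the Power Tools do it; the function name audit_fragment and the sample markup are hypothetical, and the sample is heavily simplified (a real drawing, for instance, nests its image reference much deeper). It only tells you which questions you need to ask, not how to answer them.

```python
import xml.etree.ElementTree as ET

# Standard WordprocessingML and relationship namespaces.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
A = "http://schemas.openxmlformats.org/drawingml/2006/main"

# Hypothetical, simplified content you might want to copy from a source document.
SAMPLE = f"""<w:body xmlns:w="{W}" xmlns:r="{R}" xmlns:a="{A}">
  <w:p>
    <w:pPr>
      <w:pStyle w:val="Foo"/>
      <w:numPr><w:ilvl w:val="0"/><w:numId w:val="3"/></w:numPr>
    </w:pPr>
    <w:bookmarkStart w:id="0" w:name="intro"/>
    <w:r><w:rPr><w:rStyle w:val="Emphasis"/></w:rPr><w:t>Hello</w:t></w:r>
    <w:bookmarkEnd w:id="0"/>
  </w:p>
  <w:p>
    <w:r><w:drawing><a:blip r:embed="rId5"/></w:drawing></w:r>
  </w:p>
</w:body>"""

# Elements whose w:val attribute names a style.
STYLE_TAGS = {f"{{{W}}}{name}" for name in ("pStyle", "rStyle", "tblStyle")}

# Range-based elements: bookmarks, content controls, custom XML, comment ranges.
RANGE_TAGS = {
    f"{{{W}}}{name}"
    for name in (
        "bookmarkStart", "bookmarkEnd", "sdt", "customXml",
        "commentRangeStart", "commentRangeEnd",
    )
}


def audit_fragment(xml_text):
    """Collect the style, numbering, relationship, and range usage in a fragment."""
    report = {
        "styles": set(),         # style IDs the content references
        "numbering": set(),      # numbering instance IDs (w:numId)
        "relationships": set(),  # relationship IDs pointing at other parts
        "ranges": set(),         # local names of range-based elements found
    }
    val = f"{{{W}}}val"
    for el in ET.fromstring(xml_text).iter():
        if el.tag in STYLE_TAGS:
            report["styles"].add(el.get(val))
        elif el.tag == f"{{{W}}}numId":
            report["numbering"].add(el.get(val))
        elif el.tag in RANGE_TAGS:
            report["ranges"].add(el.tag.split("}", 1)[1])
        # Any attribute in the relationships namespace (r:id, r:embed, ...)
        # means the content depends on another part, such as an image.
        for attr_name, attr_value in el.attrib.items():
            if attr_name.startswith(f"{{{R}}}"):
                report["relationships"].add(attr_value)
    return report


report = audit_fragment(SAMPLE)
for concern, found in report.items():
    print(concern, sorted(found))
```

Each non-empty set maps back to one group of questions in the list: the style IDs have to be checked against the destination's style definitions, the numbering IDs against its numbering definitions, the relationship IDs mean parts must be copied along with the content, and the range elements need their IDs and names kept unique.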

The issues listed above are just some of the things you need to think about before you can accomplish manual merging. Sounds like a lot of work, but I do have some good news.

Eric White recently wrote a post about how we have extended the Power Tools for Open XML to include functionality for manipulating, inserting, and deleting paragraphs within a WordprocessingML document. The great thing about these Power Tools is that they are completely open source under the Microsoft Public License (Ms-PL). That means you can freely deploy solutions that use any of the code within the Power Tools. Another cool piece of information is that these tools are built on version 1 of the Open XML SDK, which has a "go-live" license. In other words, there is nothing stopping you from reusing any of the code within Power Tools in your own solution. Perhaps in a later post I will build on top of these libraries to accomplish a rich end-to-end scenario.

Next Time

In my last post, I got a request from Anthony Rubalcaba to write a post on copying a spreadsheet within a workbook. So, next week I will show you how to accomplish this scenario.

Zeyad Rajabi