Updated DocumentBuilder to work with Dec09 CTP of Open XML SDK V2

DocumentBuilder is a small API (part of the PowerTools for Open XML project, an open source project on CodePlex) that lets you merge the contents of documents while retaining document integrity and resolving issues of markup interdependence.  Earlier posts cover the interdependence of Open XML WordprocessingML markup in detail, introduce DocumentBuilder with a few examples of its use, and discuss how to control sections (and headers) when using DocumentBuilder.

DocumentBuilder is licensed under the Microsoft Public License (Ms-PL), which gives you wide latitude in how you use the code.  To get DocumentBuilder, go to PowerTools for Open XML, click on the Downloads tab, and download DocumentBuilder.zip.

I've updated DocumentBuilder with a couple of minor bug fixes:

  • The Dec09 CTP changed the way that you add a custom XML part.  It also changed the way that you query for and add hyperlinks.  Updated DocumentBuilder to use the changed API.
  • Fixed a bug where images were copied one byte too short.
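
The post does not say what caused the one-byte truncation, and the snippet below is not DocumentBuilder's code (which is C#).  It is only a minimal Python sketch of the general technique that avoids this class of bug: read until the stream reports end of data instead of computing a byte count by hand, then verify the copy.  The names `copy_stream` and `copy_is_complete` are hypothetical.

```python
import io


def copy_stream(src, dst, chunk_size=4096):
    """Copy every byte from src to dst and return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def copy_is_complete(original, chunk_size=4096):
    """Round-trip the bytes through copy_stream and check nothing was lost."""
    src = io.BytesIO(original)
    dst = io.BytesIO()
    copied = copy_stream(src, dst, chunk_size)
    return copied == len(original) and dst.getvalue() == original
```

A check like `copy_is_complete(image_bytes)` in a test suite would catch a copy that stops one byte early, because both the length and the content are compared.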

There are upcoming tasks for DocumentBuilder, and the PowerTools for Open XML in general:

  • Validate with Office 2010, and ISO/IEC 29500.  I want to revisit each of the cmdlets, and make sure that they work properly for ISO/IEC 29500.
  • Fix issue with duplicate IDs.  This is not a serious issue, as the resulting documents load in Word properly.  I believe that the spec indicates that an implementation is free to load a document even though its IDs are not unique.
  • Enhance DocumentBuilder so that if multiple source documents contain the same header, it reuses that header instead of duplicating it in the destination document.
  • Build new test harness/suite for DocumentBuilder, CommentMerger, and RevisionAccepter.
  • Build a cmdlet for CommentMerger.
  • There are a few cmdlets in PowerTools for Open XML that I want to revisit, and make more general.
  • Some of the cmdlets and supporting code can be refactored.  I plan on making a utility module for code that is specific to Open XML, and a utility module for code not related to Open XML, such as some of my favorite functional programming extension methods and classes.
  • Incorporate the new transform from WordprocessingML into XHtml.  (I haven't yet blogged on the final version of the transform.)
  • Validate with PowerShell 2.0.
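
For the duplicate ID item in the list above, a check along the following lines could report which ID values occur more than once in a merged document part.  This is only a sketch in Python, not part of PowerTools.  The post does not say which elements or attributes carry the IDs, so the un-namespaced `id` attribute and the sample markup are assumptions for illustration; real WordprocessingML attributes are usually namespace-qualified, so the attribute name would need to be given in ElementTree's `{namespace}name` form.  The function name `find_duplicate_ids` is hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def find_duplicate_ids(xml_text, attr_name="id"):
    """Return the sorted attribute values that appear on more than one element."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        el.attrib[attr_name] for el in root.iter() if attr_name in el.attrib
    )
    return sorted(value for value, n in counts.items() if n > 1)


# Toy markup (an assumption, not real WordprocessingML): the value "1" is reused.
SAMPLE = """<doc>
  <item id="1"/>
  <item id="2"/>
  <group><item id="1"/></group>
</doc>"""
```

Running `find_duplicate_ids(SAMPLE)` on the toy markup reports the reused value, which is the information a fix would need in order to renumber the colliding elements.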