Validate Open XML Documents using the Open XML SDK 2.0

Open XML developers create new documents in a variety of ways, either by transforming an existing document into a new one or by programmatically altering an existing document and saving it back to disk.  It is valuable to use the Open XML SDK 2.0 to determine whether the new or altered document, spreadsheet,…

ListItemRetriever: Accurately Retrieving Text of an Open XML WordprocessingML Paragraph

When you are retrieving the text of an Open XML WordprocessingML paragraph, it is often important to retrieve the text of a list item.  This was especially true for the WordprocessingML => XHtml transform.  This post introduces the ListItemRetriever class, which implements one aspect of the functionality in HtmlConverter to retrieve the entire text…

Retrieving the Default Style Name of an Open XML WordprocessingML Document

Whenever I write Open XML SDK code that processes paragraphs based on style name, I need to retrieve the default style name for a document.  It is pretty easy to do, but it always takes a little time to remember or look up the element and attribute names.  I’m posting this code here so…
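
As a rough illustration of the element and attribute names involved, here is a minimal sketch in Python using only the standard library. It is not the code from the post, which uses the Open XML SDK. It assumes you have already read the contents of the styles part (typically word/styles.xml inside the .docx package) into a string. The function name default_paragraph_style_id and the sample markup are hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, written in ElementTree's {uri}name form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def default_paragraph_style_id(styles_xml):
    """Return the styleId of the default paragraph style, or None.

    Hypothetical helper: looks for a w:style element with
    w:type="paragraph" and a truthy w:default attribute.
    """
    root = ET.fromstring(styles_xml)
    for style in root.iter(W + "style"):
        is_paragraph = style.get(W + "type") == "paragraph"
        is_default = style.get(W + "default") in ("1", "true", "on")
        if is_paragraph and is_default:
            return style.get(W + "styleId")
    return None


# Made-up styles part, trimmed to the attributes the lookup needs.
SAMPLE_STYLES = (
    '<w:styles xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:style w:type="character" w:default="1" w:styleId="DefaultParagraphFont"/>'
    '<w:style w:type="paragraph" w:default="1" w:styleId="Normal">'
    '<w:name w:val="Normal"/>'
    '</w:style>'
    '<w:style w:type="paragraph" w:styleId="Heading1"/>'
    '</w:styles>'
)

print(default_paragraph_style_id(SAMPLE_STYLES))
```

The sketch returns the style ID, which is the value that paragraphs reference through w:pStyle. The display name, if you need it, is in the w:val attribute of the style's w:name child element.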

Enabling Better Transformations by Simplifying Open XML WordprocessingML Markup

When transforming Open XML markup to another XML vocabulary (such as XHtml), you can sometimes simplify the transform by first transforming the original document to a new, valid WordprocessingML document that contains much simpler markup, and therefore is easier to process.  WordprocessingML markup has many capabilities, such as revision tracking, content controls, and comments.  You…

HtmlConverter: Transform Open XML WordprocessingML to XHtml

Last October, I embarked on a project to convert Open XML WordprocessingML to XHtml.  I’ve now published an MSDN article, Transforming Open XML WordprocessingML to XHTML Using the Open XML SDK 2.0, that describes the first version of this translator. This is one in a series of posts on transforming Open XML WordprocessingML to XHtml.  You…

Simplifying Open XML WordprocessingML Queries by First Accepting Revisions

Revision tracking is one of the more involved areas of the Open XML standard.  There are over 40 elements and attributes (some with very involved semantics) that define tracked revisions.  I’ve written an MSDN article, Accepting Revisions in Open XML Word-Processing Documents, on the exact semantics of revision tracking markup.  By first accepting revisions, you…

Updated DocumentBuilder to work with Dec09 CTP of Open XML SDK V2

DocumentBuilder is a small API (part of the PowerTools for Open XML project, an open source project on CodePlex) that allows you to merge contents of documents while retaining document integrity and resolving issues of markup interdependence.  This post contains detailed information on interdependence of Open XML WordprocessingML markup.  This post introduces DocumentBuilder, and gives…

How to Control Sections when using OpenXml.PowerTools.DocumentBuilder

DocumentBuilder is a small API (part of the PowerTools for Open XML project, an open source project on CodePlex) that allows you to merge contents of documents while retaining document integrity and resolving issues of markup interdependence.  This post contains detailed information on interdependence of Open XML WordprocessingML markup.  This post introduces DocumentBuilder, and gives…

Accepting Revisions in Open XML WordprocessingML Documents

Revision tracking markup in Open XML word-processing documents is one of the more complex areas of the standard. If you first accept tracked revisions, it makes subsequent processing of text in word-processing documents much simpler. As an example, in my current project of transforming Open XML word-processing documents to XHtml, before doing the conversion, I…
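
To give a feel for why accepting revisions simplifies later processing, here is a deliberately reduced sketch in Python (standard library only). It is not the algorithm from the article. It assumes just two kinds of revision markup, run-level insertions (w:ins) and deletions (w:del), and ignores everything else the standard defines, such as moved text, deleted paragraph marks, and property changes. The function names and the sample paragraph are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, written in ElementTree's {uri}name form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def accept_simple_revisions(elem):
    """Accept w:ins and w:del revisions in place (simplified assumption).

    Accepting an insertion keeps its content, so w:ins is unwrapped.
    Accepting a deletion discards its content, so w:del is dropped.
    """
    kept = []
    for child in list(elem):
        if child.tag == W + "del":
            continue  # deleted content disappears once accepted
        accept_simple_revisions(child)
        if child.tag == W + "ins":
            kept.extend(list(child))  # promote the inserted runs
        else:
            kept.append(child)
    elem[:] = kept


def paragraph_text(paragraph):
    """Concatenate the text of all w:t elements in a paragraph."""
    return "".join(t.text or "" for t in paragraph.iter(W + "t"))


# Made-up paragraph with one tracked insertion and one tracked deletion.
SAMPLE_PARAGRAPH = (
    '<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:r><w:t>Hello </w:t></w:r>'
    '<w:ins w:id="1" w:author="a"><w:r><w:t>brave </w:t></w:r></w:ins>'
    '<w:del w:id="2" w:author="a"><w:r><w:delText>old </w:delText></w:r></w:del>'
    '<w:r><w:t>world</w:t></w:r>'
    '</w:p>'
)

paragraph = ET.fromstring(SAMPLE_PARAGRAPH)
accept_simple_revisions(paragraph)
print(paragraph_text(paragraph))
```

After the call, the paragraph contains only plain w:r children, so code that walks runs no longer needs to know about revision wrappers. A real implementation has to handle far more cases than this sketch does.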

Working with Numbering in Open XML WordprocessingML

When implementing a conversion of Open XML word processing documents to HTML, one of the interesting issues is accurately converting numbered and bulleted lists.  You must pay special attention to them, because they impact the text that the document contains, but that text isn’t directly in the markup.  If you are accurately extracting the text…
