Using XHTML in a WordprocessingML document

This question has come up a few times, most recently over on the OpenXMLDeveloper site (https://openxmldeveloper.org/forums/477/ShowThread.aspx#477).

The challenge a lot of folks have is that they want to generate a WordprocessingML document from pre-existing content. Often that content is in another format, like HTML. The same applies if you have folks entering rich content in a web form or some other type of HTML control, and you then want to use that content to generate a WordprocessingML document. While there are tools out there that will transform HTML into WordprocessingML, this is also easily achievable using the altChunk element.

You can place one or more XHTML files as separate parts in the ZIP package and give each one the proper content type. Then create a relationship to each part from the document.xml part. Once you've done that, you can place the altChunk element (which is a block-level element) into the content of the document and reference the relationship ID that points at the XHTML part. You also have the option to specify whether you want the styles to be merged with the document or whether you want the imported content to maintain its source formatting.
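To make the altChunk element and the formatting option a bit more concrete, here is a minimal Python sketch that builds the element with the standard library. The function name make_alt_chunk and its keep_source_formatting parameter are made up for illustration. The sketch assumes the formatting option is expressed through the altChunkPr/matchSrc child element, which is how I understand the schema; treat it as a starting point rather than a definitive implementation.

```python
import xml.etree.ElementTree as ET

# Namespaces for WordprocessingML and for relationship references.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Ask ElementTree to serialize with the conventional w: and r: prefixes.
ET.register_namespace("w", W_NS)
ET.register_namespace("r", R_NS)


def make_alt_chunk(rel_id, keep_source_formatting=None):
    """Build an altChunk element pointing at the relationship rel_id.

    Hypothetical helper. If keep_source_formatting is None, no
    altChunkPr child is written and the consumer's default applies.
    """
    chunk = ET.Element("{%s}altChunk" % W_NS, {"{%s}id" % R_NS: rel_id})
    if keep_source_formatting is not None:
        props = ET.SubElement(chunk, "{%s}altChunkPr" % W_NS)
        # Assumption: matchSrc=true keeps the source formatting on import.
        value = "true" if keep_source_formatting else "false"
        ET.SubElement(props, "{%s}matchSrc" % W_NS, {"{%s}val" % W_NS: value})
    return chunk


xml_text = ET.tostring(
    make_alt_chunk("rel7", keep_source_formatting=True), encoding="unicode"
)
print(xml_text)
```

The resulting element would go into the body of document.xml as a sibling of the surrounding paragraphs, since altChunk is block level.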

So, for example, you could have the following:

<document>
  <body>
    <p><r><t>Here is some WordprocessingML followed by some XHTML:</t></r></p>
    <altChunk r:id="rel7"/>
    <p><r><t>Here is some more WordprocessingML</t></r></p>
  </body>
</document>

The relationship type is: http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk

The content type for XHTML is: application/xhtml+xml
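Putting the pieces together, here is a rough Python sketch that assembles a minimal package in memory using only the standard library. The part name word/chunk.xhtml, the function name build_docx, and the sample XHTML text are my own choices for illustration. The sketch also assumes the relationship type is spelled aFChunk under an http:// URI and that the XHTML part uses the registered application/xhtml+xml media type; adjust those if your consumer expects something different.

```python
import io
import zipfile

# Content types: defaults by extension plus an override for the main part.
# Assumption: the XHTML part is declared as application/xhtml+xml.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xhtml" ContentType="application/xhtml+xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

# Package-level relationship that identifies the main document part.
PACKAGE_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

# Relationship from document.xml to the XHTML part (Target is relative to word/).
# Assumption: relationship type ends in aFChunk.
DOCUMENT_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rel7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk" Target="chunk.xhtml"/>
</Relationships>"""

# Main document: two paragraphs with the altChunk between them.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
            xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <w:body>
    <w:p><w:r><w:t>Here is some WordprocessingML followed by some XHTML:</w:t></w:r></w:p>
    <w:altChunk r:id="rel7"/>
    <w:p><w:r><w:t>Here is some more WordprocessingML</w:t></w:r></w:p>
  </w:body>
</w:document>"""

# Hypothetical XHTML content to import.
XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Imported chunk</title></head>
  <body><p>This paragraph comes from <b>XHTML</b>.</p></body>
</html>"""


def build_docx():
    """Return the bytes of a minimal .docx that imports one XHTML part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", PACKAGE_RELS)
        zf.writestr("word/document.xml", DOCUMENT)
        zf.writestr("word/_rels/document.xml.rels", DOCUMENT_RELS)
        zf.writestr("word/chunk.xhtml", XHTML)
    return buffer.getvalue()
```

To try it, write the bytes returned by build_docx() to a file with a .docx extension and open that file in a consumer that supports altChunk.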

With the example above, the content of the XHTML file referenced by the altChunk element would show up directly inline after the first paragraph. Note that this is an import-only feature. Once the file is opened, the XHTML content is merged with the rest of the file, and when you save, it is represented as WordprocessingML rather than XHTML.

This was something I really wanted us to support with the 2003 XML formats when we did the cfChunk work. The cfChunk is extremely useful, and the altChunk builds off of it.

-Brian