A common business need: Generating server-side documents on the fly

I gathered a list of common Open XML questions related to programmability:

  1. What are the Open XML File Formats and what can I do with them?
  2. Can you show me the internal structure of a Word 2007 document?
  3. What are WordprocessingML, SpreadsheetML, PresentationML, and DrawingML?
  4. Do you have a .NET API that I can use to generate documents programmatically (server-side)?
  5. What is the architecture of a server-side OBA document generation solution?
  6. How can I generate a document programmatically and have more control over document content?
  7. How do you add images to an Open XML document?
  8. How can I pull data from my data source and create a table in a document?
  9. How can I add styles and format to my document content?
  10. What about compatibility with previous versions of Office?

Office Business Applications + Open XML File Formats

When you are trying to create a document assembly solution and you are want to understand how you can use the Open XML File Formats to generate a document programmatically, you may be faced to some of the previous questions. All this questions have been answered in multiple MSDN articles, SDKs, blogs, trainings, forums, and newsgroups. However, I am the kind of person that loves end-to-end documentation and code samples that take you from zero to a working solution. We all have limited time to learn new technologies and walkthrough articles and code sample downloads are always a nice option.

Some time ago I tried to do the same thing and I blogged to show you how to generate a document using a document template, content controls, and XML mapping. I also created a little video and article that shows how to bind custom xml to a document template. This approach is great when you are trying to replace placeholder data in document templates like an invoice or contract. However, your business needs may be different and you may want to have more control over document content and formatting. In that case a better approach would be to manipulate the WordProcessingML content stored in different document parts.

I wrote a new article that helps answer the Open XML questions listed in this blog entry. I split the article in two parts and a code sample download. I start by discussing all the theory and basic concepts you need to learn to work with the Open XML File Formats. For example, I talk about Open XML Package Architecture, WordprocessingML basics, the Open XML object model, and the conceptual architecture of a document integration solution.

The second part explains all the coding that needs to happen to generate a simple sales document from scratch. I show you how to deal with images, tables, styles, and formatting. I also show how to create a helper class that pulls data from your line of business systems (in this case the AdventureWorks sample database to keep the LOB piece as simple as possible), and a helper class that uses the Open XML object model and WordprocessingML to create a document.

You can find the articles and code samples here:

Many thanks go to Doug Mahugh, Wouter van Vugt, and Frank Rice for sharing all their knowledge and helping me put this together. I hope this helps you get started with custom document generation with Open XML.

Enjoy!