Data-driven document generation with Word 2007 and the Office XML File Formats: Part 1


Last week I had a chance to write some interesting code samples. I missed coding a lot, and I felt a little rusty after working in different projects. But I had a lot of fun and I want to share with you one of the samples I wrote: A simple ASP.NET 2.0 solution that retrieves data from SQL Server and generates a Word 2007 document using the Office XML File formats, WinFX, and XML Mappings in Word 2007.


One of the most common requirements for applications that work with data is “data-driven document generation.” No matter what the data source is -could be an Access database, SQL database, Web service, SharePoint list, Excel spreadsheet, XML file, Word document, or multiple sources. The typical “export data and create Word documents” feature is a common need.


How to do it? Two simple steps:



  1. The Word part: Create a Word 2007 document template.
  2. The Cool app part: Create an application that pulls data from a datasource and generates new documents based on the Word 2007 document template.

Word 2007 allows you to build data-driven document generation solutions. You can create a document template with a custom XML part and use Content Controls to bind to custom XML data using XML Mapping. Next, you can create a managed application to build a new document based on the template. The managed application will open the document template, retrieve data from a SQL Server database to build a new custom XML part, replace the template’s custom XML part , and save it as a new document.


This little sample will show you how to build a document template from scratch and create a server-side application that generates documents that display data stored in a SQL Server database. How did I learn to do this, well, believe it or not, I watched the following Office Dev Con session:


CD305—Word 2007 XML Programmability: Data/View Separation and Rich Eventing for Custom XML Solutions


Presenter: Tristan Davis; 59 minutes (180 MB)
In this session, we introduce new XML capabilities, then dive into the functionality of the Office XML data store and how it can be leveraged to build solutions that will strongly tie Word documents to your business processes.


..and then asked Tristan Davis some questions related with XML Mapping.


Business Scenario: Custom Customer Letter


To demonstrate how to build and Word 2007 template and connect an item in the data store to a SQL Server database, you will first build a customer letter document template with content controls that map to an XML file. Next, you will create a server-side data-driven document generation application that will allow you to select a company name to generate a custom letter. The application will retrieve customer data from a SQL Server database and use the customer letter document template to build a new document that displays customer data based on a user selection. The document will display the following information:



  • Company Name
  • Contact Name
  • Contact Title
  • Phone

To create a Word template and define the XML Mappings for each content control.


Use the following steps to create a Microsoft Office Word 2007 template.


1. Open Word 2007.


2. Create a document.


3. Create content controls to bind to a node in the data store.



What are content controls? Content controls are predefined pieces of content. There are several types of content controls, including text blocks, drop-down menus, combo boxes, calendar controls, and pictures. You can map these content controls to an element in an XML file. Using XPath expressions, you can programmatically map content in an XML file to a content control. This enables you to write a simple and short application to manipulate and modify data in a document. 


4. Set the XML mapping on the content control.



XML mapping is a feature of Word 2007 that enables you to create a link between a document and an XML file. This creates true data/view separation between the document formatting and the custom XML data.


To load a custom XML part, you must first add a new data store to a Document object by using the Add method of the CustomXMLParts collection. This appends a new, empty data store to the document. Because it is empty, you cannot use it yet. Next, you must load a custom XML part from an XML file into the data store, by calling the Load method of the CustomXMLPart object, using a valid path to an XML file as the parameter.


5. Add four plain text content controls in the following order.



  • Content control for Company Name
  • Content control for Contact Name
  • Content control for Contact Title
  • Content control for Phone

6. Save the template document as C:\CustomerLetterTemplate.docx.


To set an XML mapping on a content control


Note In this procedure, you map the content control to a sample custom XML file. You will create a valid custom XML file, save it to your hard drive, and add a data store to the document that contains the information to which you want to map.


1. Create a text file and save it as c:\CustomerData.xml.


2. Paste the following into the text file and save it: 



<?xml version=”1.0″?>
<Customer>
  
<CompanyName>Alfreds Futterkiste</CompanyName>
  
<ContactName>Maria Anders</ContactName>
  
<ContactTitle>Sales Representative</ContactTitle>
  
<Phone>030-0074321</Phone>
</Customer>

3. Now, map each content control to bind to the <CompanyName>, <ContactName>, <ContactTitle>, and <Phone> and nodes of the previous custom XML part.


4. Open the Visual Basic editor and run the following VBA code to add a data store to your template document.


This sample code demonstrates how to attach an XML file to a document, so that it becomes an available data store item.


‘ Load CustomerData.xml file
   
ActiveDocument.CustomXMLParts.Add
   ActiveDocument.CustomXMLParts(
4).Load (“c:\CustomerData.xml”)

To create an XML mapping, you use an XPath expression to the node in the custom XML data part to which you want to map a content control. After you add a data store to your document (and the data store points to a valid XML file), you are ready to map one of its nodes to a content control. To do this, pass a String containing a valid XPath to a ContentControl object by using the SetMapping method of the XMLMapping object (via the XMLMapping property of the ContentControl object).


5. Open the Visual Basic editor and run the following VBA code to bind content controls to items in the data store.


Dim strXPath1 As String
strXPath1 
“/Customer/CompanyName”
ActiveDocument.ContentControls(1).XMLMapping.SetMapping strXPath1

Dim strXPath2 As String
strXPath2 
“/Customer/ContactName”
ActiveDocument.ContentControls(2).XMLMapping.SetMapping strXPath2

Dim strXPath3 As String
strXPath3 
“/Customer/ContactTitle”
ActiveDocument.ContentControls(3).XMLMapping.SetMapping strXPath3

Dim strXPath4 As String
strXPath4 
“/Customer/Phone”
ActiveDocument.ContentControls(4).XMLMapping.SetMapping strXPath4

You have your template file and you are ready to start your document generation application. I love ASP.NET and I created a server-side application to show you how to generate documents on the server using the Office XML File Formats. You can build a document generation application using Win Forms, a console application, a shared Add-in or your preferred application depending of your business needs.


If you want to learn more about Word 2007 and document templates, read this article:


http://msdn2.microsoft.com/en-us/library/ms406053.aspx


This blog entry is too long already, so I will split it here and leave you with the intrigue of what part 2 will bring. Stay connected and you will soon know the happy ending for this walkthrough.


Enjoy,


-Erika

Comments (48)

  1. Edje says:

    Thanks very much for this Erika, I sincerely hope part two will deal with ‘collections’ as in adding a table with repeating data.  I am currently dealing with that using XSLT which is very cumbersome especially when you need to update and deploy.

    I love your posts!

  2. Brett says:

    I love this idea. One question, now that you have created the Word 2007 file on the server, what tools could you use to convert the document into a Word 2003 or XP file for download by users who haven’t got Office 2007?

  3. Erika Ehrli says:

    You can build a server-side application using Visual Studio to generate data-rich documents using the Office XML File Formats and  the .NET Framework 3.0 (aka Microsoft WinFX ). Here’s how…

  4. erikaehrli says:

    Edje:

    Thanks for the very nice comment :). Part 2 doesn’t deal with WordML generation, it deals more with replacing a Word Content Control (placeholder) with custom xml data. To iterate a collection to add a custom table, you need to create the WordML file (document.xml part). You can use a XMLDocument or XMLTextWriter object to create a table element. I will keep in mind your request for a future blog entry :).

    Brett:

    The only tool I know of is the "Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats (Beta 2)."   You can use this tool with your current version of Office (2003 and XP only) to be able to read and write the Office XML formats. You can download the tool here:

    http://www.microsoft.com/office/preview/beta/converter.mspx

  5. I’ve been planning to start pulling together some real world examples around the ways in which people…

  6. Qanuc says:

    Hi Erika!

    Great article, it helped me a lot.

    But now, i got a slight different problem:

    In my AddIn, it must be possible to add an entry to the CustomXmlPart, according to a given schema. Initially, the XML of the CustomXmlPart looks like "<MetaData></MetaData>", and by clicking some button it should become "<MetaData><Name>MySelf</Name></MetaData>". But, as the given schema I mentioned is very rich and flexible, I cannot pre-define all the fields.

    Now, I try the following:

    part = Globals.ThisAddIn.Application.ActiveDocument.CustomXMLParts.Add ("<MetaData></MetaData>", System.Reflection.Missing.Value);

    —> The SchemaCollection of the newly created CustomXmlPart is empty.

    part.SchemaCollection.Add ("NameSpaceUri", "SchemaAlias", "(path to xsd file)", false);

    When doing the SchemaCollection.Ad, I get the following exception:

    "System.Runtime.InteropServices.COMException (0xE0041802): This schema cannot be reloaded, as the schema collection is currently in use."

    What is going wrong here? Any idea?

    Thanks

    Michael

  7. erikaehrli says:

    Hey Michael,

    Thanks for writing.

    The CustomXMLParts.Add method allows you to add a new valid Custom XML part, what you are adding is invalid. Try doing the following:

    ActiveDocument.CustomXMLParts.Add

    ActiveDocument.CustomXMLParts(4).LoadXML ("<?xml version=’1.0′?><MetaData><Name>MySelf</Name></MetaData>")

    Where 4 is the index number of the CustomXMLPart. You can add more as needed.

    If you are trying to override an existing CustomXMLPart with new nodes,the best way is using the AddNode method.

    The model is similar to DOM.

    I hope this helps,

    -Erika

  8. Article submitted by: Stéphane Rodriguez This article is a tutorial for working with parts as defined…

  9. Article submitted by: Stéphane Rodriguez This article is a tutorial for working with parts as defined…

  10. Hi Erika,

    Nice article, though I’m afraid I am still clueless as to how to map a content control to an XML Element.

    Is there a way to do this through Word 2007? Can you guide me through the menus if possible?

    Thanks in advance.

    Akshay Dhavle

  11. goodwill says:

    Akshay,

    Totally agree with you, I really dont understand why documentations are usually missing important points… IMHO its really a bit lousy…. Not everyone have the bandwidth to see the demo play…

  12. Annette says:

    Hello,

    is the mapping of Content Controls to XML only possible programatically?

    Would be glad if you can give me an answer

    (annette.ilgen(at)cataligent.com )

  13. erikaehrli says:

    Ashkay, goodwill and Annette:

    There’s no UI to map controls to XML with one exception – SharePoint and Built-in properties are listed in the Quick Parts | Linked Properties fly out menu.

    You can map Content Controls to Custom XML parts.

    1. Using XMLMapping object (programmatically).

    2. Manually updating files inside of a document package. Here’s a great sample: http://openxmldeveloper.org/archive/2006/06/04/Mapping_Content_Controls_to_Custom_XML_Parts_using_Notepad.aspx

    3.Using the System.IO.Packaging class to programmatically manipulate the package and create the item.xml parts and the relationships.

    I promise to blog about this soon to clarify a bit more.

  14. Jason says:

    Hi Erika,

    Can you help me? I created a task pane using visual studio tools for office. This task pane have a list list of customers from a database and displays them in a grid.

    How can I mapping this information into a plain text content control?

    Thanks

    Jason

  15. I have used your sample to create a similar implementation in VSTO using C#. I anayone is interested i can make this available.

  16. Flavia says:

    Here is a link to view a video…how to link CC to xml data files using a ready-to-use toolkit… Source code also available…pretty neat i think

    http://channel9.msdn.com/ShowPost.aspx?PostID=254539

  17. adam says:

    Are there plans to support a datagrid (or a table) type of content control.  I need to report on a list of data and do not see an obvious way of doing this without manually writing all the code to create a table.

  18. Adriaan says:

    I am a php website developer, but I have to say that these tools that MS are making available are really cool… thanks for the info, I really found it quite useful.

  19. Joe Sweeney says:

    I found these directions very confusing for a first-timer using Word 2007. The steps to accomplish the task are not very direct or clear. I speak as an adjunct instructor at a community college for eight years. It took me a while to find the toolbar options that would allow me to add the text controls and to open a VBA window.

    However, once I found everything, it worked like a charm.

  20. We are going the way to use more XML based content. This makes us more flexible with further ideas but

  21. We are going the way to use more XML based content. This makes us more flexible with further ideas but

  22. Erika Ehrli says:

    I gathered a list of common Open XML questions related to programmability: What are the Open XML File

  23. Category: Word programming Level: Advance Here is a small tip on how to add data to Word content control

  24. Constantin says:

    Is there a possibility to select all the cc with a specific name? … throw select i mean make them somehow visible .. showing their titles or highlighted .. this being done programmatically.

  25. Timothy J. Winters says:

    Hi Erika,

    I’m (very) new to Open XML (and programming altogther)and have been playing around with it for a couple of days now. I’ve been trying to link an external XML file to a word document, but am having some issues in doing this.

    Basically, i’ll describe exactly what i’m doing and perhaps you can point me in the right direction:

    1) Create a new word 2007 document

    2) Add 3 plain text content controls from the developer tab

    3) Save the document in docx format

    4) Launch VBA Editor

    5) Enter the following code to add a custom XML part and load an XML document:

    ActiveDocument.CustomXMLParts.Add

    ActiveDocument.CustomXMLParts(3).Load ("c:TData.xml")

    NB – The XML file looks like this:

    <?xml version="1.0" ?>

    <Customer>

     <CompanyName>Timothy Winters</CompanyName>

     <Location>Springfield</Location>

     <Phone>02476144524</Phone>

    </Customer>

    6) Bind the data in the XML nodes to the (previously created) content controls:

    Dim strXPath1 As String

    strXPath1 = "/Customer/CompanyName"

    ActiveDocument.ContentControls(1).XMLMapping.SetMapping strXPath1

    Dim strXPath1 As String

    strXPath1 = "/Customer/Location"

    ActiveDocument.ContentControls(2).XMLMapping.SetMapping strXPath1

    Dim strXPath1 As String

    strXPath1 = "/Customer/Phone"

    ActiveDocument.ContentControls(3).XMLMapping.SetMapping strXPath1

    7) Click ‘save’ in VBA Editor.

    8) No changes in the content controls!!!!

    Where am I going wrong, Erika? 🙁 I know it’s something silly, but I can’t quite put my finger on it at the moment!

    I’m unsure as to whether I enter the code in steps 5  6 into the ‘Immediate’ window or in the ‘Code’ window. Do I enter step 5 first and then save? Or do I enter both sets of code at once and then save? :s

    Some assistance at grassroots level is desperately needed… 🙂

    Thanks in advance, Erika!

    PS You’re doing a great job!!!!

    Timothy.

  26. Timothy J. Winters says:

    PS In step 6 above, I entered the code incorrectly on the post. It should have read:

    Dim strXPath1 As String

    strXPath1 = "/Customer/CompanyName"

    ActiveDocument.ContentControls(1).XMLMapping.SetMapping strXPath1

    Dim strXPath2 As String

    strXPath1 = "/Customer/Location"

    ActiveDocument.ContentControls(2).XMLMapping.SetMapping strXPath2

    Dim strXPath3 As String

    strXPath1 = "/Customer/Phone"

    ActiveDocument.ContentControls(3).XMLMapping.SetMapping strXPath3

    Thanks!

  27. Timothy J. Winters says:

    Pardon me again… i’m having a bad day! My last post should have read:

    Dim strXPath1 As String

    strXPath1 = "/Customer/CompanyName"

    ActiveDocument.ContentControls(1).XMLMapping.SetMapping strXPath1

    Dim strXPath2 As String

    strXPath2 = "/Customer/Location"

    ActiveDocument.ContentControls(2).XMLMapping.SetMapping strXPath2

    Dim strXPath3 As String

    strXPath3 = "/Customer/Phone"

    ActiveDocument.ContentControls(3).XMLMapping.SetMapping strXPath3

    Thanks!

  28. Simon says:

    There are a couple of little checks that can be done here:

    1) how many custom XML parts have you got loaded?

    ?ActiveDocument.CustomXMLParts.count

    Word will just keep adding CustomXMLParts to a document every time you save, so you may find you have loads of bad data that does not map loaded. If this is the case check the XML in each part:

    ?ActiveDocument.CustomXMLParts(4).xml

    and where your (bad) data is loaded delete it.

    ActiveDocument.CustomXMLParts(4).delete

    2) has your XML loaded?

    activedocument.ContentControls(2).XMLMapping

    Will return True or False dependent on the sucess of the mapping; perhaps a msgbox as a result of false might help. Note: there appear to be very few error messages if mapping fails, the process appears to continue and map what can be mapped.

    3) finally you have run the VBA haven’t you?

    PS I added the mapping on document_open, including a global variable recording the number of the new customXMLpart.On document_close the new customXMLpart is deleted. This is of particular use where the document is used as a template.

    Cheers

    Simon

  29. Kathryn says:

    Once the data is in a data content control, say a simple rich text content control, how do we insert that into another part of the document. That is, how can I resuse text entered by the user within the same document?

    Kathryn

  30. Frode says:

    Hi.

    I’ve a case where i need to convert all dynamic content in a Word 2007 document to static text. Any ideas how and where to do this?

    The requirement is equivalent to what happens when you use the Save As function in Word 2007 and stores the document as a an old Word 97/2003 document, but it has to be done on the server side.

    Regards,

    Frode

  31. Ray says:

    Hi,

    I have a question whether everyone can help me with this VBA. I’m fairly new to this program but the company wants me to develop something for them. In word 2007, under the insert tab, there is a quick part section where the user can control its document property. I am developing a similar function to the document property, I’ve created a text box and a button where the user can put the text in the text box and hit the button. Once they hit the button, at the moment the content control is creating along with the string that is written inside the text box. However, when I change text from the text box, I have no control to change the content control that has already been written on the sheet. Below is the code to add the content control and the string. Any idea how to update all the content control that has been written on the sheet to the new text inserted in the text box?

    Dim editdocadd As String

    Dim rngTemp As Range

    Dim objRange As Range

    Dim objCustomPart As CustomXMLPart

    Dim objCustomControl As ContentControl

    Dim objCustomNode As CustomXMLNode

    Private Sub Callprop(ByVal control As IRibbonControl)

           ‘ActiveDocument.BuiltInDocumentProperties("Title").Value = editdocadd

           ‘ActiveDocument.BuiltInDocumentProperties editdocadd

            Set objCustomPart = ActiveDocument.CustomXMLParts.Add

                   objCustomPart.LoadXML ("<evans><author>Rayz</author>" & _

                       "<title>" + editdocadd + "</title></evans>")

                   ActiveDocument.Range.InsertParagraphAfter

                   Set objRange = ActiveDocument.Paragraphs(1).Range

                   Set objCustomNode = objCustomPart.SelectSingleNode _

                       ("/evans/title")

                   Set objCustomControl = ActiveDocument.ContentControls _

                       .Add(wdContentControlText, objRange)

                   objCustomControl.XMLMapping.SetMappingByNode objCustomNode

       End Sub

    Any help would be much appreciated.

    Thanks,

    Ray

  32. Ron Gregory says:

    You can download the "Word 2007 Content Control Kit" which allows you to map Content Controls to XML without using VBA.

  33. HA says:

    Hi Erika,

           We have written word letter tempaltes in 2003 but we are having trouble with variuos versions of word on the client machines.

    At this point we are thinking of generating letter templates that are independent of versions any help will be highly appreciated.

  34. Ray says:

    Hi Ron,

    Thanks for the suggestion to download Content Control Kit. I have downloaded and used it. I seems to be working fine. However, I can’t find anywhere to add the xml that we have editted with using content control manually to the word document. Is it possible to add extra button like it is provided under document property (i.e. if we add property name on xml and it shows on the word document when we open it. At the moment, there is no menu to add another property name instead we have to copy it manually.) Any suggestion how to add a button that we can add the additional property name?

    Thanks in advance 🙂

    Ray

  35. Warraich says:

    Nice article, I am looking something opposite of this. I have the data/contents in MS Word 2007 files and I want to save them to the database and also want to display them on the web site. Any commnets?

    atiqw@yahoo.com

    thanks much.

  36. henry2003 says:

    can you help me with my question

    is there a way that when saving a document  under Save As the name  the document gets saved under is automatically pulled in from pre defined field in the document property

    this would save me a step of having to retype the name and date of the document when I use the Save As

  37. Last week I had a chance to write some interesting code samples. I missed coding a lot, and I felt a little rusty after working in different projects. But I had a lot of fun and I want to share with you one of the samples I wrote: A simple ASP.NET 2.

  38. Mohan Kumar K says:

    Hi Erika,

    <?xml version="1.0"?>

    <Customer>

     <CompanyName>Alfreds Futterkiste</CompanyName>

     <ContactName>Maria Anders</ContactName>

     <ContactTitle>Sales Representative</ContactTitle>

     <Phone>030-0074321</Phone>

    </Customer>

    In the above code we mapped the plain text to the content controls? Is there anyway to input the image on the sample xml file?

    Thanks,

    Mohan

  39. ahsan says:

    plz write some code in C# or vb.net for this article thanks.

  40. mark says:

    this seems great! we need a database driven template ofa catalog which includes images listing and pricing  whichwe can keep on updating on DB side and the doc should update,we then would convert to PDF and then to digital catalog format

    any ideas?

  41. julian247 says:

    Hi Mark,

    I saw your post and thought you might be interested in an "out of the box" solution for your need?

    If you have SharePoint there's a SharePoint feature that adds document generation functionality to SharePoint and Word, called Qorus. Authors can use it to embed Data Tags in Microsoft Word templates to retrieve specified custom content from data sources – practically any data source via a Web service. It even converts the final output to PDF (if you have SharePoint 2010 or if are happy to buy a Word-PDF converter like Aspose), as per your post.

    You should check out the online demo or try the free trial of the product and see if it will suit your needs: http://www.qoruscorp.com

    kind regards

    julian

  42. Hi;

    Please also take a look at http://www.windwardreports.com – we do all of this without any programming required. Design in Word, Excel, or PowerPoint, merge the data in (using our engine), and you've got your documents.

    thanks – dave

  43. Chit Min Maung says:

    Hi,

    How can I map for Header/ Footer?

    Thanks

  44. Parm says:

    Hi,

    I am trying to create a template which will read a XML file of data. Therefore I have created a word document (.docx) which has plain text content controls on it. The XML file is the same format as the example Erika provided.

    However when l enter the code in the VBA editor linked to the word template the code cannot be viewed when l re-load the template even after l have saved the code in VBA editor. Does anyone why in Word 2007 the code does not seem to save with template.

    Regards

  45. Konus says:

    is this method affected by "Custom XML mapping is removed when you open a document in Word 2010" Article ID: 2445060 support.microsoft.com/…/en-us

  46. Claudia says:

    Hi Erika, thanks for your great article, i try to implement this solution in a web app but not working the error is "Memory stream is not expandable" i try to do many things with the class stream but without succesfull, you can help me. the file i used is rich in tables and images. i dont now if is this the problem, thanks a lot

  47. Angela says:

    Great post! I'm looking for a way to output a listing of all of the content controls set up in an MS Word document. Is there a way to do that in Word, or some code that I could use to do that quickly/easily. Thank you.