Mail Merging with Word, LINQ and XML Literals in VB

I've mentioned before that you can use XML literals in VB 9 to do text merging so I figured it should be pretty easy to do some mail merging in Word the same way. Word 2007 (and Excel, etc) now uses a standard XML format called Office Open XML to describe documents so I thought this would be the perfect opportunity to try out how to do this with LINQ and XML literals in Visual Basic.

What I wanted to do is take all the Customers in the Northwind database who had some orders shipping today and send them thank you letters. Of course, I had to create some new orders first since every order in Northwind was placed before the turn of the century! Anyway, to get started I created a new Word 2007 document with the letter text and some placeholder field names to indicate where I want the data:

Then I just saved the Doc1.docx file to my project folder. If we rename this file with a .zip extension we can see that it's just a zip file with a bunch of XML documents inside it. This is how the Office Open XML works. Basically it's a zip package with a bunch of related XML files inside. You can read more about Office Open XML in Microsoft Office here. If we drill down through the zip file we'll see that the text we just typed is located in the \word\document.xml file.

 

If I open this document.xml file in a text editor like notepad, copy all the text into the clipboard, and then paste it into a Visual Basic program it will be inferred as an XDocument object and we can see our text and placeholders in the XML (I collapsed a couple sections at the top and bottom for readability).

So what we want to do in our mail merge program is write a query that creates this XDocument for all our customers that have shipping orders and then pops it back into a zip file that Word 2007 can open. There's a namespace called System.IO.Packaging that will help us extract and replace the document.xml with the XDocument we create. But first, let's create the query by connecting our Server Explorer to our Northwind database and then adding a new item to our project and selecting the "LINQ to SQL Classes" template. Then just drag-drop the Customers and Orders tables onto the designer. (I show how to do this in this video).

What we want to end up with is a collection of XDocuments above, with our data merged into it. Instead of just creating a collection of XDocuments I also want to capture the CustomerID so that we can use it to create unique file names for our Word documents. So let's create a simple class called Letter that has two properties, CustomerID As String and Document As XDocument.

 Public Class Letter

    Private m_doc As XDocument
    Public Property Document() As XDocument
        Get
            Return m_doc
        End Get
        Set(ByVal value As XDocument)
            m_doc = value
        End Set
    End Property


    Private m_ID As String
    Public Property CustomerID() As String
        Get
            Return m_ID
        End Get
        Set(ByVal value As String)
            m_ID = value
        End Set
    End Property
 End Class

Now we can write our query to create a collection of these classes.

 Dim db As New NorthwindDataContext
 Dim letters = _
        From Order In db.Orders _
        Where Order.OrderDate IsNot Nothing AndAlso _
              Order.ShippedDate IsNot Nothing AndAlso _
              Order.ShippedDate.Value.Date = Date.Today _
          Let OrdDate = Order.OrderDate.Value.ToShortDateString _
          Let ShipDate = Order.ShippedDate.Value.ToShortDateString _
        Select New Letter With { _
               .CustomerID = Order.Customer.CustomerID, _
               .Document = <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
                            rest of doc omitted for clarity ...}

We're selecting the CustomerID and the XML literal with our embedded expressions in place of the placeholder fields I created in the document. I omitted the rest of the XML literal in the query above for clarity. All we need to do is run through the literal and replace all the placeholder fields with embedded expressions from our query. For instance, here's a snapshot of the body of our letter :

 ...
<w:r w:rsidR="007A5236">
    <w:t xml:space="preserve">Dear </w:t>
</w:r>
<w:proofErr w:type="spellStart"/>
<w:r>
    <w:t><%= Order.Customer.ContactName %></w:t>
</w:r>
<w:proofErr w:type="spellEnd"/>
<w:r w:rsidR="00AA2EC6">
    <w:t>,</w:t>
</w:r>
</w:p>
<w:p w:rsidR="00AA2EC6" w:rsidRDefault="00AA2EC6" w:rsidP="00AA2EC6">
<w:r>
    <w:t xml:space="preserve">We'd like to inform you that the order you placed on </w:t>
</w:r>
<w:proofErr w:type="spellStart"/>
<w:r w:rsidR="00112228">
    <w:t><%= OrdDate %></w:t>
</w:r>
<w:proofErr w:type="spellEnd"/>
<w:r w:rsidR="00806521">
    <w:t xml:space="preserve"> has shipped on </w:t>
</w:r>
<w:proofErr w:type="spellStart"/>
<w:r w:rsidR="00112228">
    <w:t><%= ShipDate %></w:t>
</w:r>
...

Now that we have a collection of our Letter objects, we need to create the Word document by making a copy of our original document (which remember is just a .zip file) and replacing the \word\document.xml file in that package. You can also create packages from scratch but then you have to create a relationship file that describes the relation of all the files in the package and it can get tricky. It's easier to just make a copy of the original Word docx file in our case and just replace the part we modified. The key to writing the document back properly is getting the content type right when we call CreatePart on the Package object. The Package object lives in the System.IO.Packaging namespace.

 Dim uri As New Uri("/word/document.xml", UriKind.Relative)
Dim contentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"
Dim sourceFile = CurDir() & "\Doc1.docx"
Dim letterDir = CurDir() & "\Letters\"
My.Computer.FileSystem.CreateDirectory(letterDir)

For Each letter In letters
    Dim customerFile = letterDir & letter.CustomerID & ".docx"
    Console.WriteLine(String.Format("Creating letter {0}", letter.CustomerID))
    My.Computer.FileSystem.CopyFile(sourceFile, customerFile, True)

    Using p As Package = Package.Open(customerFile)
        'Delete the current document.xml file
        p.DeletePart(uri)


        'Replace that part with our XDocument
        Dim replace As PackagePart = p.CreatePart(uri, contentType)
        Using sw As New StreamWriter(replace.GetStream())
            letter.Document.Save(sw)
            sw.Close()
         End Using

        p.Close()
    End Using
Next

This creates a directory of our letters all nicely mail merged with the customer and order data that met our criteria in the Where clause of the letters query.

 

If we open one of the documents we'll see the data properly merged into the letter.

 

I've attached the complete example to this post. Have fun exploring the Office Open XML format because there's just so many things you can do with it especially combined with LINQ, XML literals and Visual Basic!

Enjoy!

WordMailMerge.zip