Modifying an Open XML Document in a SharePoint Document Library

On a fairly regular basis, I need to write an example that retrieves an Open XML document from a SharePoint document library, modifies the document, and saves it back to the document library. The correct approach is to use a CAML query to retrieve the document. This post presents the minimum amount of code needed to do this with the SharePoint object model.

This code requires the Open XML SDK, so you will need to download and install it. You need to add a reference to the DocumentFormat.OpenXml assembly. In addition, you need to add references to the WindowsBase assembly and the Microsoft.SharePoint assembly.

See the Open XML Developer Center for lots of information on building applications that work with Open XML documents.

When building console applications for SharePoint 2010, you must target the .NET 3.5 framework. In addition, you must target ‘Any CPU’, not x86. The post "Developing with SharePoint 2010 Word Automation Services" contains explicit instructions for targeting .NET 3.5 and Any CPU.
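These two settings are normally chosen in the Visual Studio project properties, but they end up as ordinary MSBuild properties in the project file. As a rough sketch, assuming a standard .csproj, the relevant fragment would look something like this (your project file will contain many other properties alongside these):

```xml
<!-- Fragment of a .csproj; shown only to illustrate the two settings. -->
<PropertyGroup>
  <TargetFrameworkVersion>v3.5</TargetFrameworkVersion>
  <PlatformTarget>AnyCPU</PlatformTarget>
</PropertyGroup>
```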

Here is the smallest C# console application to do this:

using System;
using System.IO;
using System.Linq;
using Microsoft.SharePoint;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class Program
{
    static void Main(string[] args)
    {
        string siteUrl = "https://localhost";
        using (SPSite spSite = new SPSite(siteUrl))
        {
            // Find the document by file name with a CAML query.
            Console.WriteLine("Querying for Test.docx");
            SPList list = spSite.RootWeb.Lists["Shared Documents"];
            SPQuery query = new SPQuery();
            query.ViewFields = @"<FieldRef Name='FileLeafRef' />";
            query.Query =
                @"<Where>
                    <Eq>
                      <FieldRef Name='FileLeafRef' />
                      <Value Type='Text'>Test.docx</Value>
                    </Eq>
                  </Where>";
            SPListItemCollection collection = list.GetItems(query);
            if (collection.Count != 1)
            {
                Console.WriteLine("Test.docx not found");
                // Return rather than Environment.Exit so that spSite is disposed.
                return;
            }

            // Copy the file into a resizable in-memory stream.
            Console.WriteLine("Opening");
            SPFile file = collection[0].File;
            byte[] byteArray = file.OpenBinary();
            using (MemoryStream memStr = new MemoryStream())
            {
                memStr.Write(byteArray, 0, byteArray.Length);

                // Open the document for editing and insert a new paragraph
                // before the first existing paragraph.
                using (WordprocessingDocument wordDoc =
                    WordprocessingDocument.Open(memStr, true))
                {
                    Document document = wordDoc.MainDocumentPart.Document;
                    Paragraph firstParagraph = document.Body.Elements<Paragraph>()
                        .FirstOrDefault();
                    if (firstParagraph != null)
                    {
                        Paragraph testParagraph = new Paragraph(
                            new Run(
                                new Text("Test")));
                        firstParagraph.Parent.InsertBefore(testParagraph,
                            firstParagraph);
                    }
                }

                // Write the modified document back, overwriting the original.
                Console.WriteLine("Saving");
                string linkFileName = file.Item["LinkFilename"] as string;
                file.ParentFolder.Files.Add(linkFileName, memStr, true);
            }
        }
    }
}
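
One detail in the listing above is easy to miss: the bytes are written into a MemoryStream created with the parameterless constructor, rather than wrapping the array with new MemoryStream(byteArray). A stream created over an existing byte array has a fixed capacity, so it cannot grow if the modified document ends up larger than the original. The following stand-alone sketch illustrates that behavior. The class and method names (StreamGrowthDemo, CanAppend) are made up for this illustration, and it does not touch SharePoint or the Open XML SDK.

```csharp
using System;
using System.IO;

// Hypothetical demo class; not part of the example above.
static class StreamGrowthDemo
{
    // Returns true if one more byte can be appended at the end of the stream.
    public static bool CanAppend(MemoryStream stream)
    {
        stream.Seek(0, SeekOrigin.End);
        try
        {
            stream.WriteByte(0);
            return true;
        }
        catch (NotSupportedException)
        {
            // Thrown when the stream is not expandable.
            return false;
        }
    }

    static void Main()
    {
        byte[] original = { 1, 2, 3, 4 };

        // Wrapping the array gives a stream whose capacity is fixed.
        using (MemoryStream wrapped = new MemoryStream(original))
        {
            Console.WriteLine("Wrapped array can grow: " + CanAppend(wrapped));
        }

        // Same pattern as the listing above: empty stream, then copy the bytes in.
        using (MemoryStream copied = new MemoryStream())
        {
            copied.Write(original, 0, original.Length);
            Console.WriteLine("Copied stream can grow: " + CanAppend(copied));
        }
    }
}
```

Since adding a paragraph makes the package larger, the copy-into-an-empty-stream pattern is the safer choice whenever the document is opened for editing.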

Here is the same example in VB. One thing that is cool about VB is that you can use XML literals to write the CAML query, and then call ToString() to set the Query property of the SPQuery object.

Imports System.IO
Imports Microsoft.SharePoint
Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Wordprocessing

Module Module1
    Sub Main()
        Dim siteUrl As String = "https://localhost"
        Using spSite As SPSite = New SPSite(siteUrl)
            ' Find the document by file name with a CAML query.
            Console.WriteLine("Querying for Test.docx")
            Dim list As SPList = spSite.RootWeb.Lists("Shared Documents")
            Dim query As SPQuery = New SPQuery()
            query.ViewFields = "<FieldRef Name='FileLeafRef' />"
            query.Query = ( _
                <Where>
                    <Eq>
                        <FieldRef Name='FileLeafRef'/>
                        <Value Type='Text'>Test.docx</Value>
                    </Eq>
                </Where>).ToString()
            Dim collection As SPListItemCollection = list.GetItems(query)
            If collection.Count <> 1 Then
                Console.WriteLine("Test.docx not found")
                ' Return rather than Environment.Exit so that spSite is disposed.
                Return
            End If

            ' Copy the file into a resizable in-memory stream.
            Console.WriteLine("Opening")
            Dim file As SPFile = collection(0).File
            Dim byteArray As Byte() = file.OpenBinary()
            Using memStr As MemoryStream = New MemoryStream()
                memStr.Write(byteArray, 0, byteArray.Length)

                ' Open the document for editing and insert a new paragraph
                ' before the first existing paragraph.
                Using wordDoc As WordprocessingDocument = _
                    WordprocessingDocument.Open(memStr, True)
                    Dim document As Document = wordDoc.MainDocumentPart.Document
                    Dim firstParagraph As Paragraph = _
                        document.Body.Elements(Of Paragraph)().FirstOrDefault()
                    If firstParagraph IsNot Nothing Then
                        Dim testParagraph As Paragraph = New Paragraph( _
                            New Run( _
                                New Text("Test")))
                        firstParagraph.Parent.InsertBefore(testParagraph, _
                            firstParagraph)
                    End If
                End Using

                ' Write the modified document back, overwriting the original.
                Console.WriteLine("Saving")
                Dim linkFileName As String = _
                    TryCast(file.Item("LinkFilename"), String)
                file.ParentFolder.Files.Add(linkFileName, memStr, True)
            End Using
        End Using
    End Sub
End Module