Using the Open XML SDK - VB

Open XML Packages

To follow this tutorial, you don't need to delve into all the details of working with packages.  This topic presents a small chunk of boilerplate code that opens a Word document and retrieves the main document part, the style part, and the comments part.  It then uses LINQ to XML to count the XML nodes in each of the three parts and prints the counts to the console.

The boilerplate code uses the Open XML SDK, a set of managed classes for .NET that provides more convenient access to Open XML documents.  Using the SDK, you can get the main part of the document and navigate to related parts more easily, which cuts down your code by quite a bit.  This blog post is a summary of the differences between the classes in System.IO.Packaging and the classes in the Open XML SDK.  This example uses the Open XML SDK v1.0.  Download it here.

Before attempting to compile, don't forget to:

· Download and install the Open XML SDK.

· Add a reference to the DocumentFormat.OpenXml assembly.

For the interested:

Just a few points about packages.  Various parts in the package are related.  You never rely on absolute paths to retrieve a part, even if you know the path.  Instead, you start from the main part, and use relationships to navigate to the other parts.  As mentioned, many of these parts are XML documents, including files that specify the relationships between parts.  You can access the parts and the relationship files using any conformant XML parser and a library that can open and read from ZIP files.  However, the classes in the Open XML SDK allow you to work with packages in a more convenient way.
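To make the relationship idea concrete, here is a minimal sketch that finds the main part and the style part by following relationships rather than by hard-coding paths.  It uses the System.IO.Packaging classes instead of the SDK, so it assumes a reference to the WindowsBase assembly.  The module name is made up for illustration, and "SampleDoc.docx" is the same placeholder file name used in the boilerplate.

```vb
Imports System
Imports System.IO
Imports System.IO.Packaging
Imports System.Linq

Module RelationshipSketch
    ' Relationship types defined by the Open XML specification.
    Const OfficeDocumentRelType As String = _
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
    Const StylesRelType As String = _
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"

    Sub Main()
        ' Assumption: SampleDoc.docx exists in the working directory.
        Using pkg As Package = Package.Open("SampleDoc.docx", FileMode.Open, FileAccess.Read)
            ' Start from the package-level relationship that points at the main part.
            Dim docRel As PackageRelationship = _
                pkg.GetRelationshipsByType(OfficeDocumentRelType).FirstOrDefault()
            If docRel Is Nothing Then
                Console.WriteLine("No main document relationship found.")
                Return
            End If

            ' Package-level relationship targets are resolved against the root "/".
            Dim mainUri As Uri = _
                PackUriHelper.ResolvePartUri(New Uri("/", UriKind.Relative), docRel.TargetUri)
            Dim mainPart As PackagePart = pkg.GetPart(mainUri)
            Console.WriteLine("Main part: {0}", mainPart.Uri)

            ' From the main part, follow its own relationship to the style part.
            Dim styleRel As PackageRelationship = _
                mainPart.GetRelationshipsByType(StylesRelType).FirstOrDefault()
            If styleRel IsNot Nothing Then
                Dim styleUri As Uri = _
                    PackUriHelper.ResolvePartUri(mainPart.Uri, styleRel.TargetUri)
                Console.WriteLine("Style part: {0}", pkg.GetPart(styleUri).Uri)
            End If
        End Using
    End Sub
End Module
```

With the SDK, the same navigation collapses to wordDoc.MainDocumentPart and mainPart.StyleDefinitionsPart, which is where the savings in code come from.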

Parts have a content type.  In the case of XML parts, the content type also indicates indirectly which XSD schema should be used to validate the part.
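If you want to see the content types for yourself, the following sketch lists every part in a package along with its content type.  Again it uses System.IO.Packaging (reference WindowsBase), and both the module name and the file name are placeholders.

```vb
Imports System
Imports System.IO
Imports System.IO.Packaging

Module ContentTypeSketch
    Sub Main()
        ' Assumption: SampleDoc.docx exists in the working directory.
        Using pkg As Package = Package.Open("SampleDoc.docx", FileMode.Open, FileAccess.Read)
            ' Print the URI and the content type of each part in the package.
            For Each part As PackagePart In pkg.GetParts()
                Console.WriteLine("{0}  ({1})", part.Uri, part.ContentType)
            Next
        End Using
    End Sub
End Module
```

For a word-processing document, the main document part normally reports the content type application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml.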

Here is the boilerplate code:

Imports System.IO
Imports System.Linq
Imports System.Xml
Imports System.Xml.Linq
Imports DocumentFormat.OpenXml.Packaging

Module Module1
    ' Loads the XML content of a part into a LINQ to XML document.
    Public Function LoadXDocument(ByVal part As OpenXmlPart) As XDocument
        Using streamReader As StreamReader = New StreamReader(part.GetStream())
            ' The variable is named "reader" so that it does not hide the
            ' XmlReader type (VB identifiers are case-insensitive).
            Using reader As XmlReader = XmlReader.Create(streamReader)
                Return XDocument.Load(reader)
            End Using
        End Using
    End Function

    Sub Main()
        Dim filename As String = "SampleDoc.docx"
        ' True opens the document for editing; pass False if you only read it.
        Using wordDoc As WordprocessingDocument = _
            WordprocessingDocument.Open(filename, True)
            Dim mainPart As MainDocumentPart = _
                wordDoc.MainDocumentPart
            Dim styleDefinitionPart As StyleDefinitionsPart = _
                mainPart.StyleDefinitionsPart
            Dim commentsPart As WordprocessingCommentsPart = _
                mainPart.WordprocessingCommentsPart

            Dim xDoc As XDocument = LoadXDocument(mainPart)
            Console.WriteLine("The main document part has {0} nodes.", _
                xDoc.DescendantNodes().Count())

            ' The style and comments parts are optional, so these properties
            ' return Nothing when the document does not contain them.
            If styleDefinitionPart IsNot Nothing Then
                Dim styleDoc As XDocument = LoadXDocument(styleDefinitionPart)
                Console.WriteLine("The style part has {0} nodes.", _
                    styleDoc.DescendantNodes().Count())
            End If
            If commentsPart IsNot Nothing Then
                Dim commentsDoc As XDocument = LoadXDocument(commentsPart)
                Console.WriteLine("The comments part has {0} nodes.", _
                    commentsDoc.DescendantNodes().Count())
            End If
        End Using
    End Sub
End Module