The WordprocessingML Class: A refinement of the approach of using LINQ to XML to access Open XML

(July 10, 2008 - I've written a new blog post on a better way to accomplish this.) 

This post presents a refinement of the OpenXmlDocument class: a new class, WordprocessingML, that derives from OpenXmlDocument. The WordprocessingML class adds functionality that is specific to WordprocessingML documents, including:

· Constant strings that hold the DocumentRelationshipType, the StylesRelationshipType, and the CommentsRelationshipType.

· An XNamespace object that contains the main XML namespace for WordprocessingML documents.

· Initialized properties that find the main DocumentRelationship object, the StylesRelationship object, and the CommentsRelationship object. The Relationship class is declared in the code found in the link below, and represents a node in the object graph that contains an entire Open XML document.

· A DefaultStyle method that queries for the default style of the document.

· A Paragraphs method that enumerates all paragraphs in the document. The Paragraphs method returns IEnumerable<Paragraph>. The Paragraph class is a tuple class that contains the XElement node of the paragraph (for further querying if necessary), the style of the paragraph, the text of the paragraph, and a collection of comments for the paragraph. The comments are a collection because a paragraph can have more than one comment.

You can see the complete listing here: The WordprocessingML Class 
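To give a feel for the kind of LINQ to XML query that could sit behind methods like DefaultStyle and Paragraphs, here is a minimal sketch. It is not the code from the listing: the ParagraphSketch class and its two helpers are hypothetical names, and the XML is parsed from inline strings rather than read from a package, so the example runs without a .docx file. The namespace URI and the element and attribute names (w:p, w:pPr, w:pStyle, w:style, w:t) come from the WordprocessingML format.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ParagraphSketch
{
    // Main WordprocessingML namespace.
    const string Ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    public static readonly XNamespace W = Ns;

    // Hypothetical helper: the id of the style marked as the default paragraph style.
    public static string DefaultStyle(XDocument styles)
    {
        return (string)styles.Root
            .Elements(W + "style")
            .Where(s => (string)s.Attribute(W + "type") == "paragraph"
                     && (string)s.Attribute(W + "default") == "1")
            .Select(s => s.Attribute(W + "styleId"))
            .FirstOrDefault();
    }

    // Hypothetical helper: one (style, text) pair per w:p element.
    // A paragraph without an explicit w:pStyle falls back to defaultStyle.
    public static IEnumerable<(string Style, string Text)> Paragraphs(
        XDocument document, string defaultStyle)
    {
        return document.Descendants(W + "p").Select(p =>
        {
            string style = (string)p.Elements(W + "pPr")
                                    .Elements(W + "pStyle")
                                    .Attributes(W + "val")
                                    .FirstOrDefault() ?? defaultStyle;
            // Concatenate the text of every w:t element in the paragraph.
            string text = string.Concat(
                p.Descendants(W + "t").Select(t => (string)t));
            return (Style: style, Text: text);
        });
    }

    public static void Main()
    {
        // Stand-in for the styles part of a document.
        XDocument styles = XDocument.Parse(
            "<w:styles xmlns:w='" + Ns + "'>" +
            "<w:style w:type='paragraph' w:default='1' w:styleId='Normal'/>" +
            "<w:style w:type='paragraph' w:styleId='Heading1'/>" +
            "</w:styles>");

        // Stand-in for the main document part.
        XDocument document = XDocument.Parse(
            "<w:document xmlns:w='" + Ns + "'><w:body>" +
            "<w:p><w:r><w:t>This is a test.</w:t></w:r></w:p>" +
            "<w:p><w:pPr><w:pStyle w:val='Heading1'/></w:pPr>" +
            "<w:r><w:t>This is only a test.</w:t></w:r></w:p>" +
            "</w:body></w:document>");

        string defaultStyle = DefaultStyle(styles);
        foreach (var p in Paragraphs(document, defaultStyle))
            Console.WriteLine("Style: {0} Text: >{1}<",
                p.Style.PadRight(16), p.Text);
    }
}
```

The sketch returns a value tuple where the real class returns a Paragraph object, and it leaves out comments and the XElement node entirely.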

Following is a simple example that shows the use of the WordprocessingML class:

string filename = "Test.docx";

// Open the document; the using statement disposes it when the block ends.
using (WordprocessingML doc = new WordprocessingML(filename))
{
    foreach (var p in doc.Paragraphs())
    {
        // Pad the style name so that the paragraph text lines up.
        Console.WriteLine("Style: {0} Text: >{1}<",
            p.StyleName.PadRight(16), p.Text);

        // Comments may be null when the paragraph has no comments.
        if (p.Comments != null)
        {
            foreach (var c in p.Comments)
            {
                Console.WriteLine(" Comment:");
                Console.WriteLine(" Id: {0}", c.Id);
                Console.WriteLine(" Author: {0}", c.Author);
                Console.WriteLine(" Text: >{0}<", c.Text);
            }
        }
    }
}

When run on a small document, the code produces the following output:

Style: Normal           Text: >This is a test.<
 Comment:
 Id: 0
 Author: Eric White
 Text: >Hello world
test
comment<
Style: Heading1         Text: >This is only a test.<
Style: Normal           Text: >This is another paragraph.<
 Comment:
 Id: 1
 Author: Eric White
 Text: >another<
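
The comments in the output above live in a separate comments part, and a paragraph points at them by id. As a rough illustration of how such a lookup might work (not necessarily how the WordprocessingML class does it), the sketch below matches w:commentReference elements in a paragraph against w:comment elements by their w:id attribute. CommentSketch and CommentsFor are hypothetical names, and the XML is inlined so that the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CommentSketch
{
    const string Ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace W = Ns;

    // Hypothetical helper: the comments referenced from one paragraph.
    // A paragraph can reference several comments, so a sequence is returned.
    public static IEnumerable<(string Id, string Author, string Text)> CommentsFor(
        XElement paragraph, XDocument comments)
    {
        // Index the comments part by comment id for quick lookup.
        Dictionary<string, XElement> byId = comments.Root
            .Elements(W + "comment")
            .ToDictionary(c => (string)c.Attribute(W + "id"));

        return paragraph.Descendants(W + "commentReference")
            .Select(r => (string)r.Attribute(W + "id"))
            .Where(id => id != null && byId.ContainsKey(id))
            .Select(id => byId[id])
            .Select(c => (
                Id: (string)c.Attribute(W + "id"),
                Author: (string)c.Attribute(W + "author"),
                // A comment can span several paragraphs; join them with newlines.
                Text: string.Join(Environment.NewLine,
                    c.Elements(W + "p").Select(cp => string.Concat(
                        cp.Descendants(W + "t").Select(t => (string)t))))));
    }

    public static void Main()
    {
        // Stand-in for the comments part.
        XDocument comments = XDocument.Parse(
            "<w:comments xmlns:w='" + Ns + "'>" +
            "<w:comment w:id='0' w:author='Eric White'>" +
            "<w:p><w:r><w:t>Hello world</w:t></w:r></w:p>" +
            "<w:p><w:r><w:t>test</w:t></w:r></w:p>" +
            "</w:comment>" +
            "<w:comment w:id='1' w:author='Eric White'>" +
            "<w:p><w:r><w:t>another</w:t></w:r></w:p>" +
            "</w:comment>" +
            "</w:comments>");

        // Stand-in for a paragraph in the main document part.
        XElement paragraph = XElement.Parse(
            "<w:p xmlns:w='" + Ns + "'>" +
            "<w:r><w:t>This is a test.</w:t></w:r>" +
            "<w:r><w:commentReference w:id='0'/></w:r>" +
            "</w:p>");

        foreach (var c in CommentsFor(paragraph, comments))
        {
            Console.WriteLine(" Comment:");
            Console.WriteLine(" Id: {0}", c.Id);
            Console.WriteLine(" Author: {0}", c.Author);
            Console.WriteLine(" Text: >{0}<", c.Text);
        }
    }
}
```

Joining the paragraphs of a comment with newlines is one way to end up with multi-line comment text like that of the first comment in the output above.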