Open XML Format SDK 2.0: Getting Started Best Practices

The Open XML Format allows you to generate, manipulate, or pull data from Word 2007, Excel 2007, and PowerPoint 2007 files. If you are working with Microsoft-based solutions, you can generate, manipulate, and pull data from documents using the following tools and technologies:

Note: If you are working with Java or PHP/Linux/Unix/Solaris, you can also find plenty of resources here and here.

Anyway, today I want to focus on a couple best practices for getting started with coding for Open XML Format SDK 2.0.

For the last couple of months I have been working with some Open XML gurus: Zeyad Rajabi, Linda Lanqing Brownell, Eric White, Joel Krist, and Ken Getz. We are working together on a new set of Visual How-tos for the Open XML Format SDK 2.0 and a set of 50+ code snippets for Visual Studio 2008 to be released soon on MSDN.

Quite frequently I hear from most of them a couple best practices that I’d like to share with you today: If you plan to build a solution using the Open XML Format SDK 2.0…

  1. Always start with a template for your solution
  2. Use DocumentReflector to get started with coding

Always start with the template for your solution

The very first thing we recommend you to do before you start writing code for a document solution, is to create the template. Depending on what kind of solution you want to build, you can use Word 2007, Excel 2007, or PowerPoint 2007 to create a template.

Imagine that you own a software company and you sent all your developers and consultants to the great Office and SharePoint sessions at TechEd this year. The end of the fiscal year is coming soon and you need expense data from each conference attendee ASAP. Developers and consultants plan to spend this weekend on New Orleans, so you ask them to fill out a Web-based Expense Report form that sends data to your company’s accounting database. Your accountant loves Excel 2007 and she asked you to export all expenses per/employee to Excel spreadsheets so she can review details before approving. CSV export won’t do this time because your accountant wants it all pretty and styled.  You want to figure out a way of programmatically generate the Expense Report spreadsheets by pulling data from your accounting database. You need styles and formatting as well. After evaluating lots of tools and technologies, you decide that you want to build this solution using the super cool Open XML Format SDK 2.0. Now what?

The very first thing we recommend you to do before you start writing code is create the template using Excel 2007. You may ask your accountant to design the Expense Report template using Excel 2007, or you can always download tons of great templates from Office Online. Here’s my sample expense report template:

expenseRTemplate[1]

Open the template using Excel 2007 and add some dummy data. Don’t forget to save the template as Excel Workbook file (.xlsx). Now that you have a template to start with, you can start coding the solution.

Use DocumentReflector to get started with coding

The Open XML Format SDK 2.0 download ships with a set of tools that facilitate the process of building document solutions using the SDK. One of this tools is the DocumentReflector. This tool has the ability to open an existing Open XML document and dynamically generate C# source code that uses the Open XML SDK 2.0 typesafe classes to create the document parts. Exploring the source code created by DocumentReflector is a great way to become familiar with the Open XML SDK 2.0 classes and the Office document formats. Zeyad refers to this tool as the “Macro Recorder on vitamins” and I can tell you it rocks! The main idea is that once you have a solution template, you open it using the DocumentReflector and you get a set of autogenerated classes that you can later modify with Visual Studio 2008.

To illustrate using the DocumentReflector to create the base code for the Expense Report solution, follow the next steps:

  1. Open the DocumentReflector tool: The DocumentReflector tool is located in the Tools folder under the Open XML Format SDK 2.0 installation folder. Use Windows Explorer to navigate to the Tools folder and double-click the DocumentReflector.exe file to launch the DocumentReflector tool.
  2. Select the DocumentReflector's File | Open menu and in the Open dialog browse to the folder containing the ExpenseReport.xlsx workbook created previously, select the workbook, and click on the Open button. DocumentReflector will open the workbook and display the content of the document. Clicking on the top-level Package node will display the generated code that can be used to create the entire package.

docReflector[1]

At this point the code generated by DocumentReflector can be copied and pasted into the Visual Studio solution for reuse and learning purposes.

Using Visual Studio 2008 to create a Windows console application

For testing purposes, you can create a Console application in Visual Studio 2008. Here are some generic steps:

  1. Open Visual Studio 2008.

  2. Create a new C# Windows Console Application project.

  3. Add a reference to the Open XML API assembly.

  4. Add a reference to the WindowsBase assembly.

    The next steps involve copying code generated by the DocumentReflector tool to the Visual Studio project. The DocumentReflector tool provided with the Open XML Format SDK 2.0 allows users to open a valid Open XML document, choose an XML element, part, or the whole package, and have DocumentReflector generate a C# class that can create the selected document parts using the Open XML Format SDK 2.0 classes.

  5. Select the using statements from the top of the DocumentReflector code window then copy and paste them to the top of the Program.cs file in the Visual Studio project.

[C#]

using DocumentFormat.OpenXml.Packaging;
using ap = DocumentFormat.OpenXml.ExtendedProperties;
using vt = DocumentFormat.OpenXml.VariantTypes;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Spreadsheet;
using a = DocumentFormat.OpenXml.Drawing;
using op = DocumentFormat.OpenXml.CustomProperties;

6. Select the code for all of the methods inside of the GeneratedClass class from the DocumentReflector code window then copy and paste the code as methods of the Program class in the Program.cs file in the Visual Studio project.

7. Change the copied CreatePackage method to be private static so it can be called from the static Main method.

[C#]

private static void CreatePackage(string filePath) {
    using (SpreadsheetDocument package =
        SpreadsheetDocument.Create(filePath,
SpreadsheetDocumentType.Workbook)) {
AddParts(package);
    }
}

The AddParts method creates all the parts that you need in the Expense Report spreadsheet.

[C#]

private static void AddParts(SpreadsheetDocument parent) {
var extendedFilePropertiesPart1 = parent.AddNewPart<ExtendedFilePropertiesPart>("rId3");
           GenerateExtendedFilePropertiesPart1().Save(extendedFilePropertiesPart1);

           var coreFilePropertiesPart1 = parent.AddNewPart<CoreFilePropertiesPart>("rId2");
           GenerateCoreFilePropertiesPart1(coreFilePropertiesPart1);

           var workbookPart1 = parent.AddWorkbookPart();
           GenerateWorkbookPart1().Save(workbookPart1);

           var workbookStylesPart1 = workbookPart1.AddNewPart<WorkbookStylesPart>("rId3");
           GenerateWorkbookStylesPart1().Save(workbookStylesPart1);

           var themePart1 = workbookPart1.AddNewPart<ThemePart>("rId2");
           GenerateThemePart1().Save(themePart1);

           var worksheetPart1 = workbookPart1.AddNewPart<WorksheetPart>("rId1");
           GenerateWorksheetPart1().Save(worksheetPart1);

           var spreadsheetPrinterSettingsPart1 = worksheetPart1.AddNewPart<SpreadsheetPrinterSettingsPart>("rId1");
           GenerateSpreadsheetPrinterSettingsPart1(spreadsheetPrinterSettingsPart1);

           var calculationChainPart1 = workbookPart1.AddNewPart<CalculationChainPart>("rId5");
           GenerateCalculationChainPart1().Save(calculationChainPart1);

           var sharedStringTablePart1 = workbookPart1.AddNewPart<SharedStringTablePart>("rId4");
           GenerateSharedStringTablePart1().Save(sharedStringTablePart1);

           var customFilePropertiesPart1 = parent.AddNewPart<CustomFilePropertiesPart>("rId4");
           GenerateCustomFilePropertiesPart1().Save(customFilePropertiesPart1);

       }

The GenerateSharedStringTablePart1() method contains the code that you need to create values for the table on the Expense Report. You can modify this code to iterate through the accounting database. You can pull expense report data by employee and replace the dummy data you entered.

[C#]

private static SharedStringTable GenerateSharedStringTablePart1() {
var element =
new SharedStringTable(
                 new SharedStringItem(
                     new Text("Name")),
                 new SharedStringItem(
                     new Text("Department")),
                 new SharedStringItem(
                     new Text("Manager")),
                 new SharedStringItem(
                     new Text("Position")),
                 new SharedStringItem(
                     new Text("From")),
                 new SharedStringItem(
                     new Text("To")),
                 new SharedStringItem(
                     new Text("Date")),
                 new SharedStringItem(
                     new Text("Account")),
                 new SharedStringItem(
                     new Text("Description")),
                 new SharedStringItem(
                     new Text("Transport")),
                 new SharedStringItem(
                     new Text("Fuel")),
                 new SharedStringItem(
                     new Text("Meals")),
                 new SharedStringItem(
                     new Text("Phone")),
                 new SharedStringItem(
                     new Text("Subtotal")),
                 new SharedStringItem(
                     new Text("Advances")),
                 new SharedStringItem(
                     new Text("For Office Use Only")),
                 new SharedStringItem(
                     new Text("Misc.")),
                 new SharedStringItem(
                     new Text("Hotel")),
                 new SharedStringItem(
                     new Text("Entertainment")),
                 new SharedStringItem(
                     new Text("PURPOSE:")),
                 new SharedStringItem(
                     new Text("STATEMENT NUMBER:")),
                 new SharedStringItem(
                     new Text("PAY PERIOD:")),
                 new SharedStringItem(
                     new Text("EMPLOYEE INFORMATION:")),
                 new SharedStringItem(
                     new Text("Total")),
                 new SharedStringItem(
                     new Text("APPROVED:")),
                 new SharedStringItem(
                     new Text("NOTES: "){ Space = "preserve" }),
                 new SharedStringItem(
                     new Text(" SSN"){ Space = "preserve" }),
                 new SharedStringItem(
                     new Text(" Employee ID"){ Space = "preserve" }),
                 new SharedStringItem(
                     new Text("Expense report")),
                 new SharedStringItem(
                     new Text("TechEd 2009")),
                 new SharedStringItem(
                     new Text("Erika Ehrli Cabral")),
                 new SharedStringItem(
                     new Text("123456")),
                 new SharedStringItem(
                     new Text("12345678")),
                 new SharedStringItem(
                     new Text("Office Development")),
                 new SharedStringItem(
                     new Text("JPBagel")),
                 new SharedStringItem(
                     new Text("Delicious breakfast")),
                 new SharedStringItem(
                     new Text("Developer (in my dreams)"))
){ Count = (UInt32Value)38U, UniqueCount = (UInt32Value)37U };
return element;
     }

7. Modify the Main method and add a call to the CreatePackage method.

[C#]

static void Main(string[] args){
    // Create an Excel workbook named ExpenseReportTest.xlsx
// in the current folder. You can write some code here to iterate through
// your accounting database and generate one Spreadsheet per employee.

CreatePackage("ExpenseReportTest.xlsx");
}

8. Build and run the sample. Using the code shown above the sample application will create an Excel workbook named ExpenseReportTest.xlsx located in the Visual Studio project's Debug or Release build folder depending on the selected build mode.

Opening the workbook with Excel will display a workbook that looks just like the ExpenseReport.xlsx workbook created previously.

More resources

If you are looking for more end-to-end solutions using the Open XML Format SDK 2.0, you must see this articles:

Also, as mentioned before, Brian’s/Zeyad’s blog and Eric’s blog are always a great resource for Open XML code samples.

Coming soon the set of Open XML Format SDK 2.0 VHTs and code snippets for Visual Studio 2008!