Document Assembly Solution for SpreadsheetML

In my last post I gave a high level overview of the architecture of the SDK. Now that we've gone over the basics, we're ready to talk about some real world solutions and end-to-end scenarios. Based on feedback I heard from you guys, I will start with scenarios and solutions around SpreadsheetML. Let me know if you have any specific scenarios or solutions you would like me to address in future posts.

In this post, I will show you how to populate a spreadsheet with data from a real SQL database and create a cool looking chart based on that data, all without using Excel client.

Scenario

Today's scenario theme is document assembly, which is all about constructing a file based on external data sources, like other files or databases.

Imagine a scenario where I'm a developer working for a fictional company called Adventure Works (based on the sample company on MSDN, which can be found here). Like other companies, my company stores all sales information in a database. Currently the Sales team uses Excel to manually connect to the database in order to pull out the latest sales information. I have been asked to create a report generation tool that takes data from a database into Excel and creates a report with a chart. The Sales team would like to run this solution on our report server in overnight batches so there is a strict requirement that we cannot use the Excel client.

Solution

Before I get into the details of my solution I want to call out a couple of things:

  1. I created my solution using Visual Studio 2008
  2. I am using the freely available Adventure Works database built for SQL Server 2005

Please note that you are not required to use either of these two technologies to build Open XML solutions. Feel free to use your favorite editor/compiler and database or data source.

This solution will leverage the Open XML SDK, so if you don't already have it, download the SDK here. Also make sure you install the Adventure Works database from the link above.

Okay, we're ready to get started. If you just want to jump straight into the code, feel free to download this solution here.

Step 1 – Setting the Project up

Once you've installed the Open XML SDK, your first step is to create a .Net project that has a reference to the SDK. In my case, I created a C# project called v2WorkbookFromDB and included a reference to DocumentFormat.OpenXML, which is the dll name of the Open XML SDK. Now we have access to the SDK.

We then need to add a connection to our database. I am a huge fan of LINQ, especially for making my life a lot easier for querying and connecting to SQL databases. If you're not already familiar with the technology, you should check out LINQ to SQL. Within Visual Studio, you can add automatically generated LINQ to SQL classes to my Adventure Works database in a matter of seconds. These classes allow you to connect and query databases without the need of ever writing SQL code. For those of you who have never tried this functionality before, let me walk you through the steps:

  1. In your Visual Studio project, right click the project and go to Add | New Item

  2. Select the LINQ to SQL Classes template. Choose a name and click add

  3. Now that you have added a template, let's connect to the database. Choose your database connection in the Server Explorer window and drag the SalesTerritory table to the designer window, which should end up looking like the following

That's it. No extra coding or anything. You now have access to the database and not only that, there are .Net classes automatically generated, which allow you to easily query the database through Objects. Sweet!

For the sake of this scenario, let's say I am only interested in last year's and this year's sales for each territory.

Step 2 – Creating a Template File

The scenario I listed above talks about creating a workbook from data within my database. I could either create my workbook from the ground up or start from an existing template. I definitely recommend the latter choice. It's almost always easier to create an Open XML solution by starting from an existing template.

In this scenario, I created a workbook template in Excel that has two sheets. The first sheet contains the header rows for the information I am interested in, which in this case is "Territory Name", "Sales Last Year", and "Sales This Year". The second sheet contains the overall structure of what I would like my chart to look like.

My template looks like the following:

With this template I am left with the following core tasks:

  1. Insert data into the first sheet of my workbook
  2. Modify the ranges in the my chart so it picks up the data

Step 3 – Insert Data into my Workbook

Before we can use the SDK, we need to add the appropriate namespaces to our C# file. In this case, we need to add my Packaging and Spreadhseet API components as follows:

using DocumentFormat.OpenXml.Packaging;

using DocumentFormat.OpenXml.Spreadsheet;

using DocumentFormat.OpenXml;

I know by taking a look at our template file that the first worksheet will need to contain all the necessary data. The first thing we need to do is find this worksheet within the workbook. The following code shows you how to open an existing workbook file and get access to the first worksheet. Once you have access to the worksheet we can access the SheetData Object, which represents all the cell data contained within the worksheet.

//Make a copy of the template file

File.Copy("template.xlsx", "generated.xlsx", true);

//Open up the copied template workbook

using (SpreadsheetDocument myWorkbook =

SpreadsheetDocument.Open("generated.xlsx", true))

{

//Access the main Workbook part, which contains all references

WorkbookPart workbookPart = myWorkbook.WorkbookPart;

//Grab the first worksheet

WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();

//SheetData will contain all the data

SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();

...

}

To add data to the worksheet we will need to append more rows to the SheetData object based on data from the database. The following code shows you how to query the Adventure Works database for all sales information in the SalesTerriroty table. For each row in the query we then need to create and add a new row to the worksheet based on the data from the query.

//Connect to database

AdventureWorksDataContext db = new AdventureWorksDataContext();

//My data starts at row 2

int index = 2;

//Select * from SalesTerritoties table

var territoryQuery =

from t in db.SalesTerritories

select t;

//For each row in my database add a row to my spreadsheeet

foreach (var item in territoryQuery)

{

string territoryName = item.Name;

decimal salesLastYear = Math.Round(item.SalesLastYear, 2);

decimal salesThisYear = Math.Round(item.SalesYTD, 2);

//Add a new row

Row contentRow = CreateContentRow(index, territoryName, salesLastYear, salesThisYear);

index++;

//Append new row to sheet data

sheetData.AppendChild(contentRow);

}

I know in our case we want to create a row that has three cells: the name of the territory, sales from last year, and sales for this year. The first cell is a text based cell, while the other two cells are value based cells. The following code creates a row with the three cells:

string[] headerColumns = new string[] { "A", "B", "C" };

Row CreateContentRow(int index, string territory, decimal salesLastYear, decimal salesThisYear)

{

//Create new row

Row r = new Row();

r.RowIndex = (UInt32)index;

//First cell is a text cell, so create it and append it

Cell firstCell = CreateTextCell(headerColumns[0], territory, index);

r.AppendChild(firstCell);

//Create cells that contain data

for (int i=1; i<headerColumns.Length; i++)

{

Cell c = new Cell();

c.CellReference = headerColumns[i] + index;

CellValue v = new CellValue();

if (i == 1)

v.Text = salesLastYear.ToString();

else

v.Text = salesThisYear.ToString();

c.AppendChild(v);

r.AppendChild(c);

}

return r;

}

Text based cells and value based cells are stored differently in SpreadsheetML. Take a look at this past post for more information on the differences. Based on these differences we have the following code to create text based cells:

Cell CreateTextCell(string header, string text, int index)

{

//Create new inline string cell

Cell c = new Cell();

c.DataType = CellValues.InlineString;

c.CellReference = header + index;

//Add text to text cell

InlineString inlineString = new InlineString();

Text t = new Text();

t.Text = text;

inlineString.AppendChild(t);

c.AppendChild(inlineString);

return c;

}

At this point we have a fully functional workbook with data populated from our database.

Step 4 – Adjusting the Chart Data

If you take a look at the chart data within our template you will notice that the ranges for the data are fixed and reference data up to the second row in the spreadsheet as follows:

<c:f>'My Data'!$B$2</c:f>

In order to fix our chart data, we need to extend this range to include all of our new rows. The following code accomplishes this task:

void FixChartData(WorkbookPart workbookPart, int totalCount)

{

//Grab the appropriate chart part from template file

ChartPart chartPart = workbookPart.ChartsheetParts.First().DrawingsPart.ChartParts.First();

//Change the ranges to accomodate for newly inserted data

foreach (Charts.Formula formula in chartPart.ChartSpace.Descendants<Charts.Formula>())

{

if (formula.Text.Contains("$2"))

{

string s = formula.Text.Split('$')[1];

formula.Text += ":$" + s + "$" + totalCount;

}

}

chartPart.ChartSpace.Save();

}

End Result

The end result is we have just created a new workbook that contains all the data from our database as well as a cool looking chart that represents the data. The generated file looks like:

Next Time

In my next few posts I am going to walk through other solutions to some key scenarios. Suggestions are always welcome.

Zeyad Rajabi