Excel on Azure


 

I amended my open-source CsvTools with an Excel reader. Once I read the excel worksheet into a datatable, I can use all the data table operators from the core CsvTools, including enumeration, Linq over the rows, analysis, mutation, and saving back out as a CSV. So this gives be a Linq-to-Excel on Azure experience, which ought to win a buzzword bingo contest!

The excel reader uses the OpenXml SDK, and so it can run on Azure.  This is useful because Excel as a COM-object doesn’t run on servers, and so I couldn’t upload excel files to my ASP.Net projects without really fighting the security settings. With OpenXml, it’s easy since you’re just reading XML.

Here’s a little azure MVC test page that demonstrates uploading a xlsx file and displaying the contents in azure:

(side note: deploying MVC to Azure is super easy, courtesy of this great tutorial).

I also need to give a shout-out for Nuget! The dependency management here was great. I have one Nuget package for the core CsvTools (which is just the CSV reader with no dependencies) , and another package CsvTools.Excel (which has a dependency on CsvTools and the OpenXml SDK).

The excel reader is an extension method exposed off “DataTable.New”, so it’s easily discoverable.

Here’s a sample excel sheet, foo.xlsx:

image

And then the code to read it from C#:

private static void TestExcel()
{
    var dt = DataTable.New.ReadExcel(@"c:\temp\foo.xlsx");
    var names = from row in dt.Rows where int.Parse(row["age"]) > 10 select row["Name"];
    foreach (var name in names)
    {
        Console.WriteLine(name);
    }            
}

This example just reads the first worksheet in the workbook, which is the common case for my usage scenarios where people are using excel as a CSV format.  It prints:

Ed
John

There  are also some other overloads to give the whole list of worksheets.

public static IList<MutableDataTable> ReadExcelAllSheets(this DataTableBuilder builder, string filename);
public static IList<MutableDataTable> ReadExcelAllSheets(this DataTableBuilder builder, Stream input);
 

The reader is intended for Excel workbooks that represent tabular data and is not hardened against weird or malformed input.

Anyway, I’m finding this useful for some experiments, and sharing in case somebody else finds it useful too. 

(Now I just need to throw in a WebAPI parameter binding for DataTables, use WebAPI’s query string support, and add some data table Azure helpers and I will be the buzzword bingo champion!)


Comments (6)

  1. Jhail says:

    Sorry for being off-topic, but I couldn't find another way to contact you.

    A few years ago you posted a topic on forcing communications between two .NET components to go through the COM interop layer by using ICustomQueryInterface:

    blogs.msdn.com/…/icustomqueryinterface-and-clr-v4.aspx

    Unfortunately this technique no longer seems to work.  Do you have any insight on how this can be adapted to work with the release version of the runtime?

  2. Nuget user says:

    Hi Mike

    Looks useful but can't install (on Webmatrix) via Web Pages Administration. An exception is generated:

    The element 'metadata' in namespace 'schemas.microsoft.com/…/nuspec.xsd& has invalid child element 'frameworkAssemblies' in namespace 'schemas.microsoft.com/…/nuspec.xsd&. List of possible elements expected: 'id, iconUrl, requireLicenseAcceptance, licenseUrl, description, title, language, tags, summary' in namespace 'schemas.microsoft.com/…/nuspec.xsd&.

  3. Pranav K says:

    @NuGet User

    Hi, the version of NuGet that Web Pages Administration uses is really old and this package uses features that require a newer version of it. The easiest way to get up to speed with the newest version of WebPages Administration (and NuGet) would be to get the latest version of WebMatrix. (http://www.microsoft.com/…/betafeatures.aspx)

    It's still in beta but it should work perfectly well. Plus it has better support for managing packages from inside the editor.

  4. Nuget user says:

    @Pranav

    I got round the issue by using the package manager in Visual Web Developer 2010 Express, rather that moving to the beta version of WebMatrix.

  5. Nuget user says:

    Hi Mike

    There's an issue with the CellToText function in the Excel extensions. theCell.CellValue can be null (if a cell is empty), but the return from the function in this case is theCell.CellValue.Text which causes a null reference exception.

  6. jmstall says:

    @ Nugetuser – thanks for reporting that. I submited a fix and updated the nuget package.