Using LINQ to Query Excel Tables


Excel has a very cool feature where you can declare that a range of cells is a table.  It is a feature that allows you to use Excel very much like a database.  You can add new rows as necessary, sort the table by columns, do some simple filtering, calculate the sum of columns, and more.  Each table has a unique table name, and each column has a column name.  Because these tables are stored in Open XML documents, we can implement some simple extension methods and some classes so that we can query these tables using LINQ in a manner that is similar to querying a SQL database.  This post presents a bit of code to do this.  The code and sample documents are attached to this post.

This blog is inactive.
New blog: EricWhite.com/blog

Blog TOC(Updated July 21, 2010 – Fixed a bug where the code would return the wrong value for cells in the table if the columns had been moved.)

Note: this code is presented as an example – a proof-of-concept.  This code could be further optimized, so that it performs better (although it performs quite well as is).  And it may be interesting in the future to modify the code to use a strongly-typed approach – as the code is currently implemented, if you misspell a table or column name, the code throws an exception.  However, this code is useful as is for doing ad-hoc queries of Excel tables.  I certainly will be using it! :-)

This code uses the Open XML SDK, either V1, or the CTP of V2.  You can download V1 of the SDK here.  You can download CTP1 of V2 of the SDK here.

Thanks to Brian Jones who suggested this project.

Following is a screen clipping of an Excel spreadsheet that contains a table:

You can see the four columns of this table: Item, Qty, Price, and Extension.  In addition, in the Design tab of the ribbon, in the far left box, you can see that this table has a table name of “Inventory”.  Using the code presented in this post, you can query this table as follows:

var query =
    from i in spreadsheet.Table(“Inventory”).TableRows()
    where (int)i[“Qty”] > 2
    select i;

foreach (var r in query)
{
    Console.WriteLine(r[“Item”]);
    Console.WriteLine(r[“Qty”]);
    Console.WriteLine(r[“Price”]);
    Console.WriteLine(r[“Extension”]);
    Console.WriteLine();
}

When you run this code, it produces:

Book
44
2
88

Phone
4
10
40

As you can see from the above code, to access a particular column from a table row, you can use a default indexed property, passing the name of the column:

Console.WriteLine(r[“Item”]);
Console.WriteLine(r[“Qty”]);
Console.WriteLine(r[“Price”]);
Console.WriteLine(r[“Extension”]);

This allows us to write code that is easy to read.

The table class (returned by the Table method) has a TableColumns method that iterates the columns in the table:

// list all of the columns in the Inventory table
Console.WriteLine(“Table: Inventory”);
foreach (var c in spreadsheet.Table(“Inventory”).TableColumns())
    Console.WriteLine(”  {0}”, c.Name);

When you run this code, you see:

Table: Inventory
  Item
  Qty
  Price
  Extension

The LtxOpenXml Namespace

Some time ago, I wrote some code that enabled querying Open XML spreadsheets using LINQ to XML, presented in the blog post ‘Open XML SDK and LINQ to XML’.  I’ve added the code to query tables to the code presented in that post.  The extension methods that enable querying tables make use of that code.  The enhanced LtxOpenXml namespace now contains code for:

  • Querying word processing documents
  • Querying spreadsheets
  • Querying tables contained in spreadsheets

The code for querying word processing documents and spreadsheets is unmodified.  Refer to the above mentioned blog post for details on using those extension methods.

The code that enables querying of spreadsheet tables is, of course, written in the pure functional style.  No state is maintained, and all methods to query the document are lazy.

If you have questions about how to write functional code (like the code that implements the extension methods and classes associated with this post), go through this Functional Programming Tutorial.

I’ve provided a summary of the types and extension methods included in the LtxOpenXml namespace at the end of this post.

Use of Data Types

Here’s another example of a table that contains a few more columns with more data types:

Each row returned by the TableRows method is a collection of TableCell objects.  I’ve defined explicit conversions between TableCell and some of the most common .NET types, so that you can simply cast a TableCell to your desired type.  Here’s a query to list all vehicles in the table:

// list all vehicles
var q = from c in spreadsheet.Table(“Vehicles”).TableRows()
        select new VehicleRecord()
        {
            Vehicle = (string)c[“Vehicle”],
            Color = (string)c[“Color”],
            Year = (int)c[“Year”],
            HorsePower = (int)c[“HorsePower”],
            Cost = (decimal)c[“Cost”],
            AcquisitionDate = (DateTime)c[“AcquisitionDate”],
            ExecutiveUseOnly = (bool)c[“ExecutiveUseOnly”]
        };

Console.WriteLine(“List of all vehicles”);
PrintVehicles(q);
Console.WriteLine();

I’ve written a PrintVehicles method:

public static void PrintVehicles(IEnumerable<VehicleRecord> list)
{
    int[] tabs = new[] { 12, 10, 6, 6, 10, 14, 10 };
    foreach (var z in list)
        Console.WriteLine(“{0}{1}{2}{3}{4}{5}{6}”,
            z.Vehicle.PadRight(tabs[0]),
            z.Color.PadRight(tabs[1]),
            z.Year.ToString().PadRight(tabs[2]),
            z.HorsePower.ToString().PadRight(tabs[3]),
            z.Cost.ToString().PadRight(tabs[4]),
            ((DateTime)z.AcquisitionDate).ToShortDateString()
                .PadRight(tabs[5]),
            ((bool)z.ExecutiveUseOnly).ToString()
                .PadRight(tabs[6]));
}

When you run the above query, you see:

List of all vehicles
Pickup      White     2002  165   23000     2/22/2002     False
Pickup      Red       2004  185   32000     10/21/2004    False
Sports Car  Red       2003  165   23000     1/1/2004      True
Sedan       Blue      2005  200   21000     2/25/2005     False
Limo        Black     2008  440   72000     4/1/2008      True

You can query for all executive vehicles, like this:

// list all executive vehicles
q = from c in spreadsheet.Table(“Vehicles”).TableRows()
        where (bool)c[“ExecutiveUseOnly”] == true
        select new VehicleRecord()
        {
            Vehicle = (string)c[“Vehicle”],
            Color = (string)c[“Color”],
            Year = (int)c[“Year”],
            HorsePower = (int)c[“HorsePower”],
            Cost = (decimal)c[“Cost”],
            AcquisitionDate = (DateTime)c[“AcquisitionDate”],
            ExecutiveUseOnly = (bool)c[“ExecutiveUseOnly”]
        };

You can write queries that select on data types such as DateTime:

// list all vehicles acquired after 2004
q = from c in spreadsheet.Table(“Vehicles”).TableRows()
    where (DateTime)c[“AcquisitionDate”] >= new DateTime(2004, 1, 1)
    select new VehicleRecord()
    {
        Vehicle = (string)c[“Vehicle”],
        Color = (string)c[“Color”],
        Year = (int)c[“Year”],
        HorsePower = (int)c[“HorsePower”],
        Cost = (decimal)c[“Cost”],
        AcquisitionDate = (DateTime)c[“AcquisitionDate”],
        ExecutiveUseOnly = (bool)c[“ExecutiveUseOnly”]
    };

And of course, you can use all of the grouping, ordering, and filtering capabilities of LINQ queries:

// vehicles grouped by user
var groups = from v in spreadsheet.Table(“Vehicles”).TableRows()
             group v by v[“ExecutiveUseOnly”];

foreach (var g in groups)
{
    Console.WriteLine(“Executive Use: {0}”, (bool)g.Key);
    foreach (var v in g)
        Console.WriteLine(”  Vehicle:{0}  Year:{1}”,
            v[“Vehicle”], v[“Year”]);
    Console.WriteLine();
}

I’ve imported the Customers and Orders from the Northwind database into a spreadsheet, where the Customers table is in one sheet, and the Orders table is in another sheet within the worksheet.  Here is the Customers table:

And here is the Orders table:

We can now write a query that joins the customers and orders tables:

using (SpreadsheetDocument spreadsheet =
    SpreadsheetDocument.Open(filename, false))
{
    // list all of the columns in the Customer table
    Console.WriteLine(“Table: Customer”);
    foreach (var c in spreadsheet.Table(“Customer”).TableColumns())
        Console.WriteLine(”  {0}”, c.Name);
    Console.WriteLine();

    // list all of the columns in the Order table
    Console.WriteLine(“Table: Order”);
    foreach (var o in spreadsheet.Table(“Order”).TableColumns())
        Console.WriteLine(”  {0}”, o.Name);
    Console.WriteLine();

    // query for all customers with city == London,
    // then select all orders for that customer
    var q = from c in spreadsheet.Table(“Customer”).TableRows()
            where (string)c[“City”] == “London”
            select new
            {
                CustomerID = c[“CustomerID”],
                CompanyName = c[“CompanyName”],
                ContactName = c[“ContactName”],
                Orders = from o in spreadsheet.Table(“Order”).TableRows()
                         where (string)o[“CustomerID”] ==
                               (string)c[“CustomerID”]
                         select new
                             {
                                 CustomerID = o[“CustomerID”],
                                 OrderID = o[“OrderID”]
                             }
            };

    // print the results of the query
    int[] tabs = new[] { 20, 25, 30 };
    Console.WriteLine(“{0}{1}{2}”,
        “CustomerID”.PadRight(tabs[0]),
        “CompanyName”.PadRight(tabs[1]),
        “ContactName”.PadRight(tabs[2]));
    Console.WriteLine(“{0} {1} {2} “, new string(‘-‘, tabs[0] – 1),
        new string(‘-‘, tabs[1] – 1), new string(‘-‘, tabs[2] – 1));
    foreach (var v in q)
    {
        Console.WriteLine(“{0}{1}{2}”,
            v.CustomerID.Value.PadRight(tabs[0]),
            v.CompanyName.Value.PadRight(tabs[1]),
            v.ContactName.Value.PadRight(tabs[2]));
        foreach (var v2 in v.Orders)
            Console.WriteLine(”  CustomerID:{0}  OrderID:{1}”,
                v2.CustomerID, v2.OrderID);
        Console.WriteLine();
    }
}

This code produces the following output:

Table: Customer
  CustomerID
  CompanyName
  ContactName
  ContactTitle
  Address
  City
  Region
  PostalCode
  Country
  Phone
  Fax

Table: Order
  OrderID
  CustomerID
  EmployeeID
  OrderDate
  RequiredDate
  ShipVia
  Freight
  ShipName
  ShipAddress
  ShipCity
  ShipRegion
  ShipPostalCode
  ShipCountry

CustomerID          CompanyName              ContactName
——————- ———————— —————————–
AROUT               Around the Horn          Thomas Hardy
  CustomerID:AROUT  OrderID:10355
  CustomerID:AROUT  OrderID:10383
  CustomerID:AROUT  OrderID:10453
  CustomerID:AROUT  OrderID:10558
  CustomerID:AROUT  OrderID:10707
  CustomerID:AROUT  OrderID:10741
  CustomerID:AROUT  OrderID:10743
  CustomerID:AROUT  OrderID:10768
  CustomerID:AROUT  OrderID:10793
  CustomerID:AROUT  OrderID:10864
  CustomerID:AROUT  OrderID:10920
  CustomerID:AROUT  OrderID:10953
  CustomerID:AROUT  OrderID:11016

BSBEV               B’s Beverages            Victoria Ashworth
  CustomerID:BSBEV  OrderID:10289
  CustomerID:BSBEV  OrderID:10471
  CustomerID:BSBEV  OrderID:10484
  CustomerID:BSBEV  OrderID:10538
  CustomerID:BSBEV  OrderID:10539
  CustomerID:BSBEV  OrderID:10578
  CustomerID:BSBEV  OrderID:10599
  CustomerID:BSBEV  OrderID:10943
  CustomerID:BSBEV  OrderID:10947
  CustomerID:BSBEV  OrderID:11023

CONSH               Consolidated Holdings    Elizabeth Brown
  CustomerID:CONSH  OrderID:10435
  CustomerID:CONSH  OrderID:10462
  CustomerID:CONSH  OrderID:10848

EASTC               Eastern Connection       Ann Devon
  CustomerID:EASTC  OrderID:10364
  CustomerID:EASTC  OrderID:10400
  CustomerID:EASTC  OrderID:10532
  CustomerID:EASTC  OrderID:10726
  CustomerID:EASTC  OrderID:10987
  CustomerID:EASTC  OrderID:11024
  CustomerID:EASTC  OrderID:11047
  CustomerID:EASTC  OrderID:11056

NORTS               North/South              Simon Crowther
  CustomerID:NORTS  OrderID:10517
  CustomerID:NORTS  OrderID:10752
  CustomerID:NORTS  OrderID:11057

SEVES               Seven Seas Imports       Hari Kumar
  CustomerID:SEVES  OrderID:10359
  CustomerID:SEVES  OrderID:10377
  CustomerID:SEVES  OrderID:10388
  CustomerID:SEVES  OrderID:10472
  CustomerID:SEVES  OrderID:10523
  CustomerID:SEVES  OrderID:10547
  CustomerID:SEVES  OrderID:10800
  CustomerID:SEVES  OrderID:10804
  CustomerID:SEVES  OrderID:10869

Summary of the LtxOpenXml Namespace

This section summarizes the LtxOpenXml extension methods and types that make it easy to work with Open XML SpreadsheetML tables.

For details on the extension methods and types for word processing documents and spreadsheets (other than Tables within spreadsheets), see the post, Open XML SDK and LINQ to XML.

Tables Extension Method

This method returns a collection of all tables in the spreadsheet.  Its signature:

public static IEnumerable<Table> Tables(this SpreadsheetDocument spreadsheet)

Table Extension Method

This method returns the Table object with the specified table name.  Its signature:

public static Table Table(this SpreadsheetDocument spreadsheet,
    string tableName)

Table Class

This method represents an Excel Table.  Its definition:

public class Table
{
    public int Id { get; set; }
    public string TableName { get; set; }
    public string DisplayName { get; set; }
    public string Ref { get; set; }
    public int? HeaderRowCount { get; set; }
    public int? TotalsRowCount { get; set; }
    public string TableType { get; set; }
    public TableDefinitionPart TableDefinitionPart { get; set; }
    public WorksheetPart Parent { get; set; }
    public Table(WorksheetPart parent) { Parent = parent; }
    public IEnumerable<TableColumn> TableColumns()
    {
        …
    }
    public IEnumerable<TableRow> TableRows()
    {
        …
    }
}

This class contains a number of properties about the table.  In addition, it contains two methods, TableColumns, which returns a collection of TableColumn objects (the columns of the table), and TableRows, which returns a collection of TableRow objects (the rows of the table).

TableColumn Class

This class represents a column of a table.  Its definition:

public class TableColumn
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int? FormatId { get; set; }  // dataDxfId
    public int? QueryTableFieldId { get; set; }
    public string UniqueName { get; set; }
    public Table Parent { get; set; }
    public TableColumn(Table parent) { Parent = parent; }
}

The most important property of this class is the Name property.

TableRow Class

This class represents a row of a table.  Its definition:

public class TableRow
{
    public Row Row { get; set; }
    public Table Parent { get; set; }
    public TableRow(Table parent) { Parent = parent; }
    public TableCell this[string columnName]
    {
        get
        {
            …
        }
    }
}

The most important feature of this class is the default indexed property that takes a column name and returns a TableCell object.  This is what allows us to write code like this:

Console.WriteLine(r[“Item”]);
Console.WriteLine(r[“Qty”]);
Console.WriteLine(r[“Price”]);
Console.WriteLine(r[“Extension”]);

TableCell Class

This class represents a cell of a row of a table.  It implements IEquatable<T> so that you can do a value compare of two cells.  It also implements a number of explicit conversions to other data types so that it’s easy to deal with columns of various types.  Its definition:

public class TableCell : IEquatable<TableCell>
{
    public string Value { get; set; }
    public TableCell(string v)
    {
        Value = v;
    }
    public override string ToString()
    {
        return Value;
    }
    public override bool Equals(object obj)
    {
        return this.Value == ((TableCell)obj).Value;
    }
    bool IEquatable<TableCell>.Equals(TableCell other)
    {
        return this.Value == other.Value;
    }
    public override int GetHashCode()
    {
        return this.Value.GetHashCode();
    }
    public static bool operator ==(TableCell left, TableCell right)
    {
        if ((object)left != (object)right) return false;
        return left.Value == right.Value;
    }
    public static bool operator !=(TableCell left, TableCell right)
    {
        if ((object)left != (object)right) return false;
        return left.Value != right.Value;
    }
    public static explicit operator string(TableCell cell)
    {
        if (cell == null) return null;
        return cell.Value;
    }
    public static explicit operator bool(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return cell.Value == “1”;
    }
    public static explicit operator bool?(TableCell cell)
    {
        if (cell == null) return null;
        return cell.Value == “1”;
    }
    public static explicit operator int(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return Int32.Parse(cell.Value);
    }
    public static explicit operator int?(TableCell cell)
    {
        if (cell == null) return null;
        return Int32.Parse(cell.Value);
    }
    public static explicit operator uint(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return UInt32.Parse(cell.Value);
    }
    public static explicit operator uint?(TableCell cell)
    {
        if (cell == null) return null;
        return UInt32.Parse(cell.Value);
    }
    public static explicit operator long(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return Int64.Parse(cell.Value);
    }
    public static explicit operator long?(TableCell cell)
    {
        if (cell == null) return null;
        return Int64.Parse(cell.Value);
    }
    public static explicit operator ulong(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return UInt64.Parse(cell.Value);
    }
    public static explicit operator ulong?(TableCell cell)
    {
        if (cell == null) return null;
        return UInt64.Parse(cell.Value);
    }
    public static explicit operator float(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return Single.Parse(cell.Value);
    }
    public static explicit operator float?(TableCell cell)
    {
        if (cell == null) return null;
        return Single.Parse(cell.Value);
    }
    public static explicit operator double(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return Double.Parse(cell.Value);
    }
    public static explicit operator double?(TableCell cell)
    {
        if (cell == null) return null;
        return Double.Parse(cell.Value);
    }
    public static explicit operator decimal(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return Decimal.Parse(cell.Value);
    }
    public static explicit operator decimal?(TableCell cell)
    {
        if (cell == null) return null;
        return Decimal.Parse(cell.Value);
    }
    public static implicit operator DateTime(TableCell cell)
    {
        if (cell == null) throw new ArgumentNullException(“TableCell”);
        return new DateTime(1900, 1, 1).AddDays(Int32.Parse(cell.Value) – 2);
    }
    public static implicit operator DateTime?(TableCell cell)
    {
        if (cell == null) return null;
        return new DateTime(1900, 1, 1).AddDays(Int32.Parse(cell.Value) – 2);
    }
}

LtxOpenXml.zip

Comments (33)

  1. I hope you all have been enjoying Zeyad’s articles showing some of the powerful solutions you can build

  2. Carlos LeBrilla says:

    Is there anyway to use this with Excel Services?

  3. EricWhite says:

    Hi Carlos,

    If you are writing a feature for SharePoint, you could use this approach to extract information from spreadsheets.  Also, if you have spreadsheets in a document library, you could write a web service to retrieve the spreadsheets and use this code to query tables within the spreadsheets.  Does this answer your questions?

    Thanks, Eric

  4. Richerd says:

    Hi Eric,

      I am building a B/S software to run a management work with VS2008Sp1 and SQL 2005. I use Linq in my project. My problem is, could I use linq to update the Datum in Excel. if so, I can use LinQ to SQL to retrive data, and Linq to Xml to write that data to excel.

      Thanks in advance.

  5. EricWhite says:

    Hi Richard,

    Yes, this is certainly possible.  The easiest way is to have a ‘template’ spreadsheet that you copy and modify, inserting the results of your LINQ to SQL query.  In short, you want to modify the worksheet part, and replace the x:sheetData element and its child x:Row elements with new elements that you construct from your query.

    -Eric

  6. Richerd says:

    Hi Eric,

      I am trying my best to find out a fast solution for the reporting part of my project. I have to export the data in to Excel. And I have realize that with automation Excel in the serverside(not a good solution), javascrip and gridview to out put the dataset from the Celint side. And now, I find that open XML maybe a better way to do that. Of course, I have write data to excel with openXml sdk. But I do not know how to manipulate the excel style(eg.the column with, border, and etc). Could you please provide some resource (eg. blog, article or website) for me to learn about that?

      Do you have any suggestion for me?

      What I need is a Stable and fast system to export data from SQL 2005 to excel. Thanks a lot!

  7. Richerd says:

    And does it the linq that makes my exporting system slow?

  8. Dernier post de cette série sur la suppresion des commentaires dans les documents PowerPoint 2007 (PresentationML).

  9. EricWhite says:

    Hi Richard,

    Exporting the data into Excel is certainly doable.  I have a screen-cast that I need to record that shows how to do this, but basically, the gist is to find the sheetData element in the worksheet, and replace that element with a new one that contains appropriate child row elements.  Take a look at this post: http://blogs.msdn.com/brian_jones/archive/2008/11/04/document-assembly-solution-for-spreadsheetml.aspx , and in general, look at the other posts by Zeyad Rajabi on Brian Jones’s blog.

    One approach for doing formatting – it is easiest to set up a spreadsheet with your desired formatting, and then modify the spreadsheet rather than generating the spreadsheet with formatting from scratch.

    Also, keep OpenXmlDiff in mind – this has the capacity to teach you how to format – save a spreadsheet, change formatting slightly, save it again, and see the differences.  This shows you the markup necessary to change formatting.

    Regarding speed, the portion to write out the Open XML, or read the Open XML using either LINQ to XML or Open XML SDK V2 will, in general, be very fast.

    -Eric

  10. Richerd says:

    Hi Eric,

      I really appreciate your help. I will learn more throug the resources which you and your firend provided.

      Thanks again.

  11. Comme à l’accoutumé, voici une brochettes de liens de la semaine sur Open XML. Posts techniques en vrac

  12. By combining LINQ to SQL with LINQ to Objects in this fashion, you can write some pretty simple C# code that uses LINQ to join data across two separate, disparate data sources.

  13. Constantinos Leftheris says:

    Is there any chance in the future, of providing additional functionality to the given example for Tables so that after manipulating the table contents in memory you can actualy save the results back? Your post was a great help for me, thanks alot!!

  14. EricWhite says:

    Hi Constantinos, I do have plans to show some code to update tables.  This would be valuable, I think.

    -Eric

  15. Paul says:

    Nice post Eric. Have you tried out the open source Linq to Excel project (<a href="http://code.google.com/p/linqtoexcel/">http://code.google.com/p/linqtoexcel/</a&gt;) to use linq queries against Excel? It uses OleDb and takes care of everything in the background. All you have to do is declare a simple class with properties that map to the column names in the excel sheet. Here’s an example:

    IExcelRepository<Customer> repo = new ExcelRepository<Customer>();

    var londonCustomers = from c in repo.Worksheet

                         where c.City == "London"

                         select c;

    foreach (Customer customer in londonCustomers)

     Console.WriteLine(customer.ToString());

  16. Erika Ehrli says:

    Here is a list on links that I want to share with you. LINQ for Office Developers Some Office solutions

  17. Deepesh says:

    Hi Eric,

    You post are good and book even better, but it has been a struggle for me most of the time because i have to painfully reconstruct everything from c# to VB, could you be kind enough to give VB option to all your code?

    Thanks a lot

    Deepesh

  18. EricWhite says:

    Hi Deepesh,

    I agree, having VB samples would be great.  I’ll do this when possible.  :)

    -Eric

  19. Mickel says:

    Hi Eric,

    This helped me ALOT and I am very grateful for all your high-quality posts :)

  20. Bernard Darnton says:

    Thanks for this post – I’ve found it very useful.

    One problem I’ve run into is addressing table columns where columns have been inserted after the rest of the table has been created. The ids are no longer in order and the TableRow[columnName] method no longer works.

    You get this in table1.xml:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

    <table xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main&quot; id="1" name="Table1" displayName="Table1" ref="A1:D2" insertRow="1" totalsRowShown="0">

    <autoFilter ref="A1:D2">

    <filterColumn colId="1" />

    </autoFilter>

    <tableColumns count="4">

    <tableColumn id="1" name="First" />

    <tableColumn id="4" name="Inserted" />

    <tableColumn id="2" name="Second" />

    <tableColumn id="3" name="Third" />

    </tableColumns>

    <tableStyleInfo name="TableStyleMedium2" showFirstColumn="0" showLastColumn="0" showRowStripes="1" showColumnStripes="0" />

    </table>

    Because the column ids are out of order, the line in the TableRow class that gets the cell gets the wrong result:

    string columnAddress = (startRefs[0].ColumnAddressToIndex() + tc.Id – 1).IndexToColumnAddress();

  21. Bernard Darnton says:

    The solution I’ve used for the problem above is to add a COlumnIndex property to the TableColumn class:

    public int ColumnIndex { get; set; }

    Set it using the ElementsBeforeSelf method:

    new TableColumn(this)

    {

     Id = (int)c.Attribute("id"),

     Name = (string)c.Attribute("name"),

     FormatId = (int?)c.Attribute("dataDxfId"),

     QueryTableFieldId = (int?)c.Attribute("queryTableFieldId"),

     UniqueName = (string)c.Attribute("uniqueName"),

     ColumnIndex = c.ElementsBeforeSelf().Count()

    }

    And then fix the cell reference in the TableRow[columnName] method:

    string columnAddress = (startRefs[0].ColumnAddressToIndex() + tc.ColumnIndex).IndexToColumnAddress();

    Hope that’s useful for someone.

  22. Minh Le says:

    Hi,

    I just use Open XML Lib to query data from Excel file. However, I have one problem with this approach. The problem is that I can not handle the black cell. For ex: I have a table with 14 columns. The first row is the header, it contains column’s name. The remaining rows are the data. In the range of data, there’s some cell are blank. When I query this data, instead of getting 14 columns for row, I only get 13 columns and the data is incorrect due to lack of order.

    Onw work around for me right now is replace the blank cell with the null pattern defined by myself, and later I replace the pattern with space. But I find it is tricky.

    Do you know a better solution?

    Regards,

    Minh.

  23. sanvid says:

    hi what my requirement is client will be able to uplad the multiple CSV uaually 5 to 6 which is user define and he wants to generate another csv file with a use of query builder. As per generated query a new file is generated. Can you please help me to fire a query with multiple tables in csv file. is it possible with above given solution by you.

  24. gopal says:

    how to convert following string in to linq query as in my project its comming dynamically.

    "from i in spreadsheet.Table("Inventory").TableRows()

       where (int)i["Qty"] > 2

       select i;"

  25. samir says:

    I have two spreadsheets: one is a list of job numbers and descriptions and the other is a time sheet. Any ideas on how I could get the job numbers as a drop -down list in the time sheet file?

    Thanks.

  26. EricWhite says:

    Hi Samir, I’m not a word-automation expert.  This is a good question.  I’ll ask one of the experts in my office.

    -Eric

  27. EricWhite says:

    Hi Samir, I have a few questions.  Are you looking for a solution in Visual Studio?  Do you want the combo box on a toolbar?  Which version of Excel are you using?  Feel free to contact me directly via the email button above, and I’ll be happy to help get you the answers you need.

    Thanks, Eric

  28. EricWhite says:

    Hi Samir, I have perhaps a solution for you.  I have a shared Add-in project that populates a combo box on the Ribbon in one Excel workbook with a range of data from another workbook.  If you’d like to contact me directly via the EMAIL button on my blog, I’d be happy to send it to you (and anyone else who wants it).

    -Eric

  29. spottedmahn@gmail.com says:

    Hi Eric,

    First let me say thanks!!

    I think I’ve un-covered a bug.  The issue I was having was the with TableRow.GetCell().  It was incorrectly calculating the ColumnAddress.  The correct column was E but it calcuated H.

    I suspect the problem arised because I had moved some columns around in my table.  To get around my problem I copied and pasted the values into a new table and my issue went away (off course it took 30 minutes of debugging to figure this out).

    I can send you the Excel file if you so desire.

    Regards,

    Mike D.

  30. EricWhite says:

    Hi Mike,

    You are right.  I’ll update this code next week.

    -Eric

  31. Pascal Craponne says:

    I had the same problem as spottedmahn and fixed it (by adding a row index to the TableCell). I can send you the updated source code if you tell me where I can send it.

  32. Craig says:

    Hi Eric,

    Excellent examples, thank you very much!

    I think I may have discovered a bug in the "Table.TableRows()" return query.  Shouldn't the return query skip the headerRowsCount and totalRowsCount AFTER the Where clause executes, not before?  On a sheet with multiple tables I found that the query was skipping the first headerRowsCount + totalRowsCount number of rows in the sheet, thus I was still getting the header row and totals row returned when the table is not located on the first row.  By moving the Where clause BEFORE the skips resolved the issue.

    Thanks,

    Craig