Complex editing of XLSX files

 

Editing Excel OpenXML documents using the OpenXML 2.0 SDK offers better performace and improved supportability compared to normal Office automation (e.g. VBA). There are also great examples on the Internet to get started: Office Open XML Formats: Inserting Values into Excel 2007 Cells.

What I haven’t found until now is some clarifications as to how one can edit multiple Excel cells without corrupting the workbook. This is what I’m presenting in this blog post.

The fact is that whenever rows and cells need to be added to an Excel worksheet, they need to be sorted upon insertion. Take for instance the XML for the following worksheet where A1 = 1, B1 = 2 and B2 = 3. Here is the only way the XML should be saved like:

image

If we inserted say cell B1 before cell A1, Excel would show a corruption error when opening the XLSX.

The program that I’m attaching here does exactly this: the input is a worksheet, a row and a column and the output is a modified worksheet that contains the row and the cell required for that cell to be edited.

Other useful features:

- Convert column index to colum name (e.g. 28 to AB)

- Get the column index from cell reference (e.g. AB23 to 28)

- Selecting cells based on argument value using LINQ

 

Sample edited worksheet:

image

Note: The sample program does not create a blank workbook, you will need to save one from Excel and pass the full path and the name of the worksheet to the program.

ModifyWorksheet.zip