Excel 2007 Binary File Format (.xlsb) as a timesaver

A couple of months ago while I was attending TechEd US, I created quite a lot of Microsoft Office Excel 2007 documents. I was using beta 1 build 4017.1004 of Microsoft Office Excel 2007 and running Windows Vista beta 2 Build 5384.

At that time both beta products worked well together. It was one of the first builds where I experienced (and enjoyed) the integrated search functionality from both products and the new features of 2007 Office System. To make sure that others could read my newly created Office files I saved them twice: once in the old file format (Excel 97-2003 Workbook) for sharing purposes and once in the new file format (e.g. .xlsx for Excel Workbooks) for my own use. Although the Microsoft Office Compatibility Pack for Word, Excel and PowerPoint 2007 file formats was already available at that time, you couldn't expect every Office 2003 user to have installed it. So I ended up having the same files saved twice with different extensions.

A couple of weeks ago (after my holidays) I started working again on some of the files saved during TechEd US using the .xlsx format. But since then I had reinstalled my machine with Windows Vista beta 2 build 5472.5 and Microsoft Office Excel beta 2 build 4228.1000.

Unfortunately, due to changes in the Open XML file formats between the two builds, I wasn't able to open my .xlsx files with the latest build. I expected this could happen, so no worries ... I still had all my documents saved in .xls format too. At least that's what I thought ... So it was a big surprise when I tried to open one of the .xls files created by beta 1 with Excel 2007 beta 2 and the following dialog was shown:

 

After some investigation of the Office 2007 Open XML file format and taking a peek inside the XML of the zip containers, it was obvious that it would be too time-consuming to write a tool to extract all the data from the container, or - even worse - to copy it manually. So reinstalling Office build 4017.1004 in order to access my data was the quickest way. But then again ... how could I open these files in a later build, e.g. beta 2 Technical Refresh?

Well, luckily enough we have some very smart people in Belgium who know all about MOSS 2007 development and Office file formats and can solve this. Patrick Tisseghem (Thanks!) advised me to save the beta 1 files in the new .xlsb binary file format. .xlsb is the extension of the new Excel 2007 Binary Workbook, as explained on the Excel 2007 blog:

The Excel binary format is a full fidelity format for Excel 2007. It is similar to the Office Open XML format in structure - a set of related parts, in a zip container - except that instead of each part containing XML, each part contains binary data.

Even though we've done a lot of work to make sure that our XML formats open quickly and efficiently, this binary format is still more efficient for Excel to open and save, and can lead to some performance improvements for workbooks that contain a lot of data, or that would require a lot of XML parsing during the Open process. (In fact, we've found that the new binary format is faster than the old XLS format in many cases.) Also, there is no macro-free version of this file format – all XLSB files can contain macros (VBA and XLM). In all other respects, it is functionally equivalent to the XML file format above:

  • File size - file size of both formats is approximately the same, since both formats are saved to disk using zip compression
  • Architecture - both formats use the same packaging structure, and both have the same part-level structures.
  • Feature support - both formats support exactly the same feature set
  • Runtime performance - once loaded into memory, the file format has no effect on application/calculation speed
  • Converters - both formats will have identical converter support
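Since both formats are described above as a set of related parts in a zip container, a quick way to compare an .xlsx and an .xlsb file is to list the parts inside each one. The following is only a minimal sketch using Python's standard `zipfile` module. The helper names `list_parts` and `count_part_types` are my own, and the assumption that binary parts show up with a non-XML extension is not something the quoted text states.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def list_parts(package):
    """Return (part name, uncompressed size) pairs for a zip-based workbook.

    `package` may be a path or a file-like object, e.g. an .xlsx or .xlsb file.
    """
    with zipfile.ZipFile(package) as container:
        return [(info.filename, info.file_size) for info in container.infolist()]


def count_part_types(package):
    """Count the parts in the container by file extension (e.g. '.xml')."""
    return Counter(
        PurePosixPath(name).suffix.lower() for name, _ in list_parts(package)
    )


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python parts.py book.xlsx book.xlsb
    for path in sys.argv[1:]:
        print(path, dict(count_part_types(path)))
```

Running it against the same workbook saved in both formats should show the same packaging structure, with the part contents differing between XML and binary data.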

Saving my files as .xlsb in the beta 1 build and opening them in the beta 2 build gave no errors, and all data was ready for reuse.

But after all, the question remains: what caused the .xls file format change between the two beta builds?

Most of us will probably not face this issue in the near future (the shipping date is getting close), and I don't expect any more major changes in the file format. Still, I'll continue to use the binary file format (.xlsb) until the last beta of Excel 2007. This at least ensures that I can open my Excel files using any of the pre-release builds of 2007 Office System that I'm running. Once the dogfooding is over I'll switch to the new Open XML file format, which offers great developer flexibility.

Interesting resources on this topic:

Tags: Microsoft, Community, Excel 2007, 2007 Office system