New Binary File Format for Spreadsheets

With the upcoming release of the 2007 Microsoft Office System, much attention has been focused on the new Open XML file formats. But there is another new file format for Excel spreadsheets that you'll want to take a look at if you're building unusually large or complex spreadsheets: the new XLSB binary format. Like Open XML, it's a full-fidelity file format that can store anything you can create in Excel, but the XLSB format is optimized for performance in ways that aren't possible with a pure XML format. The Excel team blog has more information about the various formats supported by Excel 2007.

The XLSB format (also sometimes referred to as BIFF12, as in "Binary Interchange File Format" for Office 12) uses the same Open Packaging Conventions used by the Open XML formats and XPS. So it's basically a ZIP container, and you can open it with any ZIP tool to see what's inside. But instead of .XML parts within the package, you'll find .BIN parts as shown to the right.
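
Because the package is an ordinary ZIP container, you can also list its parts from a script instead of a ZIP tool. Here is a minimal Python sketch using only the standard zipfile module. The function names and the part names in the demo package are made up for illustration; they are not taken from any documentation, and a real XLSB file saved from Excel may contain different parts.

```python
import io
import zipfile
from collections import defaultdict


def group_parts_by_extension(package):
    """Return {extension: [part names]} for a ZIP-based package.

    `package` may be a file path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            # Use the text after the last dot in the final path segment.
            base = name.rsplit("/", 1)[-1]
            ext = base.rsplit(".", 1)[-1].lower() if "." in base else ""
            groups[ext].append(name)
    return dict(groups)


def build_demo_package():
    """Build a tiny in-memory ZIP; the part names are illustrative only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("xl/workbook.bin", b"\x00")
        zf.writestr("xl/worksheets/sheet1.bin", b"\x00")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    groups = group_parts_by_extension(build_demo_package())
    for ext, names in sorted(groups.items()):
        print(ext, names)
```

To try it on a real workbook, pass the path of a file saved in the XLSB format to group_parts_by_extension in place of the demo package. This only shows which parts exist; it does not attempt to read the binary records inside them.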

The format of .BIN parts will be thoroughly documented around the time of the Office 2007 release, but in the meantime there's no documentation available for the details of those binary components. In the last few weeks, however, software developer Stephane Rodriguez has embarked on an ambitious project to analyze the contents of .BIN parts and document them. Yesterday he posted an article entitled "Office 2007 .bin file format" that shows in great detail how some of the records are structured within the .BIN parts and how to work with them programmatically.

Stephane's article is very interesting reading for anyone who wants to get a head start on understanding the new XLSB format before the official documentation is available. Thanks, Stephane!