Office's methods for parsing XML

As you know, there are many tools available today for parsing XML. If you want to work with the MS Office XML files, you have a ton of options to choose from. Microsoft provides a couple of alternatives, but you can use whatever you want. The XML we output is just standard XML 1.0 as defined by the W3C.

Today in Office, we use MSXML's SAX parser for both reading and generating our XML files. The SAX parser is very efficient, and we've been really happy with it. We actually first started using it in early 2002 when we began to integrate the rich XML functionality that shipped with Office 2003.
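
If you haven't worked with a SAX-style parser before, here is a rough sketch of the idea: the parser streams through the document and calls back into your handler as it hits each element, so nothing has to be loaded into a tree first. This is not Office's code and it doesn't use MSXML; it uses Python's built-in xml.sax module as a stand-in, and the ElementCounter class, the count_elements function, and the sample markup are all made up for illustration.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts how often each element name appears."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every opening tag it encounters.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(xml_text):
    """Stream through xml_text and return a dict of element name -> count."""
    handler = ElementCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


# Made-up sample markup, not a real Office file.
SAMPLE = "<doc><p>Hello</p><p>World</p></doc>"

if __name__ == "__main__":
    print(count_elements(SAMPLE))
```

Because the handler only keeps the running counts, memory use stays flat no matter how large the file is, which is the main reason a streaming parser works well for big documents.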

Before that, though, we used a couple of different approaches in Office. We first started doing work with XML in 1997 at the beginning of the Office 2000 project. At that point XML was still in its infancy, and we decided to build an Office-specific XML parser, which we integrated with our HTML code. Then in 1999, when we started work on Office XP, we decided to use the same Office XML parser to generate Excel's SpreadsheetML format. It wasn't until Office 2003 (which we started planning in late 2001) that we made the move to MSXML, which we use today in our new XML formats.

-Brian