I had a few people point me at a couple of IBM blogs today (Bob Sutor and Rob Weir) and I have to admit I was a little disappointed to see that they are really working hard to continue to push negative views of the Office Open XML formats. Basically they want to position it in such a way that there is a winner and a loser, and it’s no surprise that they think the winner should be the one they’ve put all their resources behind (ODF). It’s definitely a strong “us vs. them” mentality that you also see a lot in politics these days. I admit I’ve pushed back in the other direction at times and had some criticisms of the Open Document format, but those have always been in response to folks who ask why we couldn’t use ODF as the default format for Microsoft Office. I had always stated that we needed a format that could fully support all of the features our customers used, and when the ODF folks snapped back saying that I wasn’t providing enough concrete examples, I decided to start providing specific problems. I’ve never said the world can’t use ODF; I’ve just said that the Office Open XML formats are also necessary. I feel like some of these folks have watched Highlander one too many times (hence the title of this post). I would never make the claim that the HTML format means that ODF isn’t necessary, and I certainly don’t believe that ODF means that Office Open XML isn’t necessary.
The latest criticism from Bob and Rob is that the Open XML formats don’t use MathML, and instead define a separate XML syntax for a Math Presentation format. Rob even displayed a bit of a flair for the dramatic, titling his post “Math you can’t use”, and Bob followed up with “Making bad choices, over and over again.” Well, thankfully this isn’t really true, and to be honest, if posts like that aren’t considered ‘FUD’ I don’t know what is. Every piece of the Office Open XML format is being fully defined in Ecma, and we’ve even built XSLTs that will transform to MathML and back. In addition to that, we’ve worked closely with different companies out there that already support MathML to make sure we are compatible with their solutions. We support MathML on the clipboard, so you can paste a MathML equation into Word. Here is the latest version of the XSLT that takes the Office Open XML format for Math and transforms it into MathML (http://jonesxml.com/resources/omml2mml.xsl), and here is the XSLT that goes in the opposite direction (http://jonesxml.com/resources/mml2omml.xsl). Anyone who has Beta 2 of Office 2007 should already have these on their machine under “Program Files\Microsoft Office\Office12”.
Just in case folks aren’t sure what I’m talking about, this is all about the presentation form of MathML. The math is never actually calculated, only displayed. Also note that this is different from the discussions around functions, which are a large part of the SpreadsheetML specification. Unlike the spreadsheet functions, the Math support is all around scenarios like academic papers that need to use formulas as part of the information they are presenting.
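For anyone who hasn’t looked at Presentation MathML before, here is a rough sketch of what “displayed, not calculated” means in practice. It uses Python’s standard library to build the markup for x squared. The helper name `superscript` is made up for this illustration, and this is plain MathML, not the Office Open XML math syntax.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C for MathML elements.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def superscript(base: str, exponent: str) -> ET.Element:
    """Build Presentation MathML for `base` raised to `exponent`.

    Hypothetical helper for illustration only. The markup describes layout
    (an identifier with a raised number); nothing is evaluated.
    """
    math = ET.Element("math", {"xmlns": MATHML_NS})
    msup = ET.SubElement(math, "msup")   # superscript layout container
    mi = ET.SubElement(msup, "mi")       # identifier, e.g. a variable name
    mi.text = base
    mn = ET.SubElement(msup, "mn")       # numeric literal
    mn.text = exponent
    return math


markup = ET.tostring(superscript("x", "2"), encoding="unicode")
print(markup)
```

The result only says “an x with a 2 raised next to it”; a renderer decides how to draw it, and nothing in the markup asks anyone to compute a value.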
I remember a few years ago having a discussion with Murray Sargent, who was one of the key folks behind the new math support in Office 2007. He had also served on the MathML 2.0 standards body before it was dissolved, and we talked about whether or not we could use MathML for the formats. He obviously was very familiar with the MathML format, and the conclusion was that we unfortunately couldn’t use MathML in our new default XML formats. We found that while MathML works great for isolated math islands, it didn’t give us everything we needed at the document level. MathML does leave room for annotations, so we could have extended it, but that would not have worked well with document-level features like comments, track changes, Word styles, etc. The equation support in Word 2007 is actually very impressive, and if you haven’t taken a look yet I strongly suggest you give it a try.
We did agree though that we should fully support MathML as an interoperability language between apps, which is why we can read and write Presentation MathML on the clipboard (leveraging those XSLTs).
This is just another example of the difficult decisions we had to make when building these new formats. Of course we would have loved to have just used MathML, as it was already fully designed and documented. It would have been much easier, but it would have also meant we would have had to either cut back the functionality, or extend it in such a way that it was no longer as usable. If you ever used the HTML formats from prior versions of Office, you’ve seen that when you try to take a format that was designed for other purposes and add extensions so that it can represent your files, you often end up with a rather complex and unmanageable result. So instead, we used MathML as a guide, and tried to leverage as much of the design as we could. We had to make sure we could support our features though and not let the format put the end user in a bad state. Most of our users don’t care the least bit about XML and XML formats, and if moving to the new file formats meant things like tracked changes wouldn’t work on the equations, then folks would have chosen to stick with the binary formats instead. So instead we have an XML format that supports all of the features, and that format is fully documented and free for anyone to use. Not a bad deal in my view. I can’t say enough how proud those of us who worked on the formats are. It’s such an important change in the world of Office documents.