Spin Spin Sugar

07/27/2006

OK, forgive the random Sneaker Pimps reference. I promise we will move off the topic of ODF politics that has occupied us for the past week or two, but I wanted to call out something that Stephen McGibbon pointed out to me today: a blog post he made on Monday entitled Spinning out of Control. In it, Stephen notes that the press release for the ISO approval of ODF made the following statement:

Billions of existing office documents will be able to be converted to the XML standard format with no loss of data, formatting, properties, or capabilities. This will facilitate document contents access, search, use, integration and development in new and innovative ways.

Now, I'm not sure if this was just an exaggeration, or if they meant that ideally this will be the case in future versions of ODF. It's clear, though, that as the spec stands now, it is not. A number of areas are either left unspecified or specified to a more limited level than what people are already doing in their documents today. I'm not talking about future innovations, but basics that have been around for years. I know that pushers of ODF like to say this is just FUD, but really it's just a fact. Look at the spec. If the goal is to guarantee perfect fidelity with the existing base of Microsoft Office documents (which the "billions of documents" statement implies), then there is still a long way to go.

Now, maybe fidelity with the existing base of Microsoft Office documents was a non-goal. Reading through the newsgroups, it's pretty clear that the initial goal of ODF was mainly fidelity with the existing OpenOffice 1.1 format created by Sun. This is stated pretty clearly by David Faure, who is a voting member of the OASIS Open Document Technical Committee:

The format is heavily based on the requirements, constraints, and experiences of *Sun* customers and KOffice users and developers though, and nothing says that those requirements are totally different. But for sure we didn't target *Microsoft*'s customers. The art of implying something without actually saying so...

"Almost no material changes" is certainly exaggerated, but yes, ODT is mostly based on OO-1.1, it wasn't completely redesigned;

I think the key here is for everyone to just be clear on the goals. ODF is based on Sun's StarOffice, and Open XML is based on the Microsoft Office formats. Both aim to be open, both have been submitted to standards bodies, and both have a commitment from the donating companies (Sun and Microsoft) that there will be no licensing restrictions and that anyone is allowed to freely use the formats. A big difference, though, is that the ODF folks took a slightly different approach to deciding when to declare draft 1.0 complete. There are even features that OpenOffice supports that aren't yet clearly defined in the spec. The Ecma draft, on the other hand, pretty clearly defines everything, which then allows people to implement as much or as little of it as they want.

A recent statement on this that really left me scratching my head was made by Gary Edwards in the comments on Stephen's blog post. You may remember Gary as the guy who was under the impression that there was a mythical binary key in the Office XML formats. Gary is a member of the ODF Foundation and has been talking a lot about the add-in they built to open and save ODF in Microsoft Office. I still haven't had a chance to look at the add-in, as it's been kept pretty secret, but Gary has really promised a lot. Here is what he said on Stephen's blog about ODF not being full fidelity with the existing base of documents:

You're wrong. The OpenDocument Foundation plug-in will deliver near perfect fidelity for ODF documents produced by MSOffice. Our fidelity is near identical to the fidelity achieved when converting MS binaries to MOOX.

Maybe you need to pay more attention to the trials going on in Massachusetts. Oh, that's right. Microsoft isn't participating in those trials. Based on the piss poor fidelity of your translator project, i wouldn't participate either if that was the best i could do.

The truth is that it doesn't matter to us if it's billions of documents or ten documents. If that document can be loaded into any version of M$Office from 1997 to 2007, we can convert it with near perfect fidelity. At least as good as your own conversion within MSOffice to MOOX.

Perhaps you need to worry more about your own credibility than that of the ODF Community. We're doing just fine thank you,

Oh yeah, one other thing. Accessibility add ons to MSOffice work just fine with ODF. There is no performance differential between ODF and MOOX within MSOffice worth worrying about. There is no differential in how accessibility applications are handled. So what was your complaint again?

~ge~

I really don't understand this. First off, maybe he isn't aware that the translator project we announced is currently at a very early prototype stage and is completely open source. It will continue to improve over the coming months. I understand that people usually expect the things we announce to be further along, but we wanted this to be done in the open so anyone could comment and contribute.

I also thought that everyone was in agreement that the ODF format was not yet at a point where it could fully represent the existing base of Office documents, but Gary seems to say their tool can somehow get around this limitation. I don't know how deeply Gary has looked into this, but it's simply not possible unless he and the ODF Foundation have already added significant extensions to the ODF standard. I haven't seen these new extensions documented anywhere. The OASIS ODF technical committee says it's still over a year away from defining spreadsheet functions and tables in presentations, and there has been no mention of solutions to the international numbering issues or even simple things like character highlighting.

Gary also doesn't seem to understand the performance problems with ODF. They have nothing to do with performance once the file is loaded. The problem is how long it takes to read and write ODF files, since the ODF designers decided to use a generic table model to represent full spreadsheets.

So, while I think the ODF spec is a great representation of the OpenOffice file format, it's just not anywhere close to the Ecma spec in terms of representing Microsoft Office documents. And since we already have billions of documents in that format and hundreds of millions of customers, we absolutely have to keep our focus on the Ecma spec for now. We are also helping to build transformations between the two formats, which really helps to show the beauty of working with documented, open, XML formats.

-Brian
