office file formats… pondering

Tim Bray made a couple of interesting posts on the Office XML and ODF debate that's going on. He's right that the Atom reference from Dare and co. is a red herring. Except that the Atom/RSS experience is worth noting in one regard: the basic design of Atom and RSS (do the basics and do everything else in extensions).

Tim makes the following point:

Almost all office documents are just paragraphs of text, with some bold and some italics and some lists and some tables and some pictures. Almost all spreadsheets are numbers and labels, with some sums and averages and pivots and simple algebra. Almost all presentations are lists of bullet points with occasional pictures.

But he misses something else. Most important documents aren't just paragraphs of text, etc. etc. Most documents that businesses depend on are more complicated than that, and a file format can't simply leave out important features (like, oh say... formulas) because the group decides that their focus is on structure instead of content (quoted from this news article - original source unknown).

In his post, Tim says:

The ideal outcome would be a common shared office-XML dialect for the basics—and it should be ODF (or a subset), since that’s been designed and debugged—then another extended vocabulary to support Microsoft features, whether they’re cool new whizzy features or mouldy old legacy features (XML Namespaces are designed to support exactly this kind of thing). That way, if you stayed with the basic stuff you’d never need to worry about software lock-in; the difference between portable and proprietary would be crystal-clear. And, for the basic stuff that everybody uses, there’d be only one set of tags.

This all sounds warm and fuzzy in the cool glow of one's monitor, but in practice it is completely unworkable. The logical outcome of this approach (standardize on the "basics" and go crazy on everything else) is that the difference between portable and proprietary would never be "crystal clear" as Tim puts it. The user, and only the user, would pay for the pain of not knowing whether a given document would open in one app or another.
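To make that a bit more concrete, here is a rough Python sketch of what "crystal clear" would have to mean in a core-plus-extensions world. Everything in it is made up for illustration: the two namespace URIs, the element names, and the `non_core_names` helper are hypothetical and come from neither ODF nor Office XML. It only assumes that the basics live in one namespace and vendor features live in another.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this sketch only.
CORE_NS = "urn:example:office-core"
VENDOR_NS = "urn:example:vendor-extensions"

# A made-up document: mostly core markup, plus a vendor formula and macro.
DOC = f"""<doc xmlns="{CORE_NS}" xmlns:v="{VENDOR_NS}">
  <p>Quarterly totals</p>
  <table>
    <cell>1200</cell>
    <cell v:formula="SUM(A1:A3)">3600</cell>
  </table>
  <v:macro name="refresh"/>
</doc>"""


def namespace_of(qname):
    """Return the namespace of an ElementTree '{uri}local' name, or ''."""
    return qname[1:].split("}", 1)[0] if qname.startswith("{") else ""


def non_core_names(xml_text, core_ns=CORE_NS):
    """Collect element and attribute names that fall outside the core namespace."""
    found = []
    for elem in ET.fromstring(xml_text).iter():
        if namespace_of(elem.tag) != core_ns:
            found.append(elem.tag)
        for attr in elem.attrib:
            ns = namespace_of(attr)
            # Unprefixed attributes carry no namespace; skip them here.
            if ns and ns != core_ns:
                found.append(attr)
    return found


extras = non_core_names(DOC)
print("portable" if not extras else f"not portable: {extras}")
```

A parser can separate the two vocabularies easily enough. What it can't tell you is whether the other application does anything sensible with the formula or the macro, and the person staring at the document gets no hint at all.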

Even between ODF implementers, problems are already showing up when applications want to implement something that isn't covered in the spec. Do you really want to subject your users to having to reduce their documents to the lowest common denominator - or to figuring out what the lowest common denominator is at any given time?

The basic point here is that this "standardize the basics" plan is doomed to failure in the real world. For RSS/Atom feeds, maybe it works, but for office documents -- for documents that people and businesses rely on -- the warm and fuzzies aren't good enough. My document just has to work. When I create a formula in a spreadsheet, it just has to work. When I create a macro in a word processor, it just has to work. When I create a custom transition in a presentation, it just has to work. It's clear that, at least for now, ODF doesn't do that, and in some cases, there don't appear to be any plans to do any of it (caveat: I'm not deep into ODF development, and I welcome corrections).

I, for one, am very glad that the Office guys are doing the right thing with their licenses. This most recent modification to the license makes their intent super-clear, even to those inclined to assume the worst: All the team wants to do is build a great product while providing what users want (in this case, a well-documented open format).

Disclaimer: I have nothing to do with the Office file format teams and my opinions do not represent anyone but myself.
