UOF translator project

"There is a Chinese curse which says, "May he live in interesting times."
Like it or not, we live in interesting times."

- Robert F. Kennedy, 1966

Most historians now doubt whether there was ever any such curse. But its enduring popularity has made such questions irrelevant. To paraphrase Voltaire, if that saying didn't exist it would be necessary to invent it. It seems to fit so many situations, and consequently it has been repeated millions of times.

That mythical saying sprang to mind when I saw today's announcement of the UOF translator project:

As part of its continued commitment to deliver interoperability by design, Microsoft Corp. today announced a new collaborative effort with the Beihang University (Beijing University of Aeronautics and Astronautics) and others to create an open source translator project between China’s Unified Office Format (UOF) and the Ecma Open XML File Formats. In addition, the company announced the beta release of translation tools for Windows® XP, and the 2003 and 2007 versions of Microsoft® Office Excel® and Microsoft Office PowerPoint® as part of the Open XML Translator project launched in July 2006.
The UOF translation tools will be developed and licensed as open source software, and ultimately will be made available as free, downloadable add-ins for Microsoft Office Word 2003 and 2007 customers from SourceForge.net. As such, the tools will be available for use with other individual and commercial projects to accelerate document interoperability across the industry and benefit Microsoft Office customers in China who need to work with the UOF standard. A preview of the UOF translator tools will be released on SourceForge this summer, with final versions expected early next year. Further details on the UOF SourceForge project are available at https://uof-translator.sourceforge.net.

These are interesting times for document formats. We're in the midst of a transition from complicated, proprietary binary formats controlled by a single vendor to simple, XML-based formats controlled by standards organizations. UOF is such a format. So is Open XML, and so is ODF; and projects like the UOF translator will make it easier for all of these formats to co-exist.

And co-exist they will. I believe all three formats will grow in popularity in the years ahead, and so the big question isn't the theatrical "who will win?" of a George Patton but rather the plaintive "can't we all just get along?" of a Rodney King. In a few years, I'm expecting that the concept of winning or losing a document-format "war" is going to look pretty silly in hindsight.

UOF translation challenges

The UOF translator project is especially interesting to me because I had the opportunity to meet some of the people involved when I was in Beijing last month. Several of the attendees at the Open XML workshop we held at Beihang University were preparing to work on this project, and I was blown away by how thoroughly they had studied the spec. They asked precise questions (some of which I still haven't answered -- sorry, Belinda!) that reflected a deep understanding of the details. It was cool to discuss things at that level, and a refreshing break from the discussions of the spec that seem to revolve around everything but its actual content: how many pages it has, what's not in it, who created it, and so on.

Translation between UOF and Open XML includes challenges that most XML programmers never deal with. For example, there's the issue of character sets. Here's a sample of some UOF markup from an actual UOF document:

You can imagine what it's like to map these element names to the English element names of Open XML. And there are other conceptual leaps. For example, there's the issue of text direction: when you install Chinese language support, word-processing applications have additional text-direction options that don't appear for many other languages. How should text in documents that take advantage of those options be translated?
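To give a rough feel for the element-name part of the problem, here's a minimal Python sketch that renames Chinese element names to English ones using a lookup table. Everything in it is an assumption made for illustration: the three Chinese names, their English targets, the `NAME_MAP` table, and the `translate_names` helper are hypothetical and not taken from the UOF or Open XML schemas, and the sketch ignores namespaces, attributes that need renaming, and structural differences between the formats. It says nothing about how the actual translator project will be built.

```python
import xml.etree.ElementTree as ET

# Hypothetical name table: these Chinese element names and their English
# targets are illustrative only, not copied from either specification.
NAME_MAP = {
    "段落": "p",    # paragraph
    "句": "r",      # run of text
    "文本串": "t",  # text string
}


def translate_names(element):
    """Return a copy of `element` with tag names mapped through NAME_MAP.

    Tags that are not in the table are kept unchanged.
    """
    new_tag = NAME_MAP.get(element.tag, element.tag)
    translated = ET.Element(new_tag, dict(element.attrib))
    translated.text = element.text
    translated.tail = element.tail
    for child in element:
        translated.append(translate_names(child))
    return translated


# A made-up fragment with Chinese element names.
source = "<段落><句><文本串>你好</文本串></句></段落>"
root = translate_names(ET.fromstring(source))
result = ET.tostring(root, encoding="unicode")
print(result)
```

Renaming is the easy part. A one-to-one table like this only works where the two formats happen to model a concept the same way, and it does nothing for cases like the text-direction options, where the formats may not line up at all.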

It's a big job to figure out all of these issues, but based on what I've seen there's a great team ready to start working on the problem. It will be interesting to watch how this project evolves.

The other news: more ODF-Open XML translators

Almost lost in the noise about the UOF translator today was the other half of the press release quoted above -- the bit about the beta release of spreadsheet and presentation versions of the ODF-Open XML translator project. These will be very interesting projects too, and will be critical for many types of organizations.

As Stephen McGibbon points out, for some reason these other document types don't get as much attention in the document-formats debate.

The matter of timing

Finally, I'd like to address one other point regarding the UOF translator announcement. There has been a bit of speculation in the blogosphere about the fact that Scott McNealy went to Beijing and talked about merging ODF and UOF, then Bill Gates spoke in Beijing shortly thereafter, and now this announcement regarding UOF and Open XML.

What seems to be missing from this speculation is the matter of timing. So I'd like to set the record straight: I was there first.

I'm hopeful this information will help put the rumors and speculation to rest.