Abriendo Puertas con XML

When Win Office 2003 shipped, there was a great deal of debate as to the “openness” of Microsoft’s use of XML. The debate resurfaced with the announcement of the new Office 12 XML-based file formats, and it has come to the fore again in recent days with Massachusetts’ decision regarding the adoption of “open” file formats.

On one side of the debate, you have those who would argue that, as long as it’s XML, it’s open. The other side of the debate would argue that mere use of XML isn’t sufficient, that the schemas need to be established or endorsed by an independent standards body. There are subtle shades on both sides of the debate, including schema publishing and licensing, but even with a published schema with a royalty-free license, there are those who would argue that a format isn’t open unless the schema itself has been approved by a standards body.

One of the most articulate proponents of the “standards body” side of the debate has been Joe Wilcox over at www.microsoftmonitor.com. Joe's reasoning can be found here and here.

In reading Joe's remarks, however, it's difficult to find a coherent position. At one point, he bases his notion of "open" on the acceptance of a standard by an independent standards body. At another point, he defines "open" based on the extent to which independent software vendors have supported the format with a certain degree of fidelity. Thus, OASIS's OpenDocument XML format is "open," but so is Adobe's PDF.

As a side note, back on September 1, Joe was scratching his head about Massachusetts' inclusion of PDF in their definition of "open." Apparently Joe forgot that he'd done exactly the same thing back in June. To be fair, Joe's reasoning was subtly different from that invoked by the Commonwealth of Massachusetts, but neither line of reasoning coherently justifies excluding Microsoft's use of XML in Office from the "open" rubric.

Joe's second post from last June comes closest to articulating a coherent stance on the subject. In that post, he likens OASIS's OpenDocument format to a simpler, more widely understood idiom than the one adopted by Microsoft Office, albeit within the same XML language. Joe writes:

But, even though the two people agreed on a common language, suppose one starts using geeky engineering jargon the other can't understand. Tough to communicate, right? So the one gives the other a big, fat book of definitions for the jargon--kind of like Microsoft publishing its schemas--so that they can talk. But the other person would have to learn the jargon first. Sure all the jargon is in the book, but wouldn't it just be better to communicate (e.g. be more "open") by speaking the basic language previously agreed on?

I think Joe's reasoning would be sound if Microsoft's addition of "geeky engineering jargon" were merely gratuitous. His reasoning breaks down, however, when we note that the "geeky engineering jargon" in Office's use of XML is necessary to adequately describe the features that are available in Office.

We can see this by altering Joe's analogy. Suppose we aren't talking about "geeky engineering jargon." Suppose, rather, we're talking about the jargon used within an academic field. The jargon in any academic field arises when various academics coin new terms to express various ideas within the field. Economists, for example, talk about IS-LM curves. Lay people haven't a clue what that's about, but, among economists, a great deal of information can be conveyed very succinctly by using the jargon of IS-LM curves.

One important point of the academic jargon analogy is that anybody can extend the vocabulary. No one sits around waiting for some standards body to approve each new term before they're allowed to coin it in some academic paper. Academic jargon is open not only because anyone who is willing to engage in a study of the field is able to understand the lexicon. It's also open because anyone who works in that academic field is able to extend the lexicon.

Moreover, extension of the lexicon is based entirely on voluntary adoption of that lexicon within the field. An academic can coin a new term, but that term won't get adopted into regular usage unless other academics find enough value in the ideas that are expressed by the new terminology.

And, yes, there is a point where the analogy breaks down. Academic jargon doesn't get the same copyright protection that XML schemas get, and no academic field is bifurcated into those who produce new studies and those who only read the new studies the way the software field is bifurcated into vendors and users.

But, I think the difference between Joe's analogy and mine is still instructive in terms of the underlying values that each analogy expresses. Joe's analogy values user choice of equally adept vendors. My analogy values the ability of vendors to extend software to resolve new user problems. I would contend that both values are worth preserving for the benefit of people who use software.

Users do benefit from software commoditization in their ability to choose different vendors and in the ability of offerings from different vendors to interoperate. But this benefit comes at the expense of product differentiation, and users also benefit from product differentiation as vendors strive to solve user problems in new, and more effective, ways.

The ideal solution would be able to accommodate both aspects of "openness." In the world of software, it might not be possible to come up with a solution that balances both values, but I have difficulty imagining one that does a better job of balancing both than the approach we've adopted with Office's use of XML. The schemas are published with a royalty-free license. Anybody is free to use those schemas.

Moreover, the way XML support is implemented in Office, people can extend those schemas. Word 2003 supports custom schemas, and the number of solutions providers who are incorporating Office 2003 into solutions that make use of a number of XML standards relevant to particular vertical industries is growing at an impressive rate.

Lastly, XML, together with the companion XSLT standard, provides a ready tool for translating one idiom into another. Through the use of XSLT stylesheets, for example, it's possible to have Office support OASIS's file format out of the box, albeit with a certain loss of information on the save side.
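To make the idea of translating idioms a little more concrete, here is a minimal sketch. It is written in plain Python with the standard library's `xml.etree.ElementTree` rather than in XSLT itself, so it only illustrates the principle. Every element and attribute name in it (`doc`, `para`, `style`, `revisionId`, and their counterparts) is made up for the example and belongs to neither Office's schemas nor OASIS's format. The sketch renames what the target vocabulary can express and records what it cannot, which is the "loss of information" mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies: these names are illustrative assumptions,
# not taken from any real Office or OASIS schema.
TAG_MAP = {"doc": "text", "para": "p", "run": "span"}
ATTR_MAP = {"style": "class"}


def translate(elem, dropped):
    """Return a copy of elem rewritten in the target vocabulary.

    Attributes with no counterpart in ATTR_MAP are left out of the result
    and recorded in `dropped` as (source tag, attribute name) pairs.
    Tags with no entry in TAG_MAP are passed through unchanged.
    """
    out = ET.Element(TAG_MAP.get(elem.tag, elem.tag))
    for name, value in elem.attrib.items():
        if name in ATTR_MAP:
            out.set(ATTR_MAP[name], value)
        else:
            dropped.append((elem.tag, name))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(translate(child, dropped))
    return out


SOURCE = '<doc><para style="Heading1" revisionId="42">Opening doors</para></doc>'

dropped = []
result = translate(ET.fromstring(SOURCE), dropped)
converted = ET.tostring(result, encoding="unicode")

print(converted)  # <text><p class="Heading1">Opening doors</p></text>
print(dropped)    # [('para', 'revisionId')]
```

The text and the heading style survive the trip, while the made-up `revisionId` attribute has nowhere to go in the simpler vocabulary and is dropped. A real XSLT stylesheet would express the same mapping declaratively, but the trade-off is the same.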

"Abriendo puertas" is Spanish for "I'm opening doors." In an ideal world, we would be "opening doors" for vendors and customers alike, so that both can use common formats and extend them. That is at least what we're trying to do with the new XML formats. The future will tell us how well we've succeeded.

I just hope that the future gets decided by the people who actually have to use the software rather than by either government fiat or pundits who have difficulty arriving at a coherent definition of the word "open".

 

Rick

Currently playing in iTunes: Hablemos El Mismo Idioma by Gloria Estefan