“Directions on Microsoft” Report on Open XML formats

There is a new report out from Directions on Microsoft that discusses the new Open XML formats, and the impact on Office customers. It discusses the benefits that organizations will see from the formats, as well as some deployment recommendations. Here is a link: http://www.directionsonmicrosoft.com/sample/DOMIS/update/2007/01jan/0107nffio2.htm

I liked Rob's overview of the payoffs for Content Management, Public Sector:

The most important benefit of the new file formats will be for software developers and integrators who use Office as part of a larger solution. Because the Office XML formats are documented and accessible through standard APIs and tools, applications other than Office can do tasks such as generating documents from user input and extracting data from documents for business applications such as customer relationship management (CRM) systems. Microsoft itself could eventually benefit: the company's business applications (such as Dynamics CRM) could exploit the formats to extract information from Office documents or annotate them.

It's definitely the case that on top of all the solutions other folks can build, even within Microsoft we'll see other teams building solutions they couldn't have easily done before. The ability to generate or consume rich Office documents in other applications is a huge shift for the market.
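
To make "generate or consume" a bit more concrete, here is a rough sketch in Python that uses nothing but the standard library. It writes a bare-bones WordprocessingML package (a ZIP file with a content types part, a package relationships part, and a main document part) and then reads the paragraph text back out, much as a CRM-style application might. The function names `build_docx` and `extract_paragraphs` and the sample text are made up for illustration, and the package is only about as minimal as I'd expect a consumer to accept; treat it as an assumption-laden example rather than a reference implementation.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Main WordprocessingML namespace used by the document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Content types part: tells a consumer what each part in the package contains.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Package relationships part: points at the main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)


def build_docx(paragraphs):
    """Hypothetical helper: return the bytes of a minimal package, one paragraph per string."""
    body = "".join(
        f'<w:p><w:r><w:t xml:space="preserve">{escape(text)}</w:t></w:r></w:p>'
        for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("[Content_Types].xml", CONTENT_TYPES)
        pkg.writestr("_rels/.rels", RELS)
        pkg.writestr("word/document.xml", document)
    return buf.getvalue()


def extract_paragraphs(docx_bytes):
    """Hypothetical helper: return the plain text of each paragraph in the main document part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as pkg:
        root = ET.fromstring(pkg.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread across the w:t elements of its runs.
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraphs


if __name__ == "__main__":
    data = build_docx(["Customer: Contoso & Sons", "Order total: 42"])
    for line in extract_paragraphs(data):
        print(line)
```

Neither half needs Office installed or any Office-specific library: the container is an ordinary ZIP file and the content is ordinary XML, so any platform's standard tools can do the work.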

- Brian

Comments (8)

  1. JasonG says:

    He seems to be making two basic assertions:

    – oXML "Art page borders" specification is not general enough

    – Various backwards compatibility features are not specific enough

    Additionally, if you follow the link to the "Game of Zendo" page you’ll see many other arguments leveled at "Art Page Borders".  There, you’ll see assertions like:

    – Document format has cultural biases due to type of art in pre-existing border definitions.

    – Specification includes a "clipart collection" which is concerning.

    – He wonders about his right to reproduce these images.

    Right off the bat I see that his arguments conflict with each other.  Considering that the Art page borders in themselves are a backwards compatibility feature he asks for it to be more general.  Then he goes on to list many other backwards compatibility features which are not specific enough in their rendering description.

    Concerning the art page borders, his first assertion (of cultural bias) is purely specious. He says, "gingerbread men for Christmas, pumpkins for Halloween, or images of Cupid for St. Valentines day, or globes which are neatly centered on the United States."  If one reads page 1653, the spec. clearly doesn’t say anything about gingerbread men having anything to do with Christmas.  Neither, on page 1663, does it say anything about pumpkins having anything to do with Halloween.  On page 1646, the description of cupid is void of any mention of St. Valentines day.  Looking at page 1650, there are actually two globes defined, the first (which he might have missed) is centered prominently on Africa and most of Eurasia. The second globe is centered directly on the Caribbean and the continents of North and South America.  Quite clearly, neither one is "neatly centered on the United States."  He neatly misses mentioning cultural icons that are distinctly non-western European/Anglo-American such as a woven braid which, as a cultural icon, has roots in African or Native American culture.

    He argues that the spec essentially contains a clipart collection and is uncertain if he or others may legally reproduce said "clipart collection."  He is not a lawyer, so his concern doesn’t hold any authority in the first place.  Secondly, the example images are part of the spec.  Any reasonable person would deduce that they would fall under the same license as the rest of the document which is quite clear regarding the implementation rights of others.  Of course he is free to hire a lawyer and make a more authoritative argument if the lawyer finds that the images are questionably licensed.   Further, reading the spec., some examples shown are prefaced by "as follows."  The Random House Unabridged Dictionary states that, when used as an adverb, "as" means, "to the same degree, amount, or extent; similarly; equally."  So, an implementer obviously could create their own graphical representation for an enumeration value with this type of description.  Some of the descriptions state that an exact image must be used (see weavingAngles corner description).

    He makes an argument from ignorance that the spec’s XML representation of Art Borders is not general enough to be extensible.  He then gives proof by assertion via an example of circles and squares.  His initial (problematic) circle and square syntax is not like the syntax used for the oXML description of Art Borders.  In actuality, the oXML syntax more closely matches his second (fixed) syntax, where he introduces the triangle shape.  A border enumeration in oXML has many possible values defined currently and I don’t see where in the spec. it says that more can’t be added.  So, apparently, he wants his cake (border should have capability to be anything!), and wants to eat it too (the spec should not specify exact graphical representations).  Absurd!

    He is further concerned that this embedded clipart collection makes the spec bloated and says that art should be stored separately from the document.  This is a straw man argument, because nowhere that I can see does the spec say that these borders are embedded as binary goo in the document itself.  In fact, elsewhere in the spec I believe it specifically says that graphics ARE separate in the package file.  So again, he wants to have his cake and eat it too… …wants the spec to be smaller and not define individual graphical elements… …wants every implementation’s rendering to be exactly the same…  I already made a (hopefully convincing) argument that we could see a hundred different looking apple borders (see page 1632, where the graphical description is defined by "as follows").

    Enough on the art borders!  Let’s refute his arguments for (or is he against?) backwards compatibility elements.  He gives several examples, all of which are deprecated features of old versions of Office products or of competing proprietary software.  I will refute only the footnoteLayoutLikeWW8 element here because the same argument can be applied to any of his other examples.  This element specifies that a rendering of the document should emulate a bug in old versions of Microsoft Word and purposely break the rendering of the document (inappropriately placing footnotes), instead of displaying the content properly.  He objects that this bug is not described and someone even comments on his page that (OMG the sky is falling) medical documents might be affected!  He states that, "If you ignore it, the document will not be formatted correctly."  But actually he’s got it backwards.  If you *DO* implement the element the document will not be formatted correctly.  Just what part of, "This emulation typically involves some and/or all of the footnote being inappropriately placed on the page following the footnote reference," do they not understand?  So what they’re saying is that the user should put up with broken rendering of pages.  This could indeed be bad for a user if they depend on the broken rendering printing a certain way on a pre-printed form for example.  Not implementing this backwards compatibility feature would therefore force the user to re-implement their form to take advantage of the corrected rendering of the footnote.

    Interestingly, I have often seen arguments against Microsoft for maintaining so much backwards compatibility which perpetuates broken output just as described in this feature.

    I love the title of his blog ("An Antic Disposition"), since it’s quite possibly the only accurate thing on the pages.  His writings are indeed Antics, in fact I seriously wonder if the whole thing isn’t in Jest.  From the Random House Unabridged Dictionary on the word "antic":

    "5. fantastic; odd; grotesque: an antic disposition."

    Right above it:

    "4. ludicrous; funny."

  2. Wouter,

    I’m not sure what kind of response is required there. If Rob feels that somehow the specification was built so that it would only work with Microsoft Office, he’s pretty far off base. There are already a number of tools today that work with the formats that have nothing to do with Microsoft Office. Rob’s obviously entitled to his opinion, but from the series of recent blog posts it seems that opinion is primarily an anti-Microsoft view. Not much point trying to argue with someone who’s that much of a zealot.


    Rather than getting into an argument about how well documented those portions are, I’ll instead say that if the legacy document compatibility settings are the main problem with the OpenXML spec, then I think we’re in great shape. Those things are so inconsequential when you think about all the pieces of the spec as a whole.

    They are completely deprecated, and within MS Office we do everything we can to try and turn those off. The only reason we included those in our initial Ecma submission was that we thought people would be upset if we left them out. We didn’t have anyone complain about them until after the TC had finalized its work (and even now those complaints are from a handful of individuals).

    Folks like Rob are in a pretty easy position because the formats are so large. It’s clear that Rob wants to poke holes in the spec, and frankly if Rob wants to find something to complain about, it’s not too hard. He could complain that there is too much in the spec, and if it’s ever removed he could then complain that things were left out and kept "hidden." That’s why I try to avoid Rob’s blog. It’s like listening to someone complain about the playcalling at the end of a well played football game. Nothing in life goes 100% perfectly, so there will always be something to complain about.


    Thanks for taking the time to write up your thoughts. The art borders complaint is pretty funny to me as well, although I am sorry that we weren’t able to do something to address those concerns. I brought those up last spring as an example of how deep we were going with the documentation. The point of bringing it up is that the art borders are such a small (and in my own opinion pretty "lame") feature that only a small percentage of folks use. But, rather than not including them in the spec and harming those customers (however small of a percentage as they may be), we included that base level of functionality and documented it 100%.


  3. Francis says:

    Interesting comment from Rob Weir’s post:

    "I wouldn’t be surprised if those ‘too complicated to explain the behavior’ tags listed in the article are essentially an enumeration of all of the opaque, undocumented, legacy routines that live on in MS Office."

    This is what I have suspected all along. I understand why these options are in the standard. Yet including them in the standard means that the legacy code to support them can never be retired.

    Which means that Word may never get a new engine. Which is sad: in many regards, WPF and IE are now better at text and layout–what should be Word’s core competence–than Word itself is!

  4. Actually that’s not really the case Francis. Any application is free to decide whether or not they want to support that portion of the standard. They would still be fully conformant as long as they don’t claim to support those pieces. So if at some point Word decides to no longer support those old layout behaviors, it would just need to say that those portions of the standard are no longer supported by Word.

    Maybe the Ecma TC would even decide that in a future version of the standard those portions are no longer supported. If people are using them, then they probably wouldn’t, but if at some point they are no longer used I can imagine that decision being made.


  5. Francis says:

    One question: are all of the settings deprecated? Some of them replicate buggy or bad behavior, but others are very useful. E.G. in Word: Don’t expand character spaces on a line that ends with SHIFT+RETURN; Suppress extra line spacing at top/bottom of page; Suppress Space Before after a hard page or column break. I always turn these options on for new documents.

    Separation between mere compatibility options (that preserve otherwise useless old behavior) and layout options would probably help clear the air on this debate, as well as simplify the overgrown "compatibility options" dialogs in Word.

  6. hAl says:

    As I have tried to comment on Rob Weir’s blog as well:

    Neither OpenDocument nor Office Open XML is an exact rendering spec. That means documents will never be interpreted exactly the same when based only on the specs in the standard.

    If you need exact replicas of documents you need a spec like XPS or PDF.

    This also means that it would be virtually impossible to replicate things like buggy rendering issues from older Office versions. That would require an exact specification which these Office standards are not.

    To be honest, both standards are completely useless for rendering purposes without reference implementation examples. For Office Open XML the biggest advantage is that it has only one main implementation, which in itself serves as a reference implementation. For OpenDocument, OOo does a similar job, but with more applications using the format that might become a lot more complex: OOo does not have nearly the same userbase as MS Office, and Google, for instance, could easily create an application using OpenDocument that in a short period gains a bigger userbase and creates more documents.
