Independent Implementations of Open XML


One element that keeps coming up in the comments section to my blog is the idea of the quality of the specification. This argument is tied to the length of the specification and what that means for reviewers of the documents. Both of these have been huffed-and-puffed at for quite some time on various community blogs, but there is a really important test for these two items. Can successful independent implementations be built? The answer is an unqualified yes.

As I, and a number of others, have pointed out over time, the original document was significantly expanded as the members of TC45 asked for greater and greater detail. This also led to a great deal of reference information being included in the specification. To review old facts: of the 6,000 page, 5-part spec that is Ecma 376, 5,756 pages are in Part 4, “Markup Language Reference” which, “Defines every element and attribute, the hierarchy of parent/child relationships for elements, and additional semantics as appropriate” among other things (read about it here). Why is this important? Because this data is exceedingly helpful when an implementer begins building real software for commercial-grade solutions.

As for the quality of the document, it could have been cleaner from an editorial perspective – but given the volume of implementations we are seeing this is hardly a stumbling block for those looking to build real-world solutions. Granted, the ballot resolution process is going to clean a bunch of this up which is undoubtedly a good thing as well.

So how about those independent implementations? I’m not going to go crazy and try to list them all here. I would recommend checking out the OpenXMLCommunity list of projects and if you are interested in doing some work yourself to build an implementation – check out OpenXMLDeveloper.org. Also, my colleagues in Germany have provided me with this link to more than 160 projects implementing Open XML (but I think it best if you know how to read German for that one).

Here are a few: 

At some point the question needs to be raised about the desire for standards bodies to have work items that represent what is current and valuable in the marketplace. There is still hard work to be done on Open XML within the formalized context of JTC-1, and that is good. But, there is no doubt about the fact that this compelling technology is enjoying an explosion of interest in the marketplace completely separate from the sale of Microsoft Office. That is a very practical measure of the quality of this specification.

Comments (43)

  1. Jason,

    I think that where the confusion comes in is at two levels: first, when most people say "there are no other implementations," what I think that they mean is that there is no other competing office suite based upon OOXML, although some suites (such as Novell’s ODF implementation) will be able to save to OOXML.

    The second is whether these implementations are stand-alone implementations that are using OOXML as a preferred tool, or whether they are implementing it solely because they need to interact with Microsoft products.

    When lists are posted of products that implement OOXML without explaining the how and why, I think that it leaves people still puzzled, particularly where products such as the iPhone are mentioned, because it isn’t intuitively obvious how this would be done, or why.  

    I think what would be helpful would be if you, or someone else, could explain the how and why.  For example, my assumption (which may be all wrong) is that most of the implementations would be to allow other products to interact with Office, making their implementation more in the nature of utilizing an interface standard rather than as a foundation for their product chosen for OOXML’s intrinsic merit.

    If the above list is as good as any to use for this purpose, it would be very interesting to explain, for each product:

    1.  Is this implementation for the purpose of working with Office (for example, in the Datawatch product, is the implementation only in order to allow mining of data saved in Office 2007 databases?)

    2.  If not, how is OOXML being used, and to what purpose?

    On a different note, it would be interesting to know whether the product could or could not have been based on ODF, and if not, whether the reason is other than because the product needs to work with Office 2007.

    Andy Updegrove

  2. Jason – Do you have any sense of how many of these products work with Open XML that don’t work with the older binary XLS format?  I am not trying to catch you out or anything, just understand how much it has expanded the application space to have an XML spec and how much it is just another format for companies to support who already supported the binary format. With any luck, there should be whole new applications possible, but I don’t know whether that is a current or future thing.  – Ben

  3. Jody says:

    Andy: I can respond for Gnumeric.  I’m implementing it primarily to interact with MS Office.  The same reasoning applies to ODF for OO.o.   If I had to choose between the two formats today, OOX’s spreadsheetML would win because of its performance characteristics (shared strings and formulas)
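
For readers unfamiliar with the "shared strings" idea mentioned above, here is a minimal sketch of the general technique: each distinct string is stored once in a table, and cells hold an index into that table rather than the text itself. The function names and sample data are made up for illustration, and this is not Gnumeric's code or the actual SpreadsheetML markup.

```python
def build_shared_strings(rows):
    """Replace each cell's text with an index into a table of unique strings."""
    table = []      # unique strings, in first-seen order
    index_of = {}   # string -> its position in the table
    encoded = []
    for row in rows:
        encoded_row = []
        for text in row:
            if text not in index_of:
                index_of[text] = len(table)
                table.append(text)
            encoded_row.append(index_of[text])
        encoded.append(encoded_row)
    return table, encoded


def decode(table, encoded):
    """Rebuild the original rows by looking each index up in the table."""
    return [[table[i] for i in row] for row in encoded]


# Hypothetical sample sheet with repeated labels.
rows = [["North", "Widget"], ["South", "Widget"], ["North", "Gadget"]]
table, encoded = build_shared_strings(rows)
```

Repeated labels such as "Widget" are stored only once, which is presumably where the size and load-time benefit comes from on sheets with many repeated values.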

  4. Simone says:

    Andy: I’ll respond for Records For Living.  We’re in the final stages of implementing extensions to our personal health record – HealthFrame – to use OpenXML as a container of personal health record data.

    1. and 2. Our implementation is not done using OpenXML for the sole purpose of integrating with MS Office – in fact most of our testing checks for interoperability with NeoOffice on the Mac and soon we will be testing on PDAs.  We’re actually using OpenXML to expand the portability and accessibility of an individual’s clinical records.

    As to whether we could have used OpenDocument – maybe, but from what we understand not to the fullest of its capabilities.  To be clear, HealthFrame will embed full, custom XML documents inside standard medical forms (both for input and reporting).  These XML documents will comply with the industry standard ASTM Continuity of Care Record (CCR).

    This is a powerful idea that allows for humans to view aspects of the patient record, while allowing electronic business processes to access the CCR record.  

    It’s our understanding that with OpenXML we are able to include a full, valid CCR document, instead of extending portions of the document elements.  

    Our solution is not yet listed in the OpenXML Community because we’re a few weeks away from a complete announcement.  I thought the question was interesting and perhaps our use case somewhat a propos for the discussion of what it means to ‘implement OpenXML’.

    Ben: I don’t know if this could be implemented in the old binary formats.  But the fact that Records For Living could access the OpenXML specification and implement these extensions made it that much easier and more attractive.
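
To make the "part" idea in this comment more concrete, here is a rough sketch of a ZIP container that carries a form document and a separate, complete custom XML document side by side. It is not HealthFrame's implementation and does not produce a valid Open XML package (a real package also needs content-type and relationship parts, which are omitted here). The part names and XML payloads are placeholders chosen for illustration, and the record is not real CCR markup.

```python
import io
import zipfile

# Illustrative part names; treat them as assumptions, not a specification.
FORM_PART = "word/document.xml"
RECORD_PART = "customXml/item1.xml"


def build_package(form_xml: str, record_xml: str) -> bytes:
    """Write the form and the custom record as separate parts of one ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr(FORM_PART, form_xml)
        package.writestr(RECORD_PART, record_xml)
    return buffer.getvalue()


def read_record(package_bytes: bytes) -> str:
    """Pull the custom record back out without touching the form part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.read(RECORD_PART).decode("utf-8")


# Placeholder payloads, not real WordprocessingML or CCR content.
form = "<form><field name='patient'/></form>"
record = "<record><patient>example</patient></record>"
package_bytes = build_package(form, record)
```

Because the record travels as its own part, a business process can read it with an ordinary XML parser and never needs to interpret the human-readable form.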

  5. A.Falk says:

    Andy: I can respond for Altova’s XMLSpy implementation – Re 1) yes, the main purpose is to interoperate with Office 2007. But I do not see that as a negative factor. On the contrary, the reality of life is that the majority of Office documents are being created in Msft Office, and I much prefer them to be OOXML than to be the old binary DOC and XLS formats. I have yet to receive a single ODF document in an e-mail from anybody, whereas I already receive tons of OOXML files. I also second Jody’s comment – OOXML wins in terms of performance.

    Ben: you asked how many of these products didn’t support the binary XLS format – XMLSpy didn’t. Being an XML tool we only support the new OOXML formats.

    Jason: while I totally appreciate you listing Altova XMLSpy as an example of an independent implementation of OOXML, I must admit it is a bit misleading to cast it that way, because our product is not an Office suite, but rather an XML development tool. Our goal in supporting OOXML is to enable developers to (a) reuse data that was created by Office suites through XSLT 2.0 transformations, through application of XQuery, etc. and (b) help developers create OOXML documents. As such we’ve also added OOXML support to our AltovaXML processing engine (which 3rd party developers can use royalty-free) so that you can directly use an OOXML document in your XSLT 1.0 or 2.0 transformations.

  6. Andrew Sayers says:

    I think Andy’s distinction between applications "based on" Office Open XML, and those that merely support it, is a very interesting one, as it hints at a level of support beyond implementation details.  MS Office uses programming techniques that never gained much popularity outside of Microsoft, and Andy’s comment makes me wonder how many complaints about Office Open XML are trying to say "nobody outside of Microsoft groks the format", but somehow wind up making a less abstract claim, like nobody understanding the format or nobody implementing it.  That theory would explain why some people become frustrated when the evidence doesn’t back up their assertions: they know what they meant, so arguing against an overly literal interpretation of their words would just strike them as childish.  Obviously Andy’s comment isn’t an example of this – lawyers are trained not to make that sort of language error – but it’s worth considering the next time somebody brings the issue up on a blog.

    One unusual programming technique used by Microsoft Office is to represent documents internally as a stream rather than a tree, and I’d like to ask the developers present for their opinions about the general strengths and weaknesses of each approach.  For example, would you agree that it’s easy to create a special-purpose application that reads or writes data in a stream-based format, but that an application must understand the entire format if it’s going to usefully modify data in a stream?

    – Andrew

  7. Andrew –

    An interesting thing to note in this regard is that IBM’s Lotus Notes uses an internal rich text format which is also stream based rather than hierarchically based.  I have been working with it for years (1.4 million licensed clients for our Midas Rich Text LSX, for example), and it has its advantages and disadvantages.  Some of the techniques which we use for rich text are much harder with ODF’s hierarchical model, although some other techniques are easier.  For example, copying a chart from one ODF container to another is quite easy, but search and replace or other mass modifications are easier and more efficient with a stream based model that does not rely so extensively on either recursion or simulated recursion.

    That said, I imagine that for many developers used to the hierarchical nature of most XML formats, it is going to take some getting used to to use Open XML, and it may well be one of the things developers don’t like but can’t quite describe.

    – Ben Langhinrichs

  8. Andy,

    It looks like a few of the answers here are from the horse’s mouth, so I’ll add to that.

    —1.  Is this implementation for the purpose of working with Office (for example, in the Datawatch product, is the implementation only in order to allow mining of data saved in Office 2007 databases?)

    Just to clarify here, in Monarch (the Datawatch product) we read and write a variety of formats, Fixed Text (the primary input, in the form of reports), HTML, PDF, DB, DBF, MDB, XLS, XLSX and delimited text, with Excel being the most common target output application.

    Also remember that reading xls does not always mean Office created xls.  Many applications, mostly financial in nature, use xls as the export format, as it is the richest and best suited format to represent this data.

    There are xls engines written in-house by companies in their own software, embeddable components for developers (such as the now infamous Stephane Rodriguez’s xlsgen and SoftArtisans ExcelWriter) as well as developer components from Microsoft that allow one to create xls files, which constitutes a large amount of the xls files one might need to consume.

    For Monarch, we decided to support OpenXML for a number of reasons;

    Export to a document format that has an open specification, to broaden the number of applications that can potentially consume our richest output, from a metadata point of view, which was previously xls.  Our output is often going to be spreadsheet-like, so our main efforts are focused there.  The concerns of organizations such as the Commonwealth of Massachusetts about proprietary formats, also made choosing an open format attractive.

    Simplify the development process for our Excel reading and writing.  Compared to BIFF, OpenXML is a real joy.  In addition to that, what’s not to like about having detailed specs on ANY file format you have to support that you don’t own up on the web and impossible for one vendor to change unilaterally?

    Learn about OpenXML in general, using SpreadsheetML as an introduction, which has a massive ROI for our product development, in order to see if we could then support output to WordProcessingML, which would have been too great an incremental development effort in the past with the legacy formats.

    Create a basis to move forward with functionality not previously possible in other formats such as the ability to add in our own XML documents or other bits in the package, which could then travel around and add value through the lifetime of the documents.  The new format gives us more scope for enhancing the richness of our exports.

    Support the latest version of Excel, and give our customers the benefits that exist when using the latest format.  See David Gainer’s blog for more info on that.

    So to sum up, currently, the benefit is going to be the greatest with Office, since that is the most popular software that our users are likely to use in conjunction with our product, however, the risk we took in implementing OpenXML on a tight deadline, when we could have played it safe, was also to take advantage of being able to feed other applications with our richest output and take a step towards opening up a bunch of new feature development opportunities.

    Bear in mind, that we had a firm release date for the software, which meant our goals for 9.0 were to support the same functionality as we had for the legacy Excel format in OpenXML.  

    As we progress, then more functionality specific to OpenXML will filter through.

    —2.  If not, how is OOXML being used, and to what purpose?

    On this point, as above, our use of OpenXML is primarily publishing, so until more applications that our users would target support the standard, the current purpose remains serving our Excel users.  At the point where more applications support the standard, then we get an import source and an export target for free, so to speak.

    —On a different note, It would be interesting to know whether the product could or could not have been based on ODF, and if not, whether the reason is other than because the product needs to work with Office 2007.

    Yes, it could have imported and exported ODF.  However, in the early specification phase for Monarch 9, we sent out a survey to our users, which specifically mentioned ODF support, with some accompanying information – I believe we mentioned it in conjunction with OpenOffice, to give it some context.

    Note that at the time, we had not heard about OpenXML, so this was not an either/or scenario, just trying to determine whether we should add a new format or not.  

    There was no interest from our users.  In the future, that may well change, and we will duly implement ODF.  Until that becomes a decision we can make on a sound financial basis, we can’t expend development time on it.

    Gareth Horton

  9. Andy says:

    simone: "Our implementation is not done using OpenXML for the sole purpose of integrating with MS Office – in fact most of our testing checks for interoperability with NeoOffice on the Mac and soon we will be testing on PDAs.  We’re actually using OpenXML to expand the portability and accessibility of an individual’s clinical records."

    You’re kidding us. This is a particular problem I have with you guys, you can’t talk with a straight face because the case does not hold. OOXML as it is cannot survive independent review. Office is a market leader, and everyone knows that.

    Anyway, the question is: Why not a SINGLE standard format, and what does OOXML do that the existing international standard can’t? What are these surplus features?

  10. I’d like to thank everyone for taking the time to respond to my questions – it’s appreciated, and I’ve found the answers to be quite instructive and interesting (by the way: I’m not the "Andy" that made the last comment added above).

    Here’s what I’m taking away from the responses above.  Feel free to correct me if you feel that I’ve gone astray:

    1.  There is quite a variety of ways that ISVs have found to work with OOXML, and some aspects of it that some like or can use more than others.  

    2.  My "either this way or that way" assumed alternative question is obviously much too simplistic, and may not be a very useful way of asking the question at all.

    3.  On the other hand, answers that are meaningful for ISVs to give are very hard for a non-technical reader (such as me) to fully appreciate.  For a non-programmer, the forest is hard enough to get a handle on, let alone the relative merits of hierarchies versus trees (sorry, couldn’t help myself).

    My net takeaway is that trying to accurately describe what OOXML means for ISVs to a non-technical audience is a more difficult task than I would have imagined.

    I expect that I’ll do a short blog entry on that topic, and point readers back to this blog entry, so that those that are more technically adept than I am and that are interested can continue the dialogue.

    Thanks, Jason, for hosting this thread.

    –  Andy Updegrove

  11. Thanks to all of you who have taken the time to post your answers to my questions – they were quite interesting and informative.  I’m not, by the way, the "Andy" that made the last comment above (it seems like this thread is of unusual interest to Andys and Andrews).  

    As a non-technical person, what I’m taking away from this thread is the following:

    1.  My initial question is too simplistic.  It seems that you can’t divide OOXML implementations into just those that are for the purpose of working with Office, or data stored in Office, and those that implement OOXML for their own stand-alone purposes, although that does work for several of the products.  

    It seems that the question does work for Gnumeric and Altova (the answer for each is pretty much, "the former").  In the case of Records for Living, though, OOXML will be used to create documents for use "outside" the application, as it were (that’s probably more metaphorically than technically accurate).  And for Datawatch (an old client of mine, as it happens, back in the early 1980s, when they were doing TEMPEST technology), the answer is also more complex.

    2.  Andrew and Ben also take the question to a level of technical detail that is beyond my competence; it’s hard enough for me to see the forest here, let alone separate the technical hierarchies from the trees (sorry, couldn’t help myself there).  

    3.  OOXML has some aspects that some find attractive, and preferable to earlier Microsoft technology.

    4.  As between ODF and OOXML, the question is, from a business perspective, somewhat moot, given the minimal market penetration of OOXML to date.

    5.  It doesn’t appear that any of these products are "built on" OOXML, in the original sense of my question.  That’s not too surprising, given that it only came out of Ecma last year, and still isn’t final in ISO/IEC JTC1.  That makes the question somewhat premature, although given the installed base of Office users and the anticipated level of upgrades, early implementation in the manner described in this thread has obviously been seen to be a smart business move.

    If I’m off base in any of the above interpretations, please let me know.  I’ll probably add a brief blog entry in the next several days at my blog, http://www.consortiuminfo.org/standardsblog and point people back here, so that those that are more technically proficient than I am can add their questions and thoughts.

    Jason, I appreciate your hosting this thread, and look forward to any thoughts that you might add.  It seems to me that the topic of OOXML implementations has received little in-depth description to date, and I think that it would serve the ongoing dialogue well for people to have as much accurate information to rely on as possible to inform their opinions.

    –  Andy Updegrove

    [This is a longer repost of an entry I did an hour ago that didn’t appear; the one I posted at the top of this thread appeared almost immediately, so I wasn’t sure if something glitched this time around]

  12. Simone says:

    Andy,

    I thought I had made the point clear as to why OpenXML and not Open Document Format – at least at this point in our implementation efforts: custom schema support – in particular ASTM CCR.

    Even though ODF supports custom schema extensibility, through XForms, it is actually harder to implement than the OpenXML-based solution.  Consider that the XForm-based solution requires additional parsing/processing that the OpenXML ‘part’ approach does not.

    When companies choose one standard over another, it is often important that factors such as ease of implementation are taken into consideration, as they were in our case.  BTW, the implementation was actually done in Java…

  13. hAl says:

    I think Andy is a bit disappointed about the Office Open XML implementors actually responding and all firmly supporting the Office Open XML format choice.

  14. Sam Hiser says:

    Jason-

    In other words, there are none.

  15. jasonmatusow says:

    Wow, what a great thread, folks. Mr. Updegrove (to keep the two Andys on this thread separate), thanks for kicking off such a good discussion. All of the implementers – also thanks for the thoughtful comments. Great stuff all around.

    Mr. Updegrove – I think you hit the nail on the head with your comment about the complexity of this topic. I’m glad to see you and I can find at least a little common ground. The conversation we had in Geneva seems a long time ago. I hope we can get together again soon. Also, the glitch getting your comments up was me being a busy Dad this weekend and not approving the comments quickly enough. Sorry about that.

    A. Falk – point taken and I will be more accurate in speaking of your work. I do think though that your work shows an important fact. Open XML will have uses for more than just office automation app producers. Which does get to a point that Mr. Updegrove was asking about which is the relationship of Open XML implementation work to that of the MS Office products. I am going to write more about that as a top-level post rather than here in comments.

    Ben – I’d love to understand your question more. One of the tenets of the Open XML standardization was that the spec would support backward compatibility. That said, I think what you are seeing on this thread is people doing app work with this format that doesn’t have to do with the old formats. In other words, there is utility to this beyond the concern raised that this would all just be about MS Office. I’ll make this point again, but the fact is MS Office has created an addressable market that I think many ISVs will seek to target. BUT – I think we are going to see a plethora of solutions on Open XML that will have nothing to do with MS Office as well. (Evidence of this is already showing.)

    Andrew Sayers – your question about the developer-side of things might do better on Brian Jones’ blog as he has the technical acumen to participate in that answer as compared to me. (I am certain other readers here have the tech chops too – but it may be a better conversation there).

    I am on a long flight all day Sunday – but early next week I will weigh in with some thoughts based on this conversation.

    Thanks all!  Seriously – I really appreciate the quality of this thread.

    Jason

  16. Andrew Sayers says:

    First, my apologies to Jason and Andy Updegrove for getting too technical.  Brian’s blogged productively about Microsoft’s position on the technicalities, and my comment was actually a continuation from a thread on his blog – it happens to be a slow week over there, and I couldn’t resist the opportunity to ask the implementers that happened to be here.  I agree with Andy that it’s very difficult to explain answers meaningfully at this stage – we’re starting on a whole new generation of document formats, so programmers are having to go right back to first principles in order to make sense of what’s happening.  If I knew what the world of document formats would look like in 10 years time, I could give you a very simple, useful explanation of the technicalities.  Until my time machine arrives though, I’ll try to interpret as best I can (and do tell me if I’m getting too technical again).

    Ultimately, any technical system lives or dies based on whether it lets people do what they want to do.  My favourite example of this is Roman vs. Arabic numerals.  Roman numerals are prettier, but doing maths in Roman is really hard (try multiplying XVI by XIV in your head without translating into Arabic numerals).  Since most uses of numbers involve calculation rather than display, Roman numerals have been almost completely displaced in the market, except for those few niches where you just need an attractive way of displaying a number.  Therefore, when assessing any new technology, two of the most important questions to ask are "what do people want to do with it?" and "what does it make easy, what does it make hard?".

    Until recently, the main things that people wanted to do with office documents were to create them by hand, send them to other people to edit, and eventually print them.  But as Jason has explained to me, people are now starting to create and edit documents automatically, to search collections of documents, and people like Simone are creating strange new hybrids that don’t even have a name yet.  Since it’s too early for anyone to spot the trends in what people will want to do with documents in the future, it’s important to have a good understanding of what different formats make easy and hard, so that we’re ready when those trends become easier to spot.

    I found Ben’s comment really informative in terms of what sorts of programs are easy and hard to write with different formats.  I’ll try to explain my understanding of his comment in a minute, but if you’re really not interested in the details, my opinion is that most jobs will be possible to do in either format, but that ODF will be better for jobs that involve picking the structure out of a document (like creating a table of contents) but Office Open XML will be better for jobs that involve picking out arbitrary information (like creating a glossary).  If you’re happy to believe this unsubstantiated claim from some guy on a blog, you can stop reading now.

    A good way of thinking about the internal structure of a document format is to imagine that ODF is structured like a legal contract, whereas Office Open XML is structured like the assembly instructions for a model aircraft.

    A legal contract is drawn up in a very hierarchical way, with sections containing subsections containing clauses.  Computer scientists like to describe that sort of structure as a "tree", where the leaves of the tree are individual clauses, the branches immediately before the leaves are subsections, the branches before that are sections, and so on.  Each clause in a contract is contained entirely within one subsection, and each subsection is contained entirely within one section, meaning that you can refer to "clause 1.2.3" and can copy "subsection 4.5" verbatim from one contract to another.  But if you just want to *read* the contract, you have to stop thinking about the structure and pretend that it’s just awkwardly written prose.

    Assembly instructions for model aircraft are much more linear than contracts.  Instruction 1 might be "Pick up part A", instruction 2 might be "pick up wood glue", instruction 3 "apply glue to part A", and instruction 4 "affix part A to part B".  Computer scientists like to describe this sort of a structure as a "stream", because your attention should flow from one instruction to the next to the next.  This is much easier to read than a legal contract, but it doesn’t let you make a bigger plane by inserting the instructions for one model into the middle of the instructions for another model – you’d end up using the wrong glue and affixing the wrong parts in the wrong order.

    In an ODF document, subsection 4.5 of a document might describe a chart, where clause 4.5.1 says "this is a pie chart with a black boundary", clause 4.5.2 says "the first slice is blue and covers 30 degrees of the pie", and so on.  In an Office Open XML document, instruction 450 might be "draw in black from now on", then instruction 451 might be "draw a circle", followed by instruction 452 "draw in blue from now on", then 453 "draw 30 degrees of a circle inside the current object", and so on.

    Copying a chart between two ODF documents is quite simple: find the relevant subsection, copy and paste.  Copying a chart from one Office Open XML document to another requires you to copy and paste the chart instructions, then add all sorts of peripheral instructions before and after it, to make sure that you don’t wind up using wood glue on metal parts.  On the other hand, finding every instance of the word "blue" in an ODF document means teaching the computer to read through the document and ignore those sections that talk about "blue" in the middle of a pie chart, whereas finding every "blue" in an Office Open XML document means finding every instance of the instruction "print word W to the screen" and seeing if word W is "blue".

    In conclusion, ODF and Office Open XML lay documents out in radically different ways, both of which make life easier for some people and harder for others.  This is just one of the myriad of technical differences that will affect which format implementers prefer for different jobs.
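
As a rough illustration of the search example above, here is a minimal sketch that models one tiny document both ways and looks for the word "blue" in each. Like the chart details in the comment itself, the node kinds and instruction names are invented for the purpose of explanation; neither structure resembles real ODF or Office Open XML markup.

```python
# Tree model: nested nodes, with text sitting at different depths.
tree = {
    "kind": "section",
    "children": [
        {"kind": "paragraph", "text": "The sky is blue today."},
        {"kind": "chart", "children": [
            {"kind": "slice", "text": "blue"},  # chart detail, not prose
        ]},
        {"kind": "subsection", "children": [
            {"kind": "paragraph", "text": "A blue whale."},
        ]},
    ],
}

# Stream model: a flat list of instructions, read from start to finish.
stream = [
    ("print", "The sky is blue today."),
    ("set_colour", "blue"),
    ("draw_arc", 30),
    ("print", "A blue whale."),
]


def find_in_tree(node, word, skip_kinds=("chart",)):
    """Walk the tree recursively, skipping subtrees that are not prose."""
    if node["kind"] in skip_kinds:
        return []
    hits = []
    if word in node.get("text", ""):
        hits.append(node["text"])
    for child in node.get("children", []):
        hits.extend(find_in_tree(child, word, skip_kinds))
    return hits


def find_in_stream(instructions, word):
    """Scan the flat list once, looking only at 'print' instructions."""
    return [arg for op, arg in instructions if op == "print" and word in arg]
```

Both searches return the same two sentences, but the tree version needs recursion plus a rule about which subtrees to ignore, while the stream version is a single pass over a list. Copying the chart would tip the balance the other way: in the tree it is one self-contained node, whereas in the stream the relevant instructions depend on whatever state was set before them.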

    – Andrew

  17. Andrew Sayers says:

    Yikes, that sounded far too authoritative.  For the record, my previous comment was intended to give a general understanding of how the "stream vs. tree" argument affects things, not to be a technically accurate introduction to file formats.  For example, I made up the details of how charts are described for purposes of explanation, and it’s very unlikely that misusing Office Open XML will cause wood glue to get into your computer.

    If you’re a programmer, and are irritated by my weak grasp of the details, I’d ask you submit your complaints in the form of a patch rather than a flame.

    – Andrew

  18. jasonmatusow says:

    Andrew –

    Thanks for the great comment. From what I understand there were fundamentally different approaches taken in the design of the formats. These design decisions resulted in advantages and disadvantages for each.

    The important thing is that neither prevents the other format from functioning (no contradiction). Instead, they co-exist quite easily, especially now that the translation tools are maturing at such a rapid rate.

    Thanks

    Jason

  19. carlos says:

    Jason Matusow said:

    "Thanks for the great comment. From what I understand there were fundamentally different approaches taken in the design of the formats. These design decisions resulted in advantages and disadvantages for each."

    The design of OOXML was made more than 10 years ago (basically, OOXML is the legacy binary formats encoded in XML).

  20. jasonmatusow says:

    Carlos – I don’t agree with that. Issues such as file size, performance, custom schema support, etc. etc. were all part of the decision. And yes, Open XML is part of a decade of XML file format development effort. I’m proud of the fact that our guys were looking that far ahead when they started building an XML-based format.

    Jason

  21. Mikael Andersson says:

    The only interesting thing is, can someone else successfully render and save OOXML documents in a competing office suite? From what I have understood, that is the biggest concern with OOXML of all. Freedom to choose a vendor is what most people strive for, and if OOXML doesn’t give that, chances are many vendors outside the Microsoft partner sphere will turn to ODF instead, where it’s very easy to interop at every level.

  22. Terry Lechecul says:

    I wonder sometimes if developers live in closed universe where they are locked out and are clueless as to why people react a certain way.

    Anyone who is over 18 knows the history of Microsoft, their predatory habits and the format lock-in which they have enjoyed.

    Can you at least understand WHY people are leery of anything you do?

    I know we won’t be able to move forward unless we learn to trust each other, but there is going to have to be proof that things have changed.

    The OOXML process showed that the politicking hasn’t stopped.

    The ‘open source’ license has all the signs of a trojan horse.

    I’m sorry, but there were chances recently for Microsoft to show they were ready to cooperate with the rest of the planet and they were once again blown (forget that the guy who signs the paychecks has never once even hinted at a change of heart).

    The company I worked for years ago was a victim of Microsoft’s format lock in so figure me for being cynical but I have every right to be.

    And unfortunately, no matter how good your specs really are, as long as Microsoft holds the keys, I will not trust that good technology won’t be used to do what Microsoft always does.

  23. hAl says:

    @Mikael Andersson

    You suggest it is easy to interop with ODF.

    Weird, as after more than two years we are still waiting for either full or interoperable implementations of ODF.

    In fact, I have seen claims that ODF supports SVG, and whilst that seems incorrect in itself, an ODF subformat like SVG still does not have fully interoperable implementations.

    It ain’t easy to implement a full office suite. And just using ODF does not make it easy!

  24. D. Suse says:

    Terry L.:

    I couldn’t agree more.  After reading the glowing and detailed technical praise offered by the OOXML supporters who seem to have gathered here,  a question still remains in my mind: why would a corporation which has been repeatedly convicted for monopolistic and anti-competitive behaviour choose a non-cooperative role here, aggressively petitioning for (and allegedly purchasing) votes in support of a document format which is not based on or compatible with an existing ISO standard?   Instead of reinventing the wheel, participating with revising and contributing to the existing ISO standard ODF format would go a long way toward showing that *the leopard may have changed its spots*.

    The wisdom of this behaviour becomes even more puzzling in light of the recent EU anti-trust decision, charging Microsoft with anticompetitive and monopolistic business practices.  Why not commit a realistic effort to at least APPEAR to be cooperating with the international community?  Does Microsoft actually respect anyone or anything besides Microsoft?  In addition to business leaders, politicians can also be bought, but most also like to *stay* politicians, and this does not happen if people feel their best interests have been sold down the river for *special favours* from Microsoft or anyone else. They can also go to jail for things like this; not a great retirement option IMHO.  It is not in people’s best interest to have competition crushed, and be forced to deal with a monopolist.  It is not in people’s best interest to be locked in to paying hundreds of dollars for a product because free or lower cost alternatives have been made *incompatible* and incapable of functioning with other heavily-promoted document formats.   The poor of the world have equal worth to those rich who *have* the hundreds of dollars that Microsoft thirsts for.

    OOXML may be to the liking of the supporters reading this blog, but blindly investing your resources and energy in supporting such an uncertain possibility, even if paid for this in the short term by Microsoft, could very well wind up as a *grave* error in the final analysis (pun fully intended).  Like it or not, this is also a political, economic, and social issue, regardless of the supposed technical merits of *brand M* as compared with *brand O*.  We are talking about control over an international standard.  My best bet is that Microsoft’s continued politically incorrect, heavy-handed method of dealing with this situation will result in their precious OOXML standard receiving final rejection as an ISO standard.

  25. Elliot says:

    First, sorry for my English.

    – Datawatch = Microsoft Managed ISV Partner

    – Apple = It voted yes for OOXML

    – Mindjet = Microsoft Gold Certified Partner

    – Gnumeric = Gnome = Miguel De Icaza = Mono = Novell = Microsoft Partner

    – Dataviz = Microsoft Gold Certified Partner (Palm? dropped the Foleo Linux device unexpectedly)

    – Intergen = Microsoft Gold Certified Partner

    Do you see a pattern in here?

  26. btimby says:

    Andrew Sayers:

    I generally find streams harder to deal with. Take your example of finding all blue text. For a tree (in XML), I can use a simple XQuery, in English:

    find all text markup contained within brush markup of color blue.

    For a stream, I must use a state engine.

    parse element

     is it color markup?

       yes, store current color

     is it text markup?

       yes, is current color blue?

         yes, we found blue text!

    Now, extrapolate this out to parsing many different types of items and you will find that using an XML DOM for querying the document is superior to maintaining a state engine for parsing a stream. While DOMs do not perform as well as stream parsers in general, the point is that your code is easier to write and maintain, which is almost always a better goal to have. If performance is your goal, then you always have the choice of treating a well-formed hierarchical XML document as a stream, implementing your parser using a state engine, giving you full freedom over implementation.

    I did some work with the Apple iPod playlist format. It is stream-based. I ranted on my blog how much I hated that crappy way of writing XML.

    I have not looked at (and likely will not look at, see Terry Lechecul’s comment above for my motivation) OOXML, so I cannot comment on it. I can only give you my feedback on your statements regarding stream vs. hierarchy consumption.
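
To make the comparison in the comment above concrete, here is a minimal Python sketch of both approaches. The markup is invented purely for illustration (the `brush`, `color` and `text` element names are assumptions, and neither ODF nor Open XML looks like this), and ElementTree's limited XPath support stands in for XQuery.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree-shaped markup: text is nested inside the brush that colors it.
TREE_DOC = (
    "<doc>"
    "<brush color='blue'><text>sky</text></brush>"
    "<brush color='red'><text>rose</text></brush>"
    "<brush color='blue'><text>sea</text></brush>"
    "</doc>"
)

# Hypothetical stream-shaped markup: a color instruction applies to whatever follows it.
STREAM_DOC = (
    "<doc>"
    "<color value='blue'/><text>sky</text>"
    "<color value='red'/><text>rose</text><text>brick</text>"
    "<color value='blue'/><text>sea</text>"
    "</doc>"
)


def blue_text_by_query(xml_text):
    """Tree approach: one path expression selects text inside blue brushes."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall(".//brush[@color='blue']/text")]


def blue_text_by_state_engine(xml_text):
    """Stream approach: walk the instructions in order, remembering the current color."""
    current_color = None
    found = []
    for element in ET.fromstring(xml_text):
        if element.tag == "color":
            current_color = element.get("value")  # update the state
        elif element.tag == "text" and current_color == "blue":
            found.append(element.text)
    return found


print(blue_text_by_query(TREE_DOC))
print(blue_text_by_state_engine(STREAM_DOC))
```

With only one piece of state the two functions are about the same size; the argument in the comment is about what happens as more kinds of state (font, style, list level and so on) have to be tracked by hand.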

  27. Doug Mahugh says:

    there have been some great discussions about Open XML online in the last week. Here are a few of my favorites

  29. Elliott:

    Datawatch is also an IBM Advanced Partner.  

    http://d03bphrb.partner.boulder.ibm.com/bpconnections/bpcms.nsf/public/9DD15EA9991A250F872570E6003B6A61?OpenDocument&NL=en

    Read my previous post on our decision making process for OpenXML.  If the new IBM Symphony product and others create a large enough user base for us to be interested in, then we will look at implementing ODF support.

    Joining the partner program is always a good move, when you intend to use another company’s technology. It doesn’t mean they start to dictate your company strategy.

    Gareth

  30. Daniel says:

    I’m a non-technical user, but I’m in charge of finding alternatives to Microsoft programs in the place I work.  When I read the title "Independent Implementations of Open XML"  I thought immediately of OpenOffice, but it seems I’m wrong and this "Open" refers to the new Microsoft format?  It seems deceiving to me that the most representative company of the proprietary software approach markets its new format as "Open" XML.  It would be a lot clearer if they had labeled it MicrosoftXML or something like that.

  31. Andrew Sayers says:

    btimby,

    I’m glad to hear that my analogy made sense.  It’s interesting that you talk about "simply using xquery" vs. "maintaining a state machine", because the simplicity of xquery is built on decades of work in the industry around getting people to understand trees, unifying around XML as a representation, and so on.

    Somewhere back in the day, the industry decided that it was going to use trees instead of streams as the default way of representing data.  The effect of that decision seems to be that more programmers understand trees than streams, and that we have more advanced tools for trees than for streams.  In other words, there’s a network effect in the value of an approach to representing data, just like there’s a network effect in the value of a software product.

    Looking at it this way, the pattern Elliot discussed becomes obvious: the people that are most eager to start using a stream-based office document format are the people who’ve been most exposed to the stream-based model in the past.  People like yourself that have only had brief, painful, experiences of stream-based formats will prefer to give it a miss until the stream-based equivalent of xquery is invented.

    – Andrew

  32. Researching some blogs related to Open XML…

  33. btimby says:

    Andrew Sayers:

    I don’t know that I find all streams painful, however, XML formed as a stream is easier to deal with as it provides choices.

    I have parsed many streams (generally binary streams with a predictable, rigid format): PE, COFF, ELF, etc.

    I think my point is the following:

    writing XML like this:

    tag1

     tag2

     tag2close

    tag1close

    is easier to deal with than like this:

    tag1

    tag1close

    tag2

    tag2close

    The first way allows me to parse using a state engine OR a query whereas the second method forces me to use a state engine. Querying this type of XML is nigh impossible.

    I can query (simpler implementation) or use a state engine for better performance. It is up to me if your document is written "sanely" by my definition. Otherwise you are forcing me to parse it in a particular manner.
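
As a rough illustration of the "state engine OR a query" point above, the following Python sketch reads the same nested document both ways. The `tag1`/`tag2` names come from the comment; the surrounding `doc` element and the text content are assumptions added so there is something to extract.

```python
import io
import xml.etree.ElementTree as ET

# Nested form from the comment, plus one tag2 outside tag1 for contrast (an assumption).
NESTED = "<doc><tag1><tag2>inner</tag2></tag1><tag2>outer</tag2></doc>"


def by_query(xml_text):
    """Query approach: load the whole tree, then select tag2 children of tag1."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//tag1/tag2")]


def by_state_engine(xml_text):
    """State-engine approach: consume parse events and track the open elements."""
    found = []
    open_tags = []  # stack of currently open element names
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, el in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            open_tags.append(el.tag)
        else:
            open_tags.pop()  # the element that just closed
            if el.tag == "tag2" and open_tags and open_tags[-1] == "tag1":
                found.append(el.text)
    return found


print(by_query(NESTED))
print(by_state_engine(NESTED))
```

Both functions pick out the same element. With the flat form, `tag2` is a sibling of `tag1` rather than a child, so the relationship between them is no longer recorded in the nesting and has to be reconstructed by reading the elements in order.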

  34. Rick Jelliffe says:

    Judging a standard on whether it prompts the creation of new versions of a particular class of application is quite bogus, it seems to me.

    For a start, the success of a standard is whether it is useful to someone (even a small niche), not mass-market issues. Second, a standard for a data format and a standard for a class of application are different things: DIS 29500 is not a standard for defining office applications, but a data format: related, but not the same thing. Third, certainly at JTC1 SC34, we have for decades been promoting the idea of pipelines or systems of processes, not monolithic applications: monolithic applications under any brand name are the problem, not the solution; moving to XML-based standard file formats provides an on- and off-ramp to these more interesting processing systems. Fourth, given the peculiarity of Open XML, mixing state-of-the-art ideas with fossils from the dawn of WYSIWYG, it is highly unlikely that people will implement every nook and cranny of it fast: particularly so with the compatibility and optional parts. Fifthly, the ease or difficulty in implementing any standard is proportional to how easily the standard maps to your existing products, as has been mentioned before: so a product that already supports MS-isms will not have as big a task as products based on different data structures and functions (the slowness of the Mac Office implementation of OOXML makes me suspect that it has some feature issue to be resolved, rather than just being a simple mapping problem).

  35. norbert says:

    Andrew Sayers said:

    "(try multiplying XVI by XIV in your head without translating into Arabic numerals)"

    You should have taken a less trivial example. (a-b)*(a+b) is not the hardest thing to calculate, especially with b=1, even with Roman numerals.

    (XV + I) * (XV – I) = XV*XV – I

    XV*XV= CL + L + XXV

    so: CCXXIV

    🙂
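
For readers who want to check the result, the same calculation in Arabic numerals is written out below. The split of XV*XV into three terms is read here as X·XV + X·V + V·V, which is an interpretation of the comment, not something it states.

```latex
\begin{align*}
\text{XVI} \times \text{XIV} &= 16 \times 14 = (15 + 1)(15 - 1) = 15^2 - 1^2 \\
15^2 &= 10 \cdot 15 + 10 \cdot 5 + 5 \cdot 5 = 150 + 50 + 25 = 225
      && (\text{CL} + \text{L} + \text{XXV} = \text{CCXXV}) \\
15^2 - 1 &= 224 && (\text{CCXXIV})
\end{align*}
```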

  36. Jason,

    I’ve finally gotten around to adding an entry at my blog commenting on this thread, and pointing folks back to it.  You can find that entry here:  http://www.consortiuminfo.org/standardsblog/article.php?story=20070922150738283

    Andy Updegrove

  37. Ian Easson says:

    Andy, I left a comment on your blog about your analogy between OOXML/ODF and Wi-Fi/Bluetooth.  (I did it as anonymous only because I was rushed and didn’t have time to register.)  You got some of the analogy wrong, and I thought it was worth pursuing.

    I would like to partially quote myself here, because I think the argument fits Jason’s blog:

    … WiFi and Bluetooth …have two different design intents, as described. (I am CEO of a company that is making Wi-Fi equipment with a range of 10 to 15 km.)

    The real analogy is with OOXML and ODF.  MS has always said they have different design intents, and they do.

    Let us make the specific analogies Wi-Fi <–> OOXML and ODF <–> Bluetooth.

    ODF is for people and organizations who want to use or interwork with OpenOffice and a very short list of other open source applications.  (That’s pretty limited, which is why I chose to make it analogous to the very short-range Bluetooth.)

    OOXML has a potentially much wider reach (which is why it should be thought of as analogous to Wi-Fi).  It is for people who want to do one of several things:

    – Use MS Office as a normal user

    – Interwork with MS Office (as a set of applications), programmatically, using files in OOXML as part of the data interchange.  This is the so-called "Line of Business" application approach, and since it is internal to large organizations, you will not hear much about this.  (The use of OOXML as an archive format comes in here)

    – Create new LOB applications that don’t even touch MS Office, and use OOXML as one of the file interchange formats.  (The use of OOXML as an archive format comes in here too)

    – Use 3rd party applications, e.g., other office suites, that read and/or write OOXML formats intended to be read at some point by MS Office.

    – Use 3rd party applications that use OOXML as a convenient and powerful interchange format for numbers, words, and diagrams.

    Note that these are all valid "implementations" of OOXML, which was purposely and explicitly designed so that an application does not need to implement 100% of what’s in the spec to be counted as a conforming application.  Also note that the potential use to create an MS Office clone is not listed above.  That potential use was thought up only by OOXML opponents (as in the idea that the spec is so complex that only MS could implement it), who then went on to believe the propaganda they had written.

  38. It looks like we’ve got a forked thread going on here, so I guess I should paste in here my response to your comment at my blog.  Here goes:

    My clear recollection of the marketing claims of the time is that Bluetooth was supposed to have a c. 30 foot reach, and was being promoted in the beginning as the solution for (for example) a wireless home office, with your laptop, printer, computer and so on all wirelessly linked – the same vision that was promoted for WiFi, before the products actually started to move out into the marketplace and the vendors could see what people wanted to buy, and how they wanted to use what they bought.

    With time, WiFi manufacturers took advantage of a number of differences between the two technologies, including the fact that you could be farther than 30 feet away from your router, while Bluetooth took less advantage of its nominal (or at least originally claimed) reach, and became used mostly for much shorter range communication.

    My point is also that the early claims and hype about what a standard can be used for eventually fall to realities.  When that happens, it’s interesting to compare the initial claims between competing camps with the wisdom (or at least the choices) of the marketplace.

     –  Andy

  39. Dave S. says:

    So – there are no implementations of MSO-XML, just a number of partial implementations.

    Since its primary goal is to provide backwards compatibility, there is no use calling it anything besides MSO-XML, unless you believe a wolf in sheep’s clothing makes it not a wolf.

  40. In case you missed the news on this yesterday – Microsoft and the DAISY (Digital Accessible Information
