Schwartz Joins the Doc Format Discussion


I appreciate the fact that Jonathan Schwartz of SUN has jumped into the doc format discussion with a well-thought-out post. I think he raises some important issues: control of your data, long-term archival of data, and most importantly, the translation of documents. Yet his posting follows some very careful positioning in order to lead to the logical conclusion (for him) that you should use his products and services. Logical – but it is important to keep an eye on the issues rather than the commercial interests.


Yes – my employer has commercial objectives to see our product used. Thus goes the nature of competition.


 1) Control of your data is critical. We certainly recognize that – thus the move by most formats to being XML-based. The base level standard is XML, and that gives enormous opportunity for access to the data no matter what application created it, or when. 100 years from now, XML will be the magic bullet, not ODF or Open XML. 


2) Long-term archival is a huge issue and challenge. There is no question that the accessibility of the Declaration of Independence is not based on a requirement for a machine to be able to read the document. Knowledge of the language sure helps though. The trade-offs in the past for data storage vs. computing capability were such that it was not feasible to have the benefits of XML-based formats. How much of the flat data stored on mainframes is readable without the mainframe applications? How about the data stored in directory services or other data manipulation systems?


I have never found the archival arguments to be complete. If you were to go completely down this path then SAP would have to open all of its data, Oracle, IBM (DB2), etc. etc. The trade off that everyone is ok with is based on the benefits that the system may offer (value) in exchange for the difficulties with long-term archival (or any one of a hundred other trade-offs).


Microsoft responded to national libraries and government agencies years ago with access to the binary formats, and source code for the Save As functions etc. Also, we provided documentation and licensing that recognized the import of the long-term archival issues. Is that a perfect solution – nope, but it was a step in the right direction. ODF is no better nor worse than Open XML at enabling technical access in 100 years simply because both are XML files and can be opened up easily. In fact, the Open XML translator project shows that it is a relatively easy task to build an independent piece of code that lets you do that.
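To make the "can be opened up easily" point concrete, here is a minimal sketch in Python. It relies only on the fact that both ODF and Open XML documents are ZIP packages whose main content lives in an XML part (content.xml for ODF, word/document.xml for an Open XML word-processing file). The function names and the tiny demo package are hypothetical illustrations. The demo is not a valid document in either format, and real files use namespaced elements and may reference non-XML parts such as images that need separate handling.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main XML part inside each kind of package (both formats are ZIP archives).
MAIN_PARTS = ("content.xml", "word/document.xml")


def extract_text(package_bytes: bytes) -> str:
    """Return the text of the first main XML part found in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        names = set(package.namelist())
        for part in MAIN_PARTS:
            if part in names:
                root = ET.fromstring(package.read(part))
                # itertext() walks every element and yields its text, skipping markup.
                return " ".join(t.strip() for t in root.itertext() if t.strip())
    raise ValueError("no known main XML part in package")


def make_demo_package(part_name: str, xml: str) -> bytes:
    """Build a tiny in-memory ZIP so the sketch runs without a real document."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(part_name, xml)
    return buffer.getvalue()


# Hypothetical stand-in for a document; a real file would be read from disk.
demo = make_demo_package(
    "content.xml",
    "<doc><p>We hold these truths</p><p>to be self-evident</p></doc>",
)
print(extract_text(demo))
```

Nothing here depends on the application that wrote the file, which is the archival argument in miniature. Recovering the raw text is the easy part, though; preserving layout, formulas, and embedded objects is where translation quality matters.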


 3) The point that Mr. Schwartz makes about the bridge that is being built between Google formats and OpenOffice is exactly what we have been saying for over a year. Translation is the key. There are dozens of document format standards, some from consortia, some from national bodies, and some from international bodies. The whole “only one” argument is really a commercially-driven sentiment that does not reflect the market reality, nor the desire of customers to have more choice – not less. I think it is great they are building translation capability, I hope they continue to do so as it is the exact right thing to enable for customers.


 Interesting letter from Mr. Schwartz – glad the conversation continues.

Comments (9)

  1. Sam Hiser says:

    "ODF is no better nor worse than Open XML at enabling technical access in 100 years simply because both are XML files and can be opened up easily. In fact, the Open XML translator project shows that it is a relatively easy task to build an independent piece of code that lets you do that."

    This is more sophistry from you, Jason, in which you’re withholding two pieces of information…as if from yourself:

    o in your XML implementation, not all the data is in XML; and

    o the translator is designed to do poor translations

    The governments do understand. Plugins are now a fact of life. Pep-talk may feel good but it doesn’t hold the market.    

  2. jasonmatusow says:

    Hi Sam – thanks for the comment.

    1) Not hiding anything from myself…I asked and I assured myself.

    2) I’ll defer to techies like Brian Jones and you. I know that there are referenced items that are not in XML (like image formats). I also know that in ODF there are instances where the spec still needs to mature, thus there are non-standardized elements from StarOffice used (like formulas if I am not mistaken). At the end of the day though, XML is still at the root of both, and you can crack them open very easily to get to the data. That is goodness in general.

    3) The translator is an open source project and you are welcome to check out the translation to see if it is good or bad. If you think it is bad, please join in the fun. It’s open source.

    Thanks again for the comments.

    Jason

  3. Sam Hiser says:

    Jason-

    More sophistry. As you know, the Translator "Project" was enlivened on July 5, 2006 to kill my Plugin — to suck my air supply. My Plugin works and yours doesn’t. I wouldn’t possibly join a project that’s dedicated to the wrong measures.

    Why don’t you join mine? We’re at $300,000 buys you visibility of our Plugin source code (that’s right…"shared source") & membership on the Board of Directors of the OpenDocument Foundation.

    Come along. We’re the only vendor-neutral organization doing a plugin that’s credible on ‘interoperability’.

  4. Today an open letter was posted to the interop site at Microsoft. Yesterday, an open letter was posted

  5. Anthony Christopher says:

    Track record:

    My friend has some old files created by old versions of Microsoft Office. He cannot access the data contained in those files because newer versions of Microsoft Office cannot open these files and he no longer has copies and cannot easily obtain copies of the original Microsoft Office versions which were used to create them. I have other friends who claim that even if he could get copies of the old versions of Microsoft Office that they would not install, let alone run, on my friend’s new computer.

    Question 1.

    Is the claim on the part of my other friends true?

    Scenario:

    A few years from now, I fill out some paper work for some branch of my state government. That paper work is processed and stored using some office suite procured by the state government and stored in the then current file format. Ten years later I want my state government to access the data in those files.

    Surmise:

    If the state government procures Microsoft Software and uses it to store and to access that data, if everything runs true to form, they will not be able to access my records.

    Options for avoiding fulfillment of my surmise:

    Option A

    Support legislation that attempts to ensure my state government’s procurement and use of software meets a standard set by a broad based alliance of developers and users with this very goal in mind.

    Option B

    Do nothing and hope the problem goes away by itself.

    Option C

    Support legislation that attempts to ensure my state government’s procurement and use of software meets one of two standards, one of the standards being set exclusively by the very same vendor whose track record is that of being part of the problem with the goal of rubber stamping whatever format it produces.

    Question 2.

    Which of the above options is most likely to guarantee that my surmise is not correct?

    Question 3.

    Of what use to me, is a rubber stamping standard?

    Question 4.

    Why did Microsoft refuse to lend a hand in the development of a standard set by a broad based alliance of developers and users with one of its goals that of solving the data archival problem (ODF)?

    Question 5.

    Since at least one state in the US now requires vendors to meet just such a standard, proof of user demand, why does Microsoft say it will not support that standard?

  6. Wesley Parish says:

    FWLIW, Anthony Christopher, there are heaps of copies of MS Win9x and MS Office 9x floating around the world.  I’ve seen the MS Win95 floppy version sold second-hand, unused, for NZD$5.00, and turned it down because I couldn’t guarantee that the 3.5" floppies were still usable for what I wanted to do with them – install MS Win95 off them in an emulator.

    That’s the other thing – you can get a good emulator or virtual machine for the price of a qemu or bochs download from Sourceforge.

    What you then do is save as rtf, because it can be reasonably guaranteed, within a certain level of confidence, that it will not only be readable by later versions of MS Office but will also preserve at least 85% to 98% of the formatting information. But you also save as txt, just so that if the worst comes to the worst, you can at least have the text … 😉

    Then mount the file that is the virtual file system and copy from it to the main file system.

    In other words, translation is vitally important.  It helps if the translation software is up to scratch.  Unfortunately, Microsoft in previous versions of MS Office, didn’t make this a priority.

    Which is one of the reasons why I keep pressing for Microsoft to open the source of the likes of MS Windows 3.x, 9.x, NT 3.x – 4.x, MS Office 9.x, MS Works 3.x – 4.x …, under the template Microsoft Community License … There are enough versions of this software floating around to make its current static status positively hazardous, since its vulnerabilities are known inside and out by the black hats – Microsoft is going to compete against its massive installed base anyway, so it may as well have the benefit of getting its bugs and security holes fixed by others while competing against it.

  7. jasonmatusow says:

    Anthony – thanks for the comment.

    1) I would have to know which version of Office docs you are talking about, and I’m happy to ask our tech resources inside MS for an answer.

    2) Your scenario is no longer the issue it was in the days of binary formats. Which – by the way – were not just about MS binary formats. The binary format concern was true for other word processors and spreadsheets available on the market as well. Binary file formats offered great performance, and at the time, represented competitive differentiation in the marketplace for various products.

    Today, with the advent of XML, documents are being created in formats that will allow future generations to get to the data more easily. But – you can be sure that the technologies will have long surpassed the XML formats of today (I think Jonathan Schwartz made a good point about that in his blog entry). Governments will still be in the situation where they have to figure out how to go back, crack open old technology, and translate for the modern technology.

    Open XML, ODF, and other XML-based formats seek to address this very concern. It is not about Microsoft solutions, or anyone else’s – it is about what XML can offer.

    As for your options – I think it is a really bad idea for governments to mandate technology solutions. If they must mandate something, it should be the end goal (long-term access capability for example). Technologies, standards, etc. will progress far faster than the legislation, and thus a legislative solution almost by definition will put the government in a position of having to procure sub-optimal solutions in order to meet a mandate.

    Thus, I think your Options A and C are the same, and both flawed.

    As for your "rubber stamping" comment – that is great rhetoric, but poorly informed. Do you have faith in ISO standardization? Do you think the ODF ISO standard is valid because of the ISO imprimatur? If so, then you should have more faith in Ecma than OASIS (before people jump all over me – I am NOT comparing the two orgs in terms of one being better than another). ISO’s certification of the Ecma process is the highest possible rating. It is a 46 year old standards body whose standards have been broadly accepted throughout the industry. Your DVD player benefits from that, as do a very long list of other technologies. OASIS is an industry consortium body that also has a fine history, and offers significant value to the industry. IBM has been pushing the "rubber stamp" line because it meets their current needs – yet they are a member in good standing of the organization. Let’s talk about the technology, let’s talk about the industry issues – but this is a red herring argument at best.

    To your next question – we have been working on XML-based formats for a very long time. At the time ODF was started, it had basically 2 lead member organizations, SUN and IBM. They both had product strategies driving their investment in the format. The line of logic you are using would suggest that the "first to standardize" makes for the best business decision for software producers. I disagree wholeheartedly with that concept. I am in favor of organizations heavily investing in R&D and pressing their technologies/products forward as aggressively as possible. That is what has driven the software industry (>50K software companies out there) to such amazing growth and reach. More of the same for me please.

    I think you need to check your facts about the US States and what they are or are not mandating. As for if they should – my answer from above remains the same.

    Thanks again for the comments.

    Jason

  8. Anthony Christopher says:

    Thank you for the offer to pass my friend’s problem to tech resources inside Microsoft. I think I may suggest we try something brought to mind from the Wesley Parish post first.

    The trouble is that I don’t see how ISO approval of OOXML is going to prevent something similar from happening in the future.

    I do not think XML is the panacea you seem to think it is. I did, at one time, read the goals for XML and I do not remember them addressing the archival problem directly. Not that I do not think XML could be used to address the problem, but I would think the information industry would need the help of something like the ODF standard to do a thorough job of it. While I have read enough (of criticism, commentary, etc.) to know that the ODF is not perfect, it is a good step in the right direction. The participation of Boeing, National Archives of Australia, Society of Biblical Literature, Intel, New York State Office of the Attorney General, Sony and others in ODF development gives me some hope that it is, and will continue to develop into even more of, a technically balanced standard that truly deserves to have the word "open" associated with it.

    My characterization of OOXML as a rubber stamp does not come from IBM, it comes from these excerpts from the statement of its purpose found in an overview of OOXML retrieved from http://www.ecma-international.org:

    "OpenXML was designed from the start to be capable of faithfully representing the pre-existing corpus of word-processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft Corporation. … OpenXML addresses the need for a standard that covers the features represented in the existing document corpus."

    The only other part of the purpose of this OOXML specification that exceeds that of the ODF is to aid in the migration of existing binaries into this rubber stamped "standard format". But the OOXML does not make good use of the existing ODF standard; indeed, it generally represents a redundant and contradicting specification. Hopefully the people involved in the ISO approval process are the type that understand that when it comes to standards "the fewer standards that address a specific problem, the better." Whether or not something developed by Microsoft, for Microsoft’s purposes, should really be called open is irrelevant in light of OOXML’s redundancy with respect to other standards.

    "As for your options – I think it is a really bad idea for governments to mandate technology solutions. If they must mandate something, it should be the end goal (long-term access capability for example). Technologies, standards, etc. will progress far faster than the legislation, and thus a legislative solution almost by definition will put the government in a position of having to procure sub-optimal solutions in order to meet a mandate."

    I am generally in favor of smaller government. But in the case of the senator’s chair in the senator’s office, I think they should be at liberty to legislate the procurement of whatever chairs will make them the most productive senators. If it’s their chair on the senate floor, they may need to legislate with other factors in mind (coworkers, people visiting the senate to witness US government in action), but I think they should still be at liberty to do so. Likewise, I think it perfectly appropriate for the government to legislate technical requirements for government procurement of software that give due consideration to the productivity of the government worker as well as the protection and in some cases freedom of the public’s data. Indeed if they do not write such legislation I must consider them remiss in fulfilling their obligations to the public. Furthermore, having worked for Boeing and understanding a small part of what Boeing goes through to meet legal requirements, I sometimes wonder if Microsoft’s software might not be considerably improved in quality if Microsoft had more experience of building software for the government, in a fashion similar to the way Boeing has built military planes.

    I do not like the option of doing nothing about a problem.

    Working for a company that has had all the delays that Microsoft has had in producing Vista, and claiming that technology changes at a pace likely to leave legislation behind, is rather cheeky.

    ‘The line of logic you are using would suggest that the "first to standardize" makes for the best business decision for software producers.’

    If you insist on confusing documents with software and only seeing things from the business side of the software producer, I can see why you might think that. Although I have produced some software, I have other concerns than my ability to profit from what I write. That is perhaps one of the things that makes me skeptical of a proposed "standard" written by Microsoft. So few within Microsoft show any sign of understanding that standards are about cooperation, not competition.  But for me, someone who often sees things in non business terms, the inclusion of parties like Boeing and the Society of Biblical Literature in developing the ODF is a more important technical indicator than that the initial business leads were IBM and SUN.

    "I think you need to check your facts about the US States and what they are or are not mandating. As for if they should – my answer from above remains the same."

    While I know that Massachusetts is a small state and that the requirement only applies to a subset of the Massachusetts state government, and I know that their current "enforcement" of the requirement involves a plug-in rather than software purchases, the still existing requirement for ODF indicates customer demand, surely what has gone on there has not fallen below Microsoft’s "radar"?

    I really do not see what prevents a company from "heavily investing in R&D and pressing their technologies/products forward as aggressively as possible" and cooperating, lending a hand in the establishment of a standard. Do you mean that the only way to succeed is to avoid cooperation? If so, you might be interested in something John Nash once wrote concerning optimal strategies and cooperation.

  9. jasonmatusow says:

    Thanks Anthony –

    Sitting down for a beer at some point would be easier than the essays we are currently writing. 🙂

    I hear you on many points, and will think about them. I do not think that the aggressive promotion of your commercial interests precludes cooperation. I just think that the cooperation is colored by the self-interest. (maybe in a positive way, maybe in a negative way – or maybe not at all)

    Anyway – I’m crammed today and not able to write a long response. Thanks again for the really thoughtful response.

    Jason
