Back from Sapporo – tons of progress in Ecma


Well, it has definitely been a pretty hectic couple of weeks, and it’s going to take me a while to get caught up. I was in Boston two weeks ago for TechEd, and in Sapporo last week for Ecma meetings. Both were great trips, but it’s nice to be home. The meetings in Sapporo were extremely productive, and you can read all about them in the status report filed by Tom Ngo from NextPage (http://www.ecma-international.org/news/TC45_current_work/TC45-2006-50.htm).


Some of the key things I wanted to call attention to:




  1. U.S. Library of Congress joins Ecma TC-45 – This was really great news. We’ve already benefited significantly from the participation of Adam Farquhar from the British Library, and I’m really excited to have the Library of Congress on board too. Like the British Library, the Library of Congress cares deeply about archiving and is particularly interested in the long-term accessibility of the formats.


  2. Progress on conformance definition – We’ve spent a lot of time debating how to best define conformance to allow for good interoperability while at the same time making it super easy to use just portions of the specification. We resolved a number of issues and I think we’re really in a good spot here.


  3. Progress on WordprocessingML issues – We’ve made a lot of progress working on the initial WordprocessingML documentation and are now able to drill into the issues logged by members of the technical committee. I think everyone was excited as we were able to start closing down some of the older issues.


  4. Java WordprocessingML to HTML converter – Toshiba gave us a demo of a WordprocessingML to HTML converter they’ve written in Java. I always get excited when I see tools built on top of the new formats. It’s really one of the biggest differences between the old formats and the new. We’ll see a lot more 3rd party solutions that were either not possible, or incredibly difficult with the old binary formats.


  5. Schema visualization – Representatives from BP, Statoil, and Essilor went over some ideas for making the documentation and schemas easier to visualize. There are about 4000 pages of documentation right now, and we really want to figure out ways to make it easier to consume.

It really was a great few days, but I wish I’d had more time to explore the area. I lived in Okinawa, Japan throughout most of Junior High and High School, and this was my first time back since then. I really enjoyed Sapporo. The food was great, and of course you can’t beat being that close to the Sapporo brewery. Toshiba was an outstanding host.


-Brian

Comments (29)

  1. ECMGuy says:

    Thanks for the update.

    I found your comment about living in Okinawa through Jr High and High School interesting. I was stationed in Okinawa with the Navy in 1992-1994. What a magnificent place. The people were fantastic and the views spectacular.

  2. Wesley Parish says:

    "Progress on conformance definition – We’ve spent a lot of time debating how to best define conformance to allow for good interoperability while at the same time making it super easy to use just portions of the specification. We resolved a number of issues and I think we’re really in a good spot here."

    That touches on a question I posed some days ago, and which I find Sun reiterating in its response to the Massachusetts RFI on MS Office ODF Plugins:

    http://www.mass.gov/?pageID=itdterminal&&L=4&L0=Home&L1=Open+Initiatives&L2=OpenDocument&L3=ODF+Plug-In+for+Microsoft+RFI&sid=Aitd&b=terminalcontent&f=open_rfi_response_sun_response&csid=Aitd

    "In other words, even though the MS XMLRS may be fully unencumbered through patent grants and a covenant not to sue, a number of the features and functions that the MS Office applications implement remain proprietary, private, and are not available for implementation by other developers."

    Are you (and your team) intending to release an Open XML-Lite that ignores features like ActiveX, so that Open XML can be a truly cross-platform file format?  Or are you pushing to release ActiveX and suchlike through the standardization process together with the broad patent grant and covenant not to sue, so that it can be re-implemented – in full – on Linux, FreeBSD, OpenSolaris, MacOSX, etc?

    I mean, it’s nice that the Library of Congress and the British Museum have joined Ecma TC-45, but I expect that LC and BM expect to be around longer than Microsoft is expected to be around, and iff Windows ceases to be a widespread and broadly-used platform, and iff the Microsoft "Intellectual Property Rights" in the Windows platform are rigidly held, and iff the US Fe[de]ral Government insists on criminalizing reverse-engineering and suchlike, then at Microsoft’s End-Of-Life, the LC and BM are going to be out of pocket.

    Has this been taken into consideration?  It doesn’t seem that it has.

  3. Here’s June’s update (see also May’s and April’s) from the Ecma International Technical Committee…

  4. KJK::Hyperion says:

    Wesley, in my humble opinion that’s a load of goddamn bullshit. Smugness and rhetoric, I have had enough of your kind, spamming blogs, spamming Wikipedia, spamming mailing lists. You either don’t get the point or are using the "ActiveX" word as a scare tactic.

    Hey, it’s not hard. If you want a portable document, don’t use non-portable features. You say it like provision for non-portable features makes a format inherently bad, which is so obviously retarded you cannot possibly be serious. And if you are serious, then you are dishonest, Hey, look how many evil formats, HTML, CSS, TIFF, RIFF, Ogg… all of them can contain evil proprietary data, but they were designed by people who are not Sun competitors, so they are OK!

  5. Sam Hiser says:

    Hyperion-

    You’ve lost the plot. Non-portable features make a format a non-standard. That is, not standardizeable.

    It is the wrong strategy when there is an ISO standard file format in existence which will never have non-portable features.

    On this basis, MSECMAXML can’t compete.

  6. BrianJones says:

    Sam, with OpenOffice you can insert a spreadsheet formula into your document yet the format for that formula isn’t specified in the ODF spec. If it is not documented, does that make it non portable?

    OpenOffice also allows you to embed OLE objects. So it appears to be the case that OpenOffice allows you to insert undocumented stuff into ODF. I can’t understand (after reading your comments) why you would consider these to be portable features.

    I think it’s absolutely the right thing to do. It would be ridiculous for OpenOffice to prevent their users from inserting formulas just because ODF is lacking. But I think this also shows that if you really believe that Open XML is not portable then you must apply that same logic to ODF.

    Now, if you look at the Ecma Open XML spec, you’ll see that rather than taking the ODF approach of just not documenting certain things, we fully document how everything works. This way the documents are completely interoperable. If the end user decides to insert a foreign object into the document, there is not much we can do there. We can document how we store that object, but beyond that it’s up to the object. OpenOffice does the same thing, but ODF provides even less information on how it’s done.

    -Brian

  7. BrianJones says:

    Hey Russ, Okinawa was amazing. What base were you on? My Dad was with the NIS and was stationed at Butler (we actually lived on Kishaba though).

    I had initially wanted to take my wife with me out to Sapporo, and then head down to Okinawa for a few days to show her around. She couldn’t get away from work though so I just kept the trip limited to a few days in Sapporo.

    -Brian

  8. Francis says:

    There is a solution to the portable document controversy. Simply have the Office applications inform the user when she uses proprietary features, just like how they warn her when content will be lost (due to saving in older formats, e.g. in the compatibility checker).

    Then the responsibility rests squarely with the user.

  9. BrianJones says:

    Francis, I think that’s a good concept but I think it would be too heavy handed. What do you think the average user reaction would be to such a message? I’m pretty sure that it wouldn’t be positive. 🙂

    We try to avoid throwing warnings to users (and I think we already do it more often than we should).

    I think the concept is good though. I think it would be pretty easy to have a tool that quickly notifies you if there are objects embedded in a document that may not be portable. That way people interested in portability could use the tool to easily verify it.

    That would also have the added benefit of being customizable, and you could even update it as there are changes in the market conditions and what’s considered portable.

    -Brian
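A rough sketch of what such a checking tool might look like is below. It assumes the document is a ZIP-based package and that potentially non-portable content can be recognized by the names of the parts inside it. The `find_suspect_parts` function and the patterns in `DEFAULT_SUSPECT_PATTERNS` are hypothetical and purely illustrative; they are not taken from the Open XML specification, and a real tool would more likely inspect the package's declared content types and relationships.

```python
import io
import zipfile
from fnmatch import fnmatchcase

# Hypothetical, user-editable patterns for parts that may not be portable.
# These folder names and extensions are assumptions for illustration only.
DEFAULT_SUSPECT_PATTERNS = (
    "*/embeddings/*",  # assumed location of embedded OLE objects
    "*/activeX/*",     # assumed location of ActiveX control storage
    "*.wmf",           # Windows Metafile images
    "*.emf",           # Enhanced Metafile images
)


def find_suspect_parts(package, patterns=DEFAULT_SUSPECT_PATTERNS):
    """Return the names of parts in a ZIP package matching any pattern.

    `package` may be a file path or a binary file-like object.
    Matching is case-insensitive; names are returned as stored.
    """
    lowered = [p.lower() for p in patterns]
    with zipfile.ZipFile(package) as zf:
        return [
            name
            for name in zf.namelist()
            if any(fnmatchcase(name.lower(), p) for p in lowered)
        ]
```

Because the pattern list is just a parameter, anyone interested in portability could swap in their own list and update it as opinions change about what counts as portable.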

  10. Sean DALY says:

    > OpenOffice also allows you to embed OLE objects.
    > So it appears to be the case that OpenOffice
    > allows you to insert undocumented stuff into ODF.

    Undocumented because Microsoft has a clear interest in locking customers into a proprietary format.

    I’ll ask this question a third time: why not just publish the specifications (if they exist) of the proprietary secret binary formats, in the interest of openness and interoperability, for the millions of current and former Word users with their billions of Word documents?

    Sean DALY.

  11. Wesley Parish says:

    Actually, Brian, the history of Microsoft’s interactions with Digital Research provides us with a test-case of how average people react to such a barbed warning.  Microsoft included a warning in MS Windows 3.1 that Microsoft couldn’t guarantee that MS Windows 3.1 would run well when the underlying DOS wasn’t MS DOS.  DR DOS which up till then had been on a roll, fell off the market precipitately.

    And KJK::Hyperion, I’ve just been reading some of the comments on various blogs on Microsoft’s dropping WinFS – one guy had already started supporting it, based on MS Windows Vista Beta 1, in some stuff he was writing.  I’m afraid I know what I’m talking about.  ActiveX’s continued existence is no more a given than my own.

  12. Adam says:

    Brian: WRT the formula thing that you keep bringing up, the ODF spreadsheet formula *is* being worked on. See http://www.openformula.org/Main_Page Based on the speed of the work, they may have a decent draft and a first implementation by the end of the year. (Maybe before Office 2007 comes out? 🙂)

    Yes, it is a weak spot, but seeing as you can’t portably move spreadsheet documents between applications/OSs at the moment anyway, it’s not like anyone’s losing out *yet*.

    However, if the alternative is to implement MSOOX, I’d be interested in what you make of the claim that MSOOX doesn’t have a rigorous specification for its formulas *either*, as claimed by David Wheeler at http://sourceforge.net/mailarchive/forum.php?thread_id=8911498&forum_id=46632

    (site is down atm, google cache at http://216.239.59.104/search?q=cache:Mmo-0pwZT4AJ:sourceforge.net/mailarchive/forum.php%3Fthread_id%3D8911498%26forum_id%3D46632+david+wheeler+microsoft+xml+formula+specified&hl=en&lr=&strip=1 )

    (And if MSOOX is no better than ODF/OpenFormula *at this stage*, could you tell us what the progress is on specifying MSOOX spreadsheet formulae?)

  13. BrianJones says:

    Adam,

    you need to get caught up a bit, as that comment is from last year before the work in Ecma even started.

    Download the draft that was released back in early March: http://www.ecma-international.org/news/TC45_current_work/Ecma%20TC45%20OOXML%20Standard%20-%20Draft%201.3.pdf

    Take a look at chapter 15.5 (starts on page 247). There are about 160 pages of content describing the formula syntax and about 360 different functions. You’ll notice that there is still a ways to go, but this is already a huge amount of useful content.

    Sean,

    Could you give me the list of binary parts you’re concerned about? Is it just ActiveX controls (which I would guess less than 0.01% of Office documents contain)? Users have the option of embedding them, but it isn’t something Office would ever do automatically.

    Is there anything else?

    -Brian

  14. Adam says:

    Brian: Thanks for that. I hope you’ll forgive me for not wading through up to 4000 pages myself just to see if any more progress had been made or not. 🙂

    "You’ll notice that there is still a ways to go, but this is already a huge amount of useful content."

    So … the spec for ODF spreadsheet formulae is not fully defined yet, but it’s not fully documented for MSOOX yet either? Why is this some kind of point for *either* format over the other?

  15. BrianJones says:

    Adam, no problem, I understand. 🙂

    My point with the formula comment is that while the ODF spec already went through ISO, it’s far from complete. All specs will of course evolve over time as innovation occurs, but ODF isn’t even caught up with the innovations of today (heck, equations have been around for more than a decade). Many folks out there have been led to believe that it’s already a complete spec, and that isn’t the case.

    In addition, from what I’ve seen so far, the work in Ecma on formulas is definitely ahead of the work going on in OASIS.

    -Brian

  16. Brutus says:

    @Wesley Parish:

    "Actually, Brian, the history of Microsoft’s interactions with Digital Research provides us with a test-case of how average people react to such a barbed warning.  Microsoft included a warning in MS Windows 3.1 that Microsoft couldn’t guarantee that MS Windows 3.1 would run well when the underlying DOS wasn’t MS DOS.  DR DOS which up till then had been on a roll, fell off the market precipitately. "

    —————-

    Just for clarification, that warning appeared in a Win3.1 *beta*.  And it was true; that Win3.1 beta had NOT been tested at all on DR DOS.  I guess Microsoft didn’t want to waste time chasing down beta bug reports that were actually DR DOS issues rather than Win3.1 issues.

  17. Wesley Parish says:

    Thanks, Brutus, for that clarification.

    But consider this: the [MS|PC|DR] Disk Operating System was and is essentially a combination of program loader, (minimal) device driver library, and file system.

    MS Windows 3.x was and is essentially a set of graphics, multitasking and system libraries that were loaded on top of [MS|PC|DR] DOS.

    It is the custom for PC Operating Environments and Systems that don’t manage all the permutations of peripherals and device drivers that are available, to get pilloried; OS/2 got pilloried, Linux gets pilloried, BeOS got pilloried.

    Why should MS Windows 3.1 be any different?  It was after all a semi-autonomous DOS extension that depended on the DOS base for initialization, etc.  For Microsoft to say they didn’t take the trouble to ensure that their Windows 3.1 beta could run satisfactorily on DR DOS, either says that they were slack and lazy, incompetent programmers, or did it maliciously.  Take your pick.

  18. Adam says:

    Brian> "but ODF isn’t even caught up with the innovations of today"

    You make it sound like:

    a) There is a document format out there that has a completely open and fully-specified formula/equation syntax, or:

    b) None of the applications that use ODF can load formulas/equations that they themselves have saved.

    Clearly, neither of these statements is true. Of course, if you intended a third meaning, I’d be grateful if you could be a little clearer about it…

    On the other hand, you brought the formula bits up as a rebuttal to the fact that Word 2007 will include non-portable and undocumented OLE objects inside what is meant to be a portable and documented format. As ODF formulas are in the process of being documented (which I’m sure you were aware of before I pointed it out earlier in the thread) does this mean that (and I’m only using the connection you made between the two features) MS are in the process of documenting their existing OLE-based formats to match?

    While I don’t like undocumented features, I’m much less wary of them if I know they’re going to be documented in short order.

    As for not warning the user that embedding non-portable objects in their "portable" document format will make it non-portable, your arguments that "If the end user decides to insert a foreign object into the document, there is not much we can do there." and "I think it would be pretty easy to have a tool that quickly notifies you if there are objects embedded in a document that may not be portable. That way people interested in portability could use the tool to easily verify it." don’t appear to hold much water.

    First, they do kind of assume that the user knows which foreign objects are portable and which aren’t. I’m pretty technical, and I couldn’t give you a 100% answer on whether the WMF image format is portable. The user sees an image, they put it in the document, it looks fine. What more do you want?

    Secondly, a fair number of people *will* assume that even if something is non-portable, if they insert it into a portable document format it will be sprinkled with portability magic and be readable anywhere. It’s in a portable document, right? Therefore it’s portable.

    Third, the extra tool approach to check for unportable objects assumes that the person creating the document is the person interested in portability, _and_ that they’re even aware that the tool exists. If someone wants people to send them documents in MSOOX because "it’s portable" and they’ll be able to read it, but they get sent a document that’s, say, just a big embedded, non-portable WMF file, well…

    With the above 3 in combination, I can just imagine:

    A: "I can’t read your file!"

    B: "But it’s in the format *you* requested"

    A: "Yeah, but you’ve put a non-portable thing in it"

    B: "But it’s a portable format. Duh!"

    A: "But…"

    etc…

    *shudder* 🙂

  19. BrianJones says:

    Adam, OpenOffice allows you to embed WMF images and it allows you to embed OLE objects when saving to ODF. Does that mean that you don’t believe the ODF saved from OpenOffice is portable?

    -Brian

  20. Adam says:

    Wow. Yes, I’d definitely agree that ODF saved from OO.o is not portable if it contains such elements.

    Thanks for pointing it out. Looks like I’ll have to go start bothering the OO.o developers as well. 🙂

    And I’ll have to start giving people caveats when I ask them to send me ODFs if they’re using OO.o on Win32. 🙁

    Damnit, the whole point of a standardised doc format is to *prevent* this sort of thing. Argh! Now I’m disappointed! I kind of expect MS to do this sort of thing, but the OO.o guys should know better! Rah! *stomp* *stomp* *stomp*

    Or maybe I should just go back to plain text. 🙂

  21. BrianJones says:

    I think plain text may be a bit extreme 🙂

    With Open XML, there is always an alternative representation of OLE objects (a picture), so if you aren’t able to render the object you can fall back on the picture. It doesn’t appear that OpenDocument does this (I’m not sure what the content type of the "ObjectReplacement" is).

    -Brian

  22. Wesley Parish says:

    Brian, having thought about some of the issues raised in this series of threads etc, I believe I may have a solution to the "controls" one.

    It would involve someone going through ActiveX and some other toolkits on the market, such as QT and GTK, and identifying similar or identical functionality.  Then releasing supplementary documentation that pinpoints the similar and identical functionality – in effect providing supplementary documentation for those functions, for which countless Windows developers will bless you (or whoever) from the bottoms of their hearts.

    And Microsoft has a group specifically devoted to interoperability – Port25 is their website – who could probably do this as part of their function.  Then QT and GTK, etc, maintainers could incorporate an ActiveX binding in their toolkits that would translate ActiveX calls to QT, GTK, etc, calls.

    That would go a long way to clearing up my objections.

    And just in closing – ActiveX may be in only 0.01% of all MS Office Documents, but you’ve demonstrated the use of controls in MS Office – MS Word if I remember correctly.  It’s obviously meant to be used.

  23. BrianJones says:

    Wesley, thanks for the suggestions. I’ll talk to some folks about it.

    In terms of the controls that I’ve demonstrated, those are actually native to Word. The storage for those controls is completely declarative in the format, and they are fully documented. The ActiveX control support is a legacy thing which we’ve tried to replace when possible.

    -Brian

  24. Sean DALY says:

    > Could you give me the list of binary parts you’re concerned about?

    I meant the file format. We can agree that there is an enormous number of Word (and Excel and to a lesser degree, PowerPoint) files created in the binary blob formats. I find it unfortunate that Microsoft does not wish to unlock those formats, for the benefit of those who created the documents in the first place.
