Specifying the document settings


Earlier today I got an e-mail from Max asking if I could help clarify the section of the specification that deals with compatibility settings. Max was reading a blog post from IBM, and was hoping that I could respond with my point of view. Here’s the original message from Max:


Hi,


I read this entry from Rob Weir: http://www.robweir.com/blog/2006/01/how-to-hire-guillaume-portes.html about specific properties inside Open XML, e.g. “useWord97LineBreakRules”.


He makes the point how this can be an open format, when the format is documented, but certain properties are not, so in this example, nobody besides MS knows how these specific line breaking rules look like.


It would be great if you could comment your point of view on these issues as an response or on your blog.


Best regards


Max


This is a great question, and one that on the surface seems to have an obvious answer. But, as with the spreadsheet date issues that we discussed towards the end of last year, we chose our approach to compatibility settings to allow for the most interoperable format possible without negatively impacting the average end user. Let’s explore this a bit more.


Leaving things out of the spec isn’t the solution


If you look at section 2.15 in part 4 of the spec, you’ll see that there are almost 400 pages of documentation covering 206 elements. I still remember when we first started the documentation of that section. We knew it would be tedious, but the more annoying part was that a good portion of it was for legacy functionality that we would have much rather just left out of the spec. In fact, in Word 2007, when you take a .doc file and upgrade it to .docx we ask you if we can do a full upgrade so that all of those legacy settings are removed.


Unfortunately for us, there was a legacy base of billions of documents out there, and many of them had one or more of these settings. For the same reason we had to give the user the option of doing a full upgrade to remove the legacy settings rather than just doing it automatically, we had to include it in the file format. So rather than just trying to sweep it under the carpet, we embarked on investigating and trying to understand as much as we could about what each of those legacy settings meant.


No requirement for conformance


For the most part, no one building a solution that reads from the format or writes to the format will care to deal with that part of the spec. It doesn’t have a significant impact on the behavior of the documents, and as I already mentioned, we actually try to get the user to upgrade and turn these settings off whenever we can.


If you look at the first part of the compat settings section (2.15.3) you’ll see that we tried to make it as clear as possible that implementation of these settings is completely optional. If your customers really care about it and want you to implement it, then you’ll need to do so, and the spec defines how you should represent that setting. Here’s the blurb from the spec that tries to make this very clear:


It is important to note that all compatibility settings are optional in nature – applications may freely ignore all behaviors described within this section and these settings should not be added unless compatibility is specifically needed in one or more cases. The compatibility settings are provided for backward compatibility with documents created in legacy applications. As such, a number of the settings reference specific applications and specific versions of those applications. This is solely for backward compatibility reasons, and any of those settings are ignorable.
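To make the "optional" point a bit more concrete, here is a minimal sketch in Python of what a consumer could do with these settings: find them, report them, and otherwise ignore them. It assumes the compatibility settings appear as child elements of a `w:compat` element in the document's settings part, in the WordprocessingML main namespace. The sample XML and the `list_compat_settings` helper are made up for illustration, and the sketch does not look at any `w:val` attribute that might switch a setting off.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical settings part, written by hand for this example.
SAMPLE_SETTINGS = """<w:settings xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:zoom w:percent="100"/>
  <w:compat>
    <w:useWord97LineBreakRules/>
    <w:autoSpaceLikeWord95/>
  </w:compat>
</w:settings>"""


def list_compat_settings(settings_xml):
    """Return the local names of all child elements of w:compat."""
    root = ET.fromstring(settings_xml)
    compat = root.find(f"{{{W_NS}}}compat")
    if compat is None:
        # No compatibility block at all: nothing to report.
        return []
    # Tags look like "{namespace}localName"; keep only the local name.
    return [child.tag.split("}", 1)[1] for child in compat]


flags = list_compat_settings(SAMPLE_SETTINGS)
print(flags)
```

An application that doesn't implement a given setting can simply skip it, or use a list like this to warn that the layout may differ from the original.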


“Use OpenOffice.org 1.1 line spacing”


Since the folks pointing out this issue are extreme ODF proponents (which clearly shapes their agenda), it’s worthwhile looking into whether or not there is a similar issue with the ODF spec. It’s actually a great example of the difference between the two specs and the approaches the two groups took when defining the formats. ODF took the approach of just leaving it all unspecified. The same approach has been taken for a number of other things (like spreadsheet functions).


For example, in OpenOffice there is a compatibility option called “Use OpenOffice.org 1.1 line spacing” that, when saved out into the format, is written as follows:


<config:config-item config:name="UseFormerLineSpacing" config:type="boolean">false</config:config-item>


That’s a bit cryptic, isn’t it? Unfortunately, there is nothing in the ODF spec that explains what it means. It’s left undefined. That’s probably not a big deal for a setting like this one, since it refers to the layout behavior of a specific application, but none of the settings are defined in the spec. Here are some other settings that OpenOffice writes out, that matter more to the actual display of a document, and that aren’t defined anywhere in the spec:



  • PrintTables

  • LoadReadonly

  • OutlineLevelYieldsNumbering

  • TableRowKeep

  • CharacterCompressionType

Those are just a few I noticed in a blank document I saved out today.


Will there eventually be a new standard within the standard?


This is the approach that was taken for all configuration settings. There is no mention in the standard of how to name these things or how two applications should interoperate. In the blank document I saved out I got a settings file that looks something like this:


<office:document-settings xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0" xmlns:ooo="http://openoffice.org/2004/office" office:version="1.0">
  <office:settings>
    <config:config-item-set config:name="ooo:view-settings">
      <config:config-item config:name="ViewAreaTop" config:type="int">0</config:config-item>
      <config:config-item config:name="ViewAreaLeft" config:type="int">0</config:config-item>
      <config:config-item config:name="ViewAreaWidth" config:type="int">30032</config:config-item>
      <config:config-item config:name="ViewAreaHeight" config:type="int">17570</config:config-item>
      <config:config-item config:name="ShowRedlineChanges" config:type="boolean">true</config:config-item>
      <config:config-item config:name="InBrowseMode" config:type="boolean">false</config:config-item>
      <config:config-item-map-indexed config:name="Views">
        <config:config-item-map-entry>
          <config:config-item config:name="ViewId" config:type="string">view2</config:config-item>
          <config:config-item config:name="ViewLeft" config:type="int">3708</config:config-item>
          <config:config-item config:name="ViewTop" config:type="int">3002</config:config-item>
          <config:config-item config:name="VisibleLeft" config:type="int">0</config:config-item>
          <config:config-item config:name="VisibleTop" config:type="int">0</config:config-item>
          <config:config-item config:name="VisibleRight" config:type="int">30030</config:config-item>
          <config:config-item config:name="VisibleBottom" config:type="int">17568</config:config-item>
          <config:config-item config:name="ZoomType" config:type="short">0</config:config-item>
          <config:config-item config:name="ZoomFactor" config:type="short">100</config:config-item>
          <config:config-item config:name="IsSelectedFrame" config:type="boolean">false</config:config-item>
        </config:config-item-map-entry>
      </config:config-item-map-indexed>
    </config:config-item-set>
    <config:config-item-set config:name="ooo:configuration-settings">
      <config:config-item config:name="AddParaTableSpacing" config:type="boolean">true</config:config-item>
      <config:config-item config:name="PrintReversed" config:type="boolean">false</config:config-item>
      <config:config-item config:name="OutlineLevelYieldsNumbering" config:type="boolean">false</config:config-item>
      <config:config-item config:name="LinkUpdateMode" config:type="short">1</config:config-item>
      <config:config-item config:name="PrintEmptyPages" config:type="boolean">true</config:config-item>
      <config:config-item config:name="IgnoreFirstLineIndentInNumbering" config:type="boolean">false</config:config-item>
      <config:config-item config:name="CharacterCompressionType" config:type="short">0</config:config-item>
      <config:config-item config:name="PrintSingleJobs" config:type="boolean">false</config:config-item>
      <config:config-item config:name="UpdateFromTemplate" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrintPaperFromSetup" config:type="boolean">false</config:config-item>
      <config:config-item config:name="AddFrameOffsets" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrintLeftPages" config:type="boolean">true</config:config-item>
      <config:config-item config:name="RedlineProtectionKey" config:type="base64Binary"/>
      <config:config-item config:name="PrintTables" config:type="boolean">true</config:config-item>
      <config:config-item config:name="ChartAutoUpdate" config:type="boolean">true</config:config-item>
      <config:config-item config:name="PrintControls" config:type="boolean">true</config:config-item>
      <config:config-item config:name="PrinterSetup" config:type="base64Binary"/>
      <config:config-item config:name="IgnoreTabsAndBlanksForLineCalculation" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrintAnnotationMode" config:type="short">0</config:config-item>
      <config:config-item config:name="LoadReadonly" config:type="boolean">false</config:config-item>
      <config:config-item config:name="AddParaSpacingToTableCells" config:type="boolean">true</config:config-item>
      <config:config-item config:name="AddExternalLeading" config:type="boolean">true</config:config-item>
      <config:config-item config:name="ApplyUserData" config:type="boolean">true</config:config-item>
      <config:config-item config:name="FieldAutoUpdate" config:type="boolean">true</config:config-item>
      <config:config-item config:name="SaveVersionOnClose" config:type="boolean">false</config:config-item>
      <config:config-item config:name="SaveGlobalDocumentLinks" config:type="boolean">false</config:config-item>
      <config:config-item config:name="IsKernAsianPunctuation" config:type="boolean">false</config:config-item>
      <config:config-item config:name="AlignTabStopPosition" config:type="boolean">true</config:config-item>
      <config:config-item config:name="ClipAsCharacterAnchoredWriterFlyFrames" config:type="boolean">false</config:config-item>
      <config:config-item config:name="CurrentDatabaseDataSource" config:type="string"/>
      <config:config-item config:name="DoNotCaptureDrawObjsOnPage" config:type="boolean">false</config:config-item>
      <config:config-item config:name="TableRowKeep" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrinterName" config:type="string"/>
      <config:config-item config:name="PrintFaxName" config:type="string"/>
      <config:config-item config:name="ConsiderTextWrapOnObjPos" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrintRightPages" config:type="boolean">true</config:config-item>
      <config:config-item config:name="IsLabelDocument" config:type="boolean">false</config:config-item>
      <config:config-item config:name="UseFormerLineSpacing" config:type="boolean">false</config:config-item>
      <config:config-item config:name="AddParaTableSpacingAtStart" config:type="boolean">true</config:config-item>
      <config:config-item config:name="UseFormerTextWrapping" config:type="boolean">false</config:config-item>
      <config:config-item config:name="DoNotResetParaAttrsForNumFont" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrintProspect" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrintGraphics" config:type="boolean">true</config:config-item>
      <config:config-item config:name="AllowPrintJobCancel" config:type="boolean">true</config:config-item>
      <config:config-item config:name="CurrentDatabaseCommandType" config:type="int">0</config:config-item>
      <config:config-item config:name="DoNotJustifyLinesWithManualBreak" config:type="boolean">false</config:config-item>
      <config:config-item config:name="UseFormerObjectPositioning" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrinterIndependentLayout" config:type="string">high-resolution</config:config-item>
      <config:config-item config:name="UseOldNumbering" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrintPageBackground" config:type="boolean">true</config:config-item>
      <config:config-item config:name="CurrentDatabaseCommand" config:type="string"/>
      <config:config-item config:name="PrintDrawings" config:type="boolean">true</config:config-item>
      <config:config-item config:name="PrintBlackFonts" config:type="boolean">false</config:config-item>
    </config:config-item-set>
  </office:settings>
</office:document-settings>


So what part of this is defined in the ODF spec? Well, all of the elements are defined, but the problem is that only 4 elements are used. The meaning of each setting isn’t carried by the element name; it’s carried by the value of the “config:name” attribute.


None of those values are defined in the specification though, so there is no way for two applications that follow the spec to share any of these properties without the two applications working together to define a completely new standard within the standard.
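Here is a rough sketch of what that means for someone writing a consumer, using Python's standard XML parser on a cut-down version of the settings file. The two namespace URIs come from the file itself; the `read_config_items` helper and the shortened sample are just for illustration. Reading the values is trivial because the structure is generic, but all the code gets back is a bag of application-chosen names with nothing to say what they mean.

```python
import xml.etree.ElementTree as ET

# Namespace of the generic config elements, taken from the settings file.
CONFIG_NS = "urn:oasis:names:tc:opendocument:xmlns:config:1.0"

# Shortened sample, reduced to three items for illustration.
SAMPLE_SETTINGS = """<office:document-settings
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0"
    xmlns:ooo="http://openoffice.org/2004/office" office:version="1.0">
  <office:settings>
    <config:config-item-set config:name="ooo:configuration-settings">
      <config:config-item config:name="UseFormerLineSpacing" config:type="boolean">false</config:config-item>
      <config:config-item config:name="TableRowKeep" config:type="boolean">false</config:config-item>
      <config:config-item config:name="PrinterIndependentLayout" config:type="string">high-resolution</config:config-item>
    </config:config-item-set>
  </office:settings>
</office:document-settings>"""


def read_config_items(settings_xml):
    """Map each config:name value to its (config:type, text) pair."""
    root = ET.fromstring(settings_xml)
    name_attr = f"{{{CONFIG_NS}}}name"
    type_attr = f"{{{CONFIG_NS}}}type"
    items = {}
    # Every setting uses the same element; only the attribute value differs.
    for item in root.iter(f"{{{CONFIG_NS}}}config-item"):
        items[item.get(name_attr)] = (item.get(type_attr), item.text)
    return items


settings = read_config_items(SAMPLE_SETTINGS)
for name, (kind, value) in settings.items():
    print(name, kind, value)
```

The parser can tell you that `UseFormerLineSpacing` is a boolean set to false. What a second application should do differently when it is true is exactly the part that isn't written down.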


Do you want to remove them from the spec?


I guess the question is whether or not folks would rather just see section 2.15 of the spec removed and take an approach similar to ODF. I don’t see how that would help interoperability in any way though. Sure, it makes the spec smaller, but does that really help? Does it make it easier to move files from one application to another?


-Brian

Comments (26)

  1. Sinleeh says:

    I must declare that I haven’t written any standards before. One reason I read your and Weir’s blogs is because I want to understand the difference between the two document formats. I like both blogs, minus the increasingly hostile tone which I have detected lately.

    I haven’t read either the Open XML or ODF specs. Weir made his argument by quoting the Open XML specification, which is important, and I hope you will quote the specific ODF specification (section/paragraph at the very least). Just because OpenOffice.org writes out xml elements does not mean the xml element makes it into the ODF format. Without a quote from a specific ODF section, it does mean you did not present your arguments in as strong a light as possible.

    Rob Weir’s original post argues that Open XML specs, not MS or others implementation of Open XML, have xml elements that are clearly specific to legacy version of MS Office and they should not be there. This increases the need to quote from ODF in any counter-argument.

    Moreover, most of the config:name (some listed below) are rather obvious, especially if you take into account config:type

    ZoomFactor

    IgnoreTabsAndBlanksForLineCalculation

    ChartAutoUpdate

    PrintGraphics

    Another point is simply because some of the config:name looks cryptic does not mean they are not defined properly. To do so you will need to show that the specification did not define them. I know showing something is not there is not easy. However, if you say that section A.B.C mention "ViewAreaTop" but you cannot find anything in that section that defines what it is, most people, including me, will be satisfied that you made a reasonable effort and take your word at that, unless ODF proponent prove it to the contrary.

    ODF proponent always argues that Open XML choice of names/value are cryptic. Hence, to show that they have cryptic names/values do score points for OpenXML.

    Some of Weir’s argument do betray how he thinks legacy issue should be handled. See

    http://www.robweir.com/blog/2006/01/how-to-hire-guillaume-portes.html#6342215969521370348

    One thing that he argues constantly is that MS Office is likely to be the only one supporting the full spec. This argument is important because Open XML proponents have always said that full "backward" fidelity (not compatibility) by ALL applications is the aim of the Open XML spec, and therefore they are merely holding Open XML proponents to the standard they specified.

  2. hAl says:

    I can see that the approach by OpenOffice of creating config items that cannot be interpreted makes it difficult to interpret an ODF document as well.

    The config item approach seems simple and easy to use. I like that simple idea.

    However, when I meet any such config item it seems impossible to tell whether it is a deprecated feature or a new feature. Also, when a multitude of applications start creating their own ODF files there seems to be a big risk of naming different items with the same name, which in some cases might lead to strange undocumented behaviour.

    I wish for instance that the OpenOffice programmers had added an id to the config items, so instead of PrintGraphics something like OpenOffice_PrintGraphics, but the better way would have been for the config-items to have a property identifying the creator of the particular item.

    Something like:

    Config:config_origin="OpenOffice 2.0".

    and maybe an optional config status

    config:config_status="depreciated"

  3. marc says:

    in my opinion, if you put something in a specification, you must "specificate it". So, this autoSpaceLikeWord95-footnoteLayoutLikeWW8-mwSmallCaps-shapeLayoutLikeWW8-useWord2002TableStyleRules-useWord97LineBreakRules-wpJustification-etc thing is simply indefensible.

  4. Jean says:

    Sinleeh: I suggest Brian quote the whole 700+ pages of the ODF spec to convince you that what he mentions really doesn’t appear in the spec. 😉

    hAl: I agree with you, and I suggest adding a "namespace" attribute (to identify the origin of the config element), which would conform better to XML usage.

  5. jones206@hotmail.com says:

    Hey folks, just take a look at that clip of XML I pasted. You’ll see two things:

    1. The OpenOffice namespace is used for both configuration item sets showing that all of those configuration settings are from OpenOffice.

    2. The only things defined in the ODF spec are the config-item elements. None of the actual config:name attributes are defined anywhere in the spec.

    So, if OpenOffice and KOffice both go and follow the ODF specification, they will not be able to share any of these settings. They aren’t defined anywhere.

    Instead, they would have to get together and agree to create a new standard within the standard for what the config:name attribute values should be and what they would mean. This is why as currently stated it’s not interoperable.

    This is the case throughout the spec, which of course made it much easier to complete, and is also the reason why it’s such a smaller spec. The Open XML spec was being held to an extremely high bar (and still is), and everyone we talked to made it clear that we couldn’t leave out large pieces like this.

    As I said in earlier posts, if the fact that "autoSpaceLikeWord95" doesn’t include the algorithm for calculating the autospacing is the biggest problem with the spec, then I’m pretty happy about where we stand. It’s not going to affect any new documents going forward, and we make every attempt to get folks to turn it off.

    Remember, there were folks on the Ecma technical committee representing both OpenOffice and Apple. Neither side had a problem with this part of the spec, and instead were actually very impressed with the level of documentation and detail provided. The only people out there complaining are the folks from IBM who are banking on governments creating an ODF only policy.

    -Brian

  6. sinleeh says:

    Brian: I did a quick search for config-item in the ODF spec of 1st May 2005. It appears in section 2.4.2, pg 48. The section is part of "Section 2.4 Application Settings". I did not read the spec, but it sounds to me as if config-items are used to hold application-specific information, such as placement of menubars etc. It is extremely likely that the config-items you quoted in this blog post fall into this category.

    If used as "application settings", nobody will expect these settings to be portable, nor are they designed to be.

    hAl: it looks like "config-item" are designed to be used as a key-value map.

    HTH

  7. Francis says:

    Thanks for pointing out those omissions, Brian. I hadn’t read the ODF standard. Apparently, neither have some of its staunchest supporters.

    Still, a clear answer would be appreciated: are _all_ of the compatibility settings deprecated? This information is not in the OpenXML specification.

  8. Jean says:

    True, I missed this "name" attribute in the config set (it is not very well chosen, as a name rather refers to something private, an internal name used to identify the set among others, and not to a shared identifier…).

    To answer your initial question: personally I think that unspecified items should not appear in the specification of a standard. I agree that if it’s the only remaining issue with Open XML, it’s a rather good sign. But I believe ISO will require those elements to be removed from the spec before accepting it. And that should not be a real issue for Microsoft (or ECMA)…

  9. Brian, the fact that you are encouraging people not to use those compatibility flags does not matter at all here. There obviously will be documents with those flags turned on, right? Otherwise you wouldn’t have put this in the standard. So it’s just a corner case, but still: This means ONLY your office suite will be able to display those documents correctly, even if a competing program implemented the whole specification. Why? Because you didn’t specify how those flags affect the display of the document (a hell of a specification you have there…). I still haven’t seen any answer to this valid criticism. It’s a competitive advantage for Microsoft since the standard is incomplete and your company is the only one that has the missing parts.

    – Stephan

  10. Stefan Wenig says:

    Stephan, how is this any different from ODF documents with application specific config settings?

  11. sinleeh says:

    Stefan Wenig,

    The differences are

    (1)Whether it affects the presentation/content of the document, e.g., italics in the wrong places etc. Serious one as it affects the interpretation of the document by machines and humans

    (2)Whether it is just an indication to the application on how to display the non-important and non-portable things such as location of menu bars etc. Not important and is usually ignored. For example, you will want to ignore Photoshop’s menubar setting if you are simply displaying a Photoshop picture in your word document.

    Interesting questions and certainly worth debating. However comparing the two posts looks like comparing oranges with apples to me.

  12. marc says:

    The key here is that MSOOXML claims:

    "[O]pening Up Billions of Documents … Thanks to the depth of the technical resources the TC45 created, the Open XML standard covers the full set of features used in the existing corpus of billions of documents…"

    ( from http://www.ecma-international.org/news/PressReleases/PR_TC45_Dec2006.htm )

    So, this autoSpaceLikeWord95-footnoteLayoutLikeWW8-mwSmallCaps-shapeLayoutLikeWW8-useWord2002TableStyleRules-useWord97LineBreakRules

    thing becomes important.

    IMHO, if they are in the "standard", they must be clearly and completely specificated, or taken away.

    Does MS know the word "ethic" ?

  13. Dave S. says:

    If Microsoft had adopted a units model along with the values, errors such as poorly chosen date bases would not be a problem.

    If dates and their related constants had the units of MS1904-basis or Gregorian then one could always get it right.

    Explicit units would have provided universal interoperability instead of the diode model Office products employ. It’s hard to see how such a model is attractive in a heterogeneous environment.

  14. hAl says:

    @marc

    Standards are often filled with deprecated features that are not specified in detail. Even a lot of ISO standards have those kinds of deprecated features.

    In this case the only reason for the existence of those features is compatibility for older office documents. But not even MS Office 2007 supports all of those deprecated features. However they are there so that you can

    A) recognize that the document might not render the same as the original

    B) to see what might be the application that could recreate the original rendering.

    I would for instance use those features to recognize possible problems when doing a mass conversion of old Office binary files. Those kinds of compatibility issues would have been extremely hard to recognize within binary files. But now, after a conversion, I can easily create a search program to verify whether my older documents contain compatibility issues which might make them unfit for use in a newer application, and if necessary even correct those issues manually. This is especially important when faced with conversions of either libraries or legal documents.

    Especially with these kinds of rendering issues it is likely to be impossible to exactly recreate the original documents without using the original applications.

    MS Office does take some of the issues into next versions for compatibility reasons but leaves out others.

  15. sinleeh says:

    Dear hAl,

    Your reply to marc is by far the best response I have seen on having all the "deprecated" Office 95-2003 features. Congratulations.

    I agree those features can be used in the way you described, and it will definitely be very useful for third-party apps that cannot render the documents properly to flag them for manual intervention.

    We still need to discuss how these "deprecated" functions impact a non-MS application that sets out to faithfully maintain fidelity with older MS Office documents. Is the standard, as it stands, good enough? As other posters and I have pointed out, the reason we hold OOXML to this standard is that it has been "the aim" of the standard from day one. And being a standard, we must expect third-party, namely non-MS, applications to be able to achieve this aim.

  16. A says:

    It’s all great and wonderful to encourage people to stop using these options, but that doesn’t change the fact that if you convert a binary file (which I assume is what you intend people to do), some of those options carry over.  If the binary file used the printer metrics, then that option will appear in the XML version of the format too.  

    I assume that because the option will appear in the file, that Word 2007 will respond to its presence in some way (in the case of the printer metrics option, the line spacing will change).  If one wants to be able to open the document with the same page breaking as Word, they’ll have to get the line spacing right first.

    If it just stores them for backwards compatibility, and turning them on or off has no effect on the display, then fine, then we don’t need to know what they are for.

    Trust me hAL, customers don’t want a message to pop up saying "This new application won’t show this the same as Word, please take some manual intervention to correct the problem".  They just want the file to look the same in both apps.  Impossible to do?  Probably, but you try telling an end user that.

    Now I agree with Brian, that if the meaning of some weird and rarely used option is the only problem with the spec, then the spec is pretty darn good.  Having worked for several years with the Word 97 spec, and reverse engineering what wasn’t in it, I don’t have any right to complain about the new one.  Even without the documentation I can figure most of it out.

    I’ve just got my fingers crossed that maybe in a future version of the spec we’ll get more details. As Sinleeh said,

    "And being a standard, we must expect third-party, namely non-MS, applications to be able to achieve this aim."

    There is no requirement for conformance…unless your customers demand it of you.

  17. jones206@hotmail.com says:

    Thanks again for all the comments everyone.

    Clearly, there are areas that we could look into improving for the next version of the spec. The ODF folks left out huge pieces of the specification. Our approach was to instead document everything. Some places could probably use a bit more information, and maybe TC45 will decide that those should be improved in the next version.

    Remember though that the large majority of these settings are also used in ODF documents. The problem is that they are all application specific and left undefined in the actual specification. That means that if you follow the ISO ODF standard, there is no interoperability to speak of in these areas. The Open XML spec on the other hand clearly specifies what they are to be called, and how they behave. There are a handful that could probably use better descriptions, but those are clearly in the minority.

    -Brian

  18. Stefan Wenig says:

    sinleeh,

    well argued, but what you say simply does not apply to all settings in the fragment brian posted. it’s ok for things like "ZoomFactor" to be ignored in other apps, but look at things like "DoNotJustifyLinesWithManualBreak" or  "UseFormerObjectPositioning". so i think my question is still valid.

  19. Christopher says:

    I’m not a proponent of either standard, but the difference seems clear to me.  The fact that the XML elements you quote are nested within

    [config:config-item-set config:name="ooo:view-settings"]

    and

    [config:config-item-set config:name="ooo:configuration-settings"]

    tags surely shows that these are *application*-specific settings that do not affect the document’s contents or layout, e.g. zoom factor and print settings are purely specific to OpenOffice.org.

    However, "shapeLayoutLikeWW8" and similar tags *do* in fact affect the layout of the document itself.

    Now I hear the argument that these elements in the OOXML spec are optional for software implementing the spec, but isn’t backwards compatibility really the whole reason behind the OOXML spec in the first place?

    It’s nice that Word 2007 tries to get users to convert from these legacy features, so surely this means Word must have been developed with a generic or flexible enough layout model that allows "useWord2002TableStyleRules" to be converted and marked up in OOXML?

    So, in that case, what’s the problem in specifying in the OOXML spec what the transforms to be applied are?

    Regards,

    Chris

  20. jones206@hotmail.com says:

    Christopher, just look at the settings. The ODF settings absolutely affect the layout. They just decided to try and sweep them under the carpet by using the OO namespace, and not documenting them in the spec.

    -Brian

  21. sinleeh says:

    Stefan Wenig

    I haven’t done an exhaustive check, and my reading of OOXML and ECMA is at best haphazard. However, the settings Brian mentioned are in a file called settings.xml and are application specific. This is a guess, but "DoNotJustifyLinesWithManualBreak" and "UseFormerObjectPositioning" may refer to how you want your new lines and new objects to be positioned.

    Brian,

    Not documenting settings that affect layout of existing elements is definitely a bad idea. ODF does not claim 100% layout fidelity so that does make the bad idea slightly less bad. Still, it is a bad idea.

    A cursory reading of the examples config-item you selected, stripping out all non-layout affecting material, seems to show that they take effect only if you start editing the document. As soon as one edit the document, one immediately lose all rights to insist on exactly the same layout, even if you undo your changes.

    For example, simply continue typing from the end of the paragraph can results in different font from the rest of the paragraph. It happens yesterday, today and extremely likely, tomorrow and is not an issue for document standard to solve.

    Hence, if the examples you have shown only affect the documents you decide to edit, I am afraid I do not have any sympathy for the users who find that the layout changes, whichever document format they use.

    The argument ODF proponents are putting forward is that legacy requirements of MS applications creep into the OOXML zip file. I am sure some of OO.o 1.0’s requirements do appear in the ODF zip file. Hence, the appropriate argument is not about how or why they did, but about how these legacy requirements are managed. Are they placed in sections that require everyone to interpret them, or are they contained in sections that only interested parties need to read?

    I repeated what you did on ODF and found that the settings you mentioned are in a separate file known as settings.xml. That makes a lot of difference in the argument on unspecified specs. If they were in content.xml or in styles.xml, Weir would be guilty of "the pot calling the kettle black". Still guilty, but less so, if they were in meta.xml. settings.xml seems to have a very low priority in the list of XML files in the ODF bundle. If my interpretation is right, it is rather inconsequential in the discussion of unspecified specs since it is designed to be unspecified.

    A,

    Yes, 99.999% of the customers don’t want "This new application won’t show this the same as Word, please take some manual intervention to correct the problem" coz 99% of them do not really have constraints on the actual layout. The remaining .999% (think graphic designers and people writing for magazines/journals who have page length limits) will appreciate this warning, even if the application cannot do anything to automatically adhere to the Word (or other application) layout.

  22. Stefan Wenig says:

    sinleeh,

    i just browsed through the settings list that brian posted, and i can’t help but think that this is a classical example of assuming the worst about msft and the best about everyone else – as long as they are up against msft.

    many of them are probably just settings that don’t really affect interoperability and layout, but i don’t think all of them are. i may be wrong, but then i think that rob weir will use just about anything against openxml that he can. take this posting: http://www.robweir.com/blog/2006/07/lost-in-translation.html

    he’s reading the scope for the prototype and jumping to conclusions about what’s intended to be in the final thing that don’t make any sense. and he has never approved my comment about this obvious mistake. he’s a smart guy, but i don’t see why he should be any more trustworthy than anyone from msft.

  23. sinleeh says:

    Dear Stefan Wenig,

    [On Rob Weir will use just about anything against open xml that he can] Yes, I agree he will. I try not to, but I seem to fall into the same trap often myself. I am obviously not aware of Rob Weir disapproving comments that he doesn’t like, but I would not be surprised by it coz there is a moderation policy on his blog.

    Weir is guilty of discussing only the sections he wants to discuss, but so is Brian. That is their right, as these are their blogs. To censor comments is arguably their right as well. However, I do not think it is ethical for anyone to do this.

    Personally, I see we have two opposing camps here: this blog and Weir’s blog. That is what makes me read both blogs. I never accept what either party says as absolute truth. I find that I can learn a lot about XML design by participating in discussions on both blogs.

    One potential criticism of Brian’s posts in general is that he does not quote from the ODF standard, unlike Weir. This gives me the impression that Brian did not bother to refer to the ODF specs at all when posting. Rightly or wrongly, Weir’s references to the OOXML specs lend his argument credibility.

    "Assuming the worst about MSFT and the best about everyone else" is one of my weaknesses, something I am trying very hard to eradicate. That bias is not good in any discussion.

    From Weir’s original post, this counter-post and his rebuttal, it is clear that both OOXML and ODF in one way or another contain settings about applications, including legacy applications, in their zip bundles. The questions of interest are where this occurs and how it impacts applications. Some of these settings are not properly documented in the specs, and maybe we should not be too cross if they are not. It is the designers’ choice. The only thing we can discuss is whether the designers achieved their goals, and the impact and merits of their design decisions. For both OOXML and ODF, that means interoperability. For OOXML, it means MS’s stated goal of faithfully reproducing legacy documents (and whether a third party who wants to do so can do it with reasonable effort). For ODF, it means whether it successfully supports cross-platform / multi-vendor use and whether some missing parts, e.g. spreadsheet formulas, are as bad as the opposition claims. There are other goals that we can discuss as we go along and as both camps start the discussions.

  24. marc says:

    regarding Weir’s and Brian’s blogs, i have to say that in general i don’t agree with Brian’s counter-arguments ( his words sound like PR blah blah ), but i congratulate him on his openness with the comments, because he doesn’t delete anything. Rob, learn from Brian and let the people express disagreement !! greetings to Brian.

  25. Stefan Wenig says:

    sinleeh,

    i agree quite a lot with your statements. work should focus on finding actual problems and getting them solved. i think it’s unfortunate that the ODF camp is attacking open xml so hard and basically claiming that it has no right to exist as a standard. (in the beginning they even claimed that it has no right to exist at all, but they obviously have learned that this is not a very promising strategy…)

    so, yes, criticism is ok. it’s not always right, though. further examination may lead to the conclusion that absolutely nobody is interested in implementing those legacy settings of open xml. then again, the opposite may be true.

    but come on, having the spreadsheet formulas not documented and criticizing the other party for some obscure legacy stuff… (and look at the conclusions rob draws, he’s not just pointing out the facts!)

    besides, i don’t like rob’s dismissive tone. as someone who disagrees with him a lot, i can’t help feeling pretty patronized. go figure, i got reasons for my opinion too.

  26. Fernando says:

    sinleeh,

    "One potential criticism of Brian’s posts in general is that he does not quote from the ODF standard, unlike Weir."

    That is exactly the point – Brian cannot quote from the ODF standard because *these elements are not documented there*.