Difficult decisions between loose conformance and true interoperability

Rick Jelliffe had a great post earlier this week discussing the problems you get when you allow for really loose conformance. We've had discussions around this issue in Ecma for the past 9 months. On the one hand, we don't want people to feel like they have to implement the entire standard to be compliant. We want people to benefit from the work we've done, and to choose which parts of the standard they want to use. We also want people to use the standard as a launching pad for their own innovations. But as Rick points out, this can be problematic:

The ability to create extensions or subsets willy nilly is the antithesis of a standard. It is the difference between "I bought product X because it says it supports standard Y but it often fails and I am pissed" and "I bought product X because it says it has partial support for standard Y and I accept it doesn't work completely."

The approach we've settled on for conformance (you can read it in the 1.4 working draft of the spec) is to make it possible to use as much or as little of the spec as you want, but if you claim conformance to any part of the spec, you must fully conform to that part. If you don't fully implement a part, you can still be conformant as long as you state which things you did not implement. Here is the "Goals" description from the spec:

The goal of this clause is to define conformance, and to provide interoperability guidelines in a way that fosters broad and innovative use of the Office Open XML file format, while maximizing interoperability and preserving investment in existing files and applications (§4). By meeting this goal, this Standard benefits the following audiences:

  • Developers that design, implement, or maintain Office Open XML applications.
  • Developers that interact programmatically with Office Open XML applications.
  • Governmental or commercial entities that procure Office Open XML applications.
  • Testing organizations that verify conformance of specific Office Open XML applications to this Standard. (Note that this Standard does not include a test suite.)
  • Educators and authors who teach about Office Open XML applications.

So based on those goals, and the nature of the spec, we identified the following issues:

  1. The application domain encompasses a range of possible consumers (§4) and producers (§4) so broad that defining specific application behaviors would restrict innovation. For example, stipulating visual layout would be inappropriate for a consumer that extracts data for machine consumption, or that renders text in sound. Another example is that restricting capacity or precision runs the risk of diluting the value of future advances in hardware.
  2. Commonsense user expectations regarding the interpretation of an Office Open XML package (§4) play such an important role in that package's value that a purely syntactic definition of conformance would fail to effect a useful level of interoperability. For example, such a definition would admit an application that reads a package, and then writes it in a manner that, though syntactically valid, differs arbitrarily from the original.
  3. Legitimate operations on a package include deliberate transformations, making blanket change prohibitions inappropriate in the conformance definition. For example, collapsing spreadsheet formulas to their calculated values, or converting complex presentation graphics to static bitmaps, could be correct for an application whose published purpose is to perform those operations. Again, commonsense user expectation makes the difference.
  4. Existing files and applications exercise a broad range of formats and functionality that, if required by the conformance definition, would add an impractical amount of bulk to the Standard and could inadvertently obligate new applications to implement a prohibitive amount of functionality. This issue is caused by the breadth of currently available functionality and is compounded by the existence of legacy formats.

The next important thing to get clear on is what the standard actually specifies. Section 2.3 of Part 1 defines this:

To address the issues listed above, this Standard constrains both syntax and semantics, but it is not intended to predefine application behavior. Therefore, it includes, among others, the following three types of information:

  1. Schemas and an associated validation procedure for validating document syntax against those schemas. (The validation procedure includes un-zipping, locating files, processing the extensibility elements and attributes, and XML Schema validation.)
  2. Additional syntax constraints in written form, wherever these constraints cannot feasibly be expressed in the schema language.
  3. Descriptions of element semantics. The semantics of an element refers to its intended interpretation by a human being.
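
To make the first item a bit more concrete, here is a rough Python sketch of the early steps of such a validation procedure: un-zipping the package, locating the XML parts, and checking that each one parses. It is only an illustration, not the procedure from the spec. The function name check_package is made up for this example, and the "[Content_Types].xml" part name and the ".xml"/".rels" extensions are assumptions based on the package layout in the Open Packaging Conventions. Processing the extensibility elements and the actual XML Schema validation are left out, since the Python standard library has no schema validator.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed name of the content-types part (from the Open Packaging Conventions).
CONTENT_TYPES = "[Content_Types].xml"


def check_package(data: bytes) -> dict:
    """Return a mapping of XML part name -> root element tag.

    Raises ValueError if the data is not a ZIP archive, if the
    content-types part is missing, or if an XML part is not well-formed.
    Schema validation and extensibility processing are NOT performed.
    """
    if not zipfile.is_zipfile(io.BytesIO(data)):
        raise ValueError("not a ZIP archive")

    roots = {}
    with zipfile.ZipFile(io.BytesIO(data)) as pkg:
        names = pkg.namelist()

        # Step 1: locate the part that describes the rest of the package.
        if CONTENT_TYPES not in names:
            raise ValueError("missing " + CONTENT_TYPES)

        # Step 2: parse every XML part; binary parts (images etc.) are skipped.
        for name in names:
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                roots[name] = ET.fromstring(pkg.read(name)).tag
            except ET.ParseError as exc:
                raise ValueError(f"{name} is not well-formed: {exc}") from exc

    return roots
```

A real validator would take the parts found here and run each one through schema validation plus the written constraints from item 2. Whether the semantics in item 3 are respected is not something a check like this can tell you.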

We then defined document conformance and application conformance, and listed some interoperability guidelines.

It's actually interesting how much time it took for the TC to get really comfortable with this section of the spec. It's extremely important to get right, though, and Rick's post reminded me of that.

-Brian