Inclusion of alternate formats


Here’s a question I got from someone wanting to store alternative formats in the files:



Want to know if alternative formats can also be stored in the same office XML zip format. E.g., Is it possible that the XML file format also stores the 2003 binary format as an alternative? Or a pdf version of it along?


This would provide the user another level of backup if some part of the XML format is corrupt.


Does the current Open XML schema allow such inclusions?


I have access to the beta 1, but I could not create such documents. Do you see it as a possibility in near long term (office 12 GM)


If yes, how would the Office UI react to opening such documents?


This is actually an issue we thought through a lot at the beginning of the project. One of the advantages of the packaging model we chose is that it’s easy to add additional content to the files. This could be used in a number of different scenarios. One potential use would be something similar to the binder (an old feature in earlier versions of Office that let you take multiple files and bind them up into one “project file”). Another case (which is what this question was about) is the ability to take just one document but embed multiple representations of it, so that viewers can choose whichever format they support best.


For Office 2007 at least, we aren’t planning to support either of these scenarios. That said, there is nothing in the format to prevent a solution provider from extending Office to output alternate representations. Office would also have no problem opening such a file, as long as the proper relationships were set up in the package. For example, someone could capture the save event and, in addition to saving the file, programmatically do a save to PDF and put that output into the package as well.
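To make the "proper relationships" part a bit more concrete, here is a rough sketch in Python of what a post-save step might do to an existing package. It is only an illustration, not how Office or any shipping tool does it: the function name `add_alternate`, the part name `alt/document.pdf`, and the relationship type URI are all made up for the example, and the namespaces assume the final Open Packaging Conventions ones. It also assumes the PDF bytes were produced elsewhere. The sketch adds the part to the zip, declares its content type in `[Content_Types].xml`, and adds a package-level relationship in `_rels/.rels` that points at it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Hypothetical relationship type, invented for this sketch only.
ALT_REL_TYPE = "http://example.com/relationships/alternate-representation"


def _to_xml(root, ns):
    # Serialize with `ns` as the default namespace so no prefixes are emitted.
    ET.register_namespace("", ns)
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def add_alternate(package, part_name, payload, extension, content_type):
    """Return a copy of `package` (bytes) with `payload` added as a new part
    that is the target of a package-level relationship."""
    src = zipfile.ZipFile(io.BytesIO(package))
    types = ET.fromstring(src.read("[Content_Types].xml"))
    rels = ET.fromstring(src.read("_rels/.rels"))

    # Declare a content type for the extension unless one is already there.
    known = {d.get("Extension", "").lower()
             for d in types.findall(f"{{{CT_NS}}}Default")}
    if extension.lower() not in known:
        ET.SubElement(types, f"{{{CT_NS}}}Default",
                      Extension=extension, ContentType=content_type)

    # Pick a relationship Id that is not already in use.
    used = {r.get("Id") for r in rels}
    n = 1
    while f"rIdAlt{n}" in used:
        n += 1
    ET.SubElement(rels, f"{{{REL_NS}}}Relationship",
                  Id=f"rIdAlt{n}", Type=ALT_REL_TYPE, Target=part_name)

    # Copy every existing entry, swapping in the two updated XML parts,
    # then append the new part itself.
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename == "[Content_Types].xml":
                data = _to_xml(types, CT_NS)
            elif item.filename == "_rels/.rels":
                data = _to_xml(rels, REL_NS)
            else:
                data = src.read(item.filename)
            dst.writestr(item.filename, data)
        dst.writestr(part_name, payload)
    return out.getvalue()
```

A caller might then do something like `add_alternate(docx_bytes, "alt/document.pdf", pdf_bytes, "pdf", "application/pdf")` and write the result back to disk. Whether a given build of Word opens the result cleanly, or keeps the extra part when the file is saved again, is not something this sketch can promise.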


-Brian

Comments

  1. Chris says:

    So would Office preserve other content in the package that it does not recognize?  In other words, if I stuff other data into a DOCX file and then open it with Word, when I resave it will Word keep the other content intact?

  2. Mike says:

    I think Brian’s answer nails it.

    But I just did a test with beta 1. Adding a .xls file into a .docx file produced a "document corrupt" error when opening it in Word 2007. And, maybe because of a bug, the document error leads to a repair tool which is unable to repair the file.

    In another test, I added the .xls file somewhere else than the root of the zip, and it produced the same error.

    Parts and their relationships seem to be exclusively checked at open time, and this forbids any extra file.

  3. BrianJones says:

    There are still some bugs in Beta 1, but there is a model for preservation of other content. We have a future extensibility mechanism that we designed which will allow for new features to be added in future versions without breaking Office 2007’s ability to open the files. The future content can either be removed, or preserved, and that just depends on how it’s used.

    There are specific areas where unknown content can go, and it will be preserved. Other areas allow for unknown content without preservation, so Office 2007 can open the files without error, but when the file is saved that content no longer exists. Of course, you can’t just put unknown content anywhere… only in areas that allow for it (which will all be documented).

    -Brian

  4. Mike says:

    Great news Brian. Before this, I thought the "easy single-file transport approach" was seeing its legs cut, and that was certainly disappointing.

  5. Wouter Schut says:

    I’m working my way back into your blog, so just now I am beginning to understand the packaging thing. 😉
