What about Word 2003’s XML format?

I’ve had a few folks ask me about the XML format from Word 2003, and whether or not it will be supported in Word 2007. I mentioned this back in the fall, but in case you missed it, let me repeat that the Word 2003 XML format will continue to be supported in Word 2007.

There are a ton of folks out there who have already built solutions on top of the Word 2003 XML format, and those will continue to work. Everyone can decide for themselves whether they want to port those solutions forward into the new Open XML format, or keep them in the 2003 XML format. The new Open XML format is largely based on the Word 2003 XML format, so you’ll see a lot of similarities.

One of the benefits of the Open XML format over the Word 2003 XML format is the number of Office versions that will support it. As you all know by now, the new Open XML formats will work in Office 2000, XP, 2003, and 2007, while the Word 2003 XML format will work only in Word 2003 and 2007.

The XML formats in Word 2007 will be:

  1. Word document (.docx) – This is the default, and it’s in the Open XML format

  2. Word macro-enabled document (.docm) – This is the macro-enabled version of the Open XML format

  3. Word XML document (.xml) – This is a single XML file that is a serialized version of the Open XML format

  4. Word 2003 XML document (.xml) – This is exactly the same XML format that Word 2003 supported
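
Since two of these formats share the .xml extension, a tool that receives a file cannot rely on the extension alone. Below is a minimal sketch of telling the formats apart by content. The two namespace URIs and root element names are assumptions taken from shipping versions of Word rather than from this post, and `sniff_word_format` is a hypothetical helper name.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace URIs (not stated in the post; taken from shipping Word versions).
WORD_2003_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
SERIALIZED_NS = "http://schemas.microsoft.com/office/2006/xmlPackage"


def sniff_word_format(data: bytes) -> str:
    """Guess which of the Word XML formats a blob of bytes is."""
    # .docx and .docm are ZIP packages, so check for a ZIP container first.
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip package (.docx or .docm)"
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return "unknown"
    # ElementTree reports tags as "{namespace}localname".
    if root.tag == f"{{{WORD_2003_NS}}}wordDocument":
        return "Word 2003 XML"
    if root.tag == f"{{{SERIALIZED_NS}}}package":
        return "serialized Open XML"
    return "other XML"


sample = (
    b'<?xml version="1.0"?>'
    b'<w:wordDocument xmlns:w="' + WORD_2003_NS.encode() + b'"/>'
)
print(sniff_word_format(sample))  # prints "Word 2003 XML"
```

The ZIP check cannot distinguish .docx from .docm; that would require looking at the content types inside the package, which this sketch does not attempt.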

We will also continue to support opening anyone else’s XML files, just like we did in Word 2003. Here’s an entry I made back in the summer about opening your own XML files in Word: http://blogs.msdn.com/brian_jones/archive/2005/08/16/452478.aspx


Comments (9)

  1. rasx says:

    This is great news but what about VSTO hooking into Word 12? Right now, in 2003, when we run Application.Selection.XML() we get the Word 2003 format of XML. What format will be returned in Word 12? Will it be one "flattened" XML string of the entire Word 12 document?

  2. Great question. The range.xml property will still return the 2003 XML format. There is also a new property added called "wordOpenXML" (or something similar to that) that will return the serialized version of the Open XML format.


  3. Michael Locker MD says:

    Great question!

    Michael Locker MD

  4. JayV says:

    Hey Brian,

     I have a WordprocessingML schema implementation question. Even in Word 2003 XML, section properties (<sectPr>) appear in the last paragraph of a section. Why was it decided to put the section properties at the end of a section instead of the beginning?

     I see how a producer of a Word doc derives some small benefit, because it will no doubt have that information by the time it reaches the section end. What about consumers that display the document, though? To display a correctly wrapped page, a consumer needs the margins, page size, headers, footers, etc. This means the consumer must either 1.) read through the whole section before displaying anything within it, or 2.) once it encounters the section properties in the last paragraph of the section, most likely clear, rewrap, and redraw anything in the section that it has already displayed. This seems like a (possibly very large) performance hit for the consumer – either delay the display of content (as in 1.), or possibly display it incorrectly at first and then do redundant rewrapping to correct it later (as in 2.). Is there any possibility that the section properties will be output at the top of a section (or even in a separate XML part) in the future to avoid these drawbacks? Or is there a different reason behind putting them out at the end?

    Thanks for your insight,
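
To make the consumer-side cost in this question concrete, here is a minimal sketch of option 1: buffering paragraphs until the section properties arrive. It assumes the Open XML WordprocessingML namespace and the rule from the published specification that the final section's `sectPr` sits directly under the body; neither detail comes from this thread, and `split_sections` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the Open XML WordprocessingML vocabulary.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def split_sections(body):
    """Group paragraphs into (paragraphs, sectPr) pairs, one per section."""
    sections, pending = [], []
    for child in body:
        if child.tag == f"{{{W}}}p":
            pending.append(child)
            # A sectPr inside a paragraph's properties closes the section.
            sect = child.find(f"{{{W}}}pPr/{{{W}}}sectPr")
            if sect is not None:
                sections.append((pending, sect))
                pending = []
        elif child.tag == f"{{{W}}}sectPr":
            # The last section's properties are a direct child of the body.
            sections.append((pending, child))
            pending = []
    if pending:
        # Trailing paragraphs with no section properties at all.
        sections.append((pending, None))
    return sections


SAMPLE = f"""<w:body xmlns:w="{W}">
  <w:p><w:r><w:t>first</w:t></w:r></w:p>
  <w:p><w:pPr><w:sectPr><w:pgSz w:w="12240" w:h="15840"/></w:sectPr></w:pPr></w:p>
  <w:p><w:r><w:t>second</w:t></w:r></w:p>
  <w:sectPr><w:pgSz w:w="15840" w:h="12240"/></w:sectPr>
</w:body>"""

for paragraphs, sect in split_sections(ET.fromstring(SAMPLE)):
    page = sect.find(f"{{{W}}}pgSz")
    print(len(paragraphs), page.get(f"{{{W}}}w"))
# prints "2 12240" and then "1 15840"
```

Nothing in a section can be laid out until its `sectPr` has been seen, which is why the first two paragraphs are held in `pending` before the page width becomes known.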


  5. bill says:

    Navigation menu made in Macromedia Flash, easy to use: all you have to do is insert the SWF file in your web page like any other Flash file and place the XML configuration file next to the page that contains the menu (Flash file). You can edit the configuration (XML file) with a text editor like Notepad from Windows.

    The number of submenu items can be changed and main buttons can be hidden, so you can have up to 7 main buttons and up to 8 sub buttons.

    Sounds can also be enabled or disabled in the configuration file.

    A background file can be dynamically loaded in the menu; this background can be a SWF or a JPG file, and its name is set in the XML file.


  6. Francis says:

    An automated batch tool to convert Word 2003 XML files (and DOC) to the 2007 format would be nice. Think of Photoshop’s droplets, only with subdirectory recursion.

    Otherwise, users will have to convert their important files one by one, which means that many files will never be future-proofed and will end up unreadable in 8 years, when import converters are no longer provided for DOC files.

    (This will happen: from the Microsoft KB "Word 2002 and Word 2000 do not have an import converter for Word 2.0 for Windows or earlier.")

  7. Yves says:


    I can’t find any information about

    the "single XML file that is a serialized version of the Open XML format"

    in the Open XML Draft 1.3 (May 2006).

    Where can I find it?


  8. Hi Yves,

    The format that we are currently working on standardizing is just the main one, which uses ZIP (the default for Office). The serialized version is not currently planned as part of the standard. It just takes the standard and essentially replaces the ZIP container with an XML container.
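
The idea of swapping the ZIP container for an XML container can be sketched in a few lines. This is only an illustration of the concept, not Word's actual serialized schema: the `<package>` and `<part>` element names and the `flatten_package` helper are hypothetical, and only XML parts are handled (binary parts such as images would need an encoding like base64).

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def flatten_package(zip_bytes: bytes) -> bytes:
    """Wrap every XML part of a ZIP package in a single XML document."""
    package = ET.Element("package")  # hypothetical element name
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Record the part's path, then embed its parsed XML content.
            part = ET.SubElement(package, "part", name="/" + name)
            part.append(ET.fromstring(archive.read(name)))
    return ET.tostring(package, encoding="utf-8")


# Build a tiny stand-in package in memory (not a valid Word document).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<document><body/></document>")
    archive.writestr("docProps/core.xml", "<coreProperties/>")

flat = ET.fromstring(flatten_package(buffer.getvalue()))
print([part.get("name") for part in flat])
# prints ['/word/document.xml', '/docProps/core.xml']
```

The content of each part is unchanged; only the wrapper around the parts differs, which matches the description of the serialized version as the same standard in a different container.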