I saw this the other day on Slashdot. I have to admit this is the first I’ve heard of it, and I’m not really familiar with exactly what TextEdit outputs. The Slashdot post links to an example file, and if that’s really what’s being output, I’m surprised. Does anyone know for sure whether that is what TextEdit outputs? The posted file matches the XML format we had in the Beta 2 version of Word 2003. You can tell by the namespace: “http://schemas.microsoft.com/office/word/2003/2/wordml”. The final namespace that we shipped with was “http://schemas.microsoft.com/office/word/2003/wordml”. The file also has the processing instruction (PI) that tells the shell to launch the file in Word instead of IE. I talked about that behavior in this post last month.
If that really is what they output and you want to try opening that file in Word 2003, there are a couple of things you need to do. The first problem is the namespace (as I mentioned before). Just remove the “/2” and it will be in the right namespace. The same goes for the hint namespace. It should be “http://schemas.microsoft.com/office/word/2003/auxHint” and not “http://schemas.microsoft.com/office/word/2003/2/auxHint”.
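If you have more than one of these files, the namespace fix is easy to script. Here is a minimal sketch in Python, assuming the file is small enough to read into memory and that the two beta namespaces above are the only ones that need changing. The function name and the sample string are made up for illustration.

```python
# Map each Beta 2 namespace to the one that shipped with Word 2003.
# Only the two namespaces named above are covered; this is an assumption
# that nothing else in the file differs between the beta and final formats.
NAMESPACE_FIXES = {
    "http://schemas.microsoft.com/office/word/2003/2/wordml":
        "http://schemas.microsoft.com/office/word/2003/wordml",
    "http://schemas.microsoft.com/office/word/2003/2/auxHint":
        "http://schemas.microsoft.com/office/word/2003/auxHint",
}


def fix_beta_namespaces(xml_text):
    """Return xml_text with the Beta 2 namespace URIs replaced (hypothetical helper)."""
    for beta_uri, final_uri in NAMESPACE_FIXES.items():
        xml_text = xml_text.replace(beta_uri, final_uri)
    return xml_text


# Illustrative input, not the actual file from the Slashdot post.
sample = (
    '<w:wordDocument '
    'xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml" '
    'xmlns:wx="http://schemas.microsoft.com/office/word/2003/2/auxHint"/>'
)
fixed = fix_beta_namespaces(sample)
```

A plain string replacement is enough here because the change is purely textual. It does not validate the result against the schema, so Word may still complain if the beta format differs in other ways.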
There are some interesting comments on that post as well. It looks like some people thought this was a glimpse of the new XML formats for Office 12, but I think most people saw that this was the XML format from the last version. One really interesting comment was that the XML is overly complicated. As I’ve mentioned before, we could have gone with a really simple schema, but then it would not be full fidelity. We have over a billion Word documents out there today that we need to be able to represent in XML. We have to be able to represent every piece of Office functionality in XML, and that results in a pretty large schema. Word, PowerPoint, and Excel are rich in functionality. The files don’t have to be that complicated, though, if you don’t care about all the functionality. Just read this post to see how simple a file can be. As you start representing more functionality, though, it will become more complex.
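To give a rough idea of how small a file can be, here is a sketch in Python that builds a one-paragraph document. The element names (`w:wordDocument`, `w:body`, `w:p`, `w:r`, `w:t`) and the exact text of the `mso-application` PI are not spelled out in this post, so treat them as my assumptions about the shipped Word 2003 format. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Final (shipped) Word 2003 namespace, as quoted earlier in the post.
WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def minimal_wordml(text):
    """Build a one-paragraph document as a string (hypothetical helper).

    Element names and the PI text are assumptions, not taken from the post.
    """
    return (
        '<?xml version="1.0"?>\n'
        # PI that asks the shell to open the file in Word rather than IE.
        '<?mso-application progid="Word.Document"?>\n'
        f'<w:wordDocument xmlns:w="{WORDML_NS}">\n'
        '  <w:body>\n'
        # One paragraph (w:p) holding one run (w:r) holding one text node (w:t).
        f'    <w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>\n'
        '  </w:body>\n'
        '</w:wordDocument>\n'
    )


doc = minimal_wordml("Hello, World")
# Parsing only checks that the XML is well-formed, not that Word accepts it.
root = ET.fromstring(doc)
```

Everything beyond this (styles, fonts, document properties, tables) adds more markup on top of the same skeleton, which is where the size of the full schema comes from.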