Highlighting in a document

I've had a lot of folks ask me to provide more information on what features are missing from ODF and why we decided to create our own XML format (Open XML). I didn't want to get too involved in pulling together a full detailed list, but it's probably worthwhile pointing things out every once in a while. Most of you know that ODF wasn't even around when we first started working on our XML formats, so that's really one of the big reasons. Another reason is that we needed to make sure that we created an XML format that all of our customers could (and would) use. We want our customers to move all their existing documents into this new format and we need them to be willing to use it as the default format. ODF just wouldn't have allowed us to achieve that (both because of a lack of functionality and because of different optimizations that sacrifice things like performance).

An area I just came across today that really surprised me was highlighting. I'm sure most folks are familiar with highlighting in a Word document. You can use highlighting to call attention to different areas in a document, either for yourself or to point things out to others. The key thing about highlighting is that it does not affect any other formatting. Character shading (aka background-color in ODF), for instance, will still be preserved when you highlight some text. I've seen some implementations out there that try to use shading as a substitute for highlighting, but that doesn't really work because people may also want to apply shading in addition to highlighting. For example, you may have a range of text shaded with light gray (i.e. the background-color is light gray), and then you want to highlight some of the text in that range. Then, once folks have reviewed the document, you want to remove the highlighting without removing the gray shading. In the ODF spec I saw support for shading on text but not for highlighting, and we view those as two different things (I only saw mention of highlighting on tables).
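
To make the distinction concrete, here is a minimal sketch in Python of a single run of text that carries both shading and highlighting as separate properties, so that one can be removed without touching the other. It assumes the run-property element names `w:highlight` and `w:shd` and the namespace URI from the published WordprocessingML schema, which may not match the draft discussed here. The helper names `q`, `make_run`, and `remove_highlight` are made up for illustration and are not part of any Office API.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the published WordprocessingML schema.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def q(tag):
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return f"{{{W}}}{tag}"


def make_run(text, shading_fill=None, highlight=None):
    """Build a run (w:r) whose properties may hold shading and/or highlighting."""
    run = ET.Element(q("r"))
    rpr = ET.SubElement(run, q("rPr"))
    if highlight is not None:
        # Highlighting is stored as its own property (a named color).
        ET.SubElement(rpr, q("highlight"), {q("val"): highlight})
    if shading_fill is not None:
        # Shading is a separate property with its own fill color (hex RGB).
        ET.SubElement(rpr, q("shd"), {q("val"): "clear", q("fill"): shading_fill})
    t = ET.SubElement(run, q("t"))
    t.text = text
    return run


def remove_highlight(run):
    """Drop the highlight from a run and leave every other property alone."""
    rpr = run.find(q("rPr"))
    if rpr is None:
        return
    for h in rpr.findall(q("highlight")):
        rpr.remove(h)


# Light gray shading plus a yellow highlight on the same text.
run = make_run("text under review", shading_fill="D9D9D9", highlight="yellow")
remove_highlight(run)  # the review is done, so clear the highlight
print(ET.tostring(run, encoding="unicode"))
```

After `remove_highlight` runs, the printed run still contains the `w:shd` element with its gray fill. If highlighting were emulated with a single background-color property instead, clearing the highlight would mean overwriting or deleting the shading value, and the original gray would have to be remembered somewhere else.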

I came across this the other day while I was looking through the ODF spec and comparing it to the Ecma draft, trying to get a better handle on why the ODF spec was so much lighter (700 pages compared to 4000). I wanted to see if there were things we could do to reduce the number of pages in the Open XML spec without losing any of the necessary information. It looks like while there are some things that can be done for minor size reductions, we just have a lot more functionality, and there is no way we could get it anywhere close to that small while still fully covering WordprocessingML, SpreadsheetML, and PresentationML. There are three reasons that we have so much more content. The first is that we are just representing a much richer set of features (since we have to XMLize all the existing Microsoft Office binary documents), so there is just a lot more to document. The second reason is that the ODF spec points off to other specs to provide the details for certain things. The third reason is that the Ecma Open XML spec is just a lot more detailed as to how things work. The WordprocessingML sections are the furthest along in the latest draft, and if you read through the paragraphs and rich formatting section for instance (Section 19), you'll see what I'm talking about. The ODF spec, on the other hand, is very light and vague on a number of issues (like the numbering format issue I pointed out earlier).

-Brian