I’ve been really backed up on email, so I apologize to those of you who’ve sent me questions directly and still haven’t heard back. I’m working to catch up on all of them. There are a number of great questions, and I figured I should post some of my replies on this blog, since other folks may be interested in the answers as well. I created a new category, so if you want to see just the email replies, go to the “email” category (this is the first post under that category).
Here’s the question (I’ll keep the names private):
I can see several advantages to the _rels file…but have a few questions about it also.
In addition to the RID number in document.xml, the image tag includes a shape id and type, and a width and height. The _rels file contains only the target. You state that this makes it easy to change an image and just update the _rels file…but if the image were of a different size, which size would be used when rendered in Word? And which one would you trust if you were writing software to convert this to some other format? Why not move all the info about the image to the _rels file?
I am a little concerned about the way the _rels ID numbering starts at 1 for each document. Is there a unique GUID for each document that relates it to the rels file? If multiple documents are concatenated (externally to Word) then the _rels files would need to be combined and all IDs in document.xml and _rels.xml would need to be updated. It wouldn’t look as pretty, but would make it a lot easier to work with multiple documents if the IDs included a unique GUID.
The same thing is true of bookmarks that are used in cross-references between Word documents. If the bookmarks aren’t made unique through the use of some docID or GUID, then when Word docs are combined there is a chance of multiple identical bookmarks…which makes xref targets a bit uncertain.
Other than these points, I think the new format will be a great improvement over the old doc format, and when converting to other formats will save the extra step of saving as html or xml and then cleaning up the results.
Early on, we discussed putting more information into the relationships, but it’s a slippery slope. You end up either duplicating a bunch of data or making it confusing to figure out where all the display information lives. In the end we decided that only the resource location and type should be part of the relationship. All other information about how the resource is used (like scaling) should live in the markup. Remember that the size information in the markup is not “meta-data” about the size of the image, but instructions on how to display it (what height and width to scale the image to). So if you swap in an image of a different size, it is still scaled to the height and width given in the markup.
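To make that split concrete, here is a minimal Python sketch. The markup in DOCUMENT_XML is a hypothetical, simplified stand-in (not the real Word image markup), the relationship Type URI is a placeholder, and the function names are made up for illustration. The Relationships namespace is the one I’d expect from the published packaging conventions, so treat that as an assumption too. The only point is that the target comes from the relationship while the display size comes from the markup.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the relationships part.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# The relationship carries only an ID, a type, and a target.
RELS_XML = f"""<Relationships xmlns="{RELS_NS}">
  <Relationship Id="rId1" Type="http://example.invalid/image" Target="media/image1.png"/>
</Relationships>"""

# Hypothetical, simplified stand-in for the image markup in document.xml.
DOCUMENT_XML = """<document>
  <image rid="rId1" width="200" height="100"/>
</document>"""


def load_relationships(rels_xml):
    """Map each relationship ID to its type and target."""
    root = ET.fromstring(rels_xml)
    return {
        rel.get("Id"): {"type": rel.get("Type"), "target": rel.get("Target")}
        for rel in root.findall(f"{{{RELS_NS}}}Relationship")
    }


def resolve_images(document_xml, rels_xml):
    """Combine the target (from the relationship) with the size (from the markup)."""
    rels = load_relationships(rels_xml)
    images = []
    for img in ET.fromstring(document_xml).iter("image"):
        rel = rels[img.get("rid")]
        images.append({
            "target": rel["target"],              # where the image lives
            "width": int(img.get("width")),       # how to display it
            "height": int(img.get("height")),
        })
    return images
```

Pointing rId1 at a different file changes only the target that comes back from resolve_images; the width and height still come from the markup.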
The relationship IDs really have no significance. You could use a relationship ID of “foobar1” if you wanted to. In Office, we just decided to start at 1 and count up from there. The only rule is that each relationship ID must be unique within a given part; IDs don’t need to be unique across the entire document. A GUID would obviously have worked, but it would have been overkill. Another really important point is that we don’t preserve relationship IDs. When we open the file, it’s converted into our internal memory structures, and at save time we do a full save. We regenerate the relationship IDs, and they won’t necessarily match what they were on open (of course, the references to each ID in the document markup are updated to match the new ID).
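If you are combining documents yourself, the same idea applies: since the IDs carry no meaning, you can renumber them as long as the references move with them. Here is a rough Python sketch of that. It is not how Office does its save; the data shapes, the function name, and the “rId” prefix are all assumptions made for illustration.

```python
def renumber_relationships(relationships, references, start=1):
    """Assign fresh sequential IDs to one part's relationships.

    relationships: list of (id, target) pairs for a single part
    references:    list of relationship IDs as they are used in that part's markup
    Returns the renumbered relationships and the updated references.
    """
    ids = [rel_id for rel_id, _ in relationships]
    # The one real rule: IDs must be unique within a given part.
    if len(ids) != len(set(ids)):
        raise ValueError("relationship IDs must be unique within a part")

    # Old ID -> new ID. The "rId" prefix is an assumed convention.
    mapping = {
        old_id: f"rId{number}"
        for number, (old_id, _) in enumerate(relationships, start)
    }

    new_relationships = [(mapping[old_id], target) for old_id, target in relationships]
    # Every reference in the markup is rewritten to match its new ID.
    new_references = [mapping[ref] for ref in references]
    return new_relationships, new_references
```

When appending a second document’s relationships to the first, you could pass a start value one past the first document’s highest number so the two sets don’t collide.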
In Word, bookmark names must be unique within a document. This is enforced, so you know you will not have to deal with conflicts.
Hope that helps.