Migrating Office 2003 documents to Office 2007: what happens to the old document properties?


We are moving toward more XML-based content. This gives us more flexibility for future ideas, but the first step is to migrate our existing documents, e.g. Word 2003 docs, to rich Word 2007 docx files. When those documents are stored in a Windows SharePoint Services V3 document library, this is not the easiest task once you think about columns and properties.

This post will explain one idea.

Prerequisites:
Windows SharePoint Services V3 or Microsoft Office SharePoint Server 2007

Source Document Library to store Word 2003 documents:

image

The extension will be DOC.

 

Target Document Library to store Word 2007 documents:

image

The extension will be DOCX.

Because DOC is a purely binary format, everything must be stored in its proper place inside the file. You can also edit such document properties with Windows Explorer (all currently supported Windows versions):

 

image

 

I uploaded Sample.doc to the DOC2003 doclib and displayed the column MyCity:

image

 

Most of us used this approach to add such “Properties” to the document body:

image

To refresh the value in the field, select the field and press F9.

 

This is the form we see when using the Word menu “File-Properties”:

image

That means we can also enter the text here, where it is “synchronized” with the Word property, the field, and the WSS column MyCity. We will miss this form in Word 2007.

Start to Migrate:

I opened the document in Word 2007 and saved it to the new doclib for Word 2007 documents.

image

Well, at the moment MyCity still holds the right value, the old one.

The “New Dialog” in Word is integrated into the ribbon:

image

 

Now I change the value of MyCity in the ribbon to “London”. The text of the MyCity field in the document body does not change, because we changed the value of the XML property MyCity and not the legacy document property the field refers to.

image

 

How does this look in the doclib now?

image

The migration is complete, yet I still see the old value of the old doc property in my document body.

What can I do?

With credits to my colleague Fritz Geiselhöriger, one of our Escalation Engineers for the Office client; thanks to him for this part of the information:

1.) Legacy Property and WSS Properties - where are they stored in a file?

====================================================

In binary Office compound files (doc, xls, ppt), all built-in and custom document properties are stored in dedicated property streams ("SummaryInformation" and "DocumentSummaryInformation"). Any application that understands the OLE compound format can therefore retrieve or even modify the properties there. For example, you can right-click such a binary file in Windows Explorer and view or edit its properties.
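
Before migrating a whole library it can help to confirm which of the two storage models a given file really uses, independent of its extension. The following is a minimal sketch, not part of the original post: it only compares the first bytes of a file with the well-known signatures of OLE compound files and ZIP packages. The function names are made up for illustration.

```python
# Signature of OLE compound files (binary doc, xls, ppt).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Signature of a ZIP local file header (docx, xlsx, pptx are ZIP packages).
ZIP_MAGIC = b"PK\x03\x04"


def storage_kind(header: bytes) -> str:
    """Classify a file by its leading bytes."""
    if header.startswith(OLE_MAGIC):
        return "ole-compound"
    if header.startswith(ZIP_MAGIC):
        return "zip-package"
    return "unknown"


def storage_kind_of_file(path: str) -> str:
    """Read only the first 8 bytes of the file and classify them."""
    with open(path, "rb") as handle:
        return storage_kind(handle.read(8))
```

A call such as storage_kind_of_file("Sample.doc") would be expected to report "ole-compound" for a Word 2003 file and "zip-package" after the file has been saved as docx.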

 

WSS properties are completely different. They are stored as a custom XML part within an Office 2007 OpenXML file. If you rename a *.docx file to the *.zip extension, for example, you can see a subfolder "customXml" holding all kinds of XML parts, including the WSS properties and the property definitions:

image
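
The same inspection can be done without renaming the file. The sketch below is an illustration under assumptions, not the author's method: it opens the package with Python's zipfile module and reports the root element of every XML part in the custom XML folder. The folder name is matched case-insensitively, and the demo package built by build_demo_package is a made-up stand-in for a real docx.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_custom_xml_parts(docx_file):
    """Return {part name: local name of its root element} for custom XML parts."""
    parts = {}
    with zipfile.ZipFile(docx_file) as package:
        for name in package.namelist():
            lowered = name.lower()
            if lowered.startswith("customxml/") and lowered.endswith(".xml"):
                root = ET.fromstring(package.read(name))
                # Strip the "{namespace}" prefix ElementTree puts on tag names.
                parts[name] = root.tag.split("}", 1)[-1]
    return parts


def build_demo_package():
    """Build a tiny in-memory ZIP that only mimics the layout of a docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<document/>")
        package.writestr(
            "customXml/item1.xml",
            '<p:properties xmlns:p="http://schemas.microsoft.com/office/2006/metadata/properties"/>',
        )
        package.writestr(
            "customXml/item2.xml",
            '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>',
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for part, root_name in list_custom_xml_parts(build_demo_package()).items():
        print(part, root_name)
```

The part whose root element is "properties" is the one that holds the WSS property values.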

 

If we take a closer look at the files in this "customXml" folder, we see one file "itemX.xml" that contains the values of our WSS properties. In the following example we defined a Choice property named "MyChoice" on MOSS with multiple checkbox values. For our current document, the two values "A" and "B" are selected:

 

<?xml version="1.0" encoding="utf-8" ?>
<p:properties xmlns:p="http://schemas.microsoft.com/office/2006/metadata/properties" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <documentManagement>
        <MyChoice xmlns="b62c8859-ff04-422d-93f8-0bcd7d1c6205">
            <Value>A</Value>
            <Value>B</Value>
        </MyChoice>
        <MyLookup xmlns="b62c8859-ff04-422d-93f8-0bcd7d1c6205">1</MyLookup>
    </documentManagement>
</p:properties>

 

In this example "b62c8859-ff04-422d-93f8-0bcd7d1c6205" is the GUID of our document library. The above namespace and the element "documentManagement" are expected and required by Office 2007 to map the properties correctly. This sample contains only property values, with no information that our property "MyChoice" uses checkboxes or which values are allowed. That definition can be found in another "itemX.xml" file:

 

<xsd:element name="MyChoice" ma:index="9" nillable="true" ma:displayName="MyChoice" ma:default="A" ma:description="Multiple Choice Test" ma:internalName="MyChoice">
    <xsd:complexType>
        <xsd:complexContent>
            <xsd:extension base="dms:MultiChoice">
                <xsd:sequence>
                    <xsd:element name="Value" maxOccurs="unbounded" minOccurs="0" nillable="true">
                        <xsd:simpleType>
                            <xsd:restriction base="dms:Choice">
                                <xsd:enumeration value="A" />
                                <xsd:enumeration value="B" />
                                <xsd:enumeration value="C" />
                            </xsd:restriction>
                        </xsd:simpleType>
                    </xsd:element>
                </xsd:sequence>
            </xsd:extension>
        </xsd:complexContent>
    </xsd:complexType>
</xsd:element>

 

Here we see that the default value for our MultiChoice property is "A" and that the possible values are A, B, or C. So the schema definition and the property data are stored in two different XML files within one Office 2007 OpenXML file package.
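
To make the split between values and definitions concrete, here is a minimal sketch that reads both XML samples with Python's ElementTree. It is an illustration only: the function names are made up, the schema fragment is wrapped in an xsd:schema element so that it parses on its own, and the namespace URIs declared for the ma: and dms: prefixes are assumptions that do not appear in the original samples.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
LIB = "b62c8859-ff04-422d-93f8-0bcd7d1c6205"  # library GUID from the sample above

PROPERTIES_XML = f"""<p:properties xmlns:p="http://schemas.microsoft.com/office/2006/metadata/properties">
  <documentManagement>
    <MyChoice xmlns="{LIB}"><Value>A</Value><Value>B</Value></MyChoice>
    <MyLookup xmlns="{LIB}">1</MyLookup>
  </documentManagement>
</p:properties>"""

# Wrapper element and the ma:/dms: URIs are assumptions made for this sketch.
SCHEMA_XML = f"""<xsd:schema xmlns:xsd="{XSD_NS}"
    xmlns:ma="http://schemas.microsoft.com/office/2006/metadata/properties/metaAttributes"
    xmlns:dms="http://schemas.microsoft.com/office/2006/documentManagement/types">
  <xsd:element name="MyChoice" ma:default="A" nillable="true">
    <xsd:complexType><xsd:complexContent><xsd:extension base="dms:MultiChoice">
      <xsd:sequence>
        <xsd:element name="Value" maxOccurs="unbounded" minOccurs="0">
          <xsd:simpleType><xsd:restriction base="dms:Choice">
            <xsd:enumeration value="A" />
            <xsd:enumeration value="B" />
            <xsd:enumeration value="C" />
          </xsd:restriction></xsd:simpleType>
        </xsd:element>
      </xsd:sequence>
    </xsd:extension></xsd:complexContent></xsd:complexType>
  </xsd:element>
</xsd:schema>"""


def local_name(tag):
    """Drop the "{namespace}" prefix ElementTree adds to tag names."""
    return tag.split("}", 1)[-1]


def read_wss_properties(xml_text):
    """Return {property name: text, or list of texts for multi-value properties}."""
    root = ET.fromstring(xml_text)
    container = root.find("documentManagement")
    result = {}
    if container is None:
        return result
    for prop in container:
        values = [v.text or "" for v in prop if local_name(v.tag) == "Value"]
        result[local_name(prop.tag)] = values if values else (prop.text or "").strip()
    return result


def read_allowed_values(schema_text, prop_name):
    """Return the enumeration values defined for one property in the schema part."""
    root = ET.fromstring(schema_text)
    for element in root.iter(f"{{{XSD_NS}}}element"):
        if element.get("name") == prop_name:
            return [e.get("value") for e in element.iter(f"{{{XSD_NS}}}enumeration")]
    return []


if __name__ == "__main__":
    print(read_wss_properties(PROPERTIES_XML))
    print(read_allowed_values(SCHEMA_XML, "MyChoice"))
```

Comparing the two results shows that the selected values "A" and "B" come from one part while the list of allowed values comes from the other.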

 

 

Some Comments:

·          Previous Office versions do not support WSS properties. When Office 2007 files are opened there (with the Office Compatibility Pack file converters installed in order to read the new OpenXML format), all WSS properties are simply added to the legacy file properties. So all properties (legacy and WSS) show up in the legacy Office "File-Properties" dialog, and there is no longer any difference between them. This also means that special types of WSS properties (Choice, Lookup, MultiLine, …) are just normal text properties in older Office applications.

·          WSS properties are not fully exposed to the Windows Shell. If you check the file properties of an OpenXML file in Windows Explorer, you will not see them, and you will not see any "Custom" tab for custom document properties either. You will only see the built-in legacy properties like Author, Comments, …

·          In Office 2003, when you opened a file from SharePoint and clicked "File-Properties", you first got a web dialog with all properties (as these properties are retrieved from SPS, you can see property types not directly supported by Office, e.g. MultipleChoice values):
image
In this dialog you can click "File Properties" and you will see the Office 2003 legacy properties dialog. We can see that our WSS property "MyChoice" has also been mapped to the custom Office property "MyChoice", though here it is just a normal text property and no longer a multiple choice property:
image


 

2.) How to access WSS properties in Office UI?

=================================

Again: older Office versions do not directly support WSS properties. You can, however, click "File Properties" in Office, which opens the "Web File Properties" window shown above and reflects our WSS properties as well. This only works if the document was opened from SharePoint; otherwise you only get the legacy file properties dialog and see the WSS properties as normal legacy text properties.

 

In Office 2007 a new properties ribbon was implemented. If you don't see this ribbon after opening a document, just click the Office Button and select "Prepare / Properties". You can now choose between "Server-Properties" (WSS), "Document Properties" (common legacy file properties), and "Advanced Properties" (our legacy Office File-Properties dialog).

 

 

3.) How to insert mapped data controls showing our WSS properties?

===============================================

Word 2007 is the first version that offers content controls for viewing and editing text which can also be data-bound to any kind of XML content within our file. Because the WSS properties are just another custom XML part in the file, we can map content controls to them as well.

 

In order to insert new content controls you need to enable the Developer ribbon in Word. You can do this by clicking the Office Button and then "Word Options". There you can enable the option "Show Developer Tab in the Ribbon" in the "Popular" category.

 

In the Developer ribbon you can now select a rich text or text control from the "Controls" block and insert it into your document. You cannot data-bind it to specific data in your file through the UI; this has to be done programmatically. There's a nice blog describing how data binding can be done for those content controls:

http://blogs.msdn.com/erikaehrli/archive/2006/08/11/word2007DataDocumentGenerationPart1.aspx

 

 

Sub CreateMappingSample()

    Dim objCustomPart As CustomXMLPart
    Dim objContentControl1 As ContentControl

    ' Clear document:
    ActiveDocument.Content.Delete

    ' Add a line of text:
    Selection.TypeText "Date for our text control: " & vbTab

    ' Now add a text content control which will later be mapped to a WSS property:
    Set objContentControl1 = Selection.Range.ContentControls.Add(wdContentControlText)

    ' Step through all XML parts in our file and search for the "properties" metadata part.
    ' If no part matches, objCustomPart is Nothing after the loop:
    For Each objCustomPart In ActiveDocument.CustomXMLParts
        If objCustomPart.DocumentElement.BaseName = "properties" Then Exit For
    Next objCustomPart

    ' We hopefully found our properties xml part. Now map our text control to our WSS property:
    If Not (objCustomPart Is Nothing) Then
        objContentControl1.XMLMapping.SetMapping "/ns0:properties/documentManagement/ns2:MyText", , objCustomPart
    End If

End Sub

 

 

Comment: The macro above uses the two namespace prefixes ns0: and ns2:, which are used in the schema definition XML part, so they must be specified in our mapping as well. In my case ns0: is always the predefined namespace for our WSS metadata, http://schemas.microsoft.com/office/2006/metadata/properties, and ns2: is the GUID of my SPS document library.
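
The prefixes themselves are only aliases; what matters is the namespace each one stands for. The sketch below illustrates this outside of Word with Python's ElementTree, which is not the engine Word uses. The function name and the sample XML are made up for illustration. ElementTree only supports relative paths from an element, so the root step /ns0:properties is checked separately.

```python
import xml.etree.ElementTree as ET

PROPS_NS = "http://schemas.microsoft.com/office/2006/metadata/properties"
LIBRARY_NS = "b62c8859-ff04-422d-93f8-0bcd7d1c6205"  # example library GUID from this post

SAMPLE = f"""<p:properties xmlns:p="{PROPS_NS}">
  <documentManagement>
    <MyCity xmlns="{LIBRARY_NS}">London</MyCity>
  </documentManagement>
</p:properties>"""


def select_property(xml_text, name, prefixes):
    """Resolve /ns0:properties/documentManagement/ns2:<name> and return its text."""
    root = ET.fromstring(xml_text)
    # Root step: the document element must be "properties" in the ns0 namespace.
    if root.tag != f"{{{prefixes['ns0']}}}properties":
        return None
    # Remaining steps: documentManagement has no namespace, the property uses ns2.
    node = root.find(f"documentManagement/ns2:{name}", prefixes)
    return None if node is None else node.text


if __name__ == "__main__":
    good = {"ns0": PROPS_NS, "ns2": LIBRARY_NS}
    bad = {"ns0": PROPS_NS, "ns2": "some-other-library-guid"}
    print(select_property(SAMPLE, "MyCity", good))
    print(select_property(SAMPLE, "MyCity", bad))
```

With the matching library GUID the lookup finds the MyCity value; with any other GUID behind ns2 it finds nothing, which is why a mapping copied from another library may silently fail to bind.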

 

After running this code you can edit your WSS property "MyText" in Word 2007 not only in the properties pane but also directly in the document text, inside our text control. Both are data-bound to the same XML WSS property part:

image

 

How does this look with our Sample.docx?

Open the VBA editor and add the following routine to the document.

 

Sub CreateMappingSample()

    Dim objCustomPart As CustomXMLPart
    Dim objContentControl1 As ContentControl

    ' Go to the end of the doc and add a line of text:
    Selection.EndKey Unit:=wdStory
    Selection.TypeText "Our text control: " & vbTab

    ' Now add a text content control which will later be mapped to a WSS property:
    Set objContentControl1 = Selection.Range.ContentControls.Add(wdContentControlText)

    ' Step through all XML parts in our file and search for the "properties" metadata part.
    ' If no part matches, objCustomPart is Nothing after the loop:
    For Each objCustomPart In ActiveDocument.CustomXMLParts
        If objCustomPart.DocumentElement.BaseName = "properties" Then Exit For
    Next objCustomPart

    ' We hopefully found our properties xml part. Now map our text control to our WSS property:
    If Not (objCustomPart Is Nothing) Then
        objContentControl1.XMLMapping.SetMapping "/ns0:properties/documentManagement/ns2:MyCity", , objCustomPart
    End If

End Sub

-------------------------------------------------------------------

After you run the code the document looks like this:

image

The rest is straightforward. You should think about how you want to migrate your documents and how to provide this kind of functionality with your new DOTM templates.

I live in the SharePoint world; for any Word/Excel (Office client) discussion, please refer to our newsgroups and contact our Word specialists, who also blog and can help you further.


Comments (1)

  1. rtao626 says:

    Nice Post!

    Did you notice that if you have a server property that is of the date data type and is also a required field, Office 2007 (Excel or Word, for example) disables it? Hence, users cannot save the document directly to the server.
