XML Document Property Parsing in SharePoint (4 of 5): Specifying Document Content Type for XML Parsing

This is the fourth in a five-part series on how to use the built-in XML parser in WSS V3 to promote and demote document properties in your XML files, including InfoPath 2007 forms.

Read part one here.

Read part two here.

Read part three here.

The built-in XML parser examines two document properties to determine which content type to assign to an XML document when a user first uploads it to a given document library. The parser must decide which of the content types associated with the document library to assign to the document before it can promote or demote document properties.

For a detailed examination of the process the parser performs to match a document’s content type with a content type associated with the document library, see part three.

Specifying Content Type by Content Type ID

The parser first looks for a processing instruction that specifies the document’s content type by content type ID. The location of this processing instruction is included in the definition for the content type ID column template. The processing instruction is named MicrosoftWindowsSharePointServices and contains an attribute named ContentTypeID that represents the ID of the document's content type.

<FieldRef
  ID="{4B1BF6C6-4F39-45ac-ACD5-16FE7A214E5E}"
  Name="Content Type ID"
  PITarget="MicrosoftWindowsSharePointServices"
  PIAttribute="ContentTypeID"/>

By default, all library list templates include a column that represents the content type ID.

Add this processing instruction to your XML document. Set the ContentTypeID attribute to the ID of the document’s content type.

For example:

<?MicrosoftWindowsSharePointServices ContentTypeID="0x010101003D7907A1908011d082BD08005AA74F5E00A557E10DA69DBF4C8BE1E21071B08163"?>
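If you generate your XML files programmatically, you can add the processing instruction at that point. Here is a minimal sketch in Python using the standard library's xml.dom.minidom. Python, the helper name add_content_type_pi, and the expenseReport root element are my own choices for illustration; none of them come from WSS itself.

```python
from xml.dom import minidom

# Processing instruction target named in the default column mapping.
PI_TARGET = "MicrosoftWindowsSharePointServices"


def add_content_type_pi(xml_text: str, content_type_id: str) -> str:
    """Return xml_text with a ContentTypeID processing instruction
    inserted ahead of the root element (hypothetical helper)."""
    doc = minidom.parseString(xml_text)
    # The body of a processing instruction is free text; by convention it
    # is written as pseudo-attributes, here ContentTypeID="...".
    pi = doc.createProcessingInstruction(
        PI_TARGET, f'ContentTypeID="{content_type_id}"'
    )
    doc.insertBefore(pi, doc.documentElement)
    return doc.toxml()


if __name__ == "__main__":
    ct_id = (
        "0x010101003D7907A1908011d082BD08005AA74F5E"
        "00A557E10DA69DBF4C8BE1E21071B08163"
    )
    print(add_content_type_pi("<expenseReport/>", ct_id))
```

The sketch places the processing instruction before the root element, which matches where the example above would normally sit in a document.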

The parser will fail to determine the content type in the following situations:

· The MicrosoftWindowsSharePointServices processing instruction isn’t present in the document

· The processing instruction does not specify a content type

· Neither the specified content type nor any parent or child of it is associated with the document library

If the parser cannot identify the content type by content type ID, it performs a second check, detailed below.
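To make those failure conditions concrete, here is a rough Python sketch of the first check. It is not the parser's actual implementation. The function names are hypothetical, it only looks at processing instructions that appear before the root element, and it treats two content types as parent and child when one ID is a prefix of the other, which is a simplification of the matching process described in part three.

```python
import re
from typing import Iterable, Optional
from xml.dom import minidom

PI_TARGET = "MicrosoftWindowsSharePointServices"
# Matches ContentTypeID="..." or ContentTypeID='...' inside the PI body.
_CT_ATTR = re.compile(r"""\bContentTypeID\s*=\s*(["'])(.*?)\1""")


def read_content_type_id(xml_text: str) -> Optional[str]:
    """Return the ContentTypeID value, or None if the processing
    instruction is missing or does not specify a content type."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == PI_TARGET):
            match = _CT_ATTR.search(node.data)
            if match and match.group(2):
                return match.group(2)
            return None
    return None


def id_check_can_succeed(doc_ct_id: Optional[str],
                         library_ct_ids: Iterable[str]) -> bool:
    """True if the library has the same content type, or (assumption)
    a parent or child of it, judged by ID prefix."""
    if not doc_ct_id:
        return False
    doc_id = doc_ct_id.lower()
    for lib_id in (ct.lower() for ct in library_ct_ids):
        # An identical ID also satisfies startswith, so this covers
        # the exact match as well as parent and child.
        if doc_id.startswith(lib_id) or lib_id.startswith(doc_id):
            return True
    return False
```

When id_check_can_succeed returns False, the sketch corresponds to the point where the parser moves on to its second check.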

Note The parser looks for the content type ID in whatever document location you specify in the field definition for the Content Type ID column on the document library. You can map the Content Type ID column to any processing instruction or XPath expression you choose. However, we recommend you adhere to the default mapping included in the content type ID column template definition. This minimizes the chance of having content types that specify a different location for this document property than the document library with which they are associated. Such a situation would lead to the XML parser looking in the wrong document location for the content type ID.

Specifying Content Type by Document Template

If the parser fails to determine a suitable content type for the document based on content type ID, it then looks for a processing instruction that contains the URL of the document template on which the document is based. The processing instruction is named mso-infoPathSolution and contains an attribute named href that represents the URL of the document template.

<FieldRef
  ID="{4B1BF6C6-4F39-45ac-ACD5-16FE7A214E5E}"
  Name="DocumentTemplate"
  PITarget="mso-infoPathSolution"
  PIAttribute="href"/>

This column is included in the Form content type, and is added to a library anytime that content type is added to the library.

So, rather than include a content type ID, you can add this processing instruction to your XML document. Set the href attribute to the URL of the document template on which the document is based.

For example:

<?mso-infoPathSolution href="https://www.adventureworks.com/templates/myTemplate.XML"?>

If the parser finds this processing instruction, it then examines the content types associated with the document library to determine if a content type has the same document template. If so, the parser assigns that content type to the document. If more than one content type associated with the document library has the same matching document template, the parser simply assigns the first content type it finds that matches.
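The first-match behavior is easy to see in a small Python sketch. Again, this is an illustration under my own assumptions and not the parser's code: the function names are hypothetical, the library's content types are modeled as an ordered list of (name, template URL) pairs, and the URLs are compared as exact strings, since the comparison rules the parser really uses aren't covered here.

```python
import re
from typing import Optional, Sequence, Tuple
from xml.dom import minidom

PI_TARGET = "mso-infoPathSolution"
# Matches href="..." or href='...' inside the PI body.
_HREF_ATTR = re.compile(r"""\bhref\s*=\s*(["'])(.*?)\1""")


def read_template_url(xml_text: str) -> Optional[str]:
    """Return the href value of the mso-infoPathSolution processing
    instruction, or None if it is not present."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == PI_TARGET):
            match = _HREF_ATTR.search(node.data)
            if match:
                return match.group(2)
    return None


def match_by_template(
    template_url: Optional[str],
    library_content_types: Sequence[Tuple[str, str]],
) -> Optional[str]:
    """Return the name of the first content type in the library whose
    document template equals template_url, or None if none matches."""
    if template_url is None:
        return None
    for name, ct_template_url in library_content_types:
        # Assumption: plain string equality stands in for the real comparison.
        if ct_template_url == template_url:
            return name  # first match wins; later duplicates are ignored
    return None
```

For instance, if two content types in the list share the template URL found in the document, match_by_template returns whichever one appears first.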

Note The parser looks for the document template URL in whatever document location you specify in the field definition for the Document Template column on the document library. You can map the Document Template column to any processing instruction or XPath expression you choose. However, we recommend you adhere to the default mapping included in the document template column template definition. This minimizes the chance of having content types that specify a different location for this document property than the document library with which they are associated. Such a situation would lead to the XML parser looking in the wrong document location for the document template.

In the final installment of this series, I'll show how you can include namespace prefixes in the XPath expressions you use to specify the location of a document property.