Document Parsers in SharePoint (3 of 4): Parsers and Content Types

In this four-part series, I go over in detail how to construct and register a custom parser that enables you to promote and demote properties between your custom file types and Windows SharePoint Services.

Read part one here.

Read part two here.

Document Parsing and Content Types

When WSS invokes a document parser to promote document properties, the parser writes all document properties to a property bag object. WSS then determines which of these document properties to promote to matching document library columns. If the document has been assigned a content type, then WSS promotes the document properties that match the columns included in the content type.
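To make the promotion rule concrete, here is a minimal Python sketch of the selection step. It is not WSS code: the function name, the dictionary standing in for the property bag, and the idea of matching properties to columns by exact name are all assumptions made for illustration.

```python
def select_properties_to_promote(property_bag, library_columns, content_type_columns=None):
    """Return the parsed properties that would be promoted to columns.

    property_bag: dict of property name -> value written by the parser
                  (a stand-in for the real property bag object).
    library_columns: set of column names on the document library.
    content_type_columns: set of column names in the document's content
                  type, or None if no content type has been assigned.
    """
    # With a content type assigned, only its columns are candidates;
    # otherwise any document library column may match.
    if content_type_columns is not None:
        candidates = content_type_columns
    else:
        candidates = library_columns
    # Matching by exact name is an assumption of this sketch.
    return {name: value for name, value in property_bag.items() if name in candidates}
```

Properties that have no matching column stay in the document and are simply not promoted.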

Parsing Content Types in Documents

Using the document parser interface, document parsers can access the content type assigned to the document and, if desired, store the content type in the document itself. When WSS invokes the parser to parse a document and the parser writes the document’s content type to the property bag object as a document property, WSS compares this content type to the content types associated with the document library to which the document is being uploaded. If the document’s content type is one that is associated with the document library, WSS promotes the appropriate document properties and saves the document.

However, there are cases where the document’s content type may not actually be associated with the document library to which the user is uploading the document. For example, the user might have created the document from a document template that contained a content type; or the user might move a document from one document library to another.

If the document’s content type is not associated with the document library, WSS takes the following actions:

· If the document contains a document property for content type, but that document property is empty, WSS invokes the parser to demote the default list content type for the document library into the document. WSS then promotes the document properties that match columns in the default list content type, and stores the document.

This would occur if the document had not yet been assigned a content type.

· If the document is assigned a content type not associated with the document library, WSS determines whether the document library allows any content type. If so, WSS leaves the document’s content type as is. WSS does not promote the document content type; however, it does promote any document properties that match document library columns.

Lists can be set to allow any content type. To do this, add the Unknown Document Type content type to the list. Once this content type is on a list, documents of any content type can be uploaded to the list without having their content types overwritten. This enables users to move a document to the list without losing the document’s metadata, as would happen if the content type were overwritten.

· If the document is assigned a content type not associated with the document library, and the document library is not set to allow any content type, WSS invokes the parser to demote the default list content type for the document library into the document. WSS then promotes the document properties that match columns in the default list content type, and stores the document.

The figure below details the actions WSS takes when the parser includes the document’s content type as a document property in the property bag it returns to WSS after parsing a document.

WSS never promotes a document’s content type onto a document library.
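The decision steps described above can also be sketched in code. The following Python model is only an illustration of the rules in this post, not the WSS implementation: the `Library` and `Outcome` types, the `resolve_upload` function, and the `ContentType` property key are hypothetical names, and treating a missing content type property the same as an empty one is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

CONTENT_TYPE_KEY = "ContentType"  # hypothetical property name, for illustration only
UNKNOWN_DOCUMENT_TYPE = "Unknown Document Type"


@dataclass(frozen=True)
class Library:
    default_content_type: str
    content_types: Dict[str, Set[str]]  # content type name -> its column names
    columns: Set[str]                   # every column on the document library

    @property
    def allows_any_content_type(self) -> bool:
        # Adding the Unknown Document Type content type opens up the list.
        return UNKNOWN_DOCUMENT_TYPE in self.content_types


@dataclass
class Outcome:
    content_type_in_document: Optional[str]  # content type stored in the document afterwards
    demoted_default: bool                    # True if the default type was demoted into it
    promoted: Dict[str, str]                 # properties promoted to library columns


def resolve_upload(property_bag: Dict[str, str], library: Library) -> Outcome:
    # Assumption: a missing content type property is handled like an empty one.
    doc_type = property_bag.get(CONTENT_TYPE_KEY)
    props = {k: v for k, v in property_bag.items() if k != CONTENT_TYPE_KEY}

    def matching(columns: Set[str]) -> Dict[str, str]:
        return {k: v for k, v in props.items() if k in columns}

    # Content type is associated with the library: promote its columns.
    if doc_type and doc_type in library.content_types:
        return Outcome(doc_type, False, matching(library.content_types[doc_type]))

    # Unassociated type, but the library allows any content type:
    # leave the type alone and promote whatever matches library columns.
    if doc_type and library.allows_any_content_type:
        return Outcome(doc_type, False, matching(library.columns))

    # Empty type, or unassociated type on a restricted library:
    # demote the default list content type, then promote its columns.
    default = library.default_content_type
    return Outcome(default, True, matching(library.content_types[default]))
```

Note that the sketch never modifies the `Library` object, which mirrors the rule that a document’s content type is never promoted onto the document library.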

In my final post of this series, I'll introduce you to the document parser definition schema, and the document parser interface.