XML Document Property Parsing in SharePoint (1 of 5): XML Parser Overview

I've just finished putting together a lot of information about how document parsers work within Windows SharePoint Services V3, including how to use the built-in XML parser, and how to create your own custom parsers for custom file types. This material won't be included in the WSS SDK until the next major update, so I figured I'd give you a preview of it here. Over the next five posts, I'll cover how to use the built-in XML parser in WSS V3 to promote and demote document properties in your XML files, including InfoPath 2007 forms.

So without further ado:

WSS V3 includes a built-in XML document parser you can use to promote and demote the properties included in your XML documents. Your XML files can adhere to any schema you choose. As long as your XML file meets the requirements listed below, WSS V3 automatically invokes the built-in XML parser whenever document property promotion or demotion is required.

(Property promotion refers to extracting document properties from a document, and writing those property values to the appropriate columns on the document library where the document is stored. Property demotion refers to taking column values from the document library where a document is stored, and writing those column values into the document itself.)

Using the built-in XML parser for your custom XML files helps ensure that your document metadata is always up-to-date and synchronized between the document library and the document itself. Users can edit document properties in the document itself, and have the property values on the document library automatically updated to reflect their changes. Likewise, users can update property values at the document library level, and have those changes automatically written back into the document itself.

For WSS V3 to invoke the built-in XML parser for an XML file, that XML file must meet the following requirements:

· The file must have an extension of .xml.

· The file must not be a WordML file. WSS V3 contains a separate built-in parser for WordML files; WSS V3 automatically invokes this parser for XML files created using WordML.
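To make the two requirements concrete, here is a minimal Python sketch of the kind of check involved. It is not how WSS V3 itself is implemented: the function name is hypothetical, and detecting WordML by the Word 2003 WordprocessingML namespace on the root element is my assumption, as are the case-insensitive extension match and the handling of malformed XML.

```python
import os
import xml.etree.ElementTree as ET

# Namespace used by Word 2003 WordprocessingML (WordML) documents.
WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def uses_builtin_xml_parser(filename, content):
    """Hypothetical check mirroring the two requirements listed above."""
    # Requirement 1: the file must have an extension of .xml.
    # (Matching case-insensitively is an assumption.)
    if os.path.splitext(filename)[1].lower() != ".xml":
        return False
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        # Assumption: a file that is not well-formed XML does not qualify.
        return False
    # Requirement 2: the file must not be a WordML file.
    # (Inspecting the root element's namespace is an assumption.)
    return not root.tag.startswith("{" + WORDML_NS + "}")


print(uses_builtin_xml_parser("order.xml", b"<order><id>1</id></order>"))
print(uses_builtin_xml_parser("order.txt", b"<order><id>1</id></order>"))
```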

Additionally, for the XML parser to actually promote and demote document properties, the XML file should be assigned a content type that specifies where each document property is located in the document, and which content type column that property maps to. (We'll talk about that in a later entry in this series.)

XML Parser Processing

The following is a brief overview of how the built-in parser operates:

When a user uploads an XML document, WSS V3 examines the document to determine if the built-in XML parser should be invoked. If the document meets the requirements, WSS V3 invokes the parser to promote the appropriate document properties to the document library.

Once invoked, the XML parser examines the document to determine the document content type. The parser then accesses the document's content type definition. The content type definition includes information about each column in that content type; this information can include:

· The document property that maps to a given column, if there is one

· The location where the document property is stored in the document itself

Using this information, the XML parser can extract each document property from the correct location in the document, and pass these properties to WSS V3. WSS V3 then promotes the appropriate document property to the matching column included in the content type.
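The promotion steps above can be sketched roughly as follows. This is an illustration in Python, not the parser's actual code: the dictionary stands in for a content type definition, and the column names, sample document, and simple path expressions used as property locations are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a content type definition: each column name maps
# to the location of its document property, or to None when the column has
# no document property mapped to it.
CONTENT_TYPE_COLUMNS = {
    "Title": "./header/title",
    "Author": "./header/author",
    "Status": None,
}


def promote(document_xml, columns):
    """Extract each mapped property from the document; return column values."""
    root = ET.fromstring(document_xml)
    promoted = {}
    for column, location in columns.items():
        if location is None:
            continue  # no document property maps to this column
        node = root.find(location)
        if node is not None and node.text is not None:
            promoted[column] = node.text
    return promoted


doc = "<report><header><title>Q3 Results</title><author>Pat</author></header></report>"
print(promote(doc, CONTENT_TYPE_COLUMNS))
```

In this sketch, a column with no mapped property, or whose location is missing from the document, is simply skipped; what WSS V3 does in those cases isn't covered here.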

WSS V3 can likewise invoke the built-in XML parser to demote properties, writing content type column values from the document library into the document itself. When WSS V3 invokes the demotion function of the parser, it passes the parser the document and the column values to be demoted into the document. Once again, the parser accesses the document's content type definition. The parser uses the content type definition to determine:

· Which document properties map to the column values passed to it for demotion

· The location of those document properties in the document

Using this information, the parser writes the column values into the applicable document property locations in the document.
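Demotion is the same lookup run in the other direction. Here is a rough Python sketch under the same assumptions as before: the location map, column names, and sample document are hypothetical, and this is not the parser's real implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical map from column name to the property's location in the document.
PROPERTY_LOCATIONS = {
    "Title": "./header/title",
    "Author": "./header/author",
}


def demote(document_xml, column_values, locations):
    """Write column values into the matching property locations."""
    root = ET.fromstring(document_xml)
    for column, value in column_values.items():
        location = locations.get(column)
        if location is None:
            continue  # column has no mapped document property
        node = root.find(location)
        if node is not None:
            node.text = value
    return ET.tostring(root, encoding="unicode")


doc = "<report><header><title>Q3 Results</title><author>Pat</author></header></report>"
updated = demote(doc, {"Title": "Q4 Results", "Status": "Draft"}, PROPERTY_LOCATIONS)
print(updated)
```

Note that the "Status" value is dropped here because nothing in the location map says where it belongs in the document.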

Enabling Property Demotion

For a document property to be demoted, the column to which it is mapped must be defined with its ReadOnly attribute set to "false".
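As a quick illustration of that rule, the Python sketch below picks out the columns that would be eligible for demotion. The field XML is deliberately simplified and hypothetical (real column definitions carry more attributes), and treating a column with no ReadOnly attribute as writable is my assumption.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical column definitions for illustration only.
FIELDS_XML = """
<Fields>
  <Field Name="Title" ReadOnly="false" />
  <Field Name="ApprovedBy" ReadOnly="true" />
  <Field Name="Status" />
</Fields>
"""


def demotable_columns(fields_xml):
    """Return the names of columns whose ReadOnly attribute is "false"."""
    root = ET.fromstring(fields_xml)
    names = []
    for field in root.findall("Field"):
        # Assumption: a missing ReadOnly attribute is treated as "false".
        read_only = field.get("ReadOnly", "false").strip().lower()
        if read_only == "false":
            names.append(field.get("Name"))
    return names


print(demotable_columns(FIELDS_XML))
```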

In my next post, we'll discuss how to use content types to specify XML document properties. Stay tuned.