Intro to Word XML Part 4: Schema Validation

In Intro to Word XML Part 3, I showed how you can add your own XML to a Word document. I also touched on how that custom XML can make programming against the document a lot easier. Now let's take a quick look at data validation.

Schema Validation

At https://jonesxml.com/schemas/example1 there is a pointer to an XSD for the namespace used in the files we generated in the Intro to Word XML Part 3 post. Download that file and save it somewhere locally. Then take the XML file you created in Part 3 and open it in Word. Here's the XML in case you no longer have it:

<?xml version="1.0"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:s="https://jonesxml.com/schemas/example1">
  <w:docPr>
    <w:view w:val="print"/>
    <w:showXMLTags w:val="on"/>
  </w:docPr>
  <w:body>
    <s:employee>
      <w:p>
        <w:r>
          <w:rPr>
            <w:b/>
          </w:rPr>
          <w:t xml:space="preserve">Name: </w:t>
        </w:r>
        <s:name>
          <w:r>
            <w:t>Brian Jones</w:t>
          </w:r>
        </s:name>
      </w:p>
      <w:p>
        <w:r>
          <w:rPr>
            <w:b/>
          </w:rPr>
          <w:t xml:space="preserve">Occupation: </w:t>
        </w:r>
        <s:occupation>
          <w:r>
            <w:t>Program Manager</w:t>
          </w:r>
        </s:occupation>
      </w:p>
    </s:employee>
  </w:body>
</w:wordDocument>

When you open that file in Word, the XML tags are persisted, but there is no validation. According to the schema, the occupation tag can only have a few values, but Word doesn't know that yet. To enforce the schema, we first need to add it to the schema library. In Word, go to the Tools menu and choose "Templates and Add-Ins...". In that dialog, switch to the "XML Schema" tab, press the "Add Schema..." button, and browse to the schema you downloaded from jonesxml.com. When you add the schema, you'll also be given the option to specify an alias, which is a shorter name Word can show in place of the longer namespace.

Now that you've added the schema for that namespace to Word, your document can be validated. You'll probably notice a vertical purple squiggly line along the left. That indicates a schema validation error. The schema doesn't allow mixed content, and that's what the error is about (I'll show you later how to fix this). Try changing the value of the occupation tag to something like "boss". Notice that you now get a purple squiggly underline. Right-click it and you'll see that the element only allows "Program Manager", "Developer", and "Tester".
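To make those two errors more concrete, here is a rough sketch of what a schema with these rules could look like. This is not the actual XSD from jonesxml.com; the structure (an employee element containing name and occupation, in that order) is my assumption based on the document above, and only the namespace and the three occupation values come from this post.

```xml
<?xml version="1.0"?>
<!-- Hypothetical reconstruction for illustration; the real XSD may differ. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="https://jonesxml.com/schemas/example1"
           xmlns="https://jonesxml.com/schemas/example1"
           elementFormDefault="qualified">
  <xs:element name="employee">
    <!-- No mixed="true" here, so text directly inside employee is not allowed -->
    <xs:complexType>
      <xs:sequence>
        <!-- Child order is assumed, not taken from the real schema -->
        <xs:element name="name" type="xs:string"/>
        <xs:element name="occupation">
          <xs:simpleType>
            <!-- Only these three values validate; anything else is an error -->
            <xs:restriction base="xs:string">
              <xs:enumeration value="Program Manager"/>
              <xs:enumeration value="Developer"/>
              <xs:enumeration value="Tester"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Against a schema shaped like this, an occupation of "boss" fails the enumeration. The mixed content error presumably comes from the "Name: " and "Occupation: " labels, which sit inside the employee element but outside name and occupation.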

So that's a really quick introduction to Word's support for schema validation. If you bring up the XML Structure task pane, you'll also notice that Word can now suggest the right elements to apply to your selection, and that it shows the validation errors in the tree view. Now that the schema is in your schema library, any XML file you open in that namespace will automatically get schema validation.
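As an illustration of that last point, a data-only file in the same namespace might look like the sketch below. The nesting and element order are assumptions carried over from the Part 3 document, so the actual XSD may expect something different.

```xml
<?xml version="1.0"?>
<!-- Hypothetical data-only file; structure assumed from the Part 3 markup -->
<employee xmlns="https://jonesxml.com/schemas/example1">
  <name>Brian Jones</name>
  <occupation>Program Manager</occupation>
</employee>
```

Because the root element is in the namespace you registered, Word should match it to the schema in the library when you open the file, with no WordprocessingML wrapper needed.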

-Brian