Frustrated with the Limitations of XSD for XML Document Validation? Try Schematron!!!


My article, "Improving XML Document Validation with Schematron," is finally up on MSDN. It provides a brief introduction to Schematron, shows how to embed Schematron assertions in a W3C XML Schema document for improved validation capabilities, and explains how to get the power of Schematron in the .NET Framework today. The introduction of the article is excerpted below.

Currently the most popular XML schema language is the W3C XML Schema Definition language (XSD). Although XSD is capable of satisfying scenarios involving type-annotated infosets, it is fairly limited when it comes to describing constraints on the structure of an XML document. There are many situations where common idioms in XML vocabulary design are impossible to express using the constraints available in W3C XML Schema. The three most commonly requested constraints that W3C XML Schema cannot describe are:

  1. The ability to specify a choice of attributes. For example, a server-status element should either have a server-uptime attribute or a server-downtime attribute.

  2. The ability to group elements and attributes into model groups. Although one can group elements using compositors such as xs:sequence, xs:choice, and xs:all, the same cannot be done with both elements and attributes. For example, one cannot create a choice between one set of elements and attributes and another.

  3. The ability to vary the content model based on the value of an element or attribute. For example, if the value of the status attribute is “available” then the element should have an uptime child element; otherwise it should have a downtime child element. The technical name for such constraints is co-occurrence constraints.
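To make constraints 1 and 3 concrete, here is a minimal Python sketch that checks them by hand using only the standard library. It is not Schematron and not the article's code; it only illustrates what the rules mean. The `server` element name, the sample values, and the reading of "either ... or" as "exactly one" are assumptions made for illustration. The comments show the XPath test a Schematron assertion could use for the same check.

```python
import xml.etree.ElementTree as ET


def check_attribute_choice(elem):
    """Constraint 1: exactly one of server-uptime / server-downtime (assumed reading).

    A Schematron assert could express this with the XPath test:
    (@server-uptime or @server-downtime) and not(@server-uptime and @server-downtime)
    """
    has_up = "server-uptime" in elem.attrib
    has_down = "server-downtime" in elem.attrib
    if has_up == has_down:  # both present or both missing
        return ["server-status needs exactly one of server-uptime or server-downtime"]
    return []


def check_co_occurrence(elem):
    """Constraint 3: the required child depends on the value of @status.

    A Schematron assert could express this with the XPath test:
    (@status = 'available' and uptime) or (not(@status = 'available') and downtime)
    """
    required = "uptime" if elem.get("status") == "available" else "downtime"
    if elem.find(required) is None:
        return [f"status={elem.get('status')!r} requires a {required} child element"]
    return []


def validate(xml_text):
    """Run both checks over a document and collect the violation messages."""
    root = ET.fromstring(xml_text)
    errors = []
    for elem in root.iter("server-status"):
        errors.extend(check_attribute_choice(elem))
    for elem in root.iter("server"):  # hypothetical element name
        errors.extend(check_co_occurrence(elem))
    return errors


SAMPLE = """<servers>
  <server-status server-uptime="12h"/>
  <server-status server-uptime="12h" server-downtime="1h"/>
  <server status="available"><uptime>12h</uptime></server>
  <server status="offline"><uptime>12h</uptime></server>
</servers>"""

for message in validate(SAMPLE):
    print(message)
```

The second `server-status` and the second `server` element in the sample each violate one rule, so two messages are reported. XSD content models cannot express either check, which is the gap Schematron's XPath-based assertions fill.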

Although these idioms are widely used in XML vocabularies, it isn’t possible to describe them using W3C XML Schema, which makes it difficult to rely on schema validation for enforcing the message contract. This article describes how to layer such functionality on top of the W3C XML Schema language using Schematron.

Embedding Schematron assertions in a W3C XML Schema document allows you to have your cake and eat it too.
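As a rough illustration of what embedding looks like, the sketch below holds a small XSD whose xs:appinfo annotation carries a Schematron rule, and uses Python's standard library to list the embedded assertions. The schema content, the Schematron 1.5 namespace URI, and the helper name `embedded_assertions` are assumptions chosen for this example and are not taken from the article. The sketch only extracts the rules; evaluating them requires a Schematron processor.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SCH = "http://www.ascc.net/xml/schematron"  # Schematron 1.5 namespace (assumed version)

# Hypothetical schema: XSD declares the attributes, Schematron adds the choice rule.
XSD_WITH_RULES = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://www.ascc.net/xml/schematron">
  <xs:element name="server-status">
    <xs:annotation>
      <xs:appinfo>
        <sch:pattern name="attribute-choice">
          <sch:rule context="server-status">
            <sch:assert test="(@server-uptime or @server-downtime) and not(@server-uptime and @server-downtime)">
              A server-status element needs exactly one of server-uptime or server-downtime.
            </sch:assert>
          </sch:rule>
        </sch:pattern>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:attribute name="server-uptime" type="xs:string"/>
      <xs:attribute name="server-downtime" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def embedded_assertions(xsd_text):
    """Return (context, test, message) for each sch:assert found inside xs:appinfo."""
    root = ET.fromstring(xsd_text)
    found = []
    for appinfo in root.iter(f"{{{XS}}}appinfo"):
        for rule in appinfo.iter(f"{{{SCH}}}rule"):
            for assertion in rule.findall(f"{{{SCH}}}assert"):
                message = " ".join((assertion.text or "").split())
                found.append((rule.get("context"), assertion.get("test"), message))
    return found


for context, test, message in embedded_assertions(XSD_WITH_RULES):
    print(f"context={context}\n  test={test}\n  message={message}")
```

An XSD validator ignores the contents of xs:appinfo, so the same file still works as an ordinary schema, while a Schematron-aware tool can pull out the rules and apply them as a second validation pass.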


Comments (3)

  1. This is cool. Should we embed Schematron in our server as well? Actually, people are asking for this kind of validation in the registry context too. For example, using Schematron to assert that every registered business service has at least one tModel in the uddi:uddi.org:wsdl:types category and one management tModel would be a good approach…

  2. Xan Gregg says:

    Where do you get the name "co-occurrence constraints" for constraining the content model based on an attribute value? I’m not sure there are any formal terms, but the XML Schema WG has been using that term for constraining the content model based on the *occurrence* of an attribute, which is closer to your item #2. The broad term to cover all of these is "co-constraints", but I don’t know if there is a specific term for your item #3. If you have a pointer to accepted terminology, I’d appreciate it.

  3. steve says:

    Very nice – but would be good to get an updated version of schematron.net class that contained support for <value-of…/> within an assert or a report.

    Daniel says (http://sourceforge.net/forum/forum.php?thread_id=1162064&forum_id=87322) that a newer version will be released in .NET 2.0 that will support the latest ISO version (http://www.dsdl.org/0524.pdf) of Schematron, which supports value-of etc.

    Anyone fancy upgrading the 1.1 version to cope with value-of?