Model Groups in XML Schemas - xs:all Groups

A recent article on devx.com discusses the use of xs:all to constrain a mixed content model group. The question posed in the article caught my attention because I thought it was obvious that xs:all was not appropriate to the problem.  The article centers on a question about how to create a schema that validates the following instance document:

<text  xmlns="https://tempuri.org/XSDSchema1.xsd">
   <bold>Hello</bold> John, my <italics>name</italics>
   is <bold>Paul</bold>. I would
   <italics>like<italics> to tell you something.
</text>

The problem with using xs:all here is that it just won't work.  I will use this as a springboard to look at xs:all and maybe even talk about how to model your groups effectively.

Prove the Easy Solution Doesn't Work

First, we look at a simple first-pass schema and attempt to (incorrectly) model the data.

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema id="XSDSchema1" targetNamespace="https://tempuri.org/XSDSchema1.xsd" elementFormDefault="qualified"
 xmlns="https://tempuri.org/XSDSchema1.xsd" xmlns:mstns="https://tempuri.org/XSDSchema1.xsd"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:complexType mixed="true" name="textType">
  <xs:all>
   <xs:element name="bold" type="xs:string" />
   <xs:element name="italics" type="xs:string" />
  </xs:all>
 </xs:complexType>
 <xs:element name="text" type="textType"/>
</xs:schema>

In Visual Studio .NET, create an XML file and the above schema.  With the XML file loaded in the designer, go to the XML menu item and choose “Validate XML Data”.  The document will not validate, and you will receive a list of validation errors in the Task List pane:

Element 'bold' cannot appear more than once if content model type is "all".
Element 'italics' cannot appear more than once if content model type is "all".
Per the active schema, the element 'italics' cannot be nested within 'italics'.
The 'https://tempuri.org/XSDSchema1.xsd:italics' element is not declared.
The 'italics' start tag on line '5' does not match the end tag of 'text'.
The element 'https://tempuri.org/XSDSchema1.xsd:italics' has invalid child element 'https://tempuri.org/XSDSchema1.xsd:italics'.

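You can see from the first two validation errors that 'bold' and 'italics' cannot appear more than once if the content model type is “all”.  (The remaining errors stem from the <italics>like<italics> markup in the sample document, where the second tag is missing its closing slash.)  So what does this error mean?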

All Or None

The first reason xs:all is inappropriate for this problem is that it is an “all or none” approach.  The wording in section 3.8.2 of “XML Schema Part 1: Structures” is a little confusing.  The xs:all element constrains a model group to contain either:

  • all of the particle elements, or
  • zero of the particle elements. 

This constraint means that the document must have all of the elements present or zero of the elements present.  We might have an instance document that contains only <bold> elements or only <italics> elements, meaning one of the elements may not be present.  Or the element may contain only text, with neither child element present.

<text  xmlns="https://tempuri.org/XSDSchema1.xsd">
   This is just text.
</text>

Look at the schema definition one more time.  We specified xs:all, meaning that all of the elements must appear, or none at all.  But notice the mixed attribute with a value of “true”.  This attribute specifies that the element may contain both text and elements.  If the elements are not present but there is text, our new example will fail as well, because the required elements are still missing: mixed content permits the text but does not satisfy the model group.  The definition for this group actually reads as “one of the following is true”:

  1. The element contains a single <bold> element and a single <italics> element, and optionally contains text, or
  2. the content model is empty.   

What if the document contained only <bold> elements and no <italics> elements, or vice versa?  The document would not validate because the schema specifies that the element must contain both <italics> and <bold> elements.  xs:element provides a pair of attributes, minOccurs and maxOccurs.  These attributes specify how many times an element may appear.  The default value for both minOccurs and maxOccurs is “1”, which is why the above instance document fails to validate:  the element must be present within the model group.  We can better model this constraint by specifying that the content elements have a minOccurs attribute value of “0”:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema id="XSDSchema1" targetNamespace="https://tempuri.org/XSDSchema1.xsd" elementFormDefault="qualified"
 xmlns="https://tempuri.org/XSDSchema1.xsd" xmlns:mstns="https://tempuri.org/XSDSchema1.xsd"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:complexType mixed="true" name="textType">
  <xs:all>
   <xs:element name="bold" type="xs:string" minOccurs="0" />
   <xs:element name="italics" type="xs:string" minOccurs="0" />
  </xs:all>
 </xs:complexType>
 <xs:element name="text" type="textType"/>
</xs:schema>

This update means that the following document will now validate using this schema:

<?xml version="1.0" encoding="utf-8" ?>
<text xmlns="https://tempuri.org/XSDSchema1.xsd">
This is some text, there are no formatting elements in this text.
</text>

Notice that there are no <bold> or <italics> elements in the content, there is just text.  The following XML document, which contains only a <bold> element, will also validate against our updated schema:

<?xml version="1.0" encoding="utf-8" ?>
<text xmlns="https://tempuri.org/XSDSchema1.xsd">
This contains <bold>bold faced</bold> markup, but no leaning characters.
</text>
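
The relaxed schema still caps each element at one occurrence.  As a minimal sketch (this instance is illustrative and not taken from the devx.com article), a document that repeats <bold> would be expected to fail validation against the schema above, since maxOccurs still defaults to 1:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<text xmlns="https://tempuri.org/XSDSchema1.xsd">
<!-- Illustrative instance: <bold> appears twice, exceeding the default maxOccurs of 1 -->
This has <bold>one</bold> bold phrase and then <bold>another</bold>.
</text>
```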

Occurrence

It would be tempting to work around the problem by specifying that the group itself may occur many times.  But xs:all restricts the group to appearing zero or one times through its minOccurs and maxOccurs attributes:

<all
  id = ID
  maxOccurs = 1 : 1
  minOccurs = (0 | 1) : 1
  {any attributes with non-schema namespace . . .}>
  Content: (annotation?, element*)
</all>

Notice that the maxOccurs attribute for xs:all has a default value of “1”, and 1 is also its only valid value; only minOccurs may be 0 or 1.  That means the xs:all group itself may occur at most once, so it cannot simply be repeated.  For xs:all to be an appropriate model, the described element could only contain both a <bold> element and an <italics> element, each appearing once, or neither, with the group itself appearing only once.

We altered the value for minOccurs, but did not address the maxOccurs attribute yet.  Our instance document with a single <bold> element will validate, but including content with two <bold> elements still will not validate.  We can attempt to update the schema's elements to allow a maxOccurs of “unbounded”:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema id="XSDSchema1" targetNamespace="https://tempuri.org/XSDSchema1.xsd" elementFormDefault="qualified"
 xmlns="https://tempuri.org/XSDSchema1.xsd" xmlns:mstns="https://tempuri.org/XSDSchema1.xsd"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:complexType mixed="true" name="textType">
  <xs:all>
   <xs:element name="bold" type="xs:string" minOccurs="0" maxOccurs="unbounded" />
   <xs:element name="italics" type="xs:string" minOccurs="0" maxOccurs="unbounded" />
  </xs:all>
 </xs:complexType>
 <xs:element name="text" type="textType"/>
</xs:schema>

This method will not work.  The error we receive when we validate the schema in VS.NET is:

The {max occurs} of all the particles in the {particles} of an all group must be 0 or 1.

If we look in section 3.8.6 of XML Schema Part 1, we can see a confusing constraint for xs:all:

When a model group has {compositor} all, then all of the following must be true:
1. one of the following must be true:
1.1 It appears as the model group of a model group definition.
1.2 It appears in a particle with {min occurs}={max occurs}=1, and that particle must be part of a pair which constitutes the {content type} of a complex type definition.
2 The {max occurs} of all the particles in the {particles} of the group must be 0 or 1.

This constraint means that every child element of the xs:all element in a schema may only have its maxOccurs attribute set to 0 or 1.  This is separate from the maxOccurs attribute of the xs:all element itself:  this constraint applies to the child elements of xs:all.
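
For comparison, one common way to model this kind of free-form mixed content is a repeating xs:choice rather than xs:all.  The following is a minimal sketch under that assumption; it is not taken from the devx.com article, and it simply reuses the textType name and target namespace from the listings above:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema id="XSDSchema1" targetNamespace="https://tempuri.org/XSDSchema1.xsd" elementFormDefault="qualified"
 xmlns="https://tempuri.org/XSDSchema1.xsd"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:complexType mixed="true" name="textType">
  <!-- Sketch: the choice itself repeats, so bold and italics may appear
       any number of times, in any order, interleaved with text -->
  <xs:choice minOccurs="0" maxOccurs="unbounded">
   <xs:element name="bold" type="xs:string" />
   <xs:element name="italics" type="xs:string" />
  </xs:choice>
 </xs:complexType>
 <xs:element name="text" type="textType"/>
</xs:schema>
```

With a content model like this, the original sample document should validate once its malformed <italics>like<italics> markup is closed properly.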

Proper Usage of xs:all

While xs:all is not appropriate for this usage, it is appropriate for the conditions we have already stated.  Namely, child elements:

  1. Must all appear once or not at all
  2. May appear in any order
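
As a rough illustration of a content model that fits these two conditions, consider a record whose fields each appear exactly once but in no fixed order.  The contactType, firstName, lastName, and email names below are hypothetical and exist only for this sketch:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- Hypothetical type: each child must appear exactly once, in any order -->
 <xs:complexType name="contactType">
  <xs:all>
   <xs:element name="firstName" type="xs:string" />
   <xs:element name="lastName" type="xs:string" />
   <xs:element name="email" type="xs:string" />
  </xs:all>
 </xs:complexType>
 <xs:element name="contact" type="contactType"/>
</xs:schema>
```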

We can leverage the xs:all behavior, coupled with the minOccurs and maxOccurs attributes of the particle elements.  Suppose you want to declare a model group that specifically must not contain a certain element, such as <underline>.  We can specify the minOccurs and maxOccurs attribute values to be “0”, meaning this particle element is explicitly disallowed in the content:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema id="XSDSchema1" targetNamespace="https://tempuri.org/XSDSchema1.xsd" elementFormDefault="qualified"
 xmlns="https://tempuri.org/XSDSchema1.xsd" xmlns:mstns="https://tempuri.org/XSDSchema1.xsd"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:complexType mixed="true" name="textType">
  <xs:all>
   <xs:element name="bold" type="xs:string" minOccurs="0" />
   <xs:element name="italics" type="xs:string" minOccurs="0" />
   <xs:element name="underline" type="xs:string"  minOccurs="0" maxOccurs="0" />
  </xs:all>
 </xs:complexType>
 <xs:element name="text" type="textType"/>
</xs:schema>

We can now see that a document that has an <underline> element in it is not valid because we explicitly disallowed it from the content model.

<text  xmlns="https://tempuri.org/XSDSchema1.xsd">
   <bold>Hello</bold> John, my <italics>name</italics>
   is <underline>Paul</underline>.
</text>
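
By contrast, here is a sketch of an instance that the same schema would be expected to accept (illustrative, not from the devx.com article).  The <italics> element precedes <bold>, which xs:all permits, each appears only once, and <underline> is absent:

```xml
<text  xmlns="https://tempuri.org/XSDSchema1.xsd">
   <!-- Illustrative instance: children in reverse declaration order, no underline -->
   My <italics>name</italics> is <bold>Paul</bold>.
</text>
```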