Static Typing Consequences for the Parent Axis and Attribute Types

I posted a previous blog entry dealing with the static typing consequences of using the parent axis in your XQuery expressions over XML data inside SQL Server 2005 (Static Typing Consequences of the Parent Axis). In addition to the general issue of typing information loss associated with the parent axis, there is an additional issue that is related to attributes in particular. This issue is best explained through an example. Consider the following XSD schema definition:

 create xml schema collection attribute_types as N'
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="https://www.w3.org/2001/XMLSchema"> 
    <xs:element name="root">
      <xs:complexType>
          <xs:choice maxOccurs="unbounded">
             <xs:element name="pointer">
                   <xs:complexType>
                      <xs:choice>
                           <xs:element name="sibling" type="xs:string" />
                        </xs:choice>
                      <xs:attribute name="singleton_attribute" type="xs:IDREF" />
                       <xs:attribute name="list_attribute" type="xs:IDREFS" />
                   </xs:complexType>
             </xs:element>
             <xs:element name="target">
                    <xs:complexType>
                      <xs:attribute name="ID" type="xs:ID" />
                   </xs:complexType>
             </xs:element>
         </xs:choice>
      </xs:complexType>
 </xs:element>
</xs:schema>'

In this case, we have defined a schema for a piece of XML that uses ID and IDREF types to create links between different elements. The important thing to notice here is the difference between the types of the singleton_attribute and list_attribute attributes of the pointer element. Note that the singleton_attribute has a type of xs:IDREF (implying a single IDREF value), while the list_attribute has a type xs:IDREFS (implying a list of IDREF values). Another key piece of information is the way in which list types are handled in XQuery. XQuery does not have a concept of list types as such, and handles these by simply treating their values as a sequence of singleton simple types. This is best illustrated by examining the static types of some queries over an instance of this data. Taking the following XML instance:

 declare @x xml(attribute_types)
set @x='
<root>
 <pointer singleton_attribute="A">
     <sibling>test</sibling>
 </pointer>
    <pointer list_attribute="A B">
        <sibling>test</sibling>
 </pointer>

    <target ID="A"/>
  <target ID="B"/>
</root>'

If we execute a query to select the value of a single instance of the singleton_attribute:

 select @x.query('
data( (/root/pointer)[1]/@singleton_attribute )
')

Examining the static type generated by this query, we get the following - xs:IDREF ?  i.e. a single instance (optional) of an IDREF value. If we now execute a query to select the value of a single instance of the list_attribute:

 select @x.query('
data( (/root/pointer)[2]/@list_attribute )
')

Examining the static type generated by this query, we get the following - xs:IDREF * i.e. multiple (optional) IDREF values. So, the result of selecting a single attribute of an XSD list type is a sequence of values, not a singleton. This is very useful since we can now use standard XQuery semantics (e.g. the ordinal predicate or FLWOR iteration) to navigate through this list of values and find the one we want.

This all works fine since the data function is able to handle sequences of values. But, remember that a lot of the functions and operators in XQuery require a singleton value and these will fail if applied to the value(s) of the list_attribute.

Now, remembering that the parent axis causes a loss of type fidelity and understanding that the above semantics require that the type of the attribute be known - what happens when we combine the parent axis with these attributes. Taking the following query against the singleton_attribute:

 select @x.query('
data( (/root/pointer)[1]/sibling/../@singleton_attribute )
')

If we examine the static type generated by this query, we now get the following: xdt:anyAtomicType *. Two things to notice here, firstly we have lots type fidelity (IDREF has become  anyAtomicType) and secondly we have gone from an (optional) singleton value to a sequence! In fact, if we run a similar query against the list_attribute:

 select @x.query('
data( (/root/pointer)[2]/sibling/../@list_attribute )
')

We get the exact same static type generated! The reasoning behind this is that, since we have lost all type fidelity by using the parent axis and there is a possibility that the attribute selected was a list type, we assume for the purposes of static typing that the result is a sequence i.e. we have lost the fact the the singleton_attribute was a singleton value!

This in and of itself is a problem, but (in most cases) the operations defined to deal with IDREF values can handle sequences of values, so this problem is moot in this case. But, armed with our knowledge of how the parent axis affects both the type and cardinality of attribute types, take the following new XSD schema:

 create xml schema collection attribute_types2 as N'
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="https://www.w3.org/2001/XMLSchema"> 
    <xs:element name="root">
      <xs:complexType>
          <xs:choice maxOccurs="unbounded">
             <xs:element name="value">
                 <xs:complexType>
                      <xs:choice>
                           <xs:element name="sibling" type="xs:string" />
                        </xs:choice>
                      <xs:attribute name="integer" type="xs:integer" />
                 </xs:complexType>
             </xs:element>
         </xs:choice>
      </xs:complexType>
 </xs:element>
</xs:schema>'

In this case, we have an attribute that stores a single integer value. Taking the following instance of the above schema:

 declare @x xml(attribute_types2)
set @x='
<root>
  <value integer="12">
      <sibling>test</sibling>
 </value>
</root>
'

We can use this attribute value in conjunction with the arithmetic operators to do some calculations:

 select @x.query('
(/root/value)[1]/@integer + 10
')

This works fine and returns the expected result of 22. Now, if we add in a parent axis to the mix:

 select @x.query('
(/root/value)[1]/sibling/../@integer + 10
')

We get the following error message returned:

 Msg 2389, Level 16, State 1, Line 11
XQuery [query()]: '+' requires a singleton (or empty sequence), found operand of type 'xdt:anyAtomicType *'

So, even though we had no list types in our schema at all. Because of the loss of typing fidelity from the parent operator and the necessary assumption that any attribute might be a list type in this case, we loose cardinality fidelity too.

Again, the moral of the story is to use the parent operator with extreme caution :)

-
Disclaimer:
This posting is provided “AS IS” with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm.