Working with XML from Multiple Instances

There are often cases where we need to perform operations based on multiple XML instances. To illustrate this, consider the example XML given below. Basically, we have two XML instances. The first instance (call this the results instance) contains information about the execution of a set of tests. Each test entry contains a name for the test, the result of the execution and an ID to the section of the specification that the test targets. The second instance (call this the sections instance) contains detailed information about the various sections of the specification, along with an identifying ID for each section.

Results Instance

 <results>
   <item name="First Item" section-id="Section1" result="PASS" />
    <item name="Second Item" section-id="Section1" result="PASS" />
   <item name="Third Item" section-id="Section2" result="PASS" />
    <item name="Fourth Item" section-id="Section3" result="FAIL" />
   <item name="Fifth Item" section-id="Section3" result="PASS" />
    <item name="Sixth Item" section-id="Section3" result="FAIL" />
</results>

Sections Instance

 <sections>
 <section name="First Section" id="Section1" />
    <section name="Second Section" id="Section2" />
   <section name="Third Section" id="Section3" />
</sections>

The report we would like to produce over this data shows the results of each tests, along with a detailed description of the section of the specification that the test targets. For example, the final data we want may look like the following:

 <result name="First Item" section="First Section">PASS</result>
<result name="Second Item" section="First Section">PASS</result>
<result name="Third Item" section="Second Section">PASS</result>
<result name="Fourth Item" section="Third Section">FAIL</result>
<result name="Fifth Item" section="Third Section">PASS</result>
<result name="Sixth Item" section="Third Section">FAIL</result>

In order to do this, we essentially need to perform a join across the two instances. Using standard XQuery and the document function, we could write a FLWOR expression to do this, along the lines of the following:

 let $sections := document("sections.xml")/sections
for $result in document("results.xml")/results/item
return
<result 
    name="{$result/@name}" 
 section="{$sections/section[id=$result/@id]/@name}">
{ $result/text() }
</result>

Unfortunately, XQuery in SQL Server 2005 will not execute this query. Firstly, we do not support the let syntax. Secondly, there is no document() (or equivalent) function. So, we need to re-write this query to use T-SQL to get around the limitations of our XQuery implementation. In order to perform the join operation, we need to extract the relevant fields out of the XML instances and turn them into relational data, so we can use them in the standard T-SQL context - the recommended method for doing this is a combination of the the value() method for converting the XML into a relational type, and the nodes() method for extracting the multiple values from the single XML instance.

We use nodes() to extract the multiple item and section elements from the pair of XML instances. Once these are extracted, we can use the value() method on each individual entry to convert it to a relational value, using these to perform the join. The query to do this is as follows:

 select R.i.value('@name', 'varchar(30)')   [Name], 
       S.s.value('@name', 'varchar(30)')   [Section],
       R.i.value('@result', 'varchar(30)') [Result]
from Results  cross apply xml_data.nodes('/results/item') R(i),
     Sections cross apply xml_data.nodes('/sections/section') S(s)
where R.i.value('@section-id', 'varchar(30)') = S.s.value('@id', 'varchar(30)')

The join operation in this case is being done through the where clause of the select statement. Running this query will produce the following relational result set:

 First Item First Section   PASS
Second Item First Section   PASS
Third Item  Second Section  PASS
Fourth Item Third Section   FAIL
Fifth Item  Third Section   PASS
Sixth Item  Third Section   FAIL

If a relational results is what you wanted, then you are done :) The initial goal was to output XML though, so we can add a for xml statement to this query to take these relational results and re-compose them back into XML. The query including the for xml statement is as follows:

 select R.i.value('@name', 'varchar(30)')   [@name], 
       S.s.value('@name', 'varchar(30)')   [@section],
       R.i.value('@result', 'varchar(30)') [text()]
from Results  cross apply xml_data.nodes('/results/item') R(i),
     Sections cross apply xml_data.nodes('/sections/section') S(s)
where R.i.value('@section-id', 'varchar(30)') = S.s.value('@id', 'varchar(30)')
for xml path('result'), type

This query then produces the expected results from above. The for xml statement itself is pretty self-explanatory. The string (result) given as the path parameter is the name of the container element, while the column names of the various values indicate how they will be stored. Notice that the column names for @name and @section correspond to the name and section attributes respectively, while the text() column becomes the text value of the element.

Thanks again to Jinghao for help with the for xml part of the query.

-
Disclaimer:
This posting is provided “AS IS” with no waranties, and confers no rights.
Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm.