Running XQuery over multiple XML datatype instances

One of the limitations of using XQuery inside SQL Server 2005 is that you are limited to running your expressions over a single XML data type instance. It is usually possible to insert your XML into the database so that this is not a problem (e.g. placing all the XML data you want to query over into a single cell in the database), but you can also leverage the FOR XML functionality that initially shipped with SQL Server 2000 to acheive this.

FOR XML allows you to run a select query and return the results of this query formatted as XML. This would normally be used to take standard relational data and export it in an XML serialized form, but it can also be used to take multiple XML instances and aggregate them together. This aggregate data then forms an intermediate XML data type instance over which you can run your XQuery expression. This is not going to work in all cases, but should surfice for the majority of cases where this functionality would be useful.

To demonstarate how this can be done, I am going to use an example from the W3C XML Query Use Case Document - section 1.4.4.6. This example uses 2 XML source documents - items.xml (which contains XML data about items for sale in an auction) and bids.xml (which contains XML data about bids in an auction system). I create a simple table (AuctionData) with an XML data type column in it (xml_data) and insert these 2 instaces, each in it's own row of the database. The FOR XML query which is used to aggregate these instances together is the following:

 select cast( xml_data as xml )
from AuctionData
for xml raw( '' ), elements, type

This FOR XML query returns a sequence of XML nodes comprising of all the top level <items> and <bids> elements. We now apply XQuery to this FOR XML query:

 select
( select cast(xml_data as xml)
  from AuctionData
  for xml raw(''), elements, type ).query('
<result>
 {
  for $item in /items/item
  where $item/reserve_price[1] * 2 < max(/bids/bid[itemno[1] = $item/itemno]/bid_price)
  return
   <successful_item>
    { $item/itemno }
    { $item/description }
    { $item/reserve_price }
    <high_bid>{ max(/bids/bid[itemno[1] = $item/itemno]/bid_price) }</high_bid>
   </successful_item>
 }
</result>
')

This returns us the expected results. Note that we had to change the query slightly to get around the fact that our implementation does not support XQuery let. Executing queries of this type can be slow, depending on the size of the XML returned from the FOR XML query. The FOR XML query creates a new, unindexed XML data type instance over which the XQuery expression is executed. The fact that the intermediate XML data type instance in not indexed, means that the XQuery expression will execute slowly when compared to an equivalent XQuery over an indexed XML data type instance.

Thanks to Jinghao for this example!

-
Disclaimer:
This posting is provided “AS IS” with no waranties, and confers no rights.
Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm.