Getting relational data as well formed XML structures out of SQL Server

 

I saw many times people having problems working with XML structures in SQL Server. It is that easy, even more with the new functionality in SQL Server 2005>. Most times people try to take the XML line with the least resistance and effort which would be XML AUTO. While XML AUTO is easy an fast to implement it might not give you the information in a way you want to work with in your application.

 

Let take a sample to see the outcomings (using the row constructors functionality for our base data):

 

Select * From
(
Values
('JOHN','SMITH'),
('JANE','BROWN'),
('JOE','GREEN')

) Names(First_Name, Last_Name)
FOR XML AUTO

 

will transform to:

 

<Names First_Name="JOHN" Last_Name="SMITH" />
<Names First_Name="JANE" Last_Name="BROWN" />
<Names First_Name="JOE" Last_Name="GREEN" />

 

For those people directly spooling the information to a file, this will bring back an ugly XML error in almost any XML viewer (like IE):

 

The XML page cannot be displayed

Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.


Only one top level element is allowed in an XML document. Error processing resource (…)

 

So you need a root element. The root element is specified by the syntax ROOT (Suprise ?). So if you want the “Names” element as the root, use the following syntax.

 

Select * From
(
Values
('JOHN','SMITH'),
('JANE','BROWN'),
('JOE','GREEN')

) Names(First_Name, Last_Name)
FOR XML AUTO, ROOT('Customers')

 

<Customers>
<Names First_Name="JOHN" Last_Name="SMITH" />
<Names First_Name="JANE" Last_Name="BROWN" />
<Names First_Name="JOE" Last_Name="GREEN" />
</Customers>

 

Ok, much better, you are ready to open that with IE now. But you maybe do not want the values as attributes, you want them in a separate node PATH? Then you should use the …guess what… the PATH syntax.

 

Select * From
(
Values
('JOHN','SMITH'),
('JANE','BROWN'),
('JOE','GREEN')

) Names(First_Name, Last_Name)
FOR XML PATH ,ROOT('Names')

 

<Names>
<row>
<First_Name>JOHN</First_Name>
<Last_Name>SMITH</Last_Name>
</row>
<row>
<First_Name>JANE</First_Name>
<Last_Name>BROWN</Last_Name>
</row>
<row>
<First_Name>JOE</First_Name>
<Last_Name>GREEN</Last_Name>
</row>
</Names>

 

Looks already good but the "row" specified is left as an ugly artifact. PATH can have a parameter with specifying the name of the node name.

Select * From
(
Values
('JOHN','SMITH'),
('JANE','BROWN'),
('JOE','GREEN')

) Names(First_Name, Last_Name)
FOR XML PATH('Customer') ,ROOT('Names')

 

Pretty good now:

 

<Names>
<Customer>
<First_Name>JOHN</First_Name>
<Last_Name>SMITH</Last_Name>
</Customer>
<Customer>
<First_Name>JANE</First_Name>
<Last_Name>BROWN</Last_Name>
</Customer>
<Customer>
<First_Name>JOE</First_Name>
<Last_Name>GREEN</Last_Name>
</Customer>
</Names>

If you need more specific placement of data in the XML tree, like having the information like CustomerID as an attribute in the Customer noe, you would have to use the EXPLICIT keyword which would be implemented deeper in the query.

 

But that´s it for now and for simplicity.

 

-Jens