Mixed content (part 2)

Following last week's introduction to the concept of mixed content, let's look in detail at a validated XML instance with mixed content.

Taking a look at the children nodes of the <letter> element

Let’s reuse the instance we stored in our table last time. We can query it and look at all the nodes under the “letter” root element. With the nodes() method we can output a table. This table will contain a row for each node under <letter>. The first column contains the node itself, the second column contains the value “text” if it is a text node or “element” if it is an element node.

SELECT x.query('.'), x.query('if (. instance of text()) then "text" else

if (. instance of element()) then "element" else ""')

FROM T

CROSS APPLY xmlCol.nodes('/*:letter[1]/node()') AS Tref(x)

WHERE iCol = 1

The results look something like this (I formatted the cells containing element nodes so that the whole XML fragment is visible. If you run the query yourself you’ll notice no such formatting.)

<p1:date xmlns:p1="urn:letter">2006-10-15-08:00</p1:date>

element

<p1:sender xmlns:p1="urn:letter"><p1:FirstName>James</p1:FirstName><p1:LastName>Kirk</p1:LastName><p1:Line1>1234 Elm Street</p1:Line1><p1:City>Redmond</p1:City><p1:ZIPCode>98052</p1:ZIPCode><p1:State>WA</p1:State></p1:sender>

element

<p1:recipient xmlns:p1="urn:letter"><p1:FirstName>Jean-Luc</p1:FirstName><p1:LastName>Picard</p1:LastName><p1:Line1>13 rue des Lilas</p1:Line1><p1:City>Ronchin</p1:City><p1:ZIPCode>59555</p1:ZIPCode><p1:Country>France</p1:Country></p1:recipient>

element

Dear Sir, I received your letter dated

text

<p1:date xmlns:p1="urn:letter:tags">2006-10-10Z</p1:date>

element

this morning. It would be my pleasure to attend the annual symposium

text

<p1:title xmlns:p1="urn:letter:tags">XML database systems in class-B Federation Spacecrafts</p1:title>

element

.Please contact my assistant

text

<p1:person xmlns:p1="urn:letter:tags">Mr. Spock</p1:person>

element

at

text

<p1:email xmlns:p1="urn:letter:tags">spock@space-federation.org</p1:email>

element

to make all necessary arrangements.Sincerely,James T. Kirk

text

Let’s now reprint the entire instance, as we submitted it to the server, with the text nodes colored in green.

<l:letter xmlns:l="urn:letter" xmlns:t="urn:letter:tags">

<l:date>2006-10-15-08:00</l:date>

<l:sender>

<l:FirstName>James</l:FirstName>

<l:LastName>Kirk</l:LastName>

<l:Line1>1234 Elm Street</l:Line1>

<l:City>Redmond</l:City>

<l:ZIPCode>98052</l:ZIPCode>

<l:State>WA</l:State>

</l:sender>

<l:recipient>

<l:FirstName>Jean-Luc</l:FirstName>

<l:LastName>Picard</l:LastName>

<l:Line1>13 rue des Lilas</l:Line1>

<l:City>Ronchin</l:City>

<l:ZIPCode>59555</l:ZIPCode>

<l:Country>France</l:Country>

</l:recipient>

Dear Sir,

I received your letter dated <t:date>2006-10-10+00:00</t:date> this morning. It would be my pleasure to attend the annual symposium <t:title>XML database systems in class-B Federation Spacecrafts</t:title>.

Please contact my assistant <t:person>Mr. Spock</t:person> at <t:email>spock@space-federation.org</t:email> to make all necessary arrangements.

Sincerely,

James T. Kirk

</l:letter>

You’ll notice that all text between two elements tags counts as one text node. Our instance contains five of them. You can verify the accuracy of that count with the following query, which returns 5 also

SELECT xmlCol.value('count(/*:letter/text())','int') FROM T WHERE iCol = 1

go

Whitespace-only text nodes

You might also have noticed that our instance contained portions of text made only of whitespace characters (between the “date” and the “sender” element for example). Those weren’t preserved as text nodes. Whitespace-only text nodes are actually a very interesting topic that isn’t limited to mixed content and we’ll probably talk about it some more in the future but for now, let’s look at what one should do when they wish to preserve them. There is an optional parameter in the T-SQL convert() function, that when set to 1, forces the server to preserve whitespace-only text nodes when converting from a string type to XML. In our case, we should do this.

INSERT INTO T VALUES (2, CONVERT(XML(SC_Letter), '<l:letter xmlns:l="urn:letter" xmlns:t="urn:letter:tags">

<l:date>2006-10-15-08:00</l:date>

<l:sender>

<l:FirstName>James</l:FirstName>

<l:LastName>Kirk</l:LastName>

<l:Line1>1234 Elm Street</l:Line1>

<l:City>Redmond</l:City>

<l:ZIPCode>98052</l:ZIPCode>

<l:State>WA</l:State>

</l:sender>

<l:recipient>

<l:FirstName>Jean-Luc</l:FirstName>

<l:LastName>Picard</l:LastName>

<l:Line1>13 rue des Lilas</l:Line1>

<l:City>Ronchin</l:City>

<l:ZIPCode>59555</l:ZIPCode>

<l:Country>France</l:Country>

</l:recipient>

Dear Sir,

I received your letter dated <t:date>2006-10-10+00:00</t:date> this morning. It would be my pleasure to attend the annual symposium <t:title>XML database systems in class-B Federation Spacecrafts</t:title>.

Please contact my assistant <t:person>Mr. Spock</t:person> at <t:email>spock@space-federation.org</t:email> to make all necessary arrangements.

Sincerely,

James T. Kirk

</l:letter>', 1))

go

Let’s rerun the same query as last time for this new instance

SELECT x.query('.'), x.query('if (. instance of text()) then "text" else

if (. instance of element()) then "element" else ""')

FROM T

CROSS APPLY xmlCol.nodes('/*:letter[1]/node()') AS Tref(x)

WHERE iCol = 2

This time the results look like this (once again I altered the way results are displayed for your reading comfort)

text

<p1:date xmlns:p1="urn:letter">2006-10-15-08:00</p1:date>

element

text

<p1:sender xmlns:p1="urn:letter"><p1:FirstName>James</p1:FirstName>

<p1:LastName>Kirk</p1:LastName><p1:Line1>1234 Elm Street</p1:Line1><p1:City>Redmond</p1:City>

<p1:ZIPCode>98052</p1:ZIPCode><p1:State>WA</p1:State></p1:sender>

element

text

<p1:recipient xmlns:p1="urn:letter"><p1:FirstName>Jean-Luc</p1:FirstName><p1:LastName>Picard</p1:LastName><p1:Line1>13 rue des Lilas</p1:Line1><p1:City>Ronchin</p1:City>

<p1:ZIPCode>59555</p1:ZIPCode><p1:Country>France</p1:Country></p1:recipient>

element

Dear Sir, I received your letter dated

text

<p1:date xmlns:p1="urn:letter:tags">2006-10-10Z</p1:date>

element

this morning. It would be my pleasure to attend the annual symposium

text

<p1:title xmlns:p1="urn:letter:tags">XML database systems in class-B Federation Spacecrafts</p1:title>

element

.Please contact my assistant

text

<p1:person xmlns:p1="urn:letter:tags">Mr. Spock</p1:person>

element

at

text

<p1:email xmlns:p1="urn:letter:tags">spock@space-federation.org</p1:email>

element

to make all necessary arrangements.Sincerely,James T. Kirk

text

Now there are 8 text nodes under the “letter” element as evidenced by the following query:

SELECT xmlCol.value('count(/*:letter/text())','int') FROM T WHERE iCol = 2

go

This time following whitespace-only text nodes have been preserved

  • All characters between the opening “letter” tag and the opening “date” tag
  • All characters between the closing “date” tag and the opening “sender” tag”
  • All characters between the closing “sender” tag and the opening “recipient” tag.

Next time we'll look at DML operations and the effect they can have on mixed content.

-
Disclaimer:
This posting is provided “AS IS” with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm.