Restricting the contents of a typed XML column using triggers

A typed XML column/variable is constrained by a schema collection. That means only instances that can be validated against the schemas in the collection will be accepted. You can tighten the rules a little bit with the use of the DOCUMENT facet, which adds the extra requirement that only well formed XML documents be accepted (no fragments).

 

For some users, this isn’t enough. They would also like to be able to constrain their XML columns or variables to a single namespace or even to a single element.

 

Unfortunately the DDL syntax doesn’t allow for such requirements. There are workarounds though, and we’ll try to look at a simple example.

 

First, let’s create an Xml Schema Collection and a table.

 

CREATE XML SCHEMA COLLECTION myCollection AS '

<xsd:schema xmlns:xsd="https://www.w3.org/2001/XMLSchema"

targetNamespace="https://a">

 <xsd:element name="a">

  <xsd:complexType>

   <xsd:sequence>

    <xsd:element name="strElement" type="xsd:string"/>

    <xsd:element name="bElement" type="xsd:boolean"/>

   </xsd:sequence>

  </xsd:complexType>

 </xsd:element>

</xsd:schema>

<xsd:schema xmlns:xsd="https://www.w3.org/2001/XMLSchema"

targetNamespace="https://b">

 <xsd:element name="b">

  <xsd:complexType>

   <xsd:sequence>

    <xsd:element name="strElement" type="xsd:string"/>

    <xsd:element name="floatElement" type="xsd:float" minOccurs="0"/>

   </xsd:sequence>

  </xsd:complexType>

 </xsd:element>

</xsd:schema>

<xsd:schema xmlns:xsd="https://www.w3.org/2001/XMLSchema"

targetNamespace="https://main"

xmlns:ns="https://main"

xmlns:b="https://b"

xmlns:a="https://a">

 <xsd:import namespace="https://a"/>

 <xsd:import namespace="https://b"/>

 <xsd:element name="root" type="ns:rootType"/>

 <xsd:complexType name="rootType">

  <xsd:choice minOccurs="0" maxOccurs="unbounded">

   <xsd:element ref="a:a"/>

   <xsd:element ref="b:b"/>

  </xsd:choice>

 </xsd:complexType>

</xsd:schema>

'

go

CREATE TABLE myTable (

date Datetime,

xmlCol XML(myCollection)

)

go

 

At this point any valid instance of elements “root” (namespace “https://main”), “a” (namespace “https://a”) and “b” (namespace “https://b”) can be inserted into the table.

Let’s decide that we should only accept instances of elements from namespace “https://main”.

 

Let’s create the following trigger

 

CREATE TRIGGER CheckNS_OnInsert_Trigger on myTable INSTEAD OF INSERT

AS

BEGIN

      INSERT INTO myTable

            SELECT date, xmlCol.query('for $i in /* return (if (namespace-uri($i) = "https://main") then $i else ())')

            FROM inserted

END

 

Now, let’s try the following insertion

 

INSERT INTO myTable VALUES ('1/1/2001 15:00:00', '

<x:a xmlns:x="https://a"><strElement>Data a</strElement><bElement>false</bElement></x:a>

<x:root xmlns:x="https://main"/>

<x:b xmlns:x="https://b"><strElement> Data b</strElement></x:b>

<x:root xmlns:x="https://main">

      <x:a xmlns:x="https://a"><strElement>Data a</strElement><bElement>true</bElement></x:a>

      <x:b xmlns:x="https://b"><strElement> Data b</strElement></x:b>

</x:root>

<x:b xmlns:x="https://b"><strElement/></x:b>'

)

 

Apparently it succeeds (no error message back) but let’s pull the data back out of the table.

 

select xmlCol FROM myTable WHERE date = '1/1/2001 15:00:00'

 

The data looks like this (I pretty-printed it for better legibility)

 

<x:root xmlns:x="https://main" />

<x:root xmlns:x="https://main">

<x:a xmlns:x="https://a">

<strElement>Data a</strElement>

<bElement>true</bElement>

</x:a>

<x:b xmlns:x="https://b">

<strElement> Data b</strElement>

</x:b>

</x:root>

 

All instances of top level elements that were not qualified with namespace “https://main” were eliminated by the trigger.

 

The problem is, we can still perform a DML operation that will insert a top level element from any namespace at the root, like this

 

UPDATE myTable

SET xmlCol.modify('insert <x:b xmlns:x="https://b"><strElement/></x:b> as last into / ')

WHERE date = '1/1/2001 15:00:00'

go

 

When we pull the data out of the table again we get

 

<x:root xmlns:x="https://main" />

<x:root xmlns:x="https://main">

<x:a xmlns:x="https://a">

<strElement>Data a</strElement>

<bElement>true</bElement>

</x:a>

<x:b xmlns:x="https://b">

<strElement> Data b</strElement>

</x:b>

</x:root>

<x:b xmlns:x="https://b"><strElement></strElement></x:b>

 

To remedy this, we need to create another trigger that will catch any “illegal” element(s) inserted during updates.

 

CREATE TRIGGER CheckNS_OnUpdate_Trigger on myTable AFTER UPDATE

AS

BEGIN

      IF UPDATE (xmlCol)

            UPDATE myTable

            SET xmlCol.modify('delete /*[namespace-uri(.)!="https://main"] ')

END

 

 

Now let’s try that update again

UPDATE myTable

SET xmlCol.modify('insert <x:b xmlns:x="https://b"><strElement/></x:b> as last into / ')

WHERE date = '1/1/2001 15:00:00'

go

 

The query seems to succeed but when we query the data out we get this:

 

<x:root xmlns:x="https://main" />

<x:root xmlns:x="https://main">

<x:a xmlns:x="https://a">

<strElement>Data a</strElement>

<bElement>true</bElement>

</x:a>

<x:b xmlns:x="https://b">

<strElement> Data b</strElement>

</x:b>

</x:root>

 

All top level elements from namespaces other than “https://main”, including the one present before we created the trigger are gone.

 

This trigger deletes all top level element whose namespace uri isn’t “https://main”, in all rows. Performance-wise this might become a problem once your table contains more than a few rows. It can probably be made more effective with a primary key and a join on the inserted or delete table. I’ll leave the optimization to you.

 

One last thing is left to check. We must ensure that an element from “https://a” or “https://b” can still be inserted inside an element from “https://main”.

 

UPDATE myTable

SET xmlCol.modify('declare namespace main="https://main";

insert <x:b xmlns:x="https://b"><strElement/></x:b> as first into /main:root[1] ')

WHERE date = '1/1/2001 15:00:00'

go

 

When we query the data out we get this:

 

<x:root xmlns:x="https://main">

<x:b xmlns:x="https://b">

<strElement></strElement>

</x:b>

</x:root>

<x:root xmlns:x="https://main">

<x:a xmlns:x="https://a">

<strElement>Data a</strElement>

<bElement>true</bElement>

</x:a>

<x:b xmlns:x="https://b">

<strElement> Data b</strElement>

</x:b>

</x:root>

 

The first “root” element has now a child “b” qualified with namespace “https://b”.

 

-
Disclaimer:
This posting is provided “AS IS” with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm.