Importing an existing XML schema into SQL Server 2005

Article
04/18/2006

In my last post I addressed the issue of non deterministic content models, which can be valid with respect to the W3C’s standards, but which are always rejected by SQL Server 2005. There are a few other XSD features that SQL Server 2005 doesn’t support. There are also challenges inherent to the nature of SQL Server 2005. As a result it is sometimes impossible to import an existing schema as-is if it wasn’t designed with SQL Server in mind. Fixes must be used, and workarounds must be applied, before the schema can be successfully imported.

Today I’m going to walk you through the import of a schema that contains multiple features that conflict with SQL Server 2005’s implementation of XSD. To maximize the value of this example I will use a made-up schema. In a future post I will show how the techniques described here can apply to a real-world schema.

Let’s assume that there is a directory C:\Example on my SQL Server 2005 machine and that it contains the following three schema files

order.xsd

<xsd:schema xmlns:xsd="https://www.w3.org/2001/XMLSchema"
targetNamespace="https://www.example.com/denisruc"
xmlns="https://www.example.com/denisruc"
xmlns:t="https://www.example.com/denisruc/types">

<xsd:import namespace="https://www.example.com/denisruc/types" schemaLocation="types.xsd"/>

<xsd:complexType name="itemType">
<xsd:sequence>
<xsd:element name="name" type="t:nameType"/>
<xsd:element name="description" type="xsd:string"/>
</xsd:sequence>
<xsd:attribute name="referenceId" type="t:referenceType"/>
<xsd:attribute name="price" type="t:priceType"/>
</xsd:complexType>

<xsd:complexType name="itemOrderType">
<xsd:complexContent>
<xsd:extension base="itemType">
<xsd:sequence>
<xsd:any processContents="lax" namespace="##other" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="quantity" type="xsd:integer"/>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>

<xsd:complexType name="orderType">
<xsd:sequence minOccurs="0" maxOccurs="unbounded">
<xsd:element name="Item" type="itemOrderType"/>
</xsd:sequence>
</xsd:complexType>

<xsd:element name="Order" type="orderType">
<xsd:unique name="refAndname">
<xsd:selector xpath="Item"/>
<xsd:field xpath="@referenceId"/>
<xsd:field xpath="name"/>
</xsd:unique>
</xsd:element>

</xsd:schema>

types.xsd

<xsd:schema xmlns:xsd="https://www.w3.org/2001/XMLSchema"
targetNamespace="https://www.example.com/denisruc/types"
xmlns="https://www.example.com/denisruc/types">

<xsd:include schemaLocation="nameType.xsd"/>

<xsd:simpleType name="referenceType">
<xsd:restriction base="xsd:string">
<xsd:pattern value="[A-Z](\-[0-9]{4}){3}"/>
</xsd:restriction>
</xsd:simpleType>

<xsd:simpleType name="priceType">
<xsd:restriction base="xsd:decimal">
<xsd:fractionDigits value="2"/>
</xsd:restriction>
</xsd:simpleType>

</xsd:schema>

nameType.xsd

<xsd:schema xmlns:xsd="https://www.w3.org/2001/XMLSchema"
targetNamespace="https://www.example.com/denisruc/types"
xmlns="https://www.example.com/denisruc/types">

<xsd:simpleType name="nameType">
<xsd:restriction base="xsd:string">
<xsd:maxLength value="20"/>
</xsd:restriction>
</xsd:simpleType>

</xsd:schema>

Incompatibilities with SQL Server 2005

The first problem to solve is that of schema dependencies.

First, you have to look at the use of <xsd:import> in the schema files. You cannot import the files in any order because you might leave unresolved references.

For example, if you load up the contents of order.xsd and try to create a schema collection, like this

DECLARE @order varchar(MAX)
SELECT @order = xmlCol FROM OPENROWSET(Bulk 'C:\\example\\order.xsd', SINGLE_CLOB) as results(xmlCol)
create xml schema collection SC AS @order

you will get the following error
Reference to an undefined name 'nameType' within namespace 'https://www.example.com/denisruc/types'
The contents of the schema with namespace ‘https://www.example.com/denisruc/types’ have to already be present in the collection when you import the schema from order.xsd.

The other dependency is introduced by <xsd:include>. To resolve this, you can use an external tool. Books Online contains such a tool at https://msdn2.microsoft.com/en-us/library/ms190756(SQL.90).aspx. In this case, there is really no need to pre-process the schema files. We can import the contents of both types.xsd and nameType.xsd in one DDL operation since you can use multiple schema fragments with CREATE XML SCHEMA COLLECTION or ALTER XML SCHEMA COLLECTION, even if those schema fragments have the same target namespace.

In general, if possible, it is better to import all the schemas at once, since this also solves the problem of circular dependencies (when A imports B and B imports A).

The second problem to solve is that of the “lax” wildcard. SQL Server 2005 supports “strict” and “skip” but not “lax”. The workaround in this case is to use “skip”. The same wildcards instances will be allowed by the server but no type information will be available.

Finally we need to remove the <xsd:unique> node since this feature is not supported

“Fixing” and importing the schema

We could perform all the changes described above directly on the schema files but this might not be desirable and for big schemas, with lots of incompatibilities, this would be time-consuming.
Instead, we can take advantage of the XML datatype itself by loading the files into an XML variable and performing the changes as XML DML operations.

-- First we import all three files into varchar(MAX) variables
DECLARE @order varchar(MAX)
SELECT @order = xmlCol FROM OPENROWSET(Bulk 'C:\\example\\order.xsd', SINGLE_CLOB) as results(xmlCol)

DECLARE @types varchar(MAX)
SELECT @types = xmlCol FROM OPENROWSET(Bulk 'C:\\example\\types.xsd', SINGLE_CLOB) as results(xmlCol)

DECLARE @nametype varchar(MAX)
SELECT @nametype = xmlCol FROM OPENROWSET(Bulk 'C:\\example\\nametype.xsd', SINGLE_CLOB) as results(xmlCol)

-- concatenate all three files and convert to XML datatype
DECLARE @schemas XML
SET @schemas = CONVERT (XML, @order + @types + @nametype)

-- first remove the <xsd:include> tag
SET @schemas.modify('delete /xs:schema/xs:include')

-- then change all the 'lax' wildcards to 'skip'

-- first we count how many of those wildcards there are
DECLARE @iCount int
SELECT @iCount = @schemas.value('count(//xs:any[@processContents="lax"])','int')
-- then we run the DML operation as many times as necessary
DECLARE @i int
SET @i = @iCount
WHILE (@i > 0)
BEGIN
SET @i = @i - 1
SET @schemas.modify('replace value of (//xs:any[@processContents="lax"]/@processContents)[1] with "skip"')
END

-- finally we remove xsd:unique
SET @schemas.modify('delete //xs:unique')

-- the schema is now compatible with SQL Server 2005
-- We can create the schema collection
CREATE XML SCHEMA COLLECTION myCollection AS @schemas
go

The schema collection is successfully created and you can retrieve its content with

SELECT XML_SCHEMA_NAMESPACE(N'dbo', N'myCollection')

If you look at this statement's results you’ll notice that only two <xsd:schema> elements are serialized out, even though you imported three files. That is because nameType.xsd was included in types.xsd. When the schema components are stored into metadata, there is no way to tell that they might have come from two different physical files. When serializing out the schema with target namespace ‘https://www.example.com/denisruc/types’ the server puts all the components under one unique schema element.

In a future post I will do this again with a real (and much more complex) industry schema.

-
Disclaimer:
This posting is provided “AS IS” with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm.

Importing an existing XML schema into SQL Server 2005

Additional resources