Importing an existing XML schema into SQL Server 2005


In my last post I addressed the issue of non deterministic content models, which can be valid with respect to the W3C’s standards, but which are always rejected by SQL Server 2005. There are a few other XSD features that SQL Server 2005 doesn’t support. There are also challenges inherent to the nature of SQL Server 2005. As a result it is sometimes impossible to import an existing schema as-is if it wasn’t designed with SQL Server in mind. Fixes must be used, and workarounds must be applied, before the schema can be successfully imported.


Today I’m going to walk you through the import of a schema that contains multiple features that conflict with SQL Server 2005’s implementation of XSD. To maximize the value of this example I will use a made-up schema. In a future post I will show how the techniques described here can apply to a real-world schema.


Let’s assume that there is a directory C:\Example on my SQL Server 2005 machine and that it contains the following three schema files


order.xsd


<xsd:schema xmlns:xsd=”http://www.w3.org/2001/XMLSchema
 targetNamespace=”
http://www.example.com/denisruc
 xmlns=”
http://www.example.com/denisruc
 xmlns:t=”
http://www.example.com/denisruc/types“>


 <xsd:import namespace=”http://www.example.com/denisruc/types” schemaLocation=”types.xsd”/>



 <xsd:complexType name=”itemType”>
  <xsd:sequence>
   <xsd:element name=”name” type=”t:nameType”/>
   <xsd:element name=”description” type=”xsd:string”/>
  </xsd:sequence>
  <xsd:attribute name=”referenceId” type=”t:referenceType”/>
  <xsd:attribute name=”price” type=”t:priceType”/>
 </xsd:complexType>


 <xsd:complexType name=”itemOrderType”>
  <xsd:complexContent>
   <xsd:extension base=”itemType”>
    <xsd:sequence>
     <xsd:any processContents=”lax” namespace=”##other” minOccurs=”0″ maxOccurs=”unbounded”/>
    </xsd:sequence>
    <xsd:attribute name=”quantity” type=”xsd:integer”/>
   </xsd:extension>
  </xsd:complexContent>
 </xsd:complexType> 


 <xsd:complexType name=”orderType”>
  <xsd:sequence minOccurs=”0″ maxOccurs=”unbounded”>
   <xsd:element name=”Item” type=”itemOrderType”/>
  </xsd:sequence>
 </xsd:complexType>


 <xsd:element name=”Order” type=”orderType”>
  <xsd:unique name=”refAndname”>
   <xsd:selector xpath=”Item”/>
   <xsd:field xpath=”@referenceId”/>
   <xsd:field xpath=”name”/>
  </xsd:unique>
 </xsd:element>


</xsd:schema>


types.xsd


<xsd:schema xmlns:xsd=”http://www.w3.org/2001/XMLSchema
 targetNamespace=”
http://www.example.com/denisruc/types
 xmlns=”
http://www.example.com/denisruc/types“>



 <xsd:include schemaLocation=”nameType.xsd”/>


 <xsd:simpleType name=”referenceType”>
  <xsd:restriction base=”xsd:string”>
   <xsd:pattern value=”[A-Z](\-[0-9]{4}){3}”/>
  </xsd:restriction>
 </xsd:simpleType>


 <xsd:simpleType name=”priceType”>
  <xsd:restriction base=”xsd:decimal”>
   <xsd:fractionDigits value=”2″/>
  </xsd:restriction>
 </xsd:simpleType>



</xsd:schema>


nameType.xsd


<xsd:schema xmlns:xsd=”http://www.w3.org/2001/XMLSchema
 targetNamespace=”
http://www.example.com/denisruc/types
 xmlns=”
http://www.example.com/denisruc/types“>


 <xsd:simpleType name=”nameType”>
  <xsd:restriction base=”xsd:string”>
   <xsd:maxLength value=”20″/>
  </xsd:restriction>
 </xsd:simpleType>
   



</xsd:schema>


 


Incompatibilities with SQL Server 2005



The first problem to solve is that of schema dependencies.


First, you have to look at the use of <xsd:import> in the schema files. You cannot import the files in any order because you might leave unresolved references.


For example, if you load up the contents of order.xsd and try to create a schema collection, like this


DECLARE @order varchar(MAX)
SELECT @order = xmlCol FROM OPENROWSET(Bulk ‘C:\\example\\order.xsd’, SINGLE_CLOB) as results(xmlCol)
create xml schema collection SC AS @order


you will get the following error
Reference to an undefined name ‘nameType’ within namespace ‘http://www.example.com/denisruc/types’
The contents of the schema with namespace ‘http://www.example.com/denisruc/types’ have to already be present in the collection when you import the schema from order.xsd.


The other dependency is introduced by <xsd:include>. To resolve this, you can use an external tool. Books Online contains such a tool at http://msdn2.microsoft.com/en-us/library/ms190756(SQL.90).aspx. In this case, there is really no need to pre-process the schema files. We can import the contents of both types.xsd and nameType.xsd in one DDL operation since you can use multiple schema fragments with CREATE XML SCHEMA COLLECTION or ALTER XML SCHEMA COLLECTION, even if those schema fragments have the same target namespace.


In general, if possible, it is better to import all the schemas at once, since this also solves the problem of circular dependencies (when A imports B and B imports A).



The second problem to solve is that of the “lax” wildcard. SQL Server 2005 supports “strict” and “skip” but not “lax”. The workaround in this case is to use “skip”. The same wildcards instances will be allowed by the server but no type information will be available.


Finally we need to remove the <xsd:unique> node since this feature is not supported



“Fixing” and importing the schema



We could perform all the changes described above directly on the schema files but this might not be desirable and for big schemas, with lots of incompatibilities, this would be time-consuming.
Instead, we can take advantage of the XML datatype itself by loading the files into an XML variable and performing the changes as XML DML operations.


 


— First we import all three files into varchar(MAX) variables
DECLARE @order varchar(MAX)
SELECT @order = xmlCol FROM OPENROWSET(Bulk ‘C:\\example\\order.xsd’, SINGLE_CLOB) as results(xmlCol)


DECLARE @types varchar(MAX)
SELECT @types = xmlCol FROM OPENROWSET(Bulk ‘C:\\example\\types.xsd’, SINGLE_CLOB) as results(xmlCol)


DECLARE @nametype varchar(MAX)
SELECT @nametype = xmlCol FROM OPENROWSET(Bulk ‘C:\\example\\nametype.xsd’, SINGLE_CLOB) as results(xmlCol)


— concatenate all three files and convert to XML datatype
DECLARE @schemas XML
SET @schemas = CONVERT (XML, @order + @types + @nametype)


— first remove the <xsd:include> tag
SET @schemas.modify(‘delete /xs:schema/xs:include’)


— then change all the ‘lax’ wildcards to ‘skip’


— first we count how many of those wildcards there are
DECLARE @iCount int
SELECT @iCount = @schemas.value(‘count(//xs:any[@processContents=”lax”])’,’int’)
— then we run the DML operation as many times as necessary
DECLARE @i int
SET @i = @iCount
WHILE (@i > 0)
 BEGIN
  SET @i = @i – 1
  SET @schemas.modify(‘replace value of (//xs:any[@processContents=”lax”]/@processContents)[1] with “skip”‘)
 END



— finally we remove xsd:unique
SET @schemas.modify(‘delete //xs:unique’)



— the schema is now compatible with SQL Server 2005
— We can create the schema collection

CREATE XML SCHEMA COLLECTION myCollection AS @schemas
go


 


The schema collection is successfully created and you can retrieve its content with SELECT XML_SCHEMA_NAMESPACE(N’dbo’, N’myCollection’)
 


If you look at this statement’s results you’ll notice that only two <xsd:schema> elements are serialized out, even though you imported three files. That is because nameType.xsd was included in types.xsd. When the schema components are stored into metadata, there is no way to tell that they might have come from two different physical files. When serializing out the schema with target namespace ‘http://www.example.com/denisruc/types’ the server puts all the components under one unique schema element.


In a future post I will do this again with a real (and much more complex) industry schema.



Disclaimer:
This posting is provided “AS IS” with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

Comments (12)

  1. Henry Boehlert says:

    Great post, I’ve been looking for a concise description like this since for quite some time.

    I already pursued an approach using a .NET function but due to security constraints this did not become an easily applicable solution.

    One little "add": key and keyref need to be removed as well, unless this restriction was lifted in SP1 (which I doubt)

  2. MSDN Archive says:

    Thanks for the comment Henry. You are absolutely right, xsd:key and xsd:keyref nodes would have to be removed as well. And there are no plans that I know of to add support for those in the near future.

    As for SP1 it should be available shortly and I’ll probably write a post about the improvements that we made to XML datatype.

  3. Yesterday we released our first Service Pack for SQL Server 2005. You might remember a while back I…

  4. I’m adding a link in the sidebar to Denis Ruckebusch’s blog. Denis is one of our testers for the XML…

  5. As promised earlier I’m going to give you an example&amp;nbsp;using a&amp;nbsp;real-world schema. I’ve chosen…

  6. Waldemar says:

    I am calling sp_xml_PrepareDocument from a VS.NEt app against SQL Server 2005 and It worked fine till I installed SP1.

    I got the following XML and sent this command not from the client but from sql management studio.

    DECLARE @hdoc int

    DECLARE @doc varchar(1000)

    SET @doc =

    ‘<?xml version="1.0" encoding="utf-16"?>

    <Keys xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance&quot; xmlns:xsd="http://www.w3.org/2001/XMLSchema"&gt;

     <Key ID="26" Code="1000149" />

    </Keys>’

    EXEC sp_xml_preparedocument @hdoc OUTPUT, @doc

    If I change @Doc to nvarchar it works, I did it on the client too but this time it does not work.

    Any suggestions?

  7. Steve says:

    Great article. I just discovered your site and immediately set it as a favorite.

    I’m trying to find how I might be able to replace a schema already in a schema collection. I could easily just add the revised schema to the collection and give it a slightly different namespace but I don’t want to get sloppy. Any suggestions?

    Cheers.

    Steve Campos

  8. MSDN Archive says:

    Steve,

    Unfortunately for now we don’t support removing or replacing schema components, only adding them.

    You can create a new collection for your new schema and then retype all the columns constrained by the old collection (with ALTER TABLE ALTER COLUMN)

  9. Terry Aney says:

    What is the performance hit of dropping a schema collection (alter table alter column) and then re-adding it?

    Additionally, if I have an Xml index set up, I have to remove that first as well right?  I would assume this performance hit would be bad?

Skip to main content