Getting those XML files into your brand spanking new relational database system

Up until now, XML documents have generally languished as mere files in the lowly file-system, our relational database systems haven’t seen them as important enough to warrant inclusion, up until SQL Server 2005 that is.  In SQL Server 2005, XML documents are now first class (data type) citizens; no more painful shredding required to hammer those triangular trees into oblong tables. 


With the new XML data type in SQL Server 2005, the rush is now on to get our XML documents out of the file-system and into the relational database, where they can, at last, feel validated (XML Schema validated that is!). 


Let’s take a look at how we can get those XML documents sitting in the filesystem into SQL Server 2005.


A BULKed up OPENROWSET provider   We have extended the OPENROWSET provider with the ‘BULK’ option; this allows the provider to read data directly from a file.  We then added three optional arguments (SINGLE_CLOB, SINGLE_NCLOB, SINGLE_BLOB) for the BULK option, these arguments enable the contents of a file to be returned as a single-row, single-column row-set of, respectively, VARCHAR(max), NVARCHAR(max) or VARBINARY(max).


Example: The SINGLE_CLOB (non-Unicode) and SINGLE_NCLOB (Unicode) arguments enable us to read an XML file and insert it into an XML data type column.


[UPDATE 7/19: John Gallardo, rightly points out… “For importing XML data using OPENROWSET we recommend people use SINGLE_BLOB rather than SINGLE_CLOB or SINGLE_NCLOB.”.  I have updated the usage below.  Thanks John!]


CREATE TABLE xml_documents( x XML );


INSERT INTO xml_documents ( x )




A parameterization problem   That was easy, but there is a problem.  I have a lot of .xml files on my machine and they are all called different names.  Unfortunately OPENROWSET does not allow me to parameterize the filename parameter (I have still yet to find out why), i.e. this fails:


DECLARE @filename NVARCHAR(max);

SET @filename = N‘myOtherXmlFile.xml’;


INSERT INTO xml_documents ( x )




Msg 102, Level 15, State 1, Line 5

Incorrect syntax near ‘@filename’.


So unless all my file names are the same, which of course they are not, I am left with building statements at runtime using dynamic SQL — slow, horrible, and often vulnerable to a sql injection attack — which we want to avoid.


DECLARE @filename NVARCHAR(max);

SET @filename = N‘myOtherXmlFile.xml’;



SET @SQLString = N‘INSERT INTO xml_documents ( x ) ‘ +

                              ‘SELECT * FROM OPENROWSET(BULK ”’ +

                                    @filename + ”’, SINGLE_BLOB) AS x’;


EXEC sp_executesql @SQLString;



SQL/CLR integration to the rescue   Here is a great example of how a quick, small and neat use of the CLR integration in SQL Server 2005 can save you from a T/SQL coding mess. 


I create a function in C#, let’s say LoadFile( SqlString filename ), that does allow me to parameterize the filename, and then I allow access to that function from T/SQL.  Here is how:


Save this file as ‘LoadFile.cs’ 

using System.IO;


public partial class UserDefinedFunctions



      public static byte[] LoadFile( string filename )


        using (BinaryReader binaryReader = new BinaryReader((Stream)File.OpenRead(filename)))


            byte[] bytes = new byte[binaryReader.BaseStream.Length];


            binaryReader.Read(bytes, 0, bytes.Length);

            return bytes;




[Update 7/21: This function passes the contents of the file back as binary bytes (original post was returning a string).  We return binary bytes instead of reading as text, so we do not lose the XML document encoding]


Compile the C# file (just using the standard C# compiler sitting in your Windows\Microsoft.Net directory):


C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\csc /target:library /out:”c:\temp\SqlClrObjects.dll” “LoadFile.cs”


Import the assembly library into SQL Server and expose the C# function as a TSQL function.


/* First enable SQL CLR functionality (which is off by default) */

      sp_configure “clr enabled”, 1






/* Create the assembly.  Do a defensive drop of assembly and function (that maybe bound to the assembly) first */


      IF EXISTS (SELECT * FROM sys.objects WHERE name = N‘LoadFile’)

            DROP FUNCTION LoadFile;


      IF EXISTS (SELECT * FROM sys.assemblies WHERE name = N‘SqlClrAssembly’)

            DROP ASSEMBLY SqlClrAssembly;


      CREATE ASSEMBLY SqlClrAssembly

            FROM N‘c:\temp\SqlClrObjects.dll’



      // Note the PERMISSION_SET = EXTERNAL_ACCESS is required because the C# function we have written accesses the file-system


/* Expose the C# function in the Assembly as a T/SQL function */


      CREATE FUNCTION LoadFile ( @filename NVARCHAR(max) )

            RETURNS VARBINARY(max)

            AS EXTERNAL NAME SqlClrAssembly.UserDefinedFunctions.LoadFile;



/*  Test the function */

      SELECT CONVERT( XML, dbo.LoadFile( ‘c:\myOtherXmlFile.xml’ ) AS xml





<please>Don’t forget me</please>


(1 row(s) affected)


So Easy   Now, we can use our new parameterized LoadFile function to insert an XML file (where the filename is specified as a variable) into the XML data type column


DECLARE @filename NVARCHAR(max);

SET @filename = N‘myOtherXmlFile.xml’;


INSERT INTO xml_documents ( x )

   SELECT dbo.LoadFile ( @filename );



This is far neater, safer (and I bet faster) than the horrible dynamic SQL (sp_executesql) above.


If Only   If I had a table containing the names of all the .XML files on my file-system, i.e. like this:


CREATE TABLE files ( filename NVARCHAR(max) );

INSERT INTO files VALUES( N‘c:\myXmlFile.xml’ );

INSERT INTO files VALUES( N‘c:\myOtherXmlFile.xml’ );



I could write a single T/SQL statement to insert the contents of all my .xml files as rows in the xml_documents table


INSERT INTO xml_documents ( x )

      SELECT dbo.LoadFile ( filename ) FROM files;



(2 row(s) affected)


SELECT * FROM xml_documents;



<please>Put me in the RDBMS</please>

<please>Don’t forget me</please>


But that is for another post. 


In this post we have seen some of the enhancements to the OPENROWSET provider for loading files, we have seen a glimpse of the XML data type, and we have seen what may be the shortest SQL/CLR function you will ever see.  If these help get you started, let me know.  If I have left you with more questions than answers, then fire the questions my way and I will get back to you.


In my next post we will look at how we can create a SQL/CLR Table Valued Function (TVF) that returns a table containing a directory listing of all the .xml files on my machine.

Comments (13)

  1. Stuart Padley, a Test Lead here in the SQL Server Engine team, posts about loading XML documents using…

  2. In my last post, I said we would look at how we could return a table that contains a directory listing…

  3. PP says:

    Very well described, extremely simple to understand.

    Looking forward for more articles 🙂

  4. DW says:

    Has anybody had any problems setting the max for the varbinary data type?

    My script is:

    CREATE TABLE largeValues (

    lVarbinary varbinary(MAX)


    and I get an error saying:

    Line 2: Incorrect syntax near ‘MAX’.

    Any help would be appreciated.

  5. Shan says:

    Very much clear and useful article. Thanks for sharing your idea.

  6. Developer Dan says:

    Awesome! This is precisely what I am trying to do, but I got stuck when I discovered that BULK doesn’t accept a parameter. I’ve been wanting to use CLR assemblies in SQL for a while, but haven’t had time to plunge in. This was a perfect first attempt.

    NOTE: CREATE ASSEMBLY didn’t work for me until I signed the assembly, even though I was using a login with the sysadmin role.

  7. Robert says:

    First of all, thanks for sharing your idea. When I tried to load large xml file ( 3 mb) by using your code, I am getting the below exception. Do you have any idea about it?

    Msg 6522, Level 16, State 2, Line 6

    A .NET Framework error occurred during execution of user defined routine or aggregate ‘LoadFile’:

    System.Data.SqlServer.TruncationException: Trying to convert return value or output parameter of size 2959963 bytes to a T-SQL type with a smaller size limit of 8000 bytes.


      at System.Data.SqlServer.Internal.CXVariantBase.RgByteToSSBytes(Byte[] rgbValue, Int64 cbSize, Int64 cbMaxLength, Int32 iOffset, EPadding ePad)

  8. Larry Hayashi says:

    I have a table that already contains a column with a separate GUIDnumber for each row. I have corresponding xml files where each file has the name GUID number.xml. I would like to load these XML files into this existing table in a new column called XMLRepr. Any suggestions as to how to use what you have here. Can I do an update query using dbo.LoadFile?



  9. Andy says:

    Brilliant, just what I was looking for!