XML Source – Making things easier with XSLT


As I wrote in my previous post, complex XML documents will produce multiple outputs when you’re using the XML Source adapter. Most of the time it will be easier to pre-process your source file with XSLT to de-normalize it a bit. Reducing the number of outputs greatly simplifies your data flow.

Let’s take the same XML document I used in the last example:

<extract date="2007-12-05">
    <counters>
        <counter category="dispatcher" name="server1">
            <runtime>6</runtime>
            <queue>3</queue>
            <maxrequest>8</maxrequest>
            <color>blue</color>
            <host>
                <name>svo2555</name>
                <path>\\dispatcher</path>
                <lastaccessed>2007-02-03</lastaccessed>
            </host>
        </counter>
        <counter category="gateway" name="server1">
            <runtime>1</runtime>
            <queue>10</queue>
            <maxrequest>10</maxrequest>
            <color>purple</color>
            <host>
                <name>svo2555</name>
                <path>\\gateway</path>
                <lastaccessed>2007-02-03</lastaccessed>
            </host>
        </counter>
    </counters>
</extract>

We want to flatten this out a bit using an XSL transform like this one (forgive my novice XSLT skills):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/extract">
    <xsl:variable name="extractDate" select="/extract/@date" />
    <counters>
    <xsl:for-each select="counters/counter">
        <counter>
            <extractDate><xsl:value-of select="$extractDate"/></extractDate>
            <category><xsl:value-of select="@category"/></category>
            <name><xsl:value-of select="@name"/></name>
            <runtime><xsl:value-of select="runtime"/></runtime>
            <queue><xsl:value-of select="queue"/></queue>
            <maxrequest><xsl:value-of select="maxrequest"/></maxrequest>
            <color><xsl:value-of select="color"/></color>
            <hostName><xsl:value-of select="host/name"/></hostName>
            <path><xsl:value-of select="host/path"/></path>
            <lastaccessed><xsl:value-of select="host/lastaccessed"/></lastaccessed>
        </counter>
    </xsl:for-each>
    </counters>
</xsl:template>
</xsl:stylesheet> 

We’ll apply the transform with an XML Task. Add one to your package, and open the editor. You’ll want to change the Operation Type property to XSLT, set SaveOperationResult to true, and set all of the file connections.

image

Note, the Source should be your XML source document and the SecondOperand is your XSLT document.

The processed XML looks like this:

<?xml version="1.0" encoding="utf-8"?>
<counters>
  <counter>
    <extractDate>2007-12-05</extractDate>
    <category>dispatcher</category>
    <name>server1</name>
    <runtime>6</runtime>
    <queue>3</queue>
    <maxrequest>8</maxrequest>
    <color>blue</color>
    <hostName>svo2555</hostName>
    <path>\\dispatcher</path>
    <lastaccessed>2007-02-03</lastaccessed>
  </counter>
  <counter>
    <extractDate>2007-12-05</extractDate>
    <category>gateway</category>
    <name>server1</name>
    <runtime>1</runtime>
    <queue>10</queue>
    <maxrequest>10</maxrequest>
    <color>purple</color>
    <hostName>svo2555</hostName>
    <path>\\gateway</path>
    <lastaccessed>2007-02-03</lastaccessed>
  </counter>
</counters>

Add a Data Flow Task, and setup your XML source to use the processed XML document. You’ll need to update/regenerate the schema for your document to account for the new format. Notice there is now only one output to deal with.

Comments (18)

  1. tannerwatson says:

    Thanks for the tip. I’ve recently been on a project where we needed to do this exact same thing but we hit a slight problem. We noticed that if the original XML file is larger than 100MB it would cause a System.OutOfMemoryException when it tries to transform the XML. Have you heard of this occurring?

  2. Hi Tanner,

    I believe both the XML Task and XML Source read the entire XML document into memory before performing any operations. They were originally designed to work on smaller XML files.

    I’d suggest trying a similar operation using a script task, and see if that works better. You’ll have more control over how the XSL is executed.

    ~Matt

  3. markllemmon says:

    Are there any plans to getting this upgraded to XSLT 2.0 and making it more robust?

    Mark

  4. Hi Mark,

    (Wow, sorry for the delay! I completely missed this comment).

    Yes, improving the performance of our XSLT usage is one of our work items for the next release. Hopefully they’ll make it in!

    ~Matt

  5. mirsamar says:

    Thanks for the article!

    Mohammad

  6. Eshwar says:

    Thanks for the article

    Few questions:

    1. Is it required to create the XSLT file, i just have an xml file how can create this xslt file.

    2. what is second operand?

    Regards,

    Eshwar.

  7. Rzebra says:

    Hi Matt,

    Is there a better way to create the xslt file and not manually?

    Thanks

  8. Richie says:

    Matt,

    I am trying to do the same thing, yanking data that's at the root level, but my file is also much simpler because I don't have sub element data. Can anyone here please help me generate the xslt file that I need to generate the final XML source file. Thanks

    <ResponseFile Date="2011-11-11" StatusCode="s"  MessageText="FAIL">

            <Response TranCode="15" ID="4444" BorrowerID="1101" Status="5"/>

            <Response TranCode="15" ID="4444" BorrowerID="7777" Status="5"/>

            <Response TranCode="15" ID="4444" BorrowerID="8888" Status="5"/>

       </ResponseFile>

  9. jacko says:

    I tried this one but it says

    "[XML Task] Error: An error occurred with the following error message: "There are multiple root elements. Line 482, position 2.".

  10. David says:

    Hi Matt

    In a case where i have multiple files, say 500 XML files. how would i convert them, because i have tried a For Each Loop and it's failing and im not getting any errors, somehow it just manage to process 1 file

  11. Sachin says:

    Hi Matt,

    Can you please let me know how to proceed further to get the data from processesed xml to sql table as I am getting an error in doing so? It would be a great help. Really strugling to get it resolve. I am new to SSIS.

  12. Bhanu Sri says:

    Where the generated xml files get stored by default

  13. Matt Thompson says:

    Hey Guys,

    I am new to the BLOG, so i just wanted to introduce myself and say the information you guys have provided

    is great!

    TTYS,

    Matt

  14. Ravikiran says:

    I am trying to upload data from a XML file into sql server. I used XML task to denormalize the xml file and then using the data flow task. in this xml source, I select the xml source file and generated the xsd file, while clicking on the columns tab, getting the following the error.

    Pipeline component has returned HRRESULT error code 0xCo2o92A1 from a method call.

    Error at Data Flow Task [XML Source Adapter does not support mixed content model on Comples Types.

    Need help!

    Thanks….

  15. sandy says:

    Hi how did you create the XLST? We give our xml file in source how to give destination and XLST can you please explain?

  16. dhruvarora says:

    Hi Matt,

    I am using xml task to generate HTML using xml and xslt file. Then I am using the html in script task to send email. But in one column, text is coming with hyperlink, although in my xslt , I didn't define any hyperlink for that column.

    Following is my my XSLT. And the html shows hyperlink for Policy Number.

    <?xml version="1.0"?>

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/…/Transform"&gt;

    <xsl:template match="cd">

    <html>

    <head>

    </head>

    <body>

    <h1>The following issues have been found during the data load on <xsl:value-of select="CurrentDate"/></h1>

    <table border="1" cellspacing="0" cellpadding="4">

    <tr>

    <th>Type</th>

    <th>ID</th>

    <th>Policy Number</th>

    <th>Effective Date</th>

    <th>Amount</th>

    <th>Error Description</th>

    </tr>

    <xsl:apply-templates select="Policy"/>

    </table>

    <p/>

    <p/>

    <H6>Please check the report for details: <a><xsl:attribute name="href"><xsl:value-of select="url"/></xsl:attribute>Data Validation Report</a></H6>

    </body>

    </html>

    </xsl:template>

    <xsl:template match="Policy">

    <div style="color:#f00; font-size: 90%;">

    <tr>

    <td><xsl:value-of select="Typ"/></td>

    <td><xsl:value-of select="ID"/></td>

    <td><xsl:value-of select="PolicyNumber"/></td>

    <td><xsl:value-of select="PolicyEffectiveDate"/></td>

    <td><xsl:value-of select="TotalAmount"/></td>

    <td><xsl:value-of select="Issue_Description"/></td>

    </tr>

    </div>

    </xsl:template>

    </xsl:stylesheet>

    Is there any setting in xml task, which prevents the text from showing as hyperlink ? Or something else needs to be done ?

    Thanks

    Dhruv

  17. C Dickens says:

    Brilliant.  Thanks for the article!

  18. SQL Daddy says:

    Hello sandy,

    SSIS doesn’t have Xml destination. After doing various researched I  found out the ways to generate the Xml file output. The following are the ways

    1. By writing dot net script

    1. Drag the script component on the Data Flow task.

    2. Write the script in VB.Net language in the preferred xml format. This script is package specific we need to write the script according to the requirements. Kindly go through the embedded document.

    2. Through the third party components

    Some third party components are available in the market.

    and download XML SSIS Toolkit SQL 2005 – 2.1. They have given the installation manual as well. Go through

    zappysys.com/…/ssis-xml-source

    3. Through the developed custom (Xml destination) component. Following is the procedure to use the custom component