XSLT planning

We are planning which features and improvements need to go into the next release of XSLT. We are making these decisions based on customer input and feedback, so I would like to hear your views on what you would like to see in that release.


We are evaluating the following:


XSLT 2.0 –

    1. Do we need to support XSLT 2.0?

    2. What are the most useful features of XSLT 2.0 that you would like to see implemented (such as grouping, support for datetime, etc.)?

    3. Do you believe support for the entire 2.0 spec is useful? If yes, why?


XSLT over large documents –

    1. How large are the documents that customers apply stylesheets to?

    2. Has anyone run into memory or performance issues while loading large documents?

    3. What workarounds did you use for memory issues?



If there are other issues around XSLT/XPath that you would like us to address specifically in future releases, please let us know.


Looking forward to hearing your comments.




Comments (36)

  1. On XSLT 2.0:

    1. yes.

    2. grouping, temporary trees, user-defined functions, xpath 2

    3. I don’t think XML Schema support is useful because I don’t use XSD (preferring RELAX NG). Full base support would be fine.

    I’m working on some complex citation processing stylesheets I doubt I could do with 1.0.

  2. tzagotta says:

    We have several projects running now that use XQuery 1.0 draft, so I would really like to see that integrated into the platform.

    Our current solution is to generate XML using .NET DOM support, serialize out to a file, and then execute an external command-line XQuery engine. After the process completes, we parse the resulting XML file into DOM nodes. This process is obviously very inefficient.

    Therefore, we would vote for XQuery support built right into .NET, with the ability to consume and produce DOM nodes.

  3. Nic Roche says:

    XSLT 2.0

    1. Yes

    2. support for sequences, xsl:for-each-group, xsl:next-match…

    3. Yes, I don’t want the usage to be very far removed from W3C standards. Base with extensions seems (if not a big performance hit).

  4. Microsoft XML Team’s WebLog : XSLT planning Huh, who knew? 😉 Directly from the above linked post:…

  5. Mukul Gandhi says:

    I wish for extensive XSLT 2.0 support.



  6. Mukul Gandhi says:

    I wish for extensive XSLT 2.0 support.



  7. What about "Translet" for large documents? I have used an earlier version of Translet and found it to be quick and efficient for large documents.

  8. Bryan Rasmussen says:

    I don’t personally want the XML Schema binding; however, given Microsoft’s interest in XML Schema and its usage in various products, I think full support of the spec could be a benefit to you.

  9. Jim Fuller says:

    comments re XSLT 2.0;

    – Result Tree Fragments: getting rid of RTF makes it much easier to create pipelines of processing, though I wonder what type of incompatibility this presents with the use of node-set extensions in existing XSLT 1.0. I for one would like to see some of the core EXSLT supported for portability, with exsl:node-set() topping the list.

    – Datatypes: I see the binding of XML Schema datatypes as a little obtrusive, though after a while they don’t seem to get in the way. There are pitfalls with implicit type casting, though. I have yet to take advantage of them, as I use the new regexp features to the fullest.

    – XSLT is not a programming language: This is reflected in separating out such functionality into a different spec. I would like to see the math, date/time, etc. functions defined in the Functions spec made available, though following the same ‘separation’.

    – Simplifying Complex XSLT: grouping and the ability to define functions makes it much easier to create more reusable and modular XSLT.

    – Multiple Output Docs: The defunct XSLT 1.1 defined this and it is extremely useful… glad it found its way into XSLT 2.0.

    – Better String Handling: Regular expressions, upper/lower case functions, etc. have made XSLT 2.0 a serious text-manipulation language.

    – XPath Eval: Dynamic evaluation of XPath should have been in XSLT 2.0… I wish this had found its way into the spec. Perhaps this support could be provided via the EXSLT Dynamic module.

    I could go on, but these are the top level comments.

    cheers, Jim Fuller

  10. Nadia says:

    1. Yes

    2. xsl:for-each-group, XPath 2

    I am currently working on converting FrameMaker-generated XML to conform to an in-house DTD, and it would be extremely difficult to do without the xsl:for-each-group element. I am sure that there are other XSLT 2.0 elements that would be useful, but I have not gone that in-depth yet.

  11. Jim Fuller says:

    also wanted to mention

    +1 to XSLT 2.0 support!

    +1 to EXSLT 1.0 support in XSLT 1.0 as well

    gl, Jim

  12. I miss some features from XPath 2:

    – lower-case

    – replace

    I think it’s very important to support XSLT 2.0. Why:

    – Instructions are already available; as it is, I always have to check whether a feature works or not

    – Other products support XSLT 2 (Saxon 7)

    The size of documents is growing. We use up to 90 MB for data exchange to different systems.



  13. Nadia says:

    Oh and xsl:namespace and xsl:sequence are handy too.

  14. jim albright says:

    1. Yes, please support XSLT 2.0.

    2. grouping, regex are things I use the most now

    3. Please support the whole spec. (schema is optional for me for now)

    I generally work on files of 1 to 10 MB and don’t have any problems with size. I use Saxon 8.4 and enjoy it.

    I do a lot of work as pipelines, keeping different sets of changes separate – I currently use batch files to handle this.

  15. Brad Bjorndahl says:

    About XSL 2.0: I can hardly believe what I’ve done so far with XSL 1.0; it’s powerful, and, eventually, very intuitive. What I want for XSL 2.0 is very simple: I want it all. Especially regular expressions, grouping, schema awareness, and tunnelling parameters.

  16. John Workman says:

    I have several transforms that run against large amounts of data (1-10 MB) and often run into performance issues. My largest transform is about 1200 lines of XSLT, which builds a fixed-length flat file from XML for JDE. This was written before XMLReader was around, so I might have chosen a different path if it were being written today.

    The two areas I find most cumbersome, both of which you have mentioned, are grouping and support for date/time. I like to compare XPath to SQL. It would be nice to be able to group by multiple attributes (site, day, GL account).

    There are two areas that kill me with dates: formatting and comparison.

    If you are building interfaces to legacy systems and are stuck with certain formats, you have to do string manipulation to reformat the date, for example between:

    2005-06-08, 20050608, 08-06-2005.

    In almost every transform I have to check whether a date falls between start_date and end_date. I’ve also had several occasions where I have had to group by time buckets, either 15 minutes or 1 hour. In the case of the 15-minute buckets, I improved performance by having the SQL add a bucket_id to the XML output (1-96). Then I added an xref node to my XSLT:



    <xref:Time bucket="1" begin="00:00:00" end="00:14:59"/>

    <xref:Time bucket="2" begin="00:15:00" end="00:29:59"/>

    <xref:Time bucket="3" begin="00:30:00" end="00:44:59"/>

    <xref:Time bucket="4" begin="00:45:00" end="00:59:59"/>

    <xref:Time bucket="5" begin="01:00:00" end="01:14:59"/>




    Then I set up a timebucket variable:

    <xsl:variable name='TimeBucket' select="document('')/xsl:stylesheet/xref:Lookup/xref:TimeBuckets"/>

    Then I loop through the buckets, collecting sales for each 15 minutes:

    <xsl:for-each select="//EnterpriseDocument/Transaction[generate-id(.)=generate-id(key('Bucket',@Bucket)[1])]">

    <xsl:call-template name="SalesByDest">

    <xsl:with-param name="Bucket" select="@Bucket"/>

    </xsl:call-template>

    </xsl:for-each>



    This may seem like a lot of work to get sales grouped into 15-minute buckets, but from a performance standpoint, it really made a difference. I’ll try to write up a full explanation of how this works on my blog soon.

    Thanks for asking,

    John Workman
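
For readers outside XSLT, the date reformatting, 15-minute bucketing, and multi-attribute grouping described in the comment above can be sketched roughly as follows. This is a Python illustration only, not the commenter's stylesheet: the function names (`reformat_date`, `bucket_id`, `group_totals`), the field names, and the sample rows are all hypothetical. It assumes times arrive as HH:MM:SS and that buckets are numbered 1 to 96 as in the comment.

```python
from collections import defaultdict
from datetime import datetime


def reformat_date(iso_date: str, fmt: str) -> str:
    """Re-emit an ISO date (YYYY-MM-DD) in the layout a legacy system expects."""
    return datetime.strptime(iso_date, "%Y-%m-%d").strftime(fmt)


def bucket_id(time_of_day: str) -> int:
    """Map HH:MM:SS to a 15-minute bucket numbered 1..96."""
    hours, minutes, _seconds = (int(part) for part in time_of_day.split(":"))
    return (hours * 60 + minutes) // 15 + 1


def group_totals(rows, keys):
    """Sum 'amount' per composite key in a single pass over the rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["amount"]
    return dict(totals)


# Hypothetical sample rows; the field names are invented for illustration.
sales = [
    {"site": "A", "time": "00:05:10", "amount": 10.0},
    {"site": "A", "time": "00:14:59", "amount": 2.5},
    {"site": "A", "time": "00:15:00", "amount": 4.0},
    {"site": "B", "time": "01:00:00", "amount": 7.0},
]
for row in sales:
    row["bucket"] = bucket_id(row["time"])

by_site_and_bucket = group_totals(sales, ["site", "bucket"])
```

Precomputing the bucket once per row and then grouping on a composite key is the same idea as having the SQL add a bucket_id before the stylesheet runs: the grouping step becomes a single keyed lookup per row instead of a range comparison against every bucket.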

  17. In reading some of the posts so far this morning I am LOVING what I am seeing. Excellent feedback and exactly in part with what I am guessing Microsoft is assuming would probably be the case. If I can point…

  18. XSLT 2.0 –

    1. Do we need to support XSLT 2.0?

    Given that XSLT 2.0 has a lot more to offer than 1.0, and that compared to XQuery 1.0 is more applicable in the transformation area, it would seem to me logical that there should simply be XSLT 2.0 support, yes. It would make the "programming experience" much more complete.

    2. What are the most useful features of XSLT 2.0 that you would like to see implemented (such as grouping, support for datetime, etc.)?

    Support for dates yes, also for chained stylesheets, and support to handle big input sources, among other things.

    3. Do you believe support for the entire 2.0 spec is useful? If yes, why?

    I think yes because then you would serve the whole range of possible use of the language, and avoid many discussions about an incomplete implementation.

    XSLT over large documents –

    1. How large are the documents that customers apply stylesheets to?

    I did some testing there and came to the conclusion that for big repositories a database currently works best. I would also encourage splitting them up. This works for database tables too, actually. I’m talking about hundreds of MB to even gigabytes here. For example, I have one SQL table that contains 3.5 million rows and measures about 15 GB on disk… I wouldn’t even think of representing all that data in an XML document, let alone running a (search) query over it. In that case, you’d be thinking about creating indexes and catalogs that make querying much faster.

    2. Has anyone run into memory or performance issues while loading large documents?

    Yes. Long delays that make the application unmanageable, and the user angry or at least irritated.

    3. What were the workarounds you used for memory issues?

    Splitting up the source, writing faster XSLT, using different methods of processing, uploading it to a database.

  19. Peter Jacoby says:

    I echo others’ votes for complete XSLT 2.0 support. It seems that if we XSLT developers want to continue to have Microsoft as an option, full XSLT 2.0 support is a requirement. I have yet to migrate my hundreds of XSLT stylesheets to 2.0 simply because we’re tied to Microsoft technologies. If Microsoft does not support 2.0 then I, and others like me, will have to look at other vendors (read: Saxon) to supply the needed functionality.

    Also, I would vote for focusing effort on server-side rather than client-side transformations. Being able to do client-side transformations would be great, but in this more competitive browser market, I would be much more interested in a solid server product. If client-side development can continue after that, all the better, but I recommend focusing on the server side.

    Thank you.

  20. Lars says:

    XSLT 2.0 –

    1. Yes. If MS doesn’t support XSLT 2.0, it will either fail to gain the industry permeation that will make it a safe choice, or the road to acceptance will be slower and more grueling. That will leave a lot of developers and managers struggling with whether to stick with XSLT 1.0 and all its limitations, or move to XSLT 2.0 and risk being left high and dry.

    2. Grouping, grouping, the ability to process its own output (i.e. producing nodesets instead of result tree fragments), and grouping!

    3. Yes. Because without full support, the choice of XSLT 2.0 vs. 1.0 or XSLT vs. something else becomes quite a bit more complicated. For example, if our project is considering making a utility (like Schematron) written in XSLT 2.0 part of the development process, we then have to ask, does it require XSLT 2.0 features that Microsoft doesn’t support? How hard is it to track that down? It would be a lot easier just to be able to say, "This utility requires standard XSLT 2.0, and Microsoft XSLT supports standard XSLT 2.0. Period."

    XSLT over large documents –

    1. We have been processing lists of about 7000 items, with the result that we run up to 6MB documents through a stylesheet.

    2. Yes, we’ve had memory/performance issues… both with the above (long lists) and with recursive servlet calls for each item in a long list. (Note, we have been using Xalan and Saxon, not MS’s XSLT. As far as I know, we don’t have a way to use MS’s XSLT, because it’s not available in a Java package, which is what our webapp framework requires.)

    3. Three things helped: Moving from Xalan to Saxon helped with speed; upping the JVM memory allocation on the servlet container helped with memory; and using XSLT 2.0’s grouping features (in Saxon) greatly improved speed.

    Thanks for asking…


  21. Duncan Godwin says:

    Currently dealing with documents of about 90MB-150MB which contain some header information and 19,000+ items.

    My current approach has been to create a document splitter that works on the raw document, breaking out each item together with the header information, then doing a separate transform on each item and writing out the result of each transform as I go. This works because each item is self-contained (apart from the header information).

    I wonder if there is a potential for a library here to do something generically for this kind of file.
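
One generic shape for the splitter described in the comment above, sketched in Python with the standard library's `xml.etree.ElementTree.iterparse`. This is not the commenter's implementation. The element names `header` and `item` and the function name `split_items` are hypothetical, and the sketch assumes the header appears before the first item and that items are direct children of the root element.

```python
import io
import xml.etree.ElementTree as ET


def split_items(source, header_tag="header", item_tag="item"):
    """Yield one small standalone document per item, each carrying the header.

    Assumes the header precedes the first item and that items are
    direct children of the root element.
    """
    header = None
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # the first start event is the document root
            continue
        if elem.tag == header_tag:
            header = elem
        elif elem.tag == item_tag:
            doc = ET.Element(root.tag)
            if header is not None:
                doc.append(header)
            doc.append(elem)
            yield ET.tostring(doc, encoding="unicode")
            root.remove(elem)  # drop the processed item so memory stays bounded


# Hypothetical input; real files would be opened from disk instead.
sample = (
    b"<batch><header><created>2005-06-08</created></header>"
    b"<item id='1'/><item id='2'/></batch>"
)
pieces = list(split_items(io.BytesIO(sample)))
```

Each yielded string is a complete small document that can be handed to any XSLT processor in turn, so peak memory depends on the largest single item rather than on the size of the whole file.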

  22. Yuriy says:

    – extension functions at compile time

    – a template match index: if there are multiple templates with similar match patterns like elementname[@value='something'], where the only difference is the 'something' value, an index would give faster matching.

  23. Michael Rys says:

    After long arguments and petitions, Nithya is collecting feedback on what the XML Team should focus on…

  24. Philippe says:

    Yes, implement XSLT 2.0. Too often you need to come up with dirty tricks to get around the XSLT 1.0 limitations. I will have a hard time going back to XSLT 1.0 now after discovering some of the new functionality in XSLT 2.0.

    Based on a small experience with XSLT 2.0:

    – XPath 2.0 functions and operators:

    matches(), etc.

    – xsl:for-each-group

    – xsl:analyze-string

    – xsl:function

    Didn’t take the time to use the rest yet.

  25. Oliver says:

    On XSLT 2:

    1) Yes!

    2) Multiple output, grouping, tunneling, string handling, temporary trees, …

    3) XSD support isn’t necessary for me. But I want full base support since a half-baked implementation doesn’t attract me!

  26. Spencer Tickner says:

    1. Yes

    2. Multiple Output, grouping

    3. I think so. As others have already posted, Schema support would be nice, but not necessary.

  27. Richard Thurston says:

    1. Yes

    2&3. We use StyleVision to create our XSLT, so basically we would need anything our designers can do with StyleVision.

    The largest xml file we have processed to date was 156K.

  28. Scott says:

    I second all the comments regarding XSLT 2.0 support – it would be nice to have full 2.0 compliance.

    Please give us a fully managed XQuery implementation! If MS doesn’t give this to us, we will be forced to go to third party implementations. XQuery is a very important part of our future development.


  29. Mark McSweeny says:

    XSLT 2.0 –

    1) Yes, XSLT 2.0 addresses a lot of real-world problems that surface in XSLT 1.0.

    2) The grouping and datetime features in 2.0 address issues I hit all the time in 1.0. I also like 2.0’s unified data model, especially being able to treat a result tree fragment as a node set. Of lesser importance, but still of interest to me, is multiple output documents and various XPath 2.0 features.

    3) I think it is important to implement the whole spec (or at least its minimum mandatory base). It is not so much that I will use every feature myself, but rather that I may want to incorporate templates from other sources that taken collectively *do* depend on the full spec. Reading the spec, I am generally impressed that the working group is introducing features to meet real needs as opposed to dreaming up theoretical nice-to-haves; I am often thinking "I’ve hit that!".

    XSLT over large documents –

    1) The largest documents that I routinely process are 5-10 megabytes in size.

    2) Apart from having to resort to some O(n^2) algorithms from time to time (that XSLT 2.0 features ought to help with), I generally find performance to be acceptable. I wouldn’t complain if the engine were faster 😉 As for memory problems, I have observed that while I am debugging stylesheets in IE, if my stylesheet incurs certain classes of errors due to erroneous code, IE memory usage remains high. This suggests that it may be retaining the stylesheet and/or source documents in memory. It will sometimes crash under these conditions. Of course, this behaviour may be attributable to IE itself rather than the XSLT component.

    3) N/A

    Thanks for the opportunity to give feedback.


  30. bitPimps says:

    XSLT over large documents:

    1. I have 80+ MB XML files that need to be transformed; .NET throws System.OutOfMemoryException when trying to transform large XML files. No solution found using .NET (would love some advice/help).

    2. Yes, point 1. System.OutOfMemoryException is thrown when trying to transform large XML documents. Usually can transform up to around 20 MB, then bombs.

    3. Still looking for a workaround in the .NET Framework. No answers yet; I might have to forget using .NET and go back to old VB (yuk).

    Very frustrating.

  31. Martin Kolařík says:

    ad XSLT2.0

    1. yes, we need

    2. default namespace for transformed doc, temporary trees (anyhow operating nodes, node sets at higher abstract level), xsl:next-match

    3. hm, I do not know, but maybe it is easier to implement whole spec than cut off some features 😉

    ad complex docs

    1. not so large: a single doc is at most about 250 KB of characters, and after grouping everything together about 3.5 MB of characters

    2. XSLT itself has no mem/perf problem for me, but XMLDiffView, when comparing two 250 KB docs, has the problem :-), but this is out of core.

    3. see 2.

  32. Lars says:

    P.S. to my above comment let me add that one of the most useful features of XSLT 2.0 (actually XPath) is the replace() function for strings.

  33. Rob Edgar says:

    1. Yes

    2. Multiple Document Output, Grouping

    3. If the choice is between waiting a long time to have the whole spec implemented and getting a small subset of the spec sooner, I would go with the subset (as long as you include my wishes LOL).

  34. monogatari The best approach to interoperability is to focus on getting widespread, conformant implementation of the XSLT 2.0 specification. Atsushi Eno doesn’t blog often but when he does I tend to drop whatever it is I am doing and devour…

  35. G. Tarazi says:

    Debugging xslt combined with C# from VS.NET 2005 is extremely important.

    Here is an application example used by a major financial institution:

    1- 10,000 + lines of C# code are extracting data, 50 to 100 K xml file.

    2- 15,000 + line xslt reformatting the data, calling external C# code functions from xslt.

    3- The above is happening over and over (up to 20 times) in multi threading.

    4- The results (couple of megabits of xml) are passed to another 3000 + lines of xsl

    5- This xsl is again calling C# functions (well, you don’t have variables in xsl)

    6- And the final xml file is ready; it goes to be rendered in a report later (another system).

    Debugging the above in VS.NET 2003 is torture 🙂 Using XML Spy and test cases (C# / NUnit) makes it easier, but it would be great if VS.NET 2005 could help here. I must be able to start in the C# code and continue debugging through the XSL and back into the C# code again “in multiple threads :-)”

    The speed of XSL is important: the previous steps can take up to 5 seconds on a multi-processor server. I wish call-template and xsl:if were faster in 2.0 🙂

    Thank you