DOMParser and XMLSerializer in IE9 Beta


We’ve talked a lot about UI and browser features lately. Today I want to get back to web development by discussing some additions to the platform in IE9 Beta: DOMParser and XMLSerializer.

What do they do?

DOMParser builds a document from an XML string, and XMLSerializer serializes it back again. Together they make XML-to-DOM conversions as simple as using JSON, making it easier to use XML as a data-transfer format. More importantly, the nodes created by DOMParser are native, meaning they can be inserted and rendered within any web page. XMLSerializer can also serialize any native DOM node to an XML string, even nodes from HTML documents. Because the nodes are native, you can render your data directly without having to transform it to HTML first.

How do they work?

Below is a basic example of how these APIs can be used. Check out the DOMParser and XMLSerializer demo on our Test Drive site for more details and a live sample.

// Parse a string into an XML document
var parser = new DOMParser();
var doc = parser.parseFromString("<myXML/>", "text/xml");

// Serialize any native DOM node to an XML string
// (including nodes from HTML documents)
var serializer = new XMLSerializer();
var xmlString = serializer.serializeToString(doc);

The second parameter to parseFromString should be “text/xml” or “application/xml” for best cross-browser compatibility. For the full list of supported strings in IE9, see the MSDN documentation for parseFromString.

Why did we add these APIs?

Although these APIs are non-standard, they are supported in the latest versions of Firefox, Chrome, Safari, and Opera. They are also used by a number of existing sites and frameworks. Given this real-world usage and cross-browser support, we chose to implement these APIs in IE9 as part of our commitment to enabling the “Same Markup” on the web. Having these APIs helps more sites run the same code in the same way cross-browser. They also make working with XML from script easier.

How is this different from MSXML?

MSXML provides an XML structure that is separate from IE’s native DOM. This means MSXML objects cannot be inserted and rendered within a web page. MSXML objects also do not get the interoperability and performance benefits of native JavaScript integration. The performance difference is particularly noticeable when copying elements from MSXML to HTML in order to render them.

How does this work with XMLHttpRequest (XHR)?

The responseXML property of XMLHttpRequest still returns an MSXML object in IE9, but you can use DOMParser with responseText to get a native XML object instead.

// Using DOMParser with XMLHttpRequest
var parser = new DOMParser();
var xhr = new XMLHttpRequest();
...
var doc = parser.parseFromString(xhr.responseText, "text/xml");
...
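
As a fuller illustration of the pattern above, here is a minimal sketch that wraps the request and the parse step in one function. The name fetchXmlDocument, its callback signature, and the optional deps argument are hypothetical and exist only for this example; deps lets the constructors be swapped out (for example in a test) and is not part of any browser API.

```javascript
// Sketch only: fetchXmlDocument, onDone and deps are hypothetical names.
// Loads a URL with XMLHttpRequest and parses responseText with DOMParser,
// so the caller receives a native document rather than an MSXML object.
function fetchXmlDocument(url, onDone, deps) {
  deps = deps || {};
  // Use injected constructors when given, otherwise the browser globals.
  var Xhr = deps.XMLHttpRequest || XMLHttpRequest;
  var Parser = deps.DOMParser || DOMParser;

  var xhr = new Xhr();
  xhr.open("GET", url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) {
      return; // request not finished yet
    }
    if (xhr.status !== 200) {
      onDone(new Error("Request failed with status " + xhr.status), null);
      return;
    }
    var doc;
    try {
      doc = new Parser().parseFromString(xhr.responseText, "text/xml");
    } catch (e) {
      // IE9 throws when the response is not well-formed XML.
      onDone(e, null);
      return;
    }
    onDone(null, doc);
  };
  xhr.send(null);
}
```

A caller would pass a URL and a function taking (error, doc), for example fetchXmlDocument("data.xml", function (err, doc) { ... }), where "data.xml" stands in for your own resource.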

What about XML parsing errors?

When the provided XML is not well-formed, the parseFromString API will throw a SYNTAX_ERR DOM Exception. This was chosen to align with the error handling behavior of innerHTML in XML documents as per the HTML5 spec.
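
Because IE9 throws while some other browsers instead return a document containing a parsererror element, code that needs to run everywhere may want to handle both. The following is a minimal sketch of one way to do that, not a definitive implementation. The helper name parseXmlSafely, its result shape, and the optional ParserCtor argument are assumptions made for this example. Note that a legitimate document that happens to contain an element named parsererror would be reported as a failure by this check.

```javascript
// Sketch only: parseXmlSafely and its { doc, error } result are hypothetical.
// Normalizes "throws an exception" and "returns an error document"
// into a single result object.
function parseXmlSafely(xmlText, ParserCtor) {
  // Allow the constructor to be injected; otherwise use the global if present.
  var Ctor = ParserCtor || (typeof DOMParser !== "undefined" ? DOMParser : null);
  if (!Ctor) {
    return { doc: null, error: "DOMParser is not available" };
  }

  var doc;
  try {
    // IE9 throws here when the XML is not well-formed.
    doc = new Ctor().parseFromString(xmlText, "text/xml");
  } catch (e) {
    return { doc: null, error: String(e && e.message ? e.message : e) };
  }

  // Some other browsers report the failure inside the returned document.
  var errors = doc.getElementsByTagName("parsererror");
  if (errors.length > 0) {
    return { doc: null, error: errors[0].textContent || "XML parse error" };
  }
  return { doc: doc, error: null };
}
```

Callers can then test result.error once instead of writing both a try/catch and a parsererror lookup at every call site.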

Next Steps

As I mentioned before, many sites already use the DOMParser and XMLSerializer APIs today. Make sure your pages use feature detection to properly identify support for these APIs when using them:

if(window.DOMParser) {
	// Code relying on DOMParser support
} else {
	// Fallback code
}

if(window.XMLSerializer) {
	// Code relying on XMLSerializer support
} else {
	// Fallback code
}
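
To make the fallback branch concrete, here is a minimal sketch that returns a parse function based on what the environment supports. The helper name createXmlParser and its global parameter are hypothetical. The fallback assumes the older MSXML route through ActiveXObject("Microsoft.XMLDOM") and loadXML, which yields an MSXML document rather than a native one, so treat it as one possible fallback rather than the recommended one.

```javascript
// Sketch only: createXmlParser is a hypothetical helper.
// Pass `window` as `global` in a browser.
function createXmlParser(global) {
  if (global.DOMParser) {
    // Native path: the result is a native DOM document.
    return function (text) {
      return new global.DOMParser().parseFromString(text, "text/xml");
    };
  }
  if (global.ActiveXObject) {
    // Fallback path for older IE: the result is an MSXML document.
    return function (text) {
      var doc = new global.ActiveXObject("Microsoft.XMLDOM");
      doc.async = false; // load synchronously
      doc.loadXML(text);
      return doc;
    };
  }
  return null; // no XML parser available
}
```

In a page this might be used as var parse = createXmlParser(window); followed by a null check before calling parse(xmlString).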

Tony Ross
Program Manager


Comments (25)

  1. Randall says:

    What document/browser modes are these usable in? Are they usable outside HTML5 and XHTML documents?

    I have a tool that uses DOMParser to insert inline SVG in old HTML documents — it works in non-HTML5 documents in other browsers using feature detection, curious if it will in IE9.  (When it can't insert SVG, it inserts raster images instead.)

    Don't know if this is answerable, but as a rule, is switching browser/document modes more like setting a quirks flag in IE9 or more like loading a copy of the old rendering engine?  Like, are new features normally present in other modes?  I guess new features must be missing in compatibility modes, or else old feature-detecting JavaScript targeting an old version of IE might fail.  But would be interesting to hear from you guys about it.

  2. eiras says:

    "When the provided XML is not well-formed, the parseFromString API will throw a SYNTAX_ERR DOM Exception."

    Mozilla returns a <parsererror></parsererror> document (IMO, very broken and unexpected behavior), and Opera previously threw the exception but was forced to change due to websites relying on the Gecko behavior. I think WebKit has always done what Mozilla did. So if you don't want sites breaking all of a sudden, you might want to do the same thing (although I really don't like that behavior).

  3. Adam says:

    When I bring XML content to my page that includes rows to insert in a table or options to drop into a select list will IE9 handle this without barfing?

    e.g.

    var domParser = new DOMParser();

    mySelectObject.innerHTML = domParser.parseFromString( optionGroupAndOptionsFromXMLHTTPRequestResponse );

    This was always one of the biggest issues with IE when trying to use it as an end user platform for serious web applications.  90% of the time I want to fetch data to stuff into a table or a drop down list and IE is the only browser that can't handle this.

    I don't think the development community can take IE9 seriously until this works across all elements without fail… but my current tests show that this is still broken in IE9 beta 1.

    Also, since IE9 now supports SVG can IE handle SVGElement.appendChild(XMLNodeOfNodesOfNodesOfNodes);??? or are we going to have bugs in IE9's SVG from day 1 to make development a pain in IE for the next 15 years too.

    adam

  4. xslt 2.0 support says:

    Is it too late to ask for xslt 2.0 support ? For xml transformation there is nothing better and god would 2.0 make my life easier. If IE9 support it, the rest of the industry will. Thank you.

  5. Miguel Web Developer says:

    :) great job

  6. Holger says:

    Thanks!

    I lost hope as you closed my feedback as "by design" in March:

    connect.microsoft.com/…/please-implement-domparser-and-xmlserializer

    best regards

    Holger

  7. Ms2ger says:

    You enable Same Markup by implementing error handling that's incompatible with all existing user agents? Really? That's so sad it isn't even funny.

  8. Aseem says:

    Cool! Can I ask why XMLHttpRequest.responseXML still returns an MSXML object in IE9, not native XML?

  9. Laurens Holst says:

    Nice work guys, I’m glad to see that I no longer have to deal with the MSXML quirks. However, does using DOMParser with responseText respect the XML encoding declaration? In other browsers it doesn’t, which is why I normally try to avoid parsing responseText.

    I would suggest to instead either 1. let responseXML return a document in the new DOM, or 2. add a boolean setting to XMLHttpRequest to direct it to use the new DOM.

  10. Laurens Holst says:

    To explain further what I mean, this testcase http://www.grauw.nl/…/test-responsetext.html currently throws error: XML5111: WC_E_XMLCHARACTER: illegal xml character in IE9.

    With responseText already being decoded to a native JS string, you can not really properly interpret responseText as another encoding anymore.

    (Although the latest versions of the other browsers actually seem to have worked their way around it. It looks like they are not just doing encoding sniffing but even content sniffing, as when I enter ISO-8859-5 in the encoding (test file 3), responseText will show a different character.)

  11. CvP says:

    IE9beta is horrendously slow on this page: http://www.whatwg.org/…/current-work

    I hope it will be better in next beta.

  12. Tom says:

    Great Job Microsoft!

    And please do not forget to support Webworkers and Websockets. Chrome and already support it and FireFox 4.0 will support it.

  13. SergeyI says:

    It`s very interesting.

  14. giuseppe says:

    @adam

    your given example does not make sense. you are trying to assign a DOMDocument object to a text property.

    IMHO innerHTML should be avoided where possible. in most cases there are better ways.

    try something like this

    var xmlSource = yourXMLHttpRequest.responseText; //'<?xml version="1.0" ?><foo><option>1</option></foo>'
    var parser = new DOMParser();
    var doc = parser.parseFromString(xmlSource, "text/xml");
    var childNodes = doc.documentElement.childNodes;
    for (var i = 0; i < childNodes.length; i++) {
       var opt = document.createElement("option");
       opt.value = childNodes[i].getAttribute('value');
       opt.text = childNodes[i].textContent;
       //assign other attributes to the corresponding properties
       yourSelectElement.add(opt, null);
    }

    note: i have not tested this with IE9beta.

    further, you could use yourXMLHttpRequest.responseXML directly, but apparently in IE (incl. 9) / msxml, you have to access the single child nodes by calling childNodes.item(index). however, this should work in IE too:

    var child = yourXMLHttpRequest.responseXML.documentElement.firstChild;
    do {
       var opt = document.createElement("option");
       opt.value = child.getAttribute('value');
       opt.text = child.textContent;
       //assign other attributes to the corresponding properties
       yourSelectElement.add(opt, null);
    } while (child = child.nextSibling);

    note: i have not tested this with msxml.

    optgroups support can be implemented using a nested loop if there are always optgroups, or more generic, using recursion and a conditional.

    i believe this would even work with IE6 if you use a cross browser XMLHttpRequest implementation.

    the solution for tables would be similar.

    and to quote mdc:

    "It should never be used to write parts of a table—W3C DOM methods should be used for that—though it can be used to write an entire table or the contents of a cell." (developer.mozilla.org/…/dom:element.innerhtml)

    but if you really like innerHTML that much, you can always use e.g. div elements and style them with table layout, at the cost of semantics.

    i dont see any purpose for innerHTML in the context of current browser versions, besides as a foundation for irrelevant complaints on irrelevant browser inconsistencies 😉

    g

  15. guiseppe says:

    @xslt 2.0 support:

    yes please!

    @Aseem:

    i too am not happy about IE XMLHttpRequest returning msxml objects.

    g

  16. Craig says:

    > CvP

    Thanks. That link of yours made Firefox (3.6.10) freeze. All 8 of my cores started to spike, the fan got loud, and I waited several minutes for it to complete but eventually gave up and had to manually kill FF. I hope IE9 does better, but at least it is in good company.

  17. Holger says:

    @Adam, guiseppe:

    Shorter would be:

    var domParser = new DOMParser();
    var node = domParser.parseFromString( optionGroupAndOptionsFromXMLHTTPRequestResponse, "text/xml" );
    var rightDocNode = document.importNode(node, true);
    mySelectObject.appendChild(rightDocNode);

    add some object detection for old IEs :-)

  18. Mona says:

    @Holger – are you just suggesting this out loud as an idea???…

    In most cases we don't want an optgroup (IE has issues rendering them anyway)

    We just want to add [x] options which could be 25 or 250 options which means that you would still need to make 25 to 250 calls to appendChild() for each option which completely misses the point of importing a DOM structure.

    Plain and simple, MSFT should hurry up and fix .innerHTML on Selects and Tables before they go to market with IE9 touting it as an HTML5 standards based browser when it most certainly fails the most utterly basic of DOM manipulations.

  19. giuseppe says:

    @holger

    i thought so too, tried that in firefox before posting, and it did not work. note, that in your code, the 'node' variable holds a DOMDocument node, which is not what you want to append to the select element. DOMParser does not return DOMDocumentFragment, nodelist, nodeset or the like.

    this is what i tried and did not work in firefox 3.6:

    var child = doc.documentElement.firstChild;
    do {
       yourSelectElement.appendChild(document.importNode(child, true));
    } while (child = child.nextSibling);

    result: no error message; nodes are inserted into DOM; nodes have correct ownerDocument, but apparently are not recognized as HTMLOptionElement. i double checked now with latest final opera and latest dev channel chrome, with similar results (no error msg, same resulting DOM on quick look, but different rendering of DOM).

    this is what i also tried:

    var child = doc.documentElement.firstChild;
    do {
       yourSelectElement.add(document.importNode(child, true), null);
    } while (child = child.nextSibling);

    result: no nodes are inserted into DOM,

    firefox: "Could not convert JavaScript argument arg 0 [nsIDOMHTMLSelectElement.add]"  nsresult: "0x80570009 (NS_ERROR_XPC_BAD_CONVERT_JS)"

    Opera: "Uncaught exception: Error: WRONG_ARGUMENTS_ERR"

    chrome: no error message

    the doctype was html5 and the document rendered in standard mode.

    @Mona:

    "you would still need to make 25 to 250 calls to appendChild() for each option"

    while i am certain you wanted to write "[…] for each select", it should be noted that you do not have to call appendChild manually for each option, hence the loop. and 250 calls using the HTMLSelectElement.add method as described in my previous post take about 7ms (firefox 3.6), 4.5ms (opera) and 3.5ms (chrome) on a 2 year old notebook (core2duo T9400 @2.53GHz), so this should be negligible, especially considering the time it takes to complete the http request for fetching the data. additionally, the select could be populated while the transfer is in progress, to potentially save some fraction of that time. last but not least, writing to innerHTML is not free either, as, you guessed it, besides parsing, the nodes also have to be inserted into the DOM.

    i start getting the feeling that to some, complaining is more important than having a working solution. every ieblog entry has comments with innerHTML complaints. it is annoying. my automatic adaptive in-brain content filter is starting to classify such whiners as trolls and/or hobbyists with lack of knowledge (about their lack of knowledge).

    g

  20. CvP says:

    @Craig

    It freezes firefox4beta for about 15 sec on my core2duo but then it works though firefox seems kinda _little laggy_.

    On IE9, it freezes for a long time every time you want to scroll.

    On Chrome, it's fine.

  21. Tony Ross [MSFT] says:

    @eiras

    Thanks for the heads up about <parsererror></parsererror>. So far I haven't heard any reports of sites breaking due to the thrown exception, but please share any if you know about them. Overall we chose to throw the exception based on the following considerations:

    – Exact error document format was inconsistent cross-browser

    – Alignment with the HTML5 behavior for innerHTML in XHTML

    – Easier for developers to identify during coding

    – Reduced compatibility risk since most live XML doesn't trigger parser errors

  22. Tony Ross [MSFT] says:

    @Laurens Holst

    Thanks for pointing out the encoding issue; I'll look into it. As for responseXML, we kept it returning an MSXML object for compatibility since there are a couple of XML features our native DOM doesn't have yet, namely XPath. I do think your suggestion of using a boolean setting to opt-in to a native DOM may provide a workable alternative in the interim. I'll investigate further and see what I can do.

  23. Mike says:

    @Tony Ross [MSFT] – re: "I'll investigate further and see what I can do" – apparently you weren't briefed on the MSFT feedback protocol.  The correct standard reply is: "Closed – By Design" for all features, bugs, implementations (working or broken).

    However seeing as you haven't been assimilated yet – any chance you can jump into the .innerHTML code and fix a few of the long-standing major bugs that IE has that currently make IE9 incapable of claiming proper HTML5 support? – thanks!

  24. FremyCompany says:

    Preliminary testing seemed to show that DOMSerializer is smarter than innerHTML and provides better code. Does it mean you should change innerHTML to use the same code as the DOMSerializer ?

  25. executive desks says:

    I am very knowledgeable after reading this. Not because I liked this article, but I got this in a very well manner. Very well explained.One can get inspiration with this read.The most striking thing about the centre of Detroit these days is how quiet it is.