HTML5 Parsing in IE10

The Web is better when developers can use the same markup and same code across different browsers with the same results. The second platform preview of IE10 makes progress in this area by fully supporting the HTML5 parsing algorithm.

This continues work we started in previous releases to improve IE’s HTML parser to make more HTML “just work” in the same way across browsers. Some key examples include supporting SVG-in-HTML, supporting HTML5 semantic elements, preserving the structure of unknown elements, and improving whitespace handling. As a result of this work, most HTML parses the same across IE9 and other browsers.

Getting the right behavior

The goal of this work is to ensure all HTML parses the same across modern browsers. This is possible because HTML5 is the first version of HTML to fully define HTML parsing rules, down to the last edge case and error condition. Even if your markup is invalid, HTML5 still defines how to parse it and IE10 follows these rules. The examples below illustrate some cases fixed as part of these improvements.

HTML              DOM (HTML5 + IE10)      DOM (IE9)

<b>1<i>2</b>      |- <b>                  |- <b>
                    |- "1"                  |- "1"
                    |- <i>                  |- <i>
                      |- "2"                |- "2"
                                          |- <i>

<p>Test 1         |- <p>                  |- <p>
<object>            |- "Test 1\n"           |- "Test 1\n"
  <p>Test 2       |- <object>             |- <object>
</object>           |- "\n  "               |- "\n  "
                    |- <p>                |- <p>
                      |- "Test 2\n"         |- "Test 2\n"
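To compare how different browsers parse the same markup, it can help to print the resulting DOM in the same "|-" notation used in the examples above. The following is a minimal sketch, not part of IE or any library: dumpTree is a hypothetical helper name, and it assumes only the standard nodeType, nodeName, nodeValue, and childNodes properties.

```javascript
// Hypothetical helper: serialize a DOM subtree using the "|-" notation.
// Works on any object exposing nodeType, nodeName, nodeValue and childNodes.
function dumpTree(node, depth) {
  depth = depth || 0;
  var indent = new Array(depth + 1).join("  "); // two spaces per level
  var lines = [];
  for (var i = 0; i < node.childNodes.length; i++) {
    var child = node.childNodes[i];
    if (child.nodeType === 3) {
      // Text node: JSON.stringify adds quotes and makes "\n" visible.
      lines.push(indent + "|- " + JSON.stringify(child.nodeValue));
    } else if (child.nodeType === 1) {
      // Element node: print the tag name, then recurse into its children.
      lines.push(indent + "|- <" + child.nodeName.toLowerCase() + ">");
      lines = lines.concat(dumpTree(child, depth + 1));
    }
  }
  return lines;
}

// Browser-only usage: parse the first example and log the resulting tree.
if (typeof document !== "undefined") {
  var container = document.createElement("div");
  container.innerHTML = "<b>1<i>2</b>";
  console.log(dumpTree(container).join("\n"));
}
```

Running the usage snippet in two browsers and comparing the logged output is a quick way to spot parsing differences for a given piece of markup.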

Interoperable innerHTML

These improvements apply to innerHTML too. Code patterns like these now work as you’d expect in IE10:

var select = document.createElement("select");
select.innerHTML = "<option>one</option><option>two</option>";

var table = document.createElement("table");
table.innerHTML = "<tr><td>one</td><td>two</td></tr>";
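If your code still has to run in browsers where these patterns may fail, one option is to check at runtime whether the assignment produced the expected elements. The sketch below is an illustration under stated assumptions: innerHTMLYields is a hypothetical helper name, and the document object is passed in as a parameter so the check can be exercised outside a browser.

```javascript
// Hypothetical helper: assign markup to a detached container via innerHTML
// and report whether the expected number of descendants was created.
function innerHTMLYields(doc, containerTag, markup, descendantTag, expected) {
  var container = doc.createElement(containerTag);
  try {
    container.innerHTML = markup;
  } catch (e) {
    // Assumption: some older browsers may throw on this assignment.
    return false;
  }
  return container.getElementsByTagName(descendantTag).length === expected;
}

// Browser-only usage with the two patterns shown above.
if (typeof document !== "undefined") {
  console.log(innerHTMLYields(document, "select",
    "<option>one</option><option>two</option>", "option", 2));
  console.log(innerHTMLYields(document, "table",
    "<tr><td>one</td><td>two</td></tr>", "td", 2));
}
```

When the check returns false, a fallback such as building the elements with createElement and appendChild avoids depending on the parser.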

Better error reporting for developers

HTML5 ensures markup will parse consistently. It’s still a good idea for developers to avoid writing invalid markup to begin with. Writing valid markup helps your site work the way you expect and is more compatible with older browsers.

To help developers with this, IE10 now reports HTML parsing errors via the F12 developer tools.

[Screenshot: the F12 developer tools showing an HTML5 parsing error]

Removing legacy features

Because some features in earlier versions of IE aren’t compatible with HTML5 parsing, we’ve removed them from IE10 mode. Sites that rely on these legacy features will still work when running in legacy modes. This way, sites that work today will continue to work with IE10 even if the developers of the site don’t have the time to update them.

Conditional Comments

<!--[if IE]>
This content is ignored in IE10 and other browsers.
In older versions of IE it renders as part of the page.
<![endif]-->

This means conditional comments can still be used, but they will only target older versions of IE. If you need to distinguish between more recent browsers, use feature detection instead.
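As a rough illustration of feature detection, the sketch below picks an event-binding API based on what the target object actually exposes rather than on the browser's name or version. The function name addListener is hypothetical; addEventListener and attachEvent are the standard and legacy IE methods respectively.

```javascript
// Hypothetical helper: bind a handler using whichever API the target exposes.
// Returns the name of the API used, or null if neither is available.
function addListener(target, type, handler) {
  if (typeof target.addEventListener === "function") {
    // Standard DOM event model.
    target.addEventListener(type, handler, false);
    return "addEventListener";
  }
  if (target.attachEvent) {
    // Legacy IE event model; event names carry an "on" prefix.
    target.attachEvent("on" + type, handler);
    return "attachEvent";
  }
  return null;
}

// Browser-only usage.
if (typeof window !== "undefined") {
  addListener(window, "load", function () {
    console.log("page loaded");
  });
}
```

Because the check looks at capabilities, the same code keeps working when a browser gains support for the standard API.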

Element Behaviors

<html xmlns:my>
<?import namespace="my" implementation="my.htc">
<my:element>
  This parses as an unknown element in IE10 and other browsers.
  In older versions of IE it binds to "my.htc".
</my:element>
</html>

XML Data Islands

<xml>
  This parses as <b>HTML</b> in IE10 and other browsers.
  In older versions of IE it parses as XML.
</xml>

Your feedback welcome

We welcome your feedback to help make sure that all HTML parses consistently across browsers (including via innerHTML). Download the second platform preview of IE10, try it out, and please report any bugs you find via Connect.

—Tony Ross, Program Manager, Internet Explorer