How does IE handle the IDNA2008 RFCs?

I've had a couple of questions about the IE/Windows behavior for IDNA2008, so I thought I'd address that a bit.  The simple answer for IE is that it just calls the Windows IDN APIs: https://msdn.microsoft.com/en-us/library/dd318142(v=vs.85).aspx

When we moved from IDNA2003 to IDNA2008, we had to consider some compatibility issues, most of which are described in Unicode's UTS#46.

IDNA2008 removed the symbols and punctuation characters that were permitted in IDNA2003.

IDNA2008 doesn't fully address the mapping step that IDNA2003 used.

IDNA2008 allows a few "deviation characters" (ß, final sigma ς, ZWJ, and ZWNJ in UTS#46) as valid characters in their own right, whereas IDNA2003 mapped them to another form, so both variations resolved to a single name.  This means that the same name can resolve to a different machine depending on whether the client strictly follows IDNA2003 or IDNA2008.  That appears to be a security risk: a domain name could resolve to a different machine or site if a registrar (or 2nd-level registry) didn't follow secure rules.
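To make the deviation-character problem concrete, here is a minimal sketch in Python. It does not call the Windows IDN APIs. It assumes Python's built-in "idna" codec (which implements IDNA2003) as a stand-in for the IDNA2003 behavior, and it uses a hypothetical helper, encode_without_mapping, to approximate what happens when ß is kept as a valid character instead of being mapped. The name faß.de is just an illustrative example.

```python
# Python's built-in "idna" codec implements IDNA2003 (RFC 3490).
name = "faß.de"

# IDNA2003: nameprep maps U+00DF (ß) to "ss" before encoding.
idna2003 = name.encode("idna")


def encode_without_mapping(domain: str) -> bytes:
    """Hypothetical helper: Punycode each non-ASCII label as-is.

    This skips the mapping step entirely, which approximates treating
    ß as a valid character in its own right. It is not a full
    IDNA2008 implementation (no validity or contextual checks).
    """
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label.encode("ascii"))
        else:
            labels.append(b"xn--" + label.encode("punycode"))
    return b".".join(labels)


unmapped = encode_without_mapping(name)

print(idna2003)              # b'fass.de'
print(unmapped)              # a different, xn-- prefixed name
print(idna2003 == unmapped)  # False
```

The two results are different DNS names, so nothing guarantees they are registered to the same owner. That is the scenario where an older client and a newer client could end up at different sites.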

The contextual rules for Bidi text were modified and additional rules were added.

The result is that our IDN APIs have the following behavior:

  • Windows follows the mapping provided by UTS#46 for the "mapping phase."
  • Windows continues to allow the symbols and punctuation characters from IDNA2003.
  • Windows continues to map the deviation characters to the IDNA2003 forms.  This is primarily due to concerns about the potential for abuse through real or imagined domain hijacking, where a user ends up at one site in an older browser and a different site in a newer browser.
  • Windows does not enforce the contextual rules, presuming that names that aren't permitted to be registered won't resolve.
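The first two bullets can be illustrated with a short Python sketch. It again uses Python's built-in "idna" codec (IDNA2003) purely as a stand-in. It does not exercise the Windows APIs, and its mapping comes from IDNA2003 nameprep rather than the UTS#46 table, so treat it as an approximation of the ideas rather than of the exact Windows output. The domain names are illustrative.

```python
# Python's built-in "idna" codec implements IDNA2003; used here as a stand-in.

# Mapping phase: non-ASCII input is case-folded before Punycode encoding,
# so upper- and lowercase spellings of a label produce the same ASCII name.
upper = "BÜCHER.example".encode("idna")
lower = "bücher.example".encode("idna")
print(upper == lower)  # True

# Symbols: IDNA2003 accepts a symbol such as U+2603 SNOWMAN as a label.
snowman = "☃.net".encode("idna")
print(snowman)
print(snowman.decode("idna"))  # round-trips to the original name
```

A strict IDNA2008 implementation would be expected to reject the symbol label, while an API that keeps the IDNA2003 symbols continues to encode it.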

-Shawn