MLang & MSXML6 don’t like UTF-7


In some cases MLang (on which MSXML6 depends) can add extra ? characters to decoded UTF-7 data, which can cause UTF-7 encoded XML to fail to parse.

UTF-7 isn’t a great encoding anyway, so this is just another reason to Please Avoid UTF-7.

In particular there doesn’t seem to me to be much reason to use encodings other than UTF-8 or UTF-16 with XML data.  XML is new enough that Unicode support exists for whatever the XML is being used for.
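If you are stuck with a UTF-7 XML file, one way around the problem is to re-encode it as UTF-8 before it ever reaches the parser. Here is a minimal sketch in Python (not MSXML or MLang code). The function name reencode_xml_as_utf8 is made up for illustration, and it assumes the whole document fits in memory and that any XML declaration sits at the very start of the data.

```python
import re


def reencode_xml_as_utf8(data: bytes, source_encoding: str = "utf-7") -> bytes:
    """Hypothetical helper: decode XML bytes and return them as UTF-8."""
    # Decode using the encoding the file was actually written in.
    text = data.decode(source_encoding)

    # If there is an XML declaration with an encoding pseudo-attribute,
    # rewrite it so the declaration matches the new byte encoding.
    text = re.sub(
        r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2',
        r'\1\2UTF-8\2',
        text,
        count=1,
    )

    return text.encode("utf-8")
```

The same helper should work for other legacy code pages by passing a different source_encoding, for example "cp1252", provided Python has a codec for it.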

Comments (2)

  1. Mihai says:

    Not only is XML new enough to support Unicode, but UTF-8 and UTF-16 are mandatory:

    "All XML processors MUST accept the UTF-8 and UTF-16 encodings of Unicode 3.1"

    http://www.w3.org/TR/2006/REC-xml-20060816/#charsets

    The second I see an "XML parser" project explaining that there is no Unicode support "yet," I am out of there.

    Not Unicode == not compliant!

  2. shawnste says:

    True, but… ahem… I know of at least one internal XML source file we have with a non-Unicode code page.  (I just converted it so we’re OK now :)