Please avoid UTF-7

UTF-7 inherently has some of the security issues that concern people about encodings.  For example, by shifting in and out of the base64 mode, one can create multiple representations of the same string, enabling spoofing and other problems.
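To make the multiple-representation problem concrete, here is a small sketch using Python's built-in "utf-7" codec. The sample string and the variable names are my own illustration, and the exact leniency of a decoder may differ between implementations, so treat this as a demonstration under those assumptions rather than a statement about any particular product.

```python
# Three different byte sequences that Python's "utf-7" codec
# decodes to the same text. The sample string "<A>" is illustrative only.
variants = [
    b"<A>",              # every character written directly
    b"+ADw-A+AD4-",      # "<" and ">" shifted into base64, "A" direct
    b"+ADw-+AEE-+AD4-",  # each character in its own base64 run
]

decoded = [v.decode("utf-7") for v in variants]

# A naive byte-level check for "<" only catches the first variant.
naive_hits = [b"<" in v for v in variants]

for raw, text, hit in zip(variants, decoded, naive_hits):
    print(raw, "->", repr(text), "| raw bytes contain '<':", hit)
```

Since all three inputs decode to the same string, any filtering or comparison done on the raw bytes before decoding can be bypassed simply by choosing a different representation.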

UTF-7 is primarily interesting for legacy mail and NNTP applications that don’t properly handle native or MIME-encoded UTF-8.  The need for new content to be encoded in UTF-7 is very low.  In particular, UTF-7 should be avoided with any modern systems that are natively 8-bit.  For example, XML files don’t inherently have any limitations that would force the need for UTF-7, so there should be no need for UTF-7 in XML files.

Of course, as with any general rule, there may be some exceptions, but I’d encourage you to support UTF-8 or UTF-16 and only use UTF-7 if you run into some system that can’t support an 8-bit encoding.  If you run into such 7-bit limitations, that should probably be a warning that some redesign might be necessary.  For mail this is being considered by the IETF’s eai working group at


Comments (1)

  1. In some cases MLang (on which MSXML6 depends) can add extra ? characters to decoded UTF-7 data, which can cause