MultiByteToWideChar ignores MB_PRECOMPOSED behavior for UTF-8 and UTF-7


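UTF-7 and UTF-8 conversion by MultiByteToWideChar and WideCharToMultiByte is strictly Unicode to Unicode, and none of the flags are honored except the invalid-character checks: MB_ERR_INVALID_CHARS for MultiByteToWideChar and WC_ERR_INVALID_CHARS for WideCharToMultiByte.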

I’ve noticed a few cases where people expect MB_PRECOMPOSED behavior because the documentation says “This is the default translation option”. For UTF-7 and UTF-8, however, the Unicode is just converted to a different encoding form; it isn’t translated, so combining sequences are not composed along the way.
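To make the distinction concrete, here is a minimal sketch in Python. It does not call MultiByteToWideChar. It uses Python's own UTF-8 codec as a stand-in, on the assumption that any conforming UTF-8 decoder behaves the same way on this point. The sample string ("e" followed by U+0301) is my own choice for illustration.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT, encoded as UTF-8.
# (Illustrative input; any decomposed sequence would do.)
decomposed_utf8 = b"e\xcc\x81"

# Decoding only changes the encoding form. Both code points survive as-is;
# nothing is composed into U+00E9.
decoded = decomposed_utf8.decode("utf-8")
print([hex(ord(c)) for c in decoded])   # ['0x65', '0x301']

# Composition is a separate, explicit normalization step (NFC here).
composed = unicodedata.normalize("NFC", decoded)
print([hex(ord(c)) for c in composed])  # ['0xe9']
```

If precomposed output is what you actually need, it has to come from a normalization step after the conversion (on Windows, NormalizeString is the likely candidate), not from a conversion flag.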

Comments (2)

  1. Dean Harding says:

    According to the documentation for MultiByteToWideChar, it’s supposed to return ERROR_INVALID_FLAGS when you pass anything but 0 or MB_ERR_INVALID_CHARS for dwFlags when the code page is UTF-7 or UTF-8. And MB_ERR_INVALID_CHARS is only valid for UTF8 and only for Windows XP and later…

  2. Shawn Steele says:

    Yup, my oops. I updated the post to clarify. I said that people were passing in the wrong flags, but really it’s just some people assuming they’ll get MB_PRECOMPOSED behavior for UTF-8 because that flag is doc’d as being the default.