Why can’t AppLocale just be added to the Compatibility property sheet page?


Commenter DoesNotMatter wants to know why AppLocale cannot just be added to the Compatibility property sheet as a dropdown option.

One of the things about having a huge topic backlog is that if I just wait long enough, there's a good chance somebody else will answer it, and then I don't have to write anything. And more often than not, that somebody else is Michael Kaplan, who addressed this question in April 2010: Not only is AppLocale not installed by default, AppLocale does everything in its power to remind you that you shouldn't be using it!

AppLocale is the emergency compact spare tire that you pull out of your trunk. Its job is to get you home, at which point you can fix the problem properly. You shouldn't be driving on your emergency compact spare as part of your normal daily routine.

Why does changing the CP_ACP code page require a logoff/logon cycle? Because without it, you would have a Frankenstein configuration situation, where two programs think they're speaking the same language to each other, but aren't. This would happen if one program was launched before you changed the locale, and the other was launched after it.

Sure, if the communication was done through the clipboard CF_TEXT data format, or via one of the system-defined window messages that contain strings, then the window manager can convert from one code page to the other (though it will have to round-trip through Unicode). But that's an awful lot of work for something that isn't even a valid steady-state configuration. And besides, it wouldn't even fix the other communication channels between processes, such as custom clipboard formats or private window messages.
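To make the "round-trip through Unicode" concrete, here is a minimal Python sketch of that kind of conversion. It uses Python's codecs as a stand-in for the system's code page tables, which is an assumption, and the helper name convert_code_page is hypothetical. Windows would use its own conversion routines, and its handling of unmappable characters may differ from the plain "?" substitution shown here.

```python
def convert_code_page(data: bytes, src: str, dst: str) -> bytes:
    """Convert ANSI text from one code page to another by way of Unicode."""
    text = data.decode(src)                    # source code page -> Unicode
    return text.encode(dst, errors="replace")  # Unicode -> target code page, '?' where unmappable


original = "テスト"
sjis = original.encode("cp932")  # Japanese ANSI code page

# Katakana exists in both cp932 and cp936, so the text survives the trip.
gbk = convert_code_page(sjis, "cp932", "cp936")
survived = gbk.decode("cp936") == original

# cp1252 has no katakana, so every character is replaced.
western = convert_code_page(sjis, "cp932", "cp1252")
```

The conversion only works because the converter knows that the bytes are text and which code page they are in. It also loses information whenever the target code page lacks a character.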

For example, suppose you have a copy of LitWare running, and then you change the locale, and then you run another copy of LitWare. You then drag an object from the first copy of LitWare to the second. If LitWare's custom data format uses ANSI strings, then the first copy will encode the object name using the old locale, and the second copy will decode it with the new locale. And since this is a custom data format, there's nowhere the window manager or OLE can step in and say, "Oh, wait, I know what you're doing. There's a string embedded in that structure. Let me convert it for you."
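The LitWare scenario can be sketched in a few lines of Python. Everything here is hypothetical: LitWare is fictional, and the blob layout (a 4-byte length followed by the name bytes) and the function names are invented for illustration. The point is only that the blob carries no record of which code page the name was encoded in.

```python
import struct


def pack_object(name: str, code_page: str) -> bytes:
    """Build an opaque blob: little-endian length prefix, then the ANSI name."""
    raw = name.encode(code_page)
    return struct.pack("<I", len(raw)) + raw


def unpack_object(blob: bytes, code_page: str) -> str:
    """Read the name back, trusting that the caller's code page is the right one."""
    (length,) = struct.unpack_from("<I", blob, 0)
    return blob[4:4 + length].decode(code_page, errors="replace")


# First copy of the program, launched under the old (Japanese) locale.
blob = pack_object("テスト", "cp932")

# A second copy under the same locale reads the name correctly...
same = unpack_object(blob, "cp932")

# ...but a copy launched after the locale change decodes the same bytes as cp1252.
mismatched = unpack_object(blob, "cp1252")
```

Nothing between the two calls can repair this. To the clipboard or to OLE the blob is just bytes, and only the application knows that bytes 4 onward are a string.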

I encounter this problem myself with a Chinese-language program I use. The program uses a Chinese code page rather than Unicode, and I have to use AppLocale to get it to display anything other than meaningless gibberish. (And even then, it still displays what appears to me to be mostly gibberish, but that part of the problem is my own fault for not knowing how to read very much Chinese.) I have to remember that I cannot copy/paste any Chinese characters into or out of the program because the result will be garbage due to the code page mismatch.

Mind you, you can just ignore the logoff reminder that appears when you change the default locale and continue running your Frankenstein configuration. Just understand that you're now in a world where programs can no longer communicate with each other reliably.

Comments (32)
  1. fy_ms_blogs_for_making_me_login_to_read says:

    Said logic works well when you apply it to those in charge of fixing things. Yeah, if AppLocale was intended for developers, then all for the better that it's not available by default and displays that nasty reminder every time you run it.

    But if it's for end users, then what the hell are we supposed to do when AppLocale tells us this is just "a temporary solution". Study Japanese and mail the long-dead company which produced the abomination so that they, just maybe, halt their work on some other abomination and fix this one? As if they would do that! Even if they were still in business, that is. Who cares about gaijins.

    Now, multiply that by 9000, because every Japanese software company out there apparently hates Unicode to the death, and where does that put us, end users?

    I mean, seriously. Why does Windows have an "Application Compatibility" tab then? It falls under the same logic: what if several apps think differently about the version of the OS they're using? Disaster ensues, so let's just delete the tab and leave users on their own.

    Pre-emptive anti-snarky comment: Yeah, I know it's probably not your fault nor responsibility, Raymond. It's just that since you defended the decision to not include AppLocale by default, you're going to get the responses.

    [AppLocale is unrelated to backward compatibility because these apps that AppLocal fixes were already broken the day they shipped. AppLocale is one of those "above and beyond the call of duty" things. (Besides, imagine if it were included by default. People would complain that it doesn't work in inter-process communication situations, and since it's included by default, it should work!) -Raymond]
  2. Anonymous says:

    This is something that irks me like crazy as well, because I use Japanese code page programs alongside code page 1252 programs practically every day.  I wish there was a better solution for this. :(  I know it's not Raymond's fault, and I don't blame him!

  3. Anonymous says:

    > […] because these apps that AppLocal [sic] fixes were already broken the day they shipped.

    So were the apps that need to be fixed using shims, except for the fact that their brokenness was hidden by implementation details in previous Windows versions.

    [And that's the difference. Windows changed, and an application stopped working. But in the AppLocale case, the problem was not caused by a change to Windows. It was just a test scenario the application vendor never tried. (Or tried and rejected.) -Raymond]
  4. Anonymous says:

    Excuses that Raymond finds for other dev's (more likely ProgMan's) decisions always baffle me. Most often, it was a simple oversight or lack of resources, rather than a valid technical reason.

    If you have LitWare configured to run with the Japanese locale, then its copies will exchange CF_TEXT data in the same locale. For other apps, which are now mostly UNICODE, CF_TEXT will be translated by the OS to CF_UNICODETEXT. This translation can be done because the system sets CF_LOCALE for the clipboard text automatically.

    To exchange data with another non-unicode app – too bad. Life is not perfect. But at least there would be an option for it.

    And OLE also assumes the text is in UNICODE format.

    [I think you're misunderstanding the scenario. Application 1 calls SetClipboardData(RegisterClipboardFormat("Custom"), customData). customData is just an opaque binary blob as far as the clipboard and OLE are concerned. Application 2 calls GetClipboardData(RegisterClipboardFormat("Custom")) and receives the opaque binary blob, tries to interpret it, and gets confused because the encoding is wrong. -Raymond]
  5. Anonymous says:
    This blogging software is missing a key feature: a button that inserts a marker in the comments stream, labeled "Bone-headed comments quota exceeded, <Blogger Name> gives up answering comments from this point down."
  6. Anonymous says:

    Custom clipboard format doesn't have any encoding. It's a blob. And it's not like two arbitrary unrelated applications will use the same custom format.

    If you set one app to a specific locale, you'll most likely want to set another related app to the same locale. And then they work. Easy!

    Why do you try to find BS corner cases to justify a lame omission? If you want, you can find arguments as BS as this for almost every OS feature. For example, I can formulate a stronger case for not implementing the FILE_ATTRIBUTE_OFFLINE feature, which you just love to mention.

    ["It's not like two arbitrary unrelated applications will use the same custom format." Oh, CFSTR_FILEDESCRIPTOR is just my imagination? Program running in AppLocale A drags a virtual file onto program running in AppLocale B. Hilarity ensues. -Raymond]
  7. Anonymous says:

    Yeah, I don't know what is up with Japanese programs not using Unicode.  I purchased a program recently that was specifically labeled as supporting Windows Vista… and no Unicode.  Seriously?

    And setting my system code page to Japanese messes with a limited subset of English applications.  Not the most pleasant world to live in.

    (Note: I'm not pleading for Windows to fix it — these are obviously developer problems.)

  8. Anonymous says:

    When Windows first started using Unicode, there were characters in BIG5 (a traditional Chinese character encoding) that didn't appear in Unicode.  If you were developing an application in Taiwan and switched to Unicode you would find yourself with all kinds of awkward design decisions to handle the corner cases caused by these missing characters.  Or you could just stick with BIG5.

    The Shift JIS encoding (used in Japan) seems to have had similar problems.  For example, Shift JIS character 0x9883 maps to Unicode astral character U+216B4 which wasn't defined until 2001 (i.e. probably too late for Windows XP).  In addition, to properly support astral characters you need to treat (Windows) Unicode strings as a variable-length encoding, so it's not much more convenient than sticking with the existing encoding.

    "Just use Unicode" isn't the no-brainer it appears to be for Asian languages.

  9. SvenG says:

    I have to use AppLocale for Japanese applications on occasion, because for some reason *nobody* in Japan has come up with the idea to compile applications as Unicode. Come on, if Win9x compatibility is that important to you, just use the Microsoft Layer for Unicode already. It's 2010 for God's sake, can we please leave this whole ANSI nonsense behind us?

  10. Anonymous says:

    There's no point in trying to convince people they shouldn't use AppLocale. Those who are using it have no choice, because they have to run some applications in SJIS while some others have to run in 1252. No amount of discouraging will help here.

    Furthermore, since Windows uses Unicode internally, and it is applications that are designed to work with a certain code page, Windows should have put the encoding setting on the application level, not as a global setting. Then AppLocale wouldn't have been necessary and instead of shimming (with the possible problems of that approach) you really could have a simple dropdown box in the property page that uses the normal way to do it that will just work.

    [But when different apps are running with different code pages, IPC with non-Unicode content will not work. I drag a file out of Japanese Program and drop it onto Explorer and instead of copying the file, Explorer says "File not found: ÿ¶¥§ª˜" because the Japanese program encoded the filename in SJIS and Explorer decoded it in 1252. -Raymond]
  11. Anonymous says:

    Talking about file paths, I thought (at the time of WinXP) that the decision to allow NTFS to store file / folder names in both Unicode and other code pages was a bit strange. I thought they should have stored everything in Unicode and then applied translation at the NTFS driver level.

    That way, the filesystem would be easier to read, file recovery software and network filesystem drivers would be easier to write, and so on…

  12. fy_ms_blogs_for_making_me_login_to_read says:

    People would complain that it doesn't work in inter-process communication situations, and since it's included by default, it should work!) -Raymond

    You could at least let people disable the reminder and install AppLocale as a property page at their own risk. (Although I still think including it by default wouldn't earn you [Microsoft] more complaints than what you have now. But on the other hand that might have provoked developers to rely on AppLocale instead of switching to Unicode, while with manual installation you can warn the user all you want.)

  13. Anonymous says:

    @Cheong:

    On the filesystem level, it's all UNICODE, except for those short names nobody cares about anymore. ANSI->UNICODE translation is done before CreateFileA even reaches kernel mode.

  14. Anonymous says:

    "In addition, to properly support astral characters you need to treat (Windows) Unicode strings as a variable-length encoding"

    Uh, UTF-16 *is* a variable-length encoding. You *always* need to treat it as a variable-length encoding. If you do not, your code is broken.

  15. Anonymous says:

    wcslen+arrays of wchar_t does not have variable character length.

  16. Anonymous says:

    "Uh, UTF-16 *is* a variable-length encoding. You *always* need to treat it as a variable-length encoding. If you do not, your code is broken."

    I wonder how many such broken programs are out there, and how many of them predated UTF-16.

  17. Anonymous says:

    "So were the apps that need to be fixed using shims, except for the fact that their brokenness was hidden by a implementation details in previous Windows versions.

    [And that's the difference. Windows changed, and an application stopped working. But in the AppLocale case, the problem was not caused by a change to Windows. It was just a test scenario the application vendor never tried. (Or tried and rejected.) -Raymond]"

    In the ANSI era, was supporting running in a codepage other than the one for the language your application is localized into (and displaying meaningful text by… what, transliteration?) really something that anyone would ever say with a straight face should be a test scenario?

    If not, then this was caused by a change to windows – specifically the change from ANSI to Unicode.

    How should an ANSI application written in the ANSI era have been written that would have made it not be 'broken' in a way that requires AppLocale to fix?

    And, yeah, that's no excuse for shipping an ANSI application today. But you made a blanket statement that at least _seems_ to apply to every ANSI application ever made.

    [The introduction of Unicode didn't change the ANSI rules. ANSI apps continued to behave the same way they always did. If they were run on a machine where the ANSI code page != the application's desired ANSI code page, the same exact weird things happened as before. So the behavior of Windows hasn't changed with respect to cross-code-page scenarios. Don't make me draw a diagram… -Raymond]
  18. Anonymous says:

    I think many "Unicode" applications on Java/Windows/.Net only actually support UCS-2. There is not much to do to implicitly support UTF-16 if you don't do much to strings aside from storing them, but *any* processing (even just truncating to a fixed maximum size) must take surrogate characters into account. In the case of truncating, if the new last character turns out to be a high surrogate, it must be removed too.

    However, it's still easier to support than UTF-8, because there is a maximum of two 16-bit words per code point.

    That said, even UCS-4 could be called a variable length encoding if we start including diacritics in the picture.

  19. Anonymous says:

    "However, [UTF-16]'s still easier to support than UTF-8, because there is a maximum of two 16-bit words per code point."

    Why does that make it easier?

  20. Anonymous says:

    "Why does that make it easier?"

    When I wrote that, I was mainly thinking of the reasons there isn't a "UTF-8 locale": blogs.msdn.com/…/816996.aspx

    I had in mind that writing the "A" function with the guarantee that a character would never take more than 2 chars was easier, but I didn't think seriously of reasons for it.

    Thinking of it, I think it's easier because it's easier to cheat: Rather than a buffer, you can just use a char and a "special case when there is a second one", rather than working with an array of one to four chars. Also, if you find a high or low UTF-16 surrogate at a given position, you instantly know where the character begins or ends; with UTF-8, if you don't land on the first char, you have no way of knowing whether you are on the second, third or fourth char without looking back to the first char of the code point (or ahead to the start of the next).

  21. Anonymous says:

    @Sven G & Nicole: "because for some reason *nobody* in Japan has come up with the idea to compile applications as Unicode"

    The reason is that the Japanese along with a number of other Asian cultures, do not regard Unicode as the Silver Bullet To All Our Character Set Problems that the West typically does.  Ironic really, given that ANSI/ASCII Luddites are accused of not thinking outside their cultures… the Unicode Evangelists are guilty of much the same blinkered thinking, just with different blinkers and the added smug arrogance of thinking they have solved The Problem and everyone else should come to their way of thinking (the ASCII/ANSI Luddites on the other hand typically know that there is a problem but simply don't believe it applies to them).

    :)

  22. Mordachai says:

    Asking small companies to convert software that is handled with various multibyte arrangements (ANSI, Shift-JIS, BIG5) and has many thousands of pages of translation in these various arrangements, spread out throughout the software (in .rc files, proprietary .bin files, .txt files, &c), because, essentially, "Microsoft said so", is hardly a justifiable business proposition.

    We have software, that is very expensive to retranslate from scratch, and staff that is already working to capacity to maintain technical issues & new features.  Adding rewrites when the software already works on many different language versions of Windows and from which our customer base is not complaining of any limitations thereof, is … not sound logic.

    Personally, this seems like "if we ignore the problem hard enough it's like it doesn't exist".  The problem exists, and Microsoft's unwillingness to make things work better for their customers is backwards thinking, IMO.  Worrying about end-users' expectations that they can run the same software in two different code pages and expect them to interact properly seems goofy.  There are lots of limitations in lots of software, and it's quite often assumed that the problem is with the software, not with Windows.  It's a strange world-view from where I'm standing to think that users blame the OS first: our support people will tell you in no uncertain terms that *everything* is our app's fault by default.

    Making things better and giving a warning that "by setting this app to a non-standard code page you understand that it will not necessarily work properly in every situation" should be more than enough to cover thy ***, and inform the end-user that they're using something that's a shim (just as using any compatibility setting is not guaranteed to fix the software's problems, and the software may still experience issues with it).  Why locale should be any different from backward compatibility in the eyes of the user in terms of expectations of perfect function is beyond me.

  23. Anonymous says:

    @laonianren

    That is not quite accurate.

    And it is because there is no such thing as "Shift-JIS" or "Big-5"

    All these standards have versions associated with them, same as Unicode.

    For instance, the Shift-JIS character 0x9883 was added by JIS X 0213:2000 (in 2000) and to Unicode in Unicode 3.1 (March 2001)

    charset.info/sjis-2004-std.txt

    The initial version of Unicode included enough characters to do correct round-tripping to all major code pages (including JIS and Big 5). Meantime the national code pages changed, and Unicode struggled to keep up.

    It is also true that Windows did not keep up with Unicode too well.

    But you know what? It also did not keep up with JIS and Big-5 either.

    Yes, you could put some glyphs there and be rendered. But things like sorting, or the IME, will still not work.

    So ANSI code pages are really not a solution.

  24. Anonymous says:

    Why does changing the CP_ACP code page require a logoff/logon cycle?

    Because it requires a reboot, and a reboot will log you off :-)

  25. Anonymous says:

    @Cheong

    NTFS does not store file/folder names in anything but UTF-16 (ok, in fact "16-bit code units", because it has no smarts to deal with invalid surrogate sequences, or to prevent the use of undefined Unicode code points).

    But definitely not in other code pages.

    Maybe you are mixing it with FAT32?

  26. Anonymous says:

    @Mihai

    Yes, you could put some glyphs there and be rendered.

    You are wrong!

    If you run an ANSI application on Unicode OS (NT/2000/XP/Vista/Win 7 and the equivalent servers), the strings in ANSI APIs will be converted to Unicode using the ANSI code page, processed by the Unicode API, and the result converted back to the ANSI code page.

    If the Windows JIS tables are outdated then the JIS characters that don't map to Unicode will be lost.

    So deciding to stick with Shift-JIS or Big5 solves exactly nothing (unless you run on Windows 9x :-)

  27. Anonymous says:

    AppLocale surely does some strange stuff. I installed it once on my XP Pro, started some applications in Japanese locale (my default non-Unicode code page is Russian cp1251), then didn't use AppLocale for a long time, but didn't uninstall it. Then I bought and installed the Russian version of Program XYZ, and in some places, including Start menu icons and some shell extension context menus, I saw random Japanese characters instead of proper Russian names (it looked like the cp1251 string was treated as Shift-JIS or maybe UTF-8). It seems that the Program XYZ installer uses the non-Unicode code page somewhere during installation, but still it's strange how an installed AppLocale can cause such behavior. I went crazy and spent something like 10 hours talking with XYZ support. They suggested sending registry dumps to them, tweaking some registry options and reinstalling Program XYZ several times, but nothing helped. The last suggestion from support was to reinstall Windows from scratch, something I wasn't happy to do, so I decided to live with it (at least I could manually rename Start menu shortcuts). Several months later somebody told me that the problem might be AppLocale. I uninstalled it and reinstalled Program XYZ and everything became nice. I was quite upset that XYZ support hadn't suggested this option to me.

    [Probably because you didn't tell them you were using AppLocale. They're not psychic. -Raymond]
  28. Anonymous says:

    "We have software, that is very expensive to retranslate from scratch"

    You don't have to retranslate from scratch. You can convert the files to Unicode, you know. In fact, the Resource Compiler always converts to Unicode when compiling .RC files, even if the original files were ANSI, and in fact it is required.

  29. Anonymous says:

    In any case, you can suppress this warning message by using environment variables directly instead:

    tedwvc.wordpress.com/…/experimenting-with-microsoft-applocale

    Not to mention that some applications have "Paste Special…" to paste things in another clipboard format.

  30. Anonymous says:

    @Tagir Valeev

    You are right.

    AppLocale leaves some "junk" behind that affects encoding in some applications. I have also seen it in installers (so it might be something msi related?)

    The solution is to delete a temporary file: %WINDIR%\AppPatch\AppLoc.tmp

    (it is created again when you run AppLocale though)

  31. Anonymous says:

    Does that Chinese program happen to be called XYZ?

    [You must be new here. -Raymond]
  32. Anonymous says:

    > Probably because you didn't tell them you were using AppLocale. They're not psychic. -Raymond

    Right, but they had enough information. They asked me to download some special software to gather information about my system and send the resulting .cab archive to them. Before sending it I unpacked that cabinet and examined it a little. It included a list of all installed programs as well as dumps of many registry keys, lists of system files and so on. Of course I didn't tell them something like 'See, guys, I have dozens of programs installed and AppLocale among them, maybe it causes the problem?' If I had suspected AppLocale of causing the problem, why would I have called support in the first place?

    Well, I admit that this is the wrong place to blame "XYZ" support. I just wanted to second the point that using AppLocale may lead to very strange problems.

    [Perhaps it didn't capture the registry key that specified that AppLocale was active, or it did but they didn't know how to interpret it. Troubleshooting complex systems is hard. -Raymond]
