How do I get my ANSI based application to run correctly?


A common question is “how do I get my ANSI based code page application to run on a system that has a different code page?”

The most obvious solution is to use Unicode :)  Then you avoid the code page messiness that leads to this kind of problem.  For some legacy applications you may need a stop-gap measure until a Unicode version is created, and some legacy applications may no longer have any support at all.
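To see why a code page mismatch corrupts data in the first place, here is a small illustrative sketch (in Python for brevity; the byte-level behavior is the same no matter what language the application is written in). The same raw ANSI bytes decode to completely different text depending on which code page interprets them:

```python
# The same raw bytes mean different things under different ANSI code pages.
# 0x82 0xA0 is a single two-byte character in Shift-JIS (code page 932),
# but two unrelated one-byte characters in Windows-1252.
raw = b"\x82\xa0"

japanese = raw.decode("cp932")   # hiragana "a": one character
western = raw.decode("cp1252")   # low quotation mark + non-breaking space

print(japanese)  # あ
print(len(japanese), len(western))  # 1 2
```

This is exactly what happens when an ANSI application written for one system code page runs on a machine configured for another: the bytes survive, but the meaning does not.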

The system code page affects all ANSI applications and all users of the machine, so it is a really terrible idea to change the system code page “for” the user just so that your application works as it expects.  Worst case, you might explain the limitation to your user and ask them to change their code page themselves, but note that this could create compatibility problems with their other applications and corrupt their data if they don’t fully understand the impact of changing the system code page.

Another possible solution is the “Microsoft AppLocale Utility” at http://www.microsoft.com/globaldev/tools/apploc.mspx.  This allows an application to be run using a different code page than the system default.  This can also be confusing if data is exchanged with other applications, but it may provide an acceptable workaround until a Unicode solution is available for a specific problem.

Comments (4)

  1. Teddy says:

    Hi,

    where is the Locale Builder for Vista?

    You (MS) wrote articles in MSDN, .NET and other magazines, and made big announcements on the website. But three months after the ETA, I see only "Beta Closed" and "Download removed" messages :-(

  2. shawnste says:

    We are working on releasing the RTM version; however, it has been delayed.  You may use the CultureAndRegionInfoBuilder class as a workaround if necessary.

  3. stephane says:

    If only an application could set its local code page to UTF-8 — it would make internationalization of any legacy ANSI application such a breeze.

    Just put a "SetCodePage(UTF8)" call at the beginning of your code, convert the few static international strings in your code to UTF-8, if you have any, and that’s it: no need to rewrite all the char* to TCHAR and do all the usual Unicode-conversion chores.

    I understand that this would make the application run just a little slower than a full UTF-16 application, since it would still use the internal xxxA-to-xxxW conversions. But it wouldn’t be slower than a traditional ANSI application.

    Any reason why this hasn’t been implemented? Are the xxxA-to-xxxW internal conversions too complicated to realistically allow a per-application code page? Or is it just that MS policy is to force all developers to switch to UTF-16?

  4. shawnste says:

    We’re getting off topic, but the "A" functions in Windows pretty much expect a one-byte or two-byte code page.  (As do "ansi" applications.)

    UTF-8 can end up with 3 or 4 bytes per Unicode code point, which can break buffers and other assumptions, making it really hard to get UTF-8 to behave correctly as an "ansi" code page :(
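    To make that byte-count point concrete, here is an illustrative sketch (in Python for brevity; code page 932, Shift-JIS, stands in as a typical double-byte "ansi" page). A DBCS page never needs more than 2 bytes per character, while UTF-8 routinely needs 3 or 4:

    ```python
    # Under a classic DBCS "ansi" code page, every character is 1 or 2 bytes;
    # under UTF-8 the same text can need 3 or 4 bytes per code point.
    hiragana_a = "\u3042"   # あ
    emoji = "\U0001F600"    # 😀 (outside the BMP, so 4 bytes in UTF-8)

    print(len(hiragana_a.encode("cp932")))  # 2 bytes in Shift-JIS
    print(len(hiragana_a.encode("utf-8")))  # 3 bytes in UTF-8
    print(len(emoji.encode("utf-8")))       # 4 bytes in UTF-8
    ```

    A buffer sized on the assumption of "at most 2 bytes per character" can therefore overflow the moment UTF-8 data flows through an "A" function.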