A brief and also incomplete history of Windows localization


The process by which Windows has been localized has changed over the years.

Back in the days of 16-bit Windows, Windows was developed with a single target language: English.

Just English.

After Windows was complete and masters were sent off to the factory for duplication, the development team handed the source code over to the localization teams. "Hey, by the way, we shipped a new version of Windows. Go localize it, will ya?"

While the code that was written for the English version was careful to put localizable content in resources, there were often English-specific assumptions hard-coded into the source code. For example, it may have assumed that the text reading direction was left-to-right or assumed that a single character fit in a single byte. (Unicode hadn't been invented yet.)

The first assumption is not true for languages such as Hebrew and Arabic (which read right-to-left), and to a lesser degree Chinese and Japanese (which read top-to-bottom in certain contexts). The second assumption is not true for languages like Chinese, Japanese, and Korean, which use DBCS (double-byte character sets).
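
To make the second assumption concrete, here is a minimal Python sketch (not anything from the Windows code base) that compares character counts with byte counts under a single-byte code page and a double-byte one. The sample strings and the choice of the `cp1252` and `cp932` codecs are assumptions made purely for illustration.

```python
# Illustrative only: shows why "one character == one byte" breaks under DBCS.
# cp1252 (Western) and cp932 (Japanese Shift-JIS) are assumed example codecs.

def byte_length(text: str, codec: str) -> int:
    """Return how many bytes `text` occupies when encoded with `codec`."""
    return len(text.encode(codec))

western = "Datei"      # German word; every character fits in one byte
japanese = "ファイル"    # Japanese word; each character takes two bytes in cp932

# Character count next to byte count for each sample string.
print(len(western), byte_length(western, "cp1252"))
print(len(japanese), byte_length(japanese, "cp932"))
```

Code that sizes a buffer or walks a string by assuming the two numbers are equal works for the first string and silently misbehaves for the second.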

The localization teams made the necessary code changes to make Windows work in these other locales and merged them back into the master code base. The result was that there were three different versions of the code for Windows, commonly known as Western, Middle-East, and Far-East. If you wanted Windows to support Chinese, you had to buy the Far-East version of Windows. And since the code was different for the three versions, they had different sets of bugs, and workarounds for one version didn't always work on the others. (Patches didn't exist back then, there being no mechanism for distributing them.)

If you ran into a problem with a Western language, like say, German, then you were out of luck, since there was no German Windows code base; it used the same Western code base. Windows 95 tried out a crazy idea: Translate Windows into German during the development cycle, to help catch these only-on-German problems while there was still time to do something about it. This, of course, created significant additional expense, since you had to have translators available throughout the product cycle instead of hiring them just once at the end. I remember catching a few translation errors during Windows 95: A menu item Sort was translated as Art (as in "What sort of person would do this?") rather than Sortieren ("put in a prearranged order"). And a command line tool asked the user a yes/no question, prompting "J/N" (Ja/Nein), but if you wanted to answer in the affirmative, you had to type "Y".
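
The "J/N" mismatch is easier to see in code. The following is a hypothetical Python sketch, not the actual tool: it assumes the prompt text and the accepted affirmative key are stored as separate localizable strings, so translating one without the other reproduces the bug. All names here (`RESOURCES`, `is_affirmative`, the `de-broken` table) are invented for illustration.

```python
# Hypothetical string tables; the real tool's resource layout is not known.
RESOURCES = {
    "en": {"prompt": "Continue (Y/N)? ", "yes_key": "Y", "no_key": "N"},
    # Prompt translated, but the accepted key left as the English "Y".
    "de-broken": {"prompt": "Fortsetzen (J/N)? ", "yes_key": "Y", "no_key": "N"},
    # Prompt and accepted key translated together.
    "de": {"prompt": "Fortsetzen (J/N)? ", "yes_key": "J", "no_key": "N"},
}

def is_affirmative(answer: str, locale: str) -> bool:
    """Compare the user's answer against the locale's affirmative key."""
    expected = RESOURCES[locale]["yes_key"]
    return answer.strip().upper() == expected.upper()
```

With the `de-broken` table, the user sees "J/N" but only "Y" is accepted, which is the behavior described above. No code differs between the three locales; only the string tables do.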

The short version of the answer to the question "Why can't the localizers change the code if they have to?" is "Because the code already shipped. What are you going to do, recall every copy of Windows?"

At least in Windows 95, the prohibition on changing code was violated if circumstances truly demanded it, but doing so was very expensive. The only such change I can think of is the one to remove the time zone highlighting from the world map. And the change was done in the least intrusive possible way: Patching four bytes in the binary to make the highlight and not-highlight colors the same. You dare not do something like introduce a new variable; who knows what kinds of havoc could result!

Having all these different versions of Windows made servicing very difficult, because you had to develop and test a different patch for each code base. Over the years, the Windows team has developed techniques for identifying these potential localization problems earlier in the development cycle. For a time, Windows was "early-localized" into German and Japanese, so as to cover the Western and Far-East scenarios. Arabic was added later, expanding coverage to the Mid-East cases, and Hindi was added in Windows 7 to cover languages which are Unicode-only.

Translating each internal build of Windows has its pros and cons: The advantage is that it can find issues when there is still time to make code changes to address them. The disadvantage is that code can change while you are localizing, and those code changes can invalidate the work you've done so far, or render it pointless. For example, somebody might edit a dialog you already spent time translating, forcing you to go back and re-translate it, or at least verify that the old translation still works. Somebody might take a string that you translated and start using it in a new way. Unless they let you know about the new purpose, you won't know that the translation needs to be re-evaluated and possibly revised.

The localization folks came up with a clever solution which gets most of the benefits while avoiding most of the drawbacks: They invented pseudo-localization, which simulates what Michael Kaplan calls "an eager and hardworking yet naïve intern localizer who is going to translate every single string." This was so successful that they hired a few more naïve intern localizers: one which performed "Mirrored pseudo-localization" (covering languages which read right-to-left) and another which performed "East Asian pseudo-localization" (covering Chinese, Japanese, and Korean).
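
As a rough illustration of the idea, here is a minimal Python sketch of a pseudo-localizer. It is not Microsoft's tool; the character map, the bracket markers, and the 30% padding are all assumptions chosen to show some commonly used tricks: swap ASCII letters for accented look-alikes so that untranslated (hard-coded) strings stand out, pad the string to mimic longer translations, and leave format placeholders untouched.

```python
import re

# Assumed look-alike mapping; a real tool would use a much larger table.
ACCENTED = str.maketrans("AEIOUaeiouCcNn", "ÀÉÎÕÜàéîõüÇçÑñ")

# Matches printf-style placeholders such as %s, %d, or %1 that must survive intact.
PLACEHOLDER = re.compile(r"(%[0-9]*[a-zA-Z]|%[0-9]+)")

def pseudo_localize(text: str, padding_ratio: float = 0.3) -> str:
    """Return a readable but obviously 'translated' version of `text`."""
    parts = PLACEHOLDER.split(text)
    # re.split with a capture group alternates plain text and placeholders;
    # odd-indexed parts are the placeholders, which are kept as-is.
    converted = "".join(
        part if i % 2 else part.translate(ACCENTED)
        for i, part in enumerate(parts)
    )
    padding = "!" * max(1, round(len(text) * padding_ratio))
    # Brackets make truncation visible: a clipped string loses its "]".
    return f"[{converted}{padding}]"
```

Running every resource string through a function like this and then using the product in "pseudo" mode surfaces plain-English strings (hard-coded text), clipped brackets (layouts too small for longer translations), and broken placeholders, all without waiting for a human translator.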

But the rule prohibiting code changes remains in effect. Changing any code resets escrow, which means that the ship countdown clock gets reset back to its original value and all the testing performed up until that point needs to be redone in order to verify that the change did not affect the results.

Comments (32)
  1. Anonymous says:

    But I want a full and complete history!  Whine, whine.  

    Seriously, that was interesting.

  2. Anonymous says:

    When you say "If you ran into a problem with a Western language, like say, German, then you were out of luck, since there was no German Windows code base", do you mean a problem that requires a code change specific to German?

  3. xpclient says:

    Properly displaying Indic language glyphs arrived only with Vista. Earlier it was slightly incorrect.

  4. SimonRev says:

    @Gabe,

    For example in English we say "one hat" "two hats" or sometimes "one hat(s)".  That doesn't work in every language.  Especially languages that have multiple plural forms (i.e. different words for hat when talking about two hats or three hats).

    An example in German might be that 22 is "twenty two" in English, but "two and twenty" in German.  It is very possible that the English UI team puts something together that cannot work like that.

  5. Anonymous says:

    "Patches didn't exist back then, there being no mechanism for distributing them"…obvious in hindsight but truly reminiscent of the changes in software development and distribution

  6. Anonymous says:

    Alternate title: You live, you learn.

  7. Anonymous says:

    "Because the code already shipped. What are you going to do, recall every copy of Windows?"

    It didn't ship in German. Who says you can't ship a later version of the code in German than you did in English?

    [The people who said "We will provide support resources for at most three code bases (Western, Far East, Mid-East)," that's who. Not to mention all the ISVs who have to add a fourth version to all their test matrices. (Because the XYZ API on German Windows behaves slightly differently from Western Windows.) -Raymond]
  8. Anonymous says:

    >> do you mean a problem that requires a code change specific to German?

    Raymond gave an example of something that would require a code change to fix things for German:

    -----

    a command line tool asked the user a yes/no question, prompting "J/N" (Ja/Nein), but if you wanted to answer in the affirmative, you had to type "Y".

    -----

    The code change wouldn't have a noticeable effect in English; however, the change wasn't German-specific (other languages might benefit).

    [That didn't require a code change. The German localizers merely forgot to localize the Y to a J (probably because the Y was not clearly documented as to what it represented, so they played it safe and left it alone). -Raymond]
  9. I wonder whether there are languages where Yes and No start with the same letter…

  10. Anonymous says:

    @Maurits – look at this site: users.elite.net/…/jennifers

    See Yes in 550 languages/No in 520 languages.

    First case of about 6 I found: Huave (Mexico) Yes=Neam No=Ngo

  11. Mike Dimmick says:

    @SimonRev: Grammatical number formats are a pain. There are at least four grammatical forms you have to handle ('one, two, many, lots' as an approximation) and the exact rules for which form to use for each number are different for different languages.

    doc.qt.nokia.com/…/qq19-plurals.html goes into much more detail.

    …yes I have implemented this! Kind of. We have the framework but right now we've only implemented the English number->form mapping.

  12. Anonymous says:

    [ (Patches didn't exist back then, there being no mechanism for distributing them.) ]

    Except when they did exist, as in the Y2K patch for Win3.1's file manager.

    [Um, "then" is 1992. In 1992, there was no mechanism for distributing patches. The Y2K patch came out in 1997. Or are you suggesting that we take the Internet and send it in a time machine back to 1992? -Raymond]
  13. In 1992, there was no mechanism for distributing patches

    Really?

  14. Anonymous says:

    Maurits: You could call up MS support and get a set of floppies, but it's something you only did if you really needed it. I remember getting Winword 6.0a and maybe Windows 3.11 that way.

  15. Anonymous says:

    [Um, "then" is 1992. In 1992, there was no mechanism for distributing patches.]

    As an ISV, I'd have been glad to solve your patch distribution problem for you provided you gave me the patch.

    The offer still stands. I'd love to be able to distribute the Windows patches I depend on.

    [I thought that's what the Windows Update Standalone Installer was for. -Raymond]
  16. Anonymous says:

    @Joshua:

    Distributing the patches that you need is great until you find that the same patch breaks someone else's software. (They don't need the patch, so they never tested against it.) At least when patches come from MS through (say) Windows Update, the end-user will blame MS rather than you.

  17. @Raymond: "Or are you suggesting that we take the Internet and send it in a time machine back to 1992?"

    What you meant to say is that the internet did exist in 1992, but that a) Windows/DOS didn't support TCP/IP, SLIP, PPP, etc directly, and b) the general population did not have access to the internet yet.

    I've been around long enough to remember BITNET and ARPANET. RELAY was the name of the "chat" program over BITNET, and was popular with us college kids back then. We distributed a lot of files back then too (of questionable educational and moral benefit).

  18. Anonymous says:

    [I thought that's what the Windows Update Standalone Installer was for. -Raymond]

    I understood after Microsoft sued the people providing third-party hosting for XPSP2 (after it overloaded the MS site) that redistribution of patches was not allowed. Has this changed?

    [I am not a lawyer. -Raymond]
  19. Anonymous says:

    Well done, but I wish the Portuguese (especially Brazilian) version were better. There are so many grammatical problems; it is dreadful to find many an error even in the Office 2010 GUI… When calling MS Support for help, the answer is always the same: It will be fixed in the next service pack… It never has been, though.. :(

  20. cheong00 says:

    Although I never actually used them (I first went on the net in 1999), I think small file transfers were possible with the X-modem-CRC protocol even when the Internet did not exist.

    Not sure if Microsoft would have served such a big modem pool, though.

  21. cheong00 says:

    Back to topic, I remember that there was lots of 3rd party patching software for Win3.1/Win9X to support Chinese. Some would even translate every menu item into (sometimes nonsense) Chinese. I wonder how they did that at the time.

    Also, was installing hooks not allowed for the teams doing localization work? I think it could effectively modify Windows behaviour without affecting the stability of the English version of Windows. (Evading the reset-escrow bar.) The language supplements could have been sold as a separate package like "Plus!" at that time.

  22. Anonymous says:

    "A menu item Sort was translated as Art (as in "What sort of person would do this?") rather than Sortieren ("put in a prearranged order")"

    That reminds me of a mistranslation in Star Wars, Episode 2. Cliegg Lars there talks about how, after losing a leg, he "cannot ride any more" (i.e. speeder bikes etc.). In the German translation, he literally says "Seit […] kann ich kein Pferd mehr reiten", which means "Since […] I cannot ride a horse". Except that in Star Wars there are no horses at all, and the translator apparently did not know this.

    After a while I came to the conclusion that whenever possible, I watch movies and play games in their original English version. Some translations are mediocre, some are good, but very few are really stellar and able to capture the original, and a lot are a mixed affair. For example, in the series "The Simpsons", sometimes a joke is translated in a way that perfectly conveys the literal meaning and a hidden joke or allusion, while other times there are conversations between the characters that make little sense until you imagine what they would have sounded like in English, and after a while I got a pretty good grasp of what the original text was before a bad translation.

    A stellar example of English-to-German localization, however, is the British series "Little Britain". For the localization, they hired two (very good) German comedians who even managed to convey jokes that would only make sense to a British viewer, like replacing names of movie actors or musicians not known outside of the UK with the names of German actors/musicians with a very similar reputation.

  23. Anonymous says:

    At least some of you guys have had localised versions available to you for a while.  It's taken Microsoft 27 years to get round to British English.

  24. Anonymous says:

    Obligatory English English reference… http://www.youtube.com/watch

  25. Anonymous says:

    @Mephane: Did they localise the title as well? ;-)

    Obligatory rant: WFW3.11's TCP/IP sucked, based on the speed of a third party database client/server compared to either a different third party TCP/IP stack or Windows 95.

  26. Anonymous says:

    Speaking of "Patches didn't exist back then, there being no mechanism for distributing them", I remember when Windows Update debuted.  Many Web sites were vehemently against the whole idea.  I read comments like "This is a horrible idea"; "I want to update my software, not Microsoft"; "Who do they think they are, trying to update MY computer"; "I'll decide what patches need to be applied"; advice to not turn on automatic updates; and so on.

    The whole industry has come a long way since then in making good use of automatic software updates and security patches.

  27. Anonymous says:

    @Mephane: The problem is that you need to know a language quite well before watching movies (or reading books) in that language is better for understanding than having a mediocre to good translator. Only if you yourself know that language pretty well do you have a *chance* to grok the hidden meanings. If you still struggle with it, by all means watch (or read) in that language to improve your language skills, but don't expect to understand everything in all layers. Most translations are in fact quite good, especially if the translator had enough time (i.e. not releasing the same day worldwide).

  28. Anonymous says:

    @Wolf does the data you are looking in indicate how Y/N is handled in any of them? (Are there any of them that are large enough to have locale use cases, either on windows, in any unix OS, or in CLDR?)

  29. cheong00 says:

    @David Walker: Indeed. Especially after the one or two incidents where updates broke some computers in production after rollout.

    The patch testing process has been much, much more rigorous since then; however, I still know some I.T. staff who prefer to hold updates for a month before approving them in their WSUS servers.

  30. cheong00 says:

    @Wolf: I think it's better to leave it as (Y/N) then. Just like in Chinese version, the "choices" are always shown as "是(Y)/否(N)" or "確認(O)/取消(C)".

  31. Anonymous says:

    I remember a small but painful bug in Win98 First Edition PL.

    ipconfig /? listed all the /switches as usual, in English, but some (all? don't remember) of them didn't work.

    After a while of trying to find out why "ipconfig /all" didn't work, I had a crazy idea…

    ipconfig /wszystko

    Worked.

    Someone over-eagerly translated the switch (bye-bye, script compatibility), and to add insult to injury, didn't translate it in the /? listing.

  32. Anonymous says:

    @Joshua: Distributing the patches that you need is great until you find that the same patch breaks someone else's software. (They don't need the patch, so they never tested against it.) At least when patches come from MS through (say) Windows Update, the end-user will blame MS rather than you.

    When the application which breaks is developed by MSFT, it is justified. I will not name any, but at least two *server* products have been totally incompatible (endless crashing) with Windows Server OS hotfixes. These problems often result in a very obvious cash loss. One would expect MSFT software to be more Windows compatible.
