Weird bug I had yesterday…

I came across a weird bug recently, and I thought the explanation might be interesting.

But first, a brief digression.

Windows has support for bidirectional text (usually abbreviated as "bidi" around here). Some languages (Arabic, in this particular case) are written from right to left, so string layout is right to left, as is the Windows UI, which is pretty freaky the first time you see it.

Which is pretty straightforward.

But sometimes you need to render text that is in multiple languages, and if one of those is ordered right to left (RTL) and the other left to right (LTR), the engine has to be pretty smart, as it needs to handle both sections in the same string, even though they are written in two different directions (i.e. "bidi").
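To make that a little more concrete: in the Unicode bidirectional algorithm, every character carries a bidi class, and the layout engine uses those classes to decide which way each run goes. Here's a minimal sketch in Python (used purely for illustration) that prints the classes for a small mixed string. The helper name `bidi_classes` and the sample string are made up for this example.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidi class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Hypothetical sample: Latin letter, European digit, space,
# Arabic letter BEH (U+0628), Arabic-Indic digit NINE (U+0669).
sample = "a9 \u0628\u0669"
classes = bidi_classes(sample)

for ch, cls in zip(sample, classes):
    print(f"U+{ord(ch):04X} -> {cls}")
```

Letters come back as strong classes (`L` for Latin, `AL` for Arabic letters), while digits come back as weak classes (`EN` or `AN`) and the space as `WS`. Weak and neutral characters don't pick a direction on their own; they take it from what's around them.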

Most of this support is hidden: if you're keeping your strings in resources and using placeholders, the right thing just happens for you. (If you venture into custom controls, you *do* need to worry about things like RTL layout.)

So, anyway, I got a bug on DVD Maker that said that the numbers in a certain string were wrong on Arabic Windows. I looked at the resource string, and it was:

%1!d! of %2!d! minutes

What was coming out in the UI was:

of 90 minutes (*)

where the (*) is an Arabic character that I didn't recognize (not that I recognize any Arabic characters...)

So, is this a bug, or not?


The answer is that this is by design. The string hadn't yet been localized (our localization happens in stages). When FormatMessage went to process it, it saw the %1!d!, went to format a number, and chose the default locale, which was Arabic, so it put in an Arabic number at the far right side (RTL, right?). But it then came to a chunk of English text, which meant it had to switch to LTR rendering, and it used the English locale to format the number in the English text.
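Here's a toy model of that behavior in Python. To be clear, this is not the Windows implementation, just a sketch of the idea under one assumption: a digit takes its shape from the nearest preceding strong character, and falls back to the locale default when there isn't one yet. The function name `shape_digits` and the `default_arabic` flag are invented for the example, and the sketch only swaps digit shapes; it doesn't do any visual reordering.

```python
import unicodedata

# Map ASCII digits 0-9 to Arabic-Indic digits U+0660..U+0669.
ARABIC_INDIC = {ord("0") + i: chr(0x0660 + i) for i in range(10)}

def shape_digits(text, default_arabic):
    """Pick digit shapes from the preceding strong character (toy model)."""
    # No strong character seen yet, so start from the locale default.
    arabic_context = default_arabic
    out = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            arabic_context = False   # strong left-to-right letter
        elif bidi in ("R", "AL"):
            arabic_context = True    # strong right-to-left letter
        if ch in "0123456789" and arabic_context:
            out.append(ARABIC_INDIC[ord(ch)])
        else:
            out.append(ch)
    return "".join(out)

# The unlocalized string with made-up values 45 and 90 filled in.
shaped = shape_digits("45 of 90 minutes", default_arabic=True)
print(shaped)
```

With an Arabic default, the leading 45 has no strong character before it, so it comes out in Arabic-Indic digits, while the 90 follows the English word "of" and stays as it is. With `default_arabic=False` the string is returned unchanged.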

The good news is that when the string is localized, everything will work fine. The bad news? Well, it took me a few hours to figure this out.

Comments (7)

  1. Maurits says:

    It sounds like this would happen every time a string begins with bidi-neutral characters (numbers, in this case.)

  2. Tomer Gabel says:

    I can’t even begin to imagine what dealing with localization is like for someone who just isn’t used to the nature of some foreign languages. I’d imagine that there’s nothing harder for an American (or most Europeans, for that matter) than figuring out what’s going on when first encountering an RTL issue; the concepts and associated Unicode features are hard to grasp even for most people from those cultures.

    I remember first fighting directionality and the ensuing data integrity issues while maintaining a large codebase a few years back; it was a large and hectic MFC, non-Unicode application that would’ve taken weeks or months to migrate to Unicode. It took me days just to understand the nature of the problem (some things just didn’t seem to work right), and that’s with the privilege of understanding the language…

  3. Eric Gunnerson reminded me in a recent post that Bidi issues can be quite unintuitive at times.


  4. Serge Wautier says:

    Besides the layout considerations, how come 90 was displayed ‘correctly’ but the first number was displayed using Arabic character(s)? Looks somewhat inconsistent? Or did I miss something?

  5. Serge Wautier says:

    Ha! sorry, you wrote it: locale switch according to context. Smart! (Well, kinda! Obviously, you pointed out a limit)
