Localization Bugs: String length limitations, #1

I've written ten posts or so about the most common bugs I see in localized software - duplicate hotkeys and clipped text. We try hard to avoid these bugs, but realistically they are present somewhere in almost all software. And not just Microsoft software either.

Ah well, at least they usually have low impact. Which is probably why it's so hard to get rid of them; early on in a product cycle there's always something more important to take care of (like actually getting everything translated), and closer to release they're simply not important enough that you want to take the risk of destabilizing the product.

So, now that I've dissected the most boring and common bugs, let's move on to something more interesting. Today, I'll introduce the String Length Limitation bug. I'll have to take a different approach for this type of bug than for the previous ones. Before, I could show a symptom and trace it back to different causes. This time I'll instead focus on the cause and show what symptoms it can lead to. Why should become clear as we go along.

What's a string length limitation then? Well, in short:

  • A string length limitation happens when a developer for some reason decides that a certain string can only be so long. Maybe there's an architectural limitation, like how built-in account names can only be 20 characters long. Or maybe the developer just figures that what's a short piece of text in English will be a short piece of text in any language.
  • String length limitations always exist, but most of the time the magic number is so high that it doesn't matter. Only when the string I want to see on screen is longer than the limit do we have a problem.
  • Since string length limitations are enforced by code, they're language neutral. This is good, because it can make it easier for us to find them.
  • String length limitations can't be fixed by localizers; they need to be fixed by the developer. At best, I can work around the limitation by shortening my translation, but that's not a fix. It can be a very expensive workaround too; if one language is forced to shorten a string, it's likely that more languages are affected.
  • There's typically no way for me to know that a certain string has a certain length limitation, unless the developer takes care to communicate this. Many of the limitations are found through normal testing. This is expensive, and it also means that there's a good risk that limitations slip through - especially those that do not cause functional problems.
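
To make the bullets above concrete, here is a minimal Python sketch. Everything in it is hypothetical: the names `MAX_LABEL_CHARS`, `store_label` and `is_truncated`, the limit of 16, and the English sample word are made up for illustration and don't come from any real product.

```python
# Hypothetical limit, picked by a developer who only looked at the English text.
MAX_LABEL_CHARS = 16


def store_label(text: str, limit: int = MAX_LABEL_CHARS) -> str:
    """Keep at most `limit` characters and silently drop the rest."""
    return text[:limit]


def is_truncated(text: str, limit: int = MAX_LABEL_CHARS) -> bool:
    """Return True when `text` would lose characters under the limit."""
    return len(text) > limit


english = "connectivity"          # 12 characters, fits
swedish = "anslutningsmöjlighet"  # 20 characters, does not fit

print(store_label(english))  # unchanged
print(store_label(swedish))  # cut off after 16 characters

# The limit is enforced by code, so a long probe string in any language
# (or in no language at all) exposes it just as well as a real translation.
probe = "x" * 200
print(is_truncated(probe))
```

The probe at the end is the "language neutral" point in practice: you don't need a Swedish speaker to find out that the limit exists, only a string that's long enough.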

I'll start out with a simple example of a string length limitation. Here's something I saw while we were in the middle of XP SP2. This is a tooltip you could see if you had a Wi-Fi card with a poor connection:

The last word should say "anslutningsmöjlighet", but only "anslutningsm" is shown. Half the word is missing.

This looks very much like clipped text, but it's not. Per the definition I gave in an earlier post, a clipping occurs when a control isn't large enough to house a certain piece of text. In this case, however, the text has been truncated after a certain number of characters. This difference might not seem all that clear, but look at it this way: if it's a clipping, the width of the text (in pixels) decides what's shown. If it's a string length limitation, a certain number of characters will be visible, and font size etc. doesn't matter.
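
The difference can be sketched in a few lines of Python. This is a simplified model, not how any real control works: it assumes a fixed-width font, and `clip_to_width`, `truncate_to_limit`, the 120-pixel control width and the 12-character limit are all hypothetical values chosen for illustration.

```python
def clip_to_width(text: str, control_width_px: int, char_width_px: int) -> str:
    """Clipping model: show as many characters as fit in the control's pixel width."""
    visible_chars = control_width_px // char_width_px
    return text[:visible_chars]


def truncate_to_limit(text: str, max_chars: int) -> str:
    """Length-limit model: show a fixed number of characters, whatever the font."""
    return text[:max_chars]


word = "anslutningsmöjlighet"

# Try three hypothetical font sizes, expressed as pixels per character.
for char_width_px in (6, 8, 10):
    clipped = clip_to_width(word, control_width_px=120, char_width_px=char_width_px)
    truncated = truncate_to_limit(word, max_chars=12)
    print(char_width_px, clipped, truncated)
```

With clipping, the visible text changes as the font gets wider. With a length limitation, the same 12 characters come out every time, which is a quick way to tell the two bugs apart: change the font size and see whether the cut-off point moves.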

The cause of this bug, then, is a combination of 1) the control having a certain string length limitation, 2) the developer trying to fit a lot of information into a small area, and 3) localizers being wordy. It's probably no surprise that my opinion is that raising the limit is the best fix. Having the message changed by the developer would be second best, but that would of course have a knock-on effect on each language that already translated the text. The only thing speaking for changing my translation is that, well, at least that's something I have direct control over...

(Btw, bonus points if you said "Hey, the text is cut off just before a non-A-Z character - maybe that's the problem..?". It wasn't the problem this time, but it could have been.)

Worth noting is that this bug is very hard to find unless you have native speakers testing your software. Tools can detect that text is clipped at runtime, but how easy is it to create a tool that detects that some text has been silently cut off? Beta programs are good, even for smaller languages.
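
For what it's worth, the comparison itself is the easy part. The sketch below assumes a hypothetical tool that already knows the expected string from the resource file and can read back what the control actually shows; `looks_truncated` is a made-up name, and getting hold of those two strings reliably is where the real difficulty lies.

```python
def looks_truncated(expected: str, displayed: str) -> bool:
    """Flag text that is a proper prefix of the string it was supposed to be."""
    return displayed != expected and expected.startswith(displayed)


# Sample values only; the expected string would come from the localized resources.
print(looks_truncated("anslutningsmöjlighet", "anslutningsm"))          # cut off
print(looks_truncated("anslutningsmöjlighet", "anslutningsmöjlighet"))  # intact
```

A check like this would not cope with strings that are assembled at runtime from several pieces or that contain placeholders, so it's no replacement for someone who can actually read the language.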

So that's it for the warm-up: a pretty basic string length limitation. Next time it'll get more serious.

Oh by the way, of course this problem was fixed before we released.


This posting is provided "AS IS" with no warranties, and confers no rights.