Windows 10 makes the best guess for all languages/locales


I've mentioned before to be careful about NLS Locale & Culture Data Churn, in that Locale Data Should Not Be Considered Stable, but there's an interesting aspect of that in Windows 10.

Windows 10 Supports all Locales - Kinda.

We kept getting asked for more and more locale/language data and applications kept running into trouble when NLS didn't think the tag existed.  So we made Windows 10 return the best data it knows about for any valid NLS locale tag that it runs into.

"In the old days," there used to be a somewhat finite set of locales that operating systems supported or had collected data on.  They might differ a little, but the sets were somewhat limited.  In more modern systems, however, we start encountering data that is tagged with all sorts of interesting and completely valid locales that we don't have in-box support for.

That's a problem because then your document says "Hey, I was written in Hawaiian" and the app goes "what's that?" and blows up or refuses to load the document.

"Constructed" Locales

So, in Windows 10, if you pass in a well-formed tag, then we'll give you the best answer that we can.  So if your document says "I'm tagged with the Fijian dialect of Klingon" we go "Hmm, well, okay then" and let you fumble on - but we don't blow up.

We do that with "constructed" locales.  When we get a locale we don't recognize, we try our best to find reasonable data for it.  Since we've started using the CLDR data, we have a lot more native data, but still you might ask for something we don't have.  If we have the root language, like you asked for "en-FJ" (English as spoken in Fiji), then we'd look for en-FJ and not find it.  So then we'd fall back to "en" (which looks to us a lot like en-US), and use that as a basis for your en-FJ locale.

That means that some strings might be reasonable, like you'd have "Monday", "January", etc. as one might expect for en-FJ.  But we might not know the country name or other country-based aspects.

Also, it's possible that we don't have a clue, like if you passed in "tlh-001" (Klingon (World)) we'd fail to find any data.  Then we'd return generic data based on our invariant locale.
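To make that fallback order concrete, here's a minimal sketch in Python.  It is not how Windows implements constructed locales; the table, the field names, and the helpers fallback_chain and construct_locale are all made up for illustration.  It only shows the idea: try the exact tag, then the parent language, then the invariant locale.

```python
# Toy illustration of the fallback idea described above.
# KNOWN_LOCALES and its fields are hypothetical, not real NLS or CLDR data.
INVARIANT = ""  # the invariant locale is represented here by the empty tag

KNOWN_LOCALES = {
    INVARIANT: {"first_day": "Monday", "first_month": "January"},
    "en": {"first_day": "Monday", "first_month": "January"},
    "fr": {"first_day": "lundi", "first_month": "janvier"},
}


def fallback_chain(tag):
    """Yield the tag itself, then each shorter parent tag, then the invariant locale."""
    parts = tag.split("-")
    while parts:
        yield "-".join(parts)
        parts.pop()  # drop the last subtag, e.g. "en-FJ" -> "en"
    yield INVARIANT


def construct_locale(tag):
    """Build a best-guess locale for `tag` from the first candidate that has data."""
    for candidate in fallback_chain(tag):
        data = KNOWN_LOCALES.get(candidate)
        if data is not None:
            # Keep the requested name, but remember where the data came from.
            return {"name": tag, "based_on": candidate, **data}
    # Not reachable while the invariant entry exists in the table.
    raise LookupError(tag)


if __name__ == "__main__":
    for requested in ("en-FJ", "tlh-001"):
        print(requested, construct_locale(requested))
```

With this toy table, "en-FJ" ends up based on "en", and "tlh-001" ends up based on the invariant entry because neither "tlh-001" nor "tlh" has any data.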

Good Enough

Getting by with the constructed data is good enough for lots of scenarios, but it may not provide the best results.  For improved results you can use the Locale Builder tool, or, even better, work with the CLDR to populate locales with appropriate data for your community's needs.

Those suggestions also imply that eventually the data might improve, whether from the user installing a custom locale or the next Windows 10 update adding improved locale data from CLDR.  Which again means that the data you get back might change in the future, hopefully for the better.

