Locale (Culture) Data Churn

Data Churn

Some of you have noticed churn in the locale data (which ends up as culture data in .NET).  I've mentioned before that this stuff shouldn't be considered stable, but in Windows 10 we have a little more churn than normal.

What happens is that, over time, locale data preferences change around the world.  Often it's subtle or barely noticeable, especially for those of us in a different country or speaking a different language, but it happens all the time.  We support hundreds of locales, so we end up getting several requests a week to adjust data!  Wow.

Some of the reasons for the churn are more obvious than others.  For example, many countries have adopted the Euro over the last couple of decades.  A bunch adopted it all at once, but more have trickled in, and some are still on track to adopt it in the future.  When that happens, we end up changing the default currency for that locale to the Euro.  Similarly, other countries have revalued their currency, modified their currency symbol, or made any number of other changes that impact their currency format.  And that's just one of the fields that we have data for in each locale!

Since the data changes, it's a good plan not to depend on locale-sensitive formats when serializing or deserializing data.  Currency is particularly important here: your application should know that it's saving an amount as RON (Romanian lei) rather than Euros, so that when Romania adopts the Euro in a few years your figures are still correct.
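As a rough illustration of that advice, here is a minimal Python sketch that stores the currency code next to the amount instead of relying on whatever the locale's default currency happens to be.  The function names and the JSON layout are made up for this example; they aren't part of any Windows or .NET API.

```python
import json
from decimal import Decimal
from typing import Tuple


def serialize_amount(amount: Decimal, currency_code: str) -> str:
    # Hypothetical helper: record the ISO 4217 code explicitly rather than
    # trusting the locale's default currency, which may change over time.
    # str(Decimal) does not consult the locale; it always uses "." as the
    # decimal point and never adds grouping separators.
    return json.dumps({"amount": str(amount), "currency": currency_code})


def deserialize_amount(payload: str) -> Tuple[Decimal, str]:
    # Hypothetical helper: read back exactly what was written.
    record = json.loads(payload)
    return Decimal(record["amount"]), record["currency"]


saved = serialize_amount(Decimal("1234.50"), "RON")
amount, currency = deserialize_amount(saved)
```

The saved record still says RON no matter what the locale's default currency becomes later.  Locale-sensitive formatting then only needs to happen at display time.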

It isn't just currency; pretty much every field has changed at some point.  Spelling reforms have caused the names of the days of the week and of the months to change in some locales.  Shifting cultural preferences have changed date separators.  Sometimes the data was just plain incorrect.
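Because day names, month names, and separators can all shift, one option for stored dates is a fixed, locale-independent format such as ISO 8601.  The sketch below uses hypothetical helper names and assumes plain calendar dates with no time zone.

```python
from datetime import date


def serialize_date(value: date) -> str:
    # ISO 8601 (YYYY-MM-DD): no month or day names, and a fixed "-"
    # separator, so locale data changes cannot affect the stored text.
    return value.isoformat()


def deserialize_date(text: str) -> date:
    # Parses only the fixed ISO format; it never consults the locale.
    return date.fromisoformat(text)


stored = serialize_date(date(2015, 8, 3))
restored = deserialize_date(stored)
```

Locale-specific formatting is still the right choice when showing the date to a person; it just shouldn't be what gets written to disk or sent over the wire.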

Do it my way!

Sometimes a single locale has differences within itself.  Sometimes that is between the "Official" behavior and "Common Practice".  Sometimes it's a regional or ethnic practice within that country.  Or a business could specify a format used by its home office even if it was less common in the regional office's locale.

We can also see stylistic variations.  For example, / is common as a date separator in the US, but you also see -, ., or a space.  Sometimes it differs just for aesthetics or to look modern.  Over time, in some locales, those variations become the predominant form.

Interestingly, sometimes the computer itself defines the behavior in the locale, such as when the rules are ambiguous but computer manufacturers implement a specific behavior, and that then becomes common practice.

Have it your way

In Windows, we do have a way to specify your own behavior if you don't like the in-box behavior.  The simplest method that works a lot of the time is to use the user overrides in the Region control panel.  For more consistent behavior across a machine or enterprise, the Locale Builder can be used to create a custom locale perfect for your business requirements.  I just blogged an example of how to override the Finnish time formats with the Locale Builder if you want a head start.