Making your VB code ready to go global (Matt Gertz)

Greetings, all!


I’m Matt Gertz, the Dev Manager for the Visual Basic team.  I’ve been on the team for a bit over 12 years, via the Blackbird/Visual InterDev side of the product, and in that time have been a dev on various features (mostly IDE-related), dev lead of deployment, dev lead of compiler, and box lead before my current responsibilities as DM.  I’ve been somewhat remiss in not having posted to this blog before, being content to hang out on the VB IDE Forum, but it’s my intent to correct that situation and to throw out a few thoughts here and there.  Many of our clever folk here have been writing about some of the exciting new features upcoming for VB9 (such as Scott’s excellent series on extension methods), so rather than stealing a measure from them, I’m going to instead focus on things that you can do with Visual Basic right now.


Going global


Many of the questions I’ve been asked by customers over the years revolve around the concept of globalization – making your code work in different languages – so I’m going to write about this today.  I’m actually fairly passionate about this subject – if there were such a thing as a Unicode groupie, that might well describe me.  A lot of this enthusiasm comes from years of development on code that needs to go international.  In Ye Olde Days, this was frankly very hard to do, but fortunately, using Visual Basic & .NET, making your code ready for a global audience is far simpler than it used to be.


Consider the following (admittedly contrived) code:


    Sub test(ByVal value As Double)
        Dim s As String = value.ToString
        ' InStr is 1-based and returns 0 when the search string is not found
        Dim positionOfDot As Integer = InStr(s, ".")
        If positionOfDot > 0 Then
            Console.WriteLine("The number " & s.Substring(positionOfDot) _
                & " is the decimal portion.")
        Else
            Console.WriteLine("There is no decimal portion.")
        End If
    End Sub


 


This code will work fine in English (although there are certainly better ways to get the decimal portion of a number – please don’t do it this way for real :-) ).  However, there are several problems with the code if you want to take it global, problems which I’m sure many folks will have spotted:


(1)    Many cultures don’t use “.” as the decimal separator – German and French, for example, use a comma.


(2)    The output strings are hard-coded to English – you’d have to change the code for each language.


(3)    The first output string assumes an English grammar ordering – some languages might need the argument at the end or the beginning, not in the middle.


So, some of you might be thinking, “What’s so wrong about 2 and 3?  I’m going to have to translate the strings anyway if I go global, right?”  Sure, but you don’t want to have to touch your code files unless necessary, and you sure don’t want to have different code paths for different languages – it will make your supportability nightmarish for any given version of your application.


Fortunately, there are solutions:


(1)    When dealing with cultural information, rely on the System.Globalization namespace.  For example, instead of hard-coding the “.”, you can instead call the following code:


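Imports System.Globalization

(…)

      ' Double.ToString formats with CurrentCulture (not CurrentUICulture) and
      ' uses its NumberDecimalSeparator, so ask for that same separator here.
      Dim c As String = _
          CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator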


 


to get the decimal separator for the current culture – in the contrived example above, you’d then use the result as your search string instead of “.”.  There are many, many pieces of information you can pull out of the current culture’s CultureInfo object, and you can even determine information about other cultures as well from System.Globalization.  I’ve attached a ZIP file to this post which gives examples of the cool things you can do with that namespace.  (The example also leverages the Encoding class in the System.Text namespace to show how you can translate your strings from UTF-8 to Unicode to DBCS, something you might need to do in global code.)


(2)    Move your strings to your project’s resources.  This is pretty darn simple in VS2005.  Simply right-click the project and choose “Properties,” and then navigate to the “Resources” page.  Add your strings to the resource table (each string will need a name as well as its text value), and then in code, simply type “My.Resources.MyResourceName” to use the string.  Later, when you need to change the string to a different language, you simply update your resources rather than your code.  The resource manager also can hold icons, images, and even whole text files that you might be using in your UI.  (If you’re not using VS2005, then check out my attached example to see how to access the resources without “My” — I wrote that code before the “My” functionality was available, and simply threw my resources into a handy resx file.)


(3)    Use format placeholders such as “{0}” in strings rather than concatenating pieces of strings together when writing full grammatical sentences.  Trust me; this will make your life a lot easier when localizing your code.   In the resources, make your string resource (let’s name it “MyOutput”) something like:


 


“The number {0} is the decimal portion.”


And then the call becomes:

            Console.WriteLine(String.Format(My.Resources.MyOutput, _
                s.Substring(positionOfDot)))


 



Now, your localization team won’t need to touch your code to change the grammar of the sentence – they can just move around the “{0}” in the resource string when translating.
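Putting the three fixes together might look something like the sketch below.  It is only an illustration under a few assumptions: the module and function names (GlobalReadyDemo, DecimalPortion, Describe) are made up for this example, the two constants merely stand in for string resources so that the code runs without a resource file, and the culture is passed in as a parameter (rather than read from the current thread) purely so that two cultures can be shown side by side.

```vb
Imports System
Imports System.Globalization

Module GlobalReadyDemo

    ' Stand-ins for string resources (hypothetical names). In a real project
    ' these would live in the resource table and be read via My.Resources.
    Private Const MyOutput As String = "The number {0} is the decimal portion."
    Private Const MyNoDecimal As String = "There is no decimal portion."

    ' Returns the digits after the decimal separator, as formatted for the
    ' given culture, or an empty string when there is no decimal portion.
    Function DecimalPortion(ByVal value As Double, ByVal culture As CultureInfo) As String
        Dim separator As String = culture.NumberFormat.NumberDecimalSeparator
        Dim s As String = value.ToString(culture)
        ' IndexOf is 0-based and returns -1 when the separator is absent.
        Dim position As Integer = s.IndexOf(separator, StringComparison.Ordinal)
        If position < 0 Then
            Return String.Empty
        End If
        ' Skip the whole separator; it is not guaranteed to be one character.
        Return s.Substring(position + separator.Length)
    End Function

    ' Builds the user-visible sentence from a format string with a placeholder.
    Function Describe(ByVal value As Double, ByVal culture As CultureInfo) As String
        Dim portion As String = DecimalPortion(value, culture)
        If portion.Length = 0 Then
            Return MyNoDecimal
        End If
        Return String.Format(culture, MyOutput, portion)
    End Function

    Sub Main()
        ' 3.25 is formatted as "3.25" for en-US and "3,25" for de-DE.
        Console.WriteLine(Describe(3.25, New CultureInfo("en-US")))
        Console.WriteLine(Describe(3.25, New CultureInfo("de-DE")))
        Console.WriteLine(Describe(7.0, CultureInfo.CurrentCulture))
    End Sub

End Module
```

The same code path serves both cultures: only the separator lookup and the format string vary, and a translator could move “{0}” anywhere in the string without the code changing.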




The resources on your forms can also be automatically managed from the resource manager, making it similarly easy to update the default strings in your controls.  Simply change the “Localizable” property of the form to “True,” and your form’s strings will no longer be hard-coded in InitializeComponent, but will now be stored instead in {form’s name}.resx (you may have to choose “Show All Files” to see that file).  Those resources are kept separate from the code resources (which are stored in Resources.resx), so you would need to translate both when going global.


The downside of everything I’ve just mentioned is that you still need to rebuild your product for each language after translating the resources – the code hasn’t changed, but the resources have, so compilation is necessary.  To get around this and to avoid all possibility of code binaries being different from language to language, you can create separate assemblies for your resources (so-called satellite assemblies) and refer to them from your main project, leveraging the assembly’s ability to identify its culture.  I’m not going to go into it since there’s information already about this on the net if you’re interested – here’s one, for example.


One big win in using VB from a global point of view is that everything is Unicode – you don’t have to do anything special to support the wide variety of characters out there.  Even the editor supports Unicode, so you can have a wide variety of (for example) variable names.  However, I should point out that we do not support Unicode characters greater than &H00FFFF, which are represented by the so-called “surrogate character” combinations, as variable names.  (If you don’t know what a “surrogate character” is… well, that’s a topic for another time.)
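As a very small taste of what those combinations look like in practice, here is a sketch (the module name SurrogateDemo is just for this example) that builds a single character above &H00FFFF and shows that a .NET string stores it as two Char values:

```vb
Imports System
Imports System.Globalization

Module SurrogateDemo

    Sub Main()
        ' U+1D11E (MUSICAL SYMBOL G CLEF) lies above &HFFFF, so a .NET
        ' string stores it as a pair of surrogate Char values.
        Dim clef As String = Char.ConvertFromUtf32(&H1D11E)

        ' Number of Char values (UTF-16 code units) in the string
        Console.WriteLine(clef.Length)

        ' True when the Chars at index 0 and 1 form a surrogate pair
        Console.WriteLine(Char.IsSurrogatePair(clef, 0))

        ' Number of text elements, i.e. what a reader sees as characters
        Console.WriteLine(New StringInfo(clef).LengthInTextElements)

        ' Recombine the pair into the original code point, shown in hex
        Console.WriteLine(Char.ConvertToUtf32(clef, 0).ToString("X"))
    End Sub

End Module
```

The practical consequence is that String.Length and character indexes count Char values rather than user-visible characters, which matters if your code slices strings that may contain such characters.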


Hopefully you will find this information on globalization useful – please feel free to comment or ask questions, and I’ll do my best to follow up.  The book “Developing International Software” from Microsoft Press is also a very good reference on how we deal with globalization issues here.  The fully comprehensive “Unicode Standard 5.0” volume from The Unicode Consortium is also a handy thing to have around for specific questions on Unicode usage (and it also makes a great booster seat for kids when they’re sitting at a high table) — if it’s outside your budget, you can read it online at www.unicode.org.


Going forward, my plan for my blog entries is to walk through a fairly complex card game I wrote last year, in order to point out some of the other very useful functionality in Visual Basic 2005 that you might not know as much about.


Until next time,


–Matt–*

VB-Globalization-usage.zip