What’s my Encoding Called?

There is a bit of confusion about the System.Text.Encoding names, primarily “Which name do I use for my Encoding?” The Encoding class has three name properties: BodyName, WebName and HeaderName, and the EncodingInfo objects returned by Encoding.GetEncodings have an additional Name property.  The examples in the MSDN documentation list a table.

EncodingInfo.Name  CodePage  BodyName     HeaderName    WebName    EncodingName
shift_jis          932       iso-2022-jp  iso-2022-jp   shift_jis  Japanese (Shift-JIS)
windows-1250       1250      iso-8859-2   windows-1250  …
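To see these names side by side on your own machine, something like the following sketch may help. The class name EncodingNames and the helper Describe are made up for illustration; the properties and Encoding.GetEncodings are the ones named above. Note that the set of encodings returned depends on the runtime (newer .NET versions list far fewer than the .NET Framework unless extra code page providers are registered).

```csharp
using System;
using System.Text;

public static class EncodingNames
{
    // Hypothetical helper: formats the name-related properties of one Encoding.
    public static string Describe(Encoding e)
    {
        return string.Format(
            "CodePage={0} BodyName={1} HeaderName={2} WebName={3} EncodingName={4}",
            e.CodePage, e.BodyName, e.HeaderName, e.WebName, e.EncodingName);
    }

    public static void Main()
    {
        // EncodingInfo.Name is the additional name exposed by GetEncodings().
        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            Console.WriteLine("Name={0} {1}", info.Name, Describe(info.GetEncoding()));
        }
    }
}
```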


What Version of Unicode Does X Support?

Michael answers this question in his blog at http://blogs.msdn.com/michkap/archive/2005/12/23/506887.aspx


Custom Locales vs Custom Cultures in Windows Vista / .Net Framework

When a Custom Culture is created in .Net, Vista also uses it as a Custom Locale.  There are some disparities between the data available to the Framework and the data available to the Windows OS, which cause a few edge cases. Additionally, some information is available in a shipped locale that the .Net Framework cannot currently modify. …


Custom and Synthetic (Windows ELK) RegionInfos in .Net 2.0

When we ship the Microsoft .Net Framework, the culture data associated with that version is fixed.  In Windows XP we have shipped additional locales (ELKs) that are not native to the Framework.  Windows Vista also includes a superset of the locales in the Framework.  In v2.0 of the framework we added the ability for the framework…


Why and How I Chose Klingon for an Example

This isn’t really a technical post, but some may be curious about why I chose Klingon for my example about making a custom culture/locale and Microsoft LDML.  I hope that other people will make their own custom cultures/locales for their own languages/countries. FWIW: I don’t go around work wearing a Klingon mask and I don’t…


Creating a Custom Culture (Locale) From Microsoft-ish LDML

[Updated 11 Aug 2006 to reflect IETF style locale names] This is just a simple example of creating a custom culture from an LDML file.  The LDML file has to have Microsoft-specific tags, otherwise you will get some errors for the missing data.  The resulting custom culture/locale works in .Net Framework 2.0 (new CultureInfo("tlh-Latn-US"))…


Klingon Custom Culture/Locale MS LDML File

This is intended to go with the custom culture LDML example.  Cut & paste this into a file called “tlh-Latn-US.ldml”.  Notepad should work.

<?xml version="1.0" encoding="utf-8"?>
<ldml>
  <identity>
    <version number="1.1">ldml version 1.1</version>
    <generation date="2005-11-23" />
    <special xmlns:msLocale="http://schemas.microsoft.com/globalization/2004/08/carib/ldml">
      <msLocale:cultureInfoVersion type="1.0" />
      <msLocale:cultureAndRegionInfoName type="tlh-Latn-US" />
      <msLocale:geoId>244</msLocale:geoId>
      <msLocale:parentName type="en-US" />
      <msLocale:languageNameAbbr type="TLH" />
      <msLocale:languageIsoName type="threeLetters">tlh</msLocale:languageIsoName>
      <msLocale:languageIsoName type="twoLetters">tlh</msLocale:languageIsoName>
      <msLocale:nativeName type="tlhIngan…


CultureInfo.Name, ToString, LCID & CompareInfo.Name

There are multiple interesting names associated with CultureInfo and related objects, which could be a little bit confusing.  I’ve listed the name used in a constructor and the names returned by CultureInfo/CompareInfo in the table below:

Method                    en-US    de-DE_phoneb   Custom Locale
CultureInfo(name)         en-US    de-DE_phoneb   fj-FJ
CultureInfo(int culture)  0x0409   0x10407        0x0c00 (if user default)
CultureInfo.ToString()    en-US…
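A quick way to compare these names for a given culture is to print them together. This is only a sketch: CultureNames and Describe are hypothetical names, and the values you see for alternate sorts or custom locales depend on the runtime and OS.

```csharp
using System;
using System.Globalization;

public static class CultureNames
{
    // Hypothetical helper: shows the different names reported for one culture.
    public static string Describe(CultureInfo c)
    {
        return string.Format(
            "Name={0} ToString={1} LCID=0x{2:X4} CompareInfo.Name={3}",
            c.Name, c.ToString(), c.LCID, c.CompareInfo.Name);
    }

    public static void Main()
    {
        // A culture constructed by name, and whatever the current user default is.
        Console.WriteLine(Describe(new CultureInfo("en-US")));
        Console.WriteLine(Describe(CultureInfo.CurrentCulture));
    }
}
```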


Avoid treating binary data as a String.

A common code snippet that I see is something that uses binary data as a String, something like:

   String myHash = "abcd\xD800";
   String encrypted = Encoding.Default.GetString(encryptedData);
   encryptor.bytes = Encoding.Default.GetBytes("my key");

The problem is that Strings expect to be Unicode, and not all 16-bit values or combinations of 16-bit values are legal Unicode strings.  The…
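To make the risk concrete, here is a minimal sketch that pushes arbitrary bytes through a text encoding and back, and compares that with Base64. Base64 is my suggestion for a lossless text form, not something stated in the excerpt above; the class and method names (BinaryAsText, SurvivesEncoding, SurvivesBase64) are invented for the example.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryAsText
{
    // True if decoding the bytes to a string and encoding them back is lossless.
    public static bool SurvivesEncoding(byte[] data, Encoding enc)
    {
        string text = enc.GetString(data);
        return enc.GetBytes(text).SequenceEqual(data);
    }

    // True if a Base64 round trip returns the original bytes.
    public static bool SurvivesBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        return Convert.FromBase64String(text).SequenceEqual(data);
    }

    public static void Main()
    {
        // Arbitrary "binary" bytes; 0x80, 0xFF and 0xD8 are not valid UTF-8 here.
        byte[] data = { 0x00, 0x80, 0xFF, 0xD8 };
        Console.WriteLine(SurvivesEncoding(data, Encoding.UTF8));
        Console.WriteLine(SurvivesBase64(data));
    }
}
```

With UTF-8 the invalid bytes are replaced during decoding, so the first check fails, while the Base64 round trip returns the original bytes. If binary data has to travel as text, a byte-to-text scheme like Base64 (or hex) is a safer choice than a character encoding.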


Code Page 21027 "Extended/Ext Alpha Lowercase"

I was playing with code pages and ran into an interesting case:  Code Page 21027 – Ext Alpha Lowercase.  This code page has some interesting behavior.  It looks like a Japanese EBCDIC code page, however it’s kind of “missing” mappings for some characters, like 8, 9, =, H, I, Q, R, Y, Z, Halfwidth Katakana Ku, and Halfwidth…