What’s with RegionInfo(en-US) type names?

In .Net 2.0 you can construct a RegionInfo() from a full culture name (i.e., en-US instead of US).  In fact, counterintuitive though it may be, the full culture name is actually preferred. This is because, despite the best intent of the original design, region information sometimes depends on more than just the region name.  An…


Problems compiling resources in .Net 2.0 apps after updates.

[10 July 2007] The security patch of 10 July 2007 ( http://www.microsoft.com/technet/security/Bulletin/ms07-040.mspx ) changes culture names to use the new names on Windows XP/2003/2000 as well as Vista.  In order to conform to RFC 4646 (which replaces RFC 3066), we updated the names of some locales, which can cause applications using resources with the old names to fail to compile, usually…


Some custom locale examples.

Here are some custom locale examples in various formats (1.8 MB .zipped download). These are to be used at your own risk.  They are locales I used for testing and playing around with.  They probably have serious errors and were selected more or less randomly.  Any mistakes are mine.  You are free to use or change these,…


Expected names of Microsoft Windows "ANSI" Code Pages (Encodings)

I was asked about our use of the Windows “ANSI” code page names, as used in things like MIME types, HTTP content-type tags, etc.  Each “code page” has a name that most accurately round trips back to the same code page, which I’ve listed as the “preferred name” below.  Additionally, when you ask for a code page…
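
As a rough illustration of the "several names, one round-tripping name" idea, here is a minimal Python sketch. It uses Python's own codec registry, not the .Net or Windows tables this post is about, so the canonical name it reports (cp1252) is Python's choice and is not the Windows "preferred name". The alias list is just an assumption picked for the example.

```python
import codecs

# Several spellings that Python's codec registry treats as the same code page.
# (Illustrative aliases only; Windows and .Net keep their own name tables.)
aliases = ["windows-1252", "cp1252", "CP1252", "1252"]

# Look each alias up and collect the canonical name the registry reports.
canonical_names = {codecs.lookup(alias).name for alias in aliases}

# Every alias resolves to one name, and that name round trips to itself.
print(canonical_names)
round_trip = codecs.lookup(next(iter(canonical_names))).name
print(round_trip)
```

The point is only that a lookup by any alias lands on one name, and looking that name up again returns the same name.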


Example of overriding your own Encoding.

Previously I wrote about the Best Way to Make Your Own Encoding, but didn’t include an example, so today I’m including an example of a replacement Encoding.  I also included an EncoderFallback example, which replaces unknown characters with numerical entity style replacements (&#12345;).  This example isn’t complete.  If you need Encoder or Decoder functionality you’d have to…
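
For readers who want to see the fallback idea without the .Net plumbing, here is a minimal sketch in Python. It is not the EncoderFallback example from the post; it uses Python's codecs.register_error hook as a stand-in, and the handler name numeric_entity is made up for this illustration.

```python
import codecs

def numeric_entity_fallback(error):
    """Replace each unencodable character with an &#NNNN; style entity."""
    if not isinstance(error, UnicodeEncodeError):
        raise error  # this sketch only handles encoding failures
    bad_chars = error.object[error.start:error.end]
    replacement = "".join(f"&#{ord(ch)};" for ch in bad_chars)
    # Return the replacement text and the position at which to resume encoding.
    return replacement, error.end

# "numeric_entity" is a hypothetical handler name chosen for this example.
codecs.register_error("numeric_entity", numeric_entity_fallback)

encoded = "caf\u00e9 \u3039".encode("ascii", errors="numeric_entity")
print(encoded)  # b'caf&#233; &#12345;'
```

Python already ships an equivalent built-in handler, "xmlcharrefreplace", so the custom function is only there to show where the replacement logic lives.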


Rambling about RFC 4690 and IDN

There’s a reasonably new RFC 4690 ( http://www.ietf.org/rfc/rfc4690.txt ) that raises a bunch of questions about IDN names and Unicode regarding such things as confusable characters and other issues.  Some of those are also discussed in Unicode TR36 “Unicode Security Considerations” http://www.unicode.org/reports/tr36/. One thing that confuses me about the discussion regarding IDN’s weaknesses is that most of…


Best Way to Make Your Own Encoding

Martin recently asked what the best way is to roll his own encoding in .Net 2.0; in particular, can you override Encoding/Encoder/Decoder, or should he write his own StreamWriter? #1 is, of course, to use Unicode :), but apparently Martin doesn’t have that option. The answer is that you can write your own Encoding derived from…


List of "ANSI" code pages used by Windows.

These are the “ANSI” code pages that could be used by CP_ACP. Other code pages should not appear as “ANSI” code pages in Windows.

874 windows-874 ANSI/OEM Thai (same as 28605); Thai (Windows)
932 shift_jis ANSI/OEM Japanese; Japanese (Shift-JIS)
936 gb2312 ANSI/OEM Simplified Chinese (PRC, Singapore); Chinese Simplified (GB2312)
949 ks_c_5601-1987 ANSI/OEM Korean (Unified Hangul…