Converting text file code pages
I've said "use Unicode" a lot, but sometimes a program doesn't do what you'd expect and writes its output in some other code page. You might also run into a text file that was created using the system code page of a different machine. (If someone emailed me a .txt file from a Russian computer, for example, I wouldn't necessarily be able to make sense of it at first.)
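To make that concrete, here's a minimal sketch of what goes wrong when bytes written in one code page are read back in another. The class and method names (MojibakeDemo, Reinterpret) are made up for illustration, and I'm assuming code pages 1251 (Cyrillic) and 1252 (Western European) as the two machines' system code pages. The RegisterProvider call is only needed on .NET Core / .NET 5+, where those code pages aren't available by default.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Encodes text with one code page, then decodes the same bytes with another,
    // which is what happens when a file moves between machines with different
    // system code pages.
    public static string Reinterpret(string text, int writtenAs, int readAs)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered. On older .NET Framework this line can be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] bytes = Encoding.GetEncoding(writtenAs).GetBytes(text);
        return Encoding.GetEncoding(readAs).GetString(bytes);
    }

    static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;

        // Written on a Russian machine (1251), read on a Western one (1252).
        Console.WriteLine(Reinterpret("Привет", 1251, 1252)); // Ïðèâåò

        // Read with the code page it was actually written in.
        Console.WriteLine(Reinterpret("Привет", 1251, 1251)); // Привет
    }
}
```

The bytes in the file never change; only the table used to interpret them does. That's why the fix is to decode with the original code page and re-save as Unicode.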
So, if you have a text file in one encoding and you need to be able to read it, you can write a little program to convert it. Or, since you've found this blog post, you can just copy my little program:
    using System;
    using System.IO;
    using System.Text;

    class Convert
    {
        static void Main(string[] args)
        {
            if (args.Length != 3)
            {
                Console.WriteLine("Usage: convert.exe infile.txt outfile.txt incodepage");
                Console.WriteLine(" eg: convert data.1252.txt data.utf8.txt 1252");
                Console.WriteLine(" or: convert data.1252.txt data.utf8.txt windows-1252");
                Console.WriteLine(" (output is always UTF-8)");
                return;
            }

            // The third argument is either a code page number (1252)
            // or an encoding name (windows-1252).
            int codepage = 0;
            Encoding enc;
            if (int.TryParse(args[2], out codepage))
            {
                enc = Encoding.GetEncoding(codepage);
            }
            else
            {
                enc = Encoding.GetEncoding(args[2]);
            }

            // Read in the source encoding, write as UTF-8, one line at a time.
            // The using blocks close both files even if something throws.
            using (StreamReader reader = new StreamReader(args[0], enc))
            using (StreamWriter writer = new StreamWriter(args[1], false, Encoding.UTF8))
            {
                String str;
                while ((str = reader.ReadLine()) != null)
                {
                    writer.WriteLine(str);
                }
            }
        }
    }
I've stuck the source and a compiled version in convert.zip.
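One thing to know about the line-by-line approach: ReadLine/WriteLine rewrite every line ending as the platform default, and Encoding.UTF8 puts a byte order mark at the start of the output. If either of those matters to you, here's a hedged variant that converts the whole file in one go. It's a sketch rather than what's in the zip: WholeFileConvert and ConvertFile are names I made up, it assumes the file fits comfortably in memory, and the RegisterProvider call assumes .NET Core / .NET 5+.

```csharp
using System;
using System.IO;
using System.Text;

static class WholeFileConvert
{
    // Decodes the whole input file with the given source encoding and writes it
    // back out as UTF-8 without a byte order mark. Line endings are left as-is.
    public static void ConvertFile(string inPath, string outPath, Encoding source)
    {
        string text = File.ReadAllText(inPath, source);
        File.WriteAllText(outPath, text, new UTF8Encoding(false));
    }

    static void Main(string[] args)
    {
        if (args.Length != 3)
        {
            Console.WriteLine("Usage: convert.exe infile.txt outfile.txt incodepage");
            return;
        }

        // Assumption: .NET Core / .NET 5+; not needed on older .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Accept either a code page number or an encoding name, as before.
        int codepage;
        Encoding enc = int.TryParse(args[2], out codepage)
            ? Encoding.GetEncoding(codepage)
            : Encoding.GetEncoding(args[2]);

        ConvertFile(args[0], args[1], enc);
    }
}
```

For very large files the streaming version is still the better choice, since it never holds more than one line in memory.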