Converting text file code pages

I've said "use Unicode" a lot, but sometimes a program doesn't do what you'd expect and writes its output in a different code page.  You might also run into a text file that was created using the system code page of a different machine.  (If someone emailed me a .txt file from a Russian computer, for example, I wouldn't necessarily be able to make sense of it at first.)
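To make the problem concrete, here's a minimal sketch of what happens when bytes written in one encoding get read back with another.  The MojibakeDemo class and its Misread helper are made up for illustration, and I'm using UTF-8 and ISO-8859-1 (Latin-1) only because both are available everywhere; any mismatched pair of code pages misbehaves in a similar way.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Hypothetical helper: encodes text with one encoding, then decodes the
    // resulting bytes with another, the way a mismatched reader would.
    public static string Misread(string text, Encoding actual, Encoding assumed)
    {
        byte[] bytes = actual.GetBytes(text);
        return assumed.GetString(bytes);
    }

    static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // "caf\u00e9" is "cafe" with an acute accent on the e.  In UTF-8 that
        // last character is two bytes (C3 A9), which Latin-1 reads as two
        // separate characters, U+00C3 and U+00A9.
        string garbled = Misread("caf\u00e9", Encoding.UTF8, latin1);
        Console.WriteLine(garbled == "caf\u00c3\u00a9");   // True

        // Decoding with the encoding the bytes were really written in
        // recovers the original text.
        string intact = Misread("caf\u00e9", Encoding.UTF8, Encoding.UTF8);
        Console.WriteLine(intact == "caf\u00e9");          // True
    }
}
```

The bytes themselves are never damaged; only the interpretation is wrong, which is why re-reading the file with the right code page is enough to fix it.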

So, if you happen to have a text file in one encoding that you need to be able to read, you can write a little program to convert it.  Or, if you find this blog post, you could even copy my little program to do that:

using System;
using System.IO;
using System.Text;

class Convert
{
    static void Main(string[] args)
    {
        if (args.Length != 3)
        {
            Console.WriteLine("Usage: convert.exe infile.txt outfile.txt incodepage");
            Console.WriteLine("       eg: convert data.1252.txt data.utf8.txt 1252");
            Console.WriteLine("       or: convert data.1252.txt data.utf8.txt windows-1252");
            Console.WriteLine("      (output is always UTF-8)");
            return;
        }

        // The third argument is either a code page number or an encoding name.
        int codepage = 0;
        Encoding enc;
        if (int.TryParse(args[2], out codepage))
            enc = Encoding.GetEncoding(codepage);
        else
            enc = Encoding.GetEncoding(args[2]);

        // Read using the input code page; always write UTF-8.
        StreamReader reader = new StreamReader(args[0], enc);
        StreamWriter writer = new StreamWriter(args[1], false, Encoding.UTF8);

        String str;
        while ((str = reader.ReadLine()) != null)
        {
            writer.WriteLine(str);
        }

        reader.Close();
        writer.Close();
    }
}

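Here's a sketch of the same idea with the reader and writer wrapped in using statements instead of explicit Close calls.  The CodePageConverter class and its ConvertToUtf8 method are names I made up for this example, not part of the program above.  One assumption to flag: on .NET Core and .NET 5+, code pages like 1252 are only available after calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance); on the .NET Framework they work out of the box.

```csharp
using System;
using System.IO;
using System.Text;

static class CodePageConverter
{
    // Hypothetical helper: reads inPath using the given code page (a number
    // such as "1252" or a name such as "windows-1252") and writes the same
    // text to outPath as UTF-8.
    public static void ConvertToUtf8(string inPath, string outPath, string codePage)
    {
        int number;
        Encoding enc = int.TryParse(codePage, out number)
            ? Encoding.GetEncoding(number)
            : Encoding.GetEncoding(codePage);

        // using disposes (and so flushes and closes) both streams when the
        // block exits, even if an exception is thrown part-way through.
        using (StreamReader reader = new StreamReader(inPath, enc))
        using (StreamWriter writer = new StreamWriter(outPath, false, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
            }
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 3)
        {
            Console.WriteLine("Usage: convert.exe infile.txt outfile.txt incodepage");
            return;
        }

        ConvertToUtf8(args[0], args[1], args[2]);
    }
}
```

Like the original, this goes through ReadLine and WriteLine, so line endings come out as the platform default, and Encoding.UTF8 puts a byte order mark at the start of the output file.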
I've stuck the source and a compiled version in a


Comments (4)

  1. Craig says:

    Why not wrap your reader and writer with using statements and remove the Close calls?

  2. No reason, just because I didn't do it that way 🙂

  3. Or you could use PowerShell:

    gc Inputfile.txt | Out-File Outputfile.txt utf8

  4. You'd have to do a little more to use random code pages for PowerShell input.
