Example of overriding your own Encoding.


Previously I wrote about the Best Way to Make Your Own Encoding, but didn’t include an example, so today I’m including an example of a replacement Encoding.  I also included an EncoderFallback example, which replaces unknown characters with numeric-entity-style replacements (&#12345;).
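
For readers without the attachment, here is a minimal sketch of what a numeric-entity EncoderFallback could look like. It is not taken from EncodingSample.cs: the class names NumericEntityFallback and NumericEntityFallbackBuffer are hypothetical, and the sketch assumes the target encoding can always encode the ASCII characters used in the replacement text.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical names for illustration; not the classes from EncodingSample.cs.
public sealed class NumericEntityFallback : EncoderFallback
{
    // The longest replacement is "&#1114111;" (U+10FFFF), which is 10 characters.
    public override int MaxCharCount { get { return 10; } }

    public override EncoderFallbackBuffer CreateFallbackBuffer()
    {
        return new NumericEntityFallbackBuffer();
    }
}

public sealed class NumericEntityFallbackBuffer : EncoderFallbackBuffer
{
    private string replacement = string.Empty;
    private int position = 0;

    // Number of replacement characters the encoder has not read yet.
    public override int Remaining { get { return replacement.Length - position; } }

    // Called for a single character the encoding cannot represent.
    public override bool Fallback(char charUnknown, int index)
    {
        return Start(charUnknown);
    }

    // Called for a surrogate pair the encoding cannot represent.
    public override bool Fallback(char charUnknownHigh, char charUnknownLow, int index)
    {
        return Start(char.ConvertToUtf32(charUnknownHigh, charUnknownLow));
    }

    private bool Start(int codePoint)
    {
        // Assumption: '&', '#', digits and ';' are all encodable in the target encoding.
        replacement = "&#" + codePoint.ToString(CultureInfo.InvariantCulture) + ";";
        position = 0;
        return true;
    }

    // Hands the replacement back one character at a time; '\0' means "done".
    public override char GetNextChar()
    {
        return position < replacement.Length ? replacement[position++] : '\0';
    }

    public override bool MovePrevious()
    {
        if (position == 0) return false;
        position--;
        return true;
    }

    public override void Reset()
    {
        replacement = string.Empty;
        position = 0;
    }
}
```

To try it, pass an instance to Encoding.GetEncoding("us-ascii", new NumericEntityFallback(), DecoderFallback.ReplacementFallback).  Encoding "café" with that encoding should then yield the ASCII bytes for "caf&#233;".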


This example isn’t complete.  If you need Encoder or Decoder functionality you’d have to override those as well.  I also didn’t include a DecoderFallback example, but it should be reasonably straightforward to work out from this one.  The biggest issue is that Encoders and Decoders maintain state, such as lead bytes or high surrogates, so they may have data buffered from a previous conversion.
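
The buffered-state issue is easy to see with a built-in encoder.  The sketch below (EncoderStateDemo is a hypothetical name, not part of the sample) feeds the two halves of a surrogate pair to a UTF-8 Encoder in separate calls.  A custom Encoder would need to do similar bookkeeping.

```csharp
using System;
using System.Text;

// Hypothetical demo class; it only uses the built-in UTF-8 encoder.
public static class EncoderStateDemo
{
    // Encodes U+1F642 (a surrogate pair) one UTF-16 code unit per call and
    // returns how many bytes each call produced.
    public static int[] EncodeSplitSurrogatePair()
    {
        char[] pair = "\uD83D\uDE42".ToCharArray();
        Encoder encoder = Encoding.UTF8.GetEncoder();
        byte[] buffer = new byte[8];

        // flush: false lets the encoder hold on to the high surrogate.
        int first = encoder.GetBytes(pair, 0, 1, buffer, 0, false);

        // The low surrogate completes the pair; flush: true marks the end of input.
        int second = encoder.GetBytes(pair, 1, 1, buffer, first, true);

        return new int[] { first, second };
    }
}
```

The first call should produce no bytes because the high surrogate is held back, and the second call should produce all four UTF-8 bytes.  The stateless Encoding.GetBytes has no way to remember the first half, so it would treat each half as an invalid character on its own.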


I included a simple Main that just converts some text to bytes and back.  It’s not very pretty, but it demonstrates that something did actually happen 🙂  Be forewarned that I spent almost no time testing this sample, so caveat programmer!  As always, this sample is provided as-is with no warranties or guarantees.


Hope you find this helpful,


Shawn

EncodingSample.cs

Comments (7)

  1. Martin recently asked what the best way to roll his own encoding in .Net 2.0, in particular can you override

  2. Ivan Petrov says:

    Thank you Shawn 🙂

  3. Ittipan Langkulanon says:

    I’m using XmlTextWriter to create XML output. To make it readable in Thai I only have to pass Encoding.GetEncoding(874), Encoding.GetEncoding("ISO-8859-11") or Encoding.GetEncoding("TIS-620") as its encoding parameter.

    It almost really works.

    But my API provider says they only accept "TIS-620" (according to Thai Standardization) instead of Microsoft’s "Windows-874" on the XML declaration <?xml version="1.0" encoding="…." ?>

    How can I easily make an Encoding that has everything the same as Encoding.GetEncoding(874) except for the BodyName property?

  4. shawnste says:

    You can derive a class from Encoding, but it isn’t particularly easy.  You could then wrap the other encoding in your class (just forward all the calls to the other class), but that gets annoying with all of the decoders and encoders.

    Of course your API provider should "use Unicode" 🙂 then it wouldn’t be a problem.

  5. Ittipan Langkulanon says:

    Thank you Shawn.

    My API provider also uses Unicode, but in other scenarios (not in what I’m doing) :p

    P.S. Sorry for my late reply.

  6. Ittipan Langkulanon says:

    I’m studying programming and am quite weak in OOP. Could you please give me an example?

  7. shawnste says:

    See the EncodingSample.cs attached above for the basic idea, but it’s pretty complicated to override the whole thing…
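
The wrapping approach from comment 4 can be sketched roughly as follows.  RenamedEncoding is a hypothetical class, not part of EncodingSample.cs.  It forwards the abstract conversion methods to an inner encoding and reports a different name.  It overrides WebName, BodyName and HeaderName together, on the assumption that the XML writer takes the declaration name from one of them.  It does not forward every virtual member (the string and pointer overloads fall back to the base class), so treat it as a starting point only.

```csharp
using System;
using System.Text;

// Hypothetical wrapper: same conversions as the inner encoding, different name.
public sealed class RenamedEncoding : Encoding
{
    private readonly Encoding inner;
    private readonly string name;

    public RenamedEncoding(Encoding inner, string name)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        if (name == null) throw new ArgumentNullException("name");
        this.inner = inner;
        this.name = name;
    }

    // Report the replacement name everywhere a name can be requested.
    public override string WebName { get { return name; } }
    public override string BodyName { get { return name; } }
    public override string HeaderName { get { return name; } }

    public override byte[] GetPreamble() { return inner.GetPreamble(); }

    // The six abstract members of Encoding, forwarded unchanged.
    public override int GetByteCount(char[] chars, int index, int count)
    {
        return inner.GetByteCount(chars, index, count);
    }

    public override int GetBytes(char[] chars, int charIndex, int charCount,
                                 byte[] bytes, int byteIndex)
    {
        return inner.GetBytes(chars, charIndex, charCount, bytes, byteIndex);
    }

    public override int GetCharCount(byte[] bytes, int index, int count)
    {
        return inner.GetCharCount(bytes, index, count);
    }

    public override int GetChars(byte[] bytes, int byteIndex, int byteCount,
                                 char[] chars, int charIndex)
    {
        return inner.GetChars(bytes, byteIndex, byteCount, chars, charIndex);
    }

    public override int GetMaxByteCount(int charCount) { return inner.GetMaxByteCount(charCount); }
    public override int GetMaxCharCount(int byteCount) { return inner.GetMaxCharCount(byteCount); }

    // Stateful conversions (used by stream writers) also come from the inner encoding.
    public override Encoder GetEncoder() { return inner.GetEncoder(); }
    public override Decoder GetDecoder() { return inner.GetDecoder(); }
}
```

For the scenario in comment 3, the writer would be given new RenamedEncoding(Encoding.GetEncoding(874), "TIS-620").  On .NET Core and later, code page 874 is only available after CodePagesEncodingProvider.Instance has been registered with Encoding.RegisterProvider.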