If you ran the original example, you’ll notice that the last combination of encodings we wrote produced this output.
Stream Encoding: (no stream)
XML Encoding: Norwegian (IA5)
Unhandled Exception: System.Text.EncoderFallbackException: Unable to translate Unicode character \u0023 at index 55 to specified code page.
at System.Text.EncoderExceptionFallbackBuffer.Fallback(Char charUnknown, Int32 index)
at System.Xml.CharEntityEncoderFallbackBuffer.Fallback(Char charUnknown, Int32 index)
at System.Text.EncoderFallbackBuffer.InternalFallback(Char ch, Char*& chars)
at System.Text.SBCSCodePageEncoding.GetBytes(Char* chars, Int32 charCount, Byte* bytes, Int32 byteCount, EncoderNLS encoder)
at System.Text.EncoderNLS.Convert(Char* chars, Int32 charCount, Byte* bytes, Int32 byteCount, Boolean flush, Int32& charsUsed, Int32& bytesUsed, Boolean& completed)
at System.Text.EncoderNLS.Convert(Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Int32 byteCount, Boolean flush, Int32& charsUsed, Int32& bytesUsed, Boolean& complet
at System.Xml.XmlEncodedRawTextWriter.EncodeChars(Int32 startOffset, Int32 endOffset, Boolean writeAllToStream)
at Cs.Cs.WriteXml(XmlWriter xmlWriter)
at Cs.Cs.WriteEncodedXml(Encoding streamEncoding, Encoding xmlEncoding, Stream stream)
This is very much an edge case, and it takes a couple of minutes to figure out what’s going on.
Let’s start from the code that produces this problem:
Encoding muhaha = Encoding.GetEncoding(
This encoding is built up as follows. First, it specifies the x-IA5-Norwegian encoding. Then, it specifies that if a character cannot be mapped to this encoding, an exception should be thrown. Typically you can configure encodings to fall back to writing a best-fit character or a generic ‘?’ character, but depending on your system, this may be the wrong thing to do; for example, you can no longer reliably round-trip data once characters have been silently replaced. So, to be safe, we’re setting the encoding to fail whenever a character can’t be mapped.
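To make the difference between the two fallback behaviours concrete, here is a minimal sketch. It uses plain ASCII rather than x-IA5-Norwegian, purely as an assumption so that it runs on any .NET runtime; the class name and sample text are made up for illustration and are not part of the original program.

```csharp
using System;
using System.Text;

class FallbackDemo
{
    static void Main()
    {
        string text = "caf\u00E9"; // 'é' is not part of ASCII

        // Default ASCII encoding: unmappable characters are replaced with '?'.
        byte[] lossy = Encoding.ASCII.GetBytes(text);
        Console.WriteLine(Encoding.ASCII.GetString(lossy)); // prints "caf?"

        // Same character set, but with exception fallbacks:
        // unmappable characters now throw instead of being replaced.
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(text);
        }
        catch (EncoderFallbackException ex)
        {
            Console.WriteLine("Strict encoding refused to write: " + ex.Message);
        }
    }
}
```

The lossy version quietly loses the ‘é’, which is exactly the round-tripping problem that the exception fallback is meant to surface.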
Now, XML has a pretty nifty way of dealing with characters that cannot be directly encoded: character references. These allow you to write &#x9; instead of a literal tab character, for example. So even if the encoding doesn’t support a character, XML can still represent it using this escape hatch.
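You can watch the escape hatch at work with a small sketch like the one below. Again, ASCII stands in for a limited encoding as an assumption for portability, and the element name and text are invented for the example. The ‘é’ should come out as a numeric character reference rather than as a raw character.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class CharacterReferenceDemo
{
    static void Main()
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,   // cannot represent 'é' directly
            OmitXmlDeclaration = true    // keep the output to just the element
        };

        using (var stream = new MemoryStream())
        {
            // Disposing the writer flushes it; the stream itself stays open.
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                writer.WriteElementString("word", "caf\u00E9");
            }

            // Every byte written is ASCII, so decoding as ASCII is lossless here.
            Console.WriteLine(Encoding.ASCII.GetString(stream.ToArray()));
        }
    }
}
```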
And there’s the rub. x-IA5-Norwegian is one of the extremely rare encodings that doesn’t have the ‘#’ character in its repertoire! So when the XML writer sees a character that’s not in the encoding (I used ‘#’ itself for extra irony points), it tries to write a character reference instead, and then the encoder fails again on the ‘#’ inside the reference. At that point, the writer gives up and lets the exception bubble out unhandled, which in our simple program just prints the exception to the console.
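If you ever need to guard against this, one possible approach is to check up front whether an encoding can write the characters a reference is made of. The helper below, `CanEncodeCharacterReferences`, is a hypothetical name of my own and not a framework API. The sketch also assumes x-IA5-Norwegian may not be available on every runtime (on .NET Core and later it typically requires registering the code pages provider), so that case is handled separately.

```csharp
using System;
using System.Text;

class ReferenceSafetyCheck
{
    // Hypothetical helper: true if the encoding can write every character
    // that a numeric character reference such as &#x23; is made of.
    static bool CanEncodeCharacterReferences(Encoding encoding)
    {
        // Clone so we can swap in an exception fallback without
        // touching the (possibly read-only) original instance.
        Encoding strict = (Encoding)encoding.Clone();
        strict.EncoderFallback = EncoderFallback.ExceptionFallback;
        try
        {
            strict.GetBytes("&#x0123456789ABCDEF;");
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine("us-ascii: " + CanEncodeCharacterReferences(Encoding.ASCII));

        try
        {
            Encoding norwegian = Encoding.GetEncoding("x-IA5-Norwegian");
            Console.WriteLine("x-IA5-Norwegian: " + CanEncodeCharacterReferences(norwegian));
        }
        catch (ArgumentException)
        {
            Console.WriteLine("x-IA5-Norwegian is not available on this runtime.");
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("x-IA5-Norwegian is not supported on this platform.");
        }
    }
}
```

Given the exception above, I’d expect the Norwegian encoding to fail this check wherever it is available, but treat that as something to verify on your own system.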