Strings are UTF-16… There is an error in XML document (1, 1).


I had a situation today where an XML document had a declaration indicating that it was UTF-8. The code in question read that XML into a string and then attempted to deserialize it using an XSD-generated type.

What you end up with is an exception indicating that there’s an error in the XML document at (1, 1), or something to that effect.

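The fix is to run it through a memory stream: encode the string as UTF-8 bytes and let the serializer read those bytes, so the data matches the encoding named in the declaration. Characters that fall outside of 8 bits are not a problem, since UTF-8 encodes them as multi-byte sequences.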

 

//Need to read it to bytes, to undo the fact that strings are UTF-16 all the time.
//We want it to handle it as UTF8.
byte[] bytes = Encoding.UTF8.GetBytes(_myXmlString);

TargetType myInstance = null;
using (MemoryStream memStream = new MemoryStream(bytes))
{
    XmlSerializer tokenSerializer = new XmlSerializer(typeof(TargetType));
    myInstance = (TargetType)tokenSerializer.Deserialize(memStream);
}

 

Writing is similar. Also, passing an XmlSerializerNamespaces that contains only the default namespace prevents the serializer from adding the xmlns:xsi and xmlns:xsd declarations, which aren’t necessary:

 

XmlWriterSettings settings = new XmlWriterSettings()
{
    Encoding = Encoding.UTF8,
    Indent = true,
    NewLineOnAttributes = true,
};

XmlSerializerNamespaces xmlnsEmpty = new XmlSerializerNamespaces();
xmlnsEmpty.Add("", "http://www.wow.thisworks.com/2010/05");

MemoryStream memStr = new MemoryStream();
using (XmlWriter writer = XmlWriter.Create(memStr, settings))
{
    XmlSerializer tokenSerializer = new XmlSerializer(typeof(TargetType));
    tokenSerializer.Serialize(writer, theInstance, xmlnsEmpty);
}
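
To see the two halves working together, here is a minimal round-trip sketch. It is only an illustration under stated assumptions: `ProjectError`, its `Message` property and the `RoundTripDemo` helper are hypothetical stand-ins for the XSD-generated type, and the namespace URI is simply the one used above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the XSD-generated TargetType used in the post.
[XmlRoot("ProjectError", Namespace = "http://www.wow.thisworks.com/2010/05")]
public class ProjectError
{
    public string Message { get; set; }
}

public static class RoundTripDemo
{
    // Serialize to UTF-8 bytes, declaring only the default namespace.
    public static byte[] ToUtf8Xml(ProjectError instance)
    {
        var settings = new XmlWriterSettings { Encoding = Encoding.UTF8, Indent = true };
        var ns = new XmlSerializerNamespaces();
        ns.Add("", "http://www.wow.thisworks.com/2010/05");

        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                new XmlSerializer(typeof(ProjectError)).Serialize(writer, instance, ns);
            }
            // The writer is flushed and disposed here, so the buffer is complete.
            return stream.ToArray();
        }
    }

    // Re-encode the in-memory (UTF-16) string as UTF-8 bytes before deserializing.
    public static ProjectError FromXmlString(string xml)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(xml);
        using (var stream = new MemoryStream(bytes))
        {
            return (ProjectError)new XmlSerializer(typeof(ProjectError)).Deserialize(stream);
        }
    }

    public static void Main()
    {
        byte[] xmlBytes = ToUtf8Xml(new ProjectError { Message = "café" });
        string xml = Encoding.UTF8.GetString(xmlBytes); // what the reading code would hold
        ProjectError copy = FromXmlString(xml);
        Console.WriteLine(copy.Message == "café" ? "round trip ok" : "round trip failed");
    }
}
```

The non-ASCII character in the sample message is there on purpose, to exercise the multi-byte path through the UTF-8 encoding.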

Comments (3)

  1. Curley says:

    Hi, I fell foul of this 'error in xml document at…'  and after reading through reams of stuff I came upon this article and thought I'd found my saviour. Punched the code into a little console app to try it against my simple xml file and I still get the same message when trying to deserialize. What is the real problem?

  2. scicoria says:

    It's generally a mismatch between how the data was actually serialized and what the XML declaration says about its encoding, or something along those lines.

    Can you send me a code sample that reproduces this, and I can try to find the "real problem"?

  3. Curley says:

    It was my file that was wrong. I did things the long way: I loaded the file into an array of objects and saved a serialized list. This created a file that can be read. The first two lines of my original file were <?xml version="1.0" encoding="utf-8"?> <ProjectError> and now the new file has

    <?xml version="1.0"?> <ArrayOfProjectError xmlns:xsi="http://www.w3.org/…/XMLSchema-instance" xmlns:xsd="http://www.w3.org/…/XMLSchema">. <ProjectError> is now the third line. Had I started with no file and just created one using your routine, I would not have had a problem, or red eyes from reading all about other people's problems. Another lesson learnt. Thanks for taking the time to reply and for providing a neat solution in the first place.