Kirk Evans Blog

.NET From a Markup Perspective

Reading Fragments with XmlTextReader

Daniel points out a way to read fragments with XmlTextReader, and mentions that I posed the problem.


I was working with some rather nasty “XML” that was not well-formed, and I had to find a way to parse it. The SgmlReader proved perfect for this, save for one outstanding problem: the XML did not contain a root node. SgmlReader reads the first top-level node in a document until it encounters that node’s closing element. When this happens, it sets its internal state to EOF (End of File), so nothing after the first node is ever read.


Using Daniel’s approach, I could “trick” the SgmlReader into thinking that there was a root node for the document, and then parse the entire document as though it were well-formed.

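FileStream fs = new XmlFragmentStream("someBIGfile.xml");

SgmlReader reader = new Sgml.SgmlReader();
reader.InputStream = new StreamReader(fs, System.Text.Encoding.UTF8);
reader.WhitespaceHandling = System.Xml.WhitespaceHandling.None;

// Atomize the element name so it can be compared by reference below
object docNode = reader.NameTable.Add("DOCUMENT");

// "writer" is an XmlWriter created elsewhere
reader.Read();
while (!reader.EOF)
{
    // Use the NameTable for reference comparison
    if (reader.NodeType == XmlNodeType.Element && docNode == (object)reader.Name)
    {
        // WriteNode copies the whole element and leaves the reader on the
        // node that follows it, so do not call Read() again here
        writer.WriteNode(reader, true);
    }
    else
    {
        reader.Read();
    }
}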
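
The root-node trick can be illustrated without SgmlReader. What follows is only a minimal sketch and not Daniel's actual XmlFragmentStream: RootWrappingReader, FragmentDemo, CountDocuments and the "root" element name are all hypothetical names made up for this example, and a plain XmlTextReader stands in for SgmlReader. The idea is simply to hand the parser an opening tag, then the real content, then a closing tag, as if they were one continuous input.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper (not part of any library): presents
// "<root>" + inner content + "</root>" as one continuous TextReader.
sealed class RootWrappingReader : TextReader
{
    private readonly TextReader[] parts;
    private int index;

    public RootWrappingReader(TextReader inner, string rootName)
    {
        parts = new TextReader[]
        {
            new StringReader("<" + rootName + ">"),
            inner,
            new StringReader("</" + rootName + ">")
        };
    }

    // Read one character, moving on to the next part when one runs dry.
    // The base class builds Read(char[], int, int) on top of this method.
    public override int Read()
    {
        while (index < parts.Length)
        {
            int c = parts[index].Read();
            if (c != -1) return c;
            index++;
        }
        return -1;
    }
}

static class FragmentDemo
{
    // Counts top-level DOCUMENT elements in a fragment that has no root node.
    public static int CountDocuments(string fragment)
    {
        TextReader wrapped = new RootWrappingReader(new StringReader(fragment), "root");
        XmlTextReader reader = new XmlTextReader(wrapped);
        int count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "DOCUMENT")
            {
                count++;
            }
        }
        reader.Close();
        return count;
    }
}
```

Calling FragmentDemo.CountDocuments("<DOCUMENT>one</DOCUMENT><DOCUMENT>two</DOCUMENT>") reads both elements, because the parser sees them as children of the synthetic root rather than as two competing root nodes. Reading one character at a time is slow for a big file; a real implementation would likely override the buffered Read overload as well.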