Removing Namespaces in XML, Security in ASP.NET

06/13/2003

An Update - 05/19/2004

Instead of looking at XSLT for removing namespaces, consider using a specialized System.Xml.XmlTextWriter to remove XML Namespaces.

-ed.

ASP.NET Security

The Patterns and Practices Group introduced "Improving Web Application Security: Threats and Countermeasures", a free PDF download that weighs in at about 6 megs of great information on developing secure web applications. I haven't read "Building Secure ASP.NET Applications" yet, but this guide is certainly a compelling reason to think about ordering it soon.

Removing Namespaces in XML

I have seen this question and related questions in many newsgroup posts: "How do I remove namespaces from my XML?"  There are several ways to do this. Of course, I have to demo the XSLT version first.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Re-create each attribute under its local name (no prefix, no namespace) -->
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
  <!-- Re-create each element under its local name, then process its attributes and children -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

Apply this stylesheet to the following XML:

<k:foo xmlns:k="testing">
  <k:bar id="1"/>
</k:foo>

The result of the transformation will be:

<foo>
  <bar id="1"/>
</foo>

The stylesheet re-creates each element and attribute under its local name, which drops both the prefix and the namespace. To run the transformation, you use the System.Xml.Xsl.XslTransform class. Typically you give it the URL of the source file, and it internally builds an XPathDocument from the contents of that file. If you process a large XML document this way, performance suffers badly, because the transformation requires the entire document to be loaded into memory.
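As a rough sketch of what applying the stylesheet can look like in code: the example below uses XslCompiledTransform, which replaced XslTransform when the latter was marked obsolete in version 2.0 of the framework, so it is a substitution rather than the class named above. The class name StripNamespacesDemo and the file names in the usage note are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class StripNamespacesDemo
{
    // Applies the stylesheet at stylesheetPath to the XML file at inputPath
    // and returns the transformed markup as a string.
    public static string Transform(string stylesheetPath, string inputPath)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        xslt.Load(stylesheetPath);

        using (StringWriter output = new StringWriter())
        {
            // The input is still loaded in full before the transformation
            // runs, which is the memory cost described above.
            xslt.Transform(inputPath, null, output);
            return output.ToString();
        }
    }
}
```

A call such as StripNamespacesDemo.Transform("strip-ns.xslt", "test.xml") would return the transformed document, assuming both files exist under those names.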

Instead of loading the entire document into memory, you can use an XmlTextReader to read the contents of the XML file at the specified URL. As you read nodes from the XmlTextReader, you write the correct node type to the XmlTextWriter, omitting parameters for namespaces and prefixes.

Response.ContentType = "text/xml";
//Read the original XML file with a pull-model processor
XmlTextReader reader = new System.Xml.XmlTextReader(Server.MapPath("test.xml"));

XmlTextWriter writer = new
  System.Xml.XmlTextWriter(Response.OutputStream,System.Text.Encoding.UTF8);

writer.WriteStartDocument();

while(reader.Read())
{
  switch(reader.NodeType)
  {
    case XmlNodeType.Element:
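      //Use LocalName, not Name, so the prefix (e.g. "k:") is dropped
      writer.WriteStartElement(reader.LocalName);

      if(reader.HasAttributes)
      {
        //Cannot just use writer.WriteAttributes,
        // else it will also emit the xmlns declarations
        while(reader.MoveToNextAttribute())
        {
          //Skip both xmlns="..." and xmlns:prefix="..." declarations
          if(reader.Name != "xmlns" && reader.Prefix != "xmlns")
            writer.WriteAttributeString(reader.LocalName,reader.Value);
        }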
        reader.MoveToElement();
      }
      if(reader.IsEmptyElement)
      {
        writer.WriteEndElement();
      }
      break;
    case XmlNodeType.Text:
      writer.WriteString(reader.Value);
      break;
    case XmlNodeType.CDATA:
      writer.WriteCData(reader.Value);
      break;
    case XmlNodeType.ProcessingInstruction:
      writer.WriteProcessingInstruction(reader.Name,reader.Value);
      break;
    case XmlNodeType.Comment:
      writer.WriteComment(reader.Value);
      break;
    case XmlNodeType.EntityReference:
      writer.WriteEntityRef(reader.Name);
      break;
    case XmlNodeType.EndElement:
      writer.WriteEndElement();
      break;
  }
}
writer.WriteEndDocument();
writer.Flush();
writer.Close();
reader.Close();

This seems like a lot of work just to remove namespaces, and it is. I had previously done this same task simply by setting the Namespaces property of the XmlTextReader to false (see my post in the microsoft.public.dotnet.xml newsgroup for a sample). I cannot currently get this to work with version 1.1 of the framework, though I suspect my machine may be doing something odd.
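The update at the top of this article suggests a specialized XmlTextWriter instead. One way that might look, as a minimal sketch rather than a definitive implementation: subclass XmlTextWriter, discard the prefix and namespace on every element and attribute, and swallow the xmlns declarations. The names NamespaceStrippingWriter, NamespaceStripper and Strip are hypothetical and come from neither the framework nor the original post.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical writer that ignores prefixes and namespaces.
public class NamespaceStrippingWriter : XmlTextWriter
{
    // True while the attribute being written is an xmlns declaration
    private bool skippingAttribute;

    public NamespaceStrippingWriter(TextWriter output) : base(output) { }

    public override void WriteStartElement(string prefix, string localName, string ns)
    {
        // Keep only the local name
        base.WriteStartElement(null, localName, null);
    }

    public override void WriteStartAttribute(string prefix, string localName, string ns)
    {
        // xmlns="..." and xmlns:prefix="..." are dropped entirely
        if (prefix == "xmlns" || (string.IsNullOrEmpty(prefix) && localName == "xmlns"))
        {
            skippingAttribute = true;
            return;
        }
        base.WriteStartAttribute(null, localName, null);
    }

    public override void WriteString(string text)
    {
        // Suppress the value of a skipped xmlns declaration
        if (!skippingAttribute)
            base.WriteString(text);
    }

    public override void WriteEndAttribute()
    {
        if (skippingAttribute)
        {
            skippingAttribute = false;
            return;
        }
        base.WriteEndAttribute();
    }
}

public static class NamespaceStripper
{
    // Streams the input through the writer node by node
    public static string Strip(string xml)
    {
        using (StringWriter output = new StringWriter())
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            NamespaceStrippingWriter writer = new NamespaceStrippingWriter(output);
            writer.WriteNode(reader, false);
            writer.Flush();
            return output.ToString();
        }
    }
}
```

Because WriteNode pulls nodes from the reader one at a time, this keeps the streaming behavior of the loop above while moving the namespace handling into the writer. As with the XSLT version, two attributes that differ only by prefix would collide once the prefixes are gone.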
