XmlNameTable: The Shiftstick of System.Xml

I spent much of today in a customer lab on performance in .NET applications, covering best practices for System.Xml. As always, the majority of people used XML somewhere in their application and needed to understand the performance implications of using one technique over another. One approach that I covered, among many others, is the use of the XmlNameTable class. This seemingly insignificant yet crucial class surfaces on all the classes in System.Xml that do some form of processing (XmlTextReader, XPathNavigator and XmlDocument) and, like the shift stick on a car (gear stick if you live in the UK), it is an implementation detail that allows you to tune the performance of your XML processing.

Here is an example of it in use. Take this portion of an example XML document called invoices.xml that lists a number of LineItems for a given named customer.

<Invoices xmlns="https://example.invoice/invoices">
    <Invoice>
        <CustomerName>Levi</CustomerName>
        <LineItems>
            <LineItem>
                <ID>18148</ID>
                <Price>564</Price>
                <Description>A description</Description>
            </LineItem>
        </LineItems>
    </Invoice>
    ...
</Invoices>

 

The following code uses the XmlTextReader with and without an XmlNameTable. As the reader parses a document, every element name, attribute name and namespace it encounters is automatically added to the XmlReader's XmlNameTable, a process called atomization, so each distinct name is stored as a single string object no matter how many times it repeats. This lets you add your own names to the name table and then perform efficient object reference comparisons rather than character-by-character string comparisons, which is useful in documents with many repeating known elements, attributes or namespaces.
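Before looking at the timing code, atomization itself can be seen in isolation. The following is a minimal sketch rather than part of the measured test: the class name AtomizationSketch and the variable names are made up for illustration, and it assumes only the standard NameTable.Add and NameTable.Get methods.

```csharp
using System;
using System.Xml;

class AtomizationSketch
{
    static void Main()
    {
        NameTable nt = new NameTable();

        // Adding a name returns the atomized string held by the table.
        string first = nt.Add("LineItem");

        // Build a second, distinct string object with the same characters.
        string copy = new string("LineItem".ToCharArray());

        // Adding an equal name again hands back the instance already stored.
        string second = nt.Add(copy);

        Console.WriteLine(object.ReferenceEquals(first, second)); // True
        Console.WriteLine(object.ReferenceEquals(first, copy));   // False
        Console.WriteLine(first == copy);                         // True (value comparison)

        // Get does not add anything; it returns null for a name never atomized.
        Console.WriteLine(nt.Get("Price") == null);               // True
    }
}
```

The point to take away is that two strings coming out of the same name table are equal exactly when they are the same object, which is what makes the pointer comparison in the code below safe.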

 

static void RunPerfNameTable()
{
      Console.WriteLine("** XmlNameTable vs No XmlNameTable **");

      // Warm up run
      PerfNoNameTable("invoices.xml");

      for (int i = 0; i < 5; i++)
      {
            PerfNoNameTable("invoices.xml");
            PerfNameTable("invoices.xml");
      }

      Console.ReadLine();
}

static void PerfNoNameTable(string filename)
{
      int start = 0, stop = 0, invoicecount = 0, lineitemcount = 0;

      Console.WriteLine("Reading XML without NameTable comparison");
      start = Environment.TickCount;

      for (int i = 0; i < 80; i++)
      {
            // Create the reader; it builds its own internal name table.
            XmlTextReader reader = new XmlTextReader(filename);

            while (reader.Read())
            {
                  // String value comparison, character by character
                  if ("Invoice" == reader.LocalName)
                  {
                        invoicecount++;
                  }

                  if ("LineItem" == reader.LocalName)
                  {
                        lineitemcount++;
                  }
            }

            // Release the underlying file before the next iteration
            reader.Close();
      }

      stop = Environment.TickCount;
      Console.WriteLine("XmlTextReader document parsing time in ms WITHOUT NameTable: " + (stop - start).ToString());
}

static void PerfNameTable(string filename)
{
      int start = 0, stop = 0, invoicecount = 0, lineitemcount = 0;

      // Atomize the names we want to look for up front
      NameTable nt = new NameTable();
      object invoice = nt.Add("Invoice");
      object lineitem = nt.Add("LineItem");

      Console.WriteLine("Reading XML WITH NameTable comparison");
      start = Environment.TickCount;

      for (int i = 0; i < 80; i++)
      {
            // Create the reader on top of the shared name table
            XmlTextReader reader = new XmlTextReader(filename, nt);

            while (reader.Read())
            {
                  // Cache the value of the reader.LocalName property in a local variable
                  object localname = reader.LocalName;

                  // Comparison between object references. This just compares pointers
                  if (invoice == localname)
                  {
                        invoicecount++;
                  }

                  // Comparison between object references. This just compares pointers
                  if (lineitem == localname)
                  {
                        lineitemcount++;
                  }
            }

            // Release the underlying file before the next iteration
            reader.Close();
      }

      stop = Environment.TickCount;
      Console.WriteLine("XmlTextReader document parsing time in ms WITH NameTable: " + (stop - start).ToString());
}

The crucial piece of code shown above is this line, which does two things:

                  object localname = reader.LocalName;

 

1) It caches the result of the LocalName property. In the V1 implementation of the XmlReader, every access to this property costs two virtual method calls, one public and the other internal, so reading it once per node avoids repeating them.

2) It allows the local name to be compared multiple times by object reference rather than by value, as in this line of code:

 

                  if (invoice == localname)

 

The end result is a performance increase of around 6-9% for parsing a 230 KB XML file on a machine with a PIII processor and 1 GB of memory. This is not enormous, but in scenarios where there is a high throughput of XML documents or the documents are large (>200 KB), using the XmlNameTable gives you enough of a performance benefit to make it worthwhile, especially if your processing starts to span multiple XML components in a pipelining scenario and the XmlNameTable is shared across them, i.e. XmlTextReader->XmlDocument->XslTransform.
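To illustrate the first hop of such a pipeline, here is a rough sketch of sharing one name table between an XmlTextReader and an XmlDocument. It was not part of the timed test above: the class name SharedNameTableSketch and the small inline document (without a namespace, to keep it short) are assumptions made for illustration. It relies on the XmlDocument constructor that accepts an XmlNameTable.

```csharp
using System;
using System.IO;
using System.Xml;

class SharedNameTableSketch
{
    static void Main()
    {
        // Hypothetical inline document standing in for invoices.xml
        string xml = "<Invoices><Invoice><CustomerName>Levi</CustomerName></Invoice></Invoices>";

        // One name table, with the name of interest atomized up front
        NameTable nt = new NameTable();
        object invoice = nt.Add("Invoice");

        // Both the reader and the document are built on the same table
        XmlTextReader reader = new XmlTextReader(new StringReader(xml), nt);
        XmlDocument doc = new XmlDocument(nt);
        doc.Load(reader);
        reader.Close();

        // The document exposes the very table it was given
        Console.WriteLine(object.ReferenceEquals(nt, doc.NameTable)); // True

        // Names in the DOM are atomized against that table, so the same
        // reference comparison used with the reader applies here too.
        int invoicecount = 0;
        foreach (XmlNode node in doc.DocumentElement.ChildNodes)
        {
            object localname = node.LocalName;
            if (invoice == localname)
            {
                invoicecount++;
            }
        }
        Console.WriteLine("Invoices found: " + invoicecount);
    }
}
```

Because the names were atomized once by the reader, the document does not need to build a second copy of them, and any names you added yourself stay valid for comparison at every stage.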