Serializing to XML

I’m doing some performance testing on Azure tracing to figure out the overhead of trace statements in code, especially when the tracing level is such that the statement does nothing – but clearly it has to do something to figure out to do nothing.  So I have a simple Worker Role that calculates the 10,000th number in the Fibonacci sequence.  I run this with tracing every 100th time through the loop (but the trace level set such that the trace statement won’t emit the output) and with no tracing code. 

To gather the data, I use a timer which I put into a simple value class:

 [Serializable]
public struct TimerInstance
    {
        public TimerInstance(string str, Int32 nTicks)
        {
            _name = str;
            _ticks = nTicks;
        }

        private readonly string _name;
        private readonly Int32 _ticks;

        public string Name { get { return _name == null ? String.Empty : _name; } }
        public Int32 Ticks { get { return _ticks; } }

        public override string ToString()
        {
            return string.Format("{0,20}: {1,10:N}", Name, Ticks);
        }
    }

Then I just have a list of these which I add to as I run each test:

     static private List<TimerInstance> _Timers = new List<TimerInstance>();

At the end of a batch of tests, I want to persist out all the timer values into XML. I decided it was easiest to just store those in Azure blob storage. I can then import these into Excel where I can do some analysis to figure out the average overhead, minimum, etc.

This seemed super easy, but I found that serializing to XML is not as straightforward as I thought it would be.

First, I added the <pre>[Serializable]</pre> attribute to the struct (as shown above).  But what I ended up with was something like this:

 <?xml version="1.0"?>

<ArrayOfTimerInstance xmlns:xsd="<a href="https://www.w3.org/2001/XMLSchema"">
                                    https://www.w3.org/2001/XMLSchema"</a>
                      xmlns:xsi="<a href="https://www.w3.org/2001/XMLSchema-instance"">
                                    https://www.w3.org/2001/XMLSchema-instance"</a>>


    <TimerInstance />

    <TimerInstance />
    <TimerInstance />
    <TimerInstance />

</ArrayOfTimerInstance>

I couldn’t figure out why the public properties (Name and Ticks) of the items in the list weren’t getting serialized.

After poking around a bit, I realized two things:

  • The Serialize attribute has nothing to do, really, with XML Serialization – it marks the object as binary serializable which I wasn’t interested in for this.
  • There is an interface, IXmlSerializable, you can implement on a struct or a class to control how XML serialization happens.

So I modified my struct as follows:

     public struct TimerInstance : IXmlSerializable
    {
        public TimerInstance(string str, Int32 nTicks)
        {
            _name = str;
            _ticks = nTicks;
        }

        private readonly string _name;
        private readonly Int32 _ticks;

        public string Name { get { return _name == null ? String.Empty : _name; } }
        public Int32 Ticks { get { return _ticks; } }

        public override string ToString()
        {
            return string.Format("{0,20}: {1,10:N}", Name, Ticks);
        }

        #region IXMLSerializable Members

        public System.Xml.Schema.XmlSchema GetSchema()
        {
            throw new ApplicationException("This method is not implemented");
        }

        public void ReadXml(System.Xml.XmlReader reader)
        {
            throw new ApplicationException("XML deserialization of TimerInstance not supported");
        }

        public void WriteXml(System.Xml.XmlWriter writer)
        {
            writer.WriteStartAttribute("Name");
            writer.WriteString(Name);
            writer.WriteEndAttribute();

            writer.WriteStartAttribute("Ticks");
            writer.WriteString(Ticks.ToString());
            writer.WriteEndAttribute();
        }

        #endregion
    }

and used this code to serialize it into a MemoryStream:

         // Return items that should be persisted.  By convention, we are eliminating the "outlier"
        // values which I've defined as the top and bottom 5% of timer values.
        private static IEnumerable ItemsToPersist()
        {
            // Eliminate top and bottom 5% of timers from the enumeration.  Figure out how many items
            // to skip on both ends.  Assuming more than 10 timers in the list, otherwise just take
            // all of them
            int nTimers = (_Timers.Count > 10 ? _Timers.Count : 0);
            int iFivePercentOfTimers = nTimers / 20;
            int iNinetyPercentOfTimers = _Timers.Count - iFivePercentOfTimers * 2;

            return (from x in _Timers
                    orderby x.Ticks descending
                    select x).Skip(iFivePercentOfTimers).Take(iNinetyPercentOfTimers);
        }

        // Serialize the timer list as XML to a stream - for storing in an Azure Blob
        public static void SerializeTimersToStream(Stream s)
        {
            if (_Timers.Count > 0)
            {
                System.Xml.Serialization.XmlSerializer x = new System.Xml.Serialization.XmlSerializer(typeof(List));
                x.Serialize(s, ItemsToPersist().ToList());
             }
         }

Notice that I have a LINQ enumerator which sorts the list and eliminates the outliers - which I define as the bottom and top 5% of timer values. Probably something wierd was going on when these happened, so I don't want the outliers to skew the mean of my overhead calculation.

Once I have this, it's a pretty straightforward thing to store the memory stream in an Azure Blob using UploadFromStream:

         // Serialize the timers into XML and store in a blob in Azure
        private void SerializeTimersToBlob(string strTimerGroupName)
        {
            // add the binary to blob storage
            try
            {
                // Create container if it doesn't exist.  Make it publicly accessible.
                if (_blobContainer == null)
                {
                    CloudBlobClient storage = _storageAccount.CreateCloudBlobClient();
                    storage.GetContainerReference("timers").CreateIfNotExist();
                    if ((_blobContainer = storage.GetContainerReference("timers")) == null)
                        throw new ApplicationException("Failed to create Azure Blob Container for timers");
                    _blobContainer.SetPermissions(
                                      new BlobContainerPermissions()
                                                 { PublicAccess = BlobContainerPublicAccessType.Blob });
                }

                // Remove invalid characters from timer group name and add XML extension to create Blob name.
                string  strBlobName = strTimerGroupName.Replace(':', '.') + ".xml";
                var blob = _blobContainer.GetBlobReference(strBlobName);
                MemoryStream ms = new MemoryStream();

                // Serialize all the recorded timers in XML into the stream
                TraceTimer.SerializeTimersToStream(ms);

                // Seek back to the beginning of the stream.
                ms.Seek(0, SeekOrigin.Begin);

                // Save the stream as the content of the Azure blob
                blob.Properties.ContentType = "text/xml";
                blob.UploadFromStream(ms);
                Diagnostics.WriteDiagnosticInfo(Diagnostics.ConfigTrace,
                                                TraceEventType.Information,
                                                WorkerDiagnostics.TraceEventID.traceFlow, "Added " + strBlobName);
            }
            catch (Exception exp)
            {
                Diagnostics.WriteExceptionInfo(Diagnostics.ConfigTrace,
                                               "Saving Timer Information to Azure Blob", exp);
            }
        }

(Diagnostics is my own class to provide TraceSource-based logging to Azure). The only additional trick I needed here was to Seek on the MemoryStream to the beginning - I was getting empty blobs before I did that.

The final result:

 <?xml version="1.0"?>
<ArrayOfTimerInstance xmlns:xsi="<a href="https://www.w3.org/2001/XMLSchema-instance"">                                    https://www.w3.org/2001/XMLSchema-instance"</a>                      xmlns:xsd="<a href="https://www.w3.org/2001/XMLSchema"">                                    https://www.w3.org/2001/XMLSchema"</a>>
  <TimerInstance Name="Fibonacci Generation (Trace) 1" Ticks="468294" />
  <TimerInstance Name="Fibonacci Generation (Trace) 9" Ticks="410877" />
  <TimerInstance Name="Fibonacci Generation (Trace) 3" Ticks="402439" />
  <TimerInstance Name="Fibonacci Generation (Trace) 5" Ticks="388574" />
  <TimerInstance Name="Fibonacci Generation (Trace) 8" Ticks="385690" />
  <TimerInstance Name="Fibonacci Generation (Trace) 2" Ticks="385261" />
  <TimerInstance Name="Fibonacci Generation (Trace) 10" Ticks="376425" />
  <TimerInstance Name="Fibonacci Generation (Trace) 6" Ticks="345973" />
  <TimerInstance Name="Fibonacci Generation (Trace) 7" Ticks="339461" />
  <TimerInstance Name="Fibonacci Generation (Trace) 4" Ticks="331053" />
</ArrayOfTimerInstance>