Converting from XmlDocument to XDocument

Converting from XmlDocument to XDocument has a number of benefits: you can use LINQ to XML, work with a much cleaner object model, get better name handling with XName, and use functional constructors. However, there are a lot of XmlDocuments out there, so what is the best way to convert an XmlDocument to an XDocument?

This question came up in the forums a little while ago, and I thought it might be interesting to do some comparisons.

I first came up with a few ways of turning an XmlDocument into an XDocument.

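    // Requires: using System.Xml; using System.Xml.Linq;

    // Serialize the whole document to a string, then re-parse it.
    private static XDocument DocumentToXDocument(XmlDocument doc)
    {
        return XDocument.Parse(doc.OuterXml);
    }

    // Read the document through an XPathNavigator-backed reader.
    private static XDocument DocumentToXDocumentNavigator(XmlDocument doc)
    {
        return XDocument.Load(doc.CreateNavigator().ReadSubtree());
    }

    // Read the document's nodes directly with an XmlNodeReader.
    private static XDocument DocumentToXDocumentReader(XmlDocument doc)
    {
        return XDocument.Load(new XmlNodeReader(doc));
    }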
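
Before timing anything, it seems worth checking that the three approaches actually produce the same tree. Here is a minimal sketch of such a check; the class and method names (ConversionCheck, AllConversionsAgree) are made up for illustration and aren't part of the timing code, and the comparison relies on XNode.DeepEquals.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ConversionCheck
{
    // Builds an XDocument three ways from the same XmlDocument and
    // reports whether the resulting trees are structurally equal.
    public static bool AllConversionsAgree(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // 1. Serialize to a string and re-parse.
        XDocument viaString = XDocument.Parse(doc.OuterXml);

        // 2. Read through a navigator-backed reader.
        XDocument viaNavigator;
        using (XmlReader reader = doc.CreateNavigator().ReadSubtree())
        {
            viaNavigator = XDocument.Load(reader);
        }

        // 3. Read the nodes directly with an XmlNodeReader.
        XDocument viaReader;
        using (XmlNodeReader reader = new XmlNodeReader(doc))
        {
            viaReader = XDocument.Load(reader);
        }

        return XNode.DeepEquals(viaString, viaNavigator)
            && XNode.DeepEquals(viaString, viaReader);
    }
}
```

For simple input such as "<parent><child>text</child></parent>" I would expect this to return true; documents with DTDs, entities or significant whitespace may deserve a closer look.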

Next I whipped up a quick and dirty function to time these. I force a garbage collection first so that earlier activity doesn't leave much garbage behind, and I warm up the action a bit (I also warm up the Stopwatch methods, just in case).

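    // Requires: using System; using System.Diagnostics;
    private static long Time(int count, Action action)
    {
        // Collect garbage left over from earlier activity.
        GC.Collect();

        // Warm up the action so one-time costs (such as JIT) aren't measured.
        for (int i = 0; i < 3; i++)
        {
            action();
        }

        // Warm up the Stopwatch methods too, then reset before measuring.
        Stopwatch watch = new Stopwatch();
        watch.Start();
        watch.Stop();
        watch.Reset();
        watch.Start();

        for (int i = 0; i < count; i++)
        {
            action();
        }

        // Stop before reading so the result is a fixed value.
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }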

And finally, all together:

    // Build a document with 1000 child elements.
    StringBuilder sb = new StringBuilder();
    sb.Append("<parent>");
    for (int i = 0; i < 1000; i++)
    {
        sb.Append(" <child>text</child>");
    }
    sb.Append("</parent>");

    string text = sb.ToString();
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(text);

    // Run each conversion 1000 times.
    long docToXDoc = Time(1000, () => DocumentToXDocument(doc));
    long docToXDocNavigator = Time(1000, () => DocumentToXDocumentNavigator(doc));
    long docToXDocReader = Time(1000, () => DocumentToXDocumentReader(doc));

Note that the absolute numbers don't matter much, as this is my laptop running a bunch of things in the background, under the debugger and whatnot, but the relative values are interesting to see.

These are the values I got (they vary a bit each run, but not by much).

  • Using OuterXml: 1973 ms.
  • Using a navigator over the document: 1254 ms.
  • Using a reader over the document: 1154 ms.

Not surprisingly, avoiding the creation of a big string just to re-parse it is a big win - save the planet, use less CPU power!

So if we like the reader option, what is a convenient way of encapsulating that? Well, C# 3.0 extension methods aren't too bad.

Here is one way of writing the methods.

    public static class XmlDocumentExtensions
    {
        public static XDocument ToXDocument(this XmlDocument document)
        {
            return document.ToXDocument(LoadOptions.None);
        }

        public static XDocument ToXDocument(this XmlDocument document, LoadOptions options)
        {
            using (XmlNodeReader reader = new XmlNodeReader(document))
            {
                return XDocument.Load(reader, options);
            }
        }
    }

Now, as long as the class is visible to the code you're writing (that is, its namespace is in scope through a using directive), you can write code like this.

    XmlDocument doc = new XmlDocument();
    doc.LoadXml("<parent><child>text</child></parent>");

    XDocument xdoc = doc.ToXDocument();
    var children = xdoc.Element("parent").Elements("child");
    foreach (var child in children)
    {
        Console.WriteLine(child.Value);
    }

Of course, if you could, you would just start off with an XDocument. These methods address the cases where you already have an XmlDocument around and can't change all the code to use XDocument.

One thing that I like about extension methods is that they help bridge dependencies across libraries in a clean way.
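
To illustrate the bridging idea in the other direction, here is a minimal sketch of a companion extension method that goes from XDocument back to XmlDocument. The class and method names (XDocumentExtensions, ToXmlDocument) are assumptions of mine for illustration; the sketch relies on XDocument.CreateReader and XmlDocument.Load, and I haven't timed it against the alternatives.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical companion class, for illustration only.
public static class XDocumentExtensions
{
    // Loads a new XmlDocument from a reader over the XDocument's nodes,
    // avoiding an intermediate string.
    public static XmlDocument ToXmlDocument(this XDocument document)
    {
        XmlDocument result = new XmlDocument();
        using (XmlReader reader = document.CreateReader())
        {
            result.Load(reader);
        }
        return result;
    }
}
```

With this in scope, code that still expects an XmlDocument can be handed the result of xdoc.ToXmlDocument() without either library needing to know about the other.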

Enjoy!

Marcelo Lopez Ruiz

https://blogs.msdn.com/marcelolr/