Rough Spots in the LINQ to XML Learning Curve

[minor editorial updates 11/13] 

We've been doing some formal usability testing on all the LINQ components over the last couple of months and have learned a lot about what people find challenging. The results have generally validated LINQ's story as a common programming model for all types of data, but they've also identified some things that people find hard.  Some of these may be fixed with API tweaks, but for the most part they indicate what we have to do a better job of explaining.

I've come up with the following list of things to keep in mind when working with LINQ to XML. Some will be explained in this post, others in subsequent posts, and we'll try to make sure that the documentation calls them out.

  1. Clear your mind of SAX and DOM when approaching LINQ to XML; there are many similarities of course, but they can lead you astray.  XPath, XSLT, and XQuery are closer in spirit to LINQ to XML, but of course the details are very different.
  2. Remember that XElement and XDocument are similar, but not interchangeable when you are loading a document from a data source.  XElement.Load() loads the top-level element and everything under it; XDocument.Load() also loads any markup before the top-level element.
  3. You must grok IEnumerable<T> to use any of the LINQ tools effectively; this abstraction plays essentially the same role in LINQ as the relation does in the relational model.  In LINQ to XML, it is the axes of the XML tree, not the nodes of the tree, that expose IEnumerable<T> and hence can be queried.
  4. People who see an IEnumerable have a tendency to do a foreach to iterate over it and use lots of if statements to decide what to do with the data values. Avoid that temptation (except when foreach is unavoidable, e.g. when writing to the console), and use the standard LINQ operators to select, filter, sort, join, group, etc. the data.  Once you get used to this discipline, we believe you'll be more productive and your code will be more understandable, and more likely to be correct.
  5. In most scenarios we have investigated, there is a simple, clean, and CORRECT way to do what you need to do with LINQ queries and functional constructors.  If you find yourself writing a lot of tricky imperative code to deal with straightforward XML data, step back and decompose / refactor / reconsider. Remember that the execution of queries is deferred until you actually get the data, so there is no performance penalty to decomposing a complex query / transformation into understandable pieces.
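
To make points 4 and 5 a bit more concrete, here is a minimal sketch of decomposing a query into understandable pieces. The inline feed, its <category> elements, and the variable names are made up for illustration (the real feed is not being described here); the point is only that each step stays deferred until the final foreach enumerates it.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical inline feed so the sketch runs without network access.
var feed = XElement.Parse(@"<rss version=""2.0"">
  <channel>
    <title>Sample Feed</title>
    <item><title>Zebra post</title><category>linq</category></item>
    <item><title>Apple post</title><category>xml</category></item>
    <item><title>Mango post</title><category>linq</category></item>
  </channel>
</rss>");

// Step 1: choose the axis. Nothing is evaluated yet.
var items = feed.Descendants("item");

// Step 2: filter, sort, and project with the standard query operators
// rather than a foreach full of if statements. Still deferred.
var linqTitles =
    from i in items
    where (string)i.Element("category") == "linq"
    orderby (string)i.Element("title")
    select (string)i.Element("title");

// Step 3: enumerating the query is what triggers execution;
// foreach is used only to write the results out.
foreach (var title in linqTitles)
    Console.WriteLine(title);   // Mango post, then Zebra post
```

Splitting the axis selection from the filtering costs nothing extra at run time, because neither piece does any work until the results are actually enumerated.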

One overall pattern we've detected is worth noting: The less you know about current XML technologies, the faster you learn LINQ to XML.  The intuition of participants in our studies who know DOM, SAX, XmlReader, etc. is that the problems we posed them should be hard to solve, so they tended to roll up their sleeves and start writing a bunch of imperative code. Those who didn't know much about XML except that it is a tree of elements and attributes could use their SQL and C# intuition and found more elegant solutions.  On the other hand, the really experienced developers fairly quickly learned the core LINQ design patterns and began to apply them effectively to new problems. This was very heartening, since the core LINQ to XML value proposition is supposed to be making XML accessible to mainstream developers who don't want to learn the complex details of its syntax or master the numerous XML processing tools that exist today.

Getting to the less cheery news, there is plenty of stuff we need to work harder to explain, and to think about making easier in the API.  One thing that trips up a lot of people (ahem, including moi as I was working on this post!) concerns the XDocument and XElement classes.  To illustrate this, consider the MSDN blogs RSS feed as an example.  Its top-level format (stripped down a bit for illustrative purposes) is:

<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="rss.xsl" ?>
<rss version="2.0" xmlns:dc="https://purl.org/dc/elements/1.1/">
<channel><title>MSDN Blogs</title>
<description>The Blogs of MSDN</description>
<item><title>Develop Mental: Game Camp</title>

In general, use XElement when you just need to load the tree of elements, and XDocument when you are interested in any doctype information, PIs, etc. between the top of the file and the first element.  For example, if you want to get the xml-stylesheet processing instruction, load the XDocument:

var feed = XDocument.Load(@"https://blogs.msdn.com/MainFeed.aspx");
Console.WriteLine(feed.FirstNode);

This will display the value of the xml-stylesheet PI.  On the other hand, ...

var feed = XElement.Load(@"https://blogs.msdn.com/MainFeed.aspx");
Console.WriteLine(feed.FirstNode);

... will load only the <rss> element and display the contents of the first node beneath it, i.e. the <channel> element, but any markup before the <rss> element is thrown away.  That has a somewhat counterintuitive side effect:

feed.Element("rss")

returns an XElement with the name "rss" when feed is populated via the XDocument object's Load() method, but null when it is populated via XElement.Load().   
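
The difference is easier to see side by side. The following is a small sketch, not the actual feed: it parses a made-up string (an assumption, so the example runs offline) with both classes and compares what FirstNode and Element("rss") give back.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical stand-in for the feed, parsed from a string so no network is needed.
const string xml =
    @"<?xml-stylesheet type=""text/xsl"" href=""rss.xsl""?>
<rss version=""2.0"">
  <channel><title>Sample Feed</title></channel>
</rss>";

var doc = XDocument.Parse(xml);   // keeps the PI and wraps the root element
var root = XElement.Parse(xml);   // returns the <rss> element itself

// The document's first node is the processing instruction...
Console.WriteLine(doc.FirstNode.NodeType);           // ProcessingInstruction
// ...while the element's first node is its first child, <channel>.
Console.WriteLine(((XElement)root.FirstNode).Name);  // channel

// Element("rss") searches the children of the object it is called on.
Console.WriteLine(doc.Element("rss") != null);       // True: <rss> is a child of the document
Console.WriteLine(root.Element("rss") != null);      // False: root already is <rss>

// doc.Root reaches the same element without naming it.
Console.WriteLine(doc.Root.Name == root.Name);       // True
```

If you have loaded an XDocument and just want the top-level element, the Root property avoids depending on its name.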

Second, it was not always obvious how to apply the LINQ approach based on IEnumerable<T> to XML trees. The answer is that queries work over the axes of a tree very much like XPath uses axes.  [1]  Thus, a big part of writing a clean and correct LINQ to XML query is in choosing the best axis to query over.  For example, if you are querying over the <item> elements in an RSS feed, you could do it by brute force by "dotting into" the element hierarchy (in this example, we loaded the XDocument, so you have to include the top level <rss> element) and then enumerating over all the <item> elements:

var feed = XDocument.Load(@"https://blogs.msdn.com/MainFeed.aspx");
var items = from i in feed.Element("rss").Element("channel").Elements("item")
            select i;
foreach (var i in items)
    Console.WriteLine(i.Element("title"));

 Or you could more conveniently use the Descendants() axis (and this will work whether we loaded the XElement or the XDocument):

var feed = XDocument.Load(@"https://blogs.msdn.com/MainFeed.aspx");
var items = from i in feed.Descendants("item")
            select i;
foreach (var i in items)
    Console.WriteLine(i.Element("title"));

The former of these examples illustrates another point of confusion, the distinction between Element() and Elements(), and between Attribute() and Attributes().  The current naming scheme is that the singular form returns the first matching XElement or the only matching XAttribute object; the plural form returns an IEnumerable over all the elements or attributes.  This, to be frank, creates a dilemma: The naming scheme is quite logical and aligned with English semantics, but can be confusing to real humans who don't necessarily see that they've typed "Element" when they meant to type "Elements".  Intellisense can lead people astray here as well; people tend to grab the first option that Intellisense presents, which may not be the one they really need.  We could of course make the method names more verbose and less easily confused, e.g. "Element" vs "AllElements" or "ElementsAxis".  Some of us guess that this verbosity will help people during the learning process, but annoy them for the rest of their careers ... like the much-hated DOM method GetElementsByTagName() does.  We would definitely appreciate some real world feedback here!
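
For reference, here is a short sketch of how the singular and plural forms behave under the current naming scheme. The <channel> fragment and its id and lang attributes are invented for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical fragment with repeated child elements and two attributes.
var channel = XElement.Parse(
    @"<channel id=""c1"" lang=""en"">
  <item><title>First</title></item>
  <item><title>Second</title></item>
</channel>");

// Singular: one object (the first match), or null when nothing matches.
var firstItem = channel.Element("item");
var missing = channel.Element("image");
Console.WriteLine((string)firstItem.Element("title"));  // First
Console.WriteLine(missing == null);                      // True

// Plural: an IEnumerable<XElement> over every match, ready for query operators.
Console.WriteLine(channel.Elements("item").Count());     // 2

// The same split applies to attributes.
Console.WriteLine((string)channel.Attribute("lang"));    // en
Console.WriteLine(channel.Attributes().Count());         // 2
```

Typing Element("item") where Elements("item") was intended still compiles when the result is only passed along; the mistake tends to show up later as a missing second item or a null reference.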

Next time I'll dig more into the challenges that people run into when using functional construction to create or reshape XML, and the power that can be achieved by combining the LINQ operators and functional construction.

 

[1]  OK, maybe it would be better to say that the more you know about imperative XML APIs, the slower you'll learn LINQ to XML .... but knowledge of the more declarative tools such as XPath/XSLT/XQuery is probably a benefit.