LINQ Farm: LINQ to XML and Line Numbers

There are times when it is useful to know the line number of a node in an XML file. This information can be helpful to users, particularly if you want to report an error. It can also be convenient to search for a node by line number, but that is a risky endeavor: documents can be modified accidentally, and their line numbers changed without notice.

This post shows a few fundamentals about working with line numbers in a LINQ to XML program. The code shown in this post is taken from a project called XmlLineNumber. You can download this program from the LINQ Farm on Code Gallery.

Reporting a Line Number

Let’s begin our exploration by detailing a technique for reporting the line number of a node that you have found in an XML file. To get started we need to use code from a class called XObject. As shown in Figure 1, XObject sits at the top of the LINQ to XML class hierarchy.

Figure 1: The core objects in the LINQ to XML class hierarchy

XObject implements an interface called IXmlLineInfo:

public interface IXmlLineInfo
{
    int LineNumber { get; }
    int LinePosition { get; }
    bool HasLineInfo();
}

The LineNumber property of this interface holds the information we want. To put it to work we need only call XDocument.Load with LoadOptions.SetLineInfo:

 XDocument xml = XDocument.Load(fileName, LoadOptions.SetLineInfo);

When you load an XML file into memory with the SetLineInfo member of the LoadOptions enumeration, line numbers are associated with the nodes in your document. The file we are loading is called FirstFourPlanets.xml. It’s a sweet little file that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Planets>
  <Planet>
    <Name>Mercury</Name>
    <Moons/>
  </Planet>
  <Planet>
    <Name>Venus</Name>
    <Moons/>
  </Planet>
  <Planet>
    <Name>Earth</Name>
    <Moons>
      <Moon>
        <Name>Moon</Name>
        <OrbitalPeriod UnitsOfMeasure="days">27.321582</OrbitalPeriod>
      </Moon>
    </Moons>
  </Planet>
  <Planet>
    <Name>Mars</Name>
    <Moons>
      <Moon>
        <Name>Phobos</Name>
        <OrbitalPeriod UnitsOfMeasure="days">0.318</OrbitalPeriod>
      </Moon>
      <Moon>
        <Name>Deimos</Name>
        <OrbitalPeriod UnitsOfMeasure="days">1.26244</OrbitalPeriod>
      </Moon>
    </Moons>
  </Planet>
</Planets>

Here is code that uses the IXmlLineInfo interface to report the line number of a node discovered through a standard LINQ to XML search:

 XText phobos = (from x in xml.DescendantNodes().OfType<XText>()
                where x.Value == "Phobos"
                select x).Single();

var lineInfo = (IXmlLineInfo)phobos;
Console.WriteLine("{0} appears on line {1}", phobos, lineInfo.LineNumber);

This code looks through all the descendants of the root node for nodes of type XText whose value is the word Phobos. It uses the LINQ query operator Single to ensure that the query returns exactly one node. If the query returned more than one result, or none at all, the call to Single would raise an exception, which in this case is the behavior we want. The program then casts the result to IXmlLineInfo, and reports the line number to the user:

 Phobos appears on line 24
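The interface also exposes HasLineInfo and LinePosition, which the example above does not use. Here is a minimal sketch of how they might be combined when building a message for the user. The class name LineInfoDemo and the helper Describe are hypothetical names invented for this illustration, and the XML is an inline string rather than FirstFourPlanets.xml so that the sketch stands alone:

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class LineInfoDemo
{
    // Hypothetical helper: describes where a node came from, if that is known.
    public static string Describe(XObject node)
    {
        // XObject implements IXmlLineInfo, so this conversion always succeeds.
        IXmlLineInfo info = node;

        // HasLineInfo returns false when the document was loaded
        // without LoadOptions.SetLineInfo.
        return info.HasLineInfo()
            ? string.Format("line {0}, position {1}", info.LineNumber, info.LinePosition)
            : "no line information";
    }

    public static void Main()
    {
        const string text =
            "<Planets>\n  <Planet>\n    <Name>Mars</Name>\n  </Planet>\n</Planets>";

        // Same markup, parsed once with line information and once without.
        XDocument withInfo = XDocument.Parse(text, LoadOptions.SetLineInfo);
        XDocument withoutInfo = XDocument.Parse(text);

        Console.WriteLine(Describe(withInfo.Descendants("Name").Single()));
        Console.WriteLine(Describe(withoutInfo.Descendants("Name").Single()));
    }
}
```

The second call reports that no line information is available, because that document was parsed without SetLineInfo. Checking HasLineInfo first avoids reporting a meaningless line number in that case.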

Searching by Line Number

Let's now turn things around and show how to search through an XML file and look for a node by line number. If you glance at the FirstFourPlanets.xml file, you will see that line 21 looks like this:

 <Name>Mars</Name>

Here is code from the XmlLineNumbers sample showing how to search for that node by line number:

 XDocument xml = XDocument.Load(fileName, LoadOptions.SetLineInfo);

var line = from x in xml.Descendants()
           let lineInfo = (IXmlLineInfo)x
           where lineInfo.LineNumber == 21
           select x;

foreach (var item in line)
{
    Console.WriteLine(item);
}

Note that the first line uses LoadOptions.SetLineInfo to ensure that line information is recorded when the document is loaded into memory.

The LINQ query shown here uses Descendants to iterate over the elements in the FirstFourPlanets.xml file. The where filter in the query checks to see if any of those elements has its line number set to 21. It happens that the 15th element returned by the call to Descendants fits that search criterion, and so that node, and that node alone, is found when we foreach over the results.

Notice the cast to convert the XElement nodes returned by the call to Descendants:

 let lineInfo = (IXmlLineInfo)x

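This cast is necessary because XObject implements IXmlLineInfo explicitly, so the members of the interface are not exposed directly on XElement. They can be reached only through a reference of the interface type.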
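If you find yourself repeating that cast, one option is to wrap it in an extension method. The following is only a sketch: GetLineNumber and the class LineInfoExtensions are hypothetical names and are not part of LINQ to XML, and the XML is an inline string so that the example stands alone:

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class LineInfoExtensions
{
    // Hypothetical extension method, not part of LINQ to XML.
    // Returns null when the node carries no line information.
    public static int? GetLineNumber(this XObject node)
    {
        IXmlLineInfo info = node;
        return info.HasLineInfo() ? info.LineNumber : (int?)null;
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        const string text =
            "<Planets>\n  <Planet>\n    <Name>Mars</Name>\n  </Planet>\n</Planets>";

        XDocument xml = XDocument.Parse(text, LoadOptions.SetLineInfo);

        // The query no longer needs a let clause or a cast.
        var line = from x in xml.Descendants()
                   where x.GetLineNumber() == 3
                   select x;

        foreach (var item in line)
        {
            Console.WriteLine(item);
        }
    }
}
```

Returning a nullable int means that a document loaded without SetLineInfo simply matches nothing, since a null line number never compares equal to 3.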

Once again, I want to stress that reporting the line number of a node seems like a reasonable thing to do, but searching for an element by line number is usually not a good idea in production code. For unexplained reasons, code that was on line 532 has a way of migrating to line 533 when you least expect it. In any case, you now know enough to begin working with line numbers in a LINQ to XML program.

Download the source.