More LINQ to XML examples from the real world

A few weeks ago I pulled together a post on LINQ to XML in action. I came across a couple more very nice examples over the weekend.

One is from the LINQ Project forum. A question was posed asking about a clean way to load a structured text file, such as a log file, into an XLinq tree. The example data was similar to this:

#Fields: time ip http-method url status
12:37:18 127.0.0.1 GET /nowhere/gone.xml 404
12:37:25 127.0.0.1 GET /somewhere/what.xml 401

Anders Hejlsberg offered this little snippet, which illustrates how query operations (from, where, select, etc.) are integrated into C# and how functional construction lets you easily build an XML fragment from the bottom up.

  var logIIS =
    new XElement("LogIIS",
      from line in File.ReadAllLines("file.log")
      where !line.StartsWith("#")
      let items = line.Split(' ')
      select new XElement("Entry",
        new XElement("Time", items[0]),
        new XElement("IP", items[1]),
        new XElement("Url", items[3]),
        new XElement("Status", items[4])
      )
    );

The "let" clause allows you to do the split operation once and then refer to the result in subsequent expressions. Here the Entry elements are wrapped in an enclosing LogIIS root element so the whole tree can be serialized as XML text; they could just as well be kept in a list or array.
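To see the serialization step end to end, here is a minimal sketch that runs the same kind of query over the sample lines shown earlier and prints the resulting XML. The helper name BuildLog, the extra Method element, and the length check that skips short lines are my own additions for illustration, not part of the snippet from the forum.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Sample lines copied from the log excerpt above; a real program would
// read them with File.ReadAllLines instead.
string[] sampleLines =
{
    "#Fields: time ip http-method url status",
    "12:37:18 127.0.0.1 GET /nowhere/gone.xml 404",
    "12:37:25 127.0.0.1 GET /somewhere/what.xml 401"
};

// BuildLog is a hypothetical helper name, not part of the original snippet.
static XElement BuildLog(IEnumerable<string> lines) =>
    new XElement("LogIIS",
        from line in lines
        where !line.StartsWith("#")   // skip the #Fields header
        let items = line.Split(' ')   // split once, reuse below
        where items.Length >= 5       // assumption: ignore blank or truncated lines
        select new XElement("Entry",
            new XElement("Time", items[0]),
            new XElement("IP", items[1]),
            new XElement("Method", items[2]),
            new XElement("Url", items[3]),
            new XElement("Status", items[4])));

XElement log = BuildLog(sampleLines);

// Writing an XElement to the console calls ToString(), which serializes
// the tree as indented XML text; log.Save("log.xml") would write a file.
Console.WriteLine(log);
```

Because the query is passed straight into the XElement constructor, each Entry it yields becomes a child of the root without any explicit loop.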

 

Another nice example was inspired by Robert Scoble's request for help figuring out how to process the changes.xml file on weblogs.com to find the pointers to updated blog entries that come from major services:

Here’s what I need:

1) Take my Excel .XLS file (I’ll clean it up and put it into a column for you) and delete all the URLs that don’t come from blogspot.com; wordpress.com; livejournal.com; spaces.live.com; typepad.com

While there are a lot of ways to do this, Steve Eichert came up with a small LINQ to XML program that works off the original XML file on the web (no need to load it into Excel) and does what Scoble asks for ... but wait, there's more! As a bonus, his program uses LINQ's built-in query processing capability to group and count the selected entries:

Output:

  • Blogspot has 8928 sites in the changes.xml file
    • list of sites
  • Spaces has 900 sites in the changes.xml file
    • list of sites
  • Wordpress has 384 sites in the changes.xml file
    • list of sites
  • TypePad has 118 sites in the changes.xml file
    • list of sites

Sure, you could do that grouping and counting in Excel, but LINQ offers this kind of basic searching / sorting / counting capability in a form that is almost as easy to use as Excel macros, and the LINQ to XML extensions let you do this directly on raw XML data. Thanks, Steve, for a great illustration!
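Steve's program isn't reproduced here, but the filter, group and count idea can be sketched roughly as follows. This is not his code: it assumes changes.xml is a weblogUpdates root containing weblog elements with a url attribute, it runs on a tiny made-up sample instead of the live file, and it matches services with a crude substring test.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Made-up sample in the assumed changes.xml shape; a real run might
// load the live file with XElement.Load instead.
XElement changes = XElement.Parse(@"
<weblogUpdates>
  <weblog name='A' url='http://a.blogspot.com/' />
  <weblog name='B' url='http://b.blogspot.com/' />
  <weblog name='C' url='http://c.wordpress.com/' />
  <weblog name='D' url='http://example.org/blog' />
</weblogUpdates>");

// The services from Scoble's list.
string[] services =
{
    "blogspot.com", "wordpress.com", "livejournal.com",
    "spaces.live.com", "typepad.com"
};

var byService =
    from weblog in changes.Elements("weblog")
    let url = (string)weblog.Attribute("url")
    where url != null
    // Crude substring match; parsing the host name would be stricter.
    let service = services.FirstOrDefault(s => url.Contains(s))
    where service != null                 // drop URLs from other hosts
    group url by service into g
    orderby g.Count() descending
    select new { Service = g.Key, Count = g.Count(), Sites = g.ToList() };

foreach (var entry in byService)
{
    Console.WriteLine($"{entry.Service} has {entry.Count} sites in the changes.xml file");
    foreach (var site in entry.Sites)
        Console.WriteLine($"  {site}");
}
```

The group ... by ... into clause does the work of a pivot table here: each group carries its key (the service) plus the matching URLs, so the count and the site list come from the same query.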