What's New in the XLinq CTP?

Avner has blogged about the new XML features added to VB9 in this CTP. I'll do the same for XLinq itself (and one little XML feature in the C# IDE).  There have been a number of relatively small changes:

  • Axes that used the term "content" now use the term "nodes".
  • Methods have been added to make it easier to compare the document order of different nodes.
  • Various properties and methods have been added to make it easier to work with the information in namespace prefixes, document types, and the XML declaration.
  • Attributes can be traversed sequentially in their internal storage order. 

There are also a few more fundamental changes: annotations are supported on XContainer nodes, there is an XNamespace class to simplify the job of working with namespaces, the XText class (text nodes and CData sections) is now public, and there is an XStreamingElement class. Let's look at each of these in some detail.

Annotations

XLinq gives you the ability to associate application-specific information with a particular node in an XML tree. Examples include the line number range in the source file from which an element was parsed, the post-schema-validation type of the element, a business object that holds the data structures into which the XML information was copied along with the methods for working with it (e.g. a real invoice object with data in CLR and application-defined types), and so on.

XLinq accommodates this need by defining methods on the XContainer class that can annotate an instance of the class with one or more objects, each of some unique type. Conceptually, the set of annotations on an XContainer object is akin to a dictionary, with the type being the key and the object itself being the value.

To add an annotation to an XElement or XDocument object:

XElement contact = new XElement(...);
LineNumberInfo linenum = new LineNumberInfo(...);
contact.AddAnnotation(linenum);

where LineNumberInfo is an application defined class for storing line number information. The annotation can be retrieved with:

LineNumberInfo annotation = contact.GetAnnotation<LineNumberInfo>();
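The round trip described above can be sketched as a small complete program. This is only an illustration under stated assumptions: it is written against the System.Xml.Linq API as it later shipped, where the retrieval method is named Annotation<T>() rather than this CTP's GetAnnotation<T>(), and the LineNumberInfo class, its two properties, and the sample element content are hypothetical stand-ins for whatever your application defines.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical application-defined annotation type (not part of XLinq).
public sealed class LineNumberInfo
{
    public int StartLine { get; }
    public int EndLine { get; }

    public LineNumberInfo(int startLine, int endLine)
    {
        StartLine = startLine;
        EndLine = endLine;
    }
}

public static class AnnotationSketch
{
    // Attaches a LineNumberInfo to an element and reads it back by type.
    public static LineNumberInfo RoundTrip(XElement element, LineNumberInfo info)
    {
        element.AddAnnotation(info);
        // Shipped API name; the CTP described in this post calls it GetAnnotation<T>().
        return element.Annotation<LineNumberInfo>();
    }

    public static void Main()
    {
        var contact = new XElement("contact", new XElement("name", "Patrick Hines"));
        LineNumberInfo found = RoundTrip(contact, new LineNumberInfo(12, 15));
        Console.WriteLine("lines " + found.StartLine + "-" + found.EndLine);

        // No annotation of type Uri was ever added, so this lookup yields null.
        Console.WriteLine(contact.Annotation<Uri>() == null);
    }
}
```

Because the lookup is keyed by type, the caller gets back the same object instance it attached, with no string keys or casts involved.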

XNamespace

After listening to many suggestions to better encapsulate XML namespaces, we have added an XNamespace class. XLinq has always tried to simplify XML names by removing XML prefixes from the XML programming API. When reading in XML, each XML prefix is resolved to its corresponding XML namespace. Therefore, when developers work with XML names they are working with a fully qualified XML name: an XML namespace and a local name. In the previous preview of XLinq, developers were asked to use the string representation:

{NamespaceURI}LocalName

This expanded name representation is still supported. For example, the XML namespace https://myCompany.com and the local name contacts can be represented as:

{https://myCompany.com}contacts

In this release, however, the XName class consists of an XNamespace object and the local name. For example, to create an XElement called contacts that has the namespace "https://mycompany.com" you could use the following code:

XNamespace ns = "https://mycompany.com";
XElement contacts = new XElement(ns + "contacts");
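To make the relationship between the two notations concrete, here is a minimal sketch that builds the same qualified name both ways and compares the results. It assumes the System.Xml.Linq API as it later shipped (implicit string conversions on XName and XNamespace); the NamesMatch helper and the sample contact element are hypothetical and exist only for illustration.

```csharp
using System;
using System.Xml.Linq;

public static class NamespaceSketch
{
    // Builds one qualified name two ways and reports whether they are equal.
    public static bool NamesMatch(string namespaceUri, string localName)
    {
        XNamespace ns = namespaceUri;                              // string -> XNamespace
        XName viaNamespace = ns + localName;                       // XNamespace + local name
        XName viaExpanded = "{" + namespaceUri + "}" + localName;  // expanded-name string
        return viaNamespace == viaExpanded;
    }

    public static void Main()
    {
        XNamespace ns = "https://mycompany.com";
        XElement contacts = new XElement(ns + "contacts",
            new XElement(ns + "contact", "Patrick Hines"));

        Console.WriteLine(NamesMatch("https://mycompany.com", "contacts"));
        Console.WriteLine(contacts.Name.LocalName);
        Console.WriteLine(contacts.Name.Namespace == ns);

        // Child lookups need the qualified name too; a bare local name finds nothing.
        Console.WriteLine(contacts.Element(ns + "contact") != null);
        Console.WriteLine(contacts.Element("contact") == null);
    }
}
```

Keeping the namespace in one XNamespace variable means the URI is typed once, which avoids the easy mistake of two expanded-name strings that differ by a character.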

XText

The previous version of XLinq exposed the value of elements and attributes only as strings, or via an explicit cast to some CLR type. We received a lot of feedback that this didn't work well for mixed content and CData sections. To address this, we have exposed the formerly private XText class. Developers who are not working with mixed content and CData sections still don’t have to worry about text nodes in most cases. You can usually work directly with the basic .NET Framework types, reading them and adding them directly to the XML.

Note that whereas DOM explicitly allows adjacent text nodes, the XLinq implementation will always merge XText nodes to correspond with the structure of XML text. This has the benefit that developers never need to check for multiple text nodes that contain a single element’s content. However, it does mean that you cannot rely on the identity of text nodes remaining stable, because they may be merged into adjacent text nodes as edits are applied to the XLinq tree. In general, it is best to ignore the existence of XText nodes unless you are working with mixed content or CData sections. If you must work with text nodes in this CTP version of XLinq, do not re-use them or assume that a reference to a text node will contain the correct data after changes are made to the tree. Note: Yes, we know this is inelegant, and this may change in the next preview of XLinq.

XStreamingElement

Much of the power of LINQ comes from its deferred execution approach. This preview of XLinq adds an XStreamingElement class, which allows you to build a tree of IEnumerable<T> instantiations that will be evaluated "lazily" when they are actually accessed, not "eagerly" up front.

Consider an example where we have an array of instances of some application object of type Contact; we want to serialize the name and address fields in each object in the array to XML. XStreamingElement allows you to do so lazily rather than by creating an XLinq tree and then serializing the tree.

Contact[] contacts = ...;

XStreamingElement s =
    new XStreamingElement("contacts",
        from c in contacts
        select new XStreamingElement("contact",
            new XStreamingElement("name", c.name),
            new XStreamingElement("address", c.address)
        )
    );

s.Save("contacts.xml");

If you used XElement rather than XStreamingElement in this example, the iteration over the contacts array would occur when the constructor was evaluated, and a tree of XElement nodes would get built. Using XStreamingElement, the iteration over contacts is deferred until the Save() method is called. Each XStreamingElement object knows how to save itself to the output stream; it then iterates over its lazy list of children and asks each one to save itself. This avoids the overhead of constructing a tree of XElement nodes, yet produces exactly the same output as the equivalent code using XElement.
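One way to observe the difference in timing is to change the source collection after the element has been constructed but before it is serialized. The sketch below does that, assuming the System.Xml.Linq API as it later shipped; the EagerCount and LazyCount helpers and the list of plain strings (used in place of the Contact objects above) are hypothetical and chosen only to keep the program self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class StreamingSketch
{
    // XElement runs the query inside its constructor, so an item added
    // to the list afterwards is not part of the tree.
    public static int EagerCount(List<string> names)
    {
        var eager = new XElement("contacts",
            from n in names select new XElement("contact", n));
        names.Add("late arrival");
        return eager.Elements("contact").Count();
    }

    // XStreamingElement keeps the query and runs it only when serialized,
    // so the item added afterwards does appear in the output.
    public static int LazyCount(List<string> names)
    {
        var lazy = new XStreamingElement("contacts",
            from n in names select new XStreamingElement("contact", n));
        names.Add("late arrival");
        string xml = lazy.ToString(); // serialization triggers the iteration
        return XElement.Parse(xml).Elements("contact").Count();
    }

    public static void Main()
    {
        Console.WriteLine(EagerCount(new List<string> { "Ann", "Bob" }));
        Console.WriteLine(LazyCount(new List<string> { "Ann", "Bob" }));
    }
}
```

Starting from two names, the eager version reports two contact elements and the lazy version reports three, because only the lazy version sees the name added after construction.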

Pasting XML Literals as C#

One other new feature of note is not in XLinq itself, but in the C# IDE. Recall how Visual Basic 9 has added XML syntax into the language itself to simplify the task of constructing an XML fragment as XLinq objects. Many have asked whether this feature will be added to C# 3.0, and the answer is "no." (That makes some people happy, some sad.) This preview does add what I think is a great compromise: a Visual Studio add-in that turns XML text into XLinq. You can just copy some XML to the clipboard and use "Paste as XElement" on the Edit menu to generate the C# code required to create that XML using the XLinq API.

Check it out!

Please download the LINQ Community Technology Preview and let us know what you think. This is a good time to try it out: it is mature enough and tested enough to be useful for research and prototyping, but it is still under active design and development, so your comments can really affect the direction we take. I'll be explaining some recent decisions and discussing some current design challenges (such as the dilemma of whether or not to merge text nodes) in my own blog, so please join our conversation about how to make XML programming much easier than it is today.

Mike Champion