XLinq Design Issues - What Do You Think?

05/20/2006

With the recent LINQ CTP, XLinq's feature set is getting close to what we plan to release in "Orcas". The whole point of Community Technology Previews, of course, is to get feedback from potential customers about what they like, what they don't like, and what more they need in a product before the design is frozen. Now is definitely the time to let us know what you think: there is some time, but not a lot, to try this out and send feedback. (The target ship date for Orcas is not public, but since one main point of Orcas is to provide tooling for new Windows Vista and Office 2007 features, you might guess that the target is "months, not years" after these ship.)

I'll try to post here a couple of times a week with details on some specific decisions we've made that may be controversial, and to talk about some issues we're still wrestling with. Here's my current TODO list of topics to write about; please let us know what else you might want to see discussed.

Why we are developing yet another XML API.
Who we are targeting and which use cases we are more or less ignoring.
To what extent should we facilitate the traditional document-processing scenarios that use DTDs, entities, default attribute values, ID/IDREF constraints ... and where should we draw the line and say "use DOM if you need that stuff"?
Why it will help you understand LINQ if you come to grips with some rather arcane concepts (such as "lambda expressions" and "query comprehensions").
What to do about text nodes: they have no counterpart in XML per se, and hiding them completely doesn't work if you have mixed content or CDATA sections, but does making them first-class citizens of XLinq take us down into the swamp where DOM foundered?
The in-progress design of an event / notification model so that GUI apps can synch up their view with changes to the XLinq tree.
The "Halloween Problem" (as it is known in the database world): LINQ-style lazy evaluation and imperative data manipulation don't play well together.
How to extend LINQ's intrinsic laziness so that users can process large documents via XLinq without having to wait for the entire document to load, yet with minimal additions or changes to the API.

Note that one topic which has generated some discussion is not on the list: there definitely will not be XML literals in C# 3.0 the way there are in Visual Basic 9. There is what I consider a very nice compromise, a "paste XML as XLinq functional constructors" feature implemented as a Visual Studio add-in. There is a preview in the May CTP, so let us know if you agree.

Finally, I'd appreciate any thoughts you have on a meta-issue: XLinq is aimed more at mainstream developers who need to work with straightforward XML simply yet efficiently, and less at XML specialists who already know SAX, DOM, XSLT, etc. and tend to focus on the hard problems and corner cases. The problem is that this target audience doesn't read XML-focused blogs or mailing lists, and may not think much about XML at all until they have a project to work on. On the other hand, we get all sorts of input from "Einstein" developers (especially within Microsoft) along the lines of "if you only did X, we could do Y a lot better than we can with DOM or XmlReader/XmlWriter". We worry that accommodating all these (perfectly reasonable) requests could end up making XLinq into another DOM (but with a reasonable naming scheme and native support for C#), not something that ordinary mortals can figure out with a little knowledge of XML basics and a lot of help from IntelliSense. Ideally, simple things will be simple, and complex things will be possible by extending the API (via subclassing, the annotation mechanism, and using XmlReader/XmlWriter/XPathNavigator as appropriate), but the more "smarts" we put in, the harder it is to reproduce the behavior in subclasses. Lots of devils lurk in the details here, and your suggestions about how to exorcise them would be very welcome.
