Typed XML programmer -- What are your pains today?

This post continues the “Typed XML programmer” series of blog posts. This time, let’s ponder ‘the 1st generation of typed XML programming’. What are its imperfections? (Thanks for your feedback so far. In particular, I loved Bill's thoughts on ‘XML as a true first-class citizen’ and the behavior that is to be associated with such ‘XML in objects’. Bill, do not spoil my story any further; I am getting there!)

 

Definitions of typed XML programming

 

Let me recall my simple definition of typed XML programming that I offered previously: “Typed XML programming is XML programming in mainstream OO languages like C#, Java and VB while leveraging XML schemas as part of the XML programming model”. With an eye on what people are doing with XML and objects today, this definition may cover two rather different problems:

 

  • Object serialization: represent objects as XML ‘on the wire/disk’.
  • XML data binding: map XML schemas (canonically) to object models.

 

In the case of object serialization, XML data and XML types play no key role in the programming model. Object serialization is a difficult and important topic, but it is not really about “XML programming”. In the case of XML data binding, the XML schema describes the primary data model, and a tool derives an object model to be used by OO programmers to operate on XML data. (I readily admit that there exist scenarios that are neither clear-cut object serialization nor XML data binding, but this post is getting too long anyway.)

 

So I propose to work with the following definition:

 

(1st generation of) Typed XML programming

= XML data binding + OO programming

= OO programming on schema-derived object models

 

Most readers of this post will know the notion of ‘XML data binding’ pretty well, but let me give my own summary. The typical XML data binding technology performs a ‘canonical’ X-to-O mapping, i.e., XML types are systematically mapped to object types without involving domain knowledge of the data model or the programming problem at hand. In simple terms, schema declarations are mapped to OO classes; the structure of each content model is more or less preserved by the associated class with its members corresponding to element and attribute declarations -- modulo imperfections. Here are two XML data binding resources that I want to point out: (i) Bourret’s excellent web site on XML data binding including an impressive list of technologies; (ii) McLaughlin’s (somewhat dated but still valuable) book on (Java-biased) XML data binding.

 

For illustration, here is an XML schema for purchase orders:

 

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="order">

    <xs:complexType>

      <xs:sequence>

        <xs:element ref="item" maxOccurs="unbounded"/>

      </xs:sequence>

      <xs:attribute name="id" type="xs:string" use="required"/>

      <xs:attribute name="zip" type="xs:int" use="required"/>

    </xs:complexType>

  </xs:element>

  <xs:element name="item">

    <xs:complexType>

      <xs:sequence>

        <xs:element name="price" type="xs:double" />

        <xs:element name="quantity" type="xs:int" />

      </xs:sequence>

      <xs:attribute name="id" type="xs:string" use="required"/>

    </xs:complexType>

  </xs:element>

</xs:schema>

 

My favorite XML data binding technology derives the following object model:

 

public class order {

    public item[] item;

    public string id;

    public int zip;

}

public class item {

    public double price;

    public int quantity;

    public string id;

}

 

In the canonical mapping at hand, repeating particles (cf. maxOccurs="unbounded") are mapped to arrays; XML Schema’s built-in simple types (such as type="xs:double") are mapped to reasonable counterparts in the C#/.NET type system. Element particles are generally mapped to public fields (or to public properties that trivially access private fields).
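
To make this concrete, here is a minimal sketch of ordinary OO code over the schema-derived classes shown above (my own illustration; the values are made up):

  // Plain OO programming on the schema-derived object model.
  using System;

  class Demo {
      static void Main() {
          // Construct a purchase order via C# 3.0 object initializers.
          var o = new order {
              id = "4711",
              zip = 98052,
              item = new[] {
                  new item { id = "23", price = 42.0, quantity = 2 },
                  new item { id = "24", price = 3.5, quantity = 10 }
              }
          };

          // Plain OO traversal: compute the order total.
          double total = 0;
          foreach (var i in o.item)
              total += i.price * i.quantity;

          Console.WriteLine("Total for order {0}: {1}", o.id, total);
      }
  }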

 

If such mappings scaled well for all of XML and all of XSD, then typed XML programming (as of today) would be just fine. Typed XML programming would be OO programming where the object models were ‘accidentally’ described in the XSD notation.

 

Imperfections of the 1st generation

 

The typical XML data binding technology comes with the (potentially implicit) goal of ‘hiding XML from the developers’ and ‘allowing them to work with familiar objects instead’. As we will discuss now, such a goal may be hard to achieve. So let me collect some of the pain points that I have encountered in typed XML programming of the 1st generation.

 

Pain point: ‘The failure of OO static typing’

 

It is clear that plain DOM-style XML programming makes it (too) easy to construct invalid content or to disobey the schema constraints in queries and DML code. Unfortunately, (contemporary) schema-derived object models do not make all these problems go away. Let’s consider a trivial example for the schema of purchase orders and its associated object model (both shown above). Your application constructs a purchase order, to be sent, as an XML message, to a supplier. Using the API of the object model for orders, the XML message is put together with ordinary OO idioms of object construction. For instance, we may new up an order item using expression-oriented object initialization (as provided by C# 3.0):

 

new item { id = "23", price = 42 }

 

The code type-checks fine against the object model shown above, even though the quantity element, which the underlying schema requires, is missing. After sending the message over the wire, suddenly, validation fails or functionality throws on the receiving end. You did not perform separate (sender-side) XSD validation because of performance considerations as well as trust in statically typed OO programming.
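
For completeness, here is a minimal sketch of such receiver-side validation in .NET; it assumes that the purchase-order schema and the received message are stored as "order.xsd" and "order.xml" (file names made up for illustration):

  // Receiver-side XSD validation catches what OO static typing did not.
  using System;
  using System.Xml;

  class Receiver {
      static void Main() {
          var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
          settings.Schemas.Add(null, "order.xsd");
          settings.ValidationEventHandler += (sender, e) =>
              Console.WriteLine("Validation error: " + e.Message);

          using (var reader = XmlReader.Create("order.xml", settings))
              while (reader.Read()) { } // reports, e.g., the missing <quantity> element
      }
  }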

 

The example serves as a placeholder for a class of problems: OO static typing and XML/XSD do not blend very well. Most notably, you need to cope with wildcards, various forms of constraints, simple-type facets, and so on. Add to this that ‘insistence on valid content’ may be impractical for certain software architectures that leverage XML.

 

Pain point: ‘Unwieldy object models’

 

Data modeling for XML isn’t object modeling … so how could anyone expect that XML data binding gives you object models that look like those that a reasonable OO designer would model in the first place? Consequently, (contemporary) typed XML programmers tend to acknowledge that schema-derived object models are unwieldy to work with.

 

For instance, how would you model an ‘expression form for addition’? It depends on whether you are doing this for XML or OO (or BNF or EBNF or ASDL or ASN or Haskell or what have you). With one of my XSD hats on (the one that is not afraid of substitution groups), I model the expression form for addition as follows:

 

  <xs:element name="add" substitutionGroup="exp">

    <xs:complexType>

      <xs:complexContent>

        <xs:extension base="exp">

          <xs:sequence>

            <xs:element ref="exp"/>

            <xs:element ref="exp"/>

          </xs:sequence>

        </xs:extension>

      </xs:complexContent>

    </xs:complexType>

  </xs:element>

 

With my plain OO hat on, I instead model addition as follows:

 

  public class Add : Exp {

      public Exp left;  // Let’s give a name to this guy.

      public Exp right; // ... this one, too ...

  }

 

By contrast, the typical XML data binding technology maps the above schema as follows:

 

  // Unwieldy object model ahead …

  public class Add : Exp {

      public Exp exp;  // Sigh! How would the tool know a better name?

      public Exp exp2; // Uuh! The tool applies name mangling.

  }
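
To see why the mangled names hurt in actual code, consider a tiny consumer of this model. The Exp base class and the Lit leaf below are hypothetical additions of mine, just enough to make the fragment compile against the schema-derived Add class above:

  // Even a trivial fold over the schema-derived model reads poorly,
  // because 'exp' and 'exp2' carry no domain meaning.
  public class Exp { }
  public class Lit : Exp { public int value; }

  public static class Evaluator {
      public static int Eval(Exp e) {
          if (e is Lit) return ((Lit)e).value;
          var a = (Add)e;
          return Eval(a.exp) + Eval(a.exp2); // which one is the left operand?
      }
  }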

 

Here is a list of XSD patterns that regularly imply unwieldy object models:

 

  • Nested element declarations.
  • Complex-type derivation by restriction.
  • Nested content models (anonymous compositors).
  • Use of separate ‘symbol namespaces’ for types and elements.
  • Content models with recurring element names (as seen above).
  • Coupled definition of types and elements for the use in substitution groups.
  • … please continue here …

 

Essentially, we are talking about the ‘type dimension’ of the infamous ‘X/O impedance mismatch’. If you want to see a more substantial discussion of such mapping problems, here is a shameless plug for my paper on “Revealing the X/O impedance mismatch”. If you wonder whether the various XSD complications are actually encountered in the real world, yes they are, and here is another plug for my paper on the “Analysis of XML schema usage”. As Andrew Farrell points out in his longer comment on my first post in this series, many technologies “can't cope with the more complex aspects of the XSD standard”. I very much agree. I find it important to distinguish between “can’t yet cope” (subject to engineering efforts) and “really can’t conservatively cope” (i.e., without ‘out-of-the-box’ thinking). How do you possibly cope with the problem in the addition example above? One desperate answer is ‘by customization’. However, customization is arguably a pain point in its own right, and I hope to get back to it later in this series.

 

Pain point: ‘Unbridled, ignorant objects’

 

Talk to a computer scientist and you may hear: “What’s the problem? Trees are degenerate graphs … so C# objects are clever enough to hold XML data.” Thanks for making my argument! So if my plain, schema-derived object model copes with general object graphs, how do I know that, at serialization time, I can make a tree from it? Also, at de-serialization time, how do I stuff XML comments, PIs and interspersed text into the object graph? Furthermore, I loved the parent axis in my DOM code; why am I supposed to live without parent and friends in the wonderful world of typed XML?

 

Here goes an example. Your application transforms a given in-memory representation of an XML tree, say an abstract-syntax tree for your favorite programming language. Using the schema-derived object model for the XML-based abstract syntax, the transformation is encoded in terms of basic OO idioms for imperative object manipulation. Suddenly, serialization throws because the DML operations have created a true object graph with cycles and sharing. For instance, let’s wire up an object graph for an Add expression such that serialization is going to throw:

           

  Add a = new Add();

  a.left = a;  // Sigh!

  a.right = a; // Uuh!

  XmlSerializer serializer = new XmlSerializer(typeof(Exp));

  serializer.Serialize(myWriter, a); // Throw! (myWriter is some XmlWriter obtained elsewhere.)

           

Another example follows. You are facing a ‘configuration file scenario’ in your application for which you have used DOM-style XML programming so far. You decide to switch to typed XML programming -- so that you are better prepared for future evolution of the configuration files on the grounds of static type checking. Once you deploy your object model, you hit sample data that comprises XML comments and processing instructions. Some components actually rely on these bits. It turns out that the chosen XML data binding technology neglects all such XML-isms. So you are hosed. For instance, suppose you are working on Visual Studio automation, rewriting your project files from plain VS 2005 to LINQ preview projects. A snippet of the given “.csproj” file looks as follows:

           

<Import Project="$(MSBuildBinPath)\Microsoft.CSharp.targets" />

<!-- To modify your build process, proceed as follows.

     Add your task inside one of the targets below.

     Then uncomment it.

     Other similar extension points exist.

     See Microsoft.Common.targets.

<Target Name="BeforeBuild">

</Target>

<Target Name="AfterBuild">

</Target>

-->

      

Your typed XML rewriting functionality only aims to replace the “.targets” line.

Only later do you find out about a bonus feature: the XML comments were eradicated.

      

<Import Project="$(ProgramFiles)\LINQ Preview\Misc\Linq.targets" />
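
To see the comment loss in isolation, here is a minimal, self-contained sketch; the tiny config class is hypothetical (it merely stands in for a schema-derived project-file model), and the round trip goes through .NET’s XmlSerializer:

  // A round trip through a schema-derived object model has no place for comments.
  using System;
  using System.IO;
  using System.Xml.Serialization;

  public class config {
      public string import; // stands in for the Import information
  }

  class RoundTrip {
      static void Main() {
          var xml = @"<config>
                        <!-- do not remove the line below! -->
                        <import>Microsoft.CSharp.targets</import>
                      </config>";

          var serializer = new XmlSerializer(typeof(config));
          var c = (config)serializer.Deserialize(new StringReader(xml));

          c.import = "Linq.targets";            // the intended rewrite
          serializer.Serialize(Console.Out, c); // the comment is gone
      }
  }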

 

Pain point: ‘Cognitive X/O overload’

 

In my experience, typed XML programmers tend to have larger displays (21 inches and up) than their untyped fellows. Why is that? I can only guess. A simplistic guess would be that the mere size of XML schemas (when compared to EBNF or Haskell) calls for grown-up displays. I don’t really buy this argument because even untyped XML programmers may need to inspect the XSD contract when coding. Hold on, why would typed XML programmers bother with the XML schema given their schema-derived object model? There is the problem! Without going into detail here, it seems that the typical schema-derived object model and its integration into the normal OO programming workflow are insufficient for understanding the XML domain at hand and for designing solutions to XML programming problems. Instead, typed XML programmers of the 1st generation tend to consult different resources pseudo-simultaneously:

 

1) A sample XML file.

2) An XML schema.

3) A schema-derived object model.

4) Potentially also some generated documentation.

5) … Anything else? …

 

Even though I am a ‘static typing extremist’, I can see that this scattered approach may lead to cognitive overload. By contrast, have a look at Eric White’s post on “Parsing WordML using XLinq”. He is dealing with XML data of a non-trivial kind (in terms of the underlying XML Schema); yet untyped XML programming, merely based on item (1) above, scales pretty well. Now throw the XML schema for WordML at your favorite XML data binding technology, and then try to recode Eric’s functionality in a typed manner. Results depend on the chosen technology and your personal algesthesia (i.e., your ability to sense pain).
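
For a flavor of that untyped, sample-driven style (applied to the small purchase-order document from above rather than WordML), here is a minimal sketch; it uses the shipped LINQ to XML API rather than the preview-era XLinq, and it assumes the document is stored in "order.xml":

  // Untyped querying needs nothing but the sample XML itself.
  using System;
  using System.Linq;
  using System.Xml.Linq;

  class UntypedQuery {
      static void Main() {
          var order = XElement.Load("order.xml");
          var total = order.Elements("item")
                           .Sum(i => (double)i.Element("price") * (int)i.Element("quantity"));
          Console.WriteLine("Order {0} totals {1}.", (string)order.Attribute("id"), total);
      }
  }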

 

Over to you …

I am sure there are further problems that one could identify.

In fact, I hope to get some feedback from the readers of this post. 

 

  1. What is the typed/untyped ratio in your sovereign territory?
  2. When to leverage an XML data binding technology?
  3. What are the show stoppers for putting such technologies to work?

 

In my next post of this series, I plan to talk about “the scale of typing” in XML programming. On this scale, I will identify two candidate pain killers for the pains of the 1st generation.

 

Regards,

Ralf Lämmel