Authoring XML using DSLs

Recently, a good colleague of mine (you know who you are, Tim) asked me what steps he should take to build an XML authoring solution based upon the XSD schemas his company already has. The objective was to create XML instances (of a well-known schema) that describe some business thing, but with a better authoring experience than writing angle brackets, since most of the folks who need to do this in his company don't really have XML skills, or sufficient knowledge of the XSD schemas, to accurately construct one of these XML instances in its entirety. XML is just the format the technical staff chose to represent this data, for other technical reasons, so why should the people who need to create these business things have to care about XML or XSD? They sure know what these business things are, though, and how to define them in higher-level terms.

Clearly, this is not an uncommon scenario and, in principle at least, it should not warrant a massive development effort to create a simple XML authoring tool dedicated to just this one XSD schema. You might argue that they could just use XML Notepad or some other XML editor, but this is really not the kind of solution that non-developer business professionals would be very comfortable or productive with.

To be fair to Tim, he already knew what solution to go for, of course, but wanted some reassurance about how quickly it could be done, and what tasks his team would need to perform, to leverage some of the new tools to create this kind of solution. He wanted a nice, dedicated, 'smart', simplified graphical designer that he could put in front of his business-oriented people to help guide them through the authoring process, without them ever having to touch (or care about) the underlying XML, how it is defined, or how it gets completed, with all its intricacies. In short, he wanted a simple GUI designer to define the main aspects of the business thing the XML describes, and to spit out fully compliant XML for other tools to process later.

Well, if you were confronted with this problem a few years back, you would likely have proposed creating a new dedicated web solution, a convoluted XSLT-type solution, or some desktop application or tool (or perhaps even an InfoPath solution), or told Tim to go use XML Notepad or some such generic XML authoring tool. But today, I am sure (I hope) most of you would feel comfortable suggesting that he go build a domain-specific language with the DSL toolkit.

What makes building a DSL for this kind of 'brown-field' solution very interesting is that you already effectively have a domain model defined in your XSD, which is a great starting point.

'Brown-field' is a term used to describe scenarios where there are existing solutions already implemented, and where you are trying to introduce a new technology or solution to help complete or enhance them. It's almost the opposite of 'green-field' scenarios, where you have the freedom to build any solution by any means, and are unconstrained by existing tools, technology or solutions.

The XSD provides most of the information needed to describe the model and the types (domain classes, external types, inheritance, etc.) it needs. Once you have defined your domain model, of course, what you show in a DSL designer does not have to reflect every domain class and relationship from your XSD. After all, you may want your designer to provide some abstraction of the thing defined in the XSD, and have the designer fill in or imply parts of the XML for the user, based on defaults or on other settings and constraints in the XSD. This ensures that the thing defined (in XML) is always valid and complete with respect to the XSD.
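To make that a little more concrete, here is a rough sketch (in Python, purely for illustration; it has nothing to do with the DSL Tools API) of how much of a domain model you can read straight off an XSD. The schema, the `candidate_domain_classes` helper and the mapping rules (named complex types become candidate domain classes, attributes become properties, child elements become embedded children) are all my own assumptions for this example, and a real schema would need far more careful handling.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema, invented for this example only.
SAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Order">
    <xs:sequence>
      <xs:element name="Line" type="OrderLine" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string" use="required"/>
    <xs:attribute name="currency" type="xs:string" default="USD"/>
  </xs:complexType>
  <xs:complexType name="OrderLine">
    <xs:attribute name="sku" type="xs:string" use="required"/>
    <xs:attribute name="quantity" type="xs:int" default="1"/>
  </xs:complexType>
</xs:schema>
"""


def candidate_domain_classes(xsd_text):
    """Map each named top-level complexType to a candidate domain class.

    Assumed mapping: attributes -> properties (with type and default),
    nested elements -> embedded children (child name -> child type).
    """
    root = ET.fromstring(xsd_text)
    classes = {}
    for ctype in root.findall(f"{XS}complexType"):
        properties = {
            attr.get("name"): {
                "type": attr.get("type"),
                "default": attr.get("default"),
            }
            for attr in ctype.findall(f"{XS}attribute")
        }
        children = {
            elem.get("name"): elem.get("type")
            for elem in ctype.iter(f"{XS}element")
        }
        classes[ctype.get("name")] = {
            "properties": properties,
            "children": children,
        }
    return classes


classes = candidate_domain_classes(SAMPLE_XSD)
for name, info in classes.items():
    print(name, sorted(info["properties"]), info["children"])
```

Note that the defaults declared in the schema come along for free, which is exactly the kind of information a designer could use to fill in parts of the XML on the user's behalf.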

Once you have your domain model fully defined, and a designer defined to graphically represent it, it's simply a case of generating the XML from this DSL. You can do this in a text template, or by applying an XSLT transform to the DSL file; both are common approaches. But there is a better technique for cases like this.

One little-known fact about the DSL toolkit is that you can extend it to control exactly how the XML describing an instance of your DSL is persisted.

In case you haven't noticed, the instances of your DSL are persisted as XML, in a logical format that reflects the domain model of your DSL. In fact, each DSL project also creates an XSD for you that defines the format of this persisted XML. Open one of your DSL instance files in a text editor, like Notepad, and see this XML for yourself.

The DSL toolkit provides some simple built-in control over how the elements and attributes of the domain model are serialized to your DSL file. This is limited to the naming of the elements and attributes, and basic XML formatting. But the DSL tools also support another level of customization that enables you to completely change how these files are rendered, in any XML format.

Through the use of custom serializers, you can render your XML in any format you wish, as long as that format is loss-less enough for your DSL to be serialized and de-serialized without loss of data. The DSL tools team have an example of this in chapter 6 of their new book 'Domain-Specific Development with Visual Studio DSL Tools', along with a bunch of guidelines you should follow for custom serialization.
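The 'loss-less' condition is the important part, so here is a small sketch of what it means in practice. This is Python rather than the actual DSL Tools serialization code (which I am not reproducing here), and the `Order`/`OrderLine` model, the `save` and `load` functions and the XML shape are all made up for illustration. The point is only the round trip: whatever format you persist to, loading it back must give you the same model.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


# Hypothetical in-memory model, standing in for a DSL's domain classes.
@dataclass
class OrderLine:
    sku: str
    quantity: int = 1


@dataclass
class Order:
    id: str
    currency: str = "USD"
    lines: list = field(default_factory=list)


def save(order):
    """Write the model in the (assumed) target XML format."""
    root = ET.Element("Order", {"id": order.id, "currency": order.currency})
    for line in order.lines:
        ET.SubElement(
            root, "Line", {"sku": line.sku, "quantity": str(line.quantity)}
        )
    return ET.tostring(root, encoding="unicode")


def load(xml_text):
    """Read the same format back, applying defaults for omitted attributes."""
    root = ET.fromstring(xml_text)
    lines = [
        OrderLine(elem.get("sku"), int(elem.get("quantity", "1")))
        for elem in root.findall("Line")
    ]
    return Order(root.get("id"), root.get("currency", "USD"), lines)


original = Order("A-100", "EUR", [OrderLine("widget", 3), OrderLine("gadget")])
xml_text = save(original)
restored = load(xml_text)

# The round trip must not lose anything the model cares about.
assert restored == original
print(xml_text)
```

If the target format cannot hold something your model needs, the round trip breaks, and that is the signal that the format is not loss-less enough for your DSL.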

Knowing all this, it's now conceptually trivial to build a new DSL based upon an existing XSD, and provide a graphical designer to author new XML instances of this XSD. You then persist your DSL XML in the format defined by your XSD, and skip the additional transformation step, since the model file is itself the XML it represents. In this way, you are effectively creating a new dedicated graphical editor in Visual Studio specifically for editing your types of XML files, and these files are still valid XML instances that can be consumed by other tools and processes.

In practice though, it can be quite tedious to define all the domain classes and relationships from an existing XSD by hand (using swivel-chair integration), especially if the XSD is large. It will also be a little technically challenging, since the DSL domain meta-model will not be able to support all the constructs and relationship types that can be defined in XSD. There will have to be limited coverage of what can be modelled from the XSD, and graceful handling of (or workarounds for) the areas that can't.
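One cheap way to gauge how much of a given XSD falls into the 'can't easily model' bucket is to survey it first. The sketch below (Python again, for illustration only) counts occurrences of a handful of XSD constructs. Be warned that the `HARD_TO_MODEL` list is purely my guess at the kinds of constructs likely to need workarounds; it is not a statement of what the DSL meta-model does or doesn't support, and the schema and the `survey` helper are invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumption: an illustrative list only, not an authoritative account of
# what the DSL domain meta-model can or cannot represent.
HARD_TO_MODEL = {"choice", "any", "anyAttribute", "union", "redefine"}

# Hypothetical schema, invented for this example only.
SAMPLE_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Payment">
    <xs:choice>
      <xs:element name="Card" type="xs:string"/>
      <xs:element name="Invoice" type="xs:string"/>
    </xs:choice>
    <xs:anyAttribute/>
  </xs:complexType>
  <xs:complexType name="Note">
    <xs:sequence>
      <xs:any minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def survey(xsd_text):
    """Count the schema constructs that appear in HARD_TO_MODEL."""
    root = ET.fromstring(xsd_text)
    counts = Counter()
    for node in root.iter():
        if node.tag.startswith(XS):
            local_name = node.tag[len(XS):]
            if local_name in HARD_TO_MODEL:
                counts[local_name] += 1
    return counts


report = survey(SAMPLE_XSD)
for construct, count in sorted(report.items()):
    print(f"{construct}: {count}")
```

A short report like this at least tells you up front where the swivel-chair work will need the most thought.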

It's a shame we don't have any tools to import XSD files as a starting point for a new DSL, or to import parts of an existing XSD into an existing domain model (imagine a 'New DSL Solution from XSD' command). Such a tool would pre-populate a domain model reflecting the XSD, and generate the custom serializers needed to persist the DSL model as XML conforming to the XSD schema.

However, I am hoping that someone eager in the DSL community will take it upon themselves to create a 'DSL Import XSD PowerToy' extension that provides this for others to leverage (perhaps as a new CodePlex project?). It would certainly help a lot of people get started creating DSLs for existing solutions based upon XSDs, and provide a very quick and easy means of creating graphical designers for XML authoring against any existing XSD schema.