The “a ha” moment.


Don Box is looking for an “a ha” moment on why he should use SqlXml.  Unfortunately, he probably won’t find it while building his new Blog engine. 


The reason for this is that the real usefulness of SqlXml shows itself in development scenarios where the developer does not own the database schema and/or the shape (XSD) of the desired Xml.  SqlXml is all about translating relational data to and from Xml and handling the translation of Xml features (e.g. hierarchical queries, open content, nested relationships) that are anywhere from moderately hard to nearly impossible to represent as Sql operations against a relational database. 


A typical example of this is the SqlXml Xml Bulkload feature.  Numerous SqlXml users are told “our partners are going to send us a 100 MB Xml document representing new sales information on a daily basis.  Here is an XSD that defines the feed.  Figure out how to get the Xml into our reporting database.”  The possible problems here:



  • The Xml feed may have attributes and elements which do not have appropriate columns or fields in the database.

  • The Xml feed may have open content that needs to be preserved and could change without warning.

  • The XSD is highly hierarchical, while my reporting database is fairly denormalized.

  • My database uses identity values.  These identity values need to be generated upon inserting the data and propagated as primary keys based on the hierarchy of the Xml feed.

With SqlXml, I can easily solve that scenario with a few lines of code and by annotating the XSD with mapping information.  Sure, I could write the code myself using SqlClient and an XmlReader – and if the Xml feed were very flat with little open content, that is what I might be inclined to do.  The same analogy exists for querying and generating Xml.  If my database looked quite similar to my desired Xml, I would write the Sql queries myself and do my own markup.  But if the two differed and I wanted to write some fairly sophisticated XPath statements, I would let SqlXml handle the translation. 
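To make the mapping idea concrete, here is a minimal sketch of what an annotated XSD for a feed like this might look like.  The table and column names (Sales, OrderId, Amount) are invented for illustration; the sql:relation and sql:field attributes are the actual SQLXML mapping-schema vocabulary:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
  <!-- Each <Order> element in the feed maps to a row in the Sales table -->
  <xsd:element name="Order" sql:relation="Sales">
    <xsd:complexType>
      <!-- Xml attributes map to columns via sql:field -->
      <xsd:attribute name="id"     type="xsd:int"     sql:field="OrderId"/>
      <xsd:attribute name="amount" type="xsd:decimal" sql:field="Amount"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

The Bulkload component reads this schema alongside the Xml feed and generates the inserts itself; parent/child hierarchies (including identity propagation) are described declaratively with sql:relationship annotations rather than hand-written code.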


If I were writing a Blog engine from scratch where I owned the schema of a single-writer, multiple-reader relational data store, I would probably develop a fairly denormalized schema.  All access to the database would be via standard Transact-SQL queries with manual markup of the results.  The real advantage here is that the database schema could closely match the desired Xml shape, so doing the translation would be fairly trivial and optimized for my needs.  This is why I don’t think Don is going to get any great epiphany if he tries to use SqlXml to develop his Blog engine – in fact, without knowing the details of his design, I would still guess that SqlXml is probably overkill.
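For contrast, the query side of SqlXml looks roughly like this with the SQLXML managed classes (SqlXmlCommand lives in Microsoft.Data.SqlXml).  This is only a sketch against a live SQL Server: the connection string, schema file name, and XPath expression are placeholders I have made up, not anything from the post:

```csharp
using System.Xml;
using Microsoft.Data.SqlXml;

class XPathQuerySketch
{
    static void Main()
    {
        // Let SQLXML translate an XPath over the annotated schema
        // into SQL and stream the result back as Xml.
        SqlXmlCommand cmd = new SqlXmlCommand(
            "Provider=SQLOLEDB;Server=(local);Database=Reporting;Integrated Security=SSPI");
        cmd.CommandType = SqlXmlCommandType.XPath;
        cmd.SchemaPath  = "SalesFeed.xsd";          // the annotated XSD (hypothetical file)
        cmd.RootTag     = "Orders";                 // wrapper element for the result
        cmd.CommandText = "Order[@amount > 100]";   // hypothetical XPath query

        using (XmlReader reader = cmd.ExecuteXmlReader())
        {
            while (reader.Read()) { /* consume the generated Xml */ }
        }
    }
}
```

The point of the sketch is the division of labor: the developer writes XPath against the Xml shape, and SqlXml owns the translation to Transact-SQL and the markup of the results.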


So, the real overriding question here is when to use SqlXml and when to use SqlClient.  And with SqlXml becoming a first-class data access framework in the Whidbey release of .NET – the question becomes even more interesting.  Mark Fussell has developed a nice document outlining our recommendations for data access in Whidbey.  (It is worth reading, but keep in mind it is a work in progress and will be updated and expanded based on Whidbey feedback.)   IMHO, the new data access options (a first-class SqlXml framework and ObjectSpaces) are making SqlClient (using Transact-SQL statements) a low-level interface for cases where fine-grained control and performance are required, or the features of SqlXml or ObjectSpaces are not.  Are most users ready to accept that, and have we made the higher-level data access options powerful enough to enable users to solve real problems exclusively using those abstractions?  The answer to the first question is probably no – getting developers to give up power is not an easy task.   I don’t know the answer to the second question, but I worry about it all the time.




Comments (6)

  1. Anonymous says:

    What about XQuery? Somehow I doubt Don would like to mark up XML manually.

  2. Anonymous says:

    The project I am working on could not use ObjectSpaces, as we need to support customers with Oracle.

    I have only worked on one project that could have used ObjectSpaces; everywhere else I have worked has been an ISV. At an ISV you have to keep your options open – otherwise, what do you do when you get a big order and the customer’s IT will not let the department that placed the order use SQL Server?

    (As the customer’s corporate policy is to use another database. IT people like having Oracle-only rules, as they can get paid more as an Oracle DBA, and if they allowed SQL Server, there might not even be a job for a DBA.)

    Much as I like SQL Server and would rather be working on projects that are SQL Server only, I have to leave the option open for a low-cost port to Oracle.

    ringi at bigfoot dot com

  3. Anonymous says:

    I agree, ObjectSpaces seems like a half-baked solution. Here’s waiting for version 3.0.

    XQuery seems a lot more promising, and works with web services.

  4. Anonymous says:

    Used Bulk Upload a lot in my last project, and we even had full control over the database. We never had full control over the Xml OR the Schemas, however, which came from a separate source. We Xslt’d the Xml to a format that was valid against our annotated Schemas and updated the database. My view is that sometimes the Schema just doesn’t reasonably map to the database, and although you could use XPath etc. you can avoid the programming effort if you go along the declarative route with Xml Schema – especially if you are in a situation where the client has your software installed, but you need to update the database – update the Xml, the Schema and the database and drop them in. No compilation/re-installations required.

    Even with a blog engine you can imagine the situation where the RSS is extended and you suddenly want to add a new column to your database. Sure you can use other techniques, but being able to just annotate a schema makes it Sooo much easier!