InfoPath and SQL Server 2005: Intro

SQL Server 2005, also known as Yukon, comes with greatly improved XML support. Just look at it: native XML columns - store XML blobs in your records, along with other data types. These XML columns can be typed or untyped; typed here refers to XSD typing, which provides backend data validation. If an XML blob that you're trying to put into an XML column doesn't comply with the schema associated with the column, the insert will fail. Cool, huh?

There's more cool to it. Meet XQuery - a query sub-language of SQL that can be used to query your XML columns. What does this mean to you? It means that you can do a structure-aware search through XML.

Imagine: you need to create a training management system. It's a system with a non-trivial schema - each employee will need to take some mandatory courses, there will be education fields for their secondary education... One of the reports you must generate from the system is "show all employees that graduated before year X". How would you have done this yesterday?

You could create your InfoPath form against a relational database. this would mean that you'd need to start with ERD design (looong... and messy - not all XML structures translate nicely into relational systems: think of choice, recursive, optional sections). Report generation would have been easy afterwards: just do "SELECT employee_id FROM employees WHERE ...". You all know SQL, no need to go too deep here. But recognize the pain you would have to go through with this approach: ERD design, shredding of the XML structures into relational structures. Plus, if you need to make changes to the schema, this becomes painful: non-trivial changes to relational systems are tough.

You could create an employee profile management form in InfoPath, and store the resulting XML blobs in SharePoint. You'd pull in basic employee data - names, positions, hire dates - from your HR database (let's assume it lives on a separate SQL database). But how would you generate that report? Built-in SharePoint full-text search is obviously not an option here. What you'd probably do is create a custom app that reads in the XML's, parses them, verifies the criteria ("graduation date > X"), and if the criteria is satisfied, spits out the employee id. Scary and complicated.

Entering today. XML Columns, XQuery and InfoPath.

Jeez, that sounded too much like a marketing slogan :-).

What you can today is create an XML column and store entire InfoPath there. No ERD design. Just drop an entire form (one record) into a SQL column. Then, when you want to report on several forms, just use XQuery.

This not only allows you to answer trivial queries like "show all employees that graduated before year X", but perform true heterogeneous query. Recall: the basic HR database is a separate relational system. The training management system you just built stores XML blobs. How do you connect the two? Just do a JOIN like you would have otherwise! Yes, SQL Server 2005 allows you to do joins between relational and XML data!

It's hard to underestimate the benefits here. Structure-aware search will frequently give you much better search results, if you know what you're searching for. It can answer semantic queries that full-text search can't answer - and you don't need to shred your data into relational structures.

You're probably wondering how exactly do you make InfoPath store data in an XML column in Yukon.. Coming soon, in blog theatres near you :-). UPDATE: the next chapter of the story is here.

Alex Weinstein
Program Manager