Sub-tree pruning

Article
06/05/2003

While
preparing my talk for Teched, I figured out a trick for getting simple sub-tree pruning
when running XPath queries via SqlXml. Sub-tree
pruning is the ability to filter sub-trees of an Xml hierarchy. For
example, let’s say I had an EmployeeList element
with Employee sub
elements. I would like to return the EmployeeList element,
but only include Employee sub-elements
where the employee had greater then 10 years of experience. "urn:schemas-microsoft-com:office:office" />

My first
attempt would be the following:

/EmployeeList/Employee[Experience>10]

But
that just returns Employee elements,
not the EmployeeList. Next
I would try:

/EmployeeList[Employee[Experience>10]]

But
that actually returns all Employees from
the database since the XPath literally means return
all EmployeeList elements where there exists an Employee sub-element with an Experience
sub element where the text value is greater then ten.

In fact,
you might not even be able to get that far with the current SqlXml support since most
likely the EmployeeList element
is marked as is-constant in the mapping schema. In
that case, the “XPath: predicate on is-constant element (/EmployeeList) is not supported”, Note,
that limitation could be removed from SqlXml, but doing so would really not solve
the initial problem.

So what
is the alternative? Well, one could post
process the results with an XSLT, however that would mean bringing all the data down
from the server and would complicate the code. However,
for simple cases where the containing element is-constant, the solution is to make
the sub-element (Employee) an top level element in the mapping schema and then query
the Employee elements directly. Then,
via the root property, wrap the query results with an EmployeeList element. E.g.

SqlXmlCommand
command = new SqlXmlCommand(_SqlXmlconnectionString);

command.CommandType
= SqlXmlCommandType.XPath;

command.SchemaPath
= "Employee.xsd";

command.RootTag
= "EmployeeList";

command.CommandText
= "/Employee[Experience>10]";

XmlReader
reader = command.ExecuteXmlReader());

It
general, this solution is very limited since it only works when applying the predicate
one level under the root, plus the root needs to be a sqlxml constant element. However,
at least in my experiences this pattern comes up a lot, and will be performant since
SqlXml applies the root while streaming out the results. I
suppose it would be the not hard to write an XSLT to do the same thing, but in general
I prefer to only introduce the complexity of XSLT into my design when it solves and
significant problem. In this case, the
“root hack” is adequate enough. You can
also use the sql:hide annotation for cases when one wants to prune the entire sub-tree
but use content of the sub-tree in the original query. See sql:hide for
more information.

In
general though, it seems that a more general solution is required. Really,
I would prefer to exclude arbitrary parts of the query results without having to post
process with an XSLT and not have the data from then pruned sub-trees filtered out
on the server. XQuery will
achieve this, but the XQuery will be just as complicated as the XSLT. So
is there a better solution? Stay tuned.

Sub-tree pruning

Additional resources