Roger Jennings, in his recent post Controlling the Depth and Order of EntitySets for 1:Many Associations, makes a case for the importance of two features in an O/RM if you want to build data-centric web services using it: the ability to do a filtered relationship navigation, and the ability to either serialize a graph or at least re-create the graph on the other side of a web service boundary. Well, to be fair, he is probably asking for more than these two features, but these are two key things which are possible with the entity framework today (as of beta 3), so I thought I'd take a few minutes to explain how. I’m going to take these two topics out of order, though, because there are some important concepts related to re-creating an entity graph which I hope will help clarify things when it comes to filtered association loading.
Re-creating an entity graph across a web service boundary
While this is by no means a simple problem, for an important subset of the overall scenarios it’s pretty easy to solve with the entity framework today. The thing which makes this possible is the beta 3 feature of serializable EntityReferences. In my previous post about Entity Relationship Concepts, I gave some background about relationship “stub” entries in the object state manager, and these are effectively what gets serialized when an EntityReference is serialized. Stubs are very versatile because they automatically upgrade to real entities in either direction: if the real entity is already present when the system tries to load a stub into the state manager, or if a stub is present and the real entity is loaded later. When you add to this the fact that the framework ensures that the state manager and its entries are automatically kept in sync with the collections and references on the objects, the result is a pretty simple mechanism for breaking a graph into pieces and then cleanly reassembling the whole.
Maybe a picture will help:
So, if you want to remote an object graph across a web service and you do not have many-many relationships, then you can just remote each of the entities which participate in the object graph individually (maybe just as an array of objects) and then attach all of them to an ObjectContext on the other side at which point the EF will recreate the graph for you. This is, in fact, part of how my general purpose container sample (promised as an illustration of strategy #4 in this post but not quite yet ready for sharing) works.
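A toy model may make the stub mechanics more concrete. The sketch below is not the entity framework's implementation and uses none of its types: ToyStateManager, the Customer and Order classes, and the pending-orders dictionary are all hypothetical stand-ins. It only illustrates the idea that an order can arrive carrying nothing but its customer's key, and that the graph ends up the same whichever side is attached first.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-ins for entities; these are NOT entity framework types.
public class Customer
{
    public int Id;
    public List<Order> Orders = new List<Order>();
}

public class Order
{
    public int Id;
    public int CustomerId;     // plays the role of the serialized reference key
    public Customer Customer;  // stays null until the graph is fixed up
}

// Hypothetical, heavily simplified model of a state manager with stubs.
public class ToyStateManager
{
    // Real customers that have been attached, by key.
    private readonly Dictionary<int, Customer> customers = new Dictionary<int, Customer>();
    // "Stubs": customer keys we have only seen via orders, with the orders waiting on them.
    private readonly Dictionary<int, List<Order>> pending = new Dictionary<int, List<Order>>();

    public void Attach(Customer c)
    {
        customers[c.Id] = c;
        List<Order> waiting;
        if (pending.TryGetValue(c.Id, out waiting))
        {
            // A stub existed for this key: upgrade it and link the waiting orders.
            foreach (Order o in waiting) Link(c, o);
            pending.Remove(c.Id);
        }
    }

    public void Attach(Order o)
    {
        Customer c;
        if (customers.TryGetValue(o.CustomerId, out c))
        {
            Link(c, o); // the real customer is already present
            return;
        }
        // Only the key is known so far: record a stub and wait.
        List<Order> waiting;
        if (!pending.TryGetValue(o.CustomerId, out waiting))
        {
            waiting = new List<Order>();
            pending[o.CustomerId] = waiting;
        }
        waiting.Add(o);
    }

    private static void Link(Customer c, Order o)
    {
        o.Customer = c;
        c.Orders.Add(o);
    }
}

public static class GraphReassemblyDemo
{
    public static Customer Reassemble()
    {
        var manager = new ToyStateManager();
        var customer = new Customer { Id = 1 };
        // Attach out of order: one order before its customer, one after.
        manager.Attach(new Order { Id = 10, CustomerId = 1 });
        manager.Attach(customer);
        manager.Attach(new Order { Id = 11, CustomerId = 1 });
        return customer;
    }
}
```

Calling GraphReassemblyDemo.Reassemble() returns a customer whose Orders collection holds both orders, each pointing back at that customer, even though the pieces arrived separately and in mixed order.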
Filtered Association Loading
While it might be nice if there were a way to pass a predicate to either the Load method on EntityCollection and EntityReference or to the Include method on ObjectQuery, it is already possible to accomplish these things today without one. If you want to filter a relationship load, one way to do it is with EntityCollection’s Attach method. As a simple example, if I wanted to load into a customer’s orders collection all orders dated on or after January 1, 2007, then I could do something like this:
customer.Orders.Attach(customer.Orders.CreateSourceQuery().Where(order => order.Date >= new DateTime(2007, 1, 1)));
The CreateSourceQuery method returns an ObjectQuery<T> which will retrieve the set of entities that Load would retrieve. I then refine that query to filter to the subset of orders we really want and call Attach, which tells the collection to incorporate the retrieved orders as though they had been loaded by Load.
Another interesting trick is that object services rewrites ObjectQueries (except those with a merge option of NoTracking) to bring along the EntityKey for each EntityReference, and then creates the relationship entries and stubs as needed when the results of the query are attached to the context. What this means in practice is that the above operation could also have been accomplished without the Attach call, just by creating the filtered query and enumerating its results. As the results are enumerated, objects are materialized and attached to the state manager, and (like in the diagram above) the orders bring along the keys of their corresponding customers and cause the graph to be fixed up to match.
What this means for the Include scenario is that there are multiple ways to load a set of customers with a filtered set of related orders:
1) You could query the customers and as you iterate over each one you could use the trick above to query the filtered set of orders for that customer and load them in. This would, unfortunately, result in n+1 queries (where n is the number of customers).
foreach (Customer c in db.Customers)
{
    // One extra query per customer.
    c.Orders.Attach(c.Orders.CreateSourceQuery().Where(o => o.Date >= new DateTime(2007, 1, 1)));
    // do stuff
}
2) You could query the customers and then once you were done iterating over all of them you could just query for all orders since January 1, 2007. This would require only 2 queries, and it would recreate the graph for you automatically. The only downside is that if you filtered to a subset of all the customers in your first query, then you would have to apply a similar filter to your orders query or else you would retrieve orders for customers you weren’t interested in.
var customers = new List<Customer>(db.Customers);
// Materializing the orders attaches them, and the graph is fixed up as they load.
var orders = new List<Order>(db.Orders.Where(o => o.Date >= new DateTime(2007, 1, 1)));
foreach (Customer c in customers)
{
    // do stuff
}
3) You could create a projection query which retrieved customers in the first column and collections of orders for that customer in the second column and then attach the collection in the second column to the collection property on the customers. This would use only one query.
foreach (var customerAndOrders in from c in db.Customers
                                  select new {
                                      Customer = c,
                                      Orders = (from o in c.Orders
                                                where o.Date >= new DateTime(2007, 1, 1)
                                                select o) })
{
    customerAndOrders.Customer.Orders.Attach(customerAndOrders.Orders);
    // do stuff
}
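Going back to option 2 for a moment, the caveat about applying a similar filter to both queries can be sketched with plain in-memory collections. This is LINQ to Objects rather than a query against an ObjectContext, and SampleCustomer, SampleOrder, the Country property and the MatchingFilters helper are all invented for illustration. The point is only that the orders query carries the customer predicate as well as the date filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins; Country is an invented property.
public class SampleCustomer
{
    public int Id;
    public string Country;
}

public class SampleOrder
{
    public int Id;
    public DateTime Date;
    public SampleCustomer Customer;
}

public static class MatchingFilters
{
    // First query: only the customers we are interested in.
    public static List<SampleCustomer> CustomersIn(
        IEnumerable<SampleCustomer> customers, string country)
    {
        return customers.Where(c => c.Country == country).ToList();
    }

    // Second query: the date filter plus the same customer predicate,
    // so orders for customers outside the first result are left out.
    public static List<SampleOrder> RecentOrdersIn(
        IEnumerable<SampleOrder> orders, string country, DateTime since)
    {
        return orders
            .Where(o => o.Date >= since && o.Customer.Country == country)
            .ToList();
    }
}
```

Without the second half of the predicate in RecentOrdersIn, the orders query would also return recent orders belonging to customers that the first query had filtered out.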
Well, it’s growing late in the afternoon on the Friday before Christmas, and my family is calling me home. So I’ll leave you with this. Here’s hoping that all of you have a wonderful holiday and new year!