Stored Procedures and First-Class Relationships in the Entity Framework

A user of the EFExtensions library asked me how to retrieve a set of ‘categories’ and their related ‘products’ through a single stored procedure call. I made some changes to the sample application to illustrate this pattern, but I thought I should also spend some time explaining how it works. The full explanation takes a little bit of patience, because it gets into some of the unique features of the Entity Data Model (EDM).

 

The EDM has “first-class relationships”. What does this mean? Take a look at products and categories in the EFExtensions sample application. The dbo.Products table has a cid column with a foreign key constraint. The corresponding Product entity type does not have a property corresponding to that column. Where did the foreign key go? Typically, foreign keys are mapped to a separate "association set" in your model, in this case ProductCategoryAssociations, so they would be redundant if included in the entity type as well. Consider the following table contents:

 

 

 

In our entity model, there are three independent sets:

 

 

 

 

The products and the relationships between products and categories are actually separate ‘things’ in the conceptual model, whereas in the database they’re represented by a single table. The nice thing about this independence in the EDM is that you can define a relationship between two entity types without modifying the entities. There are also some potentially unintuitive side effects to consider. I’ll talk about a few of them here.

 

Let’s say you decide that ‘Beer’ is a meal in and of itself and should be classified as a ‘Food’ rather than a ‘Beverage’. You start off by loading the relevant category and product:

 

Category food = context.Categories.Where(c => c.Name == "Food").First();

Product beer = context.Products.Where(p => p.Name == "Beer").First();

 

So where is our "first-class" relationship hiding? Let’s look at the contents of the state manager (context.ObjectStateManager.GetStateEntries):

 

  ProductCategoryAssociations(Unchanged)

    <CategoryID=1, ProductID=1>

  Categories(Unchanged)

    C2: Food

  Products(Unchanged)

    P1: Beer

  Categories(Unchanged)

    Stub entity CategoryID=1

 

Although we only queried two entities, there are four state entries! As expected, we have the ‘Beer’ product and the ‘Food’ category. We also have a ProductCategoryAssocations entry, representing the relationship between ‘Beer’ and ‘Beverage’, and a so-called stub entry representing the ‘Beverage’ category (at this point we only know that the key value is ‘1’). This is a little bit strange. After all, I didn’t query the association set. This is because the Entity Framework (EF) implicitly “spans” in reference relationships when you issue certain queries. In this case, Product has a reference (as opposed to collection) relationship with Category, so the query issued automatically includes the relationship information as well.

 

Warning: implicit relationship span does not happen for EntityCommand, ObjectQuery (LINQ or Entity-SQL) with MergeOption.NoTracking, or FunctionImport stored procedures (though I demonstrate how to work around this last limitation using EFExtensions in this post).

 

What happens when I change the category?

 

beer.Category = food;

 

Although it may look like I’m modifying the product, I’m really deleting one relationship and adding another. If you examine the state manager again, you’ll notice there are now two relationship state entries:

 

  ProductCategoryAssociations(Added)

    <CategoryID=2, ProductID=1>

  ProductCategoryAssociations(Deleted)

    <CategoryID=1, ProductID=1>

 

When I call context.SaveChanges, the EF is smart enough to issue an UPDATE rather than attempting to INSERT a new row and DELETE the existing row. The moral of the story: things are not necessarily what they seem. The ObjectContext does a pretty good job of sheltering us from the complexities of first-class relationships, but it’s good to understand them in case something goes wrong. Which brings me back to the original point of this post… What if I want to call a stored procedure that returns multiple categories and multiple products? Most likely, I’d like to know which categories and products are related. My original samples showed how you could return a single category and its related products. If there are multiple categories, we no longer know how the products line up with categories, and we end up with a disconnected graph. After attaching products and categories to my context, I still don’t know about the relationships, obvious when you examine the state manager:

 

  Categories(Unchanged)

    C1: Beverage

  Categories(Unchanged)

    C2: Food

  Products(Unchanged)

    P1: Beer

  Products(Unchanged)

    P3: Water

 

If we change the definition of our product ‘materializer’ (I describe these techniques in detail here), we can include the relationship data with the product entity:

 

new Product {

    ProductID = r.Field<int>("pid"),

    Name = r.Field<string>("name"),

    CategoryReference = {

        EntityKey = new EntityKey("MiniNWEntities.Categories", new[] { new EntityKeyMember("CategoryID", r.Field<int>("cid")) }),

    },

}

 

Now we’re mimicking the “relationship span” behavior of the EF by including the EntityKey for our category reference along with the product, and the state manager reflects the relationships as well as the entities:

 

  ProductCategoryAssociations(Unchanged)

    <CategoryID=1, ProductID=1>

  ProductCategoryAssociations(Unchanged)

    <CategoryID=1, ProductID=3>

 

In general, I think it’s useful to adopt a “first-class relationships” view of the world when you work with the EF because it is central to the design of the product.