Tip 28 - How to implement an Eager Loading strategy

Background:

Over the last two years, lots of people have complained about the way eager loading works in the Entity Framework, or rather about the way you ask the Entity Framework to eagerly load.

Here is how you do it:

var results = from b in ctx.Blogs.Include("Posts")
              where b.Owner == "Alex"
              select b;

This snippet asks the EF to eager load each matching Blog’s Posts, and it works great.

The problem is the ‘Posts’ string. LINQ in general, and LINQ to SQL in particular, have spoilt us: we all now expect type safety everywhere, and a string is, well… not type safe.

Instead everyone wants something like this:

var results = from b in ctx.Blogs.Include(b => b.Posts)
              where b.Owner == "Alex"
              select b;

This is a lot safer. And a number of people have tried something like this before, including my mate Matthieu.

But even better would be something like this:

var strategy = new IncludeStrategy<Blog>();
strategy.Include(b => b.Owner);

var results = from b in strategy.ApplyTo(ctx.Blogs)
              where b.Owner == "Alex"
              select b;

This is better because a strategy can be re-used across queries.

Design Goals:

So I decided I wanted to have a play myself and extend this idea to support strategies.

Here are the types of things I wanted to support:

var strategy = Strategy.NewStrategy<Blog>();
strategy.Include(b => b.Owner)
        .Include(p => p.Comments); //sub includes
strategy.Include(b => b.Posts); //multiple includes

I also wanted the ability to sub-class the strategy class:

public class BlogFetchStrategy : IncludeStrategy<Blog>
{
    public BlogFetchStrategy()
    {
        this.Include(b => b.Owner);
        this.Include(b => b.Posts);
    }
}

so you can do things like this:

var results = from b in new BlogFetchStrategy().ApplyTo(ctx.Blogs)
              where b.Owner == "Alex"
              select b;

Implementation:

Here is how I implemented this:

1) Create the IncludeStrategy<T> class:

public class IncludeStrategy<TEntity>
    where TEntity : class, IEntityWithRelationships
{
    private List<string> _includes = new List<string>();

    public SubInclude<TNavProp> Include<TNavProp>(
        Expression<Func<TEntity, TNavProp>> expr
    ) where TNavProp : class, IEntityWithRelationships
    {
        return new SubInclude<TNavProp>(
            _includes.Add,
            new IncludeExpressionVisitor(expr).NavigationProperty
        );
    }

    public SubInclude<TNavProp> Include<TNavProp>(
        Expression<Func<TEntity, EntityCollection<TNavProp>>> expr
    ) where TNavProp : class, IEntityWithRelationships
    {
        return new SubInclude<TNavProp>(
            _includes.Add,
            new IncludeExpressionVisitor(expr).NavigationProperty
        );
    }

    public ObjectQuery<TEntity> ApplyTo(ObjectQuery<TEntity> query)
    {
        var localQuery = query;
        foreach (var include in _includes)
        {
            localQuery = localQuery.Include(include);
        }
        return localQuery;
    }
}

Notice that there is a list of strings that holds the Includes we want. And notice that the ApplyTo(…) method allows you to register the Includes with an ObjectQuery<T>, so long as the T’s match.
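To make the ApplyTo(…) loop a little more concrete, here is a minimal sketch that runs without the Entity Framework. FakeQuery and ApplyToDemo are hypothetical stand-ins I made up for illustration. The only behaviour borrowed from ObjectQuery<T> is that Include hands back a new query rather than changing the one you call it on, which is why the loop keeps re-assigning its local variable:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ObjectQuery<T>: Include returns a NEW query
// and leaves the receiver untouched.
public sealed class FakeQuery
{
    private readonly List<string> _paths;

    public FakeQuery() : this(new List<string>()) { }
    private FakeQuery(List<string> paths) { _paths = paths; }

    public IList<string> Paths { get { return _paths.AsReadOnly(); } }

    public FakeQuery Include(string path)
    {
        var copy = new List<string>(_paths);
        copy.Add(path);
        return new FakeQuery(copy);
    }
}

public static class ApplyToDemo
{
    // Same shape as IncludeStrategy<T>.ApplyTo: fold every recorded
    // include path into the query, keeping the returned query each time.
    public static FakeQuery ApplyTo(FakeQuery query, IEnumerable<string> includes)
    {
        var localQuery = query;
        foreach (var include in includes)
        {
            localQuery = localQuery.Include(include);
        }
        return localQuery;
    }
}
```

Because the source query is never modified, the same list of includes can be applied to as many queries as you like, which is the whole point of having a strategy object.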

But of course the bulk of the work is in the two Include(..) methods.

There are two because I wanted one for including References and one for including Collections. These implementations are designed to work with .NET 3.5 SP1, so I can rely on classes that have relationships (the only types for which Include makes sense) implementing IEntityWithRelationships. Hence the use of generic constraints.

One interesting thing is that for the Collections version of Include, even though the expression is an Expression<Func<TEntity, EntityCollection<TNavProp>>>, the object returned for creating sub-includes is typed to TNavProp. This allows us to neatly bypass needing to interpret expressions like this:

Include(b => b.Posts.SelectMany(p => p.Author));

or invent some sort of DSL like this:

Include(b => b.Posts.And().Author);

By instead doing this:

Include(b => b.Posts).Include(p => p.Author);

This is much, much easier to implement and, I would argue, to use too.

This idea is central to the whole design.
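Here is a rough, EF-free sketch of that chaining idea, in case it helps to see it in isolation. Everything in it is an assumption made for illustration: Blog, Post and Author are made-up POCOs, List<T> stands in for EntityCollection<T>, and PathBuilder<T> is a cut-down relative of the SubInclude<T> class shown later in this post. When both overloads could apply to b => b.Posts, the compiler prefers the more specific collection overload, so the next lambda starts from Post rather than from the collection:

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical entities; List<T> stands in for EntityCollection<T>.
public class Author { public string Name { get; set; } }
public class Post { public Author Author { get; set; } }
public class Blog { public List<Post> Posts { get; set; } }

public class PathBuilder<T>
{
    private readonly List<string> _sink;
    private readonly string _prefix;

    public PathBuilder(List<string> sink, string prefix)
    {
        _sink = sink;
        _prefix = prefix;
    }

    // Reference navigation: the chain continues from the property's type.
    public PathBuilder<TNext> Include<TNext>(Expression<Func<T, TNext>> expr)
        where TNext : class
    {
        return Next<TNext>(NameOf(expr));
    }

    // Collection navigation: the chain continues from the ELEMENT type.
    public PathBuilder<TNext> Include<TNext>(Expression<Func<T, List<TNext>>> expr)
        where TNext : class
    {
        return Next<TNext>(NameOf(expr));
    }

    private PathBuilder<TNext> Next<TNext>(string name)
    {
        string path = _prefix == null ? name : _prefix + "." + name;
        _sink.Add(path); // every level registers its own path
        return new PathBuilder<TNext>(_sink, path);
    }

    private static string NameOf(LambdaExpression expr)
    {
        var member = expr.Body as MemberExpression;
        if (member == null)
            throw new ArgumentException("Expected a simple property access.");
        return member.Member.Name;
    }
}

public static class ChainingDemo
{
    public static List<string> Run()
    {
        var paths = new List<string>();
        new PathBuilder<Blog>(paths, null)
            .Include(b => b.Posts)      // records "Posts"
            .Include(p => p.Author);    // records "Posts.Author"
        return paths;
    }
}
```

No lambda ever has to reach through a collection, so there is nothing like SelectMany to interpret. Each call only contributes one property name to a dotted path.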

2) The IncludeExpressionVisitor is a class derived from a copy of the ExpressionVisitor sample you can find here. It is very simple. In fact, it is so simple that a visitor is probably overkill here, but I wanted to bone up on the correct patterns etc:

public class IncludeExpressionVisitor : ExpressionVisitor
{
    private string _navigationProperty = null;

    public IncludeExpressionVisitor(Expression expr)
    {
        base.Visit(expr);
    }

    public string NavigationProperty
    {
        get { return _navigationProperty; }
    }

    protected override Expression VisitMemberAccess(
        MemberExpression m
    )
    {
        PropertyInfo pinfo = m.Member as PropertyInfo;

        if (pinfo == null)
            throw new Exception(
                "You can only include Properties");

        if (m.Expression.NodeType != ExpressionType.Parameter)
            throw new Exception(
                "You can only include Properties of the Expression Parameter");

        _navigationProperty = pinfo.Name;

        return m;
    }

    protected override Expression Visit(Expression exp)
    {
        if (exp == null)
            return exp;
        switch (exp.NodeType)
        {
            case ExpressionType.MemberAccess:
                return this.VisitMemberAccess(
                    (MemberExpression)exp
                );
            case ExpressionType.Lambda:
                return this.VisitLambda((LambdaExpression)exp);
            default:
                throw new InvalidOperationException(
                    "Unsupported Expression");
        }
    }
}

As you can see, this visitor is fairly constrained: it only recognizes LambdaExpressions and MemberExpressions. When visiting a MemberExpression, it checks that the member being accessed is a property and that the member is bound directly to the parameter (i.e. p.Property is okay but p.Property.SubProperty is not). Once it is happy, it records the name of the NavigationProperty.
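For comparison, the same checks can be sketched without a visitor at all, by looking at the body of the lambda directly. This is not the code from the download, just an illustration: NavigationName, Blog and Owner are hypothetical names, and the only library types used are the standard ones in System.Linq.Expressions and System.Reflection:

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical types, used only to exercise the helper below.
public class Owner { public string Name { get; set; } }
public class Blog
{
    public Owner Owner { get; set; }
    public int Hits; // a field, not a property
}

public static class NavigationName
{
    public static string From<TEntity, TProp>(
        Expression<Func<TEntity, TProp>> expr)
    {
        // The body must be a member access such as b.Owner.
        var member = expr.Body as MemberExpression;
        if (member == null)
            throw new InvalidOperationException("Unsupported Expression");

        // Fields are rejected; only properties can be included.
        if (!(member.Member is PropertyInfo))
            throw new InvalidOperationException(
                "You can only include Properties");

        // The property must hang directly off the lambda parameter,
        // so b.Owner is fine but b.Owner.Name is not.
        if (member.Expression == null ||
            member.Expression.NodeType != ExpressionType.Parameter)
            throw new InvalidOperationException(
                "You can only include Properties of the Expression Parameter");

        return member.Member.Name;
    }
}
```

Calling NavigationName.From<Blog, Owner>(b => b.Owner) gives back "Owner", while b => b.Owner.Name and b => b.Hits are both rejected. The visitor version applies the same rules but leaves room to grow if more expression shapes ever need to be supported.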

3) Once we know the NavigationProperty name the IncludeStrategy.Include methods create a SubInclude<T> object. This is responsible for registering our intent to include the NavigationProperty, and provides a mechanism for chaining more sub-includes.

The SubInclude<T> class looks like this:

public class SubInclude<TNavProp>
    where TNavProp : class, IEntityWithRelationships
{
    private Action<string> _callBack;
    private string[] _paths;

    internal SubInclude(Action<string> callBack, params string[] path)
    {
        _callBack = callBack;
        _paths = path;
        _callBack(string.Join(".", _paths));
    }

    public SubInclude<TNextNavProp> Include<TNextNavProp>(
        Expression<Func<TNavProp, TNextNavProp>> expr
    ) where TNextNavProp : class, IEntityWithRelationships
    {
        string[] allpaths = _paths.Append(
            new IncludeExpressionVisitor(expr).NavigationProperty
        );

        return new SubInclude<TNextNavProp>(_callBack, allpaths);
    }

    public SubInclude<TNextNavProp> Include<TNextNavProp>(
        Expression<Func<TNavProp, EntityCollection<TNextNavProp>>> expr
    ) where TNextNavProp : class, IEntityWithRelationships
    {
        string[] allpaths = _paths.Append(
            new IncludeExpressionVisitor(expr).NavigationProperty
        );

        return new SubInclude<TNextNavProp>(_callBack, allpaths);
    }
}

4) Now the only thing missing is a little extension method I wrote to append another element to an array, that looks something like this:

public static T[] Append<T>(this T[] initial, T additional)
{
    List<T> list = new List<T>(initial);
    list.Add(additional);
    return list.ToArray();
}
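An extension method has to live in a static class, so a complete version might look something like the sketch below. ArrayExtensions and AppendDemo are names I picked for illustration, and this variant copies with Array.Copy rather than going through a List<T>. The important property is the same either way: the original array is left alone, so a parent SubInclude keeps its own path when a child is chained from it:

```csharp
using System;

public static class ArrayExtensions
{
    // Returns a new array one element longer; 'initial' is not modified.
    public static T[] Append<T>(this T[] initial, T additional)
    {
        var result = new T[initial.Length + 1];
        Array.Copy(initial, result, initial.Length);
        result[initial.Length] = additional;
        return result;
    }
}

public static class AppendDemo
{
    public static string Run()
    {
        string[] parent = { "Posts" };
        string[] child = parent.Append("Author");

        // The child path is extended while the parent path is unchanged.
        return string.Join(".", child) + " / " + string.Join(".", parent);
    }
}
```

Here AppendDemo.Run() returns "Posts.Author / Posts".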

With this code in place you can write your own eager loading strategy classes very easily, simply by deriving from IncludeStrategy<T>.

All the code you need is in this post, but please bear in mind this is just a sample. It is NOT an official Microsoft release, and as such has not been rigorously tested etc.

If you accept that I'm just a Program Manager, and I'm eminently fallible, and you *still* want to try this out, you can download a copy of the source here.

Enjoy.

EagerLoading.zip