Projections in custom providers – Simple solution

Implementing a custom provider for WCF Data Services is complicated enough. The fact that the really interesting providers also need a custom IQueryable makes things that much more complicated. Based on feedback from multiple parties trying to implement IQueryable for WCF Data Services, one of the most challenging tasks is supporting projections and expansions. These are hard not only because the expression trees generated for them are complicated, but also because they require the query to return results in a different shape than usual. In this post we’ll walk through a simple way of dealing with projections and expansions (with some limitations).

First let’s assume that you’re familiar with the expression trees generated by WCF Data Services for projections and expansions. These are described in the Data Services Expressions series part 8 and part 9. Also some familiarity with IQueryable, its behavior and the general approaches to its implementation is required. There are multiple places to learn about these, but the classic source is Matt’s blog series.

Scenarios

Since the solution we’re going to try out is simplistic, it only solves certain limited scenarios. Projections ($select) are meant to limit the number of properties returned by a query for any given entity in it. This is definitely useful for the client, since it can download only the properties it’s going to need. But the server can take advantage of this as well: if the client only asks for a subset of properties, the server can also load only that subset from the underlying data store. The simple solution described in this post assumes that this second optimization is not necessary, that it’s OK to load all properties for each entity from the data store. The projections then only happen inside the service and are easier to deal with.

If your provider wants to support expansions as well, these are even more complicated. This simple solution will only support expansions if the entity instances can load the target of any given navigation property on demand (when the navigation property is accessed). Some data sources might be able to do this easily, but for others it might be hard or expensive. I will discuss possible solutions for those in a later post.

Simplification

The general idea of dealing with projections in our provider will be this:

  • Take the entire query and split it into two parts: the projection and expansion itself, and the source query for the projection and expansion, which returns the results as plain entities.
  • Then run the source query against the underlying data source (we will assume we already have a solution for this).
  • Take the results of the source query and apply the projections and expansions to them inside the service.
  • Return the new results to WCF Data Services.

For the purposes of this sample we will use a LINQ to SQL based data source. So we create a new web application and add a database to it with two tables, Products and Categories (a similar schema to the sample service on odata.org). Then we create LINQ to SQL classes over these tables, which gives us a SampleDataContext class with two properties, Products and Categories.

Note: I chose LINQ to SQL since it can deal with all the other parts of the query (filters, sorting and so on). It also supports on-demand loading of navigation properties by just accessing them. And last but not least, it generates the proxy classes, so we can use the reflection provider to set up the metadata for us. In custom provider implementations this would obviously not be the case and most of these would have to be solved in our code, but for the purposes of this sample such an approach is enough.

Custom query provider

We need a custom query provider, that is IQueryable and IQueryProvider implementations. Implementation of IQueryable is pretty simple (as usual):

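 // Namespaces used by the query and query provider classes in this post.
 using System;
 using System.Collections;
 using System.Collections.Generic;
 using System.Linq;
 using System.Linq.Expressions;
 using System.Reflection;
  
 public class ProjectionQuery<T> : IOrderedQueryable<T>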
 {
     private Expression expression;
     private ProjectionQueryProvider provider;
  
     public ProjectionQuery(ProjectionQueryProvider provider, Expression expression)
     {
         this.provider = provider;
         this.expression = expression;
     }
  
     public IEnumerator<T> GetEnumerator()
     {
         return this.provider.ExecuteQuery<T>(this.expression);
     }
  
     IEnumerator IEnumerable.GetEnumerator()
     {
         return this.provider.ExecuteQuery<T>(this.expression);
     }
  
     public Type ElementType { get { return typeof(T); } }
  
     public Expression Expression { get { return this.expression; } }
  
     public IQueryProvider Provider { get { return this.provider; } }
 }

When a query is to be executed, this only calls the ExecuteQuery method on the query provider. Now let’s add a stock implementation of the query provider like this:

 public class ProjectionQueryProvider : IQueryProvider
 {
     private IQueryProvider dataSourceQueryProvider;
  
     private ProjectionQueryProvider(IQueryProvider dataSourceQueryProvider)
     {
         this.dataSourceQueryProvider = dataSourceQueryProvider;
     }
  
     public static IQueryable<T> WrapDataSourceQuery<T>(IQueryable<T> dataSourceQuery)
     {
         return new ProjectionQuery<T>(
             new ProjectionQueryProvider(dataSourceQuery.Provider),
             dataSourceQuery.Expression);
     }
  
     public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
     {
         return new ProjectionQuery<TElement>(this, expression);
     }
  
     public IQueryable CreateQuery(Expression expression)
     {
         return (IQueryable)Activator.CreateInstance(
             typeof(ProjectionQuery<>).MakeGenericType(TypeSystem.GetIEnumerableElementType(expression.Type)),
             this,
             expression);
     }
  
     public TResult Execute<TResult>(Expression expression)
     {
         // This is the case for, for example, a .Count() query
         // - we don't support it for now.
         // One option would be to pass it directly to the underlying provider
         // if it can handle it.
         throw new NotSupportedException("We don't support expressions returning simple values yet.");
     }
  
     public object Execute(Expression expression)
     {
         // This is the case for, for example, a .Count() query
         // - we don't support it for now.
         // One option would be to pass it directly to the underlying provider
         // if it can handle it.
         throw new NotSupportedException("We don't support expressions returning simple values yet.");
     }
  
     // ExecuteQuery implementation missing here.
 }

Our query provider is based on a query provider for the underlying data source. As noted above, this assumes that we already have an IQueryable implementation which can deal with filters, sorting and the like.
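
The non-generic CreateQuery above relies on a TypeSystem.GetIEnumerableElementType helper which is not shown in this post. The following is a minimal sketch of what such a helper might look like, assuming all it needs to do is find the T of an IEnumerable<T> (this is an assumption; the helper in the full sample may differ):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the TypeSystem helper used by CreateQuery.
public static class TypeSystem
{
    // Returns T if the given type is, or implements, IEnumerable<T>.
    // Otherwise returns the type itself.
    public static Type GetIEnumerableElementType(Type sequenceType)
    {
        Type enumerableType = FindIEnumerable(sequenceType);
        return enumerableType == null
            ? sequenceType
            : enumerableType.GetGenericArguments()[0];
    }

    private static Type FindIEnumerable(Type type)
    {
        // string implements IEnumerable<char>, but we treat it as a simple value.
        if (type == null || type == typeof(string))
        {
            return null;
        }

        if (type.IsGenericType && type.GetGenericTypeDefinition() == typeof(IEnumerable<>))
        {
            return type;
        }

        // Covers IQueryable<T>, IOrderedQueryable<T>, List<T>, arrays and so on.
        return type.GetInterfaces().FirstOrDefault(
            i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IEnumerable<>));
    }
}
```

With this sketch, TypeSystem.GetIEnumerableElementType(typeof(IQueryable<Product>)) returns typeof(Product), which is the type argument CreateQuery needs to construct ProjectionQuery<Product>.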

Projections

And now to the really interesting part, how to implement ExecuteQuery:

 public IEnumerator<TElement> ExecuteQuery<TElement>(Expression expression)
 {
     // Simple solution, more robust solution would be using expression visitor
     // to process the projection and separate the projection source
     // from the rest of the query.
     // For now assume that the projection is the last operator on the query
     // (which is mostly true for WCF DS Expressions)
  
     // Match the .Select call (this is a helper method which determines
     // if the method call is a .Select and extracts the parameters
     // to the Select call in a nicely consumable way,
     // it doesn't modify the expression in any way).
     var selectMatch = ExpressionUtils.MatchSelectCall(expression);
  
     if (selectMatch == null)
     {
         // No projection - just run the source query.
         return this.dataSourceQueryProvider.CreateQuery<TElement>(expression)
             .GetEnumerator();
     }
     else
     {
         // Projection - separate the source query, and run that
         // against the data source provider.
         // Call a private method so that we can use generics
         // for easier manipulation of results
         // (otherwise we would have to use a lot of reflection)
         // The source item type is the item type of the source expression.
         MethodInfo methodInfo = typeof(ProjectionQueryProvider)
             .GetMethod(
                 "ProcessProjection",
                 BindingFlags.Instance | BindingFlags.NonPublic)
             .MakeGenericMethod(
                 TypeSystem.GetIEnumerableElementType(selectMatch.Source.Type),
                 typeof(TElement));
         return (IEnumerator<TElement>)methodInfo.Invoke(
             this,
             new object[] { selectMatch });
     }
 } 
  
 private IEnumerator<TResultElement>
      ProcessProjection<TSourceElement, TResultElement>(
         ExpressionUtils.SelectCallMatch selectMatch)
 {
     // Source is the query without projection/expansion
     // So we will run it right here against the original provider
     // - this is the query without projections.
     IQueryable<TSourceElement> dataSourceQuery =
          this.dataSourceQueryProvider.CreateQuery<TSourceElement>(selectMatch.Source);
  
     // AsEnumerable switches the query over to LINQ to Objects. The source query
     // itself runs against the data source when the results are enumerated;
     // all other LINQ operations we perform on top of it
     // are performed by LINQ to Objects and not the underlying query provider.
     // (We could also cache the results here if needed).
     IEnumerable<TSourceElement> dataSourceQueryResults =
         dataSourceQuery.AsEnumerable();
  
     // Now we have the results of the query as whole entities
     // (no projection applied).
     // This also assumes that for $expand to work the entities returned
     // already have or can lazy load the navigation properties.
     // Get the projection lambda and compile it into a delegate.
     Func<TSourceElement, TResultElement> projectionFunc =
         (Func<TSourceElement, TResultElement>)selectMatch.Lambda.Compile();
  
     // And now just run the projection function on each item
     // from the results we have to create the real results.
     IEnumerable<TResultElement> results =
          dataSourceQueryResults.Select(sourceItem => projectionFunc(sourceItem));
     return results.GetEnumerator();
 }

The ExecuteQuery implementation only determines if the query contains projections/expansions and, if so, invokes the ProcessProjection method. That method takes the source of the projection and executes it against the data source (in our case this runs the IQueryable against LINQ to SQL). Then it takes the results and applies the projections/expansions to them in memory.
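
The ExpressionUtils.MatchSelectCall helper is not shown in this post. A minimal sketch of what it might look like follows. The Source and Lambda properties match how the match is used in ExecuteQuery and ProcessProjection above; everything else (the MethodCall property, accepting Enumerable.Select in addition to Queryable.Select) is an assumption made for illustration, and the helper in the full sample may differ.

```csharp
using System.Linq;
using System.Linq.Expressions;

// Hypothetical sketch of the helper used by ExecuteQuery.
public static class ExpressionUtils
{
    // Describes a matched source.Select(lambda) call.
    public class SelectCallMatch
    {
        public MethodCallExpression MethodCall { get; set; }
        public Expression Source { get; set; }
        public LambdaExpression Lambda { get; set; }
    }

    // Returns a match if the expression is a call to Select with a
    // single-parameter lambda, otherwise null.
    // The expression itself is not modified.
    public static SelectCallMatch MatchSelectCall(Expression expression)
    {
        var call = expression as MethodCallExpression;
        if (call == null
            || call.Method.Name != "Select"
            || (call.Method.DeclaringType != typeof(Queryable)
                && call.Method.DeclaringType != typeof(Enumerable))
            || call.Arguments.Count != 2)
        {
            return null;
        }

        // Queryable.Select wraps the lambda in a Quote node, so unwrap it.
        Expression argument = call.Arguments[1];
        while (argument.NodeType == ExpressionType.Quote)
        {
            argument = ((UnaryExpression)argument).Operand;
        }

        // Skip the Select overload which also passes the item index.
        var lambda = argument as LambdaExpression;
        if (lambda == null || lambda.Parameters.Count != 1)
        {
            return null;
        }

        return new SelectCallMatch
        {
            MethodCall = call,
            Source = call.Arguments[0],
            Lambda = lambda
        };
    }
}
```

Note that this only looks at the outermost method call, which is the same assumption ExecuteQuery makes: the projection is the last operator on the query.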

The main trick used here is to take a lambda expression (System.Linq.Expressions.LambdaExpression) and call the Compile method on it. This generates a lightweight dynamic method which is the body of the .Select and can be called just like any other method. Then we simply call this method on each result and return the projected results to WCF Data Services.
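
To see the Compile trick in isolation, here is a small self-contained sketch which runs entirely in memory. The DemoProduct class and the sample data are made up for illustration; in the service the lambda comes from the expression tree generated by WCF Data Services and the source items come from LINQ to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public class DemoProduct
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class CompiledProjectionDemo
{
    public static List<string> Run()
    {
        // Stand-in for the entities returned by the source query.
        IEnumerable<DemoProduct> sourceResults = new[]
        {
            new DemoProduct { Name = "Bread", Price = 2.5m },
            new DemoProduct { Name = "Milk", Price = 3.5m }
        };

        // The projection as an expression tree, the way it would appear
        // as the argument of the .Select call.
        Expression<Func<DemoProduct, string>> typedLambda = p => p.Name;
        LambdaExpression lambda = typedLambda;

        // Compile returns a Delegate, so cast it to the concrete Func type.
        var projectionFunc = (Func<DemoProduct, string>)lambda.Compile();

        // Apply the projection to each item using LINQ to Objects.
        return sourceResults.Select(item => projectionFunc(item)).ToList();
    }
}
```

Calling CompiledProjectionDemo.Run() returns a list with the two names, "Bread" and "Milk". The Price values were still loaded, which is exactly the trade-off this simple solution accepts.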

Sample

And now in our sample we simply wrap the IQueryable returned by LINQ to SQL with the one we implemented above. Like this:

 // Data context which wraps the LINQ to SQL data context,
 // so that we can wrap the IQueryable properties.
 public class ProjectionDataContext
 {
     private SampleDataContext sampleDataContext = new SampleDataContext();
  
     public IQueryable<Product> Products {
          get { return ProjectionQueryProvider.WrapDataSourceQuery(
                   this.sampleDataContext.Products); } }
     public IQueryable<Category> Categories {
          get { return ProjectionQueryProvider.WrapDataSourceQuery(
                   this.sampleDataContext.Categories); } }
 }
  
 public class SampleService : DataService<ProjectionDataContext>
 {
     public static void InitializeService(DataServiceConfiguration config)
     {
         config.SetEntitySetAccessRule("*", EntitySetRights.AllRead);
         config.DataServiceBehavior.MaxProtocolVersion =
             DataServiceProtocolVersion.V2;
     }
 }

And that’s it. Give it a try and run a query like service/Products?$select=Name. It should return products with only the Name property. To verify that it really worked, set a breakpoint in the ProcessProjection method on the line which creates dataSourceQueryResults and run the query again. The dataSourceQuery is the IQueryable which will execute against the data source. In this sample the debugger should show a simple table scan of the Products table (this is provided by LINQ to SQL for us). If the request contained filters or sorting, this query would contain all of those and LINQ to SQL would generate the appropriate SQL for it.

If we wanted to use this solution as part of a bigger custom provider, the query provider would need to handle all of our custom provider’s needs on top of this. There are also some technical limitations. In particular, the solution requires that IDataServiceQueryProvider.IsNullPropagationRequired returns true, otherwise the function generated by lambda.Compile would not handle null values correctly. This solution is also not suitable for wrapping EF providers, since the expression trees generated for the EF provider are not fully compatible with lambda.Compile either. And last, if our custom provider uses untyped properties, those would have to be “resolved” in the lambda expression before .Compile is called. Not to mention how to solve expansions for real, without lazy loading the navigation properties one by one. All this will have to wait for another post.
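
The null propagation requirement is easy to reproduce outside the service. The sketch below uses two made-up classes (SampleProduct and SampleCategory) and two hand-written lambdas. The real expression trees generated by WCF Data Services look different, so treat this only as an illustration of why a compiled lambda needs explicit null checks.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entities used only for this illustration.
public class SampleCategory
{
    public string Name { get; set; }
}

public class SampleProduct
{
    public SampleCategory Category { get; set; }
}

public static class NullPropagationDemo
{
    // Without null checks the compiled delegate dereferences a null Category.
    public static bool UncheckedProjectionThrows()
    {
        Expression<Func<SampleProduct, string>> projection = p => p.Category.Name;
        Func<SampleProduct, string> func = projection.Compile();
        try
        {
            func(new SampleProduct());
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    // With an explicit null check in the tree the delegate returns null instead.
    public static string CheckedProjection()
    {
        Expression<Func<SampleProduct, string>> projection =
            p => p.Category == null ? null : p.Category.Name;
        return projection.Compile()(new SampleProduct());
    }
}
```

A database backed provider usually translates the first form to SQL, where a missing row simply yields NULL. Compiled code has no such safety net, which is why the expression needs to carry the null checks itself.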

And here’s the full sample as a VS 2010 solution.