EDM and Store functions exposed in LINQ

In this post Colin Meek and Diego Vega delve into some enhancements we are planning for LINQ to Entities, anyway over to them...

Entity Framework v1 customers preferring to write their queries using LINQ often hit a limitation on the range of functions and query patterns supported in LINQ to Entities. For some of those customers, having to resort to Entity SQL, or even to Entity SQL builder methods, feels awkward and reduces the appeal of Entity Framework.

There are two things we want to do in order to address this in future versions:

  • Expand the range of patterns and standard BCL methods we recognize in LINQ expressions.

  • Provide an extensibility mechanism that people can use to map arbitrary CLR methods to appropriate server and EDM functions.

This blog post expands on the second approach:

It is actually possible for us to improve our LINQ implementation so that all functions defined in the EDM and in the store, and even user defined functions, can be mapped to CLR methods with homologous signatures.

Design

Problem space

There are multiple dimensions to the problem space we want to address:

  • Functions can be defined in either the conceptual or the storage space

  • Functions can be defined in either the manifest, or just declared in the model

  • Functions can be mapped to either static CLR methods or to instance methods on the ObjectContext

  • This feature specifically targets composable functions

How it looks like: EdmFunctionAttribute

The basis of the extensibility mechanism is a new method-level attribute that carry function mapping information. Here is the basic signature of the attribute’s constructor:

public EdmFunctionAttribute(string namespaceName, string functionName)

The namespaceName parameter indicates the namespace for the function in metadata (i.e. “EDM” or “SQLSERVER”, or other store provider namespace). The functionName parameter takes the name of the function itself.

The following example could be product code or customer code applying the attribute on an extension method (it could be a regular static function) in order to map it to the standard deviation SQL Server function:

public static class SqlFunctions
{
[EdmFunction("SqlServer", "stdev")]
public static double? StandardDeviation(this IEnumerable<int?> source)
{
        throw EntityUtil.NotSupported(
System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall);
}
}

Notice that while this method can’t be called directly it can be used in a query like this:

var query =
from p in context.Products
where !p.Discontinued
group p by p.Category into g
select g.Select(each => each.ReorderLevel).StandardDeviation();

The following example shows how the canonical DiffYear function is mapped:

public static class EntityFunctions
{
[EdmFunction("EDM", "DiffYears")]
public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
{
throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall);
}
}

Usage is:

var query =
    from p in context.Products
    where EntityFunctions.DiffYears(DateTime.Today, p.CreationDate) < 5
    select p;

The following example shows how a user defined function defined in SQL Server can be mapped:

public static class MyCustomFunctions
{
    [EdmFunction("SqlServer", "MyFunction")]
    public static Int32? MyFunction(string myArg)
    {
        throw new NotSupportedException("Direct calls not supported"); 
    }
}

Convention based function name

We can establish that by convention the name of the CLR function defines the value of the functionName parameter. That makes the functionName parameter in the EdmFunctionAttribute optional.

EdmFunctionNamespaceAttribute

To avoid having to always specify the namespaceName for each function, we define a new class-level attribute named EdmFunctionNamespaceAttribute that would define the namespace mapping globally for a given class:

public EdmFunctionNamespaceAttribute(string namespaceName)

Using EdmFunctionNamespaceAttribute and the convention based constructor:

[EdmFunctionNamespace("EDM")]
public static class EdmMethods
{
    [EdmFunction]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(
            System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

How it works

When a method with the EdmFunction attribute is detected within a LINQ query expression, its treatment is identical to that of a function within an Entity-SQL query. Overload resolution is performed with respect to the EDM types (not CLR types) of the function arguments. Ambiguous overloads, missing functions or lack of overloads result in an exception. In addition, the return type of the method must be validated. If the CLR return type does not have an implicit cast to the appropriate EDM type, the translation will fail.

Instance methods on the ObjectContext will be supported as well. This allows the method to bootstrap itself and trigger direct evaluation, as in the following example (definition of the method and sample query):

public static class MyObjectContext : ObjectContext
{
    // Method definition
    [EdmFunction("edm", "floor")]
    public double? Floor(double? value)
    { 
        return this.QueryProvider.Execute<double?>(Expression.Call(
            Expression.Constant(this),
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(value, typeof(double?))));
    }
}

// evaluated in the store!
context.Floor(0.1);

Without the ObjectContext, the function cannot reach the store! To support this style of bootstrapping, the context needs to expose the LINQ query provider. For this reason, we now expose a “QueryProvider” property on the ObjectContext. This provider includes the necessary surface to construct or execute a query given a LINQ expression.

public class ObjectContext
{
    public IQueryProvider QueryProvider { get; }
}

If such a method is encountered inline in another query, then we must validate that the instance argument (MethodCallExpression.Object) is the correct context, but the instance is otherwise ignored:

// positive
var q1 = from p in context.Products select context.Floor(p.Price);

// negative
var q2 = from p in context.Products select context2.Floor(p.Price);

A function proxy can sometimes bootstrap itself without an explicit context, e.g. when an input argument is itself an IQueryable:

public static class SqlFunctions
{
    [EdmFunction("SqlServer", "stdev")]
    public static double? StandardDeviation(this IQueryable<int?> source)
    {
        return source.Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(source)));
    }
}

Nullability considerations

Particularly for functions taking collections, we will need to provide overloads for nullable and non-nullable elements. We don’t want to require awkward constructions like:

var query = (from p in products select (int?)p.ReorderLevel).StandardDeviation();

Tool for Generating the Functions

We created a simple internal tool that generates the classes that represent all the EDM canonical function and the SQL Server store functions. The tool will take the function definitions from Metadata and generate the appropriate function stubs/implementations.

The tool will be outside the product and will be run on demand. We expect to make a version of this tool available for provider writers together with the provider samples.

Naming

The methods will be in the following classes:

Namespace Class name
System.Data.Objects EntityFunctions
System.Data.Objects.SqlClient SqlFunctions

Note: The equivalent class in LINQ to SQL is System.Data.Linq.SqlClient.SqlMethods.

The method names will correspond to the name of the EDM/SQL function they represent. The argument names will correspond to the argument names of the EDM/SQL functions as retrieved by the metadata.

The recommendation for provider writers will be to include a similar static class in a namespace of the following form:

System.Data.Objects.[Standard provider namespace].[Standard provider prefix]Functions

Overloads and Implementation

Non-aggregate Functions

For each non-aggregate function we create an overload with all inputs type as nullable of the CLR equivalent of their EDM primitive type, and the return type nullable of the CLR equivalent of their EDM primitive type.

The implementation of the functions (what gets executed if the function is invoked outside an expression tree) will be to throw a NotSupportedException.

Example:

public static class EntityFunctions
{
    [EdmFunction("EDM", "DiffYears")]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

public static class SqlFunctions
{
    [EdmFunction("SqlServer", "DiffYears")]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

Aggregate Functions

For each aggregate function we will provide two overloads, one with IEnumerable<Nullable<T>> and another one with IEnumerable<T>, where T is the CLR equivalent of the EDM primitive type of the input. The implementations of these will check whether the input is IQueryable in which case it will implement the self-bootstrapping.

Example:

[EdmFunction("EDM", "VARP")]
public static double? VarP(IEnumerable<int> arg1)
{
    ObjectQuery<int> objectQuerySource = source as ObjectQuery<int>;
    if (objectQuerySource != null)
    {
        return ((IQueryable)objectQuerySource).Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
        Expression.Constant(source)));
    }
    throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
}

[EdmFunction("EDM", "VARP")]
public static double? VarP(IEnumerable<int?> arg1)
{
    ObjectQuery<int?> objectQuerySource = source as ObjectQuery<int?>;
    if (objectQuerySource != null)
    {
        return ((IQueryable)objectQuerySource).Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
        Expression.Constant(source)));
    }
    throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
}

The Entity Framework team would love to hear your comments.

Alex James  
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.