EDM and Store functions exposed in LINQ


In this post Colin Meek and Diego Vega delve into some enhancements we are planning for LINQ to Entities, anyway over to them…

Entity Framework v1 customers preferring to write their queries using LINQ often hit a limitation on the range of functions and query patterns supported in LINQ to Entities. For some of those customers, having to resort to Entity SQL, or even to Entity SQL builder methods, feels awkward and reduces the appeal of Entity Framework.

There are two things we want to do in order to address this in future versions:

  • Expand the range of patterns and standard BCL methods we recognize in LINQ expressions.

  • Provide an extensibility mechanism that people can use to map arbitrary CLR methods to appropriate server and EDM functions.

This blog post expands on the second approach:

It is actually possible for us to improve our LINQ implementation so that all functions defined in the EDM and in the store, and even user defined functions, can be mapped to CLR methods with homologous signatures.

Design

Problem space

There are multiple dimensions to the problem space we want to address:

  • Functions can be defined in either the conceptual or the storage space

  • Functions can be defined in either the manifest, or just declared in the model

  • Functions can be mapped to either static CLR methods or to instance methods on the ObjectContext

  • This feature specifically targets composable functions

How it looks like: EdmFunctionAttribute

The basis of the extensibility mechanism is a new method-level attribute that carry function mapping information. Here is the basic signature of the attribute’s constructor:

public EdmFunctionAttribute(string namespaceName, string functionName)

The namespaceName parameter indicates the namespace for the function in metadata (i.e. “EDM” or “SQLSERVER”, or other store provider namespace). The functionName parameter takes the name of the function itself.

The following example could be product code or customer code applying the attribute on an extension method (it could be a regular static function) in order to map it to the standard deviation SQL Server function:

public static class SqlFunctions
{
    [EdmFunction(“SqlServer”, “stdev”)]
    public static double? StandardDeviation(this IEnumerable<int?> source)
    {

        throw EntityUtil.NotSupported(
            System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall);
    }
}

Notice that while this method can’t be called directly it can be used in a query like this:

var query = 
    from p in context.Products
    where !p.Discontinued
    group p by p.Category into g
    select g.Select(each => each.ReorderLevel).StandardDeviation();

The following example shows how the canonical DiffYear function is mapped:

public static class EntityFunctions
{
    [EdmFunction(“EDM”, “DiffYears”)]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

Usage is:

var query =
    from p in context.Products
    where EntityFunctions.DiffYears(DateTime.Today, p.CreationDate) < 5
    select p;

The following example shows how a user defined function defined in SQL Server can be mapped:

public static class MyCustomFunctions
{
    [EdmFunction(“SqlServer”, “MyFunction”)]
    public static Int32? MyFunction(string myArg)
    {
        throw new NotSupportedException(“Direct calls not supported”); 
    }
}

Convention based function name

We can establish that by convention the name of the CLR function defines the value of the functionName parameter. That makes the functionName parameter in the EdmFunctionAttribute optional.

EdmFunctionNamespaceAttribute

To avoid having to always specify the namespaceName for each function, we define a new class-level attribute named EdmFunctionNamespaceAttribute that would define the namespace mapping globally for a given class:

public EdmFunctionNamespaceAttribute(string namespaceName)

Using EdmFunctionNamespaceAttribute and the convention based constructor:

[EdmFunctionNamespace(“EDM”)]
public static class EdmMethods
{
    [EdmFunction]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(
            System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

How it works

When a method with the EdmFunction attribute is detected within a LINQ query expression, its treatment is identical to that of a function within an Entity-SQL query. Overload resolution is performed with respect to the EDM types (not CLR types) of the function arguments. Ambiguous overloads, missing functions or lack of overloads result in an exception. In addition, the return type of the method must be validated. If the CLR return type does not have an implicit cast to the appropriate EDM type, the translation will fail.

Instance methods on the ObjectContext will be supported as well. This allows the method to bootstrap itself and trigger direct evaluation, as in the following example (definition of the method and sample query):

public static class MyObjectContext : ObjectContext
{
    // Method definition
    [EdmFunction(“edm”, “floor”)]
    public double? Floor(double? value)
    { 
        return this.QueryProvider.Execute<double?>(Expression.Call(
            Expression.Constant(this),
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(value, typeof(double?))));
    }
}



// evaluated in the store!
context.Floor(0.1);

Without the ObjectContext, the function cannot reach the store! To support this style of bootstrapping, the context needs to expose the LINQ query provider. For this reason, we now expose a “QueryProvider” property on the ObjectContext. This provider includes the necessary surface to construct or execute a query given a LINQ expression.

public class ObjectContext
{
    public IQueryProvider QueryProvider { get; }
}

If such a method is encountered inline in another query, then we must validate that the instance argument (MethodCallExpression.Object) is the correct context, but the instance is otherwise ignored:

// positive
var q1 = from p in context.Products select context.Floor(p.Price);

// negative
var q2 = from p in context.Products select context2.Floor(p.Price);

A function proxy can sometimes bootstrap itself without an explicit context, e.g. when an input argument is itself an IQueryable:

public static class SqlFunctions
{
    [EdmFunction(“SqlServer”, “stdev”)]
    public static double? StandardDeviation(this IQueryable<int?> source)
    {
        return source.Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(source)));
    }
}

Nullability considerations

Particularly for functions taking collections, we will need to provide overloads for nullable and non-nullable elements. We don’t want to require awkward constructions like:

var query = (from p in products select (int?)p.ReorderLevel).StandardDeviation();

Tool for Generating the Functions

We created a simple internal tool that generates the classes that represent all the EDM canonical function and the SQL Server store functions. The tool will take the function definitions from Metadata and generate the appropriate function stubs/implementations.

The tool will be outside the product and will be run on demand. We expect to make a version of this tool available for provider writers together with the provider samples.

Naming

The methods will be in the following classes:

Namespace Class name

System.Data.Objects

EntityFunctions

System.Data.Objects.SqlClient

SqlFunctions

Note: The equivalent class in LINQ to SQL is System.Data.Linq.SqlClient.SqlMethods.

The method names will correspond to the name of the EDM/SQL function they represent. The argument names will correspond to the argument names of the EDM/SQL functions as retrieved by the metadata.

The recommendation for provider writers will be to include a similar static class in a namespace of the following form:

System.Data.Objects.[Standard provider namespace].[Standard provider prefix]Functions

Overloads and Implementation

Non-aggregate Functions

For each non-aggregate function we create an overload with all inputs type as nullable of the CLR equivalent of their EDM primitive type, and the return type nullable of the CLR equivalent of their EDM primitive type.

The implementation of the functions (what gets executed if the function is invoked outside an expression tree) will be to throw a NotSupportedException.

Example:

public static class EntityFunctions
{
    [EdmFunction(“EDM”, “DiffYears”)]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

public
static class SqlFunctions
{
    [EdmFunction(“SqlServer”, “DiffYears”)]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}
Aggregate Functions

For each aggregate function we will provide two overloads, one with IEnumerable<Nullable<T>> and another one with IEnumerable<T>, where T is the CLR equivalent of the EDM primitive type of the input. The implementations of these will check whether the input is IQueryable in which case it will implement the self-bootstrapping.

Example:

[EdmFunction(“EDM”, “VARP”)]
public static double? VarP(IEnumerable<int> arg1)
{
    ObjectQuery<int> objectQuerySource = source as ObjectQuery<int>;
    if (objectQuerySource != null)
    {
        return ((IQueryable)objectQuerySource).Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
        Expression.Constant(source)));
    }
    throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
}

[EdmFunction(“EDM”, “VARP”)]
public static double? VarP(IEnumerable<int?> arg1)
{
    ObjectQuery<int?> objectQuerySource = source as ObjectQuery<int?>;
    if (objectQuerySource != null)
    {
        return ((IQueryable)objectQuerySource).Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
        Expression.Constant(source)));
    }
    throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
}

The Entity Framework team would love to hear your comments.

Alex James 
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Comments (14)

  1. Diego Vega says:

    Alex published today a description Colin and I wrote on a new feature the team has been working on for

  2. In the above examples, I did not see any example of functions returning out paramter and table values functions. In linq to sql I have functions returning a collection of entity which i can join against an existing query. Is that supported?

    Also it would be nice to get support for functions taking table, a new featured in sql server 2008?

    thks

    Zeeshan Hirani

  3. Zeeshan Hirani says:

    will the function defined on teh clr be available for use with entit sql as well?

  4. Zeeshan,

    Regarding TVFs and TVPs, we do want to support them with this mechanism, but we first need to build the capability in lower layers in the Entity Framework stack. At this point, we are not sure where this is going to land.

    Regarding your question on whether the CLR functions is going to be available from Entity SQL queries: if you are referring to arbitrary CLR functions being available in Entity SQL, the answer is no. That would require of a general mechanism to translate IL code to the query language of the database server. For instance, there is no general way to translate arbitrary IL code into T-SQL. If you are referring to the functions we are exposing to LINQ with this feature, they are already available to Entity SQL queries.

    Perhaps you are interested in having methods that have both a runtime CLR behavior and a corresponding semantics when used in queries. Then the answer is yes, that would be possible with this mechanism.

    Thanks,

    Diego

  5. Zeeshan Hirani says:

    Thanks Diego for the explanation. I was looking for functions exposed to linq be also available for esql which u said it would be.

    Another question. Not sure if this is a right place to ask. Can we expect to retun partially filled entity cuz lot of times we do not need entire entity yet Ef requires a full entity set. Can the restriction be reduced to returning partially filled entity containing only entity key.

    Like in Linq to sql, I can return partially filled entity from stored procedure or dynamic sql as long as the reocrd set contains the column name that matches teh column name defined as the primary key on the entity.

    Also can we get relationship span to work with stored procedure as well?

  6. The support seems promising.

    Here are 2 ideas I have in mind:

    1. For the static functions – can’t we just declare them as extern without providing "dummy" implementation that throws exceptions.

    2. It is good to think about some kind of alternative syntax for Expression tree, because currently it seems a little bit ugly. Maybe some kind of expression tree’s magic can do the trick:

    public static double? StandardDeviation( this IQueryable<int?> source )

    {

       Expression ex = EntityExpression.Create( source.StandardDeviation() );

       return source.Provider.Execute<double?>( ex );

    }

  7. Diego Vega says:

    Zeeshan,

    I would like to understand more about the scenarios for "partially filled entities". I not sure what is exactly possible and in which circumstances this is more useful than a projection. I am aware that we support materializing arbitrary classes and using ExecuteQuery and Translate in LINQ to SQL, and I agree that is a useful feature.

    I will follow up on your other question on query span iwth stored procedures with someone that may know the answer.

  8. efdesign says:

    Stefan,

    At the language/CLR level, "extern" requires an implementation at type load time. What we are doing in this case does not really require any changes at lower levels of the framework.

    The idea of using this kind of "expression factory" method is interesting. It may result in code that is easier to read than building the expressions directly as we do. However, in the case of our internal code and generated code, the simpler expressions we get when we use the current way seem to be a better choice.

    Thanks,

    Diego

  9. Zeeshan Hirani says:

    Diego,

    Thanks for the explanation. For returning partially filled entity, the only reason i had was it was easier to pas it across tiers as compared to anonymous types. LIke for databainding scenaious, I had some stored procedures that would execute some complicated query to return few columns for my entity. When i was doing linq to sql i got away but as i moved my app over entity framework, I struggled through fixing my stored procedure to meet EF needs.  More so I would need somethings from extensions to be moved to core product that i use on a day to day basis on my project. Like I would create a stored procedure in the database and map it to any clr type entity or not it doesn’t matter. There is so much pain that one has to go through to use a stored procedure and if the stored procedure does not map to an entity, you are out of luck.

    what else.. let me think

    More esql operatos needs to be supported in queryview.

    Be able t specify at the mapping level the default sort to be used in case there is no sort provided for associations or entitysets. LIke the other day I had the need to apply a sort everywhere I am retrieving orders for customers to be sorted by order date.

    Wilson ormapper supported this..

    flexible stored procedure mapping. I want to returning idenity column using output paramters. I can hack it by using command text but i dont like that option. my list is endless so i will stop.

    But thanks for listening to our needs. I am very happy that i moved my project to EF, my code is much cleaner  from the linq to sql queries because now my model is complicated but get a new developer started with our model is much easier and quicker.

    Thks

  10. Alex Mineev (MSFT) says:

    Regarding the partially filled entities: you can define a function that will return entity refs instead of entities.

  11. Today the Entity Framework, and more specifically the Entity Data Model, have a limited notion of Functions.

  12. aaa says:

    what about sprocs with table valued parameters? very useful for bulk-inserts!

  13. A while back I wrote a post that introduced the concept of Computed Properties . Since that time we’ve

  14. bharat mali says:

    CREATE function [dbo].[DevanagariToInt](@Value nchar(10)) returns int

    begin

     return cast(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(

              @Value

            , N'०', N'0')

            , N'१', N'1')

            , N'२', N'2')

            , N'३', N'3')

            , N'४', N'4')

            , N'५', N'5')

            , N'६', N'6')

            , N'७', N'7')

            , N'८', N'8')

            , N'९', N'9') as int)

    end

    This is my procedure and i want to use in entity framework

    i use in sql server in this way

    use Inventory1011

    go

    select dbo.DevanagariToInt(companycode) as exp from CompanyMaster order by exp