LINQ: Building an IQueryable Provider – Part I



This is the first in a series of posts on how to build a LINQ IQueryable provider.  Each post builds on the last one.


Complete list of posts in the Building an IQueryable Provider series 


 


I’ve been meaning for a while to start up a series of posts that covers building LINQ providers using IQueryable.  People have been asking me for advice on doing this for quite some time now, whether through internal Microsoft email or questions on the forums or by cracking the encryption and mailing me directly.  Of course, I’ve mostly replied with “I’m working on a sample that will show you everything,” letting them know that soon all will be revealed. However, instead of just posting a full sample here, I felt it prudent to go step by step so I can actually dive deep and explain everything that is going on instead of just dumping it all in your lap and letting you find your own way.


The first thing I ought to point out to you is that IQueryable has changed in Beta 2.  It’s no longer just one interface, having been factored into two: IQueryable and IQueryProvider. Let’s just walk through these before we get to actually implementing them.


If you use Visual Studio to ‘go to definition’ you get something that looks like this: 


    public interface IQueryable : IEnumerable {
        Type ElementType { get; }
        Expression Expression { get; }
        IQueryProvider Provider { get; }
    }

    public interface IQueryable<T> : IEnumerable<T>, IQueryable, IEnumerable {
    }


Of course, IQueryable no longer looks all that interesting; the good stuff has been pushed off into the new interface IQueryProvider. Yet before I get into that, IQueryable is still worth looking at.  As you can see, the only things IQueryable has are three read-only properties.  The first one gives you the element type (the ‘T’ in IQueryable<T>).  It’s important to note that all classes that implement IQueryable must also implement IQueryable<T> for some T and vice versa.  The generic IQueryable<T> is the one you use most often in method signatures and the like. The non-generic IQueryable exists primarily to give you a weakly typed entry point for dynamic query building scenarios.
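To make the weakly typed entry point concrete, here is a minimal sketch. The class and method names (ElementTypeDemo, Describe, IsGenericQueryOfInt) are made up for illustration, and I’m using the LINQ to Objects AsQueryable() wrapper as a stand-in for a real provider, since we have not built one yet.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class ElementTypeDemo
{
    // Works against the non-generic interface: the caller does not
    // need to know T at compile time to discover the element type.
    public static Type Describe(IQueryable query)
    {
        return query.ElementType;
    }

    // The same object also implements the generic interface for its T.
    public static bool IsGenericQueryOfInt(IQueryable query)
    {
        return query is IQueryable<int>;
    }
}
```

Calling Describe on `new[] { 1, 2, 3 }.AsQueryable()` hands back typeof(int), even though the method itself only ever sees the non-generic IQueryable.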


The second property gives you the expression that corresponds to the query. This is the essence of IQueryable’s being. The actual ‘query’ underneath the hood of an IQueryable is an expression that represents the query as a tree of LINQ query operators/method calls. This is the part of the IQueryable that your provider must comprehend in order to do anything useful. If you look deeper you will see that the whole IQueryable infrastructure (including the System.Linq.Queryable version of the LINQ standard query operators) is just a mechanism to auto-construct expression tree nodes for you.  When you use the Queryable.Where method to apply a filter to an IQueryable, it simply builds you a new IQueryable, adding a method-call expression node on top of the tree representing the call you just made to Queryable.Where. Don’t believe me? Try it yourself and see what it does.
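If you want to try it, something like the following sketch will do. WhereTreeDemo and BuildWhereCall are names I made up, and the source is just an array wrapped with AsQueryable(), so this shows the tree that the standard Queryable operators build rather than anything specific to a custom provider.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper, for illustration only.
public static class WhereTreeDemo
{
    public static MethodCallExpression BuildWhereCall()
    {
        IQueryable<int> source = new[] { 1, 2, 3 }.AsQueryable();

        // Nothing is filtered here; Where only builds a new query object.
        IQueryable<int> filtered = source.Where(x => x > 1);

        // The new query's expression is a method-call node layered
        // on top of the source query's own expression.
        return (MethodCallExpression)filtered.Expression;
    }
}
```

The node that comes back has Method.Name equal to "Where", is declared on System.Linq.Queryable, and carries two arguments: the source query’s expression and the quoted lambda.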


Now that just leaves us with the last property that gives us an instance of this new interface IQueryProvider.  What we’ve done is move all the methods that implement constructing new IQueryables and executing them off into a separate interface that more logically represents your true provider.


    public interface IQueryProvider {
        IQueryable CreateQuery(Expression expression);
        IQueryable<TElement> CreateQuery<TElement>(Expression expression);
        object Execute(Expression expression);
        TResult Execute<TResult>(Expression expression);
    }


Looking at the IQueryProvider interface you might be thinking, “why all these methods?”  The truth is that there are really only two operations, CreateQuery and Execute; we just have both a generic and a non-generic form of each. The generic forms are used most often when you write queries directly in the programming language, and they perform better since we can avoid using reflection to construct instances.


The CreateQuery method does exactly what it sounds like it does.  It creates a new instance of an IQueryable query based on the specified expression tree.  When someone calls this method they are basically asking your provider to build a new instance of an IQueryable that, when enumerated, will invoke your query provider and process this specific query expression.  The Queryable form of the standard query operators uses this method to construct new IQueryables that stay associated with your provider.  Note that the caller can pass any expression tree possible to this API. It may not even be a legal query for your provider.  However, the one thing that must be true is that the expression itself must be typed to return/produce a correctly typed IQueryable. You see, the IQueryable contains an expression that represents a snippet of code that, if turned into actual code and executed, would reconstruct that very same IQueryable (or its equivalent).
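Here is a rough sketch of calling CreateQuery by hand, doing approximately what Queryable.Where does for you. CreateQueryDemo and WhereGreaterThanOne are hypothetical names, and I’m assuming the source is any IQueryable<int> whose provider understands the Where operator (the LINQ to Objects one from AsQueryable() does).

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper, for illustration only.
public static class CreateQueryDemo
{
    public static IQueryable<int> WhereGreaterThanOne(IQueryable<int> source)
    {
        Expression<Func<int, bool>> predicate = x => x > 1;

        // Build a call node for Queryable.Where<int>(source, predicate).
        // The node's type is IQueryable<int>, which is what CreateQuery<int> requires.
        MethodCallExpression call = Expression.Call(
            typeof(Queryable), "Where", new[] { typeof(int) },
            source.Expression, Expression.Quote(predicate));

        // Ask the source's own provider for a query over the new tree,
        // so the result stays associated with that provider.
        return source.Provider.CreateQuery<int>(call);
    }
}
```

Nothing runs until the returned query is enumerated; at that point the provider is handed the whole tree, including the Where node built above.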


The Execute method is the entry point into your provider for actually executing query expressions.  Having an explicit execute instead of just relying on IEnumerable.GetEnumerator() is important because it allows execution of expressions that do not necessarily yield sequences.  For example, the query “myquery.Count()” returns a single integer.  The expression tree for this query is a method call to the Count method that returns the integer.  The Queryable.Count method (as well as the other aggregates and the like) use this method to execute the query ‘right now’.
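To see why Execute has to exist, here is a sketch that does by hand roughly what Queryable.Count does. ExecuteDemo and CountViaExecute are made-up names, and again I’m leaning on the AsQueryable() provider as a stand-in for one of our own.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper, for illustration only.
public static class ExecuteDemo
{
    public static int CountViaExecute(IQueryable<int> source)
    {
        // A call node for Queryable.Count<int>(source). Its type is int,
        // not a sequence, so there is nothing to enumerate later.
        MethodCallExpression call = Expression.Call(
            typeof(Queryable), "Count", new[] { typeof(int) },
            source.Expression);

        // Execute runs the query immediately and returns the scalar result.
        return source.Provider.Execute<int>(call);
    }
}
```

Because the result is typed as a plain int, there is no object left over to defer the work, which is why aggregates run ‘right now’.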


There, that doesn’t seem so frightening, does it?   You could implement all those methods easily, right? Sure you could, but why bother?  I’ll do it for you.  Well, all except for the Execute method.  I’ll show you how to do that in a later post.


First let’s start with the IQueryable. Since this interface has been split into two, it’s now possible to implement the IQueryable part just once and re-use it for any provider.  I’ll implement a class called Query<T> that implements IQueryable<T> and all the rest.


    public class Query<T> : IQueryable<T>, IQueryable, IEnumerable<T>, IEnumerable, IOrderedQueryable<T>, IOrderedQueryable {
        QueryProvider provider;
        Expression expression;

        // Root query: its expression is a constant node that refers to the query itself.
        public Query(QueryProvider provider) {
            if (provider == null) {
                throw new ArgumentNullException("provider");
            }
            this.provider = provider;
            this.expression = Expression.Constant(this);
        }

        // Query over an arbitrary expression tree handed to the provider.
        public Query(QueryProvider provider, Expression expression) {
            if (provider == null) {
                throw new ArgumentNullException("provider");
            }
            if (expression == null) {
                throw new ArgumentNullException("expression");
            }
            // The expression must be typed to produce an IQueryable<T>.
            if (!typeof(IQueryable<T>).IsAssignableFrom(expression.Type)) {
                throw new ArgumentOutOfRangeException("expression");
            }
            this.provider = provider;
            this.expression = expression;
        }

        Expression IQueryable.Expression {
            get { return this.expression; }
        }

        Type IQueryable.ElementType {
            get { return typeof(T); }
        }

        IQueryProvider IQueryable.Provider {
            get { return this.provider; }
        }

        // Enumerating the query is what triggers execution by the provider.
        public IEnumerator<T> GetEnumerator() {
            return ((IEnumerable<T>)this.provider.Execute(this.expression)).GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator() {
            return ((IEnumerable)this.provider.Execute(this.expression)).GetEnumerator();
        }

        public override string ToString() {
            return this.provider.GetQueryText(this.expression);
        }
    }


 


As you can see now, the IQueryable implementation is straightforward. This little object really does just hold onto an expression tree and a provider instance. The provider is where it really gets juicy.


Okay, now I need some provider to show you.  I’ve implemented an abstract base class called QueryProvider that Query<T> referred to above.  A real provider can just derive from this class and implement the Execute method.


    public abstract class QueryProvider : IQueryProvider {
        protected QueryProvider() {
        }

        IQueryable<S> IQueryProvider.CreateQuery<S>(Expression expression) {
            return new Query<S>(this, expression);
        }

        IQueryable IQueryProvider.CreateQuery(Expression expression) {
            // The element type is not known statically, so build Query<elementType> via reflection.
            Type elementType = TypeSystem.GetElementType(expression.Type);
            try {
                return (IQueryable)Activator.CreateInstance(typeof(Query<>).MakeGenericType(elementType), new object[] { this, expression });
            }
            catch (TargetInvocationException tie) {
                // Surface the constructor's own exception instead of the reflection wrapper.
                throw tie.InnerException;
            }
        }

        S IQueryProvider.Execute<S>(Expression expression) {
            return (S)this.Execute(expression);
        }

        object IQueryProvider.Execute(Expression expression) {
            return this.Execute(expression);
        }

        // The two members a concrete provider must supply.
        public abstract string GetQueryText(Expression expression);
        public abstract object Execute(Expression expression);
    }


 


I’ve implemented the IQueryProvider interface on my base class QueryProvider.  The CreateQuery methods create new instances of Query<T> and the Execute methods forward execution to this great new and not-yet-implemented Execute method.


 


I suppose you can think of this as boilerplate code you have to write just to get started building a LINQ IQueryable provider.  The real action happens inside the Execute method.  That’s where your provider has the opportunity to make sense of the query by examining the expression tree.


And that’s what I’ll start showing next time.


 


UPDATE:


It looks like I forgot to define a little helper class my implementation was using, so here it is:


    internal static class TypeSystem {
        internal static Type GetElementType(Type seqType) {
            Type ienum = FindIEnumerable(seqType);
            if (ienum == null) return seqType;
            return ienum.GetGenericArguments()[0];
        }

        private static Type FindIEnumerable(Type seqType) {
            if (seqType == null || seqType == typeof(string))
                return null;
            if (seqType.IsArray)
                return typeof(IEnumerable<>).MakeGenericType(seqType.GetElementType());
            if (seqType.IsGenericType) {
                foreach (Type arg in seqType.GetGenericArguments()) {
                    Type ienum = typeof(IEnumerable<>).MakeGenericType(arg);
                    if (ienum.IsAssignableFrom(seqType)) {
                        return ienum;
                    }
                }
            }
            Type[] ifaces = seqType.GetInterfaces();
            if (ifaces != null && ifaces.Length > 0) {
                foreach (Type iface in ifaces) {
                    Type ienum = FindIEnumerable(iface);
                    if (ienum != null) return ienum;
                }
            }
            if (seqType.BaseType != null && seqType.BaseType != typeof(object)) {
                return FindIEnumerable(seqType.BaseType);
            }
            return null;
        }
    }


Yah, I know. There’s more ‘code’ in this helper than in all the rest.  Sigh. :-)
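If it helps to see what the helper is after, here is a shorter sketch that answers the same question for the common cases: given a sequence type, which T makes it an IEnumerable<T>? This is not the TypeSystem class above, just an illustrative alternative under my own assumptions (the ElementTypeSketch name is made up), and it takes the first IEnumerable<T> it finds, so it may differ from the original on unusual types that implement several.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical alternative to the helper above, for illustration only.
public static class ElementTypeSketch
{
    public static Type GetElementType(Type seqType)
    {
        // A string is an IEnumerable<char>, but queries treat it as a scalar.
        if (seqType == typeof(string)) return seqType;

        // GetInterfaces() does not include the type itself, so add it
        // in case seqType is already IEnumerable<T>.
        foreach (Type candidate in new[] { seqType }.Concat(seqType.GetInterfaces()))
        {
            if (candidate.IsGenericType &&
                candidate.GetGenericTypeDefinition() == typeof(IEnumerable<>))
            {
                return candidate.GetGenericArguments()[0];
            }
        }

        // Not a sequence: the type is its own element type.
        return seqType;
    }
}
```

For example, passing typeof(List<int>) or typeof(int[]) gives typeof(int), while typeof(string) and typeof(int) come back unchanged.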


 

Comments (51)

  2. Here’s an anthology of VS 2008 Beta 2’s changes to LINQ and its domain-specific implementations, which includes a brief description and link to this post: http://oakleafblog.blogspot.com/2007/07/linq-changes-from-orcas-beta-1-to-vs.html

    –rj

  4. Frans Bouma says:

    Is there a reason why the scalar query isn’t executed ‘deferred’ ? I more and more start to dislike the whole ‘query is executed when you enumerate’ model, as it’s obscure and unclear, like with this scalar query which is executed immediately. It should have been better if there was some kind of ‘execute’ method on a queryable. Yes, that would require a method call, but it would make everything clear and unobscure.

    I also wonder a bit if this whole ‘enumeration should execute the query’ drove this design. The reason is that it doesn’t make any sense to have the provider on the queryable, unless execution is performed on the queryable by enumeration. If this would have been separated, you would be able to create a queryable, and pass it to a provider of choice. That’s now not possible.

  5. Frans,

    The reason that scalar queries are not deferred is that they are typed as a scalar.  The only representation of a deferred item we have is via IEnumerable. So if the query does not result in an IEnumerable we have no way to defer it.

    The reason there is no Execute method on an IQueryable was due to wanting the usage experience with IQueryable to be the same as IEnumerable.

    The reason the provider is part of the IQueryable is so the root of the query can determine which query processor is used; so it can just be enumerated and the right thing happens.  Without this you would have to keep around knowledge about which provider should execute which query and you would lose the ability to just tack on a filter operation (or skip & take) w/o needing to know the kind of query you are dealing with.

  6. Frans Bouma says:

    Well, the reason I asked about why the provider is part of the queryable is that I now can’t have general code which works on a general set of entities and write a query there and use it on any of the providers I have available, i.e. one per database: I now have to provide this info to the datasource I’m using in the code which formulates the query.

    Sure, if I have just one provider, no problem. If I have generic code which can target sqlserver and oracle and db2 at the same time, I have a problem, as the code in my application should be generic (I now can do that) without knowledge of db’s but when I have different providers, I can’t because I have to pass these on in the code which formulates the query. So this then requires a query provider which is actually a placeholder which gets the real provider plugged in when the actual db to target is selected.

    or I’m missing something obvious 🙂

  7. Frans, you may be confusing the concept of a LINQ provider with an ADO database provider.  You can certainly have your ‘provider’ target a variety of different databases, etc.

    Also, tying the IQueryable provider to the IQueryable only influences the default translation of the query. You can also get the Expression from any IQueryable and attempt to process it using another provider.

  8. Now, that I’ve laid the groundwork defining a reusable version of IQueryable and IQueryProvider, namely Query and QueryProvider, I’m going to build a provider that actually does something. As I said before, what a query provider really does is execute

  9. Frans Bouma says:

    Thanks Matt for clearing that up. I indeed am confusing the two, so if I can have a normal provider which can later on be tied to an ado.net db provider, I’m OK 🙂

  10. Frans Bouma says:

    Additionally, I’m really happy you’re writing these articles. I was a little disappointed when I saw that the docs to write a linq provider weren’t included in orcas beta 2’s docs but luckily these articles will help me get started 🙂

  11. Part III? Wasn’t I done in the last post? Didn’t I have the provider actually working, translating, executing and returning a sequence of objects? Sure, that’s true, but only just so. The provider I built was really fragile. It only understood one major

  12. I just could not leave well enough alone. I had the crude LINQ provider working with just a translation of the Where method into SQL. I could execute the query and convert the results into my objects. But that’s not good enough for me, and I know it’s

  13. Over the past four parts of this series I have constructed a working LINQ IQueryable provider that targets ADO and SQL and has so far been able to translate both Queryable.Where and Queryable.Select standard query operators. Yet, as big of an accomplishment

  14. On a blog, Matt Warren presents a step-by-step implementation of a LINQ to SQL provider.

  15. After many of us (yours truly included) have literally racked our brains for months investigating

  16. This is the sixth in a series of posts on how to build a LINQ IQueryable provider. If you have not read

  17. jankyBlog says:

    Resources on Linq to SQL

  18. As you can probably tell from the title of my last few posts I’ve been doing some work with LINQ over

  20. Darth Bundy says:

    I recently spend a few (many) hours doing some research into the workings of LINQ providers for an internal

  22. This is the seventh in a series of posts on how to build a LINQ IQueryable provider. If you have not

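  23. At this weekend’s AdminTech study group, as a look back at TechEd, I will be talking for about 30 minutes together with Ozaki-san and Mesaia-san about the LINQ sessions, mixing in impressions and discussion. Until now I had never touched LINQ to SQL, or rather I had been avoiding it, but I took this as an opportunity and started playing with it over the weekend.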

  24. Welcome to the thirty-first edition of Community Convergence. This issue features links to seven very

  25. At the most abstract level, LINQ (Language Integrated Query) can query against two types of provider

  26. This is the eighth in a series of posts on how to build a LINQ IQueryable provider. If you have not read

  27. The Banshee-to-Windows porting has been more or less done for a while and the code is about to be integrated

  28. Ian Cooper says:

    What is LINQ? LINQ stands for Language Integrated Query and is a DSL within C# for querying data. It

  30. An Updated LINQ to WMI Implementation

  31. Over the holidays Alex Turner, Mary Deyo and I added a new sample to the downloadable version of the

  33. This is the ninth in a series of posts on how to build a LINQ IQueryable provider. If you have not read

  34. dave^2=-1 says:

    Just storing a couple of links for a rainy day: Mehfuz Hossain’s LINQ provider basics article on Dotnetslackers

  36. Check out the following from Matt Warrens blog posts, if you are interested on how to implement IQueryable…

  38. Last year, when I was working at db4objects on db4o , I insisted on the need for db4o to act as a LINQ

  39. Linq query providers appear all over the place. Some say "Linq to Everything" to refer to all

  40. meek says:

    Someone asked a great question on the ADO.NET Entity Framework forums yesterday: how do I compose predicates

  41. It seems that everyone else is chiming in on Danny Simmons’ recent comparisons of the Entity Framework

  42. This is the tenth in a series of posts on how to build a LINQ IQueryable provider. If you have not read the previous posts you’ll want to find a nice shady tree, relax and mediate on why your world is so confused and full of meaningless tasks that it

  43. A lot is said and written about LINQ being flexible and extensible. I myself keep repeating that in order to hook up

  44. This is the eleventh in a series of posts on how to build a LINQ IQueryable provider. If you have not read the previous posts you’ll want to do so before proceeding, or at least before proceeding to copy the code into your own project and telling your

  45. This week I am coming to you from the Microsoft Campus. So as you would expect I have a lot of energy

  46. This is the twelfth in a series of posts on how to build a LINQ IQueryable provider. If you have not

  47. Part I – Reusable IQueryable base classes Part II – Where and reusable Expression tree visitor Part II
