Building a LINQ IQueryable provider – Part XIV


This is the fourteenth in a series of posts on how to build a LINQ IQueryable provider. If you have not read the previous posts you might request a weeks vacation, sit back, relax with a mochacino in one hand a netbook in the other, or if you’ve got better things to do with your time print them all out and stuff them under your pillow. Who knows, it might work better.


Complete list of posts in the Building an IQueryable Provider series


Okay, enough with all the post-is-late guilt! It’s done now, so breathe a sigh of relief and get on with the reading.


What’s inside:



More Mapping – Finally a real mapping system, with attributes and XML.


More Providers – MS Access and MS SQL Server Compact Edition


More POCO – Constructors, Enum and Interfaces.


More More More


The full source code can be found at:


http://www.codeplex.com/IQToolkit


More Mapping



  • Attribute Mapping – put attributes on the properties in the class that declares your tables.

    This differs from LINQ to SQL mapping attributes which are placed on the entities themselves and is more like the proposed LINQ to Entity mapping attributes. However, I’ve not actually gone out of my way to make them the same. The advantages to this approach are 1) keeping the mapping separate from the entity objects (more POCO), and 2) being able to supply different mapping for the same entity type based on the table the entities are accessed from. 

    Mapping attributes look like this:
  • [Table]
    [Column(Member = “CustomerId”, IsPrimaryKey = true)]
    [Column(Member = “ContactName”)]
    [Column(Member = “CompanyName”)]
    [Column(Member = “Phone”)]
    [Column(Member = “City”, DbType=“NVARCHAR(20)”)]
    [Column(Member = “Country”)]
    [Association(Member = “Orders”, KeyMembers = “CustomerID”, RelatedEntityID = “Orders”, RelatedKeyMembers = “CustomerID”)]
    public IUpdatableTable<Customer> Customers


You specify the Table, Column and Association attributes as necessary.  The ‘Member’ refers to the member in the entity type. If this is the same name as the database’s column name you don’t need to repeat it by specifying ‘Name’ too. 

You can specify nested mapping information by using a dot in the Member name. This allows you to have what some call value types, but to keep from clashing with .Net terminology I don’t. For example, if you’ve defined an Address type that you want to use in a nested relationship (actually embedded in the same table row) you can do that like this:

[Table]
[Column(Member = “EmployeeID”, IsPrimaryKey = true)]
[Column(Member = “LastName”)]
[Column(Member = “FirstName”)]
[Column(Member = “Title”)]
[Column(Member = “Address.Street”, Name = “Address”)]
[Column(Member = “Address.City”)]
[Column(Member = “Address.Region”)]
[Column(Member = “Address.PostalCode”)]
public IUpdatable<Employee> Employees


  • Xml Mapping — this is same as attribute based mapping but data is read from an XML file. 

    Xml mapping looks like this:
  • <?xml version=1.0encoding=utf-8?>
    <
    map>
    <
    Entity Id=Customers>
    <
    Table Name=Customers/>
    <
    Column Member = CustomerIdIsPrimaryKey = true/>
    <
    Column Member = ContactName/>
    <
    Column Member = CompanyName/>
    <
    Column Member = Phone/>
    <
    Column Member = CityDbType=NVARCHAR(20)/>
    <
    Column Member = Country/>
    <
    Association Member = OrdersKeyMembers = CustomerIDRelatedEntityID = OrdersRelatedKeyMembers = CustomerID/>
    </
    Entity>
    <
    Entity Id=Orders>
    <
    Column Member = OrderIDIsPrimaryKey = trueIsGenerated = true/>
    <
    Column Member = CustomerID/>
    <
    Column Member = OrderDate/>
    <
    Association Member = CustomerKeyMembers = CustomerIDRelatedEntityID = CustomersRelatedKeyMembers = CustomerID/>
    <
    Association Member = DetailsKeyMembers = OrderIDRelatedEntityID = OrderDetailsRelatedKeyMembers = OrderID/>
    </
    Entity>
    <
    Entity Id=OrderDetails>
    <
    Table Name=Order Details/>
    <
    Column Member = OrderIDIsPrimaryKey = true/>
    <
    Column Member = ProductIDIsPrimaryKey = true/>
    <
    Association Member = ProductKeyMembers = ProductIDRelatedEntityID = ProductsRelatedKeyMembers = ProductID/>
    </
    Entity>
    </
    map>


You use it like this:


XmlMapping mapping = XmlMapping.FromXml(TSqlLanguage.Default, File.ReadAllText(@”northwind.xml”));
SqlQueryProvider provider = new SqlQueryProvider(connection, mapping);


  • Multi-table mapping — Map multiple tables into a single entity.  If you’ve got entity data spread out over multiple tables with a 1:1 association between them you can now specify the additional tables in mapping using the ExtensionTable attribute or equivalent XML element. 

    Here’s what a multi-table mapping looks like:
  • [Table(Name = “TestTable1”, Alias = “TT1”)]
    [ExtensionTable(Name = “TestTable2”, Alias = “TT2”, KeyColumns = “ID”, RelatedAlias = “TT1”, RelatedKeyColumns = “ID”)]
    [ExtensionTable(Name = “TestTable3”, Alias = “TT3”, KeyColumns = “ID”, RelatedAlias = “TT1”, RelatedKeyColumns = “ID”)]
    [Column(Member = “ID”, Alias = “TT1”, IsPrimaryKey = true, IsGenerated = true)]
    [Column(Member = “Value1”, Alias = “TT1”)]
    [Column(Member = “Value2”, Alias = “TT2”)]
    [Column(Member = “Value3”, Alias = “TT3”)]
    public IUpdatable<MultiTableEntity> MultiTableEntities


Extension tables are specified similar to how Associations are specified, except you are never referring to members, only column names.  You use the ‘Alias’ value to connect column & association mappings with columns from particular tables.  All queries for this multi-table entity treat the ‘Table’ as the primary table queried, all other tables are queried with left-outer joins.  All keys for associations must be from the same alias.

Can I mix nested mapping with multi-table mapping?  I have not tried it, but in theory it should work. It should not matter which table your nested entity gets it’s data from, so in effect you can have a composition relationship between one table and another as long as it is 1:1.  



What about many-to-many?  Not yet. Making the system query a many-to-many relationship is relatively easy.  I haven’t yet figured out the right semantics for inserts & updates. Right now, all insert, updates and deletes are explicit via calls to the IUpdatable with real-live entities. Yet how do you make an explicit update to the link table that you don’t have an entity directly mapped to?  I need to ponder this some more.  Possibly if one side of the relationship is a composition as opposed to an association, then it would be implied when that side is updated.  Yet what if you chose not to load the relationship, how do you tell the system to not delete all previous relationships?


More Providers



  • MS Access — This new query provider works with both Access 2000 – 2003 and Access 2007 data files. I don’t know what the true differences are between the Jet and the Ace engines; the query language appears to be identical (as per my limited tests so far), yet the filename extension changed in Access 2007 to ‘accdb’ instead of ‘mdb’ and the northwind sample database plumped up an extra 66% in disk size without any additional data.


In order to make this work I’ve added an AccessLanguage object that is necessary to get the correct semantics for MS Access queries and an AccessFormatter object that handles generating the correct command text. In order to salvage as much as I could from the TSqlFormatter, I moved most of this code to a common SqlFormatter base class, and now the TSQL and Access formatters only supply the deviations from the standard syntax.  (Of course, ‘standard’ is currently whatever I deem it to be so don’t go getting some actual online specification and prove me wrong.) Access only allows one command at a time, so that added an extra wrinkle, but in the end there is now support in the system for providers that can only do one command at a time. This means there are multiple round-trips to the engine for things like inserting a record and getting back the computed keys. Luckily, the access engine is in-proc so this is not really a burden. A new property on QueryLanguage, ‘AllowMultipleCommands’ determines how the execution plan is generated and whether multiple commands can be lumped together into a single ADO command.

The good news is that the access engine passes almost all the Northwind tests; some are not possible (mostly ones testing translation of framework methods that have no apparent equivalent in the access expression engine).  There were a lot of hairy strange & subtle differences in syntax between Access and TSQL, but most were handled by having different format rules, some required new expression visitors to change the query tree, like no explicit cross joins!  This caused me to write a visitor to attempt to get rid of cross joins (often injected by my visitor that tries to get rid of cross-apply joins) which is now generally useful to everyone, and if that didn’t do it, another visitor that would attempt to isolate out the cross joins from any other joins and push them into sub-queries where Access lets me use the old-style comma-list, which is truly a cross join, though it just can’t be mixed with other kinds of joins in the same from clause.



  • SQL Compact — Yes, even more SQL Server.  Though to be truthful, SQL Server Compact Edition (aka SQL CE, aka SQL Compact, aka Skweelzy) is not really SQL Server, it is some other entirely different product that handles a subset of TSQL syntax, and is not a server at all since it runs in-proc just like MS Access.


  • What about MySQL or Oracle?
    One day. The fact is that MS SQL and MS Access are easy for me to get to, they are already on my box. Getting something else up and running would take actual effort, and the MS secret database police might come get me. Meanwhile, if someone out there wants to put together a provider implementation I’ll add it into the drop.


  • Where did the SqlQueryProvider go?
    I moved it. With the addition of the new providers it became apparent that I’d have to start factoring out all this ADO provider specific nonsense, otherwise all uses of the toolkit would have direct dependencies to way more than necessary. So I made separate projects, each building its own library. I may end up separating all the core ‘data’ stuff out into its own project too.

    The solution builds these libraries now:


IQToolkit.dll
IQToolkit.Data.Access.dll
IQToolkit.Data.SqlClient.dll
IQToolkit.Data.SqlServerCe.dll


More POCO



  • Constructors — Use entities that don’t have default constructors.  It is now possible to have entities that require invocation of a constructor with parameters.  The binding process will figure out how to call your constructor and the client side code will call it for you as long as the constructor parameter names match property names. You can even have fully read-only entities if all data member are accounted for in the constructor.


  • Enums — They actually sort of work now. You can have a member in your entity typed as some enum and you get automatic mapping between that enum and a numeric type in the database.


  • Interfaces and abstract base classes — You can now declare you IQueryable’s as IQueryable<SomeInterface> or IQueryable<SomeAbstractClass> and have the provider create instances of a compatible type under the covers, automatically translating all your references to interface or abstract class members to the appropriate mapped members. You can have mutiple entities share a common interface or base class and get different mapping for each. You can write code using generics and constrain generic parameters based on an interface and write queries that will get correct translation at runtime.  (Note, variation of mapping likely won’t work with compiled queries, since the translation is fixed on the first execution.)

Less Policy



There’s not a whole lot of policy being used right now and the policy objects dependence on the mapping object was no where near as deep as the mapping object’s dependence on the language.  So policy is now independent of mapping, which means you can construct providers without specifying policy and/or reusing mapping with different policies.  Now if I could only make it simpler to specify/construct mapping without needing to know the language.  Back to the drawing board.


More Insanity



I apologize for the churn. The namespace changed so now all heck is going to break loose. Gone is the simple ‘IQ’ namespace and in its place is the ‘IQToolkit’ namespace.  I really did like the ‘IQ’ name, it was short, classy and made you feel intelligent just by looking at it.  Yet, it was hard to guess at if you did not already know what it was.  I chose to change the namespace name to match the product name and the DLL name. You add reference to the IQToolkit.dll and you import/use the IQToolkit namespace. No fuss, no muss.  Except for all those files you’ll have to edit now. But hey, this is pre-pre-pre-pre beta stuff. Some people may think they are something special by snarkily keeping all their products in beta. They’ve got a lot to learn.


I hope this toolkit is becoming useful to many. I realize there have been a variety of requests for new things in the toolkit that I just have not gotten time to put in yet.  So you can expect plenty more in the future.


So enough with reading.  It’s time to code!

Comments (21)

  1. Ryan Riley says:

    From a goal of starting off with a simple how-to, you’ve gone a long way and really produced something useful. Thanks for your hard work on this project!

  2. This is the fourteenth in a series of posts on how to build a LINQ IQueryable provider. If you have not

  3. Lee says:

    Great posts – I’m actually working with mono & MySQL at the moment so I really miss LINQ, next to generics it’s one of the best features to be added to the language.

    Keep up the good work.

    Cheers

    Lee

  4. Omari says:

    When compiled queries are implemented I will think of using IQ instead of L2S )

  5. mattwar says:

    Omari,  the IQToolkit has had compiled queries for many releases now.

  6. Thank you for submitting this cool story – Trackback from DotNetShoutout

  7. kfarmer says:

    @Lee:  Have you checked out db4o?  http://www.db4o.com/s/linqdb.aspx

    IIRC, Mono can use this just fine.

  8. Erik says:

    I was wondering whether it’s possible to cast a column to a different type in the query itself.  Our tables have timestamp columns and I’d like to cast them to bigint in the queries and then represent them using Int64 types in the entity classes.  Is that possible or is there a better approach?  Thanks,

    – Erik

  9. mattwar says:

    Erik, SQL server treats timestamp columns as binary, and they come through the ADO provider as byte[], so there is no existing conversion logic to turn this into an integer.  It may be possible to do with some special case logic in the query provider.

  10. Erik says:

    I was hoping to coerce the system to express the timestamp column in SQL like this:

    SELECT [myCol1], CAST([timestampCol] AS bigint) as [timestampCol] …

    SQL 2008 will convert a timestamp to a bigint (not sure about previous version).  I was thinking a new mapping attribute field to cast values on the server would be useful in cases where IConvertable can’t help:

    [Column(CastAs="BigInt")]

    public long TimeStampCol {get; set;}

    It’s not a big deal, I can poke an explicit conversion function into the entity type hierachy.

    BTW, Thanks so much for working on this project — it’s an outstanding contribution.

  11. KShaban says:

    Is there any support for delayed value loading?

    BTW, been following your work and it is definetly outstanding.

    Thanks

  12. mattwar says:

    KShaban,

    There is currently only support for delayed 1:n relationship loading.

  13. Robert Byrne says:

    Small typo, in IQToolkit.ExpressionWriter, line 294, shouldn’t DataTime be DateTime? Unless its some secret undocumented format you guys at Microsoft don’t want us to know about.

  14. Olivier says:

    I love your work! Following the series nzearly from the beginning (from part IV or V).

    I started a while ago implementing an Oracle provider. At this time, I had to refactor a little the last version of the toolkit for it didn’t provided me with the correct extensibility points to plug my provider in… So it is really great news you provide an extensible version by now.

    PS: I have very little time to devote this "hobbyist" project. So I don’t expect I can send feedback about this before a long time. Maybe someone else will provide us with an Oracle provider sooner.

    Nevertheless, once again, I love your series and spending nights clicking (go to definition, find references, … best ways to understand something) inside your code like a miner extracting gems.

  15. deerchao says:

    Glad to know I can safely get rid of most of my tricky derived classes and in-place modifications around the little old bugs.

    Keep it going, man, cos it’s getting better and better!

  16. zihotki says:

    Hello mattwar,

    Thanks for great articles!

    I have one question about lazy-loading associations. In one previous comments you wrote that "There is currently only support for delayed 1:n relationship loading.". And as I understand this is implemented using Association attribute. I’ve look through the tests and didn’t found any test case for lazy loading so I tried to add some additional method to test this:

                   public void TestLazyLoading()

    {

    var result = db.Customers.Where(x => x.CustomerID == "ALFKI");

    foreach (var customer in result)

    {

    foreach (var order in customer.Orders)

    {

    Console.WriteLine(order.Details);

    }

    }

    }

    But this method fails. Orders property of the customer is null (DB contains orders for selected customer). I’ve tried to create an instance of the List for orders property in the constructor but in this case this collection is not loaded too. I’ve added only this function and I’ve changed path to attached database file.

    Could you please check lazy loading via association with or add a sample of usage this functionality?

  17. mattwar says:

    You are right, I don’t have a test for this.  I need to add some.

    As it stands right now, relationships are not included in results unless the QueryPolicy claims they should be.  A few of the tests mock up a policy that will claim one or more properties on the queried object should be included.  Of course, you still won’t get lazy loading unless you change the definition of the Orders property, since it is currently defined as a List<Order>.  If you change it to a DeferredList<Order> and have the policy claim that the Orders property should be included, then it will become lazy loaded.  

  18. Mr. Matt,

    good afternoon.  i am a newbie in linq.  i saw it was very powerful.  i do not know whether i am experiencing a bug with the post i had done.  no one has replied ot this:

    http://silverlight.net/forums/t/91008.aspx

    the only related article i saw was:

    http://aabs.wordpress.com/2006/11/18/how-linq-works-iqueryable/

    any help will be great.  pl. feel to reply to that posting in silverlight.net.

    sorry to have chosen you of all the experts in linq forum.

    regards

    Ravi.

  19. Dave Harvey says:

    Hi Matt,

    Fantastic job. This has been a life saver as we’re using LINQ to validate the contents of an Access database (which is not being upgraded to SQL Server any time soon…).

    If our paths ever cross I owe you two pints – one for each hand 😉

    Cheers,

    Dave Harvey

    PS. You’re got a small bug-ette with you NULL handling. Currently your provider can handle "IS NULL", but it can’t handle "IS NOT NULL".

    It’s a quick fix to required in the IQToolkit.Data.SqlFormatter VisitBinary(BinaryExpression b) method. You need something like the following to handle the "NOT" case:

                   case ExpressionType.Equal:

                   case ExpressionType.NotEqual:

                       if (right.NodeType == ExpressionType.Constant)

                       {

                           ConstantExpression ce = (ConstantExpression)right;

                           if (ce.Value == null)

                           {

                               this.Visit(left);

                               this.Write(" IS");

                               if (b.NodeType == ExpressionType.NotEqual)

                               {

                                   this.Write(" NOT");

                               }

                               this.Write(" NULL");

                               break;

                           }

                       }

                       else if (left.NodeType == ExpressionType.Constant)

                       {

                           ConstantExpression ce = (ConstantExpression)left;

                           if (ce.Value == null)

                           {

                               this.Visit(right);

                               this.Write(" IS");

                               if (b.NodeType == ExpressionType.NotEqual)

                               {

                                   this.Write(" NOT");

                               }

                               this.Write(" NULL");

                               break;

                           }

                       }

                       goto case ExpressionType.LessThan;

  20. mattwar says:

    Thanks, Dave.  You were right!