Polymorphic selector and extension methods (Query Design continued):



This post is a part of a series of posts about query design. For the previous post see: https://blogs.msdn.com/vladimirsadov/archive/2007/02/05/query-expression-query-design-continued.aspx

 

So far we have followed the design decisions for language integrated queries to the point where the user specifies a query in the following form:

Dim num_squares = From cur_num In numbers Select result = cur_num * cur_num

 

The compiler transforms this into code that iterates over the source collection (numbers in this case), applies the user-specified expression to every element, and produces the result.
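To make the discussion concrete, here is a minimal sketch of what such hard-coded iteration could look like if the source is assumed to be an IEnumerable(Of Integer). The helper name SquareAll and the choice of List(Of Integer) for the result are illustrative assumptions, not what the compiler actually emits.

```vb
Imports System
Imports System.Collections.Generic

Public Module NaiveExpansion
    ' Hypothetical helper: iterate the source, apply the user expression,
    ' and collect the results. It only works for IEnumerable(Of Integer)
    ' and always returns a List, whatever kind of collection came in.
    Public Function SquareAll(ByVal numbers As IEnumerable(Of Integer)) As List(Of Integer)
        Dim results As New List(Of Integer)
        For Each cur_num As Integer In numbers
            results.Add(cur_num * cur_num)
        Next
        Return results
    End Function

    Sub Main()
        Dim numbers() As Integer = {1, 2, 3, 4}
        ' Prints 1, 4, 9 and 16 on separate lines.
        For Each square As Integer In SquareAll(numbers)
            Console.WriteLine(square)
        Next
    End Sub
End Module
```

Both limitations noted in the comments (a fixed source interface and a fixed result type) are the ones the rest of this post sets out to remove.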

 

If you think about how universally this pattern should apply, you may notice that there are different kinds of collections. How can the compiler know how to iterate over them? One possible solution is to handle only known kinds of collections, such as those that implement IEnumerable(Of T). We certainly know how to iterate over such collections, but requiring every queryable collection to implement IEnumerable(Of T) seems overly restrictive. Producing the result should be type-specific as well, since we would expect the result of a query to be of the same kind as the source collection. For example, if we query some kind of Set type or a Data Table, we would expect to get back another Set or Data Table.

 

One of the principles of object-oriented design says that type-specific operations like these should be handled polymorphically by the types themselves. In this case the queryable types could all derive from a common ancestor and override the functions needed to run a query. That would certainly solve the problem, but it would still require all queryable types to share a common ancestor, which is too restrictive and does not work for most of the types that already exist. We really want to be able to query unrelated types, so that queries are truly extensible.

 

A good example of solving a similar problem is the “For Each” statement. In addition to IEnumerable types, “For Each” can handle types that implement the “collection pattern”. The pattern requires that the type has a “GetEnumerator” function with an appropriate signature, and the compiler goes from there by calling this function. It seems like a good idea to have a similar queryable pattern that the compiler can recognize and use for query translation, and that is exactly how VB queries are implemented:
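As an illustration of the collection pattern, the sketch below uses a type that implements no interfaces at all and is still accepted by “For Each”. In released versions of VB the pattern method is named GetEnumerator, and the object it returns needs a MoveNext function and a Current property. The Countdown and CountdownEnumerator types are hypothetical names made up for this example.

```vb
Imports System

' Hypothetical type: it does not implement IEnumerable, it only follows the pattern.
Public Class Countdown
    Private ReadOnly _start As Integer

    Public Sub New(ByVal start As Integer)
        _start = start
    End Sub

    ' The compiler looks for this method by name and signature.
    Public Function GetEnumerator() As CountdownEnumerator
        Return New CountdownEnumerator(_start)
    End Function
End Class

Public Class CountdownEnumerator
    Private _next As Integer
    Private _current As Integer

    Public Sub New(ByVal start As Integer)
        _next = start
    End Sub

    Public ReadOnly Property Current() As Integer
        Get
            Return _current
        End Get
    End Property

    ' Advances to the next value; returns False once the countdown passes 1.
    Public Function MoveNext() As Boolean
        If _next <= 0 Then Return False
        _current = _next
        _next -= 1
        Return True
    End Function
End Class

Public Module CollectionPatternDemo
    Sub Main()
        ' Prints 3, 2 and 1 on separate lines.
        For Each n As Integer In New Countdown(3)
            Console.WriteLine(n)
        Next
    End Sub
End Module
```

The compiler binds to these members by name and shape, with no common base type or interface involved, which is the idea the query pattern borrows.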

 

When compiling a query, the VB compiler looks for a function called “Select” with a signature that can accept a representation of the user expression. It is the “Select” function that does all the work of iterating over the source collection, applying the expression, and collecting the results into the result collection.

 

Dim num_squares = From cur_num In numbers Select result = cur_num * cur_num

 

Becomes:

 

Dim num_squares = numbers.Select(AddressOf CompilerGeneratedFunction)

' =====================================

' synthetic function that performs actions specified in user expression

'======================================

Function CompilerGeneratedFunction(ByVal cur_num As Integer) As Integer

        Dim result = cur_num * cur_num

        Return result

End Function
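The following is a minimal sketch of a type that satisfies this pattern on its own, assuming the syntax of released VB (System.Func as the delegate type, and brackets around the reserved word Select in the declaration). IntBag, its ToArray helper, and Square are hypothetical names introduced only for illustration.

```vb
Option Infer On

Imports System
Imports System.Collections.Generic

' Hypothetical collection type: no base class, no interfaces, just a Select method.
Public Class IntBag
    Private ReadOnly _items As New List(Of Integer)

    Public Sub Add(ByVal item As Integer)
        _items.Add(item)
    End Sub

    Public Function ToArray() As Integer()
        Return _items.ToArray()
    End Function

    ' Select is a reserved word, so the declaration is escaped with brackets.
    ' It iterates, applies the selector, and returns a collection of the same kind.
    Public Function [Select](ByVal selector As Func(Of Integer, Integer)) As IntBag
        Dim result As New IntBag
        For Each item As Integer In _items
            result.Add(selector(item))
        Next
        Return result
    End Function
End Class

Public Module QueryPatternDemo
    Public Function Square(ByVal cur_num As Integer) As Integer
        Return cur_num * cur_num
    End Function

    Sub Main()
        Dim numbers As New IntBag
        numbers.Add(1)
        numbers.Add(2)
        numbers.Add(3)

        ' Explicit call, in the same shape as the translation shown above.
        Dim viaCall As IntBag = numbers.Select(AddressOf Square)

        ' Query syntax; the compiler is expected to bind this to IntBag.Select.
        Dim viaQuery As IntBag = From cur_num In numbers Select result = cur_num * cur_num

        For Each value As Integer In viaQuery.ToArray()
            Console.WriteLine(value)
        Next
    End Sub
End Module
```

Because IntBag.Select returns an IntBag, the query result is of the same kind as the source, which was one of the goals stated earlier.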

 

 

Now that we have a well-defined way to detect and use queryable collections in queries, we just need to figure out how to add the missing “Select” functions to various types. Sure enough, Orcas has a mechanism for this, called “Extension methods”. Without going into details, it makes it possible to declare a shared method in such a way that it can be called as if it were an instance function of its first argument. Example:

 

User defines:

 Shared Function Select(target as MyCollection, selector as SomeDelegate) as MyCollection

The function can be used as an instance function of MyCollection:

 

Dim Col1 as New MyCollection

' this should work as long as selector_del is of type SomeDelegate

Col1.Select(selector_del)
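For reference, here is a sketch of the same idea in the extension method syntax that released versions of VB use: the method lives in a Module (whose members are implicitly shared) and is marked with the Extension attribute from System.Runtime.CompilerServices. MyCollection and SomeDelegate come from the example above; their members, the module name, and the Negate function are assumptions made up for this illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Runtime.CompilerServices

' Assumed shape of the delegate from the example above.
Public Delegate Function SomeDelegate(ByVal item As Integer) As Integer

' Assumed shape of MyCollection; note that it declares no Select of its own.
Public Class MyCollection
    Public ReadOnly Items As New List(Of Integer)
End Class

Public Module MyCollectionExtensions
    ' The Extension attribute lets callers write target.Select(...).
    ' Select is a reserved word, so the declaration is escaped with brackets.
    <Extension()> _
    Public Function [Select](ByVal target As MyCollection, ByVal selector As SomeDelegate) As MyCollection
        Dim result As New MyCollection
        For Each item As Integer In target.Items
            result.Items.Add(selector(item))
        Next
        Return result
    End Function
End Module

Public Module ExtensionDemo
    Public Function Negate(ByVal item As Integer) As Integer
        Return -item
    End Function

    Sub Main()
        Dim Col1 As New MyCollection
        Col1.Items.Add(1)
        Col1.Items.Add(2)

        Dim selector_del As SomeDelegate = AddressOf Negate

        ' Called as if Select were an instance function of MyCollection.
        Dim negated As MyCollection = Col1.Select(selector_del)

        For Each item As Integer In negated.Items
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

The call Col1.Select(selector_del) compiles to an ordinary call of the module function with Col1 passed as the first argument, so MyCollection itself is never modified.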

To make this even better, one part of the LINQ technology that will ship with VB9 is an extensive set of libraries of extension methods that perform all kinds of query operations on common types such as IEnumerable(Of T), arrays, or XML data. These implementations (also called query operators) use generics to stay type-agnostic, so they should handle most, if not all, of the scenarios that a query user may require.

 

The result is that when we combine pattern matching for queryable types in the compiler and standard implementations of this pattern in LINQ libraries, it becomes a complete solution that allows users to apply language integrated queries to common collection types right away.

 

The pattern matching mechanism is also extremely flexible and extensible. If somebody wants to make a type queryable, or to override an already provided implementation with a more specific one, it is very easy to do: simply add a Select function to the type. In fact, thanks to extension methods, a type does not need to be modified at all to become queryable, because Select can be supplied as an extension method.