Language Integrated Query (LINQ) Support in the F# Power Pack

In this post I thought I would give some simple, up-to-date examples of writing queries using the F# Power Pack and executing them via LINQ. The techniques described here also apply to querying any obejcts that support the IQueryable interface.

My aim here is not to give a complete guide to "doing everything you can do in SQL" or "everything you can do with LINQ and F#". Instead, I just want to bring some of the material you might find on the web up-to-date for the Beta1/Beta2 releases of the F# Power Pack. Also, note that the IQueryable support described here is not in F#, but in the F# Power Pack, which includes other components such as FsLex and FsYacc. The binaries and code for the F# Power Pack is included with the F# CTP release and can be used with Visual Studio 2010, and we’ll be looking to do regular CodePlex releases of the Power Pack as VS2010 settles down.

 

For those unfamiliar with Linq and IQueryable, here’s a recap: .NET defines an interface IQueryable for roots of queries made up of select/where/join/group/sort and a few other SQL-like operations. C# and VB have a syntax to write queries, and desugar this syntax to expressions and expression trees. Engines such as LINQ-to-SQL then execute expression trees in various interesting ways, for example as SQL. There are many other interesting and powerful engines that implement IQueryable support too.

 

In this post we’ll be looking at how you can leverage IQueryables by using the F# Power Pack. You can walk through everything here step by step in F# Interactive and the script is attached. First, however, some basics, so let’s define some in-memory data:

 

type Customer =

    { Name:string;

      UniqueID: int;

      Weighting:float;

      Preferences: int list }

let c1 = { Name="Don"; UniqueID=6; Weighting=6.2; Preferences=[1;2;4;8] }

let c2 = { Name="Peter"; UniqueID=7; Weighting=4.2; Preferences=[10;20;30;40] }

let c3 = { Name="Freddy"; UniqueID=8; Weighting=9.2; Preferences=[11;12;14] }

let c4 = { Name="Freddi"; UniqueID=9; Weighting=1.0; Preferences=[21;22;29;42] }

let data = [c1;c2;c3;c4]

 

Simple In Memory Queries

 

In F# we express many queries using simple F# sequence expressions. Here is a very simple F# in-memory query:

let q0 =

    [ for i in data do

          if i.Name.Length = 6 then

       yield (i.Name,i.UniqueID) ]

As expected, this generates a list of length 2, containing entries for Freddy and Freddi. You can use the same inner syntax for to generate an on-demand sequence (an IEnumerable<T>):

let q1 =

    seq { for i in data do

            if i.Name.Length = 6 then

         yield (i.Name,i.UniqueID) }

When executed, this will produce an IEnumerable. We can compute the length of this sequence as follows:

q1 |> Seq.length

or we might convert it to a list:

q1 |> Seq.toList

One thing to note is that F# sequence expression syntax looks and feels a lot like regular F# code except enclosed with [ ... ] or seq { ... } and with a few yiields added (it is much like a C# iterator, except in expression form). Sequence expressions are useful for many things, but they have their limits. As a result, it is quite common to follow a sequence expression by a pipeline of combinators, e.g.

    seq { for i in data do

    for j in data do

             if i.Name.Length = j.Name.Length && i.Name <> j.Name then

    yield (i.Name,j.Name) }

    |> Seq.sortBy (fun (name1, name2) -> name1)

    |> Seq.toList

Here we did the sorting and conversion to a list just after the { ... } syntax. The whole expression above is the “query”, but only one part of it uses the sequence expression syntax. This is quite common in F#. Also, there are other kinds of operators that can be considered queries, e.g. the functions Seq.pick, Seq.tryPick, Seq.find and Seq.tryFind are all queries for single elements. Queries for single elements should be written by applying one of these functions to a sequence or some other collection.

 

Using IQueryables with the F# Power Pack

Now, as we mentioned above, .NET 3.5 introduced nice machinery for executing queries “in other ways”, for example as SQL on a database. An indicator of this is when a data source supports the type IQueryable which can be thought of as a “super enumerable”. If you see an object supporting IQueryable, then it’s likely that something “smart” can potentially go on under the hood, e.g. your queries can be implemented through a translation or optimization of some kind. For example, the objects representing handles to SQL tables may implement IQueryable.

In the rest of this post we’re going to look at how to write and execute queries that utilize IQueryables by using some simple functionality from the F# Power Pack. (Note: Anything supporting IQueryable also supports IEnumerable, so you can use it just like an ordinary F# sequence, if the data source is small).

First, however, let’s look at the simplest form of IQueryable, which is a simple wrapper around some in-memory data – this is, in a sense, a “dumb” IQueryable.

#r "System.Core.dll"

open System.Linq

let db = Queryable.AsQueryable<Customer>(data)

We can invoke the IQueryable “way” of accessing this data as follows:

#r "FSharp.PowerPack.Linq.dll"

open Microsoft.FSharp.Linq

open Microsoft.FSharp.Linq.Query

let q2 =

    query <@ seq { for i in db do

                      if i.Name.Length = 6 then

                         yield (i.Name,i.UniqueID) } @>

That is, we take the original query, and put query <@ ... @> around it. Once the query is constructed, the result can be used as an ordinary IEnumerable, e.g.

 

        q2 |> Seq.toList

This will give the same result as before. The important thing is the above can be used with any IQueryable, and that the query is executed through the IQueryable machinery. This means the syntax tree for the query is passed, piece by piece, to the IQueryable. For “real” IQueryables this in turn translates the query to a form executed on a database or a remote server.

Example: Database Queries

So far, we’ve seen the basic philosophy for the IQueryable support in the F# Power Pack; for F# query expressions involving IQueryables, you can query <@ ... @> around the expression and it gets executed through the IQueryable machinery. We’ll go into what constitutes an F# query expression a bit later, but first let’s take a look at doing database queries via this mechanism. (A short digression before we start doing database queries; don’t forget you can also do queries using more fundamental mechanisms such as those used by ADO.NET, a topic well covered in books such as Foundations of F# and Expert F#)

First let’s reference the SqlMetal generated interface to the famous NORTHWND database (Note: a version is in the zip attached to the blog, along with the northwnd.dll built by run SqlMetal.exe on this database and compiling the generated C# code to a DLL)

#r "System.Data.Linq.dll"

#r "northwnd.dll"

Assuming you have copied NORTHWND.MDF to your working directory, you can open the database using a version of SQL Express with this connection string:

let sqlServerInstance = @".\SQLEXPRESS"

let connString = @"AttachDBFileName='" + __SOURCE_DIRECTORY__ + @"\NORTHWND.MDF';Server='" + sqlServerInstance + "';user instance=true;Integrated Security=SSPI;Connection Timeout=30"

let db = new NORTHWND(connString)

It’s handy to set the log:

db.Log <- System.Console.Out

Let’s now define a simple query:

let q3 =

    query <@ seq { for i in db.Customers do

          for j in db.Employees do

                if i.Country = j.Country then

                    yield (i.ContactName, j.FirstName, j.LastName) } @>

We now recognize this as a query over an IQueryable (in this case the database tables are IQueryables). We can now execute this database query as SQL on the server and return the results, e.g.

        q3 |> Seq.toList

This will produce a list of 93 entries. Here’s a second example:

let q4 =

   query <@ seq { for c in db.Customers do

                     for e in db.Employees do

                         if c.Address.Contains("Jardim") &&

                            c.Address.Contains("rosas") then

                               yield (e.LastName,c.ContactName) } @>

// Now execute a query

q4 |> Seq.toList

And a similar example using sorting, within the query:

let q5 =

   query <@ seq { for c in db.Customers do

                     for e in db.Employees do

                         if c.Address.Contains("Jardim") &&

                            c.Address.Contains("rosas") then

                               yield e }

            |> Seq.sortBy (fun e -> e.LastName) @>

q5 |> Seq.toList

Here the sortBy occurs within the <@ ... @> , indicating it is executed as part of the database query, rather than in-memory.

As indicated above, the aim of this blog post is not to show how to write every kind of SQL or LINQ query with the F# Power Pack. However, let’s take a quick look at the set of operators that you can use inside a query <@ ... @> . For starters, a query expression can be one of these forms,

     seq { seq-expr }

     [ seq-expr ]

   [| seq-expr |]

Where the sequence expression contains the usual sequence expression constructs,:

    seq-expr =

      | for pat in query-expr do seq-expr

      | if expr then seq-expr else seq-expr

      | yield! query-expr

      | seq-expr; seq-expr

      | yield expr

      | ...

Next, a query expression can also be made up of uses of the following operators and syntactic forms, perhaps via pipelining:

      Seq.toArray

      Seq.toList

      Seq.length

      Seq.head

      Seq.find

      Seq.max

      Seq.min

      Seq.average

      Seq.averageBy

      Seq.sum

      Seq.exists

      Seq.forall

      Seq.sumBy

      Query.maxBy

      Query.minBy

      Seq.empty

      Seq.collect

      Seq.filter

      Seq.map

      Seq.take

      Seq.sort

      Seq.distinct

      Seq.sortBy

      Query.contains

      Seq.append

      Query.groupBy

      Query.join

      Query.groupJoin

Not all combinations of these can be used together – for example, Seq.sortBy and several other operators must appear as the “outer” or “last” part of a query. Finally, a query expression can contain the following, which are useful for defining macros and shortcuts in the query:

      reflected-definition [optional-args]

      let v = e in qexpr-with-post-processing

      let f args = e in qexpr-with-post-processing

      if expr then qexpr1 else qexpr2

      expr (of type IQueryable<_>) // e.g. db.Customers, f x

The ability to use reflected definitions in queries gives an interesting and powerful way to reuse query fragments, something I plan to discuss in a future post.

For technical reasons, you must use the Query.* versions of some operators. For example, joins are expressed using the Query.join and Query.groupJoin operators.

query <@ Query.join db.Employees db.Customers

             (fun e -> e.Country)

             (fun c -> c.Country)

             (fun e c -> e)

        |> Seq.length @>

Though often similar queries can be written using a conditional:

query <@ seq { for e in db.Employees do

                  for c in db.Customers do

                     if e.Country = c.Country then

                         yield e }

         |> Seq.length @>

Note that when joins are expressed as combinators, they are a little more verbose than joins written in comprehension syntax such as in C#.

Some final notes:

Ø The above hasn’t touched on nullable values, which are, of course, an important issue in database work. In F#, you must unwrap nullables explicitly – on the whole no implicit lifting of operators is provided.

Ø There are many things that can not be used inside queries, for example arbitrary imperative code. The "query" function will give a runtime failure if the query can't be translated. There are also some restrictions we plan to lift in future versions of the power pack, mostly around the use of tuples as replacements for anonymous types.

Ø

Due to an omission, Seq.sort isn’t in the list above, though we’ll be adding it in the next release of the F# Power Pack. You can use Seq.sortBy instead.

Ø The code for the F# Power Pack is in the F# CTP release, so if you wish you can take a look at how all this is implemented. For example, if you’re interested in how F# quotations get translated to LINQ expression trees, then this is where you to look. There is also an interesting pre-processing phase that effectively walks the F# quotation tree looking for the Query DSL outlined above.

Ø We’re planning a CodePlex release of the Power Pack under a more general license – more on that sometime soon!

Ø The design goals for the F# Power Pack support for queries aren’t the same of the design goals for C# and VB’s LINQ support. For example, those languages had a strict goal to make the embedded LINQ DSL query language look and feel much like SQL, though with some variations in phrasing and ordering. The F# Power Pack design goals are to make it feasible to write the most common queries against the strongly typed ORM models provided by SqlMetal, building on the existing techniques and knowledge of the F# programmer. It does assume you know and are comfortable with F# programming.

 

queries-with-fsharp-powerpack-v1.zip