Performance Quiz #13 — Linq to SQL compiled queries cost


I’ve written a few articles about Linq now, and you know I was a big fan of compiled queries in Linq.  But what do they cost?  Or more specifically, how many times do you have to use a compiled query in order for the cost of compilation to pay for itself?  With regular expressions, for instance, it’s usually a mistake to compile a regular expression if you only intend to match it against a fairly small amount of text.


Let’s do a specific experiment to get an idea.  Using the ubiquitous Northwinds database, and getting the same data over and over to control for the cost of the database accesses (and magnify any Linq overheads), we run this query:



var q = (from o in nw.Orders
         select new
         {
             OrderID = o.OrderID,
             CustomerID = o.CustomerID,
             EmployeeID = o.EmployeeID,
             ShippedDate = o.ShippedDate
         }).Take(5);


and compare it against:



var fq = CompiledQuery.Compile
(
    (Northwinds nw) =>
            (from o in nw.Orders
            select new
                   {
                       OrderID = o.OrderID,
                       CustomerID = o.CustomerID,
                       EmployeeID = o.EmployeeID,
                       ShippedDate = o.ShippedDate
                   }).Take(5)
);
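

For reference, CompiledQuery.Compile hands back a delegate that you invoke with a data context; a minimal usage sketch (assuming the Northwinds context above) might look like this:


var results = fq(nw);    // or equivalently fq.Invoke(nw)
foreach (var row in results)
{
    // each row is an instance of the anonymous type projected above
    Console.WriteLine("{0} {1} {2} {3}",
        row.OrderID, row.CustomerID, row.EmployeeID, row.ShippedDate);
}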


So now the quiz:  How many times do I have to use the compiled version of the query in order for it to be cheaper to compile than it would have been to just use the original query directly?

Comments (28)

  1. Anonymouse says:

    Only once since the query must be "compiled" anyway.

  2. Grant says:

    [Only once since the query must be "compiled" anyway.]

    I doubt that.

    There is almost certainly additional overhead in finding out if the query has already been compiled (using a hashcode or similar mechanism?) and allocating storage for the compiled query in some sort of collection of compiled queries.
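
    For illustration only, the kind of lookup being described might look roughly like this (a purely hypothetical sketch, not LINQ to SQL’s actual implementation):

    // Hypothetical implicit query cache: every execution would pay for computing
    // a key and probing the dictionary, and a miss would also pay for compiling
    // and storing the delegate.
    static readonly Dictionary<string, Delegate> queryCache = new Dictionary<string, Delegate>();

    static Delegate GetOrCompile(string key, Func<Delegate> compile)
    {
        Delegate d;
        if (!queryCache.TryGetValue(key, out d))
        {
            d = compile();          // compilation cost, paid once per key
            queryCache[key] = d;    // storage/allocation cost
        }
        return d;
    }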

  3. JamesNT says:

    I doubt it as well.  Unfortunately, I do not know enough about Linq to give an informed answer.  Therefore, I will be watching Rico’s blog with interest as he posts the answer.

    JamesNT

  4. Oleg Mihailik says:

    There is no implicit caching of compiled queries, so I vote for ‘once’ as well.

    That’s the problem: LINQ doesn’t remember expressions even when they are executed twice or more.  It always re-interprets the expression tree from scratch.

    Otherwise, we wouldn’t need CompiledQuery functionality in the first place.

  5. Peter Ritchie says:

    It depends on the query.  Taking into account the cost of the compile, I’ve had some compiled queries take as many as 15 executions before they paid for themselves.

  6. Galcho says:

    I think we need the following formula to find the number of executions at which the total time is the same for both cases:

    X·Ec + Tc = X·Er

    or

    X = Tc / (Er – Ec)

    where

    X – number of executions

    Ec – time for one execution of the compiled query

    Tc – time needed for compilation

    Er – time for one execution of the raw (uncompiled) query
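
    To make that concrete, a quick worked example with made-up numbers:

    // Worked example (purely illustrative numbers):
    double Tc = 10.0;            // compilation cost, ms
    double Er = 2.0;             // one raw (uncompiled) execution, ms
    double Ec = 1.5;             // one compiled execution, ms
    double X  = Tc / (Er - Ec);  // = 20 executions to break even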

  7. omario says:

    I think queries are cached for the lifetime of the DataContext.

    And it’s worth compiling a query if you use it with different DataContexts more than once.

  8. Alexey Lavnikov says:

    I doubt that queries are cached in the DataContext.

    I think if you use the same query more than once, compiled queries are better.

  9. JamesCurran says:

    Well, from my testing (using the query Rico gave), the pay-off is immediate.

    I’m going to assume the Comments editor is going to mangle the chart below, so to summarize: Compiled *always* wins, even for just one repetition.  At higher repetitions the ratio settles at 5:1, but oddly, at low reps compiled wins by as much as 25:1.  (Some — but not all — of that can be attributed to the timer resolution.)

    Reps    Uncompiled   Compiled   Ratio
        1     :00.25      :00.014   18:1
        5     :00.37      :00.015   25:1
       10     :00.27      :00.02    14:1
       50     :00.81      :00.06    13:1
      500     :01.98      :00.38    >5:1
     5000     :14.86      :03.10    <5:1
    50000    2:26          :27      >5:1

  10. Peter Ritchie says:

    James, your times don’t seem to include the compilation time.  My tests show that 1 repetition uncompiled is slower than compiling and then running 1 repetition of the compiled query.

    Although I haven’t tested specifically with Rico’s example, in my tests the compilation time begins to be offset at 15 repetitions.  As I stated earlier, this offset is likely dependent on the database and the query involved.

  11. Check it out and see if you know the answer. Rico’s Performance Tidbits

  12. JamesCurran says:

    Here is my code for the compiled case.  The time is for one call to this method.  ("reps" is a hard-coded static const which I manually change between runs.)  ("total" is used to prevent the compiler from eliding unused code, and to compare with the results from Test1(), which was the uncompiled version.)

    static int Test2()
    {
        NorthwindDataContext nwdc = new NorthwindDataContext("(local)");

        var fq = CompiledQuery.Compile((NorthwindDataContext nw) =>
            (from o in nw.Orders
             select new
             {
                 OrderID = o.OrderID,
                 CustomerID = o.CustomerID,
                 EmployeeID = o.EmployeeID,
                 ShippedDate = o.ShippedDate
             }).Take(5)
        );

        int total = 0;
        for (int i = 0; i < reps; ++i)
        {
            var t = fq.Invoke(nwdc);
            foreach (var a in t)
            {
                total += a.OrderID;
            }
        }
        return total;
    }
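
    (Test1, the uncompiled counterpart, isn’t shown in the comment; presumably it looks roughly like this:)

    // Guess at the uncompiled version referenced above – not necessarily the actual code.
    static int Test1()
    {
        NorthwindDataContext nwdc = new NorthwindDataContext("(local)");
        int total = 0;
        for (int i = 0; i < reps; ++i)
        {
            var q = (from o in nwdc.Orders
                     select new
                     {
                         OrderID = o.OrderID,
                         CustomerID = o.CustomerID,
                         EmployeeID = o.EmployeeID,
                         ShippedDate = o.ShippedDate
                     }).Take(5);
            foreach (var a in q)
            {
                total += a.OrderID;
            }
        }
        return total;
    }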

  13. ricom says:

    James:  Things you want to control for at low reps:

    -jitting time (run once and throw away that result to get all the code jitted)

    -state of the connection pool (you don’t want to charge the cost of making the connection to one algorithm and not the other, so make several connections first and discard those results too)

    What I did was run several thousand batches of both cases first, discard those results, and then measure.  Also I tried them in different orders to make sure that the later runs were not penalized by, say, more things on the GC heap (they were not, as it turns out, but I controlled for it anyway).

    I don’t understand how the compiled query can possibly win in the single-use case 🙂
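
    A rough sketch of that warm-up-then-measure pattern (reusing the Test1/Test2 methods from the comment above):

    static void Measure()
    {
        // throwaway runs: get everything jitted and the connection pool warmed up
        Test1();
        Test2();

        Stopwatch sw = Stopwatch.StartNew();
        Test1();
        sw.Stop();
        Console.WriteLine("uncompiled: {0}", sw.Elapsed);

        sw = Stopwatch.StartNew();
        Test2();
        sw.Stop();
        Console.WriteLine("compiled:   {0}", sw.Elapsed);
    }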

  14. Peter Ritchie says:

    If I include the cost of the compilation in the times, I get ratios like this with that sample code:

    Reps   Uncompiled         Compiled           Ratio
    1      00:00:00.1047737   00:00:00.0903456   1.2:1
    5      00:00:00.0921325   00:00:00.1044020   0.8:1
    10     00:00:00.0982715   00:00:00.1024622   1.0:1
    50     00:00:00.1724429   00:00:00.1112768   1.5:1
    500    00:00:00.6105094   00:00:00.2092043   2.9:1
    5000   00:00:05.0588695   00:00:01.2370046   4.1:1
    50000  00:00:49.9114562   00:00:11.3373295   4.4:1

    Now, with a different query on the same Northwind database the results are different again:

    Reps   Uncompiled         Compiled           Ratio
    1      00:00:00.4928271   00:00:00.3322023   1.5:1
    5      00:00:01.1390319   00:00:01.1036043   1.0:1
    10     00:00:02.1391678   00:00:02.0752555   1.0:1
    50     00:00:10.0530105   00:00:09.6794834   1.0:1
    500    00:01:41.2241573   00:01:35.6616519   1.1:1

    And the code/compiled query is:

    static int Test()
    {
        NorthwindDataContext nwdc = new NorthwindDataContext(@"(local)\SQLExpress");
        int total = 0;
        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();

        var cq = CompiledQuery.Compile((NorthwindDataContext nw) =>
            (from customer in nw.Customers
             select new
             {
                 Customer = customer,
                 Orders = from o in nw.Orders
                          join co in nw.Orders
                              on o.CustomerID equals co.CustomerID
                          where co.CustomerID == customer.CustomerID
                          select o
             }));

        for (int i = 0; i < reps; ++i)
        {
            var q = cq.Invoke(nwdc);
            foreach (var a in q)
            {
                total += a.Orders.Count();
            }
        }

        stopwatch.Stop();
        Trace.WriteLine(String.Format("{0} {1} ({2} ticks)", reps, stopwatch.Elapsed, stopwatch.ElapsedTicks));
        return total;
    }

    A similar query on a different database produced results that showed compiled was slower than uncompiled up to about 15 reps.

    So, it really depends on the query and the database.

    As with any optimization, it should be checked to make sure it really is an optimization and that it’s worth doing…

  15. ricom says:

    Peter correctly points out that if you’re bringing back lots of data (as in his second example), all this query compilation stuff is just so much noise.  The real cost is in processing the data and creating the objects.

    With that much data coming back, compilation isn’t hurting but it sure isn’t helping much. Pretty much you’ve got the raw time to read the data and create the objects and that’s it.  Which is as it should be 🙂

    I’m guessing now but I think the primary difference between my results and Peter’s (where we were using the same query) is that I put in a dummy run to eliminate the cost of jitting and the cost of getting the database connection pool set up from the experiment.  Also I keep the total work constant so that I never have to measure any small times.

  16. Anonymouse says:

    I was right, but in some sense I cheated. 😛

  17. Peter Ritchie says:

    Yes, I didn’t perform any dummy runs to eliminate jitting from the metrics.  Unless I’m NGENing a product I like to see the "worst" case.  If the user runs the application for the first time and it is prohibitively slow, they’re not going to want to use it again :-).  But metrics not including JIT times are equally informative.

  18. Chabster says:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Data.Linq;
    using System.Diagnostics;
    using System.Linq.Expressions;

    namespace LINQPerf
    {
        class Program
        {
            static void Main(string[] args)
            {
                Stopwatch s, sAvg;
                const int iterations = 10;
                const int ordersToFetch = 10000;
                Northwind nw;
                nw = new Northwind(@"Data Source=D:\Development\Projects\C#\LINQPerf\LINQPerf\Northwind.sdf");
                //nw.Log = Console.Out;
                nw.ObjectTrackingEnabled = false;

                //s = Stopwatch.StartNew();
                //var cacheQuery = from orders in nw.Orders select orders;
                //cacheQuery.Take(ordersToFetch).GetEnumerator();
                //Console.WriteLine("Cache: " + s.ElapsedMilliseconds);

                s = Stopwatch.StartNew();
                (from o in nw.Orders select new { OrderID_N = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                Console.WriteLine("Non compiled: " + s.ElapsedMilliseconds);
                Console.WriteLine();

                sAvg = Stopwatch.StartNew();
                {
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_1 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 1: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_2 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 2: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_3 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 3: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_4 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 4: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_5 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 5: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_6 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 6: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_7 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 7: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_8 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 8: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    (from o in nw.Orders select new { OrderID_N_9 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                    Console.WriteLine("Non compiled 9: " + s.ElapsedMilliseconds);
                }
                Console.WriteLine("Non compiled avg not in loop: " + ((double)sAvg.ElapsedMilliseconds / 10));

                s = Stopwatch.StartNew();
                for (int i = 0; i < iterations; i++)
                {
                    (from o in nw.Orders select new { OrderID_N_AVG = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch).GetEnumerator();
                }
                Console.WriteLine("Non compiled avg in loop: " + ((double)s.ElapsedMilliseconds / iterations));
                Console.WriteLine();

                //s = Stopwatch.StartNew();
                var compiledQuery = CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch));
                //Console.WriteLine("Compile: " + s.ElapsedMilliseconds);
                //Console.WriteLine();

                s = Stopwatch.StartNew();
                compiledQuery(nw).GetEnumerator();
                Console.WriteLine("Compiled 0: " + s.ElapsedMilliseconds);
                Console.WriteLine();

                sAvg = Stopwatch.StartNew();
                {
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 1: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 2: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 3: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 4: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 5: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 6: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 7: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 8: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    compiledQuery(nw).GetEnumerator();
                    Console.WriteLine("Compiled 9: " + s.ElapsedMilliseconds);
                }
                Console.WriteLine("Compiled avg not in loop: " + ((double)sAvg.ElapsedMilliseconds / 10));

                s = Stopwatch.StartNew();
                for (int i = 0; i < iterations; i++)
                {
                    compiledQuery(nw).GetEnumerator();
                }
                Console.WriteLine("Compiled avg in loop: " + ((double)s.ElapsedMilliseconds / iterations));
                Console.WriteLine();

                sAvg = Stopwatch.StartNew();
                {
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_0_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 0 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_1_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 1 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_2_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 2 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_3_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 3 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_4_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 4 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_5_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 5 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_6_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 6 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_7_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 7 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_8_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 8 0: " + s.ElapsedMilliseconds);
                    s = Stopwatch.StartNew();
                    CompiledQuery.Compile((Northwind nwDb) => (from o in nwDb.Orders select new { OrderID_C_9_0 = o.OrderID, CustomerID = o.CustomerID, EmployeeID = o.EmployeeID, ShippedDate = o.ShippedDate }).Take(ordersToFetch))(nw).GetEnumerator();
                    Console.WriteLine("Compiled 9 0: " + s.ElapsedMilliseconds);
                }
                Console.WriteLine("Compiled 0-9 avg not in loop: " + ((double)sAvg.ElapsedMilliseconds / 10));
                Console.ReadLine();
            }
        }
    }

    Result:

    Non compiled: 482
    Non compiled 0: 31
    Non compiled 1: 16
    Non compiled 2: 12
    Non compiled 3: 12
    Non compiled 4: 12
    Non compiled 5: 12
    Non compiled 6: 16
    Non compiled 7: 12
    Non compiled 8: 12
    Non compiled 9: 12
    Non compiled avg not in loop: 15,2
    Non compiled avg in loop: 9,6
    Compiled 0: 19
    Compiled 0: 6
    Compiled 1: 14
    Compiled 2: 6
    Compiled 3: 6
    Compiled 4: 6
    Compiled 5: 6
    Compiled 6: 6
    Compiled 7: 6
    Compiled 8: 6
    Compiled 9: 6
    Compiled avg not in loop: 7,4
    Compiled avg in loop: 6,6
    Compiled 0 0: 13
    Compiled 1 0: 31
    Compiled 2 0: 12
    Compiled 3 0: 11
    Compiled 4 0: 12
    Compiled 5 0: 14
    Compiled 6 0: 12
    Compiled 7 0: 12
    Compiled 8 0: 12
    Compiled 9 0: 12
    Compiled 0-9 avg not in loop: 14,7

    Conclusion: compiled expressions are emitted on the first run, not within the CompiledQuery.Compile method.  It seems compilation also affects data mapping.
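
    One way to see where that cost lands is to time the Compile call and the first and second invocations separately; a rough sketch (same Northwind context as above):

    var sw = Stopwatch.StartNew();
    var cq = CompiledQuery.Compile((Northwind db) =>
        (from o in db.Orders select new { o.OrderID, o.CustomerID }).Take(5));
    Console.WriteLine("Compile call:      " + sw.ElapsedMilliseconds);

    sw = Stopwatch.StartNew();
    cq(nw).GetEnumerator();   // first invocation; per the conclusion above, the real work happens here
    Console.WriteLine("First invocation:  " + sw.ElapsedMilliseconds);

    sw = Stopwatch.StartNew();
    cq(nw).GetEnumerator();   // subsequent invocation for comparison
    Console.WriteLine("Second invocation: " + sw.ElapsedMilliseconds);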

  19. Blog says:

    LINQ query versus compiled LINQ query

  20. LINQ query versus compiled LINQ query

  21. bhrgunatha says:

    >> A similar query on a different database produced results that showed compiled was slower than uncompiled up to about 15 reps.

    >> So, it really depends on the query and the database.

    I took a course on Optimizing SQL Server several years ago and learned a very valuable lesson.

    The answer is ALWAYS "It depends"

    The general guideline, though, seems to be to use a compiled query for smaller sets of data with moderate re-use.  But always profile to check 😉

  22. mattwar says:

    Chabster,

    When you call GetEnumerator() on a query, you are executing it but not reading any of the data.  As Rico points out, for large data sets, reading the data is most of the work.
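
    In other words, to include the cost of actually reading the rows, the results have to be enumerated; a minimal sketch:

    int count = 0;
    foreach (var row in compiledQuery(nw))
    {
        count++;   // touching every row forces the data to be read and the objects to be materialized
    }
    Console.WriteLine("rows read: " + count);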

  23. Chabster says:

    mattwar,

    I don’t want the data to be read, only to make sure that the query is built and sent to the DB.  But your point might have some value – perhaps compilation affects mapping performance?

  24. Welcome to the XXXIX issue of Community Convergence. The big news this week is that Microsoft has begun

  25. Welcome to the XXXIX issue of Community Convergence. The big news this week is that Microsoft has begun

  26. I did a series of postings on Linq Compiled Queries last year, I recently got some questions on those