Query Performance
During this post I’ll show a few patterns on how to improve the query performance.
A major design element for performance is the query cache. Once a query is executed, parts of the query are maintained in a global cache. Because of query and metadata caching, the second run always completes faster than the first run. For example, consider a query using Entity SQL query (don’t worry, we’ll discuss LINQ in just a second) with the following code.
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
ObjectQuery<Orders> orders
= ne.CreateQuery<Orders>(“Select value o from Orders as o”);
foreach (Orders o in orders)
{
int i = o.OrderID;
}
}
On the first run, the query completes in 179 milliseconds. The next time, the query completes in 15 milliseconds. The difference between the two is the cost of building the command tree that gets passed down to the provider for execution. All subsequent queries complete in or around the same time of 15 milliseconds.
LINQ Queries
LINQ queries utilize some of the same logic as Entity SQL queries, except that not all parts of the query are cached and some parts are rebuilt each time the query is executed. Take a look at the query below.
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
var orders = from order in ne.Orders
select order;
foreach (Orders o in orders)
{
int i = o.OrderID;
}
}
Executing this query takes 202 milliseconds on the first execution and only 18 milliseconds on subsequent executions, which is still slower than using Entity SQL. Now, let’s take a look at using compiled LINQ queries to improve performance further. The advantage of compiling a LINQ query is that the expression tree is built when the query is compiled and doesn’t need to be rebuilt on subsequent executions.
Here’s the code for a compiled LINQ query that uses a PerformanceArticleContext delegate.
public static Func<PerformanceArticleContext, IQueryable<Orders>>
compiledQuery = CompiledQuery.Compile(
(PerformanceArticleContext ne) =>
(from o in ne.Orders
select o)
);
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
foreach (Orders o in compiledQuery(ne))
{
int i = o.OrderID;
}
}
The times for this compiled LINQ query are 305 milliseconds on the first execution and 15 milliseconds on subsequent executions.
Here’s the standard LINQ.
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
ne.Orders.MergeOption = MergeOption.NoTracking;
var orders = from order in ne.Orders
where order.ShipCity == “London”
select order;
foreach (Orders o in orders)
{
int i = o.OrderID;
}
}
Now here is the compiled LINQ query.
public static Func<PerformanceArticleContext, IQueryable<Orders>>
compiledQuery = CompiledQuery.Compile(
(PerformanceArticleContext ne) =>
(from o in ne.Orders
where o.ShipCity == “London”
select o)
);
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
foreach (Orders o in compiledQuery(ne))
{
int i = o.OrderID;
}
}
In this query the result set is only 33 items. For the non-compiled query, the execution time is 207 milliseconds on the first execution and 17 milliseconds on subsequent executions. By comparison, the result for the compiled query is 268 milliseconds on the first execution and only 3 milliseconds on subsequent executions.
No Tracking/Tracking
In the previous examples, all the queries result in the creation of an object that gets added to the ObjectStateManager so that we can track updates. When it is not important to track updates or deletes to objects, then executing queries using the NoTracking merge option may be a better option. For example, NoTracking may be a good option in an ASP.NET web application that queries for a specific category name but doesn’t make updates to the returned data. In a case like this, there is a performance benefit to using NoTracking queries. Let’s demonstrate this by taking the same queries that we used previously and making them all into NoTracking queries. The following table demonstrates the cost of tracking objects in the query examples:
Tracking v. NoTracking on full query without parameters
Query Type
Tracking
First Run (milliseconds)
Tracking
Second Run (milliseconds)
NoTracking
First Run (milliseconds)
NoTracking
Second Run (milliseconds)
Entity SQL
179
15
149
6
LINQ
202
18
282
8
Compiled LINQ
305
15
276
6
Based on these numbers, the NoTracking option provides a big reduction in the amount of time, where most of this gain comes when we stop tracking changes and managing relationships. For a NoTracking query, the compiled LINQ query outperforms the standard LINQ query both in first execution and in subsequent executions. Note that the second execution of the compiled LINQ query is equivalent to the second execution of the Entity SQL query.
Below are the single parameter filtered queries for both tracking and no tracking.
Query Type
Tracking
First Run
(milliseconds)
Tracking
Second Run
(milliseconds)
NoTracking
First Run
(milliseconds)
NoTracking
Second Run
(milliseconds)
Entity SQL
126
3
140
2
LINQ
207
17
275
18
Compiled LINQ
268
3
277
2
Summary
When optimizing query performance in the Entity Framework, you should consider what works best for your particular programming scenario. Here are a few key takeaways:
· Initial creation of the ObjectContext includes the cost of loading and validating the metadata.
· Initial execution of any query includes the costs of building up a query cache to enable faster execution of subsequent queries.
· Compiled LINQ queries are faster than Non-compiled LINQ queries.
· Queries executed with a NoTracking merge option work well for streaming large data objects or when changes and relationships do not need to be tracked.
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
ObjectQuery<Orders> orders
= ne.CreateQuery<Orders>(“Select value o from Orders as o”);
foreach (Orders o in orders)
{
int i = o.OrderID;
}
}
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
var orders = from order in ne.Orders
select order;
foreach (Orders o in orders)
{
int i = o.OrderID;
}
}
public static Func<PerformanceArticleContext, IQueryable<Orders>>
compiledQuery = CompiledQuery.Compile(
(PerformanceArticleContext ne) =>
(from o in ne.Orders
select o)
);
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
foreach (Orders o in compiledQuery(ne))
{
int i = o.OrderID;
}
}
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
ne.Orders.MergeOption = MergeOption.NoTracking;
var orders = from order in ne.Orders
where order.ShipCity == “London”
select order;
foreach (Orders o in orders)
{
int i = o.OrderID;
}
}
public static Func<PerformanceArticleContext, IQueryable<Orders>>
compiledQuery = CompiledQuery.Compile(
(PerformanceArticleContext ne) =>
(from o in ne.Orders
where o.ShipCity == “London”
select o)
);
using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
foreach (Orders o in compiledQuery(ne))
{
int i = o.OrderID;
}
}
Query Type
Tracking
First Run (milliseconds)
Tracking
Second Run (milliseconds)
NoTracking
First Run (milliseconds)
NoTracking
Second Run (milliseconds)
Entity SQL
179
15
149
6
LINQ
202
18
282
8
Compiled LINQ
305
15
276
6
Query Type
Tracking
First Run
(milliseconds)
Tracking
Second Run
(milliseconds)
NoTracking
First Run
(milliseconds)
NoTracking
Second Run
(milliseconds)
Entity SQL
126
3
140
2
LINQ
207
17
275
18
Compiled LINQ
268
3
277
2
Brian Dawson,
Program Manager, ADO.NET
Do you have any comparison between doing LINQ like
‘from x in y select x;’
vs.
‘x.Select()’
More complex queries would be more useful but I am interested in which type is actually faster
I assume you took into consideration the database buffer cache?
I’m glad to finally see some EF performance information. It would be helpful to see a comparison against the similar process with a DataReader, DataSet, LINQ to SQL and other OR Mappers to see just what the performance trade-offs are. In this regard, I point to Rico Mariani’s series on performance with LINQ to SQL as an example.
In the case of the examples you’ve presented so far, the queries are identical to the kind of 1-1 table mapping that is possible with LINQ to SQL. It would be nice to also see some performance on more complex object mapping schemes (multiple tables to single object for example).
Is there any way that we can reduce the time when the first query is run? I thought that using compiled queries would reduce that time.
Thank you.
ADO.NET Entity FrameWork Performance
Brian
Thank you so much for going into this detail. I look forward to more.
J’ai quelque peu délaissé mon blog ces derniers temps mais maintenant que les techdays sont passés (du
En este post comento una serie buenas prácticas para mejorar la eficiencia de ADO.NET Entity Framework
What is the ability to persist this global cache? Given that the abstraction layer probably doesn’t change a heck of a lot, there’s no need to rebuild it unless it is invalidated. I’d hate to put the user through unneeded work if its not quick.
I’m very curious about best-practices for an asp.net environment. Obviously, keeping around the compiled query would be great for the performance benefits of subsequent queries, yet in a web scenario, I’m generally creating a context, executing a query, and then destroying both. Is there some advice out there about how to make this more efficient so that I don’t pay the 200ms penalty each time?
In the stats above it appears that there is a decrease in performance for the "No Tracking" in several scenarios for the first run. For example, in the LINQ no parameters scenario the Tracking time is 202, whereas the No Tracking time is 282.
Is it true that No Tracking is actually slower in some cases, and if so why?
Many Thanks
概述春节后的第一期推荐系列文章,共有10篇文章:1.ASP.NETMVCExampleApplicationoverNorthwindwiththeEntityFramework…
Como hace unos días apuntaba Unai , en el blog del equipo de desarrollo de ADO.NET están publicando una
概述 春节后的第一期推荐系列文章,共有10篇文章: 1.ASP.NETMVCExampleApplicationoverNorthwindwiththeEntityFr…
The first post, Exploring the Performance of the ADO.NET Entity Framework – Part 1 , began: Performance
Do you have any performance statistics relating to using SQL Server stored procedures?
I love all the amazing work that has been done with the ADO.NET Entity Framework and Data Services. There is one piece that I can’t find any information on, however: Support for multi domain models or multiple databases in the Entity Framework (and thus Data Services).
We have 10+ large enterprise databases all hosted on the same server/instance of sql server and these databases are somewhat used as "schema" conatiners, i.e. we have many stored procedures that make calls to stored procedures/tables/views in another database and there are logical relationships between tables in different databases. This requires our Entity Model to have entity relationships across databases and thus across Entity Models (since, AFAIK, currently the Entity Framework design surface can only include tables from one database).
I believe that this is a big weakness and needs to be addressed as soon as possible. The other issue I have is how much the Entity Framework design surface slows down (and the entity boxes become unmanageble from a visual real-estate perspective) after about 20-30 tables on it…
Any comments on this from the Entity Framework team would be very much appreciated…
Great post!
It will be helpful if you could also include comparison metrics from the DBMS side.
Splitting the execution time between application (CLR) and Sql server (DBMS) for each thread will also be of a huge help in understanding and tuning performance.
Thanks
Thanks for all the comments and questions! It’s great to see the interest in the Entity Framework. I’m working on another post about performance that I’ll try to post soon. It addresses some of the questions.
-Brian
Is there any news on the Release Date of Entity Framework?
There have been a few questions from the last performance blog post about how the Entity Framework compares
L’ADO .Net Team vient de poster deux nouveaux posts : le premier concerne l’utilisation des procédures
For everybody who is interested on internals of Entity Framework (ER) in relation to referential integrity
Query Performance During this post I’ll show a few patterns on how to improve the query performance. A major design element for performance is the query cache. Once a query is executed, parts of the query are maintained in a global cache. Because of quer
Query Performance During this post I’ll show a few patterns on how to improve the query performance. A major design element for performance is the query cache. Once a query is executed, parts of the query are maintained in a global cache. Because of quer
Analyzing the test results, to use compiled LINQ queries is better than non-compiled LINQ queries only in the second execution. In the first execution the compiled LINQ queries are always slower.
But what is exactly the second execution? Is an execution using the same instance of the objectcontext or is an execution in a new thread with a new objectcontext?
What I mean is, does the query cache survive between different threads or it is alive just during a thread execution lifetime?
Do you have comparsion data for EF with stored procedures and standalone stored procedures? Does it have some perfomance difference?
If we’ll skip exact details, we can say, that internal behavior of whole modeling and mapping is based
Great article!
I am using MergeOption.NoTracking.
But i am unable to access navigation properties whenever my linq query ends up with AsEnumerable(). But if i make it as ToList() it works fine.
Using AsEnumerable is must for me and I wanna use MergeOption.NoTracking also.
Do you people have any idea about this.
Thanks in advance.