Isolating Performance with Precompiled/Pre-generated Views in the Entity Framework 4

Introduction

When working with customers' solutions, we often have to track down potential performance issues or uncover areas for improvement. One technique is to measure the relative performance gained from different configurations or features (ideally in a staging or testing environment). With Entity Framework 4, one potential optimization is to use precompiled views: the Entity Framework query views are pre-built and compiled into the application, where otherwise the Entity Framework would have to generate them on the fly at run time, at an extra cost in resources. The following are some useful references on precompiled views:

· MSDN - How to: Pre-Generate Views to Improve Query Performance (Entity Framework)

· EF Team Blog - How to use a T4 template for View Generation

This sample application shows one possible way to test potential performance gains and resource utilization when using pre-generated views by providing a performance comparison between runs with precompiled views and without. You can download the sample code here.

When isolating performance, it is a good idea to focus on a very narrow piece of the application, preferably a section that gets the most use. Using the Visual Studio Profiler to find the bottleneck can point to the slowest query. Then, by applying the analysis techniques discussed below, the performance gains can be observed and studied.

Analyzing the Sample

The database - The complete sample code (solution named PerfSample) executes a query against a local database named PerfSample. To create it, the two SQL scripts found at the root of the downloadable ZIP file need to be executed against an empty database. Open the first one, named EFModelGen.SQL, with Microsoft SQL Server Management Studio and execute it. This does not create the entities yet; instead, it generates the required SQL script. Copy and paste the results into a new query and run it against your empty database; this creates 200 empty entities. To create records for the entities Type1 and Type2, run the second script, named PopulateData.sql. It contains two variables: Type1RecordCount and oneToMany. The first sets the number of records to create in entity Type1. Since Type1 and Type2 have a one-to-many relationship, the second sets how many Type2 records are created for each Type1 record.

The sample executes a LINQ select statement with a join against the Type1 and Type2 entities. For the purpose of testing precompiled views, the results need to be materialized (turning data records into “real” objects); hence, the sample runs ToList().Count() against the query results. The code is as follows.

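    // Join Type1 to Type2 and project one column from each side.
    var type1WithType2 = from t1 in context.Type1
                         join t2 in context.Type2
                         on t1.Type1Id equals t2.Type1_Type1Id
                         select new
                         { t1.Type1_Col30, t2.Type2_Col30 };
    // ToList() forces the query to execute and materialize its results.
    type1WithType2.ToList().Count();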
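
The need to materialize the results can be illustrated without a database. The following is only a sketch under stated assumptions: it uses LINQ to Objects as a stand-in for LINQ to Entities, and the Type1 and Type2 classes, the ReadType1 method, and the RowsRead counter are hypothetical illustrations, not part of the sample download. It shows that defining the query reads nothing, and that the rows are only read once ToList() enumerates the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated entity classes.
public class Type1 { public int Type1Id { get; set; } public string Type1_Col30 { get; set; } }
public class Type2 { public int Type1_Type1Id { get; set; } public string Type2_Col30 { get; set; } }

public static class MaterializationDemo
{
    // Counts how many Type1 rows have actually been read from the source.
    public static int RowsRead;

    // Iterator method: its body only runs when the sequence is enumerated.
    private static IEnumerable<Type1> ReadType1()
    {
        for (int id = 1; id <= 3; id++)
        {
            RowsRead++;
            yield return new Type1 { Type1Id = id, Type1_Col30 = "t1-" + id };
        }
    }

    // Returns { rows read before ToList(), rows read after ToList(), joined row count }.
    public static int[] Run()
    {
        RowsRead = 0;
        var type2 = new List<Type2>
        {
            new Type2 { Type1_Type1Id = 1, Type2_Col30 = "a" },
            new Type2 { Type1_Type1Id = 1, Type2_Col30 = "b" },
            new Type2 { Type1_Type1Id = 2, Type2_Col30 = "c" }
        };

        // Defining the query does not execute it.
        var type1WithType2 = from t1 in ReadType1()
                             join t2 in type2
                             on t1.Type1Id equals t2.Type1_Type1Id
                             select new { t1.Type1_Col30, t2.Type2_Col30 };
        int readBefore = RowsRead;

        // ToList() enumerates the query and builds the result objects.
        int count = type1WithType2.ToList().Count;
        int readAfter = RowsRead;

        return new[] { readBefore, readAfter, count };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine("Rows read before ToList(): " + r[0]);
        Console.WriteLine("Rows read after ToList():  " + r[1]);
        Console.WriteLine("Joined rows:               " + r[2]);
    }
}
```

Against a real ObjectContext the same principle applies, although there the work triggered by the enumeration also includes the round trip to the database.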

 

Once the query is identified, its execution time (performance) and memory usage (resources) are recorded. To measure time, the Stopwatch class is used. To record memory usage, a memory reading is stored in a long integer before the query is executed and another one after; subtracting the first from the second gives only the memory used by the query. To look at managed memory (memory used only by the framework), the garbage collector (GC) is first forced to run, which gives a clearer reading of the memory being used by the framework. Note that this is purposely done before the stopwatch starts, since the GetTotalMemory(true) method may take a while to return while the GC runs (which can be unpredictably long or short).

    startManageMemoryBytes = GC.GetTotalMemory(true); //Force GC
    stopWatch.Start();

    //[Query/code to analyze]

    stopWatch.Stop();
    endManageMemoryBytes = GC.GetTotalMemory(false);
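
The measurement pattern above can be wrapped in a small reusable helper. This is a minimal sketch, not the code from the sample download: the Measurement and QueryMeasurement types are hypothetical names, and the list-filling workload is only a stand-in for the call to QueryToTest(context). Only Stopwatch and GC.GetTotalMemory come from the .NET Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical result holder (not part of the PerfSample download).
public sealed class Measurement
{
    public long ElapsedMilliseconds;
    public long ManagedBytes;
}

public static class QueryMeasurement
{
    // Times one unit of work and reports the change in managed heap size around it.
    public static Measurement Measure(Action work)
    {
        // Force a full collection BEFORE the stopwatch starts,
        // so the GC pause is not counted as query time.
        long startManagedMemoryBytes = GC.GetTotalMemory(true);
        Stopwatch stopWatch = Stopwatch.StartNew();

        work(); // the query/code to analyze

        stopWatch.Stop();
        // No forced collection here: read the heap as the work left it.
        long endManagedMemoryBytes = GC.GetTotalMemory(false);

        return new Measurement
        {
            ElapsedMilliseconds = stopWatch.ElapsedMilliseconds,
            ManagedBytes = endManagedMemoryBytes - startManagedMemoryBytes
        };
    }

    public static void Main()
    {
        // Stand-in workload; in the sample this would be QueryToTest(context).
        Action work = () =>
        {
            var items = new List<int>();
            for (int i = 0; i < 1000000; i++) items.Add(i);
        };

        Console.WriteLine("|Time MilSec |Manage Bytes |");
        for (int run = 0; run < 3; run++)
        {
            Measurement m = Measure(work);
            Console.WriteLine("| {0,10:N0} | {1,12:N0} |", m.ElapsedMilliseconds, m.ManagedBytes);
        }
    }
}
```

Because the second reading does not force a collection, the byte difference can shrink, or even turn negative, if the GC happens to run while the work executes, so the figure is best read as an approximation.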

 

The program is compiled with Visual Studio and, to streamline things a bit more, built in Release mode. The SQL Server instance is local to the machine (minimizing network delays) and its affinity is set to CPU1 (minimizing CPU context switching between SQL Server and the application). The tests were executed on a dual-processor machine.

The query above is executed three times (through calls to QueryToTest(context)) in order to collect more even results.

Performing the tests

The sample first has to be prepared to run either in precompiled mode or in standard mode. To set up a precompiled run, make sure PerfSample.view.tt is included in the solution and that it has a corresponding PerfSample.views.cs file (if the .cs file is missing, it can be generated by right-clicking the T4 template PerfSample.view.tt and choosing ‘Run Custom Tool’), then clean and rebuild the solution. Likewise, to set up a run without precompiled views, exclude the T4 template PerfSample.view.tt from the project (right-click and choose ‘Exclude From Project’), then clean and rebuild the solution.

The sample program is a simple console application that is executed via a small DOS batch file. The batch file runs the program three times and sends the output to the console; it is named runtest.bat and is found under the …/bin/release directory. To make the results easier to analyze, they are redirected to a text file in the same directory.

The following DOS commands are executed to collect the respective results, each after the proper changes are made (as explained above):

    Runtests > WithPreCompiledViews.txt
    Runtests > WithoutPreCompiledViews.txt

NOTE: To append results to an existing file instead of overwriting it, replace the single greater-than symbol (“>”) with the append redirection operator “>>”.

The test batch file is run multiple times to get a good sample of performance times.

The Results

Without precompiled views:

    ..\bin\Release>call PerfSample
    |Time MilSec |Manage Bytes |
    |--------------------------|
    |      8,009 |  69,174,356 |
    |        810 |  53,299,352 |
    |        815 |  53,299,388 |

    ..\bin\Release>call PerfSample
    |Time MilSec |Manage Bytes |
    |--------------------------|
    |      7,629 |  69,169,948 |
    |        847 |  53,303,016 |
    |        858 |  53,305,532 |

    ..\bin\Release>call PerfSample
    |Time MilSec |Manage Bytes |
    |--------------------------|
    |      8,057 |  68,123,116 |
    |        800 |  53,303,064 |
    |        794 |  53,304,332 |

With precompiled views:

    ..\bin\Release>call PerfSample
    |Time MilSec |Manage Bytes |
    |--------------------------|
    |      3,287 |  73,282,032 |
    |        813 |  53,306,136 |
    |        897 |  53,301,052 |

    ..\bin\Release>call PerfSample
    |Time MilSec |Manage Bytes |
    |--------------------------|
    |      3,269 |  73,283,292 |
    |        805 |  53,299,816 |
    |        837 |  53,305,532 |

    ..\bin\Release>call PerfSample
    |Time MilSec |Manage Bytes |
    |--------------------------|
    |      3,257 |  73,283,548 |
    |        815 |  53,306,776 |
    |        806 |  53,299,004 |

These results are well in line with the expected behavior of precompiled views. The test runs made without precompiled views first have to generate the set of views used to access the database, so their elapsed time is expected to be greater. In the results above, the first query execution with precompiled views is over twice as fast. Subsequent executions show no such difference, since by then the views have already been generated and are reused.
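
The first-run speedup can be worked out directly from the recorded timings. The sketch below is only an illustration: the FirstRunSpeedup class is a hypothetical helper, and the numbers are the first-row "Time MilSec" values from the three runs without precompiled views (around 8 seconds) and the three runs with them (around 3.3 seconds). It compares the average first-run time of the two configurations.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper for the arithmetic; not part of the sample download.
public static class FirstRunSpeedup
{
    // Ratio of the average first-run time without views to the average with views.
    public static double Ratio(long[] withoutViewsMs, long[] withViewsMs)
    {
        return withoutViewsMs.Average() / withViewsMs.Average();
    }

    public static void Main()
    {
        // First-run timings in milliseconds, taken from the results above.
        long[] withoutViews = { 8009, 7629, 8057 };
        long[] withViews = { 3287, 3269, 3257 };

        double ratio = Ratio(withoutViews, withViews);

        // Invariant culture keeps the decimal separator a period on any machine.
        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "First-run speedup: {0:F2}x", ratio)); // prints "First-run speedup: 2.41x"
    }
}
```

Averaging three runs per configuration is a small sample, so the ratio is best treated as indicative rather than exact.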

In the same manner, precompiled views should give us a lower memory footprint. For managed memory, the first run uses about 5~7% less memory, and subsequent runs are similar in both cases. Also, in both cases, the drop in memory usage after the first run is likely explained by unused memory being reclaimed by the system.

As a way to further explore the behavior of pre-generated views, the sample has one line of code commented out: a call to the method PreRun(context), which does a select count of all the members in both the Products and the TransactionHistory entities, the same entities used in the query. It does this by reusing the ObjectContext instance (no statistics are recorded). Running the test with this call uncommented results in very similar elapsed times and memory usage for both the precompiled and non-precompiled executions. This is because the query in the QueryToTest(context) method reuses the views already generated during PreRun(context), since they share the same ObjectContext.

Conclusion

This empirically demonstrates the performance improvement offered by precompiled views. By pre-constructing the Entity Framework views, the first run of the query was about 2.4 times quicker, and there was also an improvement in memory usage. Since the performance gain is in the first execution of the query, pre-generated views will be of significant value in at least two cases: the warm-up of a system and the execution of infrequently run queries. The use of less memory is also advantageous, and is due to no longer having the overhead of creating and keeping the views.

Results are likely to vary depending on your solution, particularly on the model size and the number of records being queried, so the sample should be tailored to match your particular solution as closely as possible. Even so, the gains should be noticeable.

Authored by: Jaime Alva Bravo