EF Beta 3 is finally available!

It has been a long time in coming (especially for those of you who have installed VS 2008 RTM and then been frustrated about the inability of beta 2 to work with it), but beta 3 of the Entity Framework and CTP 2 of the EF Designer are finally available!

There are four steps to get everything installed:

0) Uninstall any previous versions of the EF and designer, and install VS 2008 RTM.

1) Install the EF Beta 3 runtime.

2) Install a patch for the XML Editor which is necessary for the Designer to work.

3) Install the EF Tools CTP 2.

In addition to being updated so that they work with the RTM versions of .NET Framework 3.5 and VS 2008, these versions include some great improvements. I'll have to defer to the designer team to give more details about that release, but I wanted to share my quick take on what's new in the EF runtime. I’ll add my take alongside the new-feature summary lines I've copied from the download page…

Performance improvements

A HUGE emphasis for us this milestone was on performance. I can’t begin to tell you how much time and effort we spent on these things. The three items below represent the biggest, most obvious improvements, but effort was made throughout the stack.

 

  • Much quicker object query execution

By far and away the most expensive part of executing a query with the EF prior to beta 3 was the time taken to assemble the results into the correct final format for the conceptual model and then construct object instances from that data. A major effort was made to change the strategy for this process (which we generally refer to as Object Materialization) by compressing a number of layers and doing dynamic code generation of a larger part of the whole process. The result is a noticeable improvement in query speed.

 

  • Simpler generated SQL

Another key part of the overall query process is in the SQL generated in the first place. A number of individuals remarked in beta 2 and before about the complexity of the queries generated and sent to the server. This was another major investment area. The result is simpler queries for the database to evaluate (which are also easier for mere mortals like me to understand).

 

  • Faster view generation

View generation is a part of the EF process which is necessary for the system to be able to query or update the database. Essentially this is a step which takes the declarative description of the mapping along with the metadata describing the storage model and the conceptual model and transforms them into an internal data structure which is used when the query or update is actually executed. This can happen either at runtime, the first time a query or update statement is executed, or at compile time. Prior to beta 3, for some complex models the time to generate the views could be quite long (it’s really nasty if you find that booting your app takes 5 minutes with the CPU pegged). During beta 2 I would solve this kind of problem by using compile-time viewgen and checking in the results of the generation so that I only had to pay the price when I actually changed the model, but now it’s fast enough that this generally isn’t necessary.

Easier disconnected operation

Many folks have observed that it can be difficult to use the EF in disconnected scenarios, like when building or consuming a web service or an ASP.NET solution. We spent some time this milestone thinking through the scenarios and adding several key improvements to object services to facilitate them.

 

  • Public, serializable EntityKey property on EntityReference

This is probably the biggest improvement. It accomplishes several things: 1) It directly exposes relationship stubs (i.e., relationship information which the object state manager knows about but which isn’t otherwise exposed in the object graph because only one of the two related objects is in memory). 2) It makes it possible to manually set relationship information in the state manager without having both entities in memory. 3) Because the EntityKey is serialized along with the entity by default, for most graphs (all except those with many-to-many relationships) you can serialize each of the entities in the graph, remote them to a different location, and attach all of them to a state manager at that location. The relationship information will then be filled in through the state manager, and the graph will be re-created even though only shallow serialization was used for each separate entity.
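Point 3 is easier to see with a toy model. What follows is only a rough sketch of the idea, written in Python so that it runs without an EF model or a database. None of these names (`Customer`, `Order`, `StateManager`, `serialize_shallow`) are EF APIs, and the real state manager is considerably more involved. Each order carries the key of its customer, the entities are serialized one at a time without their related objects, and attaching them on the other side is enough to stitch the graph back together.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through the graph
class Customer:
    id: int
    name: str
    orders: list = field(default_factory=list)  # filled in by fix-up, never serialized


@dataclass(eq=False)
class Order:
    id: int
    customer_key: Optional[int]          # stands in for the key carried on the reference
    customer: Optional[Customer] = None  # object link, never serialized


def serialize_shallow(entity) -> str:
    """Serialize scalar values and keys only, never related objects."""
    if isinstance(entity, Customer):
        return json.dumps({"type": "Customer", "id": entity.id, "name": entity.name})
    return json.dumps({"type": "Order", "id": entity.id,
                       "customer_key": entity.customer_key})


def deserialize(text: str):
    data = json.loads(text)
    if data["type"] == "Customer":
        return Customer(data["id"], data["name"])
    return Order(data["id"], data["customer_key"])


class StateManager:
    """Hypothetical stand-in for a state manager that fixes up relationships."""

    def __init__(self):
        self.customers = {}  # customer id -> Customer
        self.pending = {}    # customer id -> orders whose customer is not attached yet

    def attach(self, entity):
        if isinstance(entity, Customer):
            self.customers[entity.id] = entity
            # Resolve any stubs that were waiting for this customer.
            for order in self.pending.pop(entity.id, []):
                self._link(order, entity)
        else:
            customer = self.customers.get(entity.customer_key)
            if customer is not None:
                self._link(entity, customer)
            else:
                # Only one end is in memory: remember the relationship as a stub.
                self.pending.setdefault(entity.customer_key, []).append(entity)

    @staticmethod
    def _link(order, customer):
        order.customer = customer
        customer.orders.append(order)
```

The order in which the entities arrive doesn't matter in this sketch: an order attached before its customer simply waits as a stub until the customer shows up.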

 

  • ApplyPropertyChanges

Given an entity which is already attached to a state manager and another object which represents a new, updated version of that same entity, you can use this method to compare the two entities and record differences as changes.
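As a rough illustration of what "compare the two entities and record differences as changes" means, here is a small sketch in Python. It is not the EF implementation, and the names (`TrackedEntry`, `apply_property_changes`, the `id` key field) are made up for the example; entities are modeled as plain dictionaries to keep it self-contained.

```python
class TrackedEntry:
    """Hypothetical change-tracking record for one attached entity."""

    def __init__(self, entity: dict):
        self.current = entity          # the attached instance
        self.original = dict(entity)   # snapshot of the values at attach time
        self.modified = set()          # names of properties recorded as changed


def apply_property_changes(entry: TrackedEntry, updated: dict, key_fields=("id",)):
    """Copy differing scalar values from `updated` onto the attached entity."""
    # Both objects must represent the same entity, so their keys have to match.
    for key in key_fields:
        if entry.current[key] != updated[key]:
            raise ValueError("updated object does not represent the same entity")

    for name, new_value in updated.items():
        if name in key_fields:
            continue
        if entry.current.get(name) != new_value:
            entry.current[name] = new_value
            entry.modified.add(name)   # only real differences are recorded
    return entry.modified
```

Only the properties that actually differ end up recorded, so a later save in this model would have just those to send to the database.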

 

  • Attach on EntityReference

If you have two entity instances which you know are related in the store but the state manager is unaware of that relationship, then you can call Attach on the EntityReference to notify the state manager, just as you can call Attach on the context to notify the state manager of an entity that already exists in the database.

 

  • Improvements to EntityKey serialization

There was a bug in beta 2 (now fixed) which caused entities with EntityKeys not to serialize the EntityKey along with the rest of the entity data. A variety of additional improvements were made to increase the reliability of EntityKey serialization.

Extensibility and business logic enhancements

Because of some very unpleasant object lessons associated with other components we’ve worked on where too many events were added without thinking them all the way through, we are taking a pretty conservative approach to extensibility in the EF. Events and other extensibility mechanisms have been added as we could convince ourselves that they were both critical and well thought-out.

 

  • Partial methods in code generation for property changing and property changed events

Now generated entity classes include declarations of partial methods and calls to those methods before and after setting each property value. The partial method mechanism is great because the compiler completely optimizes away the method calls if no implementation is supplied. The naming convention we use is On<PropertyName>Changing and On<PropertyName>Changed.
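The shape of the pattern looks roughly like the sketch below. It is in Python, which has no partial methods, so the "generated" half is modeled as a base class with no-op hooks and the "hand-written" half as a subclass. One difference to keep in mind: here the no-op hooks are still called, whereas the C# compiler drops the calls entirely. All names are hypothetical, and the sketch assumes the changing hook receives the proposed value.

```python
class CustomerGenerated:
    """Stands in for the generated half of an entity class."""

    def __init__(self, name):
        self._name = name

    # Hook declarations: they do nothing unless the hand-written half overrides them.
    def on_name_changing(self, value):
        pass

    def on_name_changed(self):
        pass

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self.on_name_changing(value)   # called before the value is set
        self._name = value
        self.on_name_changed()         # called after the value is set


class Customer(CustomerGenerated):
    """Stands in for the hand-written half that adds business logic."""

    def __init__(self, name):
        super().__init__(name)
        self.log = []

    def on_name_changing(self, value):
        if not value:
            raise ValueError("name must not be empty")
        self.log.append(("changing", self._name, value))

    def on_name_changed(self):
        self.log.append(("changed", self._name))
```

Because the changing hook runs before the assignment, raising from it in this sketch leaves the old value in place.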

 

  • Load with MergeOption

This makes it possible to refresh a set of related entities (and the part of the entity graph which models them, whether a reference or a collection) with well-defined semantics.

 

  • AssociationChanged event

This event is exposed on both collections and references—think “collection changed” (fires for adds and deletes) but on both types of relationship objects.

Query improvements

 

  • Additional canonical functions for LINQ to Entities

The canonical functions are those which all providers are expected to support.

 

  • Apply operator elimination (makes more operations work in SQL Server 2000 and other databases)

The apply operator is a query capability which was added in SQL Server 2005. It’s very effective in some scenarios but unfortunately I’m not aware of any other databases which implement it (as soon as I say that someone will point one out, but in any case most databases do not). Prior to beta 3 this operator was used in a great many scenarios making the provider story for the EF surprisingly complicated. So, we worked to rewrite queries with this operator so that it isn’t required except in a very few scenarios (like when you use apply yourself in an eSQL query).

 

  • Compiled LINQ query

This really ought to be listed under “Performance Improvements” as well because it can make a big difference in the performance of LINQ queries. We list it here, though, because it also has an impact on the semantics of the query by making the binding to variables in the surrounding context more explicit. Check out this post by Alex James for more details.

 

  • ToTraceString() method on ObjectQuery<T> and EntityCommand to facilitate debugging

Now you can get the translated query text which will execute on the database from an EntityCommand or an ObjectQuery<T> by just calling this method.

Other

 

  • Connection management refinements

Prior to beta 3 the object context would open the underlying connection right away when it was constructed in order to gather some key metadata from it. Now that open operation is delayed until it is required. The overall summary of connection management is that by default the context will open the connection before a query or update operation and automatically close it afterwards. If you want to keep a connection open across multiple operations, you can open it explicitly (context.Connection.Open()) in which case you will need to close it yourself, or it will be closed when the context is disposed if the context constructed the connection (rather than it being created by your code and passed into the context’s constructor).
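The rules above can be summarized in a small model. This is only a sketch in Python of the behavior as I've described it, not EF code: `FakeConnection` and `Context` are hypothetical stand-ins, and `execute` represents any query or update operation.

```python
class FakeConnection:
    """Hypothetical connection that just tracks whether it is open."""

    def __init__(self):
        self.is_open = False
        self.open_count = 0

    def open(self):
        if not self.is_open:
            self.is_open = True
            self.open_count += 1

    def close(self):
        self.is_open = False


class Context:
    """Hypothetical context following the connection rules described above."""

    def __init__(self, connection=None):
        # Remember whether the context constructed the connection itself.
        self._owns_connection = connection is None
        self.connection = FakeConnection() if connection is None else connection
        # Nothing is opened here: the open is delayed until it is required.

    def execute(self, operation):
        # Open only if the caller has not already opened the connection.
        opened_here = not self.connection.is_open
        if opened_here:
            self.connection.open()
        try:
            return operation(self.connection)
        finally:
            # Close again only if this operation did the opening.
            if opened_here:
                self.connection.close()

    def dispose(self):
        # A connection passed in by the caller is left for the caller to close.
        if self._owns_connection:
            self.connection.close()
```

In this model, calling `ctx.connection.open()` before a batch of operations keeps a single connection open across all of them, and it stays open until you close it or dispose a context that created it.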

 

  • Provider interface allows better reasoning about primitive types

Essentially this is a matter of the EF admitting that primitive types and the details of the semantics around them are determined by the backend database/provider and the EF should avoid reasoning about them as much as possible—when it does need to reason about them it does so using information which the provider supplies.

In addition to the above list there were, of course, a number of bug fixes and other assorted adjustments. See the breaking changes doc on the ADO.NET team blog for details about the ones you are most likely to encounter right away if you have code written against beta 2 that you want to move forward.

Here's hoping that you will enjoy the improvements. As always, I'd love to hear about your experiences with the EF, criticisms, thoughts for how we can improve it, etc.

- Danny