Productivity Improvements for the Entity Framework

Background

We’ve been hearing a lot of good feedback on the recently released update to the Entity Framework in .NET 4. This release marks a significant advancement from the first release that shipped with .NET 3.5 SP1. I’m not going to spend time here talking about what’s new, but you can check here to see for yourself.

With all that said, there are still a number of things we can do to simplify the process of writing data access code with the Entity Framework. We’ve been paying attention to the most common patterns that we see developers using with the EF and have been brewing up a set of improvements to the Entity Framework designed to allow developers to accomplish the same tasks with less code and fewer concepts.
 

These improvements provide a cleaner and simpler API surface that focuses your attention on the most common scenarios but still allows you to drill down to more advanced functionality when it’s needed. We hope you will enjoy this simpler experience, but we should be quick to assure you that this is NOT a new data access technology. These improvements are built on the same technology for mapping, LINQ, providers and every other part of the Entity Framework. Think of this as a fast path to writing data access code using conventions over configuration, better tuned APIs and other techniques intended to reduce development time when using the EF.

At this stage we have worked through what we think the core API and functionality should look like and would like your feedback. There are still some capabilities such as data binding and concurrency resolution which will cause the API to evolve as we continue the design process, but the main concepts are in place, so it is a good time for feedback.

 

Introducing DbContext & DbSet

At the heart of the Entity Framework Productivity Improvements are two new types, DbContext and DbSet<TEntity>. DbContext is a simplified alternative to ObjectContext and is the primary object for interacting with a database using a specific model. DbSet<TEntity> is a simplified alternative to ObjectSet<TEntity> and is used to perform CRUD operations against a specific type from the model. These new types can be used regardless of whether you created your model using the Entity Designer or code.

 

 

The obvious question is ‘Why not just simplify ObjectContext and ObjectSet<T>?’ We are opting to introduce these new types in order to, on the one hand, preserve full backward compatibility with existing EF applications and continue to address all of the advanced scenarios that are possible given the EF’s existing flexibility, while on the other hand streamlining the experience of using the EF and tuning it for the most common cases. We believe it is critical that the EF programming experience improve in some fundamental ways, and at the same time we are absolutely committed to our existing customers. Establishing a collaborative relationship between the existing types and the new types allows us to achieve both requirements. Also, there are easy ways to get to ObjectContext and ObjectSet<T> from DbContext and DbSet in case you want more control for a particular task.

One point we want to be very clear about is that these new types are not replacing any existing types; they are a simplified alternative that build on the existing types, and as we add features to the Entity Framework they will always be available in ObjectContext/ObjectSet, and they will also be available in DbContext/DbSet if appropriate.

 

We’ll drill into the API surface later, but first let’s take a look at the coding experience using these new types.

Code First Experience

DbContext provides a simplified Code First pattern that requires less code and takes care of some common concerns such as model caching, database provisioning, schema creation and connection management. This simplified pattern uses a number of conventions to take care of these tasks and allows tweaking or overriding of this behavior when required. Let’s start off by using these conventions to write the code needed to build a console application that performs data access using a model:

A Visual Basic version of these code samples is available here.

using System.Collections.Generic;

using System.Data.Entity;

namespace MyDataAccessDemo

{

    class Program

    {

        static void Main(string[] args)

        {

            using (var context = new ProductContext())

            {

                var food = new Category { CategoryId = "FOOD" };

                context.Categories.Add(food);

                var cheese = new Product { Name = "Cheese" };

                cheese.Category = context.Categories.Find("FOOD");

                context.Products.Add(cheese);

                context.SaveChanges();

            }

        }

    }

    public class ProductContext : DbContext

    {

        public DbSet<Product> Products { get; set; }

        public DbSet<Category> Categories { get; set; }

    }

    public class Product

    {

        public int ProductId { get; set; }

        public string Name { get; set; }

        public Category Category { get; set; }

    }

    public class Category

    {

        public string CategoryId { get; set; }

        public string Name { get; set; }

        public ICollection<Product> Products { get; set; }

    }

}

That is 100% of the code you would write to get this program running. No separate model definition, XML metadata, config file or anything else is required. Conventions are used to fill in all of this information. Obviously there is quite a bit going on under the covers so let’s take a closer look at some of the things DbContext is doing automatically.

Model Discovery

During construction we scan the derived context for DbSet properties and include the types in the model. Model Discovery uses the existing Code First functionality so the new default conventions we recently blogged about are processed during discovery. You can opt out of set discovery by specifying an attribute on the set properties that should be ignored.

 

 

Of course there will be times when you want to further describe a model or change what was detected by convention. For example say you have a Book entity whose ISBN property is the primary key, this won’t be detected by convention. There are two options here; you can use data annotations to annotate the property in your class definition:

public class Book

{

    [Key]

    public string ISBN { get; set; }

    public string Title { get; set; }

}

Alternatively DbContext includes a virtual method that can be overridden to use the Code First fluent API on ModelBuilder to further configure the model:

public class ProductContext : DbContext

{

    public DbSet<Book> Books { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)

    {

        modelBuilder.Entity<Book>().HasKey(b => b.ISBN);

    }

}

Model Caching

There is some cost involved in discovering the model, processing Data Annotations and applying fluent API configuration. To avoid incurring this cost every time a derived DbContext is instantiated the model is cached during the first initialization. The cached model is then re-used each time the same derived context is constructed in the same AppDomain. Model caching can be turned off by setting the CacheForContextType property on ModelBuilder to ‘false’ in the OnModelCreating method.

DbSet Initialization

You’ll notice in the sample that we didn’t assign a value to either of the DbSet properties on the derived context. During construction DbContext will scan the derived context for DbSet properties and, if they expose a public setter, will construct a new DbSet and assign it to the property. You can opt out of set initialization by specifying an attribute on the set properties that should not be initialized.

You can also create DbSet instances using the DbContext.Set<TEntity>() method if you don’t want to expose public setters for the DbSet properties.

Database Provisioning

By default the database is created and provisioned using SqlClient against localhost\SQLEXPRESS and has the same name as the derived context. This convention is configurable and is controlled by an AppDomain setting that can be tweaked or replaced. You can tweak the default SqlClient convention to connect to a different database, replace it with a SqlCe convention that we include or define your own convention by implementing the IDbConnectionFactory interface.

public interface IDbConnectionFactory

{

    DbConnection CreateConnection(string databaseName);

}

The active IDbConnectionFactory can be retrieved or set via the static property, Database.DefaultConnectionFactory.

DbContext also includes a constructor that accepts a string to control the value that is passed to the convention, the SqlClient and SqlCE factories allow you to specify either a database name or the entire connection string.

Before calling the convention, DbContext will check in the app/web.config file for a connection string with the same name as your context (or the string value if you used the constructor that specifies a string). If there is a matching entry, we will use that rather than calling the convention. Because connection string entries also include provider information this allows you to target multiple providers in one application.

Finally, if you want full control over your connections there is a constructor on DbContext that accepts a DbConnection.

Database First and Model First Experience

In the latest version of the Entity Framework that shipped with .Net Framework 4.0 and Visual Studio 2010 we introduced T4 based code generation. T4 allows you to customize the code that is generated based on a model you have defined using the designer in either the Database First or Model First approach. The default template generates a derived ObjectContext with an ObjectSet<T> for each entity set in your model.

The productivity improvements will also include a template that generates a derived DbContext with a DbSet<TEntity> for each entity set in your model. This allows Model First and Database First developers to make use of the simplified API surface described in the next section.

API Surface

DbContext is the starting point for interacting with your model. Compared to ObjectContext it has a greatly reduced number of methods and properties that are exposed at the root level. The aim is to expose only the most commonly used methods on DbContext and have the ability to drill down to more advanced APIs. One example of this is the Database property that exposes database related APIs. We will add a couple more members as we work through adding the rest of the advanced functionality but we want to keep it as minimal as possible. In most cases you would work with a context that derives from DbContext and exposes strongly typed DbSet properties for the types in your model.

public class DbContext : IDisposable

{

    public DbContext(DbModel model, string nameOrConnectionString);

    public DbContext(DbModel model, DbConnection existingConnection);
public DbContext(ObjectContext objectContext);

    protected DbContext();

    protected DbContext(string nameOrConnectionString);

    protected DbContext(DbConnection existingConnection);

    public Database Database { get; }

    protected ObjectContext ObjectContext { get; }

   

    protected virtual void OnModelCreating(ModelBuilder modelBuilder);

    public virtual int SaveChanges();

    public DbSet<TEntity> Set<TEntity>() where TEntity : class;

    public void Dispose();

}

 

DbModel Constructors

These constructors can be used with Database First, Model First and Code First development. They are used by our T4 template for Database First and Model First Development and can also be used in Code First scenarios where a model is built externally using ModelBuilder.

Code First previously contained a ContextBuilder type which we have now split into two components, ModelBuilder and DbModel. ModelBuilder is mutable and exposes the fluent API for defining your model. ModelBuilder creates an immutable DbModel type that can be used to construct an ObjectContext or DbContext. DbModel can also be constructed from a Database First or Model First approach where an edmx file is generated.

ObjectContext Constructor

If you have an existing code base that uses ObjectContext and want to make use of the alternate surface of DbContext in some parts of your code then this constructor allows you to wrap the ObjectContext with the simpler surface.

Protected Constructors

The three protected constructors can be exposed or used within a derived context where you want to make use of the simplified Code First experience which was explained in the ‘Code First Experience’ section. They are protected because the model discovery, database provisioning and model caching mechanisms rely on having a derived context.

Database

The Database property exposes an instance of the new Database type that contains methods for interacting with the underlying database.

public class Database

{

    public DbConnection Connection { get; }

    public void Create ();

    public bool CreateIfNotExists();

    public bool Exists();

    public static bool Exists(string nameOrConnectionString);

    public static bool Exists(DbConnection existingConnection);

    public void Delete ();

    public static void Delete (string nameOrConnectionString);

    public static void Delete (DbConnection existingConnection);

    public bool DeleteIfExists();

    public static bool DeleteIfExists(string nameOrConnectionString);

    public static bool DeleteIfExists(DbConnection existingConnection);

}

 

ObjectContext

DbContext uses an ObjectContext internally and we make this available as a protected property just in case you ever need to drop down to the lower level API. You can use or expose the required functionality from a derived DbContext.

OnModelCreating

This protected member can be overridden when defining a derived context in CodeFirst development and allows you tweak the shape of your model that was detected by convention.

SaveChanges

This method exposes the same functionality as SaveChanges on ObjectContext and persists all pending changes to the database. It represents the Unit of Work for the context.

Set<TEntity>

This method will create a DbSet<TEntity> for an entity type that is part of your model, similar to CreateObjectSet<T> on ObjectContext.

DbSet represents a collection of a given entity type in your model, similar to ObjectSet<T> except that this new type also supports derived types whereas ObjectSet<T> only supported base types.

public class DbSet<TEntity> : IDbSet<TEntity>, IQueryable<TEntity> where TEntity : class

{

    public void Add(TEntity entity);

    public void Remove(TEntity entity);

    public void Attach(TEntity entity);

    public TEntity Find(params object[] keyValues);

}

Add, Remove & Attach

Add, Remove and Attach are similar to AddObject, DeleteObject and Attach on ObjectSet<T>. We’ve renamed Add and Remove to keep DbSet<TEntity> consistent with other sets in the framework.

Add and Attach can now be called on objects that are already tracked by the context, Add will ensure the object is in the added state and Attach the unchanged state. This is helpful in N-Tier and Web scenarios where you have a detached graph containing both existing and new entities that need to be hooked up to the context.

For example assume you have a new product that is linked to an existing category, neither instance is attached to a context (they may have been returned from a distributed client via a web service call or built from a post back in a web application). Because Add and Attach are graph operations, calling Add on the product will also Add the category. Previously you would need to drop down to lower level APIs (ObjectStateManager.ChangeObjectState) to mark the category as unchanged. Now however this can be achieved by calling Attach:

public void AddProduct(Product product)

{

    using (var context = new ProductContext())

    {

        context.Products.Add(product);

        context.Categories.Attach(product.Category);

        context.SaveChanges();

    }

}

Find

The new Find method will locate an object with the supplied primary key value(s). Find initially checks the in-memory objects before querying the database and is capable of finding added entities that haven’t been persisted to the store yet. If find doesn’t locate an entity with the matching key it returns null.

IDbSet<TEntity>

DbSet<TEntity> implements IDbSet<TEntity> to facilitate building testable applications.

public interface IDbSet<TEntity> : IQueryable<TEntity> where TEntity : class

{

    void Add(TEntity entity);

    void Attach(TEntity entity);

    void Remove(TEntity entity);

    TEntity Find(params object[] keyValues);

}

 

IDbSet<TEntity> allows you to define an interface that can be implemented by your derived context, similar to the example shown below. This allows the context to be replaced with an in-memory test double for testing. DbContext will still perform set discovery and initialization, described in the ‘Code First Development’ section, for properties typed as IDbSet<TEntity>.

public interface IProductContext

{

    IDbSet<Product> Products { get; }

    IDbSet<Category> Categories { get; }

}

public class ProductContext : DbContext, IProductContext

{

    public IDbSet<Product> Products { get; set; }

    public IDbSet<Category> Categories { get; set; }

}

Summary

These improvements are designed to provide a cleaner and simpler API surface that allows you to achieve common scenarios with ease while also allowing you to drill down to more advanced functionality when required. The improvements build on top of the existing Entity Framework components using conventions over configuration and a simplified API surface.

We’d like to hear any feedback you have on the proposed functionality and API surface.

Rowan Miller
Program Manager
ADO.NET Entity Framework Team