EF Feature CTP4: DbContext & Databases

 


The information in this post is out of date.

Visit msdn.com/data/ef for the latest information on current and past releases of EF.

For Code First to a New Database see https://msdn.com/data/jj193542

For Code First to an Existing Database see https://msdn.com/data/jj200620


 

We recently announced the release of Feature CTP4 for the ADO.Net Entity Framework (EF). CTP4 contains a preview of new features that we are considering adding to the core framework and would like community feedback on. CTP4 builds on top of the existing Entity Framework 4 (EF4) which shipped as part of .NET Framework 4.0 and Visual Studio 2010.

CTP4 contains the first preview of a set of Productivity Improvements for EF that provide a cleaner and simpler API surface designed to allow developers to accomplish the same tasks with less code and fewer concepts. The Productivity Improvement effort is described in more detail in a recent Design Blog post.

In addition to a simplified API surface the Productivity Improvements also offer a streamlined Code First experience that takes care of common boiler plate code and uses convention over configuration to reduce the amount of code needed to start performing data access. You can then gradually start to override these conventions with your own configuration as you progress through the development lifecycle and ultimately deploy to a production environment. One example of these conventions is database location and provisioning which we will cover in this post.

This post is largely targeted at using the Code First approach with DbContext to generate a database, if you are mapping to an existing database then this is covered at the end of the post.

Default Conventions

First let’s look at the default behavior of DbContext and how it uses convention rather than configuration to reduce the amount of code we need to write to get an application up and running. Below is a complete application that uses DbContext to persist and query data using DbContext. No additional code or configuration is required; DbContext will automatically create a database for us based on our domain model. The database will be created on our localhost\SQLEXPRESS instance and will be named after the fully qualified type name of our derived context (in the following example this would be PI.DbDemo.ProductCatalog).

using System;
using System.Collections.Generic;
using System.Data.Entity;

namespace PI.DbDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            using (var context = new ProductCatalog())
            {
                // Persist Data
                var food = new Category { Name = "Food" };
                context.Categories.Add(food);
                context.SaveChanges();

                // Query Data
                foreach (var cat in context.Categories)
                {
                    System.Console.WriteLine(cat.Name);
                }
            }

            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }
    }

    public class ProductCatalog : DbContext
    {
        public DbSet<Category> Categories { get; set; }
        public DbSet<Product> Products { get; set; }
    }

    public class Category
    {
        public int CategoryId { get; set; }
        public string Name { get; set; }

        public ICollection Products { get; set; }
    }

    public class Product
    {
        public int ProductId { get; set; }
        public string Name { get; set; }
        public decimal UnitPrice { get; set; }

        public Category Category { get; set; }
    }
}

Of course this convention is useful to get up and running but it’s not going to get us all the way to deploying in a production environment, you probably aren’t going to be using a local SQL Express instance in production and if you’re an enterprise developer your DBA probably isn’t going to overjoyed at the idea of you application having permissions to create databases (with good reason). In the next sections we’ll look at how you can start to take control over the database as your requirements progress.

Connection Factories

Under the covers there is a convention that is taking the name of your context and turning it into a database connection, this is an AppDomain wide setting that can be changed via a static property; System.Data.Entity.Infrastructure.Database.DefaultConnectionFactory. Connection factories implement the System.Data.Entity.Infrastructure.IDbConnectionFactory interface which defines a single CreateConnection method. When you use the default constructor on DbContext the fully qualified name of your context is passed to the default connection factory to obtain a database connection.

public interface IDbConnectionFactory
{
    DbConnection CreateConnection(string nameOrConnectionString);
}

Changing the Database Name

If you just want to change the name of the database that is generated then you can control the string that is passed to the default connection factory by using the DbContext constructor that specifies the nameOrConnectionString parameter. Here is our derived context updated to specify a database name:

public class ProductCatalog : DbContext
{
    public ProductCatalog()
        :base("DemoProductStore")
    { }

    public DbSet<Category> Categories { get; set; }
    public DbSet<Product> Products { get; set; }
}

Changing the Database Server

If you want to have your database on another Microsoft SQL Server Instance then you can tweak the settings on the SQL Client connection factory that is included in CTP4; System.Data.Entity.Infrastructure.SqlConnectionFactory. This connection factory includes a constructor that allows us to override pieces of the final connection sting, such as username, password and server. We need to make changes to the default convention before any contexts are created in our AppDomain, in the case of our console application we can just do this at the start of the Main method:

static void Main(string[] args)
{
    Database.DefaultConnectionFactory =
         new SqlConnectionFactory("Server=MyDatabaseServer");
 
   ...

}

Changing to SQL Compact

Along with the SQL Client connection factory we also include the System.Data.Entity.Infrastructure.SqlCeConnectionFactory which will generate connections to SQL Compact databases. Because the SQL Compact providers aren’t backwards compatible you will need to specify the invariant name of the provider version you want to use. Currently the SQL Compact 4.0 provider is the only one that supports Code First database creation and it is available for download as a separate CTP.

static void Main(string[] args)
{
    Database.DefaultConnectionFactory =
         new SqlCeConnectionFactory("System.Data.SqlServerCe.4.0");

    ...

}

By default SQL Compact files will be created in the |DataDirectory| directory, for executables this is the same directory as the executable, for web applications this is the ‘App_Data’ directory. The SQL Compact factory also includes constructors to override the directory that databases are created in, or any other part of the connection string.

App.config/Web.config

All these conventions are great but if our database server changes between dev, test and production then we really want to be able to change it easily without having to recompile code. No problem, just add a connection string to your applications config file with a name that matches the name of your context (either fully-qualified or not). Because the shape of the model comes from your code rather than xml files this is just a plain connection string rather than an EntityClient connection string used in other areas of EF.

<configuration>
  <connectionStrings>
    <add name ="ProductCatalog"
         providerName="System.Data.SqlClient"
         connectionString="Server=.\SQLEXPRESS;Database=ProductCatalog;Integrated Security=True" />
  </connectionStrings>
</configuration>

Note that if you pass a string to a DbContext constructor (for the nameOrConnectionString parameter) then that string should match the name of the connection string added to your config file.

Database Initializers

So far we’ve looked at how a database is located, now let’s look at how that database is initialized. By default if the database does not exist then DbContext will automatically create a database for you without any data in it, but of course this can be changed. Database initialization is handled via a setting at the AppDomain level which can be tweaked or replaced for your context via the static System.Data.Entity.Infrastructure.Database.SetInitializer method. Initializers need to implement the System.Data.Entity.Infrastructure.IDatabaseInitializer interface, which contains a single method to initialize the database for a given context:

public interface IDatabaseInitializer<in TContext>
     where TContext : DbContext
{
    void InitializeDatabase(TContext context);
}

Having this very-generalized hook gives developers and third parties the opportunity to build custom solutions to help with initialization, in addition to the options provided in the box.

It is worth calling out that this is a very new feature that only just made it into CTP4 and we know it has some rough edges, but we really want your feedback so we decided to include it. As you’ll see when we look at evolving models there are some missing pieces in the CTP4 story but we feel like the database initializer hook is the right way to approach the problem and we just need to expand the abilities of the initializers and tooling that we provide.

Evolving Models

Of course it is unlikely that our domain model will remain the same forevermore after the first time we run our application. This in turn means our database schema is going to need to evolve along with our domain model, this scenario is going to highlight one of the big gaps in CTP4 and something we are working to improve. Ideally we would have a set of tools that help you migrate the existing database, either automatically or by generating migration scripts that can be tweaked and then saved for later use against test and production databases. We’re not there yet though, in CTP4 you need to drop and re-create the database with the new schema to match your domain model.

So let’s go ahead and change our domain model, I’m just adding a Description field to our Product class:

public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; }
    public decimal UnitPrice { get; set; }
    public string Description { get; set; }

    public Category Category { get; set; }
}

Now if we run our application we’ll get an InvalidOperationException because our schema doesn’t have a column for the new Description field:

“The model backing the 'ProductCatalog' context has changed since the database was created.  Either manually delete/update the database, or call Database.SetInitializer with an IDatabaseInitializer instance.  For example, the RecreateDatabaseIfModelChanges strategy will automatically delete and recreate the database, and optionally seed it with new data.”

As the exception suggests there is an initialization strategy in CTP4 that will drop and re-create the database for us when the model changes. We need to register the initializer before any contexts are created in our AppDomain, in the case of our console application we can just do this at the start of the Main method. Running the application again will cause the old database to be dropped and the new database with the Description column to be created, obviously this isn’t ideal for all situations but it’s where we got to for CTP4.

static void Main(string[] args)
{
    Database.SetInitializer<ProductCatalog>(
        new RecreateDatabaseIfModelChanges<ProductCatalog>());

    ....
}

Other Strategies

In addition to the RecreateDatabaseIfModelChanges strategy CTP4 also includes two other strategies, CreateDatabaseOnlyIfNotExists and AlwaysRecreateDatabase, the names are pretty self-explanatory. CreateDatabaseOnlyIfNotExists  is the default strategy. AlwaysRecreateDatabase can be useful in testing when combined with seed data, allowing you to reset the database to a clean state before each test run, this is shown in the next section.

Seed Data

All three of the strategies included in CTP4 can be derived from and have a virtual Seed method that is called after the database has been created. This allows us to insert some data into the database once it has been created, an instance of the context type is passed into the method so you can specify seed data in terms of your domain objects and use the context to persist them. Of course the underlying database connection is available from context.Database.Connection so you can always interact directly with the database if you want.

Let’s define an initializer for use in a test project that will recreate the database before each test run with a known set of test data. I’m sure some of you are cringing at the thought of running tests against an actual database all the time. It’s definitely not for everyone and you can certainly use interfaces and in-memory fakes rather than hitting the database (we created the IDbSet<T> interface for exactly this reason). This is simply another option for folks who don’t need/want to create a layer of abstraction between their application and data access layer.

public class ProductTestInitializer : AlwaysRecreateDatabase<ProductCatalog>
{
    protected override void Seed(ProductCatalog context)
    {
        var food = new Category { Name = "Food" };
        context.Categories.Add(food);

        context.Products.Add(new Product
        {
            Name = "Bread",
            UnitPrice = 2,
            Category = food
        });

        context.Products.Add(new Product
        {
            Name = "Butter",
            UnitPrice = 5,
            Category = food
        });

        context.Products.Add(new Product
        {
            Name = "Vegemite",
            UnitPrice = 4,
            Category = food
        });
    }
}

If we are using MSTest (included with Visual Studio) for our testing then we use a ‘Class Initialize’ method to register the initializer, this will run before we construct a context in any of our tests:

[TestClass]
public class MyTests
{
    [ClassInitialize]
    public void ClassInitialize(TestContext c)
    {
        Database.SetInitializer<ProductCatalog>(new ProductTestInitializer());
    }

    [TestMethod]
    public void MyTest1()
    {
    }
}

Schema Metadata

One of the challenges in database initialization is in knowing whether an existing database is compatible with the current domain model. In CTP4 we include an additional “EdmMetadata” table in generated schemas to help answer this question. The table includes some information about the model that was used to generate the database and is used internally to support the DbContext.Database.ModelMatchesDatabase method.

Of course not everyone is going to be happy with this table being included in their database so you can switch it off using the Code First ModelBuilder API, if you do turn it off then some of the initialization strategies will not be able to function correctly.

public class ProductCatalog : DbContext
{
    public DbSet<Category> Categories { get; set; }
    public DbSet<Product> Products { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.IncludeMetadataInDatabase = false;
    }
}

Customizing the Generated Schema

Not everyone is going to be happy with the database schema that is generated by default. It’s not really in the scope of this post but we’ll take a quick look at customizing the schema. Below is an example of using the Fluent API to control the table names for our two classes. The Fluent API provides many more knobs for controlling the schema including column names, splitting entities between multiple tables, inheritance patterns and many more. This is one area of the Fluent API that still has some rough edges and we are working to make the code required for mapping more succinct and intuitive for the next release.

public class ProductCatalog : DbContext
{
    public DbSet<Category> Categories { get; set; }
    public DbSet<Product> Products { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Category>()
            .MapSingleType()
            .ToTable("purchasing.CategoryLookup");

        modelBuilder.Entity<Product>()
            .MapSingleType()
            .ToTable("purchasing.ProductCatalog");
    }
}

Working with Existing Databases

Most of the content in this post has dealt with scenarios where Code First is responsible for generating the database. If you have an existing database then you don’t need database location or initialization. Here is the process to follow for using Code First against an existing database:

  1. Define the classes that make up you domain model
  2. Define a derived DbContext and use the Fluent API and/or DataAnnotations to tell Code First how your classes map to your database
  3. Add a connection string to you app/web.config file that has the same name as your derived context and points to your database
  4. Switch off the initialization functionality for your context by registering null:
        Database.SetInitializer<ProductCatalog>(null);

Summary

In this post we looked at how DbContext locates a database and initializes it for Code First developers. We covered the default behavior and how to tweak it or replace it with your own behavior, we also saw how to turn off the conventions when mapping to an existing database. There are some missing pieces in the database initialization space but we would like your feedback on the hooks we have made available and the general approach we are taking.

If you need assistance with EF Feature CTP4 we have an Entity Framework Pre-Release Forum.

Rowan Miller
Program Manager
ADO.Net Entity Framework