EF7 - What Does “Code First Only” Really Mean

A while back we blogged about our plans to make EF7 a lightweight and extensible version of EF that enables new platforms and new data stores. We also talked about our EF7 plans in the Entity Framework session at TechEd North America.

Prior to EF7 there are two ways to store models: in the xml-based EDMX file format or in code. Starting with EF7 we will be retiring the EDMX format and having a single code-based format for models. A number of folks have raised concerns around this move, and most of them stem from a misunderstanding about what a statement like “EF7 will only support Code First” really means.

Code First is a bad name

Prior to EF4.1 we supported the Database First and Model First workflows. Both of these use the EF Designer to provide a boxes-and-lines representation of a model that is stored in an xml-based .edmx file. Database First reverse engineers a model from an existing database and Model First generates a database from a model created in the EF Designer.

In EF4.1 we introduced Code First. Understandably, based on the name, most folks think of Code First as defining a model in code and having a database generated from that model. In actual fact, Code First can be used to target an existing database or generate a new one. There is tooling to reverse engineer a Code First model based on an existing database. This tooling originally shipped in the EF Power Tools and then, in EF6.1, was integrated into the same wizard used to create EDMX models.

Another way to sum this up is that rather than a third alternative to Database & Model First, Code First is really an alternative to the EDMX file format. Conceptually, Code First supports both the Database First and Model First workflows.

Confusing… we know. We got the name wrong. Calling it something like “code-based modeling” would have been much clearer.

Is code-based modeling better?

Obviously there is overhead in maintaining two different model formats. But aside from removing this overhead, there are a number of other reasons that we chose to just go forward with code-based modeling in EF7.

  • Source control merging, conflicts, and code reviews are hard when your whole model is stored in an xml file. We’ve had lots of feedback from developers that simple changes to the model can result in complicated diffs in the xml file. On the other hand, developers are used to reviewing and merging source code.
  • Developers know how to write and debug code. While a designer is arguably easier for simple tasks, many projects end up with requirements beyond what you can do in the designer. When it comes time to drop down and edit things, xml is hard and code is more natural for most developers.
  • The ability to customize the model based on the environment is a common requirement we hear from customers. This includes scenarios such as a multi-tenant database where you need to specify a schema or table prefix that is known when the app starts. You may also need slight tweaks to your model when running against a different database provider. Manipulating an xml-based model is hard. On the other hand, using conditional logic in the code that defines your model is easy.
  • Code-based modeling is less repetitive because your CLR classes also make up your model and there are conventions that take care of common configuration. For example, consider a Blog entity with a BlogId primary key. In EDMX-based modeling you would have a BlogId property in your CLR class, a BlogId property (plus column and mapping) specified in xml and some additional xml content to identify BlogId as the key. In code-based modeling, having a BlogId property on your CLR class is all that is needed.
  • Providing useful errors is also much easier in code. We’ve all seen the “Error 3002: Problem in mapping fragments starting at line 46:… ” errors. The error reporting on EDMX could definitely be improved, but throwing an exception from the line of code-based configuration that caused an issue is always going to be easier.
    We should note that in EF6.x you would sometimes get these unhelpful errors from the Code First pipeline. This is because it was built over the infrastructure designed for EDMX. In EF7 this is not the case.
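
To make the points about conventions and conditional configuration concrete, here is a minimal sketch written against the EF6.x Code First API (the EF7 API may end up looking different). The Blog and BloggingContext classes and the BLOG_SCHEMA environment variable are hypothetical names chosen for illustration, and the sketch assumes the EntityFramework 6.x package is referenced.

```csharp
using System;
using System.Data.Entity;

public class Blog
{
    // Treated as the primary key by convention (<ClassName>Id);
    // no separate mapping or key declaration is required.
    public int BlogId { get; set; }
    public string Url { get; set; }
}

public class BloggingContext : DbContext
{
    // Hypothetical setting: the schema is decided once, when the app starts.
    // EF6.x caches the model per context type, so this value should not
    // change during the lifetime of the application.
    private static readonly string Schema =
        Environment.GetEnvironmentVariable("BLOG_SCHEMA") ?? "dbo";

    public DbSet<Blog> Blogs { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Ordinary conditional logic customizes the model for the environment.
        if (Schema != "dbo")
        {
            modelBuilder.HasDefaultSchema(Schema);
        }
    }
}
```

With BLOG_SCHEMA unset, the Blogs table is mapped to the default dbo schema. With it set to, say, tenant42, the same classes map to tenant42.Blogs without touching any xml.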

There is also an important feature that could have been implemented for EDMX, but was only ever available for code-based models.

  • Migrations allows you to create a database from your code-based model and evolve it as your model changes over time. For EDMX models you could generate a SQL script to create a database to match your current model, but there was no way to generate a change script to apply changes to an existing database.
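
For readers who have not used Migrations, the sketch below shows roughly what a change script looks like in EF6.x: a class with an Up method that applies a model change to an existing database and a Down method that reverts it. The migration name, the dbo.Blogs table and the Rating column are hypothetical, and EF7 migrations may look different.

```csharp
using System.Data.Entity.Migrations;

// Hypothetical migration: adds a Rating column to an existing Blogs table.
public partial class AddBlogRating : DbMigration
{
    public override void Up()
    {
        // Applied when the database is moved forward to this migration.
        AddColumn("dbo.Blogs", "Rating", c => c.Int(nullable: false, defaultValue: 0));
    }

    public override void Down()
    {
        // Applied when the database is rolled back past this migration.
        DropColumn("dbo.Blogs", "Rating");
    }
}
```

In EF6.x a class like this is normally scaffolded by the Add-Migration command in the Package Manager Console rather than written by hand, and applied with Update-Database.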

 

So, what will be in EF7?

In EF7 all models will be represented in code. There will be tooling to reverse engineer a model from an existing database (similar to what’s available in EF6.x). You can also start by defining the model in code and use migrations to create a database for you (and evolve it as your model changes over time).

We should also note that we’ve made some improvements to migrations in EF7 to resolve the issues folks encountered trying to use migrations in a team environment.

 

What about…

We’ve covered all the reasons we think code-based modeling is the right choice going forward, but there are some legitimate questions this raises.

What about visualizing the model?

The EF Designer was all about visualizing a model, and in EF6.x we also had the ability to generate a read-only visualization of a code-based model (using the EF Power Tools). We’re still considering what the best approach is for EF7. There is definitely value in being able to visualize a model, especially when you have a lot of classes involved.

With the advent of Roslyn, we could also look at having a read/write designer over the top of a code-based model. Obviously this would be significantly more work and it’s not something we’ll be doing right away (or possibly ever), but it is an idea we’ve been kicking around.

What about the “Update model from database” scenario?

“Update model from database” is a process that allows you to incrementally pull additional database objects (or changes to existing database objects) into your EDMX model. Unfortunately the implementation of this feature wasn’t great and you would often end up losing customizations you had made to the model, or having to manually fix up some of the changes the wizard tried to apply (often dropping down to hand-editing the xml).

For Code First you can re-run the reverse engineer process and have it regenerate your model. This works fine in basic scenarios, but you have to be careful how you customize the model; otherwise your changes will be reverted when the code is regenerated. There are some customizations that are difficult to apply without editing the scaffolded code.

Our first step in EF7 is to provide a similar reverse engineer process to what’s available in EF6.x – and that is most likely what will be available for the initial release. We do also have some ideas around pulling in incremental updates to the model without overwriting any customization to previously generated code. These range from only supporting simple additive scenarios, to using Roslyn to modify existing code in place. We’re still thinking through these ideas and don’t have definite plans as yet.

What about my existing models?

We’re not trying to hide the fact that EF7 is a big change from EF6.x. We’re keeping the concepts and many of the top-level APIs from past versions, but under the covers there are some big changes. For this reason, we don’t expect folks to move existing applications to EF7 in a hurry. We are going to continue development on EF6.x for some time.

We have another blog post coming shortly that explores how EF7 is part v7 and part v1 and the implications this has for existing applications.

 

Is everyone going to like this change?

We’re not kidding ourselves: it’s not possible to please everyone, and we know that some folks are going to prefer the EF Designer and EDMX approach over code-based modeling.

At the same time, we have to balance the time and resources we have and deliver what we think is the best set of features and capabilities to help developers write successful applications. This wasn’t a decision we took lightly, but we think it’s the best thing to do for the long-term success of Entity Framework and its customers – the ultimate goals being to provide a faster, easier-to-use stack and to reduce the cost of adding support for highly requested features as we move forward.