Enumeration Support in Entity Framework

We recently shipped Entity Framework 4.1, and it has been exciting to see the reception the new Code First and DbContext APIs have been getting in the developer community. What is also exciting is to be able to start talking more about other features we have been working on, like table-valued functions, unique constraints and spatial types support.

 

At present, enums support is by far the most requested improvement for Entity Framework. We know it because we get asked about it each time we meet with a customer. We also see questions every week in the comments of our blogs, in our forums, in tweets like this, and likewise in our own feedback site, in which enums currently has more than 1,600 votes. That is about 50% more votes than Code First!

 

After having had to explain several times why it takes so long for us to add enums support to EF, we are happy to say that we plan to do it in the next version after 4.1.

 

The goal of this blog post is to describe the design decisions we have made so far and to gather your feedback so that we can refine the design and calibrate the direction we have taken.

Understanding Enums (and why they are so important)

Unlike other things we get to work on, enumerations are an existing programming language feature. Our primary goal is to make sure the feature works as seamlessly as possible with the EF in Object/Relational Mapping scenarios.

 

At the same time we have to focus on incorporating the concept of enums into the core of the metadata system underlying the EF, the Entity Data Model (EDM), in a manner that serves other key pieces of the EDM ecosystem, such as OData and WCF Data Services.

 

In order to achieve these goals we had to first recognize what enums are (and what they aren’t) from the language design perspective, look at how they are commonly used within the realm of data oriented applications, understand requirement gaps, prioritize scenarios and come up with a design that can satisfy the most important among them.

 

A glimpse at enumeration support in the .NET programming languages

The design of enums support in .NET programming languages like C#, VB and F# is amazingly simple. Yet from the perspective of modeling a specific domain, enums unquestionably provide great expressiveness.

This becomes apparent when looking at this straightforward example in C#:

 

    public enum ShippingMethod

    {

        Express,

        Priority,

        Ground

    }

    [Flags]

    public enum ContentType : byte

    {

        Liquid = 1,

        Perishable = 2,

        Edible = 4

    }

    public class Package

    {

        public int Id { get; set; }

        public string Description { get; set; }

        public ShippingMethod ShippingMethod { get; set; }

        public ContentType? ContentType { get; set; }

        public string Reference { get; set; }

    }

 

Given this definition of Package, ShippingMethod and ContentType, the code necessary to change the ShippingMethod and ContentType of a package can be as simple as this:

 

    juiceBox.ShippingMethod = ShippingMethod.Ground;

    juiceBox.ContentType = ContentType.Liquid | ContentType.Perishable | ContentType.Edible;

Structural definition

In C# enums can be structurally described like this:

  • The type declaration of an enum contains the name of the type and a list of zero or more named constants referred to as enum members
  • Each enum type also has an underlying type which is a non-nullable integral primitive type (Int32 if not specified in the declaration)
  • Each enum member consists of a member name and a member value
  • The set of values that an enum member can take is the valid range for the underlying type
  • The value of an enum member is optional in the type declaration.  If the value is not specified, it will be zero for the first member and one more than the value of the previous member for subsequent members
  • Multiple members with different names can have the same value
  • Member names are case sensitive and must be unique within the enum type declaration

 

Instances of enum types present specific characteristics:

  • Any enum value can be explicitly cast to its underlying type
  • All possible values of an enum type can be obtained by casting any valid value of its underlying type to the enum type (i.e. valid enum values are not limited to the declared members of the enum type)
  • However, there is no implicit conversion between two different enum types: an explicit cast is always required
  • The default value for any enum type is always zero, cast to the enum type, even if there is no declared member in the enum type with a value of zero
  • Enum instances are equal-comparable and order-comparable against any other instance of the same enum type
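The characteristics listed above can be illustrated with a minimal sketch. The EnumInstanceSketch class and its method names are hypothetical and exist only for illustration; the enum types are the ones from the earlier example.

```csharp
using System;

public enum ShippingMethod { Express, Priority, Ground }

[Flags]
public enum ContentType : byte { Liquid = 1, Perishable = 2, Edible = 4 }

// Hypothetical helper class, for illustration only.
public static class EnumInstanceSketch
{
    // An enum value can be explicitly cast to its underlying type.
    public static int ToUnderlying(ShippingMethod method)
    {
        return (int)method;
    }

    // Any value of the underlying type can be cast to the enum type,
    // even when no declared member has that value.
    public static ShippingMethod FromUnderlying(int value)
    {
        return (ShippingMethod)value;
    }

    // The default value is zero cast to the enum type, even though
    // ContentType declares no member with the value zero.
    public static ContentType DefaultContentType()
    {
        return default(ContentType);
    }
}
```

For instance, FromUnderlying(42) yields a ShippingMethod instance that does not correspond to any declared member, which Enum.IsDefined can detect.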

 

The following comparisons are supported for enums in C#:

  • Equality comparison between enums: ==, !=
  • Order comparison between enums: <, >, <=, >=

 

The following operators are supported in C# and return an enum of the same type:

  • Simple arithmetic between enums and the underlying type: +, -, ++, --, +=, -=
  • Bitwise operators between enums of the same type: ^, &, |, ~
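As a short sketch of the comparison, arithmetic and bitwise operators described above (the EnumOperatorSketch class and its methods are hypothetical names used only for this illustration):

```csharp
using System;

public enum ShippingMethod { Express, Priority, Ground }

[Flags]
public enum ContentType : byte { Liquid = 1, Perishable = 2, Edible = 4 }

// Hypothetical helper class, for illustration only.
public static class EnumOperatorSketch
{
    // Order comparison between two instances of the same enum type.
    public static bool ComesAfterExpress(ShippingMethod method)
    {
        return method > ShippingMethod.Express;
    }

    // Adding a value of the underlying type to an enum yields an enum.
    public static ShippingMethod Next(ShippingMethod method)
    {
        return method + 1;
    }

    // Bitwise operators between enums of the same type yield an enum:
    // here the given flag is cleared from the value.
    public static ContentType Without(ContentType value, ContentType flag)
    {
        return value & ~flag;
    }
}
```

Note that Next does not check whether the result is a declared member, so Next(ShippingMethod.Ground) produces the undeclared value 3.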

 

The definition of enums in VB is very similar to C#, although enum types in VB are required to have at least one member, and member names are case insensitive. Operator rules over enums are also different in VB: most arithmetic operators can be applied but the result of the operation will be of the underlying type instead of being an enumeration. In fact, only bitwise operations between two enum instances will result in another enum of the same type.

Independently of the language, the following library methods on System.Enum are particularly useful:

  • HasFlag: Used to verify whether an instance of an enum type contains a specific value (see more on enums that can contain multiple values in the next section).
  • Parse and TryParse: Convert a valid string representation to an enum instance.
  • ToString: Converts an instance of an enum to a string representation.
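The following sketch shows how these library methods might be called. EnumLibrarySketch and its method names are hypothetical; HasFlag, Parse, TryParse and ToString are the System.Enum methods named above.

```csharp
using System;

public enum ShippingMethod { Express, Priority, Ground }

[Flags]
public enum ContentType : byte { Liquid = 1, Perishable = 2, Edible = 4 }

// Hypothetical helper class, for illustration only.
public static class EnumLibrarySketch
{
    // HasFlag: true when every bit of the argument is set in the instance.
    public static bool IsPerishable(ContentType content)
    {
        return content.HasFlag(ContentType.Perishable);
    }

    // Parse: throws ArgumentException when the text is not a valid representation.
    public static ShippingMethod ParseMethod(string text)
    {
        return (ShippingMethod)Enum.Parse(typeof(ShippingMethod), text);
    }

    // TryParse: reports failure through the return value instead of throwing.
    public static bool TryParseMethod(string text, out ShippingMethod method)
    {
        return Enum.TryParse(text, out method);
    }

    // ToString: returns the member name when the value matches a declared member.
    public static string Describe(ShippingMethod method)
    {
        return method.ToString();
    }
}
```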

How enums are used in practice

As simple as enums are, there is a lot of richness in how people think about them and how they are used in real world applications. At a high level, the common usages of enums are characterized by the following notions:

  • Enum types in .NET allow grouping a set of named numeric constants associated to the same concepts into a type:
    • Like regular constants, enums are a much more manageable alternative to sprinkling magic numbers across your code.
    • The ability to group sets of related constants and strong typing enable useful design-time services such as IntelliSense and provide compile-time checks that help keep your code organized.
  • Enum types in .NET always have an “underlying” integer type that determines the range of values that an instance can store:
    • Fundamental aspects of using enums, such as the behavior of comparison, default values, arithmetic and bitwise operators depend on this underlying representation.
    • Although this correlation to integers is fundamental to the design, it is often possible to use Enums without paying much attention to it.
    • This is all consistent with the design of Enumerations in the C language, but in .NET enums are more type safe (e.g. you cannot directly assign a value of the underlying type to an enum variable without casting it explicitly).
  • In some cases, a variable of an enum type is used to simultaneously store a combination of multiple values:
    • This is implemented by treating the underlying integer representation as a “bit field”.
    • In .NET, it is possible to express the intent of using an enum type in this way by adding the FlagsAttribute to the declaration of the type, and then by declaring each member to have an integer value that represents a different bit position:

 

    [Flags]

    public enum ContentType : byte

    {

        Liquid = 1,

        Perishable = 2,

        Edible = 4

    }

    • Enumerations that can represent multiple values simultaneously are a powerful concept, but the implementation in .NET is extremely simple: the programming languages handle all enum types in the same way, e.g. bitwise operators can be applied to enums regardless of the presence of FlagsAttribute (in fact, the attribute’s sole impact on the behavior of the Base Class Library is in the conversion of an enum value to a string). In other words, there is no strong contract in using FlagsAttribute, and it is up to the developer to apply multi-value usage only to enums that were meant for it.
  • Some tend to think of enum types as a way to model constrained categorizations, which assumes variables of an enum type can be guaranteed to only hold instances of its declared members:

    • Although enums are strongly typed in .NET, there is currently no way in .NET to express the intent to constrain the values to be only of the declared members. In fact, barring WCF serialization, nothing in .NET will constrain the value of an enum instance in this way.
    • That said, this is evidently a very useful thing to do, in particular in cases in which an enum is part of a public contract of a service, or in data centered applications, in which these constraints will often be backed by a database constraint.
    • Other languages outside of .NET, such as Java, deviate from the C design of enums based on integer underlying types to instead focus on constraining enum instances to only store declared values.
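The earlier point about FlagsAttribute in the list above can be seen in a small sketch. PlainContentType is a hypothetical copy of ContentType without the attribute, added here only for comparison: bitwise operators work on both types, and only the string conversion differs.

```csharp
using System;

[Flags]
public enum ContentType : byte { Liquid = 1, Perishable = 2, Edible = 4 }

// Hypothetical type: same members as ContentType, but without FlagsAttribute.
public enum PlainContentType : byte { Liquid = 1, Perishable = 2, Edible = 4 }

// Hypothetical helper class, for illustration only.
public static class FlagsAttributeSketch
{
    // With FlagsAttribute, ToString lists the names of the combined members.
    public static string WithFlags()
    {
        return (ContentType.Liquid | ContentType.Edible).ToString();
    }

    // Without it, the same bitwise combination still compiles and runs,
    // but ToString falls back to the numeric value.
    public static string WithoutFlags()
    {
        return (PlainContentType.Liquid | PlainContentType.Edible).ToString();
    }
}
```

Here WithFlags() returns "Liquid, Edible" while WithoutFlags() returns "5".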

Use of enums in data oriented applications

Without native support for enums in the current version of Entity Framework, customers have come up with different ways to achieve similar functionality. The workarounds often require writing lots of additional code, and in the end, the experience is not as seamless as we would like it to be.

 

Incidentally, a common workaround, which consists of creating additional entity types for each of the types you would like to define as enums, can help illustrate the actual role of enums in data oriented applications.

Starting with the previous example, we would create additional entity types for ShippingMethod and ContentType, associating them to Package and then turning the ShippingMethod and ContentTypes properties into navigation properties:

 

    public class ShippingMethod

    {

        public int Id { get; set; }

        public string Name { get; set; }

    }

    public class ContentType

    {

        public int Id { get; set; }

        public string Name { get; set; }

        public virtual ICollection<Package> Packages { get; set; }

    }

    public class Package

    {

        public int Id { get; set; }

        public virtual ShippingMethod ShippingMethod { get; set; }

        public virtual ICollection<ContentType> ContentTypes { get; set; }

    }

Given this definition for Package, ShippingMethod and ContentType, the code for changing the ShippingMethod and ContentType of a package would be similar to this:

    var ground = context.ShippingMethods.Single(c => c.Name == "Ground");

    var liquid = context.ContentTypes.Single(c => c.Name == "Liquid");

    var perishable = context.ContentTypes.Single(c => c.Name == "Perishable");

    var edible = context.ContentTypes.Single(c => c.Name == "Edible");

    juice.ShippingMethod = ground;

    juice.ContentTypes.Clear();

    juice.ContentTypes.Add(liquid);

    juice.ContentTypes.Add(perishable);

    juice.ContentTypes.Add(edible);

The solution based on entities is conceptually equivalent to one based on enums: in both cases, the lists of all possible ShippingMethods and ContentTypes are used as “reference data” for Package.

However, there are noticeable differences:

  • Additional tables and foreign key constraints need to exist in the database and be mapped to the entities.
  • In order to establish an association, you need to first retrieve an instance of the related entity, e.g. by executing a query (this part can be made easier by adding an extra foreign key property to the entity, so you can establish an association by just assigning the value of the corresponding key to it, but the value of the key would likely become a magic number in your code).
  • Many-to-many associations are necessary to achieve similar results to “flags” enums, such as Package.ContentTypes. Many-to-many associations require twice as many tables and foreign key constraints in the database as regular one-to-many associations.
  • While instances of an entity type can have multiple properties, members of an enum type are in general constrained to only have a name and a value.
  • In general, an application will only be able to reason about enum members that existed at compile time. On the other hand, a solution based on entities can more easily evolve after the application has been written and compiled, because all the data exist in the database and can be modified at any time.

 

In summary:

  • The main role of enumeration types in data oriented applications is to serve as simple lookup lists for reference data
  • Enums require significantly less ceremony than using Entities
  • Because they are hardcoded and are constrained to only have names and values, they are less broadly applicable

Enums support in Entity Framework and Entity Data Model

Taking into account the design of Enumerations in the .NET programming languages, the typical usage and the role of Enums in data oriented applications we explained above, we have decided to align the structural definition of Enums in the Entity Data Model with their definition in C#.

 

This means that the declaration of Enums in EDM will have the same elements that they have in C#, including the optionally specified underlying type, case sensitive member names, optional values, etc., and that the same rules are used to choose values not specified, e.g. the default underlying type is also Int32 and the implicit value of members is the previous member plus one, starting with zero for the first member.

 

This alignment in the design makes it much easier to satisfy the requirements of our customers' O/RM scenarios. It also enables all the richness of enums in the language, including such things as support for multi-valued (flags) enums in LINQ to Entities queries.

 

When it comes to other consumers and producers of the EDM, like OData, the fact that other languages might have significantly different definitions of enums becomes relevant. In particular, Java 1.5 introduced first-class language support for the “typesafe enum pattern” previously used in Java programs to make up for the missing support for enums. Java enums remove the idea of integral underlying types and instead implement enums that are completely constrained to the declared members.

 

We recognize that there is going to be a mismatch between the EDM semantics for enums and Java, but we believe the mismatch can be mitigated by the specific library implementations.

 

Requirements for Enums support in EF

The following is an exhaustive list of the things that we believe developers using EF should be able to do with enumerations:

Models, Metadata and Mapping

Users creating an Entity Data Model should be able to define enum types (optionally specifying their underlying integer type) and their members (optionally specifying member values explicitly).

The following XML fragment is an example of how the definition of a simple enum type looks in CSDL:

<EnumType Name="ShippingMethod">

  <Member Name="Express" />

  <Member Name="Priority" />

  <Member Name="Ground" />

</EnumType>

 

The corresponding enum type in C# looks like this:

    public enum ShippingMethod

    {

        Express,

        Priority,

        Ground

    }

 

Notice that the two type definitions above omit the underlying type and the member values. Since EDM will use the same rules for the implicit underlying type and the assignment of unspecified member values as the .NET programming languages, the two types above will unambiguously match, i.e. the underlying type for both is a 32-bit integer and the member values are 0, 1 and 2 for Express, Priority and Ground respectively.
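On the C# side, these implicit rules can be checked with a few lines of reflection. This is only a sketch; ImplicitValuesSketch and its methods are hypothetical names.

```csharp
using System;

public enum ShippingMethod { Express, Priority, Ground }

// Hypothetical helper class, for illustration only.
public static class ImplicitValuesSketch
{
    // Returns the underlying type chosen by the compiler when none is declared.
    public static Type UnderlyingType()
    {
        return Enum.GetUnderlyingType(typeof(ShippingMethod));
    }

    // Returns the numeric value assigned to each declared member.
    public static int[] MemberValues()
    {
        Array members = Enum.GetValues(typeof(ShippingMethod));
        var values = new int[members.Length];
        int index = 0;
        foreach (ShippingMethod member in members)
        {
            values[index++] = (int)member;
        }
        return values;
    }
}
```

UnderlyingType() returns typeof(int), and MemberValues() contains 0, 1 and 2.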

 

Flags enums

When declaring an enum type in EDM, users should also be able to specify the intent to use this type as a “flags” enum:

<EnumType Name="ContentType" UnderlyingType="Byte" IsFlags="true">

  <Member Name="Liquid" Value="1"/>

  <Member Name="Perishable" Value="2"/>

  <Member Name="Edible" Value="4"/>

</EnumType>

 

The only effect of the IsFlags attribute in the EnumType element is to drive code generation to include the FlagsAttribute in the enum type definition. For instance, this is the equivalent code in C#:

    [Flags]

    public enum ContentType : byte

    {

        Liquid = 1,

        Perishable = 2,

        Edible = 4

    }

 

Nullable enums

Once an enum type has been defined, it should be possible to declare nullable or non-nullable properties of this type in entities or complex types.

Example:

 

<EntityType Name="Package">

  <Key>

    <PropertyRef Name="Id" />

  </Key>

  <Property Name="Id" Nullable="false" Type="Int32" />

  <Property Name="ShippingMethod" Type="Model.ShippingMethod" Nullable="false"/>

  <Property Name="ContentType" Type="Model.ContentType" Nullable="true"/>

  …

</EntityType>

 

Enums as keys

In addition, properties of enum types can participate in the definition of primary keys, unique constraints and foreign keys, as well as take part in concurrency control checks, and have declared default values.

 

Mapping

Properties of enum types support all the same mapping scenarios as properties of the underlying type:

  • An enum property in EDM can be mapped to any of the store integral types supported by EF: Byte, Int16, Int32 and Int64
  • Entities with enum properties can be mapped using stored procedures and query views
  • For object to conceptual mapping, both attribute-based mapping (most commonly used in EntityObject code-generated entities) and convention-based mapping (most commonly used for POCO types) are available.

 

Note: We expect that customers will use enums in many cases in which previously they would have used entities with reference data. Unlike entities, enum types are not mapped to tables in the database. Users are free to have tables and foreign key constraints, or check constraints in the database that enforce enum values. However, Entity Framework will not reason about these constraints at runtime, so it is up to the user to keep rows in those tables and enum members in sync.
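One possible way to keep a lookup table and an enum in sync is to derive the expected rows from the enum and compare them with what the table contains. The sketch below is not an EF feature; ReferenceDataSync is a hypothetical helper, and it assumes the table rows have already been read into a dictionary of numeric value to name.

```csharp
using System;
using System.Collections.Generic;

public enum ShippingMethod { Express, Priority, Ground }

// Hypothetical helper class, for illustration only.
public static class ReferenceDataSync
{
    // Builds the (value, name) rows the lookup table is expected to contain.
    public static Dictionary<int, string> ExpectedRows()
    {
        var rows = new Dictionary<int, string>();
        foreach (ShippingMethod member in Enum.GetValues(typeof(ShippingMethod)))
        {
            rows[(int)member] = member.ToString();
        }
        return rows;
    }

    // Returns the enum values that are missing from the table or stored
    // under a different name. Extra rows in the table are not reported.
    public static List<int> FindMismatches(IDictionary<int, string> tableRows)
    {
        var mismatches = new List<int>();
        foreach (KeyValuePair<int, string> expected in ExpectedRows())
        {
            string name;
            if (!tableRows.TryGetValue(expected.Key, out name) || name != expected.Value)
            {
                mismatches.Add(expected.Key);
            }
        }
        return mismatches;
    }
}
```

A check like this could run at application startup or in a test, so that a new enum member added in code is noticed before it violates a foreign key or check constraint.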

 

New metadata APIs

The metadata API has been updated to be able to reason about enum types. Two new classes, EnumType and EnumMember, represent enums in the EDM layer, and new overloads allow navigating between the layers:

  • System.Data.Metadata.Edm.EnumType class represents an enum type
  • System.Data.Metadata.Edm.EnumMember class represents an enum member
  • New overloads of MetadataWorkspace.GetObjectSpaceType(), MetadataWorkspace.GetEdmSpaceType() and ObjectItemCollection.GetClrType() have been added that allow navigating the mapping between enums in the EDM layer and the object layer.

 

Code First APIs

Code First’s property mapping fluent APIs have been extended to support enum types.

Example:

    public class ShippingContext : DbContext

    {

        public DbSet<Package> Packages { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)

        {

            modelBuilder.Entity<Package>().Property(p => p.ShippingMethod).HasColumnName("sm");

        }

    }

 

Query

All types of Entity Framework queries (including EntityClient queries and object layer queries written in LINQ or Entity SQL) can refer to enums. For instance, the following operations are allowed:

  • Projection of enum properties, constants and parameters
  • Use of enum properties, constants and parameters in filtering predicates
  • Use of enums as parameters and in the return type of Functions
  • Use of enums as parameters and in the return type of pass-through SQL APIs such as ExecuteStoreQuery, ExecuteStoreCommand, SqlQuery, etc.
  • Cast explicitly between enum instances and the underlying type of the enum (or types that support explicit cast from underlying type of the enum) in queries
  • Use of equality and order comparisons: ==, !=, <, >, <=, >=
  • Use of arithmetic and bitwise operators in LINQ queries: +, -, ^, &, |, ~ (the exact behavior and result type depends on the .NET language being used)

 

Note: Operators that modify their operand, such as ++, --, += and -=, have side effects and therefore cannot be supported in queries.

Example:

 

        var groundPackages =

            from p in context.Packages

            where p.ShippingMethod == ShippingMethod.Ground

            select p;
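To give a feel for the kinds of predicates listed above, here is a sketch that runs against an in-memory list with LINQ to Objects rather than against an EF context (so it says nothing about how the queries would be translated to the store). EnumQuerySketch, its methods and the sample data are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ShippingMethod { Express, Priority, Ground }

[Flags]
public enum ContentType : byte { Liquid = 1, Perishable = 2, Edible = 4 }

public class Package
{
    public int Id { get; set; }
    public ShippingMethod ShippingMethod { get; set; }
    public ContentType? ContentType { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class EnumQuerySketch
{
    public static List<Package> SamplePackages()
    {
        return new List<Package>
        {
            new Package { Id = 1, ShippingMethod = ShippingMethod.Express,
                          ContentType = ContentType.Liquid | ContentType.Perishable },
            new Package { Id = 2, ShippingMethod = ShippingMethod.Ground, ContentType = null },
            new Package { Id = 3, ShippingMethod = ShippingMethod.Priority,
                          ContentType = ContentType.Edible },
            new Package { Id = 4, ShippingMethod = ShippingMethod.Ground,
                          ContentType = ContentType.Liquid | ContentType.Perishable | ContentType.Edible }
        };
    }

    // Order comparison on an enum property.
    public static List<int> AfterExpress(IEnumerable<Package> packages)
    {
        return packages.Where(p => p.ShippingMethod > ShippingMethod.Express)
                       .Select(p => p.Id).ToList();
    }

    // Bitwise test on a nullable flags enum property; null never matches.
    public static List<int> ContainingLiquid(IEnumerable<Package> packages)
    {
        return packages.Where(p => (p.ContentType & ContentType.Liquid) == ContentType.Liquid)
                       .Select(p => p.Id).ToList();
    }

    // Explicit cast from the enum to its underlying type inside the predicate.
    public static List<int> WithMethodCode(IEnumerable<Package> packages, int code)
    {
        return packages.Where(p => (int)p.ShippingMethod == code)
                       .Select(p => p.Id).ToList();
    }
}
```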

 

Update

Entity Framework entities and complex types that contain enum properties can be inserted, updated and deleted. In addition, enum properties can be store generated and used for concurrency control.

 

Example:

        var juiceBox = context.Packages.Single(p => p.Description == "Juice box");

        juiceBox.ShippingMethod = ShippingMethod.Ground;

        juiceBox.ContentType = ContentType.Liquid | ContentType.Perishable | ContentType.Edible;

        context.SaveChanges();

Designer support

The Entity Designer has been extended to support the definition of enum types. An upcoming blog post will contain more details about designer support.

 

Note: Reverse engineering a database into a model (usually called “database first”) never introduces enums into the model automatically. Users need to manually specify which properties represent enum types.

 

Code generation

Code generation, including EdmGen.exe, the EntityObject template, the Self-Tracking Entities template and the DbContext template, has been updated to generate enum type definitions and properties of enum types.

 

Limitations

There are several features we considered that will not be included in the upcoming release of EF:

ToString() and Parse()/TryParse() won’t be supported in queries

Translation into store queries would require very complex expressions to achieve fidelity with the behavior of these methods in the CLR.

 

Mapping Enums to strings and arbitrary types

We consider the general ability to perform property type transformations in the mapping to be an important feature, but it wasn’t a priority this time around. Barring type transformations, the most natural thing to do for enums is to support mapping to integer numeric types.

 

Using enum properties as mapping discriminators is not supported

The Entity Framework runtime supports the use of Boolean properties as mapping discriminators, i.e. the value of the property in an entity is used to decide in what table in the store that entity instance should be stored. We could consider adding the same capability for enum properties, but this wasn’t a priority in this release.

 

Feedback requested

As always, your opinions on all aspects of this design are welcome. However, there are specific areas in which we are seeking your help in deciding where to take the design. Please take some time to answer the questions below in the comments.

Checked or Strict enums

As we explained before, constraining the value of an enum to the declared members is one of the common notions of enums. However, .NET does not provide a way to express this intent. One thing we could consider doing is to enable this in EF:

 

When declaring an enum property, users should also be able to declare the intent to constrain the values of the property. They could do this by adding a new “Checked” facet to properties.

Example:

<EntityType Name="Package">

  <Key>

    <PropertyRef Name="Id" />

  </Key>

  <Property Name="Id" Nullable="false" Type="Int32" />

  <Property Name="ShippingMethod" Type="Model.ShippingMethod" Nullable="false" Checked="true" />

  <Property Name="ContentType" Type="Model.ContentType" Nullable="true" Checked="true" />

  …

</EntityType>

 

The only practical effect of the Checked attribute on a Property element of an enum type would be to enable the DbContext API to perform data validation on the property. DbContext performs data validation automatically by default on SaveChanges and can also perform it on demand. For an enum property that has this attribute, if the property value does not correspond to a declared member (or to a valid combination if the enum type is “flags”) validation will fail and SaveChanges will abort.
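To make the proposed rule concrete, here is one way such a check could be written by hand today. This is only a sketch of the idea, not the validation EF would perform; CheckedEnumSketch is a hypothetical name, and the code assumes non-negative member values and treats zero (no flags set) as a valid combination for flags enums.

```csharp
using System;

public enum ShippingMethod { Express, Priority, Ground }

[Flags]
public enum ContentType : byte { Liquid = 1, Perishable = 2, Edible = 4 }

// Hypothetical helper class, for illustration only.
public static class CheckedEnumSketch
{
    public static bool IsValid(Enum value)
    {
        Type enumType = value.GetType();

        // A value that matches a declared member is always valid.
        if (Enum.IsDefined(enumType, value))
        {
            return true;
        }

        // For non-flags enums, anything else is invalid.
        if (!enumType.IsDefined(typeof(FlagsAttribute), false))
        {
            return false;
        }

        // For flags enums, accept any combination that only uses bits
        // of declared members (assumes non-negative member values).
        ulong declaredBits = 0;
        foreach (Enum member in Enum.GetValues(enumType))
        {
            declaredBits |= Convert.ToUInt64(member);
        }
        return (Convert.ToUInt64(value) & ~declaredBits) == 0;
    }
}
```

With these types, IsValid((ShippingMethod)42) and IsValid((ContentType)8) are false, while IsValid(ContentType.Liquid | ContentType.Edible) is true.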

 

The equivalent behavior could be obtained when using Code First by adding a new validation data annotation or using a fluent API.

Example:

    public class Package

    {

        public int Id { get; set; }

        public string Description { get; set; }

        [Checked]

        public ShippingMethod ShippingMethod { get; set; }

        [Checked]

        public ContentType? ContentType { get; set; }

        public string Reference { get; set; }

    }

 

Note: The data validation feature in DbContext is explained in more detail here and here.

 

Question: How important is it to validate enum values in this way for you? Would you give up any of the other features explained in this post for it?

HasFlag support

.NET 4.0 introduced this new method in System.Enum that makes it possible to detect whether an instance of a “flags” enum contains a value without having to resort to or think about bitwise operators. We could consider adding support for this method in LINQ to Entities.

 

Example:

        var perishableLiquids =

            from p in context.Packages

            where p.ContentType.Value.HasFlag(ContentType.Perishable | ContentType.Liquid)

            select p;

 

This is equivalent to:

        var perishableLiquids =

            from p in context.Packages

            // Parentheses are required: in C#, == binds more tightly than &
            where (p.ContentType & (ContentType.Perishable | ContentType.Liquid)) ==

                (ContentType.Perishable | ContentType.Liquid)

            select p;
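The equivalence between HasFlag and the bitwise form can be checked in memory with a small sketch. HasFlagEquivalence and its methods are hypothetical names; the check simply tries every value the byte underlying type can hold.

```csharp
using System;

[Flags]
public enum ContentType : byte { Liquid = 1, Perishable = 2, Edible = 4 }

// Hypothetical helper class, for illustration only.
public static class HasFlagEquivalence
{
    public static bool ViaHasFlag(ContentType value, ContentType flags)
    {
        return value.HasFlag(flags);
    }

    // The parentheses around the & expression are needed because
    // == has higher precedence than & in C#.
    public static bool ViaBitwise(ContentType value, ContentType flags)
    {
        return (value & flags) == flags;
    }

    // Compares both forms for every possible value of the underlying byte.
    public static bool AgreeForAllValues(ContentType flags)
    {
        for (int raw = 0; raw <= byte.MaxValue; raw++)
        {
            var value = (ContentType)raw;
            if (ViaHasFlag(value, flags) != ViaBitwise(value, flags))
            {
                return false;
            }
        }
        return true;
    }
}
```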

 

Question: How important is it to be able to use HasFlag in LINQ queries for you? Would you give up any of the other features explained in this post for it?

 

Options to disable code generation for specific types

Enum types are primarily a code construct and therefore their existence is commonly independent of an EF model. Using an enum type that is already declared in your code together with code-generated entities can become very difficult, because EF code generation will always generate a new enum type for each enum type defined in the EDM model.

 

We could consider improving this by defining an annotation that you could easily put on specific types (enum, entity or complex type) in your model that would signal EF code generation templates to ignore that type.

 

Question: How important is it to you to be able to skip generating specific types from your model? Would you give up any of the other features explained in this post for it?

 

Thanks,

 

Diego Vega
Program Manager
Entity Framework Team