Self-Tracking Entities in the Entity Framework

Background

One of the biggest pieces of feedback we received from the N-Tier Improvements for Entity Framework post, as well as from other sources, was: “low level APIs are great, but where is the end-to-end architecture for N-tier and the Entity Framework?” This post outlines some of the additional feedback we’ve received and describes the self-tracking entities architecture that will ship alongside Visual Studio 2010 and .NET Framework 4.0.

When an application goes multi-tier, the application developer makes important architectural decisions about how to communicate between tiers. There are a lot of choices, and the right one depends on a variety of conditions. How rapidly is each tier expected to change? How much control is there over each tier? Are there specific protocol, security, or business policy concerns? The answers to these questions often drive the selection of whether to use data transfer objects (DTOs) or DataSets or something different altogether.

Self-Tracking Entities

Self-tracking entities know how to do their own change tracking regardless of which tier those changes are made on. As an architecture, self-tracking entities fall between DTOs and DataSets and include some of the benefits of each.

Drawing from DataSet

DataSet on the client tier is very easy to use because there is no need to track changes separately or maintain any extra data structures that include change tracking information. DataSet takes care of serializing state information for each row of data. On the mid tier, applying the changes stored within a DataSet is straightforward. DataSets have also gained popularity because of the number of tools that work with DataSets, and because they are easily bound to many UI/presentation controls. Since it works in so many scenarios, for many applications there is never a need to transform data outside of a DataSet allowing a single paradigm to be used up and down the stack.

However, there are disadvantages to DataSets when used as the communication payload between tiers. The first is the cost of getting the data into a serializable format, and the second is that the DataSet serialization format is generally not very interoperable with other languages that are used to expose services. Another disadvantage of using a DataSet is that you can quickly lose the intent of your service call because so many kinds of things can be included in a DataSet. For example, if the service method is declared as:

DataSet GetCustomer(string id);

It is extra work to ensure that the DataSet that is returned contains only rows of data that pertain to “customers”. It is also more difficult to specify a service contract that says the DataSet is also supposed to return data for each customer’s orders and their order details.

Self-tracking entities share many advantages with DataSet. They also encapsulate change tracking information, which is serialized along with the data contained in the entity. On the mid-tier, applying changes from a graph of self-tracking entities to a persistent context is equally straightforward. The Entity Framework will also provide tools for generating self-tracking entities, and because these entities are just objects, they can easily be made to work with UI/presentation controls.

Drawing from DTOs

DTOs and SOA are used to give the developer more control over the service contract and payload for tier to tier communication. DTOs themselves do not have behavior, so they are typically very simple classes designed just to provide the needed information to perform a specific service operation. Not only does this provide the opportunity to optimize the wire format (some believe a DTO is all about the wire format), but it makes it possible and easy to capture the intent of each service method. The data contract used with DTOs is typically interoperable which makes it easy to use services that run on different platforms. DTOs also provide a way to separate messaging contracts from the presentation layer, the business logic, and the persistence layer which in many cases creates a maintainable architecture.

There are disadvantages to using DTOs and the primary one is complexity. DTOs are often hand-crafted to include only the specific information that is needed for an operation. When there is a common pattern for mapping DTOs to entity classes, there are some tools available that will do DTO generation and mapping but it is not always possible. With DTOs, it is up to the developer to decide how to do change tracking on each tier (especially the client) which increases the complexity of the presentation layer. The complexity of the service implementation also increases because the developer is responsible for translating the DTO into entities for doing business logic validation, as well as being able to report changes stored in the DTO to the persistence framework.

Not only do self-tracking entities share advantages with DataSets as mentioned above, they also share many advantages with DTOs. Self-tracking entities expose a simple and interoperable wire format and it is clear what kinds of data a service method requires or returns. You can also use a self-tracking entity as part of a message. However, self-tracking entities don’t give quite as much architectural separation as using pure DTOs, but you do gain a less complex solution that requires fewer data transformations. It is important to note that self-tracking entities can be made to be ignorant of any particular persistence framework making them essentially POCO objects. Particular persistence frameworks such as the Entity Framework will have the capabilities to create these entities and interpret the change tracking information when saving changes.

.NET Framework 3.5 SP1 Challenge

With the .NET Framework 3.5 SP1, this sort of solution was very hard to implement using the Entity Framework because change tracking was always done by a centralized ObjectContext which contains an ObjectStateManager. In particular, “reattaching” to report changes back to the ObjectStateManager was all but impossible without completely shredding your entity graph and applying changes one entity and one relationship at a time. The new API changes that are being added to the Entity Framework in .NET Framework 4.0 make it easier to report changes back to the ObjectStateManager, which in turn makes self-tracking entities (as well as other architectures) easier to build.

Mid-tier experience

Using self-tracking entities on the mid-tier is about working with entity graphs and the Entity Framework. The service contract that is used in the following examples contains two simple methods for retrieving a Customer entity graph and applying updates to that entity graph:

interface ICustomerService
{
    Customer GetCustomer(string customerID);
    void UpdateCustomer(Customer customer);
}

The implementation of the GetCustomer service method can be done using an Entity Framework ObjectContext and using LINQ or query builder methods to retrieve the entity you want. In the example below, a query is issued for a particular customer entity and all of the Orders and OrderDetails are included.

public Customer GetCustomer(string customerID)
{
    using (NorthwindEFContext context = new NorthwindEFContext())
    {
        var result = context.Customers.
              Include("Orders.OrderDetails").
              Single(c => c.CustomerID == customerID);
        return result;
    }
}

UpdateCustomer is an example of how to save the changes that are made to a graph of self-tracking entities. As with DataSet, applying these changes to the persistence layer and saving them should be simple. The example below uses a new Entity Framework API, “ApplyChanges”, which understands how to interpret the change tracking information that is stored by each entity and how to tell the ObjectContext’s ObjectStateManager about those changes.

public void UpdateCustomer(Customer customer)
{
    using (NorthwindEFContext context = new NorthwindEFContext())
    {
        context.Customers.ApplyChanges(customer);
        context.SaveChanges();
    }
}

Client Experience

The client experience when working with self-tracking entities is similar to how you would manipulate any object graph. You can make changes to scalar or complex properties, add or remove references, and add or remove from collections of related entities. The key part of the experience is that tracking changes is hidden from the client because it is done internally on each entity. There is no ObjectContext and there is no extra state that has to be maintained or passed from client to the tier that does persistence.

In this example, the test first queries for a customer, then deletes one of the orders. The resulting entity graph is sent back to the service tier using the UpdateCustomer method.

public void DeleteObjectsTest()
{
    using (CustomerServiceClient client = new CustomerServiceClient())
    {
        var customer = client.GetCustomer("ALFKI");
        customer.Orders.First().Delete();
        client.UpdateCustomer(customer);
    }
}

The next test shows how to add a new Order with two OrderDetails to an existing customer. By default, the constructor of a self-tracking entity puts the entity in the “Added” state. There are convenience methods on each self-tracking entity to change this state if needed (Delete() and SetUnchanged()).

public void AddObjectsTest()
{
    using (CustomerServiceClient client = new CustomerServiceClient())
    {
        var customer = client.GetCustomer("ALFKI");
        customer.Orders.Add(new Order() {
            OrderID = 100,
            OrderDetails = {
                new OrderDetail {
                    ProductID = 3,
                    Quantity = 7
                },
                new OrderDetail {
                    ProductID = 4,
                    Quantity = 8
                },
            }
        });
        client.UpdateCustomer(customer);
    }
}

The final example shows that a self-tracking entity graph can contain any number of changes and those changes can be any combination of adds, modifications, and deletes.

public void ComplexModificationsTest()
{
    using (CustomerServiceClient client = new CustomerServiceClient())
    {
        var cust = client.GetCustomer("ALFKI");
        // remove first order - will do cascade in the database
        cust.Orders.Remove(cust.Orders.First());
        // modify one of the orders
        var order = cust.Orders.Last();
        order.RequiredDate = DateTime.Now;
        order.OrderDetails.Add(
            new OrderDetail {
                ProductID = 7,
                Quantity = 3
            });
        // add new order
        cust.Orders.Add(new Order()
        {
            OrderID = 100,
            OrderDate = DateTime.Now,
            ShipCity = "Redmond",
            ShipAddress = "One Microsoft Way",
            ShipRegion = "WA",
            ShipPostalCode = "98052",
            OrderDetails = {
                new OrderDetail {
                    ProductID = 3,
                    Quantity = 7
                },
                new OrderDetail {
                    ProductID = 4,
                    Quantity = 8
                },
            }
        });
        client.UpdateCustomer(cust);
    }
}

Generating Self-Tracking Entities

Building self-tracking entities should be as easy as using them. We plan to ship a T4 template that will do the code generation for creating self-tracking entities from your EDM. This template will show up in the list of available templates when you right-click on your model and choose Add New Artifact Generation Item.

Design Notes

Inside a Self-Tracking Entity

There have not been final decisions about what exactly will be included inside of a self-tracking entity, but it is important to track:

  • The state of the entity, Added, Deleted, Modified, or Unchanged
  • The original value from reference relationship properties
  • Adds and removes from collection relationship properties

This information will be included with the data contract of the entity. As part of the code generation of each self-tracking entity, scalar and complex property changes will mark the entity as “dirty” meaning that its state will change from Unchanged to Modified.
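As a rough illustration of the three kinds of tracking information listed above, here is a minimal sketch of what such an entity could look like. Every name in it (TrackedState, OriginalReferences, AddedDetails, RemovedDetails, AddDetail, RemoveDetail) is an assumption made for this example and not the output of the planned T4 template, whose shape has not been decided. The sketch has no dependency on any Entity Framework assembly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout: an illustration, not the generated code.
public enum TrackedState { Unchanged, Added, Modified, Deleted }

public class Customer { public string CustomerID { get; set; } }
public class OrderDetail { public int ProductID { get; set; } public int Quantity { get; set; } }

public class Order
{
    private DateTime? requiredDate;
    private Customer customer;

    public Order()
    {
        // A newly constructed entity starts in the Added state.
        State = TrackedState.Added;
        OriginalReferences = new Dictionary<string, object>();
        Details = new List<OrderDetail>();
        AddedDetails = new List<OrderDetail>();
        RemovedDetails = new List<OrderDetail>();
    }

    public TrackedState State { get; private set; }
    // Original value of each reference relationship, keyed by property name.
    public Dictionary<string, object> OriginalReferences { get; private set; }
    public List<OrderDetail> Details { get; private set; }
    // Adds and removes made to the collection relationship.
    public List<OrderDetail> AddedDetails { get; private set; }
    public List<OrderDetail> RemovedDetails { get; private set; }

    public DateTime? RequiredDate
    {
        get { return requiredDate; }
        set { requiredDate = value; MarkDirty(); } // scalar change marks the entity dirty
    }

    public Customer Customer
    {
        get { return customer; }
        set
        {
            // Remember only the first original value, and only for an existing entity.
            if (State != TrackedState.Added && customer != null
                && !OriginalReferences.ContainsKey("Customer"))
            {
                OriginalReferences["Customer"] = customer;
            }
            customer = value;
        }
    }

    public void AddDetail(OrderDetail detail)
    {
        Details.Add(detail);
        // Re-adding a previously removed item cancels the remove.
        if (!RemovedDetails.Remove(detail)) AddedDetails.Add(detail);
    }

    public void RemoveDetail(OrderDetail detail)
    {
        if (!Details.Remove(detail)) return;
        // Removing an item added in this session cancels the add.
        if (!AddedDetails.Remove(detail)) RemovedDetails.Add(detail);
    }

    public void Delete() { State = TrackedState.Deleted; }

    public void SetUnchanged()
    {
        State = TrackedState.Unchanged;
        OriginalReferences.Clear();
        AddedDetails.Clear();
        RemovedDetails.Clear();
    }

    private void MarkDirty()
    {
        if (State == TrackedState.Unchanged) State = TrackedState.Modified;
    }
}
```

In this sketch, an Order returned from the service would arrive in the Unchanged state; setting RequiredDate then moves it to Modified, while an Order that is still Added or Deleted keeps its state.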

Inside ApplyChanges

ApplyChanges is a new API on the ObjectContext and ObjectSet classes that attaches an entity graph and interprets the change tracking information stored in each entity. The design for discovering the change tracking information has not yet been finalized, but there are a couple of options available. One would be to code generate an interface as part of the T4 template for self-tracking entities, as well as a version of ApplyChanges that knows about the interface. Another option would be to fall back to runtime discovery of the particular properties using a convention and some of the new dynamic capabilities of the framework. The thing we want to avoid is causing the self-tracking entity implementation to have a dependency on any of the Entity Framework assemblies.

The algorithm that ApplyChanges uses is:

1. Attach the entity graph to the ObjectContext.

2. Change the state of each entity using the ChangeObjectState API.

3. For any reference relationship property that has an original value, change the state of the original reference relationship to Deleted and the state of the current reference relationship to Added. This is done using the ChangeRelationshipState API.

4. For any collection relationship property, use the ChangeRelationshipState API to mark any removed relationships as Deleted and any added relationships as Added.
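The four steps can be sketched in code as follows. This is an illustration only: TrackedEntity and RecordingStateManager are hypothetical stand-ins invented for this example, not Entity Framework types, and the stand-in simply records each call so that the order of operations is visible. The real ApplyChanges would report these changes to the ObjectStateManager, and its final shape is still open, as noted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum TrackedState { Unchanged, Added, Modified, Deleted }

// Hypothetical entity shape carrying the tracking data; keys are navigation property names.
public class TrackedEntity
{
    public string Name;
    public TrackedState State = TrackedState.Unchanged;
    public Dictionary<string, TrackedEntity> References = new Dictionary<string, TrackedEntity>();
    public Dictionary<string, TrackedEntity> OriginalReferences = new Dictionary<string, TrackedEntity>();
    public Dictionary<string, List<TrackedEntity>> Collections = new Dictionary<string, List<TrackedEntity>>();
    public Dictionary<string, List<TrackedEntity>> AddedItems = new Dictionary<string, List<TrackedEntity>>();
    public Dictionary<string, List<TrackedEntity>> RemovedItems = new Dictionary<string, List<TrackedEntity>>();
}

// Stand-in for the state manager (NOT the Entity Framework type): it only logs calls.
public class RecordingStateManager
{
    public readonly List<string> Log = new List<string>();
    public void Attach(TrackedEntity root) { Log.Add("Attach " + root.Name); }
    public void ChangeObjectState(TrackedEntity entity, TrackedState state)
    { Log.Add("State " + entity.Name + " " + state); }
    public void ChangeRelationshipState(TrackedEntity source, TrackedEntity target, string property, TrackedState state)
    { Log.Add("Rel " + source.Name + "." + property + " -> " + target.Name + " " + state); }
}

public static class ApplyChangesSketch
{
    public static void ApplyChanges(RecordingStateManager manager, TrackedEntity root)
    {
        manager.Attach(root);                                   // step 1
        var entities = Walk(root).ToList();
        foreach (var entity in entities)                        // step 2
            manager.ChangeObjectState(entity, entity.State);
        foreach (var entity in entities)
        {
            foreach (var pair in entity.OriginalReferences)     // step 3
            {
                if (pair.Value == null) continue;               // no original value recorded
                manager.ChangeRelationshipState(entity, pair.Value, pair.Key, TrackedState.Deleted);
                TrackedEntity current;
                if (entity.References.TryGetValue(pair.Key, out current) && current != null)
                    manager.ChangeRelationshipState(entity, current, pair.Key, TrackedState.Added);
            }
            foreach (var pair in entity.RemovedItems)           // step 4: removes
                foreach (var removed in pair.Value)
                    manager.ChangeRelationshipState(entity, removed, pair.Key, TrackedState.Deleted);
            foreach (var pair in entity.AddedItems)             // step 4: adds
                foreach (var added in pair.Value)
                    manager.ChangeRelationshipState(entity, added, pair.Key, TrackedState.Added);
        }
    }

    // Visits each reachable entity once, including removed and original ones.
    private static IEnumerable<TrackedEntity> Walk(TrackedEntity root)
    {
        var seen = new HashSet<TrackedEntity>();
        var pending = new Stack<TrackedEntity>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            var entity = pending.Pop();
            if (entity == null || !seen.Add(entity)) continue;
            yield return entity;
            foreach (var r in entity.References.Values) pending.Push(r);
            foreach (var r in entity.OriginalReferences.Values) pending.Push(r);
            foreach (var list in entity.Collections.Values.Concat(entity.RemovedItems.Values))
                foreach (var r in list) pending.Push(r);
        }
    }
}
```

The sketch runs the steps as separate passes so that every entity has its final state before any relationship state is changed. Whether the actual implementation orders its work this way is an assumption.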

Summary

We’ve received a lot of feedback and suggestions on how to make developing multi-tier applications easier using the Entity Framework. One of the components of improving this experience is the introduction of an end-to-end architecture around self-tracking entities. We’d like to hear your feedback on this addition to the Entity Framework, and any other comments you have on the matter of multi-tier development using domain models and entities.

Jeff Derstadt,
Dev Lead, Entity Framework Team, Microsoft

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used, please look at this post.