Self-Tracking Entities: ApplyChanges and duplicate entities

Some customers using the Self-Tracking Entities template we included in Visual Studio 2010 have run into scenarios in which they call ApplyChanges, passing a graph of entities that they put together in the client tier of their app, and then get an exception with the following message:

AcceptChanges cannot continue because the object’s key values conflict with another object in the ObjectStateManager.

This seems to be the most common unexpected issue our customers run into when using Self-Tracking Entities. I have responded to this multiple times by email and I have meant to blog about it for some time. Somehow I was able to do it today :)

We believe that most people hitting this exception are either calling ApplyChanges on the same context with multiple unrelated graphs, or merging graphs obtained in multiple requests so that they now have duplicate entities in the graph they pass to ApplyChanges.

By duplicate entities, what I mean is that they have more than one instance of the same entity, in other words, two or more objects with the same entity key values.

The current version of Self-Tracking Entities was specifically designed to not handle duplications. In fact, when we were designing this, our understanding was that a situation like this was most likely the result of a programming error and therefore it was more helpful to our customers to throw an exception.

The problem with the exception is that avoiding duplicate entities can be hard. As an example, let’s say that you have a service that exposes three service operations for a car catalog:

  • GetModelsWithMakes: returns a list of car Models with their respective associated Makes
  • GetMakes: returns the full list of car Makes
  • UpdateModels: takes a list of car Models and uses ApplyChanges and SaveChanges to save changes in the database

And the typical operation of the application goes like this:

  1. Your client application invokes GetModelsWithMakes and uses it to populate a grid in the UI.
  2. Then, the app invokes GetMakes and uses the results to populate items in a drop down field in the grid.
  3. When a Make “A” is selected for a car Model, there is some piece of code that assigns the instance of Make “A” to the Model.Make navigation property.
  4. When changes are saved, the UpdateModels operation is called on the server with the graph resulting from the steps above.

This is going to be a problem if there was another Model in the list that was already associated with the same Make “A”: since you brought some Makes with the graph of Models and some Makes from a separate call, you now have two completely different instances of “A” in the graph. The call to ApplyChanges will fail on the server with the exception describing a key conflict.
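To make the scenario concrete, here is a minimal sketch of how you could check a list of Models for this condition on the client before calling UpdateModels. It is not the generated Self-Tracking Entities code: Make and Model are hypothetical hand-written stand-ins (I am assuming integer keys named Id), and DuplicateCheck.FindDuplicateMakeIds is a helper name made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated entities (assumption: integer keys named Id).
public class Make
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Model
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Make Make { get; set; }
}

public static class DuplicateCheck
{
    // Returns the Make key values that are represented by more than one
    // distinct object instance in the given list of Models.
    public static List<int> FindDuplicateMakeIds(IEnumerable<Model> models)
    {
        return models
            .Select(m => m.Make)
            .Where(make => make != null)
            .GroupBy(make => make.Id)
            // Same key, different object reference: this is a duplicate.
            .Where(g => g.Any(make => !ReferenceEquals(make, g.First())))
            .Select(g => g.Key)
            .ToList();
    }
}
```

If two Models point to separate Make objects that both have Id 1, the helper reports key 1. If both Models point to the very same Make object, it reports nothing, and that is the state you want the graph to be in before it goes to the server.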

There are changes we have considered making to the code in ApplyChanges in the future in order to avoid the exception, but in the general case there might be inconsistencies between the states of the two Make “A” instances, and they can be associated with different Models, making it very difficult for ApplyChanges to decide how to proceed.

In general, the best way to handle duplicates in the object graph seems to be to avoid introducing them in the first place!

Here are a few patterns that you can use to avoid them:

1. Only use Foreign Key values to manipulate associations:

You can use foreign key properties to set associations between objects without really connecting the two graphs. Every time you would do something like this:

model.Make = make;

… replace it with this:

model.MakeId = make.Id;

This is the simplest solution I can think of and should work well unless you have many-to-many associations or other “independent associations” in your graph, which don’t expose foreign key properties in the entities.

2. Use a “graph container” object and have a single “Get” service operation for each “Update” operation:

If we combine the operations used to obtain car Models and Makes into a single service operation, we can use Entity Framework to perform “identity resolution” on the entities obtained, so that we get a single instance for each make and model from the beginning.

This is a simplified version of “GetCarsCatalog” that brings together the data of both Models and Makes.

// type shared between client and server

public class CarsCatalog
{
    public Model[] Models { get; set; }
    public Make[] Makes { get; set; }
}

// server side code

public CarsCatalog GetCarsCatalog()
{
    // both queries run on the same context, so identity resolution applies
    using (var context = new AutoEntities())
    {
        return new CarsCatalog
        {
            Models = context.Models.ToArray(),
            Makes = context.Makes.ToArray()
        };
    }
}

// client side code

var catalog = service.GetCarsCatalog();
var model = catalog.Models.First();
var make = catalog.Makes.First();
model.Make = make;

This approach should work well even if you have associations without FKs. If you have many-to-many associations, it will be necessary to use the Include method in some queries, so that the data about the association itself is loaded from the database.

3. Perform identity resolution on the client:

If the simple solutions above don’t work for you, you can still make sure you don’t add duplicate objects to your graph while on the client. The basic idea for this approach is that every time you are going to assign a Make to a Model, you pass the Make through a process that will help you find whether there is already another instance that represents the same Make in the graph of the Model, so that you can avoid the duplication.

This is a really complicated way of doing it compared with the two solutions above, but using Jeff’s graph iterator template, it doesn’t really take a lot of extra code to do it:

// returns an instance from the graph with the same key or the original entity
public static class Extensions
{
    public static TEntity MergeWith<TEntity, TGraph>(this TEntity entity, TGraph graph,
        Func<TEntity, TEntity, bool> keyComparer)
        where TEntity : class, IObjectWithChangeTracker
        where TGraph : class, IObjectWithChangeTracker
    {
        return AutoEntitiesIterator.Create(graph).OfType<TEntity>()
            .SingleOrDefault(e => keyComparer(entity, e)) ?? entity;
    }
}

// usage
model.Make = make.MergeWith(model, (j1, j2) => j1.Id == j2.Id);

Notice that the last argument of the MergeWith method is a delegate that is used to compare key values on instances of the TEntity type. When using EF, you can normally take for granted that EF will know what properties are the keys and that identity resolution will just happen automatically, but since on the client-side you only have a graph of Self-Tracking Entities, you need to provide this additional information.
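If you would rather not depend on the graph iterator template, the same idea can be sketched with a plain dictionary. This is a different technique from MergeWith, not part of the Self-Tracking Entities template: IdentityMap and its Resolve method are hypothetical names I am using for illustration, and the key selector delegate plays the role that keyComparer plays above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers the first instance seen for each key value.
public class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> _byKey = new Dictionary<TKey, TEntity>();
    private readonly Func<TEntity, TKey> _keySelector;

    public IdentityMap(Func<TEntity, TKey> keySelector)
    {
        if (keySelector == null) throw new ArgumentNullException("keySelector");
        _keySelector = keySelector;
    }

    // Returns the instance already registered under the same key,
    // or registers and returns the given entity if the key is new.
    public TEntity Resolve(TEntity entity)
    {
        if (entity == null) return null;

        TKey key = _keySelector(entity);
        TEntity existing;
        if (_byKey.TryGetValue(key, out existing)) return existing;

        _byKey.Add(key, entity);
        return entity;
    }
}
```

The intended usage would be to first call Resolve on every Make that arrived attached to a Model, and then pass each Make coming from GetMakes through Resolve before assigning it to Model.Make. Note that the map only avoids adding a second instance. It makes no attempt to reconcile the state of the two instances.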

Summary

Some customers using Self-Tracking Entities are running into exceptions in cases in which duplicate entities are introduced, typically when they have entities retrieved in multiple service operations merged into a single graph. ApplyChanges wasn’t designed to handle duplicates and the best practice is to avoid introducing the duplicates in the first place. I have shown a few patterns that can help with that.

Personally, I believe the best compromise between simplicity and flexibility is provided by a combination of the first and second patterns. For instance, you can use only foreign key properties to associate entities with reference/read-only data (e.g., associate an OrderLine with a Product in an application used to process Orders), and use graph containers to transfer data that can be modified in the same transaction, i.e., entities that belong in the same aggregate (e.g., an Order and its associated OrderLines).
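As a rough sketch of that combination, assuming hand-written stand-ins rather than the generated entities: Product, OrderLine, Order and the AddLine method below are all hypothetical names. The Order carries its OrderLines as one graph, while each line refers to its Product only through a foreign key value.

```csharp
using System.Collections.Generic;

// Hypothetical entities for illustration only; not generated by the template.
public class Product            // reference data, read-only in this application
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class OrderLine
{
    public int Id { get; set; }
    public int ProductId { get; set; }   // foreign key only, no Product instance attached
    public int Quantity { get; set; }
}

public class Order              // root of the aggregate, travels with its lines
{
    public int Id { get; set; }
    public List<OrderLine> Lines { get; private set; }

    public Order()
    {
        Lines = new List<OrderLine>();
    }

    // Associates the new line with the product by key value, so the Product
    // object never becomes part of the graph sent to the server.
    public OrderLine AddLine(Product product, int quantity)
    {
        var line = new OrderLine { ProductId = product.Id, Quantity = quantity };
        Lines.Add(line);
        return line;
    }
}
```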

Hope this helps,
Diego