AttachAsModified – a small step toward simplifying EF n-tier patterns

During this week at PDC I had a lot of great discussions with folks who are working with the EF. Some of them are just evaluating it, and others are deep into projects built on it. As it turns out, much of the feedback that I heard came as no surprise. One of the most common recurring themes is that folks would like more help creating n-tier applications based on the EF, especially solutions built around WCF. Unfortunately, these kinds of solutions are just harder than they should be.

We’re working on simplifying things in EF v2, but in the meantime there are things you can do to make the process easier using just the tools available in EF v1. One common pattern that I encounter in these scenarios is the use of optimistic concurrency with a single concurrency token (often a row version or timestamp). In these cases, you can reduce your wire traffic and simplify the code on both the client and the mid-tier with a couple of simple extension methods on ObjectContext.

The Problem
A common pattern for web services that work with entities is to have two methods. One method will retrieve an entity, and the other method will update it. The retrieval method is easy, but the update method is harder because in addition to the new version of the entity, the EF needs two other kinds of information about the update operation—it needs to know the original values of any properties used in the concurrency checks, and it needs to know which properties were modified.

One approach that is sometimes used is to require the client to clone the entity before making any changes and then send both the original entity and the new version back to the update method. The update method can then call “context.Attach(original)” followed by “context.ApplyPropertyChanges(new)” to get the context into the right state for a SaveChanges call.
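As a rough sketch, the two-copy update method might look something like the following in EF v1. The class name, method name, and entitySetName parameter are placeholders made up for illustration. Note that ApplyPropertyChanges takes the entity set name as its first argument, which the shorthand above leaves out.

```csharp
using System.Data.Objects;
using System.Data.Objects.DataClasses;

// Hypothetical helper; the class and method names are illustrative only.
public static class TwoCopyUpdateSketch
{
    public static void Update(ObjectContext context, string entitySetName,
                              IEntityWithKey original, object updated)
    {
        // Attach the untouched clone. Its values become the original values
        // that the EF uses for concurrency checks.
        context.Attach(original);

        // Copy the scalar values from the edited copy onto the attached
        // entity. The EF compares values to decide which ones were modified.
        context.ApplyPropertyChanges(entitySetName, updated);

        // Push the changes to the database.
        context.SaveChanges();
    }
}
```

Both copies have to carry the same EntityKey so that the context can match the edited copy to the attached original.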

There are problems with this approach, though. Cloning the entity is time consuming and a pain to code. You have to send twice as much data on the wire when calling the update method. And ApplyPropertyChanges has to iterate through each property, comparing values, in order to decide which ones have been modified.

The Idea
So, if we find that our entities really only use one property for concurrency checks, then we can make the overall process a lot simpler. We change the signature of the update web service method so that it takes only the new version of the entity, and we make a contract with the client that it will not update the property used for concurrency checks. That way the concurrency information just flows along with the single entity. This makes the client code much easier (no cloning: just retrieve the entity, modify it, and send it back), and it removes the duplication on the wire. The astute reader might bring up three questions, though:

1) What if the client changes the concurrency property? I said above that the idea was to make a contract with the client that it shouldn’t update the concurrency property, but we do need to take into account the possibility of a client with a bug (or maybe even a malicious client). The good news here is that if the client changes the property, chances are it will just get a concurrency exception, because the value it sends back is treated as the original value and will no longer match what is in the database. If the client wants the transaction to succeed, then it should leave the property alone. If the property gets modified, then the transaction will fail and the data in the database will be left unmodified.

2) How do we know which properties were changed? We said that the EF needed not only the original concurrency value but also the list of modified properties. The proposed solution to this problem is just to mark every property on the entity as modified whether it was modified or not. This approach won’t work for everyone, but it’s actually effective in a surprising number of cases. The EF uses the list of modified properties in order to determine which values to send to the database in its update statement. Having a shorter, more precise list will reduce the wire traffic to the database, but on the other hand if different properties are updated in different transactions, then having the more precise list may actually cause worse performance once the operation gets to the database because it will produce fragmentation in the query plan cache. Sending the same set of properties every time will sometimes produce much better performance even though it means sending more data on the wire. Whether or not this is the right trade-off for your application is something that will require profiling to determine.

3) OK, maybe this is a good idea, but how do I do it? Ahhh… finally the fun part. ;-) Given the extension methods below, you can replace the Attach(original), ApplyPropertyChanges(new) code in the update service method with a single call to AttachAsModified(new) – and then of course call SaveChanges() to push the changes to the database.
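To make that concrete, here is a minimal sketch of what the single-entity update path could look like on the mid-tier. It assumes the AttachAsModified extension method from the section called The Code is in scope. The class name, the TryUpdate method, and the choice to report a failed concurrency check as a bool are all invented for illustration; a real service might prefer to return a fault to the client instead.

```csharp
using System.Data;                      // OptimisticConcurrencyException
using System.Data.Objects;
using System.Data.Objects.DataClasses;

// Hypothetical service-side helper; names are illustrative only.
public static class SingleCopyUpdateSketch
{
    // Returns false when the concurrency check fails.
    public static bool TryUpdate(ObjectContext context, IEntityWithKey updated)
    {
        // One call replaces Attach(original) + ApplyPropertyChanges(new).
        context.AttachAsModified(updated);

        try
        {
            context.SaveChanges();
            return true;
        }
        catch (OptimisticConcurrencyException)
        {
            // The concurrency token sent by the client does not match the
            // row in the database: either someone else updated the row, or
            // the client changed the token. Nothing was written.
            return false;
        }
    }
}
```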

The Code
Without further ado, here’s the code for the extension methods. I actually include two methods in order to match the two variants of Attach on the ObjectContext (Attach and AttachTo), but the regular Attach version is probably the one you will want most often.

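using System.Linq;
using System.Data.Objects;
using System.Data.Objects.DataClasses;

// Extension methods must live in a static class; the class name is arbitrary.
public static class ObjectContextExtensions
{
    // Marks every property of an already attached entity as modified.
    private static void SetAllPropertiesModified(ObjectContext context,
                                                 object entity)
    {
        var stateEntry = context.ObjectStateManager.GetObjectStateEntry(entity);
        foreach (var propertyName in from fm in stateEntry.CurrentValues.
                                                    DataRecordInfo.FieldMetadata
                                     select fm.FieldType.Name)
        {
            stateEntry.SetModifiedProperty(propertyName);
        }
    }

    // Counterpart of ObjectContext.Attach.
    public static void AttachAsModified(this ObjectContext context,
                                        IEntityWithKey entity)
    {
        context.Attach(entity);
        SetAllPropertiesModified(context, entity);
    }

    // Counterpart of ObjectContext.AttachTo.
    public static void AttachAsModifiedTo(this ObjectContext context,
                                          string entitySetName, object entity)
    {
        context.AttachTo(entitySetName, entity);
        SetAllPropertiesModified(context, entity);
    }
}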
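If you want to check what the context will treat as modified after one of these calls, something like the following sketch can help while debugging. The class and method names here are made up for illustration; the only EF calls used are GetObjectStateEntry and GetModifiedProperties.

```csharp
using System.Collections.Generic;
using System.Data.Objects;

// Hypothetical diagnostic helper; names are illustrative only.
public static class ModifiedPropertySketch
{
    // Returns the names of the properties that the state manager currently
    // has marked as modified for an attached entity.
    public static IEnumerable<string> GetModifiedPropertyNames(
        ObjectContext context, object entity)
    {
        ObjectStateEntry entry =
            context.ObjectStateManager.GetObjectStateEntry(entity);
        return entry.GetModifiedProperties();
    }
}
```

Calling this right after AttachAsModified and writing the names to the debugger output is a quick way to confirm that the list is the same on every update, which is the property the query plan argument above relies on.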

 

The Caveats
There are always caveats, aren’t there? In this case there are two things to be aware of. First, there are some rather obscure mapping scenarios where the original values of properties other than the concurrency token can affect the update statements. Honestly, I can never remember these cases all that well because they are pretty unusual, but it is something to be aware of if you have some really complex mappings. Second, this of course just manages the properties of a single entity. Dealing with graphs of related entities is a much larger topic which I’ll have to address in other posts.

- Danny