EntityBag Part I – Goals

Well, I guess it’s high time that I get down to the business of sharing and explaining the “general purpose container object” for transporting graphs of entities, along with change tracking information, over WCF web services, which I mentioned in this previous post.  As it turns out, there’s a fair amount of code involved, so we’ll build up a series of routines and classes until we get to the top-level object, which I call EntityBag<T>.

To start us off, though, let’s take a look at the top-level interface for EntityBag and how it would be used.  The goal is to write a couple of mid-tier web methods.  One will retrieve a graph of entities in a single call and instantiate them on the client in full graph form.  The other will take a modified graph back and persist the changes to the database, honoring the optimistic concurrency contract based on the original values retrieved and requiring no extra round trips to the DB in order to save.  To make the whole thing work, we need:

1)      A way to send the entire graph in spite of the fact that the EF and WCF by default only shallowly serialize entities (that is, related entities are not automatically serialized).

2)      Something which will track changes on the client—both changes to regular properties of the entities as well as changes to the graph (new entities, deleted entities, modified relationships).

3)      A serialization format that can include not only the current values of an entity but original values and information about the state of the entities, etc.
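
To make the third requirement concrete, here is a minimal sketch of the per-entity information that has to cross the wire. The names EntitySnapshot, SnapshotState and ModifiedProperties are made up for illustration and are not the EntityBag implementation; only the DataContract and DataMember attributes are real framework pieces.

```csharp
using System.Collections.Generic;
using System.Runtime.Serialization;

// Hypothetical state flag for one entity in the graph.
public enum SnapshotState { Unchanged, Added, Modified, Deleted }

// Hypothetical wire format for one entity: its state plus both sets of values.
[DataContract]
public class EntitySnapshot
{
    [DataMember] public string EntityKey { get; set; }
    [DataMember] public SnapshotState State { get; set; }
    [DataMember] public Dictionary<string, object> CurrentValues { get; set; }
    [DataMember] public Dictionary<string, object> OriginalValues { get; set; }

    // Names of properties whose current value differs from the original one.
    public IEnumerable<string> ModifiedProperties()
    {
        foreach (KeyValuePair<string, object> pair in CurrentValues)
        {
            object original;
            bool hadOriginal = OriginalValues.TryGetValue(pair.Key, out original);
            if (!hadOriginal || !object.Equals(pair.Value, original))
            {
                yield return pair.Key;
            }
        }
    }
}
```

Carrying the original values next to the current ones is what lets the receiving side work out what changed without asking the database again.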

The strategy I adopted for EntityBag is to create a DataContract-serializable object which will effectively transmit an entire ObjectStateManager and to identify a single object as the “root” of the graph.  This way, on the mid-tier, a graph of related entities can be retrieved from the database into an ObjectContext, an EntityBag may be constructed to hold that context, and then that bag can be serialized to the client.  On the client, the EntityBag will expose a single public “Root” property and internally will create a private ObjectContext which is used both to help reconstruct the graph and to transparently track changes.  When it’s time to persist the changes, the EntityBag can be serialized back to the mid-tier, and the updated entity graph along with all original values can be reconstructed into a mid-tier context which will then be used to save the changes to the DB.
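
Why do the original values matter so much? Conceptually, they let the save be a single conditional update ("change this row only if it still looks the way I read it") instead of a re-read followed by a write. The sketch below shows the idea with an in-memory dictionary standing in for a table. TryUpdateName and the table shape are hypothetical; in a real application the EF generates the equivalent SQL.

```csharp
using System.Collections.Generic;

public static class OptimisticUpdate
{
    // Mirrors the shape of:
    //   UPDATE Rooms SET Name = @newName WHERE Id = @id AND Name = @originalName
    // 'table' is a stand-in for the Rooms table (row id -> Name column).
    // Returns false when the row was changed or deleted since it was read.
    public static bool TryUpdateName(
        IDictionary<int, string> table, int id, string originalName, string newName)
    {
        string stored;
        if (!table.TryGetValue(id, out stored) || stored != originalName)
        {
            return false; // concurrency conflict: nothing is written
        }

        table[id] = newName;
        return true;
    }
}
```

A second caller still holding the old name gets false back and has to decide how to resolve the conflict, which is the optimistic concurrency contract in miniature.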

Code for the web methods might look something like this:

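public EntityBag<Room> GetRoomAndExits(string roomName)
{
    using (DPMudDB db = DPMudDB.Create())
    {
        // Wrap the name in wildcards so the LIKE below matches substrings.
        roomName = '%' + roomName + '%';
        var room = db.Rooms.Where("it.Name like @name", new ObjectParameter("name", roomName))
                           .First();   // assumed: take the first matching room
        room.Exits.Load();             // assumed: load the exits so they travel with the room
        return db.CreateEntityBag<Room>(room);
    }
}

public void UpdateRoomAndExits(EntityBag<Room> bag)
{
    using (DPMudDB db = DPMudDB.Create())
    {
        // Assumed call shape: rebuild the graph and its original values in this context, then save.
        bag.UnwrapInto(db);
        db.SaveChanges();
    }
}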

As you can see, the code in these two methods is uncluttered by serialization considerations—just one call (CreateEntityBag) in the first method and one (UnwrapInto) in the second method encapsulate the plumbing.  The client side is similarly straight-forward.  Essentially all that is necessary is to use the Root property on the EntityBag to access the graph:

EntityBag<Room> bag = client.GetRoomAndExits("Calm");

foreach (Exit e in bag.Root.Exits)
{
    Console.WriteLine("\t" + e.Name);
    e.Name += "***";
}

var newRoom = new Room();
newRoom.Name = "NotTheRoomYouAreLookingFor";

var newExit = new Exit();
newExit.Name = "NotTheExitYouAreLookingFor";
newExit.TargetRoom = newRoom;

// Assumed: attach the new exit to the graph and send the bag back.
bag.Root.Exits.Add(newExit);
client.UpdateRoomAndExits(bag);

While I like the simplicity of this interaction, it is super important to keep in mind the restrictions imposed by this approach.  First off, this requires us to run .NET and the EF on the client.  In fact, it requires that the code for your object model be available on the client, so it is certainly not interoperable with Java or something like that.  Secondly, because we are sending the entire ObjectContext back and forth, the interface of the web methods imposes no real contract on the kind of data that will travel over the wire.  The retrieval method in our example is called GetRoomAndExits, but there’s absolutely no guarantee that the method will not return additional data, or even that it will return a room and exits at all.  This is even scarier for the update method, where the client sends back an EntityBag which can contain any arbitrary set of changes, and they are just blindly persisted to the database.  The ease of use we have achieved comes at a very high price.  Of course, if you are writing web services that transport instances of the DataSet, you already live in that world, and for some scenarios this might be acceptable.
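
To see what the method signature does not enforce, consider the kind of check a mid-tier would have to write by hand if it wanted to restrict an incoming bag to rooms and exits. This is only a sketch under assumptions: ChangeEntry and BagContractCheck are hypothetical types, and EntityBag as described here exposes no such list of changes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical summary of one pending change carried by a bag.
public class ChangeEntry
{
    public Type EntityType { get; set; }
    public string State { get; set; }
}

public static class BagContractCheck
{
    // Returns every change whose entity type is not in the allowed set.
    public static List<ChangeEntry> FindUnexpected(
        IEnumerable<ChangeEntry> changes, ICollection<Type> allowedTypes)
    {
        var unexpected = new List<ChangeEntry>();
        foreach (ChangeEntry change in changes)
        {
            if (!allowedTypes.Contains(change.EntityType))
            {
                unexpected.Add(change);
            }
        }
        return unexpected;
    }
}
```

An update method could refuse to save whenever the returned list is non-empty. Without something along these lines, the name UpdateRoomAndExits is a promise that nothing in the type system backs up.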

OK.  Now that I have your attention, in future posts we can dig into the implementation of EntityBag.



Comments (17)

  1. Did you see the post at blogs.msdn.com

  2. JasonBSteele says:

    Very cool Danny – I look forward to your following articles.

    TBH I was surprised to find that the EF didn’t have this built in – perhaps it will in the future?



  3. simmdan says:

    The EF won’t have something like this built-in for v1, but we are looking hard at the topic for future releases.  I’m not 100% certain we’ll do this, since there are some serious issues when it comes to interoperability, etc. (as I’ve noted).  I believe we made some major mistakes with the DataSet in this regard, and I don’t want to repeat them.

    – Danny

  4. jason.hill says:

    Sounds very interesting. How would this work at the client when you have properties that are lazy loaded and which haven’t been loaded when the graph is serialized?

  5. simmdan says:

    It does not allow lazy loading on the client.  First off there’s the fact that the EF doesn’t support implicit lazy loading.  Next, if we are trying to create an architecture with more than 2-tiers, then we assume the client does not have line-of-sight to the database, and we don’t want to create a web service architecture that essentially re-establishes that just tunneled through HTTP.

    So the expectation is that you would load everything necessary for a particular operation on the mid-tier before returning the EntityBag.  Of course it would be possible to have additional web service methods to retrieve related entities on demand, but that would require expanding EntityBag to make it possible to attach related entities from one bag into another bag (that is into the internal context).  This could be done but is beyond the scope of these blog posts.

    – Danny

  6. jason.hill says:

    Ahhh…right. I haven’t had a chance to dig into the EF in a lot of detail yet but am approaching this from an NH background.

    Do you think the EF will support implicit lazy loading down the track?


  7. simmdan says:

    The EF is unlikely to support implicit lazy loading by default–we’ve had a lot of feedback around the dangers of this approach for many applications and the value of explicit lazy loading instead.

    That said, I certainly think it should be possible to turn on implicit lazy loading if you want it, and in future releases we may make it easier.

    – Danny

  8. jason.hill says:

    I would be interested in hearing about the potential dangers of implicit lazy loading. We are used to this with NH, and having to call Load() every time to load related entities seems like a cumbersome requirement.

    The developer really has to think about whether the collection has been loaded every time they access the property and I think this just opens up the possibility of people forgetting or code littered with Load() calls "just to be sure" that the collection has been loaded.

    I can’t immediately see why this explicit loading is beneficial. Surely, when you need to access related entities, then you just access the property and the ORM should know whether it needs to go to the DB or not.

    I would definitely agree that EF should support implicit lazy loading…as a priority too 🙂


  9. simmdan says:

    The basic reasoning is this:

    1) For some applications (particularly as they grow in size, complexity and sophistication of deployment environment) predictability around round trips to the database becomes very important.  Unless you want a round trip to happen every single time you access the property (which most folks don’t want), implicit loading means that it can be very hard to predict whether or not a property access will require a DB roundtrip.  The EF designers had some painful experiences with these exact problems which led to the design decision to make things much more explicit.

    2) While it is pretty straight-forward to add implicit behavior on top of explicit, the other way around is not so easy.

    In fact, if you want implicit loading with the EF today, you just need the navigation property implementations to have an

    if (!field.IsLoaded) field.Load();

    before returning the field.  This could be done by implementing your own IPOCO classes or quite probably even by customizing the code generation process with codegen events.

    – Danny

  10. jason.hill says:

    Thanks for the response Danny.

    I’m really not sure I get #1. Accessing a property shouldn’t trigger a hit to the db *every time* you access it. Surely, the first access hits the db and then the result is cached for the duration of the UOW, i.e. datacontext in EF.

    It would be great if you could also please elaborate on your comment "implicit loading means that it can be very hard to predict whether or not a property access will require a DB roundtrip".

    I am finding it difficult to see any use case where a developer would want [or should be able to] access a property that was not loaded from persistent storage before first access. Allowing this surely just creates an inconsistency between the db and in-memory right from the get-go which to me seems like a recipe for disaster.

    Thanks for providing some info on manually implementing implicit lazy loading in EF. It is good to see that it is possible to get this working at a base level without the developer having to think about every property access, although I am strongly in favour of this being an integral part of EF without the need for IPOCO or codegen solutions.


  11. simmdan says:

    Yes, loading every time would not be what you want, but if you don’t load every time, then there’s no way to really guarantee whether or not a round trip will happen when you access the property.  If your app becomes complicated, then multiple unrelated components might access the property and the order isn’t guaranteed.  You can look at this as an argument for implicit loading (whichever one gets to it first will load), or you can look at it as an argument for explicit loading (stop, think about what’s going on and make your own strategy to solve things).

    Let me give you a concrete example: With EntityBag, you determine the set of entities on the mid-tier and then ship it to the client.  Once on the client, you don’t have a connection to the DB, and if you were to implicit load, it would fail.  So we pre-load things on the mid-tier and then on the client-tier whatever isn’t loaded just comes through as empty.

    It’s a philosophy sort of thing.  In v1 of the EF, you will get explicit loading and the mechanisms I mentioned above to customize for implicit loading if you like.  In some future release of the EF, we may further simplify turning on implicit loading, but it will always be off by default.

    – Danny

  12. jason.hill says:

    Ahhh…right…I can see where explicit loading is useful when you ship the objects to a remote client that doesn’t have a connection back to the DB and is therefore unable to make use of implicit loading.

    In all other use cases I would imagine that implicit loading is the logical choice because then the property is just inflated when first accessed without the developer having to think about it.

    The main problem then is that you can’t necessarily forecast how the objects are going to be used in the future so assuming implicit loading may cause problems if the objects are sent to a remote client and the client is unable to implicitly load.

    However, I am still a little uncomfortable with explicit loading as it just requires too much manual effort on the part of the developer with having to call Load() all the time, especially when most of the time you are obviously going to want to work with loaded properties otherwise you wouldn’t be accessing them!

    It really sounds like the core issue is serialization and delivery to remote clients…is that right?

    I like the idea of having implicit loading turned on by default except when the objects are delivered to a remote client in which case it flips to explicit loading or even disabled depending on the load capabilities at the client.

    Possibly, the developer should have the ability to dynamically control this explicit/implicit loading behaviour within the service interfaces rather than having it rigidly defined in the domain. The developer should know at the interface what the capability of the remote clients are going to be so could reliably make that call, e.g. web service interfaces would not allow implicit loading. Is this possible?

    You also mentioned pre-loading on the mid tier and then properties being empty on the client. So, what happens when the client accesses an empty property?


  13. simmdan says:

    The example of access on the client is just an example.  Again, this is a philosophy thing.  I argue that you want implicit loading because it’s what you are used to.  Other folks have a very, very different view.

    With explicit loading, if you access a property that has not been loaded yet, then it is null or empty (depending on whether it’s a ref or a collection).  In either case, though, there’s a way to check a separate property to see if the relationship has been loaded yet or not so that you can tell if it’s empty because there aren’t any related entities in the DB or if it’s empty just because it hasn’t been loaded.

    I’m sorry the EF doesn’t work the way you expect.  Maybe after you work with it for a while you will get used to it, or maybe you will customize things a bit to make it behave more like what you want.

    – Danny

  14. Hot Topics says:

    Danny Simmons wrote a class that persists the EntityState of objects in an ObjectContext so that they

  16. michael.plavnik says:

    Hello Danny. I am sure the EF team has considered multiple design choices for lazy loading. But the discussion about lazy loading left a troublesome aftertaste for me.

    My experience is that implicit loading is as complex to layer above explicit as vice versa. Both need to be designed together if you want both.

    Jason.Hill also has a point, because the EF idiom breaks the reference semantics of CLR objects while preserving the syntax.

    A more realistic statement would be that the world is going to get used to the EF idioms (we shall see, there are other products though not exactly like EF :-)). Also, I do believe that in complex scenarios like you mentioned before, the choice of implicit or explicit default is not what saves the day (or a product).

    My personal experience started with making things explicit by default and then switching to an implicit default at the application teams' request. But again, we have different idioms (from EF) to batch requests to the database, somewhat along the lines of "be explicit in the big, not in the small".

  17. jason.hill says:

    Didn’t mean to get into a debate here…just trying to understand the EF philosophy!

    I don’t agree with your comment about wanting implicit loading because it’s what I am used to. I have been looking at this objectively and the arguments for explicit loading just aren’t stacking up for *me* yet. Conversely, I actually see it causing problems:

    1. Unexpected behavior because someone forgot to call Load().

    2. Too much Load and IsLoaded coding required by the developer on lots of properties.

    Maybe my viewpoint will change down the track but for now I just don’t see why it makes sense in the majority of use cases.

    Anyway, as you say, we have options to implement implicit loading through code gen or IPOCO…or of course we can just stick with NHibernate 🙂