Concepts Part I: Getting an entity model up and running

I've been thinking for a while about a series of blog posts I'd like to write explaining various Entity Framework concepts, particularly those related directly to writing code using the framework--by that I mean that I will concentrate more on using the API to write object-oriented programs which do data access and less on other kinds of topics like writing and debugging the XML mapping, using the designer, etc.  Some of these topics will be pretty basic, but many of them are the kinds of things that come up fairly often in our own internal discussions as well as on the forums and such.  Hopefully this will be valuable.  Please let me know what you think, whether there are topics you would especially like me to cover, and whether there's anything I can do to make it more useful.

I'm going to use a common model/database for my samples which I call DPMud.  It's something that a few friends and I created and maintain as a fun side-project: an old-style multi-user text adventure (MUD) built using the Entity Framework.  The architecture is not something that will scale well (every real-time event in the system persists to the database, and users learn about events by querying the database), but it works reasonably well for a few users at a time, it's an environment which is very convenient for us to develop in, and it has turned out to be a decent example for exercising all sorts of Entity Framework functionality.  It even does a reasonable job of stressing the system.

In order to help you get a feel for the model, here's an abbreviated version of the diagram:

[Figure: DPMud Model Diagram]

Essentially I have a many-to-many self relationship for rooms, but because I need a "payload" on that relationship (some properties on the relationship itself rather than the rooms), I chose to create another entity type called Exit and then have two 1-many relationships between Rooms and Exits--one of those relationships represents all of the exits out of a room, and the other relationship represents all of the entrances to the room.  The other parts of the model are actors which are related to rooms, and items which can be related either to a room or an actor.  The model doesn't have a good way to enforce that an item must be related either to a room or an actor but not both or neither, but I've added check constraints to the DB which handle that enforcement.  In the real game there are also events which relate to each of the other entity types as well as a large inheritance hierarchy of actor types and event types among other things, but those make the model diagram very hard to read, so I've left them out.

This diagram corresponds to my CSDL file which is the XML file describing my conceptual model.  If you'd like to look that over, you can find it here.  Well, that CSDL file actually has a few things not present in the diagram, but it is simplified considerably from the full model used in our game.  For completeness, you can also have a look at the SSDL and the MSL.  I know I'm a bit "old school" when it comes to these things since I've been working on the EF since well before we had a designer (funny to say that about a not-yet-released project), but I've authored each of these files by hand rather than using the designer.  I'll also point out, in case some of you are unaware, that the designer uses a file called EDMX to store all of the metadata about an EF model--this one file includes the CSDL, MSL and SSDL, just in separate sections within it.  At runtime, the system requires the three separate files, and the designer projects automatically create them in your application's output directory.  So, I will talk about the three files separately and even play some tricks that don't work with the designer today rather than describing the default designer experience.

The one other thing I'll say about DPMud is that it was built from the beginning as a rich Windows client which talks directly to the database.  So most of our initial apps will work that way.  This is nice, because these are some of the easiest apps to write, but as we go through the series we probably will also spend some time exploring other app architectures like web services and web apps.

OK. So enough about my funny little sample. Let's cover some EF concepts, shall we? There are three things I'd like to talk about in this post which are all related to getting any basic app up and running (once you are done creating the metadata for the model, generating the basic entity classes, etc.):

  1. Storing your metadata in resource files.   One of the things that folks often find annoying is the fact that EF applications require these three separate metadata files at runtime, and that complicates deployment, etc. Fortunately, it's really quite easy to store these files in the assembly itself as resources and then just tell the EF to find them there. The way to do this is to add the CSDL, MSL and SSDL files to your project and set the "Build Action" property on each one to "Embedded Resource". Then when you specify your connection string, rather than explicitly listing the metadata files, you can just add "metadata=res://*/;" which tells the EF to look for appropriate resources in your app's statically linked assemblies and use any that it finds.

  2. Connection strings.   When you use the designer or EdmGen.exe to generate classes from your model, part of what gets generated is a strongly-typed ObjectContext which provides a convenient façade for working with your model. Part of the standard pattern is that the connection string which was originally used to access the database in order to generate the model is embedded in an EntityClient connection string which is placed into the app.config file under the same name as your entity container (specified in the CSDL and used as the name of the strongly-typed ObjectContext type in the generated code). In my sample, this name is "DPMudDB". If the app.config file exists and has a connection string whose name matches the container, then the generated classes have a nice, parameterless constructor which will automatically pick up that connection string and use it.

    This mechanism is great for many scenarios--it makes for nice clean code, and it follows the best practice of putting the connection string out into the config file, where it's easier to swap out when, for example, you use one connection string for development, another for testing/staging and a third for final deployment. That said, there are cases where you might want to generate the connection string more dynamically--in the case of DPMud, we have multiple database servers for different environments and wanted to give the user the opportunity to specify the server as well as credential information at runtime. So, it's important to understand what the parts of the connection string are and how they go together.

    If you create a connection string at runtime, then it's still quite straightforward to use. Instead of using the parameterless constructor for the strongly-typed context, you can just use the constructor overload which takes a connection string. This connection string is the exact same connection string that you use either with an ObjectContext (the base type or one of the strongly typed classes that inherit from it) or directly with EntityClient. It follows the same model as all other connection strings which is a set of name value pairs separated by semi-colons, and it has three parts:

    1. A specification for where to find the metadata (csdl, msl & ssdl) which EntityClient will use for mapping between the conceptual model and the database. The keyword for this section is “metadata=”, and there are several different kinds of values you can give it. I mentioned the one I use most often above which is just to specify that the metadata lives in resources within the assembly and that the system should look through all the resources looking for appropriate ones. You can do this with “metadata=res://*/;”, or you can specify a pipe delimited list of directories or files (either absolute or relative to your app directory) which contain the metadata.
    2. The name of the database provider which EntityClient should use to actually talk to the database. For example, “Provider=System.Data.SqlClient;”.
    3. The connection string that you would use with the database provider to access your database. Since this value itself has a series of keywords separated by semi-colons, the string is just enclosed in double quotes. In my sample model it looks like this: “provider connection string="MultipleActiveResultSets=true;database=dpmr;server=.;Integrated Security=true;"”.
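Put together, the three parts form one string. The sketch below is only an illustration: it writes the full string by hand using the values from the sample above, then splits it back apart with the base DbConnectionStringBuilder class from System.Data.Common. That class is my choice for the demo because it needs no EF reference; the class and method names (ConnectionStringDemo, Part) are made up for this example.

```csharp
using System;
using System.Data.Common;

static class ConnectionStringDemo
{
    // The three parts described above, written out by hand.
    public const string Full =
        "metadata=res://*/;" +
        "provider=System.Data.SqlClient;" +
        "provider connection string=\"MultipleActiveResultSets=true;" +
        "database=dpmr;server=.;Integrated Security=true;\"";

    // Parses the full string and returns the value stored under one keyword.
    public static string Part(string keyword)
    {
        DbConnectionStringBuilder builder = new DbConnectionStringBuilder();
        builder.ConnectionString = Full;
        return (string)builder[keyword];
    }

    static void Main()
    {
        Console.WriteLine(Part("metadata"));
        Console.WriteLine(Part("provider"));
        // The quoted value comes back as one piece, inner semi-colons included.
        Console.WriteLine(Part("provider connection string"));
    }
}
```

The point to notice is the double quotes around the third part: without them, the semi-colons inside the provider connection string would be read as separators in the outer string.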

If you are going to build up these parts dynamically, then naturally the recommended way to do so is using a connection string builder, and EntityConnectionStringBuilder makes this easy. For the provider connection string, you can of course do a similar thing using something like SqlConnectionStringBuilder. So, the whole thing together would look like this:

// Part 3: the provider connection string for the database itself.
SqlConnectionStringBuilder sqlBuilder = new SqlConnectionStringBuilder();
sqlBuilder.MultipleActiveResultSets = true;
sqlBuilder.DataSource = ".";
sqlBuilder.InitialCatalog = "dpmr";
sqlBuilder.IntegratedSecurity = true;

// Parts 1 and 2 (metadata and provider) wrapped around part 3.
EntityConnectionStringBuilder entityBuilder = new EntityConnectionStringBuilder();
entityBuilder.ProviderConnectionString = sqlBuilder.ToString();
entityBuilder.Metadata = "res://*/";
entityBuilder.Provider = "System.Data.SqlClient";

  3. Context creation and lifetimes.   This is a broad topic, and obviously the right answer varies a lot depending on the overall architecture of the app you are building. If you are building a web service or an ASP.NET page, then you almost certainly want the context to stay alive only during each request and then be destroyed so that your server can be as stateless as possible. If you are building a rich client application, though, your context will typically stay alive for a much longer period of time; in fact, the same context probably stays alive for the life of your app. The reason for this is that the context performs identity resolution and provides the path from entities to the database for when you want to do things like deferred loading. If you destroy the context or detach the entities from it, then deferred loading won’t work.

    The other thing you have to remember, though, is that the EF, like most of the rest of the .NET Framework, is by and large NOT thread-safe. So if you want to interact with the EF or your entity classes from multiple threads, then you are going to have to do your own locking. One simple model which works for some cases is for each thread to maintain its own context so that no locking is required. This means, of course, that the interactions between the threads tend to be quite limited (you can’t really pass an entity from one thread to another without being very careful), but you can do some pretty useful things even so.

    In the case of DPMud, we originally designed the system to have completely separate processes for players and for the system functions (like a process which manages all of the computer controlled AI actors in the system and another process which monitors the system and performs periodic cleanup, etc.). Recently, though, we’ve begun working on some modifications to move the system processes into background threads on the clients. This makes it possible just to run a client and have it find other clients working against the database and automatically elect a process to handle the AI or whatever. (It also means that when it really gets going DPMud can nicely peg both cores on my dual core machine <grin>.)

    Happily for us, this migration to a system which uses multiple threads with a separate context on each thread was actually quite easy because some time ago we had adopted a pattern of using a static property as the entry point for accessing the context in various parts of our code. That way we didn’t have to pass the context around everywhere we needed it, and we could also centralize the logic for building the connection string, etc. I want to caution you, though, to be very careful with this kind of pattern. It’s critical that you think about your context lifetime, the lifetime of the objects in it, and the implication of various methods you can call on the context to be sure that this one effectively global context (at least per thread) approach is the right one for you.

    One of the biggest potential surprises is around the SaveChanges method, which will save ALL changes made to entity objects attached to that context. Suppose you have methods which modify your objects but do not immediately save (not all that unusual when you want to build up a set of changes that are part of a reasonably broad task before saving them all at once), and you then introduce another method which makes a single change and saves (because its task is more limited and focused). There is a risk that one of the first set of methods could end up calling the second kind of method and inadvertently saving half of your intended unit of work rather than waiting for the whole thing to complete. In the case of DPMud, we’re building a system which saves to the database as frequently as possible because the database is effectively a real-time communication system, so this wasn’t a problem for us.
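To make that risk concrete, here is a minimal sketch that uses no EF types at all. StandInContext is a hypothetical stand-in that only records strings, and the method names are invented for the illustration; the one thing it borrows from the real context is that its SaveChanges flushes everything pending, not just the caller's own change.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a context: it only records pending changes.
class StandInContext
{
    public readonly List<string> Pending = new List<string>();
    public readonly List<string> Saved = new List<string>();

    // Flushes every pending change, whoever made it.
    public void SaveChanges()
    {
        Saved.AddRange(Pending);
        Pending.Clear();
    }
}

static class SaveChangesSurprise
{
    // One shared entry point, as with a static context property.
    public static readonly StandInContext Db = new StandInContext();

    // Focused helper: makes one change and saves right away.
    static void RenameRoom()
    {
        Db.Pending.Add("rename room");
        Db.SaveChanges();
    }

    // Broader task: intends to save only once, at the end.
    // Returns how many changes had already been saved before that final save.
    public static int MoveActor()
    {
        Db.Pending.Add("remove actor from old room");
        RenameRoom(); // the first half of the unit of work is saved here
        Db.Pending.Add("add actor to new room");
        int savedEarly = Db.Saved.Count;
        Db.SaveChanges();
        return savedEarly;
    }

    static void Main()
    {
        Console.WriteLine("Changes saved early: {0}", MoveActor());
    }
}
```

In this sketch the actor has been removed from the old room in the "database" before it has been added to the new one, which is exactly the half-saved state described above.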

    In any case, all of that was a long wind-up to sharing some code which follows the same pattern that we use in DPMud, not only to consolidate connection string building, but also to create a common, static entry point to the context and to maintain a separate context for each thread automatically by putting the context into thread-local storage. One other quick point: I put the static properties onto the DPMudDB (strongly typed context) partial class, since it’s all about that class and there seemed little reason to create another type for that purpose.

    Anyway, the code looks like this:

using System;
using System.Data.EntityClient;
using System.Data.SqlClient;
using DPMud;

namespace DPMud
{
    public partial class DPMudDB
    {
        // Thread-local storage: each thread sees its own copy of this field.
        [ThreadStatic]
        static DPMudDB _db;

        // Shared by all threads; built on first use.
        static string _connectionString;

        public static string ConnectionString
        {
            get
            {
                if (_connectionString == null)
                {
                    // insert code from above which uses connection string builders here //
                    _connectionString = entityBuilder.ToString();
                }
                return _connectionString;
            }
        }

        public static DPMudDB db
        {
            get
            {
                // Create this thread's context the first time the thread asks for it.
                if ((_db == null) && (ConnectionString != null))
                {
                    _db = new DPMudDB(ConnectionString);
                }
                return _db;
            }
        }
    }
}
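If you want to see the thread-local behavior in isolation, without a database or a generated model, something like the following sketch should do. PlaceholderContext and PerThread are hypothetical names standing in for the generated context and its static property; the only real piece is the [ThreadStatic] attribute.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the generated context; it holds no real connection.
class PlaceholderContext { }

static class PerThread
{
    [ThreadStatic]
    static PlaceholderContext _db; // one slot per thread, null until that thread sets it

    public static PlaceholderContext Db
    {
        get
        {
            if (_db == null)
            {
                _db = new PlaceholderContext();
            }
            return _db;
        }
    }

    // Returns true when a second thread gets a different instance than the caller,
    // while the caller keeps getting its own instance back.
    public static bool OtherThreadGetsOwnInstance()
    {
        PlaceholderContext mine = Db;
        PlaceholderContext theirs = null;
        Thread worker = new Thread(() => { theirs = Db; });
        worker.Start();
        worker.Join();
        return !ReferenceEquals(mine, theirs) && ReferenceEquals(mine, Db);
    }

    static void Main()
    {
        Console.WriteLine(OtherThreadGetsOwnInstance()); // prints True
    }
}
```

One detail worth knowing: an initializer written directly on a [ThreadStatic] field runs only once, on whichever thread first touches the type, so every other thread would see null. That is why the instance is created lazily inside the property getter.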

So what do we get from all this? Well, once we have this foundation laid, writing a program which accesses the database using our strongly typed context becomes pretty darn simple (even if we have multiple threads), and that program can itself just be a single EXE with the entire model and all of its metadata in that one assembly (this is what we do for DPMud—nothing quite like drag-and-drop deployment).
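If the single-EXE setup fails to find its metadata, a quick sanity check is to list what actually got embedded. This is just a rough sketch using standard reflection calls; EmbeddedMetadataCheck and HasResource are names I made up, and matching on the file extension is an assumption about how your resources are named.

```csharp
using System;
using System.Reflection;

static class EmbeddedMetadataCheck
{
    // Returns true if the assembly has at least one resource whose name ends with the extension.
    public static bool HasResource(Assembly assembly, string extension)
    {
        foreach (string name in assembly.GetManifestResourceNames())
        {
            if (name.EndsWith(extension, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }

    static void Main()
    {
        Assembly exe = Assembly.GetExecutingAssembly();
        foreach (string extension in new string[] { ".csdl", ".ssdl", ".msl" })
        {
            Console.WriteLine("{0}: {1}", extension,
                HasResource(exe, extension) ? "embedded" : "missing");
        }
    }
}
```

If any of the three comes back as missing, the usual suspect is a file whose "Build Action" was left at its default rather than set to "Embedded Resource".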

Here’s the code for a simple program which prints out a list of all the rooms and items in the DPMud database but does so from two threads running concurrently. Note: you can set breakpoints in each of the loops and step through the program seeing things interleave nicely, but if you actually run the program there’s a good chance all of the rooms will print together, and likewise all of the items, because the total time required is relatively small so there may not be that much time-slicing between the threads.

using System;
using System.Threading;
using DPMud;

namespace sample1
{
    class Program
    {
        static void Main(string[] args)
        {
            Thread thread = new Thread(new ThreadStart(Program.PrintItems));
            thread.SetApartmentState(ApartmentState.STA);
            thread.Start();

            foreach (Room room in DPMudDB.db.Rooms)
            {
                Console.WriteLine("Room {0}", room.Name);
            }
        }

        static void PrintItems()
        {
            foreach (Item item in DPMudDB.db.Items)
            {
                Console.WriteLine("Item {0}", item.Name);
            }
        }
    }
}

Fun huh? OK. Yes I admit it, I’m a total geek, but I think it’s pretty fun.

Until next time,
Danny