Two approaches to a LINQ Solution Structure

Background

I’m in the process of writing a small windows application with a very straightforward architecture, for which the only twist is the need to be able to plug in different data providers according to how it is used. This means it must be able to handle SQL 2005, other database servers, XML files, flat files, and any other data source that I choose during the life of the application.

Now this isn’t a complex requirement, and I’ve done it many times before, but what is key to the implementation is that I want to build my first data source to access a SQL 2005 database using LINQ to SQL. This means I thought for a while about the best way to take advantage of LINQ without compromising my plug-in architecture.

The simple diagram below should illustrate what I have in mind;

Linq Solution Structure

Obviously my Application and LinqDataSource assemblies will both have a reference to the Interfaces assembly, but there can be no other assembly references in this model.

There are 3 core concepts involved in my ideal architecture;

1. All communication is through interfaces defined in a separate assembly. This assembly contains both entity and operation definitions (the equivalent to Service Contracts and Data Contracts, should we be talking web services).

2. No implementation specific types may be in these interfaces.

3. The host application must have no knowledge of which implementation it is using (a specific instance of a data source should be created using Reflection based on values provided in configuration).

In contrast to this, LINQ to SQL provides some key benefits that I wanted to exploit;

1. It creates code for entities that reflect the tables defined in my database.

2. It handles optimistic concurrency, synchronising “identity” columns with the database, and so on.

3. It automatically tracks changes to the entities it creates so that the changes that must be written back to the database can be deduced easily.

4. It validates property values on the entities I create.

5. It allows me to use rich LINQ statements in code to select the data I want.

6. It has a very extensible model which would allow me to enhance the data access layer further (e.g. I could override validation methods on the entities).

Combining these lists leads me to the challenges;

· If LINQ to SQL automatically defines the entities in my LinqDataSource assembly this means I can’t reuse them across other data source implementations.

· I don’t want to have to do lots of entity translation (see Don Smith’s video on entity translation in the Web Service Software Factory if you’ve not heard this term) between business entities and database entities, as SQL is in reality my primary data source, and therefore the two contain the same logical fields.

The first of these challenges is the real stickler – I would normally define an entity in a shared interfaces assembly that all my loosely coupled plug-ins use. But LINQ to SQL is defining entities – so that means I’ll end up with many entity definitions, all very similar, and lots of entity translation between them all, right? Yuck. It is this I wanted to avoid if at all possible.

It turns out that rising to these challenges, and combining the two worlds of LINQ and my loosely coupled application architecture is actually very simple. So simple, in fact, I questioned whether I should write this post or not! But I decided in the end that it was indeed a useful illustration of how to approach such architecture, and of course, if I’d had to go through the thought process others would need to also. So, I came up with two slightly different approaches, and could not decide between them – each captured my imagination for its elegance and simplicity, and each has distinct advantages...

To approach this problem we’ll look at the top and bottom of the architectural layers first, and then decide how to knit the two together.

The Database

Starting with the lowest layer, then, imagine my application is based on a very simple database named Employees – in fact so simple it has only a single table, as shown below;

Linq Employee Table

As a result of this, the LINQ to SQL designer will generate two useful classes. “EmployeesDataContext” is effectively a facade to the database, allowing us to execute queries and commands against an in memory representation of the data before committing the changes, and “Employee” is an entity that represents a row of data in our Employee table. Examining the designer generated C# for Employee (by looking in the “.designer.cs” file under your LINQ to SQL “.dbml” file) reveals how the properties reflect the fields in the table

[Table(Name="dbo.Employee")]

      public partial class Employee :

INotifyPropertyChanging,

INotifyPropertyChanged

      {

            private int _Id;

           

            private string _Name;

           

            private string _JobTitle;

           

            private byte[] _Timestamp;

// other members ommitted...

      }

 

The Interface Definition

So jumping back up the layers to the application layer, we can define what operations and entities we need. My application operates in a very CRUD manner, so the obvious operations I will require are “Get All”, “Get By Identifier”, “Insert”, “Update”, and “Delete”. I can also confirm that the application must be able to display an employee’s Name, Job Title, and internal system identifier. These requirements result in two simple interface definitions;

    public interface IEmployee

    {

        int Identifier { get; }

        string Name { get; set; }

        string Title { get; set; }

    }

    public interface IEmployeeDataSource

    {

        IEmployee GetById(int id);

        List<IEmployee> GetAll();

        void Insert(IEmployee employee);

        void Update(IEmployee employee);

        void Delete(IEmployee employee);

    }

There is one important thing to notice here – that being that the fields on IEmployee do not precisely match those on the Employee table. The Timestamp column is not present, as I don’t need it in the User Interface, and I have used slightly different property names for “Id” and “Job Title”. I have done this deliberately as the chances are in a real world situation there will be some mismatch between the two, especially if different teams or even companies are authoring each component.

 

The Glue: Option 1 (Adapters)

My first thought was to create an Adapter class to connect IEmployee to the Employee LINQ entity. Such a solution can be described rather succinctly in UML;

Linq Employee Adapter

An example of how this might work is shown below;

    public class EmployeeAdapter : IEmployee

    {

        private Employee _employee;

        public EmployeeAdapter()

        {

            _employee = new Employee();

        }

        internal EmployeeAdapter(Employee employee)

        {

            _employee = employee;

        }

        internal Employee Employee

        {

            get

            {

                return _employee;

            }

        }

        public int Identifier

        {

            get { return _employee.Id; }

        }

        public string Name

        {

            get { return _employee.Name; }

            set { _employee.Name = value; }

        }

        public string Title

        {

            get { return _employee.JobTitle; }

            set { _employee.JobTitle = value; }

        }

    }

This adapter simply maps calls to properties on IEmployee to the corresponding properties on Employee. Such an adapter allows us to write some pretty simple code in an implementation of IEmployeeDataSource to retrieve and persist our data;

        public IEmployee GetById(int id)

        {

            return ( from c in Db.Employees

where c.Id == id

select new EmployeeAdapter(c)).SingleOrDefault();

        }

        public List<IEmployee> GetAll()

        {

            return Db.Employees.Select(

e => new EmployeeAdapter(e) as IEmployee

).ToList<IEmployee>();

        }

        public void Insert(IEmployee employee)

        {

            EmployeeAdapter adaptedEmployee = employee as EmployeeAdapter;

            if (adaptedEmployee == null)

                throw new ArgumentException("Wrong type.");

            Db.Employees.Add(adaptedEmployee.Employee);

        }

        public void Update(IEmployee employee)

        {

            EmployeeAdapter adaptedEmployee = employee as EmployeeAdapter;

            if (adaptedEmployee == null)

                throw new ArgumentException("Wrong type.");

            // no update action required

        }

        public void Delete(IEmployee employee)

        {

            EmployeeAdapter adaptedEmployee = employee as EmployeeAdapter;

            if (adaptedEmployee == null)

                throw new ArgumentException("Wrong type.");

            Db.Employees.Remove(adaptedEmployee.Employee);

        }

 

There are some interesting points to examine in this implementation. Note the presence of the second “internal” EmployeeAdapter constructor that takes an Employee entity as an argument. This is to allow the data access framework to wrap results in an EmployeeAdapter, for example when retrieving a list by ID;

return ( from c in Db.Employees

where c.Id == id

select new EmployeeAdapter(c)).SingleOrDefault();

The embedded LINQ statement must construct a new EmployeeAdapter around each result. The syntax also gets a bit confusing when retrieving all records;

return Db.Employees.Select(

e => new EmployeeAdapter(e) as IEmployee

).ToList<IEmployee>();

Each result must be wrapped with an EmployeeAdapter, and then explicitly cast to an IEmployee to ensure that LINQ detects the correct return type of List<IEmployee>. Without the explicit cast of the EmployeeAdapter to an IEmployee, LINQ returns a List<EmployeeAdapter>, which cannot be cast to the method’s declared return type of List<IEmployee>.

This approach follows well published patterns (namely Adapter), but some of the syntax seems a little untidy, and I have introduced a class that must be maintained to synchronise the implementation of an interface and an entity. So that’s three different types that all need synchronising! I thought about this a little more, and came up with...

 

The Glue: Option 2 (Partial Classes)

LINQ to SQL uses partial classes for most of the code it generates. This allows you to add functionality to the entities that have been generated by the designer, without concern about these additions being overwritten by the designer at any point, and allowing you to use clearly defined extension points. These extension points are way beyond the scope of this article, but trust me, they are there J

Creating the “other half” of the partial Employee entity class generated by LINQ to SQL gives us the opportunity to directly implement IEmployee on the Employee entity. Adding a file to the solution with the following content, we can see that the Adapter pattern described in Option 1 has now been merged with the implementation of Employee.

    public partial class Employee : IEmployee

    {

        public int Identifier

        {

            get { return Id; }

        }

        public string Title

        {

            get

            {

                return JobTitle;

            }

            set

            {

                JobTitle = value;

            }

        }

    }

Note that there is no Name property defined in this class. Why not? Because Employee already defines Name, as it was generated by the LINQ to SQL designer (for this field both the interface definition and database agree on the precise naming), so there is no need to implement it again. Obviously the closer your database definition and interface definition are, the less work you need to do here!

This has two really significant impacts;

1. It has the potential to simplify and reduce maintenance. If all the fields on IEmployee and Employee were named the same, all the partial type need contain is a declaration that it implements the interface – and no members at all!

2. Now the LINQ Employee entity itself implements the interface, the casting syntax we found when using the adapter pattern is significantly more simplified.

The UML for this approach shows exactly how simple it has become, just by exploiting a by-design feature of LINQ;

Linq Employee Implementation

If we now examine our IEmployeeDataSource implementation, we can see how some of the syntax has simplified;

    public IEmployee GetById(int id)

    {

        return ( from c in Db.Employees

where c.Id == id

select c

               ).SingleOrDefault();

    }

    public List<IEmployee> GetAll()

    {

        return Db.Employees.Select(e => e as IEmployee).ToList<IEmployee>();

    }

    // other methods ommitted...

The differences are most obvious when retrieving data. There is no longer a need to wrap each returned entity in an adapter – as the entity is the interface implementation as soon as it is retrieved from the database. The only casting that is performed is to ensure that LINQ creates the correct return type array in “Get All” (List<IEmployee>).

 

Cleaning up for LINQ

Both of these options are viable alternatives for the application architecture in question – but there are a few loose ends that apply to both options!

You may have noticed earlier on... how come the “Update” method has a “no action required” comment? Surely you need to write changes to the database?

It is important to understand what all the write methods are really doing here – Insert, Update, and Delete all operate on LINQ’s in-memory model of the database. Although the code for Insert looks like the changes have been written out to the database, they haven’t. In fact, if you use the code above, no database changes will ever be written!

One of the big benefits to avoiding entity translation is that the real bona-fide LINQ entity is the object that the calling application will perform operations on. This means that all that LINQ provides is automatically supported... and one of those provisions is tracking when objects change. Therefore when a user calls Update, LINQ already knows the object has changed, so we needn’t do anything; it is already flagged for writing to the database at a later date. So why bother with an Update method at all? Well, this is a generic interface that must support other data storage technologies in the future – such as XML for example – and these data sources most likely won’t auto-track changes to entities. Just another reason to use LINQ I suppose!

So we’ve just shown that we don’t yet write to the database – so how do we force this to happen? We must add a “CommitChanges” method to the IEmployeeDataSource interface, implemented as follows;

    public void CommitChanges()

    {

    Db.SubmitChanges();

    }

SubmitChanges just tells LINQ to persist any Inserted, Updated, or Deleted changes to the database. It will of course automatically determine the Identifier for any newly inserted records, and so on.

The final question is – how do I create a new record? Consider the following code;

    private void InsertRecord(

        IEmployeeDataSource source,

        string name,

        string jobTitle)

    {

        IEmployee record = new IEmployee();

        record.Name = name;

        record.Title = jobTitle;

        source.Insert(record);

        source.CommitChanges();

    }

Of course, you can’t create a new instance of an employee using the syntax “new IEmployee()”! And it turns out this is good – if we want our entity to be fully LINQ aware, it must be a fully-fledged LINQ entity from the start. When using the Adapter pattern, we must ensure that the EmployeeAdapter has a valid Employee entity encapsulated within it.

Therefore, creation of new entities must be a responsibility that is delegated to the concrete implementation of an IEmployeeDataSource – such that for my LINQ data sources, I get an entity that supports LINQ, and for a future XML data source I may need one that supports XML (perhaps it stores an XmlNode reference in the same way as EmployeeAdapter stores an Employee reference).

I have chosen to use the Factory Method pattern to achieve this, and therefore have added a CreateNew method to the IEmployeeDataSource definition (the final complete definition follows to save you tracing the additions throughout this article);

    public interface IEmployeeDataSource

    {

        IEmployee CreateNew();

        IEmployee GetById(int id);

        List<IEmployee> GetAll();

        void Insert(IEmployee employee);

        void Update(IEmployee employee);

        void Delete(IEmployee employee);

        void CommitChanges();

    }

The implementation of this new method couldn’t be simpler. For Option 1, we return an EmployeeAdapter as follows;

    return new EmployeeAdapter();

... and for Option 2 we simply return an Employee;

    return new Employee();

Finally, then, our logic to insert a new employee is tweaked to use our new entity factory;

    private void InsertRecord(

        IEmployeeDataSource source,

        string name,

        string jobTitle)

    {

        IEmployee record = source.CreateNew();

        record.Name = name;

        record.Title = jobTitle;

        source.Insert(record);

        source.CommitChanges();

    }

And that’s it! I hope that has been useful – we have seen how you can use LINQ in your applications even when the coupling between the entity consumer and your data access layer must be weak, we have seen some Object Oriented patterns in practice, and we have seen how to take advantage of the most basic of LINQ’s extensibility model.

Note: Yes, I know this is a lot of text to tell you to write a class that implements an interface; but like I said at the beginning, it’s the journey that counts, and I hope you agree!