Rolling your own SQL Update on-top of the Entity Framework - Part 3

Article
02/11/2008

Okay, okay, okay... yes I know this is the slowest moving series in the history of blogging (sorry Roger J).

Now, when a blog post starts like that, you would probably expect the next sentence to begin with "But" or "However", not this time though, I have no excuse, I will simply make an appeal to your leniency.

Just to refresh your memory, last time I ended with...

Next we need to look at GetUpdateCommand(..)

There are two key parts to this...

The first part is the subject of this post, so lets walk through the code:

public SqlCommand GetUpdateCommand(Expression<Func<T, T>> updator){
//Get the information we need from the QUERY
QueryInfo info = GetQueryInfo();

QueryInfo looks like this:

internal class QueryInfo
{
    public string OriginalSQL;
    public string RestrictingSQL;
    public string TableName;
    public Dictionary<string, string> C_S_fieldMap = new Dictionary<string, string>();
}

It is used to:

Hold the trace string for the query held in the original ObjectQuery<T> (in OriginalSQL).
Build a filtering query i.e. Converting (1) to the form Select KEY FROM ... WHERE ... and hold onto it (in RestrictingSQL).
While doing so it holds the name of the underlying table (in TableName).
And builds a map between Conceptual Property names and Store Column Names (in C_S_fieldmap).

The alert amongst you will have noted that I said underlying 'table' not underlying 'tables' . This your first clue that this code has some pretty serious limitations, i.e. it only works against entities that are stored in one table.

You may have also noted that we *build* a map, ideally you would use the Entity Frameworks MetaDataWorkspace to query an existing map, but unfortunately the EntityFramework's mapping information is held in classes that are internal.

In fact it is the fact that I am reverse engineering a map from TSQL rather than using some sort of fail safe metadata, that means there are a whole heap more limitations and assumptions in this code than I would like.

(NB If on the other hand you parsed the MSL file yourself to have a correct map, you could definitely do more interesting updates, but of course, that is rather a lot of work...)

So GetQueryInfo() looks like this:

private QueryInfo GetQueryInfo()
{
EnsureOpenConnection();

    QueryInfo info = new QueryInfo();
    info.OriginalSQL = Query.ToTraceString();
    //Get the table name
    info.TableName = Utilities.GetOrRethrow(
                            () => info.OriginalSQL.Between("FROM ", "AS [Extent1]").Trim(),
                            (e) => new InvalidOperationException("Failed to infer the Table that is being updated", e)
                            );

//Get the aliases so we can build the map
string[] bits = info.OriginalSQL.Split(new string[] { "FROM " }, StringSplitOptions.None);

    Utilities.Assert(
        bits.Length == 2,
        () => new InvalidOperationException("Unexpectedly complex SQL tree with 2 or more FROM's encountered\n\r" + info.OriginalSQL)
    );

    string trunc = bits[0].Replace("SELECT ", "").Trim();
    string[] aliases = trunc.Split(',');
    aliases = aliases.Select(a => a.Trim()).ToArray();

// get an array of arrays in the form [DBfieldName][EntityPropertyName],...]

    var keyvaluepairs =
            aliases.Where(a => a.StartsWith("[Extent1].["))
           .Select(a => a.Replace("[Extent1].[", ""))
           .Select(a => a.Replace("] AS [", ","))
           .Select(a => a.Replace("]", ""))
           .Select(a => a.Split(','));

    //Set up the map from array
    foreach (var keyvaluepair in keyvaluepairs)
        info.C_S_fieldMap[keyvaluepair[1]] = keyvaluepair[0];

    //Make sure the key is simple
    Utilities.Assert(Context.Keys.Length == 1,
        () => new NotSupportedException("Only works for Entities with simple keys")
    );

//Construct an Alias for the SELECTOR
string keyAlias = Context.Keys.Select(key => string.Format("[Extent1].[{0}] AS [{1}]", info.C_S_fieldMap[key], key)).FirstOrDefault();

//Store the filterSQL
info.RestrictingSQL = string.Format("SELECT {0} FROM {1}", keyAlias, bits[1]);

return info;
}

Walking through the code we:

Call EnsureOpenConnection(), because in order to call ToTraceString() in Beta3 the underlying database connection must be open. I've skimmed over that method here, because hopefully for the next release this limitation will be gone, the TSQL should be able to be built without an open database connection.
Then we look for the table name*.
Check the TraceString isn't too complicated*
Pull out the columns that are being projected into the results, and attempt to build a map from Property Name to Column Name for properties on the primary table*
Use our map and our knowledge of the keys to re-write the query as a key query.

As you walk though this code, you will see there are a whole heap of assumptions* being made.

*So what are the assumptions:

It is supremely importantly to recognize we are reverse engineering SQL Gen, as such we need to fail often and early whenever the SQL we encounter doesn't match our expectations.
One of the biggest assumptions is that the Entity Framework is running against SQL Server, and that the SELECT will be in the form SELECT [Extent1].[Column] AS [Property], ... FROM [Table] as [Extent1] .
The table must only have one key.

Assuming these assumptions are okay we are ready to move on to building our update command...

And now I make a promise: Part 4 will be with you in LESS than a week.

Rolling your own SQL Update on-top of the Entity Framework - Part 3

Additional resources