Linq to bits !

Following my previous post, let's see how to extend Linq to bits...

To have a good understanding of what I am working on in this post, it's highly recommended to read my previous post: Enumerating Enums. In fact, this is part II.

Using Linq natural extensions to enumerations we have all the benefits of methods like Intersect, Union or Contains for free. But let me remind the problem: (weekend & Day.Saturday) is one of the fastest instruction almost unchanged from the time we were writing with assembler.
To compute weekend.Intersect(otherdays), let's see how Linq is doing:

 static IEnumerable<T> IntersectIterator<T>(IEnumerable<T> first, IEnumerable<T> second) {
    Dictionary<T, object> dict = new Dictionary<T, object>();
    foreach (T element in first) dict[element] = null;
    foreach (T element in second) {
        if (dict.ContainsKey(element)) dict[element] = dict;
    }
    foreach (KeyValuePair<T, object> pair in dict) {
        if (pair.Value != null) yield return pair.Key;
    }
}

This code is generic and can be performed for any kind of enumeration. In our particular case, we know we can make it better. Linq is really cool and can be extended to change the way our code is interpreted.

To do such we will have to implement IQueryable<T> which is a subinterface of IEnumerable<T>. Just doing this, all sequences of our objects (weekend.Intersect(Day.Saturday)) are compiled to expressions instead of IL. Implementing IQueryable<S> IQueryable<T>.CreateQuery<S>(System.Expressions.Expression expression), we will have the opportunity to analyze the expression to do something else.

 public struct MyEnum<T> : IQueryable<T>, IEnumerable<T> 
{

    ...

    #region IQueryable<T> Members

    IQueryable<S> IQueryable<T>.CreateQuery<S>(System.Expressions.Expression expression)
    {
        return new BinaryQuery<S>(expression);
    }

    public System.Expressions.Expression Expression
    {
        get
        {
            return Expression.Constant(this);
        }
    }

    ...

    #endregion
}

A lot of other methods are needed to implement IQueryable<T> but in our case, you may think I am lazy (which is another question), we do not need them, so I will let them throw the NotImplemented exception.
You can notice that we created a new BinaryQuery class to analyze of the expression and implement IQueryable<S>. Doing such is really more simple because CreateQuery<S> needs to return IQueryable<S> again, so the user can choose to query the sequence again and again.

Now let's see the expression analysis. It's quite simple (especially compared to DLinq :) ). We have to recursively navigate the expression tree and replace method calls (Intersect, Union) by our binary operations. If the node is a constant, we just retrieve the value. If the node is a method call, depending on its name, we call the right binary operation.

 public static MyEnum<T> ParseExpression(Expression expression)
{
    if (expression.NodeType == ExpressionType.MethodCall)
    {
        MethodCallExpression method = expression as MethodCallExpression;
        if (method.Method.DeclaringType == typeof(System.Query.Queryable))
        {
            switch (method.Method.Name)
            {
                case "Intersect":
                    return ParseExpression(method.Parameters[0]) 
                        & ParseExpression(method.Parameters[1]);
                    break;
                case "Union":
                    return ParseExpression(method.Parameters[0]) 
                        | ParseExpression(method.Parameters[1]);
                    break;
                default:
                    throw new NotImplementedException("Not supported");
                    break;
            }
        }
        else
            throw new NotImplementedException();
    }
    else if (expression.NodeType == ExpressionType.Constant)
    {
        ConstantExpression exp = expression as ConstantExpression;
        if (exp.Value is T)
            return new MyEnum<T>((T) exp.Value);
        else if (exp.Value is MyEnum<T>)
            return (MyEnum<T>)exp.Value;
        else
            throw new NotImplementedException();
    }
    throw new NotImplementedException();
}

At the end of this process, we have to return a MyEnum<T> which is the result of our calculation.

This is almost done. Some extensions are breaking the sequence. For example, .Count<T>() returns an int or .Contains<T>() returns a bool. Their interpretation needs to implement another method: public S Execute<S>(Expression expression).
The binary operation to test if a set contains a value is a binary and.

 public S Execute<S>(System.Expressions.Expression expression)
{
    if (expression.NodeType == ExpressionType.MethodCall)
    {
        MethodCallExpression method = expression as MethodCallExpression;
        if (method.Method.DeclaringType == typeof(System.Query.Queryable))
        {
            switch (method.Method.Name)
            {
                case "Contains":
                    T t = (BinaryQuery<T>.ParseExpression(method.Parameters[0]) 
                        & BinaryQuery<T>.ParseExpression(method.Parameters[1]));
                    object result = Convert.ToInt32(t) > 0;
                    return (S) result;
                    break;
                default:
                    throw new NotImplementedException();
                    break;
            }
        }
        else
            throw new NotImplementedException();
    }
    else
        throw new NotImplementedException();
}

This time, it's done ! Writing "weekend.Intersect(Day.Saturday)" or "weekend.Contains(Day.Saturday)" does not call the default implementation any more but our implementation using binary operations.

I remind you that our enumeration supports naturally the Linq syntax:

 var q =
    from day in weekend.Intersect(set1)
    select day;

foreach (Day d in q)
    Console.WriteLine(d);

Conclusion

  • Enumerating values is efficient.
  • The nullable like structure MyEnum<T> costs a little due to operator overriding.
  • Extending Linq to support our operations is a nice exercise. Things are more efficient than using default extension methods but stays far away from the basic efficiency.
  • However, considering that Enums are not huge datas, you can choose to use this solution given the clarity of the code.
  • This remains a small sample and the whole possible extensions are not implemented, but the main technical issues are solved. 

The whole solution (including source code of my previous post) is attached in this post.

Mitsu

PS: My next post should talk about binding and data virtualization.

LinqToBits.zip