C# 4 expressions: blocks [Part I]

03/02/2010

Since .NET 3.5 and LINQ, the C# compiler has been able to generate expression trees instead of standard executable IL. Even though LINQ opens the door to metaprogramming (using code to define something else, such as a SQL query), we still face a lot of limitations.

A C# expression is limited to a single instruction returning a value, given some parameters. Method calls, property accesses, constructors, and operators are allowed, but no blocks, no loops, etc.

 Expression<Action<string>> printExpression =
    s => Console.WriteLine(s);
var Print = printExpression.Compile();
Print("Hello !!!");

Of course, compiling the expression is not the only possible goal, but I will not go further into that in this article.

With .NET 4.0, the LINQ expressions implementation has moved to the DLR. The Dynamic Language Runtime uses a richer expression API that allows a lot more. We can consider that the C# 4.0 compiler uses only a subset of the DLR expressions.

So, even if C# 4.0 now allows expressions based on Actions, we still have all the other limitations.

We can therefore use the expression API programmatically to build more complex expressions, as explained in this post from Alexandra Rusina. But we won’t be able to write those complex expressions in the C# language itself.

Let’s try to find a workaround for the first big barrier: statements…

The new expression API offers the BlockExpression class to define a block of statements.
It’s quite easy to use since we just have to provide a list of expressions.

 public static BlockExpression Block(params Expression[] expressions);

Notice that we can also provide a list of variables if needed, but I will come back to this point a little later.
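To make the API concrete, here is a minimal sketch of building a block by hand with Expression.Block. The class and method names (BlockSketch, BuildTwoLinePrinter, BuildDoublePlusOne) are made up for this illustration and are not part of the article's solution. The second method uses the overload that takes a list of variables first; it is only a preview under that assumption, not the approach the next part will take.

```csharp
using System;
using System.Linq.Expressions;

public static class BlockSketch
{
    // Builds, by hand, the equivalent of:
    // s => { Console.WriteLine("I say: "); Console.WriteLine(s); }
    public static Expression<Action<string>> BuildTwoLinePrinter()
    {
        var s = Expression.Parameter(typeof(string), "s");
        var writeLine = typeof(Console).GetMethod(
            "WriteLine", new[] { typeof(string) });

        // Block(params Expression[]): the statements run in order.
        var body = Expression.Block(
            Expression.Call(writeLine, Expression.Constant("I say: ")),
            Expression.Call(writeLine, s));

        return Expression.Lambda<Action<string>>(body, s);
    }

    // Same idea with a local variable: x => { int tmp = x * 2; return tmp + 1; }
    public static Expression<Func<int, int>> BuildDoublePlusOne()
    {
        var x = Expression.Parameter(typeof(int), "x");
        var tmp = Expression.Variable(typeof(int), "tmp");

        var body = Expression.Block(
            new[] { tmp },                                   // variables of the block
            Expression.Assign(tmp, Expression.Multiply(x, Expression.Constant(2))),
            Expression.Add(tmp, Expression.Constant(1)));    // last expression is the block's value

        return Expression.Lambda<Func<int, int>>(body, x);
    }

    public static void Main()
    {
        BuildTwoLinePrinter().Compile()("Hello !!!");
        Console.WriteLine(BuildDoublePlusOne().Compile()(20)); // 41
    }
}
```

This works, but it is verbose compared to a lambda, which is exactly why a C#-level syntax for blocks is worth looking for.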

So the first very big restriction is that we can only provide one single instruction!
Now imagine we use a syntax comparable to the one explained in this article, where some instance methods always return the instance itself (this) so that we can chain them in a sequence.

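 public class Block
{
    // Marker method: never executed, only recognized by the visitor.
    public Block _(Action action)
    {
        throw new NotImplementedException();
    }

    // Start of a sequence; also a marker, so it is never assigned.
    public static Block Default
        { get; private set; }
}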

I know it’s very strange, but I’ve chosen to name my method “_”. C# allows it, so it’s legal :)
I have also defined a Default property to avoid having to create Block instances every time, and because it’s an easy signature to recognize.

Now we can write things like:

 Expression<Action<string>> exp = s =>
    Block.Default
        ._(() => Console.WriteLine("I"))
        ._(() => Console.WriteLine("would like"))
        ._(() => Console.WriteLine("to say: "))
        ._(() => Console.WriteLine(s));

Now the whole expression is valid because it is a single instruction, but it defines a collection of actions, each of which again contains a single instruction.

You can notice that there is only one “;” in my code.

Of course, my idea is to use this strange syntax to transform this expression into a BlockExpression. To achieve this I have to analyze my expression, find the Block.Default signature, remove it, and then extract the body from all the actions to finally get the collection of expressions to build my BlockExpression.

To do this, I have implemented an expression visitor. Notice that the base class is now part of .NET (through the DLR once again): System.Linq.Expressions.ExpressionVisitor.

The visitor is a heavily recursive piece of code that helps you analyze all the possible nodes of an expression tree. For this first step, I will override the VisitMethodCall method to catch my sequence of Block methods.

 protected override Expression VisitMethodCall(MethodCallExpression node)
{
    if (IsMethodOf<Block>(node))
    {
        var expressions = new List<Expression>();
        do
        {
            // Translate the current Block call into a plain expression.
            var r = VisitBlockMethodCall(node);
            // Calls are met last-to-first, so insert on top of the list.
            expressions.Insert(0, r);
            // Move up to the source of the call; null once we reach Block.Default.
            node = node.Object as MethodCallExpression;
        } while (node != null);

        return Expression.Block(expressions);
    }
    return base.VisitMethodCall(node);
}

Because each call in the sequence has exactly one source (the chain is unary), this specific part of the visitor can be written as a loop instead of a recursion, and that’s what I am doing in the do…while loop.

There is an important thing to know about method call sequences. The visitor discovers them in the reverse of the order in which they are written in C#.

For example, if I write:

 test.Do().Print();

We will discover Print() first, then Do(). It’s quite logical because (test.Do()) is the source on which Print() is called. The loop goes up the sequence as long as the source (node.Object) is still a MethodCallExpression (node != null). Notice that the extracted action bodies are inserted at the top of the list to recreate the C# syntax order.
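The discovery order can be checked on any call chain. The following sketch uses two string methods instead of the Block class; CallChainSketch and NamesInSourceOrder are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class CallChainSketch
{
    // Walks a chain of instance calls and returns the method names in source order.
    public static List<string> NamesInSourceOrder(Expression body)
    {
        var names = new List<string>();
        var call = body as MethodCallExpression;
        while (call != null)
        {
            // The last call written in the source is met first, so insert on top.
            names.Insert(0, call.Method.Name);
            // Move up to the source of the call; the loop stops on a non-call node.
            call = call.Object as MethodCallExpression;
        }
        return names;
    }

    public static void Main()
    {
        Expression<Func<string, string>> chain = s => s.Trim().ToUpper();

        // The root of the body is the last call of the chain.
        Console.WriteLine(((MethodCallExpression)chain.Body).Method.Name);    // ToUpper
        Console.WriteLine(string.Join(", ", NamesInSourceOrder(chain.Body))); // Trim, ToUpper
    }
}
```

Here the walk stops when the source is the parameter s; in the Block case it stops on the Block.Default property access, which is not a method call either.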

Of course we are doing all this work only if the method is declared by the Block type. The IsMethodOf helper method is used here.

 private bool IsMethodOf<T>(MethodCallExpression node, string methodName)
{
    if (node == null)
        return false;
    return ((node.Method.DeclaringType 
        == typeof(T))
        && (node.Method.Name == methodName));
}
private bool IsMethodOf<T>(MethodCallExpression node)
{
    if (node == null)
        return false;
    return (node.Method.DeclaringType 
        == typeof(T));
}
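One detail worth keeping in mind with this kind of test: DeclaringType is the type that declares the method, which is not necessarily the type of the object it is called on. The sketch below illustrates this with a static copy of the helper; DeclaringTypeSketch is a name invented for the example. It does not matter for Block as defined here, but it could if Block were ever given a base class carrying some of the marker methods.

```csharp
using System;
using System.Linq.Expressions;

public static class DeclaringTypeSketch
{
    // Same test as IsMethodOf<T>(node), made static for the example.
    public static bool IsMethodOf<T>(MethodCallExpression node)
    {
        return node != null && node.Method.DeclaringType == typeof(T);
    }

    public static void Main()
    {
        Expression<Func<string, string>> trim = s => s.Trim();
        Expression<Func<string, Type>> getType = s => s.GetType();

        // Trim is declared by System.String.
        Console.WriteLine(IsMethodOf<string>(trim.Body as MethodCallExpression));    // True
        // GetType is called on a string but declared by System.Object.
        Console.WriteLine(IsMethodOf<string>(getType.Body as MethodCallExpression)); // False
        Console.WriteLine(IsMethodOf<object>(getType.Body as MethodCallExpression)); // True
    }
}
```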

For this first step, I am only looking for the “_” method, but I will later add more features to the Block class. That’s why I have isolated the method recognition in a separate method: VisitBlockMethodCall.

 private Expression VisitBlockMethodCall(MethodCallExpression node)
{
    if (IsMethodOf<Block>(node, "_"))
        return Visit((node.Arguments[0] as
            LambdaExpression).Body);
    ...
}

As I know the only argument is an Action (a lambda expression in our tree), I just extract its body, and I do not forget to apply the visitor to it before returning it (so the visitor logic can continue on this branch).

Once I have collected all those expressions, I just have to build and return Expression.Block(expressions). “Block.Default” is naturally skipped at this point.

Important point: none of the Block members matter in the end, because their only goal is to be removed by this transformation step. After the transformation, they must all have disappeared. I just consider them as markers for my transformation engine (metadata, not code). That’s also why they are never implemented (throw new NotImplementedException()). BUT whatever transformation we apply to the expression, the source MUST respect the C# syntax in the first place.

Now we have to apply the visitor on our sample expression.

 Expression<Action<string>> expWithBlock = s =>
    Block.Default
        ._(() => Console.WriteLine("I"))
        ._(() => Console.WriteLine("Would like"))
        ._(() => Console.WriteLine("To say: "))
        ._(() => Console.WriteLine(s));


expWithBlock = 
    ExpressionHelper.Translate(expWithBlock);
expWithBlock.Compile()("Hello !!!");

 public static class ExpressionHelper
{
    public static Expression<TDelegate> Translate<TDelegate>(Expression<TDelegate> expression)
    {
        var visitor = 
            new BlockCompilerVisitor<TDelegate>();
        return visitor.StartVisit(expression);
    }
}
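The article does not list BlockCompilerVisitor itself, so here is a minimal end-to-end sketch of what it could look like for this first step. The body of StartVisit is an assumption (it rewrites the lambda body and rebuilds the lambda with the original parameters), only the “_” method is handled, and the real class in the downloadable solution may well differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public class Block
{
    public Block _(Action action) { throw new NotImplementedException(); }
    public static Block Default { get; private set; }
}

// Assumed shape: the article names this class and StartVisit but does not show them.
public class BlockCompilerVisitor<TDelegate> : ExpressionVisitor
{
    public Expression<TDelegate> StartVisit(Expression<TDelegate> expression)
    {
        // Rewrite the body only and keep the original parameters.
        var body = Visit(expression.Body);
        return Expression.Lambda<TDelegate>(body, expression.Parameters);
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType != typeof(Block))
            return base.VisitMethodCall(node);

        var expressions = new List<Expression>();
        for (var call = node; call != null; call = call.Object as MethodCallExpression)
        {
            // Only "_" exists in this sketch: its single argument is the action lambda.
            var action = (LambdaExpression)call.Arguments[0];
            // Visit the body so nested Block sequences are translated too.
            expressions.Insert(0, Visit(action.Body));
        }
        return Expression.Block(expressions);
    }
}

public static class Program
{
    public static void Main()
    {
        Expression<Action<string>> exp = s =>
            Block.Default
                ._(() => Console.WriteLine("I"))
                ._(() => Console.WriteLine("would like"))
                ._(() => Console.WriteLine("to say: "))
                ._(() => Console.WriteLine(s));

        var translated = new BlockCompilerVisitor<Action<string>>().StartVisit(exp);
        Console.WriteLine(translated.Body.NodeType); // Block
        translated.Compile()("Hello !!!");
    }
}
```

Note that the inner lambdas refer to the outer parameter s directly, so once their bodies are moved into the block they are still bound to the parameter of the rebuilt lambda.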

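and we get the four lines “I”, “Would like”, “To say: ” and “Hello !!!” written to the console, in that order.

In this very first step we managed to create multiline C# expressions based on actions.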

Visual Studio 2010 has a very useful new debug viewer for expressions that you can call at debug time.

Here is our expression before…

 .Lambda #Lambda1<System.Action`1[System.String]>(System.String $s) {
    .Call (.Call (.Call (.Call (CSharp4Expressions.Block.Default)._(.Lambda #Lambda2<System.Action>))._(.Lambda #Lambda3<System.Action>)
    )._(.Lambda #Lambda4<System.Action>))._(.Lambda #Lambda5<System.Action>)
} 

.Lambda #Lambda2<System.Action>() {
    .Call System.Console.WriteLine("I")
} 

.Lambda #Lambda3<System.Action>() {
    .Call System.Console.WriteLine("Would like")
} 

.Lambda #Lambda4<System.Action>() {
    .Call System.Console.WriteLine("To say: ")
} 

.Lambda #Lambda5<System.Action>() {
    .Call System.Console.WriteLine($s)
}

…and after transformation.

 .Lambda #Lambda1<System.Action`1[System.String]>(System.String $s) {
    .Block() {
        .Call System.Console.WriteLine("I");
        .Call System.Console.WriteLine("Would like");
        .Call System.Console.WriteLine("To say: ");
        .Call System.Console.WriteLine($s)
    }
}

It’s not finished, but it’s enough for a single post.

The next part will propose a solution for creating variables, and then cover other expression API features like Loop, Goto, Label, Assign and even new features like For.

You can get the whole solution here: https://code.msdn.microsoft.com/CSharp4Expressions
