Functional list processing in C# 2.0 with anonymous delegates

Article
06/30/2006

One of the benefits of functional languages is their great flexibility in list manipulation, which enables them to express certain computations concisely that would require one or more verbose loops in procedural languages. Many of the features that functional programmers take for granted such as first class functions, persistent data structures, and garbage collection were included in Lisp partly for the purpose of convenient list processing (Lisp stands for "LISt Processing").

Although C# is not a functional language, in C# 2.0 and the accompanying .NET Framework 2.0, a number of constructs and library calls have been added that allow generic List objects to be manipulated in a more functional way. The most important of these is anonymous delegates, which are the same construct that functional programmers call closures. We start out with the following method, which removes all elements from a list that match a given condition, passed in as a delegate:

     delegate bool Predicate<T>(T arg);

    static void RemoveAll<T>(List<T> list, Predicate<T> condition)
    {
        int i=0;
        while (i < list.Count)
        {
            if (condition(list[i]))
                list.RemoveAt(i);
            else
                i++;
        }
    }

The use of the word "predicate" here comes from the mathematical predicate, which is just a function that yields a boolean true/false value.

Now say we have a list of integers, and we want to remove all the multiples of x, where x is a given number. In C# 1.0, using RemoveAll for this is so no easy task, requiring a temporary object:

     class RemoveMultiplesOfHelper
    {
        private int x;
        public RemoveMultiplesOfHelper(int x)
        {
            this.x = x;
        }
        public bool IsMultiple(int n)
        {
            return (n % x) == 0;
        }
    }

    static void RemoveMultiplesOf(List<int> list, int x)
    {
        RemoveAll<int>(list, new RemoveMultiplesOfHelper(x).IsMultiple);
    }

The advantages of reusing RemoveAll are important: it's tricky to iterate through a list while removing elements from it, and programmers often introduce subtle bugs while attempting to do so. But this code has serious problems too: it's a hassle to write, it's verbose, the "temporary" class isn't very reusable and is difficult to name, and the code is spread all over the place, making the flow of execution difficult to follow. Because of these issues, many procedural programmers opt to reinvent the wheel instead.

Anonymous delegates solve all these problems. An anonymous delegate allows you to write code right in the middle of a method which has access to all the local variables in that method, and then package it up and pass it to another method to execute later. Here's our example using anonymous delegates:

     static void RemoveMultiplesOf(List<int> list, int x)
    {
        RemoveAll<int>(list, delegate(int y) { return (y % x) == 0; });
    }

In this much briefer implementation, the delegate "captures" the local value of x at the instant of its creation, then later RemoveAll uses it in computations.

Another handy trick you can do with anonymous delegates is take a method signature that doesn't quite fit an existing delegate and translate it into one that does. For example, suppose you're sorting a List using the overload of the Sort method that takes a System.Comparison delegate. Now say you're sorting a list of strings, and you have a string comparison method that takes a boolean determining whether the comparison is case-sensitive:

     delegate int Comparison<T>(T left, T right);
    void Sort<T>(Comparison<T> comparison);
    int StringCompare(string left, string right, bool caseSensitive);

At first you appear to be in quite a pickle, as the signature of the StringCompare method does not exactly match the delegate signature Comparison<string>. Anonymous delegates make it easy to overcome this problem:

     static void SortStrings(List<string> a, bool caseSensitive)
    {
        a.Sort(delegate(string left, string right) {
            return StringCompare(left, right, caseSensitive);
        });
    }

We simultaneously match the desired delegate signature and push the decision of whether to sort case-sensitively or not to the caller, where it belongs, all with very little new code.

The advantages of anonymous delegates make functions like our Sort and RemoveAll considerably more useful. Consequently, a number of such functions were included with the generic List class in .NET 2.0:

RemoveAll(Predicate<T>): Like our RemoveAll implementation above, removes all elements from a list matching a predicate.
Sort(Comparison<T>): Mentioned above, this is a Sort method that takes a delegate. Sorting with anonymous delegates is very handy for unusual sorting orders; for example, the following sorts a list of integers in reverse order:
```
 list.Sort(delegate(int x, int y) { return y.CompareTo(x); });
```
Find(Predicate<T>): Finds the first element in a list satisfying a predicate. Handy for replacing linear searches. For example, this single line of code identifies the first file in the current directory modified in the last hour, or yields null if none:
```
 new List<string>(Directory.GetFiles(Directory.GetCurrentDirectory())).Find(
    delegate(string path) { return File.GetLastWriteTime(path) >=
                                   DateTime.Now - new TimeSpan(1, 0, 0); });
```
Exists(Predicate<T>): Determines whether a list contains an element satisfying a predicate. Besides its added readability in situations where you only care about the existence of a value, it also has another use: since "Find" returns the default value for the type T to indicate "not found", you need to use Exists for predicates that might match null or zero, such as:
```
 bool containsEven = list.Exists(delegate(int x) { return (x % 2) == 0; });
```
TrueForAll(Predicate<T>): Similar to Exists, but determines whether the predicate is satisfied by all elements of the list. For example, this line of code might run some tests and see if they all pass:
```
 if (tests.TrueForAll(delegate(Test t) { t.Run(); return t.SuccessCode == 0; })
```
FindAll(Predicate<T>): Creates a new list containing all elements satisfying a predicate. In functional languages this is sometimes called select or filter, as we're "filtering out" the parts of the list not satisfying the predicate. Here's an example which, in a single line, gets a list of all processes using more than k threads:
```
 List<Process> x = new List<Process>(Process.GetProcesses()).FindAll(
    delegate(Process p) { return p.Threads.Count > k; });
```
FindIndex(Predicate<T>): Just like Find, except that instead of returning the first value that satisfies the predicate, it returns its index in the list. For example, this expression yields the index of the first alphabetical character in a string:
```
 new List(s.ToCharArray()).FindIndex(Char.IsLetter)
```
FindLastIndex(Predicate<T>): Same as FindIndex, except that it finds the last occurrence satisfying the predicate.
ForEach(Predicate<T>): Simply executes the given delegate on each member of the list. Although just using a foreach loop is cleaner than using ForEach with an anonymous delegate, ForEach is still handy if you already have a named delegate, as in this example that deletes all the files in the current directory:
```
 new List<string>(Directory.GetFiles(Directory.GetCurrentDirectory())).ForEach(File.Delete);
```
ConvertAll(Predicate<T>): Perhaps the most powerful of the delegate-based List methods, ConvertAll feeds each element of the list through a conversion function, producing a new list. In functional languages this is often called map because, like a mathematical function, it "maps" one set to another. Here's a simple example which produces a list of strings that are the lowercased versions of the strings in the original input list:
```
 list.ConvertAll(delegate(string s) { return s.ToLower(); });
```

What's more, besides being individually useful, complex transformations can be achieved by chaining these methods together in clever ways. For example, here's two lines of code that produce a list of all files larger than a given size in a given directory:

     static List<string> GetBigFiles(string directory, int bigLength)
    {
        List<string> paths = new List<string>(Directory.GetFiles(directory));
        return paths.ConvertAll<FileStream>( File.OpenRead )
                    .FindAll( delegate(FileStream f) { return f.Length >= bigLength; } )
                    .ConvertAll<string>( delegate(FileStream f) { return f.Name; } );
    }

The first ConvertAll opens all the files by using the static library method "File.OpenRead" as the converter. The FindAll filters out just the FileStream objects corresponding to large files. The final ConvertAll extracts the filenames from each stream. Each of these "list filters" is independently reusable. At first this way of programming may seem unusual or cumbersome, but just compare the size of the above method to a procedural implementation.

Don't forget also that you can write your own methods taking delegates to reap more benefits from anonymous delegates. A frequently useful application is for error handling and debugging code. Suppose you find yourself writing several methods that all look like this:

     void foo()
    {
        Debug.Write("entering foo()");
        try
        {
            // Do some stuff
        }
        catch (Exception e)
        {
            // Do some stuff
            throw;
        }
        Debug.Write("exiting foo()");
    }

It's not clear how to factor out the two "Do some stuff" parts, since they're nested right in the middle of the method. Once again, anonymous delegates are the answer. We can write a method like this taking delegates for the parts that get filled in:

     delegate void DoAction();
    delegate bool ProcessException(Exception e);
    void DebugInvoke(string name, DoAction action,
                     ProcessException processException)
    {
        Debug.Write("entering " + name);
        try
        {
            a();
        }
        catch (Exception e)
        {
            if (processException(e))
            {
                throw;
            }
        }
        Debug.Write("exiting " + name);
    }

Now we can mix-and-match try bodies and catch bodies at will, and get the tracing for free:

     void foo(int x, int y)
    {
        int z = 0;
        DebugInvoke("addFoo", delegate() { z = x + y; }, OverflowHandler);
        DebugInvoke("addBar", delegate() { z = z + x; },
                    delegate(Exception e) { Debug.Assert(false); return true; });
        DebugInvoke("addBaz", delegate() { y = z + z; }, OverflowHandler);
    }

    static bool OverflowHandler(Exception e)
    {
        if (e is OverflowException)
        {
            Debug.Write("Encountered overflow");
            return true;
        }
        return false;
    }

Unfortunately, anonymous delegate syntax, while much more concise than the alternative, is still a bit verbose, requiring not only the word "delegate" but full argument lists. One of the goals of the next version of C# is to encourage the use of these functional idioms, partly by introducing a new closure syntax that will take advantage of type inference to overcome this syntactic overhead.

Just remember to keep this rule of thumb in mind: if you're iterating over a list, and the body of the foreach is pretty small, chances are you don't need to. See if you can do the same thing more concisely and with less risk using a combination of the above methods and anonymous delegates (and amaze your coworkers). I hope you found this article helpful.

Functional list processing in C# 2.0 with anonymous delegates

Additional resources