captivated foreach statements

I am sure that most people who use C# regularly use the foreach construct. Intimately understanding this construct is not usually required for proper usage. The syntax is straightforward and it would seem that the semantics would also be straightforward; however, beginning in C# 2.0 there is a subtle gotcha that can cause seemingly odd behavior. Take this code for example:

using

System;

class Program {
private delegate void Proc();

static void Main(string[] args) {
int[] numbers = { 0, 1, 2, 3, 4, 5 };
Proc[] actions = new Proc[numbers.Length];
int current = 0;

foreach (int number in numbers) {
actions[current] = delegate { Console.WriteLine(number); };
actions[current++]();
}
}
}

You would expect that this program would just list the numbers zero through five on sequential lines. This is exactly what the program does; however, what will this slight modification do to the output:

using

System;

class Program {
private delegate void Proc();

static void Main(string[] args) {
int[] numbers = { 0, 1, 2, 3, 4, 5 };
Proc[] actions = new Proc[numbers.Length];
int current = 0;

foreach (int number in numbers) {
actions[current++] = delegate { Console.WriteLine(number); };
}

foreach (Proc action in actions) {
action();
}
}
}

So what is the output to this program. It just repeats the number five on six lines. Why? Well, there are two important concepts to grasp. First, how the foreach statement works and second how anonymous methods "capture" local variables. A quick glance at the ECMA spec tells us that the foreach statement is rewritten like this:

foreach

(T item in collection) {
...statements...
}

becomes (if collection implements IEnumerable<T> other situations are similar)...

IEnumerator

<T> enumerator = ((IEnumerable<T>)(collection)).GetEnumerator();
try {
T element;
while (enumerator.MoveNext()) {
element = enumerator.Current;
...statements...
}
} finally {
enumerator.Dispose();
}

Note that the declaration of element is outside of the loop. It could have been in the loop and in most cases the placement does not matter, but the spec says that it is outside of the loop. Now we move onto anonymous methods.

Anonymous methods capture local variables by putting them on the heap instead of on the stack. A private nested class is generated by the compiler that contains the local variables that are captured. This class is then instantiated when the local variables come into scope. Since the CLR does not provide native support for anonymous methods, these methods are rewritten in constructs that do exist. Here is what the code looks like when the anonymous method and foreach statements are rewritten.

using

System;
using System.Collections;

class Program {
private delegate void Proc();

static void Main(string[] args) {
int[] numbers = { 0, 1, 2, 3, 4, 5 };
Proc[] actions = new Proc[numbers.Length];
int current = 0;

IEnumerator enumerator1 = numbers.GetEnumerator();
try {
Locals1 locals1 = new Locals1();
while (enumerator1.MoveNext()) {
locals1.number = (int)enumerator1.Current;
actions[current++] = new Proc(locals1.Method1);
}
} finally {
IDisposable disposable = enumerator1 as IDisposable;
if (disposable != null) disposable.Dispose();
}

IEnumerator enumerator2 = actions.GetEnumerator();
try {
Proc action;
while (enumerator2.MoveNext()) {
action = (Proc)enumerator2.Current;
action();
}
} finally {
IDisposable disposable = enumerator2 as IDisposable;
if (disposable != null) disposable.Dispose();
}
}

class Locals1 {
public int number;

public void Method1() {
Console.WriteLine(number);
}
}
}

Since the locals are created outside of the loop, the same instance of locals1 is being used by all of the delegates. This leads to them all displaying the same value. How can we fix this? Here is how:

using System;

class Program {
private delegate void Proc();

static void Main(string[] args) {
int[] numbers = { 0, 1, 2, 3, 4, 5 };
Proc[] actions = new Proc[numbers.Length];
int current = 0;

foreach (int number in numbers) {
int temp = number;
actions[current++] = delegate { Console.WriteLine(temp); };
}

foreach (Proc action in actions) {
action();
}
}
}

With our newfound knowledge, we can implement a parallelized version of the Map and ForEach methods. First, lets start with the serial versions.

using

System;
using System.Collections.Generic;

class Program {
delegate T Func<A0,T>(A0 arg0);
delegate void Proc<A0>(A0 arg0);

static IEnumerable<U> Map<T,U>(IEnumerable<T> list, Func<T,U> func) {
foreach (T item in list) {
yield return func(item);
}
}

static void ForEach<T>(IEnumerable<T> list, Proc<T> proc) {
foreach (T item in list) {
proc(item);
}
}

static void Main() {
int[] numbers = { 0, 1, 2, 3, 4, 5 };
Func<int,int> squareInt = delegate(int x) { return x * x; };
Proc<int> displayInt = delegate(int x) { Console.WriteLine(x); };

ForEach(Map(numbers, squareInt), displayInt);
}
}

This displays the squares of the integer values zero through five inclusively. Now we can parallelize them.

using

System;
using System.Threading;
using System.Collections.Generic;

class Program {
delegate T Func<A0,T>(A0 arg0);
delegate void Proc<A0>(A0 arg0);

class ThreadAdapter<T,U> {
Thread thread;
T input;
U output;
Func<T,U> func;

public ThreadAdapter(Func<T,U> func, T input) {
this.input = input;
this.func = func;
thread = new Thread(new ThreadStart(Execute));
}

void Execute() {
output = func(input);
}

public Thread Thread {
get { return thread; }
}

public U Output {
get { return output; }
}
}

static IEnumerable<U> Map<T,U>(IEnumerable<T> list, Func<T,U> func) {
List<ThreadAdapter<T,U>> threads = new List<ThreadAdapter<T,U>>();

foreach (T item in list) {
ThreadAdapter<T,U> thread = new ThreadAdapter<T,U>(func, item);
threads.Add(thread);
thread.Thread.Start();
}

foreach (ThreadAdapter<T,U> thread in threads) {
thread.Thread.Join();
yield return thread.Output;
}
}

static void ForEach<T>(IEnumerable<T> list, Proc<T> proc) {
List<Thread> threads = new List<Thread>();

foreach (T item in list) {
T temp = item;
Thread thread = new Thread(new ThreadStart(delegate { proc(temp); }));
threads.Add(thread);
thread.Start();
}

foreach (Thread thread in threads) {
thread.Join();
}
}

static void Main() {
int[] numbers = { 0, 1, 2, 3, 4, 5 };
Func<int,int> squareInt = delegate(int x) { return x * x; };
Proc<int> displayInt = delegate(int x) { Console.WriteLine(x); };

ForEach(Map(numbers, squareInt), displayInt);
}
}

Just for kicks, we might now write this code with the new C# 3.0 syntax.

using System;
using System.Query;
using System.Threading;
using System.Collections.Generic;

static class Extensions {
class ThreadAdapter<T,U> {
Thread thread;
T input;
U output;
Func<T,U> func;

public ThreadAdapter(Func<T,U> func, T input) {
this.input = input;
this.func = func;
thread = new Thread(new ThreadStart(Execute));
}

void Execute() {
output = func(input);
}

public Thread Thread {
get { return thread; }
}

public U Output {
get { return output; }
}
}

public static IEnumerable<U> Map<T,U>(this IEnumerable<T> list, Func<T,U> func) {
var threads = new List<ThreadAdapter<T,U>>();

foreach (var item in list) {
var thread = new ThreadAdapter<T,U>(func, item);
threads.Add(thread);
thread.Thread.Start();
}

foreach (var thread in threads) {
thread.Thread.Join();
yield return thread.Output;
}
}

public static void ForEach<T>(this IEnumerable<T> list, Action<T> action) {
var threads = new List<Thread>();

foreach (var item in list) {
var temp = item;
var thread = new Thread(new ThreadStart(() => { action(temp); }));
threads.Add(thread);
thread.Start();
}

foreach (var thread in threads) {
thread.Join();
}
}
}

class Program {
static void Main() {
var numbers = new [] { 0, 1, 2, 3, 4, 5 };

numbers.Map(x => x * x).ForEach(x => { Console.WriteLine(x); });
}
}

Enjoy!