In a previous blog post I showed how a 2D convolver can be written in F# using Microsoft’s Accelerator system for GPGPU and SIMD x64 multicore programming targets. This blog article explains how the same 2D convolver can also be nicely expressed in C# and it illustrates a few cool C# and LINQ features including type inference, extension methods, the indexed Select operator and the Aggregate operator. I also give a more old school C# implementation at the end of this blog article. First, here is the code:

using System; using System.Linq; using Microsoft.ParallelArrays; namespace AcceleratorSamples { static partial class Convolver2D { static FloatParallelArray convolve(this FloatParallelArray a, Func<int, int[]> shifts, float[] kernel) { return kernel .Select((k, i) => k * ParallelArrays.Shift(a, shifts(i))) .Aggregate((a1, a2) => a1 + a2); } static FloatParallelArray convolveXY(this FloatParallelArray input, float[] kernel) { return input .convolve(i => new[] { -i, 0 }, kernel) .convolve(i => new[] { 0, -i }, kernel); } static void Main(string[] args) { const int inputSize = 10; var random = new Random(42); var inputData = new float[inputSize, inputSize]; for (int row = 0; row < inputSize; row++) for (int col = 0; col < inputSize; col++) inputData[row, col] = (float)random.NextDouble() * random.Next(1, 100); var testKernel = new[] { 2F, 5, 7, 4, 3 }; var dx9Target = new DX9Target(); var inputArray = new FloatParallelArray(inputData); var result = dx9Target.ToArray2D(inputArray.convolveXY(testKernel)); for (int row = 0; row < inputSize; row++) { for (int col = 0; col < inputSize; col++) Console.Write("{0} ", result[row, col]); Console.WriteLine(); } } } }

This version of the code was suggested by Mads Torgersen. I won’t explain the details of how 2D convolution is implemented here and I kindly ask you to first look over the original F# blog article. To recap a little, a 1D convolver computes a weighted average around each element of an array (or stream):

A 2D convolver works by first performing a convolution the x-direction and then performing a convolution in the y-direction. An x-direction convolution is shown below.

There are two basic operations that we need to express here:

- A multiplication of a shifted 2D input matrix by the i-th element of a kernel 1D matrix. This is accomplished with an indexed Select operator.
- The addition of all the elements in a row (or column) of a 2D matrix. This is accomplished with an Aggregate operator.

First let me explain what C# extension methods are by using the code below as a demonstration.

using System; namespace Extension_Methods_Demonstration { public static class Extension_Methods_Experiment { public static int DoubleFunction(int i) { return 2 * i; } public static int Double(this int i) { return 2 * i; } static void Main(string[] args) { int x = 12; Console.WriteLine(DoubleFunction(x)); Console.WriteLine(x.Double()); } } }

Here we have two ways of encapsulating a computation that doubles an int value. The first approach uses a regular static method method DoubleFunction which takes an int value as an argument and returns an int value as a result i.e. something that behaves like a function. It might be nice to also express the doubling operation as a regular method invocation on variables of type int. However, we don’t always have access to the definition for all the types we want to extend in this way. Extension methods allow us to express functions over types even when we don’t have access to the definition of the type. The syntax is very similar to the syntax used for static methods however the type of the object being extended is indicated as the first parameter and is identified with the **this** keyword. The program above illustrates both the static method style invocation of a double operation as well as the invocation of a Double method on an variable of type int.

Before we explain what an indexed Select operator does we first demonstrate what a regular Select operator does:

using System; using System.Linq; class Select_Demonstration1 { static void Main(string[] args) { int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 }; var y = numbers.Select(n => 2 * n); foreach (var i in y) Console.Write(i + " "); Console.WriteLine(); } }

When run this program produces the output:

10 8 2 6 18 16 12 14 4 0 |

Looking at the example inside out, let us focus on the lambda expression n => 2 * n first. This defines an anonymous function (static method) which takes an int argument n and returns an int result which is twice the value of the input value. It is short-hand for:

delegate (int n) { return 2 * n; }

except the lambda expression is a little shorter and all the types are inferred. The lambda expression form is inspired by lambda expressions in the lambda calculus.

The code snippet above uses the Select operator to apply the same operation (function) to event element of an array to yield a new collection. The operation to be performed is specified here using a lambda expression which just doubles its input. This is very much like the higher order map function in functional languages e.g. List.map in F#. The result of applying the Select operator is a variable y that implements the IEnumerable interface which allows us to use this variable in **foreach** loops. Note that we have used the **var **keyword to indicate that we wish the type of y to be inferred. If you are a functional programmer you might assume that Select is exactly like map then you would expect the type of y to be int**[]** but sadly you would be wrong (but let’s not get bogged down with that here). The overloaded type of Select used here is:

public static IEnumerable<TResult> Select<TSource, TResult>

(this IEnumerable<TSource> source,

Func<TSource, TResult> selector);

Next we turn out attention to an extension method called indexed Select. This operator can be used to apply some operation to every element of a collection (e.g. an array) but unlike the basic Select this operator also provides the index value of each element. The argument to select is a function that takes a pair that contains an element and its index taken from the source array. The result of the function specifies the value at the corresponding index in the output collection. Here’s an example that multiplies each value in an array by its index value:

using System; using System.Linq; class Select_Demonstration2 { static void Main(string[] args) { int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 }; var y = numbers.Select((n, i) => n * i); foreach (var i in y) Console.Write(i + " "); Console.WriteLine(); } }

which produces the output:

0 4 2 9 36 40 36 49 16 0 |

i.e. 5*0, 4*1, 1*2, 3*3, 9*4 etc.

This version of Select has the overloaded type:

static IEnumerable<TResult> Select<TSource, TResult>

(this IEnumerable<TSource> source,

Func<TSource, int, TResult> selector);

Now we can begin to understand what the Select portion of the convolve method does:

static FloatParallelArray convolve(this FloatParallelArray a, Func<int, int[]> shifts, float[] kernel) { return kernel .Select((k, i) => k * ParallelArrays.Shift(a, shifts(i))) .Aggregate((a1, a2) => a1 + a2); }

First we see that convolve is an extension method because the first parameter is preceded by the **this** keyword. In particular this method extends the FloatParallelArray type with this convolve method. The convolve method takes two arguments:

- Func<int, int[]> shifts: this function takes as input an int value which specified the index value of a kernel element and it returns a two element array which specifies the shifts in the x-direction and the y-direction.
**float**[] kernel: this is the 1D kernel array

We now see that the Select operator multiplies each kernel value by the input array shifted according to some function that is specified as an argument to convolve. Let’s consider a specific invocation:

kernel.Select((k, i) => k * ParallelArrays.Shift(a, i => new[] { 0, -i }));

Here we use a lambda expression to indicate the shift degree in the x-direction should match the index value of the current kernel element and the shift degree in the y-direction is always zero. This helps to implement a portion of a 1D convolution of a 2D input array as shown below.

To complete the 1D convolution of the resulting collection of 2D arrays we need to add these arrays together in a point-wise manner. We can achieve this with the extension method Aggregate:

public static TSource Aggregate<TSource>

(this IEnumerable<TSource> source, Func<TSource, TSource, TSource> func);

The Aggregate extension method is very similar to a left fold in functional languages e.g. List.fold in F#. Informally, this method takes a collection of some values e.g. {A, B, C, D} and combines them with some operator, say ?, by computing ((A ?B) ?C) ? D). For example, we can specify the + operation to add up a collection of numbers {A, B, C, D} using Aggregate which computes A + B + C + D. So to compute the sum of the collection of 2D arrays produced by the Selectoperator we simply need to apply the opertator with a lambda expression that adds two FloatParallelArrays. We simply used the + operator which is overloaded for FloatParallelArray.An example of the application of Aggregate to the result of the collection produced by the previous Select is shown below.

This completes the description of the convolve extension method which implements a 1D convolution of a 2D input array. Now we define a 2D convolution convolveXY extension method by specifying a convolution in the y-direction followed by a convolution in the x-direction:

static FloatParallelArray convolveXY(this FloatParallelArray input, float[] kernel) { return input .convolve(i => new[] { -i, 0 }, kernel) .convolve(i => new[] { 0, -i }, kernel); }

Note the use of anonymous array initializers i.e. the type of the two element arrays is not explicitly specified.

The Main method uses fairly straight forward C# code to create some test input data, create a DX9 Accelerator target and run and display the results. The code makes extensive use of **var **to infer the types of variables.

One advantage of making convolve and convolveXY into extension methods it that we can use the nice fluent style of composition which allows us to chain together transformations over collections of FloatParallelArrays.

For comparison, here is another more straight forward C# implementation of the convolver.

using System using Microsoft.ParallelArrays; namespace AcceleratorSamples { partial class Convolver2D { static FloatParallelArray convolve(Func<int, int[]> shifts, float[] kernel,

int i, FloatParallelArray a) { FloatParallelArray e = kernel[i] * ParallelArrays.Shift(a, shifts(i)); if (i == 0) return e; else return e + convolve(shifts, kernel, i - 1, a); } static FloatParallelArray convolveXY(float[] kernel, FloatParallelArray input) { FloatParallelArray convolveX

= convolve(i => new int[] { -i, 0 }, kernel, kernel.Length - 1, input); return convolve(i => new int[] { 0, -i }, kernel, kernel.Length - 1, convolveX); } static void Main(string[] args) { int inputSize = 10; Random random = new Random(42); float[,] inputData = new float[inputSize, inputSize]; for (int row = 0; row < inputSize; row++) for (int col = 0; col < inputSize; col++) inputData[row, col] = (float)random.NextDouble() * random.Next(1, 100); float[] testKernel = new float[]{2, 5, 7, 4, 3} ; DX9Target dx9Target = new DX9Target(); FloatParallelArray inputArray = new FloatParallelArray(inputData); float[,] result = dx9Target.ToArray2D(convolveXY (testKernel, inputArray)); for (int row = 0; row < inputSize; row++) { for (int col = 0; col < inputSize; col++) Console.Write("{0} ", result[row, col]); Console.WriteLine(); } } } }

Thanks satnam; great article.

I wish I could use the Accelerator in commercial apps.

Hi, All!

TIP: To compile Accelerator 2 with Visual Studio 11 Developer Preview you can do these 3 simple steps:

1. Create default project (.NET 4.5 Console Application)

2 Copy the content of your bin directory of Accelerator 2 to created Lib folder in your project's directory

3 Add the Post-Build Event description in that console project Options pages:

xcopy /f /s /y "$(ProjectDir)"Libx86*.dll "$(TargetDir)"

P.S.:

Tested under clean Windows 8 DP with VS11 DP installed + NVIDIA GPU Computing SDK 4.0