Introducing Parallel Extensions to the .NET Framework

There is no escaping from concurrency challenges... or is there?

(A slightly modified version of this article was published in the August 2008 edition of the MSDN Flash newsletter)

Dual-, quad-, and eight-core processors are becoming the norm. Is your application capable of utilising all available processors? To achieve this level of utilisation on an n-processor machine, an application ideally needs at least n threads performing operations concurrently.

Although writing multi-threaded applications has become simpler over the years, many developers still find it challenging. Also, many existing constructs, such as the .NET ThreadPool, have deficiencies that make them less suitable as the shift to manycore begins to accelerate.

Over the past few years, Microsoft has been hard at work developing a new set of parallel extensions for .NET, aptly named Parallel Extensions to the .NET Framework. One of the additions is a set of lightweight, scalable, thread-safe data structures (such as a concurrent dictionary) and synchronization primitives (such as a spin lock).
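
As a rough illustration of what a thread-safe data structure buys you, the sketch below counts words from several threads at once without any explicit lock. It assumes the ConcurrentDictionary type and the System.Collections.Concurrent namespace as found in released versions of the .NET Framework; names in the CTP may differ. WordCounter is a hypothetical helper, not part of the library.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class WordCounter
{
    // Hypothetical helper: counts how often each word occurs.
    // AddOrUpdate is atomic per key, so no explicit lock is needed
    // even though many threads update the dictionary at once.
    public static ConcurrentDictionary<string, int> Count(string[] words)
    {
        var counts = new ConcurrentDictionary<string, int>();
        Parallel.ForEach(words, w =>
            counts.AddOrUpdate(w, 1, (key, old) => old + 1));
        return counts;
    }
}
```

With a plain Dictionary the same loop could corrupt the collection, because concurrent writes to it are not safe.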

Parallel Extensions also includes an implementation of LINQ-to-Objects that automatically executes queries in parallel, scaling to utilise most or all of the available processors without developers explicitly managing the distribution of work across them. This technology is called Parallel LINQ, or PLINQ, and its query syntax is almost identical to that of LINQ-to-Objects:

var q = from c in customers.AsParallel()
        join r in regions on c.RegionID equals r.RegionID
        where c.City == "London" && r.Name == "Kingston"
        select c;
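
To try a query like the one above end to end, here is a minimal self-contained sketch. The Customer and Region classes and the PlinqDemo helper are assumptions made for illustration. The second source is also wrapped in AsParallel(), which released versions of PLINQ expect for joins, and an orderby is added because a parallel query does not otherwise promise any particular result order.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical data types, assumed for this sketch only.
public class Customer { public string Name; public string City; public int RegionID; }
public class Region { public int RegionID; public string Name; }

public static class PlinqDemo
{
    // Returns the names of London customers whose region is Kingston.
    public static List<string> LondonCustomersInKingston(
        IEnumerable<Customer> customers, IEnumerable<Region> regions)
    {
        var q = from c in customers.AsParallel()
                join r in regions.AsParallel() on c.RegionID equals r.RegionID
                where c.City == "London" && r.Name == "Kingston"
                orderby c.Name   // makes the output order deterministic
                select c.Name;
        return q.ToList();
    }
}
```

Removing the two AsParallel() calls turns this back into an ordinary LINQ-to-Objects query with the same results.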

When a repetitive computation can be parallelised, Parallel Extensions to the .NET Framework offers data-oriented operations such as For and ForEach that execute a loop whose iterations may run in parallel on parallel hardware. For example, the following classic foreach loop can easily be converted to a parallel foreach:

Sequential execution:

foreach (var f in formulae)
  Calculate(f);

Possible parallel execution:

Parallel.ForEach(formulae, f => Calculate(f));
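
One thing the conversion changes is that iterations no longer run in a known order, so anything they write to must be safe for concurrent use. The sketch below shows one way to collect results. It assumes the released Parallel.ForEach and ConcurrentBag APIs, and Calculate here is a hypothetical stand-in that simply squares its input.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ParallelLoopDemo
{
    // Hypothetical stand-in for the article's Calculate.
    static long Calculate(int f) { return (long)f * f; }

    public static long SumOfSquares(int[] formulae)
    {
        // ConcurrentBag accepts Add calls from many threads at once.
        var results = new ConcurrentBag<long>();
        Parallel.ForEach(formulae, f => results.Add(Calculate(f)));

        // Back on a single thread: combine the partial results.
        long total = 0;
        foreach (var r in results) total += r;
        return total;
    }
}
```

The sum is the same whichever order the iterations ran in, which is what makes this loop a good candidate for parallel execution.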

 

The actual parallelisation of actions is managed by the Task Parallel Library (TPL), which provides an abstraction layer on top of raw threads. TPL is built around the notion of a Task: the smallest unit of work, which could potentially be executed by any thread owned by TPL. Correctly identifying tasks that could be executed in parallel is therefore crucial. It is then the job of the scheduler component of TPL to decide whether those tasks should execute sequentially or in parallel. This decision is usually based on available system resources and possible user preferences.

// Create a task to call Calculate(object o) and schedule it for execution
Task t = Task.Create(Calculate);

// Other activities...

// Wait for this task to complete
t.Wait();
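
Task.Create above is the CTP-era API. In released versions of the framework a task is typically started with Task.Run or Task.Factory.StartNew instead, so a sketch of the same create, do other work, then wait pattern might look like the following. TaskDemo and its Calculate method are hypothetical names used only for illustration.

```csharp
using System.Threading.Tasks;

public static class TaskDemo
{
    // Hypothetical unit of work: sums the integers 1..n.
    static int Calculate(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    public static int RunAndWait(int n)
    {
        // Hand the work to the scheduler; it chooses the thread.
        Task<int> t = Task.Run(() => Calculate(n));

        // Other activities could run here while the task executes.

        // Block until the task has completed, then read its result.
        t.Wait();
        return t.Result;
    }
}
```

Using Task<int> rather than plain Task lets the caller read the computed value once the wait returns.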

Developing concurrent applications requires a shift in mindset. For example, only a single exception can be thrown at any one time when sequentially executing a for loop. However, executing the same loop in parallel may require dealing with multiple concurrent exceptions. There are also a few parallelism blockers that should be avoided if you are planning to benefit from Parallel Extensions to the .NET Framework when it is released.
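
To make the exception point concrete, the sketch below shows how several failures from one parallel loop can surface together. It assumes the AggregateException type used by released versions of the framework, which wraps every exception thrown by the iterations that ran. ExceptionDemo is a hypothetical helper.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionDemo
{
    // Returns how many iterations failed (0 if none did).
    // The loop stops starting new iterations after a failure, so with
    // several bad inputs the count is at least 1 but may be lower
    // than the number of bad inputs.
    public static int CountFailures(int[] inputs)
    {
        try
        {
            Parallel.ForEach(inputs, x =>
            {
                if (x < 0)
                    throw new ArgumentOutOfRangeException("x", "negative input: " + x);
            });
            return 0;
        }
        catch (AggregateException ae)
        {
            // One wrapper exception carries all the individual failures.
            return ae.Flatten().InnerExceptions.Count;
        }
    }
}
```

A sequential loop would stop at the first bad input and report only that one, whereas here the caller has to be ready to inspect a collection of exceptions.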

There is a lot of promise in Parallel Extensions. If you want to find out more, I’d encourage you to download the June 2008 Community Technology Preview build.