Ordering the output of parallel computations

Stephen Toub - MSFT

Frequently when attempting to do multiple operations in parallel, ordering becomes an issue.  Consider an application where I’m rendering and writing out to a video file frames of a movie:

for (int i = 0; i < numberOfFrames; i++)
{
    var frame = GenerateFrame(i);
    WriteToMovie(frame);
}

For a bit of pizzazz, I’ll show the same thing with LINQ:

var frames = from i in Enumerable.Range(0, numberOfFrames)
             select GenerateFrame(i);
foreach (var frame in frames) WriteToMovie(frame);

Now, my GenerateFrame method is expensive and computationally intensive, so I’d like to generate frames in parallel. For this example, we’ll assume that my GenerateFrame method is thread-safe and can be called concurrently (rendering one frame doesn’t modify any state used to render another frame), but access to my WriteToMovie method must be serialized:

using (ManualResetEvent mre = new ManualResetEvent(false))
{
    int count = numberOfFrames;
    object obj = new object();
    for (int i = 0; i < numberOfFrames; i++)
    {
        ThreadPool.QueueUserWorkItem(state =>
        {
            var frame = GenerateFrame((int)state);
            lock(obj) WriteToMovie(frame);

 

            if (Interlocked.Decrement(ref count) == 0)
                mre.Set();

        }, state);
    }
    mre.WaitOne();
}

Unfortunately, that’s quite a bit of code overhead, and it also suffers from a fundamental ordering problem: the frames may end up being written to the output movie file in an arbitrary order, based on when the frame generation completes.  A version with this ordering issue fixed might look like the following:

var frames = new Bitmap[numberOfFrames];
var events = (from i in Enumerable.Range(0, numberOfFrames)
              select new ManualResetEvent(false)).ToArray();

for (int i = 0; i < numberOfFrames; i++)
{
    ThreadPool.QueueUserWorkItem(state =>
    {
        int frameNum = (int)state;
        frames[frameNum] = GenerateFrame(frameNum);
        events[frameNum].Set();
    }, i);
}

 

for (int i = 0; i < numberOfFrames; i++)
{
    events[i].WaitOne();
    WriteToMovie(frames[i]);
}

This has the general effect I want, but there are still some unfortunately overheads here (for example, I’m creating, setting, and waiting on a ManualResetEvent per frame). It’s also verbose.  Instead, I can use Future<T> from Parallel Extensions to solve the same problem:

var frames = (from i in Enumerable.Range(0, numberOfFrames)
              select Future.Create(() => GenerateFrame(i))).
             ToArray();
foreach (var frame in frames) WriteToMovie(frame.Value);

I love how close this is to the original LINQ version (and how much less code it is the than my previous ThreadPool sample).  Rather than selecting GenerateFrame(i), I’m selecting Future.Create(() => GenerateFrame(i)), and rather than calling WriteToMovie(frame), I’m calling WriteToMovie(frame.Value).  With those changes, the frames are now being computed in parallel (note I’m also using ToArray to calls the entire query to be enumerated), and they’re being written out to the movie file in the correct order without having to do any explicit locking or coordination.  Sweet!

If I wanted to, I could do the same thing without LINQ, tracking the collection of futures explicitly:

var frames = new Queue<Future<Bitmap>>();
for (int i = 0; i < numberOfFrames; i++)
{
    var num = i;
    frames.Enqueue(Future.Create(() => GenerateFrame(num)));
}
while (frames.Count > 0) WriteToMovie(frames.Dequeue().Value);

I could also use PLINQ:

var frames = from i in Enumerable.Range(0, numberOfFrames).
    AsParallel(ParallelQueryOptions.PreserveOrdering)
             select GenerateFrame(i);
foreach (var frame in frames) WriteToMovie(frame);

Three different approaches to solving the same problem with Parallel Extensions, and all of them requiring less code (and less complicated code) than my ThreadPool example.  Just makes you smile, doesn’t it? 🙂

0 comments

Discussion is closed.

Feedback usabilla icon