Multiple thread-local state elements in a loop

Stephen Toub - MSFT

The Parallel.For/ForEach loop constructs included in Parallel Extensions support a variant of thread-local state to aid in efficiently passing data between loop iterations.  Consider one such overload of Parallel.For:

public static void For<TLocal>(
    int fromInclusive, int toExclusive,
    Func<TLocal> threadLocalInit,
    Action<int, ParallelState<TLocal>> body,
    Action<TLocal> threadLocalFinally);

The threadLocalInit delegate is called once per thread for each thread involved in processing loop iterations, and it’s called on the thread before the thread processes any iterations of the loop.  Each iteration is then provided with the result through the ParallelState<TLocal> instance that’s passed to the body delegate.  As this instance is mutable, the body delegate can update the value in the ParallelState<TLocal>’s ThreadLocalState property for the next iteration to see, though it can also treat the value as read-only if updates aren’t relevant.  Finally, after a thread is done executing iterations, the threadLocalFinally delegate is called, provided with the ThreadLocalState value.

As an example of using one of these methods:

Parallel.For(0, N, () => new NonThreadSafeData(), (i,loop)=>
{
    UseData(loop.ThreadLocalState);
});

Here, an instance of NonThreadSafeData is constructed once per thread that’s used to process iterations.  In this fashion, the loop body can be sure that no other thread is currently accessing the same instance (providing thread-safety through isolation), and at the same time we create a minimal number of these instances: one for each involved thread, rather than one per iteration.
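The snippets in this post are written against the preview API. For readers on a released framework, here is a minimal sketch of the same isolation pattern using the Parallel.For overload that shipped in .NET Framework 4, where the local value is passed into the body as a parameter and returned from it rather than exposed through a ThreadLocalState property. The use of List<int> as the non-thread-safe object and the two counters are assumptions made purely for illustration; they are not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

const int N = 1000;
int instancesCreated = 0;   // how many times the init delegate ran
int iterationsRun = 0;      // how many iterations were processed in total

Parallel.For(0, N,
    // localInit: runs once for each task that participates in the loop
    () =>
    {
        Interlocked.Increment(ref instancesCreated);
        return new List<int>();   // List<T> is not thread-safe
    },
    // body: receives the current local value and returns the value
    // that the next iteration on the same task will see
    (i, loopState, scratch) =>
    {
        scratch.Add(i);           // safe: no other task can see this list
        return scratch;
    },
    // localFinally: runs once per task, after that task's last iteration
    scratch => Interlocked.Add(ref iterationsRun, scratch.Count));

Console.WriteLine($"{instancesCreated} instances served {iterationsRun} iterations");
```

The number of instances reported will vary from run to run and machine to machine, but it should typically be far smaller than N. Note that in the released API the init and finally delegates run once per participating task rather than strictly once per thread.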

Another common use for this thread-local state is to support aggregations.  Rather than incurring the cost of an interlocked or other expensive synchronization operation for each element, each thread aggregates locally and then combines its partial result into the shared total only once:

int total = 0;
Parallel.ForEach(data,
    ()=>0,
    (elem,i,loop)=> { loop.ThreadLocalState += Process(elem); },
    partial => Interlocked.Add(ref total, partial));
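For comparison, the same aggregation can be sketched against the Parallel.ForEach overload that shipped in .NET Framework 4, where the body returns the updated subtotal instead of mutating a ThreadLocalState property. The sample data and the Process function below are hypothetical stand-ins, chosen only so the snippet runs on its own.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int[] data = Enumerable.Range(1, 100).ToArray();   // sample input (assumption)
int total = 0;

// Hypothetical stand-in for the post's Process method.
static int Process(int elem) => elem * 2;

Parallel.ForEach(
    data,
    () => 0,                                                  // each task starts with a subtotal of 0
    (elem, loopState, subtotal) => subtotal + Process(elem),  // no synchronization per element
    subtotal => Interlocked.Add(ref total, subtotal));        // one interlocked add per task

Console.WriteLine(total);
```

The only synchronized operation is the final Interlocked.Add, which runs once per participating task rather than once per element.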

Several developers have now asked me whether there’s a way to pass multiple pieces of data, rather than just one, between iterations of the loop.  The answer is yes, but doing so requires a bit of extra code.  For example, in the previous code snippet, I’m aggregating the values that result from calling the Process method on each element in the data set.  What if I also wanted to track how many elements there were?  To do so, I can create a small type that serves merely to store multiple values:

class MultipleValues { public int Total, Count; }

With this type in hand, I can now make my thread-local state an instance of this type:

int total = 0, count = 0;
Parallel.ForEach(data,
    ()=>new MultipleValues { Total=0, Count=0 },
    (elem,i,loop)=>
    {
        loop.ThreadLocalState.Total += Process(elem);
        loop.ThreadLocalState.Count++;
    },
    partial =>
    {
        Interlocked.Add(ref total, partial.Total);
        Interlocked.Add(ref count, partial.Count);
    });

It would be nice for these situations if we didn’t have to declare such a type and if we could instead take advantage of anonymous types in C# and Visual Basic.  Unfortunately for this situation, in C# anonymous types generate read-only properties, which means they’re not appropriate if you need to update the generated properties, as I’m doing here by incrementing them.  Of course, if you don’t need to modify the property values (as was the case with my initial NonThreadSafeData example), or if you’re using a language (like Visual Basic) where anonymous types aren’t read-only by default, anonymous types could be quite beneficial in this regard.
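As a side note that goes beyond the original discussion: C# 7.0 and later offer value tuples, which can carry several values without a declared type. Combined with the released Parallel.ForEach overload, whose body returns the new local value rather than mutating it, a read-only-friendly style becomes workable. The following is only a sketch under those assumptions; the sample data and the Process function are hypothetical placeholders.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int[] data = Enumerable.Range(1, 100).ToArray();   // sample input (assumption)
int total = 0, count = 0;

// Hypothetical stand-in for the post's Process method.
static int Process(int elem) => elem * 2;

Parallel.ForEach(
    data,
    // local state is a named value tuple instead of a declared class
    () => (Total: 0, Count: 0),
    // build and return a new tuple for the next iteration on this task
    (elem, loopState, local) =>
        (Total: local.Total + Process(elem), Count: local.Count + 1),
    // combine each task's partial results into the shared totals
    local =>
    {
        Interlocked.Add(ref total, local.Total);
        Interlocked.Add(ref count, local.Count);
    });

Console.WriteLine($"total={total}, count={count}");
```

Because each iteration returns a fresh tuple instead of updating fields in place, nothing in the local state needs to be mutable, which is the property that made anonymous types awkward in the earlier snippet.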
