TaskSchedulers and semaphores

Andrew Arnott

February 6th, 2016

When you write multi-threaded code, it’s important to know whether the code in other libraries you call into is also thread-safe. By my observation, most code is not thread-safe. So if you’re writing thread-safe code, kudos to you. But since you’ll sometimes need to call non-thread-safe code from your multi-threaded code, this post presents a couple of options for doing just that.

Writing thread-safe code is not easy. Even if you think you’re writing thread-safe code, you may still have bugs in it. In fact, even in the .NET Framework, most types are thread-safe for their static members but not for their instance members. These details are in the MSDN documentation for each type and member. Any library’s documentation should discuss how thread-safe it is (or isn’t). If the docs don’t discuss it, you should assume it is not thread-safe.

A less common solution might be to create a dedicated thread for accessing non-thread-safe code. This may be required when the non-thread-safe code is a native COM object that actually requires the exact same thread to be used for every call. Otherwise, creating such a thread is unnecessary overhead, since the constraint isn’t that you use exactly one thread, but merely one thread at a time.

A very common way to call into code that is not thread-safe is the same way you may be making your own code thread-safe: wrap it with an exclusive lock. Locks can be very lightweight and are often the right choice when you don’t expect lock contention to occur frequently or last a long time.
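The exclusive-lock approach can be sketched as follows. This is only an illustration: the LockedDictionary class and its members are hypothetical names, and Dictionary stands in for any non-thread-safe type.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: every access to the non-thread-safe Dictionary
// happens while holding the same private lock object.
public class LockedDictionary<TKey, TValue>
{
    private readonly object syncObject = new object();
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    public void Add(TKey key, TValue value)
    {
        lock (this.syncObject)
        {
            this.dictionary.Add(key, value);
        }
    }

    public int Count
    {
        get
        {
            lock (this.syncObject)
            {
                return this.dictionary.Count;
            }
        }
    }
}
```

Callers on any thread can then call Add without coordinating among themselves, at the cost of blocking briefly whenever another thread holds the lock.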

If the code you are writing is or can be async, you have at least a couple of other options.

Semaphores

In C#, the lock statement is very similar to a semaphore created with a count of 1. Your multi-threaded code can enter such a semaphore before accessing non-thread-safe code as a way to allow your multi-threaded code to safely call it.

In .NET, the SemaphoreSlim class offers a WaitAsync method that you can await, which efficiently yields your thread if the semaphore is not immediately available. Consider this example:
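A minimal sketch of such a wrapper, assuming a hypothetical SemaphoreGuardedDictionary class (the class and member names are illustrative, not taken from any library), might look like this:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical thread-safe wrapper around a non-thread-safe Dictionary.
public class SemaphoreGuardedDictionary<TKey, TValue>
{
    // A count of 1 means only one caller may be inside at a time.
    private readonly SemaphoreSlim semaphore = new SemaphoreSlim(1);
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    public async Task AddAsync(TKey key, TValue value, CancellationToken cancellationToken = default)
    {
        // Yields the thread (rather than blocking it) if the semaphore is busy.
        await this.semaphore.WaitAsync(cancellationToken);
        try
        {
            this.dictionary.Add(key, value);
        }
        finally
        {
            this.semaphore.Release();
        }
    }

    public async Task<int> GetCountAsync(CancellationToken cancellationToken = default)
    {
        await this.semaphore.WaitAsync(cancellationToken);
        try
        {
            return this.dictionary.Count;
        }
        finally
        {
            this.semaphore.Release();
        }
    }
}
```

The try/finally matters: if the guarded code throws, the semaphore is still released, so later callers are not left waiting forever.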

Locks block synchronously until they’re available, whereas a semaphore can be awaited. Semaphores can also be held across asynchronous awaits, which the C# lock statement won’t allow. A related and rather unique feature of semaphores is that you can hold one without holding a thread, so you can, for example, await something while holding a semaphore. It is not always a good idea to hold a semaphore for a long time, but it can be powerful when you need it, because it guarantees that no one else can enter the semaphore and observe your object’s state while it’s in the middle of an async operation. So for example, we can do this:
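As a sketch under the same assumptions (a hypothetical dictionary wrapper, here called SemaphoreHoldingDictionary, with Task.Delay standing in for real asynchronous work), holding the semaphore across an await might look like this:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper that keeps the semaphore held across an await.
public class SemaphoreHoldingDictionary<TKey, TValue>
{
    private readonly SemaphoreSlim semaphore = new SemaphoreSlim(1);
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    public async Task AddThenRemoveAsync(TKey key, TValue value, CancellationToken cancellationToken = default)
    {
        await this.semaphore.WaitAsync(cancellationToken);
        try
        {
            this.dictionary.Add(key, value);

            // No thread is held during the delay, but the semaphore still is,
            // so no other caller can observe the temporary entry.
            // The delay is deliberately not cancelable, so Remove always runs.
            await Task.Delay(100);

            this.dictionary.Remove(key);
        }
        finally
        {
            this.semaphore.Release();
        }
    }

    public async Task<int> GetCountAsync()
    {
        await this.semaphore.WaitAsync();
        try
        {
            return this.dictionary.Count;
        }
        finally
        {
            this.semaphore.Release();
        }
    }
}
```

Two concurrent calls with the same key both succeed here, because the second caller cannot reach Add until the first has removed its entry and released the semaphore.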

Exclusive TaskSchedulers

Another technique for ensuring that code executes only on one thread at a time is to use a TaskScheduler with a policy to only execute one Task at once. .NET’s TaskScheduler.Default will schedule work by adding it to the threadpool queue, which means it may run multiple tasks concurrently. But .NET also offers a TaskScheduler that ensures that only one Task executes at a time: ConcurrentExclusiveSchedulerPair. Consider the thread-safe dictionary example used above, but written in terms of an exclusive TaskScheduler:
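A minimal sketch of that idea, assuming a hypothetical SchedulerGuardedDictionary class (the names are illustrative), could be:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper: all dictionary access runs as Tasks on one exclusive scheduler.
public class SchedulerGuardedDictionary<TKey, TValue>
{
    private readonly TaskScheduler exclusiveScheduler =
        new ConcurrentExclusiveSchedulerPair().ExclusiveScheduler;
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    public Task AddAsync(TKey key, TValue value, CancellationToken cancellationToken = default)
    {
        // The delegate is synchronous; the scheduler runs at most one such delegate at a time.
        return Task.Factory.StartNew(
            () => this.dictionary.Add(key, value),
            cancellationToken,
            TaskCreationOptions.None,
            this.exclusiveScheduler);
    }

    public Task<int> GetCountAsync()
    {
        return Task.Factory.StartNew(
            () => this.dictionary.Count,
            CancellationToken.None,
            TaskCreationOptions.None,
            this.exclusiveScheduler);
    }
}
```

Note that the scheduler must be passed explicitly to StartNew; Task.Run always targets the default scheduler and would bypass the exclusivity.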

Here, our thread-safe AddAsync method schedules a Task on an exclusive TaskScheduler. As soon as that TaskScheduler is not busy, it will schedule our task for execution.

.NET’s ConcurrentExclusiveSchedulerPair class can actually serve a broader purpose. If your code can allow concurrent access for reading data, but requires exclusive access for writing data, the scheduler pair can help you achieve that. You schedule writes to the exclusive scheduler, and reads to the concurrent scheduler. The two schedulers in the pair work together to ensure that Tasks sent to each scheduler only execute when they should.
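The reader/writer arrangement can be sketched like this, assuming a hypothetical ReadMostlyDictionary class in which reads are safe to run concurrently with each other but not with writes:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper: writes go to the exclusive scheduler, reads to the concurrent one.
public class ReadMostlyDictionary<TKey, TValue>
{
    private readonly ConcurrentExclusiveSchedulerPair schedulers = new ConcurrentExclusiveSchedulerPair();
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    // Write: runs alone, with no other read or write executing.
    public Task AddAsync(TKey key, TValue value)
    {
        return Task.Factory.StartNew(
            () => this.dictionary.Add(key, value),
            CancellationToken.None,
            TaskCreationOptions.None,
            this.schedulers.ExclusiveScheduler);
    }

    // Read: may run alongside other reads, but never alongside a write.
    public Task<bool> ContainsKeyAsync(TKey key)
    {
        return Task.Factory.StartNew(
            () => this.dictionary.ContainsKey(key),
            CancellationToken.None,
            TaskCreationOptions.None,
            this.schedulers.ConcurrentScheduler);
    }
}
```

Both schedulers must come from the same ConcurrentExclusiveSchedulerPair instance; two separate pairs know nothing about each other and would not coordinate.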

But the above example isn’t particularly pretty. And the Task we schedule is synchronous. How can we make it asynchronous? The fastest way to do that based on what we already have is simply to change the delegate we pass in to an async delegate and (very important!) add .Unwrap() at the end:
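A sketch of that change, again with hypothetical names (AsyncSchedulerGuardedDictionary) and with Task.Delay standing in for real asynchronous work:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper that schedules an async delegate on an exclusive scheduler.
public class AsyncSchedulerGuardedDictionary<TKey, TValue>
{
    private readonly TaskScheduler exclusiveScheduler =
        new ConcurrentExclusiveSchedulerPair().ExclusiveScheduler;
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    public Task AddThenRemoveAsync(TKey key, TValue value, CancellationToken cancellationToken = default)
    {
        // StartNew returns Task<Task> for an async delegate;
        // Unwrap() yields a Task that completes when the whole delegate has finished.
        return Task.Factory.StartNew(
            async () =>
            {
                this.dictionary.Add(key, value);
                await Task.Delay(100);
                this.dictionary.Remove(key);
            },
            cancellationToken,
            TaskCreationOptions.None,
            this.exclusiveScheduler).Unwrap();
    }

    public Task<int> GetCountAsync()
    {
        return Task.Factory.StartNew(
            () => this.dictionary.Count,
            CancellationToken.None,
            TaskCreationOptions.None,
            this.exclusiveScheduler);
    }
}
```

Without the Unwrap() call, a caller awaiting the returned Task would resume as soon as the delegate reached its first await, while the entry was still in the dictionary.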

The Unwrap() is important because TaskFactory.StartNew doesn’t recognize async delegates, so StartNew will return a Task<Task> where the outer one completes as soon as the async delegate yields, while the inner Task completes after the async delegate fully executes. So what you must return to your caller is the inner Task. Unwrap() does this for you.

There is one significant difference in functionality between what we now have and the last SemaphoreSlim example. Both examples yield between the Add and Remove calls. But in the semaphore example no one else can execute within the semaphore during the yielding Delay(), whereas in the TaskScheduler example, other code *can* execute during the Delay(). Why the difference? TaskSchedulers do not have any insight into async work. They only execute synchronous delegates — not async ones. Each time your async delegate yields (such as for the await Delay()), the delegate has returned back to the TaskScheduler, which considers that Task completed. After the delay has elapsed, another Task is scheduled (on the same TaskScheduler) to execute the next (and last) synchronous portion of your async delegate.
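To make the interleaving concrete, here is a small sketch. The InterleavingDemo class is hypothetical, and a TaskCompletionSource replaces the Delay() so that the outcome does not depend on timing. It assumes the exclusive scheduler runs queued tasks in the order they were queued.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo: checks whether a second task runs on the exclusive
// scheduler while the first one is suspended at an await.
public static class InterleavingDemo
{
    public static async Task<bool> SecondTaskRunsDuringAwaitAsync()
    {
        TaskScheduler exclusive = new ConcurrentExclusiveSchedulerPair().ExclusiveScheduler;
        var gate = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        bool firstIsMidOperation = false;
        bool interleaved = false;

        Task first = Task.Factory.StartNew(
            async () =>
            {
                firstIsMidOperation = true;
                await gate.Task; // Yields: the scheduler treats this piece of work as finished.
                firstIsMidOperation = false;
            },
            CancellationToken.None,
            TaskCreationOptions.None,
            exclusive).Unwrap();

        Task second = Task.Factory.StartNew(
            () =>
            {
                // Runs while the first delegate is still between its two halves.
                interleaved = firstIsMidOperation;
                gate.SetResult(true);
            },
            CancellationToken.None,
            TaskCreationOptions.None,
            exclusive);

        await Task.WhenAll(first, second).ConfigureAwait(false);
        return interleaved;
    }
}
```

Under that ordering assumption the method should report true: the second task observes the first one mid-operation. The same shape written with a SemaphoreSlim held across the await would never let the second operation in, which is exactly the difference described above.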

Async methods tend to stay in the context they were started in. In other words, when you await a Task, the context you are currently running in is captured so that when the Task completes, your async method can resume in the same context. What is a ‘context’? It entails a lot, but for our purposes let us say that .NET first checks for SynchronizationContext.Current and, if it finds one, schedules continuations using SynchronizationContext.Post. Otherwise it uses TaskScheduler.Current, which in our case is the exclusive TaskScheduler. Outside of a Task running on a custom scheduler, TaskScheduler.Current is the same as TaskScheduler.Default, which allows concurrent execution on the threadpool.
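This capture can be observed directly. The following sketch uses a hypothetical ContextCaptureDemo class and assumes no SynchronizationContext is present on the threadpool thread that runs the delegate. It compares TaskScheduler.Current before and after an await:

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo: checks that an async delegate started on an exclusive
// scheduler resumes on that same scheduler after an await.
public static class ContextCaptureDemo
{
    public static Task<bool> ResumesOnSameSchedulerAsync()
    {
        TaskScheduler exclusive = new ConcurrentExclusiveSchedulerPair().ExclusiveScheduler;

        return Task.Factory.StartNew(
            async () =>
            {
                bool before = TaskScheduler.Current == exclusive;
                await Task.Delay(10); // The continuation is posted back to the captured scheduler.
                bool after = TaskScheduler.Current == exclusive;
                return before && after;
            },
            CancellationToken.None,
            TaskCreationOptions.None,
            exclusive).Unwrap();
    }
}
```

If the await were written as `await Task.Delay(10).ConfigureAwait(false)`, the context would not be captured and the second half of the delegate could run outside the exclusive scheduler, losing its protection.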

Should you use SemaphoreSlim or TaskScheduler for your synchronization needs? As just presented, the most important factor may be whether you need to keep others from executing while your code is suspended at an await. But if that isn’t relevant (e.g. you’re fully synchronous in these critical sections of code), you can go with whichever one lets you write code more naturally.

It’s worth calling out that neither approach guarantees that your thread-safe code will execute on the same thread all the time, nor that after resuming from an await you’ll be on the same thread you were on before it. They merely guarantee that your code won’t execute at the same time as any other code using the same synchronization object (with their varying degrees of granularity as discussed).

Making it look elegant

Our examples thus far have required that we write a lot of boilerplate code. We can do better, both for the semaphore and the TaskScheduler case. In both cases, it’s by adding the Microsoft.VisualStudio.Threading package to your project.

Install-Package Microsoft.VisualStudio.Threading

Then add this statement to the top of your source file:

using Microsoft.VisualStudio.Threading;

Now both examples are downright elegant. First, using AsyncSemaphore and its using block support:
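A sketch of the semaphore version, assuming the Microsoft.VisualStudio.Threading package is referenced and using a hypothetical AsyncSemaphoreDictionary wrapper:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.VisualStudio.Threading;

// Hypothetical wrapper using AsyncSemaphore from Microsoft.VisualStudio.Threading.
public class AsyncSemaphoreDictionary<TKey, TValue>
{
    private readonly AsyncSemaphore semaphore = new AsyncSemaphore(1);
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    public async Task AddThenRemoveAsync(TKey key, TValue value, CancellationToken cancellationToken = default)
    {
        // EnterAsync yields a releaser; disposing it at the end of the
        // using block releases the semaphore, even if an exception is thrown.
        using (await this.semaphore.EnterAsync(cancellationToken))
        {
            this.dictionary.Add(key, value);
            await Task.Delay(100);
            this.dictionary.Remove(key);
        }
    }

    public async Task<int> GetCountAsync()
    {
        using (await this.semaphore.EnterAsync())
        {
            return this.dictionary.Count;
        }
    }
}
```

The using block replaces the explicit try/finally and Release() call needed with SemaphoreSlim.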

And second, using the ability to await on a TaskScheduler:
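A sketch of the scheduler version, again assuming the Microsoft.VisualStudio.Threading package (which supplies the extension method that makes a TaskScheduler awaitable) and a hypothetical AwaitedSchedulerDictionary wrapper:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.VisualStudio.Threading;

// Hypothetical wrapper that switches to an exclusive scheduler by awaiting it.
public class AwaitedSchedulerDictionary<TKey, TValue>
{
    private readonly TaskScheduler exclusiveScheduler =
        new ConcurrentExclusiveSchedulerPair().ExclusiveScheduler;
    private readonly Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    public async Task AddThenRemoveAsync(TKey key, TValue value, CancellationToken cancellationToken = default)
    {
        // Resume the rest of this method on the exclusive scheduler.
        await this.exclusiveScheduler;

        // The switch itself does not observe the token, so check it right after resuming.
        cancellationToken.ThrowIfCancellationRequested();

        this.dictionary.Add(key, value);
        await Task.Delay(100); // Other work may run on the scheduler during this await.
        this.dictionary.Remove(key);
    }

    public async Task<int> GetCountAsync()
    {
        await this.exclusiveScheduler;
        return this.dictionary.Count;
    }
}
```

No StartNew or Unwrap() is needed here, since the method is an ordinary async method whose returned Task already represents the whole operation.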

Both samples now look quite nice. One thing we lose in the last TaskScheduler sample is observing the CancellationToken while we await on the TaskScheduler. So we make up for that as best we can by responding to cancellation as soon as we resume execution.

There are of course other ways to solve these problems. These techniques are perhaps lesser known, but often good choices for you to consider in your code.