The Asynchronous Programming Models

An important new feature of C# 5.0, which comes with Visual Studio 11, is the pair of async and await keywords. They are syntactic sugar that simplifies writing asynchronous code. When the C# compiler sees an await expression, it generates code that starts the awaited operation and, if that operation has not finished yet, immediately returns control to the caller so that the caller can continue executing without blocking. After the asynchronous operation finishes, control returns to the code below the await expression, which then runs sequentially until an exit point is reached (for example, the end of the method or an iteration of a loop). I emphasize that the await keyword is only syntactic sugar: the compiler generates the equivalent code so that you do not have to write it by hand. Before you can understand what C# 5.0 does for you with the async and await keywords, you should first understand the asynchronous programming models (APM) that the Microsoft .NET Framework provides.
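The idea that await is syntactic sugar over a continuation can be sketched as follows. This is only an illustration, not the code the compiler actually emits: PowerOfTwoAsync and both Describe methods are hypothetical names, and the sketch assumes the Task.Run method of .NET Framework 4.5.

```csharp
using System;
using System.Threading.Tasks;

namespace AwaitSketch
{
    public static class Program
    {
        // Hypothetical helper: computes 2^n on a thread pool thread.
        public static Task<int> PowerOfTwoAsync(int n)
        {
            return Task.Run(() => 1 << n);
        }

        // With await: the code after the await runs once the task has completed.
        public static async Task<string> DescribeWithAwait(int n)
        {
            int value = await PowerOfTwoAsync(n);
            return "2^" + n + " = " + value;
        }

        // Roughly the hand-written equivalent: attach a continuation to the task.
        public static Task<string> DescribeWithContinuation(int n)
        {
            return PowerOfTwoAsync(n).ContinueWith(t => "2^" + n + " = " + t.Result);
        }

        public static void Main()
        {
            // Blocking on Result keeps this console sketch short.
            Console.WriteLine(DescribeWithAwait(10).Result);        // prints "2^10 = 1024"
            Console.WriteLine(DescribeWithContinuation(10).Result); // prints "2^10 = 1024"
        }
    }
}
```

Both methods return a Task<string> that completes with the same text; the await version simply reads like sequential code.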

In the .NET Framework, there are many ways to implement an asynchronous operation: a dedicated thread, the thread pool, BeginXxx and EndXxx methods, the event-based APM, or the task-based APM. The first way, using a dedicated thread, is not recommended because creating a thread is very expensive* and requires a lot of manual control to work well; I will skip it here due to its complexity. The second way, using the thread pool, is the easiest and the most commonly used. The BeginXxx and EndXxx methods declared on certain types provide the standard way to perform an asynchronous operation. The event-based asynchronous programming model is less popular than the BeginXxx and EndXxx methods, and the .NET Framework provides only a small set of types that support it. The last one, the task-based APM, was introduced in .NET Framework 4 as part of the Task Parallel Library (TPL). It dispatches asynchronous operations through a task scheduler and offers many features that extend task parallelism. The default task scheduler is implemented on top of the thread pool; the .NET Framework also provides a task scheduler based on synchronization contexts, and you can implement your own task scheduler and use it with tasks.

* Creating a thread needs about 1.5 MB of memory. Windows also creates several additional data structures for each thread, such as a Thread Environment Block (TEB), a user-mode stack, and a kernel-mode stack. Introducing a new thread may also cause extra thread context switches, which hurts performance. So avoid creating additional threads as much as possible.

In this article, I will go through the different ways to perform asynchronous operations and show examples that guide you in using each of them.

The Thread Pool APM

When you want to perform an asynchronous operation, the easiest way is to use the thread pool: call the static QueueUserWorkItem method of System.Threading.ThreadPool, passing an instance of the WaitCallback delegate and, optionally, an Object that is handed to the callback as its state parameter. The following example shows how to use the thread pool to queue asynchronous operations.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
 
namespace ApmDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            // Define a WaitCallback instance.
            WaitCallback writeCallback = state => Console.WriteLine(state);
 
            // Queue user work items with ThreadPool.
            ThreadPool.QueueUserWorkItem(writeCallback, "This is the first line");
            ThreadPool.QueueUserWorkItem(writeCallback, "This is the second line");
            ThreadPool.QueueUserWorkItem(writeCallback, "This is the third line");
 
            Console.ReadKey();
        }
    }
}

In the above example, I created a WaitCallback instance from a lambda expression, then called ThreadPool’s static QueueUserWorkItem method, passing this instance as the first parameter and a string as the second. When this method is called, the thread pool looks for a free thread in the pool, associates the WaitCallback delegate with that thread, and dispatches the thread to execute the delegate at some later time; if there is no free thread in the pool, the thread pool creates a new thread, associates the delegate with it, and dispatches it in the same way. I queued three user work items to the thread pool by calling QueueUserWorkItem three times.

When I try to run this program, I may get the following output:

This is the first line
This is the second line
This is the third line

But sometimes I also get the following output:

This is the second line
This is the first line
This is the third line

Please note that the execution order of the queued user work items is unpredictable, because there is no way to know when a thread in the thread pool will be scheduled to execute the code. As shown above, the work items may complete in the order they were queued, but they may also complete in a different order. Therefore, do not write asynchronous code that relies on the execution order.

I highly recommend that you use the thread pool APM as much as possible. Here are some reasons:

  1. The thread pool is managed automatically by the CLR. When you queue a user work item to the thread pool, you never have to care which thread it will be associated with or when it will be executed; the CLR handles everything for you. This pattern enables you to write easy-to-read, straightforward and less buggy code.
  2. The thread pool manages threads wisely. An asynchronous operation needs another thread so that it can run without blocking the current thread; however, creating a new thread is expensive, and introducing a new thread for every user work item would be a heavy waste of resources. The thread pool maintains a set of threads. When a user work item is queued, the thread pool adds it to a global work item list; a CLR thread then checks this list and, if it is not empty, picks up a work item and hands it to a free thread in the pool. If there is no free thread, the thread pool creates a new thread and hands the work item to it. The thread pool always tries to use as few threads as possible to serve all queued user work items. Hence, by using the thread pool, the CLR uses fewer system resources and schedules asynchronous operations effectively and efficiently.
  3. The thread pool has better performance. The thread pool mechanism is designed to use the maximum (or configured) CPU resources to serve user work items. If you are running your program in a multi-core CPU environment, the thread pool initially creates as many threads as there are CPUs installed in that environment; when scheduling user work items, the thread pool automatically balances the threads and tries to make sure that every logical CPU core is used to serve the work items. This brings flexibility in dispatching CPU resources and also helps to improve overall system performance.
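To see how the thread pool is sized on a particular machine, you can query its limits. The following is a minimal sketch; WorkerThreadLimits is a hypothetical helper name, and the numbers it prints depend on the machine and the runtime configuration.

```csharp
using System;
using System.Threading;

namespace ThreadPoolInfoSketch
{
    public static class Program
    {
        // Hypothetical helper: returns { minimum, maximum } worker thread counts.
        public static int[] WorkerThreadLimits()
        {
            int minWorker, minIo, maxWorker, maxIo;
            ThreadPool.GetMinThreads(out minWorker, out minIo);
            ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
            return new[] { minWorker, maxWorker };
        }

        public static void Main()
        {
            int[] limits = WorkerThreadLimits();

            // The minimum is the number of threads the pool creates on demand
            // without delay; it typically matches the logical processor count.
            Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
            Console.WriteLine("Min worker threads: {0}", limits[0]);
            Console.WriteLine("Max worker threads: {0}", limits[1]);
        }
    }
}
```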

Although the thread pool has many benefits, it also has limits:

  1. The thread pool queues a user work item and executes it at an uncertain time. There is no built-in way for the calling code to know when the work item completes, so it is very difficult to write continuation code that runs after the work item is completed. In particular, some operations, such as reading a number of bytes from a file stream, must deliver a notification when the operation completes asynchronously, so that the calling code can determine how many bytes were read from the file stream and use those bytes to do other things.
  2. ThreadPool’s QueueUserWorkItem method only takes a delegate that receives one parameter. If your code is designed to process more than one parameter, you cannot pass all of them directly to this method; instead, you have to create an additional data structure that wraps those parameters and pass an instance of the wrapper type to the method. This reduces the readability and maintainability of your code.
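Both limits can be worked around by hand, at the cost of extra code. The following sketch assumes a hypothetical AddRequest wrapper type that carries several parameters in one state object, and uses a CountdownEvent so the caller can learn when all work items have finished.

```csharp
using System;
using System.Threading;

namespace ThreadPoolStateSketch
{
    // Hypothetical wrapper type: several parameters travel as one state object.
    public sealed class AddRequest
    {
        public int Left;
        public int Right;
        public int Sum;              // filled in by the work item
        public CountdownEvent Done;  // signaled when the work item finishes
    }

    public static class Program
    {
        public static int[] AddAll(int[][] pairs)
        {
            var requests = new AddRequest[pairs.Length];
            using (var done = new CountdownEvent(pairs.Length))
            {
                for (int i = 0; i < pairs.Length; i++)
                {
                    requests[i] = new AddRequest { Left = pairs[i][0], Right = pairs[i][1], Done = done };
                    ThreadPool.QueueUserWorkItem(state =>
                    {
                        var request = (AddRequest)state;
                        request.Sum = request.Left + request.Right;
                        request.Done.Signal();
                    }, requests[i]);
                }

                // Block until every work item has signaled completion.
                done.Wait();
            }

            // Results are read by index, so the execution order does not matter.
            var sums = new int[requests.Length];
            for (int i = 0; i < requests.Length; i++)
            {
                sums[i] = requests[i].Sum;
            }
            return sums;
        }

        public static void Main()
        {
            int[] sums = AddAll(new[] { new[] { 1, 2 }, new[] { 3, 4 }, new[] { 5, 6 } });
            Console.WriteLine(string.Join(", ", sums)); // prints "3, 7, 11"
        }
    }
}
```

The wrapper type and the synchronization object are exactly the kind of boilerplate that the patterns in the following sections take care of for you.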

To solve these problems, you may use the following standard way to perform asynchronous operations.

The Standard APM

The Framework Class Library (FCL) ships various types that have BeginXxx and EndXxx methods, which are designed to perform asynchronous operations. For example, the System.IO.FileStream type defines Read, BeginRead and EndRead methods. Read is a synchronous method: it reads a number of bytes from a file stream synchronously; in other words, it won’t return until the read operation on the file stream is completed. The BeginRead and EndRead methods are a pair. When BeginRead is called, the CLR queues the operation to the hardware device (in this case, the hard disk) and immediately returns control to the next line of code, which continues to execute. When the hardware device completes the asynchronous read operation, it notifies the Windows kernel, and the Windows kernel then notifies the CLR to execute the delegate that was passed as a parameter to BeginRead. In this delegate, the code must call the EndRead method so that the CLR can return the number of bytes read into the buffer; the code can then access the bytes read from the file stream.

Here is how the BeginRead, EndRead and Read method signatures are defined.

public override IAsyncResult BeginRead(byte[] array, int offset,
    int numBytes, AsyncCallback userCallback, object stateObject);
public override int EndRead(IAsyncResult asyncResult);
public override int Read(byte[] array, int offset, int count);

Usually, the BeginXxx method has the same parameters as the Xxx method plus two additional parameters: userCallback and stateObject. The userCallback parameter is of type AsyncCallback, a delegate that takes one parameter of type IAsyncResult, which carries additional information about the asynchronous operation. The stateObject parameter is the instance that you want to pass to the userCallback delegate; inside the callback, it can be accessed through the AsyncState property of the asyncResult argument.
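The round trip of the state object can be sketched in isolation. The example below uses a MemoryStream as a stand-in for a file so that it stays self-contained; ReadState is a hypothetical type, and the sketch assumes the text fits into the buffer so that a single read is enough.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

namespace AsyncStateSketch
{
    // Hypothetical state type handed to BeginRead and recovered in the callback.
    public sealed class ReadState
    {
        public Stream Source;
        public byte[] Buffer;
        public string Text;
        public ManualResetEventSlim Completed = new ManualResetEventSlim(false);
    }

    public static class Program
    {
        public static string ReadText(string text)
        {
            var state = new ReadState
            {
                Source = new MemoryStream(Encoding.UTF8.GetBytes(text)),
                Buffer = new byte[256]
            };

            // The last argument is the stateObject; it comes back as AsyncState.
            state.Source.BeginRead(state.Buffer, 0, state.Buffer.Length, OnReadCompleted, state);

            // Wait for the callback, then clean up.
            state.Completed.Wait();
            state.Source.Dispose();
            return state.Text;
        }

        private static void OnReadCompleted(IAsyncResult asyncResult)
        {
            // Recover the state object that was passed to BeginRead.
            var state = (ReadState)asyncResult.AsyncState;

            // EndRead must be called to obtain the number of bytes read.
            int bytesRead = state.Source.EndRead(asyncResult);
            state.Text = Encoding.UTF8.GetString(state.Buffer, 0, bytesRead);
            state.Completed.Set();
        }

        public static void Main()
        {
            Console.WriteLine(ReadText("hello")); // prints "hello"
        }
    }
}
```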

The following code demonstrates how to use BeginXxx and EndXxx methods to perform asynchronous operations.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.IO;
 
namespace ApmDemo
{
    internal class Program
    {
        private const string FilePath = @"c:\demo.dat";
 
        private static void Main(string[] args)
        {
            // Test async write bytes to the file stream.
            Program.TestWrite();
 
            // Wait operations to complete.
            Thread.Sleep(60000);
        }
 
        private static void TestWrite()
        { 
            // Must specify FileOptions.Asynchronous otherwise the BeginXxx/EndXxx methods are
            // handled synchronously.
            FileStream fs = new FileStream(Program.FilePath, FileMode.OpenOrCreate,
                FileAccess.Write, FileShare.None, 8, FileOptions.Asynchronous);
 
            string content = "A quick brown fox jumps over the lazy dog";
            byte[] data = Encoding.Unicode.GetBytes(content);
 
            // Begins to write content to the file stream.
            Console.WriteLine("Begin to write");
            fs.BeginWrite(data, 0, data.Length, Program.OnWriteCompleted, fs);
            Console.WriteLine("Write queued");
        }
 
        private static void OnWriteCompleted(IAsyncResult asyncResult)
        { 
            // End the async operation.
            FileStream fs = (FileStream)asyncResult.AsyncState;
            fs.EndWrite(asyncResult);
 
            // Close the file stream.
            fs.Close();
            Console.WriteLine("Write completed");
 
            // Test async read bytes from the file stream.
            Program.TestRead();
        }
 
        private static void TestRead()
        {
            // Must specify FileOptions.Asynchronous otherwise the BeginXxx/EndXxx methods are
            // handled synchronously.
            FileStream fs = new FileStream(Program.FilePath, FileMode.OpenOrCreate,
                FileAccess.Read, FileShare.None, 8, FileOptions.Asynchronous);
 
            byte[] data = new byte[1024];
 
            // Begin to read content from the file stream.
            Console.WriteLine("Begin to read");
            // Pass both fs and data as the async state object.
            fs.BeginRead(data, 0, data.Length, Program.OnReadCompleted, new { Stream = fs, Data = data });
            Console.WriteLine("Read queued");
        }
 
        private static void OnReadCompleted(IAsyncResult asyncResult)
        {
            dynamic state = asyncResult.AsyncState;
 
            // End read.
            int bytesRead = state.Stream.EndRead(asyncResult);
 
            // Get content.
            byte[] data = state.Data;
            string content = Encoding.Unicode.GetString(data, 0, bytesRead);
            
            // Display content and close stream.
            Console.WriteLine("Read completed. Content is: {0}", content);
            state.Stream.Close();
            Console.ReadKey();
        }
    }
}

This program tests asynchronous read/write operations from/to a specified file stream, by using BeginRead, EndRead, BeginWrite and EndWrite methods defined on System.IO.FileStream class. When I try to run this program, I get the following output:

[Screenshot of the console output]

Now you know how to use the standard way to perform an asynchronous operation by calling BeginXxx and EndXxx methods. In fact, this standard way can do more than I demonstrated here; for example, it can be combined with cancellation, which I will discuss in later articles. By using this pattern, you can solve some of the problems I listed for the thread pool APM, and you also get additional benefits, which I summarize below.

  1. Supports continuation. When an asynchronous operation is completed, the userCallback delegate is invoked, so the calling code can perform other operations based on the result of this asynchronous operation.
  2. Supports I/O-based asynchronous operations. The standard APM works with kernel objects to perform I/O-based asynchronous operations. When such an operation is requested by calling the BeginXxx method, the CLR doesn’t dedicate a new thread pool thread to the task; instead, it uses a Windows kernel object to wait for the hardware I/O device to report (through its driver software) that it has finished the task. Because the CLR relies on the hardware device drivers and kernel objects, no additional managed resources are used while the operation is in flight. Hence, it actually improves system performance by freeing up CPU time slices and threads.
  3. Supports cancellation, with caveats. The BeginXxx and EndXxx signatures themselves take no cancellation token, so cancellation is not built into the pattern; it is usually layered on top, for example with System.Threading.CancellationTokenSource and its Cancel method. I will introduce this class in later articles.

However, using the standard APM makes your code more complicated. That’s because all the continuations happen outside of the calling context. For example, in the above read/write file stream example, OnReadCompleted and OnWriteCompleted are separate methods that are invoked by a different thread than the calling thread. This behavior may confuse developers and therefore makes your code logic harder to understand.

Note: The async method and the await expressions bring a clear, logical and organized code structure to the asynchronous programming.

The Event-based APM

The Framework Class Library (FCL) also ships with some types that support the event-based APM. For example, the System.Net.WebClient class defines a DownloadDataAsync method and a DownloadDataCompleted event. When DownloadDataAsync is called, the CLR begins an asynchronous operation that downloads the data from a specified URL. When the operation is completed, the DownloadDataCompleted event is fired; its argument e, of type System.Net.DownloadDataCompletedEventArgs, contains the results and additional information about the operation. The following code demonstrates how to use the event-based APM to perform an asynchronous operation.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.IO;
using System.Net;
 
namespace ApmDemo
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            WebClient wc = new WebClient();
            // Subscribe before starting the download so the completion event cannot be missed.
            wc.DownloadDataCompleted += (s, e) => Console.WriteLine(Encoding.UTF8.GetString(e.Result));
            wc.DownloadDataAsync(new Uri("https://www.markzhou.com"));
 
            Console.ReadKey();
        }
    }
}

In effect, it behaves the same as using BeginXxx and EndXxx methods. The difference is that the event-based APM is closer to the object model layer: you can even use a designer to drag and drop the component onto the user interface and then set the event handler through the property window. In contrast, the standard APM does not expose events to subscribe to, which may help performance because implementing events can require additional system resources.
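The shape of the pattern, an XxxAsync method paired with an XxxCompleted event, can be sketched with a small component. SquareCalculator and its members are hypothetical and exist only for illustration; real components in the FCL typically also raise the event on the caller’s synchronization context, which this sketch leaves out.

```csharp
using System;
using System.Threading;

namespace EventBasedSketch
{
    // Hypothetical event arguments carrying the result of the operation.
    public sealed class SquareCompletedEventArgs : EventArgs
    {
        public SquareCompletedEventArgs(int result)
        {
            Result = result;
        }

        public int Result { get; private set; }
    }

    // Hypothetical component following the XxxAsync / XxxCompleted naming.
    public sealed class SquareCalculator
    {
        public event EventHandler<SquareCompletedEventArgs> SquareCompleted;

        public void SquareAsync(int value)
        {
            // Do the work on a thread pool thread, then raise the event there.
            ThreadPool.QueueUserWorkItem(_ =>
            {
                var args = new SquareCompletedEventArgs(value * value);
                var handler = SquareCompleted;
                if (handler != null)
                {
                    handler(this, args);
                }
            });
        }
    }

    public static class Program
    {
        public static int SquareAndWait(int value)
        {
            var calculator = new SquareCalculator();
            int result = 0;
            using (var done = new ManualResetEventSlim(false))
            {
                // Subscribe first, then start the operation.
                calculator.SquareCompleted += (s, e) => { result = e.Result; done.Set(); };
                calculator.SquareAsync(value);
                done.Wait();
            }
            return result;
        }

        public static void Main()
        {
            Console.WriteLine(SquareAndWait(7)); // prints "49"
        }
    }
}
```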

Only a very small set of types in the FCL supports the event-based APM. Personally, I suggest avoiding this pattern as much as possible. The event-based APM may suit application developers, because they are component consumers, not component designers; designer support is not mandatory for component designers (library developers).

The Task-based APM

Microsoft .NET Framework 4.0 introduces the new Task Parallel Library (TPL) for parallel computing and asynchronous programming. Its main type, the Task class defined in the System.Threading.Tasks namespace, represents a user work item to complete. To use the task-based APM, you create a new instance of Task or Task<T>, passing a delegate as the first parameter of the constructor (an Action for Task, or a Func<T> for Task<T>), and then call the task’s instance method Start, which notifies the task scheduler to schedule this task as soon as possible.

The following code shows how to use task based APM to perform a compute-bound asynchronous operation.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
 
namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Task based APM demo");
            Task t = new Task(() =>
            {
                Console.WriteLine("This test is output asynchronously");
            });
            t.Start();
            Console.WriteLine("Task started");
            
            // Wait task(s) to complete.
            Task.WaitAll(t);
        }
    }
}

If I run this program, I will get the following output:

[Screenshot of the console output]

Alternatively, if the task delegate returns a value, you can use Task<T> instead of Task; after the task is complete, you can query the result through the Result property of Task<T>. The following code shows how to use Task<T> to calculate 2 to the power of n (n must be positive).

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
 
namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Task based APM demo");
            Func<int, int> calc = (n) => { return 2 << (n - 1); };
 
            Task<int> t = new Task<int>(() =>
            {
                return calc(10);
            });
 
            t.Start();
            Console.WriteLine("Task started");
            
            // Wait task(s) to complete.
            // After t is complete, get the result.
            Task.WaitAll(t);
            Console.WriteLine(t.Result);
        }
    }
}

When I run this program, I get the following output:

[Screenshot of the console output]

Task’s static WaitAll method waits synchronously for all tasks specified in the parameter array, meaning that the current thread is blocked until all the specified tasks are complete. If you don’t want to block the current thread and you intend to do something after a certain task is complete, you can use the task’s instance method ContinueWith. The following code shows how to set up continuation tasks.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
 
namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Task based APM demo");
            Func<int, int> calc = (n) => { return 2 << (n - 1); };
 
            Task<int> t = new Task<int>(() =>
            {
                return calc(10);
            });
            
            // Set a continuation operation.
            t.ContinueWith(task => { Console.WriteLine(task.Result); return task.Result; });
            t.Start();
            Console.WriteLine("Task started");
            
            // Wait for a user input to exit the program.
            Console.ReadKey();
        }
    }
}

The task-based APM has many features. I list some of the important ones below:

  1. You can specify TaskCreationOptions when creating a task, indicating how the task scheduler should schedule the task.
  2. You can pass a CancellationToken (obtained from a CancellationTokenSource) when creating a task, associating the task with the token that is used to cancel it.
  3. You can use the ContinueWith or ContinueWith<T> method to perform continuation tasks.
  4. You can wait synchronously for all specified tasks to complete by calling Task’s static WaitAll method, or for any one of the tasks to complete by calling Task’s static WaitAny method.
  5. If you want to create a bunch of tasks with the same creation/continuation settings, you can use TaskFactory’s instance method StartNew.
  6. The task-based APM requires a task scheduler to work. The default task scheduler is implemented on top of the thread pool; however, you may change the task scheduler associated with a task to a synchronization context task scheduler or a customized task scheduler.
  7. You can easily convert an asynchronous operation that follows the BeginXxx and EndXxx pattern into a task by calling TaskFactory’s instance method FromAsync or FromAsync<T>.
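Some of these features can be sketched briefly: wrapping a BeginXxx/EndXxx pair with FromAsync, chaining a continuation, and cancelling a task cooperatively through a token. The method names are hypothetical, and a MemoryStream stands in for a real file so that the sketch stays self-contained.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace TaskFeaturesSketch
{
    public static class Program
    {
        // Wraps BeginRead/EndRead into a Task<int>, then decodes in a continuation.
        public static string ReadViaFromAsync(string text)
        {
            using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(text)))
            {
                var buffer = new byte[256];
                Task<int> read = Task<int>.Factory.FromAsync(
                    stream.BeginRead, stream.EndRead, buffer, 0, buffer.Length, null);

                Task<string> decode = read.ContinueWith(
                    t => Encoding.UTF8.GetString(buffer, 0, t.Result));

                // Blocking on Result keeps the stream alive until the work is done.
                return decode.Result;
            }
        }

        // Starts a task with a token, cancels it, and reports whether it ended as canceled.
        public static bool CancelRunningTask()
        {
            using (var cts = new CancellationTokenSource())
            using (var started = new ManualResetEventSlim(false))
            {
                CancellationToken token = cts.Token;
                Task worker = Task.Factory.StartNew(() =>
                {
                    started.Set();
                    while (!token.IsCancellationRequested)
                    {
                        Thread.Sleep(10);
                    }
                    // Throwing with the task's own token marks the task as canceled.
                    token.ThrowIfCancellationRequested();
                }, token);

                started.Wait();
                cts.Cancel();

                try
                {
                    worker.Wait();
                }
                catch (AggregateException)
                {
                    // Waiting on a canceled task throws; the state is checked below.
                }
                return worker.IsCanceled;
            }
        }

        public static void Main()
        {
            Console.WriteLine(ReadViaFromAsync("hello")); // prints "hello"
            Console.WriteLine(CancelRunningTask());       // prints "True"
        }
    }
}
```

Note that the cancellation here is cooperative: the task body has to check the token itself.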

Task, async method and await expression

I would like to point out that async methods and await expressions in C# 5.0 are implemented at the compiler level on top of the task-based APM. An async method must have a return type of void, Task, or Task<T>. A Task or Task<T> return value represents the ongoing operation, so that the caller can in turn apply an await expression to the method call or attach a continuation; because await expressions build on the task-based APM, this is the natural return type for most async methods. A void return type gives the caller no way to await the method, so it is mainly suited to fire-and-forget cases such as event handlers. Note also that if an async method contains no await expression, its body simply runs synchronously, like a normal method.

To make this clear, I modified the code that calculates 2 to the power of n to use async and await, as follows:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
 
namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Task based APM demo");
            
            // Call Exponent() asynchronously.
            // And immediately return the control flow.
            // If I don't put a Task here, the program will sometimes
            // terminate immediately.
            Task t = new Task(async () =>
            {
                int result = await Program.Exponent(10);
 
                // After the operation is completed, the control flow will go here.
                Console.WriteLine(result);
            });
 
            t.Start();
            Console.ReadKey();
        }
 
        static async Task<int> Exponent(int n)
        {
            Console.WriteLine("Task started");
            // Task.Run is the .NET Framework 4.5 name; the Async CTP exposed it as TaskEx.Run.
            return await Task.Run<int>(() => 2 << (n - 1));
        }
    }
}

When I run this program, I get exactly the same result as the example shown in the task-based APM section.

You may still be confused by the above code and wonder how it works and what the exact control flow is. I will discuss this in detail in the coming articles.

Conclusion

The Microsoft .NET Framework provides many ways to perform asynchronous operations, and you may choose one or several of them depending on your case. Although there are various ways, some of them are not recommended, such as using the System.Threading.Thread class directly or using the event-based APM. The most popular ways are the thread pool and the task-based APM. In addition, the task-based APM is what the async methods and await expressions in C# 5.0 are built on.

Finally, I summarize the different asynchronous models in the following table for reference.

Pattern | Description | Based On | Notes
Thread based | By creating a System.Threading.Thread instance | Managed thread | Expensive, not recommended
Standard BeginXxx and EndXxx methods | By calling the BeginXxx method with a user callback and calling EndXxx inside that callback | Thread pool | Widely used, standard, recommended, supports cancellation and continuation
ThreadPool | By calling ThreadPool’s static QueueUserWorkItem method | Thread pool | Widely used, recommended to use as much as possible
Delegate | By calling the delegate’s BeginInvoke and EndInvoke instance methods | Thread pool | Less used
Event based | By subscribing to the appropriate event and calling the appropriate method | Thread pool | Avoid as much as possible, not recommended
Task based | By creating a System.Threading.Tasks.Task instance | A specified task scheduler | Recommended, supports all features of the thread pool pattern and has many other features
async method and await expression | By using the async and await keywords | Task based pattern | The new C# 5.0 asynchronous pattern