The Concurrency Runtime and Visual C++ 2010: Lambda Expressions

As a C++ programmer, you might have already started working with the Concurrency Runtime components in Visual Studio 2010 such as the Parallel Patterns Library (PPL) and the Asynchronous Agents Library. These libraries make it much easier to write parallel code that takes advantage of multi-core. I wanted to talk a bit about a few new features that are available in the Visual C++ 2010 compiler that make it even easier to write parallel code. This week I’ll start with lambda expressions, my favorite new language feature in Visual C++ 2010.

Lambda Expressions at a Glance

First we’ll take a look at a program that uses function objects, or functors, to determine which numbers in a random series are prime. Then I’ll explain a bit about lambda expressions and show you how to use them to simplify the program.

Consider the following program that determines which numbers in a random series are prime:

    1: // random-primes-functor.cpp
    2: // compile with: /EHsc
    3: #include <algorithm>
    4: #include <string>
    5: #include <random>
    6: #include <iostream>
    7:  
    8: // Determines whether the input value is prime.
    9: template<typename T>
   10: bool is_prime(T n)
   11: {
   12:    static_assert(std::is_integral<T>::value, "T must be of integral type.");
   13:  
   14:    if (n < 2)
   15:       return false;
   16:    for (T i = 2; i < n / i; ++i)
   17:    {
   18:       if ((n % i) == 0)
   19:          return false;
   20:    }
   21:    return true;
   22: }
   23:  
   24: // A functor class that inserts numbers that are prime at the back of the 
   25: // provided container type.
   26: // The template type Container must hold an integral type and also must 
   27: // support the push_back operation.
   28: template <typename Container>
   29: class InsertPrimes
   30: {
   31:    // Ensure that T is of integral type.
   32:    static_assert(std::is_integral<typename Container::value_type>::value, 
   33:       "Container must hold an integral type.");
   34:  
   35: public:
   36:    InsertPrimes(Container& primes)
   37:       : _primes(primes)
   38:    {
   39:    }
   40:  
   41:    // Inserts the number into the container if the number is prime.
   42:    void operator()(typename Container::value_type n) const
   43:    {
   44:       if (is_prime(n)) {                  
   45:          _primes.push_back(n);
   46:       }
   47:    }
   48:  
   49: private:   
   50:    Container& _primes; // The container to hold primes.
   51: };
   52:  
   53: // Prints values to the standard output stream in comma-separated form.
   54: // Type T must be supported by the wostream << operator.
   55: template <typename T>
   56: class PrintItems
   57: {
   58: public:
   59:    // Prints the provided element to the standard output stream.
   60:    // Elements are separated by a comma and a space character.
   61:    void operator()(const T& item)
   62:    {
   63:       std::wcout << _comma << item;
   64:       _comma = L", ";
   65:    }
   66: private:
   67:    std::wstring _comma; // Separates elements.
   68: };
   69:  
   70: int wmain()
   71: {
   72:    // Create a vector object that contains a few random integers.
   73:    std::vector<int> v(50);
   74:    std::generate(v.begin(), v.end(), std::mt19937(42));
   75:       
   76:    // Create a container that holds the elements of the array that 
   77:    // are prime.
   78:    std::vector<int> primes;
   79:  
   80:    // Insert the elements of the array that are prime at the back 
   81:    // of the container.
   82:    std::for_each(v.begin(), v.end(), InsertPrimes<std::vector<int>>(primes));
   83:  
   84:    // Print the results.
   85:    std::wcout << L"Found the following primes: " << std::endl;
   86:    std::for_each(primes.begin(), primes.end(), PrintItems<int>());
   87: }

 

This program produces the following output:

Found the following primes:

88409749, 911989541

Sidebar – In Visual C++ 2010, there is a more efficient way to populate the primes vector. Instead of using the for_each algorithm and a functor, you can use the std::copy_if algorithm:

 std::copy_if(v.begin(), v.end(), std::back_inserter(primes), is_prime<int>);

However, I used for_each in this example so that I can show you how to convert code that uses a functor to use a lambda expression and so that we can more easily compare for_each to its parallel counterpart, parallel_for_each (more on parallel_for_each later). – End Sidebar

In this example program, the std::for_each algorithm applies a function object to each element in the specified range of elements of a container. This program uses the for_each algorithm two times. The first use of for_each (line 82) applies a function object of type InsertPrimes to insert the numbers into a std::vector object; the second use (line 86) applies a function object of type PrintItems to print the results to the console.

Although the use of function objects is a great way to maintain state across a number of function calls, they can have some disadvantages:

  • They are verbose. You must create a type definition, member variables to hold data, a constructor to initialize data, operator() to process data, and so on.
  • They can be tedious to write. The structure of most function object types fit the same pattern. They often require many lines of code, but the action that you want to express is often just a minor portion of that code.
  • They are prone to errors. The more function object types that you write by hand, the greater the chance is that you’ll make a minor typing error or inadvertently use a reference type instead of a non-reference type (or vice-versa). In the InsertPrimes class in this example, consider what happens if you declare the _primes member variable as a reference type (Container& ), but declare the parameter to the constructor as a non-reference type (Container). (Spoiler: you won’t see any primes in your results!)
  • They don’t always lend to reuse. The InsertPrimes class serves a very specific purpose, and doesn’t necessarily lend itself to other uses. Function objects can add clutter to your code that another programmer (or yourself) may have to mentally untangle down the road.

 

Visual C++ 2010 introduces lambda expressions. A basic way to think of a lambda expression is as an anonymous function that maintains state and that can access the variables that are available to the enclosing scope. More simply put, think of a lambda expression as a way to describe the work that you want to do and have the Visual C++ compiler write the function object type for you.

I won’t go into the specifics of the syntax of lambda expressions here. For more details, see Stephan’s post Lambdas, auto, and static_assert: C++0x Features in VC10, Part 1 and Lambda Expressions in C++ and Lambda Expression Syntax on MSDN. With that said, let’s see how you might use lambda expressions to simplify our example program:

    1: // random-primes-lambda.cpp
    2: // compile with: /EHsc
    3: #include <algorithm>
    4: #include <string>
    5: #include <random>
    6: #include <iostream>
    7:  
    8: // Determines whether the input value is prime.
    9: template<typename T>
   10: bool is_prime(T n)
   11: {
   12:    static_assert(std::is_integral<T>::value, "T must be of integral type.");
   13:  
   14:    if (n < 2)
   15:       return false;
   16:    for (T i = 2; i < n / i; ++i)
   17:    {
   18:       if ((n % i) == 0)
   19:          return false;
   20:    }
   21:    return true;
   22: }
   23:  
   24: int wmain()
   25: {
   26:    // Create a vector object that contains a few random integers.
   27:    std::vector<int> v(50);
   28:    std::generate(v.begin(), v.end(), std::mt19937(42));
   29:       
   30:    // Create a container that holds the elements of the array that 
   31:    // are prime.
   32:    std::vector<int> primes;
   33:  
   34:    // Insert the elements of the array that are prime at the back 
   35:    // of the container.
   36:    // This version uses a lambda expression in the body of for_each.    
   37:    std::for_each(v.begin(), v.end(), [&primes](int n) {
   38:       if (is_prime(n)) {
   39:          primes.push_back(n);
   40:       }
   41:    });
   42:  
   43:    // Print the results.
   44:    std::wcout << L"Found the following primes: " << std::endl;
   45:    std::wstring comma;
   46:    std::for_each(primes.begin(), primes.end(), [&comma](int prime) {
   47:       std::wcout << comma << prime;
   48:       comma = L", ";
   49:    });
   50: }

This version replaces the InsertPrimes and PrintItems types with lambda expressions (lines 37 and 46). Not only is this version much shorter, but the action for both for_each calls is physically much closer to the rest of the program. To replace the InsertPrimes class, the updated example specifies the primes variable in the capture list of the lambda expression that is passed to the for_each algorithm. Because the lambda expression captures primes by reference, each iteration that the for_each algorithm performs can update it. To replace the PrintItems class, the updated example simply performs the action directly in the lambda expression.

Lambda Expressions and the Concurrency Runtime

Now suppose that instead of having only a few numbers to check for primality, you had many to check. Because the computation of each element of the array is independent of all other computations (a pattern that is often called “embarrassingly parallel”, or in a more modern context “delightfully parallel”), you can use the Concurrency::parallel_for_each algorithm that is provided by the PPL to enable the entire process to finish more quickly.

Sidebar – Note that the term many can be somewhat subjective when it comes to parallelism. It is often necessary to experiment to see what kind of work and how much work is required to overcome the overhead that is required to run your workloads concurrently and receive a performance benefit. Let’s say for argument’s sake that in this example 1,000 random values would benefit from being processed in parallel. – End Sidebar

You can think of the parallel_for_each algorithm as a parallel version of the for_each algorithm. Because the parallel algorithms in the PPL are designed to work much like those in the Standard Template Library, you can also pass lambda expressions to these parallel algorithms. The following example modifies the previous to process 1,000 numbers instead of 50 and uses the parallel_for_each algorithm to process the data. This example also uses Concurrency::concurrent_vector instead of vector to enable the loop body to append elements in a concurrency-safe manner.

    1: // random-primes-parallel.cpp
    2: // compile with: /EHsc
    3: #include <ppl.h>
    4: #include <concurrent_vector.h>
    5: #include <algorithm>
    6: #include <string>
    7: #include <random>
    8: #include <iostream>
    9:  
   10: // Determines whether the input value is prime.
   11: template<typename T>
   12: bool is_prime(T n)
   13: {
   14:    static_assert(std::is_integral<T>::value, "T must be of integral type.");
   15:  
   16:    if (n < 2)
   17:       return false;
   18:    for (T i = 2; i < n / i; ++i)
   19:    {
   20:       if ((n % i) == 0)
   21:          return false;
   22:    }
   23:    return true;
   24: }
   25:  
   26: int wmain()
   27: {  
   28:    // Create a vector object that contains many random integers.
   29:    std::vector<int> v(1000);
   30:    std::generate(v.begin(), v.end(), std::mt19937(42));
   31:       
   32:    // Create a concurrent container that holds the elements of the array
   33:    // that are prime.
   34:    Concurrency::concurrent_vector<int> primes;
   35:  
   36:    // Insert the elements of the array that are prime at the back of the 
   37:    // container.
   38:    Concurrency::parallel_for_each(v.begin(), v.end(), [&primes](int n) {
   39:       if (is_prime(n)) {
   40:          primes.push_back(n);
   41:       }
   42:    });
   43:  
   44:    // Print the results.
   45:    std::wcout << L"Found the following primes: " << std::endl;
   46:    std::wstring comma;
   47:    std::for_each(primes.begin(), primes.end(), [&comma](int prime) {
   48:       std::wcout << comma << prime;
   49:       comma = L", ";
   50:    });
   51: }

 

Note that these examples use the fully-qualified name to call out what is provided by the PPL (namespace Concurrency) and what is provided by the STL (namespace std).

This program produces the following sample output (because parallel_for_each does not act on the elements in a pre-defined order, the order can vary from run to run):

Found the following primes:

88409749, 762097033, 819956639, 911989541, 229717603, 1388146037, 601656257, 239964793, 510319067, 1060324619, 333867739, 502798993, 1669356239, 1613116507, 166793897, 312514729, 1482069727, 1116476131, 1165435217, 748898099, 187276799, 1547007029, 1038747649, 1029878879, 394274011, 2036561099, 73706383, 1120529587

Although lambda expressions are a great way to more productively write C++ code, you might want to continue to use function pointers or function objects in the following cases:

  • You have existing code that uses function objects. Because the code that the Visual C++ compiler produces for a lambda expression is similar to that of a hand-written function object, replacing your existing code with lambda expressions will likely not add any performance benefits. (However, it might be worth converting your code that uses function pointers to use lambda expressions because in the Visual C++ compiler lambda expressions are better candidates for optimizations such as inlining).
  • The body of your function object is reusable. The use of function pointers or function objects may be more appropriate if the work of your lambda functions can be used in multiple places or you are required to unit-test this code separately. For example, it can be completely fine to keep the InsertPrimes or PrintItems types if they can be used by other parts of your program.
  • You can take advantage of predefined functions or function objects, such as those provided by the Standard Template Library. Types such as std::plus and std::mt19937 provide many operations for you.
  • You require recursion. A function or function object can call itself recursively; a lambda expression cannot.

 

Lambda expressions are not a cure-all for all of the disadvantages of function pointers and function objects. For example, you still need to be careful how you capture variables. Otherwise, you may inadvertently modify a variable that you did not intend to (or not modify a variable that you intended to modify). You can see a quick example of this on MSDN. Or you can get the full story in Bill Messmer’s excellent Patterns and Practices document (check out the section “Careless Reference Captures”). Although you’ll often see examples capture everything by reference ( [&] ) or by value ( [=] ), in your real-world code it’s best to specify explicitly which variables should be captured by reference and which ones should be captured by value.

For more examples that use lambda expressions with the Concurrency Runtime, be sure to check out the Concurrency Runtime documentation. I hope that lambda expressions become your favorite C++ language feature as well. Enjoy!

 

Next week we’ll take a look at how the Visual C++ team made the auto keyword more useful and how to use it with the Concurrency Runtime.