Parallel STL – Democratizing Parallelism in C++

Only a few years ago, writing parallel code in C++ was the domain of experts. Today the field is becoming more accessible to regular developers thanks to advances in libraries and programming models such as PPL and C++ AMP from Microsoft, Intel’s Threading Building Blocks, OpenMP or OpenACC if you prefer a pragma-style approach, OpenCL for low-level access to heterogeneous hardware, CUDA and Thrust for programming NVidia devices, and so on.

The C++ Standard is catching up too, giving us the fundamentals such as the precisely defined memory model, and the basic primitives like threads, mutexes, and condition variables. This allows us to understand how atomics, fences and threads interact with the underlying hardware, reason about data races and so on. This is extremely important, but now we also need higher level algorithms that are commonly found in many of the popular parallel libraries.

Over the last few years, a group of software engineers from Intel, Microsoft and NVidia have worked together on a proposal for the ISO C++ Standard known as the “Parallel STL”.

This proposal builds on the experience of these three companies building parallel libraries for their platforms — the Threading Building Blocks (Intel), PPL and C++ AMP (Microsoft) and Thrust (NVidia). All these libraries have a common trait — they allow developers to perform common parallel operations on generic containers. Naturally, this aligns very well with the goals of the C++ Standard Template Library.

All three companies are working on their implementations of the proposal. Today, we’re pleased to announce that Microsoft has made the prototype of the proposal available as an open source project at

We encourage everyone to head over to our CodePlex site and check it out.

The ISO C++ Standards Committee has approved the proposal as the foundation for the “Parallelism Technical Specification”, meaning that enough people on the Committee are interested in incorporating this proposal into the next major version of the C++ Standard. Needless to say, this set of people includes the representatives of Intel, Microsoft and NVidia, all of whom are active members of the Committee.

For those familiar with the STL, using Parallel STL should be easy. Consider an example of sorting a container named data using the STL function std::sort:

sort(data.begin(), data.end());

Parallelizing this code is as easy as adding the parallel execution policy as the first parameter to the call:

sort(par, data.begin(), data.end());

Naturally, there is a little more to it than meets the eye. The parallel version of sort and the execution policy par are defined in a separate namespace, std::experimental::parallel, so you will need to either qualify the names explicitly or bring them in with a using directive. (The names in this namespace are expected to be promoted to std once this becomes part of Standard C++.)
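To make the namespace point concrete, here is a minimal sketch. It is not code from the prototype: it assumes a C++17 standard library, where the execution policies live in std::execution (header <execution>) rather than in std::experimental::parallel. The helper name sort_maybe_parallel is hypothetical, and the feature-test macro is there so the code falls back to the plain sequential overload on toolchains without parallel algorithms.

```cpp
#include <algorithm>  // std::sort; also defines __cpp_lib_parallel_algorithm when available
#include <vector>

#ifdef __cpp_lib_parallel_algorithm
#include <execution>  // std::execution::par (assumption: C++17 library)
#endif

// Hypothetical helper: sorts `data` in ascending order, requesting parallel
// execution when the standard library provides the parallel algorithms.
void sort_maybe_parallel(std::vector<int>& data) {
#ifdef __cpp_lib_parallel_algorithm
    // Explicit qualification of the policy instead of a using directive.
    std::sort(std::execution::par, data.begin(), data.end());
#else
    // Fallback: the ordinary sequential overload.
    std::sort(data.begin(), data.end());
#endif
}
```

Calling sort_maybe_parallel(v) on a std::vector<int> leaves v sorted either way; the policy only grants the library permission to use multiple threads. Some toolchains need an extra threading library at link time (GCC's implementation, for example, builds on Intel's Threading Building Blocks).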

As is always the case with parallelization, not every program will benefit from using the Parallel STL, so don’t just go sprinkling your STL code with par willy-nilly. You still need to find a bottleneck in your program that’s worth parallelizing. In some cases, your program will need to be rewritten to become amenable to parallelism.
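As one illustration of what "rewritten" can mean, the sketch below computes a sum of squares two ways. The function names are hypothetical, and the second version assumes the C++17 spelling of the parallel algorithms (std::transform_reduce with std::execution::par), not the experimental namespace of the prototype. The hand-written loop threads one accumulator through every iteration. The algorithm form states the per-element work and the combining step separately, which is what lets a library split the range across threads.

```cpp
#include <numeric>  // std::transform_reduce; also defines __cpp_lib_parallel_algorithm
#include <vector>

#ifdef __cpp_lib_parallel_algorithm
#include <execution>   // std::execution::par (assumption: C++17 library)
#include <functional>  // std::plus
#endif

// Hand-written loop: `total` carries a dependency from one iteration to the next.
long long sum_of_squares_loop(const std::vector<int>& v) {
    long long total = 0;
    for (int x : v) {
        total += static_cast<long long>(x) * x;
    }
    return total;
}

// Algorithm form: each element is squared independently, then the partial
// results are combined. The combining order is unspecified, which is fine
// here because integer addition is associative and commutative.
long long sum_of_squares_reduce(const std::vector<int>& v) {
#ifdef __cpp_lib_parallel_algorithm
    return std::transform_reduce(
        std::execution::par, v.begin(), v.end(), 0LL, std::plus<>(),
        [](int x) { return static_cast<long long>(x) * x; });
#else
    // Fallback for toolchains without parallel algorithms.
    return sum_of_squares_loop(v);
#endif
}
```

Both functions return the same value for the same input. Whether the second is actually faster depends on the input size and the hardware, so measure before committing to it.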

Where do we go from here?

As mentioned above, the project is still experimental. While the effort is driven by three major companies, and there is strong interest from the ISO C++ Committee and the C++ community in general, we still have a way to go before Parallel STL becomes part of the C++ Standard. We expect that the draft will undergo changes during the standardization process, so keep this in mind when working with the prototype.

Your feedback is important, and there are a number of ways to get engaged. You can leave a comment below, send email to or head over to and start a discussion.

Artur Laksberg
Visual C++ Team, Microsoft