First non-Microsoft C++ AMP Implementation Leaves Dock, Shows Glimpses of Future

Alex Voicu

September 2nd, 2014

One of the claims made upon initially announcing C++ AMP was that it would be a portable, extensible standard. Parsing the open specification makes it clear that the design is true to that goal, but we were short of an actual proof. In one fell swoop, the good folks at AMD, working in tandem with MulticoreWare, have removed this final concern by introducing an open source implementation of C++ AMP. It thus becomes possible to use C++ AMP on Windows, Linux or OS X, exploiting a plethora of hardware platforms. Since we have discussed this project in the past, you can get a better understanding of it by perusing our older blog post.

The other (possibly more important) bit of news is that this is also the first case of a partner working to extend the standard before the core language specification is updated. This is a fortunate development, as the constraints under which the core language is grown (maximizing coverage by way of remaining close to the lowest common denominator) can sometimes delay alignment with the state of the art. It is also a model that has been shown to yield benefits for core C++, where Boost is an excellent example of extensions that give insight into the future evolution of the standard. Let us take a quick peek at the additions:

Shared Virtual Memory (SVM) – makes it possible to straightforwardly share data-structures between host and accelerator, by e.g. directly capturing them in a restrict(amp) lambda, as opposed to funnelling them through concurrency::array or concurrency::array_view;
C++11 atomics and memory ordering – when coupled with SVM, these open the door to constructing efficient synchronization primitives that work across host and accelerator (or, for the brave, to writing lock-free code);
Dynamic memory allocation and de-allocation (AKA operator new & operator delete) in restrict(amp) functions.
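
To make the second item more concrete, here is a minimal sketch of the kind of synchronization primitive that C++11 atomics and memory ordering allow: a spin lock built on acquire/release operations. Please note that this is plain, host-side standard C++11; it does not use AMD's extensions or any C++ AMP API, and whether the same code would compile unchanged inside a restrict(amp) function is an assumption we have not verified. The names SpinLock and add_under_lock are ours, chosen purely for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>   // std::lock_guard

// Illustrative spin lock (hypothetical name, not part of any C++ AMP API).
// Acquire ordering on lock and release ordering on unlock ensure that
// writes made inside the critical section are visible to the next owner.
class SpinLock {
public:
    void lock() noexcept {
        // exchange returns the previous value: true means someone else holds it.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // Spin on a cheap relaxed load until the lock looks free again.
            while (locked_.load(std::memory_order_relaxed)) {}
        }
    }

    bool try_lock() noexcept {
        // Succeeds only if the flag was previously false.
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() noexcept {
        // Debug-only sanity check: unlocking a free lock is a usage error.
        assert(locked_.load(std::memory_order_relaxed));
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical helper: increments `counter` `times` times, taking the lock
// for each increment, and returns the final value.
int add_under_lock(SpinLock& lock, int& counter, int times) {
    for (int i = 0; i < times; ++i) {
        std::lock_guard<SpinLock> guard(lock);  // RAII: unlocks on scope exit
        ++counter;
    }
    return counter;
}
```

Because SpinLock provides lock(), try_lock() and unlock(), it works with std::lock_guard, so the critical section is released even if the guarded code exits early. Calling add_under_lock from several threads that share the same lock and counter would serialize the increments; with SVM, the expectation is that host and accelerator code could coordinate through a shared flag in a similar manner.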

You will note that all of the above are great productivity enhancers, which help smooth out certain wrinkles and bring us closer to being “just C++” with no restrictions. When mature, they will make it easier for newcomers to heterogeneous computing to dive right in, without having to figure out how to map their pre-existing data structures into array friendly forms.

The following reasonable question is likely to arise: when are we going to see the core language specification updated to include these extensions? As much as we like to be on the bleeding edge at all times, in this regard our hands are tied until vendor-agnostic ways of exposing e.g. SVM are introduced. AMD’s extensions are restricted to a subset of their hardware, namely processors from the Kaveri family; restricting ourselves in this way is a luxury we cannot afford. In the interim, we remain fully committed to ensuring that Visual Studio yields the best C++ AMP development experience on the market, and encourage you to play with the extensions offered by AMD to get an early preview of where the standard should eventually go. Your feedback will prove invaluable in shaping our decisions going forward. Finally, please be aware that whatever is written against the core C++ AMP specification will seamlessly work across both Visual Studio and AMD’s extended implementation; targeting this baseline ensures maximum portability. The non-portable bits are, respectively, AMD’s extensions, which we do not currently implement, and the DirectX-specific elements that we provide. If you employ either of the two, you forfeit the other and have to use the supporting tool-chain.

In closing, we would like to extend our gratitude to our colleagues at AMD and MulticoreWare for their excellent work.