C++ AMP in a nutshell


With Visual Studio 2012, you are able to get your hands on a new technology that enables you to tap into the power of heterogeneous hardware and specifically take advantage of accelerators such as the GPU for compute purposes: C++ AMP.

So you’ll create an empty C++ project, add a new code file, stick a #include <amp.h> at the top, then add a using namespace concurrency; and then what? I suggest you type in our example C++ AMP matrix multiplication code and try it out on your DirectX 11 hardware, or download one of our many C++ AMP samples, including the simplistic “Hello World” code.
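For readers who want something to type in right away, here is a minimal sketch of what such a matrix multiplication could look like. It is not necessarily identical to the sample the post refers to. The function names (multiply_cpu, multiply_amp, multiply) and the HAS_CPP_AMP macro are made up for this illustration. The C++ AMP path is compiled only under Visual C++, where amp.h ships; elsewhere the sketch falls back to a plain CPU loop so the two results can be compared.

```cpp
#include <vector>

#if defined(_MSC_VER) && !defined(__clang__)
// Newer Visual Studio releases mark amp.h as deprecated; this define silences that.
#define _SILENCE_AMP_DEPRECATION_WARNINGS
#include <amp.h>
#define HAS_CPP_AMP 1  // hypothetical macro, used only in this sketch
#endif

// CPU reference: C (M x N) = A (M x W) * B (W x N), all stored row-major.
void multiply_cpu(std::vector<float>& c, const std::vector<float>& a,
                  const std::vector<float>& b, int M, int N, int W) {
    for (int row = 0; row < M; ++row) {
        for (int col = 0; col < N; ++col) {
            float sum = 0.0f;
            for (int i = 0; i < W; ++i)
                sum += a[row * W + i] * b[i * N + col];
            c[row * N + col] = sum;
        }
    }
}

#ifdef HAS_CPP_AMP
// Same computation expressed with C++ AMP: one GPU thread per output element.
void multiply_amp(std::vector<float>& c, const std::vector<float>& a,
                  const std::vector<float>& b, int M, int N, int W) {
    using namespace concurrency;
    array_view<const float, 2> av(M, W, a);  // 2D view over existing CPU memory
    array_view<const float, 2> bv(W, N, b);
    array_view<float, 2> cv(M, N, c);
    cv.discard_data();                       // c is write-only, skip the upload
    parallel_for_each(cv.extent, [=](index<2> idx) restrict(amp) {
        int row = idx[0];
        int col = idx[1];
        float sum = 0.0f;
        for (int i = 0; i < W; ++i)
            sum += av(row, i) * bv(i, col);
        cv[idx] = sum;
    });
    cv.synchronize();                        // copy the result back into c
}
#endif

// Picks the C++ AMP version when it was compiled in, the CPU loop otherwise.
void multiply(std::vector<float>& c, const std::vector<float>& a,
              const std::vector<float>& b, int M, int N, int W) {
#ifdef HAS_CPP_AMP
    multiply_amp(c, a, b, M, N, W);
#else
    multiply_cpu(c, a, b, M, N, W);
#endif
}
```

To try it, fill two std::vector<float> inputs in row-major order, size the output vector to M * N, and call multiply(c, a, b, M, N, W).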

Then, you can play around with all the new constructs we’ve added, guided by the description of each in the following posts:

Beyond the core API above, there is even more for you to explore

If the above is not enough reading, you can read the 130 pages of the C++ AMP open specification.

Beyond the API and language, there is fantastic Visual Studio integration (IntelliSense, code navigation, project and build system, etc.) and in particular I encourage you to explore the support for

You’ll no doubt have questions and feedback, and we truly look forward to taking those in our dedicated Native Parallelism forum. Enjoy!

Comments (4)

  1. Tommy says:

    Hi Daniel,

    Thanks for sharing these.

    I am currently converting CUDA-style code to C++ AMP style, and I am wondering how to replace several CUDA keywords. Can you advise on what in C++ AMP can replace the following CUDA code?

    1. __align__(16)

    2. Cuda::HostMemoryReference1D<int>

    3. DeviceMemoryLinear2D<int>

    4. DeviceMemoryPitched3D<int>

    5. cuda_safe_call() related

    5.1 cuda_safe_call( cudaGetLastError() );

    5.2 cuda_safe_call( cudaUnbindTexture(…) );

    5.3 cuda_safe_call( cudaDeviceSynchronize() );

    6. cudaBindTexture()

    Thanks in advance!

  2. Looking forward to the answers to the previous questions.

  3. 1. __align__(16)

    >> The equivalent of this capability in the Microsoft Visual C++ compiler is __declspec(align(#)), which allows you to control the alignment of user-defined data.
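As a rough illustration of the alignment answer above, the sketch below declares a 16-byte aligned struct. The ALIGN16 macro, the Float4 type, and the is_aligned16 helper are hypothetical names chosen for this example. It uses __declspec(align(16)) under Visual C++ and assumes the standard alignas(16) specifier is an acceptable stand-in on other compilers.

```cpp
#include <cstdint>

// Hypothetical macro: MSVC spelling where available, standard alignas otherwise.
#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16 alignas(16)
#endif

// User-defined data forced onto a 16-byte boundary, as __align__(16) would do in CUDA.
struct ALIGN16 Float4 {
    float x, y, z, w;
};

// Returns true when the pointer value is a multiple of 16 bytes.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Checking alignof(Float4) or calling is_aligned16 on the address of a Float4 object is a quick way to confirm the attribute took effect.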

    2. Cuda::HostMemoryReference1D<int>

    >> The C++ AMP concurrency::array_view type enables multidimensional views over existing CPU memory.

    3. DeviceMemoryLinear2D<int>

    >> The C++ AMP concurrency::array and concurrency::array_view types are equivalent abstractions of multidimensional data containers.

    4. DeviceMemoryPitched3D<int>

    >> There is no direct equivalent of this in C++ AMP. However, if you want to use multidimensional data with a specific pitch, you can achieve the same by using the "section" capability of the array and array_view types in C++ AMP. Note that when doing this, you are responsible for defining the pitch, unlike CUDA pitched allocations, where the CUDA runtime determines the pitch.
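To make "defining the pitch yourself" concrete, here is a CPU-only sketch of the arithmetic involved. It is an illustration under assumptions, not C++ AMP code: Pitched3D and round_up are hypothetical names, and the pitch is counted in elements rather than bytes. In C++ AMP the same idea would presumably be expressed by allocating an array whose innermost extent is the padded pitch and then taking a section that covers only the logical width.

```cpp
#include <cstddef>
#include <vector>

// Round a row width (in elements) up to the next multiple of `multiple`.
std::size_t round_up(std::size_t width, std::size_t multiple) {
    return (width + multiple - 1) / multiple * multiple;
}

// Hypothetical 3D container whose rows are padded to a caller-chosen pitch.
struct Pitched3D {
    std::size_t depth, height, width, pitch;
    std::vector<int> data;

    Pitched3D(std::size_t d, std::size_t h, std::size_t w, std::size_t multiple)
        : depth(d), height(h), width(w),
          pitch(round_up(w, multiple)),
          data(d * h * round_up(w, multiple)) {}

    // Element (z, y, x): rows are `pitch` elements apart, slices are height * pitch apart.
    int& at(std::size_t z, std::size_t y, std::size_t x) {
        return data[(z * height + y) * pitch + x];
    }
};
```

Only the first `width` elements of each padded row hold logical data; the remaining `pitch - width` elements are padding that a section-style view would leave out.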

    5. cuda_safe_call() related

    5.1 cuda_safe_call( cudaGetLastError() );

    >> AFAIK cuda_safe_call is just a macro for better error diagnostics for CUDA API calls in debug mode. C++ AMP uses exceptions for runtime errors, and when you compile C++ AMP programs in debug mode, you automatically get detailed debug diagnostics to help you better understand the error.

    5.2 cuda_safe_call( cudaUnbindTexture(…) );

    6. cudaBindTexture()

    >> Please refer to our blog post on textures to learn about texture capabilities in C++ AMP.

    5.3 cuda_safe_call( cudaDeviceSynchronize() );

    >> accelerator_view::wait is the equivalent C++ AMP API.

    Please feel free to ask any further questions on our MSDN forum.

  4. Hi Amit, many thanks for your detailed answer. 🙂