Following up on the previous post in which I introduced Fast Fourier transforms (FFT) on the GPU, in this post I will talk about the C++ AMP FFT Library and explain how to use it in your application.
As noted in the previous post, DirectX already contains an FFT API, so all that was left for us to do was provide a simple C++ AMP wrapper on top of it. That wrapper is available as the C++ AMP FFT Library on CodePlex. To use it, follow these steps:
- Download the C++ AMP FFT Library from CodePlex. The library is in the form of a DLL and includes the headers and lib files needed to link to the FFT library DLL. The library sources are also available for download from CodePlex and can be built in the Visual Studio 2012 IDE using the provided Visual Studio project file.
- Include inc\amp_fft.h in your source.
- Compile your .cpp source files.
- Link the resulting object files with the amp_fft.lib or amp_fftd.lib available as part of the library.
If you download the library sources, the sample directory has an example Visual Studio project showing how to use the C++ AMP FFT Library to perform forward and inverse transforms.
Let’s go over a very simple use of the fft class in the C++ AMP FFT Library and explain it.
1. fft<float,1> fft_transform(extent<1>(100));
2. array<float,1> input_array(extent<1>(100), pointer_to_input_data);
3. array<std::complex<float>,1> output_array(extent<1>(100));
4. fft_transform.forward_transform(input_array, output_array);
The first line creates the transform object, of type fft<float, 1>. Note that, like C++ AMP arrays, the transform type captures the element type of the transformation and its dimension:
- Element type. This must be either float, or std::complex<float>. No other types are supported at this point.
- Dimension. The dimension can be 1, 2 or 3. DirectX can actually handle higher dimensions, but the C++ AMP library currently only supports up to 3D transforms. Higher dimensions may be supported in the future.
Also, like concurrency::array, an fft object is initialized with a C++ AMP extent object. The extent associated with an fft determines the shape of the arrays that can be transformed by the fft object. Because FFTs are very sensitive to input sizes, a single fft object can only handle a single size, which is why you have to specify it at initialization time.
After the fft object is created, you'd typically cache it and reuse it to transform many different inputs, although they must all have the same extent.
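One way to picture the caching advice above is a small lookup table that builds one transform object per size and hands back the same object on later requests. The sketch below is plain C++ and does not use the FFT library itself: `transform_plan` is a hypothetical stand-in for an object such as fft<float,1>, and `plan_cache` is an illustrative name, not part of the library.

```cpp
#include <cstddef>
#include <map>
#include <memory>

// Hypothetical stand-in for an expensive, size-specific transform object
// such as fft<float,1>; it only records the size it was built for.
struct transform_plan {
    explicit transform_plan(std::size_t n) : size(n) {}
    std::size_t size;
};

// Keeps one plan per input size and builds it on the first request only.
template <typename Plan>
class plan_cache {
public:
    Plan& get(std::size_t n) {
        auto it = plans_.find(n);
        if (it == plans_.end()) {
            it = plans_.emplace(n, std::make_unique<Plan>(n)).first;
            ++constructed_;
        }
        return *it->second;
    }

    // Number of plans built so far (useful for checking reuse).
    std::size_t constructed() const { return constructed_; }

private:
    std::map<std::size_t, std::unique_ptr<Plan>> plans_;
    std::size_t constructed_ = 0;
};
```

Calling `cache.get(100)` twice returns the same object, so the cost of setting up a transform for a given size is paid once. A real cache for 2D or 3D transforms would need to key on the full extent rather than a single size.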
Lines 2 and 3 define one such input and its output, and line 4 transforms the input into the output. As is typical with C++ AMP, the output is left on the accelerator, and you choose whether and when to copy it back to the CPU. You could, for example, use the output of an FFT in a subsequent parallel_for_each call, without ever bringing the results back to main memory.
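If you do copy a small result back to the CPU, it can be handy to compare it against a straightforward reference. The following is a minimal sketch of a naive forward DFT from real input to complex output, written in portable C++ without the FFT library. The function name `reference_dft` is made up for illustration, and the sign convention (negative exponent) and lack of scaling are assumptions; the library's own conventions may differ, so check them before comparing values.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(N^2) forward DFT of a real signal, intended only for checking
// small results. Assumes X[k] = sum_j x[j] * exp(-2*pi*i*k*j/N) with no
// scaling; this convention is an assumption, not taken from the library.
std::vector<std::complex<float>> reference_dft(const std::vector<float>& input) {
    const std::size_t n = input.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<float>> output(n);
    for (std::size_t k = 0; k < n; ++k) {
        // Accumulate in double to limit rounding error in the reference.
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle =
                -two_pi * static_cast<double>(k * j) / static_cast<double>(n);
            sum += static_cast<double>(input[j]) *
                   std::complex<double>(std::cos(angle), std::sin(angle));
        }
        output[k] = std::complex<float>(static_cast<float>(sum.real()),
                                        static_cast<float>(sum.imag()));
    }
    return output;
}
```

For a constant input of N ones, this reference puts N in the first output element and values close to zero everywhere else, which makes a quick sanity check.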
Because the FFT library is implemented using DirectCompute and not using C++ AMP, it has some disadvantages when used from C++ AMP.
- The FFT library only works with floats and complex numbers based on floats. It doesn't work with doubles, and it doesn't take advantage of the new C++ AMP high-accuracy double-precision math library for the GPU.
- The library only accepts arrays, not array_views. This is because we can only provide interoperation with DirectX buffers on the basis of whole arrays, rather than array views, which don't have a good counterpart concept in DirectX.
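Given the float-only restriction, data held in double precision would have to be narrowed on the CPU before it is placed in an array<float,1>. The sketch below shows one way to do that in portable C++; `narrow_to_float` is a hypothetical helper name, not something the library provides, and the conversion discards precision beyond what float can represent.

```cpp
#include <algorithm>
#include <vector>

// Narrows double-precision samples to float so they can be used to fill
// a float-based array; precision beyond float's range of roughly seven
// significant decimal digits is lost in the conversion.
std::vector<float> narrow_to_float(const std::vector<double>& samples) {
    std::vector<float> narrowed(samples.size());
    std::transform(samples.begin(), samples.end(), narrowed.begin(),
                   [](double value) { return static_cast<float>(value); });
    return narrowed;
}
```

The resulting vector's data pointer could then play the role of pointer_to_input_data in the example earlier in this post.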
In my next post on the topic of FFT, I will share a test program that illustrates higher-dimensional transforms and inverse transforms.