Random Number Generator Sample for C++ AMP

Mersenne Twister(MT) Pseudo Random Number Generator (PRNG) is one of the common PRNG. With this post I am sharing a C++ AMP implementation of Mersenne Twister based on a pseudorandom number generator algorithm developed by Makoto Matsumoto and Takuji Nishimura. Like other PRNG, MT is iterative and is hard to parallelize a single twister update step among several execution threads. To solve this problem and to enable efficient implementation of Mersenne twister on parallel architectures, Makoto Matsumoto and Takuji Nishimura developed an offline library for dynamic creation of MT parameters.

Below is the explanation of my C++ AMP sample ported from the above reference.


main – Entry point

Here the Mersenne Twister parameters, generated by an offline tool, are loaded from .dat file. These parameters are copied to a vector and the generate_rand_on_amp function is called to generate a bunch of floating point random numbers. Eventually the random numbers are dumped to a text file.


generate_rand_on_amp – C++ Implementation

This function is called from the main with the twister parameters and an output container. These inputs and output are encapsulated using array objects and dispatched to run on the accelerator. If needed (controlled by the apply_transform parameter) a Box-Muller transform is applied to the generated random numbers. Before returning, the calculated data is copied to host memory.


rand_MT_kernel – C++ AMP Mersenne Twister kernel

The core logic is best described in the research paper mentioned above. I will briefly mention the mapping of key parameters (in italics) which should help you understand the relationship of the code to the paper. A Mersenne twister is completely defined by 11 parameters, only the bit-vector parameters vary on a per-thread basis for the same period and seed : matrix_a – the lower row of matrix A ; mask_b, mask_c – tempering masks of x T transformation. The rest of the (integer) parameters are shared among all the threads, and are inline in the source.


box_muller_kernel and box_muller_transform – C++ AMP Box-Muller kernel

Uniformly distributed random number sequences, produced by Mersenne twisters or any other RNG, can be transformed into normal distribution with the Box-Muller transformation, which is implemented as a separate kernel. box_muller_kernel is called from generate_rand_on_amp. In this function, each thread picks two samples for transformation and the returned data is saved to the GPU memory. box_muller_transform function is the core of the Box-Muller transform, given two samples from the uniform distribution in the interval (0, 1] maps to normally distributed samples with standard deviation 1.


Download the sample

Please download the attached project of the Mersenne Twister RNG that we discussed here and run it on your hardware, and try to understand what the code does and to learn from it. You will need, as always, Visual Studio 11.

As with all the samples we post, they are meant to help you get started with C++ AMP. Once you are up and running, you may be able to come up with a better C++ AMP implementation of one of the algorithms. If you do, please share it with us.


Comments (3)

  1. Matt says:

    Is there anything similar for the PPL?

  2. Stefano says:

    I have a question. I am running your source because I need to generate a normal random variable. I noticed that it generates very strange distributions. This is related to the strange generation that I have for uniform distributed variable. Those present some kind of clustering that brings to a strange generation of the normal variable. Did you have the same problem ?

    Best regards


  3. davidhunter22 says:


    In the code it has this

    #if defined (_M_X64)

       #error ERROR: This sample may not produce valid results on x64/amd64 platform.


    Can you shed some light on this? The word "may" seems a bit odd in this. Also does this mean when doing a 64 bit build or when running on a 54 bit machine. Is the issue that the code uses types like size_t that are different on 32 and 64 bit build?


Skip to main content