Getting random numbers in a thread-safe way

It’s very common in a parallel application to need random numbers for this or that operation.  For situations where random numbers don’t need to be cryptographically-strong, the System.Random class is typically a fast-enough mechanism for generating values that are good-enough.  However, effectively utilizing Random in a parallel app can be challenging in a high-performance manner.

There are several common mistakes I’ve seen developers make.  The first is to assume that Random is thread-safe and is ok to be used concurrently from multiple threads.  This is a bad idea, and can have some drastic consequences on the quality of the random numbers (degrading them well below the “good enough” bar).  For an example of this, consider the following program:

using System;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        Random rand = new Random();

        Parallel.For(0, 1000000, (i, loop) =>
        {
            if (rand.Next() == 0) loop.Stop();
        });

        while (true)
        {
            Console.WriteLine(rand.Next());
            Console.ReadLine();
        }
    }
}

While it won’t happen every time you run the app, on a multi-core machine it’s likely that once the Parallel.For in this example exits, rand.Next() will always return 0.  Not very random, is it?  This is due to an implementation detail of Random that does not tolerate multithreaded access (it wasn’t designed for it, and the docs explicitly call out that Random should not be used from multiple threads, as they do for most types in the .NET Framework).

Another common mistake is to fix a problem like that in the previous example by instantiating a new Random instance each time a random number is needed.  This can also affect the quality of the random numbers.  When a seed isn’t explicitly provided to Random through its constructor, it internally uses Environment.TickCount (at least in the current release) as the seed.  TickCount returns a value that’s heavily influenced by the resolution of the system timer, and thus multiple calls to TickCount in a row are likely to yield the same value (on the system on which I’m writing this post, TickCount changes in value approximately once every 15 milliseconds).  This means that Random will suffer a similar fate, as is exemplified in the following program:

using System;

class Program
{
    static void Main(string[] args)
    {
        int lastNumber = 0, count = 0;
        while (true)
        {
            // Get the next number
            int curNumber = new Random().Next();

            // If it’s the same as the previous number, 
            //
note the increased count
            if (curNumber == lastNumber) count++;

            // Otherwise, we have a different number than
            // the previous. Write out the previous number
            // and its count, then start
            // over with the new number.
            else
            {
                if (count > 0)
                    Console.WriteLine(
                        count + “: ” + lastNumber);
                lastNumber = curNumber;
                count = 1;
            }
        }
    }
}

This program continually creates new Random instances and gets a random value from each, counting the number of “random” values in a row that are the same.  When I run this app, I consistently see each number showing up thousands of times before the number changes; again, not very random.  The point of this example is that, in a multithreaded context, you may be creating a Random instance for each iteration, but many iterations across threads are likely to end up with the same value.

There are several approaches then one could take in using Random from multiple threads to avoid these kinds of issues.

One approach is to use a lock.  A shared Random instance is created, and every access to the Random instance is protected by a lock, e.g.

public static class RandomGen1
{
    private static Random _inst = new Random();

    public static int Next()
    {
        lock (_inst) return _inst.Next();
    }
}

This has the benefit of being easy to write, simple to explain, and obviously does the right thing.  Unfortunately, there’s an expensive lock on every access, and the more threads that want random numbers, the more contention there will be for this lock.  Sharing hurts.

Another approach is to use one Random instance per thread, rather than one per AppDomain (as is the case with the RandomGen1 approach shown above).  Here’s one way that might be implemented:

public static class RandomGen2
{
    private static Random _global = new Random();
    [ThreadStatic]
    private static Random _local;

    public static int Next()
    {
        Random inst = _local;
        if (inst == null)
        {
            int seed;
            lock (_global) seed = _global.Next();
            _local = inst = new Random(seed);
        }
        return inst.Next();
    }
}

An AppDomain-wide Random instance is maintained in order to provide seeds for new Random instances created for any new threads that come along wanting random numbers.  Each thread maintains its own Random instance in a ThreadStatic field, such that once initialized, calls to Next need only retrieve the ThreadStatic Random instance and use its Next method; no locks are necessary, and sharing is minimized.  I tested this in a loop like the following:

Stopwatch sw;
const int NUM = 1000000;

while (true)
{
    sw = Stopwatch.StartNew();
    Parallel.For(0, NUM, i =>
    {
        RandomGen1.Next();
    });
    Console.WriteLine(sw.Elapsed);

    sw = Stopwatch.StartNew();
    Parallel.For(0, NUM, i =>
    {
        RandomGen2.Next();
    });
    Console.WriteLine(sw.Elapsed);

    Console.ReadLine();
}

On my dual-core laptop using .NET 3.5 and the June 2008 CTP of Parallel Extensions, the ThreadStatic approach ended up running twice as fast as the lock-based version.  Moreover, as the number of cores increases, so too will the amount of contention on the lock in the lock-based approach, thus increasing the gap between the two approaches.  Additionally, a lot of work has been done in .NET 4.0 to improve the performance of things such as accessing [ThreadStatic].  When I run this same code on current builds of .NET 4.0 (again on a dual-core), I see the ThreadStatic version running more than four times faster than the lock-based version.  Even when there’s no contention on the lock in the RandomGen1 solution (simulated by switching the parallel loops to sequential loops), the ThreadStatic version in .NET 4.0 still runs significantly faster than the lock-based version.

Of course, if you really care about the quality of the random numbers, you should be using RNGCryptoServiceProvider, which generates cryptographically-strong random numbers (Addendum: davidacoder makes a good point in his comments on this post that while Random has certain statistical properties, using multiple Random instances as part of the same algorithm may change the statistical properties in unknown or undesirable ways).  For a look at how to get a Random-based facade for RNGCryptoServiceProvider, see .NET Matters: Tales from the CryptoRandom in the September 2007 issue of MSDN Magazine.  You could also settle on an intermediate solution, such as using an RNGCryptoServiceProvider to provide the seed values for the ThreadStatic Random instances in a solution like that in RandomGen2 (which would help to avoid another issue here, that of two threads starting with the same seed value due to accessing the global Random instance in the same time quantum):

public static class RandomGen3
{
    private static RNGCryptoServiceProvider _global =
        new RNGCryptoServiceProvider();
    [ThreadStatic]
    private static Random _local;

    public static int Next()
    {
        Random inst = _local;
        if (inst == null)
        {
            byte[] buffer = new byte[4];
            _global.GetBytes(buffer);
            _local = inst = new Random(
                BitConverter.ToInt32(buffer, 0));
        }
        return inst.Next();
    }
}