Selecting Random Seeds

A few months ago I ran across some test code that used GetTickCount() + timeGetTime() as the seed for a random number generator. Unfortunately, this was a bad choice for two reasons. First, the systems the test code ran on were rebooted frequently, so GetTickCount() returned similar values across a large set of test runs. Second, timeGetTime() is based on the same counter as GetTickCount() (both report milliseconds since the system started), so adding the two together gives you roughly twice the uptime and really doesn't buy you very much.

This of course raises the question, "How do you pick a good seed?" And if we take a step back, it is also worth asking, "How random/secure/performant does your data need to be?" For the sake of this post, we will assume that rand() is good enough, but that a truly random seed is desired. I should also make the distinction that this is for test code, not code that needs to be cryptographically secure, so while we want to reduce collisions, it isn't catastrophic if they do happen.

The goal of the original code was to inject a certain amount of randomness into the testing so that, over several months of automated testing on daily builds, coverage would increase (think fuzz testing). Unfortunately, there just wasn't enough variance built into the seed, so over an extended period of time the tests ended up reusing the same values.

So, in answer to the first question, it is fairly common practice to take a handful of diverse data sources and munge them together using a hash function (MD5, SHA-1, etc.). The hash value then becomes the seed. The following list gives some examples of possible inputs to the hash. It is obviously neither required nor practical to use all of them, so pick the ones that make the most sense for your application.

CryptGenRandom()

If you have access to this API, then it is a great way to get some random data on Windows and CE systems, and may be all you need for the seed (without having to bother with hashing).

XNetRandom()

This is the Xbox's version of CryptGenRandom().

Registry: HKLM \ SOFTWARE \ Microsoft \ Cryptography \ RNG

The "seed" value under this registry key is periodically updated with a new value.

GetTickCount() or QueryPerformanceCounter()

Throwing in the current tick count or performance counter can be a good way to perturb values across a single test run if you don't have access to something like CryptGenRandom().

Date

Assuming the current date/time is "real" (i.e. not being reset), then this will work to change things on a daily basis.

MAC Address

Providing a unique value about the computer/system you are running on will help prevent box A from running the same cases as box B.

Build Number

The build number will help prevent today's tests from being the same as tomorrow's.

GetCurrentProcessId() or GetCurrentThreadId()

Provides a small chance of getting a different number from one run to the next.

Network Latency

If you have a network-based application to begin with, then it should be fairly simple to capture some transaction timings.
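To make the "munge them together" step concrete, here is a minimal sketch in portable C. It is only an illustration under a few assumptions: FNV-1a (a small non-cryptographic hash) stands in for MD5/SHA-1 to keep the example short, and the names fnv1a_update(), make_seed(), and seed_test_run() are hypothetical helpers, not part of any Windows API. The machine ID and build number are passed in as strings; how you obtain them is up to your test harness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define FNV1A_OFFSET_BASIS 2166136261u
#define FNV1A_PRIME        16777619u

/* FNV-1a, 32-bit. A stand-in for the MD5/SHA-1 step; swap in a real
   hash if you want better mixing. */
uint32_t fnv1a_update(uint32_t hash, const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    size_t i;
    for (i = 0; i < len; ++i) {
        hash ^= p[i];
        hash *= FNV1A_PRIME;   /* wraps modulo 2^32 */
    }
    return hash;
}

/* Hypothetical helper: fold several of the inputs from the list above
   (machine ID, build number, date/time, a counter) into one seed. */
unsigned int make_seed(const char *machine_id, const char *build_number,
                       uint64_t timestamp, uint64_t counter)
{
    uint32_t h = FNV1A_OFFSET_BASIS;
    assert(machine_id != NULL && build_number != NULL);
    /* The terminating NUL is hashed too, so "ab"+"c" differs from "a"+"bc". */
    h = fnv1a_update(h, machine_id, strlen(machine_id) + 1);
    h = fnv1a_update(h, build_number, strlen(build_number) + 1);
    /* Raw bytes of the integers; byte order does not matter for a seed. */
    h = fnv1a_update(h, &timestamp, sizeof timestamp);
    h = fnv1a_update(h, &counter, sizeof counter);
    return (unsigned int)h;
}

/* Example wiring: seed rand() once at test startup, using the standard
   C clock sources in place of GetTickCount()/QueryPerformanceCounter(). */
void seed_test_run(const char *machine_id, const char *build_number)
{
    srand(make_seed(machine_id, build_number,
                    (uint64_t)time(NULL), (uint64_t)clock()));
}
```

A test would call something like seed_test_run("00-11-22-33-44-55", "build-1234") once before generating any values. Because the machine ID and build number go into the hash, box A and box B (or today's build and tomorrow's) start from different seeds even if their clocks happen to line up.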

Now, if you want to get really fancy, it wouldn't be all that hard to set up a web service that just dishes out hashes or values from CryptGenRandom() (or XNetRandom() if you happen to have a spare Xbox devkit lying around). Your tests could just grab a new seed value each time they run, ensuring a good starting point from which to crank through those test cases.
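For reference, pulling a seed straight from CryptGenRandom() (whether inside such a service or in the test itself) might look roughly like the sketch below. The helper names os_random_bytes() and seed_from_os() are hypothetical. The Windows branch uses CryptAcquireContext() with CRYPT_VERIFYCONTEXT and needs to link against advapi32. The non-Windows branch, which reads /dev/urandom, is my own assumption, added only so the sketch builds and runs elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#include <wincrypt.h>
#endif

/* Fills buf with len random bytes from the OS. Returns 1 on success,
   0 on failure. */
int os_random_bytes(void *buf, size_t len)
{
    assert(buf != NULL);
#ifdef _WIN32
    {
        HCRYPTPROV prov = 0;
        BOOL ok;
        /* CRYPT_VERIFYCONTEXT: no key container is needed just for RNG use. */
        if (!CryptAcquireContext(&prov, NULL, NULL, PROV_RSA_FULL,
                                 CRYPT_VERIFYCONTEXT))
            return 0;
        ok = CryptGenRandom(prov, (DWORD)len, (BYTE *)buf);
        CryptReleaseContext(prov, 0);
        return ok ? 1 : 0;
    }
#else
    {
        /* Assumption: a Unix-like system that exposes /dev/urandom. */
        FILE *f = fopen("/dev/urandom", "rb");
        size_t got;
        if (f == NULL)
            return 0;
        got = fread(buf, 1, len, f);
        fclose(f);
        return got == len;
    }
#endif
}

/* Hypothetical helper: get an unsigned int suitable for srand(). */
int seed_from_os(unsigned int *seed_out)
{
    return os_random_bytes(seed_out, sizeof *seed_out);
}
```

If seed_from_os() succeeds, the value can go straight into srand() without any hashing. If it fails, falling back to a hash of the other inputs listed earlier is a reasonable plan B.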