I played around a little with computing prime numbers last week, fun stuff but more on that for a another day…. When I discussed some thoughts on the subject with devs on the BCL team I found out (to my surprise) that we actually have code in the BCL for computing prime numbers… Now it is not directly callable publicly, but the source code is in rotor…

Anyone know where? Extra credit if you can explain why we use primes (and need to compute them)… even more extra credit if you can find a small bug in that code…

Brad,

haven’t looked at rotor, but my first guess is you’re generating primes for use in hashtables.

WM_MY0.02$

thomas woelfer

I’ve actually seen an array in the Hashtable containing prime numbers, while debugging.

Sure it’s hashtable.cs.

And that’s because it’s based in "Knuth’s Art of Computer Programming, Vol. 3, p. 528-9" đź™‚

Specifically, primes are used when the size of a hash table must be increased because of the load has reached a certain level. The size of the hashtable is increased such that the number of buckets in the hashtable increases to the smallest prime number that is larger than twice the current number of Hashtable buckets.

The reason for this is two fold. Using a prime number of buckets reduces the effect of clustering in the hashtable (having multiple keys hash to the same location). The size is at least doubled because by doing that the BCL can insure that over a series of N insert, the total time for all inserts takes no more than O(log(N)) total time.

Also Crypto stuff/Strong names?

It is used for the hashtable buckets. The hashtable.cs file has a list of static prime numbers and a function to return the prime number

private static int GetPrime(int minSize)

{

for (int i = 0; i < primes.Length; i++)

{

int size = primes[i];

if (size >= minSize) return size;

}

//outside of our predefined table.

//compute the hard way.

for (int j = (minSize | 1);j < Int32.MaxValue;j+=2)

{

if (Primep (j))

return j;

}

return minSize;

}

To the guy that mentioned crypto stuff. If it was all managed code then you would think so, but the probability is that most of it is done through P/Invoke to the CryptoAPI, which would reduce the change of prime generations showing up in the BCL code.

Hashtable was the obvious suspect… but there’s another one. Fire up Reflector and try a member search for "prime", and you’ll find that System.Windows.Forms.NativeWindow has its own hashtable code, obviously snarfed from Hashtable.

As to the bug, well I don’t see one in the Rotor code posted above, but in what Reflector shows me there’s a ((minSize – 2)|1) where Rotor has (minSize|1). I think the effect of this is that GetPrime(minSize) will actually return (minSize-2) whenever minSize is equal to P+1 or P+2, where P is a prime larger than can be found in the predefined table.

From a quick look at what is done with the return value from GetPrime() this does not look like it would break anything, but still I would call it a violation of the contract that is implied by naming the parameter ‘minSize’.

Also, the code relies on Int32.MaxValue being an odd number.

And there’s an inefficiency in Primep(), it recomputes the square root of the candidate on each iteration of the loop. I have just verified that the JIT does not optimize this away.

Cheers,

–Jonathan

Now that we all agree on hashtables,

I saw with Reflector a for-loop in Hashtable.GetPrime that is

looping over an int while it is less than maxint

This is a (minor) bug, when passing the maxint value

the int will wrap to a negative value…

// Ryan

The primes array is defined as:

// Table of prime numbers to use as hash table sizes. Each entry is the

// smallest prime number larger than twice the previous entry.

private readonly static int[] primes = {

11,17,23,29,37,47,59,71,89,107,131,163,197,239,293,353,431,521,631,761,919,

1103,1327,1597,1931,2333,2801,3371,4049,4861,5839,7013,8419,10103,12143,14591,

17519,21023,25229,30293,36353,43627,52361,62851,75431,90523, 108631, 130363,

156437, 187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403,

968897, 1162687, 1395263, 1674319, 2009191, 2411033, 2893249, 3471899, 4166287,

4999559, 5999471, 7199369

};

However, 17 is not the smallest prime number after 11 * 2 = 22, that would be 23. Likewise for each subsequent entry in the array. So when this is used by the GetPrime() function, you’re getting a smaller prime than you really wanted.

Ryan : this is the loop I was referring to when I wrote that the code relies on Int32.MaxValue being an odd number.

If minSize is already Int32.MaxValue then the for loop will not even be entered.

If minSize is >2147483629 (largest prime that is <Int32.MaxValue) and <Int32.MaxValue, the loop will run to Int32.MaxValue and correctly stop without overflow, since it iterates only on odd numbers. If Int32.MaxValue were even, the loop would jump right past it and continue with Int32.MinValue.

Then GetPrime() will return minSize itself. Which is another bug since minSize is not prime in this case. GetPrime() should just return Int32.MaxValue, since amazingly, it is M31, a Mersenne prime number.

This is all mostly of academical interest however: GetPrime()’s return value is used as the size of an array. 32-bit machines will have a hard time allocating a array of Int32.MaxValue elements…

Cheers,

–Jonathan

Lance: as you point out, the comment is incorrect. However I believe the intent was moved into the expand() function :

private void expand()

{

this.rehash(this.GetPrime(1 + (this.buckets.Length * 2)));

}

Cheers,

–Jonathan

Primes are most useful in asymmetrical encryption. 2 primes form your public and private keys, which, when combined, can encrypt/decrypt data in such a way that due to the nature of primes, you cannot derive the crypto key without having both prime-based "keys"

Dunno what the bug is, though.

HA! I just read some of the middle posts about the crypto stuff being just a wrapper for CryptoAPI. Whoops, true (maybe…it probably is, but I don’t know myself), although that is what I think of when I ponder the utility of primes (Gates said in "The Road Ahead" something about there being an infinite # of primes, hence their suitability as crypto keys)

The hashtable usage is pretty neat, though. I never really thought of that as the means of ensuring uniqueness in the buckets. I recall having to write a hashtable in college, but I had some algorithm that was much less computationally-expensive, yet more fallible (I think I ultimately had my buckets storing lists in case of collisions)

Most of the Crypto stuff is a wrapper around the Win32 CryptoAPI – except for Rijndael. However, Rijndael/AES is a symmetric cipher. Therefore, Keith’s post about primes in the Crypto BCL probably holds true.