Quiz: Primes in the BCL… part 2

Some good comments on my BCL prime quiz… but still one unanswered question (at the end)…


Thomas Woelfer quickly got that the place in the BCL for generating prime numbers is in Hashtable.cs and Eric Wilson does a good job explaining why…

Specifically, primes are used when the size of a hash table must be increased because of the load has reached a certain level. The size of the hashtable is increased such that the number of buckets in the hashtable increases to the smallest prime number that is larger than twice the current number of Hashtable buckets.


The reason for this is two fold. Using a prime number of buckets reduces the effect of clustering in the hashtable (having multiple keys hash to the same location). The size is at least doubled because by doing that the BCL can insure that over a series of N insert, the total time for all inserts takes no more than O(log(N)) total time

And I really love the reference to Juan Felipe Machado gives: “Knuth's Art of Computer Programming, Vol. 3, p. 528-9”, clearly a desktop reference favorite.   How long before it is common place to quote the SLAR v1, pp 239-243 😉


In addition, Norman Headlam gives some of the code in question, but the method I was really talking about is this one…

00149         private bool Primep (int candidate) {
00150             if ((candidate & 1) != 0) {
00151                 for (int divisor = 3; divisor < (int)Math.Sqrt (candidate); divisor+=2){
00152                     if ((candidate % divisor) == 0)
00153                         return false;
00154                 }
00155                 return true;
00156             }
00157             else {
00158                 return (candidate == 2);
00159             }
00160         }


A few of you pointed out some interesting issues, but not what I was looking for… the problem is in the code above and it results in some numbers being return as prime that are not really prime…The bug can be found for “normal” input ranges such as [1-100].


Comments (17)

  1. S N says:

    It should have been

    divisor <= (int)Math.Sqrt(candidate) in the for loop condition.

  2. Mike Dunn says:

    If you pass in 9, the function will return true because divisor starts at 3, which is not less than sqrt(9), so the for loop never runs and the code drops to the return true line.

  3. b. schwehn says:

    same if you put in 1 which by definition isn’t prime.

  4. b. schwehn says:

    btw, using divisor <= (int)Math.Sqrt(candidate) still returns true for candidate = 1

  5. for (int divisor = 3; divisor < (int)Math.Sqrt (candidate); divisor+=2)

    should be:

    sqrtcand = Math.Sqrt(candidate);

    for (int divisor = 3; divisor < sqrtcand; divisor+=2)

    its a minor thing, but you dont want to be calling Math.Sqrt on every loop iteration.

  6. What about:

    int sqrtcand = (int)Math.Sqrt(candidate);

    for (int divisor = 3; divisor <= sqrtcand; divisor+=2)

  7. Jonathan Perret says:

    Indeed, every squared prime will be considered prime.

    Funny how I had a slight doubt yesterday about that ‘<‘ but I didn’t push into it as I was looking for something more subtle…



  8. b.schwehn says:

    <quote>for (int divisor = 3; divisor < (int)Math.Sqrt (candidate); divisor+=2)

    should be:

    sqrtcand = Math.Sqrt(candidate);

    for (int divisor = 3; divisor < sqrtcand; divisor+=2)</quote>

    I believe this is neccessary, the compiler knows that candidate is constant and therefore sqrt(candidate) is constant as well and will do this optimisation itself

  9. b.schwehn says:

    Actually I was wrong, the nither the C# compilter nor the JIT compiler are able to optimize this themselves, presuably because they don’t know whether Math.Sqrt has any intended Sideeffects besides the result. I wonder if there is a way to indicate this to the compiler..?


    if ((candidate & 1) != 0) {

    00000000 push ebp

    00000001 mov ebp,esp

    00000003 sub esp,20h

    00000006 push edi

    00000007 push esi

    00000008 push ebx

    00000009 mov dword ptr [ebp-4],ecx

    0000000c mov esi,edx

    0000000e xor ebx,ebx

    00000010 xor edi,edi

    00000012 test esi,1

    00000018 je 00000056

    for (int divisor = 3; divisor < (int)Math.Sqrt (candidate); divisor+=2){

    0000001a mov ebx,3

    0000001f nop

    00000020 jmp 00000031

    if ((candidate % divisor) == 0)

    00000022 mov eax,esi

    00000024 cdq

    00000025 idiv eax,ebx

    00000027 test edx,edx

    00000029 jne 0000002F

    return false;

    0000002b xor edi,edi

    0000002d jmp 00000063

    for (int divisor = 3; divisor < (int)Math.Sqrt (candidate); divisor+=2){

    0000002f inc ebx

    00000030 inc ebx

    00000031 mov dword ptr [ebp-14h],ebx

    00000034 mov dword ptr [ebp-20h],esi

    00000037 fild dword ptr [ebp-20h]

    0000003a fsqrt

    0000003c fstp qword ptr [ebp-1Ch]

    0000003f push dword ptr [ebp-18h]

    00000042 push dword ptr [ebp-1Ch]

    00000045 call 725CFA7C

    0000004a cmp eax,dword ptr [ebp-14h]

    0000004d jg 00000022

    return true;

    0000004f mov edi,1

    00000054 jmp 00000063

    return (candidate == 2);

    00000056 cmp esi,2

    00000059 sete al

    0000005c movzx eax,al

    0000005f mov edi,eax

    00000061 jmp 00000063





    if ((candidate & 1) != 0) {

    00000000 push ebp

    00000001 mov ebp,esp

    00000003 sub esp,20h

    00000006 push edi

    00000007 push esi

    00000008 push ebx

    00000009 mov dword ptr [ebp-4],ecx

    0000000c mov esi,edx

    0000000e mov dword ptr [ebp-0Ch],0

    00000015 xor ebx,ebx

    00000017 xor edi,edi

    00000019 test esi,1

    0000001f je 0000005D

    int tt = (int)Math.Sqrt (candidate);

    00000021 mov dword ptr [ebp-20h],esi

    00000024 fild dword ptr [ebp-20h]

    00000027 fsqrt

    00000029 fstp qword ptr [ebp-1Ch]

    0000002c push dword ptr [ebp-18h]

    0000002f push dword ptr [ebp-1Ch]

    00000032 call 725CF9FC

    00000037 mov dword ptr [ebp-0Ch],eax

    for (int divisor = 3; divisor < tt;divisor+=2){

    0000003a mov ebx,3

    0000003f nop

    00000040 jmp 00000051

    if ((candidate % divisor) == 0)

    00000042 mov eax,esi

    00000044 cdq

    00000045 idiv eax,ebx

    00000047 test edx,edx

    00000049 jne 0000004F

    return false;

    0000004b xor edi,edi

    0000004d jmp 0000006A

    for (int divisor = 3; divisor < tt;divisor+=2){

    0000004f inc ebx

    00000050 inc ebx

    00000051 cmp ebx,dword ptr [ebp-0Ch]

    00000054 jl 00000042

    return true;

    00000056 mov edi,1

    0000005b jmp 0000006A

    return (candidate == 2);

    0000005d cmp esi,2

    00000060 sete al

    00000063 movzx eax,al

    00000066 mov edi,eax

    00000068 jmp 0000006A



  10. Richard says:

    Use of prime number sizes for the array of hash table buckets is traditional. However the VC++ 7 & 7.1’s hash based containers require a number of buckets that is a power of two:

    The integer constant min_buckets specifies the minimum number of buckets to maintain in the hash table. It must be a power of two and greater than zero. The value supplied by hash_compare is 8.


    (This is the description of the hash_compare template, which you specialise to override the defaults in hash_set and hash_map container instances.)

    It does some clever things to avoid rehashing everything as the number of buckets increases.

  11. Brad Abrams says:

    I was really going for the <= in line 151 as SN points out… Those off-by-one errors are really a kicker aren’t they?

    But good to see the other comments as well…

  12. Eric Wilson says:

    Is this fixed in Widbey?

  13. Brad Abrams says:

    Yes, this is fixed in Whidbey

  14. Knuth’s mod prime reference is so common, you would think that is the best way to do a hash table. But in fact, if you can get a better hash to start with, you don’t need a mod prime, as the hash is already well distributed. Then you can use faster and better aligned power of 2 sized tables. That mod prime technique has been slow and obsolete for a long time now.

    For an example of a more modern hash algorithm, look at FNV:


    That algorithm has already proven itself, speeding up a number of OS hashtables that used the obsolete mod prime method.

  15. FNV is for hashing strings, and is only indirectly related to hashtables and their hash functions.

Skip to main content