Modular arithmetic and primality testing

Number theory is, roughly speaking, the study of the properties of
the integers. Often a problem that is easy for real numbers, such as
factoring or linear programming, becomes considerably more difficult
when restricted to integers (indeed, integer programming is NP-hard).
Much of the focus of modern computational number theory is on solving
these problems efficiently.

The practical importance of number theory was greatly increased by
the introduction of RSA cryptography, an unprecedented system whereby
information could be securely transmitted without first transmitting
the key and risking its interception. Central to RSA is the ability to
efficiently generate a semiprime, which is simply a product of two very
large primes of roughly equal magnitude. What's useful about these
numbers is that we can generate them efficiently, yet given just the
semiprime, determining the two prime factors is a hard,
unsolved problem. It turns out that there are a lot of primes to
choose from: by the prime number theorem, the probability that a
randomly chosen k-bit number is prime is roughly 1/(k ln 2) ≈ 1.44/k.
For example, about 15% of 10-bit numbers are prime (75 of the 512
numbers from 512 through 1023). Consequently, we can find a k-bit
prime number in an expected O(k) random guesses. The only remaining
problem is to find an efficient algorithm for testing these guesses
for primality.
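
If you want to check that estimate, a brute-force count works at this
small scale; the following C# snippet counts the primes among the 512
ten-bit numbers by trial division:

    using System;

    class PrimeDensity {
        // Trial division is fine for numbers this small.
        static bool IsPrime(int n) {
            if (n < 2) return false;
            for (int d = 2; d * d <= n; d++)
                if (n % d == 0) return false;
            return true;
        }

        static void Main() {
            int count = 0;
            for (int n = 512; n <= 1023; n++)  // all 10-bit numbers
                if (IsPrime(n)) count++;
            // Prints "75 of 512 (14.6%)" (formatting varies by culture).
            Console.WriteLine("{0} of 512 ({1:P1})", count, count / 512.0);
        }
    }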

Brute force primality tests, such as simply dividing the number by
all values up to its square root, are hopeless for large numbers: for
a k-bit number, that's on the order of 2^(k/2) divisions. To get
around this, we'll need an important tool called modular arithmetic.
If you've written C, you may be surprised to know that you're already
familiar with modular arithmetic - the "wrap around" behavior of
k-bit integers is an implementation of arithmetic mod 2^k. For
example, this piece of code computes the value 17:

    unsigned char c1 = 177, c2 = 96;
    unsigned char sum = c1 + c2;

In mathematical notation, we would say:

      177 + 96 ≡ 17 (mod 256)

This
tells us that the numbers 177 + 96 and 17 differ by a multiple of 256,
or in other words have the same last 8 bits. Multiplication mod 256 is
likewise similar to multiplication of unsigned chars in C:

    unsigned char c1 = 177, c2 = 23;
    unsigned char product = c1 * c2;

      177 × 23 ≡ 231 (mod 256)

Modular arithmetic works the same way for moduli other than 256; the
only difference is where the values wrap around. For example, if you
compute 10 + 10 mod 17, you get 3.
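
In C and C#, reduction by moduli other than a type's natural word
size is done explicitly with the remainder operator %. For example
(assuming nonnegative operands, since % can return negative results
for negative ones):

    int a = (10 + 10) % 17;   // a == 3
    int b = (12 * 5) % 17;    // 60 mod 17 == 9, so b == 9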

The interesting thing about modular arithmetic is that it can be
shown that the values form a commutative ring. This means, among
other things, that:

  • Addition and multiplication are commutative (you can swap their
    operands without changing the result).
  • The associative property holds for both addition and
    multiplication: (177 × 23) × 45 gives the same result as
    177 × (23 × 45), even if computed with unsigned chars.
  • The distributive property holds: (177 + 23) × 45 is equal to
    (177 × 45) + (23 × 45).
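
To see this concretely, we can check the last two properties with
C#'s byte type, which wraps mod 256 just like unsigned char (the
casts are needed because C# promotes byte operands to int before
doing arithmetic):

    byte x = 177, y = 23, z = 45;
    byte assoc1 = (byte)((byte)(x * y) * z);   // (177 × 23) × 45
    byte assoc2 = (byte)(x * (byte)(y * z));   // 177 × (23 × 45)
    byte dist1 = (byte)((byte)(x + y) * z);    // (177 + 23) × 45
    byte dist2 = (byte)((byte)(x * z) + (byte)(y * z));
    // assoc1 == assoc2 == 155, and dist1 == dist2 == 40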

If none of these surprise you, this might: if the modulus is a power
of 2, as in machine arithmetic, every odd number m has a
multiplicative inverse, that is, a number n such that
m × n ≡ 1 (mod 2^k). What good is this? Well, suppose you want to
divide a number by 7 on a machine with no hardware division, and you
know the number is evenly divisible by 7. The inverse of 7 mod 256 is
183, since 7 × 183 = 1281 ≡ 1 (mod 256). Now if d = 7q, then
183 × d = 183 × 7 × q ≡ q (mod 256), which is exactly d/7. In other
words, multiplying by a number's inverse is the same as dividing by
that number, provided the dividend is evenly divisible by it. (The
inverse itself can be found using the extended Euclidean algorithm.)
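
Here's the trick in action as a C# fragment; 161 = 7 × 23, so
multiplying by the inverse 183 recovers the quotient 23:

    byte n = 161;                      // 161 = 7 × 23
    byte quotient = (byte)(n * 183);   // 161 × 183 = 29463 ≡ 23 (mod 256)
    // quotient == 23 == 161 / 7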

Now we come back to our original problem: how can modular arithmetic
help us determine primality? Well, it's not hard to show that if p is prime, then:

      For all a with 0 < a < p,  a^p ≡ a (mod p).

This is called Fermat's little theorem. What's useful about it is not
only that it's true for all primes, but that if you find an a for
which it does not hold, you have proven that the number p is
composite. Such an a is only indirect evidence, though - it tells us
nothing about what the factors are. The check itself is cheap,
because there is an efficient algorithm for computing large powers
mod p, sketched below. It can be shown that for most composite
numbers, if you just select several values of a at random and try
them, one is likely to fail the test. Unfortunately, there are
infinitely many special numbers called Carmichael numbers for which
the above congruence holds for every a, even though they are not
prime.
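
That algorithm is binary exponentiation (square-and-multiply), which
needs only O(log e) multiplications for an exponent e. Here's a
minimal sketch using plain ulong; it assumes the modulus fits in 32
bits so the intermediate products can't overflow, a restriction an
arbitrary-precision integer type would lift:

    // Computes b^e mod m by repeated squaring, consuming the bits of
    // the exponent from least to most significant. Assumes 1 < m < 2^32.
    static ulong ModPower(ulong b, ulong e, ulong m) {
        ulong result = 1;
        b %= m;
        while (e > 0) {
            if ((e & 1) == 1)
                result = (result * b) % m;   // multiply in this bit's factor
            b = (b * b) % m;                 // square for the next bit
            e >>= 1;
        }
        return result;
    }

For example, ModPower(2, 11, 11) returns 2, just as a^p ≡ a (mod p)
demands for the prime 11.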

To get around this, we design a new, slightly more complicated test
which is not susceptible to this problem. I take this from the
excellent book Prime Numbers: A Computational Perspective, by Richard
Crandall and Carl Pomerance. Suppose p is an odd prime, and that the
binary representation of p - 1 is the odd number t followed by s
zeros, so that p - 1 = t · 2^s. Then one of the following is true for
every a with 0 < a < p - 1:

      a^t ≡ 1 (mod p)
      a^(t << i) ≡ p - 1 (mod p) for some i with 0 ≤ i < s

Above "<<" denotes the shift-left operator. It can be shown
that for any odd composite number > 9, both of the above will fail
for at least 3/4 of the a between 1 and p-2. We can use this to design a simple probabilistic algorithm:

  1. Choose an a at random with 0 < a < p - 1.
  2. Check whether one of the above formulas holds. If neither does,
     p is composite; stop.
  3. If we have iterated at least T times, claim that p is prime.
     Otherwise, go back to step 1.

Here's an (untested) piece of C# code which implements this simple
algorithm, assuming that BigInteger is a suitable arbitrary-precision
integer type (the Framework provides no such type):

    public bool IsProbablePrime(BigInteger p, int T) {
        // Assumes p is odd and p > 3.
        // Write p - 1 as t * 2^s with t odd.
        int s = 0;
        BigInteger t = p - 1;
        while (t.IsEven()) {
            s++;
            t >>= 1;
        }
        // Run T independent rounds of the test.
        for (int repeat = 0; repeat < T; repeat++) {
            BigInteger a = BigInteger.Random(1, p - 2);  // 0 < a < p - 1
            if (!PassesPrimalityTest(a, p, s, t)) {
                return false;  // a witnesses that p is composite
            }
        }
        return true;  // p is probably prime
    }

    public bool PassesPrimalityTest(BigInteger a, BigInteger p, int s, BigInteger t) {
        BigInteger b = BigInteger.ModPower(a, t, p);  // b = a^t mod p
        if (b == 1 || b == p - 1) return true;
        // Square repeatedly, looking for a^(t << i) ≡ p - 1 (mod p).
        for (int j = 1; j < s; j++) {
            b = BigInteger.ModPower(b, 2, p);
            if (b == p - 1) return true;
        }
        return false;
    }

Because each trial is independent, the chance of erroneously claiming
that p is prime is at most (1/4)^T, which is ridiculously tiny even
for small T, like T = 20. We can now use this to quickly locate large
prime numbers and generate semiprimes for RSA cryptography.
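
To make that concrete, here's a sketch of what a k-bit prime
generator might look like, built on the same hypothetical BigInteger
type as before (RandomBits and SetBit are assumed methods here, not
ones any particular library is guaranteed to provide):

    public BigInteger RandomPrime(int k, int T) {
        while (true) {
            BigInteger candidate = BigInteger.RandomBits(k);
            candidate = candidate.SetBit(k - 1);  // force exactly k bits
            candidate = candidate.SetBit(0);      // force oddness
            if (IsProbablePrime(candidate, T))
                return candidate;  // expected O(k) attempts
        }
    }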

Unfortunately, these tests identify composite numbers but reveal
nothing about their factors, which makes them useless for factoring.
Unlike many other hard problems, factoring is believed not to be
NP-complete, and so may very well have an efficient algorithm that no
one has found yet. Such an algorithm would render RSA encryption
useless - and win you $625,000. Combinations of advanced techniques
have managed to factor very large numbers, but only at enormous
expense in computing time and hardware. This is one of the most
active areas of research in number theory and computer science.