From Fermat to AKS: A History of Primality Testing

December 16, 2024

Is 341 prime? Trial division says no ( $11 \times 31$ ). But if you check $2^{340} \bmod 341$ , you get 1 -- exactly what a prime would give. This failure of Fermat's test, and the centuries it took to fix it, is the story of probabilistic primality testing.

Background

A positive integer $n > 1$ is prime if its only divisors are 1 and itself. This is a clean definition but a hard computational problem: given an arbitrary $n$ , how do you decide?

The stakes are not abstract. RSA key generation requires finding two large primes $p$ and $q$ , typically 1024 bits each, so that their product $n = pq$ is hard to factor. The security of RSA rests on the assumption that factoring $n$ is computationally infeasible -- but that assumption is worthless if $p$ or $q$ is not actually prime. A composite that slips through primality testing produces a key that can be factored trivially. RSA, Diffie-Hellman, and most public-key cryptography depend on reliably generating large primes.

Trial division

The oldest approach: to test whether $n$ is prime, try dividing it by every integer from 2 up to $\sqrt{n}$ . If none divide evenly, $n$ is prime.

This works because if $n = ab$ with $a \le b$ , then $a \le \sqrt{n}$ . So we only need to check potential factors up to the square root. A standard optimization: after checking 2, only test odd divisors (or better, only test primes up to $\sqrt{n}$ , but generating those primes is itself a sub-problem -- the Sieve of Eratosthenes handles it for moderate bounds).
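In Python, the procedure above is a few lines (a straightforward sketch with the odd-only optimization, not an optimized implementation):

```python
def trial_division_is_prime(n: int) -> bool:
    """Trial division: check 2, then odd candidates up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:      # equivalent to d <= sqrt(n), without floats
        if n % d == 0:
            return False
        d += 2             # after 2, only odd divisors can matter
    return True
```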

The complexity is $O(\sqrt{n})$ divisions. For a 10-digit number, that is roughly 100,000 divisions -- fast on modern hardware. For a 300-digit number (a typical RSA prime), $\sqrt{n}$ has 150 digits. There are approximately $10^{150}$ candidate divisors. At $10^{12}$ divisions per second, this would take about $10^{138}$ seconds. The universe is approximately $4 \times 10^{17}$ seconds old.

The issue is that $O(\sqrt{n})$ is exponential in the bit-length of $n$ . If $n$ has $b$ bits, then $\sqrt{n} \approx 2^{b/2}$ , and we need about $2^{b/2}$ divisions. A polynomial-time algorithm would need $O(b^c)$ operations for some constant $c$ . Trial division is correct and deterministic, but for cryptographic sizes it is useless -- and this is not a constant-factor problem that faster hardware will solve. The gap is exponential.

The probabilistic trade

This gap -- between what we need (test 1024-bit numbers) and what deterministic methods can do in reasonable time -- motivates a radical idea: accept a test that might be wrong.

A probabilistic primality test gives one of two answers: "definitely composite" or "probably prime." The "probably" comes with a quantifiable error bound. If the probability of a false "probably prime" answer is less than $2^{-128}$ , that is a smaller failure probability than a cosmic ray flipping a bit in your CPU during the computation. For engineering purposes, "probably prime with error $< 2^{-128}$ " is as good as "proven prime."

The challenge is building a test with: (1) an error probability that decreases exponentially with the number of iterations, and (2) no blind spots -- no class of composites that always fools the test regardless of how many iterations you run.

The first condition is achievable: run the test multiple times with independent random choices, and error probabilities multiply (shrinking exponentially). The second took 340 years from Fermat to Rabin, because the natural first attempt -- the Fermat test -- has exactly the blind spot we need to avoid.

Fermat's Little Theorem

In 1640, Pierre de Fermat stated a theorem that would become foundational to primality testing:

$$a^{p-1} \equiv 1 \pmod{p}$$

for any prime $p$ and integer $a$ not divisible by $p$ .

Consider the $p-1$ nonzero residues $\{1, 2, \ldots, p-1\}$ modulo $p$ . Multiplying each by $a$ (with $\gcd(a, p) = 1$ ) permutes this set: the map $x \mapsto ax \bmod p$ is a bijection on $\{1, \ldots, p-1\}$ . Therefore:

$$\prod_{i=1}^{p-1} (a \cdot i) \equiv \prod_{i=1}^{p-1} i \pmod{p}$$

The left side is $a^{p-1} \cdot (p-1)!$ and the right side is $(p-1)!$ . Since $p$ is prime, $(p-1)!$ is invertible mod $p$ , giving $a^{p-1} \equiv 1$ .

The contrapositive gives us a primality test: if $a^{n-1} \not\equiv 1 \pmod{n}$ for some $a$ coprime to $n$ , then $n$ is definitely composite.

The Fermat Primality Test

This observation leads to a simple probabilistic test:

  1. Pick a random base $a$ with $1 < a < n-1$
  2. Compute $a^{n-1} \mod n$
  3. If the result is not 1, $n$ is composite
  4. If the result is 1, $n$ is probably prime
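The four steps translate directly to Python ( `fermat_test` is an illustrative name, not a library function):

```python
import random

def fermat_test(n: int, k: int = 20) -> bool:
    """Fermat test with k random bases: False = composite, True = probably prime."""
    if n < 4:
        return n in (2, 3)
    for _ in range(k):
        a = random.randrange(2, n - 1)    # step 1: random base
        if pow(a, n - 1, n) != 1:         # step 2: a^(n-1) mod n
            return False                  # step 3: definitely composite
    return True                           # step 4: probably prime
```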

Step 2 uses modular exponentiation by repeated squaring, which runs in $O(\log n)$ multiplications mod $n$ , each costing $O((\log n)^2)$ with schoolbook multiplication. Total: $O((\log n)^3)$ . This scales with the number of digits, not the magnitude -- exactly what we need.

Note an asymmetry: the test can prove compositeness (if $a^{n-1} \not\equiv 1$ , $n$ is definitely composite) but can only suggest primality (if $a^{n-1} \equiv 1$ , $n$ might be prime). This one-sided error structure defines probabilistic primality testing. We will never get a false "composite" answer, only a false "probably prime."

The weakness: composite numbers that pass this test exist.

Pseudoprimes

A composite number $n$ is called a Fermat pseudoprime to base $a$ if $a^{n-1} \equiv 1 \pmod{n}$ .

For example, $341 = 11 \times 31$ is a pseudoprime to base 2:

$$2^{340} \equiv 1 \pmod{341}$$

Why does this happen? By Fermat's theorem applied to the factors: $2^{10} \equiv 1 \pmod{11}$ and $2^{30} \equiv 1 \pmod{31}$ . Since $10 \mid 340$ and $30 \mid 340$ , we get $2^{340} \equiv 1$ modulo both 11 and 31, hence modulo $341$ by the Chinese Remainder Theorem.

This means the Fermat test with base 2 incorrectly identifies 341 as "probably prime."

The pseudoprimes to base 2 below 1000 are just 341, 561, and 645. They are rare -- there are only 245 base-2 pseudoprimes below $10^6$ -- but they exist, and their rarity is deceptive. It might suggest that testing multiple bases would catch all composites: if $n$ is a pseudoprime to base 2, surely it fails for base 3? Often yes. But not always.

Carmichael Numbers

In 1885, Václav Šimerka found several composite numbers that are pseudoprimes to every base coprime to them. He published in Časopis pro pěstování mathematiky a fysiky, a Czech-language journal with limited circulation outside Bohemia. His work went unnoticed for over a century.

In 1899, Alwin Korselt characterized such numbers: a composite $n$ is a Carmichael number if and only if:

  1. $n$ is square-free (no repeated prime factors)
  2. For every prime $p$ dividing $n$ , we have $(p-1) \mid (n-1)$

But Korselt knew of no examples.

In 1910, Robert D. Carmichael independently discovered the smallest such number:

$$561 = 3 \times 11 \times 17$$

Verification: 561 is square-free with prime factors 3, 11, and 17, and $561 - 1 = 560 = 2^4 \times 5 \times 7$ is divisible by $3 - 1 = 2$ , $11 - 1 = 10$ , and $17 - 1 = 16$ , so Korselt's criterion holds.
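Both Korselt conditions are mechanical to verify once $n$ is factored. A sketch, with `is_carmichael` as an illustrative helper (the naive factoring limits it to small $n$ ):

```python
def is_carmichael(n: int) -> bool:
    """Korselt's criterion, via naive factoring (fine for small n)."""
    factors = []
    m, p = n, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return False          # repeated factor: not square-free
            factors.append(p)
        else:
            p += 1
    if m > 1:
        factors.append(m)
    if len(factors) < 2:
        return False                  # n is 1, prime, or a prime power
    return all((n - 1) % (q - 1) == 0 for q in factors)
```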

The Fermat test fails completely on Carmichael numbers -- no choice of base (coprime to $n$ ) will reveal their compositeness. For the Fermat test, Carmichael numbers are an unconditional blind spot. No amount of repetition helps.

Korselt's criterion has an intuitive interpretation: it says that $n$ "pretends" to be prime by having its prime factors' orders all divide $n - 1$ . The Chinese Remainder Theorem then forces $a^{n-1} \equiv 1 \pmod{n}$ for all $a$ coprime to $n$ . The deception is structural, not coincidental.

How many exist?

For decades, mathematicians wondered whether there are infinitely many Carmichael numbers. In 1994, Alford, Granville, and Pomerance proved there are: for sufficiently large $x$ , the number of Carmichael numbers up to $x$ exceeds $x^{2/7}$ . The proof is non-constructive and the actual density appears to be much higher than this bound. Erdős conjectured the count up to $x$ is $x^{1-o(1)}$ , but this remains open.

Solovay-Strassen: The First Probabilistic Test

In 1977, Robert Solovay and Volker Strassen gave the first polynomial-time probabilistic primality test with a rigorously proven error bound. Their test uses the Euler criterion, which connects modular exponentiation to the Legendre symbol.

Lemma (Euler criterion). For an odd prime $p$ and $\gcd(a, p) = 1$ :

$$a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod{p}$$

where $\left(\frac{a}{p}\right)$ is the Legendre symbol ( $+1$ if $a$ is a quadratic residue mod $p$ , $-1$ otherwise).

Proof sketch. Since $\mathbb{F}_p^\times$ is cyclic of order $p-1$ , we have $a^{p-1} \equiv 1$ , so $a^{(p-1)/2}$ is a square root of 1, hence $\pm 1$ . Writing $a = g^k$ for a generator $g$ , we get $a^{(p-1)/2} = (g^{(p-1)/2})^k = (-1)^k$ , since $g^{(p-1)/2}$ is the unique element of order 2, namely $-1$ . The value is $+1$ exactly when $k$ is even, i.e., exactly when $a$ is a quadratic residue.

The Solovay-Strassen test: pick a random $a$ , compute both $a^{(n-1)/2} \bmod n$ and the Jacobi symbol $\left(\frac{a}{n}\right)$ (which can be computed in $O(\log^2 n)$ by reciprocity, without knowing the factorization of $n$ ). If they disagree, $n$ is composite.

Why this breaks the Carmichael barrier: a Carmichael number $n$ satisfies $a^{n-1} \equiv 1 \pmod{n}$ for all $a$ coprime to $n$ , but this does not force $a^{(n-1)/2} \equiv \left(\frac{a}{n}\right)$ . The Jacobi symbol factors multiplicatively over the prime factors of $n$ , while the power $a^{(n-1)/2}$ does not decompose the same way. Solovay and Strassen proved that for any composite $n$ , at least half of all bases $a$ in $\{1, \ldots, n-1\}$ are witnesses -- the Euler criterion fails for them. This gives a $1/2$ error bound per round, worse than Miller-Rabin's $1/4$ but sufficient to bypass Carmichael numbers entirely.
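A sketch of the test, pairing a standard binary Jacobi-symbol routine with the Euler-criterion check ( `jacobi` and `solovay_strassen` are illustrative names):

```python
import random

def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n > 0, via quadratic reciprocity."""
    a %= n
    t = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):      # (2/n) = -1 when n = 3, 5 mod 8
                t = -t
        a, n = n, a                  # reciprocity: flip sign if both = 3 mod 4
        if a % 4 == 3 and n % 4 == 3:
            t = -t
        a %= n
    return t if n == 1 else 0

def solovay_strassen(n: int, k: int = 40) -> bool:
    """Euler-criterion test: False = composite, True = probably prime."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    for _ in range(k):
        a = random.randrange(2, n - 1)
        j = jacobi(a, n)
        if j == 0 or pow(a, (n - 1) // 2, n) != j % n:
            return False             # Euler criterion fails: witness found
    return True
```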

Strong Pseudoprimes and Miller's Test

In 1976, Gary Miller observed that we can strengthen the Fermat test by exploiting a deeper property of primes.

Write $n - 1 = 2^s \cdot d$ where $d$ is odd. For a prime $p$ and base $a$ coprime to $p$ , one of the following must hold:

$$a^d \equiv 1 \pmod{p}$$

or

$$a^{2^r d} \equiv -1 \pmod{p} \quad \text{for some } 0 \le r < s$$

Why must one of these hold? Start from Fermat: $a^{p-1} = a^{2^s d} \equiv 1 \pmod{p}$ . Now consider the squaring chain:

$$a^d, \; a^{2d}, \; a^{4d}, \; \ldots, \; a^{2^s d}$$

The last term is 1. Each term is the square of the previous. Working backwards from $a^{2^s d} \equiv 1$ : if $a^{2^r d} \equiv 1$ , then $a^{2^{r-1} d}$ is a square root of 1 modulo $p$ . In a field (which $\mathbb{Z}/p\mathbb{Z}$ is, since $p$ is prime), the polynomial $x^2 - 1$ has at most two roots: $+1$ and $-1$ . So either the previous term is also 1 (and we continue backwards) or it is $-1$ (and we stop). Eventually we either reach $a^d \equiv 1$ or find some $a^{2^r d} \equiv -1$ .

The critical point: for a composite $n$ , $\mathbb{Z}/n\mathbb{Z}$ is not a field, and $x^2 \equiv 1 \pmod{n}$ can have more than two solutions. For example, modulo $n = 15$ , we have $4^2 = 16 \equiv 1$ -- a "non-trivial square root of unity." This is the crack that Miller's test exploits.

Worked example: $n = 341$ , $a = 2$ . Recall that 341 passes the Fermat test for base 2. Write $341 - 1 = 340 = 2^2 \times 85$ , so $s = 2$ and $d = 85$ . The squaring chain is:

$$2^{85} \bmod 341 = 32, \quad 32^2 \bmod 341 = 1024 \bmod 341 = 1$$
[Figure: squaring chains compared. Top: prime $n = 97$ , the chain reaches $-1$ at the third squaring -- normal behavior in a field. Bottom: composite $n = 341$ , the chain jumps from 32 to 1 without passing through $\pm 1$ ; the non-trivial square root witnesses compositeness.]

The chain went from 32 to 1 in one squaring. But $32 \not\equiv 1$ and $32 \not\equiv -1 \pmod{341}$ (since $-1 \equiv 340$ ). So 32 is a non-trivial square root of 1 modulo 341 -- exactly the kind of witness that cannot exist modulo a prime. Miller's test correctly identifies 341 as composite. The Fermat test missed this because it only checks the final value $2^{340} \bmod 341 = 1$ , discarding the information in the intermediate steps.
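The chain is a few lines of Python:

```python
n, s, d = 341, 2, 85
assert n - 1 == 2**s * d and d % 2 == 1
x = pow(2, d, n)              # first term of the chain
assert x == 32                # neither 1 nor n - 1 = 340
assert pow(x, 2, n) == 1      # squares to 1: non-trivial square root of unity
assert x not in (1, n - 1)    # hence 341 is composite
```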

A composite number that passes this stronger test for base $a$ is called a strong pseudoprime to base $a$ . Unlike the Fermat test, there are no "strong Carmichael numbers" -- no composite passes for all bases.

The Miller-Rabin Test

In 1980, Michael Rabin made Miller's test probabilistic and proved a key result: for any composite $n$ , at most $1/4$ of bases $a$ in $\{2, \ldots, n-2\}$ are strong liars (bases for which $n$ passes the strong test).

Theorem (Rabin, 1980). If $n$ is an odd composite, the number of strong liars in $\{1, \ldots, n-1\}$ is at most $\frac{n-1}{4}$ .

Proof sketch. Write $n - 1 = 2^s d$ with $d$ odd. By the Chinese Remainder Theorem, $(\mathbb{Z}/n\mathbb{Z})^\times \cong \prod_{i} (\mathbb{Z}/p_i^{e_i}\mathbb{Z})^\times$ . A strong liar $a$ must have its squaring chain "look prime" -- either $a^d \equiv 1$ or $a^{2^r d} \equiv -1$ for some $r$ .

The key constraint: $a^{2^r d} \equiv -1 \pmod{n}$ requires $a^{2^r d} \equiv -1$ modulo every prime power factor simultaneously. But each factor has its own 2-adic structure ( $p_i - 1 = 2^{s_i} d_i$ with possibly different $s_i$ ), so the squaring chains must "synchronize" across all factors at the same step $r$ . The strong liars are contained in a subgroup of $(\mathbb{Z}/n\mathbb{Z})^\times$ , and the synchronization constraint forces that subgroup to have index at least 4.

The bound is essentially tight: for example, $n = 91 = 7 \times 13$ has exactly $\phi(91)/4 = 18$ strong liars. Most composites have far fewer.

This means the worst-case error probability after $t$ independent rounds is at most $4^{-t}$ : each round uses a fresh random base, and per-round liar probabilities of at most $1/4$ multiply.

But most composites are caught by the first random base. Damgård, Landrock, and Pomerance (1993) showed that for random odd $k$ -bit composites, the average-case error probability after $t$ rounds drops as $\exp(-c\sqrt{kt})$ for a constant $c$ , much faster than the worst-case $4^{-t}$ . For 1024-bit numbers with even a single round, the average-case error probability is negligible.

Unlike the Fermat test, Miller-Rabin is not fooled by Carmichael numbers. For any Carmichael number $n$ , at least 75% of bases reveal its compositeness through the strong pseudoprime check. The "non-trivial square roots of unity" that exist in $\mathbb{Z}/n\mathbb{Z}$ (but not in $\mathbb{Z}/p\mathbb{Z}$ ) always provide witnesses.

Implementation

Modular exponentiation by repeated squaring makes each round efficient. Here is a complete Miller-Rabin test in Python:

import random

def is_probable_prime(n, k=40):
    """Miller-Rabin with k rounds. False = composite, True = probably prime."""
    if n < 2: return False
    if n < 4: return True
    if n % 2 == 0: return False
    # Write n - 1 = 2^s * d with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)          # a^d mod n
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)      # square mod n
            if x == n - 1:
                break
        else:
            return False           # composite witness found
    return True

Python's built-in three-argument pow(a, d, n) performs modular exponentiation by repeated squaring internally -- it does not compute $a^d$ and then reduce. This makes the implementation efficient even for thousand-bit inputs. With $k = 40$ rounds, the error probability is below $4^{-40} \approx 10^{-24}$ , which is the typical choice for cryptographic key generation.

The for/else construct is Python-specific: the else clause executes only if the inner for loop completes without break, meaning none of the squarings produced $-1$ , so $a$ is a witness to compositeness.

Deterministic Variants

Miller's test under ERH

Miller's original 1976 test was not probabilistic -- it was deterministic, conditional on the extended Riemann hypothesis (ERH). Under ERH, every composite $n$ has a witness $a < 2 (\ln n)^2$ (Miller proved a bound of $O((\log n)^2)$ ; the explicit constant is due to Bach). This means you only need to test bases up to $O((\log n)^2)$ , giving a deterministic polynomial-time test -- if ERH is true.

ERH remains unproven. The conditional result ties the difficulty of deterministic primality testing to the distribution of zeros of the Riemann zeta function.

Known deterministic witness sets

Even without ERH, exhaustive computation has established that specific small sets of bases suffice for bounded ranges. These results turn Miller-Rabin into a deterministic test for numbers below each bound:

| Bound on $n$ | Sufficient bases |
|---|---|
| $< 2{,}047$ | $\{2\}$ |
| $< 1{,}373{,}653$ | $\{2, 3\}$ |
| $< 3{,}215{,}031{,}751$ | $\{2, 3, 5, 7\}$ |
| $< 3{,}317{,}044{,}064{,}679{,}887{,}385{,}961{,}981$ | $\{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41\}$ |

The last bound is approximately $3.3 \times 10^{24}$ . For 64-bit integers ( $< 2^{64} \approx 1.8 \times 10^{19}$ ), the first 12 primes as bases give a deterministic test. This is how many implementations handle "small" numbers: no randomness needed, no probability of error, and the test runs in microseconds.

The smallest strong pseudoprime to base 2 is 2047 -- which is why base $\{2\}$ alone suffices below that threshold.
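Combining the largest witness set in the table with the strong test gives a deterministic tester for that range (a sketch; `is_prime_det` and `strong_probable_prime` are illustrative names):

```python
WITNESSES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41)

def strong_probable_prime(n: int, a: int) -> bool:
    """One round of the strong (Miller-Rabin) test with fixed base a; n odd, n > a."""
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x in (1, n - 1):
        return True
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False

def is_prime_det(n: int) -> bool:
    """Deterministic for n below ~3.3e24, per the witness table above."""
    if n < 2:
        return False
    for p in WITNESSES:
        if n % p == 0:
            return n == p          # catches every n with a factor <= 41
    return all(strong_probable_prime(n, a) for a in WITNESSES)
```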

BPSW: the pragmatic gold standard

The Baillie-PSW test, introduced in 1980 by Robert Baillie, Carl Pomerance, John Selfridge, and Samuel Wagstaff, combines two tests that fail on disjoint sets of composites:

  1. A Miller-Rabin test to base 2
  2. A strong Lucas probable prime test (with a specific parameter selection method)

The idea is that a composite which is a strong pseudoprime to base 2 is unlikely to also be a strong Lucas pseudoprime, and vice versa. This is not just a hope: the number-theoretic properties that make a number a Fermat/Miller-Rabin pseudoprime (multiplicative order structure) are largely independent of those that make it a Lucas pseudoprime (quadratic residue structure).

The Lucas test works in a different algebraic setting. Given parameters $P$ and $Q$ with discriminant $D = P^2 - 4Q$ , define the Lucas sequences $U_k(P,Q)$ and $V_k(P,Q)$ via the recurrence $U_{k+1} = PU_k - QU_{k-1}$ . For a prime $p$ with Jacobi symbol $(D/p) = -1$ , we have $U_{p+1} \equiv 0 \pmod{p}$ . The "strong" variant adds a squaring chain similar to Miller-Rabin. The parameter selection (Selfridge's Method A: choose the first $D$ in the sequence $5, -7, 9, -11, \ldots$ such that $(D/n) = -1$ ) is designed to maximize the independence between the Miller-Rabin and Lucas tests.
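A minimal numerical illustration of the divisibility $U_{p+1} \equiv 0 \pmod{p}$ , using the slow $O(k)$ recurrence rather than the fast doubling formulas real implementations use. With $P = 1$ , $Q = -1$ , the sequence $U_k$ is just the Fibonacci numbers and $D = 5$ :

```python
def lucas_U(P: int, Q: int, k: int, n: int) -> int:
    """U_k(P, Q) mod n via U_{k+1} = P*U_k - Q*U_{k-1} (O(k) demo only)."""
    u_prev, u = 0, 1                  # U_0 = 0, U_1 = 1
    for _ in range(k - 1):
        u_prev, u = u, (P * u - Q * u_prev) % n
    return u if k >= 1 else 0

# For p = 23, (5/23) = -1, so U_{p+1} = F_24 = 46368 should vanish mod 23:
assert lucas_U(1, -1, 24, 23) == 0
```

For a prime with $(D/p) = +1$ the divisibility shifts to $U_{p-1}$ instead, e.g. $F_{10} = 55 \equiv 0 \pmod{11}$ .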

No BPSW counterexample has ever been found. The original 1980 paper offered a $30 reward for a counterexample (from Pomerance, Selfridge, and Wagstaff). In 2021, Baillie, Fiori, and Wagstaff offered $2,000 for a counterexample to a strengthened variant. Neither reward has been claimed. Exhaustive search has verified that no counterexample exists below $2^{64}$ .

BPSW requires only two tests (one modular exponentiation + one Lucas computation), making it faster than running 40 rounds of Miller-Rabin. It is the default primality test in many implementations, including Mathematica, Maple, and SageMath.

Beyond Miller-Rabin

Miller-Rabin and BPSW are probabilistic (or deterministic only for bounded ranges). For some applications -- primality certificates, mathematical proofs, record-keeping -- we need tests that prove primality unconditionally.

AKS: theoretical breakthrough

In 2002, Manindra Agrawal, Neeraj Kayal, and Nitin Saxena proved that primality is in P: there exists a deterministic polynomial-time algorithm that decides primality without any unproven hypothesis. Their test, known as AKS, was a landmark in computational number theory.

The original algorithm runs in $\tilde{O}((\log n)^{12})$ time. Lenstra and Pomerance improved this to $\tilde{O}((\log n)^6)$ in 2005. The key idea starts from a classical observation: if $n$ is prime, then the polynomial identity

$$(x + a)^n \equiv x^n + a \pmod{n}$$

holds in $\mathbb{Z}_n[x]$ for all $a$ (this is essentially the Frobenius endomorphism). For composite $n$ , this identity generally fails. But checking it directly requires working with a polynomial of degree $n$ -- which is as expensive as trial division.

AKS's insight is to check this identity modulo a polynomial $x^r - 1$ for a suitable small $r$ : verify $(x + a)^n \equiv x^{n \bmod r} + a$ in $\mathbb{Z}_n[x]/(x^r - 1)$ . Now the polynomials have degree at most $r$ , and $r$ can be chosen as $O((\log n)^5)$ or better.

The key lemma: if $(x + a)^n \equiv x^n + a \pmod{n, x^r - 1}$ holds for $a = 1, 2, \ldots, \lfloor\sqrt{\phi(r)} \log n\rfloor$ , and $n$ has no prime factor $\le r$ , and $n$ is not a perfect power, then $n$ is prime. The proof constructs a group $G$ of residues in $(\mathbb{Z}/n\mathbb{Z})[x]/(x^r - 1)$ generated by $\{x + 1, x + 2, \ldots\}$ and shows that $|G|$ is both large (from the many verified identities) and small (from the structure of the quotient ring), forcing $n$ to be prime. The total number of checks is polynomial in $\log n$ .
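For tiny $n$ the congruence can be checked directly with dense coefficient arrays -- an illustrative sketch of the single check, nothing like a full AKS implementation:

```python
def polymul_mod(f, g, r, n):
    """Multiply coefficient lists f, g modulo (x^r - 1, n)."""
    out = [0] * r
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                out[(i + j) % r] = (out[(i + j) % r] + fi * gj) % n
    return out

def aks_congruence(n: int, a: int, r: int) -> bool:
    """Check (x + a)^n == x^(n mod r) + a in Z_n[x]/(x^r - 1), r >= 2."""
    base = [0] * r
    base[0] = a % n
    base[1] = (base[1] + 1) % n           # base = x + a
    result = [0] * r
    result[0] = 1
    e = n
    while e:                              # square-and-multiply on polynomials
        if e & 1:
            result = polymul_mod(result, base, r, n)
        base = polymul_mod(base, base, r, n)
        e >>= 1
    expected = [0] * r
    expected[0] = a % n
    expected[n % r] = (expected[n % r] + 1) % n
    return result == expected
```

For prime $n$ the check passes for every $a$ and $r$ (Frobenius); for composite $n$ it fails for most choices, e.g. $n = 6$ , $a = 1$ , $r = 5$ .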

AKS is not used in practice. For 64-bit numbers, BPSW is orders of magnitude faster. For larger numbers, ECPP (below) is faster and also produces a certificate. AKS is a "galactic algorithm" -- polynomial in theory, but with constants so large that Miller-Rabin with 64 rounds is faster by orders of magnitude even for 1024-bit inputs. Its significance is theoretical, settling a long-standing complexity question, not providing a practical tool.

ECPP: the practical gold standard for proven primes

Elliptic Curve Primality Proving (ECPP) was developed by Shafi Goldwasser and Joe Kilian in 1986, and refined into a practical algorithm by A. O. L. Atkin and François Morain in 1993. It runs in $\tilde{O}((\log n)^4)$ heuristic time and produces a primality certificate -- a compact proof that anyone can verify much faster than it took to produce.

The core idea uses elliptic curves over $\mathbb{Z}/n\mathbb{Z}$ . Given a point $P$ on an elliptic curve $E$ modulo $n$ , if we can show that the group order $|E(\mathbb{Z}/n\mathbb{Z})|$ has a large prime factor $q$ , and certain conditions hold, then either $n$ is prime or $n$ has a very small factor (which we can check by trial division). The problem then reduces to proving $q$ is prime -- a smaller instance of the same problem. This recursive structure terminates quickly.

The certificate is an Atkin-Goldwasser-Kilian-Morain certificate: a chain of elliptic curves and points that witnesses primality at each recursive step. Verification runs in polynomial time and does not require trusting the prover's computation. This is the key distinction from probabilistic tests: ECPP does not say "probably prime with high confidence." It says "here is a proof; check it yourself." The certificate for a 10,000-digit prime might be a few megabytes, but verifying it takes minutes rather than the hours needed to produce it.

As of 2025, ECPP holds the record for the largest proven prime with a general-purpose algorithm: $R(109297) = (10^{109297} - 1)/9$ , a repunit with 109,297 digits. (Mersenne primes are tested with the deterministic Lucas-Lehmer test, which exploits their special form.)

Comparison

| Test | Type | Complexity | Certainty | Practical use |
|---|---|---|---|---|
| Trial division | Deterministic | $O(\sqrt{n})$ | Proven | Small $n$ only |
| Fermat | Probabilistic | $O(k \log^3 n)$ | Probable (with blind spots) | Obsolete |
| Solovay-Strassen | Probabilistic | $O(k \log^3 n)$ | Error $\le 2^{-k}$ | Superseded by Miller-Rabin |
| Miller-Rabin | Probabilistic | $O(k \log^3 n)$ | Error $\le 4^{-k}$ | General purpose |
| BPSW | Probabilistic | $O(\log^3 n)$ | No known counterexample | Default in most libraries |
| Miller (ERH) | Deterministic (conditional) | $O(\log^5 n)$ | Proven if ERH holds | Theoretical |
| AKS | Deterministic | $\tilde{O}(\log^6 n)$ | Proven | Not practical |
| ECPP | Deterministic | $\tilde{O}(\log^4 n)$ heuristic | Proven + certificate | Largest proven primes |

Timeline

| Year | Event |
|---|---|
| 1640 | Fermat states his "little theorem" |
| 1885 | Šimerka finds first Carmichael numbers (work overlooked) |
| 1899 | Korselt gives characterization criterion |
| 1910 | Carmichael discovers 561 independently |
| 1976 | Miller introduces strong pseudoprime test (deterministic under ERH) |
| 1977 | Solovay and Strassen give first polynomial-time probabilistic test with proven $1/2$ error bound |
| 1980 | Rabin proves probabilistic $1/4$ error bound |
| 1980 | Baillie, Pomerance, Selfridge, Wagstaff propose BPSW test |
| 1986 | Goldwasser and Kilian develop elliptic curve primality proving |
| 1993 | Atkin and Morain refine ECPP into a practical algorithm |
| 1994 | Alford-Granville-Pomerance prove infinitely many Carmichael numbers |
| 2002 | Agrawal-Kayal-Saxena prove primality is in P |

The Underlying Structure: Finite Fields

The preceding sections described what each test checks. This section explains why those checks work -- the algebraic structure that makes primes distinguishable from composites.

When $p$ is prime, the integers modulo $p$ form a finite field $\mathbb{F}_p$ . Every nonzero element has a multiplicative inverse, and Fermat's little theorem tells us the multiplicative group $\mathbb{F}_p^\times$ has order $p-1$ .

The multiplicative group is cyclic

A deeper fact: $\mathbb{F}_p^\times$ is cyclic. There exists a generator $g$ (called a primitive root) such that every nonzero element of $\mathbb{F}_p$ is a power of $g$ :

$$\mathbb{F}_p^\times = \{g^0, g^1, g^2, \ldots, g^{p-2}\}$$

This is not obvious. Here is the proof.

Theorem. For any prime $p$ , the group $\mathbb{F}_p^\times$ is cyclic.

Proof. For each divisor $d$ of $p-1$ , let $\psi(d) = |\{a \in \mathbb{F}_p^\times : \text{ord}(a) = d\}|$ . Every element has some order dividing $p-1$ , so $\sum_{d \mid (p-1)} \psi(d) = p-1$ .

Now we claim $\psi(d) \le \phi(d)$ . If $\psi(d) \ge 1$ , there exists an element $g$ of order $d$ . The $d$ elements $g^0, g^1, \ldots, g^{d-1}$ are all roots of $x^d - 1$ . Since $\mathbb{F}_p$ is a field, $x^d - 1$ has at most $d$ roots, so these are all the elements of order dividing $d$ . Among $g^0, \ldots, g^{d-1}$ , exactly $\phi(d)$ have order exactly $d$ (those $g^k$ with $\gcd(k, d) = 1$ ). So $\psi(d) \in \{0, \phi(d)\}$ .

Since $\sum_{d \mid (p-1)} \psi(d) = p - 1 = \sum_{d \mid (p-1)} \phi(d)$ (the latter is a standard identity), and $\psi(d) \le \phi(d)$ for all $d$ , we must have $\psi(d) = \phi(d)$ for every $d$ . In particular, $\psi(p-1) = \phi(p-1) \ge 1$ , so primitive roots exist.
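The proof shows a generator exists without exhibiting one, but for small odd primes a search is easy (a sketch; `primitive_root` is an illustrative helper):

```python
def primitive_root(p: int) -> int:
    """Smallest generator of F_p^x for an odd prime p, by testing
    g^((p-1)/q) != 1 for each prime factor q of p - 1."""
    m, fac = p - 1, []
    d = 2
    while d * d <= m:                 # naive factoring of p - 1
        while m % d == 0:
            if d not in fac:
                fac.append(d)
            m //= d
        d += 1
    if m > 1:
        fac.append(m)
    for g in range(2, p):
        if all(pow(g, (p - 1) // q, p) != 1 for q in fac):
            return g                  # g has full order p - 1
```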

Why cyclicity matters for Miller's test

The cyclicity of $\mathbb{F}_p^\times$ is what makes the squaring chain in Miller's test work. Write every element as $g^j$ for some $j$ . Then $a = g^j$ and:

$$a^d = g^{jd}, \quad a^{2d} = g^{2jd}, \quad \ldots, \quad a^{2^s d} = g^{2^s jd}$$

The condition $a^{2^s d} = 1$ means $g^{2^s jd} = g^0$ , i.e., $(p-1) \mid 2^s jd$ . Since $p - 1 = 2^s d$ , this is automatic. Now work backwards. At each step, squaring in the cyclic group means doubling the exponent mod $p-1$ . The only elements whose square is 1 are $g^0 = 1$ and $g^{(p-1)/2} = -1$ . This is exactly the "only two square roots of unity" property that Miller's test uses, and it holds because the group is cyclic of known order.

For composite $n$ , the group $(\mathbb{Z}/n\mathbb{Z})^\times$ is not cyclic in general (by the Chinese Remainder Theorem, it decomposes as a product of the groups for each prime power factor). This product structure creates extra square roots of unity -- elements $x$ with $x^2 \equiv 1 \pmod{n}$ but $x \not\equiv \pm 1 \pmod{n}$ . These non-trivial square roots are the witnesses that Miller-Rabin detects.
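A quick enumeration shows the contrast:

```python
# square roots of unity modulo composite 15: four of them
n = 15
roots = sorted(x for x in range(1, n) if x * x % n == 1)
assert roots == [1, 4, 11, 14]   # 4 and 11 are the non-trivial witnesses

# modulo a prime there are only the two trivial ones
p = 13
assert sorted(x for x in range(1, p) if x * x % p == 1) == [1, p - 1]
```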

Beyond $\mathbb{F}_p$

For prime powers $q = p^k$ , there exist unique finite fields $\mathbb{F}_q$ of each size (constructed as polynomial quotient rings, not just $\mathbb{Z}/q\mathbb{Z}$ ). Their multiplicative groups are also cyclic of order $q - 1$ .

The connection between finite fields and primality testing runs deep. The Fermat test is really a test of multiplicative group order. Miller-Rabin refines this by probing the 2-Sylow subgroup structure. The Lucas test probes quadratic extensions $\mathbb{F}_{p^2}$ . ECPP uses the group structure of elliptic curves over $\mathbb{F}_p$ . Each successive test exploits more algebraic structure of finite fields to distinguish primes from composites.

Open Problems

Density of pseudoprimes

There are infinitely many Fermat pseudoprimes to base 2. In fact, the count of base-2 pseudoprimes up to $x$ is $x / L(x)^{1/2 + o(1)}$ where $L(x) = e^{\log x \cdot \log \log \log x / \log \log x}$ -- a result of Pomerance (1981). This grows faster than any power of $\log x$ but slower than any positive power of $x$ . In practical terms: pseudoprimes are sparse but not negligibly so.

For strong pseudoprimes (the kind Miller-Rabin uses), the density is lower still. The count of base-2 strong pseudoprimes below $x$ appears to grow roughly as $x^{1/3}$ empirically, though the exact asymptotics are not fully established.

Carmichael numbers (pseudoprimes to all coprime bases, a much stricter condition) have a proven lower bound of $x^{2/7}$ (Alford-Granville-Pomerance 1994). Erdős conjectured the true count up to $x$ is $x^{1-o(1)}$ , but this remains open. Pinch computed the count up to $10^{18}$ as 1,401,644 -- rare compared to Fermat pseudoprimes to any single base, but not as rare as one might expect.

The BPSW question

The central open problem in practical primality testing: does a BPSW counterexample exist?

Heuristic arguments suggest counterexamples should exist but be astronomically rare. Pomerance has estimated that if they exist, the smallest might exceed $10^{10000}$ . Exhaustive search has checked all numbers below $2^{64}$ with no counterexample found.

The $2,000 bounty (Baillie-Fiori-Wagstaff, 2021) remains unclaimed. If no counterexample exists, proving this would likely require new techniques in analytic number theory. If one exists, finding it would likely require new techniques in constructive algebra.

Either resolution would be significant. In the meantime, BPSW occupies a peculiar status: universally trusted in practice, with no theoretical proof of correctness, and with heuristic arguments that counterexamples should exist but haven't been found.

How real systems test primality

The gap between theory and practice matters here. Real systems do not just run Miller-Rabin in a loop. The full pipeline for generating an RSA prime in OpenSSL (as of version 3.x) looks like this:

  1. Generate a random odd number $n$ of the desired bit length, with the top two bits set (to ensure the product $pq$ has the right bit length).
  2. Check divisibility by a table of small primes (typically the first 2048 primes). This is trial division with a fixed bound, and it rejects about 80% of candidates cheaply.
  3. Run one round of Miller-Rabin with base 2.
  4. Run a Lucas test (the second half of BPSW).
  5. Depending on the security level and standard being followed, run additional Miller-Rabin rounds with random bases.
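A simplified, non-constant-time sketch of steps 1-3 and 5 (the Lucas step is omitted, the trial-division table is smaller than OpenSSL's, and `generate_probable_prime` is an illustrative name, not OpenSSL's API):

```python
import random

def small_primes(bound: int) -> list:
    """Sieve of Eratosthenes: the trial-division table for step 2."""
    sieve = bytearray([1]) * bound
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(sieve[i * i :: i]))
    return [i for i in range(bound) if sieve[i]]

TABLE = small_primes(1000)

def miller_rabin(n: int, k: int) -> bool:
    """k rounds with random bases; n odd, n > 3."""
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def generate_probable_prime(bits: int, rounds: int = 40) -> int:
    """Assumes bits is large enough that candidates exceed the table bound."""
    while True:
        # step 1: random odd candidate with the top two bits set
        n = random.getrandbits(bits) | (3 << (bits - 2)) | 1
        # step 2: cheap trial division rejects most candidates
        if any(n % p == 0 for p in TABLE):
            continue
        # steps 3 and 5: probabilistic rounds (Lucas step omitted here)
        if miller_rabin(n, rounds):
            return n
```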

GMP's mpz_probab_prime_p function follows a similar structure: trial division, then BPSW, then optional additional Miller-Rabin rounds requested by the caller.

The FIPS 186-5 standard (the current revision for US government systems) specifies Miller-Rabin with iteration counts that depend on the required error probability and the bit length of the candidate. For 1024-bit primes (half of a 2048-bit RSA modulus), FIPS requires enough rounds to achieve an error probability below $2^{-112}$ .

libsodium, used widely for NaCl-family cryptography, relies on Ed25519 (which uses a fixed curve, not generated primes) for signatures, so primality testing is not in its hot path. But for applications that do need prime generation, libsodium defers to the OS or to a constant-time Miller-Rabin implementation.

An important engineering detail: the primality test itself is rarely the bottleneck. For a 2048-bit RSA modulus, the expected number of random candidates before finding a prime is approximately $\ln(2^{1024}) \approx 710$ (by the prime number theorem). Each candidate undergoes trial division first, which filters out most composites in microseconds. Only the ~20% that survive trial division reach the expensive probabilistic test. End-to-end, generating both primes typically takes a few hundred milliseconds on modern hardware.

Every step in this pipeline must run in constant time. If the Miller-Rabin test takes longer for primes than composites (because primes survive more squaring steps), an attacker observing timing can learn information about the generated primes. Production implementations use Montgomery multiplication with fixed iteration counts -- the same number of multiplications regardless of intermediate values. This is a first-order correctness requirement, not a performance optimization.
