by Robin Wilson
I was surprised at how much time I ended up devoting to the Polymath project. This was partly because the nature of the project was so compelling—there were clear numerical metrics of ‘progress’ and always several possible ways of obtaining small improvement, which was continually encouraging. The general enthusiasm amongst the participants (and others outside of the project) also encouraged me to get more and more involved in the project.
At the time of writing, the current gap is 246. This is a massive improvement on the original 70 million—but there’s still quite a way to go before the twin prime conjecture is finally laid to rest.
The distribution of primes
How are the prime numbers distributed? Although they generally seem to ‘thin out’, the further along the list we proceed, they don’t seem to be distributed in a regular manner. For example, the hundred numbers just below 10 million include nine primes,
whereas the hundred numbers just above it include only two,
But although twin primes seem to crop up however far we go, we can also construct arbitrarily long strings of numbers that aren’t prime. To do so, we’ll use the factorial numbers n!, defined by
and in general,
Now, because the number n! is divisible by all the numbers from 1 to n, we see that:
and so the numbers
are all composite, giving us a string of composite numbers. Another example is:
So, for example, two strings of 1000 successive composite numbers are
and
The prime number theorem
In his inaugural lecture as professor at the University of Bonn in 1975, the distinguished number-theorist Don Zagier remarked:
There are two facts of which I hope to convince you so overwhelmingly that they will permanently be engraved in your hearts.
The first is that the prime numbers belong to the most arbitrary objects studied by mathematicians: they grow like weeds, seeming to obey no other law than that of chance, and nobody can predict where the next one will sprout.
The second fact is even more astonishing, for it states just the opposite: that the prime numbers exhibit stunning regularity, that there are laws governing their behaviour, and that they obey these laws with almost military precision.
To see what he meant by this, we’ll introduce the prime-counting function π(x), which counts the number of primes up to any number x. (This use of the Greek letter π (pi) is nothing to do with the circle number π.) So , because there are exactly four primes (2, 3, 5, and 7) up to 10, and , because there are four more primes (11, 13, 17, and 19). Continuing, we find that , , and .
If we plot the values of the primes up to 100 on a graph we get a jagged pattern—each new prime creates a jump (see Figure 30). But if we stand back and view the primes up to 100,000, we get a lovely smooth curve—the primes do indeed seem to increase very regularly.
30. The distribution of primes.
Table 6. x, π(x), and x /π(x), when x is a power of 10
x π(x) x /π(x)
10 4 2.5
100 25 4.0
1000 168 6.0
10,000 1229 8.1
100,000 9592 10.4
1,000,000 78,498 12.7
10,000,000 664,579 15.0
100,000,000 5,761,455 17.4
... ... ...
We can describe this overall regularity more precisely by comparing the values of x and π(x) as x increases. We get the following table which lists x, π(x), and their ratio x/π(x) (to one decimal place).
So up to 100 one-quarter of the numbers are prime, up to 1000 about one-sixth of them are prime, and so on. We can express this ‘thinning-out’ more precisely by noting that whenever x is multiplied by 10, the ratio x/π(x) seems to increase by around 2.3. This number 2.3 turns out to be the natural logarithm of 10. So what do we mean by ‘natural logarithm’?
The logarithm function, introduced in the early 1600s, is a mathematical device for turning multiplication problems into simpler addition ones, by using the basic rule:
There are actually several different logarithm functions, but the one we’re concerned with here is the natural logarithm. It has the property that , where e is an important number that’s about 2.71828. This number appears throughout mathematics and is connected with exponential growth. The graph of the natural logarithm, (sometimes written as ln x), is shown in Figure 31.
31. The graph of the natural logarithm.
Because the natural logarithm x turns multiplication into addition, and, in particular,
we can explain the phenomenon illustrated in the above table of values by saying that, as x increases, π(x) behaves rather like x/log x—or, more precisely, that their ratio approaches 1 as x becomes large. Figure 32 shows the similarity between the graphs of π(x) and x/log x.
32. The graphs of π(x) and x/log x.
Gauss guessed this connection (and even closer approximations) while experimenting with prime numbers at the age of 15. But it wasn’t proved until 1896, when Jacques Hadamard (from France) and Charles de la Vallée Poussin (from Belgium) did so independently, using sophisticated ideas from an area of calculus called complex analysis. It is known as the ‘prime number theorem’.
Prime number theorem: As x increases indefinitely, the ratio of π(x)/(x/log x) tends to 1.
It then took a further fifty years until the Norwegian Atle Selberg and the Hungarian Paul Erdős discovered a purely number-theoretical proof—and because Hadamard lived to 98, de la Vallée Poussin lived to 96, and Selberg lived to 90, it seems that proving the prime number theorem assists longevity!
Primes in arithmetic progressions
In Chapter 3 you met Euclid’s proof that there are infinitely many prime numbers, and we shall now adapt his ideas to derive more precise information. Recalling that every number has the form , , or , we note that numbers of the form 4q and those of the form (other than 2) cannot be prime, because they’re even and so have 2 as a factor. On the other hand, there are many primes of the form and many primes of the form :
primes of the form : 5, 13, 17, 29, 37, 41, 53, 61, 73, 89, 97, 101, …
primes of the form : 3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83, 103, … .
By modifying Euclid’s argument slightly we can prove that there are infinitely many prime numbers of the form . To do so, we’ll assume (for a contradiction) that there are only finitely many primes, p1, p2, …, pn, of this form (other than 3), but this time we’ll consider the number
which certainly has the form . Now, each of our primes divides the product , and so cannot also divide N. So either N is a new prime of the form , or it’s a composite number, in which case it must split into new primes. But these new primes cannot all be of the form , because multiplying any numbers of that form always gives another one:
So at least one of the new primes must have the form . It follows in either case that there’s a prime of the form other than p1, p2, …, and pn, and this contradiction shows that there must be infinitely many prime numbers of this form.
We can imitate this proof to show that there are infinitely many prime numbers of the form . We assume that there are only finitely many primes, p1, p2, …, pn, of this form, but this time we consider the number
noting that multiplying any numbers of the form always gives another one. We deduce that there are infinitely many primes of the form .
We can also adapt Euclid’s proof to deduce that there are infinitely many primes of certain other forms. But we cannot use this approach to prove that there are infinitely many primes of the form , because such numbers can also be obtained by multiplying only numbers of the form : for example, . We’re likewise unable to adapt this approach to prove that there are infinitely many primes of the form , because such numbers can be obtained by multiplying only numbers of the form : for example, .
But there are indeed infinitely many prime numbers of the form , and we can prove this by recalling from Chapter 4 that if −1 is a square (mod p), then p must be a prime
of the form . To get a contradiction we’ll assume that there are finitely many primes, p1, p2, …, pn, of the form , and this time we’ll consider the number
which certainly has the form . As before, none of our primes p1, p2, …, pn can divide N, so N is either a new prime or is a product of new primes. In either case there’s a new prime, which we’ll call p, that divides N. It follows that
−1 is a square (mod p) and we deduce that p has the form so . This contradiction proves there must be infinitely many primes of the form . A similar proof shows that that there are infinitely many primes of the form .
Having shown that there are infinitely many primes of these forms, we may now ask whether there are infinitely many prime numbers of the form , for any given numbers a and b. In its most general form, the answer to this question is ‘no’, for if a and b have a common factor d that’s greater than 1, then d must also divide all numbers of the form . For example, there are no primes of the form (because all such numbers have 2 as a factor) or of the form (because all such numbers have 3 as a factor).
But if a and b are coprime, then we have the following result, conjectured in 1785 by Legendre and proved in 1837 by Lejeune Dirichlet.
Dirichlet’s theorem: If a and b are given numbers with , then there are infinitely many prime numbers of the form , where q is an integer.
For example, there are infinitely many prime numbers of the form , because , and there are infinitely many prime numbers of the form , because . Also, because , there are infinitely many prime numbers of the form —that is, there are infinitely many primes with final digit 9—and likewise there are infinitely many primes with final digit 1, 3, or 7.
An arithmetic progression with first term b and common difference k is a finite or infinite sequence of equally spaced numbers of the form
For example, the sequence of numbers
is a finite arithmetic progression with first term and common difference , and the sequence
of numbers of the form is an infinite arithmetic progression with first term and common difference . More generally, the sequence of numbers of the form is an infinite arithmetic progression with first term b and common difference a, and if , then such a sequence must include infinitely many primes, by Dirichlet’s theorem.
Let’s now turn the above question around. The sequence
is an arithmetic progression of five primes with first term 5 and common difference 6, but it cannot be extended to an arithmetic progression of six primes because the next term is the composite number . An example of an arithmetic progression of six primes is
More generally, we can ask:
Does the list of primes contain arithmetic progressions of any chosen length n?
As we’ve just seen, the answer to this is ‘yes’ when and 6. For , the following arithmetic progression with first term 199 and common difference 210 consists entirely of primes:
Likewise, for , the arithmetic progression with first term 56,211,383,760,397 and common difference 44,546,738,095,860 consists entirely of primes. At the time of writing, the longest known arithmetic progressions of primes contain 26 primes.
The general question was answered in the affirmative by Ben Green and Terry Tao in 2004:
The Green–Tao theorem: Given any number n, the list of prime numbers contains an arithmetic progression of n primes.
Unique factorization
In Chapter 3 we saw that the factorization of positive integers into primes is unique, apart from the order of the factors—it’s a basic property of our number system. But it doesn’t hold for certain other systems of numbers. Here are two examples:
Example 1. Consider just the positive even numbers (which we’ll call e-numbers),
and consider the factorization of e-numbers into smaller e-numbers. We’ll call an e-number e-composite if it can be written as a product of smaller e-numbers, and e-prime if not. So 16 and 24 are e-composite numbers because and , but 6 and 10 are e-prime numbers because they can’t be written as products of smaller e-numbers. The first few e-primes are
But in this number system we don’t have unique factorization into e-primes: for example, 2, 6, and 18 are all e-primes, and the e-number 36 can be written as either or .
Example 2 (due to David Hilbert). Consider just the numbers of the form ,
and consider the factorization of h-numbers into smaller h-numbers. We’ll call an h-number h-composite if it can be written as a product of smaller h-numbers, and h-prime if not. So 25 and 45 are h-composite numbers because and , but 9 and 21 are h-prime numbers because they can’t be written as products of smaller h-numbers. The first few h-primes are
But in this number system we don’t have unique factorization into h-primes: for example, 9, 21, and 49 are all h-primes, and the h-number 441 can be written as either or .
However, there are some number systems, other than the positive integers, where we do have unique factorization. We give two examples:
Example 3. Consider the numbers of the form , where a and b are integers (√2-numbers). Such numbers include and , and we can carry out ordinary arithmetic with them, replacing (√2)2 wherever it arises by 2:
In this system we can define √2-prime and √2-composite numbers: for example, is √2-composite, because it can be written as . It is less easy to decide which are the √2-primes, but this can be done, and we can prove that every √2-number can be written as a product of √2-primes in only one way, apart from the order in which they appear.
Example 4. This is similar to the previous example, except that we replace √2 by i, the (imaginary) square root of −1. These numbers of the form , where a and b are integers, were introduced by Gauss in 1801, and are known as Gaussian integers. Such numbers include and , and we can carry out arithmetic with them, replacing i2 wherever it arises by −1: for example,
In this system Gauss defined Gaussian primes and Gaussian composite numbers, and proved that every Gaussian integer can be written as a product of Gaussian primes in just one way, apart from the order in which the Gaussian primes appear.
We can imitate the ideas of Examples 3 and 4 to explore prime and composite numbers of the form , where n is an integer. We’ll assume that n is ‘square-free’—that is, it has no square factors other than 1, because we can simply remove them: for example, because we can replace √18 by √2.
We sometimes have unique factorization into primes, as happened for and , but not always. When n is positive, it is not known in general when there’s factorization into primes in just one way. But when n is negative, we can give a complete answer. As before we don’t always get unique factorization: for example, when , the numbers, , and all play the role of primes, and yet we can write
so the factorization into primes is not unique in this case.
In his Disquisitiones Arithmeticae Gauss showed that there is unique factorization when (the Gaussian integers), and for a few other negative square-free values which he listed. He believed these to be the only ones, and this was eventually confirmed in the 1950s and 1960s by several writers, including Kurt Heegner (whose proof was incomplete), Harold Stark (who provided a complete proof) and Alan Baker (who proved it independently). We conclude this chapter with their remarkable result:
The Baker–Heegner–Stark theorem: For numbers of the form , where n is negative and square-free, factorization into primes is unique if and only if
Chapter 8
How to win a million dollars
In 2000 the Clay Mathematics Institute offered a prize of one million US dollars for the solution of each of seven famous problems, widely considered to be among the most important in the subject. The Riemann hypothesis was one of these ‘millennium problems’, and experts have been trying to prove it for more than 150 years.
So what is the Riemann hypothesis, and why is it important? The problem is concerned with the distribution of prime numbers, and was introduced by Bernhard Riemann, a German mathematician who died at the early age of 39 while a professor at Göttingen Un
iversity, where he had followed in the footsteps of Gauss and Dirichlet. Elected to the Berlin Academy in 1859, Riemann expressed his gratitude by presenting his only paper in number theory, ‘On the number of primes less than a given magnitude’ (see Figure 33). Only nine pages long, it is now regarded as a classic.
33. (a) Bernhard Riemann.
33. (b) Riemann’s 1859 paper.
In very broad terms the Riemann hypothesis asks whether all the solutions of a particular equation have a particular form. This is very vague, and the detailed assertion, which is still unproved, is:
Riemann hypothesis: All the non-trivial zeros of the Riemann zeta function have real part .
But what does this mean, and what is its connection with prime numbers?
Infinite series
To investigate these questions we’ll need to enter the world of infinite series. The series
where the denominators are the powers of 2, continues for ever. What happens when we add all these numbers together? Adding them one number at a time gives
Although we never reach 2 by adding a finite number of terms of the series (see Figure 34), we can get as close to 2 as we wish by adding enough of them—say, the first hundred or the first million: for example, we can get within 0.001 of 2 by adding the first twelve terms. We express this by saying that the infinite series converges to 2, or has the finite sum 2, and we write