The Distribution of Prime Numbers: Large Sieves and Zero-density Theorems (Oxford Mathematical Monographs)


Author: M.N. Huxley


OXFORD MATHEMATICAL MONOGRAPHS

MEROMORPHIC FUNCTIONS By W. K. HAYMAN. 1963

THE THEORY OF LAMINAR BOUNDARY LAYERS IN COMPRESSIBLE FLUIDS By K. STEWARTSON. 1964

CLASSICAL HARMONIC ANALYSIS AND LOCALLY COMPACT GROUPS By H. REITER. 1968

QUANTUM STATISTICAL FOUNDATIONS OF CHEMICAL KINETICS By S. GOLDEN. 1969

COMPLEMENTARY VARIATIONAL PRINCIPLES By A. M. ARTHURS. 1970

VARIATIONAL PRINCIPLES IN HEAT TRANSFER By MAURICE A. BIOT. 1970

PARTIAL WAVE AMPLITUDES AND RESONANCE POLES By J. HAMILTON and B. TROMBORG. 1972

THE DISTRIBUTION OF PRIME NUMBERS Large sieves and zero-density theorems BY

M. N. HUXLEY

OXFORD AT THE CLARENDON PRESS 1972

Oxford University Press, Ely House, London W. 1 GLASGOW NEW YORK TORONTO MELBOURNE WELLINGTON CAPE TOWN IBADAN NAIROBI DAR ES SALAAM LUSAKA ADDIS ABABA DELHI BOMBAY CALCUTTA MADRAS KARACHI LAHORE DACCA KUALA LUMPUR SINGAPORE HONG KONG TOKYO

© Oxford University Press 1972

Printed in Great Britain at the University Press, Oxford by Vivian Ridler Printer to the University

TO

THE MEMORY OF

PROFESSOR H. DAVENPORT

PREFACE

THIS book has grown out of lectures given at Oxford in 1970 and at University College, Cardiff, intended in each case for graduate students as an introduction to analytic number theory. The lectures were based on Davenport's Multiplicative Number Theory, but incorporated simplifications in several proofs, recent work, and other extra material.

Analytic number theory, whilst containing a diversity of results, has one unifying method: that of uniform distribution, mediated by certain sums, which may be exponential sums, character sums, or Dirichlet polynomials, according to the type of uniform distribution required. The study of prime numbers leads to all three. Hopes of elegant asymptotic formulae are dashed by the existence of complex zeros of the Riemann zeta function and of the Dirichlet L-functions. The prime-number theorem depends on the qualitative result that all zeros have real parts less than one. A zero-density theorem is a quantitative result asserting that not many zeros have real parts close to one. In recent years many problems concerning prime numbers have been reduced to that of obtaining a sufficiently strong zero-density theorem.

The first part of this book is introductory in nature; it presents the notions of uniform distribution and of large sieve inequalities. In the second part the theory of the zeta function and L-functions is developed and the prime-number theorem proved. The third part deals with large sieve results and mean-value theorems for L-functions, and these are used in the fourth part to prove the main results. These are the theorem of Bombieri and A. I. Vinogradov on primes in arithmetic progressions, a result on gaps between prime numbers, and I. M. Vinogradov's theorem that every large odd number is a sum of three primes.

The treatment is self-contained as far as possible; a few results are quoted from Hardy and Wright (1960) and from Titchmarsh (1951). Parts of prime-number theory not touched here, such as the problem of the least prime in an arithmetical progression, are treated in Prachar's Primzahlverteilung (Springer 1957). Further work on zero-density theorems is to be found in Montgomery (1971), who also gives a wide list of references covering the field.

M. N. H.

Cardiff 1971

CONTENTS

PART I. INTRODUCTORY RESULTS

1. Arithmetical functions
2. Some sum functions
3. Characters
4. Pólya's theorem
5. Dirichlet series
6. Schinzel's hypothesis
7. The large sieve
8. The upper-bound sieve
9. Franel's theorem

PART II. THE PRIME-NUMBER THEOREM

10. A modular relation
11. The functional equations
12. Hadamard's product formula
13. Zeros of ξ(s)
14. Zeros of ξ(s, χ)
15. The exceptional zero
16. The prime-number theorem
17. The prime-number theorem for an arithmetic progression

PART III. THE NECESSARY TOOLS

18. A survey of sieves
19. The hybrid sieve
20. An approximate functional equation (I)
21. An approximate functional equation (II)
22. Fourth powers of L-functions

PART IV. ZEROS AND PRIME NUMBERS

23. Ingham's theorem
24. Bombieri's theorem
25. I. M. Vinogradov's estimate
26. I. M. Vinogradov's three-primes theorem
27. Halász's method
28. Gaps between prime numbers

NOTATION

BIBLIOGRAPHY

INDEX

PART I

Introductory Results

1

ARITHMETICAL FUNCTIONS

An Expotition ... means a long line of everybody

I. 110

THIS chapter serves as a brief résumé of the elementary theory of prime numbers. A positive integer $m$ can be written uniquely as a product of primes

$$m = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}, \tag{1.1}$$

where the $p_i$ are primes in increasing order of size, and the $a_i$ are positive integers. We shall reserve the letter $p$ for prime numbers, and write a sum over prime numbers as $\sum_p$ and a product as $\prod_p$. The proof of unique factorization rests on Euclid's algorithm that the highest common factor $(m, n)$ of two integers (not both zero) can be written as

$$(m, n) = mu + nv, \tag{1.2}$$

where $u$, $v$ are integers. We use $(m, n)$ for the highest common factor and $[m, n]$ for the lowest common multiple of two integers where these are defined. Let $q$ be a positive integer. Then the statement that $m$ is congruent to $n$ (mod $q$), written $m \equiv n \pmod q$, means that $m - n$ is a multiple of $q$. Congruence mod $q$ is an equivalence relation, dividing the integers into $q$ classes, called residue classes mod $q$. A convenient set of representatives of the residue classes mod $q$ is $0, 1, 2, \ldots, q-1$. The residue classes mod $q$ form a cyclic group under addition, and the exponential maps

$$m \mapsto e_q(am), \tag{1.3}$$

where $a$ is a fixed integer, and

$$e(\alpha) = \exp(2\pi i \alpha), \qquad e_q(\alpha) = \exp(2\pi i \alpha / q), \tag{1.4}$$

are homomorphisms from this group to the group of complex numbers of unit modulus under multiplication. There are $q$ distinct maps, corresponding to $a = 0, 1, 2, \ldots, q-1$. They too can be given a group structure, forming a cyclic group of order $q$. They have the important property

$$\sum_{m \bmod q} e_q(am) = \begin{cases} q & \text{if } a \equiv 0 \pmod q, \\ 0 & \text{if not,} \end{cases} \tag{1.5}$$

where the summation is over a complete set of representatives of the residue classes mod $q$ (referred to briefly as a complete set of residues mod $q$). If on the left-hand side of eqn (1.5) we replace $m$ by $m+1$, the sum is still over a complete set of residues, but it has been multiplied by $e_q(a)$, which is not unity unless $a \equiv 0 \pmod q$. The sum is therefore zero unless $a \equiv 0 \pmod q$, when every term is unity. Interchange of $a$ and $m$ leads to a corresponding identity for the sum of the images of $m$ under a complete set of maps ($a = 0, 1, \ldots, q-1$). These identities arise because the images lie in a multiplicative not an additive group.

From Euclid's algorithm comes the Chinese remainder theorem: if $m$, $n$ are positive integers and $(m, n) = 1$, then any pair of residue classes $a$ (mod $m$) and $b$ (mod $n$) (which are themselves unions of residue classes mod $mn$) intersect in exactly one class $c$ (mod $mn$), given by

$$c \equiv bmu + anv \pmod{mn} \tag{1.6}$$

in the notation of eqn (1.2). Now let $f(m)$ be the number of solutions (ordered sets $(x_1, \ldots, x_n)$ of residue classes) of a set of congruences

$$g_i(x_1, \ldots, x_n) \equiv 0 \pmod m, \tag{1.7}$$

where the $g_i$ are polynomials in $x_1, \ldots, x_n$ with integer coefficients. When $(m, n) = 1$, $g_i(x_1, \ldots, x_n)$ is a multiple of $mn$ if and only if it is a multiple both of $m$ and of $n$. Hence

$$f(mn) = f(m) f(n) \quad \text{whenever } (m, n) = 1. \tag{1.8}$$
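The identities (1.2) and (1.6) can be tried out numerically. The following is a minimal sketch, not part of the original text: the function names `ext_gcd` and `crt` are hypothetical, and the extended form of Euclid's algorithm is one standard way of producing the integers $u$, $v$ of eqn (1.2).

```python
def ext_gcd(m, n):
    """Return (g, u, v) with g = (m, n) = m*u + n*v, as in eqn (1.2)."""
    old_r, r = m, n
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_u, u = u, old_u - k * u
        old_v, v = v, old_v - k * v
    return old_r, old_u, old_v


def crt(a, m, b, n):
    """Class c (mod mn) lying in a (mod m) and b (mod n), from eqn (1.6)."""
    g, u, v = ext_gcd(m, n)
    if g != 1:
        raise ValueError("moduli must be coprime")
    return (b * m * u + a * n * v) % (m * n)


c = crt(2, 5, 3, 7)
print(c, c % 5, c % 7)
```

Since $mu + nv = 1$, the term $bmu$ vanishes mod $m$ and is congruent to $b$ mod $n$, while $anv$ behaves the other way round; this is why the single expression in (1.6) lands in both classes.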

Equation (1.8) is the defining property of a multiplicative arithmetical function. An arithmetical function is an enumerated subset of the complex numbers, that is, a sequence $f(1), f(2), \ldots$ of complex numbers. The property

$$f(mn) = f(m) f(n) \tag{1.9}$$

for all positive integers $m$ and $n$ seems more natural; if eqn (1.9) holds as well as (1.8) then $f(m)$ is said to be totally multiplicative, but (1.8) is the property fundamental in the theory.

The Chinese remainder theorem enables us to construct more complicated multiplicative functions. We call a residue class $a$ (mod $q$) reduced if the highest common factor $(a, q)$ is unity. A sum over reduced residue classes is distinguished by an asterisk. With this notation we introduce Euler's function $\varphi(m)$ by

$$\varphi(m) = \sum_{a \bmod m}{}^{*}\ 1. \tag{1.10}$$

To show that $\varphi(m)$ is multiplicative, we must verify that in eqn (1.6) $(c, mn) = 1$ if and only if both $(a, m)$ and $(b, n)$ are unity. Equation (1.6) implies also that Ramanujan's sum

$$c_q(m) = \sum_{a \bmod q}{}^{*}\ e_q(am) \tag{1.11}$$

is multiplicative in $q$ for each $m$. We see this if we write

$$a = a_2 q_1 u_2 + a_1 q_2 u_1, \tag{1.12}$$

where

$$q_1 u_2 + q_2 u_1 = 1; \tag{1.13}$$

note that $u_1$ (mod $q_1$) and $u_2$ (mod $q_2$) are reduced residue classes, that

$$c_{q_1 q_2}(m) = \sum_{a_2 \bmod q_2}{}^{*}\ \sum_{a_1 \bmod q_1}{}^{*}\ e_{q_2}(a_2 u_2 m)\, e_{q_1}(a_1 u_1 m), \tag{1.14}$$

and that $a_1 u_1$ runs through a complete set of reduced residues mod $q_1$ when $a_1$ does so.

Two examples follow of totally multiplicative arithmetical functions. The first is

$$f(m) = m^{-s}, \tag{1.15}$$

where $s$ is a complex variable

$$s = \sigma + it, \tag{1.16}$$

$\sigma$ and $t$ being real. This notation is traditional among number theorists. To introduce our second example we note that the reduced residues mod $q$ (algebraically the invertible elements in the ring of integers mod $q$) form under multiplication an Abelian group of order $\varphi(q)$. By considering the images of the generators of this group, we can see that from this group to the group of complex numbers of unit modulus under multiplication there are $\varphi(q)$ maps $\chi$ with the homomorphism property

$$\chi(mn) = \chi(m)\chi(n). \tag{1.17}$$

These include the trivial map for which $\chi(m) = 1$ for each reduced class $m$. We turn these maps into arithmetical functions by defining

$$\chi(m) = 0 \quad \text{if } (m, q) > 1. \tag{1.18}$$

With this definition, eqn (1.17) still holds. We have now assigned a complex number to each residue class mod $q$. Hence we have constructed a totally multiplicative periodic function, which is called a Dirichlet's character mod $q$, or more briefly a character. Characters can be defined as those totally multiplicative functions that are periodic. Since negative integers also belong to well-defined residue classes mod $q$, we can speak of $\chi(m)$ when $m$ is a negative integer; in particular, we shall refer to $\chi(-1)$.
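As a concrete instance of (1.17) and (1.18), one can take the quadratic character modulo an odd prime. This example is an assumption of the sketch and is not singled out in the text above; the helper name `quadratic_character` is hypothetical, and the value is computed by Euler's criterion.

```python
def quadratic_character(m, p):
    """Quadratic character mod an odd prime p, computed by Euler's criterion."""
    m %= p                      # negative m also lies in a residue class mod p
    if m == 0:
        return 0                # eqn (1.18): the value is 0 when (m, p) > 1
    return 1 if pow(m, (p - 1) // 2, p) == 1 else -1


p = 11
values = [quadratic_character(m, p) for m in range(p)]
print(values)
print(quadratic_character(-1, p))
```

The function is periodic with period $p$ and satisfies (1.17) for all integers, negative ones included, so it is a character in the sense just described.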

It is possible to build new multiplicative functions from old. We say that $d$ divides $m$, written $d \mid m$, when the integer $m$ is a multiple of the positive integer $d$; another paraphrase is '$d$ is a divisor of $m$'. (Note that the divisors of $-6$ are 1, 2, 3, 6.) Now let $f(m)$ and $g(m)$ be multiplicative. Then so are the arithmetical functions

$$h(m) = f(m) g(m), \tag{1.19}$$

$$h(m) = \sum_{d \mid m} f(d), \tag{1.20}$$

and

$$h(m) = \sum_{d \mid m} f(d)\, g(m/d). \tag{1.21}$$

We shall consider eqn (1.21), since (1.20) is a special case, and (1.19) is evident. When $(m, n) = 1$, the divisor $d$ of $mn$ can be written uniquely as $d = ab$, where $a \mid m$ and $b \mid n$, and $(a, b) = 1$. Hence

$$h(mn) = \sum_{d \mid mn} f(d)\, g(mn/d) = \sum_{a \mid m} \sum_{b \mid n} f(ab)\, g(mn/ab) = \sum_{a \mid m} \sum_{b \mid n} f(a) f(b)\, g(m/a)\, g(n/b), \tag{1.22}$$

which is $h(m)h(n)$ as required. Thus

$$d(m) = \sum_{d \mid m} 1, \tag{1.23}$$

the number of divisors of $m$, and

$$\sigma(m) = \sum_{d \mid m} d, \tag{1.24}$$

the sum of the divisors of $m$, are multiplicative functions.
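The construction (1.21) and the two examples (1.23) and (1.24) can be checked by brute force for small arguments. This is a minimal sketch; the helper names `convolve` and `is_multiplicative` are hypothetical, and the divisor list is found by trial division, which is adequate only for small $m$.

```python
from math import gcd


def divisors(m):
    """All positive divisors of m, by trial division."""
    return [d for d in range(1, m + 1) if m % d == 0]


def convolve(f, g):
    """h(m) = sum over d | m of f(d) g(m/d), as in eqn (1.21)."""
    return lambda m: sum(f(d) * g(m // d) for d in divisors(m))


one = lambda m: 1
ident = lambda m: m
d = convolve(one, one)         # number of divisors, eqn (1.23)
sigma = convolve(ident, one)   # sum of divisors, eqn (1.24)


def is_multiplicative(h, limit):
    """Test h(mn) = h(m) h(n) for all coprime m, n below limit."""
    return all(h(m * n) == h(m) * h(n)
               for m in range(1, limit) for n in range(1, limit)
               if gcd(m, n) == 1)


print(d(12), sigma(12), is_multiplicative(d, 30), is_multiplicative(sigma, 30))
```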

We can invert eqn (1.20) and return from $h(m)$ to $f(d)$ by using Möbius's multiplicative function $\mu(m)$, defined by

$$\mu(1) = 1, \qquad \mu(p) = -1 \text{ for primes } p, \qquad \mu(p^a) = 0 \text{ for prime powers } p^a \text{ with } a > 1. \tag{1.25}$$

If the positive integer $m$ factorizes according to (1.1), then

$$\sum_{d \mid m} \mu(d) = \prod_{i=1}^{r} \{1 + \mu(p_i) + \mu(p_i^2) + \cdots + \mu(p_i^{a_i})\} = \prod_{i=1}^{r} (1 - 1) = 0, \tag{1.26}$$

unless $m = 1$, when the product in eqns (1.1) and (1.26) is empty. We have now proved the following lemma.

LEMMA. If $m$ is a positive integer, then

$$\sum_{d \mid m} \mu(d) = \begin{cases} 1 & \text{if } m = 1, \\ 0 & \text{if } m > 1. \end{cases} \tag{1.27}$$

From the lemma we have the corollary:

COROLLARY. If $h(m)$ and $f(m)$ are related by eqn (1.20), then

$$f(n) = \sum_{m \mid n} \mu(m)\, h(n/m), \tag{1.28}$$

and if eqn (1.28) holds then so does eqn (1.20).

To prove the corollary we substitute as follows.

$$\sum_{m \mid n} \mu(m)\, h(n/m) = \sum_{m \mid n} \mu(m) \sum_{d \mid (n/m)} f(d) = \sum_{d \mid n} f(d) \sum_{m \mid (n/d)} \mu(m) \tag{1.29}$$

when we interchange orders of summation. The inner sum is zero by eqn (1.27), unless $d = n$, when only one term $f(d)$ remains. The converse is proved similarly.

We can also define an additive function to be an arithmetical function $f(m)$ with

$$f(mn) = f(m) + f(n) \quad \text{when } (m, n) = 1. \tag{1.30}$$

The simplest examples are $\log m$ and the number of prime factors of $m$. There are useful arithmetical functions that are neither multiplicative nor additive. We shall make much use of $\Lambda(m)$, given by

$$\Lambda(m) = \begin{cases} \log p & \text{if } m \text{ is a prime power } p^a,\ a \ge 1, \\ 0 & \text{if } m \text{ is not a prime power.} \end{cases} \tag{1.31}$$

It satisfies the equation

$$\sum_{d \mid m} \Lambda(d) = \log m. \tag{1.32}$$

We could have used eqn (1.32) to define $\Lambda(m)$ and recovered the definition (1.31) by Möbius's inversion formula (1.28).
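The last remark can be illustrated directly: applying the inversion formula (1.28) to $h(m) = \log m$ should return $\log p$ at prime powers and zero elsewhere. The sketch below assumes nothing beyond (1.25), (1.28), and (1.32); the function names are hypothetical and the Möbius function is computed by trial division.

```python
from math import log


def mobius(m):
    """Mobius function from definition (1.25), by trial division."""
    result = 1
    p = 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0          # a repeated prime factor
            result = -result
        p += 1
    if m > 1:                     # one prime factor left over
        result = -result
    return result


def von_mangoldt(m):
    """Lambda(m) recovered from (1.32) by the inversion formula (1.28)."""
    return sum(mobius(d) * log(m // d) for d in range(1, m + 1) if m % d == 0)


for m in (7, 8, 9, 12, 30):
    print(m, round(von_mangoldt(m), 12))
```

Because of floating-point rounding, the values at integers that are not prime powers come out as numbers of size about $10^{-16}$ rather than exactly zero.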

2

SOME SUM FUNCTIONS

THE study of the sum functions of arithmetical functions is important in analytic number theory. For instance, we shall treat many of the properties of prime numbers by using the sum function

$$\psi(x) = \sum_{m \le x} \Lambda(m). \tag{2.1}$$

Our object is to express the sum function as a smooth main term (a power of $x$ or of $\log x$, for example) plus an error term. In place of the cumbersome

$$|f(x)| = O(g(x)), \tag{2.2}$$

we shall often write

$$f(x) \ll g(x), \tag{2.3}$$

and other asymptotic inequalities similarly. Some sum functions can be estimated by writing the arithmetical function as a sum over divisors and rearranging. In this chapter we shall give examples of this method. From the theory of the logarithmic function we borrow the relation

$$\sum_{m=1}^{M} \left\{ \frac{1}{m} - \log\left(\frac{m+1}{m}\right) \right\} = \gamma + O(M^{-1}), \tag{2.4}$$

where $\gamma$ is a constant lying between $\tfrac{1}{2}$ and 1. We deduce the useful formula

$$\sum_{m=1}^{M} \frac{1}{m} = \log(M+1) + \gamma + O(M^{-1}). \tag{2.5}$$

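Our first example is an asymptotic formula for

$$\Phi(x) = \sum_{m \le x} \varphi(m) \tag{2.6}$$

as $x$ tends to infinity. Since $\varphi(m)$ is the number of integers $r$ with $1 \le r \le m$ and $(r, m) = 1$, eqn (1.27) gives

$$\sum_{\substack{d \mid m \\ d \mid r}} \mu(d) = \begin{cases} 1 & \text{if } (r, m) = 1, \\ 0 & \text{if not.} \end{cases} \tag{2.7}$$

Hence

$$\varphi(m) = \sum_{r=1}^{m} \sum_{\substack{d \mid m \\ d \mid r}} \mu(d) \tag{2.8}$$

and so

$$\Phi(x) = \sum_{m \le x} \sum_{r=1}^{m} \sum_{\substack{d \mid m \\ d \mid r}} \mu(d) = \sum_{d \le x} \mu(d) \sum_{\substack{m \le x \\ m \equiv 0 \ (\mathrm{mod}\ d)}} \ \sum_{\substack{r=1 \\ r \equiv 0 \ (\mathrm{mod}\ d)}}^{m} 1 = \sum_{d \le x} \mu(d) \sum_{\substack{m \le x \\ m \equiv 0 \ (\mathrm{mod}\ d)}} \frac{m}{d} = \sum_{d \le x} \mu(d) \sum_{n \le x/d} n, \tag{2.9}$$

where we have written $n$ for $m/d$, and the sum over $n$ is $\tfrac{1}{2}(x/d)^2 + O(x/d)$.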

We now write eqn (2.9) as

$$\Phi(x) = \sum_{d \le x} \mu(d) \left\{ \frac{x^2}{2d^2} + O\!\left(\frac{x}{d}\right) \right\} \tag{2.10}$$

$$= \tfrac{1}{2} x^2 \sum_{d=1}^{\infty} \frac{\mu(d)}{d^2} + O\Big(x^2 \sum_{d > x} \frac{1}{d^2}\Big) + O\Big(x \sum_{d \le x} \frac{1}{d}\Big) = cx^2 + O(x \log x), \tag{2.11}$$

where the constant $c$ can be evaluated; in the notation of Chapter 5 it is $\{2\zeta(2)\}^{-1}$, which can be shown to be $3\pi^{-2}$. The size of the error term is highly satisfactory: when $x$ is a prime, $\Phi(x)$ has a discontinuity $x - 1$, and the actual error in (2.11) is $\gg x$ infinitely often. Even so, the upper bound $O(x \log x)$ for the error term in (2.11) has been improved a little.

When we first apply the same method to

$$D(x) = \sum_{m \le x} d(m), \tag{2.12}$$

we find that

$$D(x) = \sum_{m \le x} \sum_{r \mid m} 1 = \sum_{r \le x} \left\{ \frac{x}{r} + O(1) \right\} = x(\log x + \gamma + O(x^{-1})) + O(x) = x \log x + O(x), \tag{2.13}$$

by (2.5). The error term here is much larger in proportion to the main term than in (2.11); in fact a little cunning enables us to improve (2.13). We let $y$ be the positive integer for which $y^2 \le x < (y+1)^2$, and write $m = qr$ in eqn (2.12), so that

$$D(x) = \sum_{qr \le x} 1 = \sum_{q \le y} \sum_{r \le x/q} 1 + \sum_{r \le y} \sum_{q \le x/r} 1 - \sum_{q \le y} \sum_{r \le y} 1 = 2 \sum_{q \le y} \left( \frac{x}{q} + O(1) \right) - y^2 = x \log x + (2\gamma - 1)x + O(x^{1/2}), \tag{2.14}$$

when we substitute (2.5) and the value of $y$. Dirichlet's divisor problem is to improve the exponent of $x$ in the error term of (2.14). I. M. Vinogradov (1955) has shown that the error term in (2.14) is $\ll x^{1/3} \log^2 x$ by elementary arguments. Van der Corput and others have obtained estimates $\ll x^{\delta}$ with values of $\delta$ a little less than $\tfrac{1}{3}$; their method involves writing the error in (2.14) as a contour integral.
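The rearrangement behind (2.14) is easy to try numerically. The sketch below is illustrative only: the function names are hypothetical, the numerical value of $\gamma$ is hard-coded, and the choice $x = 10^6$ is arbitrary. It counts the pairs $qr \le x$ once directly and once by the three sums of (2.14), then compares with the main term.

```python
from math import isqrt, log

EULER_GAMMA = 0.5772156649015329   # numerical value of the constant in (2.4)


def D_direct(x):
    """D(x) = number of pairs (q, r) with qr <= x, summed over r as in (2.13)."""
    return sum(x // r for r in range(1, x + 1))


def D_hyperbola(x):
    """The same count by (2.14): 2 * sum_{q <= y} [x/q] - y^2 with y = [sqrt(x)]."""
    y = isqrt(x)
    return 2 * sum(x // q for q in range(1, y + 1)) - y * y


x = 10**6
main_term = x * log(x) + (2 * EULER_GAMMA - 1) * x
error = D_hyperbola(x) - main_term
print(D_hyperbola(x), round(error, 2), round(x ** 0.5, 2))
```

The second routine needs only about $x^{1/2}$ operations instead of $x$, which reflects the same economy that reduces the error term from $O(x)$ to $O(x^{1/2})$.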

On the other side, it has been shown that the error term is $\gg x^{\delta}$ for each $\delta < \tfrac{1}{4}$. The limitation on the accuracy of (2.14) is not so easily explained as that of (2.11): it is not difficult to show that

$$d(m) \ll m^{\epsilon} \tag{2.15}$$

for each $\epsilon > 0$, and so the error in (2.14) must often be very much greater than any individual step in the value of $D(x)$.

To illustrate more difficult examples, we consider

$$\sum_{m \le x} \frac{d^2(m)}{m}. \tag{2.16}$$

Let $d_r(m)$ be the number of ways of writing the positive integer $m$ as a product of $r$ positive integers (so that $d(m) = d_2(m)$). By induction on (2.14),

$$D_r(x) = \sum_{m \le x} d_r(m) = \sum_{uv \le x} d_{r-1}(v) = \frac{x \log^{r-1} x}{(r-1)!} + O(x \log^{r-2} x), \tag{2.17}$$

and by partial summation

$$\sum_{m \le x} \frac{d_4(m)}{m} = \sum_{m \le x} \frac{D_4(m)}{m(m+1)} + O(\log^3 x) = \tfrac{1}{24} \log^4 x + O(\log^3 x). \tag{2.18}$$

We have chosen to compare $d^2(m)$ with $d_4(m)$ since, when $m$ is a prime power $p^a$,

$$d_4(m) = \tfrac{1}{6}(a+1)(a+2)(a+3), \tag{2.19}$$

which is equal to $d^2(m)$ if $a = 0$ or 1. The next step is to find a function $b(m)$ for which

$$d^2(m) = \sum_{uv = m} d_4(u)\, b(v). \tag{2.20}$$

Since

$$\tfrac{1}{6}(a+1)(a+2)(a+3) - \tfrac{1}{6}(a-1)a(a+1) = (a+1)^2, \tag{2.21}$$

the choice $b(1) = 1$, $b(p^2) = -1$, $b(p^a) = 0$, for prime powers $p^a$ with $a$ not zero or two, satisfies eqn (2.20) when $m$ is a prime power. If we complete the definition of $b(m)$ by making it multiplicative, then (2.20) holds for all $m$. The choice is thus

$$b(m) = \begin{cases} \mu(n) & \text{if } m = n^2, \\ 0 & \text{if } m \text{ is not a perfect square.} \end{cases} \tag{2.22}$$

We now complete the proof. Equations (2.20) and (2.22) give

$$\sum_{m \le x} \frac{d^2(m)}{m} = \sum_{uv \le x} \frac{d_4(u)\, b(v)}{uv} = \sum_{t^2 \le x} \frac{\mu(t)}{t^2} \sum_{u \le x/t^2} \frac{d_4(u)}{u}. \tag{2.23}$$

When we substitute (2.18) and the value $6\pi^{-2}$ of $\sum_{t=1}^{\infty} \mu(t) t^{-2}$ into this, we have

$$\sum_{m \le x} \frac{d^2(m)}{m} = \left( \frac{1}{4\pi^2} + o(1) \right) \log^4 x. \tag{2.24}$$

We require (2.24) and upper estimates for similar sums in the later work.
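The identity (2.20) with the choice (2.22) can be verified by brute force for small $m$. This is a sketch under the definitions given above; the function names are hypothetical, and $d_r(m)$ is computed by the recursion $d_r(m) = \sum_{u \mid m} d_{r-1}(m/u)$, which is the same counting as in (2.17).

```python
from math import isqrt


def divisors(m):
    return [d for d in range(1, m + 1) if m % d == 0]


def d_r(m, r):
    """Number of ways of writing m as an ordered product of r positive integers."""
    if r == 1:
        return 1
    return sum(d_r(m // u, r - 1) for u in divisors(m))


def mobius(m):
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result


def b(m):
    """The function of eqn (2.22): mu(n) if m = n^2, and 0 otherwise."""
    n = isqrt(m)
    return mobius(n) if n * n == m else 0


def rhs(m):
    """Right-hand side of eqn (2.20)."""
    return sum(d_r(u, 4) * b(m // u) for u in divisors(m))


print(all(d_r(m, 2) ** 2 == rhs(m) for m in range(1, 201)))
```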

Any $\ll$ estimate for a sum involving divisor functions that we quote will be a corollary of (2.14) or of (2.24), possibly using partial summation. The method we employ in this chapter can be summarized as follows.

To work out a general sum function

$$F(x) = \sum_{m \le x} f(m), \tag{2.25}$$

m f . First we introduce some notation. If a is a real number, we write [a] for the largest integer not exceeding a, and IIaII for the distance from a to the nearest integer, so that (4.3) [a] = maxm, m_ -1q, we can replace m by -q+m and obtain a sum with 0 < x < 2q, multiplied by -X(-1). Thus we can suppose that 0 < x < jq in (4.1). We now use H(a) to construct G(a), where 1

G(a)

-

0

if 0 0 term-by-term integration gives

«+i

f

xss -If

(s) ds

(5.6)

a-coo

where the last term occurs only if x is an integer. There are many integral transforms from Dirichlet series to their coefficient sums, all proved by the same method. The simplest one after (5.6) itself is «+iOO

f

21T1

xsf(s) 8(8 +1)

ds =

7A a in which one side of eqn (5.10) converges absolutely. If the product in (5.10) converges, f (s) can be zero only when one of the factors on the right-hand side of (5.10) is zero. The convergence of the left-hand side of (5.10) alone does not imply that of the product; L(s, X) with X non-trivial has a series (5.8) converging for a > 0, but the function itself has zeros in a > 2, preventing the product from converging in 0 < a < J. The second defining property is that f(8) should have a functional equation f(s)G(s) = f*(r-s)G*(r-s), (5.11)

where $r$ is a positive integer, $G(s)$ is essentially a product of gamma functions, and the operation $*$ has $(f^*)^* = f$ and $(G^*)^* = G$. As an example, in the functional equation for $L(s, \chi)$ in Chapter 11, $L^*(s, \chi)$ is $L(s, \bar{\chi})$. An important conjecture about L-functions is the Riemann hypothesis that if $f(s)$ satisfies eqns (5.10) and (5.11) then all zeros of $f(s)G(s)$ have real part $\tfrac{1}{2}r$. The truth or falsity of this hypothesis is not settled for any L-function. Two generalizations that are often called zeta functions are

$$\sum_{m=1}^{\infty} (m + \delta)^{-s}, \tag{5.12}$$

where $\delta$ is a fixed real number, and

$$\sum_{m=1}^{\infty} r(m)\, m^{-s}, \tag{5.13}$$

where $r(m)$ is the number of representations of $m$ by a positive definite quadratic form. Except in special cases these fail to have a product formula of the form (5.10), and not all of their zeros lie on the appropriate line. Some authors even use 'zeta function' as a synonym for 'Dirichlet series'.

In Chapter 11 we shall obtain analytic continuations of $\zeta(s)$ and other L-functions over the whole plane. Since the sum function $X(x)$ formed with a non-trivial character $\chi$ is bounded, by partial summation (5.8) converges for $\sigma > 0$ except when $\chi$ is trivial. Similarly, the function

$$\sum_{m=1}^{\infty} (-1)^{m-1} m^{-s} = (1 - 2^{1-s})\, \zeta(s) \tag{5.14}$$

converges for $\sigma > 0$ and provides an analytic continuation for $\zeta(s)$. When we make $s \to 1$ in (5.14), we see that $\zeta(s)$ has a pole of residue 1 at $s = 1$. When we put $f(s) = \zeta(s)$ in (5.6), the integrand has a simple pole at $s = 1$ with residue $x$. The value of the right-hand side of eqn (5.6) is between $x - 1$ and $x$. If we deform the contour in (5.6) so that it passes to the left of the pole, the residue makes the main contribution, and the contour integral left over is bounded. Let

$$\psi(x) = \sum_{m \le x} \Lambda(m) \tag{5.15}$$

and

$$M(x) = \sum_{m \le x} \mu(m), \tag{5.16}$$

which are the coefficient sums of $-\zeta'(s)/\zeta(s)$ and of $1/\zeta(s)$ (we shall prove this below). The function $-\zeta'(s)/\zeta(s)$ also has a pole of residue 1 at $s = 1$, but $1/\zeta(s)$ does not. If the corresponding contour integrals were negligible, we should have

$$\psi(x) = x + o(x), \tag{5.17}$$

$$M(x) = o(x). \tag{5.18}$$

These are forms of the prime-number theorem, which we shall prove in Part II. Writing $m = fg$, we have

$$\sum_{m=1}^{\infty} m^{-s} \sum_{f \mid m} a(f)\, b(m/f) = \Big( \sum_{f=1}^{\infty} a(f) f^{-s} \Big) \Big( \sum_{g=1}^{\infty} b(g) g^{-s} \Big). \tag{5.19}$$

If $b(g) = 1$ and $a(f) = \mu(f)$ for each pair of integers $f$, $g$,

$$\zeta(s) \sum_{m=1}^{\infty} \mu(m)\, m^{-s} = \sum_{m=1}^{\infty} m^{-s} \sum_{f \mid m} \mu(f) = 1 \tag{5.20}$$

from eqn (1.27). Since eqns (5.3) and (5.6) imply that expansions in Dirichlet series are unique, we have shown that the series on the left-hand side of (5.20) represents $1/\zeta(s)$ wherever it converges. Similarly, using eqns (5.19) and (1.32) we can check that $-\zeta'(s)/\zeta(s)$ has a Dirichlet series with coefficients $\Lambda(m)$. For fifty years (1898-1948) the only proofs known of eqns (5.17) and (5.18) used contour integration and other complex-variable techniques.
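Equation (5.20) can be looked at numerically in two ways: the coefficients of the product series should be 1, 0, 0, ..., and at a point of absolute convergence such as $s = 2$ the product of the two truncated series should be close to 1. The following sketch makes both checks; the truncation point `LIMIT` and the helper name `mobius_sieve` are arbitrary choices of this example, not part of the text.

```python
def mobius_sieve(limit):
    """Table mu[0..limit] of the Mobius function, by a sieve over primes."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for k in range(p, limit + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] = -mu[k]           # one more prime factor
            for k in range(p * p, limit + 1, p * p):
                mu[k] = 0                # divisible by a square
    return mu


LIMIT = 2000
mu = mobius_sieve(LIMIT)

# Coefficients of the product series in (5.20): sum over f | m of mu(f).
coefficients = [sum(mu[f] for f in range(1, m + 1) if m % f == 0)
                for m in range(1, 101)]

# Truncated series at s = 2.
zeta_partial = sum(m ** -2 for m in range(1, LIMIT + 1))
mu_partial = sum(mu[m] * m ** -2 for m in range(1, LIMIT + 1))
print(coefficients[:5], zeta_partial * mu_partial)
```

The product of the truncated sums differs from 1 only by the tails of the two series, each of which is of size about 1/`LIMIT`.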

In 1948, Selberg and Erdős gave a real proof of (5.18) (see Hardy and Wright (1960), Chapter 22). The real-variable approach is not so well understood, and the strongest forms of (5.17) and (5.18) (those in which the error term is smallest) have been obtained by analytic methods. The form (16.22) in which we shall prove (5.17) is a little stronger than the best so far obtained by Selberg's method. Apart from the analytic arguments, study of $\log(N!)$ suggests the form (5.17) as a conjecture. By eqn (1.32),

$$\log(N!) = \sum_{m \le N} \log m = \sum_{m \le N} \sum_{d \mid m} \Lambda(d)$$

2

(6.2)


24

1.6

We shall now describe how to write down the conjectured asymptotic formulae. Let

$$S(\alpha) = \sum_{p \le N} e(p\alpha); \tag{6.3}$$

such an expression is called an exponential sum or a trigonometric sum. By the fundamental relation

$$\int_0^1 e(m\alpha)\, d\alpha = \begin{cases} 1 & \text{if } m = 0, \\ 0 & \text{if } m \ne 0, \end{cases} \tag{6.4}$$

we see that the number of primes $p \le N$ for which $p - 2$ is also prime is

$$\int_0^1 S(\alpha)\, S(-\alpha)\, e(-2\alpha)\, d\alpha. \tag{6.5}$$
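For a small value of $N$ the integral (6.5) can be evaluated exactly by machine, which makes the counting mechanism visible. The sketch below rests on one assumption not in the text: since the integrand is a trigonometric polynomial whose frequencies $p - p' - 2$ all lie strictly between $-M$ and $M$ when $M = 2N + 3$, the integral equals the average of the integrand over the $M$ points $\alpha = j/M$. The function names are hypothetical.

```python
import cmath


def e(alpha):
    """e(alpha) = exp(2 pi i alpha), as in eqn (1.4)."""
    return cmath.exp(2j * cmath.pi * alpha)


def primes_up_to(n):
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for k in range(p * p, n + 1, p):
                sieve[k] = False
    return [p for p in range(n + 1) if sieve[p]]


def twin_count_by_integral(N):
    """Integral (6.5), computed as an exact average over alpha = j/M."""
    ps = primes_up_to(N)
    M = 2 * N + 3
    total = 0
    for j in range(M):
        alpha = j / M
        S = sum(e(p * alpha) for p in ps)
        total += S * S.conjugate() * e(-2 * alpha)   # S(-alpha) is the conjugate
    return total / M


N = 100
ps = set(primes_up_to(N))
direct = sum(1 for p in ps if p - 2 in ps)
print(round(twin_count_by_integral(N).real), direct)
```

This only confirms the identity; it gives no information about the size of the integral for large $N$, which is the real difficulty discussed next.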

We cannot, of course, work out this integral, but we can suggest a plausible value for it. Writing

$$\pi(N; q, b) = \sum_{\substack{p \le N \\ p \equiv b \ (\mathrm{mod}\ q)}} 1, \tag{6.6}$$

we have

$$S(a/q) = \sum_{b \bmod q} e_q(ab)\, \pi(N; q, b). \tag{6.7}$$

Now the sum in eqn (6.6) is 0 or 1 if $b$ has a common factor with $q$. If we make the approximation that the primes are divided equally between the $\varphi(q)$ residue classes $b$ with $(b, q) = 1$, the expression in (6.7) is

$$\{\varphi(q)\}^{-1} \pi(N) \sum_{b \bmod q}{}^{*}\ e_q(ab) = \{\varphi(q)\}^{-1} \pi(N)\, c_q(a), \tag{6.8}$$

where $c_q(a)$ is Ramanujan's sum (1.11). If $(a, q) = 1$ the Ramanujan's sum is just $\mu(q)$, from eqn (3.19). This argument suggests that $S(\alpha)$ has a 'spike' at $a/q$ of height proportional to $\mu(q)/\varphi(q)$, that is, that $|S(\alpha)|$ has a local maximum close to $a/q$. Now the area under the graph of $|S(\alpha)|^2$ near 0 (which certainly is the site of a spike) may plausibly be written

$$N^{-1} \pi^2(N) + \text{error term}, \tag{6.9}$$

and this can be proved to be true. If we assume further that all the spikes at rational points are the same shape, the spikes at rational points $a/q$ contribute

$$N^{-1} \pi^2(N) \sum_{q} \mu^2(q)\, (\varphi(q))^{-2} \sum_{a \bmod q}{}^{*}\ e_q(2a) + \text{error term}, \tag{6.10}$$

and the sum over $q$ in (6.10) converges to

$$2 \prod_{p > 2} \{1 - (p-1)^{-2}\}, \tag{6.11}$$

which gives the main term in eqn (6.2).
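The convergence of the sum over $q$ in (6.10) to the product (6.11) can be observed numerically. In the sketch below the truncation points `Q` and `P` are arbitrary choices, the function names are hypothetical, and Ramanujan's sum is evaluated straight from the definition (1.11), which is slow but keeps the example close to the text.

```python
import cmath
from math import gcd, isqrt


def ramanujan_sum(q, m):
    """c_q(m) from definition (1.11); the value is an integer, so we round."""
    total = sum(cmath.exp(2j * cmath.pi * a * m / q)
                for a in range(1, q + 1) if gcd(a, q) == 1)
    return round(total.real)


def phi(q):
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)


def is_squarefree(q):
    return all(q % (r * r) != 0 for r in range(2, isqrt(q) + 1))


def is_prime(n):
    return n > 1 and all(n % r != 0 for r in range(2, isqrt(n) + 1))


Q, P = 400, 10000
series = sum(ramanujan_sum(q, 2) / phi(q) ** 2
             for q in range(1, Q + 1) if is_squarefree(q))
product = 2.0
for p in range(3, P + 1):
    if is_prime(p):
        product *= 1 - 1 / (p - 1) ** 2
print(series, product)
```

Both numbers should be near 1.32; the agreement improves only slowly with `Q` because the terms of the series decay like $\varphi(q)^{-2}$.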

For small $q$ the argument above can be made rigorous; but then part of the range of integration in (6.5) does not support spikes. Away from a spike we cannot estimate $S(\alpha)$ except by replacing it by its absolute value; and the spikes with small $q$ contribute very little to $\int_0^1 |S(\alpha)|^2\, d\alpha$. In the integral of $|S(\alpha)|^3$ the spikes do dominate, and by this method I. M. Vinogradov was able to prove that every large odd number is the sum of three primes.

The approach to Schinzel's hypothesis through exponential sums does lead to an upper bound for the number of sets $x_1, \ldots, x_n$ of integers not exceeding $N$ for which $f_1, \ldots, f_m$ are all prime. To explain the method we shall take $n = 1$, so we are considering integers $x$ in the range $1 \le x \le N$ for which $f_1(x), \ldots, f_m(x)$ are all prime. We now work modulo a prime $p$. Apart from the finite number of $x$ for which one of $f_1(x), \ldots, f_m(x)$ is $p$, $x$ must be such that none of $f_1(x), \ldots, f_m(x)$ is $\equiv 0 \pmod p$. This means that $x$ must be confined to certain residue classes mod $p$. We therefore divide the residue classes mod $p$ into a set $H(p)$ of $f(p)$ forbidden classes and a set $K(p)$ of $g(p) = p - f(p)$ permitted classes; $h$ mod $p$ is forbidden if and only if one of the polynomials $f_i(h)$ is a multiple of $p$. If $x$ falls into a forbidden class for any prime $p$ smaller than each of the $f_i(x)$, then one at least of the $f_i(x)$ cannot be prime. The values of $x$ that make $f_1(x), \ldots, f_m(x)$ primes greater than some bound $Q$ form a sifted sequence, in the following sense.

The increasing sequence $\mathcal{N}$ of positive integers $n_1, n_2, \ldots$ is sifted by the primes $p \le Q$ if for each prime $p \le Q$ there is a set $H(p)$ (possibly empty) of $f(p)$ residue classes mod $p$ into which no member of $\mathcal{N}$ falls. We shall show in Chapter 8 that, if $\mathcal{N}$ satisfies the above condition, the number of members of $\mathcal{N}$ in any interval of $N$ consecutive integers is

$$\le N \Big( \sum_{q \le Q} \mu^2(q) f(q)/g(q) \Big)^{-1} + \text{error term}, \tag{6.12}$$

where

$$f(q) = q \prod_{p \mid q} f(p)/p, \tag{6.13}$$

$$g(q) = q \prod_{p \mid q} \{1 - f(p)/p\}. \tag{6.14}$$

We shall work out examples of this upper bound in Chapter 8; in each case the leading term is a multiple of the leading term in the conjectured formula. Upper bounds of the right order of magnitude were first found by Viggo Brun using combinatorial arguments. Rosser used Brun's method to obtain expression (6.12), which was found in a different way by Selberg.

An outline of Selberg's method follows. It rests on the construction of an exponential sum $T(\alpha)$ with the same spikes at rational points as

$$S(\alpha) = \sum_{\substack{n \in \mathcal{N} \\ n \le N}} e(n\alpha) \tag{6.15}$$

is conjectured to have. Let $K(q)$ be the set of $g(q)$ residue classes that are in $K(p)$ (permitted classes) for each prime factor $p$ of $q$. At $a/q$ we expect a spike of height proportional to

$$K(a, q) = \sum_{k \in K(q)} e_q(ak)/g(q). \tag{6.16}$$

If $p$ is a repeated prime factor of $q$, then the classes $k + q/p$ are also in $K(q)$, and since replacement of $k$ in eqn (6.16) by $k + q/p$ multiplies the right-hand side by $e_p(a)$, which is not unity, $K(a, q)$ is zero if $q$ has any repeated prime factor. Now the simplest exponential sum is

$$F(\alpha) = \sum_{m=1}^{N} e(m\alpha), \tag{6.17}$$

which has a spike at $\alpha = 0$, since

$$|F(\alpha)| = |\sin \pi N \alpha / \sin \pi \alpha|. \tag{6.18}$$

We therefore compare $S(\alpha)$ with

$$T(\alpha) = \sum_{q \le Q} \sum_{a \bmod q}{}^{*}\ K(a, q)\, F(\alpha - a/q). \tag{6.19}$$

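The coefficient of $e(m\alpha)$ in $T(\alpha)$ is

$$\sum_{q \le Q} \frac{1}{g(q)} \sum_{a \bmod q}{}^{*}\ \sum_{k \in K(q)} e_q(ak - am) = \sum_{q \le Q} \frac{1}{g(q)} \sum_{k \in K(q)} c_q(k - m), \tag{6.20}$$

using the definition (1.11) of Ramanujan's sum. We know the sum over $k$ must be zero if $q$ has a repeated prime factor; if $q$ is square-free then

$$\sum_{k \in K(q)} c_q(k - m) = \sum_{k \in K(q)} \sum_{\substack{d \mid q \\ d \mid (k - m)}} d\, \mu(q/d) = \sum_{d \mid q_1} d\, \mu(q/d)\, \frac{g(q)}{g(d)} = \mu(q)\, g(q)\, \mu(q_1)\, f(q_1)/g(q_1), \tag{6.21}$$

where $q_1$ is the largest factor of $q$ for which $m \in K(q_1)$. We write $q = q_1 q_2$, so that if $m \in H(d)$ then $d$ divides $q_2$. The expression in (6.21) becomes

$$\mu(q_2)\, g(q_2)\, f(q)/f(q_2). \tag{6.22}$$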

Now

$$\sum_{d \mid q_2} \frac{\mu(d)\, d}{f(d)} = \frac{\mu(q_2)\, g(q_2)}{f(q_2)} \tag{6.23}$$

and so the coefficient of e(ma) in eqn (6,19) is µ(d)d

=a

)

11 2(q)f(q)

f(d) q°R 111+L

(7.19)

1

here L is a parameter which we shall choose below. We now have nW

I(u,P'))I = I

ill

Moreover

IIuII2 = G Ib,ni2/k2 and

_

IS(x,.)I.

(7.20)

Iam12

(7.21)

N

(V''), f(s)) = Ic(xs-x,.),

where

K(oz) _

(7.22)

r1I+L

-t1L-L

k2, e(mna)

sin2(111+ L)7ra-sin21117ra

(7.23)

L si1127TO,

Hence we have R

I < 2111+L+2

s=1

L-1 cosec27rm8 1 00

2111+ L +

2L-11T2

7n-28-2+O(L-1)

2M+L+(3L82)-1+O(L-1).

(7.24)

When we choose $L$ to be an integer close to $(\delta \sqrt{3})^{-1}$, (7.24) becomes

$$2M + \tfrac{2}{3} \delta^{-1} \sqrt{3} + O(1). \tag{7.25}$$

We substitute (7.20), (7.21), and (7.25) into (7.15) to obtain (7.8). Our inequality (7.8) represents an improvement of an inequality of Roth (1965), which has led to much recent work. The best upper bounds known for the sum (7.3) at the time of writing are 2-\/3

N

_\T)

3

(7.26) 1

8-1(1+270N383)

N (7.27) 1

and

2max(N,5-1)

Iv

1a12.

(7.28)

1

Of these, (7.26) is the result of this chapter, appropriate when N > 18-11/3, and (7.27) and (7.28) are results of Bombieri and Davenport (1969, 1968).

(7.27) is appropriate when 8-1 > 3(10)'N, and (7.28) for the intermediate range.

Note added in proof. H. Montgomery and R. C. Vaughan have now proved the conjecture (7.7). This supersedes (7.26) and (7.28) but not (7.27).

8

THE UPPER-BOUND SIEVE

'It's a comfortable sort of thing to have', said Christopher Robin, folding up the paper and putting it into his pocket.

II. 170

IN this chapter we obtain the upper bound (6.12) as an application of the large sieve. The notation is that of Chapter 6. $\mathcal{N}$ is a sequence of positive integers, and for each $q \le Q$ there are sets $H(q)$ and $K(q)$ of residue classes mod $q$. The $f(q)$ classes of $H(q)$ are precisely those that are not congruent to any member of $\mathcal{N}$ mod $p$ for any prime $p$ dividing $q$, so that, if $h$ is in a class of $H(q)$ and $n \in \mathcal{N}$,

$$(n - h, q) = 1. \tag{8.1}$$

The $g(q)$ classes of $K(q)$ are those that for each $p$ dividing $q$ are congruent mod $p$ to some member of $\mathcal{N}$; their union contains all members of the sequence $\mathcal{N}$. We work with the exponential sum of eqn (6.15):

$$S(\alpha) = \sum_{\substack{n \in \mathcal{N} \\ n \le N}} e(n\alpha), \tag{8.2}$$

91j

f(p) = 2(p-1),

3).

(8.13)

It is not difficult to show that

$$\sum_{q \le Q} \mu^2(q) f(q)/g(q) = (c + o(1))\, Q, \tag{8.14}$$

where $c$ is a constant, as $Q \to \infty$. Choosing $Q = N^{1/2}$, we have shown that the number of perfect squares not exceeding $N$ is $O(N^{1/2})$. This is very encouraging, since the sieve upper bound is 'sharp', differing only by a constant factor from the actual number of squares. It is surprising that we have not lost the correct order of magnitude in combining so many inequalities.

Our second, less trivial example concerns the primes between $Q$ and $N$; these form a sifted sequence with $f(p) = 1$,

g(p) = p- I = log Q

(8.18)

9m5Qin

for $Q > 1$, by (2.5). When we choose $Q$ a little smaller than $N^{1/2}$ we see that the primes between 1 and $N$ number

$$\le Q + (N + O(Q^2))/\log Q \le (2 + o(1))\, N/\log N. \tag{8.19}$$

The right-hand side of (8.19) is just double the true value (5.23). Moreover, (8.19) is also an upper bound for the number of primes in any interval of length $N$.
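The quantities in this example can be computed for a specific $N$. The sketch below is illustrative only and makes two simplifying assumptions: it uses $g(q) = \varphi(q)$ for square-free $q$ (which follows from $f(p) = 1$, $g(p) = p - 1$ and (6.14)), and it evaluates only the main term of (6.12), ignoring the error term. The values $N = 10^6$ and $Q = 10^3$ and all function names are choices of this example.

```python
from math import isqrt, log


def totients(limit):
    """Table of Euler's function phi[0..limit], by a sieve over primes."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:                      # untouched so far, hence prime
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
    return phi


def squarefree_flags(limit):
    flags = [True] * (limit + 1)
    for r in range(2, isqrt(limit) + 1):
        for k in range(r * r, limit + 1, r * r):
            flags[k] = False
    return flags


def count_primes(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(sieve)


N, Q = 10**6, 10**3
phi, sqfree = totients(Q), squarefree_flags(Q)
sieve_sum = sum(1 / phi[q] for q in range(1, Q + 1) if sqfree[q])
main_term = N / sieve_sum                    # main term of (6.12) only
print(round(sieve_sum, 4), round(log(Q), 4))
print(count_primes(N), round(main_term), round(2 * N / log(N)))
```

The first line of output compares the sieve sum with $\log Q$, the lower bound used in (8.18); the second sets the true count of primes beside the main term of the sieve bound and the simplified bound $2N/\log N$ of (8.19).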

We derived the inequality (8.12) from Cauchy's inequality; the difference between the two sides of the inequality (8.12) is a measure of how closely the values of $S(a/q)$ are proportional to those of

$$(g(q))^{-1} \sum_{k \in K(q)} e_q(ak), \tag{8.20}$$

and this in its turn measures how evenly $\mathcal{N}$ is distributed among the $g(q)$ residue classes mod $q$ into which it is allowed to fall. We could add an explicit term on the right of (8.8) to measure the unevenness (what statisticians might term a variance). The inequality (7.8) gives a strong upper bound for this variance as well as for the main term. When we use (7.8) to prove Bombieri's theorem, it is the variance bound that is important, not the bound for the main term.

9

FRANEL'S THEOREM

`It's just a thing you discover', said Christopher Robin carelessly, not being quite sure himself. I. 109

THE Farey sequence of order Q consists of the fractions a/q in their lowest terms (i.e. (a, q) = 1), with $q \le Q$ and $0 < a \le q$. We name them $f_r = a_r/q_r$ in increasing order, so that $f_1 = 1/Q$, $f_2 = 1/(Q-1)$, ..., $f_F = 1$. For notational convenience we may refer to $f_{F+r}$; this is to be interpreted as $1+f_r$. Here F is the number of terms in the Farey sequence, so that

$F = \sum_{q \le Q} \varphi(q) = 3\pi^{-2}Q^2 + O(Q \log Q)$ (9.1)

from eqn (2.11). The properties of the Farey sequence are discussed by Hardy and Wright (1960, Chapter 3). We shall sketch a proof that

$f_{r+1} - f_r = (q_r q_{r+1})^{-1}$. (9.2)

Let us represent rational numbers a/q (not necessarily in their lowest terms) by points (a, q) of two-dimensional Euclidean space. Since $f_r$ and $f_{r+1}$ are consecutive, the only integer points in the closed triangle with vertices O (0, 0), $P_r$ $(a_r, q_r)$, and $P_{r+1}$ $(a_{r+1}, q_{r+1})$ are its three vertices. By symmetry, the only integer points in the parallelogram $OP_r TP_{r+1}$ are its vertices, where T is $(a_r+a_{r+1}, q_r+q_{r+1})$. We can now cover the plane with the translations of this parallelogram in such a way that integer points occur only at the vertices of parallelograms. It follows that $OP_r TP_{r+1}$ has unit area, which is the assertion (9.2).
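The two facts about the Farey sequence stated above, the count (9.1) and the gap formula (9.2), are easy to test for small Q. The sketch below builds the sequence by brute force with exact rational arithmetic; the helper names `farey` and `gaps_match` are hypothetical, and the brute-force construction is chosen for clarity rather than speed.

```python
from fractions import Fraction
from math import gcd, log, pi


def farey(Q):
    """Fractions a/q in lowest terms with q <= Q and 0 < a <= q, in increasing order."""
    return sorted(
        Fraction(a, q)
        for q in range(1, Q + 1)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    )


def gaps_match(Q):
    """Check that consecutive terms satisfy f_{r+1} - f_r = 1 / (q_r * q_{r+1})."""
    seq = farey(Q)
    return all(
        b - a == Fraction(1, a.denominator * b.denominator)
        for a, b in zip(seq, seq[1:])
    )


Q = 100
F = len(farey(Q))
print(F, 3 * Q * Q / pi**2, gaps_match(Q))
```

For Q = 100 the count F differs from $3\pi^{-2}Q^2$ by far less than $Q \log Q$, in line with the error term in (9.1).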

Before stating Franel's theorem we introduce some notation. For

0 1, the integral in (11.7) converges for all complex s. Since $\Gamma(s)$ is a known function, and $(\Gamma(\tfrac12 s))^{-1}$ is integral (single-valued and regular over the whole s-plane), we can take (11.3) with (11.7) as the definition of $\zeta(s)$, knowing that $\sum m^{-s}$ agrees with our new definition when the series converges. We have now continued $\zeta(s)$ over the whole plane. Further, (11.7) is unchanged when we replace s by 1-s, so that

$\pi^{-s/2}\Gamma(\tfrac12 s)\zeta(s) = \pi^{s/2-1/2}\Gamma(\tfrac12-\tfrac12 s)\zeta(1-s)$, (11.9)

the promised functional equation. Since

$\Gamma(\tfrac12 s)/\Gamma(\tfrac12-\tfrac12 s) = 2^{1-s}\pi^{-1/2}\Gamma(s)\cos\tfrac12 s\pi$, (11.10)

an alternative form of (11.9) is

$\zeta(1-s) = 2^{1-s}\pi^{-s}\Gamma(s)\cos\tfrac12 s\pi\,\zeta(s)$.

(11.11)
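The alternative form of the functional equation can be sanity-checked at points where both sides are known. The sketch below evaluates the right-hand side, $2^{1-s}\pi^{-s}\Gamma(s)\cos\tfrac12 s\pi\,\zeta(s)$, at s = 2 and s = 4 and compares it with the classical values $\zeta(-1) = -1/12$ and $\zeta(-3) = 1/120$. The approximation of $\zeta(s)$ by a partial sum with a tail correction, and the names `zeta` and `zeta_reflected`, are choices made for this illustration only.

```python
import math


def zeta(s, terms=1000):
    """Approximate zeta(s) for real s > 1: partial sum plus an integral tail estimate."""
    partial = sum(m ** -s for m in range(1, terms + 1))
    # Tail of the series beyond `terms`: integral term minus half of the last term kept.
    return partial + terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s


def zeta_reflected(s):
    """Value assigned to zeta(1 - s) by the functional equation, for real s > 1."""
    return (
        2 ** (1 - s)
        * math.pi ** -s
        * math.gamma(s)
        * math.cos(0.5 * math.pi * s)
        * zeta(s)
    )


print(zeta_reflected(2), -1 / 12)
print(zeta_reflected(4), 1 / 120)
```

The cosine factor vanishes at odd integers s > 1, which reproduces the zeros of $\zeta$ at the negative even integers.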

We now list some properties of $\Gamma(s)$ (see for example Jeffreys and Jeffreys 1962, Chapter 15). The product

$\frac{1}{\Gamma(s+1)} = e^{\gamma s}\prod_{m=1}^{\infty}\left(1+\frac{s}{m}\right)e^{-s/m}$, (11.12)

where $\gamma$ is the constant of (2.5), converges for all s, and defines $\Gamma(s)$ as a function that is never zero and has simple poles at 0, -1, -2,.... Using this information in (11.7) we see that the pole of (11.7) at 1 comes from $\zeta(s)$, the pole at 0 from $\Gamma(\tfrac12 s)$, and that $\zeta(s)$ must have zeros at s = -2, -4,..., to cancel the other poles of $\Gamma(\tfrac12 s)$. From eqn (11.12),

$\Gamma(s+1) = s\Gamma(s)$, (11.13)

and

$\Gamma(1+s)\Gamma(1-s) = \pi s \operatorname{cosec} \pi s$, (11.14)

where we have used the product formula for $\sin \pi s$. We can verify eqn (11.10) by showing that the ratio of the two sides is a constant. Equation (11.1) is obtained by evaluation of the limit of

$\int_0^N t^{s-1}(1-t/N)^N\,dt$ (11.15)

THE FUNCTIONAL EQUATIONS


in two ways as N tends to infinity. We can also obtain from (11.12) the asymptotic formulae

$\log\Gamma(s) = (s-\tfrac12)\log s - s + \tfrac12\log 2\pi + O(1/|s|)$ (11.16)

and

$\Gamma'(s)/\Gamma(s) = \log s + O(1/|s|)$, (11.17)

which hold as $|s| \to \infty$ uniformly in any angle $-\pi+\delta < \arg s < \pi-\delta$ for any $\delta > 0$. Next we consider an L-function $L(s, \chi)$ with $\chi$ a proper character mod q. There are two cases. If $\chi(-1)$ is 1, we argue as above up to

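$\pi^{-s/2}q^{s/2}\Gamma(\tfrac12 s)L(s,\chi) = \int_0^\infty x^{s/2-1}\sum_{m=1}^{\infty}\chi(m)e^{-m^2\pi x/q}\,dx = \tfrac12\int_0^\infty x^{s/2-1}\varphi(x,\chi)\,dx$, (11.18)

where

$\varphi(x,\chi) = \sum_{m=-\infty}^{\infty}\chi(m)e^{-m^2\pi x/q}$. (11.19)

We approach $\varphi(x,\chi)$ through (10.10):

$\sum_{m=-\infty}^{\infty}e^{-(m+\delta)^2\pi/x} = x^{1/2}\sum_{m=-\infty}^{\infty}e^{-m^2\pi x}e(m\delta)$.

(11.20)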
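The relation (11.20) can be confirmed numerically, since both series converge extremely fast. The sketch below assumes the usual convention $e(\theta) = e^{2\pi i\theta}$ and truncates both sums at $|m| \le M$; the names `shifted_gaussian_sum` and `twisted_theta_sum` and the truncation parameter `M` are introduced here for illustration.

```python
import cmath
import math


def shifted_gaussian_sum(x, delta, M=50):
    """Left-hand side of (11.20): sum over m of exp(-(m + delta)^2 * pi / x)."""
    return sum(math.exp(-((m + delta) ** 2) * math.pi / x) for m in range(-M, M + 1))


def twisted_theta_sum(x, delta, M=50):
    """Right-hand side of (11.20): sqrt(x) * sum over m of exp(-m^2 * pi * x) * e(m * delta)."""
    total = sum(
        math.exp(-m * m * math.pi * x) * cmath.exp(2j * math.pi * m * delta)
        for m in range(-M, M + 1)
    )
    return math.sqrt(x) * total


for x, delta in [(0.7, 0.3), (1.0, 0.0), (2.5, 0.125), (0.2, 0.5)]:
    print(x, delta, abs(shifted_gaussian_sum(x, delta) - twisted_theta_sum(x, delta)))
```

The right-hand side is real because the terms for m and -m are complex conjugates, so the printed differences reflect rounding error only.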

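We put $\delta = a/q$ and use eqn (3.8):

$\chi(m)\tau(\bar\chi) = \sum_{a \bmod q}\bar\chi(a)e_q(am)$, (11.21)

so that

$\tau(\bar\chi)\varphi(x,\chi) = \sum_{a \bmod q}\bar\chi(a)\sum_{m=-\infty}^{\infty}e^{-m^2\pi x/q}e_q(am) = \sum_{a \bmod q}\bar\chi(a)(q/x)^{1/2}\sum_{m=-\infty}^{\infty}e^{-(m+a/q)^2\pi q/x}$, (11.22)

which we may rearrange as

$(q/x)^{1/2}\sum_{m=-\infty}^{\infty}\sum_{a \bmod q}\bar\chi(a)e^{-(mq+a)^2\pi/xq} = (q/x)^{1/2}\sum_{r=-\infty}^{\infty}\bar\chi(r)e^{-r^2\pi/xq} = (q/x)^{1/2}\varphi(1/x,\bar\chi)$. (11.23)

This will play the part of the modular relation (10.10). As before, we split up the range of integration in (11.18) and find that

$\int_0^1 x^{s/2-1}\varphi(x,\chi)\,dx = \int_1^\infty t^{-s/2-1}(\tau(\bar\chi))^{-1}(qt)^{1/2}\varphi(t,\bar\chi)\,dt$.

(11.24)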

THE PRIME-NUMBER THEOREM


The analogue of (11.7) is now seen to be

$\pi^{-s/2}q^{s/2}\Gamma(\tfrac12 s)L(s,\chi) = \tfrac12\int_1^\infty x^{s/2-1}\varphi(x,\chi)\,dx + \frac{q^{1/2}}{2\tau(\bar\chi)}\int_1^\infty x^{-s/2-1/2}\varphi(x,\bar\chi)\,dx$. (11.25)

As before, the right-hand side of (11.25) converges for all s, so that $L(s,\chi)$ has an analytic continuation over the whole plane, with no singularities. Moreover, $L(s,\chi)$ must have zeros at 0, -2, -4,... to cancel the poles of $\Gamma(\tfrac12 s)$. We proceed to deduce the functional equation. We have

$\overline{\tau(\chi)} = \sum_{m \bmod q}\bar\chi(m)\overline{e_q(m)} = \sum_{m \bmod q}\bar\chi(-m)e_q(-m) = \tau(\bar\chi)$, (11.26)

since it was assumed that $\chi(-1) = 1$. By eqn (3.14), since $\chi$ is proper mod q,

$q^{1/2}/\tau(\bar\chi) = \tau(\chi)/q^{1/2}$. (11.27)
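The Gauss-sum identities used here, (11.21), (11.26) and (11.27), can be tested on a concrete even character. The sketch below builds a character of order 6 modulo the prime 13 from the primitive root 2, with $\tau(\chi) = \sum_{m \bmod q}\chi(m)e_q(m)$ and $e_q(m) = e^{2\pi i m/q}$ assumed as the definitions. The modulus, the character, and the helper names `make_character`, `gauss_sum` and `twisted_sum` are all choices made for this illustration.

```python
import cmath


def make_character(p, g, j):
    """Character mod the prime p with chi(g^k) = exp(2*pi*i*j*k/(p-1)); g is a primitive root."""
    table = {0: 0j}
    power = 1
    for k in range(p - 1):
        table[power] = cmath.exp(2j * cmath.pi * j * k / (p - 1))
        power = power * g % p
    return lambda m: table[m % p]


def gauss_sum(chi, q):
    """tau(chi) = sum over m mod q of chi(m) * e_q(m)."""
    return sum(chi(m) * cmath.exp(2j * cmath.pi * m / q) for m in range(q))


def twisted_sum(chi, m, q):
    """Sum over a mod q of chi(a) * e_q(a * m), the right-hand side of (11.21)."""
    return sum(chi(a) * cmath.exp(2j * cmath.pi * a * m / q) for a in range(q))


p = 13
chi = make_character(p, 2, 2)  # even character: chi(-1) = 1
chi_bar = lambda m: chi(m).conjugate()
tau, tau_bar = gauss_sum(chi, p), gauss_sum(chi_bar, p)

print(abs(tau) ** 2)                              # compare with q = 13
print(abs(tau.conjugate() - tau_bar))             # (11.26)
print(abs(p**0.5 / tau_bar - tau / p**0.5))       # (11.27)
```

Since every non-trivial character modulo a prime is proper, the same check works for any even exponent j that is not a multiple of 12.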

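We now see that the right-hand side of (11.25) is $\tau(\chi)q^{-1/2}$ times the corresponding expression with s replaced by 1-s and $\chi$ by $\bar\chi$, which gives the functional equation

$\pi^{-s/2}q^{s/2}\Gamma(\tfrac12 s)L(s,\chi) = \tau(\chi)q^{-1/2}\pi^{-1/2+s/2}q^{1/2-s/2}\Gamma(\tfrac12-\tfrac12 s)L(1-s,\bar\chi)$. (11.28)

We now consider characters $\chi(m)$ proper mod q with $\chi(-1) = -1$. Since we want to consider a sum from $-\infty$ to $\infty$, we use $m\chi(m)$ in place of $\chi(m)$. Writing s+1 for s in (11.2), we have

$\pi^{-(s+1)/2}q^{(s+1)/2}\Gamma(\tfrac12(s+1))L(s,\chi) = \int_0^\infty\sum_{m=1}^{\infty}m\chi(m)e^{-m^2\pi x/q}x^{s/2-1/2}\,dx = \tfrac12\int_0^\infty\rho(x,\chi)x^{s/2-1/2}\,dx$, (11.29)

where

$\rho(x,\chi) = \sum_{m=-\infty}^{\infty}m\chi(m)e^{-m^2\pi x/q}$. (11.30)

We find a functional equation for $\rho(x,\chi)$ by differentiating (10.10) with respect to $\delta$. We get

$\tau(\bar\chi)\rho(x,\chi) = iq^{1/2}x^{-3/2}\rho(1/x,\bar\chi)$. (11.31)

Arguing as before, we find

$\pi^{-(s+1)/2}q^{(s+1)/2}\Gamma(\tfrac12(s+1))L(s,\chi) = \tfrac12\int_1^\infty\rho(x,\chi)x^{s/2-1/2}\,dx + \frac{iq^{1/2}}{2\tau(\bar\chi)}\int_1^\infty\rho(x,\bar\chi)x^{-s/2}\,dx$. (11.32)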


Again, the integrals on the right of eqn (11.32) converge for all s, so that $L(s,\chi)$ has an analytic continuation; it must have zeros at -1, -3, -5,... to cancel the poles of $\Gamma(\tfrac12(s+1))$, and satisfies the functional equation

$\pi^{-(2-s)/2}q^{(2-s)/2}\Gamma(\tfrac12(2-s))L(1-s,\bar\chi) = \frac{iq^{1/2}}{\tau(\chi)}\pi^{-(s+1)/2}q^{(s+1)/2}\Gamma(\tfrac12(s+1))L(s,\chi)$. (11.33)

To check this, we note that when $\chi(-1) = -1$

$\overline{\tau(\chi)} = -\tau(\bar\chi)$. (11.34)

There is also an analytic continuation of $L(s,\chi)$ when $\chi$ mod q is not proper. If $\chi_1$ proper mod f induces $\chi$ mod q, then for $\sigma > 1$

$L(s,\chi) = \sum_{m=1}^{\infty}\frac{\chi_1(m)}{m^s}\sum_{d \mid (m,q)}\mu(d) = \sum_{d \mid q,\ (d,f)=1}\frac{\mu(d)\chi_1(d)}{d^s}\sum_{r=1}^{\infty}\frac{\chi_1(r)}{r^s}$ (11.35)

when we write m = dr. The sum over r in (11.35) is $L(s,\chi_1)$, which has an analytic continuation since $\chi_1$ mod f is proper, and the sum over d is defined for all s. The corresponding functional equation for $L(s,\chi)$ contains the sum over d explicitly. We shall not need this case again.

A number of proofs of the functional equation can be found in Chapter 2 of Titchmarsh (1951).

12

HADAMARD'S PRODUCT FORMULA

Suddenly Christopher Robin began to tell Pooh about some of the things: People called Kings and Queens and something called Factors. II. 174

IN proving the prime-number theorem, Hadamard studied integral functions of finite order, that is, functions f(s) regular over the whole plane, with

$\log|f(s)| \ll |s|^A$ (12.1)

for some constant A, as $|s| \to \infty$. The order of f(s) is the lower bound of those A for which an inequality of the form (12.1) holds. Hadamard showed that an integral function of finite order can be written as an infinite product containing a factor $s-\rho$ corresponding to each zero $\rho$ of the function. This generalizes the theorem that a polynomial can be written as a product of linear factors. Weierstrass's definition (11.12) of the gamma function is an example. The product is especially simple when f(s) has order at most unity. The order of $1/\Gamma(s+1)$ is unity, from (11.16). We shall obtain the product formulae for $\xi(s)$ and $\xi(s,\chi)$ given by

$\xi(s) = s(1-s)\pi^{-s/2}\Gamma(\tfrac12 s)\zeta(s)$ (12.2)

and

$\xi(s,\chi) = (q/\pi)^{(s+a)/2}\Gamma(\tfrac12(s+a))L(s,\chi)$, (12.3)

where $\chi$ is proper mod q and a = 0 or 1 according to the relation $\chi(-1) = (-1)^a$. Note that (11.9) is just the assertion that $\xi(1-s)$ is equal to $\xi(s)$, and (11.28) or (11.33) implies that

$|\xi(1-s,\bar\chi)| = |\xi(s,\chi)|$.

(12.4)

First we show that $\xi(s,\chi)$ has order one. By eqn (5.3), if $\sigma > 0$,

$L(s,\chi) = s\int_1^\infty x^{-s-1}\sum_{m \le x}\chi(m)\,dx$. (12.5)

By Pólya's theorem (4.2), the sum over m is bounded, and thus

IL(s,X)I ., and we have

(12.7)

logIe(s,X)I 0. As in the derivation of (12.7) from (12.5), we deduce that the right-hand side of (12.9) is < IsI for a > 2. Now when a > j we have

(1-s)/(1-21-8) < Isl, logl(1-sMs)I 1. Thus all zeros $\rho = \beta+i\gamma$ of $\xi(s,\chi)$ have $0 \le \beta \le 1$, and the same is true for $\xi(s)$ by a similar argument. Riemann's hypothesis is that $\beta$ is always $\tfrac12$. Riemann stated the hypothesis for $\zeta(s)$, but it is difficult to conceive a proof of the hypothesis for $\xi(s)$ that would not generalize to $\xi(s,\chi)$. We shall prove later that $0 < \beta < 1$: this statement is equivalent to the prime-number theorem in the form (5.17) in the sense that each can be derived from the other. For later use we now prove a result more precise than (12.16). LEMMA. The number of zeros $\rho = \beta+i\gamma$ of $\xi(s,\chi)$ in the rectangle R,

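0 1, so that the series converges. For all real $\theta$,

$3+4\cos\theta+\cos 2\theta = 2(1+\cos\theta)^2 \ge 0$. (13.2)

Since

$-\mathrm{Re}\,\frac{\zeta'(\sigma+it)}{\zeta(\sigma+it)} = \sum_{m=1}^{\infty}\Lambda(m)m^{-\sigma}\cos(t\log m)$, (13.3)

we have

$-3\frac{\zeta'(\sigma)}{\zeta(\sigma)} - 4\,\mathrm{Re}\,\frac{\zeta'(\sigma+it)}{\zeta(\sigma+it)} - \mathrm{Re}\,\frac{\zeta'(\sigma+2it)}{\zeta(\sigma+2it)} \ge 0$. (13.4)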
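The trigonometric identity behind this argument, $3+4\cos\theta+\cos 2\theta = 2(1+\cos\theta)^2$, and its consequence for sums weighted by $\Lambda(m)m^{-\sigma}$ can be illustrated numerically. The sketch below checks the identity on a grid and then evaluates a truncated version of the combined Dirichlet series with $\theta = t\log m$; the truncation point `M` and the function names are assumptions made for illustration, and the truncated sum is only a stand-in for the full series.

```python
import math


def trig_form(theta):
    """3 + 4 cos(theta) + cos(2 theta)."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)


def square_form(theta):
    """2 (1 + cos(theta))^2, which is visibly non-negative."""
    return 2 * (1 + math.cos(theta)) ** 2


def von_mangoldt(m):
    """Lambda(m): log p if m is a power of the prime p, otherwise 0."""
    for p in range(2, m + 1):
        if m % p == 0:
            while m % p == 0:
                m //= p
            return math.log(p) if m == 1 else 0.0
    return 0.0


def truncated_combination(sigma, t, M=500):
    """Sum over m <= M of Lambda(m) m^(-sigma) (3 + 4 cos(t log m) + cos(2 t log m))."""
    return sum(
        von_mangoldt(m) * m ** -sigma * trig_form(t * math.log(m))
        for m in range(2, M + 1)
    )


print(truncated_combination(1.1, 14.134725))
```

Every term of the truncated sum is non-negative, so the total is non-negative whatever values of $\sigma > 1$ and t are supplied.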



We now make $\sigma+it$ tend to a zero $\beta+i\gamma$. Since $\zeta(s)$ has a pole at 1, there is a circle centre 1 and some radius r, within which $\zeta(s)$ is non-zero. (Calculation shows that r = 3 has this property.) If we suppose that

$\beta > 1-\tfrac12 r$, (13.5)

then $|\gamma| > \tfrac12 r$, and so is bounded away from zero. In eqn (12.40),

'(s) = (s)

1

-B+s-1-

log21r-21 P'(18+1)

P(j8+1) -

(+)p 1

1

(13.6)

we shall assume 1 < a < 2, ItI > s . > 0. Here the sum is over zeros p of e(s), not over all zeros of C(s), and, since s-p and p have positive real part, we have Re (s 1

p+pl

>0

(13.7)

whenever $\sigma > 1$ and $\rho = \beta+i\gamma$ has $0 < \beta < 1$. By (11.17) the term in $\Gamma'(\tfrac12 s+1)$ is $\ll \log(|t|+e)$. We now write down three inequalities. Since there is a pole at s = 1 of residue 1, we have

$-\zeta'(\sigma)/\zeta(\sigma) = (\sigma-1)^{-1}+O(1)$. (13.8)

At $s = \sigma+i\gamma$ we omit all terms in (13.6) except those from the particular zero with which we are concerned; by (12.7) this gives us the inequality

$-\mathrm{Re}\,\zeta'(\sigma+i\gamma)/\zeta(\sigma+i\gamma) < -(\sigma-\beta)^{-1}+O(\log(|\gamma|+e))$. (13.9)

Similarly,

$-\mathrm{Re}\,\zeta'(\sigma+2i\gamma)/\zeta(\sigma+2i\gamma) < O(\log(|\gamma|+e))$. (13.10)

When we substitute (13.8), (13.9), and (13.10) into (13.4), we have

$4(\sigma-\beta)^{-1}-3(\sigma-1)^{-1} < O(\log(|\gamma|+e))$, (13.11)

valid as $\sigma \to 1$ from the right. By giving $\sigma$ a suitable value, we see that

$-3\frac{L'(\sigma,\chi_0)}{L(\sigma,\chi_0)} - 4\,\mathrm{Re}\,\frac{L'(\sigma+it,\chi)}{L(\sigma+it,\chi)} - \mathrm{Re}\,\frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} \ge 0$, (14.7)

valid for $\sigma > 1$, as the analogue of (13.4). Here, $\chi_0$ is the trivial character mod q, whose value $\chi_0(m)$ is 1 when (m, q) = 1 and 0 otherwise, and $\chi^2$ is the character whose value at m is $\{\chi(m)\}^2$. Although we have supposed $\chi$ to be proper mod q, $\chi^2$ might be trivial and certainly need not be proper mod q. The trivial character $\chi_0$ is not proper mod q. However, if $\chi_1$ proper mod f induces $\chi^2$ mod q, then $L'(s,\chi^2)/L(s,\chi^2)$ and $L'(s,\chi_1)/L(s,\chi_1)$ differ only by terms involving powers of those primes that divide q but not f. For $\sigma > 1$, these terms give at most

$\left|\sum_{m = p^k,\ p \mid q}\frac{\chi_1(m)\Lambda(m)}{m^s}\right| \le \sum_{p \mid q}\sum_{k \ge 1}\frac{\log p}{p^k} = \sum_{p \mid q}\frac{\log p}{p-1} \le \log q$. (14.8)

The inequality (14.8) applies also for f = 1, $\chi^2 = \chi_0$. We conclude that (14.6) is valid for any non-trivial $\chi$ mod q, possibly with a different O-constant, and that

$-\mathrm{Re}\,\frac{L'(s,\chi_0)}{L(s,\chi_0)} < \frac{\sigma-1}{|s-1|^2} - \mathrm{Re}\sum_{\rho}\frac{1}{s-\rho} + O(l(t))$ (14.9)

for the trivial character $\chi_0$ mod q. If $\chi^2$ is non-trivial, substitution of (14.6) and (14.9) into (14.7) gives

$4(\sigma-\beta)^{-1} < 3(\sigma-1)^{-1}+O(l(t))$, (14.10)

implying that

$\beta < 1-c_1/l(\gamma)$ (14.11)

for some absolute constant $c_1$ when we choose $\sigma$ appropriately. If $\chi^2$ is trivial, then

$4(\sigma-\beta)^{-1} < 3(\sigma-1)^{-1}+(\sigma-1)/\{(\sigma-1)^2+4\gamma^2\}+O\{l(\gamma)\}$, (14.12)

which is consistent with $\beta = 1$ when $\sigma \to 1$. However, if

$|\gamma| > c_2/l(\gamma)$ (14.13)

for some positive $c_2$, then by choice of $\sigma$ in (14.12) we can show that

$\beta < 1-c_3/l(\gamma)$, (14.14)

with a smaller absolute constant $c_3$. We have now shown that either (14.14) is true or

$|\gamma| < \delta/\log q$, (14.15)

where $\delta$ is an absolute constant. The absolute constant $c_3$ in (14.14) depends on the choice of $\delta$ in (14.15). When (14.15) is satisfied with $\gamma \ne 0$ we can still deduce an upper bound for $\beta$, but it is very close to unity, and tends to 1 as $\gamma \to 0$. The third and greatest difficulty is to deal with zeros close to unity when $\chi^2$ is trivial. First we show that there is at most one. We have

-Re L'(a, X) - Re L(a, X)

-

A(m)X(m) ma

00

CO

A(m)m'v (14.16)

If $\rho_1 = \beta_1+i\gamma_1$ and $\rho_2 = \beta_2+i\gamma_2$ are two zeros satisfying (14.15), then

-Re L'(a, X) L(a, X)

-Re

1

(7-Pi

-Re a-P2 1 +O(logq) a-fl2

CF-R

(a-f 1 2+y2

(a-fl2)+Y2 +

0(lo g q )

(a-P )2+32(log q)- 2+ O(log q),

(14.17)

if $\beta_2 > \beta_1 > 1-\delta/(\log q)$. If $\delta$ is small enough, this implies that

$\beta_1 < 1-c_4/l(\gamma_1)$. (14.18)

Clearly we can choose $c_4 < c_3$, and so (14.18) is true for every zero $\beta_1+i\gamma_1$ of $L(s,\chi)$ except (possibly) $\rho_2$, and the possible exception $\rho_2$ occurs only if $\chi(m)$ is always real, so that $\chi^2$ is trivial. Since $\bar\rho_2$ is also a zero when $L(s,\chi)$ has real coefficients, if $\rho_2$ fails to satisfy (14.18) we conclude that $\rho_2$ is real. We devote the next chapter to the case of an exceptional zero $\beta$ on the real axis.

15

THE EXCEPTIONAL ZERO

Piglet said that the best place would be somewhere where a Heffalump was, just before he fell into it, only about a foot farther on. I. 57

IN this chapter we consider real characters, that is, characters for which $\chi(m)$ is always real and thus $\chi^2$ is trivial. As far as we know, the corresponding L-functions may have real zeros $\beta$ with $\tfrac12 < \beta < 1$. Just as before we saw that $L(s,\chi)$ cannot have two zeros both close to 1, we shall now see that two functions $L(s,\chi)$ corresponding to different proper characters cannot both have zeros close to 1. Suppose $\chi_1$ is proper mod $q_1$, $\chi_2$ is proper mod $q_2$, and the corresponding L-functions vanish at $\beta_1$ and $\beta_2$. In place of (13.2) we use

$(1+\chi_1(m))(1+\chi_2(m)) \ge 0$, (15.1)

which implies that

$-\frac{\zeta'(\sigma)}{\zeta(\sigma)} - \frac{L'(\sigma,\chi_1)}{L(\sigma,\chi_1)} - \frac{L'(\sigma,\chi_2)}{L(\sigma,\chi_2)} - \frac{L'(\sigma,\chi_1\chi_2)}{L(\sigma,\chi_1\chi_2)} \ge 0$, (15.2)

where $\chi_1\chi_2$ denotes the character mod $q_1 q_2$ whose value at m is $\chi_1(m)\chi_2(m)$. When $\chi_1$ and $\chi_2$ are different, the character $\chi_1\chi_2$ is non-trivial, and (14.6) gives

$-L'(\sigma,\chi_1\chi_2)/L(\sigma,\chi_1\chi_2) < O(\log q_1 q_2)$, (15.3)

and for $L(\sigma,\chi_1)$

$-L'(\sigma,\chi_1)/L(\sigma,\chi_1) < -(\sigma-\beta_1)^{-1}+O(\log q_1)$, (15.4)

and similarly for $\chi_2$. In place of (14.17) we have

$(\sigma-\beta_1)^{-1}+(\sigma-\beta_2)^{-1} < (\sigma-1)^{-1}+O(\log q_1 q_2)$, (15.5)

and if $\beta_1 > \beta_2$ then $\beta_2$ at any rate satisfies the relation

$\beta_2 < 1-c_1/(\log q_1 q_2)$, (15.6)

where $c_1$ is an absolute constant. We deduce a uniform zero-free region.


By (15.6) and (14.18) there is a constant $c_2$ with the following property. Let Q > 1. Then no L-function formed with a character $\chi$ mod q with q < Q has a zero $\rho = \beta+i\gamma$ with

$\beta > 1-c_2/\log\{Q(|\gamma|+e)\}$ (15.7)

except possibly at a point $\beta_1$ on the real axis, where $L(s,\chi)$ has at worst a simple zero. All $\chi$ mod q with q < Q for which $L(\beta_1,\chi) = 0$ are induced by the same real character. To prove the prime-number theorem for an arithmetic progression with common difference q, we need to know that neither $\zeta(s)$ nor any L-function formed with a character $\chi$ mod q has a zero $\rho = \beta+i\gamma$ with $\beta = 1$. The proof is simpler if we have $\beta$ explicitly bounded away from unity. We have to deal only with the case $\chi$ real, $\rho$ real. One method is to interpret $L(1,\chi)$ as the density of ideals in a quadratic number field. This gives a very weak bound. We shall prove Siegel's theorem.

THEOREM. For each $\epsilon > 0$ there is a constant $c(\epsilon)$ such that if $L(\beta_1,\chi) = 0$, where $\chi$ is a character mod q, then

$\beta_1 < 1-c(\epsilon)q^{-\epsilon}$. (15.8)

The constant $c(\epsilon)$ in Siegel's theorem is ineffective; that is, the proof does not enable us to calculate it. All previous constants in upper bounds, such as $c_2$ in (15.7), have been ones we could calculate, given a table of values of $\zeta(s)$ for |s| < 3 and standard inequalities such as Stirling's formula. Following Estermann's account (1948) of Siegel's theorem, we consider the function

$F(s) = \zeta(s)L(s,\chi_1)L(s,\chi_2)L(s,\chi_1\chi_2)$, (15.9)

where $\chi_1$ and $\chi_2$ are real characters proper mod $q_1$ and mod $q_2$ respectively. By (15.1), the Dirichlet series

$\log F(s) = \sum_{m=2}^{\infty}\{1+\chi_1(m)+\chi_2(m)+\chi_1\chi_2(m)\}\frac{\Lambda(m)}{\log m}m^{-s}$ (15.10)

has non-negative coefficients; it converges for $\sigma > 1$. For $\sigma > 1$ we can take the exponential of (15.10):

$F(s) = \sum_{m=1}^{\infty}a(m)m^{-s}$, (15.11)

with non-negative coefficients. F(s) has (at worst) a simple pole at s = 1 of residue

$\lambda = L(1,\chi_1)L(1,\chi_2)L(1,\chi_1\chi_2)$, (15.12)

and no pole if $\lambda = 0$. Moreover, F(s) has a power-series expansion,

$F(s) = \sum_{r=0}^{\infty}b(r)(2-s)^r$, (15.13)

where

$b(r) = (-1)^r F^{(r)}(2)/r! = \frac{(-1)^r}{r!}\sum_{m=1}^{\infty}\frac{(-\log m)^r a(m)}{m^2} \ge 0$. (15.14)

In particular, b(0) = F(2), which is at least unity. The function

$F(s)-\lambda/(s-1) = \sum_{r=0}^{\infty}\{b(r)-\lambda\}(2-s)^r$ (15.15)

has no singularities, and so its power series converges everywhere. If $\lambda = 0$, the series on the right of eqn (15.15) is positive on the negative real axis and represents F(s). Since F(s) is zero at 0, -2, -4,..., we conclude that $\lambda \ne 0$. We now have $\beta_1 < 1$. To prove the more precise result (15.8), we use Cauchy's formula, integrating round a circle, centre 2 and radius 2, to obtain the coefficients b(r). On the circle, $\zeta(s)$ is bounded, and for the L-functions we use (12.6), (15.16)

IL(s,X)I 1- (log q) -1, the first integral is < 18 1 log q and the second is is I, and so

IL(s, X) I < IsI log q.

(15.27)

When we integrate round a circle radius J(log q) -1 to find L'(8, X), we have for a > 1-(lobo q) 1 (15.28)

the bound

I L'(a, X) I < log2q.

(15.29)


Hence, if L(s, Xi) has a zero Pi in the range (15.28), A

= L(1, Xi)L(1, X2)L(1, X1 X2) G loggi L(1, Xi)

G logg1(1-R1)L'(a, Xi) (15.30)

for some u in fi < a < 1, and so by (15.23) and (15.29) we have (15.31) 1 G (1-fl1)gi (1-p2)log3gi < (1-fli)gi, where the constants implied depend on X2 and so on e. If Pi does not

satisfy (15.28), then (15.31) certainly holds, and Siegel's theorem (15.8) follows from (15.31).

16

THE PRIME-NUMBER THEOREM

The clock slithered gently over the mantelpiece, collecting vases on the way. II. 135

WE can now prove the prime-number theorem in the form (5.17). Combining (5.4) and (5.5), we have

$\frac{1}{2\pi i}\int_{\alpha-iT}^{\alpha+iT}\frac{u^s}{s}\,ds = O(u^\alpha(T|\log u|)^{-1})$ if u < 1, $\tfrac12+O(\alpha/T)$ if u = 1, $1+O(u^\alpha(T\log u)^{-1})$ if u > 1. (16.1)

On the line $\sigma = \alpha$, where $\alpha > 1$, the series

$-\zeta'(s)/\zeta(s) = \sum_{m=1}^{\infty}\Lambda(m)m^{-s}$ (16.2)

converges absolutely. We now suppose that x is an integer plus one-half. Then

A(m) x8

lie s J a-iT i=1

ds -

!1(m) m <x

0 xa

T

00

A(m) 1

1

l log x/qn l zna

(16.3)

To estimate the series in the error term of (16.3), we first note that

I A(m) ` y,

(16.4)

m (logqT)-1,

(17.3)

IIPI - RI > (log q)-1

(17.4)

for each zero p = P+iy of e(s, x). By (12.19), we can suppose that 6 < R < 1. The contour C consists of C,: the line segment [a-iT, - s -iT],

C2: the line segment [-j-iT, -2+iT], C3: the line segment [--+iT, a+iT], C4: the circle centre 0, radius R described negatively. The circle C4 avoids the pole of xG/s and a possible L-function zero at

s = 0. As in (16.10), the inequality I L'(s, X)/L(8, X) I G log2qT

(17.5)

holds on the contour C, the constant being absolute. Thus 1

L'(s, x) x8

Wi

L(s, x) s

c

ds

`Xlog2gT+Xilog2gT. T log X

(17.6)

We now treat the sum in (17.2). By (14.18), non-exceptional zeros satisfy the relation

$\beta < 1-c_1/\log qT$ (17.7)

with an absolute constant $c_1$. If q satisfies an inequality

$\log q < c_2(\log x)^{1/2}$, (17.8)


ARITHMETIC PROGRESSION


with $c_2 < 1$, we can choose T so that (17.3) holds, and

$\log qT = (\log x)^{1/2}+O(1)$. (17.9)

As in the last chapter, we deduce

$\psi(x,\chi) = -x^{\beta_1}/\beta_1+O\{x\exp(-c_3(\log x)^{1/2})\}$ if there is an exceptional zero $\beta_1$,
$\psi(x,\chi) = O\{x\exp(-c_3(\log x)^{1/2})\}$ if not. (17.10)

The value of $c_3$ depends on that of $c_2$ in (17.8), but can be effectively calculated.

FIG. 3. The contour C

We can absorb the term $x^{\beta_1}/\beta_1$ only if q satisfies

$q < \log^N x$ (17.11)

for some N > 0. By Siegel's theorem with $\epsilon = \tfrac12 N^{-1}$,

$\beta_1 < 1-c_4 q^{-\epsilon} < 1-c_4(\log x)^{-1/2}$, (17.12)

and

$x^{\beta_1} < x\exp\{-c_4(\log x)^{1/2}\}$. (17.13)

We have now shown that the inequality

$|\psi(x,\chi)| \ll x\exp\{-c_5(\log x)^{1/2}\}$, (17.14)

where $c_5$ is an absolute constant, holds if (17.11) does, or if (17.8) holds and there is no exceptional zero.

Our characters can at last abandon propriety. If X mod q is induced by Xi proper mod f then 0(x, XI)-0(x, X) _

loge

Xi(p'),

(17.15)

an expression whose modulus does not exceed r1a

loge 1+1ogx 10, so that the integral along Re w = - -1 - a converges absolutely. We change the variable of integration to u

THE NECESSARY TOOLS


(= A+iT) where s+w = 1-u, and use the functional equation (20.1) to write d(rn)X(m)

X)-

L2(s,

c (7111)

?ns

_

j+Im 1

x1-8-'LK(s +U-1) q 2"-1 G(u)L2(u,

s+u-1

J

)

or

X) du.

(20.13)

Because of the factor K(s+u- 1), the integrand in (20.13) is largest when a is close to 1-8. Now when it is near 1-8 the terms (20.4) approximate the (2A-1)th power of the constant Jq Is I 17r. In modulus at any rate the main contribution to (20.13) is an integral of the form (20.10)

with s and X replaced by 1-s and X, and x replaced by y, where the integer y is given by y = [g2t2/(4,.2x)]. (20.14) More precisely, provided IIm(s+u-1)1 < It 11, we have 1

g

2tt-1

iITII

Cr(26) Cl

yl-u I

It11)2)A-1

(g2(ItI -I41T2xy \

< (1+0(ItI-1)+O(Iyl-1))2jtj¢ < 1,

provided that If we choose y by

(20.15)

y' ItIl.

(20.16)

4zr2xy = q2t2

(20.17)

exactly, so that y is not necessarily an integer, the term 0(y-1) is not present and (20.15) is true in any case; but in this application we shall have y > gItl and (20.16) will hold easily. The series for L2(3+ir, X) converges absolutely uniformly, so that we

may integrate it term by term. By analogy with the term-by-term integration of (20.10), we break the series up into Pi+p3 or into PP2+94

(according to whether we take K1(s+u-1) or K1(1-s-u)), where Y11 .., Y4 are given by

Pl(u) _

d(m)X(m)m-u

wKev

rP2(10 _ mIey 'P4(u) _

(20.18) d(m)X(m)m-4r

d(m)X(m)m-

m>a-,

We write the integral in (20.13) as the sum of four integrals numbered correspondingly h,..., 14. First we consider the integral I3 involving

AN APPROXIMATE FUNCTIONAL EQUATION (I)


p3(u)IK1(s+u-1). When (20.15) is valid, the integrand decreases in modulus as we move the contour to the right. We employ the contour C consisting (in the case ItI > 10) of

Cl: the line segment (2-ioo, 2-it-iltli], C2: the semicircle centre 2-it, radius It Ii to the right of the line A = 2,

C3: the line segment [2-it+iltli, 2+ioo),

FIG. 5. The contour C

which is the line A = 2 with an indentation around the poles of K1(s+u-1). If Its < 10 we take C to be the line A = 2; the estimates below will then hold with ItI-' replaced by 1 and with different implied constants. 2 we have For A

I c m>oy

_ rn> oy rn-{D(m)-D(m-1)}

(20.19) C (ey)1-Alogy, by partial summation and the estimate (2.13) for the sum function D(m) of the divisor function. By (20.19) and (20.4), 1 Y7-Ti

f

J

xl-3-UKj(s+11-1) M 2u-1

8+u-1

7r

G(u)cp3(u) du

Cl

xl-a-A

f

q j r 12A-1 yl-A log y d r

(1--It+.rI)6( 2ir)

ClJ

1-1

G x- t

2

(4q t2t2)

qx-Q It 1-1log Y. 853618 X

G

4I t I logy

(20.20)


The same estimate holds for the integral along C3. On the semicircle C2, (20.4) and (20.15) are valid, and we have the upper bound x-0'(1+ ItI')-5gltI log y(iIt 11)

qx-0'lt1-1logy (20.21) Hence I. and similarly 14 are bounded by the expression in (20.21).

21

AN APPROXIMATE FUNCTIONAL EQUATION (II)

`Why, what's happened to your tail?' he said in surprise. `What has happened to it?' said Eeyore. `It isn't there!' I. 43

WE have now shown that $I_3$ and $I_4$ can be regarded as error terms. We treat $\varphi_1$ and $\varphi_2$ by moving the contour the other way. The first case considered is |t| > 10. If t > 10 we use the contour D given by

Dl: the line segment (-I-ioo, -+-it-iltIA], D2: the semicircle, centre -J-it, radius Itll, to the left of A = -,

D3: the line segment [-J-it+iltl1, -i-i], D4: the line segment [-i-i,1-i], D6: the line segment [1-i, 1+i],

D6: the line segment [,'+i, -+i],

-2-ioo).

D7: the line segment

The contour D for t < -10 is the reflection in the real axis of that described above. The indentation about the origin avoids a possible double pole of G(u) at u = 0. The analogue of (20.19) is that, uniformly in A < 1, (ey)1-Alogy. (21.1) Ipi(u) I
" (urn+Q2T) ?n=1

In

In(m2) 12

AT

l og(',n+e)

', 2a

.

(22.2)

Although (22.2) allows us to vary a, we take v = z throughout, since (22.3)

1

the gamma functions in G being evaluated at conjugate for a complex points. The proof is no simpler, but the form of the upper bounds is less complicated. Cauchy's inequality applied to eqn (21.8) gives I L(s, X) I4 G 61 1 d (nz)

X(m)m1-8c(m/x) 12+ 4

+61 1 d(m)X(m)ms-1ct (m/y)I2+6r1 II,.I2,

(22.4)


and we have a similar result for ic(s), with 7 instead of 6 and an extra term O(It1-10). The integers x and y in (22.4) are connected by (20.14): (22.5)

y = [g2t2/(47r2x)].

When we fix x and average over X and t, y is varying. For a good upper

bound, x and y must be of the same order of magnitude. We restrict

ourselves for the moment to P < q < 2P and U < I t l < 2U, and average the right-hand side of (22.4) over all integer values of x between For each fixed X and t the corresponding values IPU/7T and of y given by (22.5) are distinct and lie between JPU/7r-1 and 16PU/7r. The average of the square of the terms involving y taken over all integers y in this range is at least 4 times the corresponding average over the values of y that actually occur in the sum. This device allows us to sum a Dirichlet series of fixed length ey over varying X and t, provided that x does not occur in the coefficients (and vice versa). For each value of x in the above range, (22.2) gives 2PU/ar.

V*

1

p 1. (22.7)

It is only slightly harder to average over the reflected series. For e-1y < m < ey we must break the coefficients c'(m/y) up into five terms corresponding to the five poles of 1K1(w)/w. We take out a factor 01-2TT2x)-r7ri eq2

G(1-s+rai)

(22.8)

from the term corresponding to the pole of K1(w)/w at rori, where

r = 0,

2. By (22.3), the modulus of the expression (22.8) is fixed 2, independently of x, q, t, and the parameter a in the definition of G(u),

which is 1 if X(-1) _ -1 and 0 if X(-1) = +1. After this factor has been removed, a sum like that in (22.6) remains, with no gamma-function

factors and no concealed dependence on x. For each r, the right-hand side of (22.6) is an upper estimate for this sum, possibly with a different 0-constant.

FOURTH POWERS OF L-FUNCTIONS


The contour integrals I1,..., 14 present complications. First we fix t and average only over characters. We have for any function F(u) for which the integrals exist

f F(u) du 2
1,

(22.24)

and for q = 1 I t(r, X) I > 1. We could replace (22.24) by a weaker condition (22.25) t(r+1, X)-t(r, X) > (log QT)-1, and still have the upper bound (22.22), so that the root mean fourth power of L(s, X) is < log QT. Without the sieving over T this method

proves that the root mean fourth power at a fixed s is 2, T > 1. Let X be a large integer and

$M(s,\chi) = \sum_{m \le X}\mu(m)\chi(m)m^{-s}$. (23.2)

If Riemann's hypothesis were true, then for $\sigma > \tfrac12$ $M(s,\chi)$ would tend to $\{L(s,\chi)\}^{-1}$ as X tended to infinity. We write

$L(s,\chi)M(s,\chi) = 1+f(s,\chi)$, (23.3)

where $f(s,\chi)$ has the Dirichlet series

$f(s,\chi) = \sum_{m > X}a(m)\chi(m)m^{-s}$, (23.4)

$a(m) = \sum_{d \mid m,\ d \le X}\mu(d)$. (23.5)
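The coefficients a(m) of (23.5) are truncated divisor sums of the Möbius function, and the reason the product of $L(s,\chi)$ with the truncated inverse $M(s,\chi)$ has the shape 1 + f is that a(1) = 1 and a(m) = 0 for 1 < m ≤ X. The sketch below checks this directly; the function names `mobius` and `a` and the sample value X = 10 are choices made here for illustration.

```python
def mobius(n):
    """Moebius function mu(n) by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n is divisible by a square
            result = -result
        p += 1
    return -result if n > 1 else result


def a(m, X):
    """Coefficient a(m) of (23.5): sum of mu(d) over divisors d of m with d <= X."""
    return sum(mobius(d) for d in range(1, min(m, X) + 1) if m % d == 0)


X = 10
print([a(m, X) for m in range(1, 31)])
```

For m > X the truncated sum need not vanish, and these surviving coefficients are the ones that make up the series f.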

The series (23.4) converges without any hypothesis for $\sigma > 1$. Until A. I. Vinogradov's work (1965), the number of zeros was customarily estimated by integration of the logarithm of $1+f(s,\chi)$ round the boundary of the rectangle (23.1). As we shall see below, $f(s,\chi)$ is less than unity in root mean square, whether averaged over t or over $\chi$ or

INGHAM'S THEOREM


over both, provided $\sigma > 1$. At a zero, $f(\rho,\chi)$ is -1. A. I. Vinogradov revived an alternative approach: to count directly the number of times $f(s,\chi)$ has modulus at least unity. While no simpler in detail, this method is the more flexible, and we follow it here.

The integral transform

$\frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty}\Gamma(w)\,Y^w m^{-w}\,dw = e^{-m/Y}$ (23.6)

is easily verified by moving the line of integration to Re w = $\tfrac12-R$, where R is a large positive integer, and letting R tend to infinity: the residues at poles of $\Gamma(w)$ give the terms of the exponential series. Hence

$e^{-1/Y}+\sum_{m > X}a(m)\chi(m)m^{-\rho}e^{-m/Y} = \frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty}L(\rho+w,\chi)M(\rho+w,\chi)Y^w\Gamma(w)\,dw$. (23.7)

Here $\rho$ is a zero of $L(s,\chi)$ so that the zero of $L(\rho+w,\chi)$ cancels the pole of $\Gamma(w)$ at w = 0. We take the integral back to the line Re w = $\tfrac12-\beta$. If $\chi$ is not trivial, the right-hand side is equal to

$\frac{1}{2\pi i}\int_{1/2-\beta-i\infty}^{1/2-\beta+i\infty}L(\rho+w,\chi)M(\rho+w,\chi)Y^w\Gamma(w)\,dw$, (23.8)

and if $\chi$ mod q is trivial there is an extra term

$\varphi(q)q^{-1}M(1,\chi)Y^{1-\rho}\Gamma(1-\rho)$ (23.9)

from the pole at $\rho+w = 1$. We write

$l = \log QT$ (23.10)

and suppose that

$\log X < 10l$, $\log Y < 10l$. (23.11)

The terms on the left of (23.7) with m > 100lY contribute less than .,

if l is sufficiently large. We recall Stirling's formula in the form (20.3), valid if A < ITIi: tr(A+iT)I = e-i'r1T1IA+iTIA-i{(27r)I+O(ITI-')}.

(23.12)

The term (23.9) is now seen to be at most io when lyi > 1001, again

provided l is sufficiently large. Apart from the zeros of c(s) with lyj < 1001, all zeros fall into one or both of the following classes.

Class (i). Zeros p with I

I

X u/Nm-a-it 1n=1 2+100

=

1

f P(w)(JN)t0 (w+a+it) dw

J

2-i00

by the integral transform (23.6).

(27.13)

ZEROS AND PRIME NUMBERS


Before estimating the integral in eqn (27.13), we move the line of integration to the contour C consisting of C1: the line segment (-ioo, -i(logN)-1], C2: the semicircle, centre the origin, radius (log N)-1, to the right of the imaginary axis, Cg: the line segment [i(log N) -1, ioo).

A residue

(27.14)

1'(1-o-it)(N)i-°-tt

accrues from the pole of (w+a+it) at w+a+it = 1. We recall Stirling's formula in the form (20.3): Ir(A+1T)I =

(27.15)

valid when A < IT j i. Hence if

Itl > logN

(27.16)

the residue (27.14) is bounded.

I. From the approximate fimcNext we need a bound for 10 tionalequation of Chapter 21 we have for 0 < A < 2, IT 1

IS(1'/1-1T)I2 < Y d(ra)ma-1+ In

N

Ja(m)I2TOI log2NT.

(27.22)

Clearly when T > To we must divide up the range for T into intervals of length at most To. Repeated application of the inequality (27.7) gives us N R V

m=1

(27.25)

for s = s1,..., 5R, where s,. = a,.+itr with 0 < yr < J and

T ? It,,-tcl > logN

(27.26)

R < GNV-2+G8NTV-61og4NT,

(27.27)

for q 0 r, then the implied constants being absolute.

The form of the second term in (27.27) arises from our choice of functions f(r); it is larger than the first term unless (27.22) holds with T in place of To. A plausible conjecture is that whenever

R < GNP-2

(27.28)

V2 > GTS

(27.29)

for any fixed 8 > 0. The use of the zeta function to prove (27.27) is a curious feature of Halasz's method. If Lindelbf's hypothesis is true, we can take the line of integration in (27.13) to Re w = J, with the effect of replacing To in (27.22) by N=To for any e > 0. This is an improvement for T > N (and if T < N then (27.28) follows trivially from (27.2)), but it is still a long way from weakening the condition on To to (27.29).

28

GAPS BETWEEN PRIME NUMBERS

`I shall do it', said Pooh, after waiting a little longer, `by means of a trap. And it must be a Cunning Trap, so you will have to help me, Piglet.' I. 56

FIRST we prove a theorem on the zeros of $\zeta(s)$, replacing the large sieve (19.26) by Halász's method in the work of Chapter 23. We shall use the notation of that chapter with Q = 1, so that only zeros of the zeta function are considered. The definition of class (i) and class (ii) zeros remains as before. We pick representatives of each class of zeros in such a way that their imaginary parts differ by at least 2l, where

$l = \log T$, (28.1)

but the representatives are in number $\gg l^{-2}$ times the zeros in that class.

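We suppose $\sigma > \tfrac34$, since the result (28.19) which we obtain below improves on Ingham's theorem only for $\sigma > \tfrac34$. The parameters $X$ and $Y$ will satisfy

$X < T^2, \quad 100lY < T^2.$

(28.2)

In the definition (23.14) of a class (ii) zero $\rho = \beta + i\gamma$,

$\left| \int_{\frac12-\beta-i\infty}^{\frac12-\beta+i\infty} \zeta(\rho+w)\, M(\rho+w)\, Y^{w}\, \Gamma(w)\, dw \right| > \tfrac13\pi,$

(28.3)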

the parts of the integrand with $|\operatorname{Im} w| > 100l$ give less than 1 (if $l$ is sufficiently large). The integral of $|\Gamma(\tfrac12 - \beta + it)|$ converges rapidly so, for (28.3) to hold, there must be some $t$ with $|t - \gamma| < 100l$ for which

$|\zeta(\tfrac12 + it)\, M(\tfrac12 + it)| > cY^{\beta - \frac12},$

(28.4)

where c is an absolute constant. We pick as representatives of the class (ii) zeros a sequence of values of t satisfying (28.4). By (22.22) the number of these t with

$|\zeta(\tfrac12 + it)| > U,$

(28.5)

where we choose $U$ below, is $\ll TU^{-4}l^5$.

(28.6)


Otherwise we have

$|M(\tfrac12 + it)| > V = cU^{-1}Y^{\sigma - \frac12},$

(28.7)

and by (27.28) the number of such $t$ is

$\ll XV^{-2}l + XTV^{-6}l^7.$

(28.8)

We choose

$U = X^{-1/10}\, Y^{3(2\sigma-1)/10},$

(28.9)

$V = cX^{1/10}\, Y^{(2\sigma-1)/5},$

(28.10)

and on adding (28.6) and (28.8) and multiplying by $l^2$ we see that class (ii) zeros number

$\ll X^{2/5}\, Y^{-6(2\sigma-1)/5}\, T l^9 + X^{4/5}\, Y^{-2(2\sigma-1)/5}\, l^3,$

(28.11)

the second term in (28.11) being less than the first provided (28.12)

X2Y4(2a-1) {20(r2+1)}-1. ME1,
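The exponent bookkeeping behind (28.9) to (28.11) can be checked mechanically. The following Python sketch assumes the readings $U = X^{-1/10}Y^{3(2\sigma-1)/10}$ and $V = cU^{-1}Y^{\sigma-1/2}$, the count $TU^{-4}l^5$ for (28.6) and the count $XV^{-2}l + XTV^{-6}l^7$ for (28.8), and it ignores the constant $c$. The tuple encoding and all helper names are illustrative and do not come from the text.

```python
from fractions import Fraction as F

# A monomial X^a * Y^(b*(2*sigma - 1)) * T^c * l^d is stored as the tuple (a, b, c, d).
def mul(p, q):
    """Multiply two monomials by adding their exponent tuples."""
    return tuple(x + y for x, y in zip(p, q))

def power(p, k):
    """Raise a monomial to the power k by scaling its exponents."""
    return tuple(x * k for x in p)

X = (F(1), F(0), F(0), F(0))
T = (F(0), F(0), F(1), F(0))
L = (F(0), F(0), F(0), F(1))
Y_HALF = (F(0), F(1, 2), F(0), F(0))      # Y^(sigma - 1/2) = Y^((2*sigma - 1)/2)

U = (F(-1, 10), F(3, 10), F(0), F(0))     # assumed reading of (28.9)
V = mul(power(U, -1), Y_HALF)             # (28.7) with the constant c dropped

L_SQUARED = power(L, 2)
term_zeta = mul(L_SQUARED, mul(T, mul(power(U, -4), power(L, 5))))       # l^2 * T U^-4 l^5
term_v2 = mul(L_SQUARED, mul(X, mul(power(V, -2), L)))                   # l^2 * X V^-2 l
term_v6 = mul(L_SQUARED, mul(X, mul(T, mul(power(V, -6), power(L, 7))))) # l^2 * X T V^-6 l^7

# Fifth power of (term_v2 / term_v6), to clear the denominators in the exponents.
ratio_fifth_power = power(mul(term_v2, power(term_v6, -1)), 5)
```

Under these assumptions `V` comes out as $X^{1/10}Y^{(2\sigma-1)/5}$, `term_v6` as $X^{2/5}Y^{-6(2\sigma-1)/5}Tl^9$ and `term_v2` as $X^{4/5}Y^{-2(2\sigma-1)/5}l^3$, while `term_zeta` has the same shape as `term_v6` with $l^7$ in place of $l^9$ and so is absorbed by it. The fifth power of the ratio is $X^2Y^{4(2\sigma-1)}T^{-5}l^{-30}$, whose $X$ and $Y$ part agrees with the left-hand side still legible in (28.12).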

We pick representatives and apply (27.28) with

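$G = \sum_{m \in I_r} |a(m)|^2 m^{-2\sigma} e^{-2m/Y} \ll (2^r Y)^{1-2\sigma} \exp(-2^{r+1})\, l^3.$

(28.14)

The number of representatives is thus

$\ll r^4 (2^r Y)^{2-2\sigma} \exp(-2^{r+1})\, l^3 + r^{12} (2^r Y)^{4-6\sigma}\, T \exp(-3 \cdot 2^{r+1})\, l^{13}.$

(28.15)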

Summing over $r$ and multiplying by $l^2$, we see that there are at most
