An Introduction to Sieve Methods and Their Applications
LONDON MATHEMATICAL SOCIETY STUDENT TEXTS Managing editor: Pr...
140 downloads
1443 Views
737KB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
An Introduction to Sieve Methods and Their Applications
LONDON MATHEMATICAL SOCIETY STUDENT TEXTS Managing editor: Professor J. W. Bruce, Department of Mathematics, University of Hull, UK 3 4 5 8 9 11 12 13 15 17 18 20 21 22 23 24 26 27 28 29 31 32 33 34 35 37 38 39 40 41 42 43 44 45 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
Local fields, J. W. S. CASSELS An introduction to twistor theory: Second edition, S. A. HUGGETT & K. P. TOD Introduction to general relativity, L. P. HUGHSTON & K. P. TOD Summing and nuclear norms in Banach space theory, G. J. O. JAMESON Automorphisms of surfaces after Nielsen and Thurston, A. CASSON & S. BLEILER Spacetime and singularities, G. NABER Undergraduate algebraic geometry, MILES REID An introduction to Hankel operators, J. R. PARTINGTON Presentations of groups: Second edition, D. L. JOHNSON Aspects of quantum field theory in curved spacetime, S. A. FULLING Braids and coverings: selected topics, VAGN LUNDSGAARD HANSEN Communication theory, C. M. GOLDIE & R. G. E. PINCH Representations of finite groups of Lie type, FRANCOIS DIGNE & JEAN MICHEL Designs, graphs, codes, and their links, P. J. CAMERON & J. H. VAN LINT Complex algebraic curves, FRANCES KIRWAN Lectures on elliptic curves, J. W. S CASSELS An introduction to the theory of L-functions and Eisenstein series, H. HIDA Hilbert Space: compact operators and the trace theorem, J. R. RETHERFORD Potential theory in the complex plane, T. RANSFORD Undergraduate commutative algebra, M. REID The Laplacian on a Riemannian manifold, S. ROSENBERG Lectures on Lie groups and Lie algebras, R. CARTER, G. SEGAL, & I. MACDONALD A primer of algebraic D-modules, S. C. COUNTINHO Complex algebraic surfaces, A. BEAUVILLE Young tableaux, W. FULTON A mathematical introduction to wavelets, P. WOJTASZCZYK Harmonic maps, loop groups, and integrable systems, M. GUEST Set theory for the working mathematician, K. CIESIELSKI Ergodic theory and dynamical systems, M. POLLICOTT & M. YURI The algorithmic resolution of diophantine equations, N. P. SMART Equilibrium states in ergodic theory, G. KELLER Fourier analysis on finite groups and applications, AUDREY TERRAS Classical invariant theory, PETER J. OLVER Permutation groups, P. J. CAMERON Introductory lectures on rings and modules, J. BEACHY Set theory, A HAJNÁL, P. HAMBURGER K-theory for C*-algebras, M. RORDAM, F. LARSEN, & N. LAUSTSEN A brief guide to algebraic number theory, H. P. F. SWINNERTON-DYER Steps in commutative algebra: Second edition, R. Y. SHARP Finite Markov chains and algorithmic applications, O. HAGGSTROM The prime number theorem, G. J. O. JAMESON Topics in graph automorphisms and reconstruction, J. LAURI & R. SCAPELLATO Elementary number theory, group theory, and Ramanujan graphs, G. DAVIDOFF, P. SARNAK, & A. VALETTE Logic, Induction and Sets, T. FORSTER Introduction to Banach Algebras and Harmonic Analysis, H. G. DALES et al Computational Algebraic Geometry, HAL SCHENCK Frobenius Algebras and 2-D Topological Quantum Field Theories, J. KOCK Linear Operators and Linear Systems, J. R. PARTINGTON An Introduction to Noncommutative Noetherian Rings, K. R. GOODEARL & R. B. WARFIELD Topics from One Dimensional Dynamics, K. M. BRUCKS & H. BRUIN Singularities of Plane Curves, C. T. C. WALL A Short Course on Banach Space Theory, N. L. CAROTHERS
An Introduction to Sieve Methods and Their Applications ALINA CARMEN COJOCARU Princeton University M. RAM MURTY Queen’s University
cambridge university press Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo Cambridge University Press The Edinburgh Building, Cambridge cb2 2ru, UK Published in the United States of America by Cambridge University Press, New York www.cambridge.org Information on this title: www.cambridge.org/9780521848169 © Cambridge University Press 2005 This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published in print format 2005 isbn-13 isbn-10
978-0-511-13149-3 eBook (NetLibrary) 0-511-13285-9 eBook (NetLibrary)
isbn-13 isbn-10
978-0-521-84816-9 hardback 0-521-84816-4 hardback
isbn-13 isbn-10
978-0-521-61275-3 paperback 0-521-61275-6 paperback
Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Principles exist. We don’t create them. We only discover them. Vivekananda
Contents
Preface
page xi
1
Some basic notions 1.1 The big ‘O’ and little ‘o’ notation 1.2 The M¨obius function 1.3 The technique of partial summation 1.4 Chebycheff’s theorem 1.5 Exercises
1 1 2 4 5 10
2
Some elementary sieves 2.1 Generalities 2.2 The larger sieve 2.3 The square sieve 2.4 Sieving using Dirichlet series 2.5 Exercises
15 15 17 21 25 27
3
The 3.1 3.2 3.3 3.4 3.5
normal order method A theorem of Hardy and Ramanujan The normal number of prime divisors of a polynomial Prime estimates Application of the method to other sequences Exercises
32 32 35 38 40 43
4
The 4.1 4.2 4.3 4.4
Turán sieve The basic inequality Counting irreducible polynomials in p x Counting irreducible polynomials in x Square values of polynomials
47 47 49 51 53
vii
Contents
viii 4.5 4.6
An application with Hilbert symbols Exercises sieve of Eratosthenes The sieve of Eratosthenes Mertens’ theorem Rankin’s trick and the function x z The general sieve of Eratosthenes and applications Exercises
55 58
5
The 5.1 5.2 5.3 5.4 5.5
63 63 65 68 70 74
6
Brun’s sieve 6.1 Brun’s pure sieve 6.2 Brun’s main theorem 6.3 Schnirelman’s theorem 6.4 A theorem of Romanoff 6.5 Exercises
80 81 87 100 106 108
7
Selberg’s sieve 7.1 Chebycheff’s theorem revisited 7.2 Selberg’s sieve 7.3 The Brun–Titchmarsh theorem and applications 7.4 Exercises
113 113 118 124 130
8
The 8.1 8.2 8.3 8.4 8.5
large sieve The large sieve inequality The large sieve Weighted sums of Dirichlet characters An average result Exercises
135 136 139 142 147 151
9
The 9.1 9.2 9.3 9.4
Bombieri–Vinogradov theorem A general theorem The Bombieri–Vinogradov theorem The Titchmarsh divisor problem Exercises
156 157 167 172 174
10
The lower bound sieve 10.1 The lower bound sieve 10.2 Twin primes 10.3 Quantitative results and variations
177 177 185 193
Contents
11
ix
10.4 Application to primitive roots 10.5 Exercises
195 199
New 11.1 11.2 11.3 11.4 11.5 11.6
201 201 205 207 209 213 216
directions in sieve theory A duality principle A general formalism Linnik’s problem for elliptic curves Linnik’s problem for cusp forms The large sieve inequality on GLn Exercises
References
218
Index
222
Preface
It is now nearly 100 years since the birth of modern sieve theory. The theory has had a remarkable development and has emerged as a powerful tool, not only in number theory, but in other branches of mathematics, as well. Until 20 years ago, three sieve methods, namely Brun’s sieve, Selberg’s sieve and the large sieve of Linnik, could be distinguished as the major pillars of the theory. But after the fundamental work of Deshouillers and Iwaniec in the 1980’s, the theory has been linked to the theory of automorphic forms and the fusion is making significant advances in the field. This monograph is the outgrowth of seminars and graduate courses given by us during the period 1995–2004 at McGill and Queen’s Universities in Canada, and Princeton University in the US. Its singular purpose is to acquaint graduate students to the difficult, but extremely beautiful area, and enable them to apply these methods in their research. Hence we do not develop the detailed theory of each sieve method. Rather, we choose the most expedient route to introduce it and quickly indicate various applications. The reader may find in the literature more detailed and encyclopedic accounts of the theory (many of these are listed in the references). Our purpose here is didactic and we hope that many will find the treatment elegant and enjoyable. Here are a few guidelines for the instructor. Chapters 1 through 5 along with Chapter 7 can be used as material for a senior level undergraduate course. Each chapter includes a good number of exercises suitable at this level. The book contains more than 200 exercises in all. Chapter 6 along with chapters 8 and 9 are certainly at the graduate level and require further prerequisites. Finally, Chapters 10 and 11 are at the ‘seminar’ level and require further mathematical sophistication. For the last chapter, in particular, a modest xi
xii
Preface
background in the theory of elliptic curves and automorphic representations may make the reading a bit smoother. Whenever possible, we have tried to provide suitable references for the reader for these prerequisites. Our list of references is by no means exhaustive.
1 Some basic notions
A. F. Möbius (1790–1868) introduced the famous Möbius function · in 1831 and proved the now well-known inversion formula. The Möbius function is fundamental in sieve theory and it will be seen that almost all sieve techniques are based on approximations or identities of various sorts to this function. It seems that the Möbius inversion formula was independently discovered by P. L. Chebycheff (1821–94) in the year 1851, when he wrote his celebrated paper establishing upper and lower bounds for x and settling Bertrand’s postulate that there is always a prime number between n and 2n for n ≥ 2. We give the proof of this below, but follow a derivation due to S. Ramanujan (1887–1920).
1.1 The big ‘O’ and little ‘o’ notation Let D be a subset of the complex numbers and let f D −→ be a complex valued map defined on D. We will write fx = Ogx if g D −→ + and there is a positive constant A such that fx ≤ Agx for all x ∈ D. Often, D will be the set of natural numbers or the non-negative reals. Sometimes we will write fx gx or 1
gx fx
Some basic notions
2
to indicate that fx = Ogx. If we have that fx gx and gx fx then we write fx gx
In the case that D is unbounded, we will write fx = ogx if lim
x→ x∈D
fx = 0
gx
Clearly, if fx = ogx, then fx = Ogx
We will also write fx ∼ gx to mean lim
x→ x∈D
fx = 1
gx
As an example, look at n, the number of distinct prime factors of a positive integer n. Then n = Olog n. Also, for any > 0, log x = o x because log x →0 x as x → . There is one convention prevalent in analytic number theory and this refers to the use of . Usually, when we write fx = O x it means that for any > 0 there is a positive constant C depending only on such that fx ≤ C x for all x∈ D. If the usage of is in this sense, then it should be clear that fx = O x2 also implies fx = O x . Throughout the book, p, q, will usually denote primes, n, d, k will be positive integers, x, y, z positive real numbers. Any deviation will be clear from the context. Also, sometimes we denote gcd (n, d) as (n, d) and the lcm (n, d) as [n, d].
1.2 The Möbius function The M¨obius function, denoted ·, is defined as a multiplicative function satisfying 1 = 1, p = −1 for every prime p and pa = 0 for integers a ≥ 2. Thus, if n is not squarefree, n = 0 and if n is a product of k distinct primes, then n = −1k . We have the basic lemma:
1.2 The Möbius function
3
Lemma 1.2.1 (The fundamental property of the Möbius function) 1 if n = 1 d = 0 otherwise. dn Proof If n = 1, then the statement of the lemma is clearly true. If n > 1, a let n = p1 1 prar be the unique factorization of n into distinct prime powers. Set N = p1 pr (this is called the radical of n). As d = 0 unless d is squarefree, we have d = d
dn
dN
r
The latter sum contains 2 summands, each one corresponding to a subset of p1
pr since the divisors of N are in one-to-one correspondence with such subsets. The number of k element subsets is clearly r k and for a divisor d determined by such a subset we have d = −1k . Thus r r −1k = 1 − 1r = 0
d = k k=0
dN
This completes the proof. Lemma 1.2.1 is fundamental for various reasons. First, it allows us to derive an inversion formula that is useful for combinatorial questions. Second, it is the basis of both the Eratosthenes’ and Brun’s sieves that we will meet in later chapters. Theorem 1.2.2 (The Möbius inversion formula) Let f and g be two complex valued functions defined on the natural numbers. If fn = gd dn
then gn =
dn
and conversely. Proof Exercise.
dfn/d
Some basic notions
4
Theorem 1.2.3 (The dual Möbius inversion formula) Let be a divisor closed set of natural numbers (that is, if d ∈ and d d then d ∈ ). Let f and g be two complex valued functions on the natural numbers. If gd fn = nd d∈
then gn =
nd d∈
d fd n
and conversely (assuming that all the series are absolutely convergent). Proof Exercise.
1.3 The technique of partial summation Theorem 1.3.1
Let c1 c2 be a sequence of complex numbers and set Sx =
cn
n≤x
Let n0 be a fixed positive integer. If cj = 0 for j < n0 and f n0 −→ has continuous derivative in n0 , then for x an integer > n0 we have
cn fn = Sxfx −
n≤x
x
Stf tdt
n0
Proof This is easily deduced by writing the left-hand side as n≤x
Sn − Sn − 1fn =
Snfn −
n≤x
= Sxfx −
n≤x−1
= Sxfx −
Snfn + 1
n≤x−1
x n0
n+1
Sn n
Stf tdt
f tdt
1.4 Chebycheff’s theorem
5
because St is a step function that is constant on intervals of the form n n + 1. In the mathematical literature, the phrase ‘by partial summation’ often refers to a use of the above lemma with appropriate choices of cn and ft. For instance, we can apply it with cn = 1 and ft = log t to deduce that x t dt log n = x log x − t 1 n≤x where the notation x indicates the greatest integer less than or equal to x. Since x = x + O1, we easily find: Proposition 1.3.2
log n = x log x − x + Olog x
n≤x
Similarly, we deduce: Proposition 1.3.3
1 = log x + O1
n≤x n
1.4 Chebycheff’s theorem We caution the reader to what has now become standard notation in number theory. Usually p (and sometimes q or ) will denote a prime number, and summations or products of the form
p≤x
p
p≤x
p
indicate that the respective sums or products are over primes. Let x denote the number of primes up to x. Clearly, x = Ox. In 1850, Chebycheff proved by an elementary method that x x = O
(1.1) log x In fact, if we define x =
p≤x
then Chebycheff proved:
log p
Some basic notions
6
Theorem 1.4.1 (Chebycheff’s theorem) There exist positive constants A and B such that Ax < x < Bx
By partial summation, this implies the bound on x stated in (1.1). From Theorem 1.4.1 it is clear that there is always a prime number between x and Bx/A, since Bx Bx >A = Bx > x
A A By obtaining constants A and B so that B/A ≤ 2 Chebycheff was able to deduce further from this theorem: Theorem 1.4.2 (Bertrand’s postulate) There is always a prime between n and 2n, for n ≥ 1
Chebycheff’s theorem represented the first substantial progress at that time towards a famous conjecture of Gauss concerning the asymptotic behaviour of x. Based on extensive numerical data and coherent heuristic reasoning, Gauss predicted that x ∼
x log x
(1.2)
as x → . This was proven independently by Hadamard and de la Vallée Poussin in 1895 and is known as the prime number theorem. Before we discuss Chebycheff’s proof, we will outline a simplified treatment of Theorem 1.4.1, due to Ramanujan [55]. Ramanujan’s proof of Theorem 1.4.1 Let us observe that if a 0 ≥ a1 ≥ a2 ≥ · · · is a decreasing sequence of real numbers, tending to zero, then a0 − a1 ≤
−1n an ≤ a0 − a1 + a2
n=0
These inequalities are obvious if we write n=0
−1n an = a0 − a1 − a2 − a3 − a4 − · · ·
1.4 Chebycheff’s theorem
7
on the one hand, and
−1n an = a0 − a1 + a2 − a3 + · · ·
n=0
on the other. Following Chebycheff, we define x = log p
(1.3)
pa ≤x
where the summation is over all prime powers pa ≤ x. We let x Tx = n n≤x and notice that
log n =
n≤x
log p =
pa n
n≤x
m≤x
x m
= Tx
By Proposition 1.3.2 of the previous section, Tx = x log x − x + Olog x so that Tx − 2T
x
On the other hand, Tx − 2T
2
= log 2x + Olog x
x 2
=
−1n−1
n≤x
x n
and we can apply our initial observation to deduce x x + ≥ log 2x + Olog x x − 2 3
(1.4)
because an = x/n is a decreasing sequence of real numbers tending to zero (in fact, equal to zero for n > x). By the same logic, we have x ≤ log 2x + Olog x
x − 2 By successively replacing x with x/2k in the above inequality, we obtain x ≤ 2log 2x + O log2 x
This completes Ramanujan’s proof of Theorem 1.4.1. We can easily deduce Bertrand’s postulate now.
Some basic notions
8
Proof of Theorem 1.4.2 From (1.4), x 1 ≥ log 2x + O log2 x
x − (1.5) 2 3 This shows that there is always a prime power between n and 2n for n sufficiently large. We leave as an exercise to show that (1.5) implies that there is always a prime between n and 2n for n sufficiently large. We now indicate Chebycheff’s argument. Chebysheff’s proof of Theorem 1.4.1 The key observation is that
2n p
n n 0. √ 2. If dn denotes the number of divisors of n, show that dn = O n. 3. Show that there is a constant c > 0 such that c log n
dn = O exp log log n Deduce that for any > 0, dn = O n . 4. Prove the Möbius inversion formula. 5. Prove the dual Möbius inversion formula. 6. Let F and G be complex valued functions defined on 1 . Prove that Gx/n Fx = n≤x
if and only if Gx =
n≤x
nFx/n
1.5 Exercises 7.
11
Prove that there is a constant such that 1 = log x + + O1/x
n≤x n
8.
( is called Euler’s constant.) Show that dn = x log x + Ox
n≤x
9.
Prove that
√ dn = x log x + 2 − 1x + O x
n≤x
10. Prove, by partial summation, that x ∼ x if and only if x ∼ x/ log x. [Hint: prove first that x ∼ x if and only if x ∼ x/ log x.] 11. Using (1.4), show that 1 x 1 ≥ log 2x + O x 2 log2 x
x − 2 3 12. Define the function log p if n = pa n = 0 otherwise. This is called the von Mangoldt function. Observe that n x = n≤x
and prove that x = Ox
13. Show that
n2 x log x
n≤x
14. Show that en = lcm1 2 3 n
15. Consider the integral I=
1 0
xn 1 − xn dx
and show that 0 < I < 4−n
16. Show that e2n+1 I is a positive integer.
Some basic notions
12
17. Deduce from the previous exercise that n ≥ log 2n for n sufficiently large. 18. Infer from the previous exercise that for some constant c > 0, x >
cx log x
for all x ≥ 2. 19. Using the previous two exercises, show that there is always a prime between n and 2n for n ≥ 2. 20. Let n denote the number of coprime residue classes mod n. This is called Euler’s function. Show that d = n
dn
21. Show that n n≤x
n
=
6 x + Olog x
2
22. Let pn denote the n-th prime. Show that there are positive constants A and B such that An log n < pn < Bn log n. The following exercises utilise partial summation and Möbius inversion to deduce Selberg’s formula. This was the key tool in Selberg’s elementary proof of the prime number theorem [58], discovered in 1949. 23. Using partial summation, show that 2 log n = x log2 x − 2x log x + 2x + O log2 x
n≤x
24. Using the previous exercise, deduce that 2x = Ox
log n n≤x 25. Show that n n≤x
n
= log x + O1
26. Show that n n≤x
n
log
x 1 = log2 x + Olog x
n 2
1.5 Exercises
13
27. Putting Gx = 1 in Exercise 6 above, deduce that n n≤x
n
= O1
28. Putting Gx = x in Exercise 6 above, deduce that n n
n≤x
log
x = O1
n
29. Putting Gx = x log x in Exercise 6 above, deduce that n n≤x
n
log2
x = 2 log x + O1
n
30. Deduce from the previous exercises that d1 log2 d2 = 2x log x + Ox
d1 d2 ≤x
31. A complex valued function defined on the set of natural numbers is called an arithmetical function. For two arithmetical functions f and g we define the Dirichlet product f ∗ g by f ∗ gn = fdgn/d
dn
We can also define the ordinary product of two arithmetical functions f and g, denoted fg, by setting fgn = fngn. Now let Ln = log n
Show that Lf ∗ g = Lf ∗ g + f ∗ Lg
32. If denotes the usual von Mangoldt function, show that L + ∗ = ∗ L2 where L· is as above. 33. Deduce from the previous exercises that n log n + uv = 2x log x + Ox
n≤x
uv≤x
34. Deduce Selberg’s formula from the previous exercise: x log x + nx/n = 2x log x + Ox
n≤x
14
Some basic notions
35. Let maxn denote the largest exponent appearing in the unique factorization of n into distinct prime powers. Show that maxn = Ox
n≤x
36. With maxn as in the previous exercise, show that for some constant c > 0, maxn ∼ cx n≤x
as x tends to infinity. What can be said about the error term? What can be said about the constant c?
2 Some elementary sieves
Modern sieve theory had an awkward and slow beginning in the early works of Viggo Brun (1885–1978). Brun’s papers received scant attention and so sieve theory essentially lay dormant, waiting to be developed. By the 1950s, however, the Selberg sieve and the large sieve emerged as powerful tools in analytic number theory. In retrospect, we now understand the concept of a sieve in a better light. This reflection has recently given rise to some elementary sieve techniques. Though relatively recent in origin, these techniques are simple enough to be treated first, especially from the didactic perspective.
2.1 Generalities Let be a finite set of objects and let be an index set of primes such that to each p ∈ we have associated a subset p of . The sieve problem is to estimate, from above and below, the size of the set = \ ∪p∈ p
This is the formulation of the problem in the most general context. Of course, the ‘explicit’ answer is given by the familiar inclusion–exclusion principle in combinatorics. More precisely, for each subset I of denote by I = ∩p∈I p
Then the inclusion–exclusion principle gives us −1#I #I # = I⊆
where for the empty set ∅ we interpret ∅ as itself. This formula is the basis in many questions of probability theory (see the exercises). 15
16
Some elementary sieves
In number theory we often take to be a finite set of positive integers and p to be the subset of consisting of elements lying in certain specified congruence classes modulo p For instance, if is the set of natural numbers ≤ x and p is the set of numbers in divisible by p then the size of will be the number of positive integers n ≤ x coprime to all the elements of Estimating # is a fundamental question which arises in many disguises in mathematics and forms the focus of attention of all sieve techniques. We will illustrate this in later chapters. We could also reverse the perspective. Namely, we can think of = as a given set, whose size we want to estimate. We seek to do this by looking at its image modulo primes p ∈ for some set of primes This will be the point of view in the large sieve (Chapter 8). We could even enlarge this reversed perspective, as follows. Let be a finite set of positive integers and let be a set of prime powers. Suppose that we know the size of the image of mod t for any t ∈ We then seek to estimate the size of itself. This is the approach of the larger sieve, to be discussed below. The rationale for the terminology will be explained later. In some fortuitous circumstances, we may have a family of complex valued functions f −→ such that
fn =
f ∈
Then # =
f ∈
1
if n ∈
0
otherwise.
(2.1)
fn
n∈
and the inner sum may be tractable by other techniques, such as analytic methods. We will illustrate this idea in the section on sieving using Dirichlet series. Often it is the case that a precise relation as (2.1) may not exist and we may only have an approximation to it. For instance, 1 + o1 if n ∈ fn = (2.2) o1 otherwise, f ∈ as # → Such a family of functions can then be used with great effect in obtaining estimates for #
2.2 The larger sieve Even the knowledge
fn ≥ 1
17
if n ∈
f ∈
is sufficient to yield good upper bounds for # Indeed, we can consider
2
fn # ≤
n∈ f ∈ and expand out the square. After interchanging summations, we confront fnf n n∈
for f f ∈ (see Chapter 11). We will illustrate these ideas, first with Gallagher’s larger sieve [21] and then with the square sieve [24]. In the last section we will discuss analytic techniques alluded to earlier, namely how to sieve using Dirichlet series.
2.2 The larger sieve Let be a (non-empty) finite set of integers and let be a set of prime powers. Suppose that for each t ∈ we have # mod t ≤ ut for some ut Thus represents at most ut residue classes modulo t
Theorem 2.2.1 (Gallagher’s larger sieve) We keep the above setting and let X = max b
b∈
If t t∈
ut
then
# ≤
− log2X > 0
t − log2X
t∈
t t∈
ut
− log2X
where · is the von Mangoldt function.
Some elementary sieves
18
Proof Let t ∈ and for each residue class rmod t define Z t r = # b ∈ b ≡ rmod t
Then # =
Z t r
rmod t
By the Cauchy–Schwarz inequality, this is 1/2 1/2 2 ≤ ut Z t r
rmod t
Hence # 2 ≤ ut rmod t ≤ # +
1
bb ∈ bb ≡rmod t
1
bb ∈ tb−b b=b
We multiply this inequality by t and we sum over t ∈ Using t = log n tn
we obtain # 2 t∈
ut
t ≤ #
t + log 2X # 2 − #
t∈
By cancelling the # and rearranging, we establish the inequality. This sieve should be compared with the large sieve discussed in Chapter 8. The advantage here is that we can sieve out residue classes modulo prime powers, whereas in the large sieve only residue classes modulo primes are considered. This explains to some extent the name ‘the larger sieve’. Following Gallagher, we apply the larger sieve to prove: Theorem 2.2.2 Let a b be integers having the property that for any prime power t there exists an integer t such that b ≡ at mod t
Then there exists an integer such that b = a
2.2 The larger sieve
19
Before proceeding with the proof of this theorem, let us review some basic properties of the cyclotomic polynomial. Recall that for a positive integer d the d-th cyclotomic polynomial d x is the minimal polynomial over of a primitive d-th root of unity. Thus it has degree d, where · is Euler’s function. Now let n be an arbitrary positive integer. As the n-th roots of unity can be partitioned according to their order, we see that we have the formula
xn − 1 = d x
dn
Finally, for an integer a such that a n = 1 let fa n be the order of a modulo n By Exercise 6 we have that fa n = d if and only if nd a
Proof of Theorem 2.2.2 Let a b be as in the statement of the theorem. We note that to prove the result, we may suppose that a b are positive and a ≥ 3 (see Exercise 5). Let = n ≤ x n = ai bj for some i j and
= t t prime power fa t ≤ y where y = yx is some parameter to be chosen later. By Exercise 6, is a finite set. We keep the notation of Theorem 2.2.1. If for every prime power t we have that b is a power of a modulo t then ut ≤ fa t
Thus Theorem 2.2.1 implies that # ≤
t − log2x
t∈
t − log2x f t t∈ a
provided that the denominator is positive. We have t = t d≤y fa t=d
t∈
=
t
d≤y td a
=
d≤y
log d a
(2.3)
Some elementary sieves
20
upon using Exercise 6 and the formula
dn d
= log n Clearly,
a − 1d ≤ d a ≤ a + 1d so that
t =
log d a
d≤y
t∈
d y2
d≤y
We also note that this implies t 1 ≥ t y
f t y t∈
t∈ a Now choose y = 100 log2x
From (2.3) we deduce that # log x
(2.4)
To this end, let us remark that if all the powers of a and b are distinct, then the set has cardinality log x2 (see Exercise 7). This contradicts (2.4), and so we conclude that for some i0 j0 we have ai0 = bj0
We may even suppose that i0 j0 = 1 for otherwise we can take i0 j0 -th roots of both sides of the above equality. Let us write
n = pp n p
for the unique factorization of an integer n into prime powers. We deduce that i0 p a = j0 p b for all primes p As i0 j0 = 1 this means that i0 p b and j0 p a for all primes p This implies that a is a j0 -th power and b is an i0 -th power of some integer c The hypothesis now implies that for any prime q there exists a q such that cj0 q ≡ ci0 mod q which is equivalent to fc qj0 q − i0 if q c = 1
2.3 The square sieve
21
Now take a prime divisor q of j0 t c for any t By Exercise 6 we deduce that fc q ≡ 0mod j0 Thus j0 i0 and so b is a power of a as desired.
2.3 The square sieve The square sieve is a simple technique originating in [24] meant to estimate the number of squares in a given set of integers. It relies on the use of a family of quadratic residue symbols for sifting out the squares. Consequently, it is well-suited for those sequences that are uniformly distributed in arithmetic progressions. Theorem 2.3.1 (The square sieve) Let be a finite set of nonzero integers and let be a set of odd primes. Set S = # ∈ is a square
Then
#
+ max S ≤
+ E q1 =q2
# ∈ q1 q2
where
· q1 q2
q1 q2 ∈
denotes the Jacobi symbol and 1 1 2 E = O + # ∈ #2 ∈ =
1
p∈ p
Remark 2.3.2 In practice, the contribution from E is negligible and one would expect the larger contributions to the estimate to come from the first two terms. Proof We begin by observing that if ∈ is a square, then = # −
q q∈ Thus
1 S ≤ 2 ∈ #
2 +
q q∈
(2.5)
Some elementary sieves
22
Upon squaring and interchanging the summations, we get that the right hand side of inequality (2.5) is 1 2 + 2 +
2 q1 q2 q ∈ # q1 q2 ∈ q∈ The first sum is 1 1 # + ≤ #2 ∈ q1 q2 # q1 q2 ∈ #2 ∈ q1 q2 q1 q2 ∈ q1 =q2
#
+ max ≤
# q1 q2 ∈ ∈ q1 q2 q1 =q2
The contribution to (2.5) from the latter sums is easily seen to be E≤
2 1 + 2
# ∈ #2 ∈
This completes the proof. Corollary 2.3.3 Let be a set of nonzero integers and let be a set of primes that are coprime to the elements of Then
#
S = # ∈ is a square ≤ + max
# q1 q2 ∈ ∈ q1 q2 q1 =q2
Proof The hypothesis of the corollary implies that = 0 for any ∈ so that E = 0 in the square sieve. We want to apply the square sieve to count the number of integral points on a hyperelliptic curve y2 = fx where fx ∈ x is a polynomial of degree d, of non-zero discriminant, and which is not the perfect square of a polynomial with integer coefficients. A famous theorem of Siegel [67] tells us that the number of integral points on such a curve is finite. Recently, effective estimates of this number have been given by various authors (see [30]). However, these estimates involve knowing the Mordell-Weil rank of the Jacobian of the hyperelliptic curve. Our approach is elementary and can be adapted to study how often a polynomial fx1 xn represents a square. As will be seen below, the generalization
2.3 The square sieve
23
will require the deep work of Deligne (see [62]). For more applications, the reader may consult [7]. Given fx ∈ x and k ∈ let us set for k > 2, fa Sf k = k amod k where ·/k is the Jacobi symbol. Lemma 2.3.4 Let q1 q2 be distinct primes and let f ∈ x Then Sf q1 q2 = Sf1 q1 Sf2 q2 where f1 x = fq2 x f2 x = fq1 x
Proof The residue classes modulo q1 q2 can be written as q1 a2 + q2 a1 with 0 ≤ a2 ≤ q2 − 1 0 ≤ a1 ≤ q1 − 1 (see Exercise 8). Therefore q1 −1 q2 −1 fq1 a2 + q2 a1 Sf q1 q2 = q1 q2 a2 =0 a1 =0 =
q1 −1 q2 −1
a2 =0 a1 =0
=
q1 −1 q2 −1
a2 =0 a1 =0
fq1 a2 + q2 a1 q1 fq2 a1 q1
fq1 a2 + q2 a1 q2
fq1 a2
q2
The result follows. Now let H be a positive real number and let us consider the set = fn n ≤ H
By the square sieve, the number of squares of is, for any set of primes not dividing the discriminant of f ,
fn 2H + 1
+ max ≤
+ E q1 q2 ∈
# q q 1 2 n≤H q1 =q2
where
E = O
H log H Hlog H2 + # #2
and where we have used the elementary estimate = Olog
Some elementary sieves
24
Let q1 q2 be two distinct primes of . We have fn fa = q1 q2 q1 q2 amod q q n≤H n≤H 1 2
1
n≡amod q1 q2
The inner sum is 2H + O1 q1 q2 so we obtain n≤H
fn q1 q2
2H = q1 q2
amod q1 q2
fa + Oq1 q2
q1 q2
By the lemma, the sum on the right-hand side is the product Sf1 q1 Sf2 q2 for appropriate polynomials f1 f2
We invoke a celebrated result of Weil (see [35, p. 99]), asserting that, for any gx ∈ x with non-zero discriminant and which is not the perfect square of a polynomial with integer coefficients, and for any prime p not dividing the discriminant of g,
ga √
≤ deg g − 1 p
amod p p Using this in the above estimates gives fn H + q1 q2
=O √ q1 q2 q1 q2 n≤H Let us choose the set to be given by the primes not dividing the discriminant of f and lying in the interval z 2z for some z = zH > 0 to be also chosen soon. We get the final estimate S
Hlog Hlog z Hlog H2 log z2 H log z H + + z2 + +
z z z z2
Choosing z = H 1/3 log H2/3 proves: Theorem 2.3.5 Let f be a polynomial with non-zero discriminant and integer coefficients, which is not the perfect square of a polynomial with integral coefficients. Let H > 0 Then the number of squares in the set fn n ≤ H
2.4 Sieving using Dirichlet series
25
is O H 2/3 log H4/3 with the implied O-constant depending only on the degree of f and the coefficients of f .
2.4 Sieving using Dirichlet series Sometimes, the sequences of numbers that we sift from exhibit a multiplicative structure and the sieve conditions may also exhibit such a property. In such cases, analytic methods using Dirichlet series are quite powerful and direct. In some instances, the techniques may even yield asymptotic formulae for the sieve problem. We illustrate this idea below, in greater detail. Further elaboration can be found in [46]. Let be a set of primes and let indicate its complement in the set of all primes. Suppose that we want to count the number of natural numbers n ≤ x which are not divisible by any of the primes of If we define the Dirichlet series
an 1 −1 1− s = Fs = s p n≥1 n p∈ we see that an = 1 if n is not divisible by any p ∈ and an = 0 otherwise. Thus we seek to study an
n≤x
By Perron’s formula (see [45, pp. 54–7]), this can be written as 1 2+i xs an = Fs ds
2i 2−i s n≤x Here is a variant of the classical Tauberian theorem that is useful in such a context. Theorem 2.4.1 (Tauberian theorem) an Let Fs = be a Dirichlet series with non-negative coefficients convergs n≥1 n ing for Res > 1. Suppose that Fs extends analytically at all points on Res = 1 apart from s = 1, and that at s = 1 we can write Fs =
Hs s − 11−
for some ∈ and some Hs holomorphic in the region Res ≥ 1 and nonzero there. Then cx an ∼ log x n≤x
Some elementary sieves
26 with
c =
H1 1 −
where is the usual Gamma function. This theorem was proven in 1938 by Raikov [54]. We do not prove it here, but only indicate that standard techniques of analytic number theory, as explained in [45, Chapter 4], can be used to derive the result. The reader may also find treatments in English in [75] and [68]. As an illustration of the principle, we consider the problem of counting the number of natural numbers n ≤ x that can be written as the sum of two squares. It is well-known (see [32, p. 279]) that n can be written as a sum of two squares if and only if for every prime p ≡ 3mod 4 dividing n, the power of p appearing in the unique factorization of n is even. Thus, if an = 1 whenever n can be written as a sum of two squares and is zero otherwise, we see that an 1 −1
1 −1 1 −1
1 − 1 − = 1 −
Fs = s 2s ps p2s n≥1 n p≡1mod 4 p≡3mod 4 Now we need to invoke some basic properties of the Riemann zeta function s and the Dirichlet L-function Ls 4 associated with the quadratic character 4 , defined by s =
1 s n≥1 n
Ls 4 =
4 n n≥1
ns
for s ∈ with Res > 1 Here, 4 n is 0 for n even and −1n−1/2 for n odd. We refer the reader to [45] for the properties of these functions. Using the Euler products of s and Ls 4 we write Fs = sLs 4 1/2 H1 s where H1 s is analytic and non-vanishing for Res > 1/2 As Ls 4 extends to an entire function and is non-vanishing for Res ≥ 1 we have Fs = s1/2 H2 s for some H2 s holomorphic and nonzero in Res ≥ 1 Thus, using the fact that the Riemann zeta function has a simple pole at s = 1 and that it is analytic and non-vanishing for Res = 1 s = 1 (see [45]), we deduce that Fs =
Hs s − 11/2
with Hs holomorphic and non-vanishing in the region Res ≥ 1. By the Tauberian theorem cited above we obtain:
2.5 Exercises
27
Theorem 2.4.2 The number of n ≤ x that can be written as the sum of two squares is cx ∼ log x for some c > 0 as x →
By the same technique using classical Dirichlet series, one can also deduce the following remarkable result. Fix an integer k ≥ 3. The number of n ≤ x not divisible by any prime p ≡ 1mod k is ∼
c1 x log x1/k
for some c1 > 0, where k is the Euler function. Thus a consequence of this result is that ‘almost all’ numbers have a prime divisor p ≡ 1mod k
Further applications of the technique can be found in [46].
2.5 Exercises 1.
(The Cauchy–Schwarz inequality) Let ai 1≤i≤n bi 1≤i≤n be complex numbers. Show that
2
2 2 ab ≤ ai bi
1≤i≤n i i 1≤i≤n 1≤i≤n
2.
Let k < n be positive integers. Let X be a k-element set and Y = y1 yn an n-element set. Let S be the set of all maps from X to Y , and i be the set of maps whose image does not contain yi Then the set S\ ∪1≤i≤n i consists of maps that are surjective. Using the inclusion–exclusion principle, deduce that n n − ik = 0 −1i i 0≤i≤n whenever k < n
Some elementary sieves
28 3.
Prove that
n −1 n − in = n!
i 0≤i≤n
4.
i
Let = 1 2 n Denote by Dn the number of one-to-one maps f −→ without any fixed point. Show that lim
n→
1 Dn = n! e
where e denotes Euler’s e. [A map f without any fixed point is called a derangement]. 5. Let a b be integers. We say that a and b are related if, for every prime power t, b ≡ at mod t for some positive integer t Show that if a and b are related, so are a2 and b2 Also, show that if a ≤ 2 and b ≤ 2 with a b related, then a = b
6. If t is a prime power coprime to k, show that tk a if and only if amod t has order k
7. If a and b are natural numbers ≥ 2 with ai i ≥ 1 ∩ bj j ≥ 1 = ∅ show that the number of ai bj ≤ x is log x2
8.
Let t1 t2 be coprime positive integers and let t = t1 t2 Show that all the residue classes modulo t can be represented as t1 a2 + t2 a1
9.
for some 0 ≤ a2 ≤ t2 − 1 0 ≤ a1 ≤ t1 − 1 Also, show that the coprime residue classes modulo t can be represented as above with a2 t2 = a1 t1 = 1
For any fx ∈ x and any natural number k > 2 define fa Sf k = k amod k
where (·/k) is the Jacobi symbol. If k = i qi i is the unique prime factorization of k, then there exist polynomials fi such that
Sf k = Sfi qi i
i
2.5 Exercises
29
10. A number n is called squareful if for every prime pn we have p2 n
Show that the number of squareful natural numbers ≤ x is √ ∼c1 x for some c1 > 0, as x →
11. Let k be a natural number ≥ 3 Show that the number of n ≤ x that are not divisible by any prime p ≡ 1mod k is c2 x ∼ log x1/k for some c2 > 0, as x →
d = log n
12. Show that, for any positive integer n dn
13. Show that a − 1d ≤ d a ≤ a + 1d and
d x2
d≤x
14. Let x y denote the number of n ≤ x with the property that if a prime pn, then p < y. Show that if x1/2 < z ≤ x, then log x x x z = 1 − log x+O
log z log z [A number all of whose prime factors are < y is called a y-smooth number.] 15. Prove that if y < z, then
x y = x z −
x/pr p
y≤p 1. 20. With notation as in the previous exercise, let A be a set of natural numbers ≤ x and let SA P denote the set of elements n of A with nP = 1. Now suppose: (i)
for d ∈ P, the set Ad consisting of elements of A divisible by d satisfies x #Ad = + Rx d d
2.5 Exercises
31
for some and some Rx d with x Rx d log = Ox d d≤x d∈P
(ii) if n ∈ A with nP > 1, then nP has at least two prime factors counted with multiplicity; (iii) there is a set B such that SA P = SB P and satisfying the condition that for p ∈ P and m ∈ B, we have pm ∈ B; (iv) there are numbers a and b with a > 0 so that 1 1 = a log x + b + O
m x m≤x m∈B
Under these conditions, show that for some positive constant c, cx #SA P ∼ log x1−a/b as x tends to infinity. 21. Apply the previous exercise to the set A consisting of natural numbers ≡ 1mod 4 and P the set of primes ≡ 3mod 4. Thus, deduce Theorem 2.4.2. [Hint: take B to be the set of odd natural numbers.] The last two exercises have been adapted from [52]. 22. Let fx be a polynomial with integer coefficients having the property that for every integer n, fn is a perfect square. Show that fx is the square of a polynomial with integer coefficients. Generalize this result to polynomials of several variables. [Hint: this can be deduced without using the results obtained in this chapter on the square sieve, as follows. We may suppose without loss of generality that fx is a product of distinct irreducible polynomials. Take a prime p that is coprime to the discriminant of f such that pfn for some n. Consider fn + p and fn and deduce that one of them is not divisible by p2 . The general case can be formally treated using resultants and made to reduce to the single variable case.]
3 The normal order method
The normal order method has its origins in a 1916 paper of G. H. Hardy (1877–1947) and S. Ramanujan (1887–1920). A simpler and more transparent treatment of their work was given later in 1934 by Paul Turán (1910–76). Turán’s method was substantially amplified by Paul Erdös (1913–96) and Mark Kac (1914–84). They used it to create an entire subject that has come to be known as probabilistic number theory. The method of Turán will be discussed in greater detail in the next chapter, which will form the basis for an elementary sieve method. In this chapter we will focus on how Turán’s method can be used to study the distribution of the number of prime factors of various sequences of numbers.
3.1 A theorem of Hardy and Ramanujan We recall that n denotes the number of distinct prime divisors of n. In 1916, Hardy and Ramanujan (see [28, p. 356]) proved that almost all numbers n are composed of log log n prime factors. To be precise, they showed that the number of n ≤ x not satisfying the inequality 1 − log log n < n < 1 + log log n is ox for any given > 0. Their proof involved an elaborate induction argument and was rather long and complicated. In 1934, Turán gave a simpler proof of their result. We begin by presenting Turán’s proof. After that, we discuss the application of the technique to a wider context. In the next chapter, we will isolate a combinatorial sieve method from it, which we call the Turán sieve. 32
3.1 A theorem of Hardy and Ramanujan Theorem 3.1.1
We have
n = x log log x + Ox
n≤x
and
2 n = xlog log x2 + Ox log log x
n≤x
Proof Let us observe that, by Theorem 1.4.4, x n = n≤x p≤x p =x
1 + Ox p≤x p
= x log log x + Ox
Also,
2 n =
n≤x
1
n≤x pn qn
=
1
pq≤x n≤x pn qn
=
pq≤x p=q
=
pq≤x
=x Now,
1 √ p p≤ x
2
x x + pq p≤x p
x + Ox log log x pq
1 + Ox log log x
pq≤x pq 2 1 1 ≤ ≤
pq≤x pq p≤x p
Since 1 √ = log log x + O1 = log log x + O1 √ p p≤ x
33
The normal order method
34 we find that
2 n = xlog log x2 + Ox log log x
n≤x
This completes the proof. Now consider the variance 2 n − log log x2 = n − 2log log x n + log log x2 1 n≤x
n≤x
n≤x
n≤x
which is easily seen to be Ox log log x by what we have proven above. This shows: Theorem 3.1.2 (Turán) n − log log x2 = Ox log log x
n≤x
Corollary 3.1.3 Let > 0. The number of n ≤ x that do not satisfy the inequality n − log log x < log log x 2 + 1
is ox. Proof Indeed, if n ≤ x does not satisfy the inequality, then a summand coming from n satisfies n − log log x ≥ log log x1/2+
The theorem implies that the number of such n ≤ x is O xlog log x−2 = ox
Hardy and Ramanujan prove, in fact, a stronger theorem, getting a more precise estimate for the exceptional set. As already mentioned, the proof we have given is due to Turán (see, for example, [28, pp. 354–8]). Related to the above results there is a celebrated theorem of Erdös and Kac (see [11, 12]), which states that, if for ≤ , n − log log n ≤ Sx = # n ≤ x ≤ log log n
3.2 The normal number of prime divisors of a polynomial then
35
1 −t2 /2 Sx =√ e dt
x→ x 2 lim
The integral is the familiar probability integral associated with the normal distribution. Thus, the theorem of Erdös–Kac says that the function n − log log n log log n is ‘normally distributed’ in a suitable probabilistic sense. We refer the reader to [11, 12] for further details. We say that a function fn has normal order Fn if, for every > 0, the inequality 1 − Fn < fn < 1 + Fn is satisfied for almost all values of n. That is, the number of n ≤ x that do not satisfy the inequality is ox. It is now not difficult to establish that n has normal order log log n (see Exercise 1).
3.2 The normal number of prime divisors of a polynomial Let fx be an irreducible polynomial with integer coefficients. We would like to consider the problem of determining the normal order of fn. For this purpose, we proceed as in the previous section. The details follow. First, let us observe that if y n denotes the number of primes dividing n that are ≤ y and if y = x for some 0 < < 1/2, then for n ≤ x we have n = y n + n − y n = y n + O1 since the number of prime divisors of n greater than y is O1. We can therefore write fn = y fn + Ox
(3.1) n≤x
n≤x
Let us denote by f p the number of solutions modulo p of the congruence fx ≡ 0mod p. Then y fn = 1 (3.2) n≤x
n≤x pfn p≤y
36
The normal order method
so that, upon interchanging summation, we must count, for fixed p, the number of integers n ≤ x that belong to f p residue classes modulo p. We obtain xf p + Of p + Ox fn = (3.3) p n≤x p≤y and since f p ≤ deg f , we see that the error term arising from the above summation is Oy. At this point we need to invoke some algebraic number theory. Let K = where is a solution of fx = 0. The ring of integers K of K is a Dedekind domain. It is a classical theorem of Dedekind (see Theorem 5.5.1 of [17]) that for all but finitely many primes p, f p is the number of prime ideals of K such that the norm NK/ = p. If K x denotes the number of prime ideals whose norm is ≤ x, then the analogue of the prime number theorem for number fields asserts that x K x ∼ log x as x → . In fact, for some constant c > 0, K x = li x + Oxe−c where
x
li x = 2
√
log x
dt log t
is the famous logarithmic integral (let us note that, upon integration by parts, li x = x/log x + O x/log2 x ). Since the norm of any prime ideal is a prime power and since the number of prime ideals whose norm is not a prime cannot √ exceed O x log x, we deduce: Theorem 3.2.1 (The prime ideal theorem) √ f p = li x + Oxe−c log x p≤x
for some c > 0. By partial summation we deduce further: Corollary 3.2.2 f p p≤x
p
= log log x + O1
3.2 The normal number of prime divisors of a polynomial
37
We can now complete our analysis of the normal order of fn. By our earlier discussion ((3.1), (3.2), (3.3)) and the corollary above, fn = x log log x + Ox
n≤x
Also,
2 fn =
n≤x
y2 fn + O
n≤x
=
y fn
n≤x
y2 fn + Ox log log x
n≤x
We find by the Chinese remainder theorem that 2 xf pf q + O1 + Ox log log x y fn = pq pq≤y n≤x p=q
where the latter error term arises from terms where p = q. Since f p2 p
we have
p2
= O1
2 f pf q f p = + O1
pq p pq≤y p≤y p=q
Thus
y2 fn = xlog log x2 + Oy2 + Ox log log x
n≤x
Now we recall that y = x with 0 < < 1/2, and so the Oy2 error term is dominated by Ox log log x. This proves: Theorem 3.2.3
fn − log log x2 = Ox log log x
n≤x
It is now an elementary exercise to deduce the normal order of fn as log log n. Theorem 3.2.3 gives an estimate of x log log x
38
The normal order method
for the number of n ≤ x such that fn is prime. A classical conjecture of Buniakowski formulated in 1854 predicts that any irreducible polynomial fx ∈ x, such that f+ has no common divisor larger than 1, represents primes infinitely often. The only known case of this conjecture is the celebrated theorem of Dirichlet about the distribution of primes in an arithmetic progression, which settles it in the linear case. We will see later (using methods of Chapters 6 and 10) that sieve techniques can be applied to shed some light on this conjecture. In some cases, the methods come very close to settling it. One can also establish the analogue of the Erdös–Kac theorem for fn. This has been done in [49]. If f is not irreducible, but has r irreducible factors, then the prime ideal theorem implies rx f p ∼ log x p≤x as x → . It will be of interest to make this effective and investigate whether such a result can be used to give an efficient ‘primality test’ or ‘irreducibility test’ for an arbitrary f ∈ x. It would also be of interest to generalize these investigations to study polynomials of several variables. This leads to the study of polynomial congruences modulo p in several variables, about which much is known thanks to the spectacular development of modern algebraic geometry (see, for example, the excellent monograph of Ireland and Rosen [32] for an introduction).
3.3 Prime estimates We can investigate in a similar way p −1 as p varies over the primes. More precisely, let k be a natural number and define for a k = 1 the quantity x k a = # p ≤ x p ≡ amod k
Then it is easily seen that p≤x
p − 1 =
x 1
≤x
where we recall that denotes a rational prime. As before, it is convenient to observe that, by an application of Chebycheff’s theorem, x p − 1 = y p − 1 + O log x p≤x p≤x
3.3 Prime estimates
39
so that we obtain sums of the form x 1
≤y
to investigate with y = x for some > 0. A classical theorem of Bombieri and Vinogradov states that, for any A > 0, there is a B = BA > 0 so that
x li y
max max y k a −
y≤x ak=1 k logA x k≤x1/2 log−B x This theorem will be proven in Chapter 9. In fact, much of the development of the large sieve method (to be discussed in Chapter 8) culminated in its proof. We invoke this theorem to deduce li x x +O x 1 = logA x
≤y
≤y − 1 provided < 1/2. Since 1 1 = + O1
≤y − 1
≤y
we conclude that
p − 1 = x log log x + O
p≤x
x
log x
In a similar way we deduce that (see Exercise 9) 2 p − 1 = xlog log x2 + Ox log log x
p≤x
This establishes the following theorem of Erdös: Theorem 3.3.1 (Erdös’ theorem)
x log log x p − 1 − log log p = O
log x p≤x
2
Again, it is possible to establish the analogue of the Erdös–Kac theorem for p − 1. In a recent paper [38], it is shown how one may prove this theorem without using the Bombieri–Vinogradov theorem but rather a ‘weighted’ version of the Turan sieve discussed in the next chapter. A similar investigation can be made for the study of p−a for any integer a The case a = −2 is related to the well-known twin prime conjecture. To be precise, a prime p is called a twin prime if p + 2 is also a prime; it is
The normal order method
40
conjectured that there are infinitely many such primes. An interesting corollary of the analogues of Theorem 3.3.1 for p + 2 will be that the number of primes p ≤ x such that p + 2 is also a prime is x O
(3.4) log log x However, it is to be noted that we have proven (3.4) by assuming the Bombieri–Vinogradov theorem, which in turn is derived from the large sieve. The latter method is capable of yielding superior estimates for the number of twin primes, as will be discussed in Chapter 10.
3.4 Application of the method to other sequences Having in mind the main steps for the proofs of Theorems 3.1.2, 3.2.3 and 3.3.1, we deduce that the normal order method can be formalized as follows. Let = an be a finite sequence of natural numbers. Let 1 = and for each squarefree d, define d = an an ≡ 0mod d
For each squarefree d, we will write d X + Rd d
#d = or even
#d = d X + Rd where we think of X as an approximation to the cardinality of and of R1 as the error in the approximation. The function d = d/d is to be thought of as the ‘proportion’ of the elements of belonging to d . In particular, for primes p and q we write #p =
p X + Rp p
and #pq =
pq X + Rpq
pq
Now suppose that an = OnC
3.4 Application of the method to other sequences
41
for some positive constant C. As before, we can show that an = y an + Ox n≤x
n≤x
for some y = x and > 0. Then we find
an =
n≤x
p p≤y
p
X+
Rp + Ox
p≤y
and we see that, in order to proceed further, we would need the asymptotic behaviour of p p≤y
p
and an estimate for the sum of the error terms Rp
p≤y
Similarly, the study of
2 an
n≤x
would lead to finding an estimate for the sum Rpq
pq≤y
We will develop a general sieve out of this method in the next chapter. In the example of Section 3.1, an = n, X = x and p = 1. In the example of Section 3.2, an = fn with f an irreducible polynomial ∈ x, X = x and p = f p. In the case an = pn − 1 where pn denotes the n-th prime and pn ≤ x, we have X = x and p = p/p − 1. There are many other interesting applications of this method. For example, let g be a nonzero multiplicative function (that is, gmn = gmgn for any coprime integers m n), taking rational integer values and such that gn = 0 for every natural number n (this assumption can be relaxed somewhat (see [48])). Define g x d = #p ≤ x gp ≡ 0mod d
Let us assume: H0 for some > 0, gn ≤ n for all n; H1 for some > 0 g x d − dx d≤x
x log x
The normal order method
42
H2 for prime powers p q (p = q, p = p− + Op−−1 and p q = p− + Op−−1 q − + Oq −−1 where the implied constants are absolute. The following is proven in [49]: Theorem 3.4.1 Denote by gn − 21 log log n2 ≤ Nx = # n ≤ x √ log log n3/2 3 where g satisfies H0 H1 H2 above. Then 1 −t2 /2 Nx =√ lim e dt
x→ x 2 − Under the same hypotheses, one can also show that gp − log log p2 = Ox log log x p≤x
and, more generally, a theorem of Erdös–Kac type for gp. To be precise, if we let gp − log log p Sx = # p ≤ x ≤ log log p then
1 −t2 /2 Sx =√ e dt
x→ x 2 − lim
These results have been established in [48, 49]. Theorem 3.4.1 can, for instance, be applied to n. In [16], Erdös and Pomerance determined the normal order of n using the Bombieri– Vinogradov theorem. In [50], Murty and Saidak show that the same result can be established without this theorem and by using only the elementary sieve of Eratosthenes (to be discussed in Chapter 5). To cite a ‘modular’ example, consider the Ramanujan function n defined as the coefficient of xn in the infinite product x
m=1
1 − xm 24
3.5 Exercises
43
By invoking the theory of -adic representations (see [63, 64]), one can prove certain properties about the number of prime divisors of n. In 1916, Ramanujan conjectured that n is a multiplicative function. This was proven a year later by Mordell (see [61]). One can try to apply Theorem 3.4.1 to determine the normal number of prime factors of n whenever it is not zero. Such an investigation is caried out in [48, 49]. Hypothesis H0 is satisfied and hypotheses H1 and H2 are satisfied, as well, if we assume the generalized Riemann hypothesis for certain Artin L-functions. The method does lead to the interesting conclusion that p ≥ explog p1− for almost all primes p (that is, apart from o x/log x primes p ≤ x) and for any 0 < < 1. This is related to a classical conjecture of Lehmer (still unresolved) that p = 0. Finally, let us mention that the normal order method was recently applied by Ram Murty and F. Saidak to settle a conjecture of Erdös and Pomerance concerning the function fa p which is defined to be the number of prime factors of the order of a modulo p in /p∗ , albeit conditionally assuming the generalized Riemann hypothesis. Precise details can be found in [50].
3.5 Exercises 1.
Prove that
n − log log n2 = Ox log log x
n≤x
2.
Let y n denote the number of prime divisors of n that are less than or equal to y. Show that
y n − log log y
2
= O x log log y
n≤x
3.
Prove that
n − log log x2 = x log log x + Ox
n≤x
4.
Let that
n denote the number of prime powers that divide n. Show n has normal order log log n.
The normal order method
44 5.
Fix k ∈ and let a k = 1. Denote by n k a the number of prime divisors of n that are ≡ amod k. Show that n k a has normal order 1 log log n
k
6.
Let g be a non-negative bounded function defined on the primes and define gn = gp pn
An = Prove that
gp
p≤n p
gn − Ax2 = Ox log log x
n≤x
7.
Using Theorem 3.2.1, prove that there is a positive constant c such that f p 1 = log log x + c + O
p log x p≤x
If fx ∈ x is not irreducible, but has r distinct irreducible factors in x (hence in x), deduce that fn has normal order r log log n. 9. Using the Bombieri–Vinogradov theorem, prove that 2 p − 1 = xlog log x2 + Ox log log x
8.
p≤x
10. Let Px denote the greatest prime factor of
2 n +1
n≤x
Show that Px > cx log x for some positive constant c. 11. Show that, as x → , a2 + b2 = xlog log x + Ox a2 +b2 ≤x
where the summation is over all integers a b satisfying the inequality a2 + b2 ≤ x.
3.5 Exercises
45
12. Let p and q be distinct primes. We write pk n to mean that pk n and pk+1 does not divide n. Show that the number of natural numbers n ≤ x such that pa n and q b n is less than or equal to x 1 1 1− + 2
1− pa q b p q 13. Let f be an additive arithmetical function. Thus, by definition, fpk fn = pk n
where the summation is over prime powers pk dividing n exactly. If f is a non-negative function, show that fpa 1 − 1− fn ≥ x fpa
a p p n≤x pa ≤x pa ≤x 14. With f as in the previous exercise, show that 2 f n ≤ xE 2 x + xVx + 2
fpa fq b
pa q b ≤x p=q
n≤x
where Ex =
fpa 1 1 − a p pa ≤x p
and Vx =
fpa 2
pa pa ≤x
15. With f , Ex and Vx as in the previous exercise, show that fn − Ex2 xVx n≤x
by expanding the square on the left hand side and using the inequalities obtained in the previous two exercises. This result is a special case of the Turán–Kubilius inequality. [Hint: observe that ⎛ ⎞1/2 1/2 a 2 b 2 fp fq ⎜ a b⎟ fpa fq b ≤ p q ⎠ ⎝ a b p q pa q b ≤x pa q b ≤x pa q b ≤x p=q
by the Cauchy–Schwarz inequality.]
p=q
The normal order method
46
16. If f , Ex and Vx are as in the previous exercises and Vx tends to infinity as x tends to infinity, show that, for any > 0, the number of n ≤ x with fn − Ex ≥ Vx1/2+ is ox. 17. Extend the result of Exercise 15 to arbitrary real valued additive functions f , as follows. Define additive functions g1 n and g2 n by setting g1 pk = fpk if fpk ≥ 0 and zero otherwise, g2 pk = −fpk if fpk < 0 and zero otherwise. Then fn = g1 n − g2 n. If gj pk 1 Ej x = 1 − pk p pk ≤x then show that fn − Ex2 ≤ 2
2
gj n − Ej x2
j=1
by the Cauchy–Schwarz inequality. Deduce that fn − Ex2 xVx
n≤x
18. Extend the result of the previous exercise to complex valued additive functions by considering the real-valued additive functions g1 n = Refn and g2 n = Imfn. This establishes the Turán–Kubilius inequality for all complex-valued additive functions. 19. Show that the implied constant in the Turán–Kubilius inequality can be taken to be 32. 20. Show that the factor 32x implied by the previous exercise for the Turán–Kubilius inequality can be replaced by ⎞1/2 ⎛ 1/2 1 b ⎜ a b⎟ p q ⎠ +4 q
!x = 2x + ⎝ a pa ≤x p q b ≤x pa q b ≤x p=q
Deduce that lim sup x→
!x ≤ 2
x
4 The Turán sieve
In 1934, Paul Turán (1910–76) gave an extremely simple proof of the classical theorem of Hardy and Ramanujan about the normal number of prime factors of a given natural number. Inherent in his work is a basic sieve method, which was called Turán’s sieve by the authors of [37], where it was first developed. In this chapter, we will illustrate how this sieve can be used to treat other questions that had previously been studied using more complicated sieve methods. For example, the Turán sieve is more elementary than the sieve of Eratosthenes and in some cases gives comparable results.
4.1 The basic inequality Let be an arbitrary finite set and let be a set of prime numbers. For each prime p ∈ we assume given a set p ⊆ . Let 1 = and for any squarefree integer d composed of primes of let d = ∩pd p
Fix a positive real number z and set Pz =
p
p∈ p 1 and let Nn be the number of monic irreducible polynomials in p x of degree n. There are several ways of obtaining an exact formula for Nn . The simplest is via a technique of zeta functions, as follows. Consider the power series deg f T f
The Turán sieve
50
where the summation is over all monic polynomials f ∈ p x. Since the total number of monic polynomials f ∈ p x of degree n is pn , the power series is easily seen to be 1
pn T n = 1 − pT n=0 On the other hand, p x is a Euclidean domain and so it has unique factorization. Thus we can write an ‘Euler product’ expression for the power series above as
1 − T deg v −1 = 1 − T n −Nn v
n=1
where v runs over monic irreducible polynomials of p x. We therefore obtain
(4.3) 1 − pT−1 = 1 − T d −Nd
d=1
By using that − log1 − pT =
pn T n n=1
n
and taking logarithms in (4.3), we get pn T n n=1
n
= − log1 − pT = −
Nd log1 − T d
d=1
T de Tn = = dNd de n=1 n d=1 e=1
dNd
de=n
This proves: Theorem 4.2.1 Let Nd denote the number of monic irreducible polynomials of p x of degree d. Then dNd = pn
dn
Observe that an immediate consequence is pn n which can be viewed as the function field analogue of Chebycheff’s upper bound (1.1) for x. In fact, it is easy to deduce that (see Exercise 1) Nn ≤
Nn ∼
pn n
4.3 Counting irreducible polynomials in x
51
or, more precisely, Nn =
pn + Opn/2 dn n
where dn denotes the number of divisors of n. One can view this as the analogue of the prime number theorem (1.2) for p x. It is possible to invert the above expression for Nn and solve for it via the M¨obius inversion formula. This will give us an explicit formula for Nn . More precisely, by applying Theorem 1.2.2 we deduce from Theorem 4.2.1: Theorem 4.2.2 Let Nn denote the number of monic irreducible polynomials of p x of degree n. Then Nn =
1 dpn/d
n dn
4.3 Counting irreducible polynomials in x Fix natural numbers H and n > 1 We will apply the Turán sieve to count the number of irreducible polynomials xn + an−1 xn−1 + · · · + a1 x + a0 with 0 ≤ ai ≤ H, ai ∈ . We will prove that this number is H n + OH n−1/3 log2/3 H
Thus, a random polynomial with integer coefficients is irreducible, with probability 1. Let us observe first that if a polynomial is reducible over x, then it is reducible modulo p for every prime p. Our strategy will be to get an upper bound estimate for the number of reducible polynomials. Let = an−1 an−2 a1 a0 ∈ n 0 ≤ ai < H
We will think of the n-tuples an−1 a1 a0 as corresponding to the monic polynomials xn + an−1 xn−1 + · · · + a1 x + a0
We want to count the number of tuples of that correspond to irreducible polynomials in x. So, let consist of all primes and for each prime p, let p denote the subset of tuples corresponding to irreducible polynomials
The Turán sieve
52
modulo p. Let z = zH be a positive real number to be chosen later. Then S z represents an upper bound for the number of reducible polynomials in x because if a polynomial belongs to p for some prime p, then it is irreducible. We observe that has H n elements. If we specify a monic polynomial gx ∈ p x, then the number of elements of that, reduced modulo p, are congruent to gxmod p is n H + O1
p We will choose z satisfying z2 < H so that, for primes p < z, this expression can be written as n−1 H Hn +O
pn pn−1 From our previous discussion, the number of monic irreducible polynomials of degree n is pn + Opn/2 Nn = n where the implied constant depends on n. Thus the total number of polynomials in corresponding to irreducible polynomials of p x is n n−1 n n H H H Hn p n/2 + Op + O + O = + OH n−1 p
n n−1 p p n n pn/2 This implies that #p =
1 n H + OH n−1 p + OH n /pn/2 n
and, similarly, that for p = q #p ∩ q =
1 n H + OH n−1 pq + OH n /pn/2 + OH n /q n/2
n2
We can now apply Theorem 4.1.1 with p = 1/n and Rpq = OH n−1 pq + OH n /pn/2 + OH n /q n/2
By using Chebycheff’s bound, we deduce: Theorem 4.3.1 For as above, n ≥ 3 and z2 < H, we have S z
H n log z + H n−1 z2
z
4.4 Square values of polynomials
53
By choosing z = H 1/3 log H1/3 we obtain: Theorem 4.3.2 Let n ≥ 3. The number of reducible polynomials xn + an−1 xn−1 + · · · + a1 x + a0 is O H n−1/3 log H2/3 .
0 ≤ ai ≤ H ai ∈
For n = 2, the above analysis leads to an estimate of O H 5/3 log H2/3 log log H for the number of reducible monic quadratic polynomials, as the reader can easily verify. However, one can get a sharper estimate of OH log H for this, directly (see Exercise 6). Gallagher [20] obtained the sharper estimate of OH n−1/2 log H by using a higher dimensional version of the large sieve inequality. One conjectures that the optimal exponent should be n − 1, and, by Exercise 6, this is best possible.
4.4 Square values of polynomials Let fx ∈ x be a polynomial with non-zero discriminant disc f and which is not the square of another polynomial in x. Let H > 0 We consider the question of estimating # n ≤ H fn is a perfect square
This question was discussed earlier in Chapter 2 in the context of the square sieve. We will now apply the Turán sieve by way of illustration of how such questions can be tackled by this sieve. As noted earlier, Siegel’s theorem concerning integral points on hyperelliptic curves gives us a universal bound (depending on f ). The novelty here is that, on the one hand, we will not involve the deep work of Siegel. On the other hand, the method will have a wider range of applicability, such as the case of polynomials of several variables. Let = n n ≤ H and for each prime p disc f , let p = n ≤ H fnmod p is not a square
The Turán sieve
54
By the result of Weil quoted in Chapter 2 (see [35, p. 94]),
fa √
≤ deg f − 1 p
amod p p and so we find that the number of nmod p such that y2 ≡ fnmod p √ for some integer y is p/2 + O p Hence, number of nmod p so that √the fn is not a square modulo p is p/2 + O p Thus p 2H √ + O1 + O p #p = p 2 H = H +O √ +p
p Similarly, for distinct primes p q disc f , pq 2H √ √ #pq = + O1 + O pq + p q pq 4 H H H + O pq + √ + √
= 2 p q As # = 2H + O1 we see that, using the notation of Section 4.1, p = 1/2 and the axioms of the Turán sieve are satisfied with H Rp = O √ + p p and
H H Rpq = O √ + √ + pq
p q
Applying Theorem 4.1.1 to our context, we obtain # n ≤ H fn is a square Choosing z =
H log z H 1 + √ + z + 2 Hz3/2 + z4 log z2
z z z
H 2/5 log H4/5
we get
# n ≤ H fn is a square = O H 4/5 log H2/5 which is inferior to the estimate obtained by the square sieve. We note, however, that if we consider the analogous problem of how often fn is a cube or a higher odd power, the square sieve is inapplicable and we
4.5 An application with Hilbert symbols
55
leave it to the reader to show that a similar estimate can be deduced by the above technique.
4.5 An application with Hilbert symbols Let H > 0 We now consider the problem of counting the number of pairs of integers a b with 1 ≤ a b ≤ H with the property that ax2 + by2 = z2 has a non-trivial rational point. We will need some properties of the Hilbert symbol. These will be reviewed here, but we also refer the reader to [61, Ch. 3] for a more detailed exposition. Let k be or the p-adic field p For a b ∈ k∗ we define the Hilbert symbol a b to be equal to 1 if ax2 + by2 = z2 has a non-trivial solution (i.e. different from (0,0,0)) and to be equal to −1 otherwise. The Hilbert symbol satisfies aa b = a ba b
If k = p we may write a bp for a b to indicate the dependence on p and if k = we may write a b . If k = p with p > 2 the Hilbert symbol can be computed using Legendre symbols, as follows. Let a = p u and b = p v with u v ∈ ∗p Then p−1 u v a bp = −1 2 p p where (u/p) denotes the Legendre symbol u/p, with u denoting the image of u under the homomorphism of reduction (modulo p). By a classical theorem of Hasse and Minkowski (see [61, p. 41]), we have that ax2 + by2 = z2 has a non-trivial rational point if and only if a bp = 1 ∀p and a b = 1
Now let be an arbitrary set of primes, let = a b 1 ≤ a b ≤ H and for each prime p ∈ let p = a b ∈ a bp = −1
The Turán sieve
56 Then the size of
\ ∪p∈ p gives us an upper bound for the number of a b with 1 ≤ a b ≤ H and for which ax2 + by2 = z2 has a non-trivial solution. Before applying Turán’s sieve, let us show a few auxiliary results. Lemma 4.5.1 Let p ≡ 1 (mod 4). The value of a bp depends only on the reduced residue classes of a and b modulo p2 if neither of them is divisible by p2
Proof We write a = p2 t + a1 b = p2 s + b1 with 1 ≤ a1 b1 ≤ p2 − 1 If a1 b1 are both coprime to p, then a bp = a1 b1 p = 1 If pa1 and b1 is coprime to p, then b1
a bp = p The case pb1 and a1 coprime to p is similar. If pa1 and pb1 then a bp = a1 b1 p . This completes the proof. Lemma 4.5.2 Let p ≡ 1mod 4 Then 2 H H2 4 2 +O + p + Hp #p =
p p2 Proof We first remove from p those elements a b with at least one component divisible by p2 The number of such pairs is clearly OH 2 /p2 . The remaining elements are enumerated by the sum 1 1 − a bp
2 1≤ab≤H p2 ab
By the previous lemma, the value of a bp depends only on the reduced classes of a bmod p2 Thus 2 H a bp = a1 b1 p + O1 p2 1≤ab≤H 1≤a b ≤p2 −1 1
p2 ab
=
H2 p4
1
1≤a1 b1 ≤p2 −1
a1 b1 p + O p4 + Hp2
4.5 An application with Hilbert symbols
57
The summation on the right-hand side is easily determined. If a1 b1 are both coprime to p we have a1 b1 p = 1 and the contribution to the sum from such terms is pp − 12
If pa1 and b1 is coprime to p, then a1 b1 p =
b1 p
and the contribution to the sum from such terms is zero. The same is the case if pb1 and a1 is coprime to p. If pa1 , pb1 then a1 = b1 = p and we get an error of O H 2 /p4 . Hence the final result is that 2 H H2 4 2 +O + p + Hp
#p = p p2 Lemma 4.5.3 Let p q ≡ 1mod 4 be distinct primes. Then 2 H H2 H2 4 4 2 2 #p ∩ q = +O + + p q + Hp q
pq p2 q2 Proof As before, we remove from p and q any pair a b with at least one of a b divisible by p2 or q 2 The remaining elements are enumerated by 1 1 − a bp 1 − a bq
4 1≤ab≤H p2 ab q 2 ab
Expanding the product in the summand leads to the consideration of four sums, three of which can be treated by the previous lemma. Thus we need only consider a bp a bq
1≤ab≤H p2 q 2 ab
By Lemma 4.5.1, each of the summands depends only on the residue class of a bmod p2 q 2 As before, we partition the sum according to the residue classes modulo pq and find that it equals 2 2 p −p q −q 2 + O pq4 + Hp2 q 2
H 2 2 p q
The Turán sieve
58
By putting everything together we find 2 H H2 H2 4 2 2 #p ∩ q = +O + 2 + pq + Hp q pq p2 q as desired. We can now apply the Turán sieve to prove: Theorem 4.5.4 The number of pairs of integers a b with 1 ≤ a b ≤ H and for which ax2 + by2 = z2 has a non-trivial rational point is
H2
log log H
Proof By the Turán sieve we find that the number in question is
H2 + z10 + Hz6 log log z
where is chosen to be the set of primes p ≡ 1mod 4 with p ≤ z Choosing z = H 1/6− for any > 0 gives us the required result. Remark 4.5.5 It can be seen, by invoking the large sieve, that this estimate can be refined to H log H for some > 0 This is proven in [65].
4.6 Exercises 1.
With Nn as defined in Theorem 4.2.1, show that Nn =
2.
pn + O pn/2 dn n
where dn denotes the number of divisors of n. Show that x4 +6x2 +1 is irreducible in x but reducible modulo p for every prime p. [Hint: consider x2 + bx ± 1x2 − bx ± 1mod p for suitable b.] (One can prove that an irreducible polynomial fx ∈ x of degree n is reducible modulo p for every prime p if and only if the Galois group of f has no cycles of order n. This is a consequence of the Chebotarev density theorem.)
4.6 Exercises 3.
59
Prove that the number of irreducible polynomials xn + an−1 xn−1 + · · · + a1 x + a0
0 ≤ ai < H ai ∈ that are reducible modulo p for every prime p is O H n−1/3 log H2/3 . 4. Let be the set of natural numbers ≤ x and let be the set of primes p < z. Use Theorem 4.1.1 to show that x
x log log x
5.
Note that only a lower bound for Uz is required and hence the relevance of Remark 1.4.5 made at the end of Chapter 1, where Chebycheff’s theorem was not used. Prove that the number of reducible polynomials xn + an−1 xn−1 + · · · + a1 x + a0
0 ≤ ai ≤ H ai ∈
is H n−1 . 6. Show that the number of reducible polynomials x2 +ax+b with a b ∈ and 0 ≤ a b ≤ H is OH log H. Can this estimate be sharpened? 7. If k = , show that the Hilbert symbol a b is 1 if a or b is > 0 and is −1 if a and b are < 0
8. Modify the proof of the main result of Section 4.5 to deduce that the number of integers a b with a b ≤ H and for which ax2 + by2 = z2 has a non-trivial rational solution is H2
O log log H 9.
Prove that there is a constant c > 0 such that x log log x 2 n − log log x = x log log x + cx + O
log x n≤x
10. Let dn denote the number of divisors of n. Show that log dn has normal order log 2 log log n. [Hint: let n be the number of prime factors of n, counted with multiplicity, then 2n ≤ dn ≤ 2 n .] 11. Prove the inequality
2
fp
fp2 fp −
≤ 2x p p n≤x
pn x1/2 0 and y = x . By considering numbers of the form n = mp1 pk ≤ x with 2
x − < pi < x
1 ≤ i ≤ k = 1/
show that x y x where the implied constant depends on . 13. Let > 0 be fixed. By applying the method of proof of Theorem 4.1.1, show that the number of primes p ≤ x such that the smallest quadratic non-residue modulo p is greater than x is bounded by a constant depending on . [Hint: let be the set of numbers n ≤ x and let be the set of primes p ≤ x1/4 for which the least quadratic non-residue is > x . Let p be the set of elements of that are quadratic non-residues modulo p. Observe that S x1/4 ≥ x x ] (This result was first obtained by U. V. Linnik (1915–72) as an application of his large sieve inequality.) 14. Let f be a non-negative valued function and let z be as in Section 4.1. Assume that (4.1) and (4.2) hold. Define, for a ∈ , fp Nf a = p a∈p
Uf z =
p fp
pPz
Show that Nf a − Uf z2 = X fp2 p 1 − p a∈
pPz
+
pqPz
fpfqRpq − 2Uf z
fpRp
pPz
15. With notation as in the previous exercise and Theorem 4.1.1, show that S z ≤
XVf z 2 + fpRp 2 Uf z Uf z pPz +
1 fpfqRpq 2 Uf z pqPz
4.6 Exercises where
Vf z =
61
p fp2
pPz
16. With Uf and Vf as in the previous exercise, show that Uf z2 ≤ UzVf z
Infer that the optimal estimate for S z implied by Exercise 15 is the one given by Theorem 4.1.1 (namely, the choice f ≡ 1). 17. Let and be as in Section 4.1. For each a ∈ , let wa be non-negative real number (usually called a ‘weight’) and set wa
X = a∈
Suppose that, for p ∈ ,
wa = p X + Rp
a∈p
and, for distinct primes p q ∈ , wa = p q X + Rpq
a∈pq
Let Na be the number of p ∈ for which a ∈ p and let p
Uz = pPz
Prove that wa Na − Uz2 = X p 1−p + Rpq −2Uz Rp
a∈
pPz
pqPz
pPz
18. Let y = log x and for each natural number n ≤ x define wn to be 1 if n has no prime factor ≤ y and zero otherwise. Show that wn n − log log x2 = xVy log log x + OxVy n≤x
where
1 Vy =
1− p p≤y
[Hint: let Py denote the product of the primes ≤ y. Observe that d
wn = dnPy
Now apply the previous exercise.]
62
The Turán sieve
19. Deduce from the previous exercise that the number of primes p ≤ x is x
O log log x2 Compare this with the result obtained in Exercise 4 above.
5 The sieve of Eratosthenes
Eratosthenes (276–194 bce) was the director of the famous library of Alexandria. The sieve of Eratosthenes was first described in the work of Nicomedes (280–210 bce), entitled Introduction to Arithmetic. The form of the sieve of Eratosthenes we describe in Section 5.1 below is due to A. M. Legendre (1752–1833) and was written down by him in 1808 in the second edition of his book Théorie des Nombres. In this chapter, we treat this material from a modern perspective and show that when it is combined with ‘Rankin’s trick’, the sieve of Eratosthenes becomes as powerful as Brun’s pure sieve, the latter being more cumbersome and awkward to derive.
5.1 The sieve of Eratosthenes We will use the fundamental property of the M¨obius function given in Lemma 1.2.1 to study the number x z = # n ≤ x n is not divisible by any prime < z where x z are positive real numbers. This will be used as a motivating example that will signal a more formal development of the sieve of Eratosthenes, undertaken in a later section. If we let Pz =
p 0 consider 1 1 + " = 1+" n n=1 and recall that we have the Euler product −1
1
1 + " = 1 − 1+" p p Thus 1 1+" p p " 1 1 log 1 − 1+" + 1+" =− p p p
f" = log 1 + " −
so that c0 = lim f"
"→0
By Exercise 2 we know that, for " > 0, 1 + O" " = − log1 − e−" + O"
log 1 + " = log
=
n=1
e−"n n−1 + O"
5.2 Mertens’ theorem Put Ht =
1 n≤t
and Pt =
67
n
1 p≤t
p
Then, by partial summation, Pu 1 = " du = " Pet e−"t dt
1+" 1+" p u 1 0 p Similarly, log 1 + " = " Hence
f" = "
0
0
e−"t Htdt + O"
e−"t Ht − Pet dt + O"
Since Ht = log t + + O and
1 t
1 P e = log t + c1 + O t t
we deduce f" = "
e 0
−"t
1 − c1 + O t+1
dt + O"
= − c1 + O"
Letting "→0 gives f0 = c0 = − c1
This proves: Theorem 5.2.1 Vz =
p 0, we have
x 1 −1 x z = 1≤ ≤ x
1− n p n≤x n≤x p 0, x k a
x k logx/k
uniformly for k ≤ x . (This result is called the Brun–Titchmarsh theorem. Sharper results will be derived in later chapters.) 2. Observing that 1
1 −1 1 2 d 1 = 1− k k pk p k dk d deduce that
1 log x
k≤x k 3.
Prove that 2 d k =
k dk d Deduce that
k = Ax + Olog x k≤x k 4.
5.
for some positive constant A. Using partial summation, deduce from the previous exercise that there are positive constants A B such that 1 log x = A log x + B + O
x k≤x k Show that if dn is the divisor function, then dp − 1 = Ox
p≤x
6.
Let n denote the number of prime factors of n. Show that p≤x
p − 1
x log log x log x
where the summation is over prime numbers.
Brun’s sieve
110 7.
Let n denote the number of prime factors of n counted with multiplicity. Show that
p − 1
p≤x
8.
Show that
2 d dn
9.
d
≤
x log log x
log x
6e log log n + O1
2
Show that the number of n ≤ x not having a prime factor ≡ 3mod 4 is x
log x
10. Let 1 < y ≤ x. Show that the number of primes in the interval x x +y is y
log y 11. Prove that the number of primes in the sequence n2 + 1 with n ≤ x is x
log x 12. Show that there are infinitely many odd numbers that cannot be written as p + 2m . [Hint: consider numbers of the form 2n − 1.] 13. For a fixed positive integer m, show that the number of solutions of the equation p − 1 = mq with p and q prime ≤ x is
x m log2 x/m
14. Using Brun’s sieve, deduce that x
x
log x
15. Using the prime number theorem, show that, as x tends to infinity, pn − pn−1 x ≤ 1 + o1 log p log x n pn ≤x where the summation is over prime numbers pn ≤ x arranged in increasing order. Deduce that p − pn−1 ≤ 1
lim inf n n→ log pn
6.5 Exercises
111
A long-standing conjecture that the above limit is zero was recently proved by D. Goldston, J. Pintz and C. Yildirim. This represents a significant advance towards the twin prime conjecture (see Section 10.2). 16. Using Brun’s sieve, show that the number of solutions of a = p i − pj
pi pj ≤ 2x
where pi pj denote prime numbers, is bounded by 1 cx
1+ p log2 x pa for some absolute constant c > 0. 17. Let > 0 and I denote the interval 1 − log x 1 + log x. Show that
1 log x 1+ p a∈I pa where the implied constant is absolute. 18. Let pn denote the n-th prime. Using the previous two results, show that for some > 0, p − pn < 1 −
lim inf n+1 n→ log pn [Hint: if the result is false, then for all n, pn+1 − pn ≥ 1 − log pn
Let p1 < p2 < · · · < pt be the primes in the interval x 2x. Clearly, t−1
pn+1 − pn ≤ x
n=1
Break the sum into two parts according as pn+1 − pn ∈ I or not.] 19. By observing that the numbers m! + 2 m! + 3 m! + m are all composite for any natural number m ≥ 2, deduce that for infinitely many n, log pn pn − pn−1 > log log pn where pn denotes the n-th prime. 20. With pn denoting the n-th prime, observe that pn − pn−1 = 1 + o1x
pn ≤x
Brun’s sieve
112 Let
Gx = maxpn − pn−1
pn ≤x
Show that Gx log x
Deduce that pn − pn−1 log pn for infinitely many n. 21. Show that the Schnirelman density of the set of squarefree numbers is strictly greater than 1/2. Using Mann’s theorem, deduce that every natural number can be written as the sum of two squarefree numbers. 22. Let k be a natural number and a k = 1. Let pk a denote the least prime ≡ a(mod k). Show that there are positive constants c1 c2 such that for k sufficiently large, pk a ≥ c1 k log k for at least c2 k progressions modulo k. [Hint: show that the sum x
1 x k ax k a − 1 1+ 2 p t≤x/k log x pkt ak=1 by an application of Brun’s sieve. Notice that by the Cauchy–Schwarz inequality, 2 2 x k a ≤ #a x k a = 0 x k a
ak=1
ak=1
Choose x = c1 k log k for a suitable c1 to deduce the result.]
7 Selberg’s sieve
In the 1940s Atle Selberg discovered a new sieve method in his research on the zeroes of the Riemann zeta function. There he developed his ‘mollifier method’ that later gave rise to Selberg’s sieve, and he used it to show that a positive proportion of the zeroes of the zeta function lie on the critical line Res = 1/2. The Selberg sieve is essentially combinatorial in structure. As such, it has been generalized to the context of partially ordered sets by R. Wilson [74]. It may be that its versatility has not been fully realized.
7.1 Chebycheff’s theorem revisited We recall that in Section 1.3 of Chapter 1 we used a combinatorial argument of Chebycheff to show that x = O x/ log x In the subsequent chapters, the use of Turán’s sieve, of the Eratosthenes sieve, and even of Brun’s pure sieve, led to weaker upper bounds for x namely x = O x/ log log x (see Exercise 4 of Chapter 4 and Proposition 5.1.1 of Chapter 5) and x = O x log log x/ log x (see Corollary 5.3.4 of Chapter 5). The proof of the latter result was based on the inclusion–exclusion principle expressed in the form x z =
d
dPz
1
(7.1)
d
(7.2)
n≤x dn
and, consequently, on the equation x z =
n≤x dnPz
113
Selberg’s sieve
114
where Pz is, as usual, the product of all prime numbers < z A refined analysis of (7.2) eventually led to the formula
log x 1 + O xlog z2 exp − (7.3) 1− x z = x p log z p z
7.1 Chebycheff’s theorem revisited This gives x z ≤ x
&d1 &d2
d1 d2 ≤z
d1 d2
+O
115
d1 d2 ≤z
&d1 &d2
(7.6)
We notice that if we also had &d ≤ 1 then (7.6) would give us an error term of O z2 which, for z < x is smaller than the error term provided by the general sieve of Eratosthenes (see (7.3)). Hence it is reasonable to hope that Selberg’s method will give an improvement to our upper estimates of x z and, consequently, of x
Let us estimate the main term in (7.6). The key observation is to view the sum &d1 &d2 d1 d2 ≤z
d1 d2
as a quadratic form in &d d≤z and to seek to minimize this form. Using that d1 d2 d1 d2 = d1 d2
(7.7)
and that
= d
(7.8)
d
(see Exercise 20 of Chapter 1), we can write
&d1 &d2
d1 d2 ≤z d1 d2
&d1 &d2
=
d1 d2 ≤z
d1 d2
&d1 &d2
=
d1 d2 ≤z
=
d1 d2
d1 d2
d1 d2 ≤z d1 d2
≤z
d1 d2
&d1 &d2 d1 d2
⎞2 ⎜ &d ⎟ = ⎝ ⎠
d≤z d ≤z ⎛
d
Therefore, under the transformation u =
&d d≤z d
d
(7.9)
Selberg’s sieve
116
the initial quadratic form has been diagonalized to u2
≤z
Our next aim will be to minimize this new diagonal form, if possible. We recall that the sequence &d was chosen subject to the constraints that &1 = 1 and &d = 0 for d > z Equation (7.9) tells us that we must also have conditions on u This can be seen by using the dual M¨obius inversion formula (Theorem 1.2.3). We obtain & d = ud
(7.10) d Thus we have the constraints u = 0 for any > z and
u = 1
(7.11)
z
For a ∈ , let
Da =
p
p∈ a∈p
with the convention that Da = 1 if a ∈ p for any p ∈ . Then dPz a∈d
d =
d ≤
dPzDa
2
&d
⎞2
⎛
⎜ ⎟ =⎝ &d ⎠
dPzDa
(7.16)
dPz a∈d
Now let us look at the cardinality S z. Using (7.16), we obtain S z = 1= d 1
=
a∈ ap ∀pPz
dPz
⎛
⎞
⎛
⎞2
⎜ ⎟ ⎜ ⎟ d⎠ ≤ &d ⎠ ⎝ ⎝ a∈
dPz a∈d
⎛ =
a∈d
⎜ ⎜ ⎝
a∈
d1 d2 Pz a∈d1 d2
a∈
⎞
dPz a∈d
⎟ &d1 &d2 ⎟ &d1 &d2 #d1 d2
⎠= d1 d2 ≤z
By (7.14) the above expression becomes ⎛ ⎞ &d1 &d2 ⎜ ⎟ +O⎝ &d1 &d2 Rd1 d2 ⎠
X fd d 1 2 d1 d2 ≤z d1 d2 ≤z d1 d2 Pz
d1 d2 Pz
Here, the first sum is viewed as the main term of our estimates, while the O-sum is viewed as the error term. As in Section 7.1, we also view the main
7.2 Selberg’s sieve
121
term as a quadratic form in &d d≤z which we seek to bring to a diagonal form and to minimize. By Lemma 7.2.2 and (7.14), d1 d2
&d1 &d2
=
fd1 d2 Pz
d1 d2 ≤z d1 d2 Pz
&d1 &d2 fd1 fd2
=
d1 d2 ≤z d1 d2 Pz
=
&d1 &d2
fd1 fd2 d d1 d2 ≤z d1 d2 Pz d1 d2
&d1 &d2 fd1 fd2
⎛
=
≤z Pz
f1
1 d2
f1
≤z Pz
fd1 d2
⎞2
⎜ & ⎟ ⎜ d ⎟ f1 ⎜ ⎟
⎝ d≤z fd ⎠ dPz d
Thus, under the transformation &d d≤z fd
u =
(7.17)
dPz d
our quadratic form is reduced to the diagonal form f1 u2
(7.18)
≤z Pz
The dual M¨obius inversion formula enables us to write & d = ud f dPz d
and so, by recalling that &d = 0 for d > z and &1 = 1 we obtain u = 0 for any > z and ≤z Pz
u = 1
(7.19)
Selberg’s sieve
122 Then we can write, as before, ≤z Pz
f1 u2
=
≤z Pz
f1 u − f1 Vz
2 +
1 Vz
from which we see immediately that the minimal value of the quadratic form given in (7.18) is 1/Vz and is attained at u =
f1 Vz
(7.20)
We note that here we have used that f1 p = fp − 1 > 0 for any p ∈ hence, by the multiplicativity of f1 · that the coefficients f1 d appearing in our form are positive. It remains to analyze the error term ⎞ ⎛ ⎟ ⎜ &d1 &d2 Rd1 d2 ⎠
O⎝ d1 d2 ≤z d1 d2 Pz
More precisely, our aim is to find upper bounds for & for all ≤ z Pz
From (7.17) and (7.19) we obtain that for such we have d/ d
f1 d≤z
Vz& = f
dPz d
2 t f1 tf1 z t≤
= f
tPz t=1
fp = f p p 1
2 t f1 t z t≤
=
tPz t=1
p
1+
1 f1 p
2 t
f1 t z t≤
tPz t=1
Therefore Vz& ≤ Vz
7.2 Selberg’s sieve
123
and so & ≤ 1
Consequently, the error term becomes ⎛ ⎞ ⎜ ⎟ O⎝ Rd1 d2 ⎠ d1 d2 ≤z d1 d2 Pz
and this completes the proof of the theorem. In order to use Theorem 7.2.1 we need lower bounds for the quantity Vz
Useful bounds are given as follows. Lemma 7.2.3 We keep the notation of Theorem 7.2.1. We let f˜ · be the completely multiplicative function defined by f˜ p = fp for all primes p and we set
¯ Pz = p
p p 0 there exists x0 = x0 > 0 such that x k a ≤ for all x > x0
2 + x k log2x/k
Selberg’s sieve
126 Proof
We fix a positive real number z < x and observe that
x k a = z k a + # z ≤ p ≤ x p ≡ amod k ≤ z + # n ≤ x n ≡ amod k n ≡ 0mod p ∀p < z p k = 1
If we choose = n ≤ x n ≡ amod k = p p k = 1 and p = n ≤ x n ≡ amod k n ≡ 0mod p for all p ∈ Pz =
p
p k. Any further improvement of the constant 2 in the above estimate would imply the non-existence of Siegel zeros, as noted by several authors beginning with Chowla, Motohashi and Siebert. See [47] for a historical discussion. Following Erd¨os, we apply the Brun–Titchmarsh theorem, Brun’s sieve and the sieve of Eratosthenes to the problem of finding an asymptotic formula for the number of n ≤ x such that n n = 1. One can prove, using Sylow theorems, that this is a necessary and sufficient condition for any group of order n to be cyclic. Theorem 7.3.3 (Erd¨os) The number of n ≤ x such that n n = 1 is ∼
e− x log log log x
as x→ . To prove this theorem, we will need the following preliminary results. Lemma 7.3.4 Let 0 < < 1 and let p < log log x1− . Then 1 c1 log log x > > log log x /2 q p q where the dash indicates that the sum is over primes q ≡ 1mod p satisfying 2 q < x1/log log x .
Selberg’s sieve
128
Proof Using the Siegel–Walfisz theorem stated above and partial summation, we deduce the assertion of the lemma. Lemma 7.3.5 Let p be any prime. Then 1 log log x + log p < c2
q p q≤x q≡1mod p
Proof
This follows from Theorem 7.3.1 and partial summation.
Lemma 7.3.6 Let 0 < < 1 and let z ≤ log log x1+ . Then the number of n ≤ x not divisible by any prime p ≤ z is 1 + o1
e− x log z
as x→ . Proof
This is a simple application of the sieve of Eratosthenes.
Lemma 7.3.7 Let 0 < < 1 and let p ≤ log log x1+ . Then the number of n ≤ x having the least prime divisor p is 1 + o1
e− x p log p
as p→ . Proof
This is a simple consequence of Lemma 7.3.6.
Proof of Theorem 7.3.3 Let Ax be the number of n ≤ x such that n n = 1. We partition these numbers into sets p according to the smallest prime divisor p of n and we denote by Ap x the number of elements in p . Then Ax = Ap x = 1 + 2 + 3 p
where in 1 we have p < log log x1− , in 2 we have log log x1− < p < log log x1+ and in 3 we have p > log log x1+ for any fixed 0 < < 1
7.3 The Brun–Titchmarsh theorem and applications
129
Observe that for each prime p the numbers enumerated by Ap x do not have any prime factor q ≡ 1mod p. By Brun’s sieve and Lemma 7.3.4,
x 1 x 1− exp −log log x /2
Ap x p q≡1mod p q p q<x1/log log x
Therefore
2
x
=o 1 log log log x
For
2
we use Lemma 7.3.7 to deduce that
2
1 x log log log x p p
where the dash on the sum indicates that log log x1− < p < log log x1+ . We find 1 p p so that
2
x
log log log x
Finally, by the sieve of Eratosthenes,
e− x 1 + log log log x since all the prime divisors of an n enumerated by 3 are greater than log log x1+ . On the other hand, ≤ 1 + o1 3
3
≥ 1 + o1
x e− x − − 1 + log log log x p>y p2 p>y q≡1mod pq≤x
x pq p
where y = log log x1+ . The penultimate sum is easily seen to be x O
log log x The final sum is estimated using Lemma 7.3.5. It is x
log log x + log p p>y
p2
x log log x
by partial summation. This completes the proof.
Selberg’s sieve
130
7.4 Exercises 1.
Let f be a multiplicative function and let d1 d2 be positive squarefree integers. Then fd1 d2 fd1 d2 = fd1 fd2
2.
Use the method of Lagrange’s multipliers to find the minimal value of the form f1 u2
≤z Pz
defined in (7.18). Show that if a set of primes has natural density then it has analytic density and this is also
4. Find an example of a set of primes that has analytic density, but does not have natural density. 5. Let a be a positive integer and x a positive real number. Show that, for any A B > 0
3.
xA log xB
x k a
≤k≤xA log xB
x log log x
log x2
ka=1
6.
For any natural number n recall that the radical of n is
radn = p
pn
Show that the number of n ≤ x such that radn and radn are coprime is e− x ∼ log log log x as x→ . (This gives an asymptotic formula for the number of n ≤ x such that all groups of order n are nilpotent. Since finite abelian groups are nilpotent, we infer an asymptotic formula for the number of n ≤ x such that all groups of order n are abelian.) [Hint: modify the method of proof of Theorem 7.3.3 to deduce the result.] 7. Using the Siegel–Walfisz theorem, show that q 0 such that #d =
X + Rd d
for some real number Rd with Rd ≤ 1. Show that in this setting, Theorem 7.2.1 can be sharpened to X
1 −1 + z2
S z ≤ 1− log z p z. Let &n denote the Liouville function, which is equal to −1r , where r is the total number of prime factors of n counted with multiplicity. Assuming that &n = Ox exp−clog x1/2 n≤x
for some constant c > 0 (this assumption is equivalent to a strong form of the prime number theorem), show that, for any < 1 and any z < x , we have x d + Oxlog z exp−c1 log x1/2
odd x z ≤ 2 d d Establish a similar result for even x z. Deduce that d 2 + o1 ≥
√ d log x d≤ x
Selberg’s sieve
134
(This example, due to Selberg [59], shows the parity problem of the sieve method. As can be seen, the method gives the same bound for √ √ even x x and odd x x, whereas √ √ x even x x = 0 odd x x = 1 + o1 log x as x → . )
8 The large sieve
The sieves discussed in the previous chapters rely on the use of the Möbius function or of its variations (as in the case of Selberg’s sieve). They are classified as ‘combinatorial sieves’. In this chapter we will discuss a sieve of a completely different nature, called the large sieve, which was introduced by Yuri Linnik (1915–72) in 1941 and was subsequently improved by Rényi (1950), Roth (1965), Bombieri (1965), Davenport and Halberstam (1966), Gallagher (1967), and others. We refer the reader to [8, p. 151] for the history. This sieve can be deduced from a beautiful inequality, known as the large sieve inequality, which, at a first sight, does not seem to have the big potential that it actually has. Linnik’s original motivation was to attack Vinogradov’s hypothesis concerning the size of the least quadratic non-residue np mod p. Vinogradov conjectured that np = O p for any > 0 The generalized Riemann hypothesis implies np = O log p2
Linnik proved, using his large sieve, that the number of primes p ≤ x for which np > p is O log log x Linnik’s paper introduced a new theme in analytic number theory that employed ideas from probability theory. As will be seen in the following chapters, the large sieve has evolved into a powerful tool, its significant application being the Bombieri–Vinogradov theorem. The latter theorem has often served as a substitute for the use of the generalized Riemann hypothesis in certain contexts, and will be discussed in detail in the next chapter.
135
The large sieve
136
8.1 The large sieve inequality We begin with a preliminary lemma. Lemma 8.1.1 Let F 0 1 −→ be a differentiable function with continuous derivative, extended by periodicity to all with period 1. Let z be a positive integer. Then 1 1
a
F d + F d
(8.1)
≤ z2
F d 0 0 d≤z 1≤a≤d ad=1
Proof Let d ≤ z, a ∈ 1 d ∩ with a d = 1, and ∈ 0 1. Then a = −F + F t dt
−F a d d By taking absolute value on both sides, this implies
a
≤ F + a F t dt
F d d
(8.2)
Now let us fix > 0 (to be chosen later) so that the intervals a a a = − + I d d d are contained in 0 1. We integrate (8.2) over I a/d, with respect to , and obtain
a
(8.3) 2 F
≤ a F d + a a F t dt d
d I d I d d Since ∈ I a/d and t ∈ a/d , we obtain that t ∈ I a/d. Hence the right-hand side of the above inequality is F d + F t dt d ≤ I da I da I da = F d + 2 F t dt I da I da = F d + 2 F d
I da I da In other words,
a
2 F
≤ a F d + 2 a F d
d I d I d
(8.4)
8.1 The large sieve inequality
137
Now we choose =
1
2z2
With this choice, the intervals I a/d, 1 ≤ a ≤ d a d = 1 d ≤ z do not overlap (modulo 1). Indeed, let x ∈ I a/d ∩ I a /d for a/d = a /d . Then
a
a
x − < x − < d d and so
a a
− < 2 = 1
d d z2
(8.5)
On the other hand, we have
a a ad − a d
− = = 0
d d dd since if ad = a d, then, recalling that a d = a d = 1, we obtain d = d , which is false. Thus
a a
− ≥ 1 ≥ 1
(8.6)
d d dd z2 Putting together (8.5) and (8.6) we are led to a contradiction. We sum (8.4) over all intervals I a/d and get 1 1
a
≤ F F d + F d
z2 d≤z 1≤a≤d d z2 a I da I da a I d I d ad=1 ≤
1 0
F d +
1 1 F d
z2 0
This completes the proof of the lemma. Now let us choose
F =
2 an en
n≤x
where an n≥1 is an arbitrary sequence of complex numbers, x is a positive integer and for a rational number t, et = exp2it
For simplicity of notation, set an en S = n≤x
The large sieve
138 hence
F = S2 F = 2SS
By Lemma 8.1.1 we obtain 1 1
a
2 S2 d + 2 SS d
≤ z2
S d 0 0 d≤z 1≤a≤d ad=1
We now recall Parseval’s identity that
2
1
an 2
an en d =
0 n≤x n≤x thus
1
0
S2 d =
an 2
n≤x
This implies that 1
a
2 an 2 + 2 SS d
≤ z2
S d 0 n≤x d≤z 1≤a≤z ad=1
For the second term on the right-hand side of the above inequality we use the Cauchy–Schwarz inequality and, once again, Parseval’s identity. We obtain 1 1/2 1 1/2 1 2 2 SS d ≤ S d S d 0
0
≤
0
n≤x
≤ 2x
1/2 an
2
1/2 4 n an 2 2
2
n≤x
an 2
n≤x
We record this result as: Theorem 8.1.2 (The large sieve inequality) Let an n≥1 be a sequence of complex numbers and let x z be positive integers. Then
na 2
an 2 (8.7)
an e
≤ z2 + 4x
d n≤x d≤z 1≤a≤d n≤x ad=1
where, for a rational number e = exp2i
8.2 The large sieve
139
Montgomery and Vaughan [42], and Selberg [60] have independently shown that z2 + 4x can be replaced by z2 + x .
8.2 The large sieve We want to deduce a sieve method from the inequality described in Theorem 8.1.2. Let be a set of positive integers n ≤ x and let be a set of primes. For each p ∈ suppose that we are given a set w1p wpp of p residue classes modulo p Let z be a positive real number and denote by Pz the product of the primes p ∈ p < z We set (8.8) z = n ∈ n ≡ wip mod p ∀1 ≤ i ≤ p ∀pPz and we denote by S z the cardinality of this set. Theorem 8.2.1 (The large sieve) With the above notation, we have S z ≤ where Lz =
d≤z
z2 + 4x Lz
2 d
pd
p
p − p
The (unusual) idea for the proof of Theorem 8.2.1 is to use sums of the form na cd n = (8.9) e d 1≤a≤d ad=1
where n d ∈ , called Ramanujan sums. They have the following interesting properties: Proposition 8.2.2
Let d d be positive integers. Then
1. if d d = 1 we have that cdd n = cd ncd n d 2. cd n = D D Ddn 3. if d n = 1 we have that cd n = d, that is, na d = e
d 1≤a≤d ad=1
The large sieve
140
Proof Part 1 of the proposition is left to the reader as an exercise. Part 3 is a straightforward consequence of part 2. We now prove part 2. Let na
e c˜ d n = d 1≤a≤d On one hand we can write c˜ d n = e
n
d
0≤a≤d−1
e
na d
en−1 and we see that this is e n/d en/d−1 if d n and e n/d d if dn In other words, 0 if d n (8.10) c˜ d n = d if dn.
On the other hand, we can write na c˜ d n = e d Dd 1≤a≤d ad=D
=
Dd 1≤a1 ≤ d D a1 Dd =1
nDa1 e d
=
c Dd n
Dd
By using the Möbius inversion formula we deduce that d d cd n = D˜c D n = c˜ D n D Dd Dd which, by (8.10), is
d D
D Ddn
Proof of Theorem 8.2.1 First, let us set some notation. Let d = p1 pt be a positive squarefree integer composed of primes dividing Pz By the Chinese Remainder Theorem, for any i = i1 it with 1 ≤ i1 ≤ p1 1 ≤ it ≤ pt there exists a unique integer wid such that wid ≡ wij pj mod pj ∀1 ≤ j ≤ t
We denote by d the number of all possible wid appearing in this fashion (namely, as we vary i, but keep d fixed). Clearly, d is the product of the pi ’s.
8.2 The large sieve
141
Now let n ∈ z This implies that n − wid d = 1 for any d and i as above, and so by part 3 of Proposition 8.2.2 we obtain n − wid a e
(8.11) d = cd n − wid = d 1≤a≤d ad=1
We sum (8.11) over all indices i corresponding to d and over all integers n ∈ z and get na −wid a ddS z =
(8.12) e e d d 1≤a≤d wid n∈z ad=1
Squaring (8.12) out and applying the Cauchy–Schwarz inequality gives ⎞ ⎛
−w a 2 id ⎜
⎟
ddS z2 ≤ ⎝
⎠
e
d 1≤a≤d wid ad=1
⎞
na 2 ⎜
⎟ e ×⎝
⎠
d 1≤a≤d n∈z ⎛
ad=1
We write the first factor in the above expression as wi d − wid a e cd wi d − wid = d wid wi d 1≤a≤d wid wi d ad=1
and, further, by using part 2 of Proposition 8.2.2, as d d D= D D D wid wi d wid wi d Ddw −w Dd id i d wid ≡wi d mod D d d Dd = D D Dd = dd
EE Ed
= dd
pd
= d
pd
E
p 1− p
p − p
The large sieve
142 This gives us that
d dS z2
⎞
2
na ⎟ ⎜ ≤ d p − p ⎝ e
⎠
d 1≤a≤d n∈z pd ⎛
ad=1
or, equivalently, that dS z 2
2
pd
na 2
p
≤ e
p − p 1≤a≤d n∈z d
(8.13)
ad=1
Now we sum (8.13) over d ≤ z and we use Theorem 8.1.2 with the sequence an n≥1 chosen such that an is 1 if n ∈ z and 0 otherwise. We obtain d≤z
d2 S z2
pd
p ≤ z2 + 4x S z p − p
which, after doing the obvious cancelations, completes the proof of the theorem. Remark 8.2.3 When using Theorem 8.2.1, we need lower bounds for the sum Lz One way of obtaining a lower bound is by considering the summation over primes p < z and not over all integers d that is, Lz ≥
p
p − p p 0 By Exercise 17 with = lognm and = log u, we obtain T sint log u dt an bm nm = At Bt t −T n≤x m≤y nm≤u 1 an bm
+O (8.20)
T n≤x m≤y log nm u where At =
an n n≤x
nit
Bt =
bm m
mit m≤y
We note that, by our assumption on u we have
nm
1 1
log
u u xy and sint log u min 1 t log2xy
Then using the above in (8.20) gives
" T
1
log2xy dt an bm nm At Bt min
t −T
n≤x m≤y
nm≤u
+
xy a b
T n≤x m≤y n m
The large sieve
146
By applying the Cauchy–Schwarz inequality to the last sum, we get that
" T
1
log2xy dt a b nm At Bt min n m
t −T
n≤x m≤y
nm≤u 1/2 1/2 x3/2 y3/2 + an 2 bm 2
T n≤x m≤y We take the maximum over u and then sum over and d ≤ z to obtain
d ∗
max an bm nm
u d
n≤x m≤y d≤z nm≤u " d ∗ T 1 log2xy dt At Bt min t −T d≤z d 1/2 1/2 x3/2 y3/2 d ∗ 2 2 + an bm
(8.21) T n≤x m≤y d≤z d Now we use the Cauchy–Schwarz inequality, Corollary 8.3.2 and Exercise 15 to obtain upper bounds for the first term on the right-hand side of (8.21): " d ∗ T 1 log2xy dt At Bt min t −T d≤z d ⎛
2 ⎞1/2 ⎛
2 ⎞1/2
d ∗
d ∗
≤⎝
bm m ⎠
an n ⎠ ⎝
n≤x m≤y d≤z d d≤z d " T 1 log2xy dt × min t −T 1/2 1/2 2 1/2 2 1/2 2 2 ≤ z + 4x z + 4y an bm n≤x
× log T + log2xy
m≤y
(8.22)
For the second term on the right-hand side of (8.21) we observe that there are d characters modulo d (see Exercise 4), hence 1/2 1/2 x3/2 y3/2 d ∗ 2 2 an bm T n≤x m≤y d≤z d 1/2 1/2 x3/2 y3/2 z2 2 2 ≤ an bm
(8.23) T n≤x m≤y
8.4 An average result
147
Putting together (8.21), (8.22) and (8.23), and choosing T = x3/2 y3/2 gives the desired inequality.
8.4 An average result In this section we deduce an average result of Barban, Davenport and Halberstam about the behaviour of the error term in the prime number theorem for primes in an arithmetic progression. Let x be a positive real number and a d integers with a d = 1 By partial summation, in order to study x d a it is actually enough to study the function n (8.24) x d a = n≤x an=1
where · is the von Mangoldt function. Hence our focus in this section will be on the function x d a instead of x d a. Theorem 8.4.1 (The Barban–Davenport–Halberstam theorem) For any A > 0 and for any z satisfying x/log xA ≤ z ≤ x we have
2
x d a − x ≤ xz log x
d d≤z 1≤a≤d ad=1
Proof We start by rewriting the error term x d a − x/d in a more convenient form. Namely, for any d ≤ z and 1 ≤ a ≤ d with a d = 1 we observe that 1 ax x d a = d mod d where the summation mod d is over (not necessarily primitive) Dirichlet characters modulo d and where nn
(8.25) x = n≤x
Let 0 denote the trivial character modulo d. We observe that x d a −
1 x 0 − x x = ax +
d d mod d d =0
(8.26)
The large sieve
148
By using the orthogonality relations, we see that
2
ax = d x 2
1≤a≤d mod d mod d
ad=1 = = 0
(8.27)
0
Now let us remark that if the character = 0 modulo d is induced by some primitive character 1 modulo d1 then x = x 1 + O log xlog d
(8.28)
Indeed, we have log x k x 1 − x = 1 p log p log p log xlog d
p≤x log p pk ≤x pd
pd
Hence, by using (8.26), (8.27) and (8.28), we get
2
x d a − x
d
d≤z 1≤a≤d ad=1
1 1 x 2 + x 0 − x2 d d d≤z d≤z mod d
=0
1 1 x 1 2 + x − x2 + zlog z2 log x2 d d d≤z d≤z mod d
=0
where x is defined in (1.3) of Chapter 1. Let us remark that the second term above can be estimated using the following form of the prime number theorem: x = x + O x exp −c log x (8.29) for some c > 0 (see [8, Chapter 18]). Thus
1 x2 log z xz log x x − x2 log xA d≤z d where A > 0 is as in the statement of the theorem.
8.4 An average result
149
Since the term zlog z2 log x2 is negligible compared to the desired final estimate of xz log x we now see that in order to prove the theorem it suffices to show that 1 x 1 2 xz log x
d d≤z mod d
(8.30)
=0
For each character = 0 modulo d and its associated primitive character 1 modulo d1 let us write d = d1 k for some positive integer k We have d≤z
1 x 1 2 d mod d =0
=
x 1 2
d1 ≤z 1 mod d1
=
∗ d≤z
where the summation
∗
x 2
k≤ dz
1 d1 k
1
k≤ dz
1 dk
is over primitive characters modulo d.
We observe that the innermost sum satisfies (see Exercise 15) k≤ dz
1 2z 1 log
dk d d
Therefore, in order to prove (8.30) it is enough to show that 1 2z ∗ log x 2 xz log x
d d d≤z
(8.31)
To do this, we remark that, by the first modified large sieve inequality applied to the sequence an = n we have
d ∗ x 2 z2 + x x log x d≤z d since
n≤x
(8.32)
n2 x log x (see Exercise 13 of Chapter 1).
We will now use this estimate in a slightly indirect manner. More precisely, let D = Dx be a parameter to be chosen later and such that 1 < D ≤ z and let us divide the interval D z into dyadic subintervals U 2U with U = z/2k
The large sieve
150
where k are integers running from 1 to logz/D/ log 2. By (8.32) together with partial summation, we are led to 2z ∗ 1 log x 2 d d U 0 and for any z satisfying x/log xA ≤ z ≤ x we have
2
x d a − li x ≤ xz
d d≤z 1≤a≤d ad=1
14. (The Cauchy–Schwarz inequality) Let f g a b −→ be differentiable functions, with continuous derivatives. Then
2 b
b b
2 2
ftgt dt ≤ ft dt gt dt
a
a
a
8.5 Exercises
153
15. Let z be a positive real number and d a positive integer. Show that k≤ dz
1 2z 1 log
dk d d
16. Let y > 0 c > 0 T > 0 ⎧ ⎪ ⎪ ⎨0 y =
1/2 ⎪ ⎪ ⎩1
and Iy T = Show that Iy T − y
1
1 c+iT ys ds
2i c−iT s
⎧ ⎨yc min 1 ⎩c
y=1
1 T log y
if
y = 1
if y = 1
T
17. Use the previous exercise to show that, for any T > 0 ∈ > 0 we have ⎧ 1 ⎪ + O if < ⎪ T− ⎨ T it sint dt = 2 e if = ⎪ t −T ⎪ ⎩O 1 if >
T−
18. Let T > 0 and x y be positive integers. Show that T 1 log2xy dt = O log T + log2xy
min t −T 19. Let be a set of Z integers. Let p be a prime and set na a S e = p p n∈ where, as before, et = exp2it. Let Zp h be the number of elements of that are ≡ h (mod p). Show that
2
Z 2
S a = p Zp h −
p p ap=1 hmod p
The large sieve
154
20. With notation as in the previous exercise, let d be an arbitrary positive integer and set a na e S =
d d n∈ Let Zd h be the number of elements of that are ≡ hmod d. Let a −ah e
S Td h = d d ad=1 Prove that dZd h =
T
d
Deduce that
d h
2 d
a
2 d Z h
S
=d d h=1 d ad=1
21. Let S
a d
=
an e
n∈
na d
where is a finite set of integers. For p prime, let an
Sp a = n∈ n≡amod p
and let p be the number of amod p for which Sp a = 0. By observing that S0 =
p
Sp a
a=1
and using the Cauchy–Schwarz inequality, show that
2
p
S a
S02 ≤
p − p p ap=1 22. We keep the notation of the previous exercise. Let d1 d2 be coprime positive integers, and let d = d1 d2 . Observing that by the Chinese remainder theorem we have
2
a
2 a1 a2
S
=
S d + d d 1 2 ad=1 a d =1 a d =1 1
1
2
2
8.5 Exercises deduce that
a
2
≥ S02 Jd1 Jd2
S d ad=1
where Jd =
pd
p
p − p
Then deduce that
2
a
2
p
≤
an 2 d
S
n∈ p − p ad=1 d pd (This gives an alternate method for deriving Theorem 8.2.1.)
155
9 The Bombieri–Vinogradov theorem
The Bombieri–Vinogradov theorem is considered one of the finest consequences of the large sieve method. Its virtue lies in the fact that in many questions that require the use of the generalized Riemann hypothesis for Dirichlet L-functions, one can apply the Bombieri–Vinogradov theorem. Thus, in situations involving ‘sufficiently many’ L-functions, the theorem can be regarded as a substitute for the generalized Riemann hypothesis. As we will see below, one application is to the Titchmarsh divisor problem. Another more complicated application can be found in Hooley’s monograph [31], where an asymptotic formula is derived for the number of ways of writing a natural number as the sum of a prime number and two squares. In 1976, Motohashi [43] discovered an induction principle that generalizes the Bombieri–Vinogradov theorem and makes it applicable in a wider context for general arithmetical functions. This perspective was useful in the work of Bombieri et al. [2, 3], where the range of summation in the classical √ Bombieri–Vinogradov theorem is extended beyond the x barrier. We refer the reader to Section 12 of [1], where a brief discussion of the new ideas can be found. In this chapter we will derive the classical Bombieri–Vinogradov theorem using a method of Vaughan, as described in [8]. We show that the method has a wide range of applicability by putting it in a general context. We then apply the results to treat the Titchmarsh divisor problem. In the next chapter, we will use the Bombieri–Vinogradov theorem in conjunction with the lower bound sieve method to count the number of primes p such that p + 2 has a bounded number of prime factors, a result that represents some advance towards the twin prime conjecture.
156
9.1 A general theorem
157
9.1 A general theorem The main result of this section is derived as a generalization of a method of Vaughan, which lies at the heart of one proof of the celebrated Bombieri– Vinogradov theorem. Before stating the general result, let us introduce the class of functions = D −→
Dn2 = O xlog x for some > 0 (9.1)
n≤x
We have the following basic properties: Proposition 9.1.1 1. If D ∈ and ≥ 0 then Dn n≤x
for some > 0
2. If D1 D2 ∈ then
n
x1− log x
D1 eD2 f def xlog x
ef ≤x
for some > 0 and
D1 eD2 f 2 def xlog x
ef ≤x
for some > 0 where for a positive integer e, de denotes the number of the divisors of e
Proof Exercise. Theorem 9.1.2 Let x z be positive integers and let As =
an n≥1
ns
Bs =
bn n≥1
ns
be normalized Dirichlet series (that is, a1 = b1 = 1) for which we write ˜ bn As cn 1 = = s Bs n≥1 n Bs n≥1 ns
The Bombieri–Vinogradov theorem
158
˜ for some cn bn ∈ We assume that all these series are convergent for Res > "0 for some "0 and that they satisfy the following hypotheses: (H1) ann≥1 is an increasing sequence of positive real numbers; ˜ (H2) b· b· c· ∈ (H3) there exist 0 ≤ < 1 and 0 ≤ < 1 such that, for any non-trivial Dirichlet character modulo d, √ bnn x d log d + x
n≤x
Then 1−
1. if z ≤ x 3−
d ∗
max cnn (9.2) y≤x
d n≤y d≤z 1 5− 1−+2 5 1+ z2 x 2 + x + zx 23− + z2 x 3− + z 2 x 3− ax log x
for some > 0 1− 2. if z > x 3−
d ∗
max cnn y≤x
d n≤y d≤z 1 9−4 2− z2 x 2 + x + z 23−2 x 3−2 axlog z 3−4+3 2−2+ + z 3−2 x 3−2 log z log x
for some > 0
Here, the summation
∗
(9.3)
is over primitive Dirichlet characters modulo d
Proof We set Fs =
˜ cn bn Gs = s s n≤U n n≤V n
for some parameters U = Ux z and V = Vx z, to be chosen later. We think of Fs as an ‘approximation to As/Bs’ and of Gs as ‘an approximation to 1/Bs’, and we observe that we can write As = Fs − BsGsFs + AsGs Bs As + − Fs 1 − BsGs
Bs
(9.4)
9.1 A general theorem
159
In the literature, this is known as Vaughan’s identity and goes back to ideas of Linnik. By comparing the coefficients of n−s on both sides of (9.4), we deduce that cn = a1 n + a2 n + a3 n + a4 n where
a1 n =
cn if n ≤ U
0
if
n > U
˜ cg a2 n = − bebf
(9.5)
(9.6)
efg=n f ≤V g≤U
a3 n =
˜ aebf
(9.7)
ef =n f ≤V
˜ bgbh
a4 n = − ce ef =n e>U f>V
Therefore, for any Dirichlet character modulo d we can write cnn = ai nn = Si y
n≤y
1≤i≤4 n≤y
(9.8)
gh=f h≤V
(9.9)
1≤i≤4
We prove the theorem by estimating each of the sums
d ∗ max Si y 1 ≤ i ≤ 4
y≤x d≤z d The estimate for S1 y Using (9.5) and hypothesis (H2) together with part 1 of Proposition 9.1.1, we obtain
cnn
cn Ulog U0 S1 y =
n≤y
n≤U n≤U
for some 0 > 0 The above bound is independent of y and d hence, recalling that there are d Dirichlet characters modulo d, we obtain d ∗ max S1 y z2 Ulog U0
y≤x d d≤z
(9.10)
The Bombieri–Vinogradov theorem
160
The estimate for S2 y
Using (9.6) we write ˜ cgefg S2 y = − bebf efg≤y f ≤V g≤U
which we split into two parts according to whether fg ≤ U or U < fg ≤ UV
The first sum obtained in this way will be denoted by S2 y and the second sum by S2 y
For S2 y we write
˜bf cg bee S2 y ≤
y
e≤ fg g≤U f ≤minV Ug so that we can use hypothesis (H3) to estimate the innermost sum. We get
˜bf √ cg S2 y y dlog d g f g≤U f ≤minV Ug
˜bf cg +y g f g≤U f ≤minV Ug
√ cg ˜bf y dlog d g f ≤ U f g≤U g
cg ˜bf +y
g f ≤ U f g≤U g
Then, by using (H2) and part 1 of Proposition 9.1.1, we obtain √ cg U 1− S2 y y dlog d log U1 g g g≤U cg U 1− +y log U2 g g g≤U √ y dlog dU 1− log U3 + y U 1− log U4 for some 1 2 3 4 > 0 This implies that d≤z
d ∗ max S y d y≤x 2 5
x z 2 log zU 1− log U3 + x z2 log zU 1− log U4 (9.11)
9.1 A general theorem
161
For S2 y we write d≤z
d ∗ max S y d y≤x 2
⎞ ⎛
d ∗ ⎟ ⎜ ˜
bf cg⎠ eh = max be ⎝ y≤x
d d≤z eh≤y f ≤Vg≤U
U x 3− and that
d ∗
max cnn y≤x
d n≤y d≤z 1 9−4 3−4+3 2−2+ 2− z2 x 2 + x + z 23−2 x 3−2 axlog z + z 3−2 x 3−2 log z log x
for some > 0
This completes the proof of the theorem.
9.2 The Bombieri–Vinogradov theorem
167
9.2 The Bombieri–Vinogradov theorem In this section we return to the study of the error term that occurs in the asymptotic formula for x d a where a d are coprime integers and d ≤ x
We have already mentioned in Chapter 7 that if we assume a generalized Riemann hypothesis, then x d a =
li x + O x1/2 logdx
d
We are now ready to prove that, on ‘average’ and under no hypothesis, the error term has indeed size Od x1/2 log x More precisely, we shall be proving: Theorem 9.2.1 (The Bombieri–Vinogradov theorem) For any A > 0 there exists B = BA > 0 such that
x li y
max max
y d a −
log xA
y≤x ad=1 d x1/2 d≤
(9.27)
log xB
The Bombieri–Vinogradov theorem was first obtained, independently, by E. Bombieri and A. I. Vinogradov in 1965. Their proofs were based on an application of the large sieve for estimating the number of zeroes of certain L-functions in various rectangles. A different proof of the theorem was obtained by Gallagher in 1968. Yet another proof was obtained by Vaughan in 1975, and it is his approach that we will be following in our presentation below (see [8, p.16]). There are two main ingredients needed in Vaughan’s proof of the Bombieri– Vinogradov theorem. The first one is a particular case of the general theorem that we discussed in Section 9.1. We state it below. The second one is the result due to Siegel and Walfisz, stated in Section 7.3 of Chapter 7, which provides unconditional estimates for x d a − li x/d as long as d runs in small ranges with respect to x
Theorem 9.2.2 (Vaughan) Let x and z be arbitrary positive integers. Then
d ∗
max nn y≤x
d n≤y d≤z 2 1/2 (9.28) z x + x + zx5/6 log zlog x ∗ is over primitive characters for some > 0 where the summation modulo d and where · denotes the von Mangoldt function.
The Bombieri–Vinogradov theorem
168
Proof We use the notation introduced in Theorem 9.1.2 and set log n 1 As = − s = Bs = s =
s s n≥1 n n≥1 n Thus, for any n ≥ 1 ˜ an = log n bn = 1 cn = n bn = n and hypotheses (H1) and (H2) of Theorem 9.1.2 are clearly satisfied. By using the Pólya–Vinogradov inequality (8.14) we see that, for any d and any non-trivial Dirichlet character modulo d √ bnn = n d log d
n≤x
n≤x
Hence hypothesis (H3) is also satisfied, with = 0 = We obtain that if z ≤ x1/3 then
d ∗
max nn y≤x
d n≤y d≤z 2 1/2 z x + x + zx5/6 + z5/2 x1/3 log zlog x for some > 0 and if z > x1/3 then
d ∗
max nn y≤x
n≤y d≤z d 2 1/2 z x + x + z3/2 x2/3 log zlog x for some > 0 Combining the two estimates gives us the desired inequality. Corollary 9.2.3 Let x z D be positive integers such that z > D Then
d ∗
max nn y≤x
n≤y D 0 where, again, the summation is over primitive Dirichlet characters modulo d
Proof This is an immediate consequence of Theorem 9.2.2 and of the method of partial summation. We leave it to the reader as an exercise.
9.2 The Bombieri–Vinogradov theorem
169
Proof of the Bombieri–Vinogradov theorem We will follow very closely the proof of Theorem 8.4.1 of Chapter 8. To obtain the stronger estimate (9.27) we will use, instead of the modified large sieve inequality, the technical result given by Corollary 9.2.3. The details follow. First we remark that proving (9.27) is equivalent to proving that for any A > 0 there exists B = BA > 0 such that
y
x
max max y d a −
y≤x ad=1 d log xA x1/2 d≤
log xB
where y d a is defined in (8.24). Let A y x be positive real numbers such that y ≤ x, and let d ≤ y. We observe that
y 0 − y y
≤ 1
(9.30) max
y d a − y +
ad=1 d d d mod d =0
Let the character = 0 modulo d be induced by some primitive character 1 modulo d1 We saw in (8.28) that y 1 − y log ylog d
Using this in (9.30) gives us that
y
1 y 1 max
y d a −
ad=1 d d mod d =0
+
1 max y − y + log ylog d y≤x d≤z d
and, further, that max max d≤z
y≤x
+
1
y d a − y
ad=1 d d d≤z
mod d =0
max y 1 y≤x
1 max y − y + zlog zlog x y≤x d≤z d
where z = zx is a positive real number, depending on x, to be specified soon. For the second term above we use the prime number theorem in the form (8.29) given in Chapter 8, obtaining that
1 x log z max y − y
y≤x d log xA+1 d≤z It remains to estimate the first term.
(9.31)
The Bombieri–Vinogradov theorem
170
We write each modulus d as d = d1 k for some positive integer k and observe that 1 1 max y 1 = max y 1 y≤x y≤x d≤z d mod d d1 ≤z k≤ z d1 k mod d
d1
=0
1
1
2z ∗ 1 log y d d≤z d
where we have also used the estimate k≤ dz
1 2z 1 log dk d d
∗ and where the summation is over primitive characters modulo d
In order to complete the proof of the theorem we need to choose z appropriately (i.e. of the form z = x1/2 /log xB for some positive constant B = BA) and to show that 1 ∗ x max y
y≤x log xA d≤z d
(9.32)
We recall that by the Siegel–Walfisz theorem, namely by Exercise 12 of Chapter 8, there exists B = BA > 0 such that, if d ≤ log xB and = 0 is a character modulo d, then y
x
log xA+1
Thus d≤log xB
1 ∗ x max y
y≤x d log xA
(9.33)
Now we choose z =
x1/2 log xB
and apply Corollary 9.2.3 and partial summation to obtain 1 ∗ x max y
d y≤x log xA log xB 0 and > 1/2 is an exceedingly difficult one. A celebrated conjecture of Elliott and Halberstam [13] predicts that the above inequality is true for any < 1
If one discards the absolute value and fixes a then some significant results have been obtained in [2]. Without going into too much detail, we mention that for so-called ‘well-factorable functions &d of level z’ we have, for any 4 > 0 and z = x 7 − , x x &d x d a − d log xA da=1 d≤z
for any A > 0
This result has numerous applications. For example, in the case of the Titchmarsh divisor prolem one can show, for any A > 0 x dp + a = cx + c1 li x + O log xA p≤x
172
The Bombieri–Vinogradov theorem
for positive constants c c1 . Another aplication is to the Artin primitive root conjecture [44], as well as to the theory of elliptic curves. The proofs of these improvements depend upon estimates for averages of Kloosterman sums and these, in turn, are derived from the fundamental work of Deshouillers and Iwaniec [9] relating spectral theory to sieve theory. In conclusion, we mention an important paper of Fouvry [18] that signals a conjectural approach to the Elliott–Halberstam conjecture.
9.3 The Titchmarsh divisor problem Let a be a fixed integer. We recall that in Chapter 3 we considered the question of determining the asymptotic behaviour of the function p + a p≤x
for a = −1 Now let us consider the more complex question of determining the asymptotic behaviour of dp + a p≤x
where d· is the divisor function. This is known in the literature as the Titchmarsh divisor problem. It was first studied by Titchmarsh in 1930 [69] and is related to a famous conjecture of Hardy and Littlewood, formulated in 1922 and asserting that every sufficiently large integer can be represented as the sum of a prime and two squares. Already in 1930, Titchmarsh showed that dp + a = Ox p≤x
and that, under a generalized Riemann hypothesis, an explicit asymptotic formula for p≤x dp + a (see (9.35) below) also holds. In 1923, Hardy and Littlewood suggested that their aforementioned conjecture was true for almost all integers if one assumes a generalized Riemann hypothesis, and in 1928 Stanley showed that this was indeed true. Later on, the hypothesis of her results was gradually weakened by S. Chowla (1935), A. Walfisz (1935), T. Estermann (1936), H. Halberstam (1951), and C. Hooley (1957), without being completely removed. Only in 1960 Linnik obtained an unconditional proof for the Hardy–Littlewood conjecture. His method, now known as ‘the dispersion method’, can also be used to provide an unconditional proof for the Titchmarsh divisor problem.
9.3 The Titchmarsh divisor problem
173
Simpler proofs for both the Hardy–Littlewood conjecture and the Titchmarsh divisor problem were obtained in 1966 by P. D. T. A. Elliott and H. Halberstam, and in 1965 by G. Rodriquez, by combining Hooley’s 1957 approach (see [31]) with the Bombieri–Vinogradov Theorem. Our goal in this section is to describe a simple proof of the Titchmarsh divisor problem. More precisely, we are showing: Theorem 9.3.1 c such that
Let a be a fixed integer. Then there exists a positive constant
x log log x dp + a = cx + O log x p≤x
(9.35)
Proof First let us observe that for any positive integer n
dn = 2
1 − n
dn √ d≤ n
where n =
1
if
0
n is a square otherwise
Thus
dp + a = 2
p≤x
1−
p≤x dp+a √ d≤ x
=2
√ d≤ x
p + a
p≤x
x d −a + O
√ x
(9.36)
We recall that the Bombieri–Vinogradov theorem allows us to control the error terms in the asymptotic formulae for x d −a as long as d ≤ √ xlog x−B for some positive constant B (to be specified later). This suggests that we split the summation on the right-hand side of (9.36) into two parts: √ d≤ x
x d −a =
√
d≤
x log xB
x d −a +
√
x log xB
√ 0 and some B = BA Now we remark (see Exercise 4 of Chapter 6) that there exists a positive constant c0 such that, for any x
1 = c0 log x + O1
d d≤x Therefore d≤
√ x log xB
x d −a =
x log log x c0 x+O
2 log x
(9.38)
(9.39)
For the second sum in (9.37) we use the Brun–Titchmarsh theorem and, again, (9.38). We obtain √
x log xB
√ 0 and for any sequences of real numbers wt &d that are supported only at positive squarefree integers t ≤ y d ≤ z composed of primes of we have ⎞2 ⎛ ⎞⎛ ⎜ ⎟ ⎝ wt ⎠ ⎝ &d ⎠ = !X + E (10.3) a∈
where
t a∈t
⎛
d a∈d
⎞⎛
⎛
⎞2
⎞
⎜ ⎜ ⎟ ⎟ ⎜ ⎟ E = O ⎜ ⎝ wt ⎠ ⎝ &d ⎠ Rm ⎟ ⎝ ⎠ m≤yz2 mPyz
t≤y tm
d≤z dm
(10.4)
10.1 The lower bound sieve and
⎞2
⎛ ! =
t≤y≤z tPyPz t=1
179
⎟ ⎜ ⎟ 1 ⎜ wt ⎟ ⎜ · rz r ⎟ ⎜ ft f1 ⎝r≤z/ ⎠
(10.5)
&sr s≤z/r fsr
(10.6)
rPz rt
with zr = rf1 r
sPz
for any positive squarefree integer r composed of primes of
Remark 10.1.2 First we note that choosing the sequence &d is equivalent to choosing the sequence zr , since, by the dual Möbius inversion formula, (10.6) is equivalent to &d = dfd
2 rd zrd
r≤z/d f1 rd
(10.7)
rPz
Then we note that Theorem 10.1.1 is a generalization of Selberg’s sieve discussed in Chapter 7. To see this, we choose 1 if t = 1 wt = (10.8) 0 if t > 1 and
zr =
1 Vz
if
r ≤ z and rPz
0
otherwise
Vz =
2 d d≤z f1 d
(10.9)
where (10.10)
dPz
(notice that this is the same as Vz defined in Chapter 7). On one hand, from (10.7) and (10.9) we get that &d =
1 dfd 2 r · Vz f1 d r≤z/d f1 r rPz rd=1
(10.11)
The lower bound sieve
180
which is exactly what we had chosen in the proof of Selberg’s sieve. For this choice, we have &d = 0 if d > z &1 = 1 and &d ≤ 1 for any d ≤ z Using these observations in (10.4) we obtain that ⎞ ⎛ ⎟ ⎜ Rd1 d2 ⎠
E = O⎝ d1 d2 ≤z d1 d2 Pz
On the other hand, from (10.9) and (10.5) we get != Thus
1
Vz ⎛
S z ≤
⎞
X ⎜ ⎟ +O⎝ Rd1 d2 ⎠ Vz d1 d2 ≤z d1 d2 Pz
which is exactly Selberg’s sieve. The proof of Theorem 10.1.1 is a ‘weighted replica’ of the treatment of the summation ⎛ ⎞2 ⎜ ⎟ &d ⎠ ⎝ a∈
d a∈d
occuring in Selberg’s sieve. We will need the following auxiliary results. Lemma 10.1.3 Let f be a multiplicative function. For any squarefree positive integers t d1 d2 we have f t d1 d2 =
ftfd1 fd2 ft d1 d2 ft d1 ft d2 fd1 d2
where, for integers a and b, f a b = fa b and fa b = fa b
Proof Exercise. Lemma 10.1.4 Let f be a multiplicative function, let f1 be defined by (10.2) and let f−1 be defined by 1 = f−1 d
fn dn
(10.12)
10.1 The lower bound sieve
181
Let t be squarefree positive integers. Then f if t = 1 f1 rf−1 s = 1 0 otherwise. rs= st
Proof Let r s be squarefree positive integers, and let = r s, = r s. Using Lemma 10.1.3 we see that f1 =
f1 rf1 f1
f1 r =
f1 f1
f1 s
or, equivalently,
This allows us to write f−1 s f1 rf−1 s = f1 f f1 s s 1 rs= st st
= f1
f−1 sfs
f1 s st
Now we notice that for any prime p we have 1 − 1 fp fp f−1 pfp = = −1
f1 p fp − 1 Since f−1 ·f·/f1 · is a multiplicative function, we must have f−1 sfs = s
f1 s Therefore
rs= st
f1 rf−1 s = f1
s
st
which, by Lemma 1.2.1 of Chapter 1, gives us the result claimed in the lemma. Lemma 10.1.5 Let f be a multiplicative function and let f1 be defined by (10.2). Let r t be positive integers such that r is squarefree and t = 1
Then r f r if rt ft s = 1 s 0 if r t. sr
The lower bound sieve
182
Proof First we observe that since t = 1 we have t s = t s Thus we need to analyze r ft s
S = s sr Let r = p1 pk be the prime factorization of r. First let us consider the case when rt, hence the case when t = p1 pk t for some positive integer t . Then 1− 1− p1 1 pk k f p1 1 pk k S= 0≤1 ≤1 ··· 0≤k ≤1
=
fpj − 1
1≤j≤k
= f1 r
Now let us consider the case when r t. Without loss of generality we can write t = pi pk t for some i > 1 and some integer t coprime to p1 pi−1 . Then 1− 1− p1 pi−1 pi i pk k f pi pk t p1 1 pk k S= 0≤i ≤1 ··· 0≤k ≤1
=
1−
pj j ·
1≤j≤i−1 0≤j ≤1
pj + 1 ·
=
1− pj j f pj j
i≤j≤k 0≤j ≤1
1− pj j f pj j
i≤j≤k 0≤j ≤1
1≤j≤i−1
= 0
This completes the proof of the lemma. Proof of Theorem 10.1.1 By squaring out the sum over d interchanging summation and using the assumption on #d we obtain ⎛ ⎞⎛ ⎞2 ⎟⎜ ⎟ ⎜ ⎜ ⎟⎜ ⎟ wt ⎟ ⎜ &d ⎟ ⎜ ⎝ t≤y ⎠ ⎝ d≤z ⎠ a∈ tPy a∈t
=X
dPz a∈d
t≤yd1 d2 ≤z tPyd1 d2 Pz
wt &d1 &d2 f t d1 d2
+
t≤yd1 d2 ≤z tPyd1 d2 Pz
wt &d1 &d2 Rtd1 d2
10.1 The lower bound sieve
=X
183
wt &d1 &d2
t≤yd1 d2 ≤z tPyd1 d2 Pz
⎛
f t d1 d2
⎛
⎞⎛
⎞
⎞2
⎟ ⎜ ⎜ ⎟ ⎜ ⎟ +O⎜ ⎝ wt ⎠ ⎝ &d ⎠ Rm ⎟ ⎝ ⎠
m≤yz2 mPyz
t≤y tm
(10.13)
d≤z dm
Now we analyze
! =
wt &d1 &d2
t≤yd1 d2 ≤z tPyd1 d2 Pz
(10.14)
ft d1 d2
which, by using Lemma 10.1.3 and formulae (10.2) and (10.12), can be rewritten as wt &d1 ft d1 &d2 ft d2 · · f1 r f−1 s = fd1 fd2 t≤yd1 d2 ≤z ft rd d std d 1
tPyd1 d2 Pz
=
t≤yrs≤z tPyrsPz st
wt f rf−1 s ft 1
d1 d2 ≤z d1 d2 Pz rsd1 rsd2
t≤yrs≤z tPy rsPz st
1
2
&d1 ft d1 &d2 ft d2 · fd1 fd2
⎛
=
2
⎞2
⎜ & ft d ⎟ wt ⎜ ⎟ d f1 rf−1 s ⎜ ⎟
⎝ d≤z fd ⎠ ft dPz rsd
For any t and with tPy and Pz let ut =
&d ft d
fd d≤z
(10.15)
dPz d
By the dual M¨obius inversion formula this is equivalent to & ft d = utd
f d≤z dPz d
(10.16)
The lower bound sieve
184
By using (10.15) and Lemma 10.1.4 we obtain
! =
t≤y≤z tPyPz
=
t≤y≤z tPyPz t=1
wt 2 u f rf−1 s ft t rs= 1 st
wt f u2t
ft 1
(10.17)
We note that (10.17) is reminiscent of (7.18) of Chapter 7, since in the situation of Selberg’s sieve the sequence wt is given by w1 = 1 and wt = 0 for any t > 1 (see Remark 10.1.2). If we view ! as a quadratic form in &d then we see that, through the above calculations, we have diagonalized this form. In what follows we will try to bring the summation ≤z Pz
1 f1
into the expression of ! This will emphasize the resemblance of the result discussed here with Selberg’s sieve. Let us note that from (10.15) and (10.16) we get u1 =
&m fm m≤z/
(10.18)
mPz
and & = tu1t
f t≤z/
(10.19)
tPz
By using (10.19) and Lemma 10.1.5 we rewrite ut as ut =
&s ft s = ft s mu1sm fs s≤z/ s≤z/ m≤z/s sPz
=
r≤z/ rPz
sPz
mPz
r f1 ru1r
ft s u1r = s r≤z/ sr rPz rt
10.2 Twin primes
185
Now we use (10.18) and the observation that if rt and t = 1 then r = 1 to write
ut =
f1 r
r≤z/ rPz rt
=
&rs frs s≤z/r sPz
&rs
rrf1 r f1 r≤z/ frs s≤z/r rPz rt
sPz
We plug this back into (10.17) and get ⎛
! =
t≤y≤z tPyPz t=1
⎞2
⎜ ⎟ &rs ⎟ 1 ⎜ wt ⎜ ⎟
· rrf1 r ⎟ ft f1 ⎜ frs ⎝r≤z/ ⎠ s≤z/r rPz rt
sPz
By introducing the notation zr = rf1 r
&sr s≤z/r fsr sPz
we finally obtain ⎞2
⎛
! =
t≤y≤z tPyPz t=1
⎟ ⎜ ⎟ 1 ⎜ wt ⎜ · rzr ⎟ ⎟ ⎜ ft f1 ⎝r≤z/ ⎠ rPz rt
which is exactly the quantity ! introduced in the statement of the theorem. This completes the proof.
10.2 Twin primes In 1849, de Polignac conjectured that every even number is the difference of two primes in infinitely many ways. In particular, the conjecture predicts that there are infinitely many primes p such that p + 2 is also prime. This special case of de Polignac’s conjecture undoubtedly has roots in antiquity. In this section, we will prove:
The lower bound sieve
186
Theorem 10.2.1 There are infinitely many primes p such that p + 2 has at most four distinct prime factors. This is an application of the lower bound sieve discussed in Section 10.1 and of the following preliminary lemmas. Lemma 10.2.2 (Mertens’ formula) 1 1 = log1 + log x + A0 + O 1 + log x p≤x p − 1
for some positive constant A0
Proof Exercise. We denote by 1 · the multiplicative function defined by n =
1 d
dn
Lemma 10.2.3 2 1
≤x 2
= A1 log x + A2 + O
1
x1/4
for some positive constants A1 A2
Proof Exercise. Lemma 10.2.4
2
1 · log log x
1 + log x ≤x 1 2
Proof the interval 2 x into dyadic subintervals of the form k+1 We kbreak x/2 x/2 with 0 ≤ k < log x/ log 2 − 2 Then 2
1 1 x 1 + log k ≤x 1 1≤k y
Thus, if p is a prime such that #mod p ≤ y then two distinct elements of S are congruent modulo p. Hence p divides the numerator N of m
r a1 1 am r −1
for some m1 mr , and the number of such primes is clearly bounded by log N ≤ y1/r
r i=1
log Hai = Oy1/r
10.4 Application to primitive roots
197
where Hai = maxbi ci and ai = bi /ci with bi ci being coprime integers. Taking into account the number of possibilities for m1 mr , the total number of primes p cannot exceed Oy1+1/r
Now suppose a1 a6 are mutually coprime integers. We want to show that there is a positive constant A and a finite set of the form n
n
S = a1 1 a6 6 0 ≤ ni ≤ A such that, for some a ∈ S, there are infinitely many primes p for which a is a primitive root modulo p. Indeed, let p be a prime enumerated in Theorem 10.3.2 and the subgroup generated by a1 a6 . By the remark following Theorem 10.3.2, we can ensure that p−1/2
ai
≡ −1mod p
for 1 ≤ i ≤ 6. In particular, the index p∗ mod p is odd. As p − 1/2 has all its prime divisors > x1/6 /log x2B we see that if the index p∗ mod p is greater than 1, then #mod p ≤ x5/6 log x2B
By Lemma 10.4.2, the number of such primes is Ox35/36 log x7B/3 which is negligible compared to the number of primes supplied by Theorem 10.3.2. Thus, we may assume that the index p∗ mod p = 1 for at least
x log log x log x3
primes p ≤ x. Now fix such a prime p. Suppose that none of the elements of S are primitive roots modulo p. Choose a primitive root g modulo p and write ai = g ei mod p
198
The lower bound sieve
Thus, for every 6-tuple n1 n6 enumerated by S we have 6 ni ei p − 1 > 1
gcd i=1
The total number of elements of S is A + 16 . We remove from this set any tuple n1 n6 for which 6 ni ei p − 1 = 1
gcd i=1
As p − 1 has a bounded number of prime factors, there are not too many tuples to eliminate. For instance, the number of tuples with 6
ni ei ≡ 0mod 2
i=1
is at most A + 16 /2 (as each of the ei s is odd). For each of the remaining possibilities we may choose n1 n5 arbitrarily and then solve for n6 from n6 e6 ≡ −
5
ni ei mod q
i=‘1
for the prime qp − 1 with q > x1/6 /log x2B As A is fixed, there can be at most one solution for n6 in this congruence for x sufficiently large. Thus, if 1 A + 16 − A + 16 − 6A + 15 > 0 2 we deduce that there is a tuple n1 n6 in S such that n
n
a1 1 a6 6 is a primitive root modulo p for the primes enumerated above. Thus, if A > 11, we are ensured a solution. This proves: Theorem 10.4.3 Let a1 a6 be mutually coprime integers greater than 1. Then there are infinitely many primes p for which some member of the finite set n
n
S = a1 1 a6 6 0 ≤ ni ≤ 12 is a primitive root modulo p for infinitely many primes p. Moreover, the number of such primes p ≤ x is
x log log x
log x3
10.5 Exercises
199
As indicated in [23], one can obtain stronger results if we apply the Rosser– Iwaniec sieve [34]. By that technique, one can obtain
x log2 x
primes p ≤ x with the property that all prime divisors of p − 1/2 are greater than x1/4 /log xB
There are other applications of the lower bound sieve that are too numerous to enumerate here. We will content ourselves by relating one application to the Euclidean algorithm. An integral domain R is said to be Euclidean if there is a map R\0 → ∪ 0 such that for any a b ∈ R, b = 0, there are q r ∈ R such that a = bq + r with r = 0 or r < b
Any such Euclidean domain is necessarily a principal ideal domain (PID). In 1973, Weinberger [72] proved the remarkable theorem that if K is an algebraic number field with infinitely many units, then the ring of integers of K is Euclidean if and only if it is a PID, provided the generalized Riemann hypothesis holds. Naturally, it would be good to eliminate the use of this hypothesis from this result. In this direction, M. Harper and R. Murty [29] showed that whenever the unit rank of K is greater than 3, the generalized Riemann hypothesis can be dispensed with. A key tool in this work is the lower bound sieve and the large sieve method. We refer the reader to [29] for further details and applications.
10.5 Exercises 1. Prove Lemma 10.2.2. 2. Prove Lemma 10.2.3. 3. Let d· be the divisor function and let k be a positive integer. Show that dk m k log x2
m m≤x
200
The lower bound sieve
4. Let x be positive real numbers and let y = x+ z = x− for some > 0 Show that the conditions yz2 ≤ x 2 − 1
and
log y lim 1 − log x→ log z
>0
are satisfied if and only if 1 1 = =
4 8 5. Show that 2 is a primitive root modulo p if p is of the form 4q + 1 with q a prime. 6. In the proof of Theorem 10.4.3, what goes wrong if we take five mutually coprime integers greater than 1? 7. Fix a prime q < x with < 1/2 Show that the number of primes p ≡ 1mod q such that p − 1/2q has all its prime factors > x for some > 0 is x
qlog x2 8. Assuming the result stated at the end of Section 10.4, deduce that there are x log x2 primes p ≤ x with the property that all the prime divisors of p − 1/2 are greater than x1/4 log xB
Deduce that if a b c are three distinct prime numbers, then one of them is a primitive root modulo p for infinitely many primes p
9. Assuming the Elliott–Halberstam conjecture (see Chapter 9), show that for any distinct prime numbers a b, at least one of them is a primitive root modulo p
11 New directions in sieve theory
Nearly a century after the discovery of Brun’s sieve, we can look back and see how the subject has developed, and, to some extent, indicate how it may develop in the next 100 years. One of the dominant themes of the twentieth century number theory has been the ‘modular connection’. In 1955, Yutaka Taniyama (1927–58) first hinted at a connection between elliptic curves and automorphic forms. The Langlands program has absorbed this theme and the connection is expected to hold in a wider context. At the heart of the Langlands program lies the ‘Rankin–Selberg method’, which signals an ‘orthogonality principle’ for automorphic representations on GLn. This point of view has suggested one mode of generalizing the large sieve inequalities of analytic number theory. In a series of remarkable papers, H. Iwaniec and his school have developed the ‘modular connection’ and the cognate ‘spectral connection’ as it applies to GL2 analogues of the large sieve inequality (see, for example, [2,3,9,33]). In this chapter, we give a brief overview of the work of Duke and Kowalski [10] that suggests a future direction for the large sieve method. No doubt, there will be other directions of development, but the authors do not have a crystal ball to perceive them.
11.1 A duality principle The large sieve inequality can be reduced to a statement in linear algebra, which in turn can be proven using matrix theory. This is the point of view of Elliott [11, pp. 150–70]. We review this below, as it is one of the key ingredients in ‘non-abelian’ generalizations of the large sieve, initiated (and, to a large extent, developed) by Iwaniec [33]. 201
202
New directions in sieve theory
Theorem 11.1.1 (A principle of duality) Let cij 1 ≤ i ≤ m 1 ≤ j ≤ n be mn complex numbers. Let & be a non-negative real number. Then the inequality
2
cij aj ≤ & aj 2
1≤i≤m 1≤j≤n 1≤j≤n holds for any complex numbers a1 an if and only if the inequality
2
cij bi ≤ & bi 2
1≤j≤n 1≤i≤m 1≤i≤m holds for any complex numbers b1 bm
Proof First, let us recall some standard notation. For a vector v with complex components, vt denotes its transpose, v denotes its length in the Euclidean space, and v denotes its complex conjugate. Now, for a1 an ∈ and b1 bm ∈ let a and b denote the column vectors a1 an t and b1 bm t respectively. We consider the matrix C = cij 1≤i≤m1≤j≤n and we observe that the first inequality of the theorem can be rewritten as Ca2 ≤ &a2 ∀a and the second one can be rewritten as bt C2 ≤ &b2 ∀b
Let us assume that the first inequality holds. By the Cauchy–Schwarz inequality, we have bt Ca2 ≤ bt 2 Ca2 ≤ &b2 a2 ∀a the last inequality following from our assumption. Now set t
a = C b so that bt C4 ≤ &b2 bt C2
From this we deduce that bt C2 ≤ &b2 ∀b as required.
11.1 A duality principle
203
The converse implication is immediate upon interchanging the roles of i and j More can be said about & appearing in the previous theorem. The matrix t A = C C is an n × n Hermitian matrix. Consequently, all of its eigenvalues, denoted &1 &n , are real and we may diagonalize A by a unitary transformation U More precisely, there is a matrix U satisfying t
UU = I and t
t
Cx2 = xt Ax = yt U C CU y =
&j yj 2
1≤j≤n
where x = U y Without loss of generality we may assume that &1 ≥ &2 ≥ ≥ &n ≥ 0 so that Cx2 ≤ &1 y2 = &1 x2 as U is a unitary transformation (and hence preserves lengths). Thus & in the t theorem can be taken to be the maximal eigenvalue of C C and if the aj ’s are unrestricted, this is the best choice possible. t t By duality, we can consider B = CC instead of C C As B is Hermitian, its eigenvalues are real and, again, we may order them as 1 ≥ 2 ≥ ≥ n ≥ 0 to deduce that the inequality of the theorem holds with & = 1 which is best possible. Thus the theorem implies &1 = 1
In fact, more is true. One can show that &i = i for i ≤ minm n and that the remaining eigenvalues are all zero (see [11, pp. 163–4]). The large sieve inequality (Theorem 8.7) can be derived from the following more general inequality. Let xj 1 ≤ j ≤ R be real numbers that satisfy xj − xk ≥ > 0 j = k where y denotes now the distance of y to the nearest integer. That is, y = min y − y y + 1 − y (we hope that the reader will not confuse it with the length of a vector). Then the large sieve inequality can be deduced from (see Exercise 2)
2
an e2inxj ≤ N + −1 an 2
1≤j≤R 1≤n≤N 1≤n≤N
204
New directions in sieve theory
This last inequality can be proven by determining the largest eigenvalue of the R × R matrix t
CC where
C = e2ixj n 1≤j≤R1≤n≤N
t
Notice that the j k-th entry of CC is 2ix −x n j k e 1≤n≤N
which is a geometric series that can be summed directly. If xj = xk this sum is easily seen to be 2ixj −xk N e −1 2ieixj −xk
sin xj − xk By Exercise 3, this is xj − xk −1
Now, given any n × n matrix A = aij , it is easy to see (Exercise 6) that all the eigenvalues & of A lie in the discs & − aii ≤ aij
j=i
Since xj − xk ≥ for any j = k a little reflection (Exercise 5) shows that
1 xj − xk −1 −1 log j=k
which gives a slightly weaker form of the large sieve inequality. A refined analysis (see [11, pp. 166–170]) gives the sharper inequality stated above. As discussed in Theorem 8.3.1, the large sieve inequality must be reformulated in terms of Dirichlet characters before it can be applied to deduce important consequences to the theory of L-functions or the Bombieri–Vinogradov theorem (Theorem 9.2.1). More precisely, the form of the large sieve is the inequality
2
d ∗
an 2
an n ≤ z2 + 4x
d n≤x n≤x d≤z where the summation ∗ is over primitive Dirichlet characters modulo d
The transition is effected by the orthogonality of the Dirichlet characters and
11.2 A general formalism
205
the inequality should be viewed as some form of ‘quasi’-orthogonality. This viewpoint suggests the study of sums of the form
2
an f n
f
n
where f n is to be thought of as a generalized character and where f ranges over a suitable family. We formalize this in the next section.
11.2 A general formalism The following set-up is suggested by a paper of Duke and Kowalski [10]. Let be a finite set. For each f ∈ suppose that we are given a sequence &f n
n ≥ 1
Let an n≥1 be a sequence of complex numbers and x a positive real number. The goal is to derive an inequality of the form
2
an &f n ≤ ! an 2
n≤x f ∈ n≤x for some ! > 0. We will suppose that the elements of form a ‘quasi’-orthogonal family in the following sense: there exist non-negative numbers such that, for any f g ∈
&f n&g n = cf gx + O x #
(11.1)
n≤x
where the implied constant is absolute and where cf g = 0 unless f = g, in which case we suppose that cf f = O # for any > 0
By duality we need to estimate
2
bf &f n
n≤x f ∈
New directions in sieve theory
206
for any sequence bf f ∈ of complex numbers. Expanding and using that 2bf bg ≤ bf 2 + bg 2 , we get by (11.1) that bf bg &f n&g n = x bf 2 cf f + O x # +1 bf 2 fg
n≤x
f
f
x# + O x # +1 bf 2
f
This proves: Theorem 11.2.1 Let be a finite set and for each f ∈ , let &f nn≥1 be a sequence of complex numbers. Let an n≥1 be a sequence of complex numbers. Let x > 0 Under assumption (11.1) we have
2
an 2
an &f n x# + O x # +1
n≤x f ∈ n≤x Let us note that the Cauchy–Schwarz inequality gives the estimate x# 1+ + O x # +1 an 2 n≤x
(see Exercise 7). We see that Theorem 11.2.1 provides a saving of # in the main term. To apply this theorem, we must have a ready supply of families that satisfy (11.1). A natural family is provided by normalized Hecke eigenforms in the theory of modular forms. Indeed, if f and g are cusp forms of weights k1 and k2 , respectively, with Fourier expansions k1 −1 fz = &f nn 2 e2inz n≥1
gz =
&g nn
k2 −1 2
e2inz
n≥1
then the Rankin–Selberg L-series &f n&g n n≥1
ns
has been studied (independently) by Rankin [56] and Selberg [57]. Their results imply that if we take to be the set of normalized Hecke eigenforms of level ≤ Q then (11.1) holds with = 1/2 = 1/2 and with cf g equal to zero unless f = g in which case cf f is related to the Petersson inner product f f One also has the estimate on cf f required by (11.1).
11.3 Linnik’s problem for elliptic curves
207
This observation opens up a wide spectrum of examples and potential applications. Relation (11.1) holds in a wider context of Rankin–Selberg theory in the Langlands program. We refer the reader to [6] for a general introduction to the theory.
11.3 Linnik’s problem for elliptic curves In 1941, when Linnik [36] introduced his large sieve, he was motivated by the following question: let d be a natural number and a primitive Dirichlet character modulo d what is the size of the smallest n = n such that n = 1? In the special case that is the quadratic character (that is, 2 = 0 the trivial character) and d = q is a prime, this is the question concerning the size of the least quadratic non-residue modulo q and one has the celebrated conjecture of Vinogradov that n q for any > 0 The generalized Riemann hypothesis (see [41, Chapter 13]) implies n log q2
Using his large sieve, Linnik obtained a statistical result towards this question. More precisely, if D > 0 and > 1 and if we let ND be the number of primitive characters of modulus d ≤ D such that n = 1 for all n ≤ log D then Linnik’s argument shows that ND D + 2
for any > 0 Since the total number of characters under consideration is 2 about D2 , this result says that the number of characters with n > log D is o D for any > 2 In particular, one can conclude that Vinogradov’s hypothesis is ‘almost always’ true in the probabilistic sense described above. After the advent of the Langlands program and, to some extent, the important work of Iwaniec [33], automorphic representations are viewed as higher dimensional analogues of Dirichlet characters. The discussion in the previous section and its brief allusion to Rankin–Selberg theory justifies, to some extent, this viewpoint. Thus it has been a useful background theme to investigate to what extent theorems concerning Dirichlet characters can be generalized to the higher dimensional ‘automorphic’ context. Poetic as this may sound, the transfer of ideas is not easy and our understanding even in the GL2 context has been meagre. However, some results can be obtained, suggesting that ‘non-abelian’ analogues of the large sieve inequality do exist.
208
New directions in sieve theory
This perspective first arose in the fundamental work of Iwaniec [33] and then was developed in the foundational paper [9]. These works focused on the GL2 analogues. In [10], Duke and Kowalski presented a general approach on how to generalize the large sieve inequality to GLn It is this presentation that we will follow here. Inspired by Linnik’s problem, Duke and Kowalski consider the following problem: given two non-isogenous elliptic curves E1 and E2 over , how large can n be such that E1 and E2 have the same number of points modulo p for all primes p ≤ n where E1 and E2 have good reduction? We can phrase this question in another way, as follows. Given an elliptic curve E over let ap E be defined by ap E = p + 1 − #Ep for primes p of good reduction. It is known that two elliptic curves E1 and E2 are isogenous over if and only if ap E1 = ap E2 for all but finitely many primes p Thus, given two non-isogenous elliptic curves E1 and E2 over the problem is to find the size of the smallest prime p so that ap E1 = ap E2
We shall refer to it as Linnik’s problem for elliptic curves. This question comes up in many contexts and was first discussed by Serre [64] in his determination of the Galois groups obtained by adjoining the -division points of a given elliptic curve to He proved, under the generalized Riemann hypothesis (for Dedekind zeta functions), that one can find such a prime p satisfying p log D2 where D = maxN1 N2 with N1 N2 equal to the conductors of E1 E2 respectively. Thus, a natural question to ask is if one can prove a result analogous to Linnik’s without any hypothesis in the elliptic curve context. This is what is proven in [10]: Theorem 11.3.1 (Duke and Kowalski) Let D > 0 > 1 Let MD be the maximal number of isogeny classes of semistable elliptic curves over with conductor ≤ D such that for every prime p ≤ log D they have the same number of points modulo p Then, for any > 0 MD D +
10
11.4 Linnik’s problem for cusp forms
209
Fouvry et al. [19] have shown that the number of semistable elliptic curves of conductor ≤ D is 5
D6
That this is the expected result can be seen, roughly, as follows. Given an elliptic curve E y2 = x3 + ax + b over the discriminant of the equation is 4a3 + 27b2 (which may not be equal to the discriminant of the curve!). The number of choices for a and b so that 4a3 + 27b2 ≤ D and 4a3 + 27b2 is squarefree is, approximately, D1/3 D1/2 = D5/6
This heuristic argument shows that the result of [19] is of the correct order of expected magnitude. Thus, if 10/ < 5/6 that is, if > 12 we obtain from Theorem 11.3.1 a statistical result of Linnik type in the elliptic curve context. In view of the spectacular work of Wiles [73] and his school, all elliptic curves over are modular. That is, to each elliptic curve E over we can associate a weight 2 cusp form fE such that ap E = &fE p where &fE p denotes the p-th Fourier coefficient of fE Hence the Linnik problem for elliptic curves can be rephrased in terms of cups forms. This is the strategy adopted in [10]. We therefore consider the larger class of cusp forms of level ≤ Q, derive a ‘large sieve’-type inequality for them, and then deduce Theorem 11.3.1.
11.4 Linnik’s problem for cusp forms Let k be a positive even integer. For a positive integer d let Sk d+ be the set of primitive cusp forms of weight k and level d. We know that the elements of Sk d+ are Hecke eigenforms. Let D > 0 and let Sk ≤ D+ denote the set of primitive cusp forms of weight k and level ≤ D For f ∈ Sk ≤ D+ we write k−1 fz = &f nn 2 e2inz n≥1
for its Fourier expansion at i
New directions in sieve theory
210
There are precise formulae for the cardinality of Sk d+ and we can deduce that #Sk ≤ D+ ∼ ckD2 for some positive constant ck (see [66, pp. 25–46] for details). We will be interested in a special subset of Sk ≤ D+ denoted Sk ≤ D# consisting of ‘non-monomial’ forms. One can characterize the ‘monomial’ forms as those f for which there exists a quadratic Dirichlet character such that, for all primes p, we have &f p = &f pp
These are the so-called ‘forms of CM-type’. One can show that the collection of monomial elements of Sk ≤ D+ is of size o D2 so that #Sk ≤ D# ∼ ckD2 as D → (in fact, one can show that the number of monomial forms is O D1+ for any > 0). The main theorem in [10] is: Theorem 11.4.1 Let k be a positive even integer, D > 0 Let an n≥1 be a sequence of complex numbers. 1. If > 6 then
2
an &f n D+ an 2
# ≤D n≤D n≤D
f ∈Sk
2. If > 10 then
2 ∗
2 2 2
nan &f n D+ nan 2
f ∈S ≤D# n≤D n≤D k
where the star on the summation indicates that we sum over those f ‘up to quadratic equivalence’. That is, two forms f g ∈ Sk ≤ D# are said to be quadratic equivalent if there exists a quadratic character such that, for all p &f p = &g pp The summation is over representatives of the equivalence classes. We will discuss the proof of this theorem in the next section. In this section we will indicate how one may deduce Theorem 11.3.1 from it.
11.4 Linnik’s problem for cusp forms
211
Proof of Theorem 11.3.1 Fix k D and an n≥1 as in Theorem 11.4.1. Also, fix > 1 and a set of primes, of natural density Thus # p ∈ p ≤ x ∼ x as x →
We will say that two elements f g ∈ Sk ≤ D# are equivalent, denoted f ∼ g if &f p = &g p ∀p ∈ p ≤ log D
Let us fix f0 ∈ Sk ≤ D# and let d0 be the level of f0 We introduce the notation D = p ≤ log D p d0 1 D = p ∈ Q &f0 p ≥ 1/2 2 D = p ∈ Q &f0 p2 ≥ 1/2
Since d0 has log D prime divisors, the set D satisfies #D ∼
log D log log D
as > 1 For any p ∈ D we have
&f0 p2 − &f0 p2 = 1
so that, for some i0 ∈ 1 2 #i0 D ≥ #D/2 ≥
log D
3 log log D
Let m be a positive integer to be chosen soon. We denote by D the set of squarefree integers n that have m prime factors, all from i0 D
Let N be the largest element of D so that N ≤ log Dm = N say
We assume that m is chosen such that N is less than (but near) D , with to be chosen so that we may apply Theorem 11.4.1. We remark that if f ∼ f0 in the sense defined above, then &f p = &f0 p for all p ∈ i D hence &f n = &f0 n if n has all its prime factors in i D where 1 ≤ i ≤ 2. Moreover, if n ∈ D then
i i
&f n 0 = &f n 0 ≥ 2−m
(11.2) 0
212
New directions in sieve theory
Now let us deduce Theorem 11.3.1. For this, it is enough to take k = 2 and in this situation the cusp forms f appearing in the inequalities given by Theorem 11.4.1 will correspond to elliptic curves over . We will also consider only those f ’s belonging to the equivalence class of the fixed form f0 so that we count elliptic curves in the same isogeny class. We choose if n ∈ D & ni0 an = f0 0 otherwise, In the case i0 = 1 we use the first inequality of Theorem 11.4.1 to obtain
2
2 &f0 n D+ &f n2
MD
n∈ D n∈ D Now 11 2 implies that
&f0 n2 ≥ 2−2m # D
n∈ D
Thus MD
D+ 22m
# D
In the case i0 = 2, let M 2 D be the size of MD up to quadratic twists. Then, using the second inequality of Theorem 11.4.1, we get that, for > 10
2 2 2 2
2 2
&f n D+
&f n M D 0 0
n∈ D n∈ D so that, as before, M 2 D We choose
D+ 22m
# D
log Q m = log log Q
and so 22m = O D By unique factorization and Stirling’s formula, −1 #i0 D # D ≥ D −
m Hence
M 2 D D +
11.5 The large sieve inequality on GL(n)
213
Since any equivalence class contributes at most one semistable curve, this completes the proof of Theorem 11.3.1. We remark that the number of quadratic 1 twists of a given f that may be + 2 in any equivalence class is at most O D hence this would have to be taken into account in the general case. However, these estimates can be refined and we refer the reader to [10, p. 15] for further details. The interested reader should also compare this argument with the classical one of Linnik, which can be found in [1, p. 7].
11.5 The large sieve inequality on GL(n) In this section we adopt the general approach of [10] and outline, very briefly, how one may prove Theorem 11.4.1. For any comprehensive proof the prerequisites are formidable and it would be out of the scope of this book to provide complete details. Our goal is to indicate (rather vaguely) how such a theory is to be developed. As we feel the theory will have profound applications in the future, it would be short-sighted on our part to leave out a discussion of it. For the terminology and background, we refer the reader to [22]. Let be a cuspidal automorphic representation of GLn A where A denotes the adele ring of the rational number field. To each such Langlands has attached an L-function: Ls =
& n n≥1
ns
We fix the infinity component of and the central character # of . For any integer d ≥ 1 let Autd be the set of cuspidal automorphic representations of GLn A such that 1. is the infinity component of and # is its central character; 2. satisfies the Ramanujan conjecture in the sense that & n = O n for any > 0 3. the conductor of is d.
New directions in sieve theory
214 For D > 0 let
Aut≤ D = ∪d≤D Autd
For each d, suppose that we are given a set Sd ⊆ Autd and define S≤ D = ∪d≤D Sd
The goal is to prove an inequality of the form
2
an & n N 1+ an 2
n≤N ∈S≤D n≤N for D and N in a suitable range. As remarked in Section 11.1, this is equivalent, by duality, to
2
b & n N 1+ b 2
n≤N ∈S≤D ∈S≤D Since we seek only an upper bound, we choose a smooth, positive, compactly supported test function + −→ + so that x = 1 for 0 ≤ x ≤ 1 and x ≥ 0 for all x ∈ + Then we must estimate
2
n
b & n
N n≥1 ∈S≤D Expanding the square and interchanging the order of summation gives n
b1 b2 &1 n&2 n N n≥1 ∈S≤D 1
2
Let us consider the Dirichlet series Lb s 1 ⊗ ˜ 2 =
&1 n&2 n n≥1
ns
which we call the ‘naive’ convolution L-series of 1 and 2 . It is closely related to the Rankin–Selberg convolution Ls 1 ⊗ ˜ 2 : there exists a Dirichlet series Hs 1 ˜ 2 which converges absolutely for Res > 1/2 so that Lb s 1 ⊗ ˜ 2 = Hs 1 ˜ 2 Ls 1 ⊗ ˜ 2
11.5 The large sieve inequality on GL(n)
215
Moreover, we have, for any > 0 and uniformly for Res = " > 1/2 a bound Hs 1 ˜ 2 q1 q2 H" where H is a fixed Dirichlet series absolutely convergent in Res > 1/2 and satisfying 1 −A H" " − 2 for some A > 0 here, q1 q2 denotes the least common multiple of the conductors q1 q2 of 1 2 respectively. Define dx ˆ xxs s = x 0 so that 1 c+i ˆ x = sx−s ds 2i c−i for c > 2 Hence n≥1
&1 n&2 n
n N
=
1 c+i s ˆ N sHs 1 ˜ 2 Ls 1 ⊗ ˜ 2 ds
2i c−i
Now we can apply standard techniques of analytic number theory (see [45, Chapter 4]). We move the line of integration to Res = 21 + c with c < 1/2 to be chosen later. The Mellin transform ˆ is easily seen to be holomorphic for Res > 0 and rapidly decreasing in any vertical strip 0 < < Res < b for every positive b. By the Phragmén–Lindel¨of principle (see [45, Chapter 8]),
L 1 + c + it 1 ⊗ ˜ 2 D2n 41 − 2c tE
2 for some E, using the work of Bushnell–Henniart on conductors of tensor products of two automorphic representations. The final result is that the sum in question is 1 n ˆ = 1 2 1NR + O N 2 + 2 + 1 when N > D where · · is the Kronecker delta function and R1 is the residue of Ls 1 ⊗ ˜ 1 at s = 1. Moreover, R1 can be estimated to be O D by using the Ramanujan bound for &1 n that we have assumed. If we now assume that, for some a, #S≤ D Da
216
New directions in sieve theory
then putting everything together gives
2
an & n N 1+ an 2
n≤N ∈S≤D n≤N provided N > D and > 2a + n
This completes our discussion of the theorem.
11.6 Exercises t
1. Let C be an m × n matrix. Show that the non-zero eigenvalues of C C t and CC are equal. 2. Deduce Theorem 8.7 from
2
2inxj an e an 2
≤ N + −1
1≤j≤R 1≤n≤N 1≤n≤N where xj − xk ≥ for j = k. 3. Let C = cij be an m × n matrix with complex entries. Let & be the t largest eigenvalue of the Hermitian matrix C C. Show that n& ≥
m n
cij 2
i=1 j=1 t
By repeating the argument for CC , deduce that & minm n ≥
m n
cij 2
i=1 j=1
4. Infer from the preceding exercises that, in the large sieve inequality, the factor z2 + 4x cannot be replaced by a quantity smaller than maxx d
d≤z
5. Prove that for some constant c > 0 sin x ≥ cx
6. Let A = aij be an n × n matrix. Show that any eigenvalue & of A satisfies & − aii ≤ aij 1 ≤ i ≤ n
j=i
11.6 Exercises
217
7. Show that if (11.1) holds, then
2
an 2
an &f n x# 1+ + O x # +1
n≤x f ∈ n≤x 8. By applying the duality principle to the Turán–Kubilius inequality (see the exercises in Chapter 3), deduce that
2
k
p an − p−k an x an 2
n≤x n≤x pk ≤x n≤xpk n uniformly for all real x and all complex numbers an .
References
[1] E. Bombieri, Le grand crible dans la théorie analytique des nombres. Astérisque 18 (Société Mathématique de France, 1974). [2] E. Bombieri, J. Friedlander and H. Iwaniec, Primes in arithmetic progressions to large moduli. Acta Math., 156 (1986), 203–51. [3] E. Bombieri, J. Friedlander and H. Iwaniec, Primes in arithmetic progressions to large moduli, II, Math. Ann., 277: 3 (1987), 361–93. [4] V. Brun, Über das Goldbachsche Gesetz und die Anzahl der Primzahlpaare. Archiv for Math. og Naturvid., B34: 8 (1915), 19 pages. [5] V. Brun, Le crible d’Eratostène et le théorème de Goldbach. Videnskaps. Skr., Mat.-Naturv. Kl. Kristiana, no. 3 (1920), 36 pp. [6] D. Bump, Automorphic Forms and Representations, (Cambridge: Cambridge University Press, 1997). [7] A. C. Cojocaru, E. Fouvry and M. Ram Murty, The square sieve and the Lang– Trotter conjecture. Can. J. Math. (to appear). [8] H. Davenport, Multiplicative Number Theory, 3rd edn (New York: Springer Verlag, 2000). [9] J.-M. Deshouillers and H. Iwaniec, Kloosterman sums and Fourier coefficients of cups forms. Inv. Math., 70 (1982), 219–88. [10] W. Duke and E. Kowalski, A problem of Linnik for elliptic curves and mean value estimates for automorphic representations. Inv. Math., 139 (2000), 1–39. [11] P. D. T. A. Elliott, Probabilistic Number Theory I: Mean Value Theorems, New York, Berlin: Springer-Verlag, 1980. [12] P. D. T. A. Elliott, Probabilistic Number Theory II: Central Limit Theorems, New York, Berlin: Springer-Verlag, 1980. [13] P. D. T. A. Elliott and H. Halberstam, A conjecture in prime number theory. Symp. Math., 4 (1968/9), pp. 59–72. [14] C. Elsholtz, Some remarks on the additive structure of the set of primes. In Number Theory for the Millennium I, (Natick, MA: A. K. Peters, 2002), pp. 419–27. [15] V. Ennola, On numbers with small prime divisors. Ann. Acad. Sci. Fenn. Ser. AI, 440 (1969), 16 pp. [16] P. Erd¨os and C. Pomerance, On the normal number of prime factors of n. Rocky Mountain J., 15 (1985), 343–52. [17] Jody Esmonde and M. Ram Murty, Problems in Algebraic Number theory, (New York: Springer-Verlag, 1999).
218
References
219
[18] E. Fouvry, Autour du théorème de Bombieri-Vinogradov. Acta Math., 152: 3–4 (1984), 219–44. [19] E. Fouvry, M. Nair and G. Tenenbaum, L’ensemble exceptionnel dans la conjecture de Szpiro. Bull. Soc. Math. France, 120: 4 (1992), 483–506. [20] P. X. Gallagher, The large sieve and probabilistic Galois theory. Analytic Number Theory, Proc. Symp. Pure Math., 24, (AMS, 1973), 91–101. [21] P. X. Gallagher, A larger sieve. Acta Arithm., 18 (1971), 77–81. [22] S. Gelbart, Automorphic Forms on Adele Groups (Princeton, NJ: Princeton University Press, 1975). [23] R. Gupta and M. Ram Murty, A remark on Artin’s conjecture. Inventiones Math., 78 (1984), 127–30. [24] D. R. Heath-Brown, The square sieve and consecutive squarefree numbers. Math. Ann., 266 (1984), 251–9. [25] D. R. Heath-Brown, Artin’s conjecture for primitive roots. Q. J. Math., Oxford, 37: 2 (1986), 27–38. [26] L. K. Hua, Introduction to Number Theory (Berlin: Springer-Verlag, 1982). [27] H. Halberstam and H. E. Richert, Sieve Methods, London Mathematical Society monographs, No 4, (London, New York: Academic Press, 1974). [28] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers (Oxford, 1960). [29] M. Harper and M. Ram Murty, Euclidean rings of integers. Can. J. Math., 56 (2004), 71–6. [30] M. Hindry and J. Silverman, Introduction to Diophantine Geometry (New York: Springer-Verlag, 2001). [31] C. Hooley, Applications of Sieve Methods, (Cambridge: Cambridge University Press, 1976). [32] K. Ireland and M. Rosen, A Classical Introduction to Modern Number Theory, 2nd edn (New York: Springer-Verlag, 1998). [33] H. Iwaniec, Fourier coefficients of cups forms and the Riemann zeta function. Sém. Th. Nb. Bordeaux, exposé no. 18, (1979–1980) 36 pp. [34] H. Iwaniec, Rosser’s sieve. Acta Arith., 36 (1980), 171–202. [35] W. Li, Number Theory with Applications (Singapore: World Scientific, 1996). [36] Yu. V. Linnik, The large sieve. Dokl. Akad. Nauk. SSSR, 30 (1941), 292–4. [37] Yu-Ru Liu and M. Ram Murty, The Turán sieve and some of its applications, J. Ramanujan Math. Soc., 13:2 (1999), pp. 35–49. [38] Yu-Ru Liu and M. Ram Murty, A weighted Turán sieve method, to appear. [39] J. Merlin, Sur quelques théorèmes d’Arithmetique et un énoncé qui les contient. C.R. Acad. Sci.Paris, 153 (1911), 516–18. [40] J. Merlin, Un travail de Jean Merlin sur les nombres premiers. Bull. Sci. Math., 39:2 (1915), 121–36. [41] H. Montgomery, Topics in Multiplicative Number Theory, (New York: Springer Verlag, 1971). [42] H. L. Montgomery and R. C. Vaughan, The large sieve. Mathematika, 20 (1973), 119–34. [43] Y. Motohashi, An induction principle for the generalizations of Bombieri’s prime number theorem. Proc. Japan Acad., 52:6 (1976), 273–75. [44] M. Ram Murty, Artin’s conjecture for primitive roots, Math. Intelligencer, 10:4, (1988), 59–67. [45] M. Ram Murty, Problems in Analytic Number Theory, (New York: SpringerVerlag, 2001).
220
References
[46] M. Ram Murty, Sieving using Dirichlet series. In Current Trends in Number Theory, ed. S. D. Adhikari, S. A. Katre and B. Ramakrishnan. (New Delhi: Hindustan Book Agency, 2002), pp. 111–24. [47] M.Ram Murty, Sieve methods, Siegel zeros and Sarvadaman Chowla. In Connected at Infinity, Texts Read. Math., 25 (2003), pp. 18–35. [48] M. Ram Murty and V. Kumar Murty, Prime divisors of Fourier coefficients of modular forms. Duke Math. Journal, 51 (1984), 57–76. [49] M. Ram Murty and V. Kumar Murty, An analogue of the Erd¨os–Kac theorem for Fourier coefficients of modular forms. Indian J. Pure Applied Math., 15:10 (1984), 1090–101. [50] M. Ram Murty and F. Saidak, Non-abelian generalizations of the Erd¨os–Kac theorem. Can. J. Math., 56 (2004), 356–72. [51] M. Ram Murty and N. Saradha, On the sieve of Eratosthenes. Can. J. Math., 39:5 (1987), 1107–122. [52] M. Ram Murty and N. Saradha, An asymptotic formula by a method of Selberg. C.R. Math. Rep. Acad. Sci. (Canada), XV:6 (1993), 273–77. [53] B. M. Nathanson, Additive Number Theory. The Classical Bases (New York: Springer-Verlag, 1996). [54] D. A. Raikov, Generalisation of a theorem of Ikehara–Landau (in Russian). Mat. Sbornik, 45 (1938), 559–68. [55] S. Ramanujan, A proof of Bertrand’s postulate. J. Indian Math. Soc., 11 (1919), 181–2. [56] R. Rankin, Contributions to the theory of Ramanujan’s function n and similar arithmetical functions, III – A note on the sum of the Fourier coefficients of integral modular forms. Proc. Camb. Phil. Soc., 36 (1940), 150–1. [57] A. Selberg, Bemerkungen über eine Dirichletsche Reihe, die mit der Theorie der Modulformen nahe verbunden ist. Archiv for Mathematik og Naturvidenskab B., 43:4 (1940), 47–50 (see also A. Selberg, Collected Papers, vol. 1, pp. 38–41, Springer-Verlag, 1989). [58] A. Selberg, An elementary proof of the prime number theorem. Ann. Math., 50 (1949), 305–13. [59] A. Selberg, The general sieve method and its place in prime number theory, Proc. Int. Congress of Mathematicians Cambridge, MA (Providence, RI. American Mathematical Society, 1950), vol. 1, pp. 289–262. [60] A. Selberg, Lectures on sieves, Collected Papers, vol. 2. (Berlin: SpringerVerlag, 1991), pp. 66–247. [61] J.-P. Serre, A Course in Arithmetic, (Springer, 1973). [62] J.-P. Serre, Majorations de sommes exponentielles. Journées Arith. Caen, Astérisque, 41–42 (1977), 111–26. [63] J.-P. Serre, Divisibilité de certaines fonctions arithmétiques’, L’Ens. Math., 22 (1976), 227–60 (see also Collected Papers, vol. 3, pp. 250–83, Springer-Verlag, 1986). [64] J.-P. Serre, Quelques applications du théorème de densité de Chebotarev. Pub. Math. I.H.E.S., 54 (1981), 123–201. [65] J.-P. Serre, Spécialisation des éléments de Br2 T1 Tn . C. R. Acad. Sci., Paris, 311 (1990), 397–422 (see also Collected Papers, vol. 4, pp. 194–199 Springs-Verlag, 1986). [66] G. Shimura, Introduction to the Arithmetic Theory of Automorphic Functions, (Princeton, NJ: Iwanami Shoten and Princeton University Press, 1971). [67] C. L. Siegel, The integer solutions of the equation y2 = axn + bxn−1 + · · · + k. J. London Math. Soc., 1 (1920), 66–8.
References
221
[68] G. Tenenbaum, Introduction to Analytic and Probabilistic Number Theory (Cambridge: Cambridge University Press, 1995). [69] E.C. Titchmarsh, A divisor problem. Rend. Circ. Mat. Palermo, 54 (1930), 414–29. [70] R. C. Vaughan, A note on Schnirelman’s approach to Goldbach’s problem. Bull. London Math. Soc., 8:3 (1976), 245–250. [71] R. C. Vaughan, On the estimation of Schnirelman’s constant. J. reine Angew. Math., 290 (1977), 93–108. [72] P. Weinberger, On Euclidean rings of algebraic integers. Proc. Symp. Pure Math., 24 (1973), 321–32. [73] A. Wiles, Modular elliptic curves and Fermat’s last theorem. Ann. Math. 141:2 (1995), 443–551. [74] R. Wilson, The Selberg sieve for a lattice. In Combinatorial Theory and its Applications, III, Proc. Colloq. Ballatonfüred, 1969 (Amsterdam: North-Holland, 1970), pp. 1141–9. [75] A. Wintner, On the prime number theorem. Amer. J. Math, 64 (1942), 320–26.
Index
O-notation, 1 ·, 43, 59, 110 x z, 63, 68, 69, 76, 78, 79, 82, 85, 113, 117, 133 x y, 29, 30, 60, 68, 69, 76, 78, 79 -notation, 2 -notation, 1 -notation, 1 ·, 2, 32–35, 37, 43, 44, 59, 109 x, 5, 11, 12, 59, 62, 64, 69, 76, 79, 85, 110, 113, 117 x k a, 38, 76, 109, 112, 124, 125, 130, 133, 147, 152, 167 K x, 36 x, 7, 11, 12, 148 x , 147, 152 x k a, 147, 152 ∼-notation, 2 x, 5, 11 d·, 10, 11, 51, 58, 59, 109, 157, 172, 174, 175, 188, 189 o-notation, 2 additive arithmetical function, 45 analytic density, 124, 130 arithmetical function, 13 Artin, 195 Artin L-function, 43 Artin’s conjecture on primitive roots, 172, 195 Balasubramanian, 105 Barban-Davenport-Halberstam theorem, 147 Bertrand’s postulate, 6, 12 Bombieri, 135, 156, 167, 177, 178
Bombieri-Vinogradov theorem, 39, 40, 42, 44, 125, 135, 156, 157, 167, 171, 173, 175, 191, 195 Brun, 15, 73, 80, 88, 90, 177 Brun’s pure sieve, 63, 81, 82, 86, 113 Brun’s sieve, 3, 73, 93, 98, 100, 102, 109–112, 127, 129, 201 Brun’s theorem, 73, 74 Brun-Titchmarsh theorem, 109, 125, 127, 174 Buchstab’s function, 78 Buchstab’s identity, 29, 30, 78, 79, 89 Buniakowski’s conjecture, 38, 124 Bushnell, 215 Cauchy-Schwarz inequality, 18, 27, 45, 46, 104, 106, 112, 141, 144, 146, 152, 154, 162, 164, 174, 191, 202, 206 Chebotarev density theorem, 58 Chebycheff, 1, 5, 9, 113 Chebycheff’s theorem, 6, 10, 12, 38, 52, 59, 70, 76 Chowla, 172 completely multiplicative function, 76, 123 cyclotomic polynomial, 19, 28, 29 Davenport, 135 de la Vallée Poussin, 6 de Polignac’s conjecture, 106 Dedekind’s theorem, 36 Deligne, 23 Deshouillers, 105, 172 Dickman, 30 Dickman’s function, 29, 30 Dirichlet, 124 Dirichlet character, 142, 151, 204, 207
222
Index Dirichlet character, orthogonality relations, 151 Dirichlet character, primitive, 143 Dirichlet character, trivial, 142 Dirichlet L-function, 26, 124, 125 Dirichlet product, 13 Dirichlet’s theorem, 38, 125 dispersion method, 172 Duke, 201, 205, 208 Elliott, 173, 201 Elliott-Halberstam conjecture, 171, 172, 200 Ennola, 78 Eratosthenes, 63, 80 Eratosthenes’ sieve, 3, 47, 63, 68, 70, 73–75, 80, 82, 85, 113, 115, 118, 127–129, 195 Erd¨os, 32, 39, 42, 43, 127 Erd¨os’ theorem on cyclic groups, 127, 131 Erd¨os’ theorem on p − 1, 39 Erd¨os-Kac theorem, 34, 38, 39, 42 Estermann, 172 Euclidean algorithm, 199 Euclidean domain, 199 Euler, 100 Euler’s constant, 11, 66, 110 Euler’s function, 12, 19, 27, 29, 83, 109, 186 Fouvry, 172, 209 Gallagher, 18, 53, 135, 167 Gauss, 6 Gauss sum, 143, 151 generalized Riemann hypothesis, 43, 125, 135, 156, 167, 172, 195, 199, 207, 208 Goldbach, 100 Goldbach number, 100 Goldbach’s conjecture, 80, 100, 102, 105, 193 Gupta, 195 Hadamard, 6, 80 Halberstam, 135, 172, 173 Hardy, 32, 34, 100, 172 Hardy-Littlewood conjecture, 172 Hardy-Ramanujan theorem, 32, 47 Harper, 199 Hasse-Minkovski theorem, 55 Heath-Brown, 195 Henniart, 215 Hilbert, 105
223
Hilbert symbol, 55, 56, 59 Hooley, 156, 172, 173, 195 Hua, 105 inclusion-exclusion principle, 15, 72, 113 Iwaniec, 172, 195, 201, 207, 208 Jacobi symbol, 21, 28 Kac, 32 Kowalski, 201, 205, 208 Lagrange, 100, 105 Landau, 80 Langlands, 213 Langlands program, 201, 207 large sieve, 15, 16, 18, 39, 40, 58, 133, 135, 139, 156, 167, 199, 201, 204, 207 large sieve inequality, 60, 105, 135, 138, 201, 203, 204, 208 large sieve, first modified inequality, 143, 149, 169 large sieve, second modified inequality, 145, 161, 163 larger sieve, 16, 18 Legendre, 63, 124 Legendre symbol, 55 Lehmer’s conjecture, 43 Linnik, 60, 105, 106, 135, 159, 172, 207, 213 Linnik’s problem for elliptic curves, 208, 209 Liouville function, 133 Littlewood, 100, 172 logarithmic integral, 36 lower bound sieve, 156, 177, 178, 186, 195, 196, 199 M¨obius function, 1, 2, 30, 74, 85, 114, 135, 175 M¨obius function, dual inversion formula, 4, 116, 119, 121, 179 M¨obius function, fundamental property, 3, 63, 81 M¨obius function, inversion formula, 3, 51, 82, 87, 119, 140, 183 M¨obius function, truncated, 81 Mann’s theorem, 105, 112 Merlin, 80 Mertens, 66 Mertens’ formula, 186, 190 Mertens’ theorem, 65
224 Montgomery, 133, 139 Mordell, 43 Motohashi, 156 multiplicative function, 41, 119, 130, 180 Murty, 42, 43, 195, 199 Nair, 209 natural density, 124, 130 normal order, 35, 37, 42–44, 59 normal order method, 32, 40, 48 Pólya-Vinogradov inequality, 143, 152, 163, 168 parity problem, 134 Parseval’s identity, 138, 151 partial summation, 4, 5 Perron’s formula, 25 Phragmén–Lindel¨of principle, 215 Pomerance, 42, 43 prime ideal theorem, 36, 38 prime number theorem, 6, 11, 12, 78, 110, 133, 148, 169 prime number theorem for primes in an arithmetic progression, 147 prime number theorem, function field analogue, 51 primitive root, 195, 197, 198, 200 principle of duality, 202, 217 Rényi, 135 radical, 3, 77, 130 Raikov, 26 Ramanujan, 1, 6, 32, 34, 43, 100 Ramanujan sum, 139, 151 Ramanujan’s conjecture, 213 Ramanujan’s function, 42 Rankin, 206 Rankin’s trick, 63, 68, 71, 80 Riemann zeta function, 26, 65, 75, 113, 168, 175 Rodriquez, 173 Romanoff’s theorem, 106 Rosser-Iwaniec sieve, 199 Roth, 135 Saidak, 42, 43 Schnirelman, 100 Schnirelman density, 101, 102, 105, 106, 112 Schnirelman’s constant, 101
Index Schnirelman’s theorem, 100, 103 Selberg, 12, 113, 114, 134, 139, 177, 206 Selberg’s formula, 12, 13 Selberg’s method, 114, 118 Selberg’s sieve, 15, 113, 119, 124, 135, 177, 179, 184, 189 Serre, 208 Siegel, 53 Siegel’s theorem, 22, 53 Siegel-Walfisz theorem, 125, 128, 130, 150, 152, 167, 170 sieving using Dirichlet series, 16, 25 smooth number, 29 square sieve, 21, 23, 53, 54 Stanley, 172 Taniyama, 201 Tauberian theorem, 25, 26 Tennenbaum, 209 the method of dyadic subdivision, 75 Titchmarsh, 172 Titchmarsh divisor problem, 156, 171–173 Turán, 32 Turán’s sieve, 32, 47, 48, 51, 53, 54, 58, 60, 113 Turán’s theorem, 34, 47 Turán-Kubilius inequality, 45, 46, 217 twin prime, 39, 40, 64, 72, 73, 99, 111, 186, 193 twin prime conjecture, 39, 72, 80, 156 Vaughan, 101, 133, 139, 157, 167 Vaughan’s identity, 159, 174 Vaughan’s method, 156 Vaughan’s theorem, 167 Vinogradov, 100, 167 Vinogradov’s hypothesis, 135, 207 von Mangoldt function, 11, 12, 29, 147, 167, 175 Walfisz, 172 Waring, 100 Waring’s problem, 100, 105 Weil’s theorem, 24, 54 Weinberger, 199 Wieferich, 105 Wiles, 209 Wilson, 113