Introduction to Elliptic Curves and Modular Forms (Graduate Texts in Mathematics 97)

Neal Koblitz Introduction to Elliptic Curves and Modular Forms Springer-Veriag New York Berlin Heidelberg Tokyo- Ne...

Author: Neal Koblitz

223 downloads 1238 Views 11MB Size Report

This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form

DOWNLOAD PDF

Neal Koblitz

Introduction to Elliptic Curves and Modular Forms

Springer-Veriag New York Berlin Heidelberg Tokyo-

Neal Koblitz Department of Mathematics University of Washington Seattle, WA 98195 U.S.A.

Editorial Board P. R. Halmos

F. W. Gehring

C. C. Moore

Managing Editor Indiana University Department of Mathematics Bloomington, Indiana 47405 U.S.A.

University of Michigan Department of Mathematics Ann Arbor, Michigan 48109 U.S.A.

University of California Department of Mathematics Berkeley, California 94720 U.S.A.

AMS Subject Classifications: 10-01, 10D12, 10H08, 10H10, 12-01, 14H45 Library of Congress Cataloging in Publication Data Koblitz, Neal Introduction to elliptic curves and modular forms. (Graduate texts in mathematics; 97) Bibliography: p. Includes index. L Curves, Elliptic. 2. Forms, Modular. 3. Numbers. Theory of. I. Title. II. Series. 84.10517 516.3'5 QA567.1(63 1984 With 24 Illustrations

©

1984 by Springer-Verlag New York Inc. All rights reserved. No part of this book may be translated or reproduced in any form without written permission from Springer-Verlag, 175 Fifth Avenue, New York, New York 10010, U.S.A. Typeset by Asco Trade Typesetting Ltd., Hong Kong. Printed and bound by R. R. Donnelley & Sons, Harrisonburg, Virginia. Printed in the United States of America. 987654321 ISBN 0-387-96029-5 Springer-Verlag New York Berlin Heidelberg Tokyo ISBN 3-540-96029-5 Springer-Verlag Berlin Heidelberg New York Tokyo

Preface

This textbook covers the basic properties of elliptic curves and modular forms, with emphasis on certain connections with number theory. The ancient "congruent number problem" is the central motivating example for most of the book. My purpose is to make the subject accessible to those who find it hard to read more advanced or more algebraically oriented treatments. At the sanie time I want to introduce topics which are at the forefront of current research. Down-to-earth examples are given in the text and exercises, with the aim of making the material readable and interesting to mathematicians in fields far removed from the subject of the book. With numerous exercises (and answers) included, the textbook is also intended for graduate students who have completed the standard first-year courses in real and complex analysis and algebra. Such students would learn applications of techniques from those courses, thereby solidifying their understanding of some basic tools used throughout mathematics. Graduate students wanting to work in number theory or algebraic geometry would get a motivational, example-oriented introduction. In addition, advanced undergraduates could use the book for independent study projects, senior theses, , and seminar work. This book grew out of lecture notes for a course I gave at the University of Washington in 1981-1982, and from a series of lectures at the Hanoi Mathematical Institute in April, 1983. I would like to thank the auditors of both courses for their interest and suggestions. My special gratitude is due to Gary Nelson for his thorough reading of the manuscript and his detailed comments and corrections. I would also like to thank Professors J. Buhler, B. Mazur, B. H. Gross, and Huynh Mui for their interest, advice and encouragement.

vi

Preface

The frontispiece was drawn by Professor A. T. Fomenko of Moscow State University to illustrate the theme of this book. It depicts the family of elliptic curves (tori) that arises in the congruent number problem. The'elliptic curve corresponding to a natural number n has branch points at 0, co, n and —n. In the drawing we see how the elliptic curves interlock and deform as the branch points + n go to infinity. Note: References are given in the form [Author year]; in case of multiple

works by the same author in the same year, we use a, b, ... after the date to indicate the order in which they are listed in the Bibliography.

Seattle, Washington

NEAL KOBLITZ

Contents

CHAPTER I

From Congruent Numbers to Elliptic Curves 1. 2. 3. 4. 5. 6. 7. 8. 9.

Congruent numbers A certain cubic equation Elliptic curves Doubly periodic functions The field of elliptic functions Elliptic curves in Weierstrass form The addition law Points of finite order Points over finite fields, and the congruent number problem

I 3 6 9 14 18 22 29 36 43

CHAPTER II

The Hasse—Weil L-Function of an Elliptic Curve

51

1. 2. 3. 4. 5. 6.

51 56 64 70 79 90

The congruence zeta-function The zeta-function of En Varying the prime p The prototype: the Riemann zeta-function The Hasse—Weil L-function and its functional equation The critical value

CHAPTER III

Modular forms 1. SL 2(7L) and its congruence subgroups 2. Modular forms for SL 2(Z)

98 98 108

viii 3. Modular forms for congruence subgroups 4. Transformation formula for the theta-function 5. The modular interpretation, and Hecke operators

Contents

124 147 153

CHAPTER IV

Modular Forms of Half Integer Weight

176

1. 2. 3. 4.

177

Definitions and examples Eisenstein series of half integer weight for r0(4) Hecke operators on forms of half integer weight The theorems of Shim ura, Waldspurger, Tunnel!, and the congruent number problem

185 202

212

Answers, Hints, and References for Selected Exercises

223

Bibliography

240

Index

245

CHAPTER .J

From Congruent Numbers to Elliptic Curves

The theory of elliptic curves and modular forms is one subject where the most diverse branches of mathematics come together: complex analysis, algebraic geometry, representation theory, number theory. While our point of view will be number theoretic, we shall find ourselves using the type of techniques that one learns in basic courses in complex variables, real variables, and algebra. A well-known feature of number theory is the abundance of conjectures and theorems whose statements are accessible to high school students but whose proofs either are unknown or, in some cases, are the culmination of decades of research using some of the most powerful tools of twentieth century mathematics. We shall motivate our choice of topics by one such theorem: an elegant characterization of so-called "congruent numbers" that was recently proved by J. Tunnell [Tunnel! 1983]. A few of the proofs of necessary results go beyond our scope, but many of the ingredients in the proof of Tunnell's theorem will be developed in complete detail. Tunnell's theorem gives an almost complete answer to an ancient problem: find a simple test to determine whether or not a given integer n is the area of some right triangle all of whose sides are rational numbers. A natural number n is called "congruent" if there exists a right triangle with all three sides rational and area n. For example, 6 is' the area of the 3-4-5 right triangle, and so is a congruent number. Right triangles whose sides are integers X, Y, Z (a "Pythagorean triple") were studied in ancient Greece by Pythagoras, Euclid, Diophantus, and others. Their central discovery was that there is an easy way to generate all such triangles. Namely, take any two positive integers a and h with a > h, draw the line in the uv-plane through the point ( 1, 0) with slope bla. Let (u, y) be the second point of intersection of this line with the unit circle (see Fig. I.1). It is not hard to show that

1.

2


%

Figure IA

u=

a2 — b 2 a2 + b2'

2ab v = 2 + b2. a

Then the integers X = a 2 b 2 , Y = 2ab, Z = a2 + b 2 are the sides of a right triangle; the fact that X 2 + /72 = Z 2 follows because u2 + y2 = 1. By letting a and b range through all positive integers with a > b, one gets all possible Pythagorean triples (see Problem 1 below). Although the problem of studying numbers n which occur as areas of rational right triangles was of interest to the Greeks in special cases, it seems that the congruent number problem was first discussed systematically by Arab scholars of the tenth century. (For a detailed history of the problem of determining which numbers are "congruent", see [L. E. Dickson 1952, Ch. XVI] ; see also [Guy 1981, Section D27].) The Arab investigators preferred to rephrase the problem in the following equivalent form : given n, can one find a rational number x such that x2 ± n and x2 — n are both squares of rational numbers? (The equivalence of these two forms of the congruent number problem was known to the Greeks and to the Arabs; for a proof of this elementary fact, see Proposition 1 below.) Since that time, some well-known mathematicians have devoted considerable energy to special cases of the congruent number problem. For example, Euler was the first to show that n =7 is a congruent number. Fermat showed that n=1 is not; this result is essentially equivalent to Fermat's Last Theorem for the exponent 4 (i.e., the fact that r + y4 = z4 has no nontrivial integer solutions). It eventually became known that the numbers 1, 2, 3, 4 are not congruent numbers, but 5, 6, 7 are. However, it looked hopeless to find a straightforward criterion to tell whether or not a given n is congruent. A major advance in the twentieth century was to place this problem in the context of the arithmetic theory of elliptic curves. It was in this context that Tunnell was able to prove his remarkable theorem. —

§1. Congruent numbers

3

Here is part of what Tunnell's theorem says (the full statement will be given later):

Theorem (Tunnel». Let n be an odd squarefree natural number. Consider the

two conditions: (A) n is congruent; (B) the number of triples of integers (x, y, z) satisfying 2x2 + y 2 + 8z2 = n is equal to twice the number of triples satisfying 2x 2 + y 2 + 32z 2 z-----. n. Then (A) implies (B); and, iJ. a weak form of the so-called Birch–SwinnertonDyer conjecture is true, then (B) also implies (A). The central concepts in the proof of Tunnell's theorem the Hasse–Weil L-function of an elliptic curve, the Birch–Swinnerton-Dyer conjecture, modular forms of half integer weight—will be discussed in later chapters. Our concern in this chapter will be to establish the connection between congruent numbers and a certain family of elliptic curves, in the process giving the definition and some basic properties of elliptic curves.

§1. Congruent numbers Let us first make a more general definition of a congruent number. A positive rational number r e 0 is called a "congruent number" if it is the area of some right triangle with rational sides. Suppose r is congruent, and X, Y, Z e CI are the sides of a triangle with area r. For any nonzero r Ea we can find some SE Qa such that s'r is a squarefree integer. But the triangle with sides sX, sY, sZ has area s2r. Thus, without loss of generality we may assume that r = n is a squarefree natural number. Expressed in group language, we can say that whether or not a number r in the multiplicative group 0 + of positive rational numbers has the congruent property depends only on its coset modulo. the subgroup (0 + ) 2 consisting of the squares of rational numbers; and each coset in 0 + /(0 + ) 2 contains a unique squarefree natural number r = n. In what follows, when speaking of congruent numbers, we shall always assume that the number is a squarefree positive integer. Notice that the definition of a congruent number does not require the sides of the triangle to be integral, only rational. While n = 6 is the smallest possible area of a right triangle with integer sides, one can find right triangles with rational sides having area n = 5. The right triangle with sides l. 6i, 6-.) is such a triangle (see Fig. 1.2). It turns out that n = 5 is the smallest congruent number (recall that we are using "congruent number" to mean "congruent squarefree natural number"). There is a simple algorithm using Pythagorean triples (see the problems below) that will eventually list all congruent numbers. Unfortunately, given

4

1.


■Pm..... ■••••=11■

6-23 Figure 1.2

n, one cannot tell how long one must wait to get n if it is congruent; thus, if n has not appeared we do not know whether this means that n is not a congruent number or that we have simply not waited long enough. From a practical point of view, the beauty of Tunnell's theorem is that his condition (B) can be easily and rapidly verified by an effective algorithm. Thus, his theorem almost settles the congruent number problem, i.e., the problem of finding a verifiable criterion for whether a given n is congruent. We must say "almost settles" because in one direction the criterion is only known to work in all cases if one assumes a conjecture about elliptic curves. Now suppose that X, Y, Z are the sides of a right triangle with area n. This means: X 2 + Y2 = Z 2, and -1-XY = n. Thus, algebraically speaking, the condition that n be a congruent number says that these two equations have a simultaneous solution X, Y, ZE Q. In the proposition that follows, we derive an alternate condition for n to be a congruent number. In listing triangles with sides X, Y, Z, we shall not want to list X, Y, Z and Y, X, Z separately. So for now let us fix the ordering by requiring that X < Y < Z (Z is the hypotenuse).

Proposition 1. Let n be a fixed squarefree positive integer. Let X, Y, Z, x always denote rational numbers, with X < Y < Z. There is a one-to-one correspondence between right triangles with legs X and Y, hypotenuse Z, and area n; and numbers x for which x, x + n, and x n are each the square of a rational number. The correspondence is:

x = (Z/2) 2

X, I;

x

X. =

+n

\Ix

n,

‘Ix + n + \lx n,

In particular, n is a congruent number if and only if there exists x such that x, x + n, and x n are squares of rational numbers.

First suppose that X, Y, Z is a triple with the desired properties: X 2 ± y2 = Z 2 , 1XY = n. If we add or subtract four times the second equation from the first, we obtain: (X + Y) 2 = Z 2 ± 4n. If we then divide both sides by four, we see that x = (Z/2) 2 has the property that the numbers x + n are the squares of (X + 37)/2. Conversely, given x with the desired properties, it is easy to see that the three positive rational numbers X < Y < Z given by the formulas in the proposition satisfy: XY = 2n, and X2 + Y 2 = 4x = Z 2 . Finally, to establish the one-to-one correspondence, it only remains PROOF.

5

§1. Congruent numbers

to verify that no two distinct triples X, Y, Z can lead to the same x. We leave 0 this to the reader (see the problems below). PROBLEMS

I. Recall that a Pythagorean triple is a solution (X, Y, Z) in positive integers to the equation X 2 + Y 2 == Z 2 . It is called "primitive" if X, Y, Z have no common factor. Suppose that a > b are two relatively prime positive integers, not both odd. Show that X z--- a2 b2, Y = 2ab, Z =-- a2 + b$2 form a primitive Pythagorean triple, and that all primitive Pythagorean triples are obtained in this way. —

2. Use Problem 1 to write a flowchart for an algorithm that lists all squarefree congruent numbers (of course, not in increasing order). List the first twelve distinct congruent numbers your algorithm gives. Note that there is no way of knowing when a given congruent number n will appear in the list. For example, 101 is a congruent number, but the first Pythagorean triple which leads to an area 3-2 101 involves twenty-two-digit numbers (see [Guy 1981, p. 106]). One hundred fifty-seven is even worse (see Fig. 1.3). One cannot use this algorithm to establish that some n is not a congruent number. Technically, it is not a real algorithm, only a "semialgorithm". 3. (a) Show that if I were a congruent number, then the equation x4 y4 =.-- u2 would have an integer solution with u odd. (b) Prove that 1 is not a congruent number. (Note: A consequence is Fermat's Last Theorem for the exponent 4.) —

4. Finish the proof of Proposition 1 by showing that no two triples X, Y, Z can lead to the same x. 5. (a) Find xe(Cr) 2 such that x + 5(D -F. )2 • (b) Find xe(Cr) 2 such that x + 6e(O+) 2 . 224403517704336969924557513090674863160948472041 8912332268928859588025535178967163570016480830

6803298487826435051217540 411340519227716149383203

411340519227716149383203 21666555693714761309610

Figure 1.3. The Simplest Rational Right Triangle with Area 157 (computed by D. Zagier).

1.

6


(c) Find two values x E (0 + ) 2 such that x + 210 E (0 + )2 . At the end of this chapter we shall prove that if there is one such x, then there are infinitely many. Equivalently (by Proposition 1), if there exists one right triangle with rational sides and area n, then there exist infinitely many. 6. (a) Show that condition (B) in Tunnell's theorem is equivalent to the condition that the number of ways n can be written in the form 2x 2 + y 2 + 8z2 with x, y, z integers and z odd, be equal to the number of ways n can be written in this form with z even. (b) Write a flowchart for an algorithm that tests condition (B) in Tunnell's theorem for a given n. 7. (a) Prove that condition (B) in Tunnell's theorem always holds if n is congruent to 5 or 7 modulo 8. (b) Check condition (B) for all squarefree nE.7. 1 or 3 (mod 8) until you find such an n for which condition (B) holds. (c) By Tunnell's theorem, the number you found in part (b) should be the smallest congruent number congruent to 1 or 3 modulo 8. Use the algorithm in Problem 2 to find a right triangle with rational sides and area equal to the number you found in part (b).

§2. A certain cubic equation In this section we find yet another equivalent characterization of congruent numbers. In the proof of Proposition 1 in the last section, we arrived at the equations ((X + Y)/2) 2 = (Z/2) 2 + n whenever X, Y, Z are the sides of a triangle with area n. If we multiply together these two equations, we obtain ((X 2 - Y 2)/4) 2 = (Z/2)4 — n2 . This shows that the equation 0 n2 = y2 has a rational solution, namely, u = Z/2 and v = (X 2 - Y 2)/4. We next multiply through by u2 to obtain u6 n 2u2 = (uv) 2 . If we set x = u2 = (Z/2) 2 (this is the same X as in Proposition 1) and further set y = uv = (X 2 — Y 2)Z/8, then we have a pair of rational numbers (x, y) satisfying the cubic equation: y 2 = x 3 _ n 2x. —

—

Thus, given a right triangle with rational sides X, Y, Z and area n, we obtain a point (x, y) in the xy-plane having rational coordinates and lying on the curve y2 = x3 — n2 x . Conversely, can we say that any point (x, y) with x, y e Q which lies on the cubic curve must necessarily come from such a right triangle? Obviously not, because in the first place the x-coordinate X = /4 2 = (742) 2 must lie in (0 + ) 2 if the point (x, y) can be obtained as in the last paragraph. In the second place, we can see that the x-coordinate of such a point must have its denominator divisible by 2. To see this, notice that the triangle X, Y, Z can be obtained starting with a primitive Pythagorean triple X', Y', Z' corresponding to a right triangle with integral sides X', Y', Z' and area s2n, and then dividing the sides by s to get X, Y, Z. But in a primitive

7

§2. A certain cubic equation

Pythagorean triple X' and Y' have different parity, and Z' is odd. We conclude that (1) x = (Z 12) 2 = (Z72s) 2 has denominator divisible by 2 and (2) the power of 2 dividing the denominator of Z is equal to the power of 2 dividing the denominator of one of the other two sides, while a strictly lower power of 2 divides the denominator of the third side. (For example, in the triangle in Fig. 1.2 with area 5, the hypotenuse and the shorter side have a 2 in the denominator, while the other leg does not.) We conclude that a necessary condition for the point (x, y) with rational coordinates on the curve y 2 =.-x 3 — n 2x to come from a right triangle is that x be a square and that its denominator be divisible by 2. For example, when n = 31, the point (41 2 /7 2 , 29520/7 3) on the curve y2 = x 3 — 31 2x does not come from a triangle, even though its x-coordinate is a square. We next prove that these two conditions are sufficient for a point on the curve to come from a triangle.

Proposition 2. Let (x, y) be a point with rational coordinates on the curve

y 2 = X.3 -4- n2x. Suppose that x satisfies the two conditions: (0 it is the square of a rational number and (ii) its denominator is even. Then there exists a right triangle with rational sides and arean which corresponds to x under the correspondence in Proposition 1. PROOF. Let u x e CD + . We work backwards through the sequence of steps at the beginning of this section. That is, set y = ylu, so that y 2 = y 2lx= .x 2 _ n 2 , i .e., v 2 ± n 2 = x2. Now let t be the denominator of u, i.e., the smallest positive integer such that tueZ. By assumption, t is even. Notice that the denominators of y2 and x2 are the same (because n is an integer, and y2 + n 2 = x2), and this denominator is t 4 Thus, t2y, t2n, 1 2x is a primitive Pythagorean triple, with 1 2n even. By Problem 1 of §1, there exist integers a and b such that: t2n = 2ab, 1 2 v = a 2 ..._ b2 , t 2x = a2 + b2. Then the right triangle with sides 2a11,2b1t, 2u has area 2ablt2 =---- n, as desired. The image of this triangle X = 2a/t, Y = 2b/t, Z = 2u under the correspondence in Proposition 1 is x = (Z/2) 2 2--- u2 . This proves Proposition 2. a •

We shall later prove another characterization of the points P = (x, y) on the curve y2 = x 3 — n2x which correspond to rational right triangles of area n. Namely, they are the points P = (x, y) which are "twice" a rational point P' = (x', y'). That is, P' + P' -7-- P, where " + " is an addition law for points on our curve, which we shall defin.e later. PROBLEMS ,

I. Find a simple linear change of variables that gives a one-to-one correspondence between points on ny2 = x3 + ax2 + bx + c and points on y2 =-- x 3 + anx 2 + bn2 x + cn3 . For example, an alternate form of the equation y2 = x3 -- n 2 x is the equation ny2 = x3 x. —

2. Another correspondence between rational right iriangles X, Y, Z with area 1-X Y = n and rational solutions to 372 = x' — tex can be constructed as follows.

t. From Congruent Numbers to Elliptic Curves

8

Figure 1.4 (a) Parametrize all right triangles by letting the point u = X/Z, y = YIZ on the unit circle correspond to the slope t of the line joining (— 1, 0) to this point (see Fig. 1.4). Show that

u=

(b) (c) (d)

(e)

1

1 + 1 21

v=

2s 1 + 1 2.

(Note: This is the usual way to parametrize a conic. 1f t = alb is rational, then the point (u, y) corresponds to the Pythagorean triple constructed by the method at the beginning of the chapter.) 1f we want the triangle X, Y, Z to have area n, express nIZ 2 in terms of t. Show that the point x = —nt, y = n2 (1 + t 2)IZ is on the curve y2 = x3 — n2 x. Express (x, y) in terms of X, Y, Z. Conversely, show that any point (x, y) on the curve y2 x3 — n2 x with y 0 0 comes from a triangle, except that to get points with positive x, we must allow triangles with negative X and Y (but positive area IX Y = n), and to get points with negative y we must allow negative Z (see Fig. 1.5). Later in this chapter we shall show the connection between this correspondence and the one given in the text above. Find the points on y2 = x3 — 36x coming from the 3-4-5 right triangle and all equivalent triangles (4-3-5, — 3)—(— 4)-5, etc.).

3. Generalize the congruent number prOblem as follows. Fix an angle 0 not necessarily 90°. But suppose that A = cos 0 and B = sin 0 are both rational. Let n be a squarefree natural number. One can then ask whether n is the area of any triangle with rational sides one of whose angles is O. (a) Show that the answer to this question is equivalent to a question about rational solutions to a certain cubic equation (whose coefficients depend on 0 as well as n). (b) Suppose that the line joining the point ( — 1, 0) to the point (A, B) on the unit circle has slope 1. Show that the cubic in part (a) is equivalent (by a linear change of variables) to the cubic ny2 = x(x À.)(x (1/.1)). The classical congruent number problem is, of course, the case 2= 1.

9

§3. Elliptic curves

Figure 1.5

§3. Elliptic curves The locus of points P = (x, y) satisfying y2 ----= x' — n2x is a special case of what's called an "elliptic curve". More generally, let K be any field, and let f(x)e K[x] be a cubic polynomial with coefficients in K which has distinct roots (perhaps in some extension of K). We shall suppose that K does not have characteristic 2. Then the solutions to the equation

(3.1) Y2 = f(x), where x and y are in some extension K' of K, are called the K' -points of the elliptic curve defined by (3.1). We have just been dealing with the example K = K' = Q, _fix) = x3 — n2 x. Note that this example y2 = x3 — n' x satisfies the condition for an elliptic curve over any field K of characteristic p, as long as p does not divide 2n, since the three roots 0, +n off(x) = x 3 — n2 x are then distinct. In general, if xo , yo e K' are the coordinates of a point on a curve C defined by an equation F(x, y) = 0, we say that C is "smooth" at (xo , yo) if the two partial derivatives OFfax and aF/ay are not both zero at (xo , ye). This is the definition regardless of the ground fi eld (the partial derivative of a polynomial F(x, y) is defined by the usual formula, which makes sense over any field). If K' is the field III of real numbers, this agrees with the usual condition for C to have a tangent line. In the case F(x, y) = y 2 — fix), the partial derivatives are 2yo and —f ' (x0). Since K' is not a field of characteristic 2, these vanish simultaneously if and only if yo = 0 and xo is a multiple root of fix). Thus, the curve has a non-smooth point if and only if j(x) has a multiple root. It is for this reason that we assumed distinct roots in the definition of an elliptic curve: an elliptie.curve is smooth at all of its points.

10

1. From Congruent Numbers to Elliptic Curves

In addition to the points (x, y) on an elliptic curve (3.1), there is a very important "point at infinity" that we would like to consider as being on the curve, much as in complex variable theory in addition to the points on the complex plane one throws in a point at infinity, thereby forming the "Riemann sphere". To do this precisely, we now introduce projective coordinates. By the "total degree" of a monomial xiyi we mean i + j. By the "total degree" of a polynomial Rx, y) we mean the maximum total degree of the monomials that occur with nonzero coefficients. If F(x, y) has total degree n, we de fi ne the corresponding homogeneous polynomial P(x, y, z) of three variables to be what you get by multiplying each monomial xi yi in F(x, y) to bring its total degree in the variables x, y, z up to n; in other by words,

Ax, y, z)

zz

± In our example F(x, y) = 3,2 — (x 3 — x), we have P(x, y, z) = y 2 z n2 xz 2 . Notice that F(x, y) = P(x, y, 1). - oefficients in a field K, and we are Suppose that our polynomials have C interested in triples x, y, Z G K such that Ax, y, z) = O. Notice that: (1) for any 2e K, P(Ax, Ay, Az) = AnP(x, y, z) (n = total degree of F); (2) for any nonzero i. K, P(2x, ) y, Az) = 0 if and only if P(x, y, z) = O. In particular, for z 0 we have F.(x, y, z) = 0 if and only if F(xlz, yiz) = O. Because of (2), it is natural to look at equivalence classes of triples x, y, z c K, where we say that two triples (x, y, z) and (x', y', z') are equivalent if there exists a nonzero Ae K such that (x', y', z') = A(x, y, z). We omit the trivial triple (0, 0, 0), and then we define the "projective plane Pi" to be the set of all equivalence classes of nonirivial triples. No normal person likes to think in terms of "equivalence classes", and fortunately there are more visual ways to think of the projective plane. Suppose that K is the field fl of real numbers. Then the triples (x, y, z) in an equivalence class all correspond to points in three-dimensional Euclidean space lying on a line through the origin. Thus, P oi can be thought of geometrically as the set of lines through the origin in three-dimensional space. Another way to visualize P oi is to place a plane at a distance from the origin in three-dimensional space, for example, take the plane parallel to the xy-plane and at a distance 1 from it, i.e., the plane with equation z = 1. All lines through the origin, except for those lying in the xy-plane, have a unique point of intersection with this plane. That is, every equivalence class of triples (x, y, z) with nonzero z-coordinate has a unique triple of the form (x, y, 1). So we think of such equivalence classes as points in the ordinary xy-plane. The remaining triples, those of the form (x, y, 0), make up the "line at infinity". The line at infinity, in turn, can be visualized as an ordinary line (say,

§3. Elliptic curves

11

the line y = I in the xy-plane) consisting of the equivalence classes with nonzero y-coordinate and hence containing a unique triple of the form (x, 1, 0), together with a single "point at infinity" (I, 0, 0). That is, we define the projective line Plc over a fi eld K to be the set of equivalence classes of pairs (x, y) with (x, y) ,s, ( ) x, 4). Then Pi-can be thought of as an ordinary plane (x, y, 1) together with a projective line at infinity, which, in turn, consists of an ordinary line (x, I, 0) together with its point at infinity (1, 0, 0). More generally, n-dimensional projective space [PI is defined using equivalence classes of (n + 1)-tuples, and ' can be visualized as the usual space of n-tuples (x 1 , . . . , xn, 1) together With a P' at infinity. But we shall only have need of Pic and P. Given a homogeneous polynomial P(x, y, z) with coefficients in K. we can look at the solution set consisting of points (x, y, z) in Pk (actually, equivalence classes of (x, y, z)) for which i(x, y, z) = O. The points of this solution set where z 0 0 are the points (x, y, 1) for which P(x, y, 1) = F(x, y) = O. The remaining points are on the line at infinity. The solution set of f(x, y, z) = 0 is called the "projective completion" of the curve F(x, y) = O. From now on, when we speak of a "line", a "conic section'', an "elliptic curve", etc., we shall usually be working in a projective plane Pi, in which case these terms will always denote the projective completion of the usual curve in the xy-plane. For example, the line y = 117X + h will really mean the solution set to y = mx + bz in Pk ; and the elliptic curve y 2 =--- x 3 — n 2 x will now mean the solution set to y 2 z = x 3 - 17 2 X2: 2 in P. Let us look more closely at our favorite example : F(x, y) = y 2 — .v 3 + 11 2 X , P(x, y, z) = y 2 z — x 3 + n2 xz 2 . The points at infinity on this elliptic curve are the equivalence classes (x, y, 0) such that 0 = P(x, y, 0) = — x 3 . i.e., X -=-- O. There is only one such equivalence class (0, 1, 0). Intuitively, if we take .K = 0'1, we can think of the curve y2 -L---- x 3 — n2 x heading off in an increasingly vertical direction as it approaches-the line at infinity (see Fig. 1.6). The points on the line at infinity correspond to the lines through the origin in the xy-plane, i.e., there is one for every possible slope yix of such a line. As we move far out along our elliptic curve, we approach slope y/x = co, corresponding to the single point (0, 1, 0) on the line at infinity. Notice that any elliptic curve y2 =f(x) similarly contains exactly one point at infinity (0, 1, 0). All of the usual concepts of calculus on curves F(x, y) = 0 in the xy-plane carry over to the corresponding projective curve Ax, y, z) = O. Such notions as the tangent line at a point, points of inflection, smooth and singular points all depend only upon what is happening in a neighborhood of the point in question. And any Point in PŒI has a large neighborhood which looks like an ordinary plane. More precisely, if we are interested in a point with nonzero z-coordinate, we can work in the usual xy-plane, where the curve has equation F(x, y) = P(x, y, 1) = 0. If we want to examine a point on the line z = 0, however, we put the triple in either the form (x, 1, 0) or (1, y, 0). In the former case, we think of it as a point on the curve F(x , 1, z) = 0

12

L From Congruent Numbers to Elliptic Curves

Figure 1.6

in the xz-plane; and in the latter case as a point on the curve F(1, y, z) = 0 in the yz-plane. For example, near the point at infinity (0, 1, 0) on the elliptic curve n2 xz' = O. )72 z x 3 ± n2 xz' , all points have the form (x, 1, z) with z The latter equation, in fact, gives us all points on the elliptic curve except for the three points (0, 0, 1), +n, 0, 1) having zero y-coordinate (these are the three "points at infinity" if we think in terms of xz-coordinates). PROBLEMS

Az) y, z] satisfies F(Ax, I . Prove that if k is an infinite field and F(x, y, Z)E F(x, y, z) for all 2, x, y, ZE K, then F is homogeneous, i.e., each monomial has total degree n. Give a counterexample if K is finite. 2. By a "line" in P we mean either the projective completion of a line in the xy-plane or the line at infinity. Show that a line in Pi has equation of the form ax + by + cz = 0, with a, b, ce K not all zero; and that two such equations determine the same line if and only if the two triples (a, b, e) differ by a multiple. Construct a 1-to -i correspondence between lines in a copy of Pi with coordinates (x, y, z) and points in another copy of Pi with coordinates (a, b, c) and between points in the xyz-projective plane and lines in the abc-projective plane, such that a bunch of points are on the same line in the first projective plane if and only if the lines that correspond to them in the second projective plane all meet in the same point. The xyz-projective plane and the abc-projective plane are called the "duals" of each other.

13

§3. Elliptic curves 3. How many points at infinity are on a parabola in N? an ellipse? a hyperbola?

4. Prove that any two nondegenerate conic sections in Pf, are equivalent to one another by some linear change of variables.

5. (a) If "(x, y, z)E K[X, y, z] is homogeneous of degree n, show that

a

afr- a

X - + y — + z— = 'Ir. az ax ay (b) If K has characteristic zero, show that a point (x, y, z)EP ii is a non-smooth point on the curve C: f(x, y, z) = 0 if and only if the triple (5i-70x, OPIey, aPlaz)is (0, 0, 0) at our particular (x, y, z). Give a counterexample if char K 0 O. In what follows, suppose that char K = 0, e.g., K = R. (c) Show that the tangent line to C at a smooth point (xo , y o , z o) has equation ax + by + cz = 0, where

(xo , yo , :0)

ay 1(x0 , yo . z o ) of-

,

(d) Prove that the condition that (x, y, z) be a smooth point on C does not depend upon the choice of coordinates, i.e., it does not change if we shift to x'y'z' coordinates, where (x' y' z') = (x y z)A with A an invertible 3 x 3 matrix. For example, if more than one of the coordinates are nonzero, it makes no difference which we choose to regard as the "z-coordinate", i.e., whether we look at C in the xy-plane, the xz-plane, or the yz-plane. (e) Prove that the condition that a given line I be tangent to C at a smooth point (x, y, z) does not depend upon the choice of coordinates. -

6. (a) Let P1 = (x 1 , Yi , z 1 ) and P2 = (X2 1 y2, z2) be two distinct points in Pi. Show that the line joining P1 and P2 can be given in parametrized form as sP, + tP2 , i.e., {(sx, + tx 2 , sy i + 1y2 , sz i + tz2)Is, t E K 1. Check that this linear map takes Pk (with coordinates s, t) bijectively onto the line P1 P2 in P. What part of the line do you get by taking s = 1 and letting t vary? (b) Suppose that K = R or C. If the curve F(x, y) = 0 in the xy-plane is smooth at pi = (x i , Yi) with nonvertical tangent line, then we can expand the implicit function y = f(x) in a Taylor series about x = x l . The linear term gives the tangent line. If we subtract off the linear term, we obtain f(x) — yi — f' (x 1 )(x — x 1 ) = a,,,(x — x1 )" + • • • , where a„, 0 0, m ..>. 2. m iscalled the "order of tangency". We say that (x / , Yi) is a point of in fl ection ifm > 2, i.e.,.f" (x 1 ) = O. (In the case K = [R, note that we are not requiring a change in concavity with this definition, e.g., y = x4 has a point of inflection at x = O.) Let P, = (x i , y 1 , z 1 ), z 1 # 0, and let / = P1 P2 be tangent to the curve F(x, y) = f(x, y, 1) at the smooth point P1 • Let P2 = (x2, y2 , z2). Show that in is the lowest power of f that occurs in P(x i + tx 2 , y 1 + ty 2 , z, + tz 2)e KN. (c) Show that m does not change if we make a linear change of variables in PZ . For example, suppose that Yi and z 1 are both nonzero, and we use the xz-plane instead of the xy-plane in parts (a) and (b). 7. Show that the line at infinity (with equation z =-- 0) is tangent to the elliptic curve y2 =f(x) at (0, 1, 0), and that the point (0, 1, 0) is a point of inflection on the curve.

14


§4. Doubly periodic functions Let L be a lattice in the complex plane, by which we mean the set of all integral linear combinations of two given complex numbers w 1 and co 2 , where w 1 and w 2 do not lie on the same line through the origin. For example, if w i = i and w 2 = I, we get the lattice of Gaussian integers fini + n1m,, ne Z}. It will turn out that the example of the lattice of Gaussian integers is intimately related to the elliptic curves y2 = x3 — n2 x that come from the congruent number problem. The fundamental parallelogram for oh , w2 is defined as

= {aw l + 170) 2 10

a

1,O b

Since oh , (11 2 form a basis for C over R, any number x e C can be written in the form x = aw l + bw 2 for some a, bE R. Then x can be written as the sum of an element in the lattice L = { niw l + nw2 } and an element in II, and in only one way unless a or b happens to be an integer, i.e., the element of H happens to lie on the boundary an. We shall always take w 1 , w 2 in clockwise order; that is, we shall assume that o 1 /w 2 has positive imaginary part. Notice that the choice of w 1 , w 2 giving the lattice L is not unique. For example, (di = w 1 + w2 and w2 give the same lattice. More generally, we can obtain new bases al , ol2 for the lattice L by applying a matrix with integer entries and determinant I (see Problem l below). For a given lattice L, a meromorphic function on C is said to be an elliptic function relative to L iff(z + I) =J(z) for all I e L. Notice that it suffices to check this property for I = wi and / = w 2 . In other words, an elliptic function is periodic with two periods oh and co2 . Such a function is determined by its values on the fundamental parallelogram 11 ; and its values on opposite points of the boundary of H are the same, i.e., f(aco l + w2) = f(aw i ), f(w i + bw2) f(bw 2). Thus, we can think of an elliptic function f(z) as a function on the set ri with opposite sides glued together. This set (more precisely, "complex manifold") is known as a "torus". It looks like a donut. Doubly periodic functions on the complex numbers are directly analogous to singly periodic functions on the real numbers. A function f(x) on R which satisfies f(x nw) = f(x) is determined by its values on the interval [0, a]. Its values at 0 and w are the same, so it can be thought of as a function on the interval [0, co] with the endpoints glued together. The "real manifold" obtained by gluing the endpoints is simply a circle (see Fig. 1.7). Returning now to elliptic functions for a lattice L, we let ej denote the set of such functions. We imn) ,2(1 1v see that L is a subfield of the field of all meromorphic functions. • sum, difference, product, or quotient of two elliptic functions is elliptic. In addition, the subfield 6/, is closed under differentiation. We now prove a sequence of propositions giving some very special properties which any elliptic function must have. The condition that a meromorphic function be doubly periodic turns out to be much more

15

§4. Doubly periodic functions

o Figure 1.7

restrictive than the analogous condition in the real case. The set of realanalytic periodic functions with given period is much "larger" than the set 67i, of elliptic functions for a given period lattice L. Proposition 1 A function f(z)eeL , L = { rmo,. + no) 2 }, which has no pole in the fundamental parallelogram H must be a constant. PROOF. Since I-1 is compact, any such function must be bounded on H, say by a constant M. But then i f(z)i < M for all z, since the values of f(z) are determined by the values on fl. By Liouville's theorem, a meromorphic 0 function which is bounded on all of C must be a constant. Proposition 4. With the same notation as above, let a + H denote the translate

of ri by the complex number a, i.e., {a + zlze II}. Suppose that f(z)E 6, has no poles on the boundary C of a --I- IL Then the sum of the residues off(z) in a + fl is zero. PROOF.

By the residue theorem, this sum is equal to

1 — f f(z)dz. 21ri c But the integral over opposite sides cancel, since the values off(z) at corresponding points are the same, while dz has opposite signs, because the path of integration is in opposite directions on opposite sides (see Fig. 1.8). Thus, the integral is zero, and so the sum of residues is zero. o Notice that, since a meromorphic function can only have finitely many poles in a bounded region, it is always possible to choose an a such that the boundary of a + ri misses the poles of f(z). Also note that Proposition 4 immediately implies that a nonconstant f(z) e must have at least two poles (or a multiple pole), since if it had a single simple pole, then the sum of residues would not be zero.

e,

1.

16


+

w2

Figure 1.8 Proposition 5. Under the conditions of Proposition 4, suppose that f(z) has no zeros or poles on the boundary of a + H. Let {m i} be the orders of the various m i = I rip zeros in a + n, and let {n i} be the orders of the various poles. Then

PROOF. Apply Proposition 4 to the elliptic function f ' (z)/f(z). Recall that the logarithmic derivative f ' (z)If(z) has a pole precisely where f(z) has a zero or pole, such a pole is simple, and the residue there is equal to the order of zero or pole of the original f(z) (negative if a pole). (Recall the argument: If + • •• , and so f (z) f(z) f(z) = c „i (z a)tm + • • - , then f ' (z) = cdn(z — = m(z ay' + • .) Thus, the sum of the residues off (z)lf(z) is I'm ni = O. We now define what will turn out to be a key example of an elliptic function relative to the lattice L = {nuo i + no)2 }. This function is called the Weierstrass p-function. It is denoted p(z; L) or (z; (0 1 , 0) 2) 1 or simply go(z) if the lattice is fixed throughout the discussion. We set

1 (z) = go(z; L) raz2

(4.1)

Proposition 6. The sum in (4.1) converges absolutely and uniformly for z in any compact subset of C — L. PROOF. The sum in question is taken over a two-dimensional lattice. The proof of convergence will be rather routine if we keep in mind a onedimensional analog. If instead of L we take the integers Z, and instead of reciprocal squares we take reciprocals, we obtain a real function f(x) = + E 1 17 + f, where the sum is over nonzero /€ Z. To prove absolute and uniform convergence in any compact subset of Z, first write the summand as x/(/(x — , and then use a comparison test, showing that the series in question basically has the same behavior as l. More precisely, use the following lemma: if Ili is a convergent sum of positive terms (all our sums being over nonzero /€ Z), and if Eff (x) has the property that 1 i(x)/b,1 approaches a finite limit as I + co, uniformly for x in some set, then the sum Ef(x) converges absolutely and uniformly for x in that set. The details

17

§4. Doubly periodic functions

are easy to fill in. (By the way, our particular example off(x) can be shown to be the function it cot nx; just take the logarithmic derivative of both sides of the infinite product for the sine function: sin nx t---- irx11 1 (1 —

.x21n 2).) The proof of Proposition 6 proceeds in the same way. First write the summand over a common denominator:

1 (z _

1 _ 2z — z 2// 02 /2 - (z _ 02/ •

Then show absolute and uniform convergence by comparison with 1/I', where the sum is taken over all nonzero /EL. More precisely, Proposition 6 will follow from the following two lemmas.

Lemma 1. If E bi is a convergent sum of positive terms, where the sum is taken over all nonzero elements in the lattice L, and if E f(z) has the property that I fi(z)11, 1 1 approaches a finite limit as ill --* co, uniformly for z in some subset of C, then the sum E J;(z) converges absolutely and uniformly for z in that set. Lemma

2. E Ili' converges if s> 2.

The proof of Lemma 1 is routine, and will be omitted. We give a sketch of the proof of Lemma 2. We split the sum into sums over / satisfying n — 1 2 and conditionally if s = 2 .,,,€z

(in the latter case, take the sum over m and n in nondecreasing order of Imco l + (b) Express Fs(co 1 , to 2) in terms of the values of (z; w 1 , (o-,) or a suitable derivative evaluated at values of z e 11 for which Nz € L (see Problem 2(b)).

4. Show that for any fixed u, the elliptic function p(z) — u has exactly two zeros (or a single double zero). Use the fact that p' (z) is odd to show that the zeros of 0 '(z) are precisely co 1 /2, (1)2/2, and (co l + 0) 2)/2, and that the values e l = 0(0 1 /2), e 2 = 0(°2/2), e3 = p((ca l + (.0 2 )/2) are the values of u for which p (z) — u has a double zero. Why do you know that e 1 ,e2 ,e3 are distinct? Thus, the Weierstrass 0-function gives a two-to-one map from the torus (the fundamental parallelogram fl with opposite sides glued together) to the Riemann sphere C k.) lac } except over the four "branch points" e l , e2 , e3 , co, each of which has a single preimage in C/L.

5. Using the proof of Proposition 9, without doing any computations, what can you say about how the second derivative 0 "(z) can be expressed in terms of p(z)?

1.

22


§6. Elliptic curves in Weierstrass form As remarked at the end of the last section, from the proof of Proposition 9 we can immediately conclude that the square of go' (z) is equal to a cubic polynomial in go (z). More precisely, we know that p`(z) 2 has a double zero at co 1 /2, w 2 /2, and (w 1 + co2)12 (see Problem 4 of §5). Hence, these three numbers are the a i 's, and we have

= C(P(z) — 0 (w1/ 2))(k) (z) — 0 (0)2/2))(0(z) — Sto U 0) 1 + (0 2)/2))

= C(p(z) — e 1 )(0(z) — e 2)(p(z)— e 3), where C is some constant. It is easy to find C by comparing the coefficients of the lowest power of z in the Laurent expansion at the origin. Recall that p (z) — z -2 is continuous at the origin, as is o' (z) + 2z 3 . Thus, the leading term on the left is (-2z 3) 2 = 4z -6 , while on the right it is C(z 2) 3 =I Cz'. We conclude that C z----- 4. That is, p(z) satisfies the differential equation '(z ) 2

= f(p(z)),

where f(x) = 4(x — e i )(x — e 2)(x — e 3) E C[x]. (6.1)

Notice that the cubic polynomial f has distinct roots (see Problem 4 of §5). We now give another independent derivation of the differential equation for p(z) which uses only Proposition 3 from §4. Suppose that we can find a cubic polynomial fix) = ax3 + bx2 + cx + d such that the Laurent expansion at 0 of the elliptic function f(p(z)) agrees with the Laurent expansion of p'(z) 2 through the negative powers ofz. Then the difference pl(z) 2 —f(so(z)) would be an elliptic function with no pole at zero, or in fact anywhere else (since 0 (z) and ga'(z) have a pole only at zero). By Proposition 3, this difference is a constant; and if we suitably choose d, the constant term in f(x), we can make this constant zero. To carry out this plan, we must expand p(z) and p'(z) 2 near the origin. Since both are even functions, only even powers of z will appear. Let c be the minimum absolute value of nonzero lattice points 1. We shall take r < 1, and assume that z is in the disc of radius rc about the origin. For each nonzero leL, we expand the term corresponding to 1 in the definition (4.1) of p(z). We do this by differentiating the geometric series 1/(1 — x) = I + x + x2 + - - • and then substituting z11 for x:

z 1 z2 z3 =1-1-2-1-3-1-4—+ / /2 /3 (1 — z//)2

.

• •

If we now subtract 1 from both sides, divide both sides by / 2 , and then substitute in (4.1), we obtain

0 (z) = --i + L

/EL.

1* 0

7 k-

Z

E 2-:-., + 3 — + 4-- + L

• • • +

2

(k — i) b ik +...

23

§6. Elliptic curves in Weierstrass form

We claim that this double series is absolutely convergent for 1z1 < re, in which case the following reversal of the order of summation will be justified :

1 (z) = + 3G4 z 2 + 5G6 z4 + 7G8 z6 +

(6.2)

• • •

where for k> 2 we denote Gk = Gk(L) = Gk(C0i, (92) ci-ff

E r-k = E

/EL

t()

(6.3)

m.n

e 2 (MC° I -F MO 2 )k not both 0

(notice that the Gk are zero for odd k, since the term for I cancels the term for —/; as we expect, only even powers of z occur in the expansion (6.2)). To check the claim of absolute convergence of the double series, we write the sum of the absolute values of the terms in the inner sum in the form (recall: izi < di!):

( 3 4 21z1.I11 -3 . I + — r + 2

5 — r3 + 2

< )

2izi (1 — r) 2 1/1 3 '

and then use Lemma 2 from the proof of Proposition 6. We now use (6.2) to compute the first few terms in the expansions of (z), p ( 2) 2 , (z) 3 to' (z), and p'(z) 2 , as follows: go '(z) =

—

6G4Z

20G6Z 3 42G8Z5

---

'(z) 2 = — 24G4 12 — 80G6 + (36G1 168G8

): 2

(6.4) +•••

( 6.5)

(z) 2 = + 6G4 + 10G6 z 2 + ... •

(6.6)

(z) 3 = z--g + 9G4 -z-2-1+ 15G6 + (21G8 + 27 GI)z 2 + • • •

(6.7)

Recall that we are interested in finding coefficients a, b, c, d of a cubic f(x) = ax3 + bx2 + cx + d such that

pi(z) 2 = a (z) 3 + b p (z) 2. + c (z) + and we saw that it suffices to show that both sides agree in their expansion through the constant term. If we multiply equation (6.7) by a, equation (6.6) by b, equation (6.2) by c, and then add them all to the constant d, and finally equate the coefficients of z- 6 3 z -4 3 Z and the constant terni to the corresponding coefficients in (6.5), we obtain successively:

a = 4; Thus, c

b = 0;

—24G4 = 4(9G4 ) + c;

—80G6 = 4(1 5G(,) + d.

60G4 , d = —140G6 . It is traditional to denote

24

I.


g2 = g2(1-) cr-ff 60G4 ------- 60

lE L 1* 0

- 140G6 = 140 g 3 = g3 (L) i--5

(6.8)

:eL

1*0

We have thereby derived a second form for the differential equation (6.1):

p'(z) 2 = f(p(z)),

where f(x) ----=- 4x 3 — g 2 x — g3 e C [x]. (6.9)

Notice that if we were to continue comparing coefficients of higher powers ofz in the expansion of both sides of (6.9), we would obtain relations between the various GI, (see Problems 4-5 below). The differential equation (6.9) has an elegant and basic geometric interpretation. Suppose that we take the function from the torus CIL (i.e., the fundamental parallelogram H with opposite sides glued) to PF defined by zi—+

(,(z), p'(z), 1)

for

z

o O;

01-0(0, 1, 0).

(6.10)

The image of any nonzero point z of CI L is a point in the xy-plane (with complex coordinates) whose x- and y-coordinates satisfy the relationship y2 = f(x) because of (6.9). Here f(x)eC[x] is a cubic polynomial with distinct roots. Thus, every point z in CIL, maps to a point on the elliptic curve y2 = f(x) in Pt It is not hard to see that this map is a one-to-one correspondence between CIL and the elliptic curve (including its point at infinity). Namely, every x-value except for the roots of f(x) (and infinity) has precisely two z's such that 0 (z) = x (see Problem 4 of §5). The y-coordinates y = Oz) coming from these two z's are the two square roots off(x) = f(p(z)). If, however, x happens to be a root of fix), then there is only one z value such that p(z) = x, and the corresponding y-coordinate is y = sg'(z) = 0, so that again we are getting the solutions to y2 =fix) for our given x. Moreover, the map from C/L, to our elliptic curve inti:q is analytic, meaning that near any point of C1L it can be given by a triple of analytic functions. Near non-lattice points of C the map is given by z --+ (0(4 ga'(z), 1); and near lattice points the map is given by z 1---9 (0(z)1 0/(z), 1, 11 p'(z)), which is a triple of analytic functions near L. We have proved the following proposition. Proposition 10. The map (6.10) is an analytic one-to-one correspondence

between CI L and the elliptic curve y 2 =--- 4x3 — g2 (L)x — g 3 (L) in P. One might be interested in how the inverse map from the elliptic curve to CIL, can be constructed. This can be done by taking path integrals of dxly = (4x3 — g 2 x — g 3) -112dx from a fixed starting point to a variable endpoint. The resulting integral depends on the path, but only changes by

25

§6. Elliptic curves in Weierstrass form

e1

e2

C3

00

Figure 1.10

a "period", i.e., a lattice element, if we change the path. We hence obtain a well-defined map to CIL. See the exercises below for more details. We conclude this section with a few words about an algebraic picture that is closely connected with the geometric setting of our elliptic curve. Recall from Proposition 8 that any elliptic function (meromorphic function on the torus C/L) is a rational expression in ka(z) and p'(z). Under our one-to-one correspondence in Proposition 10, such a function is carried over to a rational expression in x and y on the elliptic curve in the xy-plane (actually, in Pb. Thus, the field C(x, y) of rational functions on the xy-plane, when we restrict its elements to the elliptic curve y 2 = fix), and then "pull back" to the torus C/L, by substituting x = 0 (z), y = so`(z), give us precisely the elliptic functions ej • Since the restriction of y2 is the same as the restriction of f(x), the field of functions obtained by restricting the rational functions in C(x, y) to the elliptic curve is the following quadratic extension of C(x): C(x)[y]l(y 2 — (4x 3 — g 2 x — g 3)). Algebraically speaking, we form the quotient ring of C(x)[y] by the principal ideal corresponding to the equation y2 7--- f(x). Geometrically, projection onto the x-coordinate gives us Fig. I.10. Two points on the elliptic curve map to one point on the projective line, except at four points (the point at infinity and the three points where y = 0), where the two "branches" are "pinched" together. In algebraic geometry, one lets the field F = C(x) correspond to the complex line P,, and the field K = C(x, y)/ y 2 — (4x 3 — g 2 x — g 3) correspond to the elliptic curve in P. The rings A = C[x] and B = C[x, AI y 2 — f(x) are the "rings of integers" in these fields. The maximal ideals in A are of the form (x a)A; they are in one to one corrèspondence with a E C. A maximal ideal in B is of the form (x — a)B + (y — b)B (where b is a square root off(a)), and it corresponds to the point (a, b) on the elliptic curve. —

-

-

K D B (x — a)B + (y — b)B

FA

(x — a)B + (y + b)B ; ; (x — a)A

(b = \Ii (a))


26

The maximal ideal (x — a)A, when "lifted up" to the ring B, is no longer prime. That is, the ideal (x — a)B factors into the product of the two ideals:

(x — a)B = ((x — a)B + (y — b)B)((x — a)B + (y + b)B). The maximal ideal corresponding to the point a on the x-line splits into two maximal ideals corresponding to two points on the elliptic curve. If it so happens that b = 0, i.e., a is a root off(x), then both of the ideals are the same, i.e., (x — a)B is the square of the ideal ((x — a)B + yB). In that case we say that the ideal (x — a)A "ramifies" in B. This happens at values a of the x-coordinate which come from only one point (a, 0) on the elliptic curve. Thus, the above algebraic diagram of fields, rings and ideals is an exact mirror of the preceding geometric diagram. We shall not go further than these ad hoc comments, since we shall not be using algebra geometric techniques in which follows. For a systematic introduction to algebraic geometry, see the textbooks by Shafarevich, Mumford, or Hartshorne. PROBLEMS

I. (a) Let L ---7-- Z[i] be the lattice of Gaussian integers. Show that g 3 (L) = 0 but that g 2 (L) is a nonzero real number. 1 + 0), be the lattice of integers in the qua-(b) Let L = Z [co], where co -= dratic imaginary field 0(\/— 3). Show that g2 (L) = 0 but that g 3 (L) is a nonzero real number. (c) For any nonzero complex number c, let cL denote the lattice obtained by multiplying all lattice elements by c. Show that g2 (eL) = c- 4 g 2 (4, and g 3(cL) = (—

c --- 6g 3 (4 .

(d) Prove that any elliptic curve y 2 = 4x3 — g2 x — g 3 with either g 2 or g 3 equal to zero, is of the form y 2 =-. 4x 3 — g2 (L)x — g 3 (L) for some lattice L. It can be shown that any elliptic curve is of that form for some lattice L. See, for example, [Whittaker & Watson 1958, §21.73]; also, we shall prove this much later as a corollary in our treatment of modular forms. 2. Recall that the discriminant of a polynomial f(x) = ao x" + al x" + • • • + an = ao (x — e i ))x — e 2). - -(x — e„) is ag-1 1-1 1 ,i(ei — ei)2 . It is nonzero if and only if the roots are distinct. Since it is a symmetric homogeneous polynomial of degree n(n — 1) in the es, it can be written as a polynomial in the elementary symmetric polynomials in the es, which are ( — Ojai/a° . Moreover, each monomial tertn rl i (adao)mi has total "weight" m 1 + 2m 2 + • • • + rum, equal to n(n — 1). Applying this tof(x) = 4x 3 — g 2 x — g 3 , we see that the discriminant is equal to a polynomial in g 2 , g 3 of weight six, i.e., it must be of the form ag3 + fig. Find a and fl by computing 42 (e 1 — e 2)2(e 1 — e 3 ) 2(e 2 — e3) 2 directly in the case g 2 = 4, g3 = 0 and the case g 2 = 0, g 3 = 4. 3. Since the even elliptic function p"(z) has a quadruple pole at zero and no other pole, you know in advance that it is equal to a quadratic polynomial in p (z). Find this polynomial in two ways: (a) comparing coefficients of powers of z; (b) differentiating p'2 = 40 3 — g2 p — g 3 . Check that your answers agree.

77

§6. Elliptic curves in Weierstrass form 4. Use either the equation for 0' 2 or the equation for 0" to prove that G8 =

4G1..

5. Prove by induction that all Gk's can be expressed as polynomials in G4 and G6 with rational coefficients, i.e., GkEO[G4, G6]. We shall later derive this fact again when we study modular forms (of which the Gk turn out to be examples). 6. Let co l = it be purely imaginary, and let co 2 = 7r. Show that as t approaches infinity . Gk (it, n) approaches 2n -k ((k), where C(s) is the Riemann zeta-function. Suppose we know that “2) = n 2/6, ;(4) = m 4/90, C(6) = e/945. Use Problem 4 to find (8). Use Problem 5 to show that n -k ((k)e 0 for all positive even inEegers k. 7. Find the limit of g2 and g3 for the lattice L .--- {mil + mr} as / --- co. 8. Show that v = csc 2 z satisfies the differential equation v' 2 = 4v 2 (v — I), and that the function V

=

2 CSC Z

— 's1

satisfies the differential equation v' 2 = 4y 3 — iv — A- What is the discriminant of the polynomial on the right? Now start with the infinite product formula for sin(nz), replace z by zin, and take the logarithmic derivative and then the derivative once again to obtain an infinite sum for csc 2 z. Then prove that .

lim p(z; it, 7r) = csc2 z — i.

i-oGo

9. The purpose of this problem is to review the function

z = log y for v complex,

in the process providing a "dry run" for the problems that follow. (a) For v in a simply connected region of the complex plane that does not include the origin, define a function z of v by:

z=

i

f u di1 —, 1

where the path from I to v is chosen arbitrarily, except that the same choice is made for all points in the region. (In other words, fix any path from I to vo , and then to go to other v's use a path from vo to y that stays in the region.) Call this function z = log v. Show that if a different path is chosen, the function changes by a constant value in the "lattice" L =--- {2/rim} ; and that any lattice element can be added to the function by a suitable change of path. (L is actually only a lattice in the imaginary axis illi, not a lattice in C.) (b) Express dzidv and dvldz in terms of v. (c) If the function y = e2 is defined by the usual series, use part (b) to show that ez is the inverse function of z = log v. (d) Show that the map ez gives a one-to-one correspondence between C/L, and C — {0}. Under this one-to-one correspondence, the additive group law in C/L becomes what group law in C — {O}? W. Let L be a fixed lattice, set g2 ;----- g2 (L), g 3 = g3 (L), so(z) = so(z; L). Let u = 1(z) be a function on a connected open region R c C which satisfies the differential equation u'' = 4u 3 g 2 u g 3 . Prove that u = ko(z + a) for some constant a. —

—

11. Let L = {mco 1 + mo 2 } be a fixed lattice, and set g2 = g2 (L), g 3 = g3 (L), 6.)(:) = go(z; L). Let R 1 be an unbounded simply connected open region in the complex plane which does not contain the roots e l , e2 , e 3 of the cubic 4x 3 g 2 x — g3. —

28

I. From Congruent Numbers to Elliptic Curves

1.- - (- - 4.-° e2

(4212

Figure 1.11

For UER,, define a function z = g(u) by co

z = g (u) =

di

ju \14t 3 — g 2 t — g 3

,

where a fixed branch of the square root is chosen as t varies in R 1 . Note that the integral converges and is independent of the path in R, from u to oo, since R, is simply connected. The function z = g(u) can be analytically continued by letting R2 be a simply connected region in C — {e 1 , e2 , e3 ) which overlaps with R 1 . If ue R2, then choose u, eR, n R2, and set z :.---- g(u) = g(u1 ) + Jut,' (4t 3 — g2 t — g3) -112 dt. This definition clearly does not depend on our choice of u, E R 1 n R2 or our path from u to u, in R2. Continuing in this way, we obtain an analytic function which is multivalued, because our sequence of regions R I , R2, R3, . . . can wind around e l , e2 , or e3 . (a) Express (dz./4) 2 and (duidz) 2 in terms of u. (b) Show that u = p(z). In particular, when we wind around e l , e2 , or e 3 the value of z can only change by something in L. Thus, z = g (u) is well defined as an element in CIL for uE C — {e l , e2 , e3 } . The function z = g(u) then extends by continuity to e 1 , e2 , e3 . (c) Let Cl be the path in the complex u-plane from e2 to co that is traced by u =-• 0 (2) as z goes from w2/2 to 0 along the side of ri (see Fig. I.11). Show that Sci (4t 3 g 2 t g 3)- wdt::---- w 2/2 for a suitable branch of the square root. (d) Let C2 be the path that goes from co to e2 along C1 , winds once around e2 , and then returns along C, to co. Take the same branch of the square root as —

—

—

in part (c), and show that lc, (4j 3 — g 2 t — g 3) -1/2dt = co2 . (e) Describe how the function z = g(u) can be made to give all preimages of u under a =--

12. (a) Prove that all of the roots e l , e2 , e3 of 4x3 — g2 x — g3 are real if and only if g2 and g 3 are real and A = g3 — 27gi > O. (b) Suppose that the conditions in part (a) are met, and we order the ei so that e2 > e3 > e l . Show that we can choose the periods of L to be given by

L. e

1 to1 = i 2

I

dt

V g 3 + g2 t — 43

and

1 2 (D2 = fe 2

di 114/. 3 — g 2 /

— g3

?

where we take the positive branch of the square root, and integrate along the real axis. (c) With these assumptions about the location of the e • on the real axis, describe how to change the path of integration and the branch of the square root in

§7. The addition

29

law

Problem 11 so as to get the other values of z for which u = p(:), namely + z rnco + nw2 . 13. Suppose that g2 = 4n2, g 3 = O. Take e 1 , e 2 , e3 so that e2 > e 3 > e 1 . What are C 1 , e2 , e 3 in this case? Show that N I = kû2 , i.e., the lattice L is the Gaussian integer lattice expanded by a factor of w2 . Show that as z travels along the straight line from w 1 /2 to w 1 /2 (02 the point (x, y) = (0(z), p'(z)) moves around the real points of the elliptic curve y2 = 4(x 3 n 2x) between —n and 0; and as : travels along the straight line from 0 to w2 the point (x, = (p(z), o'(z)) travels through all the real points of this elliptic curve which are to the right of (n, 0). Think of the "open" appearance of the latter path to be an optical illusion: the two ends are really "tied together" at the point at infinity (0, 1, 0). 14. (a) Show that

JO \/i(1

1 n—-)

n 135

inch

t) =

222

(I)) Under the conditions of Problem 12, with e2 > e3 > e 1 , set ). = e3

e2

el

E(0. I).

Derive the formula:

1 \/e 2 —

ri e 1 J0

dt t(1

1)(1 — ;d)

(c) Derive the formula w 2 = n(e 2 e i ) -112F(A), where 1 3 5 ( \ -12 ?lit F(2) = 2- -2- • • • n n= 0 [

The function F(À) is called a "hypergeometric series". (d) Show that the hypergeometric series in part (c) satisfies the differential equa— 11)F" (A) + (1 — 2),)Ft (A) — = O. tion:

§7. The addition law In the last section we showed how the Weierstrass 0-function gives a correspondence between the points of ICJ, and the points on the elliptic g 3 (L) in Pt We have an obvious addition curve y2 = f(x) = 4x 3 g2 (L)x law for points in CIL, obtained from ordinary addition of complex numbers by dividing by the additive subgroup L. i.e., ordinary addition "modulo L". This is the two-dimensional analog of "addition modulo one" in the group

R/Z. We can use the correspondence between C1L and the elliptic curve to carry over the addition law to the points on the elliptic curve. That is, to add two points P i (x i , Yi) and P2 = (-)C21 y2), by definition what we do is go back to the z-plane, find z and z 2 such that PI = (p(z 1 ), 0/(:, )) and P2 =-- (P ( : 1 z2), + : 2 )). This P2 = (P(Z2), 0/(Z2)), and then set P i is just a case of the general principle: whenever we have a one-to-one correspondence between elements of a commutative group and elements of some

1. From

30

Congruent Numbers to Elliptic Curves'

,

a +wi +w2

Figure 1.12

other set, we can use this correspondence to define a commutative group law on that other set. But the remarkable thing about the addition law we obtain in this way is that (1) there is a simple geometric interpretation of "adding" the points on the elliptic curve, and (2) the coordinates of P1 + P2 can be expressed directly in terms of X 1 , x2 , Yi' y2 by rather simple rational functions. The purpose of this section is to show how this is done. We first prove a general lemma about elliptic functions.

Lemma. Let f(z)ee L . Let H =_-- {aco i ± bco2 10 .... a, b .... 1} be a fundamental parallelogram for the lattice L, and choose a so that f(z) has no zeros or poles on the boundary of a + H. Let lai l be the zeros off(z) in a ± II, each repeated as many times as its multiplicity, and let Ibil be the poles, each occurring as many times as its multiplicity. Then Za i —Ebi E L. PROOF. Recall that the function f(z)/f(z) has poles at the zeros and poles off(z), and its expansion near a zero a of order m is mgz — a) + • • - (and

near a pole b of order —m the expansion is —m/(z — b) + • • • ). Then the function zf(z)lf(z) has the same poles, but, writing z = a 1--(z a), we see that the expansion starts out am/(z — a). We conclude that E ai — E bi is the sum of the residues of zf(z)lf(z) inside a + II. Let C be the boundary of a + H. By the residue theorem, -

E ai — E bi = 2—n i c

f(z)

—

d

We first take the integral/ over the pair of opposite sides from a to a + w2 and from a + co l to a + co l + co 2 (see Fig. 1.12). This part is equal to

Œ+(01+(»2 I 0'14-'4)2 z f (z) dz f(z) bri a

-. 1 ( rœ+(02 I = -

CO 1

)

fOE+0),

Z f(z) dz

f(z)

2ni j a

2 -1- (0 2

2ni ,

f(Z) z _f(z) dz I

a

f (2,\ f(z)

1

fœ-f(92

dz.

(z

+

(01) 11(z)

f(z)

dz)

31

§7. The addition law

Now make the change of variables u = f(z), so that f'(z)dzIf(z) ---- du/u. Let Ci be the closed path from f(a) to f(Œ + 62) 7--- f(a) traced by u = f(z) as z goes from a to a + co2 . Then I du .r+C4)2 f(Z) d.Z = 2 n i ci II ' 2iti a f(z) and this is some integer n, namely the number of times_ the closed path C1 winds around the origin (counterclockwise). Thus, we obtain — co l n for this part of our original integral. In the same way, we find that the integral over the remaining two sides of C is equal to — co2 m for some integer in. Thus, o Z ai — Ebi = —nw i — mco 2 e L, as desired. This proves the lemma. 1

We are now ready to derive the geometrical procedure for adding two points on the elliptic curve y2 =f(x) = x3 — g 2 (L)x — g 3 (L). For z in CI L, let Pz be the corresponding point P, = (so(z), so'(z), 1), Po = (0, 1, 0) on the elliptic curve. Suppose we want to add Pz , = (x1 , Yi) to Pz2 = (x2, y 2) to obtain the sum Pz.,, z ,= (x3 , y 3). We would like to know how to go from the two points to their sum directly, without tracing the points back to the

z-plane. We first treat some special cases. The additive identity is, of course, the image of z = O. Let 0 denote the point at infinity (0, I, 0), i.e., the additive identity of our group of points. The addition is trivial if one of the points Pz, have the same is 0, i.e., if z i or z 2 is zero. Next, suppose that P x-coordinate but are not the same point. This means that x2 z--- x 1 , Y2 = — .Y1 In this case z 2 = —z 1 , because only "symmetric" values of z (values wh'cii are the negatives of each other modulo the lattice L) can have the same Pz, = P0 = 0, i.e., the two points are additive so-value. In this case, P inverse to one another. Speaking geometrically, we say that two points of the curve which are on the same vertical line have sum 0. We further note that in the special situation of a point PZ i = PZ2 on the x-axis, we have y2 = — Yi = 0, and it is easy to check that we still have Pz + P 2 2P. = O. We have proved: 1 2 1 Proposition 11. The additive inverse of (x, y) is (x, —y).

piven two points Pi = Pz , = (x i , y 1 ) and P2 = Pz2 = (X2 1 y 2) on the elliptic curve y2 = 4x 3 — g2 x — g 3 (neither the point at infinity 0), there is a line 1= Pi P2 joining them. If Pi = P2 1 we take 1 to be the tangent line to the elliptic curve at P1 . If / is a vertical line, then we saw that P i + P2 = O. Suppose that / is not a vertical line, and we want to find Pi + P2 = P3 = (x 3 , y 3). Our basic claim is that — P3 = (x 3 , —y 3) is the third point of intersection of the elliptic curve with J. Write the equation of 1 = P1 P2 in the form y = mx + b. A point (x, y) on 1 is on the elliptic curve if and only if (mx + bY = f(x) = 4x 3 —_g2 x — g3 , that is, if and only if x is a ro- I. the cubic f(x) — (mx + b) 2 . This cubic

32


has three roots, each of which gives a point of intersection. If x is a double root or triple root, then / intersects the curve with multiplicity two or three at the point (x, y) (see Problem 6 of §I.3). In any case, the total number of points of intersection (counting multiplicity) is three. Notice that vertical lines also intersect the curve in three points, including the point at infinity O; and the line at infinity has a triple intersection at 0 (see Problem 7 of §I.3). Thus, any line in N intersects the curve in three points. This is a special case of Bezout's Theorem. Let P(x, y, z) and 6 (x, y, z) be homogeneous polynomials of degree m and n, respectively, over an algebraically closed field K. Suppose that P and 6 have no common polynomial factor. Then the curves in Pi defined by F and 6 have mn points of intersection, counting multiplicities.

For a more detailed discussion of multiplicity of intersection and a proof of Bezout's theorem, see, for example, Walker's book an algebraic curves [Walker 1978]. In our case F(x, y, z) = y2z — 4x 3 + g2 xz 2 + g3 z 3 and 6-(x, y, z) = y — mx — bz. Proposition 12. If P1 + P2 = P3, then — P3 is the third point of intersection of I = P1 P2 with the elliptic curve. If P1 = P2 , then by P1 P2 we mean the

tangent line at P1 . We have already treated the case when P1 or P2 is the point at infinity 0, and when P2 =--- œ P1 . So suppose that I = PI P2 has the form y = mx + b. Let P1 = Pz1 , P2 = Pz 2 . To say that a point Pz = (p(z), pi (z)) is on / means PROOF.

that pY(z) = ma(z) + b. The elliptic function 0 1(z) — m(z) — b has three poles and hence three zeros in CIL. Both z 1 and z2 are zeros. According to the lemma proved above, the sum of the three zeros and three poles is equal to zero modulo the lattice L. But the three poles are all at zero (where so'(z) has a triple pole); thus, the third zero is — (z, + z2) modulo the lattice. Hence, the third point of intersection of / with the curve is P---(z 1 +z2) -= as claimed. The argument in the last paragraph is rigorous only if the three points of intersection of 1 with the elliptic curve are distinct, in which case a zero of 0 '(z) — mg(z) — b corresponds, exactly to a point of intersection P. Otherwise, we must show that a double or triple zero of the elliptic function always corresponds to a double or triple intersection, respectively, of I with the curve. That is, we must show that the two meanings of the term "multiplicity" agree: multiplicity of zero of the elliptic function of the variable z, and multiplicity of intersection in the xy-plane. Let z 1 , z 2 , —z 3 be the three zeros of so'(z) - - mp(z)— h, listed as many times as their multiplicity. Note that none of these three points is the negative of another one, since / is not a vertical line. Since — z, , —z2 , z3 are the three

33


1 P1 + P2

Figure 1.13

zeros of so' (z) + mo(z) + b, it follows that ±z 1 , ±z 2 , +z 3 are the six 1) f(0(z)) (mso(z) + b) 2 = 44(0(z) zeros of 0 1(2)2 — (z) + (0(z) x 2)(s (z) — x3), where x l , x2 , x 3 are the root off(x) (mx + b) 2 . If, say, (z 1 ) = x1 , then the multiplicity of x 1 depends upon the number of +z2 , ±z 3 which equal +z 1 . But this is precisely the number of z 2 , which equal z 1 . Hence "multiplicity" has the same meaning in both cases. This concludes the proof of Proposition 12. Proposition 12 gives us Fig. 1.13, which illustrates the group of real points on the elliptic curve y 2 = x3 — x. To add two points Pi and P2, we draw the line joining them, find the third point of intersection of that line with the curve, and then take the symmetric point on the other side of the x-axis. It would have been possible to define the group law in this geometrical manner in the first place, and prove directly that the axioms of an abelian group are satisfied. The hardest part would have been the associative law, which would have necessitated a deeper investigation of intersections of curves. In turns out that there is some flexibility in defining the group law. For example, any one of the eight points of inflection besides the point at infinity could equally well have been Chosen as the identity. For details of this alternate approach, see [Walker 1978 ]. Qne disadvantage of our approach using 0 (z) is that a priori it only applies to elliptic curves of the form y2 = 4x 3 g2 (L)x — g 3 (L) or curves that can be transformed to that form by a linear change of variables. (Note that the geometrical description of the group law will still give an abelian group law ' after a linear change of variables.) In actual fact, as was mentioned earlier and will be' proved later, any elliptic curve over the complex numbers can be transfoirmed to the Weierstrass form for some lattice L. We already know

1.

34


that our favorite example y2 = x 3 — n 2 x corresponds to a multiple of the Gaussian integer lattice. In the exercises for this section and the next, we shall allow ourselves to use the fact that the group law works for any elliptic curve. It is not hard to translate this geometrical prôcedure into formulas expressing the coordinates (x 3 , y3) of the sum of P1 = (x 1 , y) and P2 = (x2 , y2) in terms of x, , x2 , yi , y2 and the coefficients of the equation of the elliptic curve. Although, strictly speaking, our derivation was for elliptic curves in the form y2 =f(x) = 4x3 g2 (L)x g 3 (L) for some lattice L, the procedure gives an abelian group law for any elliptic curve y2 = f(x), as remarked above. So let us take f(x) = ax 3 + bx2 4- ex ± deC[x] to be any cubic with distinct roots. In what follows, we shall assume that neither P, nor P2 is the point at infinity 0, and that P1 0 —P2 . Then the line through P1 and P2 (the tangent line at P1 if P, = P2) can be written in the form y = mx + (3, where m (y2 — y1 )/(x 2 — x 1 ) if P1 0 P2 and m = dy/c/x1 (x1 , y1 if P, = P2. In the latter case we can express m in terms of x 1 and Yi by implicitly differentiating = f(x); we find that m =f(x 1 )/2y 1 . In both cases the y-intercept is

f3 = y i — mx 1 . Then x3 , the x-coordinate of the sum, is the third root of the cubic , two of whose roots are x l , x2 . Since the sum of the three f(x) (mx + roots is equal to minusihe coefficient of x2 divided by the leading coefficient, we have: x, + x2 + x 3 = —(b m 2)/a, and hence:

\2 X 3 = -X i - X2 - -b

a

x3

+ aV I (Y22 — x i ) '

—b + 1 (f1(x1) ) 2 , a a 2y i

if P, 0 P2 ;

(7.1)

i f P1 = P2*

(7.2)

The y-coordinate y3 is the negative of the value y = mx 3 + fi, i.e., Y3 = - Y1+ M(xl

(7.3)

x3),

where x3 is given by (7.1) and (7.2), and

m = (Y2 — Yi)/(x2 — xi) if Pi m -f (x1)/2y1 if P1 '

P2;

(7.4)

P2.

If our elliptic curve is in Weierstrass form y 2 = 4x3 g2 x g 3 , then we have a = 4, b =-- 0, and f(x 1 ) = 124 g2 in the addition formulas (7.1)—(7.4). In principle, we could have simply defined the group law by means of these 'formulas, and then verified algebraically that the axioms of a commutative group are satisfied. The hardest axiom to verify would be associativity. Tedious as this procedure would be, it would h le key advantage over


35

either the complex-analytic procedure (using ga(z)) or the geometrical procedure. Namely, we would never have to use the fact that our, field K over which the elliptic curve is defined is the complex numbers, or even that it has characteristic zero. That is, We would find that our formulas, which make sense over any field K of characteristic not equal to 2, give an abelian group law. That is, if y2 =----f(x) = ax3 + bx2 + cx + d G K[x] is the equation of an elliptic curve over K, and if we define f CO = 3ax2 + 2bx + c, then any two points having coordinates in some extension of K can be added using the formulas (7.1)—(7.4). We shall make use of this fact in what follows, even though, strictly speaking, we have not gone through the tedious purely algebraic verification of the group laws. PROBLEMS

1. Let L c R be the additive subgroup fmcol of multiples of a fixed nonzero real number w. Then the function zi—qcos(274w), sin(2740)) gives a one-to-one analytic map of RI L onto the curve x2 + y 2 = I in the real xy-plane. Show that ordinary addition in R/L, carries over to a rational (actually polynomial) law for "adding" two points (x l , yi ) and (x 2 , y2) on the unit circle; that is, the coordinates of the "sum" are polynomials in x 1 , x2 , yi , y2 . Thus, the rational addition law on an elliptic curve can be thought of as a generalization of the formulas for the sine and cosine of the sum of two angles. 2. (a) Simplify the expression for the x-coordinate of 2P in the case of the elliptic curve y2 = x3 — n2x. (b) Let X, Y, Z be a rational right triangle with area n. Let P be the corresponding point on the curve y2 = x 3 — n2x constructed in the text in §1.2. Let Q be the point constructed in Problem 2 of §I.2. Show that P =-- 2Q. (c) Prove that, if P is a point not of order 2 with rational coordinates on the curve y 2 = x' — 12 2x, then the x-coordinate of 2P is the square of a rational number having even denominator. For example, the point Q ::--: ((4117)2 , 720. 4l/7 3 ) on the curve y2 = x3 — 31 2x is not equal to twice a point P having rational coordinates. (In this problem, recall: n is always squarefree.) 3. Describe geometrically: (a) the four points of order two on an elliptic curve ; (b) the nine points of order three; (c) how to find the twelve points of order four which are not of order two; (d) what the associative law of addition says about a certain configuration of lines joining points on the elliptic curve (draw a picture). 4. (a) How many points of inflection are there on an elliptic curve besides the point at infinity? Notice that they occur in symmetric pairs. Find an equation for their x-coordinates. (b) In the case of the elliptic curve 'y2 = x' — n 2x find an explicit formula for these x-coordinates. Show that they are never rational (for any n). 5. Given a point Q on an elliptic curve, how many points Pare there such that 2P = Q? Describe geometrically how to find them.

6. Show that if K is any, subfield of C containing g 2 and g 3 , then the points on the elliptic curve y2 = 4x 3 — g 2 x — g 3 whose coordiriates - are in K form a subgroup

I. From Congruent Numbers to Elliptic Curves

36

of the group of all points. More generally, show that this is true for the elliptic , curve y2 = f(x) iff(x) e K[x].

7. Consider the subgroup of all points on y2 = x 3 — n2 x with real coordinates. How many points in this subgroup are of order 2? 3? 4? Describe geometrically where these points are located. 8. Same as Problem 7 for the elliptic curve y2 = x3 — a, a e R.

9. If y2 = f(x) is an elliptic curve in which f(x) has real coefficients, show that the group of points with real coordinates is isomorphic to (a) R/Z if f(x) has only one real root; (b) R/Z x Z/21 iff(x) has three real roots. 10. Letting a approach zero in Problem 8, show that for the curve y2 = x3 the same geometric procedure for finding P1 + P2 as for elliptic curves makes the smooth points of the curve (i.e., P * (0, 0), but including the point at infinity) into an abelian group. Show that the map which takes P = (x, y) to xly (and takes the point at infinity to zero) gives an isomorphism with the additive group of complex numbers. This is called "additive degeneracy" of an elliptic curve. One way to think of this is to imagine both co l and co2 approaching infinity (in different directions). Then g 2 and g3 both approach zero, so the equation of the corresponding elliptic curve approaches y2 = 4x 3 . Meanwhile, the additive group C1L, where L = {m(0 1 + nco2 }, approaches the additive *group C, i.e., the fundamental parallelogram becomes all of C. 11. Let a —* 0 in the elliptic curve y2 ----- (x2 ___ a)(x + 1). Show that for the curve y2 = x 2 (x + 1) the same geometric procedure for fi nding P1 + P2 as for elliptic curves makes the smooth points of the curve into an abelian group. Show that the map which takes P----- (x, y) to (y— x)/(y + x) (and takes the point at infinity to 1) gives an isomorphism with the multiplicative group C* of nonzero complex numbers. This is called "multiplicative degeneracy" of an elliptic curve. Draw the graph of the real points of y 2 = x2 (x + 1), and show where the various sections go under the isomorphism with C*. One way to think of multiplicative degeneracy is to make the linear change of variables yf—>iy, x F-4 —x — -1- , so that the equation becomes y2 = 4x 3 — lx — (compare with Problem 8 of §1.6). So we are dealing with the limit as t approaches infinity of the group C/{mit + tut}, i.e., with the vertical strip Ci{rin} (rather, a cylinder, since opposite sides are glued together), and this is isomorphic to C* under the map zi—*e 2'.

§8. Points of finite order In any group, there is a basic distinction between elements of finite order and elements of infinite order. In an abelian group, the set of elements of finite order form a subgroup, called the "torsion subgroup". in the case of the group of points in P, on the elliptic curve )72 =f(x), we immediately see that a point Pz = (x, y) has finite order if and only if Nz e I, for some N, i.e., if and only if z is a rational linear combination of co l and co2 . in that case, the least such N (which is the leas t common denominator of the coefficients of oi / and co2) is the exact order of P. Under the isomorphism

§8. Points of finite order

37

from ER/Z x R/Z to the elliptic curve given by (a, b) i—' P it is the image of Q/Z -x Q/Z which is the torsion subgroup of the elliptic curve. This situation is the two-dimensional analog of the circle group, whose torsion subgroup is precisely the group of all roots of unity, i.e., all e 2 " for ze OIL Just as the cyclotomic fields—the field extensions of 0 generated by the roots of unity—are central to algebraic number theory, we would expect that the fields obtained by adjoining the coordinates of points P = (x, y) of order N on an elliptic curve should have interesting _special properties. We shall soon see that these coordinates are algebraic (if the coefficients off(x) are). This analogy between cyclotomic fields and fields formed from points of finite order on elliptic curves is actually much deeper than one might have guessed. In fact, a major area of research in algebraic number theory today consists in finding and proving analogs for such fields of the rich results one has for cyclotomic fields. Let N be a fixed positive integer. Let f(x) --= ax3 bx 2 + ex + d = a(x — e 1 )(x e 2)(x — e 3) be a cubic polynomial with coefficients in a field K of characteristic 0 2 and with distinct roots (perhaps in some extension of K). We are interested in describing the coordinates of the points of order N (i.e., exact order a divisor of N) on the elliptic curve y 2 =f(x), where these coordinates may lie in an extension of K. If N =--- 2, the points of order N are the point at infinity 0 and (es , 0), i = 1, 2, 3. Now suppose that N> 2. If N is odd, by a "nontrivial" point of 'order N we mean a point P 0 such that NP = O. If N is even, by a "nontrivial" point of order N we mean a point P such that NP = 0 but 2P 0 O. Proposition 13. Let K' be any field extension of K (not necessarily algebraic). and let a: K' aK' be any field isomorphism which leaves fixed all elements of K. Let P ePic, be a point of exact order N on the elliptic curve y 2 =J(x) where f(x) e K[x]. Then aP has exact order N (where for P = (x, y, :) E P we denote o-P d=ef (ax, ay, az)_e 53,20.

It follows from the addition formulas that o-P, o-P, + P2 ), and hence N(aP) = a(NP) = a0 = 0 (since G(O, I, 0) --= (0, 1, 0)). Hence cr P has order N. It must have exact order N, since if N'o-P = 0, we would have cr(N'P) = 0 = (0, 1, 0), and hence N' P = O. This proves the proposition. o PROOF.

Proposition 14. In the situation of Proposition 13, with K a subfield of C, let KN c C denote the field obtained by adjoining to K the x-, and y-coordinates

of all points of order N. Let K denote the field obtained by adjoining just their x-coordinates. Then both KN and KI;41" are finite galois extensions of K. In each case KN and K, we are adjoining a finite set of complex numbers which are permuted by any automorphism of C which fixes K. This immediately implies the proposition. PROOF.

As an example, if N = 2, then K 2 = K-21- is the splitting field off(x) over K.

1.

38


Recall that the group of points of order N on an elliptic curve in Pt is isomorphic to VI NZ) x (/N). Because any Gre Gal(KN/K) respects addition of points, i.e., c(Pi + P2) = o- P1 + o-P2 , it follows that each a gives an invertible linear map of (ZI N .2) 2 to itself. If R is any commutative ring, we let GL(R) denote the group (under matrix multiplication) of all n x n invertible matrices with entries in R. Here invertibility of a matrix A is equivalent to det A E R*, where R* is the multiplicative group of invertible elements of the ring. For example:

(1) GL i (R) ::----: R* ; (2) G L 2 (11 NZ) = {C: Na, b, c, de ZINZ, ad — bc E (7 iNZ) * }.

/

It is easy to construct a natural one-to-one correspondence between invertible linear maps Rn -4 R" and elements of GL(R). There is no difference with the more familiar case when R is a field. In our situation of points of order N on an elliptic curve, we have seen that Gal(KNIK) is isomorphic to a subgroup of the group of all invertible linear maps (Z/NZ) 2 --* (Z/NZ) 2 . Thus, any a- e Gal(KN/K) corresponds to a matrix (ca bd ) e G L 2 (11 NZ). The matrix entries can be found by writing (71)(0 1/N = Paco i / N + cco 2 / N5

aPw2lN

= Plm 1 /N+dco2 /N•

Notice that this is a direct generalization of the situation with the N-th cyclotomic field ON cfe--f 0(YT). Recall that Gal(ON/Q),rzd, (ZINZ)* = GL i (ZI NZ), with the element a which corresponds to ci determined by a(e 2N) ..„....„ eznia/N . But one difference in our two-dimensional case of division points on elliptic curves is that, in general, Gal(KN/K) --3 GL 2 (ZIN1) is only an injection, not an isomorphism. In the case K c C, say K -----= 0(g2 , g3 ), where y 2 = f(x) = 4x 3 — g2 x — g3 is in Weierstrass form, we shall now use the p-function to determine the polynomial whose roots are the x-coordinates of the points of order N. That is, Kz- will be the splitting field of such a polynomial. We first construct an elliptic function fN (z) whose zeros are precisely the nonzero values of z such that Pz is a point of order N. We follow the prescription in the proof of Proposition 9 of §1.5. If u E CIL, is a point of order N, then so is the symmetric point —u (which we denoted u* when we were ' thinking in terms of points in a fundamental parallelogram). We consider two cases:

(1) N is odd. Then the points u and —u are always distinct modulo L. In other words, u cannot be w 1 /2, w 2/2 or (o) i + w2)/2 if u has odd order N. We define

Mz) := N H(0 (z) — g o(u)),

(8.1)

39


where the product is taken over nonzero ue C/L such that Nu e L. with one u taken from each pair u, —u. Thenfv (z) = FN (p(z)), where FN (x)E C[X] is a polynomial of degree (N 2 — 1)/2. The even elliptic function fN (z) has N 2 — I simple zeros and a single pole at 0 of order N 2 — 1. Its leading term at z = 0 is Niz N2 '. (ii) N is even. Now let u range over u e CI L such that Nu e L but u is not of order 2; i.e., u 0 0, o.) 1 /2, (.0 2 /2, (co l + w2)/2. De fi ne fiv (z) by the product in (8.1). Then IN (z) = FN (p(z)), where FN (x)EC[x] is a polynomial of degree (N 2 — 4)/2. The even elliptic function fN (z) has N 2 — 4 simple zeros and a single pole at 0 of order N 2 — 4. Its leading term at -: = 0 is

. N Iz N2'. 4 . If N is odd, the function L (z) has the property that frAz . ) 2

_ N2

n

(o(z)- o(0).

OucCIL,NuEL

If N is even, then the function fN (z)z.-10/(z)/N (z) has die property that

fN(z) 2 = 1-0 / (z) 2fN(z) 2 UE

=N2

n

1-1 C/L,Nue

((z)

-

o(u))

L, 2uit L

( 0 (z)- 0 (4)).

0*ti EC/L.,NueL

We see that a point (x, y) = (0(z), OM has odd order N if and only if FN (x) = O. It has even order N if and only if either y = 0 (i.e., it is a point of order 2) or else FN (x) = O. Because of Propositions 13 and 14, we know that any automorphism of C fixing K = 0(g2 , g3) permutes the roots of FN. Hence, the coefficients of FN are in K = CP(g2 , g3). If we started with an elliptic curve not in Weierstrass form, say y2 = f(x) = ax 3 + bx ‘ 2 + cx + d, and if we wanted to avoid using the 0-function, then we could repeatedly apply the addition formulas (7.1)—(7.4) to compute the rational function of x and y which is the x-coordinate of NP, where P = (x, y). We would simplify algebraically as we go, making use of the relation y2 =fix), and would end up with an expression in the denominator which vanishes if and only if NP is the point at infinity, i.e., if and only if P has order N (recall: "order N" means "exact order N or a divisor of N"). What type of an expression would we have to get in the denominator of the x-coordinate of NP? Suppose, for example, that N is odd. Then this denominator would be an expression in Iqx, y] (with y occurring at most to the first power), where K = 13(a, b, c, d), which vanishes if and only if x is one of the (N 2 — 1)/2 values of x-cOordinates of nontrivial points of order N. Thus, the expression must be a polynomial in x alone with (N 2 — 1)/2 roots. Similarly, we find that when N is even, this, denominator has the form

40

1.


y • (polynomial in x alone), where the polynomial in' K[x] has (N 2 — 4)/2

roots. It is important to note that the algebraic procedure described in the last two paragraphs applies for any elliptic curve y 2 = f(x) over any field K of characteristic 0 2, not only over subfields of the complex numbers. Thus, for any K we end up with an expression in the denominator of the xcoordinate of NP that vanishes for at most N 2 — I values of (x, y). For a general fi eld K, however, we do not necessarily get exactly N 2 — I nontrivial points of order N. Of course, if K is not algebraically closed, the coordinates of points of order N may lie only in some extension of K. Moreover, if K has characteristic p, then there might be fewer points of order N for another reason: the leading coefficient of the expression in the denominator vanishes modulo p, and so the degree of that polynomial drops. We shall soon see examples where there are fewer than N 2 points of order N even if we allow coordinates in 101g c'. This discussion has led to the following proposition. Proposition 15. Let y 2 = f(x) be an elliptic curve over any field K of characteristic not equal to 2. Then there are at most N 2 points of order N over any extension K' of K.

Now let us turn our attention briefly to the case of K a finite field, in order to illustrate one application of Proposition 15. We shall later return to elliptic curves over finite fields in more detail. Since there are only finitely many points in IFI (namely, q2 + q I: I), there are certainly only finitely many Fg-points on an ellipticcurve y2 ---= fix), where f(x) e Fg [x]. So the group of Fg-points is a finite abelian group. By the "type" of a finite abelian group, we mean its expression as a product of cyclic groups of prime power order. We list the orders of all of the cyclic groups that appear in the form: 2Œ2, 2112, 22, . . . , 3'3, 3/33, 3Y3, . . . , 5'5, 5fl5, • . . . But Proposition 15 implies that only certain types can occur in the case of the group of Fg-points on y2 :----- f(x). Namely, for each prime I there are at most two l-th power components /Œi, Pi, since otherwise we would have more than 12 points of order I. And of course /Œt1 must equal the power of / dividing the order of the group. As an example of how this works, let us consider the elliptic curve y2 =-x3 — n 2 x over K = Fq (the finite field of q = pf elements), where we must assume that p does . not divide 2n. In the case when qT_-_- 3 (mod 4), it is particularly easy to count the number of Fg-points. -

Proposition 16. Let q = pf , p,1/2n. Suppose that q -.E..- 3 (mod 4). Then there are q + I Fg-points on the elliptic curve y 2 = x3 — n 2 x. PROOF. First, there are four points of order 2: the point at infinity, (0, 0),

and ( +n, 0). We now count all pairs (x, y) where x 0 0, n, -'-n. We arrange

41


these q — 3 x's in pairs {x, —x}. Since f(x) = x 3 — n 2 x is an odd function, and —1 is not a square in Fg (here's where we use the assumption that q -a:. 3 (mod 4)), it follows that exactly one of the two elements f(x) and j( — x) = —f(x) is a square in Fg . (Recall: In the multiplicative group of a finite field, the squares are a subgroup of index 2, and so the product of two nonsquares is a square, while the product of a square and a nonsquare is a nonsquare.) Whichever of the pair x, —x gives a square, we obtain exactly two points (x, ±Vf(x)) or else (— x, ±,jf(—x)). Thus, the (q — 3)/2 pairs give us q — 3 points. Along with the four points of order two, we have q ± 1 Fg o points in all, as claimed. Notice that when q=-=:-_ 3 (mod 4), the number of Fg-points on the elliptic - -- I (mod 4). curve y 2 = x 3 — I2 2 x does not depend on n. This is not true if q a • As an example, Proposition 16 tells us that for q := 7 3 there are 344 = 23 • 43 points. Since there are four points of order two, the type of the group of 1F343 -points on y 2 .7.--- x 3 — n2 x must be (2, 2 2 , 43). As a more interesting example, let q =p = 107. Then there are 108 = 22 - 3 3 points. The group is either of type (2, 2, 3 3) or of type (2, 2, 3, 3 2 ). To resolve the question, we must determine whether there are 3,or 9 points of order three. (There must be nontrivial points of order 3, since 3 divides the order of the group.) Recall the equation for the x-coordinates of points of order three (see Problem 4 of §7): —3x 4 + 6n2 x2 + n4 = 0, i.e., x = +n-N/1 + 2.1-j/3. Then the corresponding y-coordinates are found by taking ÷ Vf(x). We want to know how many of these points have both coordinates in F107 , rather than an extension of F 107 . We could compute explicitly, using .1-- = + 18 in F107 , so that x =--- + V13, +,/— 11, etc. But even before doing those computations, we can see that not all 9 points have coordinates in F107. This is because, if (x, y) is in F107, then ( —x, .\/— ly) is another point of order three, and its coordinates are not in 1Fi07 . Thus, there are only 3 points of order three, and the type of the group is (2, 2, 3 3). Notice that if K is any field of characteristic 3, then the group of K-points has no nontrivial point of order three, because — 3x 4 + 6n2 x 2 + 124 = 114 0 O. This is an example of the "dropping degree" phenomenon mentioned above. It turns out that the same is true for any p -IF= 3 (mod 4), namely, there are no points of order p over a field of characteristic p in that case. This is related to the fact that such p remain prime in the ring of Gaussian integers 1[i], a ring which is intimately related to our particular elliptic curve (see Problem _13 of §6). But we will not go further into that now. PROBLEMS

1. For the elliptic curve y2 = 4x3 — of p(z) when N= 2.

g2x — g 3 ,

express

go(Nz)

as a rational function ,

2. Let fN (z) be the elliptic functions defined above. Express f3 (z) as a polynomial in _sv(z).

42


3. Set fi (z) = 1. Prove that for N = 2, 3, 4, ... we have: c.•

go (N z) z--- 0(z) — fly - 1 (z)fN + i (z)IfN(z) 2 .

4. In the notation of Proposition 14, suppose that CEGal(KN/K) fixes all x-coordinates of points of order N. That is, al K;,-iF =--- identity. Show that the image of a in GL 2 (7LIN7L) is +1. Conclude that Ga1(KN/K4 ) = { +1} n G, where G is the image of Gal(KN/K) in GL 2 (7LIN I). What is the analogous situation for cyclotomic fields? 5. Let L = {m o i ± nco2 } , and let E be the elliptic -curve y2 ---- 4x 3 — g2 (L)x — g 3 (L). Notice that E does not change if we replace the basis {co l , co 2 } of L by another basis {coil , co'2 }. However, the group isomorphism C/L,,Pz-'s ER/Z x R/Z changes, and so does the isomorphism from the points of order N on E to ZINZ x ZINZ. For example, the point (0 (co;IN), p'(co'l lN)), rather than (0 (co i l N), p'(co l IN)), corresponds to ( 1 , 0) eZINZ x ZI N Z. What effect does the change of basis from coi to co'i have on the image of Gal(KN/K) in GL 2 (ZINZ)? 6. Show that the group GL 2 (ZI2Z) is isomorphic to S3 1 the group of permutations of {1, 2, 3). For each of the following elliptic curves, describe the image in GL 2 (Z12Z) of the galois group over 0 of the field generated by the coordinates of the points of order 2. (n not a perfect square) (a) y2 = x3 — nx (b) y2 = x 3 — n2 x (n not a perfect cube) (c) y 2 = x3 — n (d) y2 = x3 — n3 . 7. (a) How many elements are in GL2 (Z/3Z)? (13) Describe the field extension K3 of K = 0 generated by the coordinates of all points of order 3 on the elliptic curve y2 = x3 — n 2x. (c) Find [K3 : 0]. What subgroup of GL2 (Z/3z) is isomorphic to Gal(K3/Q)?

(d) Give a simple example of an element in GL 2 (7L/3Z) that is not in the image of Gal(K3/10); in other words, find a pair of elements z i = (m i co l + n 1 co2)/3, z 2 =- (ni2 co 1 + n2 co2)/3 which generate all (mco l + nco 2)13 but such that Pzi , Pz, cannot be obtained from Pco1 13, Pco213 by applying an automorphism to the 'coordinates of the latter pair of points.

8. In Problem 13 of §1.15, we saw that the lattice corresponding to the curve y2 = x 3 — n 2x is the lattice L of Gaussian integers expanded by a factor co2 e R: L = {mico 2 + nco 2 } = c02 Z[i]. (a) Show that the map zi-÷ iz gives an analytic automorphism of the additive group C/L; and, more generally, for any Gaussian integer a -I- biEZ[i] we have a corresponding analytic endomorphism of Cii, induced by z I-4 (a ± bi)z. (b) Notice that if b = 0, this is the map zi--+z + z + • • • + z (a times) which gives Oa : P 1—* aP on the elliptic curve. By looking at the definition of 0 (z), pi (z), show that the map zi---* iz gives the automorphism Oi : (x, y)i—*(— x, iy) on the elliptic curve. This is an example of what's called "complex multiplication". Show that Oi o4 = ck_ i , and in fact the map a + NI—) Oa+bi is an injection of the ring Z[i] into the ring of endomorphisms of the group of points on the elliptic curve. (c) If L is a lattice in C and if there exists a complex number a = a + bi, b 0 0, such that ŒL OE L, show that a is a quAdratic imaginary algebraic integer, and that L. contains a sublattice of fini-e ya of the form c0 2 Z [a].

,3 k

§9. Points over finite fields, and the congruent number problem

43

9. Each of the following points has finite order N on the given elliptic curve. In each case, find its order. (a) P = (0, 4) on y2 = 4x 3 + 16 (b) P = (2, 8) on y2 = 4x 3 + lbx (c) P = (2, 3) on y 2 = x 3 + 1 (d) P = (3, 8) on y 2 = x 3 — 43x + 166 (e) P = (3, 12) on y 2 = — 14x 2 + 81x,

(f) P (0 , 0) 6n y 2 + y = x 3 x2 (g) P = (1, 0) on + xy + y =

x2 — 3x + 3.

§9. Points over finite fields, and the congruent

number problem We have mainly been interested in elliptic curves E over 0, particularly the elliptic curve y2 = X 3 — n2 x, which we shall denote E. But if K is any field whose characteristic p does not divide 2n, the same equation (where we consider n modulo p) is an elliptic curve over K. We shall let En (K) denote the set of points on the curve with coordinates in K. Thus, Proposition 16 in the last section can be stated: If q 3 (mod 4), then # En(Fq) = q + 1. The elliptic curve En considered as being defined over Fr,, is called the "reduction" modulo p, and we say that En has "good reduction" if p does not divide 2n, i.e., if y2 = x3 — n 2 x gives an elliptic curve over [Fly More generally, if y 2 = f(x) is an elliptic curve E defined over an algebraic number field, and if p is a prime ideal of the number field which does not divide the denominators of the coefficients off(x) or the discriminant off(x), then by reduction modulo p we obtain an elliptic curve defined over the (finite) residue field of p. At first glance, it may seem that the elliptic curves over finite fields— which lead only to finite abelian groups—are not a serious business, and that reduction modulo p is a frivolous game that will not help us in our original objective of studying Q-points on y2 = x 3 — n2 x. However, this is far from the case. Often information from the various reductions modulo p can be pieced together to yield information about the 0-points. This is usually a subtle and difficult procedure, replete with conjectures and unsolved problems. However, there is one result of this type which is simple enough to give right now. Namely, we shall use reduction modulo p for various primes p to determine the torsion subgroup of En(Q), the group of G-points on y2 =x3 — n 2 x. In any abelian group, the elements of finite order form a subgroup, called the "torsion subgroup". For example, the group E(C) of complex points on an elliptic curve is isomorphic to CIL, which for any lattice L is isomorphic to R/Z x R/Z (see Problem 2 of §I.5). Its torsion subgroup x R/Z, i.e., in CIL it corresponds to the subgroup 0/7L x 0/Z c consists of all rational linear combinations of cb i and co 2 . A basic theorem of Morde" -tPtr,s that the group E(0) of 0-points on an

44

I.


elliptic curve E defined over 0 is a finitely generated abelian 'group. This means that (1) the torsion subgroup E(Q)tors is finite, and (2) E(Q) is iso) and a finite number of copies of morphic to the direct sum of E (Q 'tors Z: E(0) E(0),tors 0 1r. The nonnegative integer r is called the "rank" of E(0). It is greater than zero if and only if E has infinitely many 0-points. Mordell's theorem is also true, by the way, if 0 is replaced by any algebraic number field. This generalization, proved by Andre Weil, is known as the Mordell—Weil theorem. We shall not need this theorem for our purposes, even in the form proved by Mordell. For a proof, the reader is referred to Husemuller's forthcoming book on elliptic curves or else to [Lang 1978b]. We shall now prove that the only rational points of finite order on En are the four points of order 2: 0 (the point at infinity), (0, 0), (+n, 0). Proposition 17. #E„(1111) tors ,

= 4.

homomorphism from En (Q) i „„ to E(IF) which is injective for most p. That will imply that the order of En(0) -,tors divides the order of E(F) for such p. But no number greater than 4 could divide all such numbers #E„(Fp), because we at least know that #E(F) runs through all integers of the form p + 1 for p a prime congruent to 3 modulo 4 (see Proposition 16). We begin the proof of Proposition 17 by constructing the homomorphism from the group of 0-points on E to the group of F,,-points. More generally, we simply construct a map from PL to 1131p . In what follows, we shall always choose a triple (x, y, z) for a point in PZ in such a way that x, y, and z are integers with no common factors Up to multiplication by +1, there is a unique such triple in the equivalence class. For any fixed prime p, we define the image 13 of P = (x, y, z) e PZ , to be the point /5 = (.",f, y, 2 c , where the the bar denotes reduction of an integer modulo p. Note that is is identically zero triple, because p does not divide all three integers x, y, z. Also note that we could have replaced the triple (x, y, z) by any multiple by an integer prime to p without affecting P. - It is easy to see that if P = (x, y, z) happens to be in E(0), i.e., if y 2 z = x 3 — 122 xz' , then 15 is in E(ll). Moreover, the image of Pi + P2 under this map is 131 + P2, because it makes no difference whether we use the addition formulas (7.1)—(7.4) to find the sum and then reduce mod p, or whether we first reduce mod p and then use the addition formulas. In other words, our map is a homomorphism from En(0) to En (Fp), for any prime p not dividing 2n. We now determine when this map is not injective, i.e., when two points = (x i , Yi' z 1 ) and P2 = (X 2 , y2 , z 2) in PL have the same image 151 - = in inp . PROOF. The idea of the proof is to construct a group

)

Lemma. P --1 -152 if and only if the cross-product of P1 and P2 (considered as

vectors in R3) is divisible by p, i.e., if and only if p divides y i z 2 — y 2z 1 , x 2 z — x i z 2 , and x l y 2 x2y1.

45


PROOF OF LEMMA.

First suppose thatp divides the cross-product. We consider

two cases: •

(i) p divides x 1 . Then -p divides x 2 z 1 and x 2 y 1 , and therefore divides x„ because it cannot divide x l , Yi and z 1 . Suppose, for example, that p (an analogous argument will apply ifp, iz t ). Then P2 = y, Y2 , Y 12) 1;. Y1Y2, Y2 z 1) = Yi 9 Z 1) = Pi (where we have used the fact that p divides y1 2 2 — y2 z 1 ). (ii) p does not divide x 1 . Then P2 = (î 1 X 21 -5c- 1 Y2 Ï2 9 g-2 Y1 9 x221) =

y Y1 y 1) = P1 •

Conversely, suppose that P1 = P2 . Without loss of generality, suppose that plx i (an analogous argument will apply if Oh or pil'z i ). Then, since 121 11 Y2 , 1 - 2) = = P2 = (12, Y2 f2), we also have 1,4' x 2 . Hence, 132 = P1 = (-V211 5c-7 2 Y. 71 y Y2 i1 ). Since the first coordinates are the same, these two points can be equal only if the second and third coordinates are equal, i.e., ifp divides x i y2 — x 2 y 1 and x 1 z2 — x2 z 1 . Finally, we must show that p divides y 1 z 2 y2 z 1 . If both Yi and z i are divisible by p, then this is trivial. Otherwise, the conclusion will follow by repeating the above argument with x i , x2 replaced by Yi , y2 or by z i , z 2 . This concludes the proof of the lemma. We are now ready-to prove Proposition 17. Suppose that the proposition is false, i.e., that En (0) contains a point of finite order greater than 2. Then either it contains an element of odd order, or else the group of points of order 4 (or a divisor of 4) contains either 8 or 16 elements. In either case we have a subgroup S = {P1 , P2 1 Pm } En (0) tors where ni = #S is either 8 or else an odd number. Let us write all of the points Pi , I =-- 1, . . . , ni, in the form in the lemma: Pi = (xi , yi , zi). For each pair of points Pi , Pi , consider the cross-product vector (yi zi — yjzi , xjzi x i zj, x•yj xjyi)E ER 3 . Since Pi and Pi are distinct points, as vectors in ff;1 3 they are not proportional, and so their cross-product is not the zero vector. Let nu be the greatest common divisor of the coordinates of this cross-product. According to the lemma, the points Pi and Pi have the same image F = Pi in E(F) if and only if p divides nii . Thus, if p is a prime of good reduction which is greater than all of the nu, it follows that all images are distinct, i.e., the map reduction modulop gives an injection of S in En(Fp). , But this means_that for all but finitely many p the number ni must divide # En(Fp), because the image of S is a subgroup of order m. Then for 'all but finitely many primes congruent to 3 modulo 4, by Proposition 16 we must have p — 1 (mod m). But this contradicts Dirichlet's theorem on primes in an arithmetic progression. Namely, if m = 8 this would mean that there are only finitely many primes of the form 8k + 3. If m is odd, it would mean that there are only finitely many primes of the form 4mk + 3 (if 3 m), and that there are only finitely many primes of the form 12k + 7 if 31m. In ail cases, Dirichlet's theorem tells us that there are infinitely many primes of the given type. This concludes the proa of Proposition 17. .

.

1

,

4

46

1.


Notice how the technique of reduction modulo p (more precisely,/ the use of Proposition 16 for infinitely many primes p) led to a rather painless proof of a strong fact : There are no "non-obvious" rational points of finite order on E. As we shall soon see, this fact is useful for the congruent number problem. But a far more interesting and difficult question is the existence of points of infinite order, i.e., whether the rank r of En(Q) is nonzero. As we shall see in a moment, that question is actually equivalent to the question of whether or not n is a congruent number. So- it is natural to ask whether mod p information can somehow be put together to yield information about the rank of an elliptic curve. This subtle question will lead us in later chapters to consideration of the BirchSwinnerton—Dyer conjecture for elliptic curves. , For further general motivational discussion of elliptic curves over finite fields, see [Koblitz 1982], We now prove the promised corollary of Proposition 17. Proposition 18. n is a congruent number if and only if En (Q) has nonzero

rank r. PROOF. First suppose that n is a congruent number. At the beginning of §2, we saw that the existence of a right triangle with rational sides and area n leads to a rational point on En whose x-coordinate lies in (Q + ) 2 . Since the x-coordinates of the three nontrivial points of order 2 are 0, +n, this means that there must be a rational point not of order 2. By Proposition 17, such a point has infinite order, i.e., r > 1. Conversely, suppose that P is a point of infinite order. By Problem 2(c) of §I.7, the x-coordinate of the point 2P is the square of a rational number having even denominator. Now by Proposition 2 in §I.2, the point 2P corresponds to a right triangle with rational sides and area n (under the o correspondence in Proposition 1). This proves Proposition 18. Notice the role of Proposition 17 in the proof of Proposition 18. It tells us that the only way to get nontrivial rational points of the form 2P is from points of infinite order. Let 2E„(0) denote the subgroup of En (0) consisting of the doubles of rational points. Then Proposition 17 is equivalent to the assertion that 2E„(0) is a torsion-free abelian group, i.e., it is isomorphic to a certain number of copies (namely, r) of Z. The set 2E„(0) — 0 (0 denotes the point at infinity) is empty if and only if r = O. We saw that points in the set 2E„(0) — 0 lead to right triangles with rational sides and area n under the correspondence in Proposition 1. It is 'natural to ask whether all points meeting the conditions in Proposition 2, i.e., corresponding to triangles, are doubles of points. We now prove that the answer is yes. At the same time, we give another verification of Proposition 18 (not relying on the homework problem 2(c) of §I.7). Proposition 19. There is a one-to-one correspondence between right triangles

with rational sides X < Y < Z and

n, and pairs of points (x, ±y) E

47

IV, Points over fi nite fields, and the congruent number problem

2E„(0) — O. The correspondence is: (x, X, Y, Zi—* (Z 214, + (Y 2 — X 2).Z/8). In light of Proposition 1 of §I.1, Proposition 19 is an immediate consequence of the following general characterization of the doubles of points on elliptic curves. Proposition 20. Let E be the elliptic curve y 2 = (x — e i )(x — e 2)(x — e s) with e l , e2 , e3 e O. Let P= (xo , yo)e E(Q) — O. Then Pe 2E(Q) — 0 if and only if xo — e l , .v„ — e 2 , xo — e 3 are all squares of rational numbers.

We first note that, without loss of generality, we may assume that xo = O. To see this, make the change of variables x' = x — x o . By simply translating the geometrical picture for adding points, we see that the point P' --7--- (0, yo ) on the curve E' with equation y2 = (x — e'l )(x — e'2)(x — where e; = ei — xo , is in 2E/(0) — 0 if and only if our original P were in 2E(G) — O. And trivially, the xo — ei are all squares if and only if the (0 — e;) are. So it suffices to prove the proposition with xo = O. Next, note that if there exists +2 E E(Q) such that 2Q = P, then there are exactly four such points Q, Q i , Q 2 , Q 3 eat)) with 2Q i = P. To obtain Qi , simply add to Q the point of order two (e t , 0)e E(0) (see Problem 5 in §I.7). Choose a point Q= (x, y) such that 2Q = P = (0, yo). We want to find conditions for the coordinates of one such Q (and hence all four) to be rational. Now a point Q on the elliptic curve satisfies 2Q = P if and only if the tangent line to the curve at Q passes through —P= (0, -Yo). That is, the four possible points Q are obtained geometrically by drawing the four distinct lines emanating from —P which are tangent to the curve. We readily verify that the coordinates (x, y) are rational if and only if the slope of the line from —P to Q is rational. The "only if" is immediate. Conversely, if this slope in is rational, then the x-coordinate of Q, which is the double root of the cubic (mx — yo) 2 = (x — e ,)(x — e 2 )(x — e s), must also be rational. (Explicitly ; x = (e l + e2 + e3 + m2)/2.) In this case the y-coordinate of Q is also rational: y = nix — y o . Thus, we want to know when one (and hence all four) slopes of lines from —P which are tangent to E are rational. A number m e C is the slope of a line from --'--P which is tangent to E if and only if the following equation has a double root: PROOF.

(mx — yo) 2 = (x — e l )(x — e2)(x — e 3) = x 3 4- ax 2 + bx + e,

(9.1)

with

a =--- —e l — e 2 — e 3 , ,

w)

'last equality c=

b= e1 e 2 + e1 e 3 + e 2 e 31

C = —e l e 2 e 3 = yg,

(9.2)

A comes from the fact that (0, y„ ) is on the curve

1.

48


y 2 = x3 + ax2 + bx + e. Now if we simplify (9.1) and factor out x, our condition becomes: the following quadratic equation has a double root:

x2 + (a — m 2)x + (b + 2my0 ) = O. This is equivalent to saying that its discriminant must vanish, i.e.,

(a _ m 2) 2 _ 4(b + 2myo) =-- 0.

(9.3)

Thus, our task is to determine when one (and hence all four) roots of this quartic polynomial in m are rational. We want to find a condition in terms of the ei's (namely, our claim is that an equivalent condition is: —e€0 2). In (9.3), the a and b are symmetric polynomials in the es , but the yo is not. However, yo is a symmetric polynomial in the .\Ies . That is, we introduce fi satisfying J 2 = —e s . There are two possible choices for fi , unless es .= O. Choose the f in any of the possible ways, subject to the condition that yo = fi f2f3 . If all of the es are nonzero, this means that the sign off, and f2 are arbitrary, and then the sign off3 is chosen so that yo and fi f2f3 are the same square root of —e 1 e2 e3 . If, say, e 3 = 0, then either choice can be made for the sign off, , f2 , and of course f3 = O. In all cases there are four possible choices of the fi's consistent with the requirement that yo =fif2/3. . Once we fix one such choice f1 , f2 , f3 , we can list the four choices as follows (here we're supposing that e l and e2 are nonzero):

fi,f2,f3;

fi, --f2, -fi;

-fi,f2, -f3;

-fi,'-f2,f3- (9.4)

The advantage of going from the ei's to thef's is that now the coefficients of our equation (9.3) are symmetric functions off, , f2 , L. More precisely, if we set s i =J +f2 4-f3 , sl =fif2 +fi f3 + /2' f3 , s 3 = fi f2 f3 , the elementary symmetric functions, then

a =f12 +./.22 + f32

= s 2 — i2s;

b = 2f22 ± L2f32 +f221-32 j = 4— 2s 1 s3 ; f1 yo = S 3 . Thus, equation (9.3) becomes

0 = (m 2 — s? + 2s 2) 2 — 4(4 — 2s rs- 3 + 2ms3 ) = (m 2 ___. 4 ) 2 ± 4s 2(m2 _ s ?\ ) _ 8s3 (rn — s 1 ).

(9.5)

We see at a glance that the polynomial in (9.5) is divisible by m — s l , i.e., in = s 1 =f1 +f2 +f3 is a root. Since we could have made three other choices for the signs of the _I's , the other roots must correspond to these choices, i.e., the four solutions of equation (9.3) are:

mi "1

=1; +12 +f31

3 = —fi +f2 -f3,

m2 = fi - f2 - f3, m4 = -fi - f2 +f3-

(9.6)

49


We want to know whether the four values in (9.6) are rational. Clearly. If all of the fi are rational, then so are the m i . Conversely, suppose the m i are rational. Then fi = (m 1 + m2)/2, f2 = (m 1 + m3 )/2, and .f3 ,-------. (m i ± m4)/2 are rational. The conclusion of this string of equivalent conditions is: the coordinates (x, y) of a point Q for which 2Q = P are rational if and only if the f = \I ei are rational. This proves Proposition 20. El —

Finally, we note that Proposition 20 holds with G replaced by any field K not of characteristic 2. Essentially the same proof applies. (We need only take care to use algebraic rather than geometric arguments, for example, when reducing to the case P = (0, ye).) PROBLEMS

1. Prove that forf odd, any Fp -point of order 3 on the elliptic curve En : y 2 = x'

n 2x is actually an 1F-point; prove that there are at most three such points if p m- 3 (mod 4); and find a fairly good sufficient condition on p and f which ensures nine Fpj--points of order 3. —

2. For each of the following values of q, find the order and type of the group of Fg-points on the elliptic curve El : y2 = ,c3 — x. In all cases, find the type directly, if necessary checking how many points have order 3 or 4. Don't "peek" at the later problems. (a) All odd primes from 3 to 23.

(b) 9 (c) 27 (d) 71 (e) 1 1 3 . 3. Find the type of the group of Fp-points on the elliptic curve E5: y 2 = x3 25x for all odd primes p of good reduction up to 23. , 4. Prove that for a e CV the equation y 2 = x3 a determines an elliptic curve over any field K whose characteristic p does not divide 6 or the numerator or denominator of a; and that it has q + 1 Fg-points if q ,. 2 (mod 3). —

—

-

-

5. Prove that there are exactly 3 Fg-points of order 3 on the elliptic curve in Problem 4 if q . 2 (mod 3). -

-

6. For all odd primes p from 5 to 23, find the order and type of the group of Fp-points on the elliptic curve y2 = x3 — 1. 7. Prove that the torsion subgroup of the group of 0-points on the elliptic curve y2 = x3 — a has order at most 6, and that its order is equal to: (a) 6 if a = b6 for some be Q; (b) 2 if a = c3 for some c e Q with c not of the form --b 2 ; (c) 3 if either a = --d2 for some de a with d not of the form b 3, or if a = 4321) 6 for some beQ; _ (d) I otherwise. —

/

8. Show that the correspondence constructed in Problem 2 of §I.2 gives a one-to-one correspondence between - right triangles as in Proposition 19 and pairs + P of

50

L From Congruent Numbers to Elliptic Curvesnon-identity elements of the quotient group En(C)/E,(0)'torsion, which is isomorphic to 2E„(0) under the Map PH-• 2P. See Problem 2(b) of §I.7./

In the problems below, we illustrate how more information can be obtained using two additional tools: (1) the complex multiplication automorphism (x, y)t—*(— x, \I — ly) of the group of K-points of the elliptic curve y 2 = x3 — n 2 x if K contains a square root of — 1 ; (2) the action of Gal(Kalgc l/K) on the coordinates of the Ka lgcl-points. 9. Suppose that q=_::-. 3 (mod 4), and I is an odd prime. Prove that: (a) there are at most I Fg-points of order I on the elliptic curve y 2 = x3 — n2 x, and there are at most eight Fg-points of order 4; (b) the group of Fg-poinfs is the product of a group of order 2 and a cyclic group of order (q + 1) 1 2. 10. Suppose that q -a- 2 (mod 3), 2 4' AT, 34'N. Prove that there are at most N Fg-points of order Non the elliptic curve y2 = x3 — a.

1 1 . Suppose that q -=.-- 1 (mod 4), and I a 3 (mod 4) is a prime not equal to p. Let (/OE, /fl) be the 1-part of thetype of the group of Fg-points on the elliptic Curve y2 = x3 n 2x. Prove that a = 13. If ! = 2, prove that a = /3 or a = 13 + 1. —

12. The group of K-points on an elliptic curve is analogous to the multiplicative group K*. In Problem 11 of §1.7, we saw that for K = C, as a --* 0 the elliptic curve y2 =

(x2

a)(x. + 1) "becomes" the multiplicative group C. Now let K be the finite field Fq . In this problem we work with K*, and in the next problem we work with the group of K-points on an elliptic curve. Let 1 be a prime not equal to p, and , suppose that Fg contains all 1-th roots of 1, i.e., q =p 1 --=- 1 (mod /). —

(a) Show that the splitting field of x i — a, where a e Fq , has degree either 1 or I over Fq . (b) Show that the subfield of Fga igc l generated by all /m+1 th roots of 1 is IFt, where M' < M. (c) (For readers who know about badic numbers.) Construct an isomorphism between the additive group Z i of /-adic integers and the galois group over Fq of the field extension generated by all / th power division points (i.e., / th power roots of unity). -

-

-

13. Now let E be an elliptic curve defined over Fq . Suppose that there are / 2 Fg-points of order I. (a) Let A be an Fepoint, and let Fe be the extension of Fq generated by the coordinates of a solution a to the equation la = A (i.e., Fe is the smallest extension of Fq containing such an a). Show that there are / 2 Fe-points ai such that

lai = A. (b) Fix an Fe-point a such that la = A. Prove that the map al—*or(a) — a gives an imbedding of Gal(Fe/Fg) into the group of points of order ion E. (c) Show that r = 1 or L (d) What is the field extension of IF, generated by all points of order 1", M = 1, 2, . . . ? What is its galois group?

CHAPTER II


At the end of the last chapter, we used reduction modulo p to find some useful information about the ellipik, ,urves En : 372 =-- x' — n 2 x and the congruent number problem. We considered En as a curve over the prime field Fp where ' pi'2n; used the easily proved equality # E(1F) = p + 1 when p =--:. 3 (mod 4); and, by making use of infinitely many such p, were able to conclude that the only rational points of finite order on En are the four obvious points of order two. This then reduced the congruent number problem to the determination of whether r, the rank of En (Q), is zero or greater than zero. Determining r is much more difficult than finding the torsion group. Some progress can be made using the number of Fg-points. But the progress does not come cheaply. First of all, we will derive a formula for # En (Fq) for any prime power q = pr. Next, we will combine these numbers N. = Nr.p -= #En (Fp r) into a function which is analogous to the Riemann zeta-function (but more complicated). The behavior of this complex-analytic function near the point 1 is intimately related to the group of rational points. Before introducing this complex-analytic function, which is defined using all of the N,. p, we introduce a much simpler function, called the "congruence zeta-function'', which is built up from the Nr ----= N,.,p for a fixed prime p.

§1. The congruence zeta-function Given any sequence N„ r = 1, 2, 3, .. . , we define the corresponding "zetafunction" by the formal power series

Z(T) (1 exp ,.( ic /V T: r ), -7 ,

r=1

where

c°

li

k

exp(u) j=ef E0 k1 . k

(1. I)

52

,

11. The Hasse—Weil L-Function of an Elliptic Curve

At first glance, it might seem simpler to define Z(71) as IN.?"' ; however, the above definition has crucial properties which make it the most useful one (see the problems below). Let K be a field. Let A"K' denote the set of m-tuples of elements of K. By an "affine algebraic variety in in-dimensional space over K" we mean a system of polynomial equations of the form fj(x i , . . . , x„,) = 0, where •i E K[X 1 , . . . , Xm ]. For example, a conic section is a system of two equations

fi (x, y, z) = x 2 + y2 — z 2 = 0;

f2 (x, y, z) = ax + by + cz -I- d =--- 0

in 3-dimensional space over R. If L is any field extension of K, the "L-points" T for which all of the polyof the variety are the m-tuples (x i , . . . , x.) nomials fi vanish. By a "projective variety in ,n-dimensional space over K" we mean a system ' of homogeneous polynomial equations4(x 0 , x i , . . . , xm) in m + I variables. If L is a field extension of K, the "L-points" of the projective variety are the points in PT (i.e., equivalence classes of in + 1-tuples (xo , . .. , x„,), where (x0 , ... , x„,) -- (),x 0 , .. . , /7,X,n), ). E L*) at which all of the fi vanish. For example, in the last chapter we studied the Fq-points of the elliptic curve defined in PF by the single equation f(x, y, z) = y 2z — x 3 + 112xz 2 = O. (Note: Here .Xio = z, x i = x, x2 = y are variables for a projective variety in Pi, while in the last paragraph x i = x, x 2 = y, x 3 = z were variables for an affine variety in Q.) If we have a projective variety, by setting xo = 1 in the f., we obtain an a ffi ne variety whose L-points correspond to the m + 1-tuples with nonzero first coordinate. The remaining L-points of the projective variety will be the projective variety in Pr' obtained by setting xo = 0 in all of the equations and considering the equivalence classes of m-tuples (x 1 , . . . , x„,) which satisfy the resulting equations. For example, the elliptic curve with equation y 227 — x 3 + igxz 2 consists of the affine points—the solutions of y2 = x 3 — n 2x—and the points (x, y) of PI, for which —x 3 = 0, i.e., the single point (0, 1) on the line at infinity ..-.• = O. Let V be an affine or projective variety defined over Fq . For any field K m Fq , we let V(K) denote the set of K-points of V. By the "congruence zeta-function of V over Fi" we mean the zeta-function corresponding to the sequence N. = # V(FF qr). That is, we define co Z( V/Fq ; T) ,Fa exp (E # V(Fqr) TrI) . (1.2) r=1

Of course, iv,. is finite, in fact, less than the total number of points in A;', , (in the affine case) or PT, (in the projective case). We shall be especially interested in the situation when V is an elliptic curve defined over Fq . This is a special case of a smooth projective plane curve. A projective plane curve defined over a field K is a projective variety given in Pi by one homogeneous equation f(x, y, z) = O. Such a curve is said to be "smooth" if there is no Kaig d-point at which all partial derivatives

53

§1. The Congruence zeta-function i

vanish. This agrees with the usual definition when K = C ("has a tangent line at every point"). , It turns out that the congruence zeta-function of any elliptic curve E ( defined over Fq has the form 4 ; T)—

1— 2aE T ± qT2 Z(EIF

(1— T)(1 — qT)

(1.3)

where only the integer 2aE depends on E. We shall soon prove this in the case of the elliptic curve En : y2 = x3 — n2x. Let a be a reciprocal root of the numerator; then 1_— 2aE T + qT2 = (1 — ŒT)(1 — IT). If one takes the logarithmic derivative of b6th sides of (1.3) and uses the definition (1,1), one easily finds (see problems below) that the equality (1.3) is equivalent to the following formula for N,. = #E(EFq r);

, Nr = qr + 1 _ a r _ (q /0 r . )

As a special case of (1.4) we have

N1 = #E(F q ) = q + 1 — a — --qa- = q + 1 — 2aE . Thus, if we know that Z(EfiF q ; T) must have the form (1.3), then we can determine aE merely by counting the number of Fg-points. This will give us Z(ElFq ; T), the value of a, and all of the values N,. = # E(Fq r) by (1.4). In other words, in the case of an elliptic curve, the number of Fg-points determines the number of Fe-points for all r. This is an important property of elliptic curves defined over finite fi elds. We shall prove it in the special case y2 = x 3 — n 2x. It will also turn out that a is a quadratic imaginary algebraic integer whose complex absolute value is \f-.4. In the case y2 = x3 — rgx, it will turn out that a is a square root of —q if q F.:- 3 (mod 4), and is of the form a + bi, a, be Z, a' + b2 = q, if q -m-- 1 (mod 4). This situation is a special case of a much more general fact concerning smooth projective algebraic varieties over fi nite fields. The general result was conjectured by Andre Weil in [Weil 19491 and the last and most difficult part was proved by Pierre Deligne in 1973. (For a survey of Deligne's proof, see [Katz 19764.) We shall not discuss it, except to state what it says in the case of a smooth projective curve (one-dimensional variety):

(0 Z(VI1F q ; T) is a rational function of T (this is true for any variety without the smoothness assumption) which for a smooth curve has the form P(T)I(1 — T)(1 — qT). Here P(T) has coefficients in 1 and constant term 1 (equivalently, its reciprocal roots are algebraic integers). (ii) If V was obtained by reducing modulo p a variety P defined over 0, then deg P = 2g is twice the genus ("Betti number") of the complex analytic manifold P. Intuitively, g is the "number of handles" in the corresponding Riemann surface. An elliptic curve has g = 1, and the , Riemann surface in Fig. II.1 has g =3.

11.

54

The Hasse—Weil L-Function of an Elliptic Curve'

Figure III

(iii) If a is a reciprocal root of the numerator, then so is (Va. (iv) All reciprocal roots of the numerator have complex absolute value

14.

One reason for the elegance of the Weil conjectures is the intriguing indirect connection between the "physical" properties of a curve (e.g., its number of handles as a Riemann surface when considered over C) and the number theoretic properties (its number of points when considered over Fe). Roughly speaking, it says that the more complicated the curve is (the higher its genus), the more N,'s you need to know before the remaining cities can be determined. In the simplest interesting case, that of elliptic curves, where g ----= I, all of the N,.'s are determined once you know N1 .

PROBLEMS

1. Show that if N„ = Nr* + Nr*** and Z(T), Z*(T), Z**(T) are the corresponding zeta functions, then Z(T) = Z*(T). Z**(T); and if N,. = Nr* — N r**, then Z(T) -7---Z*(T)1Z**(T). -

2. Show that if there exists a fixed set a i , . . . , as , Pi , • • • , N. = ji"; + - • • + fir arl — • • — 4, then

A such that for all r we have

—

Z(T)=

(1 — Œ1T)(1 — a2T)- • •(1 — as T) . (1 — fi' 1 T)(1 — 13 2 T)- (1 AT) —

—

3. Prove that if N, < CA' for some constants C and A, then the power series Z(T) converges in the open disc of radius 1/A in the complex plane. 1, r even; 4. Show that if Nr =t_ , then Z(T) is not a rational function; but if N, = 0 r odd, r even; then Z(T) is rational. In the latter case, interpret N, as the number r odd, of Fr-solutions of some equation.

1.2:

5. The Bernoulli polynomials .13„(x)e Q [x] have the properties: (i) deg Br = r; (ii) for all M, Br (M)— 14(0) = r( 1'-1 + 2 1 + • - . + (M Dr- '). Now for fixed M let N,_ 1 = .-(B,.(M) — B r(0)). Find the corresponding Z(T). (Cultural note: Bi (x) = x — 4, B2 (X) = X2 - X ± *, etc.; they are uniquely determined by properties (i) and (ii) along with the normalization requirement that S ) Br (x)dx =-- 0 for r .>:.1. One way to define them is by equating terms in the relation: te"Ret — 1) = Ecr°,0B,(x)trIr!.) —

55

§1. The congruence zeta function -

6. Suppose that / is a prime, q is a power of another prime p, q # 1 (mod /), q # 1 (mod / 2 ). (a) For fixed M, let N,. = # {dye Fgrix im = 11. Find the corresponding Z(T). (b) Now let N,. = # tx e Fgrixim = 1 for some M}. Find the corresponding zetafunction. Is it rational? 7. A special case of an affine or projective variety V is the entire space, corresponding to the empty set of equations. Let A'; denote in-dimensional affine space (the usual space of m-tuples of numbers in the field K), and let P'; denote projective space, as usual. (a) What is Z(A T /F.? ; T)? k = in, in — 1, (b) Find Z(PT /Fgq ; T) by writing In as a disjoint union of ... , 0, and using Problem 1. (c) Also find Z(P T IfFq ; T) by counting equivalence classes of (in + 1)-tuples, q answers agree. and check that your

M‘ ,

8. Show that, if V is a variety in ATq or P 7q, then Z(V IFq ; T) converges for I Ti < q- ni. 9. If one wants to prove that Z(V1F q ; T)eZHTE with constant term 1 for any affine or projective variety V, show that it suffices to prove this when V is any affine variety. Then show that it suffices to prove this when V is given by a single equation. Show that the rationality assertion Z(VIF q ; T)EQ(T) can also be reduced to the case of an affine variety V defined by a single equation. A variety defined by a single equation is called a "hypersurface". 10. Find the zeta-function of the curve y 2 = x 3 — n 2x in QV if pi2n, i.e., p is not a q prime of good reduction. 11. Find the zeta function of the hypersurface 'inAt defined by x 1 x 2 — x 3 x4 = O. q -

gn r.

12. Let Nr be the number of lines in Find its zeta function. (it is possible to view the set of k-dimensional subspace: in In as a variety, called the grassmannian ; in our case k = 1, m = 3.) -

13. Using the form (1.3) for the zeta-function of an elliptic curve, where the numerator has reciprocal root a, show that N,. is equal to the norm of 1 — Œ. Now, in the situation of Problem 13 of §1.9, suppose that E has / 2 Fg-points of order /, and no Fg-points of exact order 1 2 . Prove that the field extension of Fg generated by the coordinates of the points of order 114+1 is Fqt". (Note the close analogy with the multiplicative group F.:, with q E.--- 1 (mod 1) but q # I (mod / 2), where the field generated by all / m+i -th roots of unity is Fe.) 14. Let V be an affine algebraic variety defined over K by equations fi(x l , . . . , x) = O. By the coordinate ring R(V) we mean the quotient ring of K[x,, . . . , x„,] by the ideal generated by all of the fi . Let P ---4-- (al , . . . , am ) be a Ka'gc l-point on V. Let L = K(a l , . . . , a,n) be the finite extension of K generated by the coordinates of P. L is called the residue field of P, and its degree over K is called the residue degree. (a) Show that the map xi t--ai is well-defined on R(V), and extends to a homomorphism whose kernel is a maximal ideal m(P) in R(V). (It is not hard to prove that every maximal ideal of R(V) arises in this way.) 'how that m(P')= m(P) if and only if there is an isomorphism from L to L'

56

11. The Hasse—Weil L-Function of an Elliptic Curve (the residue fields of P and P', respectively) which takes ai to a;. Thus, the maximal ideal m(P) corresponds to d different Ka ig c i-points P on V, where d --7-- {R(V)Im(P): K] is the residue degree of any of the points P.

15. In the situation of Problem 14, let K = IF,i • For a given Ka fg c l -point P, the residue field is Fei for some d. Then P contributes I to each N,. for which r is a multiple of d. That is, the contribution of P to the exponent in the definition of the zetafunction is Eic,"1 1 T" lkd. Then Z( VifF, ; T) is exp of the sum of all contributions from the different Ka igc l-points P. Group together all points corresponding to a given maximal ideal, and express Z(VIE .,7 ; T) as the product over all maximal ideals m of (1 — Vegm)' . Then show that the zeta-function belongs to 1 + TUT]]. (Cultural note: If we make the change of variables T = q-s, and define Norm(m) to be the number of elements in the residue field, i.e., Norm(m) = gdegm , then we have Z(V IfF q ; sq-s)= ri.(1 — Norm(m)) -1 , which is closely analogous to the Euler product for the Dedekind zeta-function of a number field: C K (s) = n vo — Norm(p)r', in which the product is over all nonzero prime ideals of the ring of integers in the field K. In a number ring, a nonzero prime ideal is the same as a maximal ideal.) 16. Prove that if Z( V/F.7 ; T)eG(T), then the numerator and denominator are in 1 + TZ[T] (equivalently, the a's and /J's in Problem 2 are algebraic integers).

§2. The zeta-function of En We now return to our elliptic curve E„, which is the curve y2 = x3 — n2 x, where n is a squarefree positive integer. More precisely, En is the projective completion of this curve, i.e., we also include the point at infinity. En is an elliptic curve over any fi eld K whose characteristic does not divide 2n, and, as we have seen, it is sometimes useful to take K =--- f,,, or more generally K = Fq . The purpose of this section is to express the number of Fg-points "on En in terms of "Jacobi sums". To do this, we first transform the equation of En to a "diagonal form". We say that a hypersurface f(x i , . .. , x) = 0 in A';‘' is "diagonal" if each monomial in f involves at most one of the variables, and each variable occurs in at most one monomial. For example, the "Fermat curve" x" + yd = 1 is diagonal. It turns out that diagonal hypersurfaces lend themselves to easy computation of the N,. (much in the same way that multiple integrals are much easier to evaluate when the variables separate). We shall not treat the general case, but only the one we need to evaluate N,. ---.-- #En (Fgr). (For a general treatment of diagonal hypersurfaces, see [Weil 1949] or [Ireland and Rosen 1982, Chapter II].) We first show a relation between points on E„: y2 = .X 3 — n 2x and points on the curve EL : u 2 = y4 + 4n 2 . As usual, we suppose that p,t2n. First suppose that (u, y) is on E. Then it is easy to check that the point (x, y) = (1-(u + y 2); -1-y(u + y 2)) is on E. Conversely, if (x, y) is on En and its x-coordinate is nonzero, then we check that the point (u, y) = (2x — y2/x2 , y/x) is_on E.

57

§2. The zeta-function of En

Moreover, these two maps are inverse to one another. In other words, we have a one-to-one correspondence between points on E,Ç and points on En — 1 (O , 0)1. Let N' be the number of Fg-solutions (u, y) to u2 = c4 + 4n2 . Then the points on our elliptic curve consist of (0, 0), the point at infinity, and the N' points corresponding to the pairs (u, y). In other words, N 1 = #E„(Fq) is equal to N' + 2. So it remains to compute N'. The advantage of the equation u 2 = t24 ± 4n 2 is that it is diagonal. The basic ingredients in determining the number of points on a diagonal hypersurface are the Gauss and Jacobi sums over finite fi elds. We shall now define them and give their elementary properties. Let 1//: Fq -+ C* be a nontrivial additive character, i.e., a nontrivial homomorphism from the additive group of the finite field to the multiplicative group of complex numbers. (Since 1F,1 is finite, the image must consist of roots of unity.) In what follows, we shall always define iIi(x) = .1.1. x, where = e 2 ", and Tr is the trace from Fq to Fp . Since the tracp is a nontrivial additive map, and its image is Fp = LW, we obtain in this way a nontrivial additive character. Now let x: F: -.- C* be any multiplicative character, i.e., a group homomorphism from the multiplicative group of the finite field to the multiplicative group of nonzero complex numbers. In what follows, the additive character 0 will be fixed, as defined above, but x can vary. We define the Gauss sum (depending on the variable x) by the formula

g(X) =

E x(x)kk(x) xe Fq

(where we agree to take x(0) = 0 for all x, even the trivial multiplicative character). We define the Jacobi sum (depending on two variable multiplicative characters) by the formula

x1(x)x2(1-x). xe Fq

The proofs of the following elementary properties of Gauss and Jacobi sums are straightforward, and will be left as exercises. (Here x„i, denotes the trivial character, which takes all nonzero elements of Fg to 1; xl Xi , and X2 denote nontrivial characters; and j? denotes the complex conjugate (also called "inverse") character of x, whose value at x is the complex conjugate of X(4)

( 1) g(Xtriv) = — 1; J(xtriv , xtriv ) ------ q — 2; J(Xtriv,

Ax, k) = — A— 1);

Ax1, x2) := Ax2, xi);

x) = — 1;

(2) g(X) ' g(Z) = X( — 1 )q; WWI = /-4; (3) AX17 X2) = g(X1)9(X2)/9((1 X2) if X2 0 We now proceed to the computation of the number N' of u, y E Fq satisfying u2 = t,' + 4n 2 .--The key observation in computing N' is that for any a 0 0 _ in Fq and any m dividing q — 1, the number of solutions xE Fq to the equation xm = a is given by:

58

II.


(2.1)

x(a), xm= i

where the sum is over all multiplicative characters whose m-th power is the trivial character. Namely, both sides of (2.1) equal m if a is an m-th power in Fl and equal 0 otherwise; the detailed proof will be left as a problem below. By Proposition 16 of the last chapter, we know that N1 .--= q + 1 if qa---- 3 (mod 4). In what follows, we shall suppose that q F.---- 1 (mod 4). In counting the pairs (u, y), we count separately the pairs where either u or y is zero. Thus, we write

N' = # lu e Fq iu2 = 4n2 } ± # {ye Fq10 = y'' +4n 2 }

(2.2)

+ # u, v e EF:iu 2 = y4 + 4n 2 }. {

The first term in (2.2) is obviously 2 (recall that we are assuming that p 42n). We use (2.1) to evaluate the second term. Let x4 be one of the characters of IF: having exact order 4, i.e., x4 (g) = i for some generator g of the cyclic group O. Then, by (2.1), the second term in (2.2) equals 4

E x4(- 4n 2) = 2+ 2

(2.3)

4 (-4n 2)

i=1

(where we use the fact that —4n 2 is a square in P ) . Finally, we evaluate the third term in (2.2). Let x2 denote the nontrivial character of order 2 (i.e., x2 = xi). Using (2.1) again, we can write the third term in (2.2) as

E E Aooa - 4n2). /4 2 = a} • # {v4 = b} = E # { aEF*q a-4n 2 00 j1,2,3,4 a,bEiF q* ,

a=b+4n 2

k=1,2

Note that since xi(0) = 0, we can drop the condition a — 4n 2 o 0 on the right. We now make the change of variable x = al4n2 in the first summation on the right. As a result, after we reverse the order of summation, the right side becomes

x.4(-4n 2 ) E x'(x)x-4(l - x) =

E

j=1,2,3,4 k=1,2

x€F*(1

E x4(-4n 2)./(A5

xi).

j =1,2,3,4

k=1,2

Finally, bringing together the three terms in (2.2) and using property (1) of Jacobi sums when A or x.,1 is trivial or they are conjugate to one another, we obtain:

N' = 4 + 2x4 (-4n 2) + E x!,(-4n2)J(x2, z4) + q — 2 + 3 • (— 1) j=1, 3 +

2x4 (-4n 2) . ( —1)

(2.4)

= q — 1 -I- x4( —4 n 2 )(J(X21 X4) ± AX29 Z4)) • In the problems we show that x4( — 4) = 1. Hence, x 4 ( -4n 2) = Z2 ( 1 ). Thus, if we set Œ=

Œn. q

(-17-f -- 7,-.2 ( 1 '

(2.5)

59

§2. The zeta-function of E„

we conclude that Ni = #En( q)

(2.6)

q +lŒ — Z.

Notice that a is an algebraic integer in 0(i), since the values of x2 and x4 in the definition of Ax2 , x4) are all ± 1, +i. We now pin down the Gaussian integer a = a + bi, at least in the case when q = p is a prime congruent to I mod 4 or q = p2, is the square of a prime congruent to 3 mod 4. By property (3)relating Jacobi to Gauss sums, we have

cx = X2(n)g(X2)g(X4)/g(Z4), and hence, by property (2), we have i ce = a2 ± b 2 = q. In the two cases q=p 1 (mod 4) and q = p 2 , p 3 (mod 4), there are very few possibilities for such an a. Namely, in the former case there are eight choices of the form +a + bi, +b + ai;and in the latter case there are the four possibilities +p, +pi. The following lemma enables us to determine which it is.

Lemma 1. Let q 1 (mod 4), and let h and z4 be characters of Pq of exact order 2 and 4, respectively. Then 1 J(x 2 , x4) is divisible by 2 2i in the ring Z[i]. We first relate ./(x2 , x4) to ./(x4 , X4) by expressing both in terms of Gauss sums. By property (3), we have: Ax 2 , X4) = AX4, X4)g(X2) 2/g(X4)g(i4) = X4( 1)J(1 4 , 7.4) by property (2). Next, we write PROOF.

AX4, X4) = >

=

± 2 E1 x4(x)x4(1

x), where E' is a sum over (q — 3)/2 elements, one from each pair x, 1 4(x)4(1

x, with the pair ( P +2 1) , (P +2 1) omitted. Notice that x4 (x) is a power of i, and so is congruent to 1 modulo 1 i in [i] ; thus, 2z4 (x)x4 (1 x) 2(mod 2 ± 2i). As a result, working modulo 2 + 2i, we have ./(x4 , X4) .2=-. q — 3 + XL2P+2 1)) 1 (mod 4)). Returning to J(x 2 , x4), we obtain: 2 + X4(4) ( since q I +

Ax2,

(mod

X4) = I + x4( — 1)Ax4 , ;(4) FEI + x4( —4) + 2 x4( — 1)

2 + 2i).

Since 4( -4) = 1, as mentioned above (and proved in the problems below), and since 2(1 + x4( — 1 )) = 0 or 4, it follows that I + Ax 2 , z4) is divisible by 2 + 2i, as claimed. We now have the basic ingredients to prove a formula for Z(EnIFp ; T).

Theorem. Let En be the elliptic curve y 2 = x3 n2x defined over Fp , where p41 2n. -then Z(E nIEFp ; where elemt

1 — 2aT pT 2 (1 — aT)(1 = (1 p79 ) = (1 +.•.

c7cT) —

pT) 5

(2.7)

e a; a = ifp Er.- 3 (mod 4); and tfp 1 (mod 4), then a is an 1 1 of norm p which is congruent to ( 17,-) modulo 2 + 2 1.

60

Ii. The Hasse—Weil L-Function of an Elliptic Curve

Before proving the theorem, we note that in the case p 1-.-. 1 (mod 4) it says we choose cx = a + bi with a odd (and b even), where the sign of a is determined by the congruence condition modulo 2 + 2i. There are two possible choices a + bi and a — bi; and of course the formula (2.7) does not change if we replace a by its conjugate. In order to obtain Z(En/Fr ; T), we must let the power of p vary, and determine Nr -7---- #E„(Fpr) for p..--a.- 1 (mod 4) and N2 r = # En(Fq r) for p m 3 (mod 4), g = p2 (since we know that IV,. == pr + 1 for odd r in that case). So we fix g equal to p in the first case and equal to p2 in the second case (in either case g1-7-.--_. 1 (mod 4)), and we replace g by gr throughout the work we did earlier to find a formula for #E,,(Fq), q a-- 1 (mod 4). Because the r is varying, we need a notation to indicate which x2 and x4 we are talking about, i.e., to indicate for which finite field they are multiplicative characters. Let x2 , 1 = x2 denote the unique nontrivial character of Fq* of order 2, and let x4, i =--- x4 denote a fixed character of Fq* of exact order 4 (there are two, the other one being y4). Then by composing x2 or x4 with the norm from Fqr to Fq , we obtain a character of Pq`r of exact order 2 or 4, respectively. We denote these characters x2, and x4„. For example, if g is a generator of IF: such that x4 (g) = i, and if g,. is a generator of Wq*,- whose +q+ •••;1-qr - 1 = g, then we have x 4„(gr) = i. If IN, denotes norm is g, i.e. , ( gr )1 the norm from Fee to Fq , we can write our definitions: PROOF.

= X4 0 NI-, X2,r = X20 N r . With these definitions, using (2.5) and (2.6), we can write: X4,r

# En(Fqr) = qr + 1 — where

an, q r

(2.8)

an, q r — ain,q r,

(2.9)

( = — X2, r 1) g (X2 ' r)g(X41r) -

We now use a basic relationship, called the Hasse—Davenport relation, for Gauss sums over extensions of finite fields. The Hasse—Davenport formula Is

:

—g(° NO ---= ( — 900) r -

(2.10)

The proof of this fact will be given in a series of exercises below. Applying (2.10) to the three Gauss sums in (2.9), and observing that x 2 ,(n) = X2(n) = X2(n) r, we copclude the following basic relationship:

an, q r :=7-

ant. , q .

(2.11)

The theorem now follows quickly. First suppose p..-€:- 1 (mod 4), in which case g --= p. Then x 2 (n) is the Legendre symbol Cp. Using (2.5) and Lemma 1, we find that cx --= an, p is a Gaussian integer of norm p which is congruent to CD modulo 2 + 2i; and, by (2.9) and (2.11),

61


Nr = pr + I — ar dr. This proves the the,orem when p 1 (mod 4) (see Problem 2 of §11.1). Now suppose thatp 3 (mod 4), q = p 2 . Then x2 ( 12 ) = 1, since all elements of Fp are squares in [Fp2. Then Lemma_ I tells us that am is a Gaussian integer of norm q which is congruent to I mod 2 + 2i. Of the four Gaussian integers j = 0, 1, 2, 3, having norm q, only am = p satisfies the congruence condition. Then, by (2.6) and (2.11), we conclude that for r even we have —

Nr = #E„(Fq,12) = pr + 1 —

_py /2

Since Nr = pr + 1 for odd r, we have for any r:

= Pr

± - \/-13)r (-

This completes the proof of the theorem. We conclude this section by calling attention to the role Lemma 1 has played in pinning down the reciprocal roots a and Fie in (2.7). The congruence condition in Lemma I will again be needed when we start working with the Hasse—Weil L-function of the elliptic curve En , which combines the a's for different primes p. In that context, Lemma 1 is a special case of a general fact about how Jacobi sums vary as we vary, the prime p. The general case is treated in [Weil 1952 ]. PROBLEMS 1. Prove properties (1)-(3) of Gauss and Jacobi sums that were given in the text. 2. Let G be a finite group, and let Ô denote the group of characters x (i.e., of homo' morphisms X: G -0 t*). Recall that for any nontrivial x e 6, EaEG Z(g)= O. Notice that any fixed g e G gives a character g: xi--4x(g) on the group 6, and also on any subgroup S c Ô. Apply these general considerations to the case when G = IF: and S is the subgroup of characters x such that xm =1. In that way prove the relation (2.1) in the text.

3. Let q 1 (mod 4), and let x4 have exact order 4. Show that x4 (4) and x4( - I) are both equal to 1 if q E: 1 (mod 8) and equal to - 1 if q 5 (mod 8). Conclude that X4( -4) = 1 in all cases. 4. Show that g(x 2)2 = (-1)( ' 2q. It is somewhat harder to determine which square root to take to get g(x2) (see [Borevich and Shafarevich 1966, pp. 349-353]). Compute g(x2) when q = 3, 5, 7, 9. 5. For q 1 (mod 4), again let h be the nontrivial quadratic character, and let x, and 2- 4 be the two characters of exact order 4, Compute J(72, Z-4) and 4/(z 2 „ 2- 4) directly from the definition when q = 5, 9, 13, 17. 6. Show that if x2 is the nontrivial quadratic character of IF: and x is any nontrivial character, then J(x2 , x) = x(4)J(x, x).

7. Let x3 and X3 be the two characters of IF: of order 3, where q 1 (mod 3). Compute ../(x3 , x3) and J(23 , Z3) directly from the definition when q = 7, 13.

\

62

H. The Hasse—Weil L-Function of an Elliptic Curve

8. (a) Notice that we proved that the number N,. of Fqr-points on En is independent of n if r is even. Show this directly. (b) Also notice that N,. does not change if n is multiplied by an integer which is a square in F. This is for the same reason that we could, without loss of generality, reduce to squarefree n when considering 0-points. Namely, if K is any field not of characteristic 2 and if in, tie K*, construct a simple correspondence between En(K) and En„,2(10. 9. This problem concerns a more general definition of Gauss sums, examples of which will occur later in the chapter. Let R be the ring of integers in a number field K, and let I be a nonzero ideal of R. Then RII is a finite ring. Let tif : R11 -4 C* be an additive character which is nontrivial on any additive subgroup of RII of the form J11 for any strictly larger ideal j = I (including the "improper ideal" J = R, which will be the only such J if I is a prime ideal). Define the norm NI — # (R/1). Let x : (R1I)* ---) C* be any multiplicative character'. Take x(x)= 0 for x ER11 not prime to I. Define g(x) = g(x, III) = E x(x)0(x), where the summation is over def

RII. (a) Prove that Zx(x)ip(ax) 7.--- x7-(a)g(, tp) for any a e (RII)*. In parts(b) and (c-) we suppose that x is "primitive" modulo I. By definition, this means that, for any strictly larger ideal J = I, x is nontrivial on the subgroup of (RII)* consisting of elements congruent to 1 modulo J. (b) If x is primitive, show that the formula in part (a) holds for all a E RII. (c) For x primitive, prove that g(x, Og ( ? , Ifr) = x(- l)1, and ig(x, 1//)! = Some examples of the characters and Gauss sums in this problem are: (1) if I is a prime ideal with residue field Fq , then property (2) of Gauss sums in the text is a special case of part (c); (2) if R = Z and I is the ideal (N), then x is an ordinary Dirichlet character. NI= N, we often take t/i(x) = e27N1 and "primitive" means that the value of x(x) for xe (ZINZ)* does not depend only on its residue modulo some proper divisor of N; (3) later in the chapter we will encounter examples where XE

R =Z[i].

Problems 10-17 will lead to a proof of the Hasse—Davenport relation. 10. Let S be the set of all monic polynomials in Fq [x], and let S 1" denote the subset of all irreducible monic polynomials. Subscripts will indicate degree. By writing xqr — x = H2 ,(x - a), prove that xq r —x=rif, where the product is over all fin sr for all qd dividing r. 11. Let 0 be a nontrivial additive character and x a multiplicative character of N. If f€ S is written in the form f(x) =x=d _ e 1 x d —1 4_ . . . .4_ (...... rd) ed , define a map 2: S --+ C by 2( f) = x(cd)qi (c i ). (If f= 1 is the constant function in So , then define 2(I) = I.) Prove that 2(f1 f2) = 2(1 1 )2(1 2 ) for fi ,j E S. 12. Prove that the Gauss sum can be written g(x) = EfES I 2 (f). 13. Suppose that oceFqr satisfies monic irreducible polynomial fESP, where dlr. Then show that 2(f )'M = x r (a). Iiir (a), where the subscripts here indicate the characters of Fqr obtained by composing with the norm from Fqr to Fq (in the case of a multiplicative character) or with the trace from Fqr to Fq (in the case of an additive character). 14. Prove that g (x r) = —E dir E,R sjr ir dit(fyi d.

63-


15. Prove the power series identity ERs ,l(f)Pleg 1 -z---

n„ s irr(l — ).(f)Tdeg 1)1.

16. Show that if d> 1, then IfEsd 2(f) = O. 17. Taking the logarithmic derivative of both sides in Problem 15, prove that ( _ or- 1 g (x)r = E E cbt(f)rld 1 dlr. f € sirdr

and conclude the proof of the Hasse—Davenport relation.

18. (a) Show that the ideal (2) in Z[i] is the square of the prime ideal (1 + i); and that any element a e Z[i] not in (1 + i) has a unique associate iia which is congruent to 1 modulo (1 + i) 3 = (2 + 2i). (b) Show that the ideal (3) in l[ad, co = (— I + V — 3)/2, is the square of the prime ideal (J-3); and that any element a e Z[w] not in (I— 3) has a unique associate ( — (..o)ja which is congruent to 1 modulo 3. 19. - Consider the elliptic curve y2 = x3 — a, a e O. Recall from Problem 4 of §I.9 that it has g + 1 points if g .1.---._- 2 (mod 3). So suppose that q 71-7- 1 (mod 3). Let x 2 be the nontrivial quadratic character of iFq*, and let x 3 be either of the nontrivial characters of Fg* of order 3. Prove that the number of Fg-points on the elliptic curve is equal to

q + 1 + X2( -- a)(X3(a)AX2, X3) + X3(a)J(X29 X3)). 20. Let g r-,- 1 (mod 3), and let x3 be a nontrivial character of F: of order 3. (a) Prove that q./(x 3 , x 3) = (b) Prove that ./(x 3 , x 3) -=- —1 (mod 3) in Z[w], where (.0 = (— 1 + 0)/2. (c) Show that Ax3, X3) = p if g = p 2 , p m 2 (mod 3). (d) Suppose that g := p.7.-_ 1 (mod 3). Choose a + bco so that p = la + bcol2 = a2 — ab + b 2 . Show that exactly one of the two ideals (a + bco), (a + bCo) (without loss of generality, suppose the first one) has the property that

x 3 (x) .--i-: x( P' v3 (mod a + bw)

for all

x e Fp .

(e) Let g = p m 1 (mod 3), and choose a + bco as in part (d). Show that —./(x 3 , x 3 ) is the unique element generating the ideal (a + bw) which is congruent to I modulo 3.

21. Let N,. be the number of Fv-points on the elliptic curve y 2 = x 3 — a, where a E Fp*, p 0 2, 3. (a) If p m 1 (mod 3), let x2 and x3 be nontrivial characters of Fp* of order 2 and 3, respectively, and set a = — X2( — a)X3(4CI)J (X3, X3). Prove that N,. = Pr + I — (b) If p L-7 2 (mod 3), let x2 and x3 be nontrivial characters of Fp of order 2 and 3, respectively. First prove that x2 and x 3 are both trivial on elements of Fp*. Now set a = ij). Prove that N. = p r + 1 — a' — . (c) Conclude that in both cases the zeta-function is (1 — 2c T + pT2)1(1 — T)(1 — pT), where c = 0 if pa.-._-- 2 (mod 3), and c = —x 2 (—a)Re(x 3 (4c)J(x 3 , x 3 )) if p=_ 1 (mod 3).

22, Let C c Pi be the curve y2 + ay = x3, a e K (i.e., P(x, y, z) = y 2z + ayz 2 — x3). (a) Find conditions on the characteristic of K and on a e K which are equivalent ' to C being smooth at all of its Ka 1g cl -points.

11. The Hasse-Weil L-Function

64

of an Elliptic Curve

(b) Let K = 11 2r. Show that for r odd, # C(F2r) = 2r + 1 (this is independent of a). . (c) Let K =.- IF4r, aEK, a O. Let ;( 3 be a nontrivial character of K* of order 3, and let 2-6 be the other one. Derive the formula: # C(F4r) =-- 4r + 1 ± i3(a)J(3, X3) + X3(a)J(3, Y3). (d) In the situation of part (c), show that Ax 3 , )( 3) = ( — 1y_'2' then find a formula for Z(C/F 2 ; T) when a = 1. (e) Now let K = 0, a --=-- 1. Find a linear change of variables (with coefficients in 0) which transforms C to the elliptic curve y 2 = x 3 + 16. ;

23. Let N„ be the number of Fpr-points on the elliptic curve En y2 = x3 — n2x, where :

p,t2n. (a) Show that if p a 3 (mod 4), then Nr is independent of n; it equals pr + 1 if r is odd ; and it equals (p'I 2 — (— l)') if r is even. . (b) Now let p.--_.-: 1 (mod 4). In Problem 8 above, we saw that N„ is independent of n if r is even, and if r is odd it depends only on whether n is a quadratic residue or nonresidue modulo p. For odd r, let Nrres and N:r denote the N, for n a residue and for n a nonresidue, respectively. Show that N2r is a multiple of the least common multiple of N,`" and Nrnr. (c) For p ..= 5, make a table of Nrres and Nrnr for r = 1, 3, 5, 7 and a table of N, for r = 2, 4, 6, 8, 10, 12, 14. In each case, determine the type of the abelian group E,s(Fpr). (See Problems 9 and 11 in §1.9.) (d) For p .= 13, make a table of Nr"s and Nr" for r = 1, 3, 5 and a table of Nr for r =-- 2, 4, 6, 8, 10; and in each case, find the type of E n (Fpr).

§3. Varying the prime p In this section we look at the elliptic curve En : y2 = x3 — n 2 x and its zetafunction Z(E,TIFp ; T) as p varies. We shall later want to combine these zeta-functions for the various p into a single function, called the Hasse—Weil L-series of the elliptic curve. It is the Hasse—Weil L-function that is intimately related to the group of 0-points on E. _ The denominator of Z(EIFp ; T) is always (1 — T)(1 — pT). Only the numerator depends on p. If pi 2n, in which case En is not even an elliptic curve, the numerator is simply 1 (see Problem 10 in §II.1). Otherwise, the numerator is a quadratic polynomial in T of the form (1 — ŒT)(1 — bin When we later define the Hasse—Weil L-series of En , we shall take this quadratic polynomial and replace T by p - s (s is a new complex variable). The resulting expression (1 — Œp)(1 — p - s) is called the "Euler factor at p", by analogy with thé term in the Euler product expansion of the Riemann zeta-function : .

def

n=1

.FI prunes

I

p

1 — rs

(where Re s > 1).

(3.1)

In this section we shall study how this "Euler factor" depends on p. This

65

§3. Varying the prime p

dependence will turn out to be described by a certain character xn' of ,Z[i] (see Problem 9 of the last section). For the duration of this section we shall let P denote prime ideals of the --= 3 Gaussian integer ring Z[i]. There are two types: (1),, P = (p) for p=_ (mod 4); (2) P = (a + bi) for a2 + 1) 2 = p I (mod 4). In the latter case we have PP = (p), and we say that p "splits" in Z[i]. (There is also the special case P = (1 + i), which "ramifies", i.e., P 2 = (2)) The degree of a prime P dividing (p) is defined to be the degree of the field extension 7L[i]/P of Fp ; it is 2 in the first case and 1 ifp splits. We can then rephrase the theorem in the last section as follows. Proposition 1.

(1 — 7)(1 — pT)Z(E nliFp ; T) =

fi 0 - (ŒpT)deg P),

(3.2)

Pi(p)

where the product is over the (one or two) prime ideals of l[i] dividing (p), and where ap = if if P = (p) and ap = a + biif p splits, where a + hi is the unique generator of P which is congruent to (1-,) modulo 2 + 2i. We take ap = 0 if PI (2n). We now define a map 2„ on Z[i] which will be multiplicative and will satisfy in(x) = ap for any generator x of P = (x). This multiplicative map is of the form 2„(x) = xx n'(x), where xn'(x) has value 0, ± I, or ±i. First of all, we define x(x) = 0 if x has a common factor with 2n. Next, for n = 1 we define x'i (x) to equal ii, where ii is the unique power of i such that iix 7--..L..--- 1 (mod 2 + 20. Here x is assumed prime to 2, and hence an element of (Z[i]/(2 + 2i))*, which has four elements represented by the powers of i. Finally, for other n and for xel[i] prime to 2, we define x(x) = where Nx = x • 5c- is a positive odd integer, and GO is the Legendre symbol (which extends from prime modulus (i-,-) to arbitrary positive odd modulus by requiring that (4 2 ) = (k)(k)). To summarize, we have defined:

=

1

xx;i(x);

where for x prime to 2

ii

for x prime to 2n; nx) xi (x) (Ni

0

with1-ix .-_7-2. 1

(3.3)

otherwise;

(mod 2 + 21).

(3.4)

Suppose that x generates a prime ideal P = (x) nt dividing 2n. If P = (p) with p.. --z- 3 (mod 4), then (&) = (-pT) li = 1, and in(x) = iix = —p. That is, in takes any of the four possible generators of P to 4. If x is any of the four possible generators of a prime P of norm p ..E -. 1 (mod 4), then in(x) = AV» --.-. (7,) (mod 2 + 21), i.e., 2,i (x) is the unique generator oti; which is congrUent to CD modulo 2 + 2 1. We have thus shown:

66

IL The Hasse—Weil L-Function of an Elliptic Curve

Proposition 2. The map 2„ defined in (3.3)—(3.4) is the unique multiplicative map on Z[i] which coincides with cx';',eg' on any generator of a prime ideal P. Notice that is a character on CZ [0/(2 + 20)*. It takes any x to the using the root of unity in the class 1 /x. The general Li' is obtained from Legendre symbol, with the variable x appearing on the bottom. We now use quadratic reciprocity to bring the variable x up on top, thereby showing that x„' is a character. At this point recall Problem 9 of the last section, in particular, the definition of a "primitive" character on a number ring. Proposition 3. The map x„' defined in (3.3)—(3.4) is a primitive multiplicative character modulo (2 + 2i)n for odd n and modulo 2n for even n. Suppose x is prime to 2n. Let n = 2E1 • • • 4, where the /i are distinct odd prime numbers, and e = 0 or L Note that Nx is a product of odd prime powers, where the primes p l , , p, occurring to odd powers are all congruent to 1 (mod 4). First, it is easy to see that PROOF.

I. if 1%lx ==.- 1 (mod 8);

2 (

Nx) = I — 1

(3.5)

if Nx ==-- 5 (mod 8).

Next, we compute that (Nnx) = (N21x)E

I

pk

= 2 E 1 0 by the integral -

co

di e - tt s— . t j'0

(4.1)

r(s + I) = sF(s),

(4.2)

F (s)

It satisfies the relation

; 21

§4. The prototype: the Riemann zeta-function

which enables one to continue f(s) analytically onto all of the complex s-plane, except that it has simple poles at s =--- 0, — 1, —2, —3, . . . . The gamma-function also satisfies the relations

\ 7r f(s)r( I — si — sin(is)

(4.3)

and s) r, ( + 1 ) , ...... jr2 i -sr(s). r

(

(4.4)

2 2

Finally, using (4.2) and (4.3), one easily sees that the reciprocal of the gammafunction is an entire function of s. The gamma-function (4.1) is a special case of a construction known as the "Mellin transform". Given a function . f(t) on the positive real axis, its Mellin transform is the function g(s) defined by the formula co . dt f (OP— (4.5) g(s)----t

fo

for values of s for which the integral converges. Thus, f(s) is the Mellin transform of e t. Notice that for any constant c> 0, the Mellin transform of e- " is cf (s): dt e-"ts— = cf (s), t

(4.6) f

:

as we see after a simple change of variables. We shall often have occasion to use (4.6). Another tool we shall need is the Fourier transform. Let 99 be the vector space of infinitely differentiable functionsf: 1R 1-0 C which decrease at infinity faster than any negative power function, i.e., 1 xrf(x) --) 0 as x 4 + co for all N. An example of such a function is f(x) = e 2. For any fe 9' we define its Fourier transform f by: -

e -21rix)f(x)dx.

(4.7)

It is not hard to show that the integral converges for all y, and je , 9P. The following properties of the Fourier transform are also easy to verify: (1) If a e R and g(x) = f(x + a), then "(y) = e2 76aY1(y). (2) If a e IR and g(x) = e2 'i"f(x), then 4 (y) = Ay — a). (3) If b> 0 and g(x) = f(bx), then 4(y) = -T.,. f(y1b). For example, to check (3), we compute

4 y) = (

I

00

00

e'f(bx)dx =--_co

dx == I 2, a., e-2/ri(x/b)37f(x)_. —July). ' b b _00

IL

72


Proposition 5. If fix) = C. "2, then f

=f

PROOF. Differentiating under the integral sign, we have

d

' e -27rixyxe -nx2dx. e-27ilx-vf(x)dx = —276 11°9 ff CO = — I dy _ co co

Integrating by parts gives ft( y)

nx2

2 1 —27zi e-2nixy 2n e "

irixY 2niy e -2‘'

—2n

dx

GO

= — 2ny

Y.f(x)dx = —2nyl(y).

e -2

-co

Thus,fsatisfies the differential equation t(y)/f(y) = —27ry; this clearly has solutionf(y) = Ce - "2 , where C is obtained by setting y = 0: 00

e"- "2 dx = 1

C = f(0) =

- co (Recall the evaluation of the latter integral: e "2 dx f a) e - nY 2dy =

C2

J

f_co Jo

Thus,f(y) = e

012

—cc

e -"2 2nr dr =

■•■••• ■■•■■••■••

e - n(x 2 ÷Y2)dxdy

jo

e' du = 1.)

2 , as claimed.

0

Proposition 6 (Poisson Summation Formula). If g e Y, then

E

M=

On) = CO

E

(4.8)

j(m).

M = fl)

Define h(x) = E._ co g(x k). The function h(x) is periodic with period I, and has Fourier series h(x) = E:= _ co cm e2', where co I 1 i = f OE) g(x)e -2 'dx, g(x + k)e-2n"dx Cm = h (x)e -2 'in'xdx = 0 k= Jo where we interchanged summation and integration, and made a change of variables (replacing x + k by x) to obtain the last equality. But the last expression is simply 4(m). Now the left side of (4.8) is NO), by definition; and the right side is also NO), as we see by substituting x = 0 in the Fourier series for h(x) and using the fact that cm = (in). PROOF.

f

co f

i—.

We now define the theta-function : cc

E

e -nin

2

for

t > 0.

(4.9)

73


Proposition 7. The theta-function satisfies the functional equation 1 0(t) = — r7 0(1/t).

(4.10)

NI I PROOF. We apply Poisson summation to g(x) = e'tx 2 for fixed t> O. We write g(x) =f(fix) with f(x) -----; C nX2 . By Proposition 5 and property (3) of the Fourier transform (with b = \fi) we have 4(y) ---= t - ' 12 e - "21t. Then the left side of (4.8) is 0(1), and the right side is r 112 0(1/t). This proves the proposition. D' We sometimes want to consider 0(t) for complex t, where we assume that Re t> 0 in the definition (4.9). The functional equation (4.10) still holds for complex t, by the principle of analytic continuation of identities. That is, both sides of (4.10) are analytic functions of t on the right half-plane. Since they agree on the positive real axis, they must be equal everywhere for Re t > O. Proposition 8. As t approaches zero from above, we

have (4.11)

10(t) — t -1/2 1< e -clt

for some positive constant C. PROOF. By (4.10) and (4.9), the left side is equal to 21'12 Z_ i e - "2/r. Suppose t is small enough so that .fi > 4e -liz and also e'lti' < 1. Then IOW — 1 -1/21 < le iit(e -70 + e -470 + ...)< le-Or -1 Ni +1+ 71 ± ..13- 4_ . .. ) = -

Thus, we can take C = n — 1.

a

We now relate 0(t) to the Riemann zeta-function. Roughly speaking, (:(s) is the Mellin transform of 0(0. The functional equation for 0 (t ) then leads us to the functional equation for C(s), and at the same time gives analytic continuation of as). We now show how this works. Theorem. The Riemann zeta-function C(s) defined by (3.1)for Re s> 1 extends analytically onto the whole complex s-plane, except for a simple pole at s = 1with residue 1. Let A (s) ci7-f 7r - s12 1-' (0 as).

(4.12)

Then A(s) is invariant under replacing s by 1 — s:

A(s) = A(1 — s). That is, C(s) satisfies the functional equation

,

74

II.


_ ir ---( 1—s)12 -

1--

(

1 — s)

(4.13)

( l -- s) .

2

1

—

Basically, what we want to do is consider the Mellin transform ro° 0(t)ts(1). However, for large t the theta-function is asymptotic to 1 (since all except the n = 0 term in (4.9) decrease rapidly); and for t near 0 it looks like ry2 , by Proposition 8. Hence, we must introduce correction terms if we want convergence at both ends. In addition, we replace s by f (otherwise, we would end up with 42s)). So we define PROOF.

0(s) cit

-f

f ts 12 (0(t)

- 1) —t + f 1. t s12

0(

(4.14)

—

o In the first integral, the expression 0(t) - 1 = 2 E, e -- "2' approaches i

zero rapidly at infinity. So the integral converges, and can be evaluated term by term, for any s. Similarly, Proposition 8 implies that the second integral converges for any s. In any case, since 0(t) is bounded by a constant times ry2 in the interval (0, 1], if we take s with Re s> 1 we can evaluate the second integral as dt '' t di Ç ' ts/ 20(i)— ‘3 1 " I...,.

I

1 12

1 ts120(t) dt

- 10

tt \'t J Thus, for s in the half-plane Re s> 1, we obtain:

f 03 2 •-) dt 4) (s) =-. 2 E e- 7"1 'PI` ±

1 I1 di + 2 E /: i sl.._

CO

=2

s-1.

CC)

t

n=1 1

2

t

0

dt ± ;1-2 Ec° fc° e —nn 2 t t s12— t

n=1

0

- 2 /2 e " tt s

dt t

2 s

— 1)

2

1 — s•

Using (4.6) with c replaced by nn 2 and s replaced by '.-5-- we have:

2' 1 1 s .1 - 0(s) = i (n112)— s121— ____ + + 22 s 1-s n=1 :„____ n -s12 r

(....$) ((s) ± 1 + 2''

s

1

1-s

(4.15)

,

where always here Re s> 1. Now 0(s) is an entire function of s, since the integrals in (4.13) converge so well for any s, as we saw. Thus, (4.14) shows us that there is a meromorphic function of s on the whole complex plane, namely Tr

.s/2

(1

I' (42) 2

0(s)

1 s

1 1 - s)

,

which is equal to (s) for Re s> 1. Moreover, since 7r"2, 1ir(1), and Cs) are all entire functions, it follows that the only possible poles are at s = 0 and at s = 1. But near s = 0 we can replace sr(f) in the denominator by 2(1)F(i) = 2f(-3- + 1), which remains nonzero as s -- 0. H' the only pole

75


is at s = 1, where we compute the residue

(i

v s/2 "

1

o(s) r(s/2) 2

liM

s I

r (1/2)

:= 1.

It remains to prove the functional equation. Since, by (4.15), A(s) =-(1 -1 w and since + (1 1 s) is invariant under replacing s by 1 - s, 1430(s) it suffices to prove that Cs) = 0(1 - s). This is where we use the functional equation (4.10) for the theta-function. Using (4.10) and replacing t by -1- in under and J becomes (4.14), we obtain (note that d(-f )/(+) = the substitution):

s? =

fol

1 -42

e (1) u

dt 1)

+

t)

V

t (replacing by -1-)

co

I

dt .ji) Cs1200(t) 0 0) 1)— ± I - 412 (ft = t t 1 Jo odt = ' t2 s)12 (90) tu i1. i) 't11-t + ji t

f 0

(by (4.10))

.s

= 01 - s). This completes the proof of the theorem. In a similar way one can prove analytic continuation and a functional equation for the more general series obtained by inserting a Dirichlet character z(n) before rrs in (3.1), or, equivalently, inserting x(p) before p' in the Euler product (see Problem 1 below). That is, for any character (ZINZ)* --+ C*, one defines: co 1 where Re s > 1 (4.16) ) = EL(x,s x(n) n=1

( n ns

p1

X(P)P -s

The details of the proof of analytic continuation and the functional equation will be outlined in the form of problems below. The Hasse-Weil L-function for our elliptic curve En, to be defined in the next section, will also turn out to be a series similar to (4.16), except that the summation will be over Gaussian integers x, the denominator will be the norm of x to the s-th power, and the numerator will be n (x), where )-( n was defined in (3.3) in the last section. The techniques used in this section to treat the Riemann zeta-function can be modified to give analogous facts—analytic continuation and a functional equation—for the Hasse-Weil L-function for E. In the final section we shall use this information to investigate the "critical value" of the Hassp-Weil L-function, which is related to the congruent mi mber problem.

76

H. The Hasse-Weil L-Function of an Elliptic Curve

PROBLEMS 1. (a) Show that the summation form and the Euler product form of the definition of s) are equal.

(b) Prove that if x is nontrivial, then the sum in (4.15) actually converges (conditionally) for Re s > O. 2. Let G = ZINZ, and let = e2 . (a) Define the "finite group Fourier transform" of a function f: G f(a) = EbEGfibg -ab for a e G. Prove that f(b) k! Ea e G f (agab (b) For fixed s e C with Re s > 1, let f, : G -+ C be the function fs(b)

C by setting

n1 ,nb(mod N)

Prove that for any primitive Dirichlet character z modulo N and for any s with Re s > 1 :

L(x, s)= 1-g(x) N

E a€G

(a) n=1

where g(x) is the Gauss sum (see Problem 9 of 01.2). (c) Take the limit in part (1)) as s approaches 1 from above, supposing x nontrivial. In that way derive a simple formula for L.(x, 1). for fx1 1. Express L(x, 2) (d) Define the "dilogarithm" function by 1(x) = in terms of the dilogarithm.

02 ? For fixed t> 0 and a e R, what is the Fourier transform of e Suppose that a e R is in the open interval (0, 1). Define the following functions: CO

«a, s)=

Re s > 1;

± ay', n= 0 co

E ri-se'ina,

1(a, s) =

Re s> 1;

n=1

OA°

E e

=

-

iron +a)21

>

0;

n= - OD

0 a(1) =

n= - œ

e 2/ti na e -utn2

t > O.

(The notation 1(a, s) should not be confused with the function 1(x) in Problem-2) Prove that

(i) Oa(t)

(ii)Imo

t —10

t

"21 < e—Cdt

as t 0 for some positive constant C1 ; (Hi) 10a(t)i <e21' as t -* 0 for some positive constant C2. (c) Prove that a, s) + C(1 - a, s) as a function of se C extends to a`merorribrphic function with no pole except for a simple pole at s = 1; that the function /(a, s) + a, s) extends to an entire function; and that -

n-st2r0 2

s) + C(1 - a, s)) =,

r

(1

-

2

(1(ar , 1 - s) 1(1 - a, 1 - s)).

77


(d) Let x be a primitive Dirichlet character mod N. Let L(x, s) be defined as in (4.16): Prove that for Re s >-- 1:

(i)

---- Ns L(x, s), 0 O.

(Note that this is different from the definition of 0(x, t) for even x in Problem 4.) Let Oa and Oa be as in Problem 5(c). (a) Prove that: CO O(X, t) = 1-1- ii AaVaiN(N 21); 1 N

(ii) -

E X62041'1(t) = g(X) 0(7, 1);

(iii) 0(x, 0 = —iN -2 1 -312 g(x)0(7, 1 /N 2 t)(b) Show that the Mellin transform of 0(x, t) converges for any s, and that for Re s > 1 it is equal to n- sr" (s)L(x, 2s 1). (c) Use the functional equation in part (a)(iii) to give another proof of the functional equation for L(x, s) in Problem 5(e) above. —

7. Let x be the character mod 12 such that x( ± 1) ------- 1, x( ± 5) = — 1. Let I/ (z) = 0(x, —iz/12) for Tm z > O. Prove that n(— i/z) = Jriri(z), where we take the branch of which has value 1 when z = i. We shall later encounter q (z) again, and give a different expression for it and a different proof of its functional equation.

8. (a) Use the functional equations derived above to express 1(a, 1 s) in terms of C(a, s) and ‘(1 a, s). (b) Use the properties (4.3) and (4.4) of the gamma-function along with part (a) to —

—

show that

1(a, 1

—

s) = 1- (s)(2n)_s e i"12 (C(a, s) + e-i"C(1

—

a, s)).

(c) For a E C, a + n, define (a + n)- S to mean e"- s'"(a" , where we take the branch of log having imaginary part in ( — n, 7r]. Show that for a e C, 1m a > 0, Re s> 1, one has: —

co

1(a, 1 — s) = r(s)(2n)_se i"12

E Pt= -CO

(a 4- n)'.

79

§5. The Hasse—Weil L-function and its functional equation

(d) Let s= k be a positive even integer. Show that for a e C, im a> 0:

(a 1 n)k

(27rOk " _ 2 E n"e (k 1)1 „=,

•

(e) Give a second derivation of the formula in part (d) by successively differentiating the formula 00 1 1 ncot(na) =-- — + + a n a+n a—n

E

§5. The Hasse—Weil L.-function and its functional equation Earlier in this chapter we studied the congruence zeta-function Z(EliT p ; T) for our elliptic curves E„: y 2 = x3 — n 2x. That function was defined by a generating series made up from the number N =-- N,.. p of F,, -points on the elliptic curve reduced mod p. We now combine these functions for all p to obtain a function which incorporates the numbers for all possible prime powers p', i.e., the numbers of points on En over all finite fields. Let s be a complex variable. We make the substitution T = p- s in Z(En/Fp ; T), and define the Hasse—Weil L-function L(En , s) as follows:

C(s)((s — 1)

L(E n , 5 Hp z(E n lf Fp ; P 1

i

p.f

2n

1

2aErrpP-s

1

a deg P Np)- s We must first explain the meaning of these products, why they are equivalent, and what restriction on s e C will ensure convergence. In (5.1) we are using the form of the congruence zeta-function in the theorem in §11.2 (see the first equality in (2.7)), where the notation a Ewp indicates that the coefficient a depends on En and also on the prime p. We put the term (.F)(s — 1) in the definition so that the uninteresting part of the congruence zeta-functionits denominator—disappears, as we see immediately by replacing “s) and C(s — 1) by their Euler products (see (3.1)). Note that when pi2n, the denominator term is all there is (see Problem 10 of §11.1), so we only have a contribution of 1 to the product in that case; so those primes do not appear in the product in (5.1). In (5.2) the product is overall prime ideals P of Z[i] which divide primes p of good reduction. Recall that those primes are of two types: P p), p -:a- 3 (mod 4), deg P = 2, ‘IP = p 2 ; and P = (a + bi), a2 + b2 = p 1 (mod 4), deg P = 1, NP = p. The meaning of ap and the equivalence of (Si) and.(5.2) are contained in Proposition I (see (3.2)).

II. The Hasse—Weil L-Function of an Elliptic Curve

80

As in the case of the Riemann zeta-function, we can expand the Euler product, writing each term as a geometric series and multiplying all of the geometric series corresponding to each prime. The result is a Dirichlet series, i.e., a series of the form CO

L(E„, s) =

E

b.,,,rn'.

(5.3)

Before discussing the "additive" form of L(E„, s) in detail, let us work out the values of the first few b.,,, for the example of the elliptic curve E l : y 2 = x 3 — x. We first compute the first few values of aE 2 ,p in (5.1). If p --4-: 3 (mod 4), then aE1 ,p = 0. Ifp F...- 1 (mod 4), there are two easy ways to compute a = aE 1 , p : (1) as the solution to a 2 + b 2 = p for which a + bi _-_=. I (mod 2 + 2i); (2) after counting the number N1 of Fp-points on El , we have 2a = p + 1 — N1 (see (1.5)). Here is the result:

L (E 1 , s) =

1 1 1 1 1 + 3 - 9- S 1 + 2 - 5 - s + 5 - 25 - s 1 +7491+ 11 - 121' •••

.

1 — 6 .13' + 13169g 1 — 2 .17 -s + 17 . 289" = 1 — 2 . 5 -s — 3 .9" + 6 - 13 -s + 2 . 17 - s -1-

E rr:25

b. n tn - s . (5.4) '

We have not yet discussed convergence of the series or product for L(E„, s). Using (52) and the standard criterion for an infinite product to converge to a nonzero value, we are led to consider Ep1api deg P(NP) -5 for s real. By Proposition 1, we have iap reg P = NP 1/2 . In addition, Np1/2-s
4 the right side of (5.17) is equal to _ 44/7)1 - 2sn s-2

1- ( 2

— s)(1 + Or' — 2 (2 + 20nL(E„, 2 — s) n (5.20)

-___1(2 ( -172) ns -21— s)(80 1-si,(E„, 2 — s). On the other hand, if we bring the term (1/2)s over to the right in the functional equation (5.11)-(5.12) in the theorem, we find that what we want

s 87

§5. The Hasse—Weil L-function and its functional equation

to prove is: n - sr(s)L(E„, s) = (

—2 ) (N/2)s(27) 5-2 ( \/N) 2-sr(2 n

(--n2)

(N/4 )' - sns -2 1-(2

s)L(En , 2 — s)

s)L(En , 2 — s).

And this is precisely (5.20). Thus, to finish the proof of the theorem for odd n, it remains to prove the lemma. PROOF OF LEMMA. First suppose that in + m2 i is not divisible by I + i. This is equivalent to saying that m i and m 2 have opposite parity, i.e., their sum is odd. Now as a, b range from 0 to 4n, the Gaussian integer a + bi runs through each residue class modulo (2 + 2i)n exactly twice. Each time gives the same value of x,i'(a + bi), since Li' (a + bi) depends only on what a + bi is modulo n' = (2 + 2i)n. But meanwhile, the exponential terms in the two summands have opposite sign, causing the two summands to cancel. To see this, we observe that if a l + b 1 i and a2 + b2 i are the two Gaussian integers in different residue classes modulo 4n but the same residue class modulo (2 + 2i)n, thei, a l + b 1 (a2 + b2 (2 + 2i)n (mod 4n), and so (2nil4n)m.ya1,b1)-(a2,b2)) = e(27cil4rOm.(2n,2n) = e rri(nt 1 +m 2 ) = _1 . e This proves the first part of the lemma. Now suppose that m 1 + m 2 i = (1 + i)x. Note that in (a, b) = m l a + m2 b = Re((m, tn 2 i)(a + bi)) = Re((1 i).5(a + hi)). Hence, the exponential term in the summand in Sm is tli(ic(a + bi)), where •

i/f(x) def --=-- e211'" (with n' = (2 + 2i)n). Since x„' is a primitive character modulo (2 + 2i)n (see Proposition 3), we can apply Problem 9(a)—(b) of §11.2. Since the summation in (5.19) goes through each residue class modulo (2 + 2i)n twice, we have

S„, 2

E

x (a + bi)111(.t(a + hi))

a+biEgil/( 2 +2 0,8

2 k/r1(:)g(Z;1) = 2 Xax)9(4). This proves the lemma, and hence the theorem (except for some slight modifications in the case of even n, which will be left as an exercise).

In the problems we shall outline a proof of the analogous theorem in the case of an elliptic curve, namely y2 = x 3 + 16, which has complex -multiplication by another quadratic imaginary integer ring, namely Z[w], where co + -PO. There is one additional feature which is needed because we end up summing not over [i], which can, be thought of as 1 2 , but rather over a lattice which is the image of 1 2 under a certain 2 x 2-matrix.

II. The Hasse-Weil L-Function of an Elliptic Curve

88

So we have to apply Poisson summation to a function much like the function in this section, but involving this matrix. We conclude this section by mentioning two references for a more general treatment of the theory of which we have only treated a few special cases. First, in C. L. Siegel's Tata notes [Siegel 1961] (see especially pp. 60-72) one finds L-functions whose summand has the form 1 e

27r im- u

P(m + y) (Q[ni + Y]) s+g12 '

where nier, u, ye Er, Q is the matrix of a positive definite quadratic form, and P is a "spherical polynomial with respect to Q of degree g". The case we needed for L(Eii, s) was: n = 2, Q = ( ?), P(x 1 , x 2) = x l + ix 2 , g = 1. In Problem 8 below we have the case Q = ( 1,2 if), P(x 11 x2) =4 (a) + 1 /2)x1 ± (w12 + 1) x2 (where w = —1/2 ± i012), g = 1. In [Lang 1970, Chapters XIII and XIV], two approaches are given to this topic. In Ch. XIII, the approach we have used (originally due to Hecke) is applied to obtain the functional equation for the Dedekind zeta-function of an arbitrary number field. This is a generalization. of Problems 2 and 6 below. However, the case of more general Hecke L-series is not included in that chapter. A quite different approach due to J. Tate—using Fourier analysis on p-adic fields—is given in Ch. XIV of Lang's book. PROBLEMS 1. Finish the proof of the theorem for n even. 2. (a) Find a functional equation for 0(1) = E„„ z2e - orn 12, t > O. (b) The Dedekind zeta-function of a nucigiber field K is defined as follows: CK(S) SZ E(Ni)_s,

where the sum is over all nonzero ideals I of the ring of integers of K. This sum converges for Re s> I (see [Borevich and Shafarevich 1966, Ch. 5, §1]). Let K = 04 Prove that CI< (s) is an entire function except for a simple pole at s = 1 with residue n/4, and find a functional equation relating CK (s) to CK (1 — s). 3. For u and y in R 2, let

0:(t) . y def

e2itim-ve-ntim+u12

I-a

1

t > 0.

mEZ 2

Find a functional equation relating 0:(t) to 0!..,(f). 4. (a) In the situation of Proposition 11, express the Fourier transform of (w in terms off(y) for any nonnegative integer k. (b) Suppose that k is a nonnegative integer, ue R 2 is fixed with u it V,, t > 0 is fixed, and w = (1, 0 E C 2. What is the Fourier transform of g(x) = ((x + u) • oke -ntix+u1 2 ? (c) With k, u, t, w as in part (b), define:

-

89

§5. The Hasse-Weil L-function-and its functional equation

(On

Ou, k(t) =

+

. ok e -ntIm+ui

2 5

m e 22

o u,k (o=

E (m . oke2nim•ue -ntimi2

•

m € 22

r Find a functional equation relating 0 k (t) to 0"(+). (d) Suppose that / is a fixed ideal of Z[i], and x: (Z[i]//)*

C* is a nontrivial

character. Outline (without computing the details) how one would prove that the function Exe zro xkx(x)(Nx)_s extends to an entire function of s (if x were trivial, there would be a simple pole at s = 1), and satisfies a functional equation relating its value at s to its value at k + 1 s. (e) 'Explain why any Hecke L-series for Z[i] is essentially of the form in part (d).

5. Let f

C, fe

(a) For MeGL.(R), let Mt denote the transpose matrix, and let M* = (M - V. Let g(x) = f(Mx). Express 4(y) in terms of f(y). (b) Let L be a lattice in fi??; equivalently, L is of the form L = MZ", where M E GL(), i.e., L is obtained by applying some matrix M to the elements in the standard basis lattice Z. Let L' be the "dual lattice" defined by: L' = {ye Rnix • ye Z for all xe L}. First show that L' is a lattice. Then prove a functional equation relating ExeL f(x) to EyE d(y).

f.

6. Let w= - 1+ (a) Let M ((., 3 2 ). Show that the lattice L = Zkoj C = R2 is L = MZ 2 . What are,M* and L'? Show that L', considered as a lattice in C, is twice the "different" of Z[co], which is defined as {xee(co)ITr(xy) E Z for ally E Z [C0] 1. Here Tr x =

/x+.t- =2Rex. (1)) Prove that E xelfroj

e'"2

t- e -(40t)ixt2 2r_ 2s IN/ 3 xEzz(0,1

(c) Let K = Q(co). Prove that CK(s) extends meromorphically onto the complex sPlane with its only pole a simple pole at s = 1. Find the residue at the pole, and derive a functional equation for C K(S). (d) With K as in part (c), prove the identity: CK(s) = C(s)L(, s), where x is the nontrivial character of (Z0Z)*. (e) Use part (d), the theorem in §11.4, and Problem 5(e) of §11.4 to give another proof of the functional equation in part (c).

7. Let E be the elliptic curve y2 = x3 + 16. Let w = 4- 10. (a) Show that the reduction mod p of y2 = x3 + 16 is an elliptic curve over Fp if and only if p # 2, 3. For such p, recall that the Euler fador at p is (1 — 2a p p + p 2s)— where (1 — 2ap T + pr) is the numerator of Z(EIEF;; T). Show that for p # 2, 3 this factor is cxdpeg P N p- -1 )

where the product -is over all (one or two) prime ideals P of Z[co] dividing (p) and ot`iteg P is the unique generator of P such that acileg P I (mod 3). (I)) When p = 3, define the Euler factor at 3 to be simply 1, as we did for En : y2 = x3 — 112 .x when pl2n. When p = 2, at first glance it looks like we again have

90

IL The Hasse—Weil L-Function of an Elliptic Curve

y2 =--- x3 , and so we might be tempted to take 1 for the Euler factor at 2 as well. However, this is wrong. When one defines L(E, s), if there exists a 0-linear change of variables that takes our equation E: y2 = x3 + 16 in FD4 to a curve C whose reduction mod p is smooth, then we are obliged to say that E has good reduction at p and to form the corresponding Euler factor from the zeta-function of C over Fir In Problem 22 of §II.2, we saw that y2 = x3 + 16 is equivalent over Q to y 2 + y = x 3, a smooth curve over F2 whose zeta-function we computed. Show that the Euler factor at p = 2 is given by the same formula as in part (a). (c) For x E 7L[(.1)] prime to 3, let x(x) = — ce)i be the unique sixth root of I such that x(x) 1 (mod 3). Let x(x) = 0 for x in the ideal (,./ — 3). Show that L(Es)_ !

xEli

i (Nx)

(d) Let 11/(X) =

e(27c43)Tr(x/W3)

def

where Tr x = x + ï = 2 Re x. Verify that tif(x) is an additive character of Z[w]13 satisfying the condition in Problem 9 of §II.2 (i.e., that it is nontrivial on = IxEztco)/3 X(X)11/(X), where any larger ideal than (3)). Find the value of g(x, x is as in part (b).

8. (a) Let w = (1, co), u e — Z 2 , and t > 0 be fixed. What is the Fourier transform of g(x) = (x + u) • w e -ntl(x+to.wI 2 ? (b) Let 6,i(t) = E2 On), where g(x) is as in part (a). Find the Mellin transform 4)(s) of E x(a + bco)0.(i). tr-. a13,b/3) 0 0}. It is important to note that any g e SL2 (ER) preserves H, i.e., Im z > 0 implies lm gz> O. This is because

Im gz = Im

(az + 0(c-if + d) = az + b icz + df -2 Im(adz + bcZ). = Im Icz + dI2 . cz + d

But Im(adz + bd) = (ad — bc)Im z = Im z, since det g = 1. Thus,

lm gz = icz + d1 -2 Im z

a b for g= c d e SL2(R).

/( 1.2)

Thus, the group SL2 (R) acts on the set H by the transformations (1.1). The subgroup of SL2 (I1) consisting of matrices with integer entries is, by definition, SL2 (Z). It is called the "full modular group", and is sometimes denoted F. We shall denote f d=ef r/+./.. (Whenever we have a subgroup G . denote G1+1 if G contains — ./; if —I0G, we set of SL 2 (R), we shall let 6 6= G.) This group f = s12 (7L)/-Fi acts faithfully on H. It is one of the basic groups arising in number theory and other branches of mathematics. Besides F = SL 2 (7L), certain of its subgroups have special significance. Let N be a positive integer. We define

f(N)7 {(ac b) d ESL 2 (Z)la -=. d -74. 1 (mod N), b :÷-_. c E-2 0 (mod MI . (

1.3

)

This is a subgroup of r = SL 2 (Z), actually a normal subgroup, because it is the kernel of the group homomorphism from F to SL 2 (7L/NZ) obtained by reducing entries modulo N. In other words, r(N) consists of 2 x 2 integer matrices of determinant 1 which are congruent modulo N to the identity matrix. Note that F(1) = F. We shall later see that F(N) is analogous to the subsemigroup 1 + NZ c: Z consisting of integers congruent to 1 modulo N. 1"(N) is called the "principal congruence subgroup of level N". - Notice that F (2) = F(2)/ + I, whereas for N > '2 we have f(N) = because —1 # 1 (mod N), and so — I is not in r(N). A subgroup of F (or of r) is called a "congruence subgroup of level N" if it -contains F(N) (or F(N), if we are considering matrices modulo + I). Notice that a congruence subgroup of level N also has level N' for any multiple N' of N. This is.because F(N) = r(N/). We say that a subgroup of r (or of f) is a "congruence subgroup" if there exists N such that it is a congruence subgroup of level N. Not all subgroups of F are congruence

III. Modular Forms

100

subgroups, but we shall never have occasion to deal with non-congruence subgroups. For our purposes the most important congruence subgroups of F are:

)er ic 0 (mod N)};

170 (N) Fi (N)

c d

a {(c ci)e Fo (N)ia —= 1 (mod MI.

(1.4) (1.5)

bd )e It is easy to check that these are, in fact, subgroups. Note that if F1. (N), then since ad — be = 1 and Nic, it follows that ad 1 (mod N), and hence d 1 (mod N). We sometimes abbreviate the definitions (1.3)—(1.5) as follows:

T(N) =()) mod NI; 01

ro(N) =

0*

1* ) mod N}; I', (N) = {( 01

) mod N},

where * indicates the absence of any congruence condition modulo N. Whenever a group acts on a set, it divides the set into equivalence classes, where two points are said to be in the same equivalence class if there is an element of the group which takes one to the other. In particular, if G is a subgroup of F, we say that two points z 1 , z2 e H are "G-equivalent" if there exists g e G such that z2 = gz 1 . Let F be a closed region in H. (Usually, F will also be simply connected.) We say that F is a "fundamental domain" for the subgroup G of F if every z eH is G-equivalent to a point in F, but no two distinct points z 1 , z2 in the interior of F are G-equivalent (two boundary points are permitted to be

G-equivalent). The most famous example of a fundamental domain is shown in Fig.

:

F def = {z e —1 :5_ Re z

and Izi 1 } .

(1.6)

Proposition 1. The region F defined in (1.6) is a fundamental domain for F.

PROOF. The group F = SL2 (1) contains two fractional linear transformations which act as building blocks for the entire group:

T

1 1) : G1 0 —1

S

i 0

kf(

z + 1; (1.7)

: z —11z.

§1. SL 2 (Z) and its congruence subgroups

101

1 2

Figure III.!

To prove that every zeH is F-eqiiivalent to a point in F, the idea is to lf it use translations Ti to move a point z inside the strip —1 < Re z lands outside the unit circle, it is in F. Otherwise use S to throw the point to bring it inside the strip, outside the unit circle, then use a translation and continue in this way until you get a point inside the strip and outside the unit circle. We now give a precise proof. Let zeH be fixed. Let r be the subgroup of F generated by S and T (we shall soon see that actually F = + F'). If y = ( db ) e F', then Im yz = Im ziicz + dI 2 by (1.2). Since c and d are integers, the numbers Ia . + di are bounded away from zero. (Geometrically, as c and d vary through all integers, the complex numbers cz + d run through the lattice generated by 1 and z; and there is a disc around 0 which contains no nonzero lattice point.) Thus, there is some y = (c' bd) E r such that Im yz is maximal. Replacing y by Ty for some suitable j, without loss of generality we may suppose that yz is in the strip —I< Re yz < But then, if yz were not in F, i.e., if we had lyzi Im yz, which contradicts our choice of y e F' so that Im yz is maximal. Thus, there exists y e such that yz E F. We now prove that no two points in the interior of F are F-equivalent. We shall actually prove a more precise result. Suppose that z 1 , z 2 e F are F-equivalent. Here we are not supposing that z 1 and z2 are necessarily distinct or that they are necessarily in the interior of F. Without loss of generality, suppose that Im z2 Im z1 . Let y = er, e r be sticil that z2 = 1. Since z 1 is in yz i . Since Im z2 Im z 1 , by (1.2) we must have icz i + F and d is real (in fact, an integer), it is easy to see fromzFig. III.1 that this 2. This leaves the cases :i) c 7-7-- 0, d = +1; inequality cannot hold if (ii) c = +1, d = 0, and z 1 is on the unit circle; (iii) c d = .+1 and z 1 = + In casel(i), either y or —y ; (iV)C = —d= +1 and z, +

102

in. Modular Forms

is a translation V; but such a y can take a point in F to another point in F only if it is the identity or if j = + I and the points are on the two vertical boundary lines Re z = +1. In case (ii), it is easy to see that y = + TS with a = 0 and 2 1 and z2 on the unit circle (and symmetrically located with respect . In case to the imaginary axis) or with a = ± 1 and 2 1 = z 2 = ± + 1), and if such a map takes z 1 €F to 22 eF (Hi), y can be written as + we must have a = 0 and z2 = 2 1 = — + `ffs or else a = 1 and z2 = 2 1 + 1 = + Ç3. Case (iv) is handled in the same way as case (Hi). We conclude that in no case can 2 1 or z2 belong to the interior of F, unless +y is the identity and 22 = 2 1 . This proves the proposition. In the course of the proof of Proposition 1, we have established two other facts. Proposition 2. Two distinct points 2 1 , z 2 on the boundary of Fare F-equivalent only if Re 2 1 = ±4 and 22 = 21 + 1, or if 2 1 is on the unit circle and 22 =

In the next proposition, we use the notation Gz. for the "isotropy subgroup" of an element z in a set on which the group G acts: by definition, Gz =

{g eGigz = z}. Proposition 3. If z e F, then

rz +Iexcept

in the following three cases .•

(i) r= ± {/, S } if z = ; (ii) F = ± {I, ST, (ST) 2 } if z = co = + (Hi) F= {I, TS, (TS) 2 } if z = —63 = + Both Propositions 2 and 3 follow from the second part of the proof of Proposition 1. (ST) 3 = —I, and (TV = — I. Thus, in F = Notice that S 2 = SL 2 (7L)/+ I the elements S, ST, TS generate cyclic subgroups of order 2, 3, 3, respectively; and these subgroups are the isotropy subgroups of the elements i, (1), — el-3, respectively. These points in H with nontrivial isotropy subgroups are called "elliptic points". Another by-product of the proof of Proposition 1 is the following useful fact. Proposition 4. The group = SL,(Z)/+I is generated by the two elements S and T (see (1.7)). In other words, any fractional linear transformation can be written as a "word" in S (the negative-reciprocal map) and T (translation by 1) and their inverses. PROOF. As in the proof of Proposition 1, let F' denote the subgroup of f generated by S and T. Let z be any point in the interior of F (e.g. z = 21). Let g be an element of F. Consider the point gz EH. In the first part of thé

103

§1. SL 2 (Z) and its congruence subgroups

proof of Proposition 1, we showed that there exists y e I"' such that y(gz) e F. But since z is in the interior of F, it follows by Propositions 1 and 3 that yg = + I, i.e., g = e r. This shows that any g e r is actually (up to a sign) in r. The proposition is proved. Thus, any element in Î can be written in the form Sal T b, s a, T b2 Sairi, where all of the a bi are nonzero integers, except that we allow a l and/or b1 to be zero; and, since S 2 = —I, we may suppose that all of the ai equal 1, except that a l = 0 or 1. We may also use the identity (ST) 3 —1 to achieve a further simplification in some cases. We now return to the fundamental domain F for F = SL 2 (Z). Recall that in Chapter I §4 we also had a fundamental domain, in that case a parallelogram H c c for the lattice L. In that case the group was L, the action of g EL on a point z E C was simply g(z) = g + z. Every z e C is L-equivalent to a point in H, and no two points in the interior of fl are L-equivalent. In that situation we found it useful to glue together the boundary of fl by identifying L-equivalent points. We obtained a torus, and then we found that the map zi--+ (0(4 p'(z)) gives an analytic isomorphism from the torus CIL to the elliptic curve y2 = 4.r 3 g2 (L)x g 3 (L) (see §1.6). In our present situation, with the group F acting on the set H with fundamental domain F, it is also useful to identify r-equivalent points on the boundary of F. Visually, we fold F around the imaginary axis, gluing the 23 and + iy and the point eti° to e 2 i(112-0) for y > ..rpoint + iy to < O < I-. The resulting set F with its sides glued is in one-to-one correspondence with the set of f-equivalence classes in H, which we denote FVf. (The notation F\l/ rather than H/F is customarily used because the group f acts on the set H on the left.) In Chapter I we saw how useful it is to "complete" the picture by including a point or points at infinity. The same is true when we work with rvi. Let ii denote the set H {co} u G. That is, we add to H a point at infinity (which should be visualized far up the positive imaginary axis ; for this reason we sometimes denote it ico) and also all of the rational numbers on the real axis. These points {co} u Q are called "cusps". It is easy to see that r permutes the cusps transitively. Namely, any fraction a/c in lowest terms can be completed to a matrix (ca e r by solving ad — bc = 1 for d and b; 'this matrix takes co to a/c. Hence all rational numbers are in the same F-equivalence class as co. If r/ is a subgroup of r, then F' permutes the cusps, but in general not transitively. That is, there is usually more than one r' equivalence class among the cusps {co} u O. We shall see examples later. By a "cusp of F'" we mean a F'-equivalence class of cusps. We may choose any convenient representative of the equivalence class to denote the cusp. Thus, we say that "f has a single cusp at oo", where oo can be replaced by any rational number ,) a/c in this statement. We extend the usual topology on H to the set 11 as follows. First, a funda,

-

-

III. Modular Forms

104 00

iC

a Nc

0 a lc

Figure 111.2

mental system of open neighborhoods of co is Nc = {zEHIIM Z >C} L.) {CO} for any C > 0. Note that if we map H to the punctured open unit disc by sending z ,-.4 q d= ef e27tiz

,

(1.8)

and if we agree to take the point co E,171 to the origin under this map, then Nc is the inverse image of the open disc of radius e'Rc centered afthe origin, and we have defined our topology on I/ v {co } so as to Make this map (1.8) continuous. The change of variables (1.8) from z to q plays a basic role in the theory of modular functions. We use (1.8) to define an analytic structure on Hu {po } . In other words, given a function on H of period 1, we say that it is meromorphic at oo if it can be expressed as a power series in the variable q having at most finitely many negative terms, i.e. it has a Fourier expânsion of the form

f(z) =

E an e 2n1nz = E anqn, neZ

(1.9)

neZ

in which an = 0 for n 14 sk(r) =

118

III. Modular Forms

AMk-12(r

) (i.e., the cusp forms of weight k are obtained by multiplying

modular forms of weight k 12`by the function A(z)). (e) Mk (f) = Sk (F) CEk for k > 2. PROOF. Note that for a modular form all terms on the left in (2.21) are nonnegative.

(a) Letfe Mo (F), and let c be any value taken by f(z). Then f(z) c eMo (f) has a zero, i.e., one of the terms on the left in (2.21) is strictly positive. Since the right side is 0, this can only happen if f(z) — c is the zero function. (b) If k < 0 or k 2, there is no way that the sum of nonnegative terms on the left in (2.21) could equal k/12. (c) When k = 4, 6, 8, 10, or 14 we note that there is only one possible way of choosing the v(f) so that (2.21) holds: for k = 4, we must have v,,,(f ) = 1, all other vp(f)= 0; for k = 6, we must have vi(f)= I, all other vp(f)= 0; for k = 8, we must have v.(f) = 2, all other vp(f)= 0; for k = 10, we must have v0(f ) = vi(f)= 1, all other vp(f)= 0; for k = 14, we must have v »(f ) = 2, vi (f ) = 1, all other v(f) = O. Let f1 (z), f2 (z) be nonzero elements of Mk (z). Sincefi (z) and f2 (z) have the same zeros, the weight zero modular function f1 (z)/f2 (z) is actually a modular form. By part (a), f1 and f2 are proportional. Choosing f2 (z) E k (z) gives part (c). (d) For fe Si (r) we have v,„(f ) > 0, and all other terms on the left in (2.21) are nonnegative. Notice that when k 12 and f = A, (2.21) implies that the only zero of A(z) is at infinity. Hence, for any k and any Je Sk (F), the modular functionfl A is actually a modular form, i.e.,f/A E Mk _ 2(F). This gives us all of the assertions in part (d). (e) Since Ek does not vanish at infinity, given fe Mk (f) we can always subtract a suitable multiple of Ek so that the resulting f cEk eMk (f) vanishes at infinity, i.e., f cEk e sk (r). We now prove that any modular form for I" is a polynomial in E4, E6 (see Problem 4 of §I.6 for a different proof of this fact). Proposition 10. Any feAl k (r) can be written in the form

f(z) =

ci,jE4 (2) iE6' (Z) i .

(2.27)

4i+ 6j=k

PROOF. We use induction on k. For k = 4, 6, 8, 10, 14 we note that E4, E6, E4 E6 , E4 E6, respectively, is an element of Mk (r), and so, by Proposition 9(c), must span Mk (f). Now suppose that k = 12 or k> 14. It is clearly possible to find i and j such that 41 4- 6/ = k, in which case E‘it Ei Emk(r). Given fe Mk(r), by the same argument as in the proof of Proposition 9(e)

119

§2. Modular forms. for SL 2 (1)

we can find ceC such that f we can write f in the form

—

cE14ie Sk (F). By part (d) of Proposition 9, -

. f= cElEi + AA . = cELE-4 +

(270 1 2 1728

(E43 — E 62 )fl , s

where fi e Mk-12( 1") , Applying (2.27) to fi by the induction assumption a (with k replaced byjc — 12), we obtain the desired polynomial forf.

The kinvariant.We now define a very important modular function of weight zero:

E4 (Z) 3 2 1728 1728g2 (z) 3 Jkz litf A (z) — E4(z) 3 E6 (Z) •,

(by (2.14)-(2.15))

(2.28)

-

Proposition 11. The function j gives a bijection from FV-1- (the fundamental

dom qin with r-eguivakni sides identified and the point at infinity included) and the Riemann sphere N = C u lool. In the proof of Proposition 9(d) we saw that A(z) has a simple zero at infinity and no other zero. Since g 2 does not vanish at infinity, this means that j(z) has a simple pole at infinity and is holomorphic on H. For any C E C the modular form 1728g3 — cA eM 12 (F) must vanish at exactly one point P El"\H, because when k = 12 exactly one of the terms on the left in (2.21) is strictly positive. Dividing by A, we see that this means that j(z) — c ---z-- 0 for exactly one value of ze F\l/. Thus, j takes co to oo and on o F\li is a bijection with C. PROOF.

Proposition 12. The modular functions of weight zero for F are precisely the

rational functions off. PROOF. A rational function of j(z) is a modular function of weight zero (see Remark 3 at the beginning of this section). Conversely, suppose that f(z) is a modular function of weight zero for F. If zi are the poles of f(z) in rvy, counted with multiplicity, then f(z)•11 i (j(z) j(zi)) is a modular function of weight 0 with no poles in H, and it suffices to show that such a function is a rational function of f. So, without loss of generality we may assume thatf(z) has no poles in H. We can next multiply by a suitable power of A to cancel the pole off(z) at oo. Thus, for some k we will have A(z)Y(z) e M/2k (F). By Proposition 10, we can write f(z) as a linear combination of modular functions of the form EE6ilY (where 4i + 6j = 12k), so it suffices to show that such a modular function is a rational expression in j. Since 4i + 6j is divisible by 12, we must have i = 3i0 divisible by 3 and j :----- 2j0 divisible by 2. But it is easy to check that EPA and EPA are each of the form aj + b, by (2.28); and Ero.Eljo/A k is a product of such factors. This o proves the proposition. —

120

III. Modular Forms

The definition (2.28) is not the only way one could have defined the invariant. It is not hard to see that any ratio of two non-proportional modular forms of weight 12 would have satisfied. Propositions 11 and 12. But j(z) has the additional convenient properties that: its pole is at infinity, i.e., it is holomorphic on H; and its residue at the pole is 1, as we easily compute from (2.28) that the q-expansion off starts out !-1 + • • . Recall that in Chapter I we used the Weierstrass p-function and its derivative to identify the analytic manifold C/L with an elliptic curve in Pt. Similarly, in our present context we have used the j-invariant to identify 1"\ii as an analytic manifold with the Riemann sphere PL Proposition 12 then amounts to saying that the only meromorphic functions on the Riemann sphere are the rational functions. Thus, Proposition 12 is analogous to Proposition 8 in §1.5, which characterized the field of elliptic functions as the rational functions of x = p(z), y =

A loose end from Chapter I. In Chapter I we defined an elliptic curve over C to be a curve given by an equation of the form y2 = f(x), where f(x) is a cubic with distinct roots. We then worked with curves whose equations were written in the form

= 4x 3 — g2 (L)x — g 3 (L)

(2.29)

for some lattice L. It is not hard to see that a linear change of variables will bring an equation y2 = f(x), where f(x) has distinct roots, into the form

y 2 = 4x3 Ax B

with

A 3 — 27B2 0 0.

(2.30)

But in order to write such a curve in the form (2.29) we need to show that a lattice L can always be found such that g2 (L)= A, g 3 (L) z--- B. This was not proved in Chapter I, but it is easy to prove now using modular forms. In what follows we shall write a lattice L= {mo t + nco2 }, where z = c1) 1 1 o2 e H, as follows: L = 2Lz = {nviz n2 }. Here 2 = (0 2 . Thus, any lattice L is a complex multiple of (i.e., a rotation plus expansion of) a lattice of the form L. Proposition 13. For any A, BE C such that A 3 0 27B 2 there exists L =AL z

such that

g 2 (L)

A;

(2.31) (2.32)

PROOF. It follows immediately from the definition 1- 4 g 2 (Lz), and similarly g 3 (2L) = 1 6 g 3 (L3. -

of g 2 that g2 (2Lz)=

By (2.14), we can restate the proposition: there exist 2 and z such that E4 (z) = 24(3/4n4)A and E6 (z) = 26(27/80)B. Let a = 3A14E4,b = 27B/8e. Then the condition A 3 0 27B2 becomes a3 b2 . Suppose we find a value of z such that E6 (z) 2/E4(z) 3 = b 2/a3 . Then choose 2 so that E4 (z) = 24a,

121

§2. Modular forms for SL 2 (7Z)

in which case we must have

E6(Z) 2 = b2E4 (z) 31a3 = 212 b 2,

i.e.,

E6 (z) = +16b.

If we have + in the last equation, then our values of z and 2 have the required properties (2.31)42.34 If we have —, then we need only replace 1 by So it remains to find a value of z which gives the right ratio E6 (z) 2/E4(z) 3 . Now EVE43. = I — 1728/i, by (2.28). Since j(z) takes all finite values on H, and b2/a3 0 I, we can find a value of z such that b 2 la' = 1 — 17281j(z) = E6 (z) 21E4 (z) 3 . This completes the proof. Thus, there was no loss of generality in Chapter I in taking our elliptic curves to be in the form (2.29) for somelattice L.

The Dedekind eta function and the product formula for A(z). We conclude this section by proving the functional equation for the Dedekind a-function, from which Jacobi's product formula for A(z) will follow. The function q(z), z e H, is defined by the product -

ri(z) = e 2 ' 24

(1 11.- kJ. n=1

e2ninz).

(2.33)

(In Problem 7 of §II.4 we gave another definition, and another proof of the functional equation. In the problems below we shall see the equivalence of the two definitions.) Proposition 14. Let .1- denote the branch of the square root having nonnegative

realpart. Then

n(— 1/z) = fzTig(z).

(2.34)

PROOF. The product (2.33) clearly converges to a nonzero value for any ze H, and defines a holomorphic function on H. Suppose we show that the logarithmic derivatives of the left and right sides of (2.34) are equal. Then (2.34) must hold up to a multiplicative constant; but substituting z --= i shows that the constant must be 1. Now the logarithmic derivative of (2.33) is

ty(z) 2ni t n(z) — 24

co

24

E

ne 2. ninz

n= 1 1 — e 2irin )

If we expand each term in the sum as a geometric series in qn (0? = e27'), and then collect terms with a given power of q, we find that til(z)

_27ri

?Az) — 24

— 24

o-1 (n)qn) =

(2.35)

Meanwhile, the logarithmrc derivative of the 'relation (2.34) that we want is:

122

III. Modular Forms --,

rii (— 1 I z) z _2 ti(— 11z)

(2.36)

2z

Using (2.35), we reduce (2.36) to showing that E2 ( — liz)z -2 =

12 + E2(Z), 2niz 0

and this is precisely Proposition 7. Proposition 15.

(2/0 -12 A(z) -z--- q fl (1 — q") 2 4 ,

q = elniz

(2.37)

.

n=1

PROOF. Let f(z) be the product on the right of (2.37). The function f is holomorphic on H, periodic of period 1, and vanishes at infinity. Moreover, f(z) is the 24-th power of q(z), by definition; and, raising both sides of (2.34) to the 24-th power, we find that f(-11z) = z i2f(z). Thus, f(z) is a cusp form of weight 12 for r . By Proposition 9(d), f(z) must be a constant multiple of d(z). Comparing the coefficient of q in their q-expansions, we conclude that o f(z) equals (270' 2 A(z). Notice the role of the Eisenstein series E2 (z) in proving the functional equation for ri(z), and then (2.37). The 17-function turns up quite often in the study of modular forms. Some useful examples of modular forms, especially for congruence subgroups, can be built up from ri(z) and functions of the form ri(Mz). The q-expansion in (2.37) is one of the famous series in number theory. Its coefficients are denoted T (n ) and called the Ramanujan function of n, since it was Ramanujan who proved or conjectured many of their properties: Go co

E

efq t(n) q n d=

n=1

fi (1 — qn)24•

n=1

Among the properties which will be shown later: (1) t(n) is multiplicative (t(nm) = t(n)t(m) if n and m are relatively prime); (2) t(n)-.=- 0-11 (n) (mod 691) (see Problem 4 below); (3) t(n)/n 6 is bounded. Ramanujan conjectured a stronger bound than (3), namely: F rom < n i 1/20_o (n) (where o (n) is the number of divisors of n). The Ramanujan conjecture was finally proved ten years ago by Deligne as a consequence of his proof of the Weil conjectures. For more discussion of t(n) and references for the proofs, see, for example, [Serre 1977] and [Katz 1976a]. -0

PROBLEMS

1. Prove that for k > 4: Ek (z) = 4

E rn,nER (rn,n)=1

On: + nr k .

123

§2. Modular forms for SL 2 (7L) 2. What is E2 (0?

3. (a) Show that El = E8 E4E6 = E10, and E6E8 = E14 (I)) Derive relations expressing a 7 in terms of a3 ; a9 in terms of a 3 and as ; and a 13 in terms of as and u7 . 4. (a) Show that E12 — El = di, with c = (270- 12 26 35 7 2/691. (b) Derive an expression for t(n) in terms of a„ and 475 . (c) Prove that t(n) a 1 1 (n) (mod 691). 5. Prove that E4 and E6 are algebraically independent; in particular, the polynomial in Proposition 10 is unique. 6. (a) Prove the identity

oo

n aX n

cic

E

JO X n

n1 n= 1 (b) Let a 1 (mod 4) be an integer greater than L Show that Ea 4. 1 (i) = 0, and prove the following sequence of summation formulas: co

E1 e

n=

/1/504, no

B +1 = 1/264,

t, —

2(a + 1) a

1/24,

u = 5;

= 9; a = 13, etc.

a

7. Lef(z) be a modular form of weight k for F. Let

1 .f (z) —k E2 (z) f(z). g (z) =12 2n Prove that q(z) is a modular form of weight k if and only iff(z) is a cusp form.

2 for F, and that it is a cusp form

. 8. (a) Prove that E6 = E4 E2 and E8 = E6E2 — (b) Derive relations expressing a5 in terms of al and a3 , and o-7 in terms of a l and C5 .

9. Recall the following definitions and relations from Problems 4 and 7 in §11.4. Let x be a nontrivial even primitive Dirichlet character mod N. Define GO

0 (X, 1 ) =

E An)

2

Re 1>0.

n= 1

Then OCX, = (N 2 0 -1129(X) 0 ( 0-5 1/N 2 t). Now let x be the character mod 12 such that x(+ 1) = 1, x( + 5) = —1, and define (z) =-- 0(x, — I 12) for z H. Then Pi( — 1/z) = Vz/i (a) - Prove that Ti(z + 1) = e 224 - ) and that ri 24 e Si 2 (F). (b) Prove that i)(z) = (c) Write the equality in part (b) as an identity between formal power series in q . (Note. This identity is essentially Euler's "pentagonal number theorem." For a discussion of its combinatoric meaning and two more proofs of the identity, see [Andrews 1976, Corollaries 1.7 and 2.9].) 10. Find the j-invariant of the elliptic curve in Problem 3(b) of §1.2, which comes from the generalized congruent number problem. What is j in the classical case = 1?

124

In. Modular Forms Prove that there is no other value of 2 e15 for which the corresponding j is an integer. It is known that an elliptic curve with nonintegralf cannot have complex multiplication.

§3. Modular forms for congruence subgroups Let y = ( ) e f = SL 2 (1), let f(z) be a function on ii =1/uQu {co} with values in C u {co } , and let k e Z . We introduce the notation . f J[y]k to denote the function whose value at z is (cz_+ d) -1R(az + b)I(cz + d)). We denote the value off f [y] k at z by f(z)l[y] k :

f(Z)I Mk d:7--:f (CZ

+

dr kf(YZ) for

y

= 7 a b) ) e r. c dj

(3.1)

More generally, let GL(Q) denote the subgroup of GL 2 (Q) consisting of matrices with positive determinant. Then we define

f(z)l[y] k d=ef (det y)142 (cz ± d) -if(yz)

for

y = (ac bd e G LI" (0). )

(3.2) For example, if y = (f; °a ) is a scalar matrix, we have f i[y] k .--.--- f unless a < 0 and k is odd, in which case f j[yi k = f. Care should be taken with this notation. For example, by definition f(2z)i[y] k means f l[y] k evaluated at 2z, i.e., (2cz + d)Y(Paz + NI (2cz + d)). This is not the same as g(z)I[y] k for g the function defined by g (z) = f(2z); namely, g (z)l[y] k = (cz + d)- k g (yz) = (cz + d)_"f(2(az + b)I (cz + d )) . With this notation, any modular function of weight k for f satisfies fl[y]k =-- f for all y e I". Some functions are invariant under DI for other y e GL (Q). For example, recall the theta-function defined in §4 of Chapter II: 0(t) = EnE2 e'n2 for Re t > O. We saw that it satisfies the functional equation 0(1) = r 112 0(11t) (where \r- is the branch such that .-If = 1 ) . We define a function of z e H, also called the theta-function, by setting 0(z) = 0(— 2iz), i.e., —

0(z) = E

e2

_E

ne2

Substituting

—

qn2 for

ZEI-19

q = e2niz.

(3.3)

neZ

2iz for t in the functional equation fox : 0, we have 0(z) = (2z/0 -112 0( — 1/4z).

(3.4)

Squaring both sides and using the notation (3.2), we can write

° III= _ ie 2

for

_. = I E GLI(Q). Y (— 4 0 )

(3.5)

125

§3. Modular forms for congruence subgroups

_Notice that

fiEv1v21,=(fiEv1l,)1[Y2]k

for y 19 y2 e GLI(G).

(3.6)

To see this, write (det y) k12 (cz + d ) k in the form (dyzIdz) k12 , and use the chain rule. We are now ready to define modular functions, modular forms, and cuspforms for a congruence subgroup r c F. Let qN denote e 2z1z .

Definition. Let f(z) be a meromorphic function on H, and let I' r be a congruence subgroup of level N, i.e., r' r(N). Let k EL. We call f(z) a modular function of weight k for r if

fi[y] h =1 for all

y e r',

(3.7)

and if, for any y0 eJT = SL2 (1), f(z)l[yo]k has the form Ea,A

with

an = 0 for n a ne n= no

= (a/d)'42

E a e 27ribnAIN nan Nd n = no

n

briggrd, = nE = ano

where

10

bn

if ci,11 n;

(ald)ka e 2 ninbladN ania

if a

in.

This proves the lemma.

Li

We now turn to the proof of the proposition. The first assertion in part (c) follows from Lemma 1. Now suppose that fe Mk (r). Then for a y'a e r" (where y' e I') we have (f Mk)! [a' y' = (fi [yl k)i = f [a]k ; • in addition, for yo e F we have (f [Œ]k)1 [YO] k = fi [ayo ] k , and the condition (3.8) holds for f [a] k , by Lemma 2. Thus,f1 e Mk (F"). Iff vanishqs at all of the cusps, then so doesfl[a] k , by Lemma 2. To obtain the last assertion in Proposition 17(a), we write a = ?), g = N - k2fl[a]k , and note that a-l Fa n F = Fo (N). The values at the two cusps co and 0 come from the above formula for bn with n = 0 and a replaced by ( ?) at the cusp co and by (i:4,1 Sa Iv) ) at the cusp O. This completes the proof of Proposition 17(a). (1)) Let = e2 , and let g =EN x i (j)tj be the Gauss sum. Then co (1 Nil

N —1

f 1 (z) =

E

n=0

1=0

N v=0 co

N —1 = Ar IT

anqn

(v) X1 (1V)

1,v= 0

iv

E ane2 irtn(z—vIN) n= 0

N —1 = n

N

E0

(v)fiz — vIN).

Now let y = bd ) e F0 (MN 2). We want to examine fxa (yz). 'Let yv denote k0 1 . Then for each y and y', 0 < y y' < N, we compute ,

yory;; 1 =

— cvIN b + (v'a vd)IN cvv'IN 2) d + cv'IN

•

129

§3. Modular forms for congruence subgroups r

If y' is chosen for each y so that v'a -=-- vd mod IV (such a choice is unique, because a and d are prime to N), then we have yvyy,7, 1 E ro (M and so ),

fy

4 1 (yz)=--A ,.

N-1

E -ii(v)ftyvvy

1 1, z)

n N-1

=

N v=o

y i (v)x(d)(cy,,z + d + cv'IN) kAy,,z) a

=

N-1

X(d)(ez + d) k --- E -ii(of(z N v. 0

- v' IN).

But )71 (v) = -ii (036. (a)h-Ci (d) = X1(c1) 2 i1(V). Thus, a

xi fti (yz) = xxi(d)(cz + d) k ----'IV

N-1

E i-c-- ,(0ftz - v' IN) = xxi(d)(cz + d)kfx, (z),

V=0

and so 4, i has the right transformation formula to be in Mk (MN 2 , XXI)The cusp conditions are verified by the same method as in the proof of part (a) of the proposition. Namely, for all y E F,fx ,(z)1 [y] k is a linear combination a off(z)l [m ]k, and so the cusp conditions follow from Lemma 2. The next proposition generalizes Proposition 9(a) in the last section. Like Proposition 9(a), it is useful in proving equality of two modular forms from information about their zeros. Proposition 18. Mo (r) = C for any congruence subgroup

r c F. That is,

there are no non-constant modular forms of weight zero. Letfe Mo (r), and let a = f(z 0) for some fixed zo e H. Let r = U air be a disjoint union of cosets, and consider g d=ef ri (f 1[000 — a), i.e., PROOF.

g(z) =

ri (f(Œ

-1 z)

— a).

(3. 1 1)

Then g(z) is holomorphic on H, and it satisfies (3.8), because f does. M oreover; given y f - we have gl [y-1 ] 0 = II (f i [(yai) 1 0 — a). But since f 1[(4-10 does not change if a is replaced by another element ay' E ar, and since left multiplication by y permutes the cosets ai r, it follows that { f I[(y air1 ] 0 } is merely a rearrangement of { f i[oci 9 0 } . Thus, g l[y- 1 0 = g, and we conclude that g e Mo (T). By Proposition 9(a), g is a constant. Since the term in (3.11) corresponding to the coset ir isf(z) — a, it follows that for z = z0 the product (3.11) includes a zero factor. Thus, g = O. Then one of th`e factors in (3.11) must be the zero function (since the meromorphic functions on H form a field). That is, f(cci l z) — a = 0 for some j and for all z e IL Replacing z by aj z, we have: f(z) = a for all ze H, as claimed. o As an example of the applications of Proposition 18, we show that for any positive integers N and k (with k even) such that k(N + 1) =--- 24, a

-

130

HI. Modular Forms

cusp-form of weight k for ro(N) must be a multiple of (n(z)ri(Nz)) k . Here n(z) = e 2riz/24 nn (1 _ qn) , q = e2, as in §2.

Proposition 19. Let f(z) be a nonzero element of sk (ro(N)), where N = 2, 3, 5, or 11 and k = 8, 6, 4, or 2, respectively, so that k(N + 1) = 24. Then f(z) is a constant multiple of g(z) aTf (ti(z)ri(Nz)) k . By Proposition 17(a), g(z) N+1 ---= A(z)A(Nz) is an element of S24( r.0 (N)). In addition, g(z) N+1 is nonzero on H, since A(z) 0 0 on H. At infinity, g(z)' = q N+1 n (1 _ q n)24(1 _ en\) 24 has a zero of order N + 1 in its q-expansion. At the cusp zero, we write -24z12A (z)(zIN) 12A(zIN) oz , , N+1 I[S]24 = z -246,(-1/z),A(—Nlz) = z ) = N -12 q r1 ii 0 _ 4)240 _ q 11 1/1)2 4 , PROOF.

which has a zero of order N + 1 in its qN-expansion. On the other hand, since fesk (ro (N)), its q-expansion at oo is divisible by q, and the 'isexpansion of fl[S] k is divisible by qAi. Now (PO', as a ratio of two elements of S24 (f0 (N)), is a modular function of weight zero for Fo (N). Since g(z) 0 0 on H, this ratio is holomorphic on H. Moreover, the qexpansion off ' is divisible at least by q N+1 , and at zero the qN-expansion in g' and the 4 +1 in is divisible at least by q"- . Hence, the g N+1 1[S] 24 are 'canceled, and the ratio is holomorphic at the cusps, i.e., ( .fig ) v +1 e Mo (fo (N)). By Proposition 18, ( flg) N+1 is a constant, and hence fig is also a constant. This completes the proof. 0

ei +1

Notice that we did not actually prove that g(z) ---= (ri(z)ri(Nz)) k is in sk (ro(N)), unless we can be assured that there exists a nonzero element feS k (ro (N)). The same proof, for example, would tell us that any nonzero Je S3 (r0(7)) must be a constant multiple of (n(z)v(7z)) 3 ; but s3(r0(7)) = 0, since 3 is odd and — /e I-10 (7). However, for the values of N and k in Proposition 19 it can be shown that dim sk(ro(N)) = 1 (see Theorem 2.24 and Proposition 1.43 in Chapter 2 of [Shimura 1971]). Thus, Proposition 19 is not vacuous. For N = 2, k = 8 we can see this directly, since we know that T and ST 2S ------ ( 1 _?) generate r0 (2) (see Problem 13(b) of §III.1).

Proposition

20. (q(z)ri(2z)) 8 E S8(1-0(2)).

Clearly, g(z) = (ri(z)q(2z)) 8 is holomorphic on H. Its q- expansion at oo is: ( e 2iriz/24±2ni2424)8 n 0 ...... qn)8 (1 _ q2.)8 = q ri 0 _ qn)8 (1 _ q 2n) s . Using the relation ?A-1/z) =--- V zlig(z), we easily see that g(z) vanishes at the cusp 0 as well. It remains to show invariance under [(1 _?)] 8 . Set a = (2 1). Then (1 _?) = lane, and PROOF.

g (z)l[a] 8 = 24 (20 -8 0/( — 1/2z)q( — 1/z)) 8 = (2z2) -4(V2z/i n (2z) / ti(z)) 8 = (r/(z)t/(2z)) 8 =-- g(z).

131


Since g[T]8 = g, and Ha is always trivial, it follows that g is invariant under la Ta, as desired.

Eisenstein series. Let N be a positive integer. Let a = (al , a2) be a pair of integers modulo N. We shall either consider the ai as elements of ZINZ or as 'integers in the range 0 < ai 3 to be either odd or even. Proposition 21. Gkcj-mod

N

mk(r(N .

)) and Gr a 2 ) mod N e mk (r i (N) ).

PROOF. First, the series (3.12) is absolutely and uniformly convergent for z in any compact subset of H, because k > 3 (see, for example, Problem 3 in §1.5). Hence G(z) is holomorphic in H. Now let y = (ca bd) e F. Then G(z)I [y] k =

E

(cz + dyk

mr, a mocIN

E

k ( az + b mi cz-+ d + m2)

t(p"

mz-Eamod N

W" 1"" + m2 c)z + ()nib + rn2c1))k

Let m' = (m i a + m2 c, mb + m 2 d) (m i m2 )( ca bd) =m1). Note that modulo N we have m' ay. Let a' = ay (reduced modulo N). Then the maps mi--m/ = my, and m' = m'y' give a one-to-one correspondence between pairs m e V with m a mod N and pairs m' e Z 2 with m' a' mod N. This means that the last sum above is equal to Em' Ea' mod NOnfiz + (z). Thus, = Gkay mod N

Gg-m°dN

for y e re

(3.13)

If 'y e F(N), then by definition y I mod N, and so ay a- a mod N. Thus, (3.13) shows that G: is invariant under [y] k for iy e F(N). Similarly, if y (N) and a l = 0, we have (O a2)y (0, a2) mod N, and so Gra2) m0d N is ,

132

HI. Modular Forms

invariant under [y]k for y e f 1 (N). It remains to check the holomorphicity condition at the cusps. But [yoik permutes the G" dN for yo e F, by (3.13). Hence it suffices to show that each G: is finite at infinity. But

E

lim GAz) =

z-4.iao

if a l 0 0;

0

E

mEarnedN,m1=0

n"

if a 1 = O.

n-rta 2 mod N

The sum / n" over n L--s a2 mod N converges because k >_ 3. It is essentially a "partial" zeta-function. More precisely,

± (_ l) CCk.` - a2( ) 9 where Ca(k)i-Q. c Gr,a 2)(co ) _ a,(k) ts1 n.s.---amodN

This completes the proof of the proposition. As a special case of (3.13), if we set y = —! we have

GI-(2 = (— Div. This can also be seen directly from the definition (3.12). Thus, for example, G: = 0 if k is odd and 2a--= 0 mod N. It is now not hard to construct modular forms for any congruence subgroup F', f = I"' D f(N), out of the Eisenstein series G:d rno N . For fixed a, the elements y E F' permute the Eisenstein series G:, where a' ranges over the orbit of a under the action of r, i.e.,

g' Gel ra foiverl. Let r = #(41') be the number of elements in the orbit. If F(Xi , .. . , X,.) is any homogeneous symmetric polynomial in r variables with total degree d, and we set Xi = Gri (where ayi rubs through the orbit ar'), then F(Gr', . . . , Ge ') is easily seen to be a modular form of weight kd for F'. For example, taking F to be X1 + - • - + Xr or X1 . . . Xr , we have E Ge

,„2 , mod

N(z )

€ mk (ro(N)) ;

a2 e (7L /NZ)* Gi(c 0 , a2) mod AT( 2.) e mo(N),,

n

(3.16) ( Fo(N)).

a2 e(ZINZ)*

We now compute the qN-expansion for G:. We shall be especially interested in the cases when a l = 0 or a2 =---- O. Recall the formula used in deriving the q-expansion for Gk in the last section (see the proof of Proposition 6):

1 k = (..._ ok--12c(k).___ k 'E i• e2nijz nE ez (z + n) Bk j = 1

for

k > 2,

z e H.

(3.17) Let

133


c

cf

a

=

NkBk

{0 k d7f Ca2(k) (— 1) k r i2 (k)

if a / 0 0; if a / = O.

(3.18) Then

G:(z) =

14,

E + ni 1 :4a 1 mod E N,m i *0 m2 Ea 2 mod N

(

E

1)"

-k

N + n)

m i Ea, mod N,m i >0 neZ

+ N -k

in2) –k

m iz + a2

E

=b+N

iz +

E

(m i z - a2

m 1 E–a 1 modN,m i >0 nEZ

=

00

E

ck(

ik_ie2Rifi(rniz+a2)/N)

m i Ea ' mod N j=1 m i >0 03

E

(—ok

m1=--- – al mod m >0

e2iriAmiz–a2)1N)

E N j=1

In these computations we had to split up the sum into two parts, with m i replaced by m 1 for m i negative, because in applying (3.17) with z replaced by (m / z ± a2)I N we need (rn i z + a 2)IN E II, i.e., m 1 > O. Now let -

qN = e2nizIN

crf e2mmi

(3.19)

Then G(z) =

E

b + ck

00 E i k„ va2q.km 1

m 1 E-a l mod N j=1

(3.20)

( _ ok

co

E

E

m i s--. –a i mod N j=1 m >0

To find the coefficient 91; of q?k, it remains to gather together terms with = n. We shall only do this in the cases when a l = 0 and a2 = O. As a result, we have the following proposition. Proposition 22. Le t ck ,bt k ,,q N be as in (3.18) (3.19). For k > 3 let Gklm°dN (z) -

a mod be the Eisenstein series (3.12). Then the qN- expansion of G N

G:(z) = bt k + E

(3.21)

tfirT07;

n=1

Can be computed from (3.20). If a = (a 1 , 0), then for n > 1 = Ck(

E iln n/jE-..-`. a l mod N

j k, +( _ ok

E jjn

nijE –a l mod N

(3.22)

134

HI. Modular Forms

If a = (0, a2), then for n __ I

bNa n,k = ck E ik—i (Va 2 ± ( ..._1)k—i(2 2) . (3.23)

er, k = 0 if N )(n;

jln

Thus, for a = (0, a 2) co

q = e2niz .

Ge ,a 2)(z) = b(00:ka2 ) ± ck 1 E i k, (Va 2 ± (____ 01,: ja 2) q n 1 n=1i

(3.24)

Proposition 23. If 2a a- (0, 0) mod N and k is odd, then G laT md N = O. Other-

wise, Ge'a2) is nonzero at co, and Ge" 9) has a zero of order min(a i , N — a 1 ), where we are taking al in the range 0 N/2. n=c, ( a2 + nN) k (N — a 2 + nN)k)t< 0 I

Finally, we look for the first possible value of n in (3.22) for which either sum in (3.22) is nonzero. That value is n = min(a i , N — a l ), where we have the possible value/ = 1 in one of the two sums. Thus, b' °) = +ck for n = 0 min(a / , N — a 1 ). This completes the proof.

- "dN can be As an application, we show that a certain product of G31" expressed in terms of the n-function. We shall use this result in the next section, where we give Hecke's proof of the transformation formula for 0(z). Recall the Weierstrass p-function and its derivatives from Chapter I: So(z; w 1

,

(0

1 1 1 • 2) = —2z + „iI ,„ ez l (z + nu°, + nw 2) 2 (n2C01 + no) 2 ) 2 '

sg (z co 1 , co 2) = —2 E '

(z

;

+ m, u) + nco2)- 3

;

m,neR

(0 1 9

(0 2) =

E

(— l )"(k — 1)!

(z

+ imo, + nw 2) -k ,

k > 3.

m,neZ

If a 0 (0, 0), we can express Gk'llmd N (Z) in terms of gp (k -2) as follows: G: mod N (z)

—k a i z + a., .`+ Ira + n) N m,nEZ

= N _k z

( -- Ok = Nk (k — l)!

(k_ 2 )

al z + a2 ;

(3.25)

Z,

1) . This is the value of p (k-2) for the lattice L, = linz + n1 at a point of order

135


N modulo the lattice, namely (a1 z + a2)/N. Thus, the Eisenstein series for r(N) are related to the values of the derivatives of p at division points. Now suppose that k = 3. By (3.25), G(z) can vanish only if 0 1 ((a1 z + a2)IN ; z, 1) = O. But pi vanishes only at half-lattice points (see the end of §1.4). If 2a ET: (0, 0) mod N, i.e., if 2(a 1 z + a2)IN is a point in L z , then G39- is the zero function, by Proposition 23. Otherwise, oVa l z + a2)IN ; z, 1) is nonzero. We have proved Proposition 24. If 2a # 0 mod N, then Grd N (z) V: lr ) for z-e H.

Let p be an odd prime. We now define 12- 1

h(z) =-- n Gl o,a2 ,modp(z).

(3.26)

a 2 =1

Proposition 25. h(z)e M3(p ... 1)(1- 0( p)), its only zero is a (p 2 — 1)/4 —fold zero at 0, and it is a constant multiple of (TIP(z)lri(pz)) 6 . The first part is the special case of (3.16) when N = p, k = 3. h(z) is nonzero on H by Proposition 24, and at infinity by Proposition 23. To find its order of zero at 0, we examine the qp-expansion of PROOF.

hl[S]3 (p _i ) =

P— 1 P-1 [I Gr 2 ,0)m04 p n Gl o 2,modp i[s h= ,a

a 2 =1

a2=1.

by (3.13). According to Proposition 23, the first power of qp which appears in Gr2 ' 0)rmdP is qpinin("2 4'- '2 ). Thus, the order of zero of h(z) at 0 is (1) - 1)12

12 - 1

E min(a 2 , p --: a2) = 2 a 2 =1

E

a2 = (p2 — 1)/4.

a2 =1

Now set 172(z) == (re(z)li(pa. Then î(z) 4= A(z)P I A(pz) is holomorphic and nonzero on H, since A(z) is holomorphic and nonzero on H. Because A(z) e S12 (r) c S12 (1-0 (p)) and A(pz) e Si2 (ro(p)) by Proposition 17(a), it follows that h(z) 4 is a modular function of weight 12(p — 1) for Fo (p). Its q-expansion at infinity is q p no ...... q n) /q p no _ e p) 24 _ n „, _ q n) p,o 24" hence 1i(z) 4 is holomorphic and nonzero at co. At the cusp 0 we have I 2(p-1) • f — , at 11z)P-IA(—plz) kz)4 1[S] 1 2(p-1) = z— = z-12(p-1)Z12PA(z)P1((zIp)12A(zip)) = p'2 A(z)P I A(z Ip) = p 12 q p no .....

q .) 24p/qpn(1

qnp)24 1

which has leading term p 1242-1 . Thus, both h4 and ii 4 are elements of M12(p_ 0(1"0 (p)) with no zero except for a (p2 — 1)-ordér zero at O. Hence their ratio is a constant by Proposition 18. But (///h )4 = const implies that hiii = const. This concludes the proof of Proposition 25. 0

136

III. Modular Forms ,

Notice that, by (3.15), we have some duplication in the definition (3.26) of 12( 2). That is, if we define ( p— 1)12 f(z) = fi , Gr,a2) mod p(z ) ,

(3.27)

a2 =1

we find, by (3.15), that

h(z) = (

0

( p-1 )/2f(z)2.

'

—

(3.28)

Proposition 25 then tells us that the square of the ratio off(z) to 07 P(z)11(pz)) 3 is a constant. Hence, the ratio of those two functions must itself be a constant, and we have proved Proposition 26. The function f(z) defined in (3.27) is a constant multiple of

(r1P(z)M(pz)) 3 . Because each of the GV'a2) in (3.27) is in M3 (1", (p)) by Proposition 21, it follows that fe M3(p _ 0/2 (1"1 (p)). However, unlike 144 f(z) is not, strictly speaking, a modular form for the larger group Fo (p). Proposition 27. Let f(z) be defined by (3.27), and let y = Ca D e Fo (p). Then

f l[y] 3( P — 1)/2 = (1,)f, where (i) is the Legendre symbol (which is +1, depending on whether or not d is a square modulo p). PROOF.

Since (0, a 2 ) (° )2L---_ (0, da2) mod p, it follows by (3.13) that

f I [y] 3,p - 1 „ 2 =

(p-1.)/2

fi Gr ,da 2

)

mod p .

a 2 =1

But by (3.15), the terms in this product are a rearrangement of (3.27), except that a minus sign is introduced every time the least positive residue of da2 modulo p falls in the range (p + 1)/2, (p + 3)/2, . . . , p 1. Let nil be the number of times this occurs. Thus, fl[Y]3(p012 = ( 1)ndf. According to Gauss's lemma, which is an easily proved fact from elementary number theory (see, for example, p. 74 of [Hardy and Wright 1960 ]) , we have (— 1)d = (1). This proves the proposition. 0 —

—

The transformation formula in Proposition 27 is an example of a general relationship between modular forms for r1 (N) and "twisted-modular" forms for Fo (N), called "modular forms with character". We now discuss this relationship. We start with some very general observations. Suppose, that F" is a subgroup of F', and f(z) is a modular form of weight k for the smaller group F" but not necessarily for the bigger one. Then for y e I' we can at least say thatf i[y - l k only depends on the coset of y modulo F". That is, if y" e r", then fl [(yy") - 1 = fi[y- lk . Now suppose that the subgroup F" is normal in F'. To every coset yF" associate the linear map fF--> f l[y- lk which takes an element in Mk(F") to Mk (F") by Proposition 17 (with F" and y -1 in place of F' and a; note that yF"y -1 n F = r", since y e F' and F" is normal in F'). This gives us a group

137

,


-

1

homomorphism p from F'/F" to the linear automorphisms of the vector space Mk(F"), because for yi , y2 e F' , P(Yi Y2): fE.-fl{(yiy2) - 9k = (fl[Ynk)i[Yilk = POO (p(y2)f).

In other words, we have what is called a "representation" of the group

r'/r" in the vector space mk(r"). If : F'/F" -- C* is a character of the quotient group, then define Mk(F', x) to be the subspace of Mk(F") consisting of modular forms on which the representation p acts by scalar multiplication by x, i.e.,

mk(r, x) a tfe Afk(r")1p(y)f= lLe ---,

x(y)f for all ye

r}.

If F'/]T" happens to be abelian, then according to a basic fact about representations of finite abelian groups, Mk(F") decomposes into a direct sum of Mk(F', x) over the various characters x of FYI'. We shall soon recall the simple proof of this fact in the special case that will interest us. We apply these observations to the case F" = ri (N), F' = Fo (N). Since ri (N) is the kernel of the surjective homomorphism from FO (N ) to (11 N 1)* that takes (ca db to d, it follows that Fi (N) is a normal subgroup of Fo (N) with abelian quotient group isomorphic to (ZINZ)* (see Problems 1-2 of §III.1). Let x be any Dirichlet character modulo N, i.e., any character of (Z/NZ)*. In this context the subspace Mk(ro(N), x) Mk(ri (N)) is usually abbreviated Mk(N, x). That is, )

Mk (N, x) ,1---3 ffe Mk (Fi (N))I f l[y] k = x(d)f for y = (

a 11 ) e ro c_ d

(N4 (3.29)

In particular, if x = 1 is the trivial character, then Mk (N, 1) =

mk(ro(N)).

Proposition 28. Mk (f i (N)) = 10Mk (N, x), where the sum is over all Dirichlet /characters modulo N. As mentioned before, this proposition is actually a special case of the basic fact from representation theory that any representation of a finite abelian group decomposes into a direct sum of characters. However, we shall give an explicit proof anyway. First, any function f that satisfies f 1[y] k = x(d)f for two distinct characters x must clearly be zero; hence, it suffices to show that anyfe Afk (Fi (N)) can be written as a sum of functions fx E x). Let , PROOF.

Mk(N,

1,

fx = 0(:AT)

d

E

k(d)f I[Ydlo

where yd is any element of Fo (N) with lower-right entry congruent to d mod N. We check that for y = (` S)e fo (N)

fx I Hit =--- 4)(1N) d Zz)* TCfrOf f Fy 1 which, if we replace dd' by d as the variable of summation, is easily seen

138

HI. Modular Forms

to equal x(d)f, i.e.,ft e Mk (N, x). Finally, we sum theft over all characters x modulo N, reverse the order of summation over d and x, and obtain:

Efx =

1

_

Ex(d) E N) x d. ( zINE) * o(

fiETL,

which is equal to f, because the inner sum is 1 if d = 1 and 0 otherwise. o Thus, f can be written as a sum of functions in Mk (N, x), as claimed.

Notice that Mk (N, x) = 0 if x has a different parity from k, i.e., if (— 1) 0 (-1)k . This follows by taking y = —I in the definition (3.29) and recalling that f i[ nk = ( 1)k. f For example, as an immediate corollary of Proposition 28 and the pre7 ceding remark, we have —

—

Proposition 29.

AO , 1), k even; 4 AM r 1( )) = I ./IA(4 , x), k odd, where 1 denotes the trivial character and x the unique nontrivial character , modulo 4. Notice that the relationship in (3.29) is multiplicative in y; that is, if it holds for y l and y2 , then it holds for their product. Thus, as in the case of modular forms without character, to show that f(z) is in Mk (N, x) it suffices to check the transformation rule on a set of elements that generate F o (N). As another example, we look at 0 2 (z) = (Inez qn2)2 , whose n-th qexpansion coefficient is the number of ways n can be written as a sum of two squares. Proposition 30. 0 2 e M1 (F 1 (4)) ---.--- Aft (4,

x), where x(d) = (

-

1)("w 2 .

the transformation rule for —I, T, and ST 4S = (-1 _7), which generate To (4) (see Problem 13 of §1117.1). This is immediate for T, since 02 has period F. Next, the relation fl[ n i = f= x( 1)f holds for anyf, by definition. So it remains to treat the case ST4S. Let PROOF. It suffices to verify

—

—

—

(0 —1 CtIV d=ef V

and

aN

aN-1 = —laN , N

so that

o) '

(a b) -1 c d ccN

(

=

—c/N\ — Nb a ). d

(3.30)

We write ST4S = —a4 Tcc4-1 = ia4 Tec4 , and use the relationship 0 2 1 [cc4j i _j® 2 (see (3.5)) to obtain 021[574511 z..= 021[Œ4T,,4]1 =_._ — 021[Ta4]1 = —i®21 [°44]1 [Œ4]i = (Recall that the scalar matrix il acts trivially on all functions, i.e., [1/4], identity.)

6111•••18 ■

-

139


To finish the proof of the proposition, we must show the cusp condition, i.e., that 0 2 1[701 1 is finite at infinity for all yo e T. But the square of 0 2 1[y0] 1 is 04 I [To] 2 , and it will be shown in Problem 11 below that 04"e M2 (ro (4)) ; in particular, this means that CO'f[yo ] 2 , and hence also Col [yo]i , is finite at o infinity. This completes the proof. The spaces Mk (N, x) include many of the most important examples of modular forms, and will be our basic object of study in several of the sections the follow. We also introduce the notation Sk (N, x) to denote the subspace of cusp forms: Sk (N, x) j=f2Mk (N, x) n S k (r i (N)).

The Mel/in transform of a modular form. Suppose that f(z) = E an 4 (where q Iv ......_ e 2niz1N ) is a modular form of weight k for a congruence subgroup F' of level N. Further suppose that Ia n ' = 0(n') for some constant c ER, i.e., that an/ne is bounded as n --> co. It is not hard to see that the qN-expansion coefficients for the Eisenstein series G:m" N have this property with c = k — 1 + e for any r, > O. For example, in the case F' = F, the coefficients are a constant multiple of o-k _ i (n), and it is not hard to show that ak _ 1 (n)I rl k—i+e 0 as n ---÷ cc . We shall later show that, if f is a cusp form, we can take c = lc/2 + E. It has been shown (as a consequence of Deligne's proof of the Weil conjectures) that one can actually do better, and take c = (k — 1)/2 + E. In Chapter II we saw that the Mellin transform of OW = E e -'2 and certain generalizations are useful in investigating some important Dirichlet series, such as the Riemann zeta-function, Dirichlet L-functions, and the Hasse-Weil L-function of the elliptic curves En : y2 = x 3 — n2x. We now look at the Mellin transform for modular forms. Because we use a variable z in the upper half-plane rather than t (e.g., t = — 2iz), we define the Mellin transform by integrating along the positive imaginary axis rather than the positive real axis. The most important case is F' = F1 (N). For now we shall also assume that f(co) z.-- O. Thus, let f(z) = E,7=1 an qn e Mk (Fi (N)). (Recall that since Te F1 (N), we have an expansion in powers of q = e27riz rather than qN .) We set ioo f(z)zs' dz. (3.31) def —0

We now show that if f(z) = E,T, i anq" with lan i = 0(d), then the integral g(s) defined in (3.31) converges for Re s> c 4- 1: \ ico

.

JO

oo

it°

f(z)zs -i dz = E an n =1

J0

2 dz zse ninz ______ z

— E0° an (— 27r1 1n y) n=1

foe

dt tse-r 7

(where t = —27rinz)

0

00

(-270 -sr(s) E an n -s

(see (4.6) of Ch. II)

n=1

(3.32)

140

111. Modular Forms

(where the use of I" in the gamma-function F(s) has no relation to its use in the notation for congruence subgroups; but in practice the use of the same letter F should not cause any confusion). Since iann-si = o(nc-Res\), this last sum is absolutely convergent (and the interchanging of the order of integration and summation was justified). - Inœ=0 42Ane mk (r,(N)) has la c) 0 0, we replace f(z) by f(z) — ac, If f(z) =----in (3.31). In either case, we then obtain g(s) = (-2ni)_sr(s)LAs), where co co ;le Lf (s) d=ef and for Re s > c + 1, if f(z) =

E

E

n=1

n=0

(3.33)

with Ian ' = 0(d).

In addition to their invariance under [y]k for y el",(N), many modular forms also transform nicely under [acia, where ŒN = ( ) as in (3.30). It will be shown in the exercises that, for example, any function in Mk(N, x) for z a real character (i.e., its values are +1) can be written as a sum of two functions satisfying f1[aN]k= Ci-kf

C 7--- 1 or —1,

(3.34)

where one of the functions satisfies (3.34) with C = 1 and the other with C = —1. An example we already know of a function satisfying (3.34) is 02 : the relation (3.5) is a special case of (3.34) with k = 1, C = 1, N = 4. We now show that if (3.34) holds, then we have a functional equation for the corresponding Mellin transform g (s) which relates g (s) to g (lc — s). For simplicity, we shall again suppose that f(z)= I an sq" with ao = 0. We can write (3.34) explicitly as follows, by the definition of [aid', : f(-1INz) = CN -42 (—iNz) kf(z).

(3.35)

In (3.31), we break up the integral into the part from 0 to il\FN and the part from i//i to ix. We choose ifIKT because it is the fixed point in H of ŒN : zF.--+ —1INz. We have

ico g(s) =

fib.rsT

= •••■•••■■•

••••••■••

dz

.fiz)z s--z-

—f

ioo

111Vz)

f(— 1 INz)(— 11Nz) 3d( — 1INz iw-st

ioo (f(z)zs ± f(-1INz)(-11Nz)s)—dz z fiwN ico ) —dz (f(z)zs ± i k CN - wf(z) ( — 1 /Nz)s—k

. f ibliq

Z

,

because of (3.35). In the first place, this integral converges to an entire function of s, because f(z) decreases exponentially as z -4 ioo. That is, because the lower limit of integration has been moved away from zero, we no longer have to worry about the behavior of the integrand near O. (Compare with the proof of Proposition 13 in Chapter II, where we used a similar technique to find a

141


rapidly convergent series for the critical value of the Hasse-Weil L-function ; see-the remark following equation (6.7) in §II.6.) Moreover, if we replace s by k s in the last integral and factor out ik CN -42 (- NY , we obtain:

(i'C'N k/2 (- N)sf(z)z's

g(k s) = i k CN -k2 (-

= eCN -k12 (-

f(z)zs)

dz

ioo I (Pc CAT -kl2f(z)(- 11Nz)s -k +f(z )e)

= i k CA T -142 (- N)sg(s), because the last integral is the same as our earlier integral for g(s). This equality can be written in the form

(- i\ N) 5g(s) = C(- iffV)- k- sg(k s). Thus, by (3.32)-(3.33), if we define A(s) for Re s> c + 1 by

(3.36)

A(s) = (- i Kr)g(s) = ( N 12n)sr(s)L f (s),

we have shown that A(s) extends to an entire function of s, and satisfies the functional equation

(3.37)

A(s) = CA(k s).

An,

As an example of this result, we can take f(z) = A(z)e S 1 which satisfies (3.35) with N = 1, k = 12, C = 1. Then A(z) = Z t(n)q", 6.(s) = z,f_ -c(n)n -- , and A(s) = (2n) -sr(s)LA(s) satisfies the relation : A(s) =

A(12 The derivation of (3.37) from (3.34) indicates a close connection between Dirichlet series with a functional equation and modular forms. We came across Dirichlet series with a functional equation in a very different context in Chapter II. Namely, the Hasse-Weil L-function of the elliptic curve : y2 = .x3 - n 2 x satisfies (3.36)-(3.37) with k = 2, N = 32n' for n odd and 16n2 for 'n even, C = for n odd and ( ) for n even (see (5.10)(5.12) in Ch. II). We also saw that the Hasse-Weil L-function of the elliptic curve y2 = x3 + 16 satisfies (3.36)-(3.37) with k = 2, N = 27, C = I (see Problem 8(d) of 11.5). So the question naturally arises: Can one go the other way? Does every Dirichlet series with the right type of functional equation come from some modular form, i.e., is it of the form Lf(s) for some modular form f? In particular, can the Hasse-Weil L-functions we studied in Chapter II be ,obtained by taking the Mellin transform of a suitable modular form of weight 2? That is, if we write L(E„, s) in the form E:=1 bm ars (see (5.3) in Ch. II), is E b niqm the q-expansion of a weight two modular form? Hecke [1936] and Weil [1967] showed that the answer to these questions is basically yes, but with some qualifications. We shall not give the details,

142

W. Modular Forms ,

which are available in [Ogg 1969 ], but shall only outline the situation and state Weil's fundamental theorem on the subject. Suppose that L(s)= I anti' satisfies (3.36)7(3.37) (and a suitable hypothesis about convergence). Using the "inverse Mellin transform", one can reverse the steps that led to (3.36)-(3.37), and find that f(z)= E anqn satisfies (3.34). For now, let us suppose that N = A2 is a perfect square, and that C = ik (k even). Then, iff(z) satisfies (3.35), it follows that ft (z) cre--f f(z I A) = 1 anqr,1 satisfies: f1 (— 11 z) = z kfi (z). Thus, fi is invariant under [S] k and [T] k , and hence is invariant under the group generated by S and V. Hecke denoted that group 0(2). We have encountered the group 0(2) before. In this way one can show, for example, that L(Eino , s) corresponds to a modular form (actually, a cusp form) of weight 2 for 0(8n 0). Unfortunately, however, Hecke's groups 0(A) turn out not to be large enough to work with satisfactorily. In general, they are not congruence r(2) is an exception.) subgroups. (6(2) But one can do much better. Weil showed, roughly speaking, that if one has functional equations analogous to (3.36)-(3.37) for enough "twists" E x(n)ann -s of the Dirichlet series Z anti', then the corresponding qexpansion is in mk(ro(N)). we now give a more precise statement of Weil's theorem. Let xo be-a fixed Dirichlet character modulo N (x o is allowed to be the trivial character). Let x be a variable Dirichlet character of conductor m, where m is either an odd prime not dividing N, or else 4 (we allow m = 4 only if N is odd). By a "large" set of values of m we shall mean that the set ' _iv} jez , contains at least one m in any given arithmetic progression {u+ where u and y are relatively prime. According to Dirichlet's theorem, any such arithmetic progression contains a prime; thus, a "large" set of primes is one which satisfies (this weak form of) Dirichlet's theorem. By a "large" set of characters x we shall mean the set of all nontrivial x modulo m for a "large" set of m. Let C = ±1, and for any x of conductor m set

C, --z--• CX0(m)( — N)g(X)igW,

(3.38)

where g(x) = E.7,_, x(j)e 27'iiiN is the Gauss sum. Given a q-expansion f(z)= Ef=o anqn, q = e'iz, for which lan i = 0(d), we define Lf (s) by (3.33) and A(s) by (3.36), and we further define CO

Lf (x, s) =

E x(n)anirs ;

A(y, s) = (m \ I-A 1 72n)r (s)Lf (x, s).

(3.39)

n=1

Weil's Theorem. Suppose that f(z)=E„œ_ o an qn, q = e2', has the property that Ian ' =.-- 0(nc), ce R. Suppose that for C =1 or —1 the function A(s) defined by (3.36) has the property that A(s)-1- a o (lls --I 1 (k — 4) extends

143

§3 Modular forms for congruence subgroups

to an entire function which is bounded in any vertical strip of the complex plane, and satisfies the functional equation A(s) CA(k s). Further suppose that for a "large" set of characters x of conductor m (in the sense explained above), the function A(z, s) defined by (3.39) extends to an entire function which is bounded in any vertical strip, and satisfies the functional equation A(z, s) = Cx A(, k s), with C, defined in (3.38). Then feM k (N, x 0), and f satisfies (3.34). lf, in addition, L1 (s) converges absolutely for Re s> k s for some e> 0, then fis a cusp form. -

One can show that the Hasse—Weil L-functions of the elliptic curves in Chapter II satisfy the hypotheses of Weil's theorem (with zo = 1). The same techniques as in the proof OC the theorem in §I1.5 can be used io show this. However, one must consider the Hecke L-series obtained in (5.6) of Ch. II by replacing 2,7(I) by the character jc",(.0z(Ni1) with z any Dirichlet character modulo ni as in Weil's theKm. For example, if we do this for L(E, , s), where E1 is the elliptic curve, 1=- x 3 x, we can conclude by Weirs theorem that

fE i (z) = q

2q5 3 (i) 6q' 3 + 2q' 7

E

bni e

(3.40)

rn>25

(see (5.4) of Cu. 10 is a cusp form of . weight two for 1'0 (32). If we form the q-expansion corresponding to the L-series of E : y 2 = n 2x, namely, fEn (z) = z n(m)k nqin, it turns out that fEn e M2 (F0(32n 2)) for n odd and fEn eM2 (F0 (16n 2)) for n even. Note that when n 1 mod 4, so that zn is a character of conductor n, this is an immediate consequence of the fact that fE , M2 (f0 (32)), by Proposition 17(b). More generally, it can be shown that the Hasse—Weil L- function for any elliptic curve with complex rrfultiplication satisfies the hypotheses of Weirs theorem with k = 2, and so corresponds to a weight two modular form (actually, a cusp form) tor FAN). (N is the so-called "conductor" of the elliptic curve.) -Many elliptic curves without complex multiplication are also known to have this property. In fact, it idtS conjectured (by Taniyama and Weil) that every elliptic curve defined over Vivo rational numbers has L-function which satisfies Weil's theorem for some N. Geometrically, the cusp forms of weight two can be regarded as holornorphic differential forms on the Riemann surface F0(N)V-1 (i.e., the fundamental domain with FAN} equivalent boundary sides identified aï: the cusps included). The TaniyamaWeil conjecture then can be shown to take the form: every elliptic curve over G can be obtained as a quotient of the Jacobian of some such Riemann surface. For more information about the correspoK dence between modular forms and Dirichlet series, see [Hecke 1981] ,teil 1967], [Ogg 1969 ], and

[Shimura 197f].

144

III. Modular Forms

PROBLEMS

L Let a e GL1. (0), and let g(z) ---= f(az). Let y = (cz bd )e F. Notice ihat f(az)l[y] h was defined to be (caz + d) kf(yaz), which is not the same as g(z)l[y]k = (cz ± d)' f(ayz). Show that if a = (g ?), i.e., if az =-- nz, then g(z)1[T]k -----.ficaaccrx - lk = f(nz)IEGin "Mk. be a congruence subgroup of r of level N, and denote r; = {y E FITS = . 5 } for sEQ v {co}. Let s= Œ 1 CC, 0: E F. (a) Prove that ars'OE -1 z-_- (Œria -1 )co . (b) Show that there exists a unique positive integer h (called the "ramification index" of r at s) such that

2. Let

r'

(1) in the case —.ter

• r; = + aoe t

trinEz a;

(ii) in the case —1.0r either r; = 0,-1 {T hrilf ne za; Or

r; =

nela•

Show that h is a divisor of N. (c) Show that the integer h and the type (I, Ha, or III)) of s does not depend on the choice of a E r with s = a' co ; and they only depend on the F'-equivalence class of s. (d) Show that if a -1 oc is of type I or Ha andfe /cr.), thenfl[a -l ] k has a Fourier expansion in powers of qh . A cusp of r is called "regular" if it is of type I or Ha; it is called "irregular" if it is of type lib. (e) Show that if a-1 co is an irregular cusp, and femk (r), then fl[a- i]k has a Fourier expansion in powers of q2,, in which only odd powers appear if k is odd and only even powers appear if k is even. If k is odd, note that this means that to show that fE Mk (r) is a cusp form one need only check the q-expansions at . the regular cusps.

3. Let h be any positive integer, and suppose 2hIN, N 4. Let F' be the following level N congruence subgroup: F' = f(a, ,1!1 ) --a-. (1 -4)i mod N for some j}. Show that co is a cusp of type Jib. 4. (a) Show that F1 (N) has the same cusps as ro(N) for N = 3, 4. (b) Note that —.10 ri (N) for N> 2. Which of the cusps of r'1 (3) and 1T1 (4), if any, are irregular?

5. Find the ramification indices of F' at all of its cusps when: (a) F' = Fo(p) (p a prime); (b) F' = Fo(p2); (c) F' = F(2). 6. Prove that if r a F is a normal subgroup, then all cusps have the same ramification index, namely

[Toe : ±r].

7. (a) Show that any weight zero modular function for F' c r satisfies a polynomial of degree [F: r ] over the field C( j) of weight zero modular functions for T.

145


(1)) Show that, if r is a normal subgroup, and if f(z) is F'-invariant, then so is - f(Œz) for any °ter. Then show that the field of weight zero modular functions for r' is a galois field extension of C( j) whose galois group is a quotient of In practice (e.g., if r is a congruence subgroup), it can be shown that the galois group is equal to f/F'.

8. Prove the following identities, which will be useful in the problems that follow and in the next section, by manipulation of power series and products in q = e2 1ni :

(a) 0(z) + e(z + = 20(4z); E2 (z) = 48Z odd n> 0 61 (n)q n ; (b) E2 (z + 2k p4 n crk-i(n)q n (c) Ek(Z) — ( 1 + Pk-1 )Ek(PZ) ± Pk-1 Ek(P 2 Z) = Bk (k 2, p prime); (d) E2 (z) 3E2(2z) + 2E2 (4z) =-1-(E2 (z) E2(z +1)); (e) q(z + -21) = e21ti/4 8 7/ 3(2z)/ri (z)ri(4z). 9. Prove that if k is even and f(z) has period one and satisfies f(— 114z) = (— 4z 2)42 then fl[y]k = f for all ye To (4). f(z),

10. (a) Prove that 18 (4z)/I 4 (2z)e M2 (FM)), and find its value at each cusp. 67 + 1). (13) For a E Z prove that E2(ST'Sz) = (az + 1)2 E2 (z) — 4(az -214-(E2 (z) — 3E2 (2z) + 2E2(4z)) = Zoddn> 0 ai (n)qn by Problem (c) Let F(z) 8(c). Prove that F(z)E m2(1T0(4)), and find its value at each cusp. (d) Prove that F(z) = ri8 (4z)M 4(2z). Then derive the identify q

(1 n;=.- 1

q4n)4 (

1+

q 2 r74.

E odd

0.

1

(n) q .

n> 0

(e) Give a different proof that —24F(z) = RE2 (z) — E2(z + 1/2)) is in M2 (1-0(4)) by proving that, more generally, E2 (z) — 141,7E7:01 E2 (z + j/N) is in M2 (r0 (N 2)).

11. (a) Prove that 0(z)4 em2 (r0 (4)), and find its value at each cusp. (13) Show that 0(z)4 and F(z) (see preceding problem) are linearly independent. (c) Prove that te 0 (2z)/q 8 (z)r/8 (4z)em 2 (r0 on, and find its value at each cusp. (d) Prove that 0(z) = (e) Prove that 40(z) = 2n1124172 (z +. D/r/(2z). 12. Let N = 7 or 23, and let k = 24/(N + 1). Let x be the Legendre symbol x(n) Prove that any nonzero element of Sk(N, x) must be a constant multiple of (ri(z)ri(Nz)) k 13. Using Propositions 25-27, prove that (r/(z)r/(3z)) 6 eS6 (fo (3)) and (r/(z)n(7z)) 3 E S3 (7, x) where x(n) = 14. Let 4 (z) = En , z e izn2 = 0(z/2). Let x be the unique nontrivial character of 0(2)/ F(2) (which has 2 elements). Show that 04 e M2 (6(2), x). 15. Let fe Mk (N, x), and set ŒN =( 1). (a) Prove that f i[ŒN]k E Mk(N1 3(), and that the mapfl-4.f I kNI is an isomorphism of vector spaces from Mk (N, x) to Aik (N, 7). Prove that the square of this map (i.e., where one uses it to go from M/Or, x) to Mk(N, je) and then again to go , from ./4k (N, k) back to Mk(N7 X)) is the map (-1)k on Mk(Ar7 x)(b) 1f x = X, i.e., if x takes only the values +1, then prove that Mk(N, = Mk+ (N, X) 0 MC (N, X), where Mk ± (N, X) crf ffe,Mk(N, X)Ifi[ŒN]k = In other words, any modular form in Mk(Ne, x) can be written as a sum of one

146

-

Ill. Modular

FOrms

which is fixed under ik [aN]k and one which is taken to its negative by i rL.aN_,k• (c) Let N = 4. In Problem 17(d) below, we shall see that 04 and Fspan M2 (r0 (4)) = M2(4, 1) (where F = En odd ci(n)q n as in Problem 10, and 1 denotes the trivial character in M2 (4, 1)). Assuming this, find the matrix of [a4 ] 2 in the basis 04, F; show that M2+ (4, 1) and M2- (4, 1) are each one-dimensional; and find a basis for M2 (r0 (4)) consisting of eigenforms for r04.1 2 If you normalize these eigenvectors by requiring the coefficient of q in their q-expansions to be 1, then they are uniquely determined. 8 2 : + Enœ, i o-k . . i (n)qn. Express L1(s) (see = — 16. (a) For k 2 even, let f(z) the definition in (3.33)) in terms of the Riemann zeta-function. (1)) Write an Euler product for Lf(s). (c) Let fx(z)= E;f)= 1 (4_ 1 (n)x(n)q" for a Dirichlet character x. Write an Euler product for Li(s).

17. Let F(2) be the fundamental domain for F(2) constructed in §1 (see Fig. 111.3). Then

F' = ŒF(2), where a = (2) 1), is a fundamental domain for F0 (4) = al- (2)a-1 (see Problem 10 in §III.1). The boundary of F' consists of: two vertical lines extending from ( 3 + i0)/4 and from (1 + LA/4 to infinity ; two arcs of circles of radius 1, one centered at and one centered at 0; the arc of the circle of radius -4- and center -4- which extends from 0 to (9 + i.A/28; and the arc of the circle of radius and center A which extends from (9 + LA/28 to 1. Consider F0 (4)-equivalent points on the boundary of F' to be identified. (a) Find all elliptic points in F' (i.e., points which are F-equivalent to i or co = (-1 + 0)14 Which are on the boundary and which are in the interior of F' ? (b) Let f(z) be a nonzero modular function of weight k (keZ even) for 1-0 (4). Let v(f) denote the order of zero or pole of f(z) at the point P. At a cusp P = Œ 1 co, we define v(f) to be the first power of qh with nonzero coefficient in the Fourier expansion of fl[a - lk (where h is the ramification index; see' , Problem 2 above). Prove that: Ep€F, v(f) = k/2, where the summation is over all points in the fundamental domain F', including the three cusps, but taking only one point in a set of F0 (4)-equivalent boundary points (e.g., { + iy, 14- iy} or { + + + (c) Describe the zeros of 0(z)4 and F(z) (see Problems 10-11 above). (d) Prove that 04 and F span Af2 (r0 (4)). (e) Prove that Mk(F0 (4)) = 0 if k < 0, and it contains only the constants if k = 0. (f) Prove that for k = 2,k o a nonnegative even integer, any fEMk (F0 (4)) can be written as a homogeneous polynomial of degree k0 in F and O. (g) Prove that S6 (F0 (4)) is one-dimensional and is spanned by 0 8 F 1604F 2 . (h) Prove that ,j 12(2z) S6 (F0(2)), but that 2 (2z) E S6 (F0(4)). Then conclude that q 12 (22.) = 08 F— 1604F 2 . (i) Prove that for k = 2k0 6, any fe Sk (F0(4)) can be written as a homogeneous polynomial of degree k 0 in F and 04 that is divisible by 04F(04 — 16F). 18. (a) IffE Mk , (N, x i ) and g E Mk2 (N, X2), show that fg e Mk, ÷k2 (N, Xi X2)• (b) Let x be the unique nontrivial character of (/4)*. Show that any element of M1 (4, x) is a constant multiple of 0 2 . (c) With x as in part (b), find a formula for dim Sk (FI (4)) = dim Sk(4, (d) Let f(z) = (ii(z)q(2z)) 8 , and let g(z) = f(2z). Show that S8 (1"1 (4)) is spanned by f and g.

§4. Transformation formula for the theta-function

,

j 47

19. Let I' c r be a congruence subgroup, with r _-= Uai r, so that F' = Ucc.T' F is a fundamental domain for rt. Let j; ---- f I [ocr ' ],, forfe Mk (r). Suppose thatfe Sk (r), the ramification index of r' at the cusp so that fi (z) = Z__, anA, where h si = aj i oo. In particular,f(z) = I nc°,_ i anqii: at the cusp co. (a) Show that there exists a constant C independent off and x such that Ifi(x +

(b) (c) (d) (e) (f)

i y)I Lç CC-2 "mi for y> e.

Let g(z) = (Im z) k12 1f(z)f. Show that g(yz) = Om z)"12 1f(z)l[y] k i for y E F. Show that g(z) ref Om zr 14(2)1 is bounded on F. Show that g(z) is bounded on F'. Show that g(z) is bounded on H. Show that for any fixed y: 1fh im-27rirox+iyvh dx. an = - fi x ±

h o

(g) Show that there exists a constant C1 such that for all y:

Ian ! < ciy-ki2 e21oty/h . (h) Choosing y = 1/n in part (g), show that for C2 = C1 e2irm : lan i < C2 n k12 . (i) Show that la1n -42 is similarly bounded for each j.

§4. Transformation formula for the theta-function We first define some notation. Let d be an odd integer, and let c be any integer. The quadratic residue symbol () is defined in the usual way when d is a (positive) prime number, i.e., it equals 0 if dic, 1 if c is a nonzero quadratic residue modulo d, and — 1 otherwise. We extend this definition to arbitrary odd d as follows. First, if g.c.d.(c, d) > 1, then always (i) = O. Next, if d is positive, we write d as a product of primes d=rlipi (not necessarily distinct), and define (-) = II i (j-i). If d = 4 1 and c = 0, we adopt the i convention that (K) = 1. Finally, if d is negative, then we define (f-i ) = ( t) if c > 0 and () = —(1`6) if c < O. It is easy to check that this quadratic residue symbol is bimultiplicative, i.e., it is multiplicative in c if d is held fixed and multiplicative in d if c is held fixed. It is also periodic with period d.when d is positive : (cP) = (-) if d> O. However, one\ must be careful, because periodicity fails when d is negative and c + d and c have different signs: (q4-) = —() if e> O> c + d. This is because of our convention that ( -s) = — (t'il) when both c and d are negative. On the other hand, this convention ensures that the usual formula (2-) = (— 1) (" )/2 holds whether d is positive or negative. Next, we adopt the convention that „ii: for zE C always denotes the branch whose argument is in the interval ( — n/2, n/2]. We next define Ed for d odd by: -

ser = -Ni(l), i.e.,

148

HI. Modular Forms

11

if d d

1 mod 4; 3 mod 4. •

(4.1)

With these definitions we finally define the "automorphy factor" j(y, z), which depends on y = db) e F0 (4) and z e H:

AY,

crlf

c; 1 NiCZ + d

for

y

(

a b c d)er°(4),

z e H.

(4.2)

Inez q n2 = Inez e Irtizn 2 . Recall our definition of the theta-function 0(z) = The purpose of this section is to prove the following theorem, following Hecke [1944 ].

Theorem. For y eF0(4) and z e H 0 (yz) =j(y, z)0 (z),

(4.3)

where Ay, z) is defined by (4.2). Notice that the square of Ay, z) is ( 1-)(cz d), and so the square of the equality is precisely what we proved in Proposition 30. Thus, for fixed y e F0 (4) the ratio of the two sides of (4.3) is a holomorphic function of z e H whose square is identically 1. Thus, the ratio itself is + 1. The content 1, i.e., that j(y, z) has the right sign. of the theorem is that this ratio is Simple as that sounds, the theorem is by no means trivial to prove. At first, it might seem sensible to proceed as in the proof of Proposition 30, proving that (4.3) holds for generators of F0 (4). However, then we would have to show that the expression j(y, z) in (4.2) has a certain multiplicative property which ensures that, if (4.3) holds for y1 and y2 , then it must hold for Vi y2 . But that is a mess to try to show directly. We shall, in fact, conclude such a property for j(y, z) as a consequence of the theorem (see Problem 3 at the end of this section). In proving the theorem, it turns out to be easier to work with the function 4 (z) def = 0(42) = Ine z ein2z , which we encountered in Problem 14 of §111.3. This function satisfies : 4(T 2z) = 4)(z) (obvious from the definition) and izO(z) (immediate from (3.4)). Hence, 4)(yz) has a transforma0(Sz) = tion rule for any y in the group 0(2) generated by + T 2 , S. It is because 0(2) is such a large group—having only index 3 in F—that it is sometimes easier to work with 4). The corresponding group under which 0(z) = a= ?), has a transformation rule is a- '0(2)a (see Problem 1 of §III.3). But cc -1 4:5(2)crt is not contained in I" = SL2 (7L); its intersection with f is the subgroup F0 (4) of index six in F. Thus, we can work with a "larger" subgroup of F (i.e., its index is smaller) if we work with 4)(z) rather than 0(z). The next lemma gives an equivalent form of the theorem in terms of 4)(z).

Lemma 1. The theorem follows if we prove the following transformation formula for 4)(z):

149

§4. Transformation formula for the theta-function

ck(yz) = i" -c)12- 6 \I — i(cz for y r such that y =(

)

+ d)(1)(z)

(4.4)

(mod 2) with d 0 O.

Suppose that we havé the transformation rule for O(z). We must show that 0 satisfies (4.3) for all y = (ac ' ) e F0 (4). We first note that if c = 0, then (4.3) holds trivially, since in that case 0(yz) = 0(z), while = 1 (since d = +1). So in what follows we suppose that j(y, z) = c O. )E F0 (4) with c Owe write For any y = PROOF.

,

0 (7z) =

n az

a) = (y`( — 1 /2z)), 1/2z) — c/2

+.1 _ /4, (2b(— 112z) —

where y' = (2,7 .42) and y' m- (_? (13) (mod 2), because 41c. We apply (4.4) = (1)(), with y' in place of y and — 1/2z in place of z. Using the fact that we obtain

en

0

(yz) = i( -d)/2 d2) (

— i(d( —

112z)

c12)4)( - 112z).

2i0(z) by (3.4). The product Next, we have 4)( — 1/2z) = 0( — 1/4z) = of the two square root terms is + Vcz + d (note that, because of our convention on the branch of the square root, we have .fx,./j; =± xy; for example, — 1 = — Nib. But since the three functions — i(d( — 112z) — c12), — 2iz, and Vcz + d are all holomorphic on H, the + must be the same for all z; so it suffices to check for any one value of z, say z = i. But in that case „I— 2iz = „/2, and .\/.7-c.f; = +\41)cy always holds when y is positive real. Thus, the product of the two square root terms is Vcz + d, and we have C) (yz) = i( 1 -d)/2

2

_c d d

/c.z +

N

d0(z).

1 --d)/2(-- 2) -= To complete the proof of Lemma 1, it remains to check that i( k d which we easily do by considering the cases d 1, 3, 5, 7 (mod 8). —

t'd

The remainder of this section is devoted to proving (4.4). For a fixed odd prime p, let us denote

= n '(z)/(pz),

(4.5)

where n (z) is the Dedekind eta-function, as in §2 and §3. According to i.e., tfr3 is a modular Propositions 26-27, we have 1/.3 Em3,p_ iofp, form for ro (p) with character x(d) = (9,-). This is the basic tool which will be used to prove (4.4). But it will take several lemmas to relate tir 3 and 0.

Lemma 2. (P(pz)/(z) = 131/(z)/0 2 (z + 2 'I) •

(4.6)

150

III. Modular Forms

By Problem 11(e) in the preceding section, with z replaced by we find that the left side of (4.6) is equal to

PROOF.

by

e -2/ri/24 1,1 2 (Pz2+1 )/n ,

nz

,hfz)

,

V)IrKz)y = e(CP -

(e - 27ri/24 1,1 2(z -

11/24)2irt

-42

and

(p.z2f-1) .:_

02(z12-1) n 2 09 .12-1 ) -

But since q(z + 1) = e224q(z), and p((z + 1)/2) = (pz + 1)/2 + 91, it follows that the last term on the right is e-( P -1)2ni124. This proves (4.6).

Lama 3. Le t - p be an odd prirne, let y = 0 3(2 1 1). Then fl[y] 3(p - 1)12 =

cl D E 0(2) n fo(p), and let f(z)=

().!

Let a =( ), so that Iii((z -I- 1)/2) = tfr(az). Then 0((yz + 1)/2) = kayz) = ig(aya-i )ocz). Now for y as in the lemma, we have PROOF.

ŒyŒ 1 =

+ c + d — a — 012) 2c d c

(Note that b+d a c is divisible by 2 because y e 6(2)) Hence, by Propositions 26-27, we have —

—

yz + 1) = (d_ c)

3 (

2

(2caz + d

—

c) 3( P-1)J2tp 3(otz) z+1 2 )'

.( 4)(cz + d)3(P-1)1203(

because d

—

c

d (mod p ) . This is the relation asserted in the lemma.

Lemma 4. Let p be an odd prime, let y = 4(pz)I0P(z). Then gl[y](l _p)/2

bd)

e 0(2) nTo(P), and let g(z)=

=

We first claim that g8 transforms trivially under y. Let a = (g ?), and let y' = yo -1 . Then both y and y' are in 6(2), and we can use Problem 14 of the preceding section to compute PROOF.

g8(z)i [y1441 p) =

(CZ

± d)4 P-1 )0 8(y'ocz)14) 8P(yz)

_ (Œz ± d)-40 8('Y'az) (cz + d)-41'08 P(yz) _ 0 8(Œz) —

8

8P (Z) = g (z),

as claimed. Meanwhile, the ninth power of g transforms under y by (11,), as we see by raising both sides of (4.6) to the 9th power and using Propositions 26-27 and Lemma 3 (here f(z) is as in Lemma 3):

99 1 [7] 9(1 -p)/2 = (0 91f6 )1[7]9(1-p)/2 = (IA 31 [11(13-1)/2)3 — 4-415)1fr 3)3 — (d)a9 • i) ( V) 6 ( J. I [7] 3( p-1)/2) 6

151

§4 . \Transformation formula for the theta-function

faking the quotient of these two relationships gives

gibli-p)/2- := g9 ibil 9( 1 - p)/2/g 8 I [T] 4( 1 -p)

=

0

p)g .

(

r

The next lemma generalizes Lemma 4 by replacing the prime p by an arbitrary positive odd number n. Lemma 5. Let n be

a positive odd integer, let y ,:--- (a, bd)e 0(2) n ro (n), and let

g(z) = 4)(nz)10n(z). Then gibli-no= ()g PROOF. We write n =pi - • . p, as a product of primes (not necessarily distinct), and we use induction on the number r of prime factors. Lemma 4 is the case r = 1. Now suppose we know Lemma 5 for n; we shall prove the corresponding equality for a product n'rz---- np of r + I primes. We write

0(n'z) _ cknaz) N(pz)Y çbn'(z) — O n(OLZ)0 P(Z) ) 1

(4.7)

where a = (6 ?). For y = (a, be') e Fo (n) n 0(2) we have y' = aya-1 = fan) n 0(2), and so, by the induction assumption,

4)(nocyz)10"(Œyz) = 4)(ny'az)10"(y'az) = (1:7-1 ) (az + d)

(

c 7p bdi E

0(nocz)b4?(ocz)

= (-nd)(cz + d)" -n)I2 0 onezve(ŒZ). In addition, by Lemma 4, we have

d 4)(pyz)10P(yz) =(- )(cz -I- d)(1-P)12 4)(pz)10P(z). P Combining these two relations, we see that replacing z by yz in (4.7) has the effect of multiplying by

. ,Gd)

(CZ

+ d)" -n)12 (()(CZ + d) (1-P)12) n = (1)( 4)(CZ ± n p P

d , (n ')

d) (1- np)I2

(cz + d)"-n1)12.

This completes the induction step, and the proof of the lemma.

0

We are now ready to prove (4.4). We first note that.both sides of (4.4) remain unchanged if y is replaced by —y (see Problem 2 below). Hence, without loss of generality we may suppose that y = (:. bd ) F... (_? D mod 2 with c > O. We now apply Lemma 5 with n replaced by the positive odd integer c,

obtainir

152

III. Modular Forms

(1)0(cC(7YzZ) =

()(CZ

± d ) 1 -c)/2 O(CZ) CV(Z)

(4.8) •

The ratio that ultimately interests us is 4)(yz)10(z). Solving for this in (4.8) gives

(0

(cczz)) . Yzz) )c = ( d c ) ( c z 4_ d) c- 1 0 04)(Y

: (

(4.9)

)

On the other hand, we have (using: ad

1 = bc)

—

cyz . c az + b ,... a cz+d

I cz±d i

Since a is even and 4) has period 2, this means that

Ccyz) = 0 ( cz + 1 d) = V —i(cz + d)(k(cz + d) (4.10) = \I —i(cz + d)0(cz). Combining (4.9) and (4.10) gives 4)(Yz)

0(z)

)c = (--1)(cz + d) — 1 ) /2.\/ — i(cz + d). c

(4.11)

(

Meanwhile, we saw in Problem 14 of the preceding section that 0 8 is invariant under Ed4 for y E (5(2), i.e.,

8k = 0)(v/

(CZ

+

VI'

for any k e Z.

, (4.12)

We now raise both sides of (4.11) to the c-th power and divide by (4.12), where k is chosen so that c2 = 8k + 1. (Since c is odd, of course c2 :1-=._ 1 mod 8.) The result is: (KYz) = (-1-)(cz + d)cc -112-41c(— i(cz + d))(` -l v2V — i(cz + d).

0(z)

c

But c(c — 1)/2 — 4k + (c

—

1)/2 = (c2

—

1)/2 — 4k = O. Hence,

0(yz) = (y_),_ i.,,c_ i ,i2 , . y It v — i(cz c‘ 0(z)

+ d),

which is the transformation formula (4.4) that we wanted to prove. This concludes the proof of the main theorem as well. 0

The transformation formula for the theta-function is similar to the transformation formula for a modular form of weight k if we take k =1, i.e., except for a power of i the "automorphy factor" is (cz + d) 1/2 . In the next chapter we shall see that there is a general theory of modular forms whose weight is a half-integer, and the transformation formula for 0(z) plays a fundamental role in describing such functions.

153 -

§5., The modular interpretation, and Hecke operators

PROBLEMS

,

1. Prove that the generalized quadratic residue symbol (i) as defined in this section satisfies the following form of quadratic reciprocity: if c and d are both odd integers, then

( _ occ-tud-io()

if c or d is positive; (f) = _(...... o(c-i)(d- 1W4() if c and d are negative.

2. Show directly that both sides of (4.4) remain unchanged if y is replaced by — y. 3. (a) Show directly (using (3.4)) that the theorem (4.3) holds for the generators T, and ST'S of 1-0 (4). (b) Show that the theorem would follow from part (a) if one could show that

Ace13, z) = Act, &M g, z) for all a, /3e r0(4).

—

I,

(4.13)

(c) Conversely, show that the theorem proved in this section implies the relation

(4.13).

§5. The modular interpretation, and Hecke operators A basic feature of modular forms is their interpretation as functions on lattices. More precisely, we consider the most important cases of a congruence subgroup F': F' = F, F(N), F 0 (N) or F(N). (Of course, F = fl (1) = F0 (1) = F(1), so everything we say about the cases F 1 (N), F0(N) or F(N) will apply to F if we set N = 1.) By a "modular point" for F' we mean : (j ) for F' = f: a lattice L c C; (ii) for F' = F1 (N): a pair (L, t), where L is a lattice in C, and t e CIL, is a point of exact order N; (iii) for F' = F0(N): a pair (L,- S), where L is a lattice in C, and S c C1L is a cyclic subgroup of order N, i.e., S = Zt for some point te CIL, of exact order N. _ (iv) for f' = F(N): a pair (L, { t 1 , t2 }), where 1 1 , 1 2 e CIL have the property that every tekLIL is of the form t = mi l ± nt 2 , i.e., t i , t 2 form a basis for the Points of order N (in particular, t i and t must each have exact, , order N). Given a lattice L, in general there will be several modular poir.. s of the form (L, t), (L, S), or (L, {l i , 1 2 )). However, when N = 1, there is only one modular point corresponding to each L, and we identify it with the modular point L for F. Let k e Z. In each case (i)-(iv), we consider complex-valued functions F on the set of modular points which are of "weight k" in the following sense. If we scale a modular point by a nonzero complex number A, then the value of F changes by a factor of A - k . That is, for 2 e C* we consider AL =

154

III.

P.111 e LI, /it e C , S = Pik ES} c weight k if for all E C *

Modular Forms

AL. Then F is defined to be of

case (i) F(2L) = A'F(L) for all modular points L; = ATY(L, t) for all modular points (L, 1); case (ii) case (hi) F(2L, /IS) = /1-1`F(L, S) for all modular points (L, S); ) t2 }) = .1-kF(L, t 1 , t2 }) for all modular points case (iv) F(AL, (L, { t 1 , t2}). An example of such a function of weight k is Gk (L) =

E

(k > 2 even).

(5.1)

0*IeL

Notice that any function F in case (0, such as Gk, automatically gives a function for the other groups; for example, by setting F(L, t) = F(L). Given a function F of weight k, we define two corresponding functions P and f as follows. fl(co) is a complex-valued function on column vectors = ( T) such that co 1 ko 2 E H; f(z) is a function on the upper, half-plane H. Let 2L. be the lattice spanned by co l and co2 , and let L, be the lattice spanned by -z and 1. Given F as above, we define case (i) P(c )) = F(L); case (ii) P(w) = F(L., (o 2IN); case (iii) P(co) = F(L., Zco 21 N); case (iv) P(co) = F(L., {w 1 /N, w 2/N}). In all cases we define f(z) = P(f). Thus, for example, the function f(z) that corresponds to Gk (L) (see (5.1)) is the Eisenstein series we denoted Gk (z) in §2 (see (2.5)). For y = (ca e F = SL 2 (Z), we define the action of y on functions of co Rya)), where yoi is the usual multiplication of a column by the rule 'AO vector by a matrix.

r, r,

(N), FAN) or f(N). The above Proposition 31. Let k E Z, and let r = association of F with P and f gives a one-to-one correspondence between the following sets of complex-valued functions: (1) F on modular points which

have weight k; (2) P on column vectors co which are invariant under y for y E r and satisfy 1-7(.1co)= .1.-k .P(w); (3) f on H which are invariant under [Y]k for y e

PROOF. We shall treat case (ii), and leave the other cases as exercises. Suppose y E 1"1 (N) and F is a weight k function of modular points (L, We first compute: P4- (yco) = F (1,

cco ± cico 2 -

because Law1+bc02.cw1+do)2 = L ) mod L (since y E F1 (N)). We also have

= F(L., w2/N) Aco), y e F) and (co), + dco 2)/.N

co2/N

§5._ The modular interpretation, and Hecke operators

155

-

-

\

P(co) =. F(LAw , Aco 2 IN) = /1 -4 F(L., co 2IN) = Next, we have ^

f(yz)= F (L (az+b)1(cz+d) , ,

I

Tv

)

= (ez ±

clY c F (Laz+b,cz+d,

ez + d N

)1

because F has weight k. But the lattice spanned by az + b and cz + d is Lz ; and (cz + d)I N.=..--- k mod Lz ; hence

f(yz) =--- (cz + d)k F (L z ,

NI) = (cz ± d)kf(z).

-

Thus, the P and feorresponding to F have the properties claimed. To show the correspondence in the other direction, given f(z) we define P(co) to be coi kfico //co 2); and given P we define F(L, t) to be Aco), where co is chosen to be any basis of L such that co2/N a t mod L. One must first check that the definition of F makes sense (i.e., that such a basis co exists), and that the definition of F is independent of the choice of such a basis co. The first point is routine, using the fact that t has exact order N in C/L, and the second point follows immediately because any other such basis must be of the form yco with y e F1 (N). It is also easy to check that, oncefis invariant under [y]k for y e F1 (N), it follows that F is invariant under y and has weight k; and that, ifFhas weight k, then so does the corresponding F. The construction going from F to P to f and the construction going from f to P to F are clearly inverse to one another. This concludes the proof. o We say that F is a modular function/ modular form/ cusp form if the corresponding f is a modular function/ modular form/ cusp form as defined in §3. We now discuss the Hecke operators acting on modular forms of weight k for r1 (N). We could define them directly on f(z) E Mk (1"1 (N)). However, the definition appears more natural when given in terms of the corresponding functions F on modular points. Let ..r denote the 0-vector space of formal finite linear combinations of modular points, i.e., 2' = OCIeL is the direct sum of infinitely many one-dimensional spaces, one for each pair (L, t), where L is any lattice in C and t E C1L is any point of exact order N. A linear map T: 2' --0 2' can be given by describing the image TeL,, = E. an ep.n of each basis element; here {P.} are a finite set of modular points. For each positive integer n we define a linear map Tn : 2) --* ..r by the following formula giving the image of the basis vector eL , t : 1 Tri (eL, r) = — 4'5' ev,t, n

(5.2)

where the summation is over all lattices L' containing L with index n such that (L', t) is a modular point. (Here for t E Ca, we still use the letter t to /

156

Ill. Modular Forms

denote the image of t modulo the larger lattice L'.) In other words, L' IL C CIL is a subgroup of order n, and t must have exact order N modulo the larger lattice L' as well as modulo L. The latter condition means that the only multiples Zt which are in L' IL are the multiples INt which are in L. = I", this condition disappears, and we sum In the case N = 1, i.e., = n. The condition on t is also empty if over all lattices L' with [L' : n and N have no common factor. To see this, suppose that g.c.d.(n, N)= 1, and suppose that N't c L'. Then the order of N't in L'IL divides N (because N'Nt E L) and divides n (because # L'IL=n), and so divides g.c.d.(N,n)=1. Thus N't E L. Notice that the sum in (5.2) is finite, since any lattice L' in the sum must because each element of L' IL has order be contained in L = dividing n = # L' IL. Thus, each L' in the sum corresponds to a subgroup of order n in iLIL ,-z-!,(11nZ) 2 . Note that T1 = 1 = the identity map. Next, for any positive integer n prime to N we define another linear map T7, „ :2—*Y by

1 Tn,n (eL,t ) = --ye( 1/n)L,t•

(5.3)

Note that t has exact order N modulo L because g.c.d.(N, n) = 1. Again we are using the same letter t to denote an element in C/L, and the corresponding element in CAL. It is easy to check the commutativity of the operators ,

Tni , ni Tn2,n2

Tnin2,nin2 = T,g2,„2 Tni ,ni

Tn,n Tm = Tni Tn, n .

(5.4)

It is also true, but not quite so trivial to prove, that the T.'s commute with one another for different m's. This will follow from the next proposition. Proposition 32. (a) If g.c.d.(m, n) = 1, then T„,„= T.T, 7 ; in particular, 2",,, and Ti, commute. (to) Ifp is a prime dividing N, then Tpt = 7. (c) If p is a prime not dividing N, then for 1> 2

Ti = Tpi--1Tp — pTp1- 2l

,

.

(5.5)

PROOF. (a) In the sum (5.2) for Tmn , the L' correspond to certain subgroups S' of order mn in namely, those which have trivial intersection . with the subgroup It CIL. Since g.c.d.(m, n)= 1, it follows that any such S' has a unique subgroup S" of order n; if L" L is the lattice corresponding to S", then S'/S" gives a subgroup of order m in ,-1,-,- L"/L". Both S" and S'/S" have nontrivial intersection with Zt. Conversely, given S"=L"ILc -1-iLIL of order n and a subgroup S' =L'IL",c kL"/L" of order m, where both subgroups have trivial intersection with It, we have a unique subgroup L'IL c ALIL of order mn with nontrivial intersection with Zt. This shows

157

§5. The modular interpretation, and Hecke operators

thatthe modular points that occur in Tm7,(eL ,,)= ï E eL ,,, and in Tm (Tn(e L. )) =-1-ITm (er ,,,) are the same. (b) By induction, it suffices to show that Tpi-i Tp = Tpt for I> 2. Let t' =It. Then Tp /(eL , t) = p -' lev ,„ where the summation is over all L' L LIL has order p i and does not contain t'. Notice that such that L' IL c L' IL must be cyclic, since otherwise it would contain a (p, p)-subgroup of pT iLIL. There is only one such (p, p)-subgroup, namely f,L1L, and t' E since pt' = Ni e L. Once we know that L' IL must be cyclic, we can use the same argument as in part (a). Namely, for each L' that occurs in the sum for Tp r(e L ,,) there is a unique cyclic subgroup of order pin L' IL; the corresponding lattice L" occurs in the sum for Tp (e"), and L' is one of the lattices that occur in 7;i -i(eL „,,). This shows the equality in part (b). (c) Since p,t N, the condition about the order of t in C/L' is always fulfilled. We have 1,i -iTp (eL,,)= p -1 IL ,, EL, e L,.„ where the first summation is over all lattices LI' such that S" = L" IL has order p, and the second summation is over all L' such that S' = L' IL" has orderp'. On the other hand, Tpt (eL. ,) = poe 1 EL, eL .,,, where the summation is over all L' such that L' IL has order IA Clearly, every L' in the inner sum for 1p r--1 Tp is an L' of the form in the sum for Tpi, and every L' in the latter sum is an L' of the form in the former sum. But we must count how many different pairs L", L' in the double sum lead to the same L'. First, if L' IL is cyclic, then there is only one possible L". But if L'1L is not cyclic, i.e., if L'IL D g1L, then L" can be an arbitrary lattice such that L" IL has order p. Since there are p 1 such lattices (for example, they are in one-to-one correspondence with the points on the projective line over the field of p elements), it follows that there are p extra times that eL ,,, occurs in the double sum for Tpt-iTp . Thus,

Tpl(eL , t) = 7p l-i 7p (e 1 , 1)

p

,t• L'(1/p)L

[L':( lip)Li=pI- 2

But

1 T 1-2T (e )= — 2 TP I-2e (1/p)L, t =p 14,t P, P

[L':( 1/P)Li

e

t

This concludes the proof of part (c). If n = prŒr is the prime factorization of the positive integer n, then Proposition 32(a) says that 71 = Tpf, Tpft . Then parts (b)-(c) show that each Tpcp is a polynomial in Tpi and Tpi , pi . It is easy to see from this and (5.4) that all of the Yin's commute with each other. Thus, the operators T,,, (n a positive integer prime to N) and Tm (m Any positive integer) generate a commutative algebra AP of linear maps from 2 to 2; actually, eit9 is generated by the Tp , p (p,I1 N a prime) and the Tp (p any prime). There is an elegant way to summarize the relations in Proposition 32 as formal power series identities, where the coefficients of the power series

III. Modular Forms

158 J

are elements in follows:

Ye.

First, for p

v.:el

IN,

Tp 1X 1 =--

we can restate Proposition 32(b) as

1 I — TX

PIN'

(5.6)

i.e., (I TpIX 1)(1 — Tp X) = 1 in le[[X]]. This follows from Proposition 32(b) because, equating coefficients, we see that the coefficient of X' is — Tri-iTp . Similarly, for p,I 'N, part (c) of Proposition 32 is equivalent to the identity 1 pN, (5.7) i TplX 1 = Tp X + pTp, pil: ' 1 i.--, 0

7;,X + pTp, p X 2 and equate i.e., if we multiply both sides of (5.7) by 1 coefficients of powers of X, we see that (5.7) is equivalent to the equalities —

Tp — Tp = 0,

Ti = 1,

Tpt — Tpr-iTp + pTpi-2 T„,p for / > 2.

To incorporate part (a) of Proposition 32, we introduce a new variable s by putting X = p - s for each p in (5.6) or (5.7). We then take the product of (5.6) over p with pIN and (5.7) over all p with p,11 N: co I 7 1-2s I -H fI E p1P -Is = fl piN 1 — TpP s p-f1■1 1 — TpP -s + Tp,pP all p 1 ,-- 0 But, by part (a) of the proposition, when we multiply together the sums on the left in this equality, we obtain / To', where the sum is over all positive integers n. The proof is exactly like the proof of the Euler product for the Riemann zeta-function. We use the factorization n = p71 • • • A.'', and the « p r-ars). We hence conclude that relation: T„n -3 = (Tpli p -l Œts) • - • (T Pr' oo

To' = n=1

ri piN 1 —

1 11 I E 12s• - • T ,PP TpP s pp' 1 — TpP - +P

(5.8)

For d an integer prime to N, let [d]: Y -÷ 2 be the linear map defined on basis elements by Me ', , =-- eL di • Note that di has exact order N in C/L, because g.c.d.(d, N) =--- 1. A lso nc;te that [d] depends only on d modulo N, i.e., we have an action of the group (Z/ NZ)* We now consider function's F on modular points and the corresponding functions f(z) on H. Again we suppose that we are in the case I' = 1"1 (N). If T: 21 ---* 21 is a linear map given on basis elements by equations of the form T(e La ) = 1 an e pa , then we have a corresponding linear map (which we also denote T) on the vector space of complex-valued functions on modular points: TAL, t) =EaF(P). For example,

[d]F(L, t) ------ F(L, dt) T„,„F(L, t) = n -20, t) n T„F(L, t) = 1 E F(L1 , t), n L,

(here g.c.d.(d, N) = 1);

(g.c.d.(n, N) = 1);

(5.9)

159


where the last summation is over all modular points (L', t) such that [L' = n, as in (5.2). Proposition 33. Suppose that F(L, t) corresponds to a function f(z) on H which is in Mk (Ti (N)). Then [d]F, TF, and TF also correspond to functions (denoted [d]f, Tf, and Tnf) in Mk (ri (N)). If f is a cusp form, then so are [d]f, Tn.nf, and Tnf. Thus, [d], 7, and Tn may be regarded as linear maps on mk(ri (N)) or on sk(ri (N)). In this situation, let x be a Dirichlet character modulo N. Then Je MAN, x) if and only if [d]F = x(d)F, i.e., if and only if

F(L, dt) = x(d)F(L, t) for de (ZI NZ)* .

(5.10)

PROOF. To show that [d]f, T„, „And Tjare invariant under [y] k for y e F1 (N), by Proposition 31 it suffices to show that [d]F(L, t), TF(L, t), and T„F(L, t) have weight k, i.e., [d]F(AL, At) =--- A -k [d]F(L, 1), TF(AL, At) =--A -k T„,nF(L, t), and TnF(AL, At) = A-k TnF(L, t); but this all follows trivially from the definitions. We next check the condition at the cusps. Note that if a = e GL (Q) and iff(z) corresponds to F(L, t), then f(z)l[a] k = (det Œ) 2 (CZ d) -k F (L„z , -110

= (det 00'12 F (1-,..z+b,cz+d, cz In particular, if ŒE F, then this equals F(Lz , (cz d)/N). Next, for each de(ZINZ)*, choose a fixed ad e T such that ad (cr) )mod N. (This is possible, because g.c.d.(d, N) = 1, and the map F SL 2 (7L/N7L) is surjective, by Problem 2 of §1II.1.) We then have .f(z) I [act]k

F (45 / c-v )= [d] F

(L

Z

[d]f(z),

i.e., [d]f = fif ad_ik• 1 Thus, for 70 E F we check (3.8) for [d]f as follows: [d],f i[Ydk = f I [6dYdk , which has a q-expansion of the required type, i.e., [d]fsatisfies (3.8) iffdoes. Similarly, we find that 7,J(z) = n-2 F(L,L,, h) = nk -2 F(Lz , ;v2-) = n k- 2 [I ] f(z) , so this case has already been covered. We next consider the cusp condition for T„f(z), which is a sum of functions of the form 1,7), where L' is a lattice containing L 2 with index n. We take such an L' and let (co l c02) be a basis for L'. Since Lz L' with index n, there is a matrix t with integral entries and determinant n such that (7) = TO) (co denotes the column vector with entries co l and c02). We can choose a set T of Such t (independently of z) such that

Tf(z) =

E F(L., ( z ) n t€ , 9 N

We now consider each F(L., h), where (I) = t(: 21). We find a yer such that yr 71 ,1,(g ',7,) with zero lower-left entry and a, b, d integers with ad = n

III. Modular Forms

160 -

(it is an easy exercise to see that this can be done). The lattice spanned by CO = T -1 (7) is the same as that spanned by y() = -,vg ",)(7). Thus, a

F(Lto I ) = F (I,( az+ bjln,d In , -9 N = ci c F (L(az + b )1d , 1 , -N'— N = ak [a] f((az + 1))1d). But if f(z) satisfies (3.8), then so does f((az + 12)1d), and [a] also preserves (3.8), as shown above. This proves the cusp condition for Tnf. Finally, let y d E To (N) be any element with lower-right entry d. As shown above, f(z)l[yj k = F(Lz , kJ) = [d]F(Lz , Ntf). Thus, f INk corresponds to [diF, and so ,f,ifL.,vd_iI k T his completes 0 the proof of the proposition. We saw before (Proposition 28) that a function Je Mk(Ti (N)) can be written as a sum of functions in Mk (N, x) for different Dirichlet characters x. Thus, using the one-to-one correspondence in Proposition 31, we can write a modular form F(L, t) as a direct sum of F's which satisfy (5.10) for various x. Proposition 34. The operators 7,, and 7 commute with [d], and preserve the space of F(L, t) of weight k which satisfy (5.10). If F(L, t) has weight k and

satisfies-(5.10), then T„F = nk-2 x(n)F.

PROOF. That the operators commute follows directly from the definitions.

Next, if [d]F = x(d)F, it follows that [d] TF = T[d] F = x(d)TF and [d] TF = T[d]F = x(d)7,F. This is just the simple fact from linear algebra that the eigenspace for an operator [d] with a given eigenvalue is preserved under any operator which commutes with [d]. Finally, if F(L, t) satisfies (5.10), then 1;,,nF(L, t) = n-2 F(L, t) = nk-2 F(L, nt) =

0

nk-2 [n]F(L, 0 = nk-2 X(n)F(1-1 1).

If we translate the action of Tn , T. „ and [d] from functions F(L, t) to functions f(z) on H, then Proposition 34 becomes Proposition 35. Tn and T„„ preserve Alk(N, X), and also Sk (N, x). For 4 f€ Mk (N, X) the action of T„,:, is given by T„,„f = nk-2 x(n)fs Proposition 36. The operators 7

Mk (N, x) satisfy the formal power series

identity a,.

E nr,-- 1

Tun - s = jj 0 - Tpp -s + x(p)p k-i — 2s‘)—1 .

(5.11)

all p

PROOF. We simply use (5.8) and observe that for

p)( N we have Tp, pf = p k-2 x(p)f, while for .1)1 N the term on the right in (5.11) becomes (1 — Tpp -s)l because x(p) = 0. 0

161


As a special case of these propositions, suppose we take x to be the trivial character of (ZIN1)*. Then Mk (N, -x) = Mk(Fo (N)), and the modular forms fe Mk(N, x) correspond to F(L, 1) on which [d] acts trivially, i.e., F(L, = F(L, dt) for all de (ZINZ)*. Such F(L, 1) are in one-to-one correspondence with functions F(L, S), where S is a cyclic subgroup of order N in C/L. Nainely, choose t to be any generator of S, and set F(L, S) = F(L, t). The fact that F(L, t) = F(L, dt) means that it makes no difference which generator of S is chosen. Conversely, given F(L, S), define F(L, t) = F(L, S1), where = Zt is the subgroup of C/L generated by t. So we have just verified that functions of modular points in the sense of case (iii) at the beginning of this section correspond to modular forms for ro (N). We now examine the effect of the Hecke operator Tm on the q-expansion at oo of a modular form f(z) E Mk(N, x). That is, if we write f(z) = E an q" and Tmf(z) = Zbq, q = e'r iz, we want to express b„ in terms of the an . We first introduce some notation. IffE C [Pin f Za n e, we define ;

ben

;

Um f=

(5.12)

an gnim,

where the latter summation is only over n divisible by m. Note that U1 = identity, and Um o J' the identity, while V„,o Un, is the map on power series which deletes all terms with n not divisible by ni. Suppose that f(z) = E anqn, q = e2niz, converges for ze H. Then we clearly have:

Vmf(z) = f(mz);

Umf(z) =

Proposition 37. Let f(z) = E,T=0 anqn , q = e 2 E,T_.. 0 kg". Then

bn= apn

X(P)P

m -1

m i=0

f

+j m

E).

(5.13)

, fe Mk (N, x), and let 7,f(z)

k-1. a

nip9

(5.14)

where we take x(p) = 0 i piN and we take anip = O fn is not divisible by p. In other words,

Tp =

x(p)p k-1 Vp on

Mk (N, x).

(5.15)

We have Tpf(z) = F(L' , k), where F is the function on modular points which corresponds to f and the sum is over all lattices L' containing Lz with index p such that ki has order N modulo L'. Such L' are contained in the lattice -,15Lz generated by 1, and +„ and the lattices of index p are in one-toone correspondence with the projective line Np over the field of p elements Fp = ZIpl. Namely, the point in 1131-p with homogeneous coordinates (a 1 , a2 ) corresponds to the lattice generated by Lz and (a l z a 2)/p. Thus, there are p + 1 possible L' corresponding to (1, j) for j = 0, 1, . . . , p — 1 and (0, 1). If ON, then all p ± 1 of these lattices L' are included; if piN, then the last lattice (generated by Lz and must be omitted, since k has order 1, in that case. Note that the lattice generated by Lz and (z Alp is L(z+i)/p . Thus, if pi N we have 7,f(z) = t, h) = Upf(z). If PROOF.

162

III. Modular Forms

ON, then we have the same sum plus one additional term corresponding

to the lattice generated by

s pz . Thus, in that case 4 and 1,-; this lattice is f,L

p:N) 1) = Upf(z) + plc' F (Lpz, N

Tpf(z) = Upf(z) + ;I'01 , pz,

= Upf(z) + p k-1 x(p)F (L pz , Tv1 -) = Upf(z) + pk-1 x(p)f(pz),

which is (5.15). The expression (5.14) for the bn follows directly from (5.15) 0 if we use the expressions (5.12) for the operators Vp and U. As a consequence of Proposition 37 we have the factorization

1 — 7X+ x(p)p k-1 X 2 = (1 — UpX)(1 — x(p)p k-1 VpX),

(5.16)

where both sides are regarded as polynomials in the variable X whose coefficients are in the algebra of operators on the subspace of C [[q]] formed by the q-expansions of elements f(z) E MOT, X). To see (5.16), note that the equality of coefficients of X is precisely (5.15), while the coefficients of X' agree because U po Vp = 1. Proposition

38.

We have the following formal identity : co

co

E

ru n — s

n=1

= E x (n)nk —i

i Un).

n=1

or, equivalently,

Tn =

(5.17)

n=1

E x(d)d' Vd

(5.18)

0 Unid.

din

PROOF. By (5.11) and (5.16) we find that the left side of (5.17) is equal to

Fl ((1 — upp — s)(1 — x(p)pk -1 VpP —s

)) -1 •

P

Since the Up and Vp do not commute, we must be careful about the order of the factors. Moving the inverse operation inside the outer parentheses, we reverse the order, obtaining

ri (0 -

x(p)pk - i vpp - syi (1

- Up').

P

Note that Up , and Vp2 do commute for p 1 0 p2 , as follows immediately from (5.12). This enables us to move all of the (1 — U p - s)_l terms to the right past any (1 — X(P2)Pr i Vp,P2-s) -1 term for /32 0 p. This gives us separate products with the Vp's and with the Up's :

1

1

Il l - x(p)pk -1 Vpp - s n 1 - Upp - s: p

p

We now expand each term in a geometric series and use the fact that V. = Vn, o Vn and U. = Una o U. The result is (5.17). 0

163

§5. the modular interpretation, and Hecke operators

Proposition 39. Under the conditions of Proposition 37, if Tmf(z) = E(:,°_, 0 b,,qn,

then

x(d)dk-i

Z

bn =

a mnid 2

(5.19)

'

dig-c.c1-(m,n)

PROOF. According to

(5.18), we have

Tm Z (lair = n= 0

,

E x(d)d k-1 Vd o Umid E an q n n= 0

dim

= E x(d)d" 1 E dim

mldfn

a qd2 n

ron .

If we set r = d2 n1m, the inner sum becomes I arm1d2gr with the sum taken over all r divisible by d. Replacing r by n and gathering together coefficients o of q' , we obtain the expression (5.19) for the n-th coefficient. Most of the most important examples of modular forms turn out to be eigenvectors ("eigenforms") for the action of all of the Tm on the given space of modular forms. Iff E Mk(N, z) is such an eigenform, then we can conclude a lot of information about its q-expansion coefficients.

is an eigenform for all of the operators Tm with eigenvalues Am , m = 1, 2, . . . : Tmf = Anil'. Let am be the q-expansion coefficients: f(z) =Em",, o am qm. Then am = Amai for m = 1, 2, . ... In addition, al 0 0 unless k = 0 and f is a constant function. Finally, if ao 0 0, then A m is given by the formula Proposition 40. Suppose that f(z) E Alk (N,

A m ------

)()

E x(d)d'.

(5.20)

dim

Using (5.19) with n = 1, we find that the coefficient of the fi rst power of q in Tmf is am . If Tmf = Am!, then this coefficient is also equal to A m a 1 . This proves the first assertion. If we had a l = 0, then it would follow that all am = 0, and f would be a constant. Finally, suppose that a() 0 O. If we compare the constant terms in Tmf = 2.f and use (5.19) with n = 0, we obtain: A m ao = 1)0 = Eon x(d)d' i ao . Dividing by a() gives the formula for PROOF.

0

Am .

Iff is an eigenform as in Proposition 40 (with k 0), then we can multiply it by a suitable constant to get the coefficient of g‘ equal to 1: a l = 1. Such an eigenform is called "normalized". In that casé, Proposition 40 tells us that am = Am is simply the eigenvalue of Tm . If we then apply the operator identity (5.11) to the eigenform f, we obtain identities for the g-expansion coefficients of f. Namely, applying both sides of (5.11) to a normalized eigenform fE Mk(N, x), we have: CO

E an n=1

= fio _ app_s +x(p)?,,,sy i

•

III. Modular -Forms

164

1. Let N = 1, and let k > 4 be an even integer. Suppose that is an eigenform for all of the T., and suppose that a o 0 0 f= auir G Mk and f is normalized (i.e., a l = 1). Then Proposition 40 says that f= (10 + Eo-k _ i (n)qn. But recall the Eisenstein series Ek = 1 — Lik-1(1)qn (see (2.11)). Thus, f and --IkEk are two elements of Mk (r) which differ by a constant. Since there are no nonzero constants in Mk (F), they must be equal, i.e., a o =— Bk/2k. Therefore, up to a constant factor, any weight k eigenform for r which is not a cusp form (i.e., a o 0 0) must be Ek. Conversely, it is not hard to show that Ek is actually an eigenform for all of the Tm . This can be done using the q-expansion and (5.14) (it suffices to show that it is an eigenform for the Tp , since any Tm is a polynomial in the operators Tp for p prime). Another method is to use the original definition of Tm on modular points, applied to Gk (L) = - o*rel.r k . 2. Let N = 1, k = 12. Since S12 (1") is one-dimensional, spanned by (2n) -12 A(z) = E;;°=1 t(n)q" (see Propositions 9(d) and 15), and since the L preserve Sk (F), it follows that f= Et(n)q" is an eigenform for L. It is normalized, since t(1) = 1. Then Proposition 40 says that Tmf = t(m)f for all in. Thus, the relation (5.11) applied to f gives co 1 (5.21) -c(n)n - s = sp11—2s• EXAMPLES.

(r)

E

allp 1 — t(P)P—

n=1

This Euler product for the Mellin transform off = E t(n)q' is equivalent to the sequence of identities (see Proposition 32):

t(mn) = t(m)t(n)

for in, n relatively prime;

T(P i) = t(p 1 )c(p) P 1 'T(P 1-2 )Ramanujan conjectured that in the denominator of (5.21) the quadratic 1 — t(p)X p' X 2 (where X = p -s) has complex conjugate reciprocal roots ap and a-14p ; equivalently, 1 — t(p)X pll X 2 = (1 — ap X)(1 — FipX), i.e., t(p) = ap + p with lap ! = p l 112 From this it is easy to conclude that IT(p)i 2/3' 1/2 , and, more generally, IT(n)! < 0- 0 (n)n 1112 (see the proof of Proposition 13 in §II.6). The Ramanujan conjecture and its generalization, the Ramanujan-Petersson conjecture for cusp forms which are eigenforms for the Hecke operators, were proved by Deligne as a consequence of the Weil conjectures (see [Katz 1976a]). The Euler product (5.21) is reminiscent of the Hasse-Weil L-series for elliptic curves. For more information on such connections, see [S4imura 1971]. 3. Let N = 4, z(n) = (4) = (— 1)(11-1)12 for n odd. We saw that Mi (N, x) is one-dimensional and is spanned by .0 2 . If we apply Proposition 40 to e2 = + q + • - • + 21,qm + • • • 9 we find that Am = (-L -;1 ), where the sum is over odd d dividing m. For example,

Ap 1

,—1

-r

—

p

)

10 ifp 3 (mod 4); 2 if pE:-. 1 (mod 4).

(5.22)

165


Since 2„, is one-fourth the number of ways m can be written as a sum of two squares, we have recovered the well-known fact that p can be so expressed in eight ways if p 1 (mod 4) and in no way if p E-: 3 (mod 4). That fact is usually proved using factorization in the Gaussian integers Z[i]. The Gaussian field 0(i) has discriminant 4, and corresponds to the character x of Z having conductor 4: x (n) = 6-,1). Let 1 denote the trivial character of (Z/41)*, and let p denote the two-dimensional representation ( ) of (Z/4Z)*. Then we can write (5.22) in the form A p = Tr p(p). This turns out to be a very special case of a general fact: a normalized weight-one eigenform in M l (N, x) has its q-expansion coefficients determined by the trace of a certain two-dimensional representation of a certain galois group. For more information about this, see [Deligne and Serre 1974].

Another approach to Hecke operators. Let F1 and some group G.

F2

be two subgroups of

Definition. F1 and F2 are said to be "commensurable" if their intersection has finite index in each group: [F1 F/ n F2] < 00 , [F2 : F1 n F2] < 00 . Let r be a congruence subgroup of F SL 2 (1), and let a e G = G.L4 - (0). Then r and a' F'a are commensurable. To see this, suppose r F (N ). By Lemma 1 in the proof of Proposition 17, we have r(ND) for some D. Let F" = a-l r'a F(ND) and F' n ara F' n cc Pa. Then F" r(ND), otr"oc— i r (ND). Thus, [r F"] [F: ['WD)] < cc, and also [a' F'a : F"] = [F' : aF"a -1 ] [F: F(ND)]

BASIC EXAMPLE.

, S +) • A"(N, V , S +)\c A"(N, V , S +),

(5.24)

and that A l (N, S' , S +) is a group. A l (N, 5' , S +) is clearly a congruence subgroup of I", since it contains r(N/), where N' is the least common multiple of M and N (recall S+ = MZ). Here are our familiar examples:

ro (N ) = A l (N, (7L1 NZ)* , Z);

ri (N) = A l (N , {1 , Z); }

I" (N) = A 1 (N, {1} , NZ). Definition. Let r be a congruence subgroup of r, and let a e GL I (0). Let 1"" = r n a -1 1-"a, and let d z---.- [f': r"], I"' = U1=1 1""yj. Let f(z) be a function on H which is invariant under [y]k for y e r. Then , f(Z)

Proposition

d

I Crlarik ,-;-,. E f(z)f[aynk.

(5.25)

i---1

42. f(z)l[rall k does not change if a is replaced by any other

representative a' of the same double coset : Fair = Far. Nor does it depend on the choice of representatives y.; of r/ modulo r". If Je mk (r), then fi[railk e

mk(n.

PROOF. We first prove the second assertion, that is, that

(5.25) is unchanged

if y.; is replaced by ylyi, where 7; e r". Since r" a-l ra, we have 7; = a-l .pia for some -,i e r. Then f l[ay; y:]k = flP jayAk =f[ŒY5]k, because f IDA, =f. Next,. we observe that, by Proposition 41, the sum on the right in (5.25) can be written Ef(z)I[ai] k , where- the ai are any elements such that rar = U rai . It is then immediate that the definition depends only on the double coset Far and not on the choice of representative a. Finally, suppose that fe Mk (r). If y e r', then (f f[rocrlk)([y]k = Ef lEotyAk = f [Par ], since right multiplication by y just rearranges the right cosets ray]. If f

I

167


satisfies the cusp conditions, then so does fl[ayA k for each j, by Lemma 2 in the proof of Proposition 17. This completes the proof of Proposition 42. CI

Definition. Let r = A 1 (N, S", S+ ), and let n be a positive integer. Let fe Mk (r). Then

7f

(k12)-1

Ef I [rank,

(5.26)

where the sum is over all double cosets of F' in An(N, Sx , S t ). By Proposition 42, we have Tnfe mk (r). _ Equivalently, we can define Tn f

= n(k/2)-1 E

.,...k, fi[al where Pai runs through the right cosets of ri in An(N, S x , S). Proposition 43. In the case

r = ri ( N), the definition

(5.27)

(5.26) agrees with our

earlier definition of the Hecke operators T n . PROOF. Let A" = An (N , {1}, z). For each a e (Z1 NZ)* we fix a

GF

such that

Lemma. (5.28)

A" = U fi (N)øa (" disjoint

0 dt

where the disjoint union is taken over all positive a dividing n and prime to N, and for each such a we set d = nla and take b = 0, 1, .. . , d — 1. The terms on the right in (5.28) are clearly contained in A. Suppose the union were not disjoint. Then for some a', b', d' = nid we would have Yi aa (g :11) = y2 o-„,q; ct;:), and so F would contain the matrix (g byl; ':,:y1= (a/81' 4,); then a' = a, d' = d, and so (g ,'0,) = ab('; db :) for some j, i.e., b = b' + jd. If 0 < b, b' < d, this means that b = b'. To prove the lemma, it remains to show that any a = (ae: n e A" is in one of the terms on the right in (5.28). Choose g, h relatively prime so that go' + hc' =- 0, and complete the row g, h to a matrix y = ge il; ) e F. Then ya has determinant n and is of the form (8' :). Replacing y by +(13,. by if necessary, without loss of generality we may suppose that ya = (g ) with a> 0, ad = n, 0 b < d. Then, considered modulo N, the equality a = y -1 (g bd ) gives (1) ,7*) r_.=_- ( 1,!; :)(g 1,;), i.e., y-1 is of the form Yi aa for some yi e 1"1 (M. Thus, a = y1 a,,(0` bd) e r-i1 (N)cra(8 !,), as desired. This completes the proof of the lemma. o PROOF OF LEMMA.

(

,We now prove the proposition. Let Tr be the linear map, on Mk (Fi (N)) defined by (5.26) (or (5.27)), and let Tir d be the earlier definition. Since Aik(Fi (N)) = C i x Mit(N , x), it suffices to show that T:ew = T:k1 on Mk (N, z).

168

Iii.

Modular Forms

Let Je Mk (N, x). By the lemma, we have T: ewf = w k/2)-1 Ef i pia (g bA k. Since f Mk (N, x) we have f I [aa ],, = x(a)f. Also, for each a> 0 and d ad = n we have d-1

E f(z)IN

b=0

bdnk

with

= 11 nki2 cl-kf(az d± 1 b=0 = n ki2 ai-k Va 0 dUdf(z),

by (5.13). Thus, Tnnewf = (k/2)- 1

Ex(a)n ka d i

—k va 0 udf.

an

(The sum is over a > 0 dividing n and prime to N; but the term x(a) ensures . that the terms with g.c.d.(a, N) > 1 will drop out.) That is,

T:ew z---

E x (a)ak - i va o u„,a =

Told ,

all;

by (5.18). This proves Proposition 43.

to

In many situations, the definition of Tn in terms of double cosets is more convenient than the definition in terms of modular points. For example, we shall use this definition below to show that Tu is a Hermitian operator with respect to the Petersson scalar product. Moreover, the double coset approach can be generalized to situations where no good interpretation in terms of modular points is known. We shall see an example of this in the next chapter, where we shall define Hecke operators on forms of half-integral weight. For a more detailed treatment of the double coset approach in a more general context, see [Shimura 1971].

The Petersson scalar product. Suppose we have an integral over some region in the upper half-plane and make the change of variable z F.+ az = where a = (ca 1:1) E GL I (0). If we have a term dxdyly 2 in the integrand, this does not change under such a change of variable. To see this, we compute:

, daz ....._ = det a dz (cz + d) 2 '

az

Im _ det a Im z — Icz + dr .

(5.29)

In general, if we make a differentiable change of complex variable z 1 = u(z), then the area element dxdy near z is multiplied by lut(z)1 2 (see Fig. 111.5). Thus, interpreting the first equality in (5.29) in terms of the real variables x = Re z and y = Im z, we see that an element of area near z is expanded by a factor idazIdzi 2 = (det a) 2/Icz + dr. Thus, dxdyly 2 is invariant under the change of variables, by (5.29).

Proposition 44. Let F' c F be a congruence Subgroup, let F' c H be any fundamentardomain for F', and define

169

§5._ The modular interpretation, and Hecke operators

Figure 111.5

I .7v

dxdy

11(r1)

cfff

F"

(5.30)

2 '

Then 0 (a) The integral (5.30) converges and is independent of the choice of Fi . (b) [t.- : i-4 1 = griht(r). (e) If a e GLI (0) and cc-1 ra c F, then [r: rf] = [f: a-1 f' al PROOF. First let [r: 11 = d, let f = U1=1 (04'9 and take F' = U ai l FNotice that the integral for p(F) converges, because

f dxdy f 1/2 JF

< I Y2 .1

y'ydx d :=

2 Vi

—1/2J srS/2

If for each j we make the change of variables zi--+ aj z, we find that ja7 1 F dXdY1Y2 = IF dxdyIy 2 ; hence, for our choice of F' we find that the 1 integral in (5.30) converges to dp(F). Suppose we chose a different fundamental domain F1 for F'. Then we divide F1 into regions R for which there exists a e F' with aR c F', and again we use the invariance of dxdyly 2 under zl---+az to show that the integral over R c F1 and the integral over the corresponding region aR in F' are equal. Finally, to show part (c), we note that a' F' is a fundamental domain for a-i ra. Since

f

dxdy _ f dxdy

Gt_ir y2

r y2

,

it follows that p(a - ir a) = p(r), and so part (c) follows from part (b).

a

mk(r).

Now suppose that f(z) and g(z) are two functions in We consider the function f(z)g(z)y k , where the bar denotes Complex conjugation and y = Im z. If we replace the variable z by az for a e GL I (0), we obtain: f(az)g(az)y k (det allez + di 2 )k, by (5.29). But this is just (f(z)l[a] k) (g(z)i[a] k)y k . Thus, the effect of the change of variable is to replace f by f[Œ]k and g by g I [a]k .

rf = r

Definition. Let be a congruence subgroup, let F' be a fundamental domain for F', and let f, g e Mk (F'), with at least one of the two functions f,

170

in Modular Forms

g a cusp form. Then we define

f

o dyxd2y . (5.31) i;g> (if [f : fl < p i" It is immediate from this definition that is linear in fand antilinear in g (i.e., < f, cg> = 5 - (f, g>), it is antisymmetric (i.e., = ), and also < f, f> > 0 for f 0 O. These are the properties one needs in order to have a hermitian scalar product. We shall return to this later. 1

...z .

75 g yk

Proposition 45. The integral in (531) is absolutely convergent, and does not depend on the choice of F'. If F" is another congruence subgroup such that f, g emk (r"), then the definition of is independent of whether f, g are considered in Mk (F) or in mk (r").

n

Remark. The term lir : in (5.31) is needed in order to have the second assertion in the proposition. The requirement that f or g be a cusp form is needed to get convergence of the integral, as we shall see. First take F' = H ail F, where f = U Œif-/- In each region cT i F make the change of variables which replaces z by cc1 1 z, thereby transforming the integral into (z)i[ail]kg(zacc-i]. , dxdy

PROOF.

Efj f)

)

F

K

•T

y

2 •

Sincef, g G mk (r), we can writef l[ct,7 ' ],, = Es13.--.0 and, 91[01.r 1 ]k = Ec:= 0 bnd, with either an = 0 or bn = 0, since f or g is a cusp form. Because IqN I = le2 niziNi= e -21rY1N and y > 1 in F, and because an and b„ have only polyno. mial growth (see Problem 19 in §3), it is not hard to see that the integral is absolutely convergent. Next, if F1 is another fundamental domain, we proceed as in the proof of Proposition 44, comparing SR f(z)g(z)y'dxdy with SŒR f(z)g(z)? -2 dxdy, where R is a subregion of F1 and al?, with a e F', is the corresponding subregion of F'. Making the change of variable z 1- az and using the fact that f I NI, = f, giNk = g, we find that the two integrals are equal. Finally, suppose f, ge Amr"). First suppose that F" c F'. Writing f' = f ", F" =U67'F', we have

U1&

1

0-4 : f 1 fr,

— k dxdy f(z)g(z)y 2 Y

.

=

1

-

1

[I':

E ri"] s

l f(z)1[6i-l ]kg(z)1[6i-l ]Yk k dxdy y F'

[f : ft] But all of the summands on the right are equal, sincefi[bi] k = 1 gi[61- 1k = g; and there are [r': i---"] values of!. We thus obtain the right side of (5.31). If F" s* rt, we simply set r"f = F" r' F', and show that (5.31) and (5.31) with F".in place of F' are each equal to (5.31) ü7' - "I in place of F'. This completes the proof of Proposition 45. 0

171


Proposition 46. Let f, g e Mk (r) with f or g a cusp form. Let a e GL (Q). Then

(5.32)

<j" g> = .

PROOF. Let F" = F' n ar'a'. Then f, g e Mk(f"), and fi[a]k , gl[a]k E Mk(GC 1 F"a) by Proposition 17(a). Here a -1 1""a = a' Fla n F'. Let F" be a fundamental domain for r", and take o( 1 F" as a fundamental domain for a -l F"a. Then the right side of (5.32) is

1

,c_ iF„ [Ï: .4 : a - r"a] f

kdxdy 1 f(z)1[ 0 ]9(z) I [Œ]k.Y

r-

I

-

fr,f(Z)g (z)

„kdxdy . y2

-,/

--Since [r: a-1 1-- "a] = [r: t- "] by Proposition 44(c), we obtain the left side 0 of (5.32). Now for a E GL I (0) we note that a can be multiplied by a positive scalar without affecting [a]k. So without loss of generality we shall assume in what follows that a = ( 'c'i ) has integer entries. Let D = ad — bc = det a, and set a' = Da- .1 _tic -ab‘) Then [aœl ]k = [alk . (

.

Proposition 47. With f, g, a as in Proposition 46,

= .

(5.33)

In addition, depends only on the double coset of a modulo F'. PROOF. If in (5.32) we replace g by gi{a -l j k and replace F' by r n then Proposition 46 gives (5.33). Now suppose we replace a by )7, ay2 in < f I [a], , g>. We obtain = = , because f is invariant under [yi jk and g is invariant under [Y 1 ]k' since

o

yi , 7;' e r'.

Proposition 48. Let r = IVN), A" = dn(N, {1}, Z) (see (5.23)). Let f, g e Mk (r) with for g a cusp form. For each de (ZINZ)* , fix some ord er such that ad ---- ( '' (2, ) mod N. Let n be a positive integer prime to N. Then

= . In particular, iffe M k (N, x) ., then = An) for n prime to N. PROOF.

If a = C JD e A" then a .7-_-- (1) bn) mod N, CL' (!)- -bnin Mod N. Thus, o-n a' e An. By definition,

(g ---i b) mod N, and

)

,,f= 77 k/2)-1 Efl[ r, r /], 7 (

where the sum is over the distinct double cosets of F' = A' in A". Let dOE denote the number of right cosets in F'ar. Then dOE = [r: r n a -1 ra] by Proposition 41. By ( ',), (5.25) and the second assertion in Proposition 47,

we have

-

172

111. Modular Forms

g>, =n(14/2)-1 E cice as a scalar product on mk (r,(N)), because makes sense only if f or g is a cusp form. Instead, one can use the explicit construction of Eisenstein series in §III.3 above, directly constructing eigenforms. Then Mk(I7i (N)) can be written as the orthogonal direct sum of Sk(fii (N)) and a space spanned by Eisenstein series which are eigenforms. (In fact, an intrinsic definition of an Eisenstein series, which generalizes to many other situations, is: a modular form which is orthogonal to all cusp forms.) The dimension of the space of Eisenstein series turns out to be the number r of regular cusps of 1"1 (N). Roughly speaking, in order for a modular form to be a cusp form it must satisfy r conditions (vanishing of the constant term in the qN-expansion), one at each regular cusp; and so Sk(I-1 (N)) has codimension r in Mk (fl (N)). For more information on Eisenstein series, see, for example, [Gunning 1962 ]. PROBLEMS

1. Prove Proposition 31 in cases (iii)—(iv),

I"' = Fo (N), F(N).

2. (a) Let ŒN = 1). We saw in Problem 15 of §III.3 that [ŒN]k preserves Mk (Fo(N)). Prove that the Hecke operator T. commutes with ŒN for n prime to N. (b) Let F =Enodd cr (n)q", 04 = En anqn M2 (F0(4)) (where a the number of ways n can be written as a sum of four squares). Write the matrix of T2 in the basis 04, F, and find a basis of normalized eigenforms for T2. (c) Show that 1 2 does not commute with a4 (see Problem 15(c) in §III.3). (d) Suppose n is odd. Since T. commutes with cx4 and T2, it preserves each eigenspace in part (b) and each eigenspace in Problem 15(c) of §III.3. Show that the operator T. on the two-dimensional space M2 (F0(4)) is simply multiplication by ai (n). (e) Derive the following famous formula (see, for example, Chapter 20 of [Hardy and Wright 1960 ]) for the number of ways n can be written as a sum of four squares:

an

=

8a, (n) 12401 (n0)

for n odd; for n= 2rn0 even, 2,1'n0 .

3. If JE Mk (N, x), show that f(0o) = 0 implies 7„f(00) = 0, but that f(s) = necessarily imply 7"„f(s) = 0 for other cusps s (give an example).

0 does not

4. (a) Show that if feMk (N, x) and g(z) = f(Mz), then gE Mk (MN, x), where 2( is defined by xi(n) = x(n mod N) for ne(ZIMNZ)*. Let T. be the Hecke operator


175

considered on Mk(N, x), and let Ts' be the Hecke operator considered on Mk (MN, x'). Suppose n is prime tor M. Show that forfeMk (N, x) c: Mk (MN, x') and g(z) = f(Mz)E Mk (MN, X') we have: 7 f= 7;,f ; '

g(z) = (Tf)(Mz).

(b) Let f(z) = (q(z)q(2z)) 8 , g(z) = f(2z) E S8 (1-0 (4)) (see Problem 18(d) of §111.3). Show that for odd n, T„ acts on the two-dimensional space S8 (F0(4)) by multiplication by a scalar. (c) Find the matrix of T2 acting on s8 (r0(4)) in the basis f, g ; then find the basis of normalized eigenforms for all of the Tn .

5. Let f, g E Mk (N, i) with f or g a cusp form. Let AN. Note that Tp = Up . Show that <J, Vp lj> 6. Let fi = (270- 2 4 A2 5 = (a) 2A "6 9 where (20-12 d q11„(1 qn) 24 and E6 =1 - 504 05 (n)q"• We know that S24 (1") is two-dimensional and is spanned by fi and f2 (see §111.2). (a) Give a simple reason why fi is not an eigenform for the Hecke operators. (b) With the help of a calculator, find the matrix of the Hecke operator T2 in the basis fi , f2 (c) Express in terms off, andf2 the basis for s24(r) consisting of normalized eigenforms for all the 7. (d) Express in terms off, and f2 the cusp form whose n-th q-expansion coefficient is the trace of 7,, on S24(1"). (e) Find Tr T3 from part (d), and also directly by computing matrix entries.

In this problem one encounters fairly large numbers; for example, the coefficients in part (c) are in Q(V144169). While Proposition 51 guarantees the existence of a basis of normalized eigenforms, it does not guarantee that it will always be easy to find it computationally.

7. Let (L, t) be a modular point for Fi (N). Let a e W(N, {1}, z), and let y e (N). For each a and y, consider the lattice L' = !-ay.L. Show that L' is a lattice which contains L with index n, that L' depends only on the right coset of F'1 (N) in An(N, {1 } , Z) which contains ay, and that t has order N in C/L'. Show that the L' in (5.2) are in one-to-one correspondence with the right cosets of Fi on in An(N, { 1 } , Z). Use this to give another proof of Proposition 43 by comparing (5.9) and (5.27). 8. Let fe Mk (Fo(p)). (a) Show that yo = 1 and yi = (1). -1.;), 0 <j < p, are right coset representatives for F modulo Fo (p). (b) Let Tr( f) be defined by: Tr(f) = liP, 0fl[y j k . Show that Tr(f)eMk(F). (c) If f happens to be in man, then show that Tr(f) = (p + 1)f and Trtff [(p) l )] k) = ki2 7;f

CHAPTER IV


Let k be a positive odd integer, and let A = (k 1)/2. In this chapter we shall look at modular forms of weight k/2 = A + 1/2, which is not an integer but rather half way between two integers. Roughly speaking, such a modular form f should satisfy Maz + 13)1(cz + d)) = (cz + d)+112f(z) for (ca ,i) in f = SL2 (7L) or some congruence subgroup I' c T. However, such a simple—

minded functional equation leads to inconsistencies (see below), basically because of the possible choice of two branches for the square root. A subtler definition is needed in order to handle the square root properly. One must introduce a quadratic character, corresponding to some quadratic extension of Q. Roughly speaking, because of this required "twist" by a quadratic character, the resulting forms turn out to have interesting relationships to the arithmetic of quadratic fi elds (such as L-series and class numbers). Moreover, recall that the Hasse—Weil L-series for our family of elliptic curves E„ : y2 = x 3 — tex in the congruent number problem involved "twists" by quadratic characters as n varies (see Chapter II). It turns out that the critical values L(E„, 1) for this family of L-series are closely related to certain modular forms of half-integral weight. One classical reason why modular forms were studied is their use in investigating the number of ways of representing an integer by a quadratic form: m =41,_. 1 A fi nin, = [nir A[n], where A = [Ail] is a giiren symmetric matrix, [n] is a column vector and [n]' the corresponding row vector. For example, the number of ways m can be represented as a sum of k squares is equal to the coefficient am in the q-expansion of O": k

q EnJ =7-. Eam es i ek = j=11 i gni2 = 1 n i= n 1 , - - • , nk= — —

co

co

In Chapter III we saw that for k even Ok e Mk/2 (4,Xic—Ii), where x_.. / is the

177

§1. Definitions and examples , /

character modulo 4 defined by x_ 1 (n) = (-741) = (-1)(n -1)12 . More generally, for k even one can use In q tn"Ini to construct modular forms of weight k/2. The properties of modular forms are then useful in studying the number of representations m = [n]fil[n]. For example, in Problem 2 of §III.5 we used the action of the Hecke operators on M2 (1"0 (4)) to derive a simple formula for the_number of ways an integer can be written as a sum of four squares. For more information about the connection between quadratic forms in k variables and modular forms of weight k/2, see, for example, [Gunning 1962] and [Ogg 1969]. It is natural to ask whether a similar theory exists for quadratic forms in an odd number of variables. This would be a theory of modular forms of weight k/2 where k is an odd integer. One would want ek for k odd to be an example of such a form. Early investigators of representability of an integer as a sum of an odd number of squares—Eisenstein, and later G. H. Hardy understood the desirability of such a theory of modular forms of halfintegral weight. But it was only recently—starting with Shimura's 1973 Annals paper—that major advances have been made toward a systematic theory of such forms which rivals in elegance and beauty the much older theory of forms of integral weight. In this chapter we shall first present the basic definitions and elementary properties of forms of half-integral weight, largely following [Shimura 1973a, 1973b], but with slightly different notation. We shall discuss examples in some detail. But when we come to the fundamental theorems of Shimura and Waldspurger, we shall not give the proofs, which go beyond the scope of an introductory text. Rather, we shall motivate those theorems using the examples. Finally, we shall conclude by returning to the congruent number problem and Tunnell's theorem, which we used in Chapter I to motivate our study of elliptic curves.

§1. Definitions and examples We shall always take the branch of the square root having argument in (-7r/2, ir/2]. Thus, .sii. is a holomorphic function on the complex plane with the negative real axis ( — co, 0] removed. It takes positive reals to positive reals, complex numbers in the upper half-plane to the first quadrant, and complex numbers in the lower half-plane to the fourth quadrant. For any integer k, we define zk12 to mean Whenever we have a transformation rule such as f(yz) = (cz + d) kf(z) (where y = (ca jt ), yz = (az + b)I(cz + d)), we call the term (cz + d)k_ the "automorphy factor". It depends on y and on z. That is, an automorphy factor J(y, z) for a nonzero- functionf has the property thatf(yz) = Ay z)f(z) 4 for-z e //And y in some matrix gr-oup. Because ,

178

iv. Modular Forms of Half Integer Weight

f(aliz) _ f(aliz) . Agz) .f(z) f(13z) f(z) it follows that any automorphy factor Ay, z) must satisfy

J(oefl, z) = f(Œ, /3z) J(fl, z).

For example, Ay, z) = cz + d for y = ( ,171)e F = SL 2 (7L) satisfies (1.1). If — 1 = G1 Pi ) is in our matrix group, then J(y, z) must also satisfx: J(— 1, z) = 1. Another example of an automorphy factor is Ay, = j(y, z) = (ff)ed-1 \ cz + d, y = bd )e F0 (4), which is the automorphy factor for 0(z) = qn2 , q = e2iriz (see §1II.4). Suppose that, in defining modular forms of weight k/2 for k an odd integer, we try the most obvious thing: we look for functions f satisfying f(yz) = (cz + d) kl2f(z). However, Ay, = (cz + d ) ci2 cannot be an automorphy factor of a nonzero function for any congruence subgroup. F' c F and any odd integer k. To see this, suppose r f(N), N> 2. Let a = iN_N), ?). Then a, fl E F', and (1.1) would require the k-th power of the following equality of holomorphic functions on H:

AN 2 2N)z +1 — N = \41--Nzl(— Nz + 1) + 1 — N • \I —Nz + I.

N

(1.2)

Clearly, the square of (1.2) holds; thus (1.2) itself holds up to a sign. Since both expressions in the radicals on the right are in the lower half-plane, the right side is the product of two complex numbers in the fourth quadrant; Ne H. but the left side is in the first quadrant, since (N 2 — 2N)z + 1 Hence, (1.2) is off by a factor of — 1, and (1.1), which is the k-th power of (1.2), is also off by —1 if k is odd. Thus, we cannot have Ay, z) = (cz + 0 42 . The natural way out of this difficulty is to force (1.1) to hold by simply defining the automorphy factor J(y, z) to be the k-th power of

j(y, , z) cr:f 0 (y z) I 0 (z) = (-c-

d

1 1 \I cz- + d

for

r

y = (a hi\ c d)

E

o

(4). (1.3)

That is, for a congruence subgroup F' c F0 (4), one defines a modular form of weight k/2 to be a holomorphic function on H which transforms like the k-th power of 0(z) under any fractional linear transformation in I"' (and which is holomorphic at the cusps, in a sense to be explained below). Recall that (fi) is the Legendre symbol, defined for a positive odd prime d as the usual quadratic residue symbol, then for all positive odd d by multiplicativity, and finally for negative odd d as (-) if c> 0 and — (-4-) if c < O. (Also, (4---) = 1.) Further recall that we define ed = 1 if d -a- 1.(mod 4) and Ed = i if dE--_-: — 1 (mod 4). Thus, e j . X - i (d) = (— 1)(d-1)12 (note that this holds for negative as well as positive d). Thus, if we define f(z)i[y] k/2 = j(y, z) kflyz) for y e F0 (4) and k odd, we shall require a modular form of weight k/2 for F' to be fixed by [y] k/2 for y e f'. As in the case of integer weight, we shall want to define [Œ]k/2 for arbitrary matrices a e GL I (0) with rational entries and positive determinant.

179

§1. Definitions and examples \

But since the automorphy factor j(y, z) is defined only for ye r0 (4), we have no preferred branch of the square root for arbitrary a e GLI(0). This circumstance requires us to work with a bigger group than GOO), a group G that contains two "copies" of each a EGLI (G)., one corresponding to each branch of the square root of cz + d. The example of the autornorphy factor Ay, z), which is a square root of (-7i)(cz + d), shows that we also should allow square roots of —(cz + d). Thus, we shall actually define G to be a four-sheeted covering of GOO). We now give the precise definition of the group G. Let Tc C denote the group of fourth roots of unity: T :.--- {,± 1, + i} . We now define G to be the set of all ordered pairs (a, 0(z)), where a = (a, 1:) e GL(Q) and 4)(z) is a holomorphic function on H such that

2 qt)(z) =

i

cz + d

Met a

\

for some t e T2 = { +1 } . Tliat is, for each a e GL() and for each fixe'd t = +1, there are two possible elements (a, +0(z))EG. We define the product of two elements of G as follows:

(1.4)

(al 0(z))(fil P(z)) = (Œ)6, 0(13z)0(z))-

Remark. We could define G in the same way but with T defined to be the group of all roots of unity or even (as in [Shimura 19734) the group of all complex numbers of absolute value 1. However, for our purposes we only need fourth roots of unity. Proposition 1. G

is a group under the operation (1.4).

PROOF. We first check closure, i.e., that the right side of (1.4) belongs to G. If a = (a, :11), fi = (eg f,), we have

(0(f3z)tP(z)) 2 = t i (ctiz + d)t 2 (gz + h)t\i'det a det /3

-=.-- t 1 t2 (c(ez +f) + d(gz + h))/Vdet afl = 1 1 12 ((ce +- dg)z + cf + dh)IVdet a/3,

■

which shows that (4, (P(Az)/i(z))e G. Associativity is immediate. It is also obvious that ( ( ), 1)EG serves as the identity element. Finally, to find an inverse of (a, 4)(z)), where a = (, bd ), we write D. = det a, a' = Da - I = ( cr, -ab), and look for (a( 1 , lit(z)) such that: OW 1 z)0(z) = ,1. That is, we set 0(z) = 110(or'z), and then must check that (a', 1//(z)) e G. But

tit(z) 2 = 1 1 ( ( dzb + d)/) = —1 f:6(—cz + a)/(ad — bc) a t —

.

1 t

4.

a )/„Met a -1

D D

which is of the reqUirVI

,

form. This completes the proof.

o

180

Iv. Modular Forms of Half Integer Weight

We have a homomorphism

P: G which simply projects onto the first part of the pair: (a, c/)(z)))--> a. The kernel of P is the set of all (1, 4)(z)) E G. Since 0(z) 2 = t for (1, 0(2))€G, it follows that 0(z) must be a constant function equal to a fourth root of unity. Thus, if we map T to G by I (1, t),_ we have an exact sequence 1

in other words, the kernal of P is the image of T under t (1, t). Similarly, if a = aa for a E 0, the inverse image of a under P is the set of all pairs (a, t) for le T, since in that case we again find that 4)(z) 2 must» be +1. We let G' = P --- 1(r) be the set of all pairs (a, (z)) e G such that Œ € r = SL 2 (Z). G' is clearly a subgroup. If = (a, 4)(z)) e G, we shall sometimes use the notation z to mean the same as az for z e H. For = (a, 0(z)) e G and any integer k, we define an operator []k/2 on functions J on the upper half-plane H by the rule f(z)

k12 f(ŒZ) 0(Z)

k•

(1.5)

This gives an action of the group G on the space of such functions, i.e., we have (f(z)1F1_Ik12., 6 L.- , 2_11 10 = f(Z) I [1 2]k/2 because of (1.4). L.,, A Now let ry be a subgroup of F0(4). Then j(y, , z) is defined for elements y e (see (1.3)). We define d--ff

r'

(Y, AY, Z ))1Y Er/.

(1.6)

Clearly is a subgroup of G (because of (4.13) in §III.4) which is isomorphic to rt under the projection P. Let L denote the map from T0 (4) to G which takes y to 5./ cfff (71./(79 Z)) e G.

Then P and L are mutually inverse isomorphisms from r0 (4) to r0 (4). We call L a "lifting" or "section" of the projection P. In our notation we are using tildes to denote this lifting from 1T0 (4) to G:

L:

(Y1 .1(Y1 z));

P: 6' .1-(Y z ) i---° Y. Suppose that f(z) is a meromorphic function on H which is invariant under where I"' is a subgroup of finite index in F0 (4). We now []ki2 for Y e describe what it means for f(z) to be meromorphic, holomorphic, or vanish at a cusp seQu tool. We first treat the cusp co. Since r has finite index in F0 (4) and hence in f, its intersection with

± fi

j)1 \O 1/jjz

181

§1. Definitions and examples

'Mil must, be of the form f+ a 'Di} if — 1 e I"' and either {a ini} or 1( — z) if — I rf. (We always take h > O.) Since [ -2-'1 ] 42 is the identity, and j((1) = 1, in all cases f is invariant under h, and so has an expansion in powers of gh e2h. As in the case of integer weight, we say that f is meromorphic at oo if only finitely many negative powers of qh occur, we say thatfis holomorphic at oo if no negative power of qh occurs, and we define f(co) to be the constant term when f is holomorphic at co. , Now suppose that se Q L.) fool. Let s = a co, Œ E F, and let =--- (a, 0(z)) be any element of G 1 which projects to a: P( ) = a. Define rsi = e I ys = sl, and let Gal = It/ e G cx) = co} . We have

i1 j)' {(± ( 0 because Foe ={ +(,1) 1)}, and the 4)(z) for ( ) must be a contant function I, as noted above. Now ir is contained in G' and fixes co, i.e., G. Moreover, P gives an isomorphism from Ç-1-rs' to a' F;a Ç. Since F' has finite index in F, it follows that a-i Fs'a is of one of the forms { + nj}, {(t) 1)-11, or ((— Mil. Thus, for some t E T we have Gsp

1 , = {(-±(i h)

iE z

tY} •

h> 0 is determined by requiring a n to be a generator of Given s ana a-1 1a modulo + 1; then t is determined, since P is one-to-one on e I. i.e., we can find t by applying the lifting map L to + c Proposition 2. The element ( ( ';), t) e Gcol depends only on the r-equivalence class of the cusp s. First suppose that c is replaced by another element (a l 4 (z)) e G 1 such that s = oo. Then e G' fixes co, and so it is of the form t i ). Then (±(,!,

PROOF.

Ofgi (M) -1 (± -1 rg)(M) ) t

(

(-±-(j)

hiM i RG

—1 1)

But (() f), t) commutes with ((4 t), so conjugating by affect(( 1), t). Now suppose that s l = ys = ('2c)oo, where y e F', = (y, that r;1 = yr;y — ', and so f;1 = ,pr;53-1 . Thus,

± GO -

=± and weagain obtain the same (0

)g

does not

z)) E f". Note

±

t). This completes the proof.

182

iv.


We are now ready to define meromorphicity/holomorphicity/vanishing at Let s = the cusp s of r". Suppose f is invariant under DI/2 for .2. E we have e and set g = fiN142 . Then for any element +

fiEgL i2= f1R1q 2= g..

g [

t)] k/2 :

That is, g is invariant under [(a) I

g(z) = g(z)

h

01

t-kg(z + h). t

Icf2

where r = 0, 14 , 1, or Then e -2nirzlh g(z) is invariant under z 1 * z + h, and so we obtain a Fourier series expansion e'i"filg(z) = Eanq'hi, Write i k = —

nign-1-79/11. g(z) = Eane2

We say that f is meromorphic at s if an = 0 for all but finitely many n < 0, and that f is holomorphic at s if an = 0 for all n < O. Iff is holomorphic at s, g(z). Automatically f(s) = 0 if r 0, since in that we define f(s) case the first term is ao e'z'fit. If r = 0, then f(s) = ao . It is easy to see that these definitions depend only on the F'-equivalence class of s, i.e., g = kI2, where y E f' and c oo = s (see flMic /2 may be replaced by g = f the proof of Proposition 2, and also the proof of Proposition 16 in §III.3). It should be noted, however, that there is some indeterminacy in the value f(s) of a modular form at a cusp. Namely, if we replace e G l by ((4 7), t ig, with t t e T, then the effect is to multiply g = fiM k/2 by CI k . If k/2 is a halfinteger, this may alter the value by a power of i; if kl2 is an odd integer, then f(s) may be defined only up to + I (compare with the proof of Proposition 16); and if k/2 is an even integer, then f(s) is well defined in all cases. Given a cusp s of I"' and an integer k, we say that s is k irregular if r 0, and we say that s is k regular if r = 0, i.e., if t k = I. Thus, iff is holomorphic at the cusps, it automatically vanishes at all k-irregular cusps. Note that whether a given cusp s of F' is k-regular or k-irregular depends only on k modulo 4. That is, if t = ± i, then the cusp is k-irregular unless k/2 is an even integer; if t = 1, then it is k-irregular unless k/2 is an integer; and if t = I, then it is always k-regular. In the case when /02 is an odd integer, the terminology agrees with the definition of regular and irregular cusps in Problem 2 of §1.11.3. -

-

—

Definitions. Let k be any integer, and let F' c F0(4) be a subgroup of finite index. Let f(z) be a meromorphic function on the upper half-plane H which is invariant under [nk/2 for all j) E ff. We say that f(z) is a /nodular function of weight 1(12 for r/ if f is meromorphic at every cusp of I'. We say that such an f(z) is a modular form and write Je Mk/2 (r/), if it is holomorphie on H and at every cusp. We say that a modular form f is a cusp form and write fe Slog"), if it vanishes at every cusp.

183

§1. Definitions and examples

rouso c To (4).

Let x be x) denote the subspace of

Now let N be a positive multiple of 4, so that

a character of (ZPVZ)*. We let Mk/2 (ro (N), -44/2(r (N)) consisting off such that for y = (c" bd) e FAN)

(1.7) fiEflka = x(cOf. We define Ski2 (ro (N), x) = Sql2 (f i (N)) n 402 (f" 0 (N), x). The exact same argument as in the integer weight case shows that

Mk/2 (f'l (N)) = Ox Mk/g0 (MI x). Furthermore, it follows immediately from the definitions that if x i and X2 are Dirichlet characters modulo N, then E Mki/2 ( Ir0 (N),

xi)

(i = 1

5

2)

implies

f1 f2

J J

-

(IC

j+k2)12(f 0( N ) , Z1Z2) .

(1.8)

Notice that for any k e Z, we have Mk/2(fro (N), x) = 0 if x is an odd character, as we see by substituting — 1 for y in (1.7): f[((1 Wko f = x( Of For example, since the trivial character x= 1 is the only even character of (Z/41)*, this means that

Mk/2 (fAl (4)) = Mk/2 (to (4), 1) = mk,2(t0(4))If 4IN, recall that x_i denotes the character modulo N defined by x_ i (n) = ( 1) n-0/2 for n (ZI NZ)* . Notice that in our definitions k is any integer, not necessarily odd. For k even, let us compare the above definitions of Mk/2 (0, Sic/2(t '), Mk/2 (f-0 (N) X), Ski2(to(N), x) with the definitions of Mk/2(F /), S12 (r'), Mk/2 (N, )01 Sk12 (N, x) in Chapter III. First, for any = (a, (z)) e G, cc = (a, ',;), 4i(z) 2 = t(CZ j SO d)/.,./det cc, note that f(z)IM k/2 = 4)(Z)- kfoŒ) = rk/2f(z)i [Œ1k12 that Mk/2 for kl2 e Z differs only by a root of unity from the operator [a]k/2 defined in the last chapter. It immediately follows that for kl2e Z the cusp —

condition defined in the last chapter is equivalent to the cusp condition defined above. There is a slight difference, however, between the condition that f be invariant under [y]k/2 for y = eic r" and the condition that f be invariant under []k/2. Namely, firin-111/2 =i ( y, z) - kf(yz) = (fri)(cz dn -ki2f0z)) (7-61 1 )1q2f [Y]k/2 . Thus, if 102 is odd, then M k12 differs from [y],02 by x_, (d). From this discussion we conclude Proposition 3. Let 41N, kl2e Z. Then Mk 1 2 (f‘O(N) ,

= mk12(1V, eix),

sk1 2 r-0(N), X) = Ski2(N (

It is now not hard to describe the structure of Mki2 (ro(4)). If we have a polynomial P(X1 , , X„,)e C[X, , . . . , X.] and assign "weights" wi to Xi, then we say that a monomial ri Xi has weight w = Z ni wi , and we say that a polynomial P has pure weight w if every monomial which occurs in P with nonzero coefficient has weight w.

184


Proposition 4. Let 00(z) = , qn2 , F(z) = Z n>0 odd at OW q = e2ni2 Assign weight 1/2 to O and weight 2 to F. Then Mk/2(f0(4)) is the space of all polynomials in C[0, F] having pure weight kl2. PROOF. In Chapter III we found that for k/2 e Z, the space Mko (N, consists of polynomials in 0 and F having pure weight k/2 (see Problems 17-18 of §III.3). This gives Proposition 4 when k/2 E Z. Next, by the definition of [13] k/2 , we have OID1 1/2 = Co for y e f0 (4). 0 is holomorphic at the cusps, because we checked holomorphicity of 0 2 at the cusps, and if O had a singularity at s, so would 0 2 . Thus, Cs € M112 (f0 (4)). It now follows by (1.8) that any polynomial in O and F of pure weight kl2 is in Mk/2 (f0(4)). Conversely, suppose that fe Mk/2 (Î o (4)), where k is odd. Then f 0 is in aPk+1)/4 e2P(0 , F) , M(k + 1)/2 ( r0(4)), and so can be written in the formf where a E C is a constant which equals O if 4)(1 k + 1 and P(0, F) has pure weight (k — 1)/2. Then f OP(0, F) = aF(')1410 e Mk/2 (t0 (4)). But because 0 vanishes at the cusp —1, while F does not, this form satisfies the holomorphicity condition at --12- only if a = O. Thus, f= OP(0, F), and the proof is complete. Corollary. Dim Mk12(f210( 4 )) = 1 + [k/4]. PROBLEMS

1. Let y = '1)e ro (4), and let p = ( ( ), m -14). Check that p e G. (a) Compute pp'. (la) If ye f0 (4m), then y/ = 7)y ?) -1 is in r0(4). Compare piip -1 with .ç)i . Show that they are always the same if m is a perfect square, but that otherwise they differ by a sign for certain y e F0(4m). (

2. Let y =

'De r0(4), and let p = N '14,1i). Check that p E G. (a) Compute (b) Suppose that 41N and y e fo(N). Then yl = (g, 1)y( ,( ,, 1) -1 E ro (N). Compare pijp - i with 55 f .

3. Find h and t for each cusp of r0(4). 4. Let r c f0 (4) be a subgroup of finite index. Let /02 e Z. Show that if c I-, (4), then Mk/2 (f") = Mk/2 (r). Otherwise, let x be the unique nontrivial character of n r 1 (4); show that m„12(r) Mk/2(r', e2 ) in the notation of 011.3 (see the discussion following Proposition 27). 5. (a) Describe Ski2 (r0 (4)), and find its dimension. (b) Show that for k 5, the codimension of Ski2 (t0 (4)) in Mk/2 (r0(4)) is equal to the number of k-regular cusps. (c) Find an element E anqn in S1 312 (r0 (4)) such that a l = I and a2 = 0 (there is only one).

6. Let 4IN, and define xiv by kv (d) = (1) if g.c.d.(N, d) = 1, xN(d) = 0 if g.c.d.(N, d) > 1. (a) Show that xN is a Dirichlet character modulo N.

. §2. Eisenstein series of half integer weight for r0(4)

185

(b) Let p be as in Problem 2 above, let x be an arbitrary Dirichlet character modulo N, and suppose that fe Mkigo(N), z). Show that fj[P]ko e Mk12(f-0 (N), i.x ki ) •

ro (4 ).

7. Find the value of 0 G Milgo (4)) at the three cusps of

§2.

Eisenstein series of half integer weight for

r0(4)

Recall the functions Gk (z) and Ek (z) in Mk (F) that we defined in §I11.2 for k > 4 an even integer. We set Gk (z) = E(niz + n) -k , where the sum is over m, n e Z not both zero; and then Ek (z) was obtained by dividing by Gk (ioo) = Mk). Alternately, we can obtain Ek (z) by the same type of summation Z (mz + n) -k if we sum only over pairs m, n which have no common factor (see Problem 1 of 011.2): -

Ek (z) -_-_-_ E

i

(2.1)

Om + n)k'

where the sum is over m, n e Z with g.c.d.(m, n) = 1 and only one pair taken from (m, n), ( in, n). In §111.2 we obtained the g-expansion —

—

Ek (z) = I —

2k '

1 =....... e 2itiZ

4

k n=1

We now want to give an analogous construction, with k replaced by a half-integer k/2 and F replaced by F0(4). To do this, we give a more intrinsic description of the sum (2.1). Notice that if we complete the row (m, n) to form a matrix y = (4,1 bn) e F, which can be done because g.c.d.(m, n) = 1, then the summand in (2.1) is the reciprocal of the automorphy factor Jk (y,9 z) for functions in Afk (F). That is, such functions f satisfy : f(yz) = .1k (y, z)f(z) for .4(y, z) = (mz + n)". In (2.1) we are summing litik(y, z) over all equivalence classes of y e F, where Yi ?--, y2 if y1 and y2 have the same bottom row up to a sign, i.e., if y2 = + a. Dy1 for some +G; D E For, . In other words, we may regard ik (y, z) as a function on the set Fcc\F of right cosets, and then rewrite (2.1) as follows: E k(Z) =

E Jk (y, zy l 9 y E rcAr

k > 4 even.

(2.2)

It is now possible to give a proof of the invariance of Ek (z) under [yjk for Yi e F which easily generalizes to other such sums of automorphy factors. Namely, note that, by the definition of [Y1]k, we have

Ek(z)I [Yi lc ------- .4 0'19 zY

= Jkoi,

l Ek(vi z)

zri

E Jk(y,

y e roAr

yizr

= E Jktyvi, zr, ye

foAr

because of the general relation (1.1) satisfied by any automorphy factor.


186

But right multiplication by yl merely rearranges the right cosets F.y. Thus, the last sum is just a rearrangement of the sum in (2.2) for Ek. Since the sum is absolutely convergent for k > 2, we conclude that Ek I [yj k = Ek. We now mimic this procedure for half integral weight k/2 and the group f0 (4). In order to have absolute convergence, we must assume k/2 > 2, i.e.,

k > 5. Note that we obviously have F0 (4). = F. = { + ( - 1)}. Definition. Let

k

be an odd integer,

k > 5.

E

Eko (Z) =

Ay,

z)_ ".

(2.3)

7 e roD\ro( 4)

) e F0 (4) for each As representatives y of F.VE0 (4) we take one matrix (in, n) with 41m, g.c.d. (m , n) = 1 (in particular, n is odd), and we may specify n> 0 so that we get only one of the pairs + (m, n). Thus (a m

b n

a b Ek/2 (Z) = , .i E m,n e Z, 4Im,n> 0, g.c.d.(m,n)= 1 (( m n) zY k Substituting the definition of j(y, z) and noting that (-7:) -k = (v), since k is odd, we obtain Elo (z)

=

E

(—m)Einc(inz

41m, n> 0 g.c.d.(m,n)=1

n

+ n)

2,

k > 5 odd.

(2.4)

mk/2 (r0(4)) now follows by the exact same argument The fact that Ek/2 - — . as in the case of the Eisenstein series of integral weight for 1" which we considered above. Namely, Ek/2(z)i [MI0 = j(yi , z)_ kEk12 (y i z), which just gives a rearrangement of the sum (2.3). Finally, the cusp condition is routine to check, so we shall leave that as an exercise (see below). Unlike the case of 1", where there is only one Eisenstein series of each weight, Ek/2 (z) has a companion Eisenstein series Fki2 (z), defined by Fk/2 = Ek/2

That there are two basic Eisenstein series for each half-integer k/2 is related to the fact that F0 (4) has two regular cusps (see Problem 3 of §1 and the discussion at the very end of §III.5). To see that Fkl2 e Mk/2 (f0(4)), we apply Problem 6 of §1 in the case N = 4, x = 1. To write Fk12 more explicitly, we compute

Fk/2 (z)= (2z) -4/2 Ek/2( — 1/4z ) =

242

(4z)kI2

m)r_k

4Im1 . n> 0 g .c.d .(nz, n)= 1

I

n - n (— ml4z + n) k/2

1 = 2k/2 "S" (M) ek 4-4 n " (— m + 4nz)k12'

187

§2. Eisenstein series of half integer weight for fo (4)

where we have used the fact that n > 0 to write ,i4z \i—m/4z + n = —m + 4nz. We now replace m by —4m, so that the sum becomes a summation over all me/ with g.c.d.(4,n, n) = 1, i.e., n odd and g.c.d.(m, n) = 1. We obtain

1 n J F(nz + m)kI2 • IC k

E m,ne 7

Fic/2(Z) =

n>Oodd

(2.5)

g.c.d.(m,n)=1

We next compute the g-expansions of Ei02 and Fki2 . We proceed much as in the computation of the q-expansion of Eit in §111.2, except that there are added complications. The derivation of the g-expansion for Fie/2 (z) turns out to be slightly less tedious than for Ek12 (z). Hence, we shall give that derivation, and omit the analogous proof of the g-expansion formula for Ek/2 (z). To compute the g-expansion of Fie/2 (z), we start with the following relation, which was proved in Problem 8(c) of §11.4 for any a> 1, Z E H (where we have replaced a, s by z, a here): co 00 6,2nit2. E e—nial2r(a)-1 (2.6) E (z + h) ° (Note: for z e H we define z" = ea log z, where log z denotes the branch having imaginary part in (0, n) for z e H.) Since the sum (2.5) for Fk12 (z) is absolutely convergent, we may order the n I and write m = —j + nh, terms as we please. If we let j =-- 0, 1, . h eZ, we obtain Fk I2(z)

n>0 odd

= 2 —k/2

(nz

E

2-142

0 j 0 odd -0 <j 0 odd we let 26.(f) denote (7). Thus, if j is a prime, it is L(J) which tells us whether j splits, remains prime, orj.amifies in CP(jn) (i.e., this depends on whether xpi(j) equals 1, — 1, or 0, respectively). There is a unique character, also denoted x„„ which has 1 mod 4 and conductor 14ml if m 2 or 3 mod 4 and conductor iml if m which agrees with zni (j) = C9 for j > 0 odd. We can express x„, in terms of the usual Legendre symbol as follows:

188

IV. Modular Forms of Half Integer Weight

if m

E---

1 mod 4;

if m -a-- —1 mod 4; if m = 2m0 za-- 2 mod 4, where we define (51--) = ( —1)( i-1)12 for j odd and (il) = 0 for j even, and (4) = (— 0°2-1)/8 for ] odd and () = 0 for j even. i Note that Z.( — 1) = 1

if m > 0,

and ;Cm( — 1) = —1

if m < 0. (2.8)

Recall that we de fi ne A = (k — 1)/2.

has q-expansion coefficient bl which for I squarefree is equal to L(4_ 1)).1 , I — A) times a factor that depends only on A and on 1 mod 8, but not on I itself. This factor is given in (2.16) below. Proposition 5. The Eisenstein series

Fic/2

We first prove a lemma.

Lemma. Let n = no n?, where no is squarefree. Then for I squarefree, the inner sum in (2.7) vanishes unless n 1 11, and f n i ll it equals

1) ---an(no ,Ino p(ndn i ,

(2.9)

wherits to 0 if n i is not squarefree and equal to Al5bius function (equal (-1)" if n 1 is the product of r distinct primes). First, if n 1 = 1, then (Tia) has conductor n = no . If / has a common factor d with n, then, combining terms which differ by multiples of nld, we see that the inner sum in (2.7) is zero. On the other hand, if lealnlr, then making the change of variables f = —/j leads to (-7,-,E)E (De'afin . This Gauss sum is well known to equal snjti (see, e.g., §5.2 and §5.4 of [Borevich and Shafarevich 1966]). This proves the lemma when n 1 = 1. We next show that the inner sum is zero if g.c.d.(no , n 1 ) = d> 1. Note that PROOF OF LEMMA.

1 (4)

if g.c.d.( j, n) = I,

[0

otherwise;

and it follows that (1,) --z-- (--/--d. 2 ) • Combining terms with j which differ by multiples of nId2 and usingnthe fact that c/2 4'/ (since 1 is squarefree), so that If , e -27ti1( j+j'n1d2 )/n = 0, we find that the inner sum is zero in this case.

189

to(4)

§2. Eisenstein series of half integer weight for

Now suppose that n 1 > 1 is prime to n o . Let p, run through all primes dividing n,: n, = 11/4v. Set qv = pv; and let bv(j) = 0 if pd./ and (5,,( j) = I if pv )fr f. Then (-,4) = (54) II,, (VD. Any JE ZinZ can be written uniquely in the form

j =ion?. + >in/q,

0 -...ily 0, is the discriminant of a quadratic field and h) is the corresponding character, then the ID1-th g-expansion coefficient of 1412 is equal to L(x D , 1 — 2). IAl2 ra C( 1 — 22)(Eko

This proposition is due to H. Cohen [1975 ]. (His notation is slightly different from ours.) It can be viewed as a prototype for the theorem of Waldspurger-Tunnell which we shall discuss later. The WaldspurgerTunnel! theorem exhibits a modular form of weight 3- (think of A as being equal to 1 in Proposition 6) for F0(128) such that the square of its n-th g-expansion coefficient is a certain nonzero factor times L(E„, 1), where L(E, s) is the Hasse-Weil L-function of an elliptic curve E (see §I1.5) and E. is the elliptic curve y2 = x3 — n 2 x. Thus, the g-expansion coefficients of this particular form of half integral weight are closely related to all of the critical values 4E., 1) which we encountered in Chapter IL If we were allowed to take 2 = 1 in Proposition 6, then the ID1-th qexpansion coefficient would be L(XD, 0), which differs by a simple nonzero

194


factor from L(x D , 1) (because of the functional equation), and is essentially equal to the class number of the quadratic imaginary field CP(NA (here = 1 or — 40. We will say more about the case A = 1 below. For another proof of Proposition 6, using so-called "Jacobi forms," see [Eichler and Zagier 1984]. We now discuss some interesting consequences of Proposition 6. Since = any element 1-4/2 e -M - /c/2(T0(4)) can be expressed as a polynomial in - 00 qn2 and F = En> 0 odd al OW (see Proposition 4), it follows that there are formulas expressing L(x D , 1 — .1.) in terms of i (n) and the number of ways n can be expressed as a sum of r squares. We illustrate with the simple case k = 5, /1 = 2. in this case M512 (t0 (4)) is spanned by CO and OF, and a comparison of the constant coefficients and the coefficients of the first power of q shows that H5/2 — 1 100 5 — *OF (see the exercises below). This yields the following proposition. —

Proposition 7. Let sr(n) denote the number of ways n can be written as a sum of r squares thus 0 5 =--- E s5 (n)q". Let D be the discriminant of a real quadratic field, and let XD be the corresponding character. Then 1

L(xD, 20s5(D) 1 = 17l)

D- j2 odd

Many other relations of this sort can be derived, in much the same way as we used the structure of Mk (F) in the exercises in §111.2 to derive identities between various ar (n) . For details, see [Cohen 1975 ]. These relations can be viewed as a generalization to A > 1 of the so-called "class number relations" of Kronecker and Hurwitz. (For a statement and proof of the class number relations, see, for example, Zagier's appendix to [Lang 1976] and his correction following the article [Zagier 1977 ]. ) The class number relations correspond to the case À. = 1, i.e., L(x D , 1 — A) = L(XD, 0). But A = 1 falls outside the range of applicability of Proposition b. To get at the class number relations from the point of view of forms of half integral weight, one must study a more complicated function H3/2 which transforms under F0(4) like a form of weight but which is not analytic. The IDI-th coefficient of H3/2 is essentially the class number of the imaginary quadratic field O(Ji75). The study of 113/2 is due to Zagier [19754 In some sense, the situation with 1/312 is analogous to the situation with E2 in §111.2. In both cases, when we lower the weight just below the minimum weight necessary for absolute convergence, problems arise and we no longer have a true modular form. The study of E2 and H312 is much subtler than that of Ek and 14/2 for k> 2 and k/2 > 2, respectively. But in both cases the Eisenstein series, though they fail to be modular forms, have a special importance: we saw that the functional equation for E2 led to the functional equation for the discriminant form A(z) e S1 2(n ; and H312 has as its coefficients the class numbers of all imaginary quadratic fields.

§2. tisenstein series of half integer weight for

r0 (4)

195

We next examine the quite different question of how the nonsquarefree q=expansion coefficients of Ek12, Flo, and 14/2 can be determined. Again we shall give the details only for Fk12 and shall omit the similar computations

for- Ek12 .We proceed one prime at a time. That is, for each prime p we express bi in terms of bio , where 1 = el° and p2 4". Finally, combining the different primes p, we shall be able to express 1)1 in terms of bio when / = 44 and 10 is squarefree. This information can be conveniently summarized in the form of an Euler product for Er, bioliti- s, as we shall see later. The easiest case is p = 2. Suppose that / = 4v 1o , 44'10 . We return to our earlier computation of 1)1 in (2.7). The general formula (2.7) is valid for all /, not necessarily squarefree. If we replace 1 by 4/ in (2.7), the double sum does not change, because replacing I by 4/ is equivalent to replacing j in the inner sum by 4f, which runs through as j does, since n is odd. Thus, 410-1 9 and so for 1 = 4v/0 we have ba, = = 2(k--2)v bio. (2.21) This case p = 2 can be summarized as follows: for any 10 , co

E

v=o

1

4v2-sv = /71 0

(2.22)

0 1 — 2k -2- s

Comparing coefficients of 2-" on both sides of (2.22) shows that (2.22) is equivaleni to (2.21). (Note that we do not actually need to assume that 44(49 .) Now suppose that p is odd. Let 1 p=2v/o, where p2 V ° . We again use (2.7) and divide into two cases. Case (i ) pi 10 . Let 10 =pio • We write the double sum in (2.7) in the form n> Ø odd Sri, where Sn is the term in (2.7) corresponding to n. Let n = nop', where p2 41 no . We fix n o and look at the sum of the Sn over all n = n0p 21 , h = ct, u 1, 2, . . . . Note that k/2 = no-kap -hk . we 6, = 8,. and nnow evaluate the inner sum overje ilni. First suppose that Ono . As representatives j of i mod ni we choose iop 2h ji no as j0 ranges from 0 to no — 1 and j1 ranges from 0 top 21 — 1. With p fixed we introduce the 'notation

=

if pij;

(2.23)

I if p Then for h > I we have p 2h

no (11) =

5(j) (it

)

;;

6(/1) (k) •

Also, e -21rign =

p I + 2(v-h)

Thus, the term Sn in (2.7) corresponding to n =-- nop

is the product of

Iv.

196

E

k – 142 e no no

(-/0 nn-

—

0 3 be an odd integer, A = (k — 1)/2, 4IN, x be a Dirichlet character modulo N. Let f(z) = 4°_, a ne' e Sic12( f0(N), x) be an eigenform for 7p 2 for all primes p with corresponding eigenvalue A p : 1,2f = Apf. Define a function g(z) = 4'1_ 1 be 2 " by the formal identity co 1 E b a n -s = 11 (4.1) + x(p)2pk-2— 2s • n=1

alip 1 — A pp —s

Then g e Mk-i(N I , x2) for some integer N' which is divisible by the conductor of x2 . If k > 5, then g is a cusp form.

Notice that the definition (4.1) of the b„ is equivalent to the following: (1) b 1 = 1; (2) bp = 2, for all primes p; (3) bpv = 2bp v-i — x(p) 2pk-2 bp v-2 for y >. 2; (4) b,,,,, = bn,b, if m and n are relatively prime. In Shimura's original theorem, the determination of the level N' of g was a little complicated. However, it has since been shown ([Niwa 1975]) that

-

§4. The theorems of Shimura, Waldspurger, Tunnel!, and the congruent number problem 213 one can always take N' ----z NI2. It should also be noted that Shimura actually - proved a somewhat more general theorem, applying to f which are not necessarily eigenforms for all of the Tp2. As a simple numerical ex-ample, suppose N =--- 4, x is trivial. The first nonzero cusp form of half integer weight occurs when k = 9, 11 = 4 (see Problem 5 of §IV.1). Up to a constant multiple, it is f = ORO' — 16F) =--E awl", where we have chosen f so that a l ---7. 1. (By Problem 17(h) of §I11.3 this f is also equal to q 12 (2z)/0 3 (z).) Clearly, f is an eigenform for all 7p 2, because S912 (r0 (4)) is one-dimensional. Then Shimura's theorem holds with N' = 2. Now Ss (f0 (2)) is one-dimensional and spanned by the normalized form g(z) = (n(z),1(2z)) 8 = ql-In"=1 (1 — q") 8 (1 — err ), 8 = E bnqn (see Propositions 19 and 20 in §III.3) ; hence this g(z) must be the g(z) in the theorem. It is now easy to relate the coefficients bin of g to the coefficients an off: Namely, using (3.7) with /0 = / 1 =-- 1, = 4, and noting that a1 = 1, x' = x (this is the trivial character mod 2, which equals 1 on odd numbers and 0 on even numbers), we obtain: bA =

=..- a P 2 + p3

if

p > 2;

b2 = a,.

(4.2)

While (4.2) follows immediately from Shimura's theorem, it is nevertheless quite a remarkable numerical identity : the p-th coefficient in q n (1 — q")8 (1 — q 2") 8 is equal to p 3 plus the p 2 -th coefficient in q 11 (1 — 47 2n) 1 2/(E q)3! Like many numerical relations that follow from the theory of modular forms, this fact looks rather outlandish when stated in this elementary form without the theoretical context. If we have a fixed set of linearly independent forms f in Sko (r0 (N), x) which satisfy the hypotheses of Shimura's theorem, then we can extend the Shimura map by linearity to the subspace of Skj2 (r0 (N), x) spanned by them. Note that the image gi off is always a normalized eigenform in Sk_ 1 (N/2, x 2). If we take another set {f'} of forms which satisfy Shimura's theorem and are also a basis for the same space as thef (for example, if we multiply each f by a scalar), the Shimura map is clearly affected. When we refer to the image of a single eigenform under the Shimura map, we shall always mean the normalized eigenform g in Shimura's theorem. But if we have a space of modular forms with fixed basis f of eigenforms with Shimura(f) = gi , then we define Shimura(/ aifi) = I aigi , which is not necessarily a normalized eigenform. So our meaning of "image" or "preimage" under the Shimura map depends upon the context. In general, it is possible for several different fe Ski2 (r0 (N), x) to go to the same g e Mk_ , (ro (N), x 2). For instance, there might be anf' e Sk12(fO(N) , x') which corresponds to g, where x' is a character different from x which has the same square: x 2 = x'2 . However, when N = 4, Kohnen [1980] described the situation more precisely. We now describe his main result. Let Mk+12 (r0 (4)) denote the subspace of Mk/2 (r0 (4)) consisting of f(z) = Eane for which an = 0 whenever (— 1)'n - -. 2 or 3 modulo 4. It is certainly a priori possible that there are no nonzero forms f with this property, which

214

IV. Modular Forms of Half Integer Weight i

requires that "half" its coefficients vanish. However, we know from Proposition 9 that 14/2 is such a form. Also, it is easy to check (using Proposition 17(a) of §I11.3 and (1,.8) of §IV.1) that 0(z) f(4z) e /1/42 (r0 (4)) for any fE M(k-1)12(SLi(Z)). It turns out that Aik2 (ro (4)) is the direct sum of the one-dimensional space spanned by the Eisenstein series 14/2 and the subspace Sk+12 (4)) of cusp forms:

(ro

,

/

sk412(r0(4)) = {f=

E anqn E Skigo( 4))1an = 0 if (-1)'n -m 2 or 3 mod 41. (4.3)

It is not hard to show that Mk+/2 (f0(4)) is preserved by all of the Hecke operators Tp2 except for T4. According to Shimura's definition of T4, one has T4 I ant? = E a4„q". If we look back at §IV.2, we see that Flo is an eigenform for T4 (with eigenvalue 2 2 , see (2.21)), but Ilk/2 is not. In fact, it is easy to see that T4Hk/20 Alk+i 2 (ro (4)). For this reason, Kohnen modifies' T4, and defines a slightly different operator TI so that the m-th coefficient of T: 1 anqn for (-1)2m -a 2 or 3 mod 4 is zero and for (— l )m a 0 or 1 mod 4 is equal to

a4„, + x(_..1)Am(2)2'ia,, + 2k-211"m14• With Shimura's definition, since we have x(p) = 0 when pIN even if x is the trivial character on (ZINZ)*, the second and third terms vanish in

a4m + XX(-1)Ani( 2) 22-1 ain + X(4) 2k -2am/4. Thus, Kohnen's modification is to replace the trivial character x by the map which takes the value 1 (and never 0) on all numbers including those not prime to N. It is T-4,- rather than T4 which preserves M 2 (F0(4)) and Sk4i2 (f0 (4)). Note that Hk/2 is an eigenform for T: with eigenvalue 2 2 . Kohnen further shows that M4-2 (r0 (4)) has a basis of eigenforms for all of the Tp2 (p 0 2) and for n which is unique up to permutation of the elements and scalar multiplication. There is no obvious way to normalize an eigenform f= E anqn ; for example, we cannot necessarily multiply by a scalar to get a l = 1, since a l = 0 for all fe 47/2 (f0 (4)) if .3. is odd. But one can require that the coefficients all lie in as small a field extension of 0 as possible. It turns out that the images g of these eigen-basis forms Je Sk4i2(r0(4)) under the correspondence in Shimura's theorem are all contained in Sk_i(F), r = 51,2 (Z) (the results of Shimura-Niwa only guarantee that they are in Sk-1( r0(2))) ; they are all distinct; and they form a basis for Sk_i(r) consisting of normalized eigenforms for all of the Hecke operators T„ acting on Sk _ 1 (F). (In particular, dim Sk41-2(r0(4)) = dim Sk_i (F).) Thus, if we take each of our basis elements for S4-2(r0(4)) to its image under the Shimura map—and take - p I/42 to the normalized Eisenstein-series -i---1-1 —and then extend by linearity to all of Afk4i2 (r0 (4)), we obtain an isomorphism from M4-2 (r0 (4)) to Mk_i (F) (and from Sk,2 (r0 (4)) to Sk _i(fl). This isomorphism commutes

§4. The theorems of Shimura, Waidspurger, Tunnell, and the congruent number problem

215

with the Hecke operators, in the sense that:

Mk4i2(r0 (4)) TP2

f

g E Mk -1(r)

I I

TP

and

nI

g

I

T2

T44-fi--+ T2 g

Ti,g

The basic method of Shimura's proof of his theorem was to use Weil's theorem which we discussed briefly at the end of §III.3. Weil's theorem says that if E bnn and its "twists" E bn O(n)n' for certain Dirichlet characters each satisfy the right type of functional equation relating the value at s I s, then g =. E bn iqn e Mic _i(Fo(N /), x 2). But the proof to the value at k that all of these functional equations are satisfied is not easy ; about twenty pages of [Shimura 1973a] are devoted to an investigation of delicate analytic properties of the Dirichlet series corresponding to g and its twists. Shimura's correspondence seems rather roundabout. It says : take the q-expansion of a suitable Je Sk/2 (r0 (N), x); look at the q-expansion coefficients a„ as n varies over integers with fixed squarefree part /0 , and form a Dirichlet series from them which turns out to have an Euler product ; then take the part of this Euler product which is independent of /0 , and expand it and finally, go from this new Dirichlet into a new Dirichlet series Eb n series to the q-expansion E bn qn, which will be your modular form of integral weight. After Shimura's paper appeared, people started looking for a more conceptual, less roundabout construction of the Shimura correspondence. Certain more direct, analytic constructions were given by Shintani [1975] and Niwa [1975 ]. In addition, group representation theory was found to provide a conceptual explanation of this correspondence (see [Gelbart 1976], [Flicker 1980]). Moreover, the use of representation theory has led to striking new results about forms of half integer weight, especially in the work of J.-L. Waldspurger. Using representation theory, Waldspurger [1980, 1981] proved a remarkable theorem establishing a close connection between critical values of 1 e 21 and the coefficients in 4-series for a modular form g of weight k the q-expansion of a form f of half integer weight 10 which corresponds to g under the Shimura map. Roughly speaking, the theorem' says that the critical value is equal to the square of a corresponding q-expansion coefficient times a nonzero factor which can be explicitly described. Waldspurger's general result is complicated to state, so we shall \ only describe what it says in two particular situations. As mentioned before, Kohnen [1980] showed that the Shimura map gives an isomorphism —

—

—

ra s2 (f(4)) Shimu --° Sk-i (n.

(4.4)

Here S 2 (r0 (4)) is defined in (4.3). Let g(z)=-- E bonqn e Sk _ i (f) be a normalized eigenform for all of the Hecke operators, and let xi, be the character corresponding to the quadratic field of discriminant D. Suppose that

216


(— 1)'D > 0, i.e., the quadratic field is real if A = (k — 1)/2 is even and it is imaginary if 2 is odd. Recall that Lg (ID, s) denotes the analytic continuation of Enap - i XD(n)bnn -8 (which can be shown to converge absolutely if Re s> k12). Let f(z) = Zan g?' e S4-2 (r0 (4)) be the unique preimage of g under the Shimura map (4.4), i.e., g - 7---- Shimura(f ). Let )122Di.

Lig (XD, /1.) = (11)1) (k — 1)! k2 , then the mapping is easily seen to be a character of (Z[i]/D*; if k 2 > k 1 , then def X(X) =f i(OXk (NX)k l is a character of (Z[i]/j)*. In, for example, the case 1( 1 > k 2 , the Irecke L-series is then .1 "E 0 x e ZVI 2(X)(N1 xrs E xk x(x)(NV2 - s. 5. (a) Make the change of variables x' = Mx to obtain 4(y) = Sure-216M 2 Y f(e) idde :`,;41 — -frfiFifif(M *y), since Ats i x' • y = x' • M*y. (b) L' M*Z" ; let g(x) =f(Mx). Then rr g(m) = I mEz nj(rt) ide!mi EyL,./(y) by part (a). Exez.f(x) = 6. (a) M* = -A-(1 ?); considered in C, L' = Note that Tr xy = 2x - y, where on the left x and y are considered as elements of C, and on the right as elements of R2 . (b) In Problem 5(b), let f(x) = e - '11'12 , g(x) = f(Mx) with M as in part (a); then apply Poisson summation. (c) Let 0(t) be the sum on the left in part (b), and let 4)(s) = 1`2,,rst 5(0(t) 1) 1L + 0(0(0 —')1'-. For Re s > I show th a t

Answers, Hints, and References for Selected Exercise

230

4)(s) = (20)sa + 1 10 + n -s1"(s)6CK(s). The residue at s == 1 is n/3f."3. Finally, replacing t by 4/3t in the integrals for ck(s) leads to the relation: 0(s) = (2/j) 21 4(1 s), and this leads to A(s) = A(1 s) for A(s)rf (07270s1"(s)4(s). (d) Use the Euler product form K(S) fl p(1 (INP) -s) -1 . For ea prime ideal P of norm p 1 (mod 3), the contribution from P and P is (1 - sr 2 ; for P = (p), where p 2 (mod 3), the contribution is (1 - p -2T' = (1 (l - X(P)PT"' ; and for P (NI -3) the contribution is (1 - X(3) 3-T"' (since x(3) =-- 0). So the Euler product is the Euler product for “s) times the Euler product for L(x, s). (e) Multiplying the functional equations for C(s) and for L ( c, s) gives A(s) = A(1 s) for A(s) = n.-42 r()C(s)(3/7 ) 42 r((s + 1)/2)L(x, = ("jinn K(s)r OW as + 1)/2) = const -0/27r)sr(s)C K(s) by (4.4). 7. See Problems 20-22 in §II.2; (d) g(x, tfr) = 3. 8. (a) By part (1) of Proposition 9, j(y) equals e'Y times the Fourier transform of (x w)e - nox'12 ; note that x w = Mx • (1, i) with M as in Problem 6(a); then proceed De zniu- y e--(47rooly•(-1„), (I)) Use as in Problem 6(b) to obtain 4" (y) = 34t y . n-sr (s)6L(E, s). (d) Replacing t by IF in the Problem 7(c) to obtain 0(s) = eh where the integrals in 4 (s), one obtains 4)(s) = (-1)s -1 Ex(a + ba))14T 2t summation is over 0 a, b < 3 ; u = (a13, b13); and ) e2g1"•rne-4tInl'( '' 1) I 2 . Then for Re (2 s) > 3/2 use (4.6), OU(t) = E„, z2 tn • ( Probrm 2 of §11.2 (note that u • in = iTr((a + bco)(- &m 1 m 2)1i0)), and the evaluation of the Gauss sum in Problem 7(d). The result is: 0(s) = (4/3)5-1 370 -2 1-(2 s)6L(E, 2 s). Equating this with the expression in part (b) and t 12

1/

collecting terms gives the desired result.

01.6 2. The function f(s) = A(1 + s) is even in the first case (so that its Taylor expansion at s 0 has only even powers of s) and odd in the second case. 3. (c) Use: •(3-1, and 2ap = ap 'ercp .

5-6. n

first few nonzero kn,„ 2 b 1 = 1, b 5 = 2, b9 = -3 3 b 1 = 1, b5 = 2, b 13 = 6, b17 = -2 10 = 1, b9 = -3, b 13 = 6, , l7 = -2

1,(E„, 1)

0.92707 1.5138 1.65

7. You want 14 4. 1 1 to be less than c/2, i.e., c/2 > 4(1---

remainder estimate

0.00027 1R25 1 0.00123 1R29 1.ç.. 0.289 e m+iy,fir ; for 1R131 ...ç.

N' e - gml`51- , so choose M> N ' 1og(8,1Thre) -71-NTAT; log n, i.e. (2n N/2/7r)log n for n odd, (2n/n)log n for n even. 8. (a) Use Problem 3 to find the b„,,,,. (b) See Problem 7 of §I.1. large n the right side is asympotic to

1. Te 1-1 (N) c ro (N ), but STS -1 F0 (N); hence, neither I-, (N) nor Fo (N) is normal in r. 2. (a) Given Ac SL 2(ZI1VZ), let A be any matrix with A = A mod N. Find B, CESL 2 (Z) such that BAC is diagonal: BAC = (g ?,), where ad =' det A 1 (mod N). It suffices to find A= re 'SI) with determinant 1, since then SL 2 (Z), and B -1 B -1 (.4 1)0C -1 = A 71(mod N). To find A so that 1 = det 1 = 1 + ((ad - 1)/N + xd)N - zN 2 , first find x so that xd -(ad

1)/N (mod N), and then choose z so as to make det /i= 1.

231

_Answers, Hints, and References for Selected Exercises

3. (a) (q 2 — 1)(q 2 — q); (b) q(q 2 — 1). 4. (b) Since the kernel in part (a) has p4(`'' ) elements and #GL2 (7L/pZ) = (p 2 — 1)(p2 — p), it follows that #GL 2 (11peZ)= p4e-3(p2 1)(p — 1). (c) Divide the answer in (1)) by (Ape) to get p 3e-2( p2 _ 1) - 5. Use the Chinese remainder theorem. 6. N 3 rIppia — P -2 ). 7. N 3 flpiN (1 — p -2); N; N11,0,(1 — p -1 ); N 2 1-1piN (1 — 1) -1 ); Nri piN (1 + if') 8. The image of F(N) under conjugation by (/?; 1) is the subgroup of F0 (N 2) consisting of matrices whose upper-left entry is r--- 1 (mod N). 10. and 14(b). See Figures A.1—A.2. 12. Besides F and F(2), there are four, namely, the preimages of the three subgroups of order 2 in S3 and the one subgroup of order 3 under the map 1" —0 SL2 (Z/2Z)•:-., S3. (1) r0(2) = {(t, :) mod 2), F0(2) = the right half of F(2); (ii) r(2) = {(: :) mod 2 } , F°(2) = Fu T -1 Fu SF; (iii) 0(2)-.----. {(ccf bd) E..- I or S mod 2 } , fundamental domain = Fu T' Fu T' SF; (iv) {C. bdr-1- I, ST or (ST) 2 mod 21, fundamental domain = Fu T' F. 13. (a) This can be proved using the fundamental domain, as in the proof of Proposition 4 in the text. Here is another method. Let G denote the subgroup of 0(2) generated by +S, T2 . Clearly G C 0(2). Conversely, write g E (g1(2) as a word of the form +Sal T 1'IST b2 • • • STN, where al = 0 or 1 and bi # 0, j = 1, . . . , / — 1. Use induction on / to show that g E G. We work mod +1, so that S 2 = (ST) 3 = 1. Without loss of generality we may suppose a l = 0, bi # 0, since we can always multiply g on the left or right by S without affecting whether g e G. For the same reason we may suppose that b 1 ------- 4 = 1, since T 2 e G. Note that I= I or 2 is impossible, since T,TSTO 0(2). If / > 2, write g = TSP2 - - • . Since (STS)(TST) = 1, we have TST = (STS) - ' = ST' S, and so T 2 Sg = TSTb2 -1 S • • • , which is just like g but with b2 replaced by b2 — 1. If b2 > 0, use induction on b2 to finish the proof. If b2 < O, write ST'g = ST -1 ST b2 - • • = TSr2 +1 - - - , and again use induction on 'bd. (b) Use: r°(2) = F0(2) = STO(2)(ST) -1 . (c) Let G be the subgroup of generated by T2 and ST'S. Since G c= r(2), it suffices to show that r: G]= 6, e.g., that any "word" in S and T can be multiplied on the left by elements of G to obtain one of the elements cx; used in the text for coset representatives. Use S 2 = (ST) 3 = 1 and induction as in part (a). (d) Use part (c) and the isomorphism in Problem 8. 14. (e) Fo( p) is bounded by the vertical lines above l and —1; arcs of circles with diameter [0, 2] and [— 2/(2p — 1), 0]; and arcs of circles with diameter [ —1/(k — 2), —1/4

r

[

(The centers of the boundary arcs are at

—1 /2, 0, 1/6, 4/10.) %

1/2 Figure A.1

232


I

1 ....Pw

...

■ ......

.......

4....

•as, .....

1

0

1/2

Figure A.2. Two possible fundamental domains for F0 (4):

I. Let F(2) be the fundamental domain for F(2) in the text. Then ŒF(2), where a = (3 1), is a fundamental domain for F0(4) = Œr(2)a -1 (see Fig. A-1). IL U__ / ocT'F is a fundamental domain for F0(4), where F is the fundamental domain for r = SL 2 (Z) given in the text, and ai are coset representatives for r modulo F0 (4). Here we have taken: a l = 1, a2 = S, a 3 = T'S, a4 = 7-2 S, a5 = 7-3 S, a 6 = ST'S. (In Fig. A.2, the number j labels Œ' F; the solid boundary arcs are centered at 1, —4, —i; the dotted arcs are centered at 0, —1, —4, --i, —4; the point P is (-7 + i0)/26.)

k = 3, . . . ,p. For example, F0 (3) is the union of the regions marked 1, 2, 3 and 4 in Figure A.2. 15. See Problems 12 and 14(c) for counterexamples. 16. (b) (i) 2; (ii) 2; (iii) 2; (iv) 1.

§111.2 1. In the sum for Gk, group together indices m, n with given g.c.d. 2. Substitute z = i in Proposition 7 to obtain: E2 (i) = 3/7r. 3. (a) Since both sides of - each equality are in a one-dimensional ilik (F), by Proposition 9(c), it suffices to check equality of constant terms. (b) Equate coefficients of q" on both sides of the ' equalities in part (a) to obtain: o-7 (n) = o.3 (n) + 120 ET.1 o-3 (j)o-3 (n — j); 110-9 (n) = —100.3 (n) + 21a5 (n) + 5040 ri',::,1 o-3 (j)o-5 (n — j); 613 (n) = 21a5 (n) — 20a7 (n) + 10080E7:4 a.5 ( j)a7 (n — j). 4 • (a) Since the left side is in S1 2 (f), by Proposition 9(d) it suffices to check equality of coefficients of q. (b)t(n) = 76556 o-1 / (n) + -a5 (n) — 0.5 ( j)o.5 (n —j). (c) Consider part (b) modulo 691. 5. Let F(X, Y) be an irreducible polynomial satisfied by X ----- E4 , Y = E6. Substituting z = i in E4 (z), E6 (z) leads to a contradiction, since E6 (i) = 0, E4 (i) * 0. 6. (b) Use part (a) with x = e- 2tt = e2ni(i), along with the relation 0 = 2(2, /)/3„, / E„.,. / (i) — 2(a1+ i)Ba+ 1 — Efz/ sria (n)q n with q = e2gi(i). '7. Use Proposition 7 and the derivative of the identity f(— 1 fz) = z kf(z). 8. (a) By Problem 7, the right sides are in m6(r) and m8(r), respectively. Now proceed as in Problem 3(a). (b) 210-5 (n) = I0(3n — 1)(73 (n) + ai (n) + 240E7:1 al (j)a3 (n — j); 20a7 (n) = (42n — 21)a 5 (n) — ai (n) + 504 E.7:1 ai (j)a5 (n — h. 9. (a) Use the fact

233


that n2 1 (mod 24) if n is prime to 12. (b) Use Proposition 9(d) to show that their 24 ±1 mod 12 q — q n) = 24-th powers are equal. (c) Ens 15 mod 12 q (n2 -1)/24 .

10. j= 256(.1 -2 + I + 12 ) 3/(.1. + .1 - 1) 2 ;j= 1728 if A = I; if A = alb in lowest terms (with a and b positive) and jeZ, that means that a4 b4(a2 + 132) 2 divides 256(a4 a2 b2 + b 4) 3, but since a, b and a2 + 112 each have no common factor with a4 + a2b2 + b 4, that means that a2 b2 (a2 + b2) divides 16, and now it's easy to eliminate all possibilities except a = b = 1. 1

§111.3 4. (a) Since F0 (N) = +F/ (N) in those cases, there is no difference between F0 (N)-equivalence and fl (N)-equivalence. (b) —1/2 is an irregular cusp for F0 (4).

5. Group

ro(p)

Cusp Index

o

co

f(2)

ro(P2)

—1Ikp, k =1, 1

p

,p 1

0

oo 2

2

—

1 2

7. (a) See the proof of Proposition 18; replace a by X in (3.11). 8. (a) Replacing z by z + 1/2 in Z e21riz"2 gives Z(-1)"e'" 2 ; meanwhile, the right-hand side is 2 E e2R1z(2n)2. (b) works the same way. (c) On the left side, the constant term is clearly zero; the coefficient of qn for p ,t n is (-2/c/Bk)ak_ 1 (n); if pin but p21e n, then the coefficient is —2k/B k times 0-k _ 1 (n)— (1 + p k-i )o- (n1p) = 0; if n = pmn o with and Ono , then the coefficient of q" is —2k/B k times (pmn o) —(1 + pk-i )o-k _ i (pm -1 n o) p ic-1 0.k_, ( pm— 2no)

>1

( 1 +Pk-i )crk i(Pm-1 ) +Pk-i crk 1(P m-2)) = O. (d) Use parts (b) and (c) with p = k = 2. (e) Rearrange the infinite product of (1 _ e2icicz -1/2)n) = 1 Ong') by writing 1 + qn = (1 — qn), getting

ak---1010)(0i 1(Pm1) -

—

-

-

(where 11 denotes 11(1 — qn)). But this equals nn2even even Mall n twice an even 11 4in • 9. See the proof of Proposition 30. 10. (a) Since n8 (z)ri8 (2z)ES8(F0 (2)) by Proposition 20, to show invariance of ri 8 (4z)/n 4 (2z) under [y] 2 for y E F0(4) it suffices to show invariance of (4z)e(z)71 4(2z) under [y] 10 for y e r0(4) ; now use Problem 9 to show this. At the cusp 00 5 we see that r/8 (4z)/q4(2z) = q11(1 q4") 8/(1 q2")4 has a first order zero; to lin even • rintwice an oddinnodd

find q(0) we apply [5 ]2 : Z -2 q 8 ( - 41z)17/ 4(— 21z) = z -2 (z/4) 4 n8 (z 14)1( —(z/2) 2 114(42)) = —411(1 — q) 8/( 1 q , which approaches --1-4 as z cc. To find the value at the cusp —1, which is equivalent to the cusp = T(— 1/2), we can apply [(I ?)] 2 , as follows: —

/78 (2z + 1)-2

(

4z

(2z + 1) 4

2z + 1) = (2z + 1)' ( 2z ) 114 2z + 1

4z (2z +

64z 2

( 1 4z

1\ 2,14

2z ) ti

=

8

8

1 77 2° (2z) 16 ri 8 (z)1 8 (4z)'

1)

1 __

2z

)

24

2Z1 ) (-

e —26

)4

—1\ 4z )1 8

1

by Problem

z1) e-2ni/6 4 (

I 2z

8(e)

234


which approaches as z oo

(b) E2 (ST -aSz) = E2 (S(- a -))= (a + E2( - a - 12") + Ir(- a - +), but E2 (--a - = E2 (-1), so this equals (a + 71) 2 (z 2E2(z) + + TN- a - = (az + 1) 2122 (z) - 6-7- (az + 1). (c) Obviously F1[7] 2 = F. Using Problems 1 and 10(b), we have -24F(z)1[ST -4 S] 2 = E2 (z)1[ST -4 S] 2 3E2(2z)l[5T -2 S] 2 + (E2 (2z) + (42+10 + 2(E2 (4z) 2E2(4z)I[ST-1 S12 = E2 (Z) + (42+1) (42+1)) -24F(z). We now look at the cusps. First, we clearly have F(co) ----= O. In evaluating FI[S] 2 , ignore terms which approach zero as z co ; then, using Proposition 7, obtain F(z)1[S] 2 "rd — 24E2(Z) — 3 (1E2 (Z/ ) 2(- E2(44)) -) = + we similarly ignore terms which approach zero: This is F(0). To find F( + 2(2z ± 1 ) -2 E2(1÷– 24F(z)i[ST -2 S] 2 = E2(z)I[ST -2 Sh 7 3E2 (2z)I[ST f 2) by Problem 1 above, and, by Proposition 7 and part (b), neglecting terms which = D2 E2(-ii approach zero, this becomes E2 (z) - 3E2(2z) + 2(2z + 1) -2 ( E2 (Z) — 3E2(2z) + z 2 E2 — ± 6E2 — "1i) --- 4E2( I)) by Problem 8(d); again applying Proposition 7 and neglecting small terms, we obtain E2(z) - 3E2 (2z) + -13- ( - 16E2 (4z) + 24E2 (2z) - 4E2 (z)) - t - 3 + -I ( -16 + 24 - 4) = (d) Since the only zero of ri 8 (4z)f 4(2z) is a -4, so that F( 1) = M 1) = simple zero at co, it follows by Proposition 18 that F(z)101 8 (4z)M 4(2z))€1110(F0 (4)) is en) = 1 + t )•(e)- • First show a constant. To get the identity, write (1 - 17 4n)/( that A(z + ilN)eS12m(ro(N 2 )), using the same type of argument as in the proof of Proposition 17(b). Then set f(z) = 6.(z) N 111A(z + jIN), and take the logarithmic derivative of the equality f(yz) = f(z) for y eFo (N2). 11. (a) Prove invariance under [A 2 for y E r0(4) as in Proposition 30. At the cusps, clearly 04(co) = 1; at 0 we have 04 (z)1[5 ] 2 = 2 2 04 ( - 1/(4z/4)) = z -2 (- z 2 /4)04(z/4), which approaches -I as z oo ; at .1- we have 04 (z)1[ST -2 S] 2 = (2z + 1) 2 04(z/(2z + 1)) = (2z + 1)2®4(_ 1/(4( - 1/4z - 1/2))) = (2z) 2 ® 4(- 1/4z - 1/2) by (3.4), but by Problem 8(a) this equals (2z) -2 (20( 1/z) - E(- 1/4z))4 = -1(2(20 -112 0(z/4) 0(z))4 = (0(z/4) - 0(z)) 4, which approaches zero as z co. (b) Use: 04(oo) 0 = F(co). (c) One can proceed directly, as in Problem 10(a). Alternately, write it as re (2z) and use Propositions 17 and 20 and Problem 10(a). An even 6.(2z) —

-

(

-

-

-

-

en\

.-

(n(z)g(2z))8778(4z ) ,

easier method is as follows. Letfi (z) 714(2z)M 8 (4z), and let f2(z) re ° (2z)117 8 (z)e (4z). In the solution to Problem 10(a), we saw that for cc = ST -2 S = ?) we havef2 = 16f1 1[4 2 . Sincefi is invariant under [y] 2 for y e F0 (4), it follows thatf2 is invariant under a -1 1-0 (4)a, which is easily checked to be equal to F0 (4); finally, since a keeps 0 fixed and interchanges the'equivalence classes of cusps oo and -1, we see thatf2(0) = 1 6f1(0) = - 1,f2( 16f, ( co) = o, f2(œ) = 145fi(—) = 1. (d) Same procedure as 10(d). (e) Use Problem 8(e). 12. Follow the proof of Proposition 19. 13. Write (q(z)11(3z)) 6 = . (z)(n(3z)17/ 3 (z)) 6 , (1(z)1(7z)) 3 = (z)(q(7 z)1,7 7 (z)) 3 . 14. Check the generators T2 and S, using (3.4) to get 0(Sz) = firk/Az); to check the cusp -1 = (TST)' cc, write 04(z)I[TST] 2 = 04(z/2)1[(1 ?)] 2 (see Problem 1). 15. (a) For y = e 170 (N) note that ocNyaV = (- `kb -`,,1N)e Fo (N), and so (f I [aN]01[Y]k = (f 1 [OENYŒV ]k)I1ŒNik = x(a)f [ŒN]k. But X(a) = TO). (b) For fe Mk (N, x) write f f+ ' + f , where f = ± efl[aN]k). (c) By 10(d), +.1q + - = + F; matrix is F(z)I[Œ4]2 = -- *4 8 (z)M 4 (2z) = (1 116); M; -(4, 1)' = C-10 4, M;(4, 1) = -12-404): 16. (a) E ak_ i (n)n -s = Zfrii _l_„1 -1 (fin)7 = Z Ent 5 = C(S 7 k)(s).

235


I: 1/2

0 Figure A.3

Sc

SB

Figure A.4

s 4. pk-1-2s)l . (b) L1(s) = n[(1 — p)(1 — Pk-1-5)r =14( 1 — ak-i(P)P(c) L1 x (s) = 11 19[0 — X( P)P - s)( 1 —

= np( 1 — X(P)ak-i(P)P -s + X(P) 21)4-1-25) -1 . 17. (a) Any point r'-equivalent to i must be F0 (4)-equivalent to one of the points aj-1 i, where 1" = Uï... 1 airo (4). In this way, find that i, (2 + i)15 e Int F' are F-equivalent to i; co and (5 + i.11)/14E Int F' are F-equivalent to co ;. the two F0 (4)-equivalent boundary points (— I + 012 and (3 + 0/10 are F-equivalent to i; and no boundary points are F-equivalent to co. (b) Follow the proof of Proposition 8, but with slightly more involved computations. Note that the three "corners" (-3 + i. 1 ) /4, (1 + i0)/4, (9 + i0)/28 are all F0 (4)-equivalent, and the sum of the angles at the three corners is 360 0 . To illustrate the elements in the integration around the cusps and along the circular arcs, let us compute (27ri)-1 f f ' (z)dz If(z) over the contour ABCDEF pictured in Fig. A-3, where we suppose that f(z) has no zeros or poles on the contour (but may have a zero or pole at the cusp 0), and we ultimately want the limit as E --+ O. Here BC is a circular arc of radius e centered at O. The element a l = (1 ?) E F0 (4) takes the arc from 'A to O to the arc from D to O. The image of B is very close to C, and so we can use the integral from D to a l B to approximate the integral from D to C. First, we use S to take the arc BC to a horizontal line between Re z = —3 and Re z = 1. (see Fig. A.4). By (2.24), we have

f : fj

.

z z) )

dz =--- is: f l i dz — k f c dz . B

Z

But the latter integral approaches 0 as E -4 0 (since the angle of the are BC approaches zero), while the first integral on the right'approaches —v o(f ) (as in the proof of Proposition 8 , but we use the map 44 = e21644 into the unit disc). Next, by

236

Answers, Hints, and References for Selected Exercises \

(2.24), we have

= _ k f B dz j A z+ 1/4

fBf/(z)dz +fDr(z)dz,, l'Brkl(tz fœlitLikIdZ

J 4 f(z)

c f(z)

JA f(z)

--, _k J /

0

j Go f(z)

dz z+ 1/4

= -k

I,

1/4

dz

.

— 2 -1-isrS)/4 Z

Similarly, using a2 = a 1) e r0(4) to take the arc DE to the arc FE, we find that the sum of the integrals over those two arcs is equal to - k times the integral of l• from (2 + i0)/28 to i. These two integrals are evaluated by taking in z between the limits of integration, where the branch is determined by following the contour. As a result, we find that the sum of the two integrals is - k times the logarithm of

1/4

1/4 --1 0)/4 (2 + i.0)/28 (-2+

•

where, if we keep track of the contour, we see that we must take ln( - 1) = - ni. Thus, (20 -1 times the sum of these two integrals is equal to k/2. For a systematic treatment of formulas for the sum of orders of zero of a modular form for a congruence subgroup, see [Shimura 1971, Chapter 2]. (c) The only zero of 04 is a simple zero at the cusp -1; the only zero of F is a simple zero at co. (d) If fE M2 (ro(4)), apply part (b) to the element f - f(cc)0 4 - 16f( - - -1) Fe M2 (ro (4)), which is zero at -1 and co, and hence is the zero function. (e)-(f) See the proof of Proposition 9. (g) Look at the value at each cusp of a012 + b08 F + c04 F2 + dF3 : f(cc) = a, f(-1) = dI163 , so a = d= 0; then f= 1(0) = (--4 - ) (-4)( -I), so c ------ -16b. (h) Let f(z) = ql 2 (2z). Since f(z) 2 = A(2z)e M12( 1-0( 2 )), we immediately have vanishing at the cusps. Let Œ = (_.1 ?). By writing 2az = -1/(1 - 1/2z), show that f i[a]6 = -f, and so fit M6 (r'0(2)). However, r0(4) is generated by T and a2. Next, since s6(r0 (4)) =C(08 F- 1604F2), to check the last equality in part (h), it suffices to check equality of the coefficient of q. 18. (b) Iffe M 1 (4, X), by subtracting off a multiple of ®2 we may suppose that f(co) = 0; then M2 (4, 1)f 2 1.--- a?q2 + - • - . But a multiple zero at co Would contradict Problem 17(b), unless f is the zero function. (c) First show that, for k _._. 5 odd, Sk (4, x) = 0 -2 Sk+I (4, 1); then, by Problem 17( 1 ), dim Sk (4, X k) --= [(k - 3)12] (where [ ] is the greatest integer function). (d) Use Proposition 20, Problem 17 (with F' = F0 (2), a = CI 7)), and part (c).

§111.4 2. Show that V- i(cz + d)/V - 4- cz - d) = rsg" (it suffices to check for z = 0), and then divide into four cases, depending on the sign of c and d. 3. (b)-(c) Use the (. 0 a/3z) 0(43z) 0(i3z) relation 0(z) - 0(/3z) 0(z) .

§111.5 2. (a) By (5.27) and (5.28), Tn(f i[ŒN]k) = d k/2" Ea,bf 1[ŒNcra(c; ba) ]k . Let , Œa ,b = anaN ajg riDoc-IV ENV, tl}, Z), and show that the Œ,b are in distinct cosets of r;(N) in An(N, {1}, Z); and hence form a set of representatives; so, by (5.27),


237

(Tnf)i[aidic = n(42 " Ea,bfl[cca,baN]k = n (W2)-1 Ef i[cr naN0-.(8 !)I beCaUSe fl[Crac = f. (b) TF = 0, T204 = 04 + 16F, so {F, F + A.Ø 4} is a normalized eigenbasis. (c) (7'2 F)i[Œ4] 2 = 0 # T2(FI[Œ4]2)- (d) Any eigenvector of T2 or [M k must be an eigenvector of T„ ; this includes F,iF +2-40 4 , C04, and so all of M2(r0( 4 )); by Proposition 40 applied to F, we have An = al (n) (n odd). (e) Let 04 (z) = Eanqn. For n odd, a, = (n)a i = 8u1 (n) by Proposition 40. For n = 2no twice an odd number, compare coefficients of qn0 in T2 04 = 04 + 16F to get an = a 16a1 (n0) = 24d1 (n0). For n = 2n 1 divisible by 4, compare coefficients of qi_'1 in T2 04 = 04 + 16F to get a„ = an, +13 = 24o-1 (n0) by induction. 3. The first part follows from (5.19) with n = 0; a counterexample for other cusps is in Problem 2(b), since Co4 + 16F does not vanish at s = —1/2. 4. (a) Both 1,, f and Tnf are equal to n(142 ". Eflk(401 M k over the same a, b, d, because g.c.d. (n, M) = 1. Note that g = M -42 f 1[01 M k . Thus, AP/2 T g = n(42)-1 E f 1[( ?)o-jg 'la, where ?,) mod MN. Set aa in the sum for T,, f equal to (411 ?)aj ig -?)'. Then = ((k/2I Ef f[aa(1 ?)(sSI !)O' ?)--I]ol oof ?)],= ( 7nf)1[0o1 ?ilk, because in (f; bia4) = (11 ?)(iO4 ?)--1 the entry bM runs through a complete set of residues mod d. (b) Since s8(r0 (2)) = Cf, f is an eigenform for Ta ;.then by part (a), g is an eigenform with the same eigenvalue. (c) T2 9 = U2 V2 f = f; T2 f = —8f by a comparison of coefficients. Hence {f, 8g +f} is the normalized eigenbasis. 5. = < p0/ 2)-1f1[( )] kl g> = Ef: , by (5 . 33) . ii)] Since gl[(t; li)]k = g(pz j) = e2 g(pz), we have = p(f, Pk-1 g(pz)> = p k . 6. (a) It has a l = O. (b)f, =q2 48q3 + 8 . 27 . 5q4 — 64 .5 . 47q5 + 4 . 9 - 5 .17 - 47q 6 + ••-;

(11■1

[

12 = q

24. 43q 2 + 4.9.49.139e + 64.171337e + • ; T2f1 = q + 1080q 2 + • • • =f2 + 2112f; T2 f2 = —24 . 43q +(64 .171337 + 2 23)q2 + - - = —1032f2 + 29367211(0 Tr T2 = 1080, Det T2 = — 2 ". 3 2 • 2221; eigenvalues = 540 + 12V144169; normalized eigenforms aref2 + (131 ± J144169) 12f1. (d) 2f2 + 3144fi . (e) From part (d): 2(4 . 9 -49 139) — 48 . 3144 = 89 5-23 41. Alternately, compute T3f1 = —48f2 + 36 - 2619f1 5 T3 f2 = 36 . 681112 *f (we don't care about *), so Tr 7'3 = 36(2619 + 6811) = 8 .9 .5 . 23 - 41. 8. In case of difficulty, see §3.1(c) of [Serre 1973 ].

NV.

= «73 ?), n2-114) ((: ()cil " d) ((141n) m 114) = ((;„, ()V czlm + d) by (1.4). (b) yi = (J„, mdb), so "j), = 1), ()E;i 1 Niczim - + d)= ?), (5)). Thus, they are equal if m is a square mod d. For m not a perfect square, to find y for which = 135" p -1 (1, —1), it suffices to find d prime to 4m such that (7) = —1, since one can then find y of the form Lam 'De ro (4m). To do this, first let m = 24mm 0i, where s = 0 or 1 and m o is squ-arefree. Choose d Ez- 1 mod 8 and d do mod mo , where do is chosen so that

1. (a)

(t) = —1. Then one easily checks that (7) = (ia) = (4) = — 1, as desired. 2. Replacing y by —y, if necessary, without loss of generality we may suppose that b> 0 or else b == 0 and a> 0. (a) = (ev N 114Vinf 1:1), (DE» cz + d)-

((-1 1r), —iNi/4,/-z-)=((z ;-,),N14-,iect:::-.:(5)evN cz + d)((_? le) Note that \I(az + b)I(cz + d) - cz + d = q i Vaz + b with 1 1 = —1 if a .< 0' and c 0, th = 1 otherwise. Thus,

238

.


pi'P-1 z--- (( 2Nb - r), ti 1N '14( E V 1/ a ( — 11Nz)+ b(—iN 114\li))= , 0 4vb 'IN), q i (D8 d-- ' — Nbz + a). (b) We have .•j3 1 = ((-daNb - `,;IN ), (411)--)e c-1- 1 \I — Nbz + a). Since ad — bc = 1 and 41c, we have a 2_ ---: d mod 4, and so ea = ed. Thus, it remains to compare ( 1%16 ) with Ili a We suppose c > 0 (an analogous argument gives the same result if c < 0). Then

r). Now

- fivb ) = ( 04)(7) = (--&) — 1, and (4-) = (÷), so that (-=-(N (N fr (1b )( ) = )( 1 71 — ) sgn a =--- (Dth. Thus, yi = pj3p -1 ( ( ?), (1)). Again ,-, and p,- p -1 are equal if N is a perfect square; otherwise there exist y for which the two differ by (1, — 1). 3. At oo, p = 1, h = 1, t = I. At 0, take p = ((? I), N[Z-), so p" = ((_? 16), —i N/i), and we need f 0 (4) p(( in, Op = 7), -iiv-tiz+ h . VZ) = ((rh1 ), 1 ,I z + h) (( 4 L), -il)= ( (( .7), —it,lhz — I). This is in fo(4) if /i = 4, 1= 1. Finally, at the cusp —I take a = (_1 ?), p = ((A ?), V —2z + 1), so p" = (0, V2z + 1). Since aa Da' er0 (4), we can take h = 1. To find t, compute p(( 1), 0 p' — ((_,1 4), :J-4z — 1), which is j((_ _1), z) provided that ! = i. S. (a) Ski2(f.0(4)) = 0 if k < 9• For k ,.._ 9, it consists of elements of the form OF(I6F — 04)P(0, F), where P is a polynomial of pure weight (k — 9)/2. Thus, for k .... 5, dim Sk2 (r0 (4)) = [(k — 5)14]. (b) Since ! = 1 at the cusps co and 0, there are always those two regular cusps. Since 1 = i at the cusp —4, that cusp is k-regular if and only if ik = 1, i.e., 41k. But I + [k/4] — [(k — 5) 14] = 2 if k ,... 5, 4,1'k and = 3 if k ._._ 5, 41k, as call be verified by checking for k = 5, 6, 7, 8 and then using induction to go from k to k + 4. (c) ®F(® 4 — 16F)(0 4 — 2F). 6. (a) If d'E_-- d mod N and Nj4 is odd, then (1) = (11..) = ( 1 r//4 - 1)(d-I W2( Ndi4 ) = N If N/4 is even, then d' -2E d mod 8, and the proof works (_._ 1)(NI4-1)(d' -1)12(4) = .ur) the same way, with the additional observation that (i) = (-,÷). (b) The proof that the h

(

.. .

cusp condition holds for f 1[4,12 is just like the analogous part of the proof of Proposition 17 in §I11.3. Now let y = C. 'Del-004 and let yl = ( -dbN -eir). By 4. Problem 2 above, we have p53 p-1 = (1, xpl(d))i, 1 . Thus, (f i [P]42) i Vilk/2 "4"--(f 1 [ei'P -1 ]ki2)1[P]k/2 = A(d)(fi Di ik/2)1[P]ko z--- X ki (d);((a)f 1[P]ka. But since ad ÷ .- --_-. 1 mod N, we have x(a) = .k(d). Thus, (f i [P]k2)1[A42 = )---X il v(d)f I [P],42, as desired. 7. 0(co)- = 1, ®(— 1/2) = 0, c(0) = (1 - 012. §IV.2 1. E, 2 (co) = 1, Eki2( 0) = Ekt2 (— 1/2) = 0; Fic/2(0) = (if)_k ; 12(00) = F142(— 1/2) = O. 5. If one uses (2.16) and (2.19) rather than (2.17) and (2.20), then the solutions a and , fl to the resulting 2 x 2 equations involve x, which depends on 1. 6. By Proposition 8, it suffices to show this for squarefree. In that case use (2.16) and . (2.19) to evaluate the /-th coefficient of Ekg + (1 ± ik)2 -k12 F142. 7. lika = ((I — 22) + ( I — 2)q + - . . ; C(1 — A) = 0 if and only if 2 .._ 3 is odd. 9. Use Problem 1 above and Problem 7 of §1V.1 to find a and b so that Co" — aEk12 — bFkg vanishes at the cusps. a = 1, b = (1 + 0'2-k12. e5 = £512 _ (1 + i)/N/2 F512, 0 7 = E712 + (1 — i)/V2 F7/2 .

t

§IV.3 2. The computation is almost identical to that in Problem I in §IV.1. 3. (c) By the lemma in Proposition 43 in §111.5, right coset representatives for ri (N) modulo fl (N) n cC i r'1 (N)Œ, where a = (01 p, are ah = (01 pb,), 0 -._ b < pv, By Problems 1(b)

239 -,


and 3(b), we use the ab to get representatives for r,(N) modulo ri (N) n r-v i ri (N)ÇV. Since ab = (01 :Oa ), we have Tp v f(z) = p“k14- ' ) • Ebfl[(( (i) :r), Pvil • ( ( i), 1 )L2 z ---- p f( z,-?) ------ E pin ane2xinziPv. The same argument works for Ti v, since the corresponding t is trivial even when 8,f'N. 4. (3.5) gives bn for half integer weight 102; in the case of integer weight k, the formula (5.19) of

§111.5 with m .„.. p 2 gives b„ ...._ ap,n+ x(p)pk-i an+ x ( p 2)p 2k- 2 a nfp 2 . If k is

formally replaced by 1c/2 here, the middle term on the right differs by )( NI ).1„(p)j) from the middle term on the right in (3.5). 5. Let y = (ca 1:1)e ro (N) be such that p2 11). Then for yi e 1", (N) we have .1g2:iiij A'i.. = 1:p2 'g p 21 42 :1) I 53.i : By Problem 2, we have i,2).)„.-2 1 = with Vi = ( p4j2c bliiii 2) e ro(N). Let t y' yiy E1"1 (N). Then D

(fi [f1'1 (M4411001/2)1[342 = Efl[42'Plii)/ki2 i

= EA Dl

pl2;(jk/2

-----

X(d)

Zil Rp2 Mk/2 •

But one checks that the correspondence ri (N)421`)i i-- f1 (N) p2ti permutes the right cosets in r, 0\042 f-, (N), and so this equals x(d)fl[f-i (N),2 fi (N)L2 . (Verify that if ti and Tx are in the same right coset of 1-1 (N) n a - -1 r 1 (N)a in T1 (N), where a =. (01 :2), then so are y; = rciy and yi, = yt.r y'.) Thus, Tp 2 .T1[342 = X(d)72f it remains to note that Fo (N) is generated by 1"1 (N) and y = (ea e fo (N) for which p2 1b. Hence 1p2 fl[fl ki2 = x(d)Tp2 f for all y = (ca dI' ) e ro(N).

Bibliography

R. Alter, The congruent number problem, Amer. Math. Monthly 87 (1980), 48-45. G. E. Andrews, The Theory of Partitions, Addison-Wesley, 1976. N. Arthaud, On Birch and Swinnerton-Dyer's conjecture for elliptic curves with complex multiplication, Compositio Math. 37 (1978), 209-232. E. Artin, The Gamma Function, Holt, Rinehart & Winston, 1964. E. Artin, Collected Papers, Addison-Wesley, 1965. A. G. van Asch, Modular forms of half integral weight, some explicit arithmetic, Math. Anna/en 262 (1983), 77-89: A. O. L. Atkin and J. Lehner, Hecke operators on ro (m), Math. Anna/en 185 (1970),

. 134-16a. R. Bellman, A Brief Introduction to Theta Functions, Holt, Rinehart & Winston, 1961. B. J. Birch, Conjectures on elliptic curves, Amer. Math. Soc. Proc. Symp. Pure ,Math. 8 (1963), 106-112. B. J. Birch, Heegner points on elliptic curves, Symp. Math., 1st. d. Alta Mat. 15 (1975), 441 445. B. J. Birch and H. P. F. Swinnerton-Dyer, Notes on elliptic curves 1 and 11, J. Reine Angew. Math. 212 (1963), 7-25 and 218 (1965), 79 - 108. Z. Borevich and I. Shafarevich, Number Theory, Academic Press, 1966. A. Brumer and K. Kramer, The rank of elliptic curves, Duke Math. J. 44 (1977), 715-742. J. W. S. Cassels, Diophantine equations with special reference to elliptic curves, J. London Math. Soc. 41 (1966), 193-291. J. W. S. Cassels and A. Fr6hlich, eds., Algebraic Number Theory, Academic Press, 1967. J. Coates, The arithmetic of elliptic curves with complex multiplication, Proc. Int. Cong. Math. Helsinki (1978), 351-355. J. Coates and A. Wiles, On the conjecture of Birch and Swinnerton Dyer, Inventiones , Math. 39 (1977j, 223-251. H. Cohen, Sommes de carrés, fontions L et formes modulaires, C- R. Acad. Sc. Paris 277 (1973), 827-830. , H. Cohen, Sums involving the values at negative integers of L-functions of quadratic characters, Math. Anna/en 217 (1975), 271-285. H. Cohen, Variations sur un thème de Siegel et Hecke, Acta Arith. 30 (1976), 63 93. -

-

Bibliography

241

H. Cohen and J. Oesterlé, Dimensions des espaces de formes modulaires, SpringerVerlag Lecture Notes in Math. 627 (1976), 69-78. H. Davenport and H. Hasse, Die Nullstellen der Kongruenz-zetafunktionen in gewissen zyklischen Fallen, J. Reine Angew. Math. 172 (1935), 151-182. P. Deligne, Valeurs de fontions L et périodes d'intégrales, Amer. Math. Soc. Proc. Symp. Pure Math. 33 (1979), Part 2, 313-346. Serre, Formes modulaires de poids 1, Ann. Sci. Ecole Norm. Sup. P. Deligne and 7 (1974), 501-530. L. E. Dickson, History of the Theory of Numbers, Chelsea, 1952. P. G. L. Dirichlet, Werke, Chelsea, 1969. B. Dwork, On the rationality of the zeta function, Amer. J. Math. 82 (1960), 631-648. M. Eichler and D. Zagier, On the Théory of Jacobi Forms, Birkhauser, 1984. Y. Flicker, Automorphic forms on - covering groups of GL(2), Inventiones Math. 57 (1980), 119-182. S. Gelbart, Weil's Representation and the Spectrum of the Metaplectic Group, Lecture Notes in Math. 530, Springer-Verlag, 1976. D. Goldfeld, Sur les produits partiels eulériens attachés aux courbes elliptiques, C. R. Acad. Sc. Paris 294 (1982), 471-474. D. Goldfeld, J. Hoftstein and S. J. Patterson, On automorphic functions of half-integral weight with applications to elliptic curves. In : Number Theory Related to Fermat's Last Theorem, Birkhauser, 1982, 153-193. D. Goldfeld and C. Viola, Mean values of L-functions associated to elliptic, Fermat and other curves at the center of the critical strip, J. Number Theory 11 (1979), 305-320. D. Goldfeld and C. Viola, Some conjectures on elliptic curves over cyclotomic Trans. Amer. Math. Soc. 276 (1983), 511-515. R. Greenberg, On the Birch and Swinnerton-Dyer conjectures, Invenriones Math. 72 (1983), 241-265. B. H. Gross, Arithmetic on Elliptic Curves with Complex Multiplication, LectureNotes in Math. 776, Springer-Verlag, 1980. B. H. Gross and D. Zagier, On the critical values of Hecke L-series, Soc. Math. de France Métn. No. 2 (1980), 49-54. B. H. Gross and D. Zagier, Points de Heegner et dérivées de fonctions L, C. R. Acad. Sc. Paris 297 (1983), 85-87. R. C. Gunning, Lectures on Modular Forms, Princeton Univ. Press, 1962. R. K. Guy, Unsolved Problems in Number Theory, Springer-Verlag, 1981. G. H. Hardy, On the representation of a number as the sum of any number of squares, and in particular of five or seven, Proc.Nat. Acad. Sci. 4 (1918), 189-193. (Collected Papers, Vol. 1, 340, Clarendon Press, 1960.) G. H. Hardy, On the representation of a number as the sum of any number of squares, and in particular of five, Trans. Amer. Math. Soc. 21 (1920), 255-284. (Collected Papers, Vol. L, 345, Clarendon Press, 1960.) G. H. Hardy and E. M. Wright, An Introduci;on to the Theory of Numbers, 4th ed., Oxford Univ. Press, 1960. R. Hartshorne, Algebraic Geometry, Springer-Verlag, 1077. H. Hasse, Number Theory, Springer, 1980. E. Hecke, C1ber die Bestimmung Dirichletscher Reihen durchihre Funktionalgleichung, Math. Annalen 112 (1936), 664-699. (Math. Werke, 591-626.) E. Hecke, Herleitung des Euler-Produktes der Zetafunktion und einiger L-Reihen aus ihrer Funktionalgleichung, Math.-Annalen 119 (1944), 266-287. (Math. Wei--ke, 919-940.) E. Hecke, Lectures on the Theory of Algebraic Numbers, Springer-Verlag, 1981. E. klecke, Lectures on Dirichlet Series, Modular Functions and Quadratic Forms, Vatfdenhoeck_ and, Ruprecht, 1983. K. Ireland and M. Rosen, A Classical Introduction to Modern Number Theory, Springer-

242

Bibliography

Verlag, 1982. N. Katz, An overview of Deligne's proof of the Riemann hypothesis for varieties over finite fields, Amer. Math. Soc. Proc. Symp. PUre Math. 28 (1976), 275-305. N. Katz, p-adic interpolation of real analytic Eisenstein series, Annals of Math. 104 (1976), 459-571. N. Koblitz, p-adic Numbers, p-adic Analysis, and Zeta-Functions, 2nd ed., SpringerVerlag, 1984. N. Koblitz, p-adic Analysis: a Short Course on Recent Work, Cambridge Univ. Press, 1980. N. Koblitz, Why study equations over finite fields?, Math. Magazine 55 (1982), 144-149. W. Kohnen, Modular forms of half integral weight on 1-0 (4), Math. Anna/en 248 (1980), 249-266. W. Kohnen, Beziehungen zwischen Modulformen halbganzen Gewichts und Modul. formen ganzen- Gewichts, Bonner Math. Scriften 131 (1981). W. Kohnen, Newforms of half-integral weight, J. Reine und Angew. Math. 333 (1982), 32-72. W. Kohnen and D. Zagier, Values of L-series of modular forms at the center of the critical strip, lnventiones Math. 64 (1981), 175-198. S. Lang, Algebraic Number Theory, Addison-Wesley, 1970. S. Lang, Elliptic Functions, Addison-Wesley, 1973.. S. Lang. Introduction to Atodular Forms, Springer-Verlag, 1976. S. Lang, Sur la conjecture de Birch-Swinnerton-Dyer (d'après J. Coates et A. Wiles), Sém. Bourbaki No. 503. In: Springer-Verlag Lecture Notes in Math. 677 (1978), 189-200. S. Lang, Elliptic Curves Diophantine Analysis, Springer-Verlag, 1978. S. Lang, Units and class groups in number theory and algebraic geometry, Bull Amer. Math. Soc. 6 (1982), 253-316. S. Lang, Algebra, Benjamin/Cummings, 1984. W. J. LeVeque, Fundamentals of Number Theory, Addison-Wesley, 1977. Yu. I. Manin, Cclotomic fi elds and modular curves, Russian Math. Surveys 26 (1971), 7-78. Yu. I. Manin, Modular forms and number theory, Proc. Int. Cong. Math. Helsinki (1978), 177-186. D. Marcus, Number Fields, Springer-Verlag, 1977. Modular Functions of One Variable IV, Lecture Notes in Math. 476, Springer - Verlag, 1975. C. J. Moreno, The higher reciprocity laws: an example, J. Number Theory 12 (1980), 57-70. D. Mumford, Algebraic Geometry I: Complex Projective Varieties, Springer-Verlag, 1976. S. Niwa, Modular forms of half-integral weight and the integral of certain thetafunctions, Nagoya MO. J. 56 (1975), 147-161. A. Ogg, Modular Forms and Dirichlet Series, W. A. Benjamin, 1969. R. A. Rankin, Modular Forms and Functions, Cambridge Univ. Press, 1977. B. Schoeneberg, Elliptic Modular Functions: an Introduction, Springer-Verlag, 1974. J.-P. Serre, Formes modulaires et fonctions zeta p-adiques, Springer-Verlag Lecture Notes in Math. 350 (1973), 191-268. J.-P. Serre, A Course in Arithmetic, Springer-Verlag, 1977. J.-P. Serre and H. M. Stark, Modular forms of weight 1/2, Springer-Verlag Lecture Notes in Math. 627 (1977), 21-67. L R. Shafarevich, Basic Algebraic Geometry, Springer-Verlag, 1977. G. Shimura, Introduction to the Arithmetic Theory of Automorphic Functions, Princeton Univ. Press, 1971. G. Shimura, On modular forms of half-integral weight. Annals of Math. 97 (1973),

Bibliography

243

440-481.. G. Shimura, Modular forms of half integral weight, Springer-Verlag Lecture Notes in Math. 320 (1973), 59-74. T. Shintani, On construction of holomorphic cusp forms of half-integral weight, Nagoya Math. J. 58 (1975), 83-126. C. L. Siegel, On Advanced Analytic Number Theory, Tata Institute (Bombay), 1961. N. M. Stephens, Congruence properties of congruent numbers, Bull. London Math. Soc. 7(1975), 182-184. H. P. F. Swinnerton-Dyer, The conjectures of Birch and Swinnerton-Dyer and of Tate. In : Proc. Conf Local Fields, Springer-Verlag, 1967. J. Tate, The arithmetic of elliptic curves, Inventiones Math. 23 (1974), 179-206. J. Tunnel], A classical Diophantine problem and modular forms of weight 3/2, 'liven 72 (1983), 323-334. -honesMath. M.-F. Vignéras, Valeur au centre de symmétriè des fonctions L associées aux formes modulaires, Sém. Delange-Pisot-Poitou, 1980. B. L. van der Waerden, Algebra (2 vols.), Frederick Ungar, 1970. J. L. Waldspurger, Correspondance de Shimura, J. Math. Pures et Appl. 59 (1980), 1-132. J. L. Waldspurger, Sur les coefficients de Fourier des formes modulaires de poids demi-entier, J. Math. Pures et Appl. 60 (1981), 375-484. R. J. Walker, Algebraic Curves, Springer-Verlag, 1978. L. Washington, Introduction to Cyclotomic Fields, Springer-Verlag, 1982. A. Weil, Number of solutions of equations in finite fi elds, Bull. Amer. Math. Soc. 55 (1949), 497-508. (Collected Papers, Vol. 1, 399-410.) A. Weil, Jacobi sums as "Grôssencharaktere," Trans. Amer. Math. Soc. 73 (1952), 487-495. (Collected Papers, Vol. II, 63 - 71.) A. Weil, On a certain type of characters of the idèle-class group of an algebraic number-field, Proc. Intern. Symp. on Alg. Num. Theory, Tokyo-Nikko (1955), 1-7. (Collected Papers, Vol. II, 255-261.) A. ,,i1, Ober die Bestimmung Dirichletscher Reihen durch Funktionalgleichungen, Math. Anna/en 168 (1967), 149-156. A. Weil, Review of "The mathematical career of Pierre de Fermat" by M. S. Mahoney, Bull Amer. Math. Soc. 79 (1973), 1138-1149. (Collected Papers, Vol. Ili, 266-277.) A. Weil, Basic Alumber Theory, 3rd ed., Springer - Verlag, 1974. A. Weil, Sur ics ,,ommes de trois et quatre carrés, L'Enseignement Math. 20 (1974), 215-222. (Co/It (l ('d Papers, Vol. 111, 303-310.) A. Weil, La cyclotom:c jadis et naguère, Sém. Bourbaki No. 452. In : Springer-Verlag Lecture Notes in Math. 431 (1975), 318-338 and in l'Enseignement Math. 20 (1974), 247-263. (Collected Papers, Vol. III, 311-327.) A. Weil, Sommes de Jacobi et caractères de Hecke, Gait. Nadir. Nr. .1 (1974), 14 pp. (Collected Papers, Vol. III, 329-342.) A. Weil, Number Theory: an Approach Through History from Hammurapi to Legendre, Birkhauser, 1983. E. Whittaker and G. Watson, A Course of Modern Analysis, 4th ed., Cambridge Univ. Pre ss, 1958. D. Zagier, Nombres de classes et formes modulaires de poids 3/2, C. R. Acad. Sc. Paris 281 (1975), 883-886. D. Zagier, Modular forms associated to real quadratic fields, Inventiones Math. 30 (1975), 1-46. D. Zagier, On the values at negative integers of the zeta function of real quadratic fields, Enseignement Math. 22 (1976), 55-95. D. Zagier, Modular forms whose Fourier coefficients involve zeta-functions of quadratic fields. gnrir 'r Lecture Notes in Math. 627 (197;6, 105-169.

Index

Addition law, 7, 29-35 Affine variety, 52 coordinate ring of, 55 Algebraic geometry, 25-26

Algorithm for congruent number problem, 4, 5, 222 semi-algorithm, 5 Automorphy factor, 148, 152, 153, 177-178

Bernoulli

numbers, 110 polynomials, 54 Bezout's theorem, 32 Birch-Swinnerton-Dyer conjecture, 3, 46, 9093, 218, 221, 222 Branch points, 21, 25

Character additive, 57, 62 conductor, 67 Dirichlet, 62, 75 of group, 61 multiplicative, 57, 62 primitive, 62 quadratic, 82, 176, 187-188, 191-192

/

trivial, 57 Class number, 176, 194, 218 relations, 194 Coates, J., 92 -Wiles theorem, 92, 96, 221 Commensurable subgroups, 165 Complex multiplication, 42, 50, 92, 124, 143, 222 Conductor of character, 67 of elliptic curve, 143 Congruence

subgroup, 99-100 principal, 99 zeta-function, 51, 52 Of elliptic curve, 53, 59 Congruent

number, 1, 3, 5, 46, 70, 92, 221-222 number problem, 1, 2, 4, 221-222 generalized, 8, 123-124, 223-224 Coordinate ri ng, 55 Critical value, 90, 95, 193, 215, 216, 217 Cusp, 103, 106, 108, 126 condition, 125-126, 180-182 form, 108, 117-118, 125, 127, 155, 182 irregular, 144, 182 regular, 144, 1-74, 182

246

Cyclotomic fields, 37

Dedekind eta-function, 78, 121, 122 zeta-function, 56, 88, 89 Deligne, P., 53, 122, 164 111-112, 122, 164 Diagonal hypersurface, 56 Different. 89 Dilogarithm, 76, 78 Diophantus, 1 Dirichlet L-series, 75, 188, 190, 193 series, 80, 141 and modular forms, 140-143 theorem (primes in arithmetic progression), 45, 142 Discriminant modular form, 111-112, 12"1 , 164 of polynomial, 26 Double coset, 165, 204 Doubly periodic, 14

Eigenforms for Hecke operators, 163, 173-174, 201 Euler product, 163 half integer weight, 210-211, 214 normalized, 163 for involution of M2(r0(4)), 146 Eisenstein, F., 177 Eisenstein series, 109-110, 123, 154, 164, 174, 185 of half integer weight, 186-188, 193 Euler product, 199-201 of level N, 131-134 L-function of, 146 normalized, 111, 122 Elementary divisor theorem, 202 Elliptic curve, 9, Il addition law, 7, 29-35 additive degeneracy, 36 complex multiplication, 42, 50, 92, 124, 143, 222 over finite fields, 40-41, 43 inflection points, 13, 35, 41 Legendre form, 224 multiplicative degeneracy, 36 points of order N, 21, 36, 38-40 rank, 44, 46, 51, 91 torsion subgroup, 36, 43.-44, 49-50 Weierstrass form, 24, 26, 33, 120

Index

functions, 14-16, 18, 25 integrals, 27-29, 217 point, 102, 146 -ri(z), 78, 121, 122 Euclid, I

Fermat, 2, 96 curve, 56 last theorem, 2, 5 Fields cyclotomic, 37 of division points, 37 finite, 40 Fourier transform, 71, 83 for finite group, 76 Fractional linear transformation, 98, 102 Functional equation Dedekind eta-function, 78, 121 Dedekind zeta-function, 88, 89 Dirichlet L-series, 77-78 Hasse-Weil L-series, 81, 84, 90, 91 L-series of modular forms, 140-143, 216 Riemann zeta-function, 73-74 theta-functions, 73, 76-78, 85, 88-89, 124 Fundamental domain, 100, 103, 105-107, 146, 231-232 parallelogram, 14

0(X), 107, 142, 148 Galois action on division points, 37-38, 42, 50 Gamma-function, 70-71 Gauss lemma (on quadratic residues), 136 sums, 56, 62, 67-68, 188 Gaussian integers, 14, 41, 42, 65, 165 General linear group, 38, 98 Genus, 53.-54 Good reduction, 43, 90 Grassmannian, 55 Greenberg, R., 222 Gross, B. H., 93, 222

Hardy, G. H., 177 Hasse-Davenport relation, 60, 62-63, 70 Hase-Weil L-function, 3, 61, 64, 75, 79, 81, 84, 90, 141 and modular forms, 143 Hecke, E., 88, 141, 142 character, 81 L-series, 81

247

index

Hecke, E. (cont.) operators, 155, 156, 158, 167, 202 algebra of, 157, 210 Euler product, 158, 160 in half integer weight, 168, 201, 206207, 210 Hermitian, 168, 172 on q-expansion, 161, 163, 207 trace, 175 via double cosets, 167, 168, 202 Heegner, K., 92-93 Homogeneous polynomial, 10, 12, 52 Hurwitz, A., 194 Hypergeometric series, 29 ,-

Irregular cusp, 144, 182 Isotropy subgroup, 102

Jacobi. K., 112 forms, 194 sums, 56, 57, 61 triple product, 219 j-invariant, 105, 119-120, 123-124

Kohnen, W., 214 plus-subspace, 213-214 -Shimura isomorphism, 201, 213-216 -Zagier theorem, 216 Kronecker, L., 194

Lattice, 14, 21, 89, 153 dual, 89 Legendre form of elliptic curve, 224 symbol, 60, 65, 82, 136, 147, 178, 187-188 Level, 99 L-function Dirichlet, 75, 77-78, 188, 190, 193 Hasse-Weil, 3, 61, 64, 75, 79, 81, 84, 90, • 141 of modular form, 140-143, 216 Linear group general, 38, 98 special, 98 Line at infinity, 10-11 Liouville's theorem, 15

Me llin transform, 71, 84, 85, 139

inverse, 142 Möbius function, 188 Modular curve, 91 form, 96-97, 108-109, 117-118, 125, 127, 155 with character, 127, 136-139, 183 and Dirichlet series, 140-143 Euler product, 163 of half integer weight, 3, 176, 178, 182 weight one, 165 function, 108, 119, 125, 127, 155, 182 group SL 2(Z), 99 point, 153 Mordell theorem, 43-44 -Weil theorem, 44 Multiplicity one, 173-174, 214

New form, 174 Niwa, S., 212, 214, 215

Pentagonal number theorem, 123 Petersson scalar product, 168, 169-170, 216 Points at infinity, 10, 11, 13, 103 on elliptic curve, 11, 12 Poisson summation, 72, 83 Projective completion of curve, 11 line, 11, 12, 98, 105 plane, 10 dual plane, 12 space, 11 variety, 52 Pythagoras, 1 Pythagorean triple, 1-2, 3, 5, 8 primitive, 5, 6-7

_ q-expansion, 104, 109, 125, 126 Quadratic character, 82, 176, 187-188, 191-192 form, 176-177, 221 reciprocity-, 66, 82, 153 residue symbol, 60, 65, 82, 136, 147, 178, 187-188

Ramanujan, S., 122 conjecture, 122, 164 -Petersson conjecture, 164 T(n), 122. f23, 164 RamiPe • - • •icx, 144

248

Rank of elliptic curve, 44, 46, 51, 91 Reduction mod p, 43 Regular cusp, 144, 174, 182 Representation, 137, 165 theory, 215 Residue field, 43, 55-56 theorem, 15, 30, 116 Riemann sphere, 10, 21, 98, 105, 119, 120 surface, 53, 54, 143 zeta-function, 27, 51, 64, 70, 73 Root. number, 70, 84, 92, 97

Shimura, G., 177, 215 map, 200-201, 213-215 theorem, 212-213, 215' Shintani, T., 215 Smooth curve, 9, 13, 52-53 Special linear group, 98

Tangency, order of, 13 Taniyama, Y., 91, 143 -Weil conjecture, 91, 143 Tate, J., 88 -Shafarevich group, 218, 222 T(n), 122, 123, 164 Theta-functions, 72, 76-78, 84, 88-89, 96-97, 124, 176-177 Hecke transformation formula, 148 Torsion subgroup, 36-37, 43 ill Torus,1452 Tunnel!, J., 1, 200, 217, 219 theorem, 1, 3, 4, 6, 193, 217, 219-222 Twisting, 81, 127, 142, 176

IndeJ

Type of finite abelian group, 40

Upper half-plane, 99

Variety affine, 52 projective, 52

Waldspurger, J.-L., 215 theorem, 193, 200, 215, 220 Weierstrass p-function, 16-18, 21, 134 derivatives of, 17, 18, 20, 21, 22 differential equation of, 21, 22, 24 form of elliptic curve, 24, 26, 33, 120 Weight, 108, 153-154 half integer, 176 one, 165 of polynomials, 183 Weil, A.,, 44, 91, 141, 142, 143 conjectures, 53-54, 91, 122, 139, 164 parametrization, 91 -Taniyama conjecture, 91 theorem, 142-143, 215 Wiles, A., 92

Zagier, D., 93, 194, 222 Zeta-function congruence, 51, 52 Dedekind, 56, 88, 89 partial, 132 Riemann, 27, 51, 64, 70, 73

Introduction to Elliptic Curves and Modular Forms (Graduate Texts in Mathematics 97)

Introduction to Elliptic Curves and Modular Forms (Graduate Texts in Mathematics, Vol 97)

Introduction to Elliptic Curves and Modular Forms (Graduate Texts in Mathematics, Vol 97)