Abstract Algebra and Famous Impossibilities (Universitext)


Author: Arthur Jones | Sidney A. Morris | Kenneth R. Pearson



Universitext Editors (North America): J.H. Ewing, F.W. Gehring, and P.R. Halmos

Aksoy/Khamsi: Nonstandard Methods in Fixed Point Theory
Aupetit: A Primer on Spectral Theory
Bachem/Kern: Linear Programming Duality
Benedetti/Petronio: Lectures on Hyperbolic Geometry
Berger: Geometry I, II (two volumes)
Bliedtner/Hansen: Potential Theory
Booss/Bleecker: Topology and Analysis
Carleson/Gamelin: Complex Dynamics
Cecil: Lie Sphere Geometry: With Applications to Submanifolds
Chandrasekharan: Classical Fourier Transforms
Charlap: Bieberbach Groups and Flat Manifolds
Chern: Complex Manifolds Without Potential Theory
Cohn: A Classical Invitation to Algebraic Numbers and Class Fields
Curtis: Abstract Linear Algebra
Curtis: Matrix Groups
van Dalen: Logic and Structure
Das: The Special Theory of Relativity: A Mathematical Exposition
DiBenedetto: Degenerate Parabolic Equations
Dimca: Singularities and Topology of Hypersurfaces
Edwards: A Formal Background to Mathematics I a/b
Edwards: A Formal Background to Mathematics II a/b
Emery: Stochastic Calculus
Foulds: Graph Theory Applications
Frauenthal: Mathematical Modeling in Epidemiology
Fukhs/Rokhlin: Beginner's Course in Topology
Gallot/Hulin/Lafontaine: Riemannian Geometry
Gardiner: A First Course in Group Theory
Gårding/Tambour: Algebra for Computer Science
Godbillon: Dynamical Systems on Surfaces
Goldblatt: Orthogonality and Spacetime Geometry
Hahn: Quadratic Algebras, Clifford Algebras, and Arithmetic Witt Groups
Hlawka/Schoissengeier/Taschner: Geometric and Analytic Number Theory
Howe/Tan: Non-Abelian Harmonic Analysis: Applications of SL(2,R)
Humi/Miller: Second Course in Ordinary Differential Equations
Hurwitz/Kritikos: Lectures on Number Theory
Iversen: Cohomology of Sheaves
Jones/Morris/Pearson: Abstract Algebra and Famous Impossibilities
Kelly/Matthews: The Non-Euclidean Hyperbolic Plane
Kempf: Complex Abelian Varieties and Theta Functions
Kostrikin: Introduction to Algebra
Krasnoselskii/Pokrovskii: Systems with Hysteresis
Luecking/Rubel: Complex Analysis: A Functional Analysis Approach
MacLane/Moerdijk: Sheaves in Geometry and Logic
Marcus: Number Fields
McCarthy: Introduction to Arithmetical Functions
Meyer: Essential Mathematics for Applied Fields

(continued after index)

Arthur Jones Sidney A. Morris Kenneth R. Pearson

Abstract Algebra and Famous Impossibilities With 27 Illustrations

Springer-Verlag New York Berlin Heidelberg London Paris Tokyo Hong Kong Barcelona Budapest

Arthur Jones, Kenneth R. Pearson
Department of Mathematics, La Trobe University, Bundoora 3083, Australia

Sidney A. Morris
Faculty of Informatics, University of Wollongong, Wollongong 2500, Australia

Editorial Board (North America): J.H. Ewing Department of Mathematics Indiana University Bloomington, IN 47405, USA

F.W. Gehring Department of Mathematics University of Michigan Ann Arbor, MI 48109, USA

P.R. Halmos Department of Mathematics Santa Clara University Santa Clara, CA 95053, USA

Library of Congress Cataloging-in-Publication Data
Jones, Arthur, 1934-
Abstract algebra and famous impossibilities / Arthur Jones, Sidney A. Morris, Kenneth R. Pearson.
p. cm. - (Universitext)
Includes bibliographic references and index.
ISBN 0-387-97661-2
1. Algebra, Abstract. 2. Geometry - Problems, Famous. I. Morris, Sidney A., 1947- . II. Pearson, Kenneth R. III. Title.
QA162.J65 1992 512'.02-dc20 91-24830

Printed on acid-free paper.

© 1991 Springer-Verlag New York, Inc. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Camera-ready copy provided by the authors. Printed and bound by R.R. Donnelley and Sons, Harrisonburg, VA. Printed in the United States of America.

9 8 7 6 5 4 3 2 (Corrected second printing, 1994)

ISBN 0-387-97661-2 Springer-Verlag New York Berlin Heidelberg
ISBN 3-540-97661-2 Springer-Verlag Berlin Heidelberg New York

Preface

The famous problems of squaring the circle, doubling the cube and trisecting an angle captured the imagination of both professional and amateur mathematicians for over two thousand years. Despite the enormous effort and ingenious attempts by these men and women, the problems would not yield to purely geometrical methods. It was only the development of abstract algebra in the nineteenth century which enabled mathematicians to arrive at the surprising conclusion that these constructions are not possible.

In this book we develop enough abstract algebra to prove that these constructions are impossible. Our approach introduces all the relevant concepts about fields in a way which is more concrete than usual and which avoids the use of quotient structures (and even of the Euclidean algorithm for finding the greatest common divisor of two polynomials). Having the geometrical questions as a specific goal provides motivation for the introduction of the algebraic concepts and we have found that students respond very favourably.

We have used this text to teach second-year students at La Trobe University over a period of many years, each time refining the material in the light of student performance. The text is pitched at a level suitable for students who have already taken a course in linear algebra, including the ideas of a vector space over a field, linear independence, basis and dimension. The treatment, in such a course, of fields and vector spaces as algebraic objects should provide an adequate background for the study of this book. Hence the book is suitable for Junior/Senior courses in North America and second-year courses in Australia.

Chapters 1 to 6, which develop the link between geometry and algebra, are the core of this book. These chapters contain a complete solution to the three famous problems, except for proving that $\pi$ is a transcendental number (which is needed to complete the proof of the impossibility of squaring the circle). In Chapter 7 we give a self-contained proof that $\pi$ is transcendental. Chapter 8 contains material about fields which is closely related to the topics in Chapters 2-4,


although it is not required in the proof of the impossibility of the three constructions. The short concluding Chapter 9 describes some other areas of mathematics in which algebraic machinery can be used to prove impossibilities.

We expect that any course based on this book will include all of Chapters 1-6 and (ideally) at least passing reference to Chapter 9. We have often taught such a course which we cover in a term (about twenty hours). We find it essential for the course to be paced in a way that allows time for students to do a substantial number of problems for themselves. Different semester length (or longer) courses including topics from Chapters 7 and 8 are possible. The three natural parts of these are

(1) Sections 7.1 and 7.2 (transcendence of $e$),
(2) Sections 7.3 to 7.6 (transcendence of $\pi$),
(3) Chapter 8.

These are independent except, of course, that (2) depends on (1). Possible extensions to the basic course are to include one, two or all of these.

While most treatments of the transcendence of $\pi$ require familiarity with the theory of functions of a complex variable and complex integrals, ours in Chapter 7 is accessible to students who have completed the usual introductory real calculus course (first-year in Australia and Freshman/Sophomore in North America). However instructors should note that the arguments in Sections 7.3 to 7.6 are more difficult and demanding than those in the rest of the book.

Problems are given at the end of each section (rather than collected at the end of the chapter). Some of these are computational and others require students to give simple proofs. Each chapter contains additional reading suitable for students and instructors. We hope that the text itself will encourage students to do further reading on some of the topics covered. As in many books, exercises marked with an asterisk * are a good bit harder than the others.

We believe it is important to identify clearly the end of each proof and we use the symbol ■ for this purpose. We have found that students often lack the mathematical maturity required to write or understand simple proofs. It helps if students write down where the proof is heading, what they have to prove and how they might be able to prove it. Because this is not part of the formal proof, we indicate this exploration by separating it from the proof proper by using a box which looks like


(Include here what must be proved etc.)

Experience has shown that it helps students to use this material if important theorems are given specific names which suggest their content. We have enclosed these names in square brackets before the statement of the theorem. We encourage students to use these names when justifying their solutions to exercises. They often find it convenient to abbreviate the names to just the relevant initials. (For example, the name "Small Degree Irreducibility Theorem" can be abbreviated to S.D.I.T.)

We are especially grateful to our colleague Gary Davis, who pointed the way towards a more concrete treatment of field extensions (using residue rings rather than quotient rings) and thus made the course accessible to a wider class of students. We are grateful to Ernie Bowen, Jeff Brooks, Grant Cairns, Mike Canfell, Brian Davey, Alistair Gray, Paul Halmos, Peter Hodge, Alwyn Horadam, Deborah King, Margaret McIntyre, Bernhard H. Neumann, Kristen Pearson, Suzanne Pearson, Alf van der Poorten, Brailey Sims, Ed Smith and Peter Stacey, who have given us helpful feedback, made suggestions and assisted with the proof reading. We thank Dorothy Berridge, Ernie Bowen, Helen Cook, Margaret McDonald and Judy Storey for skilful TeXing of the text and diagrams, and Norman Gaywood for assisting with the index.

A.J., S.A.M., K.R.P.
April 1991

Contents

Preface

Introduction
0.1 Three Famous Problems
0.2 Straightedge and Compass Constructions
0.3 Impossibility of the Constructions

Chapter 1 Algebraic Preliminaries
1.1 Fields, Rings and Vector Spaces
1.2 Polynomials
1.3 The Division Algorithm
1.4 The Rational Roots Test
Appendix to Chapter 1

Chapter 2 Algebraic Numbers and Their Polynomials
2.1 Algebraic Numbers
2.2 Monic Polynomials
2.3 Monic Polynomials of Least Degree

Chapter 3 Extending Fields
3.1 An Illustration: $\mathbb{Q}(\sqrt{2})$
3.2 Construction of $\mathbb{F}(\alpha)$
3.3 Iterating the Construction
3.4 Towers of Fields

Chapter 4 Irreducible Polynomials
4.1 Irreducible Polynomials
4.2 Reducible Polynomials and Zeros
4.3 Irreducibility and $\mathrm{irr}(\alpha, \mathbb{F})$
4.4 Finite-dimensional Extensions

Chapter 5 Straightedge and Compass Constructions
5.1 Standard Straightedge and Compass Constructions
5.2 Products, Quotients, Square Roots
5.3 Rules for Straightedge and Compass Constructions
5.4 Constructible Numbers and Fields

Chapter 6 Proofs of the Impossibilities
6.1 Non-Constructible Numbers
6.2 The Three Constructions are Impossible
6.3 Proving the "All Constructibles Come From Square Roots" Theorem

Chapter 7 Transcendence of $e$ and $\pi$
7.1 Preliminaries
7.2 $e$ is Transcendental
7.3 Preliminaries on Symmetric Polynomials
7.4 $\pi$ is Transcendental - Part 1
7.5 Preliminaries on Complex-valued Integrals
7.6 $\pi$ is Transcendental - Part 2

Chapter 8 An Algebraic Postscript
8.1 The Ring $\mathbb{F}[X]_{p(X)}$
8.2 Division and Reciprocals in $\mathbb{F}[X]_{p(X)}$
8.3 Reciprocals in $\mathbb{F}(\alpha)$

Chapter 9 Other Impossibilities and Abstract Algebra
9.1 Construction of Regular Polygons
9.2 Solution of Quintic Equations
9.3 Integration in Closed Form

Index

Introduction

0.1 Three Famous Problems

In this book we discuss three of the oldest problems in mathematics. Each of them is over 2,000 years old. The three problems are known as:

[I] doubling the cube (or duplicating the cube, or the Delian problem);
[II] trisecting an arbitrary angle;
[III] squaring the circle (or quadrature of the circle).

Problem I is to construct a cube having twice the volume of a given cube. Problem II is to describe how every angle can be trisected. Problem III is that of constructing a square whose area is equal to that of a given circle. In all cases, the constructions are to be carried out using only a ruler and compass.

Reference to Problem I occurs in the following ancient document supposedly written by Eratosthenes to King Ptolemy III about the year 240 B.C.:

To King Ptolemy, Eratosthenes sends greetings. It is said that one of the ancient tragic poets represented Minos as preparing a tomb for Glaucus and as declaring, when he learnt it was a hundred feet each way: "Small indeed is the tomb thou hast chosen for a royal burial. Let it be double [in volume]. And thou shalt not miss that fair form if thou quickly doublest each side of the tomb." But he was wrong. For when the sides are doubled, the surface [area] becomes four times as great, and the volume eight times. It became a subject of inquiry among geometers in what manner one might double the given volume without changing the shape. And this problem was called the duplication of the cube, for given a cube they sought to double it ...

The origins of Problem II are obscure. The Greeks were concerned with the problem of constructing regular polygons, and it is likely that the trisection problem arose in this context. This is so because the construction of a regular polygon with nine sides necessitates the trisection of an angle.

The history of Problem III is linked to that of calculating the area of a circle. Information about this is contained in the Rhind Papyrus, perhaps the best known ancient mathematical manuscript, which was brought by A.H. Rhind to the British Museum in the nineteenth century. The manuscript was copied by the scribe Ahmes about 1650 B.C. from an even older work. It states that the area of a circle is equal to that of a square whose side is the diameter diminished by one ninth; that is, $A = \left(\frac{8}{9}\right)^2 d^2$. Comparing this with the formula $A = \pi r^2 = \pi \frac{d^2}{4}$ gives

$$\pi = 4 \cdot \left(\frac{8}{9}\right)^2 = \frac{256}{81} = 3.1604\ldots$$
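The comparison above is a one-line piece of arithmetic, and it can be checked exactly. The following Python sketch is only an illustration; the function name `rhind_pi` is our own label and does not come from the text.

```python
from fractions import Fraction


def rhind_pi() -> Fraction:
    """Value of pi implied by the rule A = (8/9)^2 * d^2.

    Equating (8/9)^2 * d^2 with pi * d^2 / 4 and cancelling d^2
    gives pi = 4 * (8/9)^2.
    """
    return 4 * Fraction(8, 9) ** 2


value = rhind_pi()
print(value)         # the exact fraction
print(float(value))  # its decimal approximation
```

Working with `Fraction` rather than floating point keeps the value as an exact rational number, which is the form in which the text states it.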

The Papyrus contains no explanation of how this formula was obtained. Fifteen hundred years later Archimedes showed that $3\frac{10}{71} < \pi < 3\frac{10}{70} = 3\frac{1}{7}$.

[...] every natural number $n > 1$ can be written as a product of prime numbers and, except for the order of the factors, the expression of $n$ in this form is unique. Thus $n$ has a unique expression

$$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$$

where $p_1, p_2, \ldots, p_k$ are distinct prime numbers and $a_1, a_2, \ldots, a_k$ are positive integers.

(a) Use the Fundamental Theorem of Arithmetic to prove that if $r$ and $s$ are natural numbers such that a prime $p$ is a factor of the product $rs$, then $p$ must be a factor of $r$ or of $s$ (or of both). [Hint. Express $r$ and $s$ as products of primes and then use the uniqueness part of the Fundamental Theorem.]

(b) Let $r$, $s$, and $b$ be positive integers such that $r$ and $s$ have no common factors except 1 and -1. Use the Fundamental Theorem of Arithmetic to show that if $r$ is a factor of $bs$, then $r$ must be a factor of $b$.

(c) Let $r$, $s$, $b$, and $m$ be positive integers such that $r$ and $s$ have no common factors except 1 and -1. Show that if $r$ is a factor of $bs^m$, then $r$ must be a factor of $b$.

10.* Use the Rational Roots Test to prove that the number $\sin\frac{\pi}{9}$ (that is, $\sin 20°$) is irrational.

[Hint. First use the formulae

$$\cos^2\theta + \sin^2\theta = 1,$$
$$\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi,$$
$$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi$$

to write $\sin 3\theta$ in terms of $\sin\theta$ (without any cos terms). Next, put $\theta = \frac{\pi}{9}$, $x = \sin\theta$, and use $\sin\frac{\pi}{3} = \frac{\sqrt{3}}{2}$.]


Appendix to Chapter 1

We record here formal definitions of ring, field, vector space, and of related terms. If you are not familiar with these concepts, we recommend that you read Sections 23, 24 and 36 of [JF] or Chapters 1 and 2 of [AT] or the early parts of Chapter 3 of [AC].

1.A.1 Definitions. A group $(G, +)$ is a set $G$ together with a binary operation $+$ such that

(a) if $g_1, g_2 \in G$, then $g_1 + g_2 \in G$;
(b) if $g_1, g_2, g_3 \in G$, then $g_1 + (g_2 + g_3) = (g_1 + g_2) + g_3$;
(c) there is an element $0 \in G$ such that $g + 0 = 0 + g = g$ for all $g \in G$;
(d) for every element $g \in G$, there exists an element $g'$ such that $g + g' = g' + g = 0$.

The element $0$ is called the identity of the group $G$. The element $g'$ in (d) is called the inverse of $g$. The group $G$ is said to be abelian or commutative if $g_1 + g_2 = g_2 + g_1$ for all $g_1, g_2 \in G$. ■

We shall be more interested in sets with two binary operations.

1.A.2 Definitions. A ring is a set $R$ with two binary operations $+$ and $.$ such that

(a) $R$ together with the operation $+$ is an abelian group;
(b) if $r_1, r_2 \in R$, then $r_1 . r_2 \in R$;
(c) if $r_1, r_2, r_3 \in R$, then $r_1 . (r_2 . r_3) = (r_1 . r_2) . r_3$;
(d) if $r_1, r_2, r_3 \in R$, then $r_1 . (r_2 + r_3) = r_1 . r_2 + r_1 . r_3$ and $(r_2 + r_3) . r_1 = r_2 . r_1 + r_3 . r_1$.

The additive identity of $R$ is denoted by $0$. The ring $R$ is said to be a commutative ring if $r_1 . r_2 = r_2 . r_1$ for all $r_1, r_2 \in R$.

A ring with unity is a ring $R$ with an element $1$ (called the unity of $R$) such that $1 . r = r . 1 = r$ for all $r \in R$.

An element $r$ of a ring $(R, +, .)$ is said to be a zero-divisor if $r \neq 0$ and there exists an element $s \neq 0$ in $R$ such that $r . s = 0$ or $s . r = 0$.


A ring $(R, +, .)$ is said to be an integral domain if it is a commutative ring with unity and it contains no zero-divisors. A subset $S$ of a ring $(R, +, .)$ is said to be a subring of $R$ if $(S, +, .)$ is itself a ring. ■

1.A.3 Definitions. A ring $(\mathbb{F}, +, .)$ is said to be a field if $\mathbb{F} \setminus \{0\}$ together with the operation $.$ is an abelian group. A subset $S$ of a field $(\mathbb{F}, +, .)$ is said to be a subfield of $\mathbb{F}$ if $(S, +, .)$ is itself a field. The additive identity of a field $\mathbb{F}$ is usually denoted by $0$, and the multiplicative identity of $\mathbb{F}$ is denoted by $1$. Note that while every field is an integral domain, there are integral domains which are not fields: for example, the ring $\mathbb{Z}$ of integers is an integral domain which is not a field.
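The definitions of zero-divisor, integral domain and field can be tested by brute force on a small finite ring. The sketch below assumes, purely for illustration, the ring of integers modulo $n$ with the usual addition and multiplication; this ring is not discussed in the text, and the function names are our own.

```python
def zero_divisors(n: int) -> list[int]:
    """Nonzero r in the integers mod n with r.s = 0 for some nonzero s."""
    return [
        r for r in range(1, n)
        if any((r * s) % n == 0 for s in range(1, n))
    ]


def is_integral_domain(n: int) -> bool:
    # The integers mod n (n >= 2) are a commutative ring with unity 1,
    # so only the absence of zero-divisors has to be checked.
    return n >= 2 and not zero_divisors(n)


def is_field(n: int) -> bool:
    # The nonzero elements form an abelian group under multiplication
    # exactly when each of them has a multiplicative inverse.
    return n >= 2 and all(
        any((r * s) % n == 1 for s in range(1, n))
        for r in range(1, n)
    )


print(zero_divisors(6))                     # elements such as 2, since 2.3 = 0 mod 6
print(is_integral_domain(5), is_field(5))
print(is_integral_domain(6), is_field(6))
```

A search of this kind only works for finite rings; for the ring $\mathbb{Z}$ mentioned above, the corresponding facts have to be proved rather than enumerated.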

1.A.4 Definitions. A set $V$ together with two operations $+$ and $.$ together with a field $\mathbb{F}$ is said to be a vector space over the field $\mathbb{F}$ if

(a) $V$ together with the operation $+$ is an abelian group;
(b) if $\lambda \in \mathbb{F}$ and $v \in V$, then $\lambda . v \in V$;
(c) if $\lambda_1, \lambda_2 \in \mathbb{F}$ and $v \in V$, then $(\lambda_1 + \lambda_2) . v = \lambda_1 . v + \lambda_2 . v$;
(d) if $v_1, v_2 \in V$ and $\lambda \in \mathbb{F}$, then $\lambda . (v_1 + v_2) = \lambda . v_1 + \lambda . v_2$;
(e) if $1$ is the multiplicative identity of $\mathbb{F}$, then $1 . v = v$ for all $v \in V$;
(f) if $\lambda_1, \lambda_2 \in \mathbb{F}$ and $v \in V$, then $(\lambda_1 \lambda_2) . v = \lambda_1 . (\lambda_2 . v)$.

We refer to the elements of $\mathbb{F}$ as the scalars and the multiplication of an element of $\mathbb{F}$ by an element of $V$ (for example, $\lambda . v$ in (b) above) as scalar multiplication. The elements of $V$ are called vectors. The identity of the group $(V, +)$ is called the zero vector and is denoted by $0$. The operation $+$ is often called vector addition. A subset $S$ of the vector space $V$ over the field $\mathbb{F}$ is said to be a subspace of $V$ if $S$ together with the operations $+$ and $.$ of $V$ is a vector space over the field $\mathbb{F}$. ■

1.A.5 Definitions. If $V$ is a vector space over a field $\mathbb{F}$ and $S$ is a subset of $V$, then the set

$$\mathrm{span}(S) = \{\lambda_1 . v_1 + \cdots + \lambda_n . v_n : n \geq 1 \text{ and each } \lambda_i \in \mathbb{F},\ v_i \in S\}$$

is called the span of $S$ over $\mathbb{F}$ or the linear span of $S$ over $\mathbb{F}$. If $\mathrm{span}(S) = V$ then $S$ is said to span $V$ over $\mathbb{F}$. ■



1.A.6 Definitions. If $V$ is a vector space over a field $\mathbb{F}$ and $S$ is a subset of $V$, then $S$ is said to be a linearly independent set over $\mathbb{F}$ if the zero vector, $0$, can be written only as a trivial linear combination of vectors in $S$; that is, for $v_1, v_2, \ldots, v_n \in S$,

$$0 = \lambda_1 . v_1 + \lambda_2 . v_2 + \cdots + \lambda_n . v_n$$

implies that

$$\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0.$$

$S$ is said to be linearly dependent over $\mathbb{F}$ if it is not linearly independent over $\mathbb{F}$. ■

1.A.7 Definitions. If $V$ is a vector space over a field $\mathbb{F}$ and $S$ is a nonempty subset of $V$ which is linearly independent over $\mathbb{F}$ and which spans $V$ over $\mathbb{F}$, then $S$ is said to be a basis of $V$ over $\mathbb{F}$. If $S$ is finite, then the number of elements of $S$ is called the dimension of $V$ over $\mathbb{F}$. ■

Additional Reading for Chapter 1

General algebra books which include the background information we assume about rings, fields, and vector spaces (and which also deal with geometrical constructions) include [AC], [JF] and [LS], while [AT] is a good introduction to vector spaces. Deeper results about irrational numbers can be found in [GH] (which is the classic reference on number theory) and in [IN1] and [IN2]. [GH] contains a proof of the Fundamental Theorem of Arithmetic and is also a good reference on prime numbers.

[AC] A. Clark, Elements of Abstract Algebra, Wadsworth, Belmont, California, 1971.
[JF] J.B. Fraleigh, A First Course in Abstract Algebra, 3rd edition, Addison-Wesley, Reading, Massachusetts, 1982.
[GH] G.H. Hardy and E.M. Wright, An Introduction to the Theory of Numbers, Clarendon, Oxford, 1960.
[IN1] I. Niven, Numbers: Rational and Irrational, Random House, New York, 1961.
[IN2] I. Niven, Irrational Numbers, Carus Mathematical Monographs, No. 11, Mathematical Association of America, 1967.
[LS] L.W. Shapiro, Introduction to Abstract Algebra, McGraw-Hill, New York, 1975.
[AT] A.M. Tropper, Linear Algebra, Thomas Nelson, London, 1969.

CHAPTER 2 Algebraic Numbers and Their Polynomials

Straightedge and compass constructions can be used to produce line segments of various lengths relative to some preassigned unit length. Although the lengths are all real numbers, it turns out that not every real number can be obtained in this way. The lengths which can be constructed are rather special. As the first step towards classifying the lengths which can be constructed, this chapter introduces the concept of an algebraic number (or more specifically of a number which is algebraic over a field). Each such number will satisfy many polynomial equations and our immediate goal is to choose the simplest one.

2.1 Algebraic Numbers

Numbers which lie in $\mathbb{R}$ but not in $\mathbb{Q}$ are said to be irrational. Well-known examples are $\sqrt{2}$, $e$ and $\pi$. The irrationality of $\sqrt{2}$ was known to the ancient Greeks whereas that of $e$ and $\pi$ was proved much later. Although these three numbers are all irrational, there is a fundamental distinction between $\sqrt{2}$ and the other two numbers. While $\sqrt{2}$ satisfies a polynomial equation with coefficients in $\mathbb{Q}$, no such equation is satisfied by $e$ or $\pi$ (which we shall prove in Chapter 7). For this reason $\sqrt{2}$ is said to be an algebraic number whereas $e$ and $\pi$ are said to be transcendental numbers. The precise definition is as follows.

2.1.1 Definition. A number $\alpha \in \mathbb{C}$ is said to be algebraic over a field $\mathbb{F} \subseteq \mathbb{C}$ if there is a nonzero polynomial $f(X) \in \mathbb{F}[X]$ such that $\alpha$ is a zero of $f(X)$; that is, there is a polynomial

$$f(X) = a_0 + a_1 X + \cdots + a_n X^n$$

whose coefficients $a_0, a_1, \ldots, a_n$ all belong to $\mathbb{F}$, at least one of these coefficients is nonzero, and $f(\alpha) = 0$. ■

Observe that for each field $\mathbb{F}$, every number $\alpha$ in $\mathbb{F}$ is algebraic over $\mathbb{F}$ because $\alpha$ is a zero of the polynomial $X - \alpha \in \mathbb{F}[X]$. This implies that $e$ and $\pi$ are algebraic over $\mathbb{R}$ even though they are not algebraic over $\mathbb{Q}$, as noted above.

2.1.2 Examples.

(i) The number $\sqrt{2}$ is algebraic over $\mathbb{Q}$ because it is a zero of the polynomial $X^2 - 2$, which is nonzero and has coefficients in $\mathbb{Q}$.

(ii) The number $\sqrt[4]{2}\,\sqrt[6]{3}$ is algebraic over $\mathbb{Q}$ because it is a zero of the polynomial $X^{12} - 72$, which is nonzero and has coefficients in $\mathbb{Q}$. ■

In order to show that a number is algebraic, we look for a suitable polynomial having that number as a zero.

2.1.3 Example. The number $1 + \sqrt{2}$ is algebraic over $\mathbb{Q}$.

Proof. Let $\alpha = 1 + \sqrt{2}$.

(We want to find a polynomial, with $\alpha$ as a zero, which has coefficients in $\mathbb{Q}$. This suggests squaring to get rid of the square roots: $\alpha^2 = 1 + 2\sqrt{2} + 2$, which is no good as $\sqrt{2}$ still appears on the right hand side. So we go back and isolate $\sqrt{2}$ on one side before squaring!)

Isolating $\sqrt{2}$ gives

$$\alpha - 1 = \sqrt{2}.$$

Squaring both sides gives

$$(\alpha - 1)^2 = 2,$$

and so

$$\alpha^2 - 2\alpha - 1 = 0.$$

Thus $\alpha$ is a zero of the polynomial $X^2 - 2X - 1$, which is nonzero and has coefficients in $\mathbb{Q}$. Hence $\alpha$ is algebraic over $\mathbb{Q}$. ■

2.1.4 Other Versions. It is useful to be able to recognize the definition of "algebraic over a field $\mathbb{F}$" when it appears in different guises: thus $\alpha \in \mathbb{C}$ is algebraic over $\mathbb{F}$ if and only if

(i) there exists $f(X) \in \mathbb{F}[X]$ such that $f(X) \neq 0$ and $f(\alpha) = 0$,

OR

(ii) there is a positive integer $n$ and numbers $a_0, a_1, \ldots, a_{n-1}, a_n$ in $\mathbb{F}$, not all zero, such that

$$a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_{n-1} \alpha^{n-1} + a_n \alpha^n = 0.$$

The last statement may ring a few bells! Where have you seen it before? Well, to put things in context recall that, since $\mathbb{F}$ is a subfield of $\mathbb{C}$, we can regard $\mathbb{C}$ as a vector space over $\mathbb{F}$. (See the paragraph preceding Definition 1.1.2.) Now the numbers

$$1, \alpha, \alpha^2, \ldots, \alpha^{n-1}, \alpha^n$$

are all elements of $\mathbb{C}$, and hence can be regarded as vectors in the vector space $\mathbb{C}$ over $\mathbb{F}$. The coefficients $a_0, a_1, a_2, \ldots, a_{n-1}, a_n$, on the other hand, are all in $\mathbb{F}$ so we can regard them as scalars. Thus, we can write (ii) in the alternative way:

(iii) there is a positive integer $n$ such that the set of powers

$$\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}, \alpha^n\}$$

of the number $\alpha$ is linearly dependent over $\mathbb{F}$. ■
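Version (iii) can be made concrete for the number $\alpha = 1 + \sqrt{2}$ of Example 2.1.3. The Python sketch below is only an illustration under our own conventions: a number $a + b\sqrt{2}$ with $a, b \in \mathbb{Q}$ is stored as the pair `(a, b)`, and the helper names `mul`, `dependence` and `combination` are hypothetical. It finds rational scalars, not all zero, for which a linear combination of $1, \alpha, \alpha^2$ is zero.

```python
from fractions import Fraction


def mul(x, y):
    """Product of x = (a, b) and y = (c, d), each standing for a + b*sqrt(2)."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)


def dependence(alpha):
    """Rationals (c0, c1, c2), not all zero, with c0 + c1*alpha + c2*alpha^2 = 0."""
    a, b = alpha
    if b == 0:
        # alpha is rational, so 1 and alpha are already dependent: -a + alpha = 0
        return (-a, Fraction(1), Fraction(0))
    sq_a, sq_b = mul(alpha, alpha)   # alpha^2 = sq_a + sq_b*sqrt(2)
    c1 = -sq_b / b                   # cancels the sqrt(2) part of alpha^2
    c0 = -(sq_a + c1 * a)            # cancels the remaining rational part
    return (c0, c1, Fraction(1))


def combination(coeffs, alpha):
    """Evaluate c0*1 + c1*alpha + c2*alpha^2 as a pair (rational part, sqrt(2) part)."""
    c0, c1, c2 = coeffs
    sq = mul(alpha, alpha)
    return (c0 + c1 * alpha[0] + c2 * sq[0], c1 * alpha[1] + c2 * sq[1])


alpha = (Fraction(1), Fraction(1))   # 1 + sqrt(2)
coeffs = dependence(alpha)
print(coeffs)                        # scalars c0, c1, c2
print(combination(coeffs, alpha))    # both parts should vanish
```

For $\alpha = 1 + \sqrt{2}$ the scalars found are the coefficients of $X^2 - 2X - 1$, in agreement with Example 2.1.3. The pair representation is a convenience of this sketch and relies on $\sqrt{2}$ being irrational.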

You will often meet the terms "algebraic number" and "transcendental number" where no field is specified. In such cases the field is taken to be Q. We formalize this below.

2.1.5 Definitions. A complex number is said to be

(i) an algebraic number if it is algebraic over $\mathbb{Q}$;
(ii) a transcendental number if it is not algebraic over $\mathbb{Q}$. ■

Exercises 2.1

1. Verify that each of the following numbers is algebraic over the stated field:
(a) $\sqrt{2}\,$[...] over $\mathbb{Q}$;
(b) $\sqrt{2} + \sqrt{3}$ over $\mathbb{Q}$;
(c) $\sqrt{\pi}$ over $\mathbb{R}$;
(d) $i$ over $\mathbb{Q}$;
(e) $\sqrt{2} + \sqrt{3}\,i$ over $\mathbb{R}$.

2. (a) Write down three different polynomials in $\mathbb{Q}[X]$, each of degree 2, which have $\sqrt{5}$ as a zero.
(b) Write down three different polynomials in $\mathbb{Q}[X]$, each of degree greater than 2, which have $\sqrt{5}$ as a zero.
(c) Write down three different polynomials in $\mathbb{Q}[X]$, each of degree 100, which have $\sqrt{5}$ as a zero.

3. Prove that if $\alpha \in \mathbb{C}$ is algebraic over $\mathbb{Q}$ then so is $2\alpha$.

4. Prove that if $\alpha \in \mathbb{C}$ is algebraic over a subfield $\mathbb{F}$ of $\mathbb{C}$ then so is $-\alpha$.

5. Prove that if $\alpha$ is a nonzero complex number which is algebraic over a subfield $\mathbb{F}$ of $\mathbb{C}$, then $1/\alpha$ is also algebraic over $\mathbb{F}$.

6. Show that every complex number is algebraic over $\mathbb{R}$.

7.* Let $a$ be a positive real number, and let $\mathbb{F}$ be a subfield of $\mathbb{R}$. Prove that $a$ is algebraic over $\mathbb{F}$ if and only if $\sqrt{a}$ is algebraic over $\mathbb{F}$.

2.2 Monic Polynomials

An algebraic number such as $\sqrt{2}$ will be a zero of many different polynomials. Ultimately we want to pick out from all these polynomials one which is, in some sense, the best. Here we make a start in this direction by restricting attention to monic polynomials, defined as follows:

2.2.1 Definition. A polynomial $f(X) = a_0 + a_1 X + \cdots + a_n X^n$ in $\mathbb{F}[X]$ is said to be monic if its leading coefficient $a_n$ is 1. ■

Thus, for example, $X^2 - 2$ is a monic polynomial whereas $3X^2 - 6$ is not. Note that both these polynomials have $\sqrt{2}$ as a zero and, for our purposes, the polynomial $X^2 - 2$ is somewhat nicer than $3X^2 - 6$. The following proposition shows that, in the definition of "algebraic over a field", we can always assume the polynomial is monic.

2.2.2 Proposition. If a complex number $\alpha$ is a zero of a nonzero polynomial $f(X) \in \mathbb{F}[X]$ then $\alpha$ is a zero of a monic polynomial $g(X) \in \mathbb{F}[X]$ with $\deg g(X) = \deg f(X)$.

Proof. Assume that $\alpha$ is a zero of a polynomial $f(X) \neq 0$. Since $f(X) \neq 0$, some coefficient must be nonzero and (because there are only finitely many coefficients $a_i$) there must be a largest $i$ such that $a_i \neq 0$. Let $n$ be the largest such $i$. Hence

$$f(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_n X^n$$

where $a_n \neq 0$. We choose

$$g(X) = \frac{1}{a_n} f(X).$$

Thus $g(X)$ is monic, has $\alpha$ as a zero and the same degree $n$ as $f(X)$. All the coefficients of $g(X)$ are in $\mathbb{F}$, moreover, since $\mathbb{F}$ is a field. ■
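The construction in this proof, dividing by the leading coefficient, is easy to carry out mechanically. The following Python sketch assumes $\mathbb{F} = \mathbb{Q}$ and stores a polynomial as a list of coefficients in which entry `i` is the coefficient of $X^i$; both the representation and the name `make_monic` are our own choices for illustration.

```python
from fractions import Fraction


def make_monic(coeffs):
    """Return the coefficients of g(X) = (1/a_n) f(X) for a nonzero f(X).

    coeffs[i] is the coefficient of X^i. Trailing zero coefficients are
    dropped first, so that a_n really is the leading coefficient.
    """
    coeffs = [Fraction(c) for c in coeffs]
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    if not coeffs:
        raise ValueError("the zero polynomial has no leading coefficient")
    lead = coeffs[-1]
    # Division stays inside Q because Q is a field and lead is nonzero.
    return [c / lead for c in coeffs]


# 3X^2 - 6 has the same zeros as the monic polynomial X^2 - 2.
print(make_monic([-6, 0, 3]))
```

The division by `lead` is the one step that needs $\mathbb{F}$ to be a field, which mirrors the last sentence of the proof.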

Exercises 2.2

1. If $\alpha$ is a zero of $3X^3 - 2X + 1$, find a monic polynomial with coefficients in $\mathbb{Q}$ having $\alpha$ as a zero.

2. If $2\alpha^3 - 1 = \sqrt{3}$, find a monic polynomial in $\mathbb{Q}[X]$ which has $\alpha$ as a zero.

3. (a) Write down three different monic polynomials in $\mathbb{Q}[X]$ which have $\sqrt[3]{5}$ as a zero.
(b) Write down three different monic polynomials in $\mathbb{Q}[X]$, each of degree 100, which have $\sqrt[3]{5}$ as a zero.
(c) How many monic polynomials in $\mathbb{Q}[X]$ have $\sqrt[3]{5}$ as a zero?

2.3 Monic Polynomials of Least Degree

Even if we restrict attention to monic polynomials, there are still a lot of them which have $\sqrt{2}$ as a zero. For example, $\sqrt{2}$ is a zero of each of the polynomials

$$X^4 - 4, \quad (X^2 - 2)^2, \quad (X^2 - 2)(X^5 + 3), \quad (X^2 - 2)(X^{100} + 84X^3 + 73)$$

all of which are in $\mathbb{Q}[X]$. What distinguishes $X^2 - 2$ from the other polynomials is that it has the least degree.

If a number $\alpha$ is a zero of one monic polynomial then (as suggested by Exercises 2.2 #3) there are infinitely many monic polynomials of arbitrarily high degree having $\alpha$ as a zero. Although there is no maximum possible degree for such polynomials, there is always a least possible degree. (For example, if we know $\alpha$ is a zero of a monic polynomial of degree 12, we can be sure that the least possible degree is one of the finitely many numbers $1, 2, 3, \ldots, 11, 12$.) The following proposition shows that by focusing on the least possible degree, we pick out a unique polynomial.
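The claim that $\sqrt{2}$ is a zero of each of the four polynomials listed above can be checked with exact arithmetic. In this Python sketch (an illustration only) a number $a + b\sqrt{2}$ with rational $a, b$ is stored as the pair `(a, b)`, a polynomial is a list whose entry `i` is the coefficient of $X^i$, and all function names are our own.

```python
from fractions import Fraction

SQRT2 = (Fraction(0), Fraction(1))  # the pair (a, b) stands for a + b*sqrt(2)


def add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)


def evaluate(coeffs, x):
    """Horner evaluation of a rational polynomial at x = a + b*sqrt(2)."""
    result = (Fraction(0), Fraction(0))
    for c in reversed(coeffs):
        result = add(mul(result, x), (Fraction(c), Fraction(0)))
    return result


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


x2_minus_2 = [-2, 0, 1]
x100_part = [73, 0, 0, 84] + [0] * 96 + [1]   # X^100 + 84X^3 + 73

examples = {
    "X^4 - 4": [-4, 0, 0, 0, 1],
    "(X^2 - 2)^2": poly_mul(x2_minus_2, x2_minus_2),
    "(X^2 - 2)(X^5 + 3)": poly_mul(x2_minus_2, [3, 0, 0, 0, 0, 1]),
    "(X^2 - 2)(X^100 + 84X^3 + 73)": poly_mul(x2_minus_2, x100_part),
}

for name, p in examples.items():
    # Each line shows the degree and the value at sqrt(2) as a pair.
    print(name, "degree", len(p) - 1, "value", evaluate(p, SQRT2))
```

Every one of these polynomials has degree larger than 2, the degree of $X^2 - 2$, which is the point of the paragraph above. A finite check like this cannot show that no polynomial of smaller degree exists; that requires the argument of Example 2.3.3.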

2.3.1 Proposition. If 0' E C is algebraic over a field F ~ C then among all monic polynomials f(X) E F[X] witu f(O') = 0 there is a unique one of least degree.

33

Algebraic Numbers

Proof. Since a is algebraic, there is (by Proposition 2.2.2) a monic polynomial f(X) E F[X] with f(a) = O. Hence there is one such polynomial of least possible degree. To prove there is only one such polynomial, suppose there are two of them, say heX) and heX), each having the least degree n, say. Thus heX) = ao + alX + + an_1x n- 1 + x n heX) = bo + blX + + bn_1Xn- 1 + X n • Hence if we put f(X) = heX) - heX) we get f(X)

= Co + C1X + ... + cn_1X n- 1 ,

where

c;

= aj -

bj •

~ant to prove that !I(X) :: h(X) . To do this we show that f(X):: O.

Clearly f(X) E F[X], f(a) = 0 and either f(X) is zero or f(X) has degree < n. If f(X) i= 0 then, by Proposition 2.2.2, there is a monic polynomial which has the same degree as f(X) and which has a as a zero. But this is impossible since n is the least degree for which there is a monic polynomial having a as a zero. • It follows that f(X) = 0, and hence heX) = heX). The above theorem enables us to assign to each algebraic number a unique polynomial. This polynomial is so important that we introduce some terminology and notation for it.

2.3.2 Definitions. Let $\alpha \in \mathbb{C}$ be algebraic over a field $\mathbb{F} \subseteq \mathbb{C}$. The unique polynomial of least degree among those polynomials $f(X)$ in $\mathbb{F}[X]$ satisfying (i) $f(\alpha) = 0$ and (ii) $f(X)$ is monic is called the irreducible polynomial of $\alpha$ over $\mathbb{F}$. This polynomial is denoted by $\operatorname{irr}(\alpha, \mathbb{F})$. Its degree is called the degree of $\alpha$ over $\mathbb{F}$ and is denoted by $\deg(\alpha, \mathbb{F})$.

•



The word "irreducible" means "cannot be reduced" and its use in the above definition stems from the idea that we cannot reduce the degree below that of $\operatorname{irr}(\alpha, \mathbb{F})$ if we still want to have a polynomial with the desired properties (i) and (ii). Later you will find that $\operatorname{irr}(\alpha, \mathbb{F})$ is also irreducible in a different sense. Thus the use of the word "irreducible" is doubly justified.

2.3.3 Example. We prove that the irreducible polynomial of $\sqrt{2}$ over $\mathbb{Q}$ is $X^2 - 2$.

Proof. It is clear that this polynomial has $\sqrt{2}$ as a zero, that its coefficients are in $\mathbb{Q}$ and that it is monic. It remains to show there is no polynomial of smaller degree with these properties. If there were such a polynomial it would be

$$a + X, \quad \text{for some } a \in \mathbb{Q}.$$

But then we would have $a + \sqrt{2} = 0$ so that $\sqrt{2} = -a \in \mathbb{Q}$, which is a contradiction. Hence

$$\operatorname{irr}(\sqrt{2}, \mathbb{Q}) = X^2 - 2$$

and therefore $\deg(\sqrt{2}, \mathbb{Q}) = 2$. •

Note that, although the irreducible polynomial of $\sqrt{2}$ over $\mathbb{Q}$ is $X^2 - 2$, its irreducible polynomial over $\mathbb{R}$ is $X - \sqrt{2}$. This is because $X - \sqrt{2}$ is in $\mathbb{R}[X]$ but not in $\mathbb{Q}[X]$.

The idea in the worked example above is quite simple: find the polynomial you think is of least degree and, to show it really is of least degree, consider all smaller degrees. This can be far from straightforward if the polynomial in question has degree greater than 2. For example, you would suspect that $\operatorname{irr}(\sqrt[7]{2}, \mathbb{Q})$ is $X^7 - 2$. But to prove that $\sqrt[7]{2}$ is not a zero of any monic polynomial in $\mathbb{Q}[X]$ of degrees 1, 2, 3, 4, 5 or 6 would be very difficult (to say the least) with the techniques you know at present. Special techniques have been developed for these sorts of problems; you will meet some of them in Chapter 4.
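As a small illustration of "considering all smaller degrees", the following Python sketch rules out degree 1 only. It relies on the rational root theorem, a standard fact that is not proved in the text: a rational zero $p/q$ in lowest terms of a polynomial with integer coefficients must have $p$ dividing the constant term and $q$ dividing the leading coefficient. The function names are hypothetical, and the polynomial is assumed to have a nonzero constant term.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational zeros of a polynomial with integer coefficients.

    coeffs lists the coefficients from the constant term upward;
    the constant term is assumed to be nonzero.
    """
    # Candidates allowed by the rational root theorem (an assumption
    # imported from outside the text).
    candidates = {Fraction(sign * p, q)
                  for p in divisors(coeffs[0])
                  for q in divisors(coeffs[-1])
                  for sign in (1, -1)}

    def value(x):
        return sum(c * x**k for k, c in enumerate(coeffs))

    return sorted(x for x in candidates if value(x) == 0)

print(rational_roots([-2, 0, 1]))                 # X^2 - 2
print(rational_roots([-2, 0, 0, 0, 0, 0, 0, 1]))  # X^7 - 2
print(rational_roots([-4, 0, 1]))                 # X^2 - 4
```

For $X^2 - 2$ an empty list is the whole story, since degree 1 is the only smaller degree. For $X^7 - 2$ an empty list excludes degree 1 but says nothing about degrees 2 to 6, which is exactly the difficulty described above.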


Exercises 2.3

1. (a) Write down the irreducible polynomial of $\sqrt{5}$ over $\mathbb{Q}$ and then prove your answer is correct.
   (b) Write down $\deg(\sqrt{5}, \mathbb{Q})$.

2. In each case write down a nonzero polynomial $f(X)$ satisfying the stated conditions:
   (a) $f(X)$ is a monic polynomial over $\mathbb{Q}$ with $\sqrt{2}$ as a zero.
   (b) $f(X)$ is a polynomial over $\mathbb{Q}$ with $\sqrt{2}$ as a zero but is not monic.
   (c) $f(X)$ is a polynomial of least degree such that $f(X) \in \mathbb{R}[X]$ and $f(\sqrt{2}) = 0$.
   (d) $f(X)$ is another polynomial satisfying conditions (c).
   (e) $f(X) \in \mathbb{Q}[X]$ and has both $\sqrt{2}$ and $\sqrt{3}$ as zeros.

3. (a) In each case write down the unique monic polynomial $f(X)$ of least degree which has the stated property:
       (i) $f(X) \in \mathbb{R}[X]$ and has $\sqrt{3}$ as a zero.
       (ii) $f(X) \in \mathbb{Q}[X]$ and has $\sqrt{3}$ as a zero.
   (b) Explain why the polynomials you gave in part (a) have least degree.
   (c) What do your answers to part (a) tell you about
       (i) $\operatorname{irr}(\sqrt{3}, \mathbb{R})$ and $\deg(\sqrt{3}, \mathbb{R})$?
       (ii) $\operatorname{irr}(\sqrt{3}, \mathbb{Q})$ and $\deg(\sqrt{3}, \mathbb{Q})$?

4. Assume $\alpha$ is algebraic over $\mathbb{F}$. Let $n$ be the least degree of all monic nonzero polynomials $f(X)$ in $\mathbb{F}[X]$ with $f(\alpha) = 0$ and let $m$ be the least degree of all nonzero polynomials $g(X)$ (monic or not) in $\mathbb{F}[X]$ with $g(\alpha) = 0$. Explain why $m = n$. (Which theorem or proposition from an earlier section do you use?)

5. Let $\mathbb{F}$ be a subfield of $\mathbb{R}$ and let $a, b, c \in \mathbb{F}$ with $c$ positive.
   (a) Show that $a + b\sqrt{c}$ is a zero of a monic quadratic polynomial in $\mathbb{F}[X]$.
   (b) What can be said about the degree of $a + b\sqrt{c}$ over $\mathbb{F}$?



6. In accordance with #4, 5, 7 of Exercises 2.1, if $\alpha \in \mathbb{R}$ is algebraic over $\mathbb{F}$ then so are $-\alpha$, $\sqrt{\alpha}$ when $\alpha > 0$, and $1/\alpha$ when $\alpha \ne 0$.
   (a) What is the relationship between $\deg(\alpha, \mathbb{F})$ and $\deg(-\alpha, \mathbb{F})$?
   (b) Prove that $\deg(\sqrt{\alpha}, \mathbb{F}) \le 2\deg(\alpha, \mathbb{F})$.
   (c) Is the following statement true or false? (Justify your answer.)
       If $\alpha \ne 0$ is algebraic over $\mathbb{F}$ then $\deg(\alpha, \mathbb{F}) = \deg(1/\alpha, \mathbb{F})$.

7. Let $\mathbb{F} \subseteq \mathbb{C}$ and let $\alpha \in \mathbb{C}$. In each case decide whether the statement is true or false (and justify your answer).
   (a) If $\deg(\alpha, \mathbb{F}) = 2$ then $\alpha \notin \mathbb{F}$.
   (b) If $\deg(\alpha, \mathbb{F}) = 2$ then $\alpha^2 \in \mathbb{F}$.

8. If $\alpha \in \mathbb{C}$ is algebraic over a subfield $\mathbb{F}$ of $\mathbb{C}$ and $\{1, \alpha, \alpha^2, \alpha^3\}$ is linearly independent over $\mathbb{F}$, what can be said about $\deg(\alpha, \mathbb{F})$?

9. Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha$ be a complex number. Assume that the set $\{1, \alpha, \alpha^2, \alpha^3, \alpha^4\}$ is linearly dependent over $\mathbb{F}$.
   (a) Prove that $\alpha$ is algebraic over $\mathbb{F}$.
   (b) What can be said about $\deg(\alpha, \mathbb{F})$? (Justify your answer.)

10. Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha \in \mathbb{C}$. Prove that $\deg(\alpha, \mathbb{F}) = 1$ if and only if $\alpha \in \mathbb{F}$.

11. Is the following statement true or false? (Justify your answer.)
    If $\alpha$, $\beta$ and $\alpha\beta$ are algebraic over $\mathbb{Q}$ then $\deg(\alpha\beta, \mathbb{Q}) = \deg(\alpha, \mathbb{Q})\deg(\beta, \mathbb{Q})$.

12. Let $\mathbb{E} \subseteq \mathbb{F}$ be subfields of $\mathbb{C}$ and assume that $\beta \in \mathbb{C}$ is algebraic over $\mathbb{E}$.
    (a) Prove that $\beta$ is algebraic over $\mathbb{F}$ also.
    (b) Prove that $\deg(\beta, \mathbb{E}) \ge \deg(\beta, \mathbb{F})$.

13. Let $\mathbb{E} \subseteq \mathbb{F}$ be subfields of $\mathbb{C}$ such that $[\mathbb{F} : \mathbb{E}] = 5$. Let $\beta \in \mathbb{F}$.
    (a) What is the maximum number of elements which can be in a subset of $\mathbb{F}$ which is linearly independent over $\mathbb{E}$?
    (b) State why the following powers of $\beta$ are all in $\mathbb{F}$: $1, \beta, \beta^2, \beta^3, \beta^4, \beta^5$.
    (c) Deduce from (a) and (b) that $\beta$ is algebraic over $\mathbb{E}$.
    (d) What can be said about
        (i) $\deg(\beta, \mathbb{E})$?
        (ii) $\deg(\beta, \mathbb{F})$?

14. Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha \in \mathbb{C}$. Let $f(X) \in \mathbb{F}[X]$ be a monic polynomial with $f(\alpha) = 0$. If there are polynomials $g(X)$, $h(X)$ in $\mathbb{F}[X]$, each of degree $\ge 1$, such that $f(X) = g(X)h(X)$, show that $f(X)$ is not $\operatorname{irr}(\alpha, \mathbb{F})$.

Additional Reading for Chapter 2

We have introduced $\operatorname{irr}(\alpha, \mathbb{F})$ with very few preliminary results about polynomials. However, treatments of $\operatorname{irr}(\alpha, \mathbb{F})$ in most other books depend heavily on techniques for showing that a particular polynomial cannot be reduced (that is, factorized further). For this reason, you may find it confusing to read about $\operatorname{irr}(\alpha, \mathbb{F})$ in other books before you have finished reading our Chapter 4. A discussion of numbers algebraic over a field is given in the first two sections of Chapter XIV of [GB]. For some historical facts about transcendental numbers see Section 5.4 of [IN1].

[GB] G. Birkhoff and S. Mac Lane, A Survey of Modern Algebra, Macmillan, New York, 1953.
[IN1] I. Niven, Numbers: Rational and Irrational, Random House, New York, 1961.

CHAPTER 3

Extending Fields

If $\mathbb{F}$ is a subfield of $\mathbb{C}$ and $\alpha$ is a complex number which is algebraic over $\mathbb{F}$, we show how to construct a certain vector space $\mathbb{F}(\alpha)$ which contains $\alpha$ and which satisfies

$$\mathbb{F} \subseteq \mathbb{F}(\alpha) \subseteq \mathbb{C}.$$

This vector space is then shown to be a subfield of $\mathbb{C}$. Thus, from the field $\mathbb{F}$ and the number $\alpha$, we have produced a larger field $\mathbb{F}(\alpha)$. Fields of the form $\mathbb{F}(\alpha)$ are essential to our analysis of the lengths of those line segments which can be constructed with straightedge and compass.

3.1 An Illustration: $\mathbb{Q}(\sqrt{2})$

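As $\mathbb{Q}$ is a subfield of $\mathbb{C}$, we can consider $\mathbb{C}$ as a vector space over $\mathbb{Q}$, taking the elements of $\mathbb{C}$ as the vectors and the elements of $\mathbb{Q}$ as the scalars.

3.1.1 Definition. The set $\mathbb{Q}(\sqrt{2}) \subseteq \mathbb{C}$ is defined by putting

$$\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} : a, b \in \mathbb{Q}\}.$$

•

Thus $\mathbb{Q}(\sqrt{2})$ is the linear span of the set of vectors $\{1, \sqrt{2}\}$ over $\mathbb{Q}$ and is therefore a vector subspace of $\mathbb{C}$ over $\mathbb{Q}$. Hence $\mathbb{Q}(\sqrt{2})$ is a vector space over $\mathbb{Q}$.

3.1.2 Proposition. The set of vectors $\{1, \sqrt{2}\}$ is a basis for the vector space $\mathbb{Q}(\sqrt{2})$ over $\mathbb{Q}$.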

Proof. As noted above, this set of vectors spans $\mathbb{Q}(\sqrt{2})$. Hence it is sufficient to establish its linear independence over $\mathbb{Q}$. To this end, let $a, b \in \mathbb{Q}$ with

$$a + b\sqrt{2} = 0. \tag{1}$$

If $b \ne 0$, then $\sqrt{2} = -a/b$, which is again in $\mathbb{Q}$ as $\mathbb{Q}$ is a field. This contradicts the fact that $\sqrt{2}$ is irrational. Hence $b = 0$. It now follows from (1) that also $a = 0$. Thus (1) implies $a = 0$ and $b = 0$ so that the set of vectors $\{1, \sqrt{2}\}$ is linearly independent over $\mathbb{Q}$. •

Note that the dimension of the vector space $\mathbb{Q}(\sqrt{2})$ over $\mathbb{Q}$ is 2, this being the number of vectors in the basis.

3.1.3 Proposition. $\mathbb{Q}(\sqrt{2})$ is a ring.

Proof. To show that $\mathbb{Q}(\sqrt{2})$ is a subring of $\mathbb{C}$, it is sufficient to check that it is closed under addition and multiplication. The former is obvious while the latter is left as a simple exercise. •

3.1.4 Proposition. $\mathbb{Q}(\sqrt{2})$ is a field.

Proof. To show that this subring of $\mathbb{C}$ is a subfield, it is sufficient to check that it contains the reciprocal of each of its nonzero elements.

So let $x \in \mathbb{Q}(\sqrt{2})$ be such that $x \ne 0$. Thus

$$x = a + b\sqrt{2}$$

where $a, b \in \mathbb{Q}$ and $a \ne 0$ or $b \ne 0$. It follows that

$$a - b\sqrt{2} \ne 0$$

by the linear independence of the set $\{1, \sqrt{2}\}$. Thus

$$\frac{1}{x} = \frac{1}{a + b\sqrt{2}} = \frac{1}{a + b\sqrt{2}} \cdot \frac{a - b\sqrt{2}}{a - b\sqrt{2}} = \left(\frac{a}{a^2 - 2b^2}\right) + \left(\frac{-b}{a^2 - 2b^2}\right)\sqrt{2}$$

which is again an element of $\mathbb{Q}(\sqrt{2})$, since $a/(a^2 - 2b^2)$ and $-b/(a^2 - 2b^2)$ are both in $\mathbb{Q}$. •

The following proposition gives a way of describing $\mathbb{Q}(\sqrt{2})$ as a field with a certain property. The proof, like that of Proposition 1.1.1, involves proving a set inclusion.

3.1.5 Proposition. $\mathbb{Q}(\sqrt{2})$ is the smallest field containing all the numbers in the field $\mathbb{Q}$ and the number $\sqrt{2}$.

Proof. It is clear from what has been said earlier in this section that $\mathbb{Q}(\sqrt{2})$ is a field which contains both $\mathbb{Q}$ and $\sqrt{2}$. To show it is the smallest such field, let $\mathbb{F}$ be any field containing $\mathbb{Q}$ and $\sqrt{2}$. We shall prove that

$$\mathbb{Q}(\sqrt{2}) \subseteq \mathbb{F}.$$

To prove that $\mathbb{Q}(\sqrt{2}) \subseteq \mathbb{F}$ we shall show that

$$\text{if } x \in \mathbb{Q}(\sqrt{2}), \text{ then } x \in \mathbb{F}. \tag{2}$$

To prove this we start by assuming the hypothesis of (2). Let $x \in \mathbb{Q}(\sqrt{2})$; that is,

$$x = a + b\sqrt{2} \quad \text{for some } a, b \in \mathbb{Q}.$$

Our aim is to prove that $x \in \mathbb{F}$. To do this we shall use the fact that $\mathbb{F}$ is a field.

By assumption, $\sqrt{2} \in \mathbb{F}$. Also $a, b \in \mathbb{F}$ as $\mathbb{F}$ is assumed to contain $\mathbb{Q}$. Hence $b\sqrt{2} \in \mathbb{F}$ as $\mathbb{F}$ is closed under multiplication. So $a + b\sqrt{2} \in \mathbb{F}$ as $\mathbb{F}$ is closed under addition. Thus $x \in \mathbb{F}$, as required. •
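The closure arguments of Propositions 3.1.3 and 3.1.4 can be tried out with exact rational arithmetic. The following Python sketch stores $a + b\sqrt{2}$ as the pair of rationals $(a, b)$; the class name `QSqrt2` and its methods are hypothetical helpers introduced only for illustration, not notation from the text.

```python
from fractions import Fraction

class QSqrt2:
    """The number a + b*sqrt(2) with a, b rational (hypothetical helper)."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b r)(c + d r) = (ac + 2bd) + (ad + bc) r, because r*r = 2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self):
        # Multiply numerator and denominator by the conjugate a - b r.
        # For a nonzero element the denominator a^2 - 2b^2 is nonzero
        # because sqrt(2) is irrational.
        n = self.a**2 - 2 * self.b**2
        if n == 0:
            raise ZeroDivisionError("zero has no reciprocal")
        return QSqrt2(self.a / n, -self.b / n)

x = QSqrt2(2, 1)          # 2 + sqrt(2)
y = QSqrt2(1, 1)          # 1 + sqrt(2)
print(x * y)              # product stays of the form a + b*sqrt(2)
print(x * y.inverse())    # so does the quotient x / y
print(y * y.inverse() == QSqrt2(1))
```

Every result is again a pair of rationals, which is the computational content of saying that $\mathbb{Q}(\sqrt{2})$ is closed under multiplication and contains reciprocals.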

Because $\mathbb{Q}(\sqrt{2})$ is a field, the theory developed in Chapter 2 for a field $\mathbb{F} \subseteq \mathbb{C}$ can now be applied in the case $\mathbb{F} = \mathbb{Q}(\sqrt{2})$. The following example illustrates this.

3.1.6 Example. The number $\sqrt{3}$ is algebraic over $\mathbb{Q}(\sqrt{2})$ and has degree 2 over this field.

Proof. The polynomial $X^2 - 3$ has coefficients in $\mathbb{Q}$, and hence also in $\mathbb{Q}(\sqrt{2})$. It is monic, moreover, and has $\sqrt{3}$ as a zero. Hence $\sqrt{3}$ is algebraic over $\mathbb{Q}(\sqrt{2})$.

Suppose $X^2 - 3$ is not the irreducible polynomial of $\sqrt{3}$ over $\mathbb{Q}(\sqrt{2})$. Then the irreducible polynomial is a monic first degree polynomial

$$a + X$$

for some $a \in \mathbb{Q}(\sqrt{2})$. Hence $a + \sqrt{3} = 0$ and so $\sqrt{3} = -a$, which implies that

$$\sqrt{3} = c + d\sqrt{2}$$

for some $c, d \in \mathbb{Q}$. Squaring both sides gives

$$3 = c^2 + 2\sqrt{2}\,cd + 2d^2.$$

If $c = 0$ we have $2d^2 = 3$, which leads to the contradiction that $\sqrt{3/2}$ is rational, while if $d = 0$ we get the contradiction that $\sqrt{3}$ is rational. Hence $cd \ne 0$, which gives

$$\sqrt{2} = \frac{3 - c^2 - 2d^2}{2cd} \in \mathbb{Q},$$

which is also a contradiction. Thus our original assumption that $X^2 - 3$ is not the irreducible polynomial of $\sqrt{3}$ over $\mathbb{Q}(\sqrt{2})$ has led to a contradiction. Hence $X^2 - 3$ is the irreducible polynomial of $\sqrt{3}$ over $\mathbb{Q}(\sqrt{2})$; that is,

$$\operatorname{irr}(\sqrt{3}, \mathbb{Q}(\sqrt{2})) = X^2 - 3$$

and hence

$$\deg(\sqrt{3}, \mathbb{Q}(\sqrt{2})) = 2.$$

•

Exercises 3.1

1. Verify that each of the following numbers is in $\mathbb{Q}(\sqrt{2})$ by expressing it in the form $a + b\sqrt{2}$, where $a, b \in \mathbb{Q}$.
   (i) $(2 + 3\sqrt{2})(1 - 2\sqrt{2})$,
   (ii) $(1 + 3\sqrt{2})/(3 + 2\sqrt{2})$,
   (iii) $\sqrt{3 + 2\sqrt{2}}$. [Hint. Let this equal $a + b\sqrt{2}$ and square both sides.]

2. Show that the number $\sqrt{1 + \sqrt{2}}$ is not in $\mathbb{Q}(\sqrt{2})$. [Hint. Let this number equal $a + b\sqrt{2}$, square both sides and derive a contradiction.]

3. Show that the zeros in $\mathbb{C}$ of the polynomial $X^2 + 2X - 7$ are both in $\mathbb{Q}(\sqrt{2})$.

4. Verify that the number $\sqrt{5}$ is algebraic over $\mathbb{Q}(\sqrt{2})$ and find its degree over this field.

5. Verify that the number $\sqrt{144}$ is algebraic over $\mathbb{Q}(\sqrt{2})$ and find its degree over this field.

6. Complete the proof of Proposition 3.1.3, begun in the text. [Hint. Assume that $x, y \in \mathbb{Q}(\sqrt{2})$ and then spell out explicitly what this means. Hence show that $xy \in \mathbb{Q}(\sqrt{2})$.]

7. Verify that $\sqrt{m}$ is algebraic over each of the fields $\mathbb{Q}$ and $\mathbb{Q}(\sqrt{2})$, for each integer $m \ge 0$. State in which cases it is true that

   $$\deg(\sqrt{m}, \mathbb{Q}) = \deg(\sqrt{m}, \mathbb{Q}(\sqrt{2})).$$

8. Let $a, b \in \mathbb{Q}$. Show that if $a + b\sqrt{2}$ is a zero of a quadratic polynomial in $\mathbb{Q}[X]$ then its "conjugate" $a - b\sqrt{2}$ is also a zero of this polynomial.

9.* Prove the result of Exercise 8, but without the restriction that the polynomial is quadratic.

3.2 Construction of $\mathbb{F}(\alpha)$

In the previous section we constructed, from the field $\mathbb{Q}$ and the number $\sqrt{2}$, a vector subspace $\mathbb{Q}(\sqrt{2})$ of $\mathbb{C}$ by putting

$$\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} : a, b \in \mathbb{Q}\}.$$

We then found that $\mathbb{Q}(\sqrt{2})$ is closed under multiplication (as well as addition) and hence is a ring. Next $\mathbb{Q}(\sqrt{2})$ was shown to contain the reciprocal of each of its nonzero elements, and hence it is also a field. Finally we showed that $\mathbb{Q}(\sqrt{2})$ was the smallest subfield of $\mathbb{C}$ containing all the numbers in $\mathbb{Q}$ together with the number $\sqrt{2}$.

In this section we generalize the construction of $\mathbb{Q}(\sqrt{2})$ by constructing a subset $\mathbb{F}(\alpha)$ of $\mathbb{C}$ for any subfield $\mathbb{F}$ of $\mathbb{C}$ and any number $\alpha \in \mathbb{C}$ which is algebraic over $\mathbb{F}$. We begin by defining $\mathbb{F}(\alpha)$ and then showing that this set is in turn a vector subspace, a subring, and then a subfield of $\mathbb{C}$. Our final goal is Theorem 3.2.8, which says that $\mathbb{F}(\alpha)$ is the smallest subfield of $\mathbb{C}$ containing $\alpha$ and all the numbers in $\mathbb{F}$. Throughout this section, we assume that $\alpha$ is algebraic over $\mathbb{F}$.

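3.2.1 Definition. Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha \in \mathbb{C}$ be algebraic over $\mathbb{F}$ with $\deg(\alpha, \mathbb{F}) = n$. The extension of $\mathbb{F}$ by $\alpha$ is the set $\mathbb{F}(\alpha) \subseteq \mathbb{C}$ where

$$\mathbb{F}(\alpha) = \{b_0 + b_1\alpha + \cdots + b_{n-1}\alpha^{n-1} : b_0, b_1, \ldots, b_{n-1} \in \mathbb{F}\}.$$

•

Thus $\mathbb{F}(\alpha)$ is the linear span over $\mathbb{F}$ of the powers

$$1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$$

and so is a vector subspace of $\mathbb{C}$ over $\mathbb{F}$. In this section we show that $\mathbb{F}(\alpha)$ is a field; indeed it is the smallest field containing $\mathbb{F}$ and $\alpha$. These results are proved in Theorems 3.2.6 and 3.2.8.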

Why did we stop at the power $\alpha^{n-1}$? The answer is given by the following result, which shows that any further powers would be redundant.

3.2.2 Proposition. The set $\mathbb{F}(\alpha)$ contains all the remaining positive powers of $\alpha$:

$$\alpha^n, \alpha^{n+1}, \alpha^{n+2}, \ldots$$

Proof. Since $n = \deg(\alpha, \mathbb{F})$, $\alpha$ is a zero of a monic polynomial of degree $n$ in $\mathbb{F}[X]$. Hence

$$c_0 + c_1\alpha + \cdots + c_{n-1}\alpha^{n-1} + \alpha^n = 0$$

for some coefficients $c_0, c_1, \ldots, c_{n-1} \in \mathbb{F}$. Thus

$$\alpha^n = -c_0 - c_1\alpha - \cdots - c_{n-1}\alpha^{n-1} \tag{1}$$

and each $-c_i \in \mathbb{F}$, as $\mathbb{F}$ is a field. Hence, by the definition of $\mathbb{F}(\alpha)$ (Definition 3.2.1),

$$\alpha^n \in \mathbb{F}(\alpha).$$

If we multiply both sides of (1) by $\alpha$ we see that

$$\alpha^{n+1} = -c_0\alpha - c_1\alpha^2 - \cdots - c_{n-1}\alpha^n. \tag{2}$$

Now $\alpha, \ldots, \alpha^{n-1}$ are all in $\mathbb{F}(\alpha)$ (by definition) and we have just shown that $\alpha^n \in \mathbb{F}(\alpha)$. Thus, because $\mathbb{F}(\alpha)$ is a vector subspace of $\mathbb{C}$, it follows from (2) that $\alpha^{n+1} \in \mathbb{F}(\alpha)$. Now multiply both sides of (1) by $\alpha^2$ and then proceed in the same way, thereby showing that $\alpha^{n+2} \in \mathbb{F}(\alpha)$.

It is clear that by proceeding in this way we can show that the powers

$$\alpha^n, \alpha^{n+1}, \alpha^{n+2}, \ldots$$

all belong to $\mathbb{F}(\alpha)$. •
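The reduction step in this proof is mechanical enough to run. The Python sketch below stores an element $b_0 + b_1\alpha + \cdots + b_{n-1}\alpha^{n-1}$ as its list of coordinates and rewrites $\alpha^n$ using equation (1). The function names are hypothetical, and the example $\alpha = \sqrt[3]{2}$ with the monic polynomial $X^3 - 2$ is an assumption chosen for illustration; the code only uses the fact that $\alpha$ is a zero of the given monic polynomial.

```python
from fractions import Fraction

def times_alpha(v, c):
    """Multiply b_0 + b_1*alpha + ... + b_{n-1}*alpha^(n-1) by alpha.

    v holds b_0, ..., b_{n-1}; c holds c_0, ..., c_{n-1} from the monic
    polynomial c_0 + c_1 X + ... + c_{n-1} X^(n-1) + X^n with alpha as a zero.
    """
    top = v[-1]                       # coefficient of alpha^(n-1)
    shifted = [Fraction(0)] + v[:-1]  # every power moves up by one
    # alpha^n is rewritten as -c_0 - c_1*alpha - ... - c_{n-1}*alpha^(n-1)
    return [s - top * ci for s, ci in zip(shifted, c)]

def power_of_alpha(k, c):
    """Coordinates of alpha^k with respect to 1, alpha, ..., alpha^(n-1)."""
    n = len(c)
    v = [Fraction(1)] + [Fraction(0)] * (n - 1)   # the number 1
    for _ in range(k):
        v = times_alpha(v, c)
    return v

# Assumed example: alpha is the real cube root of 2, a zero of X^3 - 2.
c = [Fraction(-2), Fraction(0), Fraction(0)]
for k in range(3, 7):
    print(k, power_of_alpha(k, c))
```

Each printed list has length 3, so every power shown is a linear combination of $1, \alpha, \alpha^2$ with rational coefficients, in line with the proposition.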

Thus, in our definition of $\mathbb{F}(\alpha)$ we have, in some sense, included "enough" powers of $\alpha$. Have we included "too many"? An answer can be deduced from the following result.

3.2.3 Proposition. If $n$ is the degree of $\alpha$ over $\mathbb{F}$, then the set of vectors $\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$ is linearly independent over $\mathbb{F}$.

Proof. Suppose on the contrary that this set is linearly dependent; so there are scalars $c_0, c_1, \ldots, c_{n-1} \in \mathbb{F}$, not all zero, such that

$$c_0 + c_1\alpha + \cdots + c_{n-1}\alpha^{n-1} = 0.$$

Among the coefficients $c_0, c_1, \ldots, c_{n-1}$ pick the one, say $c_i$, farthest down the list which is nonzero. Dividing by this coefficient gives

$$\left(\frac{c_0}{c_i}\right) + \left(\frac{c_1}{c_i}\right)\alpha + \cdots + \left(\frac{c_{i-1}}{c_i}\right)\alpha^{i-1} + \alpha^i = 0$$

where $i \le n - 1$, and the coefficients are still in the field $\mathbb{F}$. Hence $\alpha$ is a zero of a monic polynomial in $\mathbb{F}[X]$ with degree smaller than $n$ (which was the least possible degree for such polynomials). This contradiction shows that our initial assumption was false. •

Suppose we had tried to take

$$S = \{b_0 + b_1\alpha + \cdots + b_{n-2}\alpha^{n-2} : b_0, \ldots, b_{n-2} \in \mathbb{F}\}$$

as a candidate for a ring (or field) containing $\mathbb{F}$ and $\alpha$. Then (if $n > 2$), $\alpha$ and $\alpha^{n-2}$ are both in $S$ but their product $\alpha^{n-1}$ is not in $S$ by the above proposition. Thus $S$ is too small and so we need all the powers $1, \alpha, \ldots, \alpha^{n-1}$ in $\mathbb{F}(\alpha)$ for it to be a ring or a field.

3.2.4 Theorem. [Basis for $\mathbb{F}(\alpha)$ Theorem] Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha \in \mathbb{C}$ be algebraic over $\mathbb{F}$ with $\deg(\alpha, \mathbb{F}) = n$. Then the set of vectors $\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$ is a basis for the vector space $\mathbb{F}(\alpha)$ over $\mathbb{F}$. In particular this vector space has dimension $n$, the degree of $\alpha$ over $\mathbb{F}$.

Proof. By Definition 3.2.1, this set of vectors spans the vector space $\mathbb{F}(\alpha)$ over $\mathbb{F}$ and, by Proposition 3.2.3, the set of vectors is linearly independent. Hence it is a basis for the vector space and the number $n$ of elements in the basis is the dimension of the vector space. •

As $\mathbb{F}(\alpha)$ is a subset of $\mathbb{C}$, its elements can be multiplied together. This leads to the following proposition.

3.2.5 Proposition. $\mathbb{F}(\alpha)$ is a ring.

Proof. To show $\mathbb{F}(\alpha)$ is a subring of $\mathbb{C}$, it is sufficient to show it is closed under addition and multiplication. The former is obvious while the latter will now be shown. Consider any two elements in $\mathbb{F}(\alpha)$, say

$$p = b_0 + b_1\alpha + b_2\alpha^2 + \cdots + b_{n-1}\alpha^{n-1} \in \mathbb{F}(\alpha)$$

and

$$q = c_0 + c_1\alpha + c_2\alpha^2 + \cdots + c_{n-1}\alpha^{n-1} \in \mathbb{F}(\alpha),$$

where the $b$'s and $c$'s belong to $\mathbb{F}$. If we multiply these two elements together, we get a linear combination of powers of $\alpha$, with coefficients still in the field $\mathbb{F}$. Because all positive powers of $\alpha$ are in $\mathbb{F}(\alpha)$ (by Proposition 3.2.2) and because $\mathbb{F}(\alpha)$ is closed under addition and scalar multiplication, it follows that the product $pq$ is also in $\mathbb{F}(\alpha)$. •

At long last we are in a position to establish that $\mathbb{F}(\alpha)$ is a field.

3.2.6 Theorem. Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha \in \mathbb{C}$ be algebraic over $\mathbb{F}$. Then $\mathbb{F}(\alpha)$ is a field.

Proof. In view of Proposition 3.2.5, what remains to be shown is that $1/\beta$ is in $\mathbb{F}(\alpha)$ for every nonzero $\beta$ in $\mathbb{F}(\alpha)$. So let $\beta$ be a nonzero number in $\mathbb{F}(\alpha)$. Firstly note that

$$\{1, \beta, \beta^2, \ldots, \beta^n\}$$

is a set of $n + 1$ numbers which are all in $\mathbb{F}(\alpha)$, since $\mathbb{F}(\alpha)$ is closed under multiplication. Because $\mathbb{F}(\alpha)$ is an $n$-dimensional vector space over $\mathbb{F}$, this set must be linearly dependent, which means that there are scalars $d_0, d_1, \ldots, d_k$ in $\mathbb{F}$ (not all zero) with $k \le n$ such that

$$d_0 + d_1\beta + \cdots + d_k\beta^k = 0. \tag{3}$$

If $d_0 = 0$ in (3), we could divide by $\beta$ (or multiply by $1/\beta$) to reduce the number of terms in (3). By repeating this if necessary, we see that (3) can be assumed to hold with $d_0 \ne 0$. If we multiply (3) by $-1/d_0$ we have

$$-1 + e_1\beta + e_2\beta^2 + \cdots + e_k\beta^k = 0,$$

where each $e_i = -d_i/d_0$ is in $\mathbb{F}$ since $\mathbb{F}$ is a field. Thus

$$1 = e_1\beta + e_2\beta^2 + \cdots + e_k\beta^k = \beta(e_1 + e_2\beta + \cdots + e_k\beta^{k-1}).$$

It follows that

$$1/\beta = e_1 + e_2\beta + \cdots + e_k\beta^{k-1},$$

which is in $\mathbb{F}(\alpha)$ since the $e_i$ and $\beta^i$ are in $\mathbb{F}(\alpha)$ and $\mathbb{F}(\alpha)$ is a ring. •


In simple cases the method used to prove Theorem 3.2.6 can be used to find a formula for $1/\beta$, as the following example shows.

3.2.7 Example. In $\mathbb{Q}(\sqrt{2})$, if $\beta = 1 + 2\sqrt{2}$ then $\beta^2 = 9 + 4\sqrt{2}$ and it is easy to see that $\beta^2 - 2\beta - 7 = 0$. This can be rewritten as $\beta(\beta - 2) = 7$ so that

$$1/\beta = (\beta - 2)/7 = -\frac{1}{7} + \frac{2}{7}\sqrt{2}.$$

•
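The arithmetic in Example 3.2.7 can be checked exactly. This Python sketch stores $a + b\sqrt{2}$ as the pair $(a, b)$ of rationals; the helper `mul` is a hypothetical name introduced for the check, not notation from the text.

```python
from fractions import Fraction

def mul(x, y):
    """Product of a + b*sqrt(2) and c + d*sqrt(2), each stored as a pair."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)

beta = (Fraction(1), Fraction(2))            # 1 + 2*sqrt(2)
beta_sq = mul(beta, beta)
print(beta_sq)                               # coordinates of beta^2

# beta^2 - 2*beta - 7, computed coordinate by coordinate
relation = (beta_sq[0] - 2 * beta[0] - 7, beta_sq[1] - 2 * beta[1])
print(relation)

# 1/beta = (beta - 2)/7, as read off from beta*(beta - 2) = 7
inv = ((beta[0] - 2) / 7, beta[1] / 7)
print(inv, mul(beta, inv))
```

The relation comes out as the zero pair and the final product as the pair representing 1, matching the formula $1/\beta = -\frac{1}{7} + \frac{2}{7}\sqrt{2}$.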

3.2.8 Theorem. [Smallest Field Theorem] Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha \in \mathbb{C}$ be algebraic over $\mathbb{F}$. Then $\mathbb{F}(\alpha)$ is the smallest field containing $\alpha$ and all the numbers in $\mathbb{F}$.

Proof. This is analogous to the proof of Proposition 3.1.5. •

The Smallest Field Theorem gives a way of describing $\mathbb{F}(\alpha)$ which is just as useful for many purposes as the definition of $\mathbb{F}(\alpha)$ (Definition 3.2.1). Hence, in solving problems involving $\mathbb{F}(\alpha)$ you should keep both Definition 3.2.1 and Theorem 3.2.8 in mind and then use whichever seems more appropriate for the particular problem.

In some books the roles of our Smallest Field Theorem and our Definition 3.2.1 are reversed, the statement of the former appearing as a definition and the latter as a theorem. Our treatment is perhaps more concrete and so hopefully easier for those studying the topic for the first time. Our approach assumes, however, that $\alpha$ is algebraic over $\mathbb{F}$. If this is not the case, our definition of $\mathbb{F}(\alpha)$ is not applicable. Hence, in such cases, we would use the statement of Theorem 3.2.8 as our definition.

The following example is a typical application of the Smallest Field Theorem.

3.2.9 Example. $\mathbb{F}(\alpha^2) \subseteq \mathbb{F}(\alpha)$ for each subfield $\mathbb{F}$ of $\mathbb{C}$ and each $\alpha \in \mathbb{C}$.

Proof. Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha \in \mathbb{C}$. By the Smallest Field Theorem, $\mathbb{F}(\alpha)$ contains $\alpha$ and $\mathbb{F}$. Hence $\mathbb{F}(\alpha)$ contains $\alpha \cdot \alpha = \alpha^2$ since, being a field, $\mathbb{F}(\alpha)$ is closed under multiplication. This shows that $\mathbb{F}(\alpha)$ is a field containing $\alpha^2$ and $\mathbb{F}$. But $\mathbb{F}(\alpha^2)$ is the smallest such field. Therefore $\mathbb{F}(\alpha^2) \subseteq \mathbb{F}(\alpha)$, as required. •

Exercises 3.2

1. What does the definition of $\mathbb{F}(\alpha)$ say in each of the following cases?
   (i) $\deg(\alpha, \mathbb{F}) = 1$,
   (ii) $\deg(\alpha, \mathbb{F}) = 2$,
   (iii) $\deg(\alpha, \mathbb{F}) = 3$.

2. (a) What does the definition of $\mathbb{F}(\alpha)$ tell you in the special case where $\mathbb{F} = \mathbb{Q}$ and $\alpha = \sqrt{3}$?
   (b) What do Propositions 3.2.2, 3.2.3 and Theorem 3.2.4 tell you in this special case?
   (c) Verify directly that
       (i) the set $\{1, \sqrt{3}\}$ is linearly independent over $\mathbb{Q}$,
       (ii) $\mathbb{Q}(\sqrt{3})$ contains the product $(2 + \sqrt{3})(3 + \sqrt{3})$ and the quotient $(2 + \sqrt{3})/(3 + \sqrt{3})$.

3. What is the degree of $\sqrt{2}$ over $\mathbb{R}$? Which familiar field is $\mathbb{R}(\sqrt{2})$?

4. What is the degree of $i$ over $\mathbb{R}$? Which familiar field is $\mathbb{R}(i)$?

5. (a) State why $1 + \sqrt{2} \in \mathbb{Q}(\sqrt{2})$ and why $\sqrt{2} \in \mathbb{Q}(1 + \sqrt{2})$.
   (b) Prove that $\mathbb{Q}(1 + \sqrt{2}) = \mathbb{Q}(\sqrt{2})$ by showing that each field is a subset of the other. (You may use the Smallest Field Theorem.)

6. (a) Write down the irreducible polynomial of $\sqrt{3}$ over each of the following fields:
       (i) $\mathbb{Q}$, (ii) $\mathbb{R}$, (iii) $\mathbb{Q}(\sqrt{3})$.
   (b) State the degree of $\sqrt{3}$ over each of these fields.

7. Find $\operatorname{irr}(\sqrt{2}, \mathbb{Q}(1 + \sqrt{2}))$ and $\deg(\sqrt{2}, \mathbb{Q}(1 + \sqrt{2}))$. (Justify your answer.)

8. (a) Prove that $\sqrt{2} \notin \mathbb{Q}(\sqrt{3})$.
   (b) Show that $\sqrt{2}$ is algebraic over $\mathbb{Q}(\sqrt{3})$.
   (c) Write down $\operatorname{irr}(\sqrt{2}, \mathbb{Q}(\sqrt{3}))$.
   (d) Justify your answer to (c). (Use (a).)

9. (a) What does your answer to Exercise 8(c) tell you about $\deg(\sqrt{2}, \mathbb{Q}(\sqrt{3}))$?
   (b) Let $\mathbb{F}$ denote $\mathbb{Q}(\sqrt{3})$. Use (a) to write down a basis for the vector space $\mathbb{F}(\sqrt{2})$ over $\mathbb{F}$.

10. Complete the proof of Proposition 3.2.5 by showing that $\mathbb{F}(\alpha)$ is closed under addition.

11. Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha, \beta \in \mathbb{C}$.
    (a) Prove that if $\mathbb{F}(\beta) \subseteq \mathbb{F}(\alpha)$ then $\beta \in \mathbb{F}(\alpha)$.
    (b) Prove the converse of the result stated in part (a). [Hint. Use the Smallest Field Theorem.]
    (c) Deduce that $\mathbb{F}(\alpha) = \mathbb{F}(\beta)$ if and only if $\alpha \in \mathbb{F}(\beta)$ and $\beta \in \mathbb{F}(\alpha)$.

12. Let $\beta = 1 + \sqrt[3]{2}$. Follow the method used in the proof of Theorem 3.2.6 and in Example 3.2.7 to obtain a formula in $\mathbb{Q}(\sqrt[3]{2})$ for $1/\beta$.

13. Let $\mathbb{F}$ be a subfield of $\mathbb{C}$ and let $\alpha$ be a nonzero complex number. Show that if $1/\alpha$ can be expressed in the form

    $$d_0 + d_1\alpha + \cdots + d_k\alpha^k$$

    for $k \ge 1$ and $d_i \in \mathbb{F}$, then $\alpha$ is algebraic over $\mathbb{F}$.
    [Remark. It follows that if $\mathbb{F}(\alpha)$ is defined to be the linear span of all positive powers $1, \alpha, \alpha^2, \ldots$ of $\alpha$ (much as in Definition 3.2.1) then $\mathbb{F}(\alpha)$ is a field precisely when $\alpha$ is algebraic over $\mathbb{F}$.]

14.* Find all the fields $\mathbb{F}$ such that $\mathbb{F}(\sqrt{2}) = \mathbb{Q}(\sqrt{2})$. (Justify your answer.)

15.* Describe the field $\mathbb{Q}(\pi)$ as explicitly as possible. [Note that $\pi$ is not algebraic over $\mathbb{Q}$ and so Definition 3.2.1 is not applicable. Instead, $\mathbb{Q}(\pi)$ is defined as the smallest subfield of $\mathbb{C}$ which contains $\mathbb{Q}$ and $\pi$.]

3.3 Iterating the Construction

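The starting point for the previous section was a field $\mathbb{F} \subseteq \mathbb{C}$ and a number $\alpha \in \mathbb{C}$ which was algebraic over $\mathbb{F}$. From these ingredients a new field was constructed,

$$\mathbb{F}(\alpha) \subseteq \mathbb{C}.$$

We can now take the field $\mathbb{F}(\alpha)$ and a number $\beta \in \mathbb{C}$ which is algebraic over $\mathbb{F}(\alpha)$ as the starting point for a further application of the construction process to get a further new field

$$\mathbb{F}(\alpha)(\beta) \subseteq \mathbb{C}.$$

This process can be repeated as often as we like to give a "tower" of fields, each inside the next,

$$\mathbb{Q} \subseteq \mathbb{F} \subseteq \mathbb{F}(\alpha) \subseteq \mathbb{F}(\alpha)(\beta) \subseteq \mathbb{F}(\alpha)(\beta)(\gamma) \subseteq \cdots \subseteq \mathbb{C}.$$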
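3.3.1 Example. $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ is the linear span over $\mathbb{Q}(\sqrt{2})$ of the set of vectors $\{1, \sqrt{3}\}$. This set, furthermore, is a basis for the vector space $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ over $\mathbb{Q}(\sqrt{2})$.

Proof. First note that by the solution to Example 3.1.6,

$$\operatorname{irr}(\sqrt{3}, \mathbb{Q}(\sqrt{2})) = X^2 - 3$$

and hence $\deg(\sqrt{3}, \mathbb{Q}(\sqrt{2})) = 2$. It now follows from Definition 3.2.1 that

$$\mathbb{Q}(\sqrt{2})(\sqrt{3}) = \{x + \sqrt{3}\,y : x, y \in \mathbb{Q}(\sqrt{2})\},$$

which is just the linear span of the set of vectors $\{1, \sqrt{3}\}$ over $\mathbb{Q}(\sqrt{2})$. That this set of vectors forms a basis follows from Theorem 3.2.4. •

The tower of fields

$$\mathbb{Q} \subseteq \mathbb{Q}(\sqrt{2}) \subseteq \mathbb{Q}(\sqrt{2})(\sqrt{3})$$

invites us to consider $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ as a vector space over $\mathbb{Q}$ also. This leads to the following example.

3.3.2 Example. $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ is the linear span over $\mathbb{Q}$ of the set of vectors $\{1, \sqrt{2}, \sqrt{3}, \sqrt{2}\sqrt{3}\}$.

Proof. By the previous example

$$\mathbb{Q}(\sqrt{2})(\sqrt{3}) = \{x + \sqrt{3}\,y : x, y \in \mathbb{Q}(\sqrt{2})\}$$
$$= \{(a + b\sqrt{2}) + \sqrt{3}(c + d\sqrt{2}) : a, b, c, d \in \mathbb{Q}\}$$
$$= \{a + b\sqrt{2} + c\sqrt{3} + d\sqrt{2}\sqrt{3} : a, b, c, d \in \mathbb{Q}\}$$

which expresses it as the required linear span. •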

One might guess from the above example that the set of vectors $\{1, \sqrt{2}, \sqrt{3}, \sqrt{2}\sqrt{3}\}$ is in fact a basis for the vector space $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ over $\mathbb{Q}$. One of the aims of the next section is to prove a theorem from which this result will follow very easily.

Exercises 3.3

1. Find a spanning set for $\mathbb{Q}(\sqrt{2})(i)$
   (a) as a vector space over $\mathbb{Q}(\sqrt{2})$,
   (b) as a vector space over $\mathbb{Q}$.

2. Simplify $\mathbb{F}(\alpha)(1 + \alpha)$, where $\mathbb{F}$ is a subfield of $\mathbb{C}$ and $\alpha \in \mathbb{C}$. (Justify your answer.)

3. Write down three different extension fields of $\mathbb{Q}$ which contain $\sqrt{2}$.

4. If $\mathbb{F}$ is a subfield of $\mathbb{C}$ and $\alpha, \beta \in \mathbb{C}$ then $\mathbb{F}(\alpha, \beta)$ is defined to be the smallest subfield of $\mathbb{C}$ which contains all numbers in $\mathbb{F}$ and also the numbers $\alpha$ and $\beta$. Use this definition to show that, if $\alpha$ and $\beta$ are algebraic over $\mathbb{F}$, then

   $$\mathbb{F}(\alpha)(\beta) = \mathbb{F}(\alpha, \beta) = \mathbb{F}(\beta)(\alpha)$$

   where, as in this section, $\mathbb{F}(\alpha)(\beta)$ means the smallest field containing all numbers in the field $\mathbb{F}(\alpha)$ and the number $\beta$.

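3.4 Towers of Fields

The aim of this section is to produce a theorem which is of vital importance in questions involving a tower

$$\mathbb{E} \subseteq \mathbb{F} \subseteq \mathbb{K}$$

consisting of three distinct subfields $\mathbb{E}$, $\mathbb{F}$ and $\mathbb{K}$ of $\mathbb{C}$. Implicit in this set-up are three different vector spaces and the theorem to be proved will give a precise relationship between their dimensions.

The three vector spaces arising from the tower are as follows. Since $\mathbb{E}$ is a subfield of $\mathbb{F}$, we may take $\mathbb{F}$ as the vectors and $\mathbb{E}$ as the scalars to give the vector space

(i) $\mathbb{F}$ over $\mathbb{E}$.

Likewise there are the vector spaces

(ii) $\mathbb{K}$ over $\mathbb{F}$, and
(iii) $\mathbb{K}$ over $\mathbb{E}$.

It may be helpful to see all these vector spaces together on a single diagram, as shown below.

[Diagram: the tower $\mathbb{E} \subseteq \mathbb{F} \subseteq \mathbb{K}$, labelled with the three vector spaces $\mathbb{F}$ over $\mathbb{E}$, $\mathbb{K}$ over $\mathbb{F}$, and $\mathbb{K}$ over $\mathbb{E}$.]

Here $\mathbb{F}$ plays a schizophrenic role: looking back we regard its elements as vectors, but looking forward we take them as scalars. The theorem on dimensions will follow easily from the following theorem.

3.4.1 Theorem. [Basis for a Tower Theorem] Consider a tower of subfields of $\mathbb{C}$,

$$\mathbb{E} \subseteq \mathbb{F} \subseteq \mathbb{K}.$$

If the vector space $\mathbb{F}$ over $\mathbb{E}$ has a basis $\{\alpha_1, \ldots, \alpha_m\}$ and the vector space $\mathbb{K}$ over $\mathbb{F}$ has a basis $\{\beta_1, \ldots, \beta_n\}$, then the set of vectors

$$\{\alpha_1\beta_1, \ldots, \alpha_m\beta_1, \ldots, \alpha_1\beta_n, \ldots, \alpha_m\beta_n\}$$

forms a basis for the vector space $\mathbb{K}$ over $\mathbb{E}$.

Proof. The following diagram may help you to keep track of where the vectors come from.

[Diagram: the tower $\mathbb{E} \subseteq \mathbb{F} \subseteq \mathbb{K}$, with $\alpha_j \in \mathbb{F}$, $\beta_i \in \mathbb{K}$ and $\alpha_j\beta_i \in \mathbb{K}$.]

Firstly we show that the given set of vectors spans the vector space $\mathbb{K}$ over $\mathbb{E}$. To do this, let $k \in \mathbb{K}$.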

Our aim is to prove that $k$ is a linear combination of the $\alpha_j\beta_i$'s with coefficients in $\mathbb{E}$.

Because the $\beta$'s form a basis for $\mathbb{K}$ over $\mathbb{F}$, there exist scalars $f_1, \ldots, f_n \in \mathbb{F}$ such that

$$k = \sum_{i=1}^{n} f_i\beta_i. \tag{1}$$

But since the $\alpha$'s form a basis for $\mathbb{F}$ over $\mathbb{E}$, there exist scalars $e_{ij}$ ($1 \le i \le n$, $1 \le j \le m$) in $\mathbb{E}$ such that

$$f_i = \sum_{j=1}^{m} e_{ij}\alpha_j. \tag{2}$$

Substituting (2) in (1) gives

$$k = \sum_{i=1}^{n}\left(\sum_{j=1}^{m} e_{ij}\alpha_j\right)\beta_i = \sum_{i=1}^{n}\sum_{j=1}^{m} e_{ij}(\alpha_j\beta_i).$$

Thus the $\alpha_j\beta_i$'s span the vector space $\mathbb{K}$ over $\mathbb{E}$.

Secondly we show that these vectors are linearly independent over $\mathbb{E}$. Suppose $e_{ij}$ ($1 \le i \le n$, $1 \le j \le m$) are elements of $\mathbb{E}$ such that

$$\sum_{i=1}^{n}\sum_{j=1}^{m} e_{ij}\alpha_j\beta_i = 0. \tag{3}$$

Our aim is to prove that all the $e$'s are zero.

We can rewrite (3) as

$$\sum_{i=1}^{n}\left(\sum_{j=1}^{m} e_{ij}\alpha_j\right)\beta_i = 0.$$

But in this sum, the coefficients of the $\beta_i$'s (each coefficient is a sum of the $e_{ij}\alpha_j$'s) are elements of $\mathbb{F}$ and the $\beta$'s are linearly independent over $\mathbb{F}$. Hence, for $1 \le i \le n$,

$$\sum_{j=1}^{m} e_{ij}\alpha_j = 0.$$

But the $\alpha$'s are linearly independent over $\mathbb{E}$, which means that, for $1 \le j \le m$, $e_{ij} = 0$. This establishes the required linear independence.

Thus the $\alpha_j\beta_i$'s span the vector space $\mathbb{K}$ over $\mathbb{E}$ and are linearly independent over $\mathbb{E}$. Hence they form a basis for the vector space $\mathbb{K}$ over $\mathbb{E}$. •

3.4.2 Example. The vector space $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ over $\mathbb{Q}$ has the set of vectors

$$\{1, \sqrt{2}, \sqrt{3}, \sqrt{2}\sqrt{3}\}$$

as a basis.

Proof. Consider the tower

$$\mathbb{Q} \subseteq \mathbb{Q}(\sqrt{2}) \subseteq \mathbb{Q}(\sqrt{2})(\sqrt{3}).$$

By the Basis for $\mathbb{F}(\alpha)$ Theorem (Theorem 3.2.4) and Example 2.3.3, the vector space $\mathbb{Q}(\sqrt{2})$ over $\mathbb{Q}$ has a basis $\{1, \sqrt{2}\}$ while by the same theorem and Example 3.1.6, the vector space $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ over $\mathbb{Q}(\sqrt{2})$ has a basis $\{1, \sqrt{3}\}$. Hence, by the Basis for a Tower Theorem (Theorem 3.4.1), the vector space $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ over $\mathbb{Q}$ has the basis

$$\{1, \sqrt{2}, \sqrt{3}, \sqrt{2}\sqrt{3}\}.$$

•
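Once a basis is known, every element of $\mathbb{Q}(\sqrt{2})(\sqrt{3})$ is determined by four rational coordinates, and multiplication reduces to a table of products of basis vectors. The Python sketch below writes that table out for the basis of Example 3.4.2; the names `TABLE` and `mul` are hypothetical and serve only this illustration.

```python
from fractions import Fraction

# Basis order: 1, s2 = sqrt(2), s3 = sqrt(3), s6 = sqrt(2)*sqrt(3).
# TABLE[i][j] = (scalar, k) means basis[i] * basis[j] = scalar * basis[k].
TABLE = [
    [(1, 0), (1, 1), (1, 2), (1, 3)],   # 1  times 1, s2, s3, s6
    [(1, 1), (2, 0), (1, 3), (2, 2)],   # s2 times 1, s2, s3, s6
    [(1, 2), (1, 3), (3, 0), (3, 1)],   # s3 times 1, s2, s3, s6
    [(1, 3), (2, 2), (3, 1), (6, 0)],   # s6 times 1, s2, s3, s6
]

def mul(x, y):
    """Multiply two elements given by their four rational coordinates."""
    out = [Fraction(0)] * 4
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            scalar, k = TABLE[i][j]
            out[k] += scalar * xi * yj
    return out

x = [Fraction(1), Fraction(1), Fraction(0), Fraction(0)]   # 1 + sqrt(2)
y = [Fraction(0), Fraction(1), Fraction(1), Fraction(0)]   # sqrt(2) + sqrt(3)
print(mul(x, y))   # coordinates of (1 + sqrt(2))(sqrt(2) + sqrt(3))
print(mul(y, y))   # coordinates of (sqrt(2) + sqrt(3))^2
```

Because the four vectors form a basis, the coordinate list printed for each product is the only one that represents it.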

By counting the numbers of elements in the bases for the vector spaces occurring in Theorem 3.4.1 we can get a relationship between their dimensions. This gives us the following very useful theorem.

3.4.3 Theorem. [Dimension for a Tower Theorem] Consider a tower of subfields of $\mathbb{C}$,

$$\mathbb{E} \subseteq \mathbb{F} \subseteq \mathbb{K}.$$

If the vector spaces $\mathbb{F}$ over $\mathbb{E}$ and $\mathbb{K}$ over $\mathbb{F}$ have finite dimension then so does $\mathbb{K}$ over $\mathbb{E}$ and

$$[\mathbb{K} : \mathbb{E}] = [\mathbb{K} : \mathbb{F}][\mathbb{F} : \mathbb{E}]$$

or, in alternative notation, $\dim_{\mathbb{E}} \mathbb{K}$ ...
$|a|$ and $a \ne 0$, then $b$ cannot be a factor of $a$.

Another consequence of the definition of a factor is that if an integer $b$ is a factor of two integers, then it is also a factor of their sum, their difference and their product. Hence if $b$ is a factor of an integer $a_1$ but $b$ is not a factor of an integer $a_2$, then $b$ is not a factor of $a_1 + a_2$, as otherwise $b$ would be a factor of the difference $(a_1 + a_2) - a_1 = a_2$.

Recall that an integer $p$ is said to be a prime if $p > 1$ and if $p$ has no positive factors other than 1 and $p$ itself. The Fundamental Theorem of Arithmetic is stated in Exercises 1.4 #9. It can be shown from this theorem that if a prime number $p$ is a factor of the product $ab$ of two integers $a$ and $b$, then $p$ must be a factor of $a$ or a factor of $b$ (or a factor of both $a$ and $b$). From this it is easy to deduce another useful test for failure to be a factor of a product:

if a prime $p$ is not a factor of any of the integers $a_1, a_2, \ldots, a_m$ then it is not a factor of their product $a_1a_2\cdots a_m$.

Another fact which we shall need later is given by the following proposition, whose proof was known to Euclid.

7.1.1 Proposition. There are infinitely many prime numbers.

Proof. To prove this we note that there is at least one prime number (2, for example) and we shall prove that given any finite set of prime numbers

$$\{p_1, p_2, \ldots, p_n\} \tag{1}$$

there is always another prime number which is not in the set. To this end consider the integer

$$K = p_1p_2\cdots p_n + 1. \tag{2}$$

Each $p_i$ is a factor of the first term on the right-hand side of (2), but not of the second term, 1. Hence $p_i$ is not a factor of $K$. But $K$ does have a prime factor $q$ by the Fundamental Theorem of Arithmetic. Thus there is a prime $q$ which is not in the set (1), as we wished to show. •

This means that, given any integer $m$, no matter how large, we can be sure that there is a prime (indeed, infinitely many primes) larger than $m$. We use this fact frequently in this chapter.
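Euclid's argument is constructive, and a short Python sketch makes that visible: from a finite list of primes it forms $K$ as in (2) and returns a prime factor of $K$, which cannot be in the list. The function names are hypothetical, and trial division is used only because it is the simplest way to find a prime factor of a small number.

```python
from math import prod

def smallest_prime_factor(k):
    """Smallest prime factor of an integer k > 1, found by trial division."""
    d = 2
    while k >= d * d:
        if k % d == 0:
            return d
        d += 1
    return k          # no divisor up to sqrt(k), so k itself is prime

def new_prime(primes):
    """A prime that is not in the given nonempty finite list of primes."""
    k = prod(primes) + 1     # the integer K of equation (2)
    return smallest_prime_factor(k)

print(new_prime([2, 3, 5]))
print(new_prime([2, 3, 5, 7, 11, 13]))
```

The second call is a reminder that $K$ itself need not be prime: the proof only promises that $K$ has some prime factor outside the original set.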

Calculus

We expect that all of our readers will have met the following properties of integrals in a calculus course. (Proofs can be found in many texts: see for example Theorems 5 and 6 in Chapter 13 of [MS].)

7.1.2 Theorem. Let $f$, $g$ and $h_1, \ldots, h_n$ be functions from $\mathbb{R}$ to $\mathbb{R}$ and let $a, b \in \mathbb{R}$.

(i) [Linearity of Integration] If $d_1, \ldots, d_n \in \mathbb{R}$ and if each of the following integrals exists, then

$$\int_a^b \bigl(d_1h_1(x) + \cdots + d_nh_n(x)\bigr)\,dx = d_1\int_a^b h_1(x)\,dx + \cdots + d_n\int_a^b h_n(x)\,dx.$$

(ii) [Integration by Parts] If $f$ and $g$ are functions such that each of the following integrals exists, then

$$\int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx.$$

•

We shall use these results only in the cases where the functions are polynomial or exponential functions, or products thereof; the integrals of such functions always exist.

The proof of the next result depends on the idea of the degree of a nonzero polynomial form, which was explained in Chapter 1.

7.1.3 Lemma. Let $g(X)$ and $h(X)$ be polynomials whose coefficients are real numbers. If
$$g(x) = h(x)e^{-x} \quad \text{for all } x \in \mathbb{R} \tag{3}$$
then $g(X)$ and $h(X)$ are both equal to the zero polynomial.

Proof. It follows from (3) that $h(x) = g(x)e^{x}$ and so differentiating both sides gives
$$h'(x) = \bigl(g(x) + g'(x)\bigr)e^{x}$$
for all $x \in \mathbb{R}$. If we multiply both sides by $g(x)$ and then replace the factor $e^{x}g(x)$ on the right-hand side by $h(x)$, we find that
$$h'(x)g(x) = \bigl(g(x) + g'(x)\bigr)e^{x}g(x) = g(x)h(x) + g'(x)h(x) \quad \text{for all } x \in \mathbb{R}.$$
Rearranging this equation and interpreting it in terms of polynomial forms gives
$$g(X)h(X) = h'(X)g(X) - g'(X)h(X). \tag{4}$$
If either of the polynomials $g(X)$ or $h(X)$ is zero, then so is the other by (3); hence it only remains to prove the lemma in the case where both $g(X)$ and $h(X)$ are nonzero polynomials. Now the degree of $g'(X)$ is either undefined or is one less than the degree of $g(X)$, and similarly the degree of $h'(X)$ is either undefined or is one less than the degree of $h(X)$. Hence the equation (4) gives a contradiction because the degree of its right side either is undefined or is less than the degree of $g(X)h(X)$. Thus the case in which both of the polynomials $g(X)$ and $h(X)$ are nonzero cannot occur. •


The following lemma concerning the limit of a sequence will also be needed.

7.1.4 Lemma. For each real number $c \ge 0$,
$$\lim_{n \to \infty} \frac{c^n}{n!} = 0.$$

Proof. Let $m$ be an integer such that $m > 1$ and $m$ … integers $n \ge m$, …

0, we can find integers $M$ and $M_1$ and a number $\epsilon_1$ such that $|\epsilon_1| < \epsilon$ and
$$a = \frac{M_1}{M} + \epsilon_1. \tag{7}$$
This says that there is a rational number $M_1/M$ at a distance of $|\epsilon_1| < \epsilon$ from the given real number $a$.

For our purposes, a more relevant type of approximation is suggested by the following question: Given a "small" number $\epsilon$, can we find integers $M$ and $M_1$ and a nonzero number $\epsilon_1$ with $|\epsilon_1| < \epsilon$ such that
$$a = \frac{M_1 + \epsilon_1}{M}\,? \tag{8}$$
Thus we are asking not only that the distance from $a$ to $M_1/M$ be small, but also that it be small even when multiplied by $M$, the denominator of the approximating fraction.



7.1.8 Example. It is easy to see that the number $e$ can be approximated as in (8). To see this first note that the well-known Taylor series for the exponential function gives
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \dots + \frac{1}{n!} + \frac{1}{(n+1)!} + \cdots.$$
For each integer $n \ge 1$, let
$$A_n = 1 + \frac{1}{1!} + \frac{1}{2!} + \dots + \frac{1}{n!}.$$
Hence
$$e = A_n + \frac{1}{(n+1)!}\left[1 + \frac{1}{n+2} + \frac{1}{(n+2)(n+3)} + \cdots\right] \le A_n + \frac{1}{(n+1)!}\left[1 + \frac{1}{2} + \frac{1}{2^2} + \cdots\right] = A_n + \frac{2}{(n+1)!}.$$
Given $\epsilon > 0$ choose $n$ to be any integer $> 2/\epsilon$, so that $2/n < \epsilon$. Put $M = n!$ and $M_1 = (n!)A_n$. You should now be able to see how to choose $\epsilon_1$ so that (8) holds. We leave the details as Exercises 7.1 #8. •

Can all real numbers $a$ be approximated as in (8) by rationals? That the answer to this question is NO is clear from the following proposition.

7.1.9 Proposition. Let $a$ be a real number with the following property: for each real number $\epsilon > 0$ there are integers $M$ and $M_1$, and a nonzero real number $\epsilon_1$ with $|\epsilon_1| < \epsilon$ such that
$$a = \frac{M_1 + \epsilon_1}{M}.$$
Then the real number $a$ is irrational.

Proof. Suppose, contrary to the conclusion of the proposition, that $a$ is a rational number, say $a = p/q$ where $p$ and $q$ are integers and $q > 0$. Let $\epsilon = 1/(2q)$. By the hypothesis of the proposition there exist integers $M$ and $M_1$, and a nonzero real number $\epsilon_1$ with $|\epsilon_1| < \epsilon$ such that
$$\frac{p}{q} = \frac{M_1 + \epsilon_1}{M} \quad \text{and hence} \quad pM - qM_1 = q\epsilon_1.$$
The left hand side of the last equation is an integer, while $0 < |q\epsilon_1| < \frac{1}{2}$. This contradiction shows that $a$ cannot be rational. •


From Example 7.1.8 and Proposition 7.1.9 it follows that $e$ is an irrational number. Note also that requiring $\epsilon_1$ to be nonzero is crucial in the proof of the above proposition. By requiring $\epsilon_1$ to be nonzero, we are not allowing the rational approximation $M_1/M$ to equal the number $a$. A very useful idea which this proposition suggests is that the "better" a real number can be approximated by rational numbers (not equal to it), the "worse" that real number must be. Here we are thinking of the rational numbers as "good", the irrational numbers which are algebraic as "bad", and the irrational numbers which are not algebraic as "worse". We shall follow up this idea in the next section.

Exercises 7.1

1. Let $n$ be a positive integer. Verify that each of the following numbers has $(n-1)!$ as a factor.
(i) $n!$.
(ii) $n! + (n+1)! + \dots + (n+m)!$ where $m$ is a positive integer.
(iii) $(2n)!$.

2. Prove, from the definition of factor, that if an integer $b$ is a factor of another integer $a$ which is not zero, then $|a| \ge |b|$.

3. Prove, from the definition of factor, that if an integer $b$ is a factor of each of the integers $a_1$ and $a_2$ then $b$ is also a factor of the integers $a_1 + a_2$, $a_1 - a_2$, and $a_1 a_2$. Is $b$ always a factor of $a_1/a_2$?

4. Let $m$ and $n$ be integers with $m > n$. In each of the following cases state whether the integer always has $m$ as a factor. (Justify your answers.)
(i) $mn - n$.
(ii) $nm - m$.

5. Let $p$ be a prime number and let $m$ be a positive integer such that $m < p$. Show that $p$ is not a factor of
(i) $m(m-1)$;
(ii) $m!$;
(iii) $(m!)^p$.

6. Let $f(x) = a_0 + a_1 x + \dots + a_n x^n$ be a polynomial function with real coefficients. The result of differentiating this polynomial $i$ times in succession is denoted by $f^{(i)}$. It is called the $i$-th derivative of the polynomial $f$. Show that $f^{(n+1)}(x) = 0$ for all $x \in \mathbb{R}$.

7. (a) Express $f(X) = 2 - 4X + 3X^2 - 7X^3$ as a polynomial in $(X - r)$. Write down the formulae for the polynomials $d_0(X), \dots, d_3(X)$ in Proposition 7.1.6.
(b) Repeat (a) with $f(X) = c_0 + c_1 X + c_2 X^2 + c_3 X^3 + c_4 X^4$ where $c_0, \dots, c_4 \in \mathbb{Z}$.

8. Complete the details of Example 7.1.8. Why is $M_1$ an integer?

9. (a) Show that the rational number 3/7 can be approximated as in (7) of Section 7.1. [Hint. What can $M_1$ and $M$ be if $\epsilon = 1/100$?]
(b) Show that every rational number can be approximated as in (7) of Section 7.1.

10. Would Proposition 7.1.9 be true if the word "nonzero" were omitted? (Justify your answer.)

11. Prove that each of the numbers $\cos(1)$ and $\sin(1)$ is irrational by using the Taylor series for the functions cos and sin.

7.2 e is Transcendental

An important property of the number $e$ is that, if $g(x) = e^x$ for all $x \in \mathbb{R}$, then $g'(x) = g(x)$ for all $x$. Indeed this fact distinguishes the number $e$ from all other positive real numbers (as shown in Exercises 7.2 #1 below). We shall also use the fact that $e^{x+y} = e^x e^y$ for all real numbers $x$ and $y$ (a property of $e$ which is shared by all positive reals).

Our proof that the number $e$ is transcendental (see Definition 2.1.5) begins by supposing, on the contrary, that $e$ is algebraic over $\mathbb{Q}$. We let
$$t(X) = a_0 + a_1 X + \dots + a_{m-1} X^{m-1} + X^m$$
denote $\operatorname{irr}(e, \mathbb{Q})$, the irreducible polynomial of $e$ over $\mathbb{Q}$. Then we have $m \ge 1$ and $a_i \in \mathbb{Q}$ for each $i$, and
$$a_0 + a_1 e + \dots + a_{m-1} e^{m-1} + e^m = 0. \tag{$*$}$$
Now clearly $a_0 \ne 0$ as otherwise $X$ would be a factor of the polynomial $t(X)$, which is irreducible (and which is not just $X$ since $e \ne 0$). If we now multiply $(*)$ by the product of the denominators of the rational numbers $a_0, a_1, \dots, a_{m-1}$ we get
$$c_0 + c_1 e + \dots + c_m e^m = 0 \tag{1}$$
where $c_0, c_1, \dots, c_m$ are integers such that $c_0 \ne 0$ and $c_m \ne 0$. In due course, we shall derive a contradiction from this.

Idea of Proof: M's and ε's

To show that (1) leads to a contradiction, we shall use the idea (mentioned in Section 7.1) that the "worse" a number is, the "better" it can be approximated by rationals. As we have seen in Example 7.1.8 and Proposition 7.1.9, the irrationality of $e$ follows very simply from the following property: For each $\epsilon > 0$ there is a nonzero number $\epsilon_1$ with $|\epsilon_1| < \epsilon$ and integers $M$ and $M_1$ such that
$$e = \frac{M_1 + \epsilon_1}{M}.$$
The aim now is to extend this "close approximation" by rationals to each of the remaining powers of $e$ in the equation (1). The aim is thus: Given $\epsilon > 0$, to show there are numbers $\epsilon_1, \dots, \epsilon_m$ all less than $\epsilon$ in absolute value, and integers $M, M_1, \dots, M_m$ such that
$$e = \frac{M_1 + \epsilon_1}{M}, \quad e^2 = \frac{M_2 + \epsilon_2}{M}, \quad \dots, \quad e^m = \frac{M_m + \epsilon_m}{M}.$$
Substituting these powers of $e$ into (1) and rearranging the terms gives
$$[c_0 M + c_1 M_1 + \dots + c_m M_m] + [c_1 \epsilon_1 + \dots + c_m \epsilon_m] = 0. \tag{2}$$
Because $M$ is an integer and so is each $c_i$ and each $M_i$, the first sum on the left side of (2) is an integer. If we can show this integer is nonzero we shall get the desired contradiction: we can choose each $\epsilon_j$ so small that the second sum on the left side of (2) has its absolute value less than $\frac{1}{2}$ and hence is unable to cancel out the first sum.


It now remains to show how to find suitable numbers $\epsilon_1, \dots, \epsilon_m$ and $M, M_1, \dots, M_m$.

Producing the ε's and the M's from integrals

The following lemma shows how an integer (namely $k!$) arises from an integral involving the exponential function. This lemma therefore suggests the possibility of using integrals to construct the required integers $M, M_1, \dots, M_m$.

7.2.1 Lemma. For each integer $k \ge 0$ there are polynomials $g_k(X)$ and $h_k(X)$ with real coefficients such that, for all $r \in \mathbb{R}$,

(a) $\displaystyle\int_0^r x^k e^{-x}\,dx = k! - e^{-r} g_k(r)$,

(b) $\displaystyle\int_0^r (x - r)^k e^{-x}\,dx = h_k(r) - e^{-r} k!$.

Proof. This can be proved by mathematical induction on $k$, using integration by parts (see Theorem 7.1.2). •

Motivated by the above lemma, we shall aim at getting the integers $M$ and $M_1, \dots, M_m$ from integrals of the form
$$\int_0^r f(x) e^{-x}\,dx \tag{3}$$
for $r \in \{1, 2, \dots, m\}$, where $f$ is a polynomial function. In the following example, we show how to apply the above lemma to such integrals.

7.2.2 Example. Let $f(X) = 3 + 4X - 10X^3$. By linearity of integration (Theorem 7.1.2), and then Lemma 7.2.1(a), for all $r \in \mathbb{R}$,
$$\begin{aligned}
\int_0^r f(x)e^{-x}\,dx &= \int_0^r (3 + 4x - 10x^3)e^{-x}\,dx \\
&= 3\int_0^r 1\cdot e^{-x}\,dx + 4\int_0^r x e^{-x}\,dx - 10\int_0^r x^3 e^{-x}\,dx \\
&= 3\bigl(0! - e^{-r}g_0(r)\bigr) + 4\bigl(1! - e^{-r}g_1(r)\bigr) - 10\bigl(3! - e^{-r}g_3(r)\bigr) \\
&= (3\cdot 0! + 4\cdot 1! - 10\cdot 3!) - e^{-r}\bigl(3g_0(r) + 4g_1(r) - 10g_3(r)\bigr)
\end{aligned}$$
where $g_0(X)$, $g_1(X)$ and $g_3(X)$ are polynomials. Hence
$$\int_0^r f(x)e^{-x}\,dx = M - e^{-r}G(r)$$
where $M = 3\cdot 0! + 4\cdot 1! - 10\cdot 3! = -53$ and $G(X)$ is the polynomial
$$G(X) = 3g_0(X) + 4g_1(X) - 10g_3(X).$$

•
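The computation in Example 7.2.2 can be checked numerically. The sketch below rests on an assumption that is not stated in the text: it uses the standard closed form $g_k(r) = \sum_{j=0}^{k} \frac{k!}{j!} r^j$ for the polynomials of Lemma 7.2.1(a). The helper names (`g`, `M_and_G`, `simpson`) are hypothetical, and Simpson's rule is used only as a convenient independent way of estimating the integral.

```python
from math import exp, factorial

def g(k: int, r: float) -> float:
    """Assumed closed form g_k(r) = sum_{j=0}^{k} (k!/j!) r^j (not taken from the text)."""
    return sum(factorial(k) // factorial(j) * r**j for j in range(k + 1))

def M_and_G(coeffs: list[int], r: float):
    """For f(x) = sum coeffs[k] x^k return (M, G(r)), where M = sum a_k k!
    and G(r) = sum a_k g_k(r)."""
    M = sum(a * factorial(k) for k, a in enumerate(coeffs))
    G = sum(a * g(k, r) for k, a in enumerate(coeffs))
    return M, G

def simpson(func, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

coeffs = [3, 4, 0, -10]                        # f(X) = 3 + 4X - 10X^3
f = lambda x: sum(a * x**k for k, a in enumerate(coeffs))
r = 2.0
M, G = M_and_G(coeffs, r)
numeric = simpson(lambda x: f(x) * exp(-x), 0.0, r)
print(M, M - exp(-r) * G, numeric)             # M is -53; the last two values agree closely
```

The value of `M` does not depend on `r`, whereas `G` does; this is the separation into a constant and an $e^{-r} \times$ polynomial part that the example describes.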

The next lemma generalizes this example.

7.2.3 Lemma. Given a polynomial $f(X)$ with real coefficients, there is a unique real number $M$ and a unique polynomial $G(X)$ with real coefficients such that, for all $r \in \mathbb{R}$,
$$\int_0^r f(x)e^{-x}\,dx = M - e^{-r}G(r).$$
Indeed $M$ and $G(X)$ are unique in the strong sense that, if $P_1(X)$ and $P_2(X)$ are polynomials with real coefficients such that
$$\int_0^r f(x)e^{-x}\,dx = P_1(r) - e^{-r}P_2(r) \quad \text{for all } r \in \mathbb{R},$$
then $P_2(X) = G(X)$ and $P_1(X) = M$ (so that $P_1(X)$ is a constant polynomial).

Proof. The existence of $M$ and $G(X)$ follows as in Example 7.2.2 by expanding $f(X)$ in powers of $X$, using linearity of integration (Theorem 7.1.2(i)) and then applying Lemma 7.2.1(a). To prove uniqueness, we assume that the two formulae in the statement of the lemma hold. Then, for all $r \in \mathbb{R}$,
$$M - e^{-r}G(r) = P_1(r) - e^{-r}P_2(r)$$
and hence $M - P_1(r) = e^{-r}\bigl(G(r) - P_2(r)\bigr)$. It follows from Lemma 7.1.3 that $M - P_1(X)$ and $G(X) - P_2(X)$ must both be the zero polynomial, which gives $P_1(X) = M$ and $P_2(X) = G(X)$, as required. •

We now use this lemma to define the numbers we need for the transcendence proof.

7.2.4 Definition. In Lemma 7.2.3, choose $f$ to be the polynomial
$$f(X) = \frac{X^{p-1}(X-1)^p (X-2)^p \cdots (X-m)^p}{(p-1)!} \tag{4}$$
where $p$ is a prime number (which we shall later take to be rather large) and $m$ is the integer occurring in (1). Further, choose

(i) $M$ and $G(X)$ to be the number and polynomial given by Lemma 7.2.3,

(ii) $M_r = G(r)$ for $r \in \{1, 2, \dots, m\}$, and

(iii) $\epsilon_r = e^r \displaystyle\int_0^r f(x)e^{-x}\,dx$ for $r \in \{1, 2, \dots, m\}$.

•
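For small values of $m$ and $p$ the numbers $M$ and $M_r$ of Definition 7.2.4 can be computed exactly. The following sketch assumes the standard closed form $g_k(r) = \sum_{j=0}^{k} \frac{k!}{j!} r^j$ for the polynomials in Lemma 7.2.1(a), which is not given explicitly in the text; the function names are hypothetical. The integrality and divisibility properties it observes are the ones the text goes on to prove in Lemma 7.2.6.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a: list[int], b: list[int]) -> list[int]:
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def numerator_coeffs(m: int, p: int) -> list[int]:
    """Coefficients of X^(p-1) (X-1)^p ... (X-m)^p, the numerator of f(X) in (4)."""
    coeffs = [0] * (p - 1) + [1]
    for j in range(1, m + 1):
        for _ in range(p):
            coeffs = poly_mul(coeffs, [-j, 1])
    return coeffs

def G(m: int, p: int, r: int) -> Fraction:
    """G(r) = (1/(p-1)!) * sum_k a_k g_k(r), with the assumed closed form
    g_k(r) = sum_{j<=k} (k!/j!) r^j. Since g_k(0) = k!, G(0) equals M."""
    total = 0
    for k, a in enumerate(numerator_coeffs(m, p)):
        total += a * sum(factorial(k) // factorial(j) * r**j for j in range(k + 1))
    return Fraction(total, factorial(p - 1))

m, p = 2, 3                                    # a small case with p > m
M = G(m, p, 0)
Ms = [G(m, p, r) for r in range(1, m + 1)]     # M_1, ..., M_m
print(M, Ms)
```

In this small case `M` is an integer not divisible by $p = 3$, while each `M_r` is an integer divisible by 3.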

Note that, from (4), the degree $n$ of $f(X)$ is
$$n = mp + p - 1. \tag{5}$$
Also, by Proposition 7.1.6, there are polynomials $d_0, \dots, d_n$ with integer coefficients and degree at most $n$ such that, for all $r \in \mathbb{R}$,
$$f(X) = \frac{1}{(p-1)!}\bigl(d_0(r) + d_1(r)(X - r) + \dots + d_n(r)(X - r)^n\bigr). \tag{6}$$

7.2.5 Lemma. With the notation as in (6), if $r \in \{1, 2, \dots, m\}$ then $d_0(r) = d_1(r) = \dots = d_{p-1}(r) = 0$.

Proof. Fix $r \in \{1, 2, \dots, m\}$. Because $(X - r)^p$ is a factor of $f(X)$ by (4), we have
$$f(X) = \frac{1}{(p-1)!}(X - r)^p g(X)$$
for some polynomial $g(X)$ with integer coefficients. If we expand $g(X)$ in powers of $(X - r)$ (see Proposition 7.1.6) and multiply each resulting term by $(X - r)^p$ we see that
$$f(X) = \frac{1}{(p-1)!}\bigl(b_p (X - r)^p + b_{p+1}(X - r)^{p+1} + \dots + b_n (X - r)^n\bigr) \tag{7}$$
for some real numbers $b_p, \dots, b_n$. By Lemma 7.1.7 we can equate the coefficients of $1, (X - r), \dots, (X - r)^n$ in (7) and (6). Since the coefficients of $1, (X - r), \dots, (X - r)^{p-1}$ are zero in (7), they must be zero in (6) also. •

Properties of $M, M_1, \dots, M_m$ and $\epsilon_1, \dots, \epsilon_m$

We now derive properties of the numbers which were introduced in Definition 7.2.4.

7.2.6 Lemma. (a) The number $M$ is an integer which does not have the prime $p$ as a factor when $p > m$.

(b) There is a polynomial $G_1$ with integer coefficients and degree at most $n$ such that $G(r) = p\,G_1(r)$ for $r \in \{1, 2, \dots, m\}$. In particular, $M_1, \dots, M_m$ are all integers with the prime $p$ as a factor.

Proof. (a) The polynomial $f(X)$ defined in (4) can be expanded in powers of $X$ so that we can write
$$f(X) = \frac{1}{(p-1)!}\bigl(a_{p-1}X^{p-1} + a_p X^p + \dots + a_n X^n\bigr) \tag{8}$$
where $a_{p-1}, \dots, a_n$ are integers with $a_{p-1} = \pm(m!)^p \ne 0$. By the linearity of integration (Theorem 7.1.2(i)), it follows from (8) that
$$\begin{aligned}
\int_0^r f(x)e^{-x}\,dx &= \int_0^r \frac{1}{(p-1)!}\bigl(a_{p-1}x^{p-1} + a_p x^p + \dots + a_n x^n\bigr)e^{-x}\,dx \\
&= \frac{1}{(p-1)!}\left(a_{p-1}\int_0^r x^{p-1}e^{-x}\,dx + a_p\int_0^r x^p e^{-x}\,dx + \dots + a_n\int_0^r x^n e^{-x}\,dx\right).
\end{aligned}$$
By Lemma 7.2.1(a), each of the integrals in the above sum is the difference of two terms, the second of which involves $e^{-r}$. Hence, by the method used in Example 7.2.2, $\int_0^r f(x)e^{-x}\,dx$ is the difference of two sums, the first of which is
$$\frac{1}{(p-1)!}\bigl(a_{p-1}(p-1)! + a_p\,p! + \dots + a_n\,n!\bigr)$$
while the second is of the form $e^{-r} \times$ polynomial in $r$. By (i) of Definition 7.2.4 and the uniqueness part of Lemma 7.2.3, $M$ is this first term. Hence, dividing $(p-1)!$ into each term in the above sum gives
$$M = a_{p-1} + a_p\,p + \dots + a_n\,n(n-1)\cdots(p+1)p.$$
Since the coefficients $a_{p-1}, a_p, \dots, a_n$ are all integers, the number $M$ is an integer. Also the prime $p$ is a factor of every term in the above sum except the first term $a_{p-1} = \pm(m!)^p$, which cannot contain $p$ as a factor when $p > m$ (by Exercises 7.1 #5); hence $p$ cannot be a factor of $M$ when $p > m$.

(b) [Here we use the expansion of $f(X)$ in powers of $X - r$, as given in (6) above.] By the linearity of integration, it follows from (6) that

$$\int_0^r f(x)e^{-x}\,dx = \frac{1}{(p-1)!}\left(d_0(r)\int_0^r e^{-x}\,dx + d_1(r)\int_0^r (x - r)e^{-x}\,dx + \dots + d_n(r)\int_0^r (x - r)^n e^{-x}\,dx\right).$$
By Lemma 7.2.1(b), each of the integrals in the above sum is the difference of two terms, the second of which involves $e^{-r}$. Hence $\int_0^r f(x)e^{-x}\,dx$ is the difference of two sums. The second of these is
$$\frac{e^{-r}}{(p-1)!}\bigl(d_0(r) + d_1(r)\,1! + \dots + d_n(r)\,n!\bigr) \tag{9}$$
and the first is a polynomial in $r$ with real coefficients. By the uniqueness part of Lemma 7.2.3, $G(r)$ is equal to the polynomial part of (9); that is,
$$G(r) = \frac{1}{(p-1)!}\bigl(d_0(r) + d_1(r)\,1! + \dots + d_n(r)\,n!\bigr).$$
In the special cases where $r \in \{1, 2, \dots, m\}$, we know from Lemma 7.2.5 that $d_0(r) = d_1(r) = \dots = d_{p-1}(r) = 0$. Hence
$$G(r) = \frac{1}{(p-1)!}\bigl(d_p(r)\,p! + \dots + d_n(r)\,n!\bigr).$$
If we divide $(p-1)!$ into each term we see that $G(r) = p\,G_1(r)$ where
$$G_1(X) = d_p(X) + d_{p+1}(X)(p+1) + \dots + d_n(X)\,n(n-1)\cdots(p+1).$$
Notice that $G_1(X)$ is a polynomial with integer coefficients and degree at most $n$ since each $d_i(X)$ is such a polynomial. Hence $G_1(r)$ is an integer (since $r$ is) and so, by (ii) of Definition 7.2.4, $M_r = p\,G_1(r)$ is an integer having $p$ as a factor. •

The next lemma shows that the ε's defined in Definition 7.2.4 can be made arbitrarily small by choosing $p$ large enough.

7.2.7 Lemma. If $f(X)$ is given as a function of $p$ by (4), and $r \in \{1, 2, \dots, m\}$, then
$$\lim_{p \to \infty} \int_0^r f(x)e^{-x}\,dx = 0.$$

Proof. Let $r \in \{1, 2, \dots, m\}$ and let $x \in [0, r]$. Using (4) gives
$$\begin{aligned}
|f(x)e^{-x}| \le |f(x)| &= \frac{x^{p-1}}{(p-1)!}\,\bigl|(x-1)^p (x-2)^p \cdots (x-m)^p\bigr| \\
&\le \frac{x^{p-1}}{(p-1)!}\,(x+1)^p (x+2)^p \cdots (x+m)^p \\
&\le \frac{r^{p-1}}{(p-1)!}\,(r+1)^p (r+2)^p \cdots (r+m)^p \\
&= \frac{1}{r}\,\frac{C^p}{(p-1)!},
\end{aligned}$$
where $C = r(r+1)(r+2)\cdots(r+m)$ is independent of $p$. But
$$\left|\int_0^r f(x)e^{-x}\,dx\right| \le (\text{length of interval}) \times (\text{upper bound for } |f(x)e^{-x}|) = r \times \frac{1}{r}\,\frac{C^p}{(p-1)!} = \frac{C^p}{(p-1)!}.$$
Hence the conclusion of the lemma follows from Lemma 7.1.4. •
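The bound $C^p/(p-1)!$ obtained in the proof of Lemma 7.2.7 can be tabulated to see how it behaves as $p$ grows. This is an illustrative sketch only: the function name `integral_bound` and the particular values $m = r = 2$ and the sample primes are arbitrary choices, not taken from the text.

```python
from fractions import Fraction
from math import factorial

def integral_bound(m: int, r: int, p: int) -> Fraction:
    """Exact value of C^p / (p-1)! with C = r (r+1) (r+2) ... (r+m)."""
    C = 1
    for j in range(m + 1):
        C *= r + j
    return Fraction(C**p, factorial(p - 1))

for p in (5, 11, 23, 47, 101):                 # a few primes, chosen only for illustration
    print(p, float(integral_bound(2, 2, p)))
```

With $m = r = 2$ we have $C = 24$. The bound grows at first and only later falls towards zero, which is why the proof of transcendence needs $p$ to be "sufficiently large" rather than merely larger than $m$.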

With the aid of the above lemmas, it is now an easy task to prove that $e$ is transcendental.

7.2.8 Theorem. $e$ is a transcendental number.

Proof. Suppose on the contrary that $e$ is algebraic. Then, as shown at the start of this section (see (1) at the beginning of this section), there is an integer $m \ge 1$ and integers $c_0, c_1, c_2, \dots, c_m$ such that $c_0 \ne 0$, $c_m \ne 0$ and
$$c_0 + c_1 e + \dots + c_m e^m = 0. \tag{10}$$
Let $M$ and $M_r$, $\epsilon_r$, for $r \in \{1, 2, \dots, m\}$, be as in Definition 7.2.4. From parts (iii), (i) and (ii) of that definition respectively, it follows that
$$e^{-r}\epsilon_r = \int_0^r f(x)e^{-x}\,dx = M - e^{-r}G(r) = M - e^{-r}M_r,$$
and hence
$$e^r = \frac{M_r + \epsilon_r}{M}.$$
Substituting these results into (10) gives
$$[c_0 M + c_1 M_1 + \dots + c_m M_m] + [c_1 \epsilon_1 + \dots + c_m \epsilon_m] = 0. \tag{11}$$
We now choose the prime number $p$ to be larger than each of $m$ and $|c_0|$. As $p > m$ it follows from Lemma 7.2.6(a) that

the integer $M$ does not have $p$ as a factor.

As $p > |c_0|$ it follows that $c_0$ does not have $p$ as a factor and hence (as explained at the beginning of Section 7.1) that

$c_0 M$ does not have $p$ as a factor.

But each of $M_1, M_2, \dots, M_m$ is an integer having $p$ as a factor by Lemma 7.2.6(b). Hence $p$ is not a factor of the sum
$$c_0 M + c_1 M_1 + \dots + c_m M_m, \text{ which is therefore a nonzero integer.} \tag{12}$$


On the other hand, if we choose $p$ sufficiently large (which is possible by Proposition 7.1.1), then by Definition 7.2.4(iii) and Lemma 7.2.7 we get
$$|c_1 \epsilon_1 + \dots + c_m \epsilon_m| < \frac{1}{2}. \tag{13}$$
But (12) and (13) contradict (11). So our supposition is wrong. Hence $e$ is a transcendental number. •

7.2.9 Remark. Now that the proof is complete, you might like to reflect on the reasons for the various features in the definition of $f(X)$ in (4).

(i) The $(p-1)!$ in the denominator means that $\epsilon_r$ can be made arbitrarily small by taking $p$ large enough. (See Lemma 7.2.7.)

(ii) The factor $X^{p-1}$ ensures that, despite the $(p-1)!$ in the denominator of $f(X)$, $M$ is an integer. The exact power $p - 1$ (rather than a higher power) means that $M$ is not divisible by $p$ if $p > m$. (See the proof of Lemma 7.2.6(a). See also Exercise 3 below.)

(iii) The factors $(X-1)^p, \dots, (X-m)^p$ ensure that, despite the $(p-1)!$ in the denominator, each of $M_1, \dots, M_m$ is an integer. (See the proof of Lemma 7.2.6(b).)

(iv) All of the factors $(X-1)^p, \dots, (X-m)^p$ are required so that we can deal with all the exponents $1, 2, \dots, m$ in (10).

Exercises 7.2

1. [This exercise proves the claim made at the beginning of Section 7.2 that $e$ is distinguished from all other positive real numbers by the fact that the derivative of $e^x$ is $e^x$.] Let $c$ be a positive real number and assume that $g(x) = c^x$ for all $x \in \mathbb{R}$. If $g'(x) = g(x)$ for all real $x$, prove that $c = e$. [Hint. $c^x = (e^{\ln c})^x = e^{x \ln c}$.]

2. Prove Lemma 7.2.1.

3. Show that Lemma 7.2.6 would be false if we omitted the factor $X^{p-1}$ in the definition of the function $f(X)$ given in (4) of the text.

4. Show that Lemma 7.2.6 would be false if we replaced the factor $X^{p-1}$ by the factor $X^p$ in the definition of $f(X)$ given in (4) of the text.

5. (a) Prove that $e^2$ is a transcendental number. [Hint. Use Theorem 7.2.8 and the definition of algebraic.]
(b) Prove that $e^2 + 3e + 1$ is a nonzero transcendental number.
(c) Prove that if $f(X)$ is a nonzero polynomial with rational coefficients, then $f(e)$ is a nonzero transcendental number.

6. Prove that $\sqrt{e}$ is transcendental. [Hint. Use Corollary 4.4.4.]

7. Deduce from Lemma 7.2.1(a) that $\int_0^\infty x^k e^{-x}\,dx = k!$. [Hint. You may assume that $\lim_{r \to +\infty} e^{-r}g(r) = 0$ for each polynomial $g(X)$ with real coefficients.]

8. (a) Deduce from Definition 7.2.4, Lemma 7.2.3 and the hint for Exercise 7 that
$$M = \int_0^\infty f(x)e^{-x}\,dx.$$
(b) Using (a) deduce from Definition 7.2.4 and Lemma 7.2.3 that
$$M_r = e^r \int_r^\infty f(x)e^{-x}\,dx.$$

9. Prove by induction that if $f(X) = a_0 + a_1 X + \dots + a_n X^n$ then

for each $r \in \mathbb{R}$, where $f^{(i)}(r)$ is the $i$-th derivative of $f$ evaluated at $r$. [Hint. For each integer $m$, take the induction statement to be

10. (a) Prove that $e^r$ is irrational for all $r \in \mathbb{N}$.
(b) Deduce from (a) and the formula just before (11) in the text that $\epsilon_1, \dots, \epsilon_m$ are nonzero. Why didn't we need to use this fact in the proof of Theorem 7.2.8 (in contrast to the proof of Proposition 7.1.9 where $\epsilon_1$ being nonzero is essential)?

7.3 Preliminaries on Symmetric Polynomials

The proof that $\pi$ is transcendental (to be given in Section 7.6) is similar in many respects to that given in Section 7.2 for $e$. There are, however, two topics we must introduce before the proof, firstly symmetric polynomials (covered in this section) and secondly integrals of complex-valued functions (discussed in Section 7.5).

We are concerned here with polynomials in several indeterminates $X_1, X_2, \dots, X_n$. Of course, $X_1 X_2^2 + 3X_3 = 3X_3 + X_2^2 X_1$, for example, since the order of multiplying indeterminates or adding terms makes no difference.

7.3.1 Example. With $n = 3$, two such polynomials are
$$f(X_1, X_2, X_3) = X_1^2 X_2^2 X_3^2 + 3X_1 + 3X_2 + 3X_3,$$
$$g(X_1, X_2, X_3) = X_1^2 X_2 + X_2^2 X_3.$$

•

Permutations and Symmetry

We are interested in what happens to such polynomials when the indeterminates are permuted amongst themselves, for example, replacing $X_1$ by $X_2$, $X_2$ by $X_3$ and $X_3$ by $X_1$. If we apply this permutation to $f$ and $g$ in the example above they become, respectively,
$$f_1(X_1, X_2, X_3) = X_2^2 X_3^2 X_1^2 + 3X_2 + 3X_3 + 3X_1,$$
$$g_1(X_1, X_2, X_3) = X_2^2 X_3 + X_3^2 X_1.$$
Notice that $f$ and $f_1$ are equal polynomials but $g$ and $g_1$ are not. You can see that, whatever permutation of the indeterminates is made in $f$, the resulting polynomial is equal to $f$. We say that $f$ is a symmetric polynomial but that $g$ is not. The precise definition is given as Definition 7.3.3 below.

Before giving that definition we need to say a little more about permutations. Intuitively a permutation of a set, for example the set $\{X_1, X_2, X_3\}$ of indeterminates above, is just a rearrangement of the order of the elements. More formally, a permutation $\rho$ of a set $S$ is a one-to-one and onto mapping $\rho : S \to S$. In the example above,
$$\rho(X_1) = X_2, \quad \rho(X_2) = X_3 \quad \text{and} \quad \rho(X_3) = X_1.$$
As you probably know, there are $6$ $(= 3!)$ permutations of any set such as $\{X_1, X_2, X_3\}$ with 3 elements. (We suggest you write them all down. Do not forget the so-called identity permutation which leaves all elements unchanged.) More generally there are $n!$ permutations of a set with $n$ elements.

Clearly if $f$ is a polynomial in the $n$ indeterminates $X_1, X_2, \dots, X_n$ and $\rho$ is any permutation of the set, we get a new polynomial (which may be denoted by $f^\rho$) by applying $\rho$ to all the $X_i$'s in $f$.

7.3.2 Example. Let $\rho_1$ be the permutation of $\{X_1, X_2, X_3\}$ which interchanges $X_1$ and $X_3$ and leaves $X_2$ fixed, and let $\rho_2$ be the permutation given by $\rho_2(X_1) = X_3$, $\rho_2(X_2) = X_1$, $\rho_2(X_3) = X_2$. Then, with $f$ and $g$ as in Example 7.3.1 above,
$$g^{\rho_1}(X_1, X_2, X_3) = X_3^2 X_2 + X_2^2 X_1,$$
$$f^{\rho_2}(X_1, X_2, X_3) = X_3^2 X_1^2 X_2^2 + 3X_3 + 3X_1 + 3X_2.$$
You might like to write down $f^{\rho_1}(X_1, X_2, X_3)$ and $g^{\rho_2}(X_1, X_2, X_3)$. •

7.3.3 Definition. A polynomial $f(X_1, X_2, \dots, X_n)$ in indeterminates $X_1, X_2, \dots, X_n$ is called symmetric if, for all permutations $\rho$ of $\{X_1, X_2, \dots, X_n\}$, we have
$$f^\rho(X_1, X_2, \dots, X_n) = f(X_1, X_2, \dots, X_n).$$

•

The next lemma contains two routine consequences of this definition. We leave the proof as Exercises 7.3 #7.

7.3.4 Lemma. (i) If $f(X_1, \dots, X_n)$ and $g(X_1, \dots, X_n)$ are symmetric polynomials in $X_1, \dots, X_n$ then so are
$$f(X_1, \dots, X_n) + g(X_1, \dots, X_n), \quad f(X_1, \dots, X_n) - g(X_1, \dots, X_n), \quad f(X_1, \dots, X_n)\,g(X_1, \dots, X_n)$$
(their sum, difference, and product).

(ii) If $h(Y_1, \dots, Y_m)$ is any polynomial in indeterminates $Y_1, \dots, Y_m$ and if $g_1(X_1, \dots, X_n), \dots, g_m(X_1, \dots, X_n)$ are symmetric polynomials in $X_1, \dots, X_n$ then
$$h\bigl(g_1(X_1, \dots, X_n), \dots, g_m(X_1, \dots, X_n)\bigr)$$
is also symmetric in $X_1, \dots, X_n$.

•



Note that "quotient" is not included in (i) of Lemma 7.3.4 since the quotient of two polynomials is in general not a polynomial.
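Definition 7.3.3 can be tested mechanically by applying every permutation of the indeterminates to a polynomial. In the sketch below a polynomial is represented, purely as an illustrative assumption, by a dictionary mapping exponent tuples $(i_1, \dots, i_n)$ to coefficients; the function names are hypothetical. The two test polynomials are $f = X_1^2X_2^2X_3^2 + 3X_1 + 3X_2 + 3X_3$ and $g = X_1^2X_2 + X_2^2X_3$, the reading of Example 7.3.1 adopted here.

```python
from itertools import permutations

Poly = dict  # maps an exponent tuple (i_1, ..., i_n) to a nonzero coefficient

def permute(poly: Poly, perm: tuple) -> Poly:
    """Apply the permutation X_i -> X_perm[i] (0-based indices) to every term."""
    out = {}
    for exps, coeff in poly.items():
        new = [0] * len(exps)
        for i, e in enumerate(exps):
            new[perm[i]] = e           # the power of X_i becomes a power of X_perm[i]
        out[tuple(new)] = coeff
    return out

def is_symmetric(poly: Poly, n: int) -> bool:
    """True if the polynomial is unchanged by all n! permutations of the indeterminates."""
    return all(permute(poly, perm) == poly for perm in permutations(range(n)))

f = {(2, 2, 2): 1, (1, 0, 0): 3, (0, 1, 0): 3, (0, 0, 1): 3}
g = {(2, 1, 0): 1, (0, 2, 1): 1}
print(is_symmetric(f, 3), is_symmetric(g, 3))   # True False
```

Checking all $n!$ permutations is only practical for small $n$, but it mirrors the definition directly.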

Elementary Symmetric Functions

There is one way of obtaining several interesting and important symmetric polynomials which we illustrate in the case $n = 3$. Consider
$$(Y + X_1)(Y + X_2)(Y + X_3). \tag{$*$}$$
If we expand this and collect the coefficients of the powers of the indeterminate $Y$ we obtain
$$Y^3 + (X_1 + X_2 + X_3)Y^2 + (X_1X_2 + X_1X_3 + X_2X_3)Y + X_1X_2X_3.$$
Notice that the three polynomials obtained
$$\sigma_1(X_1, X_2, X_3) = X_1 + X_2 + X_3,$$
$$\sigma_2(X_1, X_2, X_3) = X_1X_2 + X_1X_3 + X_2X_3,$$
$$\sigma_3(X_1, X_2, X_3) = X_1X_2X_3,$$
are all symmetric. This is not surprising since clearly the value of the expression $(*)$ (from which they are derived) remains unchanged after any permutation of the $X$'s. The three polynomials obtained are called the elementary symmetric functions in 3 indeterminates. This can be extended to any $n$ in the following definition.

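7.3.5 Definition. The $n$ elementary symmetric functions $\sigma_1, \sigma_2, \dots, \sigma_n$ in indeterminates $X_1, X_2, \dots, X_n$ are the coefficients respectively of the powers $Y^{n-1}, Y^{n-2}, \dots, Y^0$ in the expansion of
$$(Y + X_1)(Y + X_2) \cdots (Y + X_n);$$
that is,
$$\sigma_1(X_1, X_2, \dots, X_n) = X_1 + X_2 + \dots + X_n,$$
$$\sigma_2(X_1, X_2, \dots, X_n) = X_1X_2 + X_1X_3 + \dots + X_1X_n + X_2X_3 + \dots + X_2X_n + \dots + X_{n-1}X_n,$$
$$\vdots$$
$$\sigma_n(X_1, X_2, \dots, X_n) = X_1X_2 \cdots X_n.$$

•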

7.3.6 Example.

(i) For $n = 2$ we have
$$\sigma_1(X_1, X_2) = X_1 + X_2 \quad \text{and} \quad \sigma_2(X_1, X_2) = X_1X_2.$$

(ii) For $n = 4$ we have
$$\sigma_1(X_1, X_2, X_3, X_4) = X_1 + X_2 + X_3 + X_4,$$
$$\sigma_2(X_1, X_2, X_3, X_4) = X_1X_2 + X_1X_3 + X_1X_4 + X_2X_3 + X_2X_4 + X_3X_4,$$
$$\sigma_3(X_1, X_2, X_3, X_4) = X_1X_2X_3 + X_1X_2X_4 + X_1X_3X_4 + X_2X_3X_4,$$
$$\sigma_4(X_1, X_2, X_3, X_4) = X_1X_2X_3X_4.$$

(iii) Note that, when $n = 4$, $\sigma_1$ has $\binom{4}{1}$ terms, $\sigma_2$ has $\binom{4}{2} = 6$ terms, $\sigma_3$ has $\binom{4}{3}$ terms and $\sigma_4$ has $\binom{4}{4}$ terms.

(iv) In general, $\sigma_k(X_1, X_2, \dots, X_n)$ has $\binom{n}{k}$ terms.

•
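The term counts in Example 7.3.6 can be confirmed by listing the monomials of each $\sigma_k$ explicitly. In this sketch each monomial is represented by the tuple of indices of the indeterminates it contains; the function names `sigma_terms` and `sigma_value` are hypothetical.

```python
from itertools import combinations
from math import comb, prod

def sigma_terms(k: int, n: int) -> list[tuple]:
    """Index sets of the monomials of sigma_k in X_1, ..., X_n (1-based indices)."""
    return list(combinations(range(1, n + 1), k))

def sigma_value(k: int, xs: list) -> int:
    """Evaluate sigma_k at the numbers xs: the sum of all products of k distinct entries."""
    return sum(prod(c) for c in combinations(xs, k))

for k in range(1, 5):
    terms = sigma_terms(k, 4)
    assert len(terms) == comb(4, k)             # sigma_k has "n choose k" terms
    print(k, len(terms), terms)

print([sigma_value(k, [1, 2, 3]) for k in (1, 2, 3)])   # [6, 11, 6]
```

For instance the six index pairs printed for $k = 2$ correspond to the six products $X_1X_2, X_1X_3, X_1X_4, X_2X_3, X_2X_4, X_3X_4$.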

7.3.7 Remark. The elementary symmetric functions provide a connection between the zeros of a polynomial and its coefficients. To see this connection, let
$$f(Y) = a_n Y^n + a_{n-1} Y^{n-1} + \dots + a_1 Y + a_0 \tag{$**$}$$
be a polynomial of degree $n$ (that is, $a_n \ne 0$) with indeterminate $Y$ and coefficients in a field $\mathbb{F}$. Assume also that $f$ has $n$ zeros $\gamma_1, \gamma_2, \dots, \gamma_n$ in some (possibly larger) field $\mathbb{E}$ so that
$$f(Y) = a_n (Y - \gamma_1)(Y - \gamma_2) \cdots (Y - \gamma_n).$$
It follows immediately from the definition of the elementary symmetric functions (Definition 7.3.5) that this expands to
$$f(Y) = a_n Y^n - a_n \sigma_1(\gamma_1, \dots, \gamma_n) Y^{n-1} + a_n \sigma_2(\gamma_1, \dots, \gamma_n) Y^{n-2} - \dots + (-1)^n a_n \sigma_n(\gamma_1, \dots, \gamma_n).$$
Thus from $(**)$
$$\sigma_1(\gamma_1, \dots, \gamma_n) = -a_{n-1}/a_n,$$
$$\sigma_2(\gamma_1, \dots, \gamma_n) = a_{n-2}/a_n,$$
and so on down to
$$\sigma_n(\gamma_1, \dots, \gamma_n) = (-1)^n a_0/a_n.$$
Thus each elementary symmetric function evaluated at the zeros of $(**)$ is plus or minus the quotient of a coefficient divided by the leading coefficient. Note also that all of these quotients are in $\mathbb{F}$ (even though the $\gamma$'s need not be in $\mathbb{F}$). •
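The relation between zeros and coefficients described in Remark 7.3.7 can be verified on a concrete polynomial. The sketch below builds $a_n (Y - \gamma_1)\cdots(Y - \gamma_n)$ from chosen zeros and compares its coefficients with $(-1)^k a_n \sigma_k$. The zeros $1, 2, 3$, the leading coefficient $2$ and the function names are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def sigma(k: int, zeros: list) -> int:
    """Elementary symmetric function sigma_k evaluated at the given zeros."""
    return sum(prod(c) for c in combinations(zeros, k))

def expand(lead: int, zeros: list) -> list:
    """Coefficients [a_0, ..., a_n] of lead * (Y - z_1) ... (Y - z_n)."""
    coeffs = [lead]
    for z in zeros:
        # multiply the current polynomial by (Y - z): new[i] = old[i-1] - z * old[i]
        coeffs = [(coeffs[i - 1] if i > 0 else 0)
                  - z * (coeffs[i] if i < len(coeffs) else 0)
                  for i in range(len(coeffs) + 1)]
    return coeffs

zeros, lead = [1, 2, 3], 2
a = expand(lead, zeros)                         # 2Y^3 - 12Y^2 + 22Y - 12
n = len(zeros)
for k in range(1, n + 1):
    # sigma_k(zeros) should equal (-1)^k * a_{n-k} / a_n
    assert sigma(k, zeros) == (-1) ** k * Fraction(a[n - k], a[n])
print(a)                                        # [-12, 22, -12, 2]
```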


The elementary symmetric functions are also important because, as we prove below in Theorem 7.3.10, every symmetric polynomial can be expressed in terms of (indeed, as a polynomial in) the elementary symmetric functions. The next example illustrates this.

7.3.8 Example. For $f$ in Example 7.3.1,
$$f(X_1, X_2, X_3) = \sigma_3^2 + 3\sigma_1 = h(\sigma_3, \sigma_1)$$
where $h(Y_1, Y_2) = Y_1^2 + 3Y_2$ and $\sigma_3$ and $\sigma_1$ are two of the elementary symmetric functions in the indeterminates $X_1, X_2, X_3$. •

In Example 7.3.8, it was convenient to write just $\sigma_3$ rather than $\sigma_3(X_1, X_2, X_3)$. Similarly, in what follows, we shall often omit the $X$'s from the $\sigma$'s. However, you should always remember that the $\sigma$'s depend on indeterminates $X_1, X_2, \dots, X_n$.

7.3.9 Definition. The degree of a monomial $X_1^{i_1} X_2^{i_2} \cdots X_n^{i_n}$, where $i_1, i_2, \dots, i_n$ are non-negative integers, is defined as the sum of the exponents, $i_1 + i_2 + \dots + i_n$, and the degree of any nonzero polynomial is the maximum degree of the monomials involved in it. •

In Example 7.3.8, note that the degree of $h$ (which is 2) is no greater than the degree of $f$ (which is 6), and that $h$ has integer coefficients. The following theorem generalizes this example.

7.3.10 Theorem. [Fundamental Theorem on Symmetric Functions] Every symmetric polynomial $g$, with coefficients in a field $\mathbb{F}$, in the indeterminates $X_1, X_2, \dots, X_n$ can be written as a polynomial $h$ (with coefficients in $\mathbb{F}$) in the $n$ elementary symmetric functions. Moreover,

(i) the degree of $h$ is no more than the degree of $g$, and

(ii) if $g$ has integer coefficients then so does $h$.

•

Before proving the theorem, we describe an algorithm for constructing such a polynomial $h$. Later we shall prove that the algorithm works; this will prove the theorem. We first need another definition.

7.3.11 Definition. Consider two nonzero terms
$$c\,X_1^{i_1} X_2^{i_2} \cdots X_n^{i_n} \quad \text{and} \quad d\,X_1^{j_1} X_2^{j_2} \cdots X_n^{j_n} \qquad (c, d \in \mathbb{F}).$$
We say that the first of these is of higher order than the second (or, equivalently, that the second is of lower order than the first) if, at the first position, say $k$, in which the $i$'s and $j$'s differ, we have $i_k > j_k$. •

For example, $X_1 X_2^2 X_3$ is of higher order than each of $X_1 X_2 X_3$ and $X_2^5 X_3^{30}$, while $X_1 X_2^5 X_3$ is of lower order than $X_1^2 X_2 X_3$.
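The ordering of Definition 7.3.11 is a lexicographic comparison of exponent tuples, so it is easy to express in code. The sketch below is illustrative: the function names are hypothetical, and the dictionary representation of a polynomial (exponent tuple to coefficient) is an assumption made for the example.

```python
def higher_order(i: tuple, j: tuple) -> bool:
    """True if X^i is of higher order than X^j: at the first position where the
    exponent tuples differ, i has the larger entry."""
    for a, b in zip(i, j):
        if a != b:
            return a > b
    return False                       # identical exponents: neither is of higher order

def leading_term(poly: dict):
    """Return (exponents, coefficient) of the term of highest order in a nonzero polynomial."""
    exps = max(poly)                   # Python compares tuples lexicographically
    return exps, poly[exps]

print(higher_order((1, 2, 1), (1, 1, 1)))      # True
print(higher_order((1, 5, 1), (2, 1, 1)))      # False
print(leading_term({(3, 1): 3, (2, 2): -2, (1, 3): 3}))   # ((3, 1), 3)
```

Note that the order of a term has nothing to do with its degree: a term of small degree can be of higher order than one of large degree.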

7.3.12 Algorithm. [Fundamental Algorithm for Symmetric Functions] The aim of the algorithm is as follows: Given a nonzero symmetric polynomial $g$ in $n$ indeterminates $X_1, X_2, \dots, X_n$ satisfying the hypotheses of the Fundamental Theorem on Symmetric Functions, to construct a polynomial $h$ satisfying the conclusion of the theorem.

Step 0. Initially set $\ell = 1$ and $g_1 = g$ (which is not zero).

Step 1.
(a) Find the term of highest order in $g_\ell$, say
$$c\,X_1^{i_1} X_2^{i_2} \cdots X_n^{i_n} \qquad (c \in \mathbb{F},\ c \ne 0).$$
(b) With $c$ and $i_1, i_2, \dots, i_n$ as in (a), set
$$h_\ell(Y_1, Y_2, \dots, Y_n) = c\,Y_1^{i_1 - i_2}\, Y_2^{i_2 - i_3} \cdots Y_{n-1}^{i_{n-1} - i_n}\, Y_n^{i_n}.$$
(c) Put
$$g_{\ell+1}(X_1, X_2, \dots, X_n) = g_\ell(X_1, X_2, \dots, X_n) - h_\ell(\sigma_1, \sigma_2, \dots, \sigma_n).$$
(d) Increase $\ell$ by 1.
(e) If $g_\ell \ne 0$ then go back to (a) and repeat Step 1; otherwise proceed to Step 2 below.

Step 2. Put $h = h_1 + h_2 + \dots + h_{\ell-1}$.

•
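The steps of Algorithm 7.3.12 can be sketched in code as follows. This is an illustrative implementation under stated assumptions, not the authors' own: polynomials are represented as dictionaries from exponent tuples to integer coefficients, all function names are hypothetical, and a check is added that raises an error if an exponent $i_k - i_{k+1}$ would be negative (which cannot happen for symmetric input).

```python
from itertools import combinations

def poly_mul(p: dict, q: dict) -> dict:
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def elementary(k: int, n: int) -> dict:
    """sigma_k in n indeterminates: the sum of all products of k distinct X's."""
    return {tuple(1 if i in chosen else 0 for i in range(n)): 1
            for chosen in combinations(range(n), k)}

def symmetric_reduce(g: dict, n: int) -> dict:
    """Return h as {exponents of (Y_1, ..., Y_n): coefficient} with g = h(sigma_1, ..., sigma_n)."""
    sigmas = [elementary(k, n) for k in range(1, n + 1)]
    g = dict(g)                                   # Step 0: g_1 = g
    h = {}
    while g:                                      # Step 1(e): repeat while g_l is nonzero
        lead = max(g)                             # (a) term of highest order
        c = g[lead]
        # (b) exponents i_1 - i_2, ..., i_{n-1} - i_n, i_n of Y_1, ..., Y_n
        y_exp = tuple(lead[k] - (lead[k + 1] if k + 1 < n else 0) for k in range(n))
        if min(y_exp) < 0:
            raise ValueError("input polynomial is not symmetric")
        h[y_exp] = h.get(y_exp, 0) + c            # Step 2 accumulates h_1 + h_2 + ...
        term = {(0,) * n: c}                      # build h_l(sigma_1, ..., sigma_n)
        for s, e in zip(sigmas, y_exp):
            for _ in range(e):
                term = poly_mul(term, s)
        for e, coeff in term.items():             # (c) g_{l+1} = g_l - h_l(sigma's)
            new = g.get(e, 0) - coeff
            if new:
                g[e] = new
            else:
                g.pop(e, None)
    return h

g = {(3, 1): 3, (2, 2): -2, (1, 3): 3}            # 3 X1^3 X2 - 2 X1^2 X2^2 + 3 X1 X2^3
print(symmetric_reduce(g, 2))                     # {(2, 1): 3, (0, 2): -8}
```

The printed result encodes $h(Y_1, Y_2) = 3Y_1^2Y_2 - 8Y_2^2$. The loop terminates because each pass removes the term of highest order and introduces only terms of lower order and the same degree.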

Before proving that this algorithm works, we give an example of its use.

7.3.13 Example. We apply the above algorithm to the symmetric polynomial (in $n = 2$ indeterminates)
$$g(X_1, X_2) = 3X_1^3 X_2 - 2X_1^2 X_2^2 + 3X_1 X_2^3.$$
(See Example 7.3.6 for formulae for $\sigma_1$ and $\sigma_2$.)

Step 0. Set $\ell = 1$ and put $g_1 = g$.

Step 1.
(a) The term of highest order in $g_1$ is $3X_1^3 X_2$.
(b) $h_1(Y_1, Y_2) = 3Y_1^{3-1} Y_2^{1} = 3Y_1^2 Y_2$.
(c) $g_2(X_1, X_2) = g_1(X_1, X_2) - h_1(\sigma_1, \sigma_2) = 3X_1^3 X_2 - 2X_1^2 X_2^2 + 3X_1 X_2^3 - 3(X_1 + X_2)^2 (X_1 X_2) = -8X_1^2 X_2^2$.
(d) Set $\ell = 2$.
(e) As $g_2 \ne 0$, return to (a) and repeat Step 1 with $g_2$ in place of $g_1$.

Step 1.
(a) The term of highest order in $g_2$ is $-8X_1^2 X_2^2$.
(b) $h_2(Y_1, Y_2) = -8Y_1^{2-2} Y_2^{2} = -8Y_2^2$.
(c) $g_3(X_1, X_2) = g_2(X_1, X_2) - h_2(\sigma_1, \sigma_2) = -8X_1^2 X_2^2 + 8X_1^2 X_2^2 = 0$.
(d) Set $\ell = 3$.
(e) Since $g_3 = 0$ go on to Step 2.

Step 2. Set $h = h_1 + h_2$; that is, $h(Y_1, Y_2) = 3Y_1^2 Y_2 - 8Y_2^2$.

As a check, note that
$$h(\sigma_1, \sigma_2) = 3(X_1 + X_2)^2 (X_1 X_2) - 8(X_1 X_2)^2 = g(X_1, X_2)$$
and also that $\deg h = 3 < 4 = \deg g$ and that the polynomial $h$ has integer coefficients. Thus $h$ satisfies the conclusion of Theorem 7.3.10. •

To prove that the algorithm always works we need the following lemmas.

7.3.14 Lemma. Let $g(X_1, \dots, X_n)$ be a symmetric polynomial over a field $\mathbb{F}$ and let
$$c\,X_1^{i_1} X_2^{i_2} \cdots X_n^{i_n} \qquad (c \in \mathbb{F},\ c \ne 0) \tag{1}$$
be the term of highest order in $g$. Then
$$i_1 \ge i_2 \ge \dots \ge i_n. \tag{2}$$

Proof. If, say, $i_k < i_{k+1}$ contrary to the conclusion of the lemma, then we apply to $g$ the permutation $\rho$ which just interchanges the indeterminates $X_k$ and $X_{k+1}$. The resulting polynomial $g^\rho$ (using the notation in Definition 7.3.3) then contains the term
$$c\,X_1^{i_1} \cdots X_k^{i_{k+1}} X_{k+1}^{i_k} \cdots X_n^{i_n} \tag{3}$$
which is of higher order than (1) by Definition 7.3.11. But $g$ is a symmetric polynomial so the term (3) must be in $g = g^\rho$, which contradicts our choice of (1) as the term of highest order in $g$. Thus (2) holds. •

Proof. The term of highest order in a product of polynomials is the product of the terms of highest order in each polynomial. The terms of highest order in $\sigma_1, \sigma_2, \ldots, \sigma_n$ are respectively

$$X_1,\quad X_1 X_2,\quad \ldots,\quad X_1 X_2 \cdots X_n.$$

Hence the term of highest order in the product is

$$X_1^{i_1 - i_2} (X_1 X_2)^{i_2 - i_3} \cdots (X_1 X_2 \cdots X_n)^{i_n} = X_1^{i_1} X_2^{i_2} \cdots X_n^{i_n}.$$

•
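The computation in this proof can be spot-checked numerically. The sketch below expands $\sigma_1^{i_1-i_2}\sigma_2^{i_2-i_3}\cdots\sigma_n^{i_n}$ for one hypothetical exponent sequence and reads off its term of highest order. The helper names and the ordering (total degree first, then lexicographic) are assumptions made for illustration, not taken from the text.

```python
from itertools import combinations

def poly_mul(p, q):
    """Multiply polynomials stored as {exponent_tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(a + b for a, b in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def sigma(n, k):
    """Elementary symmetric polynomial sigma_k in n indeterminates."""
    return {tuple(1 if i in chosen else 0 for i in range(n)): 1
            for chosen in combinations(range(n), k)}

def leading_exponent(g):
    # Assumed ordering: total degree first, ties broken lexicographically.
    return max(g, key=lambda e: (sum(e), e))

def sigma_product(i):
    """Expand sigma_1^(i1-i2) ... sigma_n^(in) for i1 >= i2 >= ... >= in >= 0."""
    n = len(i)
    powers = [i[k] - i[k + 1] for k in range(n - 1)] + [i[-1]]
    prod = {(0,) * n: 1}
    for k, power in enumerate(powers, start=1):
        for _ in range(power):
            prod = poly_mul(prod, sigma(n, k))
    return prod

exponents = (4, 2, 1)                 # hypothetical i_1 >= i_2 >= i_3
prod = sigma_product(exponents)       # sigma_1^2 * sigma_2 * sigma_3
lead = leading_exponent(prod)
print(lead, prod[lead])               # -> (4, 2, 1) 1
```

The leading term comes out as $X_1^4 X_2^2 X_3$ with coefficient 1, in agreement with the displayed identity.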

Proof of Theorem 7.3.10. To prove the theorem we prove that Algorithm 7.3.12 always produces a polynomial $h$ with the properties stated in the theorem. If $g \neq 0$ is a symmetric polynomial in $X_1, X_2, \ldots, X_n$, we let $D \in \mathbb{N}$ denote the number of monomials (not necessarily occurring in $g$) which are of lower order than the highest term occurring in $g$. (Recall that a monomial is a term $X_1^{i_1} \cdots X_n^{i_n}$ with coefficient 1.) Our proof is by mathematical induction on $D$. If $D = 0$ then $g$ is a constant polynomial and it is left as an easy exercise to prove that the algorithm works in this case.

Now let $k \in \mathbb{N}$ and assume that the algorithm works for all symmetric polynomials $g$ with $D \leq k$. Consider now a polynomial $g_1$ with $D = k + 1$, and let its term of highest order be

$$c X_1^{i_1} X_2^{i_2} \cdots X_n^{i_n} \qquad (4)$$

where $c \in \mathbb{F}$ is nonzero. We wish to prove that the algorithm works for $g_1$.

The first pass through Step 1 of the algorithm gives

$$h_1(Y_1, Y_2, \ldots, Y_n) = c\, Y_1^{i_1 - i_2} Y_2^{i_2 - i_3} \cdots Y_n^{i_n} \qquad (5)$$

and $g_2(X_1, X_2, \ldots, X_n) = g_1(X_1, X_2, \ldots, X_n) - h_1($ …

… $|b_0|$, where $b_0$ is as in (3) above.

(b) There is a polynomial $G_1(X)$ with integer coefficients and of degree at most $n$ such that $G(r) = p\,G_1(r)$ for $r \in \{\beta_1, \ldots, \beta_m\}$.

Proof. The proof of each part (a) and (b) is very similar to the corresponding part of Lemma 7.2.6. Here we merely indicate the changes that are required and leave the details as Exercises 7.6 #3.

To prove (a), replace $x$ by $ur$ and $dx$ by $du$, and use Lemmas 7.6.1(a) and 7.6.2 in place of Lemmas 7.2.1(a) and 7.2.3. Recall that $b_0 \neq 0$ in (3).

To prove (b), firstly expand $f(X)$ in powers of $(X - r)$, with $r \in \mathbb{C}$, using the obvious complex analogue of Proposition 7.1.6. Again replace $x$ by $ur$ and use Lemmas 7.6.1(b) and 7.6.2 in place of Lemmas 7.2.1(b) and 7.2.3. That each of $d_0(r), \ldots, d_{p-1}(r)$ is zero if $r$ is in $\{\beta_1, \beta_2, \ldots, \beta_m\}$ follows just as in Lemma 7.2.5, using the complex analogues of Proposition 7.1.6 and Lemma 7.1.7. •

Note a significant difference between Lemma 7.2.6(b) and Lemma 7.6.4(b). In the latter we make no claim that $G_1(r)$ is an integer for $r \in \{\beta_1, \ldots, \beta_m\}$. Although $G_1(X)$ has integer coefficients, the $\beta$'s are complex numbers and so we do not expect $G_1(r)$ to be an integer (unlike Section 7.2, where $r \in \{1, 2, \ldots, m\}$ so $r$ is an integer, guaranteeing that $G_1(r) \in \mathbb{Z}$). In the proof for $\pi$ we shall use symmetric polynomials to show that the sum

$$G_1(\beta_1) + G_1(\beta_2) + \cdots + G_1(\beta_m)$$

is an integer (even though the individual $G_1(\beta_i)$'s may not be).
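A small numerical illustration of this phenomenon, under assumptions that are not from the text: take the hypothetical polynomial $G_1(X) = X^2 + X$ and let $\beta_1, \beta_2$ be the roots $1 \pm \sqrt{2}$ of the monic integer polynomial $X^2 - 2X - 1$, so that $\sigma_1 = \beta_1 + \beta_2 = 2$ and $\sigma_2 = \beta_1\beta_2 = -1$. The sum $G_1(\beta_1) + G_1(\beta_2)$ is symmetric in the $\beta$'s and equals $(\sigma_1^2 - 2\sigma_2) + \sigma_1$.

```python
import math

# Hypothetical data: roots of X^2 - 2X - 1 (monic, integer coefficients)
beta1 = 1 + math.sqrt(2)
beta2 = 1 - math.sqrt(2)

def G1(x):
    """A sample polynomial with integer coefficients: X^2 + X."""
    return x * x + x

individual = [G1(beta1), G1(beta2)]   # neither value is an integer
total = sum(individual)               # the symmetric sum

# The same sum written in the elementary symmetric functions of the roots
s1, s2 = 2, -1                        # sigma_1 and sigma_2 of beta1, beta2
via_symmetric = (s1 ** 2 - 2 * s2) + s1

print(individual)                     # two irrational values, 4 +/- 3*sqrt(2)
print(total, via_symmetric)           # total agrees with 8 up to rounding
```

The situation in the text involves the specific $\beta$'s and $G_1$ of Section 7.6, which this toy example does not attempt to reproduce.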

Note that a sequence $\{z_n\}$ of complex numbers converges to 0 as $n \to \infty$ if the sequences of real and imaginary parts of $z_n$ converge to 0; this is equivalent to saying that the sequence $\{|z_n|\}$ of moduli converges to 0.

Our final lemma is the analogue of Lemma 7.2.7.

7.6.5 Lemma. Let $f(X)$ and $n$ be as in Definition 7.6.3. For $r \in \{\beta_1, \ldots, \beta_m\}$ let $\varepsilon_r$ be as in Definition 7.6.3, so that …

Then

$$\lim_{p \to \infty} \varepsilon_r = 0.$$

Proof. Let $u \in [0,1]$ and let $x = ur$ where $r$ is the complex number $r = r_1 + i r_2$ (where $r_1, r_2 \in \mathbb{R}$). By definition,

$$e^r = e^{r_1}(\cos r_2 + i \sin r_2)$$

and so $|e^r| = e^{r_1}$. Similarly $|e^{-ur}| = e^{-u r_1} \leq e^{|r_1|}$. Also, from (4) and (2), …

Hence, using Lemma 7.5.5,

$$\begin{aligned}
\left| b^n e^r \int_0^1 f(ur)\, e^{-ur}\, r \, du \right|
&\leq b^{mp+p-1} e^{r_1} \,(\text{length of interval}) \times (\text{upper bound for } |f(ur)\, e^{-ur}\, r|) \\
&\leq b^{mp+p-1} e^{r_1} \cdot 1 \cdot \frac{|r|^{p-1}}{(p-1)!} (|r| + |\beta_1|)^p \cdots (|r| + |\beta_m|)^p \, e^{|r_1|} \, |r| \\
&\leq \frac{e^{r_1 + |r_1|}}{b} \cdot \frac{C^p}{(p-1)!}
\end{aligned}$$

where $C = b^{m+1} |r| (|r| + |\beta_1|) \cdots (|r| + |\beta_m|)$. Note that $C$ is independent of $p$ (and so are $b$ and $r_1$). Hence by Lemma 7.1.4,

$$\lim_{p \to \infty} \varepsilon_r = 0.$$

•
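The decisive fact in this proof is that $C^p/(p-1)!$ tends to 0 as $p \to \infty$ for any fixed $C$. The following sketch tabulates that quantity for a hypothetical constant $C = 10$ and a few primes $p$. The function name `bound` is illustrative, and the product is accumulated factor by factor so that no very large intermediate number is formed.

```python
def bound(C, p):
    """Return C**p / (p-1)! computed as C * (C/1) * (C/2) * ... * (C/(p-1))."""
    value = C
    for k in range(1, p):
        value *= C / k
    return value

C = 10.0                               # hypothetical constant, independent of p
for p in (5, 11, 23, 47, 97):          # a few primes
    print(p, bound(C, p))
# The values rise at first (while k < C) and then fall rapidly towards 0.
```

The factorial eventually dominates because each new factor $C/k$ is smaller than 1 once $k > C$.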

Given the above results, the long-awaited proof of the transcendence of $\pi$ is relatively straightforward.

7.6.6 Theorem. $\pi$ is a transcendental number.

Proof. Suppose $\pi$ is algebraic. We shall derive a contradiction. We have already proved various consequences of this assumption and restated them at the start of this section. We let $p$ be any prime which is larger than all of $q$, $b$ and $|b_0|$ occurring in (1), (2) and (3), and define the polynomial $f(X)$ and the integer $n$ as in (4) above. It follows from Lemma 7.6.4 that there is an integer $M$ not divisible by $p$ and a polynomial $G_1(X)$ with integer coefficients and degree at most $n$ such that, for $r \in \{\beta_1, \beta_2, \ldots, \beta_m\}$, … (5)

We now introduce the numbers J\fp " . . . ,Afpm and
