
Author: Walter Stromquist (Editor in Chief)


So, what's your problem?

EDITORIAL POLICY

Mathematics Magazine aims to provide lively and appealing mathematical exposition. The Magazine is not a research journal, so the terse style appropriate for such a journal (lemma-theorem-proof-corollary) is not appropriate for the Magazine. Articles should include examples, applications, historical background, and illustrations, where appropriate. They should be attractive and accessible to undergraduates and would, ideally, be helpful in supplementing undergraduate courses or in stimulating student investigations. Manuscripts on history are especially welcome, as are those showing relationships among various branches of mathematics and between mathematics and other disciplines.

A more detailed statement of author guidelines appears in this Magazine, Vol. 74, pp. 75-76, and is available from the Editor or at www.maa.org/pubs/mathmag.html. Manuscripts to be submitted should not be concurrently submitted to, accepted for publication by, or published by another journal or publisher.

Submit new manuscripts to Allen Schwenk, Editor-Elect, Mathematics Magazine, Department of Mathematics, Western Michigan University, Kalamazoo, MI, 49008. Manuscripts should be laser printed, with wide line spacing, and prepared in a style consistent with the format of Mathematics Magazine. Authors should mail three copies and keep one copy. In addition, authors should supply the full five-symbol 2000 Mathematics Subject Classification number, as described in Mathematical Reviews.

Cover image, What's Dirichlet's Problem?, by Jason Challas. Inquiring Arthur Cayley wants to know. You can find out more about Dirichlet's Problem by reading the article by Pamela Gorkin and Joshua H. Smith in this issue. Jason Challas teaches at West Valley College in Saratoga, CA, where he helps students solve their own problems with computer graphics.

AUTHORS

James D. Harper has been teaching the full spectrum of undergraduate math courses at Central Washington University since 1988. Although he was trained as a harmonic analyst, these days he is more of a mathematical hobbyist dabbling in number theory and infinite series. Jim got the idea for this article when he generalized the gambler's ruin example in Susanna Epp's discrete math book. Since probability is not his strong suit, he asked Ken for help on getting all his ps and qs to sum to one. His nonmathematical interests include hiking, running, and losing his voice at Hayward Field track meets in Eugene, Oregon.

Kenneth A. Ross's first real job was at the University of Rochester, where the chairman was his mathematical godfather, Leonard Gillman. Ken taught at the University of Oregon from 1965 to 2000. He was Secretary and Associate Secretary of the MAA from 1984 to 1993, and he served as MAA President, 1995-1996. For relaxation, he prefers passive activities such as attending concerts, plays, movies and baseball games, though he has been known to raise his voice at baseball games. His latest book is A Mathematician at the Ballpark: Odds and Probabilities for Baseball Fans, published in 2004.

Pamela Gorkin is Professor of Mathematics at Bucknell University, where she was named Presidential Professor during the years 2000-2003. She teaches a wide range of classes, and her research interests include function theory and operator theory. Professor Gorkin is a frequent visitor to Bern, Switzerland, the University of Metz in France, and she has had the pleasure of taking part in the RiP program in Oberwolfach on several occasions. According to the Mathematics Genealogy Web site, Professor Gorkin is a descendant of Dirichlet.

Joshua H. Smith is a Ph.D. student in the Department of Mechanical and Aerospace Engineering at the University of Virginia. His research interests include the transport of chemical solutes in porous media, with application to perfusion in biological tissue. He completed his Bachelor's degree in mathematics at Bucknell University, including an honors thesis under the guidance of Pamela Gorkin.

Sujoy Chakraborty was born in 1972 in Comilla, Bangladesh. He entered the University of Dhaka in 1989 to read mathematics. He earned the M.Sc. degree in Pure Mathematics and continued with the M.Phil. program. He was awarded the M.Phil. degree in 2002 for a dissertation on "The theory of groups from Cayley to Frobenius," written under the supervision of Professor M. R. Chowdhury. He has been teaching at Shah Jalal University of Science and Technology since 1999. His interests include algebra, number theory, and history of mathematics.

Munibur Rahman Chowdhury was born in 1941 in Faridpur in what was then British India, now in Bangladesh. He entered the University of Dhaka in 1958 to read physics, but went to Germany to read mathematics. Entering the University of Hamburg in fall 1960, he attended the last cycle of lectures on algebra by the legendary Emil Artin (1898-1962). After Artin's sudden demise, he moved to the University of Göttingen, where he received the degree of Dr. rer. nat. in 1967 for a dissertation on "Unitary groups over algebraic number fields," written under the supervision of Martin Kneser (1928-2004). He has taught at several universities at home and abroad, and has been a professor at the University of Dhaka since 1984. His interests include algebra, number theory, history of mathematics and mathematics education. He is a past president of Bangladesh Mathematical Society.

Vol. 78, No. 4, October 2005

MATHEMATICS MAGAZINE

EDITOR
Frank A. Farris, Santa Clara University

ASSOCIATE EDITORS
Glenn D. Appleby, Santa Clara University
Arthur T. Benjamin, Harvey Mudd College
Paul J. Campbell, Beloit College
Annalisa Crannell, Franklin & Marshall College
David M. James, Howard University
Elgin H. Johnston, Iowa State University
Victor J. Katz, University of District of Columbia
Jennifer J. Quinn, Occidental College
David R. Scott, University of Puget Sound
Sanford L. Segal, University of Rochester
Harry Waldman, MAA, Washington, DC

EDITORIAL ASSISTANT
Martha L. Giannini

MATHEMATICS MAGAZINE (ISSN 0025-570X) is published by the Mathematical Association of America at 1529 Eighteenth Street, N.W., Washington, D.C. 20036 and Montpelier, VT, bimonthly except July/August.

The annual subscription price for MATHEMATICS MAGAZINE to an individual member of the Association is $131. Student and unemployed members receive a 66% dues discount; emeritus members receive a 50% discount; and new members receive a 20% dues discount for the first two years of membership.

Subscription correspondence and notice of change of address should be sent to the Membership/Subscriptions Department, Mathematical Association of America, 1529 Eighteenth Street, N.W., Washington, D.C. 20036.

Microfilmed issues may be obtained from University Microfilms International, Serials Bid Coordinator, 300 North Zeeb Road, Ann Arbor, MI 48106.

Advertising correspondence should be addressed to Frank Peterson ([email protected]), Advertising Manager, the Mathematical Association of America, 1529 Eighteenth Street, N.W., Washington, D.C. 20036.

Copyright © by the Mathematical Association of America (Incorporated), 2005, including rights to this journal issue as a whole and, except where otherwise noted, rights to each individual contribution. Permission to make copies of individual articles, in paper or electronic form, including posting on personal and class web pages, for educational and scientific use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear the following copyright notice: Copyright the Mathematical Association of America 2005. All rights reserved. Abstracting with credit is permitted. To copy otherwise, or to republish, requires specific permission of the MAA's Director of Publication and possibly a fee.

Periodicals postage paid at Washington, D.C. and additional mailing offices.

Postmaster: Send address changes to Membership/Subscriptions Department, Mathematical Association of America, 1529 Eighteenth Street, N.W., Washington, D.C. 20036-1385. Printed in the United States of America

ARTICLES

Stopping Strategies and Gambler's Ruin

JAMES D. HARPER
Central Washington University
Ellensburg, WA 98926
harperj@cwu.edu

KENNETH A. ROSS
University of Oregon
Eugene, OR 97403
ross@math.uoregon.edu

Let's play a game. Roll a die and you win $2 if the die shows 1 or 2. Otherwise you lose $1. Thus about one-third of the time you win $2 and about two-thirds of the time you lose $1. Suppose you have a definite idea in mind about how much you would like to win. To be concrete, suppose you start with $5 and decide to play until you double your fortune or lose it all (that is, are ruined). In other words, you will stop playing when you have $0, $10, or $11.

This game with a die is fair because, on average, for each 3 games, you win $2 once and lose $1 twice. However, it is not the same fair game as in the classical coin-flipping game where you win or lose $1, each with probability 1/2. One difference is the experience of a player facing ruin at the gambling table. The classical theory of what is called "gambler's ruin" tells us that the probability of ruin in the coin-flipping game is 1/2, that is, a player who starts with $5 and swears to stop at $10 (a double-or-nothing strategy) faces even chances of being ruined or winning double.

Knowing this about the coin-flipping game calls some parallel questions to mind. In our die-rolling game, is it still true that probabilities for success or ruin are 1/2, that the double-or-nothing strategy is fair? What is the expected time for the duration of this strategy? In other words, how many games will you play on average?

We ask similar questions for unfair games. For example, we consider the game where you win $2 with probability 4/40, you win $1 with probability 11/40, and you lose $1 with probability 25/40. This game is not fair; the average loss per game is 15 cents.

We refer to gambling strategies such as the double-or-nothing strategies above as stopping strategies. We refer to the possible values of your fortune ($0, $10, and $11 in our first example) as "stopping values." In this paper we show how to answer the first questions (probabilities of success and ruin) using recurrence relations.
Then we introduce the language and methods of random variables, showing how these probabilities can be used to solve the duration question. A key will be two identities due to Wald, for which we give elementary proofs at the end of the paper.

This paper should be accessible to any reader with a working background with recurrence relations and knowledge of sophomore level probability, in particular, an acquaintance with discrete random variables and expectation, and a willingness to work with infinite sums and matrices. All random variables in this paper will be discrete ones (that is, have countable range, integers in our case); the concepts of independence and expectation are easier in this setting and we need nothing more.

A classical ruin problem

Our investigation was inspired by the classical ruin problem based on flipping a fair coin. We imagine a person repeatedly betting $1 on heads who is willing to risk his or her fortune, $a, to win $b. It turns out that

\[ \text{probability of ruin} = \frac{b}{a+b} \quad \text{and} \quad \text{probability of success} = \frac{a}{a+b}. \]

These answers are reasonable because we would expect the strategy to be fair, and with these probabilities the expected gain is

\[ b \times \frac{a}{a+b} - a \times \frac{b}{a+b} = 0. \]

It turns out that the expected duration of this strategy is ab. That is, if this strategy were repeated over and over, then the average duration would be ab flips of the coin. We will verify this in passing after we introduce Wald's second identity. Much more information about the classical ruin problem, both when the coin is fair and when the coin is not fair, can be found in Takacs [11], Jewett and Ross [7], Ross [9, chapter 7], and Feller [6, chapters III and XIV]. Reference [11] is an excellent source for history on the subject. Bak [1] also starts with a classical ruin problem, but rather than consistently betting the same amount, his gambler varies his bet as his anxiety level increases. This is an interesting story, which is very different from ours.
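The two classical claims can be checked mechanically. The sketch below is an illustration added here, not the authors' argument: it assumes the standard one-step recurrences obtained by conditioning on the first flip, and the function names are hypothetical. It confirms that the ruin probability (total - n)/total and the duration n(total - n) satisfy those recurrences together with the boundary values at $0 and $(a + b).

```python
from fractions import Fraction


def ruin_probability(n, total):
    """Claimed probability of ruin when holding $n, stopping at $0 or $total."""
    return Fraction(total - n, total)


def expected_duration(n, total):
    """Claimed expected number of flips when holding $n."""
    return n * (total - n)


def check_classical_ruin(a, b):
    """Check both claims against the one-step recurrences for a fair coin."""
    total = a + b
    half = Fraction(1, 2)
    for n in range(1, total):
        # Condition on the first flip: the fortune moves to n + 1 or n - 1.
        assert ruin_probability(n, total) == half * (
            ruin_probability(n + 1, total) + ruin_probability(n - 1, total))
        assert expected_duration(n, total) == 1 + half * (
            expected_duration(n + 1, total) + expected_duration(n - 1, total))
    # Boundary values: ruin is certain at $0 and impossible at $total.
    assert ruin_probability(0, total) == 1
    assert ruin_probability(total, total) == 0
    assert expected_duration(0, total) == 0 == expected_duration(total, total)
    return ruin_probability(a, total), expected_duration(a, total)


print(check_classical_ruin(5, 5))
```

For the double-or-nothing player with a = b = 5, the function returns a ruin probability of 1/2 and a duration of 25 flips, in line with the values a/(a+b) and ab quoted above.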

Solution to the 1/3 versus 2/3 game

Recall our game where, for each $1 bet, one-third of the time you win $2 and two-thirds of the time you lose your $1. Suppose that you have $5 to wager and you wish to (at least) double your fortune, that is, you would like to end up with $10 or $11. Let \( P_N^{(10)} \) denote the probability that you will eventually end up with $10 given that you have $N, where \( 0 \le N \le 11 \). The other probability functions, \( P_N^{(11)} \) and \( P_N^{(0)} \), are defined similarly. Note that \( P_N^{(10)} \) and \( P_N^{(11)} \) are the success probability functions and \( P_N^{(0)} \) is the ruin probability function. The stopping values k = 0, 10, 11 provide us with the following initial values: \( P_N^{(k)} \) is 1 when N = k and is 0 when \( N \ne k \) for N = 0, 10, 11.

We use the tools of recurrence relations to find the probabilities of success and ruin or, to be more precise, the probability of each of the stopping values. Our primary and most complete reference for recurrence relations is Rosen [8, pp. 413-418]. Actually, many other discrete math texts have the pieces necessary to solve the recurrence relations we will encounter (for instance, Epp [5, Chapter 8] and Ross and Wright [10, section 4.5]). We need to know how single roots and double roots of the characteristic equation contribute to the general solution of a recurrence relation.

Here is the crucial observation: Each of these probability functions satisfies the following recursion relation:

\[ P_N^{(k)} = (1/3) P_{N+2}^{(k)} + (2/3) P_{N-1}^{(k)} \quad \text{for } 1 \le N \le 9. \]

The subscripts of the right-hand side probability functions correspond to winning $2 (with probability 1/3) and losing $1 (with probability 2/3). Now suppose this relation has a solution of the form \( P_N^{(k)} = x^N \) for some real number x. When we plug this solution into our recursion we have \( x^N = (1/3)x^{N+2} + (2/3)x^{N-1} \). Dividing both sides by \( x^{N-1} \) gives us \( x = (1/3)x^3 + (2/3) \), which is the characteristic equation for this recursion relation. From the characteristic equation, we have the characteristic polynomial

\[ f(x) = (1/3)x^3 - x + (2/3) = (1/3)(x - 1)^2 (x + 2). \]

It can be shown that, when the characteristic polynomial has a double root at x = r, then \( P_N^{(k)} = N \cdot r^N \) is another solution. The general solutions in our case are linear combinations of \( 1^N = 1 \), \( N \cdot 1^N = N \), and \( (-2)^N \). A different linear combination will produce the correct solution for each value of k, that is,

\[
\begin{aligned}
P_N^{(0)} &= x_{11} + x_{12} \cdot N + x_{13} \cdot (-2)^N, \\
P_N^{(10)} &= x_{21} + x_{22} \cdot N + x_{23} \cdot (-2)^N, \\
P_N^{(11)} &= x_{31} + x_{32} \cdot N + x_{33} \cdot (-2)^N.
\end{aligned}
\]

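Plugging in the initial values for N = 0, 10, 11 leads to the following matrix equation:

\[
\begin{bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 10 & 11 \\ 1 & 1024 & -2048 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Solving this equation is equivalent to finding the following inverse:

\[
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 10 & 11 \\ 1 & 1024 & -2048 \end{bmatrix}^{-1}
= \frac{1}{31743}
\begin{bmatrix} 31744 & -3072 & -1 \\ -11 & 2049 & 11 \\ 10 & 1023 & -10 \end{bmatrix}.
\]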
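The inverse can be reproduced with exact rational arithmetic. The following is a minimal sketch, assuming a hand-written Gauss-Jordan routine (the helper `invert` is hypothetical and not from the article); it builds the matrix whose columns are \( (1, N, (-2)^N) \) at N = 0, 10, 11 and scales each row of the inverse by 31743.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Column j holds (1, N, (-2)^N) for the stopping value N = 0, 10, 11.
stops = [0, 10, 11]
basis = [[1 for n in stops], [n for n in stops], [(-2) ** n for n in stops]]
coefficients = invert(basis)
for row in coefficients:
    print([int(x * 31743) for x in row])
```

Each printed row, divided by 31743, gives the coefficients of 1, N, and \( (-2)^N \) for one of the three stopping values.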

The first row of the inverse provides us with the coefficients of the ruin probability function \( P_N^{(0)} \), and the second and third rows provide us with the coefficients of the success probability functions \( P_N^{(10)} \) and \( P_N^{(11)} \), respectively. We can now give explicit formulas for these functions:

\[
\begin{aligned}
P_N^{(0)} &= [31744 - 3072N - (-2)^N]/31743, \\
P_N^{(10)} &= [-11 + 2049N + 11 \cdot (-2)^N]/31743, \\
P_N^{(11)} &= [10 + 1023N - 10 \cdot (-2)^N]/31743.
\end{aligned}
\]

In the scenario we described earlier, where you begin with $5 we have:

\[ P_5^{(0)} = \frac{1824}{3527} \approx .517153, \qquad P_5^{(10)} = \frac{1098}{3527} \]
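As a cross-check, the explicit formulas can be evaluated with exact fractions. This is a sketch for illustration only; the function name `stopping_probabilities` is an assumption. It confirms that each formula satisfies the recursion for \( 1 \le N \le 9 \) and evaluates all three at N = 5.

```python
from fractions import Fraction


def stopping_probabilities(n):
    """Return {stopping value: probability} for a player holding $n."""
    power = (-2) ** n
    return {
        0: Fraction(31744 - 3072 * n - power, 31743),
        10: Fraction(-11 + 2049 * n + 11 * power, 31743),
        11: Fraction(10 + 1023 * n - 10 * power, 31743),
    }


# Each function should obey P_N = (1/3) P_{N+2} + (2/3) P_{N-1} for 1 <= N <= 9.
for n in range(1, 10):
    here, up, down = (stopping_probabilities(m) for m in (n, n + 2, n - 1))
    for k in (0, 10, 11):
        assert here[k] == Fraction(1, 3) * up[k] + Fraction(2, 3) * down[k]

at_five = stopping_probabilities(5)
print(at_five, float(at_five[0]))
```

The three values at N = 5 sum to 1, so the double-or-nothing strategy in the die game is slightly more likely to end in ruin than in success, unlike the coin-flipping game.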

736 ""' ""' . 462021, 1593

p 1 in this paper, though the classical ruin problem corresponds to the case u = v = 1. The games we mentioned in the introduction all involve u = 2 and v = 1. In the first example, \( p_1 = 0 \), \( p_2 = 1/3 \), \( q_1 = q = 2/3 \), N = 5, and M = 10. In the unfair game, \( p_1 = 11/40 \), \( p_2 = 4/40 \), \( q = 25/40 \), and we did not specify N or M. The expected value for this unfair game is 1 · (11/40) + 2 · (4/40) - 1 · (25/40) = -.15 dollars. A gambler playing this game would lose, on the average, 15 cents per game.

As we did in the 1/3 versus 2/3 game, let \( P_N^{(k)} \) be the probability that you will eventually end up with $k when you have $N, \( 0 \le N < M + u \). Here k is one of the desired stopping values M, M + 1, ..., M + u - 1, if you are successful, and one of the undesirable stopping values 0, 1, ..., v - 1, if you lose. This leads to the following recursive relations: \( v \le N \)

Suppose also that the expected loss per game is the same in these two games and that our stopping strategies are the same, that is, we begin with $N and hope to reach $M before going broke. Which game is better from the point of view of the stopping strategy? We suspect that the expectation of a strategy does not depend on the loss per game, but rather on its ruin root. This is a question that deserves further study.

We now give an example to show that there can be a big difference between the expectations. In roulette, there are 38 equally likely outcomes, of which 18 are black, 18 are red, and 2 are green. The probability of winning on red is 9/19, and the probability of winning on green is 1/19. A winning bet on red will pay $1, and a winning bet on green will pay $17. Both games (betting on red or green) have the same expectation per game: -$1/19, that is, on average the loss will be about 5.26 cents per game. As gambling at casinos goes, this is not a bad expectation.

Suppose that Mrs. Rose bets on red and Mr. Green bets on green, and that they use the same stopping strategy by starting with $12 and hoping to reach $18 or more before going broke. Since Mr. Green will stop immediately if he ever wins a game, we can calculate both of their expectations using this strategy. Mrs. Rose's expected loss is about $3.94 whereas Mr. Green's is only about 48 cents, as follows:

The only way Mr. Green can be ruined is if he loses the first 12 times he plays. If he wins just once, he will go home with money in his pocket. The probability that he wins on the first spin of the wheel is 1/19 and he would take home $29. The probability that he loses on the first spin and then wins on the second spin is (18/19) · (1/19) and he would take home $28. In general, the probability that he loses on the first k - 1 spins and then wins on the kth spin (\( 1 \le k \le 12 \)) is \( (18/19)^{k-1} \cdot (1/19) \), in which case he'd take home $(30 - k). Hence his take-home expectation is

\[ \sum_{k=1}^{12} (18/19)^{k-1} \cdot (1/19) \cdot (30 - k) \approx \$11.52. \]

Mrs. Rose's stopping strategy is the classical gambler's ruin strategy where p = 9/19 is the probability of winning $1 and q = 10/19 is the probability of losing $1. Formula (2.4) in section XIV.2 of Feller [6] shows that the probability that Mrs. Rose will succeed and reach $18, rather than lose $12, is

\[ \frac{(p/q)^6 - (p/q)^{18}}{1 - (p/q)^{18}} = \frac{(9/10)^6 - (9/10)^{18}}{1 - (9/10)^{18}} \approx .448. \]

Therefore Mrs. Rose's take-home expectation is $18 · .448 ≈ $8.06. Since both Mr. Green and Mrs. Rose started with $12, their expected losses are 48 cents and $3.94 per stopping strategy, respectively. Even if they change the rules and both start with $1, or both start with $17, Mr. Green has the better expectation. In the first case, his expected loss is about 10.5 cents while Mrs. Rose's is over 66 cents. In the second case, Mr. Green's expected loss is about $1.24 while Mrs. Rose's is about $2.00.
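The comparison for the $12 start can be recomputed directly from the quantities in the text. The sketch below is illustrative; the function names are assumptions, and the success formula for Mrs. Rose is simply the one quoted from Feller above.

```python
from fractions import Fraction

START, GOAL = 12, 18


def green_take_home():
    """Mr. Green bets $1 on green (pays $17, probability 1/19), stops at first win."""
    lose, win = Fraction(18, 19), Fraction(1, 19)
    # A first win on spin k leaves him with START - (k - 1) + 17 = 30 - k dollars.
    return sum(lose ** (k - 1) * win * (START - (k - 1) + 17)
               for k in range(1, START + 1))


def rose_success_probability():
    """Mrs. Rose bets $1 on red; success probability from the quoted formula."""
    ratio = Fraction(9, 19) / Fraction(10, 19)   # p/q
    return (ratio ** 6 - ratio ** 18) / (1 - ratio ** 18)


green = green_take_home()
rose = GOAL * rose_success_probability()
print(f"Green: take home {float(green):.2f}, expected loss {float(START - green):.2f}")
print(f"Rose:  take home {float(rose):.2f}, expected loss {float(START - rose):.2f}")
```

Exact arithmetic gives a success probability for Mrs. Rose of roughly .4487, so her take-home figure comes out a cent or two above the $8.06 obtained in the text from the rounded value .448; the comparison with Mr. Green is unaffected.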

Proving Wald's identities

In keeping with the spirit of this article, we will avoid complications involving σ-fields and other advanced notions by assuming that the random variables \( X_i \) are discrete random variables taking integer values. We further assume that they are i.i.d., that is, independent and identically distributed. Identically distributed means that they take the same values with the same probabilities, that is, for each k,

\[ \Pr(X_i = k) = \Pr(X_j = k) \quad \text{for all } i \text{ and } j. \]

This implies that all \( E[X_i] \) are equal and that all \( E[X_i^2] \) are equal. Independence means that knowing the values of some of the random variables does not affect the probabilities of the values of the other random variables. This can be stated in terms of conditional probabilities, for instance, \( \Pr(X_2 = 5 \mid X_1 = -1) = \Pr(X_2 = 5) \). Our needs are rather special, and we won't need such detailed information. What we need is that if Y and Z are independent random variables, then \( E[YZ] = E[Y] \cdot E[Z] \). In probability, the characteristic function of an event A is called an indicator function and written \( I_A \). Two events A and B are independent if and only if their indicator functions \( I_A \) and \( I_B \) are independent random variables.

To help understand the proofs of Wald's identities, first imagine for a moment that the stopping time T is independent of all of the outcomes \( X_1, X_2, \ldots \), so that whether \( T(\omega) = n \) does not depend on the values \( X_1(\omega), X_2(\omega), \ldots, X_n(\omega) \). For example, suppose Jim and Ken played independent sequences of games, that Jim used a double-or-nothing stopping time T, and that Ken used the same stopping time, so that they would quit at the same time. Then T would be independent of Ken's sequence of outcomes, and there would be no limit on how much Ken might lose (or win).

In this case, where T is independent of all outcomes, the proof of Wald's first identity is easy and natural by considering T on each of the sets where it is constant and summing:

\[ E(S_T) = \sum_{n=1}^{\infty} E(S_T \cdot I_{\{T=n\}}) = \sum_{n=1}^{\infty} E(S_n \cdot I_{\{T=n\}}). \]

Since \( S_n \) and \( I_{\{T=n\}} \) are independent, we get

\[ E(S_T) = \sum_{n=1}^{\infty} E(S_n) \cdot E(I_{\{T=n\}}) = \sum_{n=1}^{\infty} n \cdot E(X_1) \cdot E(I_{\{T=n\}}), \]

so that

\[ E(S_T) = E(X_1) \cdot \sum_{n=1}^{\infty} n \cdot \Pr(\{T = n\}) = E(X_1) \cdot E(T). \]

However, in each of our examples, the stopping time T is not independent of the variables \( S_n \). This is what makes them interesting. For instance, if we knew that \( T(\omega) = 17 \) in one of our first examples, then we would know that \( S_{17}(\omega) \) must be -5, 5 or 6. What is true in all of our examples is that \( X_n \) is independent of \( \{T = n - 1\} \) and, in fact, \( X_n \) is independent of \( \{T \le n - 1\} \): knowing that the process stops before n steps gives no information about the nth outcome. Then \( X_n \) also is independent of the complement \( \{T \ge n\} \). In terms of functions,

for each n, the random variables \( X_n \) and \( I_{\{T \ge n\}} \) are independent.

To prove Wald's identities, all we need is this observation and the following key formulas, which hold for any sequence \( \{X_i\} \) of random variables and any random variable T with range {1, 2, 3, ...}:

\[ S_T = \sum_{n=1}^{\infty} X_n \cdot I_{\{T \ge n\}} \quad \text{and} \quad E[T] = \sum_{n=1}^{\infty} \Pr\{T \ge n\}. \]

The first equality is easily checked at each sequence \( \omega \) of outcomes. For instance, if \( T(\omega) = 17 \) then \( I_{\{T \ge n\}}(\omega) = 1 \) if and only if \( n \le 17 \), so the sum equals \( \sum_{n=1}^{17} X_n(\omega) = S_{17}(\omega) = S_T(\omega) \). We derive the second equality from the first equality by letting each \( X_n \) be the identically 1 function, a random variable that isn't very random. This gives \( T = \sum_{n=1}^{\infty} I_{\{T \ge n\}} \). Taking the expectation E of both sides yields the second equality. These computations work so long as \( E[|X_1|] \) and \( E[T] \) are finite.
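The second key formula can be tried out on a concrete stopping time. The sketch below uses Mr. Green's roulette strategy from the earlier example as an assumed test case (he stops at his first win or after 12 spins, whichever comes first); the helper names are hypothetical. It computes E[T] both from the distribution of T and from the tail sum.

```python
from fractions import Fraction

LOSE, WIN = Fraction(18, 19), Fraction(1, 19)
LIMIT = 12  # Mr. Green starts with $12, so he can afford at most 12 spins


def pr_T_equals(n):
    """Pr(T = n): first win on spin n, or the game ending on spin LIMIT."""
    if n < LIMIT:
        return LOSE ** (n - 1) * WIN
    # Spin 12 ends the game whether he wins it or is ruined by it.
    return LOSE ** (LIMIT - 1)


def pr_T_at_least(n):
    """Pr(T >= n): the first n - 1 spins were all losses."""
    return LOSE ** (n - 1)


direct = sum(n * pr_T_equals(n) for n in range(1, LIMIT + 1))
tail_sum = sum(pr_T_at_least(n) for n in range(1, LIMIT + 1))
assert sum(pr_T_equals(n) for n in range(1, LIMIT + 1)) == 1
assert direct == tail_sum
print(float(tail_sum))
```

Both computations agree exactly, giving an expected duration of a little over 9 spins for this strategy.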

WALD'S IDENTITY I. With the hypotheses above, \( E[|X_1|] \) finite and \( E[T] \) finite,

\[ E[S_T] = E[X_1] \cdot E[T]. \]

Proof. Since \( X_n \) and \( I_{\{T \ge n\}} \) are independent, we have

\[ E[S_T] = \sum_{n=1}^{\infty} E[X_n \cdot I_{\{T \ge n\}}] = \sum_{n=1}^{\infty} E[X_n] \cdot E[I_{\{T \ge n\}}] = \sum_{n=1}^{\infty} E[X_1] \cdot \Pr[T \ge n] \]

and this equals \( E[X_1] \cdot E[T] \).

Actually, to interchange sum and expectation, we need dominated convergence. First put \( |X_n| \) into the equations above to get the dominating function (a sum), and proceed. ∎

WALD'S IDENTITY II. With the same hypotheses, if \( E[X_1] = 0 \), and if \( E[X_1^2] \) is finite, then we have

\[ E[S_T^2] = E[X_1^2] \cdot E[T]. \]

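Proof. Again, convergence must be dealt with in general, but here is the basic idea. Since \( S_T = \sum_{n=1}^{\infty} X_n \cdot I_{\{T \ge n\}} \), we have

\[ S_T^2 = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} X_n X_m \cdot I_{\{T \ge n\}} \cdot I_{\{T \ge m\}} = \sum_{n=1}^{\infty} X_n^2 \cdot I_{\{T \ge n\}} + 2 \sum_{n=1}^{\infty} \sum_{m=n+1}^{\infty} X_n X_m \cdot I_{\{T \ge m\}}. \]

For m > n, \( X_m \) and \( X_n \) are independent, and so are \( X_m \) and \( I_{\{T \ge m\}} \). So \( X_m \) is independent of the product \( X_n \cdot I_{\{T \ge m\}} \). Therefore

\[ E[X_m \cdot X_n \cdot I_{\{T \ge m\}}] = E[X_m] \cdot E[X_n \cdot I_{\{T \ge m\}}] = 0 \cdot E[X_n \cdot I_{\{T \ge m\}}] = 0. \]

Thus \( E[S_T^2] = \sum_{n=1}^{\infty} E[X_n^2 \cdot I_{\{T \ge n\}}] \). Since each \( X_n \) is independent of \( I_{\{T \ge n\}} \), it follows that each \( X_n^2 \) is independent of \( I_{\{T \ge n\}} \). Hence

\[ E[S_T^2] = \sum_{n=1}^{\infty} E[X_n^2] \cdot \Pr[T \ge n] = E[X_1^2] \cdot E[T]. \] ∎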
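To see both identities at work, the sketch below applies them to the 1/3 versus 2/3 game started at $5. It is an illustration under stated assumptions, not the authors' computation: it reuses the explicit stopping-probability formulas from that section, solves Wald's second identity for E[T], and compares the result with an independent exact solution of the duration recurrence d(N) = 1 + (1/3) d(N+2) + (2/3) d(N-1), which is the standard first-step equation and is assumed here. The helper names `solve` and `duration_by_recurrence` are hypothetical.

```python
from fractions import Fraction

P_WIN, P_LOSE = Fraction(1, 3), Fraction(2, 3)   # win $2 or lose $1
START = 5


def stopping_probabilities(n):
    """Explicit formulas for the stopping values 0, 10, 11."""
    power = (-2) ** n
    return {
        0: Fraction(31744 - 3072 * n - power, 31743),
        10: Fraction(-11 + 2049 * n + 11 * power, 31743),
        11: Fraction(10 + 1023 * n - 10 * power, 31743),
    }


def solve(a, b):
    """Solve the linear system a x = b exactly by Gauss-Jordan elimination."""
    n = len(a)
    rows = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [x / scale for x in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def duration_by_recurrence(start):
    """E[T] from the first-step equations, with d = 0 at 0, 10, and 11."""
    interior = range(1, 10)
    a = [[Fraction(0)] * 9 for _ in interior]
    for n in interior:
        a[n - 1][n - 1] += 1
        for step, prob in ((2, P_WIN), (-1, P_LOSE)):
            if n + step in interior:
                a[n - 1][n + step - 1] -= prob
    return solve(a, [1] * 9)[start - 1]


probs = stopping_probabilities(START)
mean_step = 2 * P_WIN - 1 * P_LOSE                                   # E[X_1]
mean_square_step = 4 * P_WIN + 1 * P_LOSE                            # E[X_1^2]
mean_gain = sum(p * (k - START) for k, p in probs.items())           # E[S_T]
mean_square_gain = sum(p * (k - START) ** 2 for k, p in probs.items())

duration = mean_square_gain / mean_square_step   # Wald II solved for E[T]
assert mean_step == 0 and mean_gain == mean_step * duration          # Wald I
assert duration == duration_by_recurrence(START)
print(duration, float(duration))
```

Under these assumptions the two routes to E[T] agree exactly, at a bit more than 13 games on average, compared with 25 flips for the classical coin game with the same starting fortune and goal.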

Our treatment of Wald's identities is based on the proof in Chow and Teicher [2, section 5.3] and other places, including Chung [3, section 5.5] and Durrett [4, section 3.1]; in the latter two books, Wald II is an exercise. All three books are graduate level measure-theory-based texts.

Acknowledgment. We thank the anonymous referees for their many improvements to the exposition and for an additional reference.

REFERENCES

1. Joseph Bak, The anxious gambler's ruin, this MAGAZINE 74 (2001), 182-193.
2. Yuan Shih Chow and Henry Teicher, Probability Theory: Independence, Interchangeability, Martingales, 1978, Springer-Verlag, New York.
3. Kai Lai Chung, A Course in Probability Theory, 2nd ed., 1974, Academic Press, New York.
4. Richard Durrett, Probability: Theory and Examples, Wadsworth & Brooks/Cole, 1991. Third edition published in 2004.
5. Susanna S. Epp, Discrete Mathematics with Applications, 3rd ed., 2004, Thomson/Brooks-Cole.
6. William Feller, An Introduction to Probability Theory and its Applications, Volume 1, 3rd ed., 1968, John Wiley & Sons, Inc., New York.
7. Robert I. Jewett and Kenneth A. Ross, Random walks on Z, Coll. Math. J. 19 (1988), 330-342.
8. Kenneth H. Rosen, Discrete Mathematics and its Applications, 4th ed., 1999, WCB McGraw-Hill, Boston.
9. Ken Ross, A Mathematician at the Ballpark: Odds and Probabilities for Baseball Fans, 2004, Pi Press, Inc.
10. Kenneth A. Ross and Charles R. B. Wright, Discrete Mathematics, 5th ed., 2003, Prentice-Hall, Upper Saddle River, NJ.
11. Lajos Takacs, On classical ruin problems, J. Amer. Statist. Assoc. 64 (1969), 889-906.


Arthur Cayley and the Abstract Group Concept

SUJOY CHAKRABORTY
Shah Jalal University of Science and Technology
Sylhet-3114, Bangladesh
sujoy_chbty@yahoo.com

MUNIBUR RAHMAN CHOWDHURY
University of Dhaka
Dhaka-1000, Bangladesh

In this article, we examine in considerable detail Cayley's first three papers on abstract group theory (1854-59), with special reference to Cayley's formulation of the abstract group concept. We show convincingly that, as far as finite groups are concerned, Cayley's definition was complete and unequivocal, in contrast to the opinion expressed by some other writers. These early papers on abstract group theory [4, 5, 6] seem to have been completely neglected and swept away by the burgeoning subject of permutation groups. The abstract group concept resurfaced in the work of Kronecker in 1870.

Arthur Cayley (1821-1895) graduated from Cambridge in 1842 as Senior Wrangler and was awarded the prestigious First Smith's Prize. He then served as Fellow and Tutor of Trinity College (Cambridge) for three years. Since there was no prospect of a permanent academic position, he left for London and entered Lincoln's Inn to qualify for the legal profession. He was called to the Bar in 1849 and he practiced in London as a barrister for the next fourteen years until his appointment as the first incumbent of the Sadleirian Professorship of Pure Mathematics at the University of Cambridge. Even during the years of his legal practice, Cayley published a very large number of research papers in diverse areas of mathematics.

Cayley's work spreads over a very wide range of topics, predominantly in the broad fields of algebra and geometry. He was one of the creators of the theory of algebraic invariants. This elaborate edifice, now almost completely forgotten, reigned supreme during the last quarter of the 19th century, and was the "modern higher algebra" of the day. His Collected Mathematical Papers were published by Cambridge University Press in thirteen volumes from 1889 to 1897 and contain 966 items ranging from very short notes to lengthy memoirs.

Prelude: Cayley on November 2, 1853

On November 2, 1853 Cayley dashed off two manuscripts to the Philosophical Magazine; one was published before the year was out, while the other appeared in the first issue of the Magazine in 1854. The first of these is called, "On a property of the caustic by refraction of the circle" [3]. We contend that the work on this paper served Cayley as the inspiration to generalize the group concept. Therefore we discuss its content in some detail.

A thorough exposition of caustics would lead us far beyond our scope. We provide only a general description, referring the reader to papers by Bruce, Giblin, and Gibson [1] or Loe and Beagley [13]. A caustic is a curve that is always tangent to rays of light that proceed from a point to be reflected (or refracted) at a given surface. Readers familiar with envelopes will recognize this description of a caustic as the envelope of the reflected (or refracted) rays. One common way to compute caustics was to use an intermediate curve, the secondary caustic or orthotomic curve, which consists of the reflections of the source point across all lines tangent to the reflecting surface. The caustic is the evolute of the secondary caustic.

The paper under discussion was not the first time that Cayley was writing on caustics. Six years earlier, Cayley had published an alternative derivation of the equation of the caustic resulting from reflection at a circle (meaning: across a circular surface) [2], a result first given by St. Laurent in 1826. The paper [3] also draws on a subsequent paper of St. Laurent, who showed that "in certain cases the caustic by refraction of a circle is identical with the caustic of reflexion of a circle (the reflecting circle and radiant point being, of course, properly chosen)." Cayley proposes to demonstrate

the more general theorem, that the same caustic by refraction of a circle may be considered as arising from six different systems of a radiant point, circle, and index of refraction. The demonstration is obtained by means of the secondary caustic, which is (as is well known) an oval of Descartes. Such oval has three foci, any one of which may be taken for the radiant point: whichever be selected, there can always be found two corresponding circles and indices of refraction. [3]

Cayley derives the equation of the secondary caustic, which contains three parameters: the abscissa � of the radiant point, the radius c of the refracting circle, and the index JL of refraction, and puts it in several equivalent forms of which one is as follows:

[3, p. 120] FIGURE 1 shows this locus, using particular values of the parameters.

Figure 1 Cayley's secondary caustic, with $\xi = 1/3$, $c = 1/2$, $\mu = 2$

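Cayley observes that if we write successively,

\[
\begin{array}{llll}
\xi' = \xi, & c' = c, & \mu' = \mu, & (1)\\[4pt]
\xi' = \dfrac{c^2}{\xi}, & c' = \dfrac{c}{\mu}, & \mu' = \dfrac{c}{\xi}, & (\alpha)\\[8pt]
\xi' = \dfrac{\xi}{\mu^2}, & c' = \dfrac{c}{\mu}, & \mu' = \dfrac{1}{\mu}, & (\beta)\\[8pt]
\xi' = \xi, & c' = \dfrac{\xi}{\mu}, & \mu' = \dfrac{\xi}{c}, & (\gamma)\\[8pt]
\xi' = \dfrac{c^2}{\xi}, & c' = c, & \mu' = \dfrac{c\mu}{\xi}, & (\delta)\\[8pt]
\xi' = \dfrac{\xi}{\mu^2}, & c' = \dfrac{\xi}{\mu}, & \mu' = \dfrac{\xi}{c\mu}, & (\varepsilon)
\end{array}
\]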

. . . then, whichever system of values of $\xi'$, $c'$, $\mu'$ be substituted for $\xi$, $c$, $\mu$, we have in each case identically the same secondary caustic, the effect of the substitution being simply to interchange the different forms of the equation; and we have therefore identically the same caustic. By writing

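$(\xi', c', \mu') = \alpha(\xi, c, \mu)$, &c., $\alpha, \beta, \gamma, \delta, \varepsilon$ will be functional symbols, such as are treated of in my paper "On the Theory of Groups as depending on the symbolic equation $\theta^n = 1$," and it is easy to verify the equations

\[
\begin{aligned}
1 &= \alpha\beta = \beta\alpha = \gamma^2 = \delta^2 = \varepsilon^2,\\
\alpha &= \beta^2 = \delta\gamma = \varepsilon\delta = \gamma\varepsilon,\\
\beta &= \alpha^2 = \varepsilon\gamma = \gamma\delta = \delta\varepsilon,\\
\gamma &= \delta\alpha = \varepsilon\beta = \beta\delta = \alpha\varepsilon,\\
\delta &= \varepsilon\alpha = \gamma\beta = \alpha\gamma = \beta\varepsilon,\\
\varepsilon &= \gamma\alpha = \delta\beta = \beta\gamma = \alpha\delta.
\end{aligned}
\]

[3, pp. 120-121]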

There is a difficulty here: these equations do not hold if the substitutions are named as above (for example, $\alpha^2 = \varepsilon$). But they do hold if we interchange $\beta$ and $\varepsilon$, and $\gamma$ and $\delta$. The paper referred to here is the second one that Cayley completed on November 2, 1853 [4] and the first of the three that we will analyze. In this paper, Cayley wrote on groups for the first time. (In 1849 Cayley had published in the Philosophical Magazine a "Note on the theory of permutations," which had, however, no connection with permutation groups or the group concept. In the 19th century, it was customary to make a distinction between substitution and permutation; the former referred to the process or operation, and the latter referred to the result or image.) Groups of substitutions had been around for some time, at least since Cauchy's numerous papers on the subject from the years 1844-46.

As composition of substitutions is associative, the above six substitutions (as renamed) form a nonabelian group of order 6. Pleasantly surprised and enormously encouraged by the unexpected appearance of a nonabelian group of order 6 in a physical context, Cayley reflected upon the group concept and realized that it was capable of far-reaching generalization. This is what Cayley set out to do in the paper [4]; but he went much further. To be specific, we contend that in this paper Cayley put forward the abstract group concept (albeit limited to the finite case) and consciously and purposely developed abstract group theory to a considerable extent.
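The remark about renaming can be checked mechanically. The sketch below is only an illustration: the explicit formulas for the six substitutions of $(\xi, c, \mu)$ are an assumed reconstruction rather than a quotation from [3], the one-letter names (`a` for $\alpha$, `b` for $\beta$, `g` for $\gamma$, `d` for $\delta$, `e` for $\varepsilon$) are ours, and a product such as $\delta\gamma$ is read as "first $\gamma$, then $\delta$". Equality of the composed maps is tested at two rational sample points, which is evidence for, not a proof of, the identities.

```python
from fractions import Fraction as F

# Substitutions (xi, c, mu) -> (xi', c', mu'), under the original naming.
# ASSUMPTION: these explicit formulas are a reconstruction, not a quotation.
ORIGINAL = {
    "1": lambda xi, c, mu: (xi, c, mu),
    "a": lambda xi, c, mu: (c * c / xi, c / mu, c / xi),
    "b": lambda xi, c, mu: (xi / mu**2, c / mu, 1 / mu),
    "g": lambda xi, c, mu: (xi, xi / mu, xi / c),
    "d": lambda xi, c, mu: (c * c / xi, c, c * mu / xi),
    "e": lambda xi, c, mu: (xi / mu**2, xi / mu, xi / (c * mu)),
}

# Renaming described in the text: interchange beta/epsilon and gamma/delta.
SWAP = {"1": "1", "a": "a", "b": "e", "e": "b", "g": "d", "d": "g"}
RENAMED = {name: ORIGINAL[SWAP[name]] for name in ORIGINAL}

# Cayley's relations: every two-letter word "xy" (apply y first, then x)
# should equal the substitution it is listed under.
RELATIONS = {
    "1": ["ab", "ba", "gg", "dd", "ee"],
    "a": ["bb", "dg", "ed", "ge"],
    "b": ["aa", "eg", "gd", "de"],
    "g": ["da", "eb", "bd", "ae"],
    "d": ["ea", "gb", "ag", "be"],
    "e": ["ga", "db", "bg", "ad"],
}

# Exact rational sample points (xi, c, mu); the first uses the values of Figure 1.
SAMPLES = [(F(1, 3), F(1, 2), F(2)), (F(2, 7), F(3, 5), F(5, 3))]


def relations_hold(subs):
    """Return True if every listed relation holds at every sample point."""
    for point in SAMPLES:
        for target, words in RELATIONS.items():
            for x, y in words:
                if subs[x](*subs[y](*point)) != subs[target](*point):
                    return False
    return True


print(relations_hold(ORIGINAL))  # relations under the original naming
print(relations_hold(RENAMED))   # relations after the interchange
```

Exact `Fraction` arithmetic is used so that the comparison of composed substitutions is not affected by floating-point rounding.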

Detailed analysis of Cayley's first paper

Preparation for the abstract group concept Writers on group theory, before Cayley, dealt with groups of concrete objects (mostly of permutations or, more generally, of operations). Cayley laid the groundwork for the abstract group concept by suggesting that the operations are assumed to obey the associative law, though in general, not the commutative law. Indeed, Cayley devoted considerable effort to prepare the reader for the abstract group concept by alluding to operations, in general, starting thus:



Let $\theta$ be a symbol of operation, which may, if we please, have for its operand, not a single quantity $x$, but a system $(x, y, \ldots)$, so that $\theta(x, y, \ldots) = (x', y', \ldots)$, where $x', y', \ldots$ are any functions whatever of $x, y, \ldots$, it is not even necessary that $x', y', \ldots$ should be the same in number with $x, y, \ldots$. In particular $x', y'$, &c. may represent a permutation of $x, y$, &c., $\theta$ is in this case what is termed a substitution. [4, p. 123]

He continues:

. . . the symbol 1 will naturally denote an operation which (either generally or in regard to the particular operand) leaves the operand unaltered, . . . . A symbol $\theta\phi$ denotes the compound operation, the performance of which is equivalent to the performance, first of the operation $\phi$, and then of the operation $\theta$; $\theta\phi$ is of course in general different from $\phi\theta$. But the symbols $\theta, \phi, \ldots$ are in general such that $\theta . \phi\chi = \theta\phi . \chi$, &c., so that $\theta\phi\chi$, $\theta\phi\chi\omega$, &c. have a definite signification independent of the particular mode of compounding the symbols; this will be the case even if the functional operations involved in the symbols $\theta, \phi$, &c. contain parameters such as the quaternion imaginaries $i, j, k$; but not if these functional operations contain parameters such as the imaginaries which enter into the theory of octaves, &c., and for which, e.g. $\alpha . \beta\gamma$ is something different from $\alpha\beta . \gamma$, a supposition which is altogether excluded from the present paper. [4, pp. 123-124]

Concerning the last sentence of this quote, it is pertinent to recall that the division ring of quaternions was discovered by Sir William Rowan Hamilton (1805-1865) in 1843, while the nonassociative division algebra of octaves (also known as Cayley numbers) was discovered by Cayley in 1845. Cautioning the reader, "The order of the factors of a product $\theta\phi\chi \ldots$ must of course be attended to, since even in the case of a product of two factors the order is material," Cayley sums up his preparatory remarks as follows:

. . . the distributive law has no application to the symbols $\theta\phi \ldots$; and that these symbols are not in general convertible, but are associative. It is easy to see that $\theta^0 = 1$, and that the index law $\theta^m . \theta^n = \theta^{m+n}$, holds for all positive or negative integer values, not excluding 0. It should be noticed also, that if $\theta = \phi$, then, whatever the symbols $\alpha, \beta$ may be, $\alpha\theta\beta = \alpha\phi\beta$, and conversely. [4, p. 124]

In olden days, $\theta^0 = 1$ was treated as a deduction from the index law by setting $m$ or $n$ equal to 0. Nowadays, it is treated as an extra convention, needed to ensure validity of the index law when $m$ or $n$ is equal to 0. The implication $\theta = \phi \Rightarrow \alpha\theta\beta = \alpha\phi\beta$ is the requirement that the operation is well-defined. The reverse implication $\alpha\theta\beta = \alpha\phi\beta \Rightarrow \theta = \phi$ is called the (left and right) cancellation law, and is a characteristic feature of groups.

The group concept and the group table After these preparatory remarks, Cayley immediately introduces the abstract group concept by saying:

A set of symbols $1, \alpha, \beta, \ldots$ all of them different, and such that the product of any two of them (no matter in what order), or the product of any of them onto itself, belongs to the set, is said to be a group¹. [4, p. 124]

(As we will explain later, the footnote explains Cayley's motivation for using the word group.) It is important to note that the "symbols" (elements) may be any kind of mathematical objects, not necessarily "symbols of operation" (that is, functions or mappings); Cayley is thus quite clearly speaking of a "group" in the abstract sense, a point which becomes abundantly clear when we study his classification of groups of orders 4 and 6.


The use of the term set here, in its precise technical sense, long before Georg Cantor (1845-1918), is noteworthy. The pivotal role of the symbol 1 as the identity element is tacitly implied, and the requirement of the associativity is not repeated as being superfluous after the introductory remarks. The very next sentence is the following:

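It follows that if the entire group is multiplied by any one of the symbols, either as further or nearer factor, the effect is simply to reproduce the group; or what is the same thing, that if the symbols of the group are multiplied together so as to form a table, thus:

\[
\begin{array}{cc|cccc}
 & & \multicolumn{4}{c}{\text{Further factors}}\\
 & & 1 & \alpha & \beta & \cdots\\ \hline
\text{Nearer factors} & 1 & 1 & \alpha & \beta & \cdots\\
 & \alpha & \alpha & \alpha^2 & \beta\alpha & \cdots\\
 & \beta & \beta & \alpha\beta & \beta^2 & \cdots\\
 & \vdots & & & &
\end{array}
\]

that as well each line as each column of the square will contain all the symbols $1, \alpha, \beta, \ldots$. [4, p. 124]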

The statement is insightful and informative. In modern language, each of the maps $x \to xa$ (right multiplication by $a$) and $x \to ax$ (left multiplication by $a$) is bijective. In other words, each of these maps effects a permutation of the elements of the group. We have to consider this as the earliest occurrence of what later came to be known as Cayley's Theorem: Every group of order $n$ is isomorphic to a group of permutations on $n$ letters. This is the real significance of the sentence quoted above; the introduction of the group table is merely an adjunct. Observe that nowadays the group table is usually the transpose of Cayley's group table. The reason that the maps $x \to xa$ and $x \to ax$ are injective is the validity of the right and left cancellation law in a group. The proof of the cancellation law requires the use of the existence of inverse elements. It may seem that Cayley is not explicit about the existence of inverse elements in connection with his introduction of the abstract group concept. One might, therefore, think that Cayley's definition of a group is incomplete; someone has gone so far as to say that it is "muddled" [11]. The objection is not valid, because for a finite set closed under a binary operation, the existence of inverse elements follows from the associative property and the cancellation law: Every finite cancellative semigroup is a group.

For easy reference we include a proof of this result. Here we need not assume the existence of an identity element. It is enough to show that such a semigroup $S$ has a left identity element $e$ and each $a \in S$ has a left inverse $a' \in S$ with respect to $e$. Choose $c \in S$ (arbitrary but fixed); the map $x \to xc$ is injective (by the right cancellation law), hence also surjective (because $S$ is finite). So, there exists an element $e \in S$ such that $ec = c$. This $e$, we claim, is a left identity for $S$. Given any $a \in S$, the map $x \to cx$ is surjective (for similar reasons). So, there exists an element $b \in S$ such that $cb = a$. Then $ea = e(cb) = (ec)b = cb = a$. The existence of a left inverse of $a$ with respect to $e$ results from the surjectivity of the mapping $x \to xa$.

Further, Cayley's allusion to the index law $\theta^m . \theta^n = \theta^{m+n}$, "for all positive or negative integer values," in the preparatory remarks shows that he was fully aware that in a group, every element $\theta$ must have an inverse element $\theta^{-1}$, because he refers expressly to negative values of $m$ or $n$, and because negative integral powers of $\theta$ are defined as positive integral powers of $\theta^{-1}$. Moreover, in a finite group, $\theta^{-1}$ is expressible as a positive integral power of $\theta$, a fact implied by the very title of the paper ($\theta^n = 1 \Rightarrow \theta^{-1} = \theta^{n-1}$).
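The argument that a finite cancellative semigroup is a group can be followed step by step on a small example. The sketch below is illustrative only: the function names and the sample set $\{2, 4, 1\}$ under multiplication modulo 7 are our own choices, and the set is assumed to be closed under the operation.

```python
from itertools import product


def is_associative(S, op):
    """Check (xy)z == x(yz) for all triples of the finite set S."""
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(S, repeat=3))


def is_cancellative(S, op):
    """Left and right cancellation: no repeated entry in any row or column."""
    rows = all(len({op(a, x) for x in S}) == len(S) for a in S)
    cols = all(len({op(x, a) for x in S}) == len(S) for a in S)
    return rows and cols


def left_identity_and_inverses(S, op):
    """Follow the proof: find e with ec = c, then a left inverse of each a."""
    c = S[0]                                      # arbitrary but fixed
    e = next(x for x in S if op(x, c) == c)       # x -> xc is surjective
    for a in S:
        b = next(x for x in S if op(c, x) == a)   # x -> cx is surjective
        assert op(e, a) == op(op(e, c), b) == a   # ea = (ec)b = cb = a
    inverses = {a: next(x for x in S if op(x, a) == e) for a in S}
    return e, inverses


def mod7(x, y):
    return (x * y) % 7


def mod4(x, y):
    return (x * y) % 4


S = [2, 4, 1]  # closed under multiplication modulo 7
assert is_associative(S, mod7) and is_cancellative(S, mod7)
e, inverses = left_identity_and_inverses(S, mod7)
print(e, inverses)

# Each row of the table is a permutation of S (the content of Cayley's remark).
print(all(sorted(mod7(a, x) for x in S) == sorted(S) for a in S))

# Without cancellation the argument fails: residues modulo 4 under multiplication.
print(is_cancellative([0, 1, 2, 3], mod4))
```

The identity is deliberately not placed first in `S`, so that the search for `e` has something to find.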


From Cayley's writing we cannot really be sure whether Cayley implicitly assumed the left and right cancellation laws as constituting a defining property of a (finite) group or whether he considered these laws as consequences of the (somewhat implicit but essentially complete) group axioms. In the former event, Cayley's definition of a finite group (as a cancellative semigroup) would be complete and unequivocal. In the latter event, this would still be true if our interpretation about Cayley's implicit reference to inverse elements is accepted.

The order of an element and groups of order 4 Observing that "the product of any number of the symbols, with or without repetitions, and in any order whatever, is a symbol of the group," Cayley continues:

Suppose that the group $1, \alpha, \beta, \ldots$ contains $n$ symbols, it may be shown that each of these symbols satisfies the equation $\theta^n = 1$; so that a group may be considered as representing a system of roots of this symbolic binomial equation. It is, moreover, easy to show that if any symbol $\alpha$ of the group satisfies the equation $\theta^r = 1$, where $r$ is less than $n$, then that $r$ must be a submultiple of $n$; it follows that when $n$ is a prime number, the group is of necessity of the form $1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$, $(\alpha^n = 1)$; and the same may be (but is not necessarily) the case, when $n$ is a composite number. [4, pp. 124-125]

Again, these lines are insightful and informative. First, every element of a finite group of order $n$ satisfies the equation $a^n = e$ (to use present-day notation) and the order (index in Cayley's terminology) of every element is a divisor of $n$. The existence of a least positive integer $r$ with this property follows from considering the sequence of positive integral powers of $\theta$; $r$ is clearly $\leq n$. That $r$ must be a submultiple of $n$ (meaning that $n$ is divisible by $r$), when $r < n$, is a claim Cayley passes over without proof as being "easy to show"; whereas just in the preceding sentence Cayley says: "it may be shown that each of these symbols satisfies the equation $\theta^n = 1$," likewise without proof. For a nonabelian group the relation $\theta^n = 1$ is not at all obvious. One is at a loss to understand why Cayley chose to suppress the proofs.

Second, when $n$ is prime, the only abstract group of order $n$ is the cyclic group of that order, but this may be the case also for certain composite values of $n$. It is known that every group of order $n$ is cyclic if and only if $n$ is coprime to $\varphi(n)$, the number of natural numbers not exceeding $n$ that are coprime to $n$. An inductive proof of the sufficiency part of the theorem appeared in our paper, "When is every group of order n cyclic?" [8].

An abstract group of order $n$ may, therefore, be regarded as a system of roots of the symbolic equation $\theta^n = 1$. The analogy to the system of the $n$ (complex) roots of the equation $x^n = 1$ is immediate. Indeed, these roots constitute a cyclic group of order $n$, whether $n$ is prime or composite. The analogy is, however, tenuous, for it breaks down in the very simplest and first possible case, $n = 4$. Indeed, Cayley himself says, "The distinction of the theory of the symbolic equation $\theta^n = 1$, and that of the ordinary equation $x^n - 1 = 0$, presents itself in the very simplest case, $n = 4$" [4, p. 125]. In addition to the cyclic group of order 4, there exists a noncyclic group $1, \alpha, \beta, \gamma$ of order 4.
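The criterion that every group of order $n$ is cyclic exactly when $n$ is coprime to $\varphi(n)$ is easy to tabulate. The following sketch uses a naive totient and hypothetical function names of our own; it lists the orders up to 30 that satisfy the criterion.

```python
from math import gcd


def phi(n):
    """Euler's totient: count of 1 <= k <= n with gcd(k, n) == 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def every_group_is_cyclic(n):
    """Criterion quoted in the text: n is coprime to phi(n)."""
    return gcd(n, phi(n)) == 1


cyclic_orders = [n for n in range(1, 31) if every_group_is_cyclic(n)]
print(cyclic_orders)
# → [1, 2, 3, 5, 7, 11, 13, 15, 17, 19, 23, 29]
```

Note that 15 appears although it is composite, while 4 does not, in line with Cayley's remark that the distinction first shows itself at $n = 4$.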
Cayley's proof of the classification of groups of order 4 is as elegant as any to be found in present-day textbooks. For the latter group he proves

\[
\alpha = \beta\gamma = \gamma\beta, \qquad \beta = \gamma\alpha = \alpha\gamma, \qquad \gamma = \alpha\beta = \beta\alpha;
\]

"and we have thus a group essentially distinct from that of the system of roots of the ordinary equation $x^4 - 1 = 0$" [4, p. 126]. Cayley then mentions three concrete realizations of this abstract group. The first two are groups of transformations of elliptic functions and binary quadratic forms,


respectively, while the third example is from the then emerging theory of matrices ("a theory which indeed might have preceded the theory of determinants"). This example is especially interesting, because here the group is generated by the two operations of inversion and transposition of (invertible) matrices (of a fixed order greater than 1), whose commutativity Cayley expressly points out. Without pausing to explain the terminology or the notation, we mention Cayley's first two examples [4, p. 126]: In the theory of elliptic functions, let

\[
\alpha(n) = \frac{c^2}{n}, \qquad \beta(n) = -\frac{c^2 + n}{1 + n}, \qquad \text{and} \qquad \gamma(n) = -\frac{c^2(1 + n)}{c^2 + n},
\]

where $n$ is the parameter. In the theory of the quadratic form $Q(x, y) = ax^2 + 2bxy + cy^2$, let

\[
\alpha(a, b, c) = (c, b, a), \qquad \beta(a, b, c) = (a, -b, c), \qquad \text{and} \qquad \gamma(a, b, c) = (c, -b, a).
\]

Then in each case, $1, \alpha, \beta, \gamma$ form a noncyclic group of order 4. Two further examples that may be more accessible are (i) the residue classes 1, 3, 5, 7 modulo 8, under multiplication modulo 8 and (ii) the matrices

under multiplication of matrices.

Groups of order 6 and the group algebra Next, Cayley embarks upon a detailed and careful analysis of all possible abstract groups of order 6, and proves, much as present-day textbooks do, that there exist two, and only two, nonisomorphic abstract groups of order 6: one is the cyclic group of order 6, and the other group may be presented in the form $1, \alpha, \alpha^2, \gamma, \alpha\gamma, \alpha^2\gamma$, where $\alpha^3 = 1$, $\gamma^2 = 1$, and $\gamma\alpha = \alpha^2\gamma$. Cayley begins his analysis of a group of six symbols $1, \alpha, \beta, \gamma, \delta, \varepsilon$, by showing that

it is impossible that all the roots [except 1, of the symbolic equation $\theta^6 = 1$] should have the index 2. This may be done by means of a theorem which I shall for the present assume, viz. that if among the roots of the symbolic equation $\theta^n = 1$, there are contained a system of roots of the symbolic equation $\theta^p = 1$ (or, in other words, if among the symbols forming a group of the order $n$ there are contained symbols forming a group of the order $p$), then $p$ is a submultiple of $n$ [our italics]. In the particular case in question, a group of the order 4 cannot form part of the group of the order 6. [4, p. 127]

A group of order 6 cannot have a cyclic subgroup of order 4 because, as Cayley stated earlier (without proof), the order of every element divides the order of the group. Cayley proved: if every nonidentity element had order 2, then any two of them would generate the noncyclic group of order 4. This would violate the general theorem that Cayley states so clearly. Thus, Cayley has stated here, without proof, Lagrange's Theorem for abstract groups and applied it in a specific situation. Why Cayley simply assumed the theorem and omitted the proof is somewhat of a mystery. Having proved the existence of a root (element) of index (order) 3, Cayley denotes this element by $\alpha$. Then the group must have the form $1, \alpha, \alpha^2, \gamma, \alpha\gamma, \alpha^2\gamma$ (with $\alpha^3 = 1$), where $\gamma^2$ must be $1$, $\alpha$, or $\alpha^2$. The suppositions $\gamma^2 = \alpha$ and $\gamma^2 = \alpha^2$ give a cyclic group.



It only remains, therefore, to assume $\gamma^2 = 1$; then we must have either $\gamma\alpha = \alpha\gamma$ or else $\gamma\alpha = \alpha^2\gamma$. The former assumption leads to the group $1, \alpha, \alpha^2, \gamma, \alpha\gamma, \alpha^2\gamma$, ($\alpha^3 = 1$, $\gamma^2 = 1$, $\gamma\alpha = \alpha\gamma$), which is, in fact, analogous to the system of roots of the ordinary equation $x^6 - 1 = 0$; and by putting $\alpha\gamma = \lambda$, might be exhibited in the form

\[
1, \lambda, \lambda^2, \lambda^3, \lambda^4, \lambda^5, \qquad (\lambda^6 = 1),
\]

under which this system has previously been considered. The latter assumption leads to the group $1, \alpha, \alpha^2, \gamma, \alpha\gamma, \alpha^2\gamma$, ($\alpha^3 = 1$, $\gamma^2 = 1$, $\gamma\alpha = \alpha^2\gamma$), and we have thus two, and only two, essentially distinct forms of a group of six. [4, p. 127]

Denoting the elements by $1, \alpha, \beta, \gamma, \delta, \varepsilon$, Cayley presents the multiplication table of each group.

An instance of a group of this kind [the nonabelian group of order 6] is given by the permutations of three letters; the group $1, \alpha, \beta, \gamma, \delta, \varepsilon$ may represent a group of substitutions as follows:

\[
\frac{abc}{abc}, \quad \frac{cab}{abc}, \quad \frac{bca}{abc}, \quad \frac{acb}{abc}, \quad \frac{cba}{abc}, \quad \frac{bac}{abc}.
\]

Another singular instance is given by the optical theorem proved in my paper 'On a property of the Caustic by refraction of a Circle' . [4, p. 1 29]

In our opinion, it is the discovery of this group that prompted Cayley to introduce the abstract group concept. Cayley then hints at the construction of the group algebra of a group in these words:

. . . if instead of considering $\alpha, \beta$, &c. as symbols of operation, we consider them as quantities (or, to use a more abstract term, 'cogitables') such as the quaternion imaginaries; [then] the equations expressing the existence of the group are, in fact the equations defining the meaning of the product of two complex quantities of the form $w + a\alpha + b\beta + \cdots$; [4, p. 129]

Thus in the case of the noncyclic group of order 6,

\[
(w + a\alpha + b\beta + c\gamma + d\delta + e\varepsilon)(w' + a'\alpha + b'\beta + c'\gamma + d'\delta + e'\varepsilon) = (W + A\alpha + B\beta + C\gamma + D\delta + E\varepsilon),
\]

where

\[
\begin{aligned}
W &= ww' + ab' + a'b + cc' + dd' + ee',\\
A &= wa' + w'a + bb' + dc' + ed' + ce',
\end{aligned}
\]

etc.
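The coefficients of such a product can be read off a multiplication table. The sketch below builds the table from the relations $1 = \alpha\beta = \beta\alpha = \gamma^2 = \ldots$ quoted earlier from [3]; treating those relations as the table of the noncyclic group of order 6 is an assumption made for illustration, since Cayley's own table in [4] is not reproduced here. The one-letter names (`a` for $\alpha$, `b` for $\beta$, `g` for $\gamma$, `d` for $\delta$, `e` for $\varepsilon$) and the function names are ours.

```python
ELEMENTS = ["1", "a", "b", "g", "d", "e"]

# Relations quoted from [3]: each word "xy" equals the key it is listed under.
RELATIONS = {
    "1": ["ab", "ba", "gg", "dd", "ee"],
    "a": ["bb", "dg", "ed", "ge"],
    "b": ["aa", "eg", "gd", "de"],
    "g": ["da", "eb", "bd", "ae"],
    "d": ["ea", "gb", "ag", "be"],
    "e": ["ga", "db", "bg", "ad"],
}

# Full 6 x 6 multiplication table: identity row and column plus the relations.
TABLE = {}
for x in ELEMENTS:
    TABLE[("1", x)] = x
    TABLE[(x, "1")] = x
for target, words in RELATIONS.items():
    for x, y in words:
        TABLE[(x, y)] = target
assert len(TABLE) == 36


def multiply(u, v):
    """Product of two group-algebra elements given as {element: coefficient}."""
    result = {x: 0 for x in ELEMENTS}
    for x, cx in u.items():
        for y, cy in v.items():
            result[TABLE[(x, y)]] += cx * cy
    return result


def terms_for(target):
    """Pairs (x, y) whose product is `target`: the terms of that coefficient."""
    return [(x, y) for x in ELEMENTS for y in ELEMENTS if TABLE[(x, y)] == target]


print(terms_for("1"))  # pairs contributing to W
print(terms_for("a"))  # pairs contributing to A
```

A pair such as `("d", "g")` in the second list corresponds to the term $dc'$: the coefficient $d$ of $\delta$ in the first factor times the coefficient $c'$ of $\gamma$ in the second.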

"It is notable that Bartel van der Waerden (1903-1996) saw this as a significant remark and an early introduction of a 'group algebra'," says Crilly [10, p. 4] and refers to B. L. van der Waerden: A History of Algebra: From Al-Khwarizmi to Emmy Noether [16]. Many years were to elapse before this clear hint was taken up and group algebras were accorded the attention they deserved. It is worth noting that neither Cayley nor any of the other 19th century writers use the word abstract in connection with the group concept. The appellation abstract group is a later development; it is used profusely in Miller's 1916 text [14]. The end of Cayley's first paper Cayley ends this paper by expressing the hope "shortly to resume the subject of the present paper" and concludes with two examples


of nonabelian groups of higher orders. It is impossible to determine whether these examples were constructed in an ad hoc manner, or whether Cayley had in his possession general examples of nonabelian groups of orders $2p^2$ and $p^3$ (where $p$ is prime).

The first of these is a group of [order] eighteen, viz.

where

$\beta^2 = 1$, $\gamma^2 = 1$, $(\beta\gamma)^3 = 1$, $(\gamma\alpha)^3 = 1$, $(\alpha\beta\gamma)^2 = 1$, $(\beta\gamma\alpha)^2 = 1$, $(\gamma\alpha\beta)^2 = 1$;

and the other a group of [order] twenty-seven, viz.

$1, \alpha, \alpha^2, \gamma, \gamma^2, \gamma\alpha, \alpha\gamma, \gamma\alpha^2, \alpha^2\gamma, \gamma^2\alpha, \alpha\gamma^2, \gamma^2\alpha^2, \alpha^2\gamma^2, \alpha\gamma\alpha, \alpha\gamma^2\alpha, \alpha^2\gamma\alpha, \alpha^2\gamma^2\alpha, \alpha\gamma\alpha^2, \alpha\gamma^2\alpha^2, \alpha^2\gamma\alpha^2, \alpha^2\gamma^2\alpha^2, \gamma\alpha\gamma^2, \gamma\alpha^2\gamma^2, \gamma^2\alpha\gamma, \gamma^2\alpha^2\gamma, \gamma^2\alpha\gamma\alpha^2, \gamma\alpha\gamma^2\alpha^2,$

where

It is hardly necessary to remark, that each of these groups is in reality perfectly symmetric, the omitted terms being, in virtue of the equations defining the nature of the symbols, identical with some of the terms of the group. Thus, in the group of [order] 18, the equations $\alpha^2 = 1$, $\beta^2 = 1$, $\gamma^2 = 1$, $(\alpha\beta\gamma)^2 = 1$ give $\alpha\beta\gamma = \gamma\beta\alpha$, and similarly for all the other omitted terms. It is easy to see that in the group of [order] 18 the index of each term is 2 or 3, while in the group of [order] 27 the index of each term is 3. [4, p. 130]

Cayley's motivation for the abstract group concept It is our contention that Cayley's unexpected discovery of a nonabelian group of order 6 in the practical context of geometrical optics, served as the trigger for generalizing the group concept. The crucial point was Cayley's discovery of the six transformations that leave the equation of the secondary caustic unchanged, and his realization that these transformations form a group under the composition of mappings. One may wonder how Cayley noticed that. For one who had discovered the nonassociative algebra of octaves (Cayley numbers) very soon after the discovery (1843) of the division ring of quaternions by Sir William Rowan Hamilton and had worked on invariants, it would have been natural to ask himself the question: What, if any, transformation of variables leaves the equation unchanged? As we mentioned, Cayley's first paper on group theory has a single footnote, marked "1," next to his first usage of the term group:

The idea of a group as applied to permutations or substitutions is due to Galois, and the introduction of it may be considered as marking an epoch in the progress of the theory of algebraic equations. [4, p. 124]

There is no other footnote or any reference to any other writer in the entire series of papers [4, 5, 6]. So, clearly permutation groups served Cayley as a prime motivating example. After classifying the groups of order 4 and before moving on to three concrete examples of the noncyclic group of order 4, Cayley makes the following observation:

Systems of this form are of frequent occurrence in analysis, and it is only on account of their extreme simplicity that they have not been expressly remarked. [4, p. 126]



After classifying the groups of order 6, Cayley gives two examples of the nonabelian group of order 6. Noting that all five of these examples are groups of transformations (including one where the transformations are mappings defined on the set of all invertible matrices of a given order), there is no question that finite transformation groups were for Cayley an equally important motivation for the abstract group concept. Indeed, we are inclined to believe that as far as Cayley was concerned, permutation groups played only a reinforcing role as one instance of the diverse types of concrete examples that are subsumed under a common umbrella by the new concept. Cayley felt so enthusiastic about the abstract group concept that, as early as 1860, he included it among "Recent Terminology in Mathematics, The English Encyclopedia" [7], a collection of terms mostly from the then-emerging theory of invariants. The entry Group covers almost an entire page and is a brief summary of the first several sections of Cayley's paper [4]. Thus there is clear, though indirect, evidence that the theory of invariants, largely a creation of Cayley's, played an important role in shaping the abstract group concept.

Cayley's second paper on group theory

In Cayley's second paper on group theory [5], published soon after the first (but in contrast bearing no date), Cayley introduces the concept of normal subgroup of a group (due to Galois) in a somewhat curious manner by calling a normal subgroup of a group a submultiple of the group, whereas its cosets are called (by him) symmetrical holders. Through his study of symmetrical holders, Cayley arrives at some important results that, possibly because they were couched in somewhat obscure terms, have so far escaped attention. For one, Cayley shows that the nonabelian group of order 6 has a unique nontrivial normal subgroup (of order 3). For another, he shows (without using the terminology) that the only other abstract group (the cyclic group) of order 6 is the (internal) direct product of two subgroups of orders 3 and 2. To substantiate our interpretation we quote Cayley in full:

the group of six $1, \alpha, \alpha^2, \gamma, \gamma\alpha, \gamma\alpha^2$, ($\alpha^3 = 1$, $\gamma^2 = 1$, $\alpha\gamma = \gamma\alpha$), is a multiple of the group of three, $1, \alpha, \alpha^2$ (in fact, $1, \alpha, \alpha^2$ and $\gamma, \gamma\alpha, \gamma\alpha^2$ are each of them a symmetrical holder of the group $1, \alpha, \alpha^2$); and so in like manner the group of six is a multiple of the group of two, $1, \gamma$ (in fact, $1, \gamma$ and $\alpha, \alpha\gamma$ and $\alpha^2, \alpha^2\gamma$ are each a symmetrical holder of the group $1, \gamma$). There would not, in a case such as the one in question, be any harm in speaking of the group of six as the product of the two groups $1, \alpha, \alpha^2$ and $1, \gamma$, but upon the whole it is, I think, better to dispense with the expression. Considering, secondly, the other form of a group of six, viz. $1, \alpha, \alpha^2, \gamma, \gamma\alpha, \gamma\alpha^2$, ($\alpha^3 = 1$, $\gamma^2 = 1$, $\alpha\gamma = \gamma\alpha^2$); here the group of six is a multiple of the group of three, $1, \alpha, \alpha^2$ (in fact, as before, $1, \alpha, \alpha^2$ and $\gamma, \gamma\alpha, \gamma\alpha^2$, are each a symmetrical holder of the group $1, \alpha, \alpha^2$, since, as regards $\gamma, \gamma\alpha, \gamma\alpha^2$, we have $(\gamma, \gamma\alpha, \gamma\alpha^2) = \gamma(1, \alpha, \alpha^2) = (1, \alpha^2, \alpha)\gamma$). But the group of six is not a multiple of any group of two whatever; in fact, besides the group $1, \gamma$ itself, there is not any symmetrical holder of this group $1, \gamma$; and so in like manner, with respect to the other groups of two, $1, \gamma\alpha$, and $1, \gamma\alpha^2$. The group of three, $1, \alpha, \alpha^2$, is therefore, in the present case, the only submultiple of the group of six. [5, p. 132]

It is a pity that having arrived at an example of the (internal) direct product of two subgroups, Cayley seems to have preferred not to follow up the idea behind it.


Cayley's third paper on group theory

Cayley's third paper on group theory [6] is clearly the sequel to the first. Without preamble Cayley begins:

The following is, I believe, a complete enumeration of the groups of [order] 8:

I. $1, \alpha, \alpha^2, \alpha^3, \alpha^4, \alpha^5, \alpha^6, \alpha^7$ ($\alpha^8 = 1$).
II. $1, \alpha, \alpha^2, \alpha^3, \beta, \beta\alpha, \beta\alpha^2, \beta\alpha^3$ ($\alpha^4 = 1$, $\beta^2 = 1$, $\alpha\beta = \beta\alpha$).
III. $1, \alpha, \alpha^2, \alpha^3, \beta, \beta\alpha, \beta\alpha^2, \beta\alpha^3$ ($\alpha^4 = 1$, $\beta^2 = 1$, $\alpha\beta = \beta\alpha^3$).
IV. $1, \alpha, \alpha^2, \alpha^3, \beta, \beta\alpha, \beta\alpha^2, \beta\alpha^3$ ($\alpha^4 = 1$, $\beta^2 = \alpha^2$, $\alpha\beta = \beta\alpha^3$).
V. $1, \alpha, \beta, \beta\alpha, \gamma, \gamma\alpha, \gamma\beta, \gamma\beta\alpha$ ($\alpha^2 = 1$, $\beta^2 = 1$, $\gamma^2 = 1$, $\alpha\beta = \beta\alpha$, $\alpha\gamma = \gamma\alpha$, $\beta\gamma = \gamma\beta$).

That the groups are really distinct is perhaps most readily seen by writing down the indices [orders] of the different terms of each group; these are

I. 1, 8, 4, 8, 2, 8, 4, 8.
II. 1, 4, 2, 4, 2, 4, 2, 4.
III. 1, 4, 2, 4, 2, 2, 2, 2.
IV. 1, 4, 2, 4, 4, 4, 4, 4.
V. 1, 2, 2, 2, 2, 2, 2, 2.

[6, p. 88]

This enumeration is, as we know, correct and complete. However, Cayley does not embark upon a proof of his claim. One point is implicit here: two groups of the same order whose numbers of elements of each (possible) order do not all coincide, must be "really distinct" (in modern terms, nonisomorphic). Cayley next considers the question, "Why there is no group where the symbols $\alpha, \beta$ are such that $\alpha^4 = 1$, $\beta^2 = \alpha^2$, $\alpha\beta = \beta\alpha$."

A group which presents itself for consideration is $1, \alpha, \alpha^2, \alpha^3, \beta, \beta\alpha, \beta\alpha^2, \beta\alpha^3$ ($\alpha^4 = 1$, $\beta^2 = \alpha^2$, $\alpha\beta = \beta\alpha$) but the indices of the different terms of this group are 1, 4, 2, 4, 4, 2, 4, 2, and if we write $\beta\alpha = \gamma$, then we find $\gamma^2 = \beta\alpha\beta\alpha = \beta\beta\alpha\alpha = \alpha^4 = 1$, $\alpha\gamma = \alpha\beta\alpha = \beta\alpha\alpha = \gamma\alpha$; and the group is $1, \alpha, \alpha^2, \alpha^3, \gamma, \gamma\alpha, \gamma\alpha^2, \gamma\alpha^3$ ($\alpha^4 = 1$, $\gamma^2 = 1$, $\alpha\gamma = \gamma\alpha$), which is the group II. [6, pp. 88-89]

This is a further confirmation that Cayley was aware of the idea of isomorphism of groups. The group IV (in Cayley's enumeration) is the quaternion group and Cayley discusses it in detail. "The group IV is a remarkable one; . . . the nature of the group in question will be better understood by presenting it under a different form," says Cayley, and continues:

In fact, if we write $\beta\alpha^3 = \gamma$, $\alpha^2 = \beta^2 = \delta$, then we find $\alpha^3 = \delta\alpha$, $\beta\alpha^2 = \delta\beta$ and $\beta\alpha = \delta\gamma$, and the group will be $1, \alpha, \beta, \gamma, \delta, \delta\alpha, \delta\beta, \delta\gamma$, where the laws of combination are

\[
\delta^2 = 1, \quad \alpha^2 = \beta^2 = \gamma^2 = \delta, \quad \beta\gamma = \alpha, \quad \gamma\alpha = \beta, \quad \alpha\beta = \gamma,
\]
\[
\gamma\beta = \alpha\delta = \delta\alpha, \quad \alpha\gamma = \beta\delta = \delta\beta, \quad \beta\alpha = \gamma\delta = \delta\gamma.
\]

Observe that $\delta$ is a symbol of operation such that $\delta^2 = 1$, and that $\delta$ is convertible with each of the other symbols $\alpha, \beta, \gamma$. It will be not so much a restrictive assumption in regard to $\delta$, as a definition of $-1$ considered as a symbol of operation if we write $\delta = -1$; the group thus becomes

\[
1, \alpha, \beta, \gamma, -1, -\alpha, -\beta, -\gamma,
\]

where

\[
\alpha^2 = \beta^2 = \gamma^2 = -1, \quad \alpha = \beta\gamma = -\gamma\beta, \quad \beta = \gamma\alpha = -\alpha\gamma, \quad \gamma = \alpha\beta = -\beta\alpha.
\]

Hence $\alpha, \beta, \gamma$ combine according to the laws of quaternion symbols $i, j, k$; and it is the only point of view from which the question is here considered which obliges us to consider the symbols as belonging to a group of 8, instead of (as in the theory of quaternions) a group of 4. [6, p. 89]
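The identification of group IV with the quaternion units can be checked directly. The sketch below represents quaternions as 4-tuples of integers with the usual Hamilton product and takes $\alpha = i$, $\beta = j$; this assignment and the helper names are our own choices for illustration, not Cayley's.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def power(q, n):
    """q multiplied by itself n times (n >= 0)."""
    result = ONE
    for _ in range(n):
        result = qmul(result, q)
    return result


def order(q):
    """Least n >= 1 with q^n = 1 (q is assumed to have finite order)."""
    n, r = 1, q
    while r != ONE:
        r, n = qmul(r, q), n + 1
    return n


alpha, beta = I, J  # assumed assignment: alpha = i, beta = j

# Defining relations of group IV: alpha^4 = 1, beta^2 = alpha^2, alpha beta = beta alpha^3.
assert power(alpha, 4) == ONE
assert qmul(beta, beta) == power(alpha, 2)
assert qmul(alpha, beta) == qmul(beta, power(alpha, 3))

# Elements in Cayley's order: 1, a, a^2, a^3, b, ba, ba^2, ba^3.
elements = [qmul(power(beta, q), power(alpha, p)) for q in range(2) for p in range(4)]
indices = [order(x) for x in elements]
print(indices)
# → [1, 4, 2, 4, 4, 4, 4, 4]
```

With this assignment $\gamma = \beta\alpha^3$ comes out as $k$ and $\delta = \alpha^2$ as $-1$, matching the alternative form of the group.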

By presenting the group IV in the alternative form, Cayley makes it clear that the quaternion group $\{\pm 1, \pm i, \pm j, \pm k\}$ is a concrete realization of this group. As Crilly [10] says, "This is Cayley's really strong point: his ability to see connections." The last part of the quote points to the fact that the set of four quaternion units $\{1, i, j, k\}$ is not a group in the technical sense. Cayley then investigates the conditions under which the $mn$ symbols $\alpha^p\beta^q$ (where $p$ has the values $0, 1, 2, \ldots, m - 1$ and $q$ has the values $0, 1, 2, \ldots, n - 1$) may form a group for which $\alpha^m = 1$, $\beta^n = 1$, $\alpha\beta = \beta\alpha^s$, where $s$ is a positive integer to be determined.

. . . then we find $\alpha^u\beta^v = \beta^v\alpha^{us^v}$; and therefore if $v = n$, $\alpha^u = \alpha^{us^n}$ or $\alpha^{u(s^n - 1)} = 1$, whence $u(s^n - 1) \equiv 0$ (mod. $m$); or since $u$ is arbitrary, $s^n - 1 \equiv 0$ (mod. $m$), an equation which, if $m$, $n$ are given, determines the admissible values of $s$; thus, for example, if $n = 2$, and $m$ is a prime number, then $s = 1$ or $s = m - 1$. The equation $\alpha^u\beta^v = \beta^v\alpha^{us^v}$ shows that any combination whatever of the symbols $\alpha, \beta$ can be expressed in the form $\beta^q\alpha^p$ (or, if we please, in the form $\alpha^p\beta^q$). [6, pp. 89-90]

Cayley then carefully shows "that the assumed law is consistent with the associative law, viz. that the expression f3baa f3dac f3I ae can be transformed in one way only into the form f3qaP" [6, p. 90] . Indeed, either manner of forming the above product ·

·

leads to the result f3b+d+ f aa s d+ 1 +csf +e ; so that the associative law is satisfied. Thus, s n = 1 (mod m) is a necessary condition for the mn symbols aP f3q (or f3q aP), where p has the values 0, 1, . . . , m - 1 and q has the values 0, 1, . . . , n - 1, to form a group.


In particular, as already noticed, if $n = 2$, and $m$ is [an odd] prime, then either $s = 1$ or $s = m - 1$; the two groups so obtained are essentially distinct from each other. If $n = 2$, but $m$ is not prime, then $s$ has in general more than two values: thus for $m = 12$, $s^2 \equiv 1$ (mod. 12), which is satisfied by $s = 1$, 5, 7 and 11; the group corresponding to $s = 1$ is distinct from that for any other value of $s$, but I have not ascertained whether the values other than unity do, or do not, give groups distinct from each other. [6, p. 90]
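Cayley's congruence $s^n \equiv 1 \pmod{m}$ is easy to explore by brute force. The following is a minimal sketch, not anything taken from Cayley's paper; the function name `admissible_exponents` is our own, and it simply tests every residue $s$ between 1 and $m - 1$.

```python
def admissible_exponents(m, n):
    """Return every s in 1..m-1 satisfying s**n = 1 (mod m)."""
    # pow(s, n, m) is Python's built-in modular exponentiation.
    return [s for s in range(1, m) if pow(s, n, m) == 1 % m]


# Cayley's example: n = 2, m = 12.
print(admissible_exponents(12, 2))

# n = 2 and m an odd prime: only s = 1 and s = m - 1 survive.
print(admissible_exponents(5, 2))
```

For $m = 12$, $n = 2$ the search returns the four values 1, 5, 7, 11 that Cayley lists; for an odd prime $m$ it returns only 1 and $m - 1$.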

Cayley has thus constructed here the abstract dihedral group of order $2m$. As an example, he writes down the elements of the dihedral group of order 10 as follows:

\[ 1,\ \alpha,\ \alpha^2,\ \alpha^3,\ \alpha^4,\ \beta,\ \beta\alpha,\ \beta\alpha^2,\ \beta\alpha^3,\ \beta\alpha^4, \]

where $\alpha^5 = 1$, $\beta^2 = 1$, $\alpha\beta = \beta\alpha^4$; and observes that the indices (orders) of these elements are, respectively,

\[ 1,\ 5,\ 5,\ 5,\ 5,\ 2,\ 2,\ 2,\ 2,\ 2; \]

and says:

The group is here expressed by means of the symbols $\alpha$, $\beta$, having the indices 5 and 2 respectively, but it may be expressed by means of two symbols having each of them the index 2. Thus putting $\beta\alpha = \gamma$, we find $\beta^2 = 1$, $\gamma^2 = 1$, $(\beta\gamma)^5 = 1$, which is equivalent to $(\gamma\beta)^5 = 1$, and the group may be represented in the form

\[ 1,\ \beta,\ \gamma,\ \beta\gamma,\ \gamma\beta,\ \beta\gamma\beta,\ \gamma\beta\gamma,\ \beta\gamma\beta\gamma,\ \gamma\beta\gamma\beta,\ \beta\gamma\beta\gamma\beta = \gamma\beta\gamma\beta\gamma, \]

VOL. 78, NO. 4, OCTOBER 2005


the equality of the last two symbols being an obvious consequence of the equation $(\beta\gamma)^5 = 1$. It is clear that for any even number $2p$ whatever, there is always a group which can be expressed in this form. [6, p. 91]
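Cayley's construction can be checked mechanically. The sketch below is our own illustration, with hypothetical helper names `make_group`, `is_associative`, and `order`: it encodes the symbol $\beta^q\alpha^p$ as the pair `(q, p)`, multiplies using the rule $\alpha^a\beta^d = \beta^d\alpha^{as^d}$, and then tests associativity and computes the order of each element for $m = 5$, $n = 2$, $s = 4$.

```python
from itertools import product


def make_group(m, n, s):
    """Pairs (q, p) stand for beta^q alpha^p, with alpha^a beta^d = beta^d alpha^(a s^d)."""
    elements = [(q, p) for q in range(n) for p in range(m)]

    def mul(x, y):
        (b, a), (d, c) = x, y
        # beta^b alpha^a * beta^d alpha^c = beta^(b+d) alpha^(a s^d + c)
        return ((b + d) % n, (a * pow(s, d, m) + c) % m)

    return elements, mul


def is_associative(elements, mul):
    """Check (xy)z == x(yz) for every triple of elements."""
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x, y, z in product(elements, repeat=3))


def order(x, mul):
    """Smallest k >= 1 with x^k equal to the identity (0, 0)."""
    k, y = 1, x
    while y != (0, 0):
        y = mul(y, x)
        k += 1
    return k


# Cayley's example: m = 5, n = 2, s = 4, listed as 1, a, ..., a^4, b, ba, ..., ba^4.
elements, mul = make_group(5, 2, 4)
orders = [order(x, mul) for x in elements]
print(is_associative(elements, mul), orders)
```

With $s = 4$ the multiplication is associative and the orders come out as 1, 5, 5, 5, 5, 2, 2, 2, 2, 2, in agreement with Cayley's list. Choosing $s = 2$ instead, which violates $s^2 \equiv 1 \pmod 5$, makes the associativity check fail.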

Concluding remarks

Cayley's early papers on group theory apparently had very little influence on his contemporaries. Even modern writers seem reluctant to give Cayley his due in the creation of abstract group theory. Thus J. Nicholson [15] states that the process of abstraction of group theory "took place mainly during the period 1870-1890 and was carried out almost exclusively by German mathematicians," and is completely silent on Cayley's seminal contribution. On the other hand, Crilly [10, p. 3] says: "he [Cayley] was far from the modern abstract definition of a group in which symbols are defined by their obedience to stated axioms." We trust that we have been able to dispel such a misconception. Cayley had indeed defined a group as "a set of symbols" (of any nature, an epithet which he did not feel necessary to add) obeying certain axioms, stated or implied. His classification of groups of small orders leaves no doubt that he was fully aware of the abstract group concept. Both Wussing and Kleiner clearly recognize Cayley's contribution in this respect.

Cayley published two [our italics] papers [140, 141] [[4] and [5] in our list], "On the Theory of Groups as depending on the Symbolic Equation $\theta^n = 1$," in which he presented a remarkable conception of groups. At a time when the Galois theory of equations had just begun to be known, and when the only explicit groups under study were permutation groups, Cayley recognized the generalizability of the group concept as well as the implicitly group-theoretic nature of many contemporary ideas. [17, p. 230]

Cayley's orientation towards an abstract view of groups, a remarkable accomplishment at this time of the evolution of group theory, was due, at least in part, to his contact with the abstract work of G. Boole. The concern with the abstract foundations of mathematics was characteristic of the circles around Boole, Cayley, and Sylvester already in the 1840s. Cayley's achievement was, however, only a personal triumph. His abstract definition of a group attracted no attention at the time, even though Cayley was already well known. The mathematical community was apparently not ready for such abstraction: permutation groups were the only groups under serious investigation, and more generally, the formal approach to mathematics was still in its infancy. [12, p. 209]

Textbooks also for the most part ignore Cayley's contributions to group theory, except for mentioning "Cayley's Theorem." The fact that Cayley had, in his first paper (1854) on group theory, completely classified abstract groups of orders up to 6, and extended this classification (correctly) to groups of order 8 (by 1859), is never even mentioned in connection with groups of small orders. F. Chowdhury [9] rightly observes:

While the fact that every abstract group is realizable as a group of permutations (something which Cayley mentioned in passing) has been codified into (and glorified as) Cayley's theorem, the fact that (at the same time, 1854) Cayley had classified all abstract groups of orders up to 6, seems to have fallen into complete oblivion. [p. 21]

Even more curious is the fact that the (abstract) noncyclic group of order 4 is named after Felix Klein (Klein's Vierergruppe). Klein was a boy of five when Cayley's first paper appeared. G. A. Miller [14, p. 65, Footnote] mentions as many as eight names for this group, with precise references for their occurrence, but none of them is associated with Cayley. These observations further substantiate our contention that Cayley's


MATHEMATICS MAGAZINE

early papers on group theory were ignored by his contemporaries and were all but forgotten when abstract group theory came to be studied by other mathematicians. The fact that the Philosophical Magazine was primarily a journal of science was no doubt partially responsible for Cayley's papers on group theory going unnoticed by mathematicians. Perhaps, Cayley's choice of the title of the first paper we discussed did little to attract attention to it. "The General Theory of Groups" would have been a more appropriate and attractive title. Or perhaps, the whole topic "may have been too new" [T. Crilly, personal communication] . At any rate, Cayley's early work in abstract group theory was a remarkable accomplishment for its time.

REFERENCES

1. J. W. Bruce, P. J. Giblin, and C. G. Gibson, On caustics of plane curves, Amer. Math. Monthly 88:9 (Nov. 1981), 651-667.
2. Arthur Cayley, On the caustic by reflection at a circle, Camb. Dubl. Math. J. 2 (1847), 128-130, reprinted in Collected Papers 42, vol. 1, 273-275.
3. --, On a property of the caustic by refraction of the circle, Phil. Mag. 6 (1853), 427-431, reprinted in Collected Papers 124, vol. 2, 118-122. Dated: Nov. 2, 1853.
4. --, On the theory of groups, as depending on the symbolic equation $\theta^n = 1$, Phil. Mag. 7 (1854), 40-47, reprinted in Collected Papers 125, vol. 2, 123-130. Dated: Nov. 2, 1853.
5. --, On the theory of groups, as depending on the symbolic equation $\theta^n = 1$.-Second Part, Phil. Mag. 7 (1854), 408-409, reprinted in Collected Papers 126, vol. 2, 131-132.
6. --, On the theory of groups, as depending on the symbolic equation $\theta^n = 1$.-Third Part, Phil. Mag. 18 (1859), 34-37, reprinted in Collected Papers 243, vol. 3, 88-91. Dated: June 9, 1859.
7. --, Recent terminology in mathematics, The English Encyclopedia 5 (1860), 534-542, reprinted in Collected Papers 299, vol. 3, 594-602.
8. S. Chakraborty and M. R. Chowdhury, When is every group of order n cyclic?, GANIT J. Bangladesh Math. Soc. 20 (2000), 97-106.
9. F. Chowdhury, Arthur Cayley and the theory of groups, J. Math. & Math. Sci. 10 (1995), 21-31.
10. T. Crilly, The appearance of set operators in Cayley's group theory, Newsl. S. Afr. Math. Soc. 31:2 (2000), 9-22 (Our page references are to the electronic copy of the manuscript received from the author).
11. J. J. Gray, Arthur Cayley (1821-1895), Math. Intelligencer 17:4 (1995), 62-63.
12. Israel Kleiner, The evolution of group theory: a brief survey, this MAGAZINE 59 (1986), 195-215.
13. Brian J. Loe and Nathaniel Beagley, The coffee cup caustic for calculus students, Coll. Math. J. 28:4 (1997), 277-284.
14. G. A. Miller, Substitution and abstract groups, in G. A. Miller, H. F. Blichfeldt and L. E. Dickson, Theory and Applications of Finite Groups, J. Wiley, 1916, pp. 1-192.
15. J. Nicholson, The development and understanding of the concept of the quotient group, Historia Math. 20 (1993), 68-88.
16. B. L. van der Waerden, A History of Algebra: From al-Khwarizmi to Emmy Noether, Springer-Verlag, 1985.
17. H. Wussing (trans. A. Shenitzer), The Genesis of the Abstract Group Concept, MIT Press, Cambridge, Mass, 1984, p. 331.

Continued from Page 333

Response from Maureen T. Carroll and Steven T. Dougherty

We are thrilled to receive the Merten M. Hasse Prize. It is a great honor to be recognized by the MAA for our exposition. As professors, we have dedicated ourselves to guiding students in the path of mathematical discovery. In the paper, our hope was to make interesting results from finite geometry, combinatorics, and game theory accessible to students through the use of our game. We are especially proud to be honored by an organization that dedicates itself to beautiful mathematics, excellence in teaching, and student involvement. In addition to thanking the University of Scranton for its support, we must also thank our students. Not only did they help inspire the idea of the game, they also keep the spirit alive by competing in our annual tic-tac-toe contest.


Dirichlet: His Life, His Principle, and His Problem

PAMELA GORKIN
Bucknell University
Lewisburg, PA 17837
pgorkin@bucknell.edu

JOSHUA H. SMITH
Department of Mechanical and Aerospace Engineering
University of Virginia
Charlottesville, VA 22904-4746
josh@virginia.edu

In Joseph Fourier's book, Théorie analytique de la chaleur (Analytic Theory of Heat), published in 1822, Fourier studies heat conduction. In his model, temperature is a function of time $t$ and position on a bounded domain, so at a point $(x, y, z)$ of the domain, the function $u$ may be written $u(t, x, y, z)$. This function satisfies the heat equation:

\[ \frac{1}{c^2}\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}, \]

where $c$ is a positive constant. As time passes things might stabilize, entering a state of equilibrium. This means that the function $u$ would be independent of $t$, implying that $\partial u/\partial t = 0$ and therefore the expression on the right (which is called the Laplacian of $u$ and denoted $\Delta u$) must vanish. We might wish to find the equilibrium temperature in the domain when the temperature on the boundary is independent of time. In this context, we are asking for a solution to something now known as the Dirichlet problem, which is also connected to problems in electrostatics and elasticity theory. (Gårding [17] provides a more detailed explanation of this connection.)

Informally, the Dirichlet problem asks whether, given a bounded domain and a continuous function $f$ on the boundary of that domain, we can find a function $u$, continuous on the domain together with its boundary, for which $\Delta u = 0$ throughout the domain and $u = f$ on the boundary of the domain.

As we shall see, Dirichlet used a technique that came to be known as the "Dirichlet principle" to solve this problem. This principle, which was justified by physical arguments, asserts that among all functions (satisfying certain smoothness conditions) with the correct boundary values, there is one particular solution for which the integral of the square of the norm of the gradient is minimal. Other mathematicians used this principle as well. Though the principle was useful and influential, there was a difficulty with it: Karl Weierstrass (1815-1897) objected that no one had proved it. Following Weierstrass's criticism, some mathematicians tried to solve Dirichlet's problem without the principle and others tried to save the (quite useful) principle. In Kline's words [24, p. 704],

The history of the Dirichlet principle is remarkable. Green, Dirichlet, Thomson, and others of their time regarded it as a completely sound method and used it freely. Then Riemann in his complex function theory showed it to be extraordinarily instrumental in leading to major results. All of these men were aware that the fundamental existence question was not settled, even before Weierstrass announced his critique in 1870, which discredited the method for several decades. The principle was then rescued by Hilbert and was used and extended in this [the 20th] century. Had the progress made with the use of the principle awaited Hilbert's work, a large segment of nineteenth-century work on potential theory and function theory would have been lost.

In this paper, we look at Dirichlet's problem and principle. We shall explain Dirichlet's solution as well as Weierstrass's criticism. Having done so, we then turn to our own solution of a very special Dirichlet problem: the case in which the domain is the open unit disk, the boundary is the unit circle, and the continuous function on the boundary is a rational function. Finally, we exhibit an algorithm that produces exact solutions to this famous problem for the special case of rational functions on the unit disk. This solution is surprisingly easy to understand and implement. It was part of the second author's undergraduate honors thesis at Bucknell University [40].

The Dirichlet problem is ideal for early study, not only because of its widespread interest and applicability in many areas of mathematics, physics, and engineering, but also because of its rich history. Yet students rarely see it early in their studies. More often, they first solve the Dirichlet problem in an advanced course that requires a strong background in analysis (such solutions on the ball or disk can be found in several texts, [4, p. 12] or [32, p. 227], though it should be mentioned that Palka's book [32] contains an elementary solution when the data is a polynomial). Some students might not even see it until they take a course in functional analysis [44, p. 199 or p. 254]. It is our hope that our simple solution will be accessible to students at a much earlier stage.

Dirichlet

We briefly describe Dirichlet and many of the people who were to play a key role in developing the solution of the problem. In order to gain a more complete picture of Dirichlet's life than what we provide, the reader may consult various sources [9, 18, 27, 25, 34, 36, 37, 39]. Much of the material in Koch's work [26, p. 174] has been translated into English [8]. In addition, the web site http://www-gap.dcs.st-and.ac.uk/~history/ is easily accessible and quite informative. Each of these articles provides a different point of view, and all are well worth reading.

We turn now to Dirichlet's life. Peter Gustav Lejeune Dirichlet was born on February 13, 1805 in Düren, between Aachen and Cologne. His father was a postmaster and his grandfather hailed from the Belgian town of Richelet (hence the name "Le jeune de Richelet"). In 1817, he went to gymnasium in Bonn and, two years later, in Cologne, where he was taught by the not-yet-famous Georg Simon Ohm.

At this time, Paris was a hub of mathematical activity; Fourier and Poisson were just two of the many brilliant mathematicians living there. Fourier (1768-1830), motivated by physical concerns, was interested in the flow of heat. In 1822, Fourier published what was to become a mathematical classic on the subject of heat, Théorie analytique de la chaleur. Following Fourier, from about 1815 on, Poisson (1781-1840) also worked on heat conduction problems. This active mathematical environment was what attracted Dirichlet when he went to study in Paris in 1822. Fourier's ideas would strongly influence Dirichlet's later work in mathematical physics and trigonometric series.

Dirichlet attended lectures at the Collège de France and the Faculté des Sciences. Most sources refer to his study of Gauss's Disquisitiones arithmeticae, which he not only read carefully and understood, but also simplified. As Kline points out [24, p. 829], "Dirichlet's great work, Vorlesungen über Zahlentheorie, expounded Gauss's Disquisitiones and gave his own contributions." Dirichlet's simplification and clarity of thought, which was to characterize his work, also served the purpose of making Gauss's work accessible to others.

In 1823, the well-educated General Foy, leader of the opposition in the Chamber of Deputies, began looking for someone to teach his children, primarily German language and literature. Dirichlet was recommended and, making a fine impression on the General, he was hired. Dirichlet was pleased with the position, for it enabled him to earn money while leaving him enough time for mathematics [27].

Dirichlet's first paper, Mémoire sur l'impossibilité de quelques équations indéterminées du cinquième degré, won recognition from Fourier and Alexander von Humboldt. In 1827, through von Humboldt, Dirichlet secured a position at the University of Breslau, after being awarded his doctorate (honoris causa, or honorary doctorate) from the University of Bonn. This degree, which would enable Dirichlet to advance through the system, was awarded on January 18, 1827. (Though some literature [1] cites Cologne as the institution awarding Dirichlet this degree, the University of Cologne, founded in 1388, was closed down by the French occupation troops in 1798, and did not reopen until after the First World War [3].) Dirichlet's salary at this time was 400 thalers, which was about half of the amount sought by von Humboldt.

At the University of Breslau, to obtain the title of Privatdozent (which would enable him to lecture at the university) Dirichlet was required to give a Probevorlesung or lecture, write his Habilitation (a post-Ph.D. thesis), and present it in Latin. He did give his lecture (on the irrationality of the number $\pi$) and, because the university needed him to teach, they allowed Dirichlet to complete his Habilitation later.
Dirichlet was not terribly successful teaching in Breslau and, though he was thought of as a well-educated and pleasant young man, students found his teaching style unusual [27]. Since Dirichlet never talked about himself or his accomplishments and since this was just the beginning of his career, his reputation was not widely known among the students. Dirichlet published two works while in Breslau, the second of which was written in Latin. In 1828, Dirichlet was promoted to ausserordentlicher Professor, or senior lecturer (much like our associate professor) in Breslau, but his salary remained at 400 thalers.

In 1828, he took a position at the Allgemeine Kriegsschule (General Military School) in Berlin. According to Kummer [27], Dirichlet felt comfortable teaching in this environment, due in part to the years he had spent with General Foy. While he continued teaching at the Kriegsschule, he also taught at the University of Berlin. In 1829 he became a Privatdozent at the University of Berlin, in 1831 he was promoted to ausserordentlicher Professor, and in 1839 he became ordentlicher Professor, or full professor.

In 1832, Dirichlet married Rebekka Mendelssohn. They had three sons and a daughter. Rebekka's brother was the composer Felix Mendelssohn-Bartholdy, her sister was the composer Fanny Mendelssohn, and her father was a banker. Her grandfather was the philosopher Moses Mendelssohn. (Much more information on the Mendelssohn family is available in Sebastian Hensel's book [19].)

Dirichlet remained in Berlin, except for a period from the fall of 1843 through the spring of 1845, which he spent in Italy together with Carl Jacobi (1804-1851) and the Swiss geometer Jakob Steiner (1796-1863). The Swiss teacher Ludwig Schläfli (1814-1895), who was later to become a well-known mathematician, accompanied Jakob Steiner. According to Koch [26], Dirichlet presented lectures to Schläfli on a daily basis during this time.
(Curiously, while trying to show that the circle is the curve with the largest area among all plane closed curves with a given length, Steiner assumed without proof that a particular maximum value could be attained, the same type of unjustified assumption present in Dirichlet's work. Surprisingly, Steiner's proof was criticized by none other than Dirichlet, and existence of a solution was first proven by Weierstrass. Blåsjö's article [11] and Manna's text [30, p. 38] contain more information about this.)

In 1855, Dirichlet was asked to come to Göttingen to succeed Gauss. Though Dirichlet had at first enjoyed teaching at the Kriegsschule, his dual obligations at the Kriegsschule and the University of Berlin began to overwhelm him. At the university, Dirichlet had a large group of followers, whom he found intellectually stimulating. So, Dirichlet asked the Prussian Ministry of Culture to excuse him from teaching at the Kriegsschule, and he said that he would accept the position in Göttingen if he were not excused from his teaching duties at the Kriegsschule by a certain date. The Prussian Minister waited too long, and Dirichlet moved to Göttingen in 1855.

In 1858, Dirichlet traveled to Switzerland to give a memorial speech about Gauss, and it was there that Dirichlet had a heart attack. In December of 1858, after his return to Göttingen and during his illness, his wife died. Dirichlet died the following spring at the age of 54.

Dirichlet was not only an excellent mathematician, he also attracted a large audience. According to Kummer [27] (who was, incidentally, Dirichlet's successor in Berlin), "Particularly in the later years of his academic endeavors, the success of his teaching, measured externally by the number of listeners, was so significant that, in this respect, surely no other instructor at a German university can match him in the subject of higher mathematics. It was in no way his didactic art to which he owed this success, nor the gift of a brilliant lecture, but rather solely the inner clarity of his mind, which made it possible for him to grasp and present the simple truth of even the most difficult things." Dirichlet's publications, students, and lectures were to be a strong influence on the development of mathematics.
Students in Dirichlet's lectures included Eisenstein (1823-1852), Kronecker (1823-1891), Riemann (1826-1866), and Dedekind (1831-1916). Lipschitz (1832-1903) also studied under Dirichlet, completing his doctorate in 1853. (In some literature, Dirichlet is cited as Lipschitz's advisor; in others, as a second reader on his thesis.)

In 1856-1857, Dirichlet lectured on potential theory in Göttingen, and notes of those lectures were published by F. Grube in 1876. The subject treated in these notes is mathematical, but the influence of physics is apparent. Both Gauss and Dirichlet were influenced by Newton's law of gravitation, which the reader will recall states that any two objects exert a gravitational force of attraction on each other; the direction of the force is along the line joining the centers of mass of the objects and the magnitude of the force is proportional to the product of the masses of the objects, and inversely proportional to the square of the distance between the centers of mass. In electricity and magnetism this law takes the form of Coulomb's law. Newton studied the problem of attraction between a point mass and solid sphere, but the earth was known not to be spherical, and thus Dirichlet's lectures include the attraction on an object by an ellipsoid. Other applications in the lectures include the theory of electricity and of magnetism.

In this set of lecture notes, the Dirichlet problem and Dirichlet's method of treating it (which came to be called Dirichlet's principle) both appear. We mention briefly that this is not the first occurrence of the use of the principle. (The first use of the principle, which is another story in and of itself, seems to be by George Green (1793-1841), whose work was published privately by Green, neglected, and then rediscovered by Sir William Thomson (1824-1907), also known as Lord Kelvin.) We return to the story of the Dirichlet principle later.
Dirichlet's influence was not limited to potential theory. Indeed, Dirichlet is perhaps best known for his influence on the field of number theory (and this is the focus of the articles by Shields [39] and Rowe [34]). Another very nice article on Dirichlet's mathematics is in the Dictionary of Scientific Biography [1].


Keith Devlin mentions Dirichlet's influence in his article, The Forgotten Revolution (http://www.maa.org/devlin/devlin_03_03.html):

In the middle of the nineteenth century, however, a revolution took place. One of its epicenters was the small university town of Göttingen in Germany, where the local revolutionary leaders were the mathematicians Lejeune Dirichlet, Richard Dedekind, and Bernhard Riemann. In their new conception of the subject, the primary focus was not performing a calculation or computing an answer, but formulating and understanding abstract concepts and relationships, a shift in emphasis from doing to understanding. Within a generation, this revolution would completely change the way pure mathematicians thought of their subject. Nevertheless, it was an extremely quiet revolution that was recognized only when it was all over. It is not even clear that the leaders knew they were spearheading a major change.

As an example, Devlin cites Dirichlet's definition of function. In fact, "Dirichlet abandoned the then universally accepted idea of a function as an expression formulated in terms of special mathematical symbols or operations." [1] Indeed, Dirichlet introduced the modern idea of thinking of a function as a correspondence that associates to each $x$ a unique value denoted $f(x)$. (Much more information on this can be found in other articles, such as Kleiner's article on the Evolution of the Function Concept [23] or Rüthing's list of the ever-changing definition of function [35].)

Dirichlet's influence is also made clear by Riemann who, in his Habilitation on Fourier series, wrote the following in the introduction [8, p. 37]: "In composing this thesis, I had the privilege of using some tips of the famous mathematician, to whom we are indebted for the first fundamental work in this subject."

Dirichlet published relatively little, but what he published was of very high quality. Perhaps this is best summed up by the following well-known quote of Gauss: "Gustav Lejeune Dirichlet's works are jewels, and jewels are not weighed with a grocer's scale." [33]

The Dirichlet principle

We now turn to the Dirichlet problem and the principle that was first used to find a solution to it. A real-valued function $u$ on an open subset $\Omega$ of $\mathbb{R}^n$ is called harmonic if it is twice continuously differentiable and the Laplacian of $u$, defined by $\Delta u = \partial^2 u/\partial x_1^2 + \cdots + \partial^2 u/\partial x_n^2$, satisfies $\Delta u = 0$ throughout $\Omega$. The problem that motivated Dirichlet appears below. The conditions are vague, just as they were in the original motivating problem. When we work on our Dirichlet problem, we promise to state the conditions more precisely [10, p. 385; 14].

THE DIRICHLET PROBLEM IN $\mathbb{R}^3$. Given a bounded region $\Omega$, does there exist a unique function $u$ of variables $x$, $y$ and $z$ which, together with its first derivatives, is continuous within the domain, satisfies

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0 \]

and assumes a given value at each point of the boundary?

The idea was, then, to start with a function $f$ continuous on the boundary of the domain, and to find a harmonic function $u$ continuous on the domain together with its boundary, and equal to $f$ on the boundary. As we have already indicated, the problem was solved using something called the Dirichlet principle:

THE DIRICHLET PRINCIPLE IN $\mathbb{R}^3$. On the boundary of a bounded connected set, continuous boundary values are prescribed. Among all functions that are continuous on the set together with its boundary, have continuous partials, and have the given boundary values, there is a function $u$ for which

\[ \iiint_\Omega \left[ \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial u}{\partial z}\right)^2 \right] dV \]

takes its minimum value.

Let us clarify what the principle has to do with the problem. According to Grube's notes [10], Dirichlet sets up this integral of the square magnitude of the gradient, uses the Dirichlet principle to find the minimizing function, and then shows three things: first, that any harmonic function (with the prescribed boundary values) minimizes the integral; second, that any function (with the prescribed boundary values) minimizing the integral must be harmonic; finally, that only one function yields the minimum value of the integral. We present the argument for the first assertion in $\mathbb{R}^2$ here. The second argument is similar and can be found in Dirichlet's lectures, or rather the translation of Grube's version of Dirichlet's lectures [10, p. 385]. Note that once Dirichlet has this result, his notes indicate that this proves that there always exists a function with the desired properties. Why? He believed that it was clear that the integral had a function minimizing it, and now he shows that any minimizing function is harmonic. So if we put these two things together, the minimizing function solves the problem. Uniqueness follows from the maximum principle for harmonic functions, and will not be proved here.

Taking a harmonic function $u$ that solves the Dirichlet problem for the function $f$, let us show why $u$ minimizes the given integral. Suppose $w = u$ on the boundary of $\Omega$ and that $w$ is such that we can take first partial derivatives. Consider the function $v = u - w$ and note that since $u$ and $w$ agree on the boundary of $\Omega$, we have $v = 0$ on the boundary of $\Omega$. Now

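\[
\begin{aligned}
\iint_\Omega \left[ \left(\frac{\partial w}{\partial x}\right)^2 + \left(\frac{\partial w}{\partial y}\right)^2 \right] dA
&= \iint_\Omega \left[ \left(\frac{\partial (u - v)}{\partial x}\right)^2 + \left(\frac{\partial (u - v)}{\partial y}\right)^2 \right] dA \\
&= \iint_\Omega \left[ \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 \right] dA
 + \iint_\Omega \left[ \left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2 \right] dA \\
&\quad - 2 \iint_\Omega \left( \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} \right) dA.
\end{aligned}
\]

Using Green's theorem, recalling that $v = 0$ on the boundary of $\Omega$, and omitting the details, we get

\[ \iint_\Omega \left( \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} \right) dA = - \iint_\Omega v \, \Delta u \, dA. \]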

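Since $u$ is harmonic, this last integral must be zero. Therefore, if $u$ solves the Dirichlet problem and $w$ is any other smooth function with the same boundary values, then with $v = u - w$,

\[
\iint_\Omega \left[ \left(\frac{\partial w}{\partial x}\right)^2 + \left(\frac{\partial w}{\partial y}\right)^2 \right] dA
= \iint_\Omega \left[ \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 \right] dA
+ \iint_\Omega \left[ \left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2 \right] dA
\ge \iint_\Omega \left[ \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 \right] dA.
\]

Thus, $u$ minimizes the integral, as claimed.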
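The same bookkeeping can be watched on a grid. The sketch below is our own illustration under stated assumptions, not part of Dirichlet's or Grube's argument: the domain is the unit square, the harmonic function is $u = x^2 - y^2$, the perturbation is $v = x(1-x)y(1-y)$ (which vanishes on the boundary), and the Dirichlet integral is replaced by a sum of squared differences across grid edges. In two dimensions the factors of the mesh width cancel, so no scaling is needed. The names `grid`, `dirichlet_energy`, and `compare_energies` are hypothetical.

```python
def grid(n, func):
    """Sample func on an n-by-n grid covering the unit square."""
    return [[func(i / (n - 1), j / (n - 1)) for j in range(n)] for i in range(n)]


def dirichlet_energy(w):
    """Sum of squared differences over all horizontal and vertical grid edges."""
    n = len(w)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (w[i + 1][j] - w[i][j]) ** 2
            if j + 1 < n:
                total += (w[i][j + 1] - w[i][j]) ** 2
    return total


def compare_energies(n=41):
    u = grid(n, lambda x, y: x * x - y * y)              # harmonic
    v = grid(n, lambda x, y: x * (1 - x) * y * (1 - y))  # zero on the boundary
    w = [[u[i][j] - v[i][j] for j in range(n)] for i in range(n)]  # w = u - v
    return dirichlet_energy(u), dirichlet_energy(v), dirichlet_energy(w)


e_u, e_v, e_w = compare_energies()
print(e_w - (e_u + e_v))  # cross term, zero up to rounding
print(e_w > e_u)
```

Because second differences of a quadratic are exact, $u = x^2 - y^2$ is also harmonic for the five-point discrete Laplacian, so the cross term vanishes up to rounding and the energy of $w$ exceeds that of $u$ by exactly the energy of $v$.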


Weierstrass weighs in

The Dirichlet principle was used by many mathematicians, including Gauss, Green, Riemann, Thomson, and others. The principle and the existence of a solution to the Dirichlet problem were justified on physical grounds. For example, when Weierstrass wrote his article he used Dedekind's notes of Dirichlet's lectures; in reference to the Dirichlet problem, the notes say [43]:

In fact this theorem is identical with one from the theory of heat that is evident to everyone in that field, namely the theorem that says that when an arbitrarily prescribed temperature remains constant on the boundary of the region, there is one and only one temperature distribution in the interior such that equilibrium takes place. In other words, if the temperature in the interior were arbitrary, then it would approach a final state in which equilibrium would occur.

But as the demand for rigor in mathematics increased, so did the realization that knowing that a set was bounded below was not enough to assure the existence of a function achieving the infimum; in other words, an infimum is not the same as a minimum. It was Weierstrass who indicated that the Dirichlet principle required a proof. His paper is remarkably easy to read, and his example, showing explicitly how the argument can fail, is easy as well. In fact, the example is so simple that it is somewhat surprising that it is not included in introductory analysis texts. (It can be found in texts on variational methods, such as Mikhlin's book [28, p. xvi], and as it makes good reading on the difference between the minimum and infimum, a brief history of the problem is included in the text [13].) We summarize Weierstrass's paper below, just to make our point self-evident.

Weierstrass [43] begins his paper, "On the so-called Dirichlet Principle," by reconstructing Dirichlet's argument as it appeared in Dedekind's notes.
He then proceeds to the following example in R His example is sometimes referred to as a "modified Dirichlet integral" because it seems to be very similar to the one appearing in the Dirichlet principle. We consider a family of integrals

1 1 (x dfE(x ) ) 2 dx , dx _

1

where each integral is similar to the one Dirichlet considered. This family has an infimum (it is obviously bounded below by zero), but no minimum. To define the functions, let $a$ and $b$ be real numbers with $a \neq b$. For each $\epsilon > 0$, define $f_\epsilon$ on the interval $[-1, 1]$ by

$$f_\epsilon(x) = \frac{a+b}{2} + \frac{b-a}{2} \cdot \frac{\arctan(x/\epsilon)}{\arctan(1/\epsilon)}.$$

The graphs of some members of this family, with $a = -1$, $b = 1$, appear in FIGURE 1. We first note, as Weierstrass did, that this function satisfies $f_\epsilon(1) = b$ while $f_\epsilon(-1) = a$.

Figure 1  A typical family of $f_\epsilon$, for $\epsilon = 1, 0.5, 0.1, 0.05, 0.01$

Let's compute the infimum over $\epsilon$ of the integrals

$$\int_{-1}^{1} \left( x \, \frac{df_\epsilon(x)}{dx} \right)^2 dx,$$

and then see what happens if we assume, as many mathematicians of the 19th century did, that there is a function that minimizes the integral. An easy computation of the derivative shows that

$$\frac{df_\epsilon(x)}{dx} = \frac{b-a}{2\arctan(1/\epsilon)} \cdot \frac{\epsilon}{x^2+\epsilon^2}.$$

Thus, we see that

$$
\begin{aligned}
\int_{-1}^{1} \left( x \, \frac{df_\epsilon(x)}{dx} \right)^2 dx
&= \frac{(b-a)^2 \epsilon^2}{(2\arctan(1/\epsilon))^2} \int_{-1}^{1} \frac{x^2}{(x^2+\epsilon^2)^2}\, dx \\
&< \frac{(b-a)^2 \epsilon^2}{(2\arctan(1/\epsilon))^2} \int_{-1}^{1} \frac{x^2+\epsilon^2}{(x^2+\epsilon^2)^2}\, dx \\
&= \epsilon \, \frac{(b-a)^2}{(2\arctan(1/\epsilon))^2} \int_{-1}^{1} \frac{\epsilon}{x^2+\epsilon^2}\, dx \\
&= \epsilon \, \frac{(b-a)^2}{2\arctan(1/\epsilon)}.
\end{aligned}
$$
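The estimate above can be checked numerically. The following Python sketch is only an illustration: the function names and the midpoint rule are our own choices, not part of Weierstrass's paper. It approximates the integral for the values of $\epsilon$ used in FIGURE 1 (with $a = -1$, $b = 1$) and compares each value with the bound $\epsilon (b-a)^2 / (2\arctan(1/\epsilon))$.

```python
import math

def f_eps(x, eps, a=-1.0, b=1.0):
    """Weierstrass's function on [-1, 1]; f_eps(-1) = a and f_eps(1) = b."""
    return (a + b) / 2 + (b - a) / 2 * math.atan(x / eps) / math.atan(1 / eps)

def modified_dirichlet_integral(eps, a=-1.0, b=1.0, steps=100_000):
    """Midpoint-rule estimate of the integral of (x * f_eps'(x))^2 over [-1, 1]."""
    c = (b - a) / (2 * math.atan(1 / eps))
    width = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * width
        derivative = c * eps / (x * x + eps * eps)  # closed form of f_eps'(x)
        total += (x * derivative) ** 2
    return total * width

def upper_bound(eps, a=-1.0, b=1.0):
    """The bound eps * (b - a)^2 / (2 * arctan(1/eps)) from the estimate."""
    return eps * (b - a) ** 2 / (2 * math.atan(1 / eps))

if __name__ == "__main__":
    for eps in (1, 0.5, 0.1, 0.05, 0.01):
        print(eps, modified_dirichlet_integral(eps), upper_bound(eps))
```

Each computed integral should be positive, lie below its bound, and shrink as $\epsilon$ decreases, which is the behavior the argument relies on.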

We already know that the infimum must be greater than or equal to zero. Letting $\epsilon \to 0^+$, we see that the infimum over all such $f_\epsilon$ must be zero. Suppose there were a function, call it $f$, with continuous derivative, that yields the minimum value of the integral. Then this function would have to satisfy

$$\int_{-1}^{1} \left( x \, \frac{df(x)}{dx} \right)^2 dx = 0.$$

Since the integrand is squared, the only way this can happen is if $x^2 (df/dx)^2 = 0$ on the interval $[-1, 1]$. Now, this can only happen if $df/dx = 0$ on the interval, and consequently $f$ must be constant. But $f(1) = b$, while $f(-1) = a$, and Weierstrass had carefully chosen $a \neq b$.

The Weierstrass example is simple and clever. Remember, though, that we are after a solution to the Dirichlet problem, not a proof of the Dirichlet principle. For example, in this case the Dirichlet problem asks for a function, harmonic on the interval $(-1, 1)$, continuous on the closed interval $[-1, 1]$, and equal to a given continuous function on the boundary of $[-1, 1]$, that is, at the points $-1$ and $1$. Finding a solution to the Dirichlet problem in this case is easy (try it!), and certainly does not require the Dirichlet principle.

Towards the end of the 19th century, work in this area continued along two lines. First, some mathematicians tried to prove the principle. This led to the need for a better understanding of maxima and minima. Following Weierstrass's criticism of the principle, many mathematicians also worked on finding alternate solutions to the Dirichlet problem, that is, solutions independent of the Dirichlet principle. These mathematicians included a student of Weierstrass, Hermann Amandus Schwarz (1843-1921), who made significant progress on the two-dimensional problem, Carl Neumann (1832-1925), who was able to prove the existence of a function that solves the Dirichlet problem under fairly general conditions [31], and others [41]. Finally David Hilbert (1862-1943), in 1904 and 1905, presented his "resuscitation" of the Dirichlet principle by establishing it for sufficiently general domains. Hilbert presented his work [20] before the German mathematical congress. More information on this can be found in Monna's text [30, p. 55]. As Courant states in his text entitled Dirichlet's Principle [12, p. 2], "Since Hilbert's pioneering achievement, the theory has been both simplified and extended. Today Dirichlet's principle has become a tool as flexible and almost as simple as that already envisaged by Riemann. It has, moreover, been the starting point for the development of the so-called direct methods of the variational calculus, a development equally important to both pure and applied mathematics."

Rational boundary functions

In this section, we present our solution [40] of a special case of the Dirichlet problem. This solution was motivated by the solution to the Dirichlet problem on the ball in $\mathbb{R}^n$ discovered by Axler and Ramey [6]. Like their solution, our solution is integration free, and consequently does not use the Dirichlet principle. Other relatively simple (though not quite so simple) solutions for various cases have appeared [7, 21, 29]. An interesting use of the existence and uniqueness of the solution to the Dirichlet problem can be found in Edstrom's paper [16], where the author indicates how the solution can be used to sum certain series.

We will solve the Dirichlet problem on the open unit disk $\mathbb{D}$ in $\mathbb{R}^2$ with rational data on the boundary. In other words, our continuous boundary function has the form $p/q$, where $p$ and $q$ are polynomials in $x$ and $y$, and $q$ has no zeros on the unit circle. Theoretically, our solutions are exact. However, the method requires finding the partial fraction expansion of a rational function, which requires finding the roots of a complex polynomial. Once this has been done, we obtain exact solutions to the Dirichlet problem, which provide a useful way to test algorithms for computing solutions to the Dirichlet problem and are accessible to students at a very early stage in their mathematical career.

OUR DIRICHLET PROBLEM. Given a rational function $R$ of two variables that is continuous on the unit circle, find a function $u$, continuous on the closed unit disk, harmonic on the open unit disk $\mathbb{D}$, and equal to $R$ on the unit circle.

EXAMPLE 1. Let $R_1(x, y) = x$. What is $u_1$? Here $u_1(x, y) = x$. It is easy to see that $u_1$ is harmonic on $\mathbb{D}$, continuous on the closed unit disk, and equal to $R_1$ on the unit circle.

EXAMPLE 2. Let $R_2(x, y) = x^3 + xy^2$. This one is harder, but not really that much harder. Note that $R_2(x, y) = x(x^2 + y^2)$ and we want something that equals $R_2$ on the circle $x^2 + y^2 = 1$. As luck would have it, the function $u_2(x, y) = x$ is continuous on the closed unit disk, harmonic on the open disk, and equal to our function $R_2$ on the unit circle.


EXAMPLE 3. Let $R_3(x, y) = 1/(5 + 3x)$. This one is more difficult and the solution will require our algorithm. After explaining the algorithm, we will show how to work through the details to obtain the desired function.

EXAMPLE 4. Let $R_4(x, y) = y^4/(1 + x^2)$.

You will not get this one without our help. (At least we don't think you will.) We will solve this at the end of the paper as well.

The solution to the Dirichlet problem on the disk is usually given by integration against the Poisson kernel [4, p. 12] and an exact computation of the solution is often quite difficult. When the function $R$ is a polynomial defined on the open ball in $\mathbb{R}^n$, Axler and Ramey [6] developed a fast, exact algorithm to compute the solution to the Dirichlet problem using differentiation. In a private communication, Axler mentioned the question of creating an algorithm for rational functions. One such algorithm was developed by R. Walker [42], who created a function of one complex variable and then used the Schwarz integral [2, p. 168] to obtain an exact solution to the Dirichlet problem on the disk when the boundary data is a continuous rational function. As indicated in the introduction, our solution differs from Walker's in that the techniques are integration-free, and based primarily upon techniques available to a student after a second course in calculus.

To understand the solution, the reader will need to know that $z = x + iy$, that the conjugate of $z$ ($\bar{z} = x - iy$) satisfies $\bar{z} = 1/z$ on the circle, how to compute the partial fraction expansion of a rational function over the complex numbers, what a complex analytic function is, and that the real part of an analytic function is harmonic. Since this follows readily from the Cauchy-Riemann equations, students should be able to understand this proof early in their studies in mathematics, physics, or engineering.

Our algorithm has been implemented in Mathematica. You can download the package from the MAGAZINE website. In addition to implementing the algorithm, our program also checks that the solution is harmonic on $\mathbb{D}$ and equal to the original function on the unit circle $\partial\mathbb{D}$.

The solution

We seek to solve the Dirichlet problem for a rational function $R$ of $x$ and $y$ that is continuous on the unit circle. Our solution will be the real part of a certain analytic function $H$, which we denote by $\operatorname{Re}(H)$. First, we create a rational function of one complex variable, which we will call $h$, that agrees with $R$ on the boundary. Throughout, the substitutions we make depend heavily on the fact that we are working on the unit circle. The new rational function arises from the well-known fact that $x = (z + \bar{z})/2$ and $y = (z - \bar{z})/2i$ and from noticing that $\bar{z} = 1/z$ on the unit circle. So if we define

$$h(z) = R\big( (z + 1/z)/2,\ (z - 1/z)/2i \big),$$

then $h$ is as desired: a rational function of one complex variable that is continuous on $\partial\mathbb{D}$ and equal to $R$ on $\partial\mathbb{D}$.

Now we examine $h$. It is a rational function of one complex variable and long division decomposes $h$ as $h = p + s$, where $p$ is a polynomial in $z$ and $s$ is a rational function of $z$ for which the degree of the numerator is less than the degree of the denominator. The Dirichlet problem is linear; that is, if we have the solution for each term in a sum, we can add them to get the solution to the Dirichlet problem for the entire sum. Since $p$ is a polynomial and it is analytic, we know that once we solve the problem for $s$, we can use linearity to solve the problem for $h$. So, let's focus on $s$.
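The substitution that produces $h$ is easy to try out numerically. The sketch below is an illustration only (the helper names `h_from_R` and `max_boundary_gap` are hypothetical, and the long-division step is not shown): it builds $h$ from a given $R$ and measures how far $h(e^{it})$ is from $R(\cos t, \sin t)$ on the unit circle, using the boundary functions of Examples 3 and 4.

```python
import cmath
import math

def h_from_R(R):
    """Build h(z) = R((z + 1/z)/2, (z - 1/z)/(2i)) from a rational R(x, y)."""
    def h(z):
        return R((z + 1 / z) / 2, (z - 1 / z) / 2j)
    return h

def R3(x, y):
    return 1 / (5 + 3 * x)

def R4(x, y):
    return y ** 4 / (1 + x ** 2)

def max_boundary_gap(R, samples=360):
    """Largest |h(z) - R(cos t, sin t)| over sample points z = e^{it}."""
    h = h_from_R(R)
    gap = 0.0
    for j in range(samples):
        t = 2 * math.pi * j / samples
        z = cmath.exp(1j * t)
        gap = max(gap, abs(h(z) - R(math.cos(t), math.sin(t))))
    return gap

if __name__ == "__main__":
    print(max_boundary_gap(R3), max_boundary_gap(R4))
```

Both gaps should be at the level of floating-point rounding, reflecting that $h$ and $R$ agree on $\partial\mathbb{D}$.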


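The rational function $s$ will have finitely many poles, and we denote them by $c_m$. Since $s$ is continuous on the unit circle, $|c_m| \neq 1$. The partial fraction expansion of $s$ will look like

$$s = \sum_{\{m : |c_m| < 1\}} k_m + \sum_{\{m : |c_m| > 1\}} k_m,$$

where $k_m(z) = a_m/(z - c_m)^{n_m}$, $a_m \in \mathbb{C}$, and $n_m \in \mathbb{Z}^+$. If $|c_m| > 1$, then $k_m$ is analytic on an open set containing $\overline{\mathbb{D}}$. On the other hand, if $|c_m| < 1$, we have to replace $k_m$ by an analytic function in $\mathbb{D}$ without changing the boundary values. Here's what we have to do: Let $k(z) = a/(z - c)^n$, where $a$ is a complex number, $c \in \mathbb{D}$, and $n$ is a positive integer. We claim that the function $K$ defined by

$$K(z) = \overline{k(1/\bar{z})}$$

is a rational function that is analytic on an open set containing $\overline{\mathbb{D}}$ and satisfies $\operatorname{Re}(K) = \operatorname{Re}(k)$ on $\partial\mathbb{D}$. To check this, simplify $K$ to obtain

$$K(z) = \frac{\bar{a}}{(1/z - \bar{c})^n} = \frac{\bar{a}\, z^n}{(1 - \bar{c} z)^n}.$$

Since $|c| < 1$, it is clear that $K$ is analytic on an open set containing $\overline{\mathbb{D}}$. Furthermore, since $1/z = \bar{z}$ on $\partial\mathbb{D}$ and conjugation does not change the real part, on $\partial\mathbb{D}$ we have

$$\operatorname{Re}(K(z)) = \operatorname{Re}\left( \frac{\bar{a}}{(1/z - \bar{c})^n} \right) = \operatorname{Re}\left( \frac{a}{(z - c)^n} \right) = \operatorname{Re}(k(z)),$$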

which is what we needed to show.

Let's review: We start with a rational function $R$ and replace it with a rational function of one variable, $h$, which has the same values as $R$ on the circle. Using the linearity of the Dirichlet solution, long division, and a partial fraction decomposition, we reduce our problem to solving the Dirichlet problem for each term in the partial fraction decomposition of $h$. Now we return to the partial fraction decomposition and handle the poles that fall inside the circle. If $|c_m| < 1$, trade the poles inside the disk for poles outside the disk by replacing $k_m$ with the corresponding function $K_m$ as described above. In every step, the real part of the boundary values remains the same. Define a new function $S$ by

$$S = \sum_{\{m : |c_m| < 1\}} K_m + \sum_{\{m : |c_m| > 1\}} k_m.$$

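Then $S$ is analytic on an open set containing $\overline{\mathbb{D}}$ and, since $\operatorname{Re}(K_m) = \operatorname{Re}(k_m)$ on $\partial\mathbb{D}$, it is clear that $\operatorname{Re}(S) = \operatorname{Re}(s)$ on $\partial\mathbb{D}$. Let $H = p + S$. Then $H$ is analytic on an open set containing $\overline{\mathbb{D}}$ and $\operatorname{Re}(H) = \operatorname{Re}(p + S) = \operatorname{Re}(p + s) = \operatorname{Re}(h) = R$ on $\partial\mathbb{D}$. Since the real part of an analytic function is harmonic, we have proved the following theorem:

THEOREM. Let $R$ be a rational function of $x$ and $y$ continuous on $\partial\mathbb{D}$. Then there exists a rational function $u$, harmonic on $\mathbb{D}$, continuous on $\overline{\mathbb{D}}$, and such that $u = R$ on $\partial\mathbb{D}$.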
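The key step in the proof, trading a pole inside the disk for one outside without changing the real part on the circle, can be sanity-checked numerically. The following is a minimal sketch under our own assumptions: the function names and the sample values of $a$, $c$, and $n$ in the usage line are arbitrary illustrations, and `K` uses the simplified form $\bar{a} z^n/(1 - \bar{c} z)^n$.

```python
import cmath
import math

def k(z, a, c, n):
    """A partial-fraction term a / (z - c)^n with its pole c inside the unit disk."""
    return a / (z - c) ** n

def K(z, a, c, n):
    """Reflected term conj(a) * z^n / (1 - conj(c) * z)^n; its pole is 1/conj(c)."""
    return a.conjugate() * z ** n / (1 - c.conjugate() * z) ** n

def max_real_part_gap(a, c, n, samples=720):
    """Largest |Re K - Re k| over sample points on the unit circle."""
    gap = 0.0
    for j in range(samples):
        z = cmath.exp(2j * math.pi * j / samples)
        gap = max(gap, abs(K(z, a, c, n).real - k(z, a, c, n).real))
    return gap

if __name__ == "__main__":
    # Arbitrary test data: a pole at 0.3 + 0.4i (modulus 0.5) of order 3.
    print(max_real_part_gap(2 - 1j, 0.3 + 0.4j, 3))
```

The printed gap should be at rounding level, while `K` itself has no singularity in the closed disk because $|1/\bar{c}| > 1$.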

This theorem not only gives us an algorithm for finding the solution, it also shows that if the function we start out with is rational, then the solution will be as well. A well-known result [6, 38] says that if the data we begin with is a polynomial, and the region is the ball in $\mathbb{R}^n$, then the solution is again a polynomial. A natural question to ask is whether or not this result generalizes to rational functions. After completing this paper, we learned from P. Ebenfelt, D. Khavinson, and H. Shapiro that, in fact, this is not the case. They give an example of a rational function in $\mathbb{R}^3$ for which the solution to the Dirichlet problem on the ball is a transcendental function [15]. Of course, because of our theorem above, this cannot happen in $\mathbb{R}^2$.

Examples

As we shall show in this section, the solutions are generally quite complicated, even if the original function is simple.

EXAMPLE 3 REVISITED. We return to our function $R_3(x, y) = 1/(5 + 3x)$, and we use it to show how our algorithm works.

We first obtain a function of one variable by making the substitutions

$$x = (z + 1/z)/2 \quad \text{and} \quad y = (z - 1/z)/2i$$

to get

$$h(z) = R_3\big( (z + 1/z)/2,\ (z - 1/z)/2i \big) = \frac{1}{5 + 3\,\dfrac{z + 1/z}{2}} = \frac{2z}{3z^2 + 10z + 3}.$$

Since $h$ is a rational function for which the degree of the numerator is less than the degree of the denominator, there is no need for long division. There are two poles of this rational function: one outside the disk at $z = -3$ and one inside the disk at $z = -1/3$. Expanding $h$ using partial fractions yields

$$h(z) = -\frac{1/4}{3z + 1} + \frac{3/4}{z + 3}.$$

The next step in our algorithm is to fix the first term, whose pole at $z = -1/3$ lies inside the disk. Following our algorithm, we replace the first term by $-(1/4)/(1 + 3/z)$ to obtain

$$H(z) = -\frac{1/4}{3/z + 1} + \frac{3/4}{z + 3} = \frac{-(z - 3)}{4(z + 3)} = \frac{9 - x^2 - y^2 - 6iy}{36 + 24x + 4(x^2 + y^2)}.$$

Now our algorithm tells us that $u_3$ should be the real part of $H$, so

$$u_3(x, y) = \operatorname{Re}(H(z)) = \frac{9 - x^2 - y^2}{36 + 24x + 4(x^2 + y^2)}.$$

To verify that this solution is correct, we must check that $\Delta u_3 = 0$ and that $u_3 = R_3$ on the unit circle. The former is messy, but easy. For the latter, recall that on the unit circle $x^2 + y^2 = 1$ and so

$$u_3(x, y) = \frac{9 - 1}{36 + 24x + 4} = \frac{8}{40 + 24x} = \frac{1}{5 + 3x} = R_3(x, y).$$

A contour plot of $u_3$ appears in FIGURE 2 (where higher values are colored lighter). In general, the solution can be quite complicated, as the solution to Example 4 below shows. In this case, checking that it is correct involves replacing powers of $x^2 + y^2$ by 1, which can be difficult.
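Both checks for $u_3$ can also be done numerically. The sketch below is an illustration, not the authors' Mathematica package: it compares $u_3$ with $R_3$ on the circle and with $\operatorname{Re}(H)$ inside the disk, and it estimates $\Delta u_3$ with a five-point finite-difference stencil (the helper `laplacian` and its step size are our own assumptions).

```python
import math

def R3(x, y):
    return 1 / (5 + 3 * x)

def H(z):
    return -(z - 3) / (4 * (z + 3))

def u3(x, y):
    return (9 - x * x - y * y) / (36 + 24 * x + 4 * (x * x + y * y))

def laplacian(u, x, y, step=1e-3):
    """Five-point finite-difference approximation of u_xx + u_yy at (x, y)."""
    return (u(x + step, y) + u(x - step, y) + u(x, y + step) + u(x, y - step)
            - 4 * u(x, y)) / step ** 2

if __name__ == "__main__":
    for j in range(8):
        t = 2 * math.pi * j / 8
        x, y = math.cos(t), math.sin(t)
        print(round(t, 3), u3(x, y) - R3(x, y))        # boundary agreement
    for x, y in [(0.0, 0.0), (0.5, 0.3), (-0.7, 0.2)]:
        print((x, y), u3(x, y) - H(complex(x, y)).real, laplacian(u3, x, y))
```

The boundary differences should vanish up to rounding, and the finite-difference Laplacian should be small (it is only an approximation, so it will not be exactly zero).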

Figure 2  Contour plot of $u_3$

Figure 3  Contour plot of $u_4$

EXAMPLE 4 REVISITED. We return to our function $R_4(x, y) = y^4/(1 + x^2)$. Our algorithm produces the following solution, depicted in FIGURE 3:

$$u_4(x, y) = \frac{N(x, y)}{D(x, y)},$$

where

$$
\begin{aligned}
N(x, y) ={}& -5 + 4\sqrt{2} - 27x^2 + 24\sqrt{2}\,x^2 + 14x^4 + 10x^6 - 24\sqrt{2}\,x^6 + 7x^8 - 4\sqrt{2}\,x^8 + x^{10} \\
& + 27y^2 - 24\sqrt{2}\,y^2 + 60x^2y^2 - 6x^4y^2 - 24\sqrt{2}\,x^4y^2 - 20x^6y^2 - 16\sqrt{2}\,x^6y^2 + 3x^8y^2 \\
& + 14y^4 + 6x^2y^4 + 24\sqrt{2}\,x^2y^4 - 54x^4y^4 - 24\sqrt{2}\,x^4y^4 + 2x^6y^4 \\
& - 10y^6 + 24\sqrt{2}\,y^6 - 20x^2y^6 - 16\sqrt{2}\,x^2y^6 - 2x^4y^6 + 7y^8 - 4\sqrt{2}\,y^8 - 3x^2y^8 - y^{10}
\end{aligned}
$$

and

$$
\begin{aligned}
D(x, y) ={}& 2 + 24x^2 + 76x^4 + 24x^6 + 2x^8 - 24y^2 + 120x^2y^2 + 24x^4y^2 + 8x^6y^2 \\
& + 76y^4 - 24x^2y^4 + 12x^4y^4 - 24y^6 + 8x^2y^6 + 2y^8.
\end{aligned}
$$
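Checking a formula of this size by hand is error-prone, so a numerical check is a reasonable safeguard. The sketch below is our own illustration (the term tables and function names are hypothetical helpers that transcribe the displayed numerator and denominator, with $x^4$ and $y^4$ coefficients of 76 in the denominator). It tests two things: that $u_4$ agrees with $R_4$ on the unit circle, and that $u_4(0, 0)$ equals the average of $R_4$ over the circle, as the mean value property of harmonic functions requires.

```python
import math

SQRT2 = math.sqrt(2)

# Numerator terms: (rational part, multiple of sqrt(2), power of x, power of y).
NUMERATOR = [
    (-5, 4, 0, 0), (-27, 24, 2, 0), (14, 0, 4, 0), (10, -24, 6, 0),
    (7, -4, 8, 0), (1, 0, 10, 0), (27, -24, 0, 2), (60, 0, 2, 2),
    (-6, -24, 4, 2), (-20, -16, 6, 2), (3, 0, 8, 2), (14, 0, 0, 4),
    (6, 24, 2, 4), (-54, -24, 4, 4), (2, 0, 6, 4), (-10, 24, 0, 6),
    (-20, -16, 2, 6), (-2, 0, 4, 6), (7, -4, 0, 8), (-3, 0, 2, 8),
    (-1, 0, 0, 10),
]

# Denominator terms: (coefficient, power of x, power of y).
DENOMINATOR = [
    (2, 0, 0), (24, 2, 0), (76, 4, 0), (24, 6, 0), (2, 8, 0),
    (-24, 0, 2), (120, 2, 2), (24, 4, 2), (8, 6, 2), (76, 0, 4),
    (-24, 2, 4), (12, 4, 4), (-24, 0, 6), (8, 2, 6), (2, 0, 8),
]

def R4(x, y):
    return y ** 4 / (1 + x ** 2)

def u4(x, y):
    num = sum((r + s * SQRT2) * x ** i * y ** j for r, s, i, j in NUMERATOR)
    den = sum(c * x ** i * y ** j for c, i, j in DENOMINATOR)
    return num / den

def max_boundary_gap(samples=360):
    """Largest |u4 - R4| over equally spaced points on the unit circle."""
    points = [(math.cos(2 * math.pi * j / samples), math.sin(2 * math.pi * j / samples))
              for j in range(samples)]
    return max(abs(u4(x, y) - R4(x, y)) for x, y in points)

def boundary_mean(samples=3600):
    """Average of R4 over the unit circle (equally spaced sample points)."""
    return sum(R4(math.cos(2 * math.pi * j / samples), math.sin(2 * math.pi * j / samples))
               for j in range(samples)) / samples

if __name__ == "__main__":
    print(max_boundary_gap(), u4(0.0, 0.0), boundary_mean())
```

This checks the boundary values and the value at the center only; it does not by itself establish harmonicity at every interior point.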

Last thoughts

There is also another elementary solution to the special case of the Dirichlet problem for polynomials on the ball in $\mathbb{R}^n$ that makes a nice addition to an elementary course in linear algebra [7, 21, 22]. It is also possible to solve the Dirichlet problem for polynomial boundary data on an ellipsoid using partial derivatives and linearity [5]. We hope that the simple algorithm presented here will pique a student's interest in the Dirichlet problem, a problem with a fascinating history, as well as broad and appealing applications.

Acknowledgment. This work was part of the second author's honors thesis completed at Bucknell University under the direction of the first author. We thank Bucknell University for its support. We are grateful to Ulrich Daepp for his assistance with the translations as well as his careful reading of the manuscript. We are also grateful to Michael von Renteln, Karl Voss, and the anonymous referees of this paper for their very helpful comments.

REFERENCES

1. Dictionary of Scientific Biography, Charles Scribner's Sons, New York, 1971.
2. Lars V. Ahlfors, Complex Analysis, McGraw-Hill Book Co., New York, 1978.
3. University of Cologne Archives, Private communication, Letter, 2004.
4. Sheldon Axler, Paul Bourdon, and Wade Ramey, Harmonic Function Theory, Graduate Texts in Mathematics, vol. 137, Springer-Verlag, New York, 2001.
5. Sheldon Axler, Pamela Gorkin, and Karl Voss, The Dirichlet problem on quadratic surfaces, Math. Comp. (to appear), no. 9.
6. Sheldon Axler and Wade Ramey, Harmonic polynomials and Dirichlet-type problems, Proc. Amer. Math. Soc. 123:12 (1995), 3765-3773.
7. John A. Baker, The Dirichlet problem for ellipsoids, Amer. Math. Monthly 106:9 (1999), 829-834.
8. H. G. W. Begehr, H. Koch, J. Kramer, N. Schappacher, and E.-J. Thiele, Mathematics in Berlin, Birkhäuser Verlag, Berlin, 1998.
9. K.-R. Biermann, Johann Peter Gustav Lejeune Dirichlet, Dokumente für sein Leben und Wirken, Abh. der Deutschen Akademie der Wissenschaften zu Berlin, Klasse für Mathematik, Physik und Technik 2 (1959), 2-88.
10. Garrett Birkhoff, A Source Book in Classical Analysis, Harvard University Press, Cambridge, Mass., 1973.
11. Viktor Blåsjö, The evolution of ... the isoperimetric problem, Amer. Math. Monthly 112:6 (2005), 526-566.
12. R. Courant, Dirichlet's Principle, Conformal Mapping, and Minimal Surfaces, Interscience Publishers, Inc., New York, N.Y., 1950.
13. Ulrich Daepp and Pamela Gorkin, Reading, Writing, and Proving: A Closer Look at Mathematics, Undergraduate Texts in Mathematics, Springer-Verlag, New York, 2003.
14. P. G. Le Jeune Dirichlet, Vorlesungen über die im umgekehrten Verhältniss des Quadrats der Entfernung wirkenden Kräfte, herausgegeben von Dr. F. Grube, Leipzig, 1876.
15. P. Ebenfelt, D. Khavinson, and H. Shapiro, Algebraic aspects of the Dirichlet problem with rational data, Operator Theory Advances and Applications 156, Birkhäuser, 2005, 151-172.
16. Clarence R. Edstrom, A Dirichlet problem, this MAGAZINE 45 (1972), 204-205.
17. Lars Gårding, The Dirichlet problem, Math. Intelligencer 2:1 (1979/80), 43-53.
18. K. Gruhl, Studienerinnerungen 1853-1856, Mathematik aus Berlin, Auszüge von Gert Schubring (Berlin), Weidler, Berlin, 1997, pp. 55-57.
19. Sebastian Hensel, Die Familie Mendelssohn, nach Briefen und Tagebüchern, Insel Verlag, Frankfurt am Main und Leipzig, 1995.
20. David Hilbert, Über das Dirichletsche Prinzip, Math. Ann. 59 (1904), 161-186.
21. Dmitry Khavinson, Cauchy's problem for harmonic functions with entire data on a sphere, Canad. Math. Bull. 40:1 (1997), 60-66.
22. Dmitry Khavinson and Harold S. Shapiro, Dirichlet's problem when the data is an entire function, Bull. London Math. Soc. 24:5 (1992), 456-468.
23. Israel Kleiner, Evolution of the function concept: A brief survey, College Math. J. 20 (1989), 282-300.
24. Morris Kline, Mathematical thought from ancient to modern times, vol. 3, Oxford University Press, New York, 1972.
25. H. Koch, J. P. G. Lejeune Dirichlet zu seinem 175. Geburtstag, Mitt. Math. Ges. DDR (1981), no. 2-4, 153-164.
26. H. Koch, P. J. Lejeune Dirichlet, Mathematik in Berlin (Berlin), Birkhäuser Verlag, Berlin, 1998, p. 174.
27. E. E. Kummer, Gedächtnisrede auf Gustav Peter Lejeune Dirichlet, Abh. König. Akad. Wissen. zu Berlin; reprinted in G. Lejeune Dirichlet's Werke, ed. L. Kronecker and L. Fuchs (1981), no. 2, 309-344.
28. S. G. Mikhlin, Variational Methods in Mathematical Physics, trans. T. Boddington; editorial introduction by L. I. G. Chambers; The Macmillan Co., New York, 1964.
29. David Minda, The Dirichlet problem for a disk, Amer. Math. Monthly 97:3 (1990), 220-223.
30. A. F. Monna, Dirichlet's Principle: A Mathematical Comedy of Errors and Its Influence on the Development of Analysis, Oosthoek, Scheltema, and Holkema, Utrecht, 1975.
31. C. Neumann, Untersuchungen über das Logarithmische und Newton'sche Potential, Math. Ann. 13 (1878), 255-300.
32. Bruce P. Palka, An Introduction to Complex Function Theory, Springer-Verlag, New York, 1991.
33. Michael von Renteln, Geschichte der Analysis im 19. Jahrhundert, von Cauchy bis Cantor, Skriptum zur Vorlesung, Universität Karlsruhe, 1989.
34. David E. Rowe, Gauss, Dirichlet, and the law of biquadratic reciprocity, Math. Intelligencer 10:2 (1988), 13-25.
35. Dieter Rüthing, Some definitions of the concept of function from Joh. Bernoulli to N. Bourbaki, Math. Intelligencer 6:4 (1984), 72-77.
36. Gert Schubring, The three parts of the Dirichlet nachlass, Historia Math. 13:1 (1986), 52-56.
37. Gert Schubring, Dirichlet. Comment on: "Gauss, Dirichlet, and the law of biquadratic reciprocity" [Math. Intelligencer 10 (1988), no. 2, 13-25; MR 89e:01031] by D. E. Rowe, Math. Intelligencer 12 (1990), no. 1, 5-6.
38. Harold S. Shapiro, An algebraic theorem of E. Fischer, and the holomorphic Goursat problem, Bull. London Math. Soc. 21:6 (1989), 513-537.
39. Allen Shields, Lejeune Dirichlet and the birth of analytic number theory: 1837-1839, Math. Intelligencer 11:4 (1989), 7-11.
40. Joshua H. Smith, The Dirichlet Problem for Rational Functions, Honors Thesis, Bucknell University, 1999.
41. Rossana Tazzioli, Green's function in some contributions of 19th century mathematicians, Historia Math. 28:3 (2001), 232-252.
42. Ronald Walker, Problems in Harmonic Function Theory, Undergraduate Thesis, University of Richmond, 1998.
43. Karl Weierstrass, Über das sogenannte Dirichlet'sche Princip, gelesen in der Königl. Akademie der Wissenschaften am 14. Juli 1870, Karl Weierstrass, Mathematische Werke (Berlin), vol. 2, Mayer & Müller, Berlin, 1895, pp. 49-54.
44. Dirk Werner, Funktionalanalysis, Springer-Verlag, Berlin, 2000.

NOTES

Heads Up: No Teamwork Required

MARTIN J. ERICKSON
Division of Mathematics and Computer Science
Truman State University
Kirksville, MO 63501
erickson@truman.edu

Many team games require collaborative strategies. In this Note, I present a game that paradoxically requires cooperation, but at the same time forbids communication of any kind. A team skilled in probability can compete as if they could confer (almost).

In the "Hat Problem" [1], each contestant on a team tries to guess the color (blue or red) of a hat on his/her head while seeing only the hats of the other contestants. The guesses are made simultaneously, and "passing" is an option. The team wins if at least one person guesses and no one guesses incorrectly. Before the guessing round, the participants are allowed to discuss strategy. With best play, an $n$-person team (where $n$ is one less than a power of 2) can win with the surprisingly high probability $n/(n + 1)$. (The solution to the Hat Problem provides a novel perspective on Hamming codes, but you don't need to know anything about Hamming codes to understand this note.)

I enjoyed the Hat Problem and was investigating variations of it when I thought of eliminating both the pregame strategy meeting and any information the teammates could gain from each other. The resulting game, recast in the context of coin-flipping, has some curious mathematical features.

Heads Up: A team of $n \geq 2$ people is selected to participate in a game in which the team may win a prize. The players are not allowed to communicate with each other before or during the game. (Assume, for reasons that should become clear in a moment, that the players have never met before and know nothing about each other.) Each player is given a fair coin. At a signal, the players simultaneously open their hands to reveal either the coin or nothing. Those players showing coins then flip their coins. The team wins if at least one coin is flipped and all the flipped coins land heads. What strategy should each player follow in order to maximize the likelihood of the team winning, and what is the probability of winning?

The crucial factor in Heads Up is that the team members cannot communicate with each other, so they cannot form a strategy together. (In the solution to the Hat Problem, the players determine an ordering of themselves so that they can convert the hat color distribution into a vector.) Ideally, the team would choose one member to flip the coin (and the team would win half of the time), but this is impossible because of the ban on communication. Are there, then, only two alternative strategies for each player: either flip the coin or don't flip it? If so, then not flipping the coin isn't good because if everyone does that, the team loses. But if everyone flips the coin, the team wins with very low probability, $(1/2)^n$. With $n = 2$, the probability of winning if both players flip the coin is 1/4. However, with a superior strategy a two-person team can do better than 1/4. In fact, the same is true for any number of players. With best play, an $n$-person team can do better than they could if they were able to choose two members to flip their coins!

Since the problem is about probabilities, we suppose that the players approach their strategies probabilistically. In addition, the players must assume that their teammates will also arrive at this conclusion. Each player should flip the coin with probability $p$, where the value of $p$ is to be determined ($0 \leq p \leq 1$). We obtain the probability of the team winning, $w_n(p)$, via a binomial expansion:

$$w_n(p) = \sum_{k=1}^{n} \left( \frac{1}{2} \right)^k \binom{n}{k} p^k (1 - p)^{n-k} = \left( \frac{p}{2} + 1 - p \right)^n - (1 - p)^n = \left( 1 - \frac{p}{2} \right)^n - (1 - p)^n. \tag{1}$$

Using calculus, we find that $w_n$ is maximized at

$$p = p_n = \frac{2^{n/(n-1)} - 2}{2^{n/(n-1)} - 1}, \tag{2}$$

and the maximum winning probability is

$$w_n(p_n) = \left( 2^{n/(n-1)} - 1 \right)^{1-n}. \tag{3}$$
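Formulas (1), (2), and (3) are easy to cross-check numerically. The sketch below is an illustration with function names of our own choosing: it evaluates the binomial sum directly, compares it with the closed form, and confirms by a coarse grid search that no $p$ on the grid beats $p_n$.

```python
from math import comb

def win_probability_sum(n, p):
    """Sum over k >= 1 flipped coins of C(n, k) p^k (1 - p)^(n - k) (1/2)^k."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) / 2 ** k
               for k in range(1, n + 1))

def win_probability(n, p):
    """Closed form (1): (1 - p/2)^n - (1 - p)^n."""
    return (1 - p / 2) ** n - (1 - p) ** n

def optimal_p(n):
    """Formula (2) for the flipping probability p_n (requires n >= 2)."""
    t = 2 ** (n / (n - 1))
    return (t - 2) / (t - 1)

def max_win_probability(n):
    """Formula (3) for the maximum winning probability w_n(p_n)."""
    return (2 ** (n / (n - 1)) - 1) ** (1 - n)

if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        p = optimal_p(n)
        grid_best = max(win_probability(n, j / 1000) for j in range(1001))
        print(n, p, win_probability(n, p), max_win_probability(n), grid_best)
```

For every $n$ tried, the three computed winning probabilities should agree up to rounding, and the grid maximum should not exceed the value at $p_n$.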

For example, with two players, $p_2 = 2/3$ and $w_2(p_2) = 1/3$. The coin can be used to generate an event with the required probability: express $p_n$ in base 2 and flip the coin until the number generated (heads = 1, tails = 0) deviates from $p_n$. (This concept was the subject of a Putnam Competition problem [2, p. 137].) The case $n = 2$ can be handled in a particularly simple way: flip the coin twice; if it lands tails both times, do over; the probability of one heads and one tails is then 2/3.

Let's analyze our solution to Heads Up. Although we might conjecture that the maximum probability of winning in (3) tends to 0 as $n$ tends to infinity, a calculation with l'Hôpital's rule reveals that

$$\lim_{n \to \infty} w_n(p_n) = \frac{1}{4}. \tag{4}$$
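The base-2 procedure for turning a fair coin into an event of probability $p_n$ can be sketched as follows. This is our own illustrative reading of the method, with a hypothetical function name: the flips spell out a random binary fraction one digit at a time, and we stop at the first digit where it differs from the binary expansion of $p$, deciding "flip" exactly when the generated number falls below $p$.

```python
import random

def flip_with_probability(p, coin=None):
    """Return True with probability p (0 <= p <= 1) using only fair coin flips.

    `coin` is any zero-argument callable returning 1 (heads) or 0 (tails);
    by default a pseudo-random fair bit is used.
    """
    if coin is None:
        coin = lambda: random.getrandbits(1)
    while True:
        p *= 2
        digit = 1 if p >= 1 else 0   # next binary digit of p
        p -= digit
        bit = coin()                 # next binary digit of the generated number
        if bit != digit:
            return bit < digit       # the generated number is below p

if __name__ == "__main__":
    trials = 100_000
    hits = sum(flip_with_probability(2 / 3) for _ in range(trials))
    print(hits / trials)  # should be close to 2/3
```

On average the loop stops after two flips, since each flip disagrees with the corresponding digit of $p$ with probability 1/2.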

Actually, we might have guessed this result! Observing that in the formula in (1) the second quantity is almost the square of the first quantity, we could reason that the limit has the form $x - x^2$, and everyone who can complete a square knows that the maximum value of $x - x^2$ is 1/4.

It seems reasonable that the maximum probability of winning in (3) is a decreasing function of $n$. Numerical calculations indicate that this is true, and our intuition about the problem reinforces it (the chances of the team winning would appear to decrease as $n$ increases). Let us define a real-variable version of the function:

$$f(x) = \left( 2^{x/(x-1)} - 1 \right)^{1-x}, \qquad x > 1.$$

It is a tricky problem to prove that $f(x)$ is decreasing. Here is a calculus proof found by my undergraduate student Khang Tran. Take a logarithm:

$$g(x) = \ln f(x) = (1 - x) \ln\left( 2^{x/(x-1)} - 1 \right).$$

Now

$$g'(x) = -\ln\left( 2^{x/(x-1)} - 1 \right) + \frac{2^{x/(x-1)} \ln 2}{(x - 1)\left( 2^{x/(x-1)} - 1 \right)},$$

and we want to show that $g'(x)$ is negative for $x > 1$. Make a change of variables, $y = 1/(x - 1)$:

$$h(y) = -\ln\left( 2^{y+1} - 1 \right) + \frac{2^{y+1}\, y \ln 2}{2^{y+1} - 1} = -\ln\left( 2^{y+1} - 1 \right) + y \ln 2 + \frac{y \ln 2}{2^{y+1} - 1}.$$

It suffices to show that $h(y)$ is negative for $y > 0$. Observe that $h(0) = 0$ and

$$h'(y) = -\frac{2^{y+1} \ln 2}{2^{y+1} - 1} + \ln 2 + \frac{\ln 2}{2^{y+1} - 1} - \frac{2^{y+1}\, y (\ln 2)^2}{\left( 2^{y+1} - 1 \right)^2}.$$

The first three terms sum to zero and the last term is negative. Since $h(0) = 0$ and $h'(y) < 0$ for $y > 0$, it follows that $h(y) < 0$ for $y > 0$.

Surprisingly, we can give a short proof that $w_n(p_n)$ is decreasing via the solution to Heads Up. Here it is:

$$w_n(p_n) > w_n(p_{n+1}) = w_{n+1}(p_{n+1}). \tag{5}$$

The inequality in (5) holds by virtue of the definition of $p_n$. The equality, which can be proved directly by combining (1) and (2), is an algebraic curiosity; it says that, for each $n$, the maximum value of $w_{n+1}$ occurs on the graph of $w_n$. Put another way, this peculiar identity says that the success probabilities for $n$ and $n + 1$ are equal for the optimum probability $p_{n+1}$. This is like one team member "dropping out" (but, of course, no one member can drop out).

We see from (2) that the optimum probability $p_n$ decreases to 0 as $n$ tends to infinity. What can we say about $n p_n$, the expected number of coins flipped? From (2), using l'Hôpital's rule, we determine the limiting value:

$$\lim_{n \to \infty} n p_n = \ln 4. \tag{6}$$
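The claims in this passage lend themselves to a quick numerical check. The sketch below is an illustration with hypothetical helper names: using formulas (1) and (2), it tests that the winning probabilities for $n$ and $n + 1$ players coincide at $p_{n+1}$, that $w_n(p_n)$ decreases with $n$, and that $n p_n$ is close to $\ln 4$ for large $n$.

```python
import math

def win_probability(n, p):
    """Closed form (1): (1 - p/2)^n - (1 - p)^n."""
    return (1 - p / 2) ** n - (1 - p) ** n

def optimal_p(n):
    """Formula (2) for p_n (requires n >= 2)."""
    t = 2 ** (n / (n - 1))
    return (t - 2) / (t - 1)

def identity_gap(n):
    """|w_n(p_{n+1}) - w_{n+1}(p_{n+1})|, which should vanish up to rounding."""
    p_next = optimal_p(n + 1)
    return abs(win_probability(n, p_next) - win_probability(n + 1, p_next))

def expected_flips(n):
    """Expected number of coins flipped, n * p_n."""
    return n * optimal_p(n)

if __name__ == "__main__":
    for n in (2, 3, 10, 100, 1000):
        print(n, win_probability(n, optimal_p(n)), identity_gap(n),
              expected_flips(n), math.log(4))
```

The printed winning probabilities should fall toward 1/4 while the expected number of flips creeps up toward $\ln 4 \approx 1.386$.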

(We could have guessed this result by invoking the fact that $\lim_{n \to \infty} (1 - c/2n)^n = e^{-c/2}$ and recognizing that $x - x^2$ is maximized when $x = 1/2$.) Furthermore, the expected number of coins flipped increases with $n$; that is,

$$n p_n < (n + 1) p_{n+1}, \tag{7}$$

as we shall prove. Does intuition support this result? A calculus proof, with $n$ replaced by a real variable $x$, is possible but tedious (make the change of variables $y = 2^{x/(x-1)}$ and differentiate). We prefer to stay with the discrete variable. The inequality (7) is equivalent to

$$n \cdot \frac{2^{n/(n-1)} - 2}{2^{n/(n-1)} - 1} < (n + 1) \cdot \frac{2^{(n+1)/n} - 2}{2^{(n+1)/n} - 1}.$$

[...] 3, is the winning probability merely $w_n(p_n)$, or is there a way for the players to use the knowledge of each other's hat colors?

Acknowledgment. I am grateful to Suren Fernando, Upendra Kulkarni, Anthony Vazzana, and the referees for their suggestions concerning this note.

REFERENCES

1. M. Bernstein, The hat problem and Hamming codes, Focus 21:8 (2001), 4-6.
2. L. Larson, 50th annual William Lowell Putnam Mathematical Competition, winners and solutions, this MAGAZINE 63:2 (1990), 136-139.

The Humble Sum of Remainders Function

MICHAEL Z. SPIVEY
University of Puget Sound
Tacoma, WA 98416
mspivey@ups.edu

The sum of divisors function is one of the fundamental functions in elementary number theory. In this note, we shine a little light on one of its lesser-known relatives, the sum of remainders function. We do this by illustrating how straightforward variations of the sum of remainders can 1) provide an alternative characterization for perfect numbers, and 2) help provide a formula for sums of powers of the first $n$ positive integers. Finally, we give a brief discussion of perhaps why the sum of remainders function, despite its usefulness, is less well known than the sum of divisors function.

Some notation is in order. The standard notation [3] for the sum of divisors function is $\sigma(n)$:

$$\sigma(n) = \sum_{d \mid n} d.$$

We denote the sum of remainders function by $\rho(n)$, namely,

$$\rho(n) = \sum_{d=1}^{n} (n \bmod d).$$


Sums of remainders and perfect numbers

A perfect number is a number equal to the sum of its positive divisors, excluding itself. Another way to express this is to say that perfect numbers are those for which $\sigma(n) = 2n$. The three smallest perfect numbers are 6, 28, and 496. Euclid proved that every number of the form $2^{p-1}(2^p - 1)$, where $p$ and $2^p - 1$ are prime, is perfect. Euler proved the converse, that numbers of this form are the only even perfect numbers [4, p. 59]. One of the famous unsolved problems in number theory is whether or not there are any odd perfect numbers.

To see the connection between the sum of remainders and perfect numbers, let's take a look at what $\rho(n)$ is actually adding up. A table of remainders for $n = 15$ is as follows:

divisor: d    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
15 mod d      0  1  0  3  0  3  1  7  6  5  4  3  2  1  0

Two patterns are fairly immediate: There are zeroes in the entries for which $d$ divides $n$, and once past half of 15, the remainders count down from 7 (just under half of 15) to 1. Other than that, there does not appear to be much of a pattern. However, if we line up the remainders for several consecutive integers we start to see a pattern:

divisor: d   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 1 mod d     0
 2 mod d     0  0
 3 mod d     0  1  0
 4 mod d     0  0  1  0
 5 mod d     0  1  2  1  0
 6 mod d     0  0  0  2  1  0
 7 mod d     0  1  1  3  2  1  0
 8 mod d     0  0  2  0  3  2  1  0
 9 mod d     0  1  0  1  4  3  2  1  0
10 mod d     0  0  1  2  0  4  3  2  1  0
11 mod d     0  1  2  3  1  5  4  3  2  1  0
12 mod d     0  0  0  0  2  0  5  4  3  2  1  0
13 mod d     0  1  1  1  3  1  6  5  4  3  2  1  0
14 mod d     0  0  2  2  4  2  0  6  5  4  3  2  1  0
15 mod d     0  1  0  3  0  3  1  7  6  5  4  3  2  1  0
16 mod d     0  0  1  0  1  4  2  0  7  6  5  4  3  2  1
17 mod d     0  1  2  1  2  5  3  1  8  7  6  5  4  3  2
18 mod d     0  0  0  2  3  0  4  2  0  8  7  6  5  4  3

For fixed $d$ and increasing $n$, $n \bmod d$ cycles from 0 to $d - 1$, hitting 0 at all the integers that $d$ divides. This makes sense, of course, when you think about it. In particular, if we compare $n \bmod d$ and $(n - 1) \bmod d$, we see that the difference between them is either 1, if $d$ does not divide $n$, or $1 - d$, if $d$ does divide $n$. In symbols, this becomes

$$(n \bmod d) - ((n - 1) \bmod d) = \begin{cases} 1, & \text{if } d \nmid n, \\ 1 - d, & \text{if } d \mid n. \end{cases} \qquad (1)$$

Now, define the backward difference operator $\nabla$ by $\nabla f(n) = f(n) - f(n - 1)$. Then the connection between the sum of remainders and perfect numbers is the following:


THEOREM 1. $n$ is perfect if and only if $\nabla\rho(n) = -1$.

Proof. Summing up the left-hand side of the equation in (1) over $d$ from 1 to $n - 1$ yields $\rho(n) - \rho(n - 1)$ (the $d = n$ term of $\rho(n)$ is $n \bmod n = 0$). The sum of the right-hand side of the equation in (1) over $d$ from 1 to $n - 1$ can be split into the sum over those $d$ that do not divide $n$ and the sum over those that do. Setting the two expressions equal to each other yields

$$\rho(n) - \rho(n - 1) = \sum_{d \nmid n} 1 + \sum_{d \mid n,\, d \neq n} (1 - d) = \sum_{d=1}^{n-1} 1 - \sum_{d \mid n,\, d \neq n} d = n - 1 - \sum_{d \mid n} d + n = 2n - 1 - \sigma(n).$$

Since a number is perfect if and only if $\sigma(n) = 2n$, the theorem follows. ∎
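The identity at the heart of the proof, $\rho(n) - \rho(n - 1) = 2n - 1 - \sigma(n)$, is easy to test by brute force. The following Python sketch is only an illustration: the helper names (`nabla_rho`, `perfect_numbers_up_to`) are hypothetical, and the search is quadratic in the limit, so it suits small ranges only.

```python
def sigma(n: int) -> int:
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def rho(n: int) -> int:
    """Sum of the remainders n mod d for d = 1, ..., n (rho(0) is the empty sum)."""
    return sum(n % d for d in range(1, n + 1))


def nabla_rho(n: int) -> int:
    """Backward difference rho(n) - rho(n - 1)."""
    return rho(n) - rho(n - 1)


def perfect_numbers_up_to(limit: int) -> list[int]:
    """Numbers n <= limit with nabla_rho(n) == -1, the criterion of Theorem 1."""
    return [n for n in range(1, limit + 1) if nabla_rho(n) == -1]


if __name__ == "__main__":
    print(perfect_numbers_up_to(500))
```

Run on the range up to 500, the criterion picks out exactly 6, 28, and 496, the three perfect numbers quoted earlier.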

Theorem 1 is a surprisingly simple characterization of perfect numbers in terms of the sum of remainders. The theorem is proved by Cross [1] and by Lucas [8, p. 374]. (Cross's reference to p. 388 in Lucas is in error.)

The proof of Theorem 1 also implies simple characterizations of numbers that are close to being perfect. Define an almost perfect number to be a number $n$ such that $\sigma(n) = 2n - 1$. Powers of 2 are almost perfect, but these are the only numbers known to be almost perfect. Define a quasi-perfect number to be a number $n$ such that $\sigma(n) = 2n + 1$. It is not known if there are any quasi-perfect numbers, but it is known that they must be odd squares [5]. Slight modifications of the proof of Theorem 1 then yield

COROLLARY 1. We have the following:
• $n$ is almost perfect if and only if $\nabla\rho(n) = 0$.
• $n$ is quasi-perfect if and only if $\nabla\rho(n) = -2$.
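Theorem 1 and Corollary 1 together sort integers by the value of $\nabla\rho(n)$. A small Python sketch of that classification follows; the function name `classify_by_nabla_rho` and the dictionary layout are our own assumptions, chosen only for illustration.

```python
def rho(n: int) -> int:
    """Sum of the remainders n mod d for d = 1, ..., n (rho(0) is the empty sum)."""
    return sum(n % d for d in range(1, n + 1))


def classify_by_nabla_rho(limit: int) -> dict[str, list[int]]:
    """Group 1..limit by nabla rho(n): -1 perfect, 0 almost perfect, -2 quasi-perfect."""
    names = {-1: "perfect", 0: "almost perfect", -2: "quasi-perfect"}
    found: dict[str, list[int]] = {name: [] for name in names.values()}
    previous = rho(0)
    for n in range(1, limit + 1):
        current = rho(n)
        name = names.get(current - previous)  # None when n is in no class
        if name is not None:
            found[name].append(n)
        previous = current
    return found


if __name__ == "__main__":
    for name, values in classify_by_nabla_rho(300).items():
        print(name, values)
```

Up to 300 the almost perfect list consists of the powers of 2 (including $1 = 2^0$), and the quasi-perfect list is empty, in line with the remarks above.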

This can go on, of course, for other numbers that are close to being perfect.

There is another interesting corollary to Theorem 1. Calculus, as any freshman math major knows, involves derivatives and integrals of functions defined over real numbers. Calculus can also be done with discrete functions, functions defined over, say, the integers, in which case it is known as the calculus of finite differences. The discrete analog of the derivative is the finite difference: $\Delta f(n) = f(n + 1) - f(n)$. And the discrete analog of the antiderivative is the antidifference: $\Delta^{-1} f(n)$ is any function $F$ such that $\Delta F = f$. With finite calculus in mind, we see that the proof of Theorem 1 says that the finite difference of $\rho$ is $\Delta\rho(n) = 2n + 1 - \sigma(n + 1)$, or, alternatively, that an antidifference of $\sigma$ is $\Delta^{-1}\sigma(n) = (n - 1)^2 - \rho(n - 1)$. Thus the $\rho$ and $\sigma$ functions are related to each other in a simple way via finite calculus. We are not aware of any other basic number-theoretic functions for which this is the case.

Sums of remainders and power sums

For various integers $k$, the sums of $k$th powers of the first $n$ positive integers, $1^k + 2^k + \cdots + n^k$, are among the most popular sums in all of mathematics. Jakob Bernoulli developed a whole new class of numbers, later named in his honor, to represent these sums. In addition to formulas using Bernoulli numbers, there are formulas involving Eulerian numbers and binomial coefficients, as well as Stirling numbers and binomial coefficients [10]. Many other papers have appeared on the subject [7, 9].

The promised formula for the power sum using remainders involves variations of both $\rho$ and $\sigma$. Recall that the value of $\sigma(n)$ is determined by summing the divisors of $n$. Besides summing the divisors, however, we can also sum up powers of the divisors; that is, squares, cubes, or even any real powers. The standard notation is

$$\sigma_k(n) = \sum_{d \mid n} d^k.$$

Results concerning this function are scattered throughout Chapter X of Dickson [3]; other problems, some solved and some unsolved, appear in books by Dudley [4, p. 56] and Guy [5, pp. 102-105]. We could also sum powers of the remainders. However, for our purposes we need remainders that are weighted by powers of integers; namely, multiply $n \bmod d$ by $d^k$ and then add up. This leads to the following definition:

$$\rho(n, k) = \sum_{d=1}^{n} d^k (n \bmod d).$$

Then the formula for the power sum in terms of $\sigma_k(n)$ and $\rho(n, k)$ is

THEOREM 2. For any real number $k$,

$$1^k + 2^k + \cdots + n^k = \frac{1}{n}\left(\rho(n, k) + \sum_{i=1}^{n} \sigma_{k+1}(i)\right).$$
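The statement of Theorem 2 can be checked with exact rational arithmetic. The sketch below is a brute-force illustration under our own assumptions: it restricts $k$ to integers (the theorem allows any real $k$) so that Python's `fractions.Fraction` keeps every term exact, and the function names are hypothetical.

```python
from fractions import Fraction


def sigma_k(n: int, k: int) -> Fraction:
    """Sum of the k-th powers of the positive divisors of n."""
    return sum((Fraction(d) ** k for d in range(1, n + 1) if n % d == 0), Fraction(0))


def rho_k(n: int, k: int) -> Fraction:
    """Weighted sum of remainders: sum of d**k * (n mod d) for d = 1, ..., n."""
    return sum((Fraction(d) ** k * (n % d) for d in range(1, n + 1)), Fraction(0))


def power_sum_direct(n: int, k: int) -> Fraction:
    """1**k + 2**k + ... + n**k, computed term by term."""
    return sum((Fraction(i) ** k for i in range(1, n + 1)), Fraction(0))


def power_sum_via_remainders(n: int, k: int) -> Fraction:
    """Right-hand side of Theorem 2."""
    total = rho_k(n, k) + sum((sigma_k(i, k + 1) for i in range(1, n + 1)), Fraction(0))
    return total / n


if __name__ == "__main__":
    for k in (-1, 0, 1, 2):
        print(k, power_sum_direct(10, k), power_sum_via_remainders(10, k))
```

Negative exponents such as $k = -1$ are included on purpose, since the theorem is not limited to nonnegative integers.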

A nice property of Theorem 2 is that $k$ can be any real number; most other formulas for the power sum restrict $k$ to the nonnegative integers. Two special cases of Theorem 2 are worth mentioning. If $k = 0$, we have the nice formula

$$n^2 = \rho(n) + \sum_{i=1}^{n} \sigma(i),$$

which appears in Lucas [8, p. 388]. This formula can also be derived easily from the proof of Theorem 1. In addition, the case $k = -1$ can be rewritten as

$$\sum_{i=1}^{n} \lfloor n/i \rfloor = \sum_{i=1}^{n} d(i),$$

where $d(i)$ gives the number of divisors of the integer $i$. This result is discussed in the proof of Theorem 320 in Hardy and Wright [6].

Before we prove Theorem 2, though, we need to define the indicator function $I_{\{p\}}$ for a statement $p$. This is given by the following:

$$I_{\{p\}} = \begin{cases} 1, & \text{if } p \text{ is true,} \\ 0, & \text{if } p \text{ is false.} \end{cases}$$

Let us proceed to prove Theorem 2.

Proof. The division algorithm tells us that, for any positive integers $n$ and $d$, $n = q_d d + r_d$, where $q_d$ is the quotient and $r_d$ is the remainder when $n$ is divided by $d$. Multiplying both sides of this equation by $d^k$ and summing over $d$ from 1 to $n$ yields the following:

$$n \sum_{d=1}^{n} d^k = \sum_{d=1}^{n} \left(d^{k+1} q_d + d^k r_d\right) = \sum_{d=1}^{n} d^{k+1} q_d + \rho(n, k). \qquad (2)$$

At this point we can start to see the formula becoming clear. All that remains is to show that $\sum_{d=1}^{n} d^{k+1} q_d = \sum_{i=1}^{n} \sigma_{k+1}(i)$. To see this, we need the fact that $q_d$ counts the number of integers between 1 and $n$ that $d$ divides evenly. For example, 37 divided by 7 leaves a quotient of 5; this is because 7 divides exactly five integers between 1 and 37: 7, 14, 21, 28, and 35. Therefore,

$$\sum_{d=1}^{n} d^{k+1} q_d = \sum_{d=1}^{n} d^{k+1} \sum_{i=1}^{n} I_{\{d \mid i\}} = \sum_{i=1}^{n} \sum_{d=1}^{n} d^{k+1} I_{\{d \mid i\}} = \sum_{i=1}^{n} \sigma_{k+1}(i).$$
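The interchange of summation used here rests on the counting fact that the quotient of $n$ by $d$ is the number of multiples of $d$ in $1, \ldots, n$. A short Python sketch, with hypothetical function names and nonnegative integer $k$ assumed for simplicity, compares the two sides directly.

```python
def multiples_up_to(n: int, d: int) -> list[int]:
    """Integers between 1 and n that d divides evenly."""
    return [i for i in range(1, n + 1) if i % d == 0]


def weighted_quotient_sum(n: int, k: int) -> int:
    """Left-hand side: sum of d**(k+1) * q_d, where q_d is the quotient n // d."""
    return sum(d ** (k + 1) * (n // d) for d in range(1, n + 1))


def summed_divisor_powers(n: int, k: int) -> int:
    """Right-hand side: sum over i <= n of sigma_{k+1}(i)."""
    return sum(
        d ** (k + 1)
        for i in range(1, n + 1)
        for d in range(1, i + 1)
        if i % d == 0
    )


if __name__ == "__main__":
    print(multiples_up_to(37, 7))
    print(weighted_quotient_sum(37, 1), summed_divisor_powers(37, 1))
```

The first line printed lists the five multiples of 7 from the example in the text.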

Substituting this expression into (2) and dividing both sides of the equation by $n$ yields the formula. ∎

Some limitations of the sum of remainders function

With so many uses of the sum of remainders function, why is it so little known relative to the sum of divisors function? While some of the answer may be historical, much of it also lies in the fact that $\rho$ does not have some of the nice properties that $\sigma$ has, properties that $\sigma$ shares with other well-known arithmetical functions.

For example, $\rho$ is not multiplicative, whereas $\sigma$ is. Multiplicative arithmetical functions $f$ are those that are not identically zero and have the property that $f(mn) = f(m) f(n)$ whenever $m$ and $n$ are relatively prime. This property means that the values of multiplicative functions are completely determined by their behavior on powers of primes. It is easy to see that $\rho$ is not multiplicative; for example, $\rho(2) = 0$, so multiplicativity would force $\rho(2m) = \rho(2)\rho(m) = 0$ for every odd $m$, but no other even number $n$ has $\rho(n) = 0$. Dudley [4, p. 54] gives a proof that $\sigma$ is multiplicative.

Another nice property that $\sigma$ has but $\rho$ does not is that $\sigma$ is a unit in the standard ring of arithmetical functions. (Recall that the units in a ring are the elements that have multiplicative inverses.) In a recent article in this MAGAZINE, Delaney [2] characterizes the units in this ring. He also gives a formula for $\sigma$ as the Dirichlet product of two simple functions, and one can use his subsequent discussion to construct the inverse of $\sigma$. Again, it is easy to see that $\rho$ has no inverse. The multiplicative identity in the ring is the indicator function $1(n) = I_{\{n = 1\}}$, and since $\rho(1) = 0$, $\rho$ can have no inverse. (Further details on the ring of arithmetical functions can be found in Delaney's paper [2].)

Final remarks

We have seen that the sum of remainders function $\rho$ can provide a simple alternative characterization for perfect numbers, and, in conjunction with a variation of the sum of divisors function, give a new formula for sums of powers. We have also seen that the sum of remainders function does not have two nice properties that many more well-known arithmetical functions do have, namely, $\rho$ is not multiplicative, and $\rho$ is not a unit in the standard ring of arithmetical functions.

In the absence of much literature on the subject, interested readers may enjoy framing their own questions concerning the sum of remainders function.

REFERENCES
1. James T. Cross, A note on almost perfect numbers, this MAGAZINE 47 (1974), 230-231.
2. James E. Delaney, Groups of arithmetical functions, this MAGAZINE 78 (2005), 83-95.
3. Leonard E. Dickson, History of the Theory of Numbers, Vol. I, Chelsea, New York, 1952.
4. Underwood Dudley, Elementary Number Theory, W. H. Freeman, New York, 2nd ed., 1978.
5. Richard K. Guy, Unsolved Problems in Number Theory, Springer, New York, 3rd ed., 2004.
6. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Clarendon Press, Oxford, 5th ed., 1979.
7. Donald E. Knuth, Johann Faulhaber and sums of powers, Mathematics of Computation 61 (1993), 277-294.
8. Edouard Lucas, Théorie des Nombres, Gauthier-Villars, Paris, 1891.
9. Robert W. Owens, Sums of powers of integers, this MAGAZINE 65 (1992), 38-40.
10. Kenneth H. Rosen, ed., Handbook of Discrete and Combinatorial Mathematics, CRC Press, Boca Raton, FL, 2000.

On Tiling the n-Dimensional Cube
WILLIAM STATON
BENTON TYLER
University of Mississippi
University, MS 38677
mmstaton@olemiss.edu

Take a square and cut it into smaller pieces, each of which is a square. Now count the number of pieces. It is easy to show that the answer is not 2, 3, or 5, but it might be anything else. Asking the question a different way: for what values of $k$ can one tile a square with $k$ squares? Here tiling means covering with pieces that overlap only on edges. FIGURE 1 shows tilings of a square with $k = 4$, 6, 7, and 8.

Figure 1 Some tilings of a square

It is an easy observation that if a tiling with $k$ squares exists, then a tiling with $k + 3$ squares also exists. For in a tiling, a single square may be replaced by the first tiling in FIGURE 1. So, the existence of tilings for $k = 6$, 7, and 8, along with the easy passage from $k$ to $k + 3$, provides an inductive proof of the existence of tilings for all $k \geq 6$. We leave as an exercise the impossibility of $k = 2$, 3, and 5. Hint: count corners.

Now we ask similar questions in higher dimensions. Can an $n$-dimensional cube ($n$-cube for short) be tiled with $k$ smaller $n$-cubes?
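The inductive step for the square can be mimicked in a few lines of Python. This is only a bookkeeping sketch under our own assumptions: the function name `reachable_counts` is hypothetical, the starting counts 1, 4, 6, 7, and 8 are the trivial tiling plus those of FIGURE 1, and the code records which counts the passage from $k$ to $k + 3$ produces. It does not prove that the remaining counts are impossible.

```python
def reachable_counts(base: set[int], step: int, limit: int) -> set[int]:
    """All counts <= limit obtained from the base counts by repeatedly adding `step`."""
    reachable = set()
    for k in base:
        while k <= limit:
            reachable.add(k)
            k += step
    return reachable


if __name__ == "__main__":
    counts = reachable_counts({1, 4, 6, 7, 8}, step=3, limit=30)
    print(sorted(counts))
    print([k for k in range(1, 31) if k not in counts])
```

The counts missed up to the limit are exactly 2, 3, and 5, the three cases left as an exercise above.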

FACT 1. The standard tiling of a unit $n$-dimensional cube with $r^n$ cubes each of edge length $1/r$ shows that the answer is yes in dimension $n$ for $k = 2^n, 3^n, 4^n, \ldots$.

Here is another key to solving the higher dimensional problem: Note that in a given $k$-tiling, a single tile may be replaced by a cube tiled with $l$ tiles. This observation gives us Fact 2.

FACT 2. If the $n$-cube can be tiled with $k$ cubes and can be tiled with $l$ cubes, then it can be tiled with $k + l - 1$ cubes.

Now let $k = 1$ and $l = (2^n - 1)^n$ and apply Fact 2 $j$ times to learn that the $n$-cube may be tiled with $1 + j[(2^n - 1)^n - 1] = j(2^n - 1)^n - j + 1$ cubes. As $j$ varies from 0 to $2^n - 2$, the resulting numbers of tiles hit every congruence class modulo $2^n - 1$. For example, in dimension three, the situation is as in TABLE 1.

Now, by Fact 2 (with $l = 2^n$), if a tiling exists with $k$ tiles, then one exists for $k + 2^n - 1$ tiles, so if a tiling exists with $k_1$ tiles then a tiling exists with any larger $k_2 \equiv k_1 \pmod{2^n - 1}$. We conclude that tilings exist in dimension $n$ for all $k$ greater than or equal to the largest of these numbers (the one with $j = 2^n - 2$),

$$(2^n - 2)\left[(2^n - 1)^n - 1\right] + 1.$$

TABLE 1: Some possible tilings of the 3-cube

j    j(2^3 - 1)^3 - j + 1 = 343j - j + 1    mod 7
0                      1                      1
1                    343                      0
2                    685                      6
3                   1027                      5
4                   1369                      4
5                   1711                      3
6                   2053                      2
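The rows of TABLE 1 follow mechanically from the formula $j(2^n - 1)^n - j + 1$, so they can be regenerated for any dimension. The Python sketch below is an illustration only; the function name `fact2_family` and the tuple layout are our own assumptions.

```python
def fact2_family(n: int) -> list[tuple[int, int, int]]:
    """Rows (j, tile count, tile count mod 2**n - 1) for j = 0, ..., 2**n - 2."""
    modulus = 2 ** n - 1
    big = modulus ** n  # the standard tiling with (2**n - 1)**n cubes
    rows = []
    for j in range(modulus):
        count = j * big - j + 1
        rows.append((j, count, count % modulus))
    return rows


if __name__ == "__main__":
    for row in fact2_family(3):
        print(row)
```

For $n = 3$ this reproduces the seven rows of the table, and for other small $n$ the residues again run through every class modulo $2^n - 1$.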

Now it is perhaps interesting to ask for the smallest number $f(n)$ so that an $n$-dimensional cube can be tiled with $k$ $n$-dimensional cubes for all $k \geq f(n)$. We have shown $f(2) = 6$. Examine TABLE 1. The values of $k$ in the interval [2047, 2053] are covered in the 7 congruence classes, but $k = 2046$ is not, since 2053 is the smallest number covered that is congruent to 2 mod 7. These 7 numbers, along with the $k$ to $k + 7$ induction, show that in fact $f(3) \leq 2047$. Similarly, in $n$ dimensions, the threshold is actually no larger than

$$(2^n - 2)\left[(2^n - 1)^n - 2\right] + 1.$$

Our general upper bound for $f(n)$ is probably very bad. We will show that in fact $f(3) \leq 48$. We must cover every congruence class modulo $7 = 2^3 - 1$. We accomplish this with the following values of $k$:

TABLE 2: Better ways to tile the 3-cube

k mod 7    0   1   2   3   4   5   6
k         49   1  51  38  39  54  20
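The bound $f(3) \leq 48$ follows from TABLE 2 by the same residue-class argument used for TABLE 1. The Python sketch below spells out that argument under stated assumptions: it takes the tile counts in the table as given (the constructions themselves are not checked), it uses only the passage from $k$ to $k + 7$, and the function names are hypothetical.

```python
def is_covered(k: int, smallest_by_class: dict[int, int], modulus: int) -> bool:
    """True if k is at least the listed tile count in its own residue class."""
    return k >= smallest_by_class[k % modulus]


def threshold(smallest_by_class: dict[int, int], modulus: int) -> int:
    """Smallest t such that every k >= t is covered."""
    # Sanity checks: one entry per residue class, each in the right class.
    assert set(smallest_by_class) == set(range(modulus))
    assert all(k % modulus == r for r, k in smallest_by_class.items())
    # The largest entry minus one full period is the last uncovered count.
    return max(smallest_by_class.values()) - (modulus - 1)


TABLE_2 = {0: 49, 1: 1, 2: 51, 3: 38, 4: 39, 5: 54, 6: 20}

if __name__ == "__main__":
    print(threshold(TABLE_2, 7))
    print(is_covered(47, TABLE_2, 7))
```

With the values of TABLE 2 the threshold is 48, and 47 is the last count not covered, because the smallest listed count congruent to 5 mod 7 is 54.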

Beginning with the simplest, we note that tiling with 1 cube is trivial. Tiling with 20 cubes can be done by taking a standard tiling of 27 congruent cubes and replacing the 8 cubes of a 2 × 2 × 2 subcube with a single cube, as in FIGURE 2.

Figure 2 Using 20 cubes

Now we invoke Fact 2 with $k = l = 20$ to obtain a tiling with 39 cubes. This is shown in FIGURE 3. To construct a tiling with 38 cubes, begin with a standard tiling of 64 cubes and replace the 27 cubes of a 3 × 3 × 3 subcube with a single cube.

Figure 3 Tiling with 39 cubes

To construct a tiling with 49 cubes, imagine a unit cube sliced into three horizontal slabs of thickness 1/2, 1/3, and 1/6. Tile the first slab with 4 cubes of side length 1/2, the second with 9 cubes of side length 1/3, and the third slab with 36 cubes of side length 1/6. This is shown in FIGURE 4.

Figure 4 A 49er

Now tiling with 51 cubes is a bit trickier. Begin with a tiling of 20 cubes as above, recalling that 19 of the 20 are the same size. Look at the 6 faces of the cube and note that three are like the right-hand image in FIGURE 5 and three are like the left-hand one.

Figure 5 Faces of the 20 cube tiling

Pick two small cubes of the 20 cube tiling that share a face. Replace each of the two with a cube tiled with 20 in such a way that the shared face of each of the two looks like the right-hand picture in FIGURE 5. Now observe that 8 of the smallest cubes form a 2 × 2 × 2 subcube, which can be replaced with a single cube. The result is a tiling with $20 + (20 - 1) + (20 - 1) - 8 + 1 = 51$ cubes.

The construction of a tiling with 54 cubes is similar. Recall our construction of a tiling with 38 cubes. Begin with a standard tiling using 8 cubes. Replace 2 cubes that share a common face with 38 cubes apiece so that the shared faces of the two are as in FIGURE 6. There are now four subcubes each consisting of 8 cubes. Each of these can be replaced by a single cube. This yields a tiling with $8 + (38 - 1) + (38 - 1) - 4(8 - 1) = 54$ cubes.

Figure 6 Tiling with 54 subcubes

We note that our constructions yield no tiling with 47 cubes, but that along with Fact 2, they yield tilings for all larger values of $k$.

Many interesting questions remain. What is the exact value of $f(n)$? Even $f(3)$ may be difficult. For $n = 2$ we know all values of $k$ for which no tiling exists. For larger $n$ one can easily see by counting corners that no tiling exists for $2 \leq k \leq 2^n - 1$. We conjecture that for $n \geq 3$ no tiling exists in dimension $n$ for $2^n + 1 \leq k \leq 2^{n+1} - 2$. For $n = 3$, this conjecture can be proved with a lengthy examination of cases.

Acknowledgment. The referee has drawn our attention to an article by Hudelson [1], where most of our results have been anticipated, indeed with better bounds in four dimensions and higher.

REFERENCE
1. Matthew Hudelson, Dissecting d-cubes into smaller d-cubes, Journal of Combinatorial Theory, Series A 81 (1998), 190-200.

The Magic Mirror Property of the Cube Corner
JUAN A. ACEBRÓN
Departamento de Automática
Escuela Politécnica
Universidad de Alcalá de Henares
28871 Alcalá de Henares, Madrid, Spain
juan.acebron@uah.es

RENATO SPIGLER
Università di "Roma Tre"
00146 Roma, Italy
spigler@mat.uniroma3.it

We show that a ray of light, successively reflected from three mutually perpendicular planes in ordinary three-space, comes back in the same direction it came from. This fact has applications in laser resonators, space measurements, and communications, but also in such simple devices as the retro-reflectors on bicycles and cars.

The so-called "cube corner," or also "corner cube," consists of three reflecting surfaces that are portions of concurrent faces of a cube. It can be realized in silica or other vitreous material, possibly coated with a mirror surface. If a ray hits any of the three surfaces, assuming that it does not hit an edge (in particular, the corner point), it turns out that, after three reflections (in general), it is reflected back in exactly the same direction as the original ray.

This result is well known within the community of optical physicists, since it has been applied in a number of laboratory laser devices [5]. A more dramatic application is to reflect laser rays from the Moon, where many such devices have been in place since the 1969 Apollo mission, which sent men to the Moon for the first time [2, 4]; among other things, the Earth-Moon distance can be measured by firing a laser beam from the Earth to the Moon, and measuring the travel time it takes for the beam to reflect back. This has allowed an estimate of the distance to within an accuracy of 3 cm.

Two other applications are worth mentioning. All retro-reflectors used in bicycles and cars are actually arrays of hundreds of cube corners. (The curious reader may find an interesting and exhaustive discussion of bicycle safety on the web [1], in particular, the use of cube-corner retro-reflectors.) A more hi-tech application of these devices is in an invention called "Smart Dust" [3, 6]. Smart Dust refers to an ensemble of very tiny devices (one millimeter or less) meant to be used in large numbers (so as to represent a dust of particles), acting like "intelligent" dust motes. Each mote would contain very small sensors and possibly other little devices. Tiny micromachined cube-corner retro-reflectors located inside such motes are considered good candidates for passive transmitters in free-space optical communication systems, since they do not need to emit energy, but only reflect it. Smart Dust has potential applications in disparate areas from meteorology to medicine. A handful of such particles thrown on the back of a subject could be used to collect medical data; disseminating them in the air or sea would allow one to monitor the state of the environment as it evolves in time.

We establish the main useful property of the cube corner: It reflects any incoming ray back in the same direction it came from, usually after three successive reflections. We confine ourselves to geometric optics, ignoring, for instance, changes in the polarization of light. In fact, a number of issues may arise depending on the exact material and coating. These remain the subject of active investigation.

The reflective property of the cube corner

Since our claim is about the directions of rays, it suffices to consider the direction cosines of the vectors that represent such rays; see FIGURE 1. Let us assume, once and for all, that the incoming ray hits exactly one of the three faces of the cube corner, ruling out the pathological cases that it might hit one edge or the origin. In the real-world applications, almost none of the rays will be pathological.

Figure 1 Three reflections of a light ray by a cube corner

Without loss of generality, we assume that the incoming ray, characterized by the direction cosines $(\lambda, \mu, \nu)$, hits the face represented by the $(x, y)$ plane. As with all other triples of direction cosines, we have $\lambda^2 + \mu^2 + \nu^2 = 1$. Let us also assume that the incoming ray is not perpendicular to this plane. After all, the property is observed quite simply in this case; hence $\nu^2 \neq 1$. By the laws of optics, the reflected ray, whose direction cosines are $(\lambda_1, \mu_1, \nu_1)$, must be coplanar with both the previous ray and the normal to the plane $(x, y)$. Therefore,

$$\begin{vmatrix} \lambda & \mu & \nu \\ \lambda_1 & \mu_1 & \nu_1 \\ 0 & 0 & 1 \end{vmatrix} = 0,$$

from which we obtain that $\lambda\mu_1 = \lambda_1\mu$ and hence $\lambda_1 = r\lambda$ and $\mu_1 = r\mu$ for some number $r$. Moreover, $\lambda_1^2 + \mu_1^2 + \nu_1^2 = 1$. On the other hand, the laws of optics also imply that incoming and reflected rays form the same angle with the normal to the $x$-$y$ plane, whose direction cosines are $(0, 0, 1)$, so that

$$(\lambda_1, \mu_1, \nu_1) \cdot (0, 0, 1) = (\lambda, \mu, \nu) \cdot (0, 0, 1),$$

which means $\nu_1 = \nu$.

The latter, along with the previous result, yields $(\lambda_1, \mu_1, \nu_1) = (r\lambda, r\mu, \nu)$, and then $r^2(\lambda^2 + \mu^2) + \nu^2 = 1$. But since $\lambda^2 + \mu^2 + \nu^2 = 1$, we conclude that $r^2 = 1$, and $r = 1$ or $r = -1$. The solution $r = 1$, however, is ruled out by the fact that the reflected ray cannot be parallel to the incoming ray, unless the latter is perpendicular to the plane $(x, y)$, a case that we can exclude. Therefore, $r = -1$ and

$$(\lambda_1, \mu_1, \nu_1) = (-\lambda, -\mu, \nu).$$
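The reflection step can be checked with a short numerical sketch. The code below is an illustration under our own assumptions, not the authors' method: it uses the standard mirror formula $v' = v - 2(v \cdot n)n$ for the travel direction of a ray, and the names `reflect` and `cube_corner` are hypothetical. With this convention one reflection in the $(x, y)$ plane sends $(\lambda, \mu, \nu)$ to $(\lambda, \mu, -\nu)$; the triple $(-\lambda, -\mu, \nu)$ given by the direction-cosine argument with $r = -1$ is its negative, that is, the same line with the opposite orientation. Applying the formula once for each of the three coordinate planes, in any order, reverses the travel direction, so the outgoing ray is parallel to the incoming one.

```python
from fractions import Fraction
from itertools import permutations

# Unit normals of the three mutually perpendicular mirror planes.
X = (1, 0, 0)  # normal to the y-z plane
Y = (0, 1, 0)  # normal to the x-z plane
Z = (0, 0, 1)  # normal to the x-y plane


def reflect(v, normal):
    """Mirror the travel direction v in the plane with unit normal `normal`."""
    dot = sum(a * b for a, b in zip(v, normal))
    return tuple(a - 2 * dot * b for a, b in zip(v, normal))


def cube_corner(v, order=(Z, Y, X)):
    """Travel direction after reflecting off the three faces in the given order."""
    for normal in order:
        v = reflect(v, normal)
    return v


if __name__ == "__main__":
    # An exact unit vector: (2/7)^2 + (3/7)^2 + (6/7)^2 = 1.
    v = (Fraction(2, 7), Fraction(3, 7), Fraction(6, 7))
    print(reflect(v, Z))
    for order in permutations((X, Y, Z)):
        print(cube_corner(v, order))
```

Exact fractions are used so that the comparison with the reversed direction involves no rounding.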

We proceed similarly with the other reflections. After a ray reflects off a given coordinate plane, the cosine of the angle with the normal to that plane will be unchanged, while the other two direction cosines will change sign. For instance, if the ray with direction cosines $(\lambda_1, \mu_1, \nu_1) = (-\lambda, -\mu, \nu)$ next hits the $x$-$z$ plane and then the $y$-$z$ plane, the successively reflected rays will have direction cosines

$$(\lambda_2, \mu_2, \nu_2) = (-\lambda_1, \mu_1, -\nu_1) = (\lambda, -\mu, -\nu)$$

and

$$(\lambda_3, \mu_3, \nu_3) = (\lambda_2, -\mu_2, -\nu_2) = (\lambda, \mu, \nu),$$

respectively. In the case of a ray hitting first the $y$-$z$ plane and then the $x$-$z$ plane, we have

$$(\lambda_2, \mu_2, \nu_2) = (\lambda_1, -\mu_1, -\nu_1) = (-\lambda, \mu, -\nu)$$

and then

$$(\lambda_3, \mu_3, \nu_3) = (-\lambda_2, \mu_2, -\nu_2) = (\lambda, \mu, \nu).$$

The final result will be the same. This shows that the ray emerging after three reflections is parallel to the original ray. Finally, we consider the special occurrence of a ray entering the cube corner parallel to one of its faces. This case can be viewed as if the reflections take place on a billiard table, with a ball hitting successively two concurrent sides of the table; the answer is the same.

Conclusion

We have shown that (almost) every ray emerging from a cube corner system after reflecting off its three plane walls has the same direction as the incoming ray that produced it. Thanks to clever applications of this magic mirror property, our travel is safer and we can measure the distance all the way to the Moon. If you ever encounter Smart Dust, remember the reflective property behind its design.

REFERENCES
1. J. S. Allen, About bicycle reflectors contents, last revised March 14, 2003, and all links, last revised April 27, 2002.
2. J. E. Faller and E. J. Wampler, The lunar laser reflector, Scientific American, 38, March 1970, 38-50.
3. J. M. Kahn, R. H. Katz, and K. S. J. Pister, Emerging challenges: mobile networking for "Smart Dust," J. Comm. and Networks 2:3 (2000), 188-196.
4. M. S. Scholl, Ray trace through a corner-cube retroflector with complex reflection coefficients, J. Opt. Soc. Amer. A 12 (1995), 1589-1592.
5. Chun-Ching Shih, Depolarization effect in a resonator with corner-cube reflectors, J. Opt. Soc. Amer. A 13 (1996), 1378-1384.
6. L. Zhou, J. M. Kahn, and K. S. J. Pister, Corner-cube retro-reflectors based on structure-assisted assembly for free-space optical communication, IEEE J. Microelectromech. Syst. 12:3 (2003), 233-242.

Spherical Triangles of Area π and Isosceles Tetrahedra
JEFF BROOKS
JOHN STRANTZEN
La Trobe University
Victoria 3086, Australia
J.Brooks@latrobe.edu.au
J.Strantzen@latrobe.edu.au

There is a beautiful theorem in spherical geometry that is not well known, and that doesn't appear in most texts on the subject. The theorem says that for any spherical triangle of area $\pi$ on the unit sphere, four congruent copies of it tile the sphere; that is, the copies can be positioned so that any two triangles meet at an edge, with vertices meeting vertices, and the union of these four triangles is the sphere [1; 3, p. 44; 6, p. 94]. The configuration is shown in FIGURE 1, which may aid readers in guessing the proof. What appears to be less known (in fact, the authors can find no reference to it) is that the four vertices on the sphere determined by this tiling form the vertices of an isosceles tetrahedron (opposite edges of the tetrahedron are equal) [2]. Hence, every such tiling of the sphere is obtained as the projection from the center of an inscribed isosceles tetrahedron.

Background

We introduce some preliminary definitions and results that the reader may already know, in which case it is possible to skip ahead. Given a point $P$ on the unit sphere, the line joining the center of the sphere to $P$ meets the sphere again at a diametrically opposite point $P'$. The pair $P$ and $P'$ are called antipodal points. We define a line on the sphere to be a great circle, that is, the intersection of the sphere with a plane containing the center of the sphere. The complement of a line is two disjoint open hemispheres. Given two distinct nonantipodal points, $P$ and $Q$, there is a unique line containing them. This line is the intersection of the sphere with the plane containing $P$ and $Q$ and the center of the sphere. We denote it by $l_{PQ}$. We let $PQ$ denote the smaller of the arcs on the line that connects $P$ to $Q$. It follows that the length of $PQ$ is less than $\pi$. We denote the fact that two arcs $PQ$ and $RS$ have equal length by $PQ = RS$, blurring the distinction between a segment and its measurement.

Given three distinct noncollinear points $A$, $B$, $C$, no two of which are antipodal, we define the spherical triangle $\triangle ABC$ to be the union of the arcs $AB$, $BC$, and $CA$. We call $AB$ the edge of the triangle joining vertex $A$ to vertex $B$, and similarly for $BC$ and $CA$. Such a triangle necessarily lies in an open hemisphere. Clearly each edge of $\triangle ABC$ determines an open hemisphere containing the remaining vertex. The intersection of these three open hemispheres is the interior of $\triangle ABC$. The area of $\triangle ABC$ is defined to be the area of its interior. It should be intuitive as to what is meant by the three angles of $\triangle ABC$, and we denote by $\angle CAB$, $\angle ABC$, and $\angle BCA$ the measure in radians of these three angles.

The main result that we use to prove our theorem is a remarkable theorem attributed to Girard. Girard's theorem says that

$$\angle CAB + \angle ABC + \angle BCA - \pi = \text{Area}(\triangle ABC).$$
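Girard's theorem gives a direct way to compute the area of a spherical triangle from its vertices. The Python sketch below is an illustration under our own assumptions: vertices are given as unit vectors, each angle is measured between the tangent directions of the two arcs at that vertex, and the function names are hypothetical. The two sample triangles are an octant and a face of the tiling induced by a regular tetrahedron.

```python
import math


def _dot(u, v):
    """Euclidean dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def vertex_angle(a, b, c):
    """Angle at vertex a between the great-circle arcs ab and ac (unit vectors)."""
    # Project b and c onto the tangent plane of the sphere at a.
    tb = [bi - _dot(a, b) * ai for ai, bi in zip(a, b)]
    tc = [ci - _dot(a, c) * ai for ai, ci in zip(a, c)]
    cosine = _dot(tb, tc) / math.sqrt(_dot(tb, tb) * _dot(tc, tc))
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp against rounding


def spherical_area(a, b, c):
    """Area of the spherical triangle abc on the unit sphere, by Girard's theorem."""
    angle_sum = vertex_angle(a, b, c) + vertex_angle(b, c, a) + vertex_angle(c, a, b)
    return angle_sum - math.pi


if __name__ == "__main__":
    octant = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    print(spherical_area(*octant))

    s = 1 / math.sqrt(3)
    face = ((s, s, s), (s, -s, -s), (-s, s, -s))
    print(spherical_area(*face))
```

The octant has three right angles and area $\pi/2$; the tetrahedral face has three angles of $2\pi/3$ and area $\pi$, one quarter of the sphere.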

Weeks [8] gives an elegant proof of this result. Finally, we need to say what we mean by two triangles being congruent. We adopt the Euclidean notion, which says that two triangles are congruent if there is an isometry between the triangles. The isometries of the sphere are of three types, a reflection in a line, a product of two reflections (this is equivalent to a rotation about an axis through the antipodal points where the lines meet), and a reflection followed by a rotation perpendicular to the line of reflection (a glide reflection) [5] . Reflections and glide reflections do not preserve orientation, which we wish to do in this discussion. Hence, for us two triangles are congruent if one can be rotated onto the other, see [7] . The usual theorems for congruence of spherical triangles-SSS, SAS, ASA, and AAA-will determine uniqueness up to isometry, so we must show that the copies of our triangle are all rotations of it. Proving the theorem To demonstrate our result, let MBC be a spherical triangle of area rr . Choose an edge (say AB), let M be the midpoint of that edge, as in FIGURE 1 . From our definition of a triangle it follows that MC\{M, C } lies in the interior of MBC. Rotate MBC by rr about the axis through M and its antipodal point M ' . Under this rotation A and B interchange and C maps to a fourth distinct point D. It is clear that C and D are distinct, since the length of MC is less than rr . Also C, M, and D lie on the line determined by M, M', and C . It is not difficult to see that M C has a length greater than rr /2 otherwise MBC would fit inside a quadrant (as the length of AB is less than rr) and hence have area

VO L . 78, N O . 4, OCTO B E R 2 005

31 3

I

/ ....,

D '

Figure

\

1

strictly less than n . Hence, M' is the midpoint of CD. The points C, M, and D lie on the line determined by M, M', and C and M ' D = M ' C . Also M, M', and A lie on line lAB, with A and B lying in distinct open hemispheres determined by line ! CD. Consider the four triangles MBC, 6.BAD, 6.CDA, and 6.DCB. Clearly any two meet at a common edge with vertices meeting vertices. If we partition 6. CDA into 6. CM'A and MM' D, and 6.DCB into 6.DM'B and 6.BM'C, we see that the three triangles 6. CM' A, MBC, and 6.BM' C tile the closed hemisphere determined by line AB and C. Similarly MM' D, 6.BAD, and 6.DM' B tile the closed hemisphere deter mined by line lAB and D. It follows that MBC, 6.BAD, 6. CDA, and 6.DCB tile the sphere. We have MBC congruent to 6.BAD, as they are rotations of each other by n about the axis through M, M ' . Also 6. CDA is congruent to 6.DCB for the same reason. By Girard's theorem, we have !_ CAB + !_ABC + !_BCA

=

2n (since Area MBC

=

n).

Also, !_ CAB + iDAC + !_BAD = 2n (sum of angles around vertex A ) . But !_BAD = !_ABC (MBC is congruent to 6.BAD) hence !_DAC = iBCA. It now follows that if we rotate 6.ABC by n about the midpoint N of AC, then A and C change places and B maps to D (DA = CB as 6.CDA is congruent to 6.DCB). Thus MBC is congruent to 6. CDA and so the four triangles are congruent. In particular AB = CD, BC = AD and AC = BD, which means that the tetrahedron formed by the vertices ABCD is isosceles. Now the great circle through M, M ' , and the midpoint N of AC has MN = M ' N (MBC is congruent to 6. CDA), and as MN + M ' N = n , it follows that MN = n j2. We therefore deduce that the medial triangle of MBC is an octant [1, 4] . Constructions This proof gives a method for drawing such triangles on a sphere: Mark two antipodal points on the sphere. Center congruent arcs of length less than n on these points in such a way that the arcs do not lie on the same great circle. Connect the end points of the arcs in the appropriate way to obtain the tiling. Varying the length of the arcs and the angle they make with each other gives all possible tilings. To construct an isosceles tetrahedron, note that its faces must be four congruent acute angled triangles. Take any acute angled triangle ABC and rotate it by n about an axis through the midpoint of AB perpendicular to the plane of the triangle. Let D be



the image of C. Then AB < CD. Now rotate $\triangle ABD$ about AB so that the image of D is D' and AB = CD'. It follows that tetrahedron ABCD' is isosceles.

Acknowledgment. The authors would like to thank Paul Pontikis for the typing, and Jeanette Varrenti and Marcel Jackson for the electronic picture.

REFERENCES
1. Grant Cairns, Margaret McIntyre, and John Strantzen, Geometric proofs of some recent results of Yang Lu, this MAGAZINE 66:4 (October 1993), 263-265.
2. N. A. Court, Modern Pure Solid Geometry, Chelsea Publishing Co., New York, 1964.
3. H. L. Davies, Packings of spherical triangles, Proceedings of the Colloquium on Convexity, Copenhagen, August 1965, pp. 42-51.
4. W. J. M'Clelland and T. Preston, A Treatise on Spherical Trigonometry with Applications to Spherical Geometry, Part I, Macmillan Publishing Co., New York, 1907, and Part II, 1909.
5. Patrick J. Ryan, Euclidean and Non-Euclidean Geometry, An Analytic Approach, Cambridge University Press, Cambridge, London, New York, New Rochelle, Melbourne, Sydney, 1986.
6. D. M. Y. Sommerville, Division of space by congruent triangles and tetrahedra, Proceedings of the Royal Society of Edinburgh 43 (1923), pp. 85-116.
7. I. Todhunter and J. G. Leathem, Spherical Trigonometry, Macmillan and Co. Limited, St. Martin's Street, London, 1943.
8. Jeffrey R. Weeks, The Shape of Space, 2nd ed., Marcel Dekker, Inc., New York, Basel, 2002.

A Butterfly Theorem for Quadrilaterals

SIDNEY KUNG
University of North Florida
Jacksonville, FL 32224
sidneykung@yahoo.com

One of the more appealing theorems in plane geometry is the butterfly theorem:

BUTTERFLY THEOREM. Through the midpoint I of a chord AC of a circle, two other chords EF and HG are drawn. If EG and HF intersect AC at M and N, respectively, then IM = IN.

Since 1815, when this theorem appeared as a proposed problem in the Gentleman's Diary (1815, pp. 39-40; see also [1, p. 195]), it has attracted many lovers of mathematics, some of whom have produced simple and elegant proofs, while others devised various generalizations. In a delightful and well-documented article, Bankoff [1] discusses the butterfly theorem for circles and some variants. In particular, on p. 207 one finds an "area method" applied to prove that if I is a point anywhere on the chord AC (as in FIGURE 1), IA = a, IC = c, IM = m, IN = n, then

$$\frac{1}{m} - \frac{1}{n} = \frac{1}{a} - \frac{1}{c}. \tag{1}$$

In this note, we give a similar proof of a butterfly theorem for quadrilaterals. Our proof depends primarily upon the following properties for areas of triangles:

P1  If K is the intersection of the lines XY and UV, $V \neq K$ (FIGURE 2a), then

$$\frac{A(UXY)}{A(VXY)} = \frac{UK}{VK},$$

where A(UXY) denotes the area of triangle UXY.

VOL. 78, NO. 4, OCTOBER 2005

Figure 1  A variant of the butterfly theorem for circles

P2  Given triangles ABC and XYZ, suppose that, as in FIGURE 2b, 2c, we have $\angle ABC = \angle XYZ$ or $\angle ABC + \angle XYZ = 180^\circ$; then

$$\frac{A(ABC)}{A(XYZ)} = \frac{AB \cdot BC}{XY \cdot YZ}.$$

For a proof of P1, consider FIGURE 2a. We see that the two triangles $\triangle UXY$ and $\triangle VXY$ share base XY. Furthermore, their altitudes are in the same proportion as UK/VK. Hence, so are their areas. In a similar manner we can easily establish P2.

Figure 2  Proportional areas in triangles

BUTTERFLY THEOREM FOR QUADRILATERALS. Through the intersection I of the diagonals AC, BD of a convex quadrilateral ABCD, draw two lines EF and HG that meet the sides of ABCD at E, F, G, H. If $M = EG \cap AC$, $N = HF \cap AC$, then

$$\frac{AM}{IM} \cdot \frac{IN}{CN} = \frac{IA}{IC}.$$

Figure 3  A butterfly theorem for quadrilaterals


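Proof. Refer to FIGURE 3. We find that there are twelve pairs of triangles, each pair of which has a common side or a common angle (or two congruent angles, one from each triangle of the pair). Applying P1 and P2 on these triangles we have

$$
\begin{aligned}
\frac{AM}{IM} \cdot \frac{IN}{CN}
&= \frac{A(AEG)}{A(IEG)} \cdot \frac{A(IFH)}{A(CHF)} && \text{(P1)} \\
&= \frac{A(AEG)}{A(ABD)} \cdot \frac{A(ABD)}{A(CBD)} \cdot \frac{A(CBD)}{A(CHF)} \cdot \frac{A(IFH)}{A(IEG)} \\
&= \frac{AE \cdot AG}{AB \cdot AD} \cdot \frac{IA}{IC} \cdot \frac{CD \cdot CB}{CF \cdot CH} \cdot \frac{IF \cdot IH}{IE \cdot IG} && \text{(P1, P2)} \\
&= \frac{A(EAC)}{A(BAC)} \cdot \frac{A(GAC)}{A(DAC)} \cdot \frac{IA}{IC} \cdot \frac{A(DAC)}{A(FAC)} \cdot \frac{A(BAC)}{A(HAC)} \cdot \frac{A(FAC)}{A(EAC)} \cdot \frac{A(HAC)}{A(GAC)} && \text{(P1)} \\
&= \frac{IA}{IC}.
\end{aligned}
$$

Thus, if we let IA = a, IC = c, IM = m, IN = n, we get

$$\frac{a - m}{m} \cdot \frac{n}{c - n} = \frac{a}{c},$$

which simplifies to

$$\frac{1}{m} - \frac{1}{n} = \frac{1}{a} - \frac{1}{c}. \qquad \blacksquare$$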
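As a sanity check on the relation just derived, it can be verified in exact rational arithmetic on one concrete quadrilateral. The sketch below is illustrative only: the helper names (`intersect`, `param`, `butterfly_lengths`), the sample coordinates, and the convention that E lies on AB and G on AD are assumptions made here, not part of the note. Lengths along AC are measured as fractions of AC, which is harmless because the relation is homogeneous in a, c, m, n.

```python
from fractions import Fraction as Fr

def intersect(p, q, r, s):
    """Exact intersection of line pq with line rs (assumed not parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p, q, r, s
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def param(p, a, c):
    """Position of p along line ac: 0 at a, 1 at c."""
    dx, dy = c[0] - a[0], c[1] - a[1]
    return ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)

def butterfly_lengths(A, B, C, D, tE, tG):
    """Return (a, c, m, n) = (IA, IC, IM, IN) as fractions of AC.

    E is placed on side AB at parameter tE and G on side AD at parameter tG;
    F and H are where lines EI and GI meet sides CD and CB.
    """
    A, B, C, D = (tuple(Fr(v) for v in P) for P in (A, B, C, D))
    I = intersect(A, C, B, D)
    E = (A[0] + tE * (B[0] - A[0]), A[1] + tE * (B[1] - A[1]))
    G = (A[0] + tG * (D[0] - A[0]), A[1] + tG * (D[1] - A[1]))
    F = intersect(E, I, C, D)
    H = intersect(G, I, C, B)
    M = intersect(E, G, A, C)
    N = intersect(H, F, A, C)
    tI, tM, tN = param(I, A, C), param(M, A, C), param(N, A, C)
    return tI, 1 - tI, tI - tM, tN - tI

a, c, m, n = butterfly_lengths((0, 0), (4, 0), (5, 3), (1, 4), Fr(1, 3), Fr(1, 2))
print(1 / m - 1 / n == 1 / a - 1 / c)            # relation (1)
print((a - m) / m * n / (c - n) == a / c)        # the theorem's ratio form
```

Because every coordinate is a `Fraction`, the two comparisons are exact rather than approximate; changing `tE` and `tG` moves the butterfly's wings without affecting the identity.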

Hence a butterfly inscribed in a quadrilateral satisfies the same relation (1) as a butterfly inscribed in a circle. Equivalently, the conclusion of the theorem indicates that the ratio of the ratios, (AM/IM)/(CN/IN), is the same as the ratio IA/IC, or that the harmonic mean of IC and IM equals the harmonic mean of IA and IN. In either case, if IC = IA, we have IM = IN, which is the analog of the usual butterfly theorem for quadrilaterals.

Acknowledgment. The author would like to thank the referees for their helpful suggestions, and Joseph Kung for preparation of the article.

REFERENCE
1. Leon Bankoff, The metamorphosis of the butterfly problem, this MAGAZINE 60 (1987), 195-210.

Row Rank Equals Column Rank

WILLIAM P. WARDLAW
U.S. Naval Academy
Annapolis, MD 21402-5000
wpw@usna.edu

Dedicated to George Mackiw, a good friend and an excellent mathematical expositor.

The purpose of this note is to present a short (perhaps shortest?) proof that the row rank of a matrix is equal to its column rank. The proof is elementary and accessible to students in a beginning linear algebra course. It requires only the definition of matrix multiplication and the fact that a minimal spanning set is a basis. It differs in approach from proofs given in textbooks as well as from some interesting proofs in MAA journals [1, 2]. And, unlike the latter, this proof is valid over any field of scalars.

But first, recall that if the $m \times n$ matrix A = BC is a product of the $m \times r$ matrix B and the $r \times n$ matrix C, then it follows from the definition of matrix multiplication that the ith row of A is a linear combination of the r rows of C with coefficients from the ith row of B, and the jth column of A is a linear combination of the r columns of B with coefficients from the jth column of C. (If you have trouble understanding this or the next paragraph, you should construct several examples of small matrix products, say, a $3 \times 2$ times a $2 \times 3$ matrix, etc., with small integer as well as symbolic entries.)

On the other hand, if any collection of r row vectors $c_1, c_2, \ldots, c_r$ spans the row space of A, an $r \times n$ matrix C can be formed by taking these vectors as its rows. Then the ith row of A is a linear combination of the rows of C, say $a_i = b_{i1}c_1 + b_{i2}c_2 + \cdots + b_{ir}c_r$. This means A = BC, where $B = (b_{ij})$ is the $m \times r$ matrix whose ith row, $b_i = (b_{i1}, b_{i2}, \ldots, b_{ir})$, is formed from the coefficients giving the ith row of A as a linear combination of the r rows of C. Similarly, if any r column vectors span the column space of A, and B is the $m \times r$ matrix formed by these columns, then the $r \times n$ matrix C formed from the appropriate coefficients satisfies A = BC.

Now the four sentence proof.
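The suggested exercise (a 3 × 2 matrix times a 2 × 3 matrix) is easy to carry out by machine as well as by hand. The following sketch uses plain Python lists; the helper names `matmul` and `combine` and the particular integer entries are illustrative choices made here, not taken from the note. It checks that each row of A = BC is the stated combination of the rows of C, and each column of A the stated combination of the columns of B.

```python
def matmul(B, C):
    """Product of an m x r matrix B and an r x n matrix C (lists of rows)."""
    return [[sum(B[i][k] * C[k][j] for k in range(len(C)))
             for j in range(len(C[0]))] for i in range(len(B))]

def combine(coeffs, vectors):
    """Linear combination coeffs[0]*vectors[0] + coeffs[1]*vectors[1] + ..."""
    return [sum(c * v[j] for c, v in zip(coeffs, vectors))
            for j in range(len(vectors[0]))]

B = [[1, 2], [0, 1], [3, -1]]        # 3 x 2
C = [[1, 0, 2], [-1, 1, 1]]          # 2 x 3
A = matmul(B, C)                     # 3 x 3

# Row i of A: combine the rows of C using row i of B as coefficients.
rows_ok = all(A[i] == combine(B[i], C) for i in range(3))

# Column j of A: combine the columns of B using column j of C as coefficients.
cols_of_B = [list(col) for col in zip(*B)]
cols_of_C = [list(col) for col in zip(*C)]
cols_of_A = [list(col) for col in zip(*A)]
cols_ok = all(cols_of_A[j] == combine(cols_of_C[j], cols_of_B) for j in range(3))

print(A, rows_ok, cols_ok)
```

Here A has three rows but they all lie in the span of the two rows of C, which is exactly the observation the proof exploits.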

THEOREM. If A is an $m \times n$ matrix, then the row rank of A is equal to the column rank of A.

Proof. If A = 0, then the row and column rank of A are both 0; otherwise, let r be the smallest positive integer such that there is an $m \times r$ matrix B and an $r \times n$ matrix C satisfying A = BC. Thus the r rows of C form a minimal spanning set of the row space of A and the r columns of B form a minimal spanning set of the column space of A. Hence, row and column ranks are both r. $\blacksquare$

Several other properties of the rank of a matrix over a field are also very easy to obtain. The factorizations $A = I_m A = A I_n$ show that $r \le m$ and $r \le n$, which proves that the rank of A is less than or equal to the number of rows and the number of columns of A. Since A = BC implies $A^T = C^T B^T$, the transpose clearly has the same rank as the original matrix. Since A = BC and D = EF implies AD = B(CD) = (AE)F, the rank of AD must be less than or equal to the rank of A and to the rank of D.

These concepts suggest the following definition [5, p. 123]:

DEFINITION. Let R be a commutative ring with identity and let A be an $m \times n$ matrix over R. Then the spanning rank of A is 0 if A = 0 and otherwise is the smallest positive integer r such that there is an $m \times r$ matrix B and an $r \times n$ matrix C satisfying

A = BC.

This definition is one way of extending the notion of rank to matrices over commutative rings. Even if the ring has no identity, it can be embedded in a ring with identity so that the definition can be used. Care must be taken in considering rank over commutative rings, because several different extensions of the definitions over a field can give different results over rings, even though they all give the standard concept of rank over a field. Nonetheless, if the above definition is used, matrices over rings automatically have row rank equal to column rank, have rank less than or equal to the number of rows and the number of columns, the rank of the transpose is equal to the rank of the matrix, and the rank of a product is less than or equal to the rank of either factor.

Another application of the spanning rank, first used by the author in a problem [3] and later a Note [5] in the MAGAZINE, is the proof that a matrix over a commutative ring with spanning rank r satisfies a polynomial equation of degree at most r + 1. For if A = BC is an $n \times n$ matrix of spanning rank r, then D = CB is an $r \times r$ matrix with characteristic polynomial $f_D(x) = \det(xI - D)$ of degree r, and $f_D(D) = 0$ follows from the Cayley-Hamilton Theorem. (The author has shown [4] that the Cayley-Hamilton Theorem holds for matrices over commutative rings.) Thus there is a polynomial m(x) of smallest positive degree such that m(D) = 0. Then p(x) = x m(x) is a polynomial of degree $\le r + 1$ such that

$$p(A) = A\,m(A) = BC\,m(BC) = B\,m(CB)\,C = B\,m(D)\,C = 0.$$

REFERENCES
1. H. Liebeck, A proof of the equality of column and row rank of a matrix, Amer. Math. Monthly 73 (1966), 1114.
2. G. Mackiw, A note on the equality of column and row rank of a matrix, this MAGAZINE 68 (1995), 285-286.
3. W. Wardlaw, Problem 1179, this MAGAZINE 56 (1983), 326, and Solution 1179, this MAGAZINE 57 (1984), 303.
4. W. Wardlaw, A transfer device for matrix theorems, this MAGAZINE 59 (1986), 30-33.
5. W. Wardlaw, Minimum and characteristic polynomials of low-rank matrices, this MAGAZINE 68 (1995), 122-127.
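The closing argument of Wardlaw's note (a matrix A = BC of spanning rank r is annihilated by a polynomial of degree at most r + 1) can be tried on a small integer example. The sketch below is an illustration under stated assumptions: the matrices are arbitrary choices made here, and for simplicity it uses the characteristic polynomial of the 2 × 2 matrix D = CB in place of the polynomial m(x) of smallest degree, which is enough to exhibit a cubic p with p(A) = 0.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[1, 2], [0, 1], [3, -1]]        # 3 x 2
C = [[1, 0, 2], [-1, 1, 1]]          # 2 x 3
A = matmul(B, C)                     # 3 x 3 with spanning rank at most 2
D = matmul(C, B)                     # 2 x 2

# Characteristic polynomial of a 2 x 2 matrix: x^2 - tr(D) x + det(D).
tr = D[0][0] + D[1][1]
det = D[0][0] * D[1][1] - D[0][1] * D[1][0]

# p(x) = x (x^2 - tr x + det), so p(A) = A^3 - tr A^2 + det A.
A2 = matmul(A, A)
A3 = matmul(A2, A)
pA = [[A3[i][j] - tr * A2[i][j] + det * A[i][j] for j in range(3)]
      for i in range(3)]

print(D, tr, det)
print(pA)
```

The computation mirrors the chain p(A) = B m(D) C in the text: everything is pushed through the small matrix D, so the degree of p is governed by r rather than by the size of A.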

A Modern Approach to a Medieval Problem

AWANI KUMAR, Director
Lucknow Zoological Garden
Lucknow 226001
INDIA
awanieva@eth.net

The following problem from Lilavati [1], a mathematical treatise written by Bhaskaracharya, a 12th-century Indian mathematician and astronomer, deserves a modern approach:

A snake's hole is at the foot of a pillar, nine cubits high, and a peacock is perched on its summit. Seeing a snake at the distance of thrice the pillar gliding towards his hole, he pounces obliquely upon him. Say quickly at how many cubits from the snake's hole they meet, both proceeding an equal distance.

Since both proceed an equal distance, it is reasonable to assume that their speeds are equal. Readers are invited to solve this problem before proceeding.

Assuming that the peacock flies along the hypotenuse of a right-angled triangle and knows the Pythagorean Theorem, it will grab the snake at a distance of 12 cubits from the pillar. Practically, however, such a thing does not happen. Why should a peacock know, a priori, to fly along the hypotenuse of a right-angled triangle having a base of 12 cubits? A more peacock-like behavior would be to keep an eagle eye on the snake and change its direction at every instant, always aiming toward the snake.

This type of pursuit problem has a history of over five hundred years. However, this particular problem is a bit different from most. Instead of the prey running away from the predator, here the prey and the predator are moving toward each other. Even so, the results are startling. Although the reader may have seen similar problems, I offer a general analysis. We assume that the snake and the peacock move at different, but constant, speeds: the snake in a straight line toward its hole and the peacock along a curve, changing its direction at every instant so as to be flying directly toward the snake.
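For reference, the 12-cubit answer quoted above for the straight-line flight is a one-line application of the Pythagorean theorem. Here d is a symbol introduced only for this aside: the distance from the hole to the meeting point. The pillar height 9 and the snake's initial distance 27 come from the problem statement, and equal distances mean the peacock's hypotenuse equals the snake's run of 27 - d.

```latex
(27 - d)^2 = 9^2 + d^2
\;\Longrightarrow\; 729 - 54d = 81
\;\Longrightarrow\; d = \frac{729 - 81}{54} = 12 .
```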

Let v be the speed of the peacock, and u be the speed of the snake. The snake is at the origin (0, 0) and the peacock is perched on the pillar at $(x_0, y_0)$. They start moving at time t = 0 and are at S(ut, 0) and P(x, y) respectively after time t (as in FIGURE 1). Let $\theta$ be the angle between the x-axis and the line connecting peacock and snake. Readers may wish to solve the problem themselves, using this notation, before continuing.

Figure 1  Notation for the pursuit problem: $\theta$ is the angle between the x-axis and the line joining the snake and the peacock

Differential equations for the peacock's path  Since the horizontal component of the peacock's velocity is $v\cos\theta$ towards the snake, the rate of change of the horizontal distance between them is $-(u + v\cos\theta)$. Similarly, the component of the snake's velocity towards the peacock is $u\cos\theta$ and so the rate of change of the distance between them is $-(v + u\cos\theta)$. The negative signs indicate that the distances are decreasing. Initially, the horizontal distance between them is $x_0$ and the distance between them is $\sqrt{x_0^2 + y_0^2}$. They will meet after time T if the following conditions are satisfied:

$$\sqrt{x_0^2 + y_0^2} = \int_0^T (v + u\cos\theta)\,dt \quad\text{and}\quad x_0 = \int_0^T (u + v\cos\theta)\,dt. \tag{1}$$

We find the meeting time by eliminating $\int_0^T \cos\theta\,dt$ from (1) to get

$$T = \frac{\sqrt{x_0^2 + y_0^2} - k x_0}{v\,(1 - k^2)}, \quad\text{where } k = u/v. \tag{2}$$
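The elimination behind the meeting-time formula (2) is short enough to record. In the sketch below, J is shorthand introduced here for the integral of $\cos\theta$ over $[0, T]$ and $r_0 = \sqrt{x_0^2 + y_0^2}$ is the initial separation; neither symbol appears in the original.

```latex
\begin{aligned}
r_0 &= vT + uJ, \qquad x_0 = uT + vJ
  &&\text{(the two meeting conditions)}\\
J &= \frac{x_0 - uT}{v}
  &&\text{(from the second condition)}\\
v\,r_0 &= v^2 T + u x_0 - u^2 T
  &&\text{(substitute into the first and multiply by } v)\\
T &= \frac{v\,r_0 - u\,x_0}{v^2 - u^2}
   = \frac{r_0 - k x_0}{v\,(1 - k^2)}, \qquad k = u/v .
\end{aligned}
```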

Equation (2) implies that if $k \to 1$ then $T \to \infty$. The startling result is that, however small the pillar may be, if the speeds of the peacock and snake are equal, the animals will never meet! We rarely find such a counter-intuitive result in mathematics. Let us look into it in more detail.

The key is to turn our assumptions into differential equations. The tangent line to the curve of pursuit at P(x, y) must pass through the point S(ut, 0). Looking at its slope, we have

$$\frac{dy}{dx} = \frac{y}{x - ut} \quad\text{or}\quad x - ut = y\,\frac{dx}{dy}. \tag{3}$$

We will eliminate t to find an ordinary differential equation in x and y. As a first step, let s be the distance covered by the peacock in time t, so that s = vt. Substitution into (3) gives

$$y\,\frac{dx}{dy} = x - ks. \tag{4}$$


For the next step, it is fruitful to treat y as the independent variable, a trick common in solving ordinary differential equations. The distance s is also the length of the pursuit curve from $(x_0, y_0)$ to (x, y). Therefore, using the arc-length formula, we get

$$s = \int_y^{y_0} \sqrt{1 + (dx/dy)^2}\,dy \quad\text{or}\quad \frac{ds}{dy} = -\sqrt{1 + (dx/dy)^2}. \tag{5}$$

Differentiating (4) with respect to y gives

$$y\,\frac{d^2x}{dy^2} = -k\,\frac{ds}{dy},$$

and thus

$$y\,\frac{d^2x}{dy^2} = k\sqrt{1 + (dx/dy)^2}.$$

This second order differential equation can be reduced to first order by another standard trick, namely, putting p for dx/dy. Therefore, $y\,dp/dy = k\sqrt{1 + p^2}$ or, separating variables,

$$\frac{dp}{\sqrt{1 + p^2}} = k\,\frac{dy}{y}. \tag{6}$$

At t = 0, $x = x_0$, $y = y_0$, and $p = x_0/y_0$. Substituting $p = \tan\alpha$ (where $\alpha = \pi/2 - \theta$), we get $dp = \sec^2\alpha\,d\alpha$ and $dp/\sqrt{1 + p^2} = \sec\alpha\,d\alpha$. Therefore, integrating (6) and applying the initial condition gives

$$p + \sqrt{1 + p^2} = c_1 (y/y_0)^k, \tag{7}$$

where $c_1 = x_0/y_0 + \sqrt{1 + (x_0/y_0)^2}$. Using lots of algebra, (7) simplifies to

$$p = \frac{dx}{dy} = \frac{1}{2}\left[ c_1 (y/y_0)^k - \frac{1}{c_1} (y_0/y)^k \right]. \tag{8}$$

(The key observation is that the reciprocal of $p + \sqrt{1 + p^2}$ is $\sqrt{1 + p^2} - p$, so subtracting the reciprocal of (7) from (7) leaves 2p on the left.)

Let us investigate the position of the peacock when p = 0, which means that the peacock is exactly above the snake. This situation distinguishes the cases where the peacock catches the snake while always flying away from the pillar from those where it reverses direction. Putting p = 0 in (7) gives $c_1 (y/y_0)^k = 1$ or

$$y = y_0 (1/c_1)^{1/k}. \tag{9}$$

This gives the relationship between the position of the peacock, its initial position, and k. In the given problem, k = 1 and $c_1 = 3 + \sqrt{10} \approx 6.16$. So, from (9), the peacock is exactly above the snake at y = 1.46 cubits. Before this position, the peacock was moving away from the pillar and after this, it will be moving toward the pillar, while always heading for the snake. Whatever the value of k, (9) shows that the peacock always reverses its direction before catching the snake. This is a surprise: However fast the peacock may fly and however small the pillar may be, the peacock can never catch the snake before reversing its direction! Mathematics, like nature, is full of surprises. At what angle does the peacock catch the snake? Readers are challenged to find it from the given analysis.

Catching the snake  We have seen earlier that the peacock and snake do not meet when k = 1. Let us confirm this using our solution. We can also determine the critical value of k that allows the peacock to catch the snake. When k = 1, integrating (8) gives

$$x = x_0 + A_1 \ln\left(\frac{y_0}{y}\right) - \frac{y_0^2 - y^2}{8A_1}, \tag{10}$$

where $A_1 = y_0/(2c_1)$. Let r be the separation between the peacock and the snake. According to our analysis,

$$r = A_1 + \frac{y^2}{4A_1}. \tag{11}$$

Since $y^2/4A_1 > 0$, this distance is greater than $A_1$ cubits. In the given problem with a pillar nine cubits high, the peacock cannot get closer than 0.73 cubits. What a narrow escape for the snake!

When $k \neq 1$, integrating (8) and using $x = x_0$ at $y = y_0$ gives

$$x = x_0 + \frac{y_0}{2}\left[ \frac{c_1}{1 + k}\left( (y/y_0)^{1+k} - 1 \right) - \frac{1}{c_1 (1 - k)}\left( (y/y_0)^{1-k} - 1 \right) \right]. \tag{12}$$

FIGURE 2 shows the path of the peacock for various values of k. If the peacock does not wish to go hungry, then it will have to maintain a greater speed than the snake. To investigate the critical situation where the snake is caught just as it enters its lair, we substitute $x = x_0$ and y = 0 into (12) and find

$$k_{\text{critical}} = \frac{c_1^2 - 1}{c_1^2 + 1}.$$

If $k > k_{\text{critical}}$ (here $k_{\text{critical}} \approx 0.95$), then the snake will safely reach its hole; otherwise, the peacock will have a nice dinner!

Figure 2  The flight of the peacock
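The numerical claims in this analysis (the constant $c_1$, the turning height 1.46, the closest approach 0.73, and the critical ratio 0.95) are easy to reproduce, and a crude simulation gives an independent check on the closed-form results. The sketch below is not the author's computation: the function names, the forward-Euler scheme, the step size `dt`, and the capture test (separation no larger than one step of the peacock) are all assumptions made here for illustration.

```python
import math

def pursuit_constants(x0, y0):
    """c1, the k = 1 limiting separation A1, and the critical speed ratio."""
    c1 = x0 / y0 + math.sqrt(1.0 + (x0 / y0) ** 2)
    a1 = y0 / (2.0 * c1)
    k_crit = (c1 ** 2 - 1.0) / (c1 ** 2 + 1.0)
    return c1, a1, k_crit

def turning_height(y0, c1, k):
    """Height at which the peacock is directly above the snake (p = 0)."""
    return y0 * (1.0 / c1) ** (1.0 / k)

def meeting_time(x0, y0, k, v=1.0):
    """Meeting time from the closed-form expression; meaningful for k < 1."""
    return (math.hypot(x0, y0) - k * x0) / (v * (1.0 - k * k))

def simulate(x0, y0, k, v=1.0, dt=1e-3):
    """Forward-Euler pursuit. The snake runs along the x-axis from 0 to its
    hole at x0; the peacock starts at (x0, y0) and always heads for the snake.
    Returns (caught, time, separation) at capture or when the snake is home."""
    u = k * v
    px, py, sx, t = float(x0), float(y0), 0.0, 0.0
    while sx < x0:
        dx, dy = sx - px, -py
        r = math.hypot(dx, dy)
        if r <= v * dt:                 # within one step: count as caught
            return True, t, r
        px += v * dt * dx / r           # peacock steps toward the snake
        py += v * dt * dy / r
        sx += u * dt                    # snake steps toward its hole
        t += dt
    return False, t, math.hypot(sx - px, py)

c1, a1, k_crit = pursuit_constants(27, 9)
print(round(c1, 2), round(turning_height(9, c1, 1.0), 2), round(a1, 2), round(k_crit, 2))
print(simulate(27, 9, 1.0))             # equal speeds: snake gets home
print(simulate(27, 9, 0.5), meeting_time(27, 9, 0.5))
```

With equal speeds the simulated separation at the moment the snake reaches its hole should sit close to $A_1$, and with k = 0.5 the simulated capture time should be close to the closed-form meeting time; agreement is only approximate because of the finite step size.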

This particular story about peacocks pursuing snakes may be new, but the mathematics behind it is not. Knaust [2] credits Leonardo da Vinci (1452-1519) with first studying pursuit curves. Pluckette [3] and Neville [4] have also pondered them. Many textbooks on differential equations, such as that by Simmons [5], cover pursuit problems. Crough [6] gives some information about their history. None of these authors solve the exact same problem of the peacock and the snake.

However, I found that my efforts were not completely original. In the Advanced Problems section of the August-September 1941 issue of the American Mathematical Monthly, the problem is given for k = 2 and is solved by H. A. Luther. It is noted there that the problem appears on p. 295 of Fine's Calculus, and on p. 332 of Osgood's Advanced Calculus. It also appears as Example 3 on p. 138 of Elements of Ordinary Differential Equations by Golomb and Shanks and probably in many other similar textbooks. But what must be the earliest published solution is noted by Arthur Bernhart in Scripta Mathematica, 20, p. 127: Pierre Bouguer (1698-1758) in a 1732 memoir Upon New Curves Which May be Called Lines of Pursuit [Lignes de poursuite, Memoires de L'Academie Royale des Sciences (1732), pp. 1-14] derives the very same equation I have. My work has been anticipated by 238 years! Even after 500 years, pursuit problems have not lost their charm and deserve to be pursued!

[Cartoon caption: Straight line or curve, that is the question!]

REFERENCES
1. Bhaskaracharya, Lilavati, The Book Company Ltd., Calcutta, India, 1927.
2. Helmut Knaust, Lab on pursuit curves, www.math.utep.edu/classes/3226/lab2bllab2b.html
3. C. C. Pluckette, The curve of pursuit, Math. Gaz. 37 (1953), 256-260.
4. E. H. Neville, The curve of pursuit, Math. Gaz. 15 (1930), 436.
5. Simmons, Differential Equations: Applications and Historical Notes, McGraw Hill Education, Europe, 2nd ed., 1961, pp. 68-70.
6. Gerald Crough, A pursuit problem, this MAGAZINE 24 (Mar-Apr. 1971), 94-97.

40 Years Ago in the MAGAZINE

Readers who enjoyed Kung's note, "A Butterfly Theorem for Quadrilaterals," might enjoy recalling Klamkin's "An Extension of the Butterfly Problem," 38 (1965), 206-208. His main result can be summarized as follows:

Let AB be an arbitrary chord of a circle with midpoint P, and let chords JH and GI intersect AB at M and N respectively. If MP = PN, then if AB intersects JI at C and GH at D we have CP = PD. Furthermore, if line segments AB, GJ, and IH are extended so that AB intersects GJ and IH at E and F respectively, then EP = PF.

PROBLEMS

ELGIN H. JOHNSTON, Editor
Iowa State University

Assistant Editors: RAZVAN GELCA, Texas Tech University; ROBERT GREGORAC, Iowa State University; GERALD HEUER, Concordia College; VANIA MASCIONI, Ball State University; BYRON WALDEN, Santa Clara University; PAUL ZEITZ, The University of San Francisco

Proposals

To be considered for publication, solutions should be received by March 1, 2006.

1726. Proposed by Jerry Metzger, University of North Dakota, Grand Forks, ND.

Let a and j be positive integers with $a \ge 2$. Show that there is a positive integer n such that $a^n \equiv j \pmod{aj + 1}$ if and only if $j = a^k$ for some $k \ge 0$.

1727. Proposed by Jody M. Lockhart and William P. Wardlaw, U. S. Naval Academy, Annapolis, MD.

Chain addition is a technique used in cryptography to extend a short sequence of digits, called the seed, to a longer sequence of pseudorandom digits. If the seed sequence of digits is $a_1, a_2, \ldots, a_n$, then for positive integer k, $a_{n+k} \equiv a_k + a_{k+1} \pmod{10}$, that is, $a_{n+k}$ is the units digit in the sum $a_k + a_{k+1}$. Suppose that the seed sequence is 3, 9, 6, 4. Prove that the sequence is periodic and find, without the use of calculator or computer, the number of digits in the sequence before the first repetition of 3, 9, 6, 4.

1728. Proposed by José Luis Díaz-Barrero, Universitat Politècnica de Catalunya, Barcelona, Spain.

Let $A_1 A_2 \ldots A_{3n}$ be a regular polygon with 3n sides, and let P be a point on the shorter arc $A_1 A_{3n}$ of its circumcircle. Prove that

(_L ) L ( - --) n

k=l

PA n +k

n

1

k=l

PA k

+

1

PA 2n +k

We invite readers to submit problems believed to be new and appealing to students and teachers of advanced undergraduate mathematics. Proposals must, in general, be accompanied by solutions and by any bibliographical information that will assist the editors and referees. A problem submitted as a Quickie should have an unexpected, succinct solution. Solutions should be written in a style appropriate for this MAGAZINE. Each solution should begin on a separate sheet. Solutions and new proposals should be mailed to Elgin Johnston, Problems Editor, Department of Mathematics, Iowa State University, Ames, IA 50011, or mailed electronically (ideally as a LaTeX file) to ehjohnst@iastate.edu. All communications should include the reader's name, full address, and an e-mail address and/or FAX number.

1729. Proposed by Brian T. Gill, Seattle Pacific University, Seattle, WA.

For positive integer k, let $c_k$ denote the product of the first k odd positive integers, and let $c_0 = 1$. Prove that for each nonnegative integer n,

1730. Proposed by Steven Butler, University of California San Diego, La Jolla, CA.

Let A and B be symmetric, positive semi-definite matrices such that A + B is positive definite, and let $\|y\|$ denote the usual 2-norm of the vector y. Prove that for all $x \neq 0$,

$$\|(I - A)(I + A)^{-1}(I - B)(I + B)^{-1} x\| < \|x\|.$$

1718. Proposed by David Callan, Madison, WI.

Let k, n be integers with $1 \le k \le n$. Prove the identity

Note. This corrects a typo in the April 2005 appearance of the problem.
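The chain-addition rule described in Problem 1727 above is simple to state in code. The problem itself asks for an argument made without a computer, so the following is offered only as a sketch of the mechanism, not as a solution: the function names and the `limit` safeguard are choices made here. Because each window of n consecutive digits determines both its successor and its predecessor, the sequence must eventually return to the seed.

```python
def chain_addition(seed, length):
    """Extend `seed` to `length` digits using a[n+k] = (a[k] + a[k+1]) mod 10."""
    digits = list(seed)
    n = len(seed)
    while len(digits) < length:
        k = len(digits) - n              # 0-based index of the term a_k
        digits.append((digits[k] + digits[k + 1]) % 10)
    return digits[:length]

def first_repeat(seed, limit=100_000):
    """Number of digits produced before the seed block first reappears,
    or None if that does not happen within `limit` steps."""
    n = len(seed)
    digits = list(seed)
    pos = 0
    while pos < limit:
        digits.append((digits[pos] + digits[pos + 1]) % 10)
        pos += 1
        if digits[pos:pos + n] == list(seed):
            return pos
    return None

print(chain_addition([3, 9, 6, 4], 12))
print(first_repeat([3, 9, 6, 4]))
```

The first call shows how the seed is stretched; the second reports the period by brute force, which can be compared with whatever value a pencil-and-paper argument produces.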

Quickies

Answers to the Quickies are on page 328.

Q953. Proposed by Michael W. Botsko, Saint Vincent College, Latrobe, PA.

Let f be a derivative and g an increasing differentiable function on [a, b]. Prove that fg is also a derivative on [a, b].

Q954. Proposed by Arthur Benjamin, Harvey Mudd College, Claremont, CA, and Michel Bataille, Rouen, France.

Show that for positive integer n,

$$\sum_{k=0}^{n} \binom{n}{k} \binom{n+k}{k} = \sum_{k=0}^{n} 2^k \binom{n}{k}^2.$$

Solutions

A real inequality
October 2004

1701. Proposed by Murray S. Klamkin, University of Alberta, Edmonton, AB.

Prove that for all positive real numbers a, b, c, d,

$$a^4 b + b^4 c + c^4 d + d^4 a \ge abcd(a + b + c + d).$$

I. Solution by Zuming Feng, Phillips Exeter Academy, Exeter, NH.

By the arithmetic-geometric mean inequality,

$$a^4 b + abc^2 d + abcd^2 \ge 3a^2 bcd.$$

Similarly,

$$b^4 c + abcd^2 + a^2 bcd \ge 3ab^2 cd,$$
$$c^4 d + a^2 bcd + ab^2 cd \ge 3abc^2 d,$$
$$d^4 a + ab^2 cd + abc^2 d \ge 3abcd^2.$$

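Adding these four inequalities leads to the desired result.

II. Solution by Chip Curtis, Missouri Southern State University, Joplin, MO.

Note that

$$a^2 bcd = (a^4 b)^{23/51} (b^4 c)^{7/51} (c^4 d)^{11/51} (d^4 a)^{10/51},$$

so by the weighted arithmetic-geometric mean inequality,

$$a^2 bcd \le \frac{23\,a^4 b + 7\,b^4 c + 11\,c^4 d + 10\,d^4 a}{51}.$$

Adding this to the analogous results for $b^2 cda$, $c^2 dab$, $d^2 abc$ gives the desired inequality.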

Note. A few readers proved that if $a_1, a_2, \ldots, a_n$ are nonnegative real numbers, then

Also solved by Arkady Alt, George Baloglou, Roy Barbara (Lebanon), Rafael Benguria (Chile), Paul Bracken, Daniele Donini (Italy), David Farnsworth, Fejéntaláltuka Szeged Problem Group (Hungary), Michael Goldenberg and Mark Kaplan, G.R.A.20 Math Problems Group (Italy), John Kieffer and Ana Mantilla, Elias Lampakis (Greece), Kee-Wai Lau (China), T. L. McCoy (Taiwan), Microsoft Research Problems Group, David A. Morales (Venezuela), Michael G. Neubauer, Michael Reid, Heinz-Jürgen Seiffert, Achilleas Sinefakopoulos, Nicholas C. Singer, Paul Weisenhorn (Germany), Li Zhou, and the proposer. There was one incorrect submission.

Bounds on a sum of distances
October 2004

1702. Proposed by Roy Barbara, American University of Beirut, Beirut, Lebanon.

Let R be the circumradius of nondegenerate triangle ABC. For point P on or inside of triangle ABC, let S(P) = |PA| + |PB| + |PC|. Find the maximum value of k and the minimum value of K such that $kR \le S(P) \le KR$ for all acute triangles ABC.

Solution by Microsoft Research Problems Group, Redmond, WA.

The maximum value for k is 2, and the minimum value for K is 4. We first show that 2R < S(P). Using the triangle inequality, the Law of Sines, and the fact that $\sin\theta > \frac{2}{\pi}\theta$ for $0 < \theta < \frac{\pi}{2}$, we have

$$
\begin{aligned}
2S(P) &= (|PB| + |PC|) + (|PC| + |PA|) + (|PA| + |PB|) \\
&\ge |BC| + |CA| + |AB| \\
&= 2R\sin\angle A + 2R\sin\angle B + 2R\sin\angle C \\
&> 2R \cdot \frac{2}{\pi}(\angle A + \angle B + \angle C) = 4R.
\end{aligned}
$$

To prove that $S(P) \le 4R$, place the triangle in the complex plane with the circumcenter at the origin, and let A, B, C also represent the complex coordinates of the vertices. Because P is on or inside of the triangle, there are nonnegative real numbers a, b, c with P = aA + bB + cC and a + b + c = 1. Then

$$|P - A| = |aA + bB + cC - A| \le |(a - 1)A| + |bB| + |cC| = (1 - a + b + c)R,$$

with similar inequalities for |P - B| and |P - C|. We then have

$$
\begin{aligned}
S(P) &= |P - A| + |P - B| + |P - C| \\
&\le (1 - a + b + c)R + (1 + a - b + c)R + (1 + a + b - c)R \\
&= (3 + a + b + c)R = 4R.
\end{aligned}
$$

To show that k = 2 and K = 4 are best possible, consider the triangle with vertices A = -1, $B = e^{2i\theta}$, and $C = e^{-2i\theta}$, where $\theta$ is small and positive. This triangle has circumradius R = 1 and sides $|AB| = |AC| = 2\cos\theta$ and $|BC| = 2\sin 2\theta$. Observe that $S(A)/R = 4\cos\theta$ can be made arbitrarily close to 4, and that $S(B)/R = 2\cos\theta + 2\sin 2\theta$ can be made arbitrarily close to 2.

Also solved by Michel Bataille (France), Allen Broughton and Herb Bailey, Daniele Donini (Italy), Elias Lampakis (Greece), Raul Simon (Chile), Michael Vowe (Switzerland), Paul Weisenhorn (Germany), Li Zhou, and the proposer.

Polynomial multiples of $3^n$
October 2004

1703. Proposed by Shahin Amrahov, ARI College, Ankara, Turkey.

Let $p(x) = ax^3 + bx^2 + cx + d$ where a, b, c, d are integers with $a \equiv c \equiv 2 \pmod 3$ and $b \equiv 0 \pmod 3$. Prove that for any positive integer n there is an integer k such that p(k) is a multiple of $3^n$.

Solution by Nicholas C. Singer, Annandale, VA.

First observe that

$$p(t) - p(u) = (t - u)\left( a(t^2 + tu + u^2) + b(t + u) + c \right),$$

where the second factor satisfies

$$a(t^2 + tu + u^2) + b(t + u) + c \equiv 2(t^2 + tu + u^2 + 1) \pmod 3.$$

Because $t^2 + tu + u^2 + 1$ is not a multiple of 3 for any integers t, u, it follows that $3^n \mid (p(t) - p(u))$ if and only if $3^n \mid (t - u)$. Thus, as k takes on integer values $1, 2, \ldots, 3^n$, p(k) takes on $3^n$ distinct values modulo $3^n$. One of these values must be a multiple of $3^n$.

Also solved by Roy Barbara (Lebanon), Michel Bataille (France), John Christopher, Chip Curtis, Knut Dale (Norway), Richard Daquila, Fejéntaláltuka Szeged Problem Group (Hungary), Dmitry Fleishman, Michael Goldenberg and Mark Kaplan, Elana C. Greenspan, Russell Jay Hendel, Keun-Ae Jung and Hannah Harding, Victor Y. Kutsenok, Elias Lampakis (Greece), Microsoft Research Problems Group, Northwestern University Math Problem Solving Group, Manual Reyes, Rolf Richberg (Germany), Paul Weisenhorn (Germany), Hongbiao Zeng, Li Zhou, and the proposer.
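The counting argument in the solution of Problem 1703 above says that p permutes the residues modulo $3^n$, and this can be confirmed by brute force for small n. The sketch below is only an illustration: the function name `residues` and the sample coefficients (a, b, c, d) = (2, 3, 5, 1), which satisfy the stated congruences, are choices made here.

```python
def residues(a, b, c, d, n):
    """Values of p(k) = a k^3 + b k^2 + c k + d modulo 3^n for k = 1, ..., 3^n."""
    mod = 3 ** n
    return [(a * k ** 3 + b * k ** 2 + c * k + d) % mod for k in range(1, mod + 1)]

for n in range(1, 6):
    r = residues(2, 3, 5, 1, n)
    distinct = len(set(r)) == 3 ** n          # p(k) hits every residue exactly once
    root = r.index(0) + 1 if 0 in r else None # a k with 3^n dividing p(k)
    print(n, distinct, root)
```

Each printed line reports whether the $3^n$ values are pairwise distinct and, if so, one k for which p(k) is divisible by $3^n$.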

Fermat-Torricelli cevians
October 2004

1704. Proposed by Mowaffaq Hajja, Yarmouk University, Irbid, Jordan.

Let F be the Fermat-Torricelli point for triangle ABC. Let the cevian from B through F meet AC in B* and the cevian from C through F meet AB in C*. Prove that if BB* = CC*, then triangle ABC is isosceles. (The Fermat-Torricelli point F of triangle ABC is the point for which the sum of the distances from the vertices is minimum.)

Solution by David Finn and Herb Bailey, Rose-Hulman Institute of Technology, Terre Haute, IN.

If each angle of the triangle has measure less than $120^\circ$, then F is in the interior of the triangle. In this case it is well known that $\angle AFB = \angle BFC = \angle CFA = 120^\circ$. If $\angle A \ge 120^\circ$, then F = A and the result is immediate. If either $\angle B \ge 120^\circ$ or $\angle C \ge 120^\circ$, then either B* or C* is undefined. Thus we assume that all angles have measure less than $120^\circ$.

If AF is extended to meet BC at A*, then the six nonoverlapping angles with common vertex F each have measure $60^\circ$. We now establish two lemmas to prove and generalize the desired result.

LEMMA 1. If FB = FC, then AB = AC and BB* = CC*.

Proof. Because $\triangle BFA \cong \triangle CFA$, it follows that AB = AC. To prove the other equality, note that $\angle FBC = \angle FCB$ and that $\angle C^*BC = \angle B^*CB$. Thus $\triangle BCC^* \cong \triangle CBB^*$, so BB* = CC*.

LEMMA 2. If BB* = CC*, then FB = FC.

Proof. Assume that $FB \neq FC$. Then without loss of generality we may assume that FC > FB. Construct C' on FC so that FC' = FB. Let B' be the intersection point of AC' and BB*, and let D be the point on AC such that C'D is parallel to BB*. Because F is also the Fermat-Torricelli point for $\triangle BC'A$, it follows from Lemma 1 that BB' = C'C*. Since $\triangle CFB^* \sim \triangle CC'D$, we have $\angle CC'D = 60^\circ$. Now in $\triangle AFC$, $\angle AFC = 120^\circ$, and thus $\angle FCA < 60^\circ$. Hence $\angle C'DC > 60^\circ$ and C'C > C'D. Because $\triangle AC'D \sim \triangle AB'B^*$ and AC' > AB', we have C'D > B'B*. Combining inequalities we have C'C > B'B*, and hence, CC* > BB*. This completes the proof of Lemma 2.

Combining Lemmas 1 and 2 shows that BB* = CC* implies AB = AC, establishing the desired result. On the other hand, if we are given that AB = AC, then $\triangle AFB \cong \triangle AFC$ (by SSA with angle F obtuse) and FB = FC. Combining this with Lemmas 1 and 2 we see that each of the following three statements implies the other two: (i) FB = FC, (ii) BB* = CC*, (iii) AB = AC. Furthermore, these conditions occur if and only if $\angle CBB^* = 30^\circ$.

Also solved by Herb Bailey, Roy Barbara (Lebanon), Daniele Donini (Italy), Ovidiu Furdui, Marty Getz and Dixon Jones, Michael Goldenberg and Mark Kaplan, Elana C. Greenspan, Geoffrey A. Kandall, Victor Y. Kutsenok, Microsoft Research Problems Group, Peter E. Nuesch (Switzerland), Raúl A. Simón (Chile), John W. Spellman and Ricardo Torrejón, Ricardo M. Torrejón, Michael Vowe (Switzerland), Paul Weisenhorn (Germany), Li Zhou, and the proposer.

October 2004

A symmetric minimum

1705. Proposed by Michel Bataille, Rouen, France.

Let $n$ be a positive integer. Find the minimum value of
\[ \frac{(a-b)^{2n+1} + (b-c)^{2n+1} + (c-a)^{2n+1}}{(a-b)(b-c)(c-a)} \]
for distinct real numbers $a, b, c$ with $bc + ca \ge 1 + ab + c^2$.

MATHEMATICS MAGAZINE


Solution by Northwestern Math Problem Solving Group, Northwestern University, Evanston, IL. With the change of variables $u = b - c$, $v = c - a$, the problem becomes that of finding the minimum of
\[ f(u, v) = \frac{(u+v)^{2n+1} - u^{2n+1} - v^{2n+1}}{(u+v)uv} \]
subject to $uv \ge 1$. Note that $u$ and $v$ must have the same sign and that $f(-u, -v) = f(u, v)$, so we may assume that $u$ and $v$ are positive. Applying the binomial theorem and grouping terms, we have
\[ f(u, v) = \sum_{k=1}^{n} \binom{2n+1}{k} \frac{u^{2n+1-2k} + v^{2n+1-2k}}{u+v}\, u^{k-1} v^{k-1}. \]
By the power means inequality,
\[ \frac{u^{2n+1-2k} + v^{2n+1-2k}}{2} \ge \left(\frac{u+v}{2}\right)^{2n+1-2k}. \]
Hence, by the arithmetic-geometric mean inequality,
\[ \frac{u^{2n+1-2k} + v^{2n+1-2k}}{u+v} \ge \left(\frac{u+v}{2}\right)^{2n-2k} \ge (uv)^{n-k}. \]
Because $uv \ge 1$, we have
\[ f(u, v) \ge \sum_{k=1}^{n} \binom{2n+1}{k} = 4^n - 1. \]
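The bound $f(u, v) \ge 4^n - 1$ and the grouping of binomial terms can be spot-checked with exact rational arithmetic. The sketch below is an illustration only; the helper names `f` and `f_grouped` and the sample points are assumptions chosen here, not part of the published solution.

```python
from fractions import Fraction
from math import comb

def f(u, v, n):
    """The quotient ((u+v)^(2n+1) - u^(2n+1) - v^(2n+1)) / ((u+v) u v)."""
    m = 2 * n + 1
    return ((u + v) ** m - u ** m - v ** m) / ((u + v) * u * v)

def f_grouped(u, v, n):
    """The same quantity written as the grouped binomial sum over k = 1..n."""
    total = Fraction(0)
    for k in range(1, n + 1):
        m = 2 * n + 1 - 2 * k
        total += comb(2 * n + 1, k) * (u ** m + v ** m) / (u + v) * (u * v) ** (k - 1)
    return total

if __name__ == "__main__":
    # Sample points with u*v >= 1 and u, v of the same sign.
    samples = [(Fraction(1), Fraction(1)), (Fraction(2), Fraction(1)),
               (Fraction(1, 2), Fraction(2)), (Fraction(7), Fraction(2)),
               (Fraction(-2), Fraction(-1, 2))]
    for n in range(1, 5):
        bound = 4 ** n - 1
        for u, v in samples:
            value = f(u, v, n)
            print(n, u, v, value, value == f_grouped(u, v, n), value >= bound)
```

Because `Fraction` arithmetic is exact, the comparison of `f` with `f_grouped` tests the algebraic identity itself rather than a floating-point approximation of it. For $n = 1$ the quotient is identically 3, so every admissible pair attains the bound in that case.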

Thus, the minimum value for the given expression is $4^n - 1$. Equality is attained for $u = v = 1$, that is, for $b = c + 1 = a + 2$.

Also solved by Mohammed Aassila (France), Arkady Alt, Roy Barbara (Lebanon), Minh Can, Chip Curtis, Elnur Emrah (Turkey), Fejéntaláltuka Szeged Problem Group (Hungary), Marty Getz and Dixon Jones, Michael Goldenberg and Mark Kaplan, G.R.A.20 Math Problems Group (Italy), Elias Lampakis (Greece), T. L. McCoy (Taiwan), Microsoft Research Problems Group, Rolf Richberg (Germany), Heinz-Jürgen Seiffert, Nora Thornber, Paul Weisenhorn (Germany), Li Zhou, and the proposer. There were two incorrect submissions.

Answers

Solutions to the Quickies from page 324.

A953. Let $F$ be an antiderivative for $f$ and let
\[ T(x) = F(x)g(x) - \int^{x} F(t)\,dg(t). \]
Note that the Riemann-Stieltjes integral exists because $F$ is continuous and $g$ is increasing. Because $g$ is differentiable, we have
\[ T'(x) = F(x)g'(x) + g(x)F'(x) - F(x)g'(x) = f(x)g(x), \]
showing that $fg$ is the derivative of $T$.


A954. Let $[n]$ denote the set $\{1, 2, \ldots, n\}$ and $S$ denote the set of ordered pairs $(A, B)$ where $A$ is a subset of $[n]$ and $B$ is an $n$-subset of $[2n]$ that is disjoint from $A$. We can select elements for $S$ in two ways:

(1) For $0 \le k \le n$, let $Z$ be a $k$-subset of $[n]$. Then let $A = Z^c$, the complement of $Z$, which is an $(n-k)$-subset of $[n]$, and let $B$ be an $n$-subset of $\{n+1, \ldots, 2n\} \cup Z$. This yields
\[ |S| = \sum_{k=0}^{n} \binom{n}{k}\binom{n+k}{n}. \]

(2) For $0 \le k \le n$, choose a $k$-subset $B_1$ from $\{n+1, \ldots, 2n\}$ and a $k$-subset $B_2$ of $[n]$. Then form $B = B_1 \cup B_2^c$, where $B_2^c$ is the complement of $B_2$ in $[n]$, and choose $A$ from among the $2^k$ subsets of $B_2$. This leads to
\[ |S| = \sum_{k=0}^{n} \binom{n}{k}^2 2^k. \]

This completes the proof.
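The double count in A954 can be checked by brute force for small $n$. The sketch below enumerates the pairs $(A, B)$ directly and compares the total with the two sums that constructions (1) and (2) suggest, namely $\sum_k \binom{n}{k}\binom{n+k}{n}$ and $\sum_k \binom{n}{k}^2 2^k$. These closed forms are our reading of the two constructions, and the function names `count_pairs`, `count_way1`, and `count_way2` are illustrative assumptions.

```python
from itertools import combinations
from math import comb

def count_pairs(n):
    """Count pairs (A, B) with A a subset of [n], B an n-subset of [2n],
    and A disjoint from B, by direct enumeration."""
    count = 0
    for B in combinations(range(1, 2 * n + 1), n):
        b_set = set(B)
        for size in range(n + 1):
            for A in combinations(range(1, n + 1), size):
                if b_set.isdisjoint(A):
                    count += 1
    return count

def count_way1(n):
    """Construction (1): pick Z (k elements), then B inside {n+1..2n} U Z."""
    return sum(comb(n, k) * comb(n + k, n) for k in range(n + 1))

def count_way2(n):
    """Construction (2): pick B1 and B2 (k elements each), then A inside B2."""
    return sum(comb(n, k) ** 2 * 2 ** k for k in range(n + 1))

if __name__ == "__main__":
    for n in range(6):
        print(n, count_pairs(n), count_way1(n), count_way2(n))
```

The enumeration grows quickly with $n$, so it serves only as a small-case check. For $n = 1$ and $n = 2$ the common value is 3 and 13, the first central Delannoy numbers mentioned in the note that follows.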

Note. Another proof, using lattice paths, can be found in Robert A. Sulanke's article, Objects Counted by the Central Delannoy Numbers, The Journal of Integer Sequences, Vol. 6, 2003. A proof by polynomials is in Michel Bataille's paper Some Identities about an Old Combinatorial Sum, The Mathematical Gazette, March 2003, pp. 144-8. A slight change in the above proof leads to a generalization for $m \ge n$, proved by Li Zhou using lattice paths in The Mathematical Gazette, July 2004, p. 316.

To appear in The College Mathematics Journal, November 2005

Articles:

When the Pope was a Mathematician, by Leigh Atkinson
Ramanujan's Continued Fraction for a Puzzle, by Poo-Sung Park
Centers of the United States, by David Richeson
A Paper-and-Pencil gcd Algorithm for Gaussian Integers, by Sandor Szabo
How to Avoid the Inverse Secant (and Even the Secant Itself), by S. A. Fulling
Differentiability of Exponential Functions, by Philip M. Anselone and John W. Lee

Classroom Capsules:

A Non-Visual Counterexample in Elementary Geometry, by Marita Barabash
Can You Paint a Can of Paint?, by Robert M. Gethner
A Paradoxical Paint Pail?, by Mark Lynch
Differentiate Early, Differentiate Often!, by Robert Dawson
A Two-Parameter Trigonometry Series, by Xiang-Qian Chang

REVIEWS

PAUL J. CAMPBELL, Editor
Beloit College

Assistant Editor: Eric S. Rosenthal, West Orange, NJ. Articles and books are selected for this section to call attention to interesting mathematical exposition that occurs outside the mainstream of mathematics literature. Readers are invited to suggest items for review to the editors.

Gardner, Martin, Martin Gardner's Mathematical Games: The Entire Collection of His Scientific American Columns, CD-ROM for Windows and Macintosh, MAA, $55.95 ($44.95 to MAA members). ISBN 0-88385-545-3. Albers, Don, On the way to "Mathematical Games": Part I of an interview with Martin Gardner, College Mathematics Journal 36 (3) (May 2005) 178-190; "Mathematical Games" and beyond: Part II of an interview with Martin Gardner, 36 (4) (September 2005) 301-314. Jackson, Allyn, Interview with Martin Gardner, Notices of the American Mathematical Society 52 (6) (June/July 2005) 602-611.

From 1956 to 1986, Martin Gardner wrote a "Mathematical Games" column in Scientific American, which delighted readers, fostered mathematicians (for example, me), and introduced to the public such discoveries as public-key cryptography, fractals, the game of Life, and Penrose tilings. Those columns were collected into 15 books and are now available on this searchable CD of PDF files, mostly from MAA editions of the books. The package also includes a printed booklet with a biographical sketch by Peter Renz and an interview by Donald J. Albers, and accompanying photos. The interviews were conducted in 2000-2001 and in November 2004, just after Gardner turned 90.

Bewersdorff, Jörg, Luck, Logic, and White Lies: The Mathematics of Games, A K Peters, 2005; xvii + 486 pp, $49. ISBN 1-56881-210-8.

Translated (by David Kramer) from German, this book continues Martin Gardner's tradition of explaining how to play and to win at various mathematical games. Bewersdorff classifies the uncertainty in games into chance ("luck"), logic, and bluff ("white lies"), corresponding to games of chance, combinatorial games, and strategic games, along with, of course, games offering varying combinations of these elements. Bewersdorff describes and analyzes games of all these kinds, from dice games, lotteries, Monopoly, and blackjack; to Nim, Dominos, Go, Hex, backgammon, and Mastermind; and rock-paper-scissors, poker, and baccarat. He introduces probability distributions (binomial, Poisson, normal), Markov chains, minimax theory, Grundy values, complexity theory, game theory, and linear programming. The book gives up-to-date results about the games, as well as relevant citations to the literature and history.

Why computers are like the weather, New Scientist (9 July 2005) 17, http://www.newscientist.com/article.ns?id=mg18725074.600. Berry, Hugues, Daniel Gracia Perez, and Olivier Temam, Chaos in computer performance, http://www.arxiv.org/nlin.AO/0506030.

"... microchips ... Their behaviour is inherently unpredictable and chaotic." This somewhat misleading lead from New Scientist reports on a study whose authors analyzed the performance of microprocessors. During execution of certain programs, the microprocessors displayed "complex non-repetitive variations that challenge traditional analysis." The sources of the variations include randomness but also, as the authors demonstrate, deterministic chaos. However, what is under discussion here as "performance" is the rate of execution, not the results.


Peterson, Ivars, Closing the gap on twin primes, Science News (16 July 2005), http://www.sciencenews.org/articles/20050716/mathtrek.asp.

The average spacing between primes near $p$ is $\log p$, but it is known that infinitely many pairs of primes have gaps between them smaller than $(\log p)/4$. Daniel A. Goldston (San Jose State University) and Cem Y. Yildirim (Bogazi
