MARTIN DAVIS
Exponential and Trigonometric Functions From the Book
After the piecemeal manner in which we learn about the exponential and trigonometric functions, the view presented to us when we finally gain access to the standpoint of complex function theory is a revelation. It is like the panorama spread out before us after an ascent, in which various familiar landmarks are seen for the first time as part of a coherent whole. In the few weeks before writing this note, I had been entertaining myself while lying in bed during spells of insomnia by thinking through how one could use the tools of complex variable theory to develop the properties of these functions, starting from scratch. How would they be treated in Paul Erdős's "book" of optimal proofs? It all turned out to be pretty easy and, I thought, rather elegant.

What We Can Use
We have available the following tools:
• A power series defines a function analytic within its circle of convergence, and the derivative of that function can be computed by term-by-term differentiation.
• If $f'(z) = 0$ in a domain $\mathcal{D}$, then $f$ is constant in that domain. (We can see this by setting $z = x + iy$, $f(z) = u(x, y) + iv(x, y)$, and noting that the hypothesis implies $\frac{\partial u}{\partial x} = \frac{\partial u}{\partial y} = \frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} = 0$.)
• Cauchy's theorem and its corollaries. In particular, if $f$ is analytic in the simply connected domain $\mathcal{D}$, then $\int_a^b f(\zeta)\,d\zeta$ is well defined for any $a, b \in \mathcal{D}$, independent of the path joining them. Moreover, for $a \in \mathcal{D}$ the function
$$F(z) = \int_a^z f(\zeta)\,d\zeta$$
is analytic in $\mathcal{D}$ and $F'(z) = f(z)$ in $\mathcal{D}$.

On the other hand, the presence of the term $2\pi i$ in Cauchy's integral formula serves as a warning that it and its corollaries will not be available to us. In particular, we can't use the fact that an analytic function can be expanded into a power series converging to the function.
© 2003 SPRINGER-VERLAG NEW YORK. VOLUME 25. NUMBER 1. 2003
The Addition Formula
We define
$$\exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}$$
and we have at once $\exp'(z) = \exp(z)$ for all $z$, and $\exp(0) = 1$. Now let $G(z) = \exp(z + c)\exp(-z)$. Then, a simple calculation yields $G'(z) = 0$ so that $G(z) = G(0) = \exp(c)$. Thus we have proved
$$\exp(z + c)\exp(-z) = \exp(c).$$
Setting $c = 0$, we have:
$$\exp(z)\exp(-z) = 1. \qquad (1)$$
This implies that $\exp(z)$ is never equal to 0, that $\exp(-z) = \frac{1}{\exp(z)}$, and that
$$\exp(z + c) = \exp(z)\exp(c). \qquad (2)$$
As an aside, we note that the power series implies that $\exp(x) > 0$ for $x \ge 0$, and that (1) then implies the same for all real $x$. Finally, because $\exp'(x)$ is positive, we see that the exponential function is increasing for all real $x$.

The Trigonometric Functions
Writing $z = x + iy$, we have $\exp(z) = \exp(x + iy) = \exp(x)\exp(iy)$. Here $\exp(x)$ is real, and we define sine and cosine by setting
$$\exp(iy) = \cos y + i \sin y.$$
Setting $y = 0$, we get $\cos(0) = 1$, $\sin(0) = 0$. Using $\exp(iy)' = i\exp(iy) = -\sin y + i\cos y$, we have
$$(\sin y)' = \cos y; \quad (\cos y)' = -\sin y.$$
Thus, $(\sin^2 y + \cos^2 y)' = 0$, so that
$$\sin^2 y + \cos^2 y = \sin^2(0) + \cos^2(0) = 1.$$
Since
$$\exp(-iy) = \frac{1}{\exp(iy)} = \frac{1}{\cos y + i \sin y} = \frac{\cos y - i \sin y}{\cos^2 y + \sin^2 y} = \cos y - i \sin y,$$
we have
$$\cos(-y) = \cos y; \quad \sin(-y) = -\sin y.$$
Using
$$\cos(y_1 + y_2) + i \sin(y_1 + y_2) = \exp(i(y_1 + y_2)) = \exp(iy_1)\exp(iy_2) = (\cos y_1 + i \sin y_1)(\cos y_2 + i \sin y_2) = [\cos y_1 \cos y_2 - \sin y_1 \sin y_2] + i[\sin y_1 \cos y_2 + \cos y_1 \sin y_2],$$
we obtain the addition formulas
$$\cos(y_1 + y_2) = \cos y_1 \cos y_2 - \sin y_1 \sin y_2$$
$$\sin(y_1 + y_2) = \sin y_1 \cos y_2 + \cos y_1 \sin y_2.$$

Periodicity
Let the simply connected domain $\mathcal{D}$ consist of all $z = x + iy$ for which $x > 0$ or $y > 0$ (or both). That is, $\mathcal{D}$ is obtained from the complex plane by excising the quadrant on the lower left, together with its boundary. For $z \in \mathcal{D}$, we define
$$\log(z) = \int_1^z \frac{1}{\zeta}\,d\zeta.$$
As indicated above, it follows from Cauchy's theorem that this analytic function is well defined, and that $\log(z)' = 1/z$. Also $\log(1) = 0$. For $z \in \mathcal{D}$, let
$$G(z) = \frac{\exp(\log(z))}{z}.$$
Then, $G'(z) = 0$ so that $G(z) = G(1) = 1$, i.e.
$$\exp(\log(z)) = z.$$
Next we consider the defining integral for the log function taken over the path consisting of an arc of the unit circle $|z| = 1$ from 1 to $z$. For each such point $z = x + iy$, let $r = \log(z)$ so that $\exp(r) = z$. Writing $r = \rho + i\alpha$ we have $z = \exp(\rho)(\cos\alpha + i\sin\alpha)$. Thus
$$1 = |z| = \sqrt{(\exp(\rho)\cos\alpha)^2 + (\exp(\rho)\sin\alpha)^2} = \exp(\rho),$$
so that $\rho = 0$ and $\cos\alpha = x$; $\sin\alpha = y$. In other words,
$$z = \cos\alpha + i\sin\alpha = \exp(i\alpha).$$
If $s$ is the length of the circular arc from 1 to $z$, then we have
$$ds^2 = dx^2 + dy^2 = \sin^2\alpha\,d\alpha^2 + \cos^2\alpha\,d\alpha^2 = d\alpha^2.$$
Since $ds = d\alpha$ and $s = 0$ when $\alpha = 0$, we see that the parameter $\alpha$ is just the arc length. Applying these considerations to $z = i$, the length $\alpha$ is simply the length of a quadrant of the unit circle, i.e., $\pi/2$ where, as usual, $\pi$ denotes the length of the unit semi-circle. Therefore,
$$\exp(i\pi/2) = i; \quad \cos(\pi/2) = 0; \quad \sin(\pi/2) = 1.$$
From this it follows, using the addition formula, that $\exp(i\pi) = -1$ and $\exp(2\pi i) = 1$. Therefore the exponential function is periodic with period $2\pi i$. Finally,
$$\sin(x + 2\pi) = \sin x; \quad \cos(x + 2\pi) = \cos x.$$

THE MATHEMATICAL INTELLIGENCER
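The chain of identities in this note can be spot-checked numerically. The following is only an illustrative sketch, not part of the argument: the helper names (`exp_series`, `arc_point`, `log_on_arc`), the number of series terms, and the step count are all assumptions chosen for the example. It sums the defining power series, tests the addition formula (2), and approximates the defining integral of the logarithm along the unit-circle arc from 1 to $i$, using a rational parametrization of the circle so that no library exponential or trigonometric function enters the computation.

```python
import math

def exp_series(z, terms=60):
    """Partial sum of the defining power series sum_{n>=0} z^n / n!."""
    total = 0j
    term = 1 + 0j                      # z^0 / 0!
    for n in range(terms):
        total += term
        term = term * z / (n + 1)      # next term: z^(n+1) / (n+1)!
    return total

def arc_point(t):
    """Rational parametrization of the unit circle: t = 0 gives 1, t = 1 gives i."""
    return complex(1 - t * t, 2 * t) / (1 + t * t)

def log_on_arc(t_end=1.0, steps=2000):
    """Approximate the integral of d(zeta)/zeta along the unit-circle arc
    from 1 to arc_point(t_end), using chord increments over midpoints."""
    h = t_end / steps
    total = 0j
    for k in range(steps):
        z0 = arc_point(k * h)
        z1 = arc_point((k + 1) * h)
        zm = arc_point((k + 0.5) * h)
        total += (z1 - z0) / zm
    return total

if __name__ == "__main__":
    z, c = 1 + 2j, 0.5 - 1j
    # Addition formula (2): the difference should be at rounding level.
    print(abs(exp_series(z + c) - exp_series(z) * exp_series(c)))
    # log(i): real part near 0, imaginary part near the quarter-circle length.
    tau = log_on_arc()
    print(tau, math.pi / 2)
    # exp(log(i)) should be close to i, and exp(2*pi*i) close to 1.
    print(exp_series(tau))
    print(exp_series(2j * math.pi))
```

The comparison against `math.pi` is there only to confirm that the imaginary part of the integral matches the quarter-circle length, which is how the note itself defines $\pi/2$.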
AUTHOR

MARTIN DAVIS
Department of Mathematics
University of California, Berkeley
Berkeley, California 94720 USA
e-mail: [email protected]

Martin Davis was born in New York City in 1928. A math major at City College, he was particularly inspired by two of his professors: E. L. Post and B. P. Gill. His doctorate at Princeton in 1950 was in the field of mathematical logic. He is best known for his pioneering work in automated deduction and for his contributions to the solution of Hilbert's tenth problem, for which latter he was awarded the Chauvenet and Lester R. Ford Prizes by the Mathematical Association of America and the Leroy P. Steele Prize by the American Mathematical Society. Davis has been on the faculty of the Courant Institute of Mathematical Sciences of New York University since 1965, was one of the charter members of the Computer Science Department founded in 1969, and is now Professor Emeritus. Since retiring, his book The Universal Computer, written for the general public, has been published. He lives with his wife Virginia in Berkeley, California, where he is a Visiting Scholar at the University. They have two children and four grandchildren.

Thank You

Departing column editors manage to creep out the door unobserved. Jet Wimp may have thought he could leave us with the usual absence of fuss. He said in letting go of the duties of Reviews Editor simply that he was resigning from his university position and from editing the Journal of Computational and Applied Mathematics, and he would like to leave the Intelligencer too, to make more time for his book collecting and his other non-mathematical passions.

Well, enjoy retirement, Jet, but after working happily with you for eleven volumes I must say publicly, as I already wrote you, that your contributions have been great and are widely appreciated. We will struggle along without you, but please do slip us the occasional review, or word of advice!

-Editor's note
Mathematically Bent
Colin Adams, Editor

A Difficult Delivery
Colin Adams

The proof is in the pudding.

Opening a copy of The Mathematical Intelligencer you may ask yourself uneasily, "What is this anyway, a mathematical journal, or what?" Or you may ask, "Where am I?" Or even "Who am I?" This sense of disorientation is at its most acute when you open to Colin Adams's column. Relax. Breathe regularly. It's mathematical, it's a humor column, and it may even be harmless.

Column editor's address: Colin Adams, Department of Mathematics, Bronfman Science Center, Williams College, Williamstown, MA 01267 USA
e-mail: [email protected]

"Oh, my god, it hurts."
"It's okay honey. You're almost there."
"It's splitting me wide open!"
"You can do it, honey," said Jeff.
"What did I do to deserve this?," screamed Karen.
They had met in the Math Lounge while grad students. Although Karen was an algebraic geometer, and Jeff was a number theorist, it didn't seem to matter. Their love transcended the bounds of their respective mathematical specializations. But little was expected of the union. Dr. Sylvia Vittle, Karen's advisor, had urged her to reconsider. In her Austrian accent, she said, "There are lots of strong algebraic geometers out there. Look at Brogan from UCLA. Or Stigglemeyer from Brown. Why settle for a number theorist?" But Karen knew her heart, and the two were married three weeks after they both received their Ph.D.s.

One morning, four months into the marriage, as they sat at the kitchen table sipping their morning coffee, Karen cleared her throat. Jeff looked up from his morning paper, "Zero-Free Regions for Dirichlet L-Functions."

"Um, Jeff, there's something I want to tell you."

"Yes, honey, what is it?"

"Remember that night two months ago when we stayed up until 3:00 in the morning talking about jet bundles?"

He smiled provocatively. "Who could forget it?"

"Well, about three weeks later, I found myself having trouble sleeping at night."

"Yes?"

"I just couldn't seem to get some of the ideas out of my head. I was waking up in the morning feeling lousy."

"Uh huh."

"So, I guess what I am trying to say is that I think I may be with theorem."

"Oh my god," gasped Jeff as he reached for her hand across the table. "Really? You think so? How can we find out for sure?"

"Well, I have a test that Dr. Vittle gave me. Just a set of possible counterexamples. We can see if it withstands them."

"Okay, definitely, let's do it. Should I do anything?"

"No, just wait here. I can do it in the study. It shouldn't take more than a half hour."

Karen pulled the belt of her bathrobe tight, picked up a pencil and a pad of paper and marched off to the study.

After waiting impatiently for 20 minutes, Jeff knocked on the door. "Karen, are you doing okay?"

"Just a few more counterexamples to try, honey. Shouldn't take much longer."

Fifteen minutes later, Karen threw open the door. Jeff leaped up from the kitchen table. "So?"

She threw her arms around him. "It's true. I am with theorem."

"Yahoo!," said Jeff. "We're going to be published!"

The next day, they made an appointment with Dr. Vittle.

"Well, yes it is unusual, but it is not unheard of. Look at the Atiyah-Singer Index theorem. That was a product of a topologist and an analyst. But these matches are risky. I want to put you on a strict regimen of ten pages of algebraic number theory a day, say Cohen's book.

"And when it is time for the theorem to come, it is important to be ready. Have you considered taking a Lemmas class? They are good preparation mentally and physically for the big event."

Karen woke in the middle of the night in a cold sweat. Calculations raced through her head. It appeared that the kernel of the Sowklitz operator was in fact a left R-module. She grabbed Jeff's arm.

"Jeff, Jeff, wake up. I think it's here!"

Jeff leaped out of bed, already dressed. "Okay, look. Stay calm. I'll call Dr. Vittle. We can meet her down at the university. You get dressed."

Karen stopped by the side of the bed. "Oh goodness, that was a big idea. It's coming fast. We need to hurry."
They arrived at the university and raced up to the faculty lounge. Dr. Vittle was waiting for them there with several clean pads of paper.

"Here," she said to Karen, "you sit here." She turned to Jeff. "Are you going to be here through the whole process?"

"It's as much my theorem as hers," he said.

"Just like a number theorist," laughed Vittle. "You make one small contribution nine months ago, and you think you have done all the work."

"Hey, that contribution nine months ago was key. Without it there would be no theorem."

"Yes, but I don't see you in much pain right now."

"You don't like me very much, do you?"

"Don't take it personally. I don't like any number theorists. I am going to go down to my office, but I will check on you in a bit."

"What's her problem with number theorists?", asked Jeff as soon as Vittle had disappeared down the stairs.

"Don't you know? She was collaborating with Smythe, and one day he saw a talk on wavelets, the hot new thing. Dumped Dr. Vittle like a sack of old conjectures. She swore she would never get involved with another number theorist for as long as she lives."

Karen sat at the table. Jeff paced back and forth. Every once in a while she would say, "Ah, um, I think, maybe, . . . maybe, . . . oh, no not yet." Then suddenly, she screamed, "Jeff, this is it. Quick, get Dr. Vittle." Symbols spilled out on the page. It was agonizing and amazing at the same time.

Jeff flew down the stairs three at a time, and returned almost immediately with Dr. Vittle. Karen was writing furiously. Vittle peered over Karen's shoulder.

"Ah, yes, things are going well. Looks like a big one."

Karen tore page after page off the pad. Vittle turned to Jeff and pointed at one of the sheets of paper. "If you don't think it would be too difficult for you, perhaps you could clean up that lemma there."

Jeff sat down and slowly began to write. Karen was scribbling like mad, as sweat dripped off her brow. She was breathing heavily. Suddenly she tensed. "Oh, my god," she screamed. "It's huge."

"It's okay," coached Dr. Vittle. "Relax and just let it come out."

Karen tore filled sheets off the pad, one after another. She sketched diagrams and figures. Equations with subindices on superindices flowed from her pen. Then her eyes opened wide. "I see it," she gasped. "I see it all." She furiously wrote another half page worth of equations, stopped suddenly and then, awestruck, wrote QED at the bottom of the page. There was a moment of complete silence and then she collapsed on the table.

"Are you all right?," Jeff gasped, grasping her by the shoulders.

"Of course she is all right," said Dr. Vittle. "Let her rest. She has just given birth."

Karen raised her head slowly from the table, a beatific smile on her lips. "Where is it? Can I hold my theorem?"

"Of course you can," said Vittle. She scooped up the pages, pulling one from Jeff's grasp, and laid the pile in Karen's arms. Karen cradled the pages carefully.

"It's really beautiful isn't it?," she said.

Vittle nodded. "It's a healthy theorem, probably 9 to 10 pages in 12 point type. What will you name it?"

Karen looked at Jeff tentatively. "Well, we were thinking of calling it the Bounded Co-Generation theorem, but after what we have just been through, I was thinking maybe the Constrained Optimization theorem."

Jeff smiled. "I think that's a wonderful name, honey."

"Well, I would like to keep an eye on it tonight. Make sure it's immune to counterexamples," said Vittle. "And then in the morning, we will send it to the Annals of Mathematics."

"The Annals?," Jeff gasped. "Even in my wildest dreams, I didn't imagine . . . . Oh, Karen, I love you." The two hugged each other, cradling the theorem between them, and even Dr. Vittle smiled.
IGOR PAK
On Fine's Partition Theorems, Dyson, Andrews, and Missed Opportunities
History almost never works out the way you want it to, especially when you are looking at it after the dust settles. The same is true in mathematics. There are times when the solution of a problem is overlooked simply by accident, due to a combination of unfortunate circumstances. In a celebrated address [D4], Freeman Dyson described several "missed opportunities," in particular his own advance glimpse of Macdonald's eta function identities.

I present here the history of Fine's partition theorems and their combinatorial proofs. As the reader will see, many of the results could and perhaps should have been discovered a long time ago. There was a whole string of "missed opportunities."

The central event is the publication of a short note [F1] by Nathan Fine. To quote George Andrews, "[Fine] announced several elegant and intriguing partition theorems. These results were marked by their simplicity of statement and [...] by the depth of their proof." [A7] Without taking anything away from the depth and beauty of the results, I will show here that most of them have remarkably simple combinatorial proofs, in a very classical style. Perhaps that's exactly how it should be with important results! Even a reader who prefers analytic methods may find that here the combinatorial approach fits the problem well.
Fine's partition theorems can be split into two (overlapping) categories: those dealing with partitions into odd and distinct parts, à la Euler, and those dealing with Dyson's rank. I shall separate these two stories, as they have relatively little to do with each other. The fortune and misfortune, however, had the same root in both stories, as you will see.

Fine's note [F1] didn't have any proofs; not even hints on complicated analytic formulae which were used to prove the results. It was published in a National Academy of Sciences publication, in a journal devoted to all branches of science. Thus the paper was largely overlooked by subsequent investigators. The note contained a promise to have complete proofs published in a journal "devoted entirely to mathematics." This promise was never fulfilled.

Good news came from a different quarter. In the sixties, George Andrews, while a graduate student at the University of Pennsylvania, took a course of Nathan Fine on basic hypergeometric series. As he writes in his mini biography [A8], "His course was based on a manuscript he had been perfecting for a decade; it eventually became a book [F2]." In fact, the book [F2] was published only in 1988, exactly 40 years after the publication of [F1]. It indeed contained the proofs of all partition results announced in [F1]. Meanwhile, Andrews kept the manuscript and used it on many occasions before [F2] appeared. Among other things, Andrews gave new analytic proofs of many results, found connections to the works of Rogers and Ramanujan, and, what's important for the subject of this paper, gave combinatorial proofs to some of the theorems. Much of the fame Fine's long-unpublished results now have is owing to Andrews's work and persistence (see [A1-A8]).

This is where the story splits into two. The rest of this article is largely mathematical, dealing separately with each of Fine's partition theorems. To simplify the presentation, I change their order and use different notation. I conclude the discussion with Dyson's proof of Euler's Pentagonal Theorem and a few more surprises.

A few words about the notation. I denote partitions of $n$ by $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_\ell)$, and I write $\lambda \vdash n$, or $|\lambda| = n$. Let $\lambda'$ be the conjugate partition to $\lambda$. The largest part and the number of parts of $\lambda$ are denoted by $a(\lambda)$ and $\ell(\lambda)$, respectively. Every partition $\lambda$ may be represented graphically by its Young diagram $[\lambda]$; recall that one definition of $\lambda'$ is the condition that $[\lambda']$ is the transpose of $[\lambda]$. See [A3] for standard references, definitions, and details.
Partitions Into Distinct Parts and Franklin's Involution
The following result is straight from [F1]:

THEOREM 1 (Fine) Let $\mathcal{D}_n^0$ and $\mathcal{D}_n^1$ be the sets of partitions $\lambda$ of $n$ into distinct parts, such that the largest part $a(\lambda) = \lambda_1$ is even and odd, respectively. Then
$$|\mathcal{D}_n^0| - |\mathcal{D}_n^1| = \begin{cases} 1, & \text{if } n = k(3k+1)/2 \\ -1, & \text{if } n = k(3k-1)/2 \\ 0, & \text{otherwise.} \end{cases}$$

It is perhaps suggestive to compare Theorem 1 with the similar-looking Euler's Pentagonal Theorem, which can be stated as follows:

THEOREM 2 (Euler) Let $\mathcal{Q}_n^0$ and $\mathcal{Q}_n^1$ be the sets of partitions $\lambda$ of $n$ into distinct parts, such that the number of parts $\ell(\lambda) = \lambda'_1$ is even and odd, respectively. Then
$$|\mathcal{Q}_n^0| - |\mathcal{Q}_n^1| = \begin{cases} (-1)^k, & \text{if } n = k(3k \pm 1)/2 \\ 0, & \text{otherwise.} \end{cases}$$

Of course, this similarity was not overlooked. Fine himself acknowledged that Theorem 1 "bears some resemblance to the famous pentagonal theorem of Euler, but we have not been able to establish any real connection between the two theorems." In the Math. Reviews article [L], Lehmer reiterates: "This result parallels a famous theorem of Euler."

As I shall show, Theorem 1 has a proof nearly identical to the famous involutive proof by Franklin of Theorem 2. Franklin was a student of Sylvester at Johns Hopkins University, active in Sylvester's exploration of the "constructive theory of partitions." He published his proof [Fr] right before the publication of a celebrated treatise [S1] by Sylvester (to which Franklin also contributed). These two papers laid the foundations of Bijective Combinatorics, a field which blossomed in the second half of the twentieth century.

Of course, it is hard to blame Fine for not discovering the connection. In those days bijections were rarely used to prove combinatorial results. Since the late sixties, however, the method became popular again, with a large number of papers proving partition identities by means of explicit bijections. Franklin's involution was far from forgotten, and was used on many occasions to prove various refinements of Euler's Pentagonal theorem [KP], and even most recently to prove a new partition identity [C]. It is a pity that an application to Fine's theorem remained unnoticed for so many years.

Proof. Denote by $\mathcal{D}_n = \mathcal{D}_n^0 \cup \mathcal{D}_n^1$ the set of all partitions into distinct parts. Let $\lambda \in \mathcal{D}_n$, and let $[\lambda]$ be the Young diagram corresponding to $\lambda$. Denote by $s(\lambda)$ the length of the smallest part in $\lambda$, and by $b(\lambda)$ the length of a maximal sequence of subsequent parts: $a, a - 1, a - 2, \ldots$, where $a = a(\lambda) = \lambda_1$. One can view $s(\lambda)$ and $b(\lambda)$ as the lengths of the horizontal line and diagonal line of squares of $[\lambda]$, as in Figure 1. Now, if $s(\lambda) \le b(\lambda)$, move the horizontal line to attach to the diagonal line. Similarly, if $s(\lambda) > b(\lambda)$, move the diagonal line to attach below the horizontal line. If we cannot make a move, stay put. This defines Franklin's involution $\alpha: \mathcal{D}_n \to \mathcal{D}_n$.

Note that $\alpha$ changes parity of the number of parts, except when $\lambda$ is a fixed point. Observe that the only fixed points of the involution are the Young diagrams where the lines overlap, and $s(\lambda) - b(\lambda)$ is either 0 or 1 (see Figure 2). The numbers of squares in these diagrams are $m(3m \pm 1)/2$, which are called pentagonal numbers. Therefore, $|\mathcal{Q}_n^0| - |\mathcal{Q}_n^1|$ is 0 unless $n$ is a pentagonal number, and is $\pm 1$ in that case. This proves Theorem 2.

Similarly, note that $\alpha$ changes parity of the largest part. Thus again, we infer that $|\mathcal{D}_n^0| - |\mathcal{D}_n^1|$ is 0 unless $n$ is a pentagonal number, and is $\pm 1$ in that case. This completes the proof of Theorem 1. □

Figure 1. Young diagram $[\lambda]$ corresponding to a partition $\lambda = (9,8,7,6,4,3)$. Here $s(\lambda) = 3$, $b(\lambda) = 4$, and $\alpha(\lambda) = (10,9,8,6,4)$.
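Franklin's involution as described in the proof can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the article: partitions are represented as decreasing tuples, and the names `distinct_partitions`, `franklin`, `pentagonal_index`, and `signed_counts` are assumptions introduced for the example. The signed counts are computed by brute force, so they give an independent check of the right-hand sides of Theorems 1 and 2 for small $n$.

```python
def distinct_partitions(n, max_part=None):
    """Yield the partitions of n into distinct parts as decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in distinct_partitions(n - first, first - 1):
            yield (first,) + rest

def franklin(lam):
    """Franklin's involution on a nonempty partition into distinct parts."""
    parts = list(lam)
    k = len(parts)
    s = parts[-1]                      # smallest part (horizontal line)
    b = 1                              # length of the run a, a-1, a-2, ... (diagonal line)
    while b < k and parts[b] == parts[b - 1] - 1:
        b += 1
    if s <= b:
        if b == k and s == b:          # lines overlap: fixed point
            return tuple(parts)
        parts.pop()                    # remove the smallest part ...
        for i in range(s):             # ... and attach it along the diagonal
            parts[i] += 1
    else:
        if b == k and s == b + 1:      # lines overlap: fixed point
            return tuple(parts)
        for i in range(b):             # strip the diagonal ...
            parts[i] -= 1
        parts.append(b)                # ... and attach it as a new smallest part
    return tuple(parts)

def pentagonal_index(n):
    """Return (k, sign) with n = k(3k + sign)/2 and sign in {+1, -1}, else None."""
    k = 1
    while k * (3 * k - 1) // 2 <= n:
        if k * (3 * k - 1) // 2 == n:
            return k, -1
        if k * (3 * k + 1) // 2 == n:
            return k, +1
        k += 1
    return None

def signed_counts(n):
    """Return (|D_n^0| - |D_n^1|, |Q_n^0| - |Q_n^1|) by direct enumeration."""
    by_largest = by_length = 0
    for lam in distinct_partitions(n):
        by_largest += 1 if lam[0] % 2 == 0 else -1
        by_length += 1 if len(lam) % 2 == 0 else -1
    return by_largest, by_length

if __name__ == "__main__":
    print(franklin((9, 8, 7, 6, 4, 3)))   # the example of Figure 1
    for n in range(1, 16):
        print(n, signed_counts(n), pentagonal_index(n))
```

Enumerating all partitions is exponential in $n$, so the sketch is meant for small cases only.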
Figure 2. Fixed points of Franklin's involution.

Partitions Into Odd Parts and Sylvester's Bijection
Now I recall another famous theorem of Euler: that the number of partitions of $n$ into odd numbers is equal to the number of partitions of $n$ into distinct numbers. Here is another gem from [F1]:

THEOREM 3 (Fine) Let $\mathcal{O}_n^1$ and $\mathcal{O}_n^3$ be the sets of partitions $\lambda$ of $n$ into odd parts such that the largest part $a(\lambda)$ is 1 and 3 mod 4, respectively. Then
$$|\mathcal{O}_n^1| = |\mathcal{D}_n^0|, \quad |\mathcal{O}_n^3| = |\mathcal{D}_n^1|, \quad \text{if } n \text{ is even},$$
$$|\mathcal{O}_n^1| = |\mathcal{D}_n^1|, \quad |\mathcal{O}_n^3| = |\mathcal{D}_n^0|, \quad \text{if } n \text{ is odd}.$$

Clearly Fine's Theorem 3 is a refinement of Euler's theorem. As we shall see shortly, the following result of Fine [F2] is an extension:

THEOREM 4 (Fine) For any $k > 0$, the number of partitions $\mu \vdash n$ into distinct parts such that $a(\mu) = k$, is equal to the number of partitions $\lambda \vdash n$ into odd parts such that $a(\lambda) + 2\ell(\lambda) = 2k + 1$.
In his early paper [A1], Andrews proved Theorem 4 combinatorially, but never noticed that it implies Theorem 3. The reason could be that Theorem 3 was coupled with Theorem 1 in [F1], while the proofs use two different classical combinatorial arguments.

The proofs of Theorem 3 and Theorem 4 follow from Sylvester's celebrated bijection, sometimes called a fish-hook construction. This bijection is a map between partitions into odd and distinct numbers, and gives a combinatorial proof of Euler's theorem (see [A1,A3]).

Sylvester's bijection is another fixture in the combinatorics of partitions. It has been restated in many different ways (e.g., using Frobenius coordinates and 2-modular diagrams [A6,B,PP]), and was used to prove other refinements of Euler's theorem [KY]. Had Theorem 3 been better known and not omitted in [L], the following proof could have been standard.
Proof. Denote by $\mathcal{O}_n = \mathcal{O}_n^1 \cup \mathcal{O}_n^3$ the set of all partitions into odd parts. Define Sylvester's bijection (7,6,4,1).
Unfortunately, except for Andrews's paper [A5], nobody seems to have noticed that in fact Dyson's map, sometimes called Dyson's adjoint [BG], can be used to give combinatorial proofs of Fine's results. Even Andrews did not seem to realize that Dyson's map proves two other theorems of Fine as well. I return to that Andrews paper in the next section.

Define the rank of a partition $\lambda$ as $r(\lambda) = a(\lambda) - \ell(\lambda)$. Denote by $\mathcal{P}_{n,r}$ the set of partitions of $n$ with rank $r$, and let $p(n, r) = |\mathcal{P}_{n,r}|$. Similarly, denote by $\mathcal{H}_{n,r}$ ($\mathcal{G}_{n,r}$) the set of partitions of $n$ with rank at most $r$ (at least $r$). Let $h(n, r) = |\mathcal{H}_{n,r}|$, $g(n, r) = |\mathcal{G}_{n,r}|$. Clearly, $p(n, r) = h(n, r) - h(n, r - 1)$, and (by comparing $\lambda$ to $\lambda'$) $g(n, r) = h(n, -r)$. Also, $h(n, r) + g(n, r + 1) = \pi(n)$, where $\pi(n) = h(n, n) = \sum_r p(n, r)$ is the total number of partitions of $n$.

THEOREM 5 (Fine) For all $n > 0$, we have $h(n, 1 + r) = h(n + r, 1 - r)$.

Proof. I shall construct an explicit bijection $\psi_r: \mathcal{H}_{n,r+1} \to \mathcal{G}_{n+r,r-1}$, which implies the result. Start with the Young diagram $[\lambda]$ corresponding to a partition $\lambda \in \mathcal{H}_{n,r+1}$. Remove the first column with $\ell = \ell(\lambda)$ squares. Add the top row with $(\ell + r)$ squares. Let $[\mu]$ be the resulting Young diagram (see Figure 4). Call the map $\psi_r: \lambda \to \mu$ Dyson's map.

By assumption on $\lambda$, we have $r(\lambda) = a(\lambda) - \ell \le r + 1$, so $a(\mu) = \ell + r \ge a(\lambda) - 1$. Thus $\mu$ is a partition indeed. And the same inequality shows that the inverse map is defined. Clearly, $|\mu| = |\lambda| - \ell + (\ell + r) = n + r$. Also, $r(\mu) = a(\mu) - \ell(\mu) = \ell(\lambda) + r - (\lambda'_2 + 1) \ge r - 1$. Therefore, $\mu = \psi_r(\lambda) \in \mathcal{G}_{n+r,r-1}$, which completes the proof. □

Figure 4. Dyson's map $\psi_r: \lambda \to \mu$, where $\lambda = (9,7,6,6,3,1) \in \mathcal{H}_{32,r+1}$, $\mu = (8,8,6,5,5,2) \in \mathcal{G}_{32+r,r-1}$, and $r = 2$.

I call the result of Theorem 5 the Fine-Dyson relations. The rest of the paper is built upon these relations and Dyson's map. First I prove the following four equations, which are listed in [F1] as one theorem as well.

THEOREM 6 (Fine) We have:
1) $p(n + 1, 0) + p(n, 0) + 2p(n - 1, 3) = \pi(n + 1) - \pi(n)$, for $n > 1$,
2) $p(n - 1, 0) - p(n, 1) + p(n - 2, 3) - p(n - 3, 4) = 0$, for $n > 3$,
3) $p(n - 1, 1) - p(n, 0) + p(n - 1, 2) - p(n - 2, 3) = 0$, for $n > 2$,
4) $p(n - 1, r) - p(n, r + 1) + p(n - r - 2, r + 3) - p(n - r - 3, r + 4) = 0$, for $n > r + 3$.

Proof. Taking $r = 0$ and $r = -1$ in equation 4), and using $p(m, r) = p(m, -r)$, we obtain 2) and 3), respectively. By Theorem 5, and from the formula $\pi(n) = h(n, 0) + g(n, 1) = h(n, 0) + h(n, -1)$, we deduce equation 1):
$$\begin{aligned}
\pi(n + 1) - \pi(n) &= (h(n + 1, 0) + h(n + 1, -1)) - (h(n, 0) + h(n, -1)) \\
&= (h(n + 1, 0) - h(n + 1, -1)) + (h(n, 0) - h(n, -1)) + 2h(n + 1, -1) - 2h(n, 0) \\
&= p(n + 1, 0) + p(n, 0) + 2(h(n - 1, 3) - h(n - 1, 2)) \\
&= p(n + 1, 0) + p(n, 0) + 2p(n - 1, 3).
\end{aligned}$$
Equation 4) follows in a similar manner:
$$\begin{aligned}
p(n - r - 3, r + 4) - p(n - r - 2, r + 3) &= (h(n - r - 3, r + 4) - h(n - r - 3, r + 3)) - (h(n - r - 2, r + 3) - h(n - r - 2, r + 2)) \\
&= h(n, -r - 2) - h(n - 1, -r - 1) - h(n, -r - 1) + h(n - 1, -r) \\
&= -(h(n, -r - 1) - h(n, -r - 2)) + (h(n - 1, -r) - h(n - 1, -r - 1)) \\
&= -p(n, -r - 1) + p(n - 1, -r) = -p(n, r + 1) + p(n - 1, r).
\end{aligned}$$
This completes the proof. □

Since the equations in Theorem 6 follow immediately from the Fine-Dyson relations, one can obtain combinatorial proofs for these equations as well, by separating terms with positive and negative signs and then using Dyson's map to obtain identical sets of partitions on both sides. I present a variation on such a proof in case of another theorem from [F1].

THEOREM 7 (Fine) For $r \ge n - 3$, we have $\pi(n) - \pi(n - 1) = p(n + r + 1, r)$.

Proof. Denote by $\mathcal{F}_n$ the set of partitions $\lambda \vdash n$ with the smallest part $s(\lambda) \ge 2$. Observe that $|\mathcal{F}_n| = \pi(n) - \pi(n - 1)$. Indeed, one can always add a part (1) to every partition $\nu \vdash n - 1$ to obtain all partitions of $n$, except for those in $\mathcal{F}_n$. Now, to a partition $\lambda \in \mathcal{F}_n$ apply Dyson's map $\psi_{r+1}: \lambda \to \mu$, corresponding to the rank $(r + 1)$. We have $\mu_1 = 1 + \ell(\lambda) + r \ge 2 + (n - 3) = n - 1$. On the other hand, $\mu_2 = \lambda_1 - 1 \le n - 1$ by construction. Therefore $\mu_1 \ge \mu_2$, and $\mu$ is a partition indeed. Because $s(\lambda) > 1$, we know that $\ell(\mu) = \ell(\lambda) + 1$. Thus $r(\mu) = (\ell(\lambda) + r + 1) - (\ell(\lambda) + 1) = r$, so $\mu \in \mathcal{P}_{n+r+1,r}$. Since the map is clearly reversible, we obtain the result. □

The Iterated Dyson's Map

As mentioned before, Andrews in [A5] proved combinatorially the following theorem from [F1]:

THEOREM 8 (Fine) Let $\mathcal{D}_{n,r}$ be the set of partitions $\mu \in \mathcal{D}_n$ with rank $r(\mu) = r$. Let $\mathcal{U}_{n,2k+1}$ be the set of partitions $\lambda \in \mathcal{O}_n$, such that the largest part $a(\lambda) = 2k + 1$. Then:
$$|\mathcal{U}_{n,2r+1}| = |\mathcal{D}_{n,2r+1}| + |\mathcal{D}_{n,2r}|.$$
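The rank statistics behind Theorems 5 through 8 can be tabulated directly for small $n$, which gives a quick sanity check of the identities as stated here. The sketch below is a brute-force enumeration, not Dyson's bijection; the function names (`rank_table`, `p`, `h`, `pi`, and the `theorem_*_holds` helpers) are assumptions made for this example, chosen to mirror the notation $p(n, r)$, $h(n, r)$, $\pi(n)$. The checks are restricted to index ranges where every partitioned integer is positive, so that no convention for the empty partition is needed beyond $\pi(0) = 1$.

```python
from collections import Counter
from functools import lru_cache

def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def rank(lam):
    """Dyson's rank: largest part minus number of parts."""
    return lam[0] - len(lam)

@lru_cache(maxsize=None)
def rank_table(n):
    """Counter mapping each rank r to p(n, r), for n >= 1."""
    return Counter(rank(lam) for lam in partitions(n))

def p(n, r):
    return rank_table(n)[r] if n >= 1 else 0

def h(n, r):
    if n < 1:
        return 0
    return sum(count for k, count in rank_table(n).items() if k <= r)

def pi(n):
    return sum(rank_table(n).values()) if n >= 1 else 1   # pi(0) = 1

def theorem_5_holds(n, r):
    return h(n, 1 + r) == h(n + r, 1 - r)

def theorem_6_eq1_holds(n):
    return p(n + 1, 0) + p(n, 0) + 2 * p(n - 1, 3) == pi(n + 1) - pi(n)

def theorem_6_eq4_holds(n, r):
    return (p(n - 1, r) - p(n, r + 1)
            + p(n - r - 2, r + 3) - p(n - r - 3, r + 4)) == 0

def theorem_7_holds(n, r):
    return pi(n) - pi(n - 1) == p(n + r + 1, r)

def theorem_8_holds(n, r):
    odd = sum(1 for lam in partitions(n)
              if lam[0] == 2 * r + 1 and all(x % 2 == 1 for x in lam))
    distinct = sum(1 for mu in partitions(n)
                   if len(set(mu)) == len(mu) and rank(mu) in (2 * r, 2 * r + 1))
    return odd == distinct

if __name__ == "__main__":
    print(all(theorem_5_holds(n, r) for n in range(1, 9) for r in range(1 - n, 9)))
    print(all(theorem_8_holds(n, r) for n in range(1, 13) for r in range(0, 7)))
```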
One can view Theorem 8 as another refinement of Euler's theorem on partitions into odd and distinct parts. Andrews showed in [A5] that the theorem follows easily
VOLUME 25. NUMBER 1 . 2003
13
o-
IIIII-O±J1-
Figure 5. The iterated Dyson's map (:A---> J.L, where A = (5,5,3,3,1)
E
from the properties of Dyson's map !fi,.. It is unfortunate that Andrews's proof was published in a little-known jour nal and was never studied further. I will now present a di rect bijection between ()n and CZJJ, which is different from Sylvester's and Glaisher's bijections [A3], and which proves Theorem 8. Naturally, this construction is moti vated by [A5]. Let A= CA1, Az, ... , At) E ()n be a partition into odd parts. Consider a sequence of partitions vl, v 2, ..., ve, such that ve= (Ac), and vi is obtained by applying Dyson's map !fiA; to vi+I.Now let p.,= v1. Call the resulting map (: A� p., the iterated Dyson's m ap . See Figure 5 for an example.
l=Ef:Jitii
II I I
�1117,5 and J.L = (8,6,2, 1) E ':£17,4•
THEOREM 9 The iterated Dyson's map $\zeta$ defined above is a bijection between $\mathcal{O}_n$ and $\mathcal{D}_n$. Moreover, $\zeta(\mathcal{U}_{n,2r+1}) = \mathcal{D}_{n,2r} \cup \mathcal{D}_{n,2r+1}$, for all $r \ge 0$.

Clearly, Theorem 9 implies Theorem 8. It would be interesting to find further applications of the map $\zeta$ to other partition theorems.

Proof. First, note that $|\nu^i| = \lambda_i + \lambda_{i+1} + \cdots + \lambda_\ell$. Therefore $|\mu| = |\nu^1| = |\lambda| = n$, as required. Let us prove by induction that $\nu^i$ is a partition into distinct parts, such that $r(\nu^i)$ is either $\lambda_i$ or $\lambda_i - 1$. The base of induction, when $i = \ell$ and $\nu^\ell = (\lambda_\ell)$, is obvious. Suppose the claim holds for $\nu^{i+1}$, i.e., $a(\nu^{i+1}) - \ell(\nu^{i+1})$ is either $\lambda_{i+1}$ or $\lambda_{i+1} - 1$, depending on the parity. Since $a(\nu^i) = \ell(\nu^{i+1}) + \lambda_i$, we have

$$(\nu^i)_1 = a(\nu^i) \ge \bigl(a(\nu^{i+1}) - \lambda_{i+1}\bigr) + \lambda_i > a(\nu^{i+1}) - 1 = (\nu^i)_2,$$

and this inequality is maintained ($(\nu^i)_2 - (\nu^i)_3 = (\nu^{i+1})_1 - (\nu^{i+1})_2 > 0$, and so on); this implies that $\nu^i$ is indeed a partition into distinct parts. Now, observe that $\ell(\nu^i) = \ell(\nu^{i+1})$ or $\ell(\nu^{i+1}) + 1$. We have

$$r(\nu^i) = a(\nu^i) - \ell(\nu^i) = \bigl(\ell(\nu^{i+1}) + \lambda_i\bigr) - \ell(\nu^i) \in \{\lambda_i, \lambda_i - 1\},$$

which proves the induction step.

Note that we never used the fact that $\lambda \in \mathcal{O}_n$. This becomes important in the construction of the inverse map $\zeta^{-1}$. Define the map $\zeta^{-1}$ by induction, starting with $\mu = \nu^1$ and applying the inverses of Dyson's maps $\psi_r^{-1}$. Clearly, the only freedom in the construction comes from the choice of r. But we need to have $r = a(\nu^i) - \ell(\nu^i)$ or $r = a(\nu^i) - \ell(\nu^i) + 1$, and r has to be odd; this makes the choice of r unique. Therefore the map $\zeta^{-1}$ is well defined, and $\zeta$ is a bijection. The second part of the theorem is immediate from the arguments above. This completes the proof. □

Dyson's Proof of Euler's Pentagonal Theorem

I already mentioned that Dyson used his map to obtain a simple proof of Euler's Pentagonal Theorem, Theorem 2 above. He writes, "This combinatorial derivation of Euler's formula is less direct but perhaps more illuminating, than the well-known combinatorial proof of Franklin." [D3] Twenty years later he adds, "This derivation is the only one I know that explains why the 3 appears in Euler's formula." [D5] Here is how Dyson's proof goes. Let $P(t)$ and $G_r(t)$ be the generating functions for all partitions of n, and all partitions of n with rank $\ge r$:

$$G_r(t) = \sum_{n=1}^{\infty} g(n, r)\, t^n, \qquad P(t) = 1 + \sum_{n=1}^{\infty} \pi(n)\, t^n = \prod_{i=1}^{\infty} \frac{1}{1 - t^i}.$$

Write the relations $h(n, r) + g(n, r + 1) = \pi(n)$ and the Fine-Dyson relations $h(n, 1 + r) = h(n + r, 1 - r)$ in terms of $g(\cdot)$ alone:
$$g(n, r) + g(n, 1 - r) = \pi(n), \qquad g(n, r) = g(n - r - 1, -2 - r).$$

In the language of generating functions, these relations imply the following two equations:

$$1 + G_r(t) + G_{1-r}(t) = P(t), \qquad G_r(t) = t^{r+1}\bigl(1 + G_{-2-r}(t)\bigr).$$

Here 1 in both equations comes from taking into account the "empty" partition. Thus we have

$$G_r(t) = t^{r+1} P(t) - t^{r+1} G_{r+3}(t).$$
Iterating the above equation, we obtain:

$$\begin{aligned}
G_r(t) &= t^{r+1} P(t) - t^{r+1} G_{r+3}(t) \\
&= t^{r+1} P(t) - t^{2r+5} P(t) + t^{2r+5} G_{r+6}(t) \\
&= t^{r+1} P(t) - t^{2r+5} P(t) + t^{3r+12} P(t) - t^{3r+12} G_{r+9}(t) \\
&= \cdots = \sum_{m=1}^{\infty} (-1)^{m-1}\, t^{\frac{m(3m-1)}{2} + rm}\, P(t).
\end{aligned}$$
Substituting this into $P(t) - G_0(t) - G_1(t) = 1$, we deduce Euler's Pentagonal Theorem:

$$P(t) \left( 1 + \sum_{m=1}^{\infty} (-1)^m\, t^{\frac{m(3m-1)}{2}} + \sum_{m=1}^{\infty} (-1)^m\, t^{\frac{m(3m+1)}{2}} \right) = 1.$$
Dividing both sides by the product $P(t)$ and equating the coefficients gives us Theorem 2.

In fact, Euler [E] was interested in the recurrence relation for the number of partitions $\pi(n)$. The above formula implies

$$\pi(n) = \pi(n - 1) + \pi(n - 2) - \pi(n - 5) - \pi(n - 7) + \pi(n - 12) + \pi(n - 15) - \cdots$$
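Euler's recurrence gives a quick way to tabulate $\pi(n)$. A minimal sketch, assuming the usual conventions $\pi(0) = 1$ and $\pi(k) = 0$ for $k < 0$ (the function name is illustrative):

```python
def partition_numbers(limit):
    """Return [pi(0), ..., pi(limit)] computed with Euler's pentagonal recurrence."""
    pi = [1] + [0] * limit  # pi(0) = 1 accounts for the empty partition
    for n in range(1, limit + 1):
        total = 0
        m = 1
        while m * (3 * m - 1) // 2 <= n:
            sign = 1 if m % 2 == 1 else -1
            # generalized pentagonal numbers m(3m-1)/2 and m(3m+1)/2
            total += sign * pi[n - m * (3 * m - 1) // 2]
            second = m * (3 * m + 1) // 2
            if second <= n:
                total += sign * pi[n - second]
            m += 1
        pi[n] = total
    return pi
```

Each value needs only about $\sqrt{n}$ earlier terms, which is why the recurrence was of practical interest for computing tables of $\pi(n)$.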
By analogy, Dyson [D5] obtained the following refinement of Euler's recurrence:

$$g(n, r) = \pi(n - r - 1) - \pi(n - 2r - 5) + \pi(n - 3r - 12) - \cdots$$
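Dyson's refinement can be compared against a direct count for small n. In this sketch the general term is taken to be $(-1)^{m-1}\pi(n - mr - m(3m-1)/2)$, as read off from the expansion of $G_r(t)$ above, with the conventions $\pi(0) = 1$ and $\pi(k) = 0$ for $k < 0$; the helper names are illustrative.

```python
def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing tuple of positive integers."""
    if n == 0:
        yield ()
        return
    if max_part is None or max_part > n:
        max_part = n
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def g_brute(n, r):
    """Count partitions of n >= 1 whose rank (largest part minus length) is >= r."""
    return sum(1 for p in partitions(n) if p[0] - len(p) >= r)


def g_recurrence(n, r):
    """Evaluate the alternating sum of pi(n - m*r - m(3m-1)/2) over m >= 1."""
    def pi(k):
        return sum(1 for _ in partitions(k)) if k >= 0 else 0

    total = 0
    # beyond m = n + |r| + 2 the argument of pi is always negative
    for m in range(1, n + abs(r) + 3):
        k = n - m * r - m * (3 * m - 1) // 2
        if k >= 0:
            total += (-1) ** (m - 1) * pi(k)
    return total
```

For example, `g_brute(5, 0)` and `g_recurrence(5, 0)` both count the four partitions of 5 with non-negative rank, the latter as $\pi(4) - \pi(0) = 5 - 1$.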
Naturally, one is tempted to convert the above simple analytic proof into a bijective proof of both recurrences. This turns out to be possible. Denote by $\mathcal{P}_n$ the set of all partitions of n. Write Dyson's recurrence as follows:

$$\begin{aligned}
|\mathcal{H}_{n,-r}| &= |\mathcal{P}_{n-r-1}| - |\mathcal{P}_{n-2r-5}| + |\mathcal{P}_{n-3r-12}| - \cdots \\
&= |\mathcal{G}_{n-r-1,-r-2} \cup \mathcal{H}_{n-r-1,-r-3}| - |\mathcal{G}_{n-2r-5,-r-5} \cup \mathcal{H}_{n-2r-5,-r-6}| + |\mathcal{G}_{n-3r-12,-r-8} \cup \mathcal{H}_{n-3r-12,-r-9}| - \cdots \\
&= |\mathcal{G}_{n-r-1,-r-2}| + \bigl(|\mathcal{H}_{n-r-1,-r-3}| - |\mathcal{G}_{n-2r-5,-r-5}|\bigr) - \bigl(|\mathcal{H}_{n-2r-5,-r-6}| - |\mathcal{G}_{n-3r-12,-r-8}|\bigr) + \cdots
\end{aligned}$$

Now Dyson's map $\psi_{-r-1}$ gives a bijection between the left hand side and the first term on the right hand side of the equation. Similarly, maps $\psi_{-r-4}$, $\psi_{-r-7}$, etc., give bijections for the terms in the brackets. Thus we have a simple bijective proof of Dyson's recurrence. One can view the above bijection as a sign-reversing involution on the set of partitions $\lambda \in \mathcal{H}_{n,-r}$ or $\lambda \vdash n - rm - m(3m - 1)/2$, where $m \ge 1$.
Similarly, after combining two involutions for r = 0 and 1, we easily obtain an involution $\gamma$ proving Euler's recurrence:

$$\gamma: \bigcup_{m \text{ even}} \mathcal{P}_{n - m(3m-1)/2} \longrightarrow \bigcup_{m \text{ odd}} \mathcal{P}_{n - m(3m-1)/2},$$

where m on both sides is allowed to take negative integer values, and the map $\gamma$ (see Figure 6) is defined by the following rule:

$$\gamma(\lambda) = \begin{cases} \psi_{-3m-1}(\lambda) & \text{if } r(\lambda) + 3m \le 0, \\ \psi_{-3m+2}^{-1}(\lambda) & \text{if } r(\lambda) + 3m > 0. \end{cases}$$

Now comes a final surprise. Bijection $\gamma$ is in fact well known! In this exact form it was discovered in 1985 by Bressoud and Zeilberger [BZ], for the sole purpose of finding a simple proof of Euler's recurrence. The authors, completely unaware of Dyson's proof, managed to rediscover a version of Dyson's map anyway. It seems the Fine-Dyson relations and Dyson's map $\psi_r$ are simply so fundamental that they resurface despite the "missed opportunities" . . .

Final Remarks
There remains one last partition theorem of Fine [F1] without a simple combinatorial proof. Let $L(n)$ be the number of partitions $\lambda \vdash n$ with odd smallest part $s(\lambda)$. The theorem states that $L(n)$ is odd if and only if n is a square. Shouldn't one look for an involution proving this result? Any interested reader may draw inspiration from an involutive proof of the Rogers-Fine identity [A2].

The conditions in Fine's theorems 6 and 7 are slightly changed in this paper, either to correct or simplify the results (so as not to define $p(n, r)$ for $n \le 0$). Dyson's map as defined here is the conjugate of the one in the literature. I find this version somewhat easier to work with.

The iterated Dyson's map $\zeta$ appears to be new. It is basically a recursive application of Andrews's recurrence relation for $|\mathcal{U}_{n,2k+1}|$ and $|\mathcal{D}_{n,r}|$ (see [A5]). Whether this bijection between partitions into odd and distinct numbers has other nice applications or not, the map $\zeta$ seems to give a natural proof of Theorem 8, just as Dyson's map gives a natural proof of Theorem 5. Unfortunately, the iterative construction of $\zeta$ is perhaps intrinsic. As Xavier Viennot once told me, "Sometimes, a recursive bijection is the only one possible and one cannot do better."

Note that Dyson's proof of Euler's recurrence relation [D3] produces a bijection almost immediately once one employs Dyson's map. A different bijection, based on Franklin's involution, was obtained by means of the involution principle by Garsia and Milne [GM]. These two "automatic" approaches challenge Sylvester's paradigm that bijections "should rather be regarded as something put into the two systems by the human intelligence than an absolute property inherent in the relation between the two [sets]" [S2].

Figure 6. Bijection $\gamma$ proving Euler's Pentagonal Theorem.

In a recent paper [BG], Berkovich and Garvan defined a 2-modular version of Dyson's map. They used this new map to give a combinatorial proof of Gauss's famous identity. It would be interesting to convert this proof into a fully bijective proof of the identity and compare with Andrews's involutive proof [A2]. Similarly, one can consider an iterated version of this map and try to find new partition theorems this construction may prove.

This paper was motivated in part by the following quote: "[A5] seems to be the only known application of Dyson's transformation" [BG]. Let me add that had the preprint [BG] never been put on the internet, this paper might have never been written. That would have been another "missed opportunity" . . .
Acknowledgments

I am very grateful to Oliver Atkin for sharing invaluable historical comments on the discovery and the proof of Dyson's conjectures. I also thank Michael Noga of the MIT Science Library for obtaining a copy of [A5] not available in the libraries of the Commonwealth of Massachusetts. The author was partially supported by the NSA and the NSF.

REFERENCES

[A1] G. E. Andrews, On basic hypergeometric series, mock theta functions, and partitions (II), Quart. J. Math. 17 (1966), 132-143.
[A2] G. E. Andrews, Two theorems of Gauss and allied identities proved arithmetically, Pacific J. Math. 41 (1972), 563-578.
[A3] G. E. Andrews, The Theory of Partitions, Addison-Wesley, Reading, MA, 1976.
[A4] G. E. Andrews, Ramanujan's "Lost" Notebook. I. Partial θ-Functions, Adv. Math. 41 (1981), 137-172.
[A5] G. E. Andrews, On a partition theorem of N. J. Fine, J. Nat. Acad. Math. India 1 (1983), 105-107.
[A6] G. E. Andrews, Use and extension of Frobenius' representation of partitions, in "Enumeration and design", Academic Press, Toronto, ON, 1984, 51-65.
[A7] G. E. Andrews, Foreword to [F2], 1988.
[A8] G. E. Andrews, Some debts I owe, Sem. Lothar. Combin. 42 (1999), Art. B42a.
[At] A. O. L. Atkin, personal communication, 2002.
[AS] A. O. L. Atkin, H. P. F. Swinnerton-Dyer, Some Properties of Partitions, Proc. London Math. Soc. (3) 4 (1954), 84-106.
[BG] A. Berkovich, F. G. Garvan, Some Observations on Dyson's New Symmetries of Partitions, preprint (2002), available from http://www.math.ufl.edu/~frank.
[B] C. Bessenrodt, A bijection for Lebesgue's partition identity in the spirit of Sylvester, Discrete Math. 132 (1994), 1-10.
[BZ] D. M. Bressoud, D. Zeilberger, Bijecting Euler's Partitions-Recurrence, Amer. Math. Monthly 92 (1985), 54-55.
[C] R. Chapman, Franklin's argument proves an identity of Zagier, El. J. Comb. 7 (2000), RP 54.
[D1] F. J. Dyson, Some Guesses in The Theory of Partitions, Eureka (Cambridge) 8 (1944), 10-15.
[D2] F. J. Dyson, Problems for Solution: 4261, Amer. Math. Monthly 54 (1947), 418.
[D3] F. J. Dyson, A new symmetry of partitions, J. Combin. Theory 7 (1969), 56-61.
[D4] F. J. Dyson, Missed opportunities, Bull. Amer. Math. Soc. 78 (1972), 635-652.
[D5] F. J. Dyson, A walk through Ramanujan's garden, in Ramanujan revisited, Academic Press, Boston, 1988, 7-28.
[D6] F. J. Dyson, Mappings and Symmetries of Partitions, J. Combin. Theory Ser. A 51 (1989), 169-180.
[E] L. Euler, Introductio in analysin infinitorum. Tomus primus, Marcum-Michaelem Bousquet, Lausannae, 1748.
[F1] N. J. Fine, Some new results on partitions, Proc. Nat. Acad. Sci. USA 34 (1948), 616-618.
[F2] N. J. Fine, Basic hypergeometric series and applications, AMS, Providence, RI, 1988.
[Fr] F. Franklin, Sur le développement du produit infini (1 - x)(1 - x^2)(1 - x^3) . . . , C. R. Acad. Paris Ser. A 92 (1881), 448-450.
[GM] A. M. Garsia, S. C. Milne, A Rogers-Ramanujan bijection, J. Combin. Theory Ser. A 31 (1981), 289-339.
[KY] D. Kim, A. J. Yee, A note on partitions into distinct parts and odd parts, Ramanujan J. 3 (1999), 227-231.
[KP] D. E. Knuth, M. S. Paterson, Identities from partition involutions, Fibonacci Quart. 16 (1978), 198-212.
[L] D. H. Lehmer, Math. Reviews 10,356d, and Errata 10,856 (on paper [F1]).
[PP] I. Pak, A. Postnikov, A generalization of Sylvester's identity, Discrete Math. 178 (1998), 277-281.
[S1] J. J. Sylvester, with insertions by F. Franklin, A constructive theory of partitions, arranged in three acts, an interact and an exodion, Amer. J. Math. 5 (1882), 251-330.
[S2] J. J. Sylvester, On a new Theorem in Partitions, Johns Hopkins University Circular 2:22 (April 1883), 42-43.
[W] G. N. Watson, The final problem: An account of the mock theta functions, J. London Math. Soc. 11 (1936), 55-80.

AUTHOR

IGOR PAK
Department of Mathematics
Massachusetts Institute of Technology
Cambridge, MA 02139-4307
e-mail: [email protected]

Igor Pak was born in Moscow and educated at Moscow University and at Harvard (Ph.D. 1997). After several temporary appointments, he had the good fortune to become Assistant Professor at MIT. In his free time he enjoys driving, bicycling, and roller blading. He is keenly conscious of his indebtedness to the inventor of the wheel, and he hopes the future holds further developments in this direction.
Marjorie Senechal, Editor

This column is a forum for discussion of mathematical communities throughout the world, and through all time. Our definition of "mathematical community" is the broadest. We include "schools" of mathematics, circles of correspondence, mathematical societies, student organizations, and informal communities of cardinality greater than one. What we say about the communities is just as unrestricted. We welcome contributions from mathematicians of all kinds and in all places, and also from scientists, historians, anthropologists, and others.

Please send all submissions to the Mathematical Communities Editor, Marjorie Senechal, Department of Mathematics, Smith College, Northampton, MA 01063 USA
e-mail: [email protected] .edu

Undergraduate Training Revisited: Thoughts on an Unusual Reunion
by Marjorie Batchelor

Much though I admired the Intelligencer account of the MASS programme (vol. 24, no. 4, 50-56), grafting the best aspects of the formidable tradition of the Russian school of mathematics into the framework of US undergraduate mathematics training, and much though I hope that students from my own alma mater will number amongst the participants in MASS courses, I was left thinking: yes, this is a fine way to move into mathematics, but is it the unique best way? A fortuitous event had led me to look back at my own undergraduate training, which was very different.

In October 2000 I received an invitation to return to my undergraduate school, Smith College, in Northampton, Massachusetts, to give a talk at the first Smith College Alumnae Mathematics Conference. It seemed a good idea at the time; I responded positively, with no greater expectations of the weekend than an opportunity to return for the first time in over twenty-five years to a place I remembered fondly. I certainly had no inkling that the effect of that weekend in April would be a comprehensive rewriting of my attitudes about suitable undergraduate training for mathematicians.
Smith College: A Promising Starting Ground for Working Mathematicians?
Those of us who were brought together at the Conference had had an undergraduate experience quite unlike that at large universities or technical schools. That different experience which we shared, albeit at various periods during the last thirty years, is our common source, the defining property of our community. It appears to have given us all an approach to mathematics which has proved durable enough to see us through difficult bits, flexible enough to allow us to adapt to a wide variety of careers, and in its own way, solid enough to enable us to make our way in a competitive field. This is why it seems important to write about it.

So what is Smith? Smith College is a liberal arts college in western Massachusetts. The liberal arts degree course presents a considerable contrast to traditional undergraduate mathematics courses offered at UK universities, for example. British maths students expect to apply the great majority of their academic effort to mathematics. Their counterparts at a liberal arts college will be required to spend at least half their time studying courses (well) outside mathematics. Thus I studied, in my time, French poetry, history, Old English (Beowulf, all of it), chemistry, and Latin, amongst many other subjects. Whereas British students are expected to enter university with a substantial background in mathematics, in liberal arts colleges rank beginners are energetically encouraged to try mathematics (you might like it), and the possibility for proceeding from humble beginnings to a degree in mathematics is always available. Indeed I was exactly one such student, who signed up for a first course in calculus, expecting it to be my last. The wonderful thing about liberal arts courses is that surprising things can and do happen. A student expected to make a serious study of French literature can discover algebra instead.

Smith, like many liberal arts colleges, is predominantly a teaching institution, and not a research institution. Like many others, it has a relatively small student body, and a correspondingly small faculty. The corollary to the "try it, you might like it" policy at Smith has been that while there has been a good number of students majoring in mathematics, these students differ markedly in interests, ability, and ambitions. The student with the potential to become a working mathematician remains a rarity. The variety, level, and presentation of the courses offered obviously must aim to suit the demands of the majority of the students.
Smith College is a women's college. Does this make any difference? Yes, and it possibly matters. The women-only setting affects the style of teaching, the patterns of collaboration between students and between students and staff, and probably also the proportion of students with intentions of continuing in mathematics.

From this background our "community" dispersed, at various times, to follow varied lives in mathematics, to return in April 2001. This is the story of how the reunion came about, and what we discovered over that weekend.

The Idea of the Weekend

Happenstance brought four Smith alumnae and one (male) faculty member together at a maths conference. Women mathematicians are not so few in number today that four women at a conference is noteworthy, but here were four from the same undergraduate college, and one hardly widely recognised as a leader in supplying the world with mathematicians.

At what point did the organisers decide that talking over old times was such fun that they wanted to invite all Smith alums with PhDs in mathematics to come back for a weekend at Smith? They began counting. "All" turned out to be not so small a number: it surprised them, indeed, as it surprised me: there are about thirty-five on the list.

I have no idea how many the organisers imagined might actually make the trip back. While not accustomed to responding to reunion summonses, I found this invitation to return compelling. These people, if any, were my colleagues. Although I knew only one other Smith mathematician, knowing where they started from, and how tough the path from there through a Ph.D. in mathematics and beyond can be, I was certain that my colleagues would have stories worth listening to.

I went. So did about twenty-five of the thirty-five of us known to be working in maths. It was clear that others felt similarly drawn. Together with faculty, present students, and other interested parties, about fifty people gathered for the conference. We certainly did have stories to tell.
We discovered that we had each gotten from our perhaps unpretentious undergraduate mathematics experience something that well enabled us to meet the challenges that maths graduate students face. It began to appear, through the common themes in our stories, that perhaps we survived not so much in spite of our unusual undergraduate experience, but because of it. That perhaps our ability to adapt to non-standard paths through study and employment, and our consequent satisfaction in employment, was also a strength gained through our particular undergraduate experience.

If this is so, then it should be shouted out loud: at schools, to encourage students to consider this as a route through to the position of working mathematician, but also to the mathematical community at large. This route works. It offers great flexibility, and its graduates show resilience, durability, adaptability, and an uncommon willingness to communicate.

The Mechanics of the Weekend: The Making of a Community
The plans for the weekend were simple. Social gatherings on the Friday night. Sessions of twenty-minute mathematics talks throughout Saturday. "Round-table" discussion on women's issues in the mathematical community; a conference dinner. Further talks on the Sunday, including a panel discussion on employment in computer science; a brunch; then another session of talks before we went our various ways.

It is a stiff task, to explain, to a roomful of people with only basic undergraduate mathematics in common, the work that is of interest to you, in no more than twenty minutes. The subject matter ranged from L-functions, to how best to design and run distance learning courses, to the launching of satellites, with plenty of instalments of algebra, graph theory, and operations research in between. The quality of the talks was the startling bit. Without exception the speakers demonstrated a desire to communicate and the ability to do a good job of it! Inevitably I recall most vividly those talks closest to my own field, but I remember with particular pleasure a twenty-minute invitation to vertex algebras, which included a concise and appealing summary of Lie algebras and their representations in the first sixteen minutes!

The next surprise was the round-table discussion. Such sessions offer an opportunity to those in the grip of oppression to cry out against the injustices, to stir the sisters to arms against prejudice both deliberate and thoughtless. Not this time. I heard no crying. I did hear lots of laughter. The horror stories were there, all right. The ladies assembled had not been magically shielded from the common worries: two-body problems, combining children and careers, and insensitive colleagues. The tales were, however, of survivors, cheerful ones at that, who could find amusement in the absurd injustices; definitely not victims.

The twenty-minute talks turned out to be an absolutely splendid method of introductions. What more essential could you know about someone than that she has a strong enthusiasm for launching rockets, or a passion for colouring graphs? By Sunday this group of people who had had only a paper connection, that of having passed through the same place at various times, began to have a sense of a community. As particles of cosmic dust of sufficient quantities coalesce to form a star, by the time we left, we were a community.

My Story
If any of us had been asked before the weekend, "How was it that you came to be a mathematician?" I suspect the most frequent answer would have been some equivalent of "It was a complete fluke." Certainly it seemed so in my case: I was good at French in high school, it was assumed I would be a French student. But then I took calculus, with Michael Gemignani, decided to ask questions, and never stopped.

So many flukes make no fluke at all. It cannot be coincidence that we survived, even flourished. There had to be something in our undergraduate experience that gave us a solid start for our careers.

Smith, in my day, was not widely regarded as a suitable college for a technically minded young woman. (I was not regarded as a technically minded young woman.) Nor was it, by objective accounting, a suitable place for them. The breadth and depth of mathematical knowledge required for a Smith honours degree fell woefully short of that required of first-year graduate students. I spent much of my first year in graduate school borrowing notes from other students' undergraduate courses, trying to make up the background to understand the first-year lectures. It was awful to keep hearing, "What, you don't know that?"

It does not matter. Hang what I failed to learn at Smith! It is clear that what I did get from Smith was far more valuable. I learned that I liked mathematics, that I could do mathematics. I had the experience that all doors were open to me: every faculty member gave the impression that he or she had any amount of time available to answer my questions. I had the experience of doing mathematics, not just learning it. I learned to talk mathematics both with my classmates and with my teachers. I developed the taste for thinking mathematics as a way of life, not just a course of study, or a career. Through talking, asking questions, doing maths I gained a confidence that I could do maths: perhaps I didn't know much, but that could be changed. I could work with what I knew, I liked the work, and I wanted to do it enough to take whatever steps were necessary to learn the trade.
The weekend showed that I am not alone.
The most important experience was finally being able to feel free to express myself without worrying that I was going to say something really stupid and embarrass myself (this was after say, my 1st year), and consequently developing my own voice. Just getting practice speaking out. I now speak out freely. I have a feeling that I make some professors uncomfortable and amaze other female students. The interactions in classrooms at Smith have validated for me the fact that I have good things to say, so now I know that what I have the urge to say is probably worth saying.
-Diane Christine Jam (98, Rice University)

What I learned at the weekend is that my experience seems also to have been the experience of the others. This is borne out in the e-mails and comments that have come back to me. It would appear that this is what matters. Perhaps we were not towed so far across the sea of mathematical knowledge as our contemporaries at more technically oriented universities, but we were equipped with jolly good paddles! Moreover, we discovered we were not alone: there were more than just one or two of us. That matters, too.

What Is Special About Smith?
Through many different routes, against often considerable obstacles, the colleagues I met at the weekend had found their way into the greater mathematical community. This cluster of success stories is not simply a statistical aberration: if something at Smith enabled the successes, what are its principal features? Two stand out: an aggressive "open door" policy, encouraging questions outside as well as within the classroom; and the opportunity to do active mathematical research while still an undergraduate.

If you visit the Mathematics Department at Smith now, you arrive at the "forum," an open area with comfortable chairs, tables, a few computers, and space to spread books, work, and talk. There are coffee facilities in the corner. The place is designed to encourage discussion. Most math faculty offices open directly into this space. The only option for a faculty member wishing to avoid encountering students is to refrain from coming in at all. They come in.

In my day, catching faculty members was not so comfortable. My back has a permanent memory of the numb patch gained from hours spent sitting on the floor, leaning against the walls outside faculty doors, lying in wait against the arrival of my teachers, with the day's questions. It seemed natural to me: it was what they were there for, wasn't it? And not one of them ever gave me so much as a hint that my demands on their time might be either excessive or inconvenient.

I also felt, by the time I left Smith, that as far as research went, I was a seasoned campaigner. I had written an honours thesis, for which I had read a paper, and I had even, well, tried to prove something that might have been new. And I had spent a summer assisting in a research project. I knew what research was about, I thought. And in essence, I wasn't far wrong.
I was offered a small part of a small part of somebody's research as a final project in a number theory class. It was the first problem I had ever seen that wasn't a homework problem. It was actually a problem to which no one knew the answer! I worked on it night and day for a few weeks and fell in love with research! It was exhilarating.
-Debra Boutin (91, Hamilton College)

Perhaps the strong history of combinatorics and graph theory at Smith plays a significant part. Graph theory and combinatorics do have many problems that can be explained without the necessity for a great deal of advanced mathematical machinery. Could it be done in less accessible subjects, with greater difficulty perhaps?

Before you are too quick to excuse yourself from the effort of trying, consider: how much mathematical equipment do you really need to start to work out examples? I have had some strange experiences in recent years. The first involved teaching representations of Lie algebras to a graduate class with wildly divergent, even inappropriate backgrounds. As an alternative to yielding to catastrophic despair, we settled into a routine of examples classes following the lectures, in which we got everyone decomposing representations of sl(2) and sl(3), lots of them. I was quite impressed how the hands-on experience enabled even students whose understanding of vector space had started as a fistful of arrows (to be added tail to head) to achieve a very satisfactory understanding of the theory of Lie algebras and their representations.

The second experience has been a project with some local ten-year-olds in a state school, on orthogonality, dot products, leading up to Fourier analysis of sounds. It is a challenge, indeed, to explain orthonormal bases to ten-year-olds, but by no means impossible. They appear to have a pretty good grasp of the ideas. I did have one child in floods of tears: she was unable to add or multiply the fractions as required! (I now use decimals and a calculator.) But it was not the ideas she was unable to grasp (and she has strong motivation to get to grips with the fractions).

It's a bit like trying to tell a story without using the letter e, but it can be done (or, perhaps, you can do it). Quite apart from the benefit to the students participating in research, the discipline of having to explain some part of your subject to an undergraduate without the luxury of extensive mathematical baggage can be a revealing exercise in its own right, leading to substantial simplifications in the theory. It is worth doing.

Could It Happen Anywhere?
Why Smith? Clearly Smith offered the right climate for a certain type of potential mathematician to flourish. The right climate again appears to depend on several factors.

First, Smith does not have a graduate programme. It is perhaps paradoxical that this should be counted as a positive feature in the analysis of whether an institution is a suitable one for growing mathematicians, but it seems to play a critical role. In the absence of research students, the undergraduates get the opportunity to take part in research and receive the direct attention of their teachers.

Second, Smith manages to integrate the research interests of the faculty with their undergraduate teaching responsibilities. Courses labelled "Topics in . . . " allow teachers to bring their interests into the classroom. This appears to blur the usual divide between teaching and research with apparent benefit to both aspects. This is of course contingent on the absence of a graduate school: where there is a graduate programme, the "Topics in . . . " courses are aimed at graduate students (or, very often, visiting experts in the subject!).
I hadn't had as extensive a background in linear algebra, algebra, analysis as my peers when beginning graduate school. I took upper-level undergraduate courses to fill in my background. (This is not uncommon for first year graduate students from small liberal arts colleges at Cornell.) However, unlike any of my peers, I had already seen Quantum Logic in David Cohen's topics in analysis class, Groebner Bases in Patricia Sipe's topics in algebra class, and Quasi Crystals in Marjorie Senechal's Topics in Discrete Applied Mathematics class. As a graduate student this exposure to cutting-edge mathematics raised my standing in the eyes of my peers, and in my own eyes.
-Debra Boutin

Many schools have not got graduate programmes, but clearly not all such schools have had the same success in launching Ph.D. mathematicians. Perhaps in this one respect the Smith success story really is a fluke of personalities. Probably every mathematician has been a part, at some time or another, of a group that simply "worked." In such a group, there would be one or two people who set the style which others fit into. Over time the character of the group might change, but the past personality and strengths of the group so influence the incoming leaders, that some of the original character would survive through complete changes of leadership. Smith was undoubtedly lucky in its early leaders: Neal McCoy and Alice Dickinson, to pick two. Neal McCoy retired in 1970, Alice Dickinson in 1980, but their style set the mould that stamps the practices and policy even now.

Should It Happen Anywhere?
I have spent my working life at large schools with graduate programmes. They do an excellent job of training mathematicians, including at the undergraduate level. I respect and admire these institutions. Nonetheless, the evidence of the Smith weekend is that there is a role for other institutions offering alternative routes to the status of PhD mathematician.

Not only has Smith produced a surprising number of mathematicians, but the unusual qualities of those mathematicians (their adaptability, perseverance, and willingness to communicate) suggest to me that this route into mathematics makes for particularly useful mathematicians at the end of their training. Therefore, this is a pathway which should be encouraged, studied, and adapted for use at other institutions. I do not believe that the principle is in any way gender-specific. Nor do I believe that beyond the circumstances mentioned above, there is anything so special about Smith that equally favourable conditions cannot be fostered elsewhere. Those institutions which have those properties, an absence of graduate programme and a willingness to involve undergraduates in research, should recognise their potential and set about making the most of the students they have got.
I wonder if teaching is different at Smith because methods that don't really work for anybody, more obviously don't work for women. I do think some women need more time to commit to mathematics. At some schools, you get a chance as a first-year student to join the team and you never really get another chance. Smith, of course, is not like that.
- Jim Henle
I believe this route into mathematics has recruited, and will continue to recruit, students who would never otherwise consider going into mathematics (myself amongst them!). I suspect also, observing my colleagues over the weekend, that those who follow this route will bring to the larger mathematical community different strengths and expectations than those who follow a more traditional route. I expect mathematics will benefit greatly from the added diversity.

Where to Go from Here?
The Smith route works. However, it did not make for an easy time in the passage from our Smith degree through to the PhD. Let no one romanticise our struggles to overcome deficient backgrounds and learn to compete with those "better prepared." If this route into mathematics from the small liberal arts background is to become more commonly used, both the small colleges and the large graduate schools must consider the challenge of making the transition between undergraduate programmes and graduate training less rocky.

One must wonder whether it is even conceptually possible to combine the virtues of the small liberal arts college with providing all those advanced courses. There are different policies small colleges can pursue. Smith has favoured encouraging a wide variety of students, allowing them to proceed to their own goals at their own rate. This provides Smith with a large number of mathematics majors, of whom only a minority will have interest in high-level courses. With the University of Massachusetts within easy commuting distance, these ambitious students have the possibility of studying such courses there, although there are drawbacks for both student and Smith in doing so. Course scheduling can be complicated enough without having to schedule in commuting time, and Smith loses the presence of these students in its own courses. Some colleges adopt the strategy of requiring of their mathematics students a commitment to acquiring a mathematical background comparable with what would be expected of potential research students. Then few students major in mathematics, and it is no easier to provide advanced level courses.

Whatever strategies are followed by the smaller schools, graduate schools must share the responsibility for bridging the gap. A first positive step might be for admissions personnel to recognise the value of training such as we received at Smith. Seek out some candidates who have come perhaps from smaller schools, who have been involved in research, perhaps particularly those who have tried other careers and yet wish to come back to do mathematics at graduate school. Admit, encourage, and value them. Help them catch up where their training falls short. The effort will be repaid.

Even pure mathematics is no longer the introspective subject I studied as a graduate student. The larger mathematical community could well benefit from an influx of adaptable students with well-developed abilities to communicate and a positive willingness to complement the strengths of whatever group they find themselves working in. If these virtues are indeed fostered by training such as we received at Smith, this is one more argument for a more imaginative and flexible concept of the "good undergraduate mathematical background."
Smith Alums in Mathematics: The Virtues of a Virtual Community

Before the weekend few of us knew each other: I would be surprised if any of us could have named more than four other Smith graduates working in mathematics. Now, more than a year after the weekend, we have returned to our routines. Some of us may never take part in a similar weekend, may never make further contact again. The pragmatic might wonder whether there is any reality in such a virtual community: one that theoretically exists, which met once, which may or may not meet again.

The answer is yes. The prime benefit a community can give has got to be the support its members can provide each other, often simply by sharing stories. Where the members of the community are isolated, where their histories are unusual, even knowing that there are others becomes important. As women in mathematics, we find ourselves with colleagues and neighbours who mostly share too little with us to provide helpful understanding. The community of Smith alums in Mathematics can. We came from a common experience, we meet similar problems. Often it is enough to have someone else say, "Yes, I know, we had that problem too, and we have resolved it. Persevere." Moreover, such a community can work in absentia. Knowing that other women, from the same undergraduate background, have found their varied ways through graduate school, and have found a comfortable corner of mathematics to make a home in, is often sufficient to stifle doubts and counteract discouragement. And if that fails, they are there on e-mail.

AUTHOR
MARJORIE BATCHELOR
102 Mawson Road
Cambridge CB1 2EA
United Kingdom
e-mail: mb1 [email protected]

Marjorie Batchelor, after her Smith years recalled here, was a graduate student at Warwick and then MIT (Ph.D. 1 g?8). Since then she has lectured, in various capacities, at Cambridge and Kings College London. Her research interests are comodules, supermanifolds, and quantum groups. For the last ten years she has also, as an outgrowth of her hobby of chamber music, learned and practiced professionally the craft of repairing and making violins. She is married, with three children.
Mathematical Entertainments
The Mathematical Knight
Noam D. Elkies and Richard P. Stanley

This column is a place for those bits of contagious mathematics that travel from person to person in the community, because they are so elegant, surprising, or appealing that one has an urge to pass them on. Contributions are most welcome.

Michael Kleber and Ravi Vakil, Editors

Much has been said of the affinity between mathematics and chess: two domains of human thought where very limited sets of rules yield inexhaustible depths, challenges, frustrations and beauty. Both fields support a venerable and burgeoning technical literature and attract much more than their share of child prodigies. For all that, the intersection of the two domains is not large. While chess and mathematics may favor similar mindsets, there are few places where a chess player or analyst can benefit from a specific mathematical idea, such as the symmetry of the board and of most pieces' moves (see for instance [24]) or the combinatorial game theory of Berlekamp, Conway, and Guy (as in [4]). Still, when mathematics does find applications in chess, striking and instructive results often arise.
Please send all submissions to the Mathematical Entertainments Editor, Ravi Vakil,
Stanford University,
Department of Mathematics, Bldg. 380, Stanford, CA 94305-2125, USA e-mail:
[email protected]. edu
Introduction
This article shows several mathematical applications that feature the knight and its characteristic (2, 1) leap. It is based on portions of a book tentatively titled Chess and Mathematics, currently in preparation by the two authors of this article, which will cover all aspects of the interactions between chess and mathematics. Mathematically, the choice of (2, 1) and of the 8 × 8 board may seem to be a special case of no particular interest, and indeed we shall on occasion indicate variations and generalizations involving other leap parameters and board sizes. But long experience points to the standard knight's move and chessboard size as felicitous choices not only for the game of chess but also for puzzles and problems involving the board and pieces, including several of our examples.

We will begin by concentrating on puzzles such as the knight's tour. Many of these are clearly mathematical problems in a very thin disguise (for instance, a closed knight's tour is a Hamiltonian circuit on a certain graph 𝒢), and can be solved or at least better
THE MATHEMATICAL INTELLIGENCER © 2003 SPRINGER-VERLAG NEW YORK
understood using the terminology and techniques of combinatorics. We also relate a few of these ideas with practical endgame technique (see Diagrams 1ff., 10, 11).

The latter half of the article shows some remarkable chess problems featuring the knight or knights. Most "practical" chess players have little patience for the art of chess problems, which has evolved a long way from its origins in instructive exercises. But the same formal concerns that may deter the over-the-board player give some problems a particular appeal to mathematicians. For instance, we will exhibit a position, constructed by P. O'Shea and published in 1989, where White, with only king and knight, has just one way to force mate in 48 (the current record). We also show the longest known legal game of chess that is determined completely by its last move (discovered by Rösler in 1994) which happens to be checkmate by promotion to a knight.

Algebraic notation
We assume that the reader is familiar with the rules of chess, but we assume very little knowledge of chess strategy. (The reader who knows, or is willing to accept as intuitively obvious, that king and queen win against king, or even against king and knight if there is no immediate draw, will have no difficulty following the analysis.) The reader will, however, have to follow the notation for chess moves, either by visualizing the moves on the diagram or by setting up the position on the board. Several notation systems have been used; the most common one nowadays, and the one we use here, is "algebraic notation," so called because of the coordinate system used to name the squares of the board. In the remaining paragraphs of this introductory section we outline this notation system. Readers already fluent in algebraic notation may safely skip to the next section, A Chess Endgame.
Each square on the 8 × 8 board is uniquely determined by its row and column, called "rank" and "file" respectively. The ranks are numbered from 1 to 8, the files named by letters a through h. In the initial array, ranks 1 and 2 are occupied by White's pieces and pawns, ranks 8 and 7 by Black's; both queens are on the d-file, and both kings on the e-file. Thus, viewed from White's side of the board (as are all the diagrams in this article), the ranks are numbered from bottom to top, the files from left to right. We name a square by its column followed by the row; for instance, the White king in Diagram 1 below is at d2. Each of the six kinds of chessmen is referred to by a single letter, usually its initial: K, Q, R, B, P are king, queen, rook, bishop, and pawn (often lowercase p is seen for pawn). We cannot use the initial letter for the knight because K is already the king, so we use its phonetic initial, N for kNight. For instance, Diagram 1 can be described as: White Kd2, Black Ka1, Nf2, Pa2, Pc2.

To notate a chess move we name the piece and its destination square, interpolating "×" if the move is a capture. For pawn moves the P is usually suppressed; for pawn captures, it is replaced by the pawn's file. Thus in Diagram 11, Black's pawn moves are notated a2 and a×b2 rather than Pa2 and P×b2. We follow a move by "+" if it gives check, and by "!" or "?" if we regard it as particularly strong or weak. In some cases "!" is used to indicate a thematic move, i.e., a move that is essential to the "theme" or main point of the problem.

As an aid to following the analysis, moves are numbered consecutively, from the start of the game or from the diagram. For instance, we shall begin the discussion of Diagram 1 by considering the possibility 1.K×c2 Nd3!. Here "1" indicates that these are White's and Black's first moves from the diagram; "K×c2" means that the White king captures the unit on c2; and "Nd3!" means that the Black knight moves to the unoccupied square d3, and that this is regarded as a strong move (the point here being that Black prevents 2.Kc1 even at the cost of letting White capture the knight). When analysis begins with a Black move, we use "..." to represent the previous White move; thus "1...Nd3!" is the same first Black move.

A few further refinements are needed to subsume promotion and castling, and to ensure that every move is uniquely specified by its notation. For instance, if Black were to move first in Diagram 1 and promoted his c2-pawn to a queen (giving check), we would write this as 1...c1Q+, or more likely 1...c1Q+?, because we shall see that after 2.K×c1 White can draw. Short and long castling are notated 0-0 and 0-0-0 respectively. If the piece and destination square do not specify the move uniquely, we also give the departure square's file, rank, or both. An extreme example: Starting from Diagram 9, "Nb1" uniquely specifies a move of the c3 knight. But to move it to d5 we would write "Ncd5" (because other knights on the b- and f-files could also reach d5); to a4, "N3a4" (not "Nca4" because of the knight on c5); and to e4, "Nc3e4" (why?).
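The coordinate system behind algebraic notation is easy to make concrete. The following Python sketch is an illustration only, not part of the notation itself; the helper names `square_to_coords`, `coords_to_square`, and `knight_moves` are hypothetical, and the choice of zero-based (file, rank) pairs is an assumption made for convenience.

```python
FILES = "abcdefgh"  # files a..h, left to right as seen from White's side

# The eight (2, 1) leaps of a knight, as (file change, rank change).
LEAPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def square_to_coords(name):
    """Convert a square name such as 'd2' to zero-based (file, rank) indices."""
    return FILES.index(name[0]), int(name[1]) - 1


def coords_to_square(file_index, rank_index):
    """Inverse of square_to_coords: (3, 1) becomes 'd2'."""
    return FILES[file_index] + str(rank_index + 1)


def knight_moves(name):
    """Names of the squares a knight on `name` can leap to on an empty 8 x 8 board."""
    f, r = square_to_coords(name)
    return sorted(
        coords_to_square(f + df, r + dr)
        for df, dr in LEAPS
        if 0 <= f + df < 8 and 0 <= r + dr < 8
    )


print(square_to_coords("d2"))  # (3, 1): the White king's square in Diagram 1
print(knight_moves("f2"))      # ['d1', 'd3', 'e4', 'g4', 'h1', 'h3']
```

The second line of output lists d3 among the squares available to the Black knight on f2, which is the move 1...Nd3 used as an example above.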
A Chess Endgame

We begin by analyzing a relatively simple chess position (Diagram 1). This may look like an endgame from actual play, but is a composed position, an "endgame study," created (by NDE) to bring the key point into sharper focus.

Diagram 1
White to move

White, reduced to bare king, can do no better than draw, and even that with difficulty: Black will surely win if either pawn safely promotes to a queen. A natural try is 1.K×c2, eliminating one pawn and imprisoning two of Black's remaining three men in the corner. But 1...Nd3! breaks the blockade (Diagram 2a). Black threatens nothing but controls the key square c1. The rules of chess do not allow White to pass the move; unable to go to c1, the king must move elsewhere and release Black's men. After 2.K×d3 (or any other move) Kb1 followed by 3...a1Q, Black wins easily.

Diagram 2a

Diagram 2b
Returning to Diagram 1, let us try instead 1.Kc1! This still locks in the Black Ka1 and Pa2, and prepares to capture the Pc2 next move, for instance 1...Nd3+ 2.K×c2, arriving at Diagram 2a with Black to move. White has in effect succeeded in passing the move to Black by taking a detour from d2 to c2. Now it is Black who cannot pass, and any move restores the White king's access to c1. For instance, play may continue 2...Nb4+ 3.Kc1, reaching Diagram 2b. Black is still bottled up. If it
were White to move in Diagram 2b, White would have to release Black with Kd1 or Kd2 and lose; but again Black must move and allow White back to c2, for instance 3...Nd3+ 4.Kc2 and we are back at Diagram 2a.

So White does draw, at least if Black obligingly shuttles the knight between d3 and b4 to match the White king's oscillations between c1 and c2. But what if Black tries to improve on this? While the king is limited to those two squares, the knight can roam over almost the entire board. For instance, from Diagram 2a Black might bring the knight to the far corner in m moves, reaching a position such as Diagram 3a, and then back to d3 in n moves. If m + n is odd, then Black will win since it will be White's turn to move. Instead of d3, Black can aim for b3 or e2, which also control c1; but each of these is two knight moves away from d3, so we get an equivalent parity condition. Alternatively, Black might try to reach b4 from d3 in an even number of moves, to reach Diagram 2b with White to move; and again Black could aim for another square that controls c2. But each of these squares is one or three knight moves away from d3, so again would yield a closed path of odd length through d3.

Diagram 3a

Can Black thus pass the move back to White? For that matter, what should White do in Diagram 3b? Does either Kc1 or K×c2 draw, or is White lost regardless of this choice? The outcome of Diagram 2a thus hinges on the answer to the following problem in graph theory:
Let 𝒢 = 𝒢_{8,8} be the graph whose vertices are the 64 squares of the 8 × 8 chessboard and whose edges are the pairs of squares joined by a knight's move. Does 𝒢 have a cycle of odd length through d3?
Likewise White's initial move in Diagram 3b and the outcome of this endgame comes down to the related question concerning the same graph 𝒢:

What are the possible parities of lengths of paths on 𝒢 from h8 to c1 or c2?

The answers result from the following basic properties of 𝒢:
LEMMA. (i) The graph 𝒢 is connected. (ii) The graph is bipartite, the two parts comprising the 32 light squares and 32 dark squares of the chessboard.

Proof. Part (i) is just the familiar fact that a knight can get from any square on the chessboard to any other square. Part (ii) amounts to the observation that every knight move connects a light and a dark square.

COROLLARIES. (1) There are no knight cycles of odd length on the chessboard. (2) Two squares of the same color are connected by knight-move paths of even length but not of odd length; two squares of opposite color are connected by knight-move paths of odd length but not by paths of even length.

We thus answer our chess questions: White draws both Diagram 1 and Diagram 3b by starting with Kc1. More generally, for any initial position of the Black knight, White chooses between c1 and c2 by moving to the square of the same color as the one occupied by the knight.

Diagram 3b
White to move

REMARK. Our analysis would reach the same conclusions if the Black pawn on c2 were removed from Diagrams 1 and 3b; we included this superfluous pawn only as bait to make the wrong choice of c2 more tempting.

Puzzle 1. For which rectangular boards (if any) does part (i) or (ii) of the Lemma fail? That is, which 𝒢_{m,n} are not connected, or not bipartite? (All puzzles and all diagrams not explicated in the text have solutions at the end of this article.)

Knight's Tours and the Thirty-Two Knights

The graph 𝒢 arises often in problems and puzzles involving knights. For instance, the perennial knight's tour puzzle asks in effect for a Hamiltonian path on 𝒢; a "re-entrant" or "closed" knight's tour is just a Hamiltonian circuit. The existence of such tours is classical; even Euler spent some time constructing them, finding among others the elegant centrally symmetric tour illustrated in Diagram 4 (from [9, p. 191]):

Diagram 4
A closed knight's tour constructed by Euler
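Since everything that follows leans on the Lemma, it may be reassuring to confirm both of its parts by brute force. The Python sketch below is an illustration only and not part of the proof; the names `knight_graph`, `is_connected`, and `edges_join_opposite_colours` are hypothetical, and squares are represented as zero-based (file, rank) pairs, so that a square's color is given by the parity of the sum of its coordinates.

```python
from collections import deque

# The eight (2, 1) leaps of a knight.
LEAPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knight_graph(m=8, n=8):
    """Adjacency lists of the knight graph on an m x n board."""
    return {
        (x, y): [(x + dx, y + dy) for dx, dy in LEAPS
                 if 0 <= x + dx < m and 0 <= y + dy < n]
        for x in range(m) for y in range(n)
    }


def is_connected(graph):
    """Part (i): breadth-first search from one square must reach every square."""
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(graph)


def edges_join_opposite_colours(graph):
    """Part (ii): every knight move changes the parity of x + y."""
    return all((x + y) % 2 != (u + v) % 2
               for (x, y), neighbours in graph.items()
               for (u, v) in neighbours)


G = knight_graph()
edge_count = sum(len(neighbours) for neighbours in G.values()) // 2
print(len(G), edge_count)                            # 64 168
print(is_connected(G), edges_join_opposite_colours(G))  # True True
```

Because `knight_graph` takes the board dimensions as parameters, the same two checks can be pointed at other rectangular boards.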
The extensive literature on knight's tours includes many examples which, when numbered along the path from 1 to 64, yield semi-magic squares (all row and column sums equal 260), sometimes with further "magic" properties, but it is not yet known whether a fully magic knight's tour (one with major diagonals as well as rows and columns summing to 260), either open or closed, can exist.

More generally, we may ask for Hamiltonian circuits on 𝒢_{m,n} for other m,n; that is, for closed knight's tours on other rectangular chessboards. A necessary condition is that 𝒢_{m,n} be a connected graph with an even number of vertices. Hence we must have 2 | mn and both m,n at least 3 (cf. Puzzle 1). But not all 𝒢_{m,n} satisfying this condition admit Hamiltonian circuits. For instance, one easily checks that 𝒢_{3,4} is not Hamiltonian. Nor are 𝒢_{3,6} and 𝒢_{3,8}, but 𝒢_{3,10} has a Hamiltonian circuit, as does 𝒢_{3,n} for each even n > 10. For instance, Diagram 5 shows a closed knight's tour on the 3 × 10 board.
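The small 3 × n cases just mentioned are within easy reach of an exhaustive search. The Python sketch below is one possible way to run that check and is not the method used to obtain the results quoted in the text. It is a plain backtracking search with a simple pruning rule: every square not yet visited must still have at least two usable neighbours, one to arrive from and one to leave by. The names `knight_graph` and `closed_tour` are hypothetical.

```python
LEAPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knight_graph(m, n):
    """Adjacency lists of the knight graph on an m x n board."""
    return {
        (x, y): [(x + dx, y + dy) for dx, dy in LEAPS
                 if 0 <= x + dx < m and 0 <= y + dy < n]
        for x in range(m) for y in range(n)
    }


def closed_tour(m, n):
    """Return one closed knight's tour on the m x n board as a list of squares,
    or None if the exhaustive search finds none."""
    graph = knight_graph(m, n)
    start = (0, 0)          # a closed tour visits every square, so any start will do
    path = [start]
    visited = {start}

    def feasible(current):
        # Each unvisited square needs two distinct usable neighbours: unvisited
        # squares, the current end of the path, or the starting square.
        for v in graph:
            if v in visited:
                continue
            usable = sum(1 for w in graph[v]
                         if w not in visited or w == current or w == start)
            if usable < 2:
                return False
        return True

    def extend(current):
        if len(path) == len(graph):
            return start in graph[current]   # the last leap must close the circuit
        for nxt in graph[current]:
            if nxt in visited:
                continue
            visited.add(nxt)
            path.append(nxt)
            if feasible(nxt) and extend(nxt):
                return True
            visited.remove(nxt)
            path.pop()
        return False

    return path if extend(start) else None


for n in (4, 6, 8, 10):
    # In line with the text: no closed tour for n = 4, 6, 8, and one for n = 10.
    print(n, closed_tour(3, n) is not None)
```

The pruning rule only discards positions that cannot be completed, so a result of `None` is a genuine nonexistence check for that board rather than a heuristic failure.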
Diagram 5
A closed knight's tour on the 3 × 10 board

There are sixteen such tours (ignoring the board symmetries). More generally, enumerating the closed knight's tours on a 3 × (8 + 2n) board yields a sequence 16, 176, 1536, 15424, . . . satisfying a constant linear recursion of degree 21 that was obtained independently by Donald Knuth and NDE in April 1994. See [23, Sequence A070030]. In 1997, Brendan McKay first computed that there are 13267364410532 (more than 1.3 × 10^13) closed knight's tours on the 8 × 8 board ([19]; see also [23, Sequence A001230], [26]).

We return now from enumeration to existence. After 𝒢_{3,n} the next case is 𝒢_{4,n}. This is trickier: the reader might try to construct a closed knight's tour on a 4 × 11 board, or to prove that none exists. We answer this question later.

What of maximal cliques and cocliques on 𝒢? A clique is just a collection of pairwise defending (or attacking) knights. Clearly there can be no more than two knights, again because 𝒢 is bipartite: two squares of the same color cannot be a knight's move apart, and any set of at least three squares must include two of the same color. Cocliques are more interesting: how many pairwise nonattacking knights can the chessboard accommodate?¹ We follow Golomb ([21], via M. Gardner [9, p. 193]). Again the fact that 𝒢 is bipartite suggests the answer (Diagram 6):

Diagram 6
32 mutually nonattacking knights

It is not hard to see that we cannot do better: the 64 squares may be partitioned into 32 pairs, each related by a knight move, and then at most one square from each pair can be used (Diagram 7). This is Patenaude's solution in [21]. Such a pairing of 𝒢 is called a "one-factor" in graph theory. Similar one-factors exist on all 𝒢_{m,n} when 2 | mn and m,n both exceed 2; they can be used to show that in general a knight coclique on an m × n board has size at most mn/2 for such m,n.

Diagram 7
A one-factor in 𝒢

Puzzle 2. What happens if m,n are both odd, or if m ≤ 2 or n ≤ 2?

Are Diagram 6 and its complement the only maximal cocliques? Yes, but this is harder to show. One elegant proof, given by Greenberg in [21], invokes the existence of a closed knight's tour, such as Euler's Diagram 4. In general, on a circuit of length 2M the only sets of M pairwise nonadjacent vertices are the set of even-numbered vertices and the set of odd-numbered ones on the circuit. Here M = 32, and the knight's tour in effect embeds that circuit into 𝒢, so a fortiori there can be at most two cocliques of size M on 𝒢, and we have already found them both! Of course this proof applies equally to any board with a closed knight's tour: on any such board the light- and dark-squared subsets are the only maximal cocliques. Conversely, a board for which there are further maximal cocliques cannot support a closed knight's tour. For example, any 4 × n board has a mixed-color maximal coclique, as illustrated for n = 11 in Diagram 8.

Diagram 8
A third maximal knight coclique on the 4 × 11 board

This yields possibly the cleanest proof that there is no closed knight's tour on a 4 × n board for any n. (According to Jelliss [14], this fact was known to Euler and first proved by C. Flye Sainte-Marie in 1877; Jelliss attributes the above clean proof to Louis Posa.) WARNING: the existence of a closed knight's tour is a sufficient but not necessary condition for the existence of only two maximal knight cocliques. It is known that an m × n board supports a closed tour if and only if its area mn is an even integer > 24 and neither m nor n is 1, 2, or 4. In particular, as noted above there are no closed knight's tours on the 3 × 6 and 3 × 8 boards, though as it happens on each of these boards the only maximal knight cocliques are the two obvious monochromatic ones.

¹ Burt Hochberg jokes (in [11, p. 5], concerning the analogous problem for queens) that the answer is 64, all White pieces or all Black: pieces of the same color cannot attack each other! Of course this joke, and similar jokes such as crowding several pieces on a single square, are extraneous to our analysis.

More about 𝒢: Domination Number, Girth, and the Knight Metric

Another classic puzzle asks: How many knights does it take to either occupy or defend every square on the board? In graph theory parlance this asks for the "domination number" of 𝒢.² For the standard 8 × 8 board, the symmetrical solution with 12 knights (Diagram 9) has long been known:

Diagram 9
All unoccupied squares controlled

Puzzle 3. Prove that this solution is unique up to reflection.

The knight domination number for chessboards of arbitrary size is not known, not even asymptotically. See [9, Ch. 14] for results known at the time for square boards of order up to 15, most dating back to 1918 [1, Vol. 2, p. 359]. If we ask instead that every square, occupied or not, be defended, then the 8 × 8 chessboard requires 14 knights. On an m × n board, at least mn/8 knights are needed, since a knight defends at most 8 squares.

Puzzle 4. Prove that mn/8 + O(m + n) knights suffice. HINT: Treat the light and dark squares separately.

We already noted that 𝒢, being bipartite, has no cycles of odd length. (We also encountered the nonexistence of 3-cycles as "𝒢 has no cliques of size 3".) Thus the girth (minimal cycle length) of 𝒢 is at least 4. In fact the girth is exactly 4, as shown for instance in Diagram 10.

Diagram 10
White to move draws

This square cycle is important to endgame theory: a White knight traveling on the cycle can prevent the promotion of the Black pawn on a3 supported by its king. To draw this position White must either block the pawn or capture it, even at the cost of the knight. The point is seen after 1.Nb4 Kb3 2.Nd3! (reaching Diagram 10a) a2 3.Nc1+!, "forking" king and pawn and giving White time for 4.N×a2 and a draw. On other Black moves from Diagram 10a White resumes control of a2 with 3.Nc1 or 3.Nb4; for instance 2...Kc2 3.Nb4+ or 2...Kc3 3.Nc1 Kb2 (else Na2+) 4.Nd3+! etc. Note that the White king was not needed.

Diagram 10a
After 2.Nd3!

NOTE TO ADVANCED CHESS PLAYERS: it might seem that the knight does need a bit of help after 1.Nb4 Kb1!?, when either 2.Na2? or 2.Nd3? loses (in the latter case to 2...a2), but Black has no threat so White can simply make a random ("waiting") king move. But this is not necessary, as White could also draw by thinking (and playing) out of the a2-b4-d3-c1-a2 box: 1.Nb4 Kb1 2.Nd5! If now 2...a2 then 3.Nc3+ is a new drawing fork, and otherwise White plays 3.Nb4 and resumes the square dance.

Puzzle 5. Construct a position where this Nd5 resource is White's only way to draw.

WARNING: This puzzle is hard and requires considerably more chess background than anything else in this article. The construction requires some delicacy: it is not enough to simply stalemate the White king, because then White can play 2.Na2 with impunity; on the other hand if the White king is put in Zugzwang (so that it has some legal moves, but all of them lose), then the direct 1...a2 2.N×a2 K×a2 wins for Black.

Even more important for the practical chess player is the distance function on 𝒢, which encodes the number of moves a knight needs to get from any square to any other. The diameter (maximal distance) on 𝒢 is 6, which is attained only by diagonally opposite corners. This is to be expected, but shorter distances bring some surprises. The table accompanying Diagram 11 shows the distance from each vertex of 𝒢 to a corner square:
5  4  5  4  5  4  5  6
4  3  4  3  4  5  4  5
3  4  3  4  3  4  5  4
2  3  2  3  4  3  4  5
3  2  3  2  3  4  3  4
2  1  4  3  2  3  4  5
3  4* 1  2  3  4  3  4
N  3  2  3  2  3  4  5

Diagram 11
White loses

The starred entry is due to the board edges: a knight can travel from any square to any diagonally adjacent square in two moves except when one of them is a corner square. But the other irregularities of the table at short distances do not depend on edge effects. Anywhere on the board, it takes the otherwise agile knight three moves to reach an orthogonally adjacent square, and four moves to travel two squares diagonally. This peculiarity must be absorbed by any chess player who would learn to play with or against knights. One consequence, known to endgame theory, is shown in Diagram 11, which exploits both the generic irregularity and the special corner case. Even with White to move, this position is a win for Black, who will play ...a2 and ...a1Q. One might expect that the knight is close enough to stop this, but in fact it would take it three moves to reach a2 and four to reach a1, in each case one too many. In fact this knight helps Black by blocking the White king's approach to a1!

Puzzle 6. Determine the knight distance from (0,0) to (m,n) on an infinite board as a function of the integers m,n.

² This terminology is not entirely foreign to the chess literature: A piece is said to be "dominated" when it can move to many squares but will be lost on any of them. (The meaning of "many" in this definition is not precise because domination is an artistic concept, not a mathematical one.) The introduction of this term into the chess lexicon is attributed to Henri Rinck ([12, p. 93], [16, p. 151]). The task of constructing economical domination positions, where a few chessmen cover many squares, has a pronounced combinatorial flavor; the great composer of endgame studies G. M. Kasparyan devoted an entire book to the subject, Domination in 2545 Endgame Studies, Progress Publishers, Moscow, 1980.

Further Puzzles

We continue with several more puzzles that exploit or extend the above discussion.

Puzzle 7. How does White play in Diagram 12 to force checkmate as quickly as possible against any Black defense? Yes, it's White who wins, despite having only king and pawn against 15 Black men. Black's men are almost paralyzed, with only the queen able to move in its corner prison. White must keep it that way: if he ever moves his king, Black will sacrifice his e2-pawn by promoting it, bring the Black army to life and soon overwhelm White. So White must move only the pawn, and the piece that it will promote to. That's good enough for a draw, but how to actually win?

Puzzle 8. (See Diagram 13.) There are exactly 24 = 4! paths that a knight on d1 can take to reach d7 in four moves; plotting these paths on the chessboard yields a beautiful projection of (the 1-skeleton of) the 4-dimensional hypercube! Explain.
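Both the distance table accompanying Diagram 11 and the path count quoted in Puzzle 8 can be reproduced by breadth-first search on 𝒢. The Python sketch below is offered only as a numerical check (it does not explain the hypercube); the names `to_xy` and `knight_bfs` are hypothetical, and squares are again zero-based (file, rank) pairs.

```python
from collections import deque

FILES = "abcdefgh"
LEAPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def to_xy(name):
    """'d1' -> (3, 0): zero-based (file, rank) coordinates."""
    return FILES.index(name[0]), int(name[1]) - 1


def knight_bfs(source, size=8):
    """Breadth-first search from `source` on a size x size board.

    Returns (dist, paths): dist[v] is the knight distance from source to v,
    and paths[v] is the number of distinct shortest knight paths to v.
    """
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for dx, dy in LEAPS:
            v = (u[0] + dx, u[1] + dy)
            if not (0 <= v[0] < size and 0 <= v[1] < size):
                continue
            if v not in dist:                # first time v is reached
                dist[v] = dist[u] + 1
                paths[v] = paths[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:     # another shortest route into v
                paths[v] += paths[u]
    return dist, paths


# Distances from the corner a1, printed with rank 8 on top as in the table
# (the corner itself shows as 0 where the table shows the knight).
dist, _ = knight_bfs(to_xy("a1"))
for rank in range(7, -1, -1):
    print(" ".join(str(dist[(f, rank)]) for f in range(8)))

# Puzzle 8: distance and number of shortest paths from d1 to d7.
d, p = knight_bfs(to_xy("d1"))
print(d[to_xy("d7")], p[to_xy("d7")])        # 4 24
```

The path counts are accumulated level by level, which is valid because breadth-first search finishes every square at distance k before it dequeues any square at distance k + 1.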
Diagram 12: White to play and mate as quickly as possible

Diagram 13: The 4! shortest knight paths from d1 to d7

Puzzle 9. We saw that there is an essentially unique maximal configuration of 32 mutually non-defending knights on the 8 × 8 board.
i. Suppose we allow each knight to be defended at most once. How many more knights can the board then accommodate?
ii. Now suppose we require each knight to be defended exactly once. What is the largest number of knights on the 8 × 8 board satisfying this constraint, and what are all the maximal configurations?

Puzzle 10. A "camel" is a (3, 1) leaper, that is, an unorthodox chess piece that moves from $(x, y)$ to one of the squares $(x \pm 3, y \pm 1)$ or $(x \pm 1, y \pm 3)$. (A knight is a (2, 1) leaper.) Because there are eight such squares, it takes at least $mn/8$ camels to defend every square, occupied or not, on an $m \times n$ board. Are $mn/8 + O(m + n)$ sufficient, as in Puzzle 4?

Synthetic Games

The remainder of this article is devoted to composed chess problems featuring knights. A synthetic game [13] is a chess game composed (rather than played) to achieve some objective, usually in a minimal number of moves. Ideally the solution should be unique, but this is very rare. Failing this, we can hope for an "almost unique" solution, e.g., one where the final position is unique, but not the move order. For instance, the shortest game ending in checkmate by a knight is 3.0 moves: 1.e3 Nc6 2.Ne2 Nd4 3.g3 Nf3 mate. White can vary the order of his moves and can play e4 and/or g4 instead of e3 and g3. The Black knight has two paths to f3. The biggest flaw, however, is that White could play c3/c4 instead of g3/g4, and Black could mate at d3. At least all 72 solutions share the central feature that White incarcerates his king at its home square. A better synthetic game involving a knight is given in Puzzle 11.
Puzzle 11. Construct a game of chess in which Black checkmates White on Black's fifth move by promoting a pawn to a knight.

Proof Games

A very successful variation of synthetic games that allows unique solutions is the class of proof games, for which the length n of the game and the final position P are specified. For the condition (P, n) to be considered a sound problem, there should be a unique game in n moves ending in P. (Sometimes there will be more than one solution, but solutions should be related in some thematic way. Here we will only consider conditions (P, n) that are uniquely realizable, with the exception of Diagram 17.)

The earliest proof games were composed by the famous "Puzzle King" Sam Loyd in the 1890s but did not have unique solutions; the earliest proof game meeting today's standards seems to have been composed by T. R. Dawson in 1913. Although some interesting proof games were composed in subsequent years, the vast potential of the subject was not suspected until the fantastic pioneering efforts of Michel Caillaud in the early 1980s. A close to complete collection of all proof games published up to 1991 (around 160 problems) appears in [28].

Let us consider some proof games related to knights. We mentioned earlier that the shortest game ending in mate by knight has length 3.0 moves. None of the 72 solutions yield proof games with unique solutions; i.e., every terminal position has more than one way of reaching it in 3.0 moves. It is therefore natural to ask for the least number n (either an integer or half-integer) for which there exists a uniquely realizable game of chess in n moves ending with checkmate by knight, i.e., given the final position, there is a unique game that reaches it in n moves. Such a game was found independently by the two authors of this article in 1996 for n = 4.0, which is surely the minimum. The final position is shown in Diagram 14.

Diagram 14: Position after Black's 4th move. How did the game go?
Five other proof game problems involving knights are presented in Puzzles 12 through 16. The minimum known number of moves for achieving the game is given in parentheses. (We repeat, the game must be uniquely realizable from the number of moves and final position.)

Puzzle 12. Construct a proof game without any captures that ends with mate by a knight (4.5).

Puzzle 13. Construct a proof game ending with mate by a knight making a capture (5.5).

Puzzle 14. Construct a proof game ending with mate by a pawn promoting to a knight (5.5).

Puzzle 15. Construct a proof game ending with mate by a pawn promoting to a knight without a capture on the mating move (6.0).

Puzzle 16. Construct a proof game ending with mate by a pawn promoting to a knight with no captures by the mating side throughout the game (7.0).

There is a remarkable variant of Puzzle 14. Rather than having the game determined by its final position and number of moves, it is instead completely determined by its last move (including the move number)! This is the longest known game with this property.

Puzzle 14'. Construct a game of chess with last move 6.g×f8N mate.

The above proof games focus on achieving some objective in the minimum number of moves. Many other proof games in which knights play a key role have been composed, of which we give a sample of five problems. Diagrams 15, 16, and 17 feature "impostors": some piece(s) are not what they seem. The first of these (Diagram 15) is a classic problem that is one of the earliest of all proof games, while Diagram 16 is considerably more challenging. Diagram 17 features a different kind of impostor. Note that it has two solutions; it is remarkable how each solution has a different impostor. The complex and difficult Diagram 18 illustrates the Frolkin theme: the multiple capture of promoted pieces. Diagram 19 shows, in the words of Wilts and Frolkin [28, p. 53], that "the seemingly indisputable fact that a knight cannot lose a tempo is not quite unambiguous."

Diagram 15: After Black's 4th. How did the game go?

Diagram 16: After Black's 12th. How did the game go?

Diagram 17: After White's 13th. How did the game go? Two solutions!

Diagram 18: After White's 27th move. How did the game go?

Diagram 19: After Black's 10th move. How did the game go?

Retrograde Analysis

In retrograde analysis problems (called retro problems for short), it is necessary to deduce information from the current position concerning the prior history of the game. It is only assumed that the prior play is legal; no assumption is made that the play is "sensible." Proof games are a special class of retro problems. We will give only one illustration here of a retro problem that is not a proof game. It is based on considerations of parity, a common theme whenever knights are involved. Diagram 20 is a mate in one. A chess problem with this stipulation almost invariably involves an element of retrograde analysis, such as determining who has the move. In a problem with the stipulation "Mate in n," it is assumed that White moves first unless it can be proved that Black has the move in order for the position to be legal.

Diagram 20: Mate in one

Length Records

In length records, one tries to construct a position that maximizes the number of moves that must elapse before a certain objective is satisfied. The most obvious and most-studied objective is checkmate. In other words, how large can n be in a problem with the objective "mate in n" (i.e., White to play and checkmate Black in n moves)? Chess problem standards demand that the solution should be unique if at all possible. It is too much to expect, especially for long-range problems, that White has a unique response to every Black move for White to achieve his objective. In other words, it is possible for Black to defend poorly and allow White to achieve his objective in more than one way, or even to achieve it earlier than specified. The correct uniqueness condition is that the problem should be dual-free, which means that Black has at least one method of defending which forces each White move uniquely if White is to achieve his objective. The objective of checkmate can be combined with other conditions, such as White having only one unit besides his king. The ingenious Diagram 21 shows the current record for a "knight minimal," i.e., White's only unit besides his king is a knight. For other length records, as well as many other tasks and records, see [20].

Diagram 21: Mate in 48

Paradox
The term "paradox" has several meanings in both mathematics and ordinary discourse. We will regard a feature of a chess problem (or chess game) as paradoxical if it is seemingly opposed to common sense. For instance, common sense tells us that a material advantage is beneficial in winning a chess game or mating quickly. Thus sacrifice in an orthodox chess problem (i.e., a direct mate or study) is paradoxical. Of course it is just this paradoxical element that explains the appeal of a sacrifice. Another common paradoxical theme is underpromotion. Why not promote to the strongest possible piece, namely, the queen? This theme is related to that of sacrifice, because in each case the player is forgoing material. To be sure, underpromotion to knight in order to win, draw, or checkmate quickly is not so surprising (and has even occurred a fair number of times in games), since a knight can make moves forbidden to a queen. Tim Krabbe thus remarks in [15] that knighting hardly counts as a true "underpromotion."³ Nevertheless, knight promotions can be used for surprising purposes that heighten the paradoxical effect.

Diagram 22 shows four knight sacrifices, all promoted pawns, with a total of five promotions to knight. Diagram 23 shows a celebrated problem composed by Sam Loyd where a pawn promotes to a knight that threatens no pieces or checks and is hopelessly out of play. For some interesting comments by Loyd on this problem, see [27, p. 403].

Note that the impostors of Diagrams 15-17 may also be regarded as paradoxical, because we're trying to reach the position as quickly as possible, and it seems a waste of time to move knights into the original square(s) of other knights. Similarly the time-wasting 5.h×g8N 6.Nh6 7.N×f7 of Diagram 19 seems paradoxical: why not save a move by 5.h×g8B and 6.B×f7+?

³More paradoxical are underpromotions to rooks and bishops, but we will not be concerned with them here.

Diagram 22: White to play and win

Diagram 23: Mate in 3

Helpmate

In a helpmate in n moves, Black moves first and cooperates with White so that White mates Black on White's nth move. If the number of solutions of a helpmate is not specified, then there should be a unique solution. For a long time it was thought impossible to construct a sound helpmate with the theme of Diagram 24, featuring knight promotions. Note that the first obstacle to overcome is the avoidance of checkmating White or stalemating Black. The composer of this brilliant problem, Gabor Cseh, was tragically killed in an accident in 2001 at the age of 26.

Diagram 24: Helpmate in 10

Piece Shuffle

In piece shuffles or permutation tasks, a rearrangement of pieces is to be achieved in a minimum number of moves, sometimes subject to special conditions. They may be regarded as special cases of "moving counter problems" such as given in [2, pp. 769-777] or [3, pp. 58-68]. A classic example involving knights, going back to Guarini in 1512, is shown in Diagram 25. The knights are to exchange places in the minimum number of moves. (Each White knight ends up where a Black knight begins, and vice versa.) The systematic method for doing such problems, first enunciated by Dudeney [3, solution to #341] and called the method of "buttons and strings," is to form a graph whose vertices are the squares of the board, with an edge between two vertices if the problem piece (here a knight) can move from one vertex to the other. For Diagram 25, the graph is just an eight-cycle (with an irrelevant isolated vertex corresponding to the center square of the board). Diagram 26 is a representation of the problem that makes it quite easy to see that the minimum number of moves is sixteen (eight by each color), achieved for instance by cyclically moving each knight four steps clockwise around the eight-cycle. If a White knight is added at b1 and a Black knight at b3, then somewhat paradoxically the minimum number of moves is reduced to eight! A variation of the stipulation of Diagram 25 is the problem presented as Puzzle 17, whose solution is a bit tricky and essentially unique.

Diagram 25: Exchange the knights in a minimum number of moves

Diagram 26: The graph corresponding to Diagram 25
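The "buttons and strings" graph for the 3 × 3 board is small enough to build and search exhaustively. The sketch below constructs the knight-move graph, walks the eight-cycle, and runs a breadth-first search over all placements to find the minimum number of moves for the exchange. It assumes, since Diagram 25 is not reproduced here, that the White knights start on a1 and c1 and the Black knights on a3 and c3; the function names are illustrative and not taken from the article.

```python
from collections import deque

FILES = "abc"
SQUARES = [f + r for f in FILES for r in "123"]
JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knight_graph():
    """Adjacency sets of the knight-move graph on the 3 x 3 board."""
    graph = {sq: set() for sq in SQUARES}
    for sq in SQUARES:
        x, y = FILES.index(sq[0]), int(sq[1]) - 1
        for dx, dy in JUMPS:
            nx, ny = x + dx, y + dy
            if 0 <= nx < 3 and 0 <= ny < 3:
                graph[sq].add(FILES[nx] + str(ny + 1))
    return graph


def cycle_through(start, graph):
    """Follow the unique cycle through `start`, never stepping straight back."""
    cycle, prev, cur = [start], None, start
    while True:
        nxt = next(sq for sq in sorted(graph[cur]) if sq != prev)
        if nxt == start:
            return cycle
        cycle.append(nxt)
        prev, cur = cur, nxt


def min_exchange_moves(white, black):
    """Fewest single knight moves that swap the White and Black knights.

    Moves need not alternate between the colors (an assumption of this sketch).
    """
    graph = knight_graph()
    start = (frozenset(white), frozenset(black))
    goal = (frozenset(black), frozenset(white))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return dist[state]
        w, b = state
        occupied = w | b
        for side, knights in enumerate(state):
            for sq in knights:
                for nxt in graph[sq] - occupied:
                    moved = (knights - {sq}) | {nxt}
                    new = (moved, b) if side == 0 else (w, moved)
                    if new not in dist:
                        dist[new] = dist[state] + 1
                        queue.append(new)
    return None


print(cycle_through("a1", knight_graph()))
print(min_exchange_moves({"a1", "c1"}, {"a3", "c3"}))              # 16
print(min_exchange_moves({"a1", "b1", "c1"}, {"a3", "b3", "c3"}))  # 8
```

The state space has only a few hundred to a couple of thousand placements, so an exhaustive search is cheap and serves as a check on the cycle argument rather than a replacement for it.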
Puzzle 17. In Diagram 25 exchange the knights in a minimum number of move sequences, where a "move sequence" is an unlimited number of consecutive moves by the same knight.

For some more sophisticated problems similar to Diagram 25, see [10, pp. 114-124]. The most interesting piece shuffle problems connected with the game of chess (though not focusing on knights) are due to G. Foster [5, 6, 7, 8], created with the help of his computer program WOMBAT (Work Out Matrix By Algorithmic Techniques).

Puzzle Answers, Hints, and Solutions

1. The graph $\mathcal{G}_{m,n}$ is connected for $m = n = 1$ (only one vertex) and not connected for $m = n = 3$ (the central square is an isolated vertex). With those two exceptions, $\mathcal{G}_{m,n}$ is connected if and only if $m > 2$ and $n > 2$. Every $\mathcal{G}_{m,n}$ is bipartite, except $\mathcal{G}_{1,1}$ (empty parts not allowed); each non-connected graph $\mathcal{G}_{m,n}$ is bipartite in several ways except for $\mathcal{G}_{1,2} = \mathcal{G}_{2,1}$.

2. If $m = 1$ or $n = 1$ then $\mathcal{G}_{m,n}$ is disconnected, so the maximal coclique is the set of all $mn$ vertices. The graph $\mathcal{G}_{2,n}$ (or $\mathcal{G}_{n,2}$) decomposes into two paths of length $\lfloor n/2 \rfloor$ and two of length $\lceil n/2 \rceil$. It thus has a one-factor if and only if $4 \mid n$, and otherwise has cocliques of size $> n$; the maximal coclique size is $n + \delta$ where $\delta \in \{0, 1, 2\}$ and $n \equiv \pm\delta \bmod 4$. If $m$ and $n$ are odd integers greater than 1 then the maximal coclique size of $\mathcal{G}_{m,n}$ is $(mn + 1)/2$, attained by placing a knight on each square of the same parity as a corner square of an $m \times n$ board. One can prove that this is maximal by deleting one of these squares and constructing a one-factor on the remaining $mn - 1$ vertices of $\mathcal{G}_{m,n}$.

3. Each of the four 2 × 2 corner subboards requires at least three knights, and no single knight may occupy or defend squares in two different subboards. Hence at least 4 · 3 = 12 knights are needed. For three knights to cover the {a1, b1, a2, b2} subboard, one of them must be on c3; likewise f3, f6, c6 must be occupied if 12 knights are to suffice. It is now easy to verify that Diagram 9 and its reflection are the only ways to place the remaining 8 knights so as to cover the entire chessboard.

4. ([3, #319, p. 127]) On an infinite chessboard, each square of odd parity is a knight-move away from exactly one of the squares with coordinates $(2x, 2y)$ with $x \equiv y \bmod 4$. Intersecting this lattice with an $m \times n$ chessboard yields $mn/16 + O(m + n)$ knights that cover all odd squares at distance at least 3 from the nearest edge. Thus an extra $O(m + n)$ knights defend all the odd squares on the board. The same construction for the even squares yields a total of $mn/8 + O(m + n)$.

5. One such position is shown in Diagram 27. Once the a-pawn is gone, the position is a theoretical draw whether Black plays f×g2+ (Black can do no better than stalemate against K×g2, Kh1, Kg2 etc.) or f2 (ditto after Ke2, Kf1, etc.), or lets White play g×f3 and Kg2 and then jettison the f-pawn to reach the same draw that follows f×g2+. But as long as Black's a-pawn is on the board, White can move only the knight since g×f3 would liberate Black's bishop which could then force White's knight away (for instance 1.Nb4 Kb1 2.g×f3? g2+! 3.K×g2 Bd6 4.Nd5 Kb2) and safely promote the a-pawn. Black's pawn on f3 could also be on h3 with the same effect.

Diagram 27: White to move draws

6. The distance is an integer, congruent to $m + n \bmod 2$, that equals or exceeds each of $|m|/2$, $|n|/2$, and $(|m| + |n|)/3$. It is the smallest such integer except in the cases already noted of $(m, n) = (0, \pm 1)$, $(\pm 1, 0)$, or $(\pm 2, \pm 2)$, when the distance exceeds the above lower bound by 2.

7. (adapted from Gorgiev) To win, White must promote the pawn to a knight, capture the pawns on b5 and c4, and then mate with N×b3 when the Black queen is on a1. Thus N×b3 must be an odd-numbered move. Therefore 1.h4, 2.h5, 3.h6, 4.h7, 5.h8N does not work because all knight paths from h8 to b3 have odd length. Since the knight cannot "lose the move," the pawn must do so on its initial move: 1.h3!, followed by 6.h8N!, 7.Nf7, 8.Nd6, 9.N×b5, 10.Nd6, 11.N×c4, 12.Na5. At this point the Black queen is on a2, having made 11 moves from the initial position; whence the conclusion: 12...Qa1 13.N×b3 mate. (We omitted from Gorgiev's original problem the initial move 1.Kf2×Ne1 Qa2-a1, which only served to give Black his entire army in the initial position and thus maximize the material disparity; and we moved a Black pawn from c5 to b5 to make the solution unique, at some cost in strategic interest.)

8. Recall that a knight's move joins squares differing by one of the eight vectors $(\pm 1, \pm 2)$ or $(\pm 2, \pm 1)$, and check that to get some four of those to add to $(0, 6)$ we must use the four vectors with a positive ordinate in some order. Thus, to reach d7 from d1 (or, more generally, to travel six squares north with no obstruction from the edges of the board) in four moves, the knight must move once in each of its four north-going directions. Therefore each path corresponds to a permutation of the four vectors $(\pm 1, 2)$ and $(\pm 2, 1)$. The number of paths is thus 4! = 24, and drawing them all yields the image of the 4-cube under a projection taking the unit vectors to $(\pm 1, 2)$ and $(\pm 2, 1)$. Instead of d1 and d7 we could also draw the 24 paths from a4 to g4 in four moves to get the same picture. Not b2 and f6, though: besides the 24 paths of Diagram 13 there are other four-move journeys, for instance b2-d3-f4-h5-f6.

9. (i) The maximum is still 32 (though there are many more configurations that attain this maximum). To show this, it is enough to prove that at most 8 knights can fit on a 4 × 4 board if each is to be defended at most once. This in turn can be seen by decomposing
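The distance rule in the answer to Puzzle 6 can be compared against a direct search. The sketch below implements the rule as stated there (smallest integer of the right parity meeting the three lower bounds, plus 2 in the listed exceptional cases) and checks it against breadth-first search on a board large enough to behave like an infinite one for small displacements. The names `formula_distance`, `bfs_distances`, and `mismatches`, and the choice of search radius, are assumptions made for this illustration.

```python
from collections import deque

JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def formula_distance(m, n):
    """Knight distance for displacement (m, n) on an unobstructed board."""
    m, n = abs(m), abs(n)
    d = (m + n) % 2  # the distance has the parity of m + n
    # Raise d in steps of 2 until d >= |m|/2, d >= |n|/2 and d >= (|m|+|n|)/3.
    while 2 * d < max(m, n) or 3 * d < m + n:
        d += 2
    # Exceptional displacements exceed the lower bound by 2.
    if (m, n) in {(0, 1), (1, 0), (2, 2)}:
        d += 2
    return d


def bfs_distances(radius):
    """Knight distances from the origin, searching inside |x|, |y| <= radius."""
    dist = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        for dx, dy in JUMPS:
            nxt = (x + dx, y + dy)
            if max(abs(nxt[0]), abs(nxt[1])) <= radius and nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return dist


def mismatches(limit=6, radius=12):
    """Displacements with |m|, |n| <= limit where formula and search differ."""
    dist = bfs_distances(radius)
    return [
        (m, n)
        for m in range(-limit, limit + 1)
        for n in range(-limit, limit + 1)
        if dist[(m, n)] != formula_distance(m, n)
    ]


print(mismatches())  # an empty list means the rule agrees with the search
```

The search radius is taken well beyond the tested displacements so that no shortest path is cut off by the artificial boundary.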
In this note we have concentrated attention mainly on the simplest multivariate domain $\mathbb{D}$. Some results, however, extend to domains with other shapes (domains diffeomorphic to $\mathbb{D}$, for example) and to domains of higher dimension. Proposition 4 appears to be related to the Poincaré-Hopf theorem linking the degree of the gradient map on a boundary to the zeroes within (see, for example, Milnor [M69], section 6). It may be harder, though, in higher dimensions, to express our hypotheses in terms of boundary values of the function itself.

PROPOSITION 5. If the Conjecture holds, there exist continuously differentiable functions $f_k : \mathbb{D} \to \mathbb{R}$ ($k = 1, 2, \dots, N$), for some finite $N$, such that all have the same values on $\partial\mathbb{D}$, yet

$$\bigcap_{k=1}^{N} \nabla f_k(\mathbb{D}) = \emptyset.$$

Can we have N = 3 here, as in Proposition 1? This question, along with the Conjecture itself, will require further study.

Acknowledgments

This work was supported in part by a Hanyang University research grant (1999) and by a research grant (now called a Discovery Grant) from NSERC of Canada, held at the University of Guelph.

REFERENCES

[F-M95] M. Furi and M. Martelli, A multidimensional version of Rolle's theorem. Amer. Math. Monthly 102 (1995), 243-249.

[M69] J. Milnor, Topology from the Differentiable Viewpoint. University Press of Virginia, 1969.

[RB-C74] W. W. Rouse Ball and H. S. M. Coxeter, Mathematical Recreations and Essays, 12th edition, University of Toronto Press, 1974.
AUTHORS

SUNG SOO KIM
Hanyang University
Ansan, Kyunggi 425-791
Korea
e-mail: [email protected]

JOHN HOLBROOK
Mathematics and Statistics
University of Guelph
Guelph, Ontario N1G 2W1
Canada
e-mail: [email protected]

Sung Soo Kim (B.Sc. Hanyang University, Ph.D. Korea Advanced Institute of Science and Technology) and John Holbrook (B.Sc., M.Sc. Queens, Ph.D. Caltech) pursued mathematics separately and on opposite sides of the world until The Mathematical Intelligencer (vol. 22, no. 4) brought them together. Their electronic collaboration on the article "Bertrand's paradox revisited" led to a real-life collaboration at the University of Guelph where Kim has been Visiting Professor 2001-2. Normally Kim works at Hanyang University at Ansan. Holbrook has been mainly at Guelph for many years but has also worked in California, Venezuela, India, Slovenia, and Hawaii. While their individual research topics range over matrix analysis, special functions, statistical sampling, and image analysis, the Kim-Holbrook joint work has centered more on mathematical curiosities, paradoxes, and philosophical puzzles.
Mathematical Entertainments
Michael Kleber and Ravi Vakil, Editors

This column is a place for those bits of contagious mathematics that travel from person to person in the community, because they are so elegant, surprising, or appealing that one has an urge to pass them on. Contributions are most welcome.

Please send all submissions to the Mathematical Entertainments Editor, Ravi Vakil, Stanford University, Department of Mathematics, Bldg. 380, Stanford, CA 94305-2125, USA
e-mail: [email protected]

Prime Maze
Dean Hickerson

A number-theorist set his car's odometer to zero and then went for a drive along the roads shown in the diagram, starting and ending in his home town. He noticed that his odometer reading was a prime number each time he entered a town. Where did he live and how far did he drive?

(Notes to avoid trick solutions: Every town is at an intersection of roads and every such intersection has a town. Distances on the map are in miles and are exact. The odometer measures miles and is accurate. The driver didn't reset his odometer at any time after he started. He never turned around outside of a town, but he did sometimes leave a town by the road on which he arrived. He didn't add extra distance by driving within a town or raising his car off the ground and spinning his wheels or getting out of the car while someone else drove it. He didn't subtract distance by driving backward. He stayed on the roads shown here at all times, etc.)

[Diagram: road map of the towns, with the length of each road in miles]

The solution will appear in the next issue.

Department of Mathematics
University of California at Davis
Davis, CA 95616, USA
e-mail: dean@math.ucdavis.edu
CHRISTIAN C. FENSKE
Extrema in Case of Several Variables
A favourite topic of most calculus courses is the calculation of extrema. Every calculus student is confronted with the following:
Standard calculus result. Let $n \ge 2$ and let $J \subset \mathbb{R}$ be an open interval. Let $f : J \to \mathbb{R}$ be $n - 1$ times differentiable on $J$ and $n$ times differentiable at some point $a \in J$. Assume that $f^{(k)}(a) = 0$ for $k = 1, \dots, n - 1$ but $f^{(n)}(a) \ne 0$. Then there is the following alternative:

(1) Either $n$ is even. Then $f$ has an isolated extremum at $a$, and that is a maximum in case $f^{(n)}(a) < 0$ and a minimum in case $f^{(n)}(a) > 0$.
(2) Or $n$ is odd. Then $f$ does not attain a local extremum at $a$.

When the course proceeds to functions of more than one variable we meet this theorem again, but now only for second derivatives. But what about a function with the first five derivatives vanishing? Of course, there arises the question of what precisely we mean by "vanishing" of an n-th derivative, and what we should use as a substitute for the condition $f^{(n)}(a)$ being positive or negative. This again depends on how we define differentiability for functions of several variables. In this paper, I first explain how the theorem would look for a low-brow approach. Then I will discuss briefly the modifications required for the high-brow approach where higher derivatives are viewed as multilinear forms.

Of course, at first glance one suspects that the multivariable case should be well known, and I am pretty sure it is. Although I have looked into numerous calculus texts and asked at least as many colleagues, I have not been able to identify a source. Either there is a proof of this result in the literature, but I did not find it, or the result seemed plausible to everyone who thought of it, but writing it down was not worthwhile. So I just present here a proof for reference purposes and maybe for use in calculus courses.
The Low-Brow Approach
In the low-brow approach a function $f : U \to \mathbb{R}$ on an open set $U \subset \mathbb{R}^m$ is said to be $n$ times continuously differentiable if all partial derivatives up to order $n$ exist and are continuous on $U$. (I write $D_i$ for the partial derivative with respect to the $i$-th variable.) Schwarz's theorem on the interchangeability of partial derivatives then tells us that for $a \in U$ and $h^{(1)}, \dots, h^{(n)} \in \mathbb{R}^m$, the map $d^n f(a) : \mathbb{R}^m \times \cdots \times \mathbb{R}^m \to \mathbb{R}$ with

$$d^n f(a)(h^{(1)}, \dots, h^{(n)}) = \sum D_{j_1} \cdots D_{j_n} f(a)\, h^{(1)}_{j_1} \cdots h^{(n)}_{j_n}$$

(summation over all distinct $n$-tuples $(j_1, \dots, j_n)$ with $1 \le j_i \le m$) is a symmetric $n$-linear map. If $h \in \mathbb{R}^m$ we write $d^n f(a)(h^n) := d^n f(a)(h, \dots, h)$. We then have
THEOREM. Let $U$ be open in $\mathbb{R}^m$ and let $f : U \to \mathbb{R}$ be $n$ times continuously differentiable. Let $2 \le p \le n$, and assume that for some $a \in U$ and all $h \in \mathbb{R}^m$, we have $df(a)(h) = d^2 f(a)(h^2) = \cdots = d^{p-1} f(a)(h^{p-1}) = 0$, but $d^p f(a)(h^p) \ne 0$ for some $h \in \mathbb{R}^m$. Then the following holds:

(1) A necessary condition for $f$ to have a local extremum at $a$ is that $p$ be even.

Let then $p$ be even.

(2) A necessary condition for $f$ to have a local maximum (minimum) at $a$ is that $d^p f(a)(h^p) \le 0$ ($\ge 0$) for all $h \in \mathbb{R}^m$.
(3) A sufficient condition for $f$ to have a local maximum (minimum) at $a$ is that $d^p f(a)(h^p) < 0$ ($> 0$) for all $h \in \mathbb{R}^m \setminus \{0\}$.
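To see how the three parts of the theorem interact, it may help to look at a few quartic examples in two variables. These examples are not taken from the article; they are an illustrative sketch with $m = 2$, $a = (0, 0)$, and $p = 4$, using the fact that for a homogeneous quartic polynomial $q$ one has $d^4 q(0)(h^4) = 4!\, q(h)$, while terms of degree six contribute nothing to the fourth derivative at the origin.

```latex
% Illustrative examples (assumed for this sketch, not from the article):
% m = 2, a = (0,0), p = 4; all derivatives of order 1, 2, 3 vanish at the origin.
\begin{align*}
f_1(x,y) &= x^4 + y^4,     & d^4 f_1(0)(h^4) &= 24\,(h_1^4 + h_2^4) > 0 \quad (h \neq 0), \\
f_2(x,y) &= x^4 - y^4,     & d^4 f_2(0)(h^4) &= 24\,(h_1^4 - h_2^4) \quad \text{(takes both signs)}, \\
f_3(x,y) &= x^2 y^2 - x^6, & d^4 f_3(0)(h^4) &= 24\,h_1^2 h_2^2 \ge 0 .
\end{align*}
% f_1: condition (3) holds, so 0 is a strict local minimum.
% f_2: condition (2) fails for both maximum and minimum, so 0 is not an extremum.
% f_3: condition (2) for a minimum holds but (3) does not, and indeed
%      f_3(t,0) = -t^6 < 0 < t^4 - t^6 = f_3(t,t) for 0 < |t| < 1,
%      so 0 is not a local extremum.
```

The third example shows that the necessary condition in (2) is not sufficient, which is why (3) asks for strict inequality away from $h = 0$.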
Proof. To begin with, choose your favourite norm $\|\cdot\|$ on $\mathbb{R}^m$.

(1) Let $p$ be odd. We have to show that $f$ does not have a local extremum at $a$. By assumption there is an $h$ such that $d^p f(a)(h^p) \ne 0$. Upon dividing $h$ by its norm we may assume that $\|h\| = 1$. Since $p$ is odd we may even assume that $\mu := d^p f(a)(h^p) < 0$ (else we would replace $h$ by $-h$). Now we set $\varepsilon := -\mu/p!$ and use Taylor's formula ([J, Corollary 8.17]) to find a $\delta > 0$ such that

$$\Bigl| f(a + k) - f(a) - \frac{1}{p!}\, d^p f(a)(k^p) \Bigr| \le \frac{\varepsilon}{2}\, \|k\|^p$$

whenever $k \in \mathbb{R}^m$ satisfies $\|k\| < \delta$. Let $0 < t < \delta$. We then have

$$f(a + th) - f(a) \le \frac{\mu}{p!}\, t^p + \frac{\varepsilon}{2}\, t^p = \frac{\mu}{2\,p!}\, t^p < 0.$$

So there cannot be a local minimum at $a$. But we may as well choose an $h$ with $\|h\| = 1$ such that $\mu' := d^p f(a)(h^p) > 0$. Let then $\varepsilon := \mu'/p!$. We choose $\delta$ and $t$ as above and find that

$$0 < \frac{\mu'}{2\,p!}\, t^p = -\frac{\varepsilon}{2}\, t^p + \frac{\mu'}{p!}\, t^p \le f(a + th) - f(a).$$

So we don't have a local maximum at $a$ either.

(2) Now assume that we have a local minimum at $a$ (else we replace $f$ with $-f$). Then there is an $\eta > 0$ such that $f(a + h) - f(a) \ge 0$ whenever $\|h\| \le \eta$. Let $\varepsilon > 0$. Again we apply Taylor's theorem, and find a positive $\delta \le \eta$ such that

$$-\varepsilon \|h\|^p \le f(a + h) - f(a) - \frac{1}{p!}\, d^p f(a)(h^p) \le \varepsilon \|h\|^p$$

whenever $\|h\| < \delta$. So for $0 < \|h\| < \delta$ we have

$$0 \le f(a + h) - f(a) \le \frac{1}{p!}\, d^p f(a)(h^p) + \varepsilon \|h\|^p,$$

hence $0 \le d^p f(a)(h^p) + p!\,\varepsilon \|h\|^p$. Now fix $h$. Choose a $t \ne 0$ with $\|th\| < \delta$, which implies $d^p f(a)((th)^p) + p!\,\varepsilon \|th\|^p \ge 0$. Because $p$ is now assumed to be even we have $t^p > 0$, hence $d^p f(a)(h^p) + p!\,\varepsilon \|h\|^p \ge 0$. Because this holds for each $\varepsilon > 0$ we conclude that $d^p f(a)(h^p) \ge 0$.

(3) We deal with the case where $d^p f(a)(h^p) > 0$ whenever $h \ne 0$. Denote by $S$ the unit sphere in $\mathbb{R}^m$. Since $S$ is compact, the continuous map $\phi : S \to \mathbb{R}$ defined by $\phi(h) := d^p f(a)(h^p)$ attains its minimum. So there is a $\lambda > 0$ with $d^p f(a)(h^p) \ge \lambda$ whenever $\|h\| = 1$, so $d^p f(a)(h^p) \ge \lambda \|h\|^p$ for all $h \in \mathbb{R}^m$. Let $0 < \varepsilon < \lambda/p!$. Invoking Taylor's theorem for a last time, we choose a $\delta > 0$ such that

$$\Bigl| f(a + h) - f(a) - \frac{1}{p!}\, d^p f(a)(h^p) \Bigr| \le \varepsilon \|h\|^p$$

whenever $\|h\| \le \delta$. For these $h$ we then have $-\varepsilon \|h\|^p \le f(a + h) - f(a) - \frac{1}{p!}\, d^p f(a)(h^p) \le \varepsilon \|h\|^p$, hence

$$0 < -\varepsilon \|h\|^p + \frac{\lambda}{p!}\, \|h\|^p \le f(a + h) - f(a)$$

for $0 < \|h\| \le \delta$. But this means that $f(a + h) > f(a)$ as long as $0 < \|h\| < \delta$. □

The High-Brow Approach
Of course, it is aesthetically unsatisfactory to have a condition such as $d^p f(a)(h^p) \ne 0$ where one would expect just $d^p f(a) \ne 0$. But this can be taken care of if we view higher derivatives as multilinear maps. So let $E$, $F$ be finite-dimensional vector spaces (over the reals), and choose norms in $E$, $F$. If $D \subset E$ is open, a map $f : D \to F$ is said to be differentiable at $a \in D$ if there is a linear map $T \in L(E, F)$ and there is a function $r : D \to F$ continuous at $a$, such that $r(a) = 0$ and

$$f(x) = f(a) + T(x - a) + \|x - a\|\, r(x)$$

for all $x \in D$. One proves immediately that $T$ in the above definition is uniquely determined and consequently writes $df(a) := T$. One says that $f$ is continuously differentiable on $D$ if $f$ is differentiable at each $x \in D$ and $df : D \to L(E, F)$ is continuous. If $E_1, \dots, E_k$ are vector spaces, the space of $k$-multilinear maps of $E_1, \dots, E_k$ to $F$ is denoted by $L(E_1, \dots, E_k; F)$. We endow this space with the obvious norm. If $E_1 = \cdots = E_k = E$ this space is denoted $L^k(E; F)$. If $h \in E$ we write $T(h^k) := T(h, \dots, h)$. By $L^k_s(E; F)$ one denotes the subspace of symmetric mappings. We say that $T \in L^k_s(E; \mathbb{R})$ is positive (negative) semidefinite if $T(h^k) \ge 0$ (resp. $\le 0$) for all $h \in E$. Similarly, $T$ is said to be positive (negative) definite if $T(h^k) > 0$ (resp. $T(h^k) < 0$) whenever $h \ne 0$.

Returning to differentiation, we know from (multi-)linear algebra that there are natural isomorphisms

$$L(E_1, L(E_2, \dots, E_k; F)) = L(E_1, \dots, E_{k-1}; L(E_k, F)) = L(E_1, \dots, E_k; F).$$

If $f$ is differentiable on $D$ and $df : D \to L(E, F)$ is again differentiable at $a \in D$, then $d^2 f(a) := d(df)(a) \in L(E; L(E, F)) = L^2(E; F)$. Inductively, we define $d^k f(a) := d(d^{k-1} f)(a) \in L^k(E; F)$ if $d^{k-1} f : D \to L^{k-1}(E; F)$ is differentiable at $a$. The Schwarz lemma then tells us that in fact $d^k f(a) \in L^k_s(E; F)$ provided $f$ is $k$ times continuously differentiable. Our theorem then looks almost the same:

THEOREM. Let $U$ be an open subset of a finite-dimensional vector space $E$, and let $f : U \to \mathbb{R}$ be $n$ times continuously differentiable. Let $2 \le p \le n$, and assume that for some $a \in U$ we have $d^k f(a) = 0$ for $1 \le k \le p - 1$ but $d^p f(a) \ne 0$. Then the following holds:

(1) A necessary condition for $f$ to have a local extremum at $a$ is that $p$ be even.

Let then $p$ be even.

(2) A necessary condition for $f$ to have a local maximum (minimum) at $a$ is that $d^p f(a)$ be negative (positive) semidefinite.
(3) A sufficient condition for $f$ to have a local maximum (minimum) at $a$ is that $d^p f(a)$ be negative (positive) definite.
The proof carries over almost verbatim from the low-brow case. There is but one crucial point: We know that $d^p f(a) \ne 0$, so we know that there are $h_1, \dots, h_p \in E$ with $d^p f(a)(h_1, \dots, h_p) \ne 0$; but we need to know that this happens with $h_1, \dots, h_p$ all equal. It is here that we exploit the fact that $d^p f(a)$ is symmetric. If $p = 2$ we could use the parallelogram identity to conclude that $d^2 f(a) = 0$ iff $d^2 f(a)(h^2) = 0$ for all $h \in E$. For the general case we could appeal to the polarization identity [AMR, Proposition 2.2.11]:

PROPOSITION. Let $E$, $F$ be finite-dimensional vector spaces and $k \in \mathbb{N}$. For $A \in L^k(E; F)$ define $\hat{A} : E \to F$ by $\hat{A}(h) := A(h^k)$ and denote by $S^k(E; F)$ the vector space $\{\hat{A} \mid A \in L^k(E; F)\}$ endowed with the norm $\|\hat{A}\| := \sup\{\|\hat{A}(h)\| \mid \|h\| \le 1\}$. Then $\|\hat{A}\| \le \|A\| \le \frac{k^k}{k!}\, \|\hat{A}\|$ for $A \in L^k_s(E; F)$, and the map $A \mapsto \hat{A}$, when restricted to $L^k_s(E; F)$, is an isomorphism.
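For $p = 2$ the step from "vanishing on the diagonal" to "vanishing identically" can be written out in two lines. The display below is a sketch of that elementary case only, with $T$ standing for any symmetric bilinear map such as $d^2 f(a)$ and with $u, v \in E$ as illustrative names for the arguments.

```latex
% The case p = 2: a symmetric bilinear map is determined by its values T(h^2).
\begin{align*}
T\bigl((u+v)^2\bigr) &= T(u^2) + 2\,T(u,v) + T(v^2), \\
T(u,v) &= \tfrac{1}{2}\Bigl[\, T\bigl((u+v)^2\bigr) - T(u^2) - T(v^2) \Bigr].
\end{align*}
% Hence T(h^2) = 0 for all h forces T(u,v) = 0 for all u, v.
```

The first line uses bilinearity and the symmetry $T(u, v) = T(v, u)$; for higher $p$ the same idea needs more terms, which is what the polarization identity and the lemma in this section take care of.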
If you want to prove the theorem on extrema in a calculus course you might be reluctant to bother your students with too much multilinear algebra, so here is a simple and direct proof which I learned from my colleague Thomas Meixner [M]:

LEMMA. Let $E$, $F$ be vector spaces, $n \in \mathbb{N}$, and $T \in L^n_s(E; F)$. If $T(h^n) = 0$ for all $h \in E$, then $T = 0$.

PROOF. We proceed by induction. If $n = 1$ there is nothing to prove. So assume the claim for $n - 1$ and let $T \in L^n_s(E; F) \subset L(E, L^{n-1}_s(E; F))$. Suppose $T(h^n) = 0$ for all $h$ but $T \ne 0$. Then there must be an $a \in E$ with $0 \ne T(a) \in L^{n-1}_s(E; F)$. Since we assume the claim for $n - 1$ there must be a $b \in E$ with $T(a, b, \ldots, b) \ne 0$. By our assumption, we must have that $T((\lambda a + b)^n) = 0$ for all $\lambda \in \mathbb{R}$. Expanding by multilinearity and using the symmetry of $T$ to collect common terms, we find that

$$0 = T((\lambda a + b)^n) = \sum_{j=0}^{n} m_j\, T(a, \ldots, a, b, \ldots, b)\, \lambda^j = \sum_{j=1}^{n-1} m_j\, T(a, \ldots, a, b, \ldots, b)\, \lambda^j \quad \text{with } m_j \in \mathbb{N} \setminus \{0\},$$

because $T(b^n) = T(a^n) = 0$. Now we put $\lambda = 1, \ldots, n - 1$ and we obtain the following equation:

$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ 2 & 2^2 & \cdots & 2^{n-1} \\ \vdots & \vdots & & \vdots \\ n-1 & (n-1)^2 & \cdots & (n-1)^{n-1} \end{pmatrix} \begin{pmatrix} m_1 T(a, b, b, \ldots, b, b) \\ m_2 T(a, a, b, \ldots, b, b) \\ \vdots \\ m_{n-1} T(a, a, \ldots, a, b) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{*}$$

We denote the matrix on the left-hand side by $M_{n-1}$ and claim that $\det M_{n-1} \ne 0$: Starting from the last column, multiply the $(j-1)$-st column by $n - 1$ and subtract the result from the $j$-th column. Thus the first column remains unaltered. This gives us

$$\det M_{n-1} = \det \begin{pmatrix} 1 & 2-n & \cdots & 2-n \\ 2 & 2(3-n) & \cdots & 2^{n-2}(3-n) \\ \vdots & \vdots & & \vdots \\ n-2 & (n-2)(-1) & \cdots & (n-2)^{n-2}(-1) \\ n-1 & 0 & \cdots & 0 \end{pmatrix}.$$

Expanding the determinant according to the last row and collecting common factors we obtain

$$\det M_{n-1} = (-1)^n (n-1)(-1)^{n-2}(n-2)!\, \det M_{n-2} = (n-1)!\, \det M_{n-2}.$$

Because $\det M_2 \ne 0$, this shows what we needed. But then (*) has only the trivial solution. In particular, $m_1 T(a, b, \ldots, b) = 0$, which shows that $T(a, b, \ldots, b) = 0$. Contradiction. $\Box$

Of course, we now might wish to remove the assumption that our vector spaces be finite-dimensional. So now assume that $E$ is a Banach space. Then we may copy the above arguments with almost no changes: We simply change our notation: we now denote by $L(E, F)$ and $L^k(E; F)$ the space of all continuous ($k$-)linear maps (in the finite-dimensional case (multi-)linear maps are automatically continuous, so this is even consistent with our previous terminology). Moreover, I have deliberately chosen references ([J] and [AMR]) that actually deal with the infinite-dimensional case, and Meixner's lemma does work in any vector space. There is but one point where we really needed a finite-dimensional vector space: in part 3) of the proof we used the fact that the unit sphere is compact. The best way out of this is to require just what we need: Let us call $T \in L^k(E; \mathbb{R})$ strongly positive (negative) definite if there is a $\lambda > 0$ such that $T(h^k) \ge \lambda \|h\|^k$ (resp., $T(h^k) \le -\lambda \|h\|^k$) for all $h \in E$. The high-brow theorem then continues to hold if we replace "finite-dimensional vector space" by "Banach space," provided we insert "strongly" before "negative (positive) definite" in 3).

REFERENCES
[AMR] R. Abraham, J. E. Marsden, T. Ratiu, Manifolds, Tensor Analysis, and Applications, 2nd edition, Springer, 1988, Appl. Math. Sci. 75.
[J] Jürgen Jost, Postmodern Analysis, Universitext, Springer, Berlin et al., 1998.
[M] Thomas Meixner, Personal communication.

AUTHOR
CHRISTIAN C. FENSKE
Mathematisches Institut
Justus-Liebig-Universität Giessen
D-35392 Giessen
Germany
e-mail: [email protected]

Christian Fenske, born in Germany in 1939, studied mathematics and physics in Tübingen and Bonn. Since 1972 he has been at the Mathematical Institute in Giessen. He works on global analysis: topological fixed-point theory, and periodic orbits. Married with two children, he enjoys foreign languages and foreign travel.
Mathematical Entertainments
Michael Kleber and Ravi Vakil, Editors

This column is a place for those bits of contagious mathematics that travel from person to person in the community, because they are so elegant, surprising, or appealing that one has an urge to pass them on. Contributions are most welcome.

Please send all submissions to the Mathematical Entertainments Editor, Ravi Vakil, Stanford University, Department of Mathematics, Bldg. 380, Stanford, CA 94305-2125, USA
e-mail: [email protected]

Capitalism Overturned
Michael Kleber

Word boundaries are indicated by heavy lines. Answers extending past the right edge of the grid re-enter on the left, which may be somewhat disorienting. A solution will appear in the next issue.

[Crossword grid, squares numbered 1 to 33; the grid and the clue numbering are not reproduced.]

Across
BIKE X DOCK (MODULO 59)
REVISES YES TO BOURBAKI ___
WITHOUT LOSS OF GENERALITY
A GREEK
376/511 (BASE 2) CONFERENCE (SUFFIX)
PH.D. DIRTY ANGLE TRISECTOR, E.G. TOLD YOU SO!
IMPASSIVE ACCOUNT RECORD
ONE OF A KINE? ANTIPODAL TO SPANIARD
PC NOT AD REFERENCE WORK HAREM CONCUBINE SPHERE CONTACT AIN'T NOT OR
10, E.G., BUT NOT 11
PRONOUNCED ___, AAH! OF U - 1 ___ ARE ROUND (CORNBREAD ARE SQUARE!)
SVP (EN ANGLAIS)
DARKENED

Down
RULERS ALL FOR ONE, AND . . . THAT ONE
EUK TIP
MATH LANGUAGE REQUIREMENT PRE-JAN FARE STORAGE MEDIUM
REUBEN HERSH
The Birth of Random Evolutions

The theory of random evolutions was born in Albuquerque in the late 1960s, flourished and matured in the 1970s, sprouted a robust daughter in Kiev in the 1980s, and is today a tool or method, applicable in a variety of "real-world" ventures.

"Random evolutions" are stochastic linear dynamical systems. "Random" means, not just random inputs or initial conditions, but random media: a random process in the equation of state.

A year or two after I came to the University of New Mexico, we hired a young probabilist, Richard Griego. Griego took his Ph.D. at Urbana and was well versed in stochastic processes à la Doob. He showed me something I, a p.d.e. specialist, had never heard of at N.Y.U. or Stanford: Brownian motion can solve Laplace's equation or the heat equation! We wrote this up in a popular article [132] for the Scientific American.

Probabilistic methods worked for p.d.e.s of the parabolic or elliptic types. But I was a student of Peter Lax. I had learned to concentrate on hyperbolic equations, like the wave equation. "Can you do it for the hyperbolic case?" I asked. So far as Richard had heard, it seemed you couldn't. There was a vague impression that Doob had even proved it couldn't be done.
We organized a little seminar, with Bert Koopmans, a statistician, and Nathaniel Friedman, a young ergodicist. (Nat is now a sculptor, and organizer of mathematical art, in Albany, NY.) We also had the pleasure of interest on the part of Einar Hille, who had come to New Mexico after retiring from Yale.

By good luck, Richard discovered in Nat's bookcase a copy of Mark Kac's Magnolia Petroleum Company Lectures in Pure and Applied Science [67]. These Lectures were not in the University of New Mexico's library, nor in many other libraries. Most of their contents had been published elsewhere, but one section had never appeared in a journal. Kac considered a particle moving on a line at speed $c$, taking discrete steps of equal size, and undergoing "collisions" (reversals of direction) at random times, according to a Poisson process of intensity $a$. He showed that the expected position of the particle satisfies either of two difference equations, according to its initial direction. With correct scaling followed by a passage to the limit, the difference equations become a pair of first-order p.d.e.s. Differentiating these and adding them yields the "telegraph equation":

$$\frac{d}{dt}\left(\frac{du}{dt}\right) + a\,\frac{du}{dt} = c\,\frac{d}{dx}\left(c\,\frac{du}{dx}\right). \tag{1}$$
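Kac's model is easy to simulate. The Python sketch below uses the continuous-time version of the process (exponential waiting times between reversals) rather than Kac's discrete steps, which is an assumption of this example. It estimates the mean displacement of a particle that starts moving to the right and compares it with the closed form $c(1 - e^{-2at})/(2a)$, which follows from the fact that the sign of the velocity after time $s$ has expectation $e^{-2as}$ when reversals arrive at Poisson rate $a$. The function names, the sample size, and the seed are all illustrative choices.

```python
import math
import random


def kac_displacement(c, a, t, rng):
    """Displacement at time t of a particle with speed c, starting rightward,
    that reverses direction at the events of a Poisson process of rate a."""
    position, velocity, elapsed = 0.0, c, 0.0
    while True:
        wait = rng.expovariate(a)  # exponential time until the next reversal
        if elapsed + wait >= t:
            return position + velocity * (t - elapsed)
        position += velocity * wait
        elapsed += wait
        velocity = -velocity


def mean_displacement(c, a, t, samples=20000, seed=0):
    """Monte Carlo estimate of the expected displacement."""
    rng = random.Random(seed)
    return sum(kac_displacement(c, a, t, rng) for _ in range(samples)) / samples


def exact_mean(c, a, t):
    """Integral of c * exp(-2 a s) over 0 <= s <= t."""
    return c * (1.0 - math.exp(-2.0 * a * t)) / (2.0 * a)


if __name__ == "__main__":
    print(mean_displacement(1.0, 1.0, 1.0), exact_mean(1.0, 1.0, 1.0))
```

The estimate should agree with the closed form up to Monte Carlo error, which shrinks like the inverse square root of the number of samples.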
This is an equation of hyperbolic type. (If you drop the lower-order term, it's just the one-dimensional wave equation.) A probabilistic solution of a hyperbolic equation!

I was fascinated by Kac's little-known feat. Surely the telegraph equation couldn't be the only hyperbolic equation with a probabilistic meaning! Equally intriguing: if we can construct a probabilistic solution, we can exploit it. The central limit theorem was beckoning. If we could prove a central limit theorem for Kac's model, we'd have a limit theorem about p.d.e.s.

As I contemplated the steps by which Kac constructed his probabilistic solution of the telegraph equation, I realized that one could replace the group of translations at speed $c$ by any group of operators! Suppose $A$ is the generator of a group of linear operators, acting on a linear space $\mathcal{B}$. Instead of translations moving randomly to the left and right at speed $c$, substitute time evolutions according to generators $A$ and $-A$. In place of a particle whose position at time $t$ is the sum of random translations, we get a random element of $\mathcal{B}$, the result of successive evolutions "forward" (generator $A$) and "backward" (generator $-A$). What is the expected value of this $\mathcal{B}$-valued random process? Direct differentiation showed that in place of the classical telegraph equation, this expectation satisfies an operator differential equation. This can be obtained by simply substituting, in the classical telegraph equation, the abstract generator $A$ for $c\,d/dx$, the generator of translation at speed $c$:
$$\frac{d}{dt}\left(\frac{du}{dt}\right) + a\,\frac{du}{dt} = A^2 u.$$

We called this the "abstract telegraph equation."

For two speeds $c$ and $-c$, switching at random times according to a Poisson process is equivalent to defining the process as a two-state Markov chain. But clearly one could let the particle move at any $n$ speeds, with the switch between speeds governed by an $n$-state Markov chain. In the abstract case, instead of switching among random speeds, you could switch among $n$ semigroup generators $A_j$ operating on a given linear space $\mathcal{B}$. The result would be a random process on $\mathcal{B}$, which would in fact be a random product of random solution operators, each generated by its randomly chosen generator. Each solution operator has a time coefficient $t_k$ equal to the occupation time in the $k$th state assumed by the $n$-state Markov chain. At first we called this random product of solution operators a "random semigroup." Peter Lax pointed out that it was not really a semigroup; he suggested the name "random evolution." If the generators $A_k$ don't commute, Richard found that this random product should be written "backwards," last operator first; if we define the random evolution this way, $u(t)$, the expected value of the random evolution, satisfies a simple ordinary differential equation:

$$\frac{du}{dt} = Zu + Qu.$$

$u(t)$ is an $n$-tuple of elements from the linear space $\mathcal{B}$, indexed according to each of $n$ possible initial modes of evolution. $Z$ is a diagonal matrix of the operators $A_j$. $Q$ is a real matrix, the transition matrix of the Markov chain which controls the switching among the $A_j$. In the classical case, the $n$ semigroups are translations in $\mathbb{R}^3$. Their generators, the diagonal elements of $Z$, are first-order differential operators in the spatial variables. The equations are a hyperbolic system of first-order differential equations, coupled through $Qu$. So we called the general case, with abstract generators $A_j$, an "abstract hyperbolic system." In the special case of only two semigroups, each the negative of the other, with a symmetric mechanism of reversing direction, the abstract system of two equations is equivalent to the abstract telegraph equation written above.

For this special case we proved a limit theorem. In probability language it's called a "central limit theorem," and in differential equations language it's a "singular perturbation theorem." When a lower-order term in a differential equation has a small coefficient, it's called a regular perturbation. When a leading-order term has a small coefficient, it's called a singular perturbation. In the telegraph equation, the two leading terms have second order in $x$ and in $t$. We can't throw away $u_{xx}$, for we'd be left with an ill-posed problem. But if we throw away $u_{tt}$, we get the heat equation: a well-behaved equation which has a probabilistic solution, Brownian motion. If we can make our random linear process approach Brownian motion, surely its expected value (which satisfies the telegraph equation) will go to Brownian motion's expected value (which satisfies the heat equation). We'll have a probabilistic proof of a singular perturbation theorem for the heat and telegraph equations.

How to put a small coefficient in front of the second time-derivative, to make our process look like Brownian motion? It's not hard to guess that we must speed up both the linear motion (translation) and the switching, to make the collisions frequent. The expected time between collisions is the reciprocal of $a$, so to make the time between collisions small, multiply $a$ by a large number $R$. To speed up the linear motion, multiply the speed $c$ by another large number $S$. Then if you divide the equation by $S^2$ and choose

$$R = \frac{(cS)^2}{a},$$

you get the telegraph equation (classical or abstract!), with $u_{tt}$ divided by $S^2$. The mean free path is the speed $Sc$ times the expected time between collisions, $1/(Ra)$. This reduces to $1/(Sc)$, going to zero as $R$ and $S$ go to infinity. Easy to guess now that as $R$ and $S$ go to infinity, the expected value $u(t)$ goes to a solution of the heat equation. Since the fundamental solution of the heat equation is the distribution function of the normal random variable, it's no longer a surprise that this singular perturbation theorem for a p.d.e. can be proved by means of the central limit theorem of probability. What happens in the abstract telegraph equation, where $c\,d/dx$ is replaced by $A$? Then, of course, the solution converges to the solution of an "abstract heat equation."
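Both steps of this argument can be checked on a single Fourier mode $e^{ikx}$, where $c\,d/dx$ acts as multiplication by $ick$. The Python sketch below is an illustration under stated assumptions, not the construction used in the papers: it takes the two-state switching matrix to be $Q = \begin{pmatrix} -q & q \\ q & -q \end{pmatrix}$ with a hypothetical rate $q$, so the damping coefficient that appears is $2q$ in this normalization. The first function builds $Z + Q$ for one mode and the second shows that its characteristic polynomial is the telegraph polynomial $\lambda^2 + 2q\lambda + c^2k^2$. The third uses the scaled classical equation $S^{-2}u_{tt} + c^2u_t = c^2u_{xx}$, obtained from (1) with $a \to Ra$, $c \to Sc$, $R = (cS)^2/a$, and returns the slowly decaying rate of the mode, which should approach the heat-equation rate $-k^2$ as $S$ grows.

```python
import cmath


def two_state_generator(c, q, k):
    """Matrix Z + Q acting on one Fourier mode e^{ikx}.

    Z = diag(i c k, -i c k) represents translation at speeds +c and -c;
    Q = [[-q, q], [q, -q]] is the assumed symmetric switching matrix.
    """
    return [[1j * c * k - q, q], [q, -1j * c * k - q]]


def char_poly_coeffs(m):
    """Return (b, d) with det(lambda I - m) = lambda^2 + b lambda + d."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return -trace, det


def slow_decay_rate(c, k, S):
    """Slow root r of (1/S^2) r^2 + c^2 r + c^2 k^2 = 0.

    Written in the numerically stable form that avoids cancellation
    when 1/S^2 is tiny.
    """
    eps = 1.0 / S ** 2
    disc = cmath.sqrt(c ** 4 - 4.0 * eps * c ** 2 * k ** 2)
    return -2.0 * c ** 2 * k ** 2 / (c ** 2 + disc)


if __name__ == "__main__":
    b, d = char_poly_coeffs(two_state_generator(c=2.0, q=0.5, k=3.0))
    print(b, d)  # coefficients 2q and (ck)^2, up to rounding
    for S in (10.0, 100.0, 1000.0):
        print(S, slow_decay_rate(c=1.0, k=2.0, S=S))  # tends to -k^2 = -4
```

The second root of the scaled quadratic is of order $-c^2S^2$, so it describes a transient that dies out almost instantly; only the slow root survives in the limit.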
Prof. Hille published an announcement of our results in the Proceedings of the National Academy of Sciences [45]. Then we submitted the complete paper to the Journal of Functional Analysis. We were surprised and disappointed when it was rejected. However, the letter of rejection was twelve pages long, and contained sensible suggestions for improvement. The referee's references to "the book" left no doubt that he was William Feller. We took advantage of his advice, and the improved paper appeared in the A.M.S. Transactions [46]. I still regret that I never met Feller in person.

By another piece of luck I stumbled across the M.I.T. thesis of Mark Pinsky. Pinsky had extended Kac's work on the telegraph equation. Instead of Kac's two velocities $c$ and $-c$, Pinsky allowed $n$ arbitrary velocities, switching according to an ergodic $n$-state Markov chain. More important, Pinsky proved a central limit theorem for this $n$-state process. The identity of the limiting equation in this generality is much less obvious than in the telegraph equation that Griego and I had treated.

Mark spent two weeks in Albuquerque the following summer. My daughter baby-sat so that he and Joanna could sight-see. Mark and I undertook to extend his central limit theorem to abstract semigroups, or, what is the same thing, to extend the central limit theorem Griego and I had proved for the abstract telegraph equation to an arbitrary abstract hyperbolic system.

We assumed that the generators were mutually commutative. This was an undesirable restriction, but it was a significant step forward. It included the Kac and Griego-Hersh papers, with switching between $cA$ and $-cA$, and Pinsky's thesis, with switching between first-order spatial differential operators with constant coefficients.

We showed that in the limit of an appropriately scaled small parameter, all components of the solutions of the commutative abstract hyperbolic system converge to solutions of an abstract heat equation:

$$\frac{du}{dt} = Hu.$$

$H$ is a certain quadratic expression in the $n$ generators of the constituent semigroups. In the classical case (Pinsky's thesis), these generators are constant-coefficient first-order spatial differential operators, and $H$ is a second-order elliptic operator in $\mathbb{R}^3$ with constant coefficients. Pinsky found an abstract characterization of random evolutions in terms of multiplicative functionals, which he presented in a book [104, 107].

After the first Griego-Hersh paper was published, and again after my paper with Pinsky was published, I received letters from Tom Kurtz. Kurtz had his own abstract-space version of singular perturbation theory. He used it to reprove both the abstract telegraph equation limit theorem (Griego-Hersh) and the commutative abstract hyperbolic systems limit theorem (Hersh-Pinsky). (He did the same thing to the non-commutative limit theorem I obtained later with George Papanicolaou.) It seemed that he was determined to find non-probabilistic proofs for all our probabilistic theorems [86]. But as his career developed he became committed to probabilistic problems and methods. He obtained first-order limit theorems [85] (laws of large numbers) for random evolutions, in addition to second-order theorems (central limit theorems) of the type on which Griego and I had focused. Particularly interesting were his limit theorems for sequences of semigroups of nonlinear operators.

The following year I had a sabbatical at my alma mater, the Courant Institute. I attended a probability seminar led by Monroe Donsker. There George Papanicolaou was reporting on recent work [81, 82] of Khas'minskii. George suggested we work together. Before I returned to New Mexico we succeeded in proving limit theorems for random evolutions made of non-commuting semigroups. These proofs were more probabilistic than my earlier ones. Now we assumed that the process by which the semigroups switch has a state whose recurrence time has finite mean and variance.
(This is automatically satisfied in case of an ergodic finite-state Markov chain.) Under this hypothesis, the evolution is a random product of random factors, each of which starts and ends at the distinguished ergodic state. These random factors are independent and identically distributed. We were able to calculate their common expected value, estimate its dependence on the small parameter, and find what happens when the parameter vanishes. At one point in our work, we were stuck. A few words from S.R.S. Varadhan were sufficient to point toward the solution. As in the commutative case, all components of the "abstract hyperbolic system" which is satisfied by the expected values of the random process converge to solutions of a single "abstract heat equation," with a single "abstract second-order elliptic" operator on the right-hand side. This operator is again given by a quadratic expression in the generators of the constituent semigroups. But these do not commute, so the quadratic expression is defined by non-commutative algebra. It is given explicitly in our paper. George went on to write several more papers applying random evolutions to physical problems.

A specific, concrete example of non-commuting group generators is a collection of first-order differential operators in $\mathbb{R}^3$ with position-dependent coefficients. Each of these generates its group of translations along its family of curved trajectories in $\mathbb{R}^3$. The limiting equation is again second-order parabolic, but now with variable (position-dependent) coefficients.

In translation semigroups, whether with constant or variable coefficients, the small parameter has a very simple physical interpretation. Just as in the case of the telegraph equation discussed above, it is proportional to the mean free path, the average distance traveled between collisions. (A simple derivation of this fact is given in Hersh-Pinsky [51].) Vanishing of the small parameter (the mean free path) means a drastic transformation of the physical process, from discrete particles in collision to smooth motion in a continuum. Our convergence theorem says that in the limit of small mean free path, the expected motion of a Newtonian particle goes over to diffusion.

It may seem surprising that the limit depends only on shrinking the mean free path, not on increasing the density of particles. But physically the vanishing of the mean free path comes about by increasing the particle density. Increasing the density of particles results in a passage to the diffusion limit by making the mean free path go to zero.

While in New York I spoke at Mark Kac's seminar at Rockefeller University. This was the first time I had met Kac. Fortunately, I was able to see him several more times before he died. During the seminar he listened closely, and challenged me twice. Is the square root of the Laplacian operator really the generator of a group?
Was I right to call my formula for the limiting operator "explicit"? Both times I explained, and he yielded. In the telegraph equation (1) the term $a\,du/dt$ is what makes energy dissipate, so I called this term the "parabolic part" of the equation. This term carries the stochastic part of Kac's solution of the telegraph equation: switching according to a Poisson process with intensity $a$. If $a$ is 0, there is no switching, the "parabolic term" drops out, and the equation reduces to the one-dimensional wave equation.
The presence of the "parabolic term", $a\,du/dt$, is what makes possible a probabilistic solution of this hyperbolic equation. Kac leaned forward and exclaimed, "That's right!" A treasured and memorable moment for me.

Later Richard Griego reminded me that in [68] Kac wrote of Norbert Wiener: "What is really surprising is that he was apparently unaware of the intimate connections between his measure and his own work on potential theory. It is now almost universally known that the generalized Wiener Perron capacitory potential of a closed set F (in Euclidean space of dimension 3 or higher) at a point p is the Wiener measure of the set of paths which originate from p and which at some time hit F. Moreover, Wiener's famous criterion for regularity of boundary points has a most appealing interpretation in terms of his measure. The surprise is heightened if one recalls that Wiener's work on potential theory was almost simultaneous with that on Brownian motion."

Griego reminded me of this quotation because of the instructive fact that what Kac said of Wiener with regard to potential theory, he could have said of himself with regard to random evolutions. Kac was a great advocate of function space integrals as a tool for solving p.d.e.s: the Feynman-Kac formula [68]. And Kac found a probabilistic solution of the telegraph equation [67]. But he didn't connect his solution of the telegraph equation with his function-space integrals. Had he done so, his use of an approximation by a finite difference equation would have become unnecessary, and he could have done the whole random evolution thing by himself, if he had cared for abstract spaces and operators!

When I returned to New Mexico after my sabbatical leave, I was fortunate again. My colleague Bob Cogburn (one of Michel Loève's three Ph.D. students) became interested in random evolutions. With his mastery of probabilistic estimation, it was possible to carry all earlier asymptotic results on random evolutions to their natural generalization [16]. The random process need not be finite or even discrete. The family of semigroups need not be finite or countable. The random process controlling the choice of semigroups need not be Markovian; it's sufficient that it satisfy a "mixing condition" (be almost independent when separated by long time intervals). Shortly afterward, a comparable result was published by Papanicolaou and Varadhan [102]. Under some extra hypotheses they also got a rate of convergence. Also around the same time Richard Ellis and Walter Rosenkrantz published a paper in which Kac's particle is restricted to a bounded interval, with appropriate boundary conditions [30, 31]. They obtained a central limit theorem for this case.

In 1972 I was asked by the Rocky Mountain Mathematics Consortium to organize a summer meeting on stochastic differential equations. This was part of a series of meetings which had been supported by a grant of $25,000 a year from the National Science Foundation. The year I became responsible, the N.S.F. changed its policy, and gave no money at all. We had planned to meet in Santa Fe, NM. Fortunately, the Consortium included a Canadian school, the
University of Alberta, in Edmonton. Jack Macki, Alberta's representative to the Consortium, convinced his university to contribute $10,000. There was also a small additional donation from the Provincial Government of Alberta. Because of this Canadian support, we had our meeting in Edmonton. Richard Griego took a lot of responsibility on this project. The participants seemed glad to pay their own way to Edmonton. Mark Kac and Wendell Fleming were invited speakers. The proceedings became a special issue of the Rocky Mountain Mathematics Journal.

Two papers by Griego, one [47] with me and one [50] with Andrzej Korzeniowski, applied random evolutions to a seemingly remote area of pure mathematics: the spectral theory of elliptic operators. We were able to compute the asymptotics of the spectrum of a certain class of degenerate elliptic operators (merely non-negative definite, not strictly positive-definite) by solving an associated equation as a random evolution.

Our students carried forward with random evolutions. We had limited ourselves to continuous evolutions, with jumps only in the derivatives (the "generator"). Mark Pinsky's student Bob Kertz obtained a central limit theorem for evolutions with discontinuities. Richard Griego's student Manuel Keepler was prolific. He wrote about duality and time reversals. David Heath, while still a grad student at Urbana, moved in quite an unexpected direction. He found a probabilistic model for the telegraph equation with space-dependent coefficients. Unfortunately, his thesis remains unpublished. Tom Kurtz's student Joe Watkins published impressive papers carrying the limit theorems further than before.

Pinsky, Cogburn, and Papanicolaou went on to create different kinds of stochastic mathematics. Pinsky brought in differential geometry, and studied stochastic processes on manifolds (see his book [109]). Cogburn and his students did detailed analysis of different kinds of stochastic processes in different kinds of random environments. ("Random environment" means something rather similar to "random evolution.") George Papanicolaou published very extensively in random media, and on "homogenization." In homogenization, the random variation takes place in space rather than in time. An example could be sound passing through a random alternation of layers having different acoustic properties.
If there are a great many very thin layers (in an appropriate scaling, of course), the process is approximated by a "homogenized" medium.

Other interesting contributions came from Koopmans's student Donald Quiring, Griego's student John Hagood, and applied mathematicians Steven R. Dunbar, G. A. Becus, and O. Iordache.

I felt that the joint paper with Cogburn had completed my romance with random evolutions, getting full generality in both the operators and the random switching. (Unlike some papers by later authors, the work reported here includes unbounded semigroups as well as bounded ones.)

Unknown to me and my American collaborators, our work was noticed in Kiev. Vladimir Korolyuk, a distinguished probabilist and academician, and his pupil Anatolyi Swishchuk, undertook a massive research into random evolutions controlled by semi-Markov processes. In this generality they answered all the questions we had asked, and many others we had neglected.

Semi-Markov processes, like Markov processes, "have no memory." The mode of evolution you jump into after a "collision" depends only on what mode you are in at the time of the collision. The difference is that, for Markov processes, the waiting time between collisions is exponentially distributed. In the semi-Markov theory, this restriction is relaxed. The waiting time can be prescribed according to the particular application in view. Swishchuk's two recent books [125, 126] give scores of references in physics, biology, and especially in modern finance. (Swishchuk is now at York University, in Toronto.)

I only learned about this work in 1992, years after it had been going on. "Out of the blue" I received an invitation to the third (!!) conference on random evolutions, to be held on the shore of the Black Sea, in Katsively, where the Ukrainian Academy of Science has a beautiful resort. I took two weeks off from teaching to go to the Ukraine. First to Kiev, which to my surprise is a beautiful city on the shore of the magnificent Dnieper River; then to the stimulating conference in beautiful Katsively (I had the difficult role of "important guest"); then back to Kiev. From there, Swishchuk took me south on a long bus and car trip, to the forest village of Butznivets, which my mother Malke had left in 1919. The big lake is there, just as my mother described it. We found two or three old ladies herding a few cows down the forest road. Swishchuk translated for me.

"Do you remember Malke Shluger? Her father, Sholom Shluger? The store her grandmother kept? The Weinbergs? The Tobacks?"

"There are no Jews here. The Germans killed them. A few ran away."

More Ukrainian conversation between the ladies and Tolya Swishchuk.

"Those houses used to be Jew houses."

I entered one of the houses. It had been burned out, and then used as a barn. I took pictures of the abandoned rooms. We drove away.

REFERENCES
I recommend [125, 126] by Swishchuk, and my own surveys [59] and [58]. Chapter 12 of [32] by Ethier and Kurtz is a fine exposition of the subject, as is Mark Pinsky's outstanding book [109].

1. L. Baggett and D. Stroock, An ergodic theorem for Poisson processes on a compact group with applications to random evolutions, J. Func. Anal. 16 (1974), 404-414.
2. G. A. Becus, Wave propagation in imperfectly periodic structures: a random evolution approach, J. Appl. Math. & Phys. (ZAMP) 29 (1978), 252-260.
3. G. A. Becus, Stochastic prey-predator relationships: a random evolution approach, Bull. Math. Biol. 41 (1979), no. 1, 91-100.
4. G. A. Becus, Random evolutions and stochastic compartments, Math. Biosci. 44 (1979), no. 3-4, 241-254.
5. G. A. Becus, Homogenization and random evolutions; applications to the mechanics of composite materials, Quart. Appl. Math. 39 (1979-1980), no. 3, 209-217.
6. A. T. Bharucha-Reid, Random integral equations, Academic Press, New York, 1972.
7. G. Birkhoff and R. E. Lynch, Numerical solutions of the telegraph and related equations, in Numerical Solutions of Partial Differential Equations, Proc. Symp. U. Md., Academic Press, New York, 1966.
8. R. Burridge and G. Papanicolaou, The geometry of coupled mode propagation in one-dimensional random media, Comm. Pure & Appl. Math. XXV (1972), 715-757.
9. J. Chabrowski, Les solutions non négatives d'un système parabolique d'équations, Ann. Polon. Math. 19 (1967), 193-197.
10. R. Cogburn, A uniform theory for sums of Markov chain transition probabilities, Ann. Prob. 3, no. 3 (1975), 191-214.
11. R. Cogburn, Markov chains in random environments, the case of Markovian environments, Ann. Prob. 8, no. 5 (1980), 989-916.
12. R. Cogburn, Recurrence vs. transience for spatially inhomogeneous birth and death in a random environment, Z. Wahrsch. Gebiete 16; 1 (1982), 153-160.
13. R. Cogburn, The ergodic theory of Markov chains in random environments, Z. Wahrsch. Gebiete 66 (1984), 109-128.
14. R. Cogburn and R. D. Bourgin, On determining absorption probabilities for Markov chains in random environments, Adv. Appl. Prob. 13 (1981), 369-387.
15. R. Cogburn and W. Torrez, Birth and death processes with random environments in continuous time, J. Appl. Prob. 18 (1981), 19-30.
16. R. Cogburn and R. Hersh, Two limit theorems for random differential equations, Indiana U. Math. J. 22 (1973), 1067-1089.
17. J. E. Cohen, Random evolutions and the spectral radius of a non-negative matrix, Math. Proc. Camb. Phil. Soc. 86 (1979), 345-350.
18. J. E. Cohen, Random evolutions in discrete and continuous time, Stoch. Proc. Appl. 9 (1979), 245-251.
19. J. E. Cohen, Eigenvalue inequalities for products of matrix exponentials, Linear Alg. and Appl. 45 (1982), 55-95.
20. J. E. Cohen, Eigenvalue inequalities for random evolutions: origins and open problems, Ineq. in Stat. and Prob., IMS Lecture Notes Monograph Series, 5 (1984), 41-53.
21. J. Corona-Burgueno, A model of branching processes with random environments, Bol. Soc. Mat. Mex. 2 (1976), no. 1, 15-27.
22. C. DeWitt-Morette and Sang-Ir-Gwo, Two pin groups.
23. C. DeWitt-Morette and S. K. Foong, Path integral solutions of wave equations with dissipation, Dept. Phys. & Center for Relativity, U. of Texas, Austin.
24. C. DeWitt-Morette and S. K. Foong, Phys. Rev. Let. 62 (1989), 2201-2204.
25. C. DeWitt-Morette and S. K. Foong, Kac's solution of the telegrapher equation, revisited, Part I, preprint, 1990.
26. S. R. Dunbar, A branching random evolution and a nonlinear hyperbolic equation, SIAM J. Appl. Math. 48 (1988), no. 6, 1510-1526.
27. R. S. Ellis, Limit theorems for random evolutions with explicit error estimates, Z. Wahrsch. Gebiete 28 (1974), 249-256.
28. R. S. Ellis and M. A. Pinsky, Limit theorems for model Boltzmann equations with several conserved quantities, preprint.
29. R. S. Ellis and M. A. Pinsky, The first and second fluid approximations to the linearized Boltzmann equation, J. Math. Pure & Appl. 54, 125-156.
30. R. S. Ellis and W. A. Rosenkrantz, Diffusion approximation for transport processes with boundary conditions, Indiana U. Math. J. 26, no. 16 (1977), 1075-1096.
31. R. S. Ellis and W. A. Rosenkrantz, A class of transport processes with boundary conditions, preprint, 1978.
32. S. N. Ethier and T. G. Kurtz, "Random evolutions," ch. 12 in Markov Processes Characterization and Convergence, Wiley, N.Y., 1986.
33. W. H. Fleming, A problem of random accelerations, MRC Technical Summary Report 403 (June 1963).
34. S. K. Foong, "Kac's solution of the telegrapher equation, revisited, Part II", in Developments in General Relativity, Astrophysics and Quantum Theory, eds. F. I. Cooperstock et al., I.O.P. Publishing, Bristol, 1990, pp. 351-366.
35. S. K. Foong, Path integral solution for telegrapher equation, 1992, preprint.
36. S. Goldstein, On diffusion by discontinuous movements and on the telegraph equation, Quart. J. Mech. Appl. Math. 4 (1951), 129-156.
37. L. G. Gorostiza, An invariance principle for a class of d-dimensional polygonal random functions, Trans. A.M.S. 177 (1973).
38. L. G. Gorostiza, The central limit theorem for random motions of d-dimensional Euclidean space, Ann. Prob. 1 (1973), 603.
39. L. G. Gorostiza and R. J. Griego, Convergence of d-dimensional transport processes with radially symmetric direction changes, preprint, 1977.
40. L. G. Gorostiza and R. J. Griego, Strong approximation of diffusion processes by transport processes, J. Math. Kyoto U. 19, no. 1 (1979), 91-103.
41. R. J. Griego, Dual random evolutions, U.N.M. Tech. Rept. no. 301, September 1974.
42. R. J. Griego, "Dual multiplicative operator functionals," Prob. Meth. in Diff. Eqns. (Proc. Conf. U. of Victoria, 1974), pp. 156-162, Lecture Notes in Mathematics, vol. 451, Springer, Berlin, 1975.
43. R. J. Griego, Limit theorems for a class of multiplicative operator functionals of Brownian motion, Rocky Mtn. J. Math. 4, no. 3 (1974), 435-441.
44. R. J. Griego, D. Heath, and A. Ruiz-Moncayo, Almost sure convergence of uniform transport processes to Brownian motion, Ann. Math. Stat. 42 (1971), 1129-1131.
45. R. J. Griego and R. Hersh, Random evolutions, Markov chains, and systems of partial differential equations, Proc. Natl. Acad. Sci. U.S.A. 62 (1969), 305-308.
46. R. J. Griego and R. Hersh, Theory of random evolutions with applications to partial differential equations, Trans. A.M.S. 156 (1971), 405-318.
47. R. J. Griego and R. Hersh, Weyl's theorem for certain operator-valued potentials, Indiana U. Math. J. 27, no. 2 (1978), 195-209.
48. R. J. Griego and A. Moncayo, Random evolutions and piecing out of Markov processes, Bol. Soc. Mat. Mex. 15 (1970), 22-29.
49. R. J. Griego and A. Korzeniowski, On principal eigenvalues for random evolutions, Stoch. Anal. Appl. 7 (1989), no. 1, 35-45.
50. R. J. Griego and A. Korzeniowski, Asymptotics for certain Wiener integrals associated with higher-order differential operators, Pac. J. Math. 142, no. 1 (1990), 41-48.
51. J. W. Hagood, The operator-valued Feynman-Kac formula with non-commutative operators, J. Func. Anal. 38 (1980), no. 1, 99-117.
52. D. C. Heath, Probabilistic analysis of hyperbolic systems of partial differential equations, Dissertation, Univ. of Illinois, 1969.
53. R. Hersh, Mixed problems in several variables, J. Math. Mech. 12, no. 3 (1963).
54. R. Hersh, Boundary conditions for equations of evolution, Arch. Rat. Mech. Anal. 21, no. 5 (1966).
55. R. Hersh, A class of central limit theorems for convolution products of generalized functions, Trans. A.M.S. 140 (1969), 71-75.
56. R. Hersh, Maxwell's coefficients are conditional probabilities, Proc. A.M.S. 44 (1974), 449-453.
57. R. Hersh, Introduction to a special issue on stochastic differential equations, Rocky Mt. J. Math. 4 (1974).
58. R. Hersh, Random evolutions: a survey of results and problems, Rocky Mt. J. Math. 4 (1974), 443-477.
59. R. Hersh, Stochastic solutions of hyperbolic equations, in Part. Diff. Eq. & Related Topics, Springer-Verlag Lecture Notes in Math. No. 446 (1975), 283-300.
60. R. Hersh and G. Papanicolaou, Non-commuting random evolutions and an operator-valued Feynman-Kac formula, Comm. Pure & Appl. Math. XXX (1972), 337-367.
61. R. Hersh and M. Pinsky, Random evolutions are asymptotically Gaussian, Comm. Pure & Appl. Math. XXV (1972), 33-44.
62. M. Hitsuda and A. Shimizu, A central limit theorem for additive functionals of Markov processes and the weak convergence to Wiener measure, J. Math. Soc. Japan 22, no. 4 (1970).
63. T. Hosoda, On the principal eigenvalue of an elliptic system of second order differential operators, Ph.D. dissertation, U. of New Mexico, July 1988.
64. O. Iordache, Polystochastic models in chemical engineering, VNU Science Press, Utrecht, the Netherlands, 1987.
65. B. Jefferies, Semigroups and diffusion process I, Centre for Mathematical Analysis Research Report, Canberra.
66. B. Jefferies, Evolution processes and the Feynman-Kac formula, Centre for Mathematical Analysis Research Report, Canberra.
67. M. Kac, Some stochastic problems in physics and mathematics, Magnolia Petroleum Co., Lectures in Pure and Applied Science No. 2 (1956).
68. M. Kac, Wiener and Integration in Function Spaces, Bull. A.M.S. 72 (1, II) (1966).
69. M. Kac, A stochastic model related to the telegrapher's equation, Rocky Mt. J. Math. 4, no. 3 (1974), 497-509.
70. M. Kac, Probabilistic methods in some problems of scattering theory, Rocky Mt. J. Math. 4, no. 3 (1974).
71. S. Kaplan, Differential equations in which the Poisson process plays a role, Bull. A.M.S. 70 (1964), 264-268.
72. M. Keepler, Backward and forward equations for random evolutions, Indiana U. Math. J. 24, no. 10 (1975), 937-949.
73. M. Keepler, Perturbation theory for backward and forward random evolutions, J. Math. Kyoto U. 126, no. 2 (1976), 395-411.
74. M. Keepler, On random evolutions induced by countable state space Markov chains, Portugalia Math. 37 (1978), no. 3-4, 203-207.
75. M. Keepler, Random evolutions are semigroup Markov chains.
76. M. Keepler, Limit theorems for commuting and non-commuting forward and backward random evolutions on ergodic Markov chains.
77. R. Kertz, Limit theorems for discontinuous random evolutions, with applications to initial-value problems, and to Markov chains on N lines, Ann. Prob. 2 (1974), 1045-1064.
78. R. Kertz, Perturbed semi-group limit theorems with applications to discontinuous random evolutions, Trans. A.M.S. 199 (1974), 25-53.
79. R. Kertz, Random evolutions with underlying semi-Markov processes, Publ. Res. Inst. Math. Sci. 14, no. 3 (1978), 589-614.
80. R. Kertz, Limit theorems for semi-groups with perturbed generators, with applications to multi-scaled random evolutions, J. Func. Anal. 27 (1978), 215-233.
81. R. Z. Khas'minskii, On stochastic processes defined by differential equations with a small parameter, Th. Prob. & Appl. XI (1966), 211-228.
82. R. Z. Khas'minskii, A limit theorem for the solutions of differential equations with random right-hand sides, Th. Prob. & Appl. XI (1966), 390-406.
83. R. Kubo, Stochastic Liouville equation, J. Math. Phys. 4 (1963), 174-183.
84. T. G. Kurtz, A random Trotter product formula, Proc. A.M.S. 35 (1972), 147-154.
85. T. G. Kurtz, A limit theorem for perturbed operator semigroups with applications to random evolutions, J. Func. Anal. 12 (1973), 55-67.
86. T. G. Kurtz, Convergence of sequences of semigroups of nonlinear operators with an application to gas kinetics, Trans. A.M.S. 186 (1974), 259-272.
87. M. Lax, Classical noise IV: Langevin methods, Rev. Mod. Phys. 38 (1966), 561-566.
88. J. A. Morrison, G. C. Papanicolaou, and J. B. Keller, Analysis of some stochastic ordinary differential equations, SIAM-AMS Proc. VI (1973), 97-161.
89. J. A. Morrison, G. C. Papanicolaou, and J. B. Keller, Mean power transmission through a slab of random medium, Comm. Pure & Appl. Math. XXIV (1971), 473-489.
90. G. C. Papanicolaou, Motion of a particle in a random field, J. Math. Phys. 12 (1971), 1491-1496.
91. G. C. Papanicolaou, Wave propagation in a one-dimensional random medium, SIAM J. Appl. Math. 21 (1971), 13-18.
92. G. C. Papanicolaou, A kinetic theory for power transfer in stochastic systems, J. Math. Phys. 13 (1972), 1912-1918.
93. G. C. Papanicolaou, Asymptotic analysis of transport processes, Bull. A.M.S. 81, no. 2 (1975), 330-392.
94. G. C. Papanicolaou, Stochastic equations and their applications, Amer. Math. Monthly 80 (1973), 526-544.
95. G. C. Papanicolaou, Some probabilistic problems and methods in singular perturbations, Rocky Mt. J. Math. 4 (1976), 653-674.
96. G. C. Papanicolaou and R. Hersh, Some limit theorems for stochastic equations and applications, Indiana U. Math. J. 21 (1972), 815-840.
97. G. C. Papanicolaou and J. B. Keller, Stochastic differential equations with applications to random harmonic oscillators and wave propagation in random media, SIAM J. Appl. Math. 21 (1971), 287-305.
98. G. C. Papanicolaou and W. Kohler, Asymptotic analysis of deterministic and stochastic equations with rapidly varying components, Comm. Math. Phys. 45 (1975), 217-232.
99. G. C. Papanicolaou and W. Kohler, Asymptotic theory of mixing stochastic ordinary differential equations, Comm. Pure & Appl. Math. 112, no. 7 (1974), 641-668.
100. G. C. Papanicolaou, D. McLaughlin, and R. Burridge, A stochastic Gaussian beam, J. Math. Phys. 14 (1973), 84-89.
101. G. C. Papanicolaou, D. W. Stroock, and S. R. S. Varadhan, Martingale approach to some limit theorems, Conf. on Statistical Mechanics, Dynamical Systems & Turbulence, M. Reed editor, Duke U. Math. Series, 3, Durham, NC, 1977.
102. G. C. Papanicolaou and S. R. S. Varadhan, A limit theorem with strong mixing in Banach space and two applications to stochastic differential equations, Comm. Pure & Appl. Math. 26 (1973), 497.
103. A. A. Pichardo-Maya, Brownian motion in a random environment, Dissertation, U. of New Mexico, 1985.
104. M. A. Pinsky, Random evolutions, in Prob. Meth. in Diff. Eq., Lecture Notes in Math. 451, Springer Verlag, New York, 1975.
105. M. Pinsky, Differential equations with a small parameter and the central limit theorem for functions defined on a finite Markov chain, Z. Wahrsch. Gebiete 9 (1968), 101-111.
106. M. Pinsky, Multiplicative operator functionals of a Markov process, Bull. A.M.S. 77 (1971), 377-380.
107. M. Pinsky, Stochastic integral representation of multiplicative operator functionals of a Wiener process, Trans. A.M.S. 167 (1972), 89-104.
108. M. Pinsky, Multiplicative operator functionals and their asymptotic properties, Adv. in Prob. 3 (1974), 1-100.
109. M. Pinsky, Lectures on random evolutions, World Scientific, Singapore, 1991.
110. D. Quiring, Random evolutions on diffusion processes, Z. Wahrsch. Gebiete 23 (1972), 230-244.
111. R. Rishel, Dynamic programming and minimum principles for systems with jump Markov disturbances, SIAM J. Control 13 (1975), 338-371.
112. S. I. Rosencrans, Diffusion transforms, J. Diff. Eq. 13 (1973), 457-467.
113. G. Schay, Notices A.M.S., no. 147 (1973), Abstract 70-60-5.
114. A. Schoene, Semi-groups and a class of singular perturbation problems, Indiana U. Math. J. 20 (1970), 247-263.
115. K. Siegrist, Random evolution processes with feedback, Trans. A.M.S. 265, no. 2 (1981), 375-392.
116. K. Siegrist, Harmonic functions and the Dirichlet problem for renewed Markov processes, Ann. Prob. 11, no. 3 (1983), 624-634.
117. D. Stroock, Two limit theorems for random evolutions having non-ergodic driving processes, Proc. Conf. Stoch. D. E. and Appl. (Park City, Utah, 1976), 24-253, Academic Press, New York, 1977.
118. A. Swishchuk, Markov random evolutions, IV Soviet-Japan Symp. on Prob. Th. & Math. Stat., Abstracts, Tbilisi, 1982, p. 39-40 (with V. S. Korolyuk, A. F. Turbin).
119. A. Swishchuk, Limit Theorems for semi-Markov random evolutions in an asymptotic phase merging scheme, Dissertation, Kiev, Inst. Math., 1985, 116 pp.
120. A. Swishchuk, Applied problems of theory of random evolutions, Znanie Publishing House, Ukrainian SSR, Kiev, RDENTP, 30 pp. (with V. S. Korolyuk).
121. A. Swishchuk, Semi-Markov random evolutions: a survey of the recent results, Conf. Trans. 11th Prague conf., 1991, 12 pp. (to appear).
122. A. Swishchuk, Random evolutions: a survey of results and problems since 1969, Random Op. & Stoch. Eq., VSP, no. 3 (1991) (to appear).
123. A. Swishchuk, Semi-Markov random evolutions, Kluwer AP, Dordrecht (with V. S. Korolyuk).
124. V. S. Korolyuk and A. Swishchuk, Evolution of Systems in Random Media, CRC Press, Boca Raton, 1995.
125. A. Swishchuk, Random Evolutions and their applications, Kluwer AP, Dordrecht, 1997.
126. A. Swishchuk, Random Evolutions and their applications. New Trends, Kluwer AP, Dordrecht, 2000.
127. A. Swishchuk and Jianhong Wu, Evolution of Biological Systems in Random Media. Limit Theorems and Stability, Kluwer AP, Dordrecht (submitted).
128. J. C. Watkins, A central limit problem in random evolutions, Ann. Prob. 12, no. 2 (1984), 480-513.
129. J. C. Watkins, A stochastic integral representation for random evolutions, Ann. Prob. 13, no. 2 (1985), 531-557.
130. J. C. Watkins, Limit theorems for stationary random evolutions, Stoch. Proc. & Appl. 19 (1985), 189-224.
131. J. C. Watkins, Limit theorems for products of random matrices; A comparison of two points of view, preprint, 1986.
132. R. J. Griego and R. Hersh, Brownian motion and potential theory, Scientific American 220, no. 3 (1969), 66-74.

AUTHOR

REUBEN HERSH
1000 Camino Rancheros
Santa Fe, NM 87501
USA
e-mail: [email protected]

Reuben Hersh has had many roles in his long career: poet (a half-century ago he won the Lloyd M. Garrison prize for poetry by an undergraduate two years in succession); machinist; mathematician; and in recent years especially, opinionated writer about mathematics.
David E. Rowe, Editor

Hermann Weyl, the Reluctant Revolutionary
David E. Rowe

Send submissions to David E. Rowe, Fachbereich 17 - Mathematik, Johannes Gutenberg University, D55099 Mainz, Germany.
"Brouwer - that is the revolution!" - with these words from his manifesto "On the New Foundations Crisis in Mathematics" [Weyl 1921], Hermann Weyl jumped headlong into ongoing debates concerning the foundations of set theory and analysis. His decision to do so was not taken lightly: this dramatic gesture was bound to have immense repercussions, not only for him but for many others within the fragile and politically fragmented European mathematical community. Weyl felt sure that modern mathematics was going to undergo massive changes in the near future. By proclaiming a "new" foundations crisis, he implicitly acknowledged that revolutions had transformed mathematics in the past, even uprooting the entire edifice of mathematical knowledge. At the same time he drew a parallel with the "ancient" foundations crisis commonly believed to have been occasioned by the discovery of incommensurable magnitudes, a finding that overturned the Pythagorean world view based on the doctrine "all is Number." In the wake of the Great War that changed European life forever, the zeitgeist appeared ripe for something similar, but even deeper and more pervasive.

Still, revolutions cannot occur without revolutionary leaders and ideologies, and these Weyl came to recognize in Egbertus Brouwer and his philosophy of mathematics, which Brouwer originally called "neo-intuitionism" (in deference to Poincaré's intuitionism, see [Dalen 1995]). Weyl had known Brouwer personally since 1912, and had studied his novel contributions to geometric topology as a prelude to writing Die Idee der Riemannschen Fläche [Weyl 1913]. But the Brouwer he and most others knew back then was the brilliant topologist, not the mystic intuitionist Dirk van Dalen acquaints us with in his rich biography [Dalen 1999]. Weyl simply had not known the whole Brouwer, and probably never did. True, he regarded him as a kindred philosophical spirit, but he seems never to have referred to Brouwer's Leven, Kunst en Mystiek (Life, Art, and Mysticism) or any of his other more general philosophical writings, presumably because he never read them (all were written in Dutch). If so, this surely precluded any chance of fully understanding the vision behind Brouwer's views. Nevertheless, he was swept off his feet both by Brouwer's personality and by his revolutionary message for mathematics.

Weyl had been teaching since 1913 at the ETH in Zurich (on his career there, see [Frei and Stammbach 1992]). His conversion experience took place in the summer of 1919 while vacationing in the Engadin, where Brouwer, too, was staying. Their encounter was brief, lasting only a few hours, but long enough for Weyl to see the light. Afterward, Brouwer lent him a copy of his 1913 lecture on "Formalism and Intuitionism," but Weyl returned it, commenting that he already had "a copy . . . from the old days," presumably an allusion to the pre-revolutionary era. He further confessed that "at the time I did not pay attention to it or understand it. . . ." (Weyl to Brouwer, 6 May 1920, quoted in [Dalen 1999, p. 320]), a remark befitting a new disciple of the faith.

Discipleship played a crucial role in the social relations among the mathematicians of this era, and no one felt this more keenly than Hermann Weyl when he studied under David Hilbert in Göttingen. Hilbert's aura as a youth leader, the "Pied Piper of Mathematics," was perhaps the most distinctive quality that separated him from all his contemporaries. Weyl must have felt a mixture of guilt and relief when, as he later described it, "during a short vacation spent together, I fell under the spell of Brouwer's personality and ideas and became an apostle of his intuitionism" (Weyl Nachlass, Hs 91a: 17). Even young Bertus Brouwer was
strongly attracted by Hilbert's alluring persona. He spent a considerable amount of time with him during the summer of 1909 when Hilbert was vacationing in Scheveningen, a seaside resort town near the Hague. The first personal encounter left a deep impression on Brouwer, as he related to his friend, the poet Adama van Scheltema: "This summer the first mathematician of the world was in Scheveningen; I was already in contact with him through my work, but now I have repeatedly made walks with him, and talked as a young apostle with a prophet. He was only 46 years old, but with a young soul and body; he swam vigorously and climbed walls and barbed wired gates with pleasure. It was a beautiful new ray of light through my life." (Brouwer to Adama van Scheltema, 9 November 1909, quoted in [Dalen 1999, p. 128]).

Brouwer had already criticized Hilbert's axiomatic methods in his doctoral dissertation, submitted in 1907, where he concluded "that it has nowhere been shown, that if a finite number has to satisfy a system of conditions of which it can be proved that they are not contradictory then the number indeed exists" (quoted in [Dalen 2000, p. 127]). For his part, Hilbert clearly recognized that the axiomatic method could never show more than consistency, but he emphatically asserted that this was all a
mathematician needed to prove in order to assert that a mathematical object exists. As van Dalen has observed, it would not have been like Brouwer to pass up this golden opportunity to explain his foundational ideas to Hilbert firsthand. Unfortunately, neither apparently left any notes of what they talked about while strolling through the sand dunes of Scheveningen, but nearly twenty years later Brouwer did refer to these discussions while lamenting that Hilbert had in the meantime appropriated some of his key intuitionist principles [Brouwer 1928].

In May 1920, the ink of his "New Crisis" manuscript barely dry, Weyl sent it off to Brouwer along with the above-cited letter in which he explained his motives. "It should not be viewed as a scientific publication," he informed his new-found ally, "but rather as a propaganda pamphlet, thence the size. I hope that you will find it suitable for this purpose, and moreover suited to rouse the sleepers . . . ." [Dalen 1999, p. 320]. That it certainly did. Weyl's provocative broadside caused the long-bubbling cauldron of doubts about set theory and analysis to boil over into what came to be known as the modern "foundations crisis," a slogan taken directly from the title of this essay. Brouwer responded with almost gleeful delight: "your wholehearted assistance has given me an infinite pleasure. Reading your manuscript was a continuous delight and your exposition, it seems to me, will also be clear and convincing for the public . . . " [Dalen 1999, p. 321].

[Photo: Brouwer and his wife communing with nature in their garden (from [Dalen 1999], p. 63).]

Among such delights was Weyl's use of politically inspired metaphor to convey a heightened sense of urgency. The antinomies of set theory, he wrote, had once been regarded as "border conflicts" in "the remotest provinces of the mathematical empire" [Weyl 1921, p. 143]. But now they could be seen as symptomatic of a deep-seated problem, till now "hidden at the center of the superficially glittering and smooth activity," but which betrayed "the inner instability of the foundations upon which the structure of the empire rests" (ibid.). Weyl likened the ontological status of objects whose "existence" depends on proof by reductio ad absurdum to currency notes in a "paper economy," whereas true mathematical existence was surely a "real value, comparable to food products in the national economy." Nevertheless, "we mathematicians seldom think of cashing in this 'paper money.' The existence theorem is not the valuable thing, but rather the construction carried out in the proof. Mathematics, as Brouwer on occasion has said, is more an activity than a theory" [Weyl 1921, p. 157].

With Bismarck's mighty German Empire now in shambles, Weyl clearly thought that the empire of modern analysis would soon fall, too. Its mighty fortress in Göttingen, led by the fearless and often ferocious Hilbert, had weathered all assaults up until now, but Weyl saw its walls cracking and prognosticated that they would soon come a crumblin' down, while the sage of Amsterdam stood ready to ride in and assume power. Weyl's defection to his intuitionist camp was clearly undertaken in order to tip the scales in the Dutchman's favor, thereby preparing the overthrow of the old regime.
His manifesto, penned during the period of the abortive Kapp Putsch and its aftermath, reflected the mood of the times, when thoughts of revolution and counter-revolution abounded in Weimar Germany. Its principal target, of course, was his former mentor, Hilbert, who needed no rousing to see what was at stake. He struck back quickly, hard, and with plenty of polemical punch:

What Weyl and Brouwer are doing amounts, in principle, to a walk along the same path that Kronecker once followed: they are attempting to establish the foundations of mathematics by throwing everything overboard that appears uncomfortable to them and erecting a dictatorship [Verbotsdiktatur] à la Kronecker. This amounts to dismembering our science, which runs the risk of losing a large part of our most valuable possessions. Weyl and Brouwer ban the general concept of irrational number, function, . . . the Cantorian numbers of higher number classes, etc.; the theorem that among infinitely many whole numbers there is always a smallest, and even the logical "Tertium non datur" . . . are examples of forbidden theorems and arguments. I believe, that just as earlier when Kronecker failed to do away with irrational numbers . . . so, too, today will Weyl and Brouwer not succeed; no: Brouwer is not, as Weyl contends, the revolution but rather only the repetition of a Putsch attempt with old means. If earlier it was carried out more sharply and still completely lost out, now, with the state so well armed and protected through Frege, Dedekind, and Cantor, it is doomed to failure from the outset [Hilbert 1922, pp. 159-160].

[Photo: Hermann Weyl, circa 1910, around the time he first became skeptical of Zermelo's axioms for set theory.]
This tense encounter, pitting the all-powerful Hilbert against his most gifted pupil, stands out as one of the more dramatic episodes of twentieth century mathematics. Yet despite all its high drama, Hermann Weyl's commitment to Brouwer's intuitionist program soon lost its intensity. By the mid-twenties, Brouwer had put an immense amount of energy into his program for revolutionizing mathematics, while in Göttingen Hilbert and Paul Bernays were just as busy developing proof theory as a bulwark of defense for classical analysis. By 1924, Brouwer had proved a series of results culminating in the theorem that every full function is uniformly continuous. Because these intuitionist findings had no counterparts in classical mathematics, Brouwer, who wasn't one to mince words, concluded that "classical mathematics is contradictory" [Dalen 1999, p. 376].

So where was Weyl? He largely stood by and watched this lively action from the sidelines, albeit with considerable interest. His flirtation with intuitionism seemed to many just that, a fleeting affair doomed from the start to end in disappointment. Weyl felt differently. To understand why, it will be helpful to glance back at his earlier interest not only in foundations of analysis but in mathematical physics as well. It was hardly an accident that these two fields coincided with Hilbert's principal research interests after 1910, as both men shared high hopes for breakthroughs in these two realms. In Weyl's case these took concrete form in 1918 with the nearly simultaneous publication of Das Kontinuum [Weyl 1918a] and Raum-Zeit-Materie [Weyl 1918b].

On the Roots of Weyl's Ensuing Conflict with Hilbert
Hilbert set out his early foundational views on a number of prominent occasions, but for the most part he preferred to evade direct controversy [Rowe 2000]. After 1904, when he delivered a highly polemical address on foundations issues at the Heidelberg ICM [Hilbert 1904], he remained virtually mute about these matters for over a decade. He did not return in earnest to research in this field until the late war years. Still, this hardly meant that he had lost interest. As Volker Peckhaus has described, Hilbert's longstanding efforts on behalf of Ernst Zermelo, who held a modest position in Göttingen teaching mathematical logic, as well as his support for the philosopher Leonard Nelson were part of Hilbert's long-term strategy aimed at providing institutional support for research in set theory, foundations, and mathematical logic [Peckhaus 1990, pp. 4-22].

Hilbert, now at the height of his career, had emerged as Göttingen's second great empire-builder. He did so, however, not so much by building on the groundwork Felix Klein laid in various branches of applied mathematics [Rowe 2001], but rather by promoting research that extended the territorial claims he himself had already staked out in analysis, number theory, foundations of geometry, and mathematical physics. Compared with the Hilbertian production lines in these fields, Göttingen research efforts in set theory and foundations resembled a mere cottage industry. Presumably Hilbert hoped that by delegating this research to specialists he could turn to other matters, in particular the foundations of physics, which dominated his attention after the death of Hermann Minkowski in 1909 [Corry 1999].

Skuli Sigurdsson has addressed the theme of "creativity in the age of the machine" in connection with Weyl, who during his student days had close associations with Hilbert's "factory" for integral equation theory [Sigurdsson 2001, pp. 21-29]. Hermann Weyl clearly never wanted an ordinary job on this fast-moving assembly line, and he later downplayed the value of much that came off it. In one of his two obituaries for his mentor, he wrote that it had been due to Hilbert's influence that "the theory of integral equations became a world-wide fad in mathematics . . . producing an enormous literature of rather ephemeral value" [Weyl 1944, pp. 126-127]. Nevertheless, the young Weyl took a keen interest in the work
of E. Schmidt, E. Hellinger, 0. Toeplitz, et al., and he also did a fair amount of mingling with his peers in the Gottin gen mathematical community. This gave him ample opportunity to partic ipate in discussions on set theory and foundations with Zermelo, whose proof of Georg Cantor's well-ordering theorem in 1904 led to an intense de bate regarding the admissibility of Zer melo's axiom of choice [Moore 1982] . Four years later, i n an effort t o quell this controversy, Zermelo presented his well-known system of axioms for set theory [Zermelo 1908]. Soon thereafter, Weyl began to take a serious and active interest in set the ory. Writing to his Dutch friend, Pieter Mulder, on 29 July 19 10, he character ized his standpoint as closer to that of Emile Borel and Henri Poincare than to Zermelo's views. But he also indi cated that he would have to think these matters through very carefully, espe cially because he feared the contro versy that typically ensued whenever issues in set theory and the founda tions of analysis were addressed. Re calling these times, Weyl would later write: "I grew up a stern Cantorian dog matist. Of Russell I had hardly heard when I broke away from Cantor's par adise; trained in a classical gymnasium, I could read Greek but not English" (Weyl Nachlass, Hs 9 1a: 1 7) . Eight years later, Weyl alluded to the diffi culties he encountered in trying to make sense of Zermelo's axiom system for set theory: "My investigations be gan with an examination of Zermelo's axioms for set theory. . . . Zermelo's ex planation of the concept 'definite set theoretic predicate,' which he employs in the crucial 'Subset'-Axiom III, ap peared unsatisfactory to me. And in my effort to fix this concept more pre cisely, I was led to the principles of definition of *2 [in Das Kontinuum]" [Weyl 19 18a, p. 48] . These principles were already enunciated in [Weyl 1910], a paper Solomon Feferman has discussed in connection with Alfred Tarski's ideas [Feferman 2000, p. 
180]. Weyl described his initial orientation as similar to Dedekind's theory of chains, in that he sought to establish the principle of complete induction without recourse to the primitive notion of the natural numbers. This quest "drove me to a vast and ever more complicated formulation but, unfortunately, not to any satisfactory result." He finally abandoned this as a "scholastic pseudo-problem" after achieving "certain general philosophical insights," presumably derived from reading Edmund Husserl and distancing himself from Poincaré's conventionalism. Nevertheless, he concluded that Poincaré had been right regarding the status of the sequence of natural numbers as "an ultimate foundation of mathematical thought" [Weyl 1918a, p. 48].
What prompted Weyl to reenter this arena in 1918, a move that took his good friend, Erich Hecke, by surprise? Probably he had several motives, but he surely kept a keen eye on his mentor's activities, about which he had first-hand knowledge. On 11 September 1917, Hilbert delivered a lecture on "Axiomatic Thought" [Hilbert 1918] before a meeting of the Swiss Mathematical Society in Zurich. This gave the first clear signs that he was about to take up the foundations of mathematics once again. Probably no one in Hilbert's audience listened more attentively than Hermann Weyl, who discussed this lecture in detail many years later. For Hilbert's talk offered a sweeping panorama of mathematical and physical ideas that stressed not only their mutual interdependence but the role of axiomatics in both realms (see [Corry 1997] on Hilbert's background interests).
Like many of his contemporaries, Hilbert regarded the growth of mathematical knowledge as an essentially teleological process in which thought obeys higher, transcendental laws. As such, his positivism had something like an Hegelian flavor, only with the mathematician replacing the metaphysician as the highest human form of Reason. In his lecture, Hilbert described the manner in which axiomatization took place as part of a natural, organic process starting from an informal system of ideas ("Fachwerk von Begriffen").
These ideas, which arose spontaneously in the course of the theory's development, were merely provisional in nature. Only during the next stage, when researchers attempted to provide deeper foundations for the theory, did axiomatization actually begin. By invoking architectonic imagery, Hilbert suggested how this process structured scientific thought:
Thus arose the actual, present-day so-called axioms of geometry, arithmetic, statics, mechanics, radiation theory, and thermodynamics. These axioms build a deeper-lying layer of axioms than the axiom layer that was earlier characterized by the fundamental theorems of the individual fields. The process of the axiomatic method . . . thus amounts to a deepening of the foundations of the individual fields just as it becomes necessary to do with any building to the extent that one wants to make it secure as one builds it outward and upward [Hilbert 1918].
In his concluding remarks, Hilbert mentioned two particularly pressing problems confronting the foundations of mathematics: proving the consistency of his axioms for arithmetic (the second of Hilbert's 23 Paris problems), and doing the same for Zermelo's system of axioms for set theory. He emphasized that both of these problems were wedded to a whole complex of deep and difficult epistemological questions of "specifically mathematical coloring": (1) the problem of the solvability of every mathematical question in principle, (2) the problem of the subsequent verification of the results of a mathematical investigation, (3) the question of a criterion for the simplicity for mathematical proofs, (4) the question of the relationship between content and form in mathematics and logic, and (5) the problem of the decidability of a mathematical question by means of a finite number of operations. Hilbert then summed up his position regarding all these complex issues as follows: "All such fundamental questions . . . appear to me to form a newly opened field of research, and to conquer this field - this is my conviction - we must undertake an investigation of the concept itself of the specifically mathematical proof, just as the astronomer must take into account the movement of his position, the physicist must care for his apparatus, and the philosopher criticizes reason itself" [Hilbert 1918, p. 155].
While conceding that, for the present, these ideas remained but a sketch for future research, Hilbert retained his optimistic outlook for his program:
I believe that everything which can be the subject of scientific thought, as soon as it is ripe enough to constitute a theory, falls within the scope of the axiomatic method and thus directly to mathematics. By pursuing ever deeper-lying layers of axioms . . . we gain ever deeper insights into the essence of scientific thought itself, and we become ever more conscious of the unity of our knowledge. In the name of the axiomatic method, mathematics appears called upon to assume a leading role in all of science [Hilbert 1918, p. 156].
This tour de force performance clearly signaled Hilbert's intentions to take up once again the foundations program he had sketched thirteen years earlier in his speech at the Heidelberg ICM. Indeed, his rhetorical flourishes clearly echoed themes Weyl and others would have recognized from Hilbert's even more famous address at the Paris ICM in 1900. Just as striking, however, were the parallels with his concluding remarks from his first contribution to the general theory of relativity, in which he made similarly sweeping claims regarding the strength and resilience of the axiomatic method:
As one sees, the few simple assumptions expressed in Axioms I and II suffice by sensible interpretation for the development of the theory: through them not only are our conceptions of space, time, and motion fundamentally reformulated in the Einsteinian sense, but I am convinced that the most minute, till now hidden processes within the atom will become clarified through the fundamental equations herein exhibited and that it must be possible in general to refer all physical constants back to mathematical constants - just as this leads to the approaching possibility, that out of physics in principle a science similar to geometry will arise: truly, the most glorious fame of the axiomatic method, while here, as we see, the mighty instruments of analysis, namely the calculus of variations and invariant theory, are taken into service [Hilbert 1915, p. 407].
Tilman Sauer has recently noted how Hilbert, quite ironically, made only a vague allusion in his Zurich lecture to this vision for a unified field physics [Sauer 2002]. Weyl could not have failed to notice that Hilbert sounded very subdued about these prospects on this occasion. He also knew very well what Einstein thought of Hilbert's methodological approach. In a letter from November 1916, Einstein confessed:
To me Hilbert's Ansatz about matter appears to be childish, just like an infant who is unaware of the pitfalls of the real world. . . . In any case, one cannot accept the mixture of well-founded considerations arising from the postulate of general relativity and unfounded, risky hypotheses about the structure of the electron. . . . I am the first to admit that the discovery of the proper hypothesis, or the Hamilton function, of the structure of the electron is one of the most important tasks of the current theory. The "axiomatic method," however, can be of little use in this (Einstein to Weyl, 23 November 1916 [Einstein 1918a, p. 366]).
Einstein relaxing in his home office in Berlin. By 1918 he was carrying on an extensive scientific correspondence.
Weyl took up these problems around this very time. In the summer semester of 1917 he offered a lecture course on general relativity, and, on the advice of Einstein's close friend Michele Besso, he decided to adapt his notes into a book on special and general relativity. This was published the following year by Julius Springer Verlag as the first edition of Raum-Zeit-Materie [Weyl 1918b]; a second soon followed, and three substantially revised editions appeared between 1919 and 1923. A few months before it came out, however, Weyl had proofs sent to both Einstein and Hilbert. Their respective reactions reveal a good deal about both men. Einstein was euphoric: "it's like a symphonic masterpiece. Every word has its relation to the whole, and the design of the work is grand" (Einstein to Weyl, 8 March 1918 [Einstein 1918b, pp. 669-670]). A week earlier, Hilbert wrote also, but he merely acknowledged receipt of the proofs (Hilbert to Weyl, 28 February 1918, Weyl Nachlass, Hs. 91: 604). Because he was on his way to Bucharest to attend a meeting on space and time in physics, he had no time to read them. Still, he expressed regret that he would not be able to meet Weyl in Switzerland over the semester break, but hoped to do so during the summer or early the following year. He then added some remarks about the professorship in Breslau recently offered to Weyl.
Hilbert had been consulted during the deliberations over potential candidates, and he had apparently recommended Weyl for the post. But he now counseled him against accepting the offer "im Interesse des Reichsdeutschtum," because if he left Zurich this would probably leave some worthy German mathematician without a job. Hilbert took a very active role in this game of mathematical musical chairs, and he apparently thought that Weyl should pass on this round. He also thought the Prussian government would be receptive to this argument, so that declining would not have unfavorable consequences for Weyl's future. Apparently Weyl was not inclined to follow this advice; at any rate, in April he accepted the chair in Breslau (although he would later turn it down for health reasons). Hilbert, having returned from Bucharest, wrote him on 22 April, sending "congratulations on accepting the Breslau position," but quickly added "unfortunately this again creates a vacant mathematical position in Switzerland that will be difficult to fill and unlikely so with a suitable personality."
Miffed that Weyl had ignored his wishes, Hilbert added some curt praise for Raum-Zeit-Materie: "I have looked more carefully at the proofs of your book, which gave me great pleasure, especially also the refreshing and enthusiastic presentation. I noticed that you did not even mention my first Göttingen note from 1915. . . ." He then proceeded to rattle off a litany of complaints bearing on this omission, remarks that provide insight into what Hilbert himself saw as the main achievements in his controversial paper on "Foundations of Physics" [Hilbert 1915]. (For details, see the accompanying box.)
Weyl received this letter in time to make modifications in the text before the book went to press. He added a few citations and brief remarks on Hilbert's first note, but these remained shadowy features of his book compared with his own contributions and, of course, Einstein's. Hilbert could not have felt gratified by this, especially after reading the preface, which Weyl wrote while vacationing at his in-laws' home in Mecklenburg. He could hardly have failed to notice his protégé's animus against his views on the axiomatization of physics. At the same time, Weyl announced that he had found a new avenue to a truly unified field theory:
22 April 1918 Dear H