The Mathematical Intelligencer encourages comments about the material in this issue. Letters to the editor should be sent to the editor-in-chief, Chandler Davis.
Report on the Zurich Congress
I find I agree with V.I. Arnold's "Will mathematics survive? Report on the Zurich Congress" (Mathematical Intelligencer 17 (1995), no. 3, 6-10) in a few things: e.g., most talks in ICM94 were not enlightening. I disagree in more: e.g., I didn't find the talks given by representatives of the Russian school more comprehensible than the median. There is, however, an inaccuracy about the General Assembly of the International Mathematical Union which I think needs correction. The Assembly did not reject the proposition of the American delegation to increase the representation of women and ethnic groups, but rather refused to vote on it. It used one of the Assembly's rules of procedure: that a proposition made on the second day of the meeting can't be voted on unless the Assembly decides (by vote) to take it up. It seems the members of the Assembly, while not endorsing the resolution, were not terribly eager to endorse the comment made there that such a resolution would lower the quality of the Congress. As Ingrid Daubechies asked during the Congress (I am paraphrasing), "How come increasing the number of women would automatically lower the quality?" If memory serves me correctly, the Assembly delegate who made the sarcastic comment about "sexual minorities" quoted in the Intelligencer article was V.I. Arnold himself, now vice-president of the International Mathematical Union. I found the comment offensive then, and I find it offensive now.
Alfredo Octavio
IVIC
Caracas, Venezuela
Corrigenda to "Quaternions in Physics"
In my article in The Intelligencer 17 (1995), no. 4, 7-15, the following two items were inadvertently omitted from the list of references:
W.S. Anglin and J. Lambek, The Heritage of Thales, Springer-Verlag, New York, 1995.
L. Silberstein, Quaternionic form of relativity, Phil. Mag. 23 (1912), 790.
The name "Sudbery" was misspelled.
The journal reference of the article by A.W. Conway [1948] should be corrected to Pontificia Academia Scientiarum.
Page 9, first column, line 17: I did not mean to imply that Hamilton was in fact influenced by Parmenides. I am informed by E.A. Costa that there is no direct evidence for this.
Page 10, second column, last paragraph: Lewis Carroll did not assert that time is reversed in a mirror. My remark must have been based on a flawed recollection of Through the Looking Glass. Thanks to Lewis Stiller for pointing this out.
On page 12, column 2, lines 12 and 15: replace
dx)
d *
J. Lambek Department of Mathematics and Statistics McGill University Montreal, Quebec H3A 2K6 Canada
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
The Missing Link Felipe Acker
Introduction
The goal of this article is to change the views of mathematicians throughout the world on three fundamental theorems of elementary analysis: Cauchy-Goursat's Theorem, Stokes's Theorem, and the Mean Value Theorem. My claims are the following:
1. Cauchy-Goursat's Theorem is really a mere corollary of Green's.
2. The usual treatment of Stokes's Theorem is misguided and I will do it properly.
3. The Mean Value Theorem does generalize to higher dimensions as an equality.
Sophisticated objects like differentiable manifolds and exterior differential forms will not figure in the exposition, lest they discourage potential readers. For a more technical version, see [1] and [2].
To make my points of view clear, I'll begin by summarizing what seems to be received wisdom about these theorems. Cauchy-Goursat's Theorem states (in a simplified version) that if $A$ is an open subset of the complex plane, $f$ is holomorphic on $A$, and $R$ is a (closed) rectangle contained in $A$, then
\[ \int_{\partial R} f(z)\,dz = 0 \]
(where $\partial R$ represents the boundary of $R$ with positive orientation). Almost every introductory complex analysis book contains the remark that, "with the additional hypothesis that $f$ is $C^1$," the proof can be carried out using Green's Theorem. So there seems to be a general belief in the reciprocal: without this additional hypothesis, Green's Theorem wouldn't apply. This is a fallacy, as I'll prove below. The real question is, How could such a fiction persist for one entire century?
For Stokes's Theorem, let's restrict ourselves to Green's Theorem: the exposition will be less technical, and the central ideas won't suffer. The simplest version states that if $R$ is a rectangle and $P, Q : R \to \mathbb{R}$ are $C^1$ functions, then
\[ \int_{\partial R} P\,dx + Q\,dy = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy. \]
Although it is obvious that the $C^1$ hypothesis can be relaxed, the universally accepted proof is based on iterated integration and needs some kind of regularity of each one of the partial derivatives $\partial Q/\partial x$ and $\partial P/\partial y$. However, if we look at this theorem as a generalization of the Fundamental Theorem of Calculus, we feel that the natural hypothesis should be the integrability of $(\partial Q/\partial x - \partial P/\partial y)$, even if individually $\partial Q/\partial x$ and $\partial P/\partial y$ are bad. This is, in fact, true: if $P$ and $Q$ are differentiable (approximable by linear functions), it is enough to assume $(\partial Q/\partial x - \partial P/\partial y)$ to be Riemann integrable. The precise hypotheses are more subtle, as I will show, but this version is sufficient in order to get Cauchy-Goursat as a corollary.
Now let's turn to the Mean Value Theorem or, should I say, the Mean Value Equality: if $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable at each point of $]a, b[$, then there exists a point $c$ in $]a, b[$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
The trouble appears when we try to generalize this result to higher dimensions: the pretty and geometrical equality becomes an inequality. I think the best expression of what everybody seems to believe was given by Jean Dieudonné in his celebrated Foundations of Modern Analysis [4]:
After the formal rules of Calculus have been derived (sections 8.1 to 8.4), the other sections of the chapter are various applications of what is probably the most useful theorem in Analysis, the mean value theorem, proved in section 8.5. The reader will observe that the formulation of that theorem, which is of course given for vector-valued functions, differs in appearance from the classical mean value theorem (for real-valued functions), which one usually writes as an equality $f(b) - f(a) = f'(c)(b - a)$. The trouble with that classical formulation is that: 1° there is nothing similar to it as soon as $f$ has vector values; 2° it completely conceals the fact that nothing is known on the number $c$, except that it lies between $a$ and $b$, and for most purposes, all one needs to know is that $f'(c)$ is a number which lies between the g.l.b. and l.u.b. of $f'$ in the interval $[a, b]$ (and not the fact that it actually is a value of $f'$). In other words, the real nature of the mean value theorem is exhibited by writing it as an inequality, and not as an equality.
Well, Dieudonné was wrong!¹ The Mean Value Theorem does generalize to higher dimensions as an equality.² This is a key idea: when we prove the Fundamental Theorem of Calculus, we really need the Mean Value Theorem in the equality form. If we try to mimic this proof to get Stokes's Theorem, it becomes clear that a general version of the Mean Value Equality would be welcome. In the chain leading from the Fundamental Theorem of Calculus to Stokes's Theorem, this is the missing link.
¹ I really appreciate people like Dieudonné (or, on the opposite side, Arnold) who express polemic opinions; polemics is fundamental to intellectual activity. I prefer Arnold, but, as the French say, "il faut de tout pour faire un monde."
² And this equality reveals a new aspect of its nature. The theorem referred to by Dieudonné is usually called by French authors the "finite increases theorem." I claim the true mean value theorem is the one I will present below.

The Fundamental Theorem of Calculus and Stokes's Theorem
Let me briefly recall the proof of the Fundamental Theorem of Calculus just to emphasize the role played in it by the Mean Value Theorem.
THE FUNDAMENTAL THEOREM OF CALCULUS. Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $]a, b[$. If $f'$ is (Riemann) integrable, then
\[ \int_a^b f' = f(b) - f(a). \]
Proof: Let $P = \{a_0, \ldots, a_n\}$, $a = a_0 < a_1 < \cdots < a_n = b$, be a partition of $[a, b]$. Then writing
\[ f(b) - f(a) = \sum_{i=1}^{n} \bigl( f(a_i) - f(a_{i-1}) \bigr) \]
and applying the Mean Value Theorem to each subinterval $[a_{i-1}, a_i]$, we get
\[ f(b) - f(a) = \sum_{i=1}^{n} f'(c_i)(a_i - a_{i-1}), \]
where $c_i \in \,]a_{i-1}, a_i[$, $i = 1, \ldots, n$. The right-hand side converges to $\int_a^b f'$. □
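The role of the equality form in this proof can be checked numerically. The sketch below is only an illustration under assumptions of my own: the function $f(x) = x^3$, the interval, and all helper names are hypothetical choices, not taken from the article. For this $f$ the mean value point of each subinterval can be written in closed form, so the sum $\sum f'(c_i)(a_i - a_{i-1})$ reproduces $f(b) - f(a)$ for every partition, whereas an ordinary midpoint Riemann sum only approximates it.

```python
import math


def f(x):
    # Illustrative choice of function (an assumption, not from the article).
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


def mean_value_point(lo, hi):
    """Return c in ]lo, hi[ with f'(c) = (f(hi) - f(lo)) / (hi - lo).

    Valid for f(x) = x**3 on 0 <= lo < hi, where 3*c**2 equals the slope.
    """
    slope = (f(hi) - f(lo)) / (hi - lo)
    return math.sqrt(slope / 3.0)


def mean_value_sum(a, b, n):
    """Sum of f'(c_i) * (a_i - a_{i-1}) over a uniform partition into n parts."""
    points = [a + (b - a) * i / n for i in range(n + 1)]
    total = 0.0
    for lo, hi in zip(points, points[1:]):
        c = mean_value_point(lo, hi)
        total += f_prime(c) * (hi - lo)
    return total


def midpoint_sum(a, b, n):
    """Ordinary midpoint Riemann sum of f' for comparison."""
    h = (b - a) / n
    return sum(f_prime(a + (k + 0.5) * h) for k in range(n)) * h


if __name__ == "__main__":
    print(mean_value_sum(0.0, 2.0, 4), midpoint_sum(0.0, 2.0, 4), f(2.0) - f(0.0))
```

Even with only four subintervals the mean value sum agrees with $f(b) - f(a) = 8$ up to rounding, while the midpoint sum does not; this is the telescoping identity of the proof, not a property of the particular partition.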
Now let us turn to Stokes's Theorem and try to adapt the proof of the one-dimensional case. For simplicity, let us restrict our study to the elementary case of Green's Theorem.
GREEN'S THEOREM (TENTATIVE). Let $R = [a, b] \times [c, d]$ and $P, Q : R \to \mathbb{R}$ be continuous on $R$ and differentiable in its interior. Let
\[ \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \]
be integrable on $R$. Then
\[ \int_{\partial R} P\,dx + Q\,dy = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy. \]
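As a sanity check on the statement (not on the proof), the two sides of Green's formula can be compared numerically on a rectangle. The sketch below rests on assumptions of my own: the fields $P(x, y) = xy$ and $Q(x, y) = x^2 + y$, the rectangle $[0, 1] \times [0, 2]$, and the helper names are all hypothetical. It uses a plain midpoint rule, and the area integral is computed by iterated integration purely for convenience.

```python
def midpoint_integral(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h


def boundary_integral(P, Q, a, b, c, d, n=200):
    """Integral of P dx + Q dy over the positively oriented boundary of [a,b] x [c,d]."""
    bottom = midpoint_integral(lambda x: P(x, c), a, b, n)   # left to right, dy = 0
    right = midpoint_integral(lambda y: Q(b, y), c, d, n)    # bottom to top, dx = 0
    top = -midpoint_integral(lambda x: P(x, d), a, b, n)     # right to left
    left = -midpoint_integral(lambda y: Q(a, y), c, d, n)    # top to bottom
    return bottom + right + top + left


def area_integral(integrand, a, b, c, d, n=200):
    """Double integral of integrand(x, y) over [a,b] x [c,d] by nested midpoint rules."""
    return midpoint_integral(
        lambda x: midpoint_integral(lambda y: integrand(x, y), c, d, n), a, b, n
    )


# Hypothetical example fields: dQ/dx - dP/dy = 2x - x = x.
def P(x, y):
    return x * y


def Q(x, y):
    return x ** 2 + y


def integrand(x, y):
    return x


if __name__ == "__main__":
    print(boundary_integral(P, Q, 0.0, 1.0, 0.0, 2.0))
    print(area_integral(integrand, 0.0, 1.0, 0.0, 2.0))
```

For these fields both sides equal 1, and the midpoint rule is exact up to rounding because every integrand is at most linear in the variable of integration.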
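[Figure: the rectangle $R$ subdivided by the points $a$, $a_{i-1}$, $a_i$, $b$ on the horizontal axis and $c$, $c_{j-1}$, $c_j$, $d$ on the vertical axis.]
Proof. The proof should begin by taking two partitions, $a = a_0 < a_1$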
0 can be presented as a finite union of continua, each of diameter less than $\varepsilon$. It was also Sierpiński who proved [36] that no continuum can be represented as a union of countably
The Sierpiński gasket was used in the emblem of the Polish Mathematical Olympic Games for secondary school students.
Stefan Mazurkiewicz.
many pairwise disjoint nonempty closed sets. Moreover, he gave a very interesting characterization of an arc: a set $M$ is a continuous image of a closed unit interval if and only if it is a continuum and there are $a, b \in M$ such that for any $x$ different from $a$ and $b$ there are $A$ and $B$ with $a \in A$, $b \in B$, $A \cap B = \{x\}$ and $A \cup B = M$.
There was another definition of curve, called the Cantor definition. A Cantor curve was defined as a plane continuum which does not contain interior points (in other words, a nowhere dense planar continuum). This definition rules out the Peano curve; however, it admits another strange phenomenon: it turns out that there is an example of such a curve which does not contain any arc (i.e., no subset of this curve is homeomorphic to a segment!). This example was presented by Janiszewski during his lecture at the International Congress of Mathematicians in 1912 in Cambridge. In 1919, during a Warsaw seminar, Hugo Steinhaus called Janiszewski's example the most complicated geometrical set ever considered in geometry. Soon it appeared that there existed much more complicated planar sets. This followed from results obtained by Knaster. We will come to this point later.
Sierpiński supposed that even combining the Jordan definition and the Cantor definition would allow some strange examples. This was the origin of further famous sets, called the Sierpiński carpet and the Sierpiński triangle curve. The Sierpiński triangle curve (also called the Sierpiński gasket) has the following interesting property: any of its points (other than the three vertices of the triangle) is a common endpoint of three arcs in the set which have only this point in common. Moreover, any Cantor curve is homeomorphic to a subset of the Sierpiński curve! Sierpiński presented different constructions of curves of this type: for instance, take a square, divide it into four congruent squares, and throw away the square at the bottom left-hand corner. Apply the same procedure to each of the three remaining squares. Iterate this procedure. The infinite intersection of all sets obtained in this manner gives the curve with the same properties. This construction was described by Sierpiński in his paper [28]. Recent Intelligencer articles have shown a little-known construction of the triangle curve [6, Fig. 4] as well as a well-known one [38, Fig. 2].
Also in [28], we read with surprise the following: Note that as early as one year ago Mr. Mazurkiewicz found an example of a curve which was simultaneously a Jordan curve and a Cantor curve...
Mazurkiewicz forms this curve by dividing the square into nine smaller squares using lines parallel to the sides and removing the interior of the center square, performing the same procedure on each of the remaining eight squares, and iterating this procedure ad infinitum.
We recognize the description of the Sierpiński carpet! Apparently the Sierpiński carpet was found by Mazurkiewicz. Nowadays the Sierpiński carpet is mentioned in almost every book about fractals.
[Figure caption] The Sierpiński carpet is also called the Sierpiński universal curve, because any Cantor curve is homeomorphic to a subset of the Sierpiński carpet (see [5]). This means that for any such curve $T$ contained in $\mathbb{R}^2$, there is a set $S$ homeomorphic to the Sierpiński carpet with $T \subset S$. The picture shows the idea of construction of such a set containing a given curve (in the shape of a letter 3).
[Figure caption] The pictures published in [12] showing the constructions of untypical (now well-known) connected sets.
We have already mentioned the example of a curve containing no arc, given by Janiszewski, and promised to describe planar sets with still stranger properties. The most
famous one is undoubtedly a hereditarily indecomposable continuum, constructed by Bronisław Knaster in his Ph.D. thesis in 1922 [11]. With the methods known at the time, the construction was extremely difficult. A continuum $X$ is called indecomposable if it contains more than one point and is not the union of two closed continua different from $X$. Indecomposable continua were discovered in 1910 by Luitzen Brouwer. A hereditarily indecomposable continuum, or Knaster continuum, is a set which has the property that every continuum contained in it is either a one-point set or an indecomposable continuum! This seems incredible. However, it turned out that such a strange set is not so unusual. Mazurkiewicz proved [21] that Knaster continua form a dense $G_\delta$ set in the family of all subcontinua of the square. A set constructed by Knaster quickly became a favorite counterexample; if somebody had a conjecture about continua, he would start the verification by checking whether a Knaster continuum satisfied it.
Later, it was shown that a Knaster continuum, although very strange, is quite regular. In 1948 E. E. Moise proved [23] that a Knaster continuum $X$ is homeomorphic to each subset of $X$ which is a continuum and contains more than one point. This was the origin of another name for this set: pseudoarc. In 1951 R. H. Bing [2] obtained another beautiful result: a pseudoarc is homogeneous; that is, if $p$ and $q$ are points of a pseudoarc, then there is a homeomorphism carrying the pseudoarc to itself and $p$ into $q$. This condition is fulfilled by a circle. For a long time, it was supposed that a circle is the unique subset of the plane having this property.
The construction of strange, unusual sets was a specialty of Knaster. Let us mention some more examples. First, a definition: a set $A$ is a separator of the plane if $\mathbb{R}^2 \setminus A$ is not connected; a separator $A$ is irreducible if any proper subset of $A$ is not a separator of the plane.
Knaster constructed [11] a separator $A$ of the plane such that $A$ does not contain any irreducible separator. He also proved that there exists an uncountable family of pairwise disjoint and not locally connected separators of the plane. Another example constructed by Knaster is a subset of a plane which is the common boundary of infinitely many pairwise disjoint regions.
Separators were also investigated by Kuratowski. Kuratowski showed that any subset $A$ of the plane such that $\mathbb{R}^2 \setminus A$ is the union of finitely many (at least two) regions contains an irreducible separator. Also, he proved that any irreducible separator $A$ of the plane, such that $\mathbb{R}^2 \setminus A$ is the union of more than two regions, is either an indecomposable continuum or the union of two indecomposable continua.
We turn to the important results in plane topology now called the theorems of Janiszewski. The First Theorem of Janiszewski states that if $A$ and $B$ are continua and they are not separators of the plane, then $A \cup B$ is a separator of the plane if and only if $A \cap B$ is not connected. The Second Theorem of Janiszewski states that the two-dimensional sphere $S^2$ is a Janiszewski
space (a locally connected continuum $X$ is said to be a Janiszewski space if for any continua $A, B \subset X$ with $A \cap B$ disconnected, there are $p, q \in X$ such that $A \cup B$ is a separator of the plane and $p$ and $q$ belong to different components of $X \setminus (A \cup B)$). Later, these theorems were generalized by Kuratowski and Stefan Straszewicz; Straszewicz noticed that in the assumptions continua may be replaced by closed connected sets. The First Theorem of Janiszewski was applied to shorten the proof of the Jordan Curve Theorem.
We have to say something about the famous results on connectivity. Connected sets were investigated as early as the very beginning of the 20th century. However, it was Kuratowski and Knaster who developed the ideas of connectivity. One of the most famous examples in topology is the Knaster-Kuratowski fan [12]. Let $C$ be the Cantor set on the interval $[0, 1] \times \{0\} \subset \mathbb{R}^2$; denote by $P$ the set of all end points of intervals removed from $[0, 1] \times \{0\}$ in the process of constructing the Cantor set. Join every point $c \in C$ to the point $q = (1/2, 1/2) \in \mathbb{R}^2$ by a segment $I_c$, and denote by $F_c$ the set of all points $(x, y) \in I_c$, where $y \in \mathbb{Q}$ for $c \in C \setminus P$ and $y \in (\mathbb{R} \setminus \mathbb{Q})$ for $c \in P$. The Knaster-Kuratowski fan is the set $F = \bigcup \{F_c : c \in C\} \subset \mathbb{R}^2$.
There were different kinds of disconnectivity. Sierpiński investigated punctiform sets (discontinuous sets), a notion introduced by Janiszewski. A set is punctiform if it does not contain any continuum of cardinality greater than 1. Sierpiński proved some theorems about decomposition of the plane into punctiform sets [27, 34]. Sierpiński and Kuratowski [18] presented a decomposition of the plane into two punctiform sets $A$ and $B$ such that $A$ is an intersection of an $F_\sigma$ set and a $G_\delta$ set, and $B$ is a union of an $F_\sigma$ set and a $G_\delta$ set. The technique of construction of peculiar spaces by the use of graphs of functions was used by Sierpiński and Kuratowski here for the first time.
Sierpiński distinguished the following "better" and "worse" kinds of disconnectivity [definitions (c) and (d) were introduced by him in 1921 [30]; (b) by Felix Hausdorff in 1914]: (a) countable space; (b) hereditarily disconnected space (i.e., a space which does not contain any nontrivial connected subset); (c) totally disconnected space (any two different points can be separated by open-and-closed sets); (d) zero-dimensional space (the space has a base consisting of open-and-closed sets); (e) punctiform spaces. For instance, the Knaster-Kuratowski fan $F$ is connected and punctiform; the space $F \setminus \{q\}$ is hereditarily disconnected but not totally disconnected. Sierpiński gave examples showing the difference among all these classes. He also noticed that any countable space dense in itself is homeomorphic to the set $\mathbb{Q}$ of rationals. The investigation of zero-dimensional spaces as well as some results by Mazurkiewicz anticipated the development of dimension theory.
Let us also mention some other important results. Kuratowski invented the method of generating topology by the closure operator. Sierpiński, independently of F. Riesz, characterized compact sets by families having the finite intersection property [35]. These are only a few of the many important and famous results obtained by Polish topologists in the very beginning of the century.
Outside of Warsaw, general topology was not the main field of research. However, some topological results were obtained by Lwów mathematicians: Stanisław Mazur, Stanisław Ulam, Juliusz Paweł Schauder, and, of course, Stefan Banach. Also in Kraków, Tadeusz Ważewski constructed the space which is nowadays called the Ważewski dendrite.
We do not mention the many results obtained in the thirties and later. Among the young Polish topologists there were two who turned to another kind of topology and who emigrated early from Poland: they were Samuel Eilenberg and Witold Hurewicz. Eilenberg first worked in general topology. Later, when he moved to the United States, he became interested in algebraic topology. The results of Eilenberg are nowadays considered classical and appear in many textbooks on algebraic topology. Hurewicz studied in Vienna, and then worked for a long time in Amsterdam under Brouwer. He obtained fundamental results on homotopy theory and dimension theory.
Finally, let us return to the idea of the development of mathematics in Poland with special emphasis on just one branch of mathematics. Although it gives one the opportunity to achieve significant results quickly, such a conception may be dangerous. For example, if the subject is ill-chosen, the finest mathematical brains in the country may squander their effort on a very narrow area. Also, editing a journal devoted only to one branch of mathematics was a very controversial notion at that time.
For instance, when the first issue of Fundamenta Mathematicae was published, Henri Lebesgue wrote a letter to Sierpiński, in which he expressed his enjoyment of the papers in this volume, but doubted if so specialized a journal would receive enough papers to ensure its continuation at such a high level. The creators of the Warsaw School of Mathematics realized all these dangers. Nevertheless, they believed that their choice was good. Sierpiński thought that it was better to concentrate on one branch of mathematics than to work chaotically, with no sense of partnership.
The development of topology in the 20th century was enormous. Even those who selected topology as the branch of investigation for Polish mathematicians did not anticipate this. New trends like algebraic topology and differential topology grew and were widely applied. In these areas, Polish mathematicians were not as dominant as in general topology: the best known are Karol Borsuk, Hurewicz, and Eilenberg, the last two not working in Poland. Some mathematicians criticize not so much the choice
of topology but rather the long concentration on just general topology. They think that the research area should have been extended to algebraic topology and differential topology. Famous mathematicians differ on this point.
It is impossible to imagine the development of general topology without the results obtained by Polish mathematicians. On the other hand, at present Polish mathematics does not play such a role in topology as it did 70 years ago. But is that the fault of the creators of the Warsaw School? Is it anybody's fault? It is impossible to require any talented mathematician to work on a particular kind of problem. Also, the period 1935-1950, when algebraic topology developed so richly, was very difficult for Poland. It is perhaps too early to judge these things now. We have to wait about 100 years.
We can say that the Polish contribution to the development of topology was extremely impressive. It was in Poland that the basics of general topology began to be put in order, many ideas were formalized, many definitions stated, and many really important problems solved. It even seems that the mathematicians of the twenties solved too many problems and did not leave enough for their successors.
The books written by Polish topologists, especially the impressive monograph by Kuratowski, became classics. Although old, they are still much cited. Let us end by quoting the famous Japanese mathematician, J. Nagata [4]; asked about his teachers, he answered, "I had two teachers: Alexandrov and Kuratowski, because I learned topology from the books written by them."
References
1. A.V. Arkhangelskii and L.S. Pontryagin (eds.), General Topology, vol. I, Springer-Verlag, New York, 1990.
2. R.H. Bing, Concerning hereditarily indecomposable continua, Pacific Journal of Mathematics 1 (1951), 43-52.
3. R. Engelking, General Topology, PWN, 1977.
4. R. Engelking, "P.S. Aleksandrow," Wiadomości Matematyczne 20 (1978), 174-177.
5. R. Engelking and K. Sieklucki, Introduction to Topology, North-Holland, Amsterdam, 1994.
6. K. Hannabuss, Forgotten fractals, Mathematical Intelligencer 18, no. 3, 28-31.
7. Z. Janiszewski, O potrzebach matematyki w Polsce, in: Nauka Polska, Warszawa, Kasa im. Mianowskiego, 1917; reprinted in: Wiadomości Matematyczne 7 (1963), 3-8.
8. Z. Janiszewski, O rozcinaniu płaszczyzny przez continua, Prace Matematyczno-Fizyczne 26 (1913), 11-63.
9. Z. Janiszewski, Sur les continus irréductibles entre deux points, Comptes Rendus Paris (1911), 752-755.
10. Z. Janiszewski, Über die Begriffe "Linie" und "Fläche," International Congress of Mathematicians, Cambridge, 1912.
11. B. Knaster, Un continu dont tout sous-continu est indécomposable, Fundamenta Mathematicae 3 (1922), 247-286.
12. B. Knaster and K. Kuratowski, Sur les ensembles connexes, Fundamenta Mathematicae 2 (1921), 206-255.
13. K. Kuratowski, Notatki do autobiografii, Czytelnik, Warszawa, 1981.
14. K. Kuratowski, Pół wieku matematyki polskiej, Wiedza Powszechna, Warszawa, 1977.
15. K. Kuratowski, S. Mazurkiewicz et son oeuvre scientifique, Fundamenta Mathematicae 34 (1947), 316-331.
16. K. Kuratowski, Topologie, vol. I, Warszawa, 1933.
17. K. Kuratowski, Topologie, vol. II, Warszawa, 1950.
18. K. Kuratowski and W. Sierpiński, Les fonctions de classe 1 et les ensembles connexes punctiformes, Fundamenta Mathematicae 3 (1922), 303-313.
19. A. Lelek, Zbiory, PZWS, Warszawa, 1966.
20. S. Mazurkiewicz, O arytmetyzacji continuów, Comptes Rendus Varsovie 6 (1913), 305-311.
21. S. Mazurkiewicz, Sur les continus absolument indécomposables, Fundamenta Mathematicae 16 (1930), 151-159.
22. S. Mazurkiewicz and W. Sierpiński, Contribution à la topologie des ensembles dénombrables, Fundamenta Mathematicae 1 (1920), 17-27.
23. E.E. Moise, An indecomposable plane continuum which is homeomorphic to each of its nondegenerate subcontinua, Trans. American Mathematical Society 63 (1948), 581-594.
24. A. Schinzel, Rola Wacława Sierpińskiego w historii matematyki polskiej, Wiadomości Matematyczne 26 (1984), 1-9.
25. W. Sierpiński, Oeuvres Choisies, vols. I, II, PWN, Warszawa, 1974.
26. W. Sierpiński, Sur une condition pour qu'un continu soit une courbe jordanienne, Fundamenta Mathematicae 1 (1920), 44-60.
27. W. Sierpiński, Sur la décomposition du plan en deux ensembles punctiformes, Bulletin International de l'Académie des Sciences de Cracovie, Ser. A (1913), 76-82.
28. W. Sierpiński, O krzywej, której każdy punkt jest punktem rozgałęzienia (Sur une courbe dont tout point est un point de ramification), Prace Matematyczno-Fizyczne 27 (1916), 77-85.
29. W. Sierpiński, O krzywych, wypełniających kwadrat (Sur les courbes qui remplissent un carré), Prace Matematyczno-Fizyczne 23 (1912), 193-219.
30. W. Sierpiński, Sur les ensembles connexes et non connexes, Fundamenta Mathematicae 2 (1921), 81-95.
31. W. Sierpiński, Sur une courbe dont tout point est un point de ramification, Comptes Rendus Paris 160 (1915), 302-305.
32. W. Sierpiński, Sur une courbe cantorienne qui contient une image biunivoque et continue de toute courbe donnée, Comptes Rendus Paris 172 (1916), 629-632.
33. W. Sierpiński, Sur une nouvelle courbe continue quelconque, Bulletin International de l'Académie des Sciences de Cracovie, Ser. A (1912), 462-478.
34. W. Sierpiński, Sur un ensemble punctiforme connexe, Fundamenta Mathematicae 1 (1920), 7-10.
35. W. Sierpiński, Un théorème sur les ensembles fermés, Bulletin International de l'Académie des Sciences de Cracovie, Ser. A (1918), 49-51.
36. W. Sierpiński, Un théorème sur les continus, Tôhoku Mathematics Journal 13 (1918), 300-303.
37. L.A. Steen and J.A. Seebach Jr., Counterexamples in Topology, Springer-Verlag, New York, 1978.
38. I. Stewart, Four encounters with Sierpinski's gasket, Mathematical Intelligencer 17, no. 1, 52-64.
39. G. Temple, 100 Years of Mathematics, Duckworth, London, 1981.
Mathematics Institute
Jagiellonian University
Reymonta 4, 30-059 Kraków, Poland
e-mail: [email protected]
e-mail: [email protected]
Jeremy J. Gray*
*Column editor's address: Faculty of Mathematics, The Open University, Milton Keynes, MK7 6AA, England.
Augustus De Morgan (1806-1871) Adrian Rice
[Portrait: Augustus De Morgan]
This year marks the 125th anniversary of the death of Augustus De Morgan. Immortalised by the famous laws which bear his name, De Morgan is otherwise largely unknown to the majority of today's mathematicians. However, he was one of the most respected and influential British mathematicians of his day, an intriguing character whose enormous intellect was matched by a sharp wit and sense of humour.
Born in Madurai, southern India, on 27 June 1806, De Morgan suffered an early infection which left him blind in the right eye, a disability which throughout his life resulted in concentration on mental rather than physical activities. Raised and educated in the southwest of England, he entered Trinity College, Cambridge, in February 1823, where his mathematical talents soon blossomed under the influence of tutors such as the algebraist George Peacock, the philosopher of science William Whewell, and the astronomer George Biddell Airy. As a result, in 1827, he graduated in fourth place as a 'Wrangler' (i.e., one with a first-class degree).
His graduation coincided with the search for professors at the newly established University College London. Founded in 1826 as "The London University," UCL was the first such body to be established in England since Oxford and Cambridge. Inspired by its progressive aims and explicit secular character, De Morgan applied for the mathematics chair. At 21, he was the youngest of thirty-one candidates and had no teaching experience whatsoever. Nevertheless, due in no small part to excellent references from his Cambridge mentors, he was unanimously elected founder Professor of Mathematics on 23 February 1828, and gave his first lecture on 5 November that year.
All did not go smoothly, however. Relationships between the twenty-eight professors and the college's dictatorial governing council were often strained; and when, in 1831, the professor of anatomy was unfairly dismissed, De Morgan, being a man of principle, promptly resigned in protest. Five years later, however, his successor, Professor George James Pelly White, was accidentally drowned while on holiday in the Channel Islands. De Morgan immediately offered himself as a temporary replacement ... and stayed on for another thirty years! (The professor of anatomy who had been at issue never did return.)
The maths course then offered at University College was divided into four groups: the junior and senior classes, each with a higher and lower division. The course began with elementary arithmetic and the first book of Euclid's Elements, progressing as far as the calculus of variations in a period of two years. Incidentally, De Morgan never taught what we today would call "applied maths." Subjects such as dynamics and statics were taught by the Professor of Natural Philosophy (i.e., physics). De Morgan lectured from 9 to 10 am and 3 to 4 pm every day except Sundays, and at the end of each lecture would give homework problems for the class to solve by next time.
Although University College was something of a feeder for the more advanced instruction offered at Cambridge, which took many of De Morgan's graduates, a number of his students went on to achieve fame in their own right, such as the algebraist James Joseph Sylvester, the mathematical textbook author Isaac Todhunter, the economist and logician William Stanley Jevons, and the constitutional writer Walter Bagehot. Recollections of pupils such as these tell us that De Morgan was an "eccentric but brilliant teacher" whose lectures were stimulating, often inspiring, but far from easy. Even his best students had to struggle to keep up, as Bagehot wrote in 1843:
De Morgan has been taking us through a perfect labyrinth lately; he was quite lost by the whole class for one lecture, but we are, I hope, getting better... We have been discussing the properties of infinite series, which are very perplexing.
(De Morgan was one of the first to lecture on this topic in Britain.) Perhaps due to his own experiences at university, De Morgan was severely critical of how students were examined, preferring them to be able to think for themselves rather than reproduce proofs in an exam. As one ex-student later wrote: "All cram he held in the most sovereign contempt. I remember, during the last week of his course which preceded an annual College examination, his abruptly addressing his class as follows: 'I notice that many of you have left off working my examples this week. I know perfectly well what you are doing; YOU ARE CRAMMING FOR THE EXAMINATION. But I will set you such a paper as shall make ALL YOUR CRAM of no use.' "
De Morgan was, throughout his career, a prolific writer, publishing 18 books and over 160 papers on many subjects. His research is primarily remembered today for its contribution to the development of modern symbolic algebra and logic, encouraging William Rowan Hamilton with his work on quaternions and George Boole in his algebraic logic. Indeed, De Morgan's major achievement lies in his recognition of the connection between the two disciplines. As he later characteristically put it: We know that mathematicians care no more for logic than logicians for mathematics. The two eyes of exact science are mathematics and logic: the mathematical sect puts out the logical eye, the logical sect puts out the mathematical eye; each believing that it sees better with one eye than with two.
De Morgan's interest in logic arose from his teaching of Euclid. He noticed that the Elements provided a perfect example of the poor relationship between logic and mathematics: while Euclidean geometry was considered to be the model of deductive reasoning in mathematics and the syllogism in Aristotelian logic, hardly any connections existed between the two systems.
De Morgan was virtually the only person to consider this peculiarity in 2000 years, although his attempts to "syllogise" Euclid were largely unsuccessful. He also believed that the traditional Aristotelian syllogistic method was inadequate in any reasoning involving quantity. Giving the following example,
Most of the Ys are Xs
Most of the Ys are Zs
∴ Some Xs are Zs
he asserted that this argument could not be proved by means of any of the normally accepted Aristotelian syllogisms. In order to rectify this defect, De Morgan introduced the notion of "quantifying the predicate" into his logic. Here he said that if the total number of Ys is m, the number of Ys that are Xs x, and the number of Ys that are Zs y, then there are at least (x + y - m) Xs that are Zs. For example, given that a boat with 100 people on board sinks, if 55 were below deck and the total number drowned is 70, then, by De Morgan's syllogism, at least 25 (i.e., 55 + 70 - 100) people below deck were drowned. This extension of the concept of syllogism was successful in developing a numerically definite system of logic: a significant step forward.
He published two books and four papers based on his research, of which the fourth is now regarded as his most original contribution. In it, he introduced the logic of relations which, although his work in this area was left unfinished, substantially increased the scope of the subject. Less enduring perhaps were his attempts to invent a suitable notation for his symbolic logic, which were superseded by Boole's more algebraic approach. An illustration of this is provided by the fact that the famous De Morgan's Laws are far more familiar to us in their modern Boolean formulation:
(A ∩ B)' = A' ∪ B',
(A ∪ B)' = A' ∩ B'.
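For readers who like to see such statements checked mechanically, here is a minimal Python sketch of the numerically definite syllogism and of De Morgan's Laws in their Boolean form. The function name min_overlap, the ten-element universe, and the sets A and B are illustrative assumptions chosen for this note; none of them comes from De Morgan's own writings or notation.

```python
def min_overlap(m, x, y):
    """Least possible number of Ys that are both Xs and Zs,
    given m Ys in total, of which x are Xs and y are Zs."""
    # The bound x + y - m can be negative, in which case no overlap is forced.
    return max(0, x + y - m)


# The boat example: 100 on board, 55 below deck, 70 drowned.
boat = min_overlap(100, 55, 70)

# De Morgan's Laws checked on a small hypothetical universe.
universe = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}


def complement(s):
    """Complement of s relative to the chosen universe."""
    return universe - s


law1 = complement(A & B) == complement(A) | complement(B)
law2 = complement(A | B) == complement(A) & complement(B)

print(boat, law1, law2)
```

The first value printed is the guaranteed minimum for the boat example, and the two booleans report whether each law holds for the chosen sets. A check on one pair of sets is of course an illustration and not a proof.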
In addition to his work in mathematics and logic, De Morgan had a lifelong fascination for the history and philosophy of science in general, and mathematics in particular. He contributed over 700 articles to a publication entitled the Penny Cyclopaedia on all areas of mathematical science, including one in which he invented the term, though not the method, of "mathematical induction." Though he was deeply interested in philosophy, this mode of thought was not usually one of his strengths. As he wrote, he "had no objection to Metaphysics, far from it, but if a man takes a candle to look down his own throat, he must take care not to set his head on fire."
It was the history of mathematics that was his particular forte. Articles such as "The early history of infinitesimals in England" and "Notices of English mathematical and astronomical writers between the Norman Conquest and the year 1600" give a mere indication of the breadth of his knowledge and interest in the subject. Yet his approach was never dry. In a letter to Hamilton in 1852, he wrote: Dates are of as much importance to an historian as to an Arab. The Arab, however, has to dry his; the historian's are as dry as possible from the outset.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996
Perhaps De Morgan's best known work is a book entitled A Budget of Paradoxes. This is a collection of humorous writings and reviews featured in a weekly Victorian magazine called The Athenaeum, compiled posthumously by his widow Sophia. De Morgan was a keen bibliophile, accumulating over 3000 mathematical volumes by his death, and the Budget consists of accounts of many of these works together with anecdotes and witty verses. A couple of reviews will suffice to give the flavour.
"The Decimal System as a whole. By Dover Statter. London and Liverpool, 1856. The proposition is to make everything decimal. The day, now 24 hours, is to be made 10 hours. The year is to have ten months, Unusber, Duober, &c. Fortunately there are ten commandments, so there will be neither addition to, nor deduction from, the moral law. But the twelve apostles! Even rejecting Judas, there is a whole apostle of difficulty. These points the author does not touch."
"A method to trisect a series of angles having relation to each other; also another to trisect any given angle. By James Sabben. 1848 (two quarto pages). 'The consequence of years of intense thought': very likely, and very sad."
Another area into which De Morgan directed his intellectual energy was mathematical astronomy. He served on the council of the Royal Astronomical Society for over three decades between 1830 and 1861, holding the office of secretary between 1831 and 1838 and again from 1848 to 1854, as well as being the society's vice-president on many occasions. Although an enthusiastic member, due to his optical disability De Morgan was not an observational or experimental astronomer. For this reason, he resisted considerable pressure to become the society's president. He wrote at the time: I will vote for and tolerate no President but a practical astronomer.... The President must be a man of brass--a micrometer-monger, a telescope-twiddler, a star-stringer, a planet-poker, and a nebula-nabber.
De Morgan was a man of many eccentricities. When asked his age, he was wont to declare: "I was x years old in the year x²"--a phenomenon peculiar to those born in years such as 1640, 1722, 1806, 1892, 1980, and so on. (Born in 1806, he was 43 in 1849 = 43².) In 1859, when offered an honorary law doctorate by Edinburgh University, he declined it, saying that he "did not feel like an LL.D." He also refused to allow himself to be proposed as a Fellow of the Royal Society. "Whether I could have been a Fellow," he later said, "I cannot know; as the gentleman said who was asked if he could play the violin, I never tried." Married with seven children, De Morgan lived in close proximity to University College, first at No. 69, Gower Street (now numbered 35), later moving to No. 7, Camden Street, then on the edge of London, but now
relatively central. He retained a lifelong love of London, rarely leaving it, not even for family holidays in the "desolation" of the countryside. Viewing these rural excursions with a humorous dread, he once wrote of himself: Ne'er out of town; 'tis such a horrid life: But duly sends his family and wife. Yet despite his love of the city, he never visited Westminster Abbey, or the Tower, or the House of Commons, and he refused to vote in any election. The last major event of his career was his term as first president of the London Mathematical Society, founded in 1865. Conceived as the "University College Mathematical Society" by two former students, one of whom was his son George, the society received great encouragement and support from De Morgan. His inaugural address was principally noteworthy for the emphasis it placed on the necessity for research into his two favourite topics: logic and the history of mathematics. To this day, the society commemorates its founding president with the De Morgan Medal, awarded every three years for outstanding mathematical achievement. Based at University College for the whole of De Morgan's presidency, the LMS moved to new premises at the end of 1866. De Morgan's own link with University College ended simultaneously, although the two events were unconnected. His departure was on another matter of principle, this time over adherence to the college's policy of religious equality. For De Morgan, the council's refusal to appoint a candidate to the vacant chair of philosophy on the grounds of his being a controversial Unitarian minister was a betrayal of its founding principles. He resigned his chair on 10 November 1866, giving his last lecture in the summer of 1867. He never returned, refusing even to sit for a bust to be placed in the college library, explaining that, as far as he was concerned, "our old college no longer exists." The years following this final resignation were plagued by misfortune. 
Although no personal bitterness had resulted from the controversy, the incident affected De Morgan so strongly that it injured his health. The death of George De Morgan in October 1867, at the age of just 26, served as a further blow. In 1868, he suffered a stroke from which he never fully recovered. The final decline in his health followed the premature death of another child, Helen Cristiana, in August 1870. He died of nervous prostration and kidney disease on 18 March 1871, and was buried in Kensal Green Cemetery in north-west London.
Adrian Rice
School of Mathematics and Statistics
Middlesex University
Bounds Green
London N11 2NG, UK
Light Shadows: Remembrances of Yale in the Early Fifties
Gian-Carlo Rota
Jack Schwartz
The first half of Jack Schwartz's life coincides with one of the greatest ages of science. The achievements in the exact sciences of the period that runs from roughly 1930 to 1990 may well remain unmatched in any foreseeable future. Jack Schwartz's name will be remembered as a beacon of this age. No one among the living has left as broad and deep a mark on as many areas of pure and applied mathematics, on computer science, in economics, in physics, as well as in fields which ignorance prevents me from naming. I hope you will forgive me as I declare my incompetence to do justice to Jack Schwartz's life, to his personality, to his achievements. I beg your indulgence if I resort instead to an easier task. I'd like to recall a few anecdotes from a brief period of the past, the years 1953 to 1955, when I met Jack and learned mathematics as a graduate student at Yale.
The first lecture by Jack I listened to was given in the spring of 1954, in a seminar in functional analysis. A brilliant array of lecturers had been expounding throughout the spring term on their pet topics. Jack's lecture dealt with stochastic processes. Probability was still a mysterious subject cultivated by a few scattered mathematicians, and the expression "Markov chain" conveyed more than a hint of mystery. Jack started his lecture with the words, "A Markov chain is a generalization of a function." His perfect motivation of the Markov property put the audience at ease. Graduate students and instructors relaxed and followed his every word to the end.
Jack's sentences are lessons in clarity and poise. I remember a discussion in the mid-eighties about the future of artificial intelligence, in which for some reason I was asked to participate. The advocates of what was then called "hard A. I." were painting a triumphalist picture of the future of computer intelligence, to the dismay of their opponents. As the discussion went on, all semblance of logical argument was given up. Eventually, everyone realized that Jack had not said a word, and all faces turned toward him. "Well," he said, "some of these developments may lie one hundred Nobel prizes away." His felicitous remark calmed everyone down. The A. I. people felt they were being granted the scientific standing they craved, and their opponents felt vindicated.
1 Inaugural address delivered at Courant Institute (New York University) at a meeting in honor of Jacob T. Schwartz on May 19, 1995.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
Jack Schwartz
I have made repeated use in my own lectures of Jack's strikingly apposite phrases. You may forgive this shameless appropriation upon learning that my students have picked up the very same phrases from me, and so on.
From Princeton to Yale
Mathematics in the fifties was a marginal subject, like Latin. The profession of mathematician had not yet been recognized by the public, and it was not infrequent for a mathematics graduate student to be asked whether he was planning to become an actuary. The centers of mathematics were few and far between, and communication among them was infrequent. The only established departments were Princeton and Chicago. Harvard was a distant third, and Yale was in the process of overcoming its overdependence on the College. In New York, Richard Courant was busy setting up his Institute of Mathematical Sciences at 25 Waverly Place, and he had just finished training his first generation of students in America, the generation of Lax and Nirenberg, of Cathleen Morawetz and Harold Grad. It was already clear that the Institute he was putting together was going to be a great center of mathematics.
In the spring of 1953, I was a senior at Princeton, and I applied to various universities for admission to graduate school. It soon became apparent that I only needed to apply to one graduate school. Professor A. W. Tucker was not yet the chairman of the Mathematics Department, but he was already acting as if he were. Solomon Lefschetz, the nominal chairman on the verge of retirement, would make fun of Tucker, by lavishing in public uncomfortably high praise on Tucker's managerial skills. There were few undergraduate majors, maybe a half dozen each year, and Al Tucker would see to it that they were sent to the "right" graduate schools. He made sure that Jack Milnor stayed in Princeton, and he sent Hyman Bass, Steve Chase, and Jack Eagon to Chicago, Mike Artin to Harvard.
In April 1953, I wrote a letter of acceptance to the University of Chicago, which had offered me a handsome fellowship (in those days, it was extremely easy to be offered a graduate fellowship anywhere). On my way to the mailbox, I met Professor Tucker on the narrow, rickety stairs of the old Fine Hall. He asked me where I had decided to go to graduate school, and, upon hearing of my decision, he immediately retorted, "You are not going to Chicago, you are going to Yale!" I had no choice but to do what he bid me; I tore up the letter to Chicago and wrote an identical letter of acceptance of a fellowship that I had been offered by Yale. In retrospect, my decision to go to Yale is one of the few right decisions I have made, and I will always thank Al Tucker's memory for guiding me to it.
Don Spencer, another of my undergraduate teachers, was first to mention Jack Schwartz's name to me. He complimented me on my choice of a graduate school, with the remark: "Oh yes, Yale, that is where Jack Schwartz is..." It was an astounding statement, considering that Jack Schwartz was getting his Ph.D. from Yale that very month. Spencer's remark began a process of turning Jack Schwartz into a mythological figure in my mind, a process that did not stop after I actually met Jack Schwartz a few months later.
Actually, I have never been able to stop the process.
Josiah Willard Gibbs
The sciences at Yale have always played second fiddle to the humanities. At faculty meetings it is not unusual to witness a professor of literature point with a wide gesture, like a Roman senator, towards Hillhouse Avenue, where most of the science departments are located, and begin an oratorical sentence with the words "Even in the sciences..." Despite the distrust that Yale College has felt towards science, Yale was once blessed with the presence of one of the foremost scientists of the nineteenth century, namely, Josiah Willard Gibbs. Gibbs served as a professor at Yale without any stipend. Professors did not receive any salary from Yale in those happy days. Teaching young men from the upper echelons was not a salaried profession, it was a privilege. The administration did of course receive handsome salaries, like all administrations in all times of history.
One day, Gibbs received an offer from the recently founded Johns Hopkins University. It was an endowed professorship. We may hazard the guess that it was the position Sylvester had relinquished when he accepted a professorship at Oxford, after the requirement of religious vows for professors was dropped by the two English universities. Thanks to its endowment, the Johns Hopkins professorship carried a stipend of one hundred dollars a year. It is unclear whether Gibbs was delighted with the offer; in any case he felt obliged to get ready to move to Baltimore. One of his colleagues realized that Gibbs was packing, and hastened to contact the Dean of the College. The Dean asked the colleague if he could do something to keep Gibbs at Yale. "Why, just tell him that you'd like him to remain at Yale!" answered the colleague. The Dean kept his word and did what the colleague had recommended. He summoned Gibbs to his office and generously let him know that he wanted Gibbs to stay. It was the kind of reassurance Gibbs needed. He declined the Johns Hopkins offer, and remained at Yale for the rest of his career.
Some of Gibbs's most original papers in statistical mechanics were published in the Proceedings of the Connecticut Academy of Sciences, a journal which I dare surmise few of us have ever seen. One might wonder how papers which saw the light in such an obscure publication could manage to receive within a short time worldwide publicity and acclaim. After I moved to Yale in the summer of 1953, I accidentally found the answer to this puzzle. There was no mathematics library at Yale in the fifties; a mathematics library was not opened until the early sixties, after several members of the Mathematics Department had threatened to quit.
Before that time, mathematics books were relegated to a few shelves in the Sterling Library, randomly classified under that miscarriage of reason that was the Dewey Decimal System. All students had access to the shelves. In August 1953, I used to walk through the mathematics shelves of the Sterling Library and to pull out books at random, as we do when we are young. Next to an array of perused calculus books were hard-bound lecture notes of courses offered at Yale at various times by members of the faculty. Among these were some course notes by Gibbs, written in his own hand. A few additional sheets were glued to one of these volumes. The names of all notable scientists of Gibbs's time were listed in these sheets, from Poincaré and Hilbert and Boltzmann and Mach, all the way to individuals who are now all but forgotten. Altogether, more than two hundred names and addresses were alphabetized in a beautiful, fading handwriting. Those sheets were a copy of Gibbs's mailing list. As I leafed through it with amazement, I realized at last how Gibbs had succeeded in getting himself to be known in a short time. I also learned an instant lesson, the importance of keeping a mailing list.
Yale in the Fifties
In the early fifties, Yale had not yet lost the charm of a posh out-of-the-way college for the children of the wealthy. Erwin Chargaff, in his autobiography Heraclitean Fire, describes Yale in the following words:
Yale University was much more of a college than a graduate school; and the undergraduates were all over town. They were digesting their last goldfish, for the period of whoopee, speakeasies, and raccoon coats was coming to an end, to be replaced by a grimmer America which was never to recover the joy of upper-class life. The University proper was much less in evidence. Shallow celebrities, such as William Lyon Phelps, owed their evanescent fame to the skill with which they kept their students in a state of elevated somnolence.
The main part of the campus, consisting of nine shining colleges in the middle of New Haven, was of recent vintage. At the lower end of Hillhouse Avenue, the red bricks of Silliman College shone like the plaster of a movie set as one made one's way back to the main campus from the deliberately distant science buildings. Envious Englishmen spread the malicious rumor that the colleges that Mr. Harkness's money had built were Hollywood-style imitations of Oxford colleges. But nowadays the shoe is on the other foot, and it is Oxford that is at the receiving end of other jibes.
The graduate school was a genteel (though less and less gentile) appendage added to the University by gracious assent of the Dean of the College. The Dean of the College held the real power, and he could overrule the President. Since the thirties, professors appointed to the few and ill-paid graduate chairs had consistently turned out to be better scholars and scientists than the Administration had foreseen at tenure time.
Nonetheless, evil tongues from Northern New England whispered that a certain well-known physics professor would never have made it past assistant professor in Cambridge; but he was one of the last exceptions, soon to fade into best-sellerdom. Hard work, the kind one reads about in the hagiographies of scientists, was regarded by the graduate students with embarrassment. It was not unusual for a graduate student to spend seven postgraduate years as a teaching assistant before being reluctantly awarded a terminal Ph.D. The university cynically encouraged graduate students to defer their degrees: the money saved by hiring low-paid teaching assistants in place of professors could be used to enrich the rare book collection. Writing a doctoral dissertation was an in-house affair, having little to do with publishing or with distasteful professionalism.
On learning about the shocking leisure of graduate life at Yale in the fifties, one may seek shelter in one of the current philosophies of education, which promise instant relief from the onslaught of reality. One would thereby be led to the mistaken conclusion that "creativity" (a pompous word currently enjoying a fleeting but insidious vogue) would be stifled in the constricted, provincial, unhurried atmosphere of New Haven. The facts tell a different story. The comforts of an easy daily routine in a rigidly circumscribed environment, encouraged by the indulgent scrutiny of benign superiors, foster the life of the mind. Professors were poorly paid but enjoyed unquestioned prestige. In their sumptuous quarters in the colleges they would encourage their students with sherry and conversation. Purposeless delectation in ideas may be as educational as intensive study. At Yale, together with the enjoyment of an absorbing range of campus activities went the lingering belief that nothing much mattered in that little corner of the world. Teachers and students were thereby led to meet the fundamental requirement of a successful educational experience: They were kept from taking themselves too seriously.
There is a fundamental difference between the quality of life in Northern New England and in Southern New England. It comes from the shadows. On a Cambridge Sunday, the sharp shadows across the Charles River cut out the outlines of the distant buildings of Boston as if made of stiff cardboard, and deepen the blue of the water. In New Haven, by contrast, the light shadows are softened in a silky white haze, which encloses the colleges in a cozy aura of unreality. Such foresight of Mother Nature bespeaks a parting of destinies.
Mathematics at Yale
The Mathematics Department was the first of the science departments to awaken. It was not until the fifties when the last of a long line of professional teachers of calculus retired from the Mathematics Department: fine, upright gentlemen of the old school, richly endowed with family values, who reaped handsome profits on the royalties of their best-selling textbooks. The mathematicians who replaced them were eager to create a research atmosphere, and at last a few graduate students were slowly beginning to drift over to New Haven. From the beginning of the Yale graduate school all the way to the twenties, the one notable research mathematician to have taught at Yale was E. H. Moore, and two of the few distinguished mathematicians to come out of Yale until the fifties were Marshall Hall and Irving Segal. In the fifties, a sudden plethora of stars appeared, led by Jack Schwartz.
It is not clear how functional analysis took over the Mathematics Department. Einar Hille was hired away from Princeton sometime in the mid-thirties, but for several years he was one of two research mathematicians. At the time, several universities would hire one and only one "research mathematician"; Yale could afford as many as two: Einar Hille and Oystein Ore. Nelson Dunford was next to come, as an assistant professor. Soon after his arrival, he received an attractive offer from the University of Wisconsin, and Yale took the unusual step of promoting him two steps up to a full professorship. After the end of World War Two, Kakutani came over from Japan, and Charles Rickart from Michigan. By the early fifties, just about every younger mathematician at Yale was working in functional analysis, and the weekly seminars were attended by well over 50 people.
The core of graduate education in mathematics was Dunford's course in linear operators. Everyone who was interested in mathematics at Yale eventually went through the experience, even some brilliant undergraduates, such as Andy Gleason, McGeorge Bundy and Murray Gell-Mann. The course was taught in the style of R. L. Moore: mimeographed sheets containing unproved statements were handed out every once in a while, and the students would be asked to produce proofs on request. Occasionally, some student at the blackboard would fall silent. Dunford would make no effort to help, and the silence, sometimes lasting the whole 50 minutes, became unbearable to all. I suspect that Dunford wanted to minimize his teaching load, which in those years ran to 12 hours per week for full professors. Everyone who took Dunford's course was marked by it. George Seligman once remarked to me that Dunford's course in linear operators was the turning point in his graduate career as an algebraist.
Dunford had an unusual youth. After being passed over for a graduate fellowship in the middle of the depression in the thirties, he survived in St. Louis on 10 dollars a month, while studying and writing in the public library. Remarkably, the St. Louis library did subscribe to the few mathematics research journals of the time, and while unemployed in St. Louis Dunford managed to finish his first paper, which deals with integration of functions with values in a Banach space. After the paper had been accepted for publication in the Transactions of the A. M. S., Dunford was offered an assistantship at Brown, where he worked under Tamarkin. His doctoral dissertation dealt with the functional calculus that bears his name. He was hired by Yale right after he received his Ph.D., and spent his entire career there. He retired early, ostensibly because he had made lucrative investments in art and in the stock market. But in reality, Dunford's retirement could be another episode of The Bridge of San Luis Rey. It coincided with the completion of his life work, which was the three-volume treatise Linear Operators, written in collaboration with his student Jack Schwartz.
[Figure: Post-retirement portrait of Nelson Dunford and his wife.]
Linear Operators started out as a set of solutions to problems handed out in class; it gradually increased in size. Soon after Jack Schwartz enrolled in the course, Dunford asked him to become co-author. The project quickly expanded to include Bill Bade and Bob Bartle, as well as several students, instructors and assistant professors. It was fully supported by the Office of Naval Research. There is a persistent rumor, never quite denied, that every nuclear submarine on duty carries a copy of Linear Operators.
Abstraction in Mathematics
The pendulum of mathematics swings back and forth towards abstraction and away from it, with a timing that remains to be estimated. The period that runs roughly from the twenties to the middle seventies was an age of abstraction. It probably reached its peak in the fifties and sixties. The fifties were the heyday of functional analysis, as the sixties were the heyday of algebraic geometry. The two major centers of functional analysis in the fifties were Yale and Chicago. The Mathematics Department at Stanford, which consisted entirely of classical analysts, had trouble finding graduate students. The great classical analysts at Stanford, such names as Pólya, Szegő, Loewner, Bergman, Schiffer, and the first Spencer, were considered to be hopelessly old-fashioned. At Yale you could find no analysis courses offered other than functional analysis and supporting abstractions. Algebra reached an independent peak of abstraction with Nathan Jacobson and Oystein Ore. There was a standing bet among graduate students at Yale that whenever a doctoral dissertation in analysis was turned in, the writer would be challenged to use its results to give a new proof of the spectral theorem.
In those days, no one doubted that the more abstract the mathematics, the better it would be. A distinguished mathematician, who is still alive, pointedly remarked to me in 1955 that any existence theorem for partial differential equations which had been proved without using a topological fixpoint theorem should be dismissed as applied mathematics. Another equally distinguished mathematician once whispered to me in 1956, "Did you know that your algebra teacher Oystein Ore has published papers in graph theory? Don't let this get around!" Sometime in the early eighties the tables were turned, and a stampede away from abstraction started, which is still going on. A couple of years ago I listened to a lecture by a well-known probabilist, which dealt with properties of Markov processes. After the lecture, I remarked to the speaker that his presentation could be considerably shortened if he expressed his results in terms of positive operators rather than in terms of kernels. "I know," he answered, "but if I had lectured on positive operators nobody would have paid any attention!" There are already signs that the tables may be turning again, and we old abstractionists are waiting with mischievous glee for the pendulum to swing back. Just a few months ago, I overheard a conversation between two brilliant assistant professors, purporting to provide an extraordinary simplification to some recently proved theorem; eventually, I realized with pleasant surprise that they were rediscovering the usefulness of taking adjoints of operators.
Linear Operators: the Past
The three-volume treatise Linear Operators was originally meant by Dunford as a brief introduction to the new functional analysis, and to the spectral theory that had been initiated by Hilbert and Hellinger, but that had not really taken root until the work of von Neumann and Stone. Dunford, however, championed spectral theory as a new field. He introduced the term "resolution of the identity," and he developed the program of extending spectral theory to non-self-adjoint operators. The initial core of the book consisted of what are now chapters 2, 4, and 7, as well as some material on spectral theory now in chapter 11; eventually, this material expanded into two volumes. The idea of volume three was a belated one, coming in the wake of the development of the theory of spectral operators.
The writing of Linear Operators took approximately 20 years, starting in the late forties. The third volume was published in 1971. Entire sections and even entire chapters were added to the text at various times, up to the last minute. For example, one of the last bits to be added to the first volume right before it went to press is the last part of section 16 of chapter 4, containing the Gauss-Wiener integral in Hilbert space together with a simple formula relating it to the ordinary Wiener integral. This section was the subject of a lecture that Jack gave at the famous seminar on integration in function space that was held at the Courant Institute in the fall of 1956.
The flavor of the first drafts of the book can be gleaned from reading chapter 2, which underwent fewer redrafts than most of the other chapters. Dunford meant the three theorems proved in this chapter, namely, the Hahn-Banach theorem, the uniform boundedness theorem and the closed graph theorem, to be the cornerstones of functional analysis. The exercises for this chapter, which in their first draft were rather dry, were eventually enriched by a set of exercises on summability of series. These problems are continued in chapter 4, and conclude in chapter 11 with the full expanse of Tauberian theorems.
The contrast between the uncompromising abstraction of the text and the incredible variety of concrete examples in the exercises is immensely beneficial to the student who learns mathematical analysis from Dunford-Schwartz. The topics dealt with in Dunford-Schwartz can be roughly divided into three kinds. There are topics for which Dunford-Schwartz is still the definitive account. There are, on the other hand, other topics fully dealt with in the text which ought to be well-known, but which have yet to be properly read. Finally, there are topics that are still ahead of the times, and that remain to be fully appreciated. Presumptuous as it is on my part, I will try to give some examples of each kind.
Besides the introductory chapter on Banach spaces (chapter 2), the treatment of the Stone-Weierstrass theorem and all that goes with it in chapter 4 still makes very enjoyable reading nowadays; in its time, it was the first thorough account. The short sections on Bohr compactification and almost periodic functions are also still the best reference for a quick summary of Bohr's extensive theory.
Section 12 of chapter 5 is remarkable. It presents a proof of the Brouwer fixpoint theorem. The proof was submitted for publication in a journal in 1954, but it was rejected by an irate referee, a topologist who was miffed by the fact that the proof uses no homology theory whatsoever. Instead, it uses some determinantal identities, the kind that are now again becoming fashionable. Spectral theory proper does not make an appearance until chapters 7 and 8, with the functional calculus and the theory of semigroups. In those days, such terms as "resolvent" and "spectrum" carried an aura of mystery, and the spectral mapping theorem sounded like magic. The meat and potatoes comes in chapters 10, 12, and 13; the proofs are invariably the most instructive, bringing into full play the abstract theory of boundary conditions of Calkin and von Neumann, as well as the theory of deficiency indices.
Linear Operators: the Present
There are topics for which Dunford-Schwartz was the starting point of a long development, and which have grown into autonomous subjects. Thus, for example, the notion of unconditional convergence of series in Banach spaces, which goes back to an old theorem of Steinitz and which is mentioned in chapter 2 almost as a curiosity, has blossomed into a full-fledged discipline. The same can be said of the geometry of Banach spaces initiated in chapter 4, and of the theory of convexity in chapter 5. In the sixties, several mathematicians pronounced the general theory of Banach spaces dead several times over, but this is not what happened. The geometry of Banach spaces not only managed to survive, but it is now widely considered to be the deepest chapter of convex geometry. Grothendieck once told me that his favorite theorem of his analysis period was a convexity theorem that generalizes a result in Dunford-Schwartz. Unfortunately, he published it in an obscure Brazilian journal, and he never received any reprints of the papers.
The theory of vector-valued measures in chapter 4 has equally blossomed into a chapter of functional analysis of beauty and depth. Strangely, at the time of the book's writing, we all thought that this theory had reached its definitive stage, perhaps because the proofs were so crystal clear. The same can be said of the theory of representation of linear operators in chapter 6; here again whole theories nowadays replace single sections of Dunford-Schwartz. Corollary 5 of Section 7, stating that in certain circumstances the product of two weakly compact operators is a compact operator, has always struck me as one of the most elegant results in functional analysis, and undoubtedly sooner or later some extraordinary application of it will be found, as should happen to all beautiful theorems.
Thorin's proof of the Riesz convexity theorem had appeared a short time before chapter 6 was written, and it is here given its first billing in a textbook. I take the liberty of calling your attention to problem 15 of section 11. This exercise holds the key to giving one-line proofs of some of the famous inequalities in the classic book by Hardy, Littlewood, and Pólya. Section 12.9 has been scandalously neglected. The classical moment problems are thoroughly dealt with in this section by an application of the spectral theorem for unbounded self-adjoint operators. It is shown in a couple of pages that the various criteria for determinacy of the moment problem can be inferred from a simple computation with deficiency indices. Partial rediscoveries of this fact are still being published every few years by mathematicians who haven't done their reading.
Linear Operators: the Future
Finally, there are numerous subjects that were first written up in Dunford-Schwartz, from which the mathematical world has yet to benefit. It is surprising to hear from time to time probabilists or physicists addressing problems for which they would find ready help in Dunford-Schwartz. The functional analytic incompetence of physicists has decreased since the fifties, but one suspects that a lot of research funds might be saved if all physicists were to be required to have some basic functional-analytic background. Once, while I was a graduate student, a physicist working in quantum mechanics, who is now one of the leading theoretical physicists of our day, asked me to describe the difference between a symmetric and a self-adjoint operator in Hilbert space, which he ignored; one wonders how much the situation has improved in forty years.
Chapter 3 on measure theory is one of the chapters inserted at a fairly late stage. It has not been read much, perhaps because every reader believes he or she is supposed to know measure theory when embarking upon the reading of Dunford-Schwartz. Actually, chapter 3 contains a number of yet-to-be-appreciated jewels. One of them is the comprehensive treatment of theorems of the Vitali-Hahn-Saks type. The proofs are so concocted as to bring out the analogies between the combinatorics of sigma-fields of sets and the algebra of linear spaces. Few analysts make use of this kind of reasoning. In probability, an appeal to the Vitali-Hahn-Saks theorem would bypass technical complications that are instead settled by the Choquet theory, for example, randomization theorems of the De Finetti type. Apparently, the only probabilist to have taken advantage of this opportunity is Alfred Rényi in an elementary introduction to probability that also has been little read. Similarly, one wonders why so little use is still made of theorem III.7.6, which might come in handy in integral geometry.
Large portions of spectral theory presented in Dunford-Schwartz remain to be assimilated. Thus, for example, the fine theory of Hilbert-Schmidt operators and the wholly original theory of subdiagonalization of compact operators in chapter 11 have not been read. The spectral theory of non-self-adjoint operators of chapters 14 through 19 is a gold mine that is still waiting for its day in the sun; only the latter parts of chapter 20, dealing with what the authors have successfully called "Friedrichs's method" and with the wave operator method, have been developed since the treatise was published. It will be a pleasure to watch the rediscovery of these chapters by the younger generations.
Working with Jack Schwartz
There are fringe benefits to being a student of Jack's. Occasionally I decline invitations to attend meetings in computer science and even in economics, from organizers who mistakenly assume that I have inherited my thesis advisor's interests. Two traits of Jack's personality have particularly endeared him to his students. One is his instinctive understanding of another person's state of mind, his tact in dealing with difficult situations. He gives encouragement without exaggerating, and he knows how to steer his friends away from being their own worst enemies. The second is his Leibnizian universality. It spills over onto all of us, it lifts us and points us in the right direction. Whatever topic he deals with at one time he sees as a stepping stone to some wider horizon to be dealt with at some future time. Both of these qualities shine in the pages of Linear Operators, the first in the transparent proofs, the second in the encyclopedic range of the material that is dealt with in 2592 pages.
I was hired to work on the Dunford-Schwartz project in the summer of 1955, together with Bob McGarvey. Immediately, Jack took us aside and let us in on the delicate matter of the semicolons. There were to be no semicolons in anything we wrote for the project. Dunford would get red in the face every time he saw a semicolon. For years hence, I was terrified of being caught using a semicolon, and you may verify that in the three printed volumes of Dunford-Schwartz not a single semicolon is to be found. I was asked to check the problems in chapter 3, while Bob was checking problems in chapters 7 and 8. We would all get together every morning in a little office in Leet-Oliver Hall, an office that nowadays would not be considered fit for a teaching assistant.
A bulky record player, which we had bought for ten dollars, occupied much of the space; we played over and over the entire sequence of Beethoven quartets and Bach partitas while working on the problems. It took me half the summer to finish checking the problems in chapter 3. There were a few that I had trouble with, and worst of all, I was unable to work out problem 10 of section 9. One evening, Dunford and several other members of the group got together to discuss changes in the exercises. Jack was in New York City. It was a warm summer evening, and we sat on the hard wooden chairs of the corner office of Leet-Oliver Hall. Pleasant sounds of squawking crickets and frogs came through the open window, and mosquitoes were flying in through the open Gothic windows. After I admitted my failure to work out problem 10, Dunford tried one trick after another on the blackboard, in an effort to solve the problem or to find a counterexample. No one remembered where the problem came from, or who had inserted it. After a few hours, we all got up and left, somewhat downcast. The next morning, I met Jack, who patted me on the back and told me, "Don't worry, I could not do it either." I did not hear again about problem 10 of section 9 for another three years. A first-year graduate student took Dunford's course in linear operators. Dunford assigned him the problem, and the student solved it, and developed an elegant theory around it. His name is Robert Langlands.
In the second half of the summer of 1955, after checking the problems in chapter 3, I was assigned to check the problems in spectral theory of differential operators in chapter 13. This is the chapter of Dunford-Schwartz that decided my career in mathematics. Apparently, I had less difficulty with this second round of exercises, but I made a number of careless mistakes, as I always have since. One day, I was unexpectedly called in by Dunford. The details of this meeting have been many times rewritten in my mind. The large office was empty, except for Dunford and Schwartz sitting together at the desk in the shadow, like judges. "We have decided to assign you the problems in sections G and H of chapter 13," they said. A minute of silence followed. I had the feeling that there was something they weren't saying. Eventually, I got it. They were NOT assigning me the problems in section I, which dealt with the use of special functions in eigenfunction expansions.
I soon learned, somewhat to my annoyance, that the person in charge of checking the problems in section I was an undergraduate who had just gotten his B.A. two months before. "You will never find a better undergraduate in math coming out of Yale," Jack told me, aware of my feelings. He was right. The undergraduate checked all the special function problems by the end of the summer, and section I is now spotless. His name is John Thompson, and he went on to win the Fields Medal.
I have kept a copy of the mimeographed version of the manuscript of Dunford-Schwartz. On gloomy days, I pull the dusty 15-pound bulk out of the shelf. Reading the now-yellowed pages, with their inky smell, was once a great adventure; rereading them after 40 years is a happy homecoming.
Department of Mathematics
Massachusetts Institute of Technology
Cambridge, MA 02139, USA
Ian Stewart*
The catapult that Archimedes built, the gambling-houses that Descartes frequented in his dissolute youth, the field where Galois fought his duel, the bridge where Hamilton carved quaternions - not all of these monuments to mathematical history survive today, but the mathematician on vacation can still find many reminders of our subject's glorious and inglorious past: statues, plaques, graves, the café where the famous conjecture was made, the desk where the famous initials are scratched, birthplaces, houses, memorials. Does your hometown have a mathematical tourist attraction? Have you encountered a mathematical sight on your travels? If so, we invite you to submit to this column a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks. Please send all submissions to the Mathematical Tourist Editor, Ian Stewart.
Sacred Star Polyhedron
István Hargittai
There is a beautiful star polyhedron at the top of the Sacristy of St. Peter's Basilica in Vatican City (Fig. 1). It was built by the architect Carlo Marchionni in the years 1776-1784. It is a great stellated dodecahedron, also called Kepler's great stellated dodecahedron (Fig. 2 [1]), with 2 of its 20 triangular pyramids left out to accommodate the vertical rod serving as the stand of the cross above the polyhedron.
There are many other examples of star polyhedron decorations from even earlier times, such as at the top of the obelisks in St. Peter's Square and in the Rotonda Square in Rome, and on the gate in the Square of September 20 in Bologna (Fig. 3). The star polyhedron often stands on a pile of dome-shaped stones. An octagonal star standing on top of a pile of dome-shaped stones was a characteristic motif in the coat of arms of the Chigi family of Pope Alexander VII (1655-1667). This motif is prominently displayed on the colonnades of St. Peter's Square (Fig. 4).
Giovanni Lorenzo Bernini (1598-1680) and Francesco Borromini (1599-1667) were leading architects of the Baroque period and their activities overlapped with the reign of Pope Alexander VII. The octagonal star and the coat of arms of the Chigi family are conspicuously present in many of their works. Figure 5 shows Sant Ivo's Church and three of its details by Borromini. Two of them display star polyhedra on piles of dome-shaped stones and octahedral stars. However, the decoration beneath the cross at the top of the tower is not a polyhedron but a sphere.
All photographs in this article were taken by the author in 1993. I am grateful to Anna Rita Campanelli and Aldo Domenicano (Rome), Lodovico Riva di Sanseverino (Bologna), and Magnus J. Wenninger (Collegeville, Minnesota) for assistance and advice.
Figure 1. Left: The Sacristy of St. Peter's Basilica in Vatican City; right: the star polyhedron at its top.
Figure 2. Great stellated dodecahedron. Photograph courtesy of Magnus J. Wenninger [1].
*Column Editor's address: Mathematics Institute, University of Warwick, Coventry, CV4 7AL England.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
Figure 3. Left: Top of the obelisk in St. Peter's Square, Vatican City; center: top of the obelisk in Rotonda Square, Rome; right: one of the two side decorations of the gate in the Square of September 20, Bologna.
Reference
1. M. J. Wenninger, Polyhedron Models, New York: Cambridge University Press (1971).
Budapest Technical University
Szt. Gellért tér 4
H-1521 Budapest, Hungary
Figure 4. Decoration from the top of the colonnade in St. Peter's Square, Vatican City.
Figure 5. Sant Ivo's Church (top right) with three details enlarged (above).
Simon Stevin's Statue
Dirk Huylebrouck
Simon Stevin was born in the Belgian city of Bruges in 1548, but left Belgium in 1582 and a few years later became professor of mathematics at the University of Leyden. A successful engineer, he first published his mathematics in Latin (1583: Problemata Geometrica), but later defended the "use of the mother tongue to stimulate the progress of science" (1585: Dialectike ofte Bewysconste, The Art of Proving Statements). He died in 1620. The University of Gent placed his bust in an auditorium, and until 1994 its mathematics review was named after him (see below).
Belgium-in-24-hour tourists always have on their program a visit to the Venice of the North, Bruges. From Brussels, there is only one way to drive through this city crowded with tourists, and one cannot miss the Simon Stevin square in the centre of Bruges. A statue erected in his honor shows a thinking man, holding a pair of compasses in the right hand, and resting the left hand on a book with a drawing of a parallelogram for adding forces. The inscription SIMON STEVIN INAUG. MDCCCXLVI.F. tells us it took the city more than 200 years to honor the mathematician. It was indeed quite a controversial decision. Until the first half of the 19th century, the Catholic Belgian and Protestant Dutch blocs were involved in something like a cold war, and to some Stevin had passed to the other side of the religious curtain. Several politicians and priests did not hesitate to use insults, but the offended party fortunately got the statue anyway. The following plea in his favor by the (Belgian!) physicist A. Quetelet is more polite, although the ordinary people he refers to include a member of the Brussels Academy of Science. Many of the statements may still be valid today (just replace "princes," "crusade," etc. by "generals," "war," etc. and use names you think appropriate instead of Simon Stevin and Bruges):
Simon Stevin, no matter what foreigners have said, was not forgotten by his compatriots. His statue will decorate his native city and will make her proud, a pride he felt himself for her, since it was the only title he used in his works, on the front pages of which one reads the words so remarkable in their simplicity: "By Simon Stevin of Bruges". But, one may say, does an ordinary scholar, whose name the ordinary people do not know, deserve the honor of a statue? Certainly! an ordinary scholar, who, lost in the mass of people, has grown by himself and the force of his genius up to the highest conceptions: who, by his work and his insight, impregnated the domain of the intelligence; who tore aside with a steady hand the veils covering the great laws of nature; who enriched us with useful discoveries whose fruits we reap peacefully: what, this scholar should not take place next to those great conquerors who distinguished themselves, very often, by the evil they caused to humanity: those princes who impoverished and exterminated their population, and brought ruin and desolation to their neighbors? If you deify those men, then do not deny the honors given to great virtues, to sublime intellects. Those precious qualities are more obvious signs of Divinity than those you honor by your statues. It is in the obscurity of the forest, in the childhood of society that man, still under the strain of material compulsion, elevated fear and glorified him who inspired it.
Today, our honors must see higher; and the nation that knows how to celebrate the great military virtues, who made a statue for the famous head of the first crusade, for the hero praised by Tasso; that nation will not refuse to use the talent of its sculptors to reproduce features of its children who distinguished themselves in other careers as well. If the ordinary people do not know their names, let them learn them; that they know who their benefactors were. Ingratitude is humiliating; it is one of the principal factors of dissolution of societies: it breaks the links, fosters political egoism, and dries up the source of all the civic virtues. Honor to the city of Bruges, which wanted to celebrate the memory of one of its most famous sons! More than one young talent will be roused before this monument of gratitude, and even a foreigner will not look at it unmoved. Before climbing, two centuries after his death, on the pedestal destined to him, the scholar of Bruges met more than one obstacle. Was he not even accused of bearing arms against his country? And on what proof was this accusation based? I do not know, and neither do those who made the accusations, because the life of Simon Stevin is clouded by mysteries; and although the scholar held high functions, one only knows him through his works and by the few things he told us about himself in his works. But the silence of history does not authorize us to become unjust twice towards him.
Quetelet's text is quoted in A. Vanhoutryve's book The Statues of Bruges, pp. 22-23. The photograph was provided by Mr. R. Jacobus, press attaché for the city of Bruges.
Aartshertogstraat 42
8400 Oostende
Belgium
Quaternionic Determinants
Helmer Aslaksen
Introduction
The classical matrix groups are of fundamental importance in many parts of geometry and algebra. Some of them, like $\mathrm{Sp}(n)$, are most conceptually defined as groups of quaternionic matrices. But, the quaternions not being commutative, we must reconsider some aspects of linear algebra. In particular, it is not clear how to define the determinant of a quaternionic matrix. Over the years, many people have given different definitions. In this article I will discuss some of these.
Let us first briefly recall some basic facts about quaternions. The quaternions were discovered on October 16, 1843 by Sir William Rowan Hamilton. (For more on the history, I recommend [19], [31], [47], and [48].) They form a noncommutative, associative algebra over $\mathbb{R}$:
$$\mathbb{H} = \{a + ib + jc + kd \mid a, b, c, d \in \mathbb{R}\},$$
where
$$i^2 = j^2 = k^2 = -1, \quad ij = k = -ji, \quad jk = i = -kj, \quad ki = j = -ik.$$
We can also express $z \in \mathbb{H}$ in the form $z = x + jy$, where $x, y \in \mathbb{C}$, but then we have to remember that $yj = j\bar{y}$ for $y \in \mathbb{C}$. Notice that $\mathbb{H}$ is not an algebra over $\mathbb{C}$, since the center of $\mathbb{H}$ is only $\mathbb{R}$. Conjugation in $\mathbb{H}$ is defined by $\overline{a + ib + jc + kd} = a - ib - jc - kd$ and satisfies $\overline{hk} = \bar{k}\bar{h}$. We will call the quaternions of the form $ib + jc + kd$ with $b, c, d \in \mathbb{R}$ the pure quaternions.
For any ring $R$, we let $R^*$ denote the set of units in $R$, i.e., the invertible elements of $R$. If $R$ is a skewfield, then $R^* = R - \{0\}$. Let $M(n, R)$ be the ring of $n \times n$ matrices with entries in $R$. We will denote the set of invertible $n \times n$ matrices over $R$ by $GL(n, R)$. (Some readers might worry about our definition of invertible in $M(n, \mathbb{H})$: Is there a distinction between left and right inverses? We will see later that there is no such problem. See also [15] and [32].)
Cayley
The most simple-minded approach when trying to define the determinant of a quaternionic matrix would be to use the usual formula. But then the question is: Which
usual formula? For a $2 \times 2$ determinant we could use $a_{11}a_{22} - a_{12}a_{21}$ (expanding along the first row), or $a_{11}a_{22} - a_{21}a_{12}$ (expanding along the first column), or some other ordering of the factors in the usual formula. To a modern mathematician, this lack of a canonical definition is an indication that this is not the correct approach. But we might still ask ourselves: What exactly would happen if we tried one of these formulas? In 1845, just 2 years after Hamilton's discovery of the quaternions, Arthur Cayley [10, 35] did precisely this. He chose to expand both the original matrix and all the minors along the first column (or vertical row as he called it). If we denote the Cayley determinant by Cdet, we get
$$\operatorname{Cdet}\begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix} = a_1 b_2 - a_2 b_1$$
and
$$\operatorname{Cdet}\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix} = a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1).$$
Is this a good definition? Cayley himself points out that if two rows are the same in a $2 \times 2$ matrix, then
$$\operatorname{Cdet}\begin{pmatrix} a & b \\ a & b \end{pmatrix} = ab - ab = 0,$$
whereas if two columns are the same in a $2 \times 2$ matrix, then
$$\operatorname{Cdet}\begin{pmatrix} a & a \\ b & b \end{pmatrix} = ab - ba,$$
which in general is nonzero. For some reason, this didn't seem to bother Cayley much, and he happily proceeded to write a couple more pages about his new function. But it should bother us. Let us try to clarify the situation by first deciding on which properties we want the determinant to satisfy. Based on our experience with complex matrices, we will call $d : M(n, \mathbb{H}) \to \mathbb{H}$ a determinant if it satisfies the following three axioms.
AXIOM 1. $d(A) = 0$ if and only if $A$ is singular.
AXIOM 2. $d(AB) = d(A)d(B)$ for all $A, B \in M(n, \mathbb{H})$.
AXIOM 3. If $A'$ is obtained from $A$ by adding a left-multiple of a row to another row or a right-multiple of a column to another column, then $d(A') = d(A)$.
Let me make some comments about these axioms. It can be shown [7] that if $d$ is not constantly equal to 0 or 1, then Axiom 2 implies that $d(A) = 0$ for all singular matrices. Thus, we need only to define the determinant of invertible matrices. Notice that in Axiom 3 there is a distinction between left and right scalar multiplication. Consider the mapping $T(v) = cv$. Then, for $f \in \mathbb{H}$,
$$T(fv) = c(fv)$$
is in general different from
$$fT(v) = f(cv),$$
whereas
$$T(vf) = c(vf) = cvf = T(v)f.$$
We see that we must write the coefficients of a linear transformation on the opposite side of what we use for the vector space structure. I will identify vectors with columns and identify linear transformations with matrices on the left, but consider all vector spaces to be right vector spaces.
Axiom 3 can be expressed in terms of matrix multiplication. Let $e_{ij}$ be the matrix with a 1 in the $(i, j)$ entry, and 0 otherwise. Define
$$B_{ij}(b) = I_n + b e_{ij} \quad \text{for } i \neq j.$$
Multiplying a matrix $A$ by $B_{ij}(b)$ on the left adds the $j$th row multiplied by $b$ on the left to the $i$th row, whereas multiplying $A$ by $B_{ij}(b)$ on the right adds the $i$th column multiplied by $b$ on the right to the $j$th column. So Axiom 3 can be restated (using Axiom 2) as saying that $d(B_{ij}(b)) = 1$. It is easy to see that
Bq(b)- i = Bq(- b), so it follows that products of Bq(b)'s generate a subgroup of GL(n, H), which we will denote by SL(n, H). Notice that when K is a field, we define SL(n, K) to be the set of matrices with determinant equal to 1. But because we don't have a determinant yet, we must define SL(n, H) in some other way, and then hope that once we have our determinant, it will have SL(n, H) as its kernel. That Axiom 3 can be restated as saying that matrices in SL(n, H) have determinant equal to 1 is therefore promising. An obvious question is now whether such determinants exist. Let me first state a simple obstruction. THEOREM 1. Assume that d is a determinant, i.e., d satisfies our three axioms. Then the image d(M(n, H)) is a commutative subset of H. This theorem essentially says that when trying to define a quaternionic determinant, we must keep it complexvalued. This rules out Cayley's definition, since Cdet is onto H. The proof of Theorem I depends on the next two lemmas. We first observe that the definition of Bi)(b) only
involves two indices. We can, therefore, often assume without loss of generality that n = 2. A simple calculaton proves the following lemma.
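The computations so far are easy to check by machine. The following Python sketch is only an illustration under stated assumptions: quaternions are stored as 4-tuples, the helper names (`qmul`, `cdet`, `B12`, `B21`) are hypothetical, and the six-factor product of matrices $B_{ij}(b)$ is one factorization of $\operatorname{diag}(a, a^{-1})$ chosen for illustration, not necessarily the one intended in the lemma that follows.

```python
from fractions import Fraction
from functools import reduce

# Quaternions are 4-tuples (w, x, y, z) standing for w + x*i + y*j + z*k.
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qneg(p):
    return tuple(-x for x in p)

def qinv(p):
    n2 = sum(x * x for x in p)  # squared length, assumed nonzero
    w, x, y, z = p
    return (Fraction(w, n2), Fraction(-x, n2), Fraction(-y, n2), Fraction(-z, n2))

def mat_mul(m1, m2):
    """Product of two 2x2 quaternionic matrices (the order of factors matters)."""
    return [[qadd(qmul(m1[r][0], m2[0][c]), qmul(m1[r][1], m2[1][c]))
             for c in range(2)] for r in range(2)]

def cdet(m):
    """Cayley's 2x2 rule: a1*b2 - a2*b1 for [[a1, b1], [a2, b2]]."""
    return qadd(qmul(m[0][0], m[1][1]), qneg(qmul(m[1][0], m[0][1])))

def B12(b):
    return [[ONE, b], [ZERO, ONE]]

def B21(b):
    return [[ONE, ZERO], [b, ONE]]

a = (1, 2, 0, 0)   # 1 + 2i
b = (0, 0, 1, 1)   # j + k

equal_rows = cdet([[a, b], [a, b]])   # ab - ab
equal_cols = cdet([[a, a], [b, b]])   # ab - ba

# Assumed factorization: diag(a, a^-1) as a product of six matrices B_ij(b).
a_inv = qinv(a)
factors = [B12(a), B21(qneg(a_inv)), B12(a),
           B12(qneg(ONE)), B21(ONE), B12(qneg(ONE))]
product = reduce(mat_mul, factors)
```

With these sample values `equal_rows` is zero while `equal_cols` is not, and `product` equals `[[a, ZERO], [ZERO, a_inv]]`; exact rational arithmetic is used so that the comparison involves no rounding.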
LEMMA 2. Let $a \neq 0$ and let $d$ be a determinant. Then $\begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix}$ is a product of matrices of the form $B_{ij}(b)$, and

$$d\begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} = 1.$$

The next lemma is crucial.

LEMMA 3. Every $A \in GL(n, \mathbb{H})$ can be written in the form $A = D(x)B$, where

$$D(x) = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & x \end{pmatrix}$$

and $B \in SL(n, \mathbb{H})$.

Proof. Because $A$ is invertible, there must be at least one nonzero element in the first row, say $a_{1j} \neq 0$. By adding the $j$th column multiplied by $a_{1j}^{-1}(1 - a_{11})$ on the right to the first column, we get a matrix with $a_{11} = 1$. We can then make all the other entries in the first row equal to zero, and proceed by induction. □

The observant reader may now be wondering about the uniqueness of the $A = D(x)B$ decomposition. But it is more urgent to prove Theorem 1.

Proof of Theorem 1. Define $f : \mathbb{H} \to \mathbb{H}$ by $f(x) = d(D(x))$. It follows from Lemma 3 that $f(\mathbb{H}) = d(M(n, \mathbb{H}))$. For simplicity of notation we will assume that $n = 2$. We have

$$d\begin{pmatrix} x & 0 \\ 0 & 1 \end{pmatrix} = d\begin{pmatrix} 1 & 0 \\ 0 & x \end{pmatrix} = f(x)$$

by Axiom 2 and Lemma 2. But then

$$f(x)f(y) = d\left(\begin{pmatrix} x & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & y \end{pmatrix}\right) = d\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} = d\left(\begin{pmatrix} 1 & 0 \\ 0 & y \end{pmatrix}\begin{pmatrix} x & 0 \\ 0 & 1 \end{pmatrix}\right) = f(y)f(x),$$

and we see that $f(\mathbb{H}) = d(M(n, \mathbb{H}))$ is commutative. □

It is now time to ask how Cayley's definition fits into this. It clearly cannot satisfy all three axioms. In fact, it doesn't satisfy any of them! Consider the matrix

$$M = \begin{pmatrix} k & j \\ i & 1 \end{pmatrix}.$$

It is easy to prove that if

$$M\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$

then $x = y = 0$, so $M$ is invertible. But

$$M^t\begin{pmatrix} 1 \\ -j \end{pmatrix} = \begin{pmatrix} k & i \\ j & 1 \end{pmatrix}\begin{pmatrix} 1 \\ -j \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$

so $M^t$ is singular. But $\operatorname{Cdet} M = 0$ and $\operatorname{Cdet} M^t = 2k$, so we see that Axiom 1 fails. This also shows that the transpose is not a very useful concept in quaternionic linear algebra. The reason is that it is neither an automorphism nor an antiautomorphism! (But notice that Hermitian involution, $M^* = \bar{M}^t$, is an antiautomorphism, i.e., $(MN)^* = N^*M^*$.) For similar reasons, the concept of rank is also more complicated. The right column-rank is the same as the left row-rank, but they might be distinct from the left column-rank, which is equal to the right row-rank [12].

Noting that, for a suitable pair of $2 \times 2$ matrices $A$ and $B$, $\operatorname{Cdet}(AB) = 2 - 2k$ whereas $\operatorname{Cdet}(A)\operatorname{Cdet}(B) = 0$, we see that Axiom 2 also fails. As for Axiom 3, there is a $2 \times 2$ matrix $A$ with $\operatorname{Cdet}(A) = 0$ such that, after subtracting the second row multiplied by $b$ on the left from the first row, we get a matrix $A'$ with $\operatorname{Cdet}(A') = ab - ba$, which in general is nonzero. This clearly shows that Cdet is not the way to go. A more promising lead is before us, in Lemma 3. It will be followed up later.

Let me finish this section with a remark about Theorem 1. It is inspired by a related theorem proved by the physicist and mathematician Freeman J. Dyson in 1972 [21]. He used a different third axiom:

AXIOM 3'. Let $A = (a_{ij})$, $B = (b_{ij})$, and $C = (c_{ij})$. If for some row index $r$ we have

$$a_{ij} = b_{ij} = c_{ij} \quad (i \neq r) \qquad \text{and} \qquad a_{rj} + b_{rj} = c_{rj},$$

then $d(A) + d(B) = d(C)$.
In other words, $d$ should be additive in the rows. He then proved that if $d$ satisfies Axioms 1, 2, and 3', the image of $d$ is commutative. It is easy to see that Axioms 1, 2, and 3' imply Axiom 3. We just have to prove that $d(B_{ij}(b)) = 1$. Let $B'$ be the matrix obtained by replacing the $i$th entry along the diagonal in $B_{ij}(b)$ by a 0. Then $B'$ is singular, and it follows from Axiom 3' that $d(B_{ij}(b)) = 1$. So his definition of determinant is more restrictive than ours. But it is, in fact, too restrictive. Determinants satisfying his three axioms simply don't exist over the quaternions! Why? It follows from Axiom 2 that $d(I_n) = 1$. Define

$$D(x) = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & x \end{pmatrix}.$$

Since $I_n + D(-1) = 2D(0)$ is singular, it follows from Axioms 1 and 3' that $d(D(-1)) = -1$. Because $-1 = iji^{-1}j^{-1}$, we get $D(-1) = D(i)D(j)D(i)^{-1}D(j)^{-1}$, so $D(-1)$ is a commutator in $GL(n, \mathbb{H})$. But Axiom 2 and Theorem 1 then imply that $d(D(-1)) = 1$, which is a contradiction.

Study

Concerning quaternionic determinants, nothing much happened during the 75 years after Cayley. In the second (posthumous) edition of W. R. Hamilton's book Elements of Quaternions [24] from 1889, the editor added an appendix, which was just a restatement of Cayley's paper. Also, a paper by J. M. Peirce [38] from 1899 is just a laborious elaboration on the Cayley determinant. But in 1920 a very interesting paper by Eduard Study appeared [44]. (For more details, see also [16], [23], and [46].) His idea was to transform a quaternionic matrix into a complex $2n \times 2n$ matrix and then take the determinant.

I will start by discussing some important homomorphisms between quaternionic, complex, and real matrices. Recall that any complex $n \times n$ matrix can be written uniquely as $N = C + iD$, where $C$ and $D$ are real $n \times n$ matrices. We can then define an injective algebra homomorphism $\varphi : M(n, \mathbb{C}) \to M(2n, \mathbb{R})$ by

$$\varphi(C + iD) = \begin{pmatrix} C & -D \\ D & C \end{pmatrix}.$$

Let $R_i$ be right-multiplication by $i$ on $\mathbb{C}^n$. The corresponding matrix is $iI$, and $J = \varphi(iI) = \varphi(R_i)$ (I will sometimes identify a linear transformation and its standard matrix). This gives a complex structure on $\mathbb{R}^{2n}$, and we know that $P \in M(2n, \mathbb{R})$ corresponds to a complex linear transformation if and only if $P$ commutes with the complex structure. Hence,

$$\varphi(M(n, \mathbb{C})) = \{P \in M(2n, \mathbb{R}) \mid JP = PJ\}.$$

In a similar way, any quaternionic $n \times n$ matrix can be expressed uniquely in the form $M = A + jB$, where $A$ and $B$ are complex $n \times n$ matrices. (We write $j$ on the left since we work with right vector spaces.) We can, therefore, define $\psi : M(n, \mathbb{H}) \to M(2n, \mathbb{C})$ by

$$\psi(A + jB) = \begin{pmatrix} A & -\bar{B} \\ B & \bar{A} \end{pmatrix}.$$

It is straightforward to show that this map is an injective algebra homomorphism. [This implies in particular that there is no distinction between left- and right-inverses in $GL(n, \mathbb{H})$.] Let $R_j$ be right-multiplication by $j$ on $\mathbb{H}^n$. Notice that any $\mathbb{H}$-linear transformation commutes with $R_j$, but that $R_j$ is not $\mathbb{H}$-linear. Thus, there is no matrix associated to $R_j$, and it doesn't make sense to talk about $\psi(R_j)$, but we can still consider the corresponding map of $\mathbb{C}^{2n}$ given by $R_j(x, y) = (-\bar{y}, \bar{x})$. We see that $R_j$ corresponds to first multiplying by $J$ and then conjugating. This gives a quaternionic structure on $\mathbb{C}^{2n}$, and we know that $N \in M(2n, \mathbb{C})$ corresponds to a quaternionic linear transformation if and only if $N$ commutes with the quaternionic structure. Since $N\overline{Jv} = \overline{\bar{N}Jv}$, we have $N\overline{Jv} = \overline{JNv}$ if and only if $\bar{N}J = JN$, so

$$\psi(M(n, \mathbb{H})) = \{N \in M(2n, \mathbb{C}) \mid JN = \bar{N}J\}. \qquad (1)$$
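For $n = 1$ both the homomorphism property and relation (1) can be checked in a few lines. The sketch below assumes the block form $\psi(a + jb) = \begin{pmatrix} a & -\bar{b} \\ b & \bar{a} \end{pmatrix}$ and the product rule $(a_1 + jb_1)(a_2 + jb_2) = (a_1a_2 - \bar{b}_1b_2) + j(\bar{a}_1b_2 + b_1a_2)$, which follows from $zj = j\bar{z}$ for complex $z$; the function names are hypothetical.

```python
# A quaternion a + j*b is stored as a pair (a, b) of Python complex numbers.
# (Python's literal 1j is the complex unit i, not the quaternion j.)

def qmul(p, q):
    """(a1 + j*b1)(a2 + j*b2), using z*j = j*conj(z) for complex z."""
    a1, b1 = p
    a2, b2 = q
    return (a1 * a2 - b1.conjugate() * b2,
            a1.conjugate() * b2 + b1 * a2)

def psi(q):
    """Assumed 2x2 complex matrix representing the quaternion a + j*b."""
    a, b = q
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def mat_mul(x, y):
    return [[sum(x[r][t] * y[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]

def conj(m):
    return [[z.conjugate() for z in row] for row in m]

J = [[0j, -1 + 0j], [1 + 0j, 0j]]

# Two sample quaternions with small integer coordinates (exact in floating point).
p = (1 + 2j, 3 - 1j)
q = (-2 + 1j, 1 + 4j)

is_homomorphism = psi(qmul(p, q)) == mat_mul(psi(p), psi(q))
satisfies_relation_1 = mat_mul(J, psi(p)) == mat_mul(conj(psi(p)), J)
```

Both flags come out true for these sample values; integer coordinates are used so that the comparisons are exact.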
Notice that (1) is simply a generalization of the formula $jz = \bar{z}j$ for $z \in \mathbb{C}$. It follows immediately from (1) that $\det_{\mathbb{C}} \psi(M) \in \mathbb{R}$, but we will soon see that, in fact, we have $\det_{\mathbb{C}} \psi(M) \geq 0$. (I will sometimes write $\det_{\mathbb{R}}$ or $\det_{\mathbb{C}}$ to stress that I'm taking the determinant of a real or complex matrix.)

By applying the homomorphism $\varphi_1 : \mathbb{C} = M(1, \mathbb{C}) \to M(2, \mathbb{R})$ to each element of $N \in M(n, \mathbb{C})$, we get a map $\tilde{\varphi} : M(n, \mathbb{C}) \to M(2n, \mathbb{R})$. [$\varphi(N)$ consists of four $n$-blocks, whereas $\tilde{\varphi}(N)$ consists of $n^2$ 2-blocks.] The important thing here is that the 2-blocks in $\tilde{\varphi}(N)$ are easier to manage than the $n$-blocks in $\varphi(N)$. Since $\mathbb{C}$ is commutative and $\varphi_1$ is a homomorphism, the 2-blocks in $\tilde{\varphi}(N)$ commute. This allows us to use the following folklore theorem. [It has been rediscovered numerous times, but to the best of my knowledge it is originally due to M. H. Ingraham [26].]

THEOREM 4. If $A = (A_{ij})$ is a square block matrix, where the $A_{ij}$ are mutually commutative $m \times m$ matrices, and $B$ is the $m \times m$ matrix obtained by taking the determinant of $A$ with the $A_{ij}$ as elements, then $\det A = \det B$.

For example, if $A_{11}$, $A_{12}$, $A_{21}$, and $A_{22}$ are mutually commutative, then

$$\det\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \det(A_{11}A_{22} - A_{12}A_{21}). \qquad (2)$$

In other words, you evaluate by "taking the determinant twice." By shuffling some rows and columns, we see that $\det_{\mathbb{R}} \varphi(N) = \det_{\mathbb{R}} \tilde{\varphi}(N)$, and we can now apply Theorem 4 to get [6]

$$\det\nolimits_{\mathbb{R}} \varphi(N) = \det\nolimits_{\mathbb{R}} \tilde{\varphi}(N) = \det\nolimits_{\mathbb{R}}(\varphi_1(\det\nolimits_{\mathbb{C}} N)) = \det\nolimits_{\mathbb{R}}\begin{pmatrix} \operatorname{Re} \det_{\mathbb{C}} N & -\operatorname{Im} \det_{\mathbb{C}} N \\ \operatorname{Im} \det_{\mathbb{C}} N & \operatorname{Re} \det_{\mathbb{C}} N \end{pmatrix} = |\det\nolimits_{\mathbb{C}} N|^2$$

for $N \in M(n, \mathbb{C})$. This discussion leads to the following important theorem.

THEOREM 5. For any complex matrix $N$, we have

$$\det\nolimits_{\mathbb{R}} \varphi(N) = |\det\nolimits_{\mathbb{C}} N|^2 \geq 0. \qquad (3)$$

For any quaternionic matrix $M$, we have

$$\det\nolimits_{\mathbb{C}} \psi(M) = \sqrt{\det\nolimits_{\mathbb{R}} \varphi(\psi(M))} \geq 0. \qquad (4)$$

Proof. The first part follows from (2). It follows from (1) that $\det_{\mathbb{C}} \psi(M) \in \mathbb{R}$, and since $\det_{\mathbb{C}} \psi(GL(n, \mathbb{H}))$ is a connected subset of $\mathbb{R} \setminus \{0\}$ that contains 1, we get that $\det_{\mathbb{C}} \psi(M) \geq 0$ for quaternionic matrices. We then deduce (4) from (3). □

We are finally ready to define the Study determinant Sdet by

$$\operatorname{Sdet} M = \det\nolimits_{\mathbb{C}} \psi(M).$$

The obvious question is now which axioms the Study determinant satisfies. The Study determinant satisfies Axiom 2 because $\psi$ is a homomorphism. Let us show that Axiom 1 holds. (Notice that the proof of this statement is wrong in both editions of the otherwise excellent book by Morton L. Curtis [16].) We know that if $\operatorname{Sdet} M = \det_{\mathbb{C}} \psi(M) \neq 0$, then $\psi(M)$ is invertible in $M(2n, \mathbb{C})$, but we need to know that the inverse actually lies in $\psi(M(n, \mathbb{H}))$. By conjugating and inverting the formula $J\psi(M) = \overline{\psi(M)}J$, we see that $J\psi(M)^{-1} = \overline{\psi(M)^{-1}}J$. But then it follows from (1) that $\psi(M)^{-1}$ lies in $\psi(M(n, \mathbb{H}))$.

To show that Axiom 3 holds, it suffices to prove that $\operatorname{Sdet} B_{ij}(b) = 1$. If $b = b_1 + jb_2$, then

$$\psi(B_{ij}(b)) = \begin{pmatrix} I_n + b_1 e_{ij} & -\bar{b}_2 e_{ij} \\ b_2 e_{ij} & I_n + \bar{b}_1 e_{ij} \end{pmatrix}.$$

But $e_{ij}e_{ij} = 0$, so we can apply Theorem 4 to get $\det(\psi(B_{ij}(b))) = \det(I_n) = 1$. Thus, the Study determinant satisfies all our axioms, and it is used frequently in differential geometry and Lie theory [23]. Bear in mind that it is a quadratic function of the entries, not multilinear in the rows and the columns like the usual determinant.

Let me finish this section with a couple of additional comments. The Study determinant was defined above by identifying $\mathbb{H}$ with $\mathbb{C}^2$. What would happen if we instead identified $\mathbb{H}$ with $\mathbb{R}^4$? After all, the center of $\mathbb{H}$ is $\mathbb{R}$, not $\mathbb{C}$, so the quaternions form an $\mathbb{R}$-algebra. We can write $M \in M(n, \mathbb{H})$ uniquely as $M = A_0 + iA_1 + jA_2 + kA_3$, where $A_0$, $A_1$, $A_2$, and $A_3$ are real $n \times n$ matrices, and apply the homomorphism $\mu : M(n, \mathbb{H}) \to M(4n, \mathbb{R})$ given by

$$\mu(A_0 + iA_1 + jA_2 + kA_3) = \begin{pmatrix} A_0 & -A_1 & -A_2 & -A_3 \\ A_1 & A_0 & -A_3 & A_2 \\ A_2 & A_3 & A_0 & -A_1 \\ A_3 & -A_2 & A_1 & A_0 \end{pmatrix}.$$

Notice that

$$\varphi(\psi(A_0 + iA_1 + jA_2 + kA_3)) = \begin{pmatrix} A_0 & -A_2 & -A_1 & A_3 \\ A_2 & A_0 & A_3 & A_1 \\ A_1 & -A_3 & A_0 & -A_2 \\ -A_3 & -A_1 & A_2 & A_0 \end{pmatrix} \neq \mu(A_0 + iA_1 + jA_2 + kA_3),$$

but it is easy to see that by shuffling some rows, columns, and signs, we get (see also [4] and [30])

$$\det\nolimits_{\mathbb{R}} \mu(M) = \det\nolimits_{\mathbb{R}} \varphi(\psi(M)) = \operatorname{Sdet}(M)^2.$$

I also note that in general

$$\psi(M^t) = \psi(A^t + jB^t) = \begin{pmatrix} A^t & -\bar{B}^t \\ B^t & \bar{A}^t \end{pmatrix} \neq \begin{pmatrix} A^t & B^t \\ -\bar{B}^t & \bar{A}^t \end{pmatrix} = \psi(M)^t,$$

but

$$\psi(M^*) = \psi(\bar{A}^t + \overline{jB}^t) = \psi(\bar{A}^t - \bar{B}^t j) = \psi(\bar{A}^t - jB^t) = \begin{pmatrix} \bar{A}^t & \bar{B}^t \\ -B^t & A^t \end{pmatrix} = \psi(M)^*.$$
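The Study determinant is straightforward to compute. The Python sketch below is an illustration only: it assumes the block form $\psi(A + jB) = \begin{pmatrix} A & -\bar{B} \\ B & \bar{A} \end{pmatrix}$, stores each quaternion $a + jb$ as a pair of complex numbers, and uses a small hand-written elimination routine (`det`, a hypothetical helper) instead of a library. The sample matrix `M` with rows $(k, j)$ and $(i, 1)$ is chosen for illustration; it is invertible, whereas its transpose is singular.

```python
# Each quaternion a + j*b is a pair (a, b) of complex numbers.
ONE, ZERO = (1 + 0j, 0j), (0j, 0j)
QI, QJ, QK = (1j, 0j), (0j, 1 + 0j), (0j, -1j)   # i, j, and k = j*(-i)

def psi(m):
    """Complex 2n x 2n matrix [[A, -conj(B)], [B, conj(A)]] for M = A + j*B."""
    n = len(m)
    top = [[m[r][c][0] for c in range(n)] +
           [-m[r][c][1].conjugate() for c in range(n)] for r in range(n)]
    bottom = [[m[r][c][1] for c in range(n)] +
              [m[r][c][0].conjugate() for c in range(n)] for r in range(n)]
    return top + bottom

def det(mat):
    """Determinant of a complex matrix by elimination with partial pivoting."""
    a = [row[:] for row in mat]
    n = len(a)
    d = 1 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0j
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for t in range(c, n):
                a[r][t] -= f * a[c][t]
    return d

def sdet(m):
    """Study determinant; the real part is taken because the value is real."""
    return det(psi(m)).real

M = [[QK, QJ], [QI, ONE]]
Mt = [[QK, QI], [QJ, ONE]]               # transpose of M
B = [[ONE, (1 + 2j, 3 - 1j)], [ZERO, ONE]]   # B_12(b) for a sample b
```

For these inputs `sdet(M)` is positive, `sdet(Mt)` vanishes, and `sdet(B)` is 1 up to rounding, in line with Axioms 1 and 3.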
Since $\psi(M^*) = \psi(M)^*$, we get $\operatorname{Sdet} M^* = \overline{\operatorname{Sdet} M} = \operatorname{Sdet} M$; but in general $\operatorname{Sdet} M^t \neq \operatorname{Sdet} M$, for, as we saw earlier, $M$ can be invertible while $M^t$ is singular.

Dieudonné

Study was not the only one studying quaternionic determinants in his time. In the next 10 years, A. Heyting, E. H. Moore, Ø. Ore, and A. R. Richardson all wrote about this topic [25, 34, 36, 42, 43]. The paper by Øystein Ore [36] is important because it introduces the concept of the ring of fractions for a noncommutative ring. But from the point of view of determinants, the most interesting are the papers by A. R. Richardson [42, 43] (this is the Richardson in the Littlewood-Richardson rule, but Littlewood is not the one in Hardy-Littlewood). His main contribution was to make it apparent that commutators play a key role. His papers are filled with formulas involving commutators.

Let us go back to studying $SL(n, \mathbb{H})$ and take a closer look at Lemma 3. It is easy to see that $SL(n, \mathbb{H})$ is a normal subgroup of $GL(n, \mathbb{H})$, and it can be shown [1, 15, 17, 40] that $SL(n, \mathbb{H})$ is the commutator subgroup of $GL(n, \mathbb{H})$.

LEMMA 6. $SL(n, \mathbb{H}) = [GL(n, \mathbb{H}), GL(n, \mathbb{H})]$.

Let me mention in passing that for any field $k$, the commutator subgroup of $GL(n, k)$ is $SL(n, k)$, except when $n = 2$ and $k$ is $\mathbb{Z}_2$ or $\mathbb{Z}_3$ [15]. The main reason why Lemma 3 is so crucial is that it shows that we only need to define our determinant on the matrices $D(x)$. But you may be impatient for me to get back to the issue of uniqueness. Since $SL(n, \mathbb{H})$ is normal in $GL(n, \mathbb{H})$, the question becomes: For which $x \in \mathbb{H}$ does $D(x)$ lie in $SL(n, \mathbb{H})$? The answer is given by the following lemma.

LEMMA 7. $D(x) = \operatorname{diag}(1, \ldots, 1, x)$ is a commutator in $GL(n, \mathbb{H})$ [i.e., it lies in $SL(n, \mathbb{H})$] if and only if $x$ is a commutator in $\mathbb{H}^*$.

Proof. One direction is trivial:

$$\begin{pmatrix} 1 & 0 \\ 0 & aba^{-1}b^{-1} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & a \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & b \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & a^{-1} \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & b^{-1} \end{pmatrix}.$$

The other direction, however, is not so easy. It is essentially equivalent to showing that the Dieudonné determinant is well defined, so it is an easy consequence of results in [1], [17], and [40], and I refer the reader to those excellent sources for the details. □

It follows that in the decomposition $A = D(x)B$, neither $x$ nor $B$ is unique, but that the coset $x[\mathbb{H}^*, \mathbb{H}^*] \in \mathbb{H}^*/[\mathbb{H}^*, \mathbb{H}^*]$ is unique. This is exactly what Jean Dieudonné used in his 1943 paper [17]. His goal was to show how the determinant could be expressed in terms of group theory. We would expect

$$\det\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} = \det\begin{pmatrix} b & 0 \\ 0 & a \end{pmatrix},$$

but then we probably need the determinant to take values in a commutative ring, and we get that by considering $\mathbb{H}^*/[\mathbb{H}^*, \mathbb{H}^*]$. His main theorem states that for any skew field $K$, there is an isomorphism

$$GL(n, K)/[GL(n, K), GL(n, K)] \cong K^*/[K^*, K^*].$$

For $K = \mathbb{H}$, this is immediate from Lemmas 3 and 7. We therefore define the Dieudonné determinant by

$$\det A = \det(D(x)B) = x[\mathbb{H}^*, \mathbb{H}^*].$$

Thanks to Lemma 7, we see that this is well defined and that the kernel is precisely $SL(n, \mathbb{H})$, i.e., our definition of $SL(n, \mathbb{H})$ agrees with the usual one, once we have the determinant. If we now extend to $M(n, \mathbb{H})$, we get a determinant that takes values in $\mathbb{H}^*/[\mathbb{H}^*, \mathbb{H}^*] \cup \{0\}$. But what does this set look like? We need the following lemma.

LEMMA 8. $[\mathbb{H}^*, \mathbb{H}^*]$ is isomorphic to the set of quaternions of length 1.

Proof. It is clear that every commutator has length 1. The set of quaternions of length 1 can be identified with $S^3$, and $\psi(S^3) = SU(2)$. But every element of $SU(2)$ is conjugate to a diagonal element, so it follows that every element in $S^3$ is conjugate to an element of $S^1$, the unit circle of $\mathbb{C} \subset \mathbb{H}$. (This also follows from the Noether-Skolem Theorem.) So, given $z \in S^3$, we can write $z = xyx^{-1}$ with $y \in S^1$. We can identify the pure quaternions with $\mathbb{R}^3$, and for $p, q \in \mathbb{R}^3$ we have $p^{-1} = \bar{p}/|p|^2 = -p/|p|^2$ and

$$pq = -\langle p, q \rangle + p \times q,$$

where $\langle\,,\rangle$ is the usual inner product on $\mathbb{R}^3$ and $\times$ is the vector product in $\mathbb{R}^3$. From this, we can easily deduce that every quaternion can be written as the product of two pure quaternions. Since $y$ is complex, we can find $w \in \mathbb{C}$ with $y = w^2$, and it follows from the above that we can write $w = pq$, where $p, q \in \mathbb{R}^3$. Since $|w| = |y| = 1$, we can also assume that $|p| = |q| = 1$, so $p^{-1} = -p$ and $q^{-1} = -q$. But then

$$z = xpqpqx^{-1} = xpq(-p)(-q)x^{-1} = xpqp^{-1}q^{-1}x^{-1} = (xpx^{-1})(xqx^{-1})(xpx^{-1})^{-1}(xqx^{-1})^{-1}.$$ □

For other proofs, see [9], [17], and [50]. It follows that $\mathbb{H}^*/[\mathbb{H}^*, \mathbb{H}^*]$ is isomorphic to the positive real numbers. Define $\omega : \mathbb{H}^*/[\mathbb{H}^*, \mathbb{H}^*] \to \mathbb{R}^+$ by $\omega(x[\mathbb{H}^*, \mathbb{H}^*]) = |x|$, and define the normalized Dieudonné determinant by $\operatorname{Ddet}(M) = \omega(\det(M))$. Dieudonné showed [17] that any determinant function satisfying our three axioms will be of the form

$$d(M) = \operatorname{Ddet}^r(M) \qquad (5)$$

for some $r \in \mathbb{R}$. In particular, we can easily check the following theorem.

THEOREM 9.

$$\operatorname{Sdet} M = \det\nolimits_{\mathbb{C}}(\psi(M)) = \operatorname{Ddet}^2(M), \qquad (6)$$

$$\det\nolimits_{\mathbb{R}} \mu(M) = \det\nolimits_{\mathbb{R}} \varphi(\psi(M)) = \operatorname{Ddet}^4(M). \qquad (7)$$

Let me also point out that it follows from (6) that the Study determinant corresponds to the reduced norm [15]. Equation (5) has been generalized by L. E. Zagorin [52]. If $\tau$ is a homomorphism of $\mathbb{H}$ into $M(s, \mathbb{C})$, and $T$ is the corresponding homomorphism of $M(n, \mathbb{H})$ into $M(ns, \mathbb{C})$, then $\det_{\mathbb{C}} T(M) = \operatorname{Ddet}^s(M)$.

In addition to satisfying our three axioms, the Dieudonné determinant has several other properties [1, 17, 40]. Interchanging rows $i$ and $j$ corresponds to left-multiplying by the matrix $P_{ij} = B_{ij}(1)B_{ji}(-1)B_{ij}(1)$. But $-1 \in [\mathbb{H}^*, \mathbb{H}^*]$, so $\det P_{ij} = 1[\mathbb{H}^*, \mathbb{H}^*]$: interchanging two rows doesn't change the determinant. When $n = 2$,

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \det\begin{pmatrix} a & b \\ 0 & d - ca^{-1}b \end{pmatrix} = (ad - aca^{-1}b)[\mathbb{H}^*, \mathbb{H}^*]$$

if $a \neq 0$, and

$$\det\begin{pmatrix} 0 & b \\ c & d \end{pmatrix} = \det\begin{pmatrix} c & d \\ 0 & b \end{pmatrix} = cb[\mathbb{H}^*, \mathbb{H}^*] = -bc[\mathbb{H}^*, \mathbb{H}^*].$$

We can also show that multiplying a row on the left by $m$ or multiplying a column on the right by $m$ multiplies the determinant by $m[\mathbb{H}^*, \mathbb{H}^*]$. (This last product can be either on the left or on the right, since $\mathbb{H}^*/[\mathbb{H}^*, \mathbb{H}^*]$ is commutative.) On the other hand,

$$\det\begin{pmatrix} 1 & a \\ b & ab \end{pmatrix} = (ab - ba)[\mathbb{H}^*, \mathbb{H}^*],$$

but

$$\det\begin{pmatrix} 1 & a \\ 1 & a \end{pmatrix} = 0,$$

and we see that we cannot factor out a right multiple of a row.

Moreover, it doesn't behave well with respect to addition. Consider the determinant as a function of the first row, keeping the other rows fixed. Denote this function by $m(v)$. Define addition in $\mathbb{H}^*/[\mathbb{H}^*, \mathbb{H}^*]$ by setting

$$a[\mathbb{H}^*, \mathbb{H}^*] + b[\mathbb{H}^*, \mathbb{H}^*] = \{ak_1 + bk_2 \mid k_1, k_2 \in [\mathbb{H}^*, \mathbb{H}^*]\}.$$

It can then be shown [1] that

$$m(v_1 + v_2) \subset m(v_1) + m(v_2).$$

If we use Ddet instead of det and denote the corresponding function by $M(v)$, we get a sort of triangle inequality:

$$M(v_1 + v_2) \leq M(v_1) + M(v_2).$$
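Several quaternion identities used in this article are easy to verify with exact arithmetic: $-1 = iji^{-1}j^{-1}$, the product rule $pq = -\langle p, q \rangle + p \times q$ for pure quaternions, the fact that commutators have length 1, and the identity $(pq)^2 = pqp^{-1}q^{-1}$ for pure quaternions of length 1. The following Python sketch does this for sample values; the tuple representation and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

# Quaternions are 4-tuples (w, x, y, z) standing for w + x*i + y*j + z*k.
I, J = (0, 1, 0, 0), (0, 0, 1, 0)

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def norm2(p):
    return sum(x * x for x in p)

def qinv(p):
    n2 = norm2(p)  # assumed nonzero
    w, x, y, z = p
    return (Fraction(w, n2), Fraction(-x, n2), Fraction(-y, n2), Fraction(-z, n2))

def commutator(a, b):
    """a * b * a^-1 * b^-1"""
    return qmul(qmul(qmul(a, b), qinv(a)), qinv(b))

# Pure quaternions: the real part of the product is minus the inner product,
# and the vector part is the cross product.
p, q = (0, 1, 2, 3), (0, -1, 0, 2)
pq = qmul(p, q)

# Pure quaternions of length 1 with rational coordinates.
u = (0, Fraction(3, 5), Fraction(4, 5), 0)
v = (0, 0, Fraction(5, 13), Fraction(12, 13))
w = qmul(u, v)
```

Here `commutator(I, J)` is $-1$, `pq` is $-5 + 4i - 5j + 2k$, and `qmul(w, w)` agrees with `commutator(u, v)`.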
Moore

We started out by showing what was wrong with the Cayley determinant. But sometimes it does work. Granted that his formula doesn't make sense in general, does it still make sense for certain matrices? The answer is that if we restrict to Hermitian quaternionic matrices ($M^* = M$), then we get a useful function by specifying a certain ordering of the factors in the $n!$ terms in the sum. This was first studied by Eliakim Hastings Moore (for biographical information about Moore, see [37]), and I will denote it by Mdet. Let $\sigma$ be a permutation of $n$. Write it as a product of disjoint cycles. Permute each cycle cyclically until the smallest number in the cycle is in front. Then sort the cycles in decreasing order according to the first number of each cycle. In other words, write

$$\sigma = (n_{11} \cdots n_{1l_1})(n_{21} \cdots n_{2l_2}) \cdots (n_{r1} \cdots n_{rl_r}),$$

where for each $i$, we have $n_{i1} < n_{ij}$ for all $j > 1$, and $n_{11} > n_{21} > \cdots > n_{r1}$. Then we define

$$\operatorname{Mdet} M = \sum_{\sigma \in S_n} |\sigma|\, m_{n_{11}n_{12}} \cdots m_{n_{1l_1}n_{11}}\, m_{n_{21}n_{22}} \cdots m_{n_{rl_r}n_{r1}}.$$
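The ordering rule translates directly into code. The following Python sketch is a minimal illustration, assuming that $|\sigma|$ denotes the sign of $\sigma$ and that a cycle $(n_1 \cdots n_l)$ means $\sigma(n_1) = n_2$, and so on; indices are 0-based, quaternions are 4-tuples, and the function names are hypothetical. It sums over all $n!$ permutations, so it is only practical for small $n$.

```python
from itertools import permutations

# Quaternions are 4-tuples (w, x, y, z) standing for w + x*i + y*j + z*k.
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def moore_cycles(perm):
    """Disjoint cycles of perm (perm[i] is the image of i), each starting at
    its smallest element, sorted by decreasing first element."""
    seen, cycles = set(), []
    for start in range(len(perm)):   # increasing start: smallest element first
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = perm[i]
        cycles.append(cyc)
    cycles.sort(key=lambda c: c[0], reverse=True)
    return cycles

def mdet(m):
    """Moore determinant of a square quaternionic matrix given as nested lists."""
    total = (0, 0, 0, 0)
    for perm in permutations(range(len(m))):
        sign, term = 1, (1, 0, 0, 0)
        for cyc in moore_cycles(perm):
            if len(cyc) % 2 == 0:    # a cycle of even length is an odd permutation
                sign = -sign
            for pos, row in enumerate(cyc):
                col = cyc[(pos + 1) % len(cyc)]
                term = qmul(term, m[row][col])
        total = tuple(t + sign * x for t, x in zip(total, term))
    return total

def real(x):
    return (x, 0, 0, 0)

q = (1, 1, 1, 1)
H2 = [[real(2), q], [qconj(q), real(3)]]   # Hermitian 2x2 example

X, Y, Z = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
H3 = [[real(3), X, Y],
      [qconj(X), real(3), Z],
      [qconj(Y), qconj(Z), real(3)]]       # Hermitian 3x3 example
```

For the two Hermitian examples the result has vanishing imaginary parts: `mdet(H2)` is $2 \cdot 3 - |q|^2 = 2$ and `mdet(H3)` is 16.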
If $H$ is Hermitian, then $\operatorname{Mdet} H$ is a real number. I will not go into details, but refer to the work of Moore, Jacobson, Dyson, Mehta, Chen, Van Praag, and Piccinni [5, 11, 12, 20, 21, 27, 28, 32, 33, 34, 39, 49, 50]. But I would again like to make some comments.

In general, it is difficult to talk about eigenvalues of a quaternionic matrix [13, 29]. As we work with right vector spaces, we must consider right eigenvalues. If $Mx = x\lambda$, then for $q \neq 0$, we get

$$M(xq) = x\lambda q = (xq)q^{-1}\lambda q.$$

Hence, all the conjugates of $\lambda$ are also eigenvalues. Let us study the conjugacy classes more closely. For $q \in \mathbb{H}^*$, we define $\rho(q)$ by $\rho(q)(x) = qxq^{-1}$. Since $\rho(q)$ leaves the real axis invariant and is orthogonal, we can restrict to $\mathbb{R}^3$. It is easy to see [18] that if we write $q = q_0 + q'$ with $q_0 \in \mathbb{R}$ and $q' \in \mathbb{R}^3$, then $\rho(q)$ represents the rotation of $\mathbb{R}^3$ with the axis $q'$ and angle $2\arctan(|q'|/q_0)$. From this we get that if $x$ is real, then the conjugacy class of $x$ is just $\{x\}$, whereas for $x \in S^3 \setminus \{\pm 1\}$, we get a copy of $S^2$ containing $x$ and orthogonal to the real axis. Suppose that $\lambda = \lambda_0 + \lambda'$ with $\lambda_0 \in \mathbb{R}$ and $\lambda' \in \mathbb{R}^3$. Then $q\lambda q^{-1} = \lambda_0 + q\lambda' q^{-1}$, and the conjugacy class of $\lambda'$ intersects the $i$ axis at $\pm|\lambda'|i$. It follows that the conjugacy class of a non-real eigenvalue contains exactly two complex numbers and that they are conjugate. If $p$ is complex and $v = u + jw$, then $Mv = vp$ if and only if $\psi(M)(u\ w)^t = (u\ w)^t p$, and it can be proved by induction [29] that the eigenvalues of $\psi(M)$ occur in conjugate pairs. It follows that the eigenvalues of $\psi(M)$ are precisely the $2n$ numbers $\lambda_1, \ldots, \lambda_n$ and $\bar{\lambda}_1, \ldots, \bar{\lambda}_n$, whereas the eigenvalues of $M$ are the elements of the conjugacy classes of $\lambda_1, \ldots, \lambda_n$, where we can replace $\lambda_i$ by $\bar{\lambda}_i$. It is now easy to show [29] that $M$ is symplectically similar to a triangular matrix with diagonal elements $d_i$, where $d_i$ equals $\lambda_i$ or $\bar{\lambda}_i$. For more about normal forms of quaternionic matrices, see [27], [29], [41], [45], and [51].

If we restrict to a Hermitian matrix, $H$, then it turns out that all its eigenvalues are real (and there are, therefore, precisely $n$ of them, since each conjugacy class only contains one element) and that the matrix can be symplectically diagonalized; that is, we can find $P \in GL(n, \mathbb{H})$ such that

$$PH\bar{P}^t = D,$$

where $\bar{P}^t = P^{-1}$ and $D$ is diagonal and real. We can now prove the following theorem that relates the Moore determinant to the other determinants.

THEOREM 10. Let $H$ be a Hermitian quaternionic matrix. Then

$$|\operatorname{Mdet} H| = \operatorname{Ddet} H \quad \text{and} \quad \operatorname{Mdet} H\,[\mathbb{H}^*, \mathbb{H}^*] = \det H. \qquad (8)$$

For any quaternionic matrix $M$, we have

$$\operatorname{Sdet} M = \operatorname{Mdet}(MM^*). \qquad (9)$$

Proof. It can be shown that for a Hermitian matrix, the Moore determinant is equal to the product of the eigenvalues, so $\operatorname{Mdet} H$ is real-valued. But the normalized Dieudonné determinant of a diagonal matrix is the norm of the product of the diagonal elements, so (8) follows. To prove (9), we just have to observe that the eigenvalues of $MM^*$ are nonnegative, and use the product rule and (6). □

Finally, if $H$ is Hermitian, then

$$(J\psi(H))^t = -\psi(H)^t J = -J\overline{\psi(H)}^t = -J\psi(H),$$

so $J\psi(H)$ is skew-symmetric, and we can take its Pfaffian [14]. But

$$\operatorname{pf}(-J\psi(H))^2 = \det\nolimits_{\mathbb{C}}(-J\psi(H)) = \operatorname{Ddet}^2 H = \operatorname{Mdet}^2 H,$$

so

$$\operatorname{Mdet}(H) = \operatorname{pf}(-J\psi(H)).$$

For other applications of the Pfaffian, see [2] and [3].

SP(n)

I would like to finish with a simple application of these ideas. As mentioned in the introduction, the group $SP(n)$ can be defined as the group preserving the norm on $\mathbb{H}^n$. But the usual description of this group is by considering its image under $\psi$ in $M(2n, \mathbb{C})$. It is easy to see that all such matrices have determinant $\pm 1$. There are different ways of proving that in fact the determinant is equal to 1, but this also follows from the results above, since all matrices in $\psi(GL(n, \mathbb{H}))$ have positive determinants.

In conclusion, I would also like to mention the recent work of Gelfand and Retakh [22]. Unfortunately, it is beyond the scope of this article to report on it.

Acknowledgments

The author would like to thank Jon Berrick, P. M. Cohn, Soo Teck Lee, and the referee for help in improving this article.
References

1. E. Artin, Geometric Algebra, New York: Interscience, 1957; reprinted by Wiley, New York, 1988.
2. H. Aslaksen, SO(2) invariants of a set of 2 × 2 matrices, Math. Scand. 65 (1989), 59-66.
3. H. Aslaksen, E.-C. Tan, and C. Zhu, Invariant theory of special orthogonal groups, Pacific J. Math. (in press).
4. A. Bagazgoitia, A determinantal identity for quaternions, in Proceedings of 1983 Conference on Algebra Lineal y Aplicaciones, Vitoria-Gasteiz, Spain, 1984, pp. 127-132.
5. R. W. Barnard and E. Hastings Moore, General analysis. Part 1, Memoirs of the American Philosophical Society, 1935.
6. J. Brenner, Expanded matrices from matrices with complex elements, SIAM Rev. 3 (1961), 165-166.
7. J. Brenner, Applications of the Dieudonné determinant, Linear Algebra Appl. 1 (1968), 511-536.
8. J. Brenner, Corrections to "Applications of the Dieudonné determinant," Linear Algebra Appl. 13 (1976), 289.
9. J. Brenner and J. De Pillis, Generalized elementary symmetric functions and quaternion matrices, Linear Algebra Appl. 4 (1971), 55-69.
10. A. Cayley, On certain results relating to quaternions, Philos. Mag. 26 (1845), 141-145; reprinted in The Collected Mathematical Papers Vol. 1, Cambridge: Cambridge University Press, 1989, pp. 123-126.
11. L. Chen, Definition of determinant and Cramer solution over the quaternion field, Acta Math. Sinica (N.S.) 7 (1991), 171-180.
12. L. Chen, Inverse matrix and properties of double determinant over quaternion field, Sci. China Ser. A 34 (1991), 528-540.
13. P. M. Cohn, The similarity reduction of matrices over a skew field, Math. Z. 132 (1973), 151-163.
14. P. M. Cohn, Algebra, vol. 1, 2nd ed., New York: Wiley, 1991.
15. P. M. Cohn, Algebra, vol. 3, 2nd ed., New York: Wiley, 1991.
16. M. L. Curtis, Matrix Groups, New York: Springer-Verlag, 1979; 1984.
17. J. Dieudonné, Les déterminants sur un corps non-commutatif, Bull. Soc. Math. France 71 (1943), 27-45.
18. J. Dieudonné, Special Functions and Linear Representations of Lie Groups, CBMS 42, Providence, RI: American Mathematical Society, 1980.
19. R. Dimitrić and B. Goldsmith, Sir William Rowan Hamilton, Math. Intelligencer 11 (1989), no. 2, 29-30.
20. F. J. Dyson, Correlations between eigenvalues of a random matrix, Commun. Math. Phys. 19 (1970), 235-250.
21. F. J. Dyson, Quaternion determinants, Helv. Phys. Acta 45 (1972), 289-302.
22. I. M. Gelfand and V. S. Retakh, Determinants of matrices over noncommutative rings, Functional Anal. Appl. 25 (1991), 91-102.
23. F. Reese Harvey, Spinors and Calibrations, New York: Academic Press, 1990.
24. W. R. Hamilton, Elements of Quaternions, 2nd ed., London: Longman, 1889.
25. A. Heyting, Die Theorie der linearen Gleichungen in einer Zahlenspezies mit nichtkommutativer Multiplikation, Math. Ann. 98 (1927), 465-490.
26. M. H. Ingraham, A note on determinants, Bull. Amer. Math. Soc. 43 (1937), 579-580.
27. N. Jacobson, Normal semi-linear transformations, Amer. J. Math. 61 (1939), 45-58.
28. N. Jacobson, An application of E. H. Moore's determinant of a Hermitian matrix, Bull. Amer. Math. Soc. 45 (1939), 745-748.
29. H. C. Lee, Eigenvalues and canonical forms of matrices with quaternionic entries, Proc. Roy. Irish Acad. Sect. A 52 (1949), 253-260.
30. D. W. Lewis, A determinantal identity for skewfields, Linear Algebra Appl. 7 (1985), 213-217.
31. K. O. May, The impossibility of a division algebra of vectors in three dimensional space, Amer. Math. Monthly 73 (1966), 289-291.
32. Madan Lal Mehta, Determinants of quaternion matrices, J. Math. Phys. Sci. 8 (1974), 559-570.
33. Madan Lal Mehta, Elements of Matrix Theory, Delhi: Hindustan Pub. Corp., 1977.
34. E. H. Moore, On the determinant of an hermitian matrix of quaternionic elements, Bull. Amer. Math. Soc. 28 (1922), 161-162.
35. Thomas Muir, The Theory of Determinants, Vol. 2, London: MacMillan, 1911.
36. Ø. Ore, Linear equations in non-commutative fields, Ann. Math. 32 (1931), 463-477.
37. K. Hunger Parshall and D. E. Rowe, The Emergence of the American Mathematical Research Community, 1876-1900: J. J. Sylvester, Felix Klein and E. H. Moore, Providence, RI: American Mathematical Society, 1994.
38. J. M. Peirce, Determinants of quaternions, Bull. Amer. Math. Soc. 5 (1899), 335-337.
39. P. Piccinni, Dieudonné determinant and invariant real polynomials on gl(n, H), Rend. Mat. (7) 2 (1982), 31-45.
40. R. S. Pierce, Associative Algebras, New York: Springer-Verlag, 1982.
41. J. Radon, Lineare Scharen orthogonaler Matrizen, Abh. Math. Sem. Univ. Hamburg 1 (1922), 2-14.
42. A. R. Richardson, Hypercomplex determinants, Messenger of Math. 55 (1926), 145-152.
43. A. R. Richardson, Simultaneous linear equations over a division algebra, Proc. London Math. Soc. 28 (1928), 395-420.
44. E. Study, Zur Theorie der linearen Gleichungen, Acta Math. 42 (1920), 1-61.
45. O. Teichmüller, Operatoren im Wachsschen Raum, J. Reine Angew. Math. 174 (1935), 73-124.
46. C. L. Tong, Symplectic Groups, honours thesis, National Univ. of Singapore, 1991.
47. B. Leendert van der Waerden, Hamilton's discovery of quaternions, Math. Mag. 49 (1976), 227-234.
48. B. Leendert van der Waerden, A History of Algebra, New York: Springer-Verlag, 1985.
49. P. Van Praag, Sur les déterminants des matrices quaterniennes, Helv. Phys. Acta 62 (1989), 42-46.
50. P. Van Praag, Sur la norme réduite du déterminant de Dieudonné des matrices quaterniennes, J. Algebra 136 (1991), 265-274.
51. L. A. Wolf, Similarity of matrices in which the elements are real quaternions, Bull. Amer. Math. Soc. 42 (1936), 737-743.
52. L. E. Zagorin, The determinants of matrices over a field (Russian), Proc. First Republican Conf. Math. Byelorussia, Izdat, Minsk: "Vysšaja Škola", 1965, pp. 151-152.

Department of Mathematics
National University of Singapore
Singapore 0511
Republic of Singapore
e-mail: [email protected]

THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996
The Solution of the n-body Problem* Florin Diacu
The wind scrambles and thunders over hills with a voice far below what we can hear. Whalesong, birdsongs boom and twitter. Sea, air, everything's a chaos of signals and even those we've named veer and fall in pieces under our neat labels. Waves-how to speak of the structure of waves when all disperses and there's nothing fixed to tell?
--Philip Holmes, Background Noise
Folk-Mathematics A folk-tale is a popular story uttered from one genera-
tion to the next. The main source of culture in times of old, oral tradition plays a marginal role in spreading scientific information today. Still, its significance is by no means negligible, and all domains of human activity are more or less influenced by it. Mathematics is no exception. We all know theorems we have never read in books or papers or learned about at formal presentations. We often don't know a reference, have no idea who proved that result, how, and when. Usually a colleague mentioned it at some conference dinner, during a coffeebreak or in a friendly discussion in our Department. It is striking, it sticks to our mind, and after a while it is part of our mathematical heritage---we just know it. Then we tell it further under similar circumstances, and so the wheel turns on. We will call this component of our knowledge folk-mathematics. Without denying the positive role folk-mathematics plays in spreading information, we must admit that results gathered through it are sometimes misleading or misunderstood. A typical example is the Cantor set. Everybody knows that the middle-third Cantor set has zero Lebesgue measure, and many believe that the middle-fifth analogue has positive measure. Intuitively this sounds plausible: if we remove each time a smaller segment, the remaining quantity should be larger. Unfortunately, the intuition leads us astray this time. For any
k, the middle-kth Cantor set has zero measure. Though a simple computation would show this, few do it, so the mistake propagates from one mathematician to the other. We can indeed obtain a Cantor set of positive measure by assigning a variable removal step. Delete first the middle-third segment, then the middle-ninth, then the middle-twenty-seventh, and so on. This algorithm will lead us to the desired result. The above example is easy to check, but what are we up against when a more complicated folk-mathematical situation appears? Physicists and mathematicians less familiar with celestial mechanics, have asked me at different occasions to provide details about the "impossibility of solving the n-body problem." Some had heard that Poincar6 had proved the result, others recalled only that such a theorem exists somewhere in the literature. After all, this is a natural question. Since Abel and Galois proved the impossibility of solving algebraic equations
*Dedicated to Philip Holmes, for his deep mathematics, for his warm and candid poetry, and for the immense intellectual joy he has instilled in me during the time our book took shape.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
of degree five or higher through formulae involving only roots, why should there not be an impossibility proof for solving the n-body problem? The astonishment comes when we respond that the n-body problem has already been solved. Of course, the answer requires explanation, and since this old question of celestial mechanics continues to raise interesting challenges (as it has for the last three centuries), it is worth telling here the intriguing story and the unexpected consequences of the most important attempts to obtain an explicit solution.
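The "simple computation" behind the Cantor-set example can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the variable-step construction is read here as removing, at round j, the middle 3^(-j) fraction of every remaining interval (an assumption about the intended procedure).

```python
def middle_kth_measure(k, steps):
    """Length left in [0, 1] after `steps` rounds of removing the middle
    1/k of every remaining interval: each round keeps a fraction 1 - 1/k."""
    return (1.0 - 1.0 / k) ** steps


def variable_step_measure(steps):
    """Length left when round j removes the middle 3**(-j) fraction of every
    remaining interval (assumed reading of the variable removal step)."""
    remaining = 1.0
    for j in range(1, steps + 1):
        remaining *= 1.0 - 3.0 ** (-j)
    return remaining


# Fixed k: the remaining length shrinks toward zero, however large k is.
for k in (3, 5, 100):
    print(k, middle_kth_measure(k, 10_000))

# Variable step: the product converges to a positive limit.
print(variable_step_measure(60))
```

With a fixed k the factors are all equal to 1 - 1/k, so their product must vanish; with removal fractions 3^(-j) the factors approach 1 fast enough (their deficits are summable) that the product stays bounded away from zero.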
King Oscar's Prize

Having its origins in Newton's Principia, the n-body problem of celestial mechanics is an initial-value problem for ordinary differential equations: for given initial data $q_i(0)$, $\dot{q}_i(0)$, $i = 1, \ldots, n$ (with $q_i(0) \neq q_j(0)$ for mutually distinct i and j), find the solution of the second-order system

$$ m_i \ddot{q}_i = -\sum_{j=1,\, j \neq i}^{n} \frac{m_i m_j (q_i - q_j)}{|q_i - q_j|^3}, \qquad i = 1, \ldots, n, \qquad (*) $$

where $m_1, m_2, \ldots, m_n$ are constants representing the masses of n point-masses, and $q_1, q_2, \ldots, q_n$ are 3-dimensional vector functions of the time variable t, describing the positions of the point-masses. For n = 2 the problem was completely solved by Johann Bernoulli in 1710 (see [B], [W], [DH]), but for more than a century and a half after Bernoulli's success, the case $n \geq 3$ eluded the efforts of everyone.

Interest in the problem grew towards the end of the last century, when a special event made the best mathematicians look at celestial mechanics with more concern than ever before. In volume 7, 1885/86, Acta Mathematica announced the establishment of a prize in honour of King Oscar II of Sweden and Norway, to be awarded on the King's 60th birthday: 21 January 1889. The deadline for submission was set for 1 June 1888. Finding a convergent power-series solution of the above initial value problem was the first--and the most important--among the four questions proposed by the three-member jury: Gösta Mittag-Leffler (the editor-in-chief of Acta), Charles Hermite, and Karl Weierstrass. The formulation of the first question, due to Weierstrass, who had shown growing interest in the problem himself, appeared in German and French as follows in our translation (a slightly different translation was given by Daniel Goroff in [P]):

Given a system of arbitrarily many mass points that attract each other according to Newton's laws, under the assumption that no two points ever collide, try to find a representation of the coordinates of each point as a series in a variable that is some known function of time and for all of whose values the series converges uniformly. This problem, whose solution would considerably extend our understanding of the solar system, seems capable of solution using analytic methods now at our disposal; we can at least suppose as much, since Lejeune Dirichlet communicated shortly before his death to a geometer of his acquaintance [Leopold Kronecker] that he had discovered a method for integrating the differential equations of Mechanics, and that by applying this method, he had succeeded in demonstrating the stability of our planetary system in an absolutely rigorous manner. Unfortunately, we know nothing about this method, except that the theory of small oscillations would appear to have served as his point of departure for this discovery. We can nevertheless suppose, almost with certainty, that this method was based not on long and complicated calculations, but on the development of a fundamental and simple idea that one could reasonably hope to recover through persevering and penetrating research. In the event that this problem remains unsolved at the close of the contest, the prize may also be awarded for a work in which some other problem of Mechanics is treated as indicated and solved completely.
Out of the 12 papers eventually submitted for the competition, 5 treated the n-body problem; none of them, however, obtained the required power-series solution. Under these circumstances the jury decided to award the prize to the 35-year-old Henri Poincaré, for his remarkable contribution to the understanding of the equations of dynamics (called Hamiltonian systems today) and for the many new ideas he brought into mathematics and mechanics. Indeed, Poincaré's memoir, later developed into his monumental 3-volume work Les Méthodes Nouvelles de la Mécanique Céleste, laid the foundations of several branches of mathematics and--most important--opened the way to qualitative methods, as opposed to the quantitative ones that had reigned in analysis since Newton and Leibniz.

Published in volume 12, 1890, of Acta Mathematica, Poincaré's memoir offered the first example of chaotic behavior in a deterministic system (it involved homoclinic orbits in a first-return map in the restricted 3-body problem). In fact Poincaré understood the complicated behavior of those orbits only after the prize was awarded to him. The first version of his paper, the one actually awarded the prize, incorrectly claimed that such orbits were stable, by missing the important fact that the homoclinic intersection might be transversal. Assaulted with questions by Edvard Phragmén, the assistant editor at Acta in charge of preparing the manuscript for publication, Poincaré finally discovered and corrected the mistake.

Phragmén had found Poincaré's work very hard to read. The initial version almost doubled in size after Phragmén's repeated requests for clarification. Writing about the subsequent 1895 paper entitled Analysis Situs, Jean Dieudonné [Di] characterized Poincaré's style in the following words:
As in so many of his papers, he gave free rein to his imaginative powers and his extraordinary intuition, which only very seldom led him astray; in almost every section is an original idea. But we should not look for precise definitions, and it is often necessary to guess what he had in mind by interpreting the context. For many results he simply gave no proof at all, and when he endeavored to write down a proof, hardly a single argument does not raise doubts. The paper is a blueprint for future developments of entirely new ideas, each of which demanded the creation of a new technique to put it in a sound basis.
Unfortunately Poincaré's correction came only after the memoir had been printed and some of Acta's issues delivered to subscribers. As editor-in-chief of Acta, as a member of the jury, and as a favorite of the King, Mittag-Leffler was put in a delicate position. To defend the honor of the prize and his own credibility and position, he decided to recall the published issues and print the correct version. Poincaré agreed to bear the costs of the first printing: 3585 Swedish crowns and 63 öre, more than the 2500 crowns he had received for the prize (to understand the figures, bear in mind that Mittag-Leffler's annual salary as a professor at the University of Stockholm had been 7000 crowns in 1882) [A], [BG]. I do not go further into the history and the scandal that followed (the interested reader can find the historical and mathematical details in [DH], our forthcoming book about the origins and the development of chaos and stability). What matters now is the negative result proved by Poincaré in the prize memoir, a result that does show the impossibility of solving the n-body problem, but only by use of a certain method.

Is this Problem Unsolvable?

First integrals (or simply integrals) for systems of differential equations are functions that remain constant along any given solution of the system, the constant depending on the solution. In other words, integrals provide relations between the variables of the system, so each scalar integral would normally allow the reduction of the system's dimension by one unit. Of course, this reduction can take place only if the integral is an algebraic--not very complicated--function with respect to its variables, such that one variable can be expressed as a function of the others. If the integral is transcendent, any attempt to obtain such an expression is pointless.

At the time of Poincaré, the method of solving systems of differential equations by finding first integrals was much in use. It had been known for a long time that the n-body problem had 10 independent algebraic first integrals: 3 for the center of mass, 3 for the linear momentum, 3 for the angular momentum, and one for the energy (see, e.g., [W], [D1], [D2]). This allowed the reduction of the primitive system from 6n variables (each point-mass is represented in space by 3 position and 3 velocity components) to 6n - 10. Jacobi had shown that using a so-called reduction of nodes (some symmetries), the dimension of the system could be further reduced to 6n - 12, but this was not enough to understand even the 3-body problem--it still left a complicated 6-dimensional first-order system unsolved--not to mention higher values of n.

In 1887 the 39-year-old German mathematician Ernst Heinrich Bruns published in Acta Mathematica a surprising result [Br]: the n-body problem has no integrals--algebraic with respect to the time, the position, and the velocity coordinates--except the 10 known ones. Though some gaps were subsequently discovered in Bruns's proof, Poincaré had no doubt that the result was true. In his prize paper he proved an even stronger theorem: there are no integrals--algebraic with respect to the time, the position, and the velocities only--other than the 10 known ones.

In other words, these negative results showed it is impossible to solve the equations of motion of the n-body problem by reducing the dimension of the system with the help of first integrals. This does not mean that the n-body problem is unsolvable, just that a certain method fails to solve it. In fact, standard results of differential equations theory show that any initial value problem for the equations (*), with initial data not starting from collisions, leads to the existence of a unique solution defined on a maximal interval, which is the whole real line if singularities do not occur. So the problem posed by King Oscar's prize made sense and could be solved, in principle. Unfortunately, the folk-mathematical tradition retained only one aspect of these results and perpetuated the wrong message that the n-body problem was unsolvable. After a digression into the foundations of mathematics, I will tell how the n-body problem was later solved in the spirit of King Oscar's prize.
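The statement that the 10 classical integrals stay constant along every solution of (*) can be checked numerically. What follows is a minimal sketch, not a method taken from the article: it assumes units in which the gravitational constant is 1, uses the velocity-Verlet (leapfrog) scheme as one convenient integrator, and takes as illustrative data three unit masses started on the well-known figure-eight orbit with a uniform drift added. All function names are hypothetical.

```python
import math


def accelerations(masses, q):
    """Accelerations given by (*): each body is pulled toward the others (G = 1)."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                r3 = math.dist(q[i], q[j]) ** 3
                for k in range(3):
                    acc[i][k] += masses[j] * (q[j][k] - q[i][k]) / r3
    return acc


def classical_integrals(masses, q, v, t):
    """The 10 classical integrals as one list: linear momentum P (3),
    center-of-mass integral M*C - t*P (3), angular momentum L (3), energy (1)."""
    n = len(masses)
    P = [sum(masses[i] * v[i][k] for i in range(n)) for k in range(3)]
    MC = [sum(masses[i] * q[i][k] for i in range(n)) for k in range(3)]
    G = [MC[k] - t * P[k] for k in range(3)]
    L = [0.0, 0.0, 0.0]
    for m, x, w in zip(masses, q, v):
        L[0] += m * (x[1] * w[2] - x[2] * w[1])
        L[1] += m * (x[2] * w[0] - x[0] * w[2])
        L[2] += m * (x[0] * w[1] - x[1] * w[0])
    kinetic = 0.5 * sum(m * sum(c * c for c in w) for m, w in zip(masses, v))
    potential = -sum(masses[i] * masses[j] / math.dist(q[i], q[j])
                     for i in range(n) for j in range(i + 1, n))
    return P + G + L + [kinetic + potential]


def leapfrog(masses, q, v, dt, steps):
    """Advance (*) by `steps` velocity-Verlet steps; returns new positions, velocities."""
    q = [list(x) for x in q]
    v = [list(w) for w in v]
    acc = accelerations(masses, q)
    for _ in range(steps):
        for i in range(len(masses)):
            for k in range(3):
                v[i][k] += 0.5 * dt * acc[i][k]   # half kick
                q[i][k] += dt * v[i][k]           # drift
        acc = accelerations(masses, q)
        for i in range(len(masses)):
            for k in range(3):
                v[i][k] += 0.5 * dt * acc[i][k]   # second half kick
    return q, v


def max_drift(masses, q0, v0, dt, steps):
    """Largest absolute change in any of the 10 integrals over the run."""
    before = classical_integrals(masses, q0, v0, 0.0)
    q1, v1 = leapfrog(masses, q0, v0, dt, steps)
    after = classical_integrals(masses, q1, v1, dt * steps)
    return max(abs(a - b) for a, b in zip(after, before))


# Illustrative data (an assumption): figure-eight initial conditions plus a drift.
masses = [1.0, 1.0, 1.0]
q0 = [[0.97000436, -0.24308753, 0.0], [-0.97000436, 0.24308753, 0.0], [0.0, 0.0, 0.0]]
v3 = [-0.93240737, -0.86473146, 0.0]
drift = [0.1, 0.2, 0.3]
v0 = [[-0.5 * v3[k] + drift[k] for k in range(3)] for _ in range(2)]
v0.append([v3[k] + drift[k] for k in range(3)])

print(max_drift(masses, q0, v0, dt=1e-3, steps=2000))
```

The printed drift should be small, limited by round-off for the momenta and by the step size for the energy. The check confirms only that the ten quantities are conserved; as the text explains, ten integrals are far too few to reduce the 6n-dimensional system to something solvable by quadrature.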
Brouwer's Attack

All active mathematicians have opinions about what problems have importance, what branches are difficult, and what directions are promising in their own field. But unlike other sciences, whatever differences of opinion arise, all mathematicians agree that a result proved two millennia, two centuries, or two years ago, remains true forever. The progress of mathematics has little to do with the foundations. In spite of this, some prominent mathematicians have dedicated time and energy towards understanding the roots of their discipline. Sometimes, their efforts have raised polemics and disputes as sharp as those frequently met in other domains of human activity.

In 1913, the 32-year-old Luitzen Brouwer launched an attack against a well established mathematical method of reasoning. As an editor of the prestigious Mathematische Annalen, he rejected all submitted papers that used reductio ad absurdum as a method of proof. This led to a scandal. The editorial board held an emergency meeting to save the reputation of the journal. The board resigned as a whole and reelected itself, except Brouwer. Offended by his colleagues' attitude and supported by his government, Brouwer immediately established a rival journal in Holland [G].

That embarrassing incident marked the beginning of a long fight between intuitionism and formalism, the main schools of mathematical-philosophical thought at the beginning of our century, each claiming to have found--against the other--the only viable way of laying the foundations of mathematics. The building of foundations had come to seem urgent due to the antinomies, known already by the Greeks, but which had now started to embarrass the recently established set theory. The main objection of Brouwer's intuitionism against Hilbert's formalism concerned existence theorems. Brouwer considered that a nonconstructive argument cannot be accepted as proof of existence, so reductio ad absurdum seemed to him a good point to start the polemic. On the other hand Hilbert, who took Brouwer's action personally, attempted to show that every theorem can be deduced by logical steps from the postulates of a given axiomatic system. Unfortunately, in this respect the German mathematician was wrong.

In 1931, Hilbert's formalism received a sharp blow when the Austrian logician Kurt Gödel published his incompleteness theorem [Gö]. Gödel proved that any sufficiently rich, sound, and recursively axiomatizable theory is incomplete. A recent paper [CJZ] goes even further by showing that, in a quite general topological sense, incompleteness is a common phenomenon: with respect to any reasonable topology, the set of true and unprovable statements is dense in the set of all statements. This result has persuaded some mathematicians that the future of mathematics is not with proving theorems but with trying to estimate the probability that a result is true.

On the other hand, Brouwer's intuitionism--though never fully refuted by any other theory and still the object of some research--fell into oblivion, because it raised barriers which the mathematical community refused to acknowledge. Mathematics has developed almost undisturbed by the fight for its foundations.

We will further see, however, that the main idea of intuitionism is off target. In certain cases a constructive proof of existence brings no more information than a nonconstructive one. This is surprising, and the example I offer is the n-body problem.

The Series Solution

In 1913, when he launched the attack that would deprive him of editorial membership at the Mathematische Annalen, Brouwer was not aware of a paper published in Acta Mathematica a few months before by a Finn of Swedish origin, Karl Sundman. If he had known and understood Sundman's work, Brouwer would probably never have developed his intuitionism.

Sundman's paper [Su3] revisited and republished some of his own results (inspired by a previous work of the Italian mathematician Giulio Bisconcini [Bi]) that had appeared in 1907 [Su1] and 1909 [Su2] in a Finnish journal of lesser fame and circulation. One of Sundman's achievements was to find, for almost all admissible initial data, a series solution of the 3-body problem. If he had gotten this result 22 years earlier, he would have probably been awarded King Oscar's prize.

Reading Sundman's paper we see that he obtained a series solution in powers of $t^{1/3}$ for the 3-body problem, a series convergent for all real t, except for a negligible set of initial conditions, namely, those for which the angular momentum is zero. Indeed, Sundman proved first the convergence of the series as long as no collisions take place. (The importance of the method developed in that paper, which is based on the theory of functions of a complex variable, is analyzed in a nice article by Donald Saari [S].) Sundman also surmounted the impediment of binary collisions through a process he called regularization, which means to analytically extend the solution beyond the collision singularity, and which physically corresponds to an elastic bounce. In this case, his series still proves convergent for all real values of the time variable. Unfortunately he could not apply the same method if a triple collision occurs, but he showed that such a collision can take place only if the angular momentum cancels, hence for a set of initial data having measure zero. (Even within this set, the subset of initial data leading to triple collisions has measure zero, as one of Saari's students has shown in his Ph.D. thesis [U].) In 1941, Carl Ludwig Siegel proved that such a regularization is possible only for a negligible set of masses, so indeed, the analytic continuation of triple collisions is generically impossible [Si].

Sundman's method failed to apply to the n-body problem for n > 3. It took about 7 decades until the general case was solved. In 1991, a Chinese student, Quidong (Don) Wang, published a beautiful paper [Wa], [D1], in which he provided a convergent power series solution of the n-body problem. He omitted only the case of solutions leading to singularities--collisions in particular. (To understand the complications raised by solutions with singularities, see [D2].)

Did this mean the end of the n-body problem? Was this old question--unsuccessfully attacked by the greatest mathematicians of the last 3 centuries--merely solved by a student in a moment of rare inspiration? Though he provided a solution as defined in sophomore textbooks, does this imply that we know everything about gravitating bodies, about the motion of planets and stars? Paradoxically, we do not; in fact we know nothing more than before having this solution. The following section deals with this apparent paradox.
The Foundations of Mathematics

What Sundman and Wang did is in accord with the way solutions of initial value problems are defined; everything is apparently all right; but there is a problem, a big one: these series solutions, though convergent on the whole real axis, have very slow convergence. One would have to sum up millions of terms to determine the motion of the particles for insignificantly short intervals of time. The round-off errors make these series unusable in numerical work. From the theoretical point of view, these solutions add nothing to what was previously known about the n-body problem.

This unusual situation makes us think once more about the foundations of our discipline. First of all, it illustrates that even a constructive solution can be useless from the practical point of view. Then why stick to it, why give intuitionism any concern? Well, this difficulty would still not keep us from sleeping soundly. How many of us really care about intuitionism when doing mathematics?

Unfortunately, doubt is also cast on the definition of a solution for an initial value problem attached to a differential equation. If our definition is meaningful, then shouldn't it exclude totally useless solutions? In certain cases all our efforts toward finding and writing down solutions might be as futile as Sisyphus's work; moreover, we have no way of knowing in advance when this will be the case. What to do then? Eliminate power series solutions from our definition? This would mean to negate two centuries of mathematics and throw many achievements away. Clearly there is no simple answer.

The third problem is connected to what "good" mathematics means. Consciously or not, we usually understand by this the mathematics promoted by famous mathematicians. No one would doubt that the mathematics of Weierstrass, for example, was and remains "good." But Weierstrass stated the first problem of King Oscar's prize, a problem tackled by the sharpest minds of the time.
It was eventually solved exactly as the German mathematician had wished; still, a hundred years later, its solution presents only historical interest. Fortunately, the genius of Poincaré steered our discipline in the right direction--at least this is what we believe today. But how will mathematicians think a hundred years from now?

The n-body problem--a bulwark against the flow of time, a reliable landmark on the map of mathematics--has posed and continues to pose new challenges. Almost untouched, mysterious as in the beginning, it has survived 300 years of siege. It has kindled and witnessed a few revolutions: the beginnings of calculus, of qualitative methods, of relativity, of chaos; tackled numerically, it has contributed to the launch of satellites and to the first human step on the moon. Now it is disturbing the fundamentals of differential equations theory, the structure on which a significant part of modern science and technology is based. Do we have an answer to this last challenge?
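The gap between "convergent for every argument" and "usable in numerical work" can be seen on a much simpler series than Sundman's or Wang's. The sketch below is an analogy only and makes no claim about their series: it sums the Maclaurin series of the exponential function, which converges for every real x, directly in double precision at x = -40. The function name is hypothetical.

```python
import math


def exp_by_series(x, terms=400):
    """Sum the Maclaurin series of exp(x) term by term in double precision."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)   # next term x**(k+1) / (k+1)!
    return total


x = -40.0
naive = exp_by_series(x)            # alternating terms as large as about 1e16
stable = 1.0 / exp_by_series(-x)    # same series at +40, then inverted
print(naive, stable, math.exp(x))
```

At x = -40 the individual terms reach magnitudes near 10^16 while the true sum is about 4e-18, so round-off in the large terms swamps the answer and the directly summed value bears no relation to it. For the exponential a simple reformulation (summing at +40 and inverting) rescues the computation; the point of the analogy is only that convergence alone does not guarantee a series one can compute with.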
References

[A] K.G. Andersson, Poincaré's discovery of homoclinic points, Archive for History of Exact Sciences 48 (1994), 133-147.
[BG] J. Barrow-Green, Oscar II's prize competition and the error in Poincaré's memoir on the three body problem, Archive for History of Exact Sciences 48 (1994), 107-131.
[B] J. Bernoulli, Opera Omnia, vol. I, Georg Olms Verlagsbuchhandlung, Hildesheim, 1968.
[Bi] G. Bisconcini, Sur le problème des trois corps, Acta Mathematica 30 (1906), 49-92.
[Br] E.H. Bruns, Über die Integrale des Vielkörper-Problems, Acta Mathematica 11 (1887), 25-96.
[CJZ] C. Calude, H. Jürgensen and M. Zimand, Is independence an exception? Applied Math. Comput. 66 (1994), 63-76.
[D1] F.N. Diacu, Singularities of the N-Body Problem, Les Publications CRM, Montréal, 1992.
[D2] F.N. Diacu, Painlevé's conjecture, The Mathematical Intelligencer 15 (1993), no. 2, 6-12.
[DH] F.N. Diacu and P. Holmes, Celestial Encounters--The Origins of Chaos and Stability, Princeton University Press (to appear in August 1996).
[Di] J. Dieudonné, A History of Algebraic and Differential Topology 1900-1960, Birkhäuser, Boston, Basel, 1989.
[G] R.L. Goodstein, Essays in the Philosophy of Mathematics, Leicester University Press, 1965.
[Gö] K. Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, Monatshefte für Mathematik und Physik 38 (1931), 173-198.
[P] H. Poincaré, New Methods of Celestial Mechanics (with an introduction by D.L. Goroff), American Institute of Physics, 1993.
[S] D.G. Saari, A visit to the Newtonian N-body problem via elementary complex variables, The American Mathematical Monthly 97 (1990), 105-119.
[Si] C.L. Siegel, Der Dreierstoß, Annals of Mathematics 42 (1941), 127-168.
[Su1] K. Sundman, Recherches sur le problème des trois corps, Acta Societatis Scientiarum Fennicae 34 (1907), no. 6.
[Su2] K. Sundman, Nouvelles recherches sur le problème des trois corps, Acta Societatis Scientiarum Fennicae 35 (1909), no. 9.
[Su3] K. Sundman, Mémoire sur le problème des trois corps, Acta Mathematica 36 (1912), 105-179.
[U] J.B. Urenko, Improbability of collisions in Newtonian gravitational systems of specified angular momentum, SIAM J. Appl. Math. 36 (1979), 123-147.
[Wa] Q. Wang, The global solution of the n-body problem, Celestial Mechanics 50 (1991), 73-88.
[W] A. Wintner, The Analytical Foundations of Celestial Mechanics, Princeton University Press, Princeton, NJ, 1941.
Department of Mathematics and Statistics University of Victoria Victoria, British Columbia V8W 3P4 Canada
Jet Wimp*
Probability Theory: An analytic view
by Daniel W. Stroock
Cambridge, England: Cambridge University Press, 1993. xvi + 512 pp. Hardcover: $52.95, ISBN 0-521-43123-9
Reviewed by Peter Whittle

The book has an intriguing title. There is no doubt that many stochastic models can be treated either by probabilistic or by analytic methods, and that a strange complementarity between the two approaches prevents one from judging either superior. The most obvious case in point is the treatment of sums of independent random variables. This can be undertaken without recourse to the characteristic function. However, such a recourse, which amounts to an explicit appeal to Fourier techniques and the great body of associated analytic theory, offers speed and economy. The usefulness of this approach is not confined to the case of independent random variables; it is very often helpful to transform the Kolmogorov equations for a Markov process into operator equations in the characteristic function. The possibility of the two approaches, each with its own advantages, becomes even more evident if one considers the large deviation analysis of such processes when an increase in physical scale causes the model to approach determinism. The probabilistic approach appeals to the tilting of distributions and to martingale structure, the analytic approach to a WKB treatment of the operator equations--essentially to approximation of a Fourier transform by a Legendre transform.

Quite a different class of ideas was initiated by Kac in his magnificently stimulating paper [1]. Kac considered a random walk (in fact, a Wiener process) in a region with an absorbing boundary. He showed how properties of the process which were "obvious" probabilistically implied classic theorems of Weyl and Carleman on the distribution of eigenvalues of the linear operator constituted by the infinitesimal generator of the process.
*Column Editor's address: Department of Mathematics, Drexel University, Philadelphia, PA 19104 USA.
Either approach, though it must be supplemented by detailed argument if it is to provide a rigorous proof, can supply a key insight into the problem. However, the analytic approach is notable in that formal use of a standard machinery can give one a very quick route to the results which one can expect will hold. I have expressed this view elsewhere in words which cannot command universal assent: "One might assert as a rough truth that the probabilistic course is preferred by the purist and the analytic course by the stylist ... the analytic route has the advantage of homing more directly onto the goal, even if it is the probabilistic route which ultimately provides both rigor and insight."

So much for the train of thought initiated by Professor Stroock's title. Does his text follow the tracks suggested? Only slightly. Professor Stroock also pays homage to Kac, but has his own ideas of direction and style. Briefly, the book separates into two parts. In the first, Chapters 1-4, he works simply with the concept of independent random variables and exploits this for all it is worth. The role of analysis in this part seems largely to supply rigour, rather than novelty, to the arguments. Not until the second part, Chapters 5-8, does he introduce the concept of conditioning. This leads quickly to the study of martingales and a demonstration of the relationships--significant, but evident only to a powerful mind--which these enjoy with some of the truly major themes of classical analysis.

Chapter 1 (Sums of independent random variables: Independence; the weak law of large numbers; Cramér's theory of large deviations; the strong law of large numbers; the law of the iterated logarithm). This is necessarily fairly standard material. The concept of probability (as a measure on sets) is adopted immediately, together with that of independence. Professor Stroock produces a rabbit out of a hat with deduction of the Kolmogorov 0-1 law on p. 2.
The expectation concept is smuggled in--defined in a two-line footnote on p. 3 and not mentioned in the index--despite the fact that it is put to work immediately and continually. The proofs of the laws of large numbers are standard--the weak law by the Chebychev inequality plus truncation, the strong law by Kolmogorov's inequality. However, interesting applications are given immediately, e.g., the approximation of functions by Bernstein polynomials.
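The link between the weak law and Bernstein polynomials mentioned above can be made concrete. The n-th Bernstein polynomial of f at x is the expectation of f(S_n/n), where S_n counts successes in n independent trials with success probability x; since S_n/n concentrates near x, the expectation approaches f(x) for continuous f. The following is a minimal sketch of that standard construction, not code or notation from the book under review; the function name and the test function |x - 1/2| are illustrative choices.

```python
import math


def bernstein(f, n, x):
    """B_n f(x) = sum_k f(k/n) * C(n, k) * x**k * (1 - x)**(n - k),
    i.e. the expectation of f(S_n / n) for S_n binomial with parameters (n, x)."""
    return sum(f(k / n) * math.comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))


def f(x):
    """A continuous test function with a corner at 1/2 (illustrative choice)."""
    return abs(x - 0.5)


grid = [i / 50 for i in range(51)]
for n in (10, 100, 1000):
    # Largest error on the grid; it shrinks as n grows.
    print(n, max(abs(bernstein(f, n, x) - f(x)) for x in grid))
```

The error at the corner decays only like n^(-1/2), which is the same rate at which S_n/n fluctuates around x, so the approximation is uniform but slow.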
The treatment of Cramér's theorem provides the only mention of large deviations in the book, despite the fact that this would have been a natural theme (for reasons indicated above) and one which the author is impressively qualified to develop. One notices idiosyncratic notation: $\sqrt{-1}$ (rather than the accepted i) for the square root of minus one and, later, γ (rather than the accepted φ) for the normal density. As in some botanical gardens, common objects often do not bear their common names, and interpretation is sketchy. Professor Stroock assumes not merely an analytic competence but also an analytic motivation, and so gives some of us an opportunity to learn a lot. Every section has a considerable collection of exercises--an appropriate term with its connotations of muscularity. Professor Stroock has an extensive private collection of pet ideas, techniques, associations and applications, and the exercises give him a fine chance to run through these.

Chapter 2 (The central limit theorem). The theorems of Lindeberg and Berry-Esseen are proved by very elegant and ingenious arguments; essentially a weak convergence proof, appealing (for the Berry-Esseen sharpening) to Bolthausen's version of Stein's method. Fourier ideas and the characteristic function are invoked first in the following section ("extensions"), which treats the multivariate case and other characterizations of the normal distribution (e.g., invariance under convolution for finite-variance distributions, the isotropy/independence characterizations of statistical mechanics). The chapter concludes with what the author admits to being a non-probabilistic diversion, but one that he cannot resist: Hermite multipliers. In fact, the reader is less likely to object to the diversion as such than to the fact that its motivation and point remain obscure--obstinately analytic.
Hermite polynomials make an appearance, and operators are introduced which are plainly the creation and annihilation operators of quantum mechanics, but these names are not used, although the author references Nelson's construction of a two-dimensional quantum field. Enthusiasm is a fine thing, but the section seems to pursue a formalism for its own sake, with no explanation of the context which would make it meaningful.

Chapter 3 (Convergence of measures, infinite divisibility, processes with independent increments). The introductory section, developing the weak convergence concept, opens in great generality. However, matters begin to clear quickly, and the relatively limpid Theorem 3.1.7 is especially welcome. This is a variation of the Riesz representation theorem, establishing a sufficient condition that a nonnegative linear functional should be representable as an integral with respect to a measure. Standard concepts and results then begin to click easily into place: tightness, the Kolmogorov extension theorem, the Lévy continuity theorem. The following two sections then develop the notion of infinite divisibility,
first the non-Gaussian case and then the limit (Gaussian) case, so leading to the Wiener process, Donsker's invariance principle, and the full Lévy-Khintchine characterization of infinitely divisible processes. Some hint is given early in the chapter of the form this representation might plausibly take, which is just as well, because 44 pages of dense technical argument are required to get there.

Chapter 4 (A celebration of Wiener's measure). This chapter is motivated by Professor Stroock's assertion that "Wiener's measure is possibly the single most important object in all of modern probability theory." This is an assertion which invites discussion, at the very least, but one can see its particular validity for an analyst, with the relation of the Wiener process to the ∇² operator, and so to harmonic functions, potentials, and the like. The chapter opens with a compact demonstration of standard properties of the process: scaling-invariance; continuity in a strong sense, unbounded variation. The topic of the second section, "Gaussian aspects," would indeed be straightforward in its finite-dimensional aspects, but Professor Stroock is skirting the idea of white noise: that the exponent in the "probability density" for the Wiener process would involve the time integral of the squared time derivative of the path. He resolves familiar difficulties by an abstract formulation and discussion of the characteristic function of a linear functional of the process. Other deductions follow: e.g., the Cameron-Martin Lemma and the time-reversibility of the pinned process. In the final section, Stroock introduces what he terms "Markov aspects" and stopping times, and demonstrates the strong Markov nature of the Wiener process (all this without mention, as yet, of the conditioning concept!). From this follows the reflection principle, the first-exit distribution, and the Feynman-Kac formula.
Actually, to anyone with a background in dynamic programming, and so an easy if presumptuous familiarity with the Kolmogorov backward equation, the Feynman-Kac formula is immediately evident, and for a general Markov process rather than the Wiener process. However, presumption is not tolerated here, rigour is de rigueur, and rigorous deduction of versions of the formula is a heavy matter. The first part, then, does not display the slickness of the analytic approach, but rather its grinding power in hands as skilled as Professor Stroock's. It is perhaps vain at the present time to resurrect the debate between those with a principal concern for rigour (largely, but not entirely, represented by the mathematicians) and those with a principal concern for insight (largely, but not entirely, represented by the physicists). At the moment fashion sways in favour of the former, although the dynamics of attitude and fashion ensure that this will not last. The cry goes in one direction "You have proved nothing" and in the other "You have proved nothing," and neither group listens. However, one may at least claim a qualitative difference between those results which remain interesting
and significant in their naive (finite-dimensional) version and those which do not. The Kac-Feller assertions relating expected recurrence time and equilibrium occupation probability would be examples of the first type, the Cameron-Martin formula (giving the effect on a Gaussian density of a displacement of the mean) an example of the second (which is not to deny the content of the infinite-dimensional version). To these one must of course add concepts which are intrinsically infinite-dimensional in nature, of which the Wiener process is a clear example (being the only homogeneous process of independent increments which has continuous paths).
Chapter 5 (Conditioning and martingales). Conditioning is introduced first in the naive characterisation and then by the Kolmogorov characterisation of a conditional expectation. (This latter, I cannot resist saying, illustrates the natural primacy of the expectation concept.) It is introduced for a definite purpose: to set up the concept of a martingale (in fact, a discrete-parameter martingale). That this is now seen as the natural progression of ideas is a tribute to Doob's pioneering insight. Martingale convergence is demonstrated, largely following Doob's treatment, although with a recognition of the advantages of seeing a martingale as a sequence of projections, in the case when second moments exist. After his habitual meaty selection of exercises, Professor Stroock links the martingale concept immediately with some of the material of classical analysis. Explicitly, he discusses the Hardy-Littlewood maximal function $Mf(x)$--the maximal value of the average of $|f|$ over cubes centred on $x$. By appeal to martingale arguments he then deduces a celebrated inequality of these two authors, the Lebesgue differentiation theorem, and the Calderón-Zygmund decomposition. However, it is in Chapter 6 (Some applications of martingale theory) that Professor Stroock cuts loose and demonstrates the far reach of the martingale concept.
First he establishes the individual ergodic theorem. Then he moves on to a topic which one would have imagined completely unrelated to martingales: the study of singular integral operators (of which the Hilbert transform, with kernel proportional to $(x - \xi)^{-1}$, is the prime example). The discussion is again technical and does not lend itself to summary, but confirms the author's contention that the topic exemplifies "the kind of delicate cancellation properties which underlie the most challenging applications of martingale theory." Indeed, in the remarkable following section the author relates this work, Burkholder's inequality, and the general question of the Fourier representation of the action of an operator.
Chapter 7 (Continuous martingales and elementary diffusion theory). Completion of the discrete-parameter theory to the continuous-parameter case leads straight into a discussion of the properties of Wiener paths, e.g., their recurrence properties and their ability to mimic
any given continuous path arbitrarily well over any finite time interval. "Perturbations of Wiener paths" refers to what one would loosely speak of as a first-order stochastic differential equation driven by additive white noise--what we otherwise know as a diffusion process. The qualitative conclusions are that the drift (deterministic) term in the equation determines the global properties of the path, but that the local properties are very much those of the Wiener path itself. All this is, of course, expressed and analysed in the most rigorous fashion. A later section deals with the case when the drift term is the gradient of a potential, when it is known that, at least formally, the process has an invariant measure whose density is exponential in this potential. Professor Stroock rigorises the conclusion, and also establishes rate-of-convergence results for passage to this equilibrium.
Chapter 8 (A little classical potential theory: The Dirichlet heat kernel; the Dirichlet problem; Poisson's problem and Green's functions; Green's potentials, Riesz decompositions and capacity). This is lovely stuff, which the author plainly handles with the keenest pleasure. It is, of course, again all very much centred on the Wiener process. For example, the Dirichlet problem concerns the solution of Laplace's equation $\nabla^2 u = 0$ in a region $D$ given the value of $u$ on the boundary of $D$. The 'probabilistic solution' $u(x)$ is the expectation over boundary values under the first-passage distribution to the boundary of a Wiener path originating at $x$. Professor Stroock is concerned with rigorous proof that this is indeed the valid and unique solution. One might assert that Wiener character plays a role here only in that $\nabla^2$ is the infinitesimal generator of the Wiener process and that the path stops on the boundary.
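The 'probabilistic solution' of the Dirichlet problem lends itself to a small numerical experiment. What follows is a minimal sketch and nothing taken from the book: it assumes the region is the unit disk, and instead of simulating a Wiener path it uses the walk-on-spheres shortcut (a Wiener path started at the centre of a circle first hits that circle at a uniformly distributed point). The function names, the stopping tolerance `eps`, and the sample size are illustrative choices.

```python
import math
import random


def walk_on_spheres(x, y, boundary_value, rng, eps=1e-3):
    """Return one sample of the boundary value seen by a Wiener path from (x, y).

    Assumption: the region is the open unit disk. Rather than simulating the
    path, jump to a uniform point on the largest circle centred at the current
    point that fits inside the disk, and stop within eps of the boundary.
    """
    while True:
        r = math.hypot(x, y)
        gap = 1.0 - r  # distance from the current point to the unit circle
        if gap < eps:
            # Project onto the boundary and read off the prescribed value.
            return boundary_value(x / r, y / r)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += gap * math.cos(theta)
        y += gap * math.sin(theta)


def dirichlet_estimate(x, y, boundary_value, samples=20000, seed=0):
    """Monte Carlo estimate of u(x, y) as an average over first-passage points."""
    rng = random.Random(seed)
    total = sum(walk_on_spheres(x, y, boundary_value, rng) for _ in range(samples))
    return total / samples


if __name__ == "__main__":
    # u(x, y) = x is harmonic, so the estimate at (0.3, 0.2) should be near 0.3.
    print(dirichlet_estimate(0.3, 0.2, lambda bx, by: bx))
```

The estimate carries the usual Monte Carlo error of order one over the square root of the sample size, plus a small bias from stopping within `eps` of the boundary; it illustrates the formula and is no substitute for the rigorous argument.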
Modulo rigour, the corresponding assertion for a Markov process with infinitesimal generator $A$ is that the solution $u(x)$ of $Au = 0$ in $D$ subject to prescription of $u$ outside $D$ is just the expectation of terminal $u$-value under the first-passage distribution of the Markov process to the exterior of $D$, starting from $x$. However, "modulo rigour" means "nullity" to Professor Stroock, and he asserts nothing that he cannot prove to the hilt. On the other hand, he does invoke Kac's intuitive insights; principally, that the process "does not feel the boundary" for the first few moments of its start from an interior point $x$. The treatment goes on to consider Poisson's equation (the driven form of Laplace's equation) and to develop a probabilistic view of the concepts of potential and capacity. This is again all very much Wiener-centred, presumably for definiteness, since (as Doob, Hunt, and others have demonstrated) these concepts have versions for much more general processes.
The dominating impression conveyed by the text is that a distinguished research worker has written the book that he wanted to write. Such a work cannot be anything other than strong, original, sincere, and thought-provoking. According to the author's own acknowledgements it is a "kinder and gentler" text than the version which Professor Diaconis first saw. In that case we owe a debt of gratitude to Professor Diaconis: the text is more remarkable for muscle than for grace, but has perhaps the more character for that. It is a first-class work of reference, for technique as well as for results and definitive theory; even as extended a review as this does not do justice to its detail and density. It will be a source of stimulation to those mature enough to appreciate its range and the richness of connections it establishes. It could certainly make a great graduate text if one regarded it as concentrating nourishment in the same way as does dried beef, from which one carves off a piece to be chewed and digested at leisure.
Reference
1. Mark Kac. On some connections between probability theory and differential and integral equations. Second Berkeley Symposium. University of California Press (1951), 189-215.
Statistical Laboratory
University of Cambridge
Cambridge CB2 1SB
U.K.
Billiards, A Genetic Introduction to the Dynamics of Systems with Impacts
by Valerii V. Kozlov and Dmitrii V. Treshchev
Providence: American Mathematical Society Translations of Mathematical Monographs No. 89, 1991. US $157.00, ISBN 0-8218-4550-0
Invariant Manifolds; Entropy and Billiards; Smooth Maps with Singularities
by Anatole Katok and Jean-Marie Strelcyn
New York: Springer-Verlag, 1986. US $42.00, ISBN 0-387-17190-8
Reviewed by Ya. B. Pesin
Generations of people have enjoyed billiards. It is an old game, known in India and China long before the birth of Christ. For example, in Shakespeare's Antony and Cleopatra, the Egyptian Queen liked to play billiards with her maid of honour. And the story goes that French King Charles IX was playing billiards when he heard a prearranged signal for the St. Bartholomew's Day massacre--the ringing of the bells of St. Germain Cathedral. Nowadays the game is even more popular. However, I should warn those readers seeking advice on winning that the books being reviewed will give them none. They are concerned entirely with the mathematics and mechanics of billiards and the great influence this game has exerted on mathematics and physics. Billiards is the first impact theory in physics. In mathematics, billiards offers a look at new and complicated geometries. An inexperienced reader might be tempted to exclaim, "Look, the billiard table is merely a rectangle. Although I was not the top student in my school geometry class I can easily solve all problems dealing with rectangles." But the analysis of billiards offers us a chance to witness the ability of the human intellect to question and to extend. Starting from rectangle billiards the inquisitive mind goes on to inquire, "What are triangle billiards like?" and "How about polygons or circles and ellipses?" "Can someone play multi-dimensional billiards or billiards where balls move along geodesic lines of some Riemannian metric?"
Even in ordinary billiards the geometry involved turns out to be not so simple as it may seem at first. In order to describe the position of a billiard ball, we need to know not only its position on the billiard table (given by two coordinates $x, y$) but also the direction of its motion (defined by a unit vector $v$). The triplet $(x, y, v)$ forms the "real coordinates" of the billiard ball. Thus its "real motion" is displayed in a 3-dimensional space of "coordinates and velocities" called the phase space of the billiards. The geometry of this space is much more complicated than the geometry of a plane rectangle. The main problem is to describe different types of billiard trajectories in the phase space. For example, one can ask, "Is there a closed trajectory?" If the answer is positive we can ask, "How many of them can we count?" and then, "Are they stable?" (this means that any trajectory starting near this periodic one tends to it with time). The last question can be crucial for any billiards player: if the periodic trajectory does not hit any billiard pockets and the player launches the ball in a direction close enough to the trajectory, then the ball will not fall into those billiard pockets. Perhaps expert players have to develop intuitive methods for finding stable trajectories. One might suspect that a good billiards player knows all the stable periodic trajectories but keeps them secret from mathematicians!
The analysis of billiards can be done in the framework of the theory of dynamical systems. The analysis is based on classical geometry but also draws extensively on results in number theory, topology, ergodic theory, and theoretical mechanics. Many of the methods used in these books, though quite elementary, allow very nontrivial conclusions.
Both of these books are monographs on the theory of billiards. However, right away I would like to emphasize the important difference between them. The first book can be considered an introduction to the theory of billiards. It is intended for an undergraduate knowing calculus and algebra. The second book is for readers familiar with the basic notions of dynamical systems and ergodic theory.
The main idea of the first book is expressed by the authors as follows: "The authors strive to clarify the genesis of the basic ideas and concepts of the theory of dynamical systems with impact interactions and also to demonstrate that they are natural and effective." The reader's attention is focused on the mechanics of billiards and the study of its stable trajectories. The authors show how to derive the mathematical laws of billiards from the well-known physical principles of impact theory. They point out that "An impact is a short-time interaction of bodies" such that "the positions of bodies do not change at the moment of impact, while their velocities acquire finite increments. Thus, a central feature of impact theory is finding the dependence between the velocities before and after impact." Classical billiards corresponds to absolutely elastic collisions; the trajectories in it obey the variational principles of Hamilton and Maupertuis. The main law is the law of reflection, which is expressed in the well-known maxim "the angle of incidence equals the angle of reflection." A detailed exposition of this theory can be found in the first chapter of the book. The next chapter is devoted to the problem of the number of geometrically distinct closed trajectories. The first result in this direction is due to Birkhoff: there exist at least two such trajectories if the billiard table is given by a smooth, closed, convex plane curve having nonzero curvature at every point (such billiards are called Birkhoff's billiards). The proof introduces the reader to interesting geometrical ideas and constructions in billiard theory. The authors also formulate some geometrical conditions for stability of Birkhoff closed trajectories. Chapter 5, devoted to integrable billiard systems, plays a special role. Completely integrable systems in mechanics are the simplest ones: they can be completely solved. A natural problem is to find all completely integrable billiards. An example has been known for a long time: elliptical billiards.
A reader will find a quite elementary geometrical description of this case as well as some others, and the authors are careful to convey the contemporary status of the problem. It has been conjectured that only billiards on elliptical tables are completely integrable. The authors deal with this problem in the last chapter of the book. The study of nonintegrable billiards is quite different and utilizes ideas and methods from the theory of dynamical systems and ergodic theory. Mathematicians have accomplished a great deal and the end seems to be somewhere only a little beyond the horizon. For the general theory of nonintegrable billiards one must read the second, more advanced book. It develops the ergodic theory of smooth maps with singularities having nonzero characteristic Lyapunov exponents with respect to a smooth invariant measure given by the Riemannian metric of the phase space. This sentence can serve as a test. If you are familiar with every notion in it, then you are ready to start reading the book. You will find 1) smooth nonuniformly-hyperbolic theory in its
modern guise; 2) one of the most general versions of the theory of local invariant manifolds; and 3) the complete description of ergodic properties of the systems specified in the above "test" sentence. The book contains much more: for example, formulas for entropy, and information about a number of periodic trajectories. The authors' methods work like long-range guns: they cover a large area where many classes of billiards are located. The most interesting among them are the dispersed billiards or Sinai billiards, introduced and studied by Ya. Sinai in 1970. Sinai's methods were "more geometrical" while the present authors propose a "more dynamical" approach. Throughout this book the predominance of dynamics over geometry allows greater generality. Both books are written by well-known mathematicians who have contributed a great deal to the field. The exposition in the books is very thorough, and in the second book highly demanding. The reader is rewarded with carefully formulated statements and detailed proofs.
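The phase-space description and the law of reflection discussed earlier in this review can be made concrete for the simplest table, a rectangle. The sketch below is an illustration only and does not come from either book: it assumes a rectangular table with perfectly elastic walls, stores the phase-space point as `(x, y, vx, vy)`, and uses exact rational arithmetic so that a closed trajectory can be recognised by exact equality. The names `time_to_wall` and `advance` are hypothetical.

```python
from fractions import Fraction


def time_to_wall(position, velocity, length):
    """Time until a coordinate moving at `velocity` reaches 0 or `length`."""
    if velocity > 0:
        return (length - position) / velocity
    if velocity < 0:
        return position / -velocity
    return None  # moving parallel to this pair of walls


def advance(state, duration, width=Fraction(1), height=Fraction(1)):
    """Follow a billiard ball in a rectangle for `duration` time units.

    `state` is the phase-space point (x, y, vx, vy). At each impact the
    velocity component normal to the wall changes sign and the tangential
    one is kept, which is the law of reflection for an elastic collision.
    Returns the final state and the number of impacts.
    """
    x, y, vx, vy = state
    remaining = duration
    impacts = 0
    while remaining > 0:
        tx = time_to_wall(x, vx, width)
        ty = time_to_wall(y, vy, height)
        step = min(t for t in (tx, ty, remaining) if t is not None)
        x, y = x + vx * step, y + vy * step
        remaining -= step
        if tx is not None and step == tx:
            vx, impacts = -vx, impacts + 1
        if ty is not None and step == ty:
            vy, impacts = -vy, impacts + 1
    return (x, y, vx, vy), impacts


if __name__ == "__main__":
    start = (Fraction(1, 4), Fraction(1, 2), Fraction(1), Fraction(1))
    # On the unit square this 45-degree trajectory closes up after two time
    # units, having struck each of the four walls once.
    print(advance(start, Fraction(2)))
```

On the unit square any trajectory with rational slope is periodic, which is why the example closes up; the interesting questions raised in the review (existence, number, and stability of closed trajectories) concern tables where no such shortcut is available.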
Department of Mathematics
Pennsylvania State University
University Park, PA 16802
USA
e-mail: [email protected]
Genetic Algorithms + Data Structures = Evolution Programs
by Zbigniew Michalewicz
Second Extended Edition, New York: Springer-Verlag, 1994. xvi + 340 pp. US $39.00, ISBN 3-540-58090-5
Reviewed by Stephen J. Hartley
Some People Receive Too Many Books
How I came to review this book for the Mathematical Intelligencer is amusing. I needed a text for an artificial intelligence special topics course in genetic algorithms that I was going to teach in the spring. I called several publishers for examination copies of various books, including Springer-Verlag. In a few weeks, all books arrived except Michalewicz's. After waiting another month, I called Springer-Verlag to see what had happened. They called UPS to verify that the book had arrived at Drexel. I sent electronic mail to all department members to determine whether someone had picked it up accidentally. Jet Wimp, Review Editor for this journal, gets many books from Springer-Verlag and realized he had taken it. When he returned the book to me, he asked me to review it. The focus of this review is to compare Michalewicz's book with the one I chose for the genetic algorithms course, Goldberg's popular text [3]. Will I switch books if I teach the course again? First, I'll give a brief introduction to genetic algorithms.
Genetic Algorithms are Inorganic
Genetic algorithms are used for search and optimization, for finding the maximum or minimum of a function. Instead of a deterministic search, as in hill-climbing or gradient methods, genetic algorithms use randomization. The members of the search space--for example, integers or real numbers in some domain--are encoded as bit strings. An initial population of bit strings is generated at random. Each bit string is called a chromosome. Each member of the initial population is evaluated for its fitness in solving the problem (maximizing or minimizing the function). A new population of candidate solutions to the problem is generated using three genetic operators: reproduction, crossover, and mutation. These are modeled on their biological counterparts. With probabilities proportional to their fitness, members of the population are chosen or selected for a new population. Pairs of chromosomes in the new population are chosen at random to exchange genetic material (bits) in a mating operation called crossover, resulting in two offspring. Bits are flipped at random, a procedure called mutation. The new population that is generated with these operators replaces the old population. The algorithm has performed one generation and then repeats for some specified number of additional generations. The population evolves, containing more and more highly fit chromosomes. When the convergence criterion is reached, such as no further increase in the average fitness of the population, the best chromosome is decoded into the solution (maximum or minimum) produced by the genetic algorithm for the problem. Genetic algorithms have been very successful at solving many types of problems, such as maximizing discontinuous, multimodal, multidimensional functions. They have also been used on discrete problems, including such combinatorial optimization tasks as the traveling salesperson, bin-packing, and job-shop scheduling.
Two excellent introductory articles are [1,2].
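The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Goldberg's SGA program and not code from Michalewicz's book: chromosomes encode non-negative integers, the fitness function is assumed non-negative, the run stops after a fixed number of generations instead of testing a convergence criterion, and the best chromosome seen so far is remembered separately. All names and parameter values are hypothetical.

```python
import random


def decode(chromosome):
    """Interpret a list of bits as a non-negative integer."""
    return int("".join(map(str, chromosome)), 2)


def genetic_algorithm(fitness, n_bits=5, pop_size=20, generations=30,
                      p_crossover=0.7, p_mutation=0.01, seed=0):
    """Maximize a non-negative `fitness` over n_bits-bit integers.

    Returns the decoded value of the best chromosome seen during the run.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=lambda c: fitness(decode(c)))
    for _ in range(generations):
        # Reproduction: select with probability proportional to fitness.
        # The small constant keeps the weights valid if every fitness is zero.
        weights = [fitness(decode(c)) + 1e-9 for c in population]
        selected = [c[:] for c in
                    rng.choices(population, weights=weights, k=pop_size)]
        # Crossover: neighbouring pairs swap tails at a random cut point
        # (selection order is random, so neighbours form random pairs).
        for i in range(0, pop_size - 1, 2):
            if rng.random() < p_crossover:
                cut = rng.randint(1, n_bits - 1)
                a, b = selected[i], selected[i + 1]
                a[cut:], b[cut:] = b[cut:], a[cut:]
        # Mutation: flip each bit independently with small probability.
        for c in selected:
            for j in range(n_bits):
                if rng.random() < p_mutation:
                    c[j] ^= 1
        population = selected  # the new population replaces the old one
        best = max(population + [best], key=lambda c: fitness(decode(c)))
    return decode(best)


if __name__ == "__main__":
    # Maximize x^2 over the integers 0..31 encoded as 5-bit strings.
    print(genetic_algorithm(lambda x: x * x))
```

Remembering the best chromosome outside the population is a convenience of this sketch; in the scheme described in the text the answer is read off the final population once the convergence criterion is met.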
I'd Rather Fight than Switch
I chose Goldberg's book as the text because it contains introductory material in addition to advanced topics, has problems and programming assignments at the ends of the chapters, and describes module-by-module a program called SGA implementing a simple genetic algorithm. Michalewicz's book also contains introductory material in Part I. However, the book seems to be written for those who already have some familiarity with genetic algorithms. Advanced operators are mentioned (PMX, OX, CX on page 86) but not defined until much later (page 218). The Banach fixed-point theorem is used to explain the convergence of genetic algorithms (pages 66-67). Part I could be used as the introductory material for a genetic algorithms class if the instructor provides exercises and problem sets. Part II is on numerical optimization and assumes a sophisticated mathematical background, for example in dynamic control (page 98) and nonlinear optimization (page 158). There are many minor, readily identifiable typographical and editing errors of the kind common to author-supplied camera-ready copy. Michalewicz's book is an excellent resource on genetic algorithms for the specialist. It concentrates on incorporating linear and nonlinear problem constraints into genetic algorithms. However, I would hesitate to use it instead of Goldberg as the text for an introductory class in genetic algorithms because of Michalewicz's mathematical sophistication and lack of exercises.
References
[1] David Beasley, David R. Bull, and Ralph R. Martin, An overview of genetic algorithms: part 1, fundamentals, University Computing 15, 2 (1993), 58-69.
[2] David Beasley, David R. Bull, and Ralph R. Martin, An overview of genetic algorithms: part 2, research topics, University Computing 15, 4 (1993), 170-181.
[3] David E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Reading, Mass: Addison-Wesley (1989).
Department of Mathematics and Computer Science
Drexel University
Philadelphia, PA 19104
USA
e-mail: [email protected]
website: http://www.mcs.drexel.edu/-shartley
Polynomials and Polynomial Inequalities
by Peter Borwein and Tamás Erdélyi
Graduate Texts in Mathematics Vol. 161
New York: Springer-Verlag, 1995. x + 480 pp. US $59.00, ISBN 0-387-94509-1
Reviewed by Jet Wimp
Several years ago I reviewed John and Peter Borwein's book, Pi and the AGM (Wiley-Interscience, 1987), and my praise of that book was unstinting. I find this book equally praiseworthy, and to dispel any suspicion that the authors and myself are in league, I will state for the record that I haven't seen the Borwein brothers for at least 10 years, and I've never met Tamás Erdélyi. Partly, it's a case of confluent sensibilities--the things that interest these authors interest me. Another factor is that all these authors write exceedingly well. Their prose is deft and uncluttered, and they organize their material in a way that could serve as a model to up-and-coming mathematical writers. Reading the present book or the "Pi" book, I was constantly energized, divided between
my desire to toss it aside and lavish my own research on the subject at hand and the desire to stay on the speeding train, wondering what on earth was going to appear around the next bend. Polynomials and Polynomial Inequalities is one of the best mathematical books in years.
Polynomials are the workhorses of analysis. A question that recurs on the Ph.D. written exams at Drexel is the following: Let $\phi \in L^1[a, b]$ and
$$\int_a^b t^n \phi(t)\,dt = 0, \qquad n = 0, 1, 2, \ldots.$$
Show that $\phi = 0$ a.e. If the examinee had me as instructor for the three-term real variable sequence, he or she knew exactly how to proceed. First, approximate $\phi$ by a continuous function and, then, using the Weierstrass approximation theorem, approximate that function by a polynomial. In my graduate courses I always emphasize the utility of polynomials in analysis. It's not that crucial results in approximation theory or interpolation theory can't be obtained any other way--they usually can. It's just that polynomials are often the slickest way to prove things. The books of Davis, Interpolation and Approximation [dav], and Achieser, Theory of Approximation [ach], contain many examples. This book is about equally divided between the traditional literature on polynomials in a single complex variable and more recent research, due to the authors, on Müntz generalized polynomials. I found the original material especially enjoyable.
Let me introduce some notation. $D$ and $\overline{D}$ will denote the open unit disk and its closure, respectively.
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0, \tag{1}$$
$$q(z) = a_n z^{\delta_n} + a_{n-1} z^{\delta_{n-1}} + \cdots + a_0 z^{\delta_0}, \qquad \delta_0 < \delta_1 < \cdots < \delta_n. \tag{2}$$
$q$ is called a Müntz polynomial. Many features of ordinary polynomials have Müntz polynomial analogs. The supremum norm of a function on a complex set $A$ is
$$\|f\|_A = \sup_{z \in A} |f(z)|.$$
It's quite remarkable that, given the work record of polynomials, few general accounts of their properties have appeared in book form. The only contender, E. J. Barbeau's similarly excellent Polynomials (1989),¹ harbors a different set of concerns. It is pitched at the undergraduate level. Although it too has an exhaustive and entertaining set of exercises, there is little overlap with the present book.
¹ Reviewed in Mathematical Intelligencer, Vol. 16, no. 2, pp. 78-79.
Of the remaining books on polynomials, some treat specialized polynomial sets--Stirling or Eulerian polynomials, for instance--and are little more than pamphlets. Some treat orthogonal polynomials, or the role of polynomials in approximation theory. In numerical analysis, estimating the location of the roots of polynomials is imperative for analyzing the growth of error in many algorithms and, in engineering problems, making decisions about the stability of dynamical processes. Yet, the only comprehensive book treatment is Marden's 1949 The Geometry of the Zeros of a Polynomial in a Complex Variable. My copy is so tattered and the gold lettering on the spine so effaced that I sometimes can't find it on my bookshelf. Much material in Marden is present here, including one of my favorites, the extraordinary theorem of Eneström-Kakeya:
Let all $a_k > 0$ in (1). Then all the zeros of $p$ lie in the annulus
$$r_1 := \min_k \frac{a_k}{a_{k+1}} \le |z| \le r_2 := \max_k \frac{a_k}{a_{k+1}}.$$
Before I get into details, let me indicate the plan of the book. There are seven chapters. Each chapter starts with a one-paragraph overview, a greatly effective organizing principle. Chapters are divided into sections; each section closes with a subsection on comments, exercises, examples, and historical observations. I appreciated the firm sense of historical grounding the Borweins displayed in "Pi," and the same disposition is at work here. Books that lack contact with this side of their subject always seem to me superficial, evanescent: we need to know where we've been to know where we're going, and we can convey the richness and vitality of mathematics only by attending to its social and historical dimensions. The text is not cluttered with proofs; usually these are left to the exercises (sometimes with appropriate hints), so that occasionally the book assumes the flavor of an encyclopedia, which is not at all a bad thing. It's what makes the book such compulsive reading. The authors organize and present the material in a way that emphasizes its usefulness. It's very much a working mathematician's book.
Now about the content of the individual chapters. Some of the results I knew of; some were surprising, others were more than surprising--they were astonishing. As we go along I'll display a few that caught my eye, with almost no commentary and, of course, no proofs. I just want to give an idea of the terrain of the book. We have all found things in mathematics to make us marvel. Often our reaction was one of wonderment: where do people get such ideas? There are many marvels in this book--pretty, or useful, or both. I'll concentrate on the pretty.
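The Eneström-Kakeya theorem quoted above is easy to try out numerically. The following is a rough sketch and not anything from the book: it computes the two radii from the coefficient ratios and compares them with roots found by the Durand-Kerner iteration, which is used here only because it fits in a few lines of pure Python. The function names and the sample polynomial are illustrative assumptions.

```python
def durand_kerner(coeffs, iterations=500):
    """Approximate all complex roots of a polynomial.

    `coeffs` lists a_0, a_1, ..., a_n (constant term first, a_n != 0).
    This is the Durand-Kerner simultaneous iteration, chosen only as a
    convenient pure-Python root finder for small examples.
    """
    n = len(coeffs) - 1
    monic = [c / coeffs[-1] for c in coeffs]

    def p(z):
        value = 0j
        for c in reversed(monic):  # Horner's rule
            value = value * z + c
        return value

    roots = [(0.4 + 0.9j) ** k for k in range(n)]  # customary starting points
    for _ in range(iterations):
        for i in range(n):
            denominator = 1 + 0j
            for j in range(n):
                if j != i:
                    denominator *= roots[i] - roots[j]
            roots[i] -= p(roots[i]) / denominator
    return roots


def enestrom_kakeya_annulus(coeffs):
    """Return (r1, r2): the min and max of a_k / a_{k+1}, all a_k > 0."""
    ratios = [coeffs[k] / coeffs[k + 1] for k in range(len(coeffs) - 1)]
    return min(ratios), max(ratios)


if __name__ == "__main__":
    coeffs = [1, 2, 3, 4]  # p(z) = 1 + 2z + 3z^2 + 4z^3
    r1, r2 = enestrom_kakeya_annulus(coeffs)
    for z in durand_kerner(coeffs):
        print(abs(z), r1 <= abs(z) <= r2)
```

For this cubic the ratios are 1/2, 2/3, and 3/4, so the theorem confines every zero to the annulus between radius 0.5 and radius 0.75.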
Chapter 1, the introduction, contains everything we need to know about polynomials to handle later chapters: the fundamental theorem of algebra; explicit solutions for quadratic, cubic, and quartic equations (You all knew these formulas existed. Did any of you know where to find them?); Newton's identities; norms; partial fractions; theorems on the location of zeros and critical points of polynomials; basic theorems from complex analysis. (I would like to have seen the Schur criterion, a recursive algorithm for deciding whether the zeros lie in the interior of the unit disk. It's a valuable tool for analyzing the stability of numerical integration schemes.¹) The preparation in this chapter for what follows is painstaking. There are no nebulous concepts--we always have what we need to understand what we're reading. One of the first results in this chapter to strike my fancy was the following, a consequence of a general theorem of Szegő:
If $\sum_{k=0}^{n} a_k z^k$ has all its zeros in $\overline{D}$, so does $\sum_{k=0}^{n} a_k z^k / \binom{n}{k}$.
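The Schur criterion mentioned in the parenthetical remark above can be sketched in a few lines. The sketch assumes the usual recursive form of the test: writing $\tilde{p}$ for the polynomial whose coefficients are the complex conjugates of those of $p$ in reverse order, $p$ has all its zeros in the open unit disk exactly when $|\tilde{p}(0)| > |p(0)|$ and the reduced polynomial $p_1(z) = [\tilde{p}(0)p(z) - p(0)\tilde{p}(z)]/z$ passes the same test. The function name `is_schur` and the tolerance are illustrative choices and are not taken from the book under review.

```python
def is_schur(coeffs, tol=1e-12):
    """Decide whether all zeros of p lie in the open unit disk.

    `coeffs` lists a_0, a_1, ..., a_n (constant term first) of
    p(z) = a_0 + a_1 z + ... + a_n z^n with a_n != 0.
    """
    coeffs = [complex(c) for c in coeffs]
    while len(coeffs) > 1:
        n = len(coeffs) - 1
        a0, an = coeffs[0], coeffs[n]
        # Condition (i): |p~(0)| > |p(0)|, that is, |a_n| > |a_0|.
        if abs(an) <= abs(a0) + tol:
            return False
        # p~ has coefficients conj(a_n), conj(a_{n-1}), ..., conj(a_0).
        reversed_conj = [c.conjugate() for c in reversed(coeffs)]
        # p_1(z) = [p~(0) p(z) - p(0) p~(z)] / z. The constant term of the
        # bracket cancels, so dividing by z just drops index 0.
        coeffs = [an.conjugate() * coeffs[k] - a0 * reversed_conj[k]
                  for k in range(1, n + 1)]
    return True  # a nonzero constant has no zeros at all


if __name__ == "__main__":
    print(is_schur([-0.15, -0.2, 1]))  # zeros 0.5 and -0.3
    print(is_schur([0.75, -2, 1]))     # zeros 0.5 and 1.5
```

Each pass lowers the degree by one, so the test costs on the order of $n^2$ arithmetic operations and never computes a root, which is what makes it attractive for stability checks.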
¹ Call a polynomial Schur if all its roots are in $D$. Define $\tilde{p}(z) = a_n^* + a_{n-1}^* z + \cdots + a_0^* z^n$, the $*$ indicating complex conjugation, and $p_1(z) = (1/z)[\tilde{p}(0)p(z) - p(0)\tilde{p}(z)]$. Then $p(z)$ is Schur if and only if (i) $|\tilde{p}(0)| > |p(0)|$ and (ii) $p_1(z)$ is Schur.
Chapter 2, Some special polynomials, begins with a discussion of Chebyshev polynomials $T_n(x)$ and $U_n(x)$. They occur in the most unexpected contexts, including the problem of functional iteration: Define $p^{[k]} = p(p^{[k-1]})$, $p^{[1]} = p$, see (1), and suppose the closure of $\{z \in \mathbb{C} : p^{[k]}(z) = 0 \text{ for some } k = 1, 2, \ldots\}$ is the interval $[-1, 1]$. Then $p(x) = \pm T_n(x)$. In the subsection on transfinite diameter, the important notions of the Fekete polynomial and logarithmic capacity for a complex set appear. Next, the authors discuss orthogonal polynomials on the real line, polynomials orthogonal with respect to an arbitrary measure, the Gram-Schmidt process, and best approximation. The section on $L_p$ spaces is a model of brevity. Section 2.3 is devoted to the classical orthogonal polynomials and their properties. There follows an exercise set on the moment problem. The celebrated theorem of Favard is relegated to an exercise! The chapter closes with a section on polynomials with non-negative coefficients.
Chapter 3 explores the vital role polynomials play in approximation theory: Chebyshev and Descartes systems, rational systems, Müntz polynomials. The authors define a very clever Müntz analog of the Legendre polynomials:
$$L_n(x) = \frac{1}{2\pi i} \int_\Gamma \prod_{k=0}^{n-1} \frac{t + \lambda_k^* + 1}{t - \lambda_k} \, \frac{x^t}{t - \lambda_n} \, dt, \qquad n = 0, 1, 2, \ldots,$$
where $\Gamma$ encloses all the poles of the integrand. $L_n$ is a polynomial of the kind (2), but with the powers unrestricted complex numbers, as we find by using residue calculus,
$$L_n(x) = \sum_{j=0}^{n} A_{n,j}\, x^{\lambda_j}, \qquad A_{n,j} = \frac{\prod_{k=0}^{n-1} (\lambda_j + \lambda_k^* + 1)}{\prod_{k=0,\, k \ne j}^{n} (\lambda_j - \lambda_k)}.$$
(The above formula requires that the $\lambda_k$ be distinct, but the case where they aren't is easily handled.) Quite surprisingly, these polynomials, like the Legendre polynomials--the case $\lambda_j = j$--are an orthogonal set. I found the following result highly dramatic when I encountered it in one of the authors' previous publications. I was happy to see it here:
Let $\{\lambda_k\}$ be a complex sequence, $\operatorname{Re}(\lambda_k) > -\frac{1}{2}$. Then
$$\int_0^1 L_n(x) L_m^*(x)\,dx = \frac{\delta_{mn}}{1 + \lambda_n + \lambda_n^*}.$$
It is well known that all the zeros of the Legendre polynomials lie in the open interval (-1, 1). Where are the zeros of the Müntz-Legendre polynomials? I won't spoil the fun for the reader. Read the book and find out.
In Chapter 4 we find a discussion of denseness properties of various approximation families, including the powers (essentially, Weierstrass's approximation theorem) and the generalized powers, $\{x^{\lambda_k}\}$ (essentially the classical result of Müntz). As I was browsing, my eye jumped to the following curious item:
$$\min_{a_i \in \mathbb{C}} \left\| 1 + \sum_{i=1}^{n} \frac{a_i}{z - \beta_i} \right\| = \prod_{i=1}^{n} |\beta_i|^{-1}.$$
Bernstein inequalities and the Paley-Wiener theorem finish the chapter. Chapter 5, Basic inequalities. Many polynomial inequalities are scattered throughout the mathematical literature, and the authors have performed a service to the mathematical community by gathering them together: the Remez, Bernstein, Markov, Schur inequalities among them. One of the most inscrutable is the Remez inequality:
$$\|p\|_{[-1,1]} \le T_n\!\left(\frac{2 + s}{2 - s}\right)$$
holds for every real polynomial $p$ of degree at most $n$ and $s \in (0, 2)$ satisfying $m(\{x \in [-1, 1] : |p(x)| \le 1\}) \ge 2 - s$. ($T_n$ is the Chebyshev polynomial and $m$ is the Lebesgue measure.) It has been said that great mathematics is always surprising. According to this criterion, the Remez inequality, coming out of the blue, surely qualifies for greatness. What could conceivably lead anyone to suspect that this result was true?
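One way to make the Remez inequality less mysterious is to look at a polynomial that attains it. The Python sketch below is an illustration, not anything taken from the book: it samples the rescaled Chebyshev polynomial $q(x) = T_n((2x + s)/(2 - s))$, which stays within $[-1, 1]$ exactly on $[-1, 1 - s]$ (a set of measure $2 - s$) and reaches $T_n((2 + s)/(2 - s))$ at $x = 1$. The function names and the grid size are assumptions chosen for the example.

```python
def chebyshev_t(n, x):
    """Evaluate T_n(x) with the recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)."""
    if n == 0:
        return 1.0
    t_prev, t_curr = 1.0, x
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def remez_check(n, s, samples=20001):
    """Sample q(x) = T_n((2x + s) / (2 - s)) on an even grid over [-1, 1].

    Returns (sup-norm estimate, estimated measure of {|q| <= 1}, Remez bound).
    """
    if not 0.0 < s < 2.0:
        raise ValueError("s must lie in (0, 2)")
    xs = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    values = [abs(chebyshev_t(n, (2.0 * x + s) / (2.0 - s))) for x in xs]
    sup_norm = max(values)
    # Share of grid points with |q| <= 1 (tiny tolerance for rounding),
    # scaled by the length of [-1, 1].
    measure = 2.0 * sum(v <= 1.0 + 1e-9 for v in values) / samples
    bound = chebyshev_t(n, (2.0 + s) / (2.0 - s))
    return sup_norm, measure, bound


if __name__ == "__main__":
    sup_norm, measure, bound = remez_check(5, 0.5)
    print(f"sup norm ~ {sup_norm:.6f}, measure ~ {measure:.4f}, bound = {bound:.6f}")
```

The measure is only a grid estimate, so it agrees with $2 - s$ up to the grid spacing; the sup norm and the bound agree because the maximum is attained at the endpoint $x = 1$.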
Among all this chapter's gorgeous results, I was particularly attracted to Chebyshev's inequality:
$$|p_n(y)| \le |T_n(y)| \cdot \|p_n\|_{[-1,1]}, \qquad y \in \mathbb{R} \setminus [-1, 1],$$
for $p_n$ a real polynomial, with equality if and only if $p_n = cT_n$. The authors generalize a bit in Chapter 6 by mentioning some inequalities for entire functions of exponential type. The chapter ends with weighted inequalities and inequalities for norms of factors. Many of these findings can be extended to Müntz spaces, where integer powers of the variable are replaced by general powers, and this is done in Chapter 6. Much of this material is due to the authors. In Chapter 7 the authors take on inequalities for rational function spaces. I was struck by an inequality on logarithmic derivatives:
Let $p$ be a real polynomial of degree $n$. Then
$$m\left(\left\{x \in \mathbb{R} : \frac{p'(x)}{p(x)} \ge \alpha\right\}\right) \le \frac{2n}{\alpha}.$$
There are five appendices. None of these contains material necessary for topics in the text. Instead, they are mini-essays, each devoted to a subject relevant to polynomials. The Borwein brothers in the book "Pi" expressed a steadfast concern about computability and algorithmic construction. The first appendix in the present book treats computability as the idea relates to polynomials: the fast Fourier transform, fast algebraic operations on polynomials, methods for localizing zeros. I was disappointed that the authors mention neither the Jenkins-Traub method nor the Lehmer-Schur search algorithm. These are the methods of choice for computing the complex zeros of polynomials. The Lehmer-Schur algorithm is a generalization of the bisection method to disks in the complex plane, and it always gets all the zeros and always converges linearly. The discussions of root finding both here and in Barbeau's book [bar] are unsatisfactory. In particular, Newton's method and its kin, touted for their elegance and their easy extension to operator equations, have no value as global methods for finding the zeros of complex polynomials. The reader who is concerned about effective methods for root finding should consult the recent survey article by McNamee [mcn] and the references given in [pre].
The other appendices treat orthogonality and rationality, interpolation, inequalities for $L_p$ polynomials, and constrained inequalities. There is an 18-page bibliography and a very helpful index of notation. The production values of the book are exceptional, just what we always expect from the Springer-Verlag tradition. The book can be used as a reference work or as a basis for a very imaginative graduate or even advanced undergraduate course.
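To return briefly to root finding: the Schur criterion stated in the footnote to this review translates directly into a short recursive test, and searches of the Lehmer-Schur type rely on disk tests of this general kind. The Python sketch below is only the test for the unit disk, not the search itself. It works in floating point, so polynomials with roots very close to the unit circle may be misclassified, and the function name `is_schur` is an assumption made for this example.

```python
def is_schur(coeffs):
    """Return True if p(z) = sum_k coeffs[k] * z**k has all its roots in |z| < 1.

    Uses the recursive criterion: with p~(z) = conj(a_n) + ... + conj(a_0) z^n,
    p is Schur iff |p~(0)| > |p(0)| and
    p_1(z) = [p~(0) p(z) - p~(z) p(0)] / z is Schur.
    """
    a = [complex(c) for c in coeffs]
    if not a or a[-1] == 0:
        raise ValueError("leading coefficient must be nonzero")
    while len(a) > 1:
        n = len(a) - 1
        lead_conj, const = a[n].conjugate(), a[0]  # p~(0) and p(0)
        if abs(lead_conj) <= abs(const):
            return False
        # Coefficient of z**k in p~(0) p(z) - p~(z) p(0). The k = 0 term
        # vanishes, so starting at k = 1 performs the division by z.
        a = [lead_conj * a[k] - const * a[n - k].conjugate()
             for k in range(1, n + 1)]
    return True  # a nonzero constant has no roots at all


if __name__ == "__main__":
    print(is_schur([-0.15, -0.2, 1]))  # (z - 0.5)(z + 0.3)
    print(is_schur([1, -2.5, 1]))      # (z - 0.5)(z - 2)
```

Each pass lowers the degree by one, so the test costs on the order of $n^2$ arithmetic operations for a polynomial of degree $n$.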
This superb book fills a need that was unaddressed for far too long. I predict it will become a classic.
References
[ach] N. I. Achieser, Theory of Approximation, Ungar Publishing, New York, 1956.
[bar] E. J. Barbeau, Polynomials, Springer-Verlag, New York, 1986.
[dav] P. J. Davis, Interpolation and Approximation, Blaisdell Publishing, Waltham, MA, 1963.
[mar] M. Marden, The Geometry of the Zeros of a Polynomial in a Complex Variable, American Mathematical Society, Providence, RI, 1949.
[mcn] J. M. McNamee, A bibliography on roots of polynomials, Journal of Computational and Applied Mathematics 47, 391-394 + floppy disk (1993).
[pre] W. H. Press, W. T. Vetterling, S. A. Teukolsky, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, 2nd ed., Cambridge University Press, Cambridge, England, 1992.
Department of Mathematics and Computer Science
Drexel University
Philadelphia, PA 19104
USA
e-mail: [email protected]
Robin Wilson*
Irish Mathematics
Raymond Flood and Robin Wilson
Irish mathematicians have featured on a number of stamps. We illustrate four of these here, covering three centuries.
George Berkeley (1685-1753) was a philosopher and clergyman who became Bishop of Cloyne in 1734. A vehement and highly competent critic of many aspects of Newtonian science, he sought to show that Isaac Newton's universe was constructed upon shaky foundations. In his 1734 book The Analyst, or a Discourse addressed to an infidel mathematician (generally thought to refer to Edmond Halley), he unleashed a devastating attack on the calculus of Newton and Leibniz. In particular, he argued cogently that Newton's method of fluxions was logically unsound, referring to derivatives as "ghosts of departed quantities." The Irish stamp below was issued in 1985 to commemorate the 300th anniversary of his birth.
Berkeley
Sir William Rowan Hamilton (1805-1865) was a child prodigy who mastered several languages (modern, classical and oriental) by the age of 14. While still a teenager he discovered an error in Laplace's Traité de Mécanique Céleste, and was appointed Astronomer Royal of Ireland while an undergraduate at Trinity College, Dublin. He carried out important theoretical work in geometrical optics and dynamics, and several concepts and results are named after him, such as the Hamiltonian function, Hamilton's principle, and the Hamilton-Jacobi equation. He also revolutionized algebra by his investigations into non-commutative systems. The two stamps below were issued in 1943 and 1983 to commemorate Hamilton's discovery of quaternions in 1843. We recall Dimitric and Goldsmith's "Mathematical Tourist" article, Mathematical Intelligencer vol. 11, no. 2, 29-30.
Éamon de Valera (1882-1975) was brought up on a farm in County Limerick. He became a teacher of mathematics in Dublin, where he increasingly became involved in republican circles. He was a commandant in the 1916 Easter uprising and narrowly escaped death by firing squad when the uprising was defeated. After independence he was an opposition leader, Prime Minister, and eventually President of Ireland. De Valera's lifelong interest in mathematics, particularly in celestial mechanics and quaternions, is shown in his letters from prison and in his founding one of the leading research institutions of Ireland, the Dublin Institute for Advanced Studies, with its three constituent Schools of Theoretical Physics, Cosmic Physics and Celtic Studies. On establishing the Institute in 1939, he said, "This is the country of Hamilton, a country of great mathematics; ... establishing a School of Theoretical Physics will again enable us to achieve a reputation in that direction comparable to the reputation which Dublin and Ireland had in the middle of the last century."
Raymond Flood Department of Continuing Education Kellogg College Oxford, OX1 2JA UK
Hamilton
Quaternions
de Valera
*Column editor's address: Faculty of Mathematics and Computing, The Open University, Milton Keynes, MK7 6AA, England.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York