The Mathematical Intelligencer encourages comments about the material in this issue. Letters to the editor should be sent to the reviews editor, Chandler Davis.
Brouwer and Hilbert
I read the Fall 1990 Mathematical Intelligencer with pleasure. In regard to Dirk van Dalen's article about the struggle between Brouwer and Hilbert, here is an elaboration of part of my autobiography in Mathematical People. After I left Göttingen in 1931, I spent 1932 and 1933 in Vienna. During that time I visited Holland. Among the people I saw there was Professor H. Freudenthal. He took me to the house of Professor Brouwer, passing by the pharmacy owned by Mrs. Brouwer. Brouwer received me very politely, but when his wife asked whether she should serve tea, he hesitated. We got on well and a little later he asked her to bring tea. Since I had spent 1930-31 editing the number-theoretical papers of Hilbert, Brouwer and I had a lot to talk about. Brouwer said that he thought many of Hilbert's papers were not entirely his own (having in mind, for example, the famous walks with Minkowski). However, Brouwer said there was one paper of Hilbert's that Brouwer was sure was Hilbert's own: the solution of Waring's Problem, for Hilbert wrote this when he was a guest in Brouwer's house.

Olga Taussky Todd
Department of Mathematics
California Institute of Technology
Pasadena, CA 91125 USA
Formal Systems
Although I enjoyed and mostly agreed with James Henle's article, "The Happy Formalist" (Mathematical Intelligencer, Winter, 1991) I disagree with what he calls the formalist thesis: ". . . rather like Church's thesis, [it] is simply that all of pure mathematics can be imbedded in formal systems." Unlike Church's thesis, I believe this to be false. As a counterexample, I propose the thesis that there is no way to formalize in a satisfactory way the distinction between "finite" and "infinite." Oh, sure, it is possible to formalize the concept, "in 1-1 correspondence with the set of the positive integers ≤ n, for some integer n," but this is inadequate, because however one formalizes the concept of "the integers," there are nonstandard models in which there are integers that are infinite in the generally accepted sense of the word. Nor can the formalism distinguish between these and the truly finite integers; the set of finite integers is not a member of any of these models. [I wonder how many devotees of nonstandard calculus reveal this embarrassing fact to their students, that the set of finite integers has no meaning in their system! True, they speak of the set of "standard integers," but that is not the same thing: any formalism of the concept "standard integer" must itself allow for models with infinite standard integers.]

Later in the article, Henle writes that the existence of nonstandard models made him doubt the existence of mathematical truth, but wisely dissociates this doubt from his definition of formalism. For me they had the opposite effect: they helped impress on me the distinction between the proven (or even provable) and the true. To take a colorful illustration: it is quite conceivable that there are nonstandard models in which the Twin Primes Conjecture holds and others in which it fails to hold, and that we can even construct such models without knowing the truth or falsehood of the conjecture for the aggregate of finite integers. This could be done by proving that in one model, given any integer n, there is an infinite integer k > n such that both k and k + 2 are prime, while in another,
THE MATHEMATICAL INTELLIGENCER VOL. 13, NO. 3 © 1991 Springer-Verlag New York
there is an infinite integer n such that no such k exists. All this would mean is that the present ZFC axioms, which capture all that seems to us at the present time to be true about sets (even at the cost of admitting some controversial axioms), are inadequate for deciding the truth or falsehood of the conjecture for the finite integers, which is what number theorists refer to when speaking of the conjecture. It would have no bearing on whether this conjecture is true or false; and I will gladly defend the claim that this conjecture really is either true or false, quite apart from whether our minds (or indeed any mind anywhere) can prove or disprove it. The thesis that it is impossible to unambiguously state this conjecture in any formal system makes it all the more fascinating.
Peter J. Nyikos
Department of Mathematics
University of South Carolina
Columbia, SC 29208 USA
Wrong Decade

I enjoyed Steven Krantz's article Mathematical Anecdotes (Mathematical Intelligencer, vol. 12, no. 4) very much. He writes that in "the 1930s and 1940s, a theorem was 'true in the sense of Cartan' if Grauert could not find a counterexample in the space of an hour." Perhaps in the 1950s; Hans Grauert was born on 8 February 1930.
Hanfried Lenz
Mathematics Institute
Free University of Berlin
D-1000 Berlin 33, Germany

Steven Krantz replies: Thank you for this correction. I should have known better.
Karen V. H. Parshall*
Sixty Years After Gödel

Gregory H. Moore

It is now 60 years since Kurt Gödel published his remarkable discoveries on incompleteness. These discoveries changed the face of mathematical logic and, in large measure, discredited two of the three philosophies of mathematics that were prominent at the time: Hilbert's formalism and Russell's logicism. Hilbert's program was discredited by showing that to prove an axiom system consistent requires a still stronger system. Logicism was discredited by showing that no axiom system for logic, strong enough to do arithmetic, could prove all and only the true propositions expressible in the system. A decade ago, the incompleteness theorems finally reached the general public through Hofstadter's Pulitzer Prize-winning book, Gödel, Escher, Bach.

Gödel's discoveries also helped to set in motion the clear separation of mathematical logic from the philosophy of mathematics. When Gödel's incompleteness paper [10] was published in 1931, articles on mathematical logic were often treated as part of philosophy. At that time the two main abstracting journals, the venerable Jahrbuch über die Fortschritte der Mathematik and the brand new Zentralblatt für Mathematik und ihre Grenzgebiete, routinely reviewed articles on logic in their sections on philosophy.¹ Gödel's paper, although its reviewers considered it important, did not receive lengthy reviews, as compared with other papers of the time.
Gödel's first announcement of his results had appeared in 1930 in [9]. It bore the title "Some Metamathematical Results on Completeness and Consistency." In it he stated that if we take the logic of Russell's Principia Mathematica (the theory of types) together with the Peano Postulates, we obtain a formal system S that is not complete. That is, there is a proposition P that is neither provable nor refutable in the system S. Moreover, there is still such a P in any system T obtained from S by adding finitely or infinitely many new axioms, provided that T is consistent and that it is metamathematically "decidable" whether each formula of T is, or is not, an axiom of T.² This result came to be known as the First Incompleteness Theorem. He also stated the Second Incompleteness Theorem: The system S cannot be proved to be consistent by the methods available in S, and an analogous
¹ When Mathematical Reviews began in 1940, articles on mathematical logic were in the section called "Foundations," whereas set theory was in a separate section under analysis. That scheme remained in effect for decades.
² Strictly speaking, the requirement was ω-consistency, a technical matter that we omit here. Rosser reduced the requirement to consistency in [30].
* Column Editor's address: Departments of Mathematics and History, University of Virginia, Charlottesville, VA 22903 USA.
result is true for any of its consistent extensions T. In other words, the consistency of S can be proved only in a formal system that is stronger than S.

In order to understand why these two incompleteness theorems are important, let us consider what mathematical logic was like in the previous decade.³ During the 1920s there were communities of mathematical logicians in several countries, including the United States, Germany, and Austria. These communities differed in their professional loyalties, some being mathematicians and others philosophers.

At that time there was no generally accepted system of mathematical logic. In Germany, A. Fraenkel wrote in 1922 of "the uncertainty of general logic," no doubt having the paradoxes in mind ([6], 101). At the end of the 1920s Zermelo observed, regarding his 1908 axioms for set theory, that "a generally recognized 'mathematical logic,' to which I could have referred, did not exist then--any more than today, when every foundational researcher has his own logistical system" ([33], 340). And in the United States, O. Veblen noted in 1925: "The fact is that there does not exist an adequate logic at the present time, and unless the mathematicians create one, no one else is likely to do so" ([32], 141). Nevertheless, the most common system of logic during the 1920s was Russell's theory of types.

In the United States, one such community of logicians has sometimes been called the American Postulate Theorists.⁴ The members of this loose group included B. A. Bernstein at Berkeley, E. V. Huntington at Harvard, E. H. Moore at Chicago, and O. Veblen at Princeton. They formulated sets of axioms for various concepts, such as that of Boolean algebra and that of real number. Their logical concerns were particularly that a set of axioms be satisfiable (i.e., have a model) and that they be independent; they also intended that certain sets of axioms be categorical (i.e., have a unique model up to isomorphism).
Often they were concerned with Moore's notion of the complete independence of an axiom system. By and large, they did not specify any "underlying logic," i.e., the logic in which their axiom systems were to be formulated.⁵ During the first three decades of the twentieth century, members of this group frequently published articles on axiom systems in the Bulletin and the Transactions of the American Mathematical Society, and occasionally in the Annals of Mathematics and the American Journal of Mathematics. One of them (Veblen) was instrumental in bringing Gödel to the United States, first
³ A thorough treatment of this topic, including matters beyond the scope of the present article, can be found in [14].
⁴ See M. Scanlan's article "Who were the American Postulate Theorists?", forthcoming in the Journal of Symbolic Logic.
⁵ Although Gödel was not particularly influenced by the American Postulate Theorists, he was aware of their work and reviewed some of it (cf. [11]).
in 1933 on a temporary basis and later on a permanent one.⁶

Like most logicians before 1930, the American Postulate Theorists did not distinguish clearly between syntax and semantics, i.e., between notions about the formal language used (e.g., notions such as consistency, proof, and theorem) and notions about the mathematical objects themselves (e.g., notions such as satisfiability, definability, and truth). For example, Huntington in [21] did not distinguish between the consistency of a set of axioms (i.e., there is no formal proof of "P and not P" for some proposition P) and its satisfiability.

Yet the community most influential during the 1920s was not found in the United States but at Göttingen. It consisted of Hilbert and his fellow researchers: W. Ackermann, P. Bernays, and J. von Neumann. The roots of Hilbert's concern with mathematical logic are found, already in 1899, in his research on the foundations of geometry [15]. There he emphasized three concepts that would concern him over the next three decades: consistency, independence, and completeness. In his Paris address of 1900 he had expressed a fourth, his "conviction of the solvability of every mathematical problem,"⁷ and this later led to his Entscheidungsproblem, or Decision Problem.

The logicians of Hilbert's group, which he began to assemble in 1917, had a clear grasp of the difference in logic between syntax and semantics. This distinction would be vital to the incompleteness theorems. The notions of consistency and independence were clear at the turn of the century, but the notions of completeness and decidability were not. These latter two notions were clarified in Hilbert and Ackermann's 1928 textbook on logic, Grundzüge der theoretischen Logik [18], which strongly influenced Gödel's approach and choice of problems.
In it, completeness for first-order logic (where quantifiers can only range over individuals, not over sets) meant that each first-order formula valid in every domain of individuals is provable from the logical axioms.⁸ The Entscheidungsproblem asked for a procedure that, given any formula P of logic, would determine in finitely many steps whether or not P is valid or satisfiable. Hilbert and Ackermann regarded the Decision Problem as the most important problem in mathematical logic ([18], 77).

A third community was at Vienna, where Gödel was educated. There the philosophers R. Carnap and L. Wittgenstein had already published on mathematical logic, and the mathematicians H. Hahn and K. Menger were quite interested in the subject. Hahn
⁶ On Gödel's life, see [29].
⁷ See [16], 443.
⁸ On the historical emergence of first-order logic from various stronger logics, see [27].
was Gödel's thesis supervisor. When Gödel submitted his 1931 paper on incompleteness as his Habilitationsschrift, Hahn evaluated it as "a scientific achievement of the first rank . . . . Today Dr. Gödel is already the principal authority in the field of symbolic logic and research on the foundations of mathematics."⁹

Gödel was also strongly influenced by a community of logicians that scarcely existed any more when he got his doctorate in 1930: the English logicians. From 1910 to 1913, Russell and Whitehead had published three massive volumes of Principia Mathematica (the most extensive treatise of symbolic logic ever to see print), and its second revised edition appeared in the mid-1920s. After that, Russell and Whitehead did no more work in mathematical logic.

For Russell, as for Frege before him, logic was all-inclusive. It was impossible to stand outside logic, and so it was also impossible to ask any metamathematical questions, such as whether logic was consistent. Such questions could only be dealt with empirically; if, for example, no contradiction had been found in logic thus far, then logic was (provisionally) consistent. But Russell, while he did not share Hilbert's preoccupation with consistency, did believe in logicism. This was the philosophical thesis that all the concepts of mathematics can be expressed in mathematical logic and that all the theorems of mathematics can be proved using only the axioms and rules of inference of mathematical logic. Gödel had grave doubts about logicism, just as he did about Hilbert's formalism.

There is an intimate connection between the results of Gödel's dissertation and his incompleteness theorems. Both had their origin in Hilbert's program to establish the consistency and completeness of classical mathematics. Hilbert's plan of attack was to work his way up.
First, the consistency and completeness of the axioms for number theory must be established, then the consistency and completeness of the axioms for the real numbers, and finally the consistency and completeness of set theory.

In 1928, when he spoke in Bologna to the International Congress of Mathematicians, Hilbert was at the height of his power. He was about to successfully eliminate L. E. J. Brouwer from the editorial board of Mathematische Annalen,¹⁰ and he had overcome Brouwer's attempt to keep German mathematicians from participating in an international congress for the first time since World War I. Moreover, Hilbert mistakenly believed that Ackermann had established the consistency of number theory.¹¹ In his Bologna address Hilbert listed five problems to be solved as the

⁹ Hahn in [29], 350. Translated by the author.
¹⁰ See [31].
¹¹ Only after Gödel published his incompleteness theorems in 1931 did Hilbert come to realize that Ackermann's proof did not establish the consistency of all of number theory.
next steps in carrying out his program, and then concluded with a problem not on his list: "The question, put in a general form, of establishing the completeness of the system of logical rules is a problem of theoretical logic. So far, we have only attained the conviction, by testing these rules, that they are sufficient."¹² Hilbert then referred to Bernays' earlier proof of the completeness of the lowest level of logic, namely propositional logic.

Gödel's dissertation was devoted to answering Hilbert's question for the next level of logic, viz., first-order logic. Taking first-order logic as presented in Hilbert and Ackermann's book [18], Gödel showed that first-order logic is complete, i.e., each valid formula is provable. Moreover, he showed that, in first-order logic, every consistent axiom system has a model, thereby justifying, in part, a dogma that Hilbert had upheld for three decades.

Of particular relevance to the incompleteness theorems, however, are some remarks at the beginning of Gödel's thesis [7]. There he observed that, to take consistency to be the sole criterion for the existence of a model, as Hilbert appeared to do, presupposed that we cannot establish the unsolvability of some problem. For if the unsolvability of some problem (in the real numbers, say) could be shown, then there would be two nonisomorphic models of the axioms for the real numbers; but, he noted, we can prove that any two models of those axioms are isomorphic. Finally, Gödel carefully noted that he had not shown the unsolvability of some problem. Gödel omitted these remarks when he published the thesis in [8], perhaps due to his extreme caution. But they reveal that the subject of incompleteness was on his mind even while he was establishing the completeness of first-order logic.

The reaction to Gödel's incompleteness theorems was quite diverse. His first public announcement of the First Incompleteness Theorem occurred in September 1930 at a conference in Königsberg on the foundations of mathematics. Although Gödel only mentioned the result in passing, von Neumann was intrigued by it and wrote to Gödel in November to announce an important corollary to the result: the unprovability of consistency. But Gödel had already discovered this Second Incompleteness Theorem himself, and had sent his 1931 paper to the publisher several days before von Neumann's letter arrived.¹³

Word of Gödel's results spread before his paper appeared in print in January 1931. Around Christmas 1930, Bernays, who had learned of these results from Courant, wrote to Gödel requesting a copy of the galley proofs of his paper. After receiving them, Bernays wrote to Gödel in mid-January that the results were "an important step forward in research on foundational problems."¹⁴ In a letter of mid-April, Bernays called these results "surprising and significant."

¹² See [17], 140. Translated by the author.
¹³ See [4], 256.
¹⁴ Bernays' letters to Gödel are found among the latter's papers kept at Princeton University.

Gödel met the first real opposition to his incompleteness results when he spoke on them to the Deutsche Mathematiker-Vereinigung in September 1931. Zermelo, who had heard Gödel's talk, quickly wrote him about "an essential gap" in the proof of the First Incompleteness Theorem. This "gap" was really a contradiction that Zermelo saw in the proof, namely that a certain proposition was neither true nor false.¹⁵ Gödel replied in October, pointing out the source of the error in Zermelo's argument: the notion of truth was not expressible in the formal system S that Gödel was using, or in any other formal system able to express addition and multiplication of positive integers.¹⁶ Gödel also noted that the existence of classes not definable in S followed more simply from a cardinality argument. But he added:
I do not consider the essential point of my result to be that we can somehow go beyond each formal system . . . . but rather that for each formal system of mathematics there are propositions which can be expressed within this system but which cannot be decided by the axioms of this system and that these propositions even have a relatively simple form since they belong to the theory of positive integers. The fact that the whole of mathematics cannot be captured in a formal system already follows from Cantor's diagonal argument, but it still remained conceivable that at least certain subsystems of mathematics could be formalized completely. My proof shows that even this is impossible if the subsystem includes at least the notions of addition and multiplication of integers.¹⁷

Zermelo's reply later in October shows how greatly his perspective differed from Gödel's. Zermelo came back to the cardinality argument for undecidable propositions: there are uncountably many propositions but only countably many provable propositions. This situation arose, he believed, because proof was restricted to finitistic methods. He proposed that this situation could be overcome by using a more general "schema" of proof. What he had in mind was clear from his later articles: an infinitary logic, in which there were infinitely long sentences and rules of inference with infinitely many premises.¹⁸ In such a logic, he insisted, "all propositions are decidable!" But the time was not yet ripe for such an infinitary logic, which, in a more restricted form and without escaping incompleteness, became part of mathematics in the mid-1950s through the work of L. Henkin, C. Karp, and A. Tarski.¹⁹

Gödel did not get a positive reception from Hilbert
¹⁵ Zermelo's letter and its translation can be found in [5], 69.
¹⁶ Here Gödel had discovered the undefinability of truth independently of Tarski, who is usually credited with it.
¹⁷ Translated by the author. Gödel's letter and Zermelo's reply are found in [13], 298-302.
¹⁸ This exchange between Gödel and Zermelo is discussed in more detail in [26], 124-128.
¹⁹ For a discussion of the prehistory of infinitary logic, see [1], [26], and [28].
either. When Hilbert returned to logic in 1934, he still insisted on

the final goal of knowing that our customary methods in mathematics are totally consistent. Concerning this goal, I would like to stress that the view temporarily widespread, that certain recent results of Gödel imply that my proof theory is not feasible, has turned out to be erroneous. In fact, those results show only that, in order to obtain an adequate proof of consistency, one must use the finitary standpoint in a sharper way than is necessary in treating the elementary formalism.²⁰

What Hilbert may have had in mind was G. Gentzen's new consistency proof for arithmetic. But this proof worked by encoding formal proofs via countably infinite ordinals, a procedure that was not convincingly finitary.

Gödel found a happier reception in Princeton. In the fall of 1931, von Neumann gave a talk at the Princeton mathematics colloquium. He chose to speak, not on his own work, but on Gödel's incompleteness theorems. As a result, Gödel's 1931 paper was incorporated into the course that A. Church was then giving on logic, a course devoted to presenting his new formal system.²¹ At that time, Church had two graduate students, S. C. Kleene and J. B. Rosser, who aided Church in exploring his system for logic. The later work of both Kleene and Rosser was heavily influenced by the incompleteness theorems.

Church believed that his system could escape from Gödel's incompleteness results, but a worse fate awaited it: it was proved inconsistent.²² But what was salvaged from the system became the λ-calculus. This calculus, Church was convinced, contained all and only the effectively calculable functions, a claim that became famous as "Church's thesis." Gödel, who had come to Princeton in the fall of 1933, was not inclined to accept Church's thesis at the time.
Nor was Gödel inclined to believe that his own general recursive functions, presented in his Princeton lectures of 1934, necessarily captured the intuitive notion of all effectively calculable functions. Gödel only accepted Church's thesis after A. Turing's work appeared in 1937, using the notion of Turing machine to give a conceptual analysis of computability.

What were the long-term effects of Gödel's completeness and incompleteness theorems? One, as we have just seen, was to encourage research on computability and on the undecidability of various axiom systems. Thus, in particular, the undecidability of the Halting Problem was a natural extension of his work.
²⁰ Translated by the author from [19], v.
²¹ See [23], 52.
²² One inconsistency in the 1932 system was found by Church [2], who then modified it to obtain his 1933 system [3]. But an inconsistency was found in the new system by Kleene and Rosser in their [24].
Another effect was to downplay the theory of types and second-order logic (in which we can quantify over sets), since these logics are incomplete. In a complementary fashion, the completeness of first-order logic gradually became the ideal. Other logics were seen as "nice" only to the extent that they embodied completeness, despite the fact that first-order logic cannot express so simple a notion as finiteness.

It is clear that Gödel's completeness and incompleteness results substantially shaped mathematical logic in the decades after 1931. Yet, even today, as G. Kreisel remarked in his obituary of Gödel, the incompleteness theorems have not changed the conception of mathematics held by the majority of mathematicians.²³ Only in set theory and general topology (and to a lesser extent, in algebra) have the incompleteness results played a major role and helped to clarify the need for new axioms. Perhaps the incompleteness theorems will only be truly appreciated by non-logicians when some central problem of analysis, such as the Riemann Hypothesis, is shown to be undecidable on the basis of the accepted axioms. But that day has not yet come.
²³ See [25], 149.

References

1. Jon Barwise, Infinitary logics, Modern Logic - A Survey (E. Agazzi, ed.), Dordrecht: D. Reidel (1980), 3-112.
2. Alonzo Church, A set of postulates for the foundation of logic, Annals of Mathematics 33 (1932), 346-366.
3. Alonzo Church, A set of postulates for the foundation of logic (Second Paper), ibid. 34 (1933), 839-864.
4. John W. Dawson, Jr., The reception of Gödel's incompleteness theorems, PSA 1984: Proceedings of the 1984 Biennial Meeting of the Philosophy of Science Association (East Lansing, MI: Philosophy of Science Association) 2 (1985), 253-271.
5. John W. Dawson, Jr., Completing the Gödel-Zermelo correspondence, Historia Mathematica 12 (1985), 66-70.
6. Abraham A. Fraenkel, Zu den Grundlagen der Mengenlehre, Jahresbericht der Deutschen Mathematiker-Vereinigung 31, Angelegenheiten (1922), 101-102.
7. Kurt Gödel, Über die Vollständigkeit des Logikkalküls (doctoral dissertation); submitted (1929); published and translated in his [12].
8. Kurt Gödel, Die Vollständigkeit der Axiome des logischen Funktionenkalküls, Monatshefte für Mathematik und Physik 37 (1930), 349-360.
9. Kurt Gödel, Einige metamathematische Resultate über Entscheidungsdefinitheit und Widerspruchsfreiheit, Anzeiger der Akademie der Wissenschaften in Wien 67 (1930), 214-215.
10. Kurt Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I, Monatshefte für Mathematik und Physik 38 (1931), 173-198.
11. Kurt Gödel, Review of [21], Zentralblatt für Mathematik und ihre Grenzgebiete 2 (1933), 146.
12. Kurt Gödel, Collected Works, vol. 1 (Solomon Feferman, John W. Dawson, Jr., Stephen C. Kleene, Gregory H. Moore, Robert M. Solovay, and Jean van Heijenoort, eds.), New York: Oxford University Press (1986).
13. Ivor Grattan-Guinness, In memoriam Kurt Gödel: his 1931 correspondence with Zermelo on his incompletability theorem, Historia Mathematica 6 (1979), 294-304.
14. Ivor Grattan-Guinness, On the development of logics between the two world wars, American Mathematical Monthly 88 (1981), 495-509.
15. David Hilbert, Grundlagen der Geometrie, Leipzig: Teubner (1899).
16. David Hilbert, Mathematical Problems. Lecture Delivered before the International Congress of Mathematicians at Paris in 1900, Bulletin of the American Mathematical Society 8 (1902), 437-479.
17. David Hilbert, Probleme der Grundlegung der Mathematik, Atti del Congresso internazionale dei matematici, Bologna 3-10 settembre 1928, 1 (1929), 135-141.
18. David Hilbert and Wilhelm Ackermann, Grundzüge der theoretischen Logik, Berlin: Springer (1928).
19. David Hilbert and Paul Bernays, Grundlagen der Mathematik, vol. 1, Berlin: Springer (1934).
20. Douglas R. Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid, New York: Basic Books (1979).
21. Edward V. Huntington, A complete set of postulates for the theory of absolute continuous magnitude, Transactions of the American Mathematical Society 3 (1902), 264-279.
22. Edward V. Huntington, A new set of independent postulates for the algebra of logic with special reference to Whitehead and Russell's Principia Mathematica, Proceedings of the National Academy of Sciences, U.S.A. 18 (1932), 179-180.
23. Stephen C. Kleene, Origins of recursive function theory, Annals of the History of Computing 3 (1981), 52-67.
24. Stephen C. Kleene and J. Barkley Rosser, The inconsistency of certain formal logics, Annals of Mathematics (2) 36 (1935), 630-636.
25. Georg Kreisel, Kurt Gödel, 28 April 1906 – 14 January 1978, Biographical Memoirs of Fellows of the Royal Society 26 (1980), 148-224.
26. Gregory H. Moore, Beyond first-order logic: The historical interplay between mathematical logic and axiomatic set theory, History and Philosophy of Logic 1 (1980), 95-137.
27. Gregory H. Moore, The emergence of first-order logic, History and Philosophy of Modern Mathematics (William Aspray and Philip Kitcher, eds.), Minneapolis: University of Minnesota Press (1988), 95-135.
28. Gregory H. Moore, Proof and the infinite, Interchange 21 (1990), 46-60.
29. Gregory H. Moore, Gödel, Kurt, Dictionary of Scientific Biography, Supplement II (C. C. Gillispie, ed.), New York: Charles Scribner's Sons (1990), 348-357.
30. J. Barkley Rosser, Extensions of some theorems of Gödel and Church, Journal of Symbolic Logic 1 (1936), 87-91.
31. Dirk van Dalen, The war of the frogs and the mice, or the crisis of the Mathematische Annalen, The Mathematical Intelligencer 12 (4) (1990), 17-31.
32. Oswald Veblen, Remarks on the foundations of geometry, Bulletin of the American Mathematical Society 31 (1925), 121-141.
33. Ernst Zermelo, Über den Begriff der Definitheit in der Axiomatik, Fundamenta Mathematicae 14 (1929), 339-344.
Department of Mathematics and Statistics
McMaster University
Hamilton, Ontario L8S 4K1
Canada
THE MATHEMATICAL INTELLIGENCER VOL 13, NO 3, 1991
The Opinion column offers mathematicians the opportunity to write about any issue of interest to the international mathematical community. Disagreement and controversy are welcome. An Opinion should be submitted to the reviews editor, Chandler Davis.
The Information in Your Hand
Jack Cohen and Ian Stewart

Hand as Information. Your hand is designed according to certain instructions coded up in your DNA. The length of these instructions gives a measure of the amount of information in your hand.
Rudy Rucker, Mind Tools

Each age interprets its universe in terms of what is currently important to it. When ancient animistic Man wanted to make sense of the starry sky, she saw it as a zoo of people and animals: the Hunter, the Swan, the Lion, the Dog. The Mechanical Age of the 18th Century bred a mechanistic philosophy, the clockwork universe, with God as the watchmaker who set the wheels spinning and then stood back to watch his creation turn. Our present Computer Age sees the universe as an ever-changing flow of information. If we were to discover the stars today, our first instinct would be to try to decode their message. So when, in the Computer Age, Crick and Watson
stumbled across the double helix of DNA and its aperiodic sequence of nucleotides, it was inevitable that DNA would be seen as a "program" or "code" that contained the "genetic information" needed to make you and me. Indeed, it was a major breakthrough, perhaps the major breakthrough, of this century to decipher the "genetic code" whereby triples of nucleotides specify protein structure. From such a viewpoint DNA is a genetic message transmitted from parent to offspring, a list of instructions, like a glorified knitting pattern. And, just as we can look at a knitting pattern and see which part of it governs the design of the neckline or the armhole, we imagine that if only we were clever enough we could look at the DNA pattern and see which part of it governs the design of a neck, or an arm. Or a hand.

And of course, the more complicated the sweater we want to produce, say one with an intricate lacy three-dimensional effect looking like butterflies on a background of bulrushes, the more information the knitting pattern must provide. So the longer the DNA sequence is, the more complicated must be the part of the organism that it contains the instructions for, and the more information that part must "contain."

It's a picture of DNA as the Book of Life. You can imagine thumbing the pages of the genetic handbook, looking for the Sentence that produces hemoglobin, the Paragraph that produces a blood cell, the Chapter that produces an artery, even the Appendix that produces an appendix. The Book of Life image is often explicit in the sales pitch for the self-proclaimed Great Project of sequencing the human genome. It is the world-view of the SF writer Tom Easton's "gengineer" stories, in which you can tear out the pouch-page from the Book of the Kangaroo and glue it into the Book of the Albatross to get air mail.

Above all it's a picture of information as data-string: the longer the sequence of instructions, the more information it contains. So, of course, because there are more letters in "quadruped" than in "dog," the message "Fido is a quadruped" contains more information than "Fido is a dog." And since the DNA of a mammal contains fewer nucleotides than that of an amphibian, and some amoebae contain a hundred times as much DNA as either, it follows that mammals are pretty simple creatures, really, and amoebae are amazingly complex in comparison. Once you start thinking like that, you begin to realise that DNA-as-message must be a flawed metaphor.

THE MATHEMATICAL INTELLIGENCER VOL. 13, NO. 3 © 1991 Springer-Verlag New York
Convergence

The idea of DNA as genetic information sits uneasily with the phenomenon of convergence. Different "causes" can produce the same "effect." Flight, for example, has arisen at least four times in the history of evolution: in pterosaurs, insects, birds, and bats. The wing is a common structure in the world of living creatures. But these creatures do not possess some common DNA sequence that produces wings.

There is also a great deal of convergence within a single species. For example, chemical changes are highly dependent upon temperature. Frogs develop from tadpoles in ponds whose temperatures vary from perhaps 5°C to 25°C within the course of one day. Many of the genetic instructions in frog DNA buffer the frog biochemistry against temperature changes. This leaves a great deal less "information" to determine the basic developmental program around which the buffering routines fit. One is left with the uncomfortable feeling that an adult frog is far too complicated an object to be produced by the amount of information that we know exists in its DNA.
Information

The idea of information as a quantity was introduced by Claude Shannon in the late 1940s, and it arose from engineering problems in telecommunications. Information theory models the following situation: a message (represented as a string of binary digits 0 and 1) is to be sent from a transmitter to a receiver. In the simplest setup each digit is considered equally likely, and thus conveys exactly one bit of information. In this case, a message of n binary digits contains n bits of information, so here the longer the message, the more information it contains.

The primary concerns of classical information theory are twofold: noise in the communication channel, and coding of information. Noise degrades the signal and reduces the information-carrying capacity of the channel. Encoding the message at the transmitter and decoding it at the receiver is a mathematical device to protect against degradation by noise: it can also incorporate situations in which the probabilities of individual message components are non-uniform, or where there is extra structure or redundancy in the original message. The unequal probabilities of different components of the message are transformed by the coding procedure into (possibly) equal probabilities of 0's and 1's in the actual signal sent.

In consequence, when applying a simple bit-count to deduce the quantity of information contained in some message, it is important to take the context, the assumptions that lie behind the encoding/decoding method, into account. Only in the ideal situation does every possible message occur with equal probability; only then does each binary digit carry one bit of information. In the "transmission" of genetic "information" from parent to offspring, the context is currently largely unknown. It could make an awful difference!
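The dependence of the bit-count on the assumed probabilities can be made concrete with a small calculation. The following is only a sketch, assuming independent binary digits; the function names `entropy_bits` and `message_information_bits` are illustrative and do not come from the article. It uses the standard Shannon entropy formula to compare a source of equally likely digits with a heavily biased one.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy, in bits per symbol, of a discrete distribution."""
    # Terms with probability zero contribute nothing and are skipped.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def message_information_bits(n_digits, p_one):
    """Expected information in n independent binary digits,
    each equal to 1 with probability p_one (an illustrative assumption)."""
    return n_digits * entropy_bits([p_one, 1 - p_one])

# Equally likely digits: each digit carries one full bit.
fair = message_information_bits(100, 0.5)
# Heavily biased digits: the same 100 digits carry far less.
skewed = message_information_bits(100, 0.9)

print(f"fair source:   {fair:.1f} bits")
print(f"skewed source: {skewed:.1f} bits")
```

The two strings have the same length, yet the computed quantities differ by roughly a factor of two, which is the sense in which a naive count of digits presupposes the ideal, equal-probability context.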
Thought Experiments

Here are a number of thought experiments. Their aim is not to cast doubt upon the validity of classical information theory, for each has an interpretation within that scheme in which it makes perfectly good sense. The object is to make it clear that naive bit-counts, not taking context into account, can generate nonsense; and to show that the aspects of a message that matter to human beings, such as meaning, understanding, and development, do not fit readily into the information-theoretic mould. We are not claiming that Shannon ever intended them to; but we feel that the distinction between bit-count and meaning is not always appreciated.

1. "If I don't phone you tonight, Auntie Gertie will be arriving on the 4:10 train from Chattanooga. Take her home."
2. "You'll find what you want on pages 75-94 of volume 77 of the Bulletin of the American Mathematical Society."
3. "I ♥ NY."
4. On a television screen, the caption "Call 0800-666-777777 to make a donation."

Experiment 1, on the face of it, conveys a sizeable quantity of information with a zero-bit message, though since the alternative is that you do phone, it's really a one-bit message. An enormously complicated sequence of events is set in train by the absence of a telephone call: get out address-book to find Auntie's address so that you can think about traffic-patterns and work out the best route to take her home while you're on the way to the station; put on coat, open front door, go through, shut it again, get keys from pocket, open car door, get in, shut door, start car, engage gear, let out clutch, avoid neighbour's cat in driveway, turn left on to street . . . .

In experiment 2 a message of 105 characters, say 525 bits in less-than-optimal code, triggers access to the entire information in twenty pages of a technical journal, say around 80,000-100,000 bits.

In experiment 3, a masterpiece of the advertiser's art, a simple but direct message is conveyed in a mere four characters.

In experiment 4, a message of thirteen decimal digits, or around 43 bits, is received. However, the engineers who designed the format of television signals know that the actual amount of information consumed by the appropriate segment of the TV screen is far higher: around 100 lines, each of 1000 individual phosphor dots in three colours, must be activated, say 800,000 bits. To transmit the telephone number by television, you have to send an 800,000-bit message! It won't work otherwise.

In each case it is reasonably clear what mechanism is operating, where to locate the sleight-of-hand that turns comfortable communication-theorist's information from a conserved quantity into something so malleable that there is no point in measuring it. Indeed, each can be viewed as an exercise in coding, "triggering" the access of information from a specific range of possibilities.
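The bit-counts quoted for experiments 2 and 4 can be checked with a few lines of arithmetic. This is a rough sketch rather than anything the authors supply: the helper `naive_bits` is hypothetical, and the figure of 5 bits per character is inferred from the text's "105 characters, say 525 bits". The 800,000-bit screen figure is taken directly from the text.

```python
import math

def naive_bits(symbol_count, alphabet_size):
    """Naive bit-count: symbol_count symbols, each assumed equally likely
    among alphabet_size possibilities (hypothetical helper)."""
    return symbol_count * math.log2(alphabet_size)

# Experiment 2: 105 characters at 5 bits each, i.e. a 32-symbol alphabet
# (an assumption inferred from the quoted 525 bits).
exp2_bits = naive_bits(105, 32)

# Experiment 4: the decimal digits of the telephone number in the caption.
phone_digits = "0800-666-777777".replace("-", "")
exp4_bits = naive_bits(len(phone_digits), 10)

# Figure quoted in the text for the TV-screen segment showing the caption.
screen_bits = 800_000

print(f"experiment 2: {exp2_bits:.0f} bits")
print(f"experiment 4: {len(phone_digits)} digits, about {exp4_bits:.1f} bits")
print(f"screen/message ratio: about {screen_bits / exp4_bits:,.0f} to 1")
```

On these assumptions the same telephone number is "worth" either a few dozen bits or several hundred thousand, depending entirely on which channel's conventions are used to do the counting.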
Bit-Counts Don't Quantify Meaning

Yes, but . . . There is a species of biological theorist, increasing in numbers, that counts the quantity of information carried by a segment of DNA and deduces limits on the complexity of the resulting animal. Does this make sense? One could count the quantity of information transmitted in examples 1-3 and deduce limits on the complexity of the resulting actions. Those limits would be gross underestimates. For example 4, on the other hand, it would be a gross overestimate. Mere bit-counting ignores the context in which the "message" is sent. It bears no relation to the true "information utility" of the message; that is, the complexity of the action that it initiates. In more familiar terms, how much meaning it possesses.

If the manner by which DNA code is transformed into creatures is ignored, we have no idea whatever of the possible complexity of the creature that results from a given segment of DNA. It takes very few bits to send "make a tiger"; and a receiver that understood such a message (i.e., "knew" what it had to do to implement it) would need nothing more, apart from appropriate context, to construct the world's most beautiful feline. Such receivers do exist, namely zoo curators, and the context is equally straightforward: a pair of tigers. On the other hand, twenty million DNA codons might do no more than convey the colour of the animal's ear-tufts, if some TV-like system were in use. (Oddly, it isn't even necessary to code for the colour of ear-tufts provided, as is usual, you want them darker than the rest of the animal. The chemical that determines the colour is temperature-dependent, and ear-tufts are cool because they are at an extremity. It is trivial to "code for" darker ear-tufts, just as the absence of a telephone call "codes for" meeting Auntie G.)

Prescription, rather than description, is closer to the mark; not just for DNA "messages," but for any message outside of the abstract setting of information theory, which deliberately strips out the context. A prescription from the doctor is not of itself a cure: it only turns into one when taken to a drugstore, "received" by a pharmacist, and acted upon. All "messages" in the real world that really are messages happen within a context. That context may be evolutionary, chemical, biological, neurological, linguistic, or technological, but it transforms the question of information content beyond measure.
The accurate definition of the information content of a message relies upon the context. When speaking of technology, we generally know, at least in principle, what the contextual contribution is. We don't normally try to play a compact disc on a telephone answering-machine. But when thinking about the natural world, we often forget that we do not know how much contextual input there is into processes that we like to model as "message-sending."

Many biologists talk of developmental processes being "switched on" by genes; for example, that hereditary disease A is "caused by" defect B in gene C, or that gene D "codes for" structure E. In this style of thinking the DNA sequence is a computer program, and the organism appears when you "run" this program. Genetic engineering is analogous to computer hacking. So ingrained has this type of picture become, that many biologists act as if "it's programmed in the DNA" answers everything. It's not that the "program" image is completely false; just that it's only part of the process. What about the "computer"? How does that work? How does the "information" in the DNA "program" lead to a fully developed organism? What else is needed? These are important questions, to which the "bit-count" measure of information is at best marginally relevant.
What Message?

The metaphor of DNA as a "message" from parent to offspring doesn't hold up under scrutiny. When the "message" is transmitted, there is no receiver. The message, indeed, is supposed to describe how to construct the receiver! Strictly speaking, the "genetic code" isn't even a code. It is true that DNA "codes for," that is, determines, proteins, but there is no converse process of encoding proteins into DNA. DNA is not "transmitted," but copied (subject to the complications of sexual reproduction); and the process whereby DNA "becomes" offspring also involves the parent.

Our obsession with information technology and messages as bit-strings has led us to focus almost exclusively on DNA as "software" and to ignore the contextual "hardware" (or "wetware") in which it produces actions. Moreover, there are other things than DNA that also pass from parent to offspring, things that on a biochemical level are comparable to DNA but which we don't think of as coding anything, and therefore fail to think of as conveying information. In most sexually reproducing cellular animals the egg begins development without involving the embryo's own genes. Only when the "ground plan" is sufficiently well developed in the embryo structure do the embryo's own genes take control.

Mammals take the whole process much further: they put a great deal more into the mother, thereby simplifying what has to be put into the embryo's DNA. We have already mentioned that a large part of frog DNA deals with alternative enzyme pathways for different temperature levels. In contrast, in a mammal the uterine temperature is kept constant by the mother's own regulatory systems; so mammals don't need to put that kind of "information" into their DNA. This is why mammal DNA contains fewer nucleotides than amphibian DNA, while managing to produce animals that are manifestly "more complex."
We might now speculate about a super-mammal that puts the available "extra" DNA to good use . . . .

Humans take the whole process one stage further. Much of what we need in order to be human is genuinely "transmitted" to us as a message, not genetically, but through our brains. Language is an example. If language were "hard-wired" into our genes (assuming this to be possible), then we wouldn't have to learn it; but it would presumably use up a lot of DNA. Instead, our DNA seems just to code for language-learning ability within a brain that has already evolved for other reasons; then the language itself is transmitted culturally, from the mother and other adults to the child. This method is far from foolproof (the language that we learn is an imperfect rendition of what is taught to us), but it gains in efficiency and flexibility. In Richard Dawkins's terminology, we not only pass on our genes to our offspring, but our memes (self-perpetuating mental structures) as well.

Why, then, do we focus so obsessively on the DNA sequence? Because it looks like a message, a code, a piece of software. That metaphor has borne considerable fruit, but it can also be a snare and a delusion. Our complexity is not determined by the number of nucleotides in our DNA sequence: it is determined by the complexity of the actions that can be initiated by those nucleotides within the overall system that constitutes not just us, but our parents, other ancestors, and indeed our cultural heritage. Much of that complexity, in our case, is built into the overall system: it is not coded into our DNA.

The development of a hand, for example, is part of the culmination of a series of processes that produces our skeleton, our muscles, our skin, and so on; each stage being dependent upon the current state of others, and all of them dependent upon contextual physical, biological, chemical, and cultural processes for which no "information" as such is required. A part of DNA can no more code for a hand than a part of the scaffolding for a building under construction can hover unsupported in midair.

Your hand contains flesh, blood, and bone, but no information.

Jack Cohen
The Lodge
39 Greenhill
Blackwell
Bromsgrove B60 1BL
United Kingdom
Ian Stewart
Mathematics Institute
University of Warwick
Coventry CV4 7AL
United Kingdom
A Portrait of Alfred Tarski Steven R. Givant
I first heard of Alfred Tarski while in high school. My family lived in Berkeley, and my sister was a close friend of Ina, Tarski's daughter. Later, when as an undergraduate at the University of California I began to study logic with Leon Henkin, I encountered his name more often. But I did not meet him until September 1968, when I started my graduate studies at Berkeley. He was scheduled to teach an advanced course cum seminar in equational logic. Although I was not sure I would be able to understand anything (I had not yet taken the basic graduate course in logic), I decided to ask him for permission to audit. He was quite kind and encouraging.

Thus began an acquaintanceship that lasted fifteen years. It progressed slowly. For a long time I was simply another one of his graduate students. Around 1972 he asked me to be his research assistant for a year, and in 1975, after I finished my doctoral dissertation, we began a serious collaboration on a book he was in the process of writing. Eventually I became a close friend.

The picture I shall paint, then, is not of the young, vigorous logician who was the brightest star in a constellation of logicians in Warsaw, nor is it of the mature logician who founded his own school of logic at Berkeley. I can only portray an individual who was already 67 years old when I met him, a still quite powerful intellect, but no longer the dynamic leader and center of attention of the world of logic.
The University Professor

Tarski was a rather short person, robust in appearance throughout most of his life (although a bit plump when I first met him, and quite thin in his last years). His face was strong, with a protruding vein running up the left side of his broad forehead, a large nose, quite rounded at the end, and animated eyes that gave glimpses of the razor-sharp mind behind them. He parted his thin, neatly trimmed grey hair almost in the middle, just slightly to the right side, and brushed it straight down to just above his large ears. A generous mouth betrayed the pleasure that lively social interaction brought him. When he spoke, his eyes, nose, and mouth worked in concert, as players in a pantomime, to dramatize the different emotions he was feeling. His face would almost contort as he wrinkled up his nose in disapproval, or smiled broadly at something that pleased him, or opened his eyes and mouth wide to make a point emphatically. It seemed calm only when he was deep in thought: then he would rub the side of his nose with his index finger and his expression would go blank.

Perhaps the expressiveness of his face made him difficult to photograph. Of all the shots taken of him, there was only one that he liked: lost in thought (on a train in Russia), with a cigarillo in one hand. "A man has a right to choose his own conception of his appearance," he once said, and he always chose the Russian one. Future generations of logicians may only know his face by that one likeness.

Tarski on a train in Russia (1966)
Tarski at the Tarski Symposium (1971)

He was definitely not a man of fashion, but when he went out of the house he was always neatly dressed, and he regularly wore a sports coat and slacks. In this respect he did not adopt the relaxed habits of his Berkeley colleagues. His old-world formality helped to create a certain distance between him and us, a distance he did not try to bridge. (Robert Vaught once told me, in an almost reprimanding tone, that earlier generations of Tarski's students had carried his briefcase for him.) We addressed him as "Professor Tarski," and it was usually not until after we finished our Ph.D.s that we were invited to call him "Alfred."

He spoke English well, but with a pronounced Polish accent. It was hard to imitate, but we tried mightily as we swapped Tarski stories. There was, and still is, a kind of camaraderie among his students.

It is impossible for me to think of him without a cigarillo. He smoked constantly. In class we would watch him take out still another Robert Burns, strike a match, and continue to lecture. Seconds ticked by, the burning match in one hand, the unlit cigarillo or the
chalk in the other. We no longer attended closely to his words. Would he burn his finger? Would he finally light the cigarillo, or would he light the chalk?

Tarski was an exciting lecturer. In his hands the thick rough garments that often serve to clothe a mathematical subject were gently peeled away to reveal the lithe, vital body beneath. Mathematics was an art as well as a science, and its aesthetic side was extremely important to him. The proofs he presented in class were models of clarity: polished gems, so elegant and pleasing, with the main ideas bold and transparent. As with the old movie serials, at the end of the hour you could hardly wait for the next thrilling installment.

Most mathematics instructors try to get past the basic material in graduate courses and seminars as soon as possible in order to bring students to the frontiers of research: the difficult, deep, interesting theory. Tarski felt this approach was misguided. He believed it essential to develop in students a very thorough understanding of the fundamental notions and theorems, to lay a solid foundation upon which to erect the structure of the more advanced but more specialized theory. In this way, their confidence in their mathematical abilities and in their grasp of the theory was gradually built up, which made it easier for them to later do research. Of course, the more senior students might sometimes get a bit impatient when he went over very basic material in advanced courses, but even then, it was still a pleasure to watch how beautifully and elegantly he developed the mathematics.

Tarski never used a textbook, not even to supplement the lectures. It was the European tradition for a professor to develop the material himself. The night
before a lecture he would lay aside his research and carefully plan out the next day's talk. Folding a piece of 8½-by-11 paper in half, he would make on the right side a careful list of the definitions and theorems he wanted to present; the left side was kept blank for possible later changes. He took great pains to formulate simple and precise statements. The notes usually did not contain indications of the proofs he planned to give. In class he almost always lectured without referring to his notes.

Three things were especially important to Tarski: correctness, preciseness, and conciseness. Vague formulations were a sign of incomplete understanding, fuzzy thinking. He liked short, snappy statements that struck the reader, and it pleased him to express a complicated theorem succinctly, using his elaborate notational system.

Over the course of his career, Tarski developed what was almost a theory of notation, a lexicon of symbolism involving different alphabets, different fonts, and different type faces, in both upper and lower case, with precise rules governing when to use each. For example, algebraic structures were always rendered in capital German script, classes of structures in upper case Roman sans serif, expressions of a formal language in lower case Greek (from the middle of the alphabet), and sets of expressions in upper case Greek. To denote a fixed set or class of objects or a fixed operation on and to such sets he would often use two boldface letters, the first of which was upper case and the second lower case. For example, the class of all models of a language might be denoted by "Mo", the substructure of a model generated by a set X by "𝔖𝔤 X", the set of all terms of a language by "Tμ", the theory generated by a set of sentences Γ by "Θη Γ", and so on. He gave a great deal of thought to the notation and terminology that he adopted, employed it systematically, and encouraged his students to employ it.
He felt a good notational system used consistently by people working in a given field was extremely important: it made papers and talks easier to follow, and helped to eliminate misunderstandings. Poorly conceived notation, on the other hand, got in the way of mathematical understanding and could confuse readers. He wanted to develop a uniform and comprehensive notational system that would eventually be adopted by all people working in foundational research. A glance at the system presented in [10] will give the reader an idea of the seriousness with which he viewed this project and the progress he made in carrying it out. His concern with notation was probably a legacy of his association with his thesis advisor, Stanisław Leśniewski.

During a lecture Tarski often posed questions anticipating the next point that he wanted to make, and waited patiently for someone to answer. How stimulating this type of classroom interaction was! Not only did it get you thinking about the mathematics, it made you feel you were actually one step ahead of the lecture.

Tarski cared about students being able to understand his lectures. He used to look at their eyes to see if they were following him. This concern for good teaching was an outgrowth of his own teaching experiences in Warsaw. Because of the scarcity of professorships in mathematics and logic before the war, and perhaps also because of the anti-Semitic climate at the time, he was unable to obtain a regular university appointment. True, he did have a position as Jan Łukasiewicz's assistant, and later as his adjunct, but these did not pay well and he was forced to accept other positions, first as a lecturer at a pedagogical institute, and then as a high school mathematics instructor. He took his teaching duties quite seriously, and in later years related with pride the mathematical achievements of some of his Warsaw pupils. In fact, he saved several of their homeworks in which they wrote down completely correct proofs involving the use of Dedekind's continuity axiom.

Teaching and interacting with students, even high school students, stimulated Tarski mathematically. An example of this can be seen in some papers, [37] and [39], that form a sequel to his famous work with Stefan Banach on the equivalence of geometric figures by finite decompositions, [34] and [1]. It is a little known fact that, during an extremely busy period of his life, he coauthored a high school geometry text, [4]. His interest in the completeness and in the definable notions of elementary algebra and geometry, as well as in the axiomatic foundations of geometry, was perhaps also partly motivated by his high school teaching experiences.
By the time I knew Tarski, he taught only one graduate course or seminar a quarter, always on Tuesdays and Thursdays, and always in the afternoon (since he had the habit of working far into the night and sleeping late in the morning). Two anecdotes may serve to describe how his students felt about his classes.

During the 1969-70 academic year, the Berkeley campus was racked with student protests (mostly against the war in Viet Nam) that reached a climax when the national guard was called in and used tear gas to control the situation. Tarski's seminar that year was on the theory of relation algebras (a subject that had been revitalized through his own research and that was dear to him). There were around fifteen participants, of all political persuasions, and despite the general call to cancel classes, all wanted Tarski to continue teaching. Special arrangements were made for the class to meet off campus.

The next year he lectured on the theory of general algebras. When it became apparent in early spring that he would not be able to cover his planned syllabus, the class asked him to talk an extra hour each meeting
(for a total of 2½ hours). He was touched by their eagerness to learn and agreed to their request, despite the extra work it meant for him. In later years he mentioned several times how much the request had moved him. For my part, I must say that he was one of the few lecturers I could continue to listen to for more than two hours without getting bored, restless, exhausted, or lost.

It was a source of pride to Tarski that his students seemed to inherit his ability to lecture well. "If an invited speaker has been a student of Tarski, you can be fairly sure about a good presentation." This compliment he once received might apply equally well to the quality of the exposition in their papers.

However, the ability to lecture or write well did not come painlessly to those who studied with him. He could be scathing in his criticisms if you were vague or made a mistake in a seminar talk. Other times, noticing an error, he would wait quietly. Then, a hunter baiting his prey, he would ask a series of innocent-sounding questions until at last, confused and embarrassed, you were ensnared, painfully aware of the error that had been committed.

Tarski required students to turn in complete written reports of their seminar talks, which he saved. (These reports are now stored in the Tarski Archives of the University of California's Bancroft Library, in Berkeley.) He was exacting in his standards for these, and students were occasionally required to redo them. But he was ever so much more demanding with regard to the writing of doctoral dissertations. He would concentrate on the nonproof parts, which were in some sense more important to him than the details of the proofs. A careful elucidation of the problem had to be given: its history (including proper credits), the reason for its importance, and the general idea behind its solution.
Definitions had to be precisely formulated and the notions involved had to have intuitively clear meanings. No sloppiness was tolerated. He refused to sign the dissertation (with very nice results) of one of his later students, in part because the exposition did not meet his standards. (The thesis was accepted by another professor at Berkeley.) Several of his former students have told me how they wrote their first papers: Tarski invited them to his house, discussed carefully the results with them, and then practically dictated a suitable text. As I indicated, Tarski had extremely high mathematical standards and was notorious for not being satisfied, for insisting that still more results were needed for a thesis. His desire to extract as much as possible from his students caused a great deal of frustration and resentment. Some quite capable students gave up. Others finally completed their theses but were left with feelings of bitterness. And we'll never know how many simply decided to write their dissertations with someone else. But it must be said that virtually all the
theses written under his supervision contain important results that have become well known, at least among specialists. Many have become classics: for example, Wanda Szmielew's proof of the decidability of the theory of abelian groups, [31]; the theorems of Bjarni Jónsson and Tarski on direct product decompositions of algebraic structures, [15]; Julia Robinson's proof of the undecidability of the theory of fields, [28]; the results of Solomon Feferman and Robert Vaught on sentences preserved under various types of direct products of structures, [7], and the Tarski-Vaught theorems on elementary extensions of relational structures, [64]; Richard Montague's proof that general set theory is essentially nonfinitely axiomatizable, and that the axiom schema of replacement is not finitely axiomatizable over the other axioms of Zermelo-Fraenkel set theory, [21], [22]; and Jerome Keisler's algebraic characterization (using the GCH) of elementary equivalence in terms of isomorphic ultrapowers, [18]. Moreover, an uncommon number of the students who did manage to finish their dissertations with him went on to become leaders in their fields, as a glance over the list of his doctoral students will quickly verify. Yes, Tarski could be extremely demanding, always trying to squeeze more results out of his students, criticizing them when they did not work hard enough, being very fussy and meticulous about the exposition in their theses, and rarely complimenting them on their work. But he also was a loyal teacher and friend. I never knew him to turn down a request to be the dissertation supervisor. Although he guarded his time carefully, he was always willing to listen to students, to discuss their work and progress with them. He would advise them on papers to read--for most of his career he had an exceptional command of the literature--and would suggest possible attacks on problems. 
Nonetheless, it must be admitted that there was not a sense of "I'll see you through this thesis-writing ordeal," and some students dragged on and on, while others fell by the wayside. His sense of loyalty and responsibility towards his students extended beyond his role as a thesis advisor. When they had serious problems in their private lives, he tried to help, although he avoided getting too involved. Also, he was not above giving unsolicited advice regarding personal matters. At most American universities, research assistants, unlike teaching assistants, have few if any official duties. Tarski, however, had a very different conception of the job of his own assistants, more in line with the European tradition: It was to help him in his work, and he used them extensively. Often this meant helping him write papers or letters, editing manuscripts or galley proofs, searching through the literature, checking proofs, reading papers, organizing his personal reprint collection or the library of the Group
in Logic and the Methodology of Science. Occasionally, their duties went much further. Don Pigozzi spent several years as Tarski's assistant, helping him write various drafts of Cylindric Algebras, Part I and then editing the typed manuscript and galley proofs. Without Pigozzi's extensive and careful help, Tarski's contributions to the actual publication of this book would certainly have been very much diminished. Tom Frayne also invested years helping Tarski translate his prewar mathematical papers into English for an intended publication that would complement the volume Logic, Semantics, Metamathematics, a collection of his more philosophical early papers. Unfortunately, this project was never completed. I think it is fair to say that Tarski often made such extensive use of his assistants that their official duties seriously interfered with their own research. It was difficult for him to see things from their point of view. He was always conscious of the importance of his work and of his great need for assistance. In some sense he felt he was "owed" help. It was even difficult for him to express the gratitude he certainly felt for the help they gave. In part, this was because he felt he gave a lot in return. Certainly, their work with him could benefit them in their own research, as a glance through Cylindric Algebras, Part I will verify.
Tarski was a nocturnal animal, and others simply had to adjust to his time schedule. In a typical work session, his assistant might be invited for dinner at which there would be hors d'oeuvres and obligatory glasses of fruit- or buffalo-grass-flavored vodka, followed by wine with the meal. (Learning how to drink was one of the requirements for obtaining a Ph.D. under Tarski.) Around eight o'clock the two of them would disappear downstairs, the young assistant by now rather tipsy. The study was an isolated room on the bottom floor of the house with a view over the large garden and the spectacular San Francisco Bay. Tarski would light up a cigarillo, and they got down to work. As the evening wore on, Tarski might feel drowsy and take some sort of stimulant--when I knew him it was Kola Astier. But whether he took a stimulant or not, by midnight he was going full steam. The air was by then thick with clouds of grey smoke drifting lazily around the dimly lit room. The weary young mind
next to him began to wander: Would Tarski notice the long finger of ash that was about to fall off his cigarillo and onto his lap? It was a struggle to keep both eyes open. If only there was a breath of fresh air . . . . Around two or three in the morning it would become apparent even to Tarski that his poor assistant was about to collapse, and, mumbling under his breath about the lack of stamina in the younger generation, Tarski would reluctantly call it a night. Sometimes, however, he would merely suggest that they go upstairs to the kitchen for some refreshments and coffee before continuing to work for a while longer. (His assistant Don Pigozzi used to arrive home at dawn and read through the morning paper before finally "calling it a day.") These night sessions were typical of Tarski's work habits, the incredible drive and self-discipline he possessed. Since he found it impossible to do research in his university office (which also had a magnificent view of the Bay), he filled his study at home with books, journals (in particular, the complete volumes of the Journal of Symbolic Logic and Fundamenta Mathematicae), and frequently used reprints, and he spent most of his time working there. Work, and in particular his work, came before everything else. In this respect he was quite selfish and demanding. He didn't let outside things interfere, not even family life. Although his iron discipline helped to make him a great logician, it was a great burden on his wife and children. Given his nocturnal habits, it is no surprise that Tarski was a very late sleeper. The family took pains not to awaken him with their morning bustle. After getting up, he would take a light meal while reading through the newspaper at the breakfast table, and then--on days when he didn't have to teach--go downstairs to work. He took care of correspondence and administrative matters first before settling into research. 
Dinner would be at six-thirty or seven (his wife, Maria, usually had to call him several times), and around eight he would return to his study to work for the rest of the night. If it was the evening before a lecture, he would devote several hours to its preparation. Certain days of the week involved a deviation from this routine. Monday afternoons were devoted to the garden, and he never scheduled any meeting or appointment during that time. Sundays he would occasionally make excursions with the family and assorted students and colleagues. Tarski was consciously trying to build a school of logic at Berkeley. He fought stubbornly and tenaciously within the department, the university administration, and the larger community of mathematicians and logicians to bring promising students and faculty to Berkeley, to expand the logic and foundations course offerings of the department, to form a special graduate program, the Group in Logic and the Methodology of Science, to get money for visitors, and to organize three highly successful international conferences. He was always using his powers of persuasion to get other people to help him in these efforts. This persistence paid off--Tarski was able to build one of the great centers of logic in the world--but it was hard on his associates, who often found him very difficult to deal with.
His point of view regarding the logic community was cosmopolitan, and he encouraged colleagues and students from abroad to come to Berkeley for visits ranging from a few weeks to several years. They came from all over the world: Roland Fraïssé and Anne Preller from France; Karl-Heinz Diener, Walter Felscher, Gebhard Fuhrken, Wolfgang Rautenberg, and Wolfram Schwabhäuser from Germany; John Shepherdson from Great Britain; Evert Beth from Holland; Giovanni Sambin and Aldo Ursini from Italy; Lars Svenonius from Sweden; Richard Büchi, Erwin Engeler, Hans Läuchli, and Ernst Specker from Switzerland; Karel Prikry, Miroslav Benda, and Thomas Jech from Czechoslovakia; Paul Erdős, Andras Hajnal, and Mihaly Makkai from Hungary; Andrzej Ehrenfeucht, Jan Kalicki, Jerzy Łoś, Andrzej Mostowski, Jan Mycielski, Leszek Pacholski, Czesław Ryll-Nardzewski, Lesław Szczerba, and Wanda Szmielew from Poland; Mohamed Amer from Egypt; Haim Gaifman, Azriel Lévy, Menachem Magidor, Michael Rabin, and Abraham Robinson from Israel; Haragauri Gupta from India; Max Dickmann from Argentina; Oswaldo Chateaubriand and Newton da Costa from Brazil; Rolando Chuaqui and Gonzalo Reyes from Chile.
The Mathematician
Tarski was a logician with exceptionally broad mathematical interests. At various points in his life he worked extensively in set theory, particularly the theory of cardinal numbers (he expended a great deal of energy as a young man trying to settle the continuum hypothesis); the theory of formal systems; sentential, first-order, and infinitary logic; decision problems; semantics; model theory; universal algebra;
Boolean algebra and lattice theory; algebraic logic; measure theory; and geometry (in particular the foundations and metamathematics of geometry). A pronounced inclination to use algebraic methods manifested itself early in his career, influencing most of his work after 1930. It is therefore no surprise that he is considered one of the founders of several fields that lie on the borderline of logic and algebra: model theory, universal algebra, and algebraic logic. Whether because of the hostile reception accorded logic by mathematicians or because he himself was essentially trained as a mathematician by people like Wacław Sierpiński and Stefan Mazurkiewicz, Tarski was extremely sensitive to the interests (or disinterests) of mathematicians, and he worked on problems that he thought would appeal to them. More than anyone else, he influenced work on the decision problems of theories studied by mathematicians (see [38], [47], [48], [49], [50], [56], [57], [58], [59], [60], [63], [5], [30], [32], and [62]). Together with Jónsson, C. C. Chang, and Michael Fell he published a series of papers on direct product decompositions of general algebraic structures ([15], [8], [55], and [2]). He also initiated the study of several algebraic forms of predicate logic, and showed how classical mathematics could be formalized in simple and natural algebraic languages (see [46], [3], [16], [17], [13], [10], [11], [12], and [62]). In addition to pursuing lines of research that were of potential interest to mathematicians, he also gave many of the notions and results in his papers a mathematical, as well as a metamathematical, formulation, so that they would be more accessible to mathematicians. For example, in [38] he went to some length to explain the notion of a definable set in mathematical terms. 
In the introduction to that paper he wrote: Mathematicians, in general, do not like to deal with the notion of definability; their attitude toward this notion is one of distrust and reserve. The reasons for this aversion are quite clear and understandable . . . . The distrust of mathematicians towards the notion in question is reinforced by the current opinion that this notion is outside the proper limits of mathematics altogether. The problems of making its meaning more precise, of removing the confusions and misunderstandings connected with it, and of establishing its fundamental properties belong to another branch of science--metamathematics. . . . I believe I have found a general method which allows us to construct a rigorous metamathematical definition of this notion. Moreover, by analyzing the definition thus obtained it proves to be possible . . . to replace it by a definition formulated exclusively in mathematical terms. Under this new definition the notion of definability does not differ from other mathematical notions and need not arouse either fears or doubts; it can be discussed entirely within the domain of normal mathematical reasoning. In [51] a theory of arithmetical classes was also presented in terms meant to appeal to mathematicians. A
more extensive manuscript developing this theory was never published. Tarski did not like overly specialized results. He thought, for instance, that problems concerning first-order theories categorical in power were too limited in scope, and as a result he never appreciated the results obtained by Michael Morley and Saharon Shelah, although he admired their virtuosity. Instead he was attracted to problems that admitted a simple formulation and yet had broad applicability. In his own work, he displayed a remarkable talent for setting his results within a very general framework. Examples include the investigations into the methodology of deductive sciences, [35], [36], [40], [41], [42], [43], [44], [45], and [19]; the theorems on universal classes, [52], [53], and [54], which may have been an outgrowth of his efforts to prove that the class of representable relation algebras is axiomatizable, and in fact has an equational base; the results on Boolean algebras with operators, [16] and [17], which were, I believe, a consequence of the attempts to represent relation algebras by embedding them into complete atomic relation algebras; and the Jónsson-Tarski direct decomposition theorems referred to above. Even in the case of theorems with a more specialized content, he strove to present them in the most general possible way. For example, his result that the set of cardinalities of the different possible independent axiom sets of a finitely axiomatizable theory must form an interval of natural numbers was formulated in terms of the cardinalities of irredundant bases (i.e., independent generating sets) of an abstract closure structure (see [61]). In this form it can be applied to an arbitrary finitely generated algebra. The generality with which Tarski formulated results sometimes seemed to actually make it more difficult for people to appreciate their power and significance. 
For instance, several times in the fifties and sixties he found himself at a mathematical meeting where someone announced a new result (such as the uniqueness of the direct decomposition of finite lattices with zero or of finite rings) that was really an immediate consequence of one of the main theorems of [15]. To give another example, the use of relational structures as models for modal and other intensional languages (often referred to as "Kripke semantics") was anticipated in [16]. In particular, the completeness, with respect to Kripke models, of a number of well-known modal systems follows from some of the representation theorems of that paper. But many researchers in the field of intensional logic were apparently unaware of this connection for a long time. Tarski did not like to work on well-known questions in the "mainstream" of logic, and the inclination of certain logicians to treat fashionable areas of research as the only worthwhile ones irritated him. For one thing, I think he did not like to compete with other
Teachers and the Grandteacher
Jan Łukasiewicz, professor of philosophy, and twice rector, at the University of Warsaw (probably 1930s)
Kazimierz Twardowski, professor of philosophy at the University of Lwów and the thesis advisor of Tarski's three logic teachers: Kotarbiński, Leśniewski, and Łukasiewicz (probably around 1900)
Tadeusz Kotarbiński, professor of philosophy at the University of Warsaw (circa 1934)
Stanisław Leśniewski, professor of the philosophy of mathematics at the University of Warsaw and Tarski's thesis advisor (probably 1910s)
Wacław Sierpiński, professor of mathematics at the University of Warsaw and a founder of the Polish school of mathematics (probably 1950s)
Stefan Mazurkiewicz, professor of mathematics at the University of Warsaw and a founder of the Polish school of mathematics (probably 1920s or 1930s)
mathematicians in trying to solve problems. Rather, he preferred to carve out a domain for himself, to create his own problems and theories that, by the force of the results he obtained and the interest he aroused in others, became mainstream. Such was certainly the case with model theory, with the mathematical decision problems, and with universal algebra. An unfortunate consequence of this approach, however, has been that some achievements close to his heart did not receive much recognition. An example is his development of algebraic logic. Almost singlehandedly he created a modern theory of the algebra of binary relations and showed that it provided a suitable framework for formalizing all of mathematics. Several of his students and followers, including Louise Chin, Bjarni Jónsson, Roger Lyndon, Roger Maddux, Ralph McKenzie, Donald Monk, and, most recently, Hajnal Andréka and István Németi, made important contributions to the theory, yet interest in relation algebras remains limited. With his students Louise Chin and Frederick Thompson he created the theory of cylindric algebras, an algebraic analogue of first-order logic, which he, Henkin, Monk, and their students systematically developed. But outside of the small circle of logicians and computer scientists who continue to work in this domain, it seems not to have attracted much attention. Often, when thinking of a great mathematician, we imagine someone who has developed beautiful theories, proved many deep and interesting results. However, although it is rarely acknowledged as great mathematics, the ability to propose interesting problems is the life blood of the subject. Tarski was the inventor of problems par excellence. They seemed to occur to him almost without effort on his part. His problems had a special appeal. Their striking and deceptively simple formulations made them appear not only natural, but "basic." 
It would be fascinating to have a complete list of the ones he posed to students and colleagues, for it would reveal the important role they must have played in the great progress that was made in logic during his lifetime. When giving a problem to a student, he would sometimes not reveal all its implications at once. For example, when he first posed to Julia Robinson the problem of defining the set of integers within the field of rationals, he made no mention of its ramifications. Only after she obtained a positive solution did he unveil its consequences for the decision problems for fields. He might not even reveal the entire problem to you, only a special case. When you came to him with a solution (and the hope that this would constitute your thesis), he would ask whether you could extend the result to such-and-such a case. When you returned with the desired extension (now, surely, you were done with your thesis), he would propose yet a further possible extension. This could go on until you either had an extremely beautiful and strong theorem or you gave up in frustration. It was in just this fashion that John Doner got his results on the extended arithmetic of ordinal numbers, [6]. Not all the problems Tarski posed were difficult. During a universal algebra course in 1970, he presented a proof of his beautiful theorem on the cardinalities of irredundant bases of algebras, and then asked the class, as a kind of homework, what further restrictions the size of a finite algebra might impose on these cardinalities. The question had a special charm, and yet it proved to be quite accessible (see [9]). As this example shows, Tarski occasionally gave interesting open problems as homework, sometimes questions that had just occurred to him the night before as he was preparing his lecture. It certainly was an added incentive to think that doing your homework might lead to a publishable result. Of course, there were examples at the other end of the spectrum of difficulty. When Vaught was working on his dissertation, he would go to Tarski with a solution to one problem, only to be given yet another. Finally, in exasperation, he asked for a problem whose solution would be sufficient for a thesis. Tarski needed some time to think about it. When they met again, he proposed the following: Are two free groups on two or more generators elementarily equivalent, and is the theory of such groups decidable? This problem (for a finite number of generators) is still open, thirty-five years later, despite the efforts of many outstanding logicians. That was a problem worthy of a thesis! Three other well-known and still unsolved Tarski problems concern decision questions: Is the elementary theory of the field of real numbers that are "constructible by straightedge and compass" decidable? Is the theory of the real numbers under addition, multiplication, and exponentiation decidable? 
Is there an algorithm for determining whether the set of identities true in a finite algebra is finitely axiomatizable? Two other famous problems of his have recently been solved: the "high school" problem and the "circle squaring" problem. The first asks whether every identity involving the operations of addition, multiplication, and exponentiation, and the distinguished constant 1, that is true of the natural numbers can be derived from the set of eleven well-known identities expressing the fundamental laws of arithmetic taught in high school. Alex Wilkie has shown that the answer is negative (see [14]). The second, a modern version of an ancient Greek query, asks whether a circle can be decomposed into finitely many disjoint pieces that can be reassembled to form a square. Miklós Laczkovich recently announced a positive solution (see [65]). Along with his gift for thinking of problems, Tarski possessed a gift for formulating in a precise way notions of great interest, generality, and usefulness. Some of these notions were original with him; others
had been employed previously, but only informally or in specialized situations. Examples include the notion of a logical (or invariant) notion; satisfaction, truth, and logical consequence; syntactic and semantic definability (in a theory and in a model), and syntactic and semantic definitional equivalence (of two theories and two models); elementary equivalence and elementary extensions of models; interpretability and relative interpretability in a model and in a theory; the hereditary and essential undecidability of a theory; a Boolean algebra with operators; a closure structure; and an algebra derived from a formal language. In their review of Logic, Semantics, Metamathematics, Pogorzelski and Surma [24] wrote, Most of Tarski's papers contained in the volume reviewed are classical for metalogic and metamathematics. A great many notions and theorems found here have already become standard, and these are a most essential component in logical education at all levels, beginning with the most elementary. Looking, however, at the book as a whole, the thing which we find most striking is that there would hardly be found another scientist in the history of the exact sciences whose part in the construction of notions for a large domain of science was so powerful as the contribution of Tarski to the creation of conceptual apparatus for logic, metalogic--and even--metamathematics. In fact, the conceptual structure of these disciplines is due to Tarski. Some people have criticized Tarski for doing what they consider to be easy mathematics. However, as Scott points out in [29], "It is one of his special virtues that he . . . has made it possible to see how to distinguish the [elementary but] important from the trivial." Tarski is famous for his contributions to philosophy, particularly his paper on the conception of truth in formalized languages. It is surprising, then, that philosophical considerations did not play a major role in his work. 
In fact, he seemed to avoid taking any particular philosophical position. Unlike the intuitionists or the finitists, he availed himself of all possible mathematical tools, and he made abundant use of infinitistic methods. He was reluctant even to discuss his philosophy of mathematics and maintained that his research was independent of any particular philosophical point of view. A remark at the end of the introduction to [36] typifies his attitude: In conclusion it should be noted that no particular philosophical standpoint regarding the foundations of mathematics is presupposed in the present work. But he did occasionally express certain philosophical preferences. Mostowski writes in [23] of Tarski's sympathies for nominalism. Once Tarski told me that if he had to identify the philosophical position to which he felt closest, it might be the reism of his teacher, Tadeusz Kotarbiński. Several times towards the end of his life he also mentioned a certain attraction for the ultrafinitism of Aleksandr Esenin-Volpin. Tarski was quite sensitive to the problem of proper
recognition. This feeling may have had two sources: his inability in Poland to obtain a regular university position and the negative attitude of many mathematicians towards logic. Because he had to teach in a high school to supplement his meager wages as an assistant, the amount of time he could devote to research was sharply reduced. To prepare a paper for publication, for example, he would feign sickness, so that he could stay home from school to write. It was frustrating for him to work so hard, and still not have enough time to obtain some of the results he was after. The missed opportunities must have upset him. One can glean, for example, from footnote 88 and the historical remarks at the end of [42], that by 1930 he was quite close to obtaining the nondefinability of the truth predicate. This result, when applied to theoremhood for a theory such as Peano arithmetic and combined with the definability of proof predicates for such axiomatizable theories, leads directly to Kurt Gödel's first incompleteness theorem. Tarski once wondered out loud to me whether he might not have obtained his theorem before Gödel's work appeared, had he had, in the period 1925-1931, a position that allowed him to devote himself more to research. At any rate he lamented the fate that so limited the time and energy available to him for creative mathematics during the years when he was at the height of his mathematical powers. The inability to find a suitable university position weighed heavily on his mind. Heinrich Scholz's admonition to him, around 1938, that a mathematician over forty without such a position had bleak prospects, gave voice to the fears that had been growing within him. He made up his mind to look abroad. At this point fate stepped in. Just as he arrived in America in 1939 to attend a Unity of Science congress at Harvard, Germany invaded Poland. Unable to return home, he began looking for a position in the United States. 
The exodus of European scientists to America in the 1930s had made it difficult for foreign mathematicians to find jobs in the States; Tarski's prospects were not improved by the fact that he worked in logic. He struggled for several years before the chairman of the mathematics department at the University of California, Griffith Evans, acting on a recommendation from the philosopher Willard van Orman Quine, offered him a position as a lecturer at Berkeley in 1942. Even then, it was three years more before he was promoted (to associate professor) and longer still before the department agreed to offer logic courses as part of the regular curriculum. In all it took more than twenty years (in Poland and America) for Tarski to obtain a position commensurate with his accomplishments. Tarski's sensitivity to recognition was perhaps but a manifestation of his sensitivity to the broader question of proper attribution. He disagreed with the attitude that only science itself matters, not its creators, and he
had sharp words for those who were simply too lazy to find out what the correct attribution of a result was. In his own writings and lectures he was very careful to assign proper credit and to describe accurately the history of an idea or a theorem. It annoyed him enormously that other mathematicians were not equally careful, and he would often stand up after someone's talk at a meeting to make historical comments. This insistence on proper credit was an occasional source of irritation to others. A well-known example occurred in connection with the notion of a Boolean algebra of formulas. According to Tarski, this construction was known to him already in 1930 and appeared explicitly in [41], where references to earlier publications of B. A. Bernstein and E. V. Huntington from the period 1931-1934 can be found (op. cit., pp. 510-511). An oral tradition apparently arose in Poland sometime after the war that the construction was due to Adolf Lindenbaum, and in [25] Rasiowa and Sikorski referred to these algebras as "Lindenbaum algebras." Later, in [26], p. 143, footnote 1, they justified this name on the basis of their misinterpretation of a published remark in [20] by McKinsey (who cited Tarski as his source). Their apparent carelessness motivated Tarski to point out the error in [13], p. 85, footnote 4. He was at any rate opposed to the practice of naming notions after people, and in this case it seemed all the more inappropriate since Lindenbaum had nothing to do with the algebra of formulas. When Helena Rasiowa visited Berkeley, Tarski discussed with her the history of the construction and the correct reading of the McKinsey remark. He was left with the impression that she acknowledged her earlier misunderstanding and accepted what he said. In their book, [27], Rasiowa and Sikorski dropped the name "Lindenbaum algebra." However, in a footnote, pp. 
245-246, they offered an explanation of why Polish logicians used the name, and they assigned credit for the construction jointly to Lindenbaum and Tarski, again citing the McKinsey article, but now adding that it was their duty to report also Tarski's reply (which they did). Their continued insistence on Lindenbaum's role, and especially the provocative formulation of their footnote, upset Tarski. Probably, the situation was further aggravated by the fact that the name "Lindenbaum algebra" was starting to catch on among logicians. At any rate, he did something rather uncharacteristic. He published a biting rejoinder to the Rasiowa-Sikorski explanation (see [10], p. 169, footnote):

For misinformation concerning the history of this method see the footnote on p. 245 f. in Rasiowa-Sikorski [63*] and compare it with the article of J. C. C. McKinsey quoted there.

A different aspect of Tarski's attitude towards recognition can be found in his dealings with publishing houses. He did not share the opinion that mathematicians should be indifferent to their earnings from scientific publications. He was careful to negotiate contracts that gave him a satisfactory royalty and that clearly specified those rights he was ceding to the publisher and those he was retaining. Several contract negotiations with him foundered because publishers did not agree to a high enough royalty percentage.

Tarski had an unshakable belief in himself, in the directions he espoused in logic, in the problems he worked on and the importance of the results he obtained, in the goals he was fighting for at Berkeley and in the larger community of mathematicians and logicians, in his political and social opinions, even in his own personal charm and its effect on people. It was this confidence that enabled him to battle so tenaciously on behalf of logic, even in the face of very strong opposition and hostility. Unfortunately, it sometimes also made him seem self-centered and insensitive to other people's feelings or points of view.
The Nonmathematical Side

The breadth of Tarski's mathematical interests was only a reflection of the breadth of his general intellectual interests. His first inclinations were not towards logic at all. He was an exceptional student in high school, and logic was the one subject in which he did not receive the top grade (he got the equivalent of our B).

Tarski began his high school studies while Poland was still under Russian occupation, and the language of instruction in the classical gymnasium he attended was Russian. At one point during this period he was studying seven languages simultaneously: Russian, German, French, Greek, and Latin in school; Polish in special lessons after school for Polish students; and Hebrew in private lessons. Although his knowledge of Greek, Hebrew, and Latin faded, he maintained throughout his life an ability to read and speak German, French, and Russian; to these he added an eventual mastery of English. (He and Maria continued to speak Polish at home.)

This talent for languages was but a sign of his interest in language and words. There were people such as Henkin whose linguistic advice he sought on a regular basis, because he considered them especially sensitive to the use of language. One of the reasons that his writing proceeded as slowly as it often did was that he spent inordinate amounts of time thinking of the best way to phrase a sentence, of exactly the right word to convey an idea or to name a notion. He was critical of what he considered sloppy use of language among mathematicians; for example, describing notions with neutral, nonsuggestive terms like "good," "regular," "normal." Not only was the correct choice of words and phrases important to him, he was also concerned with style: effective and ineffective forms of repetition, cohesive sentence constructions, redundancy; such questions were always in his mind as he wrote.

His sensitivity to language was a manifestation of his meticulous nature and the intellectual importance which preciseness held for him. It was also an expression of his aesthetic sensibility. He was extremely fond of literature, and of poetry in particular. He read not only Polish works, but also modern Russian literature (in the original). This included well-known authors such as Aleksandr Solzhenitsyn and Aleksandr Zinoviev, and lesser known figures whose writings were published in the émigré journal Kontinent, to which he subscribed. One way he demonstrated his affection for a person was by reading aloud from a favorite poet such as Heinrich Heine. Even in old age he could recite poems in Polish, German, and Russian, and in his final, delirious hours of life he murmured fragments of poems in these languages. Leszek Kołakowski, the Polish philosopher and writer, was his friend, as was Czesław Miłosz, the Polish poet and Nobel laureate. In fact, when Miłosz embarked on his project of translating the Psalms into Polish, it was Tarski who directed his attention to a beautiful but little-known translation from Hebrew into Polish of the Old Testament.

This aesthetic sensitivity, which was so pronounced in Tarski, extended to art. Throughout his house were hung paintings and etchings that he had collected on his travels, including a Rembrandt, a Käthe Kollwitz, a Franz Masereel, and portraits of Maria and of himself from his Warsaw days executed by the well-known Polish artist Stanisław Witkiewicz (while under the influence of a drug, the chemical formula of which is painted just by his signature). By way of contrast, he exhibited very little musical inclination, and did not share the classical tastes of his wife.
He did enjoy Polish cabaret songs and sentimental melodies from his youth, and he occasionally played for guests the comical ditties of Tom Lehrer, but that was about the extent of his musical interests.

Tarski's first intellectual love, however, was biology. It was the subject that had most fascinated him in high school, and he entered the university with the intention of obtaining his degree in it. What derailed him was success. In his first course with Leśniewski he was able to solve a problem in set theory that Leśniewski had mentioned to the class. It was not, in Tarski's later opinion, an interesting result, but it led to his first published paper, [33]. This flush of success, together with the urgings of his professor, was enough to sway him to change disciplines. However, in later years he several times expressed his regret at having abandoned his first love, and jokingly vowed to return, in his next incarnation, as a biologist. He was especially interested in evolution and primate behavior. The recent attempts to teach chimpanzees and gorillas human language and to raise
THE MATHEMATICAL INTELLIGENCER VOL 13, NO 3, 1991
them in human environments fascinated him because of the light they shed on the relationship between human and simian intelligence. He was also quite knowledgeable about plants, and took pleasure in visiting exotic botanical gardens and eating strange fruits during his trips abroad. An avid gardener, he was particularly fond of roses, fuchsias, rhododendrons, and azaleas. A whole section of his garden was devoted to different varieties of roses; there were also several unusual types of fruits: feijoas (Stalin's favorite fruit, according to Solzhenitsyn in The First Circle), strawberry guavas, and Rangpur limes.

He liked animals and felt a particular affinity for cats. He used to tell the story of a she-cat that visited him regularly as he worked in the garden. They seemed to have established a rapport. One day she appeared and gave him a longing, sad look as if to say good-bye; it was indeed the last time he ever saw her. At various times in his life the family had dogs or cats as pets, and twice there was also an aviary. However, animals disturb the order of a household and require time and energy to take care of. As these were precious commodities to Tarski, there were no more pets once the children left home, with one exception. In the last months of his life, when his mental powers were already failing him, and he felt quite frustrated and lonely, his student, Judith Ng, presented him with "Kitty," a cat to whom he became quite attached.

It would be hard to call Tarski a "liberated" person in the sense that this word is used today. But his attitudes towards many issues were quite uncharacteristic for men of his generation. He greatly admired strong, intelligent, and creative women, and related with pride how Maria and her sisters, who were active in the Polish underground during the First World War, went back and forth across enemy (Russian) lines, completely unescorted.
He openly admitted that, in similar circumstances, he might not be capable of such courage. For the time, he had an exceptional number of women students: Wanda Szmielew, Julia Robinson, Louise Chin Lim, Anne Davis Morel, Jean Butler, and, later, Judith Ng. Several people who were close to him were homosexuals. Although he was unaware of their homosexuality for a long time, when he finally did learn of it, his friendships with them were in no way affected. It was characteristic of him that he was a staunch supporter of the civil rights movement and, at the same time, an equally staunch defender of the right of such people as William Shockley and Arthur Jensen to conduct research into possible genetic differences in intelligence between races.

In the Polish tradition, he was extremely interested in politics. While at the university he had been active in a political group that, among other things, provided aid to families of Polish workers. Tarski was always proud of the help he was able to extend to the families assigned to him. His political sympathies at the time
Left to right: Róża Prussak Teitelbaum, Tarski's mother (probably late 1910s); Ignacy Teitelbaum, Tarski's father (probably late 1910s); Maria Tarski, Tarski's wife (probably 1920s); Tarski and his children, Jan and Ina (circa 1947).
were decidedly left-wing, although probably not outright communist like those of his close friend Lindenbaum. By the mid-1930s, the events in Russia seem to have dissuaded him of any communist sympathies whatsoever, and in his later years, he was known in Berkeley as an ardent anti-communist. However, he retained his belief that the state bore certain obligations towards its citizens, including the provision of medical care and, when necessary, food and housing. He liked to characterize his sympathies as socialist on domestic policy and conservative on foreign policy.

He retained a lifelong attachment to Poland, at least to the country that, for a brief period between the two world wars, finally achieved independence, after centuries of foreign domination. He followed political and cultural events there closely, and participated in several international attempts to help Polish workers in their struggles against the postwar government. The rise of the trade union Solidarity (whose existence only a few years earlier had been unthinkable) fascinated him, and he occasionally wore a Solidarność button on his lapel. He kept in touch with Polish expatriate intellectuals and read Polish cultural, historical, and political writings, in particular in the Polish émigré journal Kultura. Of course he was very active in helping to bring Polish logicians to Berkeley after the war.

Guests to his home were regaled with Polish hospitality: Polish food cooked by Maria, Polish-style hors d'oeuvres, Tarski's fruit-flavored vodka (fortified with pure alcohol brought from Mexico by his students), animated conversation (often on political topics), and stories. Tarski loved to tell good stories.
Every frequent guest to his home inevitably built up a repertoire of Tarski stories: the Jewish holidays when he stole candy from the pockets of his uncles; solving his thesis problem while in the chair of a country dentist; the hikes in the Tatra mountains, and the time Maria
saved his life during a fall; Lindenbaum successfully locating the body of his missing father by consulting a famous psychic; Zermelo's visit to Warsaw and his morning "singing" lessons; Sierpiński as a teacher (lecturing with his back to the students, while writing nonstop on the blackboard) and as a thesis advisor (with an "amazing number of students"); the founding of the journal Fundamenta Mathematicae; the "discovery" of Banach by Hugo Steinhaus; Łukasiewicz as rector of the University, and his "zoologically anti-Semitic" wife; Scholz's visit to Warsaw and his courageous support of Tarski (he attended a party with Tarski and scolded Łukasiewicz because Tarski had such a terrible position, and this at a time of strong anti-Semitic sentiments in both Poland and Germany).

There were so many stories: stories of life in Warsaw, the delicious pastries and wonderful cabarets; stories of heroism and villainy that took place in Poland during the war; his years on the east coast ("dating" WAVEs with Quine); his attempts to get his family out of Poland; the Sunday outings in California; the adventures of his students and colleagues at Berkeley; the unending stories about the many interesting personalities with whom he came into contact as a logician: Paul Bernays, Stefan Banach, Evert Beth, L. E. J. Brouwer, Rudolf Carnap, Yuri Ershov, Paulette Février, Jean Destouches, Robert Feys, Kurt Gödel, Bronisław Knaster, Georg Kreisel, Kazimierz Kuratowski, Edmund Landau, Anatolii Mal'cev, W. V. O. Quine, Bertrand Russell, Heinrich Scholz, Patrick Suppes, Boris Trachtenbrot, Stanisław Ulam, John von Neumann, Ernst Zermelo, even Eduardo Frei, the former President of Chile. Quite honestly, Tarski loved to gossip, sometimes even maliciously. But he told his stories with such charm and wit, it was difficult not to be captivated by them.
Although he missed the world that was destroyed soon after he left Poland in 1939, Tarski quickly
adapted to his new home in America and became quite happy here. Yet, in some sense I think he always felt a bit out of place, and perhaps a bit lonely. His old world had disappeared, but not his inner attachment to the old-world way of life. His mentality was very different from the American mentality, and although he had many friends in this country, they were not really close friends. It was amazing to me that a man as gregarious and extroverted as he had so few deep personal contacts with other people. In fact, he once told me that the only real friend, in the European sense of the word, that he had had in this country was J. C. C. McKinsey.

It was not easy to be close to Tarski. He rarely opened up or showed affection. It seemed much easier for him to criticize than to compliment. The very traits that made him an outstanding logician (self-confidence, self-discipline, single-mindedness of purpose, a quick and curious mind, an outgoing personality, and persistence) seemed to make him difficult in personal relationships. I don't know if anyone who interacted with him on a deeper level can completely sort out the mixture of admiration, exasperation, loyalty, affection, frustration, anger, and gratitude that he evoked. I know I can't.
Acknowledgement: I would like to thank William Craig, Anita and Solomon Feferman, Robert Goldblatt, Leon Henkin, Roger Maddux, Judith Ng, Don Pigozzi, Constance Reid, and Jan Tarski for their many helpful remarks, and Maria Moszynska, Jan Tarski, and the University of Warsaw archivists for their help in gathering together the photographs. I owe a special debt to Verena Huber-Dyson. It was she who suggested that I write this portrait as part of a larger, jointly authored piece about Tarski, and she spent many hours with me, discussing various ideas and sharing her recollections of Tarski. Our jointly authored paper will appear in the Polish journal Wiadomości Matematyczne.
Bibliography

1. Banach, S. and Tarski, A. Sur la décomposition des ensembles de points en parties respectivement congruentes, Fund. Math. 6 (1924), 244-277.
2. Chang, C. C., Jónsson, B. and Tarski, A. Refinement properties for relational structures, Fund. Math. 55 (1964), 249-281.
3. Chin, L. H. and Tarski, A. Distributive and modular laws in the arithmetic of relation algebras, University of California Publications in Mathematics, (new series) 1, no. 9 (1951), 341-384.
4. Chwialkowski, Z., Schayer, W. and Tarski, A. Geometrja dla trzeciej klasy gimnazjalnej (Geometry for the third class of the gymnasium), Lwów: Państwowe Wydawnictwo Książek i Pomocy Szkolnych (1935). (Second edition, Sekcja Wydawnicza Armii Polskiej na Wschodzie w Jerozolimie, 1944. Reprinted by Polski Związek Wychodźctwa Przymusowego w Hanowerze, Hanover, 1946.)
5. Doner, J., Mostowski, A. and Tarski, A. The elementary theory of well-ordering: A metamathematical study, Logic Colloquium 77, (A. Macintyre, L. Pacholski, and J. Paris, eds.), Amsterdam: North-Holland Publishing (1978), 1-54.
6. Doner, J. and Tarski, A. An extended arithmetic of ordinal numbers, Fund. Math. 65 (1969), 95-127.
7. Feferman, S. and Vaught, R. L. The first order properties of products of algebraic systems, Fund. Math. 47 (1959), 57-103.
8. Fell, J. M. G. and Tarski, A. On algebras whose factor algebras are Boolean, Pacific J. Math. 2 (1952), 297-318.
9. Givant, S. R. Possible cardinalities of irredundant bases for finite closure structures, Discrete Math. 12 (1975), 201-204.
10. Henkin, L., Monk, J. D. and Tarski, A. Cylindric algebras. Part I, With an introductory chapter: general theory of algebras, Amsterdam: North-Holland Publishing (1971).
11. --, Cylindric algebras. Part II, Amsterdam: North-Holland Publishing (1985).
12. Henkin, L., Monk, J. D., Tarski, A., Andréka, H. and Németi, I. Cylindric set algebras, Lecture Notes in Mathematics, 833, Berlin: Springer-Verlag (1981).
13. Henkin, L. and Tarski, A. Cylindric algebras, Lattice theory, Proceedings of Symposia in Pure Math., 2, (R. P. Dilworth, ed.), Providence, RI: Amer. Math. Soc. (1961), 83-113.
14. Henson, C. W. and Rubel, L. A. Some applications of Nevanlinna theory to mathematical logic: identities of exponential functions, Trans. A.M.S. 282, no. 1 (1984), 1-32.
15. Jónsson, B. and Tarski, A. Direct decompositions of finite algebraic systems, Notre Dame Mathematical Lectures, 5, Notre Dame, IN: University of Notre Dame Press (1947).
16. --, Boolean algebras with operators. Part I, Amer. J. Math. 73 (1951), 891-939.
17. --, Boolean algebras with operators. Part II, Amer. J. Math. 74 (1952), 127-162.
18. Keisler, H. J. Ultraproducts and elementary classes, Koninklijke Nederlandse Akademie van Wetenschappen, Proceedings, Series A, 64 (= Indagationes Mathematicae, 23) (1961), 477-495.
19. Lindenbaum, A. and Tarski, A. Über die Beschränktheit der Ausdrucksmittel deduktiver Theorien, Ergebnisse eines Mathematischen Kolloquiums 7 (1936), 15-22.
20. McKinsey, J. C. C. A solution of the decision problem for the Lewis systems S2 and S4 with an application to topology, J. Symbolic Logic 6 (1941), 117-134.
21. Montague, R. M. Semantical closure and non-finite axiomatizability. I, Infinitistic methods. Proceedings of the symposium on foundations of mathematics, Warsaw, 2-9 September 1959, Oxford: Pergamon Press (1961), 45-69.
22. --, Fraenkel's addition to the axioms of Zermelo, Essays on the foundations of mathematics, (Y. Bar-Hillel, E. I. J. Poznanski, M. O. Rabin, and A. Robinson, eds.), Amsterdam: North-Holland Publishing (1962), 91-114.
Friends and Colleagues

Right: Tarski and Kurt Gödel in Vienna (circa 1935; photo by Maria Luttmann-Kokoszynska). Below (left to right). First row: Adolf Lindenbaum, close friend and collaborator in Warsaw (circa 1922); Leon Henkin, professor of mathematics at Berkeley (circa 1955); Tarski and Heinrich Scholz (center), professor of logic at Münster, person at right unidentified (1937). Second row: Willard Van Orman Quine, professor of philosophy at Harvard; Raphael Robinson, professor of mathematics at Berkeley, both at the Tarski Symposium (1971); "Socrates" (Joseph Woodger), translator of Logic, Semantics, Metamathematics, with Rudolf Carnap, one of the philosophers of the Vienna Circle (1935). Bottom row: Patrick Suppes, professor of philosophy at Stanford; William Craig, professor of philosophy at Berkeley, both at the Tarski Symposium (1971).
23. Mostowski, A. Tarski, Alfred, entry in The Encyclopedia of Philosophy, (P. Edwards, ed.), New York: Macmillan Company and Free Press 8 (1967), 77-81.
24. Pogorzelski, W. A. and Surma, S. J. Review of Logic, semantics, metamathematics. Papers from 1923 to 1938, in J. Symbolic Logic 34 (1969), 99-106.
25. Rasiowa, H. and Sikorski, R. Algebraic treatment of the notion of satisfiability, Fund. Math. 40 (1953), 62-95.
26. --, On isomorphism of Lindenbaum algebras with fields of sets, Colloq. Math. 5 (1958), 143-158.
27. --, The mathematics of metamathematics, Monografie Matematyczne, 41, Warsaw: Państwowe Wydawnictwo Naukowe (1963).
28. Robinson, J. Definability and decision problems in arithmetic, J. Symbolic Logic 14 (1949), 98-114.
29. Scott, D. Completeness and axiomatizability in many-valued logic, Proceedings of the Tarski symposium, Proceedings of Symposia in Pure Math., 25, Providence, RI: Amer. Math. Soc. (1974), 411-435.
30. Szczerba, L. and Tarski, A. Metamathematical discussion of some affine geometries, Fund. Math. 104 (1979), 155-192.
31. Szmielew, W. Elementary properties of Abelian groups, Fund. Math. 41 (1955), 203-271.
32. Szmielew, W. and Tarski, A. Mutual interpretability of some essentially undecidable theories, Proceedings of the International Congress of Mathematicians, Cambridge, Massachusetts, U.S.A., August 30-September 6, 1950, 1, Providence, RI: Amer. Math. Soc. (1952), 734.
33. Tarski, A. Przyczynek do aksjomatyki zbioru dobrze uporządkowanego (A contribution to the axiomatic of well-ordered sets), Przegląd Filozoficzny (= Revue Philosophique) 24 (1921), 85-94.
34. --, O równoważności wielokątów (On the equivalence of polygons), Przegląd Matematyczno-Fizyczny 2 (1924), 47-60.
35. --, Über einige fundamentalen Begriffe der Metamathematik, Sprawozdania z Posiedzeń Towarzystwa Naukowego Warszawskiego, Wydział III Nauk Matematyczno-Fizycznych (= Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, Sciences Mathématiques et Physiques) 23 (1930), 22-29.
36. --, Fundamentale Begriffe der Methodologie der deduktiven Wissenschaften. I, Monatshefte für Mathematik und Physik 37 (1930), 361-404.
37. --, O stopniu równoważności wielokątów (On the degree of equivalence of polygons), Młody Matematyk 1 (suppl. to Parametr, 2) (1931), 37-44.
38. --, Sur les ensembles définissables de nombres réels. I, Fund. Math. 17 (1931), 210-239.
39. --, Uwagi o stopniu równoważności wielokątów (Remarks on the degree of equivalence of polygons), Parametr 2 (1931-32), 310-314.
40. --, Der Wahrheitsbegriff in den Sprachen der deduktiven Disziplinen, Akademie der Wissenschaften in Wien, Mathematisch-naturwissenschaftliche Klasse, Akademischer Anzeiger 69 (1932), 23-25.
41. --, Grundzüge des Systemenkalküls. Erster Teil, Fund. Math. 25 (1935), 503-526.
42. --, Der Wahrheitsbegriff in den formalisierten Sprachen, Studia Philosophica 1 (1935), 261-405.
43. --, Einige methodologische Untersuchungen über die Definierbarkeit der Begriffe, Erkenntnis 5 (1935), 80-100.
44. --, Grundzüge des Systemenkalküls. Zweiter Teil, Fund. Math. 26 (1936), 283-301.
45. --, Über den Begriff der logischen Folgerung, Actes du Congrès International de Philosophie Scientifique 7, Actualités Scientifiques et Industrielles, 394, Paris: Hermann et Cie (1936), 1-11.
46. --, On the calculus of relations, J. Symbolic Logic 6 (1941), 73-89.
47. --, A decision method for elementary algebra and geometry, (Prepared for publication by J. C. C. McKinsey), Santa Monica, CA: U.S. Air Force Project RAND, R-109, the RAND Corp. (1948).
48. --, Arithmetical classes and types of Boolean algebras, Bull. A.M.S. 55 (1949), 64 and 1192.
49. --, Arithmetical classes and types of algebraically closed and real-closed fields, Bull. A.M.S. 55 (1949), 64 and 1192.
50. --, Undecidability of the theories of lattices and projective geometries, J. Symbolic Logic 14 (1949), 77-78.
51. --, Some notions and methods on the borderline of algebra and metamathematics, Proceedings of the International Congress of Mathematicians, Cambridge, Massachusetts, U.S.A., August 30-September 6, 1950, 1, Providence, RI: Amer. Math. Soc. (1952), 705-720.
52. --, Contributions to the theory of models. I, Koninklijke Nederlandse Akademie van Wetenschappen, Proceedings, Series A, 57 (= Indagationes Mathematicae, 16) (1954), 572-581.
53. --, Contributions to the theory of models. II, Koninklijke Nederlandse Akademie van Wetenschappen, Proceedings, Series A, 57 (= Indagationes Mathematicae, 16) (1954), 582-588.
54. --, Contributions to the theory of models. III, Koninklijke Nederlandse Akademie van Wetenschappen, Proceedings, Series A, 58 (= Indagationes Mathematicae, 17) (1955), 56-64.
55. --, Remarks on direct products of commutative semigroups, Math. Scand. 5 (1957), 218-223.
56. --, What is elementary geometry? The axiomatic method, with special reference to geometry and physics, (L. Henkin, P. Suppes, and A. Tarski, eds.), Amsterdam: North-Holland Publishing (1959), 16-29.
57. --, Solution of the decision problem for the elementary theory of commutative semigroups, Not. A.M.S. 9 (1962), 205.
58. --, The elementary undecidability of pure transcendental extensions of real closed fields, Not. A.M.S. 10 (1963), 355.
59. --, Some decision problems for locally free commutative algebras, Not. A.M.S. 13 (1966), 634.
60. --, The completeness of elementary algebra and geometry, Paris: Institute Blaise Pascal (1967). (A reprint from page proofs of a work scheduled to appear in 1940 in Actualités Scientifiques et Industrielles, Hermann et Cie, Paris, but which did not appear due to the wartime conditions.)
61. --, An interpolation theorem for irredundant bases of closure structures, Discrete Math. 12 (1975), 185-192.
62. Tarski, A. and Givant, S. R. A formalization of set theory without variables, Colloquium Publications, 41, Providence, RI: Amer. Math. Soc. (1987).
63. Tarski, A., Mostowski, A. and Robinson, R. M. Undecidable theories, Amsterdam: North-Holland Publishing (1953).
64. Tarski, A. and Vaught, R. L. Arithmetical extensions of relational systems, Compositio Math. 13 (1957), 81-102.
65. Wagon, S. The circle squared, beyond refutation, Focus 9, no. 2 (1989), 1-2.
Department of Mathematics
Mills College
Oakland, CA 94613 USA
Descartes and the Philosophy of Mathematics
Anthony Lo Bello
The Life and Times of Descartes

Descartes was born on 31 March 1596 at La Haye, near Tours; the French authorities have quite properly renamed the place La Haye Descartes. He was given the name René, which was commonly bestowed on children whose mothers died in or shortly after childbirth. His was a well-to-do middle-class family; his father was a member of the Touraine parliament. This was during the convulsions of the French civil war, for at the assassination of Henry III in 1589, the throne fell to a Protestant, Henry de Bourbon, King of Navarre, and the French Catholics refused to accept a heretic to direct the affairs of the eldest daughter of the Roman Church. The whole business was soon settled, however, when Henry of Navarre sensibly decided that Paris was well worth a Mass and converted to Catholicism; he went on to be the most popular king in French history.

Descartes was sent away to school at the nearby Jesuit establishment at La Flèche; there he first studied mathematics from the textbooks of Peter Clavius, the most famous mathematician of his time. Clavius had acquired a great reputation from his edition of Euclid, in which he pointed out some omissions in the list of postulates and axioms. He was the first to use the decimal point. His chief accomplishment, however, and the real reason that Clavius is remembered, was the reform of the Julian Calendar he undertook for Pope Gregory XIII; the calendar we use today, the "Gregorian" Calendar, is due to him. This got Clavius into a lot of trouble, for some significant mathematicians, François Viète for example, did not like Clavius's calendar. Furthermore, in order to catch up with the sun, Clavius had ordered that 4 October 1582 was to be followed by 15 October 1582; the people of Frankfurt rioted against mathematicians and
the Pope, who were in cahoots to rob them of ten days. Clavius was sorely dismayed by the controversy, though he got the better of his adversaries.

While Descartes was learning mathematics from the books of Clavius, he developed a certain life-style at school that he was not to change until 1650, when the change resulted in his death. He petitioned the Most Reverend Head Master of the school, Fr. Charlet, that he be allowed to remain in bed until 11 o'clock and be exempted from all the morning's classes and activities. Surely this must strike a sympathetic chord in the hearts of all students who dislike early morning classes. Descartes' excuse was that he was too sickly to get out of bed so early, and that it was more beneficial for him to remain there and think until 11 a.m. The Jesuits granted him the requested dispensation, and
THE MATHEMATICAL INTELLIGENCER VOL. 13, NO. 3 © 1991 Springer-Verlag New York
for the rest of his life he stayed in bed thinking until 11 a.m., and it was during those hours that he produced all his mathematical and philosophical works. If one wants to accomplish anything in life, one has to make time for it, an inviolable block of time, and Descartes was of the opinion that his thinking was more important than attending the performances of his mediocre instructors. One makes time for what one really wants.

Descartes did not go on to attend a university. In those days it was not necessary to get a degree in order to function as a mathematician or anything else; if you wanted to be a doctor, you just put a sign on your door "I am a doctor" and that was sufficient. The mathematician Euler's first job was as a Professor of Medicine at St. Petersburg in Russia. The Master's Degree was a license to teach, but Descartes had no intention of teaching, and I cannot think of many great mathematicians of his time who did.

Teaching at a university then was entirely different from what it is today. It is even more amusing than instructive to point out some of the differences. In many cases, the professor was only paid by the students who attended his lectures; the greater his following, the greater his income. Since there were no grades, the students flocked to the better rather than the easier instructors; there was no need to weed out the poor teachers. They mostly starved, since everyone went to the cow that gave milk. In some places, like the University of Basel, certain positions were filled not by search committees but by God; the names of all qualified applicants (those who had submitted a decent learned paper along with their application) were put into a bag, and a name was drawn out at random to fill the vacancy. This was why Euler had to leave Basel and go to Russia; the vacancy in the Physics Department for which he had applied was assigned to someone else by the laws of probability.
Once you got a position, there was often no such thing as tenure to protect you. In France, for example, no matter what your seniority, you could be challenged for your post by any newcomer, and if he defeated you in a public competition, you were out. The great Roberval, who found the area under one arch of the cycloid, held the record in France for the longest tenure under these trying circumstances; at the Royal Collège de France he defeated all comers for about 40 years. In any case, Descartes determined not to become a teacher, and instead joined the army of the Dutch Prince of Orange. As a result of this military career, Descartes was able to write the first book on dueling, which was his only work not to be put on the List of Prohibited Books after his death. When the Thirty Years' War broke out in 1618, he transferred his allegiance to the Catholic army of the Duke of Bavaria. The war broke out because people were itching for a fight; the immediate cause was an insult offered by some Protestant Bohemians to the Catholic Holy Roman Emperor. They threw his emissaries out of a window. The emissaries were not physically harmed since they landed in a pile of manure, but this so-called "Defenestration of Prague" was a sufficient casus belli. Descartes eventually left the Bavarian army and joined the forces of Cardinal Richelieu, Prime Minister of France, which were reducing the Huguenot fortress of La Rochelle. After this, he retired from military life. It is not known how many people he personally killed. Descartes spent some time travelling about on religious pilgrimages; in particular, he went to inspect the Holy House of Loreto, and visited Rome to win the Jubilee indulgence of the Holy Year of 1625. He was a
pious Catholic, and it was his major concern later in life to provide a rational basis for his religious belief. When he returned to Paris, he took the advice of his mentor, Cardinal Bérulle, and decided to devote the rest of his life to learning. Since he was determined to begin by doubting everything, he thought it advisable to leave France, whose people love controversy and where it was dangerous to doubt, and go to some country where almost everything was tolerated. Just as today, so also in the seventeenth century, that country was Holland. The people there were too busy making money to care about his skepticism. Descartes removed thither in 1629 and stayed for twenty years. He was painted there by Hals, and the portraits are the most famous of any mathematician ever made. One can see from the Copenhagen and Louvre pictures that he dressed perpetually in dreary black, and that if the Dutch barbers prospered, it was not due to his business. One is reminded of Raphael's Plato. Descartes moved his household at least once a year in order not to be bothered by the inconveniencing courtesies of society. Only one man knew where to find him at all times, his friend Fr. Mersenne. He gave his address to no one else. He infuriated his visitors by refusing to get out of bed if they arrived before 11 a.m.

This all changed in 1649. In that year, Descartes received a message from Queen Christina of the Goths and Vandals (i.e., Sweden), who was fascinated by his books, to come and visit her in Stockholm. This was asking quite a lot, since Sweden was out of the world, though not so much as it is today. Queen Christina was in trouble because, having read in Descartes that she should doubt everything, she began to doubt whether the Swedish Lutheran state religion was right, and it was illegal for her to do so.
Eventually she had to abdicate and go into exile in Rome. At any rate, she wanted Descartes to come to Stockholm, teach her
philosophy and geometry, and organize a Royal Academy of Science. Perhaps Descartes was flattered by this attention, because he accepted the invitation, and it was the end of him. The Queen wanted to study his Method and draw her tangent lines at 5 a.m. when she got up, not at 11 a.m. when he did. Descartes got run down by this new regime, and after catching pneumonia walking to the palace every early winter morning, he died on 11 February 1650.

I want to pause here to note that in his Short Account of the History of Mathematics, the English historian of science W. W. Rouse Ball wrote (p. 271) of Descartes, "In disposition, he was cold and selfish." I do not see how Ball could have known this. Probably he copied it from the reminiscences of one of Descartes's contemporaries who did not like him. As a stubborn individual who knew what he wanted and went after it, Descartes must have aggravated many who were annoyed that he had no time for them. These characters then called him "cold and selfish." More damaging is the fact that Descartes considered it right to enlist in the Dutch and Bavarian armies and take part in wars that did not concern him in the remotest way.

The Discourse on Method
Rather than dissipate my time in making a very few observations about each of Descartes's several works, I prefer to devote all the remaining time to one book, The Discourse on Method, the examination of which is sufficient for those who are content to know something about Descartes and his philosophy of mathematics rather than everything about them. The book appeared anonymously, in case any of its doctrines should offend the authorities. It was written in French to underline the fact that it was revolutionary and had something to offer even those people who could not read Latin, for Descartes would not have agreed with Schopenhauer, who wrote in his Essay on Latin two centuries later that "he who does not know Latin is a fool, even if he is a virtuoso on the electric machine and has the base of hydrofluoric acid in his crucible." The title of the work, Discourse on Method, emphasized that he was offering a plan, a well thought out systematic way of acquiring knowledge and then of organizing that knowledge into science. Without the discipline of a method, one could not expect to find the truth. Descartes divided the Discourse into six books so that, he said, his readers could take it leisurely in six installments; however, it is not so long, a mere 50 printed pages. It is a masterpiece of seventeenth-century French prose. In the first part, Descartes tells how, having studied the usual subjects at school and having travelled over much of Europe to read in what he calls "the great book of the world," he had concluded that among "the diverse actions and enterprises of all mankind, I
The portrait of René Descartes by Frans Hals (1580-1666) is on permanent loan to the Royal Museum of Fine Arts, Copenhagen, from the Ny Carlsberg Glyptotek (Inv. No. Dep. 7). An inferior likeness of unknown authorship may be seen in the Louvre, Paris. Photo: Hans Petersen.
find scarcely any which do not seem to me vain and useless." He therefore decided to turn his mind in on itself and to make himself the object of his study. He was more at home in and by nature more suited to the mental world of ideas rather than the physical world without. He gave evidence of the Platonic predilection for mathematics and noted that of all his school studies,

Most of all was I delighted with mathematics because of the certainty of its demonstrations and the evidence of its reasoning.

So, the key to understanding Descartes is that he liked mathematics and that mathematics appeared to him not just one subject among many, nor even first among equals, but definitely special. In Part II, he tells of his mystical experience in the stove-heated room, where God appeared to inspire him to begin from scratch:

As regards all the opinions which up to this time I had embraced, I thought I could not do better than endeavor once and for all to sweep them completely away and to start all over.

Descartes was one of those people who are obsessed with wanting to be absolutely certain. Such people must almost surely be disappointed, and Descartes was careful not to recommend his plan for public consumption:

The simple resolve to strip oneself of all opinions and beliefs formerly received is not to be regarded as an example that each man should follow.

He thought, though, that he might be the exception and end up the better for it, and he was at least sure that in going his own way he would not succumb to those errors mankind had adopted by unanimous consent:

The voice of the majority does not afford a proof of any value in truths a little difficult to discover, because such truths are much more likely to have been discovered by one man than by a nation.

Descartes then goes on to explain the method of four parts that he adopted as an infallible procedure for discovering the truth. He came upon it by observing how mathematicians go about their art; mathematics for him provided the correct method of reasoning and seeking for truth in all subjects. The four parts are:

1) To accept nothing as true that he did not clearly recognize to be so;
2) Divide and conquer; to break each big problem up into many smaller ones;
3) To proceed mathematically in solving the smaller problems, that is, from the simplest to the more complex, one at a time according to their order;
4) To check all his work to catch any error of omission or commission.

This method was sure to work, he believed, because

Those long chains of reasoning, simple and easy as they are, of which geometricians make use in order to arrive at the most difficult demonstrations, had caused me to imagine that all those things which fall under the cognizance of man might very likely be mutually related in the same fashion.

He concludes the section by observing that he was twenty-three years old when he came up with this plan.
Descartes begins Part III by observing that because he could not postpone living until he arrived at the truth he was after, he determined to live for the time being according to a reasonable moral code, which also had four parts:

1) To obey the laws and customs of his country, and to adhere to its religion;
2) Once he had decided to do something, to be firm and resolute in doing it;
3) To try always to conquer himself rather than fortune, and to alter his desires rather than change the order of the world;
4) To review all the occupations of men in his life in order to determine the best for him, but meanwhile to continue in his own, namely, thinking.
He then describes how in his travels he viewed all the comedies that the world displays before withdrawing to Holland to live as quietly as a hermit in deserts the most remote. In Part IV, Descartes explained that though he could doubt everything else, he could not doubt that he who was thinking existed, and he arrived at the first principle of his philosophy, COGITO ERGO SUM: I think, therefore I am. He then proceeded to the highest speculations:

I saw from the very fact that I thought of doubting the truth of other things, that it very evidently and certainly followed that I was; on the other hand if I had only ceased from thinking, even if all the rest of what I have ever imagined had really existed, I should have no reason for thinking that I had existed. From this I knew that I was a substance the whole essence or nature of which is to think, and that for its existence there is no need for any place, nor does it depend on any material thing; so that this "me," that is to say, the soul by which I am what I am, is entirely distinct from body, and is even more easy to know than is the latter; and even if the body were not, the soul would not cease to be what it is.

He then describes how his mind conceived clearly and distinctly of an all-perfect being, and since for it not to exist would be an imperfection in it, it had to exist: The existence of the perfect being was implied in the idea of God just as, he says, the fact that the sum of the angles of a triangle is 180° is implied in the idea of a triangle. The existence of God is therefore as certain as the results of mathematics; it is much more certain than the existence of the physical world, which may be an illusion, like something we see in a dream.
In fact, instead of proving the existence of God from design in nature (which John Stuart Mill said was the only argument with possibilities), he proved that the physical world existed from the existence of God, because God would not deceive us. Thus, for Descartes, unlike for most philosophers, the existence of the physical world is more difficult to establish than the immortality of the soul and the existence of God, and in fact cannot be established without first proving that Deity exists. He turns the usual order of things upside-down.

Part V begins with a review of all the theorems about the world that Descartes was able to prove using his method. The physical world that we live in, he says, obeys laws that follow directly from the attributes of God; they are necessary, so that, in a sense, we have here the idea that this is the only possible world:

Even if God had created other worlds, He could not have created any in which these laws would fail to be observed.

The laws of nature, then, follow from the perfection of Deity, a proposition that John Stuart Mill was to attack in his Essay on Nature. These laws are mathematical, and any other world that God created would turn out
to be exactly like this one we now have. God did not need to create the world exactly as we now see it; it would have evolved thus even if God had only produced the chaotic matter and allowed the laws to act upon it, but God did so in order to save time. Descartes then goes on to treat in some detail the functioning of the human heart, asking his reader to dissect the heart and lungs of a great mammal as they proceed through his description. The section ends with an account of how the soul of a man differs from that of an animal, namely, the man's has reason, something independent of body and therefore not mortal, i.e., immortal.

For next to the error of those who deny God, which I think I have already sufficiently refuted, there is none which is more effectual in leading feeble spirits from the straight path of virtue, than to imagine that the soul of the brute is of the same nature as our own, and that in consequence, after this life we have nothing to fear or to hope for, any more than the flies and ants.

Like St. Paul, Descartes was one of those people who were obsessed with death. He just could not believe that his mind could stop thinking, any more than there could cease to be circles and triangles. As for that reason which Descartes says distinguishes the soul of man from that of an animal, what is the sign of it? The sign of reason, according to Aristippus, the Socratic philosopher, was mathematics.

Finally, in Part VI, Descartes tells how he had delayed the publication of his scientific discoveries when he heard of the condemnation of Galileo, lest any of the opinions he expressed be found offensive by the authorities. He was tempted to change his mind when he realized that by keeping his method to himself, he was holding up the advancement of the human race, which would benefit from the truths his procedures made it possible to discover.
Should he, for the good of humanity, allow his treatise to be published, and invite all men of learning to adopt his method and communicate to him the various discoveries that they should make by using it, so that he might circulate them to all? Indeed, he hoped for significant discoveries, especially in medicine, which he considered the only real hope for the improvement of the human condition:

The mind depends so much on the temperament and disposition of the bodily organs that, if it is possible to find a means of rendering men wiser and cleverer than they have hitherto been, I believe that it is in medicine that it must be sought.

No, Descartes finally decided he should not go public because 1) the inevitable controversies that his writings would arouse would disturb the peace and quiet he required for further progress, 2) the contributions of others would probably be full of mistakes and superfluities, and 3) there is no better way to insure progress in science than to let the individual genius
alone and encourage him by protecting his precious leisure from the importunities of others. Nevertheless, as a sort of compromise, he relented and published three scientific appendices, on meteors, on optics, and on geometry, because 1) he did need to interest other scientists in helping him with necessary experiments and 2) he did not want to make people think that he was keeping quiet because he had something criminal to hide.
A Different Point of View

Descartes's thesis that the method of mathematics was universally valid and necessary for arriving at the truth in all the sciences was never received by all those competent to hold an opinion in such matters, nor have all philosophers ever agreed that mathematics was essential to strengthen, refine, and enrich the intellectual powers of students. Cardinal Newman, for example, in The Idea of a University, taught that the perusal of the poets, historians, and philosophers of ancient Greece and Rome, i.e., the Classics, will best accomplish this latter purpose, and that each branch of knowledge has its own method of reasoning and inquiring, that these methods are contrary the one to the other, and that controversy arises when the practitioners of one science attempt to impose its method on another. To prevent this aggression, he assigned to philosophy the authority to determine the method proper to each branch of knowledge and the precise boundaries of its subject matter. Mathematics did not play a conspicuous role in Newman's treatise, though the mathematical quadrivium provided four of the seven liberal arts of the ancient Roman system.

Those readers who are intrigued by the speculations discussed in this essay would do well to begin their investigations by reading the Republic, Timaeus, and Meno of Plato and, of course, the Discourse on Method by Descartes. For the history of philosophy, they might examine Will Durant's The Story of Philosophy, which made its author a millionaire.
Further Reading

W. W. Rouse Ball, A Short Account of the History of Mathematics, New York: Dover Publications, Inc. (1960).
Sir Kenneth Clark, Civilization, a Personal View (illustrated abridged transcript of the PBS television series), New York: Harper and Row (1969).
René Descartes, Discourse on Method, in Great Books of the Western World, vol. 31, translated by E. S. Haldane and G. R. T. Ross, Chicago: Encyclopedia Britannica Inc. (1952).
Will Durant, The Story of Philosophy, New York: Simon and Schuster (1927), 20th printing.
Department of Mathematics
Allegheny College
Meadville, PA 16335 USA
Circuits in Directed Grids Joseph A. Gallian
We shall have to evolve problem-solvers galore - since each problem they solve creates ten problems more.
The Only Solution, Piet Hein

About ten years ago I happened across a paper by Tom Trotter and Paul Erdős [8] that caught my fancy. Consider the diagram in Figure 1. Is it possible to find a Hamiltonian circuit for this digraph? That is, can we begin at the upper left vertex, say, and following the arrows, visit every vertex exactly once and return to the starting position? More generally, Trotter and Erdős investigated the analogous problem for rectangular grids with any number of rows and columns. Such a grid with m rows and n columns is called the Cayley digraph of the group Z_m × Z_n with generating set {(1,0),(0,1)}.

This graph theory problem had great appeal to me. For one thing, its solution involved group theory and number theory, two topics I enjoy. But most of all, it was what Trotter and Erdős did not do that intrigued me. When a Hamiltonian circuit through all vertices does not exist, is there one through all vertices save one? If not, then how about all but two? More generally, what is the circuit of maximum length in the Cayley digraph of Z_m × Z_n? More generally still, what are the lengths of all possible circuits? Or suppose we permit exactly one vertex to be visited twice and all others exactly once. Why restrict ourselves to the generators (1,0) and (0,1)? What about higher dimensions? How about groups other than cyclic ones? Over a period of eight summers these questions would occupy me and a succession of exceptional undergraduate students I recruited from throughout the
United States to work with me. The students were participants in summer undergraduate research programs sponsored by the National Science Foundation. I discovered some of the students through their performance on the Putnam Competition, while others discovered me from information I sent to Putnam Competition supervisors at their schools. In this article I survey the fruits of this labor. For any group G and set of generators S of G we define the digraph Cay(S : G) as follows:
Figure 1. Cayley digraph of Z_3 × Z_4.
Figure 2. Cayley digraph of the alternating group of degree 4 with generators (12)(34) and (123).
Figure 3. Cayley digraph of the symmetric group of degree 4 with generators (123) and (34).
Figure 4. Cayley digraph of the quaternion group of order 8 with generators a and b satisfying a^4 = e, a^2 = b^2, b^{-1}ab = a^{-1}.
1. The elements of G are the vertices of Cay(S : G).
2. For x and y in G, there is an arc from x to y if and only if xs = y for some s in S.

Figures 2, 3 and 4 show the Cayley digraphs of some familiar groups. More are given in [3, ch. 31] and [1]. Figure 1 is the digraph Cay({(1,0),(0,1)} : Z_3 × Z_4). Let us return to the question answered by Trotter and Erdős. (The formulation given below is due to Steve Curran; see [11].)

THEOREM 1 [8]. Cay({(1,0),(0,1)} : Z_m × Z_n) has a Hamiltonian circuit if and only if there exist relatively prime positive integers a and b such that am + bn = mn.

The numbers a and b have topological significance. With Cay({(1,0),(0,1)} : Z_m × Z_n) naturally embedded on a torus as illustrated in Figure 5, a is the number of times the circuit wraps around the torus longitudinally and b is the number of times it wraps around meridionally. Interestingly, Trotter and Erdős gave a group-theoretic proof of Theorem 1, while Curran gave a knot-theoretic one.

Figure 5. Cayley digraph of Z_7 × Zs embedded on a torus.

When Cay({(1,0),(0,1)} : Z_m × Z_n) does not have a Hamiltonian circuit, it is natural to ask for the length of the longest circuit, or indeed the lengths of all circuits. In 1981 I put this question to Larry Penn and David Witte, who, using Curran's knot techniques, answered as follows.

THEOREM 2 [7]. Cay({(1,0),(0,1)} : Z_m × Z_n) has a circuit of length r if and only if there exist relatively prime positive integers a and b such that am + bn = r.
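The definition of Cay(S : G) is concrete enough to experiment with on small grids. The following is a minimal brute-force sketch in Python, not the method of any of the cited papers; the function names (`cayley_digraph`, `circuit_lengths`, `has_hamiltonian_circuit`) are hypothetical, and the search is exponential, so it is only practical for a few dozen vertices.

```python
from itertools import product

GENERATORS = ((1, 0), (0, 1))  # the standard generating set used in the article


def cayley_digraph(m, n, gens=GENERATORS):
    """Map each vertex of Z_m x Z_n to the heads of its outgoing arcs x -> x + s."""
    return {
        (i, j): [((i + di) % m, (j + dj) % n) for di, dj in gens]
        for i, j in product(range(m), range(n))
    }


def circuit_lengths(m, n):
    """Lengths of all circuits, found by exhaustive depth-first search.

    Translations are automorphisms of the digraph, so it is enough to
    look at circuits through the vertex (0, 0).
    """
    arcs = cayley_digraph(m, n)
    start = (0, 0)
    lengths = set()

    def extend(v, visited):
        for w in arcs[v]:
            if w == start:
                lengths.add(len(visited))  # the arc back to start closes a circuit
            elif w not in visited:
                visited.add(w)
                extend(w, visited)
                visited.remove(w)

    extend(start, {start})
    return lengths


def has_hamiltonian_circuit(m, n):
    """True if some circuit visits all m * n vertices."""
    return m * n in circuit_lengths(m, n)
```

Calling `has_hamiltonian_circuit(3, 4)` answers the question posed for Figure 1 by direct search, and `sorted(circuit_lengths(m, n))` can be compared with the arithmetic conditions in Theorems 1 and 2.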
Figure 6. Circuit of length 19 in Z_3 × Z_7.

Figure 7. Closed walk of length 22 in Z_3 × Z_7.
Notice that the Trotter-Erdős result is simply the case that r = mn. Let's look at Z_3 × Z_7. According to Penn and Witte the lengths of the circuits are 1·3 + 0·7 = 3; 0·3 + 1·7 = 7; 1·3 + 1·7 = 10; 2·3 + 1·7 = 13; 3·3 + 1·7 = 16; 1·3 + 2·7 = 17; and 4·3 + 1·7 = 19. Figure 6 shows the circuit of length 19. Knowing that it is impossible to find a circuit in Z_3 × Z_7 passing through all 21 vertices, I decided that the next best thing would be to find a closed walk that passes through 20 vertices exactly once and the remaining vertex exactly twice. The general version of this problem was solved by Witte and me.
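The listing for Z_3 × Z_7 can be reproduced from the arithmetic condition alone. The sketch below is an illustration in Python with a hypothetical function name. Because the listing itself includes 1·3 + 0·7 and 0·3 + 1·7, the sketch allows a or b to be zero; that is an assumption about how the condition is meant to be read. It also restricts r to at most mn, since a circuit cannot be longer than the number of vertices.

```python
from math import gcd


def circuit_lengths_from_condition(m, n):
    """All r <= m*n of the form a*m + b*n with gcd(a, b) = 1 and a, b >= 0.

    Assumption: zero coefficients are allowed, as in the article's
    listing for Z_3 x Z_7 (gcd(1, 0) = gcd(0, 1) = 1).
    """
    lengths = set()
    for a in range(n + 1):          # a*m <= m*n forces a <= n
        for b in range(m + 1):      # b*n <= m*n forces b <= m
            r = a * m + b * n
            if 0 < r <= m * n and gcd(a, b) == 1:
                lengths.add(r)
    return sorted(lengths)


print(circuit_lengths_from_condition(3, 7))  # [3, 7, 10, 13, 16, 17, 19]
```

The value mn = 21 is absent from the output, which matches the statement that no circuit passes through all 21 vertices.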
THEOREM 3 [4]. Cay({(1,0),(0,1)} : Z_m × Z_n) has a closed walk passing through one vertex exactly twice and all others exactly once if and only if there are positive integers a and b such that am + bn = mn + 1 and gcd(a, b) ≤ 2.

For a Hamiltonian circuit to exist it is necessary that gcd(m, n) > 1; the circuit is never unique. On the other hand, for the closed walk described in Theorem 3 to exist it is necessary that gcd(m, n) = 1; the walk is unique. Figure 7 illustrates Theorem 3 for Z_3 × Z_7. Theorems 1 and 3 reveal that Z_5 × Z_7 has neither a Hamiltonian circuit nor a closed walk passing through 34 vertices exactly once and the remaining vertex exactly twice. Well then, we again ask what would the next best thing be? Perhaps a closed walk passing through 33 vertices exactly once and two vertices exactly twice. Indeed there is such a walk. Motivated by this example, in 1985 I asked Doug Jungreis to investigate the following generalizations of Theorems 1 and 3.

THEOREM 4 [5]. Cay({(1,0),(0,1)} : Z_m × Z_n) has a closed walk passing through r (r ≥ 0) vertices exactly twice and all others exactly once if and only if there are positive integers a and b such that am + bn = mn + r and gcd(a, b) [...].
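The two examples discussed around Theorem 3 can be checked with a few lines of arithmetic. This Python sketch uses a hypothetical function name and takes the gcd condition of Theorem 3 to be gcd(a, b) ≤ 2, a reading that is consistent with both examples in the text (Z_3 × Z_7 admits the walk, Z_5 × Z_7 does not).

```python
from math import gcd


def theorem_3_witnesses(m, n):
    """Pairs (a, b) of positive integers with a*m + b*n = m*n + 1 and gcd(a, b) <= 2.

    Assumption: the gcd condition of Theorem 3 is read as gcd(a, b) <= 2.
    """
    target = m * n + 1
    witnesses = []
    for a in range(1, target // m + 1):
        remainder = target - a * m
        if remainder > 0 and remainder % n == 0:
            b = remainder // n
            if gcd(a, b) <= 2:
                witnesses.append((a, b))
    return witnesses


print(theorem_3_witnesses(3, 7))  # [(5, 1)]
print(theorem_3_witnesses(5, 7))  # []
```

For Z_5 × Z_7 the only positive solution of 5a + 7b = 36 is a = b = 3, whose gcd is 3, which is why the list comes back empty.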
Figure 8. Cayley digraph of Z_3 × Z_3 decomposed into two Hamiltonian circuits.

Figure 9. Cayley digraph of D_4 × Z_3.

Surely these results put the subject of closed walks on Z_m × Z_n to rest. Not so! In 1987, Amie Wilkinson extended Theorems 2 and 4 to cover all two-generated Abelian groups over all two-element generating sets and even certain three-element generating sets [9]. Often when one keeps getting more and more generalizations of an initially simple idea, it pays to return to the original source and head off in another direction. Look at the Hamiltonian circuit for Cay({(1,0),(0,1)} : Z_3 × Z_3) shown in heavy ink in Figure 8. Notice that the complementary arcs themselves constitute a Hamiltonian circuit so that the Cayley digraph is the disjoint union of two Hamiltonian circuits. This observation led me to ask John Lindgren and Kevin Keating when this happens in general. Their answer is our next result.
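The observation about Z_3 × Z_3 can be reproduced by exhaustive search on small grids. The sketch below is a brute-force illustration in Python, not the argument of Lindgren and Keating; the function name is hypothetical, and it assumes m, n ≥ 2 so that the two arcs leaving a vertex have different heads.

```python
def decomposing_circuits(m, n):
    """Hamiltonian circuits of Cay({(1,0),(0,1)} : Z_m x Z_n) whose unused arcs
    also form a Hamiltonian circuit. Assumes m, n >= 2.

    A circuit is recorded as a dict: vertex -> index (0 or 1) of the
    generator used on the arc leaving that vertex.
    """
    gens = ((1, 0), (0, 1))
    total = m * n
    start = (0, 0)

    def step(v, g):
        return ((v[0] + gens[g][0]) % m, (v[1] + gens[g][1]) % n)

    def is_hamiltonian(choice):
        # Follow the chosen arcs from start; the first return to start
        # must come after exactly `total` steps.
        v = start
        for k in range(1, total + 1):
            v = step(v, choice[v])
            if v == start:
                return k == total
        return False

    found = []

    def extend(v, choice):
        for g in (0, 1):
            choice[v] = g
            w = step(v, g)
            if w == start:
                if len(choice) == total:
                    # Every vertex has exactly two outgoing arcs, so the
                    # complement uses the other generator at each vertex.
                    complement = {u: 1 - c for u, c in choice.items()}
                    if is_hamiltonian(complement):
                        found.append(dict(choice))
            elif w not in choice:
                extend(w, choice)
        del choice[v]

    extend(start, {})
    return found
```

A nonempty result for `decomposing_circuits(3, 3)` corresponds to the decomposition drawn in Figure 8, while an empty result means that no Hamiltonian circuit has a Hamiltonian complement.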
THEOREM 5 [6]. Cay({(1,0),(0,1)} : Z_m × Z_n) can be decomposed into two disjoint Hamiltonian circuits if and only if there are positive integers u and v such that u + v = gcd(m, n) and gcd(uv, mn) = 1.

Another direction to take is to consider higher-dimensional analogs. In a tour de force, Curran and Witte [2] proved that Hamiltonian circuits always exist for Z_{n_1} × Z_{n_2} × ... × Z_{n_k} (k ≥ 3) with the standard generating set. Their proof involved knot theory, group theory, and number theory. The above results pretty well exhaust the possibilities for direct products of cyclic groups. So let us move on to other groups. Let D_m = ⟨a, b | a^m = b^2 = e, b^{-1}ab = a^{-1}⟩ (the dihedral group of order 2m) and consider Cay({(a,0),(b,0),(0,1)} : D_m × Z_n). Figure 9 shows D_4 × Z_3. In 1981 Witte, Gail Letzter and I [11] proved that these Cayley digraphs always have Hamiltonian circuits. Figure 10 shows the circuit for D_4 × Z_3. In the same paper we obtained the corresponding result for Q_m × Z_n, where Q_m is the dicyclic group ⟨a, b | a^{2m} = e, a^m = b^2, b^{-1}ab = a^{-1}⟩ of order 4m (m > 1). (When m = 2, the dicyclic group is also known as the quaternions; see Figure 4.) Figure 11 shows a circuit for Q_2 × Zs.

I could go on to discuss products of other groups or even semidirect products or wreath products of groups. There are also graph-theoretic products other than the cartesian product ([10]). However, I have said enough to demonstrate why the simple problem solved by Trotter and Erdős had such appeal to me.

Acknowledgment: This paper was written while the author was partially supported by the National Science Foundation (DMS 9000742) and the National Security Agency (MDA 904-88-H-2027).

References

1. F. Budden, Cayley graphs for some well-known groups, Math. Gazette 60 (1985), 271-278.
2. S. J. Curran and D. Witte, Hamiltonian paths in cartesian products of directed cycles, Ann. Discrete Math. 27 (1985), 35-74.
3. J. A. Gallian, Contemporary Abstract Algebra, 2nd ed. Lexington: D. C. Heath (1990).
4. J. A. Gallian and D. Witte, When the cartesian product of two directed cycles is hyperhamiltonian, J. Graph Theory 11 (1987), 21-24.
5. D. S. Jungreis, Generalized Hamiltonian circuits in the cartesian product of two directed cycles, J. Graph Theory 12 (1988), 113-120.
6. K. Keating, Multiple-ply Hamiltonian graphs and digraphs, Ann. Discrete Math. 27 (1985), 81-88.
7. L. Penn and D. Witte, When the cartesian product of two directed cycles is hypohamiltonian, J. Graph Theory 7 (1983), 441-443.
8. W. T. Trotter, Jr. and P. Erdős, When the cartesian product of directed cycles is Hamiltonian, J. Graph Theory 2 (1978), 137-142.
9. A. M. Wilkinson, Circuits in Cayley digraphs of Abelian groups, J. Graph Theory 14 (1990), 111-116.
10. D. Witte and J. A. Gallian, A survey: Hamiltonian cycles in Cayley graphs, Discrete Math. 51 (1984), 293-304.
11. D. Witte, G. Letzter, and J. A. Gallian, On Hamiltonian circuits in cartesian products of Cayley digraphs, Discrete Math. 43 (1983), 297-307.

Department of Mathematics
University of Minnesota, Duluth
Duluth, MN 55812 USA
Figure 10. A Hamiltonian circuit in D_4 × Z_3.

Figure 11. A Hamiltonian circuit in Q_2 × Zs.
A Fractal Puzzle G. A. Edgar
Figure 1 is a picture of Barnsley's wreath (see [1], page 189). This is a fractal subset of the plane. When I started to think about the computation of its Hausdorff dimension, it led to the construction of a puzzle. First, I will describe the puzzle, then I will make a few remarks about this fractal and its Hausdorff dimension.
Figure 1. Barnsley's wreath.

Figure 2.

The Puzzle

Figure 2 shows the 14 different cards that are (potentially) to be used in the puzzle. Each one is an equilateral triangle with side 5 cm. The back of each card shows the mirror image of the front. There are in fact several versions of the puzzle.

The trivial version of the puzzle uses six copies of card A. They are to be assembled into a picture of the Barnsley wreath that is 10 cm wide.

The elementary version of the puzzle uses 24 cards (12 copies of card A, 6 copies of card B, and 6 blank cards Z). They are to be assembled into a picture of the wreath that is 20 cm wide.

The novice version of the puzzle uses 96 cards. (Each version has four times the number of cards as the previous version.) The cards to be used are: 30 copies of A, 18 B, 12 C, and 36 Z. They are to be assembled into a picture of the wreath that is 40 cm wide. Figure 3, which shows a group of 16 cards (front and back views), provides some help for making this version of the puzzle. Make six photocopies of Figure 3, glue them onto some heavier cardboard, and cut the cards out. (I will leave the exact details to be determined by your experience and art supplies. I used rubber cement to glue the front faces on, cut them out with a razor blade, then glued the back faces on.)

The intermediate version of the puzzle uses 384 cards, and produces a picture 80 cm wide. Use: 78 A, 48 B, 36 C, 24 D, 12 E, 186 Z.

As the version increases, the percentage of blank cards to be used increases (to 100 percent in the limit). This is because the Hausdorff dimension of the wreath is strictly less than 2, so it has "area" zero; according to Lebesgue measure, it occupies a negligible portion of the plane. Specification of the advanced versions of the puzzle is left as an exercise for the reader.

If you like puzzles, spend some time working on the novice version of the puzzle before reading on. There are some "spoilers" below.
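The card counts for successive versions can be generated mechanically. The Python sketch below uses the two subdivision rules stated in the article's section on iterated function systems (card A splits into two A, one B and one Z; card B splits into one A, one B and two C) and adds the assumption that a blank card Z splits into four blanks. Rules for C and later cards are not stated in this excerpt, so the sketch deliberately stops when it meets one. The names `RULES` and `subdivide` are hypothetical.

```python
from collections import Counter

# Subdivision of one card into four half-size cards.
# A and B are as stated in the article; Z -> 4 Z is an assumption
# (a blank triangle is taken to split into four blank triangles).
RULES = {
    "A": Counter({"A": 2, "B": 1, "Z": 1}),
    "B": Counter({"A": 1, "B": 1, "C": 2}),
    "Z": Counter({"Z": 4}),
}


def subdivide(cards):
    """Card counts for the next version of the puzzle."""
    result = Counter()
    for card, count in cards.items():
        if card not in RULES:
            raise KeyError(f"no subdivision rule known for card {card!r}")
        for piece, k in RULES[card].items():
            result[piece] += count * k
    return result


trivial = Counter({"A": 6})
elementary = subdivide(trivial)
novice = subdivide(elementary)
for name, cards in [("trivial", trivial), ("elementary", elementary), ("novice", novice)]:
    total = sum(cards.values())
    print(name, dict(cards), f"blank fraction {cards['Z'] / total:.3f}")
```

The counts produced for the elementary and novice versions agree with the ones listed above, and the blank fraction rises from 0 to 0.25 to 0.375; for the intermediate version the listed counts give 186/384, about 0.484.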
Figure 3. Cards for use in the novice version.
Figure 4. Magnifying a fractal.
Figure 5. Decompositions lead to easy but dull solutions.
Iterated Function System
Fractal geometry has been promoted in recent years by B. Mandelbrot (see [7], for example). The sets that occur in classical geometry become very simple when they are magnified. For example, if a differentiable curve is magnified enough, it is indistinguishable from a straight line. Fractals, however, often do not have this property. Regardless of how much it is magnified, a fractal may not appear any simpler than it appeared at first. In fact, the magnified region may be the same as the set itself.

This can be illustrated using the cards in Figure 2. (See Figure 4.) Six copies of card A constitute the wreath itself. Each of the triangular cards can be divided into four triangles by joining the midpoints of the sides. The part of the wreath inside each of these four small triangles is a half-size version of one of the cards. For example (Figure 5), card A is decomposed into two copies of card A, and one copy each of cards B and Z. Card B is decomposed into one card A, one card B, and two cards C. (These decompositions provide the "easy" way to solve the puzzles, above. But of course the easy way is the dull way.) No matter how much we magnify a small region of the wreath, it will look exactly like one of these cards. So it will not get simpler with magnification.

One of the best ways for constructing fractals has been promoted in recent years by M. Barnsley (see [1]). This is the method known as the "iterated function system." Barnsley's wreath may be specified in this way. We begin with a regular hexagon V. Using V we will specify 6 transformations $f_1, f_2, \ldots, f_6$ of the plane into itself. All of the transformations are similarities; they act as shown in Figure 6. The first three similarities rotate by 180 degrees and shrink by a factor of 1/2. The last three similarities rotate by 180 degrees and shrink by a factor of 1/4. The wreath is the unique nonempty compact set W in the plane satisfying

$$W = f_1[W] \cup f_2[W] \cup f_3[W] \cup f_4[W] \cup f_5[W] \cup f_6[W].$$
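As a check on the card decompositions described in this section (card A into two A, one B, and one Z; card B into one A, one B, and two C), the following Python sketch refines the trivial puzzle twice and counts the resulting cards. The rule that a blank card Z splits into four blank cards is an assumption, though an unsurprising one. The decompositions of the remaining cards are not given in the text, so the sketch stops at the novice version.

```python
from collections import Counter

# Decompositions stated in the text: each card splits into four half-size cards.
# The rule for the blank card Z is an assumption (blank splits into four blanks).
# Rules for the other cards are not given in the text and are left out.
RULES = {
    "A": Counter({"A": 2, "B": 1, "Z": 1}),
    "B": Counter({"A": 1, "B": 1, "C": 2}),
    "Z": Counter({"Z": 4}),
}

def refine(counts):
    """Replace every card by its four half-size pieces and count the result."""
    result = Counter()
    for card, copies in counts.items():
        for piece, per_card in RULES[card].items():
            result[piece] += copies * per_card
    return result

trivial = Counter({"A": 6})
elementary = refine(trivial)
novice = refine(elementary)
print(dict(elementary))
print(dict(novice))
```

Going one step further would need the decomposition of card C, which would have to be read off Figure 2.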
Figure 6. Similarities of the plane.
The existence and uniqueness of this "invariant set" or "attractor" can be proved using the contraction mapping theorem in the space of nonempty compact sets with the Hausdorff metric; see, for example, [1], p. 82. The 6 parts were moved apart for Figure 7. (I assume that this description coincides with what Barnsley had in mind. His book merely has a picture and asks the reader to find the iterated function system. In any case, I have pictured and discussed here the wreath W described by the iterated function system $(f_1, f_2, \ldots, f_6)$.)

Figure 7. Barnsley's wreath with the 6 parts moved apart.
Figure 8. Barnsley's wreath with its overlap set.

The similarity dimension associated with the iterated function system $(f_1, f_2, \ldots, f_6)$ is the solution s of the equation

$$1 = 3\left(\frac{1}{2}\right)^s + 3\left(\frac{1}{4}\right)^s.$$

Some elementary calculation shows that the solution is

$$s_0 = \frac{\log(3 + \sqrt{21})}{\log 2} - 1 \approx 1.9227.$$
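The "elementary calculation" can be confirmed numerically. The sketch below, in Python, solves the equation for s by bisection, using the three contraction ratios 1/2 and three contraction ratios 1/4 described above, and compares the result with the closed form. The function names and the bracketing interval [0, 2] are illustrative choices.

```python
import math

def moran_sum(s, ratios):
    """Sum of r**s over the contraction ratios r of the similarities."""
    return sum(r ** s for r in ratios)

def similarity_dimension(ratios, lo=0.0, hi=2.0, tol=1e-12):
    """Solve moran_sum(s) = 1 by bisection; the sum is decreasing in s."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if moran_sum(mid, ratios) > 1:
            lo = mid      # sum still too large, so s must be larger
        else:
            hi = mid
    return (lo + hi) / 2

# Three similarities with ratio 1/2 and three with ratio 1/4.
ratios = [1 / 2] * 3 + [1 / 4] * 3
s0_numeric = similarity_dimension(ratios)
s0_closed = math.log(3 + math.sqrt(21)) / math.log(2) - 1
print(round(s0_numeric, 4), round(s0_closed, 4))
```

The closed form comes from substituting $x = 2^{-s}$, which turns the equation into the quadratic $3x^2 + 3x - 1 = 0$.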
The usual theory of "self-similar" sets then shows that the Hausdorff dimension of the wreath W is at most $s_0 \approx 1.9227$. (This theory is due to Hausdorff [5], Moran [9], Hutchinson [6]. Expositions are in [1] and [3].) Is $s_0$ the exact value of the Hausdorff dimension?

The obstruction to equality of the Hausdorff dimension with the similarity dimension is often "overlap." The six images $f_i[W]$ are not disjoint. They are not even "just touching" in the sense of [1]. Equivalently, in the language of [3], Moran's open set condition fails. The "overlap set" is shown in Figure 8. It has subsets similar to W itself, so the Hausdorff dimension of the overlap is not strictly smaller than the Hausdorff dimension of the whole.

Graph Self-Similarity

A generalization of the usual "self-similarity" is necessary in this case. (The generalization is called "graph self-similarity" in [3].) The basis for the generalization can be found in papers of Drobot and Turner [2] and Mauldin and Williams [8], but similar ideas also occur elsewhere. There is an exposition in [3].

We will use the 13 nonempty cards of Figure 2. Each of them is made up of (three or) four parts, similar to other cards, shrunk by factor 1/2. The make-up of the cards is shown in Figure 9; for example, the B row means that set B is made up of one copy of set A, one copy of set B, and two copies of set C.

Figure 9. Matrix for computing graph similarity dimension.

The spectral radius of this matrix must be computed. The characteristic polynomial is

$$t^{13} - 5t^{12} + 3t^{11} + 5t^{10} + 5t^9 + 31t^8 - 62t^7 - 20t^6 + 14t^5 + 48t^4 - 60t^3 + 40t^2 - 48t + 48.$$

Its largest zero is $t_1 \approx 3.5948$. The "graph similarity" dimension associated with this system is $s_1 = \log t_1/\log 2 \approx 1.8459$. This time the verification of the open set condition is easy: the interiors of the equilateral triangles of the cards are the open sets. The images are the half-size triangles, and they are disjoint. So we conclude that the Hausdorff dimension of Barnsley's wreath is $s_1 \approx 1.8459$. It is strictly less than the similarity dimension $s_0$ computed above for the iterated function system $(f_1, f_2, \ldots, f_6)$.
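The largest zero and the resulting dimension can be reproduced with a few lines of Python. This is a sketch under two assumptions: the coefficient list is a reading of the printed polynomial (in particular the fifth-degree term is taken to be $+14t^5$), and Newton's method is started at t = 4, which is an upper bound for the spectral radius because every card splits into at most four pieces.

```python
import math

# Coefficients of the characteristic polynomial, highest degree first.
# This is a reading of the printed polynomial; the t^5 term is taken as +14 t^5.
COEFFS = [1, -5, 3, 5, 5, 31, -62, -20, 14, 48, -60, 40, -48, 48]

def poly_and_derivative(t, coeffs):
    """Evaluate p(t) and p'(t) together by Horner's rule."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * t + p
        p = p * t + c
    return p, dp

def largest_zero(coeffs, start=4.0, tol=1e-12, max_iter=200):
    """Newton's method started to the right of the largest zero."""
    t = start
    for _ in range(max_iter):
        p, dp = poly_and_derivative(t, coeffs)
        step = p / dp
        t -= step
        if abs(step) < tol:
            break
    return t

t1 = largest_zero(COEFFS)
s1 = math.log(t1) / math.log(2)   # graph similarity dimension log t1 / log 2
print(round(t1, 4), round(s1, 4))
```

With the matrix of Figure 9 in hand, one could instead compute the spectral radius directly from the matrix; the polynomial route is used here because only the polynomial is printed in the text.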
References
1. M. F. Barnsley, Fractals Everywhere, San Diego: Academic Press (1988).
2. Vladimir Drobot and John Turner, Hausdorff dimension and Perron-Frobenius theory, Ill. J. Math. 33 (1989), 1-9.
3. G. A. Edgar, Measure, Topology, and Fractal Geometry, New York: Springer-Verlag (1990).
4. K. J. Falconer, The Geometry of Fractal Sets, New York: Cambridge University Press (1985).
5. F. Hausdorff, Dimension und äußeres Maß, Math. Ann. 79 (1918), 157-179.
6. John E. Hutchinson, Fractals and self similarity, Indiana Univ. Math. J. 30 (1981), 713-747.
7. Benoit B. Mandelbrot, The Fractal Geometry of Nature, New York: W. H. Freeman and Company (1982).
8. R. Daniel Mauldin and S. C. Williams, Hausdorff dimension in graph directed constructions, Trans. Amer. Math. Soc. 309 (1988), 811-829.
9. P. A. P. Moran, Additive functions of intervals and Hausdorff measure, Proc. Cambridge Phil. Soc. 42 (1946), 15-23.
Department of Mathematics
The Ohio State University
Columbus, OH 43210 USA
David Gale* For the general philosophy of this section see vol. 13, no. 1 (1991). Contributors to this column of problems, solutions or other items who wish an acknowledgement of their contribution should enclose a self-addressed postcard.
Conjectures
Among notable recent achievements in mathematics have been the resolution of some celebrated longstanding conjectures, among them the proof by Deligne of the Weil conjectures, by de Branges of the Bieberbach conjecture, and by Faltings of the Mordell conjecture. On the other hand, the Fermat problem, the Riemann hypothesis, and the so-called Poincaré conjecture remain unresolved, though for brief periods it was claimed that they too had been settled. In any case, it would seem to be timely to present some scattered facts about the origins of some of these problems. For most of what follows in this section, aside from the next paragraph, I am indebted to Professor M. R. Choudhury of the University of Dhaka, Bangladesh.

The statement commonly referred to as the Poincaré conjecture is the assertion that the only simply connected compact 3-manifold is the 3-sphere. However, as has been pointed out (see, e.g., S. Smale, Mathematical Intelligencer, vol. 12, no. 2, 1990) Poincaré never put this forward as a conjecture. He writes [Poincaré, H., Cinquième complément à l'analysis situs, Œuvres VI, Gauthier-Villars, Paris (1953), p. 498]:

Il resterait une question à traiter: Est-il possible que le groupe fondamental de V se réduise à la substitution identique, et que pourtant V ne soit pas simplement connexe? (There remains a question to be treated. Is it possible that the fundamental group of V reduces to the identity although V is not simply connected?)

There follows a paragraph in which the question is rephrased in terms of some of the concepts introduced in the paper, and then there is a final one-line paragraph: "Mais cette question nous entraînerait trop loin," which roughly translated says "But that question would carry us too far afield."

Thus the question is presented neither as a conjecture nor as an open problem. Poincaré does not even guess at the answer. By contrast Weil is absolutely explicit that he believes his unproved results to be true. He writes [Numbers of solutions of equations over finite fields, Bulletin of the American Mathematical Society 55 (1949), p. 498],

This, and other examples which we cannot discuss here, seem to lend some support to the following conjectural statements, which are known to be true for curves, but which I have not so far been able to prove for varieties of higher dimension.

As we now know from the work of Deligne, Weil's conjectures turned out to be correct. It is interesting then to read what Weil has written on the subject of conjectures in general [Two lectures on number theory, past and present, L'Enseignement mathématique (2), 20 (1974), 87-110]:

Here I may point out that in the old days, when we used the word 'hypothesis' or 'conjecture' (in German, Vermutung), this was not to be taken as simply a form of wishful thinking. Nowadays the two are often confused. For instance, the so-called 'Mordell conjecture' on Diophantine equations says that a curve of genus at least two with rational coefficients has at most finitely many rational points. It would be nice if this were so, and I would rather bet for it than against it. But it is no more than wishful thinking because there is not a shred of evidence for it, and also none against it.

So Poincaré clearly did not make his conjecture, Weil definitely did. We leave it to the reader to decide whether Mordell's statement [On the rational solutions of the indeterminate equations of the third and fourth degrees, Proceedings Cambridge Philosophical Society 21 (1922/23), 191-192] is a conjecture or wishful thinking:

In conclusion, I might note that the preceding work suggests to me the truth of the following statements concerning indeterminate equations, none of which, however, I can prove . . . .
(3) The equation $ax^6 + bx^5y + \cdots + fxy^5 + gy^6 = z^2$ can be satisfied by only a finite number of rational values of x and y, with the obvious extension to equations of higher degree.
(4) The same theorem holds for the equation $ax^4 + by^4 + cz^4 + 2fy^2z^2 + 2gz^2x^2 + 2hx^2y^2 = 0$.
(5) The same theorem holds for any homogeneous equation of genus greater than unity, say, $f(x,y,z) = 0$.

*Column editor's address: Department of Mathematics, University of California, Berkeley, CA 94720 USA.
More Mysteries

As mentioned in an earlier column, many mathematicians have observed that the main impact of computers on mathematics has been in raising new problems rather than solving old ones. I will here suggest that some of these new problems, though easy to formulate, may in fact be, in an essential way, impossible to solve.

Among problems that would probably never have been posed but for the existence of computers, one of the simplest and probably the best known is the so-called (3n + 1) conjecture due to Lothar Collatz. Let f be the function on the natural numbers N where

$$f(n) = \begin{cases} n/2 & \text{for } n \text{ even,} \\ 3n + 1 & \text{for } n \text{ odd.} \end{cases}$$

The conjecture is that for any n there is some k such that $f^k(n) = 1$, or in the language of dynamical systems, the orbit of every n contains 1. This has been verified for all $n < 10^9$.

A general class of questions of which this is a special case is easily described. For any number k and numbers $a_i$, $b_i$ in N, $0 \le i < k$, let f from N to N be defined by

$$f(kn + i) = a_i n + b_i.$$

One can then ask questions about the orbits of points under f, for example, do they all contain the number 1. The Collatz problem corresponds to the case $k = 2$, $a_0 = 1$, $b_0 = 0$, $a_1 = 6$, $b_1 = 4$.

The main result on the general problem is due to John H. Conway [Unpredictable Iterations, Proceedings Number Theory Conference, Boulder, 1972] and asserts that it is undecidable. More precisely, Conway shows, even for the case when all the $b_i$ are zero, that there is no algorithm for deciding whether the orbit of a given n contains the number 1. Of course this says nothing about the decidability of the Collatz problem, but it does show that there exist specific problems with numbers k and $a_i$, which could in principle be calculated, for which the problem is undecidable. Further, Conway has come up with some interesting examples, referred to by Richard Guy as permutation sequences. The simplest of these appears in Guy's Unsolved Problems in Number Theory [Springer-Verlag, New York, 1981] and is given by the mapping T defined by

T(2n) = 3n,
T(4n + 1) = 3n + 1,
T(4n - 1) = 3n - 1.

T is easily seen to be a bijection, so all orbits are either cycles or bi-infinite sequences that approach infinity in both directions. One finds easily the cycles (1), (2,3), (4,6,9,7,5), and (44,66,99,74,111,83,62,93,70,105,79,59). The smallest missing number among these is 8, whose orbit starts out 73, 55, 41, 31, 23, 17, 13, 10, 15, 11, 8, 12, 18, 27, 20, 30, 45, 34, 51, 38, 57, 43, 32, 48, 72, . . . . The orbit shows no signs of cycling after several thousand iterations in both directions although there are a number of "near misses." Note the 73 and 72 at the beginning and end of the sequence above. This happens again with 153 and 154, 161 and 162, 500 and 501, 790 and 791.

Question 1: Are there any other finite orbits under T?
Question 2: Is the orbit of 8 finite?

A striking feature of this last question is that it concerns a single sequence as contrasted with the Collatz problem, which asks about the behavior of an infinite number of sequences. Of course it is meaningless to say this question is undecidable, but I will later argue that it may well be "unprovable," that is to say, it might be the case that the orbit of 8 is in fact infinite, but there is no proof of this from our usual system of axioms.

Further experimentation leads to further speculation. The smallest number not in any of the preceding orbits is 14, whose orbit appears to be heading resolutely for infinity at both ends. The next missing number is 40, and so on. Using Mathematica we found that all numbers up to 1000 appear to belong to 54 disjoint orbits. We call the smallest number in an orbit the seed s. The elements $T^n(s)$ are called forward numbers for n positive and backwards numbers for n negative. Of course, just as we have no proof that these orbits are not parts of cycles, we also don't know that they are disjoint. It is conceivable, for example, that some forward iterate $T^n(8)$ might eventually hit a backward iterate $T^{-m}(14)$.

A crude statistical study indicates that the forward iterates contain roughly the same number of even as odd numbers. A consequence of this is that roughly half of the forward numbers are divisible by 3, since from the definition of T a number is divisible by 3 if and only if its predecessor is even. Among the backward numbers, on the other hand, the odd numbers seem to outnumber the evens by about 2 to 1. This must clearly be the case, for an even backwards number follows a down-jump of 2/3, while an odd backwards number follows an up-jump of only 4/3, so there must be many more odds than evens since the numbers $T^{-n}(s)$ must remain positive.

With the data at hand one can prove that if there are any cycles other than those listed, they must have length at least 360. Further one can make intuitive arguments that the existence of cycles becomes very "improbable." First note that any odd number gives a forward down-jump of (approximately) 3/4 while an even number gives an up-jump of 3/2, so if a cycle contains m even and n odd numbers, then we must have (approximately) $(3/2)^m (3/4)^n = 1$, and if the lowest number in the cycle, the seed, is around 1000 then this approximation must be close, that is, the ratio m/n must approximate .70951 to four decimal places. If the seed is greater than 10,000 then the smallest cycle would have length 665 and could occur only for m = 276 and n = 389. What are the "chances" of this happening?

What then is the moral of this story? We are all aware from the work of Gödel that no matter what system of axioms we work with, as long as just a bit of number theory is available, there are true propositions for which there is no proof, yet we continue as diligently as ever looking for proofs and frequently we find them, mainly I think, because of the problems we choose to attack. But problems like the one discussed here seem to be of a special sort. It would seem to me overwhelmingly "likely" that the orbit of 8 is infinite and correspondingly "unlikely" that there is a proof of this fact. Indeed, why should there be?
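Readers who want to experiment with the permutation T can start from a short Python sketch such as the following. It implements T and its inverse from the three defining rules, lists the known cycles, prints a stretch of the orbit of 8 in both directions, and evaluates the ratio log(4/3)/log(3/2) of even to odd terms that a cycle would need. The function names and the cutoff of 1000 steps are illustrative choices.

```python
import math

def T(x):
    """The permutation T(2n) = 3n, T(4n+1) = 3n+1, T(4n-1) = 3n-1."""
    if x % 2 == 0:
        return 3 * (x // 2)
    if x % 4 == 1:
        return 3 * (x // 4) + 1
    return 3 * ((x + 1) // 4) - 1

def T_inv(y):
    """Inverse of T, read off from the residue of y modulo 3."""
    if y % 3 == 0:
        return 2 * (y // 3)
    if y % 3 == 1:
        return 4 * (y // 3) + 1
    return 4 * ((y + 1) // 3) - 1

def cycle_from(x, max_len=1000):
    """Return the cycle through x, or None if none closes within max_len steps."""
    orbit = [x]
    y = T(x)
    while y != x and len(orbit) < max_len:
        orbit.append(y)
        y = T(y)
    return orbit if y == x else None

def orbit_segment(x, back, forward):
    """List T^(-back)(x), ..., x, ..., T^(forward)(x) in order."""
    left, y = [], x
    for _ in range(back):
        y = T_inv(y)
        left.append(y)
    right, y = [x], x
    for _ in range(forward):
        y = T(y)
        right.append(y)
    return left[::-1] + right

for seed in (1, 2, 4, 44):
    print(cycle_from(seed))
print(orbit_segment(8, 10, 14))

# A cycle with m even and n odd terms needs (3/2)^m (3/4)^n close to 1.
ratio = math.log(4 / 3) / math.log(3 / 2)
print(round(ratio, 5), round(276 / 389, 5))
```

A finite search of this kind can only fail to find a cycle through 8; it says nothing about whether one exists further out.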
Whitehead Wit

The topologist J. H. C. Whitehead was often asked for his views on the work of his uncle, the renowned philosopher Alfred North Whitehead. Eventually he developed a stock answer. When asked "What do you think of your uncle's philosophy?", he would reply "I really haven't thought much about it--but what do you think of your uncle's philosophy?"
The Mathematics of Modems A. R. Calderbank
A man and a woman are talking across a noisy room. Chances are that occasionally a word will be lost or misconstrued. However, it is considerably less likely that the thread of meaning will be lost. The reason is that spoken language is inherently redundant, and this redundancy allows the couple to bridge the gaps in their conversation. If the noise becomes worse then they may find it necessary to repeat every phrase twice. This is one way of introducing additional redundancy into the conversation so as to make communication more reliable. It is a simple example of what is called a channel code.

A channel is simply the means that allows users to exchange information by transmitting and receiving signals. The telephone network is a familiar example of a communications channel, and we shall be interested in moving binary data over this network. No channel is perfectly reliable; the channel adds noise to the transmitted signal, and this noise may cause errors. The purpose of a channel code is to overcome transmission errors caused by this noise.

Figure 1 shows the channel model. The signals $y_k$ are drawn from a finite set $\Omega$, called the signal constellation. A signal is simply a vector in $\mathbb{R}^N$, and the coordinate entries correspond to voltage levels on a transmission line. The encoder transforms the binary data stream into a sequence of signals $(y_k)$. The channel code makes communication more reliable by introducing redundancy into the signal selection procedure. To say that the signal selection procedure is redundant means that the encoder is not able to generate all signal sequences $(z_k)$, where $z_k \in \Omega$ for all k. The signal sequences that it can generate are called codewords. The idea is that even if the wrong signal occasionally ends up at the receiver, the sequence will still resemble the transmitted codeword more than any other codeword. The decoder finds the closest codeword to the received sequence $(x_k)$, correcting transmission errors caused by the additive noise.

Encoding is sometimes called modulation, so that decoding becomes demodulation. A modem is, then, a box capable of exchanging information with other modems by transmitting and receiving signals.

There are two different types of transmission media: wideband media such as optical fibers, and bandlimited media such as telephone circuits. The difference between these media is in the length of the signaling interval. For an optical fiber the signaling interval T is about $10^{-9}$ sec. (the signaling rate is of course 1/T Hz). The signaling rate is enormous, but as a result, there is little time for sophisticated signal processing. On the other hand, the signaling rate on the telephone channel is about 2400 Hz. If we are to transmit data at 9600 bits/sec., then every use of the channel must convey 4 bits of information. It is impossible to transmit information at this rate with only two signals, 0 and 1, and we must therefore expand the size of the signal constellation. We are fortunate that the low signaling rate allows more time for signal processing.

The telephone network is one example of a bandlimited channel. This means there is a constant W (in this case $W \approx 2400$) such that the channel only supports transmission of continuous functions that contain no frequencies higher than W Hz in their Fourier transform. Optical fibers are also bandlimited, but in this case the constant W is very large; hence the name wideband media. The spectrum is then zero outside the band $(-2\pi W, 2\pi W)$. The Shannon Sampling Theorem [11] allows us to replace a continuous bandlimited function by a discrete sequence of its samples without the loss of any information. Shannon proved that "If a function f contains no frequencies higher than W Hz, then it is completely determined by giving its ordinates at a series of points spaced (1/2W) seconds apart." He then constructed f as the sampling series:

$$f(t) = \sum_{k=-\infty}^{\infty} f\!\left(\frac{k}{2W}\right) \frac{\sin 2\pi W (t - k/2W)}{2\pi W (t - k/2W)} \qquad (1)$$
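To see the sampling series (1) at work, here is a minimal Python sketch. It assumes a signal with only finitely many nonzero samples, so that the infinite sum becomes a finite one; the sample values and function names are hypothetical and chosen only for illustration.

```python
import math

def sinc(x):
    """sin(x)/x, with the removable singularity at 0 filled in."""
    return 1.0 if x == 0 else math.sin(x) / x

def sampling_series(samples, W, t):
    """Evaluate the series (1) when only finitely many samples are nonzero.

    samples maps an integer k to the sample value f(k/2W); every sample
    that is not listed is taken to be zero.
    """
    return sum(
        y * sinc(2 * math.pi * W * (t - k / (2 * W)))
        for k, y in samples.items()
    )

W = 2400.0                                      # Hz, the telephone-channel figure
samples = {0: 1.0, 1: -0.5, 2: 0.25, 5: 2.0}    # hypothetical sample values

# At the sampling instants k/2W the series returns the samples themselves.
for k in range(7):
    print(k, round(sampling_series(samples, W, k / (2 * W)), 12))

# Between sampling instants it interpolates smoothly.
print(sampling_series(samples, W, 0.5 / (2 * W)))
```

Each term of the series vanishes at every sampling instant except its own, which is why the samples are recovered exactly.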
The relationship with Figure 1 is that the discrete sequence of samples $(f(k/2W))$ is just $(y_k)$. Abdul Jerri [7] has written a tutorial review of the sampling theorem and its various extensions and applications. He reports that though Shannon introduced the sampling theorem to information theory, the theorem itself was discovered by E. T. and J. M. Whittaker [15-17], by Ferrar [5], and that some even attribute it to Cauchy [1, p. 41]. The interest of the communications engineer in the sampling theorem can be traced to Nyquist [9]. In the Russian literature, Kotel'nikov [8] introduced the sampling theorem to communication theory in 1933.

We need to quantify the effect of channel noise on the sequence of transmitted signals. The function f given in (1) represents a continuous bandlimited voltage signal. The average signal power P is proportional to the square of the voltage, and so we may take P to be the average value of $\|y_k\|^2$. Let $d^2$ be the minimum squared distance between codewords. If $c_1$, $c_2$ are codewords for which $\|c_1 - c_2\|^2 = d^2$, then $c_1$ and $c_2$ are said to be nearest neighbors. Let C be the average number of nearest neighbors to a given codeword. Then it turns out that the probability of an error beginning at any given time is well approximated by

$$C e^{-d^2 A / P}, \qquad (2)$$

where A is a constant proportional to the signal-to-noise ratio (SNR) of the channel (the details can be found in [4, Chapter 3]): the SNR relates the strength of the useful signal to the strength of the noise. The lower the SNR, the more significant is the path multiplicity C in (2). We want to transmit as fast as possible while keeping the probability of error acceptably low. We shall attack the problem via the geometry of finite-dimensional lattices.

Figure 1. The channel model.
Figure 2a. Wideband media such as optical fibers.
Figure 2b. Bandlimited media such as telephone circuits.
Signal Constellations

Many signaling schemes are based on finite-dimensional lattices. The signal constellation $\Omega$ consists of all lattice points within a region $\mathcal{R}$. The signal constellations shown in Figures 3a and 3b are bounded by spheres centered at the origin. Since we want to compare different dimensional signaling schemes, we define the average signal power P (normalized to 2 dimensions) of an N-dimensional constellation $\Omega$ by

$$P = \frac{2}{N|\Omega|} \sum_{x \in \Omega} \|x\|^2. \qquad (3)$$
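Definition (3) can be checked against the values quoted in the captions of Figures 3a and 3b. The Python sketch below rebuilds the two constellations under an assumption: each consists of the points of the integer lattice translated by 1/2 in every coordinate that lie closest to the origin, 8 points in dimension 1 and 32 points in dimension 2. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import log2

HALF = Fraction(1, 2)

def norm2(x):
    """Squared Euclidean norm of a point."""
    return sum(c * c for c in x)

def ball_constellation(N, size):
    """The `size` points of (Z + 1/2)^N closest to the origin.

    Assumes a sphere can be chosen to contain exactly `size` points,
    which holds for the two cases used below.
    """
    span = range(-size, size)          # generous box of candidate points
    points = [tuple(i + HALF for i in idx) for idx in product(span, repeat=N)]
    points.sort(key=norm2)
    return points[:size]

def average_power(points):
    """Equation (3): P = (2 / (N |Omega|)) * sum of squared norms."""
    N = len(points[0])
    return Fraction(2, N * len(points)) * sum(norm2(x) for x in points)

def uncoded_rate(points):
    """R = (2 / N) * log2 |Omega|, in bits per 2 dimensions."""
    N = len(points[0])
    return 2 / N * log2(len(points))

fig_3a = ball_constellation(1, 8)
fig_3b = ball_constellation(2, 32)
for name, pts in (("3a", fig_3a), ("3b", fig_3b)):
    P = average_power(pts)
    # The minimum squared distance is 1 for a translate of the integer lattice.
    print(name, "R =", uncoded_rate(pts), "P =", P, "d2/P =", 1 / P)
```

Exact rational arithmetic is used so that the results can be compared with the fractions in the captions without rounding.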
We normalize to 2 dimensions because most of the signal constellations that appear in the engineering literature are 2-dimensional. The signal points in Figures 3a and 3b are actually drawn from a translate of the integer lattice and this translate is chosen so as to minimize the average power P for a fixed constellation size. If $d^2$ is the minimum squared distance between distinct signals, then (2) implies that the ratio $d^2/P$ is a good figure of merit for the signal constellation. The reliability of the signaling scheme is determined by the signal separation $d^2$, and the average signal power P is the cost of achieving that reliability. Note that scaling the signal constellation does not change the ratio $d^2/P$.

Figure 3a. A 1-dimensional signal constellation. The uncoded transmission rate R = 6 bits/2-dim., $d^2 = 1$, $P = 21/2$, and $d^2/P = 2/21$.
Figure 3b. A 2-dimensional signal constellation. The uncoded transmission rate R = 5 bits/2-dim., $d^2 = 1$, $P = 5$, and $d^2/P = 1/5$.

The transmission rate is the number of bits transmitted per 2-dim. symbol. For an N-dimensional constellation $\Omega$, the uncoded transmission rate R (normalized to 2 dimensions) is given by

$$R = \frac{2}{N} \log_2 |\Omega| \ \text{bits/2-dim.}, \qquad (4)$$

and it is impossible to use $\Omega$ to transmit more information. If we introduce redundancy (coding) into the signal selection procedure then the transmission rate will drop.

Suppose we want to signal at some rate R, and that we are not happy with the reliability of non-redundant (that is to say uncoded) transmission. Coding improves communication reliability by introducing redundancy into the signal selection procedure. The possible signal strings are the codewords, and not every signal string is a codeword. The redundancy is used to increase the minimum squared distance $d^2$ between distinct codewords. However, to maintain signal rate R, we must increase the size of the signal constellation. This constellation expansion increases the average signal power P. The coding gain $\gamma$ is given by

$$\gamma = \frac{(d^2/P)_{\text{coded}}}{(d^2/P)_{\text{uncoded}}}, \qquad (5)$$

which we shall often express in decibels (dB) by taking $10 \log_{10} \gamma$. It follows from (2) that a 3 dB coding gain ($\gamma \approx 2$) corresponds to squaring the probability of a decoding error beginning at any given time. We may rewrite the coding gain $\gamma$ as

$$\gamma = \frac{P_{\text{uncoded}}}{P_{\text{coded}}} \times \frac{d^2_{\text{coded}}}{d^2_{\text{uncoded}}}, \qquad (5')$$

where the first term is the power penalty incurred by expanding the signal constellation. The first problem we look at is that of calculating this power penalty.

A fundamental region $\mathcal{R}(\Lambda)$ for a lattice $\Lambda$ is a region of $\mathbb{R}^N$ that contains one and only one point from each equivalence class modulo $\Lambda$, i.e., $\mathcal{R}(\Lambda)$ is a complete system of coset representatives for $\mathbb{R}^N/\Lambda$. The translates $\mathcal{R}(\Lambda) + x$ ($x \in \Lambda$) tile $\mathbb{R}^N$. The volume of $\mathcal{R}(\Lambda)$ is the fundamental volume $V(\Lambda)$ of the lattice. The fundamental volume is the volume of N-space per lattice point. There are many ways to choose a fundamental region for a lattice $\Lambda$, but the volume of such a region is always $V(\Lambda)$. (See Figure 4.) In calculating the average signal power P we assume the accuracy of the following approximation principle (this approximation is used by Forney in [6] and can be proved to be accurate; see for example [3]).

Figure 4. Two different fundamental regions for the hexagonal lattice in the plane.
The Continuous Approximation Principle: The average power P (normalized to 2 dimensions) of a constellation $\Omega$, where the signal points are the elements of any lattice $\Lambda$ (or translate of $\Lambda$) that lie in a region $\mathcal{R}$ of $\mathbb{R}^N$ (with centroid the origin), is approximately equal to the average power $P(\mathcal{R})$ of a continuous distribution that is uniform within $\mathcal{R}$ and zero elsewhere:

$$P \approx P(\mathcal{R}) = \frac{2}{N V(\mathcal{R})} \int_{\mathcal{R}} \|r\|^2 \, dv, \qquad (6)$$

where $V(\mathcal{R}) = \int_{\mathcal{R}} dv$ is the volume of $\mathcal{R}$.

Example 1. Let $\Lambda = \mathbb{Z}$. Then $V(\Lambda) = 1$ and any interval $[y, y + 1)$ is a fundamental region. The interval $[-L, L]$ has volume 2L and contains the 2L half-integers $\pm(2i + 1)/2$, $i = 0, 1, \ldots, L - 1$. The true average signal power P (normalized to 2 dimensions) is given by

$$P = \frac{1}{2L} \sum_{i=0}^{L-1} (2i + 1)^2 = \frac{4L^2 - 1}{6}, \qquad (7)$$

whereas the continuous approximation $P([-L, L])$ is given by

$$P([-L, L]) = \frac{1}{L} \int_{-L}^{L} t^2 \, dt = 2L^2/3. \qquad (8)$$
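The comparison in Example 1 is easy to tabulate. The following Python sketch evaluates the exact power (7) from the list of half-integers and the continuous approximation (8) in exact rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction

def exact_power(L):
    """Equation (7): average power of the 2L half-integers in [-L, L]."""
    points = [Fraction(2 * i + 1, 2) for i in range(L)]
    points += [-x for x in points]
    # Here N = 1, so the normalization 2 / (N |Omega|) equals 2 / (2L).
    return Fraction(2, len(points)) * sum(x * x for x in points)

def continuous_power(L):
    """Equation (8): power of the uniform distribution on [-L, L]."""
    return Fraction(2 * L * L, 3)

for L in (1, 2, 4, 8, 16, 32):
    exact, approx = exact_power(L), continuous_power(L)
    print(L, exact, approx, float(1 - exact / approx))
```

The two expressions differ by exactly 1/6 for every L, so the relative error is $1/(4L^2)$ and falls off quickly as the constellation grows.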
The agreement is pretty good.

Equation (6) says that the average signal power of a signal constellation $\Omega$ depends on the lattice $\Lambda$ and on the region $\mathcal{R}$ that bounds $\Omega$. We rewrite (6) as

$$P \approx 2 G(\mathcal{R}) V(\mathcal{R})^{2/N}, \qquad (9)$$

where

$$G(\mathcal{R}) = \frac{\int_{\mathcal{R}} \|r\|^2 \, dv}{N V(\mathcal{R})^{1 + 2/N}}$$

is the normalized or dimensionless second moment. The second moment $G(\mathcal{R})$ results from taking the average squared distance from a point in $\mathcal{R}$ to the centroid, and normalizing to obtain a dimensionless quantity. Note that scaling the region $\mathcal{R}$ does not change $G(\mathcal{R})$. The second moment $G(\mathcal{R})$ measures the effect of the shape of the region $\mathcal{R}$ on the average signal power. The second factor $V(\mathcal{R})^{2/N}$ measures the effect of the choice of lattice $\Lambda$, since $V(\mathcal{R}) \approx |\Omega| V(\Lambda)$, where $V(\Lambda)$ is the fundamental volume of $\Lambda$.

Choosing the shape of the region $\mathcal{R}$ is a more subtle question than one might think. Certainly it is clear that spherical regions minimize the average signal power. (A little calculation in the final section shows that the shape gain of the N-sphere over the N-cube approaches $\pi e/6$, about 1.53 dB, as $N \to \infty$.) However, as a practical matter, it is difficult to address signal points in a large spherical constellation.

Suppose we have decided to choose signal points from a particular lattice $\Lambda$, and that the signal constellation is to be bounded by a region $\mathcal{R}$ with a particular shape. Equation (9) predicts that multiplying the size of the signal constellation by C increases the average signal power by a factor of $C^{2/N}$. For N = 1, this certainly agrees with Example 1, where $P([-L, L]) = 2L^2/3$. If we adopt a coding scheme that maintains rate by doubling the size of a 1-dimensional signal constellation, then the average signal power is multiplied by a factor of 4. If the coding scheme does not increase the minimum squared distance by a factor greater than 4, then the code is useless; by (5) there is actually no gain in reliability over uncoded transmission. However, we are not bound to fail, and in the next section we will design codes that succeed.

Trellis Codes
Trellis coding is a technique for encoding a binary stream as a sequence of real vectors that are transmitted over a noisy channel. This technique was invented by Gottfried Ungerboeck in 1976. Ungerboeck's original paper was submitted to the IEEE Transactions on Information Theory in 1977, but the author was reluctant to make the changes requested by the reviewers. As time passed, the value of trellis codes became more apparent, and the editors asked Ungerboeck if they could reconsider. The paper was finally published in 1982 and promptly won the best-paper award given by the Information Theory Group. Ungerboeck has also written a popular article ([14]) for engineers.

A simple example is the best way to introduce these codes. The trellis encoder shown in Figure 5 slides a window of length 3 along a binary data stream. The state of the encoder is the pair of prior input bits, and the window determines the possible transitions between states. Suppose that the triple abc appears in the window. The present state is ab and the next state will be bc. The corresponding output is the label on the edge from ab to bc. The sliding window introduces memory into the encoding process (the present output does not depend only on the most recent data bit), and the signal constellation is twice the size required by uncoded transmission at 1 bit/dim.

Figure 5. A 4-state trellis encoder.

The origin of the term trellis code is perhaps that the graph obtained by concatenating copies of the encoder state diagram looks like the structures used by gardeners to support climbing plants. The binary data stream is encoded as a path through this trellis. (See Figure 6.)

Figure 6. Binary data encoded as a path through the trellis.

The decoder has a copy of this trellis. It processes the noisy samples, and tries to find the path taken by the binary data. The decoding algorithm is dynamic programming. Every signaling interval, the decoder calculates and stores, for each state, the most likely path that terminates in that state. The most likely path is the one that most closely resembles the sequence of noisy samples. Andrew Viterbi was the first to recognize the value of this decoding method, and it is called the Viterbi algorithm by communication theorists.

We assume that the binary data sequence is independent and identically distributed (i.i.d.), with the values 0, 1 occurring with equal probability. The edges of the state diagram are then equiprobable, and the average signal power is $P = 5/2$ (compared with $P = 1/2$ for uncoded transmission using the signals $\pm 1/2$). Calculating the minimum squared distance $d^2$ is a little harder. An error event occurs when the decoder chooses incorrectly between two paths that start in the same state, finish in the same state, and do not simultaneously occupy the same state in between. We require the error event with minimum squared distance. In this case $d^2 = 9$, and Figure 7 shows an error event $E$ for which $d^2(E) = 9$. The coding gain over uncoded transmission using the signals $\pm 1/2$ is $[9/(5/2)]/[1/(1/2)] = 1.8$ (or approximately 2.5 dB). This example shows that it is possible to transmit 1 bit/dim. reliably. To increase the transmission rate we turn to trellis codes based on lattices and cosets.

Figure 7. An error event with minimum squared distance $d^2 = 9$.

Trellis Codes Based on Lattices and Cosets

Table 1. Coset representatives, names, norms, and multiplicities for the 8 cosets of $\phi^3 Z^2$ in $Z^2$.
Here the signal points are taken from an N-dimensional lattice $\Lambda$, and the signal constellation contains an equal number of points from each coset of a sublattice $\Lambda'$. One part of the binary data stream selects cosets of $\Lambda'$ in $\Lambda$, and the other part selects points from these cosets. This is a coding technique proposed by Neil Sloane and me in [3].

Again we lead with an example. We take $\Lambda$ to be the integer lattice $Z^2$ spanned by (0,1), (1,0), and $\Lambda'$ to be the sublattice $\phi^3\Lambda$, where

$$\phi = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

is a linear transformation that multiplies norms by 2. There are 8 cosets. The norm $N[C]$ of a coset $C$ is the minimum norm of a point in $C$, and the multiplicity $\mathrm{Mult}[C]$ is the number of points in $C$ with minimal norm. Table 1 gives coset representatives, names, norms, and multiplicities for the 8 cosets. Figure 8 shows a 32-point signal constellation where the signals are integer lattice points translated by (1/2, 1/2). The first quadrant of Figure 8 contains precisely the translates of the 8 coset representatives listed in Table 1. There are 4 points from each of the 8 cosets. We see, for example, that if $x \in C$ and $y \in C + (2,0)$, then $\|x - y\|^2 \ge N[(2,0)] = 4$, and there are $\mathrm{Mult}[(2,0)] = 4$ signals $y$ at this minimum distance (if $x$ is a boundary point then there are fewer nearest neighbors). The transmission rate is 4 bits/2-dim., so the signal constellation is again twice the size required by uncoded transmission. Two of the four input bits choose
the coset, and the remaining two bits choose a signal point from that coset (as shown in Figure 9). We shall refer to the bits $d_k$, $e_k$ as uncoded bits since there is no redundancy in the signal selection procedure once the coset has been chosen. The big advantage of coset codes is that it is possible to achieve any transmission rate just by increasing the number of uncoded bits. The rule for selecting the coset sequence $(C_k)$ is

$$C_k = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 & 2 \end{bmatrix} [a_{k-2}, b_{k-1}, a_{k-1}, b_k, a_k]^T, \tag{10}$$

where addition of columns takes place in the coset space $\Lambda/\phi^3\Lambda$. For example, if $[a_{k-2}, b_{k-1}, a_{k-1}, b_k, a_k] = [1,0,1,0,1]$ then $C_k = [2,3]^T = [0,1]^T$. Again we assume that the binary data streams are i.i.d., and that the symbols 0, 1 are equiprobable. It follows from (10) that the 8 cosets are selected equally often. Now points in the same coset are used equally often, so it follows that all signals are equiprobable. The average signal power $P$ is 5 (see Figure 3b).

In the preceding example, we swept the details of the minimum distance calculation under the rug. This time we will be more forthright. Let $(y_k)$, $(y'_k)$ be distinct sequences of signal points corresponding to inputs $(a_k, b_k, c_k, d_k)$ and $(a'_k, b'_k, c'_k, d'_k)$. Let $y_k \in C_k$, $y'_k \in C'_k$. If the input sequences $(a_k, b_k)$, $(a'_k, b'_k)$ are distinct, then so are the coset sequences $(C_k)$, $(C'_k)$, and

$$\sum_k \|y_k - y'_k\|^2 \ge \sum_k N[C_k - C'_k]. \tag{11}$$

If the sequences $(a_k, b_k)$, $(a'_k, b'_k)$ are equal, then $y_k - y'_k \in \Lambda'$ for all $k$, and so $\|y_m - y'_m\|^2 \ge 8$ for some $m$ (since the minimum non-zero norm in $\Lambda'$ is 8). For coset codes, we see that the minimum squared distance $d^2$ is determined by the minimum norm in the sublattice $\Lambda'$, and by the method of selecting cosets.

In this example the minimum squared distance $d^2$ is 5. To prove this we check that if $u$, $v$ are binary 5-tuples then

$$N[Gu^T - Gv^T] = N[G(u \oplus v)^T], \tag{12}$$

where $G$ is the matrix given in (10), and $\oplus$ denotes addition modulo 2. This property, which we call regularity, makes calculation of $d^2$ much easier because it allows us to assume that one of the input sequences is the zero sequence. In algebraic coding theory, every linear code is regular. (What makes (12) true is that the matrix $G$ contains only one column from outside the binary space of cosets with even norm.)

Next we draw a directed graph on 8 vertices labeled by the possible inputs $(a_{k-2} b_{k-1} a_{k-1})$. The vertex labeled $(a_{k-2} b_{k-1} a_{k-1})$ is joined to the 4 vertices $(a_{k-1} b_k a_k)$ and the edge is weighted by $N[G[a_{k-2}, b_{k-1}, a_{k-1}, b_k, a_k]^T]$. The edge from the vertex (000) to itself is deleted. The weight of a directed path is simply the sum of the edge weights, so that the distance $d^2$ is the minimal weight of a path from (000) to itself. This can be calculated using Dijkstra's algorithm (see [10]), or in this case by hand. Here $d^2 = 5$, and there are 4 minimum weight paths from (000) to itself:

(000) → (011) → (100) → (000),
(000) → (011) → (110) → (000),
(000) → (010) → (001) → (100) → (000),
(000) → (010) → (001) → (110) → (000).

Figure 8. The 32-point signal constellation divided among the 8 cosets of $\phi^3 Z^2$ in $Z^2$.

Figure 9. The trellis encoder for the coset code.
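The shortest-path calculation just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the matrix `G` is the reading of (10) reconstructed from a damaged printing, the sublattice $\phi^3 Z^2$ is taken to be the set of points $(2s, 2t)$ with $s + t$ even, and all function names are illustrative. Coset norms and multiplicities are found by brute force rather than read from Table 1.

```python
from functools import lru_cache
from itertools import product
import heapq

# Assumed reading of the matrix in (10); its columns multiply
# (a_{k-2}, b_{k-1}, a_{k-1}, b_k, a_k).
G = ((1, 0, 1, 1, 0),
     (1, 2, 0, 1, 2))
ZERO = (0, 0, 0)
INPUTS = list(product((0, 1), repeat=2))  # all pairs (b_k, a_k)


def in_sublattice(x, y):
    """True if (x, y) lies in phi^3 Z^2 = {(2s, 2t) : s + t even}."""
    return x % 2 == 0 and y % 2 == 0 and ((x + y) // 2) % 2 == 0


@lru_cache(maxsize=None)
def coset_norm_mult(v, box=4):
    """Brute-force (norm, multiplicity) of the coset v + phi^3 Z^2."""
    norms = [x * x + y * y
             for x in range(-box, box + 1)
             for y in range(-box, box + 1)
             if in_sublattice(x - v[0], y - v[1])]
    best = min(norms)
    return best, norms.count(best)


def edge(state, bits):
    """One step of the graph: (next state, edge weight, edge multiplicity)."""
    u = state + bits
    v = tuple(sum(g * x for g, x in zip(row, u)) for row in G)
    norm, mult = coset_norm_mult(v)
    return (state[2],) + bits, norm, mult


def min_loop_weight():
    """Dijkstra from (000) back to (000), with the self-loop deleted."""
    heap = []
    for bits in INPUTS:
        nxt, w, _ = edge(ZERO, bits)
        if nxt != ZERO:
            heapq.heappush(heap, (w, nxt))
    settled = set()
    while heap:
        dist, state = heapq.heappop(heap)
        if state == ZERO:
            return dist
        if state in settled:
            continue
        settled.add(state)
        for bits in INPUTS:
            nxt, w, _ = edge(state, bits)
            heapq.heappush(heap, (dist + w, nxt))
    return None


def min_weight_paths(d2, max_len=16):
    """All loops at (000) of weight d2, each with its product of multiplicities."""
    found = []

    def walk(state, weight, mult, path):
        if len(path) > max_len:
            return
        for bits in INPUTS:
            nxt, w, m = edge(state, bits)
            if (state == ZERO and nxt == ZERO) or weight + w > d2:
                continue
            if nxt == ZERO:
                if weight + w == d2:
                    found.append((tuple(path + [nxt]), mult * m))
            else:
                walk(nxt, weight + w, mult * m, path + [nxt])

    walk(ZERO, 0, 1, [ZERO])
    return found


if __name__ == "__main__":
    d2 = min_loop_weight()
    print("d^2 =", d2)
    for path, mult in min_weight_paths(d2):
        print(path, "multiplicity", mult)
```

With the assumed matrix, the search reproduces the four paths listed above, which is some evidence that the reconstruction of (10) is the intended one.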
It is also easy to calculate the number of nearest neighbors to a given signal sequence $(y_k)$. The number of ways to choose $y'_k \in C'_k$ so that equality holds in (11) is just $\prod_k \mathrm{Mult}[C_k - C'_k]$. Table 1 relates norms of cosets and multiplicities, and we see that each of the 4 paths given above contributes 4 to the overall path multiplicity. In this example the path multiplicity is then 16 (per 2 dim.).

For the particular transmission rate of 4 bits/2-dim., we can measure performance against uncoded transmission using the 16-point signal constellation $\{\pm 1/2, \pm 3/2\}^2$. The coding gain is $(5/5)/(1/[5/2]) = 5/2 \approx 4.0$ dB. However, it is also possible to use the continuous approximation to calculate an asymptotic coding gain, which becomes increasingly accurate as the size of the signal constellation (or the number of uncoded bits) becomes increasingly large. In general we suppose that (1) signal points for both coded and uncoded transmission are drawn from the same lattice; (2) the signal constellations for coded and uncoded transmission are bounded by regions with the same shape. It then follows from (9), and the discussion which follows it, that the asymptotic coding gain $\gamma$ is given by

$$\gamma = \frac{(d^2)_{\text{coded}}}{(d^2)_{\text{uncoded}}} \times \left(\frac{|\Omega|_{\text{uncoded}}}{|\Omega|_{\text{coded}}}\right)^{2/N}. \tag{13}$$

For this coset code the asymptotic coding gain is 5/2, which agrees perfectly with the particular coding gain calculated above.

To summarize: trellis codes based on lattices and cosets allow reliable communication at any desired transmission rate. The (asymptotic) coding gain is a simple function of the redundancy and of the ratio of coded to uncoded squared distance.
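The arithmetic behind the figure of 5/2 can be replayed directly. The sketch below assumes that the 32-point constellation of Figure 8 is the usual cross shape (a 6 x 6 block of half-integer points with the four corners removed); the text does not describe its outline, but this shape does give the stated power $P = 5$. Function names are illustrative, and `asymptotic_gain` uses the constellation-size form of (13).

```python
from fractions import Fraction
import math


def half_integer_grid(side):
    """side x side block of points with half-integer coordinates (side even)."""
    coords = [Fraction(2 * k + 1, 2) for k in range(-side // 2, side // 2)]
    return [(x, y) for x in coords for y in coords]


def average_power(points):
    """Average squared norm, i.e. power normalized to 2 dimensions."""
    return sum(x * x + y * y for x, y in points) / len(points)


def asymptotic_gain(d2_coded, d2_uncoded, size_coded, size_uncoded, n):
    """Distance ratio times the power penalty for the larger constellation."""
    return (d2_coded / d2_uncoded) * (size_uncoded / size_coded) ** (2 / n)


# Assumed 32-point cross: drop the four corner points (+-5/2, +-5/2).
corner = Fraction(5, 2)
cross32 = [p for p in half_integer_grid(6)
           if not (abs(p[0]) == corner and abs(p[1]) == corner)]
square16 = half_integer_grid(4)

p_coded = average_power(cross32)      # power of the coded constellation
p_uncoded = average_power(square16)   # power of the uncoded constellation
coding_gain = (Fraction(5) / p_coded) / (Fraction(1) / p_uncoded)
coding_gain_db = 10 * math.log10(coding_gain)

if __name__ == "__main__":
    print(p_coded, p_uncoded, coding_gain, round(coding_gain_db, 2))
    print(asymptotic_gain(5, 1, 32, 16, 2))
```

Exact rational arithmetic is used for the powers so that the agreement between the particular and asymptotic gains is not blurred by rounding.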
Lattice Codes vs. Trellis Codes
Different lattices require different volumes to enclose the same number of signal points. It is therefore possible to save on signal power by an appropriate choice of lattice. Consider uncoded transmission at the same rate using two lattices $\Lambda_1$, $\Lambda_2$, with minimum norms $d^2(\Lambda_1)$ and $d^2(\Lambda_2)$, respectively. Suppose that the two signal constellations are bounded by regions with the same shape. We compute the figure of merit $d^2/P$ for the two lattices, and the ratio is the lattice gain $\gamma(\Lambda_1:\Lambda_2)$ of $\Lambda_1$ over $\Lambda_2$. By (9), this gain is given by

$$\gamma(\Lambda_1:\Lambda_2) = \frac{d^2(\Lambda_1)}{d^2(\Lambda_2)} \times \left(\frac{V(\Lambda_2)}{V(\Lambda_1)}\right)^{2/N}. \tag{14}$$

The gain of an N-dimensional lattice $\Lambda$ is the gain $\gamma(\Lambda:Z^N)$ with respect to the integer lattice. Now consider a coset code based on some quotient of an N-dimensional lattice $\Lambda$. We compute performance against that of uncoded transmission using the integer lattice $Z^N$. The new figure of merit is the product of the coding gain $\gamma$ of the coset code (given in (13)) and the lattice gain $\gamma(\Lambda:Z^N)$ given above.

When engineers speak of lattice codes they mean nonredundant transmission, where the signal points are drawn from a lattice other than $Z^N$. We admit that it is not altogether consistent to associate the term "code" with non-redundant transmission. Nevertheless we shall follow the engineers.

Example 2. The lattice $E_8$ was discovered in the last third of the nineteenth century by the Russian mathematicians Alexander N. Korkin and E. I. Zolotaroff, and by the English lawyer and amateur mathematician Thorold Gosset. The lattice points are vectors $(z_1, \ldots, z_8)$, where the $z_i$ are either all integers or all half-integers, and $z_1 + \cdots + z_8$ is an even integer. (For more information about sphere packings and about $E_8$ in particular, see the Scientific American article by Neil Sloane [12].) The lattice $E_8$ is unimodular (the fundamental volume $V(E_8) = 1$) and the minimum norm is $d^2(E_8) = 2$. The gain $\gamma(E_8:Z^8)$ over the integer lattice is 2, that is to say, about 3 dB.

The lattice $E_8$ is a fascinating mathematical object with a great many symmetries. Every lattice point has 240 nearest neighbors; the neighbors of the origin are the 112 points $(\pm 1)^2 0^6$ and the 128 points $(\pm 1/2)^8$ where the number of minus signs is even. This lattice offers a way of arranging unit spheres in 8-dim. space so that 240 spheres touch any given sphere. It is impossible to do better (see [4] for details).

Unfortunately, the large number of nearest neighbors really limits the usefulness of $E_8$ as a channel code. There is a 4-state trellis code with a coding gain equal to $\gamma(E_8:Z^8)$, and a path multiplicity of only 4. On the telephone channel, this corresponds to an order of magnitude difference in error probability. It is not completely fair to say that lattice codes make poor channel codes. But the lattice gain of a lattice code with many nearest neighbors should be taken with a pinch of salt.

Non-Equiprobable Signaling

Hitherto we have held to the principle of equiprobable signaling. We have shown how to improve performance by introducing redundancy, or by drawing signals from a lattice other than $Z^N$, or by a combination of the two. We have presented these gains as
coding gains. Now we are going after gains that result from shaping the signal constellation or (equivalently) from nonequiprobable signaling. Ungerboeck was guided by three design principles in his construction of 2-dim. trellis codes. The first of these is: "All signals should be equiprobable since good codes should be of regular structure." In this final section we shall demonstrate that the design rule "Signals with large norm should be used less frequently than signals with a small norm" is superior. This is not to say that the codes constructed by Ungerboeck are bad, because they are not. Nonequiprobable signaling will make them still better.

To motivate this second design rule, we recall that for any lattice the average signal power of a spherical constellation is less than that of a rectangular constellation. Now consider a constituent 2-dim. constellation obtained by projecting the constellation onto some pair of coordinates. For the rectangular constellation, signal points in the 2-dim. constituent are equiprobable. But for the spherical constellation, the probability distribution induced on a 2-dim. constituent favors light points over heavy points. It is worthwhile to consider a small example. Table 2 describes a spherical 4-dim. constellation with 512 signal points. The signal points are obtained from the representatives listed in Table 2 by applying all permutations of coordinates and all possible sign changes. Figure 10 shows the probability distribution induced on the constituent 2-dim. constellation. This example shows that it is possible to adjust the signal power either by shaping a higher dimensional constellation or by introducing nonequiprobable signaling on a lower dimensional constellation.

The classical problem of constellation shaping is that of choosing the region $\mathcal{R}$ that bounds the constellation. We generalize this by choosing a region $\mathcal{R}$, and a probability distribution on the signal points within $\mathcal{R}$.
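The remark that a spherical constellation needs less power than a rectangular one of the same size is easy to check numerically in two dimensions. The following is a minimal sketch, assuming the translated lattice $Z^2 + (1/2, 1/2)$ and 256 signal points; both choices, and the function names, are illustrative rather than taken from the article. The round constellation is simply the 256 points of lowest energy.

```python
import math


def square_constellation(side):
    """side x side block of half-integer points centred on the origin (side even)."""
    coords = [k + 0.5 for k in range(-side // 2, side // 2)]
    return [(x, y) for x in coords for y in coords]


def round_constellation(size, reach):
    """The `size` lowest-energy points of Z^2 + (1/2, 1/2).

    `reach` bounds the search box and must be large enough to contain them.
    """
    coords = [k + 0.5 for k in range(-reach, reach)]
    points = [(x, y) for x in coords for y in coords]
    points.sort(key=lambda p: p[0] ** 2 + p[1] ** 2)
    return points[:size]


def average_power(points):
    """Average squared norm of the signal points."""
    return sum(x * x + y * y for x, y in points) / len(points)


square = square_constellation(16)
disc = round_constellation(256, reach=12)
shape_gain = average_power(square) / average_power(disc)
shape_gain_db = 10 * math.log10(shape_gain)

if __name__ == "__main__":
    print(average_power(square), average_power(disc))
    print(shape_gain, shape_gain_db)
```

For a constellation of this size the ratio should sit close to the continuous value $\pi/3$ (about 0.2 dB) quoted in the final section for the circle over the square.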
As always, we begin with an example.
Example 3. Here the signal constellation is 1-dimensional, and consists of the $2^{b+1}$ half-integers between $-2^b$ and $2^b$. The weight $\mathrm{wt}(x)$ of a signal point $x$ is given by

$$\mathrm{wt}(x) = \begin{cases} 0, & \text{if } x \in (-2^{b-1}, 2^{b-1}), \\ 1, & \text{if } x \in (-2^b, -2^{b-1}) \cup (2^{b-1}, 2^b). \end{cases} \tag{15}$$

Signal points $x$ for which $\mathrm{wt}(x) = 0$ are considered light and signal points for which $\mathrm{wt}(x) = 1$ are considered heavy. The average power (normalized to 2 dimensions) of the light signal points is $P_0 = 2^{2b}/6$, and the average power of the heavy signal points is $P_1 = 7 \times 2^{2b}/6$. The average transmitted signal power $P$ is given by

$$P = f_0 P_0 + f_1 P_1, \tag{16}$$

where $f_e$ is the frequency that $\mathrm{wt}(x) = e$. Here we are assuming that if signals $x$, $y$ are both light, or both heavy, then they are equiprobable. We control the average signal power $P$ by allowing only those signal sequences $(y_i)$ for which $(\mathrm{wt}(y_i))$ is in a restricted set $C$, which we call a shaping code. Consider the shaping code that results from concatenating binary 7-tuples with no more than 3 nonzero entries (there are $1 + 7 + 21 + 35 = 64$ such 7-tuples). Then

$$f_1 = \frac{1}{7} \sum_{i=0}^{3} i \binom{7}{i} \Big/ 64 = \frac{22}{64}, \qquad f_0 = \frac{42}{64},$$

and the signal power is $P = 49P_0/16$. We have doubled the size of the signal constellation in order to increase the transmission rate by 6/7 bit/dim. If the signals were equiprobable, then increasing the transmission rate by 1 bit/dim. (doubling the size of a 1-dim. constellation) would increase the average signal power $P$ by a factor of 4 (see (9)). Here the average signal power has only been increased by a factor of 49/16 with respect to $P_0$. Equiprobable uncoded transmission at rate $b + 6/7$ bits/dim. with the integer lattice $Z$ uses the half-integers between $-2^{b-1/7}$ and $2^{b-1/7}$. The average signal power $P'$ is given by

$$P' = \frac{2 \cdot 2^{2(b-1/7)}}{3} = 2^{12/7} P_0.$$

The shaping code produces a gain (over equiprobable signaling) of $(2^{12/7}P_0)/(49P_0/16)$, which is approximately 0.3 dB. For simplicity we shall not describe how to integrate coset coding with nonequiprobable signaling. Suffice it to say that this can be done and that the shaping gain can be added to the coding gain.

Table 2. Representatives from a spherical 4-dimensional constellation with 512 signal points.

Figure 10. The probability distribution induced on the constituent 2-dimensional constellation.

In N dimensions, the shape gain $\gamma_N$ of the N-sphere over the N-cube is given by

$$\gamma_N = \frac{\int_{[-1/2,1/2]^N} \|r\|^2 \, dV}{\int_{S_{N-1}} \|r\|^2 \, dV} = \frac{N/12}{\dfrac{N \, \Gamma(N/2 + 1)^{2/N}}{(N+2)\pi}},$$

where $S_{N-1}$ is an N-dim. sphere with unit volume, centered at the origin, and $\Gamma$ is the gamma function. Then $\gamma_2 = \pi/3 \approx 0.2$ dB, $\gamma_8 \approx 0.73$ dB, $\gamma_{24} \approx 1.10$ dB, and Stirling's approximation ($\Gamma(N/2 + 1) \sim (N/2e)^{N/2}$) shows $\gamma_N \to \pi e/6 \approx 1.53$ dB as $N \to \infty$. However, it is possible to achieve the shape gain $\gamma_{24}$ without the bother of implementing a 24-dimensional signaling scheme. The trick is non-equiprobable signaling (see [2] for more details).

Figure 11. Decomposition of a spherical 2-dimensional constellation $\Omega$ into subconstellations $\Omega_0, \Omega_1, \ldots, \Omega_{T-1}$.

Figure 12. The biasing gain $\gamma(f_0, \ldots, f_{T-1})$ as a function of $x$, where $f_i = (1 - x)x^i/(1 - x^T)$.

Our generalization of constellation shaping starts with a region $\mathcal{R}$, and we increase constellation size by scaling this basic region. We obtain a nested sequence $\mathcal{R} = a_0\mathcal{R} \subset a_1\mathcal{R} \subset \cdots \subset a_{T-1}\mathcal{R}$ of copies of a basic region $\mathcal{R}$. This allows us to express a constellation $\Omega$ bounded by $a_{T-1}\mathcal{R}$ as a union of subconstellations $\Omega_0, \ldots, \Omega_{T-1}$, where $\Omega_i = \Omega \cap (a_i\mathcal{R} \setminus a_{i-1}\mathcal{R})$, $i = 1, 2, \ldots, T-1$, and $\Omega_0 = \Omega \cap \mathcal{R}$ (see Figure 11). We require $|\Omega_0| = |\Omega_1| = \cdots = |\Omega_{T-1}|$, so that $V(a_i\mathcal{R}) = (i+1)V(\mathcal{R})$. Hence $a_i = (i+1)^{1/N}$. From now on we fix $N = 2$. Let $P_i = t_i P_0$ be the average power of the constellation $\Omega_i$. Then $t_0 = 1$, and by (9),

$$\frac{1}{i} \sum_{j=0}^{i-1} t_j P_0 = i P_0.$$
It follows that $t_i = 2i + 1$, and that $P_i = (2i + 1)P_0$. We suppose that the constellation $\Omega_i$ is used with frequency $f_i$ and that signal points within $\Omega_i$ are equiprobable. If we are required to use symbols $0, 1, \ldots, T-1$ with frequencies $f_0, f_1, \ldots, f_{T-1}$, respectively, then it is possible to transmit

$$H(f_0, \ldots, f_{T-1}) = -\sum_{i=0}^{T-1} f_i \log_2 f_i$$

bits of information. For example, if $T = 2^n$, and the symbols $0, 1, \ldots, T-1$ are equiprobable, then $H(f_0, \ldots, f_{T-1}) = \sum_{i=0}^{T-1} n/2^n = n$ bits. If we were to transmit this information by equiprobable signaling, using a signal constellation bounded by a scaled version of the region $\mathcal{R}$, then the average signal power would be given by $P' = 2^{H(f_0, \ldots, f_{T-1})} P_0$. The biasing gain $\gamma(f_0, \ldots, f_{T-1})$ is defined to be the maximum gain that can result from signaling with frequencies $f_0, f_1, \ldots, f_{T-1}$ and is given by

$$\gamma(f_0, \ldots, f_{T-1}) = \frac{2^{H(f_0, \ldots, f_{T-1})}}{\sum_{i=0}^{T-1} (2i + 1) f_i}.$$

The shaping gain is the product of this biasing gain and the shape gain of the basic region $\mathcal{R}$ over the square (the shape gain of the circle over the square is $\pi/3 \approx 0.2$ dB). A simple constrained maximization shows that the biasing gain approaches $e/2$ as $T \to \infty$. The product $\pi/3 \times e/2$ is of course the potential shape gain. In Figure 12 we set $f_i = (1 - x)x^i/(1 - x^T)$, $i = 0, 1, \ldots, T-1$, and plot the biasing gain as a function of $x$ (this choice for $f_i$ drops out of the constrained maximization). Very few probability levels suffice to achieve almost all the potential shaping gain of $\pi e/6 \approx 1.53$ dB.
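Both ingredients of the shaping gain are simple enough to evaluate numerically. The sketch below assumes the closed form $\gamma_N = \pi(N+2)/(12\,\Gamma(N/2+1)^{2/N})$ obtained by simplifying the ratio of second moments, and it searches the family $f_i = (1-x)x^i/(1-x^T)$ on a coarse grid of $x$ values; the grid and the function names are illustrative choices, not part of the article.

```python
import math


def shape_gain(n):
    """Shape gain of the n-sphere over the n-cube, using lgamma for large n."""
    return math.pi * (n + 2) / (12 * math.exp(2 * math.lgamma(n / 2 + 1) / n))


def to_db(ratio):
    """Convert a power ratio to decibels."""
    return 10 * math.log10(ratio)


def biasing_gain(freqs):
    """2^H(f) divided by sum_i (2i + 1) f_i, for N = 2."""
    entropy = -sum(f * math.log2(f) for f in freqs if f > 0)
    mean_power = sum((2 * i + 1) * f for i, f in enumerate(freqs))
    return 2 ** entropy / mean_power


def geometric_freqs(x, levels):
    """f_i = (1 - x) x^i / (1 - x^T) for i = 0, ..., T - 1, with 0 < x < 1."""
    scale = (1 - x) / (1 - x ** levels)
    return [scale * x ** i for i in range(levels)]


def best_biasing_gain(levels, steps=99):
    """Largest biasing gain over the grid x = 1/(steps+1), ..., steps/(steps+1)."""
    return max(biasing_gain(geometric_freqs(k / (steps + 1), levels))
               for k in range(1, steps + 1))


if __name__ == "__main__":
    for n in (2, 8, 24, 10 ** 6):
        print(n, round(to_db(shape_gain(n)), 3), "dB")
    for levels in (2, 4, 8, 16):
        print(levels, round(best_biasing_gain(levels), 4), "vs e/2 =", round(math.e / 2, 4))
```

Running the loop over the number of levels gives a feel for how quickly the biasing gain climbs toward $e/2$ as $T$ grows.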
References

1. H. S. Black, Modulation Theory, New York: van Nostrand (1953).
2. A. R. Calderbank and L. H. Ozarow, Non-equiprobable signaling on the Gaussian channel, IEEE Trans. Inform. Theory, to appear.
3. A. R. Calderbank and N. J. A. Sloane, New trellis codes based on lattices and cosets, IEEE Trans. Inform. Theory IT-33 (1987), 177-195.
4. J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, New York: Springer-Verlag (1987).
5. W. L. Ferrar, On the consistency of cardinal function interpolation, Proc. Roy. Soc. Edinburgh 47 (1927), 230-242.
6. G. D. Forney, Coset codes I: Introduction and geometrical classification, IEEE Trans. Inform. Theory IT-34 (1988), 1123-1151.
7. A. J. Jerri, The Shannon Sampling Theorem, its various extensions and applications: a tutorial review, Proc. IEEE 65 (11) (1977), 1565-1596.
8. V. A. Kotel'nikov, On the transmission capacity of "ether" and wire in electrocommunications (material for the first all-union conference on questions of communications), Izd. Red. Upr. Svyazi RKKA (Moscow) (1933).
9. H. Nyquist, Certain topics in telegraph transmission theory, AIEE Trans. 47 (1928), 617-644.
10. E. M. Reingold, J. Nievergelt, and N. Deo, Combinatorial Algorithms: Theory and Practice, Englewood Cliffs, NJ: Prentice-Hall (1977).
11. C. E. Shannon, Communications in the presence of noise, Proc. IRE 37 (1949), 10-21.
12. N. J. A. Sloane, The packing of spheres, Scientific American 250 (1) (1984), 116-125.
13. G. Ungerboeck, Channel coding with multilevel/phase signals, IEEE Trans. Inform. Theory IT-28 (1982), 55-67.
14. G. Ungerboeck, Trellis coded modulation with redundant signal sets, IEEE Comm. Magazine 25 (1987), 5-21.
15. E. T. Whittaker, On the functions which are represented by the expansions of the interpolation theory, Proc. Roy. Soc. Edinburgh 35 (1915), 181-194.
16. J. M. Whittaker, The Fourier theory of the cardinal functions, Proc. Math. Soc. Edinburgh (1929), 169-176.
17. J. M. Whittaker, Interpolatory Function Theory, Cambridge, England: Cambridge University Press (Cambridge Tracts in Mathematics and Mathematical Physics, No. 33, 1935).

Mathematical Sciences Research Center
AT&T Bell Laboratories
Murray Hill, NJ 07974 USA
Ian Stewart*

The catapult that Archimedes built, the gambling-houses that Descartes frequented in his dissolute youth, the field where Galois fought his duel, the bridge where Hamilton carved quaternions; not all of these monuments to mathematical history survive today, but the mathematician on vacation can still find many reminders of our subject's glorious and inglorious past: statues, plaques, graves, the café where the famous conjecture was made, the desk where the famous initials are scratched, birthplaces, houses, memorials. Does your hometown have a mathematical tourist attraction? Have you encountered a mathematical sight on your travels? If so, we invite you to submit to this column a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks. Please send all submissions to the European Editor, Ian Stewart.
The Birth of Galois and the Death of Condorcet
Hervé Lehning

What could be similar about Condorcet and Galois? From a mathematical point of view, very little I believe; the first one was acknowledged during his lifetime and the second one was not. Condorcet (1743-1794) is known for having been one of the writers of the mathematical articles of the Encyclopédie and for his studies on probability. Galois (1811-1832) is known for his studies on equations and as a famous example of unrecognized genius. However, one of them was born and the other one died in the same street of the same little town. The name of the town was Bourg Égalité, the name of the street Grand Rue, and the numbers of the houses were 54 and 81.

If you come to Paris, it is easy to go to that place. Take the métro/R.E.R. line B, leading to Saint Rémy les Chevreuses or Robinson. Get off at Bourg-la-Reine station (about 10 minutes later). As before the Révolution, since a decree of Napoléon (7 October 1812, Moscow), Bourg Égalité has been called Bourg-la-Reine. When you leave the station, you are at Place de la Gare. Walk straight on in the direction of the Town Hall (Mairie in French) approximately 200 meters away. You will walk along a street called René Roeckel. At the right of the entrance of the Mairie, you will see a plaque with the following inscription:
* Column Editor's address: Mathematics Institute, University of Warwick, Coventry CV4 7AL England.
A LA MEMOIRE
de Mr. GALOIS
MAIRE DE BOURG-LA-REINE PENDANT 15 ANS
MORT EN 1829
LES HABITANS RECONNAISSANTS.

This means: To the memory of Mr. Galois, Mayor of Bourg-la-Reine for 15 years, who died in 1829; from the grateful inhabitants. Of course, the person meant is not the mathematician Evariste but his father Nicolas Gabriel.

Then retrace your steps: the wide street you crossed is the former Grand Rue. Since the Liberation, it has been called Avenue du Général Leclerc. As a matter of fact, Leclerc's army used it to go to liberate Paris in 1944. Then turn right on that street. You are at Place Condorcet. Continue walking and on the right side of the street, at number 81, you will see an old house used by a hairdresser. There is a plaque beside his sign (a pair of scissors):

MARIE JEAN ANTOINE NICOLAS CARITAT
MARQUIS DE CONDORCET
EST MORT DANS CETTE MAISON
LE 9 GERMINAL AN 2 (30 MARS 1794)

This means: Marie Jean Antoine Nicolas Caritat, Marquis of Condorcet, died in this house on Germinal 9th of Year 2 (30 March 1794). Below, a small plaque in green plastic recalls the life
of Condorcet without any mention of his mathematical works. Proscribed after the fall of the Girondins, Condorcet lived hidden in Paris for several months. In order not to compromise the people who were hiding him, he tried unsuccessfully to take refuge at another friend's, or so it seems. He was arrested without being recognized at Clamart on 27 March. Then he was jailed at Bourg-la-Reine. He was found dead in this small house on the 30th. It might have been a suicide, because Condorcet was known to always carry some poison with him.

Continue along the same street; after about one hundred meters, you will come to an important crossing. On your left, at number 54, you will see a bank. Above the sign, on the right, is a small plaque:

ICI EST NE
EVARISTE GALOIS
ILLUSTRE MATHEMATICIEN FRANCAIS
MORT A VINGT ANS
1811-1832

This means: Here Evariste Galois was born, a famous French mathematician who died at the age of twenty, 1811-1832. Had you been here 15 years ago, you would have been able to see the house of Evariste's birth. The present building is recent.

Bourg-la-Reine cemetery is not far away, but it is useless to go there if you want to see the graves of Condorcet and Evariste Galois. Condorcet is at the Panthéon, Galois is at Montparnasse Cemetery (without any stele, simply under the earth at a place that is not well-determined). If you backtrack and continue past Place Condorcet for a hundred meters, you will find another large crossing. On your left, a street called Galois (after Evariste's father) begins; on your right, the Allée d'Honneur leads to the castle and the park of Sceaux (about 500 meters). A walk in the park will nicely complete your visit.

13 rue Letellier
75015 Paris, France
History and Variation on the Theme of the Frobenius Reciprocity Theorem*
Floyd L. Williams

* This paper is based on the author's J. Clarence Karcher Lecture given at the University of Oklahoma.

The search for truth is in one way hard and in another way easy. For it is evident that no one can master it fully nor miss it wholly. But each adds a little to our knowledge of nature, and from all the facts assembled there arises a certain grandeur.
Aristotle

The German mathematician Georg Ferdinand Frobenius (1849-1917) was born in Berlin on 26 October 1849, the son of Christian Ferdinand Frobenius. After attending the Joachimsthal Gymnasium, Frobenius began his mathematical studies at Göttingen in 1867. Under the direction of the inimitable Karl Weierstrass he earned a doctorate three years later in Berlin. In ensuing years, Frobenius made fundamental discoveries in such diverse areas as matrix theory mod p and $\theta$-functions in several variables. His particularly productive work during the 1870-75 period is chronicled in Crelle's Journal. The major achievements of Frobenius are undoubtedly in the theory of group representations, a theory he initiated in 1896 at the age of 47. Along with its three historic roots (Galois theory, Lie theory, and number theory) it stands as a cornerstone of modern mathematics.

Group Representations and Frobenius Reciprocity

The theory of group representations (or group characters) was created by Frobenius in an attempt to generalize the theory of characters of a finite Abelian group, so we begin with the latter notion. It appears, implicitly, as early as 1801 in the work of Gauss (viz., in his Disquisitiones Arithmeticae), and in later years, beginning in 1878, Richard Dedekind also considered the notion (again implicitly). An extensive correspondence between Dedekind and Frobenius, spanning periods from 1882-83 and from 1895-98, provides valuable insight into the development of Frobenius's creation. A fascinating and learned account of this correspondence can be found in the Hawkins paper [6]. In 1881 H. Weber formalized the notions due to Gauss and Dedekind of a character of a finite Abelian group. Namely, one has the following:

Definition 1. Let $T = \{z \in \mathbb{C} : |z| = 1\}$ be the multiplicative group of complex numbers of unit modulus. A character of a finite Abelian group $A$ is a group homomorphism $\chi: A \to T$.

Under point-wise multiplication the set $\hat{A}$ of characters of $A$ is itself an Abelian group. $\hat{A}$ is in fact finite and has the cardinality $|A|$ of $A$. One has the famous orthogonality relations for $\chi_1, \chi_2 \in \hat{A}$:

$$\sum_{a \in A} \chi_1(a)\overline{\chi_2(a)} = \begin{cases} 0 & \text{if } \chi_1 \ne \chi_2, \\ |A| & \text{if } \chi_1 = \chi_2. \end{cases} \tag{1}$$

Moreover, every complex-valued function $f$ on $A$ has a finite "Fourier series" expansion:

$$f = \sum_{\chi \in \hat{A}} n(\chi)\chi, \quad \text{where } n(\chi) = \frac{1}{|A|} \sum_{a \in A} f(a)\overline{\chi(a)}. \tag{2}$$

In connection with equation (2), we note the following. In the case of the unit circle $T$ in Definition 1 ($T$ being a non-finite Abelian group) the continuous characters of $T$ have the form $\chi = \chi_n : z \mapsto z^n$, $n \in \mathbb{Z}$. Any $f \in L^2(T, d\theta/2\pi)$ has an infinite Fourier series $L^2$-expansion

$$f = \sum_{n \in \mathbb{Z}} c(n)\chi_n, \quad \text{where } c(n) = \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\theta}) e^{-in\theta} \, d\theta. \tag{3}$$

Such expansions as in equation (3) arose initially in connection with the theory of partial differential equations and have been celebrated since Fourier's submission to the French Academy of his 1807 memoir on the analytical theory of heat conduction.

Let $m$ be a positive integer. An important application of character theory comes by way of the multiplicative group $A(m)$ of invertible elements in the ring $\mathbb{Z}/m\mathbb{Z}$. $|A(m)|$ equals $\phi(m)$, where $\phi$ is Euler's $\phi$-function; i.e., $\phi(m) = |\{x \in \mathbb{Z} : 1 \le x \le m, \ x \text{ is prime to } m\}|$. A character $\chi \in \hat{A}(m)$ gives rise to a Dirichlet L-function:

$$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}, \quad s \in \mathbb{C}. \tag{4}$$

Using properties of L-functions Dirichlet proved the following result:

THEOREM 1. Let $n$, $m$ be relatively prime positive integers. Then there exist infinitely many primes $p$ such that $p \equiv n \pmod{m}$.

The Dirichlet series in equation (4) was introduced in 1840. Dirichlet, by the way, first gave a rigorous convergence proof of Fourier series. For $\chi = 1$ (the trivial character), $L(s, 1) = \prod_{p \mid m} (1 - p^{-s}) \zeta(s)$, where

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} \tag{5}$$

is the usual Riemann zeta function; here $\operatorname{Re} s > 1$. Generalizing Weber, Frobenius made the following definition:

Definition 2. Let $G$ be a finite group (not necessarily Abelian). A representation of $G$ on a finite-dimensional vector space $V$ is a homomorphism $\pi: G \to GL(V)$.

Here $GL(V)$ is the group of invertible linear operators on $V$. It is customary to write $a \cdot v$ for $\pi(a)v$, $(a, v) \in G \times V$, and thus to regard $V$ (equivalently) as a left G-module. If $P$ is a subgroup of $G$, then $\pi|_P$ is a representation of $P$ on $V$. It is clear, however, that most representations of $P$ cannot arise this way. On the other hand there is, fortunately, a procedure by which left G-modules are constructed from left P-modules: Let $W$ be a left P-module, and define $IW$ to be the space of functions $f: G \to W$ satisfying $f(ap) = p^{-1} \cdot f(a)$ for $(a, p) \in G \times P$. $IW$ has a left G-module structure given by $(a_1 \cdot f)(a) = f(a_1^{-1} a)$ for $(a_1, a) \in G \times G$, $f \in IW$. We may of course view $I: W \to IW$ as a functor from left P-modules to left G-modules. In 1898 Frobenius proved the following key result, which relates the induction functor with the restriction functor [4].
THEOREM 2 (Frobenius reciprocity). Let P be a subgroup of a finite group G, and let V, W be left G-, P-modules, respectively. Then the vector spaces $\mathrm{Hom}_G(V, IW)$, $\mathrm{Hom}_P(V|_P, W)$ are canonically isomorphic.

Consider the special case P = {1}, where 1 is the identity of G, and where W is the field $\mathbb{C}$ of complex numbers. Then IW is the space of all functions $f: G \to \mathbb{C}$. The corresponding representation of G on IW (where again G acts by left-translation on a function) is then called the (left) regular representation of G. A G-module V is said to be irreducible if V and the trivial subspace {0} are the only G-invariant subspaces of V. An immediate corollary of Theorem 2 is the following:

Corollary 1. An irreducible G-module V occurs in the regular representation of G exactly dim V times. In other words, the regular representation contains exactly dim V submodules G-isomorphic to V.
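Theorem 2 and Corollary 1 can be sanity-checked on a small example by counting dimensions with characters. The sketch below is only an illustration under stated assumptions, not Frobenius's argument: it takes G to be the symmetric group on three letters, P the two-element subgroup generated by one transposition, and W the sign character of P. It also assumes the standard character table of G, the standard formula for the character of an induced module, and the fact that the dimension of a Hom space equals the inner product of the two characters. All function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# G = S3 realised as permutations of (0, 1, 2); g maps i -> g[i].
G = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def mul(g, h):
    """Composition: apply h first, then g."""
    return tuple(g[h[i]] for i in range(3))

def inv(g):
    out = [0] * 3
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def sign(g):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1

# Characters of the three irreducible G-modules (assumed standard facts about S3).
IRREDUCIBLES = {
    "trivial": lambda g: 1,
    "sign": sign,
    "standard": lambda g: sum(1 for i in range(3) if g[i] == i) - 1,
}

def inner(chi, psi, group):
    """(1/|group|) * sum of chi(a) * psi(a); every character used here is real-valued."""
    return Fraction(sum(chi(a) * psi(a) for a in group), len(group))

# P = {identity, transposition (0 1)}; W = sign character restricted to P.
P = [IDENTITY, (1, 0, 2)]
chi_W = sign

def induced(chi, subgroup):
    """Character of IW: (1/|P|) * sum over x in G of chi(x^-1 g x) when that lies in P."""
    members = set(subgroup)
    def chi_ind(g):
        total = 0
        for x in G:
            conjugate = mul(inv(x), mul(g, x))
            if conjugate in members:
                total += chi(conjugate)
        return Fraction(total, len(subgroup))
    return chi_ind

def chi_regular(g):
    """Character of the regular representation: |G| at the identity, 0 elsewhere."""
    return len(G) if g == IDENTITY else 0

chi_IW = induced(chi_W, P)

# dim Hom_G(V, IW) versus dim Hom_P(V restricted to P, W), for each irreducible V.
reciprocity = {
    name: (inner(chi_V, chi_IW, G), inner(chi_V, chi_W, P))
    for name, chi_V in IRREDUCIBLES.items()
}

# Multiplicity of each irreducible V in the regular representation (Corollary 1).
multiplicities = {
    name: inner(chi_V, chi_regular, G) for name, chi_V in IRREDUCIBLES.items()
}

for name in IRREDUCIBLES:
    print(name, reciprocity[name], multiplicities[name])
```

In each pair the two dimensions agree, as Theorem 2 requires, and each multiplicity in the regular representation equals the dimension of the corresponding module. Exact rational arithmetic is used so the comparisons do not depend on rounding.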
Historically, Corollary 1 precedes Theorem 2. Frobenius worked hard and long, with some despair, to obtain Corollary 1. Having obtained the result in some special cases (for example when G is the tetrahedral group of order 12), he found the general case elusive, so much so that at one point, in his 26 April 1896 letter to Dedekind, he asked for a possible counter-example so that he should not "go astray." Frobenius needed the result of Corollary 1 to complete his theory of the group determinant and thus provide the "missing link" between that theory and Dedekind's theory of commutative hyper-complex number systems. Frobenius finally proved the corollary in an 1896 work [3]. In another land, W. Burnside managed to "rediscover" various results of Frobenius: for example, proofs of the Sylow theorems, results on groups of order $p^{\alpha}q$, etc.
Up until about 1923 the theory of group representations was purely a branch of algebra. The ideas of Frobenius continued to be developed by Frobenius, Burnside, I. Schur, and others. Important applications to the theory of finite group structure continued to evolve. Frobenius and Burnside proved the following, for example.

THEOREM 3. Suppose p, q are primes and G is a finite group. If p divides $|G|$, then G contains an element of order p. If $|G| = p^n q^m$ for some positive integers n and m, then G is solvable.

Matters changed in 1924 when Schur recognized that by using the work of A. Hurwitz on invariant integration on manifolds the "averaging process"
\[ \sum_{a \in G} f(a_0 a) = \sum_{a \in G} f(a) \tag{6} \]
applied to a function f on a finite group G could be extended to continuous compact groups, i.e., to compact Lie groups. The scope of Frobenius's work enlarged further when about the same time (in the early 1920s) important applications were found for number theory by E. Artin and E. Hecke, and for the developing quantum theory by H. Weyl. In his famous book [12] Weyl uses the character theory of the symmetric group $S_n$ on n letters to give a group-theoretic classification of line spectra of an atom with n electrons. The work of E. Cartan and Weyl, and of F. Peter and Weyl (1927) was decisive in completely clarifying the structure of the representations of compact Lie groups and set the stage for the modern development both of Lie theory and the infinite dimensional representation theory of semisimple Lie groups.

The Hurwitz invariant integral was extended to compact groups (not necessarily continuous) and in fact to locally compact groups in 1933 by A. Haar. Uniqueness, up to a positive constant, of the invariant integral (which is nontrivial) was established by J. von Neumann. Such results, on the one hand, permitted a natural extension of Frobenius reciprocity to compact groups, and on the other hand they were the basis from which the work of G. Mackey would ensue. In two papers, 1952, 1953, Mackey [7] laid the foundations of the theory of induced representations of locally compact groups. In particular he established a direct integral version of Frobenius reciprocity. Such a result is too technical to be stated here. If the locally compact group G in question has a much tighter structure, then a considerable refinement of the Mackey version of reciprocity is possible. One such refinement is due to the author when G is, for example, a connected complex semisimple Lie group [14].

A direct integral decomposition of a group representation is a generalization and "continuous" analogue of a direct sum decomposition. The simplest example of the former (and the prototype of all such examples) is the decomposition of the regular representation $\pi$ of the real number field $\mathbb{R}$ on the Hilbert space $L^2(\mathbb{R},dx)$, where dx denotes Lebesgue measure. As in the earlier discussion, $\pi$ is defined by the left-translation action of $\mathbb{R}$ on $L^2$-functions. The Fourier transform F, where
\[ (Ff)(t) = \int_{\mathbb{R}} f(x)\,e^{ixt}\,dx \tag{7} \]
for $f \in L^1(\mathbb{R},dx) \cap L^2(\mathbb{R},dx)$, $t \in \mathbb{R}$, gives a unitary map F of $L^2(\mathbb{R},dx)$. The unitarily equivalent representation (or "model" of $\pi$) $F \circ \pi \circ F^{-1}$ of $\mathbb{R}$ on $L^2(\mathbb{R},dx)$ is a realization of $\pi$ as a direct integral or continuous sum (in a sense readily made precise) of characters $\chi_t : x \mapsto e^{ixt}$, $t, x \in \mathbb{R}$, of $\mathbb{R}$.

Besides the Mackey formulation of Frobenius reciprocity in the general context of locally compact groups, other variations on the theme are due to F. Mautner [8] as early as 1951, and later to W. Armacost [1] and C. Moore [9]. In a quite different context Bott considered the notion of holomorphic induction, where G and a closed subgroup P of G are assumed to be complex Lie groups such that G/P is a complex compact manifold. Thus, locally, G (and similarly P, G/P) is homeomorphic to an open set in complex n-space $\mathbb{C}^n$. The group operations (multiplication and the map $x \mapsto x^{-1}$, $x \in G$) are holomorphic with respect to the holomorphic structure on G (or on P, respectively). Analogous to Definition 2 we have the following definition:
Definition 3. A holomorphic representation of G (a complex Lie group) on a finite-dimensional complex vector space V is a homomorphism $\pi: G \to GL(V)$ that is a holomorphic map; i.e., for each $v \in V$ and $f \in V^*$ (the dual space), the map $x \mapsto f(\pi(x)v)$ of G to $\mathbb{C}$ is holomorphic.

We also say that V in Definition 3 is a holomorphic G-module. Given a holomorphic P-module W one can form (exactly as above) the space IW of holomorphic functions $f: G \to W$ satisfying $f(ap) = p^{-1} \cdot f(a)$ for $(a,p) \in G \times P$, and assign a left G-module structure to IW, again defining $a_1 \cdot f$ by the equation $(a_1 \cdot f)(a) = f(a_1^{-1}a)$ for $(a_1,a) \in G \times G$, $f \in IW$. Similar to Theorem 2, Bott proves in [2] the following:

THEOREM 4 (Holomorphic Frobenius reciprocity). If W, V are holomorphic modules for P, G, respectively, then the vector spaces $\mathrm{Hom}_G(V, IW)$, $\mathrm{Hom}_P(V|_P, W)$ are canonically isomorphic.

Actually quite a bit more than Theorem 4 is established in [2], again under the assumption that G/P is compact. Namely, the space IW has a certain geometric interpretation. One can show that it is what geometers call the "space of holomorphic sections of a (suitable) holomorphic vector bundle over G/P." Given this important interpretation one can, as Bott does in [2], greatly generalize the formulation of Theorem 4 by using what is called cohomology, in particular sheaf cohomology [2] and Lie algebra cohomology [10]. Because of the various technicalities involved we will not attempt to state here any "higher cohomology" version of Frobenius reciprocity. Two such versions are presented in [13] with full details. Other such versions are treated in [11], [15]. The point we are trying to make here is that a simple idea of Frobenius has blossomed to unforeseen heights of beauty and profundity, to permeate and shape various developments on the modern mathematics horizon.
Acknowledgments: The author would like to express sincere thanks to the University of Oklahoma and to the Department of Mathematics for the kind and honored invitation to present this particular J. Clarence Karcher lecture. I thank Professor Robert Fisher, Jr., and the many department members who insured that my stay in Oklahoma would be especially comfortable.
Georg Frobenius
References
1. W. Armacost, The Frobenius reciprocity theorem and essentially bounded induced representations, Pacific J. Math. 36 (1971), 31-42.
2. R. Bott, Homogeneous vector bundles, Annals of Math. 66 (1957), 203-248.
3. G. Frobenius, Über die Primfactoren der Gruppendeterminante, Sitzungsber. Kön. Preuss. Akad. d. Wiss. Berlin, 1343-1382 (1896 c.).
4. G. Frobenius, Über Relationen zwischen den Characteren einer Gruppe und ihrer Untergruppen, Sitzungsber. Kön. Preuss. Akad. d. Wiss. Berlin, 501-515 (1898).
5. K. Gauss (Briefwechsel, Band II 1, p. 268 (1900)).
6. T. Hawkins, New light on Frobenius' creation of the theory of group characters, Archive for the History of the Exact Sciences 12 (1974), 217-243.
7. G. Mackey, Induced representations of locally compact groups I, II, Annals of Math. 55, 58 (1952, 1953), 101-139, 193-221.
8. F. Mautner, A generalization of the Frobenius reciprocity theorem, Proc. Nat. Acad. Sci., U.S.A. 37 (1951), 431-435.
9. C. Moore, On the Frobenius reciprocity theorem for locally compact groups, Pacific J. Math. 12 (1962), 359-365.
10. G. Hochschild and J.-P. Serre, Cohomology of Lie algebras, Annals of Math. 57 (1953), 591-603.
11. T. Enright and N. Wallach, Notes on homological algebra and representations of Lie algebras, Duke Math. J. 47 (1980), 1-15.
12. H. Weyl, The theory of groups and quantum mechanics, New York: Dover Publications, Inc. (1931).
13. F. Williams, Frobenius reciprocity and Lie group representations on $\bar{\partial}$-cohomology spaces, L'Enseignement Mathématique (2) 28 (1982), 3-30.
14. F. Williams, Tensor products of principal series representations of complex semisimple Lie groups, Lecture Notes in Math. 358, New York: Springer-Verlag (1973).
15. G. Zuckerman, Notes on the construction of representation by derived functors, Dept. Math. Yale Univ.
Department of Mathematics
University of Massachusetts
Amherst, MA 01003 USA
Quantum Theory and the Lattice Join
James H. McGrath
Quantum theory presents us with a dramatic problem. Its predictions are beyond reproach. Yet every attempt to explain quantum events affronts common sense. If quantum theory is correct, something we consider to be nonsense must be true; the theory prompts us to decide which common sense beliefs to jettison. This problem has engaged some of our century's best minds.

In 1933, John von Neumann, then thirty, was the youngest member of the newly opened Institute for Advanced Study. A year earlier his Mathematische Grundlagen der Quantenmechanik had provided a rigorous Hilbert space axiomatization of the intuitive methods then used in the new quantum theory. Also in 1933, Garrett Birkhoff, the son of George David Birkhoff, had just graduated from Harvard and had begun his work on abstract algebra and lattice theory. In 1936, the two collaborated to write The Logic of Quantum Mechanics, which initiated a cluster of research programs now called quantum logic.

The familiar property lattice we inherited from George Boole serves us well from day to day. However, according to Birkhoff and von Neumann, for quantum theory we need to replace our everyday property lattice with a new one suggested by the structure of Hilbert space, a quantum property lattice. In this article, I attempt to provide an informal account of this proposal; an afterword contains references and my paper [9] is an elaboration.

A word of caution may be in order. The value of this account to the mathematical reader may be in its leading him to appreciate the ingenious Birkhoff-von Neumann strategy of capitalizing on physical or philosophical nuances of ordinary mathematical concepts. The mathematics itself is relatively straightforward; these nuances tend to be more elusive.
Explaining a Puzzle

Here is a little puzzle for the reader. Take three pairs of ordinary Polaroid sunglasses. Begin by aligning two lenses, one behind the other. Look through the pair at an open window and rotate the second lens. This dims the light and when the lenses become crossed at right angles, all the light is blocked. Next, place a lens of the third pair of sunglasses either in front of or behind the initial crossed pair. Rotate it. Nothing changes; all the light is still blocked. Now, sandwich the third lens diagonally in between the first two. As it moves into place, light begins to pass where formerly none did. Figure 1 represents the puzzle. One wit warns farmers with an effectively double-fenced barnyard not to take the precaution of adding a third fence. They risk being overrun by varmints!
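The three observations in the puzzle can be reproduced numerically. The sketch below is a minimal illustration under assumptions the article does not state: the lenses are ideal polarizers, the incoming window light is unpolarized (so the first lens passes half of it), and each later lens passes the fraction cos² of the angle between its axis and the axis of the lens before it (Malus's law, the intensity form of the component resolution used in the classical explanation). The function name and the angle constants are hypothetical.

```python
import math

def transmitted_fraction(axes_deg):
    """Fraction of unpolarized incident intensity passing a stack of ideal polarizers.

    Assumed model: the first polarizer passes one half; each later polarizer
    passes cos^2(angle between its axis and the previous axis).
    """
    if not axes_deg:
        return 1.0
    fraction = 0.5
    for previous, current in zip(axes_deg, axes_deg[1:]):
        fraction *= math.cos(math.radians(current - previous)) ** 2
    return fraction

VERTICAL, DIAGONAL, HORIZONTAL = 90.0, 45.0, 0.0

crossed_pair = transmitted_fraction([VERTICAL, HORIZONTAL])
third_in_front = transmitted_fraction([DIAGONAL, VERTICAL, HORIZONTAL])
third_behind = transmitted_fraction([VERTICAL, HORIZONTAL, DIAGONAL])
sandwiched = transmitted_fraction([VERTICAL, DIAGONAL, HORIZONTAL])

print(crossed_pair, third_in_front, third_behind, sandwiched)
```

Under these assumptions the first three arrangements pass essentially nothing (zero up to floating-point rounding), while the sandwiched arrangement passes about one eighth of the incident light.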
THE MATHEMATICAL INTELLIGENCER VOL. 13, NO. 3 © 1991 Springer-Verlag New York
Figure 1. The Polaroid Puzzle. Two Polaroids effectively block all light. Adding a third lets light through. How? Pioneering investigations of lattice theory by Birkhoff and of Hilbert space by von Neumann led to an explanation.

By the early nineteenth century with the work of Young and Fresnel, physicists began to regard polarization as a property of light waves; they assumed polarization was a preferred direction or alignment of transverse light waves. An example of a mechanical transverse wave is a vibrating rope. Visualize a rope propagating along the z-axis; it can vibrate at any direction in the x-y plane. A light wave propagating in the z-direction and vibrating along the y-axis (or the x-axis) was said to have the vertical (or horizontal) polarization property. Classical physicists had developed a vibrating rope model of polarized light.

Polaroid material was invented in 1929 by Edwin Land. Classically, Polaroids were also assumed to have a preferred direction, their transmitting axis. They were also regarded as filters that transmit all light oriented parallel to their axis and absorb all perpendicular light. To a ropelike light wave, such a Polaroid must look like a picket fence.

According to this classical rope-fence model, the crossed pair of Polaroids blocked all the light because their transmitting axes were perpendicular. If the first was vertical, it transmitted vertical but not horizontal light. The second Polaroid must have been horizontal. Because no horizontal light was incident upon it, the pair passed no light.

But why did the three Polaroids pass light? To answer that question, classical physicists appealed to Maxwell's electromagnetic theory where a transverse light wave has a vector space representation. Conventionally the electric displacement vector E represents the wave's polarization. The set of E vectors forms a linear vector space with an orthonormal basis. The explanation called on the basis vectors: any E vector can be resolved into two mutually orthogonal components. The vertically polarized light transmitted by the first Polaroid can be resolved into a 45° component, called diagonal light, and a 135° component, called slant light. These two components are then incident upon the second, diagonal Polaroid. There the diagonal light passes; the slant light is absorbed. In turn, the second Polaroid's transmitted diagonal light can be resolved into horizontal and vertical light. At the last, horizontal Polaroid the vertical component is absorbed, but the horizontal passes. The nineteenth-century wave theory explains the puzzle. Three Polaroids have done what two could not do. They passed light.

No single quantum explanation of the puzzle commands a consensus among physicists. One, however, carries the credentials of the Cambridge mathematician, physicist, and 1933 Nobel laureate, P. A. M. Dirac. In Chapter One of his The Principles of Quantum Mechanics, Dirac regarded polarization as a property of photons. He began with "an experiment using an incident beam consisting of only a single photon," which he regarded as evidence of "a very general example of the breakdown of classical [wave] mechanics." Specifically, "one observes photons only as particles." Without waves the classical explanation collapses and we are left to ask "How are we to explain these results on a photon basis?" If there are no waves, what is polarization a property of? According to Dirac's quantum model, "one must ascribe a polarization to the photon." However, if polarization is a property of photons, it must be an extraordinary kind of property that will exhibit "anomalous behavior."

Models and State Spaces

In 1936, the year of the Birkhoff-von Neumann paper, Erwin Schrödinger, who was awarded the Nobel Prize in 1933 jointly with Dirac, wrote The Present Situation in Quantum Mechanics. Schrödinger began setting the stage for Birkhoff and von Neumann by contrasting the difference between explaining and predicting. Both classical and quantum physicists have two ways to represent a "natural object." When their aim is to explain (as they were doing in the previous section):

. . . they set up a representation of natural objects based on the experimental data. To show that one does not think this is literally how things go in the real world one calls this thinking-aid an image or model.

When their goal is to predict, physicists use a state space. First, they idealize a "natural object" as a physical system that can be completely characterized by its instantaneous state. Then, they associate this state with an element in the theory's state space. Birkhoff and von Neumann explain how this state space procedure works:

. . . the state space element associated with the physical system at a time $t_0$, together with a prescribed mathematical 'law of propagation,' fix another state space element at any later time t; the assumption evidently embodies the principle of mathematical causation.

A simple classical physical system is a cannonball; its instantaneous state is specified by a 6-tuple of real numbers conventionally representing values for momentum and position. This state is associated with a point in a 6-dimensional Euclidean space, and the mathematical propagation law may be Newton's second law. A photon is a quantum system. A photon's instantaneous state is fully specified by its energy, direction of motion, and polarization. A quantum state is associated with a point in a Hilbert space, which is quantum theory's state space. Schrödinger's equation may be the propagation law.

Next, Schrödinger issued a challenge to every quantum model. A physicist constructs a model by postulating a set of properties that the natural object supposedly possesses. Difficulties arise when this set of postulated properties is compared to the set of measured properties actually provided in the quantum lab. (For lab predictions, a recipe called the Born rule takes the state space representation and gives back the probability of finding upon measurement only some of the system's properties.) The lab's set of measured properties is smaller than the set of a model's postulated properties. Schrödinger's challenge is a consistency constraint. Those additional postulated properties that give a quantum model its explanatory power must not conflict with "any portion of the quantum theoretical assertions."

The stage is now set for Birkhoff and von Neumann to propose a model to explain quantum puzzles. It avoids the extraordinary "anomalous behavior" Dirac encountered. It faces Schrödinger's consistency challenge. And, it too strains our common sense. Its hope comes from pioneering investigations of lattice theory by Birkhoff and of Hilbert space by von Neumann.
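The state space procedure described above for the cannonball can be sketched in a few lines. This is only an illustration under assumptions not made in the article: the only force is uniform gravity along the negative z-axis, the mass is fixed, and the closed-form solution of Newton's second law is used as the law of propagation. The names `State`, `propagate`, and `G_ACCEL` are hypothetical.

```python
from typing import Tuple

# A state is a 6-tuple: position (x, y, z) followed by momentum (px, py, pz).
State = Tuple[float, float, float, float, float, float]

G_ACCEL = 9.81  # assumed uniform gravitational acceleration, in m/s^2

def propagate(state: State, dt: float, mass: float = 1.0) -> State:
    """Law of propagation: the state at time t0 fixes the state at time t0 + dt.

    Assumed force: F = (0, 0, -mass * G_ACCEL), so dp/dt = F and dx/dt = p / mass.
    """
    x, y, z, px, py, pz = state
    return (
        x + px / mass * dt,
        y + py / mass * dt,
        z + pz / mass * dt - 0.5 * G_ACCEL * dt ** 2,
        px,
        py,
        pz - mass * G_ACCEL * dt,
    )

initial: State = (0.0, 0.0, 0.0, 3.0, 0.0, 20.0)

direct = propagate(initial, 3.0)
in_two_steps = propagate(propagate(initial, 1.0), 2.0)

print(direct)
print(in_two_steps)
```

Propagating directly to the later time and propagating through an intermediate time give the same point of the 6-dimensional state space, up to rounding. That is the sense in which the state at one time, together with the law of propagation, fixes the state at any later time.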
Birkhoff's Lattices

Early on in the short history of lattice theory, Birkhoff stressed two distinctions about lattices. Although in mathematical practice it is often convenient to gloss over these distinctions, they provide a perspective that is critical if we are to appreciate the Birkhoff-von Neumann program. In "What is a Lattice?," written in 1943 for the American Mathematical Monthly, Birkhoff sought to "create a general lattice theory containing many special cases." Examples of special cases include "the real numbers, non-negative integers and the subsets of any class." The general lattice theory, he noted, has the advantages of "unity and economy" only because it is distinct from any special case; it is characterized only by "certain common formal properties." To fix Birkhoff's first distinction, we could say that general lattice theory deals with (families of) isomorphism classes, each of which contains various special cases (members of one family differ only by their rank). When doing general lattice theory, a mathematician focuses on an isomorphism class that can be fully defined by its structure or its "formal properties." But to study a special case, a mathematician must be concerned with (i) the structure of its isomorphism class, (ii) its particular set of members, and (iii) the operators defined on those members.

[Cartoon. Reprinted with permission from National Review.]

In a review article, "Lattices in applied mathematics," presented to the American Mathematical Society in 1961, Birkhoff stressed his second distinction. It distinguished among special cases: some belong to "pure mathematics" while others are found in "applied mathematics (and mathematical physics)." Examples of the latter include "Reynolds operators and ergodic theory which have special physical applications to fluid turbulence and stochastic processes." Special cases from pure mathematics include those he cited in 1943. Of course his second distinction is relative to a vantage point. A mathematician may study a special case as a pure case with no interest in its application while a physicist might use or apply that same special case. This distinction will be critical here because the mathematician is viewing the special case as defined on one set while the physicist is seeing it as defined on another.

Recall that a lattice is a partially ordered set with a unique meet and join defined on all pairs of its members. Let S represent a set with members A, B, C, . . . , and ≤ the partial order (a reflexive, antisymmetric, transitive relation on S). Every finite lattice contains two special elements 0 and I. For any A, 0 ≤
.
Birkhoff's perspective, the collection of such day-to-day properties is an example of a lattice. Therefore, to define the lattice, we must specify:
(i) Its set. $S^P$ is the set of our everyday classical properties. Members of $S^P$ will be represented by X, Y, Z, . . . .
(ii) Its operators. For example, OR is an operator defined on the members of $S^P$.
(iii) Its purely formal lattice properties. OR has the formal properties of a distributive ortholattice join.
The dual of OR is AND. The general orthocomplement operator on properties is NOT. Our first example of a Boolean lattice is the classical property lattice: CPL = < $S^P$, .
The point of departure of the classical proposal is the critical viewpoint that CPL captures the structure
Birkhoff and von Neumann stress that mathematicians often study just the purely formal Euclidean space, but when they do they are not studying a state space. Euclidean space became a state space because it could be "imbued with reality." The pure space became "imbued," Birkhoff and von Neumann emphasize, only because it could be put into a "one-one correspondence" with an "observation space." They define an observation space as the set of all of a physical system's possible measurement and prediction results or properties; for our one-particle system, this is the set of all the possible 6-tuples of position and momentum values. This correspondence is possible because it "preserves inclusion between subsets of the state space and subsets of the observation space." The idea is that the state space gets the job done only because it has the same lattice structure as the classical observation space. However, the observation space lattice is just that privileged sublattice of CPL that classical physicists rely upon. The critical viewpoint this time is that the state space works because our world happens to be one represented by CPL. We could say that both our ordinary Boolean propositional logic and the Boolean classical state space have proven useful because this world of ours is Boolean. Figure 2 depicts the classical half of the Birkhoff-von Neumann proposal.

Figure 2. A Boolean property lattice represents the world we observe. Our propositional logic and the state space of classical physics 'work for us' in this world only because they share the same Boolean lattice structure as the property lattice. X represents a property such as 'whiteness' or 'a wave's polarization angle θ'; x represents a proposition such as "This is white." or "The wave has polarization angle θ."; and X represents a state space element. The second line represents the trivial property that all objects possess, a tautology, and the whole space. On the last line is the lattice orthocomplement of the second line: an absurd property, a contradiction, and the imaginary element.

Von Neumann's Hilbert Space Argument
After he presented his first work on Hilbert space in 1927, von Neumann devoted his efforts to such specific problems as the foundations of quantum statistics and quantum thermodynamics, the quantum measurement problem, and the possibility of using so-called hidden-variables to restore "causality" to the theory. This work led to his 1932 Grundlagen axiomatization of the theory. Its axioms included the assumptions that the state space elements that represent the states of a quantum physical system are points of a Hilbert space and that the measurable quantities of a system are represented by Hermitian (generally unbounded) operators densely defined in that space. (For our present purposes, the Hilbert space need only be finite dimensional with a dimension greater than or equal to three. Even this Hilbert space is physically rich enough to represent those quantum states that cause the theory's problem of interpretation.)

A six-page subsection of the Grundlagen, "Projections as Propositions," presented an ingenious argument for a bold conclusion that four years later would underlie the Birkhoff-von Neumann collaboration. The argument unfolds in three steps.

Von Neumann begins by drawing our attention to two "concepts that are important objects of physics." The first is "the properties of the states of a physical system." Examples of such properties, as we have seen, are "that a certain quantity takes a particular value," for example "that polarizations take angle θ." For classical physics, we saw that those properties made up the classical observation space, a sublattice of CPL whose properties we represented by X, Y, Z, . . . . Von Neumann now focuses on the corresponding set of properties he regards as important for quantum physics. We shall represent these quantum properties as M, N, O, . . . . The second important concept is that of a two-valued measurement: "to each property we can assign a measurement which distinguishes between the presence or absence of a property." These measurements "take on only the values 0 and 1." They give 1 if a system possesses a certain property, 0 otherwise. These two concepts are related: the properties are "characterized by the same behavior" as their measurements. They too can "take on only the values 0, 1."

The first step of the argument has identified a set of two-valued quantum properties and a corresponding set of two-valued quantum measurements. The second step now establishes that both of these sets can be put into several significant one-to-one correspondences. The first of these correspondences is sanctioned by one of von Neumann's axioms mentioned above: a system's physical quantities correspond to a set of Hermitian operators defined in the Hilbert space. Because both sets of the first step were two-valued, they both correspond to the subset of the Hermitian operators that are idempotent ($E = E^2$). Each of these idempotent operators is an orthogonal projection operator on the Hilbert space; each projection corresponds to the subspace consisting of its range. Step two has put both the sets of properties and measurements from step one in one-to-one correspondences with the sets of idempotent Hermitian operators, projections and closed subspaces of the Hilbert space.

In his third and critical step, von Neumann proposes that from (certain) properties M, N we can form the additional properties 'M and N' and 'M or N', which are "characteristic for quantum mechanics." Let us elaborate our notation. Classically, from X, Y we formed the property X OR Y. Now, from two simultaneously measurable quantum properties, M, N we will form the quantum property M qOR N. Von Neumann explicitly tells us how to do that. We extend the correspondences established in step two: the property M qOR N corresponds to "its [idempotent Hermitian] operator $E + F - EF$," which is also a projection on the Hilbert space. In addition, von Neumann continues, other quantum properties we shall represent here as M qAND N and qNOT M are formed by similar correspondences. The result of this process, von Neumann concludes, is that

. . . the relation between the properties of a physical system on the one hand and the projections on the other makes possible a sort of . . . calculus . . . which is characteristic for quantum mechanics.

Four years later, having shifted from projections to subspaces, the Birkhoff-von Neumann collaboration would begin with a similar and equally bold statement:

Our main conclusion, based on admittedly heuristic arguments, is that one can reasonably expect to find a calculus . . . which is formally indistinguishable from the calculus of linear subspaces with respect to set products, linear sums, and orthogonal complements, and resembles the usual calculus . . . with respect to and, or and not.
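The correspondence between M qOR N and the operator E + F - EF can be checked on small matrices. The sketch below is an illustration under assumptions not spelled out in the article: the Hilbert space is real and three-dimensional, "simultaneously measurable" is read as "the projections commute," and qAND and qNOT are taken to correspond to EF and 1 - E, the usual choices for commuting projections. All function names are hypothetical, and exact rational arithmetic is used so that equality tests are reliable.

```python
from fractions import Fraction

def diag(*entries):
    """Square diagonal matrix with the given entries."""
    n = len(entries)
    return [[Fraction(entries[i]) if i == j else Fraction(0) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def transpose(a):
    return [list(column) for column in zip(*a)]

def is_projection(a):
    """Orthogonal projection: idempotent (a*a == a) and symmetric."""
    return matmul(a, a) == a and transpose(a) == a

def commute(a, b):
    return matmul(a, b) == matmul(b, a)

IDENTITY = diag(1, 1, 1)

def q_or(e, f):
    return sub(add(e, f), matmul(e, f))   # E + F - EF

def q_and(e, f):
    return matmul(e, f)                   # EF (assumed correspondence)

def q_not(e):
    return sub(IDENTITY, e)               # 1 - E (assumed correspondence)

# Two commuting projections: onto the x-y plane and onto the y-z plane.
E = diag(1, 1, 0)
F = diag(0, 1, 1)

# Two non-commuting projections: onto the x-axis and onto the line x = y.
half = Fraction(1, 2)
X_AXIS = diag(1, 0, 0)
DIAGONAL = [[half, half, Fraction(0)],
            [half, half, Fraction(0)],
            [Fraction(0), Fraction(0), Fraction(0)]]

print(commute(E, F), is_projection(q_or(E, F)))
print(commute(X_AXIS, DIAGONAL), is_projection(q_or(X_AXIS, DIAGONAL)))
```

For the commuting pair, E + F - EF is again a projection (here onto the whole space, the span of the two planes), EF projects onto their intersection (the y-axis), and 1 - E projects onto the orthogonal complement. For the non-commuting pair the same formula no longer yields a projection, which suggests why the construction is restricted to simultaneously measurable properties.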
The Birkhoff-von Neumann Quantum Proposal
Von Neumann's insight was that the shift from the classical Euclidean state space to the quantum Hilbert state space had two dramatic consequences. First, the subspaces of Hilbert space correspond to q u a n t u m properties and measurements. And, those subspaces have a "calculus" of their own. Birkhoff's contribution was to identify these "quantum calculi" as non-Boolean lattices. As the collaboration now puts it, pure Hilbert space became "imbued with reality" and can give accurate predictions only because the non-Boolean lattice of its subspaces corresponds to non-Boolean lattices of quantum properties and propositions. The boldness of this proposal emerges when we recall that Euclidean space became "imbued" because the Boolean lattice of its subsets corresponds to the Boolean lattice of subsets of the classical observation space. The main idea of the quantum half of the proposal is t h a t this shift from classical s u b s e t s to q u a n t u m subspaces requires that the Boolean state space, property, a n d propositional lattices of the classical proposal must be replaced with non-Boolean quantum counterparts. In sum, Boole's property lattice and our ordinary proposition logic are commonsense jetsam if we h o p e to explain the q u a n t u m puzzles. However, for Birkhoff and von N e u m a n n to pull this off they must stand the classical lattice relationship on its head. Classically, the property lattice was fundamental. Finding ourselves in a world of Boolean properties, the story went, we used a Boolean state space and propositional logic. However, the upshot of
the quantum interpretation problem is that we do not know what sort of properties inhabit the quantum world. The Birkhoff-von Neumann response is to begin with the non-Boolean state space lattice and then to use their "admittedly heuristic arguments" to conclude that the lattices of quantum properties and propositions might be non-Boolean as well. Figure 3 displays the strategy of the Birkhoff-von Neumann quantum proposal using a notation intended to emphasize that these new lattices are the quantum non-Boolean counterparts of the three classical Boolean lattices. The fundamental lattice is now the quantum state space lattice. The (closed linear) subspaces of Hilbert space form a non-Boolean lattice. Let $S_H$ represent the set of subspaces with members M, N, O, .... The lattice meet is subspace intersection; the join is the closed linear span, denoted $M \vee N$. The subspace orthogonal complement is the lattice orthocomplement, giving the quantum state space lattice: $\mathrm{QSL} = \langle S_H, \le, \cap, \vee, \perp \rangle$. According to Birkhoff and von Neumann, the accurate predictions associated with QSL suggest that the lattice of quantum properties has the same lattice structure as QSL. Let $S_P$ represent the set of quantum properties with members M, N, O, .... The lattice join operator defined on these properties is qOR. Adding the meet and orthocomplement operators we get the quantum property lattice:
Figure 3. The fundamental lattice is the non-Boolean lattice of subspaces of Hilbert space, the quantum state space. "Admittedly heuristic arguments" suggest that properties such as 'a photon's polarization angle θ' and propositions referring to such properties have a structure "which is formally indistinguishable from" this subspace lattice. Quantum properties and propositions combine in a non-Boolean way which is "characteristic for quantum mechanics" and thereby provide explanations of quantum theory's puzzles.
THE MATHEMATICAL INTELLIGENCER VOL. 13, NO. 3, 1991
$\mathrm{QPL} = \langle S_P, \le, \mathrm{qAND}, \mathrm{qOR}, \mathrm{qNOT} \rangle$.
Birkhoff and von Neumann also suggest that either
QSL or QPL, or both, require a revision of our classical propositional logic. In the classical logic lattice, 'or' was an operator defined on classical propositions. In the proposed quantum revision, operators such as 'qor' are defined on quantum propositions. The resulting quantum propositional connectives would be distinctively non-classical. For example, when quantum propositions such as 'm qor n' are formed, the process cannot be truth functional. In addition, quantum logic itself cannot be bivalent. The lattice reflecting the proposed quantum revision of our classical propositional logic is the quantum logical lattice:
$\mathrm{QLL} = \langle S_S, \le, \mathrm{qand}, \mathrm{qor}, \mathrm{qnot} \rangle$.
Figures 2 and 3 present a tautology and a contradiction from both classical and quantum logic. The payoff of the Birkhoff-von Neumann proposal is that properties of QPL are ordinary (unlike Dirac's) but they combine with each other in extraordinary ways (unlike Boole's). They 'obey' the 'laws' of QSL. For example, the quantum property M qOR N has the formal properties of the subspace $M \vee N$ just as the classical property X OR Y has the structural features of the subset $X \cup Y$. More graphically, the quantum property formed using qOR is visualized with the algebraic sum in a model of the subspaces of $\mathbb{R}^2$, while the classical OR property is depicted by the union in a Venn diagram. Advocates of such quantum property models have attempted to cash in on properties formed with qOR to explain the principle of superposition, an idea at the heart of quantum theory. According to Dirac, this idea requires the existence of "properties which are in some vague way indeterminant." Schrödinger told us something similar: "quantum mechanics has and uses the ψ-function [the state space representation of a system]
Figure 4. The blurred or fuzzy model of the hydrogen atom often used in chemistry textbooks is consistent with quantum theory until its caption is added.
to image the blurring of all variables just as clearly and faithfully as the classical model does its sharp numerical values." (Emphasis in both originals!) Schrödinger suggested another non-classical model when he added that the "ψ-function has provided quite intuitive and convenient ideas, for instance the 'cloud of negative electricity' around the nucleus."
Apparently it still does. Today, introductory chemistry texts propose: "Consistent with the ψ-function one can visualize an electron as forming a fuzzy cloud charge around the nucleus." Even more vividly: "electrons are considered to be a spherically symmetrical smear of negative charge surrounding the nucleus." The diagram in Figure 4 is typical. Proponents of such models try to use blurred clouds to explain quantum puzzles.
But Are the Blurred Properties Consistent?
Versions of a theorem of Kochen and Specker have been used to make significant headway on this question by imposing specific consistency constraints on blurred models. Define a QSL-structure as a lattice of subspaces of a Hilbert space of dimension at least 3. L(2) is the Boolean lattice defined on the set {0,1}. The theorem exhibits a QSL-structure L which does correspond to physical observables but which admits no homomorphism onto L(2). Every lattice homomorphism must by definition preserve $\wedge$ and $\vee$. A map h of this L onto L(2) which preserves $\wedge$ (and consequently also preserves $\le$) must therefore fail to satisfy
$h(M \vee N) = h(M) \vee h(N)$ for some $M, N \in L$.
This theorem challenges the third step of von Neumann's argument (above), which proposed that two-valued measurements or properties had the structure of a linear subspace. A related inconsistency confronts the classical physicists who advanced a model of a rope that can be resolved into its orthogonal components. The theorem ensures that if quantum properties are blurred and measurements give definite values, then no measurement by a quantum physicist can reveal those properties! (This is a version of the notorious Schrödinger cat paradox.) Finally, the theorem shows that the chemist's model becomes inconsistent when an incautious text adds (as in Figure 4) "the electron is likely to be somewhere in that region," although "we have a hard time saying exactly where." A blurred
model may be useful. But we must heed Dirac's warning that such a model "cannot be conceived on classical ideas." Just keep in mind that if the electron has a location, there is no location it can have. Niels Bohr reportedly said that anyone who is not shocked by quantum theory has not understood it. I have attempted to offer a shocking tale. If its genre is mathematical drama, the protagonist is the lattice join. You are invited to enter in and decide if the dramatic events have been a tragic confrontation in which the superior lattice homomorphism theorem heaped retribution upon the hubris of the protagonist and destroyed its character.
Acknowledgment: It is a pleasure to acknowledge commentary on previous drafts from Ken Smith, Bas van Fraassen, Mary Wardrop, and Linda Wessels.
References
The Birkhoff-von Neumann paper and the Kochen and Specker theorem are reprinted in Hooker's first volume. My [9] contains a proof sketch of the lattice formulation of the theorem. Schrödinger's article (and Cat Paradox) is reprinted by Wheeler and Zurek, an anthology devoted to the problem of interpretation. Stairs vigorously defends an indeterminate model. Stachel (in Colondy [4]) and Jammer extensively review the Birkhoff-von Neumann program; Stachel adds incisive criticism. Herbert's account of the Polaroid puzzle and blurred model is a handy popularization.
1. G. Birkhoff, What is a lattice? The American Mathematical Monthly 50 (1943), 484-487.
2. G. Birkhoff, Lattices in applied mathematics, American Mathematical Society Proceedings of Symposia in Pure Mathematics Vol. II, Providence: The American Mathematical Society (1961).
3. G. Boole, The Laws of Thought, New York: Dover (1854).
4. R. B. Colondy, From Quarks to Quasars, Pittsburgh: University of Pittsburgh Press (1986).
5. P. A. M. Dirac, The Principles of Quantum Mechanics, Oxford: Clarendon (1947).
6. N. Herbert, Quantum Reality, New York: Doubleday (1985).
7. C. A. Hooker (ed.), The Logico-Algebraic Approach to Quantum Mechanics Vol. I and Vol. II, Boston: Reidel (1975 and 1979).
8. M. Jammer, The Philosophy of Quantum Mechanics, New York: Wiley (1974).
9. J. H. McGrath, Quantum disjunctive facts, Philosophy of Science Association Proceedings 1 (1986), 76-86.
10. A. Stairs, Quantum logic, realism, and value definiteness, Philosophy of Science 50 (1983), 422-436.
11. J. von Neumann, Mathematische Grundlagen der Quantenmechanik, Berlin: Springer-Verlag (1932). (Translated by R. Beyer (1955) as Mathematical Foundations of Quantum Mechanics, Princeton.)
12. J. A. Wheeler and W. H. Zurek, Quantum Theory and Measurement, Princeton: Princeton University Press (1983).
Department of Philosophy Central Michigan University Mt. Pleasant, MI 48859 USA
Chandler Davis*
Mathematics and the Unexpected by Ivar Ekeland University of Chicago Press, Chicago, 1988, xiii + 139 pp.; US$19.95, paper $8.95
The Problems of Mathematics by Ian Stewart Oxford University Press, Oxford, 1987, ix + 257 pp.
Reviewed by Cathleen S. Morawetz
For aficionados of popular science with a mathematical flavor these have been banner years. For those, and there are many, who are willing to settle for the vaguest of explanations as long as there is no mathematics, there is Stephen Hawking's A Brief History of Time. For the somewhat more mathematically knowledgeable there is Roger Penrose's The Emperor's New Mind. But this reviewer has not read either of these, having got stuck early on the vague terms of Hawking. But here at hand we have two wonderfully readable books written for the educated layman, by whom I believe the authors would have meant a person with a modest achievement in undergraduate mathematics, perhaps even a long time ago. The books, Mathematics and the Unexpected by Ivar Ekeland and The Problems of Mathematics by Ian Stewart, appeal in quite different ways. For sheer delight I recommend Ekeland's little book (the author's own translation of his 1984 book in French). How nice to read without stopping, except to think, a book that is both scientific and philosophical in the best sense. Gently we are led to grasp exactly why the world is not so deterministic as recent education has implied. It doesn't mean that if you throw the book at the indolent student you won't hit him. Gross quantities, if not too gross, still follow the rules we
* Column editor's address: Mathematics Department, University of Toronto, Toronto, Ontario M5S 1A1 Canada
understand. From second-year calculus we know that the same set of initial conditions for a differential equation will give you the same answer. But what is the same? If the conditions are triflingly different, the long-range outcome may put you, for example, in one galaxy or another. Or your brain may end up with one view or another. Ekeland delicately opens up these philosophical questions and skirts with even more delicacy the obvious questions about free will. He concludes that the eternal search for a unifying explanation of nature will forever be like dipping a bucket into a flowing river and studying its contents to find out about the river's flow. After a delightful analysis (no equations, lots of pictures and history) of Kepler's contribution to astronomy, Ekeland moves to Newtonian mechanics and describes why its apparent ability to predict forever forward had such a profound effect on everyone's thinking. As Ekeland puts it, the astronomer's war cry became (and is still heard), "Give me but pencil and paper and I shall reconstruct the world." Lovely quotes from Laplace as well as the failure to find the planet Vulcan remind us of the vainglories of predicting too much. The beginning of the analysis of prediction for ordinary differential equations (or read Newtonian mechanics) came with Poincaré's analysis of the perturbation problem, or not-quite-periodic motions. More pictures give us the main thread of Poincaré's thinking. His idea was to study a certain map. Imagine an arrow orbiting close to periodically. When it has travelled around approximately one period, where is it? The way to look at this is to put a plane orthogonal to the trajectory of the arrow at some arbitrary starting point and see where the arrow hits it when it returns. That's a map taking $A_0$ (the original point) into $A_1$. Then continue thousands of times with the arrow orbiting round and round, and look at all the images, thousands of them.
(We look always in the same plane and forget about the trajectory). The description of the patterns made by these images, which often almost fill out curves, constitutes the study of chaos, and we are provided with lots more lovely pictures illustrating in
THE MATHEMATICAL INTELLIGENCER VOL. 13, NO. 3 © 1991 Springer-Verlag New York
particular the duplication of patterns on different scales. Having taught us a little more of chaos and its appearance in nature, the author moves on to other more paradoxical maps (Arnold's cat, Bernoulli shift, Smale's horseshoe), and we end with a little about Lorenz attractors. The exotic patterns obtained cannot be predicted, and they are of only qualitative help in making predictions. So much for the long-range results of Newtonian or any other mechanics. It's a natural move next to examine catastrophe theory, which despite its name is concerned with steady states. Most systems contain dissipation, or the stuff that does not dissipate will move out of the region you are examining. So the patterns will ultimately become steady. There are lots of variables around and some of them are distinguished as parameters. One asks about sharp differences in the patterns of the other variables as the parameters change slowly. It is the sharpness of the difference that constitutes "Catastrophe." Ekeland provides some neat examples from which an earnest reader can more or less get the exact message. He also points out where these ideas are useful, their limitations due to dimensions and numbers of parameters, and the mathematical beauty of the pattern outcome. Pictures galore. Ekeland winds up with a review of where we stand now, coupled with lots of literary references to tie us all in with the perpetual philosophical problems of mankind. My classical education was not up to a judgment about the Odyssey, but it's all fascinating. For the mathematically handy, two appendices supplement the investigation of maps with some more of Poincaré's ideas and a description of the recent beautiful results of Feigenbaum. Turning to Stewart's book, we find a good read, well written and digestible for the moderately sophisticated layman. Its racy style, interspersing bits of history with bits of mathematical information, is very attractive.
The situation in the neighborhood of a closed trajectory.
The scientist as hermit lost in contemplation of a small slice of reality (pp. 121-123 of Mathematics and the Unexpected). Hieronymus Bosch, c. 1500, The Temptations of St. Anthony, central panel, Madrid, Prado. Photo: Giraudon/Art Resource, New York.
Stewart is also of the "other kind." If one regards people as geometrically or numerically oriented, then Ekeland is geometric and Stewart is numeric in approach. Stewart devotes a rather large proportion of his book to number theory and other classical topics. It's natural to do this because, as almost all popular mathematics writers find, the classical problems in those areas are much easier to explain to the layman and lead naturally to current work. Thus Stewart uses this opportunity well to entice readers with bits of the new cryptographic applications of number theory. He gets into deeper water with the Korteweg-de Vries equation where he cannot quite follow through. There are mistakes in attribution: the famous result on the infinitely many conservation laws and their intriguing and fruitful relation to the Schrödinger equation is due to Kruskal, Miura, Gardner, and Greene, and not to Peter Lax, who made other profound contributions. I have long regarded the result of K.M.G.G. as a milestone that will survive us all, and it ought to be known to history by its real parents. Turbulence also gets short shrift. It's all a matter of taste, but I prefer the fundamental tie to the nature of the world around us, particularly its mysteries that are clearly tied to mathematics. Perhaps real turbulence is still too difficult for such a book. But the chapters on computability and probability were quite instructive for me. Stewart tries to differentiate between mathematics and other sciences on the ground of the permanence of theorems. It's a nice idea but one should not poke too hard at it. The "proofs" of ancient times, and even of as short a time ago as the last century, no longer look so rigorous. Results and connections are more permanent, and other sciences are relatively speaking Johnny-come-latelies. Parts of the book are skippable. Readers would be well advised to begin on page 4 and to quit on page 221. The beginning of the book is a long caveat that in the interests of popularization the author cannot tell all (who expects it anyway?). The caveat takes the form of an interview between the Mathematician and an Android, reminiscent of the style of the dialogues in the popular mathematical writings of J. L. Synge. This may account for the somewhat dumb Android having the Irish name, Seamus, an otherwise gratuitous jibe. The suggestion to quit on page 221 is in order to avoid a discussion of pure mathematics vs. applied mathematics. Practitioners of both arms of mathematics have a multitude of thoughts and approaches, and it is just nonsense to say, as Stewart does with apparent approval, that some think "the applied one is so keen to get an answer that he doesn't worry much whether it is the right answer." The great applied mathematicians, people like G. I. Taylor or J. B. Keller, are interested in exploring in depth with all possible tools of mathematics some phenomenon before them, just as the great, let's say, algebraic geometers explore in depth the mathematical phenomenon that is before them.
The principle is the principle of being the inquiring scientist and that is all. Aside from all that highfalutin talk, there have been, still are, and will be people like von Neumann or Wiener whom everybody wants to claim for their side if there has to be a distinction made. That, I think, leads me to the notion that the minefield of the chapter title that Stewart is trying to cover does not exist. Otherwise from p. 4 to p. 220 you have a book chock-full of insight and education in the main thrusts of mathematics. Here and there problems surface because page x does not quite tally with page y, but all in all this is, as I said, a good read and of just the right length. It conveys its lesson and quits.
Courant Institute of Mathematical Sciences New York University New York, NY 10012 USA
Robin Wilson*
Mathematics in the Seventeenth Century
Several seventeenth-century mathematicians have already been featured in this column, including John Napier, Johannes Kepler, and Isaac Newton. Here we present three more: Torricelli, Leibniz, and Halley.
Evangelista Torricelli (1608-1647) is primarily remembered as the inventor of the mercury barometer and the discoverer of atmospheric pressure. In his early years he worked on the mechanics of falling bodies, which led to his being appointed as assistant to Galileo. Later he contributed to various geometrical problems, determining the length of an arc of a logarithmic spiral and finding a point inside a triangle such that the sum of the distances from the point to the vertices of the triangle is a minimum. He was one of the precursors of the calculus, using the method of indivisibles to find areas under curves such as the cycloid. This Soviet stamp was issued in 1959 to commemorate the 350th anniversary of his birth.

Gottfried Wilhelm Leibniz (1646-1716) was a towering intellectual figure who wrote on law, history, theology, philosophy, and mathematics. His aim of building up the whole of knowledge from a few basic principles led to plans for a universal language for mathematical logic and the construction of a number of calculating machines that could add, subtract, multiply, divide and find square roots. Although Newton had priority for the invention of the calculus, Leibniz published his results first. Leibniz's calculus notation proved to be more versatile than Newton's and is still in use after three hundred years. This German Europa stamp was issued in 1980.

Edmond Halley (1656-1742) is mainly remembered for the comet whose orbit he calculated and with which his name is now associated. While still a student, he sailed to St. Helena and prepared the first accurate catalogue of the stars in the southern sky. In the 1680s Halley persuaded Isaac Newton to develop his researches on gravitation and publish them in Principia Mathematica. Later he succeeded John Wallis as Savilian Professor of Geometry in Oxford and John Flamsteed as Astronomer Royal. These stamps were issued in 1986 to commemorate the re-appearance of Halley's comet.
* Column editor's address: Faculty of Mathematics, The Open University, Milton Keynes MK7 6AA England