Letters to the Editor
The Mathematical Intelligencer encourages comments about the material in this issue. Letters to the editor should be sent to the editor-in-chief, Chandler Davis.
One-Way Link
Leonard Gillman, in his review of Emblems of Mind, alludes to what I always think of as the one-way correlation between mathematics and music--that while mathematicians often have an affinity for music, musicians are much less apt to have one for mathematics. He says this is "presumably because although a person with no understanding of music can nevertheless enjoy a musical performance, it is unlikely that anyone can curl up with a mathematics book and enjoy it without understanding it."
My own explanation is quite different. It is simply that if one has an aptitude for both music and mathematics, the practical consideration of having to make a living will often dictate the direction one takes.
Jacob E. Goodman
Department of Mathematics
City College, CUNY
New York, NY 10031
USA
e-mail:
[email protected]
© 1999 Springer-Verlag New York, Volume 21, Number 1, 1999
Opinion
On Blindness Lemme B. Bourbaki
The Opinion column offers mathematicians the opportunity to write about any issue of interest to the international mathematical community. Disagreement and controversy are welcome. The views and opinions expressed here, however, are exclusively those of the author, and neither the publisher nor the editor-in-chief endorses or accepts responsibility for them. An Opinion should be submitted to the editor-in-chief, Chandler Davis.
Wine, when it is judged at competitions, is judged blindly. Musicians auditioning for coveted orchestra spots are judged blindly. The old Coke-versus-Pepsi taste test was blind. Even love is blind. But, alas, mathematics manuscripts, when judged for acceptance by journals, are subject to a bias: the author's name and school affiliation are attached to the submitted manuscript for no apparent reason other than to influence the referee. Some referees blanch at the idea of blind refereeing. Their usual defense of this position--author name and institution affiliation somehow help the referee render a judgment about the quality of the manuscript--at once undermines their stance, by acknowledging precisely that this information does influence the referee. But this is not surprising. Reading a mathematics manuscript carefully is demanding and time-consuming. The typical referee must work hard to find the time--between writing his own papers, teaching his classes, serving on committees, ad nauseam--just to glad-hand all of the manuscripts that editors send his way, to say nothing of actually reading them thoroughly and carefully. The temptation to cut corners is overwhelming. Imagine the typical time-pinched referee with two manuscripts on his desk, one from, say, Alotta Reputation at The Hugh G. Goes University of the Rather Impressive, the other from Joe Feeblepuss at Southeast State University of Agronomy. The poor referee has little time to devote to these manuscripts. The editor has reminded him (twice!) that the reports on both manuscripts are long overdue. Is it a stretch to imagine this frazzled referee will be inclined to give a pro forma scanning of the A. Reputation manuscript before rubber-stamping a positive review, while reserving the full powers of his mordant scrutiny for the feckless Feeblepuss manuscript? Or, perhaps, also give a
pro forma scanning of Feeblepuss's manuscript before rubber-stamping a negative review? ("The mathematics in this manuscript seems to be correct, but it is probably too specialized to be of interest to a wide audience.") The occasional delusional referee may seduce himself into thinking that he can judge manuscripts this way without bias (yeah, right, and lobbyists' money doesn't influence politicians); the rest of us know that, of course, referees are influenced by the manuscript author's name and school affiliation. Now it's true that if feckless Feeblepuss submits a rare manuscript of exceptionally high, award-winning quality, it will probably be accepted by a reputable journal. Eventually. But not many mathematicians, not even "big name" mathematicians, often do award-winning research. Aside from a tiny handful of mathematicians from each epoch, most of us do mostly good competent work that is not particularly monumental. Most of us are worker bees, quietly going about our business of filling in small gaps in the theory, making computations that support or refute conjectures, etc. There are only a few queens. But the publishing process is blind to this fact. It treats far too many workers like queens, and treats the rest of the workers cavalierly. It doesn't even get the partition--into queens and workers--right: Galois's manuscript was rejected by a referee suitably unimpressed with the young Frenchman's name and pedigree. The American Mathematical Society experimentally used blind refereeing for its Proceedings once for a year--the experiment was discontinued [2]--but other fields often use blind refereeing, e.g., The Journal of the History of Ideas asks its authors to omit their identity from submitted manuscripts [4]. Not only is this obviously fairer (Shaughnessy [8] notes that "some papers are published because of the reputation of the authors or institutions; editors or reviewers let inferior papers 'slide' if they are submitted from a prestigious researcher or institution"; see also [1] and [6]), it also improves the quality of the papers that are published. For instance, in their landmark study, Peters and Ceci [7] evaluated 12 psychology journals that used nonblind review by resubmitting manuscripts that had previously been published in the same journal two years before, changing only the names of the authors and their institutions. Only 2 out of 16 reviewers felt that previously published but unrecognized papers were suitable for publication. Witness also the conclusions of Fisher, et al. [3]: "Blinded reviewers and editors in this study, but not nonblinded reviewers, gave better scores to authors with more previous articles. These results suggest that blinded reviewers may provide more unbiased reviews and that nonblinded reviewers may be affected by various types of bias." And finally, consider the results of Labland's and Piette's massive study [5]: "Articles published in journals using blinded peer review were cited significantly more than articles published in journals using nonblinded peer review ... Journals using nonblinded peer review publish a larger fraction of papers that should not have been published than do journals using blinded peer review. When reviewers
know the identity of the author(s) of an article, they are able to (and evidently do) substitute particularistic criteria for universalistic criteria in their evaluative process." Herewith then, a modest proposal to realign the mathematical manuscript submission ritual with both fairness and excellence:
1. Author selects journal and sends manuscript to editor.
2. Editor forwards manuscript, sans author's name and school affiliation, to referee.
3. Referee carefully reviews manuscript and sends recommendation to editor.
4. Editor uses referee's report to inform his decision about whether or not to accept manuscript.
I suspect, though, that mathematics manuscripts will be reviewed with flagrant bias for some time to come. The people who have the power to improve the process--journal editors--are themselves "name-recognizable" and among those who have the most to lose by making the process fair and increasing the quality of their journals. I imagine it would be difficult for them to relinquish their prerogative to exercise their own shallow bias. When Oedipus, King of Thebes, found out he'd married his mother, and
(probably) killed his father, the only logical action for this "blind" man to take was to gouge his eyes out. I'm not suggesting that the editors of mathematics journals, blind though they are to their own bias (and its concomitant advocacy for less than the best papers in the pages of their journals), gouge their own eyes out. I am suggesting that the rest of us help them see--remove their blindness--rendering self-mutilation unnecessary. Towards that end, and in summary, a simple argument:
if attaching the author's name and school affiliation to the manuscript influences the referee, this is obviously unfair bias and should be avoided;
if attaching the author's name and school affiliation to the manuscript does not influence the referee, then there should be no objection to removing them.
REFERENCES
1. Banner, J.M., Preserving the integrity of peer-review, Scholarly Publishing 19 (1988), no. 2, 109-115.
2. Notices of the American Mathematical Society 26 (1979), 119.
3. Fisher, M., et al., The effects of blinding on acceptance of research papers by peer review, Journal of the American Medical Society 272 (1994), no. 2, 143-146.
4. Journal of the History of Ideas 58 (1997), no. 1.
5. Labland, D.N., and Piette, M.J., A citation analysis of the impact of blinded peer review, Journal of the American Medical Society 272 (1994), no. 2, 147-151.
6. McGiffert, M., Is justice blind? An inquiry into peer-review, Scholarly Publishing 20 (1988), no. 1, 43-48.
7. Peters, D.P., and Ceci, S.J., Peer-review practices of psychological journals: the fate of published articles, submitted again, Behav. Brain 5 (1982), 187-195.
8. Shaughnessy, A.F., Comment; Blind peer review of journal articles, Drug Intelligence and Clinical Pharmacy 22 (1988), no. 12, 1006.
Lemme B. Bourbaki
Southeast State University of Agronomy
PAULUS GERDES
Molecular Modeling of Fullerenes with Hexastrips*
Recently [1] Cuccia, Lennox, and Ow showed how origami, the ancient Japanese art of paper folding, can be used for the modeling of fullerenes. They chose modular origami, wherein simple modules are interlocked to form larger and more elaborate structures. In this paper another, and relatively easy, way will
be presented to build models of fullerenes and related molecules using hexastrips (Fig. 1). It will be shown that these hexastrip models correspond to a particularly stabilizing Kekulé structure which may render them useful in narrowing down the search for possible fullerene isomers. Hexastrips were introduced by the author in the early 1980s when he was exploring possibilities of incorporating a hexagonal basket weaving technique into the teaching of geometry in Mozambique [2]. In the north of Mozambique, Makhuwa craftsmen weave their light transportation baskets (litenga) and their fish traps (lema) with a pattern of regular hexagonal holes (Fig. 2). The strands are woven over-and-under in three directions leading to a very stable fabric. This structure constitutes a model for a layer of graphite: Imagine the carbon atoms arranged at the vertices of the hexagonal holes; the edges of these holes represent single bonds between the
Figure 1. Hexastrip. The dotted line segments indicate the folds of the cardboard paper.
*Reprinted with permission from The Chemical Intelligencer Vol. 4 (1).
Figure 2. Plane part of a hexagonally woven basket.
Figure 4. Pentagonal hole surrounded by hexagonal holes.
Figure 3. (a) Pattern of hexagonal holes; (b) Model for a layer of graphite.
carbon atoms, and the crossings of two strands between two neighboring vertices of two neighboring hexagonal holes represent the double bonds (Fig. 3). The same hexagonal basket-weaving technique has been used in several other regions of Africa and the world [3]. In Madagascar fish traps and transport baskets are made using it. In Kenya it is used for making cooking plates, and among the Pygmies (Zaire) for carrying baskets, as well as among various Amerindian peoples in Brazil (Ticuna, Omagua, etc.), Ecuador (Huarani), and Guyana (Yekuana). The Micmac-Algonkin Indians of Canada use it for their large eastern snowshoes, as do Eskimos in Alaska. In Asia the use of the hexagonal basket-weaving technique is widespread, from the Munda in India, the Kha-Ko in Laos, to Malaysia, Indonesia, China, Japan, and the Philippines. Artisans all over the world discovered that if they use this open hexagonal weave to produce a basket, they have to "curve" the faces at the basket's "corners." They found that this can only be done by reducing the number of strands at the corners, and so they weave corners with pentagonal holes [4]. Figure 4 displays such a pentagonal hole surrounded by five hexagonal holes. The extreme situation would be a "basket" consisting of pentagonal holes only. This happens with the Malaysian "sepak raga" ball, which has twelve pentagonal holes (Fig. 5). Variations of the "sepak raga" game are played in other parts of Southeast Asia, including Burma, Thailand, the Philippines,
and Indonesia. It is a game with a long tradition. Dunsmore refers to a legend about a 14th-century Malay ruler who held his audience spellbound by kicking the ball more than 200 times without letting it touch the ground [5]. The structure of the "sepak raga" ball is very similar to that of the modern soccer ball (since the end of the 1960s), and constitutes a model for buckminsterfullerene C60 (Fig. 6): Imagine once more the carbon atoms arranged at the vertices of the holes of the "sepak raga" ball (this time, 12 pentagonal holes, leading to 60 atoms); the (rectified) edges of these holes represent the single bonds between the carbon atoms, and the crossings of two strands between neighboring vertices of neighboring pentagonal
Figure 5. Malaysian "sepak raga" ball.
Figure 6. (a) Schematic representation (front view) of the woven "sepak raga" ball; (b) Schematic representation of the modern soccer ball structure; (c) The "sepak raga"--soccer ball structure of buckminsterfullerene.
holes represent the double bonds. The hexagonal rings of carbon atoms are held together by alternately single and double bonds. There are 20 hexagonal rings. Both the hexagonal rings and the global icosahedral structure of C60 may be more easily visible if we weave the ball using hexastrips (Fig. 7). Hexastrips are cardboard strips in which a series of folds have been introduced in such a way that they facilitate the weaving together of the strips in three directions. Figure 8 shows how the first folds may be produced to make a hexastrip, and Figure 9 shows how to join three hexastrips--over-and-under. The strips may be held together using paper clips or by gluing their overlapping rhombi. Curl and Smalley (USA), and Kroto (UK) were awarded the 1996 Nobel Prize in Chemistry for their 1985 discovery of C60, observed in the mass spectrometer, and their conjecture that it would have the symmetrical structure of a truncated icosahedron [6]. The possible existence of a stable carbon molecule with such a structure had been conceived in 1970 by Osawa in Japan [7]. Curl, Kroto, and Smalley named the molecule buckminsterfullerene after the designer/
Figure 7. Hexastrip model of C60.
Figure 8. (a) Wrapping one strip around the other to obtain the first folding line. (b) The second folding line is marked by folding the upper part of the second strip in such a way that it becomes adjacent to the first strip. This process is repeated to produce the various folds.
Figure 9. Joining three hexastrips.
Figure 10. Hexastrip model of C72.
Figure 11. Hexastrip model of a nanotubule isomer of C120.
Figure 12. Hexastrip model of the isomer of C120 with tetrahedral symmetry.
Figure 13. Hexastrip model of the tetrahedral isomer of C168.
inventor of the geodesic domes, and indicated soccerene as a possible alternative name. Looking at the structure of the "sepak raga" ball, it becomes clear that sepak-raga-ene could also have been a possible name for C60. It may be interesting to note that the architect R. Buckminster Fuller (1895-1983) had such a "sepak raga" ball--(for inspiration?)--on the shelf of a bookcase in his dome home [8]. Prior to the discovery of buckminsterfullerene only two forms of crystalline carbon were known: graphite and diamond. Since the 1990 success of Krätschmer (Germany) and Huffman (USA) in synthesizing measurable quantities of C60, many other fullerenes and related molecules have been studied. Fullerenes are defined as closed cage molecules comprised entirely of sp2-hybridized carbons arranged in hexagons and pentagons [9]. As a consequence of Euler's theorem about the relationship between the number of vertices (V), the number of edges (E), and the number of faces (F) of a convex polyhedron, V - E + F = 2, the total number of pentagonal rings in a fullerene must always be 12. Fig. 10 shows a hexastrip model of C72 with two polar hexagonal holes. It is the smallest example of a carbon-based nanotubule, a cylindrical fullerene tube, and has a sixfold rotational axis. Fig. 11 shows a hexastrip model of another nanotubule, this time composed of two diametrically opposed hemispherical C60 caps, joined by a fivefold cylindrical wall of two rows of hexagonal holes. Having 12 pentagonal holes (12 × 5 = 60 vertices) and 10 hexagonal holes (10 × 6 = 60), the model represents an isomer of C120.
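The step from Euler's theorem to the count of twelve pentagonal rings can be made explicit. The following is a sketch only: the symbols p and h (numbers of pentagonal and hexagonal rings) are introduced here for illustration and do not appear in the article, and the derivation uses the fact that each sp2-hybridized carbon is bonded to exactly three neighbours, so every vertex has degree three.

```latex
\begin{align*}
F &= p + h, \qquad E = \frac{5p + 6h}{2}, \qquad V = \frac{5p + 6h}{3},\\
V - E + F &= \frac{5p + 6h}{3} - \frac{5p + 6h}{2} + p + h
           = \frac{p}{6} = 2
\quad\Longrightarrow\quad p = 12 .
\end{align*}
```

The number of hexagonal rings h cancels, so this argument fixes only the pentagons. For C60, with h = 20, the same formulas give V = 60 and E = 90.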
A hexastrip model of another isomer of C120 is shown in Fig. 12. It has global tetrahedral symmetry: the twelve pentagonal holes are clustered in four groups of three at the corners of a truncated tetrahedron and are surrounded by single bands of hexagonal holes. The smallest possible tetrahedral hexastrip model is one for C84: in the middle of each of the four faces there is a hexagonal hole. The tetrahedral structure becomes clearly visible in the hexastrip model of an isomer of C168 shown in Fig. 13: three hexagonal holes on each of the four "faces" and one hexagonal hole on each of the six "edges." Another possibility consists of the twelve pentagonal
Figure 14. Hexastrip model of the octahedral isomer of C276.
holes being distributed over the fullerene in six groups of two. If the groups of two pentagonal holes are composed of neighboring pentagonal holes, and regularly distributed over the closed surface, then the structure has a global octahedral form, as the hexastrip model of an isomer of C276 in Fig. 14 displays. At the corners of the woven truncated octahedron there are two opposite pentagonal holes surrounded by a layer of six hexagonal holes. To learn to make hexastrip models, one might start by weaving models for some small quasi-fullerenes. Closed carbon cages containing other than 5- and 6-membered rings are known as quasi-fullerenes. Fig. 16 shows a model of C24 woven of four hexastrips (each with only six folds): instead of twelve pentagonal holes, there are six square holes; it has both the form of a truncated cube and of a truncated regular octahedron. Still smaller is the model of C12, woven with three hexastrips (each with only four folds), which has tetrahedral symmetry (Fig. 17). Models of quasi-fullerenes with, for instance, heptagonal rings [10] may also be built using hexastrips. Fig. 18 shows a hexastrip model of a quasi-fullerene C576 in the form of a torus; it has 12 pentagonal and 12 heptagonal holes. The heptagonal holes produce the concave regions. Hexastrip models of several non-cage carbon molecules
Figure 16. Hexastrip model of a C24 cluster [14].
Figure 17. Hexastrip model of C12 with tetrahedral symmetry.
Figure 18. Hexastrip model of the isomer of C576 with the form of a torus.
Figure 15. Hexastrip model of the icosahedral isomer of C240.
Figure 21. Hexastrip model of C120 with the numbering scheme of Fig. 20.
Figure 19. Structural motifs and hexastrip models of (top) chrysene-, (middle) coronene-, and (bottom) corannulene-like carbon clusters.
may also be built. Fig. 19 shows models of chrysene-, coronene-, and corannulene-like carbon clusters. Hexastrip models of fullerenes and related carbon structures are not only beautiful and relatively easy to make of cheap materials--and may as such be attractive from a didactic point of view--but they also give a clear picture of the bonding situation: the edges of the holes represent the single bonds, and the folds (that is, where woven hexagons are adjacent) the double bonds. Given a hexastrip model, it is not difficult to determine the number of carbon atoms implied, as there are no problems with a possible double counting of vertices. Conversely, if a number n is equal to 60 + 6m, where m is zero or an integer greater than 1, then it is possible to construct a hexastrip model of Cn. For example, in the case of n = 120, we have 120 = 60 + 6 × 10, and several numbering schemes may be worked out to see which hexastrip isomers are possible. Fig. 20 displays a possible numbering scheme for the top half of the C120 isomer shown in Fig. 21. This isomer is different from the ones presented in Fig. 11 and 12. When it is not possible to write n in the form of 60 + 6m, as in the case n = 70, then there may exist a variation of a hexastrip model. In fact, for the relatively stable C70 it is possible to weave two semi-C60 models (see Fig. 22), and join them: On the adjacent central hexastrips there are ten vertices, representing the extra 10 carbon atoms. Hexastrip models present other advantages as well which may turn out to be useful in the analysis of the possible existence of certain isomers of fullerenes. Schmalz, et al. pointed out in 1986 that carbon cage
Figure 20. Hexastrip numbering scheme for the top half of a C120 isomer. The numbers 5 and 6 represent pentagonal and hexagonal holes.
Figure 22. Hexastrip model of C70 composed of two woven halves.
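The counting rule used in the text (every vertex of a hexastrip model lies on exactly one hole, so the atoms are counted by adding up the sides of the holes, and a fullerene model has n = 60 + 6m atoms) can be checked with a short sketch. The function names `atoms_from_holes` and `hexagonal_holes_needed` are hypothetical helpers introduced here for illustration; they are not part of the article.

```python
# Hypothetical helpers (not from the article) for counting carbon atoms
# in a hexastrip model from its holes.

def atoms_from_holes(holes):
    """Return the number of vertices (carbon atoms) of a hexastrip model.

    `holes` maps the number of sides of a hole to how many such holes the
    model has. Each vertex is assumed to lie on exactly one hole, so the
    sides can be added up without double counting.
    """
    return sum(sides * count for sides, count in holes.items())


def hexagonal_holes_needed(n):
    """Return m if n = 60 + 6m with m = 0 or m > 1, otherwise None."""
    extra = n - 60
    if extra < 0 or extra % 6 != 0:
        return None
    m = extra // 6
    return None if m == 1 else m


if __name__ == "__main__":
    print(atoms_from_holes({5: 12}))         # "sepak raga" ball, C60
    print(atoms_from_holes({5: 12, 6: 2}))   # two polar hexagonal holes, C72
    print(atoms_from_holes({5: 12, 6: 10}))  # nanotubule isomer, C120
    print(atoms_from_holes({4: 6}))          # quasi-fullerene with square holes, C24
    print(hexagonal_holes_needed(120))       # 10, since 120 = 60 + 6 * 10
    print(hexagonal_holes_needed(70))        # None: 70 is not of the form 60 + 6m
```

The second helper only tests the arithmetic condition stated in the text; it says nothing about which isomers can actually be woven, which still requires working out a numbering scheme.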
structures in which the pentagons are isolated are likely to be more stable than structures in which they abut. Subsequently the prescription that abutting pentagons are to be avoided has become known as the "isolated-pentagon-rule" (IPR). It appears to be obeyed by all fullerenes found and characterized so far [11]. The hexastrip models satisfy the "isolated-pentagon-rule": If two pentagonal holes would abut, at that place there would not be any pentagonal holes any more, but only a nonagonal hole. In their study on competing factors in fullerene stability, Fowler et al. note that the isolated pentagon rule is compatible with considerations of π-electronic stability, but that pentagon isolation in itself does not guarantee it [12]. As a technique for reducing the number of candidate isomers, the isolated pentagon rule is initially very successful. At C78 there are over twenty thousand general fullerenes, but only five IPR isomers. However, for n = 100, there are 450 IPR isomers, and for n = 120 there are 10774 IPR isomers. Fowler et al. analyze three possible ways to reduce the number of candidate isomers. For n = 120, there are 4 isomers with optimal "hexagon neighbor indices," forty have the form of a "leapfrog" cage, and one is a carbon cylinder. One might pose the question: do hexastrip weavable fullerenes provide another, and maybe powerful, way of reducing the number of candidate isomers? As woven structures they are very stable. What could in the molecules' microcosm correspond to the stable hexagonal basket weave technique? Or, formulated in another manner, do hexastrips with their zigzagging folds (double bonds ...) correspond to something strain-reducing or stability-reinforcing in fullerenes? [13]
Acknowledgments
Maurice Bazin, Arnout Brombacher, and Marcos Cherinda are thanked for stimulating conversations. The Research Department of the Swedish International Development Agency is thanked for financial support, and the University of Georgia (Athens, USA) for the conditions created for doing research during the author's 1996-1997 sabbatical leave from Mozambique's Universidade Pedagógica.
REFERENCES AND NOTES
1. Cuccia, L.A.; Lennox, R.B.; Ow, F.M. The Chemical Intelligencer 1996, 2(2), 26-31.
2. See, for example, Gerdes, P. Educational Studies in Mathematics 1988, 19, 137-162; Gerdes, P. Ethnogeometrie: Kulturanthropologische Beiträge zur Genese und Didaktik der Geometrie; Franzbecker Verlag: Bad Salzdetfurth, 1990; pp 282-287.
3. For examples and references, see Gerdes, P. Ethnogeometrie: Kulturanthropologische Beiträge zur Genese und Didaktik der Geometrie; Franzbecker Verlag: Bad Salzdetfurth, 1990; pp 52-53. Further examples may be found in Faublée, J. Ethnographie de Madagascar, Musée de l'Homme: Paris, 1946; pp 19, 28, 38; Somjee, S. Material Culture of Kenya, East African Educational Publishers: Nairobi, 1993, 96; Meurant, G.; Thompson, R.F. Mbuti Design--Paintings by Pygmy Women of the Ituri Forest, Thames and Hudson: London, 1995, 162; Guss, D.M. To Weave and Sing--Art, Symbol, and Narrative in the South American Rain Forest, University of California: Berkeley, 1989, 73; Lane, R.F. Philippine Basketry: an Appreciation, Bookmark Inc.: Manila, pp 14, 44, 152, 170, 213; Ranjan, M.P., Bamboo and Cane Crafts of North East India, National Institute of Design, 1986.
4. Cf. Gerdes, P. In Fivefold Symmetry; Hargittai, I., Ed.; World Scientific: Singapore, 1992; pp 245-261.
5. Dunsmore, S. Sepak Raga (Takraw)--The Southeast Asian Ball Game, Sarawak Museum: Kuching, 1983, 2.
6. Kroto, H.W.; Heath, J.R.; O'Brien, S.C.; Curl, R.F.; Smalley, R.E. Nature, 1985, 318, 162 (Reproduced in Aldersey-Williams, H. The Most Beautiful Molecule: The Discovery of the Buckyball; John Wiley & Sons: New York, 1995).
7. Cf. e.g. Kroto, H.; Fischer, J.; Cox, D., Eds., The Fullerenes, Pergamon Press: Oxford, 1993, pp 1, 11; Hirsch, A. The Chemistry of Fullerenes, Georg Thieme Verlag: Stuttgart, 1994, 5; Dresselhaus, M.S.; Dresselhaus, G.; Eklund, P.C. Science of Fullerenes and Carbon Nanotubes, Academic Press: San Diego, 1996, 2.
8. As can be seen in a photograph in Snyder, R., Ed. Buckminster Fuller: Autobiographical Monologue/Scenario, St. Martin's Press: New York, 1980, p 151.
9. Cf., e.g., Taylor, R. The Chemistry of Fullerenes, World Scientific: Singapore, 1995.
(continued on page 27)
Mathematical Communities
Parallel Worlds: Escher and Mathematics, Revisited This column is a forum for discussion of mathematical communities throughout the world, and through all time. Our definition of "mathematical community" is the broadest. We include "schools" of mathematics, circles of correspondence, mathematical societies, student organizations, and informal communities of cardinality greater than one. What we say about the communities is just as unrestricted. We welcome contributions from
l[=-z.--]a M a r j o r i e
Senechal,
Editor
he popularity--and u b i q u i t y - - o f the graphic work of the Dutch artist M.C. Escher (1898-1974) continues unabated: books on his w o r k remain in print, the public never seems to tire of Escher posters, mugs, Tshirts, calendars, and other paraphernalia, and exhibitions of his w o r k are packed. Over 300,000 visitors attended the six-month "M. C. Escher: A Centennial Tribute" at the National Gallery of Art in Washington last spring; exhibitions have recently been held, or soon will be held, in Brazil, Mexico, the Czech Republic, Hong Kong, Great Britain, China, Greece, Italy, Argentina, and Peru. "People are attracted like magnets to these works. They come closer and closer and closer, and they stay there an incredible amount of time," says Jean-Francois I~ger of the National Gallery of Canada. "Studies have shown that the average length of time that a gallery visitor will stay in front of a work of art is 17 seconds. But they stay minutes in front of Escher's, and discuss, and comment, and say, 'Do you see this, have you seen that?'." What is the magnet, what is the attraction? ls it profound, or is it superficial? It has b e c o m e rather fashionable to affect weariness with these questions. Although Escher was "discovered" by
T
I
research mathematicians (and other scientists) in the 1960's, their--our--enthusiasm for his work has waned as (or because?) the public's has waxed. "Of course, the article contains the inevitable reference to Escher, the philistine mathsman's favorite artist," sniffed an anonymous referee for the interdisciplinary journal Leonardo a few years ago [1]. Art critics have been disdainful all along, insisting that Escher will be, at most, a footnote in the history of twentieth-century art. But while this assessment may be correct, is it fair? Escher never claimed to be either a mathematician or an artist. "My uncle floated between art and mathematics--those are his words," says his nephew Nol Escher [2]. He was not at home in either world, yet he perhaps illuminates a profound relation between them.

M.C. Escher's hundredth birthday provides an occasion for the mathematical community to revisit his work and come to terms with it. The Escher Centennial Congress, held in Rome and Ravello, Italy, June 24-28, 1998, brought together a diverse group of mathematicians, scientists, artists, designers, museum educators, and others to consider the entire range of Escher's work, "from landscapes to mindscapes," from many different perspectives [3].

mathematicians of all kinds and in all places, and also from scientists, historians, anthropologists, and others.
Please send all submissions to the Mathematical Communities Editor, Marjorie Senechal, Department of Mathematics, Smith College, Northampton, MA 01063, USA; e-mail: [email protected]

Ravello: M.C. Escher's home in 1923. Photograph by Marjorie Senechal.

M. C. Escher's Print Gallery (1956 lithograph), © 1998 Cordon Art B.V., Baarn, Holland. All rights reserved.

During that congress I asked a small subset of the invited speakers to explore the reasons for Escher's enduring popularity with the general public, and in particular whether his appeal is in any sense "mathematical." The following comments splice together excerpts from two wide-ranging discussions. The participants were George Escher, a retired aeronautical engineer and oldest son of M.C. Escher; István Hargittai, Professor of Chemistry, Hungarian Academy of Sciences, author of numerous books on symmetry; Douglas Hofstadter, Center for Research on Concepts and Cognition, Indiana University, author of Gödel, Escher, Bach; Claude Lamontagne, Professor of Psychology at the University of Ottawa; Jean-François Léger, Education Director of the National Gallery of Canada in Ottawa; Arthur Loeb, Professor of Design Science at Harvard University; István Orosz,
Budapest, artist (considered by some to be Escher's "successor"); and Doris Schattschneider, Professor of Mathematics, Moravian College, author of Visions of Symmetry.
Senechal: It is a truism that art critics dislike Escher's work but the public loves it. Many people have speculated on possible reasons for the first, but few seem to have seriously considered the second. Today, let's forget about the critics, and consider the public instead. And let's begin in a skeptical vein. I don't know of any other artist's work that has been so commercialized, not even Picasso's. To what extent is Escher's popularity due to the commercialization? Or is the commercial success due to Escher's appeal?

Hofstadter: You can't just say, well, we're going to make all those ties! People aren't necessarily going to buy them.
Escher: Yes, but there was a very organized sales campaign of the Escher concept which was invented after father died or maybe even before, by people around him who said, "If we let it go, it will just fall apart." Because of the character of the people involved then and the people involved now, that's what you have: marketing specialists.

Senechal: Does that explain why other artists, such as Vasarely and Magritte, whose work challenges the imagination in ways somewhat analogous to Escher's, don't have the same mass following? Or is it, at least in the case of Vasarely, because his geometrical illusions are just abstract figures, not embedded in fanciful worlds?

Lamontagne: Maybe it is partly because they are not marketed the way Escher is, but also there is an immediacy in Escher. Magritte is not as easy to interpret. Escher chose simple things, waterfalls, monks walking. It looks understandable at first--but then you find surprises in it.

Léger: I'm not sure that everybody likes Escher. When we were working on our public programming, we tried to identify who would be most interested in him. We concluded that it would be young adults, who were interested in mind games and things like that. It may be that people become interested in Escher at a certain age, and then their interest fades a bit. Maybe Escher appeals to this group because his work is immediate: what you see is what you get.

Orosz: The most terrible experience for us artists is when a viewer at an exhibition stops for a half of a second in front of our work and then walks on to the next one. This is impossible in front of a print of Escher. And usually I feel, when I see his work in an exhibition or in a book, that after some minutes the picture is not important anymore, the important thing is the thinking, the mysterium. Over time, it will be even more important than the picture. This may be why the publishers use his works in calendars, because people have to live with them for a month at least, and they see them every day.
Hofstadter: I don't remember where I first saw "Metamorphosis," it was probably in some book, but I remember the fascination of the changing forms. I was never as attracted to the tessellations as I was to the metamorphoses, the idea that here is something that is tessellating, but it's changing into something else. And then, on top of that, it changes from being a two-dimensional thing to a three-dimensional thing, and then back into a two-dimensional thing, and then into another three-dimensional thing, and then it winds up being a village that plunges into the sea with chess pieces, and words! There were so many ideas tangled together there in such an elegant and graceful and, again, startling and astonishing manner--that's what grabbed me. It was a two-dimensional, three-dimensional constant interplay and then bringing in these other worlds, like medieval villages, chess, the world of the intellect, the world of the past--a medieval village connotes more than just the past, it again connotes a kind of mystical quality, something that's gone but that radiates a kind of charm that I can't put my finger on very well. And that, to me, was also marvelous.

M. C. Escher's Day and Night (1938 woodcut), © 1998 Cordon Art B.V., Baarn, Holland. All rights reserved.

Orosz: It's not the visual image that is most important, it's something in the mind. Still, it is very easy to speak about the work of Escher, much easier than to speak about abstract or other kinds of art. Somehow it is very close to communication--yet it is not visual communication, nor is it verbal communication.

Lamontagne: With Escher, the revealing that happens in the graphics is always accompanied by a concealing which uncovers itself through time as the visual system seeks interpretations. Escher was an incredible visual engineer; he experimented with just about all the ways in which you can intervene in the visual process to fool the system. I see three directions in his work. One I call "two-D," the tilings; another is "three-D," the impossible figures; and the third is what I call "through-D," like the Print Gallery, in which subject and object toggle with one another. The guy looking at the print is an object but when you go back he becomes the subject and then he turns into an object again.

Loeb: We've heard that it's the young people who take to Escher's work. That may be true; as a natural scientist I tend to question these things, but I think it's probably true. At this particular conference we have several much older people, but I think that they became interested in Escher when they were younger. It should be possible to find out.

Schattschneider: I think you're right that probably the primary audience is high school and college. I think part of it has to do with the irreverence of some of Escher's art--they say it is "cool," "awesome." But people who like to solve problems, who like to try to figure things out, are immediately attracted regardless of age. I think that's why scientists and mathematicians are so attracted--it's not so much that there is mathematics in it.

Lamontagne: Like most young undergraduate students in the 60's, I got a kick out of Escher, and I had my posters--it didn't turn into mania, though. I really enjoyed it for a period, but then I moved on to something else, and I forgot about Escher. Then a few years ago the National Gallery of Canada in Ottawa decided to put up an exhibition and they were looking for someone for the committee who had some knowledge and expertise. Someone in the museum had been one of my students of perception and remembered that I had an interest in knowledge and illusions and vision and that I was at the University of Ottawa, so soon I was back in the Escherian world. I was very happy to be in it: in fact, I found in Escher's work the whole problem space in which I had been playing over the previous 20 years! It has to do with knowledge, and with the fragility of knowledge, with the unavoidable hypothetical nature of knowledge. I started looking at the prints from that perspective, trying to see if I could fit them into a unity. I don't have closure on it, but I'm pretty excited about the way it is shaping up.

M. C. Escher's Up and Down (1947 lithograph), © 1998 Cordon Art B.V., Baarn, Holland. All rights reserved.

Hargittai: This kind of discussion inevitably prompts me to ask myself what I like in Escher most, and what I use Escher most for. I use him most for his periodic drawings, but I don't think I like them most. After a while they become very much the same, boring and simplistic. If I could just choose the one thing that I like most, it would be his wild flowers. I started wondering, why do I like his wild flowers so much? It is probably because of my science background: his wild flowers are very geometric, they are stripped of many things, and they seem to me to give a fantastic model of nature.
Something is there, it is very important, but many other things are just ignored, as in any very good model. His periodic drawings are extremely useful for me, but in this case "what you see is what you get": after a while you get very used to it. I always get an uneasy feeling when I see that math teachers are spending I don't know how many class hours on Escher. I think it's a very good way to make children hate him and that kind of work. In fact, he is a unique artist for the connection between art and science.

Lamontagne: Everybody has seen illusions in psychology books or even more widely available literature, but they are crude. Escher put them into a world that has some cogency, some consistency. He uses a variety of them, some of which don't strike us as being illusions, for instance the way in which he uses the various worlds that point to one another, to make people realize that knowledge cannot be trusted, but at the same time, it can be trusted. It can be trusted locally, but there's always a globality that might show that it does not make sense. This local/global relationship is fundamental in cognitive science as well as in mathematics.

M. C. Escher's Castrovalva (1930 lithograph), © 1998 Cordon Art B.V., Baarn, Holland. All rights reserved.

Hofstadter: There is, as I'm sure everyone knows, a brand of literature that may have started in South America called magic, or magical, realism, in which there is a mixture of reality and paranormal events. I haven't read much of it; the only time I attempted to read some--Gabriel Garcia Marquez's One Hundred Years of Solitude--I found I just couldn't take it, I couldn't stand it. And yet, what is the difference between that kind of literature, which mixes reality with mystical, unexplainable events that violate the laws of physics, and "High and Low," Escher's print in which the scene is repeated twice, with the boy sitting on the staircase, with the palm trees in the courtyard, the tower that is both going up and down, the windows right side up on one side and upside down on the other side, gravity obviously flowing in two different directions in the same building? In some sense that's magical realism, yet I love that! I don't understand what it is in myself that finds Garcia Marquez uninteresting and silly, yet finds Escher captivating and mesmerizing.
There's a sense of mysticism in it: I think the word mysticism isn't wrong. I'm not a mystic, but there's an appeal to a sense of marvelous mystery, which is also what caught me so much in "Day and Night," the first Escher print I ever saw. The birds, not only intersecting and forming their own background, but also becoming fields and then day and night in the same place at the same time, all of that was overwhelming. It was so strange and complex and weird.

Escher: Maybe this is because you can look at an Escher print again and again and again, and think about it.

Senechal: Yes, I think the difference between magical realism in literature and the magical sense in Escher is that as you look at Escher more and more, you begin to understand it. You don't see how he could possibly have thought of it, but you do see how it was actually executed. You begin to see, for example, why this seems to be convex when you look at it one way, but concave another, instead of just being baffled by it. You become intellectually engaged in trying to understand Escher, while with Garcia Marquez and the other magical realist writers that I have read, no understanding is possible because there's nothing there to understand. It's just magic.

Schattschneider: I agree. I don't think that "what you see is what you get" with Escher. I gave John Conway a copy of my book [4], and he later told me that it took him six months to read. I said, "John, if I tell people it took you six months to read my book, no one will open it!" He replied that at first he began to devour it, but then he decided to put it on the piano and only allow himself to turn one page a day, because he really wanted to study it. When he slowed down, he saw things he had never seen before although he had looked at these prints many times.

Léger: My understanding of the expression "what you see is what you get" is that it is immediate, in the sense that the message is all included in the work: you can come to it knowing nothing about art, and still you will get something out of it. You don't have to know what was prior to that, or after that, it doesn't cite somebody else, you can engage in it with no prior knowledge of it.
Hofstadter: And yet, when one knows some of Escher's other works, one reads his landscapes in a way that one might not have read them without that context. One has the sense that this is somebody who appreciates magic. You feel it in that landscape, even though it's not directly there, and even though it was done maybe 20 years before something like "Day and Night." You feel that same sense of the magical, a sense of engagement, depth, power, space, and space between lines.

Léger: The more I look at Escher, the more I am interested in his landscapes. Even art critics will agree with that. I would wish that more people would focus on the Italian landscapes.

Escher: Father thought that among all the artists who depicted landscapes, he was nothing special. They were all dedicated people with good eyes, wanting to show what they saw; he was one of thousands. It's only because he switched out of that field that his work in it becomes visible; that's the strangest thing about the whole phenomenon. Father never considered himself an artist: because he had a certain preconceived idea of what an artist was, he thought he wasn't one, and he couldn't draw anyway. But can you see, through his prints, that he was looking at the world so intensely, with such interest, that it comes through, it resonates, within you: "Oh, so that's what the world can look like!"

Loeb: Maurits expressed surprise to me many times that people were seeing mystical things in his prints. He did not expect that at all. George, did you have any experience with this?

Escher: Well, yes. It was rather funny, the reactions that father got to many of his prints. People saw their own imaginings in them, not what he had meant to say. What he meant to say is what's there, and nothing more, according to him. These other people saw reincarnations and mystical things.

Loeb: Maybe that is the "magic mirror" of M. C. Escher. Maybe that's what we all see: we see ourselves mirrored in his work.

Lamontagne: The question of interpretation is a very subtle one.
The attitude that we should not interpret, that we should be cautious, is very naive: if you have to be cautious when you interpret then you have to be cautious when you think, because thinking is interpreting. To an extent, I'm a Popperian. That is, I agree with the philosopher Karl Popper that all forms of knowledge, including perceptual knowledge, are conjectural or hypothetical, and the only way in which we can hope to progress in our understanding is to formulate our knowledge in a falsifiable way [5]. For example, a person coming into a different culture reads it in a way that is refutable through further experience. That is, I think, a better reading than a naive reading of it which is not refutable.

Senechal: Is there anything mathematical in the appeal of Escher, or is that completely beside the point?

Lamontagne: Perhaps mathematics was to Escher as grammar was to Shakespeare. Mathematics is form.

Hofstadter: At the time I was writing my book, which became known as Gödel, Escher, Bach [6], it was not called that at all: the working title was something like "Gödel's Theorem and the Human Brain." As I was writing and writing, I realized that for many of the concepts that I had called "strange loops" or "tangled hierarchies," images that I knew from Escher were appearing in my head, over and over again. For a while those images just helped me express myself more clearly; they helped me get more sharply into words what I was trying to get across. But then eventually it occurred to me, my gosh, I should be showing my readers this stuff, I should not be simply having it in my head as a crutch or an aid, I should be sharing this. If it's useful to me as a writer to have an Escher picture in my head, it will be useful for my readers to have it in theirs. At that point Escher became an integral part of the book, and it was about the same time that Bach was entering, for very different reasons.
For me, many of the concepts that I was trying to get across, particularly this notion of strange loop, were extremely well represented in Escher's pictures, and they were deeply connected, as I said, with Gödel's theorem and certain things in mathematical logic. I doubt that Escher had those notions in mind explicitly, but the abstraction that underlies Gödel's proof and the abstraction that underlies "The Print Gallery"--the idea of a system folding around and engulfing itself--is the same concept as in a system that can represent its own predicates, a system that can talk about itself.

Lamontagne: I've been raised in a context of Piagetian thought, in the Piaget world, which is still quite valid. Piaget talks about adolescence as the period when cognition opens up to the realm of possibilities. Before that--he calls it the concrete operational period--the mind is reactive and it can do very fancy things, but on the basis of actual things, concrete objects. But when you reach the formal operational stage, which starts around 12, as you reach 12, 13, 14, the mind opens up. It realizes that there is not only actuality, but that actuality can lead to potentiality. And so the young minds open up to the fact that they are what they are, but within the context of a huge combinatorics that is at the same time physical, social, and psychological. Do you know the test that Piaget did with mathematics--with permutations and combinations, showing how kids at the pre-operational level, and concrete operational level, and formal operational level handle combinatorial tasks [7]? Before adolescence, a child can compute the number of arrangements of any given number of objects, but cannot even understand the question if you ask for the number of arrangements of N objects. Adolescents can think about N. In addition to explaining math understanding, Piaget's ideas are beautiful metaphors for the mind in general. When you get to the formal operational stage, that is, adolescence, then you open up to the possible and you realize that you are one amongst an infinite set of possibilities. Now, at that age, there is at the same time the fear of losing what you are but the excitement of discovering what you might be, and what the world might be. I think that it is in this area that we can locate the great fascination for Escher.

This is our stopping point, but it is not the end. This particular conversation was one among an infinite set of possible conversations about the work of M. C. Escher and, more generally, about the deep relations between art and mathematics and the human mind. Like Escher's visual puzzles, it loops back on itself, leading us through new landscapes that somehow are familiar.

REFERENCES
[1] John Rigby, private communication.
[2] Nol Escher, private communication.
[3] "M. C. Escher: From Landscapes to Mindscapes" was the title of an exhibition held at the National Gallery of Canada in Ottawa in 1998. The proceedings of the Escher Centennial Congress will be published shortly by Springer-Verlag.
[4] Doris Schattschneider, Visions of Symmetry: notebooks, periodic drawings, and related work of M. C. Escher, New York, W.H. Freeman, 1990.
[5] Karl Popper, Conjectures and Refutations: the growth of scientific knowledge, London, Routledge & Kegan Paul, 1963.
[6] Douglas Hofstadter, Gödel, Escher, Bach: an eternal golden braid, New York, Basic Books, 1979.
[7] Jean Piaget and Bärbel Inhelder, La genèse de l'idée de hasard chez l'enfant, Paris, Presses Universitaires de France, 1951.
GIAN-CARLO ROTA
Two Turning Points in Invariant Theory*

*Slightly edited second Colloquium Lecture delivered at the Annual Meeting of the American Mathematical Society, Baltimore, January 8, 1998.

Invariant theory is the great Romantic story of mathematics. For 150 years, from its beginnings with Boole to the time, around the middle of this century, when it branched off into several independent disciplines, mathematicians of all countries were brought together by their common faith in invariants: in England, Cayley, MacMahon, Sylvester, and Salmon, and later, Alfred Young, Aitken, Littlewood, and Turnbull; in Germany, Clebsch, Gordan, Grassmann, Sophus Lie, Study; in France, Hermite, Jordan, and Laguerre; in Italy, Capelli, Brioschi, Trudi, Corrado Segre, and D'Ovidio; in America, Glenn, Dickson, Carus (of the Carus Monographs), Eric Temple Bell, and, later, Hermann Weyl. Seldom in history has an international community of scholars felt so united by a common scientific ideal for so long a stretch of time. In our century, Lie theory and algebraic geometry, differential algebra, and algebraic combinatorics are offspring of invariant theory. No other mathematical theory, with the exception of the theory of functions of a complex variable, has had as deep and lasting an influence on the development of mathematics. Eventually, invariant theory was to become a victim of its own success: The very term "invariant theory" is nowadays applied to diverse offspring theories, so that it has become all but meaningless. It is no wonder if you are baffled by the title of this lecture. Which invariant theory is it about?

It is about classical invariant theory. The old treatises are being dusted off the shelves of library basements and reread, reinterpreted, and presented in a language that meets the standard of rigor of our day. The program of classical invariant theory, that had for some time been given up as hopeless, is again being pursued, and success may at last be within reach. I will review two turning points in the history of invariant theory. The first, the "new" one, happened around the turn of the century, and its effects are still being felt all over mathematics. The second, the "old" one, happened very early in the game and led to a serious misunderstanding that lasts to this day.

A pedestrian definition of invariant theory might go as follows: invariant theory is the study of orbits of group actions. Such a definition is correct, but it must be supplemented by a programmatic statement. Hermann Weyl, in the introduction to his book The Classical Groups, summarized the program in two assertions. The first states that "all geometric facts are expressed by the vanishing of invariants," and the second states that "all invariants are invariants of tensors."

Let me briefly comment on these lofty statements. What is a geometric fact? It is a fact about space that is independent of the choice of a coordinate system. Geometric facts are described by means of equations which require a choice of coordinates. In a vector space $V$ of dimension $n$, one chooses a coordinate system $x_1, x_2, \ldots, x_n$. Since Descartes, we have learned to express geometric facts by equations in the coordinates $x_1, x_2, \ldots, x_n$. However, about 100 years ago, mathematicians and physicists made the shocking discovery that the usual equations (i.e., equations in the commutative ring generated by the variables $x_1, x_2, \ldots, x_n$) are inadequate for the description of a lot of geometric and physical facts. Motivated by this discovery, they introduced a more general ring. This is the ring of non-commutative polynomials in the coordinates $x_1, x_2, \ldots, x_n$. Homogeneous elements of this ring (i.e., homogeneous non-commutative polynomials in the variables $x_1, x_2, \ldots, x_n$) are called tensors. If we believe Hermann Weyl's philosophy, then we will be satisfied that equations in the tensor algebra suffice for the description of any geometric fact we will ever meet.
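As a small illustration of the difference between the two rings (a sketch only; the coefficient names $a_{ij}$ are an assumed labeling, not notation from the lecture), consider degree-two monomials in two coordinates $x_1, x_2$:

```latex
\begin{align*}
\text{commutative ring:} \quad & x_1 x_2 = x_2 x_1, \\
\text{non-commutative ring:} \quad & x_1 x_2 \neq x_2 x_1, \\
\text{general tensor of degree 2:} \quad & t = a_{11}\, x_1 x_1 + a_{12}\, x_1 x_2 + a_{21}\, x_2 x_1 + a_{22}\, x_2 x_2 .
\end{align*}
```

Such a tensor carries four independent coefficients, whereas a commutative quadratic polynomial in $x_1, x_2$ carries only three, because there the two mixed monomials are merged into one.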
Furthermore, if these equations are to express geometric properties, then they must hold no matter what coordinate system is chosen; in other words, equations that describe geometric facts must be invariant under changes of coordinates. The program of invariant theory, from Boole to our day, is precisely the translation of geometric facts into invariant algebraic equations expressed in terms of tensors.

This program of translation of geometry into algebra was to be carried out in two steps. The first step consisted in decomposing tensor algebra into irreducible components under changes of coordinates. The second step consisted in devising an efficient notation for the expression of invariants for each irreducible component. The first step was successfully carried out in this century; the second was abandoned sometime in the twenties, and only recently has it resurfaced.

The decomposition of tensor algebra into irreducible components was discovered around the turn of the century almost simultaneously by Issai Schur and Alfred Young. The gist of this decomposition is one of the great advances in mathematics of all times, and it may be worthwhile to present it in a form that can be made available to undergraduates.

Let us consider functions of three variables, such as $f(x_1, x_2, x_3)$. Two well-known classes of functions of three variables are symmetric functions, those that satisfy the equations
fs(Xb x2, X3) = fs(Xil, xi2, xi3) for every p e r m u t a t i o n sending the indices (i, 2, 3) to (il, i2, i3), and s k e w - s y m m e t r i c functions, defined by the equations
fa(Xl, X2, X3) = -4-fa(Xil, Xi 2, Xi3), w h e r e the sign is § 1 or - 1 a c c o r d i n g as the p e r m u t a t i o n sending the indices (1, 2, 3) to (il, i2, i3) is even or odd. It is n o t true that a function of t h r e e variables is the s u m of a s y m m e t r i c function and a s k e w - s y m m e t r i c function. A third type o f function is required, w h i c h is called a cyclic function, c h a r a c t e r i z e d by the equation
fc(Xl, x2, x3) + fc(x3, xt, x2) + fc(x2, x3, Xl) = O. Every function of t h r e e variables c a n b e uniquely written as the s u m o f a symmetric function, a s k e w - s y m m e t r i c function, a n d a cyclic function, in symbols,
f(Xl, X2, X3) = fs(Xl, X2, X3) + fa(Xl, X2, X3) + fc(Xl, X2, x3). E a c h o f t h e three s y m m e t r y c l a s s e s is invariant u n d e r permutations; this is obvious for s y m m e t r i c and skew-symmetric functions b u t not quite so o b v i o u s for cyclic functions. These t h r e e invariant s u b s p a c e s play for the group of p e r m u t a t i o n s of a set of t h r e e e l e m e n t s a role analogous to the role o f the eigenvectors of a s y m m e t r i c matrix. F o r f u n c t i o n s f ( x l , x2, x3, x4) of f o u r variables, there a r e five s y m m e t r y classes, which are defined as follows: 1. S y m m e t r i c functions 2. S k e w - s y m m e t r i c functions 3. Cyclic-symmetric functions, satisfying the four equations
$$f(x_1, x_2, x_3, x_4) + f(x_1, x_4, x_2, x_3) + f(x_1, x_3, x_4, x_2) = 0,$$
$$f(x_1, x_2, x_3, x_4) + f(x_4, x_2, x_1, x_3) + f(x_3, x_2, x_4, x_1) = 0,$$
$$f(x_1, x_2, x_3, x_4) + f(x_4, x_1, x_3, x_2) + f(x_2, x_4, x_3, x_1) = 0,$$
$$f(x_1, x_2, x_3, x_4) + f(x_3, x_1, x_2, x_4) + f(x_2, x_3, x_1, x_4) = 0$$
4. Functions satisfying the four equations

$$f(x_1, x_2, x_3, x_4) + f(x_2, x_1, x_3, x_4) + f(x_1, x_2, x_4, x_3) + f(x_2, x_1, x_4, x_3) = 0,$$
$$f(x_1, x_2, x_3, x_4) + f(x_3, x_2, x_1, x_4) + f(x_1, x_4, x_3, x_2) + f(x_3, x_4, x_1, x_2) = 0,$$
$$f(x_1, x_2, x_3, x_4) + f(x_1, x_3, x_2, x_4) + f(x_4, x_2, x_3, x_1) + f(x_4, x_3, x_2, x_1) = 0,$$
$$\sum_{\sigma} \operatorname{sign}(\sigma)\, f(x_{\sigma 1}, x_{\sigma 2}, x_{\sigma 3}, x_{\sigma 4}) = 0$$

5. Functions satisfying the equations

$$f(x_1, x_2, x_3, x_4) - f(x_2, x_1, x_3, x_4) - f(x_1, x_2, x_4, x_3) + f(x_2, x_1, x_4, x_3) = 0,$$
$$f(x_1, x_2, x_3, x_4) - f(x_3, x_2, x_1, x_4) - f(x_1, x_4, x_3, x_2) + f(x_3, x_4, x_1, x_2) = 0,$$
$$f(x_1, x_2, x_3, x_4) - f(x_1, x_3, x_2, x_4) - f(x_4, x_2, x_3, x_1) + f(x_4, x_3, x_2, x_1) = 0,$$
$$\sum_{\sigma} f(x_{\sigma 1}, x_{\sigma 2}, x_{\sigma 3}, x_{\sigma 4}) = 0$$

Every function of four variables is uniquely expressible as the sum of five functions, each one belonging to one of these symmetry classes. Each symmetry class is invariant under permutations.

More generally, every function of $n$ variables $f(x_1, x_2, \ldots, x_n)$ can be uniquely written as the sum of $p_n$ functions, each one belonging to a different symmetry class. Here, $p_n$ equals the number of partitions of the integer $n$.
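For three variables, the decomposition into symmetric, skew-symmetric, and cyclic parts can be checked by direct computation. The following Python sketch is only an illustration: the helper names `sign` and `decompose` are hypothetical, and the formulas used for the symmetric and skew-symmetric parts (the plain and the signed average over all six permutations) are the standard ones, assumed here rather than taken from the text. The cyclic part is defined as whatever remains.

```python
from fractions import Fraction
from itertools import permutations

def sign(perm):
    """Sign of a permutation of (0, 1, 2), computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def decompose(f):
    """Split f(x1, x2, x3) into symmetric, skew-symmetric and cyclic parts."""
    perms = list(permutations(range(3)))

    def f_s(*x):  # plain average over all six permutations of the arguments
        return Fraction(sum(f(*(x[i] for i in p)) for p in perms), 6)

    def f_a(*x):  # signed average over all six permutations of the arguments
        return Fraction(sum(sign(p) * f(*(x[i] for i in p)) for p in perms), 6)

    def f_c(*x):  # the remainder, which should satisfy the cyclic equation
        return f(*x) - f_s(*x) - f_a(*x)

    return f_s, f_a, f_c

# Arbitrary example function; integer arguments keep the arithmetic exact.
f = lambda x1, x2, x3: x1 * x2 ** 2 + 5 * x3
f_s, f_a, f_c = decompose(f)

# Left-hand side of the cyclic equation at the point (2, 3, 5).
cyclic_sum = f_c(2, 3, 5) + f_c(5, 2, 3) + f_c(3, 5, 2)
```

The three parts add up to `f` by construction, so the only thing to check is that each part lies in its symmetry class.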
Each symmetry class is defined by equations which are not difficult to find. This decomposition holds for tensors as well, after some cosmetic changes of notation.

To this day, only two symmetry classes of tensors have been studied in any detail. Symmetric tensors are ordinary commutative polynomials such as we learned to use in analytic geometry. Skew-symmetric tensors are polynomials in the coordinates $x_1, x_2, \ldots, x_n$ provided that the variables are assumed to satisfy the equations $x_i x_j = -x_j x_i$. Tensors belonging to symmetry classes other than the classes of symmetric and skew-symmetric tensors also occur in geometry and physics. However, these symmetry classes have been studied very little, and they are a long way from being understood.

So much for the word "new" in the introduction of this lecture; let us next do some justice to the word "old." I will describe the most peculiar feature of classical invariant theory, namely the symbolic or umbral notation, to which Eric Temple Bell dedicated his Colloquium Lectures in 1927. I will consider the simplest group, namely the group of translations of the line. The unusual features of the symbolic method will already be apparent in this special case.

Let $p(x)$ and $q(x)$ be monic polynomials in the variable $x$. I write them in the following quaint notation:
$$p(x) = x^n + \binom{n}{1} a_1 x^{n-1} + \binom{n}{2} a_2 x^{n-2} + \cdots + \binom{n}{n-1} a_{n-1} x + a_n$$

and

$$q(x) = x^k + \binom{k}{1} b_1 x^{k-1} + \binom{k}{2} b_2 x^{k-2} + \cdots + \binom{k}{k-1} b_{k-1} x + b_k.$$

I assume that the degree of the polynomial $q(x)$ is at most the degree of the polynomial $p(x)$; that is, that $k \le n$. Define the translation operator $T^c$ on a polynomial $p(x)$ as follows:

$$T^c p(x) = p(x + c).$$

Let us write

$$p(x + c) = x^n + \binom{n}{1} p_1(c) x^{n-1} + \binom{n}{2} p_2(c) x^{n-2} + \cdots + \binom{n}{n-1} p_{n-1}(c) x + p_n(c).$$

The $j$th coefficient $p_j(c)$ of the polynomial $p(x + c)$ is computed to be

$$p_j(c) = a_j + \binom{j}{1} a_{j-1} c + \binom{j}{2} a_{j-2} c^2 + \cdots + c^j.$$

A polynomial $I(a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_k)$ in the variables $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_k$ is said to be an invariant of the two polynomials $p(x)$ and $q(x)$ when

$$I(a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_k) = I(p_1(c), p_2(c), \ldots, p_n(c), q_1(c), q_2(c), \ldots, q_k(c))$$

for all complex numbers $c$. By abuse of notation, we write $I(p(x), q(x))$ and we speak of $I$ as being an invariant of the polynomials $p(x)$ and $q(x)$. In this abusive notation, a polynomial $I$ is said to be an invariant of the polynomials $p(x)$ and $q(x)$ whenever

$$I(T^c p(x), T^c q(x)) = I(p(x), q(x))$$

for all constants $c$.

Invariant theory is concerned with the problem of finding all invariants of a given set of polynomials, as well as their significance. What is meant by the "significance" of an invariant? I appeal to Hermann Weyl. "Every" property of polynomials which is invariant under the group of translations is expressed by the vanishing of a set of invariants. In other words, "any" set of polynomials which is invariant under translations is the same set as a set of polynomials obtained by setting to zero a set of invariants of such polynomials.

It is impossible to understand this statement without examples. Let us consider the simplest and oldest example. The property of a quadratic polynomial

$$q(x) = x^2 + 2 b_1 x + b_2$$

of having a double root is invariant under translations; in other words, if the polynomial $q(x)$ has a double root, so does the polynomial $q(x + c)$ for any constant $c$. Following Hermann Weyl, we look for an invariant whose vanishing expresses this property. Sure enough, it is easy to check that the discriminant

$$D(b_1, b_2) = b_1^2 - b_2$$

is the desired invariant. This example, due to Boole, was the spark that led to the birth of invariant theory.

One often hears the sentence "Hilbert killed invariant theory," referring to what we call the Hilbert basis theorem. It is not true. Hilbert loved invariant theory, and he went on publishing striking papers in invariant theory after he proved the basis theorem. Some of the most fascinating results in invariant theory were discovered in the first 20 years of this century, a long time after Hilbert proved his basis theorem.

What then is the reason for the subsequent temporary demise of invariant theory? One reason is the endemic use of the symbolic or umbral notation. Dieudonné wrote that half the success of a piece of mathematics depends on a proper choice of notation. It would be interesting to make a list of unfortunate notations that killed various chapters of mathematics, as well as a list of felicitous notations that promoted the development of other branches of mathematics.

The symbolic or umbral notation was a catastrophe. A number of mathematicians tried to make sense of the symbolic method without success, the three most notable ones being Hermann Weyl, Eric Temple Bell, and Edward Hegeler Carus. Bell failed to properly define umbral notation, and his Algebraic Arithmetic remains to this day the book of seven seals. If Weyl and Bell had lived 50 years longer, so as to benefit from the development of what was in their time called "modern" algebra, they would undoubtedly have succeeded in properly defining umbral notation. In our day, as I will show you, this is easy: it will only take a few minutes.

Before I start spouting definitions, let me say what I am not going to say. Umbral notation can be shown to be equivalent (or "cryptomorphic," to use a term invented by my late friend Garrett Birkhoff) to another notation that has gained great notoriety in our day: the notation of Hopf algebras. I will not justify this sibylline pronouncement, not because it is difficult to do so, but because it is not needed.

Let us go on to the definition of umbral notation. Side by side with the polynomials $p(x)$ and $q(x)$, we consider another polynomial algebra $\mathbf{C}[x, \alpha, \beta]$ in three variables $x$, $\alpha$, and $\beta$, together with a linear functional $E$ defined on the underlying vector space $\mathbf{C}[x, \alpha, \beta]$. The definition of the linear functional $E$ is the key point. It is carried out in steps:
Step 1. Set

$$E(x^j) = x^j$$

for all non-negative integers $j$; in particular $E(1) = 1$. Thus, the range of the linear functional $E$ is $\mathbf{C}[x]$.

Step 2. Set

$$E(\alpha^j) = a_j;$$

in particular, we have $E(\alpha^j) = 0$ if $j > n$.

Step 3. Set

$$E(\beta^j) = b_j;$$

in particular, we have $E(\beta^j) = 0$ if $j > k$.

Step 4. This is the main step. Set

$$E(\alpha^i \beta^j x^l) = E(\alpha^i) E(\beta^j) x^l.$$

In other words, the linear functional $E$ is multiplicative on distinct umbrae. Following Sylvester, the variables $\alpha$ and $\beta$ are called umbrae.

Step 5. Extend by linearity.

This completes the definition of the linear functional $E$.

We next come to the most disquieting feature of umbral notation. Let $f(\alpha, \beta, x)$ and $g(\alpha, \beta, x)$ be two polynomials in the variables $\alpha$, $\beta$, and $x$. We write

$$f(\alpha, \beta, x) \simeq g(\alpha, \beta, x)$$

to mean

$$E(f(\alpha, \beta, x)) = E(g(\alpha, \beta, x)).$$

Read $\simeq$ as "equivalent to." The "classics" went a bit too far; they wrote

$$f(\alpha, \beta, x) = g(\alpha, \beta, x);$$

that is, they replaced the symbol $\simeq$ by ordinary equality. This was an excessive abuse of notation. The "classics" were aware of the error, and while they avoided computational errors by clever artistry, they were unable to get away from the abuse.

The umbral or symbolic method consists of replacing all occurrences of the coefficients of the polynomials $p(x)$ and $q(x)$ by umbrae and equivalences. For example,

$$p(x) \simeq (x + \alpha)^n \quad\text{and}\quad q(x) \simeq (x + \beta)^k.$$

Let us carefully check the first equivalence. By definition, the equivalence means the same as

$$E(p(x)) = E((x + \alpha)^n).$$

Because $E(x^j) = x^j$ for all non-negative integers $j$, this identity can be rewritten as

$$p(x) = E((x + \alpha)^n).$$

Expanding the right-hand side by the binomial theorem, we obtain

$$E\Big(\sum_{j=0}^{n} \binom{n}{j} \alpha^j x^{n-j}\Big).$$

By linearity, this equals

$$\sum_{j=0}^{n} \binom{n}{j} E(\alpha^j x^{n-j}).$$

Evaluating the linear functional $E$, we see that this, in turn, equals

$$x^n + \binom{n}{1} a_1 x^{n-1} + \binom{n}{2} a_2 x^{n-2} + \cdots + \binom{n}{n-1} a_{n-1} x + a_n,$$

as desired. The expression $(x + \alpha)^n$ is called an umbral representation of the polynomial $p(x)$.

In umbral notation, a complex number $r$ is a root of the polynomial equation $p(x) = 0$ if and only if $(r + \alpha)^n \simeq 0$. Similarly, in umbral notation, the polynomial $T^c p(x) = p(x + c)$ may be represented as follows:

$$p(x + c) \simeq (x + \alpha + c)^n,$$

and this yields the umbral expression for the coefficients $p_j(c)$ of the polynomial $p(x + c)$, namely

$$p_j(c) \simeq (\alpha + c)^j.$$

Let us next see how umbral notation is related to invariants. Let us assume that the two polynomials $p(x)$ and $q(x)$ have the same degree $n$. Then, an invariant $A$ of the polynomials $p(x)$ and $q(x)$ may be defined as follows:

$$A(q(x), p(x)) \simeq (\beta - \alpha)^n.$$
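To see the functional $E$ at work, the invariant $A$ can be evaluated mechanically: expand $(\beta - \alpha)^n$ by the binomial theorem and replace $\alpha^i$ by $a_i$ and $\beta^j$ by $b_j$. The Python sketch below does this with exact rational arithmetic and compares the value before and after a translation. The function names `translate` and `apolar` and the test data are assumptions made for illustration; the formula for $p_j(c)$ is the one stated earlier.

```python
from fractions import Fraction
from math import comb

def translate(coeffs, c):
    """Return [p_1(c), ..., p_n(c)] from [a_1, ..., a_n], using
    p_j(c) = sum_i C(j, i) * a_(j-i) * c^i with a_0 = 1."""
    a = [Fraction(1)] + [Fraction(v) for v in coeffs]
    c = Fraction(c)
    return [sum(comb(j, i) * a[j - i] * c ** i for i in range(j + 1))
            for j in range(1, len(a))]

def apolar(b_coeffs, a_coeffs):
    """Evaluate E((beta - alpha)^n): expand by the binomial theorem, then
    replace alpha^i by a_i and beta^j by b_j (with a_0 = b_0 = 1)."""
    n = len(a_coeffs)
    if len(b_coeffs) != n:
        raise ValueError("both polynomials must have the same degree")
    a = [Fraction(1)] + [Fraction(v) for v in a_coeffs]
    b = [Fraction(1)] + [Fraction(v) for v in b_coeffs]
    return sum((-1) ** i * comb(n, i) * b[n - i] * a[i] for i in range(n + 1))

# Arbitrary test data: coefficients a_1..a_3, b_1..b_3 and a translation c.
a_coeffs, b_coeffs, c = [1, 2, 3], [4, 5, 6], 7
before = apolar(b_coeffs, a_coeffs)
after = apolar(translate(b_coeffs, c), translate(a_coeffs, c))
```

If the definition behaves as claimed, `before` and `after` agree for every choice of `c`.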
The evaluation of the invariant $A$ in terms of the coefficients of $p(x)$ and $q(x)$ proceeds as follows:

$$A(q(x), p(x)) = E((\beta - \alpha)^n) = E\Big(\beta^n - \binom{n}{1} \beta^{n-1} \alpha + \binom{n}{2} \beta^{n-2} \alpha^2 - \cdots + (-1)^{n-1} \binom{n}{n-1} \beta \alpha^{n-1} + (-1)^n \alpha^n\Big)$$
$$= b_n - \binom{n}{1} b_{n-1} a_1 + \binom{n}{2} b_{n-2} a_2 - \cdots + (-1)^{n-1} \binom{n}{n-1} b_1 a_{n-1} + (-1)^n a_n.$$

Why is $A$ an invariant? This is best seen in umbral notation:

$$A(T^c q(x), T^c p(x)) \simeq (\beta + c - \alpha - c)^n = (\beta - \alpha)^n.$$

The invariant $A$ is called the apolar invariant; two polynomials $p(x)$ and $q(x)$ having the property that $A(q(x), p(x)) = 0$ are said to be apolar. In umbral notation, two polynomials are apolar whenever

$$(\beta - \alpha)^n \simeq 0.$$

The concept of apolarity has a distinguished pedigree going all the way back to Apollonius.

What is the "significance" of the apolar invariant? What does it mean for two polynomials to be apolar? This question is answered by the following theorem:

Theorem 1. Suppose that $r$ is a root of the polynomial $q(x)$, that is, that $q(r) = 0$. Then, the polynomials $q(x)$ and $p(x) = (x - r)^n$ are apolar.

PROOF. For $p(x) = (x - r)^n$, we have $\alpha^j \simeq (-r)^j$, and hence

$$A(q(x), p(x)) \simeq (\beta - (-r))^n = (\beta + r)^n \simeq q(r) = 0,$$

as desired.

COROLLARY. If the polynomial $q(x)$ has $n$ distinct zeros $r_1, r_2, \ldots, r_n$, and if the polynomial $p(x)$ is apolar to $q(x)$, then there exist constants $c_1, c_2, \ldots, c_n$ for which

$$p(x) = c_1 (x - r_1)^n + c_2 (x - r_2)^n + \cdots + c_n (x - r_n)^n.$$

PROOF. The dimension of the affine subspace of all (not necessarily monic) polynomials $p(x)$ which are apolar to $q(x)$ equals $n$. But if the polynomial $q(x)$ has simple roots, then by the above theorem the polynomials $(x - r_1)^n, (x - r_2)^n, \ldots, (x - r_n)^n$ are linearly independent and apolar to $q(x)$. Hence, the polynomial $p(x)$ is a linear combination of these polynomials. This completes the proof.

Thus, we see that apolarity gives a trivial answer to the following question: when can a polynomial $p(x)$ be written as a linear combination of polynomials of the form $(x - r_1)^n, (x - r_2)^n, \ldots, (x - r_n)^n$?

A beautiful theorem on apolarity was proved by the British mathematician John Hilton Grace. I state it without proof:

Grace's Theorem. If two polynomials $p(x)$ and $q(x)$ of degree $n$ are apolar, then every disk in the complex plane containing every zero of $p(x)$ also contains at least one zero of $q(x)$.

Grace's Theorem is an instance of what might be called a sturdy theorem. For almost 100 years, it has resisted all attempts at generalization. Almost all known results about the distribution of zeros of polynomials in the complex plane are corollaries of Grace's theorem.

I will next generalize the apolar invariant to the case of two polynomials $p(x)$ and $q(x)$ of different degrees $n$ and $k$, with $k \le n$. To this end, let us slightly generalize the definition of invariant, as follows. A polynomial $I(a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_k, x)$ in the variables $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_k, x$ is said to be an invariant of the polynomials $p(x)$ and $q(x)$ when

$$I(a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_k, x) = I(p_1(c), p_2(c), \ldots, p_n(c), q_1(c), q_2(c), \ldots, q_k(c), x + c)$$

for all complex numbers $c$. Sometimes, these more general invariants are called covariants. Now define a more general apolar invariant as follows:

$$A(q(x), p(x)) \simeq (\beta - \alpha)^k (x - \alpha)^{n-k}.$$

Again, we say that two polynomials $p(x)$ and $q(x)$ are apolar when $A(q(x), p(x))$ is identically zero; that is, zero for all $x$. Theorem 1 remains valid as stated; that is, if $q(r) = 0$, then the polynomial $p(x) = (x - r)^n$ is apolar to $q(x)$.

Let us consider a special case. Suppose that $q(x)$ is a quadratic polynomial and $p(x)$ is a cubic polynomial:

$$q(x) = x^2 + 2 b_1 x + b_2,$$
$$p(x) = x^3 + 3 a_1 x^2 + 3 a_2 x + a_3.$$

Then, we have, in umbral notation,

$$q(x) \simeq (x + \beta)^2, \qquad p(x) \simeq (x + \alpha)^3,$$

and

$$A(q(x), p(x)) \simeq (\beta - \alpha)^2 (x - \alpha) = (\beta^2 - 2\alpha\beta + \alpha^2) x - \alpha\beta^2 + 2\alpha^2\beta - \alpha^3.$$

Evaluating the linear functional $E$, we obtain the following explicit expression for the apolar invariant:

$$A(q(x), p(x)) = E((\beta^2 - 2\alpha\beta + \alpha^2) x - \alpha\beta^2 + 2\alpha^2\beta - \alpha^3) = (b_2 - 2 a_1 b_1 + a_2) x - a_1 b_2 + 2 a_2 b_1 - a_3.$$

Thus, a quadratic polynomial $q(x)$ and a cubic polynomial $p(x)$ are apolar if and only if their coefficients satisfy the two equations

$$b_2 - 2 a_1 b_1 + a_2 = 0,$$
$$-a_1 b_2 + 2 a_2 b_1 - a_3 = 0.$$
Using these equations, we can prove two important theorems:

Theorem 2. There is, in general, one (monic) quadratic polynomial which is apolar to a given cubic polynomial.

PROOF. Indeed, the above equations may be rewritten as

$$b_2 - 2 a_1 b_1 = -a_2,$$
$$-a_1 b_2 + 2 a_2 b_1 = a_3.$$

The solutions $b_1$ and $b_2$ for given $a_1$, $a_2$, and $a_3$ are, in general, unique.

Theorem 3. There is always a one-dimensional space of (monic) cubic polynomials which are apolar to a given quadratic polynomial.

PROOF. Indeed, given $b_1$ and $b_2$, we may solve for $a_1$, $a_2$, and $a_3$ from the equations

$$-2 a_1 b_1 + a_2 = -b_2,$$
$$-a_1 b_2 + 2 a_2 b_1 = a_3.$$

These equations always have a single infinity of solutions, as they used to say in the old days.

Theorems 2 and 3 provide a simple and explicit method for solving a cubic equation. It goes as follows. Given the cubic polynomial
$$p(x) = x^3 + 3 a_1 x^2 + 3 a_2 x + a_3,$$

first, by Theorem 2 we find a unique quadratic polynomial $q(x)$ which is apolar to $p(x)$. In general, such a quadratic polynomial $q(x)$ has two distinct zeros $r_1$ and $r_2$. By Theorem 1, the cubic polynomials $(x - r_1)^3$ and $(x - r_2)^3$ are apolar to $q(x)$. Second, by Theorem 3, the affine linear space of cubic polynomials apolar to $q(x)$ has dimension 2. As $p(x)$ is apolar to $q(x)$, we conclude that $p(x)$ is a linear combination of $(x - r_1)^3$ and $(x - r_2)^3$. In symbols,

$$p(x) = c_1 (x - r_1)^3 + c_2 (x - r_2)^3$$

for suitable constants $c_1$ and $c_2$.
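The method just described (find the apolar quadratic, then write the cubic through the cubes of its two linear factors) can be sketched numerically. The Python code below is a minimal illustration for the generic case only. The function name `solve_cubic_by_apolarity` is hypothetical, and the last step, reading off the roots from $((x - r_1)/(x - r_2))^3 = -c_2/c_1$, is an assumption about how the method is completed, since that part of the argument is not reproduced here.

```python
import cmath

def solve_cubic_by_apolarity(a1, a2, a3):
    """Roots of x^3 + 3*a1*x^2 + 3*a2*x + a3 in the generic case."""
    # Theorem 2: solve  b2 - 2*a1*b1 = -a2,  -a1*b2 + 2*a2*b1 = a3  for b1, b2.
    det = 2 * (a2 - a1 * a1)
    if det == 0:
        raise ValueError("degenerate case: no unique apolar quadratic")
    b1 = (a3 - a1 * a2) / det
    b2 = 2 * a1 * b1 - a2
    # Zeros r1, r2 of the apolar quadratic q(x) = x^2 + 2*b1*x + b2.
    root = cmath.sqrt(b1 * b1 - b2)
    r1, r2 = -b1 + root, -b1 - root
    if r1 == r2:
        raise ValueError("degenerate case: q has a double zero")
    # p = c1*(x - r1)^3 + c2*(x - r2)^3: match the x^3 and x^2 coefficients.
    c1 = (-a1 - r2) / (r1 - r2)
    c2 = 1 - c1
    if c1 == 0:
        raise ValueError("degenerate case: c1 vanishes")
    # p(x) = 0  <=>  ((x - r1) / (x - r2))^3 = -c2 / c1.
    u0 = (-c2 / c1) ** (1 / 3)
    cube_roots_of_unity = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
    return [(r1 - u0 * z * r2) / (1 - u0 * z) for z in cube_roots_of_unity]

# Example: (x - 1)(x - 2)(x - 4) = x^3 - 7x^2 + 14x - 8.
roots = sorted(solve_cubic_by_apolarity(-7 / 3, 14 / 3, -8.0),
               key=lambda z: z.real)
```

The apolar quadratic may have complex zeros even when all three roots of the cubic are real, as in this example, which is why complex arithmetic is used throughout.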
Theorem 4. The dimension of the space of all (monic) polynomials of degree $k$ that are apolar to a polynomial of degree $n$ equals $2k - n$, in general, when $k \le n$.

Theorem 5. The dimension of the space of all (monic) polynomials of degree $n$ that are apolar to a polynomial of degree $k$ equals $k$, if k .

This immediately yields a chain of subgroups
$$\langle\tau^{32}\rangle \supset \langle\tau^{64}\rangle \supset \langle\tau^{128}\rangle \supset 1 \qquad (1)$$
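and the corresponding chain of fixed fields

$$\mathbf{Q} = \langle\tau\rangle^{\dagger} \subset \langle\tau^2\rangle^{\dagger} \subset \langle\tau^4\rangle^{\dagger} \subset \cdots \subset \langle\tau^{128}\rangle^{\dagger} \subset \langle\tau^{256}\rangle^{\dagger} = \mathbf{Q}(\omega); \qquad (2)$$

here $\langle\tau^2\rangle^{\dagger} = \{x \in \mathbf{Q}(\omega);\ \tau^2(x) = x\}$, and so on.

Recall that the effect of $\tau$ on $\omega$ (and on the powers of $\omega$) is to raise to the third power: $\tau(\omega^i) = \omega^{3i}$. Hence, $\tau^2(\omega) = \omega^{3^2}$, $\tau^3(\omega) = \omega^{3^3}$, and so on. $\tau^k$ raises $\omega$ to its $3^k$th power. Likewise, $\tau^k(\omega^i) = (\omega^i)^{3^k}$.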
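Since everything that follows is bookkeeping with exponents $3^k$ modulo 257, it may help to have these exponents available as a table. The short Python sketch below builds such a table and checks that 3 has period 256 modulo 257; the names `multiplicative_order`, `powers_of_3`, and `tau_power` are hypothetical helpers introduced only for this illustration.

```python
P = 257          # the prime; omega is a primitive P-th root of unity
G = 3            # tau sends omega to omega**G

def multiplicative_order(g, p):
    """Smallest k >= 1 with g**k == 1 (mod p)."""
    k, x = 1, g % p
    while x != 1:
        x = x * g % p
        k += 1
    return k

# powers_of_3[k] is the exponent that tau^k puts on omega.
powers_of_3 = [pow(G, k, P) for k in range(P - 1)]

def tau_power(k, i):
    """Exponent j such that tau^k(omega^i) = omega^j."""
    return i * pow(G, k, P) % P
```

Because the period is 256, the list `powers_of_3` runs through every exponent from 1 to 256 exactly once.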
Our First Step Is to Reach $\langle\tau^2\rangle^{\dagger}$

We shall work a lot with polynomials modulo $p(t) = 1 + t + t^2 + \cdots + t^{256}$ with integer coefficients, that is, polynomials in the quotient ring $\mathbf{Z}[t]/p(t)$. Note that as $p(t) \mid t^{257} - 1$, each exponent occurring in such a polynomial may be reduced modulo 257. Also, note that each $m(t) \in \mathbf{Z}[t]/p(t)$ has a well-defined value for $\omega^i$ for $i = 1, 2, \ldots, 256$. In fact, $m(\omega^i)$ does not depend on the choice of representative for $m$ because $p(\omega^i) = 0$. Thus, we are free to view $m(t)$ as a function $\{\omega, \omega^2, \ldots, \omega^{256}\} \to \mathbf{C}$, where $\mathbf{C}$ is the field of complex numbers, and we shall let $t$ denote any of $\omega, \omega^2, \ldots, \omega^{256}$. Consequently, I shall also write (for example) $\tau(t) = t^3$.

Remembering that the period of 3 in $\mathbf{Z}_{257}^{*}$ is 256, we obtain

$$\sum_{i=0}^{255} t^{3^i} = \sum_{j=1}^{256} t^j = -1$$

[modulo $p(t)$ as usual]. Of course, $-1$ is in the fixed field of $\tau$, so it is no surprise that

$$\tau\Big(\sum_{i=0}^{255} t^{3^i}\Big) = \sum_{i=0}^{255} \tau(t^{3^i}) = \sum_{i=0}^{255} [\tau(t)]^{3^i} = \sum_{i=0}^{255} t^{3^{i+1}} = \sum_{i=1}^{256} t^{3^i} = \sum_{i=0}^{255} t^{3^i}.$$

The point is, of course, that $\tau$ takes each term in $\sum_{i=0}^{255} t^{3^i}$ to its successor (and the last term to the first).

Now, let us pick every other term and put

$$a_0(t) = \sum_{i=0}^{127} t^{3^{2i}} \quad\text{and}\quad a_1(t) = \sum_{i=0}^{127} t^{3^{2i+1}},$$

where it will be convenient to consider the indices 0 and 1 as elements of $\mathbf{Z}_2$. As the effect of $\tau^2$ on $t$ is raising $t$ to its ninth power, we obtain $\tau^2(a_0(t)) = a_0(t)$ and $\tau^2(a_1(t)) = a_1(t)$ (for $t = \omega, \omega^2, \ldots, \omega^{256}$, as usual). Also, note that $a_{i+1}(t) = a_i(t^3)$ for $i \in \mathbf{Z}_2$. Put $a_0 = a_0(\omega)$ and $a_1 = a_1(\omega)$. Then, $a_0, a_1 \in \langle\tau^2\rangle^{\dagger}$; that is, $a_0$ and $a_1$ both satisfy a quadratic equation over $\mathbf{Q}$. It will be a pleasant surprise to us that they actually satisfy the same equation.

We already know that $a_0 + a_1 = -1$. As for the product $a_0(t) a_1(t)$, it is a sum of $128 \cdot 128$ monomials in $t$. A natural guess is that these are uniformly distributed among $t, t^2, \ldots, t^{256}$, and hence that $a_0(t) a_1(t) = (t + t^2 + \cdots + t^{256}) \cdot 128 \cdot 128/256 = -64$. It takes a few minutes for Maple to verify this, and then we may safely write

$$a_0 + a_1 = -1, \qquad a_0 a_1 = -64.$$
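The uniform-distribution claim for the product $a_0(t) a_1(t)$ can also be checked with a few lines of exact integer arithmetic. The sketch below uses Python rather than Maple and is not the author's computation; the helper names `exponents` and `product_counts` are hypothetical. It multiplies the two sums by adding exponents modulo 257 and counts how often each exponent occurs.

```python
from collections import Counter

P, G = 257, 3

def exponents(step, k):
    """Exponents 3^(step*i + k) mod 257 for i = 0, ..., 256/step - 1."""
    return [pow(G, step * i + k, P) for i in range((P - 1) // step)]

def product_counts(xs, ys):
    """Multiply sum(t^x) by sum(t^y), reducing exponents modulo 257
    (allowed because p(t) divides t^257 - 1), and count each exponent."""
    return Counter((x + y) % P for x in xs for y in ys)

# counts[j] is the coefficient of t^j in a_0(t) * a_1(t).
counts = product_counts(exponents(2, 0), exponents(2, 1))
```

If the guess is right, every exponent from 1 to 256 occurs exactly 64 times and the exponent 0 does not occur at all, so the product is $64\,(t + t^2 + \cdots + t^{256}) = -64$ modulo $p(t)$.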
Looking at this system of equations with algebraic eyes, we would say that $a_0$ and $a_1$ are the roots of the second-degree equation $t^2 + t - 64 = 0$, and these roots are $(-1 \pm \sqrt{257})/2$. Note that the roots are real. But, which one is $a_0$, and which one is $a_1$? To answer this, we let Maple estimate $a_0$ and $a_1$. We may omit the imaginary parts in the defining sums, and hence obtain

$$a_0 = \sum_{i=0}^{127} \omega^{3^{2i}} = \sum_{i=0}^{127} \cos(3^{2i} \cdot 2\pi/257) \approx 7.5 \quad\text{and}\quad a_1 = \sum_{i=0}^{127} \omega^{3^{2i+1}} = \sum_{i=0}^{127} \cos(3^{2i+1} \cdot 2\pi/257) \approx -8.5.$$

From a geometric point of view, the first equation, $a_0 + a_1 = -1$, shows that (the real numbers) $a_0$ and $a_1$ have the same distance to the point $-1/2$; in other words, that $a_0$ and $a_1$ are the points of intersection of the real axis and a certain circle with center at $-1/2$. We need to know just one point on this circle in order to draw it and hence find $a_0$ and $a_1$. Such a point is $8i$. Indeed, draw a circle through $8i$ centered at $-1/2$, and let $a_0$ and $a_1$ denote the intersections with the real axis. Then, it follows from elementary geometry (using similar triangles; a special case of the Intersecting Chords Theorem) that $|a_0||a_1| = 8^2 = 64$, and hence $a_0 a_1 = -64$, as desired.

We will repeatedly encounter the same problem: to find real numbers $x_1$, $x_2$ such that $x_1 + x_2 = a$ and $x_1 x_2 = b$. Therefore, let us make some general considerations before we proceed. If $b < 0$, we proceed as above and find $x_1$ and $x_2$, where the real axis intersects the circle through $\sqrt{-b}\,i$ with center at $a/2$. We were lucky this first time in that $\sqrt{64}$ is an integer. In general, we find $\sqrt{-b}\,i$, where the imaginary axis intersects the circle whose diameter is the segment of the real axis between $-1$ and $-b$ (this, of course, is just the Intersecting Chords Theorem again). See Figure 1.

It will occasionally happen that $b > 0$. One way to deal with this case is to follow the idea of Descartes: First find $a/2$. Then draw the circle with radius $|a|/2$ and center at $\sqrt{b}\,i$. Suppose this circle intersects the real axis at $\pm c$. Then, $x = a/2 \pm c$ satisfies the system of equations, and hence $x_1$ and $x_2$ are easily constructed. See Figure 2.

We shall make these simple constructions quite a few times, each time with new values of $a$ and $b$.

Let us pause for a moment and see what we have achieved. We have found points $a_0$ and $a_1$ which (separately) generate the quadratic field extension $\langle\tau^2\rangle^{\dagger} : \mathbf{Q}$. Our plan is to move upward in the chain (2) of fixed fields, by successively constructing points $b_k \in \langle\tau^4\rangle^{\dagger}$, $c_k \in \langle\tau^8\rangle^{\dagger}$, and so forth.

Our progress can be described in the following much more elementary way, which completely avoids the use of field theory. Take, as a starting point, the fact that the sum of the 256 "unknown" vertices is equal to $-1$, but be careful to write the terms in the following order: $\omega + \omega^3 + \omega^{3^2} + \omega^{3^3} + \cdots = -1$. Then, $a_0$ is the sum of every other term starting with $\omega$, and $a_1$ is the sum of every other term starting with $\omega^3$. We have thus, in this section, found the sum of 128 vertices, which indeed is promising because our ultimate goal is to find the "sum" of one vertex alone.
In the next section, we shall split the set of unknown vertices further, as a first step into 4 subsets of 64 vertices each with sums $b_0$, $b_1$, $b_2$, and $b_3$, respectively. It will appear that the $b_k$ taken in pairs satisfy quadratic equations with coefficients rational in the $a_k$. This is how we shall proceed.

We Proceed to the Field $\langle\tau^4\rangle^{\dagger}$

We have so far determined the field $\langle\tau^2\rangle^{\dagger}$, and we have found that $\langle\tau^2\rangle^{\dagger} \subset \mathbf{R}$. In fact, even $\langle\tau^{128}\rangle^{\dagger} \subset \mathbf{R}$, as we prove next. We have $\tau^{128}(\omega) = \omega^{3^{128}}$. But $(3^{128})^2 = 3^{256} \equiv 1 \pmod{257}$, and hence $3^{128} \equiv -1 \pmod{257}$, for the period of 3 is 256 and not 128. Of course, $3^{128} \equiv -1 \pmod{257}$ is also easily found by a Maple computation. In fact, it will be very convenient for future reference to let Maple produce a list of all powers of 3 modulo 257, and the reader who wants to check all computations is invited to do so. We now obtain $\tau^{128}(\omega) = \omega^{-1} = \bar{\omega}$. Thus, the effect of $\tau^{128}$ is complex conjugation; hence $\langle\tau^{128}\rangle^{\dagger} \subset \mathbf{R}$, as was to be proved.

Now, we move on to $\langle\tau^4\rangle^{\dagger}$. Elements in this fixed field are easily found by adding every fourth term in the polynomial $\sum_{i=0}^{255} t^{3^i}$, so put

$$b_k(t) = \sum_{i=0}^{63} t^{3^{4i+k}} \in \mathbf{Z}[t]/p(t) \quad\text{for } k = 0, 1, 2, 3,$$

where the index $k$ is regarded as an element of $\mathbf{Z}_4$ and where the $k$ in the exponent can be any fixed representative of the index. The choice of representative does not affect $b_k(t)$ modulo $p(t)$. Also, put $b_k = b_k(\omega)$. Clearly,

$$\tau^4(b_k(t)) = \sum_{i=0}^{63} (t^{3^4})^{3^{4i+k}} = \sum_{i=0}^{63} t^{3^{4(i+1)+k}} = \sum_{i=1}^{64} t^{3^{4i+k}} = \sum_{i=0}^{63} t^{3^{4i+k}} = b_k(t)$$

for $t = \omega, \omega^2, \ldots, \omega^{256}$. Also, note that $b_{k+1}(t) = b_k(t^3) \pmod{p(t)}$.

To find the $b_k$, first note that $b_0(t) + b_2(t) = a_0(t)$. Next, compute $b_0(t) b_2(t)$, which is a sum of $64 \cdot 64$ monomials. As a Maple computation verifies, these are uniformly distributed among $t, t^2, \ldots, t^{256}$, and hence $b_0(t) b_2(t) = -16$. Thus, $b_0$ and $b_2$ satisfy the same quadratic equation over $\mathbf{Q}(a_0) = \langle\tau^2\rangle^{\dagger}$ (another one of these nice surprises), and we obtain

$$b_0 + b_2 = a_0, \qquad b_0 b_2 = -16.$$

We already know that $b_0$ and $b_2$ are real. They are the roots of the equation $t^2 - a_0 t - 16 = 0$. Again, Maple helps us to distinguish between the roots $b_0 = \sum_{i=0}^{63} \omega^{3^{4i}} = \sum_{i=0}^{63} \cos(3^{4i} \cdot 2\pi/257) \approx 9.2$ and $b_2 = \sum_{i=0}^{63} \omega^{3^{4i+2}} = \sum_{i=0}^{63} \cos(3^{4i+2} \cdot 2\pi/257) \approx -1.7$. Draw a circle with center at $a_0/2$ through the point $4i$. This circle intersects the real axis at $b_0$ and $b_2$.

As for $b_1$ and $b_3$, it is immediate that $b_1 + b_3 = a_1$. Moreover, we get $b_1(t) b_3(t) = b_0(t^3) b_2(t^3) = -16$, and, hence,

$$b_1 + b_3 = a_1, \qquad b_1 b_3 = -16.$$

A Maple computation shows that $b_1 \approx 1.6$ and $b_3 \approx -10.1$. We find $b_1$ and $b_3$ where the real axis intersects the circle through $4i$ with center at $a_1/2$.

The next step is to go from $\langle\tau^4\rangle^{\dagger}$ to $\langle\tau^8\rangle^{\dagger}$. Define

$$c_k(t) = \sum_{i=0}^{31} t^{3^{8i+k}} \quad\text{for } k \in \mathbf{Z}_8,$$

where, as before, the $k$ occurring in the exponent may be any fixed representative of the index $k$. We have thus picked out every eighth term of the sum $\sum_{i=0}^{255} t^{3^i}$. As usual, put $c_k = c_k(\omega)$. We have $\tau^8(c_k(t)) = c_k(t)$ for $t = \omega, \omega^2, \ldots, \omega^{256}$, $k \in \mathbf{Z}_8$, and we shall also need the formula $c_{k+1}(t) = c_k(t^3)$, $k \in \mathbf{Z}_8$.

Clearly, $c_0(t) + c_4(t) = b_0(t)$. The product $c_0(t) c_4(t)$ is a sum of $32 \cdot 32$ monomials in $t$, so remembering our experience with $a_0(t) a_1(t)$ and $b_0(t) b_2(t)$, the natural guess this time is that $c_0(t) c_4(t)$ should equal $-1 \cdot 32 \cdot 32/256 = -4$. This, however, is far from the truth. In fact, a Maple computation and a comparison with the $a_i(t)$ and $b_i(t)$ yield $c_0(t) c_4(t) = -5 - a_0(t) - 2 b_0(t) \pmod{p(t)}$ (I wish I could explain that). But this is good enough, because it shows that $c_0$ and $c_4$ are roots of the same quadratic equation over $\mathbf{Q}(a_0, b_0) = \langle\tau^4\rangle^{\dagger}$. Satisfactory as this is, we are, nevertheless, kept in suspense concerning the future. Can we count on the same good luck as we go on and introduce $d_k$, $e_k$, and so forth? One would wish here for a nice little theorem which settles this for good. However, I leave this discussion now and take up the subject again in the final section of the article.

Using the formulas $c_{k+1}(t) = c_k(t^3)$, $b_{k+1}(t) = b_k(t^3)$, and $a_{k+1}(t) = a_k(t^3)$, we get $c_1(t) c_5(t) = c_0(t^3) c_4(t^3) = -5 - a_0(t^3) - 2 b_0(t^3) = -5 - a_1(t) - 2 b_1(t)$ and, analogously, $c_2(t) c_6(t) = -5 - a_0(t) - 2 b_2(t)$ and $c_3(t) c_7(t) = -5 - a_1(t) - 2 b_3(t)$. Thus, we have the following four systems of equations:

$$c_0 + c_4 = b_0, \qquad c_0 c_4 = -5 - a_0 - 2 b_0;$$
$$c_1 + c_5 = b_1, \qquad c_1 c_5 = -5 - a_1 - 2 b_1;$$
$$c_2 + c_6 = b_2, \qquad c_2 c_6 = -5 - a_0 - 2 b_2;$$
$$c_3 + c_7 = b_3, \qquad c_3 c_7 = -5 - a_1 - 2 b_3.$$
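The sums $a_k$, $b_k$, and $c_k$ are easy to estimate in floating point, which gives an independent check of the relations collected so far. The following sketch is written in Python instead of Maple and is not the author's computation; the names `period` and `residuals` are hypothetical, and $\omega$ is taken to be $e^{2\pi i/257}$, an assumption consistent with the cosine sums in the text.

```python
import cmath
import math

P, G = 257, 3

def period(step, k):
    """Value at t = omega of the sum of t^(3^(step*i + k)),
    i = 0, ..., 256/step - 1. The imaginary parts cancel for the sums
    used here, so only the real part is returned."""
    total = sum(cmath.exp(2j * math.pi * pow(G, step * i + k, P) / P)
                for i in range((P - 1) // step))
    return total.real

a = [period(2, k) for k in range(2)]
b = [period(4, k) for k in range(4)]
c = [period(8, k) for k in range(8)]

def residuals():
    """Largest deviation from the sum and product relations in the text."""
    errs = [abs(a[0] + a[1] + 1), abs(a[0] * a[1] + 64)]
    for k in range(2):
        errs += [abs(b[k] + b[k + 2] - a[k]), abs(b[k] * b[k + 2] + 16)]
    for k in range(4):
        errs += [abs(c[k] + c[k + 4] - b[k]),
                 abs(c[k] * c[k + 4] - (-5 - a[k % 2] - 2 * b[k]))]
    return max(errs)
```

Printing `a`, `b`, and `c` rounded to one decimal place gives a quick way to decide which root of each quadratic equation belongs to which index.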
Here, Co ~ 11.9 and c4 ~ - 2 . 6 , so Co and c4 are readily c o n s t r u c t e d as the p o i n t s o f i n t e r s e c t i o n b e t w e e n the real
axis a n d the circle t h r o u g h ~ / 5 + ao + 2boi = y, say, with c e n t e r at b0/2. I recall h o w to fred y. First, find 5 + a0 + 2b0, which is on the positive real axis. Next, d r a w the circle w h o s e diameter goes from this point to - 1 . Then, y is where this circle intersects the imaginary axis. Thus, Co a n d c4 are constructed as i n Figure 1. We have c 2 = 2 . 3 a n d c 6 ~ - 4 . 0 , so c2 a n d c6 are c o n s t r u c t e d analogously. However, cl ~ 0.3 a n d c5-~ 1.3, so c~c5 > 0. Thus, we draw a circle with c e n t e r at X / - 5 - al - 2bli and radius bl/2. This circle intersects the real axis at + c, say, w h e r e c > 0; hence, cl = bl/2 - c and c5 = bl/2 § c are easily found. This is the situation pictured in Figure 2. Further, we have c3 = - 6 . 4 and c7 = - 3 . 7 (i.e., c367 > 0); hence, c3 a n d c7 are constructed in the same m a n n e r as el a n d c5. We go on and define dk, ek, fk, and gk as follows: 15
$$d_k(t) = \sum_{i=0}^{15} t^{3^{16i+k}}, \quad k \in \mathbb{Z}_{16}, \qquad e_k(t) = \sum_{i=0}^{7} t^{3^{32i+k}}, \quad k \in \mathbb{Z}_{32},$$
$$f_k(t) = \sum_{i=0}^{3} t^{3^{64i+k}}, \quad k \in \mathbb{Z}_{64}, \qquad g_k(t) = \sum_{i=0}^{1} t^{3^{128i+k}}, \quad k \in \mathbb{Z}_{128}.$$

As before, the $k$ in the exponent can be any representative of the index $k$, because we always work modulo $p(t)$. It is now easy to check that $\tau_{16}(d_k(t)) = d_k(t)$, $\tau_{32}(e_k(t)) = e_k(t)$, $\tau_{64}(f_k(t)) = f_k(t)$, and $\tau_{128}(g_k(t)) = g_k(t)$ for $t = \omega, \omega^2, \ldots, \omega^{256}$, and to verify the formulas $d_{k+1}(t) = d_k(t^3), \ldots, g_{k+1}(t) = g_k(t^3)$. We can put $d_k = d_k(\omega), \ldots, g_k = g_k(\omega)$ for all $k$, but we shall not need all of them.

Working Backward for a Moment
In fact, look at $g_0 = \omega + \omega^{3^{128}} = \omega + \bar{\omega} = 2\,\mathrm{Re}\,\omega$. Once $g_0$ is determined (i.e., constructed), we are only a few steps from our goal $\omega$. We take half of $g_0$ and move parallel to the imaginary axis until we reach the unit circle $|z| = 1$. Then there is $\omega$, the second vertex of the regular 257-gon. The numbers $g_0$ and $g_{64}$ are the roots of a quadratic equation over the $f_k$ (we work backward for a moment). We have $g_0(t) + g_{64}(t) = f_0(t)$, and an easy computation shows $g_0(t)g_{64}(t) = t^{15} + t^{17} + t^{240} + t^{242} = f_{56}(t)$. You do not need Maple to compute this if you have already let Maple produce a display of all powers of 3 modulo 257 [we have $g_0(t) = t + t^{256}$ and $g_{64}(t) = t^{16} + t^{241}$], as I recommended earlier. From $g_0 = \omega + \bar{\omega}$ and $g_{64} = \omega^{16} + \bar{\omega}^{16}$, it is obvious that $g_0 > g_{64} > 0$; hence, $g_0$ and $g_{64}$ are constructed as in Figure 2.

Clearly, $f_0(t) + f_{32}(t) = e_0(t)$ and $f_{24}(t) + f_{56}(t) = e_{24}(t)$. A simple computation shows $f_0(t)f_{32}(t) = e_1(t) + e_{23}(t)$, and therefore,
$$f_{24}(t)f_{56}(t) = f_0(t^{3^{24}})f_{32}(t^{3^{24}}) = e_1(t^{3^{24}}) + e_{23}(t^{3^{24}}) = e_{25}(t) + e_{47}(t) = e_{25}(t) + e_{15}(t).$$
Here, we have $f_0 = \omega + \omega^{16} + \bar{\omega} + \bar{\omega}^{16}$ and $f_{32} = \omega^4 + \omega^{64} + \bar{\omega}^4 + \bar{\omega}^{64}$ and, hence, $f_0 > f_{32} > 0$. Further, $f_{24} = \omega^{60} + \omega^{68} + \bar{\omega}^{60} + \bar{\omega}^{68}$ and $f_{56} = \omega^{15} + \omega^{17} + \bar{\omega}^{15} + \bar{\omega}^{17}$, whence $f_{56} > f_{24} > 0$. Note that $f_{24}$ is just slightly greater than 0. This is because $\omega^{60} + \omega^{68} = r\omega^{64}$ for some real $r > 0$, and $\omega^{64}$ is just slightly to the right of the imaginary axis!

It is somewhat unfortunate that $g_0g_{64}$, $f_0f_{32}$, and $f_{24}f_{56}$ are all positive, for this brings us into the "Figure 2 case." This could have been avoided had we chosen as our primary goal not the vertex $\omega$ of the regular 257-gon but some other vertex (appropriately chosen). For example, we could have chosen $\omega^{27}$, because $\omega^{27} + \bar{\omega}^{27} = g_3$ and $g_3g_{67}$, $f_3f_{35}$, and $f_{27}f_{59}$ are all negative, as is easy to verify. Thus, we may move from the $e_k$ to $\omega^{27}$ as in Figure 1, and once $\omega^{27}$ is found, the other vertices of the 257-gon are readily constructed. However, we cannot completely avoid the Figure 2 case, because it will turn out that we will need all the $d_k$, and hence all the $c_k$.

The Missing Links Turn out to Be the Most Laborious
We now go back again and remember that all the $a_k$, $b_k$, and $c_k$ have been constructed. The next step is to construct the $d_k$. We let Maple calculate $d_0(t)d_8(t) = a_0(t) + c_0(t) + c_2(t) + 2c_5(t)$. Applying the formula $d_{k+1}(t) = d_k(t^3)$ and working as before, we obtain

$d_0 + d_8 = c_0$, $d_0d_8 = a_0 + c_0 + c_2 + 2c_5$;
$d_1 + d_9 = c_1$, $d_1d_9 = a_1 + c_1 + c_3 + 2c_6$;
$d_2 + d_{10} = c_2$, $d_2d_{10} = a_0 + c_2 + c_4 + 2c_7$;
$d_3 + d_{11} = c_3$, $d_3d_{11} = a_1 + c_3 + c_5 + 2c_0$;
$d_4 + d_{12} = c_4$, $d_4d_{12} = a_0 + c_4 + c_6 + 2c_1$;
$d_5 + d_{13} = c_5$, $d_5d_{13} = a_1 + c_5 + c_7 + 2c_2$;
$d_6 + d_{14} = c_6$, $d_6d_{14} = a_0 + c_6 + c_0 + 2c_3$;
$d_7 + d_{15} = c_7$, $d_7d_{15} = a_1 + c_7 + c_1 + 2c_4$.
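The relations among the periods lend themselves to a quick numerical sanity check. The following Python sketch is an illustration, not part of the construction: it evaluates the sums in floating point at $\omega = e^{2\pi i/257}$, and the helper name `period` and the one-letter wrappers are assumptions made here for convenience. It checks the sum and product relations for the $d_k$ together with $g_0g_{64} = f_{56}$ and $f_0f_{32} = e_1 + e_{23}$.

```python
import cmath

P = 257   # the Fermat prime
G = 3     # primitive root modulo 257

def period(step, k):
    """Value of sum over i of omega**(3**(step*i + k)), i = 0 .. 256/step - 1."""
    total = sum(cmath.exp(2j * cmath.pi * pow(G, step * i + k, P) / P)
                for i in range(256 // step))
    return total.real  # imaginary parts cancel (up to rounding) for these periods

# Hypothetical shorthand for the periods named in the text.
def a(k): return period(2, k)
def c(k): return period(8, k)
def d(k): return period(16, k)
def e(k): return period(32, k)
def f(k): return period(64, k)
def g(k): return period(128, k)

# d_k + d_{k+8} = c_k and d_k d_{k+8} = a_k + c_k + c_{k+2} + 2 c_{k+5}
for k in range(8):
    assert abs(d(k) + d(k + 8) - c(k)) < 1e-9
    rhs = a(k) + c(k) + c(k + 2) + 2 * c(k + 5)
    assert abs(d(k) * d(k + 8) - rhs) < 1e-9

# Relations used when working backward from g_0.
assert abs(g(0) * g(64) - f(56)) < 1e-9
assert abs(f(0) * f(32) - (e(1) + e(23))) < 1e-9

print([round(d(k), 1) for k in range(16)])
```

Indices larger than the natural range are harmless here, because the exponents are reduced modulo 257 and each period depends only on the residue class of its index.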
A list of approximations is:
$d_0 \approx 9.2$, $d_8 \approx 2.6$, $d_1 \approx 4.9$, $d_9 \approx -4.6$, $d_2 \approx 2.4$, $d_{10} \approx -0.11$, $d_3 \approx -2.96$, $d_{11} \approx -3.4$,
$d_4 \approx -0.8$, $d_{12} \approx -1.8$, $d_5 \approx 3.3$, $d_{13} \approx -1.9$, $d_6 \approx -3.2$, $d_{14} \approx -0.8$, $d_7 \approx 2.7$, $d_{15} \approx -6.4$.

The missing link is now only to construct the $e_k$ from the $d_k$. This time, Maple yields $e_0(t)e_{16}(t) = d_0(t) + d_1(t) + d_2(t) + d_5(t)$. I skip the details and only give the equations and approximations needed.

$e_0 + e_{16} = d_0$, $e_0e_{16} = d_0 + d_1 + d_2 + d_5$; $e_0 \approx 5.9$, $e_{16} \approx 3.4$.
$e_1 + e_{17} = d_1$, $e_1e_{17} = d_1 + d_2 + d_3 + d_6$; $e_1 \approx 4.6$, $e_{17} \approx 0.3$.
$e_7 + e_{23} = d_7$, $e_7e_{23} = d_7 + d_8 + d_9 + d_{12}$; $e_7 \approx -0.4$, $e_{23} \approx 3.1$.
$e_8 + e_{24} = d_8$, $e_8e_{24} = d_8 + d_9 + d_{10} + d_{13}$; $e_8 \approx -1.1$, $e_{24} \approx 3.7$.
$e_9 + e_{25} = d_9$, $e_9e_{25} = d_9 + d_{10} + d_{11} + d_{14}$; $e_9 \approx -6.1$, $e_{25} \approx 1.5$.
$e_{15} + e_{31} = d_{15}$, $e_{15}e_{31} = d_{15} + d_0 + d_1 + d_4$; $e_{15} \approx -1.4$, $e_{31} \approx -5.0$.
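Each pair of equations above hands us a sum $s$ and a product $p$, and the two roots are then found by one of the two circle constructions (the Figure 1 case for $p < 0$, the Figure 2 case for $p > 0$). As a rough illustration, the Python sketch below mimics both cases in coordinates and compares the result with the periods computed directly. The function names `roots_by_circle` and `period` are assumptions introduced here, and floating-point arithmetic stands in for ruler and compass.

```python
import cmath
import math

P, G = 257, 3   # the Fermat prime and the primitive root 3

def period(step, k):
    """Value of sum over i of omega**(3**(step*i + k)), i = 0 .. 256/step - 1."""
    total = sum(cmath.exp(2j * cmath.pi * pow(G, step * i + k, P) / P)
                for i in range(256 // step))
    return total.real

def roots_by_circle(s, p):
    """Real roots (smaller, larger) of x**2 - s*x + p = 0 as circle/axis intersections."""
    if p < 0:
        # Figure 1 case: circle centred at s/2 on the real axis through i*sqrt(-p).
        radius = math.hypot(s / 2, math.sqrt(-p))
        return s / 2 - radius, s / 2 + radius
    # Figure 2 case: circle centred at i*sqrt(p) with radius |s|/2
    # meets the real axis at +c and -c; the roots are s/2 - c and s/2 + c.
    c = math.sqrt((s / 2) ** 2 - p)
    return s / 2 - c, s / 2 + c

d = [period(16, k) for k in range(16)]

e16_e0 = roots_by_circle(d[0], d[0] + d[1] + d[2] + d[5])    # product > 0
e7_e23 = roots_by_circle(d[7], d[7] + d[8] + d[9] + d[12])   # product < 0

print([round(x, 1) for x in e16_e0], [round(x, 1) for x in e7_e23])
```

The tuple is ordered by size, so the first pair comes out as $(e_{16}, e_0)$ and the second as $(e_7, e_{23})$.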
With $e_0$, $e_1$, $e_{15}$, $e_{23}$, $e_{24}$, and $e_{25}$ now at hand, we construct $f_0$, $f_{24}$, $f_{32}$, $f_{56}$, and, finally, $g_0$, $g_{64}$, and $\omega$, as already
demonstrated. The actual performance of the construction is left to the reader.

Some Final Remarks
Looking back at our construction of the regular polygon of two hundred and fifty-seven sides, I feel the need to add a few remarks. First, have a look at the two equations determining $a_0$ and $a_1$. The first equation $a_0 + a_1 = -1$ is obvious, whereas the second equation $a_0a_1 = -64$ was found after a rather heavy computation, where Maple was of help. But, as I pointed out, the result $a_0a_1 = -64$ merely confirmed our previous guess, so it would be rather apt to ask whether we could prove this by some other means. And indeed we can! Consider $a_0 = \sum_{i=0}^{127} \omega^{3^{2i}}$ and $a_1 = \sum_{i=0}^{127} \omega^{3^{2i+1}}$.
so, in fact, all $f(s)$ are equal. The sum $f(0) + f(1) + \cdots + f(255)$ is the total number of terms in the expansion of $a_0a_1$, namely $128^2$, whence it follows that $f(s) = 128^2/256 = 64$ for all $s$. Thus, finally,
$$a_0a_1 = \sum_{s=0}^{255} f(s)\,\omega^{3^s} = 64 \sum_{s=0}^{255} \omega^{3^s} = -64.$$
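The counting argument is easy to replay by brute force. The Python sketch below assumes that $f(s)$ counts the pairs $(i, k)$, taken modulo 128, with $3^s \equiv 3^{2i} + 3^{2k+1} \pmod{257}$, in analogy with the definition used for $b_0b_2$; the table name `LOG3` and the helper `period` are hypothetical names chosen here. It tallies $f(s)$ for every $s$ and compares $a_0a_1$, evaluated numerically, with $-64$.

```python
import cmath

P, G = 257, 3
LOG3 = {pow(G, s, P): s for s in range(256)}   # discrete logarithms to base 3

# f[s] = number of pairs (i, k), 0 <= i, k < 128,
# with 3^s = 3^(2i) + 3^(2k+1) modulo 257.
f = [0] * 256
for i in range(128):
    for k in range(128):
        total = (pow(G, 2 * i, P) + pow(G, 2 * k + 1, P)) % P
        f[LOG3[total]] += 1    # total is never 0, so the lookup always succeeds

def period(step, k):
    """Value of sum over i of omega**(3**(step*i + k)), i = 0 .. 256/step - 1."""
    total = sum(cmath.exp(2j * cmath.pi * pow(G, step * i + k, P) / P)
                for i in range(256 // step))
    return total.real

a0, a1 = period(2, 0), period(2, 1)
print(set(f), round(a0 * a1, 6))
```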
Let us try to repeat this argument with $b_0$ and $b_2$ in place of $a_0$ and $a_1$. We have
$$b_0b_2 = \sum_{i=0}^{63} \omega^{3^{4i}} \sum_{i=0}^{63} \omega^{3^{4i+2}},$$
which expands to a sum of $64^2$ terms of the form $\omega^{3^{4i} + 3^{4k+2}}$. It is easy to prove that this term never equals 1 and, hence, that $\omega^{3^{4i} + 3^{4k+2}} = \omega^{3^s}$ for some $s \in \{0, 1, \ldots, 255\}$. Let $f(s)$ be the number of times a certain $s$ occurs, that is, the number of solutions $(i, k)$ (counted modulo 64) to the equation $3^s \equiv 3^{4i} + 3^{4k+2} \pmod{257}$. Clearly, $f(0) + f(1) + \cdots + f(255) = 64^2$. It follows from the implication
$$3^s \equiv 3^{4i} + 3^{4k+2} \;\Longrightarrow\; 3^{s+2} \equiv 3^{4(k+1)} + 3^{4i+2}$$
that $f(s + 2) = f(s)$ for all $s$, so that we have $f(0) = f(2) = \cdots$ and $f(1) = f(3) = \cdots$. We obtain
$$b_0b_2 = \sum_{s=0}^{255} f(s)\,\omega^{3^s} = f(0) \sum_{i=0}^{127} \omega^{3^{2i}} + f(1) \sum_{i=0}^{127} \omega^{3^{2i+1}} = f(0)a_0 + f(1)a_1.$$
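The same brute-force check works for $b_0b_2$. In the Python sketch below, `f` is the count of solutions $(i, k)$ defined in the text, while `LOG3` and `period` are hypothetical helper names introduced for the illustration. The sketch confirms the periodicity $f(s+2) = f(s)$ and then compares $b_0b_2$ with $f(0)a_0 + f(1)a_1$ numerically.

```python
import cmath

P, G = 257, 3
LOG3 = {pow(G, s, P): s for s in range(256)}   # discrete logarithms to base 3

# f[s] = number of pairs (i, k), 0 <= i, k < 64,
# with 3^s = 3^(4i) + 3^(4k+2) modulo 257.
f = [0] * 256
for i in range(64):
    for k in range(64):
        total = (pow(G, 4 * i, P) + pow(G, 4 * k + 2, P)) % P
        f[LOG3[total]] += 1    # total is never 0, so the lookup always succeeds

def period(step, k):
    """Value of sum over i of omega**(3**(step*i + k)), i = 0 .. 256/step - 1."""
    total = sum(cmath.exp(2j * cmath.pi * pow(G, step * i + k, P) / P)
                for i in range(256 // step))
    return total.real

a0, a1 = period(2, 0), period(2, 1)
b0, b2 = period(4, 0), period(4, 2)

periodic = all(f[s] == f[(s + 2) % 256] for s in range(256))
print(periodic, f[0], f[1], round(b0 * b2, 6), round(f[0] * a0 + f[1] * a1, 6))
```

Since $128\,f(0) + 128\,f(1) = 64^2$, the two counts must add up to 32, which gives one more consistency check on the tally.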
This is rather interesting, because it shows that $b_0b_2 \in \mathbb{Q}(a_0, a_1)$ and, hence, we have found, without computing $b_0b_2$, that $b_0$ and $b_2$ are the two roots of a quadratic equation over $\mathbb{Q}(a_0, a_1)$. Moreover, it is quite clear (I omit the details) that a similar argument can be used to prove that $c_0c_4 = \sum_{i=0}^{3} \beta_i b_i$ for appropriate integers $\beta_i$, that $d_0d_8 = \sum_{i=0}^{7} \gamma_i c_i$ for appropriate $\gamma_i$, and so on, and this is the reason
2 from some point on," that is, that for some $k$, $n \ge k$ implies that $a_n > 2$. We use ordinary letters for real numbers. Note that a real number $r$ can be viewed as the sequence $a$, with $a_n = r$ for all $n$. Equations and inequalities involving sequences are interpreted in the same way; that is, $a + b = c$ means $a_n + b_n = c_n$ from some point on, and $\sin(a) = b$ means $\sin(a_n) = b_n$ from some point on. Note that for $f(a)$ to be defined, it is only necessary that $f(a_n)$ be defined from some point on. The sequence $b$: 0, 0, 0, 1, 2, 3, 4, ..., for example, does have a reciprocal, $1/b$, because, as noted, finitely many terms of a sequence may be undefined.

If a statement P is true from some point on and statement Q is also true from some point on, then "P and Q" is true from some point on. This fact allows us to do algebra on sequences. Suppose, for example, we have $a + 4 = 3b$
Proposition 1. Suppose $a, b \approx 0$. Then
(1) $a + b \approx 0$.
(2) $a - b \approx 0$.
(3) If $c$ is finite, then $ac \approx 0$.
(4) If $|c| < |a|$, then $c \approx 0$.

PROOF. Given any positive $d$, we know that $|a| < d/2$ and $|b| < d/2$. Then, $|a + b| \le |a| + |b| < d$. This proves Part (1). Part (2) is proved similarly. For Part (3), because $|c| < r$ for some real $r$, and $|a| < d/r$ for all positive $d$, we have $|ac| < d$. For Part (4), observe that because $|c| < |a| < d$, $|c|$