SEMIGROUPS AND AUTOMATA SELECTA UNO KALJULAID (1941–1999)
Edited by
Jaak Peetre Lund, Sweden
and
Jaan Penjam Tallinn, Estonia
Amsterdam • Berlin • Oxford • Tokyo • Washington, DC
© 2006 The authors. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 1-58603-582-7
Library of Congress Control Number: 2005938840

Publisher
IOS Press, Nieuwe Hemweg 6B, 1013 BG Amsterdam, Netherlands
fax: +31 20 687 0019

Distributor in the UK and Ireland
Gazelle Books, Falcon House, Queen Square, Lancaster LA1 1RN, United Kingdom
fax: +44 1524 63232

Distributor in the USA and Canada
IOS Press, Inc., 4502 Rachael Manor Drive, Fairfax, VA 22032, USA
fax: +1 703 323 3668

LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS
Contents

Preface
Biography of Uno Kaljulaid. J. Peetre
Bibliography of Uno Kaljulaid

Chapter I. Representations of semigroups and algebras
1. [K69a] On the cohomological dimension of some quasiprojective varieties.
2. [K77a] Triangular products of representations of semigroups and associative algebras.
3. [K79a] Triangular products and stability of representations. Candidate dissertation.
4. [K79b] Triangular products and stability of representations. (Author review of Candidate thesis in Physico-Mathematical Sciences).
5. [K87a] Some remarks on Shevrin’s problem.
6. [K90] Transferable elements in group rings.
7. [K00] Ω-rings and their flat representations. Coauthor O. Sokratova

Chapter II. Automata theory
1. Preamble. Editors
2. Automata and their decomposition.
3. [K97] On two algebraic constructions for automata. Coauthor J. Penjam
4. [K98c] Revisiting wreath products, with applications to representations and invariants.

Chapter III. Majorization
1. Generalized majorization. Coauthor J. Peetre
2. Van der Waerden’s conjecture and hyperbolicity. J. Peetre
3. On generalized majorization. J. Peetre

Chapter IV. Combinatorics
1. [K88a] On Stirling and Lah numbers.
2. Letter (or draft of letter) c. 1991 from Uno Kaljulaid to Torbjörn Tambour.
3. On Fibonacci numbers of graphs.

Chapter V. History of Mathematics
1. Th. Molien, an innovator of algebra.
2. [K87e] On the results of Molien about invariants of finite groups and their renaissance in contemporary mathematics.
3. Theodor Molien, about his life and mathematical work as seen a century later. (A biographical sketch and a glimpse of his work).
4. Notes on five 19th century Tartu mathematicians (Backlund, Kneser, Lindstedt, Molien, Weihrauch).

Chapter VI. Popularization of Mathematics
1. [K68a] and [K69b] On the geometric methods of Diophantine Analysis.
2. [K68b] Lenin prize for work in Diophantine geometry.
3. [K69c] The history of solving equations.
4. [K70] Additional remarks on groups.
5. [K73a] Polynomials and formal series.
6. [K75a] On Galois theory.
7. [K75b] Theory of automata. Coauthor E. Tamme
8. [K93c] Mordell’s problem.
9. [K96] On two discrete models in connection with structures of mathematics and language.

Index of Names
Subject Index
Preface
We have the pleasure of offering to the mathematical public the Selecta of the eminent late Estonian algebraist Uno Kaljulaid. It contains mainly papers published in Kaljulaid’s lifetime. Many of them were originally written in Russian, a few also in Estonian, and have now been translated into English, mainly by one of us, J. Peetre¹.
Heritage. In addition to this published material, Kaljulaid left a large number of manuscripts in various states of completion. They are currently in the custody of the Senior Editor. For instance, there is an almost complete paper on right order groups, surveying the subject in its historical development, starting with D. Hilbert, and some material on Petri nets, etc., things that apparently occupied Kaljulaid in his last years. Hopefully, part of it can also be made public at a later stage, perhaps in the form of Selecta II. Let us now highlight some of the main items of the present Volume.

Contents. We offer here the English translation of Kaljulaid’s 1979 Tartu/Minsk Candidate thesis [K79a], which was originally typewritten in Russian and produced in only a few copies. The thesis was devoted to representation theory in the spirit of his thesis advisor B. I. Plotkin: representations of semigroups and algebras, and especially the extension to this situation, and the application, of the notion of triangular product of representations introduced by Plotkin for groups. We also include two summaries of the thesis, [K77a] and [K79b]. Through representation theory, Kaljulaid also became interested in automata theory, which at a later phase became his main area of interest. Another field of research concerns combinatorics.

Besides being an outstanding and most dedicated mathematician, Uno Kaljulaid was also very much interested in the history of mathematics. In particular, he took a keen interest in the life and work of the great 19th century Dorpat-Tartu algebraist Th. Molien (see Chapter V). Perhaps he saw in Molien a kindred soul, as neither of the two got quite the recognition from their Alma Mater that they surely deserved; Molien, for his part, had to go into voluntary exile in Tomsk, Siberia. Kaljulaid was also very interested in the teaching and exposition, or popularization, of mathematics; he had several outstanding research students. Some of his more popular-scientific papers were published in the Estonian-language journal Matemaatika ja Kaasaeg (Mathematics and Our Age). Among them is a whole series of papers about algebraic matters, culminating in a brilliant, elementary – although partly rather philosophical – essay devoted to Galois theory [K75a]. Another such series is his excellent essay on Diophantine Geometry [K68a,69b], in several installments, followed by his éloge [K68b] to another of his teachers, Yu. I. Manin. We believe that the inclusion of these papers here will make the Volume more interesting for beginners, and perhaps even contribute to attracting young people to mathematics, in Estonia and elsewhere.

¹ Later on referred to as Senior Editor.
Presentation. The papers in the Volume are assembled in chapters according to theme. Important matters or notions have often, with some consistency, been set in italics, sometimes upon their first appearance, or else where they are defined. The rather rare quotations in languages other than English are usually followed by a translation within parentheses. References to items by Uno Kaljulaid come in the form [Kx], where x (a year) is taken modulo 1900, and refer to the bibliography. References to other mathematicians come in the form [y], where y runs through 1, 2, 3, . . . , independently in each separate paper. In the case of books translated into Russian, the Russian translation is often indicated along with the original, for the benefit of Readers who read Russian or have access to the Russian book. In transliterating Cyrillic into English we use, with some consistency, the system of Mathematical Reviews, as set forth on pp. 1–2 of the book [1].

Some facts about Estonia and Estonian mathematics. It should perhaps also be recalled here that Estonia is the northernmost of the three Baltic Republics, facing the Gulf of Finland in the north and bordering Latvia in the south and Russia in the east. Its population is about 1.3 million, most of them Estonians, many living in the capital Tallinn; there is also a large Russian-speaking minority. The Estonians speak a language akin to Finnish and not at all related to the languages of their southern neighbors, the Latvians and the Lithuanians. Estonians were mentioned already by the Roman writer Tacitus (c. 55–117), who spoke of them as the Aestiorum gentes. However, around the beginning of the 13th century the Estonians were still among the few peoples in Europe who had not accepted Christianity. In a devastating war (1208–1227) against German, Swedish and Danish Crusaders, the new religion was forced upon them. The last stronghold of the Estonians, the Castle of Valjala on the island of Saaremaa, was conquered in February 1227 by a Crusaders’ army coming from Pärnu and marching over the frozen archipelago. Then the Estonians became united, together with the Latvians, in a state ruled by the Order of the Brethren of the Sword, later known as the Teutonic Order, while the native population came to live, for centuries, in serfdom. The rule of the Order lasted until the mid 16th century. In later times, Estonia was governed, alternately, by Swedes, Poles, and Russians. The situation of the indigenous population deteriorated ever more and was particularly bad towards the end of the 18th century, when farmers were freely sold to the highest-bidding landowner; one could even draw a parallel to the Belgian Congo at a much later epoch. However, in the middle of the 19th century a national awakening took place. After hard struggles, the Estonians managed to form an independent country of their own in 1918–20, in the aftermath of World War I, when all empires collapsed, the Russian one included. In the wake of the Molotov-Ribbentrop treaty of August 1939, Estonia was annexed by the Soviet Union in June 1940; it regained its independence in 1991, during the fall of the Soviet empire. For more details about the above, and also some information about mathematics in Estonia until 1940, with a tradition going back to the Academia Gustaviana in Tartu, founded by the Swedish King Gustavus Adolphus in 1632, closed down in 1656, when the city was captured by the Russians, and then followed by the Academia Carolina
(1690–1710)², we refer to an article by Ülo Lumiste, nestor of Estonian mathematicians, in the book [2]. After a long interregnum the university was reopened in 1802, under the auspices of czar Alexander I; Estonia was now a part of the Russian Empire, the university’s official name being Kaiserliche Universität zu Dorpat, as the language of teaching was German.
Acknowledgements. The appearance of the present compilation would not have been possible without the generous assistance of a large number of friends and colleagues, students, secretaries, librarians, family members, etc. – from Novosibirsk in the East to Iowa in the West. To all of them we express here our sincere thanks. The following list of names (in alphabetic order) probably comprises only a fraction of them: Gert Almkvist, Marianne Blauert, Leonid Bokut, Kerstin Brandt, Michael Cwikel, Martina Eicheldinger, Miroslav Engliš, Jan Gustavsson, László Filep, Eila Ritva Jansson, Margreth Johnsson, Kalle Kaarli, Dan and Christer Kiselman, Andi Kivinukk, Richard Koch, Petr Krylov, Ruvim Lipyanskiĭ, Indrek Martinson, Caroline Myrberg, Aleksandr Nikolskii, Inga-Britt Peetre, Jakob-Sebastian Peetre, Monika Perkmann, Ann-Christin Persson, Ulf Persson, Professor Pater Anders Piltz O.P., Boris Plotkin, Olga Sokratova, Sven Spanne, Gunnar Sparr, Michael David Spivak, Annika Tallinn, Hellis Tamm, Marje Tamm, Enn Tamme, Erki Tammiksaar, Gunnar Traustason, Michael Tsfasman, Victor Ufnarovski, Aleksandr Zubkov. Amongst institutions, we mention in particular the following: Eesti Loodusuurijate Selts (Estonian Naturalists’ Society, Tartu, Estonia); Verlag Heyn (Klagenfurt, Austria). We have had invaluable aid from many libraries, among others: the mathematical libraries of Lund and Uppsala (the latter named the Beurling library); the university libraries of Lund, Giessen, and Heidelberg; the library of the Mittag-Leffler Institute; and the library of the Institute of Cybernetics at Tallinn University of Technology. Finally, we express our great esteem for the generosity of our sponsors: the Royal Physiographic Society of Lund, which took over all costs of publication, and the European Union’s Fifth Framework Programme project IST-2001-37592 (eVikings II), which partially supported the editing of this book and the related visits of Jaan Penjam to Lund.

The Editors

References
[1] A. J. Lohwater. Russian-English Dictionary of the mathematical sciences. American Mathematical Society, Providence, Rhode Island, 1961.
[2] Ü. Lumiste and J. Peetre. Edgar Krahn, A centenary volume 1894–1961. IOS Press, Amsterdam, 1994.

² Probably few mathematicians are aware that the first ever to teach Newton’s cosmology was the Swede Sven Dimberg in Tartu [3].
[3] Ü. Lumiste and H. Piirimäe. Newton’s Principia in the curricula of the University of Tartu (Dorpat) in the early 1690’s. In: R. Vihalemm (ed.), Estonian studies in the history and philosophy of science. Kluwer Academic Publishers, Dordrecht, Boston, New York and London, 2001, 1–18. Swedish translation, based on the enlarged 1981 Estonian version: J. Peetre – S. Rodhe, Normat (to appear).
Biography of Uno Kaljulaid by J. Peetre
The following is mainly drawn from Uno Kaljulaid’s own curriculum vitae along with my personal recollections, as well as information obtained from his daughter Mrs. Annika Tallinn, and some other persons. Uno Kaljulaid was born on October 21, 1941 in Kõpu³ in the district of Viljandi in south-western Estonia.
Primary education. In primary school in Kõpu Uno was supposedly a naughty boy, but he never had any problems with learning. Once the question was even raised of sending him to a special school. After Uno finished primary school, his father, Elmar Kaljulaid, wanted him to become a tractor driver, but a relative (the husband of Uno’s sister) took care of him, and so Uno moved to Pärnu, a nearby famous seaside resort on the eastern side of the Gulf of Riga.
Secondary education. So young Uno received his secondary education in Pärnu. He graduated from the Pärnu First High School in 1959. But even afterwards Uno kept returning to Kõpu. In the summertime he used to help his mother with haymaking. But his great hobby was to go and pick cranberries in the swamps and morasses – a great part of Estonia consists of morasses. Early in the morning off he went on his moped, and he returned only around midnight, when everybody at home was already worried about him. But each time his rucksack was crammed with berries. In Kõpu he also wrote many of his mathematical papers, a special room having been prepared as an office for him. After the death of his parents, however, the farm was sold. Then Uno began to spend his summers in Pärnu, where he rented a room in a house in Toominga (Wild Cherry) Street in the beach area. He liked the arrangement very much and spent at least five years there.

Academic career. Uno Kaljulaid studied at Tartu University 1959–1963. But already in 1959, prior to entering the university, he attracted general attention by participating in the All-Estonian Mathematical Olympiad, finishing in an honorable fourth place. This was a turbulent time in Estonian mathematics, as the old professors (Jaakson, Rägo, Sarv) were all about to retire. The leading mathematician at the mathematics department of Tartu was then Gunnar Kangro (1913–1975), who opened up a new direction, summation theory, and attracted many good students⁴ there. After four years of study Uno was transferred to the Mechanical and Mathematical Faculty of Moscow University. He got his diploma in algebraic geometry, under the supervision of Yuri Manin, in 1966, but he was never formally Manin’s “aspirant”, several applications by him being turned down (cf. below). His post-graduate studies were again done at Tartu University, in 1968–1972. As follows from the comments by J.-E. Roos on his diploma work [K69a], some problems, then open, have now been settled. The advisor of his Candidate thesis was Professor Boris Plotkin (then at Riga, now in Jerusalem). The defence took place on March 11, 1979, at the Mathematical Institute of the Belorussian Academy of Sciences in that country’s capital Minsk, with Zenon I. Borevich and Alex E. Zaleskiĭ as official opponents.

Uno Kaljulaid taught at Tartu University from 1972 on, first 1972–1974 as an Assistant Professor and then 1974–1983 as an Associate Professor. He was made a Docent in 1983. From 1993 on he did scientific work and provided consultative service at the Computer Science Institute of the Department of Mathematics of Tartu University. Simultaneously, Kaljulaid was a part-time senior research fellow at the Institute of Cybernetics in Tallinn, where he carried out studies on the compositional theory of abstract state machines with memory.

Fig. 1: Uno Kaljulaid – a student in Tartu

³ Kõpu, a small village (population: 372 in 2000) situated on the highway connecting Viljandi and Kilingi-Nõmme, first mentioned in 1481. [1], 12, p. 264
⁴ In 1940/41, Kangro wrote a long paper on summation theory (100 p.). It appeared in the Acta in 1942; the author had, in 1941, been drafted by the Red Army and then deported to Russia. [2], p. 16.
Fig. 2: Boris Isakovich Plotkin, supervisor of Uno Kaljulaid
Scientific work. Teaching. Students. Uno Kaljulaid’s scientific output is, nominally, not large. Much is in the form of short papers, often merely research announcements. The bibliography below sets the number of items printed during Kaljulaid’s lifetime at some 40. According to MathSciNet he has 27 reviewed papers in Mathematical Reviews. Searching there for Anywhere: Kalju∗ gave, somewhat surprisingly, 124 hits, indicating that Uno Kaljulaid, after all, was quite influential. To some extent this high figure can be accounted for by the fact that it also comprises reviews written by Kaljulaid. On the other hand, MATH Database lists 14 items covered by Zentralblatt. The first printed paper by Uno Kaljulaid seems to be [K69a], which evidently represents, although we are not explicitly told this, his diploma work at Moscow. It is about algebraic geometry in a rather abstract style (Serre, Grothendieck), to wit about the cohomological dimension of algebraic varieties. This is what Professor Manin wrote to me when he learnt about the untimely death of Uno:

He was a student at the Algebra Chair of Moscow University. For some time, I was nominally his advisor, however, he always had his own scientific interests. I remember his mild smile and gentle speech. He was deeply interested in mathematics and enthusiastic about it. During the last decade or so I received a couple of letters and postcards from him. He was explaining what he was doing mathematically and usually added just a few words about life, which so drastically changed for many of us. I will miss him.

Although Uno Kaljulaid was a dedicated mathematician, wholly absorbed by his subject, he also had wide interests outside mathematics. We have already recorded his
passion for the lovely Estonian cranberries. During his Moscow days he also fell in love with ballet. After his sojourn at Moscow University, Uno Kaljulaid did one year of military service in the same city. Having returned to Estonia in 1967, he worked for some time with Professor Jaak Hion⁵ as supervisor. However, he was soon attracted to the theory of representations, especially of semigroups and algebras, and so his thesis advisor became, at least unofficially, Professor Plotkin, at this time one of the leaders in this area. His Candidate thesis [K79a] (typewritten, in Russian) is translated into English and printed here for the first time. An “author’s review” of the thesis [K79b] is likewise included here. For a very brief overview we also refer to a note in Uspekhi Matematicheskikh Nauk [K77a]. Furthermore, some preliminary results later covered in [K79a] were presented in separate publications prior et posterior (see e.g. [K71b,71c,73c,76,77b,78a,78b,81,82a,82b,83a,83e,85a], not reproduced here). The following lines were written, at my request, by Professor Plotkin about his contacts with Uno:

Uno Kaljulaid was not only my pupil but also a very close friend. Our contacts started in the end of 60-ties, when I used to come to Tartu from Riga with talks and lectures. That time the mathematical life in Tartu was rather active. One of the most popular activities was Summer Mathematical Schools in Kääriku. In Kääriku there was a base of Tartu University and mathematicians enjoyed this place where mathematical discussions could be combined with rest, beautiful nature and conversations. I remember that these conferences were made possible due to [Jaak] Hion, Mati Kilp, [Ülo] Lumiste and other mathematicians from Tartu. At the beginning of 70-ties my interest was focused on the varieties of group representations. This topic attracted attention of Uno. Soon after he asked me to give him a problem for his [Candidate] thesis. I recommended him to build a similar theory for representations of semigroups. In this case I took into account that Uno [had] already studied semigroups for a long time. Simultaneously I proposed him some problems about group representations. Uno managed to prove a series of significant results and in the end of the 70-ties he brilliantly defended his [Candidate] thesis at the Institute of Mathematics in Minsk. His work was highly appreciated by the reviewers and the Council members. Along with great results achieved by Uno, I should mention that he had deep and wide mathematical background. Uno has graduated from Moscow State University, where he got his education from outstanding teachers. For example, I know that during his university years he collaborated with Yu. I. Manin. I think Uno took great advantages from his education in Moscow University and the wide style of mathematical thinking can be traced in all his works during his mathematical career. During the period of preparation of the thesis Uno frequently visited me in Riga. Also later he used to come to discuss various problems. Methods, elaborated in the thesis, were extended and used in the automata theory. We considered automaton as a three-sorted mathematical system which possesses algebraic operations converting states to states and states to output signals. The system of input signals naturally constitutes a semigroup with the representation on the set (space) of states. This algebraic point of view on automata turns out to be very fruitful. Last years he collaborated with his pupil Olga Sokratova and other pupils in automata theory. I think that they could give useful information about his last works. I am sure that your activity in commemorating the memory of Uno Kaljulaid will be appreciated by mathematicians.

Fig. 3: Military service 1967

⁵ Born in 1929, Hion got his Candidate degree in Moscow in 1955 under A. G. Kurosh, an outstanding algebraist mainly known for his work in group theory.
Fig. 4: Participants of Summer School in Kääriku 1966: V. Vagner, J. Hion, E. Lyapin, L. Shevrin, L. Gluskin and B. Plotkin
Fig. 5: Mati Kilp and Uno Kaljulaid on their way to Moscow 1964
I find it curious that two men thus, independently, first declare that Uno Kaljulaid was not their pupil, but otherwise give him all the praise that they can! This shows that Uno was an independent mind early on. There is, however, one person in Tartu who influenced him quite a lot. This is Hion, who should also be considered the founder of the Estonian school of algebra. So maybe he should, after all, be viewed as the true teacher of Uno Kaljulaid! Later Kaljulaid became, undoubtedly inspired by all this, interested in automata theory. Already in [K69a] there is a brief treatment of at least linear automata. Indeed, automata theory became his main occupation in the last decade of his life. With his unusually broad mathematical education, Kaljulaid also took a keen interest in the history of mathematics. In particular, he wrote several papers (see this Volume, Chapter V, in particular the last one) about Theodor Molien (1861–1941; born in Riga of Swedish descent, he studied in Dorpat/Tartu and was a docent there, later in Siberia), who was a pioneer in the field of algebras, but is relatively little known to the general mathematical public, despite the fact that he influenced, for instance, Emmy Noether, who also duly quoted him. Kaljulaid was also interested early on in combinatorics, which is treated here in Chapter IV. It is my guess that it was through teaching that he was led to this subject. Among the research students of Uno Kaljulaid I mention Annela Kelly (née Rämmer), Peeter Laud, Riina Miljan, Jaan Penjam, Tiit Pikkmaa, Varmo Vene, Tiina Zingel (née Nirk).
My recollections of Uno Kaljulaid. I first met Uno Kaljulaid during a trip to then still Soviet-occupied Estonia in the spring of 1989, namely at a meeting of the Estonian Mathematical Society, which took place at Klooga-Ranna, a seaside resort a few miles west of Tallinn and not far from Paldiski, which at the time was a base for
Soviet submarines. (The conflict about submarines with Sweden was going on. “There they are, the submarines, which you cannot catch”, I was told, and people pointed across the bay.) At that meeting, Kaljulaid gave a talk on combinatorics. After the talk I had a discussion with him and told him about my own experience of this subject. It ended with my inviting him to Sweden. Kaljulaid came to Stockholm in the spring of 1990. I had rented a room for him in the apartment of Bertil Eneroth, Civil Engineer, at Sibyllegatan 38 in the district of Östermalm, where I housed many of my guests during my Stockholm years⁶. He gave a talk at the algebra seminar run by Jan-Erik Roos at Stockholm University. This was the year when I directed, jointly with Svante Janson, a program at the Mittag-Leffler Institute devoted to Hankel theory. So I also invited him there one day. He had supper with me and my betrothed Eila, in the company also of Marcel Grossman from Marseille. At the same time Uno also went to Lund, where he met Lars Hörmander and his team of bright young Russian students.

Our contacts continued later by correspondence. Uno Kaljulaid wrote me numerous letters, to which I responded less frequently. Much of this correspondence is preserved, but some has, regretfully, been lost, especially most electronic messages. Corresponding with him was not easy. He told me about his ideas, gave bibliographical information⁷, often quite useful, and wrote about his travels, the meetings he had been to, and people that he had met . . . Often he wrote several letters, one on top of the other. Despite my reprimands, they were often undated, so it was not always clear in which order they ought to be read; in retrospect this makes identification quite complicated. Sometimes, apparently short of paper, he wrote numerous postscripts and supplements on odd postcards. He admired me very much and never stopped thanking me for having invited him and for other support. Nevertheless, I think that this collection – I have it all stored in a special, rather thick binder – gives a vivid picture of his thoughts and scientific activities. However, Kaljulaid often sent me odd items, such as excerpts from local newspapers, which I found of no interest. He also sent me a number of gifts at various times. Among these I value especially highly a copy of the Estonian translation of Johann Renner’s chronicle [6], which covers the highly dramatic period 1555–1561 in the country’s history, when the Swedes under Erik XIV established themselves in the turbulent country⁸.

As a person Uno Kaljulaid was rather complex. He was always very friendly, and utterly polite, at least to me. Many mathematicians, at least among my Swedish colleagues, took a liking to him, and so the news of his untimely death came as a shock to everybody. In a way he was a maniac. He belonged to the category of mathematicians for whom there was no life outside mathematics. I am not a psychiatrist, but my diagnosis is that he suffered from a kind of persecution mania. I once called, in desperation, Vainikko (then at Helsinki) about this, but he showed little understanding; some of the things that Kaljulaid had told him also turned out not to be true (that obstacles were put in his way when he left Tartu, etc.). Already at the very beginning of our acquaintance Kaljulaid began to worry that some of his letters might have been intercepted. This was still in Soviet times, but such allegations continued throughout the period of our relation. Let me relate only one such episode, which is supposed to have taken place during one of his stays at Lund (cf. infra). Namely, Kaljulaid claimed that, in our Department’s coffee room, some Swedes, speaking in Swedish, had slandered him in his presence. With my knowledge of Swedes and Swedish mentality, I find this highly improbable, especially as I have doubts about Kaljulaid’s ability to understand spoken Swedish. Also, many people here liked him; among them was Anders Melin – I am not sure if he was supposed to have been present on the occasion referred to above; it was also Melin who first suggested that we make an application to the Crafoord Foundation (see again infra). Kaljulaid also told me of several other incidents, about various acts of persecution against him, which I found more or less credible. On these occasions his whole attitude suddenly changed and his voice dropped almost to a whisper, although there could be nobody nearby who could overhear our conversation in Estonian; to me he then looked more like an old woman telling gossip. Once I wrote to him and advised him to go to the Rector of the University and complain; afterwards I realized that, although this could have been a logical step in Sweden, it could hardly have been a good idea in post-communist Estonia. I doubt that Kaljulaid followed my advice.

⁶ Others who stayed at the same apartment, at various times, include Fernando and Luz Cobos and Genkai Zhang; the last was probably Gennadi Vainikko.
⁷ For instance, he gave me precious references of vital importance for my work on trilinear forms, by pointing to work by V. V. Dolotin, I. M. Gel’fand et al., etc.
⁸ Johannes Renner, German man of law (c. 1525–1583), lived 1556–1561 in Estonia and was in the service of the Teutonic Order. He witnessed at close quarters the early phases of the devastating Livonian War (1558–1583). The chronicle was completed in 1582, a year before its author’s death, but the ms. of Renner’s book was lost for about two centuries, and so the book appeared in print only in 1876. Nowadays it is regarded as a classic in Low German, which was the official language of Livonia (Estonia + Latvia) for centuries, until the beginning of the 18th century.
After my return to Lund in 1992, I arranged Uno Kaljulaid a second visit to Sweden with money coming from the Swedish National Council for Scientific Research (NFR); again, he visited both Stockholm and Lund. To Lund he came in September 1994. It was on this occasion that we set up a plan to study majorization from a very general point of view. However, only a tiny portion of our rather ambitious plan was ever materialized (see Chapter III); it is clear that I wrote the first version of that paper already then. We made also plans for future cooperation; to this end we applied, in 1995, for a grant from the Crafoord Foundation, and, indeed, we were given a rather handsome sum of money, which allowed Kaljulaid to come to Lund several times. So, Kaljulaid arrived again in Lund at the end of September 1994. By the irony of fate, he came the week before the Estonia catastrophe9, so, had he come only slightly later, he could well have been one of the victims. I recall that Eila and I heard about it by 6 o’clock in the morning by early, Finnish language broadcast on the Swedish radio. I immediately phoned Uno, who was staying in one of our Department’s guestrooms. We were both, of course, utterly shocked, and I reminded him about all the Estonian refugees, often in tiny vessels, who had drowned in similar weather conditions in the same month in September, 1944. Anyhow, soon we went on with mathematics. Kaljulaid gave several colloquium talks on automata theory; they were based on material which he had prepared on previous occasions. So I volunteered to write them up for him (see Chapter II, Section 2). Having learnt about his inability in practical matters, I saw it
⁹ The passenger vessel Estonia, owned by a joint-venture Swedish-Estonian company and sailing on the Tallinn–Stockholm line, perished near the Finnish coast on September 28, 1994, in one of the fierce autumn storms in the Baltic. On this occasion, 869 persons were killed.
as my duty to try to help him publish at least part of his ideas. Probably, I prepared a TeX version of Lecture 1 already while he was in Lund. The next time that Uno Kaljulaid came to Sweden was the year after, in October 1996. We then made plans for another visit. This time we made an application to the Swedish Institute (SI), which also included a visit for Kaljulaid’s bright student Peeter Laud; I was supposed to become his advisor. Unfortunately, the application was turned down. Later Laud showed interest in more applied things and defended his PhD thesis [3] on information security matters in 2002. An even shorter, last trip was in May 1997. After that time (during the last two years of the life of Uno Kaljulaid), my contacts with him were even more sporadic. I wrote Lecture 2. Uno sent me corrections and additions, and also some material for Lecture 3. Rereading our correspondence from this period, I find it striking that he showed relatively little interest in the whole project. On my side, I also took almost no initiative, as I was busy with teaching and other activities . . .
Marriage. Uno Kaljulaid married in 1973. His future wife Helle was a technical assistant at the mathematics department. From this marriage two daughters were born, Annika in 1974 and Kristina in 1979. In the mid-1980s the parents divorced, but they never separated.

Illness and death. Uno Kaljulaid became ill already at the end of 1987 and had surgery for stomach cancer. At the time doctors gave him at most five years to live. However, he was in practice rather healthy until the middle of July 1999. He worked and went jogging every morning. Until mid-July he rested in his beloved Pärnu, but then he began to cough and gradually felt less and less at ease. Nevertheless, at the end of July he participated in a conference in Poland and, probably, also gave a talk there. Upon his return he finally went to see a doctor, because his health had begun to deteriorate seriously. In mid-September he underwent another operation, but its purpose was only to establish a diagnosis: a cancer in the stomach with remote metastases in the lungs and in the liver. After the operation Uno said that he would not surrender so easily and that he hoped to be able to finish at least the ongoing work. A few days before his death, however, he asked that all should be finished. Luckily he did not suffer from severe pain, but still it was very hard. Uno Kaljulaid passed away at the age of 57 on September 26, 1999, in the pulmonary clinic at Tartu. Annika wrote me that it was a sunny autumn day. He died in the arms of his half-sister Laine. He was buried on October 1st in his native village Kõpu in the district of Viljandi. His death was that of a true hero . . .

In the meantime, I was quite unaware of everything. Early in June I received two postcards from Uno, dated in Tartu on June 4, 1999, and apparently sent in the same envelope, of whose text I hereby offer a translation:
Dear Mr. Peetre,

Thank you for sending me the thesis of Mr. Rosengren¹⁰, and likewise for your lines. This time everything arrived undamaged, although with some delay. I have now finished my courses, and very soon I shall also finish the exams. But this occupation gives me steadily less and less satisfaction. Probably I’ll have a chance to participate in the CS conference in Uppsala. But I have not yet made up my mind whether to go there or not, because its scope covers only a few of my interests. But it would be an opportunity to see Stockholm once more. Spring here was chilly; frost took the blossom of the currant bushes. Probably things were not so bad where you are, for Lund is on the latitude of Latvia or even further south. I presume that you are already by the sea; I wish you a pleasant summer.

Uno Kaljulaid
I was notified of Kaljulaid’s death three days afterwards, in an email message from his daughter Annika. She also gave me the above details of his illness and death. Furthermore, she told me that at his sickbed her father had said that he wanted me to take care of his “Nachlass”, which I eventually did . . . So all this is just my tribute to him . . .

References (including two articles, [2] and [5], in Estonian, commemorating Uno Kaljulaid)
[1] Eesti Entsüklopeedia 1–14 + Supplementary Volume (Estonian Encyclopedia). Tallinn, 1985.
[2] Mati Kilp. Uno Kaljulaid 21.10.1941–26.09.1999. In: Annual, 1999. Eesti Matemaatika Selts (Estonian Mathematical Society), Tartu, 2001, 111–114.
[3] Peeter Laud. Computationally Secure Information Flow. Ph.D. Thesis. Universität des Saarlandes, Saarbrücken, April 2002.
[4] Ü. Lumiste and J. Peetre. Edgar Krahn, A centenary volume 1894–1961. IOS Press, Providence, Rhode Island, 1994.
[5] Rein Prank. Remembering Uno Kaljulaid. In: Annual, 1999. Eesti Matemaatika Selts (Estonian Mathematical Society), Tartu, 2001, 119–123.
[6] J. Renner. Liivimaa ajalugu 1556–1561 (History of Livonia). Translated by Ivar Leimus and Enn Tarvel. Olion, Tallinn, 1995.
¹⁰ Hjalmar Rosengren defended his PhD thesis “Multivariable orthogonal polynomials as coupling coefficients for Lie and quantum representations” on May 6, 1999.
Bibliography of Uno Kaljulaid
Many works of Uno Kaljulaid have been published in the following Estonian journals:
1. Matemaatika ja Kaasaeg, a now extinct popular-scientific Estonian-language journal, whose name is here translated as Math. and Our Age.
2. Eesti Teaduste Akadeemia Toimetised, Füüsika-Matemaatika = Proceedings of the Estonian Academy of Sciences, Physics–Mathematics (Proc. Estonian Acad. Sci. Phys. Math.), founded in 1951/52 by Jüri Nuut (1894–1952).
3. Tartu Ülikooli Toimetised = Acta et commentationes Universitatis Tartuensis (Acta Comm. Univ. Tartuensis).
As a rule, papers in the last two journals were published in Russian and supplied with a short abstract in English and in Estonian. Below, the rare exceptions, where the article is in English and the abstracts are in other languages (or missing), are pointed out.
N.B. A star * in front of a paper means that the item in question has not been reprinted in this Volume. A double star ** indicates that it will be available on the Senior Editor’s web page: http://www.maths.lth.se/matematiklu/personal/jaak/engJP.html

[K68a] On the geometric methods of Diophantine Analysis, I and II. Math. and Our Age, 14; 15 (1968), 22–30; 3–13.
[K68b] Lenin prize for work in Diophantine geometry. Math. and Our Age, 14 (1968), 108–110.
[K69a] On the cohomological dimension of some quasiprojective varieties. Proc. Estonian Acad. Sci. Phys. Math., 18 (1969), 261–272 (incl. loose errata).
[K69b] On the geometric method of Diophantine Analysis, III. Mathematics and Our Age, 16 (1969), 20–26.
[K69c] The history of solving equations. Mathematics and Our Age, 16 (1969), 122–140.
[K70] Additional remarks on groups. Mathematics and Our Age, 17 (1970), 7–22.
*[K71a] On the absence of zero divisors in certain semigroup rings. Acta Comm. Univ. Tartuensis, 281 (1971), 49–57.
*[K71b] On the powers of the augmentation ring of the integral group ring for finite groups. Acta Comm. Univ. Tartuensis, 281 (1971), 58–62.
*[K71c] On the absence of zero divisors in some semigroup rings. In: Abstracts of the All Union Colloquium of Algebra, Kishinev, 1971, pp. 138–139 (Russian).
[K73a] Polynomials and formal series. Mathematics and Our Age, 19 (1973), 39–47.
*[K73b] 80 years from the birth of Villem Nano. Math. and Our Age, 19 (1973), 118–122 (coauthors: E. Tamme, R. Kruus).
*[K73c] On the powers of the augmentation ideal. Proc. Estonian Acad. Sci. Phys. Math., 22 (1973), 3–21.
[K75a] On Galois theory. Mathematics and Our Age, 20 (1975), 17–31.
[K75b] Theory of automata. Mathematics and Our Age, 20 (1975), 32–47 (coauthor: E. Tamme).
*[K76] On wreath type constructions for algebras. In: Abstracts of the Third All Union Symposium of Rings, Algebras and Modules, Tartu, 1976, pp. 49–50 (Russian).
[K77a] Triangular products of representations of semigroups and associative algebras. Uspehi Mat. Nauk 32 (1977), no. 4/196, 253–254 (Russian).
*[K77b] Remarks on the varieties of semigroup representations and automata. Acta Comm. Univ. Tartuensis, 431 (1977), 47–67.
*[K77c] Remarks on the course on discrete mathematics. Proc. of the III Regional Conference-Seminar of Leading Departments and Leading Lecturers of Mathematics, Minsk, 1977, p. 50 (Russian).
*[K78a] A remark on the basis of identities of an algebra of upper triangular matrices. In: Materials of Conf. “Methods of Algebra and Functional Analysis”, Tartu, 1978, pp. 105–107 (Russian).
*[K78b] Triangular products and group rings. Vestn. Moskov. Univ. Mat., no. 6, 1978, p. 81 (Russian).
[K79a] Triangular products and stability of representations. Candidate dissertation. Tartu University, 1979, 150 pp. (Russian, typescript).
[K79b] Triangular products and stability of representations. Author review of Candidate thesis in Physico-Mathematical Sciences [K79a]. Minsk, 1979, 13 pp. (Russian).
*[K79c] The arithmetics of varieties of representations of semigroups and algebras. Manuscript, deposited at VINITI, no. 344–78; “Matematika” 2AI36 DEP, 1979, 42 pp. (Russian).
*[K81] About semigroup actions. Acta Comm. Univ. Tartuensis, 556 (1981), 27–32.
*[K82a] Terminals of groups and stability of representations. Acta Comm. Univ. Tartuensis, 610 (1982), 15–25.
**[K82b] A lower bound for the terminal of certain groups. Acta Comm. Univ. Tartuensis, 610 (1982), 26–37.
[K83a] Triangular products representations of linear semigroups actions. Acta Comm. Univ. Tartuensis, 640 (1983), 13–28.
*[K83b] A remark on Stirling numbers. In: Sb. “Komb. Analiz”, 6 (1983), p. 98 (Russian).
*[K83c] Elements of discrete mathematics. Tartu University Press, Tartu, 1983, 100 pp. (Estonian).
*[K83d] Lattices and combinatorics – a problem book. Tartu University Press, Tartu, 1983, 27 pp. (Estonian).
*[K83e] On the freedom of the semigroup of special ideals. In: Abstracts of the conference “Methods of algebra and analysis”, Tartu, 1983, pp. 10–12.
**[K85a] Unique factorization of varieties of semigroup representations. Acta Comm. Univ. Tartuensis, 700 (1985), 17–31.
[K85b] Remarks on subcommutant rings. In: XVIII All Union Algebraic Conference, Abstracts of talks. Kishinev, 1985, p. 227.
[K85c] On two results on strongly regular rings. In: Proc. of the Conference “Theoretical and applied questions of mathematics”, Abstracts of talks, Tartu, 1985, pp. 67–69.
[K87a] Some remarks on Shevrin’s problem. Acta Comm. Univ. Tartuensis, 764 (1987), 30–38 (English).
*[K87b] On the theory of vacuum deposition of layer on the rotating cylindrical substrate. Acta Comm. Univ. Tartuensis, 779 (1987), 127–136 (coauthor: J. Lembra).
*[K87d] Theodor Molien and group algebras. In: Development of schools, ideas and theories in natural sciences at Tartu University, Tartu, 1987, pp. 16–24 (Estonian).
[K87e] On the results of Molien about invariants of finite groups and their renaissance in contemporary mathematics. In: Development of schools, ideas and theories in Natural Sciences at Tartu University, Tartu, 1987, pp. 111–119 (Russian).
[K88a] On Stirling and Lah numbers. In: Methods of algebra and analysis. Tartu, 1988, pp. 11–14 (Russian).
[K88b] Fibonacci numbers of outer planar graphs. In: Methods of algebra and analysis, Tartu, 1988, pp. 15–17 (Russian, coauthor: T. Pikkmaa).
*[K88c] On the theory of vacuum deposition of layer on a rotating cylindrical substrate from an asymmetrically located source. Acta Comm. Univ. Tartuensis, 830 (1988), 127–136 (coauthor: J. Lembra).
[K90] Transferable elements in group rings. Acta Comm. Univ. Tartuensis, 878 (1990), 39–52.
*[K93a] M. Meriste, J. Penjam, Algebraic theory of tape-controlled attributed automata. Research Report CS59/93, Institute of Cybernetics, Tallinn, 1993, 28 pp. (coauthors: M. Meriste, J. Penjam).
*[K93b] Analytical and algebraic methods in combinatorics. Tartu University Press, Tartu, 1993, 159 pp. (Estonian, coauthor: Ü. Kaasik).
*[K93c] Mordell’s problem. Estonian Mathematical Society. Annual 1988, Tartu University Press, Tartu, 1993, pp. 128–151, 178, 182 (Estonian, summary in English and Russian).
*[K93d] Languages, tools and methods of conceptual modelling. Research Report CS61/93, Institute of Cybernetics, Tallinn, 1993, 49 pp. (coauthors: M. Meriste, J. Penjam et al.).
[K96] On two discrete models in connection with structures of mathematics and language (the languages of life). Schola Biotheoretica XII, Tartu, 1996, pp. 84–95 (Estonian).
[K97a] On two algebraic constructions for automata. Research Report CS92/97, Institute of Cybernetics, Tallinn, 1997, 27 pp. (coauthor: J. Penjam).
*[K97b] Categories, automata and splicing systems. In: Proc. of 9th Nordic Workshop on Programming Theory, Tallinn, 1997, p. 47 (coauthor: J. Penjam).
*[K98a] Flatness and localizations of Ω-semigroups. Research Report CS96/98, Institute of Cybernetics, Tallinn, 1998, 49 pp. (coauthor: O. Sokratova).
*[K98b] Does there exist a (non-abelian simple) linearly right-orderable group all of whose proper subgroups are cyclic? In: Kourovka Notebook, 14th augmented edition, problem 14.45, Novosibirsk, 1999, p. 110.
[K98c] Revisiting wreath products, with applications to representations and invariants. In: Yu. A. Bahturin, A. I. Kostrikin, A. Yu. Ol’shanskiĭ (eds.), Kurosh Algebraic Conference, Abstract of talks, Moscow University Press, Moscow, 1998, pp. 64–65.
[K98d] Right order groups; their representations, structure and combinatorics. Manuscript, 37 pp.; 2nd version (1998) (to be submitted).
[K00] Ω-rings and their flat representations. In: Contributions to General Algebra 12, Verlag Joh. Heyn, Klagenfurt 2000, 377–390 (coauthor: Olga Sokratova).
CHAPTER I Representations of semigroups and algebras
1.
[K69a] On the cohomological dimension of some quasiprojective varieties Comments by J.-E. Roos
Abstract. We prove that the cohomological dimension of the complement of an arbitrary finite set of points in an $r$-dimensional Cohen–Macaulay projective variety equals $r - 1$.
The problem of the computation of the cohomology of quasiprojective varieties with coefficients in coherent sheaves leads, in particular, to the interesting question of the cohomological dimension of such varieties. This characteristic of a variety interests us, in the first place, in connection with a result by Nagata [7] to the effect that every algebraic variety can be embedded in a complete algebraic variety. As simple examples show, this embedding $V \to V^*$ does not always satisfy the requirement of minimality of the number $\dim(V^* \setminus V)$. It is an interesting problem to exhibit all the cases when this number can be described in terms of the cohomological dimension of the complement $V^* \setminus V$. In this paper one such case is described in Theorem 1.2. Section 1.1 contains a brief survey of some known, but not readily available, results of the theory of local cohomology of A. Grothendieck in a form suitable to us. In Section 1.2 we state some general properties of cohomological dimension. In Section 1.3 it is shown that the cohomological dimension of the complement of a finite non-empty set of points in an $n$-dimensional projective space equals $n - 1$, and in Section 1.4 we give some auxiliary computations.
1.1. The local cohomology of Grothendieck

1. We give some basic definitions. The space $X$ has cohomological dimension $n$ if, for an arbitrary algebraic sheaf $F$ on $X$ and for $i > n$, the group $H^i(X, F)$ is zero, but there exists a sheaf $F$ such that $H^n(X, F) \neq 0$. According to Grothendieck ([3], Theorem 4.15.2) a space of combinatorial Zariski dimension $\le n$ has cohomological dimension $\le n$. On the other hand, there exists a space of infinite combinatorial dimension but having zero cohomological dimension [3]. For algebraic varieties $X$ we change the definition of cohomological dimension, considering instead of Abelian sheaves on $X$ the category of coherent sheaves. Then the affine varieties give us an example of Zariski spaces of arbitrarily large combinatorial dimension that nevertheless have zero cohomological dimension.

2. If $Z \subset X$ is locally closed, then by definition one can find an open set $V \subset X$ such that $Z$ is closed in $V$. In the group $F(V)$ of sections of $F$ on $V$ we distinguish the subgroup $\Gamma_Z(X, F)$ of all sections whose supports are contained in $Z$. The group $\Gamma_Z(X, F)$ is independent of the choice of $V$, and the functor $F \Rightarrow \Gamma_Z(X, F)$
maps an exact sequence of sheaves $0 \to F \to G \to H$ into an exact sequence of Abelian groups $0 \to \Gamma_Z(F) \to \Gamma_Z(G) \to \Gamma_Z(H)$. This means that the functor $F \Rightarrow \Gamma_Z(X, F)$ is left exact from the category of Abelian sheaves on $X$ into the category of Abelian groups; its right derived functors are the local cohomology groups $H^i_Z(X, F)$. If $U \subset X$ is open, then the natural restriction homomorphism $F(V) \to F(V \cap U)$ induces a homomorphism $\Gamma_Z(X, F) \to \Gamma_{Z \cap U}(U, F|_U)$, and the presheaf $U \mapsto \Gamma_{Z \cap U}(U, F|_U)$ so obtained is indeed a sheaf, which we denote by $\underline{\Gamma}_Z(F)$. The functor $F \Rightarrow \underline{\Gamma}_Z(F)$ is left exact from the category of Abelian sheaves into itself; its right derived functors $\underline{H}^i_Z(X, F)$ are called the sheaves of local cohomology of $X$. Let $X$ be an $r$-dimensional Zariski space, $F$ an Abelian sheaf on it and $Z \subset X$ locally closed. Grothendieck’s theorem ([5], Proposition 1.12) says that for $i > r$ the groups $H^i_Z(X, F)$ and the sheaves $\underline{H}^i_Z(X, F)$ are zero.

3. Let $X = \operatorname{Spec} A$ be an affine scheme, $Y$ one of its subschemes, given by an ideal $I \subset A$, and $F$ the sheaf of coefficients associated with the $A$-module $N$. Then one has for all $i > 0$ the isomorphism
$$H^i_Y(X, F) \approx \varinjlim_n \operatorname{Ext}^i_A(A/I^n, N)$$
([5], Theorem 2.8).
For each closed $Y \subset X$ and a coherent sheaf $F$ on $X$ one has the exact sequence
$$0 \to \Gamma_Y(X, F) \to \Gamma(X, F) \to \Gamma(X \setminus Y, F) \to H^1_Y(X, F) \to \dots \to H^i_Y(X, F) \to H^i(X, F) \to H^i(X \setminus Y, F) \to H^{i+1}_Y(X, F) \to \dots.$$
As in the case at hand $H^i(X, F) = 0$ for all $i > 0$, we have the isomorphism $H^i(X \setminus Y, F) \approx H^{i+1}_Y(X, F)$.

Next, let $X$ be an $r$-dimensional projective space and $S = k[t_0, \dots, t_r]$ the algebra of polynomials over the field $k$. We take for $F$ the sheaf $O(n)$. Then, by Serre [11], for $0 < i < r$ the groups $H^i(X, O(n))$ are zero, while the group $H^r(X, O(n))$ is a vector space over $k$ of dimension $\binom{-n-1}{r}$ and has a basis of skew-symmetric cocycles of the cover $\mathcal{U} = (t_i \neq 0)$ of the form
$$f_{01\dots r} = \frac{1}{t_0^{\alpha_0} \cdots t_r^{\alpha_r}},$$
where $\alpha_i > 0$ and $\sum \alpha_i = -n$. Therefore we have for $0 < i < r - 1$ the isomorphism $H^i(X \setminus Y, O(n)) \approx H^{i+1}_Y(X, O(n))$, while, by definition, the groups $H^r(X \setminus Y, O(n))$ are given by the exact sequence
$$H^r_Y(X, O(n)) \to H^r(X, O(n)) \to H^r(X \setminus Y, O(n)) \to 0.$$
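As a quick sanity check of the dimension formula quoted from Serre [11], one can count the basis cocycles directly in two small cases. The values $r = 1$, $n = -3$ and $r = 2$, $n = -4$ below are our own choices, picked only for illustration; they do not appear in the paper.

```latex
% Illustrative check (our choice of r and n, not from the source):
% basis cocycles 1/(t_0^{a_0} ... t_r^{a_r}) with a_i > 0 and a_0 + ... + a_r = -n.
\begin{align*}
r = 1,\ n = -3:&\quad \dim_k H^1(P^1, O(-3)) = \binom{2}{1} = 2,
  &&\text{basis } \frac{1}{t_0 t_1^{2}},\ \frac{1}{t_0^{2} t_1};\\
r = 2,\ n = -4:&\quad \dim_k H^2(P^2, O(-4)) = \binom{3}{2} = 3,
  &&\text{basis } \frac{1}{t_0^{2} t_1 t_2},\ \frac{1}{t_0 t_1^{2} t_2},\ \frac{1}{t_0 t_1 t_2^{2}}.
\end{align*}
```

In both cases the count agrees with the general fact that the number of ways to write $-n$ as an ordered sum of $r + 1$ positive integers is $\binom{-n-1}{r}$.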
4. Let $M$ and $N$ be graded $S$-modules. Then the derived functors Ext of the functor $\operatorname{Hom}_S(M, N) = \bigoplus_n \operatorname{Hom}^n_S(M, N)$, as defined, on the one hand, by Serre in [11] and, on the other hand, by Cartan and Eilenberg in [1], need not coincide. However, it is easy to see that they do coincide in the case we need, that of $\operatorname{Ext}^i_A(A/I^n, A)$, where $A = k[t_1, \dots, t_r]$ and $I$ is the ideal in $A$ given by $Y \subset X$. Indeed, the ring $A/I^n$, as a module over itself, is also an $A$-module. As a ring $A/I^n$ is Noetherian. The submodules of $A/I^n$ are ideals in it; therefore, it follows from Hilbert’s theorem ([1], p. 32) that this module is Noetherian. But a Noetherian module over a Noetherian ring is of finite type. In this case ([11], p. 434) both definitions coincide.

Let there be given $R$-modules $A$, $B$, $A'$, $B'$ and $R$-homomorphisms $\alpha : A' \to A$ and $\beta : B \to B'$. Introduce an $R$-homomorphism
$$\operatorname{Hom}(\alpha, \beta) : \operatorname{Hom}(A, B) \to \operatorname{Hom}(A', B'),$$
which for each $\varphi \in \operatorname{Hom}(A, B)$ is defined by $\operatorname{Hom}(\alpha, \beta)(\varphi) = \beta \circ \varphi \circ \alpha$. The objects $\operatorname{Hom}(A, \beta)$ and $\operatorname{Hom}(\alpha, B)$ are obtained from $\operatorname{Hom}(\alpha, \beta)$ for $A' = A$ and $B' = B$ respectively (with the identity map in place of $\alpha$, respectively $\beta$). The following theorem from homological algebra may be useful in the calculation of local cohomology groups. Let us consider the exact sequences of modules
$$0 \to I^n \xrightarrow{\ \alpha\ } A \to A/I^n \to 0$$
and
$$0 \to A \to K \xrightarrow{\ \beta\ } K/A \to 0,$$
where $A$ is a projective and $K$ an injective module. The following isomorphisms hold true (cf. [1], p. 141):
$$\operatorname{Ext}^i_A(A/I^n, A) \approx \operatorname{Ext}^{i-2}_A(I^n, K/A);$$
$$\operatorname{Ext}^2_A(A/I^n, A) \approx \operatorname{Coker}(\operatorname{Hom}_A(\alpha, \beta));$$
$$\operatorname{Ext}^1_A(A/I^n, A) \approx \operatorname{Ker}(\operatorname{Hom}(\alpha, \beta)) / [\operatorname{Ker}(\operatorname{Hom}_A(\alpha, K/A)) + \operatorname{Ker}(\operatorname{Hom}_A(A, \beta))].$$
As, by the first main theorem of Grothendieck, one has the isomorphism
$$H^i_Y(X, O) \approx \varinjlim_n \operatorname{Ext}^i_A(A/I^n, A),$$
the three isomorphisms just given suffice for the calculation in a 3-dimensional space.
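The limit formula $H^i_Y(X, O) \approx \varinjlim_n \operatorname{Ext}^i_A(A/I^n, A)$ can be made concrete in the simplest possible case. The following sketch is our own illustration, not part of the paper: we assume $A = k[t]$, $I = (t)$, so that $X$ is the affine line and $Y$ is the origin.

```latex
% Illustrative case (our choice): A = k[t], I = (t), N = A, Y = {0} in the affine line.
\begin{align*}
&0 \to A \xrightarrow{\;t^n\;} A \to A/t^nA \to 0
  &&\text{(free resolution of } A/I^n\text{)},\\
&\operatorname{Ext}^1_A(A/t^nA, A) \approx A/t^nA,
  \qquad \operatorname{Ext}^i_A(A/t^nA, A) = 0 \ \ (i \ge 2),\\
&\varinjlim_n \operatorname{Ext}^1_A(A/t^nA, A)
  \approx \varinjlim_n \bigl(A/t^nA \xrightarrow{\;t\;} A/t^{n+1}A\bigr)
  \approx k[t, t^{-1}]/k[t].
\end{align*}
```

Here the class of $f$ modulo $t^n$ corresponds to the class of $f/t^n$, so the limit has the $k$-basis $t^{-1}, t^{-2}, \dots$ and is infinite dimensional. This agrees with the exact sequence $0 \to \Gamma(X, O) \to \Gamma(X \setminus Y, O) \to H^1_Y(X, O) \to 0$, in which $\Gamma(X, O) = k[t]$ and $\Gamma(X \setminus Y, O) = k[t, t^{-1}]$.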
1.2. Some general properties of cohomological dimension

1. Let $X$ and $Y$ be algebraic varieties, $\varphi : Y \to X$ a morphism and $F$ an algebraic sheaf on $X$. Then there is defined on $Y$ an algebraic sheaf $F^\varphi$, called the inverse image of the sheaf $F$ under the morphism¹ $\varphi$. If $F$ is a coherent sheaf on $X$, then $F^\varphi$ too is coherent on $Y$. Indeed, in view of the coherence of $F$ there exists $U \subset X$ on which one has an exact sequence $O^p \to O^q \to F \to 0$. The homomorphism $O_x \to O_y$ induces the identity map on the base field $k$; therefore we have the canonical isomorphism $O_y \otimes_{O_x} O_x \approx O_y$.

¹ Regarding the construction of the sheaf $F^\varphi$, cf. [9].
This gives us $O_y^n \approx (O_x^n)^\varphi$, $n = 1, 2, \dots$, and so in $\varphi^{-1}(U)$ we have for $F^\varphi$ the exact sequence $O^p \to O^q \to F^\varphi \to 0$, proving the coherence of $F^\varphi$.

2. THEOREM 1.1. For arbitrary algebraic varieties $X$ and $Y$ we have the inequality
$$\operatorname{dimh}(X \times Y) \ge \operatorname{dimh} X + \operatorname{dimh} Y. \tag{1}$$
If $\dim X = \operatorname{dimh} X$ and $\dim Y = \operatorname{dimh} Y$, then both sides of (1) coincide.

PROOF. Let $p_1 : X \times Y \to X$ and $p_2 : X \times Y \to Y$ be the natural projections. Furthermore, set $\operatorname{dimh} X = r$, $\operatorname{dimh} Y = s$. Then there exist coherent sheaves $F$ and $G$, on $X$ and $Y$ respectively, such that the $k$-vector spaces $H^r(X, F)$ and $H^s(Y, G)$ are not zero; therefore $H^r(X, F) \otimes_k H^s(Y, G) \neq 0$. Let us use the Künneth formula for sheaves [10]:
$$H^n(X \times Y, F^{p_1} \otimes_{O_{X \times Y}} G^{p_2}) \approx \bigoplus_{i+j=n} H^i(X, F) \otimes_k H^j(Y, G).$$
It follows from it that $H^{r+s}(X \times Y, F^{p_1} \otimes_{O_{X \times Y}} G^{p_2}) \neq 0$, whence $\operatorname{dimh}(X \times Y) \ge r + s$. Let us remark that for $t > r + s$ the relation
$$H^t(O^n_{X \times Y}) \neq 0$$
cannot hold true. This follows from Künneth’s formula in view of
$$O_T^n = O_T^n \otimes_{O_T} O_T = (O_X^n)^{p_1} \otimes_{O_T} (O_Y^n)^{p_2},$$
where $T = X \times Y$. In the case $\dim X = \operatorname{dimh} X$, $\dim Y = \operatorname{dimh} Y$, we obtain in view of Grothendieck’s theorem ([3], Theorem 4.15.2) that
$$\operatorname{dimh} X + \operatorname{dimh} Y \ge \dim X \times Y \ge \operatorname{dimh} X \times Y \ge \operatorname{dimh} X + \operatorname{dimh} Y,$$
which again gives $\dim X \times Y = \operatorname{dimh} X + \operatorname{dimh} Y$.

3. Let $i : V \to W$ be a closed embedding of algebraic varieties. Then the relation
$$\operatorname{dimh} V \le \operatorname{dimh} W$$
holds. Indeed, if we set $r = \operatorname{dimh} V$, then the group $H^r(V, F)$ is non-zero for some coherent sheaf $F$ over $V$. On the variety $W$ we consider the sheaf $F^W$, defined by extending $F$ by zero off the variety $V$. The required relation follows from the isomorphism $H^r(W, F^W) \approx H^r(V, F)$. We remark that for an open embedding this relation is not true. Indeed, let $V = A^2 \setminus (0)$, $W = A^2$, where $A^2$ denotes the affine plane. Then $\operatorname{dimh} W = 0$, but $\operatorname{dimh} V = 1$ (cf. Paragraph 1, Section 1.4).
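To see Theorem 1.1 at work, one may evaluate the Künneth formula in a small case. The example below is ours and only illustrative: we assume $X = Y = P^1$ and $F = G = O(-2)$, and use the standard values of the cohomology of $O(-2)$ on the projective line.

```latex
% Illustrative case (our choice): X = Y = P^1, F = G = O(-2).
\begin{align*}
&H^0(P^1, O(-2)) = 0, \qquad H^1(P^1, O(-2)) \approx k,\\
&H^2\bigl(P^1 \times P^1,\ F^{p_1} \otimes G^{p_2}\bigr)
  \approx \bigoplus_{i+j=2} H^i(P^1, F) \otimes_k H^j(P^1, G)
  = H^1(P^1, F) \otimes_k H^1(P^1, G) \approx k \neq 0,\\
&\operatorname{dimh}(P^1 \times P^1) = 2 = \operatorname{dimh} P^1 + \operatorname{dimh} P^1 .
\end{align*}
```

Only the summand with $i = j = 1$ survives, since $H^2$ vanishes on a one-dimensional variety. The last line uses $\operatorname{dimh} \le \dim$, so this is an instance of the equality case of the theorem.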
4. We make the following conjecture: for an arbitrary fiber bundle $(E, \pi, B)$ whose fiber is the projective space $P^r$, the formula
$$\operatorname{dimh} E = \operatorname{dimh} B + r$$
holds. If this is true, it follows in a trivial way that under the $\sigma$-process at a point the cohomological dimension can only increase. Let $X^*$ be a variety obtained by a monoidal transformation from a nonsingular, irreducible algebraic variety $X$ of dimension $r$. Let the center of this $\sigma$-process be a nonsingular $d$-dimensional variety $i : V \to X$. Furthermore, let $f : X^* \to X$ be the projection. The inverse image $V^*$ of $V$ under this projection is a projective fibering of rank $r - d - 1$ with basis $V$. In view of the fact that the embedding $i^* : V^* \to X^*$ is closed, the hypothesis made and the monotonicity property, we obtain
$$\operatorname{dimh} X^* \ge \operatorname{dimh} V + r - d - 1.$$
In particular, for the $\sigma$-process at a point we obtain $\operatorname{dimh} X^* \ge r - 1$. As $\dim X^* = \dim X$, we obtain in view of known theorems (cf. Paragraph 1, Section 1.1) either $\operatorname{dimh} X^* = r$ or $\operatorname{dimh} X^* = r - 1$. Let us now take for $X$ an affine variety of dimension $r$, and for $V$ a point. It is clear that $\operatorname{dimh} X = 0$. Clearly, as $V^*$ is a projective space, $\operatorname{dimh} X^* = r - 1$. Thus for $r > 1$ we have $\operatorname{dimh} X^* > \operatorname{dimh} X$.
1.3. The cohomological dimension of a certain variety

1. Let us consider the projective space $P^r$ and an arbitrary subvariety $Y$ of codimension $\ge 2$ in it. In Section 1.1 we saw that the group $H^r(P^r \setminus Y, O(n))$ can be found from the exact sequence
$$H^r_Y(P^r, O(n)) \to H^r(P^r, O(n)) \to H^r(P^r \setminus Y, O(n)) \to 0.$$
Is the group $H^r(P^r \setminus Y, O(n))$ always different from zero? The answer to this question is negative and follows at once from the following theorem ([5], Theorem 6.8): For any quasiprojective variety $X$ of dimension $r$ the following three conditions are equivalent:
(1) all irreducible components of $X$ of dimension $r$ are non-proper;
(2) $H^r(X, F) = 0$ for any quasi-coherent sheaf $F$ on $X$;
(3) $H^r(X, O_X(-n)) = 0$ for all $n \ge 0$, where $O_X(1)$ is the very ample sheaf of Serre, induced by some projective embedding of $X$.
As $X = P^r \setminus Y$ is a quasiprojective variety of dimension $r$ (open in $P^r$), evidently irreducible and non-complete, condition (1) is fulfilled. Therefore condition (2) gives $H^r(P^r \setminus Y, F) = 0$ for every coherent sheaf $F$ on $X$.

2. THEOREM 1.2. The cohomological dimension of the quasi-projective variety $P^r \setminus Y$ obtained by removing a non-empty finite set of points $Y = \{Q_1, \dots, Q_s\}$ from the projective space $P^r$ equals $r - 1$.
PROOF. In view of the result of the previous subsection it is sufficient to find a coherent sheaf $F$ on $P^r \setminus Y$ such that the group $H^{r-1}(P^r \setminus Y, F)$ is non-zero. It turns out that one can take $F = O(n)$. We prove the relation $H^{r-1}(P^r \setminus Y, O(n)) \neq 0$ by contradiction. Assume that for each coherent sheaf $F$ the group $H^{r-1}(P^r \setminus Y, F)$ is zero. Then in the exact sequence
$$\dots \to H^{r-1}(P^r \setminus Y, F) \to H^r_Y(P^r, F) \to H^r(P^r, F) \to H^r(P^r \setminus Y, F) \to \dots$$
the boundary groups are zero, and we obtain, in particular, the isomorphism
$$H^r_Y(P^r, O(n)) \approx H^r(P^r, O(n)).$$
We use Proposition 1.9 in [5], which we reformulate in a form suitable for us. Let $Y' \subset Y \subset P^r$ be closed subspaces and $Y'' = Y \setminus Y'$. Then, using the isomorphism just obtained, we have the exact sequence
$$H^r_{Y'}(P^r, O(n)) \to H^r(P^r, O(n)) \to H^r_{Y''}(P^r, O(n)) \to 0.$$
By the excision formula ([5], Proposition 1.3), for a topological space $X$, a locally closed $Y \subset X$ and an open $V \subset X$ such that $Y \subset V \subset X$, one has, for each Abelian sheaf $F$ and for all $i$, the isomorphism
$$H^i_Y(X, F) \approx H^i_Y(V, F|_V).$$
Take any point $Q$ in the set $Y = \{Q_1, \dots, Q_s\}$ and consider the above sequence for $Y$ with $Y'' = \{Q\}$ (so that $Y' = Y \setminus \{Q\}$). The point $Q$ lies in some component $A$ of the standard affine covering of the space $P^r$. We now apply the excision formula to the penultimate term of our exact sequence for the sheaf $O(n)$. Taking into account that $A$ is affine and the isomorphism $O(n)|_A = O|_A$, we get the isomorphisms
$$H^r_{Y''}(P^r, O(n)) \approx H^r_{Y''}(A, O(n)) \approx H^{r-1}(A \setminus \{Q\}, O).$$
Therefore we have the following exact sequence:
$$H^r_{Y'}(P^r, O(n)) \to H^r(P^r, O(n)) \xrightarrow{\ \alpha\ } H^{r-1}(A^r \setminus \{Q\}, O) \to 0,$$
where $\alpha$ is an epimorphism of $k$-vector spaces. Thanks to a result of Serre [11] one knows that $H^r(P^r, O(n))$ is a finite dimensional $k$-vector space. On the other hand, the computations in Paragraph 2 of Section 1.4 show that the $k$-space $H^{r-1}(A^r \setminus Q, O)$ is infinite dimensional. Therefore the exact sequence of vector spaces obtained yields a contradiction. The Theorem is proved.

In the question of the dimension of the $k$-space $H^r_Y(A, O)$, where $Y = \{Q_1, \dots, Q_s\}$, one can limit oneself to the case of a one-point set $Y$. In fact, the following proposition holds true.

PROPOSITION 1.3. Let $A$ be an $r$-dimensional variety and $F$ a coherent sheaf on $A$. If the space $H^r_Q(A, O)$ is infinite dimensional over $k$ for an arbitrary point $Q \in A$, then the relation
$$\dim_k H^r_{\{Q_1, \dots, Q_s\}}(A, F) = \infty$$
holds for any finite family of points $\{Q_1, \dots, Q_s\} \subset A$.
PROOF. By Grothendieck [3], for $\{Q_1\} \subset \{Q_1, \dots, Q_s\} \subset A$ one has the exact sequence
$$H^r_{Q_1}(A, F) \xrightarrow{\ \alpha\ } H^r_{\{Q_1, \dots, Q_s\}}(A, F) \xrightarrow{\ \beta\ } H^r_{\{Q_2, \dots, Q_s\}}(A, F) \to 0,$$
which we, for the sake of simplicity, rewrite in the form
$$A(1) \xrightarrow{\ \alpha\ } B(s) \xrightarrow{\ \beta\ } C(s - 1) \to 0.$$
This sequence makes it possible to carry out induction on the number of points $s$. Let us assume that the statement is proved for $s < n$ but fails for $s = n$. Then in the exact sequence
$$A(1) \to B(n) \to C(n - 1) \to 0$$
the term $C(n - 1)$ has infinite dimension, which, in view of the fact that $\beta$ maps the $k$-space $B(n)$ onto $C(n - 1)$, gives a contradiction.

As the computation in 1.4.2 shows that
$$\dim_k H^r_Q(k^r, O) = \dim_k H^{r-1}(k^r \setminus Q, O) = \infty,$$
it follows from what has been proved that for each finite collection of points $S$ in $k^r$ the $k$-space $H^{r-1}(k^r \setminus S, O)$ is infinite dimensional.

3. The character of the facts from [5] and [11] used in the proof of Theorem 1.2 is such that the statement of the theorem, apparently, can be carried over to the case of an arbitrary variety $V$ of dimension $\ge 2$, if it were possible to prove, for each affine variety $X = \operatorname{Spec} A$, $\dim A = r$, that the $k$-space $H^r_Q(X, O_X)$ is infinite dimensional. Clearly, $A$ may be taken to be a local ring; then everything reduces to the proof that $H^r_M(A)$ is infinite dimensional, where $M \subset A$ is the maximal ideal. As S. I. Dolgaev has observed, when all local rings of a variety $V$ are Cohen–Macaulay rings (for example, when $V$ is nonsingular or is locally a complete intersection), this easily follows from the following criterion of Grothendieck for the coherence of sheaves of local cohomology: Let $X$ be a locally Noetherian pre-scheme locally embedded in a regular pre-scheme, $Y$ a closed subvariety of $X$, $F$ a coherent $O_X$-module, $c(x) = \dim \overline{\{x\}}$, $n \in \mathbb{Z}$. The following two conditions are equivalent [4]:
(i) for all $x \in X \setminus Y$ such that $c(x) = 1$, $\operatorname{depth} F_x \ge n$;
(ii) for $i \le n$ the sheaves $\underline{H}^i_Y(F)$ are coherent.
Indeed, take $X = \operatorname{Spec} O_{V,Q} = \operatorname{Spec} A$. As by assumption $A$ is a Cohen–Macaulay ring and $c(x) = \dim \overline{\{x\}} = 1$, we have $\operatorname{depth} A_x = \dim A_x = r - 1$. If the space $H^r_M(A)$ were finite dimensional, then it would follow from condition (ii) that $n = r$, from which by (i) $\operatorname{depth} A_x \ge r$, which is contradictory.
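The mechanism of the proof of Theorem 1.2 can be traced in the smallest non-trivial case. The sketch below is our own illustration and assumes $r = 2$, a single removed point $Q \in P^2$ and the sheaf $F = O$; the vanishing of $H^1(P^2, O)$ and $H^2(P^2, O)$ is the standard computation of Serre.

```latex
% Illustrative case (our choice): r = 2, Y = {Q} a single point of P^2, F = O.
% k^2 denotes an affine chart of P^2 containing Q.
\begin{align*}
&H^1(P^2, O) = 0, \qquad H^2(P^2, O) = 0,\\
&0 = H^1(P^2, O) \to H^1(P^2 \setminus Q, O) \to H^2_Q(P^2, O) \to H^2(P^2, O) = 0,\\
&H^1(P^2 \setminus Q, O) \approx H^2_Q(P^2, O) \approx H^2_Q(k^2, O)
  \approx H^1(k^2 \setminus Q, O), \qquad \dim_k = \infty .
\end{align*}
```

The middle isomorphism is excision, and the last one uses that the chart $k^2$ is affine. Together with the vanishing of $H^2(P^2 \setminus Q, F)$ for coherent $F$, this gives $\operatorname{dimh}(P^2 \setminus Q) = 1 = r - 1$. For $r = 1$ the statement is also immediate, since $P^1$ with a non-empty finite set of points removed is affine and so has cohomological dimension $0$.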
1.4. Some computations and remarks

1. Let us consider the algebraic variety $X$ obtained from the affine plane [with coordinates $(x, y)$] by excluding the origin; it is not affine but admits an affine cover $\mathcal{U} = (U_1, U_2)$, where $U_1 = X \setminus (x = 0)$ and $U_2 = X \setminus (y = 0)$. If $X'$ is an arbitrary variety in which the subvariety $Y$ has codimension $\ge 2$, we obtain, in view of the fact that the singularity set of every rational function on $X'$ has codimension 1, that $H^0(X' \setminus Y, O) \approx H^0(X', O)$. Therefore, in this case $H^0(X, O)$ consists of all polynomials $P(x, y)$.
Let us compute the group $H^1(\mathcal{U}, O)$. It is clear that all cochains $f_{12} \in C^1(\mathcal{U})$ have the form $\dfrac{P(x, y)}{x^k y^l}$, where $k, l$ are integers. In view of $C^2(\mathcal{U}) = 0$, all one-dimensional cochains are cocycles. Deciding which of the cochains are coboundaries is equivalent to deciding when $\dfrac{P(x, y)}{x^k y^l}$ can be written in the form
$$\frac{x^{k'} P_1(x, y) - y^{l'} P_2(x, y)}{x^{k'} y^{l'}}.$$
Thus, we have
$$H^1(X, O) \approx \left\{ \frac{P(x, y)}{x^k y^l} \,:\, k, l \ge 0 \right\} \Big/ \left\{ \frac{x^{k'} P_1(x, y) - y^{l'} P_2(x, y)}{x^{k'} y^{l'}} \right\},$$
where $P, P_1, P_2$ are arbitrary polynomials, while $k', l', k, l \ge 0$. It is easy to see that this factor space is infinite dimensional. To this end we remark that all expressions $\dfrac{1}{x^k y^l}$ give different cosets:
$$\frac{1}{x^k y^l} - \frac{1}{x^m y^n} = \frac{x^{k'} y^{l'} - x^{m'} y^{n'}}{x^p y^q},$$
where $p = \max(k, m)$, $q = \max(l, n)$, $k' = p - k$, $m' = p - m$, $l' = q - l$, $n' = q - n$. It is sufficient to show that there do not exist polynomials $P$ and $Q$ such that
$$x^{k'} y^{l'} - x^{m'} y^{n'} = x^p P - y^q Q.$$
To this end we have to consider two cases: 1) $p = k$, $q = l$ and 2) $p = k$, $q = n$. Assuming that such $P$ and $Q$ exist in the first case, we obtain
$$x^p P - y^q Q = 1 - x^{p-m} y^{q-n},$$
which is a contradiction, as the right-hand side of the equation has unity among its terms, while the left-hand side does not. Analogously, in the second case the equation $y^{q-l} - x^{p-m} = x^p P - y^q Q$, where $p - m < p$, $q - l < q$, leads us to a contradiction. Thus we have proved that $\dim_k H^1(X, O) = \infty$.

2. THEOREM 1.4. Let $X$ be an $r$-dimensional affine space with one point removed, defined over an algebraically closed field $k$. Then the cohomology group $H^{r-1}(X, O)$ is an infinite dimensional vector space over $k$.

PROOF. Consider the affine covering $\mathcal{U} = (U_i)$ of $X$, where $U_i = (x_i \neq 0)$, $i = 1, \dots, r$. As $\dim \mathcal{U} = r - 1$, all $(r - 1)$-dimensional cochains are cocycles. The elements $f_{1, \dots, r} \in C^{r-1}(\mathcal{U})$ have the form
$$\frac{P(x_1, \dots, x_r)}{x_1^{i_1} \cdots x_r^{i_r}}.$$
Let $\rho_i$ be the restriction homomorphisms, i.e.
$$\rho_i : \Gamma\Bigl(\bigcap_{j \neq i} U_j, O\Bigr) \to \Gamma\Bigl(\bigcap_j U_j, O\Bigr).$$
As by definition of the differential $d$
\[ df = \sum_{j=1}^{r} (-1)^{j+1} \rho_j \left( \frac{P_j(x_1, \dots, x_r)}{x_1^{i_1(j)} \cdots \widehat{x_j} \cdots x_r^{i_r(j)}} \right), \]
then for the computation of the group $H^{r-1}(X, \mathcal{O})$ we must clarify which expressions $P(x_1, \dots, x_r)/(x_1^{i_1} \cdots x_r^{i_r})$ are expressible in the form
\[ \frac{1}{x_1^{\alpha_1} \cdots x_r^{\alpha_r}} \left( \sum_{j=1}^{r} (-1)^{j+1} x_1^{\alpha_1 - i_1(j)} \cdots x_j^{\alpha_j} \cdots x_r^{\alpha_r - i_r(j)} \cdot P_j(x_1, \dots, x_r) \right) = \frac{1}{x_1^{\alpha_1} \cdots x_r^{\alpha_r}} \left( \sum_{j=1}^{r} (-1)^{j+1} x_j^{\alpha_j} P_j(x_1, \dots, x_r) \right), \]
where
\[ \alpha_k = \max_{1 \le j \le r} i_k(j), \qquad k = 1, \dots, r. \]
Let us denote this equivalence by $E$. We show that the factor space
\[ \left\{ \frac{P(x_1, \dots, x_r)}{x_1^{i_1} \cdots x_r^{i_r}} \right\} \Big/ E \]
is infinite dimensional over the field $k$. To this end it is sufficient to remark that, whenever there exists an index $j$ such that $i_j \ne k_j$, the expressions
\[ I_1 = \frac{1}{x_1^{i_1} \cdots x_r^{i_r}} \quad \text{and} \quad I_2 = \frac{1}{x_1^{k_1} \cdots x_r^{k_r}}, \qquad i_j > 0, \; k_j > 0, \quad j = 1, \dots, r, \]
must lie in different cosets. Let us set
\[ \alpha_j = \max(i_j, k_j), \qquad j = 1, \dots, r. \]
Then
\[ I_1 - I_2 = \frac{1}{x_1^{\alpha_1} \cdots x_r^{\alpha_r}} \left( x_1^{\alpha_1 - i_1} \cdots x_r^{\alpha_r - i_r} - x_1^{\alpha_1 - k_1} \cdots x_r^{\alpha_r - k_r} \right). \]
We must show that the expression within parentheses cannot be written in the form
\[ \sum_{j=1}^{r} (-1)^{j+1} x_j^{\alpha_j} P_j(x_1, \dots, x_r). \]
Without loss of generality we can assume that there exists an index $s$ such that $\alpha_1 = i_1, \dots, \alpha_s = i_s$, $\alpha_{s+1} = k_{s+1}, \dots, \alpha_r = k_r$. The expression within parentheses takes the form
\[ x_{s+1}^{\alpha_{s+1} - i_{s+1}} \cdots x_r^{\alpha_r - i_r} - x_1^{\alpha_1 - k_1} \cdots x_s^{\alpha_s - k_s}, \tag{2} \]
where $\alpha_{s+1} - i_{s+1} < \alpha_{s+1}, \dots, \alpha_r - i_r < \alpha_r$; $\alpha_1 - k_1 < \alpha_1, \dots, \alpha_s - k_s < \alpha_s$. But these inequalities show that (2) cannot be expressed in the form
\[ \sum_{j=1}^{r} (-1)^{j+1} x_j^{\alpha_j} P_j(x_1, \dots, x_r). \]
Our statement is proven.
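The count behind Theorem 1.4 can be illustrated by a short sketch. It rests on an observation that is standard but not spelled out in the text, and is therefore an assumption here: the Čech differential for the covering $U_i = (x_i \ne 0)$ respects the grading by Laurent monomials, so a monomial $x_1^{e_1} \cdots x_r^{e_r}$ is a coboundary exactly when some exponent $e_j$ is non-negative. The function names below are hypothetical and serve only as illustration.

```python
from itertools import product

def is_coboundary(exponents):
    """Assumed criterion: a Laurent monomial is a coboundary exactly when
    some exponent e_j is >= 0, since it is then regular on the
    intersection of all U_i with i != j."""
    return any(e >= 0 for e in exponents)

def surviving_monomials(r, bound):
    """Laurent monomials in r variables with exponents in [-bound, bound]
    that are not coboundaries, i.e. all of whose exponents are negative."""
    box = range(-bound, bound + 1)
    return [e for e in product(box, repeat=r) if not is_coboundary(e)]

# For r = 2 the surviving monomials are 1/(x^k y^l) with 1 <= k, l <= bound.
classes = surviving_monomials(r=2, bound=3)
print(len(classes))  # 9
```

Under this assumption the number of independent classes found in the box grows as `bound ** r`, which is consistent with the infinite dimensionality asserted in the theorem.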
3. As has been proved by M. Kneser, in a 3-dimensional space $X$ an irreducible curve $E$ can be expressed as the intersection of three algebraic surfaces, which we denote by $V_0, V_1, V_2$. In view of $E = \bigcap_i V_i$ we have for $X' = X \setminus E$ the open affine covering $\mathcal{U} = (U_i = X \setminus V_i)$ and we can apply the following theorem of Serre [11]. Let $X$ be an algebraic variety, $F$ a coherent sheaf on $X$ and $\mathcal{U} = (U_i)$ a finite affine covering of $X$. Then for each $i \ge 0$ the homomorphism $\sigma(\mathcal{U}) : H^i(\mathcal{U}, F) \to H^i(X, F)$ is an isomorphism. As $\dim \mathcal{U} = 2$, we have by this theorem $H^3(X', F) = 0$ for all coherent sheaves on $X'$.

There arises an interesting question: for an algebraic curve $E$ and a coherent sheaf $F$ on $X'$, can the group $H^2(X', F)$ be different from zero? This is connected with the conjecture that it is impossible to express an arbitrary curve in a 3-dimensional space by two surfaces. Indeed, we would have a proof of this negative statement if for some curve $E$ the answer to the question posed were positive. The question of the non-triviality of $H^2(X', F)$ arises also in connection with the conjecture that each vector bundle of rank 2 on a 3-dimensional affine space is trivial. Indeed, Serre proved in [12] that if this problem has a positive solution then each non-special rational or elliptic curve in a 3-dimensional affine space would be a complete intersection. Therefore this conjecture would be refuted if in a 3-dimensional affine space one could find a rational or elliptic curve $E$ such that $H^2(CE, F) \ne 0$ for some coherent sheaf $F$. This shows that the question perhaps could be solved in terms of homological algebra.

In [6] Hartshorne introduced the notion of local connectedness of a variety in codimension 1, which refers to the situation when the removal of a subvariety of codimension greater than unity does not disturb the connectivity structure of the variety. He obtains a necessary condition for a variety to be a complete intersection, which amounts to local connectedness of this variety in codimension 1. It turns out that the vanishing of the groups $H^i(X \setminus V, \mathcal{O})$, $i \ge 2$, is not sufficient for the representability of the variety $V$ as a complete intersection. This is shown by the following example.

Consider in the complex affine space $X = \mathbb{C}^4$ with the Zariski topology the variety $V$ which is the union of two planes: $x_1 = x_2 = 0$ and $x_3 = x_4 = 0$. It is clear that at the origin this variety is not connected in codimension 1 and so it cannot be a complete intersection. However, a computation reveals that $H^2(CV, \mathcal{O}) = H^3(CV, \mathcal{O}) = 0$, where [as before] $CM$ denotes the complement in $\mathbb{C}^4$ of a set $M$. We have
\[ X' = CV = C[(x_1 = x_2 = 0) \cup (x_3 = x_4 = 0)] = C(x_1 = x_2 = 0) \cap C(x_3 = x_4 = 0) = \bigcup_{i=0}^{3} U_i, \]
where $U_0 = (x_1 \ne 0) \cap (x_3 \ne 0)$, $U_1 = (x_1 \ne 0) \cap (x_4 \ne 0)$, $U_2 = (x_2 \ne 0) \cap (x_3 \ne 0)$, $U_3 = (x_2 \ne 0) \cap (x_4 \ne 0)$. Take in $X'$ the covering $\mathcal{U} = (U_i)$. By Serre's theorem $H^i(X', \mathcal{O}) \approx H^i(\mathcal{U}, \mathcal{O})$, $i = 2, 3$. Let us compute $H^3(\mathcal{U}, \mathcal{O})$. To this end we remark that all 3-dimensional cochains are of the form
\[ \frac{P(x, y, z, t)}{x^k y^l z^m t^n}. \]
It is clear that all 3-dimensional cochains are cocycles. For all $j = 0, 1, 2, 3$ we have $\bigcap_{i \ne j} U_i = \bigcap_i U_i$. Therefore, all restriction homomorphisms
\[ \rho_i : \Gamma\Big(\bigcap_{j \ne i} U_j\Big) \to \Gamma\Big(\bigcap_j U_j\Big), \qquad i = 0, 1, 2, 3, \]
are isomorphisms. It is now easy to see that all 3-dimensional cochains are coboundaries, hence $H^3(\mathcal{U}, \mathcal{O}) = 0$. An analogous reasoning reveals that also the groups $H^2(\mathcal{U}, \mathcal{O})$ are trivial.

Note added in proof. R. Hartshorne, Cohomological dimension of algebraic varieties (Ann. Math. 3, 444–450 (1968)), has shown that $H^2(P^3 \setminus E, F) = 0$ for all $F$.
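The two set-theoretic facts used in the example with $\mathbb{C}^4$ can be checked mechanically. The sketch below describes each $U_i$ by the coordinates it requires to be non-zero; the set $U_2 = (x_2 \ne 0) \cap (x_3 \ne 0)$ does not appear in the printed list and is assumed here to be the remaining combination. Membership in these sets depends only on which coordinates vanish, so it suffices to test the 16 zero patterns. All names are illustrative.

```python
from itertools import combinations, product

# U_i is encoded by the (1-based) coordinates that must be non-zero on it.
# U_2 is an assumption: it is missing from the printed list.
COVER = {0: {1, 3}, 1: {1, 4}, 2: {2, 3}, 3: {2, 4}}

def in_V(p):
    """V is the union of the planes x1 = x2 = 0 and x3 = x4 = 0."""
    x1, x2, x3, x4 = p
    return (x1 == 0 and x2 == 0) or (x3 == 0 and x4 == 0)

def in_U(i, p):
    return all(p[c - 1] != 0 for c in COVER[i])

# Only the zero pattern of a point matters, so 0/1 coordinates suffice.
patterns = list(product((0, 1), repeat=4))
covers_complement = all(
    (not in_V(p)) == any(in_U(i, p) for i in COVER) for p in patterns
)

# An intersection of several U_i requires the union of their coordinate sets.
all_coords = set().union(*COVER.values())
triples_equal_full = all(
    set().union(*(COVER[i] for i in triple)) == all_coords
    for triple in combinations(COVER, 3)
)

print(covers_complement, triples_equal_full)  # True True
```

The second check is the statement that every triple intersection already coincides with the intersection of all four sets, which is what makes the restriction homomorphisms in the argument bijective.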
Comments. The reference [5] has now appeared in Springer Lecture Notes in Mathematics 41 (1967). The reference [4] has been published by North-Holland/Masson 1968 as Volume 2 in the series Advanced Studies in Pure Mathematics. The problem of [12], mentioned at the end of Kaljulaid's paper, has now been solved: the conjecture of Serre that all projective modules over a polynomial ring are free (i.e. that algebraic vector bundles over $k^n$ are trivial) has been proved independently by Quillen [8] and Suslin [13] (cf. also [2]).

Jan-Erik Roos
References
[1] H. Cartan and S. Eilenberg. Homological Algebra. Princeton Landmarks in Mathematics. Princeton University Press, Princeton, 1999. Reprint of the 1956 original.
[2] D. Ferrand. Les modules projectifs de type fini sur un anneau de polynômes sur un corps sont libres. In: Séminaire Bourbaki, Vol. 1975/76. Springer-Verlag, Berlin, 1977, 202–221.
[3] R. Godement. Topologie algébrique et théorie des faisceaux. Technical Report 13. Actualités Sci. Ind., no. 1252, Publ. Math. Univ. Strasbourg, Hermann, Paris, 1958. Russian translation: Moscow, 1961.
[4] A. Grothendieck. Cohomologie locale des faisceaux cohérents et théorèmes de Lefschetz locaux et globaux. Technical Report exposé 8, 8-2-4, I.H.E. Séminaire de Géométrie Algébrique, 1962.
[5] A. Grothendieck. Local Cohomology. Technical Report. Lecture notes by R. Hartshorne. Harvard University, 1961.
[6] R. Hartshorne. Complete intersections and connectedness. Amer. J. Math. 84, 1962, 497–508.
[7] M. Nagata. Imbedding of an abstract variety in a complete variety. J. Math. Kyoto Univ. 2, 1962, 1–10.
[8] D. Quillen. Projective modules over polynomial rings. Invent. Math. 36, 1976, 167–171.
[9] J. Sampson and G. Washnitzer. A Vietoris mapping theorem for algebraic projective fibre bundles. Ann. Math. 68, 1958, 348–371.
[10] J. Sampson and G. Washnitzer. A Künneth formula for coherent algebraic sheaves. Illinois J. Math. 3, 1959, 389–402.
[11] J. P. Serre. Faisceaux algébriques cohérents. Ann. Math. 61, 1955, 191–278.
[12] J. P. Serre. Sur les modules projectifs. Technical Report 14-e année, no. 2. Séminaire Dubreil-Pisot, Algèbre et Théorie des nombres, 1960.
[13] A. A. Suslin. Projective modules over polynomial rings are free. Dokl. Akad. Nauk SSSR 229, 1976, 1063–1066.
2. [K77a] Triangular products of representations of semigroups and associative algebras
Revised by J. Peetre, comments by R. Lipyanskiĭ
The triangular product in the theory of varieties of representations of groups plays a role analogous to that of the wreath product for group varieties. In this note we study the triangular product of representations of semigroups and associative algebras. We assume that $K$ is a field. This is required in the main results of the paper, although the principal constructions and notions can be introduced for any associative and commutative ring $K$ with unit.

For pairs $(G, \Gamma)$ such that the semigroup (algebra) $\Gamma$ acts by semigroup (algebra) endomorphisms on the $K$-module $G$, one can introduce, exactly as in the case of groups, a system of notions and constructions. A variety of representations of semigroups and algebras is a saturated Birkhoff class of the corresponding pairs. By definition, a class $\mathcal{K}$ of pairs is termed saturated if for all right epimorphisms of pairs $(G, \Gamma) \to (G', \Gamma')$ with $(G', \Gamma') \in \mathcal{K}$ it holds that $(G, \Gamma) \in \mathcal{K}$. The variety generated by the class $\mathcal{K}$ will be denoted $\mathrm{Var}\,\mathcal{K}$. Multiplication of two varieties $\Theta_1$ and $\Theta_2$ is defined by the rule: a pair $(G, \Gamma)$ is contained in $\Theta_1 \cdot \Theta_2$ if $G$ has a $\Gamma$-invariant submodule $H$ such that $(H, \Gamma) \in \Theta_1$ and $(G/H, \Gamma) \in \Theta_2$. There arises the semigroup $M(K)$ (the semigroup $L(K)$) of varieties of representations of semigroups (algebras). The semigroup $M(K)$ is anti-isomorphic to the semigroup of those ideals of the semigroup ring $F = K\Psi$ of the free monoid $\Psi$ with a countable set of free generators that are invariant with respect to all endomorphisms of $F$ induced by endomorphisms of the monoid $\Psi$. The semigroup $L(K)$ is anti-isomorphic to the semigroup $T(K)$ of non-zero ideals of the free associative $K$-algebra $F$ of countable rank (with respect to the usual multiplication of ideals of $F$).

1. For pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ we set $\Phi = \mathrm{Hom}_K^+(B, A) \subset \mathrm{End}_K(A \oplus B)$. The natural action of the semigroups $\Sigma_1$ and $\Sigma_2$ on the (additive) semigroup $\Phi$ allows us to introduce a multiplication in $\Phi \times \Sigma_1 \times \Sigma_2$,
\[ (\varphi, \sigma_1, \sigma_2) \cdot (\varphi', \sigma_1', \sigma_2') = (\sigma_2 \cdot \varphi' + \varphi \cdot \sigma_1', \; \sigma_1 \sigma_1', \; \sigma_2 \sigma_2'). \]
There arises the semigroup $\Gamma = \Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$; its action on $G = A \oplus B$, which goes according to the formula
\[ (a + b) \circ (\varphi, \sigma_1, \sigma_2) = a \circ \sigma_1 + b\varphi \circ \sigma_1 + b \circ \sigma_2, \]
gives the pair $(G, \Gamma)$, which will be denoted $(A, \Sigma_1) \bigtriangledown (B, \Sigma_2)$ and called the triangular product of the given pairs. The properties of this construction are in many respects parallel to the properties of the triangular product of group pairs (B. I. Plotkin, 1971, [3]). Let us remark that $\Gamma$ is a group if and only if $\Sigma_1$ and $\Sigma_2$ are groups and $\Phi$ is treated as the additive closure to a group of the semigroup $\mathrm{Hom}_K^+(B, A)$.

THEOREM 2.1. The following formula holds true:
\[ \mathrm{Var}(\mathcal{K}_1 \bigtriangledown \mathcal{K}_2) = \mathrm{Var}\,\mathcal{K}_1 \cdot \mathrm{Var}\,\mathcal{K}_2. \]
From this one deduces that the varieties of linear representations (over a field) form a semigroup with a unique decomposition of each element as a product of a finite number of indecomposable varieties.

2. The questions under study are also connected with automata theory. A linear semigroup automaton $\mathcal{A} = (A, \Gamma, B)$ is a system, where $A$ (the states) and $B$ (the outputs) are $K$-modules, while $\Gamma$ (the input signals) is a semigroup, and there are given $K$-linear operations $A \circ \Gamma \to A$ and $A * \Gamma \to B$ such that $(A, \Gamma)$ is a linear pair with respect to the action $\circ$, and $a * \gamma_1 \gamma_2 = (a \circ \gamma_1) * \gamma_2$ for all $a \in A$, $\gamma_1, \gamma_2 \in \Gamma$. The automaton $\mathcal{A}' = (A', \Gamma, B')$ is called an invariant subautomaton of $\mathcal{A}$ if $A'$ and $B'$ are $K$-submodules in $A$ and $B$ respectively and $A' \circ \Gamma \subset A'$, $A' * \Gamma \subset B'$. By definition an automaton $\mathcal{A}$ belongs to the product of two varieties of linear automata $\Theta_1$ and $\Theta_2$ if there exists an invariant subautomaton $\mathcal{A}' \subset \mathcal{A}$ such that $\mathcal{A}' \in \Theta_1$ and $\mathcal{A}/\mathcal{A}' \in \Theta_2$.

THEOREM 2.2. The varieties of linear automata, with the multiplication indicated, form a semigroup which is not free but contains a free subsemigroup isomorphic to $M(K)$.

3. Let there be given $K$-algebras $\Phi$ and $\bar\Sigma$, where $\bar\Sigma$ acts from the left and the right on $\Phi$ and this is a bimultiplication in the sense of Hochschild. On $\bar\Gamma = \Phi \oplus \bar\Sigma$ we retain the definition of addition and multiplication by scalars, while multiplication is defined anew by putting
\[ (\varphi, \sigma) \cdot (\varphi', \sigma') = (\varphi \cdot \sigma' + \sigma \cdot \varphi' + \varphi\varphi', \; \sigma\sigma'). \]
There arises the $K$-algebra $\Phi \leftthreetimes \bar\Sigma$, which is the semidirect product of the algebras $\Phi$ and $\bar\Sigma$.

For given pairs $(G_1, \Sigma_1)$ and $(G_2, \Sigma_2)$, where the $\Sigma_i$ are $K$-algebras, we let $(G_1, \bar\Sigma_1)$ and $(G_2, \bar\Sigma_2)$ be the corresponding faithful pairs and set $G = G_1 \oplus G_2$. We treat $\bar\Sigma = \bar\Sigma_1 \oplus \bar\Sigma_2$ and $\Phi = \mathrm{Hom}_K(G_2, G_1)$ as subalgebras of $\mathrm{End}_K G$. Multiplication in $\mathrm{End}_K G$ defines a left and a right action of $\bar\Sigma$ on $\Phi$. Setting $\Sigma = \Sigma_1 \oplus \Sigma_2$ we obtain a natural epimorphism $f : \Sigma \to \bar\Sigma$, so that we can extend the action of $\bar\Sigma$ on $\Phi$ to an action of $\Sigma$ on $\Phi$. We arrive at the algebra $\Phi \leftthreetimes \Sigma = \Gamma$ whose action on $G = G_1 \oplus G_2$ is given by the formula
\[ (g_1 + g_2) \circ (\varphi, \sigma) = g_2\varphi + (g_1 + g_2) \circ \sigma. \]
This action agrees with the operations on $\Gamma$. There arises the pair $(G, \Gamma)$, which is the triangular product of the representations $(G_1, \Sigma_1)$ and $(G_2, \Sigma_2)$, and which we denote by $(G_1, \Sigma_1) \bigtriangledown (G_2, \Sigma_2)$.

THEOREM 2.3 (Main theorem). For any two classes $\mathcal{K}_1$ and $\mathcal{K}_2$ of representations there holds the formula
\[ \mathrm{Var}(\mathcal{K}_1 \bigtriangledown \mathcal{K}_2) = \mathrm{Var}\,\mathcal{K}_1 \cdot \mathrm{Var}\,\mathcal{K}_2. \]
It follows from this that each non-trivial variety of representations of algebras decomposes uniquely as a finite product of indecomposable varieties. Thus, the semigroup $L(K)$ is free. This opens a new door to the result of Bergman and Lewin [1] on the freeness of the semigroup $T(K)$.

Here we have supplementary possibilities. It is known that the set $A(K)$ of proper varieties of (associative) $K$-algebras is in a bijective correspondence with the set $T(K)$. Multiplication in $T(K)$ now induces on $A(K)$ an associative multiplication, which we denote by $*$. We are led to the following results. For a $K$-algebra $A$ let $A^*$ be the result of an outer adjunction of a unit to it, and let $\mathrm{Var}\,A$ be the variety of $K$-algebras generated by $A$. Let us introduce for any $K$-algebras $A$ and $B$ the operation of wreath product by the formula
\[ A \,\mathrm{wr}\, B = \mathrm{Hom}_K(B^*, A^*) \leftthreetimes (A \oplus B), \]
where $A^*$ and $B^*$ are regarded as $K$-modules. The justification of this name is given by the functional role of this operation, which is disclosed by the formula
\[ (\mathrm{Var}\,A^*) * (\mathrm{Var}\,B^*) = \mathrm{Var}(B \,\mathrm{wr}\, A). \]
By definition a $T$-ideal is finitary if the variety of $K$-algebras defined by it is generated by a finite dimensional $K$-algebra. It turns out that a finite product of $T$-ideals is finitary when all factors are finitary. If a variety of $K$-algebras is given with the aid of identities in $n$ variables then it cannot be decomposed into more than $n$ factors. In particular, the semisimplicity (in the sense of Jacobson) of a $K$-algebra $A$ forces $\mathrm{Var}\,A$ to be indecomposable.

The author is obliged to Professor B. I. Plotkin for supervising this work, and for his valuable advice and interesting discussion, and, furthermore, to G. Bergman for sending him his preprint. [3], [2]
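The multiplication rule of item 3 above, $(\varphi, \sigma)(\varphi', \sigma') = (\varphi\sigma' + \sigma\varphi' + \varphi\varphi', \sigma\sigma')$ with $\Phi = \mathrm{Hom}_K(G_2, G_1)$ (so that $\varphi\varphi' = 0$), can be modelled by block-triangular matrices acting on row vectors $(g_1, g_2)$. The sketch below is only an illustration under that assumed matrix model; the helper names are hypothetical, and integer matrices stand in for $K$-linear maps.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def tri_mul(p, q):
    """(phi, s1, s2) * (phi', s1', s2') = (phi s1' + s2 phi', s1 s1', s2 s2').
    The term phi phi' is omitted: it vanishes for phi in Hom(G2, G1)."""
    (F, S1, S2), (Fp, S1p, S2p) = p, q
    return (matadd(matmul(F, S1p), matmul(S2, Fp)),
            matmul(S1, S1p), matmul(S2, S2p))

def as_block(p):
    """Assumed model: the block matrix [[s1, 0], [phi, s2]] on G1 + G2."""
    F, S1, S2 = p
    n2 = len(S2)
    top = [row + [0] * n2 for row in S1]
    bottom = [F[i] + S2[i] for i in range(n2)]
    return top + bottom

def act(g, p):
    """(g1 + g2) o (phi, sigma) = g2 phi + (g1 + g2) o sigma, g a row vector."""
    return matmul(g, as_block(p))

# dim G1 = 2, dim G2 = 1; phi is a 1 x 2 matrix.
p = ([[1, 2]], [[1, 1], [0, 1]], [[2]])
q = ([[0, 3]], [[2, 0], [1, 1]], [[5]])
g = [[1, -1, 4]]

same_product = as_block(tri_mul(p, q)) == matmul(as_block(p), as_block(q))
same_action = act(act(g, p), q) == act(g, tri_mul(p, q))
```

In this model the agreement of the action with the operations on $\Gamma$ reduces to associativity of matrix multiplication, which is what the two comparisons at the end check on one example.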
Comments. It is known that if $\mathcal{K}_1$ and $\mathcal{K}_2$ are two classes of group representations over a field and $\mathcal{K}_1 \bigtriangledown \mathcal{K}_2$ is their triangular product, then $\mathrm{Var}(\mathcal{K}_1 \bigtriangledown \mathcal{K}_2) = \mathrm{Var}\,\mathcal{K}_1 \cdot \mathrm{Var}\,\mathcal{K}_2$ [4]. Uno Kaljulaid extends this result to representations of semigroups, associative algebras and linear automata. In this way he obtains another proof of the Bergman-Lewin theorem [1] that the semigroup of $T$-ideals (verbal ideals of the absolutely free associative algebra over a field) is free. He introduces a new operation on associative algebras, the wreath product of algebras, and proves some interesting properties of this operation: $(\mathrm{Var}\,A^*) * (\mathrm{Var}\,B^*) = \mathrm{Var}(B \,\mathrm{wr}\, A)$, where $A^*$ (and $B^*$) is obtained from the algebra $A$ (and $B$) by adjoining a unit to it. The author also investigates the decomposition of a finitary $T$-ideal into indecomposable factors. He gives a sufficient condition for indecomposability: if a $K$-algebra $A$ is semisimple (in the sense of Jacobson), then $\mathrm{Var}\,A$ is indecomposable. There are also other sufficient conditions for the above-mentioned properties. I think that this paper of Uno Kaljulaid was a pioneering work in the theory of varieties of semigroup representations and varieties of linear automata. His results also extend the above-mentioned Bergman-Lewin theorem.

Ruvim Lipyanskiĭ
References
[1] G. Bergman and J. Lewin. The semigroup of ideals of a fir is (usually) free. J. London Math. Soc. 11 (2), 1975, 21–31.
[2] G. Birkhoff. The role of algebra in computing. Computers in algebra and number theory, SIAM-AMS Proc., Amer. Math. Soc. IV, 1971, 1–47.
[3] B. I. Plotkin and A. S. Grinberg. On semigroups of varieties, connected with group representations. Siberian Math. Journal 13, 1972, 841–858.
[4] B. I. Plotkin. Multiplicative systems of varieties of pairs – group representations. Latvian Mathematical Yearbook 18, 1976, 143–169, 223.
3. [K79a] Triangular products and stability of representations. Candidate dissertation
Translation by J. Peetre, revised by K. Kaarli
Contents of the dissertation

Introduction
1. The triangular product
1.1. Triangular products of group representations
1.2. Triangular products of semigroup representations
1.3. Triangular products of representations of algebras
1.4. Connections between $\bigtriangledown$-constructions
1.5. Comments
2. The arithmetics of varieties of representations of semigroups and algebras
2.1. Varieties of linear pairs and automata
2.2. Technical results
2.3. The fundamental lemma
2.4. The theorem on generating representations of semigroups
2.5. Consequences. Connections with linear automata
2.6. The theorem on generating representations of algebras
2.7. Comments
3. Powers of the fundamental ideal and stability of representations of groups and semigroups
3.1. Preliminary topics; on the terminal of nilpotent groups
3.2. Construction of stable representations of groups with the aid of the triangular product
3.3. Generalized measure subgroups of finite groups
3.4. Mal'cev nilpotency and stability of semigroups
3.5. Comments and remarks
References
Introduction

In various branches of mathematics and its applications there arises a need to use representations, and so problems of their classification become urgent; cf. [18, 20, 32, 46, 47, 50, 54, 65, 66, 86]. If one takes into account that a representation is a two-sorted algebraic system (a pair), then the systematics of representations is facilitated. The book [35] is written from this point of view, and there is further evidence of this in the note [57] and in the survey [41].

The naturality and usefulness of the study of classes of algebraic systems was often emphasized by A. I. Mal'cev; for example, in [30]. The reduction of classes to "simpler" ones is one of the fundamental problems of this direction; as an example, let us mention the result of A. L. Shmel'kin and the Neumanns on the freedom of the semigroup of varieties of groups ([34, Theorem 23.4]). (Editors' note. The Neumanns are three well-known group theorists: Bernhard and Hanna Neumann and their son Peter M. Neumann.)

The problem of decomposition has always been an essential ingredient of every theory of representations: the classical theory of reduction to irreducible linear representations of a fixed group (§14 in the book [48] by D. A. Suprunenko) or the reduction to indecomposable varieties of representations with a variable group (the paper [43] by B. I. Plotkin and A. S. Grinberg, as an example). Indecomposable classes, as the "simplest blocks" in a given theory, cannot be reduced to simpler classes and have to be studied separately. On the other hand, for the reduction to indecomposable classes one needs tools for doing the decomposition. In the theory of varieties of groups this role is played by the wreath product of groups, and in the case of representations with a variable group by the construction of the triangular product ($\bigtriangledown$-product) of group representations.

According to B. I. Plotkin [36], the pair $(G, \Gamma)$ is the triangular product of its subpairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ if the following conditions are fulfilled:
(1) for the subgroup $\Sigma = \{\Sigma_1, \Sigma_2\} \le \Gamma$, generated by the two subgroups $\Sigma_1$ and $\Sigma_2$, the subpair $(G, \Sigma)$ decomposes into the direct product of its subpairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$;
(2) in the group $\Gamma$ there exists a normal subgroup $\Phi$ such that the subrepresentation $(G, \Phi)$ is faithful, and the image of $\Phi$ in $\mathrm{Aut}\,G$ coincides with the centralizer of the series $0 \subset A \subset G$, that is, it acts as identity on each factor of this series;
(3) the group $\Gamma$ coincides with the semi-direct product $\Phi \leftthreetimes \Sigma$.

The object of this thesis is the study of the triangular product and its applications. It consists of two parts. The goal of the first part (Sections 3.1 and 3.2) is to find the $\bigtriangledown$-construction for representations of semigroups and algebras, to study the properties of these tools and their application to the decomposition of the varieties of the corresponding representations. In the second part (Section 3.3) the triangular product is applied to the study of the powers of the fundamental ideal of group rings.

Representations by endomorphisms of modules, semigroups and algebras have been the subject of many studies by A. V. Mihalev [33] and L. M. Gluskin [10]. On the other hand, the representation of rings by endomorphisms of $\mathbb{Z}$-modules is a classical research topic. The tendency towards a category-theoretic formulation of the classification of representations makes urgent the problem of the decomposition of classes of representations of various algebraic objects (groups, semigroups, algebras etc.), as the elaboration of a general theory requires an understanding of the possible deviations and their arrangement into a coherent series of notions, constructions, and results. This is one of the reasons why the introduction and the study of tools for the decomposition of representations of semigroups and algebras deserve attention. The known difficulties in carrying over results for groups and their representations to semigroups increase the interest in cases and ways where this is possible; concerning that see [24, Chapter 7].
The essential results of the first part of this thesis concern the search for suitable constructions for representations of semigroups and algebras, the study of several of their properties, and further the establishing of connections between these new constructions and the triangular products of representations of groups. There exists a cryptomorphism (in the sense of G. Birkhoff [58]) of the three constructions mentioned, although even their definitions are quite different, and, as a result, sometimes there are considerable differences in the proofs of properties.

The varieties of representations of semigroups admit an associative multiplication and the corresponding semigroup is factorial. This follows from the main result of Section 3.2, Theorem 3.33 about the generating representations. Such results are also obtained for algebras (Theorem 3.43 and Theorem 3.44). The role of the $\bigtriangledown$-constructions, introduced in Section 3.1, in the proof of these facts is analogous to the role of the wreath product in the proof of the above-mentioned group-theoretic theorem of Shmel'kin and the Neumanns. For the theorem on freedom of the semigroup of varieties of linear representations of semigroups there is also a proof in terms of the semigroup ring of the free countably generated monoid; the extract of the reasoning needed is well known from [56].

Among the consequences of the theorem on generating representations of algebras (Theorem 3.43) let us mention the theorem of Bergman and Lewin on the freedom of the semigroup of $T$-ideals, which in [56] is proved by means of the theory of FI-rings of Cohn [5]. Here the corresponding fact is interpreted as a statement about the freedom of the semigroup of varieties of representations of algebras, and in this form it readily follows from Theorem 3.43. The given approach, however, allows one to penetrate more deeply into the essence of the matter. For example, for given finite dimensional (over a fixed field $K$) pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ the identities for the variety $\mathrm{Var}(A, \Sigma_1) \cdot \mathrm{Var}(B, \Sigma_2)$ are readily found: these are exactly the identities for $\mathrm{Var}(G, \Gamma)$, where $(G, \Gamma) = (A, \Sigma_1) \bigtriangledown (B, \Sigma_2)$. It might be rather difficult to obtain such a result by means of multiplication of $T$-ideals. There are also other applications of the material in Section 3.2 concerning representations of algebras in the theory of associative algebras themselves. Let us mention a necessary condition for the indecomposability of a variety of algebras (Theorem 3.49).

The technique developed in the first part of this thesis is tightly connected with automata theory [31]. After the interaction of this discipline with the theory of algebras (V. M. Glushkov [9]) an essential result was established, namely the theory of decomposition of finite automata. Nowadays algebraic methods in automata theory are developing rather intensively; cf. [18], [42], [65], [45] etc., and furthermore Eilenberg's book [66] gives a detailed analysis of the corresponding methods in a modern presentation. The present author has introduced the semigroup of varieties of linear automata and has given a description of it in the language of pairs of "consistent" ideals in the free countably generated associative algebra (over the given field), which makes it possible to establish interesting properties of this semigroup (Theorem 3.37).

Since the very beginning of the theory of group representations a major role has been played by the group and semigroup algebras. In this connection it was observed that the application of the ideas and methods of the (general) theory of algebras and their representations to group algebras was fertile, and even the group algebras themselves turned out to be a subtle tool of calculation in the study of the structure of groups. The papers [81, 98–100] give convincing evidence of the great heuristic value of group and semigroup algebras in Combinatorics. The situation here is reminiscent of the one in Number Theory, where many deep facts on integers are obtained by applying algebraic and analytic methods.

The main goal of Section 3.3 of this dissertation is the study of the stabilization of the powers of the fundamental ideal of an integral group ring, an issue that has been deeply investigated for more than a decade; cf., for example, the survey by A. V. Mihalev and A. E. Zaleskiĭ [51], the lectures by A. A. Bovdi [3] or the book by D. Passman [96]. Our choice of subject was stimulated by the deep and beautiful work of A. I. Mal'cev [27], K. Gruenberg [69] and B. Hartley [77], where the special role of nilpotence in this circle of ideas is likewise clearly set forth. In Sections 3.3.1–3.3.3 the possible values of the terminal are found for Artinian groups and the limit of finite groups is calculated. These results of the papers [13, 14] have been obtained independently and by other methods, and were in part generalized by Gruenberg and Roseblade [71], Sandling [102] and Hartley [80]. In Section 3.3 the methods of [13, 14] are developed, using moreover systematically the language and technique of the general theory of group representations, and, furthermore, a circle of ideas connected with the well-known theorem of L. A. Kaluzhnin [84] on nilpotence of a group acting faithfully and stably on a finite invariant series of another group, and some applications of it. The elements of such an approach were set forth by Hartley [76], but he uses it only for the interpretation of some results. Thanks to this approach, a self-contained presentation and, in several cases, a generalization and a considerable simplification of the proofs in [14, 71, 80] are achieved.

The paper [26] of A. I. Mal'cev on the possibility of embedding semigroups into a group gave rise to a well-known cycle of developments; in particular, there appeared results that are at first glance not connected with stability. With the goal of finding "good" classes of semigroups with cancellation embeddable in a group, A. I. Mal'cev found in [28] a notion of nilpotence for semigroups such that each such semigroup with cancellation is embeddable in a nilpotent group. Up to now the interest in this notion has not been considerable. The present author has made an attempt to unify the results of [28] with the above-mentioned theorem of Kaluzhnin. This leads to the necessity of reconsidering the notion of stability for semigroups of endomorphisms. This question, and further some properties of semigroup rings of locally nilpotent (in the sense of Mal'cev) semigroups, are treated in Section 3.3.4.

The papers [11–17] have been published on the theme of this dissertation. The main results have been communicated at the XI All Union Algebraic Colloquium (Kishinev, 1971); the All Union Symposium on Ring Theory, Algebras and Modules (Kääriku, 1976); at the Algebra Seminar of Tartu State University; at the Riga Algebra Seminar; at the Seminars of Higher Algebra and Rings and Modules at Moscow State University; and at the Minsk Algebra Seminar. Twice the material of the first two sections was used in a special lecture course in automata theory presented by the author himself at Tartu State University; the main contents of this course were set forth at the III Regional Conference-Seminar of leading lecturers of mathematics of the Belorussian, Latvian, Lithuanian, Estonian Soviet Republics and the Kaliningrad Oblast of the Soviet Union (Minsk, 1977).

Acknowledgement. The author is thankful to Prof. B. I. Plotkin for supervising this work, for his valuable advice, and generous support.
3.1. The triangular products
This section has a preparatory character. First of all, we treat representations as two-sorted entities (pairs) and carry over the corresponding definitions to the case when the acting object of a pair is a semigroup or an associative algebra. The main aim of the section is to introduce the triangular product of representations of semigroups and algebras and to study its properties and connections. Applications of the notions considered are given in Sections 3.2–3.3. We underline that although the main constructions and notions of this section can be introduced for an arbitrary associative and commutative unitary ring K, we prefer, for reasons of organization, to restrict ourselves in the first two sections to the case when K is a field.
3.1.1. Triangular products of group representations
1. The object of this first section is preparatory: to acquaint the Reader with the notion of triangular product for group pairs. This construction also turns out to be useful in our study of the fundamental ideal of group rings in Section 3.3 but, in the first place, it serves as a model for the analogous constructions of the triangular product of representations of semigroups and algebras.
2. Let A and B be any two groups. The set A^B of all functions B → A forms a group on which B acts according to the formula
∀ x, b ∈ B, f ∈ A^B, (f ◦ b)(x) = f(xb^{-1}).
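The action formula above can be checked mechanically on a small example. The following is a minimal sketch, assuming the illustrative choices B = S_3 (permutations of three points) and A = Z_2, neither of which comes from the text; the helper names `mul`, `inv` and `act` are hypothetical. It verifies that (f ◦ b1) ◦ b2 = f ◦ (b1b2), which is what makes (A^B, B) a pair.

```python
from itertools import permutations, product

# Illustrative assumption: B = S_3 realised as permutations of (0, 1, 2), A = Z_2.
B = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def mul(p, q):
    """Product pq in B: apply p first, then q."""
    return tuple(q[p[i]] for i in range(3))

def inv(p):
    """Inverse permutation of p."""
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def act(f, b):
    """(f o b)(x) = f(x b^{-1}); f is a dict from B to A."""
    b_inv = inv(b)
    return {x: f[mul(x, b_inv)] for x in B}

def all_functions():
    """Enumerate all functions B -> Z_2."""
    for values in product(range(2), repeat=len(B)):
        yield dict(zip(B, values))

def is_right_action():
    """Check (f o b1) o b2 == f o (b1 b2) for all f, b1, b2."""
    return all(act(act(f, b1), b2) == act(f, mul(b1, b2))
               for f in all_functions() for b1 in B for b2 in B)
```

The inverse in the formula is what turns the shift of the argument into a right action; replacing `b_inv` by `b` in `act` would give a left action instead.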
There arises the pair (A^B, B). We accompany this pair with the semidirect product A^B ⋋ B, which will be called the (complete) wreath product of A and B and denoted A wr B. Let us fix an associative-commutative ring K, for example K = Z, and let Γ be an arbitrary group. If there is given a representation of Γ by automorphisms of a certain K-module G, then one speaks of the (group) pair (G, Γ). Let (A, Σ1) and (B, Σ2) be any two group pairs, and let Φ = Hom_K(B, A) be the module of all K-homomorphisms of B into A. Defining an action of the groups Σ1 and Σ2 on Φ respectively by the formulae
∀ x ∈ B, σ1 ∈ Σ1, ϕ ∈ Φ, (ϕ ◦ σ1)(x) = ϕ(x) ◦ σ1
and
∀ x ∈ B, σ2 ∈ Σ2, ϕ ∈ Φ, (ϕ ◦ σ2)(x) = ϕ(x ◦ σ2^{-1}),
we arrive at the pairs (Φ, Σ1) and (Φ, Σ2). Moreover, as the actions of Σ1 and Σ2 on Φ commute, we can now define the pair (Φ, Σ1 × Σ2), to which corresponds the group Γ = Φ ⋋ (Σ1 × Σ2). The initial groups Φ and Σ1 × Σ2 are embedded in Γ: the group Φ via
Φ → Φ̄ = {ϕ̄ = (ϕ, 1) | ϕ ∈ Φ} ⊂ Γ,
while the group Σ1 × Σ2 can be identified with its image in Γ via the map (σ1, σ2) → (ε, σ1σ2). To the semidirect product Φ̄ ⋋ (Σ1 × Σ2) there corresponds the pair (Φ̄, Σ1 × Σ2), the representation of the group Σ1 × Σ2 by inner automorphisms of the group Φ̄. It is easy to convince oneself that (Φ̄, Σ1 × Σ2) ≅ (Φ, Σ1 × Σ2). Let G = A ⊕ B and let us define the pair (G, Φ). To this end we consider in G the series of submodules 0 ⊂ A ⊂ G. In the group Aut G we introduce the centralizer Z
of this series, that is, the group of all automorphisms that act as the identity on A and on G/A. The map Z → Φ that sends σ ∈ Z to the homomorphism b → b ◦ σ − b, b ∈ B, is, as is readily seen, an isomorphism between the groups Z and Φ. Hence, we have a right isomorphism of the pairs (G, Φ) and (G, Z). Next, let us turn to the following question. Let there be given the pairs (G, Φ) and (G, Σ), and set Γ = Φ ⋋ Σ. What are the necessary and sufficient conditions for the existence of the pair (G, Γ)? It turns out that if the condition
∀ g ∈ G, ϕ ∈ Φ, σ ∈ Σ, (g ◦ σ) ◦ ϕ = (g ◦ ϕ^{σ^{-1}}) ◦ σ
is fulfilled, then the actions of the groups Φ and Σ can be extended to an action of the group Γ on G. Applying this to the situation under consideration, we arrive at the pair (A ⊕ B, Hom_K(B, A) ⋋ (Σ1 × Σ2)) = (G, Γ), in which the action is given by the formula
∀ a ∈ A, b ∈ B, ϕ ∈ Φ, σ1 ∈ Σ1, σ2 ∈ Σ2, (a + b) ◦ ϕσ1σ2 = a ◦ σ1 + b^ϕ ◦ σ1 + b ◦ σ2 = (a + b^ϕ) ◦ σ1 + b ◦ σ2.
This pair (G, Γ) is called the triangular product of the pairs (A, Σ1) and (B, Σ2) and will be denoted by (A, Σ1) ▽ (B, Σ2). Let us add that the pairs (A, Σ1) and (B, Σ2) need not be faithful, and therefore the following formula is of interest:
Ker[(A, Σ1) ▽ (B, Σ2)] = Ker(A, Σ1) × Ker(B, Σ2).
The operation of triangular product is a covariant functor in the first argument in the category of linear group actions, which preserves exactness from the left and from the right. But if we consider as morphisms only right homomorphisms of pairs, the triangular product becomes a covariant functor in both arguments, preserving exactness from the left and from the right. These and many other properties of the ▽-operation on group pairs are proved in [36], to which we refer the interested Reader.
3. Let A be any Abelian group, and B1 and B2 arbitrary groups. Let us consider the group A^{B1} ⋋ B1 corresponding to the pair (A^{B1}, B1), where the action of B1 on A^{B1} is defined by the formula
∀ x, b ∈ B1, f ∈ A^{B1}, (f ◦ b)(x) = f(xb^{-1}).
For the regular pair (ZB2, B2) and the triangular product (G, Γ) = (A^{B1}, B1) ▽ (ZB2, B2), B. I. Plotkin established the formula
(3) Γ = Hom_Z(ZB2, A^{B1}) ⋋ (B1 × B2) ≅ A wr (B1 × B2).
One consequence of this fact deserves special attention because of an application in the last section.
THEOREM 3.1. Let A be an arbitrary (additively written) Abelian group, B an arbitrary group, and E the unit group. The acting group of the pair (A, E) ▽ (ZB, B) is isomorphic to A wr B.
PROOF. The proof amounts to applying the preceding formula to the pairs (A, E) and (ZB, B).
Let us also give a sketch of the proof of formula (3), because of the lack of a suitable reference.
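To make Theorem 3.1 concrete, here is a minimal sketch of the wreath product A wr B for the illustrative choices A = Z_2 and B = Z_3 (both are assumptions made only for this example). A Z-linear map ZB → A is determined by its values on the basis B, so Hom_Z(ZB, A) is identified with A^B, and elements of the acting group are stored as pairs (f, b). The multiplication rule (f, b)(f′, b′) = (f ◦ b′ + f′, bb′) is one common convention for a semidirect product built from a right action; the text does not fix a convention, so this choice is an assumption as well.

```python
from itertools import product

N_A, N_B = 2, 3  # illustrative assumption: A = Z_2, B = Z_3

def act(f, b):
    """(f o b)(x) = f(x b^{-1}), with B = Z_3 written additively."""
    return tuple(f[(x - b) % N_B] for x in range(N_B))

def wr_mul(g, h):
    """Assumed semidirect-product rule: (f, b)(f', b') = (f o b' + f', b b')."""
    (f, b), (f2, b2) = g, h
    shifted = act(f, b2)
    return (tuple((u + v) % N_A for u, v in zip(shifted, f2)), (b + b2) % N_B)

# All elements of A^B x B, i.e. of A wr B under the identification above.
ELEMENTS = [(f, b) for f in product(range(N_A), repeat=N_B) for b in range(N_B)]
UNIT = ((0,) * N_B, 0)

def is_associative():
    """Brute-force check of associativity over all triples."""
    return all(wr_mul(wr_mul(g, h), k) == wr_mul(g, wr_mul(h, k))
               for g in ELEMENTS for h in ELEMENTS for k in ELEMENTS)
```

The order of the group is |A|^|B| · |B|, and already for these small groups the product is not commutative, which distinguishes A wr B from the direct product A^B × B.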
Let there be given an arbitrary pair (A1, B1). It induces pairs (A1^{B2}, B2) and (A1^{B2}, B1), where the actions are given as follows: in the pair (A1^{B2}, B2) by the formula
∀ f ∈ A1^{B2}, b2, x ∈ B2, (f ◦ b2)(x) = f(x b2^{-1}),
and in the pair (A1^{B2}, B1) by the formula
∀ f ∈ A1^{B2}, b1 ∈ B1, x ∈ B2, (f ◦ b1)(x) = f(x) ◦ b1.
These actions on A1^{B2} commute and so there arises the pair (A1^{B2}, B1 × B2), in which the action is the following:
∀ f ∈ A1^{B2}, b1 ∈ B1, x, b2 ∈ B2, (f ◦ b1b2)(x) = ((f ◦ b1) ◦ b2)(x) = ((f ◦ b2) ◦ b1)(x).
Setting now A1 = A^{B1} in the constructed pair, we arrive at the pair ((A^{B1})^{B2}, B1 × B2). Let us first show that Γ ≅ (A^{B1})^{B2} ⋋ (B1 × B2) or, what amounts to the same, that there exists an isomorphism of pairs ((A^{B1})^{B2}, B1 × B2) ≅ (Φ, B1 × B2), where Φ denotes the Abelian group Hom_Z(ZB2, A^{B1}). To this end we associate to each function f : B2 → A^{B1} its Z-linear extension f* : ZB2 → A^{B1}, which yields an isomorphism of Abelian groups * : (A^{B1})^{B2} → Φ. Moreover, the isomorphism * agrees with the actions of the two pairs under consideration and thus meets the requirement. As, by definition, A wr (B1 × B2) = A^{B1×B2} ⋋ (B1 × B2), it suffices to establish the following isomorphism: (A^{B1})^{B2} ⋋ (B1 × B2) ≅ A^{B1×B2} ⋋ (B1 × B2). To this end, we define, with the help of the formula
∀ f ∈ (A^{B1})^{B2}, x ∈ B1, y ∈ B2, f^μ(x, y) = (f(y))(x),
the map μ : (A^{B1})^{B2} → A^{B1×B2}, which, as is readily seen, is an isomorphism of Abelian groups. The map μ can, moreover, be extended to an isomorphism of the semidirect products under consideration, as for each b1 ∈ B1 and b2 ∈ B2 we have (f ◦ b1b2)^μ = f^μ ◦ b1b2. We omit the last details.
3.1.2. Triangular products of semigroup representations
1. Let K be an arbitrary associative and commutative ring with unit. We say that we have a pair (G, Γ) if the semigroup Γ acts by K-endomorphisms on the K-module G. In other words, there is defined an algebraic operation G × Γ → G, which we denote by g ◦ γ, possessing the following properties: (1) for γ ∈ Γ fixed, the map g → g ◦ γ is a K-endomorphism of the module G; (2) for any g ∈ G, γ1, γ2 ∈ Γ, we have g ◦ (γ1γ2) = (g ◦ γ1) ◦ γ2. In the special case when Γ is a monoid with unit ε, one requires the supplementary condition: (3) for any g ∈ G, g ◦ ε = g.
Let us give a list of definitions connected with the notion of pair. By a homomorphism of pairs μ : (G, Γ) → (G′, Γ′) we mean a pair of homomorphisms: a K-homomorphism μ : G → G′ and a homomorphism μ : Γ → Γ′, connected by the condition
∀ g ∈ G, γ ∈ Γ, (g ◦ γ)^μ = g^μ ◦ γ^μ.
In this way we get the category of pairs, in which one can define all the usual algebraic notions.
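The defining properties (1) and (2) of a pair, and the notions of invariant submodule and of classes of equally acting elements, can be illustrated on a small finite example. This is a minimal sketch under assumptions that are not in the text: K = Z_2, G = K^2 written as row vectors, and Γ the multiplicative semigroup of upper triangular 2 × 2 matrices over Z_2 acting by right multiplication; all function names are hypothetical.

```python
from itertools import product

def vec_add(g, h):
    """Addition in the module G = Z_2^2."""
    return tuple((x + y) % 2 for x, y in zip(g, h))

def vec_mat(g, m):
    """Row vector times 2x2 matrix over Z_2: the action g o gamma."""
    return tuple(sum(g[i] * m[i][j] for i in range(2)) % 2 for j in range(2))

def mat_mul(m, n):
    """Product of 2x2 matrices over Z_2: the multiplication in Gamma."""
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

G = list(product(range(2), repeat=2))
# Assumed acting semigroup: upper triangular matrices (closed under products).
GAMMA = [((a, b), (0, d)) for a, b, d in product(range(2), repeat=3)]
H = [(0, 0), (0, 1)]  # candidate Gamma-invariant submodule

def is_pair(module, semigroup):
    """Check properties (1) and (2) of the definition of a pair."""
    endo = all(vec_mat(vec_add(g, h), m) == vec_add(vec_mat(g, m), vec_mat(h, m))
               for g in module for h in module for m in semigroup)
    assoc = all(vec_mat(g, mat_mul(m, n)) == vec_mat(vec_mat(g, m), n)
                for g in module for m in semigroup for n in semigroup)
    return endo and assoc

def is_invariant(sub, semigroup):
    """True if the subset is mapped into itself by every acting element."""
    return all(vec_mat(h, m) in sub for h in sub for m in semigroup)

def kernel_classes(module, semigroup):
    """Group the acting elements into classes that act identically on the module."""
    classes = {}
    for m in semigroup:
        classes.setdefault(tuple(vec_mat(g, m) for g in module), []).append(m)
    return list(classes.values())
```

In this example all eight elements of Γ act differently on G, while on the invariant submodule H only the lower right entry of a matrix matters, so the elements fall into two classes.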
The pair (H, Σ) is a subpair of (G, Γ) if H is a submodule of G, Σ is a subsemigroup of Γ, the submodule H is invariant with respect to the action of Σ, and the representation of Σ on H is induced by the given representation of the semigroup Γ. The kernel of a pair (G, Γ) is, by definition, the congruence Ker (G, Γ) of the semigroup Γ whose classes are the classes of elements of Γ which act equally on G. If Ker (G, Γ) is the equality relation on Γ, then we say that (G, Γ) is a faithful pair. A congruence of the pair (G, Γ) is a pair H, σ, where H is a Γ-invariant submodule of G and σ is a congruence on the semigroup Γ such that σ ≤ Ker (G/H, Γ). In a natural way one defines likewise the notion of factor pair, formulates and proves the homomorphism theorem and, furthermore, Remak’s theorem.³ Besides the usual homomorphisms of pairs one distinguishes also their one-sided homomorphisms. A left homomorphism is a homomorphism of the Γ-modules corresponding to these pairs. In the case of a right homomorphism of pairs, the pairs have one and the same domain of action, on which the homomorphism acts identically.
A variety of representations of semigroups is a saturated Birkhoff class of corresponding pairs. By definition, the class K is saturated if for any right epimorphism of pairs (G, Γ) → (G, Γ′) it follows from (G, Γ′) ∈ K that (G, Γ) ∈ K. To each variety Θ there corresponds a verbal function *Θ, which to each pair (G, Γ) associates the intersection *Θ(G, Γ) of all Γ-submodules H ⊂ G such that (G/H, Γ) ∈ Θ. It is clear that (G/*Θ(G, Γ), Γ) ∈ Θ. This verbal function has the following property. Let Θ1 and Θ2 be varieties of pairs. The relation (G, Γ) ∈ Θ1 · Θ2 is fulfilled if and only if (*Θ2(G, Γ), Γ) ∈ Θ1. On the other hand, to each variety of pairs Θ there corresponds a radical function Θ which associates to each pair (G, Γ) the sum of all Γ-submodules H in G for which (H, Γ) ∈ Θ. Moreover, let Θ1 and Θ2 be varieties of pairs. The relation (G, Γ) ∈ Θ1 · Θ2 is fulfilled if and only if (G/Θ1(G, Γ), Γ) ∈ Θ2. We limit ourselves to these remarks in order not to overburden the picture with details that only slightly modify the notions and reasonings of the group case [37, 40].
³ Translators’ note. The theorem of Krull-Remak-Schmidt was proved around 1925. This important result states that any finite dimensional A-module M, where A is an associative F-algebra over a field F, can be written in an essentially unique way as a direct sum of submodules which cannot themselves be written as direct sums of proper submodules. This reduces the problem of classification of A-modules to the determination of these so-called indecomposable modules.
2. Let there be given the semigroups Φ, Σ1 and Σ2; we agree to write the operation on Φ additively. We assume that Σ1 acts from the right on Φ and that Σ2 acts from the left on Φ; moreover, we require that these two actions intertwine element-wise.⁴ On the set of triples⁵ Γ = {(ϕ, σ1, σ2) | ϕ ∈ Φ, σ1 ∈ Σ1, σ2 ∈ Σ2} we define an operation by setting
(ϕ, σ1, σ2) · (ϕ′, σ1′, σ2′) = (σ2 · ϕ′ + ϕ · σ1′, σ1σ1′, σ2σ2′).
⁴ We denote these actions by σ2 · ϕ and ϕ · σ1, using the sign ‘·’ as distinct from the sign ‘◦’, which denotes the action of Γ on G.
⁵ The triple (ϕ, σ1, σ2) will in the sequel also be denoted ϕσ1σ2.
One can check that this operation is associative, so that the set of triples Γ has a semigroup structure, which we call the triple product of semigroups⁶ and denote by Γ = Φ ⋋ (Σ1 × Σ2). For given pairs (A, Σ1) and (B, Σ2), where Σ1 and Σ2 are semigroups acting on the K-modules A and B respectively, we set⁷ Φ = Hom+_K(B, A) ⊂ End_K(A ⊕ B). The natural actions of the semigroups Σ1 and Σ2 on Φ define the semigroup structure on Γ = Φ ⋋ (Σ1 × Σ2). The action of Γ on G = A ⊕ B, defined by the rule
(a + b) ◦ (ϕ, σ1, σ2) = b^ϕ + a ◦ σ1 + b ◦ σ2,
agrees with the multiplication of the semigroup Γ, and we arrive at the pair (G, Γ), which we denote by (A, Σ1) ▽ (B, Σ2) and call the triangular product of the two given pairs.
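The two claims of this subsection, that the operation on triples is associative and that the action on G = A ⊕ B agrees with it, can be tested numerically. The sketch below makes several assumptions that are not in the text: K is replaced by the integers, A = Z^2 and B = Z^3 are written as row vectors, Σ1 and Σ2 are the full matrix semigroups acting by right multiplication, and ϕ is a 3 × 2 matrix, so that σ2 · ϕ′ and ϕ · σ1′ become ordinary matrix products. The names `tri_mul`, `tri_act` and `random_element` are hypothetical.

```python
import random

DIM_A, DIM_B = 2, 3  # hypothetical sizes: A = Z^2, B = Z^3

def mat_mul(m, n):
    """Product of integer matrices given as lists of rows."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))] for i in range(len(m))]

def mat_add(m, n):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(m, n)]

def vec_mat(v, m):
    """Row vector times matrix: the right action v o m."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def tri_mul(g, h):
    """(phi, s1, s2)(phi', s1', s2') = (s2.phi' + phi.s1', s1 s1', s2 s2')."""
    (phi, s1, s2), (phi2, t1, t2) = g, h
    return (mat_add(mat_mul(s2, phi2), mat_mul(phi, t1)),
            mat_mul(s1, t1), mat_mul(s2, t2))

def tri_act(a, b, g):
    """(a + b) o (phi, s1, s2) = (b^phi + a o s1) + b o s2, as (A-part, B-part)."""
    phi, s1, s2 = g
    a_part = [x + y for x, y in zip(vec_mat(b, phi), vec_mat(a, s1))]
    return (a_part, vec_mat(b, s2))

def random_matrix(rng, rows, cols):
    return [[rng.randint(-2, 2) for _ in range(cols)] for _ in range(rows)]

def random_element(rng):
    """A triple (phi, s1, s2) with phi: B -> A, s1 acting on A, s2 acting on B."""
    return (random_matrix(rng, DIM_B, DIM_A),
            random_matrix(rng, DIM_A, DIM_A),
            random_matrix(rng, DIM_B, DIM_B))
```

Applying `tri_act` with g and then with h gives the same result as applying it once with `tri_mul(g, h)`; this is the sense in which the action agrees with the multiplication.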
3. The following remark hints at the usefulness of this construction in the study of varieties of representations of semigroups. If the pair (A, Σ1) is contained in the variety Θ1 and (B, Σ2) in the variety Θ2, then the triangular product (G, Γ) = (A, Σ1) ▽ (B, Σ2) is contained in the variety Θ1 · Θ2. For the proof we remark that A is a Γ-submodule of G, and so we have the pairs (A, Γ) and (G/A, Γ). Let us consider the diagram
Γ = Φ ⋋ (Σ1 × Σ2) --μ--> Σ1 × Σ2 --pr_i--> Σi, μi = μ · pr_i : Γ → Σi, i = 1, 2,
where the “erasing” homomorphism μ is given by the formula (ϕσ1σ2)^μ = σ1σ2, the map pr_i : Σ1 × Σ2 → Σi is the natural projection, and μi = μ · pr_i, i = 1, 2. It is easy to see that Ker μ1 and Ker μ2 act trivially on A and G/A ≅ B, respectively. For all a ∈ A, γ ∈ Γ we have a ◦ γ = a ◦ γ^{μ1}, from which follows the existence of a right epimorphism (A, Γ) → (A, Σ1), which implies that (A, Γ) ∈ Θ1.
⁶ Cf. also [66, p. 142].
⁷ Translators’ note. In the notation Hom+ the symbol + refers to a “forgetful” functor: while Hom_K(B, A) denotes the Abelian group of homomorphisms from B to A as K-modules, writing Hom+_K(B, A) we regard them as Abelian groups, thus “forgetting” the K-module structure.
Furthermore, defining μ2 as the natural projection, we obtain an epimorphism of pairs μ2 : (G, Γ) → (B, Σ2). Moreover, the kernel of μ2 is A. Consequently, there arises the commutative diagram
(G, Γ) ----μ2----> (B, Σ2)
   |                  ↑
   ↓                  |
(G/A, Γ) ---≈---> (B, Γ)
the existence of which gives (G/A, Γ) ∈ Θ2. Hence we have (A, Γ) ∈ Θ1 and (G/A, Γ) ∈ Θ2, and so, by definition, (G, Γ) ∈ Θ1 · Θ2.
4. The final part of this section will be devoted to a deduction of the properties of the triangular product of pairs of representations of semigroups.
PROPOSITION 3.2. If the pairs (A, Σ1) and (B, Σ2) are faithful, then the pair (G, Γ) = (A, Σ1) ▽ (B, Σ2) is faithful too.
PROOF. Let us assume the contrary. Then there exist distinct elements γ = (ϕ, σ1, σ2) and γ′ = (ϕ′, σ1′, σ2′) in Γ which act identically on G = A ⊕ B: we have g ◦ γ = g ◦ γ′ for all g ∈ G. Taking g = a ∈ A gives a ◦ σ1 = a ◦ σ1′, and taking g = b ∈ B gives b^ϕ + b ◦ σ2 = b^{ϕ′} + b ◦ σ2′, so that b^ϕ = b^{ϕ′} and b ◦ σ2 = b ◦ σ2′. In view of the faithfulness of (A, Σ1) and (B, Σ2) it follows readily that γ = γ′, which contradicts our assumption.
PROPOSITION 3.3. Let (A, Σ1) and (B, Σ2) be two pairs⁸ and (G, Γ) = (A, Σ1) ▽ (B, Σ2) their triangular product. For each Γ-submodule H in G one has either H ⊂ A or A ⊂ H.
PROOF. If H ⊂ A, everything is proved. So assume that H ⊄ A. Then there exists an element h ∈ H such that h ∉ A. This implies the existence of a1 ∈ A and b ∈ B, b ≠ 0, such that h = b + a1. Let a be an arbitrary element of A. Let us pick a basis in B containing the element b and let us consider an arbitrary map ϕ of this basis into A with b^ϕ = a. We continue ϕ to an element of Φ = Hom(B, A), which we likewise denote by ϕ. Moreover, we require the following remark. The pair (A, Σ1) can be “completed” to a pair (A, Σ1*), where the semigroup Σ1* is obtained by adjoining to Σ1 a unit element ε, whose action on A is defined by the formula a ◦ ε = a for all a ∈ A. In an analogous manner we obtain (B, Σ2*), and we end up with the pair (G, Γ*) = (A, Σ1*) ▽ (B, Σ2*). It is easy to see that if the submodule H ⊂ G is Γ-invariant, then H is Γ*-invariant, and vice versa.
⁸ As earlier in this section, here Σ1 and Σ2 are semigroups acting on the K-modules A and B respectively. Let us also emphasize that Σ1 and Σ2 need not be monoids. From this point on and everywhere in this paper, we assume that K is a field.
Let us now take γ ∈ Γ*, where
\[ \gamma = \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} \in \operatorname{End} G, \]
and apply it to the element h. We have
\[ h \circ \gamma = (b, a_1) \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} = (b \circ \varepsilon + a_1 \circ 0,\; b \circ \varphi + a_1 \circ \varepsilon) = (b,\; b^{\varphi} + a_1) = (b, a_1) + (0, a) = h + a. \]
We have shown that for any a ∈ A one can find γ ∈ Γ* such that h ◦ γ = h + a. Hence it follows that a = h ◦ γ − h ∈ H, in view of the remark made above concerning the module H. Therefore we deduce that A ⊂ H.
5. The triangular product of semigroup pairs enjoys good functorial properties, which are collected in the following two propositions.
PROPOSITION 3.4. Let there be given a homomorphism ν : (A, Σ1) → (A′, Σ1′) and let (B, Σ2) be an arbitrary pair. Then there exists a homomorphism of pairs μ : (A, Σ1) ▽ (B, Σ2) → (A′, Σ1′) ▽ (B, Σ2) coinciding with ν on (A, Σ1) and with the identity on (B, Σ2). Moreover, if ν is a monomorphism (epimorphism), then μ is likewise a monomorphism (epimorphism).
PROOF. Let us introduce the notation: (G, Γ) = (A, Σ1) ▽ (B, Σ2), (G′, Γ′) = (A′, Σ1′) ▽ (B, Σ2), Φ = Hom+_K(B, A) and Φ′ = Hom+_K(B, A′). We define a morphism of semigroups μ : Φ → Φ′ by the formula
∀ ϕ ∈ Φ, b ∈ B, b^{ϕ^μ} = (b^ϕ)^ν,
and, furthermore, “lift” it to a morphism of semigroups μ : Γ → Γ′ by setting (ϕ, σ1, σ2)^μ = (ϕ^μ, σ1^ν, σ2). Moreover, we define a morphism of K-modules μ : G → G′ by the formula
∀ a ∈ A, b ∈ B, (a + b)^μ = a^ν + b.
For any a + b ∈ A ⊕ B = G, ϕ ∈ Φ, σ1 ∈ Σ1 and σ2 ∈ Σ2 we then have
((a + b) ◦ (ϕ, σ1, σ2))^μ = (a ◦ σ1 + b^ϕ + b ◦ σ2)^μ =
= (a ◦ σ1 + b^ϕ)^ν + b ◦ σ2 =
= (a ◦ σ1)^ν + (b^ϕ)^ν + b ◦ σ2 =
= a^ν ◦ σ1^ν + b^{ϕ^μ} + b ◦ σ2 = (a^ν + b) ◦ (ϕ^μ, σ1^ν, σ2) =
= (a + b)^μ ◦ (ϕ, σ1, σ2)^μ.
We see that the given morphism ν can be extended to a morphism of pairs μ : (G, Γ) → (G′, Γ′). It is clear that μ is the identity on (B, Σ2). One verifies immediately that if ν is a monomorphism (epimorphism), then μ is defined by a pair of monomorphisms (epimorphisms) μ : G → G′ and μ : Γ → Γ′, and so is also a monomorphism (epimorphism).
Let us again consider the triangular product (A, Σ1) ▽ (B, Σ2). For a fixed left pair the product can still be viewed as a functor, but now in the category of changes of semigroups⁹. Before formulating the result, we recall the definition of this category. The objects in this category are still pairs, but a morphism μ : (G, Γ) → (G′, Γ′) in the category of changes consists of two morphisms μ : G → G′ and μ : Γ′ → Γ connected by the following “compatibility condition”:
∀ g ∈ G, γ′ ∈ Γ′, g^μ ◦ γ′ = (g ◦ γ′^μ)^μ.
In order to distinguish the morphisms in the category of changes of semigroups, we denote them by μ : (G, Γ) ⇝ (G′, Γ′).
PROPOSITION 3.5. An arbitrary object (A, Σ1) and a morphism ν : (B, Σ2) ⇝ (B′, Σ2′) in the category of changes of semigroups induce a morphism μ : (A, Σ1) ▽ (B, Σ2) ⇝ (A, Σ1) ▽ (B′, Σ2′) in this category.
PROOF. 1) Let (G, Γ), (G′, Γ′), Φ and Φ′ have the meaning analogous to that in the proof of Proposition 3.4, so that (G′, Γ′) = (A, Σ1) ▽ (B′, Σ2′) and Φ′ = Hom+_K(B′, A). Let us define the map μ : Φ′ → Φ in the following way: ∀ b ∈ B,
b^{ϕ′^μ} = (b^ν)^{ϕ′}.
Moreover, we extend the homomorphism ν : Σ2′ → Σ2 to a morphism of direct products μ : Σ1 × Σ2′ → Σ1 × Σ2, taking μ to be the identity on Σ1. By the definition of the triangular product we have the pairs (Φ′, Σ1 × Σ2′) and (Φ, Σ1 × Σ2). Let us show that μ : Φ′ → Φ and μ : Σ1 × Σ2′ → Σ1 × Σ2 induce a morphism of the pairs indicated. Indeed, for each b ∈ B we have
b^{(σ2′·ϕ′·σ1)^μ} = (b^ν)^{σ2′·ϕ′·σ1} = (b^ν ◦ σ2′)^{ϕ′} ◦ σ1 = [(b ◦ σ2′^ν)^ν]^{ϕ′} ◦ σ1 =
= (b ◦ σ2′^ν)^{ϕ′^μ} ◦ σ1 = (b ◦ σ2′^μ)^{ϕ′^μ} ◦ σ1 = b^{σ2′^μ·ϕ′^μ·σ1}.
In an analogous manner one can show that (σ2′ · ϕ′)^μ = σ2′^μ · ϕ′^μ and (ϕ′ · σ1)^μ = ϕ′^μ · σ1.
2) Let us give the map μ : Γ′ → Γ by the formula
(ϕ′, σ1, σ2′)^μ = (ϕ′^μ, σ1, σ2′^μ).
It turns out that μ is a morphism of triple products, μ : Γ′ → Γ. This follows from the computation
[(ϕ′, σ1, σ2′)(ψ′, τ1, τ2′)]^μ = ((σ2′ · ψ′ + ϕ′ · τ1)^μ, σ1τ1, (σ2′τ2′)^μ) =
= (σ2′^μ · ψ′^μ + ϕ′^μ · τ1^μ, σ1τ1, σ2′^μ · τ2′^μ) =
= (ϕ′^μ, σ1, σ2′^μ) · (ψ′^μ, τ1, τ2′^μ) =
= (ϕ′, σ1, σ2′)^μ · (ψ′, τ1, τ2′)^μ.
3) Moreover, from the formula (a + b)^μ = a + b^ν, b^ν ∈ B′, we obtain the morphism μ : A ⊕ B → A ⊕ B′. Next, let us show that the mapping μ just defined gives a morphism
μ : (A, Σ1) ▽ (B, Σ2) ⇝ (A, Σ1) ▽ (B′, Σ2′)
in the category of changes of semigroups. Indeed, for any a ∈ A, b ∈ B, ϕ′ ∈ Φ′, σ1 ∈ Σ1 and σ2′ ∈ Σ2′ we have, on the one hand,
(a + b)^μ ◦ (ϕ′, σ1, σ2′) = (a + b^ν) ◦ (ϕ′, σ1, σ2′) =
= a ◦ σ1 + (b^ν)^{ϕ′} + b^ν ◦ σ2′ = a ◦ σ1 + b^{ϕ′^μ} + (b ◦ σ2′^ν)^ν;
on the other hand, we have
[(a + b) ◦ (ϕ′, σ1, σ2′)^μ]^μ = [(a + b) ◦ (ϕ′^μ, σ1, σ2′^μ)]^μ =
= [a ◦ σ1 + b^{ϕ′^μ} + b ◦ σ2′^μ]^μ = (a ◦ σ1 + b^{ϕ′^μ}) + (b ◦ σ2′^ν)^ν;
we use here the fact that the map μ coincides with ν on Σ2′. As a result we have
(a + b)^μ ◦ (ϕ′, σ1, σ2′) = ((a + b) ◦ (ϕ′, σ1, σ2′)^μ)^μ.
This proves the statement, and at the same time Proposition 3.5.
⁹ Translators’ note. This translation of the Russian “kategoriya zamen”, used by the author, was kindly suggested to us by B. I. Plotkin.
PROPOSITION 3.6. Let there be given two pairs (A, Σ1) and (B, Σ2), and let (G, Γ) be their triangular product. For each radical F satisfying the condition F(A, Σ1) < A, we have the identity F(G, Γ) = F(A, Σ1). If F is a verbal for which F(B, Σ2) > 0, then F(G, Γ) = A + F(G, Σ2).
The proof, which repeats the arguments used to establish the corresponding fact in the case of groups (cf. [36, Lemma 2]), will be omitted.
PROPOSITION 3.7. Let there be given two pairs (A, Σ1) and (B, Σ2), and let (G, Γ) be their triangular product. For each Γ-submodule H in G containing A, there exists a right epimorphism (H, Γ) → (A, Σ1) ▽ (B ∩ H, Σ2).
PROOF. Let us denote B′ = B ∩ H, Φ = Hom+(B, A) and Φ′ = Hom+(B′, A). We have G = A + B, and from A ⊂ H it obviously follows that H = A + B′. Moreover, we have Γ = Φ ⋋ (Σ1 × Σ2). Let Γ′ = Φ′ ⋋ (Σ1 × Σ2). Hence, (A, Σ1) ▽ (B′, Σ2) = (H, Γ′). Each element ϕ of the semigroup Φ acts also from B′ into A; the corresponding element of Φ′ will be denoted ϕ^π. Thus there arises a map π : Φ → Φ′. We remark also that each f ∈ Φ′ may be considered as the restriction to B′ of a homomorphism ϕ ∈ Φ; this follows from well-known facts about vector spaces. Therefore π : Φ → Φ′ is an epimorphism. We shall see that it induces a right epimorphism of pairs (H, Γ) → (H, Γ′). In order to prove this, we define a map π : Γ → Γ′ by the following formula:
∀ ϕ ∈ Φ, σ1 ∈ Σ1, σ2 ∈ Σ2, (ϕ, σ1, σ2)^π = (ϕ^π, σ1, σ2).
The map π is surjective. Moreover, π is a homomorphism of semigroups. Indeed, let γ = (ϕ, σ1, σ2) and γ′ = (ϕ′, σ1′, σ2′) be arbitrary elements of Γ. It is easy to see that for the identity (γγ′)^π = γ^π · γ′^π it is sufficient that the elements δ = (ϕ · σ1′ + σ2 · ϕ′)^π and λ = ϕ^π · σ1′ + σ2 · ϕ′^π of Hom(B′, A) are equal. However, this follows from the obvious fact that for all b ∈ B′ the equality b^δ = b^λ holds. Setting h^π = h for all h ∈ H, we obtain a pair of homomorphisms π : H → H, Γ → Γ′. That the map π commutes with the actions in the corresponding pairs is readily verified by expanding the definitions, and is therefore omitted.
6. Further information about a pair is obtained by embedding it in triangular products of pairs that are, in one way or another, more simply arranged than the given pair. The first result of this kind, which is obtained by passing to a simpler domain of action, is an analogue of the well-known theorem of Kaluzhnin and Krasner in group theory; in semigroup theory the corresponding fact is not known.
PROPOSITION 3.8. Let there be given an arbitrary faithful pair (G, Γ) and a Γ-submodule A of G, and let Σ1 and Σ2 be the semigroups of endomorphisms induced by the semigroup Γ in A and in G/A. Then the pair (G, Γ) can be embedded as a subpair in the triangular product (A, Σ1) ▽ (G/A, Σ2).
PROOF. Exploiting the faithfulness of (G, Γ) and replacing Γ by the corresponding subsemigroup of End G, we arrive at a pair isomorphic (from the right) to the given pair (G, Γ). Therefore we can in what follows assume that Γ is contained in End G. For any element γ ∈ Γ we denote by γ^μ and γ^ν respectively the endomorphisms of the spaces A and G/A induced by γ. Moreover, we set Γ^μ = Σ1 and Γ^ν = Σ2. Also, in G we can find a K-subspace B complementary to A. This yields for G a direct decomposition G = A + B, which gives rise to a natural epimorphism α : G → G/A and a projection β : G → B. The map α can also be viewed as an isomorphism α : B → G/A, and this gives a unique sense to the notation α^{-1}; in particular, for each g ∈ G we have (g^α)^{α^{-1}} = g^β. The pair (G/A, Σ2) and the map α induce the pair (B, Σ2): for b ∈ B and γ ∈ Γ, γ^ν = σ2 ∈ Σ2, we have
b ◦ σ2 = (b^α ◦ σ2)^{α^{-1}} = ((b ◦ γ)^α)^{α^{-1}} = (b ◦ γ)^β.
From the decomposition G = A + B and the elements σ1 ∈ Σ1 and σ2 ∈ Σ2 we obtain, respectively, the elements
\[ \tilde{\sigma}_1 = \begin{pmatrix} \varepsilon & 0 \\ 0 & \sigma_1 \end{pmatrix} \quad\text{and}\quad \tilde{\sigma}_2 = \begin{pmatrix} \sigma_2 & 0 \\ 0 & \varepsilon \end{pmatrix} \]
in End G, in this way establishing the embeddings Σi → Σ̃i ⊂ End G, i = 1, 2. In addition, for each element g ∈ G, g = a + b, we have
g^{σ̃1} = (b + a)^{σ̃1} = b + a ◦ σ1 and g^{σ̃2} = (b + a)^{σ̃2} = b ◦ σ2 + a.
Moreover, remarking that it follows from b^{σ̃2} = b ◦ σ2 = (b ◦ γ)^β that b ◦ γ − b ◦ σ2 ∈ A, we define a map ϕ : B → A according to the formula b^ϕ = b ◦ γ − b ◦ σ2; it is easy to check that ϕ ∈ Hom(B, A). The semigroup Hom(B, A) can also be viewed as a subsemigroup Φ̃ of End G, by associating to each element ϕ ∈ Hom(B, A) the endomorphism
\[ \tilde{\varphi} = \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} \]
of the space G. In addition, we have
g^{ϕ̃} = (a + b)^{ϕ̃} = a + b^ϕ + b.
Let Γ′ = Φ ⋋ (Σ1 × Σ2). By our construction, (G, Γ′) = (A, Σ1) ▽ (B, Σ2). The map σ̃1ϕ̃σ̃2 → (ϕ, σ1, σ2) induces an isomorphism of the subsemigroup Σ̃1 · Φ̃ · Σ̃2 of End G onto Φ ⋋ (Σ1 × Σ2), which will be denoted π. Indeed, as in End G the equation
\[ \tilde{\sigma}_1 \tilde{\varphi} \tilde{\sigma}_2 = \begin{pmatrix} \varepsilon & 0 \\ 0 & \sigma_1 \end{pmatrix} \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} \begin{pmatrix} \sigma_2 & 0 \\ 0 & \varepsilon \end{pmatrix} = \begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix} \]
holds, we have
\[ [(\tilde{\sigma}_1 \tilde{\varphi} \tilde{\sigma}_2)(\tilde{\bar{\sigma}}_1 \tilde{\bar{\varphi}} \tilde{\bar{\sigma}}_2)]^{\pi} = \left[ \begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix} \begin{pmatrix} \bar{\sigma}_2 & \bar{\varphi} \\ 0 & \bar{\sigma}_1 \end{pmatrix} \right]^{\pi} = \begin{pmatrix} \sigma_2 \bar{\sigma}_2 & \sigma_2 \cdot \bar{\varphi} + \varphi \cdot \bar{\sigma}_1 \\ 0 & \sigma_1 \bar{\sigma}_1 \end{pmatrix}^{\pi} = \]
\[ = (\sigma_2 \cdot \bar{\varphi} + \varphi \cdot \bar{\sigma}_1,\; \sigma_1 \bar{\sigma}_1,\; \sigma_2 \bar{\sigma}_2) = (\varphi, \sigma_1, \sigma_2)(\bar{\varphi}, \bar{\sigma}_1, \bar{\sigma}_2) = (\tilde{\sigma}_1 \tilde{\varphi} \tilde{\sigma}_2)^{\pi} (\tilde{\bar{\sigma}}_1 \tilde{\bar{\varphi}} \tilde{\bar{\sigma}}_2)^{\pi}. \]
It is evident that π is bijective. It turns out that the semigroup Γ, considered as a subsemigroup of End G, can be embedded in Σ̃1Φ̃Σ̃2. For the proof pick an arbitrary element γ ∈ Γ and set γ^μ = σ1 and γ^ν = σ2. Furthermore, let ϕ, ϕ̃, σ̃1 and σ̃2 be obtained by the procedure above. Let us show that γ = σ̃1ϕ̃σ̃2. The left hand side and the right hand side of this equation are elements of End G, and therefore for its verification it suffices to show that g^γ = g^{σ̃1ϕ̃σ̃2} holds for all g = a + b ∈ G. We have, on the one hand,
g^γ = (a + b)^γ = a^γ + b^γ = a ◦ σ1 + b^ϕ + b ◦ σ2.
On the other hand, we obtain
\[ g^{\tilde{\sigma}_1 \tilde{\varphi} \tilde{\sigma}_2} = (b, a) \begin{pmatrix} \varepsilon & 0 \\ 0 & \sigma_1 \end{pmatrix} \tilde{\varphi} \tilde{\sigma}_2 = (b,\; a \circ \sigma_1) \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} \tilde{\sigma}_2 = (b,\; b^{\varphi} + a \circ \sigma_1) \begin{pmatrix} \sigma_2 & 0 \\ 0 & \varepsilon \end{pmatrix} = (b \circ \sigma_2,\; b^{\varphi} + a \circ \sigma_1), \]
that is, g^{σ̃1ϕ̃σ̃2} equals a ◦ σ1 + b^ϕ + b ◦ σ2, an expression coinciding with the expression previously obtained for g^γ. The required equation is thus established. As a consequence we have constructed an embedding (G, Γ) → (G, Γ′), which together with the isomorphism (G, Γ′) = (A, Σ1) ▽ (B, Σ2) ≅ (A, Σ1) ▽ (G/A, Σ2) yields the embedding required. This proves Proposition 3.8.
The final results of this section, given below, somewhat clarify the connection between the ▽-operation and the Cartesian multiplication of pairs, in particular the operation of raising pairs to a Cartesian power.
PROPOSITION 3.9. Let there be given a family of pairs (Ai, Σi), i ∈ I, and a pair (B, Σ′). Then the pair (∏_{i∈I} (Ai, Σi)) ▽ (B, Σ′) can be embedded in the pair ∏_{i∈I} ((Ai, Σi) ▽ (B, Σ′)).
If in this result one takes all pairs (Ai, Σi) equal to (A, Σ), one obtains from it the following.
COROLLARY 3.10. Let there be given arbitrary pairs (A, Σ) and (B, Σ′). Then for an arbitrary set (of indices) I one can embed the pair (A, Σ)^I ▽ (B, Σ′) into the pair ((A, Σ) ▽ (B, Σ′))^I.
We limit ourselves here to proving the following Proposition 3.11; let us just add that Proposition 3.9 is proved by similar arguments.
PROPOSITION 3.11. Let there be given arbitrary pairs (A, Σ1) and (B, Σ2). Then for an arbitrary set (of indices) I one can embed the pair (A, Σ1)^I ▽ (B, Σ2) into the pair ((A, Σ1) ▽ (B, Σ2))^I.
PROOF. We introduce the three maps
(1) ω : A + B^I → (A + B)^I;
(2) τ : Σ1 × Σ2^I → (Σ1 × Σ2)^I;
(3) ν : Hom(B^I, A) → (Hom(B, A))^I.
They are defined as follows. First, for all a ∈ A and b̄ ∈ B^I we define (a + b̄)^ω = ā + b̄, where ā(i) = a for all i ∈ I. Clearly, ā is the constant function sending the entire domain I onto one and the same value a ∈ A; in addition, ā + b̄ is the required function I → A + B. Second, pick arbitrary σ1 ∈ Σ1 and σ̄2 ∈ Σ2^I. Let us set (σ1σ̄2)^τ = σ̄1σ̄2, where σ̄1(i) = σ1 for all i ∈ I. Third, for each ϕ ∈ Hom(B^I, A) we define ϕ^ν ∈ (Hom(B, A))^I by the following condition:
∀ i ∈ I, b̄ ∈ B^I, [b̄(i)]^{ϕ^ν(i)} = b̄^ϕ.
It is easy to see that ν is an isomorphism, and that ω and τ are monomorphisms. The pair of maps τ and ν can be joined to a homomorphism of semigroups
μ : Hom(B^I, A) ⋋ (Σ1 × Σ2^I) → (Hom(B, A) ⋋ (Σ1 × Σ2))^I,
if we give the map μ by the formula (ϕ, σ1, σ̄2)^μ = (ϕ^ν, σ̄1, σ̄2). It suffices to verify that the map μ is compatible with the multiplication in the triple product. As the constant function with value σ1σ1′ equals σ̄1 · σ̄1′, it is clear that the comparison of the elements [(ϕ, σ1, σ̄2)(ϕ′, σ1′, σ̄2′)]^μ and (ϕ, σ1, σ̄2)^μ · (ϕ′, σ1′, σ̄2′)^μ reduces to the verification that the expressions (σ̄2 · ϕ′ + ϕ · σ1′)^ν and σ̄2 · ϕ′^ν + ϕ^ν · σ̄1′ represent one and the same element of (Hom(B, A))^I. To this end, take any b̄ ∈ B^I, i ∈ I and compute
\[ [\bar{b}(i)]^{(\bar{\sigma}_2 \cdot \varphi' + \varphi \cdot \sigma_1')^{\nu}(i)} = \bar{b}^{(\bar{\sigma}_2 \cdot \varphi' + \varphi \cdot \sigma_1')} = (\bar{b} \circ \bar{\sigma}_2)^{\varphi'} + (\bar{b}^{\varphi}) \circ \sigma_1' = [(\bar{b} \circ \bar{\sigma}_2)(i)]^{\varphi'^{\nu}(i)} + [\bar{b}(i)]^{\varphi^{\nu}(i)} \circ \sigma_1' = \]
\[ = [\bar{b}(i) \circ \bar{\sigma}_2(i)]^{\varphi'^{\nu}(i)} + [\bar{b}(i)]^{\varphi^{\nu}(i)} \circ \sigma_1' = [\bar{b}(i)]^{(\bar{\sigma}_2 \cdot \varphi'^{\nu} + \varphi^{\nu} \cdot \bar{\sigma}_1')(i)}. \]
Thus we are led to the condition
∀ i ∈ I, (σ̄2 · ϕ′ + ϕ · σ1′)^ν(i) = (σ̄2 · ϕ′^ν + ϕ^ν · σ̄1′)(i),
which, evidently, is equivalent to the required equation. It remains to check that the morphisms μ and ω define a monomorphism of pairs
μ* : (A + B^I, Hom(B^I, A) ⋋ (Σ1 × Σ2^I)) → ((A + B)^I, (Hom(B, A) ⋋ (Σ1 × Σ2))^I).
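The only not completely immediate part of the proof is the verification of the compatibility of the map μ*, defined with the help of ω and μ, with the actions of the pairs. Indeed, for arbitrary a ∈ A, b̄ ∈ B^I, ϕ ∈ Hom(B^I, A), σ1 ∈ Σ1, σ̄2 ∈ Σ2^I we have
\[ (a + \bar{b})^{\omega} \circ (\varphi, \sigma_1, \bar{\sigma}_2)^{\mu} = (\bar{a} + \bar{b}) \circ (\varphi^{\nu}, \bar{\sigma}_1, \bar{\sigma}_2) = \bar{b}^{\varphi^{\nu}} + \bar{a} \circ \bar{\sigma}_1 + \bar{b} \circ \bar{\sigma}_2 = \]
\[ = \overline{(\bar{b}^{\varphi} + a \circ \sigma_1)} + \bar{b} \circ \bar{\sigma}_2 = [(\bar{b}^{\varphi} + a \circ \sigma_1) + \bar{b} \circ \bar{\sigma}_2]^{\omega} = [(a + \bar{b}) \circ (\varphi, \sigma_1, \bar{\sigma}_2)]^{\omega}. \]
It remains only to check the equality, used here, of the elements \overline{(b̄^ϕ + a ◦ σ1)} and b̄^{ϕ^ν} + ā ◦ σ̄1, which follows from the relation
\[ \forall i \in I, \quad \overline{(\bar{b}^{\varphi} + a \circ \sigma_1)}(i) = \bar{b}^{\varphi} + a \circ \sigma_1 = [\bar{b}(i)]^{\varphi^{\nu}(i)} + \bar{a}(i) \circ \bar{\sigma}_1(i) = (\bar{b}^{\varphi^{\nu}} + \bar{a} \circ \bar{\sigma}_1)(i). \]
By this the proof of the proposition is complete.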
3.1.3. Triangular products of representations of algebras
1. The construction indicated in the heading of this section will be achieved in two steps. First, let there be given two K-algebras Φ and Σ̄. We assume that Σ̄ acts from the right and from the left on Φ and denote these actions by ϕ · σ and σ · ϕ, respectively. We require that these two actions satisfy the following conditions¹⁰:
a) σ · (ϕ + ϕ′) = σ · ϕ + σ · ϕ′; (ϕ + ϕ′) · σ = ϕ · σ + ϕ′ · σ;
b) σ · (ϕϕ′) = (σ · ϕ)ϕ′; (ϕϕ′) · σ = ϕ(ϕ′ · σ);
c) (σ · ϕ) · σ′ = σ · (ϕ · σ′); (ϕ · σ)ϕ′ = ϕ(σ · ϕ′);
d) (σ + σ′) · ϕ = σ · ϕ + σ′ · ϕ; ϕ · (σ + σ′) = ϕ · σ + ϕ · σ′;
e) (σσ′) · ϕ = σ · (σ′ · ϕ); ϕ · (σσ′) = (ϕ · σ) · σ′;
f) σ · (κϕ) = κ(σ · ϕ); (κϕ) · σ = κ(ϕ · σ);
g) (κσ) · ϕ = κ(σ · ϕ); ϕ · (κσ) = κ(ϕ · σ).
In the direct sum Γ̄ = Φ + Σ̄ we define addition and multiplication by scalars componentwise, but multiplication will be defined anew, by setting
(ϕ, σ)(ϕ′, σ′) = (ϕ · σ′ + σ · ϕ′ + ϕϕ′, σσ′).
On the set Γ̄ there arises the structure of a new K-algebra, which we denote by Γ̄ = Φ ⋋ Σ̄ and call the semidirect product of the algebras Φ and Σ̄.
Remarks. 1) Let Σ be any K-algebra. Considering the field K as a K-algebra, we form the semidirect product Σ* = Σ ⋋ K. One gets a K-algebra having the element (0, 1) as unit; moreover, the pairs of the form (σ, 0), σ ∈ Σ, give in Σ* a subalgebra isomorphic to Σ. This contains the essential part of a result mentioned in [21, pp. 54–55].
¹⁰ Following MacLane [88] we speak of commuting bimultiplications (of Hochschild) on the algebra Φ.
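Remark 1 can be tried out on a small algebra without unit. The following is a minimal sketch, assuming that Σ is the algebra of strictly upper triangular 3 × 3 matrices and that scalars are integers (a stand-in for the field K; both choices are illustrative). Specializing the multiplication rule of the semidirect product to the case where K acts on Σ by scalars gives (σ, κ)(σ′, κ′) = (σσ′ + κ′σ + κσ′, κκ′); the names `star_mul`, `UNIT` and `random_sigma` are hypothetical.

```python
import random

N = 3  # assumed Sigma: strictly upper triangular N x N integer matrices

def mat_mul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def mat_add(m, n):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(m, n)]

def scal(k, m):
    """Scalar multiple k * m."""
    return [[k * x for x in row] for row in m]

def star_mul(x, y):
    """(s, k)(t, l) = (s t + l s + k t, k l) in Sigma* = Sigma with K adjoined."""
    (s, k), (t, l) = x, y
    return (mat_add(mat_add(mat_mul(s, t), scal(l, s)), scal(k, t)), k * l)

ZERO = [[0] * N for _ in range(N)]
UNIT = (ZERO, 1)  # the element (0, 1)

def random_sigma(rng):
    """A random strictly upper triangular matrix."""
    return [[rng.randint(-3, 3) if j > i else 0 for j in range(N)]
            for i in range(N)]
```

The element `UNIT` acts as a two-sided unit, and pairs of the form (σ, 0) multiply exactly as the matrices σ do, which is the embedding of Σ into Σ* described in the remark.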
2) Here we consider pairs in which the acting elements form a K-algebra. Let A be a K-module and Σ a K-algebra. We shall speak of a pair (A, Σ) if there is given an operation A × Σ → A, an action of Σ on A denoted by ◦ and satisfying for each a ∈ A, σ, σ′ ∈ Σ and κ ∈ K the conditions:
(a) a ◦ (σσ′) = (a ◦ σ) ◦ σ′;
(b) a ◦ (σ + σ′) = a ◦ σ + a ◦ σ′;
(c) a ◦ (κσ) = κ(a ◦ σ);
(d) for each fixed σ ∈ Σ, the map a → a ◦ σ is a K-endomorphism of the module A.
Every pair (A, Σ) can be “lifted” to the pair (A, Σ∗), if the action on A of the algebra Σ∗ constructed in the previous remark is defined in the following way: for all a ∈ A, σ ∈ Σ, κ ∈ K,
$$a \circ (\sigma, \kappa) = a \circ \sigma + \kappa a.$$
The proof of the fact that there arises a pair (A, Σ∗) containing (A, Σ) as a subpair is left to the Reader.

Second step. Let there be given pairs¹¹ (A, Σ1) and (B, Σ2), and let (A, Σ̄1) and (B, Σ̄2) be the corresponding faithful pairs. Then Σ̄ = Σ̄1 ⊕ Σ̄2 can in a natural way be interpreted as a subalgebra of EndK G, where G = A ⊕ B, and the same is true for Φ = HomK(B, A). Therefore there is defined a left and a right action of Σ̄ on Φ; this is just multiplication in EndK G:
$$\bar\sigma\cdot\varphi \overset{\mathrm{def}}{=} \bar\sigma\varphi \quad\text{and}\quad \varphi\cdot\bar\sigma \overset{\mathrm{def}}{=} \varphi\bar\sigma.$$
It is clear that these actions are bimultiplications on Φ. Setting Σ = Σ1 ⊕ Σ2 we have a natural epimorphism f : Σ → Σ̄, which allows us to lift the action of Σ̄ on Φ to an action of Σ on Φ:
$$\sigma\cdot\varphi \overset{\mathrm{def}}{=} \sigma^{f}\cdot\varphi \quad\text{and}\quad \varphi\cdot\sigma \overset{\mathrm{def}}{=} \varphi\cdot\sigma^{f}.$$
We arrive at the algebra Γ = Φ ⋋ Σ with an action on G = A ⊕ B defined by the rule
$$(a + b)\circ(\varphi, \sigma) = b\varphi + (a + b)\circ\sigma.$$
Let us remark that the elements (ϕ, σ) of the algebra Φ ⋋ Σ act on G as the endomorphisms ϕ + σ^f. More precisely, the multiplication in Γ and its action on G are given by the formulae
$$(\varphi, \sigma_1, \sigma_2)(\varphi', \sigma_1', \sigma_2') = (\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi' + \varphi\varphi',\ \sigma_1\sigma_1',\ \sigma_2\sigma_2')$$
and
$$(a + b)\circ(\varphi, \sigma_1, \sigma_2) = b\varphi + a\circ\sigma_1 + b\circ\sigma_2.$$
As this action of Γ on G agrees with the operations in the algebra Γ, there arises a pair (G, Γ), which we call the triangular product of (A, Σ1) and (B, Σ2); we denote it by (A, Σ1) ▽ (B, Σ2).

¹¹ We emphasize, in particular, that the acting objects in the pairs given in this section are K-algebras.
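The multiplication rule and the action of the triangular product can be checked on small matrices. The following is only an illustrative sketch under stated assumptions: A and B are taken to be two-dimensional, vectors are rows acting from the right, the pairs are faithful (so σ^f is just the matrix σ), and the helper names `matmul`, `matadd`, `mul`, `act` and `block` are hypothetical, not notation from the text. Since Φ = HomK(B, A) multiplies to zero inside EndK G, the term ϕϕ′ is dropped.

```python
# Sketch: triangular product of two matrix pairs, compared with block matrices.
# Row vectors act from the right, so g ∘ γ is "row vector times matrix".

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def matadd(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def mul(g, h):
    """(phi, s1, s2)(phi', s1', s2') = (phi·s1' + s2·phi', s1 s1', s2 s2')."""
    phi, s1, s2 = g
    phi_, s1_, s2_ = h
    return (matadd(matmul(phi, s1_), matmul(s2, phi_)),
            matmul(s1, s1_), matmul(s2, s2_))

def act(a, b, g):
    """(a + b) ∘ (phi, s1, s2) = b·phi + a∘s1 + b∘s2, as (A-part, B-part)."""
    phi, s1, s2 = g
    a_part = matadd(matmul([b], phi), matmul([a], s1))[0]
    b_part = matmul([b], s2)[0]
    return a_part, b_part

def block(g):
    """Matrix of (phi, s1, s2) on G with B-coordinates first: [[s2, phi], [0, s1]]."""
    phi, s1, s2 = g
    top = [r2 + rp for r2, rp in zip(s2, phi)]
    bottom = [[0] * len(s2) + r1 for r1 in s1]
    return top + bottom

# Arbitrary illustrative data (phi: B -> A, s1 acts on A, s2 acts on B).
g = ([[1, 2], [0, 1]], [[1, 1], [0, 2]], [[0, 1], [1, 0]])
h = ([[3, 0], [1, 1]], [[2, 0], [1, 1]], [[1, 1], [0, 1]])
a, b = [1, -1], [2, 5]

# The triple product agrees with multiplication of the block matrices ...
product_agrees = block(mul(g, h)) == matmul(block(g), block(h))
# ... and the action is compatible with it: x ∘ (gh) = (x ∘ g) ∘ h.
action_agrees = act(a, b, mul(g, h)) == act(*act(a, b, g), h)
print(product_agrees, action_agrees)
```

Placing the B-coordinates first is a design choice that reproduces the upper triangular scheme with Σ2 and Σ1 on the diagonal and Φ in the corner.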
2. Let us pass to the study of the triangular product of representations of algebras.

PROPOSITION 3.12. If the pairs (A, Σ1) and (B, Σ2) are faithful then so is the pair (A, Σ1) ▽ (B, Σ2).

PROOF. We repeat word for word the reasoning in the proof of Proposition 3.2 and arrive at the required result.

PROPOSITION 3.13. Let (A, Σ1) and (B, Σ2) be two pairs and let (G, Γ) = (A, Σ1) ▽ (B, Σ2) be their triangular product. Then for each Γ-submodule H in G we have either H ⊂ A or A ⊂ H.

PROOF. 1) Essentially the same reasoning as in the proof of Proposition 3.3 gives the required result. We can carry over completely the notation and the reasoning of the first part of that proof, with a single exception: when considering the action of the element (ϕ, σ1, σ2) of the algebra Φ ⋋ Σ on A ⊕ B, we think of it here as the endomorphism ϕ + σ1^f + σ2^f.

2) Using the remark in the preceding subsection, we embed the pairs (A, Σ1) and (B, Σ2) into (A, Σ1∗) and (B, Σ2∗) respectively. In a natural way we extend the action on G of the algebra Γ = Φ ⋋ (Σ1 ⊕ Σ2) to an action of the algebra Γ∗ = Φ ⋋ (Σ1∗ ⊕ Σ2∗); this is done using the already known scheme
$$(B, A)\begin{pmatrix} \Sigma_2^{*} & \Phi \\ 0 & \Sigma_1^{*} \end{pmatrix}.$$
Thus we get the pair (G, Γ∗). Next, we remark that from the Γ-invariance of a submodule H ⊂ G it follows that H is invariant with respect to the action of all elements of Γ∗, and vice versa. In fact, assume that H ◦ Γ ⊂ H; then for any g = a + b ∈ H and γ∗ = (ϕ, (σ1, κ), (σ2, κ)) ∈ Γ∗, where ϕ ∈ Φ, σ1 ∈ Σ1, σ2 ∈ Σ2 and κ ∈ K, we have
$$g\circ\gamma^{*} = (a + b)\circ(\varphi, (\sigma_1, \kappa), (\sigma_2, \kappa)) = b\varphi + a\circ(\sigma_1, \kappa) + b\circ(\sigma_2, \kappa) = b\varphi + a\circ\sigma_1 + \kappa a + b\circ\sigma_2 + \kappa b = (a + b)\circ(\varphi, \sigma_1, \sigma_2) + \kappa(a + b) \in H.$$
Conversely, if we start with an arbitrary g = a + b ∈ H and choose for γ∗ the element (ϕ, (σ1, 0), (σ2, 0)), then it follows immediately from the relation g ◦ γ∗ = (a + b) ◦ (ϕ, σ1, σ2) that H is a Γ-invariant submodule.

3) Let us pass to the main reasoning in the proof. To this end, using the previous construction of the element ϕ′ ∈ Hom(B, A), we find in Γ∗ the element γ∗ = (ϕ′, (0, 1), (0, 1)) and apply it to h = a1 + b. By the Γ-invariance of H and the remark just made, it follows that h ◦ γ∗ ∈ H, from which, in view of the equalities
$$h\circ\gamma^{*} = (a_1 + b)\circ(\varphi', (0, 1), (0, 1)) = b\varphi' + a_1\circ(0, 1) + b\circ(0, 1) = a + h,$$
we have a = h ◦ γ∗ − h ∈ H. The relation A ⊂ H is proved. This completes the proof.

3. An easy modification of the proof apparatus in Paragraph 5 of Section 3.1.2 above allows us to derive here some features of the functorial behavior of the ▽-product for representations of algebras.

PROPOSITION 3.14. Let there be given a morphism ν : (A, Σ1) → (A′, Σ1′) and an arbitrary pair (B, Σ2). There exists a morphism μ : (A, Σ1) ▽ (B, Σ2) → (A′, Σ1′)
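▽ (B, Σ2), which coincides with the map ν on (A, Σ1) and is the identity on (B, Σ2). If ν is injective (surjective), then so is μ.

PROOF. The proof is obtained by repeating almost word for word the proof of Proposition 3.4, the notation of which is preserved here with an immediate modification of its interpretation. We stop only at a fragment of the reasoning. Let us set, for all ϕ ∈ Φ, σ1 ∈ Σ1, σ2 ∈ Σ2,
$$(\varphi, \sigma_1, \sigma_2)^{\mu} = (\varphi^{\mu}, \sigma_1^{\nu}, \sigma_2).$$
We have to verify that μ : Γ → Γ′ is a morphism of algebras which agrees with the action of these algebras on G and G′ respectively. We restrict the verification to the fact that μ intertwines the multiplications in the algebras Γ and Γ′; the rest follows even more simply by checking the definitions. We have
$$(\varphi, \sigma_1, \sigma_2)^{\mu}\cdot(\varphi', \sigma_1', \sigma_2')^{\mu} = (\varphi^{\mu}, \sigma_1^{\nu}, \sigma_2)\cdot(\varphi'^{\mu}, \sigma_1'^{\nu}, \sigma_2') = ((\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi' + \varphi\varphi')^{\mu},\ (\sigma_1\sigma_1')^{\nu},\ \sigma_2\sigma_2') = [(\varphi, \sigma_1, \sigma_2)(\varphi', \sigma_1', \sigma_2')]^{\mu}.$$
In these computations we used the relation
$$\varphi^{\mu}\cdot\sigma_1'^{\nu} + \sigma_2\cdot\varphi'^{\mu} + \varphi^{\mu}\varphi'^{\mu} = (\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi' + \varphi\varphi')^{\mu};$$
it follows from the identities
$$\varphi^{\mu}\cdot\sigma_1'^{\nu} = (\varphi\cdot\sigma_1')^{\mu}; \qquad \sigma_2\cdot\varphi'^{\mu} = (\sigma_2\cdot\varphi')^{\mu} \qquad\text{and}\qquad \varphi^{\mu}\varphi'^{\mu} = (\varphi\varphi')^{\mu},$$
of which only the first two require proof. The first of them follows from the series of equations, valid for any b ∈ B,
$$b^{\varphi^{\mu}\cdot\sigma_1'^{\nu}} = b^{\varphi^{\mu}}\circ\sigma_1'^{\nu} = (b^{\varphi})^{\nu}\circ\sigma_1'^{\nu} = (b^{\varphi}\circ\sigma_1')^{\nu} = b^{(\varphi\cdot\sigma_1')^{\mu}}.$$
Moreover, for all b ∈ B we have
$$b^{\sigma_2\cdot\varphi'^{\mu}} = (b\circ\sigma_2)^{\varphi'^{\mu}} = ((b\circ\sigma_2)^{\varphi'})^{\nu} = (b^{\sigma_2\cdot\varphi'})^{\nu} = b^{(\sigma_2\cdot\varphi')^{\mu}}
$$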
from which the second of the equations in question follows. The statement is proved. We allow ourselves to omit the remaining details.

The category of changes of substitutions of pairs of representations of algebras is defined in complete analogy with the semigroup case (cf. Subsection 3.2.5).

PROPOSITION 3.15. An arbitrary object (A, Σ1) and a morphism ν : (B, Σ2) → (B′, Σ2′) in the category of substitutions of pairs of representations of algebras induce a morphism μ : (A, Σ1) ▽ (B, Σ2) → (A, Σ1) ▽ (B′, Σ2′) of the same category.

PROOF. The proof is obtained by carrying over verbatim to the present situation the notation and the reasoning of Section 3.1.4, with the difference that here Φ and Φ′ are thought of as subalgebras in EndK(A ⊕ B) and EndK(A ⊕ B′) respectively; all the rest of the proof mentioned is preserved in the present interpretation.
4. The behavior of radicals and verbals with respect to the triangular product of representations of algebras is the same as in the semigroup case described in Proposition 3.6; we omit its formulation as well as the proof, because it repeats the semigroup case in an obvious way. The same remark applies to the following statement.

PROPOSITION 3.16. Let there be given two pairs (A, Σ1) and (B, Σ2), and let (G, Γ) = (A, Σ1) ▽ (B, Σ2). For any Γ-submodule H in G containing A, there exists a right epimorphism (H, Γ) → (A, Σ1) ▽ (B ∩ H, Σ2).

5. The embedding theorems referred to in Paragraph 6 of Section 3.1.2 also hold true for representations of algebras.

PROPOSITION 3.17. Let there be given an arbitrary faithful pair (G, Γ) and a Γ-submodule A of G, and let Σ1 and Σ2 be the algebras of endomorphisms induced by the K-algebra Γ in A and in G/A respectively. Then the pair (G, Γ) can be embedded as a subalgebra in the triangular product (A, Σ1) ▽ (G/A, Σ2).

PROOF. We take advantage of the fact that the pair (G, Γ) is faithful and replace the algebra Γ by the corresponding subalgebra in EndK G; in this way we obtain a pair which is right isomorphic to the original pair (G, Γ). Therefore we may assume, in what follows, that Γ is already contained in the algebra EndK G. The action of the elements of the algebra Γ on the module G induces their actions on A and on G/A; the morphisms arising in this way will be denoted by μ and ν respectively. We set Im μ = Σ1 and Im ν = Σ2. We select in G a subspace B complementary to A, which gives a direct decomposition G = A ⊕ B, with the accompanying natural epimorphism α : G ↠ G/A and the projection β : G → B. The restriction of α to B may be viewed as an isomorphism B → G/A, which gives a unique meaning to the notation α⁻¹; in particular, for each g ∈ G we have
$$(g^{\alpha})^{\alpha^{-1}} = g^{\beta}.$$
The pair (G/A, Σ2) and the map α induce the pair (B, Σ2); for b ∈ B and γ ∈ Γ with γ^ν = σ2 ∈ Σ2 we have
$$b\circ\sigma_2 = (b^{\alpha}\circ\sigma_2)^{\alpha^{-1}} = ((b\circ\gamma)^{\alpha})^{\alpha^{-1}} = (b\circ\gamma)^{\beta}.$$
Take an arbitrary element γ ∈ Γ and let γ^μ = σ1, γ^ν = σ2. Let us associate to the elements σ1 ∈ Σ1 and σ2 ∈ Σ2 respectively the elements
$$\begin{pmatrix} 0 & 0 \\ 0 & \sigma_1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \sigma_2 & 0 \\ 0 & 0 \end{pmatrix}$$
in End G, which determine the embeddings Σi → End G, i = 1, 2. Further, if Φ = Hom(B, A) is embedded in End G in the manner indicated in Section 3.1.1, then the subalgebra Φ ⊂ End G may be treated as the annihilator of the series 0 ⊂ A ⊂ G. Next, we remark that, by the construction of the embedding of Σ1 ⊕ Σ2 in End G, for each g = a + b ∈ G we have g ◦ σ1 = a ◦ σ1 and g ◦ σ2 = b ◦ σ2. Moreover, from b ◦ σ2 = (b ◦ γ)^β there follows the existence of an a ∈ A such that b ◦ γ = a + b ◦ σ2. From these remarks we obtain the relations
$$b\circ(\gamma - \sigma_2 - \sigma_1) = a \quad\text{and}\quad a\circ(\gamma - \sigma_2 - \sigma_1) = 0;$$
they show that γ − σ2 − σ1 ∈ Φ, because Φ is the annihilator of the series 0 ⊂ A ⊂ G. Therefore there exists a ϕ ∈ Φ such that γ = ϕ + σ1 + σ2. There arises an embedding of
algebras Γ → Φ ⋋ (Σ1 ⊕ Σ2), which we denote by π; for each γ ∈ Γ, γ = ϕ + σ1 + σ2, it is given by the formula γ^π = (ϕ, σ1, σ2). This morphism π, together with the isomorphism (A, Σ1) ▽ (B, Σ2) → (A, Σ1) ▽ (G/A, Σ2), induces the required embedding of pairs (G, Γ) → (A, Σ1) ▽ (G/A, Σ2).

The proof of each of the following three propositions is essentially a transfer of the corresponding proof in the semigroup case, sometimes with slight modifications; the difficulties are overcome without effort, so we leave them to the Reader. We limit ourselves to the formulations.

PROPOSITION 3.18. Let there be given an arbitrary family of pairs (Ai, Σi), i ∈ I, and a pair (B, Σ). Then the pair
$$\Big(\prod_{i\in I}(A_i, \Sigma_i)\Big) \triangledown (B, \Sigma)$$
can be embedded into the pair
$$\prod_{i\in I}\big((A_i, \Sigma_i) \triangledown (B, \Sigma)\big).$$

COROLLARY 3.19. Let there be given arbitrary pairs (A, Σ1) and (B, Σ2). Then for each index set I one has the embedding
$$(A, \Sigma_1)^{I} \triangledown (B, \Sigma_2) \to \big((A, \Sigma_1) \triangledown (B, \Sigma_2)\big)^{I}.$$

PROPOSITION 3.20. Let there be given arbitrary pairs (A, Σ1) and (B, Σ2). Then for each index set I the pair (A, Σ1) ▽ (B, Σ2)^I can be embedded in the pair ((A, Σ1) ▽ (B, Σ2))^I.

3.1.4. Connections between ▽-constructions

1. The constructions of the triangular product of pairs of representations of groups, semigroups and algebras, as considered in the previous sections of this paper, are, as it seems to us, not isolated technical tools, but particular appearances of one more general concept. Here we shall indicate some correlations between these three constructions.

2. PROPOSITION 3.21. Let there be given pairs (representations of semigroups) (A, Σ1) and (B, Σ2), and let (G, Γ) be their triangular product. The acting semigroup Γ = Φ ⋋ (Σ1 × Σ2) is a group if and only if Σ1 and Σ2 are groups and the semigroup Φ = Hom⁺K(B, A) is treated as a group. If these conditions are fulfilled then (G, Γ) is isomorphic to the triangular product of (A, Σ1) and (B, Σ2) as group pairs.

PROOF. Let us first make an observation. Let Σ1 and Σ2 be groups and let us treat Φ = HomK(B, A) as an additive Abelian group. Let us show that then Γ = Φ ⋋ (Σ1 × Σ2) is also a group. To this end, we remark that the element (ϕ′, σ1′, σ2′) ∈ Γ is a unit of Γ exactly when for each element (ϕ, σ1, σ2) ∈ Γ we have
$$(\varphi, \sigma_1, \sigma_2) = (\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi',\ \sigma_1\sigma_1',\ \sigma_2\sigma_2') \quad\text{and}\quad (\varphi, \sigma_1, \sigma_2) = (\varphi'\cdot\sigma_1 + \sigma_2'\cdot\varphi,\ \sigma_1'\sigma_1,\ \sigma_2'\sigma_2). \tag{4}$$
From these relations it follows, in particular, that σi′ = εi, where εi is the unit of Σi, i = 1, 2. Taking account of this, the equality of the first components of the triples in (4) takes the form
$$\varphi\cdot\varepsilon_1 + \sigma_2\cdot\varphi' = \varphi'\cdot\sigma_1 + \varepsilon_2\cdot\varphi = \varphi.$$
The equation ϕ = ϕ · ε1 and the arbitrariness of the choice of the element σ2 ∈ Σ2 now imply that ϕ + ε2 · ϕ′ = ϕ, i.e. ϕ′ = 0. Consequently, the unit of Γ must be the triple (0, ε1, ε2), where 0 is the zero homomorphism in HomK(B, A); that it is indeed a unit is verified by an immediate check. In an analogous way one settles the question of inverse elements. Indeed, for the triple (ϕ′, σ1′, σ2′) to be the inverse of (ϕ, σ1, σ2) it is necessary and sufficient that the following equations be fulfilled:
$$(\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi',\ \sigma_1\sigma_1',\ \sigma_2\sigma_2') = (0, \varepsilon_1, \varepsilon_2) \quad\text{and}\quad (\varphi'\cdot\sigma_1 + \sigma_2'\cdot\varphi,\ \sigma_1'\sigma_1,\ \sigma_2'\sigma_2) = (0, \varepsilon_1, \varepsilon_2). \tag{5}$$
It follows from (5) that σ1′ = σ1⁻¹ and σ2′ = σ2⁻¹. It then follows from the equalities for the first components in (5) that ϕ · σ1⁻¹ = −σ2 · ϕ′ and ϕ′ · σ1 = −σ2⁻¹ · ϕ, and these equalities are equivalent to ϕ′ = −σ2⁻¹ · ϕ · σ1⁻¹. We conclude that the inverse of the triple (ϕ, σ1, σ2) is given by
$$(-\sigma_2^{-1}\cdot\varphi\cdot\sigma_1^{-1},\ \sigma_1^{-1},\ \sigma_2^{-1}).$$
The first statement of the proposition is now proved in a standard way in both directions of the implication.

Let us pass to the proof of the second statement of the proposition. First, it is clear that for the subgroup Σ generated in Γ by Σ1 and Σ2 the subrepresentation (G, Σ) splits, (G, Σ) = (A ⊕ B, Σ1 × Σ2). Second, for each (ϕ, ε1, ε2) ∈ Φ and (ϕ1, σ1, σ2) ∈ Γ one has
$$(\varphi_1, \sigma_1, \sigma_2)^{-1}\cdot(\varphi, \varepsilon_1, \varepsilon_2)\cdot(\varphi_1, \sigma_1, \sigma_2) = (-\sigma_2^{-1}\cdot\varphi_1\cdot\sigma_1^{-1},\ \sigma_1^{-1},\ \sigma_2^{-1})\cdot(\varphi, \varepsilon_1, \varepsilon_2)\cdot(\varphi_1, \sigma_1, \sigma_2) = (\sigma_2^{-1}\cdot\varphi\cdot\sigma_1,\ \varepsilon_1,\ \varepsilon_2),$$
which shows the invariance of the subgroup Φ in Γ. Moreover, one checks immediately that the pair (G, Φ) is faithful, along with the fact that the image of Φ in Aut G coincides with the centralizer of the series 0 ⊂ A ⊂ G. Third, let us introduce the map f of the pair (G, Γ) into the pair (G, Γ∗), the triangular product of (A, Σ1) and (B, Σ2) as group pairs, defining it as the identity map on G and on Γ by the formula
$$(\varphi, \sigma_1, \sigma_2)^{f} \overset{\mathrm{def}}{=} (\varphi\cdot\sigma_1^{-1},\ \sigma_1,\ \sigma_2).$$
A check shows that the map f is a morphism of the group pairs (G, Γ) and (G, Γ∗) and, furthermore, that it is bijective. With this reasoning our statement is proved, and at the same time the proof of Proposition 3.21 is finished as well.

3. In the study of the interrelations between the triangular product of semigroup pairs (A, Σ1) ▽ (B, Σ2) and of pairs (or representations) of algebras (A, S1) ▽ (B, S2) an essential role is played by the following remark. In the semigroup Φ ⋋ (Σ1 × Σ2) its elements (ϕ, σ1, σ2) and their components ϕ, σ1, σ2 are thought of as the endomorphisms in End G
$$\begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix}, \quad \begin{pmatrix} \varepsilon_2 & \varphi \\ 0 & \varepsilon_1 \end{pmatrix}, \quad \begin{pmatrix} \varepsilon_2 & 0 \\ 0 & \sigma_1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \sigma_2 & 0 \\ 0 & \varepsilon_1 \end{pmatrix}$$
respectively. In the algebra Φ ⋋ (S1 ⊕ S2) one has a different interpretation for its elements (ϕ, σ1, σ2) and their components:
$$\begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix}, \quad \begin{pmatrix} 0 & \varphi \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & \sigma_1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \sigma_2 & 0 \\ 0 & 0 \end{pmatrix}.$$
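The formula for the inverse element obtained in the proof of Proposition 3.21, and the conjugation formula showing the invariance of Φ, can be checked numerically. The sketch below is illustrative only: it assumes that Σ1 and Σ2 are groups of invertible 2×2 integer matrices acting on row vectors from the right, and the names `matmul`, `neg`, `mul` and `inverse` are hypothetical helpers, not notation from the text.

```python
# Sketch: inverse and conjugation in the group Φ ⋋ (Σ1 × Σ2), realised by matrices.

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def matadd(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def neg(x):
    return [[-p for p in r] for r in x]

def mul(g, h):
    """Semigroup multiplication: (phi·s1' + s2·phi', s1 s1', s2 s2')."""
    phi, s1, s2 = g
    phi_, s1_, s2_ = h
    return (matadd(matmul(phi, s1_), matmul(s2, phi_)),
            matmul(s1, s1_), matmul(s2, s2_))

def inverse(g, s1_inv, s2_inv):
    """Candidate inverse (-s2^-1·phi·s1^-1, s1^-1, s2^-1); inverses are supplied."""
    phi = g[0]
    return (neg(matmul(matmul(s2_inv, phi), s1_inv)), s1_inv, s2_inv)

I2 = [[1, 0], [0, 1]]
Z = [[0, 0], [0, 0]]
s1, s1_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]      # determinant 1
s2, s2_inv = [[2, 1], [1, 1]], [[1, -1], [-1, 2]]     # determinant 1
phi = [[1, 2], [3, 4]]                                # arbitrary element of Hom(B, A)

g = (phi, s1, s2)
g_inv = inverse(g, s1_inv, s2_inv)
unit = (Z, I2, I2)

# Both products with the candidate inverse give the unit (0, ε1, ε2).
two_sided = mul(g, g_inv) == unit and mul(g_inv, g) == unit

# Conjugating (psi, ε1, ε2) by g gives (s2^-1·psi·s1, ε1, ε2).
psi = [[5, 0], [-1, 2]]
conjugate = mul(mul(g_inv, (psi, I2, I2)), g)
expected = (matmul(matmul(s2_inv, psi), s1), I2, I2)
print(two_sided, conjugate == expected)
```

Integer matrices of determinant 1 are used so that every entry stays an exact integer and the comparisons do not depend on floating-point rounding.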
Next, let there be given two semigroup pairs (A, Σ1 ) and (B, Σ2 ), which we, in a well-known manner, lift to the corresponding monoid pairs (A, Σ∗1 ) and (B, Σ∗2 ). The linear extension of the actions of Σ∗1 in A and Σ∗2 in B gives pairs (A, KΣ∗1 ) and (B, KΣ∗2 ), where the acting object are the corresponding semigroup algebras. Let us consider the triangular products (A, Σ1 ) (B, Σ2 ) = (A ⊕ B, Φ Σ1 × Σ2 ), (where the semigroup Φ = HomK (B, A) is treated as the centralizer of the series 0 ⊂ A ⊂ G in End G) and (A, KΣ∗1 ) (B, KΣ∗2 ) = (A ⊕ B, Φ (KΣ∗1 ⊕ KΣ∗2 )), (where Φ is treated as the annihilator of the series 0 ⊂ A ⊂ G in End G.) It is easy to verify that one has the following fact. P ROPOSITION 3.22. The map π∗ : (ϕ, σ1 , σ2 ) → (ϕ − ε, σ1 − ε2 , σ2 − ε1 ) gives an embedding of Φ Σ1 × Σ2 into the multiplicative semigroup of algebra Φ (KΣ∗1 ⊕ KΣ∗2 ), which agrees with the actions in the pairs (A⊕ B, Φ Σ1 × Σ2 ) and (A⊕ B, Φ (KΣ∗1 ⊕ KΣ∗2 )). 4. Let there be given any two pairs (A, S1 ) and (B, S2 ), whose acting objects S1 and S2 are unitary K-algebras. Using the natural “cutting” functor, we obtain the semigroup pairs (A, Σ1 ) and (B, Σ2 ), where Σi is the multiplicative semigroup of the algebra Si , i = 1, 2. We form anew the corresponding triangular products (A, S1 ) (B, S2 ) = (A ⊕ B, Φ (S1 ⊕ S2 )) and
(A, Σ1 ) (B, Σ2 ) = A ⊕ B, Φ Σ1 × Σ2 .
Then we obtain P ROPOSITION 3.23. The map π ∗ : (ϕ, σ1 , σ2 ) → (ϕ + ε, σ1 + ε2 , σ2 + ε1 ) gives an isomorphism of the multiplicative semigroup of the algebra Φ (S1 ⊕ S2 ) with the semigroup Φ Σ1 × Σ2 , which agrees with the action in the pairs (A ⊕ B, Φ (S1 ⊕ S2 )) and (A ⊕ B, Φ Σ1 × Σ2 ). The proof is easily obtained by an immediate checking of the definition, and will be omitted. 3.1.5. Comments 1. Under the influence of the view at representations of algebraic structures as twosorted systems (or pairs) there arose the language of pairs which as a working tool in the systematic study of representations by B. I. Plotkin in his book [35] balances the role of the inner structure of groups and their outer properties of actions on the representation modules.
2. The extension of the theme of varieties of groups (cf., e.g., [34]) to varieties of linear pairs-representations of groups required the statement of the questions there, which were suggested by group theory. Their solution, however, leads rather far from the original and requires new tools. So, in [36] there arose the construction of the triangular product of group representations. This carries in itself the analogue of the properties and the role of the wreath product of groups: cf. [38, 43, 49]. This construction is the natural model for the -construction for representations of semigroups and algebras considered in the present Section. The connections found in Section 3.1.4 between the three constructions may be interpreted as an argument for the advantage of these constructions. The term, but not the notion of triangular product is borrowed from Eilenberg [66], but its appearance is (according to [66]) connected with Schützenberger (1965). The books I. B. Menskiˇı [32] and S. Eilenberg [66] point to further paths for developing this theme, important for the applications.
3.2. The arithmetics of varieties of representations of semigroups and algebras The topic of this Section concerns the arithmetic properties of families of varieties of linear representations (over a field K) of semigroups and algebras, and also the same properties of varieties. In the study of varieties the machinery of the triangular products, as developed in the previous Sections, is applied. The main goal is to prove the “theorem of generators” for varieties of representations of semigroups as well as of algebras. In order to make its formulation more precise we remark that the set of varieties of pairs admits an associative multiplication: the pair (G, Γ) is contained in Θ1 · Θ2 if G admits a Γ-submodule H such that (H, Γ) ∈ Θ1 and (G/H, Γ) ∈ Θ2 . Furthermore, let us agree to denote by Var K the variety generated by the class of pairs K. The formula we are interested in is given by the formula Var(K1 K2 ) = Var K1 · Var K2 , which holds for arbitrary classes of pairs K1 and K2 . From this one can derive, in particular, that the semigroup of non-trivial varieties of linear representations of semigroups is a semigroup with unique decomposition into factors. As an application of this fact we prove Theorem 3.37 on the structure of the semigroup of varieties of linear automata. In the case of algebras this leads to a new proof of the theorem of Bergman and Lewin on the freedom of the semigroup of T -ideals in a free associative K-algebra of countable rank. Our approach puts this theorem and its proof into one row with the corresponding results for varieties of representations of groups and semigroups, and gives supplementary information on varieties of algebras, which is hard to obtain in the language of T -ideals (cf. Theorems 3.49, 3.50 and 3.51 below). Everywhere in this Section, K is a field. 
We speak here of a pair (G, Γ) if the semigroup (algebra) Γ acts as a semigroup (algebra) of K-endomorphisms on the K-module G; also, in Sections 3.2.5 the acting object is a semigroup, while in Section 3.2.6 it is a K-algebra. Unless the contrary is stated, the word “variety” means a variety of pairs which is distinct from the unit variety (the class of all pairs with zero domain of action) and from the variety of all pairs.

3.2.1. Varieties of linear pairs and automata

1. Here we introduce a connection (of Galois type) whose closed objects are varieties of representations of semigroups and special ideals in suitable algebras.
2. Let X = {x1, x2, . . . } be a countable set. Let Ψ and Ψ∗ be the free semigroup and the free monoid, respectively, with the elements of X as free generators. Let u = u(x_{i1}, . . . , x_{ik}) be an arbitrary element of the semigroup ring KΨ∗. By definition, the (special) bi-identity y ◦ u ≡ 0 holds in the pair (G, Γ) if for each specialization y → g ∈ G, x_{ij} → γ_{ij} ∈ Γ the equality g ◦ u(γ_{i1}, . . . , γ_{ik}) = 0 holds in (G, KΓ∗). Here we could also consider bi-identities of a more general type, parallel to what was done in [35, p. 566–572]. However, in the case when K is a field each such system of bi-identities can easily be replaced by an equivalent system of special bi-identities.

To each class of pairs Θ we associate in KΨ∗ the set UΘ of all u ∈ KΨ∗ such that the bi-identity y ◦ u ≡ 0 is fulfilled in each pair in Θ; we call UΘ the indicator of the class Θ. The subset UΘ is a two-sided ideal in KΨ∗, invariant with respect to all endomorphisms of the ring KΨ∗ which are induced by an endomorphism of the monoid Ψ∗; the endomorphisms of the ring KΨ∗ with this property will be called special. Furthermore, we likewise call special those ideals of KΨ∗ which are invariant with respect to all special endomorphisms. Thus to each class of pairs Θ there corresponds a special ideal UΘ in the ring KΨ∗.

On the other hand, let U be an arbitrary subset of KΨ∗. We associate with it a class ΘU according to the following rule: the pair (G, Γ) belongs to the class ΘU if and only if all bi-identities y ◦ u ≡ 0, u ∈ U, are fulfilled in this pair. We remark that if U is a two-sided ideal in KΨ∗, then the class ΘU is closed with respect to subpairs, homomorphic images and Cartesian products of pairs, and is furthermore saturated. The latter means, by definition, that together with each pair the class contains its complete pre-images under right homomorphisms. In other words, what was said above means that the class ΘU is a variety of pairs.

Next, let Θ be a variety of pairs, and U ⊂ KΨ∗ a two-sided special ideal. We have the correspondences
$$\Theta \to U_{\Theta} \to \Theta_{(U_{\Theta})} = \Theta' \quad\text{and}\quad U \to \Theta_{U} \to U_{(\Theta_{U})} = U'.$$
It turns out that one has the equalities U = U′ and Θ = Θ′. Hence, varieties of pairs (representations of semigroups) are in a bijective correspondence with special ideals in KΨ∗.

On classes of linear representations of semigroups one can define a multiplication as follows. By definition, a pair (G, Γ) is contained in the class Θ1 · Θ2 if in G there exists a Γ-submodule H such that (H, Γ) ∈ Θ1 and (G/H, Γ) ∈ Θ2. Varieties of pairs form a semigroup with respect to this multiplication, which we denote by M = M(K). We remark that the indicator of the variety Θ1 · Θ2 is the ideal U2 · U1, where U1 and U2 are the indicators of the varieties Θ1 and Θ2, respectively. We have the following result.

PROPOSITION 3.24 ([17]). The semigroup M(K) of varieties of representations of semigroups is anti-isomorphic to the semigroup of special ideals of the ring KΨ∗.

In the case of a fixed acting semigroup Γ the requirement of saturation in the definition of a variety becomes trivial, and in this case the variety of pairs is the Birkhoff class of the corresponding Γ-modules. Here we have the following.

PROPOSITION 3.25 ([17]). The varieties of Γ-modules are in one-to-one correspondence with the two-sided ideals of the semigroup ring KΓ∗.
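The notion of a special bi-identity y ◦ u ≡ 0 introduced above can be illustrated on a small matrix pair. The following sketch makes several assumptions that are not in the text: G is the space of rows of length 2, Γ is a finite sample of upper unitriangular 2×2 integer matrices, elements of KΨ∗ are stored as dictionaries mapping words (tuples of variable indices) to coefficients, and the names `evaluate` and `holds_on_sample` are hypothetical. Checking a finite sample of specializations is of course not a proof that the bi-identity holds in the whole pair.

```python
# Sketch: testing special bi-identities y ∘ u ≡ 0 on sample specializations.
from itertools import product

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def evaluate(u, spec, n):
    """Value of u under x_i -> spec[i]; the empty word () stands for the unit ε."""
    total = [[0] * n for _ in range(n)]
    for word, coeff in u.items():
        m = identity(n)
        for i in word:                      # right action: apply letters left to right
            m = matmul(m, spec[i])
        total = [[t + coeff * e for t, e in zip(rt, rm)]
                 for rt, rm in zip(total, m)]
    return total

def holds_on_sample(u, gammas, variables, n):
    """True if g ∘ u(γ) = 0 for every g and every choice of sample elements."""
    zero = [[0] * n for _ in range(n)]
    return all(evaluate(u, dict(zip(variables, choice)), n) == zero
               for choice in product(gammas, repeat=len(variables)))

# Sample of unitriangular matrices acting on rows of length 2.
gammas = [[[1, t], [0, 1]] for t in range(-2, 3)]

# u = (x1 - ε)(x2 - ε) = x1 x2 - x1 - x2 + ε, and v = x1 - ε.
u = {(1, 2): 1, (1,): -1, (2,): -1, (): 1}
v = {(1,): 1, (): -1}

u_holds = holds_on_sample(u, gammas, [1, 2], 2)   # product of two nilpotent parts
v_holds = holds_on_sample(v, gammas, [1], 2)      # fails for any t != 0
print(u_holds, v_holds)
```

Because the bi-identity is linear in y, it is enough to test that the matrix u(γ) vanishes, which is what `holds_on_sample` does.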
3. Let us pass to linear automata, which constitute a partial generalization of the linear systems in [18]. A linear automaton (semigroup automaton (Mealy)) is a three-sorted algebraic system 𝔄 = (A, Γ, B), where A (the states) and B (the outputs) are K-modules, Γ is the semigroup of input signals, and there are given a K-linear transition operation A ◦ Γ → A and an output operation A ∗ Γ → B with the properties: for all a ∈ A and γ1, γ2 ∈ Γ,
$$a\circ(\gamma_1\gamma_2) = (a\circ\gamma_1)\circ\gamma_2, \qquad a*(\gamma_1\gamma_2) = (a\circ\gamma_1)*\gamma_2.$$
A linear automaton 𝔄′ = (A′, Γ′, B′) is a subautomaton of 𝔄 = (A, Γ, B) if A′ ⊂ A, Γ′ ⊂ Γ, B′ ⊂ B are subobjects of the corresponding algebraic structures, and A′ ◦ Γ′ ⊂ A′ and A′ ∗ Γ′ ⊂ B′.

Let there be given two linear automata 𝔄 = (A, Γ, B) and 𝔄′ = (A′, Γ′, B′) and a triple of morphisms σ = (σ1, σ2, σ3), σ1 : A → A′, σ2 : Γ → Γ′, σ3 : B → B′. By definition, σ : 𝔄 → 𝔄′ is a morphism of automata if the following conditions are fulfilled for all a ∈ A, γ ∈ Γ:
$$(a\circ\gamma)^{\sigma_1} = a^{\sigma_1}\circ\gamma^{\sigma_2} \quad\text{and}\quad (a*\gamma)^{\sigma_3} = a^{\sigma_1}*\gamma^{\sigma_2}.$$
It is clear that the submodules Ker σ1 = Aσ ⊂ A, Ker σ3 = Bσ ⊂ B, and the kernel congruence Ker σ2 = κ on Γ satisfy the requirement: for all a, a′ ∈ A and γ, γ′ ∈ Γ,
$$\big((a - a' \in A_{\sigma})\ \&\ (\gamma\,\kappa\,\gamma')\big) \Longrightarrow \big((a\circ\gamma - a'\circ\gamma' \in A_{\sigma})\ \&\ (a*\gamma - a'*\gamma' \in B_{\sigma})\big).$$
Conversely, if in the components of the linear automaton 𝔄 = (A, Γ, B) there is chosen a family of congruences Λ = (Aσ, κ, Bσ) satisfying the requirement mentioned, then Λ is called a congruence of the automaton 𝔄. In this case the system 𝔄/Λ = (A/Aσ, Γ/κ, B/Bσ), in which all operations on the equivalence classes are induced by the corresponding ones in 𝔄, is a linear automaton. This is, by definition, the factor automaton of 𝔄 by Λ. It is clear that for linear automata one can formulate and prove the homomorphism theorems and Remak’s theorem.

The Cartesian product of a family of linear automata 𝔄i = (Ai, Γi, Bi), i ∈ I, is the system ∏_{i∈I} 𝔄i = (A, Γ, B), where A = ⊕ᶜ_{i∈I} Ai and B = ⊕ᶜ_{i∈I} Bi are the complete direct sums of the modules Ai and Bi, i ∈ I, respectively, while Γ = ∏_{i∈I} Γi is the Cartesian product of the semigroups Γi, i ∈ I, the operations A ◦ Γ → A and A ∗ Γ → B being defined componentwise.

By definition, a class Θ of linear automata is called a Birkhoff class if it is closed with respect to epimorphic images, subautomata and Cartesian products. Furthermore, we say that a class Θ of linear automata is saturated if together with the automaton 𝔄 = (A, Γ, B) it contains all automata of the form (A, Γ, B′), where B′ ⊃ B, and if for each epimorphism (A, Γ, B) → (A, Σ, B) which is the identity on A and B, it follows from (A, Σ, B) ∈ Θ that (A, Γ, B) ∈ Θ. Saturated Birkhoff classes of linear automata will be called varieties of linear automata.

4. Let 𝔄 = (A, Γ, B) be a linear automaton, accompanied by the maps μ0 : Γ → EndK A and μ∗ : Γ → Hom(A, B), given by the formulae: for all a ∈ A,
$$\gamma^{\mu_0}(a) = a\circ\gamma \quad\text{and}\quad \gamma^{\mu_*}(a) = a*\gamma.$$
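The two defining properties of a linear automaton, a ◦ (γ1γ2) = (a ◦ γ1) ◦ γ2 and a ∗ (γ1γ2) = (a ◦ γ1) ∗ γ2, can be made concrete for the free semigroup on a finite alphabet. The sketch below is an illustration under assumptions not made in the text: states and outputs are integer row vectors, each letter carries a transition matrix and an output matrix, and the class `LinearAutomaton` with its methods `move` and `emit` is a hypothetical construction. The output of a word is defined by moving along all letters but the last and then applying the output map of the last letter, which is one way to satisfy the second property.

```python
# Sketch: a linear Mealy automaton whose inputs are non-empty words over an alphabet.

def vecmat(a, m):
    """Row vector times matrix."""
    return [sum(a[k] * m[k][j] for k in range(len(m))) for j in range(len(m[0]))]

class LinearAutomaton:
    def __init__(self, trans, out):
        self.trans = trans      # letter -> matrix of the transition a ↦ a ∘ letter
        self.out = out          # letter -> matrix of the output a ↦ a ∗ letter

    def move(self, a, word):
        """State a ∘ word, applying the letters from left to right."""
        for x in word:
            a = vecmat(a, self.trans[x])
        return a

    def emit(self, a, word):
        """Output a ∗ word for a non-empty word: move, then output on the last letter."""
        return vecmat(self.move(a, word[:-1]), self.out[word[-1]])

# States in K^2, outputs in K^1; the matrices are arbitrary illustrative data.
automaton = LinearAutomaton(
    trans={"x": [[1, 1], [0, 1]], "y": [[0, 1], [1, 0]]},
    out={"x": [[1], [0]], "y": [[1], [1]]},
)
a = [2, 3]
u, v = "xy", "yx"

# a ∘ (uv) = (a ∘ u) ∘ v  and  a ∗ (uv) = (a ∘ u) ∗ v
transition_law = automaton.move(a, u + v) == automaton.move(automaton.move(a, u), v)
output_law = automaton.emit(a, u + v) == automaton.emit(automaton.move(a, u), v)
print(transition_law, output_law)
```

The maps γ ↦ (a ↦ a ◦ γ) and γ ↦ (a ↦ a ∗ γ) correspond here to the matrices returned for each word, which is what allows the subsequent linear extension to the semigroup algebra.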
We extend them by linearity to maps μ0 : KΓ → EndK A and μ∗ : KΓ → HomK(A, B). Then there arises the automaton 𝔄L = (A, KΓ, B). We say that in the automaton 𝔄 the bi-identity y ◦ u ≡ 0 (the bi-identity z ∗ u ≡ 0) is fulfilled if for the linear extension σ : KΨ∗ → KΓ of an arbitrary homomorphism σ : Ψ → Γ induced by a specialization σ : X → Γ, the following condition is satisfied: for all a ∈ A we have in the automaton 𝔄L the relation a ◦ u^σ = 0 (a ∗ u^σ = 0).

To a class of automata Θ we associate in F = KΨ∗ the pair of subsets (UΘ, VΘ), the indicator of the class Θ. Here UΘ (the indicator of states of the class Θ) is the subset of all u ∈ KΨ such that the bi-identity y ◦ u ≡ 0 is fulfilled in every automaton in Θ. Similarly, VΘ (the indicator of outputs for Θ) is the subset of all v ∈ KΨ such that the bi-identity z ∗ v ≡ 0 is fulfilled in every automaton in Θ. One sees readily that UΘ is a two-sided ideal in F, while VΘ is a left special ideal in F. For the pair (UΘ, VΘ) we have further UΘ · F ⊂ VΘ (compatibility condition). Indeed, for all a ∈ A, u ∈ UΘ, f ∈ F we have
$$a*(uf)^{\sigma} = a*(u^{\sigma}f^{\sigma}) = (a\circ u^{\sigma})*f^{\sigma} = 0*f^{\sigma} = 0,$$
proving the required statement. A pair (U, V), where U is a two-sided special ideal in F and V is a left special ideal in F, will be called an ideal pair.

On the other hand, let (U, V) be any compatible pair of subsets of KΨ; compatibility means that UF ⊂ V. We associate to such a pair a class of automata Θ by the following rule: the automaton 𝔄 = (A, Γ, B) belongs to the class Θ if in 𝔄 all bi-identities y ◦ u ≡ 0, u ∈ U, hold, and likewise all bi-identities z ∗ v ≡ 0, v ∈ V. It is clear that the class of automata Θ = Θ(U,V) obtained in this way is a variety. Let (UΘ, VΘ) be the indicator of Θ, let U′ be the minimal special ideal in F containing the set U, and let V′ be the minimal special left ideal in F containing V. Then U′ = UΘ and V′ = VΘ.

The equation U′ = UΘ is proved by the following reasoning. As the indicator UΘ is special it suffices, in view of the inclusion U ⊂ UΘ, to show that UΘ is contained in each special ideal I containing U. To this end, we consider the pair (KΨ∗/I, Ψ) induced by the regular action of Ψ on KΨ∗; from U ⊂ I it follows that this pair is contained in Θ. Let J be the subset of all u ∈ KΨ∗ such that the bi-identity y ◦ u ≡ 0 holds in the given pair. We have UΘ ⊂ J. It is not hard to convince oneself that I = J. Indeed, clearly I ⊂ J. Furthermore, for arbitrary g ∈ KΨ∗ and v ∈ J we have I = (g + I) ◦ v = gv + I. Taking g = ε ∈ KΨ∗, we deduce that v ∈ I, which implies J ⊂ I. The statement is proved.

Next, we show that V′ = VΘ. We note that the elements of V′ are sums of the form Σᵢ fᵢvᵢ, fᵢ ∈ F, vᵢ ∈ V. Therefore in each automaton 𝔄 = (A, Γ, B) ∈ Θ we have, for all a ∈ A,
$$a*\Big(\sum_i f_i v_i\Big)^{\sigma} = a*\Big(\sum_i f_i^{\sigma} v_i^{\sigma}\Big) = \sum_i (a\circ f_i^{\sigma})*v_i^{\sigma} = \sum_i a_i * v_i^{\sigma} = 0, \qquad a_i = a\circ f_i^{\sigma}.$$
This proves that V′ ⊂ VΘ. We begin the verification of the converse inclusion with the following observation. From the condition UF ⊂ V it is easy to see that U′F ⊂ V′. Therefore, if V″ is any special left ideal containing the set V, we have the linear automaton 𝔄″ = (F/U′, Ψ, F/V″) with the regular action in the role of ◦ and ∗. From V ⊂ V″ it follows that 𝔄″ ∈ Θ. Regarding the automaton 𝔄″ we prove further that its
indicator W coincides with V″. By definition, W = {w ∈ KΨ | (g + U′) ∗ w = 0 for all g ∈ F}. Next, for each v ∈ V″ we have (g + U′) ∗ v = (g + U′)v = gv + U′v ⊂ V″, from which it follows that V″ ⊂ W. Conversely, for each w ∈ W we have V″ = (ε + U′) ∗ w = (ε + U′)w = w + U′w, but U′w ⊂ U′ ⊂ V′ ⊂ V″. Hence we find w ∈ V″, and thus W ⊂ V″. Consequently, we have proved the required equality W = V″.

Furthermore, we have the following obvious fact: if the ideal pair (UΘ, VΘ) is the indicator of some class of linear automata Θ, while the ideal pair (U″, V″) is the indicator of some concrete automaton in the class Θ, then UΘ ⊂ U″ and VΘ ⊂ V″. From this it follows that VΘ ⊂ ∩_{V⊂V″} V″ = V′. The equality V′ = VΘ is proved.

5. This Subsection is devoted to the proof of the following proposition.

PROPOSITION 3.26. The nontrivial varieties of linear semigroup automata are in bijective correspondence with the ideal pairs of the ring F.

We require an auxiliary result.

LEMMA 3.27. If for a linear automaton 𝔄 = (A, Γ, B) all its subautomata of the form 𝔄a = (a ◦ KΓ∗, Γ, a ∗ KΓ) are contained in the variety Θ, then also 𝔄 ∈ Θ.

PROOF. We select for each a ∈ A an automaton (Ãa, Γ, B̃a) isomorphic to 𝔄a, and form the automaton
$$(\tilde A, \Gamma, \tilde B) = \Big(\bigoplus^{c}_{a\in A}\tilde A_a,\ \Gamma,\ \bigoplus^{c}_{a\in A}\tilde B_a\Big).$$
Each element γ ∈ Γ can be viewed as a constant function: γ(a) = γ for all a ∈ A. In this way the semigroup Γ is embedded in the Cartesian power Γ^A, which induces an embedding of automata (Ã, Γ, B̃) → (Ã, Γ^A, B̃). But (Ã, Γ^A, B̃) = ∏_{a∈A} (Ãa, Γ, B̃a) ∈ Θ, hence also (Ã, Γ, B̃) ∈ Θ. The automaton (Ã, Γ, B̃) contains the subautomaton
$$\mathfrak{A}_A = \Big(\bigoplus^{d}_{a\in A}\tilde A_a,\ \Gamma,\ \bigoplus^{d}_{a\in A}\tilde B_a\Big),$$
where ⊕ᵈ denotes the discrete direct sum of the corresponding modules. However, the class Θ is closed with respect to subautomata, from which it follows that 𝔄A ∈ Θ. The isomorphisms Ãa → a ◦ KΓ∗ and B̃a → a ∗ KΓ induce epimorphisms
$$\bigoplus^{d}_{a\in A}\tilde A_a \twoheadrightarrow \sum_{a\in A} a\circ K\Gamma^{*} = A \quad\text{and}\quad \bigoplus^{d}_{a\in A}\tilde B_a \twoheadrightarrow B' \subset B,$$
so that we obtain an epimorphism of automata 𝔄A ↠ (A, Γ, B′). In view of 𝔄A ∈ Θ it follows now that (A, Γ, B′) ∈ Θ, and hence also (A, Γ, B) = 𝔄 ∈ Θ.

Proof of Proposition 3.26. Let there be given an arbitrary variety of linear automata Θ and an ideal pair (U, V). We have the correspondences
$$(U, V) \to \Theta_{(U,V)} \to (U_{\Theta_{(U,V)}}, V_{\Theta_{(U,V)}}) \quad\text{and}\quad \Theta \to (U_{\Theta}, V_{\Theta}) \to \Theta_{(U_{\Theta}, V_{\Theta})}.$$
In the previous subsection we have, actually, shown that U = U_{Θ(U,V)} and V = V_{Θ(U,V)}. We show now the equality Θ = Θ(UΘ,VΘ); for simplicity of notation, we shall denote the right hand side by the symbol Θ′.
It is clear that it suffices to show the inclusion Θ′ ⊂ Θ for Θ′ = Θ_{(U_Θ,V_Θ)}. To this end it is in turn sufficient to prove that, for each automaton 𝒜 = (A, Γ, B) ∈ Θ′, all its subautomata of the form 𝒜_a = (a ◦ KΓ∗, Γ, a ∗ KΓ), a ∈ A, lie in Θ; this follows from Lemma 3.27. Next, we prove this statement itself. Let a map τ = (τ1, τ2, τ3) of the automaton (KΓ∗, Γ, KΓ) to the automaton 𝒜_a be given by the formula
\[ u^{\tau_1} = a\circ u,\qquad \gamma^{\tau_2} = \gamma,\qquad v^{\tau_3} = a * v \qquad\text{for all } u\in K\Gamma^{*},\ \gamma\in\Gamma,\ v\in K\Gamma. \]
It is clear that τ is an epimorphism of automata. Set Ker τ1 = U and Ker τ3 = V. As 𝒜_a ∈ Θ′ and one has the isomorphism ℬ = (KΓ∗/U, Γ, KΓ/V) ≅ 𝒜_a, we have ℬ ∈ Θ′. In writing down the remaining deductions we shall use the following notation. Let W be an arbitrary special ideal in KΨ∗ and Γ a semigroup. We denote by W_Γ the set of images (values) of all elements of W under the homomorphisms KΨ∗ → KΓ∗ induced by all possible specializations X → Γ. Note that W_Γ is a special ideal in KΓ∗. By definition of the class Θ′ all bi-identities y ◦ u ≡ 0, u ∈ U_Θ, as well as all bi-identities z ∗ v ≡ 0, v ∈ V_Θ, are fulfilled in the automaton ℬ. Hence the ideal (U_Θ)_Γ annihilates the module KΓ∗/U in the regular action ◦, and we have for each u ∈ (U_Θ)_Γ
\[ U = (\varepsilon + U)\circ u = \varepsilon\circ u + U = u + U, \]
which implies u ∈ U. Thus we have shown that (U_Θ)_Γ ⊂ U. In an analogous manner one proves (V_Θ)_Γ ⊂ V. The relations proved guarantee the existence of an epimorphism of the automaton (KΓ∗/(U_Θ)_Γ, Γ, KΓ/(V_Θ)_Γ), contained in Θ, onto the automaton ℬ. Therefore ℬ ∈ Θ, and hence, in view of 𝒜_a ≅ ℬ, it follows that 𝒜_a ∈ Θ. The proof of the proposition formulated at the beginning of this subsection is complete.
3.2.2. Technical results
1. First of all, we mention the following result on the triangular product ▽ of pairs, which is going to be used.
PROPOSITION 3.28. For arbitrary subpairs (A′, Σ1′) and (B′, Σ2′) of (A, Σ1) and (B, Σ2) respectively, the pair (A′, Σ1′) ▽ (B′, Σ2′) belongs to the variety Var((A, Σ1) ▽ (B, Σ2)).
PROOF. Let us introduce the notation (G, Γ) = (A, Σ1) ▽ (B, Σ2). The statement will be established in several steps. First, we note that the embeddings Σi′ → Σi, i = 1, 2, induce in an obvious way the embedding of pairs (A, Σ1′) ▽ (B, Σ2′) → (G, Γ). Let Γ′ be the acting semigroup of the pair (A, Σ1′) ▽ (B, Σ2′). Set H = A + B′. Clearly, H ∩ B = B′, while an immediate verification shows that H ◦ Γ′ ⊂ H.
Therefore we have the epimorphism of pairs (A ⊕ B′, Γ′) → (A, Σ1′) ▽ (B′, Σ2′); cf. Proposition 3.7. The acting semigroup of the pair to the right of the arrow will be denoted Γ″. We remark further that A′ is a Γ″-submodule of A ⊕ B′. Let us consider the pair (A′, Σ1′) ▽ (B′, Σ2′) and distinguish in the semigroup Φ = Hom+(B′, A) the subsemigroup Φ′ of all elements ϕ such that Im ϕ ⊂ A′. Clearly, we have the natural isomorphism Hom+(B′, A′) → Φ′, which again induces an isomorphism of pairs (A′, Σ1′) ▽ (B′, Σ2′) → (A′ ⊕ B′, Φ′ ⋋ (Σ1′ × Σ2′)), from which, in view of the fact that (A′ ⊕ B′, Φ′ ⋋ (Σ1′ × Σ2′)) is a subpair of (A, Σ1′) ▽ (B′, Σ2′), it follows that
there exists an embedding (A′, Σ1′) ▽ (B′, Σ2′) → (A, Σ1′) ▽ (B′, Σ2′). In view of the properties of Var(G, Γ), the constructed morphisms of pairs give the inclusion required in the proposition.
2. Let X = {x1, x2, …} be a countable set, while Ψ and Ψ∗ are the free semigroup and the free monoid respectively with the elements of X as free generators. Furthermore, let Θ be a variety of pairs and U the corresponding special ideal in KΨ∗. The pair (KΨ∗/U, Ψ) is evidently a cyclic pair, and, as is readily seen, free in the variety Θ. It is easy to see that Θ = Var(KΨ∗/U, Ψ).
PROPOSITION 3.29. Let (A, Σ) be an arbitrary pair and (R, Ψ) a free pair in the variety Θ2. Then Var((A, Σ) ▽ (R, Ψ)) = Var(A, Σ) · Θ2.
PROOF. Let us denote Θ1 = Var(A, Σ) and Θ3 = Var((A, Σ) ▽ (R, Ψ)). Using Proposition 3.4 and the Corollary to Proposition 3.9 together with Proposition 3.28 just proved, we deduce that Θ1 · Θ2 ⊂ Θ3. On the other hand, we have Θ3 = Var((A, Σ) ▽ (R, Ψ)) ⊂ Θ1 · Θ2. Hence Θ3 = Θ1 · Θ2.
3. The results of the preceding subsection widen our understanding of the structure of the semigroup of varieties of representations of semigroups. We can at once establish a useful property of this semigroup: it is a semigroup with two-sided cancellation. We formulate this as the following theorem.
THEOREM 3.30. Let Θ, Θ1, Θ2 be arbitrary varieties. The following implications are true:
(a) Θ1 · Θ = Θ2 · Θ ⟹ Θ1 = Θ2;
(b) Θ · Θ1 = Θ · Θ2 ⟹ Θ1 = Θ2.
Let the proof be preceded by two remarks on special ideals in the ring KΨ∗. First, an immediate check of the definitions shows that each special ideal U in KΨ∗ is contained in the fundamental ideal Δ of the semigroup ring KΨ∗. Second, for the semigroup ring KΨ∗ as a ring of polynomials in noncommuting variables from X we have the relation ∩_{n=1}^{∞} Δ^n = 0. This allows us to introduce the notion of the weight v(U) of an ideal U, defining it as the index κ such that U ⊂ Δ^κ, U ⊄ Δ^{κ+1}.
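The weight just defined can be made concrete for single elements. The following is a minimal sketch, not part of the original text, under the assumption that the fundamental ideal Δ is the augmentation ideal of KΨ∗ (the kernel of the map sending every word to 1). Under that assumption Δ is generated by the elements y_i = x_i − 1, and the largest κ with u ∈ Δ^κ is the lowest degree of u when rewritten in the y_i. The names `multiply`, `shift` and `weight` are illustrative only, and polynomials are represented as dictionaries from words (tuples of variable indices) to integer coefficients.

```python
from collections import defaultdict
from itertools import combinations


def multiply(p, q):
    """Product of two noncommutative polynomials (dict: word tuple -> coefficient)."""
    out = defaultdict(int)
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            out[w1 + w2] += c1 * c2  # words multiply by concatenation
    return {w: c for w, c in out.items() if c != 0}


def shift(p):
    """Rewrite p in the variables y_i = x_i - 1, i.e. substitute x_i = 1 + y_i."""
    out = defaultdict(int)
    for word, c in p.items():
        n = len(word)
        # (1 + y_{i1}) ... (1 + y_{in}) expands into one term per subset of positions
        for k in range(n + 1):
            for positions in combinations(range(n), k):
                out[tuple(word[i] for i in positions)] += c
    return {w: c for w, c in out.items() if c != 0}


def weight(p):
    """Largest k with p in Delta^k: the lowest degree of p written in the y_i."""
    shifted = shift(p)
    if not shifted:
        raise ValueError("the zero element lies in every power of Delta")
    return min(len(w) for w in shifted)


commutator = {(1, 2): 1, (2, 1): -1}   # x1 x2 - x2 x1
difference = {(1,): 1, (2,): -1}       # x1 - x2

print(weight(commutator))                        # 2
print(weight(difference))                        # 1
print(weight(multiply(commutator, difference)))  # 3
```

In this example the weights add up under multiplication, which is the behaviour the comparison of weights in the proof of Theorem 3.30 relies on.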
It is easy to see that if a special ideal U is split into the product of two other proper special ideals, then the weight of each factor is less than the weight of U itself.
Proof of Theorem 3.30. (a) We must show that Θ1 ⊂ Θ2 and Θ2 ⊂ Θ1. Let us assume that, for instance, Θ1 ⊄ Θ2. Choose an arbitrary pair (A, Σ) generating the variety Θ1 and let (R, Ψ) be a free pair in Θ. Then, in view of Proposition 3.29, the pair (G, Γ) = (A, Σ) ▽ (R, Ψ) generates the variety Θ1 · Θ = Θ2 · Θ. Let Θ2′ be the radical of the variety Θ2. Let us consider the submodule H = Θ2′(A, Σ) ⊂ A. If we have H = A, then (A, Σ) ∈ Θ2, hence Θ1 = Var(A, Σ) ⊂ Θ2, which contradicts the assumption. Consequently, we must have H < A and we can apply Proposition 3.6; as a result we obtain the relation H = Θ2′(G, Γ), which together with (G, Γ) ∈ Θ2Θ gives (G/H, Γ) ∈ Θ. The natural epimorphism (A, Σ) → (A/H, Σ)
induces an epimorphism (G, Γ) → (A/H, Σ) ▽ (R, Ψ); cf. Proposition 3.4. The submodule H lies, in view of the construction, in the kernel of this epimorphism. But then we have the following commutative diagram of epimorphisms:
\[ (G,\Gamma) \longrightarrow (G/H,\Gamma) \longrightarrow (A/H,\Sigma)\,\triangledown\,(R,\Psi), \]
in which the composite is the epimorphism (G, Γ) → (A/H, Σ) ▽ (R, Ψ) indicated above.
Therefore it follows from (G/H, Γ) ∈ Θ that (A/H, Σ) ▽ (R, Ψ) ∈ Θ, and from this again we find (in view of Proposition 3.29) that Θ ⊂ Var(A/H, Σ) · Θ = Var(A/H, Σ) · Var(R, Ψ) = Var((A/H, Σ) ▽ (R, Ψ)) ⊂ Θ, i.e. Θ = Var(A/H, Σ) · Θ. Next, let us show that the last equality leads to a contradiction. To this end we notice that in view of H < A the variety Var(A/H, Σ) is not the identity variety, and that it follows from Var(A/H, Σ) ⊂ Θ that it cannot be the variety of all pairs. Consequently, to the variety Var(A/H, Σ) there corresponds in KΨ∗ a proper special ideal U2. The special ideal corresponding to Θ shall be denoted U1. In view of Proposition 3.24 we have U1 = U1U2. Comparison of the weights on the left hand side and the right hand side of this equality gives v(U1) = v(U1U2) ≥ v(U1) + v(U2) > v(U1). This is a contradiction. Hence, it is true that Θ1 ⊂ Θ2. As the varieties Θ1 and Θ2 enter this argument in a symmetric fashion, we obtain analogously Θ2 ⊂ Θ1.
(b) Let us assume that Θ1 ⊄ Θ2. Take any pair (A, Σ) generating the variety Θ, and let (R, Ψ) be a free pair in Θ1. According to Proposition 3.29, the pair (G, Γ) := (A, Σ) ▽ (R, Ψ) then generates the variety ΘΘ1 = ΘΘ2. Let ∗Θ2 be the verbal of Θ2. Consider the submodule R0 = ∗Θ2(R, Ψ). If R0 = (0), then (R, Ψ) ∈ Θ2. Then Θ1 = Var(R, Ψ) ⊂ Θ2, contradicting the assumption. Hence R0 > (0). Using Proposition 3.6 we obtain H := ∗Θ2(G, Γ) = A + R0. From (G, Γ) ∈ ΘΘ2 it follows now that (H, Γ) ∈ Θ. We have, however, the natural right epimorphism (H, Γ) → (A, Σ) ▽ (R0, Ψ); so the pair to the right of the arrow belongs also to the variety Θ. Furthermore, note that the free cyclic pair (B, Ψ) in the variety Var(R0, Ψ) is contained in VSC(R0, Ψ); the proof of this fact is done by carrying over Lemma 1.3 in [49], word by word, to the semigroup case. Next, according to Proposition 3.11 the pair (A, Σ) ▽ (R0, Ψ)^I can be embedded into ((A, Σ) ▽ (R0, Ψ))^I. From the above mentioned relation (B, Ψ) ∈ VSC(R0, Ψ) follows the existence of a subpair (B, Σ2) in (R0, Ψ)^I such that there exists a right epimorphism μ : (B, Ψ) → (B, Σ2). The map μ evidently induces an epimorphism of pairs (A, Σ) ▽ (B, Ψ) → (A, Σ) ▽ (B, Σ2),
while it follows from the relations (B, Σ2) ⊂ (R0, Ψ)^I and (A, Σ) ▽ (R0, Ψ)^I ∈ Θ that (A, Σ) ▽ (B, Σ2) ∈ Θ; cf. Proposition 3.28. Hence (A, Σ) ▽ (B, Ψ) ∈ Θ. Let us use Proposition 3.29; as in (a) we deduce that Θ = Θ · Var(B, Ψ). Assume that U1 and U2 are the special ideals corresponding to the varieties Θ and Var(B, Ψ), respectively. We obtain the equality U1 = U2 · U1, which, however, is a contradiction, as a comparison of the weights on the left and on the right shows. Consequently, Θ1 ⊂ Θ2. The roles of Θ1 and Θ2 being symmetric, we derive in an analogous fashion Θ2 ⊂ Θ1. This completes the proof of the theorem.
4. Let us now pass to the presentation of a technical result, which will be necessary in the proof of the Theorem of generators. Namely, we study in detail the form of the bi-identities satisfied by the triangular products of pairs. Let there be given two arbitrary pairs (A, Σ1) and (B, Σ2) and let (G, Γ) be their triangular product. Furthermore, select arbitrary elements γi ∈ Γ, γi = (ϕi, σi′, σi″), where ϕi ∈ Φ = Hom(B, A), σi′ ∈ Σ1, σi″ ∈ Σ2, i = 1, …, n; and let u = u(x1, …, xn) be some fixed element in the semigroup algebra KΨ∗. As a first step in this direction let us compute the element u(γ1, …, γn) ∈ KΓ∗. It is easy to understand that in the basic case when u = f(x1, …, xn) is a word in Ψ∗ ⊂ KΨ∗, the element f(γ1, …, γn) has the form
\[ f(\gamma_1,\dots,\gamma_n) = \Bigl(\sum_{i=1}^{n}\sum_{j=1}^{m_i} r_{ij}(\sigma_1'',\dots,\sigma_n'')\cdot\varphi_i\cdot s_{ij}(\sigma_1',\dots,\sigma_n'),\ f(\sigma_1',\dots,\sigma_n'),\ f(\sigma_1'',\dots,\sigma_n'')\Bigr), \tag{6} \]
here m1 + ··· + mn is the length of the word f ∈ Ψ∗, while each of the elements r_ij(x1, …, xn) and s_ij(x1, …, xn) is defined by the word f and the pair of indices i, j only. The details of the necessary verification are left to the reader. Formula (6) may be written more compactly by taking account of the following. Let us set out from the fact that Φ is an additive Abelian group: therefore, together with the elements of Σk, the elements of Z0Σk ⊂ ZΣk∗, k = 1, 2, also act on Φ, where we denote by Z0 the set of non-negative integers. Setting
\[ \bar r_i = \sum_{j=1}^{m_i} r_{ij}(\sigma_1'',\dots,\sigma_n'') \quad\text{and}\quad \bar s_i = \sum_{j=1}^{m_i} s_{ij}(\sigma_1',\dots,\sigma_n'), \]
we get the following formula,
\[ f(\gamma_1,\dots,\gamma_n) = \Bigl(\sum_{i=1}^{n} \bar r_i\cdot\varphi_i\cdot\bar s_i,\ f(\sigma_1',\dots,\sigma_n'),\ f(\sigma_1'',\dots,\sigma_n'')\Bigr). \]
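As an illustration of formula (6), here is a worked special case that is not taken from the original text. Take n = 2 and the word f = x1x2, write γi = (ϕi, σi′, σi″) with σi′ ∈ Σ1 and σi″ ∈ Σ2, and assume the multiplication rule (ϕ, σ1, σ2)(ϕ′, σ1′, σ2′) = (σ2·ϕ′ + ϕ·σ1′, σ1σ1′, σ2σ2′) of the triangular product, as it is used later in this section.

```latex
\[
f(\gamma_1,\gamma_2)=\gamma_1\gamma_2
=\bigl(\sigma_1''\cdot\varphi_2+\varphi_1\cdot\sigma_2',\ \sigma_1'\sigma_2',\ \sigma_1''\sigma_2''\bigr),
\]
so that $m_1=m_2=1$, $r_{11}=\varepsilon$, $s_{11}=x_2$, $r_{21}=x_1$, $s_{21}=\varepsilon$.
```

Each occurrence of a letter x_i in f thus contributes one summand: the part of the word before that occurrence is evaluated in Σ2 and acts on ϕi from the left, the part after it is evaluated in Σ1 and acts from the right, and m1 + m2 = 2 is the length of f.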
A renewed attempt allows us now also to settle the general case. Indeed, let there be given a fixed element
\[ u = u(x_1,\dots,x_n) = \sum_k \lambda_k f_k(x_1,\dots,x_n),\qquad \lambda_k\in K, \]
in the semigroup algebra KΨ∗, and elements γi ∈ Γ as before. It is not hard to see that there exist in Z0Ψ∗ elements r̄_ik(x1, …, xn) and s̄_ik(x1, …, xn) such that their
values r̄_ik = r̄_ik(σ1″, …, σn″) and s̄_ik = s̄_ik(σ1′, …, σn′) allow us to write the element u(γ1, …, γn) ∈ KΓ∗ in the form
\[ u(\gamma_1,\dots,\gamma_n) = \sum_k \lambda_k f_k(\gamma_1,\dots,\gamma_n) = \Bigl(\sum_k \lambda_k\Bigl(\sum_{i=1}^{n}\bar r_{ik}\cdot\varphi_i\cdot\bar s_{ik}\Bigr),\ \sum_k \lambda_k f_k(\sigma_1',\dots,\sigma_n'),\ \sum_k \lambda_k f_k(\sigma_1'',\dots,\sigma_n'')\Bigr). \]
Below we denote the element \sum_{i=1}^{n} r̄_ik · ϕi · s̄_ik by ψk; the expression for u(γ1, …, γn) can then be written more concisely as
\[ u(\gamma_1,\dots,\gamma_n) = \Bigl(\sum_k \lambda_k\psi_k,\ u(\sigma_1',\dots,\sigma_n'),\ u(\sigma_1'',\dots,\sigma_n'')\Bigr). \tag{7} \]
Let us make explicit how the element u(γ1, …, γn) acts on G. To this end we apply it to the element g = a + b, a ∈ A, b ∈ B. The action of elements in the ring KΓ∗ on G is the linear extension of the action of the elements of Γ∗; therefore, using (7) we see that
\[ g\circ u(\gamma_1,\dots,\gamma_n) = (a+b)\circ\Bigl(\sum_k \lambda_k\psi_k,\ u(\sigma_1',\dots,\sigma_n'),\ u(\sigma_1'',\dots,\sigma_n'')\Bigr) = a\circ u(\sigma_1',\dots,\sigma_n') + \sum_k \lambda_k b^{\psi_k} + b\circ u(\sigma_1'',\dots,\sigma_n''). \]
After these preparatory calculations let us pass to the main issue of this subsection: the form of the bi-identities in (G, Γ) = (A, Σ1) ▽ (B, Σ2). More exactly, we seek the form of the element g ◦ u(γ1, …, γn) under the assumption that in both factors of the triangular product the bi-identity y ◦ u ≡ 0 is satisfied. From this assumption it follows, in particular, that
\[ a\circ u(\sigma_1',\dots,\sigma_n') = 0 \quad\text{and}\quad b\circ u(\sigma_1'',\dots,\sigma_n'') = 0. \]
Thus, we are here led to the formula
\[ g\circ u(\gamma_1,\dots,\gamma_n) = \sum_k \lambda_k b^{\psi_k}. \]
The terms on the right hand side of this equation can be processed further. Assume that we have
\[ \bar r_{ik}(x_1,\dots,x_n) = \sum_p n_{ikp}\, v_{ikp}(x_1,\dots,x_n) \quad\text{and}\quad \bar s_{ik}(x_1,\dots,x_n) = \sum_q m_{ikq}\, w_{ikq}(x_1,\dots,x_n), \]
where all n_ikp, m_ikq ∈ Z0 and all v_ikp(x1, …, xn) and all w_ikq(x1, …, xn) belong to the monoid Ψ∗. For simplicity we write
\[ \bar v_{ikp} = v_{ikp}(\sigma_1'',\dots,\sigma_n'') \quad\text{and}\quad \bar w_{ikq} = w_{ikq}(\sigma_1',\dots,\sigma_n'); \]
in this notation we have
\[ \bar r_{ik} = \sum_p n_{ikp}\,\bar v_{ikp} \quad\text{and}\quad \bar s_{ik} = \sum_q m_{ikq}\,\bar w_{ikq}. \]
In this way we obtain the element in a form of interest to us:
\[ g\circ u(\gamma_1,\dots,\gamma_n) = \sum_k \lambda_k b^{\psi_k} = \sum_k \lambda_k b^{(\sum_i \bar r_{ik}\cdot\varphi_i\cdot\bar s_{ik})} = \sum_{k,i,p,q} (n_{ikp}\cdot m_{ikq}\cdot\lambda_k)\, b^{\bar v_{ikp}\cdot\varphi_i\cdot\bar w_{ikq}}. \tag{8} \]
3.2.3. The fundamental lemma
1. Let K be an arbitrary class of pairs, DK the class of all direct products of pairs in K, Θ = Var K, and (A, Σ) a free pair in Θ. Under these assumptions we have the following.
LEMMA 3.31. If there is given in A a finite linearly independent system of elements a1, …, an, then there exists a pair (B, Σ′) ∈ DK and a homomorphism of pairs μ : (A, Σ) → (B, Σ′) such that the elements a1^μ, …, an^μ are linearly independent in B.
PROOF. The varieties of semigroup pairs are in bijective correspondence with special ideals in the ring KΨ∗; cf. Section 3.1.2. Thus, if the variety Θ corresponds to the special ideal U, then the pair (KΨ∗/U, Ψ) is a free cyclic pair in Θ; therefore the given pair (A, Σ) is a subpair of a Cartesian power of the pair (KΨ∗/U, Ψ). However, using Remak's theorem, we readily see that in K there are pairs (Ai, Σi), i ∈ I, such that there exists a right homomorphism ν of the pair (A, Σ) into the pair ∏_{i∈I} (Ai, Σi); this pair will be denoted (Ā, Σ̄). As ν is the identity map on A, the elements āi = ai^ν, i = 1, …, n, must be linearly independent in Ā. Furthermore, for any set of indices F ⊂ I, let π_F be the natural projection of Ā onto the Cartesian sum of the subspaces Ai whose index i lies in F. Moreover, let Ā^{(F)} be the kernel of π_F; evidently, for any subsets F′, F″ ⊂ I, we have the relation Ā^{(F′)} ∩ Ā^{(F″)} = Ā^{(F′ ∪ F″)}. Finally, let V be the linear hull of the vectors ā1, …, ān. Let us show the existence of a finite subset of I such that the projection corresponding to it induces a monomorphism on V. Indeed, we observe that one has the equalities
\[ \bigcap_{F\subset I} \bar A^{(F)} = \bar A^{(\cup_{F\subset I} F)} = \bar A^{(I)} = (0). \]
From this it follows that
\[ 0 = V\cap\bigcap_{F\subset I}\bar A^{(F)} = \bigcap_{F\subset I}\bigl(V\cap\bar A^{(F)}\bigr). \]
As V is a finite dimensional space, it follows from this that there exists a finite subset F∗ ⊂ I such that V ∩ Ā^{(F∗)} = 0. It is not hard to see that the map π_{F∗} is a monomorphism on V. Similarly to what was done above for the domain of action, we define a projection π_F : Σ̄ → ∏_{i∈F} Σi for the acting semigroups; in this way we get a projection of pairs π_F : (Ā, Σ̄) → ∏_{i∈F} (Ai, Σi). Let us set (B, Σ′) = ∏_{i∈F∗} (Ai, Σi). It is clear that
(B, Σ′) ∈ DK and that the homomorphism μ = νπ_{F∗} : (A, Σ) → (B, Σ′) satisfies the desired requirement.
2. We are in a position to formulate and prove a fundamental lemma en route to the Theorem on a generating pair.
LEMMA 3.32. Let the variety Θ1 be generated by the single pair (A, Σ1) and assume that the variety Θ2 is generated by an arbitrary class of pairs K2, subject to the condition DK2 = K2. Then Θ1Θ2 = Var((A, Σ1) ▽ K2).
PROOF. Clearly, we have the inclusion Var((A, Σ1) ▽ K2) ⊂ Θ1Θ2. However, if (R, Ψ) is a free pair in Θ2, then we have by virtue of Proposition 3.29 Θ1Θ2 = Var((A, Σ1) ▽ (R, Ψ)). Consequently, every bi-identity of the pair (A, Σ1) ▽ (R, Ψ) is also true in the pairs (A, Σ1) ▽ (B, Σ2), where (B, Σ2) ∈ K2. All this leads thus to the verification of the following statement: if a certain bi-identity y ◦ u ≡ 0 is not fulfilled in the pair (G̃, Γ̃) = (A, Σ1) ▽ (R, Ψ), then there exists a pair (B, Σ2) ∈ K2 such that this bi-identity is not fulfilled in the pair (A, Σ1) ▽ (B, Σ2) either.
First, we may assume that in both varieties Θ1 and Θ2 the bi-identity y ◦ u ≡ 0 is fulfilled. Indeed, if the bi-identity y ◦ u ≡ 0 is not fulfilled in Θ2, then there exists a pair (B, Σ2) ∈ K2 in which the said bi-identity is not fulfilled. But then this bi-identity cannot be fulfilled in (A, Σ1) ▽ (B, Σ2) either, and our assertion is proved. If, however, the bi-identity y ◦ u ≡ 0 is not fulfilled in Θ1, then it can hold neither in (A, Σ1) nor in (A, Σ1) ▽ (B, Σ2), for any choice of (B, Σ2) ∈ K2, and in this case again everything is proved.
Using this observation we assume that the bi-identity y ◦ u ≡ 0 is not fulfilled in (G̃, Γ̃), but holds true in Θ1 and Θ2. This means that there exist g∗ ∈ G̃ and γ1∗, …, γn∗ ∈ Γ̃ such that g∗ ◦ u(γ1∗, …, γn∗) ≠ 0. In view of this condition, if g∗ = a + h, where a ∈ A, h ∈ R, and γi∗ = (ϕi∗, σi′, σi″), where ϕi∗ ∈ Hom+(R, A), σi′ ∈ Σ1, σi″ ∈ Ψ, i = 1, …, n, then
\[ a\circ u(\sigma_1',\dots,\sigma_n') = 0 \quad\text{and}\quad h\circ u(\sigma_1'',\dots,\sigma_n'') = 0. \]
With the aid of formula (8) of the previous Subsection we have
\[ g^{*}\circ u(\gamma_1^{*},\dots,\gamma_n^{*}) = \sum_{i,k,p,q} (n_{ikp}\cdot m_{ikq}\cdot\lambda_k)\, h^{\bar v_{ikp}\cdot\varphi_i^{*}\cdot\bar w_{ikq}}. \]
Let V be the linear hull in R of the finite set of elements h ◦ v̄_ikp, taken over all i, k, p. This is a finite dimensional subspace in R and so we may apply Lemma 3.31. In view of this result there exists a pair (B, Σ2) ∈ K2 and a homomorphism μ : (R, Ψ) → (B, Σ2) which is a monomorphism on V. It turns out that in the pair (G, Γ) = (A, Σ1) ▽ (B, Σ2) the bi-identity y ◦ u ≡ 0 is not fulfilled. In order to prove this let us consider a K-morphism ν : B → R which is inverse to μ on V^μ and defined in an arbitrary, but fixed manner outside V^μ; such a morphism can be defined in a corresponding way on a basis of B obtained by complementing a basis of V^μ ⊂ B. Moreover, we put
(1) ϕi := νϕi∗, i = 1, …, n; it is clear that ϕi ∈ Hom+(B, A). We further remark that
\[ [(h\circ\bar v_{ikp})^{\mu}]^{\varphi_i} = [(h\circ\bar v_{ikp})^{\mu\nu}]^{\varphi_i^{*}} = (h\circ\bar v_{ikp})^{\varphi_i^{*}}; \]
(2) b := h^μ, and g := a + b ∈ A ⊕ B;
(3) τi := (σi″)^μ ∈ Σ2, i = 1, …, n;
(4) γi := (ϕi, σi′, τi) ∈ Hom+(B, A) ⋋ (Σ1 × Σ2);
(5) ṽ_ikp = v_ikp(τ1, …, τn) = [v_ikp(σ1″, …, σn″)]^μ ∈ Σ2.
According to formula (8) we have in this notation
\[ g\circ u(\gamma_1,\dots,\gamma_n) = \sum_{i,k,p,q} (n_{ikp}\, m_{ikq}\,\lambda_k)\, b^{\tilde v_{ikp}\cdot\varphi_i\cdot\bar w_{ikq}}. \]
The sum on the right in this equation admits a not very difficult transformation¹² showing that it equals g∗ ◦ u(γ1∗, …, γn∗). However, g∗ ◦ u(γ1∗, …, γn∗) ≠ 0, so we conclude that g ◦ u(γ1, …, γn) ≠ 0. Hence, y ◦ u ≡ 0 cannot hold in (G, Γ). Thereby, our statement is proved and so also Lemma 3.32.
¹² But it is hard to write down and so will be omitted.
3.2.4. The theorem on generating representations of semigroups
This section will be devoted to the proof of one of the fundamental results of this chapter, Theorem 3.33 below. It is a key result and admits a series of consequences for the structure of classes of linear representations of semigroups, and gives also means for the study of interesting individual representations.
THEOREM 3.33. Let K1 and K2 be any two classes of linear representations (over a field K) of semigroups. The following formula holds true: Var K1 · Var K2 = Var(K1 ▽ K2).
PROOF. Let us introduce the notations Θ = Var(K1 ▽ K2) and Θi = Var Ki, i = 1, 2. As was shown in Paragraph 3 of Section 3.1.2, for arbitrary pairs (A, Σ1) ∈ Θ1 and (B, Σ2) ∈ Θ2 it holds that (A, Σ1) ▽ (B, Σ2) ∈ Θ1Θ2. Therefore we have the inclusion K1 ▽ K2 ⊂ Θ1Θ2, which also implies that Θ ⊂ Θ1Θ2. It remains to prove the converse inclusion Θ1Θ2 ⊂ Θ. The corresponding reasoning will be given in two steps. In the first of them we assume temporarily that one can remove the restriction DK2 = K2 in Lemma 3.32, and show that this can be used in the proof at hand. In the second step we show that this refined version of Lemma 3.32, indeed, holds true.
The first step is a reduction. Let (A, Σ1) be a faithful pair generating the variety Θ1; as (A, Σ1) we may take, for instance, the "faithfulling" of a free cyclic pair in Θ1. One sees readily that under these assumptions there exists a family of pairs (Ai, Σi) ∈ K1, i ∈ I, and a subpair (A″, Σ″) in the Cartesian product (A′, Σ′) = ∏_{i∈I} (Ai, Σi) such that there is an epimorphism of pairs (A″, Σ″) → (A, Σ1).
Next, let us fix an arbitrary pair (B, Σ2) in the class K2. According to Proposition 3.9 there exists an embedding
\[ (A',\Sigma')\,\triangledown\,(B,\Sigma_2) \to \prod_{i\in I}\bigl((A_i,\Sigma_i)\,\triangledown\,(B,\Sigma_2)\bigr), \]
where the pair to the right of the arrow, clearly, lies in Θ. But then the same is also true for the pair (A′, Σ′) ▽ (B, Σ2) and also for the pair (A″, Σ″) ▽ (B, Σ2); cf. Proposition 3.28. Finally, we use Proposition 3.4: the epimorphism (A″, Σ″) → (A, Σ1) guarantees the relation (A, Σ1) ▽ (B, Σ2) ∈ Θ. To sum up, we see that (A, Σ1) ▽ K2 ⊂ Θ, from which, on the basis of our above assumption and Lemma 3.32, the relation Θ1Θ2 ⊂ Θ follows at once.
The second step is the refinement of Lemma 3.32. Assume that the class K1 consists of the single pair (A, Σ), the class K2 being arbitrary. Furthermore, let us denote K̄2 = DK2, Θ = Var(K1 ▽ K2) and Θi = Var Ki, i = 1, 2. It follows immediately from Lemma 3.32 that Var(K1 ▽ K̄2) = Θ1Θ2. Therefore we have Θ ⊂ Θ1Θ2. Let us show that we have also the converse inclusion Θ1Θ2 ⊂ Θ.
Let (Bi, Σi), i = 1, …, n, be an arbitrary finite family in the class K2; (B̄, Σ̄) = ∏_{i=1}^{n} (Bi, Σi); G = A + B̄; Γ = Hom(B̄, A) ⋋ (Σ × Σ̄). An easy verification shows that the subspaces A + Bi ⊂ G, i = 1, …, n, are Γ-invariant; thus we have the pairs (A + Bi, Γ). We show that all these pairs lie in the variety Θ. Indeed, the pairs (A, Σ) ▽ (Bi, Σi) evidently lie in Θ. Let us prove the existence of epimorphisms μi : (A + Bi, Γ) → (A, Σ) ▽ (Bi, Σi), from which it will follow that (A + Bi, Γ) ∈ Θ, i = 1, …, n.
On the domains of action we define μi as the identity. The natural projections Σ̄ → Σi give homomorphisms μi : Σ × Σ̄ → Σ × Σi. The association to each ϕ ∈ Hom(B̄, A) of its restriction to Bi defines an epimorphism μi : Hom(B̄, A) → Hom(Bi, A); it suffices to recall that it is sufficient to give a K-morphism on a basis of B̄ obtained by completing a basis of Bi ⊂ B̄. Let us check that the triple of maps μi thus defined gives a morphism of the triangular products μi : Γ → Hom+(Bi, A) ⋋ (Σ × Σi). Take any two elements (ϕ, σ1, σ2) and (ϕ′, σ1′, σ2′) in Γ.
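We compute
\[ [(\varphi,\sigma_1,\sigma_2)(\varphi',\sigma_1',\sigma_2')]^{\mu_i} = \bigl((\sigma_2\cdot\varphi' + \varphi\cdot\sigma_1')^{\mu_i},\ \sigma_1\sigma_1',\ (\sigma_2\sigma_2')^{\mu_i}\bigr) = \bigl(\sigma_2^{\mu_i}\cdot\varphi'^{\mu_i} + \varphi^{\mu_i}\cdot\sigma_1',\ \sigma_1\sigma_1',\ \sigma_2^{\mu_i}\cdot\sigma_2'^{\mu_i}\bigr) = (\varphi,\sigma_1,\sigma_2)^{\mu_i}\cdot(\varphi',\sigma_1',\sigma_2')^{\mu_i}. \]
For this it suffices to verify the relation (σ2·ϕ′ + ϕ·σ1′)^{μi} = σ2^{μi}·ϕ′^{μi} + ϕ^{μi}·σ1′, which we used in these computations. Indeed, for each b ∈ Bi we have b ◦ σ2 = b ◦ σ2^{μi} ∈ Bi, from which it follows that
\[ b^{(\sigma_2\cdot\varphi' + \varphi\cdot\sigma_1')^{\mu_i}} = b^{\sigma_2\cdot\varphi' + \varphi\cdot\sigma_1'} = (b\circ\sigma_2)^{\varphi'} + (b^{\varphi})\circ\sigma_1' = (b\circ\sigma_2^{\mu_i})^{\varphi'^{\mu_i}} + (b^{\varphi})\circ\sigma_1' = b^{\sigma_2^{\mu_i}\cdot\varphi'^{\mu_i} + \varphi^{\mu_i}\cdot\sigma_1'}. \]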
Our statement is completely proved. It remains to establish that the map μi agrees with the action in the pairs considered. Take arbitrary elements a + b ∈ A + Bi and (ϕ, σ1, σ2) ∈ Γ, and let us provide the necessary verification, taking into account that the map is the identity on the domains of action. We have
\[ (a+b)^{\mu_i}\circ(\varphi,\sigma_1,\sigma_2)^{\mu_i} = (a+b)\circ(\varphi^{\mu_i},\sigma_1,\sigma_2^{\mu_i}) = b^{\varphi^{\mu_i}} + a\circ\sigma_1 + b\circ\sigma_2^{\mu_i} = b^{\varphi} + a\circ\sigma_1 + b\circ\sigma_2 = (a+b)\circ(\varphi,\sigma_1,\sigma_2). \]
To sum up, we have proved the relation (A + Bi, Γ) ∈ Θ, i = 1, …, n. But the Γ-modules A + Bi, i = 1, …, n, generate the module G. Hence, repeating the train of thought in the proof of Lemma 3.27 we deduce that (G, Γ) ∈ Θ. Consequently, we have K1 ▽ K̄2 ⊂ Θ, which at once implies the desired relation Θ1Θ2 = Var(K1 ▽ K̄2) ⊂ Θ.
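The multiplication rule and the action used in the computations above can be checked on a small numerical model. The following is a minimal sketch and not the author's construction: it assumes A = K^2 and B = K^3 with row vectors, represents the elements of Σ1 and Σ2 by integer matrices acting on the right, and represents ϕ ∈ Hom(B, A) by a 3×2 matrix. The names `mat_mul`, `mat_add`, `multiply` and `act` are illustrative only.

```python
def mat_mul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def mat_add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def multiply(g, h):
    """(phi, s1, s2)(phi', s1', s2') = (s2*phi' + phi*s1', s1*s1', s2*s2')."""
    phi, s1, s2 = g
    phi2, t1, t2 = h
    return (mat_add(mat_mul(s2, phi2), mat_mul(phi, t1)),
            mat_mul(s1, t1),
            mat_mul(s2, t2))


def act(vec, g):
    """(a + b) o (phi, s1, s2) = (a o s1 + b^phi) + b o s2, as a pair (A-part, B-part)."""
    a, b = vec
    phi, s1, s2 = g
    return (mat_add(mat_mul(a, s1), mat_mul(b, phi)), mat_mul(b, s2))


# Arbitrary sample data: phi is 3x2 (B -> A), s1 is 2x2 (on A), s2 is 3x3 (on B).
g1 = ([[1, 0], [0, 1], [2, 1]], [[1, 1], [0, 1]], [[0, 1, 0], [0, 0, 1], [1, 0, 0]])
g2 = ([[0, 1], [1, 1], [1, 0]], [[2, 0], [1, 1]], [[1, 0, 0], [1, 1, 0], [0, 0, 1]])
g3 = ([[1, 1], [0, 2], [3, 0]], [[0, 1], [1, 0]], [[1, 1, 0], [0, 1, 0], [0, 0, 2]])
v = ([[1, 2]], [[3, 0, 1]])  # a in A and b in B as 1-row matrices

# The action is compatible with the product, and the product is associative.
print(act(act(v, g1), g2) == act(v, multiply(g1, g2)))
print(multiply(multiply(g1, g2), g3) == multiply(g1, multiply(g2, g3)))
```

Both comparisons hold because, in this model, a triple (ϕ, σ1, σ2) is the block triangular matrix with diagonal blocks σ1, σ2 and off-diagonal block ϕ, so the rule for the first component is ordinary block matrix multiplication.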
3.2.5. Consequences. Connections with linear automata
1. The rising interest in the arithmetic and the geometry of non-commutative rings gives the stimulus for the study of the question of unique factorization of elements in semigroups. Here we shall prove that, in particular, unique factorization holds in the semigroup of varieties of linear representations of semigroups. We are going to use the following lemma.
LEMMA 3.34. Assume that the relations Θ1Θ2 = Θ1′Θ2′ and Θ2 ⊄ Θ2′ hold for the varieties Θ1, Θ2, Θ1′, Θ2′. Then there exists a variety Θ3 such that Θ1Θ2 = Θ1Θ3Θ2′.
PROOF. Let (Ri, Ψ) be a free pair in Θi, i = 1, 2, and set (G, Γ) = (R1, Ψ) ▽ (R2, Ψ). In view of the relations Θ2 = Var(R2, Ψ) and Θ2 ⊄ Θ2′ we have (R2, Ψ) ∉ Θ2′. We denote by ∗Θ2′ the verbal of the variety Θ2′, and set B = ∗Θ2′(R2, Ψ). One can check that B ≠ 0; otherwise we would have (R2, Ψ) = (R2/B, Ψ) ∈ Θ2′, which is a contradiction. We take Θ3 = Var(B, Ψ) and prove first that Θ1Θ3Θ2′ ⊂ Θ1′Θ2′. Indeed, from B > 0 it follows that ∗Θ2′(G, Γ) = R1 + B, and there exists a right epimorphism (R1 + B, Γ) → (R1, Ψ) ▽ (B, Ψ), as follows from Propositions 3.6 and 3.7. However, by virtue of Theorem 3.33 the pair (R1, Ψ) ▽ (B, Ψ) generates the variety Θ1Θ3, from where, due to the epimorphism indicated above, it follows that Var(R1 + B, Γ) = Θ1Θ3. Note also that (G, Γ) ∈ Θ1Θ2 = Θ1′Θ2′, which is equivalent to the inclusion (∗Θ2′(G, Γ), Γ) ∈ Θ1′. To sum up, we have shown that Θ1Θ3 ⊂ Θ1′, from which the relation required here follows. On the other hand, by virtue of Theorem 3.33 we have Var(G, Γ) = Θ1Θ2, so the relation (∗Θ2′(G, Γ), Γ) ∈ Θ1Θ3 obtained above is equivalent to (G, Γ) ∈ Θ1Θ3Θ2′. We obtain Θ1′Θ2′ = Θ1Θ2 ⊂ Θ1Θ3Θ2′, which together with the inclusion proved above gives Θ1Θ2 = Θ1Θ3Θ2′.
A variety is called indecomposable if it cannot be presented as the product of two non-trivial factors.
The main consequence of the Theorem of Generating Representations is the following.
THEOREM 3.35. Each variety of linear representations (over a field K) of semigroups can be uniquely decomposed as a product of finitely many indecomposable varieties.
PROOF. Let us first show the possibility of decomposing every variety as a product of finitely many indecomposable varieties. The anti-isomorphism between the semigroup of varieties of pairs and the semigroup of proper special ideals of KΨ∗ permits us to translate this statement into the language of ideals: we have to replace the word "variety" by "proper special ideal". In this new formulation the statement is readily proved by induction on the weight of the special ideal considered. It remains to prove the uniqueness of the decomposition. It is not hard to see that this follows from the following fact: if the varieties Θ1 and Θ1′ are indecomposable, then, for any varieties Θ2 and Θ2′, the equality Θ1Θ2 = Θ1′Θ2′ implies Θ1 = Θ1′ and Θ2 = Θ2′. In order to prove this statement we replace Θ2 = Θ2′ by the equivalent pair of inclusions Θ2 ⊂ Θ2′ and Θ2′ ⊂ Θ2, and assume that Θ2 ⊄ Θ2′. In view of Lemma 3.34 there exists then a variety Θ3 such that Θ1Θ3Θ2′ = Θ1′Θ2′. Next, using Theorem 3.30 and cancelling this identity on the right by Θ2′, we obtain Θ1′ = Θ1Θ3, which contradicts the indecomposability of Θ1′. The relation Θ2′ ⊂ Θ2 is proved analogously. Thus, the equation Θ2 = Θ2′ is established. Applying Theorem 3.30 anew, we deduce from Θ1Θ2 = Θ1′Θ2 that Θ1 = Θ1′.
THEOREM 3.36. The semigroup of varieties of linear representations (over a field K) of semigroups is free.
This theorem follows at once from Theorem 2.3.
2. The role of the wreath product of groups in the proof of the theorem of Shmel’kin and the Neumanns on the possibility of free generation of nontrivial varieties of groups by indecomposable varieties of groups is well-known; cf. [51, Theorem 23.4]. However, there is a different path of proof using another technique [64]. Guided by this, we give, for the sake of completeness, another proof of Theorem 3.36, which, moreover, works inside the ring KΨ∗; it is a suitable reinterpretation of the argument in [56]. Moreover, it is convenient to give here a new formulation of Theorem 3.36: The semigroup of proper special ideals of the ring KΨ∗ is free.
Second proof of Theorem 3.36. It is mentioned in [56] that the semigroup ring F = KΨ∗ (it is a free associative algebra with unit on X over K) is a left and right FI-ring without non-trivial elements invariant from the right. Consequently, one can apply Theorem 5 in the same paper [56]; according to this theorem, the semigroup R of all nonzero two-sided ideals of the ring F is free with the set of all indecomposable proper ideals of F in the role of the system of free generators. Furthermore, we note that the product of proper special ideals of F is again a proper special ideal, and in this way one distinguishes in R a subsemigroup S of such ideals. Clearly, our theorem is proved if we show that, for arbitrary ideals A and B in R such that AB ∈ S, it is true that A ∈ S and B ∈ S. This will be proved below.
We remark that from the uniqueness of decomposition of an ideal A ∈ S into indecomposable factors follows the invariance of these factors with respect to each special¹³ automorphism of the ring F. Moreover, it is expedient to introduce the following notion. An endomorphism of the ring F is called particular if it is induced by an endomorphism η of the monoid Ψ∗ such that X ⊂ X^η. Let us show that for each particular endomorphism η we have A ⊂ A^η. Indeed, let u be an arbitrary element of A and let S = {x1, …, xn} ⊂ X be such that u ∈ K⟨x1, …, xn⟩. In view of the particularity of η there exist xi′ ∈ X such that (xi′)^η = xi, i = 1, …, n. Consider a permutation γ on X such that xi^γ = xi′, i = 1, …, n, and extend it to an automorphism of F. It is clear that γ is a special automorphism of F and according to the above remark we therefore have A^γ = A. By our construction u^{γη} = u; hence, u = u^{γη} ∈ A^{γη} = A^η. Thus we have proved that A ⊂ A^η.
Next, we complete the proof of our main statement. The map η being particular and AB a special ideal, we deduce that AB ⊃ (AB)^η = A^ηB^η ⊃ A^ηB ⊃ AB, hence AB = A^ηB. The particularity of η further forces F^η = F. Therefore A^η is an ideal in F and, next, from the freedom of the semigroup R it follows, in particular, that A = A^η. Furthermore, let μ be a special endomorphism of the ring F. For each u ∈ A one can construct a particular endomorphism η : F → F which coincides with μ on the element u. Indeed, for all xi ∈ S we set xi^η = xi^μ ∈ Ψ∗, and on the complement X∖S we define η as an arbitrary surjective map X∖S → X. The map η : X → Ψ∗ obtained in this way is extended to a special endomorphism η : F → F which is particular by construction. We have u^μ = u^η ∈ A^η = A, which proves that A is a special ideal. In an analogous way one proves that B is special. This completes the proof.
3. Let us consider the relation between the above mentioned material and the theory of automata.
An automaton $\mathcal{A}' = (A', \Gamma, B')$ is called an invariant subautomaton of the linear automaton $\mathcal{A} = (A, \Gamma, B)$ if the following conditions are fulfilled: (1) $A' \subset A$ and $B' \subset B$ are $K$-submodules; (2) $A'$ is $\Gamma$-invariant with respect to the action $\circ$; (3) for any $a \in A'$ and $\gamma \in \Gamma$ we have $a * \gamma \in B'$. Every invariant subautomaton $\mathcal{A}' \subset \mathcal{A}$ is accompanied by a factor automaton $\mathcal{A}/\mathcal{A}' = (A/A', \Gamma, B/B')$, where for all $\bar{a} \in A/A'$ and $\gamma \in \Gamma$ we put $\bar{a} \circ \gamma = \overline{a \circ \gamma}$ and $\bar{a} * \gamma = \overline{a * \gamma}$. It is apparent that this definition is consistent. With this notion at our disposal, we can define a corresponding associative multiplication of varieties of linear automata. Let $\Theta_1$ and $\Theta_2$ be any two varieties of linear automata. Then, by definition, $\mathcal{A} = (A, \Gamma, B) \in \Theta_1 \cdot \Theta_2$ if there exists an invariant subautomaton $\mathcal{A}' \subset \mathcal{A}$, $\mathcal{A}' \in \Theta_1$, such that $\mathcal{A}/\mathcal{A}' \in \Theta_2$. We denote by $M^a(K)$ the semigroup of varieties of linear automata over $K$. Each linear automaton $(A, \Gamma, B)$ is accompanied by a linear pair $(A, \Gamma)$, and the semigroup $M(K)$ of the varieties of such pairs is free (Theorem 3.36). It is natural to try to settle the question of the freedom
¹³ Such a name is given to those automorphisms (endomorphisms) of the ring $F$ which are induced by automorphisms (endomorphisms) of the monoid $\Psi^*$.
CHAPTER I. REPRESENTATIONS OF SEMIGROUPS AND ALGEBRAS
of the semigroup of varieties of linear automata $M^a(K)$. The answer is given in the following theorem.
THEOREM 3.37. The semigroup $M^a(K)$ of varieties of linear automata (over the field $K$) is not free, but it contains a maximal free subsemigroup isomorphic to the semigroup of varieties of linear representations of semigroups.
PROOF. Introduce on the set $I^a(K)$ of ideal pairs (cf. Section 3.1.4) the following multiplication:
\[ (U_1, V_1) * (U_2, V_2) = (U_1 U_2, U_1 V_2). \]
It is clear that $I^a(K)$ equipped with this multiplication is a semigroup, the semigroup of ideal pairs. It turns out that this semigroup is anti-isomorphic to the semigroup $M^a(K)$. In order to see this we have to show: if the varieties of linear automata $\Theta_i$ are defined by the ideal pairs $(U_i, V_i)$, $i = 1, 2$, then the variety $\Theta_1 \cdot \Theta_2$ is defined by the ideal pair $(U_2, V_2) * (U_1, V_1) = (U_2 U_1, U_2 V_1)$. Let us denote by $\Theta$ the variety defined by the latter ideal pair. It is easy to check that the automaton $\mathcal{A} = (F/U_2U_1, \Psi, F/U_2V_1) \in \Theta$ is a free¹⁴ linear automaton in the variety $\Theta$. Let us also take into consideration the following two automata: $\mathcal{A}_1 = (F/U_1, \Psi, F/V_1)$ and $\mathcal{A}_2 = (F/U_2, \Psi, F/V_2)$; it is clear that the $\mathcal{A}_i$ are free in the varieties $\Theta_i$, $i = 1, 2$, respectively.
Let us show that $\mathcal{A} \in \Theta_1 \cdot \Theta_2$. Consider in $\mathcal{A}$ the invariant subautomaton $\mathcal{A}_3 = (U_2/U_2U_1, \Psi, V_2/U_2V_1)$; the properties of the ideal pair $(U_2, V_2)$ guarantee its existence. One has $\mathcal{A}_3 \in \Theta_1$. Indeed, we have the relations
\[ (U_2/U_2U_1) \circ U_1 = (U_2/U_2U_1) \cdot U_1 = U_2U_1/U_2U_1 \]
and
\[ (U_2/U_2U_1) * V_1 = (U_2/U_2U_1) \cdot V_1 = U_2V_1/U_2V_1 . \]
This means that in $\mathcal{A}$ there is an invariant subautomaton $\mathcal{A}_3$, $\mathcal{A}_3 \in \Theta_1$, such that $\mathcal{A}/\mathcal{A}_3 = (F/U_2, \Psi, F/V_2) \in \Theta_2$. Hence, it follows by definition that $\mathcal{A} \in \Theta_1 \cdot \Theta_2$. So we have proved that $\Theta \subset \Theta_1 \cdot \Theta_2$.
Let us show the converse inclusion $\Theta_1 \cdot \Theta_2 \subset \Theta$. Take any automaton $\mathcal{A} = (A, \Gamma, B) \in \Theta_1 \cdot \Theta_2$. By definition, there exists an invariant subautomaton $\mathcal{A}' = (A', \Gamma, B') \subset \mathcal{A}$, $\mathcal{A}' \in \Theta_1$, such that $\mathcal{A}/\mathcal{A}' = (A/A', \Gamma, B/B') \in \Theta_2$. With the help of this we show that $\mathcal{A} \in \Theta$. We have to verify that $\mathcal{A}$ satisfies all bi-identities $y \circ u \equiv 0$, $u \in U_2U_1$, and all bi-identities $z * v \equiv 0$, $v \in U_2V_1$. The interpretation of the relations $\mathcal{A}' \in \Theta_1$, $\mathcal{A}/\mathcal{A}' \in \Theta_2$ gives $A' \circ U_1^{\sigma} = 0$, $A' * V_1^{\sigma} = 0$ and $A \circ U_2^{\sigma} \subset A'$, $A * V_2^{\sigma} \subset B'$ for each specialization homomorphism $\sigma$. This implies that
\[ A \circ (U_2U_1)^{\sigma} = A \circ (U_2^{\sigma}U_1^{\sigma}) = (A \circ U_2^{\sigma}) \circ U_1^{\sigma} \subset A' \circ U_1^{\sigma} = 0 \]
and
\[ A * (U_2V_1)^{\sigma} = A * (U_2^{\sigma}V_1^{\sigma}) = (A \circ U_2^{\sigma}) * V_1^{\sigma} \subset A' * V_1^{\sigma} = 0. \]
¹⁴ The notion of a free linear automaton (free in a given variety) is formulated according to the usual categorical scheme, and is left to the reader.
From these computations it follows that $\mathcal{A} \in \Theta$. The relation $\Theta_1 \cdot \Theta_2 \subset \Theta$ has been checked, and thus we have established the statement made at the beginning of the proof.
In view of the anti-isomorphism of the semigroups $M^a = M^a(K)$ and $I^a = I^a(K)$ it is sufficient to prove the non-freedom of the semigroup $I^a$. We assume the converse, and take three arbitrary ideal pairs $(U_1, V_1)$, $(U_1, V_1')$ and $(U_2, V_2)$ with $V_1 \neq V_1'$. Then we have
\[ (U_1, V_1) * (U_2, V_2) = (U_1U_2, U_1V_2) = (U_1, V_1') * (U_2, V_2). \]
But the semigroup $I^a$, being free by assumption, is a semigroup with cancellation. Therefore, from the equality $(U_1, V_1) * (U_2, V_2) = (U_1, V_1') * (U_2, V_2)$ we deduce that $(U_1, V_1) = (U_1, V_1')$, contradicting the condition $V_1 \neq V_1'$.
At the same time, there is an epimorphism $\tau : M^a \to M$ of the semigroup $M^a$ onto the free semigroup $M$, which in the language of ideal pairs is given by the formula $(U, V)^{\tau} = U$. Moreover, in the semigroup $M^a$ we can distinguish the free subsemigroup $M_0$, isomorphic to $M$; in the same language of ideal pairs $M_0$ is described as the subsemigroup of all ideal pairs of the form $(U, U)$. It is easy to see that $M_0$ is a maximal free subsemigroup of $M^a$. Indeed, in the opposite case one can embed $M_0$ into a larger free subsemigroup $M_1 \subset M^a$, whose anti-isomorphic copy $I_1^a$ in $I^a$ contains ideal pairs of the form $(U_1, V_1)$ with $V_1 \neq U_1$. But then we have, for each pair $(U_2, V_2) \in I^a$,
\[ (U_1, V_1) * (U_2, V_2) = (U_1U_2, U_1V_2) = (U_1, U_1) * (U_2, V_2). \]
We obtain a relation which, by virtue of $V_1 \neq U_1$, cannot hold in the free semigroup $I_1^a$. Thus Theorem 3.37 is proved.
4. The fact established in Theorem 3.35 raises the question of describing the indecomposable varieties of linear representations of semigroups. The corresponding question for varieties of group pairs is discussed in [30]. It turns out that these arguments remain in force also for semigroup pairs.
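The failure of cancellation used in the proof of Theorem 3.37 can be checked mechanically on a toy model. The sketch below is only an illustration under strong assumptions: ideals are replaced by positive integers (standing for the principal ideals $n\mathbb{Z}$ of $\mathbb{Z}$, whose product is $n\mathbb{Z} \cdot m\mathbb{Z} = nm\mathbb{Z}$), and the conditions that Section 3.1.4 imposes on an ideal pair are not modelled at all. The names `Pair`, `mul`, `tau` and `is_associative` are hypothetical and do not come from the text.

```python
from itertools import product
from typing import Iterable, Tuple

# Toy stand-in for an ideal pair (U, V): each ideal is a positive integer n,
# read as the principal ideal nZ of Z (assumption made for illustration only).
Pair = Tuple[int, int]


def mul(p: Pair, q: Pair) -> Pair:
    """Toy version of (U1, V1) * (U2, V2) = (U1 U2, U1 V2)."""
    (u1, _v1), (u2, v2) = p, q  # note: V1 never enters the product
    return (u1 * u2, u1 * v2)


def tau(p: Pair) -> int:
    """Toy version of the map (U, V) -> U."""
    return p[0]


def is_associative(sample: Iterable[Pair]) -> bool:
    """Brute-force associativity check of `mul` on a finite sample."""
    sample = list(sample)
    return all(
        mul(mul(p, q), r) == mul(p, mul(q, r))
        for p, q, r in product(sample, repeat=3)
    )


SAMPLE = [(u, v) for u in range(1, 5) for v in range(1, 5)]

if __name__ == "__main__":
    # Two different left factors with the same first component ...
    left, left_prime, right = (2, 3), (2, 4), (5, 7)
    # ... give the same product, so right cancellation fails.
    print(mul(left, right), mul(left_prime, right))
    print("associative on sample:", is_associative(SAMPLE))
```

In this model the second component of the left factor never enters the product, which is exactly why two distinct pairs $(U_1, V_1)$ and $(U_1, V_1')$ cannot be separated by right multiplication. The diagonal pairs $(u, u)$ are closed under `mul`, mirroring the subsemigroup $M_0$.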
First, let us remark that in the ring $K\Psi^*$ one can build up a Fox calculus [70]¹⁵ and deduce, in particular, all the results reviewed on the first two pages of [30]. We omit the details of this translation of the fundamentals of the free differential calculus to the semigroup case. In view of this one can prove the following facts.
THEOREM 3.38. Let $\Theta$ be a variety of linear representations of semigroups given by bi-identities of the form $y \cdot u \equiv 0$, where the expressions of the elements $u \in K\Psi^*$ involve only $n$ elements (variables) of $X$. Then the equation $\Theta = \Theta_1 \Theta_2 \cdots \Theta_m$ with $m > n$ is not possible for any varieties of pairs $\Theta_1, \Theta_2, \dots, \Theta_m$.
From this we obtain at once the following
COROLLARY 3.39. If, under the conditions and in the notation of the previous theorem, we impose in addition $n = 1$, then the variety $\Theta$ is indecomposable.
PROOF. The proof of Theorem 3.38 runs parallel to the corresponding proof in the group case. Only the proof of Lemma 1 on pp. 1209-1210 in [8], where an expression of a certain special form is derived for the element $u$, has to be altered slightly.
¹⁵ Translators' Remark. For the work of Ralph Fox (1913-1973), see [90].
But this expression of the element $u$ exists also in the ring $K\Psi^*$; it suffices to take into account the relations
\[ x_i x_j \equiv x_j x_i \pmod{\Delta^2} \quad \text{and} \quad x_i^t - 1 \equiv t(x_i - 1) \pmod{\Delta^2}, \]
which are fulfilled for arbitrary $x_i, x_j \in X$ and any natural number $t$. Further details are omitted.
3.2.6. The theorem on generating representations of algebras
1. The facts on linear semigroup pairs, in the form in which they were presented in the previous section, have analogues also for representations of algebras. The degree of parallelism with the semigroup case is high here, and all statements and their verifications can in practice be carried over word for word to the algebra case. Therefore we limit ourselves to formulating the results and making remarks. A central role is again played by the triangular product construction; for representations of algebras this construction was introduced in Section 3.1.3. As before, throughout this section $K$ will be a fixed field and all algebras considered will be associative $K$-algebras.
2. A variety of representations of associative algebras is, by definition, a class of pairs $(G, \mathcal{G})$, where $\mathcal{G}$ is an algebra and $G$ a $K$- and $\mathcal{G}$-module, satisfying a condition of saturation, this class being closed with respect to Cartesian products, subpairs and homomorphic images. In the case of algebras we have the following results.
PROPOSITION 3.40. For any subpairs $(A', S_1')$ and $(B', S_2')$ in $(A, S_1)$ and $(B, S_2)$ respectively, the pair $(A', S_1') \triangledown (B', S_2')$ belongs to the variety $\mathrm{Var}((A, S_1) \triangledown (B, S_2))$.
Let $\Theta$ be a variety of representations of algebras and $U$ the special ideal corresponding to it in $F = K\Psi^*$, the free algebra of countable rank. The regular pair $(F/U, F)$ is a cyclic and free pair in the variety $\Theta$, and $\Theta = \mathrm{Var}(F/U, F)$.
PROPOSITION 3.41. Let $(A, S)$ be an arbitrary pair, and $(R, F)$ a free pair in the variety $\Theta_2$. Then $\mathrm{Var}((A, S) \triangledown (R, F)) = \mathrm{Var}(A, S) \cdot \Theta_2$.
In complete analogy to the semigroup case we can define a multiplication of varieties of representations of algebras and the semigroup $\mathcal{A}(K)$. Similarly to Theorem 3.30 we prove Theorem 3.42.
THEOREM 3.42.
The semigroup of varieties of representations of algebras is a semigroup with cancellation.
If we take into account that in the subalgebra $\Phi = \mathrm{Hom}_K(B, A) \subset \mathrm{End}_K(A \oplus B)$, arising in the definition of the pair $(A, S_1) \triangledown (B, S_2)$, the multiplication is zero, then the necessary reasoning of Sections 3.2.3 and 3.2.4 can easily be carried over to representations of algebras, so in the same way we prove
THEOREM 3.43 (Theorem on generators for algebras). Let $K_1$ and $K_2$ be two classes of representations of algebras. Then the following formula holds: $\mathrm{Var}\, K_1 \cdot \mathrm{Var}\, K_2 = \mathrm{Var}(K_1 \triangledown K_2)$.
From this basic result, similarly to the implication "Theorem 3.33 $\Rightarrow$ Theorem 3.35", one obtains
THEOREM 3.44. Each variety of representations of algebras can be uniquely decomposed into a finite product of indecomposable varieties of representations.
COROLLARY 3.45. The semigroup $\mathcal{A}(K)$ of varieties of representations of algebras is free.
3. We indicate some applications of these results. To this end, we first briefly describe the known connections between varieties of algebras and varieties of their representations, [41]. To each variety of representations $\Theta$ one associates the class $\omega^{-1}\Theta$ of algebras admitting faithful representations in $\Theta$; parallel to $\omega^{-1}\Theta$ we also use the notation $\overrightarrow{\Theta}$. It is immediate to verify that $\omega^{-1}\Theta$ is a variety of algebras. On the other hand, to each variety $N$ of algebras we associate the variety of representations $\omega N$, stipulating that $(G, \mathcal{G}) \in \omega N$ if the algebra $\mathcal{G}$, up to the kernel of the corresponding representation, belongs to $N$. It turns out that for any $N$ and $\Theta$ the relations $\omega(\omega^{-1}\Theta) = \Theta$ and $\omega^{-1}(\omega N) = N$ hold. From this follows the existence of a bijective correspondence between the varieties of algebras and the varieties of their representations. However, the set $\mathcal{K}(K)$ of all proper varieties of $K$-algebras is in one-to-one correspondence with the set $\mathcal{J}(K)$ of all (non-zero) $T$-ideals in the free algebra $F$, [83]. The usual multiplication of ideals in $\mathcal{J}(K)$, with respect to which this set is closed, induces on $\mathcal{K}(K)$ an associative multiplication of varieties of algebras, which we denote by the symbol "$\cdot$". Next, let $N_1$ and $N_2$ be the varieties of algebras corresponding to the $T$-ideals $U_1$ and $U_2$ respectively. Let us consider the variety of algebras $N = \omega^{-1}(\omega N_1 \cdot \omega N_2)$ and let $U$ be the $T$-ideal corresponding to it in $F$. One can prove that $U = U_2 \cdot U_1$. This means that there exists an anti-isomorphism between the semigroups $\mathcal{A}(K)$ and $\mathcal{J}(K)$.
Using the previous connections, one can easily deduce from Theorem 3.44 the following.
THEOREM 3.46. Every proper $T$-ideal can be written uniquely as a product of finitely many indecomposable $T$-ideals.
As a consequence of Theorem 3.44 we also obtain an interesting result of Bergman and Lewin (cf. [56, Theorem 7]).
THEOREM 3.47. The semigroup $\mathcal{J}(K)$ is free.
In addition to this, we obtain the following. The remarks in Paragraph 5 of Section 3.2.5 remain in force, so in the case of algebras considered here each $T$-ideal is evidently special, and by the same type of reasoning as in [8] one proves a theorem (in a variant for algebras) whose original form for groups can be found in [8], p. 1209. In order to formulate this result we give a definition. A family of elements $u_{\alpha} \in F$ is called a special basis of the ideal $U \subset F$ if $U$, as an ideal, is generated by all elements of the form $u_{\alpha}^{\eta} \in F$, that is, by the images of the elements $u_{\alpha}$ under all special endomorphisms $\eta$ of the algebra $F$. It turns out that if a special basis of a $T$-ideal $U$ can be written in terms of only $n$ variables in $X$, then the equality $U = U_1 \cdot U_2 \cdots U_m$, $m > n$, is not possible for any choice of $T$-ideals $U_1, \dots, U_m$. From this it follows, in particular, that a variety of algebras is indecomposable if it is defined using identities
in only one variable (from $X$). Using the Nagata-Higman theorem (cf., e.g., [83, Appendix C]), we deduce that there exist indecomposable varieties consisting of radical algebras.
4. In order to exhibit still another series of indecomposable varieties of algebras, let us first prove an auxiliary statement. We denote by $\mathrm{var}\, K$ the variety of algebras generated by the class of algebras $K$; at the same time we denote, as before, by $\mathrm{Var}\, K$ the variety of representations of algebras generated by a class $K$ of representations. Furthermore, if one adjoins a unit to the algebra $S$, one obtains an algebra $S^*$ and the regular representation $(S^*, S)$.
LEMMA 3.48. For any algebra $S$ and any faithful representation $(L, S)$ of it we have the formula
\[ \mathrm{var}\, S = \overrightarrow{\mathrm{Var}(L, S)}. \]
PROOF. Denote $\Theta_1 = \omega\, \mathrm{var}\, S$ and $\Theta_2 = \mathrm{Var}(L, S)$. It follows from the definition of $\Theta_1$ that $(L, S) \in \Theta_1$, which is sufficient for $\Theta_2 \subset \Theta_1$. Let us prove the converse inclusion. Using the existence, for any $c \in L$, of an isomorphism of pairs $(S^*/\mathrm{Ann}_S(c), S) \cong (c \cdot S^*, S)$ and Remak's theorem, we see that the regular pair $(S^*/\bigcap_{c \in L} \mathrm{Ann}_S(c), S)$ is contained in $\Theta_2$. But from the fact that $(L, S)$ is faithful it follows that $\bigcap_{c \in L} \mathrm{Ann}_S(c) = 0$, so that the pair $(S^*, S)$ lies in $\Theta_2$. Next, let $(A, T)$ be an arbitrary pair in the variety $\Theta_1$, and let $(A, T_1)$ be the corresponding faithful pair. It follows from $(A, T_1) \in \Theta_1$ that $T_1 \in \mathrm{var}\, S$ and, therefore, $T_1 \in QSC\, S$. Let us show that this implies $(T_1^*, T_1) \in \Theta_2$. In the case $T_1 = S^I$ (Cartesian power) the statement is easy to prove if we use the embedding $(S^I)^* \to (S^*)^I$ and the fact that $\Theta_2$ is closed with respect to subpairs and Cartesian products. If $T_1$ is a subalgebra of $S^I$, then the embedding $(T_1^*, T_1) \to ((S^I)^*, S^I)$ shows that $(T_1^*, T_1) \in \Theta_2$. Finally, let $T_1$ be an epimorphic image of the subalgebra $S_1 \subset S^I$. We have the epimorphism of pairs $(S_1^*, S_1) \to (T_1^*, T_1)$, so in view of $(S_1^*, S_1) \in \Theta_2$ it follows that $(T_1^*, T_1) \in \Theta_2$.
Thus we have proved that $T_1 \in \mathrm{var}\, S$ implies $(T_1^*, T_1) \in \Theta_2$. Now it is not hard to see that $(A, T_1) \in \Theta_2$. Indeed, for each $a \in A$ the cyclic subpair $(a \circ T_1^*, T_1)$ in $(A, T_1)$ is isomorphic to the pair $(T_1^*/\mathrm{Ann}_{T_1}(a), T_1)$, and so lies in $\Theta_2$. But if all cyclic subpairs of the pair $(A, T_1)$ belong to the variety $\Theta_2$, then $(A, T_1) \in \Theta_2$; for the proof one applies a variation of the argument in the proof of Lemma 3.27. As a result we get the inclusion $\Theta_1 \subset \Theta_2$, and along with it the equality $\omega(\mathrm{var}\, S) = \mathrm{Var}(L, S)$. Applying the operator $\omega^{-1}$ to both sides of this equation we are led to the formula of interest to us.
5. THEOREM 3.49. If the algebra $A$ is semi-simple (in the sense of Jacobson), then the variety $\mathrm{var}\, A$ is indecomposable.
PROOF. 1) Assume that the algebra $A$ is primitive. In this case there exists an irreducible representation $(G, A)$. The variety of representations generated by this pair will be denoted by $\Theta$. From the relation $A \in \omega^{-1}\Theta$ we deduce that $(A^*, A) \in \Theta$; cf. Lemma 3.48 for this type of proof. On the other hand, there exists an epimorphism of
pairs $(A^*, A) \to (G, A)$, from which it follows that $(G, A) \in \mathrm{Var}(A^*, A)$, hence also $\Theta = \mathrm{Var}(A^*, A)$. Lemma 3.48 now gives $\omega^{-1}\Theta = \mathrm{var}\, A$. Let us assume that $\mathrm{var}\, A = N_2 \cdot N_1$ and that to the variety $N_i$ there corresponds in $F$ the $T$-ideal $U_i$, $i = 1, 2$. Consider the variety $N = \omega^{-1}(\omega N_1 \cdot \omega N_2)$. In Paragraph 3 of this section it was stated that the $T$-ideal corresponding to this variety of algebras is $U_2 \cdot U_1$. But this $T$-ideal corresponds to the variety $N_2 \cdot N_1 = \mathrm{var}\, A$. Hence $N = \mathrm{var}\, A$. From this we deduce that $\Theta = \omega N_1 \cdot \omega N_2$, that is, the decomposability of the variety of representations $\mathrm{Var}(G, A)$. But this is a contradiction, as will be proved in the second half of the proof.
2) Let the algebra $A$ be semi-simple. Then it is a subdirect sum of primitive algebras:
\[ A = \sum_{i \in I}^{\mathrm{sd}} A_i, \qquad A_i \cong A/D_i, \qquad \bigcap_{i \in I} D_i = 0. \]
Furthermore, let $(G_i, A_i)$ be a faithful irreducible representation corresponding to the primitive summand $A_i$. Repeating the argument of the first part of the proof, we derive the equalities $\mathrm{Var}(G_i, A_i) = \mathrm{Var}(A_i^*, A_i)$, $i \in I$. However, the pairs $(A_i^*, A_i)$ are contained in the variety $\mathrm{Var}(A^*, A)$; this follows from the existence of an epimorphism $(A^*, A) \to (A_i^*, A_i)$. Moreover, it follows from Remak's theorem that $\mathrm{Var}(A^*, A) \subset \mathrm{Var}(\prod_{i \in I} (A_i^*, A_i))$. In view of the equalities $\mathrm{Var}(A_i^*, A_i) = \mathrm{Var}(G_i, A_i)$, $i \in I$, it follows that the variety $\mathrm{Var}(A^*, A)$ is generated by the irreducible pairs $(G_i, A_i)$.
Using these facts we show that $\mathrm{Var}(A^*, A)$ is indecomposable. Let us assume that $\mathrm{Var}(A^*, A) = \Theta_1 \cdot \Theta_2$. Introduce the notation $\Omega = \bigcup_{i \in I} (G_i, A_i)$ and $\Omega_l = \Theta_l \cap \Omega$, $l = 1, 2$. It turns out that $\Omega = \Omega_1 \cup \Omega_2$. Evidently, we have only to comment on the inclusion $\Omega \subset \Omega_1 \cup \Omega_2$. If a pair $(G_i, A_i)$ is not contained in $\Omega_1$, then it is not contained in $\Theta_1$ either. But at the same time, it follows from $(G_i, A_i) \in \Theta_1 \cdot \Theta_2$ that there exists a subpair $(H_i, A_i)$ in $(G_i, A_i)$ such that $(H_i, A_i) \in \Theta_1$ and $(G_i/H_i, A_i) \in \Theta_2$. As $(G_i, A_i)$ is irreducible, it follows, however, that either $H_i = 0$ or $H_i = G_i$. In the second case $(G_i, A_i) \in \Theta_1$, which is excluded. Hence $H_i = 0$ and so $(G_i, A_i) \in \Theta_2$, hence also $(G_i, A_i) \in \Omega_2$. The inclusion $\Omega \subset \Omega_1 \cup \Omega_2$ is established.
From the equality $\Omega = \Omega_1 \cup \Omega_2$ and the relation $\mathrm{Var}(A^*, A) \subset \mathrm{Var}\, \Omega$ it follows that $\Theta_1 \cdot \Theta_2 = \mathrm{Var}\, \Omega_1 \cdot \mathrm{Var}\, \Omega_2$. As the semigroup of varieties of representations is free, this shows that $\Theta_l = \mathrm{Var}\, \Omega_l$, $l = 1, 2$. Furthermore, from the definitions we deduce that $\Theta_1 \cdot \Theta_2 \subset \mathrm{Var}(\Theta_1 \cup \Theta_2) \subset \Theta_2 \cdot \Theta_1$. If the varieties $\Theta_1$ and $\Theta_2$ are incident (that is, one of them contains the other), the preceding relation gives a contradiction. Indeed, if, for example, $\Theta_1 \subset \Theta_2$, then $\Omega \subset \Theta_2$ and so $\Theta_1 \cdot \Theta_2 = \Theta_2$, which is a contradiction.
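Therefore, in order to complete the proof of the theorem it suffices to show that $\Theta_1$ and $\Theta_2$ are incident. We argue by contradiction and choose arbitrary $(G_{i'}, A_{i'}) \in \Omega_1 \setminus \Theta_2$ and $(G_{i''}, A_{i''}) \in \Omega_2 \setminus \Theta_1$; here $i', i'' \in I$. In the triangular product $(G, \mathcal{G}) = (G_{i'}, A_{i'}) \triangledown (G_{i''}, A_{i''})$ we take the verbal with respect to $\Theta_1$. As $(G, \mathcal{G}) \in \Theta_1 \cdot \Theta_2 \subset \Theta_2 \cdot \Theta_1$, we have $({}^{*}\Theta_1(G, \mathcal{G}), \mathcal{G}) \in \Theta_2$. The irreducibility of all pairs in $\Omega$ implies that the only $\mathcal{G}$-submodules in $G$ are $0$, $G_{i'}$ and $G$. If now ${}^{*}\Theta_1(G, \mathcal{G})$ is $0$ or $G_{i'}$, then, together with the pair $(G, \mathcal{G})$ or the pair $(G/G_{i'}, \mathcal{G}) \cong (G_{i''}, \mathcal{G})$ respectively, the pair $(G_{i''}, A_{i''})$ also lies in $\Theta_1$, which was excluded by the choice. If ${}^{*}\Theta_1(G, \mathcal{G}) = G$, then, likewise, $(G_{i'}, A_{i'}) \in \Theta_2$, which was also excluded. The statement is proved.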
Let us note that the reasoning given works also in the case $|I| = 1$, guaranteeing the indecomposability of $\mathrm{Var}(A^*, A)$ in the first part of the proof. This proves the theorem.
6. For associative $K$-algebras one can introduce an operation of wreath product. Namely, for the algebras $A$ and $B$ we consider the $K$-module $\mathrm{Hom}_K(B^*, A^*)$ as an algebra with zero multiplication and set
\[ A \,\mathrm{wr}\, B \overset{\mathrm{def}}{=} \mathrm{Hom}_K(B^*, A^*) \leftthreetimes (A \oplus B); \]
we call this algebra $A \,\mathrm{wr}\, B$ the wreath product of the algebras $A$ and $B$. The operation of the wreath product of algebras permits us to make explicit a generating algebra of the product of varieties of algebras. Indeed, let $\mathcal{A} = \mathrm{var}\, A$ and $\mathcal{B} = \mathrm{var}\, B$. Using Lemma 3.48 and the theorem on generating representations of algebras, we obtain
\[ \omega(\mathcal{B} \cdot \mathcal{A}) = \omega\mathcal{A} \cdot \omega\mathcal{B} = \mathrm{Var}(A^*, A) \cdot \mathrm{Var}(B^*, B) = \mathrm{Var}((A^*, A) \triangledown (B^*, B)) = \mathrm{Var}(A^* \oplus B^*, A \,\mathrm{wr}\, B) = \mathrm{Var}((A \,\mathrm{wr}\, B)^*, A \,\mathrm{wr}\, B) = \omega\, \mathrm{var}(A \,\mathrm{wr}\, B). \]
Let us add that in this computation we used the equation $\mathrm{Var}(A^* \oplus B^*, A \,\mathrm{wr}\, B) = \mathrm{Var}((A \,\mathrm{wr}\, B)^*, A \,\mathrm{wr}\, B)$, the verification of which is immediate on the basis of the properties of the corresponding generating pairs. These computations prove the following
THEOREM 3.50. For any two algebras $A$ and $B$ the following formula holds: $(\mathrm{var}\, B) \cdot (\mathrm{var}\, A) = \mathrm{var}(A \,\mathrm{wr}\, B)$.
Finally, let us indicate yet another application of the wreath product of algebras. A $T$-ideal is called finitary if the variety of algebras defined by it is generated by a finite dimensional algebra.
THEOREM 3.51. The product of finitely many $T$-ideals in $F$ is finitary if and only if all the factors are finitary.
PROOF. It is clearly sufficient to prove the theorem for two $T$-ideals $U_1$ and $U_2$. Thus, let the $T$-ideals $U_i$ be finitary, and let the varieties $N_i$ defined by them be generated by the finite dimensional algebras $A_i$, $i = 1, 2$. The product $U_1 \cdot U_2$ is a $T$-ideal defining the variety $N_1 \cdot N_2$. According to Theorem 3.50 the variety $N_1 \cdot N_2$ is generated by the algebra $A_2 \,\mathrm{wr}\, A_1 = \mathrm{Hom}(A_1^*, A_2^*) \leftthreetimes (A_2 \oplus A_1)$, which clearly is finite dimensional. Thus the finitarity of $U_1 \cdot U_2$ is established.
Conversely, assume that the $T$-ideal $U_1 U_2$ is finitary. Then the variety $N_1 \cdot N_2$ defined by it is generated by some finite dimensional algebra $\mathcal{G}$, $N_1 \cdot N_2 = \mathrm{var}\, \mathcal{G}$. Let us then consider the regular pair $(\mathcal{G}^*, \mathcal{G})$, which in view of Lemma 3.48 generates the variety $\omega N_2 \cdot \omega N_1$. Take in $\mathcal{G}^*$ a right ideal $A$ such that $(A, \mathcal{G}) \in \omega N_2$ and $(\mathcal{G}^*/A, \mathcal{G}) \in \omega N_1$.
According to Propositions 3.17 and 3.40 we have $(\mathcal{G}^*, \mathcal{G}) \in \mathrm{Var}((A, \mathcal{G}) \triangledown (\mathcal{G}^*/A, \mathcal{G}))$. However, using the theorem on generating representations, we have $\mathrm{Var}((A, \mathcal{G}) \triangledown (\mathcal{G}^*/A, \mathcal{G})) \subset \omega N_2 \cdot \omega N_1$, from which it follows that
\[ \omega N_2 \cdot \omega N_1 = \mathrm{Var}(\mathcal{G}^*, \mathcal{G}) \subset \mathrm{Var}(A, \mathcal{G}) \cdot \mathrm{Var}(\mathcal{G}^*/A, \mathcal{G}) \subset \omega N_2 \cdot \omega N_1 . \]
Thus we have proved the equality $\mathrm{Var}(A, \mathcal{G}) \cdot \mathrm{Var}(\mathcal{G}^*/A, \mathcal{G}) = \omega N_2 \cdot \omega N_1$, where $\mathrm{Var}(A, \mathcal{G}) \subset \omega N_2$ and $\mathrm{Var}(\mathcal{G}^*/A, \mathcal{G}) \subset \omega N_1$. By the Corollary to Theorem 3.44 it follows from this that $\omega N_2 = \mathrm{Var}(A, \mathcal{G})$ and $\omega N_1 = \mathrm{Var}(\mathcal{G}^*/A, \mathcal{G})$. Thus, the varieties $\omega N_1$ and $\omega N_2$ are generated by finite dimensional pairs. Let $(C_1, H_1)$ be a finite dimensional faithful pair generating the variety of representations $\omega N_1$. Then the algebra $H_1$ is a finite dimensional $K$-algebra, and it is not hard to see that $N_1 = \mathrm{var}\, H_1$. Thus, the ideal $U_1$ is finitary. In an analogous manner one shows that the $T$-ideal $U_2$ is finitary.
7. Let $T = T(n)$ be the algebra of upper triangular matrices of order $n$ over the field $K$. The natural representation $(L, T)$ of this algebra is faithful and there is the isomorphism of representations
\[ (L, T) \cong \underbrace{(K, K) \triangledown \cdots \triangledown (K, K)}_{n \text{ times}} . \]
According to Theorem 3.43 we get from this the equation $\mathrm{Var}(L, T) = (\mathrm{Var}(K, K))^n$. Moreover, we remark that the variety of algebras $\mathrm{var}\, T$ and the variety of representations of algebras $\mathrm{Var}(L, T)$ correspond in the free algebra $F$ to one and the same $T$-ideal $T_n$; the proof consists in unwrapping the definitions. This remark and the anti-isomorphism of the semigroup of varieties of representations with the semigroup of $T$-ideals of the algebra $F$ allow us to rewrite the above equation as $T_n = T_1^n$, $T_1$ being the ideal of identities of the algebra $K$. Thus we have proved the following.
THEOREM 3.52. The ideal of identities of the algebra of upper triangular matrices of order $n$ over the field $K$ coincides with $T_1^n$, $T_1$ being the ideal of identities of the algebra $K$.
For $\mathrm{char}\, K = 0$ this theorem coincides with the result of Yu. N. Mal'cev (1971) stating that the ideal $T_n$ is generated by the polynomial $[x_1, x_2][x_3, x_4] \cdots [x_{2n-1}, x_{2n}]$, where we have written $[x, y] = xy - yx$. In the case $\mathrm{char}\, K > 0$ this constitutes an answer to Problem 109 in [6].
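The polynomial $[x_1, x_2][x_3, x_4] \cdots [x_{2n-1}, x_{2n}]$ mentioned above can be tested numerically. The following is a minimal sketch, not part of the original text: it uses integer matrices as a stand-in for matrices over $K$, and all function names (`commutator`, `product_of_commutators`, `witness`, and so on) are hypothetical. It checks two things only: that a product of $n$ commutators vanishes on random upper triangular matrices of order $n$, and that a product of $n - 1$ commutators need not vanish. It does not verify that the polynomial generates the ideal $T_n$.

```python
import random


def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def unit(n, i, j):
    """Matrix unit E_ij (0-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def random_upper_triangular(n, rng, bound=9):
    """Random integer upper triangular matrix (zeros below the diagonal)."""
    return [[rng.randint(-bound, bound) if j >= i else 0 for j in range(n)]
            for i in range(n)]


def product_of_commutators(n, mats):
    """[m0, m1][m2, m3]... for an even-length list of n x n matrices."""
    result = identity(n)
    for k in range(0, len(mats), 2):
        result = matmul(result, commutator(mats[k], mats[k + 1]))
    return result


def is_zero(m):
    return all(entry == 0 for row in m for entry in row)


def witness(n):
    """2(n-1) upper triangular matrices whose n-1 commutators multiply to E_1n.

    Uses [E_kk, E_k,k+1] = E_k,k+1, so the product telescopes to E_1n.
    """
    mats = []
    for k in range(n - 1):
        mats += [unit(n, k, k), unit(n, k, k + 1)]
    return mats


if __name__ == "__main__":
    rng = random.Random(0)
    for n in range(1, 6):
        mats = [random_upper_triangular(n, rng) for _ in range(2 * n)]
        print(n, is_zero(product_of_commutators(n, mats)))
```

The reason behind the first check is elementary: the commutator of two upper triangular matrices has zero diagonal, and a product of $n$ strictly upper triangular matrices of order $n$ is zero.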
3.2.7. Comments
1. The notion of automaton as an object of mathematical inquiry arose long ago and simultaneously in many papers; cf. [31]. The connections of this circle of questions in Theoretical Computer Science with Algebra (as indicated by V. M. Glushkov [9]) led to the proof of the basic theorems on decomposition of automata; a systematic study of these questions and their connection with linear systems was given in [18]. The view of automata as three-sorted systems and their connections with pairs, as in the work of B. I. Plotkin and others in [42], has led to a systematic application in automata theory of the techniques already available in the theory of representations. In particular, the $\triangledown$-product introduced in Section 3.1 for semigroups leads to the construction of the $\triangledown$-product of Moore automata and to a proof of theorems analogous to the theorems of Kaluzhnin-Krasner and Krohn-Rhodes (B. I. Plotkin, unpublished). In problems of classification of linear automata, as indicated in the present chapter, the bijection between the varieties of linear automata and the ideal pairs in the algebra $F$ proves to be useful. Here this bijection is used for the study of the multiplicative properties of the set of varieties of linear automata.
2. The subject of this chapter is related to the theme of unique factorization in rings and semigroups. Unique factorization in the ring of integers is beautiful and useful, and its properties have been known for a long time, but already in some rings of algebraic integers it is not easy to establish this property. The study of the rings of linear differential operators (Edmund Landau, 1902) led to the question of the unique decomposition of elements in certain non-commutative rings. Much attention has been devoted to unique factorization in semigroups, as unique factorization in a ring is a property of its multiplicative semigroup; cf. the paper [62] and the book of P. Cohn [5].
Theorems 3.35 and 3.46 proved in this chapter are natural developments of the theme indicated, parallel to Theorem 23.4 in [34] and to the main theorem in [43].
3. It is clear that the approach described in Paragraph 7 of Section 3.2.6 allows us to reduce the search for a basis of the identities of the algebra $T$ of upper block-triangular matrices to the corresponding case of the diagonal blocks. A definitive answer can be obtained in the case when the sizes of the diagonal blocks of the matrices from $T$ do not exceed two, while the field $K$ is either finite or has characteristic 0, because thanks to work by Yu. N. Mal'cev, E. N. Kuzmin and Yu. P. Razmyslov one knows a basis for the identities of square matrices of order two over such fields.
4. In the theory of varieties of algebras one usually encounters another multiplication. Let us denote by $T(\mathcal{A})$ the $T$-ideal defining a variety $\mathcal{A}$ of (associative) $K$-algebras. Then for any two varieties of algebras $\mathcal{A}_1$ and $\mathcal{A}_2$, their product $\mathcal{A}_1 * \mathcal{A}_2$ is the variety of algebras defined by the $T$-ideal $T(\mathcal{A}_1) * T(\mathcal{A}_2)$ generated by the set
\[ \{ f(g_1, \dots, g_n) \mid f(x_1, \dots, x_n) \in T(\mathcal{A}_1),\ g_1, \dots, g_n \in T(\mathcal{A}_2) \} \subset F. \]
It is possible that an improvement of the approach of this Section 3.2, together with the use of the notion of the free Menger system¹⁶, would allow progress in the description of the "*-structure" arising here (cf. Problems 18 and 25 in [6]).
¹⁶ Translators' note. Cf. Jaak Henno. Free G-commutative Menger systems. In: Mathematics and Theoretical Mechanics, VIII, Proc. Estonian Acad. Sci., Phys. Math., 373, 1975, 19-26.
5. Our treatment of $M(K)$ and $L(K)$ as locally finite partially ordered sets, together with the information on indecomposable elements in these semigroups, makes it possible to apply the ideas of [98] with advantage and to develop the analytic side of the question in the spirit of the book [86].
3.3. Powers of the fundamental ideal and stability of representations of groups and semigroups
Let $\mathbb{Z}\Gamma$ be the integral group ring of a group $\Gamma$. The fundamental ideal $\Delta$ in the ring $\mathbb{Z}\Gamma$ is the kernel of the homomorphism $\mathbb{Z}\Gamma \to \mathbb{Z}$; in other words, $\Delta$ is the set of all finite sums $\sum_i n_i \gamma_i$, where $n_i \in \mathbb{Z}$, $\gamma_i \in \Gamma$, such that $\sum_i n_i = 0$. Powers of $\Delta$ are defined inductively, that is, $\Delta^{\nu} = \Delta^{\nu - 1} \cdot \Delta$ for a non-limit ordinal $\nu$ and $\Delta^{\nu} = \bigcap_{\mu < \nu} \Delta^{\mu}$ [...] $G_{\omega + n} = 0$; if, in addition, it is required that $A$ is a vector space over a field of characteristic $p$ and that $Q$ is a finite $q$-group, then it is also true that $G \circ \Delta_{\Gamma}^{\omega} = A$.
PROOF. The proof of the Theorem is given in several steps.
(1) We show that $A \subset G_1$. We remark that $[B, \Phi] \subset G_1$ and $[b, \varphi] = -b + b \circ \varphi = -b + (b + b^{\varphi}) = b^{\varphi}$ for all $b \in B$ and $\varphi \in \Phi$. Take as the element $b$ a basis element of the free Abelian group $B$. Then for each $a \in A$ the map $b \mapsto a$ can be extended to a homomorphism $\varphi$ of $B$ to $A$. Hence, we have $A \subset [B, \Phi] \subset G_1$.
(2) We show that $A \subset G_2$. The group $B_1$, being a subgroup of the free Abelian group $B$, is likewise free Abelian. The same reasoning as in (1) shows that $A \subset [B_1, \mathrm{Hom}(B_1, A)]$. The factor group $B/B_1$ is free by assumption, and so the subgroup $B_1$ is a direct summand of $B$, $B = T \oplus B_1$. Hence $\Phi = \mathrm{Hom}(T, A) \oplus \mathrm{Hom}(B_1, A)$, so $\mathrm{Hom}(B_1, A) \subset \Phi$. It follows that $A \subset [B_1, \mathrm{Hom}(B_1, A)] \subset [B_1, \Phi] \subset [G_1, \Gamma] = G_2$.
(3) The inclusion $A \subset G_k$ holds for all $k \geq 3$. Indeed, let $b_1$ be a generating element of the free Abelian group $B_1$. For any $a \in A$ there is a $\varphi \in \mathrm{Hom}(B_1, A)$ such that $b_1^{\varphi} = a$. Moreover, there exists a number $m$ such that $b \overset{\mathrm{def}}{=} q^m b_1 \in B_{k-1}$, because $B_1/B_{k-1}$ is (by assumption) a $q$-group. Therefore
\[ b^{\varphi} = (q^m b_1)^{\varphi} = q^m b_1^{\varphi} = q^m a. \]
In view of $p \neq q$, when the element $a$ runs through the whole group $A$, the element $x = q^m a$ also runs through the whole group $A$. We saw above that for any such element $x \in A$
¹⁹ The meaning of this expression is revealed in Proposition 3.63.
there exists ϕ ∈ Hom(B1 , A) such that bϕ = x. Hence, for any x ∈ A, x = q m a, we have x = bϕ = −b+(b+bϕ) = −b+b◦ϕ = [b, ϕ] ∈ [Bk−1 , Hom(B1 , A)] ⊂ [Gk−1 , Γ] = Gk , which gives A ⊂ Gk . (4) Together with the obvious relation Bk ⊂ Gk , what is proved in (1)-(3) also gives A + Bk ⊂ Gk for all k. Let us show by induction over k that Gk = A + Bk , k = 0, 1, 2, . . . . For k = 0 we have trivially G0 = A + B0 . Let us assume that the equality Gs = A + Bs is true for all s ≤ k. In order to prove that Gk+1 = A + Bk+1 it suffices to check the validity of Gk+1 ⊂ A + Bk+1 . Take any a ∈ A, b ∈ Bk , ϕ ∈ Φ, σ ∈ Σ. Using the relations in Paragraph 2 of Section 3.1.3, we find [a + b, ϕσ] = −(a + b) + (a + b) ◦ ϕσ = −a − b + a ◦ ϕσ + b ◦ ϕσ = = −a − b + (a ◦ ϕ) ◦ σ + (b ◦ ϕ) ◦ σ = = −a + a ◦ σ − b + b ◦ σ + (bϕ ) ◦ σ = = [a, σ] + [b, σ] + [bϕ , σ] + bϕ ∈ A + Bk+1 . From this it follows that Gk+1 = [Gk , Γ] = [A + Bk , Γ] ⊂ A + Bk+1 , which completes the induction. (5) Next, we show that Gω = A. In view of the results (1)-(3) it is clear that it suffices to verify that Gω ⊂ A. Let x ∈ Gω . Then x = a1 + b1 where a1 ∈ A, b1 ∈ B1 . On the other hand, in view of (4), for every k > 1 there exist ak ∈ A, bk ∈ Bk such that x = ak + bk . We have a1 − ak = bk − b1 ∈ A ∩ B = 0, whence we obtain b1 = bk ∈ Bk . We conclude that b1 ∈ ∩k Bk . But Bω = 0 by assumption. So b1 = 0, and, by the same token, x = a ∈ A. (6) For any a ∈ A, γ ∈ Γ, γ = ϕσ we compute [a, γ] = −a + (a ◦ ϕ) ◦ σ = −a + a ◦ σ = [a, σ] ∈ A1 . This computation shows that [A, Γ] = A1 . Hence, we have Gω+1 = [Gω , Γ] = [A, Γ] = A1 . By induction over k we conclude that Gω+k = Ak for all k ≥ 1. In particular, Gω+n = An = 0 and Gω+n−1 = An−1 = 0. This concludes the proof of the first statement of the theorem. Let us pass to the proof of the second statement of the theorem. Thus, below we assume that A is a vector space over a field of characteristic p and that Q is a finite qgroup. In particular, Φ = Hom(B, A) is a p-group. 
Let Φ_1 be the commutator of the subgroups Φ and Q, and Φ_2 the commutator of the subgroups Φ_1 and Q in Γ; setting Φ̄ = Φ/Φ_2, it is clear that Φ̄ is a p-group.

(7) One has the equality Φ_1 = Φ_2. To prove this, let us remark that conjugation in the semidirect product Γ = Φ ⋋ Σ induces an action of Σ on Φ which is 2-stable by construction; for this, cf. [70, p. 2] and Paragraph 2 of Section 3.1.3. We make the following observation. For any 2-stable pair (M, Σ) of Z-modules, where M is Abelian and Σ is a q-group, we consider the commutator [M, Σ], that is, the Z-module generated by the commutators [a, σ] = −a + a ∘ σ, a ∈ M, σ ∈ Σ. As a consequence of the 2-stability of (M, Σ) we derive from a ∘ σ = a + [a, σ] that
\[ a\circ\sigma^{2} = (a + [a,\sigma])\circ\sigma = a\circ\sigma + [a,\sigma] = a + 2[a,\sigma], \ \dots \]
and (by induction over k) a ∘ σ^k = a + k[a, σ]. For some n = n(σ) we have σ^{q^n} = ε, because Σ is a q-group. Hence a = a ∘ ε = a ∘ σ^{q^n} = a + q^n[a, σ], whence q^n[a, σ] = 0. Therefore [M, Σ] is a q-group.

When applied to the pair (Φ̄, Q) this observation shows that Φ_1/Φ_2 is a q-group. However, at the same time Φ_1/Φ_2 must be a p-group, as it is a subgroup of Φ̄. We conclude that Φ_1 ⊂ Φ_2. As Φ_2 ⊂ Φ_1 is evident, the equality Φ_1 = Φ_2 is proved.

(8) We present some auxiliary computations for the pair (G, Γ), cf. Paragraph 2 of Section 3.1.3. For any b ∈ B, ϕ ∈ Φ, σ ∈ Q we have
\[ b = b\circ\varphi\varphi^{-1} = (b + b^{\varphi})\circ\varphi^{-1} = b\circ\varphi^{-1} + b^{\varphi}, \]
whence b ∘ ϕ^{−1} = b − b^ϕ. Furthermore
\[
\begin{aligned}
b\circ[\varphi,\sigma] &= (b - b^{\varphi})\circ(\sigma^{-1}\varphi\sigma) = (b\circ\sigma^{-1} - b^{\varphi})\circ(\varphi\sigma) \\
&= \bigl(b\circ\sigma^{-1} + (b\circ\sigma^{-1})^{\varphi} - b^{\varphi}\bigr)\circ\sigma = b + (b\circ\sigma^{-1})^{\varphi} - b^{\varphi}.
\end{aligned}
\]
Take now arbitrary b ∈ B and ϕ_1 ∈ Φ_1. There exist ϕ ∈ Φ and σ ∈ Q such that ϕ_1 = [ϕ, σ], and so we find
\[ [b,\varphi_1] = -b + b\circ[\varphi,\sigma] = (-b + b\circ\sigma^{-1})^{\varphi} = [b,\sigma^{-1}]^{\varphi}. \]
These computations show that [B, Φ_1] ⊂ [B, Q]^Φ. But the group [B, Q] ⊂ B is free, and so [B, Q]^Φ = A. Thus the module A contains [B, Φ_1].

(9) Let us show that A ⊂ [B, Φ_1]. This was shown in step (2) of this proof in the case Φ_1 = Φ. Let Φ_1 < Φ. Using the classical theorem of Maschke ([19, p. 182]) for the pair (Φ, Q), we obtain the existence of a Q-invariant decomposition Φ = Φ_0 ⊕ Φ_1. Moreover, (Φ_0, Q) ∈ S. Indeed, for any ϕ ∈ Φ_0 and σ ∈ Q we have −ϕ + ϕ ∘ σ ∈ Φ_1, and also −ϕ + ϕ ∘ σ ∈ Φ_0, as Φ_0 is Q-invariant. From Φ_1 ∩ Φ_0 = 0 we conclude that ϕ ∘ σ = ϕ. Consider the pair (B, Q) and let B* be the set of all Q-fixed points of B. We check now that Ann_Φ B* = Φ_1. On one hand, in view of the definition of the action of Φ in B (Subsection 3.1.1.2) we must have b ∘ ϕ_1 = b + b^{ϕ_1} for any b ∈ B* and ϕ_1 ∈ Φ_1. The elements ϕ_1 = [ϕ, σ], where ϕ ∈ Φ and σ ∈ Q, generate Φ_1, and so in view of the computations in step (8) we have
\[ b\circ\varphi_1 = b\circ[\varphi,\sigma] = b + (b\circ\sigma^{-1})^{\varphi} - b^{\varphi} = b + b^{\varphi} - b^{\varphi} = b, \]
because b ∘ σ^{−1} = b.
We come to the equalities b + b^{ϕ_1} = b ∘ ϕ_1 = b, from which it follows that b^{ϕ_1} = 0. Hence ϕ_1 ∈ Ann_Φ B*. So we have verified that Φ_1 ⊂ Ann_Φ B*.

On the other hand, an immediate verification shows that for any b ∈ B the element
\[ \bar b = \sum_{\sigma\in Q} b\circ\sigma^{-1} \]
is Q-invariant. Moreover, an arbitrary ϕ ∈ Ann_Φ B*, like every other element of Φ = Φ_0 ⊕ Φ_1, can be written in the form ϕ = ϕ_0 + ϕ_1, where ϕ_i ∈ Φ_i, i = 0, 1. Taking now into account the relations Φ_1 ⊂ Ann_Φ B* and (Φ_0, Q) ∈ S obtained before, along with the formula
\[ b^{\psi\circ\sigma} = (b\circ\sigma^{-1})^{\psi} \qquad \forall\, b\in B,\ \psi\in\Phi,\ \sigma\in Q \]
in Paragraph 2 of Section 3.1.3, we obtain
\[
0 = \bar b^{\varphi} = \bar b^{\varphi_0+\varphi_1} = \bar b^{\varphi_0} + \bar b^{\varphi_1} = \bar b^{\varphi_0}
= \Bigl(\sum_{\sigma\in Q} b\circ\sigma^{-1}\Bigr)^{\varphi_0} = \sum_{\sigma\in Q} b^{\varphi_0\circ\sigma} = \sum_{\sigma\in Q} b^{\varphi_0} = |Q|\cdot b^{\varphi_0}.
\]
From this it follows that b^{ϕ_0} = 0, because b^{ϕ_0} lies in the p-group A, while |Q| = q^k for some k ∈ N, and this is true for all b ∈ B. Hence ϕ_0 = 0. This argument shows that Ann_Φ B* ⊂ Φ_1 and so we have Φ_1 = Ann_Φ B*.

(10) The subgroup B* is servant (that is, pure) in B. Indeed, if for some b ∈ B and n ∈ Z the element nb is contained in B*, then for any σ ∈ Q we have nb = n(b ∘ σ), hence n(b − b ∘ σ) = 0. For a free Abelian group B this is possible only if b ∘ σ = b. Hence b ∈ B* and we have established that B* is servant in B. We add, however, that this fact follows here from condition b), according to which there exists a subgroup B_* ≤ B such that B = B* ⊕ B_*. From this we conclude that Hom(B, A) = Hom(B*, A) ⊕ Hom(B_*, A). We remark that for each a ∈ A and a basis element b_* ∈ B_* the map b_* → a extends to a Z-homomorphism ϕ_* : B_* → A. We obtain from this A ⊂ [B_*, Hom(B_*, A)]. The equality Φ_1 = Ann_Φ B* proved above, along with the obvious relation Hom(B_*, A) ⊂ Φ_1, shows, however, that A ⊂ [B, Φ_1].

(11) Using the relation Φ_1 = Φ_2 we see that Ψ = Φ_1 Q is a subgroup of the group Γ and that Φ_1 ⊲ Ψ. We shall find the ideal Δ_Ψ^ω in the ring ZΨ. For any subgroup Ψ* ≤ Ψ we denote by ω̃_{Ψ*} the right ideal in ZΨ generated by all ψ − 1 where ψ ∈ Ψ*. In the same way as in the proof of Proposition 3.70, one can prove that ω̃_{Φ_1} ⊂ Δ_Ψ^ω. Let us prove the converse inclusion. First, we remark that Ψ/Φ_1 is a finite q-group so that Δ_{Ψ/Φ_1}^ω = 0. But
\[ \Delta^{\omega}_{\Psi/\Phi_1} = \bigcap_{n=1}^{\infty} (\Delta^{n}_{\Psi} + \tilde\omega_{\Phi_1})/\tilde\omega_{\Phi_1}, \]
which gives
\[ \tilde\omega_{\Phi_1} \supset \bigcap_{n=1}^{\infty} (\Delta^{n}_{\Psi} + \tilde\omega_{\Phi_1}) \supset \bigcap_{n=1}^{\infty} \Delta^{n}_{\Psi} = \Delta^{\omega}_{\Psi}. \]
Thus we have the equality Δ_Ψ^ω = ω̃_{Φ_1}.

(12) By what was set out above, it follows that
\[ G_\omega = A = [B,\Phi_1] \subset B\circ\tilde\omega_{\Phi_1} = B\circ\Delta^{\omega}_{\Psi} \subset B\circ\Delta^{\omega}_{\Gamma} \subset G\circ\Delta^{\omega}_{\Gamma}. \]
From this we obtain A = G_ω = G ∘ Δ_Γ^ω, because the inclusion converse to G_ω ⊂ G ∘ Δ_Γ^ω is always true, cf. Subsection 3.3.1.1.
3. Let us pass to estimating the terminal of the group Γ introduced in the previous Subsection; to this end we assume that the auxiliary requirements indicated in the statement of Theorem 3.71 are fulfilled. Assume that Δ_Γ^{ω+n−1} = Δ_Γ^{ω+n}; then we have
\[
0 \ne G_{\omega+n-1} \subset G_{\omega}\circ\Delta_{\Gamma}^{n-1} = (G\circ\Delta_{\Gamma}^{\omega})\circ\Delta_{\Gamma}^{n-1}
= G\circ(\Delta_{\Gamma}^{\omega}\cdot\Delta_{\Gamma}^{n-1}) = G\circ\Delta_{\Gamma}^{\omega+n-1} = G\circ\Delta_{\Gamma}^{\omega+n} \subset G_{\omega+n} = 0.
\]
This contradiction proves that Δ_Γ^{ω+n−1} ≠ Δ_Γ^{ω+n}. A quantitative reformulation of the result of these computations gives the following.

THEOREM 3.72. The terminal of the group Γ, introduced in Subsection 3.3.2.2, is not less than ω + n.

4. We give an application of this result. Fix n ∈ N and let A be an n-dimensional vector space over Z_p, and let P = UT^1(n, Z_p) be the group of (n × n)-matrices with elements in Z_p with unit main diagonal and zeros under it (the unitriangular group). We denote by UT^r(n, Z_p) the subset of matrices in P with r − 1 zero diagonals above the main diagonal. In view of (12) in [19, p. 38] we have the relation
\[ [UT^{r}(n,\mathbb{Z}_p),\, UT^{1}(n,\mathbb{Z}_p)] = UT^{r+1}(n,\mathbb{Z}_p), \]
showing that the nilpotency class of P equals n − 1. Let us remark that, in a natural way, there appears the pair (A, P), which is faithful and has a stable series of length n. However, A does not have a P-stable series of length less than n, because otherwise by Kaluzhnin's theorem ([19, p. 144]) the nilpotency class of P would be less than n − 1. The pair (A, P) thus satisfies the requirement a) in Theorem 3.71; cf. the beginning of Subsection 3.3.2.2.

Furthermore, let us for B take the additive group of the integral group ring of a finite q-group Q, leading to the regular pair (B, Q); then B = B_0 = ZQ(+) and B_k = Δ_Q^k, k = 1, 2, . . . , while the Abelian group B/B* decomposes into a direct sum of cyclic subgroups, because together with B also B/B* is finitely generated. Hence, the servant subgroup B* is a direct summand of B ([22, p. 150]). It is easy to see that for such a group B the whole requirement b) in Theorem 3.71 is fulfilled. In [61, p. 277], one finds, writing exp(Q/[Q, Q]) = n, the simple fact that the additive group Δ_Q/Δ_Q^k has an exponent dividing n^k (and hence is a q-group), but this can also be proved by the argument in step (7) of the proof of Theorem 3.71, by applying it to the pairs
\[ (\Delta_Q^{k-2}/\Delta_Q^{k},\ Q/[Q,Q]), \qquad k = 2, 3, \dots, \]
and making an induction over k. Hence, the results of the two previous Subsections are also true for the pair (G, Γ) = (A, P)(B, Q) introduced here. We obtain the following result.

THEOREM 3.73. For each natural number n there exists a finite group such that in its integral group ring the (ω + n − 1)-th and (ω + n)-th powers of the fundamental ideals are distinct.

An example in [71, p. 223] shows that all values ω + n, n = 0, 1, 2, . . . , indeed appear as terminals of finite groups.
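The commutator relation for the unitriangular group quoted in item 4 can be spot-checked by brute force for small parameters. The following is only a sketch under stated assumptions: the values n = 4, p = 2 and all function names (`unitriangular`, `level`, `comm`, `elementary`) are chosen here for illustration and do not come from the text. It verifies the inclusion [UT^r, UT^1] ⊂ UT^{r+1} element by element and exhibits one non-trivial iterated commutator of weight n − 1; it does not prove the equality stated above.

```python
from itertools import product

N, P = 4, 2  # assumed small parameters, chosen only to keep the search tiny


def identity(n):
    return tuple(tuple(int(i == j) for j in range(n)) for i in range(n))


def mul(a, b, p=P):
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p
                       for j in range(n)) for i in range(n))


def inv(a, p=P):
    # The group is finite, so some power of a is its inverse.
    x = a
    while mul(x, a, p) != identity(len(a)):
        x = mul(x, a, p)
    return x


def comm(x, y):
    # Group commutator [x, y] = x^{-1} y^{-1} x y.
    return mul(mul(inv(x), inv(y)), mul(x, y))


def level(a):
    # Largest r with a in UT^r: the first r - 1 superdiagonals vanish.
    n = len(a)
    gaps = [j - i for i in range(n) for j in range(i + 1, n) if a[i][j]]
    return min(gaps) if gaps else n


def unitriangular(n=N, p=P):
    slots = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for values in product(range(p), repeat=len(slots)):
        m = [list(row) for row in identity(n)]
        for (i, j), v in zip(slots, values):
            m[i][j] = v
        yield tuple(map(tuple, m))


def elementary(i, j, n=N):
    # Identity matrix with an extra 1 in position (i, j), 0-based.
    m = [list(row) for row in identity(n)]
    m[i][j] = 1
    return tuple(map(tuple, m))


group = list(unitriangular())

# Every commutator [x, y] lies at least one level deeper than x.
inclusion_holds = all(level(comm(x, y)) >= min(level(x) + 1, N)
                      for x in group for y in group)

# An iterated commutator of weight N - 1 that is still non-trivial.
deep = elementary(0, 1)
for i in range(1, N - 1):
    deep = comm(deep, elementary(i, i + 1))
```

Here `inclusion_holds` bounds the nilpotency class from above by n − 1, and `deep` (the elementary matrix with its extra entry in the top right corner) shows that the bound is attained for this choice of n and p.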
5. Groups in which there exists an invariant nilpotent subgroup with a nilpotent factor group are usually called metanilpotent. We denote in this Subsection by Γ_n the n-th term of the lower central series of the group Γ. Along with Theorem 3.73 one may now formulate the following statement, which gives further properties of terminals of finite groups.

THEOREM 3.74. Let there be given a representation (G, Γ) of the group Γ with all metanilpotent factor groups torsion and the factor group Γ/∩_n Γ_n nilpotent, by automorphisms of the Z-module G whose torsion part B is Γ-Artinian, while the factor module G/B is Γ-Noetherian. If G has a Γ-stable descending series of length ≤ ωn (n ∈ N) which reaches zero, then the lower stable series of the pair (G, Γ) stabilizes to zero at a term of index < ω2.

PROOF. The statement of the theorem is obvious for n = 1, because the terms of the lower stable series of the pair (G, Γ) are contained in the corresponding terms of the given Γ-stable series of G. Hence we may assume that n ≥ 2. By the assumption, there exists in the module G a descending stable series of length ≤ ωn,
\[ G \supset G_1 \supset \dots \supset G_u \supset \dots \supset G_\lambda \supset \dots \supset G_\mu = 0. \tag{17} \]
Consider the family of submodules {G_λ + B | λ ≤ μ}. In view of the Γ-stability of the series (17) and the Γ-invariance of the module B we have for a non-limit λ
\[ [G_{\lambda-1} + B, \Gamma] \subset [G_{\lambda-1}, \Gamma] + [B, \Gamma] \subset G_\lambda + B. \tag{18} \]
Let us show that for a limit λ the relation
\[ G_\lambda + B = \bigcap_{\alpha<\lambda} (G_\alpha + B) \tag{19} \]
holds. […] 0, (1 − ε)λσ ≥ 0 as requested.

VdWC was formulated by B. L. van der Waerden in 1926. It concerns the problem of minimizing the permanent among all doubly stochastic n × n matrices. It was suggested that the minimum is attained precisely for the matrix J_n = (1/n)J, where J stands for the n × n matrix all of whose entries are 1. In terms of formulae:
\[ (A \in \Omega_n \ \text{and}\ A \ne J_n) \implies (\operatorname{per} A > \operatorname{per} J_n). \]
The desire to prove this was the stimulus for approximately 500 papers written on this topic until two papers, settling the question definitively, appeared simultaneously in 1981; these were the papers by D. I. Falikman and G. P. Egorychev; see [14, 22]. The aim of the present paper is to develop several group-theoretical variations of these main themes.
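The statement about the permanent can be illustrated numerically for very small n. The sketch below is an illustration only, not part of the original argument: the helper names `permanent`, `J` and `mix` are hypothetical, and the permanent is computed naively from its definition as a sum over all permutations, which is feasible only for small matrices. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def permanent(a):
    # Naive definition: sum over all permutations of products of entries.
    n = len(a)
    total = Fraction(0)
    for sigma in permutations(range(n)):
        term = Fraction(1)
        for i in range(n):
            term *= a[i][sigma[i]]
        total += term
    return total


def J(n):
    # The matrix J_n = (1/n) J, all entries equal to 1/n.
    return [[Fraction(1, n)] * n for _ in range(n)]


def mix(n, t):
    # (1 - t) * J_n + t * I: doubly stochastic, and different from J_n if t != 0.
    return [[(1 - t) * Fraction(1, n) + (t if i == j else 0) for j in range(n)]
            for i in range(n)]


per_J3 = permanent(J(3))                      # equals 3!/3^3
per_mixed = permanent(mix(3, Fraction(1, 2)))
```

For n = 3 the two values are 2/9 and 13/36, in agreement with the inequality per A > per J_n for this particular A.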
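1.1. Ωn(G) and its multiplicative structure.

Let G be a finite group acting faithfully on some non-empty finite set X of cardinality n ∈ N. Thus, identifying X with n = {1, 2, . . . , n}, we may view G as a subgroup of the symmetric group Sn, G ≤ Sn. Let Ωn(G) be the convex hull of all G-permutation matrices. Its elements are called G-doubly stochastic matrices.

PROPOSITION 1.1. The product of any two G-doubly stochastic matrices is a G-doubly stochastic matrix. The multiplicative semigroup Ωn(G)(·) is a monoid whose invertible elements of finite order are precisely the G-permutation matrices.

PROOF. Let A and B be any two G-doubly stochastic matrices and write
\[ A = \sum_{\sigma\in G} \lambda_\sigma P_\sigma \quad\text{and}\quad B = \sum_{\tau\in G} \lambda'_\tau P_\tau, \]
where λ_σ and λ'_τ are non-negative numbers such that
\[ \sum_{\sigma\in G} \lambda_\sigma = 1 \quad\text{and}\quad \sum_{\tau\in G} \lambda'_\tau = 1. \]
Then we have
\[ AB = \Bigl(\sum_{\sigma\in G} \lambda_\sigma P_\sigma\Bigr)\Bigl(\sum_{\tau\in G} \lambda'_\tau P_\tau\Bigr) = \sum_{\sigma,\tau\in G} \lambda_\sigma\lambda'_\tau P_\sigma P_\tau = \sum_{\rho\in G}\Bigl(\sum_{\sigma\tau=\rho} \lambda_\sigma\lambda'_\tau\Bigr) P_\rho = \sum_{\rho\in G} \lambda''_\rho P_\rho \]
with
\[ \lambda''_\rho = \sum_{\sigma\tau=\rho} \lambda_\sigma\lambda'_\tau. \]
It is clear that λ''_ρ ≥ 0. Moreover, we find
\[ \sum_{\rho} \lambda''_\rho = \sum_{\rho}\Bigl(\sum_{\sigma\tau=\rho} \lambda_\sigma\lambda'_\tau\Bigr) = \sum_{\sigma,\tau\in G} \lambda_\sigma\lambda'_\tau = \sum_{\sigma\in G}\lambda_\sigma \cdot \sum_{\tau\in G}\lambda'_\tau = 1. \]
This proves that AB ∈ Ωn(G). That I ∈ Ωn(G) is obvious. Therefore, Ωn(G) is a monoid.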
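The first part of the proof says that the coefficient vectors of A and B multiply by convolution over G. A minimal sketch of this, assuming the convention that P_σ has its 1 in row i at column σ(i) and that the product στ means "σ first, then τ" (so that P_σ P_τ = P_στ), is given below for the cyclic group of order 3 inside S_3. All names (`perm_matrix`, `compose`, `combo`, `convolve`) and the particular coefficients are hypothetical choices made for the illustration.

```python
from fractions import Fraction


def perm_matrix(s):
    # Row i has its single 1 in column s[i] (0-based one-line notation).
    n = len(s)
    return [[Fraction(int(s[i] == j)) for j in range(n)] for i in range(n)]


def compose(s, t):
    # Product "s first, then t"; with this convention P_s P_t = P_{st}.
    return tuple(t[s[i]] for i in range(len(s)))


def combo(coeffs):
    # Matrix sum of c * P_s over a dict {s: c}.
    n = len(next(iter(coeffs)))
    out = [[Fraction(0)] * n for _ in range(n)]
    for s, c in coeffs.items():
        p = perm_matrix(s)
        for i in range(n):
            for j in range(n):
                out[i][j] += c * p[i][j]
    return out


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def convolve(lam, mu):
    # Coefficients of the product: sum of lam[s] * mu[t] over st = r.
    out = {}
    for s, x in lam.items():
        for t, y in mu.items():
            r = compose(s, t)
            out[r] = out.get(r, Fraction(0)) + x * y
    return out


e, c, c2 = (0, 1, 2), (1, 2, 0), (2, 0, 1)   # the cyclic group of order 3
lam = {e: Fraction(1, 2), c: Fraction(1, 3), c2: Fraction(1, 6)}
mu = {e: Fraction(1, 4), c: Fraction(3, 4)}
nu = convolve(lam, mu)
```

The matrix product of `combo(lam)` and `combo(mu)` coincides with `combo(nu)`, and the coefficients in `nu` are non-negative and sum to 1, as in the computation above.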
CHAPTER III. MAJORIZATION
Now, let A be an invertible G-doubly stochastic matrix of finite order. Then we may assume that A^m = I for some integer m ≥ 2. Arguing by contradiction, suppose that A is not a permutation matrix. Then for some matrix element a_kl in A holds 0 < a_kl < 1. Set max_{i∈n} {a_il} = a_pl. Then it is clear that 0 < a_pl < 1, as a_pl together with a_kl and, possibly, other elements in the l-th column sums to unity. Notice that, in view of what has already been proved, the matrix A^{m−1} = (a*_ij) is doubly stochastic. Therefore, considering the elements ∑_{s∈n} a*_is a_sl in the l-th column of the product of the doubly stochastic matrices A^{m−1} and A, it follows that for any one of them holds
\[ \sum_{s\in n} a^{*}_{is} a_{sl} \le a_{pl} \sum_{s\in n} a^{*}_{is} = a_{pl} < 1. \]
This contradicts the assumption that A^m = I.

Finally, let us prove that A is a G-permutation matrix. Take a reduced representation for A, i.e. A = ∑_{σ∈H} λ_σ P_σ with H ⊂ G and all λ_σ > 0. By the above argument, we know that ∑_{σ∈H} λ_σ P_σ = P for some permutation matrix P. Thus we have
\[ P(i,j) = \sum_{\sigma\in H} \lambda_\sigma P_\sigma(i,j). \]
If P(i, j) = 0 then 0 = ∑_{σ∈H} λ_σ P_σ(i, j), together with the fact that all λ_σ > 0 while P_σ(i, j) ∈ {0, 1}, implies that P_σ(i, j) = 0 for all σ ∈ H. If, however, P(i, j) = 1 then P_σ(i, j) = 1 for all σ ∈ H. Indeed, if P_τ(i, j) = 0 for some τ ∈ H, it would follow from 1 = ∑_{σ∈H} λ_σ P_σ(i, j) that ∑_{σ∈H} λ_σ = ∑_{σ∈H'} λ_σ with H' = H \ {τ}. But as all λ_σ > 0 this is impossible. Thus we have shown that for all σ ∈ H holds
\[ P_\sigma(i,j) = \begin{cases} 0, & \text{if } P(i,j) = 0, \\ 1, & \text{if } P(i,j) = 1. \end{cases} \]
As P_σ and P are permutation matrices, this gives P_σ = P for all σ ∈ H. It follows that A = P is a G-permutation matrix. □

Fix a convex subset S ⊆ Mn(R). Linear transformations g : Mn(R) → Mn(R) such that g(S) = S are called symmetries for S. The set of all symmetries for S is a group. Its subgroups are called symmetric groups for S. Denote the action of g on matrices Q ∈ S by Q ∘ g. Take now S = Ωn(G) and consider a finite group L of symmetries for it. If Q is a matrix such that Q ∘ ℓ = Q for all ℓ ∈ L, then Q is called an L-fixed point.

PROPOSITION 1.2. Let L be a finite group of linear transformations of the space of matrices Mn(R) such that Ωn(G) is L-invariant. Then the subset of L-fixed points coincides with the convex hull of the set of all matrices of the form
\[ Q_\sigma = \frac{1}{|L|} \sum_{\ell\in L} P_\sigma\circ\ell. \]
PROOF. Denote by L* the set of all L-fixed points in Ωn(G). Given A, B ∈ L* consider the matrix C = λA + (1 − λ)B, λ ∈ [0, 1]; it is clear that C ∈ L*, as, for any ℓ ∈ L, C ∘ ℓ = λ(A ∘ ℓ) + (1 − λ)(B ∘ ℓ) = λA + (1 − λ)B = C.
1. Generalized majorization
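For any matrix
\[ Q_\sigma = \frac{1}{|L|} \sum_{\ell\in L} P_\sigma\circ\ell \qquad\text{with } \sigma\in G \text{ fixed} \]
and ℓ' ∈ L we find
\[ Q_\sigma\circ\ell' = \frac{1}{|L|} \sum_{\ell\in L} P_\sigma\circ\ell\ell' = \frac{1}{|L|} \sum_{\ell''\in L} P_\sigma\circ\ell'' = Q_\sigma; \]
here we used the fact that if ℓ runs through all of L the same is true for ℓ'' = ℓℓ' also. This shows that Q_σ ∈ L*. Now, any element D ∈ L* ⊆ Ωn(G) can be written as
\[ D = \sum_{\pi\in G} \delta_\pi P_\pi \qquad\text{with } \delta_\pi \ge 0,\ \sum_{\pi} \delta_\pi = 1. \]
Hence,
\[ D = \frac{1}{|L|} \sum_{\ell\in L} D\circ\ell = \frac{1}{|L|} \sum_{\ell\in L} \sum_{\pi\in G} \delta_\pi (P_\pi\circ\ell) = \sum_{\pi\in G} \delta_\pi \frac{1}{|L|} \sum_{\ell\in L} (P_\pi\circ\ell) = \sum_{\pi\in G} \delta_\pi Q_\pi. \]
This together with what was proved above finishes the proof. □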
REMARK 1.3. Using the obvious relations (A + B)^t = A^t + B^t and P_σ^t = P_{σ^{−1}} we see that for the transpose of any matrix X ∈ Ωn(G) it holds
\[ X^{t} = \Bigl(\sum_{\sigma\in G} \lambda_\sigma P_\sigma\Bigr)^{t} = \sum_{\sigma\in G} \lambda_\sigma P_\sigma^{t} = \sum_{\sigma\in G} \lambda_\sigma P_{\sigma^{-1}}, \]
which, as σ^{−1} ∈ G if σ ∈ G, implies that X^t ∈ Ωn(G). Therefore, Ωn(G) is invariant under transposition of matrices. Taking L to be the 2-element group ⟨t | t² = id_{Ωn(G)}⟩, it follows that the set of all symmetric doubly stochastic G-matrices coincides with the convex hull of the set of matrices {½(P_σ + P_{σ^{−1}})}. Furthermore, specializing G to be the group Sn (the symmetric group of n) we obtain the result: the set of all symmetric n × n doubly stochastic matrices is identical with the convex hull of the set of all matrices of the form ½(P + P^t), where P is an n × n permutation matrix. This is a result by M. Katz [12]; see also [5].
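The symmetrization described in Remark 1.3 can be checked on a concrete matrix. In the sketch below, all names (`perm_matrix`, `inverse`, `combine`) and the chosen coefficients are hypothetical; permutation matrices carry their 1 in row i at column σ(i), which is an assumed convention. The code builds X as a convex combination of permutation matrices, confirms that transposing replaces each σ by σ^{-1}, and confirms that the symmetric part ½(X + X^t) is the same convex combination of the matrices ½(P_σ + P_{σ^{-1}}).

```python
from fractions import Fraction


def perm_matrix(s):
    # Row i has its single 1 in column s[i] (0-based one-line notation).
    n = len(s)
    return [[Fraction(int(s[i] == j)) for j in range(n)] for i in range(n)]


def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)


def transpose(a):
    return [list(row) for row in zip(*a)]


def combine(terms):
    # Entry-wise sum of c * M over a list of (c, M) pairs.
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]


half = Fraction(1, 2)
lam = {(1, 2, 0): Fraction(1, 2), (1, 0, 2): Fraction(1, 3), (0, 1, 2): Fraction(1, 6)}

X = combine([(c, perm_matrix(s)) for s, c in lam.items()])
Xt_from_inverses = combine([(c, perm_matrix(inverse(s))) for s, c in lam.items()])

# Q_sigma = (P_sigma + P_{sigma^{-1}}) / 2, the averaged generators.
Q = {s: combine([(half, perm_matrix(s)), (half, perm_matrix(inverse(s)))]) for s in lam}
symmetrized = combine([(half, X), (half, transpose(X))])
from_Q = combine([(c, Q[s]) for s, c in lam.items()])
```

Both `transpose(X) == Xt_from_inverses` and `symmetrized == from_Q` hold for this example, which is the two-element instance of the averaging used in Proposition 1.2.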
Following [8], a semigroup S with involution a ↦ a* is called a special involution semigroup if and only if every finite nonempty subset T ⊆ S has the property that there exists an element a ∈ T such that if for some b, c ∈ T we have aa* = bc* then b = c.

PROPOSITION 1.4. The multiplicative semigroup of all doubly stochastic G-matrices (for any given finite group G, G ≤ Sn) is a special involution semigroup.

PROOF. Take any X ∈ Ωn(G) and write it in the form X = ∑_{π∈G} λ_π P_π. Then we get X^t = ∑_{π∈G} λ_π P_{π^{−1}}. It follows that X ↦ X^t is an involution on the multiplicative semigroup Ωn(G)(·).

Take any finite nonempty subset T ⊆ Ωn(G) and choose A ∈ T such that
\[ \operatorname{tr} AA^{t} = \max_{X\in T} \operatorname{tr} XX^{t}. \]
Then, assuming that AA^t = BC^t for some B, C ∈ T, we get AA^t = (AA^t)^t = CB^t. Hence, (B − C)(B − C)^t = BB^t + CC^t − 2AA^t. In view of our choice of A, this gives
\[ 0 \le \operatorname{tr}(B-C)(B-C)^{t} = \operatorname{tr} BB^{t} + \operatorname{tr} CC^{t} - 2\operatorname{tr} AA^{t} \le 0. \]
As obviously tr XX^t = 0 if and only if X = 0, it follows that B = C, as required. □
REMARK 1.5. In [8, p. 96], it is noted that every periodic (and so also every finite) special involution semigroup is inverse. In our case it follows easily from Proposition 1.1 that every periodic submonoid in Ωn(G)(·) is indeed a group.
1.2. Inequalities for Ωn(G)

For a given matrix in Mn(R) it is very easy to decide whether it belongs to Ωn or not: just use the definition of a doubly stochastic matrix as a non-negative matrix all of whose rows and columns add up to unity. Several combinatorial problems are essentially optimization problems on some subsets of Ωn. One such problem is, e.g., the Travelling Salesman Problem: to minimize the function f : Mn(R) → R, f(X) = ∑_{i,j∈n} c_ij x_ij, for X = (x_ij) ∈ Ωn and ∑_{i∈S} ∑_{j∉S} x_ij ≥ 1 for S ⊆ n. Therefore, it is desirable to have a clear picture of the interplay between the multiplicative structure and the linear structure of Ωn. In general, however, such information is not available and the problem is by no means an easy one. So, to find "few and natural" inequalities for describing the convex hull of the set {P_σ | σ ∈ Sn \ M}, even in the special case when M consists of the identity element only, is not trivial; the answer was given by Cruse [5] by a resourceful argument. Notice also that the travelling salesman polytope is nothing else than the convex hull of the set {P_σ | σ ∈ Zn}, where Zn is the set of all (full) cycles of length n in Sn. The symmetric travelling salesman polytope has a similar presentation. The list of problems can be easily enlarged, e.g. we could include the question of finding a basis with minimal weight for a (simple) matroid over a finite field etc.; see [2], [13].

Denote by An the subgroup of Sn of all even permutations, i.e. the alternating group. Convex combinations of permutation matrices P_σ, σ ∈ An (i.e. of even permutation matrices) are called even doubly stochastic matrices; the set of such matrices is denoted by Ω(An). A. J. Hoffman proposed in 1955 the problem of describing Ω(An) inside Ωn. With the aim to answer this question, L. Mirsky [19] established the following result.

THEOREM 1.6 (L. Mirsky, 1961). Let D = (d_ik) be an even n × n doubly stochastic matrix. Then the inequalities
\[ \sum_{k=1}^{n} d_{k,\pi(k)} - 3 d_{j,\pi(j)} \le n - 3 \tag{57} \]
hold for all j ∈ n and π ∈ An.
Unfortunately, these conditions are not sufficient for D to belong to Ω(An). This was first noticed by J. von Below [2], who gave the example of the matrix DB_4 which satisfies (57) but is not in Ω(An):
\[ DB_4 = \tfrac12 P_{(12)} + \tfrac13 P_{(134)} + \tfrac16 P_{(243)} = \frac16 \begin{pmatrix} 1 & 3 & 2 & 0 \\ 3 & 2 & 0 & 1 \\ 0 & 1 & 3 & 2 \\ 2 & 0 & 1 & 3 \end{pmatrix}. \]
Such counterexamples exist for any n ≥ 5:
\[ DB_n = \tfrac12 \bigl(P_{(12)} + P_{(134\dots n)}\bigr) \]
with (for n = 5)
\[ P_{(12)} = \begin{pmatrix} 0&1&0&0&0 \\ 1&0&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \end{pmatrix} \quad\text{and}\quad P_{(1345)} = \begin{pmatrix} 0&0&1&0&0 \\ 0&1&0&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \\ 1&0&0&0&0 \end{pmatrix}. \]
Four other necessary conditions in order that a doubly stochastic matrix be even are described by R. Brualdi and B. Liu [7].

Let G ⊂ Sn. Denote by i(σ) the number of fixed points of σ ∈ G in the natural action (n, G) induced by (n, Sn). We call the set Spec G = {i(σ)|σ ∈ G, σ = ⊂ {0, 1, . . . , n}, the spectrum of the subgroup G ⊂ Sn.
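The claim that von Below's matrix DB_4 satisfies the inequalities (57) can be confirmed by enumerating all even permutations of four points. The sketch below is an illustration with hypothetical helper names (`perm_from_cycle`, `is_even`, `mirsky_holds`); it assumes the convention that P_σ has its 1 in row i at column σ(i), which reproduces the displayed 4 × 4 matrix. It checks only the necessary condition (57); it does not test membership in Ω(A_4).

```python
from fractions import Fraction
from itertools import permutations


def perm_from_cycle(cycle, n):
    # 0-based one-line form of a single cycle written in 1-based cycle notation.
    s = list(range(n))
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        s[a - 1] = b - 1
    return tuple(s)


def perm_matrix(s):
    n = len(s)
    return [[int(s[i] == j) for j in range(n)] for i in range(n)]


def is_even(s):
    inversions = sum(1 for i in range(len(s))
                     for j in range(i + 1, len(s)) if s[i] > s[j])
    return inversions % 2 == 0


def mirsky_holds(D):
    # Inequality (57) for every even permutation pi and every index j.
    n = len(D)
    for pi in permutations(range(n)):
        if not is_even(pi):
            continue
        total = sum(D[k][pi[k]] for k in range(n))
        if any(total - 3 * D[j][pi[j]] > n - 3 for j in range(n)):
            return False
    return True


n = 4
terms = [(Fraction(1, 2), (1, 2)), (Fraction(1, 3), (1, 3, 4)), (Fraction(1, 6), (2, 4, 3))]
DB4 = [[sum(c * perm_matrix(perm_from_cycle(cyc, n))[i][j] for c, cyc in terms)
        for j in range(n)] for i in range(n)]
expected = [[Fraction(v, 6) for v in row]
            for row in [[1, 3, 2, 0], [3, 2, 0, 1], [0, 1, 3, 2], [2, 0, 1, 3]]]
```

With these definitions `DB4 == expected` and `mirsky_holds(DB4)` is true, whereas the odd permutation matrix P_(12) alone already violates (57).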
1.3. On the diagonals of G-doubly stochastic matrices

This section is motivated by the previous discussion. We are interested in describing the diagonals of G-doubly stochastic matrices. A first result in this direction is the following.

THEOREM 1.7. Let G be a subgroup of Sn with normalizer N (that is, π ∈ G, g ∈ N implies gπg^{−1} ∈ G). Assume that G is transitive in the following sense:
(*) If A, B are any two subsets of n = {1, 2, . . . , n} with |A| = |B| = i, where i is an integer belonging to spec G, then there exists an element g ∈ N such that gA = B.
Then the diagonals of G-doubly stochastic matrices form a convex subset of R^n_+ which is Sn-invariant.

PROOF. As Ωn(G) = conv{P_g}_{g∈G}, it is clear that
\[ \operatorname{diag} \Omega_n(G) = \operatorname{diag}(\operatorname{conv}\{P_g\}_{g\in G}) = \operatorname{conv}(\operatorname{diag}\{P_g\}_{g\in G}). \]
Therefore it suffices to show that the set diag{P_g}_{g∈G} is Sn-invariant. An equivalent statement is: If i is any index in spec G and if u is a 0-1-vector with exactly i components equal to 1, |u| = i, then u ∈ diag{P_g}_{g∈G}. To see this we observe first that if a is any fixed point of π ∈ G (or ∈ Sn, for that matter) and if g ∈ Sn, then ga is a fixed point of π' = gπg^{−1}. (This is proved by the series of equalities: π'ga = gπg^{−1}ga = gπa = ga.) Denoting by fix(π) the fixed point set of π, we can state this as fix(gπg^{−1}) = g fix(π).
Equivalently: P_g diag P_π = diag(P_g P_π P_g^{−1}) = diag P_{gπg^{−1}}. (Note that fix(g) = supp diag P_g.) Let now u be an arbitrary 0-1-vector with |u| = i, i ∈ spec G. As i ∈ spec G, there exists then also a vector w with w = diag P_π for some π ∈ G and |w| = i. But (∗), together with the observations in the preceding paragraph, shows that u = P_g w for some g ∈ N. It follows now that u = P_g w = P_g diag P_π = diag P_{gπg^{−1}} = diag P_{g'} with g' = gπg^{−1} ∈ G. Therefore u ∈ diag{P_g}_{g∈G}. □
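The identity fix(gπg^{-1}) = g fix(π) used in the proof is easy to test exhaustively for a small symmetric group. The following sketch does so for S_4; the function names (`conjugate`, `fix`, `diag_of_perm_matrix`) are hypothetical and permutations are stored in 0-based one-line notation.

```python
from itertools import permutations


def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)


def conjugate(g, pi):
    # g pi g^{-1} as a map on points: x -> g(pi(g^{-1}(x))).
    g_inv = inverse(g)
    return tuple(g[pi[g_inv[x]]] for x in range(len(pi)))


def fix(s):
    # Set of fixed points of the permutation s.
    return {i for i, j in enumerate(s) if i == j}


def diag_of_perm_matrix(s):
    # Diagonal of P_s: the indicator vector of fix(s).
    return tuple(int(s[i] == i) for i in range(len(s)))


S4 = list(permutations(range(4)))
identity_holds = all(fix(conjugate(g, pi)) == {g[a] for a in fix(pi)}
                     for g in S4 for pi in S4)
```

Since the diagonal of a permutation matrix is the indicator vector of its fixed-point set, `identity_holds` being true is the statement that conjugation by g permutes these diagonals.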
So our problem is reduced to a purely geometric question: describing the structure of the convex hull of an Sn-invariant set M of 0-1-vectors. We will answer this question only in a very special situation.

LEMMA 1.8. Let n be a positive integer and let f be an integer satisfying 0 ≤ f ≤ n. Let M_f be the set consisting of the vector (1, 1, . . . , 1) together with all vectors in R^n whose components are either 0 or 1 and which have at most n − f components equal to 1. (Thus M_f also contains the vector (0, 0, . . . , 0).) Then the convex hull of M_f consists of all vectors (x_1, x_2, . . . , x_n) which satisfy the conditions
\[ 0 \le x_j \le 1 \quad\text{and}\quad n - f + f x_j \ge \sum_{k=1}^{n} x_k \qquad\text{for } j = 1, 2, \dots, n. \tag{58} \]
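A quick consistency check of Lemma 1.8 on the vertices: among all 0-1 vectors, exactly those in M_f satisfy the conditions (58). The sketch below verifies this for small n by enumeration. The names `in_Mf`, `satisfies_58` and `vertices_agree` are hypothetical, and the check covers only 0-1 vectors plus one rational point, not the full convex-hull statement.

```python
from fractions import Fraction
from itertools import product


def in_Mf(x, f):
    # The all-ones vector, or a 0-1 vector with at most n - f ones.
    n, ones = len(x), sum(x)
    return ones == n or ones <= n - f


def satisfies_58(x, f):
    # Conditions (58): 0 <= x_j <= 1 and n - f + f*x_j >= sum of all x_k.
    n, total = len(x), sum(x)
    return all(0 <= xj <= 1 and n - f + f * xj >= total for xj in x)


def vertices_agree(n):
    # For every f and every 0-1 vector, membership in M_f matches (58).
    return all(in_Mf(x, f) == satisfies_58(x, f)
               for f in range(n + 1)
               for x in product((0, 1), repeat=n))


# A non-vertex point: the vector ((n - f)/j) * v_j for n = 5, f = 2, j = 4.
scaled = [Fraction(3, 4)] * 4 + [Fraction(0)]
```

The point `scaled` is an averaged copy of a vector with n − f ones spread over j coordinates; it satisfies (58) with equality in the coordinate that is 0.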
PROOF. (After Michael Cwikel³) Let F be the set of all vectors in R^n which satisfy all the conditions (58). It is clear that F is a convex set containing every vector in M_f. So we have conv(M_f) ⊂ F, where conv(E) denotes the convex hull of any set E ⊂ R^n. It remains to show that F ⊂ conv(M_f). In fact it suffices to show merely that F* ⊂ conv(M_f), where F* is the subset of F consisting of those vectors x = (x_1, x_2, . . . , x_n) which satisfy 1 ≥ x_1 ≥ x_2 ≥ . . . ≥ x_n ≥ 0. This reduction of the problem follows immediately from the fact that the set F and also the set conv(M_f) are both invariant under permutations of the components of vectors. It will be convenient to use the notation D for the larger set of all vectors x = (x_1, x_2, . . . , x_n) ∈ R^n which satisfy x_1 ≥ x_2 ≥ . . . ≥ x_n ≥ 0.

REMARK 1.9. Let x be any vector of the form x = ∑_{j=1}^{k} α_j w_j where w_j ∈ M_f, or merely w_j ∈ conv(M_f), and α_j ≥ 0 for j = 1, 2, . . . , k and ∑_{j=1}^{k} α_j ≤ 1. Then it is clear that x ∈ conv(conv(M_f)) = conv(M_f), since we can write x in the form x = ∑_{j=0}^{k} α_j w_j where w_0 is the zero vector and α_0 = 1 − ∑_{j=1}^{k} α_j. □

Let us define the vectors v_0, v_1, . . . , v_n in R^n by letting v_0 be the zero vector and, for j = 1, 2, . . . , n, letting v_j be the vector whose first j components equal 1 and all of whose remaining components (if there are any) equal 0.

³ Note by J. Peetre. This result is due to Uno Kaljulaid, but his proof was not quite complete. So in 1997 I gave another proof. Unfortunately, I have been unable to reconstruct it. Therefore we offer here yet a third proof, constructed at my request by my friend Michael Cwikel. I am immensely grateful to him for this.
Suppose that x = (x_1, x_2, . . . , x_n) is an arbitrary element of F*. Then we can write x in the form
\[ x = \sum_{j=1}^{n} \theta_j v_j, \tag{59} \]
where each θ_j ∈ [0, 1] and ∑_{j=1}^{n} θ_j ≤ 1 and v_j are the special vectors just defined. In fact θ_j = x_j − x_{j+1} for j = 1, 2, . . . , n, where we define x_{n+1} = 0. By summing all components of all vectors in the sum for x in (59) we see that
\[ \sum_{j=1}^{n} x_j = \sum_{j=1}^{n} \theta_j\, j. \tag{60} \]
For x ∈ F* the n conditions n − f + f x_j ≥ ∑_{k=1}^{n} x_k for j = 1, 2, . . . , n, which appear in (58), are equivalent to the single condition
\[ n - f + f x_n \ge \sum_{k=1}^{n} x_k. \tag{61} \]
We will sometimes use the notation ‖x‖ = ∑_{k=1}^{n} x_k.

Suppose first that f = 0. Then v_j ∈ M_f for every j = 0, 1, 2, . . . , n. Taking the first component of the vector equation (59) shows that 1 ≥ x_1 ≥ ∑_{j=1}^{n} θ_j. This inequality, combined with Remark 1.9 and (59), immediately gives us that x ∈ conv(M_f).

Let us next consider the case where f = 1. By definition M_1 = M_0, so we still have v_j ∈ M_f for every j = 0, 1, 2, . . . , n. Exactly the same reasoning as for f = 0 shows that x ∈ conv(M_f). (Note that so far the only part of condition (58) that we have had to use is
\[ 0 \le x_j \le 1 \qquad\text{for } j = 1, 2, \dots, n. \tag{62} \]
In fact, although we do not need it for this proof, it can be checked directly that condition (58) for f = 0 is equivalent to condition (58) for f = 1.)

Now suppose that f = n. Then condition (58) implies that the average value of the n components x_1, x_2, . . . , x_n is less than or equal to their minimum value. So all the components x_j must be equal. Thus x = θ_n v_n where θ_n = x_n ∈ [0, 1] is the common value of all the components. So, again by Remark 1.9, we have x ∈ conv(M_f).

It remains to consider the case where 1 < f < n. In this case we have v_j ∈ M_f for j = 0, 1, 2, . . . , n − f and also for j = n. For each j in the range n − f < j < n we claim that
\[ \frac{n-f}{j}\, v_j \in \operatorname{conv}(M_f). \tag{63} \]
This is fairly obvious intuitively, but let us check it anyway. Consider the subspace V of R^n consisting of all vectors y of the form (y_1, y_2, . . . , y_j, 0, 0, . . . , 0), i.e. all components after the j-th component are 0. Now consider the cyclic permutation map T acting on V defined by T(y_1, y_2, . . . , y_j, 0, 0, . . . , 0) = (y_j, y_1, y_2, . . . , y_{j−1}, 0, 0, . . . , 0). Since v_{n−f} and all its permutations are in M_f, the vector
\[ w = \frac{1}{j}\bigl( v_{n-f} + T v_{n-f} + T^{2} v_{n-f} + T^{3} v_{n-f} + \dots + T^{j-1} v_{n-f} \bigr) \]
must be in conv(M_f) and must be of the form (a, a, a, . . . , a, 0, . . . , 0), i.e., the first j elements all equal the average value a = (n − f)/j of the first j elements of v_{n−f}, and the remaining elements are 0. This proves (63).

It is convenient to divide the case where 1 < f < n into several subcases.

Subcase (i): Suppose that the arbitrary element x ∈ F* chosen above satisfies x_j = 0 for all j > n − f. Then we also have θ_j = 0 for all j > n − f in the representation (59). Since v_j ∈ M_f whenever θ_j ≠ 0, we obtain that x ∈ conv(M_f) by exactly the same reasoning as was used in the case f = 0.

Subcase (ii): Suppose that x is such that in its representation (59) we have θ_j = 0 for all integers j in the range 1 ≤ j ≤ n − f and also θ_n = 0. This last condition is equivalent to x_n = 0. So, by (58) and (60), we have
\[ \sum_{j=1}^{n} \theta_j\, j = \sum_{j=n-f+1}^{n-1} \theta_j\, j \le n - f. \tag{64} \]
Note that, by (63), ((n − f)/j) v_j = w_j ∈ conv(M_f) for each j in the range n − f + 1 ≤ j ≤ n − 1. We now see that
\[ x = \sum_{j=n-f+1}^{n-1} \theta_j v_j = \sum_{j=n-f+1}^{n-1} \alpha_j w_j, \qquad\text{where } \alpha_j = \frac{j}{n-f}\,\theta_j. \]
By (64) we have ∑_{j=n−f+1}^{n−1} α_j ≤ 1. So, once again, Remark 1.9 applies to show that x = ∑_{j=n−f+1}^{n−1} α_j w_j is an element of conv(M_f).

Subcase (iii): Suppose that x ∈ F* is of the form x = ∑_{j=1}^{n−f} θ_j v_j + θ_k v_k for some particular k in the range n − f + 1 ≤ k ≤ n − 1. Now let y = ∑_{j=1}^{n−f} θ_j v_j and let z = v_k. So y and z are both in D. Let
\[ y' = \frac{n-f}{\|y\|}\, y \quad\text{and}\quad z' = \frac{n-f}{\|z\|}\, z = \frac{n-f}{k}\, v_k. \]
These are both elements of F* and furthermore, by subcases (i) and (ii), they are also in conv(M_f). We can now write x = αy' + βz' where, necessarily, α (n − f)/‖y‖ = 1 and β (n − f)/k = θ_k. Then
\[ \alpha + \beta = \frac{\|y\|}{n-f} + \frac{\theta_k k}{n-f} = \frac{\|x\|}{n-f} \le 1. \]
So Remark 1.9 shows that x ∈ conv(M_f).

Subcase (iv): Suppose that x ∈ F* satisfies x_n = 0, or equivalently θ_n = 0. So, as in (59), we have
\[
x = \sum_{j=1}^{n-1} \theta_j v_j = \sum_{j=1}^{n-f} \theta_j v_j + \sum_{k=n-f+1}^{n-1} \theta_k v_k
= \sum_{k=n-f+1}^{n-1} \Biggl[ \frac{\theta_k}{\sum_{q=n-f+1}^{n-1} \theta_q} \sum_{j=1}^{n-f} \theta_j v_j + \theta_k v_k \Biggr]
= \sum_{k=n-f+1}^{n-1} y_k,
\]
where
\[ y_k = \frac{\theta_k}{\sum_{q=n-f+1}^{n-1} \theta_q} \sum_{j=1}^{n-f} \theta_j v_j + \theta_k v_k. \]
For each k in the range n − f + 1 ≤ k ≤ n − 1 we can see that the vector y'_k := ((n − f)/‖y_k‖) y_k is exactly an element of F* of the form treated in subcase (iii) and so it is in conv(M_f). Furthermore we clearly have ∑_{k=n−f+1}^{n−1} ‖y_k‖ = ‖x‖ ≤ n − f. So x = ∑_{k=n−f+1}^{n−1} α_k y'_k where α_k = ‖y_k‖/(n − f), and so ∑_{k=n−f+1}^{n−1} α_k ≤ 1. This shows that x ∈ conv(M_f).

Subcase (v): Finally we have to treat the last remaining subcase, where x_n > 0. We shall write x in the form x = y + z where z = x_n v_n and so y = x − x_n v_n = (x_1 − x_n, x_2 − x_n, . . . , x_{n−1} − x_n, 0). Let w = (w_1, w_2, . . . , w_{n−1}, 0) be the vector w = (1/(1 − x_n)) y. We claim that w ∈ F*. To check this first note that
\[ 0 \le w_j = \frac{x_j - x_n}{1 - x_n} \le \frac{x_1 - x_n}{1 - x_n} \le 1 \]
for each j. Then we have the following sequence of inequalities, where we shall use the fact that x ∈ F* in the second line:
\[
\begin{aligned}
\sum_{j=1}^{n} w_j = \sum_{j=1}^{n-1} w_j &= \frac{1}{1-x_n} \sum_{j=1}^{n-1} (x_j - x_n) = \frac{1}{1-x_n} \Bigl( \sum_{j=1}^{n-1} x_j - (n-1)x_n \Bigr) \\
&= \frac{1}{1-x_n} \Bigl( \sum_{j=1}^{n} x_j - n x_n \Bigr) \le \frac{1}{1-x_n} \bigl( f x_n + n - f - n x_n \bigr) \\
&= \frac{(n-f)(1-x_n)}{1-x_n} = n - f.
\end{aligned}
\]
Since w_n = 0 this is exactly what we need to show that w ∈ F*. Also, again using the fact that w_n = 0, we see, from the previous case, that w ∈ conv(M_f). Finally we express x as a convex combination x = (1 − x_n)w + x_n v_n. Since both w and v_n are in conv(M_f), so is x. □
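The staircase representation (59) and the identity (60), on which the whole proof rests, can be illustrated with a few lines of exact arithmetic. In the sketch below the sample vector and the names `staircase` and `thetas` are hypothetical; the vector is chosen with non-increasing components in [0, 1], as required for elements of F*.

```python
from fractions import Fraction


def staircase(j, n):
    # v_j: first j components equal to 1, the remaining n - j equal to 0.
    return [1] * j + [0] * (n - j)


def thetas(x):
    # theta_j = x_j - x_{j+1}, with the convention x_{n+1} = 0.
    n = len(x)
    return [x[j] - (x[j + 1] if j + 1 < n else 0) for j in range(n)]


x = [Fraction(9, 10), Fraction(1, 2), Fraction(1, 2), Fraction(1, 5)]
n = len(x)
theta = thetas(x)

# Representation (59): x = sum_j theta_j v_j.
rebuilt = [sum(theta[j - 1] * staircase(j, n)[i] for j in range(1, n + 1))
           for i in range(n)]

# Identity (60): sum_j x_j = sum_j j * theta_j.
weighted = sum(j * theta[j - 1] for j in range(1, n + 1))
```

Here `rebuilt` reproduces x, the coefficients in `theta` are non-negative and sum to x_1, and `weighted` equals the sum of the components of x.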
Combining Theorem 1.7 with Lemma 1.8 we obtain at once the following result.

THEOREM 1.10. Assume that G ⊂ Sn satisfies condition (∗) in Theorem 1.7 and, moreover, that spec G = {0, 1, . . . , n − f, n}. Then a vector x = (x_1, . . . , x_n) belongs to the convex hull of all diagonals of G-doubly stochastic matrices if and only if condition (58) in Lemma 1.8 is fulfilled.
1.4. G-variations on HPL

Fix n ∈ N and consider any subgroup G ≤ Sn. Take some vectors a = (a_1, . . . , a_n) and b = (b_1, . . . , b_n) with real non-negative components.

DEFINITION 1.11. The vector a is said to be a G-average of b if there exists a matrix X ∈ Ωn(G) such that a = bX.

DEFINITION 1.12. The polynomial
\[ [a]_G := \frac{1}{|G|} \sum_{\sigma\in G} X_1^{a_{\sigma(1)}} \cdots X_n^{a_{\sigma(n)}} \]
is called the symmetric G-mean of a. The following two examples are well known:
\[ [(1, 0, \dots, 0)]_{S_n} = \frac{1}{n}(x_1 + \dots + x_n) \]
and
\[ \bigl[(\tfrac{1}{n}, \dots, \tfrac{1}{n})\bigr]_{S_n} = \sqrt[n]{x_1 x_2 \cdots x_n}. \]
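The two classical examples can be reproduced numerically from Definition 1.12. The following sketch evaluates the symmetric G-mean at a sample point for G = S_3; the function name `g_mean`, the sample point and the use of floating-point arithmetic are choices made here for illustration only.

```python
from itertools import permutations
from math import isclose, prod


def g_mean(a, x, group):
    # [a]_G evaluated at x: average over sigma in G of prod_i x_i ** a_{sigma(i)}.
    return sum(prod(xi ** a[s[i]] for i, xi in enumerate(x))
               for s in group) / len(group)


S3 = list(permutations(range(3)))   # permutations in 0-based one-line notation
x = (1.0, 2.0, 4.0)

arithmetic = g_mean((1, 0, 0), x, S3)             # (x1 + x2 + x3) / 3
geometric = g_mean((1 / 3, 1 / 3, 1 / 3), x, S3)  # cube root of x1 * x2 * x3
```

For this sample point the two values are 7/3 and 2 up to rounding, i.e. the arithmetic and the geometric mean of 1, 2, 4.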
The following fact is true.

THEOREM 1.13. Let a = (a_i) and b = (b_i) be some vectors in R^n with non-negative components. The condition [a]_G ≤ [b]_G holds for all real x_i ≥ 0 if and only if a is a G-average of b.

PROOF. SUFFICIENCY. We can modify the scheme employed in [10]. We use the notation
\[ y = (\ln x_1, \dots, \ln x_n); \qquad (c, z) = \sum_{i=1}^{n} c_i z_i \]
for any vectors c = (c_i) and z = (z_i). Then we find
\[
\begin{aligned}
[a]_G &= |G|^{-1} \sum_{\sigma\in G} \prod_{i=1}^{n} x_i^{a_{\sigma(i)}} = |G|^{-1} \sum_{\sigma\in G} \exp\Bigl( \sum_{i=1}^{n} a_{\sigma(i)} \ln x_i \Bigr) \\
&= |G|^{-1} \sum_{\sigma\in G} \exp\bigl( (a_{\sigma(1)}, \dots, a_{\sigma(n)}),\, y \bigr) = |G|^{-1} \sum_{\sigma\in G} \exp(aP_\sigma, y).
\end{aligned}
\]
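As a is a G-average of b there exists a matrix X ∈ Ωn(G) such that a = bX. Let
\[ X = \sum_{\pi\in G} \lambda_\pi P_\pi, \qquad \lambda_\pi \ge 0, \quad \sum_{\pi} \lambda_\pi = 1. \]
It follows that
\[ (aP_\sigma, y) = \Bigl( b \sum_{\pi\in G} \lambda_\pi P_\pi P_\sigma,\ y \Bigr). \]
Using the convexity of the exponential function we get
\[ \exp(aP_\sigma, y) = \exp\Bigl( \sum_{\pi\in G} \lambda_\pi (bP_{\pi\sigma}, y) \Bigr) \le \sum_{\pi\in G} \lambda_\pi \exp(bP_{\pi\sigma}, y). \]
Therefore, we obtain
\[
\begin{aligned}
|G|\cdot[a]_G = \sum_{\sigma\in G} \exp(aP_\sigma, y) &\le \sum_{\sigma\in G} \sum_{\pi\in G} \lambda_\pi \exp(bP_{\pi\sigma}, y) = \sum_{\pi\in G} \lambda_\pi \sum_{\sigma\in G} \exp(bP_{\pi\sigma}, y) \\
&= \sum_{\pi\in G} \lambda_\pi \sum_{\gamma\in G} \exp(bP_\gamma, y) = \Bigl( \sum_{\pi} \lambda_\pi \Bigr) \cdot |G| \cdot [b]_G = |G| \cdot [b]_G.
\end{aligned}
\]
The needed inequality [a]_G ≤ [b]_G follows. □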
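The sufficiency part can be sanity-checked numerically: take a as a convex combination of the rearrangements of b under a subgroup G and compare the two G-means at random points. The sketch below does this for the cyclic group of order 3; the names `g_mean` and `act`, the weights in `lam`, the sampling range and the tolerance are all assumptions of the illustration, and a finite random sample is of course no substitute for the proof.

```python
import random
from math import prod


def g_mean(a, x, group):
    # [a]_G evaluated at x: average over sigma in G of prod_i x_i ** a_{sigma(i)}.
    return sum(prod(xi ** a[s[i]] for i, xi in enumerate(x))
               for s in group) / len(group)


def act(b, s):
    # The rearranged vector (b_{s(1)}, ..., b_{s(n)}) in 0-based notation.
    return tuple(b[s[i]] for i in range(len(b)))


G = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]          # cyclic subgroup of S_3
b = (3.0, 1.0, 0.0)
lam = {G[0]: 0.5, G[1]: 0.3, G[2]: 0.2}        # convex weights, summing to 1

# a is a G-average of b: a convex combination of the rearrangements of b.
a = tuple(sum(w * act(b, s)[i] for s, w in lam.items()) for i in range(3))

rng = random.Random(0)                         # fixed seed for reproducibility
samples = [tuple(rng.uniform(0.1, 5.0) for _ in range(3)) for _ in range(1000)]
holds = all(g_mean(a, x, G) <= g_mean(b, x, G) + 1e-9 for x in samples)
```

On all sampled points the inequality [a]_G ≤ [b]_G is observed, in line with the computation above.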
To prove the necessity part of the theorem we use the following result by R. Rado [26].

THEOREM 1.14 (Rado, 1952). For given vectors a = (a_i) and b = (b_i) in R^n with all their components non-negative and for any subgroup G ≤ Sn it holds [a]_G ≤ [b]_G if and only if a belongs to the convex hull of the set {b^σ | σ ∈ G}.

Here the following notation is used: for b = (b_1, . . . , b_n) one writes b^σ = (b_σ(1), . . . , b_σ(n)) ∈ R^n. Denote further by K_G(b) the convex hull of the set {b^π | π ∈ G}. It remains to prove that a ∈ K_G(b) is the same as saying that a is a G-average of b. Indeed, if a = bX for some matrix X ∈ Ωn(G), then, representing X as
\[ X = \sum_{\pi\in G} t_\pi P_\pi, \qquad t_\pi \ge 0, \quad \sum_{\pi} t_\pi = 1, \]
we get
\[ a = b\Bigl( \sum_{\pi\in G} t_\pi P_\pi \Bigr) = \sum_{\pi\in G} t_\pi (bP_\pi) = \sum_{\pi\in G} t_\pi b^{\pi} \in K_G(b). \]
In the other direction, if a ∈ K_G(b), then for some λ_σ ≥ 0, ∑_σ λ_σ = 1, we have
\[ a = \sum_{\sigma\in G} \lambda_\sigma b^{\sigma} \]
and therefore
\[ a = \sum_{\sigma\in G} \lambda_\sigma b^{\sigma} = \sum_{\sigma\in G} \lambda_\sigma (bP_\sigma) = b\Bigl( \sum_{\sigma\in G} \lambda_\sigma P_\sigma \Bigr) = bX, \qquad X = \sum_{\sigma\in G} \lambda_\sigma P_\sigma, \]
with X belonging to Ωn(G). This means that a is a G-average of b.

PROOF OF THEOREM 1.14 ([26]). Let there hold [a]_G ≤ [b]_G and, arguing by contradiction, suppose that a ∉ K_G(b). Then it follows (see [26]) that there exist u_i ∈ R (i ∈ n) and δ > 0 such that
\[ \sum_{i} u_i (a_i - b_{\tau(i)}) \ge \delta \qquad\text{for all } \tau \in G. \]
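Take any number $M > 1$ and set $x_i = M^{u_i}$. Then we have
\[ |G| \cdot [b]_G = \sum_{\tau \in G} \prod_{i=1}^{n} x_i^{b_{\tau(i)}} = \sum_{\tau \in G} M^{\sum_i u_i b_{\tau(i)}} \le \sum_{\tau \in G} M^{\sum_i u_i a_i - \delta} = |G| \cdot M^{-\delta} \prod_{i=1}^{n} (M^{u_i})^{a_i} \le |G| \cdot M^{-\delta} \cdot \Big[ \prod_{i=1}^{n} (M^{u_i})^{a_i} + \sum_{\tau \ne \varepsilon} \prod_{i=1}^{n} (M^{u_i})^{a_{\tau(i)}} \Big] = |G| \cdot M^{-\delta} \cdot |G| [a]_G. \]
Taking here $\ln M > \dfrac{\ln |G|}{\delta}$ we get
\[ |G| \cdot M^{-\delta} \cdot |G| [a]_G < |G| [a]_G. \]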
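Hence it follows that $[b]_G < [a]_G$, which contradicts $[a]_G \le [b]_G$.

Suppose now that $a \in K_G(b)$. Then there exist real numbers $t_\pi \ge 0$, $\sum_\pi t_\pi = 1$, such that $a = \sum_\pi t_\pi b^\pi$. This implies that
\[ a_j = \sum_{\pi \in G} t_\pi b_{\pi(j)}. \]
Then
\[ [a]_G = \frac{1}{|G|} \sum_{\sigma \in G} \prod_{i=1}^{n} x_i^{a_{\sigma(i)}} = \frac{1}{|G|} \sum_{\sigma \in G} \prod_{i=1}^{n} x_i^{\sum_\pi t_\pi b_{\pi\sigma(i)}} = \frac{1}{|G|} \sum_{\sigma \in G} \prod_{\pi \in G} \Big( \prod_{i=1}^{n} x_i^{b_{\pi\sigma(i)}} \Big)^{t_\pi} \le \frac{1}{|G|} \sum_{\sigma \in G} \sum_{\pi \in G} t_\pi \prod_{i=1}^{n} x_i^{b_{\pi\sigma(i)}} = \sum_{\pi \in G} t_\pi \cdot [b]_G = [b]_G. \]
The inequality in this calculation follows from the generalized version of the arithmetic-geometric mean inequality [9]: for $\alpha_i \ge 0$, $\sum_i \alpha_i = 1$ and $x_j$ non-negative it holds
\[ x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n} \le \alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_n x_n. \]
As a result we obtain $[a]_G \le [b]_G$, as needed. □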
1.5. Appendix. Research plan of the project “Groups and inequalities with applications to combinatorics and optimization”

Introduction. History, motivation, examples.
A. History: I. Schur (1923); Hardy-Littlewood-Pólya (1929); G. Birkhoff–J. von Neumann (1946); A. Ostrowski and R. Rado (50’s); L. Mirsky (60’s); G. Egorychev et al. (1980); R. Brualdi (1970–1990).
B. Motivation:
  (a) Majorization: Marshall-Olkin book – examples using the relation ≺ in combinatorics; T. Ando’s lectures on majorization – preprint (1990, ‘old’ version) of the lecture notes and the new version (T. Ando, Lin. Alg. Appl. 1994).
  (b) Generalized majorization. Peetre’s preprint (1985).
  (c) Discrete optimization on Sn and its subsets. Marshall-Olkin’s examples of combinatorial and discrete optimization through majorization theory methods. The results of H. Ryser et al. revisited. Minsk seminar (70’s and 80’s). Vershik-Barvinok (1990).
  (d) Polytope algebra. Lattice-theoretic generalizations. Permutohedron and superconductivity. McMullen’s polytope algebra I, II. Valuations – Geissinger, Rota, Lovász, etc. K. Fan and S. Sternberg’s results on Bruhat order and superconductivity.
  (e) H. Ryser’s problem and M. Hall’s problem on (0, 1)-matrices. Infinite extensions. Problems. H. Ryser’s survey; L. Skornyakov on ∞ versions; lattice-theoretic versions; L. Lovász et al.

Part I. Ωn(G): group-theoretic variations on ‘bistochastic’ themes.
A. Multiplicative structure of Ωn(G)
  (a) Elements of finite order in the monoid Ωn(G)(·)
  (b) Subgroups of Ωn(G)(·) via D. Farkas’ paper
  (c) Units in the group rings ZS3, ZD4, . . . via Hughes-Pearson; . . .
  (d) On the algebra structure of the monoid Ωn(G). Eastwood & Munn (B = Sn)
B. Giving Ωn(G) by a few inequalities
  (a) The spectrum of a group G, G ≤ Sn. A theorem on the inequalities for X ∈ Ωn(G). Corollaries: Results of L. Mirsky and A. Cruse. New counterexamples to Mirsky’s conjecture.
  (b) A criterion for the diagonals
C. On a problem on finite simple groups: the (second) main theorem (THEOREM 2) on the classification of groups (using FGC).
D. Infinite extensions of results on the diagonals of bistochastic G-matrices, G ≤ S∞ (substitutions displacing finitely many symbols only).
E. G-majorizations
  (a) G-version of the HLP-theorem. TG-transformations
  (b) ā ≤ b̄ for any G ≤ Sn; the Dirichlet polytope
  (c) . . .
  (d) . . .
Part II. G-permanents.
A. A new solution of the van der Waerden inequality via L. Gårding’s inequality for hyperbolic polynomials. – Peetre’s preprint [22].
B. . . .
  (a) THEOREM 3. G-extension of P. Hall’s marriage theorem.
  (b) THEOREM 4. G-extension of the Frobenius-König theorem.
  (c) Egorychev-like proof of the (extended) G-version of the van der Waerden problem – via Peetre’s preprint
  (d) THEOREM 5. Algebraic structure of the McMullen polytope algebra for Ωn(G). A (new?) Molien series for Ωn(G).

Part III. Applications.
A. Kaplansky-Riordan theory revisited: G-extension via Moebius inversion associated to the restriction matrix.
B. Bruhat order on G ≤ Sn and (possible) permutahedron results for Ωn(G). Applications to superconductivity (K. Fan and S. Sternberg). On a problem of H. Ryser on (0, 1)-matrices.
C. The M. Hall theorem on permanents of (0, 1)-matrices. G-extension.

[1], [4], [3], [6], [11], [15], [16], [17], [20], [23], [25], [29], [30], [31], [32], [18], [34]

References
[1] A. I. Barvinok and A. M. Vershik. Methods of representation theory in combinatorial optimization problems. Izv. Akad. Nauk SSSR, ser. Tekhn. Kibernet. 6, 1988, 64–71. English translation: Soviet J. Comput. Systems Sci. 27 (5), 1989, 1–7.
[2] J. von Below. On a theorem of L. Mirsky on even doubly stochastic matrices. Discrete Math. 55 (3), 1985, 311–312.
[3] N. Biggs. Finite groups of automorphisms. London Mathematical Society Lecture Notes Series, 6. Cambridge Univ. Press, 1971.
[4] T. Bonnesen and W. Fenchel. Theorie der konvexen Körper. In: Erg. Math. u. ihrer Grenzgebiete, 3, No. 1. Springer, Berlin, 1934.
[5] A. Cruse. A note on symmetric doubly stochastic matrices. Discrete Math. 13, 1976, 109–119.
[6] A. Cruse. On removing a vertex from the assignment polytope. Linear Algebra Appl. 26, 1979, 45–57.
[7] R. Brualdi and B. Liu. The polytope of even doubly stochastic matrices. J. Combin. Theory Ser. A 57 (2), 1991, 243–253.
[8] D. Eastwood and W. D. Munn. On semigroups with involution. Bull. Aust. Math. Soc. 48, 1993, 93–100.
[9] G. H. Hardy, J. E. Littlewood, and G. Pólya. Inequalities. Cambridge Univ. Press, Cambridge, 1934.
[10] L. Harper and G.-C. Rota. Matching theory, an introduction. Advances in Probability Theory 1, 1971, 169–215.
[11] A. Horn. Doubly stochastic matrices and the diagonal of a rotation matrix. Amer. J. Math. 76, 1954, 620–630.
[12] M. Katz. On the extreme points of a certain convex polytope. J. Comb. Theory 8, 1970, 417–423.
[13] A. W. J. Kolen and J. K. Lenstra. Combinatorics in operations research. In: Handbook of combinatorics, Chap. 35. Elsevier, Amsterdam, 1995.
[14] J. H. van Lint. The Van der Waerden Conjecture: two proofs in a year. Math. Intell. 4, 1982, 72–77.
[15] M. Marcus and H. Minc. A survey of matrix theory and matrix inequalities. Allyn and Bacon, Inc., Boston, 1964.
[16] A. W. Marshall and I. Olkin. Inequalities, theory of majorization and its applications. Academic Press, New York, 1979.
[17] P. McMullen and G. C. Shephard. Convex polytopes and the upper bound conjecture. London Mathematical Society Lecture Notes Series, 3. Cambridge Univ. Press, 1971.
[18] H. Minc. Non negative matrices. London Mathematical Society Lecture Notes Series, 3. John Wiley, New York, 1988.
[19] L. Mirsky. Even doubly stochastic matrices. Math. Ann. 144, 1961, 418–421.
[20] L. Mirsky. Results and problems in the theory of doubly stochastic matrices. Z. Wahrscheinlichkeitstheorie 1, 1963, 319–334.
[21] A. Ostrowski. Sur quelques applications des fonctions convexes et concaves au sens de I. Schur. J. Math. Pures Appl., IX. Ser. 31, 1952, 253–292.
[22] J. Peetre. Van der Waerden’s conjecture and hyperbolicity. Technical Report LTH 1981:9. Lunds Universitet, Lund, 1981. Reprinted in this Volume.
[23] J. Peetre. On generalized majorization. Technical Report LTH 1985:2. Lunds Universitet, Lund, 1985. Reprinted in this Volume.
[24] L. I. Polotskiĭ, M. V. Saphir, and L. A. Skornyakov. Convex combinations of infinite permutation matrices. Acta Sci. Math. 51, 1987, 185–189.
[25] D. G. Poole. The stochastic group. Amer. Math. Monthly 102, 1995, 798–801.
[26] R. Rado. An inequality. J. London Math. Soc. 27, 1952, 1–6.
[27] A. Rämmer. On minimizing matrices. In: Proc. of the First Est. Conf. on Graphs and Appl. (Tartu–Kääriku). Tartu Univ. Press, Tartu, 1991, 121–134.
[28] A. Rämmer. On even doubly stochastic matrices with minimal even permanent. Acta Comm. Univ. Tartuensis 878, 1990, 103–114.
[29] J. V. Ryff. On the representation of doubly stochastic operators. Pac. J. Math. 13, 1963, 1379–1386.
[30] J. V. Ryff. Orbits of L1-functions under doubly stochastic transformations. Tr. Am. Math. Soc. 117, 1965, 92–100.
[31] J. V. Ryff. Majorized functions and measures. Nederl. Akad. Wetensch. Proc. Ser. A 71 = Indag. Math. 30, 1968, 431–437.
[32] J. V. Ryff. Extreme points of some convex subsets of L1(0, 1). Proc. Am. Math. Soc. 18, 1967, 1026–1034.
[33] I. Schur. Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie. Sitzungsber. Berl. Math. Gesell. 22, 1923, 9–20.
[34] G. Ziegler. Lectures on polytopes. Graduate Texts in Mathematics, 152. Springer-Verlag, New York, 1995.
2. Van der Waerden’s conjecture and hyperbolicity
by J. Peetre⁴
Introduction. Van der Waerden’s conjecture says that, if $A = (a_{ik})$ is an $n \times n$ doubly stochastic matrix, then $\operatorname{per} A \ge \dfrac{n!}{n^n}$, with equality if and only if $A = J := (\frac{1}{n})$ (see definitions infra). It has been settled independently by two Soviet mathematicians, Egorychev [4] and Falikman [5]; the latter, however, proves only the inequality without discussing the case of equality. An analysis of Egorychev’s proof by van Lint [13] has also appeared.⁵ It is interesting that both Egorychev and Falikman at least implicitly invoke hyperbolic quadratic forms (Lorentz forms). In this note we wish to further clarify the role of hyperbolicity in this context, the main point being the simple observation that the permanent as a function of the rows (or columns) of the matrix is the complete polarization of a certain hyperbolic (in the sense of Gårding) polynomial, viz. the polynomial $P(x) = n!\, x_1 \cdots x_n$. Since Falikman’s proof at least has not yet appeared in translation we reproduce its main features below (Section 2.2). We also indicate a few minor simplifications of Egorychev’s proof based on Falikman’s lower bound (Section 2.3). Therefore we have, in fact, here a proof which is “almost self-contained”, that is, modulo the only remaining purely combinatorial element, the celebrated Frobenius-König theorem (see [12, Chapter 3]) and the circumstance that we have not bothered to reproduce some reasoning which we otherwise would have taken over verbatim from [5] or [13].

Notation. If $A = (a_{ik})$ is an $n \times n$ matrix then its permanent is defined as
\[ \operatorname{per} A = \sum_{\sigma} a_{1\sigma(1)} \cdots a_{n\sigma(n)} \]
(summation over all permutations $\sigma$ of $\{1, \dots, n\}$). If we consider it as a function of the rows $x_1, \dots, x_n$ of $A$ we write $\operatorname{per}(x_1, \dots, x_n)$. Notice that
\[ \operatorname{per}(e_{\sigma(1)}, \dots, e_{\sigma(n)}) = \begin{cases} 1 & \text{if } \sigma \text{ is a permutation,} \\ 0 & \text{if not,} \end{cases} \]
which relation essentially characterizes the permanent function.⁶

A matrix $A = (a_{ik})$ is called doubly stochastic if $\sum_k a_{ik} = 1 = \sum_i a_{ik}$, $a_{ik} \ge 0$. The set of all doubly stochastic matrices will be denoted $\Omega$ (it is a convex subset of $\mathbb{R}^{n^2}$, $\dim \Omega = (n-1)^2$). The “interior” of $\Omega$ will be denoted by $\Omega^*$ and its “boundary” by $\partial\Omega$ ($= \Omega \setminus \Omega^*$). Every permutation matrix is in $\partial\Omega$. Also $J = (\frac{1}{n})$ is in $\Omega^*$.

Attention: Sometimes $x_i$ means the i-th component of the vector $x = (x_1, \dots, x_n)$ but sometimes it is the i-th member of the family of vectors $\{x_1, \dots, x_m\}$.
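The definition of the permanent translates directly into code. The following is a minimal sketch (the function name `permanent` and the test matrices are assumptions introduced for illustration); it sums over all $n!$ permutations and is therefore only practical for small $n$.

```python
from itertools import permutations
from math import factorial, prod, isclose

def permanent(A):
    """per A = sum over permutations sigma of a_{1 sigma(1)} ... a_{n sigma(n)}."""
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

n = 4
J = [[1 / n] * n for _ in range(n)]                               # the matrix J = (1/n)
identity = [[1 if i == k else 0 for k in range(n)] for i in range(n)]

per_J = permanent(J)                # should equal n!/n^n
per_identity = permanent(identity)  # a permutation matrix has permanent 1
```

Here `per_J` agrees with $n!/n^n$ (the conjectured minimum over $\Omega$), and the identity, being a permutation matrix, has permanent 1.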
⁴ Report LTH 1981:9, Lund, 1981. Reprint.
⁵ Egorychev’s paper [4] has not been available to us; we know of its contents only through van Lint [13].
⁶ The standard reference for permanent theory is Minc’s book [12].
2.1. Hyperbolicity

Hyperbolic polynomials were introduced by Gårding [7] in 1950 in the context of Cauchy’s problem for linear partial differential equations. Their main characteristics in purely algebraic terms are summarized in his beautiful paper [6] (see also Hörmander’s book [8, Chapter 5] and Beckenbach-Bellman [2, §§ 36–39]).

Let $P(x_1, \dots, x_m)$ be a real symmetric m-linear form in $\mathbb{R}^n$. If all the arguments are equal we write $P(x) = P(x, \dots, x)$. Thus $P$ is a homogeneous polynomial of degree $m$ which uniquely determines $P(x_1, \dots, x_m)$. One says that $P(x_1, \dots, x_m)$ is the complete polarization of $P(x)$. Let $a$ be any non-zero vector in $\mathbb{R}^n$.

DEFINITION 2.1. $P$ is hyperbolic with respect to $a$ (or $a$ is hyperbolic with respect to $P$) if $P(a) > 0$ and if, further, for any $x$ in $\mathbb{R}^n$, the polynomial $P(sa + x)$ in one variable $s$ has $m$ distinct roots. That is, one has the factorization $P(sa + x) = c \prod_j (s + \lambda_j(x, P))$, where $c > 0$ and $\lambda_1(x, P) < \lambda_2(x, P) < \dots < \lambda_m(x, P)$.⁷

If $P$ is hyperbolic with respect to $a$, let us introduce the set
\[ C(a, P) := \{ x \mid \forall j \ \lambda_j(x, P) > 0 \}. \]
The main properties of hyperbolic polynomials can be summarized in the following theorem.

THEOREM 2.2. $C(a, P)$ is an open convex cone in $\mathbb{R}^n$, in fact, as a set equal to the connected component of $\{P \ne 0\}$ that contains the vector $a$. The vector $b$ is hyperbolic with respect to $P$ for any $b \in C(a, P)$; then, in particular, $C(b, P) = C(a, P)$.

For the proof we refer to Gårding’s paper [6]. Here we shall only need the following.

COROLLARY 2.3. If $b_1, \dots, b_k$ are any $k$ vectors in $C(a, P)$ ($0 < k < n$) then the “partial” polarization
\[ Q(x) := P(\underbrace{x, x, \dots, x}_{n-k \text{ times}}, b_1, \dots, b_k) \]
is hyperbolic with respect to any vector in $C(a, P)$.

PROOF. By induction it suffices to consider the case $k = 1$, $b_1 = a$. That is, we shall prove that
\[ Q(x) := P(\underbrace{x, x, \dots, x}_{n-1 \text{ times}}, a) \]
is hyperbolic throughout $C(a, P)$. We have the formula
\[ \frac{d}{ds} P(sa + x) = m P(sa + x, \dots, sa + x, a) = m Q(sa + x). \]
It follows from Rolle’s theorem that for any $x$ all the roots of $Q(sa + x)$ are real and, in fact, separated by the roots of $P(sa + x)$:
\[ \lambda_1(x, P) < \lambda_1(x, Q) < \lambda_2(x, P) < \lambda_2(x, Q) < \dots \tag{65} \]
⁷ Editors’ Note. It is advantageous to interpret the relation t = sa + x geometrically as a straight line in the t, x plane with direction vector a. Then we are dealing with the intersection of this line with the variety {P(t) = 0}. For hyperbolicity of non-homogeneous polynomials, see [7, 8].
Thus $Q$ is hyperbolic with respect to $a$. Also (65) shows that $C(a, Q) = C(a, P)$, so that, moreover, by Theorem 2.2, $Q$ is hyperbolic with respect to any element of $C(a, P)$. □

Let us consider some examples of hyperbolic polynomials.

EXAMPLE 2.1. $m = 2$, $P = x_1^2 - x_2^2 - \dots - x_n^2$ (Lorentz form). This is the canonical example, because every other hyperbolic quadratic form can be written in this way after a linear change of variables. As is well known this example is of fundamental importance for special relativity. Hyperbolic vectors are now called time-like, those on the conical surface $x_1^2 - x_2^2 - \dots - x_n^2 = 0$ light-like, all other vectors ($\ne 0$) being termed space-like. □

EXAMPLE 2.2. $m = n$, $P = n!\, x_1 x_2 \cdots x_n$. Every positive vector is hyperbolic. As already mentioned in the Introduction, the complete polarization is now the permanent function $\operatorname{per} A = \operatorname{per}(x_1, x_2, \dots, x_n) = P(x_1, x_2, \dots, x_n)$ if $A$ is a matrix with rows $x_1, x_2, \dots, x_n$. □

REMARK 2.4. In relation to the permanent used in Example 2.2 let us put forward the following interesting observation. Curiously enough the interpretation of the permanent as a multilinear form does not seem to be explicitly mentioned in [12]. On p. 103 there is reproduced Muir’s formula
\[ \operatorname{per} A \cdot \iota_1 \cdots \iota_n = \prod_{i=1}^{n} \sum_{k=1}^{n} a_{ik} \iota_k, \]
where the $\iota_k$ are generators of a commutative associative algebra such that $\iota_k^2 = 0$. Of course, a similar thing can be done with any polynomial (hyperbolic or not). If $P(x_1, \dots, x_m) = \sum a_{\alpha_1 \dots \alpha_m} x_{1\alpha_1} \cdots x_{m\alpha_m}$ then
\[ P(x_1, \dots, x_m)\, \iota_1 \cdots \iota_m = \prod_{i=1}^{m} \sum_{\alpha=1}^{n} x_{i\alpha} \iota_\alpha \]
with $\iota_{\alpha_1} \cdots \iota_{\alpha_m} = a_{\alpha_1 \dots \alpha_m}$. Cf. Dirac’s introduction of the Dirac matrices etc. □

EXAMPLE 2.3. Take $n = \dfrac{\nu(\nu + 1)}{2}$ and identify $\mathbb{R}^n$ with the set of all symmetric $\nu \times \nu$ matrices. Define $P(x) = \det(x_{ik})$. Then $P$ is hyperbolic with respect to any positive definite matrix. □

EXAMPLE 2.3 BIS. Analogous example with Hermitian matrices. □
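Example 2.2 can be verified on a small matrix. The sketch below is an illustration added here, not taken from the text: it uses the standard polarization identity $P(x_1, \dots, x_n) = \frac{1}{n!} \sum_{S} (-1)^{n - |S|} P\big(\sum_{i \in S} x_i\big)$, summed over non-empty subsets $S$ of the rows, applied to $P(x) = n!\, x_1 \cdots x_n$; for this $P$ the identity is Ryser's formula for the permanent. The function names and the test matrix are assumptions.

```python
from itertools import combinations, permutations
from math import factorial, prod

def permanent(A):
    """Permanent by direct summation over all permutations."""
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

def P(x):
    """The hyperbolic polynomial P(x) = n! * x_1 * ... * x_n of Example 2.2."""
    return factorial(len(x)) * prod(x)

def polarization(rows):
    """Complete polarization of P evaluated at the given n row vectors,
    via inclusion-exclusion over non-empty subsets of the rows."""
    n = len(rows)
    total = 0
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            s = [sum(rows[i][k] for i in S) for k in range(n)]  # sum of chosen rows
            total += (-1) ** (n - size) * P(s)
    return total / factorial(n)

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

For this matrix both `permanent(A)` and `polarization(A)` give 450.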
An elementary fact about hyperbolic quadratic forms is the “reverse Cauchy inequality”:
\[ P(x, y) \ge \sqrt{P(x)} \sqrt{P(y)}, \tag{66} \]
valid for time-like vectors with equality if and only if $x = y$. In [6] Gårding generalized (66) by proving for hyperbolic polynomials $P$ an inequality of the type
\[ P(x_1, x_2, \dots, x_m) \ge (P(x_1))^{\frac{1}{m}} \cdots (P(x_m))^{\frac{1}{m}} \qquad \text{(Gårding’s inequality)} \tag{67} \]
The special case of (67) corresponding to Example 2.3 is due to Aleksandrov [1] (apparently rediscovered by Chern [3]). It is just Aleksandrov’s inequality that presumably is used in Egorychev’s paper [4] and which van Lint [13] manages to replace by the more elementary inequality (66). In the case of the permanent (Example 2.2), however, (67) gives a trivial result, viz. the inequality
\[ \operatorname{per} A \ge n!\, (\Pi(A))^{\frac{1}{n}} \quad \text{with} \quad \Pi(A) = \prod_{i,k} a_{ik}. \]
Now let us record for reference that the corollary to Theorem 2.2 gives the following result.

LEMMA 2.5. If $A$ is a positive $n \times n$ matrix, then fixing any $n - 2$ rows the permanent as a function of the remaining two rows is a hyperbolic quadratic form. In fact, the same conclusion remains in force if we only know that these $n - 2$ rows are positive.

One can also easily give a direct proof, as is done in [5] and [13]. So one can rightly ask if it is really worthwhile to make this detour via Gårding’s rather sophisticated theory. Our point is that we hope that, in putting Van der Waerden’s conjecture into this wider frame, ultimately perhaps something more will be revealed about its true nature (cf. Section 2.4).

Finally, likewise for reference, we state the following simple fact characterizing, in fact, hyperbolic quadratic forms.

LEMMA 2.6. Let $P(x, y)$ be a hyperbolic quadratic form. If $x$ and $y$ are any two vectors such that $P(x) > 0$, $P(x, y) = 0$, then also $P(y) < 0$ unless $y = 0$.

PROOF. This follows most conveniently just upon applying (66). But conversely (66) can be obtained from the lemma. The direct proof goes as follows: Pick a basis such that $x = (1, 0, \dots, 0)$ and $P$ is “in normal form”, $P(x) = x_1^2 - x_2^2 - \dots - x_n^2$. Then $P(x, y) = 0$ gives $y_1 = 0$ so that $P(y) = -y_2^2 - \dots - y_n^2 < 0$ provided $y \ne 0$. □
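Lemma 2.5 together with the reverse Cauchy inequality predicts that, with $n - 2$ positive rows held fixed, $\operatorname{per}(x, y, \dots)^2 \ge \operatorname{per}(x, x, \dots) \cdot \operatorname{per}(y, y, \dots)$ for positive rows $x$ and $y$. The following sketch tests this on random data; the function names, the sampling ranges, and the rounding tolerance are all assumptions made for illustration, and a passing run is evidence rather than proof.

```python
from itertools import permutations
from math import prod
import random

def permanent(rows):
    """Permanent by direct summation over all permutations."""
    n = len(rows)
    return sum(prod(rows[i][s[i]] for i in range(n)) for s in permutations(range(n)))

def reverse_cauchy_holds(n=4, trials=200, seed=1):
    """Check P(x, y)^2 >= P(x) P(y) for P(x, y) = per(x, y, fixed rows)."""
    rng = random.Random(seed)
    for _ in range(trials):
        fixed = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n - 2)]
        x = [rng.uniform(0.1, 1.0) for _ in range(n)]
        y = [rng.uniform(0.1, 1.0) for _ in range(n)]
        pxy = permanent([x, y] + fixed)
        pxx = permanent([x, x] + fixed)
        pyy = permanent([y, y] + fixed)
        # tolerance factor guards against floating-point rounding only
        if pxy ** 2 < pxx * pyy * (1 - 1e-12):
            return False
    return True
```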
2.2. Analysis of Falikman’s proof

The main difficulty in Van der Waerden’s conjecture has, throughout the years, been the treatment of the “boundary points” ($A \in \partial\Omega$). For instance, in the fundamental paper of Marcus and Newman [10] (see Minc [12], notably Chapter 5, Section 1) it is shown that if $A$ is an “interior” minimizing matrix ($A \in \Omega^*$) then by necessity $A = J$. In the same paper it is also shown that if $A$ is any “interior” minimizing matrix, then $\operatorname{per} A_{ik} = \operatorname{per} A$ provided $a_{ik} > 0$, where $A_{ik}$ denotes the $(n-1) \times (n-1)$ matrix obtained by deleting the i-th row and k-th column. The proof is quite simple and is based on an application of Lagrange multipliers (cf. Section 2.3, infra).

Falikman’s proof [5] parallels, at the outset at least, this proof by Marcus and Newman, although the author himself does not refer to it. The basic new idea is the introduction of, as is customary in optimization theory, a penalty function, viz.
\[ f(A) = f_\varepsilon(A) := \operatorname{per} A + \frac{\varepsilon}{\Pi(A)}, \qquad (A \in \Omega^*) \]
where $\varepsilon$ is a parameter ($> 0$), and, as at the end of Section 2.1, $\Pi(A) = \prod_{i,k} a_{ik}$. As $\Pi(A) \to 0$ when $A$ approaches the boundary $\partial\Omega = \Omega \setminus \Omega^*$ it is manifest that $f$ takes on a “minimum” at an interior point. Let thus $A \in \Omega^*$ be a matrix such that the minimum is
assumed. Then using Lagrange multipliers, or by a direct computation, which everybody familiar with the rudiments of calculus can do for himself,⁸ one finds
\[ p_{ik} - \frac{c}{a_{ik}} = \lambda_i + \mu_k, \tag{68} \]
where $\lambda_i$ and $\mu_k$ are the Lagrange multipliers, and where we have put
\[ p_{ik} = \operatorname{per} A_{ik}, \qquad c = \frac{\varepsilon}{\Pi(A)}. \]

CLAIM 2.7. All the $\lambda_i$ and all the $\mu_k$ are equal.

PROOF. If we multiply both members of (68) by $a_{ik}$ and sum over $k$ we get
\[ \lambda_i = b - \sum_{k} a_{ik} \mu_k \tag{69} \]
with $b = p - nc$, $p = \operatorname{per} A = \sum_k a_{ik} p_{ik}$. Similarly, we find
\[ \mu_k = b - \sum_{i} a_{ik} \lambda_i. \tag{70} \]
If we substitute the expression for $\mu_k$ as given by (70) into formula (69) we get a relation of the form $\lambda_i = \sum_j b_{ij} \lambda_j$, where $b_{ij} \ge 0$, $\sum_j b_{ij} = 1 = \sum_i b_{ij}$. It is easy to see that $\lambda_1 = \dots = \lambda_n = \lambda$. Similarly, we find $\mu_1 = \dots = \mu_n = \mu$. □

REMARK 2.8. The argument (omitted!) leading to the above conclusion is but a special case of the Perron(-Frobenius) theorem on positive matrices. What is really going on becomes somewhat more transparent if we use matrix notation. Then (69) and (70) can be written as $\lambda = b\mathbf{1} - A\mu$ and $\mu = b\mathbf{1} - A^*\lambda$ respectively (remember that $A\mathbf{1} = \mathbf{1}$, since $A$ is doubly stochastic), that is, $\lambda = B\lambda$ with $B = AA^*$ positive. This gives again $\lambda = \text{const} \cdot \mathbf{1}$ (Perron’s theorem).

Note that from (69) and (70) now follows $b = \lambda + \mu$. We have thus proved (see (68)) that if $A \in \Omega^*$ is a critical point for $f_\varepsilon$, $\varepsilon$ arbitrary, then
\[ p_{ik} = b + \frac{c}{a_{ik}}. \tag{71} \]
The final step in the proof can now be condensed in the following lemma.

LEMMA 2.9. Assume that $A = (a_{ik}) \in \Omega^*$ with
\[ p_{ik} := \operatorname{per} A_{ik} = \varphi(a_{ik}) \tag{72} \]
where the function $\varphi$ is strictly decreasing or constant. Then by necessity $A = J = (\frac{1}{n})$.

This lemma is thus, in particular, applicable if (68) holds with $c \ge 0$ (corresponding to $\varepsilon \ge 0$).

⁸ The “tangent space” of $\Omega$ (the “infinitesimal doubly stochastic matrices”) is generated by all matrices containing a submatrix of the type $\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$, all other entries being zero. This gives $f_{ik} - f_{i\ell} - f_{jk} + f_{j\ell} = 0$ (with $f_{ik} = \frac{\partial f}{\partial a_{ik}}$), whence readily $f_{ik} = \lambda_i + \mu_k$.
PROOF. It suffices to prove that any two rows, say, $x = x_1$ and $y = x_2$ are equal: $x = y$. We consider the quadratic form $P(x, y) = \operatorname{per}(x, y, x_3, \dots, x_n)$, which we know is hyperbolic (Lemma 2.5). Then by (72)
\[ x^i := P(x, e_i) = \varphi(y_i); \qquad y^i := P(y, e_i) = \varphi(x_i), \]
where $e_1, \dots, e_n$ is the standard basis in $\mathbb{R}^n$. Using a fancy language, the $x^i$ ($y^i$) are the contravariant coordinates of $x$ ($y$) and $P(x, y) = \sum_i x^i y_i = \sum_i x_i y^i$. Set $z := x - y$. Then similarly
\[ z^i := P(z, e_i) = \varphi(y_i) - \varphi(x_i). \]
Assuming first that $\varphi$ is strictly decreasing, we draw from this the important conclusion that $z^i \ge 0 \Longrightarrow z_i \ge 0$. Thus, in particular, $P(z) \ge 0$. Furthermore, since $x$ and $y$ are rows of a doubly stochastic matrix we have $\sum_i x_i = \sum_i y_i = 1$, whence $\sum_i z_i = 0$. It follows that we cannot have $z^i \ge 0$ for all $i$. Therefore we can find a positive vector $c$ such that $\sum_i c_i z^i = 0$ or $P(c, z) = 0$. Also $P(c, c) > 0$. But this plainly contradicts the hyperbolicity (see Lemma 2.6). The case $\varphi$ constant is even simpler. Now $z^i = 0$ for all $i$, which contradicts already the fact that $P$ is a non-degenerate quadratic form. □

REMARK 2.10. To make the above argument work it obviously just suffices to know that the remaining rows (i.e. $x_3, \dots, x_n$) are positive but not by necessity $x_1$ and $x_2$.

So now we know that if $A \in \Omega^*$ is any minimizing element for $f_\varepsilon$ ($\varepsilon \ge 0$) then $A = J$. In particular, thus
\[ \operatorname{per} A + \frac{\varepsilon}{\Pi(A)} \ge \operatorname{per} J + \frac{\varepsilon}{\Pi(J)}. \]
Passing to the limit ($\varepsilon \to 0$) we get $\operatorname{per} A \ge \operatorname{per} J$ for $A \in \Omega^*$ and by continuity also for $A \in \Omega$. We have established

THEOREM 2.11. $\operatorname{per} A \ge \operatorname{per} J$ for any $A \in \Omega$.
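Theorem 2.11 lends itself to a quick numerical sanity check. The sketch below is an illustration under stated assumptions: doubly stochastic matrices are sampled as random convex combinations of permutation matrices (which always lie in $\Omega$), and the smallest observed ratio $\operatorname{per} A / (n!/n^n)$ is reported. The function names, the number of permutation terms, and the sample sizes are hypothetical choices.

```python
from itertools import permutations
from math import factorial, prod
import random

def permanent(A):
    """Permanent by direct summation over all permutations."""
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

def random_doubly_stochastic(n, rng, terms=6):
    """A random convex combination of `terms` permutation matrices."""
    weights = [rng.random() for _ in range(terms)]
    total = sum(weights)
    A = [[0.0] * n for _ in range(n)]
    for w in weights:
        sigma = list(range(n))
        rng.shuffle(sigma)
        for i in range(n):
            A[i][sigma[i]] += w / total
    return A

def min_ratio(n=4, trials=300, seed=2):
    """Smallest observed value of per A divided by per J = n!/n^n."""
    rng = random.Random(seed)
    bound = factorial(n) / n ** n
    return min(permanent(random_doubly_stochastic(n, rng)) / bound
               for _ in range(trials))
```

By the theorem, `min_ratio()` should never fall below 1 (up to rounding), whatever seed is used.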
2.3. Comments on Egorychev’s proof

Using Falikman’s proof and result (Theorem 2.11) we can somewhat simplify Egorychev’s proof to the effect that $A = J$ is the only minimizing element in $\Omega$. In particular, we can eliminate all the partial results on which it depends (London’s theorem [9] etc.; cf. [13]). We are thus out for the proof of

THEOREM 2.12. If $A \in \Omega$ and $\operatorname{per} A = \operatorname{per} J$ then $A = J$.

PROOF. We do this in several steps. The idea is to prove directly that for a minimizing matrix $A \in \Omega$ we must have $p_{ik} = p$ (according to (71), with $\varepsilon = 0$). First we verify that this is indeed sufficient.

Step 1. If we have a minimizing matrix $A \in \Omega$ with $p_{ik} = p$ it is easy to produce another one, $A'$, say, with the same property, having one row, $x = x_1$, say, in common
with $A$ and all other rows positive. (This is achieved by successively forming mean values of rows; for details see [13].) We see that $x = (\frac{1}{n})$. Since this row was an arbitrary row we infer that $A = J = (\frac{1}{n})$.

Step 2. It suffices to prove that $p_{ik} \ge p$. For assume that $A \in \Omega$ is a minimizing matrix with this property. Then if $x$ and $y$ are any two rows of $A$, inequality (66) gives
\[ p^2 = P(x, y)^2 \ge P(x) P(y) = \sum_i x_i x^i \cdot \sum_i y_i y^i \ge \sum_i x_i p \cdot \sum_i y_i p = p^2. \]
(Here we use the notation of Lemma 2.9.) Thus we are in the case of equality for that inequality (but since we do not know yet that $P$ is (strictly) hyperbolic we cannot, at this stage, conclude that $x = y$), and it is plain that indeed $p_{ik} = p$ must hold.

Step 3. Next we conclude that a minimizing matrix, at any rate, must be fully indecomposable (cf. [12]). Indeed, assume that, contrariwise, $A$ is decomposable, which, $A$ being doubly stochastic, just means that the matrix after suitable permutations of the rows and columns can be put in the form $A' \oplus A''$, where $A'$ is an $r \times r$ matrix, and $A''$ an $(n - r) \times (n - r)$ matrix, with $0 < r < n$. Then Theorem 2.11 gives
\[ \frac{n!}{n^n} = \operatorname{per} A = \operatorname{per} A' \cdot \operatorname{per} A'' \ge \frac{r!}{r^r} \cdot \frac{(n - r)!}{(n - r)^{n - r}} \]
or
\[ \binom{n}{r} \Big( \frac{r}{n} \Big)^r \Big( 1 - \frac{r}{n} \Big)^{n - r} \ge 1. \]
But this contradicts the elementary inequality
\[ \binom{n}{r} x^r (1 - x)^{n - r} < 1 \quad \text{if } 0 < r < n, \ 0 < x < 1. \]

Step 4. Now we are (in principle) in a position to carry out the details of the proof of the theorem of Marcus and Newman [10] already referred to at the beginning of Section 2.2: $p_{ik} = p$ if $a_{ik} > 0$. We add additional constraints of the form $a_{ik} = 0$, one for each zero matrix element, and proceed exactly as in Section 2.2. Since we know that $A$ is fully indecomposable the conclusion of Perron’s theorem is still applicable.

Step 5. There remains only one more step: London’s theorem [9] (see [12], Chapter 5, p. 85-86) to the effect that for a minimizing matrix $A \in \Omega$ one has $p_{ik} \ge p$ without the restriction $a_{ik} > 0$. The proof is based on the inequality
\[ \sum_{s=1}^{n} p_{s\sigma(s)} \ge np, \tag{73} \]
valid for any permutation $\sigma$. (Proof by a straightforward variational argument. Consider the “deformation” $(1 - \theta)A + \theta P$, $0 < \theta < 1$, of $A$, where $P$ is the permutation matrix corresponding to $\sigma$. Remember that $\Omega$ is a convex set!) Having (73) at our disposal it suffices to remark that, $A$ being fully indecomposable (see Step 3), for any $i$ and $k$ we can find a permutation $\sigma$ such that $\sigma(i) = k$ and $a_{s\sigma(s)} > 0$ if $s \ne i$; since then $p_{s\sigma(s)} = p$ for $s \ne i$ by Step 4, this proves $p_{ik} \ge p$. This again is essentially the Frobenius-König theorem ([12, Chapter 3, see notably Theorem 2.2, p. 31 and Theorem 35, p. 38]). □
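The elementary inequality invoked in Step 3 simply says that a single term of the binomial expansion of $(x + (1 - x))^n = 1$ is smaller than the whole sum. A minimal sketch of an exact check on a rational grid follows; the helper names and the grid parameters are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_term(n, r, x):
    """C(n, r) * x^r * (1 - x)^(n - r), exact when x is a Fraction."""
    return comb(n, r) * x ** r * (1 - x) ** (n - r)

def largest_term(n_max=8, grid=40):
    """Largest value of the term over 2 <= n <= n_max, 0 < r < n,
    and x = k/grid with 0 < k < grid."""
    best = Fraction(0)
    for n in range(2, n_max + 1):
        for r in range(1, n):
            for k in range(1, grid):
                best = max(best, binomial_term(n, r, Fraction(k, grid)))
    return best

# The case used in Step 3 is x = r/n, e.g. n = 5, r = 2:
step3_value = binomial_term(5, 2, Fraction(2, 5))
```

Every sampled value stays strictly below 1, in line with the contradiction drawn in Step 3.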
2.4. Open questions

4a. Having uncovered the role of hyperbolicity in Van der Waerden’s conjecture there arises the question whether the conjecture perhaps is a special case of something more general. Here is, tentatively, a “non-commutative” version of Van der Waerden’s “conjecture”: to minimize the complete polarization Per A (“hyperpermanent”) of the hyperbolic polynomial $P(x_1, \dots, x_n) = n! \det x_1 \cdots \det x_n$, where each entry $x_i$ is a symmetric, say, $\nu \times \nu$ matrix (see Example 2.3), under the side conditions $\sum_k a_{ik} = 1 = \sum_i a_{ik}$, $a_{ik}$ being the “matrix elements” of $A$, each of them thus in turn a positive definite ($a_{ik} \ge 0$) matrix. Analogous problem with Hermitian matrices (Section 2.1, Example 2.3 BIS).

4b. In view of all the trouble one has had with the (non-existent!) minimizing boundary matrices one is tempted to ask if there is perhaps a more quantitative result than just the mere statement that there are no minimizing points on the boundary. In other words, what can be said about
\[ \inf_{A \in \partial\Omega} \operatorname{per} A \, ? \]
4c. The more general conjecture of Marcus and Minc [11] (cf. [12], p. 91) to the effect that
\[ \operatorname{per} A \ge \operatorname{per} \frac{nJ - A}{n - 1}, \qquad A \in \partial\Omega, \]
is still unsettled [in 1981]. The latter is also meaningful in the context of Subsection 4a, supra (that is, for “hyperpermanents”).

References
[1] A. D. Aleksandrov. Zur Theorie der gemischten Volumina von konvexen Körpern IV: Die gemischten Diskriminanten und die gemischten Volumina. Mat. Sbornik 3 (2), 1938, 227–249. Russian with German summary.
[2] E. F. Beckenbach and R. Bellman. Inequalities. Ergebnisse der Mathematik und ihrer Grenzgebiete, 30. Springer-Verlag, Berlin, Göttingen, Heidelberg, 1961.
[3] S.-S. Chern. Integral formulas for hypersurfaces in Euclidean space and their application to uniqueness theorems. Indiana Univ. Math. J. 8, 1959, 947–966.
[4] G. P. Egorychev. The solution of van der Waerden’s problem for permanents. Advances in Math. 42, 1981, 299–305.
[5] D. I. Falikman. Proof of van der Waerden’s hypothesis on the permanent of doubly stochastic matrices. Mat. Zametki 19, 1981, 931–938, 957.
[6] L. Gårding. An inequality for hyperbolic polynomials. J. Math. Mech. 8 (6), 1959, 957–966.
[7] L. Gårding. Linear hyperbolic partial differential equations with constant coefficients. Acta Math. 85, 1951, 1–62.
[8] L. Hörmander. Linear partial differential operators. (Grundlehren 116.) Springer-Verlag, Berlin, Göttingen, Heidelberg, 1963.
[9] D. London. Some notes on the van der Waerden conjecture. Linear Algebra and Appl. 4, 1971, 155–160.
[10] M. Marcus and M. Newman. On the minimum of the permanent of a doubly stochastic matrix. Duke Math. J. 26, 1959, 61–72.
[11] M. Marcus and H. Minc. On a conjecture of B. L. van der Waerden. Proc. Cambridge Philos. Soc. 63, 1967, 305–309.
[12] H. Minc. Permanents. In: Encyclopedia of Mathematics and its applications, 6. Addison-Wesley, London etc., 1978.
[13] J. H. van Lint. Notes on Egorychev’s proof of the van der Waerden’s conjecture. Linear Algebra and Appl. 39, 1981, 1–8.
3. On generalized majorization
by J. Peetre⁹

For my friends
While visiting Haifa recently (Jan. 85) I discussed with Michael Cwikel the question of extending the theory of majorization, which is connected with the special pair $(L^1, L^\infty)$, to the case of other pairs. The problem is mentioned already in our joint paper [5] and even earlier in [12].

Schur, Ostrowsky . . .

Consider first the finite dimensional case, that is, the pair $(\ell^1_n, \ell^\infty_n)$ (that is, $L^p$ spaces based on a finite segment $(1, n)$). If $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ are positive vectors, which we for simplicity take to be decreasing too, we write $x \prec y$ if
\[ \begin{aligned} x_1 &\le y_1; \\ x_1 + x_2 &\le y_1 + y_2; \\ &\ \ \vdots \\ x_1 + x_2 + \dots + x_n &\le y_1 + y_2 + \dots + y_n. \end{aligned} \]
This is majorization, a term, in this context, apparently first used by Hardy and Littlewood. Given a function $f(x) = f(x_1, \dots, x_n)$, which is always assumed to be symmetric in its arguments, the problem is to decide when $x \prec y$ implies $f(x) \le f(y)$ (Schur-convexity).

THEOREM 3.1 (Schur). Assuming that $f$ is smooth, a necessary and sufficient condition for $f$ to be Schur convex is that
\[ (x_i - x_j) \cdot \Big( \frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_j} \Big) \ge 0. \]

Schur was interested in this because of applications to Hermitian matrices of the type of Hadamard’s inequality. For more applications and a comprehensive treatment see especially [11]. See further [1–3] (I owe these references to Jonathan Arazy). A brief synopsis of the theory can likewise be found in [4, pp. 30-33]. Some interesting material is also contained in [7, 8] (especially Chapter 14).

SHORT K-FUNCTIONAL PROOF OF SCHUR’S THEOREM. It is clear that $f(x)$ depends only on the K-functional of the vector $x = (x_1, \dots, x_n)$. In this case the K-functional is piecewise linear and the values at the $n$ “knots” are precisely $K_1 = x_1$, $K_2 = x_1 + x_2$, . . . , $K_n = x_1 + x_2 + \dots + x_n$. So we may write $f = F(K)$ with $K = (K_1, \dots, K_n)$. Differentiating we get
\[ \frac{\partial f}{\partial x_i} = \sum_{j=1}^{n} \frac{\partial F}{\partial K_j} \cdot \frac{\partial K_j}{\partial x_i}. \]

⁹ Report LTH 1985:2, Lund, 1985. Reprint.
But
∂Kj = ∂xi
1 if i ≥ j; 0 otherwise.
Therefore we find
∂f ∂F ∂f − = . ∂xi ∂xi+1 ∂Kj This clearly is the embryo of Schur’s condition.
( '
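The definition of $x \prec y$ and Schur's condition can be tried out numerically. The following is only a minimal sketch: the helper names `weakly_majorized` and `schur_condition`, the test function (the sum of squares) and the two sample vectors are illustrative assumptions, not taken from the text.

```python
from itertools import accumulate

def weakly_majorized(x, y, tol=1e-12):
    """True if x ≺ y in the sense used above: with both vectors sorted in
    decreasing order, each partial sum of x is at most that of y."""
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    return all(kx <= ky + tol for kx, ky in zip(accumulate(xs), accumulate(ys)))

def schur_condition(grad, x, tol=1e-12):
    """Check (x_i - x_j) * (df/dx_i - df/dx_j) >= 0 for all i, j at the point x."""
    g = grad(x)
    n = len(x)
    return all((x[i] - x[j]) * (g[i] - g[j]) >= -tol
               for i in range(n) for j in range(n))

# Illustrative symmetric test function (an assumption): f(x) = sum of squares.
f = lambda x: sum(t * t for t in x)
grad_f = lambda x: [2 * t for t in x]

x = [3.0, 2.0, 1.0]   # partial sums 3, 5, 6
y = [4.0, 2.0, 1.0]   # partial sums 4, 6, 7

print(weakly_majorized(x, y), schur_condition(grad_f, x), f(x) <= f(y))
```

The partial sums computed by `accumulate` are exactly the knot values $K_1, \dots, K_n$ of the K-functional used in the proof.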
We won’t elaborate more on the details. Instead we shall look at some more general cases, the point being that the argument just produced is quite general (Schur’s theorem is not too deep!). Every time we have sufficiently exact information about the K-functional the same proof can be carried over.

The case $(L^2, L^2(\lambda))$. In this case it is convenient to use $K_2$ in place of $K$. (In view of [10] this causes no essential change.)
\[
K_2(t, a) = \left( \int_0^\infty \frac{|a(\lambda)|^2}{1 + \frac{1}{(\lambda t)^2}} \, d\lambda \right)^{1/2}. \tag{74}
\]
Let $f = F(K_2^2)$. Then formally (variational or Volterra derivatives)
\[
\frac{\delta f}{\delta a(\lambda)} = \int_0^\infty \frac{\delta F}{\delta K_2^2(t, a)} \cdot \frac{\delta K_2^2(t, a)}{\delta a(\lambda)} \, dt.
\]
But by (74) [for $a$ positive]
\[
\frac{\delta K_2^2(t, a)}{\delta a(\lambda)} = \frac{2 a(\lambda)}{1 + \frac{1}{(\lambda t)^2}}.
\]
Thus substituting
\[
\frac{\delta f}{\delta a(\lambda)} = \int_0^\infty \frac{\delta F}{\delta K_2^2(t, a)} \cdot \frac{2 a(\lambda)}{1 + \frac{1}{(\lambda t)^2}} \, dt.
\]
Now if $F$ is monotone in $K_2$ then $\frac{\delta F}{\delta K_2^2} \ge 0$. Thus the integral is of the form
\[
\int \frac{d\mu(t)}{1 + \frac{1}{(\lambda t)^2}}
\]
with a positive measure $\mu$, thus represents a Loewner (or Pick) function.

THEOREM 3.2. $f$ is K-monotone if and only if
\[
\frac{1}{a(\lambda)} \cdot \frac{\delta f}{\delta a(\lambda)}
\]
is a Loewner function in $\lambda$.

EXAMPLE 3.1. Let $f$ be quadratic, $f = \int (a(\lambda))^2 (w(\lambda))^2 \, d\lambda$, $w$ a (positive) weight. Our condition for K-monotonicity then becomes the classical one of $(w(\lambda))^2$ being a Loewner function.
REMARK 3.3. The above also suggests that there might be a sort of “generalized Loewner theory”. As is well known (for Loewner theory, see e.g. [6], cf. [13]), Loewner was concerned with the issue of “monotone operator functions”: for which (scalar) functions $\varphi$ is it true that $A \ge B$ in the operator sense ($A$ and $B$ being s.a. operators in a Hilbert space $H$) implies $\varphi(A) \ge \varphi(B)$? Given a function $\Phi = \Phi(x, A)$ of two variables ($x \in H$, $A$ a s.a. operator in $H$) we may instead consider the more general inequality $\Phi(x, A) \ge \Phi(x, B)$. Thus $\Phi(x, A) = (\varphi(A)x, x)$ will correspond to the classical case.

The case $(L^p, L^p(\lambda))$, $1 \le p \le \infty$. Nothing essential happens if we pass to the case of general $p$. The condition formally becomes that
\[
\frac{1}{(a(\lambda))^{p-1}} \cdot \frac{\delta f}{\delta a(\lambda)}
\]
should admit an analogous integral representation with the (convolution) kernel
\[
\frac{1}{1 + t^2} \quad \text{replaced by} \quad \frac{1}{(1 + t^{-q})^{1/q}}, \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1.
\]
(Compare again [13].)

The limiting case $p = 1$ is noteworthy. Then we have the kernel $\min(1, t)$ and by Sparr’s lemma [14] (see once more [13]) this is the same as to say that $\frac{\delta f}{\delta a(\lambda)}$ has to be a concave function of $\lambda$ (observation by M. Cwikel).
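That the kernel $(1 + t^{-q})^{-1/q}$ approaches $\min(1, t)$ as $q \to \infty$ (that is, as $p \to 1$) can be checked numerically. This is a small sketch only; the function name `kernel` and the sample points are illustrative assumptions, and the kernel is rewritten in an algebraically equivalent form to avoid overflow for large $q$.

```python
def kernel(t, q):
    """The kernel (1 + t**(-q))**(-1/q) for t > 0, rewritten as
    min(1, t) * (1 + r**q)**(-1/q) with r = min(t, 1/t) <= 1."""
    m = min(1.0, t)
    r = min(t, 1.0 / t)
    return m * (1.0 + r ** q) ** (-1.0 / q)

for t in (0.25, 0.5, 1.0, 2.0, 4.0):
    # Compare the kernel for a moderate and a large q with the limiting kernel.
    print(t, kernel(t, 2), kernel(t, 200), min(1.0, t))
```

The rewritten form also shows the size of the error: the kernel differs from $\min(1, t)$ by a factor between $2^{-1/q}$ and $1$.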
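The case $(L^p, L^q)$, $p \ne q$. The case of different exponents is slightly more rewarding.¹⁰ We further find it convenient to use the L-functional now. Recall that
\[
L(t, a; L^p, L^q) = \int_0^\infty L(t, a(\lambda); \mathbb{R}, \mathbb{R}) \, d\lambda.
\]
If $f = F(L)$ we have
\[
\frac{\delta f}{\delta a(\lambda)} = \int_0^\infty \frac{\delta F}{\delta L(t, a)} \cdot \frac{\delta L(t, a)}{\delta a(\lambda)} \, dt.
\]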
Therefore we will end up with a condition involving the kernel
\[
\frac{\delta L(t, a(\lambda); \mathbb{R}, \mathbb{R})}{\delta a(\lambda)}.
\]

REMARK 3.4. The “scalar” L-functional $L(t, a) = L(t, a(\lambda); \mathbb{R}, \mathbb{R})$ has been investigated in [9] but not much seems to be known about the derivative $\frac{\delta L(t, a)}{\delta a}$.

Conclusion. The drawback of all this is that we have no applications at all, whereas in the primitive Schur case there are concrete applications (see especially [11]). Challenge to the Reader: find some!

¹⁰ Because of the Stein-Weiss trick [15] the weight can always be removed.

References
[1] P. Alberti and A. Uhlmann. Dissipative motion in state spaces. Teubner-Texte zur Mathematik, 33. Teubner, Leipzig, 1981.
[2] P. Alberti and A. Uhlmann. Stochasticity and partial order: doubly stochastic maps and unitary mixing. Mathematical monographs, 18. Deutscher Verlag der Wissenschaften, Berlin, 1981.
[3] T. Ando. Majorization, doubly stochastic matrices and comparison of eigenvalues. Hokkaido University, Sapporo, 1982.
[4] E. F. Beckenbach and R. Bellman. Inequalities. Ergebnisse der Mathematik, 30. Springer Verlag, Berlin, Göttingen, Heidelberg, 1961.
[5] M. Cwikel and J. Peetre. Abstract K and J spaces. J. Math. Pures Appl. 60, 1981, 1–50.
[6] W. Donoghue. Monotone matrix functions and analytic continuation. Die Grundlehren der mathematischen Wissenschaften, 207. Springer Verlag, New York, Heidelberg, 1974.
[7] R. Farrell. Multivariate calculation. Springer Series in Statistics. Springer Verlag, New York, Heidelberg, Tokyo, 1985.
[8] I. Gohberg and M. Krein. Introduction to the theory of linear non-selfadjoint operators. Nauka, Moscow, 1965. English translation: Am. Math. Soc., Providence, 1988.
[9] M. Gustavsson and J. Peetre. Properties of the L function. Studia Math. 74, 1982, 106–121.
[10] T. Holmstedt and J. Peetre. On certain functionals arising in the theory of interpolation. J. Funct. Anal. 4, 1969, 88–94.
[11] A. Marshall and I. Olkin. Inequalities: Theory of Majorization and Its Applications. Academic Press, New York, 1979.
[12] J. Peetre. On the connection between the theory of interpolation spaces and approximation theory. In: Proc. Conf. on Constructive Theory of Functions. Akadémiai Kiadó, Budapest, 1969, 351–363.
[13] J. Peetre. On Asplund’s averaging method – the interpolation (function) way. In: Proc. Int. Conf. on Constructive Theory of Functions. Bulgarian Acad. Sci., Sofia, 1984, 664–671.
[14] G. Sparr. Interpolation of weighted Lp spaces. Studia Math. 62, 1978, 229–271.
[15] E. Stein and G. Weiss. Interpolation of operators with change of measure. Trans. Am. Math. Soc. 87, 1958, 159–172.
CHAPTER IV Combinatorics
1. [K88a] On Stirling and Lah numbers
Given a finite set $S$, $|S| = n$, let us consider the set of all possible functions $f : S \to X$, $|X| = x$. Each such function $f$ gives a certain equivalence relation $\operatorname{Ker} f$, the kernel of $f$. Conversely, each equivalence relation $\pi$ serves as the kernel of a function $f : S \to X$, and the number of functions with a given kernel $\pi$ equals the falling factorial $(x)_{n(\pi)}$, where $n(\pi)$ is the number of blocks of the partition of $S$ corresponding to the equivalence $\pi$.¹ Let $\Pi(n)$ be the lattice of all equivalences on $S$. We have the equation
\[
\sum_{\pi \in \Pi(n)} (x)_{n(\pi)} = x^n. \tag{75}
\]
As (75) is true for infinitely many natural numbers $x$, it is an equality of two polynomials in $\mathbb{Q}[x]$. Based on (75) and using methods of linear algebra, G.-C. Rota derived ([7], in 1964) a series of properties of the numbers $B_n \overset{\mathrm{def}}{=} |\Pi(n)|$, which, in particular, showed that $B_n$ is the $n$-th Bell number [2]; for details about this see [4]. In the Proceedings of the All Union Seminar on Combinatorial Analysis (Moscow University, Jan. 1980), the author suggested a similar approach also to Stirling and Lah numbers; cf. further [1]. In part, this was realized (for the derivation of the basic properties of the Stirling numbers of the second kind) in [2], and, more completely, in [4], where one considers from this point of view (but, this time, involving a suitable order relation on the blocks of a partition of $S$) also Stirling numbers of the first kind. So far the author knows of the papers [6] and [3], which show that the line of thought indicated deserves much attention. Here we give a new combinatorial foundation for some identities for Stirling and Lah numbers,² illustrating the synthesis of the ideas of Pólya and Rota just mentioned.

¹ Translator’s Note. Quite generally, $(x)_n = x(x - 1) \dots (x - (n - 1))$ for any integer $n$.
² Translator’s Note. These numbers were, apparently, introduced by Lah in [5], noted in [2].

The polynomials $p_u(x) \overset{\mathrm{def}}{=} (x)_u$, $u = 0, 1, 2, \dots$ form a basis of the vector space of polynomials $\mathbb{Q}[x]$, and so the formula $L_k(p_u(x)) = \delta_{u,k}$, $k = 0, 1, 2, \dots$ defines uniquely a sequence of linear functionals $L_k : \mathbb{Q}[x] \to \mathbb{Q}$, $k = 0, 1, 2, \dots$. Next, we obtain from (75) for the numbers
\[
S(n, k) \overset{\mathrm{def}}{=} |\{\pi \in \Pi(n) \mid n(\pi) = k\}| \tag{76}
\]
the “strange” definition
\[
S(n, k) = \sum_{\{\pi \in \Pi(n) \,\mid\, n(\pi) = k\}} 1 = L_k(x^n). \tag{77}
\]
On the basis of (77) all fundamental relations for the Stirling numbers of the second kind $S(n, k)$ were derived in [4]. There it was also shown that the same approach works for a combinatorial foundation of some more complicated identities for the numbers $S(n, k)$, as, for instance:
\[
\binom{i+j}{i} S(n, i + j) = \sum_{k \ge 0} \binom{n}{k} S(k, i) S(n - k, j).
\]
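The “strange” definition (77) can be made concrete. The sketch below assumes one standard way of evaluating the functional $L_k$: by Newton's forward-difference formula, the coefficient of $(x)_k$ in a polynomial $p$ is $(\Delta^k p)(0)/k!$. The function names are illustrative and not taken from the text.

```python
from fractions import Fraction
from math import comb, factorial

def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    result = 1
    for i in range(k):
        result *= x - i
    return result

def L(k, p):
    """Coefficient of (x)_k when the polynomial function p is expanded in the
    basis (x)_0, (x)_1, ...; computed as the k-th forward difference of p at 0
    divided by k! (Newton's forward-difference formula)."""
    delta = sum((-1) ** (k - j) * comb(k, j) * p(j) for j in range(k + 1))
    return Fraction(delta, factorial(k))

def S(n, k):
    """Stirling number of the second kind via the definition S(n,k) = L_k(x^n)."""
    return L(k, lambda x: x ** n)

# Row n = 4 of the Stirling numbers, and the first Bell numbers as row sums.
print([int(S(4, k)) for k in range(5)])
print([int(sum(S(n, k) for k in range(n + 1))) for n in range(6)])
```

With these helpers, relation (75) and the displayed convolution identity can be checked for small arguments by direct evaluation.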
Let $\Pi_n$ be the set of all partitions of a set $\mathbf{n}$, on the blocks of which it is assumed that there is given a cyclic structure. On the one hand, we may look at a function $f : \mathbf{n} \to X$, $|X| = x$, as a distribution of $|\mathbf{n}| = n$ ordered objects into $x$ distinct and unordered baskets; the number of those distributions equals $x^{(n)} \overset{\mathrm{def}}{=} x(x + 1) \dots (x + n - 1)$. On the other hand, we may also look at the function $f : \mathbf{n} \to X$ as a composition
\[
\mathbf{n} \xrightarrow{\ f'\ } \mathbf{n} \xrightarrow{\ f''\ } X,
\]
where $f'$ is bijective and $f''$ is an arbitrary function from $\mathbf{n}$ to $X$. Let us consider $\pi = \operatorname{Ker} f''$ together with the structure that arises from the cyclic structure of the bijection $f' : \mathbf{n} \to \mathbf{n}$. Then we arrive at the relation
\[
x^{(n)} = \sum_{\pi \in \Pi_n} x^{n(\pi)}. \tag{78}
\]
Introducing the numbers $c(n, k) \overset{\mathrm{def}}{=} |\{\pi \in \Pi_n \mid n(\pi) = k\}|$ allows us to write (78) in the form $x^{(n)} = \sum_k c(n, k) x^k$. The previous is a polynomial relation, and so remains in force if we make the change $x \to -x$, which gives
\[
(x)_n = \sum_k s(n, k) x^k
\]
with $s(n, k) \overset{\mathrm{def}}{=} (-1)^{n+k} c(n, k)$. Applying to this the functionals $L_k : \mathbb{Q}[x] \to \mathbb{Q}$,
\[
L_k(x^u) = \delta_{k,u}, \quad u = 0, 1, 2, \dots
\]
gives the relations $L_k((x)_n) = s(n, k)$. This “strange” definition of the numbers $s(n, k)$ can serve as a foundation for the derivation of the properties of the numbers $s(n, k)$, in particular, of the recurrence relation $s(n + 1, k) = s(n, k - 1) - n s(n, k)$. Together with $s(n, 0) = 0$, $s(1, 1) = 1$, this shows that we are here dealing with Stirling numbers of the first kind. We remark that in the same way the definition $c(n, k) = L_k(x^{(n)})$ can serve as the basis of a derivation of the properties of the numbers $c(n, k)$; cf. also [4]. Using the linear functionals $L_k$ one can give the recurrence relation indicated for the numbers $s(n, k)$ the form
\[
L_k((x - n) \cdot (x)_n) = L_{k-1}((x)_n) - n L_k((x)_n).
\]
We see that an analogous relation holds for an arbitrary polynomial $p(x) \in \mathbb{Q}[x]$:
\[
L_k((x - n) \cdot p(x)) = L_{k-1}(p(x)) - n L_k(p(x)).
\]
It suffices to check the last statement on the basis sequence $\{x^u \mid u = 0, 1, 2, \dots\}$ of the space $\mathbb{Q}[x]$, which is immediate to do and yields a positive outcome.

Let $\Pi_n$ be the set of all partitions of $\mathbf{n}$, on the blocks of which it is assumed that there is given a structure of a chain. Each function $f : \mathbf{n} \to X$, $|X| = x$, we may look at as a map for whose preimages $f^{-1}(y)$, $y \in X$, there is given a structure of a chain. The number of such functions equals $x^{(n)}$. On the other hand, a function $f : \mathbf{n} \to X$
may be viewed as a pair $(\pi, \bar f)$ consisting of an element $\pi \in \Pi_n$ and an injection $\bar f : \mathbf{n}/\pi \to X$; the number of such pairs is $\sum_{\pi \in \Pi_n} (x)_{n(\pi)}$. We obtain the relation
\[
x^{(n)} = \sum_{\pi \in \Pi_n} (x)_{n(\pi)}. \tag{79}
\]
Applying to this relation the linear functionals $L_k : \mathbb{Q}[x] \to \mathbb{Q}$, where $L_k((x)_u) = \delta_{u,k}$ and $u = 0, 1, 2, \dots$, gives
\[
L_k(x^{(n)}) = \sum_{\{\pi \in \Pi_n \,\mid\, n(\pi) = k\}} 1 \overset{\mathrm{def}}{=} L(n, k). \tag{80}
\]
Usually, the numbers $L(n, k)$ arise as the coefficients of the expansion of the eigenpolynomials
\[
\ell_n(x) \overset{\mathrm{def}}{=} x e^x \frac{d^n}{dx^n} \left( e^{-x} x^{n-1} \right) = \sum_k L(n, k) (-x)^k
\]
of the Laguerre operator
\[
L : p(x) \mapsto - \int_0^\infty e^{-t} \frac{d}{dx} p(x + t) \, dt;
\]
cf. [2, p. 111]. The “strange” definition (80) of these numbers allows us to derive all their properties, in particular, the recursive relation
\[
L(n + 1, k) = L(n, k - 1) + (n + k) L(n, k), \tag{81}
\]
which together with $L(0, 0) = 1$ and $L(n, 0) = 0$ for $n > 0$ shows that the $L(n, k)$ are the Lah numbers, for which
\[
L(n, k) = \frac{n!}{k!} \binom{n - 1}{k - 1},
\]
cf. [2]. For example, let us indicate the deduction of (81). With the aid of (80) we may write (81) in the form
\[
L_k((x + n) x^{(n)}) = L_{k-1}(x^{(n)}) + (n + k) L_k(x^{(n)}). \tag{82}
\]
It turns out that (82) remains valid with $x^{(n)}$ replaced by any $p(x) \in \mathbb{Q}[x]$:
\[
L_k((x + n) p(x)) = L_{k-1}(p(x)) + (n + k) L_k(p(x)). \tag{83}
\]
It is sufficient to show (83) for the basis sequence $\{(x)_u, u = 0, 1, 2, \dots\}$ in the space $\mathbb{Q}[x]$:
\[
L_k((x + n)(x)_u) = L_{k-1}((x)_u) + (n + k) L_k((x)_u),
\]
which, with the aid of the representation $x + n = (x - u) + (n + u)$, leads to the (not immediate) verification (for $u = k$; $u = k - 1$; or $u \ne k, k - 1$) of the relation
\[
\delta_{k,u+1} + (n + u) \delta_{u,k} = \delta_{k-1,u} + (n + k) \delta_{k,u}. \tag{84}
\]
It turns out that (84) is true, which shows that, likewise, (82) is true, and along with it (81).
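The numbers $c(n, k)$, $s(n, k)$ and $L(n, k)$ can be computed directly from their definitions as expansion coefficients, which gives a quick check of the recurrences for small arguments. This is a sketch under stated assumptions: the function names are illustrative, and the coefficient of $(x)_k$ is evaluated by the forward-difference formula $(\Delta^k p)(0)/k!$, which is a standard fact rather than something spelled out in the text.

```python
from fractions import Fraction
from math import comb, factorial

def rising(x, n):
    """Rising factorial x^(n) = x (x+1) ... (x+n-1)."""
    result = 1
    for m in range(n):
        result *= x + m
    return result

def rising_coeffs(n):
    """Coefficients [c(n,0), ..., c(n,n)] of x^(n) in powers of x."""
    coeffs = [1]
    for m in range(n):
        shifted = [0] + coeffs                  # x * p(x)
        scaled = [m * a for a in coeffs] + [0]  # m * p(x)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

def c(n, k):
    return rising_coeffs(n)[k] if 0 <= k <= n else 0

def s(n, k):
    """Stirling number of the first kind, s(n,k) = (-1)^(n+k) c(n,k)."""
    return (-1) ** (n + k) * c(n, k) if 0 <= k <= n else 0

def lah(n, k):
    """L(n,k) = L_k(x^(n)): the coefficient of (x)_k in x^(n), computed as the
    k-th forward difference of x^(n) at 0 divided by k!."""
    if k < 0:
        return 0
    delta = sum((-1) ** (k - j) * comb(k, j) * rising(j, n) for j in range(k + 1))
    return Fraction(delta, factorial(k))

print([s(4, k) for k in range(5)])
print([int(lah(4, k)) for k in range(5)])
```

The recurrence for $s(n, k)$, the recurrence (81) and the closed formula for the Lah numbers can then be compared against these values.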
Perhaps it might be of some interest to carry over this approach to the case when one considers partitions of $\mathbf{n}$ on whose blocks a completely arbitrary structure is assumed to be given.

References
[1] U. Kaljulaid. A remark on Stirling numbers. Sb. “Komb. Analiz” 6, 1983, 98. (see [K83b]).
[2] M. Aigner. Combinatorial theory. Grundlehren der mathematischen Wissenschaften, 234. Springer Verlag, Berlin, Heidelberg, New York, 1979.
[3] S.-N. A. Joni, G.-C. Rota, and B. Sagan. From sets to functions: three elementary examples. Discrete Math. 37, 1981, 193–202.
[4] U. Kaljulaid. Elements of discrete mathematics. Tartu University Press, Tartu, 1983. (see [K83c]).
[5] I. Lah. Eine neue Art von Zahlen, ihre Eigenschaften und Anwendungen in der mathematischen Statistik. Mitteilungsblatt Math. Stat. 7, 1955, 203–212.
[6] G. Pólya. Partitions of a finite set into structured subsets. Math. Proc. Camb. Phil. Soc. 77, 1975, 453–458.
[7] G.-C. Rota. The number of partitions of a set. Am. Math. Monthly 71, 1964, 498–504.

Remark. The references [5, 7] were added by the translator.
2. Letter (or draft of letter) c. 1991 from Uno Kaljulaid to Torbjörn Tambour
Preamble (Note by Uno Kaljulaid to J. Peetre). This material, together with a letter like the following, was sent to Professor Tambour in order to initiate anew our cooperation, which was interrupted in 1991 for reasons known to you (at the beginning of his trip he returned to Sweden).³

Dear Professor Tambour,

You asked me for some details. Though chaotic, here they are! I would like to add to the remarks on p. 299 that, of course, when finding $\Omega(P, F_m)$ it seems to be important also [to invoke] the width $w(P)$ of $P$ and the fact that order preserving maps $P \to F_m$ map chains “convexly” into chains of $F_m$. So there do not seem to remain so many possibilities when also taking into account a Dilworth partition of $P$ (into chains with a minimal number of such blocks).

Sincerely,
Uno Kaljulaid

³ Note by J. Peetre. Gert Almkvist and Torbjörn Tambour were supposed to visit Tartu in the summer of 1991. However, in Moscow Tambour was attacked by a robber, so he decided to cancel his trip, and returned home.
3. On Fibonacci numbers of graphs
Unpublished manuscript c. 1991, edited by J. Peetre
My curiosity about this was aroused several years ago while reading Prodinger and Tichy [5]; at first it seemed to me to be a recreational hobby. Let me describe the set-up now. Given a (simple) graph $G = G(V; E)$ with $V$, the set of vertices, and $E$, the set of edges, we define the Fibonacci number $f(G)$ of the graph as the number of subsets $S \subseteq V$ such that $(a, b) \notin E$ for all pairs $\{a, b\} \subseteq S$; let us call these subsets $S$ acceptable. E.g., an easy induction shows that the (usual) Fibonacci number $F_{n+2}$ is the Fibonacci number of the $n$-chain $R_n$ (see Figure 1) and that the Lucas number $L_n$ is the Fibonacci number of the elementary $n$-cycle $C_n$ (see Figure 2).

[Fig. 1: The n-chain $R_n$, with vertices 1, 2, 3, . . . , n]

[Fig. 2: The n-cycle $C_n$, with vertices 1, 2, 3, . . . , n−1, n]

Furthermore, Prodinger-Tichy [5] prove some elementary lemmas and an (easy) theorem for an $n$-tree $T_n$: $F_{n+1} \le f(T_n) \le 2^{n-1} + 1$, and they pose some questions (not difficult to solve): e.g.,
(1) the Fibonacci number for the graph in Figure 3 is $3^n$.
(2) the Fibonacci number for the graph $R_n$ in Figure 4 is $f(R_n) = \frac{3 + 2\sqrt{3}}{6} (1 + \sqrt{3})^n + \frac{3 - 2\sqrt{3}}{6} (1 - \sqrt{3})^n$.
(3) the Fibonacci number for the graph $Q_n$ in Figure 5 is $f(Q_n) = \frac{1}{2} \left[ (1 + \sqrt{2})^{n+1} + (1 - \sqrt{2})^{n+1} \right]$.
(4) the Fibonacci number for a $2n$-cycle with opposite vertices joined, as depicted in Figure 6, is $f(Z_n) = (-1)^{n+1} + (1 + \sqrt{2})^n + (1 - \sqrt{2})^n$.
[Fig. 3: The forest of “dipoles”, on the vertices 1, . . . , n (bottom row) and n+1, . . . , 2n (top row)]

[Fig. 4: The graph $R_n$, on the vertices 1, . . . , n (bottom row) and n+1, . . . , 2n (top row)]

[Fig. 5: The graph $Q_n$, on the vertices 1, . . . , n (bottom row) and n+1, . . . , 2n (top row)]

[Fig. 6: The graph $Z_n$, a cycle on the vertices 0, 1, 2, . . . , 2n−1 with opposite vertices joined]
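For small $n$ the Fibonacci numbers of such graphs can be found by brute force, which gives a check on closed formulas of the kind listed in (1)–(4). The sketch below rests on assumptions: since the figures are not reproduced here, the graph of Figure 4 is taken to be the “comb” (an $n$-chain with one pendant vertex attached to each vertex) and the graph of Figure 5 to be the “ladder” (two $n$-chains joined by $n$ rungs); all function names are illustrative.

```python
from math import sqrt

def fibonacci_number(n_vertices, edges):
    """Count the acceptable subsets (no two members joined by an edge, the empty
    set included) of the vertex set {0, ..., n_vertices - 1} by brute force."""
    edge_masks = [(1 << a) | (1 << b) for a, b in edges]
    return sum(
        all(subset & e != e for e in edge_masks)
        for subset in range(1 << n_vertices)
    )

def path(n):
    return [(i, i + 1) for i in range(n - 1)]

def cycle(n):
    return path(n) + [(n - 1, 0)]

def dipoles(n):                 # n disjoint edges i -- n+i
    return [(i, n + i) for i in range(n)]

def comb_graph(n):              # assumed shape of the graph in Fig. 4
    return path(n) + dipoles(n)

def ladder(n):                  # assumed shape of the graph in Fig. 5
    return comb_graph(n) + [(n + i, n + i + 1) for i in range(n - 1)]

def moebius(n):                 # 2n-cycle with opposite vertices joined
    return cycle(2 * n) + [(i, i + n) for i in range(n)]

def fib(k):                     # F_1 = F_2 = 1
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def lucas(k):                   # L_1 = 1, L_2 = 3
    a, b = 2, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def closed_forms(n):
    """Closed expressions for the comb, the ladder and the graph Z_n."""
    r3, r2 = sqrt(3), sqrt(2)
    comb_value = (3 + 2 * r3) / 6 * (1 + r3) ** n + (3 - 2 * r3) / 6 * (1 - r3) ** n
    ladder_value = ((1 + r2) ** (n + 1) + (1 - r2) ** (n + 1)) / 2
    moebius_value = (-1) ** (n + 1) + (1 + r2) ** n + (1 - r2) ** n
    return round(comb_value), round(ladder_value), round(moebius_value)

for n in range(3, 7):
    print(n,
          fibonacci_number(n, path(n)), fibonacci_number(n, cycle(n)),
          fibonacci_number(2 * n, dipoles(n)), fibonacci_number(2 * n, comb_graph(n)),
          fibonacci_number(2 * n, ladder(n)), fibonacci_number(2 * n, moebius(n)),
          closed_forms(n))
```

The enumeration is exponential in the number of vertices, so it is meant only for small cases such as the ones printed.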
After several years I saw a note by A. Alameddine [1] (1983) on the Fibonacci numbers of outerplanar graphs; these are planar graphs whose vertices can be thought of as belonging to a single face.⁴ Maximal among outerplanar graphs are those outerplanar graphs which do not allow addition of edges without disturbing outerplanarity. According to the main result of this paper the Fibonacci number $f(P_n)$ of a maximal outerplanar graph $G_n$ with $n$ vertices satisfies the inequality $f(P_n) \le F_{n+1}$, and this result is the best possible. In the proof of this result in [1] there is a mistake: the author asserts that $f(P_{n-3} \cup \{v\}) = f(P_{n-3})$; yet, for $n = 7$ we have $f(P_4) = F_6 = 8$, but $f(P_4 \cup \{v\}) = 16$. Nevertheless, the assertion is true, as there exists a way to overcome the author’s difficulty. I have some additional remarks here.

1. Using a technique of A. Proskurowski, I can prove the following two theorems:

THEOREM 3.1. For a given maximal outerplanar graph $G$ with $n$ vertices, let us denote by $G^+$ the maximal outerplanar graph obtained by adding a new vertex, and denote by $G^-$ the outerplanar graph with $n - 2$ vertices which we get upon dropping from $G$ some two of its vertices. Then it is true that $f(G) = f(G^+) - f(G^-)$.

THEOREM 3.2. The Fibonacci numbers of the maximal outerplanar graphs $M_n$, given for $n$ odd in Figure 7 and for $n$ even in Figure 8, are minimal among the Fibonacci numbers of maximal outerplanar graphs with $n$ vertices.

[Fig. 7: The maximal outerplanar graph $M_n$ for odd n, with vertices 1, 3, . . . in the top row and 2, 4, . . . , n−1 in the bottom row]

[Fig. 8: The maximal outerplanar graph $M_n$ for even n, with vertices 1, 3, . . . , n−1 in the top row and 2, 4, . . . in the bottom row]

This solves two questions posed by Alameddine in [1]. In addition, my reasoning to achieve the above seems to be such that there exists a quite realistic hope to do all the above for any planar graph: it seems that the needed lemmas exist already, and are contained in Chapter 11 of F. Harary’s book [4]. This seems to be one of the possible lines for extending the results on maximal outerplanar graphs. And, very probably, this extension will be useful for the ‘chip industry’.

⁴ Editor’s note. Equivalently, a graph is called outerplanar if it has an embedding in the plane such that the vertices lie on a fixed circle and the edges lie inside the disk of the circle and don’t intersect.
2. The equation $f(M_n) = f(M_{n-1}) + f(M_{n-3})$ has the characteristic equation $x^3 - x^2 - 1 = 0$. Setting $x = y + \frac{1}{3}$ we get $y^3 - \frac{1}{3} y - \frac{29}{27} = 0$ with the roots
\[
\begin{cases}
y_1 = u + v;\\[2pt]
y_2 = -\dfrac{u + v}{2} + i \, \dfrac{u - v}{2} \sqrt{3};\\[6pt]
y_3 = -\dfrac{u + v}{2} - i \, \dfrac{u - v}{2} \sqrt{3},
\end{cases}
\]
where
\[
u = \sqrt[3]{\frac{29}{54} + \sqrt{\frac{31}{108}}} \quad \text{and} \quad v = \sqrt[3]{\frac{29}{54} - \sqrt{\frac{31}{108}}}.
\]
So we obtain the general solution $f(M_n) = a x_1^n + b x_2^n + c x_3^n$, where the approximate values of $x_i$ ($i = 1, 2, 3$) are
\[
\begin{cases}
x_1 = 1.465572;\\
x_2 = -0.232786 + i \cdot 0.792551;\\
x_3 = -0.232786 - i \cdot 0.792551.
\end{cases}
\]
As $|f(M_n)| \le |a| |x_1|^n + |b| |x_2|^n + |c| |x_3|^n$ and we have $|x_2| = |x_3| < 1$, for $n \to \infty$ we have $|x_2|^n = |x_3|^n \to 0$, and so for large values of $n$ we obtain
\[
\frac{f(M_n)}{f(M_{n-1})} \approx |x_1| = 1.465572.
\]
Experimenting a little with various $n$ shows that the ratio $\frac{f(M_n)}{f(M_{n-1})}$ tends to this value 1.465572 well enough even for small $n$:

n                    3     4     5           6           7           8           9
f(M_n)               4     6     9           13          19          28          41
f(M_{n+1})/f(M_n)    1.5   1.5   1.4444444   1.4615385   1.4736842   1.4642857
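The recurrence, the real root $x_1$ and the ratios can be recomputed in a few lines. The sketch assumes, from the layout of Figures 7 and 8, that $M_n$ is the zigzag strip of triangles in which vertex $i$ is joined to $i + 1$ and $i + 2$; this reading of the figures, and all names used, are assumptions.

```python
def count_independent_sets(n_vertices, edges):
    """Brute-force count of vertex subsets containing no edge (empty set included)."""
    edge_masks = [(1 << a) | (1 << b) for a, b in edges]
    return sum(
        all(subset & e != e for e in edge_masks)
        for subset in range(1 << n_vertices)
    )

def strip_edges(n):
    """Assumed edge set of M_n: vertex i is joined to i+1 and i+2."""
    return [(i, j) for i in range(n) for j in (i + 1, i + 2) if j < n]

# Fibonacci numbers f(M_n) for n = 2, ..., 12.
f = {n: count_independent_sets(n, strip_edges(n)) for n in range(2, 13)}

def real_root(lo=1.0, hi=2.0, steps=60):
    """Bisection for the real root of x^3 - x^2 - 1 = 0 in [1, 2]."""
    g = lambda x: x ** 3 - x ** 2 - 1
    for _ in range(steps):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

x1 = real_root()
for n in range(3, 12):
    print(n, f[n], round(f[n + 1] / f[n], 7))
print("x1 =", round(x1, 6))
```

Under this assumption on $M_n$ the brute-force values can be compared with the recurrence $f(M_n) = f(M_{n-1}) + f(M_{n-3})$ and with the tabulated entries.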
EDITOR’S REMARK. As
\[
R_n = \frac{f(M_{n+1})}{f(M_n)} = \frac{a x_1^{n+1} + b x_2^{n+1} + c x_3^{n+1}}{a x_1^n + b x_2^n + c x_3^n} = x_1 \cdot \frac{a + b \left( \frac{x_2}{x_1} \right)^{n+1} + c \left( \frac{x_3}{x_1} \right)^{n+1}}{a + b \left( \frac{x_2}{x_1} \right)^{n} + c \left( \frac{x_3}{x_1} \right)^{n}},
\]
we see that the sought ratio $R_n$, indeed, tends to $x_1$ as $n \to \infty$. Likewise, it is easy to see that we have
\[
|R_n - x_1| \le K \left( \frac{|x_2|}{x_1} \right)^{n+1}
\]
for a suitable constant $K$ and all $n$. □

Note also that here the technique used by Pólya for solving recurrences appearing in connection with the enumeration of trees can be applied (when suitably extended), and this seems to be an interesting line of reasoning here.

3. To sum up, all the above probably deserves to be written down, and to be critically analyzed once more together; [this should be] interesting at least for people concerned with graphs and chip technology. I finish with some chaotic thoughts on these matters.

First note that for mathematics the most interesting things seem to begin to appear when we pose questions analogous to the graph-theoretic ones above for a finite poset $P$. For such a $P$ it is natural to define the Fibonacci number $f(P)$ of $P$ as the number of all antichains in $P$. Let $\zeta_P$ be the zeta-function for the order relation in $P$; that is, $\zeta_P(x, y) = 1$ if and only if $x \ge y$ in $P$. So $f(P)$ equals the number of all $k \times k$ zero-submatrices in the matrix $(\zeta_P(x, y))$; here $x, y \in P$ (listing $P$) and $k$ takes all values in $\{1, 2, \dots, |P|\}$. The role of antichains when investigating the structure of a poset is, of course, well known; e.g., the maximal size of antichains in $P$ as the width of $P$, Dilworth’s theorem, . . . . I have several observations here. To be more concrete I shall describe two of them here.

4. When considering order preserving maps $\varphi : P \to P$ it seems natural to consider the kernel of $\varphi$, $\pi = \operatorname{Ker} \varphi$, and then to define $\bar{x} \le \bar{y}$ on $\bar{P} = P / \operatorname{Ker} \varphi$ if and only if there exist $x'$, $x' \overset{\pi}{\sim} x$, and $y'$, $y' \overset{\pi}{\sim} y$, such that $x' \le y'$ holds in $P$. This is a consistent definition, since if $\bar{x} < \bar{y}$, then for any pair $(x', y')$ with different components $x'$, $x' \overset{\pi}{\sim} x$, and $y'$, $y' \overset{\pi}{\sim} y$, if these components are comparable then we must have $x' < y'$.

Note also that for a (finite) poset $P$, taking $\pi \in \Pi(P)$ such that all $\pi$-classes are connected (as subsets of $P$), we can define $\bar{x} \le \bar{y}$ in $P/\pi$ by the rule: $\bar{x} \le \bar{y}$ if and only if there exist $x' \sim x$ and $y' \sim y$ such that $x' \le y'$ in $P$. Call such a $\pi$ an acceptable equivalence. It follows from R. Stanley’s results that all acceptable equivalences form an Eulerian sublattice in $\Pi(P)$. Returning to the main point, observe that for any order preserving map $\varphi : P \to P$ there exists a natural ◦-epimorphism $\psi : P \to \bar{P}$, $\psi(x) = \bar{x}$, $x \in P$, and so the usual “◦-diagram” appears:
\[
\begin{array}{ccc}
P & \xrightarrow{\ \psi \ \text{(epi)}\ } & P / \operatorname{Ker} \varphi = \bar{P}\\
 & {\scriptstyle \varphi} \searrow \qquad \swarrow {\scriptstyle \varepsilon \ \text{(iso)}} & \\
 & \operatorname{Im} \varphi \le P &
\end{array}
\]
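The Fibonacci number of a small poset can be computed by listing its antichains directly. In the sketch below the empty antichain is counted as well, by analogy with the graph case (an assumption about the intended convention), and the two example posets, a chain and the zigzag poset $1 > 2 < 3 > 4 < \dots$, are chosen only for illustration.

```python
from itertools import combinations

def antichain_count(elements, leq):
    """Number of antichains (the empty one included) of a finite poset given by
    its elements and its order relation leq(a, b)."""
    elements = list(elements)
    count = 0
    for subset in range(1 << len(elements)):
        chosen = [e for i, e in enumerate(elements) if subset >> i & 1]
        # An antichain contains no two distinct comparable elements.
        if all(not leq(a, b) and not leq(b, a) for a, b in combinations(chosen, 2)):
            count += 1
    return count

def chain_leq(a, b):
    """The chain 1 < 2 < ... < m."""
    return a <= b

def fence_leq(a, b):
    """The zigzag poset 1 > 2 < 3 > 4 < ...: even elements are minimal and lie
    below their odd neighbours; there are no other strict relations."""
    return a == b or (a % 2 == 0 and abs(a - b) == 1)

print(antichain_count(range(1, 6), chain_leq))
print(antichain_count(range(1, 6), fence_leq))
```

For the zigzag poset the comparability graph is a chain in the graph-theoretic sense, so its antichains are exactly the acceptable subsets of that graph.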
Now, the finding of the number $\Omega(P, P)$ of all order preserving maps $P \to P$ reduces to the enumeration of acceptable equivalences on $P$ and of ◦-automorphisms of $P/\pi$ for acceptable $\pi$. This seems to have some point of contact with the Sands conjecture, which I shall describe below. More generally, for any order preserving map $\varphi : P \to Q$ it holds that $\zeta_P(x, y) = 1 \implies \zeta_Q(\varphi(x), \varphi(y)) = 1$ for $x \ge y$ in $P$.

5. As above, denote by $\Omega(P, m)$ the number of order preserving maps $P \to m$. Stanley has observed that $\Omega(P, m) = Z(J(P), m)$, with $J(P)$ the lattice of order ideals of $P$ and $Z(Q, n)$ denoting the number of multichains $y_1 \le y_2 \le \dots \le y_{n-1}$ in $Q$, and this $n$-expression is called the zeta-polynomial of the poset $Q$. $\Omega(P, m)$ is called the order-polynomial of $P$ and can be thought of as an $m$-polynomial of degree $|P|$. Stanley [6, Theorem 4.5.14] gives the following intriguing formula
\[
\sum_{m \ge 0} \Omega(P, m) \lambda^m = \Bigl( \sum_{\pi \in L(P)} \lambda^{1 + d(\pi)} \Bigr) (1 - \lambda)^{-(p+1)},
\]
where $L(P)$ denotes the Jordan-Hölder set for $P$ and $p = |P|$.

My question is now: What will happen to this theory of Stanley if we take $F_m$ (a fence (zigzag): $\{1, 2, \dots, m\}$ with the only inequalities $1 > 2$, $2 < 3$, $3 > 4$, . . . , $m - 1 < m$) instead of the chain $m$? Other posets $P$, instead of $m$, may be of interest also. Yet, $F_m$ is interesting in relation to the paper Currie-Visentin [2]. In this respect, at least $\Omega(P, F_m)$ should deserve attention. In [2] the generating function of $\Omega(F_m, F_m)$ is introduced. According to a conjecture of B. Sands the number $\Omega(P, P)$ is minimal for $P = F_m$. Let us further mention the paper Duffus-Rödl-Sands-Woodrow [3], although we have not seen it so far. Here we can make the conjecture that
\[
\Omega(P, P) \ge \Omega(P, F_m) \ge \Omega(F_m, F_m)
\]
for any poset $P$, $|P| = m$. It seems that the outerplanar graphs $M_n$ here somehow correspond to the “bichromatic” Jordan-Hölder sets for $F_m$, and so the role of $f(M_n)$ was, presumably, not just an accident? This expectation is supported by the observation that for any poset $P$ its order polynomial depends only on its graph of comparability $G(P)$, with $(x, y) \in E(G)$ if and only if $x < y$ or $y < x$. Also, $\Omega(m, F_m)$, $\Omega(F_m, m)$, $\Omega(F_m, F_m)$ and $|\operatorname{Aut}(F_m)|$ are interesting, and so are the Eulerian (sec, tan)-numbers . . .

References
[1] A. F. Alameddine. Centers of maximal planar graphs with two vertices of degree three. J. Combin. Inform. System Sci. 8, 1983, 90–96.
[2] J. D. Currie and T. I. Visentin. The number of order-preserving maps of fences and crowns. Order 8, 1991, 133–142.
[3] D. Duffus, V. Rödl, B. Sands, and R. Woodrow. Enumeration of order preserving maps. Order 9, 1992, 15–29.
[4] F. Harary. Graph theory. Addison-Wesley, Reading, MA, 1969. Russian translation: Mir, 1973.
[5] H. Prodinger and R. Tichy. Fibonacci numbers of graphs. Fibonacci Quarterly 20, 1982, 16–21.
[6] R. Stanley. Enumerative combinatorics I. The Wadsworth & Brooks/Cole Mathematics Series. Wadsworth & Brooks/Cole Advanced Books & Software, Monterey, CA, 1986.
CHAPTER V History of Mathematics
1. Th. Molien, an innovator of algebra
Unpublished manuscript, c. 1985, translation from Estonian by J. Peetre
The life of Fedor Eduardovich Molin (1861–1941) was somewhat unusual. He was born in Riga and in 1883 received the scientific degree of a candidate in astronomy from the University of Dorpat/Tartu. In 1883–1885 he worked in Leipzig in the seminar of Felix Klein, on whose advice Molin began to research linear transformations of elliptic functions. When he returned to his alma mater, Molin was appointed a docent and during the following six years made contributions that earned him his place in the history of algebra. In 1892 he published his paper “On systems of higher complex numbers”. In modern language, in that paper by analogy with the notion of a simple group Molin defined simple algebras over the field of complex numbers, showed that they are algebras of matrices, and finally, discovered that the study of an arbitrary algebra over the field of complex numbers reduces to the case when the quotient by the radical is a direct sum of matrix algebras. In the short articles that followed that memoir, Molin applied these results to representation theory of finite groups. His research had much in common with works by Frobenius, Killing and Lie, and immediately brought him international acclaim and a gold medal from the Paris Academy of Sciences. Georg Frobenius in one of his letters to Molin said, in particular, that Molin “with one stroke completely solved the most important questions in this field”. Unfortunately, neither Moscow nor St-Petersburg universities had any influential people capable of giving Molin’s papers their due, and after receiving for them his doctorate degree, he had to take a full professor position (called “ordinary professor”) in mathematics in the newly opened Tomsk Technological Institute. There, the daily needs of organizing teaching, a library, and other tasks vital for an institution of higher education that was new and distant from the capital removed him for a long time from the stream of international mathematical life.
In 1917 Molin was appointed a professor of mathematics in the department of physics and mathematics in the newly opened Tomsk University. He became completely absorbed in organizing this department, and published from time to time articles of a general mathematical nature. For a long time, Tomsk had been the cultural capital of Siberia, and the current flourishing of Siberian mathematics is partially due to F.E. Molin.
Excerpt from A.I. Mal’cev, On the history of algebra in the USSR for the first 25 years, Algebra i Logika 10, 1971, 102–118 (Russian). English Translation: Algebra Logic, pp. 68–81.
According to [1] the picture of the early history of the theory of group representations is distorted: the approach of Frobenius and Burnside is usually considered as fundamental, although in reality the innovative work was done by the little known T. Molien. However, the wider mathematical and historical context of Molien’s results and their connection with problems of contemporary mathematics has been little studied [2].

125 years have elapsed since the birth (September 10, 1861) of Theodor Molien. He was born into a family of Swedish origin, which had moved from northern Estonia to Riga. His father [Eduard Molien], a graduate of Tartu University, was a teacher at a private gymnasium in Riga. T. Molien received his basic education at the Government Gymnasium in Riga. There the foundations were laid for his ability for studies and his character, his intellectual interests and habits. At Tartu University Molien began to prepare himself for a career as an astronomer. His aptness and diligence were noted, and his scientific tastes and ability developed. He graduated from the university (1883) and was sent to Leipzig (1883–1885), which gave him modern and deep knowledge. There, in the seminar of F. Klein, his scientific interests definitely turned to interior problems of mathematics. The future Docent at Tartu University (1885–1900) did not give up his connections with Leipzig (from 1886 on, the seminar was directed by Sophus Lie). Therefore, in the study of systems of hypercomplex numbers, he got stimulation and help from the activities which arose in Lie’s seminar from the results of Weierstrass and Dedekind on this theme, and in particular from Poincaré’s remark that the expression of the multiplication of hypercomplex numbers gives a Lie group. As a result, the results of Molien [3] obtained in 1887–92 began the structure theory of algebras [5]. The fact, known to Molien, that group algebras, in special cases already known to A. Cayley, constitute a bridge between representation theory and the theory of algebras led him to fundamental notions and facts in the theory of group representations [4]. In this way the “hypercomplex aspect” of the theory was born.

Let $G$ be a finite subgroup of the group of all regular linear maps of the subspace of linear forms in the algebra of polynomials $R = \mathbb{C}[x_1, \dots, x_m]$. The description of the homogeneous components $R_n^G$ of the so-called subalgebra of invariants $R^G = \{f \in R \mid \forall g \in G,\ f^g = f\}$ constitutes the central problem of the theory of invariants. This problem is equivalent to the determination of the formal series $M_G(t) = \sum_{n \ge 0} (\dim R_n^G) \, t^n$, the Molien series of $G$. The answer, an explicit form of the rational function $M_G(t)$, is provided by Molien’s formula:
\[
M_G(t) = \frac{1}{|G|} \sum_{g \in G} \frac{1}{\det(I - t g)}.
\]
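Molien's formula is easy to evaluate for a group of permutation matrices. The following is a sketch under stated assumptions: it uses the standard fact that for a permutation $g$ with cycle lengths $l_1, \dots, l_r$ one has $\det(I - tg) = \prod_i (1 - t^{l_i})$, and the function names and the choice of the symmetric group $S_3$ permuting three variables are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple with perm[i] the image of i."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths

def molien_coefficients(group, N):
    """Coefficients of t^0, ..., t^N in (1/|G|) * sum over g of 1/det(I - t g),
    for a group of permutation matrices."""
    total = [Fraction(0)] * (N + 1)
    for g in group:
        series = [Fraction(1)] + [Fraction(0)] * N
        for l in cycle_lengths(g):
            # Multiply the truncated power series by 1 / (1 - t^l).
            for n in range(l, N + 1):
                series[n] += series[n - l]
        total = [a + b for a, b in zip(total, series)]
    return [coeff / len(group) for coeff in total]

S3 = list(permutations(range(3)))
print([int(c) for c in molien_coefficients(S3, 8)])
```

For $S_3$ the invariants are the symmetric polynomials in three variables, so the coefficient of $t^n$ should equal the number of partitions of $n$ into at most three parts.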
The so-called Pólya theory in combinatorics studies the numbers $d(\tau)$ of $G$-schemes of a given type, that is, the series $L_G(t) = \sum_\tau d(\tau) t^\tau$. The formula for $L_G(t)$, which is the central part of Pólya theory, can be found in a similar way [6]. This example [6] does not exhaust the connections of Molien’s results with problems of contemporary mathematics. In particular, a somewhat more general variant of Molien’s formula just considered admits applications to the noncommutative theory of invariants. Apparently, this new interest in the innovative papers [3, 4] of Molien is yet another confirmation of their lasting value.

References
[1] W. Gustafson. Review on S. Sehgal’s topics in group rings. Bull. Amer. Math. Soc. 1, 1979, 654–657.
[2] N. F. Kanunov. Fedor Eduardovich Molien. Nauka, Moscow, 1983.
[3] T. Molien. Über Systeme höherer komplexer Zahlen. Math. Ann. 41, 1893, 83–156.
[4] T. Molien. Über die Invarianten der linearen Substitutionsgruppen. Sitzungsber. der Königl. Preuss. Akad. d. Wiss. 52, 1897, 1152–1156.
[5] R. S. Pierce. Associative algebras. Graduate Texts in Mathematics, 88. Springer-Verlag, New York, Berlin, 1982. Russian translation: Mir, Moscow, 1986.
[6] R. Stanley. Invariants of finite groups and their applications to combinatorics. Bull. Amer. Math. Soc. 1, 1979, 475–511.
2.
[K87e] On the results of Molien about invariants of finite groups and their renaissance in contemporary mathematics Translation by J. Peetre
1. Molien’s papers [11] and [12] occupy an honorable place in the history of mathematics (cf. [1]). Nevertheless, comparatively little has been written on the wider historical-mathematical context of his classical results and on their connection with the problems of contemporary mathematics [8, 15]. One of the reasons for this situation is the long-standing, complicated treatment of the interaction between the various results and notions of the theory of representations of finite groups and of their group algebras, which arose after the publication of the books [2] and [22]. These outstanding books marked a milestone in the algebraic literature and became the basic textbooks for several generations of mathematicians; even today their impact is great. However, the history of the formation of these concepts and results has not been illuminated sufficiently clearly (cf. e.g. [13]). Apparently, this distortion of history was caused by the underestimation of the contribution of S. Lie and of other like-minded mathematicians who took part in the Leipzig seminars (in the second half of the 1880’s), as well as by the neglect of the works of Molien and É. Cartan, which were written in the old-fashioned language of the Peirce idempotents (1871). The latter happened after the brilliant presentation and generalization of the Molien-Cartan theory by Wedderburn (1907), and especially after the appearance of [14]. This is clearly seen in the history of group algebras, to which the second part of our paper is devoted. E. Noether alone used to be considered the founder of the theory of group algebras. Nowadays, more and more people point to the role of A. Cayley in the genesis of the notion of group algebra and, as regards the theory of group algebras, to that of T. Molien, cf. [5, 8, 16]. In particular W.
Gustafson writes in [5]: “Most people familiar with the early history of the theory of representations of finite groups in C, think immediately of Frobenius and Burnside, who used approaches that seem unsuitable and even bizarre in the light of modern treatments. Admittedly Frobenius’ group determinant and Burnside’s Lie-theoretic approach both yielded the basic properties of complex characters. However, they said much less about the representations themselves. For this reason they have little application to the important problems of finding properties of representations over other rings[: representations over fields of finite characteristic and over rings of algebraic integers have very important applications in group theory, algebraic number theory and topology. Hence, a more flexible approach was needed.] In fact, the groundwork has been done by the little known Estonian mathematician Theodor Molien.” Let us add to this the words of H. Weyl: “The matter is closely connected with hypercomplex number systems or algebras. After Hamilton’s foundation of quaternion calculus (1843), and a long period of more or less formal research in which B. Peirce played the major role, Molien (1892) was really the first who reached several general and profound results in
this direction” (cf. [23, p. 29] of the original 1939 edition¹). This, as well as the recent reconstruction by J. Dieudonné, shows that Molien may undoubtedly be viewed as the first discoverer of the “hypercomplex aspect” of the theory of group representations, a discovery which is frequently ascribed to Burnside and E. Noether (cf. [13]). As to Noether’s paper [14], the definition of the group ring given there and the treatment of the whole “hypercomplex aspect” took a modern form; moreover, the paper made a major impact on the style of algebraic thinking. Later, however, it was often overlooked that Noether herself knew Molien’s work well and had a high opinion of it (cf. [14]).

¹ Translator’s note. Kaljulaid in [K87e] refers to p. 48 of the Russian translation (Moscow, 1947).

Recently an unexpected connection was discovered between Molien’s old result and contemporary problems of mathematics. Such a long interruption is partially explained by the fact that the actual adaptation of invariant theory and of the ideas of Klein’s program, from which one should proceed when studying the corresponding groups and representations, went on slowly. Therefore only in the 1950’s and 60’s did it lead to new and essential formulations of problems and to applications. Below, following [21], we shall describe the remarkable connection of the theory of invariants of finite groups with the combinatorial theory of counting, established using formula (12) of Molien’s paper [12]. Knowledge of and skill in combinatorics have become an essential component of undergraduate courses in applied mathematics. In many lecture courses on combinatorics, a central place is occupied by the so-called counting theory of Pólya or, as it is nowadays called, the Redfield-Pólya theory. It turns out that the central result of this theory can easily be derived from an analogue of Molien’s formula.

2. Let us regard the algebra of polynomials $R = \mathbb{C}[x_1, \dots, x_n]$ as a vector space over the field of complex numbers $\mathbb{C}$ and let us represent it as the direct sum
$$R = R_0 \oplus R_1 \oplus R_2 \oplus \cdots \oplus R_n \oplus \cdots,$$
where $R_n$ is the subspace of all homogeneous polynomials (forms) of degree $n$. The subspace $R_1$ of linear forms will be denoted by $V$ and will be considered as a vector space with the fixed basis $x_1, \dots, x_n$; the column $(x_1, \dots, x_n)^t$ will be written $x$. Let $\mathcal{G}$ be a finite subgroup of the group $GL(V)$ of all regular linear maps of $V$. As a basis is fixed in $V$, every element $G \in \mathcal{G}$ may be viewed as a matrix, and its action on a polynomial $f \in R$ can be given by the formula $f^G(x) = f(Gx)$. In $R$ we can distinguish the so-called subalgebra of invariants $R^{\mathcal{G}} = \{f \in R \mid \forall G \in \mathcal{G},\ f^G = f\}$. An essential characteristic of the algebra $R^{\mathcal{G}}$ is provided by the formal series
$$M_{\mathcal{G}}(t) = \sum_{n \ge 0} (\dim_{\mathbb{C}} R_n^{\mathcal{G}})\, t^n,$$
which is called the Molien series of the group $\mathcal{G}$. According to the theorem of Hilbert the Molien series is always a rational function. The classical result of Molien, about which we spoke at the end of Subsection 1, gives an explicit formula for the determination of this rational function, namely:
$$M_{\mathcal{G}}(t) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \frac{1}{\det(1 - tG)}.$$
For example, let $R = \mathbb{C}[x_1, x_2, x_3]$ and $\mathcal{G} = \langle G, H \rangle$, where
$$G = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \quad\text{and}\quad H = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & i \end{pmatrix}, \qquad i^2 = -1.$$
Then $\mathcal{G}$ is an Abelian group of order 8 such that $R^{\mathcal{G}} = \mathbb{C}[x_1^2, x_2^2, x_3^4](1 \oplus x_1 x_2)$. As
$$\begin{aligned}
\mathcal{G} = \Biggl\{ &\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & i \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -i \end{pmatrix}, \\
&\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & i \end{pmatrix}, \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -i \end{pmatrix} \Biggr\},
\end{aligned}$$
the Molien series of $\mathcal{G}$ is given by
$$\begin{aligned}
M_{\mathcal{G}}(t) = \frac{1}{8} \Bigl[ &\frac{1}{(1-t)^3} + \frac{1}{(1-t)^2(1-it)} + \frac{1}{(1-t)^2(1+t)} + \frac{1}{(1-t)^2(1+it)} \\
&+ \frac{1}{(1+t)^2(1-t)} + \frac{1}{(1+t)^2(1-it)} + \frac{1}{(1+t)^3} + \frac{1}{(1+t)^2(1+it)} \Bigr] = \frac{1}{(1-t^2)^3}.
\end{aligned}$$
The Molien series carries important information about the algebra $R^{\mathcal{G}}$, the study of which is the central problem of invariant theory. The theory of invariants arose in England in the middle of the 19th century as a generalization of the theory of determinants, an algebraic instrument for describing connections and configurations in projective geometry. At the beginning the focus was on the actual computation of invariants of the group of all homogeneous linear transformations. This “combinatorial” line of development of the theory was initiated by Cayley (1846). From determinants he proceeded to more general invariants (i.e. to algebraic expressions in the coordinates which change in a definite way under non-degenerate transformations), and in 1854–59 he obtained a complete system of them for cubic and biquadratic forms. This was followed by important work by Sylvester, Clebsch, Cremona, Beltrami, Capelli, and others; the first of these authors invented most of the terminology of invariant theory. As a result, the so-called symbolic method was developed, which is of current interest in modern combinatorics (cf. [21]). Number theory also gave an impulse to the development of invariant theory: the arithmetic theory (Gauss) of binary quadratic forms forced the study of invariants of the group of integer unimodular matrices. This line of reasoning found its sequel in the works of Eisenstein, Jacobi and Hermite. The subsequent “abstract” development of invariant theory moved the direct computation of invariants into the background, and the main attention turned to general notions and relations.
The final results on the key problems of the classical theory (the existence of finitely many generators of the algebra of invariants and of a finite basis for the syzygies) were obtained by Hilbert (1890–92). After these achievements the interest in problems about invariants dropped abruptly. But in the 1930’s they again attracted the
interest due to developments in physics. At the same time the general formulation of the problem of invariants originated in the form in which it was set forth at the beginning of this section, and which provides the basis for reducing the problem of invariants to a special case of the general problems of representation theory. Above we mentioned the underestimated role of S. Lie’s approach in the development of the representation theory of groups. We may add to this that the theory of invariants was productively unified with Lie’s infinitesimal methods by E. Study. Today his results have become a source of ideas that support the development of concrete differential equations for the invariants. This fact, as well as the increased interest in this relation on the part of contemporary discrete mathematics, shows that this approach has not yet exhausted all its possibilities.
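As a numerical cross-check of the example in this subsection, the following minimal Python sketch averages the power series of $1/\det(1 - tG)$ over the eight diagonal matrices of the group and compares the result with the coefficients of $1/(1-t^2)^3$. The function names (`inverse_det_series`, `molien_coefficients`) are illustrative assumptions, and the shortcut of working with eigenvalues only is valid here just because every group element is diagonal.

```python
from math import comb


def inverse_det_series(eigenvalues, n_terms):
    """Coefficients of prod_j 1/(1 - lam_j * t) up to t^(n_terms - 1).

    For a diagonal matrix G with diagonal entries lam_j this product
    equals 1/det(1 - tG).
    """
    coeffs = [1 + 0j] + [0j] * (n_terms - 1)
    for lam in eigenvalues:
        # Multiplying a series by 1/(1 - lam*t) is the in-place
        # recurrence c_n <- c_n + lam * c_{n-1}.
        for n in range(1, n_terms):
            coeffs[n] += lam * coeffs[n - 1]
    return coeffs


def molien_coefficients(group, n_terms):
    """Average the series over the group (Molien's formula)."""
    total = [0j] * n_terms
    for eigenvalues in group:
        for n, c in enumerate(inverse_det_series(eigenvalues, n_terms)):
            total[n] += c
    # The averages are dimensions, hence non-negative integers.
    return [round((c / len(group)).real) for c in total]


# The eight elements diag(s, s, i^k), s = +1 or -1, k = 0, 1, 2, 3.
group = [(s, s, 1j ** k) for s in (1, -1) for k in range(4)]

coeffs = molien_coefficients(group, 9)
# Coefficients of 1/(1 - t^2)^3: binom(k + 2, 2) at t^(2k), zero otherwise.
expected = [comb(n // 2 + 2, 2) if n % 2 == 0 else 0 for n in range(9)]

print(coeffs)              # [1, 0, 3, 0, 6, 0, 10, 0, 15]
print(coeffs == expected)  # True
```

The coefficient 3 of $t^2$ matches the three invariants $x_1^2$, $x_2^2$, $x_1 x_2$ of degree 2, and the coefficient 6 of $t^4$ includes the additional invariant $x_3^4$.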
3. The most interesting applications of the ideas and results in Molien’s paper [12] took place in the past decade [the 1970’s]. Let us now familiarize ourselves with a generalization of Molien’s formula that leads to a surprisingly wide range of applications in contemporary combinatorics. To this end let us consider the decomposition $V = V_1 \oplus \cdots \oplus V_m$, where $V_i$ is the subspace of $V$ spanned by the basis element $x_i$ (we now write $m$ for the number of variables). Let $\mathcal{G} \le GL(V)$ be a finite subgroup such that for each element $G \in \mathcal{G}$ there exists a permutation $\pi_G \in S_m$ with the property that $G(V_i) = V_{\pi_G(i)}$, $i = 1, \dots, m$. In this situation we say that $\mathcal{G}$ is a monomial group; it consists of monomial matrices, i.e. matrices having in each row precisely one element different from zero. If, for some $G \in \mathcal{G}$, $C = \{i_1, \dots, i_t\}$ is a cycle of the permutation $\pi_G$, then $\{i_1, \dots, i_t\} \subseteq \{1, \dots, m\}$ and $\pi_G(i_k) = i_{k+1}$ for $1 \le k \le t-1$, while $\pi_G(i_t) = i_1$. In view of the monomiality of $G$ there exist numbers $\alpha_1, \dots, \alpha_t \in \mathbb{C}$ such that $G(x_{i_k}) = \alpha_k x_{i_{k+1}}$ for $1 \le k \le t-1$ and $G(x_{i_t}) = \alpha_t x_{i_1}$; denote by $\gamma_G(C)$ the product $\alpha_1 \alpha_2 \cdots \alpha_t$. The type of a monomial $x_1^{n_1} \cdots x_m^{n_m}$ is the sequence $\tau = (\tau_1, \tau_2, \dots)$, where $\tau_i$ is the number of exponents $n_j$ equal to $i$, i.e. $\tau_i = |\{j \mid n_j = i\}|$. Let $R_\tau$ be the subspace of the $\mathbb{C}$-algebra $R$ spanned by all monomials of type $\tau$. As the type does not change under the action of a monomial matrix $G \in \mathcal{G}$, we have $G(R_\tau) = R_\tau$. If we set $R_\tau^{\mathcal{G}} = R^{\mathcal{G}} \cap R_\tau$, then the decomposition $R^{\mathcal{G}} = \bigoplus_\tau R_\tau^{\mathcal{G}}$ grades $R^{\mathcal{G}}$ as a $\mathbb{C}$-space; let us note, however, that $R_\tau^{\mathcal{G}} \cdot R_\sigma^{\mathcal{G}}$ is not always contained in some $R_\mu^{\mathcal{G}}$. Molien’s formula suggests a path for finding the generating function in the (infinitely many) variables $t = (t_1, t_2, \dots)$:
$$L_{\mathcal{G}}(t) = \sum_{\tau} (\dim_{\mathbb{C}} R_\tau^{\mathcal{G}})\, t^\tau = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_{C} \bigl(1 + \gamma_G(C)\, t_1^{|C|} + \gamma_G(C)^2\, t_2^{|C|} + \cdots \bigr),$$
where $t^\tau = t_1^{\tau_1} t_2^{\tau_2} \cdots$ and $C$ runs through all cycles of the permutation $\pi_G$. For example, for the monomial group
$$\mathcal{G} = \left\{ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & -1 \\ 0 & -1 & 0 \\ -1 & 0 & 0 \end{pmatrix} \right\}$$
we obtain
$$\begin{aligned}
L_{\mathcal{G}}(t) = \frac{1}{4} \bigl[ &(1 + t_1 + t_2 + t_3 + \cdots)^3 + (1 - t_1 + t_2 - t_3 + \cdots)^3 \\
&+ (1 + t_1 + t_2 + t_3 + \cdots)(1 + t_1^2 + t_2^2 + t_3^2 + \cdots) \\
&+ (1 - t_1 + t_2 - t_3 + \cdots)(1 + t_1^2 + t_2^2 + t_3^2 + \cdots) \bigr] = 1 + 2 \sum_{k=1}^{\infty} t_{2k} + \cdots,
\end{aligned}$$
where the final dots stand for the terms of degree 2 and 3 in the variables $t_i$.
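The generating function of this example can be cross-checked mechanically. The sketch below is a minimal illustration under stated assumptions, not the method of [21]: each group element is encoded as a hypothetical pair `(perm, coef)` meaning $x_i \mapsto \mathrm{coef}[i]\, x_{\mathrm{perm}[i]}$ (0-based indices), only the variables $t_1, \dots, t_K$ are tracked, and the cycle-product formula is compared with a brute-force count that averages, over the group, the scalars by which each element fixes a monomial of a given type.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

K = 3  # track t_1..t_K only, i.e. monomial exponents 0..K


def poly_mul(p, q):
    """Multiply polynomials in t_1..t_K stored as {exponent tuple: coeff}."""
    out = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return out


def cycles(perm, coef):
    """Yield (|C|, gamma(C)) for every cycle C of the monomial matrix."""
    seen = set()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, gamma, i = 0, 1, start
        while i not in seen:
            seen.add(i)
            gamma *= coef[i]
            length += 1
            i = perm[i]
        yield length, gamma


def formula(group):
    """Right-hand side: (1/|G|) sum_G prod_C (1 + sum_i gamma^i t_i^|C|)."""
    total = defaultdict(Fraction)
    for perm, coef in group:
        term = {(0,) * K: 1}
        for length, gamma in cycles(perm, coef):
            factor = {(0,) * K: 1}
            for i in range(1, K + 1):
                e = [0] * K
                e[i - 1] = length
                factor[tuple(e)] = gamma ** i
            term = poly_mul(term, factor)
        for e, c in term.items():
            total[e] += Fraction(c, len(group))
    return {e: c for e, c in total.items() if c != 0}


def brute_force(group, m):
    """Left-hand side: dim of the invariants of each type, by averaging traces."""
    total = defaultdict(Fraction)
    for a in product(range(K + 1), repeat=m):
        tau = tuple(sum(1 for n in a if n == i) for i in range(1, K + 1))
        for perm, coef in group:
            image, scalar = [0] * m, 1
            for i in range(m):
                image[perm[i]] = a[i]
                scalar *= coef[i] ** a[i]
            if tuple(image) == a:  # monomial fixed up to the scalar
                total[tau] += Fraction(scalar, len(group))
    return {e: c for e, c in total.items() if c != 0}


swap = (2, 1, 0)  # exchanges x_1 and x_3
group = [((0, 1, 2), (1, 1, 1)), ((0, 1, 2), (-1, -1, -1)),
         (swap, (1, 1, 1)), (swap, (-1, -1, -1))]

dims = formula(group)
print(dims == brute_force(group, 3))  # True
print(dims[(0, 1, 0)])                # coefficient of t_2
```

The coefficient of $t_2$ comes out as 2, in agreement with the two invariants $x_2^2$ and $x_1^2 + x_3^2$ of that type, while types of odd total degree do not occur at all, since $-1$ acts on them by $-1$.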
4. The results on invariants of finite groups, interest in which arose again in the 1950’s, admit various important applications in contemporary mathematics. It is especially noteworthy that the general theorem of Pólya, which plays such an eminent role in combinatorics, is a special case of the generalization of Molien’s formula described in Subsection 3. Apparently, this was first noticed by Stanley [21]. Let us now briefly describe the ideas that led to the so-called Pólya theory. It has its origin in Cayley’s paper (1875) on the counting of hydrocarbons. However, the method proposed turned out to be impractical, and so chemists did not pay much attention to it. Nevertheless, in the following 30 years many showed interest in that technique, but on the mathematical level there was still no progress; for a survey of these attempts see [7]. The remarkable paper [18] was written by Redfield (1927). This paper remained unknown for a long time, although it contained many ideas and results that were later (1934–37) rediscovered by G. Pólya. The neglect of [18] was caused in part by its off-putting terminology and its barely penetrable presentation. Pólya’s work was likewise preceded by the paper [10], whose authors promote the idea that the terminology and technique of group representations are useful for the counting of isomers. Let us note the interesting fact that one of the cornerstones of the theory is called everywhere the “theorem or lemma of Burnside”, although according to [13] it was known to Cauchy and Frobenius long before the appearance of the book [2]. The work of Pólya, and in particular his final paper [17], became a landmark in counting theory because of its influence on the subsequent development. The results of the Redfield-Pólya theory and its generalizations nowadays form an important chapter of modern combinatorics. The above-mentioned connection between this theory and Molien’s formula can be briefly described as follows.
Let us consider the case where the monomial group $\mathcal{G}$ consists of permutation matrices. Then each element $G \in \mathcal{G}$ induces a substitution on the set $F$ of all functions $f\colon \{1, \dots, m\} \to \mathbb{N}$, given by $(Gf)(i) = f(\pi_G(i))$. If we now identify the function $f$ with the monomial $x^f = x_1^{f(1)} x_2^{f(2)} \cdots x_m^{f(m)}$, then the action of $G$ on the $\mathbb{C}$-algebra $R$ satisfies the relation $G(x^f) = x^{G^{-1} f}$, where $(G^{-1} f)(i) = f(\pi_{G^{-1}}(i))$.

The action of $\mathcal{G}$ on $F$ gives a partition of the set $F$; its classes (the so-called $\mathcal{G}$-schemes) are the orbits of this action, i.e. we write $f \sim g$ if $g = Gf$ holds for some $G \in \mathcal{G}$. If $f \sim g$ then the multisets $\{f(1), \dots, f(m)\}$ and $\{g(1), \dots, g(m)\}$ coincide, and therefore $x^f$ and $x^g$ have the same type. Hence one can speak of the type of a $\mathcal{G}$-scheme. The main problem of the counting theory of Pólya is to determine the number of $\mathcal{G}$-schemes of a given type $\tau$. Consequently, denoting the sought number by $d(\tau)$, the counting theory problem can
be thought of as the problem of finding the generating series $L_{\mathcal{G}}(t) = \sum_{\tau} d(\tau)\, t^\tau$. The answer is given by the theorem of Pólya, which is obtained by specializing the formula for $L_{\mathcal{G}}(t)$ given above to the present case:
$$L_{\mathcal{G}}(t) = \sum_{\tau} d(\tau)\, t^\tau = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_{C} \bigl(1 + t_1^{|C|} + t_2^{|C|} + t_3^{|C|} + \cdots \bigr),$$
where $C$ runs through all cycles of the permutation $\pi_G$.

This example does not exhaust the connections of Molien’s formula with modern mathematics. In algebra, there has been great interest in recent years in the non-commutative analogue of the situation considered in Subsection 2. This amounts to studying the subalgebra of invariants $R^{\mathcal{G}}$ of a finite group $\mathcal{G}$ in the algebra $R = \mathbb{C}[x_1, \dots, x_n]$ of polynomials in non-commuting variables $x_i$. The corresponding generalization of Molien’s formula and its various applications are discussed in [4]. The authors of this paper developed the analogue of Molien’s formula for a non-commutative compact topological group² $\mathcal{G}$ and used it for solving subtle (discrete) algebraic problems. Besides, the above-mentioned generalization of Molien’s formula finds use in the theory of multipartitions, in coding theory and in other areas, where it clarifies the problems considered and, together with a unified approach, simplifies the corresponding proofs and opens possibilities for generalization. However, a more detailed analysis of these problems requires new notions and results, and so goes beyond the bounds of the present publication. The interested reader may consult the papers [4, 19, 20], which exhibit the importance of the paper [12] and the exceptional value of Molien’s results. Our account suffices to show how unfounded was the rather narrow appreciation of the scientific activity of T. Molien in his country at the turn of the century, which forced him to leave Tartu, the town where he wrote his classical papers [11] and [12] on the theory of algebras and groups. Cf. also [3], [4], [6], [9].

² In such groups, the formula is called the Molien-Weyl formula.

References
[1] Nicolas Bourbaki. Éléments d’histoire des mathématiques. Masson, Paris, 1984. Russian translation: Gos. Izdat. Inostr. Lit., Moscow, 1963.
[2] W. Burnside. Theory of groups of finite order. University Press, Cambridge, 1897.
[3] A. Cayley. On the analytical forms called trees, with application to the theory of chemical combinations. Rep. Brit. Assoc. Adv. Sci. 45, 1875, 257–305.
[4] W. Dicks and E. Formanek. Poincaré series and a problem of S. Montgomery. Linear and Multilinear Alg. 12, 1982, 21–30.
[5] W. Gustafson. Review on S. Sehgal’s topics in group rings. Bull. Amer. Math. Soc. 1, 1979, 654–657.
[6] T. Hawkins. Cayley’s counting problem and the representation of Lie algebras. In: Proc. of the Int. Congress of Math., August 3–11, 1986. Amer. Math. Soc., Providence, RI, 1987, 1642–1656.
[7] H. Henze and C. Blair. The number of isomeric hydrocarbons of the methane series. J. Amer. Chem. Soc. 53, 1931, 3077–3085.
[8] N. F. Kanunov. Fedor Eduardovich Molien. Nauka, Moscow, 1983.
[9] N. F. Kanunov. F. E. Molin’s work "On invariants of groups of linear substitutions". Historical Mathematical research 30, 1986, 306–338.
[10] A. Lunn and J. Senior. Isomerisms and configuration. J. Phys. Chem. 33, 1929, 1027–1079.
[11] T. Molien. Über Systeme höherer komplexen Zahlen. Math. Ann. 41, 1893, 83–156.
[12] T. Molien. Über die Invarianten der linearen Substitutionsgruppen. Sitzungsber. der Königl. Preuss. Akad. d. Wiss. 52, 1897, 1152–1156.
[13] P. M. Neumann. A lemma that is not Burnside’s. Math. Scientist 4, 1979, 133–141.
[14] E. Noether. Hyperkomplexe Grössen und Darstellungstheorie. Math. Zeit. 30, 1929, 641–692.
[15] K. Parshall. Joseph Wedderburn and the structure theory of algebras. Arch. Hist. Exact Sci. 32, 1989, 223–349.
[16] R. S. Pierce. Associative algebras. Graduate Texts in Mathematics, 88. Springer-Verlag, New York, Berlin, 1982. Russian translation: Mir, Moscow, 1986.
[17] G. Pólya. Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen. Acta Math. 68, 1937, 145–254.
[18] J. Redfield. The theory of group-reduced distributions. Amer. J. Math. 49, 1927, 433–455.
[19] N. Sloane. Error-correcting codes and invariant theory. Amer. Math. Monthly 84, 1977, 82–107.
[20] L. Solomon. Partition identities and invariants of finite groups. J. Comb. Theory, Ser. A 23 (2), 1977, 148–175.
[21] R. Stanley. Invariants of finite groups and their applications to combinatorics. Bull. Amer. Math. Soc. 1, 1979, 475–511.
[22] B. L. van der Waerden. Moderne algebra, I; II. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete. Springer, Berlin, 1930; 1931.
[23] H. Weyl. The classical groups. Their invariants and representations. Princeton University Press, Princeton, N.J., 1939. Russian translation: Gos. Izdat. Inostr. Lit., Moscow, 1947.
3. Theodor Molien, about his life and mathematical work as seen a century later. (A biographical sketch and a glimpse of his work)
Xerox³ copy of handwritten original [c. 1991], edited by J. Peetre, corrections by A. Zubkov

Contents of the chapter
1. A biographical sketch and a glimpse of his thesis
2. Molien’s 1897 papers on group rings and invariants
3. Molien type formulae in Combinatorics
4. Noncommutative versions of Molien type formulae
References
Science and art are the two sides of our life. Science usually considers everything that is determined by laws, and so its development can be predicted. But what is history? If we know the laws of processes we can predict them. But often there appears a point in developing some idea where we have two or infinitely many choices – chaos appears. If we look back, there were laws and predictability. But looking forward we see chaos. The historian cannot write the history of what has not happened. The history of things that have not happened but might have happened – that is art: to go back to some point and try again in a new direction and with new connections in mind. Considered in such a way, the history of mathematical ideas can, I believe, be a useful thing for a mathematician. Something like this has happened with some of the ideas of Molien.
3.1. A biographical sketch and a glimpse of his thesis

Theodor Molien was born [in Riga] on September 10, 1861. His great-grandfather was a Swede who had settled near Reval/Tallinn in the 18th century and was a teacher at the local school there.⁴ Molien’s grandfather (Andrei [Andrew]) was a watchmaker who had settled in Riga. His father, Eduard Molien, was educated at Riga Gymnasium and afterwards at Dorpat/Tartu University, where he received a diploma as a teacher of classical languages in 1843. Then he worked as a private teacher in Riga. Theodor Molien himself⁵ was a student at Riga Gymnasium in 1872–79; after his father’s death he wanted very much to support and please his mother, so he was very diligent in his studies. The whole family (himself, two sisters and their mother) moved to Tartu in 1880. There he became a student of mathematics, with the aim of preparing himself to be an astronomer; he was primarily influenced by the famous observatory and the intense scientific work done in it since the days of W. Struve. He attended the lectures of P. Helmling (mathematics), F. Minding (mechanics), A. Oettinger (physics) and P. Schwarz (astronomy). He was greatly engaged by the lectures and seminars of a young Swedish astronomer and mathematician, Lindstedt. Anders Lindstedt was born in 1853 and, after receiving a doctor’s degree in astronomy from Lund, served as an astronomer at the Tartu Observatory from 1879. After the retirement of Minding, Lindstedt served as Professor of applied mathematics at Tartu University (1883–86). He published works on celestial mechanics and integral calculus in the Memoirs of the Petersburg Academy. His lectures were new, original, and influential for students, and (what is important in our context) among them were courses on algebra and algebraic geometry. What was absolutely new for Tartu was his seminar for students, whose object was to support their scientific work. It was in this seminar that Molien got much encouragement and advice. As a result Molien wrote and published two papers in astronomy and received his diploma [8,24]. Anders Lindstedt was the first person to recognize Molien’s talent, and he insisted on Molien’s remaining at the University to prepare himself for the doctor’s degree. He also insisted that Molien be given a stipend for the continuation of his (then beginning) studies in pure mathematics in Germany, namely for participation in the famous seminar of Felix Klein in Leipzig. Under the influence of this seminar (from 1886 on it was directed by S. Lie) Molien’s astronomical interests finally gave way to pure mathematics. Molien remained in Leipzig for two years, writing there (under Klein) his master’s thesis on elliptic functions, which he presented in October 1885 in Tartu [26].⁶

For the next 15 years Molien was a docent at Dorpat (soon afterwards renamed Yurjev⁷) University, teaching in a huge variety of fields. Among them were courses new for Tartu, e.g. on quaternions and other hypercomplex numbers, lectures on Gauss’s theory of the division of the circle, etc. All this time Molien kept in contact with the Leipzig seminar. And so it happened that he was among the very few who knew of W. Killing’s work on simple Lie algebras⁸ and, with E. Study and F. Engel, he considered Killing’s theory as a paradigm for his own investigations of hypercomplex numbers. His thesis advisor was Friedrich Schur, who had worked in Leipzig with S. Lie during the time when Killing was working on the structure of semisimple Lie algebras. Using this paradigm, Molien succeeded in solving some problems (the corresponding paper in Mathematische Annalen appeared in 1892 [28]). In September of the same year he presented these results as a doctoral dissertation at Tartu. Molien’s results on hypercomplex numbers were quickly appreciated by the experts: they were included in S. Lie’s monograph, and two years later Molien also received the Ch. Hermite Gold Medal from the Paris Academy of Sciences. Let us stop our story for a moment and glance at some mathematical details.

³ Editors’ Note. The symbol [GAP!] is used where a portion of the text has, regretfully, been lost in the process of Xeroxing.
⁴ Editor’s Note. According to Kanunov [12, p.7], the great-grandfather, Johan Molien, moved to Livonia from Göteborg in Sweden in 1751. He came to live in a small town near Reval/Tallinn. Kanunov says “mestechko”.
⁵ Editor’s Note. His full name reads Theodor Georg Andreas.
⁶ Editor’s Note. After his return to Sweden, Anders Lindstedt was a Professor of Mathematics and Theoretical Mechanics at the Royal Institute of Technology (KTH), Stockholm, in 1886–1909 and also the Rector of this school 1902–1909. He also had several other assignments as a civil servant. He died in 1939.
⁷ Translator’s Note. After the Christian name of the Kiev king Yaroslav the Wise, who in 1030 during a short campaign founded here a small town Yurjev, as indicated in a Russian chronicle. It was recaptured by the Estonians about 1060.
⁸ Quite recently (Mathematical Intelligencer 11, no. 3, 1989) Killing’s papers in the “Mathematische Annalen” were characterized by John Coleman as the greatest mathematical papers of all times – only the Elementa of Euclid and Newton’s Principia does he consider [to] have been more influential. Really, Wilhelm Killing had discovered the entire theory of simple Lie groups, i.e. what is now called Coxeter groups, Weyl groups, Dynkin diagrams . . . Slowly then, beginning [GAP!].
The successful experience of Gauss in number theory had at least two consequences. First, the theory of algebraic numbers was created (E. Kummer, L. Kronecker, R. Dedekind, to name only a very few!). Second, there followed quaternions and biquaternions by W. Hamilton in 1843, and matrices by A. Cayley in 1855. Then (1884) J. Sylvester noticed the possibility
$$(a_{ij}) = \sum_{i,j} a_{ij} E_{ij} \quad\text{with}\quad E_{ij} \cdot E_{k\ell} = \delta_{jk} E_{i\ell}.$$
From here it is only a small step to “n-ary numbers” and to Dedekind’s extraction of the “hypercomplex aspect” of all these new tools. Thus the way was opened to a general theory of finite-dimensional associative algebras. Among the first general results: Karl Weierstrass proved that 3-dimensional numbers do not exist, i.e. that there are no 3-dimensional $\mathbb{R}$-algebras without zero divisors, and Frobenius’ theorem was proved. For people connected with Lie’s seminar in Leipzig a turning point in the story was provided by the following remark of H. Poincaré (1884): the multiplication of n-ary numbers, $(\sum x_i e_i)(\sum y_i e_i) = \sum z_i e_i$, is given by equations $z_i = \varphi_i(x_1, \dots, x_n; y_1, \dots, y_n)$ that determine a Lie group. This observation was pursued by Scheffers, Study and others, and their understanding of it, together with W. Killing’s penetrating results and notions, was taken by Molien as a paradigm for his investigation of associative $\mathbb{C}$-algebras. A finite-dimensional algebra is said to be simple if it has no non-trivial two-sided ideals, and semisimple if its only nilpotent ideal (the radical) is zero; nilpotency of an ideal means the existence of $m \in \mathbb{N}$ such that any product with $\ge m$ factors is 0. According to Molien, every semisimple $\mathbb{C}$-algebra is isomorphic to a direct sum of simple $\mathbb{C}$-algebras. Moreover, for every such simple component $S_i$ there exists $n_i \in \mathbb{N}$ such that $S_i \cong M_{n_i}(\mathbb{C})$. Specializing these results to the case where the basis $\{e_1, \dots, e_n\}$ is a group led Molien to many results on group representations. As pointed out by Hawkins and Gustafson, Molien was the first to discover the “hypercomplex aspect” of this theory. Some details in this story deserve special attention; they are to be provided later.

Five years later one of several graduates of the prestigious École Normale Supérieure, encouraged by Picard, Darboux, Poincaré . . . , and having already made rigorous sense out of W. Killing’s (1888–1890) papers on semisimple Lie algebras (dissertation 1894), entered into the story. Élie Cartan was the man who clarified the notions of radical and of simple and semisimple algebras and further proved the uniqueness of the decomposition; his corresponding report appeared in 1897. He also considered the $\mathbb{R}$-case. To finish our story: in 1907, Joseph Wedderburn generalized the theory to any field $k$ (instead of $\mathbb{C}$ or $\mathbb{R}$). In this general situation the simple algebras are, as before, full matrix algebras, although now not over $k$ itself but over a suitable division $k$-algebra. In the case $k = \mathbb{R}$ there are only three division $k$-algebras: $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{H}$; this fact is known as Frobenius’ theorem. As Wedderburn returned to the Peirce approach via idempotents, and this approach culminated in E. Noether’s paper, the style of which became standard in algebra for a long time, Molien’s name fell into oblivion for at least 50 years. This happened despite the fact that E. Noether herself highly respected both Molien’s and Cartan’s contributions. Perhaps one of the reasons was also that [GAP!].
There is no possibility to go into further details here. Perhaps Karen Parshall’s report [18] on Joseph Wedderburn deserves your attention; one third of it is devoted to the history of the Molien-Cartan results. The same goes, of course, for Thomas Hawkins’s brilliant papers [11, 12] on the Hesse principle, Cayley’s counting principle and other topics in this “Lie field”. After these comments let us continue our account of Molien. During the years following 1892 he simplified some proofs in his Mathematische Annalen paper and published three further papers (two of them in a local journal) on finite substitution groups, using his theory of algebras. Quite quickly, Frobenius underlined the importance of Molien’s work on group representations. Nevertheless, Molien remained a docent at Yurjev University until the very beginning of the 20th century (1900). As an example of the arguments raised against him when he applied for a professorship [e.g. at Kharkov University], the Committee (Lyapunov, Struve, Steklov, Koval’skiĭ) declared: “. . . we have not been able to gather an independent opinion about the degree of originality of Molien’s work, as it lies far away from the mainstream of mathematical thought, and so the Committee knows these matters only superficially. This new-born theory of algebras seems to be a complicated and artificial construction motivated by the pure desire to generalize usual numbers, and therefore it cannot be justified properly . . . ” So it happened that Molien was forced to accept an offer from the Tomsk Technological Institute (in Siberia). Probably this was to some extent due to the fact that a friend of Molien lived there, a certain P. Kadik,⁹ who had studied together with Molien in Tartu and who, after having obtained a master’s degree at Tartu University in 1885, had settled in Tomsk. He taught at the Gymnasium and had corresponded with Molien all these years. In Tomsk Molien worked until his death in 1941.
He set the standards for many mathematical courses and wrote a series of lecture notes (differential calculus; differential equations; geometry): in the period 1902-1909 he published notes from 12 such courses. He was the first Professor of Mathematics in Siberia. Although he was highly esteemed by both students and colleagues, he was forced to retire in 1911. During the next three years nobody knew that there existed a circular about giving him the “Emeritus” – it was well hidden somewhere in the Russian Ministry of Education. And so he was not allowed by the officials to teach at the Institute. So Molien gathered a mathematical seminar outside the Institute, in which most of the Tomsk mathematicians participated. He also gave some survey lectures on algebra and arithmetic for teachers in Ufa, and lectured at the higher women’s courses in Tomsk. In 1917 the mathematical faculty at Tomsk University was opened,¹⁰ and his colleagues from the Institute days called him to return as Professor at this new University. From then on, for more than 20 years, almost all mathematics students participated in Molien’s seminars, which most often were devoted to elliptic functions and to the theory of surfaces. He had many postgraduate students (on Lie algebras, on minimal surfaces, on function theory), and he is also viewed as the founder of the Tomsk school of differential geometry. He did not stop his efforts to continue working in algebra despite his very intensive pedagogical work in other fields. For instance, in 1930 he published a note where he gave an example of a transcendental equation having an algebraic number as one of its
⁹ Editor’s Note. Maybe Peteris Kadikis (1857-1923), Latvian, who studied mathematics in Tartu and was a private docent there.
¹⁰ Note by Aleksandr Zubkov. Tomsk University itself was founded in 1879. In 2004 they celebrated their 125th anniversary.
3. Th. Molien’s life and mathematical work
roots but not all conjugates of this “algebraic” root are roots of the equation. In 1935 he attempted to do some systematic work in the theory of algebras. In his last years he was very interested in hypergeometric series – he left an almost finished manuscript giving a systematic survey of the theory. There are also almost finished papers about Galois groups: there he wants to find linear groups with [GAP!] a given Galois group is contained [GAP!]. Furthermore, there are almost finished methodological notes on Lobachevsky’s views in Geometry, Cremona transformations . . . There are lecture notes, e.g. notes on the theory of elliptic functions (from the time of the Klein-Lie seminars) and notes on the history of mathematics. There are reprints of Hurwitz, Dehn, Klein, Kneser, Kronecker, Minkowski, Study, Frobenius, Schur, Engel and others, and letters from Hurwitz, Klein, A. Kneser, Frobenius, I. Schur, Struve. All these and other things of the Molien Archive were left by his daughter Eliza to her blind student V. D. Fatneva, a Latinist. What will be the further fate of this heritage? In Siberia Molien has not been forgotten. In 1986, his bas-relief was put on the house in Nikitin Street where he had lived. His portraits also hang at Tomsk University. The well-known Russian algebraist A. Mal’cev designated Molien as the first professor and the patriarch of Siberian mathematics in the pre-war period. Recently, Professor Leonid Bokut, the leader of a well-known ring-theory school in Siberia (Efim I. Zelmanov and A.R. Kemer were [among] his postgraduate students) visited Tartu, and he declared, in his talk, that Th. Molien should be considered as the first real classic in the field of Algebra in the Russian Empire of that time. Sources for further details: a booklet in Russian with comments on Molien’s dissertation by Kanunov (a graduate of Tomsk University) [14] and, similarly, a booklet with Russian translations of his main (1892 and 1897) papers [16].
3.2. Molien’s 1897 papers on group rings and invariants
To get a 3-dimensional $\mathbb{R}$-space with basis $G = \{x, y, z\}$ we take all formal $\mathbb{R}$-combinations $\alpha_x x + \alpha_y y + \alpha_z z$. Similarly, if $|G| > 3$ and, instead of $\mathbb{R}$, there is any field $K$, we get the space $V(G, K) = \{\alpha \mid \alpha = \sum_{g\in G} \alpha_g g\}$. If $G$ is a group, then its multiplication can be extended (distributively) to $V(G, K)$. In this way we obtain the group ring $KG$. The elements of $KG$, i.e. the formal sums $\sum_{g\in G} \alpha_g g$, can be interpreted as mappings $\alpha : G \to K$ with finite support, $\alpha(g) \overset{\text{def}}{=} \alpha_g$; their multiplication in $KG$ is called convolution. Quite often E. Noether is considered to be the only creator of group algebras. Nevertheless, A. Cayley (in 1854) dealt with the ring $\mathbb{C}[S_3]$. A look at Molien’s paper (1897) on invariants of substitution groups shows that some first fundamental results in this field are due to him. The genesis of the notion of “group algebra” can be illustrated by the diagram in Figure 1.¹¹ More is true: in 1895–97 Molien discovered (independently of G. Frobenius) the basic facts in group representation theory. He was formally motivated by
¹¹ Editor’s note. Part of this chart is missing in the Xerox version. We are, however, convinced that it must be Euler that points to Hamilton; in 1770 he gave a parametrization of rotations in $\mathbb{R}^3$, which can be interpreted in terms of quaternions. This has led some authors to view Euler as a forerunner of Hamilton: if $a$ is a quaternion of unit length, one associates to it an orthogonal transformation given by $x \mapsto a^{-1}xa$ (see e.g. [3, p. 3-4]). Another predecessor of Hamilton was C. F. Gauss. In posthumous work (cf. [10, especially p. 358]), he parameterized rotations with the aid of 4-tuples $x = (x_0, x_1, x_2, x_3) \in \mathbb{R}^4$. If there are two rotations corresponding to $x$ and $y$ respectively, and $z$ corresponds to their composition, he wrote down
[Figure 1: a diagram of arrows connecting the following nodes. Algebraic equations, Galois Theory (Lagrange, Gauss, Abel, Galois); Parametrization of rotations (Euler, Gauss); Z[i] (Gauss); H (Hamilton); Substitution groups (Cauchy, Galois, Jordan); Matrix algebras (Cayley, Sylvester); Algebraic numbers (Kummer, Kronecker, Dedekind); Lie Theory (Poincaré, Killing); Vector spaces (Hamilton, Grassmann); Group Representation Theory (Frobenius, Molien, I. Schur, Noether); Hypercomplex numbers (Dedekind, Peirce, Noether); Group algebras (Cayley, Molien, Noether).]
Fig. 1: Genesis of the notion “group algebra”
the problem of determining the representation of minimal degree for a group. This problem had been suggested by F. Klein’s attempt to generalize Galois theory. The main step in Molien’s approach can be well illustrated by having a look at the problem of studying group determinants – the formal motivation for G. Frobenius. Let $G$ be a finite group, $|G| = n$, and let $\{x_g \mid g \in G\}$ be $n$ independent variables (over $\mathbb{C}$). Frobenius’ theory of representations of finite groups in its historical context was concerned with factorization of the group determinant $D_G \overset{\text{def}}{=} \det(x_{h^{-1}u})$, with $h, u \in G$, viewed as a polynomial in $\mathbb{C}[x_g \mid g \in G]$. Take the field $K = \mathbb{C}(x_g \mid g \in G)$ of rational functions over $\mathbb{C}$. The group algebra $KG$ can be viewed as a $K$-space $V$ with all elements of $G$ as its basis. Right multiplication by the element $X_G \overset{\text{def}}{=} \sum_{g\in G} x_g g$ on $KG$ gives an endomorphism of $V$ with matrix $(x_{h^{-1}u})$:
$$h \in G \longmapsto h \cdot X_G = h \cdot \sum_{g\in G} x_g g = \sum_{g\in G} x_g (hg) = \sum_{u\in G} x_{h^{-1}u}\, u,$$
where $u = hg$, so that $g = h^{-1}u$. We see that $h \longmapsto h \cdot X_G = \sum_{u\in G} x_{h^{-1}u}\, u$ again is a $K$-linear combination of all basis elements $u \in G$, so the matrix of this $K$-endomorphism is $(x_{h^{-1}u})$. As a result, the group determinant $D_G$ is interpreted as the determinant of the endomorphism of $V$ given by right multiplication by $X_G$ on $KG$. As $\operatorname{char} K = 0$ (N.B. $K = \mathbb{C}(x_g \mid g \in G)$), the group ring $KG$ is known to be semisimple. Therefore it is isomorphic as a $K$-algebra to the direct product of full matrix algebras (over $K = \mathbb{C}(x_g)$):
$$\psi = (\psi_1, \dots, \psi_s) : KG \longrightarrow M_{n_1}(K) \times \dots \times M_{n_s}(K) \qquad (85)$$
[Footnote 11, continued] the components of $z$ in terms of the ones of $x$ and $y$, which again corresponds to the multiplication of the corresponding quaternions.
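The construction of the matrix $(x_{h^{-1}u})$ can be tried out on the smallest interesting case. The following is a minimal Python sketch, not taken from the text, for the cyclic group of order 3 written additively, so that $h^{-1}u$ becomes $(u - h) \bmod 3$. It relies on the standard fact that for an Abelian group all blocks in (85) have size 1, so the group determinant splits into linear factors, one for each character. The sample values for $x_0, x_1, x_2$ and all function names are illustrative assumptions.

```python
import cmath
from itertools import permutations

def det(m):
    """Determinant by the Leibniz formula (adequate for very small matrices)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        # Sign of the permutation from its number of inversions.
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total

def group_matrix(x):
    """Matrix (x_{h^{-1}u}) for Z/n written additively: h^{-1}u is (u - h) mod n."""
    n = len(x)
    return [[x[(u - h) % n] for u in range(n)] for h in range(n)]

x = [2, 3, 5]                       # hypothetical sample values for x_0, x_1, x_2
d_group = det(group_matrix(x))      # the group determinant D_G at these values

# One linear factor per character of Z/3: sum_g w^(j*g) * x_g for j = 0, 1, 2.
w = cmath.exp(2j * cmath.pi / 3)
d_factored = 1
for j in range(3):
    d_factored *= sum(w ** (j * g) * x[g] for g in range(3))

print(d_group, round(d_factored.real, 9))
```

Both numbers agree up to floating-point rounding, which is the Abelian special case of the factorization discussed in this subsection.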
Every element $Y \in M_n(\mathbb{C})$ corresponds to an endomorphism of row spaces $\mathbb{C}^n \to \mathbb{C}^n$, $(z_1, \dots, z_n) \mapsto (z_1, \dots, z_n)Y$. It follows that the endomorphism $M_n(\mathbb{C}) \to M_n(\mathbb{C})$, given by right multiplication by $Y$ on $M_n(\mathbb{C})$, has the determinant $(\det Y)^n$. Indeed, as a right $M_n(\mathbb{C})$-module, $M_n(\mathbb{C})$ is isomorphic to $\underbrace{\mathbb{C}^n \oplus \dots \oplus \mathbb{C}^n}_{n}$, and so right multiplication by $Y$ on $M_n(\mathbb{C})$ can be viewed as an endomorphism $\hat{Y}$ of $\mathbb{C}^n \oplus \dots \oplus \mathbb{C}^n$ with the matrix
$$\begin{pmatrix} Y & 0 & \cdots & 0 \\ 0 & Y & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Y \end{pmatrix}$$
with the determinant $(\det Y)^n$. Using this result together with formula (85) we get the following formula:
$$D_G = \bigl(\det \psi_1(X_G)\bigr)^{n_1} \cdots \bigl(\det \psi_s(X_G)\bigr)^{n_s}.$$
It appears that this is the complete factorization of $D_G$ in $\mathbb{C}(x_g)$ and so we have obtained a solution to Frobenius’ question [9]. All that has been said above is true for any $k$ instead of $\mathbb{C}$. K. Johnson (1988) raised the question (in a combinatorial context) whether the group determinant determines the group $G$. It was proved recently by E. Formanek and D. Sibley [8] that this is indeed true in the nonmodular case, i.e. if $\operatorname{char} k \nmid |G|$. More precisely, they established the following.
THEOREM 3.1. If $G$ and $H$ are finite groups, $\operatorname{char} k \nmid |G|$, and $\varphi : G \to H$ is a bijection (of them as sets!) such that $\hat{\varphi}(D_G) = D_H$ for $\hat{\varphi}(x_g) = x_{\varphi(g)}$, then $G \cong H$ as groups.
Next, we are going to give some details about Molien’s formula, another remarkable result in his 1897 paper [32]. Take the polynomial ring $R = \mathbb{C}[x_1, \dots, x_n]$ and, viewing it as a $\mathbb{C}$-space, present it in the form $R = \bigoplus_{i=0}^{\infty} R_i$, where $R_i$ is the subspace of all homogeneous polynomials (forms) of degree $i$, $i = 0, 1, 2, \dots$. The subspace $V = R_1$ of linear forms has $x_1, \dots, x_n$ as its basis, and thus is $n$-dimensional. Fix any finite subgroup $G \le GL(V)$ in the group $GL(V)$ of all $\mathbb{C}$-linear automorphisms of $V$. An action of $G$ on $R$ is induced by the formula $f^A(x) \overset{\text{def}}{=} f(xA)$, $x = (x_1, \dots, x_n)$, $A = (a_{ij})$. This yields the subalgebra of $G$-invariants in $R$, $R^G = \{f \in R \mid \forall A \in G,\ f^A = f\}$, with the homogeneous components $R_i^G = R^G \cap R_i$. Substantial information about the subalgebra $R^G$ is given by the formal series
$$M_G \overset{\text{def}}{=} \sum_{i \ge 0} (\dim_{\mathbb{C}} R_i^G)\, t^i,$$
called the Hilbert–Poincaré series of $R^G$, sometimes also its Molien series. Indeed, this series is a rational function in $t$, and Molien proved (1897) the following theorem.
THEOREM 3.2. Let $R = \mathbb{C}[x_1, \dots, x_n]$ and $G \le M_n(\mathbb{C})$ as above, and let $G = \{A_1, \dots, A_g\}$ be all its elements. Then the generating function for the numbers $\dim_{\mathbb{C}} R_i^G$ of linearly $\mathbb{C}$-independent invariant $i$-forms is given by
$$M_G(t) = \frac{1}{g} \sum_{\alpha=1}^{g} \frac{1}{\det(I - tA_\alpha)}. \qquad (86)$$
EXAMPLE 3.1. For
$$G = C_2 = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \right\}$$
we have $R^G = \mathbb{C}[x_1, x_2]^G = \mathbb{C}[x_1^2, x_2^2] \oplus x_1 x_2\, \mathbb{C}[x_1^2, x_2^2]$ and
$$M_G(t) = \frac{1}{2}\left( \frac{1}{(1-t)^2} + \frac{1}{(1+t)^2} \right) = \frac{1+t^2}{(1-t^2)^2}.$$
EXAMPLE 3.2. For $R = \mathbb{C}[x_1, x_2, x_3]$ and $\mathcal{G} = \langle G, H \rangle$ with $G = \operatorname{diag}(-1, -1, -1)$ and $H = \operatorname{diag}(1, 1, i)$, $i^2 = -1$ (here $\operatorname{diag}(a, b, c)$ denotes the diagonal $3 \times 3$ matrix with diagonal entries $a, b, c$), we have that $|\mathcal{G}| = 8$ and that $\mathcal{G}$ is Abelian,
$$\mathcal{G} = \{ \operatorname{diag}(1,1,1),\ \operatorname{diag}(1,1,i),\ \operatorname{diag}(1,1,-1),\ \operatorname{diag}(1,1,-i),\ \operatorname{diag}(-1,-1,1),\ \operatorname{diag}(-1,-1,i),\ \operatorname{diag}(-1,-1,-1),\ \operatorname{diag}(-1,-1,-i) \},$$
and $R^{\mathcal{G}} = \mathbb{C}[x_1^2, x_2^2, x_3^4](1 \oplus x_1 x_2)$. According to (86) we get (R. Stanley [??])
$$M_{\mathcal{G}}(t) = \frac{1}{8}\Bigl( \frac{1}{(1-t)^3} + \frac{1}{(1-t)^2(1-it)} + \frac{1}{(1-t)^2(1+t)} + \frac{1}{(1-t)^2(1+it)} + \frac{1}{(1+t)^2(1-t)} + \frac{1}{(1+t)^2(1-it)} + \frac{1}{(1+t)^3} + \frac{1}{(1+t)^2(1+it)} \Bigr) = \frac{1}{(1-t^2)^3}.$$
REMARK 3.3. Two nonisomorphic groups can have the same Molien series: e.g. the dihedral group $D_4$ and the Abelian group $C_2 \times C_4$ both have the series
$$M_{D_4}(t) = \frac{1}{(1-t^2)(1-t^4)} = M_{C_2 \times C_4}(t).$$
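Formula (86) is easy to check by machine for groups of diagonal matrices such as those in Examples 3.1 and 3.2. The sketch below is an illustration under stated assumptions, not Molien's or Stanley's own computation: each group element is stored only as its tuple of diagonal entries, the series $1/\det(I - tA)$ is expanded as a product of geometric series, and the result is compared with a brute-force count of invariant monomials (for a diagonal group the invariant monomials span the invariants). All function names are hypothetical.

```python
from itertools import product

def inverse_det_series(diag, order):
    """Coefficients up to t^order of 1/det(I - tA) for A = diag(lam_1, ..., lam_n)."""
    series = [1] + [0] * order
    for lam in diag:
        # Multiply the series in place by 1/(1 - lam*t).
        for k in range(1, order + 1):
            series[k] += lam * series[k - 1]
    return series

def molien(group, order):
    """Average of the series 1/det(I - tA) over the group, as in formula (86)."""
    total = [0] * (order + 1)
    for diag in group:
        for k, c in enumerate(inverse_det_series(diag, order)):
            total[k] += c
    return [round(complex(c / len(group)).real) for c in total]

def invariant_monomial_counts(group, order):
    """Number of invariant monomials in each degree 0..order (diagonal groups only)."""
    n = len(group[0])
    counts = [0] * (order + 1)
    for a in product(range(order + 1), repeat=n):
        if sum(a) > order:
            continue
        fixed = True
        for diag in group:
            weight = 1
            for lam, e in zip(diag, a):
                weight *= lam ** e
            if abs(weight - 1) > 1e-9:
                fixed = False
                break
        if fixed:
            counts[sum(a)] += 1
    return counts

C2 = [(1, 1), (-1, -1)]                                    # Example 3.1
G8 = [(e, e, 1j ** k) for e in (1, -1) for k in range(4)]  # Example 3.2

print(molien(C2, 6), invariant_monomial_counts(C2, 6))
print(molien(G8, 6), invariant_monomial_counts(G8, 6))
```

The restriction to diagonal matrices keeps the sketch short; a general finite matrix group would need a determinant or characteristic-polynomial routine in place of `inverse_det_series`.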
□
For any polynomial $f(x)$, its mean
$$\tilde{f}(x) = \frac{1}{g} \sum_{\alpha=1}^{g} f(xA_\alpha)$$
is $G$-invariant. It is clear that, generally, any symmetric expression in the polynomials $f(xA_1), \dots, f(xA_g)$ is again a $G$-invariant. There exists a finite polynomial basis for $R^G$, i.e. a set of $G$-invariants $f_1, \dots, f_\ell$, $\ell \ge n$, such that any $G$-invariant $f$ can be written as a polynomial in $f_1, \dots, f_\ell$. Then there are, of course, polynomial equations relating $f_1, \dots, f_\ell$, called syzygies.¹² E.g., $f_1 = x_1^2$, $f_2 = x_1 x_2$, $f_3 = x_2^2$ form a polynomial basis for $\mathbb{C}[x_1, x_2]^{C_2}$ with the syzygy $f_1 f_3 - f_2^2 = 0$. The existence of a polynomial basis and a method for finding one are given by the following.
THEOREM 3.4 (E. Noether [17]). The ring of invariants $R^G = \mathbb{C}[x_1, \dots, x_n]^G$ for $G \le M_n(\mathbb{C})$ has a normal polynomial (or integrity) basis with not more than $\binom{n+g}{n}$ invariants in it, their degrees not exceeding $g$, $g = |G|$. Such a polynomial basis may be obtained by averaging over $G$ all monomials $x_1^{a_1} \cdots x_n^{a_n}$ with $\sum_i a_i \le g$, i.e. all monomials of degree at most $g$.
Among polynomial bases the most important are the so-called “good polynomial bases”. It is not hard to prove that there always exist $n$ algebraically independent $G$-invariants. A good polynomial basis for $R^G$ consists of homogeneous $G$-invariants $f_1, \dots, f_\ell$ ($\ell \ge n$) where: (1) $f_1, \dots, f_n$ are algebraically independent, and, furthermore, (2) we have
$$R^G = \begin{cases} \mathbb{C}[f_1, \dots, f_n], & \text{if } \ell = n; \\ \mathbb{C}[f_1, \dots, f_n] \oplus f_{n+1}\mathbb{C}[f_1, \dots, f_n] \oplus \dots \oplus f_\ell\, \mathbb{C}[f_1, \dots, f_n], & \text{if } \ell > n. \end{cases}$$
In other words, any $G$-invariant can be written as a polynomial in ([GAP!] $\ell > n$) as such a polynomial (in [GAP!]). This means that $f_1, \dots, f_n$ are “free invariants” in the sense that they can be used as often as needed, while $f_{n+1}, \dots, f_\ell$ are “transient” and can be used at most once. It is interesting to point to the following theorem proved by M. Hochster and J. Eagon [13] (1971), and independently by E. Dade [4] (1964).
THEOREM 3.5. Any finite group $G$ has a good polynomial basis of invariants.
For this good polynomial basis the syzygies are given by a simple rule:
• if $\ell = n$, then there are no syzygies;
• if $\ell > n$, then there are $(\ell - n)^2$ syzygies, which express the products $f_i f_j$ ($i > n$, $j > n$) in terms of $f_1, \dots, f_\ell$.
¹² Editor’s note. The word “syzygy” was, in this mathematical context, apparently first used by David Hilbert. Etymology: from Latin syzygia, Greek συζυγια, yoked together, in turn from συν, together, and ζυγoν, yoke, the last word appearing as a loan in many languages, not only Indo-European ones, such as English, German, Estonian, Finnish, Russian.
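The averaging procedure of Theorem 3.4 and the syzygy $f_1 f_3 - f_2^2 = 0$ can be illustrated on the group $C_2 = \{I, -I\}$. The sketch below is a simplified illustration that assumes a group of diagonal matrices, so that averaging a monomial only rescales it; polynomials are represented as dictionaries from exponent tuples to coefficients. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

C2 = [(1, 1), (-1, -1)]   # diagonal entries of the two matrices I and -I

def average_coefficient(exponents, group):
    """Coefficient c such that the mean of the monomial x^a over the group is c * x^a."""
    total = Fraction(0)
    for diag in group:
        c = Fraction(1)
        for lam, a in zip(diag, exponents):
            c *= Fraction(lam) ** a
        total += c
    return total / len(group)

def mul(p, q):
    """Product of two polynomials given as {exponent tuple: coefficient}."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            e = tuple(i + j for i, j in zip(a, b))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def sub(p, q):
    """Difference p - q of two polynomials in the same representation."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) - c
    return {e: c for e, c in out.items() if c != 0}

g, n = len(C2), 2
monomials = [a for a in product(range(g + 1), repeat=n) if sum(a) <= g]
bound = comb(n + g, n)                       # Noether's bound on the basis size
survivors = sorted(a for a in monomials if average_coefficient(a, C2) != 0)

f1, f2, f3 = {(2, 0): 1}, {(1, 1): 1}, {(0, 2): 1}
syzygy = sub(mul(f1, f3), mul(f2, f2))       # f1*f3 - f2^2

print(len(monomials), bound, survivors, syzygy)
```

The monomials of degree at most 2 whose average does not vanish are $1$, $x_2^2$, $x_1 x_2$ and $x_1^2$, and the empty dictionary returned for `syzygy` is the zero polynomial.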
Let the degrees of a good polynomial basis be known for $R^G$: $n_1 \overset{\text{def}}{=} \deg f_1, \dots, n_\ell \overset{\text{def}}{=} \deg f_\ell$. Then the Molien series of $R^G$ is given by
$$M_G(t) = \begin{cases} \dfrac{1}{\prod_{i=1}^{n} (1 - t^{n_i})}, & \text{if } \ell = n; \\[2ex] \dfrac{1 + \sum_{j=n+1}^{\ell} t^{n_j}}{\prod_{i=1}^{n} (1 - t^{n_i})}, & \text{if } \ell > n. \end{cases} \qquad (87)$$
(These formulae can be verified by expanding the right hand sides in powers of $t$ and then comparing with $R^G = \mathbb{C}[f_1, \dots, f_n] \oplus f_{n+1}\mathbb{C}[f_1, \dots, f_n] \oplus \dots \oplus f_\ell\, \mathbb{C}[f_1, \dots, f_n]$, if $\ell > n$.)
EXAMPLE 3.3. Let
$$G = C_2 = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \right\}$$
be our group, i.e. we take the cyclic group of order 2. Its homogeneous invariants are $f_1 = x_1^2$, $f_2 = x_2^2$ and $f_3 = x_1 x_2$. One sees that this is a good polynomial basis with $n_1 = n_2 = n_3 = 2$. So we have $R^G = R^{C_2} = \mathbb{C}[x_1^2, x_2^2] \oplus x_1 x_2\, \mathbb{C}[x_1^2, x_2^2]$. This means that any $C_2$-invariant can be written uniquely as a polynomial in $x_1^2$ and $x_2^2$ plus (perhaps!) $x_1 x_2$ times another such polynomial. Here $\ell = 3 > 2 = n$, so by (87)
$$M_{C_2}(t) = \frac{1 + t^2}{(1 - t^2)(1 - t^2)} = \frac{1 + t^2}{(1 - t^2)^2}.$$
There is the single syzygy $x_1^2 \cdot x_2^2 = (x_1 x_2)^2$.
REMARK 3.6. At the same time, the polynomials of the above example taken in a different order, $x_1^2, x_1 x_2, x_2^2$, do not give a good polynomial basis! It suffices to notice that $R^G \ni x_2^4 \notin \mathbb{C}[x_1^2, x_1 x_2] \oplus x_2^2\, \mathbb{C}[x_1^2, x_1 x_2]$. □
REMARK 3.7. As a consequence of the above Hochster-Eagon-Dade theorem, for any finite $G$, its Molien series can be put in the form (87), as there exists a good polynomial basis whose degrees match the powers of $t$ in (87). □
REMARK 3.8. However, the converse to (2) is, in general, not true. Indeed, if we take the group $G = \langle \operatorname{diag}(-1, -1, -1),\ \operatorname{diag}(1, 1, i) \rangle$ of diagonal $3 \times 3$ matrices, then it has the Molien series
$$M_G(t) = \frac{1}{(1 - t^2)^3}, \qquad (88)$$
which, by multiplication of both denominator and numerator by $1 + t^2$, can also be written as
$$M_G(t) = \frac{1 + t^2}{(1 - t^2)^2 (1 - t^4)}. \qquad (89)$$
As seen above, there exists a good basis corresponding to $M_G(t)$ in the form (89), which gives us
$$\mathbb{C}[x_1, x_2, x_3]^G = \mathbb{C}[x_1^2, x_2^2, x_3^4] \oplus x_1 x_2\, \mathbb{C}[x_1^2, x_2^2, x_3^4],$$
but none corresponding to the form (88): the only invariants of degree 2 are spanned by $x_1^2$, $x_1 x_2$ and $x_2^2$, which are algebraically dependent. It is a question of N. J. Sloane (1977): to which forms of $M_G(t)$ do there correspond good polynomial bases, and to which not? There are old results by Shephard-Todd (1954), but, in general, it seems to be open. □
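The remark in the text that (87) can be verified by expanding in powers of $t$ lends itself to a short computation. The sketch below is an illustration with hypothetical names: it expands the right-hand side of (87) from a list of "free" degrees and a list of "transient" degrees, reproduces the series of Example 3.3, and confirms that the two degree patterns behind (88) and (89) give the same power series, so that the series alone cannot tell them apart.

```python
def series_from_degrees(free, transient, order):
    """Coefficients up to t^order of (1 + sum_j t^{m_j}) / prod_i (1 - t^{d_i}).

    `free` lists the degrees d_i of the algebraically independent invariants,
    `transient` lists the degrees m_j of the remaining basis elements.
    """
    coeffs = [0] * (order + 1)
    coeffs[0] = 1
    for m in transient:
        if m <= order:
            coeffs[m] += 1
    for d in free:
        # Dividing by (1 - t^d) amounts to c[k] += c[k - d] for increasing k.
        for k in range(d, order + 1):
            coeffs[k] += coeffs[k - d]
    return coeffs

example_3_3 = series_from_degrees([2, 2], [2], 6)     # (1 + t^2)/(1 - t^2)^2
form_88 = series_from_degrees([2, 2, 2], [], 12)      # 1/(1 - t^2)^3
form_89 = series_from_degrees([2, 2, 4], [2], 12)     # (1 + t^2)/((1 - t^2)^2 (1 - t^4))

print(example_3_3)
print(form_88 == form_89)
```

Agreement of the coefficients up to a finite order is of course only a numerical check of the identity obtained by multiplying numerator and denominator by $1 + t^2$.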
3.3. Molien type formulae in Combinatorics.¹³
During the past decades this old theme has been combined with new ones, so in order to gain greater coherence in understanding combinatorial and algebraic problems, I shall briefly give three such results.
3.3.1. Let $V = V_1 \oplus \dots \oplus V_n$, with all $\dim V_i = 1$ and $x_i$ as basis vector of $V_i$. Let $\mathcal{G} \le GL(V)$ be a finite subgroup such that for every $G \in \mathcal{G}$ there exists a $\pi_G \in S_n$ with $V_i G = V_{\pi_G(i)}$, $i = 1, 2, \dots, n$. In this case $\mathcal{G}$ is called a monomial group; it consists of monomial matrices, i.e. of matrices such that every line contains exactly one non-zero element of $\mathbb{C}$. For any cycle $C = (i_1, \dots, i_t)$ of $\pi_G$ we have $\{i_1, \dots, i_t\} \subseteq \mathbf{n}$, $\pi(i_k) = i_{k+1}$ (if $1 \le k \le t - 1$) and $\pi(i_t) = i_1$. Monomiality of $G$ means that there exist $\alpha_1, \dots, \alpha_t \in \mathbb{C}$ with $x_{i_k}^G = \alpha_k x_{i_{k+1}}$ (if $1 \le k \le t - 1$) and $x_{i_t}^G = \alpha_t x_{i_1}$. Put $\gamma_G(C) = \alpha_1 \cdots \alpha_t$. For any monomial $x_1^{a_1} \cdots x_n^{a_n}$ let $\tau$ be its type, the sequence $\tau = (\tau_1, \tau_2, \dots)$ with $\tau_i \overset{\text{def}}{=} \#\{a_k \mid a_k = i\}$. Next, take the subspace $R_\tau$ spanned by all monomials of type $\tau$. Then it is clear that every $G \in \mathcal{G}$ maps $R_\tau$ onto itself. Therefore, setting $R_\tau^{\mathcal{G}} = R^{\mathcal{G}} \cap R_\tau$, we see that $R^{\mathcal{G}} = \bigoplus_\tau R_\tau^{\mathcal{G}}$ is a grading of the $\mathbb{C}$-space $R^{\mathcal{G}}$, and that the following Molien-type formula is true:
$$P_{\mathcal{G}}(t) = \sum_\tau (\dim R_\tau^{\mathcal{G}})\, t^\tau = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_C \bigl(1 + \gamma_G(C)\, t_1^{|C|} + \gamma_G(C)^2\, t_2^{|C|} + \dots \bigr),$$
with $C$ here running over all cycles of the substitution $\pi_G$, and $t = (t_1, t_2, \dots)$ being indeterminates; $|C|$ denotes the length of the cycle $C$, and [GAP!]. See Stanley [???].
3.3.2. In the special case of the monomial matrices being permutation matrices, every element $G \in \mathcal{G}$ induces a substitution on the set $F$ of functions $f : \mathbf{n} \to \mathbb{N}$ by $f^G(i) = f(\pi_G(i))$. To every such function there corresponds the monomial $x^f = x_1^{f(1)} x_2^{f(2)} \cdots x_n^{f(n)}$, and so $\mathcal{G}$ acts on the $\mathbb{C}$-algebra $R = \mathbb{C}[x_1, \dots, x_n]$ by the formula $(x^f)^G = x^{f^G}$, where $f^G(i) = f(\pi_G(i))$ for all $i \in \mathbf{n}$. The orbits under this action are called $\mathcal{G}$-schemes; they are given by the following equivalence on $F$: $f \sim g \iff \exists\, G,\ g = f^G$. This means that the multisets $\{f(1), \dots, f(n)\}$ and $\{g(1), \dots, g(n)\}$ coincide; in particular, the monomials $x^f$ and $x^g$ have the same type. The main problem of the Redfield-Pólya theory can be described as the question of finding the number $d(\tau)$ of $\mathcal{G}$-schemes of a given
¹³ See Tambour [23] for other finite lattices and their automorphism groups (homogeneous – [GAP!] degrees).
type $\tau$. The answer is given by the following formula:
$$\sum_\tau d(\tau)\, t^\tau = P_{\mathcal{G}}(t) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_C \bigl(1 + t_1^{|C|} + t_2^{|C|} + \dots \bigr),$$
where $C$ runs over all cycles of $\pi_G$.
3.3.3. Torbjörn Tambour also recently published a paper on this topic [23] (1989). His problem is the following. Let $\mathcal{G}$ be a finite group acting on a finite set $S$. It induces an action on $P_k(S)$, the set of $k$-subsets of $S$: $\{s_1, \dots, s_k\}^G = \{\pi_G(s_1), \dots, \pi_G(s_k)\}$. Denote by $p_k$ the number of $\mathcal{G}$-orbits of this action. Tambour aims at finding $\sum_k p_k t^k$. [GAP!] generalize [GAP!] to other lattices.
PROBLEM. Finite vector spaces or equivalences of some others.
A function-theoretic interpretation of this question is possible. First, we interpret any $k$-subset of $\mathbf{n}$ as the image $\operatorname{Im} f$ of a suitable injection $f : \mathbf{k} \to \mathbf{n}$. The action of $\mathcal{G}$ on $\mathbf{n}$ induces an action of $\mathcal{G}$ on $P_k(\mathbf{n})$: $f \to f^G$ with the rule $f^G(i) = \pi_G(f(i))$ for all $i \in \mathbf{k}$. Suppose that $\operatorname{Im} f = (\operatorname{Im} f)\pi_G$; from this it follows that $\operatorname{Im} f$ is a (disjoint) union of cycles of $\pi_G$. The converse (“if $\operatorname{Im} f$ is a union of cycles of $\pi_G$”) is obvious. So it follows that
$$1 + \sum_{k \ge 1} i_k^G\, t^k = \prod_C (1 + t^{|C|}); \qquad (90)$$
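Formula (90) can be tried out on a small permutation group. The sketch below rests on two assumptions that are not spelled out in the surviving text: that $i_k^G$ denotes the number of $k$-subsets mapped onto themselves by $G$, and that the orbit numbers $p_k$ are then obtained by averaging over the group (the Cauchy-Frobenius, or Burnside, counting lemma). The group chosen, the rotations of a square, and all function names are illustrative.

```python
from itertools import combinations

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple with perm[i] = image of i."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        lengths.append(length)
    return lengths

def fixed_subset_poly(perm):
    """Coefficients of prod_C (1 + t^|C|); entry k counts k-subsets mapped onto themselves."""
    n = len(perm)
    poly = [1] + [0] * n
    for c in cycle_lengths(perm):
        new = poly[:]
        for k in range(c, n + 1):
            new[k] += poly[k - c]
        poly = new
    return poly

def orbit_counts(group):
    """p_0, ..., p_n obtained by averaging the polynomials above over the group."""
    n = len(group[0])
    total = [0] * (n + 1)
    for g in group:
        for k, c in enumerate(fixed_subset_poly(g)):
            total[k] += c
    return [t // len(group) for t in total]

def orbit_counts_brute(group):
    """The same numbers by listing the orbits of k-subsets directly."""
    n = len(group[0])
    counts = []
    for k in range(n + 1):
        orbits = {frozenset(frozenset(g[i] for i in s) for g in group)
                  for s in combinations(range(n), k)}
        counts.append(len(orbits))
    return counts

# Rotations of a square acting on its four vertices 0, 1, 2, 3.
C4 = [tuple((i + r) % 4 for i in range(4)) for r in range(4)]
print(orbit_counts(C4), orbit_counts_brute(C4))
```

For the square the two 2-subset orbits are the pairs of adjacent vertices and the pairs of opposite vertices.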
Fig. 15
$G = H + Hg_2 + Hg_3 + \dots$ or, compactly, $G = \bigcup_{g \in K} Hg$, where $K = \{e, g_2, g_3, \dots\}$ is a complete system of representatives, that is, a set to which belongs precisely one representative of each class. Let us have a look at the case when $G$ is a finite group. The number of its elements is called the order of the group and is denoted $|G|$. The number of distinct orbits is called the index of the subgroup $H$ in the group $G$ and is written $\operatorname{ind}_G H$ or, also, $(G : H)$.
THEOREM 4.1 (Lagrange). The order of a finite group is divisible by the order of each of its subgroups, the quotient being the index of the subgroup.
PROOF. We check first that the map $\varphi : H \to Hg$, given by $\varphi(h) = hg$, is one-to-one. This shows that all orbits have the same number of points. As the orbits fill out the entire group and do not intersect, we see, in view of the definition of index, that $|G| = |H| \cdot \operatorname{ind}_G H$, which proves the assertion of the theorem. □
Thus Lagrange’s theorem says that in a finite group the orders of its subgroups are divisors of its order. One can also ask if the converse is true: if a number $m$ divides the order of a finite group, is $m$ then the order of some subgroup? Simple examples indicate that such a “converse conjecture” is not true in general. However, it is true when $m$ is a prime. Even more, if $|G| = p^n \cdot r$, where $p$ is a prime number and $p$ and $r$ are relatively prime to each other, then $G$ contains subgroups of order $p, p^2, p^3, \dots, p^n$. This statement (together with a small addendum) is known as the First Theorem of Sylow. Isn’t it possible, in all the previous reasonings, to consider, instead of the [previous] so-called right cosets, left cosets $gH$, that is, sets $gH = \{gh \mid g \in G \text{ fixed};\ h \in H \text{ arbitrary}\}$? The Reader will easily work out this case (by a completely analogous argument) and as a result we obtain a “left hand picture” of the previous “right hand picture”. In the special case when the group operation is commutative, that is, when the equation $ab = ba$ holds true for arbitrary $a, b \in G$, the two pictures coincide.
In the general case, we have, of course, $Hg \ne gH$.
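The coset decomposition and Lagrange's theorem can be checked on the six substitutions of three symbols. In the following sketch, an illustration rather than anything taken from the text, a substitution is stored as a tuple `s` with `s[i]` the image of `i`, and the product is assumed to be read from left to right (apply the first factor first); the subgroup `H` and all names are hypothetical choices.

```python
from itertools import permutations

def compose(s, t):
    """Product s·t read left to right: apply s first, then t."""
    return tuple(t[s[i]] for i in range(len(s)))

S3 = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]          # subgroup {e, transposition of the symbols 0 and 1}

# Right cosets Hg and left cosets gH, each stored as a frozenset of elements.
right_cosets = {frozenset(compose(h, g) for h in H) for g in S3}
left_cosets = {frozenset(compose(g, h) for h in H) for g in S3}

index = len(right_cosets)
print(len(S3), len(H), index)
print(right_cosets == left_cosets)   # False: for this H the two pictures differ
```

Every coset has $|H| = 2$ elements, there are $3$ of them on either side, and $6 = 2 \cdot 3$ as the theorem predicts, while the right and left decompositions are different partitions of the group.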
4. Additional remarks on groups
Let there be given an arbitrary group $G$ and a subgroup $H$. We know that the group can be covered both by right cosets $Hg$ and by left cosets $gH$, that is,
$$G = \bigcup_{g \in K} Hg = \bigcup_{g \in K'} gH.$$
Here $K$ and $K'$ denote a complete system of representatives in the first and in the second case, respectively. In general, we have, of course, $K \ne K'$. We ask if it is possible to choose the representatives of the cosets so that $K = K'$, that is, so that the system of representatives of the right cosets is at the same time a system of representatives of the left cosets. In the general case, the answer is negative. However, G. Miller showed, in 1910, that such a choice is always possible if $H$ is a finite subgroup. The situation under view also occurs in some other cases. One of these will be described in the next subsection; we will not dwell on the remaining ones.⁵³
2. Let $G$ be a group and fix an arbitrary element $a \in G$. We define a selfmap $\sigma_a$ of $G$ by the formula $\sigma_a(g) = a^{-1}ga$. Then
$e \to e$, as $a^{-1}ea = a^{-1}a = e$;
$g_1 = g_2 \iff \sigma_a(g_1) = \sigma_a(g_2)$, as $a^{-1}g_1a = a^{-1}g_2a$ is equivalent to $g_1 = g_2$;
$\sigma_a(g_1 \cdot g_2) = \sigma_a(g_1)\sigma_a(g_2)$, as $a^{-1}(g_1g_2)a = a^{-1}g_1aa^{-1}g_2a = (a^{-1}g_1a) \cdot (a^{-1}g_2a)$.
We see that the unit element $e$ of $G$ is a fixed point of $\sigma_a$, and further also that $\sigma_a$ gives a one-to-one correspondence on $G$ and maps products of elements into products of the corresponding images. Evidently, one can carry out the construction $a \to \sigma_a$ for all $a \in G$, so $\sigma_a$ gives us a so-called inner automorphism of $G$. An important role in group theory is played by so-called invariant subgroups or normal divisors. These are subgroups $N \subseteq G$ with the property that $\sigma_a(N) \subseteq N$ holds for all inner automorphisms $\sigma_a$ of $G$. In other words, a subgroup $N \subseteq G$ is a normal divisor if and only if for all $n \in N$ we have that $a \in G \implies a^{-1}na \in N$. It is easy to see that the above definition is equivalent to the statement that $aN = Na$ for all $a \in G$, that is, left and right cosets with respect to $N$ coincide. The fact that a subgroup $N$ is a normal divisor in the group $G$ will be written as $N \trianglelefteq G$.
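Both notions of this passage, a common system of representatives and a normal divisor, can be explored by brute force in the group of substitutions of three symbols. The sketch below is only an illustration: substitutions are tuples `s` with `s[i]` the image of `i`, products are assumed to be read left to right, and the subgroups `H` and `A3` as well as the function names are hypothetical choices.

```python
from itertools import combinations, permutations

def compose(s, t):
    """Product s·t read left to right: apply s first, then t."""
    return tuple(t[s[i]] for i in range(len(s)))

S3 = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]                 # a subgroup of order 2
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]     # the even substitutions

def right_coset(sub, g):
    return frozenset(compose(h, g) for h in sub)

def left_coset(sub, g):
    return frozenset(compose(g, h) for h in sub)

def is_normal(sub, group):
    """N is a normal divisor iff aN = Na for every a in the group."""
    return all(left_coset(sub, a) == right_coset(sub, a) for a in group)

# Search for sets K that represent every right coset AND every left coset of H.
index = len(S3) // len(H)
common = [K for K in combinations(S3, index)
          if len({right_coset(H, g) for g in K}) == index
          and len({left_coset(H, g) for g in K}) == index]

print(is_normal(H, S3), is_normal(A3, S3), len(common) > 0)
```

The search finds common systems of representatives for `H` although `H` is not a normal divisor, in line with Miller's result quoted above for finite subgroups.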
The unity subgroup and the group itself are normal divisors in any group; they are the so-called trivial normal divisors. If there are no other normal divisors, then we speak of a simple group.
EXAMPLE 4.1. Bijective selfmaps of a finite set are called substitutions; the order of a substitution is the number $n$ of elements of the set under consideration. Since the “individuality” of the set is of no interest, we may view the elements of the set as the first $n$ natural numbers. Therefore, every substitution $S$ of order $n$ can be codified in the form [of a matrix]
$$S = \begin{pmatrix} i_1 & \dots & i_n \\ j_1 & \dots & j_n \end{pmatrix},$$
⁵³ The Reader will find interesting material about this in [12].
CHAPTER VI. POPULARIZATION OF MATHEMATICS
where [the rows] $(i_1, \dots, i_n)$ and $(j_1, \dots, j_n)$ are permutations of the numbers $1, 2, \dots, n$. In this notation one should bear in mind that arbitrary rearrangements of the [vertical] columns do not change the substitution, so that we agree that
$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix} \equiv \begin{pmatrix} 2 & 1 & 4 & 3 \\ 1 & 2 & 3 & 4 \end{pmatrix} \equiv \begin{pmatrix} 2 & 1 & 3 & 4 \\ 1 & 2 & 4 & 3 \end{pmatrix} \equiv \text{etc.}$$
It is possible to “multiply” substitutions with each other: the product of
$$S = \begin{pmatrix} i_1 & \dots & i_n \\ j_1 & \dots & j_n \end{pmatrix} \quad \text{and} \quad T = \begin{pmatrix} j_1 & \dots & j_n \\ k_1 & \dots & k_n \end{pmatrix}$$
is the substitution
$$S \cdot T = \begin{pmatrix} i_1 & \dots & i_n \\ k_1 & \dots & k_n \end{pmatrix}.$$
Taking account of the previous remark (about notation) the Reader will see that one can multiply [or compose] any two $n$-th order substitutions. Thus the set $S_n$ of all $n$-th order substitutions comes equipped with an algebraic operation (multiplication [or composition]), which, as one checks readily, is associative but (if $n > 2$) not commutative. This multiplication has as unit element
$$E = \begin{pmatrix} 1 & 2 & \dots & n \\ 1 & 2 & \dots & n \end{pmatrix} \equiv \begin{pmatrix} i_1 & \dots & i_n \\ i_1 & \dots & i_n \end{pmatrix} \equiv \dots,$$
and each substitution has an inverse $S^{-1}$,
$$S^{-1} = \begin{pmatrix} j_1 & \dots & j_n \\ i_1 & \dots & i_n \end{pmatrix},$$
since $S \cdot S^{-1} = S^{-1} S = E$. Thus $S_n$ is a group, usually called the (complete) symmetric group. Its subgroups are called substitution groups. The remark about the notation above allows us to present each substitution in the so-called “normal form”
$$S = \begin{pmatrix} 1 & 2 & \dots & n \\ s_1 & s_2 & \dots & s_n \end{pmatrix},$$
from which we see that the permutation $s = (s_1, s_2, \dots, s_n)$ determines the substitution uniquely. Even more, this “new notation” makes it possible to divide all substitutions into two classes, the even and the odd ones, using the notion of inversion. One says that the numbers $s_i$ and $s_j$, $i < j$, form an inversion in the permutation $s = (s_1, s_2, \dots, s_n)$ if $s_i > s_j$. The substitution $S$ is called even or odd depending on whether the permutation $s$ contains an even or odd number of inversions. For example, $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$ is an even, while $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$ is an odd substitution.
It is not hard to see that the number of even substitutions is $\frac{n!}{2}$ and that these form a group $A_n \subset S_n$, [usually called the alternating group]. It turns out that the groups $A_n$ (with $n \ge 5$) are simple.⁵⁴ All the groups $A_n$ have an even number of elements. It was observed that all known finite non-commutative simple groups are of even order. Thus arose, in the early 20th
⁵⁴ The Reader will find a proof of this fact, elementary as far as the tools are concerned, in [10, p. 77–78].
century, the difficult Burnside’s problem: prove that all finite non-commutative simple groups are of even order. This problem is now solved.⁵⁵ □
In an arbitrary group $G$ which is not simple, there exists a non-trivial normal divisor $N$, and we can consider the decomposition of $G$ with respect to $N$. It is remarkable that in the case of a normal divisor “multiplication” in the system of orbits according to the formula $Ng_1 \cdot Ng_2 = Ng_1g_2$ is “lawful”, that is, it does not depend on the choice of the representatives $g_1$ and $g_2$ in these orbits. Moreover, with respect to this multiplication the orbit $N$ acts as “unity” and each orbit $Ng$ has an “inverse orbit”, namely $Ng^{-1}$. As a consequence, the set of orbits, denoted $G/N$, is a group with respect to this multiplication; it is called the factor group with respect to the normal divisor $N$. It follows from the definition of the index that $|G/N| = \operatorname{ind}_G N$. Let us familiarize ourselves with some examples.
EXAMPLE 4.2. In our previous discussion we have encountered the subgroup $A_n$ of the group $S_n$. Let us check that this is a normal divisor. Let us take the odd substitution
$$T = \begin{pmatrix} 1 & 2 & 3 & 4 & \dots & n \\ 2 & 1 & 3 & 4 & \dots & n \end{pmatrix},$$
and let us form the orbit $A_n T$; its “points” are all odd substitutions, because the product of an even and an odd substitution is always odd. Next, take an arbitrary element $S \in S_n$. If $S$ is an even substitution, then $S \in A_n$. But if $S$ is odd, then $S \cdot T$ is even, and as $S = (S \cdot T) \cdot T$, then $S \in A_n \cdot T$. We see that $S_n = A_n + A_n \cdot T$. Hence $\operatorname{ind}_{S_n} A_n = 2$. But a subgroup of index 2 is always a normal divisor. Indeed, let $N$ be a subgroup of index 2 in a group $G$. Then for each $a \notin N$ we have $G = N + aN = N + Na$, which implies that $aN = Na$. But this is the same as $N \trianglelefteq G$. □
EXAMPLE 4.3. If $G$ is a commutative or Abelian group, then each subgroup in it is normal. This follows at once from the definition of a normal divisor. □
EXAMPLE 4.4. In each group $G$ we have the subgroup $Z(G) = \{z \mid z \in G,\ zg = gz \text{ for each } g \in G\}$, which is called the center of the group $G$. This is a normal divisor. Indeed, for arbitrary $z \in Z(G)$ and $a \in G$ we have $gzg^{-1}a = gg^{-1}za = eza = az = azgg^{-1} = a \cdot gzg^{-1} \implies gzg^{-1} \in Z(G)$. □
⁵⁵ After the Reader has become familiar with the description of cyclic groups (Subsection 4), he or she will notice that the only commutative simple groups are those of prime order. Burnside’s problem will be discussed in Subsection 8. Commentator’s note. Nowadays the name Burnside’s problem is used in connection with another outstanding problem in group theory: whether a finitely generated group of bounded exponent must be finite.
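The statements about even and odd substitutions, the decomposition $S_n = A_n + A_n \cdot T$ and the center can all be checked mechanically for a small $n$. The sketch below, an illustration with hypothetical names, takes $n = 4$, numbers the symbols from 0, stores a substitution in normal form as the tuple of its bottom row, and multiplies from left to right as in Example 4.1.

```python
from itertools import permutations
from math import factorial

def compose(s, t):
    """Product s·t in the convention of Example 4.1: apply s first, then t."""
    return tuple(t[s[i]] for i in range(len(s)))

def inversions(s):
    """Number of pairs i < j with s[i] > s[j]."""
    return sum(1 for i in range(len(s)) for j in range(i + 1, len(s)) if s[i] > s[j])

def is_even(s):
    return inversions(s) % 2 == 0

n = 4
Sn = list(permutations(range(n)))
An = [s for s in Sn if is_even(s)]

T = (1, 0, 2, 3)                              # the odd substitution exchanging the first two symbols
odd_coset = {compose(a, T) for a in An}       # the orbit A_n·T

normal = all({compose(a, x) for x in An} == {compose(x, a) for x in An} for a in Sn)
center = [z for z in Sn if all(compose(z, g) == compose(g, z) for g in Sn)]

print(len(An), len(odd_coset), normal, center)
```

With symbols numbered from 0, the two examples from the text become `(2, 0, 1)` (even, two inversions) and `(2, 1, 0)` (odd, three inversions); for $n = 4$ the center turns out to consist of the unit element only.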
EXAMPLE 4.5. In an arbitrary group $G$ we consider its subgroup $G'$ generated by all elements $g^{-1}h^{-1}gh$, $g \in G$, $h \in G$, that is, the subgroup consisting of all elements of the form $g^{-1}h^{-1}gh$ and all possible products of such elements. This subgroup $G'$ is called the commutator subgroup of $G$. The subgroup $G'$ is a normal divisor of $G$. Indeed, it is easy to see that for all inner automorphisms $\sigma_a$ of $G$ one has $\sigma_a(G') \le G'$. That $G' \trianglelefteq G$ follows now from the definition. □
Immediate computations show that $(S_3)' = A_3$ and $(S_4)' = A_4$. We shall soon see that, likewise, $(S_n)' = A_n$ for all $n \ge 5$. For the proof of this fact we also require two auxiliary facts, which, each taken by itself, help to clarify the role of the commutator subgroup.
LEMMA 4.2. The factor group with respect to the commutator subgroup is Abelian.
PROOF. Let $a, b \in G$ be arbitrary. Then $aG' \cdot bG' = abG' = ba(a^{-1}b^{-1}ab)G'$, which in view of $a^{-1}b^{-1}ab \in G'$ equals $baG' = bG' \cdot aG'$. The relation at hand, $aG' \cdot bG' = bG' \cdot aG'$, shows that $G/G'$ is Abelian. □
LEMMA 4.3. The commutator subgroup is contained in each normal divisor of the group such that the factor group with respect to it is Abelian.
PROOF. Let $N \trianglelefteq G$ be such that $G/N$ is Abelian, that is, for all $a, b \in G$ one has the identity $aN \cdot bN = bN \cdot aN$. This gives $abN = baN$, which again yields $a^{-1}b^{-1} \cdot abN = a^{-1}b^{-1} \cdot baN = N$. Thus $a^{-1}b^{-1}ab \in N$. As $a, b \in G$ are arbitrary, this relation shows that $G' \subseteq N$. □
Let us now prove that if $n \ge 5$ then $(S_n)' = A_n$. To this end we observe that $(S_n : A_n) = 2$, so that $S_n/A_n$ is a group of order 2. It is easy to see that groups of order two have the same structure as the group $\{a, e \mid e \cdot e = e;\ e \cdot a = a \cdot e = a;\ a \cdot a = e\}$. But this is an Abelian group, so that in view of Lemma 4.3 $(S_n)' \subseteq A_n$. Furthermore, the Reader will notice that only in an Abelian group $G$ do we have the relation $G' = \{e\}$. But as the groups $S_n$ ($n \ge 5$) are non-commutative, $(S_n)' \ne \{e\}$. From the relation $(S_n)' \trianglelefteq S_n$ evidently follows the weaker relation $(S_n)' \trianglelefteq A_n$.
But $A_n$ is simple, so in view of $(S_n)' \ne \{e\}$ we obtain the sought relation $(S_n)' = A_n$. □
3. DEFINITION 4.4. Let there be given a single-valued function $\varphi$ whose domain of definition is the set of all elements of a group $(G_1, \cdot)$, its values being elements of the group $(G_2, \circ)$. The function $\varphi$ is called a homomorphism of $G_1$ into $G_2$ (or a representation of $G_1$ in $G_2$), if for arbitrary $x, x' \in G_1$ the relation
$$\varphi(x \cdot x') = \varphi(x) \circ \varphi(x')$$
holds. A trivial example of a homomorphism of a group $G_1$ is the constant function whose value is the unit element of the group $G_2$. The set of elements in $G_1$ for which the value of $\varphi$ is the unit element $e_2$ of $G_2$ is called the kernel of the homomorphism; notation: $\operatorname{Ker} \varphi = \{n \mid n \in G_1,\ \varphi(n) = e_2\}$.
4. Additional remarks on groups
This is a normal divisor in $G_1$, since for $n \in \operatorname{Ker}\varphi$ and $g \in G_1$ we have $\varphi(gng^{-1}) = \varphi(g) \circ \varphi(n) \circ \varphi(g^{-1}) = \varphi(g) \circ e_2 \circ \varphi(g)^{-1} = e_2$, whence $gng^{-1} \in \operatorname{Ker}\varphi$. But if $N \trianglelefteq G$, then the function $\varphi : G \to G/N$, $\varphi(g) = Ng$, is a homomorphism with kernel $N$. Therefore we see that the normal divisors of a group, and only the normal divisors, are kernels of homomorphisms of this group.

If the function $\varphi : G_1 \to G_2$ gives a one-to-one correspondence between $G_1$ and the domain of values $\operatorname{Im}\varphi = \varphi(G_1)$, we call it a monomorphism; in this case $\operatorname{Ker}\varphi = (e_1)$. If $\operatorname{Im}\varphi = G_2$, that is, if the domain of values is the whole of the group $G_2$, we call the function an epimorphism. If a homomorphism is at the same time a monomorphism and an epimorphism, it carries the name isomorphism. The isomorphism of two groups $G_1$ and $G_2$ will be written $G_1 \cong G_2$. As a simplest example we have the “identity homomorphism” of the group $G_1$ onto itself, that is, the function $\varphi : G_1 \to G_1$ given by the formula $\varphi(g) = g$ for all $g \in G_1$. If $G_1 = G_2$, an isomorphism of $G_1$ is called an automorphism, that is, an automorphism of a group is an isomorphism of the group with itself. We note that the set of automorphisms of a given group $G$ is a group, denoted $\operatorname{Aut}(G)$; the composition of automorphisms is concatenation:

$$G \xrightarrow{\ \sigma_1\ } G \xrightarrow{\ \sigma_2\ } G, \qquad \text{with composite } \sigma_1 \cdot \sigma_2 : G \to G.$$

In the case of the groups $S_n$ ($n \ge 3$, $n \ne 6$) one has $\operatorname{Aut}(S_n) \cong S_n$; this theorem was proved by O. Hölder in 1895. The Reader can check that for each $a \in G$ the homomorphism $\sigma_a : G \to G$ given by $\sigma_a(g) = a^{-1}ga$ is an automorphism. Automorphisms of this type are called inner automorphisms. Apparently, the set of all inner automorphisms of $G$ forms a subgroup of $\operatorname{Aut}(G)$. In order to get used to the new notion we familiarize ourselves with some examples.

EXAMPLE 4.6. Let us take as $G_1$ the additive group of all real numbers $\mathbb{R}(+)$ and as $G_2$ the multiplicative group $\mathbb{C}^{\circ}$ of complex numbers on the unit circle. We consider the function $\varphi$ given by the formula $\varphi(a) = e^{2\pi i a} = \cos 2\pi a + i \sin 2\pi a$. As $\varphi(a + b) = e^{2\pi i (a+b)} = e^{2\pi i a} \cdot e^{2\pi i b} = \varphi(a) \cdot \varphi(b)$, we have a homomorphism. Let us find its kernel. By definition $\operatorname{Ker}\varphi = \{a \mid a \in \mathbb{R},\ e^{2\pi i a} = 1\}$; thus $\operatorname{Ker}\varphi = \mathbb{Z}$. As, apparently, $\operatorname{Im}\varphi = \mathbb{C}^{\circ}$, we have an epimorphism. □

EXAMPLE 4.7. Let again $G_1 = \mathbb{R}(+)$ and take as $G_2$ the multiplicative group of positive real numbers $\mathbb{R}^{*}$. The function $\varphi(\alpha) = e^{\alpha}$ determines an isomorphism between these groups (see Figure 16). □

EXAMPLE 4.8. Also the inverse function $\varphi = \ln : \mathbb{R}^{*} \to \mathbb{R}(+)$ gives an isomorphism of groups (see Figure 17). □
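Fig. 16. Graph of the function $a \mapsto e^{a}$, with the point $(a, e^{a})$ marked; the axes are labelled $G_1$ and $G_2$.

Fig. 17. Graph of the function $a \mapsto \ln a$, with the point $(a, \ln a)$ marked; the axes are labelled $G_1$ and $G_2$.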
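The homomorphism property in Examples 4.6 to 4.8 can be spot-checked numerically. The following is only a minimal sketch, not a proof: the function names `phi` and `is_close`, the sample values and the tolerance are illustrative assumptions, and floating-point arithmetic only confirms the identities up to rounding.

```python
import cmath
import math


def phi(a: float) -> complex:
    """Example 4.6: send the real number a to e^(2*pi*i*a) on the unit circle."""
    return cmath.exp(2j * math.pi * a)


def is_close(z: complex, w: complex, tol: float = 1e-9) -> bool:
    """Compare two complex numbers up to a small rounding tolerance (assumed value)."""
    return abs(z - w) < tol


# Arbitrary sample pairs (a, b); any real numbers would do.
samples = [(0.25, 0.5), (1.3, -2.7), (10.0, 0.125)]

# Homomorphism property phi(a + b) = phi(a) * phi(b).
homomorphism_ok = all(is_close(phi(a + b), phi(a) * phi(b)) for a, b in samples)

# Every integer lies in the kernel: phi(k) = 1.
kernel_ok = all(is_close(phi(k), 1) for k in range(-3, 4))

# A non-integer such as 1/2 is not in the kernel: phi(1/2) = -1.
half_not_in_kernel = not is_close(phi(0.5), 1)

# Examples 4.7 and 4.8: exp turns sums into products, ln turns products into sums.
exp_ok = all(math.isclose(math.exp(a + b), math.exp(a) * math.exp(b)) for a, b in samples)
log_ok = all(
    math.isclose(math.log(x * y), math.log(x) + math.log(y))
    for x, y in [(2.0, 3.0), (0.5, 8.0)]
)
```

All five flags come out true for these samples, which agrees with $\operatorname{Ker}\varphi = \mathbb{Z}$ and with the isomorphisms $e^{\alpha}$ and $\ln$.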
EXAMPLE 4.9. The natural embedding $i : \mathbb{Z}(+) \to \mathbb{R}(+)$ is an example of a monomorphism. □

EXAMPLE 4.10. Let us now take as $G_1$ the multiplicative group $\mathbb{C}(\cdot)$ of all complex numbers $\ne 0$, and as $G_2$ the group $GL(2, \mathbb{R})$ of all regular $2 \times 2$ matrices with real elements. The function $\varphi$ given by the formula

$$\varphi(a + ib) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$

is a representation⁵⁶ of the group $\mathbb{C}(\cdot)$ in $GL(2, \mathbb{R})$. It is easy to see that it is a monomorphism. □

Finally, let $\varphi : G_1 \to G_2$ be an arbitrary homomorphism. As $\operatorname{Ker}\varphi = N \trianglelefteq G_1$, we can form the factor group $G_1/N$. This group we may take as the domain of a new function $\hat{\varphi} : G_1/N \to \operatorname{Im}\varphi$, given by the formula $\hat{\varphi}(Ng) = \varphi(g)$. An easy check shows that $\hat{\varphi}$ is an isomorphism. Thus we have the following theorem.

⁵⁶ Translator's note. A representation of any group $G$ is a homomorphism into a matrix group $GL(n, K)$, where $K$ is a field or, more generally, a ring.
THEOREM 4.5 (Theorem of homomorphisms). The function $\hat{\varphi}$ is an isomorphism between the groups $G_1/\operatorname{Ker}\varphi$ and $\operatorname{Im}\varphi$.

4. Let us introduce some classes of groups. We have already spoken of Abelian groups: these were the groups where the algebraic operation is commutative. As examples we may take the multiplicative groups $\mathbb{Q}(\cdot)$, $\mathbb{R}(\cdot)$, $\mathbb{C}(\cdot)$, that is, the corresponding sets of numbers, deprived of zero, where the composition is the usual multiplication of numbers, and, further, the additive groups $\mathbb{Q}(+)$, $\mathbb{R}(+)$, $\mathbb{C}(+)$, that is, the corresponding sets of numbers, where the operation is the usual addition of numbers. An important subclass of these [Abelian] groups are the cyclic groups. In a cyclic group all elements can be taken as the various powers (if the composition is called “multiplication”) or multiples (if the composition is called “addition”) of a distinguished element, the so-called generator.

EXAMPLE 4.11. Consider the solutions of the equation $x^n - 1 = 0$, that is, the $n$-th roots of unity $\varepsilon_0, \varepsilon_1, \dots, \varepsilon_{n-1}$. Clearly, the product of two roots of unity is again a root of unity; likewise, the inverse of a root of unity is a root of unity. So we have a group whose elements are the $n$-th roots of unity, the algebraic operation being ordinary multiplication of complex numbers. As the solutions of the equation $x^n - 1 = 0$ form the vertices of a regular $n$-gon inscribed in the unit circle, we have $\varepsilon_k = \cos\frac{2\pi k}{n} + i \sin\frac{2\pi k}{n}$. By de Moivre's formula $\varepsilon_1^{k} = \varepsilon_k$, $k = 0, 1, \dots, n-1$, which shows that we are dealing with a finite cyclic group: all elements $\varepsilon_k$ are powers of the generator $\varepsilon_1$. □

EXAMPLE 4.12. The additive group of integers $\mathbb{Z}(+)$ is cyclic. Indeed, it has the generator 1, because each $n \in \mathbb{Z}$ can be written $n = n \cdot 1$. All subgroups and factor groups of a cyclic group are again cyclic. For the proof of the first assertion we remark that as a generator of a subgroup we can take the power of a generator of the entire group with the lowest positive exponent. For the proof of the second assertion it suffices to take, as generator, the orbit of the factor group passing through a generator of the group. □

Let us now apply these observations to the additive group of integers $\mathbb{Z}$. Taking an arbitrary $n \in \mathbb{Z}$, $n \ne 0$, and considering the set $N \subset \mathbb{Z}$ of all integers divisible by it, we obtain a subgroup. It is easy to see that all subgroups different from $(0)$ are of this form. As $\mathbb{Z}$ is Abelian, all its subgroups are normal divisors. Therefore we can form the factor groups $\mathbb{Z}/N = \mathbb{Z}_n$, each of which, being a factor group of a cyclic group, must be cyclic. In this way we have found all subgroups, all normal divisors, and all factor groups of $\mathbb{Z}$. Even more, we have determined all homomorphisms of $\mathbb{Z}$, because Theorem 4.5 allows us to restore these from their kernels, and we know the latter as well, since they coincide with the set of normal divisors of $\mathbb{Z}$.

Next, we consider an arbitrary cyclic group $G_2$ with generator $a$ and define a function whose domain of definition is $\mathbb{Z}$ and whose domain of values is $G_2$ with the help of the formula $\varphi(1) = a$. This implies that $\varphi(n) = a^n$. It is easy to see that $\operatorname{Im}\varphi = G_2$, so that we have an epimorphism. Theorem 4.5 now yields $G_2 \cong \mathbb{Z}/\operatorname{Ker}\varphi$. But all factor groups of $\mathbb{Z}$ are known to us; they are the groups $\mathbb{Z}_n$. Thus, we have $G_2 \cong \mathbb{Z}_n$ for some $n$; in case $\operatorname{Ker}\varphi = (0)$ we obtain $G_2 \cong \mathbb{Z}$, and we have an infinite cyclic group.
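The description of the subgroups of a cyclic group can be illustrated for the finite groups $\mathbb{Z}_n$. The sketch below is an illustration under stated assumptions rather than part of the original argument: the helper names `generated_subgroup` and `subgroups_of_Zn` are hypothetical, and the code uses the fact proved above that every subgroup of a cyclic group is cyclic, so it is enough to collect the subgroups generated by single elements.

```python
def generated_subgroup(n: int, a: int) -> frozenset:
    """Cyclic subgroup of Z_n (addition modulo n) generated by the residue a."""
    return frozenset((k * a) % n for k in range(n))


def subgroups_of_Zn(n: int) -> set:
    """All subgroups of Z_n; each one is cyclic, hence generated by one element."""
    return {generated_subgroup(n, a) for a in range(n)}


def canonical_epimorphism(n: int, k: int) -> int:
    """The epimorphism Z -> Z_n, k -> k mod n; its kernel is the set of multiples of n."""
    return k % n


# Orders of the subgroups of Z_12, sorted increasingly.
orders_in_Z12 = sorted(len(H) for H in subgroups_of_Zn(12))

# Integers in a small window that are sent to 0 by Z -> Z_12.
kernel_sample = [k for k in range(-24, 25) if canonical_epimorphism(12, k) == 0]
```

For $n = 12$ the subgroup orders are exactly the divisors 1, 2, 3, 4, 6, 12, one subgroup for each divisor, and `kernel_sample` consists of the multiples of 12 in the chosen window.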
In the “abstract” theory of groups – that is, in group theory – the object in view is not so much the “individual” group but rather the class of groups isomorphic among themselves. In this sense we have obtained a description of cyclic groups in terms of the additive group of integers $\mathbb{Z}$ and its factor groups $\mathbb{Z}_n$, that is, on the basis of the material “closest” to us. If $n$ is a prime, then the groups $\mathbb{Z}_n$ are simple, which is a consequence of Lagrange's theorem (Theorem 4.1). Knowing the properties of cyclic groups, the Reader will convince him- or herself that these are the only simple Abelian groups.

5. What is a solvable group? We have seen⁵⁷ that to a group $G$ one can attach the subgroup generated by all commutators, that is, elements of the form $[a, b] = a^{-1}b^{-1}ab$, $a \in G$, $b \in G$: the commutator subgroup $G'$ of $G$. It was shown that this is a normal divisor in $G$. Iterating the construction we can form the commutator subgroup $G''$ of the subgroup $G'$, etc. The iteration then gives a decreasing chain of subgroups, where each “link” is a normal divisor in the preceding one:

$$G \trianglerighteq G' \trianglerighteq G'' \trianglerighteq \cdots \trianglerighteq G^{(i)} \trianglerighteq \cdots, \qquad (103)$$

with $G^{(0)} = G$ and $G^{(i)} = (G^{(i-1)})'$.

DEFINITION 4.6. A group $G$ is called solvable if the chain (103) breaks at the trivial subgroup, that is, there exists an index $n$ such that $G^{(n)} = (e)$.⁵⁸

For example, every Abelian group $G$ is solvable⁵⁹, because in this case one has $G' = (e)$, so $n = 1$ here. Soon we shall also encounter non-solvable groups, so one can say that the class of solvable groups is an essentially wider class than Abelian groups. In a solvable group $G \ne (e)$ one has $G' \ne G$, that is, the commutator subgroup in such a group cannot coincide with the group itself. For let $G \ne (e)$. Then it would follow from the relation $G' = G$ that $G^{(i)} = G \ne (e)$ for all $i$, which would contradict the solvability. Moreover, if $G$ is solvable, then all (non-trivial) members of the chain (103) are distinct; for from $G^{(i)} = G^{(i+1)}$ it follows that $G^{(i)} = G^{(j)}$ if $j \ge i$. Note that the factors of the chain (103), that is, the groups $G/G'$, $G'/G''$, $\dots$, are Abelian groups. This follows from facts known to us: for any group the factor group with respect to its commutator subgroup is an Abelian group.

Subgroups and factor groups of solvable groups are solvable. In order to prove the first statement it suffices to make the observation:

$$H \subset G \implies H' \subset G' \implies H'' \subset G'' \implies \cdots \implies H^{(n)} \subset G^{(n)} = (e) \implies H^{(n)} = (e).$$

To prove the second statement we consider the epimorphism $\varphi : G \to G/N$. As every commutator in $G/N$ is the image of some commutator in $G$, we observe that $\varphi$ induces a new epimorphism $\varphi' : G' \to (G/N)'$. Iterating this observation we get the epimorphisms $\varphi'' : G'' \to (G/N)''$, $\varphi''' : G''' \to (G/N)'''$, etc. As $G^{(n)} = (e)$ and $\varphi^{(n)}$ is an epimorphism, $(G/N)^{(n)} = (N)$. □

⁵⁷ See Example 4.5.
⁵⁸ Here $e$ is the unit element of $G$. The solvable groups have obtained their name because of the fact that algebraic equations are solvable in terms of radicals if and only if their Galois groups are solvable. Translator's note. See also Section 6.
⁵⁹ In particular, all cyclic groups are solvable.

6. In the case of finite groups one usually gives a different definition of solvability. This is done with the aid of the notion of a composition series of a group. Let us first look at a concrete situation in order to be able to use it as an example in the following general reasoning.

We consider the group of order 6, $G = \{e, a, a^2, a^3, a^4, a^5 \mid a^6 = e\}$. The sets $N_1 = \{e, a^3\}$ and $N_2 = \{e, a^2, a^4\}$ are subgroups of $G$. There are no other proper subgroups.⁶⁰ In view of Theorem 4.1 the order of a subgroup must divide the order of the group itself. But the number 6 has, apart from 1 and 6, only the divisors 2 and 3. As $G$ is commutative, it follows that $N_1$ and $N_2$ are normal divisors. This gives us two chains $G \trianglerighteq N_1 \trianglerighteq (e)$ and $G \trianglerighteq N_2 \trianglerighteq (e)$. It is easy to check that $G/N_1 \cong N_2/(e) \cong N_2$ and $N_1/(e) \cong N_1 \cong G/N_2$.

DEFINITION 4.7. Let $G$ be a finite group. A decreasing chain of subgroups

$$G = N_0 \supset N_1 \supset N_2 \supset \cdots \supset N_k = (e) \qquad (104)$$

is called a composition series if the following two conditions are fulfilled:
(1) for all $i = 0, \dots, k-1$, $N_{i+1}$ is a normal divisor of $N_i$, and
(2) for no $i = 0, \dots, k-1$ does there exist in $N_i$ a normal divisor $M$ such that $N_{i+1} \subset M \subset N_i$, $M \ne N_{i+1}$, $M \ne N_i$.

The Reader sees at once that the chains $G \supset N_1 \supset (e)$ and $G \supset N_2 \supset (e)$ in the above example are composition series. Each finite group has a composition series. According to the Jordan-Hölder Theorem (see [7, p. 286] or suitable references in Section 1, footnote 12) any two composition series have the same length, and one can find a one-to-one correspondence between them such that the corresponding factor groups are isomorphic; one may say that the composition series of a finite group are isomorphic. This theorem allows us to consider any composition series in a finite group as the “copy” of some “original” composition series. Thus, in essence a finite group has a unique composition series, namely this “original”, all others being just “copies” of the “original”, that is, chains isomorphic to the latter. As the “original” one can, of course, take an arbitrary composition series.

It turns out that a finite group is solvable if and only if all factor groups of its composition series are cyclic groups of prime order. This is the second definition of solvability, valid in the case of a finite group. In using it, the Reader will notice that for a finite solvable group to be simple it is necessary and sufficient that it be cyclic of prime order; that such a group is solvable is already known to us.⁶¹ Likewise, the Reader sees that the order $|G|$ of a solvable group is the product of the orders of its factors. Indeed, a repeated application of Lagrange's Theorem (Theorem 4.1) to the chain (104) gives

$$|G| = |G/N_1| \cdot |N_1| = |G/N_1| \cdot |N_1/N_2| \cdot |N_2| = \cdots = |G/N_1| \cdot |N_1/N_2| \cdot |N_2/N_3| \cdots |N_{k-1}/N_k|. \qquad \square$$

⁶⁰ That is, subgroups distinct from $(e)$ and $G$ itself.
⁶¹ . . . but it is also an immediate consequence of the second definition just given.

Here the orders of the factors may be equal. They can even all be equal; the order $|G|$ must then be of the form $p^{\alpha}$, $p$ a prime number.

According to Lagrange's Theorem, the order of a group must be divisible by the order of any of its subgroups. The interesting question arises whether the “converse conjecture” holds true for solvable groups.⁶² It turns out that in a solvable group $G$ of order $|G| = m \cdot n$, where the numbers $m$ and $n$ are relatively prime, there exist subgroups of order $m$ and $n$. For the proof of this fact we would have to plunge into the “technical wilderness”, which would not be suitable for the present compilation. However, it is of some interest to note that this fact is characteristic for the “nature” of solvable groups, that is, it can be used as a criterion for solvability.

We mention further two criteria due to J. G. Thompson, as they can often be easily applied to decide the solvability or non-solvability of groups (see [13, pp. 383-437]):
I. A finite group is solvable if and only if each subgroup generated by any pair of elements in it is solvable.
II. A finite group is solvable if and only if it does not contain any three elements distinct from unity whose orders are pairwise relatively prime and whose product equals unity.

7. The following theorem gives a series of examples of non-solvable groups. It also plays a decisive role in the proof of the Abel-Ruffini Theorem.

THEOREM 4.8. The complete symmetric groups $S_n$ ($n \ge 5$) are not solvable. The groups $S_2$, $S_3$, $S_4$ are solvable.

PROOF. As a subgroup of a solvable group is solvable, it suffices, for the proof of the first statement, to find in the groups $S_n$ ($n \ge 5$) subgroups which are not solvable. This is easy: the subgroups $A_n \subset S_n$ ($n \ge 5$) suffice! The groups $A_n$ are simple, but their order $|A_n| = \frac{n!}{2}$ is not a prime number; hence they are not solvable. Indeed, we saw in the previous subsection that among solvable groups only those are simple which are cyclic of prime order. This proves the first assertion.

Let us consider the group $S_2$. As $|S_2| = 2! = 2$, it is cyclic of prime order and so solvable.

For the proof of the solvability of $S_3$, we note that $A_3$ is solvable, because in view of $|A_3| = \frac{1}{2} \cdot 3! = 3$ we have a cyclic group. Moreover, $(S_3 : A_3) = 2$ is a prime, so that $(e) \subset A_3 \subset S_3$ is a composition series. The factors of the latter are visibly cyclic of prime order. That the group $S_3$ is solvable follows now from the definition.
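Before turning to $S_4$, Definition 4.6 and the statement of Theorem 4.8 can be checked by brute force for small $n$. The sketch below is an illustration only, under stated assumptions: permutations are modelled as tuples acting on $0, \dots, n-1$, the helper names are hypothetical, and the exhaustive search is feasible only for very small $n$. It lists the orders of the members of the chain (103) for $S_n$, stopping as soon as the chain stabilizes.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'first p, then q' on the points 0..n-1."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated(gens, n):
    """Subgroup generated by gens; in a finite group closure under products suffices."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def commutator_subgroup(group, n):
    """Subgroup generated by all commutators a^-1 b^-1 a b with a, b in group."""
    comms = {
        compose(compose(inverse(a), inverse(b)), compose(a, b))
        for a in group
        for b in group
    }
    return generated(comms, n)


def derived_series_orders(n):
    """Orders of S_n, (S_n)', (S_n)'', ... until the chain stops decreasing."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        derived = commutator_subgroup(group, n)
        if len(derived) == len(group):
            return orders
        orders.append(len(derived))
        group = derived
```

With these definitions `derived_series_orders(3)` gives `[6, 3, 1]`, `derived_series_orders(4)` gives `[24, 12, 4, 1]` and `derived_series_orders(5)` gives `[120, 60]`: the chain reaches the trivial subgroup for $S_3$ and $S_4$, while for $S_5$ it gets stuck at the subgroup of order 60, in agreement with $(S_5)' = A_5$ and $(A_5)' = A_5$.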
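⁶² See also the first subsection of this paper.

The structure of $S_4$ is somewhat more complicated. The subgroup $A_4$ consists of the elements:

$$e = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix},$$

$$a_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix}, \quad a_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}, \quad a_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix},$$

$$b_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}, \quad b_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}, \quad b_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 2 & 4 \end{pmatrix},$$

$$b_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 4 & 1 \end{pmatrix}, \quad b_5 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 3 & 2 \end{pmatrix}, \quad b_6 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 2 & 1 & 3 \end{pmatrix},$$

$$b_7 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 4 & 2 \end{pmatrix}, \quad b_8 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 2 & 3 \end{pmatrix}.$$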
By an immediate check one verifies that the set $K_4 = \{e, a_1, a_2, a_3\}$ is a commutative subgroup. Furthermore, calculating the 24 products $b_i^{-1} a_j b_i$, $i = 1, 2, \dots, 8$, $j = 1, 2, 3$, we see that they all belong to the subgroup $K_4$. But this means that $K_4$ is a normal divisor in $A_4$. Taking $N_4 = \{e, a_1\}$ we obtain the chain $(e) \subset N_4 \subset K_4 \subset A_4 \subset S_4$, and we see that we have a composition series all of whose factors are cyclic groups of prime order. This proves that $S_4$ is solvable. □

The group $K_4$ is called the Klein 4-group (or the four group). One readily checks that one has the relations $a_1^2 = a_2^2 = a_3^2 = e$, so that $\{e, a_1\}$, $\{e, a_2\}$, $\{e, a_3\}$ are subgroups. We see that $K_4 = \{e, a_1\} \cup \{e, a_2\} \cup \{e, a_3\}$, so that the Klein group can be presented as the union of three proper subgroups. In 1959, S. Haber and A. Rosenfeld proved that $K_4$ is the typical example of a group with this property: $K_4$ and, furthermore, those groups which can be mapped epimorphically onto $K_4$ are presentable as the union of three proper subgroups. There are no other groups with this property. One can ask which groups are presentable as the union of two proper subgroups. A simple argument by contradiction reveals that there are no such groups. The question of which groups can be “covered” by $n$ proper subgroups is not easy.

8. At the end of the 1950's the general opinion was that the theory of finite groups was in a state of “congelation”; voices arose claiming that it had exhausted itself. The reason for this was not at all the absence of unsolved problems – the problem of describing all finite simple groups is still awaiting its solution! Rather the contrary: as in Number Theory, in Group Theory it is easier to formulate problems than to solve them. One could not even say that one lacked methods: powerful methods had been developed by Hölder, Jordan, Frobenius, Molien, Burnside and Schur. Therefore there arose the opinion that the circle attainable by these methods was completely exhausted, perhaps with the exception of only a few, not very interesting cases. In the beginning of the 1960's there occurred a real break-through in the theory of finite groups: W. Feit and J. G. Thompson proved the following theorem [4].

THEOREM 4.9. All finite groups of odd order are solvable.
The proof of this theorem is based on ideas, results and theories due to P. Hall, G. Higman, H. Wielandt, R. Brauer, M. Suzuki and others. This theorem with its monumental proof has, undoubtedly, had a deep influence on the development of Group Theory⁶³. Below we study two examples of the numerous consequences of this theorem.

We start with the following question: if a group $G$ is not cyclic of prime order, what can be said about the existence of subgroups $H$ of “sufficiently high” order in $G$? More exactly, we would like to prove the following conjecture: there always exists a proper subgroup $H \subset G$ such that $|H| > \sqrt[3]{|G|}$. In the case of groups $G$ of even order, R. Brauer and K. Fowler managed to prove this conjecture in 1955. There remained the case $|G|$ odd; one was not able to “conquer” this. The knowledge of the Feit-Thompson Theorem makes this problem fairly simple and so also acceptable in the framework of this paper. Indeed, by this theorem all groups of odd order are solvable. But in every finite solvable group $G$ whose order is greater than unity and not a prime, there exists a proper subgroup of order $\ge \sqrt{|G|}$.

PROOF. Let us present the number $|G| = g$ as a product of prime powers, $g = p_1^{\alpha_1} \cdots p_s^{\alpha_s}$. We consider two cases.

First, let $s = 1$. Then $g = p^{\alpha}$. As $g$ is greater than unity and not a prime, we deduce that $\alpha \ge 2$. The first theorem of Sylow tells us that $G$ has a subgroup of order $p^{\alpha-1}$. But as $\alpha \ge 2$ implies the inequality $p^{\alpha-1} \ge \sqrt{g}$, we have found the desired subgroup $H$.

Second, let $s > 1$. Then we can divide the set $P = \{p_1, p_2, \dots, p_s\}$ into two non-empty and non-intersecting (all the $p_i$ are distinct!) subsets $P'$ and $P''$, that is, we have the relations $P = P' \cup P''$ and $P' \cap P'' = \emptyset$. Let $m = \prod_{p_i \in P'} p_i^{\alpha_i}$ and $n = \prod_{p_j \in P''} p_j^{\alpha_j}$. It follows from $P' \cap P'' = \emptyset$ that the numbers $m$ and $n$ are relatively prime. There are two possible cases: (i) $m > n$, and (ii) $n > m$. Let us now apply the “natural property” of a solvable group (cf. Subsection 6). In case (i) it guarantees that there is a subgroup $H$ in $G$ of order $m$. As $m > n$, we have $g = mn < m^2$, hence $|H| = m > \sqrt{g}$, and we have found the desired subgroup. The reasoning in case (ii) is analogous. It suffices to make the replacements $m \to n$ and $n \to m$. Our assertion is proved. □

As $|H| \ge \sqrt{|G|}$ implies $|H| > \sqrt[3]{|G|}$, the conjecture under view is established for all finite groups.

We give yet another application of the Feit-Thompson Theorem. In Subsection 2 we mentioned the following problem of W. Burnside: Do there exist non-commutative simple groups of odd order? The answer to this question is an immediate consequence of the Feit-Thompson Theorem: assuming that such a group exists, we see from this theorem that the group must be solvable. On the other hand, the class of simple solvable groups consists only of cyclic groups of prime order, thus only of commutative groups. This contradiction shows that the answer is negative.

⁶³ See the book [3], where the Reader will have a magnificent opportunity to familiarize him- or herself with contemporary Group Theory.
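The arithmetic behind the proof above can be sketched in a few lines. This is only an illustration of the counting argument, not of the group theory: the function names are hypothetical, the split $P' = \{\text{smallest prime}\}$, $P''$ = the remaining primes is one arbitrary choice among many, and the existence of subgroups of the computed orders is taken from the text (Sylow's theorem and the property quoted from Subsection 6) rather than verified.

```python
def prime_factorization(g: int) -> dict:
    """Return {prime: exponent} for an integer g >= 2, by trial division."""
    factors = {}
    p = 2
    while p * p <= g:
        while g % p == 0:
            factors[p] = factors.get(p, 0) + 1
            g //= p
        p += 1
    if g > 1:
        factors[g] = factors.get(g, 0) + 1
    return factors


def large_subgroup_order(g: int) -> int:
    """Order of the subgroup H used in the proof, for a composite group order g."""
    if g < 4:
        raise ValueError("the order must be composite")
    factors = prime_factorization(g)
    if len(factors) == 1:
        # Case s = 1: g = p^alpha with alpha >= 2, take order p^(alpha - 1).
        (p, alpha), = factors.items()
        if alpha < 2:
            raise ValueError("the order must be composite")
        return p ** (alpha - 1)
    # Case s > 1: split the primes into P' = {smallest prime} and P'' = the rest.
    smallest = min(factors)
    m = smallest ** factors[smallest]
    n = g // m
    # m and n are coprime and both exceed 1, so the larger one exceeds sqrt(g).
    return max(m, n)
```

For example, `large_subgroup_order(45)` returns 9 (from $45 = 3^2 \cdot 5$, with $9 > \sqrt{45}$), and `large_subgroup_order(27)` returns 9 (from $27 = 3^3$).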
Up to this day⁶⁴ two interesting (and difficult) questions remain unsolved.

(1) It is well-known that groups of order $p^{\alpha} \cdot q^{\beta}$, where $p$, $q$ are prime numbers, are solvable. This is Burnside's Theorem. But so far one does not know the structure of groups whose order is divisible by precisely three distinct prime numbers. Nor does one know which simple groups have this property, not even whether there are only finitely many such groups. That we here have to deal with a very well-founded problem should be clear from the following theorem of Thompson: if the order of a simple group has the form $p^{\alpha} \cdot q^{\beta} \cdot r^{\gamma}$ with distinct primes $p$, $q$, $r$ (say $p < q < r$), then $p = 2$, $q = 3$, and $r = 5$, $7$, $13$ or $17$.

(2) To prove that a finite group which admits an automorphism whose only fixed point is its identity must be solvable. A support for this conjecture is the following fact: if the automorphism under view (considered as an element of $\operatorname{Aut}(G)$) is of order $2^n$, $n$ an integer, then $G$ is of odd order; that $G$ is solvable follows now from the Feit-Thompson Theorem. It is essential that $G$ be finite. Indeed, there exist infinite non-solvable “linearly ordered” groups $G$. Each such group admits the automorphism $\sigma : G \to G$, $\sigma(g) = g^{-1}$, with the unity of $G$ as the single fixed point.
Comments.

1) In the proof on p. 386 it is written “First, let s > 1 . . . ”, but the case s = 1 is never dealt with. However, this is easy, as any group of order $p^n$ contains a subgroup of order $p^{n-1}$.

2) The problem of classifying non-Abelian finite simple groups, essentially going back to Galois, is now generally believed to be settled. This classification was finished in the early 1980's. According to it there are, apart from the alternating groups, 16 infinite families of groups of Lie type (the finite analogues of the classical families of simple Lie groups, which include the projective special linear groups PSL(n, K), the projective symplectic groups, and the simple orthogonal and unitary groups) and 26 sporadic simple groups (5 of which are the Mathieu groups, discovered by E. Mathieu already in 1861 and 1873). For further discussion see e.g. the book [1].

The following two items refer to the two questions raised at the end of the paper.

3) Question 1 on p. 387. From the classification of finite simple groups, one can read off exactly which simple groups have orders divisible by exactly three distinct prime numbers.

4) Question 2 on p. 387. It has now been shown, using the classification of finite simple groups, that all finite groups that admit an automorphism whose only fixed point is the identity must be solvable. For more information, see the book [8].

Gunnar Traustason
⁶⁴ . . . according to the information available to the author.

References
[1] R. Carter. Simple groups of Lie type. John Wiley & Sons, London, New York, Sydney, 1989.
[2] F. A. Cotton. Chemical applications of group theory. New York, London, 1964.
[3] W. Feit. Characters of finite groups. W. A. Benjamin, Inc., New York, Amsterdam, 1967.
[4] W. Feit and J. G. Thompson. Solvability of groups of odd order. Pac. J. Math. 13, 1963.
[5] E. Gabovitš. Principles of Algebra, I – V. Math. and Our Age 6–10, 1965.
[6] S. Helgason. Differential geometry and symmetric spaces. Academic Press, New York, London, 1962.
[7] G. Kangro. Kõrgem algebra (Higher algebra) II. Eesti Riiklik Kirjastus, Tallinn, 1950. (Estonian.)
[8] E. I. Khukhro. p-Automorphisms of finite p-groups. London Mathematical Society Lecture Note Series, 246. Cambridge University Press, Cambridge, 1997.
[9] N. Kristoffel and K. Rebane. Group theory and its applications in the physics of molecules and crystals. Tartu Univ. Press, Tartu, 1961.
[10] A. G. Kurosh. Lectures in general algebra. Fizmatgiz, Moscow, 1962. English translation: Pergamon Press, Oxford, London, Edinburgh, New York, 1965.
[11] Ü. Lumiste. The notion of space in geometry. Geometry and transformation groups. Math. and Our Age 14, 1968, 3–21.
[12] Ø. Ore. On coset representatives in groups. Proc. Am. Math. Soc. 9, 1958, 665–670.
[13] J. G. Thompson. Non-solvable finite groups all of whose local subgroups are solvable. Bull. Am. Math. Soc. 74, 1968.
5. [K73a] Polynomials and formal series
This paper arose from a desire to give the Reader a handy compilation for the formulation and proof of the Ruffini-Abel Theorem. Here we treat the symmetry of polynomials and the notion of irreducibility and, moreover, the concept of formal series. Symmetric polynomials have applications in several domains of mathematics. The Reader will find on the pages of this paper an interesting possibility to use them in the solution of algebraic equations of higher order, see [2]⁶⁵. Irreducible polynomials play in the arithmetic of the ring of polynomials about the same role as prime numbers in ordinary number theory. In recent times they have found applications in the theory of coding and decoding (see the book [1]⁶⁶). The Reader will probably find the concept of formal series especially interesting. An elegant use of this theory is, among other things, one of the tools by which Henri Cartan has refreshed the presentation, in university courses, of such an important branch of classical mathematics as the theory of functions of a complex variable. However, our goal has not been to give a complete catalogue of the properties of any of the mathematical objects mentioned. The Reader will only learn of those properties of an object which will later be required to understand the proof of Abel's theorem. In the opinion of the author the best way of learning something about mathematical objects is to see how they are used in achieving significant goals. In composing these remarks we have made essential use of M. M. Postnikov's book on Galois Theory [6].
5.1. Irreducibility of polynomials

Let $P$ be a field. We consider the ring $P[x]$, that is, the set consisting of all polynomials with coefficients in $P$,

$$f(x) = a_n + a_{n-1}x + \cdots + a_0 x^n, \qquad a_i \in P.$$

Addition and multiplication are defined, as in the case of polynomials with numerical coefficients, by the formulae $(f + g)(x) = f(x) + g(x)$, $(f \cdot g)(x) = f(x) \cdot g(x)$. There are no divisors of zero in the ring $P[x]$, and one can develop a theory of division similar to the one in the domain of ordinary integers; in the role of prime numbers there appear then the “irreducible polynomials”; let us familiarize ourselves with these strange “prime numbers”.

⁶⁵ Translator's note. For symmetric functions see e.g. Chap. 11 of Kurosh's book [10] quoted in Section 1, footnote 12.
⁶⁶ Editors' note. We suggest also an introduction to coding theory [5].
DEFINITION 5.1. A polynomial $f(x) \in P[x]$ is called reducible over the field $P$ if there exist non-constant polynomials $f_1(x)$ and $f_2(x)$ of lower degree in the ring $P[x]$ such that $f(x) = f_1(x) \cdot f_2(x)$. In the opposite case one says that $f(x)$ is irreducible over $P$.

Next, we present an assertion which illustrates the similarity between irreducible polynomials and prime numbers.

THEOREM 5.2. If the polynomial $f(x)$ with coefficients in $P$ has a common solution with the polynomial $p(x)$, which is irreducible over $P$, then $f(x)$ is divisible by $p(x)$.

PROOF. Let $g(x) = \operatorname{GCD}(f(x), p(x))$. As the equations $f(x) = 0$ and $p(x) = 0$ have a common solution, $\deg g(x) \ge 1$.⁶⁷ The coefficients of the polynomial $g(x)$, obtained by Euclid's algorithm, belong to $P$. If we had $\deg g(x) < \deg p(x)$, then $p(x) = g(x) \cdot p_1(x)$, where $1 \le \deg p_1(x) < \deg p(x)$ and where the coefficients of $p_1(x)$ belong to $P$. But this contradicts the irreducibility of $p(x)$. Thus $\deg g(x) = \deg p(x)$, so that $g(x)$ differs from $p(x)$ only by a constant factor, and $p(x)$ divides $f(x)$. □

As an example we consider the irreducibility of polynomials over number fields. We use the standard notation: $\mathbb{Z}$ for the set of integers, $\mathbb{Q}$ for the set of rational numbers, $\mathbb{C}$ for the set of complex numbers.

Let $P = \mathbb{Q}$. It turns out that it suffices to know whether a polynomial $f(x)$ with integer coefficients is irreducible in the ring⁶⁸ $\mathbb{Z}[x]$ or not. Indeed, let $f(x)$ be a polynomial with rational coefficients. We determine the least common multiple $a$ of the denominators of the coefficients $a_i$ and consider the polynomial $af(x)$; this is a polynomial with integer coefficients, and its reducibility is, apparently, necessary and sufficient for the reducibility of $f(x)$ over $\mathbb{Q}$. Thus the question of reducibility over $\mathbb{Q}$ is solvable if we can clarify the reducibility of polynomials over $\mathbb{Z}$. For this several criteria are known in algebra. We list a few of them.

EISENSTEIN'S CRITERION. Let there be given a polynomial

$$f(x) = a_n + a_{n-1}x + \cdots + a_1 x^{n-1} + a_0 x^n, \qquad a_i \in \mathbb{Z}.$$

If there exists a prime number $p$ such that the following conditions⁶⁹ are fulfilled:

$$p \nmid a_0; \qquad p \mid a_i \ \text{for } i \ne 0; \qquad p^2 \nmid a_n,$$

then $f(x)$ is irreducible over $\mathbb{Q}$. The proof is easy; assuming the contrary, the Reader will easily arrive at a contradiction. □

It follows from this criterion that there exist irreducible polynomials (over $\mathbb{Q}$) of arbitrarily high degree. Indeed, the polynomials $x^n + p$, $n = 1, 2, 3, \dots$, are irreducible.

COHN'S CRITERION. If the coefficients $a_i \in \mathbb{Z}$ of the polynomial $f(x) = \sum_{i=0}^{n} a_{n-i} x^i$ satisfy the condition $0 \le a_{n-i} \le 9$, and $f(10)$ is a prime number, then $f(x)$ is irreducible over $\mathbb{Q}$. □

⁶⁷ Here $\deg f(x)$ denotes the degree of the polynomial $f(x)$.
⁶⁸ The polynomial $f(x)$ is reducible in the ring $\mathbb{Z}[x]$ if there exist non-constant polynomials $f_1(x)$ and $f_2(x)$ with integer coefficients such that $f(x) = f_1(x) \cdot f_2(x)$ and $\deg f_i(x) < \deg f(x)$, $i = 1, 2$.
⁶⁹ Here $p \mid a_i$ means that the number $a_i$ is divisible by $p$, and $p \nmid a_i$ its contrary.
According to Cohn's criterion the polynomials $f_1(x) = x^3 + 8x^2 + 2x + 3$, $f_2(x) = 2x^3 + 6x + 3$, $f_3(x) = 2x^3 + x^2 + 2x + 9$ are irreducible over $\mathbb{Q}$. Among less known criteria of irreducibility, we note the following. In order for a polynomial $f(x) = x^n + a_1 x^{n-1} + \cdots + a_n$, $a_i \in \mathbb{Z}$, to be irreducible (over $\mathbb{Q}$) it is sufficient that one of the following conditions is fulfilled: (1) |a1 | > |1 + a2 | + |a3 | + · · · + |an |, 3 √ (2) a2 > 0, a2 > √ (|a1 | + |a3 | + . . . |an |), 2 √ (3) a1 = 0, a2 > 0, an = 0, √ a2 > 3(|a3 | + · · · + |an |), √ (4) n > 4, a4 > 0, a4 > 4 2(1 + |a1 | + |a2 | + · · · + |an |),
an = 0.
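Cohn's criterion for the three polynomials $f_1, f_2, f_3$ above amounts to checking that the decimal number formed by the coefficients is prime. A small sketch follows, under the assumption that the coefficients are listed from the leading one down to the constant term; the helper names `is_prime` and `cohn_applies` are chosen here for illustration only.

```python
def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def cohn_applies(digits):
    """digits: coefficients from the leading one to the constant term.

    Returns True when all coefficients lie in 0..9 and f(10) is prime.
    """
    if not all(0 <= d <= 9 for d in digits):
        return False
    value = 0
    for d in digits:          # Horner evaluation of f at x = 10
        value = value * 10 + d
    return is_prime(value)


examples = {
    "f1 = x^3 + 8x^2 + 2x + 3": [1, 8, 2, 3],
    "f2 = 2x^3 + 6x + 3": [2, 0, 6, 3],
    "f3 = 2x^3 + x^2 + 2x + 9": [2, 1, 2, 9],
}
for name, digits in examples.items():
    print(name, cohn_applies(digits))
```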
Over the complex numbers, only linear polynomials are irreducible. Indeed, by the fundamental theorem of algebra an equation $f(x) = 0$ with complex coefficients has at least one solution $\alpha \in \mathbb{C}$, from which it follows by Bézout's Lemma that $f(x)$ is divisible by the factor $x - \alpha$. Applying, if necessary, the same argument to the quotient, we find that $f(x)$ decomposes into a product of linear factors.

Over the field of real numbers there are already quadratic polynomials which are irreducible; a well-known example is provided by the polynomial $x^2 + 1$. It turns out that all polynomials of higher degree with real coefficients are reducible. The fact that besides linear polynomials only some quadratic polynomials appear here follows from the observation that if $f(\alpha) = 0$, $\alpha \in \mathbb{C}$, $\alpha \notin \mathbb{R}$, then also $f(\bar\alpha) = 0$, and that the factor $(x - \alpha)(x - \bar\alpha)$ has real coefficients and is irreducible over $\mathbb{R}$.

We note the absence of effective criteria for deciding the irreducibility of polynomials over an arbitrary field. The answer to the following interesting question⁷⁰ is not known.

PROBLEM OF PAUL TURÁN. Does there exist a non-negative integer $c$ such that for each polynomial $f(x) = \sum_{i=0}^{n} a_i x^{n-i}$, $a_i \in \mathbb{Z}$, $a_0 \neq 0$, there exists a polynomial $g(x) = \sum_{i=0}^{n} b_i x^{n-i}$, $b_i \in \mathbb{Z}$, which is irreducible over $\mathbb{Z}$, such that $\sum_{i=0}^{n} |a_i - b_i| \le c$? □
5.2. Symmetric polynomials

We consider the polynomial $f(x) = x^n + a_1 x^{n-1} + \dots + a_{n-1}x + a_n$ with coefficients in the field $P$. There always exists an extension $L$ of $P$ in which $f(x)$ decomposes into linear factors, that is, there exist elements $\alpha_1, \dots, \alpha_n \in L$ such that $f(x) = (x - \alpha_1) \cdots (x - \alpha_n)$. For example, if $P \subseteq \mathbb{C}$ we may take for $L$ the field of complex numbers $\mathbb{C}$. But if two polynomials are equal, then their coefficients in front of the corresponding powers of the variable $x$ must be equal also. This gives the well-known

70 In this connection, see [7].
CHAPTER VI. POPULARIZATION OF MATHEMATICS
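formulae of Viète
$$\begin{aligned}
-a_1 &= \alpha_1 + \alpha_2 + \dots + \alpha_n,\\
a_2 &= \alpha_1\alpha_2 + \alpha_1\alpha_3 + \dots + \alpha_{n-1}\alpha_n,\\
&\;\;\vdots\\
(-1)^{n-1} a_{n-1} &= \alpha_1 \cdots \alpha_{n-1} + \dots + \alpha_2 \cdots \alpha_n,\\
(-1)^{n} a_n &= \alpha_1 \alpha_2 \cdots \alpha_n.
\end{aligned}$$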
The right-hand sides of these relations do not change under arbitrary permutations $\alpha_1 \to \alpha_{i_1}$, $\alpha_2 \to \alpha_{i_2}$, ..., $\alpha_n \to \alpha_{i_n}$ of the solutions $\alpha_1, \dots, \alpha_n$; here $(i_1, \dots, i_n)$ is a permutation of the numbers $1, \dots, n$. Therefore these expressions are called symmetric with respect to the solutions $\alpha_1, \dots, \alpha_n$. This property makes it possible to distinguish among polynomials in $n$ variables the so-called symmetric polynomials, that is, polynomials $f(x_1, \dots, x_n)$ which do not change under an arbitrary permutation $x_1 \to x_{i_1}$, $x_2 \to x_{i_2}$, ..., $x_n \to x_{i_n}$. We already have examples of such polynomials: the so-called elementary symmetric polynomials
$$\sigma_1 = x_1 + x_2 + \dots + x_n; \quad \sigma_2 = x_1x_2 + x_1x_3 + \dots + x_{n-1}x_n; \quad \dots; \quad \sigma_{n-1} = x_1 \cdots x_{n-1} + \dots + x_2 \cdots x_n; \quad \sigma_n = x_1 x_2 \cdots x_n.$$
It is easy to find other examples: $x_1^2 + x_2^2 + \dots + x_n^2$, $x_1^3 x_2^3 \cdots x_n^3$, etc. As the symmetric polynomials form a subring of the ring of polynomials in $n$ variables, it is easy to enlarge the number of these examples. The Reader will notice that many symmetric polynomials can be expressed in terms of the elementary ones, for example:
$$x_1^2 + x_2^2 + \dots + x_n^2 = \sigma_1^2 - 2\sigma_2, \qquad x_1^3 x_2^3 \cdots x_n^3 = \sigma_n^3, \qquad x_1^2 x_2 \cdots x_n + \dots + x_1 x_2 \cdots x_n^2 = \sigma_1 \sigma_n.$$
Indeed, every symmetric polynomial can be expressed as a polynomial in the elementary ones: in other words, to each symmetric polynomial $f(x_1, \dots, x_n)$ (with coefficients in the field $P$) there corresponds a polynomial $q(x_1, \dots, x_n)$ (with coefficients in the field $P$) such that
$$f(x_1, \dots, x_n) = q(\sigma_1(x_1, \dots, x_n), \sigma_2(x_1, \dots, x_n), \dots, \sigma_n(x_1, \dots, x_n)).$$
This is the so-called Fundamental Theorem of Symmetric Polynomials⁷¹; we shall use it in the proof of the Ruffini-Abel Theorem.

71 For the proof, see [4, p. 262–264].
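The formulae of Viète and the identities between symmetric polynomials quoted above can be checked numerically on concrete integers. The sketch below is an illustration only; the function names `elementary_symmetric` and `monic_coefficients` and the sample values are assumptions of this example, not part of the text.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, k):
    """sigma_k evaluated at the given numbers: sum of all products of k of them."""
    return sum(prod(c) for c in combinations(values, k))


def monic_coefficients(roots):
    """Coefficients [1, a_1, ..., a_n] of (x - r_1)...(x - r_n), leading term first."""
    coeffs = [1]
    for r in roots:
        # multiply the current polynomial by (x - r)
        coeffs = [c - r * prev for c, prev in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


roots = [1, 2, 3]
a = monic_coefficients(roots)
# Viète: a_k = (-1)^k * sigma_k(roots)
viete_ok = all(a[k] == (-1) ** k * elementary_symmetric(roots, k)
               for k in range(1, len(roots) + 1))

xs = [2, -3, 5, 7]
s1 = elementary_symmetric(xs, 1)
s2 = elementary_symmetric(xs, 2)
sn = elementary_symmetric(xs, len(xs))
power_sum_ok = sum(x * x for x in xs) == s1 ** 2 - 2 * s2
mixed_ok = sum(x * sn for x in xs) == s1 * sn   # x_1^2 x_2...x_n + ... = sigma_1 sigma_n

print(a, viete_ok, power_sum_ok, mixed_ok)
```

Such a numerical check at one point is of course no proof of a polynomial identity; it only guards against slips in signs and indices.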
5.3. Embedding of the field of rational functions in an algebraically closed field

Let $P$ be a field of characteristic 0, that is, $\mathbb{Q} \subseteq P$. We consider the field of rational functions $R = P(x_1, \dots, x_n)$ over $P$. Its elements are quotients
$$\frac{f(x_1, \dots, x_n)}{g(x_1, \dots, x_n)}$$
of two polynomials $f, g \in P[x_1, \dots, x_n]$, $g \not\equiv 0$. Such a field need not be algebraically closed, that is, there may exist non-linear irreducible polynomials over it.

EXAMPLE 5.1. Let $P = \mathbb{Q}$. We show that the equation $x^2 + 1 = 0$ does not have any solutions in the field $R = \mathbb{Q}(x_1, \dots, x_n)$. Indeed, if
$$\frac{f(x_1, \dots, x_n)}{g(x_1, \dots, x_n)} \in R$$
were a solution of the equation $x^2 + 1 = 0$, we would have
$$\left(\frac{f(x_1, \dots, x_n)}{g(x_1, \dots, x_n)}\right)^2 \equiv -1,$$
which implies that
$$\left(\frac{f(c_1, \dots, c_n)}{g(c_1, \dots, c_n)}\right)^2 = -1$$
for any $(c_1, \dots, c_n) \in \mathbb{Q}^n$ with $g(c_1, \dots, c_n) \neq 0$. But as
$$\frac{f(c_1, \dots, c_n)}{g(c_1, \dots, c_n)} \in \mathbb{Q},$$
we have obtained a contradiction, because there exists no rational number $r \in \mathbb{Q}$ such that $r^2 = -1$. □

Nevertheless, it is possible to embed the field $R$ into an algebraically closed field, provided the base field $P$ is algebraically closed. First we embed $R$ in the field of formal series. What is a formal series? We denote the variable by $x$. A formal series is an infinite formal sum of the form
$$a_{-m}x^{-m} + a_{-m+1}x^{-m+1} + \dots + a_{-1}x^{-1} + a_0 + a_1 x + a_2 x^2 + \dots,$$
where $a_i \in P$ and $m \in \mathbb{Z}$; in short, $\sum_{i \ge -m} a_i x^i$. Among the formal series we have all polynomials, as $a_0 + a_1 x + \dots + a_k x^k = \sum_{i \ge 0} a_i x^i$, where $a_{k+1} = a_{k+2} = \dots = 0$.

The sum and product of any two formal series $f = \sum_{i \ge -m} a_i x^i$ and $g = \sum_{i \ge -n} b_i x^i$ are defined by the formulae:

(1) $f + g = \sum_{i \ge -n} (a_i + b_i) x^i$ if, for example, $n \ge m$, where $a_{-m-1} = \dots = a_{-n} = 0$, and

(2) $f \cdot g = \sum_{i \ge -m-n} c_i x^i$, where
$$\begin{cases}
c_{-m-n} = a_{-m} \cdot b_{-n},\\
c_{-m-n+1} = a_{-m} b_{-n+1} + a_{-m+1} b_{-n},\\
\quad\vdots
\end{cases}$$
An easy check shows that one has a ring with respect to these operations, the ring of formal series $P\langle x \rangle$. The Reader will notice that here we have to deal with an "extension" of addition and multiplication of polynomials. Thus the ring of polynomials $P[x]$ is a subring of $P\langle x \rangle$. Moreover, the latter is a field, as each formal series $f$ other than zero has an "inverse series", that is, a series $f^{-1} \in P\langle x \rangle$ such that $f \cdot f^{-1} = 1$. Indeed, each formal series different from zero has the form
$$f = x^n (a_0 + a_1 x + a_2 x^2 + \dots), \qquad n \in \mathbb{Z},\ a_0 \neq 0.$$
Determining the coefficients $b_i$ of the series $f^{-1} = x^{-n}(b_0 + b_1 x + b_2 x^2 + \dots)$ by the identities
$$a_0 b_0 = 1, \qquad a_0 b_1 + a_1 b_0 = 0, \qquad a_0 b_2 + a_1 b_1 + a_2 b_0 = 0, \qquad \dots$$
we see that $f \cdot f^{-1} = 1$. From the relation $P[x] \subseteq P\langle x \rangle$ it follows that the quotient of any two polynomials is contained in the field of formal series, that is, $P(x) \subseteq P\langle x \rangle$.

By induction one defines the notion of the field of formal series of several variables:
$$P\langle x_1, x_2 \rangle = P\langle x_1 \rangle\langle x_2 \rangle, \quad P\langle x_1, x_2, x_3 \rangle = P\langle x_1, x_2 \rangle\langle x_3 \rangle, \quad \dots, \quad P\langle x_1, \dots, x_n \rangle = P\langle x_1, \dots, x_{n-1} \rangle\langle x_n \rangle.$$
We see that $P(x_1, \dots, x_n) \subset P\langle x_1, \dots, x_n \rangle$.

Next we extend the field of formal series to an algebraically closed field. To this end we consider general formal series, that is, formal sums
$$f = \sum_{i \ge 0} a_i x^{n_i/n}, \qquad \text{where } n, n_0, n_1, \dots \in \mathbb{Z},\ n > 0,\ n_0 < n_1 < n_2 < \dots,$$
among the integers $n_i$ only finitely many being negative. If $n = 1$ we have ordinary formal series. Introducing the variable $y = x^{1/n}$, we can view the general formal series as an ordinary formal series in the variable $y$:
$$f(x) = \sum_{i \ge 0} a_i x^{n_i/n} = \sum_{i \ge 0} a_i y^{n_i}.$$
This simple remark shows that general formal series may be added and multiplied in the same way as ordinary formal series. This gives again a ring, which we denote $P\{x\}$. Even more, $P\{x\}$ is a field. Indeed, given a general formal series $f(x) \neq 0$ we consider the corresponding formal series $f(y) \in P\langle y \rangle$. As $P\langle y \rangle$ is a field, we can find a series $f^{-1}(y)$ such that $f(y) \cdot f^{-1}(y) = 1$. The change of variable $y = x^{1/n}$ gives the desired relation.

One can prove that $P\{x_1\}$ is an algebraically closed field (provided that the base field $P$ is so), that is, each non-linear polynomial with coefficients in this field is reducible (over $P\{x_1\}$). By induction we may define the field of general formal series of several variables: $P\{x_1, \dots, x_{n-1}, x_n\} = P\{x_1, \dots, x_{n-1}\}\{x_n\}$. We have the relations
$$(105) \qquad P[x_1, \dots, x_n] \subset P(x_1, \dots, x_n) \subset P\langle x_1, \dots, x_n \rangle \subset P\{x_1, \dots, x_n\}.$$
As $P\{x_1\}$ is an algebraically closed field, a simple induction shows that $P\{x_1, \dots, x_n\}$ is also algebraically closed (provided that the base field $P$ is so). From the relations (105) it is now clear that the goal set out by us is achieved.
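The recurrence used in this section to invert a formal series, $a_0 b_0 = 1$, $a_0 b_1 + a_1 b_0 = 0$, and so on, can be carried out term by term. The following sketch works over $\mathbb{Q}$ with exact fractions and treats only the factor $a_0 + a_1 x + \dots$ with $a_0 \neq 0$; the factor $x^n$ is inverted separately as $x^{-n}$. The function name `inverse_coefficients` is an assumption made for this illustration.

```python
from fractions import Fraction


def inverse_coefficients(a, terms):
    """First `terms` coefficients b_0, b_1, ... of 1 / (a_0 + a_1 x + ...).

    `a` lists a_0, a_1, ... (finitely many; all later coefficients are zero)
    and a_0 must be non-zero.
    """
    if a[0] == 0:
        raise ValueError("a_0 must be non-zero")
    a = [Fraction(c) for c in a]
    b = [1 / a[0]]                       # from a_0 * b_0 = 1
    for k in range(1, terms):
        # a_0 b_k + a_1 b_{k-1} + ... + a_k b_0 = 0, solved for b_k
        acc = sum(a[j] * b[k - j] for j in range(1, min(k, len(a) - 1) + 1))
        b.append(-acc / a[0])
    return b


print(inverse_coefficients([1, -1], 5))   # geometric series 1/(1 - x)
print(inverse_coefficients([2, 1], 3))    # 1/(2 + x)
```

Each coefficient $b_k$ depends only on the finitely many earlier ones, which is why the inverse exists as a formal series even when it has infinitely many non-zero terms.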
5.4. The splitting field of the general equation

Here, to illustrate the applications of the notions set forth in the previous two Sections, we introduce the notions of general equation and its splitting field; these are necessary for the formulation in contemporary language of the Ruffini-Abel Theorem. In what follows we use the word "field" in the meaning "subfield of the field of complex numbers".⁷²

Let $P$ be any field. Let $a_1, \dots, a_n$ be complex numbers. If there is no non-zero polynomial $h(x_1, \dots, x_n)$ with coefficients in $P$ such that $h(a_1, \dots, a_n) = 0$, then we say that these numbers are algebraically independent (over $P$).⁷³ A series of examples of algebraic independence (over $\mathbb{Q}$) is provided by the following theorem.

THEOREM 5.3 (Lindemann). If the algebraic numbers $\alpha_1, \dots, \alpha_n$ are linearly independent over $\mathbb{Q}$, then the numbers $e^{\alpha_1}, \dots, e^{\alpha_n}$ are algebraically independent over $\mathbb{Q}$.

For example, the numbers $e$ and $e^{\sqrt{2}}$ are algebraically independent over $\mathbb{Q}$, because $1$ and $\sqrt{2}$ are linearly independent over $\mathbb{Q}$.

As in Section 5.3 we can form the field of general formal series $K = P\{a_1, \dots, a_n\}$. Let us now take arbitrary series $\alpha_1, \dots, \alpha_n \in K$ and consider the intersection of those subfields of $K$ which contain the base field $P$ and the series $\alpha_1, \dots, \alpha_n$. This is the smallest subfield of $K$ with this property; we denote it $P(\alpha_1, \dots, \alpha_n)$. In particular, we put $R = P(a_1, \dots, a_n)$.
DEFINITION 5.4. If the coefficients $a_1, \dots, a_n$ of the equation
$$(106) \qquad f(x) = x^n + a_1 x^{n-1} + \dots + a_{n-1}x + a_n = 0$$
are algebraically independent (over $P$), we call it the general equation of the $n$-th order (over $P$).

Assume that the base field $P$ contains $\mathbb{Q}$ and let it be algebraically closed. In this situation the field of general formal series $K = P\{a_1, \dots, a_n\}$, containing the coefficients of the general equation, is algebraically closed (see Section 5.3), so that the polynomial $f(x)$ splits into linear factors in it. Thus there exist elements $\alpha_1, \dots, \alpha_n \in K$ such that $f(x) = (x - \alpha_1) \cdots (x - \alpha_n)$. The field $R(\alpha_1, \dots, \alpha_n) = \Delta$, where $R = P(a_1, \dots, a_n)$, is called the splitting field of the general equation (106).

PROPOSITION 5.5. $R(\alpha_1, \dots, \alpha_n) = P(\alpha_1, \dots, \alpha_n)$.

PROOF. On the one hand, it follows from $P \subset R$ that $P(\alpha_1, \dots, \alpha_n) \subset R(\alpha_1, \dots, \alpha_n)$. On the other hand, as by the formulae of Viète $a_i = (-1)^i \sigma_i(\alpha_1, \dots, \alpha_n)$, we have $\Delta = R(\alpha_1, \dots, \alpha_n) = P(a_1, \dots, a_n)(\alpha_1, \dots, \alpha_n) \subseteq P(\alpha_1, \dots, \alpha_n)$. This proves the assertion. □

The result obtained shows that each element of the splitting field can be written in the form of a fraction
$$\frac{f(\alpha_1, \dots, \alpha_n)}{g(\alpha_1, \dots, \alpha_n)},$$
where $f$ and $g$ are polynomials with coefficients in the base field $P$. One can show that this representation is unique:

PROPOSITION 5.6. There is no element $a \in \Delta$ such that
$$(107) \qquad a = \frac{f_1(\alpha_1, \dots, \alpha_n)}{g_1(\alpha_1, \dots, \alpha_n)} = \frac{f_2(\alpha_1, \dots, \alpha_n)}{g_2(\alpha_1, \dots, \alpha_n)}$$
with $h \overset{\mathrm{def}}{=} f_1 g_2 - f_2 g_1 \not\equiv 0$.

PROOF. We argue by contradiction. Let us assume that an element $a$ with the property (107) nevertheless exists. Then there exists a polynomial $h(x_1, \dots, x_n)$ with coefficients in the base field $P$, which is not the zero polynomial, but $h(\alpha_1, \dots, \alpha_n) = 0$. Let $S_n$ be the complete symmetric group. We form for each $\sigma \in S_n$,
$$\sigma = \begin{pmatrix} 1 & 2 & \dots & n \\ i_1 & i_2 & \dots & i_n \end{pmatrix},$$
the "conjugate polynomials"⁷⁴ $h^\sigma(x_1, \dots, x_n) \equiv h(x_{i_1}, \dots, x_{i_n})$. Then
$$h \not\equiv 0 \ \Rightarrow\ (\forall \sigma \in S_n)\ h^\sigma \not\equiv 0 \ \Rightarrow\ \prod_{\sigma \in S_n} h^\sigma \not\equiv 0.$$
The product
$$s(x_1, \dots, x_n) \overset{\mathrm{def}}{=} \prod_{\sigma \in S_n} h^\sigma(x_1, \dots, x_n)$$
is a symmetric polynomial, so that we can apply to it the Fundamental Theorem of Symmetric Polynomials:
$$s(x_1, \dots, x_n) = q(\sigma_1(x_1, \dots, x_n), \dots, \sigma_n(x_1, \dots, x_n)).$$
Using the formulae of Viète $\sigma_i(\alpha_1, \dots, \alpha_n) = (-1)^i a_i$, we obtain
$$(108) \qquad s(\alpha_1, \dots, \alpha_n) = q(\pm a_1, \dots, \pm a_n).$$
Here $q \not\equiv 0$, because $s \not\equiv 0$. On the other hand, we have the identities
$$(109) \qquad s(\alpha_1, \dots, \alpha_n) = \prod_{\sigma \in S_n} h^\sigma(\alpha_1, \dots, \alpha_n) = h(\alpha_1, \dots, \alpha_n) \cdot \prod_{\sigma \neq \varepsilon} h^\sigma(\alpha_1, \dots, \alpha_n) = 0,$$
as $h(\alpha_1, \dots, \alpha_n) = 0$. Taking account of the relation (108) and $q \not\equiv 0$ we conclude from the last mentioned relation (109) that $a_1, \dots, a_n$ are algebraically dependent over $P$. But this is a contradiction, as the $a_1, \dots, a_n$, being the coefficients of the general equation, are algebraically independent over $P$. The assertion is proved. □

From the reasoning given one readily deduces that all solutions of the general equation are simple. Indeed, suppose that $\alpha_1 = \alpha_2$ and take $h(x_1, \dots, x_n) = x_1 - x_2$. Then $h \not\equiv 0$ and $h(\alpha_1, \dots, \alpha_n) = \alpha_1 - \alpha_2 = 0$. Contradiction. □

72 In the following considerations it is sufficient that "field" means "a subfield of an algebraically closed field of characteristic 0".
73 Several criteria for verifying the algebraic independence of complex numbers can be found in [3, p. 118–121].
74 For example, if $n = 4$, $h(x_1, x_2, x_3, x_4) = x_1^2 x_3^2 + x_2^5 x_4$ and $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$, then $h^\sigma \equiv x_2^2 x_4^2 + x_3^5 x_1$.
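The construction of the conjugate polynomials $h^\sigma$ and of their symmetric product $s$ in the proof above can be tried out on small examples. The sketch below represents a polynomial as a dictionary from exponent tuples to integer coefficients; this representation and the function names are assumptions of the illustration. It reproduces the example of footnote 74 and forms the product $s$ for $h = x_1 - x_2$ with $n = 3$.

```python
from itertools import permutations


def conjugate(poly, perm):
    """h^sigma(x_1..x_n) = h(x_{i_1}, ..., x_{i_n}) for perm = (i_1, ..., i_n), 1-based."""
    result = {}
    for exps, coeff in poly.items():
        new_exps = [0] * len(exps)
        for k, e in enumerate(exps):
            new_exps[perm[k] - 1] += e    # x_{k+1}^e becomes x_{i_{k+1}}^e
        key = tuple(new_exps)
        result[key] = result.get(key, 0) + coeff
    return result


def multiply(p, q):
    """Product of two polynomials; zero coefficients are dropped."""
    result = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            key = tuple(a + b for a, b in zip(e1, e2))
            result[key] = result.get(key, 0) + c1 * c2
    return {k: v for k, v in result.items() if v != 0}


def symmetrized_product(poly, n):
    """s = product of h^sigma over all sigma in S_n."""
    s = {(0,) * n: 1}
    for perm in permutations(range(1, n + 1)):
        s = multiply(s, conjugate(poly, perm))
    return s


def is_symmetric(poly, n):
    return all(conjugate(poly, perm) == poly
               for perm in permutations(range(1, n + 1)))


# Footnote 74: h = x1^2 x3^2 + x2^5 x4, sigma = (1 2 3 4 -> 2 3 4 1)
h74 = {(2, 0, 2, 0): 1, (0, 5, 0, 1): 1}
print(conjugate(h74, (2, 3, 4, 1)))

# h = x1 - x2 for n = 3: s is non-zero and symmetric
s = symmetrized_product({(1, 0, 0): 1, (0, 1, 0): -1}, 3)
print(bool(s), is_symmetric(s, 3))
```

For $h = x_1 - x_2$ the product $s$ runs over all ordered pairs of distinct indices, so up to sign it is the product of the squared differences of the variables; the expansion is kept small here by choosing $n = 3$.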
References
[1] E. L. Bloch and M. S. Pinsker (eds.). Some questions of coding theory. Mir, Moscow, 1970.
[2] H. Espenberg. Symmetric polynomials. Math. and Our Age 19, 1973, 25–38.
[3] A. O. Gel'fond. Algebraic and transcendental numbers. Gosizdat, Moscow, 1952. English translation: Dover Publ., Inc., New York, 1960.
[4] G. Kangro. Kõrgem algebra (Higher algebra). Eesti Riiklik Kirjastus, Tallinn, 1962. (Estonian.)
[5] J. H. van Lint. Introduction to coding theory. Graduate Texts in Mathematics. Springer, New York, 1999.
[6] M. M. Postnikov. Fundamentals of Galois theory. FyzMatGIZ, Moscow, 1963. English translation: Dover Publ., Inc., New York, 2004.
[7] A. Schinzel. Reducibility of polynomials and covering systems of congruences. Acta Arithmetica 13, 1967, 91–101.
6. [K75a] On Galois theory
Comments by U. Persson
It is not feasible in practice to proceed like Swift's scholar, whom Gulliver visits in Balnibarbi, namely to develop in systematic order, say according to the required number of inferential steps, all consequences and discard the "uninteresting" ones; just as the great works of world literature have not come into being by taking the twenty-six letters of the alphabet, forming all 'combinations with repetition' up to the length of $10^{10}$, and selecting and preserving the most meaningful and beautiful among them.
H. Weyl, Philosophy of Mathematics and Natural Science, Princeton, 1949
The path for Galois theory was prepared by the work of Lagrange, Gauss and Abel. The recognition of the main principles of the theory and their application is due to Galois. Galois associates to each algebraic equation a corresponding group and, having found a series of deep connections between the properties of these two objects, he gives an exhaustive answer to the difficult question about the solvability of algebraic equations by radicals, which occupied mathematicians for several centuries⁷⁵. Galois' point of view, studying an equation by trying to understand the properties of its group, is of revolutionary importance. His work gave the impetus to a tendency where the center of gravity of research in algebra began to shift towards structural theories. The clear-cut unfolding of this tendency occurred in the 1920's and special credit here goes to D. Hilbert and E. Noether. Today such a point of view is well-known thanks to the influence of N. Bourbaki's treatise Éléments de mathématique (Elements of mathematics).

This paper consists of two parts. In the first of them the Reader will be acquainted with some notions, connections and results in Galois theory. We also give a proof of the Abel-Ruffini theorem and treat one line in Galois theory, the inverse problem of Galois. However, the level of generality considered has no limits: in a Bourbaki seminar in 1959 A. Grothendieck presented his results on the so-called Galois theory of schemes, of which what will be set out below is just a very narrow special case. This opened up for Galois theory a broad path in geometry. Of course, even this is not a limit, because the employment of the Galois correspondence is nowadays so frequent that it has almost developed into a philosophical principle.⁷⁶ In the second half of the paper, which carries the title "The duality principle in mathematics", we consider in which way this development of the ideas of Galois has taken place, and we treat some general questions connected with the application of Galois' ideas in new disciplines.

75 See also the author's paper in Section 3.
76 In the article [K75b] (see Section 7 on automata theory) the Reader can acquaint himself with the realization of the main idea of Galois in Computer Science.
One can also view the lines below as the final chord of the ideas of a series of papers77, where I have tried to prepare the Reader for the understanding of the present material. It will be assumed that the Reader has access to the papers in question, to which he or she can refer in case of need. Special references will not be given, but each time the Reader encounters a little known term or fact, he or she can turn for help to the papers mentioned. Finally, let us also remark that the paper’s second half can be read independently of the first one, so that a Reader who is only interested in the general development of the ideas derived from Galois theory may at once turn to the reading of the second half.
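6.1. On the Galois correspondence

1. Let us consider the equation $x^4 + x^3 + x^2 + x + 1 = 0$. Its solutions are the primitive 5th roots of unity: $\alpha_1 = e$, $\alpha_2 = e^2$, $\alpha_3 = e^3$, $\alpha_4 = e^4$, where $e = \cos\frac{2\pi}{5} + i \sin\frac{2\pi}{5}$. These solutions satisfy the relations
$$(110) \qquad \alpha_1\alpha_4 = 1, \qquad \alpha_2\alpha_3 = 1, \qquad \alpha_1^2\alpha_3 = 1, \qquad \alpha_1^3\alpha_2 = 1.$$
Of course, these solutions also satisfy the relations given by the formulae of Viète. The latter remain in force also after applying to the solutions in them an arbitrary substitution of degree 4. In the case of the relations (110) we cannot maintain this. For example, the substitution
$$\begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_2 & \alpha_1 & \alpha_3 & \alpha_4 \end{pmatrix}$$
carries the relation $\alpha_1\alpha_4 = 1$ into $\alpha_2\alpha_4 = 1$, which does not hold. If we try all 24 substitutions on the roots $\alpha_1, \alpha_2, \alpha_3, \alpha_4$, we see that the relations (110) are not disturbed by the following four among them:
$$\begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \end{pmatrix}, \quad
\begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_2 & \alpha_4 & \alpha_1 & \alpha_3 \end{pmatrix}, \quad
\begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_4 & \alpha_3 & \alpha_2 & \alpha_1 \end{pmatrix}, \quad
\begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_3 & \alpha_1 & \alpha_4 & \alpha_2 \end{pmatrix}.$$
An immediate check shows that these four substitutions form a subgroup of the group of permutations $S_4$.

Let us consider the general case. We consider the $n$-th order algebraic equation
$$(111) \qquad a_0 + a_1 x + \dots + a_{n-1}x^{n-1} + x^n = 0.$$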
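Before the general case is developed, the search through all 24 substitutions in the example $x^4 + x^3 + x^2 + x + 1 = 0$ can be sketched in a few lines. Since $\alpha_j = e^j$ with $e^5 = 1$, a monomial relation $\alpha_1^{k_1}\alpha_2^{k_2}\alpha_3^{k_3}\alpha_4^{k_4} = 1$ holds exactly when $k_1 + 2k_2 + 3k_3 + 4k_4$ is divisible by 5; the encoding of relations as exponent tuples and the function names below are assumptions of this illustration.

```python
from itertools import permutations

# Relations (110) as exponent tuples (k1, k2, k3, k4):
# a1*a4 = 1, a2*a3 = 1, a1^2*a3 = 1, a1^3*a2 = 1
RELATIONS = [(1, 0, 0, 1), (0, 1, 1, 0), (2, 0, 1, 0), (3, 1, 0, 0)]


def holds(exponents, images):
    """Does the relation still hold after replacing alpha_j by alpha_{images[j-1]}?"""
    total = sum(k * images[j] for j, k in enumerate(exponents))
    return total % 5 == 0      # alpha_m = e^m and e^5 = 1


def preserving_substitutions():
    return [p for p in permutations((1, 2, 3, 4))
            if all(holds(r, p) for r in RELATIONS)]


def compose(p, q):
    """Apply q first, then p (substitutions as tuples of images, 1-based)."""
    return tuple(p[q[j] - 1] for j in range(4))


group = preserving_substitutions()
closed = all(compose(p, q) in group for p in group for q in group)
print(group, closed)
```

In this encoding the four surviving substitutions are exactly the maps $\alpha_k \to \alpha_{ck \bmod 5}$ for $c = 1, 2, 3, 4$, which is another way of seeing that they form a group.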
We denote the left-hand side of the equation by $f(x)$ and assume that its coefficients belong to some fixed field $P$. With the aid of the derivative of $f(x)$ we can separate from $f(x)$ the product of all factors which have only simple solutions. This can be done in such a way that the coefficients of these factors likewise belong to $P$. Therefore we may henceforth assume that all solutions $\alpha_1, \dots, \alpha_n$ of $f(x) = 0$ are simple. Let
$$(112) \qquad \Psi_i(\alpha_1, \dots, \alpha_n) = 0, \qquad i \in I,$$
be the system of all possible polynomial relations between the solutions of $f(x) = 0$ (the relations given by the Viète formulae are always present; in general, this system of relations may also be infinite). In the complete symmetric group $S_n$ we distinguish the subset $G(f)$ of those substitutions which either do not change any of the relations of the system (112) or else map each relation in (112) again to a relation in the system; in other words, we have the assertion
$$\forall\, i \in I\ \exists\, j \in I: \quad \Psi_i^\sigma(\alpha_1, \dots, \alpha_n) = \Psi_j(\alpha_1, \dots, \alpha_n).$$
The set $G(f) \subset S_n$ is a subgroup. Indeed, if $\sigma, \tau \in G(f)$, then one readily sees that $\sigma \cdot \tau \in G(f)$, and, likewise, that the identity substitution belongs to the set $G(f)$. On the other hand, for each substitution $\sigma$ of degree $n$ there exists an $m > 0$ such that $\sigma^m = \varepsilon$, so that $\sigma^{-1} = \sigma^{m-1}$, from which it is clear that $\sigma^{-1} \in G(f)$.

DEFINITION 6.1. The substitution group $G(f)$, none of whose elements disturbs the relations between the solutions of the equation $f(x) = 0$, is called the Galois group of the equation.

We will see some examples in the next section.

77 Cf. [K69c, K70, K73a] or Sections 3, 4 and 5.

2. We now give another definition of the notion of Galois group, in which the equation itself is replaced by a new object, the splitting field of the equation. This makes the theory more transparent and increases its generality; at the same time it widens its range of application.

Let $\Delta$ be a field. We look at those bijections $\sigma : \Delta \to \Delta$ which preserve sums and products in the field, i.e. for any $a, b \in \Delta$ one has the relations
$$(a + b)^\sigma = a^\sigma + b^\sigma; \qquad (a \cdot b)^\sigma = a^\sigma \cdot b^\sigma.$$
Such maps $\sigma : \Delta \to \Delta$ are called automorphisms of $\Delta$. As automorphisms are one-to-one, it follows that for each $b \in \Delta$ one can find an $a \in \Delta$ such that $a^\sigma = b$; here the element $a$ is uniquely determined, as $a \neq a' \Longrightarrow a^\sigma \neq a'^\sigma$. From this it follows that the map defined by the formula $b^{\sigma^{-1}} = a$ is an automorphism. If we define multiplication of automorphisms as composition, then the set of all automorphisms of the field $\Delta$ equipped with this operation (multiplication) is a group, which we denote by $G(\Delta)$. The unit element of $G(\Delta)$ is the automorphism $\varepsilon$ for which all elements of $\Delta$ are fixed points. We call $G(\Delta)$ the group of all automorphisms of $\Delta$; it is often denoted $\operatorname{Aut} \Delta$.

Next, let $\Delta$ be the splitting field of equation (111), i.e. the smallest field containing $P$ and all solutions $\alpha_1, \dots, \alpha_n$ of the equation $f(x) = 0$. Thus $\Delta = P(\alpha_1, \dots, \alpha_n)$. We consider those automorphisms $\sigma$ in $G(\Delta)$ which leave invariant all elements of $P$, i.e. from $a \in P$ it follows that $a^\sigma = a$. These automorphisms $\sigma$ form a subgroup in $G(\Delta)$, which is called the Galois group of the extension $\Delta/P$ and is denoted by $G(\Delta, P)$.

REMARK 6.2. The definition given indicates a path to far reaching generalizations. Indeed, we can speak of the Galois group, not only of the splitting field of an equation, but of the Galois group $G(L, K)$ of an arbitrary extension $L/K$, where $G(L, K) = \{\sigma \in \operatorname{Aut} L \mid a^\sigma = a \text{ for all } a \in K\}$. In other words, the elements of $G(L, K)$ are all the automorphisms for which all the elements of the subfield $K$ are fixed points.

LEMMA 6.3. Let $f(x) = 0$ be an algebraic equation whose coefficients are in the field $P$ and whose splitting field is $\Delta$. The Galois group of the extension $\Delta/P$ coincides with the Galois group of the equation $f(x) = 0$.
PROOF. The automorphisms $\sigma \in G(\Delta, P)$ have a remarkable property: they map any solution of $f(x) = 0$ into a solution of the same equation. Indeed, from the equality $0 = a_0 + a_1\alpha + \dots + a_{n-1}\alpha^{n-1} + \alpha^n$ it follows that
$$\begin{aligned}
0 = 0^\sigma &= (a_0 + a_1\alpha + \dots + a_{n-1}\alpha^{n-1} + \alpha^n)^\sigma\\
&= a_0^\sigma + a_1^\sigma \alpha^\sigma + \dots + a_{n-1}^\sigma (\alpha^{n-1})^\sigma + (\alpha^n)^\sigma\\
&= a_0 + a_1\alpha^\sigma + \dots + a_{n-1}(\alpha^\sigma)^{n-1} + (\alpha^\sigma)^n.
\end{aligned}$$
Therefore we have $f(\alpha^\sigma) = 0$, which means that together with each solution $\alpha$ of $f(x) = 0$ also $\alpha^\sigma$ is a solution.

On the basis of this observation it is easy to prove the lemma. Indeed, it follows from it that for each root $\alpha_i$ and an arbitrary $\sigma \in G(\Delta, P)$ there exists an index $s_i$, $1 \le s_i \le n$, such that $\alpha_i^\sigma = \alpha_{s_i}$. As the automorphism $\sigma$ is one-to-one and the roots $\alpha_1, \dots, \alpha_n$ are simple, we see that for distinct indices $i$ and $j$ the indices $s_i$ and $s_j$ are distinct too. This means that
$$\Phi(\sigma) = \begin{pmatrix} 1 & 2 & \dots & n \\ s_1 & s_2 & \dots & s_n \end{pmatrix}$$
is a substitution of degree $n$. As $\Phi(\sigma \cdot \tau) = \Phi(\sigma) \cdot \Phi(\tau)$, the map $\Phi$ is a representation (a homomorphism) of the group $G(\Delta, P)$ in the group $S_n$. The kernel $\operatorname{Ker} \Phi$ of $\Phi$ consists of all those automorphisms $\sigma$ which leave invariant all solutions, and so the whole splitting field $\Delta$. But the only such automorphism is $\varepsilon$. As the kernel $\operatorname{Ker} \Phi$ of the homomorphism $\Phi$ consists only of the identity automorphism, we may view the image $\Phi(G(\Delta, P))$ as a subgroup of $S_n$. In order to complete the proof we must show that $\Phi(G(\Delta, P)) = G(f)$. This is a simple exercise and will be left to the Reader. The Lemma is proven. □

Let us now have a look at two examples of the computation of the Galois group of an equation.

EXAMPLE 6.1. Let $\mathbb{Q}$ be the rational field. We seek the Galois group $G(\Delta, \mathbb{Q}) = G$ for the splitting field $\Delta/\mathbb{Q}$ of the equation $x^4 - 2 = 0$. This equation has the solutions $\alpha_1 = \alpha = \sqrt[4]{2}$, $\alpha_2 = i\alpha$, $\alpha_3 = -\alpha$, $\alpha_4 = -i\alpha$. Therefore the rational relation $\alpha_1\alpha_3 + \alpha_2\alpha_4 = 0 \in \mathbb{Q}$ must remain in force under the action of the elements of $G$. Thus the group $G$ either coincides with the substitution group
$$H = \{1, (13), (24), (13)(24), (12)(34), (14)(23), (1234), (1432)\},$$
or else is a subgroup of it. Here we used the representation of substitutions as cycles or their products. Therefore the order $|G|$ of $G$ is a divisor of 8 (theorem of Lagrange).

We record that the splitting field of $f(x)$ is $\Delta = \mathbb{Q}(\alpha, i)$. We denote by the symbol $[L : K]$ the dimension of $L$ as a vector space over $K$. It is easy to convince oneself that $[\mathbb{Q}(\alpha) \cap \mathbb{Q}(i) : \mathbb{Q}] \le 2$, from which it follows in view of $i \notin \mathbb{Q}(\alpha)$ that $[\mathbb{Q}(\alpha) \cap \mathbb{Q}(i) : \mathbb{Q}] = 1$. But this again means that $\mathbb{Q}(\alpha) \cap \mathbb{Q}(i) = \mathbb{Q}$. As $f(x)$ is irreducible over $\mathbb{Q}$ (by Eisenstein's criterion), we have $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 4$. The minimal degree of an algebraic equation with coefficients in $\mathbb{Q}(\alpha)$ which is satisfied by $i$ must be 2. Therefore we have $[\mathbb{Q}(\alpha, i) : \mathbb{Q}(\alpha)] = 2$. The equalities $[\Delta : \mathbb{Q}(\alpha)] = 2$ and $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 4$ show that $[\Delta : \mathbb{Q}] = 8$.
In view of the Galois correspondence (cf. Subsection 4 below) we obtain the equality $|G| = [\Delta : \mathbb{Q}] = 8$. Therefore the group sought must coincide with the substitution group $H$.

EXAMPLE 6.2. To find the Galois group $G(\Delta, \mathbb{Q}(i)) = G$ for the same equation $x^4 - 2 = 0$, but for the extension $\Delta/\mathbb{Q}(i)$. Here we have the equalities
$$\frac{\alpha_2}{\alpha_1} = \frac{\alpha_3}{\alpha_2} = \frac{\alpha_4}{\alpha_3} = \frac{\alpha_1}{\alpha_4} = i \in \mathbb{Q}(i),$$
so these rational expressions $\frac{\alpha_k}{\alpha_i}$ must be invariants of the group $G$. Hence, $G = \{1, (1234), (13)(24), (1432)\}$.

3. Let us now determine the Galois group of the general algebraic equation $x^n + a_1 x^{n-1} + \dots + a_{n-1}x + a_n = 0$. To this end we consider the extension $\Delta/R$, where $R$ stands for the field of rational functions $P(a_1, \dots, a_n)$ and $\Delta$ is the splitting field of the general equation. In view of Lemma 6.3, established in the previous section, the problem is equivalent to finding the Galois group of $\Delta/R$.

LEMMA 6.4. The Galois group $G(\Delta, R)$ of the extension $\Delta/R$ is isomorphic to the complete symmetric group $S_n$.

PROOF. As the solutions $\alpha_1, \dots, \alpha_n$ of the equation are all simple, each $\sigma \in G(\Delta, R)$ determines via the formulae $\alpha_k^\sigma = \alpha_{i_k}$, $k = 1, 2, \dots, n$, a substitution
$$S_\sigma = \begin{pmatrix} \alpha_1 & \alpha_2 & \dots & \alpha_n \\ \alpha_{i_1} & \alpha_{i_2} & \dots & \alpha_{i_n} \end{pmatrix}.$$
This can also be read as
$$S_\sigma = \begin{pmatrix} 1 & 2 & \dots & n \\ i_1 & i_2 & \dots & i_n \end{pmatrix}.$$
In the proof of Lemma 6.3 we showed that the map $\mu : G(\Delta, R) \to S_n$, given by the formula $\mu(\sigma) = S_\sigma$, is a monomorphism. In order to prove the lemma at hand it therefore suffices to prove that $\mu$ is an epimorphism. For the proof of the last statement we associate to each substitution
$$S = \begin{pmatrix} 1 & 2 & \dots & n \\ i_1 & i_2 & \dots & i_n \end{pmatrix}$$
a transformation $\sigma$ of the splitting field $\Delta$ given by the formula
$$(113) \qquad \left(\frac{f(\alpha_1, \alpha_2, \dots, \alpha_n)}{g(\alpha_1, \alpha_2, \dots, \alpha_n)}\right)^{\sigma} = \frac{f(\alpha_{i_1}, \alpha_{i_2}, \dots, \alpha_{i_n})}{g(\alpha_{i_1}, \alpha_{i_2}, \dots, \alpha_{i_n})}.$$
In order to show that $\mu$ is an epimorphism we must convince ourselves that for each such transformation $\sigma$ there holds the relation $\sigma \in G(\Delta, R)$. Let us first verify that $\sigma$ is bijective. That $\sigma$ is injective follows at once from (113) if we take account of the fact that each element in $\Delta$ can be represented in a unique way in the form $\frac{f}{g}$. Moreover, $\sigma$ is bijective, since the transformation $\sigma^{-1} : \Delta \to \Delta$ given by (113) in the case of the substitution $S^{-1}$ has the property that $\sigma \cdot \sigma^{-1}$ is the identity.
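We show that the transformations $\sigma$ under consideration are automorphisms of $\Delta$. Indeed, using the notation $f(\alpha_{i_1}, \alpha_{i_2}, \dots, \alpha_{i_n}) = f^S(\alpha_1, \alpha_2, \dots, \alpha_n) = f^S$, we see that we have the equations
$$\left(\frac{f_1}{g_1} + \frac{f_2}{g_2}\right)^{\sigma} = \left(\frac{f_1 g_2 + f_2 g_1}{g_1 g_2}\right)^{\sigma} = \frac{(f_1 g_2 + f_2 g_1)^S}{(g_1 g_2)^S} = \frac{f_1^S g_2^S + f_2^S g_1^S}{g_1^S \cdot g_2^S} = \frac{f_1^S}{g_1^S} + \frac{f_2^S}{g_2^S} = \left(\frac{f_1}{g_1}\right)^{\sigma} + \left(\frac{f_2}{g_2}\right)^{\sigma}$$
and
$$\left(\frac{f_1}{g_1} \cdot \frac{f_2}{g_2}\right)^{\sigma} = \left(\frac{f_1 f_2}{g_1 g_2}\right)^{\sigma} = \frac{(f_1 f_2)^S}{(g_1 g_2)^S} = \frac{f_1^S \cdot f_2^S}{g_1^S \cdot g_2^S} = \frac{f_1^S}{g_1^S} \cdot \frac{f_2^S}{g_2^S} = \left(\frac{f_1}{g_1}\right)^{\sigma} \cdot \left(\frac{f_2}{g_2}\right)^{\sigma}.$$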
These equations and the fact that the transformation $\sigma$ is bijective show the relation $\sigma \in \operatorname{Aut} \Delta$. It remains to convince oneself that for each $a \in R$ it holds $a^\sigma = a$. In other words, we have to show that each element of the subfield $R \subset \Delta$ is fixed under all the transformations given by (113). More exactly, for any element
$$a = \frac{f(\alpha_1, \alpha_2, \dots, \alpha_n)}{g(\alpha_1, \alpha_2, \dots, \alpha_n)} = A(\alpha_1, \alpha_2, \dots, \alpha_n) \in R$$
and each substitution $S \in S_n$ we have to verify the relation $A^S(\alpha_1, \alpha_2, \dots, \alpha_n) \equiv A(\alpha_1, \alpha_2, \dots, \alpha_n)$. Thus we have to check that the rational function $A(\alpha_1, \alpha_2, \dots, \alpha_n)$ is symmetric. We show that this is in fact so. Let
$$a = A(\alpha_1, \alpha_2, \dots, \alpha_n) = \frac{f(\alpha_1, \alpha_2, \dots, \alpha_n)}{g(\alpha_1, \alpha_2, \dots, \alpha_n)}$$
be an element of $\Delta$ belonging to the subfield $R$. Then it can be presented in the form
$$a = \frac{\bar f(a_1, a_2, \dots, a_n)}{\bar g(a_1, a_2, \dots, a_n)},$$
where $\bar f$ and $\bar g$ are polynomials with coefficients in $P$. Using Viète's formulae $a_i = (-1)^i \sigma_i(\alpha_1, \alpha_2, \dots, \alpha_n)$, we find that the function $A(\alpha_1, \alpha_2, \dots, \alpha_n)$ is symmetric. So we have proved that $\sigma \in G(\Delta, R)$, which completes the proof. □

4. Let $K$ be a field and $G$ a finite group of automorphisms of this field. Thus $G \subset \operatorname{Aut} K$. We consider all those elements of $K$ which are "conservative" with respect to the group $G$, i.e. we consider the following subset of the field $K$:
$$K^G = \{a \in K \text{ such that } a^\sigma = a \text{ for all } \sigma \in G\}.$$
A check shows that $K^G$ is a field. The fundamental theorem of Galois theory can now be formulated in the following manner.

THEOREM 6.5. It is possible to establish a 1:1 correspondence $\tau : L_i \longleftrightarrow H_{\tau(i)}$ between, on the one hand, the fields $L$ contained in $K$ and containing the subfield $K^G$, and, on the other hand, the subgroups $H$ of $G$, such that it follows from $L_i \supseteq L_j$ that $H_{\tau(i)} \subseteq H_{\tau(j)}$. Thereby, the degree $[K : L_i]$ of the extension $K/L_i$ equals the number of elements in the subgroup $H_{\tau(i)}$. The correspondence in question is $\tau : H \leftrightarrow K^H$, where $K^H = \{a \in K \text{ such that } a^\sigma = a \text{ for each } \sigma \in H\}$.

In order to illustrate the above we give the following scheme:
$$\begin{array}{ccccc}
(1) & \subset & H & \subset & G\\
\updownarrow \tau & & \updownarrow \tau & & \updownarrow \tau\\
K & \supset & L = K^H & \supset & K^G
\end{array}$$
At the same time it is often not very easy to survey the structure of the extension $K/K^G$ by "direct" means.

Let us give two applications of this correspondence. First, we consider an algebraic equation $f(x) = 0$ without multiple solutions and with coefficients in a field $P$. The splitting field of this equation will be written $\Delta$. By Lemma 6.3 the Galois group of $f(x) = 0$ is isomorphic to the Galois group of the extension $\Delta/P$. Taking $K = \Delta$ and $G = G(\Delta, P)$ we obtain the relation $K^G = P$. We see that in order to study the properties of the extension $\Delta/P$ (and thereby also the equation $f(x) = 0$) one can use the Galois correspondence discussed above. The result of these considerations is

THEOREM 6.6 (Galois' criterion). For an equation $f(x) = 0$ to be solvable by radicals it is necessary and sufficient that its Galois group is solvable.

An immediate consequence of Galois' criterion and Lemma 6.4 is the following fundamental result.

THEOREM 6.7 (Abel-Ruffini). The general $n$-th order algebraic equation is not solvable by radicals for $n \ge 5$.

Indeed, by Lemma 6.4 the general $n$-th order algebraic equation has as Galois group the complete symmetric group $S_n$. But the groups $S_n$, $n \ge 5$, are not solvable, so the assertion of the theorem follows from Galois' criterion.

Second, let $K$ be a field of algebraic numbers, i.e. a finite extension of the field of rational numbers $\mathbb{Q}$, and let $[K : \mathbb{Q}] = n$. According to Kronecker's theorem there exists a complex number $\theta \in K$ such that each $k \in K$ can uniquely be expressed in the form
$$k = k_0 + k_1\theta + \dots + k_{n-1}\theta^{n-1}, \qquad k_i \in \mathbb{Q}.$$
Thus $K = \mathbb{Q}(\theta)$. As $1, \theta, \theta^2, \dots, \theta^n$ are linearly dependent over $\mathbb{Q}$ (because $\dim K = [K : \mathbb{Q}] = n$), there exists a nonzero polynomial $p(x)$ of degree at most $n$ with rational coefficients such that $p(\theta) = 0$. One can check that, up to a constant factor, there is no other polynomial of degree at most $n$ with the same property. Thus, dividing $p(x)$ by the coefficient of $x^n$ (normalizing $p(x)$), we obtain for $\theta$ the so-called minimal polynomial $\bar p(x)$, which is uniquely determined by $p(x)$. According to the "fundamental theorem of algebra" an algebraic equation of the $n$-th order has $n$ solutions; let the solutions of $\bar p(x) = 0$ besides $\theta_0 = \theta$ further be $\theta_1, \dots, \theta_{n-1}$. The maps $\kappa_i : K \to K$ given by the equations
$$k^{\kappa_i} = k_0 + k_1\theta_i + \dots + k_{n-1}\theta_i^{n-1}, \qquad i = 0, 1, \dots, n-1,$$
turn out to be automorphisms, provided that all the $\theta_i$ belong to $K$. The set $\{\kappa_0 = \varepsilon, \kappa_1, \dots, \kappa_{n-1}\}$ is then a group which we can view as the group $G$. As $K^G = \mathbb{Q}$, one can obtain from the structure of $G$ valuable information about the structure of the field of algebraic numbers $K/\mathbb{Q}$.
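As a minimal illustration of the maps κ_i, take the simplest case K = Q(√2), so that n = 2, θ = √2 and θ_1 = −√2. This choice of field, and the names QSqrt2 and kappa, are our own assumptions for the sake of the example and do not come from the text. The Python sketch below checks, on a handful of sample elements only, that κ_1 respects addition and multiplication and that the samples it leaves fixed are rational.

```python
from fractions import Fraction
from itertools import product


class QSqrt2:
    """Element k0 + k1*theta of K = Q(theta), theta = sqrt(2), with k0, k1 rational."""

    def __init__(self, k0, k1=0):
        self.k0 = Fraction(k0)
        self.k1 = Fraction(k1)

    def __add__(self, other):
        return QSqrt2(self.k0 + other.k0, self.k1 + other.k1)

    def __mul__(self, other):
        # (a + b*theta)(c + d*theta) = (ac + 2bd) + (ad + bc)*theta, since theta^2 = 2
        return QSqrt2(self.k0 * other.k0 + 2 * self.k1 * other.k1,
                      self.k0 * other.k1 + self.k1 * other.k0)

    def __eq__(self, other):
        return (self.k0, self.k1) == (other.k0, other.k1)


def kappa(k):
    """kappa_1: replace the root theta = sqrt(2) by the other root theta_1 = -sqrt(2)."""
    return QSqrt2(k.k0, -k.k1)


# A small, arbitrary set of sample elements (not a proof, only a spot check).
coefficients = [Fraction(-3, 2), Fraction(0), Fraction(1), Fraction(5, 7)]
samples = [QSqrt2(a, b) for a, b in product(coefficients, repeat=2)]

# kappa_1 is compatible with both field operations on all sample pairs.
is_automorphism = all(
    kappa(x + y) == kappa(x) + kappa(y) and kappa(x * y) == kappa(x) * kappa(y)
    for x in samples for y in samples
)

# The samples fixed by kappa_1 are exactly those with k1 = 0, i.e. rational numbers.
fixed = [k for k in samples if kappa(k) == k]
fixed_are_rational = all(k.k1 == 0 for k in fixed)
```

Here G = {ε, κ_1} has two elements, in agreement with [K : Q] = 2, and the fixed elements found among the samples illustrate the relation K^G = Q.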
5. Today the central problem in field theory is the classification and description of all (algebraic) extensions. The so-called Galois inversion problem belongs here: given a ground field, the question is to find all extensions which have a given group as Galois group. For example, in the case of finite fields it has been known since the times of Galois that their algebraic extensions are cyclic, i.e. they possess a cyclic Galois group, and that there exists precisely one extension of a given degree. In the general case the Galois inversion problem is still far from completely solved. The Galois inversion problem in its classical form was already known to Niels Henrik Abel. This question can be stated in several distinct forms.
A. Given a group, find an algebraic equation having the given group as Galois group.
B. Find a method for determining all equations with a given Galois group.
C. Given a group, find the general form of the coefficients of those algebraic equations having the given group as Galois group.
Basic among these questions is C, because from its solution one can derive also the solutions of A and B. Is problem C always solvable? Emmy Noether showed that the answer is affirmative if the following (Lüroth's) conjecture is true. Let us consider the field of rational functions $P(x_1, \dots, x_n)$ over a ground field $P$. It is easy to find an "elementary series" of subfields in this field. To this end take $m$ ($\le n$) algebraically independent⁷⁸ elements $y_1, \dots, y_m$ in the field $P(x_1, \dots, x_n)$ and consider the subfield $P(y_1, \dots, y_m) \subset P(x_1, \dots, x_n)$. Here $P(y_1, \dots, y_m)$ denotes the smallest extension of $P$ containing $y_1, \dots, y_m$. One can show that the field $P(y_1, \dots, y_m)$ is isomorphic to $P(x_1, \dots, x_m)$. Lüroth's question [9] is whether the series of subfields $\{P(x_1, \dots, x_m),\ m \le n\}$ obtained in this way exhausts all subfields of $P(x_1, \dots, x_n)$ (up to isomorphism).
Using notions and facts about algebraic curves, Lüroth proved this assertion for $n = 1$. A simplified, purely algebraic proof was given by E. Netto already in 1895. G. Castelnuovo proved Lüroth's conjecture for $n = 2$ in 1894, using deep results about algebraic surfaces. In 1908 G. Fano thought that he had found a counter-example to Lüroth's conjecture for $n = 3$, but later essential gaps and shortcomings were found in his argument. In the following decades many attempts were made to prove Lüroth's conjecture in general, but it turned out that the problem was exceedingly difficult. Adding a series of original ideas and new technical devices, Yu. Manin and V. Iskovskih succeeded comparatively recently (in 1971) in saving Fano's main ideas [6]. This allowed them to conclude that the answer to Lüroth's question is in general negative. Using analytic methods, Ph. Griffiths and Ch. Clemens reached about the same result almost simultaneously [1].

Let us also dwell on a special case of the Galois inversion problem. Let us have a look at the Abelian extensions of the field $\mathbb{Q}$, that is, extensions of $\mathbb{Q}$ the Galois group of which is Abelian. More exactly, an extension $L/\mathbb{Q}$ is called Abelian if the group $G(L, \mathbb{Q})$ is Abelian. The Kronecker-Weber theorem states that each Abelian extension of $\mathbb{Q}$ is contained in a suitable field of the type $\mathbb{Q}(\sqrt[n]{1})$; such fields are also called cyclotomic fields. This assertion contains useful information about the structure of Abelian extensions of $\mathbb{Q}$. The attempts to generalize the Kronecker-Weber theorem have led to a series of different proofs. Hilbert raised the question of studying Abelian extensions of the fields $\mathbb{Q}(\sqrt{-d})$, where $d$ is a square-free integer, and, likewise, of studying Abelian extensions of arbitrary algebraic number fields (the 12-th problem of Hilbert). Here we can see an attempt to find, for a given algebraic number field $K$, an "elementary series" of Abelian extensions containing all other Abelian extensions of $K$. In 1923 Helmut Hasse solved the first part of Hilbert's problem. A generalization of the Kronecker-Weber theorem to arbitrary algebraic number fields was given, in 1961, by G. Shimura and Y. Taniyama.

Progress has also been made in the study of non-Abelian extensions. So far the main result is the following: for each solvable group $G$ and an arbitrary algebraic number field $K$ there exists an algebraic extension $L/K$ such that the Galois group $G(L, K)$ is isomorphic to $G$. A method has been found for determining all such extensions. However, in the general case, the solution of the Galois inversion problem is still far from complete. Often it is not even clear how one should pose the questions in a correct way and in which terms to seek their solution.

⁷⁸ That is, there exists no polynomial $f \ne 0$ in $m$ variables with coefficients in $P$ such that $(y_1, \dots, y_m)$ is a solution of the corresponding equation $f = 0$.
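The solvability condition that appears in Galois' criterion (Theorem 6.6), and again in the result on solvable groups just mentioned, can be checked mechanically for small groups. The following Python sketch assumes the standard characterization that a finite group is solvable exactly when its derived series (the iterated commutator subgroups) ends in the trivial group. The function names are our own, and the brute-force enumeration is only meant for very small n.

```python
from itertools import permutations


def compose(p, q):
    """Permutation 'first q, then p'; a permutation is a tuple of images of 0..n-1."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, n):
    """Closure of the generators under composition; in a finite group this is a subgroup."""
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def derived_subgroup(group, n):
    """Subgroup generated by all commutators a b a^-1 b^-1 with a, b in the group."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group for b in group
    }
    return generated_subgroup(commutators, n)


def derived_series_orders(n):
    """Orders of the terms of the derived series of S_n, until the series stabilizes."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        next_group = derived_subgroup(group, n)
        if len(next_group) == len(group):
            return orders
        group = next_group
        orders.append(len(group))


def symmetric_group_is_solvable(n):
    """S_n is solvable exactly when its derived series reaches the trivial group."""
    return derived_series_orders(n)[-1] == 1
```

For S_4 the orders of the derived series are 24, 12, 4, 1, so the group is solvable; for S_5 the series stabilizes at the alternating group of order 60, in agreement with the argument given for Theorem 6.7.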
6.2. The duality principle in mathematics

1. The notion of morphism (isomorphism, homomorphism etc.) has become one of the basic concepts of mathematics. This was caused by the development of the notions of similarity and equivalence, both tightly related to the notion of morphism. An exact definition of the notion of similarity was first given by G. W. Leibniz. Two objects are called similar if they cannot be distinguished from each other, each considered by itself, while each possible property belonging to one of the objects also belongs to the other. The best illustration of the importance of the notion of morphism is the isomorphism, discovered by R. Descartes, between usual plane geometry and the Euclidean plane (viewed as the set of pairs of numbers $(x, y)$), obtained by the introduction of coordinates in the former. The fertility of this idea can be seen from the ease with which one nowadays can answer the question of the trisection of the angle, enormously difficult to the ancients.⁷⁹ Scientists began to realize the importance of the notion of morphism in mathematics only in the 19th century, in connection with a new step in the development of geometry. At the time, mathematicians were already used to passing from one theory to another just by changing the terminology. The set of concrete "models" in general mathematical theory began to grow. This was brought out in full relief in the development of projective geometry. According to the habits of the time, "dual" theorems were presented side by side in two parallel columns. Let us recall that "duality" on the projective level consists of the fact that in the theorems of this geometry it is possible to interchange the notions "point" and "line". Let us also note that it was precisely the attempt to establish the "existence" of the geometries of Lobachevskiĭ and Riemann by finding geometrical models for them that secured these new geometries their right to exist.
Duality appears here as a correspondence between, on the one hand, the assertions of the "abstract" theory and, on the other hand, the properties of the more "concrete" mathematical objects. The duality in projective geometry considered above is just one example of the numerous duality theorems⁸⁰ in mathematics, which all rely on the common principle of finding an isomorphism between different (mathematical) categories. The duality found then allows one to carry properties of an object automatically over to the dual ones, which had sometimes been investigated directly for centuries in order to find these properties. The most evident example here would be Galois theory. S. Lie noticed that for every differential equation there is a group of (continuous) transformations of variables that do not change the equation. The knowledge of the structure of this group permits one to draw several conclusions about the solutions of the equation. Such a point of view is of special importance in differential geometry. The first to understand this clearly was Felix Klein. The main idea of his "Erlanger Programm" (1872) was to classify the properties of geometrical objects according to the mappings with respect to which they were isomorphic. The realization of this idea was a major step forward in the development of the idea of morphism. However, such a development also had a negative side. With indignation, M. Chasles says:

Now everybody may take a known fact and just by applying various general transformation principles to it arrive at new truths, which differ from the original one and generalize it. These in turn may be treated in the same way, and so one can multiply indefinitely the number of new truths, obtained from one and the same original.

Despite its universality, Klein's program did exclude the directions of development in geometry which derive from B. Riemann's lecture, but these too have deep connections with group theory. These directions were developed in the 20th century and led to the study of Riemann manifolds with the aid of their holonomy groups.⁸¹

⁷⁹ In greater detail about this in Yu. I. Manin's article [10].
⁸⁰ The duality in vector spaces, the duality between open and closed sets in topology, Pontryagin duality for Abelian topological groups, the Poincaré duality between homology and cohomology in algebraic topology.
⁸¹ Cf. [8]. Editors' note. For the notion of holonomy in general, in the context of fibre bundles, we may refer to the article by G. I. Laptev in Vol. IV, page 443 of the Encyclopaedia of Mathematics, mentioned in Section 1 footnote 9. In the case of Riemannian manifolds the concept is briefly referred to in the preamble to Chap. IV of Helgason's book [5]. In this case it amounts to the following: Let $M$ be a Riemann manifold. If $o$ is a point of $M$ and $v$ a tangent vector at $o$, then going around a loop $L$ issuing from $o$ the vector $v$ is under parallel transport replaced by $\tau_L(v)$, where $\tau_L$ is a certain linear operator. All such operators form a Lie group called the holonomy group of $M$ at $o$.

2. Klein's idea found a fruitful application in physics, which is based on symmetry considerations. Symmetry expresses a certain order, proportionality and coherence between the parts of the whole. Already Pierre Curie pointed to the need for using symmetry in physics [2, p. 393]:

I believe that in the study of physical processes it would be of interest to bring in considerations of symmetry, which are used with such great success in crystallography.

Physicists often use results which derive from symmetry, but usually they do not make the notion of symmetry precise, because very often it appears to be given a priori, almost obviously. The homogeneity and isotropy properties of space have been known since ancient times: they express the symmetry of space with respect to the group of motions of the space. The latter consists of the distance preserving selfmaps of Euclidean space, the algebraic operation in it being composition. The discrete subgroups of this continuous group describe the symmetries of various crystals. One of the problems of theoretical physics has also been the uncovering of a transformation group with sufficiently many (continuous) invariants, allowing one to interpret the data found by measurement and experiments in terms of conservation laws. The invariance of the laws of physics with respect to the group of Lorentz transformations is reflected by the principle of relativity, first formulated by H. Poincaré.⁸² Especially rich in symmetries is the micro-cosmos.⁸³ The mathematical apparatus created by Sophus Lie has had a special importance in the discovery of new properties of elementary particles in physics. This is explained by the following circumstance. If some physical theory expresses its experimental results with the help of differential equations, then it may be that the physical content harbored in these equations is much wider and covers a larger range of experiments than that from which the equations were derived. In connection with the discovery of electromagnetic radiation in Maxwell's equations, Heinrich Hertz says strikingly:

One cannot avoid the feeling that the mathematical equations have an existence independent of us, their own consciousness, that they are more clever than we, because we can obtain from them more than we initially invested in them.

A good illustration of what was said is also the discovery of antimatter. It was observed that if a particle is described by the Dirac equation then it can be in two different states of charge. That the electron was described by the Dirac equation was known. Therefore the existence of the positron was predicted, which later also found experimental verification. From here it was inferred that every particle has an antiparticle. In elementary particle theory two kinds of symmetry are present. Some of them are connected with a subgroup of the group of space-time transformations (that is, the Lorentz group). Others, the so-called inner symmetries, appear in the study of special unitary groups and reflect the "inner" properties of particles.⁸⁴ But nature is not symmetry throughout! The irreversibility of thermodynamical processes, the violation of the laws of parity, time reversal, and charge conjugation for particles in the case of weak interaction: all this speaks of asymmetry.
Also the organic world abounds in asymmetries. Here we may speak of an interchange of strata of symmetry and asymmetry, of their levels. Symmetry manifests itself now in the organization of asymmetrical elements and in doing this reflects their striving for development.

⁸² See the articles "The principle of relativity" and "Professor H. A. Lorentz as a researcher" in the book [4].
⁸³ An exhaustive presentation of the questions to be treated below can be found in [12]. Translator's note. See also e.g. [3].
⁸⁴ Translator's note. One example is SU(2)-symmetry, which explains the similarity in the behavior of the proton and the neutron; recall that SU(2) is the group of unitary 2 × 2 matrices with determinant 1.

3. The world surrounding us is characterized by its structure, the granularity of this structure and the relative independence of the structures. This gives one the possibility to single out individual structures in order to get to know them more distinctly. The similarity of differing structures is in mathematics reflected by the notion of morphism, that is, similarity on the level of a certain theory. Such an approach makes axiomatic methods expedient. The use of this method in mathematics was initiated in the work of M. Pasch, D. Hilbert and E. Steinitz. The wider acceptance of this method did not come without pain. For example, Felix Klein was from the outset rather sceptical towards this "axiomatic mathematics", as he saw here an assault on intuition and imagination, that is, on the truly productive elements in the process of creation. Undoubtedly the most outstanding achievement of this direction is the appearance of the theory of algebraic structures in the 1920's (on the basis of the set theory of Cantor). A further step on this road was the unfolding of the notion of mathematical structure and, closely related to it, the appearance of the complete notion of morphism, which gradually became a tool for reshaping all of mathematics. It is without doubt true that mathematics starts with number. In all ages number has been the soul of mathematics, because its level of development is tightly connected with the possibility of applying mathematics in other sciences. But the development and spread of "quantitative" methods has always called for a completion and development of the "qualitative" methods, because the latter enrich the art of calculation with new forms and promote the organizing part in the development of metrical (numerical) mathematics.⁸⁵

4. The interaction of these two trends in mathematics, the applied and the theoretical, has been a constant source of stimulation in the work of many mathematicians. For instance, Felix Klein assesses the work of C. F. Gauss in applied mathematics as follows:

Gauss obtained the stimulation for this work outside mathematics. But then, in the posing of the problems and in their solution, there appears a special creational power and experience, which he could develop in himself only by solving problems of "pure" mathematics. It manifests itself also in the principle not to count as done such problems where there still remains something to be done.

The interaction of these tendencies was brought into even greater relief in the work of J.-L. Lagrange. Lagrange's mathematical style is characterized by an unusual consistency, a desire to solve this or that problem to the very end. However, his most famous book is his Mécanique analytique (Analytical Mechanics), which appeared in 1788.
Here the various principles for the solution of mechanical problems found up to that time are treated from a general point of view; the relations between these principles and their dependence on each other are shown, as well as the limits of their applicability. In this treatise Mechanics has become part of Mathematical Analysis, as one does not stop at the narrow special cases of the problems of mechanics, but brings to the foreground the steps that are necessary in the solution of the problems under consideration. In the course of the following centuries this has been the point of departure, the foundation and the source of many theories in the various branches of applied mechanics. These methods have been applicable in the design of ships' propellers, as well as in the study of the oscillation of ships, in the creation of gyrocompasses, in the computation of the trajectories of shells, in the design of railway bridges, and also in the investigation of the motion of celestial bodies.⁸⁶

Mathematics has been compared with a big city, in the outskirts of which a lively activity of construction takes place, where new districts and new blocks rise, where the air is cleaner and where the young crowd in, bringing new force and stimulus to the city. The birth of these new quarters is an inevitable necessity in the development of a big city, an exigency of its life. At the same time there is going on a no less intensive and extensive building activity downtown. Streets are reconstructed and widened, new up-to-date houses rise, in order to adapt the life of the big city to the new needs and requirements of life. We have a truly expedient and beautiful city only when these two tendencies in its development are in good harmony.

⁸⁵ Everybody knows the role played today by functional analysis in the development of numerical methods (and thereby also in widening the possibilities for using the computer).
⁸⁶ One can read more details about this in the paper [7]. A more contemporary illustration of what we have said is the work of J. von Neumann. The Reader can acquaint him- or herself with his views about the development of mathematics, and the balance of empirical and aesthetic deliberations in it, in the paper "The Mathematician" contained in the book [11, p. 1–9].
Comments. The basic tenet of Galois theory, namely the correspondence between field extensions and groups, can be succinctly stated and easily proved, and is being offered as fare to undergraduates all around the world. In its basics it is a finished theory and nothing can be added to or subtracted from it. It involved a conceptual leap from the notion of a root of a polynomial to the notions of groups and fields, nowadays forming cornerstones of modern mathematics, which would be inconceivable without them. But Galois theory is not a dead subject, as Kaljulaid points out. Once you start to apply it to specific situations, its subtlety becomes apparent. One fundamental example is the classification of Abelian extensions (Abelian Galois groups) of the rational numbers as subfields of cyclotomic extensions (adding roots of $x^n = 1$). A surprisingly intractable problem is the inverse Galois problem over the rationals, namely to characterize the finite groups that can occur as Galois groups of number fields (i.e. finite extensions of the rationals). This may seem to be a somewhat artificial problem, although Kaljulaid is obviously fascinated with it and its ramifications; but natural applications of Galois theory abound whenever fields pop up, and thus it is an inevitable tool of the algebraic geometer or number theorist.

Fields and groups are very different things, although in standard introductory courses both tend to be treated on equal footing as examples of algebraic structures. Groups are more fundamental, and they worked on human imagination long before being recognized and identified as independent entities. The crucial concept is symmetry, the instinctive feeling that two different things are really the same, and that there is no way of intrinsically distinguishing between them. One example, the embryo of Galois theory, is to ask which one is 'i' (the square root of −1): 'i' or '−i'? Is the question meaningful? The outcome of the question is complex conjugation, constantly used even by people innocent of Galois theory.

Kaljulaid speaks somewhat confusingly about duality and isomorphisms. The two things are really different. The classical example of duality is the correspondence between lines and points in the projective plane (a phenomenon so attractive that it literally forced the invention of the projective plane itself). Here there is some kind of symmetry. Isomorphism is something different and more general. In duality there is no natural isomorphism; to achieve one you need to specify and make a choice, and thus destroy the simplicity of the situation. The concept of isomorphism is far further reaching, and the concept of auto-isomorphism (automorphism) makes exact the vaguer notion of symmetry and hence allows the introduction of composition and of groups. As Kaljulaid rightly points out, it is Felix Klein's vision of groups of symmetries being at the basis of geometrical classification (the oft referred to Erlangen program) that elevated the notion. It is of course a great unifying principle, seductive in its elegance, but of course not telling the whole story. Fundamental physics should really be thought of as enhanced geometry, as Einstein's general relativity so eloquently bears witness to, and it is in theoretical physics that Klein's Erlangen program really has struck its deepest roots. Nowadays groups of symmetries play fundamental roles in describing different physical phenomena, and from a philosophical point of view it is tempting to see in groups the deeper reality that Plato postulated, and whose various manifestations make up the material world. (One amusing example may be the aptly called Platonic solids as mere aspects of their underlying symmetry groups.) This mathematical view of the world is really nothing but the modern sophisticated version of ancient number mysticism.

Groups and mathematical formulas seem to rule the physical world, and some physicists, notably Dirac, set greater store in an elegant mathematical formula than in an ugly one empirically borne out, only to be eventually vindicated! Why is there a world? Mathematical formulas of the vacuum predict its spontaneous emergence. Thus in a sense, mathematics is God, existing even before nothing. In the most ambitious effort so far to fundamentally understand physics, optimistically referred to as TOE (Theory of Everything), empirical testing is no longer feasible, and the only source of corroboration and inspiration is mathematical beauty. Indeed Kaljulaid takes the ideas of Galois theory to the most exalted end.

Ulf Persson
References
[1] Ch. Clemens and Ph. Griffiths. The intermediate Jacobian of the cubic threefold. Ann. of Math. 95 (2), 1972, 281–356.
[2] P. Curie. Sur la symétrie dans les phénomènes physiques. J. Phys. 3 (3), 1894.
[3] J. P. Elliott and P. G. Dawber. Symmetry in Physics I–II. Vol. 1. MacMillan Press Ltd., Vol. 2. Clarendon Press, Oxford Univ. Press, London, New York, 1979. Russian Translation: "Mir", Moscow, 1983.
[4] P. Ehrenfest. Relativity. Quanta. Statistics. Nauka, Moscow, 1972.
[5] S. Helgason. Differential geometry and symmetric spaces. Academic Press, New York, London, 1962.
[6] V. A. Iskovskih and Yu. I. Manin. Three-dimensional quartics and counterexamples to the Lüroth problem. Mat. Sbornik (N. S.) 86 (128), 1971, 140–166.
[7] A. N. Krylov. Joseph Louis Lagrange. Uspehi Mat. Nauk 2, 1936, 3–16.
[8] Ü. Lumiste. The notion of space in geometry. Geometry and transformation groups. Math. and Our Age 14, 1968, 3–21.
[9] P. Lüroth. Beweis eines Satzes über rationale Curven. Math. Ann. 9, 1876, 163–165.
[10] Yu. I. Manin. On the solvability of the problem of construction with ruler and compass. In: Encyklopedia of Elementary Mathematics, Vol. 4. FizMatGIZ, Moscow, 1963, 205–227.
[11] J. von Neumann. Collected works. Vol 1. Pergamon Press, New York, 1961.
[12] H. Õiglane. Chapters from Theoretical Physics, I–II. Tartu University Press, Tartu, 1965, 1967.
7. [K75b] Theory of automata. Coauthor E. Tamme
To the memory of Rein Tammeste⁸⁷

I was asked to speak on "possible future developments, give conjectures, and speculate about future advances". To do so is always hazardous, if not foolhardy; it may be possible that right in this very moment a graduate student is busily at work on a theorem that might change present trends drastically.
I. M. Singer, Future extensions of index theory and elliptic operators, Ann. of Math. Studies, 70 (1971), 171–185
7.1. Some points of view in analytic Cybernetics

1. Since remote times man has tried to bend the forces of Nature to his will. The study of the secrets of Nature has led to the discovery of the laws of Nature. Every such step forward has been followed by a shift in the development of Engineering: new machines are designed which, using the conquered forces of Nature, have enlarged the physical powers of man. At the same time man has sought paths and means for making the action of the human brain more powerful. In the 1940's this question became especially urgent due to the necessity of carrying out extraordinarily complicated and bulky computations. Electronic computers were invented.⁸⁸ The use of computers becomes necessary in ever more numerous new domains of practical and intellectual activity, and existing computers cannot always satisfy these needs, either in quantity or in quality.⁸⁹ A return from computer to abacus is as unthinkable as one from electricity to candle light. As technology improves, the intervention of man in controlling the work of machines (automated factories; automatic control and test devices; autopilots; an improved military technology etc.) is often abandoned. A need to create more perfect and more powerful computers and automation devices enters the agenda. To succeed in this it is, undoubtedly, essential to know whether it is possible to reach the goal set by improving or enlarging the size of the existing automata. Otherwise, one must hope that it will be possible to understand more deeply the principles of the functioning of the brain and to develop formal models of it that could be technically implemented (see [1]). The mathematician is here mainly interested in the following question: will it be possible to give, on the basis of existing mathematics, an adequate description of the laws which govern the world of complicated automata?

⁸⁷ Rein Tammeste was a gifted young Estonian mathematician. Born on the island of Hiiumaa (Dagö) on January 19, 1939, he graduated from Tartu University in 1960. He was the first one in Estonia to investigate notions connected with random variables in complex Hilbert space. These results were set forth in the book "Probabilities in Hilbert spaces" (in Estonian), and in a thesis, written in Russian, defended in Tartu in 1971. He also published research on the axiomatic generalization of information and entropy. Rein Tammeste died at the age of 34, on August 13, 1973, while descending Mount Elbrus. An obituary of him was published in Math. and Our Age 20 (1975), 132–135, and his thesis is discussed in Math. and Our Age 19 (1973), 122 (both in Estonian). E. Tamme
⁸⁸ This work began in 1943, when, under the direction of John W. Mauchly, the first computer project ENIAC was started in practice. The computer was completed in 1946. Editors' Note. In the 1940s, several computer projects were carried on simultaneously in different countries. For example, the computer Colossus (UK) was working already in 1944 and Z3 (Germany, Konrad Zuse) in 1943.
⁸⁹ Interesting material about the balance between reality and illusion can be found in the book [2].

2. In dealing with distinct (control) systems their structural similarity often becomes apparent, that is, the analogy between many elements and their mutual relations, but also a functional similarity, that is, a similar behavior of the systems in analogous situations. This makes it possible to create a general (axiomatic) theory for classes of control systems, in which the concrete systems are viewed as representatives of a class. In the axiomatization of the behavior of elements one has to assume that the elements can be regarded as "black boxes", the inner structure of which need not be known, but which react in a certain determined way to exactly determined exterior excitation. However, the axiomatics of such a theory requires, from time to time, improvements, especially after changes in our knowledge of the physico-chemical nature of the elements and their properties. The principal task is then to find suitable notions and methods for the study of the structure and the functional properties of the systems of a given class. A step on this path is the work of W. McCulloch and W. Pitts regarding formal neural networks [6]. A central role in this theory is played by the notion of a formal neuron. This is an (abstract) element, a "black box" with $m$ inputs $x_1, \dots, x_m$ (where $m \ge 1$), and a single output $d$.
A neuron has m + 1 numerical characteristics: the level θ, and the weights ωi of the inputs xi . Here ωi > 0 means that the input xi is stimulating, while ωi < 0 that xi is inhibitory. Such a formal neuron works on a discrete scale of time t = 0, 1, 2, . . . , and gives at time t = n + 1 an impulse to the output d precisely when, at t = n, the sum of the weights of stimulating inputs exceeds the level of the neuron. A formal neural network is a union of elements obtained in the following way. The output of the neuron is divided into a suitable number of branches which are connected to the inputs of some other neurons. Here the outputs of a neuron can be connected with an arbitrary number of inputs in itself or to other neurons, but each input may only be connected with a single output. Some inputs of neurons may remain free and these are either connected to each other (to be considered as identical) or else they are grouped into input lines of the net (each free input is connected to an exactly one input line; the number of the latter may, however, be smaller than the number of free inputs). The output lines are identified as outputs, which are not joined to inputs of any other neuron. All neurons are working at the same time, and their level and weights do not change in the course of time. The study the functioning of such a formal network amounts to clarifying with which signals on the output lines the net will react to various signals on its input lines. Although the formal neural network is a rather primitive analogue of the brain, their study was still a first real step to use of mathematical devices in neuro-physiology. The study of such networks stimulated to a large extent the genesis of automata theory. Indeed, in 1946, John von Neumann set forth new ideas of the construction of electronic computers (the EDVAC project). Von Neumann had as basis of his construction the module, a notion, in the creation of which an essential role was played by the functional
similarity between the elementary block of a computer and the formal neuron. Introducing the notion of module into the construction of computers made it possible to separate the logical synthesis of the computer (a problem in the area of mathematical logic) from the technical synthesis of the corresponding electrical network (engineering).⁹⁰ Thereby the second step was taken towards the creation of an automata theory, whose main object became the mathematical realization of the structural and functional analogies between the brain and the computers of the future, and whose results were supposed to assist, via feedback, the creation of new principles for the construction of computers. If we take these requirements as a basis, then we must admit that today we are still very far from a theory of automata that really deserves the name.

Usually it is said that cybernetics became an independent discipline in 1948, when Norbert Wiener’s book “Cybernetics” appeared.⁹¹ Side by side with Wiener we must also mention the contribution of J. von Neumann. Although both scientists knew each other’s work well and influenced each other, their approaches to the topic were quite different. Von Neumann called his variant “automata theory”, while Wiener spoke of “cybernetics”. The latter is well known through the translation of the corresponding works into Russian. The same cannot, however, be said about automata theory.

3. The more complicated the construction of a computer is, the more complicated become its structure and the mathematical description of the coding and the motion of information in it, while at the same time the logical depth of the computations in it decreases and the working speed grows. The absence of a suitable mathematical theory for the description of complex automata is undoubtedly a serious obstacle to the development of powerful automata mimicking the manifold functions of the human brain.
Already the research of W. McCulloch and W. Pitts showed that the application of the methods of formal logic can give essential results in the modelling of the brain. Because of the inner relation between automata and logic, a central place in the description of automata ought to be taken by a certain system of logic. It is doubtful that one could make do here with the traditional treatment of logic (see also [7]), because, for example, advanced automata must be capable of performing operations that consist in drawing analogies and making generalizations. There is no reason to believe that the known concepts and symbolism of logic would suffice for the mathematical treatment of these questions. One would rather find a way out by bringing in the structure theories of categories and algebra, since the idea of similarity (and the notion of morphism, which mirrors it) is one of their organic components. In other words, from the point of view of automata theory it seems important to include the duality principle⁹² in logic as an organic component.

In the 1930s the discoveries of Kurt Gödel led to points of contact between logic and arithmetic. Recently an algebraic approach in logic has begun to find a response; it complements the hitherto ruling arithmetical point of view. As an example of such an algebraic approach we mention the algebraic treatment of the theory of recursive functions in the works of A. I. Mal’cev and S. Eilenberg. It may be that in the course of time there will be a synthesis of these two points of view on a new level, on the basis of an “arithmetized algebra” that generalizes the arithmetical point of view.

⁹⁰ Already in 1910, the well-known theoretical physicist P. Ehrenfest drew attention to the possibility of acting in such a way. As at the time the practical needs were restricted to assembling rather primitive electrical networks, where the use of Boolean algebra seemed ridiculous, this premature idea passed unnoticed.
⁹¹ Translator’s Note. The word “Cybernetics” comes from the Greek κυβερνήτης, meaning helmsman.
⁹² See also the article “On Galois theory”, Section 6 of Chapter VI.

It is also of importance to note that formal logic, because of its approach (the principle of “all or nothing at all”), has so far been cut off from the possibility of using the most advanced part of mathematics, mathematical analysis, and uses instead combinatorics, an area where great mathematical difficulties appear. At the same time, in Analytic Number Theory and in Diophantine Geometry one has long since found ways of fruitfully applying the idea of continuity to the solution of problems which are discrete by nature. One can claim even more: the deepest results in the disciplines mentioned have been obtained precisely in this way (often by the intermediary of algebra and probability theory).

In the mathematics of antiquity an essential achievement was the polarity “finite-infinite”, which now (based on the set theory of Georg Cantor), with the appearance of Mathematical Analysis, has become a powerful instrument of cognition. But a deeper and more complete use of the polarity “continuous-discrete” still lies ahead. That this has not yet been done to its full extent is perhaps one of the reasons why physicists in their forward-looking research still find too little satisfaction in the mathematics they can use. This idea, due to Hermann Weyl, seems worth developing with respect to many parts of applied mathematics. The needs of the applications and the difficulties of the theories have created a situation where one appreciates ever more the value of the ideas which have arisen on the path to the goal in which C. G. J.
Jacobi passionately believed: there will come a time when from each theorem in Mathematical Analysis there will follow a theorem in Number Theory and, vice versa, each regularity in the domain of natural numbers will give a theorem in analysis. A powerful basis for the emergence and development of such ideas was laid in the work of Leonhard Euler. A series of original considerations was given by Yu. Manin in his talk “The physical and mathematical continuum” at the summer school in the history of mathematics in Tartu in 1973. These observations agree with the view of J. von Neumann, according to which the mathematical apparatus for the study of complicated automata ought to start with mathematical logic and proceed in the direction of algebraic, probabilistic and analytic structures, and further towards optics and thermodynamics (in the form given by L. Boltzmann, the latter is in many respects close to the theory of information processing and measurement). The same point of view was also echoed in a talk by V. Glushkov at a meeting devoted to automata theory in Tashkent in May 1968. Namely, according to Glushkov the main attention of mathematicians until the end of the 20th century will be directed towards the creation of the algebra and topology of a (formal) language, which is necessary for the mathematical description of complicated automata (see also [3]).
7.2. On algebraic methods in automata theory

Several points of contact between automata theory and the structural theories of algebra have been known for a long time. Namely, it turns out that each automaton can be interpreted as a certain algebraic object, which allows one to study the construction of the automaton by means of structure theory. Proceeding in this way it becomes possible to get an
overview of all possible finite automata. We point out an essential analogy with Galois theory, where an algebraic equation is connected to a group, in terms of whose theory one can express the solvability of the equation by radicals. Although the algebraic apparatus used is rather modest, the detailed realization of the corresponding idea turns out to be quite complex. Here the work of Krohn and Rhodes [5] on the algebraic theory of machines, which appeared in 1965, proved to be a turning point. In the following we set out the main features of this theory.

1. Let us give an exact mathematical definition of a finite automaton.

DEFINITION 7.1. By a finite automaton or a machine we mean a system $M = (A, Q, B, \lambda, \delta)$ consisting of three finite sets $A$, $Q$ and $B$, together with fixed functions $\lambda : Q \times A \to Q$ and $\delta : Q \times A \to B$. It is assumed that the set $Q$ contains an element $p$ such that $\lambda(p, x) = p$ for each $x \in A$.

In order to have an intuitive explanation, we give the following interpretation of the symbols appearing in the definition:
- $A$ is the set of input signals or the input alphabet,
- $B$ is the set of output signals,
- $Q$ is the set of states, whose element $p$ may be viewed as a halt,
- $\lambda : Q \times A \to Q$ is the function determining the transitions between states,
- $\delta : Q \times A \to B$ is the function producing the output signals.

In order to present the functions $\lambda$ and $\delta$ one often uses tables. Then the rows of the matrices $(\lambda(q, x))$ and $(\delta(q, x))$ are indexed by the elements of $Q$, and the columns by the elements of $A$.

EXAMPLE 7.1. We present the automaton $M = (A, Q, B, \lambda, \delta)$ with the help of the following data. Let $A = \{a, b\}$, $Q = \{q_0, q_1, q_2, p\}$, $B = \{0, 1, 2\}$, and let the functions $\lambda$ and $\delta$ be defined as in Table 1.

\[
\begin{array}{c|ccc}
\lambda & a & b & \Lambda \\ \hline
q_0 & q_1 & p & p \\
q_1 & q_1 & q_2 & p \\
q_2 & p & q_2 & q_0 \\
p & p & p & p
\end{array}
\qquad
\begin{array}{c|ccc}
\delta & a & b & \Lambda \\ \hline
q_0 & 1 & 0 & 0 \\
q_1 & 0 & 1 & 0 \\
q_2 & 0 & 0 & 1 \\
p & 2 & 2 & 2
\end{array}
\]

Table 1
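As a minimal sketch, Table 1 can be written down as two Python dictionaries and stepped through one input signal at a time. The names `ROWS`, `LAMBDA`, `DELTA`, `step` and `is_halt` are illustrative assumptions, not notation from the text, and the character `"L"` stands in for the symbol Λ.

```python
# Table 1 of Example 7.1, one row per state:
# (values of λ on a, b, Λ), (values of δ on a, b, Λ).
ROWS = {
    "q0": (("q1", "p", "p"), (1, 0, 0)),
    "q1": (("q1", "q2", "p"), (0, 1, 0)),
    "q2": (("p", "q2", "q0"), (0, 0, 1)),
    "p": (("p", "p", "p"), (2, 2, 2)),
}
SIGNALS = "abL"  # "L" is an assumed stand-in for the symbol Λ

# Dictionaries keyed by (state, input signal).
LAMBDA = {(q, x): row[0][i] for q, row in ROWS.items() for i, x in enumerate(SIGNALS)}
DELTA = {(q, x): row[1][i] for q, row in ROWS.items() for i, x in enumerate(SIGNALS)}


def step(state, signal):
    """One time step: return the pair (next state, output signal)."""
    return LAMBDA[(state, signal)], DELTA[(state, signal)]


def is_halt(state):
    """Condition of Definition 7.1: λ(state, x) = state for every x in A = {a, b}."""
    return all(LAMBDA[(state, x)] == state for x in "ab")
```

For instance, `step("q0", "a")` looks up the first row of both tables, and `is_halt` confirms that only `p` satisfies the halting condition of the definition.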
Let us point out that we have added to the alphabet $A$ a special symbol $\Lambda$, an “empty word”. The purpose of this will be disclosed in the following two sections.

2. The automaton $M = (A, Q, B, \lambda, \delta)$ is usually interpreted as a system working on a discrete time scale $T = \{0, 1, 2, \dots\}$ which, being at the moment of time $t$ in the state $q \in Q$ and receiving the input signal $x \in A$, moves at the moment $t + 1$ into the state $\lambda(q, x) \in Q$ and sends the output signal $\delta(q, x)$. The functioning of the automaton may be visualized as follows. Imagine that the incoming information is written on a tape which is divided into cells. We assume that in each cell there is either a letter of the
alphabet $A$ or else it is empty (in this case we agree that the cell contains the “empty word” $\Lambda$). Let a finite word $s$ be written in successive cells of the tape, all cells to the left and to the right of it being empty (by our agreement, containing the symbol $\Lambda$). At the moment $t = t_0$ the machine $M$ starts in a situation where its state is $q_0$ (the initial state) and the leftmost symbol $x_1$ of the word $s$ enters the automaton. At the next moment of time $t = t_0 + 1$ the signal $\delta(q_0, x_1)$ leaves the automaton $M$ and the automaton passes to the situation $(\lambda(q_0, x_1), x_2)$. The tape moves one step to the left. The further activity of the automaton follows its program, given by the table $\lambda(q, x) = q'$. The left-hand side of the equality $\lambda(q, x) = q'$ shows that at time $t$ the automaton is in the state $q$ and receives the input signal $x$. The right-hand side indicates the state of the automaton at time $t + 1$.

If the automaton $M$, after having “read” the word $s$, reaches the situation $(q_0, \Lambda)$, then $s$ is called a word accepted by the automaton $M$. The set of all finite words (in the alphabet $A$) accepted by the automaton $M$ is called the formal language accepted by the automaton $M$.⁹³ More generally, a formal language in a certain alphabet $A$ is a set of words formed from this alphabet. A formal language which is the language accepted by some finite automaton is called an automaton language.

EXAMPLE 7.2. Let $A = \{a, b\}$. The formal language
\[
\{\underbrace{a \dots a}_{m \text{ times}}\,\underbrace{b \dots b}_{n \text{ times}} \mid m, n > 0\}
\]
is an automaton language, because it is the language accepted by the automaton $M$ in Example 7.1. But not all formal languages are automaton languages.

3. The set $S(A)$ of all finite words in a given alphabet $A$ is a semigroup under the operation of concatenation (“multiplication”), i.e. this operation is associative on $S(A)$. In what follows we agree that the “empty word” belongs to $S(A)$; we denote it by $\Lambda$ and assume that it acts as the unit of $S(A)$. Semigroups with a unit are called monoids.

From the theoretical point of view it is expedient to extend the function $\lambda : Q \times A \to Q$ to a function $\lambda^* : Q \times S(A) \to Q$. This can be done inductively on the length of the “processed” words. If $x \in A$ we set $\lambda^*(q, x) = \lambda(q, x)$. Let $\lambda^*$ be defined for all words $u \in S(A)$ of length not exceeding $n$, and let $w = ux$ be any word of length $n + 1$. We set $\lambda^*(q, w) = \lambda(\lambda^*(q, u), x)$. Analogously, the domain of the output function is extended to the set $Q \times S(A)$, so that the extended function $\delta^*$ satisfies the condition $\delta^*(q, uv) = \delta^*(\lambda^*(q, u), v)$ for all words $u, v \in S(A)$. In what follows it will be convenient to denote the functions $\lambda^*$ and $\delta^*$ again simply by $\lambda$ and $\delta$.

As an illustration of the definition of the functions $\lambda^*$ and $\delta^*$, here are some of their values for the automaton considered in Example 7.1:
\[
\begin{array}{lll}
\lambda^*(q_0, aaa) = q_1, & \lambda^*(q_0, aabbb) = q_2, & \lambda^*(q_2, a) = p, \\
\delta^*(q_0, aaa) = 0, & \delta^*(q_0, aabbb) = 0, & \delta^*(q_0, a) = 1.
\end{array}
\]
The correctness of these computations can easily be checked using the tables given in Example 7.1.

⁹³ An excellent introduction to mathematical linguistics is the book [4]. In this book the relation between formal languages, automata and algebraic theories is illustrated by extensive and good examples.
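The inductive extension of λ and δ to words, and the acceptance condition of paragraph 2, can be sketched in a few lines of Python. The tables repeat Table 1 of Example 7.1. The function names `lam_star`, `delta_star` and `accepts`, the use of `"L"` for Λ, and the reading of acceptance as "the state q0 is reached after the word and one further empty cell" are assumptions made for this illustration.

```python
# Table 1 of Example 7.1: state -> (λ on a, b, Λ), (δ on a, b, Λ).
ROWS = {
    "q0": (("q1", "p", "p"), (1, 0, 0)),
    "q1": (("q1", "q2", "p"), (0, 1, 0)),
    "q2": (("p", "q2", "q0"), (0, 0, 1)),
    "p": (("p", "p", "p"), (2, 2, 2)),
}
SIGNALS = "abL"  # "L" is an assumed stand-in for the symbol Λ
LAMBDA = {(q, x): row[0][i] for q, row in ROWS.items() for i, x in enumerate(SIGNALS)}
DELTA = {(q, x): row[1][i] for q, row in ROWS.items() for i, x in enumerate(SIGNALS)}


def lam_star(q, word):
    """λ*(q, ux) = λ(λ*(q, u), x), computed letter by letter."""
    for x in word:
        q = LAMBDA[(q, x)]
    return q


def delta_star(q, word):
    """δ*(q, ux) = δ(λ*(q, u), x): the output produced by the last letter.

    The word is assumed to be nonempty.
    """
    return DELTA[(lam_star(q, word[:-1]), word[-1])]


def accepts(word):
    """Assumed reading of acceptance: after the word and one empty cell Λ
    the automaton started in q0 is in the state q0 again."""
    return lam_star("q0", word + "L") == "q0"
```

With these definitions, `lam_star("q0", "aabbb")` and `delta_star("q0", "a")` reproduce the values listed in the text, and `accepts` agrees with the language of Example 7.2 on every short word one cares to try.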
4. Let there be given an automaton $M = (A, Q, B, \lambda, \delta)$ which at a fixed moment of time $t \in T$ is in the state $q$. The future behavior of the automaton $M$ is characterized by the function $f : S(A) \to B$, which for each $u \in S(A)$ is given by the formula $f(u) = \delta(q, u)$. Of course, there may exist distinct states $q$ and $r$ in $M$ such that $\delta(q, \ast) \equiv \delta(r, \ast)$. If such a situation does not occur, we say that $M$ is a reduced automaton. It turns out that to each automaton $M$ there corresponds a reduced automaton $M' = (A, Q', B, \lambda', \delta')$ which is equivalent to $M$ in the following sense: for each state $q \in Q$ there exists a state $r \in Q'$ such that $\delta(q, \ast) = \delta'(r, \ast)$ as functions on $S(A)$.

To each word $u \in S(A)$ we associate the left shift $l_u : S(A) \to S(A)$, the function defined for each $v \in S(A)$ by the formula $l_u(v) = uv$. For arbitrary $u, v, w \in S(A)$ one has $l_u l_v(w) = l_{uv}(w)$. On the basis of the automaton $M = (A, Q, B, \lambda, \delta)$ we construct the automaton $M(f) = (A, Q_f, B, \lambda_f, \delta_f)$ whose set of states is
\[
Q_f = \{ g : S(A) \to B \mid g = f l_u \text{ for some } u \in S(A) \}.
\]
The functions $\lambda_f$ and $\delta_f$ are defined by the formulae
\[
\lambda_f(g, v) = g l_v, \qquad \delta_f(g, v) = g(v).
\]
Here $f l_u$ and $g l_u$ denote the functions $S(A) \to B$ whose values at the word $w \in S(A)$ are computed according to the formulae $f l_u(w) = f(uw)$ and $g l_u(w) = g(uw)$. In order to better understand the nature of the automaton $M(f)$ we make the following observations.

First, for each $u \in S(A)$ the identity $f l_u(\ast) = \delta(\lambda(q, u), \ast)$ holds. Indeed, for an arbitrary $v \in S(A)$ we have the equalities $f l_u(v) = \delta(q, l_u(v)) = \delta(q, uv) = \delta(\lambda(q, u), v)$, from which the desired equality follows.

Second, for $g = f l_u$ we have $\delta_f(g, v) = \delta(q, uv)$, because $\delta_f(g, v) = g(v) = f l_u(v) = f(uv) = \delta(q, uv)$. We observe further that $\lambda_f(g, v) = g l_v = f l_u l_v = f l_{uv} = \delta(q, l_{uv}(\ast))$.

Third, we show that $M(f)$ is a reduced automaton. To this end it is sufficient to show that $\delta_f(g', \ast) = \delta_f(g, \ast)$ implies $g'(\ast) = g(\ast)$. Let $g = f l_u$, $g' = f l_w$ with $u, w \in S(A)$, and take an arbitrary word $v \in S(A)$. Assume that $\delta_f(g', \ast) = \delta_f(g, \ast)$. We have the following chain of equalities:
\[
g'(v) = f l_w(v) = \delta(q, wv) = \delta_f(g', v) = \delta_f(g, v) = \delta(q, uv) = f l_u(v) = g(v).
\]
It follows that $g' = g$. The assertion is proved.

The reasoning given shows that the set of states $Q_f$ of the automaton $M(f)$ can be regarded as the trajectory emanating from the state $f(\ast) \in Q_f$, where each state is “accessible” from the state $f(\ast)$. The connection of the automaton $M(f)$ with $M$ is reflected by the fact that if we identify the state $q \in Q$ with the state $f(\ast) \in Q_f$, we can view $Q_f$ as the subset of $Q$ consisting of those states which are “accessible” from the state $q$, where states of the automaton $M$ with the same behavior are considered identical.

5. On the monoid $S(A)$ of input words of the automaton $M(f)$ there is given the following (Myhill) equivalence $\equiv_f$:
\[
v \equiv_f v' \iff f(uvw) = f(uv'w) \quad \text{for all } u, w \in S(A).
\]
In other words, the words $v$ and $v'$ are equivalent if and only if the function $f$ acts on them in the same way in equal contexts. Myhill equivalence in the monoid $S(A)$ is stable under multiplication of words, that is, for any $u \in S(A)$ it follows from $v \equiv_f v'$ that $vu \equiv_f v'u$ and $uv \equiv_f uv'$. Therefore the product of two equivalence classes can be defined as the equivalence class containing the product of any representatives of these classes. One obtains a semigroup whose elements are the Myhill equivalence classes. This semigroup $S_f$ of classes is called the semigroup of the automaton $M(f)$.

Let us compute, for example, the semigroup of a trigger. A trigger is an automaton $M = (A, Q, Q, \lambda, \lambda)$, where $A = \{x_0, x_1\}$, $Q = \{q_0, q_1\}$ and the function $\lambda$ is given by Table 2.

\[
\begin{array}{c|ccc}
\lambda & x_0 & x_1 & \Lambda \\ \hline
q_0 & q_0 & q_1 & q_0 \\
q_1 & q_0 & q_1 & q_1
\end{array}
\]

Table 2
Thus a trigger is an automaton whose input alphabet and set of states are 2-element sets and whose functions $\lambda(q, \ast)$ and $\delta(q, \ast)$ coincide, that is, its output signal at time $t$ is identified with its state at that moment. The table defining the function $\lambda$ shows that the signal $x_i$ brings the trigger into the state $q_i$ independently of its preceding state.

The semigroup of the trigger is obtained in the following way. Let $f(\ast) = \lambda(q_0, \ast)$. The definition of the congruence $\equiv_f$ shows that the relation $v \equiv_f v'$ holds if and only if for all $u, w \in S(A)$ one has $\lambda(q_0, uvw) = \lambda(q_0, uv'w)$. As the function $\lambda$ can take only two different values, we have two equivalence classes of $\equiv_f$: $[x_0]$ and $[x_1]$. To the first of them belong all words with the last letter $x_0$, and to the second the words with the last letter $x_1$. To these two classes we add the equivalence class $[\Lambda]$ corresponding to the empty word $\Lambda$. As multiplication of classes is defined by multiplication of their representatives, it can be seen from the definition of the trigger that the classes $[x_0]$, $[x_1]$ and $[\Lambda]$ are multiplied by the rules
\[
t \cdot 1 = 1 \cdot t = t, \qquad t \cdot [x_i] = [x_i].
\]
In these rules we have denoted by $t$ any of the classes $[\Lambda]$, $[x_0]$ or $[x_1]$ and put $1 = [\Lambda]$.

6. To an arbitrary monoid $S$ one can associate the automaton $M(S) = (S, S, S, \lambda, \delta)$, where the functions $\lambda$ and $\delta$ are given, using multiplication in $S$, as follows:
\[
\lambda(q, u) = \delta(q, u) = q \cdot u \in S \quad \text{for all } q, u \in S.
\]
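The multiplication rules of the trigger's semigroup and the step function of $M(S)$ can be checked with a short Python sketch. It assumes, for the trigger only, that two words lie in the same Myhill class exactly when they act on the states $q_0, q_1$ in the same way, so that each class can be recorded as a transformation of $Q$. The names `ACTION`, `mul`, `generated_monoid` and `monoid_automaton_step` are hypothetical, and `"L"` stands for Λ.

```python
Q = ("q0", "q1")

# Table 2: the action of each input signal on the states ("L" stands for Λ).
ACTION = {
    "x0": {"q0": "q0", "q1": "q0"},
    "x1": {"q0": "q1", "q1": "q1"},
    "L": {"q0": "q0", "q1": "q1"},
}


def as_tuple(action):
    """Record a transformation of Q as the tuple of images of (q0, q1)."""
    return tuple(action[q] for q in Q)


def mul(s, t):
    """Product s·t of two transformations: act with s first, then with t."""
    return tuple(t[Q.index(image)] for image in s)


def generated_monoid(generators):
    """Close a set of transformations under multiplication by the generators."""
    elements = set(generators)
    frontier = set(generators)
    while frontier:
        new = {mul(s, g) for s in frontier for g in generators} - elements
        elements |= new
        frontier = new
    return elements


X0, X1, ONE = (as_tuple(ACTION[x]) for x in ("x0", "x1", "L"))
S = generated_monoid([X0, X1, ONE])


def monoid_automaton_step(q, u):
    """M(S): λ(q, u) = δ(q, u) = q·u for q, u in S; returns (next state, output)."""
    r = mul(q, u)
    return r, r
```

The closure contains exactly the three elements `X0`, `X1` and `ONE`, and `mul(t, X0)` returns `X0` for every `t` in `S`, in line with the rule $t \cdot [x_i] = [x_i]$.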
Applying this construction to the monoid $S_f$, we get the automaton $M(S_f)$. It turns out that this automaton models the behavior of $M(f)$ only poorly. Therefore we complement the construction of $M(S_f)$. Let the function $i_f : S_f \to B$ be given by the formula $i_f(s) = f(u)$, where $s$ denotes the equivalence class $[u]$. As the value of $i_f$ does not depend on the choice of the representative $u \in S(A)$ from the class $s$, the definition is consistent. An automaton whose behavior is “close” to the behavior of $M(f)$ is the automaton $M(S_f, i_f) =$
$(S_f, S_f, B, \lambda, \delta)$, where $\lambda(s, s') = s \cdot s'$ and $\delta(s, s') = i_f(s \cdot s')$. The automaton $M(S_f, i_f)$ is close to the automaton $M(f)$ in the following sense: it is possible (if necessary) to add to $M(S_f, i_f)$ a coder of its input signals and a decoder of its output signals in such a way that for each state of $M(f)$ there exists a state of $M(S_f, i_f)$ such that, when $M(S_f, i_f)$ starts at this state, it maps the input signals in the same way as $M(f)$ does. In this case one says that $M(S_f, i_f)$ is a model of the automaton $M(f)$. As the automata $M(S_f, i_f)$ and $M(f)$ can be interchanged in the case at hand, these automata are quasi-equivalent.

7. What kind of information about the automaton $M(f)$ does the pair $(S_f, i_f)$ contain, and how can this information be used to study the automaton $M(f)$?

The more extensive the functions performed by automatic devices become, the larger their blocks grow and the more complex their hierarchical structure becomes. This tendency forces one to construct the devices in several steps: first one determines the structure of the blocks of the automaton and afterwards the optimal block structure. In this way one has to deal with the assembly of ever more complicated automatic devices, which makes a theoretical treatment of these problems necessary; this is the task of the theory of decomposition and synthesis of automata. The decomposition of an automaton into “bricks” can be done on different “scales”. In the case of a computer we can take as such bricks either semiconductors or complete circuits; in studying the brain, whole parts of it or just individual neurons. It is clear that a prerequisite for a successful theory of decomposition and synthesis is the choice of an optimal “scale”. The classification of the bricks and the determination of their properties always require specific knowledge about the domain to which these objects belong.
The experimental studies are here always accompanied by mathematical methods, among which logical and algebraic methods hold a sufficiently prominent position.

In the first part of the paper we spoke of the work of McCulloch and Pitts on neural networks. It follows directly from the corresponding definitions that each formal neural network can be considered as a finite automaton. However, the possibility of realizing the behavior of every finite automaton in some formal neural network comes as somewhat of a surprise. This result of McCulloch and Pitts solves the problem of decomposition of automata if cycles are allowed in joining the primitive building blocks (the neurons); in joining neurons into a network, rather complicated cycles may occur. In practice, however, one imposes several kinds of restrictions on the presentation of automata (or their blocks) in order to exclude such cycles. Often serial or parallel connections of automata, or some combination of these, are used. The properties of several connections of this type are reflected in the notion of a cascade of automata. Let us now introduce this important concept.

We consider an automaton $M = (A, Q, B, \lambda, \delta)$ such that its state at each moment of time determines the output at that moment, i.e., there is a function $\beta : Q \to B$ such that $\delta(q, x) = \beta(\lambda(q, x))$. Such an automaton is called a state-output automaton or a Moore automaton. One example of a Moore automaton, the trigger, is already known to us. Another important example (the $PR$-automaton) will be introduced in the next section.

Let there be given two Moore automata $M = (A, Q, B, \lambda, \beta)$ and $M' = (A', Q', B', \lambda', \beta')$, an alphabet $Z$, and two arbitrary functions $\sigma : Z \times B \to A'$ and $\kappa : Z \to A$. The coder $\kappa$ maps each signal from $Z$ to a signal acceptable by the automaton $M$. The coder
$\sigma$ maps pairs of signals, of which the first component is a signal in $Z$ and the second an output signal of $M$, into signals acceptable by the automaton $M'$. A cascade of the automata $M'$ and $M$ is the automaton $M' \circ M = (Z, Q' \times Q, B' \times B, \lambda^*, \beta^*)$, whose state and output functions are given by the formulae
\[
\lambda^*((q', q), z) = (\lambda'(q', \sigma(z, \beta(q))), \lambda(q, \kappa(z))), \qquad \beta^*(q', q) = (\beta'(q'), \beta(q)).
\]
The functioning of the automaton $M' \circ M$ is illustrated by the scheme in Figure 18.

[Fig. 18: block scheme of the cascade $M' \circ M$, with the input alphabet $Z$, the coders $\kappa$ and $\sigma$, the automata $M$ and $M'$, and the outputs $B$ and $B'$.]
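The cascade formulae translate almost literally into code. The following Python sketch is only an illustration under stated assumptions: the class `Moore`, the functions `cascade` and `run`, and the two-trigger example are hypothetical names and choices, with the trigger's signals and states written as 0 and 1.

```python
class Moore:
    """A Moore automaton given by a transition function lam(q, x)
    and a state-output function beta(q)."""

    def __init__(self, lam, beta):
        self.lam = lam
        self.beta = beta


def cascade(m_prime, m, sigma, kappa):
    """Return the cascade M' ∘ M over the input alphabet Z.

    States are pairs (q', q); sigma(z, y) codes the input of M' from the
    signal z and the current output y of M; kappa(z) codes the input of M.
    """

    def lam_star(state, z):
        q_prime, q = state
        return (m_prime.lam(q_prime, sigma(z, m.beta(q))), m.lam(q, kappa(z)))

    def beta_star(state):
        q_prime, q = state
        return (m_prime.beta(q_prime), m.beta(q))

    return Moore(lam_star, beta_star)


def run(machine, state, inputs):
    """Feed the inputs one by one; collect the output after each step."""
    outputs = []
    for z in inputs:
        state = machine.lam(state, z)
        outputs.append(machine.beta(state))
    return outputs


# A trigger: the input signal i brings it into the state i; the output is the state.
trigger = Moore(lam=lambda q, x: x, beta=lambda q: q)

# sigma ignores z: the second trigger is fed the output of the first one.
serial = cascade(trigger, trigger, sigma=lambda z, y: y, kappa=lambda z: z)
# sigma ignores y: both triggers are fed (a coding of) the same signal z.
parallel = cascade(trigger, trigger, sigma=lambda z, y: z, kappa=lambda z: z)
```

In this example `run(serial, (0, 0), [1, 0, 0, 1])` behaves like a two-cell delay line: the second component holds the current input and the first component the previous one.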
In the special case where there exists a function $\tau : B \to A'$ such that $\sigma(z, y) = \tau(y)$ for all $z \in Z$ and $y \in B$, we are dealing with the serial connection of the automata $M$ and $M'$. We have a parallel connection if there exists a function $\tau : Z \to A'$ such that $\sigma(z, y) = \tau(z)$ for all $z \in Z$ and $y \in B$. The notion of a cascade of automata is also illustrated by the proof of the theorem in the next section.

8. The main result in the theory of decomposition of automata is the following.

THEOREM 7.2 (Krohn-Rhodes). Each finite automaton can be modelled by a cascade of triggers and of automata $M(G)$ corresponding to suitable finite simple groups $G$.

In this case one says that the given automaton cascades into a set of these automata. The most interesting route to this result is due to H. Zeiger [8]. The central role here is played by the notion of a $PR$-automaton, that is, a Moore automaton in which each input signal either induces a substitution (permutation) on the state space or else brings the automaton into a state fixed by this input signal (that is, a state that does not depend on the state of the automaton before it received the signal). It follows from the definition that to each $PR$-automaton there is associated a substitution group on the set of its states. It turns out that a set of such automata is sufficient to build an arbitrary finite automaton. More exactly, we have the following result.

THEOREM 7.3 (Zeiger). Each finite automaton can be presented by a cascade of $PR$-automata.

The proof of this theorem is rather complicated. It is essential to note that the method of covers (or mosaic pictures) used in it can probably be adapted for the mathematical
treatment of certain problems in biology. The route from Zeiger’s Theorem to the Krohn-Rhodes Theorem consists of two steps. First one shows that each $PR$-automaton can be modelled by a cascade of the automaton $M(G)$ corresponding to its substitution group $G$ and a suitable automaton $K$, where $K$ in turn cascades into triggers. At the second step, one connects $M(G)$ with a set of finite simple groups. In this argument an important role is played by the notion of the composition series of a group and by the Jordan-Hölder theorem.⁹⁴ As the factors of the composition series of $G$ are simple groups, it suffices to establish the following result.

[Fig. 19: block scheme of the cascade of $M(G/H)$ and $M(H)$, with the coders $\kappa$, $\sigma$ and $\beta$.]
LEMMA 7.4. Let $M = M(G)$ be the automaton corresponding to the finite group $G$. Then $M$ cascades into the automata corresponding to the factors of the composition series of $G$.

PROOF. First we show that, for each finite group $G$ and each normal subgroup $H < G$, the automaton $M(G)$ cascades into the automata $M(H)$ and $M(G/H)$. For this we need the scheme given in Figure 19. We fix (arbitrary) representatives for the cosets $Hg$ of $G$. In the following, we denote by $\bar{g}$ the representative chosen for the coset $Hg$. The scheme works as follows.

At time $t$:
• the signal $g_2 \in G$ appears at the input,
• $M(G/H)$ is in the state $Hg_1$,
• $M(H)$ is in a state $h_1 \in H$ such that $h_1 \bar{g}_1 = g_1$.

At time $t + 1$: the coder $\kappa$ maps the signal $g_2$ to the signal $Hg_2$ (here $Hg_2 = H\bar{g}_2$), and as a result the automaton $M(G/H)$ produces the signal
\[
Hg_1 \cdot Hg_2 = H\bar{g}_1 \cdot g_2 = H\overline{(g_1 \cdot g_2)},
\]
which goes to the coder $\beta$. At the same time the signal $Hg_1$ from the output of $M(G/H)$ goes, together with the signal $g_2$, to the coder $\sigma$. The working principle of the coder $\sigma$ is the following:
\[
\sigma : (g_2, Hg_1) \longmapsto \bar{g}_1 g_2 [\overline{(g_1 g_2)}]^{-1} = h \in H.
\]

⁹⁴ The relevant notions about groups used in the present paper are also set forth in [K70] (Section 4 of this Chapter).
The signal $h \in H$ obtained now goes to $M(H)$, which is in the state $h_1$. The behavior of the automaton $M(H)$ can be described by the equation
\[
h_1 \cdot h = h_1 \bar{g}_1 g_2 [\overline{(g_1 g_2)}]^{-1} = g_1 g_2 [\overline{(g_1 g_2)}]^{-1}.
\]
The coder $\beta$ maps pairs of signals according to the rule
\[
\beta : (H\overline{(g_1 g_2)}, h_1 h) \longmapsto h_1 h \overline{(g_1 g_2)} = g_1 g_2 [\overline{(g_1 g_2)}]^{-1} \cdot \overline{(g_1 g_2)} = g_1 g_2.
\]
These computations show that the statement made at the beginning of the proof is valid.

We now prove the lemma by induction on the length of the composition series of $G$ (in view of the Jordan-Hölder theorem the length of the composition series is an invariant of the group). If $G$ is a simple group, then the validity of the assertion is evident. Otherwise $G$ has a non-trivial composition series
\[
G = G_0 > G_1 > \dots > G_{k-1} > G_k = (1).
\]
Assume that the assertion has been proved for all groups with a composition series of length $\le k - 1$. The reasoning given at the beginning of the proof shows that $M(G)$ cascades into the automata $M(G/G_1)$ and $M(G_1)$. By the induction hypothesis, $M(G_1)$ cascades into the automata that correspond to the factors $G_1/G_2, G_2/G_3, \dots, G_{k-1}/G_k = G_{k-1}$ of the composition series. Thus the desired cascade for the automaton $M(G)$ is found. □

9. Triggers can be viewed as sufficiently simple building blocks for automata. But what can be said about the automata corresponding to simple groups? If we had a list of all simple groups $G$ and their necessary properties, then the Krohn-Rhodes Theorem would give a solution to the problem of decomposition of automata. But so far there is no such list.⁹⁵ Therefore the idea arises to look for even simpler building blocks than the automata $M(G)$ corresponding to simple groups. For this one would have to continue cascading these automata. A closer investigation of the relations between a cascade of automata and its semigroup shows that such a desire cannot be put into practice.
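Returning to the first step of the proof of Lemma 7.4, the coders $\kappa$, $\sigma$ and $\beta$ can be tried out on a small group. The Python sketch below is an illustration only: it assumes $G = S_3$ with the normal subgroup $H = A_3$, takes the smallest element of each coset as its representative, and uses hypothetical function names throughout. It simulates the pair $M(G/H)$, $M(H)$ and compares the decoded outputs with those of $M(G)$.

```python
from itertools import permutations


def mul(p, q):
    """Product p·q of permutations written as tuples: apply p first, then q."""
    return tuple(q[i] for i in p)


def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)


def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2 == 0


G = list(permutations(range(3)))      # assumed test case: the group S_3
H = [g for g in G if is_even(g)]      # its normal subgroup A_3
E = (0, 1, 2)                         # the unit of G


def rep(g):
    """Chosen representative of the coset Hg (here: its smallest element)."""
    return min(mul(h, g) for h in H)


def run_cascade(inputs):
    """Simulate M(G/H) and M(H) joined by the coders; return the decoded outputs."""
    r, h1 = rep(E), E                 # r represents the state Hg1; h1·r = g1 holds
    out = []
    for g2 in inputs:
        new_r = rep(mul(r, g2))               # M(G/H) moves to H(g1 g2)
        h = mul(mul(r, g2), inv(new_r))       # coder σ: an element of H
        assert h in H
        h1 = mul(h1, h)                       # M(H) multiplies its state by h
        r = new_r
        out.append(mul(h1, r))                # coder β recovers g1 g2
    return out


def run_direct(inputs):
    """The automaton M(G) itself: the state is the product of all inputs so far."""
    g, out = E, []
    for g2 in inputs:
        g = mul(g, g2)
        out.append(g)
    return out
```

For every input sequence tried, `run_cascade` and `run_direct` return the same list, which is exactly the identity $h_1 h\,\overline{(g_1 g_2)} = g_1 g_2$ established in the proof.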
We call an automaton $M$ noncascadable if, each time it is modelled by the cascade of two automata $M_1$ and $M_2$ (with the corresponding semigroups $S_1$ and $S_2$), it follows that either $M(S_1)$ models $M$ or else $M(S_2)$ models $M$. An algebraic treatment of the problem of noncascadability of automata is made possible by the following two notions.

DEFINITION 7.5. Let $S_1$ and $S_2$ be two semigroups, and let $\sigma : S_1 \to \mathrm{End}(S_2)$ be a homomorphism of the monoid $S_1$ to the semigroup of endomorphisms of the monoid $S_2$.

⁹⁵ The classification of simple groups has been worked on for about 70 years. First whole series of such groups were found, but later also individual groups of very high order. For example, at a meeting in Galway, Ireland, in 1973 about the application of computers in algebra, M. Hall treated a group of order 460 815 505 920, whose simplicity was to be decided by a computer. Indeed, an algorithm can be given that decides the simplicity of a group from its Cayley table. A deficiency of such an approach is apparently that we lack a criterion to determine whether the list already composed contains all simple groups or not. The question of the existence of such a criterion is difficult. Translator’s Note. The problem of the classification of simple groups is now settled. See Gunnar Traustason’s comments to Section 4 in this Chapter, and the references indicated there.
Then the semi-direct product $S_2 \times_\sigma S_1$ is the set of all pairs in $S_2 \times S_1$ with the composition (multiplication of pairs) given by the rule
\[
(s_2, s_1) \cdot (s_2', s_1') = (s_2 \cdot \sigma_{s_1}(s_2'), s_1 \cdot s_1').
\]

DEFINITION 7.6. A semigroup $S$ is atomary if, for all possible homomorphisms $\sigma : S_1 \to \mathrm{End}(S_2)$, it follows from the relation $S \mid S_2 \times_\sigma S_1$ that $S \mid S_1$ or $S \mid S_2$. (Here the notation $P \mid Q$ means that the semigroup $P$ divides the semigroup $Q$ in the following sense: there exists a subsemigroup $Q_1 \subset Q$ which can be mapped epimorphically onto $P$.)

It turns out that an automaton is noncascadable if and only if its semigroup is atomary. This result is important because it transfers the difficulties related to the problem of deciding the noncascadability of an automaton to the algebraic domain. A somewhat expected fact is the atomarity of the semigroup of a trigger. The atomarity of simple groups is established by an algebraic argument of considerable logical depth. Therefore a refinement of the building blocks obtained in the Krohn-Rhodes Theorem is not possible, so that getting an overview of all possible finite automata is, within the framework of this theory, tightly connected with the classification of finite simple groups.

10. At an international meeting devoted to universal algebras and their applications, held in Potsdam in 1970, Samuel Eilenberg indicated a new road to the Krohn-Rhodes result. His approach has in automata theory about the same effect as the passage from equations to the study of extensions of fields in Galois theory; in both cases it leads to a widening and a clarification of the theory.

The ideas of automata theory have found widespread application in the creation of formalized methods for designing computers, and likewise in the solution of theoretical problems in programming. Let us especially mention that it was the concepts arising from the algebraic decomposition theory of Krohn-Rhodes that made it possible for R.
Kalman, in the years 1962-67, to carry out an “algebraic reform” in the theory of linear dynamic systems. This branch of optimal control theory was thereby brought into especially sharp relief, which makes possible a fast improvement and widening of the theory in the future. Undoubtedly, major progress in algebra has always been related to possibilities for an inner development of its theories. Moreover, it is necessary to use opportunities to apply the obtained results outside the traditional borders. Analytical cybernetics gives here an excellent opportunity. One cannot hope that all branches of algebra which could turn out to be necessary in the course of such research have already been created. Also, it does not seem probable today that the necessary apparatus ought to be purely algebraic, although it is true that algebra always plays a fundamental role in all kinds of structural theories. One should rather treat the existing algebraic theories and facts as the basic material in the creation of a language that will be adequate for a mathematical description of complex automata.
References
[1] N. Basov and O. Krohin. Laser-71. Izvestiya, Feb. 12, 1974.
[2] H. Dreyfus. What computers can’t do: the limits of artificial intelligence. Harper & Row, New York, 1972.
[3] V. Gluškov. Abstract theory of automata. Usp. Mat. Nauk 5, 1961, 3–62.
[4] M. Gross and A. Lantin. Notions sur les grammaires formelles (The theory of formal grammars). Gauthier-Villars, Paris, 1967. Russian translation: “Mir”, Moscow, 1971.
[5] K. Krohn and J. Rhodes. Algebraic theory of machines. I. Prime decomposition theorem for finite semigroups and machines. Trans. Am. Math. Soc. 116, 1965, 450–464.
[6] W. McCulloch and W. Pitts. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 1943, 115–133.
[7] P. Rashevskiĭ. On the dogma of the natural number system. Usp. Mat. Nauk 28 (4), 1973, 243–246.
[8] H. P. Zeiger. Cascade decomposition of automata using covers. In: M. A. Arbib (ed.), Algebraic Theory of Machines, Languages, and Semigroups. Academic Press, Netherlands, 1968, 55–80.
8. [K93c] Mordell’s problem
Comments by G. Almkvist
In this paper we describe two results which have become widely known in the past 15 years, crowning the efforts of numerous mathematicians since the times of Pierre Fermat (1601-1665). At the same time we try to present the notions which made it possible to arrive at this milestone. Regretfully, however, major events in the area of mathematics resemble high mountain peaks: even once they have been climbed, a majority of them remain inaccessible to all but the very few who possess the necessary special equipment and training for reaching these heights. Therefore it is natural that although people often speak romantically about the peaks of the “mathematical mountain range”, they avoid mentioning the techniques and paths leading to the goals. About the latter it is also exceedingly difficult to speak and, what is even worse, the narrative becomes fragmentary and, as it is so hard to understand, it creates only displeasure. The author was however encouraged by many people (among them not only mathematicians) who nevertheless want to know more about the results of Pierre Deligne and Gerd Faltings. This paper is based on material presented in the first half of my talk “On old and new problems in Discrete Mathematics” at a meeting of Estonian mathematicians in Saaremaa⁹⁶. In streamlining it I was very much helped by pertinent remarks made by Docent R. Prank and, in particular, by Professor Ülo Kaasik. It is hardly necessary to add that it is impossible to eliminate all its shortcomings; for these, of course, the author blames no one but himself.
8.1. The algebra of Fermat’s equation. 1. Since time immemorial the main objects of study in mathematics have been numbers and equations. The simplest are the algebraic equations; much more complicated are the Diophantine equations. During the past centuries new objects of study have enriched mathematics: functions have been added, and differential and functional equations. The properties of integers and Diophantine problems have attracted mathematicians since the time of Hammurabi. For instance, in those days one knew the equation $x^2 + y^2 = z^2$, which has the solution (3, 4, 5), and also all triples of the form (3n, 4n, 5n), where $n \in \mathbb{N}$⁹⁷. Triples of integers (a, b, c) which have only the number one as a common divisor and satisfy the relation $a^2 + b^2 = c^2$ are called simple Pythagorean triples. Let $m < n$ be natural numbers without a common divisor and of different parity; then $(n^2 - m^2,\ 2mn,\ n^2 + m^2)$ is a simple Pythagorean triple. Putting the relation $a^2 + b^2 = c^2$ in the form $(a/c)^2 + (b/c)^2 = 1$, we observe that to each triple (a, b, c) there corresponds a solution of the equation $x^2 + y^2 = 1$ in rational numbers (a Q-solution), and, apparently, also more or less conversely.
96 Translator’s note. Saaremaa (Swedish or German: Ösel, Latin: Osilia) is a big island in the Baltic Sea, belonging to Estonia.
97 Here and in the sequel we use the following notation: $\mathbb{N}$ = the set of (all) natural numbers, $\mathbb{Z}$ = the set of integers, $\mathbb{Q}$ = the set of rational numbers, $\mathbb{R}$ = the set of real numbers, $\mathbb{C}$ = the set of complex numbers.
Fig. 20. The points A(0, −1) and $B_\lambda(x_\lambda, y_\lambda)$ in the (x, y)-plane.
Thus, to find all such triples it suffices to find all Q-solutions of the equation $x^2 + y^2 = 1$. They can be found by the so-called method of pulverization (see Figure 20): through the point A(0, −1) one draws straight lines $y = \lambda x - 1$ with rational slope $\lambda \in \mathbb{Q}$ and finds their second intersection $B_\lambda(x_\lambda, y_\lambda)$ with the unit circle $x^2 + y^2 = 1$. Note that
$$x_\lambda = \frac{2\lambda}{\lambda^2 + 1} \quad\text{and}\quad y_\lambda = \frac{\lambda^2 - 1}{\lambda^2 + 1}$$
are rational numbers. In this way we obtain all Q-solutions of the equation $x^2 + y^2 = 1$ (apart from the point (0, 1), which corresponds to the vertical line through A); for any other such point $B_\lambda(x_\lambda, y_\lambda)$ determines a line $AB_\lambda$ with the rational slope $(y_\lambda + 1)/x_\lambda$. 2. The preceding serves us as an example of a Diophantine problem. The majority of these problems have a long history, but all of them reduce to the solution of a Diophantine equation (or to a system of such) or, else, are at least closely connected with this. A Diophantine equation can be presented in the form $F(x_1, \dots, x_n) = 0$, where F is a polynomial with integer coefficients and n ≥ 2; one usually seeks its solutions in integers or rational numbers. These equations get their name from Diophantus, who was one of the greatest mathematicians of antiquity. He lived and flourished in Alexandria in the 3rd century A.D. After the death of Alexander the Great, this city in the delta of the Nile had become the capital of Egypt. In Alexandria the Museon (a university, in contemporary terminology) was founded, and a library. Poets and writers were invited to the city; in its best times as many as 100 scientists worked there. Among them was Euclid, who wrote his Elementa; Eratosthenes, who excelled in many areas (for example, he is known for his method for finding prime numbers), and was the director of the library; Archimedes, who got his education there, and most of whose ideas became known through letters to scholars in Alexandria; Apollonius, whose “Conic sections” paved the path for the later work of Kepler and Newton. Such was the ancient center of culture where Diophantus wrote his “Arithmetic”.
Out of 13 chapters of the latter book [4] only 6 or 7 have survived, but despite this the treatise came to strongly influence the development of Mathematics.
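The chord construction for the circle described earlier in this subsection (lines through A(0, −1) with rational slope λ) can be tried out with exact rational arithmetic. The following is only a minimal sketch: the function names `circle_point` and `triple_from_slope` are illustrative choices and do not come from the text, and the second function assumes the slope is given as λ = n/m with integers n > m > 0.

```python
from fractions import Fraction
from math import gcd


def circle_point(lam):
    """Second intersection of the line y = lam*x - 1 with x^2 + y^2 = 1."""
    lam = Fraction(lam)
    denom = lam * lam + 1
    return 2 * lam / denom, (lam * lam - 1) / denom


def triple_from_slope(n, m):
    """Integer triple (a, b, c) with a^2 + b^2 = c^2 read off from lam = n/m.

    Hypothetical helper: clears the common denominator c of the rational
    point (a/c, b/c) obtained for the slope n/m (assumes n > m > 0).
    """
    x, y = circle_point(Fraction(n, m))
    c = x.denominator * y.denominator // gcd(x.denominator, y.denominator)
    return int(x * c), int(y * c), c


if __name__ == "__main__":
    for n, m in [(2, 1), (3, 2), (4, 1)]:
        print((n, m), circle_point(Fraction(n, m)), triple_from_slope(n, m))
```

Using `Fraction` rather than floating point keeps every point exactly on the circle, so the identity x² + y² = 1 can be checked by equality instead of by a tolerance.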
Its true richness in ideas and content was appreciated only at the end of the 16th century (F. Viète, R. Bombelli), and especially in the 17th century. Diophantus’s text was translated into Latin by Claude Bachet (1621), whose own interest in numbers had been aroused by the solution of recreational (mathematical) problems. This translation, in which the number-theoretic comments are especially emphasized, fell into the hands of Pierre Fermat and, in the years 1636-1640, turned the latter’s mathematical interest ever more towards Diophantine problems. In his research Fermat came to a conjecture which afterwards was called “Fermat’s Last Theorem” (FLT): “No cube decomposes into the sum of two cubes, no fourth power decomposes into the sum of two fourth powers.” In other words, the Diophantine equation $x^n + y^n = z^n$ admits, for n ≥ 3, no solutions in natural numbers. Fermat had proved [this in] the special case n = 4 (using the so-called method of descent), and he may have had the case n = 3 in its broad outline (it was carried out by Euler in 1753). Fermat frequently wrote to his colleagues about these special cases, but he never mentioned the general case again. (Around 1640 Fermat wrote a note in the margin of his copy of Diophantus’s book that he was in possession of a proof of FLT, but that it was far too long to be written down there; according to A. Weil’s version, Fermat might have thought so only in his youth.) With his intensive work, up to the year 1660, Fermat laid a solid basis on which Euler, Lagrange, and Gauss later could build the edifice of Number Theory. 3. In the following centuries, FLT was one of the best-known problems in mathematics. The attempts to solve this problem using new techniques have enriched mathematics with quite fruitful notions and methods. In the case n = 3 Euler used in his proof of FLT numbers of the form $a + b\sqrt{-3}$, where $a, b \in \mathbb{Z}$, and developed in an essential way the arithmetic of the domain $\mathbb{Z}[\sqrt{-3}]$.
This line of thought was continued by Gauss (1831), who introduced the domain $G = \{a + b\sqrt{-1} \mid a, b \in \mathbb{Z}\} = \mathbb{Z}[\sqrt{-1}] = \mathbb{Z}[i]$, that is, the arithmetic of the so-called Gaussian integers. In 1825 Legendre and Dirichlet established FLT for n = 5, while Lamé and Lebesgue did it for n = 7 in 1840. Actually, it suffices to prove the theorem for n = 4 and for n an odd prime. Indeed, each natural number n ≥ 3 is divisible either by 4 or by an odd prime. In the first case, n = 4m, one can rewrite $x^n + y^n = z^n$ as $(x^m)^4 + (y^m)^4 = (z^m)^4$; in the second case, n = pm, as $(x^m)^p + (y^m)^p = (z^m)^p$, where p is an odd prime. Now it is clear that from the truth of FLT for n = 4 and for all primes n > 2 follows also its truth for all n ≥ 3. 4. An essential step forward toward the solution of FLT was taken by the German mathematician Ernst Eduard Kummer (1850’s). He managed to find a condition from which the correctness of FLT follows for almost all primes n less than 100. Only in three cases (37, 59 and 67) did the issue remain open, because there Kummer’s condition does not work; the case n = 37 was solved later (1892). In this and the following four Subsections we shall learn about Kummer’s scheme of reasoning. In his proof of FLT for n = 3 Euler used the quadratic field $\mathbb{Q}(\sqrt{-3})$, and in particular its subring $\mathbb{Z}[\sqrt{-3}]$, that is, properties of the ring of Eulerian integers. Euler noticed that if, for relatively prime a, b, the number $a^2 + 3b^2$ is a perfect cube, then it follows from the identity $a^2 + 3b^2 = (a + b\sqrt{-3})(a - b\sqrt{-3})$ that both factors on the right hand side are also perfect cubes in the ring $\mathbb{Z}[\sqrt{-3}]$. In particular, there ought to exist $c, d \in \mathbb{Z}$ such that $a + b\sqrt{-3} = (c + d\sqrt{-3})^3$. This would be sufficient, if the fundamental theorem of arithmetic (uniqueness of the decomposition into prime factors) were true in $\mathbb{Z}[\sqrt{-3}]$ and
the factors $a + b\sqrt{-3}$ and $a - b\sqrt{-3}$ were without a common factor; this is what our experience in ordinary arithmetic tells us. But the fundamental theorem of arithmetic fails in the ring $\mathbb{Z}[\sqrt{-3}]$. Indeed, we have
$$4 = 2 \cdot 2 = (1 + \sqrt{-3}) \cdot (1 - \sqrt{-3}),$$
where a simple argument (by contradiction) shows that the factors 2, $1 + \sqrt{-3}$, $1 - \sqrt{-3}$ are indecomposable; for example, the number 2 does not have a representation in the form $2 = (e + f\sqrt{-3})(g + h\sqrt{-3})$ in which neither factor equals ±1. 5. But there is a way out of this dilemma. Let $\zeta^3 = 1$, $\zeta \neq 1$; thus, ζ is a C-solution of the equation $x^2 + x + 1 = 0$, or to be concrete: $\zeta = (-1 + \sqrt{-3})/2$. We consider the domain of numbers $\mathbb{Q}[\zeta] = \{r + s\zeta \mid r, s \in \mathbb{Q}\}$. As it is possible to carry out the four arithmetical operations within this set without violation of the usual rules of calculation, in the guise of $\mathbb{Q}[\zeta]$ we are dealing with a so-called number field $\mathbb{Q}(\zeta)$. The subset $E = \mathbb{Z} + \mathbb{Z} \cdot \zeta = \{a + b\zeta \mid a, b \in \mathbb{Z}\}$ should, of course, be viewed as the “integers” of the field $\mathbb{Q}(\zeta)$. In view of the identity $a + b\zeta = ((2a - b) + b\sqrt{-3})/2$ these integers can be represented as $(p + q\sqrt{-3})/2$, where p and q are ordinary integers of the same parity. It is somewhat more natural to consider the “integers” $(p + q\sqrt{-3})/2 \in E$, and not to restrict oneself to the use of Euler’s integers. The reason is that the fundamental theorem of arithmetic holds true in E, but not in the ring of Eulerian integers. As an explanation we add that the identity
$$2 = (1 + \sqrt{-3}) \cdot (1 - \sqrt{-3})/2$$
does not contradict the fundamental theorem of arithmetic, as $(1 - \sqrt{-3})/2$ is a unit in this ring (its inverse is also an “integer”), and so the identity shown expresses just the fact that the numbers 2 and $1 + \sqrt{-3}$ are associated (one number is obtained from the other by multiplication with a unit).
It follows from the correctness of the fundamental theorem of arithmetic in the domain E, and from the immediate verification of the fact that the Eulerian integers $a + b\sqrt{-3}$ and $a - b\sqrt{-3}$ have no common factor in the ring E (that is, these numbers have no common factor there distinct from a unit), that there exists a number $c + d\sqrt{-3} \in E$ such that $a + b\sqrt{-3} = (c + d\sqrt{-3})^3$. It may then happen that $c + d\sqrt{-3}$ is a number of the form $(p + q\sqrt{-3})/2$, where p and q are both odd; then c and d are not integers. Writing $c + d\sqrt{-3} = u + v\zeta$ with $u, v \in \mathbb{Z}$, we get from the identity $u + v\zeta = ((2u - v) + v\sqrt{-3})/2$ that p = 2u − v and q = v. Thus, if v is even, then p and q are also even, and, as a consequence, $u + v\zeta$ is an Eulerian integer. We observe further that among the three numbers $u + v\zeta$, $(u + v\zeta) \cdot \zeta$ and $(u + v\zeta) \cdot \zeta^2$ (associated to each other) there is at least one in which the coefficient of ζ is an even number. It follows from these reasonings that, after multiplying $c + d\sqrt{-3}$ by ζ or $\zeta^2$ if necessary (which does not change its cube, since $\zeta^3 = 1$), we may assume that c and d are integers. 6. Next, let us consider the relation $x^p + y^p = z^p$, where p ≥ 3 is a prime. Accordingly let ζ be a p-th root of unity other than unity: $\zeta^p = 1$, $\zeta \neq 1$. Then $1 + \zeta + \dots + \zeta^{p-1} = 0$, so that we have the relation
$$x^p + y^p = (x + y)(x + \zeta y)(x + \zeta^2 y) \cdots (x + \zeta^{p-1} y).$$
This gives the idea to use the ring of so-called algebraic integers
$$E_p = \{a + b\zeta + c\zeta^2 + \dots + d\zeta^{p-1} \mid a, b, \dots, d \in \mathbb{Z}\}$$
contained in the number field $\mathbb{Q}(\zeta)$; this ring is also denoted $\mathbb{Z}[\zeta]$. The arithmetic of [ordinary] integers is based on the notion of integers and the fundamental theorem of arithmetic. It follows from the latter that if $AB \cdots F = L^p$ and the factors A, B, ..., F are pairwise without a common factor, then they must likewise be p-th powers. Are such statements true in number rings other than the ordinary integers? Sometimes it is so, for instance in the case of Gaussian integers. It is also so in the rings $\mathbb{Z}[\zeta]$, where
$$\zeta = \cos\left(\frac{2\pi}{p}\right) + i \sin\left(\frac{2\pi}{p}\right); \qquad p \in \{3, 5, 7, 11, 13, 17, 19\},$$
in which case one has a unique decomposition into prime factors. On the other hand, in the number rings $\mathbb{Z}[\sqrt{-3}]$ and $\mathbb{Z}[\sqrt{-5}]$ the unique factorization fails, as
$$4 = 2 \cdot 2 = (1 + \sqrt{-3})(1 - \sqrt{-3})$$
and
$$9 = 3 \cdot 3 = (2 + \sqrt{-5})(2 - \sqrt{-5}).$$
For example, although in the last product the factors $2 \pm \sqrt{-5}$ have no common divisor, they cannot be represented as squares of numbers $a + b\sqrt{-5}$.
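The two factorizations of 4 and of 9 quoted above can be checked mechanically. The sketch below represents a number a + b√−d as a pair (a, b) and relies on a standard fact that the text does not spell out: the norm a² + d·b² is multiplicative, so a number of norm 4 (respectively 9) could only split into two non-units if some element had norm 2 (respectively 3). The helper names `norm`, `mul` and `has_element_of_norm` are illustrative assumptions.

```python
def norm(a, b, d):
    """Norm of a + b*sqrt(-d) in Z[sqrt(-d)], namely a^2 + d*b^2."""
    return a * a + d * b * b


def mul(u, v, d):
    """Product of u = (a, b) and v = (c, e), pairs standing for a + b*sqrt(-d)."""
    (a, b), (c, e) = u, v
    return (a * c - d * b * e, a * e + b * c)


def has_element_of_norm(target, d):
    """Brute-force search for integers a, b with a^2 + d*b^2 == target."""
    bound = int(target ** 0.5) + 1
    return any(
        norm(a, b, d) == target
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
    )


if __name__ == "__main__":
    # 4 = (1 + sqrt(-3)) * (1 - sqrt(-3)) in Z[sqrt(-3)]
    print(mul((1, 1), (1, -1), 3))
    # 9 = (2 + sqrt(-5)) * (2 - sqrt(-5)) in Z[sqrt(-5)]
    print(mul((2, 1), (2, -1), 5))
    # No element of norm 2 (resp. 3), so the factors above cannot be split further.
    print(has_element_of_norm(2, 3), has_element_of_norm(3, 5))
```

The brute-force search is enough here because a² + d·b² = target forces |a| and |b| to be at most √target.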
7. Hilbert found a simple model explaining the difficulties in what was just said, indicating also a way to overcome them. Let us consider the domain of numbers $H = \{4n + 1 \mid n = 0, 1, 2, \dots\} = \{1, 5, 9, 13, 17, 21, 25, \dots\}$. A number p ∈ H, p ≠ 1, is called a “prime number” if it cannot be represented in the form p = a · b, where a, b ∈ H and a ≠ 1, b ≠ 1. For example, 21 turns out to be prime in “H-arithmetic”: although 21 = 3 · 7, one has 3 ∉ H and 7 ∉ H. We note that 693 = 9 · 77 = 21 · 33 are two distinct decompositions of 693 ∈ H into “primes”, and that from the equalities $21^2 = 9 \cdot 49$ and GCD(9, 49) = 1 it does not follow that the numbers 9 and 49 can be presented as squares of H-numbers.
A way out is the following. We extend the set H to the “domain of numbers” $\widetilde{H} = \{4n + 1, 4n + 3 \mid n = 0, 1, 2, \dots\} = \{1, 3, 5, 7, \dots\}$, in which all the rules of ordinary Z-arithmetic hold true (one has to take into account that all even numbers have been omitted). For example, one has now $9 = 3^2$, 77 = 7 · 11, 21 = 3 · 7, 33 = 3 · 11, so that the two distinct decompositions of the number 693, to wit 9 · 77 and 21 · 33, reduce to a single one, $3^2 \cdot 7 \cdot 11$; the other difficulties disappear as well. 8. Is it possible to extend the domain $\mathbb{Z}[\sqrt{-5}]$ by adding to it new so-called “ideal numbers” in such a way that the fundamental theorem of arithmetic holds in the new domain? Kummer showed that this can really be done; indeed, not only in the case of $\mathbb{Z}[\sqrt{-5}]$, but even in many other cases. In this way mathematicians arrived at the notion of ideal numbers in the middle of the 19th century. Later Richard Dedekind, from a set theoretic point of view, changed them into the often used “ideals” (for instance, in ring theory). In the work of Kummer, Dedekind, Kronecker, and others there arose a new, general theory of division in domains of numbers: the theory of algebraic numbers (for more details see [13]).
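Hilbert’s model H = {1, 5, 9, 13, ...} from item 7 is small enough to explore by brute force. The following sketch lists every decomposition of an H-number into H-“primes”; the function names `in_H`, `is_H_prime` and `H_factorizations` are illustrative and not taken from the text.

```python
def in_H(k):
    """True if k belongs to H = {4n + 1 : n = 0, 1, 2, ...}."""
    return k >= 1 and k % 4 == 1


def is_H_prime(p):
    """True if p is in H, p != 1, and p is not a product of two H-numbers != 1."""
    if not in_H(p) or p == 1:
        return False
    # Candidate divisors a run through the H-numbers 5, 9, 13, ... below p.
    return not any(p % a == 0 and in_H(p // a) for a in range(5, p, 4))


def H_factorizations(k, smallest=5):
    """All nondecreasing lists of H-primes (each >= smallest) with product k."""
    if k == 1:
        return [[]]
    result = []
    for p in range(smallest, k + 1, 4):
        if k % p == 0 and is_H_prime(p):
            result += [[p] + rest for rest in H_factorizations(k // p, p)]
    return result


if __name__ == "__main__":
    print(H_factorizations(693))
    print(H_factorizations(441))
```

For 693 the search returns exactly the two decompositions 9 · 77 and 21 · 33 mentioned in the text, and for 441 = 21² it also finds 9 · 49, so uniqueness fails inside H even though it holds among all odd numbers.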
The conditions found by Kummer, which were referred to in our discussion of FLT, have not lost their importance even today⁹⁸. Basing himself on them and using computers, S. Wagstaff showed that FLT is true for all primes p ≤ 125 000 [20]. Let us add that to write down the number $2^{125000}$ one requires 37 629 digits, so that finding a counterexample to FLT is a rather hopeless task! Even more, based on results by G. Faltings (1983), D. Heath-Brown proved that FLT is true for almost all exponents n [5]. In other words, “bad” exponents, if they exist, must appear very “seldom”. More precisely, if we denote by N(c) the number of bad exponents n not exceeding c, i.e.,
$$N(c) = |\{n \mid n \le c \text{ and FLT is not true for } n\}|,$$
then N(c)/c → 0 as c → ∞. But so far it is not known if there are infinitely many “good” exponents, or not. 9. In what respect do the two well-known number domains $\mathbb{Z}$ and $\mathbb{Q}$ differ from each other? How to express the common features of the number domains considered above? If we apply addition, subtraction and multiplication to integers, we again obtain integers. Here addition and multiplication are associative and commutative, the two are connected via the distributive law, and addition is supplemented by subtraction (the inverse of addition); therefore, in the case of $\mathbb{Z}$, we are dealing with a ring. But division is not always possible in $\mathbb{Z}$. In the domain of rational numbers the situation is different: the ring $\mathbb{Q}$ is a field, that is, a commutative ring in which for all a ≠ 0 the equation ax = b has a unique solution. Also the real numbers, and likewise the complex ones, form fields, and so do suitable subsets of these fields $\mathbb{R}$ and $\mathbb{C}$: for example, $\mathbb{Q}[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$ or $\mathbb{Q}[i] = \{a + bi \mid a, b \in \mathbb{Q}\}$; the latter field contains the ring of Gaussian integers $\mathbb{Z}[i]$. A typical example of a finite field is the so-called residue class field $\mathbb{Z}_p$, where p is a fixed prime. The elements of this field are the classes of integers $\bar{0}, \bar{1}, \bar{2}, \dots, \overline{p-1}$; these are obtained by putting into one and the same class all integers which give the same remainder upon division by p. The operations with classes are defined by the formulae
$$\bar{m} + \bar{n} = \begin{cases} \overline{m + n} & \text{if } m + n < p, \\ \overline{m + n - p} & \text{if } m + n \ge p, \end{cases} \qquad \bar{m} \cdot \bar{n} = \bar{r}, \quad \text{where } r = mn - pt,\ 0 \le r < p.$$
More generally, one can view a finite field as a factor ring $\mathbb{Z}_p[x]/(g(x))$, whose elements are equivalence classes of polynomials with coefficients in the field $\mathbb{Z}_p$; here g(x) is a fixed polynomial of degree m, assumed to be irreducible over $\mathbb{Z}_p$. Two polynomials are considered to lie in the same class if their difference is divisible by g(x). The operations on this set of classes are as usual defined with the help of their representatives. It turns out that for each prime p and each natural number m this construction gives (up to isomorphism) a unique field denoted by $\mathbb{F}_{p^m}$; there are no finite fields beyond the described series $\{\mathbb{F}_{p^m} \mid m \in \mathbb{N},\ p \text{ a prime number}\}$. Finite fields play an important role in Number Theory and in Diophantine Analysis: for example, one can replace a congruence F(x, ...) ≡ 0 mod p by the equation F(x, ...) = 0 in the field $\mathbb{F}_p$ etc. Let there be given a field K. Any field E containing this field K as a subfield is called an extension of K and is denoted E/K: thus $\mathbb{C}/\mathbb{R}$ or $\mathbb{R}/\mathbb{Q}$ or again $\mathbb{C}/\mathbb{Q}$ are extensions. An extension E/K is said to be finite if E can be regarded as a finite dimensional vector space over the ground field K. Finite extensions of the field of rational numbers $\mathbb{Q}$ are called algebraic number fields. From the previous discussion it may be inferred that algebraic number fields play a particular role in Diophantine Analysis. A deeper, but still sufficiently readable account of the themes considered in this first half of our paper can be found in the book [13].
98 Translator’s note. Let us recall that the paper was written around 1988.
8.2. The geometry of Diophantine equations. Greater clarity in the problems of Diophantine Analysis was created during the course of the 19th century by Algebraic Geometry, a dynamically developing subject. This discipline studies algebraic varieties. To each Diophantine equation one can associate a geometric object, a variety, whose points can be interpreted as solutions of the given equation(s). In what sense is such a geometric point of view better than the purely arithmetical methods of Diophantine Analysis? The advantage manifests itself in the fact that with an algebraic variety one has to deal with the interplay of a whole series of algebraic and topological structures: it is a topological space (even in several topologies), an analytic space, a Lie group etc. These structures have been intensively studied in the course of time, and deep results have been obtained which, used together with arithmetical considerations, put many notions of Diophantine Analysis in a new light. This approach allows one to classify Diophantine problems according to the invariants of the variety. An example of such an invariant is the dimension of the variety. The Diophantine problems which interest us most here are mainly connected with one-dimensional algebraic varieties, traditionally called algebraic curves. 10. The geometric interpretation of an equation can lead to surprises. For example, consider the two equations $x^2 + y^2 = 1$ and $x^3 + y^3 = 1$, which superficially differ little from each other, and let us interpret them as curves in the R-plane (that is, a plane with real coordinates). Then they give two quite different pictures (see Figure 21).
Fig. 21. The curves $E : x^2 + y^2 = 1$ and $E : x^3 + y^3 = 1$ in the (x, y)-plane.
The equation $x^2 + y^2 + 1 = 0$ surprises even more: there exist “curves” without R-points! In order to get rid of this inconvenience one allows oneself to look for points of the corresponding curve in the $\mathbb{C}^2$-plane (that is, with coordinates in the extension $\mathbb{C}/\mathbb{R}$). For example, if in the case of the curves under view we set $x = x_1 + ix_2$ and $y = y_1 + iy_2$, we get relations connecting the R-quantities $x_1, x_2, y_1, y_2$, which in the case of the equation $x^2 + y^2 = 1$ define a sphere in $\mathbb{R}^4$, but for $x^3 + y^3 = 1$ a torus in the same space (see Figure 22, where these surfaces are depicted in the usual space).
Fig. 22. Sphere and torus.
In Diophantine geometry one often encounters situations when the coefficients of the equation determining the curve are from one domain (the field K) but the coordinates of the sought points have to be taken from another domain (an extension L/K of K). 11. In the geometric interpretation of the solution of a Diophantine equation one requires the notion of projective space. Let us fix a field K. The points of the n-dimensional affine space $A^n(K)$ can then be identified with sequences $(x_1, \dots, x_n)$ in the set $K^n$. The projective space $P^n(K)$ is now obtained as follows. We denote by $(K^{n+1})^*$ the set of all sequences $(x_0, x_1, \dots, x_n)$, omitting the origin (0, 0, ..., 0). We partition this set in such a way that we regard the points $(x_0, x_1, \dots, x_n)$ and $(y_0, y_1, \dots, y_n)$ in $(A^{n+1}(K))^*$ as lying in the same class if there exists $a \in K$ ($a \neq 0$) such that $x_0 = ay_0$, $x_1 = ay_1$, ..., $x_n = ay_n$. The set of classes thus obtained is called the n-dimensional projective space $P^n(K)$. These equivalence classes may be viewed as straight lines through the origin in the affine space $A^{n+1}(K)$. In the special case when $K = \mathbb{R}$ and n = 2, we obtain the (ordinary) real projective plane. Let there be fixed a set I of natural numbers and let there be given for each i ∈ I a polynomial $F_i(x_1, \dots, x_n)$ with coefficients in K. The point set in $A^n(L)$ defined by the system $F_i(x_1, \dots, x_n) = 0$, where i ∈ I and L/K is a suitable extension of the ground field K, is called an affine variety. A system of equations $F_i(x_0, x_1, \dots, x_n) = 0$, where all polynomials $F_i$ are forms (that is, homogeneous polynomials over K), determines a subset M in the projective space $P^n(L)$, called a projective algebraic variety. As points in the projective space $P^n(L)$ are lines through the origin in $A^{n+1}(L)$, we may view M as a cone in $A^{n+1}(L)$. The field K is the field of definition of M. The points $(x_0, x_1, \dots, x_n)$ in $M \subset P^n(L)$ such that all quotients $x_i/x_j$ (with $x_j \neq 0$) lie in K are called its rational points; their set will be denoted by M(K). The answer to the question about the structure and the properties of the point set M(K) is of paramount interest in the study
of the corresponding Diophantine system. In the case of algebraic curves these questions were studied very carefully, yielding also decisive progress towards the solution of FLT. 12. Consider an equation F(x, y) = 0 such that the left hand side is a polynomial of the form $a_m(x)y^m + a_{m-1}(x)y^{m-1} + \dots + a_1(x)y + a_0(x)$, where all $a_i(x)$ are polynomials in x with real coefficients. Selecting in the plane $A^2(\mathbb{R})$ all points (x, y) whose coordinates satisfy the equation F(x, y) = 0 we obtain a certain curve. Thus the equation $y^2 + x^2 - 1 = 0$ defines a circle with center (0, 0). Next, consider a curve E with equation
$$\sum_{i,j} a_{ij} x^i y^j = 0,$$
where all $a_{ij}$ are integers. We denote the set of Z-points of this curve by $E(\mathbb{Z})$, and the set of its Q-points by $E(\mathbb{Q})$. Let further o(E) denote the order of the curve, that is, the maximum of i + j for the monomials $a_{ij} x^i y^j$ appearing in the equation. In 1929, Carl Ludwig Siegel proved an important theorem to the effect that, on any curve of degree higher than two, there are at most finitely many Z-points. However, it is not easy to decide if $E(\mathbb{Z}) = \emptyset$ or not; there is no algorithm for this. The question about the existence of an algorithm for deciding whether $E(\mathbb{Q}) = \emptyset$ or not is entirely open. Before the work of L. J. Mordell (in the 1920’s) the following was known about $E(\mathbb{Q})$. First, if o(E) = 1, then $|E(\mathbb{Q})| = \infty$. Second, if o(E) = 2, then either $E(\mathbb{Q}) = \emptyset$ or $|E(\mathbb{Q})| = \infty$. Third, in the case o(E) = 3 already Diophantus knew that a line through two Q-points of E must intersect E in a third Q-point. For these curves Poincaré’s conjecture became known (1903), according to which one can find all Q-points from a certain finite set, using the following geometric procedure: one draws all possible chords through pairs of points of the given finite set and the tangents at these points, which by intersecting the curve generate new Q-points (starting with this extended set of Q-points one draws anew all chords and tangents etc.). This conjecture by Poincaré was proved (1922) by the British mathematician Louis Joel Mordell. According to Mordell’s theorem one can present the Abelian group of the rational points on the elliptic curve E as $E(\mathbb{Q}) = \mathbb{Z}^a \oplus V_E$, where $V_E$ is a finite group. The number a is called the rank of the curve. So far it is not known if there exist elliptic curves of arbitrarily large rank, but computer experiments have shown how the rank depends on the coefficients of the cubic equation defining E. The group $V_E$, the torsion of E, is made up of its points of finite order.
It turns out that either $V_E$ is a finite cyclic group or else it has the form $\mathbb{Z}_2 \oplus T$ with T finite. B. Mazur showed, in 1976, that either $|V_E|$ is one of the numbers 1, 2, ..., 10 or 12, or else $V_E = \mathbb{Z}_2 \oplus T$, where |T| equals 2, 4, 6 or 8. Thus there are 15 possibilities for $V_E$. The deeper reasons for the mystery of these numbers are so far unknown! However, hopes have arisen for proving the analogue of Mazur’s theorem for an arbitrary number field $K/\mathbb{Q}$, because recently it was found that the torsion of the group E(K) of K-rational points is finite. A higher dimensional generalization of the Poincaré-Mordell conjecture was proved, in 1927, by the French-American mathematician André Weil. It is amazing that the arithmetic of Diophantine equations is to a large extent governed by the geometry of the point set $E(\mathbb{C})$ for the corresponding curve E. Indeed, $E(\mathbb{C})$ is a certain compact 2-dimensional surface in $\mathbb{R}^4$, called the Riemann surface of the curve, and it turns out to be topologically equivalent to a “sphere with handles”. Here the number g of “handles” equals $B_1/2$, where $B_1$ is the first Betti number of $E(\mathbb{C})$; the number g is called the genus of the curve E. A curve of genus g = 0 is called rational. These are the straight lines (curves of order one) and the second order curves, and in some sense the list of rational curves stops with the ones mentioned. In this case $E(\mathbb{C})$ is the Riemann sphere, which is a surface admitting a Riemannian metric of constant positive curvature. Curves of genus g = 1 are called elliptic. Each such curve is birationally equivalent to a non-singular cubic curve, whose equation can be put in the form $y^2 = x^3 + ax + b$, where $a, b \in \mathbb{Z}$. The Riemann surface corresponding to such a curve is a torus, which allows a flat metric induced by $\mathbb{C}^2$. The set of Q-points (if it is non-empty) allows the structure of an Abelian group and this group has a finite number of generators (the so-called Mordell-Weil theorem). Curves of genus g > 1 are called non-elliptic. For instance, the so-called Klein curve $y^3 + yx^3 + x = 0$ has genus 3. All Fermat curves $x^n + y^n = 1$ with n ≥ 4 are likewise non-elliptic. The genus of such curves equals (n − 1)(n − 2)/2. In this case the curve carries a Riemannian metric of constant negative curvature. The set of Q-points on a non-elliptic curve is finite; more generally: if K is a number field (that is, the extension $K/\mathbb{Q}$ is finite), then E(K) is finite. This statement became known, in 1922, as Mordell’s conjecture, but after 1983 it is called Faltings’ theorem. The story of the origin of this result and some of its later consequences will be treated in the following section.
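The chord and tangent procedure described in item 12 for cubic curves can be sketched in a few lines for curves in the form y² = x³ + ax + b. The sketch rests on one standard computation that is not carried out in the text: substituting the line y = y₁ + s(x − x₁) into the cubic gives a polynomial in x whose three roots sum to s², so the third intersection has x₃ = s² − x₁ − x₂. The function names and the sample curve y² = x³ + 17 with its points (−2, 3) and (−1, 4) are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def on_curve(P, a, b):
    """True if the point P = (x, y) satisfies y^2 = x^3 + a*x + b."""
    x, y = P
    return y * y == x ** 3 + a * x + b


def third_point(P, Q, a, b):
    """Third intersection of the chord PQ (the tangent if P == Q) with the curve.

    Vertical lines meet the curve only "at infinity" and are rejected here.
    """
    x1, y1, x2, y2 = map(Fraction, (*P, *Q))
    if (x1, y1) == (x2, y2):
        if y1 == 0:
            raise ValueError("vertical tangent")
        slope = (3 * x1 * x1 + a) / (2 * y1)
    else:
        if x1 == x2:
            raise ValueError("vertical chord")
        slope = (y2 - y1) / (x2 - x1)
    x3 = slope * slope - x1 - x2      # the three x-roots sum to slope^2
    y3 = y1 + slope * (x3 - x1)       # the point lies on the same line
    return x3, y3


if __name__ == "__main__":
    a, b = 0, 17
    P, Q = (-2, 3), (-1, 4)
    R = third_point(P, Q, a, b)       # chord through P and Q
    S = third_point(P, P, a, b)       # tangent at P
    print(R, on_curve(R, a, b))
    print(S, on_curve(S, a, b))
```

Since the slope is a quotient of rational numbers, every point produced from rational starting points is again rational, which is exactly what makes the procedure generate new Q-points.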
8.3. About the theorems of Deligne and Faltings 13. The problem of solving the congruence F(x, y) ≡ 0 mod p is the “finite” analogue of the problem of finding the Z-solutions of the Diophantine equation F(x, y) = 0. The former can in turn be viewed as the solving of an equation, taking the components of the solutions in the number domain $\mathbb{Z}_p$. More generally, one can seek the solutions of the equation F(x, y) = 0 with components in an arbitrary finite (Galois) field $\mathbb{F}_q$ (we agree here and in what follows that p is a fixed prime and write $q = p^m$). This line of thought was well-known already to C. F. Gauss. As the sets $\mathbb{Z}_p$ and $\mathbb{F}_q$ under view are finite, there arises the question of the number of solutions of the equations. For example, the equation $y^2 + x^3 - 1 = 0$ has two $\mathbb{Z}_2$-solutions, three $\mathbb{Z}_3$-solutions and five $\mathbb{Z}_5$-solutions. In Table 3 below, the columns with a plus sign indicate the $\mathbb{Z}_3$-solutions of this equation. Table 3
y      0  0  0  1  1  1  2  2  2
x      0  1  2  0  1  2  0  1  2
F = 0  -  +  -  +  -  -  +  -  -
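The solution counts quoted above are small enough to be checked by exhaustive search. The following is a minimal sketch; the function name `count_solutions` is ours and does not come from the text. It lists all pairs $(x, y) \in \mathbf{Z}_p \times \mathbf{Z}_p$ satisfying $y^2 + x^3 - 1 \equiv 0 \bmod p$.

```python
def count_solutions(p):
    """Return all pairs (x, y) in Z_p x Z_p with y^2 + x^3 - 1 = 0 (mod p)."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y + x ** 3 - 1) % p == 0]


for p in (2, 3, 5):
    solutions = count_solutions(p)
    print(p, len(solutions), solutions)
```

For $p = 3$ the list consists of the pairs $(x, y) = (0, 1), (0, 2), (1, 0)$, which are the three columns marked with a plus sign in Table 3.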
We denote by $N_m$ the number of $\mathbf{F}_q$-solutions, where $q = p^m$, of the equation $F(x, y) = 0$. For simplicity we assume that the prime $p$, fixed throughout the discussion, is such that $n$ divides the number $p - 1$. We agree also that the indices $j$ and $k$ vary in the set $\{1, 2, \dots, n-1\}$, however in such a way that their sum is not $n$; the number of such pairs of indices $(j, k)$ equals $(n - 1)(n - 2)$. In the case of the Fermat equation $x^n + y^n = 1$ one has the identity
\[ N_m = p^m + 1 - \sum_{j,k} J(j/n, k/n)^m, \tag{114} \]
where the $J(j/n, k/n)$ are the so-called Jacobi sums. In order to clarify their significance, we choose a generator $\varepsilon$ of the multiplicative group $\mathbf{Z}_p^*$ (that is, a primitive root modulo $p$) and consider the maps
\[ \chi_{j/n} : \mathbf{Z}_p^* \to \mathbf{C}, \quad \text{with } \chi_{j/n}(\varepsilon^r) = e^{2\pi i r j/n}. \]
With the help of these so-called multiplicative characters we can now define
\[ J(j/n, k/n) = -\sum_{x \in \mathbf{Z}_p} \chi_{j/n}(x) \cdot \chi_{k/n}(1 - x); \]
here one has to put $\chi(0) = 0$ for each character, thus also for a trivial one. For example, if $p = 7$ we have $J(1/6, 2/6) = -2 - i\sqrt{3}$.
14. In 1924, Emil Artin introduced for the numbers $N_m$ a generating function of the form
\[ Z_p(t) = \exp\Big(\sum_{m=1}^{\infty} \frac{N_m}{m} t^m\Big); \tag{115} \]
$Z_p(t)$ contains information about the numbers $N_m$, $m = 1, 2, \dots$, of solutions of the equation $F(x, y) = 0$. The series (115) has two good properties. First, if $N_m$ happens to come in the form $\alpha^m$ (for example, the number of $\mathbf{F}_q$-solutions of the equation $y = f(x)$ is precisely $q$, so one can take $\alpha = p$), then
\[ Z_p(t) = \exp\Big(\sum_{m=1}^{\infty} \frac{N_m}{m} t^m\Big) = e^{-\ln(1 - \alpha t)} = \frac{1}{1 - \alpha t}. \tag{116} \]
Second, if $N_m = N_m' + N_m''$ (for example, if $F = G \cdot H$ and $G(x, y) = H(x, y) = 0$ is not possible for any pair of elements $(x, y) \in \mathbf{F}_q \times \mathbf{F}_q$), then
\[ Z_p(t) = \exp\Big(\sum_{m=1}^{\infty} \frac{N_m}{m} t^m\Big) = \exp\Big(\sum_{m=1}^{\infty} \frac{N_m'}{m} t^m\Big) \cdot \exp\Big(\sum_{m=1}^{\infty} \frac{N_m''}{m} t^m\Big). \tag{117} \]
If $N_m = \alpha_1^m + \dots + \alpha_r^m - \beta_1^m - \dots - \beta_s^m$, where the $\alpha_j$ and $\beta_j$ are allowed to depend on the equation, but not on the index $m$, it follows from these properties that
\[ Z_p(t) = \frac{(1 - \beta_1 t) \cdots (1 - \beta_s t)}{(1 - \alpha_1 t) \cdots (1 - \alpha_r t)}, \tag{118} \]
that is, $Z_p(t)$ is in this case a rational function. For example, in the case of the Fermat equation $x^n + y^n = 1$ one has $\alpha_1 = 1$, $\alpha_2 = p$ and in the role of the $\beta$’s one has the Jacobi sums $J(j/n, k/n)$. Therefore
\[ Z_p(t) = \frac{\prod_{j,k} \big(1 - J(j/n, k/n)\,t\big)}{(1 - t)(1 - pt)}, \tag{119} \]
where in the numerator one has a polynomial of degree $(n - 1)(n - 2)$.
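The Jacobi sums entering (114) and (119) can be evaluated numerically for small primes. The sketch below is only an illustration under stated assumptions: the function names are ours, the generator of $\mathbf{Z}_p^*$ is passed in by hand (3 is a primitive root modulo 7), and $N_1$ is read as the number of points on the projective curve, points at infinity included. With this reading the case $m = 1$ of (114) can be compared against a brute-force count.

```python
import cmath


def character(p, n, j, g):
    """Table of chi_{j/n} on Z_p: chi(g^r) = exp(2*pi*i*r*j/n) and chi(0) = 0."""
    assert (p - 1) % n == 0, "n must divide p - 1 for chi to be well defined"
    chi = {0: 0j}
    x = 1
    for r in range(p - 1):
        chi[x] = cmath.exp(2j * cmath.pi * r * j / n)
        x = x * g % p
    assert len(chi) == p, "g must generate the multiplicative group of Z_p"
    return chi


def jacobi_sum(p, n, j, k, g):
    """J(j/n, k/n) = - sum over x in Z_p of chi_{j/n}(x) * chi_{k/n}(1 - x)."""
    chi_j, chi_k = character(p, n, j, g), character(p, n, k, g)
    return -sum(chi_j[x] * chi_k[(1 - x) % p] for x in range(p))


def fermat_points(p, n):
    """Brute-force count of projective points of x^n + y^n = 1 over Z_p."""
    affine = sum((pow(x, n, p) + pow(y, n, p)) % p == 1
                 for x in range(p) for y in range(p))
    # Points at infinity correspond to solutions of t^n = -1 (assumption:
    # N_1 in (114) includes them).
    at_infinity = sum((pow(t, n, p) + 1) % p == 0 for t in range(p))
    return affine + at_infinity


def jacobi_total(p, n, g):
    """Sum of J(j/n, k/n) over 1 <= j, k <= n - 1 with j + k != n."""
    return sum(jacobi_sum(p, n, j, k, g)
               for j in range(1, n) for k in range(1, n) if j + k != n)


print(jacobi_sum(7, 6, 1, 2, 3))           # compare with -2 - i*sqrt(3)
print(fermat_points(7, 6), 7 + 1 - jacobi_total(7, 6, 3).real)
```

Choosing the other primitive root modulo 7, namely 5, replaces $J(1/6, 2/6)$ by its complex conjugate, so the quoted value depends on the choice of $\varepsilon$; the total over all pairs $(j, k)$ does not.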
It is amazing that the arithmetic question about the number of $\mathbf{F}_q$-solutions is tightly connected with the geometry of the associated curve. In 1931, F. K. Schmidt proved that for a curve of genus $g$ one has
\[ Z_p(t) = \frac{\prod_{j=1}^{2g} (1 - \alpha_j t)}{(1 - t)(1 - pt)}; \]
here the numerator contains a polynomial of degree $2g$ with integer coefficients. Taking logarithms of both sides of this equality and, further, using the relation $|\alpha_j| = \sqrt{p}$ (the so-called Riemann hypothesis in the case of a curve over a finite field), we obtain
\[ N_m = 1 + p^m - \sum_{j=1}^{2g} \alpha_j^m. \tag{120} \]
From here it is seen that the Riemann hypothesis is equivalent to the statement
\[ \forall m \quad |N_m - 1 - p^m| \le 2g\sqrt{p^m}. \]
For elliptic curves (the case $g = 1$) this statement was first proved by Helmut Hasse in 1933.
15. André Weil gave (in 1940-41) a sketch for the proof of the Riemann hypothesis for curves of arbitrary genus $g$, reached this goal (1949) and, likewise, generalized the question to the case of varieties of higher dimension. Weil’s conjecture, in a slightly simplified form, amounted to proving that for arbitrary $p$ there exist complex numbers $\alpha_{kj}$ such that
\[ \forall m \in \mathbf{N} \quad N_m = \sum_{j=0}^{2d} (-1)^j \sum_{k=1}^{B_j} \alpha_{kj}^m, \qquad |\alpha_{kj}| = p^{j/2}; \]
here $d$ is the dimension of the variety $X$ corresponding to the system of equations under view, the $B_j$ are the Betti numbers and $N_m$ is the number of $\mathbf{F}_q$-points of $X$. It would be more correct to speak of the Weil conjectures, as the exact original formulation consists of four different assertions (conjectures). One reason why the Weil conjectures are so interesting is that they directly connect the geometric properties of a curve (the variety $X(\mathbf{C})$) with its arithmetical properties. Among other things, it follows from them that the more complicated the geometry of the curve (the variety), the more of the numbers $N_m$ are needed for the determination of the remaining $N_m$. Of special interest is the case when a curve $E$ is given by the equation $y^2 = f(x)$. If the curve is elliptic ($g = 1$), then the numerator in $Z_p(t)$ is a second order polynomial and Weil’s conjecture gives that
\[ Z_p(t) = \frac{1 - a_p t + p t^2}{(1 - t)(1 - pt)}; \]
the numerator of this fraction we denote from now on by $e_p(t)$,
\[ e_p(t) \overset{\mathrm{def}}{=} 1 - a_p t + p t^2. \]
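For $g = 1$, formula (120) reads $N_m = 1 + p^m - (\alpha_1^m + \alpha_2^m)$, and comparing the Schmidt form of the numerator with $e_p(t) = (1 - \alpha_1 t)(1 - \alpha_2 t)$ gives $\alpha_1 + \alpha_2 = a_p$ and $\alpha_1 \alpha_2 = p$. Hence $a_p = 1 + p - N_1$ and $N_2 = 1 + p^2 - (a_p^2 - 2p)$. The following sketch checks this on one example. The curve $y^2 = x^3 + x + 1$ over $\mathbf{F}_7$, the function name, and the model of $\mathbf{F}_{49}$ as $\mathbf{F}_7[i]$ with $i^2 = -1$ are our own illustrative choices, and points are counted projectively (one point at infinity).

```python
from collections import Counter
from itertools import product
from math import sqrt


def count_points(p, a, b, m=1):
    """Projective points of y^2 = x^3 + a*x + b over F_{p^m}, for m = 1 or 2.

    For m = 2 the field is modelled as F_p[i] with i^2 = -1, which is a
    field only when p = 3 (mod 4).
    """
    if m == 1:
        affine = sum((y * y - (x ** 3 + a * x + b)) % p == 0
                     for x in range(p) for y in range(p))
        return affine + 1  # add the single point at infinity
    assert m == 2 and p % 4 == 3

    def mul(s, t):
        # (s0 + s1*i) * (t0 + t1*i) with i^2 = -1, reduced modulo p
        return ((s[0] * t[0] - s[1] * t[1]) % p,
                (s[0] * t[1] + s[1] * t[0]) % p)

    field = list(product(range(p), repeat=2))
    squares = Counter(mul(y, y) for y in field)  # number of y with a given y^2
    affine = 0
    for x in field:
        cube = mul(mul(x, x), x)
        ax = mul((a % p, 0), x)
        rhs = ((cube[0] + ax[0] + b) % p, (cube[1] + ax[1]) % p)
        affine += squares[rhs]
    return affine + 1


p, a, b = 7, 1, 1
N1 = count_points(p, a, b)
a_p = 1 + p - N1
N2_predicted = 1 + p ** 2 - (a_p ** 2 - 2 * p)
print(N1, a_p, abs(a_p) <= 2 * sqrt(p), N2_predicted, count_points(p, a, b, 2))
```

The brute-force count over $\mathbf{F}_{49}$ uses no information about $a_p$, so its agreement with the predicted value illustrates how $N_1$ alone fixes the later $N_m$ for an elliptic curve.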
It follows from (120) that $a_p = 1 + p - N_1$, where $a_p = \alpha_1 + \alpha_2$. This result shows that, for the curve under view, the function $Z_p(t)$, and thereby all numbers $N_m$, $m > 1$, are determined by $N_1$.
16. If we know the function $Z_p(t)$ for all $p$, then we know all the numbers $N_{m,p}$. All this information yields also the following function of one complex variable $s$ (the Hasse-Weil function of the curve $E$)
\[ Z(E, s) = \prod_{p\ \mathrm{prime}} \frac{1}{e_p(p^{-s})} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - a_p p^{-s} + p^{1-2s}}; \]
the product is convergent in the half-plane $\operatorname{Re} s > \frac{3}{2}$. One believes that (Taniyama-Weil conjecture) for each elliptic curve $E$ one can continue $Z(E, s)$ to a meromorphic function in the entire complex $s$-plane and that the function obtained in this way satisfies some supplementary conditions (the so-called Weil conditions; see [7, pp. 142-143]). In the case that this conjecture is true, one can speak of the “critical” values of $Z(E, s)$. It turns out that the behavior of $Z(E, s)$ at the point $s = 1$ depends on many arithmetical properties of the given elliptic curve over $\mathbf{Q}$. Thus one believes (part of the conjectures of B. Birch and H. Swinnerton-Dyer) that $Z(E, 1) = 0$ precisely when $E$ has infinitely many $\mathbf{Q}$-points. Finally, we remark that for the curve $X : y^2 = x^3 + x^2$ (which is not elliptic!) we have $e_p(t) = 1 - t$, so that
\[ Z(X, s) = \prod_{p\ \mathrm{prime}} \frac{1}{1 - p^{-s}}; \]
this is the ordinary Riemann zeta-function in Eulerian form. Let us further add that it was on the truth of the conjecture of Birch and Swinnerton-Dyer that J. Tunnell, in 1983, based the proof of his criterion for finding congruent numbers. Congruent numbers are those numbers which give the area of a right triangle with rational sides; finding them is likewise a Diophantine problem, known since the 10th century. For example, the number $6 = 3 \cdot 4/2$ is a congruent number. It is of interest to note that proving that the number 1 is non-congruent is equivalent to proving FLT in the case $n = 4$; see [7].
17. In proving his theorem Weil had to use various results in geometry of the Italian mathematicians, but in the general case one could not assert that these results had been proved in a convincing way. The attempts to give an adequate, rigorous foundation to Weil’s plans led, in the 1950s and 1960s, to the creation of new theories in algebraic geometry. The man who paved the path to this was Alexandre Grothendieck, whose thirst for action, in the 1960s, was almost inexhaustible. His style was to conquer a gorge by filling it. He tried to treat each notion in as general a way as possible; only those restrictions were taken into account whose necessity was forced by the mathematical situation. His work may be viewed as a far reaching generalization of the analytic geometry of Descartes, where the real numbers are replaced by the elements of an arbitrary commutative ring. With the aid of the so-called étale (“covering”) cohomology devised by Grothendieck it became possible to interpret the numbers $\alpha_{kj}$ in a way on which Deligne later based his
proof. Grothendieck’s achievements were recognized by the mathematical community when he was given the Fields Medal at the Moscow ICM in 1966. In the 1960s one began to have an inkling that there existed a connection between the Weil conjectures and a problem of Ramanujan. Let $\tau(n)$ be the coefficient in front of $x^n$ in the power series expansion of the function $x \prod_{m=1}^{\infty} (1 - x^m)^{24}$, $|x| < 1$; $\tau(n)$ is an integer, apparently always distinct from zero. So far this is not proved, but one has checked it for $n \le 10^{15}$. Ramanujan had considered it as very plausible that $|\tau(n)| \le n^{11/2} \cdot d(n)$, where $d(n)$ is the number of divisors of the natural number $n$. It follows from this that $|\tau(p)| \le 2p^{11/2}$. From 1916 on, this statement is known as the Ramanujan conjecture. Deligne had reasons to believe in the truth of this relation, because he proved in 1968 that Ramanujan’s conjecture follows from Weil’s. In 1970, R. Langlands drew attention to a possibility which opens up, for the solution of Ramanujan’s conjecture, from little known work of R. Rankin (1939), where the estimate $\tau(n) = O(n^{29/5})$ was given. While trying to understand Rankin’s discussion, Deligne managed (supported by J.-P. Serre) to “geometrize” Rankin’s method. He connected this method with the topological technique of Solomon Lefschetz for finding fixed points of a mapping, and unified this in an unexpected way with the proof of Weil’s conjecture. Let us add some information about Pierre Deligne. He was born in Brussels in 1944. At the age of fourteen he began to read the Elements of Bourbaki, which contain the essence of contemporary mathematics. Already this enterprise is astounding, as in these books the treatment goes from the general to the particular, and in them there is no other motivation besides the logical development of the theme. After having studied some time at the University of Brussels, he went to Paris at the suggestion of the group theorist Jacques Tits.
There he took part in the activities of the Grothendieck seminar, in particular attending with great interest the lectures of Jean-Pierre Serre, who had a number theoretic outlook. Already in 1966 Grothendieck considered him on a par with himself. The style of Deligne has been described as follows: he likes to cross the gorge, not by filling it, but by building a bridge. His papers are readable, the ideas are explained in an understandable way, what is told there is necessary and it is told at the right time. Pierre Deligne was given the Fields Medal for his proof of the Weil conjectures at the ICM in Helsinki in 1978.
18. However, the line of thought described above did not lead immediately further on the path of finding the $\mathbf{Q}$-solutions. Thus formula (114) gives the number of $\mathbf{F}_q$-solutions of Fermat’s equation, but it does not tell us anything directly about FLT. Even for the question about the existence of $\mathbf{Z}$-solutions there is no answer in the general case and – as was proved by Yu. Matiyasevich in 1970 – there does not exist an answer in terms of a general algorithm. Even more valuable is any general regularity discovered about the $\mathbf{Q}$-solutions of Diophantine equations; one example of this is the above mentioned Mordell conjecture. Let us now pause at this question, giving it a new, simpler formulation. Which Diophantine equations $F(x, y) = 0$ have infinitely many $\mathbf{Q}$-solutions? As follows from our above discussion, the answer is positive, for example, in the case of $x^2 + y^2 = 1$. Here we have to deal with a first possibility – all solutions are expressible in terms of a parameter (the genus of the corresponding curve is 0). A second possibility is when the solution of the equation $F(x, y) = 0$ can be obtained by relations $x = \Phi(u, v)$, $y = \Psi(u, v)$, where $\Phi$ and $\Psi$ are both quotients of two polynomials with rational coefficients.
Here the quantities $u$ and $v$ are required to satisfy the relation $u^2 = v^3 + av + b$ (with $a, b \in \mathbf{Z}$), an equation which has infinitely many solutions. In this case the corresponding curve must be of genus 1. Mordell’s conjecture can now be formulated as follows:
CONJECTURE 8.1. Let $F(x, y)$ be a polynomial in two variables with integer coefficients. If the equation $F = 0$ cannot be mapped by a change of variables $(x, y) \to (u, v)$ to an equation such that the curve determined by it has genus 0 or 1, then this equation has only finitely many $\mathbf{Q}$-solutions.
For example, the equation $x^n + y^n = 1$, $n \ge 4$, cannot be transformed into an equation whose genus is 0 or 1. Therefore, Fermat’s equation should according to Mordell’s conjecture have only finitely many $\mathbf{Q}$-solutions. We add that according to FLT this equation ought to have precisely 3 solutions. Already in the 1920s, C. L. Siegel and A. Weil tried to prove Mordell’s conjecture. Weil generalized the result of Poincaré-Mordell (that the group of rational points on an elliptic curve is finitely generated) to varieties of higher dimension, in the hope of being able to show, by invoking for a curve of genus $g > 1$ its so-called Jacobian variety, that only finitely many rational points of the Jacobian lie on the curve itself. Attempts were made to amend this scheme of reasoning (for example, C. Chabauty in 1938). A generalized and improved form of the Weil scheme was found by Serge Lang in 1962 (see [8]). A first essential step forward for the proof of Mordell’s conjecture was the proof of the same conjecture in the case of function fields (Yu. Manin in 1963). Although A. Weil did not reach his goal, he had set the right direction, and in the course of the next 60 years much new mathematics was created (Tate; Shafarevich; Manin; Parshin; Arakelov; Zarkhin; Deligne etc.). The further development was in an essential way influenced by the Shafarevich conjecture (1962).
Namely, Shafarevich (see [17]) managed to formulate in number theoretic terms the problem of Kodaira for the classification of a given analytically varying (critical) family of Riemann surfaces of genus $g > 1$. The catalyst here was the analogy between number fields and fields of rational functions, observed and studied already in the 19th century by Kronecker and Hilbert. This analogy has made it possible to transfer the correct formulation of the problem from one branch of mathematics to another, but it has not led to any solutions. A. Parshin (1968) and Yu. Zarkhin (1974) found a new approach to Manin’s result. Of special importance here is that Parshin proved that Mordell’s conjecture is a consequence of the Shafarevich conjecture. Gerd Faltings first established the Shafarevich conjecture in a weaker form and then in 1983 derived from it Tate’s conjecture (see [10]). Thereafter, using the Chebotarev density theorem and the Weil-Deligne theorem he reached his final goal – he found in its broad outline how to prove Mordell’s conjecture (as well as other conjectures mentioned here). In the opinion of several mathematicians (P. Deligne; L. Szpiro; F. Oort etc.) there were, in the original variant, several notions hard to understand and many observations extremely difficult to penetrate (more exactly, possible to put in order only with great effort). But still, after less than a year it became apparent to the specialists that Mordell’s conjecture (and with it also the conjectures of Shafarevich, Tate etc.) now was proved! The way in which Faltings, in his proof, combined (and, if necessary, extended) the existing methods surprised the specialists by its unexpected and extremely clear manner of overcoming all difficulties. In the beginning, Faltings had doubted if he possessed the will, and the gift
to deal with such an abstract and complicated thing as Tate’s conjecture. But a great thirst for truth, and an interest in the many mathematical disciplines connected with this theme, made it possible for him to understand and learn more, so that he did not stop and fail to test any of the key observations (as many mathematicians interested in this had done previously). From these small victories there grew finally a big one – the proof of Mordell’s conjecture. Much was written about this sensational result; one even expressed the opinion that this was the “theorem of the century” (Math. Intelligencer 5, No. 4 (1983)). Whether this is true will of course be decided by the mathematicians of the following generations. In any case, we have here to deal with a triumph of mathematics (see the interview of Jean-Pierre Serre, Math. Intelligencer 8 (1983)). At the time of the solution of the problem Gerd Faltings was 28 years of age and was, for the second year, teaching mathematics at the University of Wuppertal ([West-]Germany). He had obtained his Ph. D. from Professor Nastold in Münster, under whom Faltings had studied, and who impressed him as a person. For the results described Faltings received the Fields Medal at the ICM in Berkeley in 1986.
19. The words of Academician V. Platonov (Minsk), “with our intellect we are dealing with Mordell’s conjecture, but at heart we are attached to FLT”, seem to express the sentiments of the majority of mathematicians when acquainted with the result of Faltings. Problems and their solution have been the soul of Mathematics – the solution of veritable problems has always led to a new, deeper understanding of many notions, often giving birth to new theories, and in this connection to the formulation of many new problems. We have already spoken above of new things which arose immediately from the theorem of Faltings in the case of FLT. In the years following the proof of Faltings (1983) one made many efforts to find methods for an effective estimation of the number of solutions of Diophantine equations with a finite set of solutions. However, it became clear rather quickly that, moving along the path of Faltings’s proof, it seems to be practically impossible to determine the equations of the geometric objects appearing in the proof (which are Abelian varieties). Still one hopes to obtain such effective estimates (Parshin, 1984; Raynaud and others). In 1984, a new approach to Fermat’s equation was found by the young German mathematician G. Frey. To each (assumed) non-simple solution one associates a certain elliptic curve – a so-called Frey curve, obtained as follows. Let $p \ge 5$ be a prime and $(A, B, C)$ a triple of integers such that $A^p + B^p = C^p$ and $\mathrm{GCD}(A, B, C) = 1$. Setting $a = A^p$, $b = B^p$, $c = (-C)^p$, we observe that $a + b + c = 0$ and that $\mathrm{GCD}(a, b, c) = 1$. For the simple Fermat triple $(A, B, C)$ the corresponding Frey curve (over $\mathbf{Q}$) is the elliptic curve $E_{a,b,c}$ given by the equation $y^2 = x(x - a)(x + b)$. It turns out that Frey curves have special properties.
Assuming the validity of the Taniyama-Weil conjecture and using these special properties together with the results of Serre and Ribet (1986) about the homomorphisms $\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to GL(2, \bar{\mathbf{F}}_p)$, where $\bar{\mathbf{Q}}$ and $\bar{\mathbf{F}}_p$ are the algebraic closures of $\mathbf{Q}$ and $\mathbf{F}_p$ respectively and $\mathrm{Gal}(\dots)$ is the Galois group of the corresponding extension (so-called modular representations of weight 2), Serre and Frey reached the conclusion that Frey curves do not exist! Taking into account how Frey curves were obtained, it appears from this (under the validity of the Taniyama-Weil conjecture) that Fermat’s equation does not have simple solutions.
The non-existence of Frey curves would also follow from the arithmetical analogue of an inequality (the so-called Bogomolov-Miyaoka-Yau inequality) valid for Chern classes of algebraic surfaces (over $\mathbf{C}$), that is, the corresponding inequality for a number field, provided that one succeeds in proving the latter. For the first time, one spoke about this during Parshin’s lecture in Paris in October, 1986. The following year the Japanese mathematician Yoichi Miyaoka, a student of K. Kodaira, heard about these results, and already in the early spring of 1988 there spread a sensational rumor that Miyaoka had succeeded in proving the arithmetical analogue of this inequality (and so FLT) . . . But when one got time to analyze the complete text of Miyaoka’s proof, his mistake became apparent. Thus Faltings found an essential error in Miyaoka’s argument, and so the proof lost its credibility. E. Bombieri arrived at the same conclusion, admitting, however, that the paper of the Japanese contained interesting ideas. More detail about this reduction (and some others connected with FLT) can be found in the survey [12].
20. At least, one can say that the story of the Mordell-Faltings theorem (and the things connected with FLT) has corroborated the rather firm conviction of many mathematicians that FLT is a true touchstone for the generality and depth of our mathematical methods, and at the same time for the extent to which these methods make it possible to transcend (in both directions!) the barrier between the discrete and the continuous. At least it should be clear to everybody today how illusory it is to hope that in especially favorable conditions one would find a solution to FLT by elementary means (see the observations made in Subsection 18). According to A. Parshin such a thing would require important, new knowledge about arithmetical surfaces. At the same time, J.-P.
Serre adds to this line of thought that it would be strange if it were possible to prove FLT geometrically only. In view of this it is hard to say how far one has come on the route offered by G. Frey on one’s way to a proof of FLT. Therefore, one could believe that FLT is like the continuum hypothesis, which can neither be proved nor disproved. This is not quite so. Consider the sequences $(n, A, B, C)$ where $A^n + B^n = C^n$, $n \ge 3$, and $A$, $B$ and $C$ are natural numbers, and call them Fermat quadruples. The statement “FLT is true” means that Fermat quadruples do not exist. From the statement “FLT is not true” it follows that Fermat quadruples do exist. If it were possible to find a Fermat quadruple and prove it convincingly, then FLT would be refuted. This argument shows that in case there is no proof that FLT can be refuted, then FLT is true. One might believe that the geometric point of view will bring the analytic and arithmetical arguments forward on the way toward the proof of FLT. The theorem will probably be proved one day with the help of all the methods described here used together, as it happened with the proof of the conjectures of Weil and Mordell. The proof cannot be simple; already C. F. Gauss said: Hopefully the proof of FLT will one day be found as a side product of some deep result in arithmetic. And still, the problem has been attacked by an innumerable army of “fermatists”, but even 20 years ago mathematicians had not fully adopted this idea of Gauss. Even more, there were numerous mathematicians, among them also those who knew the subject very well, considering the algebro-geometric method created in the course of the attempts to prove
FLT as water sprouts of Diophantine Analysis, generalizations and analogies detached from the real needs of Number Theory. Maybe this is illustrated most significantly by Mordell’s own reaction on the occasion of the appearance of the book [8]. He said that he felt like Rip van Winkle⁹⁹, adding that if one can understand the proofs of the generalizations, even in the simplest special cases, only with great difficulty, it would be better to leave these generalizations where they are. To this Serge Lang counters strikingly: A mathematician working in Algebraic Geometry who fell asleep in 1961 and awoke in 1981 will probably feel himself like Rip van Winkle; this is the natural effect of the rapid and fundamental changes that have occurred in mathematics. Because of this, in the case of A. Weil, Grothendieck, Serre, Shafarevich and others, who all contributed to the solution of Mordell’s problem, one has to appreciate their contribution, but even more admire their personal fortitude and insight in the application of the methods of that time.
21. In our pragmatic age one can of course consider all that we have described as a fruitless enterprise, only for the reason that it concerns so-called pure mathematics. Here one could quote a letter (July 2, 1830) of Carl Gustav Jacobi to Adrien Marie Legendre: . . . I have read with great pleasure the opinion of Mr. Poisson about my work, and I could have been quite pleased, but Poisson should perhaps have omitted the rather tactless phrase of Mr. Fourier, where the latter reproaches Abel and me that we do not prefer to work more on the question of heat conductivity. One knows of course the opinion of Mr. Fourier that the main objects of mathematics are the applications to the clarification of natural phenomena and the yield from this. But such a deep thinker ought also to have known that the ultimate goal is to glorify the human spirit.
And seen from this point of view Natural Numbers are of no lesser importance than the Structure of the Universe. At a time when there are ever fewer doubts about the usefulness of computers, it will perhaps not make sense to complete this argument in support of the aesthetic origin of Mathematics. But in a changing world one ought to add that, in Jacobi’s words, there is expressed fully the opposition, peculiar to each generation, about the distinctions in Mathematics. These distinctions express themselves in the choice whether to prefer problems which have arisen from the closest needs of practice, or to think on problems dictated by the inner logic of things and which will yield a benefit only in a remote future. This dilemma may appear also, in one form or the other, in the activities of one and the same mathematician. It is significant here, for instance, that such a well-known mathematician as John von Neumann has expressed entirely conflicting opinions on the question under view. But it is also precisely here that the dilemma finds its solution: Gauss, Riemann, Hilbert, Hermann Weyl and other leading figures of Mathematics have often found in their theoretical work major inspiration in an applied background. In the course of a longer period of time (30-100 years, sometimes even longer) this distinction may disappear or express itself in a different form. Selected results in the solutions of applied problems are generalized, in this time, to a theory, and the so-called pure mathematics finds its way into the applications. Even the new theories treated in the present paper have found their way into the applications, namely into contemporary physics. We point out here only the discussion of D. Ruelle [15], in particular his observation that a theorem found and proved by two physicists, Lee and Yang (Phys. Rev. 87 (1952), 410-419), and its applications (for example, in the theory of phase transition) probably is connected to the Weil conjecture. Finally, we add that, in both of these types of motivation, and their interplay, it is always the concrete problems which bring mathematics forward, and also direct its development in an essential way. In this sense the words of the Polish born U.S. mathematician Mark Kac are remarkable: Even axiomatic systems change in the waves of time, but their applications live for ever.
⁹⁹ Translator’s note. Character in a book by the classic American writer and humorist Washington Irving (1783-1859). He is the man who slept for 20 years and when he wakes up finds himself in a world that has transformed: the American Colonies have become independent.
Epilogue. This survey was probably written in 1988. Since then Fermat’s Last Theorem has been proved by Andrew Wiles (assisted by Richard Taylor) [19]. The proof depends on a special case of the Taniyama-Shimura Conjecture [2], saying that every elliptic curve is modular. This conjecture in general was later proved by C. Breuil, B. Conrad, F. Diamond, and R. Taylor [1]. A very readable (non-technical) description of Wiles’ work is the book [18].
Gert Almkvist
Other references: [16], [3], [6], [9], [11], [14], [21], [22]
References
[1] C. Breuil, B. Conrad, F. Diamond, and R. Taylor. On the modularity of elliptic curves over Q: wild 3-adic exercises. J. Am. Math. Soc. 14, 2001, 843–939.
[2] H. Darmon. A proof of the full Shimura-Taniyama conjecture. Notices Am. Math. Soc. 46, 1998, 1397–1401.
[3] P. Deligne. Preuve des conjectures de Tate et Shafarevich (d’après G. Faltings). Séminaire Bourbaki, Exposé 616, Novembre 1983. Asterisque 1983/84 (121–122), 1985, 25–41.
[4] Diophant of Alexandria. The Arithmetics and the book on polygonal numbers. Nauka, Moscow, 1974.
[5] D. R. Heath-Brown. Fermat’s last theorem for “almost all” exponents. Bull. London Math. Soc. 17 (1), 1985, 15–16.
[6] N. Katz. An overview of Deligne’s proof of the Riemann hypothesis for varieties over finite fields. In: Proc. Pure Appl Math 38, Part 1: Mathematical developments arising from Hilbert problems. Amer. Math. Soc., Providence, R.I., 1976, 275–306.
[7] A. O. Koblitz. Introduction to elliptic curves and modular forms. Graduate Text of Math., 97. Springer-Verlag, New York, 1984.
[8] S. Lang. Diophantine geometry. Interscience Tracts in Pure and Applied Mathematics, 11. Interscience Publ., New York, London, 1962. Russian translation: Mir, Moscow, 1986.
[9] S. Lang. Higher dimensional Diophantine problems. Bull. Am. Math. Soc. 80, 1974, 779–787.
[10] B. Mazur. Higher dimensional Diophantine problems. Bull. Am. Math. Soc. 14, 1986, 207–259.
[11] B. Mazur. On some of the mathematical contributions of Gerd Faltings. In: Proceedings of the Int. Congress of Mathematicians, August 3-11, 1986. Amer. Math. Soc., 1987, 7–11.
[12] J. Oesterlé. Preuve des conjectures de Tate et Shafarevich (d’après G. Faltings). Séminaire Bourbaki, Exposé 694. Asterisque 1987/88 (161–162), 1989, 165–186.
[13] M. M. Postnikov. Introduction to algebraic number theory. Nauka, Moscow, 1982.
[14] S. Raghavan. Impact of Ramanujan’s work on modern mathematics. J. Indian Inst. Sci. Srinivasa Ramanujan centenary 1987, Special Issue, 1987, 45–53.
[15] D. Ruelle. Is our mathematics natural? The case of the equilibrium of statistical mechanics. Bull. Am. Math. Soc. 19 (1), 1988, 259–268.
[16] F. Schinzel. Construction of telephone networks by group representations. Notices Am. Math. Soc. 26 (1), 1989, 5–22.
[17] I. R. Shafarevich. Algebraic number fields. In: Proceedings of the Int. Congress of Mathematicians, August 15-22, 1962. Institute Mittag-Leffler. Almqvist & Wiksells, Uppsala, 1963, 163–176.
[18] S. Singh. Fermat’s enigma. Walker and Co., New York, 1997.
[19] R. Taylor and A. Wiles. Ring theoretic properties of certain Hecke algebras. Ann. of Math. 141, 1995, 553–572.
[20] S. S. Jr. Wagstaff. The irregular primes to 125 000. Math. Comp. 32 (142), 1978, 583–591.
[21] A. Weil. Number of solutions of equations in finite fields. Bull. Am. Math. Soc. 55, 1949, 497–508.
[22] Yu. Zarkhin and A. Parshin. Problems of finiteness in Diophantine Geometry. Supplement to the Russian edition of [14], 1986, 369–438.
9. [K96] On two discrete models in connection with structures of mathematics and language
Translation by J. Peetre
In a mathematical theory there is no a priori need to bring its conceptions and language into agreement with the newest needs of the natural sciences. Nevertheless this has happened often and the good harvest of the cooperation has given a profit to both parties. During the last decades there has been a steadily growing interest in discrete models of an ever increasing complexity. As it has not been possible to present such a model adequately with the aid of standard functional rules, this interest has increased in proportion to the possibilities of computers for theoretical experiments with them. In the following we shall describe the possibilities of two of the simplest such models.
9.1. Binary trees and Strahler numbers
One example is the study of branching phenomena in neurophysiology, botany, geology – in the last discipline in particular in connection with hydrogeological research by R. E. Horton [2] and A. Strahler concerning the structure of river systems [13]. A common denominator for these phenomena is provided by the notion of tree, which expressed in mathematical language means a cycle-free connected simple graph. In computer science one employs the notion of a binary tree, which can be determined recursively:
• if such a tree has only one vertex, then this tree is identified with its vertex;
• in all other cases a binary tree is defined as a triple $B = (v; B_L, B_R)$, where $v$ is a distinguished vertex of $B$ (designated the root) and $B_L$ (as well as $B_R$) are binary trees, called the left (respectively, the right) subtree of the tree $B$.
The vertices of a binary tree are classified as inner vertices (such a vertex has two “successors”, its left and its right successor) and as exterior vertices (these are the vertices without successors). The edges of a tree are pairs of vertices $(v, w)$, where $w$ is the successor of $v$. If we assume that in a river system no islands have been formed and that at each juncture not more than two rivers are united, then the branching picture which arises is a binary tree. To the edges of a tree one can assign an order using the Horton-Strahler rule:
• the order of a river proceeding from a source is 0;
• two order-$k$ rivers join to a river of order $k + 1$, but two rivers of order $i$ and $k$ ($i < k$) give when joined an order-$k$ river (Fig. 23).
[Fig. 23: The orders of the binary tree]
CHAPTER VI. POPULARIZATION OF MATHEMATICS
The maximal order of the edges of the tree under consideration is called its Strahler number and is denoted $st(B)$. This parameter of a tree can be defined inductively as follows:
• we agree that $st(\emptyset) = 0$;
• if $st(B_L) = st(B_R)$, then we agree that $st(v; B_L, B_R) = 1 + st(B_L)$;
• if, however, $st(B_L) \neq st(B_R)$, we agree that $st(B) = \max(st(B_L), st(B_R))$.
A maximal path among the paths of a tree consisting of edges of order k is called a k-th order segment of the river system; such a segment starts in a source (in case k = 0) or else arises by joining two edges of order k − 1 (in case k ≥ 1), and ends by joining with a segment of order k′ (k′ > k). Denoting the total number of segments of order k by $b_k$, we define the bifurcation ratio of the tree B as the quotients $b_k/b_{k+1}$; here k ≤ st(B). For example, for a binary tree with all exterior vertices (leaves) at the same distance k from its root the bifurcation ratio is 2 and the Strahler number of such a tree is k (Fig. 24).
Fig. 24: A binary tree B: $b_k/b_{k+1} = 2$ for any k, and st(B) = 3
In accordance with hydrogeological observation the bifurcation ratio does not change within a given river system, and stays between 3 and 5, giving a good qualitative picture of the shape of the river system. Branching trees are of interest also in botany. A result of these investigations for computer science is the discovery of the so-called Lindenmayer grammars and their use in computer graphics, where using these complementary methods one tries to assemble a synthetic picture of a tree [16]. The inputs of such a program are the number k and a stochastic matrix with (at least) k rows, the so-called ramification matrix, and it yields a binary tree with Strahler number k having the given matrix as ramification matrix.
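The inductive definition of the Strahler number given above can be sketched in a few lines of Python. This is only an illustration: the encoding of trees as nested pairs and the names `strahler`, `complete_tree` and `caterpillar` are assumptions introduced here, and a one-vertex tree (a source) is taken to have Strahler number 0, in line with the order 0 of a river proceeding from a source and with Fig. 24.

```python
# A binary tree is either None (a single exterior vertex, a leaf) or a
# pair (left, right) of binary trees hanging from an inner vertex.
# This encoding is an assumption made for the illustration.

def strahler(tree):
    """Strahler number; a one-vertex tree is taken to have number 0."""
    if tree is None:
        return 0
    left, right = tree
    sl, sr = strahler(left), strahler(right)
    # Equal orders raise the order by one, otherwise the larger one wins.
    return sl + 1 if sl == sr else max(sl, sr)


def complete_tree(height):
    """Binary tree with all leaves at distance `height` from the root."""
    if height == 0:
        return None
    sub = complete_tree(height - 1)
    return (sub, sub)


def caterpillar(n):
    """Tree with n inner vertices in which every right subtree is a leaf."""
    tree = None
    for _ in range(n):
        tree = (tree, None)
    return tree


if __name__ == "__main__":
    print(strahler(complete_tree(3)))  # the situation of Fig. 24
    print(strahler(caterpillar(10)))   # a long but thin tree
```

The second example suggests why the Strahler number measures the "bushiness" of a tree rather than its size: a tree may have many vertices and still a small Strahler number.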
Strahler numbers appear in a natural way likewise in other questions of computer science. One of these is the question of the least number of registers needed for the evaluation of a given arithmetic expression. Let us identify the arithmetic expression (consisting of binary operations) with a tree whose vertices are labelled by symbols for these operations and the variables used in the expression. For example, in Fig. 25 we have drawn a labelled binary tree corresponding to the expression (a/b + c/d)(e + f) + g.
Fig. 25: The syntax tree of the expression (a/b + c/d)(e + f) + g
In the general case it turns out (theorem of A. Ershov!) that the minimal number of registers required in the evaluation of an arithmetic expression exceeds by one the Strahler number of the corresponding binary tree. The number of registers required
9. Structures of mathematics and language
for the evaluation of long arithmetic expressions is described by a formula for finding $\lim \overline{st}(n)$, where $\overline{st}(n)$ stands for the average
$$\overline{st}(n) = \frac{1}{c_n} \sum_{B_n} st(B_n)$$
over all binary trees $B_n$ with n vertices. Such a formula was found by X. Viennot (1986) (see [16]). As a detail, let us record that the total number of the latter is $c_n = \binom{2n}{n}/(n+1)$ and that the generating function $c(t) = \sum_{n \ge 0} c_n t^n$ of these Catalan numbers $c_n$ satisfies the relation $1 - c(t) + t\,c(t)^2 = 0$.
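The average above can be computed exactly for small n. The following sketch makes two assumptions that are not stated in the text: trees are counted by their number n of inner vertices (with this reading their total number is the Catalan number $c_n$), and a single leaf has Strahler number 0. The names `strahler_counts` and `average_strahler` are introduced here only for the illustration.

```python
from fractions import Fraction
from math import comb


def strahler_counts(n_max):
    """counts[n][k] = number of binary trees with n inner vertices and
    Strahler number k (a single leaf, n = 0, has Strahler number 0)."""
    counts = [{0: 1}]
    for n in range(1, n_max + 1):
        row = {}
        # The root splits the remaining n - 1 inner vertices between
        # its left subtree (i of them) and its right subtree.
        for i in range(n):
            for kl, cl in counts[i].items():
                for kr, cr in counts[n - 1 - i].items():
                    k = kl + 1 if kl == kr else max(kl, kr)
                    row[k] = row.get(k, 0) + cl * cr
        counts.append(row)
    return counts


def average_strahler(row):
    """Exact average Strahler number for one row of the table."""
    total = sum(row.values())
    return Fraction(sum(k * c for k, c in row.items()), total)


if __name__ == "__main__":
    for n, row in enumerate(strahler_counts(10)):
        catalan = comb(2 * n, n) // (n + 1)
        assert sum(row.values()) == catalan  # the rows add up to c_n
        print(n, catalan, float(average_strahler(row)))
```

The table grows quadratically in n only, so the averages for a few hundred inner vertices can be obtained this way without enumerating the trees themselves.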
9.2. Molecular biology and formal languages
The results of molecular biology have sometimes been formulated in terms of formal languages and information theory. On the one hand, the formal languages. Fixing an alphabet X, let us consider subsets L of the set $X^*$ of all words; such subsets are called formal languages. A language L can be presented as a function $L : X^* \to \{0, 1\}$. Therefore, L can also be interpreted as a formal series $\sum_{w \in L} w$. Here we are interested in context-free languages (CF-languages)¹⁰⁰; such a language can be given by a context-free grammar, that is, a quadruple $G = (X, N; \sigma, P)$, where X (the terminals) and N (the non-terminals) are finite alphabets, $\sigma \in N$ is the so-called initial symbol and the finite set P contains the rules of deduction (productions) $\alpha \to \beta$, that is, pairs $(\alpha, \beta)$, where $\alpha \in N$ and $\beta \in (N \cup X)^*$ (for details see [11]). As an example, we have the Dyck language D, $D \subseteq X^*$, where $X = \{x, \bar{x}\}$ and the rules of deduction are $\sigma \to x\sigma\bar{x}\sigma$ and $\sigma \to 1$; here the symbol 1 denotes the empty word in $X^*$. In a word of the language D there are always the same number of letters x and $\bar{x}$. Moreover, in each left factor (prefix) there are not fewer letters x than letters $\bar{x}$. Another example is the Fibonacci language F, $F \subseteq X^*$, for which $X = \{x, a\}$ and $N = \{\sigma, \tau\}$, while the productions are $\sigma \to a\tau$, $\sigma \to xx\sigma$, $\tau \to a\sigma$, $\tau \to xx\tau$, $\sigma \to 1$. The formal series F representing this language is the solution of the system of equations
$$F = 1 + aG + xxF, \qquad G = aF + xxG$$
in the algebra $\mathbb{Z}\langle\langle X \rangle\rangle$ of formal series with integer coefficients. One owes to M. P. Schützenberger the idea to seek, in the enumeration of combinatorial objects in their graded set $K = \bigcup K_n$, such an algebraic language L whose words of length n are in one-to-one correspondence with the objects of order n, that is, the elements of the set $K_n$. In this situation the desired result is given by the generating function $l(t) = \sum_{n \ge 0} l_n t^n$ of the numbers $l_n = |L(G) \cap X^n|$.
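For small n the numbers $l_n$ can simply be counted. The sketch below does this for the Fibonacci language by following the productions letter by letter; the function names and the encoding of the non-terminals as the strings "sigma" and "tau" are assumptions made for the illustration.

```python
from itertools import product


def in_fibonacci_language(word):
    """Derive `word` from sigma with the productions
    sigma -> a tau | xx sigma | 1,   tau -> a sigma | xx tau."""
    state = "sigma"
    i = 0
    while i < len(word):
        if word[i] == "a":
            # The letter a switches between the two non-terminals.
            state = "tau" if state == "sigma" else "sigma"
            i += 1
        elif word[i:i + 2] == "xx":
            # Both non-terminals allow the block xx and keep the state.
            i += 2
        else:
            return False  # an isolated letter x cannot be produced
    return state == "sigma"  # only sigma can be erased (sigma -> 1)


def count_words(n):
    """l_n: the number of words of length n in the Fibonacci language."""
    return sum(in_fibonacci_language("".join(w))
               for w in product("ax", repeat=n))


if __name__ == "__main__":
    print([count_words(n) for n in range(9)])
```

Since every production reads the word strictly from left to right, each word has at most one derivation, so counting words and counting derivations give the same numbers here.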
In order to find the number of words of a given length in the language L(G), let us consider the morphism $\Psi : X^* \to \{t\}^*$ which maps all characters of the alphabet X into one and the same (new) variable t. In this situation the formal series corresponding to L(G) is mapped to the generating function of the numbers $l_n$: $\Psi(L) = l(t)$. In the case of the Dyck language we obtain in this way a series d(t) satisfying the equation $1 - d(t) + t^2 \cdot d(t)^2 = 0$.
¹⁰⁰ The author uses the term algebraic language instead.
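The equation for d(t) can be checked against a direct count. In the sketch below the letter "X" stands for $\bar{x}$, membership in D is tested by the two properties of Dyck words mentioned earlier (which are known to characterize the language), and the coefficients of d(t) are obtained from the equation rewritten as $d = 1 + t^2 d^2$. All names are assumptions of the illustration.

```python
from itertools import product


def is_dyck(word):
    """The letter x opens, the letter X (standing for x-bar) closes."""
    height = 0
    for letter in word:
        height += 1 if letter == "x" else -1
        if height < 0:  # a prefix with more closing than opening letters
            return False
    return height == 0  # equally many letters of both kinds


def dyck_count(n):
    """Number of Dyck words of length n, by brute force."""
    return sum(is_dyck(w) for w in product("xX", repeat=n))


def series_d(n_max):
    """Coefficients d_0, ..., d_{n_max} of d(t) from d = 1 + t^2 d^2."""
    d = [0] * (n_max + 1)
    d[0] = 1
    for n in range(2, n_max + 1):
        # d_n is the coefficient of t^(n-2) in d(t)^2.
        d[n] = sum(d[i] * d[n - 2 - i] for i in range(n - 1))
    return d


if __name__ == "__main__":
    print([dyck_count(n) for n in range(11)])
    print(series_d(10))
```

Both lines printed by the usage example should coincide, which is exactly the statement $\Psi(D) = d(t)$ for the first few coefficients.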
This equation is solved by the function $d(t) = (1 - \sqrt{1 - 4t^2})/2t^2$, such that in its power series the coefficient of $t^{2n}$ is the Catalan number $c_n = \binom{2n}{n}/(n + 1)$. In our second example, we obtain the power series $\Psi(F) = f(t)$ and $\Psi(G) = g(t)$ satisfying the system of equations
$$f(t) = 1 + t \cdot g(t) + t^2 \cdot f(t), \qquad g(t) = t \cdot f(t) + t^2 \cdot g(t).$$
As the solution to this system we obtain the function $f(t) = (1 - t^2)/(1 - 3t^2 + t^4)$. In its Taylor series the coefficient of $t^{2n}$ is the Fibonacci number $F_{2n}$; here $F_0 = F_1 = 1$ and $F_n = F_{n-1} + F_{n-2}$ (n ≥ 2). An auxiliary fact: the Fibonacci language is rational, which (according to Kleene’s theorem!) means that this language is recognizable by a finite automaton; such an automaton is depicted in Fig. 26.
Fig. 26: A finite automaton for the Fibonacci language
On the other hand, the genetic code. Interesting macromolecules are the nucleic acids (NA) and proteins. One of the forms of NA, deoxyribonucleic acid (DNA), is contained in chromosomes and carries hereditary information. It appears as a double helix twisted up in space and consisting of a dual pair of threads joined with each other through hydrogen bonds. If one separates the two DNA strands and then adds to each of them another DNA chain complementary to it, one gets as a result two identical copies of the original DNA molecule. The coiled form of the double helix optimizes the spatial distribution of the molecule, because in untwisted form the DNA thread would be 50 centimeters long. The proteins are the workhorses of the cell, assuring the stability of its structure, its defence, energy content and life activity. The protein molecule consists of amino acids, of which there are 20 species. The latter may be viewed as the semantic primitives of the genetic language, of which finite (long!) words formed by concatenation are called polypeptide chains.
The primary structure of nucleic acids may be viewed as a chain of nucleotides (bases), a thread. The alphabet G, with the aid of which the DNA thread is written as words, consists of four bases: A (adenine), G (guanine), C (cytosine) and T (thymine). In the case of the ribonucleic acid molecule (RNA) the alphabet R contains the first three of the bases mentioned, while T (thymine) is replaced by the base U (uracil). These bases possess several properties which make it possible to count them as phonemes of the genetic language. But they also have peculiarities:
• the number of phonemes of a natural language is variable (> 10), while the number of nucleotides is 4 in all organisms;
• the phoneme of a natural language is given by a complex of (binary) predicates whose order in the words is not important, whereas, for example, T (thymine) and C (cytosine), although they consist of the same elements, appear as different graphs.
In the alphabet of nucleotides there arise $4^3 = 64$ strings of length three, the codons, which in turn form the so-called nucleotide chains: (very long!) strings in the alphabet of codons. In both cases the bases are joined into a unique chain with the aid of sugar components. It is possible to view the genetic code as an exact correspondence between the codons and amino acids of a special type; see the table in [6]. In the decoding of the polynucleotide chain each codon is replaced by the corresponding amino acid. In fact, amino acids are specified by 61 codons; the remaining 3 codons (UAA, UAG, UGA) are terminators, the role of which is to indicate the end of the phase of decoding. Codons could be compared to morphemes in natural language: each of them is a sequence of genetic phonemes which, within the limits of the given syntax, does not dissolve into shorter subsequences. A difference is that all codons have the same length (3), which is not observed in the case of natural languages. Likewise, the meaning of the morphemes of natural languages changes from language to language, while genetic morphemes and their meaning remain invariant for all organisms. In the framework of this interpretation one can consider terminators as grammatical morphemes, while the remaining 61 codons play the role of lexical morphemes. As a detail: there also exist contexts where grammatical morphemes may appear as lexical ones. Z. Pawlak made an attempt to present the genetic language with the aid of a grammar based on geometrical intuition [8], the inconveniences of which were removed in a modification of this grammar into a formal grammar by B. Vauquois a few years later¹⁰¹. S. Marcus extended the grammar obtained to a Lindenmayer system in order to present also the “spatial” aspect of the genetic language (the double helicity!) [6].
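The codon arithmetic mentioned above is easy to reproduce. The following sketch enumerates the codons over the RNA alphabet and cuts a chain into codons up to the first terminator; the sample chain and the name `split_into_codons` are hypothetical, and no codon-to-amino-acid table is assumed here (for that see the table in [6]).

```python
from itertools import product

RNA_ALPHABET = "ACGU"
TERMINATORS = {"UAA", "UAG", "UGA"}

# All words of length three over the four-letter alphabet.
codons = ["".join(c) for c in product(RNA_ALPHABET, repeat=3)]
# The codons that are not terminators ("lexical morphemes").
coding_codons = [c for c in codons if c not in TERMINATORS]


def split_into_codons(chain):
    """Cut a chain into successive codons, stopping at the first
    terminator; an incomplete trailing codon is ignored."""
    result = []
    for i in range(0, len(chain) - 2, 3):
        codon = chain[i:i + 3]
        if codon in TERMINATORS:
            break
        result.append(codon)
    return result


if __name__ == "__main__":
    print(len(codons), len(coding_codons))
    print(split_into_codons("AUGGCUUAAGGG"))  # hypothetical sample chain
```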
¹⁰¹ See details in [6].
Let us now consider again the question of how Strahler numbers appear. The fact that the threads of a double helix of DNA are not knotted makes it possible to view the double helix as a planar graph (which is also called the secondary structure of the molecule): the vertices are the bases and the edges are both the base joints in the DNA thread (primary bonds) and their hydrogen joints formed in the helix (secondary bonds). Each secondary structure induces a certain forest, a cycle-free graph whose vertices are the primary bonds and whose edges are determined by the incidence relationship of these bonds. Such forests were used by Vauchaussade de Chaumont and Viennot [17] with the purpose of studying the homologies of the secondary structures, that is, the molecule’s properties in distinct species. As a result there was an answer to M. Waterman’s question [18]: what is the generating function of all k-th order secondary structures? Here by the order of a secondary structure is meant the order of the forest induced by it.
Fig. 27: A rooted tree $T = (v; T_1, T_2, T_3)$
Let us introduce the necessary notions. The rooted tree T is defined recursively:
• if T has only one vertex, then T is identified with this vertex;
• in the opposite situation one gives the tree as a sequence $T = (v; T_1, \dots, T_p)$, where v is a vertex of T (the root) and each $T_i$ is a subtree of T whose root is a successor of v, see Fig. 27.
A forest is a list of rooted trees, the connected components of the graph. A maximal sequence of vertices $(v_1, \dots, v_s)$ such that each $v_i$ (i = 1, 2, ..., s − 1) has the unique successor $v_{i+1}$ and $v_s$ is a leaf (that is, a vertex without a successor) is called a filament of the forest. The operator δ of removing filaments is defined on the forest M by the rule that δ(M) is the forest which is obtained from M by omitting all vertices of the filaments and all the edges incident to them; the filament containing the root is removed last. The smallest number i such that the vertex x is removed by application of the operator $\delta^i$ is called the degree of this vertex. The maximal degree of the vertices of a forest is called the degree of the forest. In the example given in Fig. 28 we have the degree 3. It is clear that the degree of a forest is the least integer k such that $\delta^k(M) = \emptyset$.
Fig. 28: A forest of the degree 3
An answer to Waterman’s question above is obtained in the following way. The secondary structures of degree k are coded with the words of a suitable algebraic language and then one finds a system of equations which is satisfied by the generating formal series of the words of this language. Subsequently, the desired answer is found using the procedure described above (in connection with the map Ψ). Indeed, if the number of unlabelled k-th degree secondary structures with n vertices is denoted $s_{n,k}$, then the generating function under discussion is given by the formula
$$\sum_{n \ge 0} s_{n,k} t^n = \frac{t^{p(k)}}{(1 - t) P_1 P_2 \cdots P_k},$$
where $p(k) = 5 \cdot 2^{k-1} - 2$ and the polynomials $P_i$ are defined recursively by the rules $P_1 = 1 - 2t - t^3$ and $P_i = P_{i-1}^2 - 2t^{p(k)}$ (in case i ≥ 2). The problem connected with this question regarding the enumeration of rooted forests of degree k and n vertices is simpler and, surprisingly, its answer is the same generating function which enumerates the binary trees with Strahler number k [15].
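The operator δ and the degree of a forest, as defined above, can be sketched as follows. The encoding of a rooted tree as the list of its subtrees is an assumption of the illustration. The function `degree_by_recursion` is an additional observation that is not taken from the text: it computes the degree of a tree by a rule of Strahler type (two subtrees of equal maximal degree raise the degree by one) and is included only to compare its values with those of the direct definition.

```python
# A rooted tree is encoded as the list of its subtrees, so a leaf is []
# and a forest is a list of trees (an encoding assumed for this sketch).

def is_filament(tree):
    """True if, from the root down, every vertex has a unique successor
    until a leaf is reached, i.e. the whole tree is one filament."""
    while len(tree) == 1:
        tree = tree[0]
    return len(tree) == 0


def delta(forest):
    """Remove the vertices of all filaments from the forest."""
    # A vertex lies on a filament exactly when the subtree below it is a
    # chain, so such subtrees are dropped and the others are pruned.
    return [delta(tree) for tree in forest if not is_filament(tree)]


def degree(forest):
    """Least k such that applying delta k times gives the empty forest."""
    k = 0
    while forest:
        forest = delta(forest)
        k += 1
    return k


def degree_by_recursion(tree):
    """Strahler-type rule, stated here as an observation only."""
    degs = sorted((degree_by_recursion(t) for t in tree), reverse=True)
    if not degs:
        return 1
    if len(degs) > 1 and degs[0] == degs[1]:
        return degs[0] + 1
    return degs[0]


if __name__ == "__main__":
    leaf = []
    cherry = [leaf, leaf]
    example = [cherry, [cherry], leaf]  # a hypothetical rooted tree
    print(degree([example]), degree_by_recursion(example))
```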
9.3. On coding theory
Contemporary technology has raised to a new level questions about the mechanisms of information processing and their effectiveness. The solutions have required a mathematical formulation, many essential concepts of which originate in coding theory. As many similar questions are of interest also in the study of the genetic code and language, we shall in what follows likewise give a brief survey of these concepts. There are many possibilities for sending information. In some cases (for instance, in satellite communication) information is transferred through transmission channels; in other cases the sender writes it, for instance, on a floppy disk, from which the computer later reads it. The exact mechanism of the transfer is far from always known; it suffices to think of the questions of transfer of information in the human brain. However, many communication channels have a common characteristic: the transfer of information is accompanied by background noise, with the effect that some of the transferred symbols get modified in the process of communication and arrive at the receiver in distorted form. In order to improve the quality of the reception one applies error detecting and error correcting codes. In mathematical formulation, a channel is given by a triple (S, V; P), consisting of an input language S, an output language V and a matrix P = (p(y|x)). The elements of the latter are conditional probabilities: p(y|x) is the probability of receiving the symbol y on the condition that x was sent, and this probability is regarded as independent of the fate of the previous and later signals in the channel under view. Here information is interpreted as a sequence of (long!) finite sequences (called words, also strings), for the writing of which the symbols of the given alphabet are used. In the theory the most suitable alphabet is some finite field $F_q$ (here q is a power of a prime p). The Reader may picture the field $F_q$ as a domain of numbers where the arithmetical operations are carried out according to the most common rules, to which new ones have been added that basically introduce periodicity phenomena, emanating from the finiteness of the domain $F_q$. The coding may be viewed as a procedure (an algorithm or a mapping) which maps a natural message, or a part of it, into words written in the channel’s input language S, adding so-called code symbols (‘redundancy blocks’).
Expressed more exactly, for the coding of a message broken up into blocks of k letters, the coding Ψ may be presented as an injective map $\Psi : F_q^k \to F_q^n$; words in the image set $C = \Psi(F_q^k) \subseteq F_q^n$ are called code words. If Ψ is a linear map, then the set of code words forms a k-dimensional subspace of the sequence space $F_q^n$; therefore the code is termed a linear (n, k)-code. All code words of a linear code can be presented in the form $\bar{x}G$, where $\bar{x} \in F_q^k$ and G is a fixed k × n matrix, the rows of which form a basis of the subspace C; it is called the generating matrix of C. There is also another important matrix connected with the code, the parity check matrix, which is an (n − k) × n matrix H such that $\bar{x} \in C$ if and only if $\bar{x}H^t = 0$; here t stands for taking the transpose of the matrix. If we introduce a form
$$\langle \bar{x}, \bar{y} \rangle = \sum_{i=1}^{n} x_i y_i$$
in $F_q^n$, then we can ‘compute’ the orthogonality of vectors, that is, interpret this geometrical notion in analytic language: $\bar{x} \perp \bar{y} \Leftrightarrow \langle \bar{x}, \bar{y} \rangle = 0$. Therefore it makes sense to speak of the code $C^\perp$ dual to C, $C^\perp = \{x \mid x \in F_q^n \text{ such that } x \perp c \text{ for all } c \in C\}$. The reception of the coded information is followed by decoding, a procedure which maps the sequence received in the channel’s output language V into a natural message. Often this is achieved in such a way that one finds the code word closest to the received word (maximum likelihood decoding). The choice of the correct code word is facilitated by the Hamming distance of two words (vectors) $x = x_1 x_2 \dots x_n$ and $y = y_1 y_2 \dots y_n$: $d_H(x, y) = \#\{i \mid x_i \neq y_i\}$. For example $d_H((0111), (1001)) = 3$ and $d_H((01100), (11000)) = 2$.
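The Hamming distance and decoding to the closest code word can be sketched as follows. Words are written as strings of symbols; the function names and the two-word repetition code used in the usage example are assumptions of the illustration, not codes discussed in the text.

```python
def hamming_distance(x, y):
    """Number of positions in which the words x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


def nearest_codeword(received, code):
    """Decode by choosing a code word at minimal Hamming distance
    (if several are equally close, the first one found is returned)."""
    return min(code, key=lambda c: hamming_distance(received, c))


if __name__ == "__main__":
    print(hamming_distance("0111", "1001"))    # first example of the text
    print(hamming_distance("01100", "11000"))  # second example of the text
    repetition_code = ["000", "111"]           # hypothetical toy code
    print(nearest_codeword("010", repetition_code))
```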
If the minimal (Hamming) distance between the words in C is d, then such a code can correct ≤ [(d − 1)/2] errors arisen in the channel, and detect even ≤ d − 1 errors. This is easy to understand if we surround all code words $x \in C \subseteq F_q^n$ by the discrete balls $B_e(x) = \{z \mid z \in F_q^n, d_H(x, z) \le e\}$. Here e is the radius of the ball and e ≤ [(d − 1)/2]. As a detail, we add the fact that each ball contains
$$1 + \binom{n}{1}(q - 1) + \binom{n}{2}(q - 1)^2 + \dots + \binom{n}{e}(q - 1)^e$$
words in the (vector) space $F_q^n$; here $\binom{n}{i}$ denotes the binomial coefficient. In view of the choice of the radius of the balls $\{B_e(x) \mid x \in C\}$, the inequality 2e + 1 ≤ d is evidently satisfied. Consequently, these balls do not intersect (Fig. 29), and if a received word falls into one of the balls $B_e(x)$ then this word can be uniquely (!) decoded by the code word x, x ∈ C, which constitutes the center of the ball in question.
Fig. 29: The Hamming distance
The most known example of a linear code is the Hamming code. Let us fix an integer r and consider the vector space $F_q^r$ of all vectors as an affine space, that is, a point space where the vectors in $F_q^r$ appear in two roles: as points and as displacement vectors. Although such a “point space” $A^r(q)$ consists of only $q^r$ distinct points, it has its own geometry, which may be described as a crypto-morphological analogue of the knowledge offered in university courses in linear algebra and geometry on the real affine space, with the field $F_q$ in the role of the real numbers. Taking one of the points $O \in A^r(q)$, let us consider the lines through this point: an arbitrary point X on such a line is given by the equation $X = O + \bar{v}t$, where the parameter t runs through all values in the field $F_q$, and the non-zero vector $\bar{v} \in F_q^r$ is the direction vector of the line. Thus, here every line as a set of points $\{X(t) \mid t \in F_q\}$ consists of q points!
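The count of the words in a ball, and the consequence that disjoint balls around the code words must fit into $F_q^n$, can be checked numerically. In the sketch below the symbols of the alphabet are represented by the integers 0, ..., q − 1 (only their being equal or different matters for the distance), and all function names are assumptions of the illustration.

```python
from itertools import product
from math import comb


def ball_size(n, e, q=2):
    """Number of words within Hamming distance e of a fixed word."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(e + 1))


def ball_size_brute_force(n, e, q=2):
    """The same number, counted directly around the zero word."""
    return sum(1 for z in product(range(q), repeat=n)
               if sum(zi != 0 for zi in z) <= e)


def correctable_errors(d):
    """The radius [(d - 1)/2] for minimal distance d."""
    return (d - 1) // 2


def fits_into_space(n, code_size, d, q=2):
    """Disjoint balls of radius [(d - 1)/2] around all code words
    cannot contain more words than the whole space has."""
    return code_size * ball_size(n, correctable_errors(d), q) <= q ** n


if __name__ == "__main__":
    print(ball_size(5, 2, q=3), ball_size_brute_force(5, 2, q=3))
    print(fits_into_space(7, 16, 3))
```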
There are $q^r - 1$ choices for the direction vector ($\bar{v} \neq 0$), so that the number of lines through the point O equals $n = (q^r - 1)/(q - 1)$; let us denote these (non-collinear) directions by $v_1, v_2, \dots, v_n$. Let us further form the matrix H, whose columns are the vectors $v_i \in F_q^r$. Then we may consider the code $C = \{x \mid x \in F_q^n, xH^t = 0\}$. This linear (n, n − r)-code is called the Hamming code. The minimal distance between its code words is 3 and this code is thus perfect in the sense that
$$F_q^n = \bigsqcup_{x \in C} B_1(x).$$
In other words, for an arbitrary received word with not more than one distorted letter it is possible to decide with which code word it has to be decoded.
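For q = 2 the construction can be carried out by brute force: every non-zero vector of $F_2^r$ is then its own direction, so $n = 2^r - 1$. The following sketch (restricted to the binary case, with function names chosen here for the illustration) builds H, lists the code words and checks the minimal distance and the perfectness for r = 3.

```python
from itertools import product


def hamming_parity_check(r):
    """H for q = 2: its columns are all non-zero vectors of length r.
    The matrix is returned as a list of r rows of length n = 2^r - 1."""
    columns = [v for v in product((0, 1), repeat=r) if any(v)]
    return [[col[i] for col in columns] for i in range(r)]


def hamming_code(r):
    """All binary words x with x H^t = 0, found by exhaustive search."""
    H = hamming_parity_check(r)
    n = len(H[0])
    return [x for x in product((0, 1), repeat=n)
            if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0
                   for row in H)]


def distance(x, y):
    return sum(a != b for a, b in zip(x, y))


def minimal_distance(code):
    return min(distance(x, y) for x in code for y in code if x != y)


def is_perfect(code, n):
    """Every word lies in exactly one ball of radius 1 around a code word."""
    return all(sum(distance(w, c) <= 1 for c in code) == 1
               for w in product((0, 1), repeat=n))


if __name__ == "__main__":
    code = hamming_code(3)
    print(len(code), minimal_distance(code), is_perfect(code, 7))
```

The exhaustive search is only feasible for small r; it is meant as a check of the statements above, not as a practical encoder.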
As another example, let us consider the radar codes. One of the best known medieval mathematicians was Leonardo (from Pisa, with the nickname Fibonacci¹⁰²). His most important work concerned the completion and systematization of arithmetic, which he had learnt from the Arabs. Through his treatise “Liber Abaci” (1202) his results became known in Europe. Fibonacci numbers are widely known; this is the sequence 1, 1, 2, 3, 5, 8, 13, 21, ..., whose members ($F_n$ | n = 0, 1, ...) may be found from the relation $F_n = F_{n-1} + F_{n-2}$ (n ≥ 2), assuming that $F_0 = F_1 = 1$. More generally, let us consider sequences $y = (y_n \mid n = 0, 1, \dots)$ satisfying a homogeneous recurrence equation, that is, a relation of type
$$a_0 y_i + a_1 y_{i-1} + \dots + a_k y_{i-k} = 0, \qquad i = k, k + 1, \dots,$$
where we agree that $a_0 \neq 0$ and further, in the interest of the context, that the coefficients $a_i$ and the members of the sequence are taken in the Galois field $F_q$. Fixing the initial values $y_0 = c_0, \dots, y_{k-1} = c_{k-1}$, this equation gives us as solution a sequence $(c_n \mid n = 0, 1, 2, \dots)$, the components of which can be found from the formula
$$c_i = -a_0^{-1} \cdot \sum_{j=1}^{k} a_j c_{i-j}, \qquad i = k, k + 1, \dots$$
A useful detail: if we interpret the solution y as a (formal) series $y = c_0 + c_1 x + c_2 x^2 + \dots$, then this series arises as the quotient of two polynomials c(x)/a(x), where the degree of c(x) is less than k and $a(x) = a_0 + a_1 x + \dots + a_k x^k$ is the (left) characteristic polynomial of the equation under consideration. A radar code codes a k-sequence $(c_0, c_1, \dots, c_{k-1})$ written in the input alphabet $F_q$ as an infinite recurrent sequence $c = (c_n \mid n = 0, 1, \dots)$, which is determined as the solution of the above recurrence equation under the initial condition $(c_0, c_1, \dots, c_{k-1})$. If, in addition, $a_k \neq 0$ in this equation, then the radar code determined by it generates only periodic sequences $(c_n)$, that is, there exist integers p and t such that $c_i = c_{i+p}$ for all i ≥ t. For instance, taking q = 2, k = 4 and the equation $y_i + y_{i-3} + y_{i-4} = 0$ we obtain sequences with period $p = 2^4 - 1 = 15$. The error correcting properties of this radar code depend on the fact that $2^4 = 16$ distinct initial sequences (these are words of length 4 in the alphabet $\mathbb{Z}_2$) generate 16 distinct code words of length 15 and that their set is closed under addition as well as (obviously) under multiplication by the scalars 0 and 1. Hence, C turns out to be a 4-dimensional subspace of the 15-dimensional space $\mathbb{Z}_2^{15}$, in which the minimal distance between any two code words is $2^{4-1} = 8$. Therefore the radar code described can recognize $2^{4-2} = 4$ errors, and correct $2^{4-2} - 1 = 3$ errors. The set C may be realized as a simplex, so this code is also known as a simplex code. The code $C^\perp$ dual to it is the widely known binary (15, 11)-Hamming code. An example of the effectiveness of radar codes is the fourth test of A. Einstein’s theory of gravitation. For a long time one has known three experimental facts validating this theory (1915): the precession of the perihelion of the orbit of Mercury; the bending of light rays near the Sun; and the gravitational red shift.
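The radar code of the example (q = 2, k = 4, equation $y_i + y_{i-3} + y_{i-4} = 0$) is small enough to be generated completely. In the sketch below the recurrence is solved for $y_i$, which over $F_2$ gives $y_i = y_{i-3} + y_{i-4}$; the function names are assumptions of the illustration.

```python
from itertools import product


def radar_sequence(initial, length):
    """Extend (c0, c1, c2, c3) by y_i = y_{i-3} + y_{i-4} over F_2."""
    c = list(initial)
    for i in range(len(c), length):
        c.append((c[i - 3] + c[i - 4]) % 2)
    return tuple(c)


def radar_code():
    """The 16 code words of length 15, one for each initial sequence."""
    return [radar_sequence(init, 15) for init in product((0, 1), repeat=4)]


def weight(word):
    """Hamming distance of the word from the zero word."""
    return sum(word)


if __name__ == "__main__":
    code = radar_code()
    print(len(set(code)))                      # number of distinct code words
    print({weight(c) for c in code if any(c)})  # weights of non-zero words
    long_word = radar_sequence((0, 0, 0, 1), 45)
    print(long_word[:15] == long_word[15:30] == long_word[30:45])
```

Since the code is closed under addition, the distance between two code words equals the weight of their sum, so the set of non-zero weights already determines the minimal distance.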
The fourth effect (the slowing down of electromagnetic radiation in the gravitational field) was checked only half a century later. To this end one measured the arrival of the echo of a radar signal from Mercury both when Mercury was obscured by the Sun (in this case the energy of the echo is $10^{-27}$ of the emitted energy!) and when it was not obscured. With the aid of a suitable radar code one succeeded in fixing the time difference in the arrival of the echo.
¹⁰² Translator’s note. The commonly used name-form Fibonacci came into use only in the course of the 19th century, presumably through the influence of the Italian mathematician and mathematical historian Guglielmo Libri Carucci della Sommaja (1803-1869). Leonardo himself wrote (in Latin) Leonardo filio Bonnaci.
Interest in mathematical coding theory spread in particular after Shannon’s result [12] regarding the possibilities of good transfer of information in “noise”-adding binary symmetric channels (BSC). For such a channel (S, V; P) one has $S = V = \mathbb{Z}_2 = \{0, 1\}$, while the elements of the matrix P, the conditional probabilities p(i|j), are given by the rules: p(1|0) = p(0|1) = p (probability of error), and p(0|0) = p(1|1) = 1 − p for some p ∈ [0, 1]. The rate of transmission for this channel is determined as the ratio between the number of bits appearing in the original message and the total number of bits input into the channel, where the last number also includes the bits added in the encoding of the message. According to Shannon’s theorem, transmission of information with noise is possible in a symmetric channel with a given positive rate of transmission, which at the same time guarantees the correct reception of an initial message with probability close to one. Supplementary information about codes can be found in [14].
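To get a feeling for the quantities involved, one may compute the rate of a code and the probability that a binary symmetric channel distorts no more bits of a block than the code is able to correct. The sketch below does this under the assumption that the decoder corrects every pattern of at most e errors, so that the computed probability is a lower bound for correct reception of a block; the function names and the sample error probability 0.05 are hypothetical.

```python
from math import comb


def rate(k, n):
    """Rate of transmission: k message bits per n bits sent."""
    return k / n


def prob_at_most(e, n, p):
    """Probability that a binary symmetric channel with error
    probability p distorts at most e of n transmitted bits."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(e + 1))


if __name__ == "__main__":
    p = 0.05  # hypothetical error probability of the channel
    # Unprotected block of 4 bits against the (15, 4) radar code,
    # which corrects up to 3 errors.
    print(rate(4, 4), prob_at_most(0, 4, p))
    print(rate(4, 15), prob_at_most(3, 15, p))
```

The comparison shows the trade-off described above: the coded transmission has a lower rate but a higher probability of correct reception of each block.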
9.4. Conclusion
One may remark that, despite the ancient origin of the problem of information transfer, some of the questions connected with it are still of interest and that often they are only beginning to become accessible to research. This problem has provided the motivation for, and is the real touchstone in, the development of biology as well as of several combinatorial theories within mathematics. In connection with the genetic language one could note two questions which, presumably, offer a continued interest. First, in which way can the genetic code be viewed as an error detecting and error correcting code? Second, how to explain the continuity properties of the genetic language, that is, in which cases (always?) and why does closeness between some codons generate corresponding polypeptide chains which are similarly close to each other? The determinations of nearness of codons with the aid of the Hamming distance (modified; weighted; etc.) have so far not given any result. It is the author’s conjecture that this can be realized via a Grothendieck topology. Related problems are also connected with the model of the Dutch mathematician De Bruijn [1] regarding transfer of information in the (human) brain, as well as with the related Grothendieck continuity within the realm of automata.
References
The references [2], [8], [17], and [18] are supplied by the Editors. The works [3], [4], [5], [7], [9], [10] below are actually not cited in the paper. They are kept here because of their appearance in the original publication.
[1] N. G. de Bruijn. A model for information processing in human memory and consciousness. Preprint (2.11.1993). Dept. of Math. and Comp. Sci. of Techn. Univ. Eindhoven, 1993.
[2] R. E. Horton. Erosional development of streams and their drainage basins, hydrophysical approach to quantitative morphology. In: Bull. Geol. Soc. of America, Vol. 56, 1945, 275–370.
[3] A. Jaffe and F. Quinn. Theoretical mathematics: towards a cultural synthesis of Mathematics and Theoretical Physics. Bull. Am. Math. Soc. 29(I), 1993, 1–12.
[4] U. Kaljulaid. An invited review of the book “Discrete Mathematics and Algebra Structures”, by S. Gerstein. In: Acta Appl. Math., Vol. 22. Freman & Co., N. Y., 1987, 325–329.
[5] J. Kiho. Algoritmid ja nende struktuurid, Tartu, 1994. (In Estonian).
[6] S. Marcus. Linguistic structures and generative devices in molecular genetics. Cahiers linguistique théorique et appliquée 11(2), 1974, 77–104.
[7] D. Mumford. Picard groups and moduli problems. In: Arithmetical Algebra Geometry, N. Y., 1965, 33–81.
[8] Z. Pawlak. Gramatyka i matematyka. Państwowe Zakłady Wydawnictw Szkolnych, Warszawa, 1965. (In Polish).
[9] H.-O. Peitgen, H. Jürgens, and D. Saupe. Chaos and Fractals. Springer-Verlag, 1992.
[10] P. Prusinkiewicz, A. Lindenmayer, and J. Hanan. Developmental models of herbaceous plants for computer imagery purposes. In: ACM SIGGRAPH Computer Graphics, Vol. 22, 1988, 141–150.
[11] A. Salomaa. Formal Languages and Power Series. In: “Formal Models and Semantics”, Handbook of Theor. Comp. Sci., Vol. B. Elsevier Science Publ. B.V., 1990, 103–132.
[12] C. E. Shannon. A Mathematical theory of communication. The Bell System Technical Journal 27, 1948, 379–423, 623–656.
[13] A. N. Strahler. Hypsometric (area-altitude) analysis of erosional topography. In: Bull. Geol. Soc. of America, Vol. 63, 1952, 1117–1142.
[14] H. C. A. van Tilborg. Error-correcting codes - a first course. Chartwell Bratt, Studentlitteratur, Lund, 1993.
[15] M. Vauchaussade de Chaumont. Nombre de Strahler des arbres, langages algébriques et dénombrement des structures secondaires en biologie moléculaire. Thèse. Univ. de Bordeaux 1, 1985.
[16] X. Viennot, G. Eyrolles, N. Janey, and D. Arqués. Combinatorial analysis of ramified patterns and computer imagery of trees. In: ACM SIGGRAPH Computer Graphics, Vol. 23, 1989, 31–40.
[17] M. Vauchaussade de Chaumont and X. G. Viennot. Enumeration of RNA’s secondary structures by complexity. In: Mathematics in Medicine and Biology, Lecture Notes in Biomath., Vol. 57. Springer, Berlin-New York, 1985, 360–365.
[18] M. S. Waterman. Secondary structure of single stranded nucleic acids. Adv. Math. Suppl. Stud. I, 1978, 167–212.
Index of Names Abel, Niels Henrik, 3, 4, 8, 24, 25, 27, 40, 51, 69, 70, 72, 73, 75, 78, 79, 81, 82, 86–90, 92, 97, 106, 107, 118–121, 123, 124, 128, 129, 134, 259, 272, 345, 348, 351, 368–370, 377, 378, 381, 382, 384, 387, 389, 392, 395, 399, 405–407, 411, 435, 436, 442, 444 Alameddine, Ahmad Fawzi, 247 Aleksandrov, Pavel Sergeevich, 203 Aleksandrov (Alexandroff), Aleksandr Danilovic, 227 Alexander the Great, 428 Alexander I, ix Almkvist, Gert, ix, 243, 286, 427, 445 Ameling, Friedrich, 311 Amitsur, Shimshon Avraham, 282 Anderson, Ian, 208 Ando, Tsuyoshi, 221 Andrunakievich, Vladimir Aleksandrovich, 122 Apollonius of Perga, 428 Arakelov, Suren Yu, 441 Arazy, Jonathan, 233 Archimedes of Syracuse, 428 Argand, Jean Robert, 364 Artin, Emil, 74–77, 83, 84, 106, 107, 437 Bézout E., 358, 391 Bachet, Claude, 429 Backlund, Helge Gotrik, 292, 294 Backlund, Hjalmar, 294 Backlund, Johan Oskar, 291–294 Backlund, Ulrika Catharina, 292, 294 Backlund-Celsing, Elsa Carolina, 292 Bahturin, Yuri A., xxiv Banachewski, Bernhard, 94, 108 Barbilian, Dan (Barbu, Ion), 117 Bashmakova, Isabella Grigoyevna, 311, 327 Beckenbach, Erwin F., 226 Beilinson, Alexander, 353 Bell, Eric Temple, 239 Bellman, Richard, 226 Belousov, V. D., 348 Belski˘ı, A., 353 Beltrami, Eugenio, 259
Bergman, George, 16, 17, 21, 43, 63, 105, 123, 283 Berkovich, Vladimir, 353 Bertrand, Joseph Louis François, 362 Betti, Enrico, 371, 436, 438 Bidder, Georg Friedrich Karl Heinrich, 321 Birch, Bryan John, 349, 439 Birkhoff, Garrett, 15, 21, 26, 44, 45, 103–105, 155, 207, 221 Björner, Anders, 203 Blauert, Marianne, ix Bogomolov, Fedor Alekseevich, 443 Bokut, Leonid Arkadievich, ix, 269 Boltzmann, Ludwig, 416 Bombelli, Rafael, 429 Bombieri, Enrico, 443 Booth, Laura, 299 Booth, Lorentz, 299 Borevich, Zenon Ivanovich, xii Bourbaki, Nicolas, 352, 355, 371, 399, 440 Bovdi, Adalbert, 22, 88 Brandt, Kerstin, ix Brauer, Richard Dagobert, 386 Brenil, C., 445 Brouwer, Luitzen Egbertus Jan, 353 Brualdi, Richard A., 213, 221 Bruck, Richard Hubert, 348 Bruhat, François Georgwe René, 221 Brunner, Georg Bernhard, 320 Buckley, Joseph T., 71, 85 Bulman-Fleming, Sydney, 137 Burnside, William, 254, 257, 258, 261, 377, 385, 386 Cantor, Georg Ferdinand Ludwig Philipp, 332, 353, 410, 416 Capelli, Alfred, 259 Cardano, Geronimo, 357, 358, 363, 364 Carrol, Lewis , 356, 359, 362 Cartan, Élie Joseph, 257, 267, 268 Cartan, Henri, 5, 355, 389 Castelnuovo, Guido, 406 Catalan, Eugène Charles , 449 Catharine I (Martha Skovronska), 292
Cauchy, Augustin Louis, 226, 227, 261, 276, 361, 362, 368 Cayley, Arthur, 254, 257, 259, 261, 267–269, 424 Chabauty, Claude, 441 Chasles, Michel, 408 Chebotarev, Nikolai Grigorievich, 441 Cherednik, Ivan, 353 Chern, Shiing-Shen, 227, 443 Chernikov, Sergei Nikolaevich, 77 Chevalley, Claude, 288 Clebsch, Rudolf Friedrich Alfred, 259 Clemens, Charles Herbert, 406 Cobos, Fernando, xvi Cobos, Luz, xvi Cohen, I. S., 3, 9 Cohn, Paul Moritz, 21, 68, 328, 390, 391 Connell, Ian, 71 Conrad, B., 445 Coxeter, Harold Scott MacDonald, 266 Crelle, August Leopold, 370 Cremona, Antonio Luigi Gaudenzio Giuseppe, 259, 269 Cruse, Allan B., 212, 221 Culik II, Karel, 179, 180 Curie, Pierre, 408 Currie, James D., 250 Cwikel, Michael, ix, 214, 233 d’Alembert, Jean Le Rond, 364 Dade, Everett C., 273, 274 Danilov, Volodymyr Yakovych, 353 Darboux, Jean Gaston, 267 Dassel, Egbert, 304 de Bruijn, Nicolaas Govert, 456 de Saint-Exupéry, Antoine Marie Roger, 373 de Moivre, Abraham, 362, 381 Dedekind, Julius Wilhelm Richard, 254, 267, 431 Dehn, Max Wilhelm, 269 Deligne, Pierre, 427, 436, 439–441 Demidov, E. E., 353 Descartes, René, 357, 358, 407, 439 Deskins, Wilbur Eugene, 120, 121 Diamond, F., 445 Dicks, Warren, 286 Dieudonné, Jean Alexandre Eugène, 258, 352 Dilworth, Robert Palmer, 243, 249 Dimberg, Sven, ix Diophantus of Alexandria, 427–429, 432–436, 439, 440, 442, 444, 467 Dirac, Paul Adrien Maurice, 409, 411 Dirichlet, Johann Peter Gustav Lejeune, 221, 429 Dolgaev, Sergey Ivanovich, 9 Dolotin, Valeri V., xvi Drensky, Vesselin Stoyanov, 283 Drinfel’d, Vladimir Gershonovich, 353 Duffus, Dwight, 250
Dynkin, Eugene Borisovich, 266 Eagon, John, 273, 274 Eastwood, David, 221 Egorychev, Georgiy Petrovich, 209, 221, 222, 225, 227, 230 Ehrenfest, Paul, 415 Eicheldinger, Martina, ix Eilenberg, Samuel, 5, 21, 43, 416, 425 Einstein, Albert, 411, 455 Eisenstein, Ferdinand Gotthold Max, 259, 363, 390, 402 El Hushi, 353 Encke, Johann Franz, 292, 293 Eneroth, Bertil, xvi Engel, Friedrich, 266, 269 Engliš, Miroslav, ix Eratosthenes of Cyrene, 428 Erik XIV, xvi Ershov, Andrei Petrovich, 448 Euclid of Alexandria, 266, 362, 390, 407, 408, 428 Euler, Leonhard, 249, 250, 269, 292, 351, 358, 363, 416, 429, 430, 439 Faisal Ibn Abdul Aziz Al Saud, 353 Falikman, Dmitry I., 209, 225, 228, 230 Faltings, Gerd, 427, 432, 436, 441–443 Fan, Kenneth, 221 Fano, Gino, 406 Farkas, David K., 221 Feit, Walter, 373, 385–387 Feller, Edmund H., 117 Fermat, Pierre, 427, 429, 436, 437, 440–443, 445 Ferro, Scipione, 357 Fibonacci, Leonardo, xxiii, 245, 247, 249, 357, 450, 455 Filep, László, ix, 208 Fiore, Antonio Maria, 357 Formanek, Edward, 203, 271, 283–286, 288 Forsyte, 122 Fossum, Robert M., 286 Fourier, Jean Baptiste Joseph, 444 Fowler, Kenneth Arthur, 386 Fox, Ralph, 61 Frey, Gerhard, 442, 443 Frobenius, Georg Ferdinand, 208, 222, 225, 229, 231, 253, 254, 257, 261, 267–271, 276, 353, 385 Frumkin, M. A., 353 Gödel, Kurt, 415 Gårding, Lars Jakob, 225–228 Gabovitsh, Evgeniĭ, 328, 373
Galois, Évariste, vii, xxii, 269, 270, 355, 363, 370, 371, 373, 387, 389, 399–408, 411, 415, 417, 425, 436, 455 Gauss, Johann Carl Friedrich, 259, 266, 267, 269, 351, 353, 362–364, 367, 370, 399, 410, 429, 431, 432, 436, 443, 444 Geissinger, Ladnor, 221 Gel’fand, Israil Moiseevich, xvi Gel’fond, Aleksandr Osipovich, 332 Geronimus, A. Yu., 353 Girard, Albert, 363, 364 Give’on, Yehoshafat, 170 Glushkov, Victor Mihaylovich, 21, 68, 416 Gluskin, Lazar Matveevich, xiv, 20 Goethe, Johann Wolfgang, 334, 344 Govorov, Valentin Evgenevich, 127, 138 Grassmann, Hermann Günter, 282 Griffiths, Phillip Augustus, 406 Grinberg, A. S., 20 Grossmann, Marcel, xvi Grothendieck, Alexander, xii, 3, 5, 6, 9, 144, 183, 203, 399, 439, 440, 444, 456 Gruenberg, Karl W., 22, 72, 76, 77, 88, 96, 102, 107 Gustafson, William H., 257, 267 Gustavsson, Jan, ix Gustavus, Adolphus, viii Gyldén, Hugo, 293, 305, 306 Hölder, Otto, 165, 250, 379, 383, 385, 423, 424 Hörmander, Lars Valter, xvi, 226 Haber, Seymour, 385 Hadamard, Jacques Salomon, 233 Hall, Philip, 72, 96, 208, 221, 386, 424 Halpin, Patrick, 283 Hamilton, William Rowan, 123, 127, 130, 257, 267, 269 Hamming, Richard Wesley, 453, 454 Hankel, Hermann, xvi Hansen, Peter Andreas, 306 Harary, Frank, 247 Hardy, Godfrey Harold, 207, 221, 233 Hartley, Brian, 22, 71, 85, 86, 88, 97, 107 Hartshorne, Robin, 12, 13 Hasse, Helmut, 347, 407, 438, 439 Hawkins, Thomas W., 267, 268 Heath-Brown, D. Roger, 432 Helmling, Peter, 266, 321 Henno, Jaak, 68 Hermite, Charles, 207, 227, 232, 233, 259, 266 Hertz, Heinrich, 409 Hesse, Ludwig Otto, 268 Higman, Graham, 64, 386 Hilbert, David, vii, 5, 203, 235, 258, 259, 272, 297, 332, 345, 399, 406, 407, 409, 413, 431, 441, 444
Hion, Jaak, xiii–xv Hochschild, Gerhard Paul, 16, 35, 104 Hochster, Melvin, 273, 274 Hoffman, Allan J., 212 Horton, Robert Elmer, 447 Hotz, Günter, 183 Hudde, Jan, 357 Hughes, Ian, 221 Hurwitz, Adolf, 269, 298, 345, 349 Höhn, Gerald Helmut, 353 Irving, Washington, 444 Iskovskih, Vasili Alexeevich, 353, 406 Jaakson, Hermann, xi Jacobi, Carl Gustav Jacob, 259, 416, 437, 441, 444 Jacobson, Nathan, 17, 64 Janson, Svante, xvi Jansson-Peetre, Eila Ritva, ix, xvi, xvii Johnson, Kenneth W., 271 Johnsson, Margreth, ix Jordan, Camille, 165, 250, 371, 383, 385, 423, 424 Kämtz, Ludwig Friedrich, 321 König, Denes, 208, 222, 225, 231 Kaarli, Kalle, ix, 19, 111 Kaasik, Ülo, xxiii, 427 Kac, Mark, 445 Kadikis, Peteris, 268 Kalin, 283 Kaljulaid, Elmar, xi Kaljulaid, Uno, vii, xi–xix, xxi, 13, 17, 143, 145, 207, 214, 243, 284, 291, 311, 366, 411 Kalman, Rudolf Emil, 170, 425 Kaluzhnin (Kalujnin), Lev Arkad’evich, 22, 32, 68, 70, 71, 82, 84, 89, 97, 108, 162 Kanevskiĭ, D., 353 Kangro, Gunnar, xi, 340 Kanunov, Nikolai Feodorovich, 265, 269, 289, 311 Kaplansky, Irving, 94–96, 102 Kapranov, Mihail M., 353 Katsov, Yefim, 127, 137, 138 Katz, Matthew J., 211 Katzman, Simha Idelevich, 111 Kaufmann, Ralph M., 353 Kelly, Annela, xv, 207 Kemer, Aleksandr Robertovich, 269 Kennel, Julius Thomas, 319 Kepler, Johannes, 428 Kharchenko, Vladislav Kirillovich, 288 Khoai, Kha Huy, 353 Kii, K., 353 Killing, Wilhelm Karl Joseph, 253, 266, 267
Kilp, Mati, xiii Kingissepp, Viktor, 317 Kiselman, Christer, ix Kiselman, Dan, ix Kivinukk, Andi, ix Kleene, Stephen Cole, 145, 450 Klein, Felix Christian, 253, 254, 258, 266, 269, 270, 355, 364, 366, 385, 408–411 Kneser, Adolph Hermann, 298 Kneser, Friederike Wilhelmine Filippe Augusta, 298 Kneser, Helmuth, 297 Kneser, Julius Carl Christian Adolf, 291, 297–301 Kneser, Lorents Friedrich, 297 Kneser, Martin, 12, 297 Knuth, Donald Ervin, 183 Koch, 117 Koch, Richard, ix Kodaira, Kunihiko, 441, 443 Kolmykov, Vladislav Alekseevich, 353 Kolyvagin, Victor A., 353 Kostrikin, Aleksei Ivanovich, xxiv Koval’skiĭ, Nikolai Pavlovich, 268 Krakowski, Don, 283 Krasner, Marc, 32, 68, 162 Krohn, Kenneth, 68, 165, 417, 422–425 Kronecker, Leopold, 130, 267, 269, 297, 298, 331, 332, 405–407, 431, 441 Krull, Wolfgang, 26, 286 Kruus, R., xxi Krylov, Petr Andreevich, ix, 312 Kummer, Ernst Eduard, 267, 429, 431, 432 Kurchanov, Pavel Fedorovich, 353 Kurosh, Aleksandr Gennadievich, xiii, xxiv, 203, 328, 340 Kurter, 117, 118 Kuzmin, Evgeniĭ N., 68 Künneth, Hermann, 6
Lagrange, Joseph-Louis, 349, 351, 355, 358, 359, 361, 362, 364, 368, 374, 382–384, 399, 402, 410, 429 Laguerre, Edmond Nicolas, 241 Lah, Ivo, xxiii, 239, 241 Lajos, Sándor, 122 Lamé, Gabriel, 429 Landau, Edmund, 68 Lang, Serge, 328, 352, 441, 444 Langlands, Robert Phelan, 440 Laptev, German Fedorovich, 408 Laud, Peeter, xv, xviii Lazard, Daniel, 127, 138 Lebedev, D. R., 353 Lebesgue, Henri Léon, 429
Lee, Tsung-Dao, 445 Lefschetz, Solomon, 440 Legendre, Adrien-Marie, 429, 444 Leibniz, Gottfried Wilhelm, 358, 407 Leites, Dimitry Alexander, 353 Lembra, Jaak, xxiii Lenin (Ulyanov), Vladimir Ilych, xxi, 351 Levin, Andrey, 353 Levitzki, Jacob, 282 Lewin, Jacques, 16, 17, 21, 43, 63, 105, 123, 283 Lexell, Anders Johan, 292 Li, Winnie, 283 Libri, Guglielmo, 357, 455 Lie, Marius Sophus, 101, 254, 257, 260, 266–269, 327, 345, 352, 355, 387, 408, 409 Lindemann, Carl Louis Ferdinand, 395 Lindenmayer, Aristid, 448, 451 Lindhagen, Georg, 292 Lindstedt, Anders, 266, 291, 303–310 Lindstedt, Ewa, 308 Lindstedt, Folke, 308 Lindstedt, Hilda, 308 Lindstedt, Samuel, 308 Liouville, Joseph, 332, 370 Lipyanskiĭ, Ruvim, ix, 15, 17 Littlewood, John Edensor, 207, 221, 233 Liu, Bo Lian, 213 Lobachevsky, Nikolai Ivanovich, 269, 407 Loewner, Charles, 234 London, David, 230 Lorentz, Hendrik Antoon, 225, 409 Lovász, László, 221 Lucas, François Edouard Anatole, 245 Luh, Jiang, 122 Luigi Ferrari, 357, 358 Lumiste, Ülo, ix, xiii, 338 Lusztig, George, 203 Lyapin, Evgeniĭ Sergejevich, xiv Lyapunov, Aleksandr Mihailovich, 268 Lüroth, Jacob, 353, 406
Mädler, Johann Heinrich, 321 Müürsepp, Peeter, 301 Mac Lane, Saunders, 35 Macaulay, F. S., 3, 9 MacKoy, 122 Magnus, Wilhelm, 96 Mal’cev, Anatoly Ivanovich, 20, 22, 86, 91, 96, 101, 102, 108, 130, 253, 269, 415 Mal’cev, Yuriĭ N., 67, 68, 105 Manin, Yuri Ivanovich, vii, xii, xiv, 348, 351–353, 406, 407, 416, 441 Marcus, Marvin, 228, 231, 232 Marcus, Solomon, 451 Markov, Andrei Andreyevich, 145 Marshall, Albert W., 221 Martinson, Indrek, ix
Martynov, B., 353 Maschke, Heinrich, 80 Mathieu, Emile Léonard, 387 Matiyasevich, Yuri Vladimirovich, 440 Mauchly, John William, 413 Maxwell, James Clerk, 409 Mayer, Christian Gustav Adolph, 299 Mazur, Barry Charles, 435 McCulloch, Warren Sturgis, 414, 415, 421 McDowell, Kenneth, 137 McMullen, P., 221, 222 Mealy, George, 45, 145, 147, 152, 154, 168 Melin, Anders, xvii Menal, Pere, 117, 120, 123 Menger, Karl, 68 Menskiĭ, Michail Borisovich, 43 Meriste, Merik, xxiii, xxiv Merkulov, Sergei A., 353 Mihalev, Alexander Vasilyevich, 20, 22 Mihovski, Stoyl Vassilev, 117, 123 Miljan, Riina, xv, 111 Miller, George Abram, 375 Milne, Alan Alexander, 327 Minc, Henryk, 225, 228, 232 Minding, Ernst Ferdinand Adolf, 266 Minh, Hoang Le, 353 Minkowski, Hermann, 269, 347 Mirsky, Leon, 212, 221 Miyaoka, Yoichi, 443 Molien, Andrei [Andrew], 265 Molien, Benedikt, 312 Molien, Eduard, 265 Molien, Elise, 312 Molien, Johan, 265 Molien, Theodor (Molin, Fedor Eduardovich), vii, xv, xxiii, 222, 253–255, 257–262, 265–272, 274–277, 281, 286, 287, 291, 311–315, 385 Molotov, Vyacheslav Mihailovich, viii Moore, Edward F., 68, 156–158, 168, 169, 172, 173, 175, 179, 421 Mordell, Louis Joel, xxiv, 328, 339, 345, 346, 349, 352, 353, 427, 435, 436, 440–444 Muir, Thomas, 227 Mumford, David, 183 Munn, Walter Douglas, 221 Mustafin, G. A., 353 Myhill, John, 171, 419, 420 Myrberg, Caroline, ix Néron, André, 352 Nagata, Masayoshi, 3, 64 Nagell, Trygve, 349 Nano, Villem, xxi Nemmers, Frederic Esser, 353 Nerode, Anil, 171
Netto, Eugen Otto Erwin, 406 Neumann, Bernhard Hermann, 20, 21, 58 Neumann, Hanna, 20, 21, 58, 101 Neumann, Peter M., 20, 21, 58 Newman, Morris, 228, 231 Newton, Isaac, 266, 364, 365, 369, 428 Nikolskii, Aleksandr Vadimovich, ix, 312 Noether, Emmy, xv, 5, 9, 73–76, 83–85, 106, 107, 155, 257, 258, 267, 269, 273, 281, 399, 406 Nuut, Jüri, xxi Oettingen, Arthur Joachim, 297, 305, 307 Oettinger, Arthur Joachim, 266 Ol’shanskiĭ, Alexander Yu., xxiv Olkin, Ingram, 221 Oort, Frans, 441 Ore, Oystein, 127, 131, 132 Ostrowsky, Alexander Markowich, 207, 221, 233 Pólya, George, 207, 239, 249, 254, 258, 261, 262, 275 Palowsky, Karl Rudolph, 306 Panchishkin, Alexei, 353 Parshin, Aleksey Nikolaevich, 441–443 Pasch, Moritz, 409 Passman, Donald, 22 Pawlak, Zdzisław, 451 Pearson, Kenneth Robert, 221 Peetre, Inga-Britt, ix Peetre, Jaak, vii, xvi, xvii, xix, 15, 19, 101, 143, 145, 203, 207, 221, 222, 225, 233, 243, 245, 253, 257, 265, 291, 447 Peetre, Jakob-Sebastian, ix Peirce, Benjamin, 314 Peirce, Charles Sanders, 314 Penjam, Jaan, xv, xxiii, xxiv, 143, 183, 203 Penkov, Ivan, 353 Perkmann, Monika, ix Perron, Oskar, 229, 231 Persson, Ann-Christin, ix Persson, Ulf, ix, 399, 411 Peter the Great (Romanov, Pjotr Alexeiovich), 292 Petri, Carl Adam, vii, 203 Picard, Emile, 183, 267 Pick, Georg, 234 Pierce, Richard S., 257 Pikkmaa, Tiit, xv, xxiii Piltz, Anders, ix Pitts, Walter H., 414, 415, 421 Plato, 411 Platonov, Vladimir Petrovich, 442 Plotkin, Boris Isakovich, vii, ix, xii–xiv, 15, 17, 20, 22, 24, 30, 42, 68, 71, 75, 86, 88, 97, 101, 106, 108, 127
Poincaré, Jules Henri, 254, 267, 272, 349, 407, 409, 435, 441 Poisson, Siméon Denis, 444 Pontryagin, Lev Semenovich, 407 Popov, Vladimir Leonidovich, 282, 283 Postnikov, Mihail Mihailovich, 389 Prank, Rein, 427 Procesi, Claudio, 283 Prodinger, Helmut, 245 Proskurowski, Andrzej, 247 Pythagoras, 427 Quillen, Daniel Grey, 13 Rägo, Gerhard, xi Rödl, Vojtech, 250 Rado, Richard, 219, 221 Ramanujan, Srinivasa Aiyangar, 440 Rankin, Robert Alexander, 440 Raynaud, Michel, 442 Razmyslov, Yuriĭ P., 68 Redfield, J. Howard, 261, 275 Rees, Mina Spiegel, 113 Regev, Amitai, 283 Remak, Robert, 26, 64, 72, 76, 84 Renner, Johann, xvi Rhodes, John, 68, 165, 417, 422–425 Ribbentrop, Joachim, viii Riemann, Bernhard, 338, 340, 344, 345, 373, 407, 408, 435, 436, 438, 439, 441, 444 Roitman, A. M., 353 Rolle, Michel, 226 Roos, Jan-Erik, xii, xvi, 3, 13 Roseblade, James Edward, 22, 96, 107 Rosenfeld, A., 385 Rosengren, Hjalmar, xix Rota, Gian-Carlo, 221, 239 Rothe, Peter, 363 Ruelle, David, 445 Ruffini, Paolo, 361, 368, 369, 384, 389, 392, 395, 399, 405 Ryser, Herbert John, 221 Saburov, Andrei, 293, 294, 304, 319 Sandling, Robert, 22, 97, 107 Sands, Bill, 250 Sarv, Jaan, xi Schützenberger, Marcel-Paul, 43, 449 Scheffers, Georg, 267 Schlömilch, Oscar Xavier, 317 Schmidt, Erhard, 26, 286 Schmidt, Friedrich Karl, 347, 438 Schock, Rolf, 353 Schroeter (Schröter), Heinrich Eduard, 299 Schur, Friedrich Heinrich, 266, 269, 385 Schur, Issai, 207, 221, 233–235, 269, 280, 281
Schwarz, Peter Carl Ludwig, 266 Selberg, Atle, 353 Serganova, Vera V., 353 Serre, Jean-Pierre, xii, 5, 7, 8, 12, 13, 440, 442–444 Serret, Joseph Alfred, 371 Shabat, George, 353 Shafarevich, Igor Rostislavovich, 351, 352, 356, 441, 444 Shain, Aleksandr, 122 Shannon, Claude Elwood, 456 Shannon, R. T., 137 Shenkman, 89 Shephard, G. C., 275, 288 Shermenev, Alexander Mihailovich, 353 Shevrin, Lev Naumovich, xiv, xxiii, 111–113, 127 Shimura, Goro, 407, 445 Shmel’kin, Alfred Lvovich, 20, 21, 58, 101 Shokurov, Vyacheslav V., 353 Sibley, David, 271 Siderov, Plamen N., 284 Siegel, Carl Ludwig, 328, 435, 441 Singer, Isadore M., 413 Skornyakov, Lev Anatolyevich, 127, 221 Skorobogatov, Alexei Nikolayevich, 353 Sloane, Neil James Alexander, 275 Smith, Patrick F., 75, 77, 106 Sokratova, Olga, ix, xiv, xxiv, 127, 138 Spanne, Sven, ix Sparr, Gunnar, ix Spivak, Michael David, ix Stanley, Richard, 203, 249, 250, 261, 272, 275 Staude, Ernst Otto, 297, 298 Stein, Elias M., 235 Steinitz, Ernst, 409 Steklov, Vladimir Andreevich, 268, 297, 351 Stenström, Bo, 127, 137 Sternberg, Shlomo, 221 Stirling, James, xxii, xxiii, 239 Strahler, Arthur Newell, 447, 448, 451, 452 Struve, Friedrich Georg Wilhelm, 266, 268, 269, 292 Study, Eduard, 260, 266, 267, 269 Suprunenko, Dmitriĭ Alekseevich, 20, 284 Suslin, Andrei Aleksandrovich, 13 Suzuki, Michio, 386 Swinnerton-Dyer, H. Peter F., 349, 439 Sylow, Peter Ludwig Mejdell, 89, 374, 386 Sylvester, James Joseph, 259, 267 Szász, Ferenc A., 122 Szpiro, Lucien, 441 Tacitus, Publius Cornelius, viii Tallinn, Annika, ix, xviii, xix Tambour, Torbjörn, 243, 275, 276
Tamm, Hellis, ix Tamm, Marje, ix Tamme, Enn, ix, xxi, xxii, 144, 413 Tammeste, Rein, 413 Tammiksaar, Erki, ix Taniyama, Yutaka, 407, 439, 442, 445 Tannery, Paul, 368 Tartaglia, Niccolo, 357 Tate, John Torrence, 441, 442 Taylor, Brook, 450 Taylor, Richard, 445 Thompson, John Griggs, 373, 385–387 Tichy, Robert F., 245 Tits, Jacques, 440 Todd, J. A., 275, 288 Tolstoĭ, Dmitriĭ, 294 Traustason, Gunnar, ix, 373, 387, 424 Tschinkel, Yuri, 353 Tschirnhaus, Ehrenfried Walter, 357 Tsfasman, Michael A., ix, 351, 353 Tsygan, Boris L., 353 Tunnell, Jerrold Bates, 439 Turan, Paul, 391 Turing, Alan Mathison, 145 Tyshkevich, Regina Iosifovna, 284 Ufnarovsky, Victor, ix, 291 Vagner, V. V., xiv Vainberg, Yu., 353 Vainikko, Gennadi, xvi, xvii Vaintrob, Arkady Yu., 353 van der Waerden, Bartel Leendert, 207, 209, 222, 225, 328, 339 van Lint, Jacobus Hendricus, 225, 227 Van Tilborg, Henk, 456 Vandermonde, Alexandre-Théophile, 362, 370 Vauquois, Bernard, 451 Vauchaussade de Chaumont, Mireille, 451 Vene, Varmo, xv Verevkin, A. B., 353 Vershik, Anatoly Moyseevich, 221 Viète, François, 347, 357, 359, 391, 396, 400, 404, 429 Viennot, Xavier, 449, 451 Vilyatser, V. G., 71 Visentin, Terry I., 250 Vishik, Mihail M., 353 Vladuts, Serge, 353 Volterra, Vito, 234 von Below, Joachim, 213 von Dyck, Walther, 449 von Neumann, John, 135, 207, 221, 410, 414–416, 444 Voronov, Alexander A., 353
Wagstaff, Samuel S., 432 Wallis, John, 364 Waterman, Michael, 451 Weber, Heinrich, 298, 364, 406 Wedderburn, Joseph Henry Maclagan, 257 Weierstrass, Karl Theodor Wilhelm, 254, 267, 297, 298, 308, 347, 349 Weihrauch, Anna Elisabeth, 321 Weihrauch, Filipp Alexander Robert, 320 Weihrauch, Karl, 291, 306, 309, 317–322 Weihrauch, Karl Ernest, 320 Weihrauch, Karolina Eliza Johanna, 320 Weihrauch, Matilde, 320 Weihrauch, Philipp, 321 Weil, André, 328, 339, 345, 349, 429, 435, 438–441, 443–445 Weiss, Guido, 235 Wessel, Caspar, 364 Weyl, Hermann Klaus Hugo, 257, 262, 266, 399, 416, 444 Wielandt, Helmut, 386 Wiener, Norbert, 415 Wiles, Andrew, 445 Wodzicki, Mariusz, 353 Woodrow, Robert, 250 Yang, Chen Ning, 445 Yaroslav the Wise, 266 Yau, Shing-Tung, 443 Young, Alfred, 277, 279 Zaharevich, Ilya, 353 Zalcstein, Yechezkel, 170 Zaleskiĭ, Alexander E., xii, 22 Zarhin, Yuri G., 353 Zariski, Oscar, 3, 12, 339 Zarkhin, Yuri Gennadievich, 441 Zeiger, H. Paul, 422, 423 Zelmanov, Efim Isaakovich, 269 Zhang, Genkai, xvi Zingel, Tiina, xv Zorn, Max August, 75 Zubkov, Aleksandr Nikolaevich, ix, 265, 268 Zuse, Konrad, 413
Subject Index
(X, Y)-automaton, 147 G-average, 217 G-co-expressions, 360 G-doubly stochastic matrix, 209 K-algebraic point, 340 K-rational point, 339, 340 RB⁻¹-act of fractions AB⁻¹, 132 T-ideal, 281 Λ-linear transition system, 171 Λ-monoid, 170, 171 Λ-monoid of inputs, 171 Ω-field, 129 Ω-ring, 128 Ω-ring of fractions, 132 L-fixed point, 210 N (2) -groups, 107 R-semigroup, 94 G-scheme, 275 k-characters, 286 m-linear form, 226 n-dimensional projective space, 334 n-focal, 92 n-stable representation, 108 n-th order general equation, 395 q-extension, 203 r-fold point, 342 x-sequence, 113 Diophantine equation, 427 Abelian extension, 406 Abelian group, 377, 381 Abelian sheaf, 3, 4 Abelian variety, 345 acceptable equivalence, 249 acceptable subset, 245 act of characters, 134 action, 156 adenine, 450 affine automaton, 170 affine space, 434 affine variety, 335, 434 Aleksandrov topology, 203 algebraic curve, 327 algebraic integer, 430
algebraic number, 331 algebraic number field, 331, 433 algebraic variety, 433 algebraically closed field, 331 algebraically independent numbers, 395 alternating group, 376 amino acid, 450 Amitsur-Levitzki theorem, 282 approximated Ω-ring, 132 atomary semigroup, 425 attributed automaton, 199 augmentation ideal, 101, 106 automaton, 145 automaton language, 418 automorphism, 379, 401 average-preserving function, 180 average-preserving WFA, 180 Bézout’s Lemma, 391 Bell number, 239 Betti number, 436, 438 bifurcation ratio, 448 bilinear map, 133 binary (3, 7)-Hamming code, 455 binary tree, 447 birational equivalence of curves, 340 birational geometry, 340 birational invariant, 340 birationally equivalent, 341 Birkhoff class, 15 bistochastic matrix, 207 Björner topology, 203 branching theorem, 278 Burnside’s Theorem, 387 cancellative semigroup, 94 cascade, 165 cascade of automata, 166, 421, 422 cascading, 422 Catalan numbers, 449 category of changes, 38 category of pairs, 25 category of primitives, 197 Cauchy-Frobenius lemma, 276
center of group, 377 centralizer, 23 CF-language, 449 channel, 453 character map, 280 character series, 281 class of nilpotent semigroup, 111 code words, 453 coding, 453 cogenerator, 134 Cohen-Macaulay ring, 9 cohomological dimension, 3, 5 cohomology, 3 colored category, 184 commutative Ω-algebra, 128 commutative Ω-ring, 129 commutator subgroup, 382 commutators, 382 compatible pair of subsets, 46 complete polarization, 226 complete system of representatives, 374 complex character, 257 component of curve, 337 composition series, 383 congruence of automata, 45 congruence on an automaton, 154 congruent numbers, 439 conjecture of Birch and Swinnerton-Dyer, 439 context free grammar, 449 context free language, 449 contravariant coordinate, 230 convolution, 269 coset, 373 cover, 184 cover of automata, 147 critical semigroup, 111 cryptomorphism, 104 cyclic action, 157 cyclic automaton, 157 cyclic group, 381 cyclotomic field, 406 cytosine, 450 decoding, 453 decomposition, 21 degree of the forest, 452 deoxyribonucleic acid, 450 deterministic finite state machine, 145 dimension congruence, 91 Diophantine geometry, 339 discrete time system, 171 division Ω-ring, 129 DNA, 450 doubly stochastic matrix, 207, 225 duo semigroup, 111 Dyck language, 449
edge, 447 elementary symmetric polynomial, 392 elliptic curve, 328, 345, 436 epimorphism, 379 epimorphism of (X, Y )-automata, 148 equivalent automata, 149 Eulerian ring of integers, 429 even doubly stochastic matrix, 207 even substitution, 376 exact diagram, 185 extension, 330 extension of ring, 432 exterior algebra, 282 exterior vertices, 447 factor automaton, 45 factor group, 377 factor-automaton, 155 faithful action, 156, 159 faithful pair, 26 Faltings’ theorem, 436 Fermat equation, 437 Fermat’s Last Theorem, 429 fiber product, 184, 188 Fibonacci number of a graph, 245 Fibonacci numbers, 450, 455 field, 328 field of characteristic 0, 330 field of characteristic p, 330 field of definition, 345 field of rational functions on the curve, 341 fields of remainder classes, 330 filament, 452 final state, 145 finitary T ideal, 66 finite automaton, 21, 417 finite extension, 330 finite group, 165 finitely presented R-act, 136 finitely stable action, 70 First Theorem of Sylow, 374 flat R-act, 133 focal, 92 forest, 451, 452 form of order i, 336 formal language, 418, 449 formal Lie group, 352 formal neuron, 414 formal series, 393 formula of Viète, 391 Fox calculus, 61 frame, 333 free m-generated nilpotent semigroup, 113 Frey curve, 442 Frobenius’ theorem, 267 Frobenius-König theorem, 225
fundamental ideal, 91, 101 Galois group of the equation, 401 Galois group of the extension Δ/P , 401 Galois inversion problem, 406 Galois theory of schemes, 399 Gaussian numbers, 429 general algebraic curve, 337 general equation, 395 general formal series, 394 general linear system, 171 generalized dimensional subgroup, 106 generalized Mordell conjecture, 346, 352 generating matrix, 453 generator, 381 genetic code, 450 genetic language, 450 genus of birational invariant, 340 genus of curve, 344, 351, 436 genus of the Riemann surface, 345 good polynomial bases, 273 Grassmann algebra, 282 Grothendieck (pre)topology, 203 Grothendieck ring, 279 Grothendieck pretopology, 184, 185 Grothendieck topology, 456 group algebra, 269 group determinant, 270 group of all automorphisms of Δ, 401 group pair, 23 guanine, 450 Hamiltonian algebra, 129 Hamiltonian group, 118 Hamming code, 454 Hamming distance, 453 hereditary condition, 135 Hermitian matrix, 227 heterogeneous algebra, 155 Hilbert series, 281 Hilbert-Poincaré series, 272 holonomy, 408 holonomy group, 408 homogeneous form of order i, 336 homogeneous recurrent equation, 455 homomorphism, 378 homomorphism of automata, 148 Horton-Strahler rule, 447 hyperbolic polynomial, 226 hyperbolic quadratic form, 227 ideal pair, 46 indecomposable module, 26 indecomposable variety, 57 indecomposable representations, 16 index of stabilization, 69
index of subgroup, 374 indicator, 44 infinite cyclic group, 381 infinite ordinal, 69 initial state, 145 initial symbol, 449 inner automorphism, 375, 379 inner vertices, 447 input alphabet, 145 input signal, 417 input-output map, 171 integrity basis, 273 interpretation, 197 invariant element, 117 invariant subautomaton, 59 invariant subgroup, 375 inversion, 376 irreducible T -ideal, 284 irreducible algebraic curve of rank m, 337 irreducible form, 336 irreducible polynomial, 331, 390 isomorphism, 379 Jacobi sum, 437 joining map, 166 Jordan-Hölder Theorem, 165, 383 Künneth formula, 6 Kaplansky semigroup, 94 kernel of a pair, 26 kernel of homomorphism, 378 Klein 4-group, 385 Klein curve, 436 Krohn-Rhodes Theorem, 165 L-flat condition, 138 Lah numbers, 241 language accepted by an automaton, 147 language accepted by the automaton, 418 left R-transferable duoring, 117 left R-transferable element, 117 left coset, 374 left distributivity, 92 left duo semigroup, 111 left homomorphism, 26 left ideal, 129 left subcommutative ring, 117 left subduo semigroup, 111 left unitary R-act, 128 length of partition, 277 length of series, 106 light-like vector, 227 limit, 69, 106 limit dimensional subgroup, 69, 106 line of behavior, 149 linear (n, k)-code, 453
linear automaton, 45, 105, 168 linear cyclic automaton, 169 linguistic category, 197 local cohomology, 3, 4 Loewner function, 234 Lorentz form, 227 lower central series, 91 lower stable series of a pair, 70 lower stable series of pair, 106 Lusztig Conjecture, 203 machine, 417 majorization, 233 many-sorted algebra, 155 maximum likelihood decoding, 453 Mealy coding machine, 152 Mealy machine, 145 metanilpotent group, 83 model of the automaton, 421 Molien series, 272 Molien’s formula, 271 monoid, 418 monomial, 335 monomial group, 275 monomorphism, 379 Moore automaton, 156, 158, 421 Mordell-Weil theorem, 349 morphism of automata, 45 Muir’s formula, 227 multiplication of varieties, 15 multiplicity of component, 337 multiresolution function, 179 multiresolution vector, 181 mutual commutator, 70 NA, 450 near-ring, 92 nil, 111 nilpotency index, 111 nilpotent coradical, 85 nilpotent ideal, 267 nilpotent of class, 91 nilpotent semigroup, 111 Noetherian module, 5 Noetherian pre-scheme, 9 Noetherian ring, 5 non-commutative analogue of algebra, 280 non-elliptic curve, 346, 436 non-homogeneous polynomial, 226 non-terminal, 449 noncascadable automaton, 424 normal divisor, 375 nucleic acid, 450 nucleoide, 450 nucleotide chain, 450 number field Q(ζ), 430
o-automaton, 183 odd substitution, 376 orbit, 275, 373 order, 447 order of a substitution, 375 order of group, 374 order of monomial, 336 order-polynomial, 250 Ore set, 131 outerplanar graph, 247 output alphabet, 145 output signal, 417 parallel composition, 166 parity check matrix, 453 partial feedback operation, 203 particular ring, 59 partition, 277 permanent, 225 Petri net, 203 Pick function, 234 Poincaré series, 281 Poincaré’s conjecture, 435 Poincaré-Mordell conjecture, 435 polynomial Ω-ring, 130 polynomial algebra, 130 polynomial basis, 273 polypeptide chain, 450 presentation of an R-act, 136 presheaf, 185 primitive, 197 primitive derivation, 197 production, 449 projective algebraic variety, 336, 434 projective space, 7, 434 proper subgroups, 383 pseudo-reflection, 281 pullback, 184 pure homomorphism, 136 quasi-endomorphism, 92 quasi-equivalent automata, 421 quasi-ring, 91 quaternion, 269 equivalent automata, 419 radar code, 455 radical, 267 ramification matrix, 448 rank of curve, 435 rational curve, 345, 346, 436 rational function on the curve Γ, 341 rational point, 339, 434 Redfield-Pólya theory, 275 reduced automaton, 149, 419 reduced linear automaton, 169
reducible polynomial, 390 Rees factor semigroup, 113 regular Ω-ring, 135 regular at zero Ω-ring, 136 regular language, 147 relatively free algebra, 281, 284 Remak’s theorem, 26 representation, 257, 378 residually biprimary groups, 86 restriction, 278 rewriting system, 197 ribonucleic molecule, 450 Riemann hypothesis, 438 Riemann surface, 435, 436 right R-transferable duoring, 117 right R-transferable element, 117 right coset, 374 right distributivity, 92 right homomorphism, 26 right subcommutative ring, 117 right A-set, 191 ring, 328, 432 ring of formal series, 394 ring of invariants, 273 RNA, 450 root, 447 rooted tree, 451 saturated Birkhoff class, 26, 105 saturated class, 15, 44, 45 Schur function, 280 Schur-convexity, 233 secant, 342 secondary structure, 451 semantic pair, 197 semi-automaton, 145, 158 semi-direct product, 23, 425 semi-Thue system, 197 semidirect product, 35 semigroup of the automaton, 156 semigroup Ω-ring, 130 semigroup R-act, 130 semigroup action, 183 semigroup automaton, 45, 155, 156, 183 semigroup of ideal pairs, 60 semisimple algebra, 267 semisimplicity, 17 sequential composition, 166 set of states, 145 sheaf, 185 sign representation, 278 simple algebra, 267 simple group, 375 simple Lie group, 387 simple Pythagorean triples, 427 simplex code, 455
singular endomorphism, 123 singular point, 342 size of partition, 277 solvable group, 382, 383 space-like vector, 227 special basis, 63 special ideal, 44 special involution semigroup, 211 special property, 44 species, 184 spectrum of subgroup, 213 splitting field, 395, 396, 401 stabilizer, 162 stabilizing index, 106 stable pair, 92 start state, 145 state, 417 state-output automaton, 421 Stirling numbers of the first kind, 240 Stirling numbers of the second kind, 239 Strahler number, 448 strictly constant function, 229 strictly decreasing function, 229 strictly regular ring, 122 strongly flat R-act, 136 subalgebra of G-invariant, 271 subalgebra of G-invariants, 280 subnilpotent semigroup, 111 substitution, 375 substitution group, 376 symmetric G-mean of a, 217 symmetric group, 376 symmetric matrix, 227 symmetric polynomial, 392 symmetries, 210 syntactic category, 197 syzygy, 273 tangent, 342 Taniyama-Shimura Conjecture, 445 Taniyama-Weil conjecture, 439 tensor product, 133 terminal, 449 terminal of group, 106 terminal of a group, 69 terminal of a ring, 69 terminators, 451 theorem of Krull-Remak-Schmidt, 26 thymine, 450 time-invariant system, 171 time-like vector, 227 torsion, 435 transcendental number, 332 transferable element, 117 transition, 180 transition category, 198
transition system, 198 Travelling Salesman Problem, 212 tree, 447 triangular product, 15, 16, 24, 36, 103 triangular product of automata, 172, 173 trigger, 420 triple product of semigroups, 27, 30 trivial normal divisor, 375 trivial representation, 277 type, 275 universal cone, 184 uridine, 450 value, 197 variety, 26 verbal function, 26 verbatim, 38 Weierstrass addition theorem, 349 weight of partition, 277 weighted finite automaton, 180 Weil’s conjecture, 438 word accepted by the automaton, 418 wreath product, 15, 23, 184 wreath product construction, 203 wreath product of actions, 159, 196 wreath product of automata, 194 wreath product of semigroup automata, 168, 196 wreath product of the algebras, 66 Young diagram, 277 Zariski dimension, 3 Zariski space, 3, 4 Zariski topology, 12 zeta-polynomial, 250 zyzygy, 273