Lecture Notes in Mathematics 2108 CIME Foundation Subseries
Aldo Conca · Sandra Di Rocco Jan Draisma · June Huh Bernd Sturmfels · Filippo Viviani
Combinatorial Algebraic Geometry Levico Terme, Italy 2013 Editors: Sandra Di Rocco Bernd Sturmfels
Lecture Notes in Mathematics Editors-in-Chief: J.-M. Morel, Cachan B. Teissier, Paris Advisory Board: Camillo De Lellis (Zürich) Mario Di Bernardo (Bristol) Alessio Figalli (Austin) Davar Khoshnevisan (Salt Lake City) Ioannis Kontoyiannis (Athens) Gabor Lugosi (Barcelona) Mark Podolskii (Aarhus) Sylvia Serfaty (Paris and NY) Catharina Stroppel (Bonn) Anna Wienhard (Heidelberg)
For further volumes: http://www.springer.com/series/304
Fondazione C.I.M.E., Firenze C.I.M.E. stands for Centro Internazionale Matematico Estivo, that is, International Mathematical Summer Centre. Conceived in the early fifties, it was born in 1954 in Florence, Italy, and welcomed by the world mathematical community: it continues successfully, year after year, to this day. Many mathematicians from all over the world have been involved in one way or another in C.I.M.E.’s activities over the years. The main purpose and mode of functioning of the Centre may be summarised as follows: every year, during the summer, sessions on different themes from pure and applied mathematics are offered by application to mathematicians from all countries. A Session is generally based on three or four main courses given by specialists of international renown, plus a certain number of seminars, and is held in an attractive rural location in Italy. The aim of a C.I.M.E. session is to bring to the attention of younger researchers the origins, development, and perspectives of some very active branch of mathematical research. The topics of the courses are generally of international resonance. The full immersion atmosphere of the courses and the daily exchange among participants are thus an initiation to international collaboration in mathematical research.
C.I.M.E. Director Pietro ZECCA Dipartimento di Energetica “S. Stecco” Università di Firenze Via S. Marta, 3 50139 Florence Italy
C.I.M.E. Secretary Elvira MASCOLO Dipartimento di Matematica “U. Dini” Università di Firenze viale G.B. Morgagni 67/A 50134 Florence Italy
For more information see CIME’s homepage: http://www.cime.unifi.it
Aldo Conca Dipartimento di Matematica Università di Genova Genova, Italy
Sandra Di Rocco Department of Mathematics KTH Royal Institute of Technology Stockholm, Sweden
Jan Draisma Department of Mathematics and Computer Science TU Eindhoven Eindhoven, The Netherlands
June Huh Department of Mathematics University of Michigan at Ann Arbor Ann Arbor, MI, USA
Bernd Sturmfels Department of Mathematics University of California, Berkeley Berkeley, CA, USA
Filippo Viviani Dipartimento di Matematica Università Roma Tre Roma, Italy
ISBN 978-3-319-04869-7 ISBN 978-3-319-04870-3 (eBook) DOI 10.1007/978-3-319-04870-3 Springer Cham Heidelberg New York Dordrecht London Lecture Notes in Mathematics ISSN print edition: 0075-8434 ISSN electronic edition: 1617-9692 Library of Congress Control Number: 2014939301 Mathematics Subject Classification (2010): 11H55, 13D02, 13P25, 14H10, 14M25, 16S37, 52B20, 62F10 © Springer International Publishing Switzerland 2014 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. 
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
Combinatorics and Algebraic Geometry have enjoyed a fruitful interplay since the nineteenth century. Classical interactions include invariant theory, theta functions, and enumerative geometry. The aim of this volume is to introduce recent developments in combinatorial algebraic geometry. The five chapters of this book are based on the lectures delivered at the CIME-CIRM summer school, Levico Terme, June 10–15, 2013. We here regard algebraic geometry with a view towards applications, such as tensor calculus and algebraic statistics. A common theme is the study of algebraic varieties endowed with a rich combinatorial structure. Relevant techniques are polyhedral geometry, free resolutions, multilinear algebra, projective duality, and compactifications. Aldo Conca offers an introduction to Koszul Algebras and Their Syzygies. Koszul algebras are fundamental in commutative algebra, and they have numerous applications in algebraic geometry. One application presented here is the study of Castelnuovo–Mumford regularity of projective varieties. Other results presented in this chapter concern syzygies of Koszul algebras, the Koszul property of Veronese algebras, and algebras in the theory of hyperspace arrangements. Systems of polynomial equations in infinitely many variables arise naturally in applied algebraic geometry. Typically, these infinite-dimensional systems have a lot of symmetry, and, in favorable circumstances, one encounters Noetherianity up to Symmetry. Jan Draisma offers a glimpse of recent developments in this field. His chapter focuses on examples from algebraic statistics and on the combinatorics of well-quasi-ordered sets. Maximum Likelihood Geometry studies the critical points of monomial functions over a variety inside the probability simplex. The number of complex critical points, known as its maximum likelihood degree, is a topological invariant.
June Huh joined Bernd Sturmfels in writing a chapter, which introduces this theory and its statistical motivations. Many favorites from combinatorial algebraic geometry are featured: toric varieties, matroids, A-discriminants, Grassmannians, and determinantal varieties.
Sandra Di Rocco lectured on Linear Toric Fibrations, that is, toric varieties which are birational to projective toric bundles. On the combinatorial side, these correspond to Cayley polytopes, a class of highly structured lattice polytopes that has received much attention in the recent literature. This chapter presents geometrical phenomena, in algebraic geometry and neighboring fields, which are characterized by a Cayley structure. Filippo Viviani takes the reader on A Tour of Hermitian Symmetric Manifolds. These are Hermitian manifolds which are homogeneous and such that every point has a symmetry preserving the Hermitian structure. Examples of such manifolds serve as moduli spaces in algebraic and analytic geometry. This chapter offers an introduction to several different perspectives from which Hermitian symmetric manifolds can be studied. We thank the CIME foundation and the CIRM center for hosting the school and for their generous support. All of us had a wonderful time at Levico Terme. The beautiful scenery of Trentino made the mathematical interactions and the stimulating lectures even more enjoyable. Stockholm, Sweden Berkeley, CA October 2013
Sandra Di Rocco Bernd Sturmfels
CIME activity is carried out with the collaboration and financial support of: – INdAM (Istituto Nazionale di Alta Matematica) – MIUR (Ministero dell’Istruzione, dell’Università e della Ricerca) – Ente Cassa di Risparmio di Firenze
Contents
Koszul Algebras and Their Syzygies (Aldo Conca)
Noetherianity up to Symmetry (Jan Draisma)
Likelihood Geometry (June Huh and Bernd Sturmfels)
Linear Toric Fibrations (Sandra Di Rocco)
A Tour on Hermitian Symmetric Manifolds (Filippo Viviani)
Koszul Algebras and Their Syzygies Aldo Conca
Introduction
Koszul algebras, introduced by Priddy in [P], are positively graded K-algebras R whose residue field K has a linear free resolution as an R-module. Here linear means that the non-zero entries of the matrices describing the maps in the R-free resolution of K have degree 1. For example, if $S = K[x_1, \dots, x_n]$ is the polynomial ring over a field K then K is resolved by the Koszul complex which is linear. In these lectures we deal with standard graded commutative K-algebras, that is, quotient rings of the polynomial ring S by homogeneous ideals. The program of the lectures is the following:
Lecture 1: Koszul algebras and Castelnuovo–Mumford regularity.
Lecture 2: Bounds for the degrees of the syzygies of Koszul algebras.
Lecture 3: Veronese algebras and algebras associated with collections of hyperspaces.
In the first lecture, based on the survey paper [CDR], we present various characterizations of Koszul algebras and strong versions of Koszulness. In the second lecture we describe recent results, obtained in cooperation with Avramov and Iyengar [ACI1, ACI2], on the bounds of the degrees of the syzygies of a Koszul algebra. Finally, the third lecture is devoted to the study of the Koszul property of Veronese algebras and of algebras associated with collections of hyperspaces and it is based on the papers [CDR, C].
A. Conca, Dipartimento di Matematica, Università degli Studi di Genova, Genova, Italy
A. Conca et al., Combinatorial Algebraic Geometry, Lecture Notes in Mathematics 2108, DOI 10.1007/978-3-319-04870-3_1, © Springer International Publishing Switzerland 2014
1 Koszul Algebras and Castelnuovo–Mumford Regularity
1.1 Notation
Let K be a field and R be a commutative standard graded K-algebra, that is a K-algebra with a decomposition $R = \bigoplus_{i \in \mathbb{N}} R_i$ (as an Abelian group) such that $R_0 = K$, the vector space $R_1$ has finite dimension and $R_i R_j = R_{i+j}$ for every $i, j \in \mathbb{N}$. Let S be the symmetric algebra of $R_1$ over K. In other words, S is the polynomial ring $K[x_1, \dots, x_n]$ where $n = \dim R_1$ and $x_1, \dots, x_n$ is a K-basis of $R_1$. One has an induced surjection
$$S = K[x_1, \dots, x_n] \to R \tag{1}$$
of standard graded K-algebras. We call (1) the canonical presentation of R. Hence R is isomorphic to $S/I$ where I is the kernel of the map (1). In particular, I is homogeneous and does not contain elements of degree 1. We say that I defines R. Denote by $\mathfrak{m}_R$ the maximal homogeneous ideal of R. We may consider K as a graded R-module via the identification $K = R/\mathfrak{m}_R$. Unless otherwise stated, we will always assume that K-algebras are standard graded, modules and ideals are graded and finitely generated, and module homomorphisms have degree 0. For an R-module $M = \bigoplus_{i \in \mathbb{Z}} M_i$ we denote by $\mathrm{HF}(M, i)$ the Hilbert function of M at i, that is $\mathrm{HF}(M, i) = \dim_K(M_i)$, and by
$$H_M(z) = \sum_{i \in \mathbb{Z}} \dim_K(M_i)\, z^i \in \mathbb{Q}[|z|][z^{-1}]$$
the associated Hilbert series. Given an integer $a \in \mathbb{Z}$ we will denote by $M(a)$ the graded R-module whose degree i component is $M_{i+a}$. In particular $R(-j)$ is a free R-module of rank 1 whose generator has degree j. A minimal graded free resolution of M as an R-module is a complex of free R-modules
$$\mathbf{F}: \cdots \to F_{i+1} \xrightarrow{\phi_{i+1}} F_i \xrightarrow{\phi_i} F_{i-1} \to \cdots \to F_1 \xrightarrow{\phi_1} F_0 \to 0$$
such that:
1. $H_i(\mathbf{F}) = 0$ for $i > 0$,
2. $H_0(\mathbf{F}) \simeq M$,
3. $\phi_{i+1}(F_{i+1}) \subseteq \mathfrak{m}_R F_i$ for every i.
Such a resolution exists and it is unique up to an isomorphism of complexes. We hence call it “the” minimal free (graded) resolution of M. By definition, the i-th Betti number $\beta_i^R(M)$ of M as an R-module is the rank of $F_i$. Each $F_i$ is a direct sum of shifted copies of R. The (i, j)-th graded Betti number $\beta_{ij}^R(M)$ of M is the number of copies of $R(-j)$ that appear in $F_i$. By construction one has $\beta_i^R(M) = \dim_K \mathrm{Tor}_i^R(M, K)$ and $\beta_{ij}^R(M) = \dim_K \mathrm{Tor}_i^R(M, K)_j$. Here and throughout the notes an index on the right of a graded module denotes the homogeneous component of that degree. The Poincaré series of M is defined as
$$P_M^R(z) = \sum_{i \in \mathbb{N}} \beta_i^R(M)\, z^i \in \mathbb{Q}[|z|],$$
and its bigraded version is
$$P_M^R(s, z) = \sum_{i \in \mathbb{N},\, j \in \mathbb{Z}} \beta_{i,j}^R(M)\, s^j z^i \in \mathbb{Q}[s, s^{-1}][|z|].$$
We set $t_i^R(M) = \sup\{j : \beta_{ij}^R(M) \neq 0\}$ where, by convention, $t_i^R(M) = -\infty$ if $F_i = 0$. By definition, $t_0^R(M)$ is the largest degree of a minimal generator of M. Two important invariants that measure the “growth” of the resolution of M as an R-module are the projective dimension
$$\mathrm{pd}_R(M) = \sup\{i : F_i \neq 0\} = \sup\{i : \beta_i^R(M) \neq 0\}$$
and the (relative) Castelnuovo–Mumford regularity
$$\mathrm{reg}_R(M) = \sup\{j - i : \beta_{ij}^R(M) \neq 0\} = \sup\{t_i^R(M) - i : i \in \mathbb{N}\}.$$
An R-module M has a linear resolution as an R-module if for some $d \in \mathbb{Z}$ one has $\beta_{ij}^R(M) = 0$ if $j \neq d + i$. Equivalently, M has a linear resolution as an R-module if it is generated by elements of degree $\mathrm{reg}_R(M)$. We may as well consider M as a module over the polynomial ring S via the map (1). The absolute Castelnuovo–Mumford regularity is, by definition, the regularity $\mathrm{reg}_S(M)$ of M as an S-module. It has also a cohomological interpretation via local duality, see for example [EG, Sect. 1] or [BH, 4.3.1]. Denote by $H^i_{\mathfrak{m}_S}(M)$ the i-th local cohomology module with support on the maximal ideal of S. One has $H^i_{\mathfrak{m}_S}(M) = 0$ if $i < \operatorname{depth} M$ or $i > \dim M$ and
$$\mathrm{reg}_S(M) = \max\{j + i : H^i_{\mathfrak{m}_S}(M)_j \neq 0\}.$$
Both $\mathrm{pd}_R(M)$ and $\mathrm{reg}_R(M)$ can be infinite.
Example 1.1. Let $R = K[x]/(x^v)$ with $v > 1$. Then the minimal free resolution of K over R is:
$$\cdots \to R(-2v) \to R(-v-1) \to R(-v) \to R(-1) \to R \to 0$$
where the maps are multiplication by $x$ or $x^{v-1}$ depending on the parity. Hence $F_{2i} = R(-iv)$ and $F_{2i+1} = R(-iv-1)$ so that $\mathrm{pd}_R(K) = \infty$ for every $v > 1$. Furthermore $\mathrm{reg}_R(K) = \infty$ if $v > 2$ and $\mathrm{reg}_R(K) = 0$ if $v = 2$.
Note that, in general, $\mathrm{reg}_R(M)$ is finite if $\mathrm{pd}_R(M)$ is finite. On the other hand, as we have seen in the example above, $\mathrm{reg}_R(M)$ can be finite even when $\mathrm{pd}_R(M)$ is infinite. In the study of minimal free resolutions over R, the minimal free resolution $K_R$ of the residue field K as an R-module plays a fundamental role. This is because $\mathrm{Tor}^R(M, K) = H(M \otimes_R K_R)$ and hence $\beta_{ij}^R(M) = \dim_K H_i(M \otimes_R K_R)_j$. A very important role is played also by the Koszul complex $K(\mathfrak{m}_R)$ on a minimal system of generators of the maximal ideal $\mathfrak{m}_R$ of R. The Koszul complex is the typical example of a differential graded algebra, DG-algebra for short.
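The shifts in Example 1.1 can be tabulated mechanically. The following Python sketch lists the degree shift $t_i$ of $F_i$ and the differences $t_i - i$, whose supremum is $\mathrm{reg}_R(K)$. The function names `shifts` and `regularity_profile` are illustrative and not notation from the text. The code only encodes the formulas $F_{2i} = R(-iv)$ and $F_{2i+1} = R(-iv-1)$ from the example; it does not compute a resolution.

```python
def shifts(v, length):
    """Degree shift t_i of F_i, i = 0..length-1, for K over K[x]/(x^v)."""
    # F_{2i} = R(-iv) and F_{2i+1} = R(-iv-1), as stated in Example 1.1.
    return [(i // 2) * v + (i % 2) for i in range(length)]


def regularity_profile(v, length):
    """The differences t_i - i; reg_R(K) is the supremum of these numbers."""
    return [t - i for i, t in enumerate(shifts(v, length))]


print(regularity_profile(2, 8))  # [0, 0, 0, 0, 0, 0, 0, 0]
print(regularity_profile(3, 8))  # [0, 0, 1, 1, 2, 2, 3, 3]
```

For $v = 2$ every difference is 0, in agreement with $\mathrm{reg}_R(K) = 0$, while for $v = 3$ the differences grow without bound as the resolution continues.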
1.2 DG-Algebras
A graded algebra $C = \bigoplus_{i \geq 0} C_i$ is graded-commutative if for every $a \in C_i$ and $b \in C_j$ one has:
$$ab = (-1)^{ij}\, ba$$
and furthermore $a^2 = 0$ whenever i is odd. A DG-algebra is a graded-commutative algebra $C = \bigoplus_{i \geq 0} C_i$ equipped with a linear differential $\partial : C \to C$ of degree $-1$ (i.e. $\partial^2 = 0$ and $\partial(C_i) \subseteq C_{i-1}$) that satisfies the “twisted” Leibniz rule:
$$\partial(ab) = \partial(a)\, b + (-1)^i\, a\, \partial(b)$$
whenever $a \in C_i$. The cycles $Z(C) = \ker \partial$, the boundaries $B(C) = \operatorname{Image} \partial$ and the homology $H(C) = Z(C)/B(C)$ of a DG-algebra C inherit the algebra structure from C. Precisely, $Z(C)$ is a graded-commutative subalgebra of C, $B(C)$ is a (two-sided) graded ideal of $Z(C)$ and hence $H(C)$ is a (graded-commutative) algebra. The component of $Z(C)$ of (homological) degree i is denoted by $Z_i(C)$. Similarly for the boundaries and the homology. Given a DG-algebra C and a cycle $z \in Z_i(C)$ there is a canonical way to “kill” z in homology by adding a “variable” to C preserving the DG-algebra structure. If i is even then one considers $D = C[e]$ where e is an exterior variable (hence $e^2 = 0$) of degree $i + 1$ and extends the differential by setting $\partial(e) = z$. If i is odd then one considers $D = C[s]$ where s is a polynomial variable (or a divided power variable) of degree $i + 1$ and extends the differential by setting $\partial(s) = z$. By construction, the element $z \in Z(C) \subseteq Z(D)$ is now a boundary of the complex D and hence it is 0 in homology. Furthermore, by construction, $H_j(C) = H_j(D)$ for $j < i$. This process can clearly be iterated. One can, for instance, “kill” all the cycles in a given homological degree by adding variables.
1.3 Koszul Complex
The Koszul complex can be described in the following way. Let R be any ring and let $I = (a_1, \dots, a_m)$ be an ideal of R. Consider R as a DG-algebra concentrated in degree 0 and the elements $a_1, \dots, a_m$ as cycles of that complex. Then we add exterior variables $e_1, \dots, e_m$ in degree 1 to R and obtain the DG-algebra
$$K(I, R) = R[e_1, \dots, e_m] \quad \text{with} \quad \partial(e_i) = a_i.$$
This is the Koszul complex associated with the ideal I and coefficients in the ring R. In other words, $K(I, R)$ is the exterior algebra $\bigwedge R^m$ equipped with the differential induced by $\partial(e_i) = a_i$ for $i = 1, \dots, m$. If M is an R-module we then set
$$K(I, M) = K(I, R) \otimes_R M.$$
This is the Koszul complex associated with the ideal I with coefficients in M. Denote by $Z(I, M)$ the module of cycles of the complex $K(I, M)$ and similarly by $B(I, M)$ its boundaries, by $C(I, M)$ its cokernel and by $H(I, M)$ its homology. We will denote by $K_i(I, M)$ the component of homological degree i of $K(I, M)$. When the coefficients of the Koszul complex are taken in R we use a simplified notation $K(I) = K(I, R)$, $Z(I) = Z(I, R)$ and so on. By definition we have:
$$H_0(I) = R/I \quad \text{and} \quad H_0(I, M) = M/IM.$$
Furthermore $H(I)$ is a (graded-commutative) algebra and $H(I, M)$ is an $H(I)$-module. In particular, $I\, H(I, M) = 0$. It is well-known that the Koszul complex $K(I)$ is acyclic (and hence an R-free resolution of $R/I$) if (and only if in the local or standard graded setting) the chosen generators $a_1, \dots, a_m$ of I form a regular sequence, see [BH, 1.6.14]. When $K(I)$ is not a free resolution one can nevertheless use the procedure of adding variables to kill homology degree by degree to obtain from $K(I)$ a free resolution of $R/I$ as an R-module with a DG-algebra structure. In the local or standard graded setting this can be done in the following way.
1. Set $T_1 = K(I)$.
2. Choose a minimal system of generators of $H_1(T_1)$ and a set of cycles $z_{1,1}, \dots, z_{1,u_1}$ representing them.
3. Add to $T_1$ a set of polynomial variables (or divided powers in positive characteristic) $Y_2 = \{s_{2,1}, \dots, s_{2,u_1}\}$ in degree 2 and set $T_2 = T_1[s_{2,1}, \dots, s_{2,u_1} : \partial(s_{2,i}) = z_{1,i}]$.
4. Choose a minimal system of generators of $H_2(T_2)$, a set of cycles $z_{2,1}, \dots, z_{2,u_2}$ representing them.
5. Add to $T_2$ a set of exterior variables $Y_3 = \{e_{3,1}, \dots, e_{3,u_2}\}$ in degree 3 and set $T_3 = T_2[e_{3,1}, \dots, e_{3,u_2} : \partial(e_{3,i}) = z_{2,i}]$
and so on. We obtain a DG-algebra $T = R[Y_1, Y_2, Y_3, \dots]$ that is an R-free resolution of $R/I$ which is (essentially) independent of the choices of the minimal system of generators of $H_i(T_i)$ and of the cycles representing them. It is called the Tate complex and we will denote it by
$$T(R, R/I).$$
We refer to [A] for a precise description of the construction and many beautiful results and questions concerning it. We just give two examples:
Example 1.2. Set $R = \mathbb{Q}[x]/(x^n)$, $a = \bar{x}$ and $I = (a)$. Then $T_1 = K(I) = R[e]$ with $\partial(e) = a$ and $z = a^{n-1} e$ generates the 1-cycles of $K(I)$. Hence the second round of Tate construction gives $T_2 = R[e, s]$ with $\partial(s) = a^{n-1} e$. It turns out that $T(R, R/I) = R[e, s]$ is a minimal resolution of $R/I$ over R:
$$\cdots \to R s^2 e \to R s^2 \to R s e \to R s \to R e \to R \to 0$$
where the maps are given, alternately, by multiplication with $a$ and $a^{n-1}$ up to non-zero scalars.
Example 1.3. Set $R = \mathbb{Q}[x, y]$ and $I = (x^2, xy)$. Then we have $T_1 = K(I) = R[e_1, e_2]$ with $\partial(e_1) = x^2$ and $\partial(e_2) = xy$. The cycle $y e_1 - x e_2$ generates $H_1(T_1)$. Hence the second round of Tate construction gives the DG-algebra $T_2 = R[e_1, e_2, s_1]$ with $\partial(s_1) = y e_1 - x e_2$. Now $H_2(T_2)$ is generated by $e_1 e_2 + x s_1$. Hence the third round of Tate construction gives $T_3 = R[e_1, e_2, s_1, e_3]$ with $\partial(e_3) = e_1 e_2 + x s_1$. Hence the beginning of the Tate complex is the following:
$$\cdots \to R e_3 \oplus R s_1 e_1 \oplus R s_1 e_2 \to R s_1 \oplus R e_1 e_2 \to R e_2 \oplus R e_1 \to R \to 0.$$
Note that the resolution in Example 1.2 is minimal while the one in Example 1.3 is not.
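As a sanity check on Examples 1.2 and 1.3, one can verify with the Leibniz rule of Sect. 1.2 that the elements used in the Tate construction are indeed cycles. The computation below is a worked sketch that uses only the differentials defined in the two examples.

```latex
% Example 1.2: a = \bar{x} in R = \mathbb{Q}[x]/(x^n), with \partial(e) = a.
\[
  \partial(a^{n-1} e) = a^{n-1}\,\partial(e) = a^{n} = 0 \quad \text{in } R.
\]
% Example 1.3: \partial(e_1) = x^2, \partial(e_2) = xy, \partial(s_1) = y e_1 - x e_2.
\begin{align*}
  \partial(y e_1 - x e_2) &= y\,x^2 - x\,(xy) = 0, \\
  \partial(e_1 e_2) &= \partial(e_1)\,e_2 - e_1\,\partial(e_2)
                     = x^2 e_2 - xy\,e_1 = -x\,(y e_1 - x e_2) = -x\,\partial(s_1), \\
  \partial(e_1 e_2 + x s_1) &= -x\,\partial(s_1) + x\,\partial(s_1) = 0.
\end{align*}
```

The minus sign in the second line is the factor $(-1)^i$ of the Leibniz rule applied to $a = e_1$, which has degree 1. It forces the coefficient of $s_1$ in the degree 2 cycle to be $+x$.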
1.4 Auslander–Buchsbaum–Serre
We return to the graded setting and assume that R is a standard graded K-algebra. When is $\mathrm{pd}_R(M)$ finite for every M? The answer is given by the Auslander–Buchsbaum–Serre Theorem, a result that, in the words of Avramov [A1, p. 32], “really started everything”. The graded variant of it is the following:
Theorem 1.4. The following conditions are equivalent:
(1) $\mathrm{pd}_R(M)$ is finite for every R-module M,
(2) $\mathrm{pd}_R(K)$ is finite,
(3) the Koszul complex $K(\mathfrak{m}_R)$ resolves K,
(4) R is a polynomial ring.
When the conditions hold, then for every M one has $\mathrm{pd}_R(M) \leq \mathrm{pd}_R(K) = \dim R$.
Remark 1.5. The Koszul complex $K(\mathfrak{m}_R)$ has three important features:
(1) it has finite length,
(2) it has a DG-algebra structure,
(3) the matrices describing its differentials have non-zero entries only of degree 1.
When R is not a polynomial ring the minimal free resolution $K_R$ of K as an R-module does not satisfy condition (1) of Remark 1.5. Can $K_R$ nevertheless satisfy conditions (2) and (3) of Remark 1.5? For condition (2) the answer is yes: $K_R$ has always a DG-algebra structure. Indeed a theorem, proved independently by Gulliksen and Schoeller, asserts that $K_R$ is obtained by using the Tate construction. Furthermore results of Assmus, Tate, Gulliksen and Halperin clarify when $K_R$ is finitely generated as an R-algebra. Again, we state the theorem in the graded setting and refer to [A, 6.3.5, 7.3.3, 7.3.4] for general statements and proofs.
Theorem 1.6. Let R be a standard graded K-algebra. Let $T = T(R, K) = R[Y_1, Y_2, Y_3, \dots]$ be the Tate complex associated with $K = R/\mathfrak{m}_R$. We have:
(1) T is the minimal graded resolution of K, i.e. $T = K_R$.
(2) T is finitely generated as an R-algebra if and only if R is a complete intersection. In that case, T is generated by elements of degree at most 2, i.e. $Y_i = \emptyset$ for $i > 2$.
(3) If R is not a complete intersection then $Y_i \neq \emptyset$ for every i.
Algebras R such that $K_R$ satisfies condition (3) in Remark 1.5 are called Koszul:
Definition 1.7. The K-algebra R is Koszul if the matrices describing the differentials in the minimal free resolution $K_R$ of K as an R-module have non-zero entries only of degree 1, that is, $\mathrm{reg}_R(K) = 0$ or, equivalently, $\beta_{ij}^R(K) = 0$ whenever $i \neq j$.
Koszul algebras were originally introduced by Priddy [P] in his study of homological properties of graded (non-commutative) algebras, see the volume [PP] of Polishchuk and Positselski for an overview and surprising aspects of the Koszul property. We collect below a list of important facts about Koszul commutative algebras. We always refer to the canonical presentation (1) of R as a quotient of the polynomial ring $S = \mathrm{Sym}(R_1)$ by the homogeneous ideal I.
First we introduce a definition.
Definition 1.8. We say that R is quadratic if its defining ideal I is generated by quadrics (homogeneous polynomials of degree 2).
Definition 1.9. We say that R is G-quadratic if its defining ideal I has a Gröbner basis of quadrics with respect to some coordinate system of $S_1$ and some term order on S. In other words, R is G-quadratic if there exists a K-basis of $S_1$, say $x_1, \dots, x_n$, and a term order $\tau$ such that the initial ideal $\mathrm{in}_\tau(I)$ of I with respect to $\tau$ is generated by monomials of degree 2.
Remark 1.10. For a standard graded K-algebra R one has
$$\beta_{2j}^R(K) = \begin{cases} \beta_{1j}^S(R) & \text{if } j \neq 2, \\ \beta_{12}^S(R) + \binom{n}{2} & \text{if } j = 2, \end{cases}$$
and hence the resolution of K, as an R-module, is linear up to homological position 2 if and only if R is quadratic. In particular, if R is Koszul, then R is quadratic. On the other hand there are algebras defined by quadrics that are not Koszul. For example if one takes $R = K[x, y, z, t]/(x^2, y^2, z^2, t^2, xy + zt)$ then one has $\beta_{34}^R(K) = 5$ and hence R is not Koszul.
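For the ring in this example the formula of Remark 1.10 can be evaluated by hand. The sketch below assumes that the five listed quadrics form a minimal generating set of the defining ideal, so that $\beta_{12}^S(R) = 5$ and $\beta_{1j}^S(R) = 0$ for $j \neq 2$.

```latex
\[
  n = 4, \qquad
  \beta_{22}^R(K) = \beta_{12}^S(R) + \binom{4}{2} = 5 + 6 = 11, \qquad
  \beta_{2j}^R(K) = \beta_{1j}^S(R) = 0 \ \text{ for } j \neq 2.
\]
```

So the resolution of K over this R is linear through homological position 2, and the failure of the Koszul property first shows up in homological position 3, through $\beta_{34}^R(K) \neq 0$.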
Remark 1.11. If I is generated by monomials of degree 2 with respect to some coordinate system of $S_1$, then a filtration argument, that we reproduce in Theorem 3.12, shows that R is Koszul. More precisely, for every subset Y of variables $R/(Y)$ has an R-linear resolution.
Remark 1.12. If I is generated by a regular sequence of quadrics, then R is Koszul. This follows from Theorem 1.6 because if R is a complete intersection of quadrics, then $K_R$ is obtained from $K(\mathfrak{m}_R)$ by adding polynomial variables in homological degree 2 and internal degree 2 to kill $H_1(K(\mathfrak{m}_R))$. For example, if $R = \mathbb{Q}[x_1, x_2, x_3, x_4]/(x_1^2 + x_2^2, x_3 x_4)$ then the Tate resolution of K over R is the DG-algebra $R[e_1, e_2, e_3, e_4, s_1, s_2]$ with differential induced by $\partial(e_i) = x_i$ and $\partial(s_1) = x_1 e_1 + x_2 e_2$ and $\partial(s_2) = x_3 e_4$. Here the $e_i$’s have internal degree 1 and the $s_i$’s have internal degree 2.
Remark 1.13. If R is G-quadratic, then R is Koszul. This follows from Remark 1.11 and from the standard deformation argument showing that $\beta_{ij}^R(K) \leq \beta_{ij}^A(K)$ with $A = S/\mathrm{in}_\tau(I)$. For details see, for instance, [BC, Sect. 3].
Remark 1.14. On the other hand there are Koszul algebras that are not G-quadratic. One notes that an ideal defining a G-quadratic algebra must contain quadrics of “low” rank. For instance, if R is Artinian and G-quadratic then its defining ideal must contain the square of a linear form. But most Artinian complete intersections of quadrics do not contain the square of a linear form. For example, the ideal $I = (x^2 + yz, y^2 + xz, z^2 + xy) \subset S = \mathbb{C}[x, y, z]$ is a complete intersection not containing the square of a linear form and $S/I$ is Artinian. Hence $S/I$ is Koszul and not G-quadratic. See [ERT] for general results in this direction.
Remark 1.15. The Poincaré series $P_K^R(z)$ of K as an R-module can be irrational (even for rings with $R_3 = 0$), see [An]. However for a Koszul algebra R one has
$$P_K^R(z)\, H_R(-z) = 1 \tag{2}$$
and hence $P_K^R(z)$ is rational. Indeed the equality (2) turns out to be equivalent to the Koszul property of R, see for instance [F].
Remark 1.16. A necessary (but not sufficient) numerical condition for R to be Koszul is that the formal power series $1/H_R(-z)$ has non-negative coefficients (indeed positive unless R is a polynomial ring). Another numerical condition is the following. Expand the formal power series $1/H_R(-z)$ as
$$\frac{\prod_{h \in 2\mathbb{N}+1} (1 + z^h)^{e'_h}}{\prod_{h \in 2\mathbb{N}+2} (1 - z^h)^{e'_h}}$$
with $e'_h \in \mathbb{Z}$. This can be done in a unique way, see [A, 7.1.1]. Furthermore set $e_h(R) = \# Y_h$ where $Y_h$ is the set of variables that we add at the h-th iteration of the Tate construction of the minimal free resolution of K over R. The numbers $e_h(R)$ are called the deviations of R. If R is Koszul then $e'_h = e_h(R)$ for every h and hence $e'_h \geq 0$. More precisely, $e'_h > 0$ for every h unless R is a complete intersection. For example, the Hilbert series of the ring in Remark 1.10 is
$$H(z) = 1 + 4z + 5z^2.$$
Expanding the series $1/H(-z)$ one sees that the coefficient of $z^6$ is negative. Furthermore the corresponding $e'_3$ is 0. So for two numerical reasons an algebra with Hilbert series $H(z)$ cannot be Koszul.
The following characterization of the Koszul property in terms of regularity is formally similar to the Auslander–Buchsbaum–Serre Theorem 1.4.
Theorem 1.17 (Avramov–Eisenbud–Peeva). The following conditions are equivalent:
(1) $\mathrm{reg}_R(M)$ is finite for every R-module M,
(2) $\mathrm{reg}_R(K)$ is finite,
(3) R is Koszul.
Avramov and Eisenbud proved in [AE] that every module M has finite regularity over a Koszul algebra R by showing that $\mathrm{reg}_R(M) \leq \mathrm{reg}_S(M)$. Avramov and Peeva showed in [AP] that if $\mathrm{reg}_R(K)$ is finite then it must be 0. Indeed they proved a more general result for graded algebras that are not necessarily standard.
We collect below further remarks, examples and questions relating the Koszul property and the existence of Gröbner bases of quadrics in various ways. Let us recall the following:
Definition 1.18. A K-algebra R is LG-quadratic (where the L stands for lifting) if there exist a G-quadratic algebra A and a regular sequence of linear forms $y_1, \dots, y_c$ such that $R \simeq A/(y_1, \dots, y_c)$. We have the following implications:
$$\text{G-quadratic} \Rightarrow \text{LG-quadratic} \Rightarrow \text{Koszul} \Rightarrow \text{quadratic} \tag{3}$$
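The two numerical obstructions of Remark 1.16 for $H(z) = 1 + 4z + 5z^2$ can be checked with a few lines of integer arithmetic. The Python sketch below expands $1/H(-z)$ as a truncated power series and then peels off the factors $(1 + z^h)^{e'_h}$ for odd h and $(1 - z^h)^{-e'_h}$ for even h, one degree at a time. The function names are illustrative, and the procedure is a direct elementary computation, not the method of [A, 7.1.1].

```python
def inverse_series(h, n):
    """Coefficients c_0..c_n of 1/h(z); h is a coefficient list with h[0] == 1."""
    c = [1] + [0] * n
    for k in range(1, n + 1):
        c[k] = -sum(h[i] * c[k - i] for i in range(1, min(k, len(h) - 1) + 1))
    return c


def mul_factor(p, h, sign):
    """Multiply the truncated series p by (1 + sign * z^h)."""
    return [p[k] + (sign * p[k - h] if k >= h else 0) for k in range(len(p))]


def div_factor(p, h, sign):
    """Divide the truncated series p by (1 + sign * z^h)."""
    q = list(p)
    for k in range(h, len(q)):
        q[k] -= sign * q[k - h]
    return q


def product_exponents(p, n):
    """Exponents e'_1..e'_n in the product expansion of p, modulo z^(n+1)."""
    q = [1] + [0] * n  # product of the factors found so far
    exps = []
    for h in range(1, n + 1):
        e = p[h] - q[h]  # each factor with exponent e contributes e * z^h
        exps.append(e)
        for _ in range(abs(e)):
            if h % 2 == 1:   # factor (1 + z^h)^e in the numerator
                q = mul_factor(q, h, 1) if e > 0 else div_factor(q, h, 1)
            else:            # factor (1 - z^h)^e in the denominator
                q = div_factor(q, h, -1) if e > 0 else mul_factor(q, h, -1)
    return exps


H = [1, 4, 5]                                        # H(z) = 1 + 4z + 5z^2
H_minus = [c * (-1) ** i for i, c in enumerate(H)]   # H(-z)
coeffs = inverse_series(H_minus, 6)
print(coeffs)                        # [1, 4, 11, 24, 41, 44, -29]
print(product_exponents(coeffs, 3))  # [4, 5, 0]
```

The coefficient of $z^6$ is $-29$, and the exponents come out as $e'_1 = 4$, $e'_2 = 5$, $e'_3 = 0$. The first two match the number of variables and of quadrics of the ring in Remark 1.10.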
As discussed in Remarks 1.10 and 1.16 the third implication in (3) is strict. The following remark, due to Caviglia, in connection with Remark 1.14 shows that also the first implication in (3) is strict.
Remark 1.19. Any complete intersection R of quadrics is LG-quadratic. Say $R = K[x_1, \dots, x_n]/(q_1, \dots, q_m)$ is a complete intersection of quadrics. Then set
$$A = K[x_1, \dots, x_n, y_1, \dots, y_m]/(y_1^2 + q_1, \dots, y_m^2 + q_m)$$
and note that A is G-quadratic because the initial ideal of its defining ideal with respect to a lex term order satisfying $y_i > x_j$ for every i, j is $(y_1^2, \dots, y_m^2)$. Furthermore $y_1, \dots, y_m$ is a regular sequence in A because $A/(y_1, \dots, y_m) \simeq R$ and $\dim A - \dim R = m$.
In [Ca1, 1.2.6], [ACI1, 6.4] and [CDR, 12] it is asked whether a Koszul algebra is also LG-quadratic. The following example gives a negative answer to the question.
Example 1.20. Let $R = K[a, b, c, d]/(ac, ad, ab - bd, a^2 + bc, b^2)$. The Hilbert series of R is
$$\frac{1 + 2z - 2z^2 - 2z^3 + 2z^4}{(1 - z)^2}.$$
Also, R is Koszul as can be shown using a Koszul filtration argument, see Example 3.8. The h-polynomial does not change under lifting with regular sequences of linear forms. Hence to check that R is not LG-quadratic it is enough to check that there is no algebra with quadratic monomial relations and with h-polynomial equal to
$$h(z) = 1 + 2z - 2z^2 - 2z^3 + 2z^4.$$
In general, if J is an ideal in a polynomial ring A not containing linear forms and the h-polynomial of $A/J$ is $1 + h_1 z + h_2 z^2 + \dots$ then J has codimension $h_1$ and exactly $\binom{h_1 + 1}{2} - h_2$ quadratic generators. Now consider a quadratic monomial ideal J in a polynomial ring A with, say, n variables such that the h-polynomial of $A/J$ is $h(z)$. Such a J must have codimension 2 and 5 generators. So J is generated by 5 monomials chosen among the generators of $(a, b)(a, b, c, d, e, f, g)$ where $a, b, \dots, g$ are distinct variables. But an exhaustive CoCoA [CoCoA] computation shows that such a selection does not exist.
An interesting example of LG-quadratic algebra is the following:
Example 1.21. Let $R = K[a, b, c, d]/(a^2 - bc, d^2, cd, b^2, ac, ab)$. The Hilbert series of R is
$$\frac{1 + 3z - 3z^3}{1 - z}.$$
The h-polynomial $1 + 3z - 3z^3$ is not the h-polynomial of a quadratic monomial ideal in four variables. It is however the h-polynomial of a (unique up to permutations of the variables) quadratic monomial ideal in five variables, namely $U_1 = (a^2, b^2, ad, cd, be, ce)$. In six variables there is another quadratic monomial ideal with that h-polynomial. It is $U_2 = (a^2, ad, bd, be, ce, cf)$. Another one in seven variables, $U_3 = (ad, bd, ae, ce, bf, cg)$. And that is all. It turns out that R is LG-quadratic since it lifts to $K[a, b, c, d, e]/J$ with $J = (a^2 - bc + be, d^2, cd, b^2 + eb, ac, ab + ae)$ and $\mathrm{in}_\tau(J)$ is $U_1$ (up to a permutation of the variables) where $\tau$ is the rev.lex. order associated with the total order of the variables $e > d > b > c > a$. The ring of Example 1.20 is not LG-quadratic because of the obstruction in the h-polynomial, i.e., there are no quadratic monomial ideals with that h-polynomial. It would be interesting to identify a Koszul algebra with a nonobstructed h-polynomial that is not LG-quadratic.
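The h-polynomials quoted for $U_1$ and $U_2$ can be recomputed by counting monomials. The Python sketch below is not the CoCoA computation mentioned in the text and it does not search over ideals; it only handles one given quadratic monomial ideal at a time. It computes the Hilbert function of the quotient degree by degree and multiplies the resulting series by $(1 - z)^{\dim}$. The function names are illustrative, and the dimension is passed in by hand as the number of variables minus the codimension $h_1 = 3$, which is an assumption taken from the statement above.

```python
from itertools import combinations_with_replacement
from math import comb


def hilbert_function(variables, gens, max_deg):
    """HF(k) for k = 0..max_deg of K[variables]/(gens), gens quadratic monomials.

    Each generator is a two-letter string such as "aa" or "ad".
    """
    quad = [tuple(sorted(g)) for g in gens]
    values = []
    for k in range(max_deg + 1):
        count = 0
        # A sorted tuple of k letters encodes a monomial of degree k.
        for mono in combinations_with_replacement(sorted(variables), k):
            divisible = any(
                mono.count(g[0]) >= 2 if g[0] == g[1]
                else (g[0] in mono and g[1] in mono)
                for g in quad
            )
            count += not divisible
        values.append(count)
    return values


def h_polynomial(hf, dim):
    """Coefficients of (1 - z)^dim * sum_k hf[k] z^k up to degree len(hf) - 1."""
    return [
        sum((-1) ** i * comb(dim, i) * hf[k - i] for i in range(min(k, dim) + 1))
        for k in range(len(hf))
    ]


def quadric_count(h1, h2):
    """Number of quadratic generators predicted by binom(h1 + 1, 2) - h2."""
    return comb(h1 + 1, 2) - h2


U1 = ["aa", "bb", "ad", "cd", "be", "ce"]
hf_U1 = hilbert_function("abcde", U1, 5)
print(hf_U1)                   # [1, 5, 9, 10, 11, 12]
print(h_polynomial(hf_U1, 2))  # [1, 3, 0, -3, 0, 0]
print(quadric_count(2, -2), quadric_count(3, 0))  # 5 6
```

The last line reproduces the generator counts used in the text: 5 quadrics for the h-polynomial of Example 1.20 and 6 for that of Example 1.21.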
2 Syzygies of Koszul Algebras

The second lecture is based on the results obtained jointly with Avramov and Iyengar and published in [ACI1, ACI2].
Given a standard graded $K$-algebra $R$ with presentation $R = S/I$ where $S = K[x_1, \dots, x_n]$ and $I \subset S$ is a homogeneous ideal, we set

$$t_i^S(R) = \sup\{ j : \beta_{ij}^S(R) \neq 0 \},$$

i.e., $t_i^S(R)$ is the highest degree of a minimal $i$-th syzygy of $R$ as an $S$-module. In particular, $t_0^S(R) = 0$ and $t_1^S(R)$ equals the highest degree of a minimal generator of $I$. The starting point of the discussion is the following observation.

Remark 2.1. If $I$ is generated by monomials of degree 2, then:
(1) $t_i^S(R) \le 2i$ for every $i$,
(2) $\operatorname{reg}_S(R) \le \operatorname{pd}_S(R)$,
(3) if $t_i^S(R) < 2i$ for some $i$ then $t_{i+1}^S(R) < 2(i+1)$,
(4) $t_i^S(R) < 2i$ if $i > \dim S - \dim R$,
(5) $\beta_i^S(R) \le \binom{\beta_1^S(R)}{i}$,
(6) $\operatorname{pd}_S(R) \le \beta_1^S(R)$.
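As a small illustration of inequalities (1), (5) and (6): the Taylor resolution of a monomial ideal with $r$ generators has, in homological position $i$, one basis element for each $i$-subset of the generators, placed in the degree of their least common multiple. The Python sketch below reads the resulting upper bounds off a list of generators. The helper name `taylor_bounds`, the string encoding of monomials and the choice of the ideal $(a^2, b^2, ad, cd, be, ce)$ as test case are ours.

```python
from collections import Counter
from itertools import combinations
from math import comb


def taylor_bounds(gens):
    """Return (shifts, ranks) read off the Taylor resolution of the ideal (gens).

    shifts[i] = largest degree of an lcm of i generators (upper bound for t_i),
    ranks[i]  = number of i-subsets of the generators    (upper bound for beta_i).
    Monomials are strings of variable names, e.g. "aa" for a^2.
    """
    r = len(gens)
    shifts = []
    for i in range(r + 1):
        best = 0
        for subset in combinations(gens, i):
            lcm = Counter()
            for mono in subset:
                lcm |= Counter(mono)          # exponent-wise maximum
            best = max(best, sum(lcm.values()))
        shifts.append(best)
    ranks = [comb(r, i) for i in range(r + 1)]
    return shifts, ranks


shifts, ranks = taylor_bounds(["aa", "bb", "ad", "cd", "be", "ce"])
print(shifts)   # [0, 2, 4, 6, 7, 7, 7]
print(ranks)    # [1, 6, 15, 20, 15, 6, 1]
```

Since every generator has degree 2, an lcm of $i$ generators has degree at most $2i$, which is inequality (1); the binomial ranks give (5), and the length $r = \beta_1^S(R)$ of the complex gives (6).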
These inequalities are deduced from the (non-minimal) Taylor resolution of quadratic monomial ideals, see for instance [MS, 4.3.2]. Suppose now that, combining and iterating the following operations:
(a) changes of coordinates,
(b) the formation of initial ideals with respect to weights or term orders,
(c) liftings and specializations with regular sequences of degree 1,
the algebra $R$ deforms to an algebra $R' = S'/J$ with $S'$ a polynomial ring and $J$ generated by quadratic monomials. Then $R$ satisfies the inequalities (1), (2), (5) and (6) because the Betti numbers and the $t_i$'s can only grow passing from $R$ to $R'$ and $\beta_1^S(R) = \beta_{12}^S(R)$ equals $\beta_1^{S'}(R') = \beta_{12}^{S'}(R')$. This observation suggests the following question:

Question 2.2. Are the bounds of Remark 2.1 valid for every Koszul algebra?

In [ACI1] we have proved that (1), (2), (3) and (4) hold for every Koszul algebra. As far as we know it is still open whether (5) and (6) hold as well for Koszul algebras. The inequality (1) for Koszul algebras [and its immediate consequence (2)] has a short proof that we present below, see Lemma 2.10. In [ACI2] stronger limitations for the degrees of the syzygies of Koszul algebras are described (under mild assumptions on the characteristic of the base field).

To explain the results in [ACI1] and [ACI2] we start from some general considerations concerning bounds on the $t_i$'s. For $S = K[x_1, \dots, x_n]$ and $R = S/I$ (and no assumptions on $R$), a very basic question is whether one can bound $t_i^S(R)$ only in terms of $t_1^S(R)$ and the index $i$.
The answer is negative in this generality, but it is positive if one involves also the number of variables $n$. Indeed, in [BM] and [CaS] it is proved that

$$t_i^S(R) \le \bigl(2\,t_1^S(R)\bigr)^{2^{n-2}} - 1 + i. \qquad (4)$$
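To get a feeling for the size of the right-hand side of (4), the following Python sketch evaluates it next to the linear bound $2i$ of Remark 2.1(1) for ideals generated by quadratic monomials. The function names are ours and the sample values of $n$ and $i$ are arbitrary.

```python
def general_shift_bound(t1, n, i):
    """Right-hand side of (4): (2*t1)^(2^(n-2)) - 1 + i, for n >= 2 variables."""
    return (2 * t1) ** (2 ** (n - 2)) - 1 + i


def quadratic_monomial_shift_bound(i):
    """Bound of Remark 2.1(1) for ideals generated by quadratic monomials."""
    return 2 * i


# Generators of degree t1 = 2, third syzygies, growing number of variables.
for n in (2, 3, 4, 5):
    print(n, general_shift_bound(2, n, 3), quadratic_monomial_shift_bound(3))
```

For $t_1 = 2$ and $i = 3$ the general bound is $6$, $18$, $258$ and $65538$ for $n = 2, 3, 4, 5$, while the bound of Remark 2.1(1) stays at $6$.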
Furthermore variations of the Mayr–Meyer ideals define algebras having a doubly exponential syzygy growth of the kind of the right-hand side of (4) (but with slightly different coefficients), see [BS, Ko]. So, without any further assumption, one cannot expect any better bound for the $t_i^S(R)$ in terms of $t_1^S(R)$ than the one derived from (4). Things change drastically if either $R$ is defined by monomials (i.e. $I$ is a monomial ideal) or $R$ is Koszul. Under these assumptions we have that

$$t_i^S(R) \le t_1^S(R)\, i \qquad (5)$$

holds for every $i$. In particular

$$t_i^S(R) \le 2i \quad \text{for every } i \qquad (6)$$

holds for Koszul algebras since $t_1^S(R) = 2$. When $R$ is defined by monomials (5) is derived from the Taylor resolution, while when $R$ is Koszul (6) is proved by Kempf [K], in [ACI1] and also in an unpublished manuscript of Backelin (Relations between rates of growth of homologies, 1988). One can ask whether the inequalities (5) and (6) are special cases of, or can be derived from, more general statements. We consider the following generalization of (5):

$$t_{i+j}^S(R) \le t_i^S(R) + t_j^S(R) \quad \text{for every } i \text{ and } j. \qquad (7)$$
No counterexample is known to us to the validity of (7) for algebras with monomial relations or for Koszul algebras. Also, (7) for $j = 1$ holds for algebras defined by monomials, see [FG, 1.9] where the statement is proved when $R$ is defined by monomials of degree 2 and [HS] for the general case. Furthermore in [EHU, 4.1] it is proved that (7) holds for algebras of Krull dimension at most 1 when $i + j = n$, see also [Mc] for related results.

Denote by

$$Z = \bigoplus_{i \ge 0} Z_i = \bigoplus_{i \ge 0} Z_i(m_R), \qquad B = \bigoplus_{i \ge 0} B_i = \bigoplus_{i \ge 0} B_i(m_R),$$

and by

$$H = \bigoplus_{i \ge 0} H_i = \bigoplus_{i \ge 0} H_i(m_R)$$

the modules of cycles, boundaries and homology of the Koszul complex $K(m_R)$ associated with the maximal homogeneous ideal $m_R$ of $R$. Similarly, $Z_i(m_R, M)$ stands for the $i$-th cycles of $K(m_R, M) = K(m_R) \otimes M$ and so on. By construction,

$$t_i^S(R) = \sup\{ j : (H_i)_j \neq 0 \}. \qquad (8)$$
For a Koszul algebra $R$ we have shown in [ACI1] that the map

$$\textstyle\bigwedge^i H_1 \to H_i \qquad (9)$$

induced by the multiplicative structure on $H$ is surjective in degree $2i$ and higher. Hence (6) (for Koszul algebras) is an immediate corollary of that assertion. The inequality (7) would follow from a similar statement regarding the multiplication map

$$H_i \otimes H_j \to H_{i+j}. \qquad (10)$$
Indeed it would be enough to prove that the map (10) is surjective in degree $t_i^S(R) + t_j^S(R)$ and higher. Unfortunately we are not able to evaluate directly the cokernel of the map (10). Instead we can get some information by using the splitting map for Koszul cycles described originally in [BCR2] and rediscussed in [ACI2] from a more general perspective. Indeed in [ACI2, 2.2] it is proved that:

Theorem 2.3. Let $M$ be a graded $R$-module. For every $i, j$ there is a natural map (of degree 0)

$$\operatorname{Tor}_1^R(C_{i-1}, Z_j(m_R, M)) \to H_{i+j}(m_R, M)/H_i H_j(m_R, M) \qquad (11)$$

that is surjective provided $R$ has characteristic 0 or large. Here $C_{i-1}$ denotes the cokernel of the Koszul complex $K(m_R)$ in homological position $i-1$.

Taking $M = R$ one obtains a natural map:

$$\operatorname{Tor}_1^R(C_{i-1}, Z_j) = \operatorname{Tor}_1^R(B_{i-1}, B_{j-1}) \to H_{i+j}/H_i H_j$$

that is surjective in characteristic 0 or large. Note that $\operatorname{Tor}_1^R(B_{i-1}, B_{j-1})$ is a finite length module because the $B_u$'s are free when localized at any non-maximal prime homogeneous ideal. In particular one has:

Proposition 2.4. Set $T_{ij} = \operatorname{Tor}_1^R(B_{i-1}, B_{j-1})$. If $R$ has characteristic 0 or large then

$$t_{i+j}^S(R) \le \max\{ t_i^S(R) + t_j^S(R),\ \operatorname{reg}_S T_{ij} \} \qquad (12)$$

where $\operatorname{reg}_S T_{ij} = \max\{ v : (T_{ij})_v \neq 0 \}$.

In order to evaluate $\operatorname{reg}_S T_{ij}$ we have developed in [ACI2] a long and technically complicated inductive procedure. The results obtained take a simpler form in the Cohen–Macaulay case because, under such assumption, the $t_i^S(R)$'s form an increasing sequence. We have:

Theorem 2.5. Let $R$ be a Koszul and Cohen–Macaulay algebra of characteristic 0. Then

$$t_{i+1}^S(R) \le t_i^S(R) + t_1^S(R) = t_i^S(R) + 2 \qquad (13)$$

and

$$t_{i+j}^S(R) \le \max\{ t_i^S(R) + t_j^S(R),\ t_{i-1}^S(R) + t_{j-1}^S(R) + 3 \} \qquad (14)$$

hold for every $i$ and $j$.

Furthermore one also deduces:

Theorem 2.6. If $R$ is a Koszul algebra of characteristic 0 satisfying the Green–Lazarsfeld $N_p$ condition for some $p > 1$ (i.e. $\beta_{ij}^S(R) = 0$ for every $i = 1, 2, \dots, p$ and every $j > i + 1$) then

$$\operatorname{reg}_S(R) \le 2 \left\lfloor \frac{\operatorname{pd}_S R}{p+1} \right\rfloor + 1 \qquad (15)$$

where the "$+1$" can be omitted if $p + 1$ divides $\operatorname{pd}_S R$.

For more general results the reader can consult [ACI2, Sect. 5].

Remark 2.7. The problem of bounding the regularity of Tor-modules has been studied in [EHU]. Let $S = K[x_1, \dots, x_n]$, and let $M$ and $N$ be finitely generated graded $S$-modules. It is proved in [EHU] that if $\dim \operatorname{Tor}_1^S(M, N) \le 1$, then for every $i$ one has

$$\operatorname{reg}_S \operatorname{Tor}_i^S(M, N) \le \operatorname{reg}_S(M) + \operatorname{reg}_S(N) + i. \qquad (16)$$
Unfortunately, the formula (16) does not hold if we replace the polynomial ring $S$ with a Koszul ring $R$. For example with

$$R = K[x, y]/(x^2 + y^2), \qquad M = R/(\bar{x}) \quad \text{and} \quad N = R/(\bar{y})$$

one has

$$\operatorname{reg}_R M = \operatorname{reg}_R N = 0 \quad \text{and} \quad \operatorname{Tor}_1^R(M, N) = K(-2),$$

so that $\operatorname{reg}_R \operatorname{Tor}_1^R(M, N) = 2$. Nevertheless variations of (16) (e.g. compute the Tor over $R$ but regularity over $S$ or add a correction term on the left depending on $\operatorname{reg}_S(R)$) might hold in general.

The regularity bound in Theorem 2.6 is much weaker than the logarithmic one obtained by Dao, Huneke and Schweig in [DHS, 4.8]. They showed that an algebra $R$ with monomial quadratic relations and satisfying the property $N_p$ for some $p > 1$ has a very low regularity compared with its embedding dimension. Their result asserts that for a given $p > 1$ there exist $f_p(x) \in \mathbb{R}[x]$ and $\alpha_p \in \mathbb{R}$ with $\alpha_p > 1$ (which are explicitly given in the paper) such that

$$\operatorname{reg}_S R \le \log_{\alpha_p} f_p(n) \qquad (17)$$
holds for every algebra $R$ with quadratic monomial relations such that $R$ has the property $N_p$ and has embedding dimension $n$. This type of bound cannot hold for Koszul algebras satisfying $N_p$ no matter what $f_p(x) \in \mathbb{R}[x]$ and $\alpha_p \in (1, \infty)$ are. To show this, it is enough to describe a family of algebras $\{R_{p,m}\}$ with $p, m \in \mathbb{N}$ and $p > 1$ such that:
1. the algebra $R_{p,m}$ is Koszul and satisfies the $N_p$-property,
2. given $p$, the embedding dimension of $R_{p,m}$ is a polynomial function of $m$ and the regularity of $R_{p,m}$ is linear in $m$.
For example, let $R_{p,m}$ be the $p$-th Veronese subalgebra of a polynomial ring in $pm$ variables. Then $R_{p,m}$ is Koszul, it satisfies the $N_p$-property, it has regularity $(p-1)m$, and has embedding dimension equal to $\binom{pm+p-1}{p}$, see [BCR1].

Question 2.8. Consider the coordinate ring of the Grassmannian $G(2, n)$. It is defined by the Pfaffians of degree 2 (the $4 \times 4$ Pfaffians) in an $n \times n$ skew-symmetric generic matrix. We know that it is Koszul, it satisfies the $N_2$ condition (by work of Kurano [Ku]), it has regularity $n - 3$ and codimension $\binom{n-2}{2}$. Hence for this family the codimension is quadratic in the regularity. Does there exist a family like this (Koszul with the $N_2$ property) such that the codimension is linear in the regularity?

Remark 2.9. If we apply Theorem 2.3 in the case $R = S = K[x_1, \dots, x_n]$ with $K$ of characteristic 0 or large and $M$ any graded module then we have a surjection

$$\operatorname{Tor}_1^S(C_{i-1}, Z_j(M)) \to H_{i+j}(m_S, M)$$

because $H_i = 0$. Here we have set for simplicity $Z_j(M) = Z_j(m_S, M)$. Since

$$\operatorname{Tor}_1^S(C_{i-1}, Z_j(M)) = H_i(m_S, Z_j(M))$$

we obtain

$$\beta_{i,v}^S(Z_j(M)) \ge \beta_{i+j,v}^S(M) \qquad (18)$$
for all $i, j, v$ and every $M$.

We conclude by presenting a short proof of the inequality $t_i^S(R) \le 2i$ for Koszul algebras and a related question.

Lemma 2.10. Let $R$ be a Koszul algebra and $Z_i = Z_i(m_R)$ the cycles of the Koszul complex $K(m_R, R)$ associated with the maximal homogeneous ideal of $R$. Then $\operatorname{reg}_R(Z_i) \le 2i$ for every $i$. In particular, $t_i^S(R) \le 2i$ for every $i$.

Proof. For $i > 0$ we have short exact sequences:

$$0 \to Z_i \to K_i \to B_{i-1} \to 0$$
$$0 \to B_{i-1} \to Z_{i-1} \to H_{i-1} \to 0.$$

Hence one has:

$$\operatorname{reg}_R(Z_i) = \operatorname{reg}_R(B_{i-1}) + 1 \quad \text{and} \quad \operatorname{reg}_R(B_{i-1}) \le \max\{ \operatorname{reg}_R(Z_{i-1}),\ \operatorname{reg}_R(H_{i-1}) + 1 \}.$$

Hence

$$\operatorname{reg}_R(Z_i) \le \max\{ \operatorname{reg}_R(Z_{i-1}) + 1,\ \operatorname{reg}_R(H_{i-1}) + 2 \}.$$

Since $m_R H_{i-1} = 0$ and $R$ is Koszul we have

$$\operatorname{reg}_R(H_{i-1}) = t_0^R(H_{i-1}) \le t_0^R(Z_{i-1}) \le \operatorname{reg}_R(Z_{i-1}).$$

It follows that

$$\operatorname{reg}_R(Z_i) \le \max\{ \operatorname{reg}_R(Z_{i-1}) + 1,\ \operatorname{reg}_R(Z_{i-1}) + 2 \} = \operatorname{reg}_R(Z_{i-1}) + 2.$$

Since $Z_0 = R$ one has $\operatorname{reg}_R(Z_0) = 0$ and it follows by induction that $\operatorname{reg}_R(Z_i) \le 2i$. Since

$$t_i^S(R) = t_0^R(H_i) \le t_0^R(Z_i) \le \operatorname{reg}_R(Z_i) \le 2i$$

we may conclude that $t_i^S(R) \le 2i$. □

With the assumptions and notation of Lemma 2.10 one can ask:

Question 2.11. Does the inequality

$$\operatorname{reg}_R(Z_{i+j}) \le \operatorname{reg}_R(Z_i) + \operatorname{reg}_R(Z_j)$$

hold for every $i, j$?

For similar questions and results the reader can consult [CM].
3 Veronese Rings and Algebras Associated with Families of Hyperspaces

In the third lecture we present two case studies: the Koszul properties of Veronese rings and of algebras associated with families of hyperspaces. The material we present is taken from [ABH, BF, B1, BaM, Ca, CC, CHTV, CTV, CRV, ERT].

3.1 Veronese Rings

We will use the following results whose proofs can be found, for example, in the survey paper [CDR].

Lemma 3.1. Let $R$ be a standard graded $K$-algebra. Let

$$\mathbf{M} : \cdots \to M_i \to \cdots \to M_2 \to M_1 \to M_0 \to 0$$

be a complex of $R$-modules. Set $H_i = H_i(\mathbf{M})$. Then for every $i \ge 0$ one has

$$t_i^R(H_0) \le \max\{ \alpha_i, \beta_i \}$$

where $\alpha_i = \max\{ t_j^R(M_{i-j}) : j = 0, \dots, i \}$ and $\beta_i = \max\{ t_j^R(H_{i-j-1}) : j = 0, \dots, i-2 \}$. Moreover one has

$$\operatorname{reg}_R(H_0) \le \max\{ \alpha, \beta \}$$

where $\alpha = \sup\{ \operatorname{reg}_R(M_j) - j : j \ge 0 \}$ and $\beta = \sup\{ \operatorname{reg}_R(H_j) - (j+1) : j \ge 1 \}$.
Theorem 3.2. Let $A$ be a standard graded $K$-algebra, $J \subset A$ a homogeneous ideal and $B = A/J$. Then:
(1) If $\operatorname{reg}_A(B) \le 1$ and $A$ is Koszul, then $B$ is Koszul.
(2) If $\operatorname{reg}_A(B)$ is finite and $B$ is Koszul, then $A$ is Koszul.

We apply now Theorem 3.2 to prove that the Veronese subrings of a Koszul algebra are Koszul. Let $R$ be a standard graded $K$-algebra. Let $c \in \mathbb{N}$ and

$$R^{(c)} = \bigoplus_{j \in \mathbb{Z}} R_{jc}$$

be the $c$-th Veronese subalgebra of $R$. Similarly, given a graded $R$-module $M$ one defines

$$M^{(c)} = \bigoplus_{j \in \mathbb{Z}} M_{jc}.$$

The formation of the $c$-th Veronese submodule can be seen as an exact functor from the category of graded $R$-modules and maps of degree 0 to the category of graded $R^{(c)}$-modules and maps of degree 0. For $u = 0, \dots, c-1$ consider the Veronese submodules $V_u = \bigoplus_{j \in \mathbb{Z}} R_{jc+u}$ of $R$. Note that $V_u$ is an $R^{(c)}$-module generated in degree 0 and that for $a \in \mathbb{Z}$ one has

$$R(-a)^{(c)} = V_u(-\lceil a/c \rceil)$$

where $u = 0$ if $a \equiv 0 \bmod c$ and $u = c - r$ if $a \equiv r \bmod c$ and $0 < r < c$.

Theorem 3.3. Let $R$ be a Koszul algebra. Then for every $c \in \mathbb{N}$ one has:
(1) $R^{(c)}$ is Koszul.
(2) $\operatorname{reg}_{R^{(c)}}(V_u) = 0$ for every $u = 0, \dots, c-1$, i.e. the Veronese submodules of $R$ have a linear resolution over $R^{(c)}$.

Proof. Denote by $A$ the ring $R^{(c)}$. We first prove assertion (2). Indeed we prove by induction on $i$ that $t_i^A(V_u) \le i$ for every $i$. There is nothing to prove in the case $i = 0$. For $i > 0$, observe that, since $R$ is Koszul, one has $\operatorname{reg}_R m = 1$ and by induction one has that $\operatorname{reg}_R m^u = u$. Now let $M = m_R^u(u)$ so that $\operatorname{reg}_R(M) = 0$ and $M^{(c)} = V_u$. Consider the minimal free resolution $\mathbf{F}$ of $M$ over $R$ and apply the functor $-^{(c)}$. We get a complex $\mathbf{G} = \mathbf{F}^{(c)}$ of $A$-modules such that $H_0(\mathbf{G}) = V_u$, $H_j(\mathbf{G}) = 0$ for $j > 0$ and $G_j = F_j^{(c)}$ is a direct sum of copies of $R(-j)^{(c)}$. Applying Lemma 3.1 and the inductive assumption we get $t_i^A(V_u) \le i$ as required.

To prove that $A$ is Koszul we consider the minimal free resolution $\mathbf{F}$ of $K$ over $R$ and apply $-^{(c)}$. We get a complex $\mathbf{G} = \mathbf{F}^{(c)}$ of $A$-modules such that $H_0(\mathbf{G}) = K$, $H_j(\mathbf{G}) = 0$ for $j > 0$ and $G_j = F_j^{(c)}$ is a direct sum of copies of $V_u(-\lceil j/c \rceil)$. Hence $\operatorname{reg}_A(G_j) = \lceil j/c \rceil$ and applying Lemma 3.1 we obtain

$$\operatorname{reg}_A(K) \le \sup\{ \lceil j/c \rceil - j : j \ge 0 \} = 0. \qquad \square$$

We also have:

Theorem 3.4. Let $R$ be a standard graded $K$-algebra. Then:
(1) The Veronese subalgebra $R^{(c)}$ is Koszul for $c \gg 0$.
(2) If $R = S/I$ with $S = K[x_1, \dots, x_n]$, then $R^{(c)}$ is Koszul for every $c$ such that $c \ge \max\{ t_i^S(R)/(1+i) : i \ge 0 \}$.

Proof. Let $\mathbf{F}$ be the minimal free resolution of $R$ as an $S$-module. Set $B = S^{(c)}$ and note that $B$ is Koszul because of Theorem 3.3. Then $\mathbf{G} = \mathbf{F}^{(c)}$ is a complex of $B$-modules such that $H_0(\mathbf{G}) = R^{(c)}$, $H_j(\mathbf{G}) = 0$ for $j > 0$. Furthermore $G_i = F_i^{(c)}$ is a direct sum of shifted copies of the Veronese submodules $V_u$. Using Theorem 3.3 we get the bound $\operatorname{reg}_B(G_i) \le \lceil t_i^S(R)/c \rceil$. Applying Lemma 3.1 we get

$$\operatorname{reg}_B(R^{(c)}) \le \max\{ \lceil t_i^S(R)/c \rceil - i : i \ge 0 \}.$$

Hence for $c \ge \max\{ t_i^S(R)/(1+i) : i \ge 0 \}$ one has $\operatorname{reg}_B(R^{(c)}) \le 1$ and we conclude from Theorem 3.2 that $R^{(c)}$ is Koszul. □

Remark 3.5. (1) In [ERT, 2] it is proved that if $c \ge (\operatorname{reg}_S(R) + 1)/2$, then $R^{(c)}$ is even G-quadratic. See [Sh] for other interesting results in this direction.
(2) Backelin proved in [B1] that $R^{(c)}$ is Koszul if $c \ge \operatorname{Rate}(R)$. Here $\operatorname{Rate}(R)$ is defined as

$$\sup_{i > 0} \{ (t_{i+1}^R(K) - 1)/i \}$$

and it is finite. It measures the deviation from the Koszul property in the sense that $\operatorname{Rate}(R) \ge 1$ with equality if and only if $R$ is Koszul.
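The numerical conditions in Theorem 3.4(2) and Remark 3.5(1) are plain arithmetic on the maximal shifts $t_i^S(R)$. A minimal Python sketch follows; the function names are ours, and the sample shifts `[0, 3, 6]` are those of a complete intersection of two cubics (resolved by the Koszul complex), chosen only for illustration.

```python
from fractions import Fraction
from math import ceil


def koszul_threshold(shifts):
    """Smallest integer c with c >= max{ t_i / (1 + i) } (Theorem 3.4(2))."""
    return ceil(max(Fraction(t, i + 1) for i, t in enumerate(shifts)))


def regularity(shifts):
    """reg_S(R) = max{ t_i - i }."""
    return max(t - i for i, t in enumerate(shifts))


def g_quadratic_threshold(shifts):
    """Smallest integer c with c >= (reg_S(R) + 1) / 2 (Remark 3.5(1))."""
    return ceil(Fraction(regularity(shifts) + 1, 2))


def veronese_regularity_bound(shifts, c):
    """max{ ceil(t_i / c) - i }, the bound on reg_B(R^(c)) from the proof."""
    return max(ceil(Fraction(t, c)) - i for i, t in enumerate(shifts))


shifts = [0, 3, 6]                              # t_0, t_1, t_2
print(koszul_threshold(shifts))                 # 2
print(g_quadratic_threshold(shifts))            # 3
print(veronese_regularity_bound(shifts, 2))     # 1
```

For these shifts the bound on $\operatorname{reg}_B(R^{(c)})$ drops from 4 at $c = 1$ to 1 at $c = 2$, which is exactly the hypothesis needed to apply Theorem 3.2(1).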
3.2 Strongly Koszul Algebras

A powerful tool for proving that an algebra is Koszul is a typical "divide and conquer" strategy that can be formulated in the following way, see [CTV] and [CRV]:

Definition 3.6. A Koszul filtration of a $K$-algebra $R$ is a set $\mathcal{F}$ of ideals of $R$ such that:
(1) Every ideal $I \in \mathcal{F}$ is generated by elements of degree 1.
(2) The zero ideal $0$ and the maximal ideal $m_R$ are in $\mathcal{F}$.
(3) For every $I \in \mathcal{F}$, $I \neq 0$, there exists $J \in \mathcal{F}$ such that $J \subset I$, $I/J$ is cyclic and $\operatorname{Ann}(I/J) = J : I \in \mathcal{F}$.

One easily proves:

Lemma 3.7. Let $\mathcal{F}$ be a Koszul filtration of a standard graded $K$-algebra $R$. Then one has:
(1) $\operatorname{reg}_R(R/I) = 0$ and $R/I$ is Koszul for every $I \in \mathcal{F}$.
(2) $R$ is Koszul.

Example 3.8. Let

$$R = K[a,b,c,d]/(ac,\ ad,\ ab - bd,\ a^2 + bc,\ b^2).$$

We have seen in Example 1.20 that $R$ is not LG-quadratic. We show now that $R$ is Koszul by constructing a Koszul filtration. Indeed, there is a Koszul filtration based on the given system of coordinates, i.e. a Koszul filtration whose ideals are generated by residue classes of variables. Here it is:

$$\mathcal{F} = \{ (a,b,c,d),\ (a,c,d),\ (c,d),\ (a,c),\ (c),\ (a),\ 0 \}.$$

To check that it is a Koszul filtration we observe that in $R$ one has:

$$(a,c,d) : (a,b,c,d) = (a,b,c,d)$$
$$(c,d) : (a,c,d) = (a,b,c,d)$$
$$(c) : (c,d) = (a,c)$$
$$(c) : (a,c) = (a,c,d)$$
$$0 : (a) = (c,d)$$
$$0 : (c) = (a)$$

The following notion is very natural for algebras with a canonical coordinate system (e.g. in the toric case).

Definition 3.9. An algebra $R$ is strongly Koszul if there exists a basis $X$ of $R_1$ such that for every $Y \subset X$ and for every $x \in X \setminus Y$ there exists $Z \subseteq X$ such that $(Y) : x = (Z)$.

This definition of strongly Koszul is taken from [CDR] and is slightly different from the one given in [HHR]. In [HHR] it is assumed that the basis $X$ of $R_1$ is totally ordered and in the definition one adds the requirement that $x$ is larger than every element in $Y$.

Remark 3.10. If $R$ is strongly Koszul with respect to a basis $X$ of $R_1$ then the set $\{ (Y) : Y \subseteq X \}$ is obviously a Koszul filtration.

We have:

Theorem 3.11. Let $R = S/I$ with $S = K[x_1, \dots, x_n]$ and $I \subset S$ be an ideal generated by monomials of degrees $\le d$. Then $R^{(c)}$ is strongly Koszul for every $c \ge d - 1$.

The proof of Theorem 3.11 is based on the fact that the Veronese ring $R^{(c)}$ is a direct summand of $R$ and that computing the colon ideal of monomial ideals in a polynomial ring is a combinatorial operation. Let us single out an interesting special case:

Theorem 3.12. Let $S = K[x_1, \dots, x_n]$ and let $I \subset S$ be an ideal generated by monomials of degree 2. Then $S/I$ is strongly Koszul.

The results presented for Veronese rings and Veronese modules have their analogues in the multigraded setting, see [CHTV]. We discuss below the bigraded case. Let $S = K[x_1, \dots, x_n, y_1, \dots, y_m]$ with $\mathbb{Z}^2$-grading induced by the assignment $\deg(x_i) = (1, 0)$ and $\deg(y_i) = (0, 1)$. For every $c = (c_1, c_2)$ we look at the diagonal subalgebra $S_\Delta = \bigoplus_{a \in \Delta} S_a$ where $\Delta = \{ ic : i \in \mathbb{Z} \}$. The algebra $S_\Delta$ is nothing but the Segre product of the $c_1$-th Veronese ring of $K[x_1, \dots, x_n]$ and the $c_2$-th Veronese ring of $K[y_1, \dots, y_m]$. For a $\mathbb{Z}^2$-graded standard $K$-algebra $R = S/I$ with $I \subset S$ a bigraded ideal we may consider the associated diagonal algebra $R_\Delta = \bigoplus_{a \in \Delta} R_a$ and similarly for modules. One has:

Theorem 3.13.
(1) For every $(a, b) \in \mathbb{Z}^2$ the $S_\Delta$-submodule $S(-a, -b)_\Delta$ of $S$ has a linear resolution.
(2) For every $\mathbb{Z}^2$-standard graded algebra $R$ one has that $R_\Delta$ is Koszul for "large" $\Delta$ (i.e. $c_1 \gg 0$ and $c_2 \gg 0$). One can give explicit bounds in terms of the bigraded Betti numbers of $R$ as an $S$-module.

Let $I$ be a homogeneous ideal of $S = K[x_1, \dots, x_n]$ generated by elements $f_1, \dots, f_r$ of degree $d$. The Rees ring

$$\operatorname{Rees}(I) = \bigoplus_{i \in \mathbb{N}} I^i = S[f_1 t, \dots, f_r t] \subset S[t]$$

is a bigraded $K$-algebra. Its component of degree $(i, j)$ is

$$\operatorname{Rees}(I)_{(i,j)} = (I^j)_{jd+i}.$$

It is easy to check that $\operatorname{Rees}(I)$ is a standard bigraded algebra. It can be seen as a quotient ring of $S[y] = S[y_1, \dots, y_r]$ bigraded by $\deg(x_i) = (1, 0)$ and $\deg(y_i) = (0, 1)$. Then we may apply Theorem 3.13 and we get:

Corollary 3.14. There exist integers $c_0$ and $e_0$ (depending on $I$) such that for every $c \ge c_0$ and $e \ge e_0$ the $K$-subalgebra of $S$ generated by the vector space $(I^e)_{ed+c}$ is Koszul.

If one has information or bounds on the bigraded resolution of $\operatorname{Rees}(I)$ as an $S[y]$-module then Corollary 3.14 can be formulated more precisely. One of the few families of ideals $I$ for which the resolution of $\operatorname{Rees}(I)$ is known is that of complete intersections. If $f_1, \dots, f_r$ form a regular sequence then

$$\operatorname{Rees}(I) = S[y]/I_2 \begin{pmatrix} y_1 & y_2 & \dots & y_r \\ f_1 & f_2 & \dots & f_r \end{pmatrix}$$

and $\operatorname{Rees}(I)$ is resolved by the Eagon–Northcott complex. Then applying the principle described above to this specific case one has:

Theorem 3.15. Let $f_1, \dots, f_r$ be a regular sequence of elements of degree $d$ in $S = K[x_1, \dots, x_n]$ and $I = (f_1, \dots, f_r)$. For $c, e \in \mathbb{N}$ set $A = K[(I^e)_{ed+c}]$. Then:
(1) If $c \ge d/2$ then $A$ is quadratic.
(2) If $c \ge d(r-1)/r$ then $A$ is Koszul.

See [CHTV] for details of the proofs of Theorem 3.15.

Example 3.16. With $r = n$, $d = 2$ and $f_i = x_i^2$ for every $i = 1, \dots, n$ we have that the toric algebra

$$K[x^a : a \in \mathbb{N}^n,\ |a| = 2 + c \text{ and } \max(a) \ge 2]$$

is quadratic for every $c$ and Koszul for $c \ge 2(n-1)/n$.
Given integers $n, d, s$ we set

$$PV(n, d, s) = K[x^a : a \in \mathbb{N}^n,\ |a| = d \text{ and } \#\{ i : a_i > 0 \} \le s].$$

This is called the pinched Veronese generated by the monomials in $n$ variables, of total degree $d$ and supported on at most $s$ variables.

Question 3.17. For which values of $n, d, s$ is the algebra $PV(n, d, s)$ quadratic or Koszul?

Not all of them are quadratic, for instance $PV(4, 5, 2)$ is defined, according to CoCoA [CoCoA], by 168 quadrics and 12 cubics. The algebra of Example 3.16 for $c = n - 2$ coincides with the pinched Veronese $PV(n, n, n-1)$. Hence $PV(n, n, n-1)$ is quadratic for every $n$ and Koszul for $n > 3$. For $n = 3$ we have that

$$PV(3, 3, 2) = K[x^3, x^2y, x^2z, xy^2, xz^2, y^3, y^2z, yz^2, z^3]$$

is quadratic. The argument above does not answer the question whether $PV(3, 3, 2)$ is a Koszul algebra. This turns out to be a difficult question on its own. In [Ca] and [CC] it is proved that:

Theorem 3.18. The pinched Veronese $PV(3, 3, 2)$ is Koszul. The same holds for the generic projection of the Veronese surface of $\mathbb{P}^9$ to $\mathbb{P}^8$.

It is not clear whether $PV(3, 3, 2)$ is G-quadratic. The Koszul property of a toric ring is equivalent to the Cohen–Macaulay property of intervals of the underlying poset, see [PRS, 2.2]. Recently Tancer has shown that the intervals of the poset associated with $PV(n, n, n-1)$ are shellable for $n > 3$, see [Ta]. It is not clear whether the same is true for $n = 3$.
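The generating sets in this subsection are easy to enumerate. The Python sketch below (function names ours) lists the exponent vectors generating $PV(n, d, s)$ and those of the algebra of Example 3.16, where a monomial of degree $2 + c$ lies in $(x_1^2, \dots, x_n^2)$ exactly when some exponent is at least 2, and compares the two for $c = n - 2$.

```python
from itertools import product


def exponent_vectors(n, d):
    """All a in N^n with |a| = d."""
    return [a for a in product(range(d + 1), repeat=n) if sum(a) == d]


def pinched_veronese(n, d, s):
    """Exponent vectors of the generators of PV(n, d, s)."""
    return [a for a in exponent_vectors(n, d)
            if sum(1 for ai in a if ai > 0) <= s]


def example_3_16(n, c):
    """Degree 2 + c monomials lying in (x_1^2, ..., x_n^2): some exponent >= 2."""
    return [a for a in exponent_vectors(n, 2 + c) if max(a) >= 2]


print(len(pinched_veronese(3, 3, 2)))      # 9: every cubic in x, y, z except xyz
print(len(pinched_veronese(4, 5, 2)))      # 28
print(all(set(example_3_16(n, n - 2)) == set(pinched_veronese(n, n, n - 1))
          for n in (3, 4, 5)))             # True
```

The last line reflects the fact that the only monomial of degree $n$ in $n$ variables with all exponents at most 1 is $x_1 \cdots x_n$, which is also the only one with full support.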
3.3 Koszul Algebras Associated with Hyperspace Configurations

Another interesting family of Koszul algebras with relations to combinatorics arises in the following way. Let $V = V_1, \dots, V_m$ be a collection of subspaces of the space of linear forms in the polynomial ring $K[x_1, \dots, x_n]$. Denote by $A(V)$ the $K$-subalgebra of $K[x_1, \dots, x_n]$ generated by the elements of the product $V_1 \cdots V_m$. We have:

Theorem 3.19. The algebra $A(V)$ is Koszul.

We outline the proof of Theorem 3.19. Denote by $R$ the polynomial ring $K[x_1, \dots, x_n]$ and set $d_i = \dim V_i$. Consider auxiliary variables $y_1, \dots, y_m$ and the Segre product

$$S = K[y_i x_j : i = 1, \dots, m,\ j = 1, \dots, n]$$

of $K[y_1, \dots, y_m]$ with $R$. Set

$$B(V) = K[y_1 V_1, \dots, y_m V_m]$$

and

$$T = K[t_{ij} : i = 1, \dots, m,\ j = 1, \dots, n].$$

Note that $B(V)$ is a $K$-subalgebra of $S$. We give degree $e_i \in \mathbb{Z}^m$ to $y_i x_j$ and to $t_{ij}$ so that $S$, $T$ and $B(V)$ are $\mathbb{Z}^m$-graded. Let $\Delta = \{ (a, a, \dots, a) \in \mathbb{Z}^m : a \in \mathbb{Z} \}$. By construction, the diagonal algebra

$$B(V)_\Delta = \bigoplus_{b \in \Delta} B(V)_b$$

coincides with $K[V_1 \cdots V_m\, y_1 \cdots y_m]$ and hence $B(V)_\Delta = A(V)$.

For $i = 1, \dots, m$ let $\{ f_{ij} : j = 1, \dots, d_i \}$ be a basis of $V_i$ and complete it to a basis of $R_1$ with elements $\{ f_{ij} : j = d_i + 1, \dots, n \}$ (no matter how). Set $T(V) = K[t_{ij} : 1 \le i \le m,\ 1 \le j \le d_i]$. We have presentations:

$$\phi : T \to S \quad \text{with } t_{ij} \mapsto y_i f_{ij} \text{ for all } i, j$$
$$\phi' : T(V) \to B(V) \quad \text{with } t_{ij} \mapsto y_i f_{ij} \text{ for all } i \text{ and } 1 \le j \le d_i$$

We have:

Lemma 3.20. Suppose that $\ker \phi'$ has a Gröbner basis (with respect to some term order $>$) of elements of degrees bounded above by $(1, 1, \dots, 1) \in \mathbb{Z}^m$. Then $A(V)$ is Koszul.

Proof. Set $I = \ker \phi'$. Applying $\Delta$ to the presentation $B(V) = T(V)/I$ we obtain $A(V) = T(V)_\Delta / I_\Delta$. Now $T(V)_\Delta$ is a multiple Segre product and hence it is strongly Koszul (the argument is similar to the one for the Veronese case) and by assumption $\operatorname{in}_>(I)_\Delta$ is generated by a subset of the semigroup generators of $T(V)_\Delta$. But then $T(V)_\Delta / \operatorname{in}_>(I)_\Delta$ is Koszul because of the strongly Koszul property of $T(V)_\Delta$. Hence $T(V)_\Delta / I_\Delta$ is Koszul by Gröbner deformation. □

Since, by construction, $\ker \phi' = \ker \phi \cap T(V)$, a Gröbner basis of $\ker \phi'$ can be obtained from a lexicographic Gröbner basis of $\ker \phi$ by elimination. Therefore,
combining this point of view with Lemma 3.20 we have that Theorem 3.19 is a corollary of:

Lemma 3.21. The ideal $\ker \phi$ has a universal Gröbner basis whose elements have degrees bounded above by $(1, 1, \dots, 1) \in \mathbb{Z}^m$.

Observe that $\phi$ is a presentation of the Segre product $S$ but with respect to a non-necessarily monomial basis. Hence $\ker \phi$ is obtained from the ideal $I_2(t)$ of the 2-minors of the $m \times n$ matrix $t = (t_{ij})$ by a change of coordinates preserving the $\mathbb{Z}^m$-graded structure. Since the Hilbert function does not change under taking initial ideals, it is enough to prove the following (very strong) assertion:

Lemma 3.22. Every ideal of $T$ that has the $\mathbb{Z}^m$-graded Hilbert function of $I_2(t)$ is generated in degrees bounded above by $(1, 1, \dots, 1) \in \mathbb{Z}^m$.

Lemma 3.22 has been proved by Cartwright and Sturmfels [CS] using multigraded generic initial ideals and a result proved in [C]. This approach has been generalized in [CDG] to identify universal Gröbner bases of ideals of maximal minors of matrices of linear forms, hence generalizing the classical result of Bernstein, Sturmfels and Zelevinsky [BZ, SZ], see also [K].

In detail, the group $GL_n(K)^m$ acts as the group of $\mathbb{Z}^m$-graded $K$-algebra automorphisms on $T$ by linear substitution (row by row). An ideal $I \subset T$ is Borel-fixed if it is invariant under the action of the Borel subgroup $B_n(K)^m$ of $GL_n(K)^m$. Here $B_n(K)$ is the group of upper triangular matrices. In [CDG] it has been proved that

Lemma 3.23. If $J \subset T$ is Borel-fixed and radical then every ideal $I$ with the $\mathbb{Z}^m$-graded Hilbert function of $J$ is generated in degrees bounded above by $(1, 1, \dots, 1) \in \mathbb{Z}^m$.

Summing up, to conclude the proof of Theorem 3.19 it is enough to prove that:

Lemma 3.24. The ideal $I_2(t)$ has the $\mathbb{Z}^m$-graded Hilbert function of the radical and Borel-fixed ideal $J$ generated by the monomials $t_{i_1 j_1} \cdots t_{i_k j_k}$ satisfying the following conditions:

$$1 \le i_1 < \cdots < i_k \le m, \qquad 1 \le j_1, \dots, j_k \le n, \qquad j_1 + \cdots + j_k \ge n + k.$$

This is done in [C] by proving that $J$ is indeed the multigraded generic initial ideal $\operatorname{gin}(I_2(t))$ of $I_2(t)$. The inclusion $J \subseteq \operatorname{gin}(I_2(t))$ is a consequence of the following Lemma 3.25. The other inclusion is proved by checking that $J$ is pure with codimension and degree equal to those of $I_2(t)$.
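For small $m$ and $n$ the Hilbert function statement of Lemma 3.24 can be checked by brute force. The Python sketch below (helper names ours) counts the monomials of multidegree $a$ lying outside $J$ directly from the generators, reading the condition on them as $j_1 + \cdots + j_k \ge n + k$, and compares the count with $\binom{n-1+|a|}{n-1}$. The latter is the number of monomials of degree $|a| = a_1 + \cdots + a_m$ in $x_1, \dots, x_n$, i.e. the dimension of the Segre product $K[y_i x_j] \cong T/I_2(t)$ in multidegree $a$.

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb


def in_J(rows, n):
    """Decide whether a monomial p lies in J.

    rows[i] is the tuple of column indices j (with repetition) such that the
    variables t_{ij} from row i divide p. A generator of J uses distinct rows,
    so only the largest column index in each row matters.
    """
    maxima = [max(r) for r in rows if r]
    return any(sum(chosen) >= n + k
               for k in range(1, len(maxima) + 1)
               for chosen in combinations(maxima, k))


def dim_T_mod_J(n, a):
    """Number of monomials of multidegree a in the t_{ij} that are not in J."""
    per_row = [list(combinations_with_replacement(range(1, n + 1), ai)) for ai in a]
    return sum(1 for rows in product(*per_row) if not in_J(rows, n))


def dim_segre(n, a):
    """Dimension of T/I_2(t) in multidegree a."""
    return comb(n - 1 + sum(a), n - 1)


def lemma_3_24_holds(n, m, max_entry):
    """Compare both dimensions for all a in {0, ..., max_entry}^m."""
    return all(dim_T_mod_J(n, a) == dim_segre(n, a)
               for a in product(range(max_entry + 1), repeat=m))


print(dim_T_mod_J(3, (1, 1, 1)), dim_segre(3, (1, 1, 1)))   # 10 10
print(lemma_3_24_holds(3, 3, 2))                            # True
```

This is only a finite sanity check in low multidegrees, not a substitute for the proof in [C].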
Lemma 3.25. Let $V_1, \dots, V_m$ be subspaces of the vector space of the linear forms $R_1$. If $\sum_{i=1}^m \dim V_i \ge n + m$ then $\dim \prod_{i=1}^m V_i < \prod_{i=1}^m \dim V_i$, i.e. there is a non-trivial linear relation among the generators of the product $\prod_{i=1}^m V_i$ obtained by multiplying $K$-bases of the $V_i$'s.

One can also prove directly that $T/I_2(t)$ and $T/J$ have the same $\mathbb{Z}^m$-graded Hilbert function in the following way. For every $a = (a_1, \dots, a_m) \in \mathbb{N}^m$ we show that the vector space dimensions of $T/I_2(t)$ and of $T/J$ in multidegree $a$ are equal. By induction on $m$ we may assume that $a$ has full support, i.e. $a_i > 0$ for all $i$. Since $T/I_2(t)$ is the Segre product of $K[y_1, \dots, y_m]$ and $K[x_1, \dots, x_n]$, a $K$-basis of $T/I_2(t)$ in degree $a$ is given by the monomials of the form $\prod y_i^{a_i}\, p$ where $p$ is a monomial in the $x$'s of degree $\sum a_i$. It follows that the dimension of $T/I_2(t)$ in degree $a$ equals:

$$\binom{n - 1 + a_1 + a_2 + \cdots + a_m}{n - 1}.$$

Given a monomial $p$ in the $t_{ij}$'s of multidegree $a$ we set $M_i(p) = \max\{ j : t_{ij} \mid p \}$. It is easy to see that:

$$p \in J \iff \sum_{i=1}^m M_i(p) \ge n + m.$$

For a fixed $c = (c_1, \dots, c_m) \in \{1, \dots, n\}^m$ the cardinality of the set of the monomials $p$ in the $t_{ij}$'s with degree $a$ and $M_i(p) = c_i$ is given by

$$\prod_{i=1}^m \binom{c_i - 1 + a_i - 1}{c_i - 1}.$$

Therefore the dimension of $T/J$ in multidegree $a$ is given by:

$$\sum_c \prod_{i=1}^m \binom{c_i - 1 + a_i - 1}{c_i - 1}$$

where the sum is extended to all the $c = (c_1, \dots, c_m) \in \{1, \dots, n\}^m$ with $c_1 + \cdots + c_m < n + m$. Replacing $n - 1$ with $n$ and $c_i - 1$ with $c_i$, we have to prove the following identity:

$$\binom{n + a_1 + a_2 + \cdots + a_m}{n} = \sum_c \prod_{i=1}^m \binom{c_i + a_i - 1}{c_i} \qquad (19)$$
where the sum is extended to all the $c = (c_1, \dots, c_m) \in \mathbb{N}^m$ with $c_1 + \cdots + c_m \le n$. The equality (19) is a specialization ($v = m + 1$, $b_i = a_i$ for $i = 1, \dots, m$ and $b_{m+1} = 1$) of the following identity:

$$\binom{n + b_1 + b_2 + \cdots + b_v - 1}{n} = \sum_c \prod_{i=1}^v \binom{c_i + b_i - 1}{c_i} \qquad (20)$$

where the sum is extended to all the $c = (c_1, \dots, c_v) \in \mathbb{N}^v$ with $c_1 + \cdots + c_v = n$. Now the identity (20) is easy: both the left and right side of it count the number of monomials of total degree $n$ in a set of variables which is a disjoint union of subsets of cardinality $b_1, b_2, \dots, b_v$.

Acknowledgements We thank Giulio Caviglia, Alessio D'Alì, Emanuela De Negri and Dang Hop Nguyen for their valuable comments and suggestions upon reading preliminary versions of the present notes and Christian Krattenthaler for suggesting the proof of formula (19).
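Identities (19) and (20) from the end of Sect. 3.3 can be sanity-checked numerically. The Python sketch below (function names ours) evaluates both sides for small parameters, with all $a_i$ and $b_i$ taken to be at least 1 as in the text.

```python
from itertools import product
from math import comb, prod


def vectors(total, parts, exact):
    """All c in N^parts with sum(c) == total (exact) or sum(c) <= total."""
    for c in product(range(total + 1), repeat=parts):
        if sum(c) == total or (not exact and sum(c) < total):
            yield c


def lhs_19(n, a):
    return comb(n + sum(a), n)


def rhs_19(n, a):
    return sum(prod(comb(ci + ai - 1, ci) for ci, ai in zip(c, a))
               for c in vectors(n, len(a), exact=False))


def lhs_20(n, b):
    return comb(n + sum(b) - 1, n)


def rhs_20(n, b):
    return sum(prod(comb(ci + bi - 1, ci) for ci, bi in zip(c, b))
               for c in vectors(n, len(b), exact=True))


def identities_hold(n_max, size_max, parts_max):
    """Check (19) and (20) for all n <= n_max and entries in 1..size_max."""
    for parts in range(1, parts_max + 1):
        for sizes in product(range(1, size_max + 1), repeat=parts):
            for n in range(n_max + 1):
                if lhs_19(n, sizes) != rhs_19(n, sizes):
                    return False
                if lhs_20(n, sizes) != rhs_20(n, sizes):
                    return False
    return True


print(lhs_19(3, (2, 1)), rhs_19(3, (2, 1)))   # 20 20
print(identities_hold(4, 3, 3))               # True
```

Appending $b_{m+1} = 1$ to $a$ turns (20) into (19): `lhs_20(n, a + (1,))` equals `lhs_19(n, a)`, and the extra factor $\binom{c_{m+1}}{c_{m+1}} = 1$ in the product converts the condition $c_1 + \cdots + c_{m+1} = n$ into $c_1 + \cdots + c_m \le n$.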
References

[An] D. Anick, A counterexample to a conjecture of Serre. Ann. Math. 115, 1–33 (1982)
[ABH] A. Aramova, Ş. Bărcănescu, J. Herzog, On the rate of relative Veronese submodules. Rev. Roum. Math. Pures Appl. 40, 243–251 (1995)
[A] L.L. Avramov, Infinite free resolutions, in Six Lectures on Commutative Algebra (Bellaterra, 1996). Progress in Mathematics, vol. 166 (Birkhäuser, Basel, 1998), pp. 1–118
[A1] L.L. Avramov, Local algebra and rational homotopy, in Homotopie algébrique et algèbre locale (Luminy, 1982), ed. by J.-M. Lemaire, J.-C. Thomas. Astérisque, vols. 113 and 114 (Soc. Math. France, Paris, 1984), pp. 15–43
[AP] L.L. Avramov, I. Peeva, Finite regularity and Koszul algebras. Am. J. Math. 123, 275–281 (2001)
[ACI1] L.L. Avramov, A. Conca, S. Iyengar, Free resolutions over commutative Koszul algebras. Math. Res. Lett. 17, 197–210 (2010)
[ACI2] L.L. Avramov, A. Conca, S. Iyengar, Subadditivity of syzygies of Koszul algebras (2013). Preprint [arXiv:1308.6811]
[AE] L.L. Avramov, D. Eisenbud, Regularity of modules over a Koszul algebra. J. Algebra 153, 85–90 (1992)
[B1] J. Backelin, On the rates of growth of the homologies of Veronese subrings, in Algebra, Algebraic Topology, and Their Interactions (Stockholm, 1983). Lecture Notes in Mathematics, vol. 1183 (Springer, Berlin, 1986), pp. 79–100
[BF] J. Backelin, R. Fröberg, Koszul algebras, Veronese subrings, and rings with linear resolutions. Rev. Roum. Math. Pures Appl. 30, 85–97 (1985)
[BM] D. Bayer, D. Mumford, What can be computed in algebraic geometry?, in Computational Algebraic Geometry and Commutative Algebra (Cortona, 1991). Sympos. Math., vol. XXXIV (Cambridge University Press, Cambridge, 1993), pp. 1–48
[BS] D. Bayer, M. Stillman, On the complexity of computing syzygies. Computational aspects of commutative algebra. J. Symb. Comput. 6(2–3), 135–147 (1988)
[BaM] Ş. Bărcănescu, N. Manolache, Betti numbers of Segre-Veronese singularities. Rev. Roum. Math. Pures Appl. 26, 549–565 (1981)
[BZ] D. Bernstein, A. Zelevinsky, Combinatorics of maximal minors. J. Algebr. Comb. 2(2), 111–121 (1993)
[BC] W. Bruns, A. Conca, Gröbner bases and determinantal ideals, in Commutative Algebra, Singularities and Computer Algebra (Sinaia, 2002). NATO Science Series II: Mathematics, Physics and Chemistry, vol. 115 (Kluwer Academic, Dordrecht, 2003), pp. 9–66
[BCR1] W. Bruns, A. Conca, T. Römer, Koszul homology and syzygies of Veronese subalgebras. Math. Ann. 351, 761–779 (2011)
[BCR2] W. Bruns, A. Conca, T. Römer, Koszul cycles, in Combinatorial Aspects of Commutative Algebra and Algebraic Geometry, ed. by G. Floystad et al. Abel Symposia, vol. 6 (2011)
[BH] W. Bruns, J. Herzog, Cohen-Macaulay Rings, Revised edn. Cambridge Studies in Advanced Mathematics, vol. 39 (Cambridge University Press, Cambridge, 1998)
[CS] D. Cartwright, B. Sturmfels, The Hilbert scheme of the diagonal in a product of projective spaces. Int. Math. Res. Not. IMRN 9, 1741–1771 (2010)
[Ca1] G. Caviglia, Koszul algebras, Castelnuovo-Mumford regularity and generic initial ideal. Ph.D. Thesis, University of Kansas, 2004
[Ca] G. Caviglia, The pinched Veronese is Koszul. J. Algebr. Comb. 30, 539–548 (2009)
[CC] G. Caviglia, A. Conca, Koszul property of projections of the Veronese cubic surface. Adv. Math. 234, 404–413 (2013)
[CaS] G. Caviglia, E. Sbarra, Characteristic-free bounds for the Castelnuovo-Mumford regularity. Compos. Math. 141(6), 1365–1373 (2005)
[CoCoA] CoCoA Team, CoCoA: A System for Doing Computations in Commutative Algebra. Available at http://cocoa.dima.unige.it
[C] A. Conca, Linear spaces, transversal polymatroids and ASL domains. J. Algebr. Comb. 25(1), 25–41 (2007)
[CDR] A. Conca, E. De Negri, M.E. Rossi, Koszul algebras and regularity, in Commutative Algebra (Springer, New York, 2013), pp. 285–315
[CDG] A. Conca, E. De Negri, E. Gorla, Universal Gröbner bases for maximal minors. Int. Math. Res. Not. (to appear) [arXiv:1302.4461]
[CHTV] A. Conca, J. Herzog, N.V. Trung, G. Valla, Diagonal subalgebras of bigraded algebras and embeddings of blow-ups of projective spaces. Am. J. Math. 119, 859–901 (1997)
[CM] A. Conca, S. Murai, Regularity bounds for Koszul cycles. Proc. Am. Math. Soc. (to appear) [arXiv:1203.1783]
[CRV] A. Conca, M.E. Rossi, G. Valla, Gröbner flags and Gorenstein algebras. Compos. Math. 129, 95–121 (2001)
[CTV] A. Conca, N.V. Trung, G. Valla, Koszul property for points in projective spaces. Math. Scand. 89, 201–216 (2001)
[DHS] H. Dao, C. Huneke, J. Schweig, Bounds on the regularity and projective dimension of ideals associated to graphs. J. Algebr. Comb. 38(1), 37–55 (2013) [arXiv:1110.2570]
[EG] D. Eisenbud, S. Goto, Linear free resolutions and minimal multiplicity. J. Algebra 88, 89–133 (1984)
[ERT] D. Eisenbud, A. Reeves, B. Totaro, Initial ideals, Veronese subrings, and rates of algebras. Adv. Math. 109, 168–187 (1994)
[EHU] D. Eisenbud, C. Huneke, B. Ulrich, The regularity of Tor and graded Betti numbers. Am. J. Math. 128, 573–605 (2006)
[FG] O. Fernandez, P. Gimenez, Regularity 3 in edge ideals associated to bipartite graphs. J. Algebr. Comb. (2012) (to appear) [arXiv:1207.5553]
[F] R. Fröberg, Koszul algebras, in Advances in Commutative Ring Theory. Proceedings of the Fez Conference, 1997. Lecture Notes in Pure and Applied Mathematics, vol. 205 (Marcel Dekker, New York, 1999)
[HHR] J. Herzog, T. Hibi, G. Restuccia, Strongly Koszul algebras. Math. Scand. 86, 161–178 (2000)
[HS] J. Herzog, H. Srinivasan, A note on the subadditivity problem for maximal shifts in free resolutions [arXiv:1303.6214]
[K] M.Y. Kalinin, Universal and comprehensive Gröbner bases of the classical determinantal ideal. Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 373 (2009). Teoriya Predstavlenii, Dinamicheskie Sistemy, Kombinatornye Metody. XVII, 134–143, 348; Translation in J. Math. Sci. (N.Y.) 168(3), 385–389 (2010)
[Ko] J. Koh, Ideals generated by quadrics exhibiting double exponential degrees. J. Algebra 200(1), 225–245 (1998)
[Ku] K. Kurano, Relations on Pfaffians. I. Plethysm formulas. J. Math. Kyoto Univ. 31(3), 713–731 (1991)
[Mc] J. McCullough, A polynomial bound on the regularity of an ideal in terms of half of the syzygies. Math. Res. Lett. 19(3), 555–565 (2012) [arXiv:1112.0058]
[MS] E. Miller, B. Sturmfels, Combinatorial Commutative Algebra. Graduate Texts in Mathematics, vol. 227 (Springer, Berlin, 2004)
[PRS] I. Peeva, V. Reiner, B. Sturmfels, How to shell a monoid. Math. Ann. 310(2), 379–393 (1998)
[P] S.B. Priddy, Koszul resolutions. Trans. Am. Math. Soc. 152, 39–60 (1970)
[PP] A. Polishchuk, L. Positselski, Quadratic Algebras. University Lecture Series, vol. 37 (American Mathematical Society, Providence, 2005)
[Sh] T. Shibuta, Gröbner bases of contraction ideals. J. Algebr. Comb. 36(1), 1–19 (2012) [arXiv:1010.5768]
[SZ] B. Sturmfels, A. Zelevinsky, Maximal minors and their leading terms. Adv. Math. 98(1), 65–112 (1993)
[Ta] M. Tancer, Shellability of the higher pinched Veronese posets. J. Algebr. Comb. (to appear) [arXiv:1305.3159]
Noetherianity up to Symmetry Jan Draisma
1 Kruskal’s Tree Theorem

All finiteness proofs in these lecture notes are based on a beautiful combinatorial theorem due to Kruskal. In fact, the special case of that theorem known as Higman’s Lemma suffices for all of those proofs. But, hoping that Kruskal’s Tree Theorem will soon find further applications in infinite-dimensional algebraic geometry, I have decided to prove the theorem in its full strength. Original sources for Kruskal’s Tree Theorem and Higman’s Lemma are [Kru60] and [Hig52], respectively. We follow closely the beautiful proof in [NW63].

Throughout we use the notation $\mathbb{N} := \{1, 2, \ldots\}$, $\mathbb{Z}_{\ge 0} := \{0, 1, \ldots\}$, and $[n] := \{1, \ldots, n\}$ for $n \in \mathbb{Z}_{\ge 0}$. In particular, we have $[0] = \emptyset$.

The main concept is that of a well-partial-order on a set $S$. This is a partial order $\le$ with the property that for any infinite sequence $s_1, s_2, \ldots$ of elements of $S$ there exists a pair of indices $i < j$ with $s_i \le s_j$. Arguing by contradiction one then proves that there exists an index $i$ such that $s_j \ge s_i$ holds for infinitely many indices $j > i$. Take the first such index $i_1$, and retain only the term $s_{i_1}$ together with the infinitely many terms $s_j$ with $j > i_1$ and $s_j \ge s_{i_1}$. Among these pick an index $i_2 > i_1$ in a similar fashion, etc. This leads to the conclusion that in a well-partially-ordered set any infinite sequence has an infinite ascending subsequence $s_{i_1} \le s_{i_2} \le \ldots$ with $i_1 < i_2 < \ldots$.

Examples of well-partial-orders are partial orders on finite sets, and well-orders (which are linear well-partial-orders). If two sets $S, T$ are both equipped with well-partial-orders, then the componentwise partial order on the Cartesian product $S \times T$
J. Draisma, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands; Centrum Wiskunde en Informatica, Amsterdam, The Netherlands
A. Conca et al., Combinatorial Algebraic Geometry, Lecture Notes in Mathematics 2108, DOI 10.1007/978-3-319-04870-3_2, © Springer International Publishing Switzerland 2014
defined by $(s, t) \le (s', t')$ if and only if $s \le s'$ and $t \le t'$ is again a well-partial-order. Indeed, in an infinite sequence $(s_1, t_1), (s_2, t_2), \ldots$ there is an infinite subsequence of indices where the $s_i$ increase weakly, and in that subsequence there exists a pair of indices $i < j$ where in addition to $s_i \le s_j$ also the inequality $t_i \le t_j$ holds. Repeatedly applying this Cartesian-product construction with all factors equal to the non-negative integers $\mathbb{Z}_{\ge 0}$ one obtains the statement that the componentwise order on $\mathbb{Z}_{\ge 0}^n$ is a well-partial-order. This fact, known as Dickson’s Lemma, can be used to prove Hilbert’s Basis Theorem. In a similar fashion we shall use Kruskal’s Tree Theorem to prove Noetherianity of certain rings up to symmetry.

Before stating and proving Kruskal’s Tree Theorem, we first discuss the following special case.

Lemma 1.1. For any well-partial-order on a set $S$ the partial order on the set of finite multi-subsets of $S$ defined by $A \le B$ if and only if there exists an injective map $f : A \to B$ with $a \le f(a)$ for all $a \in A$ is a well-partial-order.

Proof. Suppose that it is not. Then there exists an infinite sequence $A_1, A_2, \ldots$ of finite multi-subsets of $S$ such that $A_i \not\le A_j$ for all pairs of indices $i < j$. Such a sequence is called a bad sequence, and we may assume that it is minimal in the following sense. First, the cardinality $|A_1|$ of $A_1$ is minimal among all bad sequences. Second, $|A_2|$ is minimal among all bad sequences starting with $A_1$, etc. As the empty multi-set is smaller than all other multi-sets, none of the multi-sets $A_i$ is empty, so we may choose an element $a_i$ from each $A_i$ and define $B_i := A_i \setminus \{a_i\}$. There exists an infinite subsequence $i_1 < i_2 < \ldots$ where $a_{i_1} \le a_{i_2} \le \ldots$. Now the desired contradiction will follow by considering the sequence
$$A_1, A_2, \ldots, A_{i_1 - 1}, B_{i_1}, B_{i_2}, \ldots.$$
Indeed, no $A_i$ with $i \le i_1 - 1$ is less than or equal to $A_j$ with $i < j \le i_1 - 1$.
But neither is any $A_i$ with $i \le i_1 - 1$ less than or equal to any $B_j$ with $j \in \{i_1, i_2, \ldots\}$, or else we would have $A_i \le B_j \le A_j$, with the inclusion map witnessing the second inequality. Finally, no relation $B_i \le B_j$ holds with $i, j \in \{i_1, i_2, \ldots\}$ and $i < j$. Indeed, a map $B_i \to B_j$ witnessing that inequality could be extended to a map $A_i \to A_j$ witnessing $A_i \le A_j$ by mapping $a_i$ to $a_j$. We conclude that the new sequence is bad, but this contradicts the minimality of $|A_{i_1}|$ among all bad sequences starting with $A_1, \ldots, A_{i_1 - 1}$. □

The general case of Kruskal’s Tree Theorem concerns the set of (isomorphism classes of) finite, rooted trees whose vertices are labelled with elements of a fixed partially ordered set $S$. We call such objects $S$-labelled trees. A partial order on $S$-labelled trees is defined recursively as follows; see also Fig. 1. Suppose that $T$ is an $S$-labelled tree with root $r$ and suppose that $T$ branches at $r$ into trees $B_1, \ldots, B_p$ whose roots are the children of $r$; $p = 0$ is allowed here, and renders void one of the conditions that follow. We say that $T$ is less than or equal to a second $S$-labelled tree $T'$ if the latter has a vertex $v$ (not necessarily its root) where $T'$ branches into trees $B'_1, \ldots, B'_q$ (rooted at children of $v$), such that the $S$-label of $v$ is at least that of $r$ and such that there exists an injective map $\pi$ from $[p]$ into $[q]$ with $B_i \le B'_{\pi(i)}$ for all $i$. Unfolding this recursive definition one finds that the one-dimensional CW-complex given by the tree $T$ can then be homeomorphically embedded into that of $T'$ in such a way that each vertex $u$ of $T$ gets mapped into a vertex of $T'$ whose $S$-label is at least that of $u$; see Fig. 2.

Fig. 1 If the $S$-label of $r$ is at most that of $v$, and if $B_i \le B'_{\pi(i)}$ for some injective $\pi : [p] \to [q]$, then $T \le T'$

Fig. 2 The $\mathbb{N}$-labelled tree in gray is smaller than the $\mathbb{N}$-labelled tree in black in Kruskal’s order
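The recursive definition of Kruskal’s order can be made concrete for small examples. The following is only a sketch under stated assumptions: labels are taken to be integers with their usual order, a tree is encoded as a pair (label, list of branches), and the names `kruskal_leq` and `injective_match` are hypothetical helpers introduced here for illustration. The injective map $\pi$ is found by plain backtracking, which is adequate only for small trees.

```python
from typing import Callable, List, Tuple

# Assumed encoding: (label of the root, list of branches at the root).
Tree = Tuple[int, list]


def injective_match(small: List, big: List, leq: Callable) -> bool:
    """Backtracking search for an injective map small -> big with leq(s, image)."""
    def place(i: int, used: frozenset) -> bool:
        if i == len(small):
            return True
        return any(
            j not in used and leq(small[i], big[j]) and place(i + 1, used | {j})
            for j in range(len(big))
        )
    return place(0, frozenset())


def kruskal_leq(t: Tree, t2: Tree) -> bool:
    """True if t <= t2 in Kruskal's order, with integer labels."""
    label, branches = t
    label2, branches2 = t2
    # Case 1: the vertex v is the root of t2.
    if label <= label2 and injective_match(branches, branches2, kruskal_leq):
        return True
    # Case 2: the vertex v lies strictly below the root of t2.
    return any(kruskal_leq(t, child) for child in branches2)


fork = (2, [(1, []), (1, [])])           # a root with two leaves
bushy = (3, [(0, [(1, [])]), (5, [])])   # each leaf of fork fits in a different branch
path = (3, [(1, [(1, [])])])             # a path has no branching vertex
print(kruskal_leq(fork, bushy), kruskal_leq(fork, path))
```

In this sketch the fork embeds into the bushy tree but not into the path, in line with the remark that $T \le T'$ yields a homeomorphic embedding of the underlying CW-complexes.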
Theorem 1.2 (Kruskal’s Tree Theorem). For any non-empty well-partially-ordered set $S$, the set of $S$-labelled trees is well-partially-ordered by the partial order just defined.

Note that the lemma is, indeed, a special case of this theorem, obtained by restricting to star trees with a root (labelled with some fixed, irrelevant element of $S$) connected directly to all its leaves.

Proof. The proof is very similar to that of the lemma. Assume, for a contradiction, the existence of a bad sequence $T_1, T_2, \ldots$, which we may take minimal in the sense that the cardinality of the vertex set of $T_i$ is minimal among all bad sequences starting with $T_1, \ldots, T_{i-1}$. At its root, $T_i$ branches into a finite multi-set $R_i$ of smaller $S$-labelled trees. Let $R$ be the set-union of all $R_i$ as $i$ runs through $\mathbb{N}$, so that $R$ is a set of $S$-labelled trees. If $R$ contained a bad sequence, then it would contain a bad sequence $B_{i_1}, B_{i_2}, \ldots$ with $B_{i_j} \in R_{i_j}$ and $i_1 < i_2 < \ldots$. Then, as in the proof of the lemma, one shows that the sequence $T_1, \ldots, T_{i_1 - 1}, B_{i_1}, B_{i_2}, \ldots$ would be a bad sequence of trees contradicting the minimality of the original sequence. Hence $R$ is well-partially-ordered. Consider a subsequence $T_{k_1}, T_{k_2}, \ldots$ with $k_1 < k_2 < \ldots$ for which the root labels of the $T_{k_i}$ weakly increase in $S$. Applying the lemma to the well-partially-ordered set $R$, we find that there exist $j < l$ such that $R_{k_j} \le R_{k_l}$. But then $T_{k_j} \le T_{k_l}$ (by mapping the root of $T_{k_j}$ to the root of $T_{k_l}$ and $R_{k_j}$ suitably into $R_{k_l}$), a contradiction. □

We now formulate Higman’s Lemma, which is a useful direct consequence of Kruskal’s Tree Theorem. Given a partially ordered set $S$, define a partial order on the set $S^* = \bigcup_p S^p$ of finite sequences over $S$ as follows. A sequence $(s_1, \ldots, s_p)$ is less than or equal to a sequence $(s'_1, \ldots, s'_q)$ if there exists a strictly increasing map $\pi : [p] \to [q]$ satisfying $s_i \le s'_{\pi(i)}$ for all $i$.

Theorem 1.3 (Higman’s Lemma). For any well-partially-ordered set $S$, the partial order on $S^*$ just defined is a well-partial-order.

Proof. Encode a sequence $(s_1, \ldots, s_p)$ in $S^*$ as an $S$-labelled tree with root 1 labelled $s_1$, with a single child 2 labelled $s_2$, etc. Under this encoding the partial order on $S^*$ agrees with that on $S$-labelled trees, so that Kruskal’s theorem implies Higman’s Lemma. □
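Higman’s order on finite sequences is easy to test by machine. The sketch below assumes that sequences are Python lists or tuples and that the order on $S$ is supplied as a function `leq`; the names `higman_leq` and `dickson_leq` are hypothetical and not taken from the text. It uses the standard greedy strategy (match each term to the first admissible later position), which finds a strictly increasing map $\pi$ whenever one exists.

```python
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")


def higman_leq(s: Sequence[A], t: Sequence[A], leq: Callable[[A, A], bool]) -> bool:
    """True if some strictly increasing pi satisfies leq(s[i], t[pi(i)]) for all i."""
    j = 0
    for a in s:
        # Greedily take the first unused position of t whose entry dominates a.
        while j < len(t) and not leq(a, t[j]):
            j += 1
        if j == len(t):
            return False
        j += 1
    return True


def dickson_leq(a: Sequence[int], b: Sequence[int]) -> bool:
    """Componentwise order on tuples of non-negative integers of equal length."""
    return len(a) == len(b) and all(x <= y for x, y in zip(a, b))


s = [(1, 0), (0, 2)]
t = [(0, 0), (1, 1), (0, 1), (0, 3)]
print(higman_leq(s, t, dickson_leq), higman_leq(t, s, dickson_leq))
```

The example takes $S = \mathbb{Z}_{\ge 0}^2$ with the componentwise order, which is a well-partial-order by Dickson’s Lemma.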
2 Equivariant Gröbner Bases

Just as Dickson’s Lemma implies Hilbert’s Basis Theorem through a leading term argument, Higman’s Lemma implies the central finiteness result that all our later proofs build upon. What follows is certainly not the most general setting, but it will suffice for our purposes. For much more on this theme see [Coh67, Coh87, AH07, AH08, DS06, HS12, HMdC13, LSL09].
Let $X$ be a (typically infinite) set of variables, and let $\mathrm{Mon}$ denote the free commutative monoid of monomials in those variables. Let $\le$ be a monomial order on $\mathrm{Mon}$, i.e., a well-order that satisfies the additional condition that $u \le v \Rightarrow uw \le vw$ for all $u, v, w \in \mathrm{Mon}$. Let $\Pi$ be a (typically non-commutative) monoid acting from the left on $\mathrm{Mon}$ by means of monoid endomorphisms and assume that the action preserves strict inequalities, i.e., $u < v$ implies $\pi u < \pi v$ for all $\pi \in \Pi$. In particular, each $\pi$ acts by means of an injective map on $\mathrm{Mon}$. Moreover, we have $\pi u \ge u$ since otherwise the sequence $u > \pi u > \pi^2 u > \ldots$ would contradict that $\le$ is a well-order.

Let $K$ be a field and denote by $K[X] = K\mathrm{Mon}$ the ring of polynomials in the variables in $X$, or, equivalently, the monoid algebra over $K$ of $\mathrm{Mon}$. The action of $\Pi$ on $\mathrm{Mon}$ extends to an action on $K[X]$ by means of ring endomorphisms preserving 1. For a non-zero $f \in K[X]$ denote by $\mathrm{lm}(f) \in \mathrm{Mon}$ the largest monomial with a non-zero coefficient in $f$. As the action of $\Pi$ preserves the (strict) monomial order, we have $\mathrm{lm}(\pi f) = \pi\,\mathrm{lm}(f)$ in addition to the usual properties of $\mathrm{lm}$. In other words, the map $\mathrm{lm}$ from $K[X] \setminus \{0\}$ to $\mathrm{Mon}$ is $\Pi$-equivariant, and this motivates the terminology in the following definition.

Definition 2.1. Let $I$ be a $\Pi$-stable ideal in $K[X]$. Then a $\Pi$-Gröbner basis, or equivariant Gröbner basis if $\Pi$ is clear from the context, of $I$ is a subset $B$ of $I$ with the property that for any $f \in I$ there exists a $g \in B$ and a $\pi \in \Pi$ with $\mathrm{lm}(\pi g) \mid \mathrm{lm}(f)$.

The set $B$ is an equivariant Gröbner basis of $I$ if and only if the union $\Pi B = \{\pi g \mid \pi \in \Pi,\ g \in B\}$ of the $\Pi$-orbits of elements of $B$ is an ordinary Gröbner basis of $I$ (except that it will typically not be finite). Then, in particular, $\Pi B$ generates $I$ as an ideal; and we also say that $B$ generates $I$ as a $\Pi$-stable ideal. We do not require that an equivariant Gröbner basis be finite, but finite ones will of course be the most useful ones to us.
To formulate a criterion guaranteeing the existence of finite equivariant Gröbner bases we define the $\Pi$-divisibility relation $\mid_\Pi$ on $\mathrm{Mon}$ by $u \mid_\Pi v$ if and only if there exists a $\pi \in \Pi$ such that $\pi u$ divides $v$. This relation is reflexive (take $\pi = 1$), transitive [if $v = u' \cdot \pi u$ and $w = v' \cdot \sigma v$, then $w = (v' \cdot \sigma u') \cdot (\sigma\pi) u$], and antisymmetric (if $u \mid_\Pi v$ and $v \mid_\Pi u$, then $u \le \pi u \le v$ and $v \le \sigma v \le u$, so that $u = v$).

Proposition 2.2 ([HS12]). Every $\Pi$-stable ideal $I \subseteq K[X]$ has a finite $\Pi$-Gröbner basis if and only if $\mid_\Pi$ is a well-partial-order.

Proof. For the “only if” part observe that if $u_1, u_2, \ldots$ were a bad sequence of monomials, then the $\Pi$-stable ideal generated by them, i.e., the smallest $\Pi$-stable ideal containing them, would not have a finite equivariant Gröbner basis. For the “if” part let $I$ be a $\Pi$-stable ideal in $K[X]$. Let $M$ denote the set of $\mid_\Pi$-minimal elements of $\{\mathrm{lm}(f) \mid f \in I \setminus \{0\}\}$. As $\mid_\Pi$ is a well-partial-order, $M$ is finite, say $M = \{u_1, \ldots, u_p\}$. Choose $f_1, \ldots, f_p \in I \setminus \{0\}$ with $\mathrm{lm}(f_i) = u_i$. Then $\{f_1, \ldots, f_p\}$ is a $\Pi$-Gröbner basis of $I$. □
The main example that we shall use has $X := \{x_{ij} \mid i \in [k],\ j \in \mathbb{N}\}$ and $\Pi := \mathrm{Inc}(\mathbb{N})$, the monoid of maps $\mathbb{N} \to \mathbb{N}$ that are strictly increasing in the standard order on $\mathbb{N}$. This monoid acts on $X$ by $\pi x_{ij} = x_{i\pi(j)}$; the action extends multiplicatively to an action on $\mathrm{Mon}$ and linearly to an action by ring endomorphisms on the polynomial ring $R := K[X] = K[(x_{ij})_{ij}]$. There exist monomial orders for which $u < v$ implies $\pi u < \pi v$; for instance, the lexicographic order with $x_{ij} < x_{i'j'}$ if $i < i'$ or $i = i'$ and $j < j'$.

Theorem 2.3 ([Coh87, HS12]). Fix a natural number $k$. Then any $\mathrm{Inc}(\mathbb{N})$-stable ideal $I$ in the ring $K[x_{ij} \mid i \in [k],\ j \in \mathbb{N}]$ has a finite $\mathrm{Inc}(\mathbb{N})$-Gröbner basis with respect to any monomial order preserved by $\mathrm{Inc}(\mathbb{N})$. In particular, any $\mathrm{Inc}(\mathbb{N})$-stable ideal $I$ in that ring is generated, as an ideal, by finitely many $\mathrm{Inc}(\mathbb{N})$-orbits of polynomials.

Proof. By Proposition 2.2 it suffices to prove that $\mid_{\mathrm{Inc}(\mathbb{N})}$ is a well-partial-order. To this end, we shall apply Higman’s Lemma to $S = \mathbb{Z}_{\ge 0}^k$ with the componentwise partial order, which is a well-partial-order by Dickson’s Lemma. Encode a monomial $u$ in the variables $x_{ij}$ as a word $(s_1, \ldots, s_p)$ in $S^*$ as follows: $p$ is the largest value of the column index $j$ for which some variable $x_{ij}$ appears in $u$, and $(s_j)_i$ is the exponent of $x_{ij}$ in $u$. Now given any sequence $u_1, u_2, \ldots$ of monomials, by Higman’s Lemma there exist indices $m < l$ such that the sequences $s, s'$ encoding $u_m, u_l$ satisfy $s \le s'$. This means that there exists a strictly increasing map $\pi : [p] \to [p']$, with $p, p'$ the lengths of $s, s'$, such that $s_j \le s'_{\pi(j)}$ for all $j \in [p]$. Extend $\pi$ in any manner to a strictly increasing map $\mathbb{N} \to \mathbb{N}$. Then the exponent of any variable $x_{ij}$ in $\pi u_m$ equals $0$ if $j \notin \pi([p])$ and $(s_{\pi^{-1}(j)})_i \le (s'_j)_i$ otherwise. This proves that $\pi u_m \mid u_l$, as desired. □

The second statement in Theorem 2.3 has several consequences.
One is that any ascending chain $I_1 \subseteq I_2 \subseteq \ldots$ of $\mathrm{Inc}(\mathbb{N})$-stable ideals in $R$ stabilises at some finite index $n$: $I_n = I_{n+1} = \ldots$; we express this fact by saying that $R$ is $\mathrm{Inc}(\mathbb{N})$-Noetherian. This implies that $R$ is $\mathrm{Sym}(\infty)$-Noetherian, where the group $\mathrm{Sym}(\infty) := \bigcup_{j \in \mathbb{N}} \mathrm{Sym}([j])$ is obtained by embedding $\mathrm{Sym}([j])$ into $\mathrm{Sym}([j+1])$ as the stabiliser of $j + 1$ and where $\pi \in \mathrm{Sym}(\infty)$ acts on $x_{ij}$ by $\pi x_{ij} = x_{i\pi(j)}$. Indeed, the $\mathrm{Sym}(\infty)$-orbit of any polynomial $f$ contains the $\mathrm{Inc}(\mathbb{N})$-orbit of $f$, and hence any $\mathrm{Sym}(\infty)$-stable ideal is also $\mathrm{Inc}(\mathbb{N})$-stable. Note that one can also replace the countable group $\mathrm{Sym}(\infty)$ by the uncountable group of all permutations of $\mathbb{N}$, because the two have exactly the same orbits on $R$.

Example 2.4. In contrast to these beautiful positive results, consider the set $X = \{y_{ij} \mid i, j \in \mathbb{N}\}$ with $\Pi = \mathrm{Inc}(\mathbb{N})$-action given by $\pi y_{ij} = y_{\pi(i)\pi(j)}$. We claim that $\mid_\Pi$ is not a well-partial-order. Indeed, consider the sequence of monomials $y_{12} y_{21},\ y_{12} y_{23} y_{31},\ \ldots$ encoding directed cycles on two, three, etc. vertices. Any $\pi \in \mathrm{Inc}(\mathbb{N})$ maps such a monomial to a monomial representing another directed cycle of the same length.
Since no larger cycle contains a smaller cycle as a subgraph, this is a bad sequence of monomials. The same argument shows that $K[X]$ is not $\mathrm{Sym}(\infty)$-Noetherian. Similar counterexamples exist for the action of $\mathrm{Sym}(\infty) \times \mathrm{Sym}(\infty)$ on $X$ given by $(\pi, \sigma) y_{ij} = y_{\pi(i)\sigma(j)}$.

We now return to the general setting of equivariant Gröbner bases, without the assumption that $\mid_\Pi$ is a well-partial-order. These bases can sometimes be computed by a $\Pi$-equivariant version of Buchberger’s algorithm. The halting criterion in this equivariant Buchberger algorithm is the following equivariant version of Buchberger’s criterion involving $S$-polynomials.

Proposition 2.5 (Equivariant Buchberger Criterion). Let $B$ be a subset of a $\Pi$-stable ideal $I$ in $K[X]$. Then $B$ is a $\Pi$-Gröbner basis of $I$ if and only if for all $f, g \in B$ and all $\pi, \sigma \in \Pi$ the ordinary $S$-polynomial $S(\pi f, \sigma g)$ gives remainder 0 upon division by $\Pi B$.

This criterion follows immediately from the ordinary Buchberger criterion applied to $\Pi B$. Indeed, while most textbooks assume a finite number of variables, division-with-remainder and Buchberger’s criterion apply to infinitely many variables as well; the crucial ingredient is the fact that the monomial order is a well-order.

Unfortunately, since $\Pi$ is typically infinite, checking whether $B$ is an equivariant Gröbner basis using the equivariant Buchberger criterion may be an infinite task, even when $B$ is finite. But in many cases of interest this task can be reduced to a finite task as follows. Assume, first, that for any two polynomials $f, g \in K[X]$ the Cartesian product $\Pi f \times \Pi g$ of the $\Pi$-orbits of $f$ and $g$ is the union of finitely many diagonal orbits $\Pi(\pi_i f, \sigma_i g) = \{(\pi\pi_i f, \pi\sigma_i g) \mid \pi \in \Pi\}$, $i = 1, \ldots, r$, where $r \in \mathbb{N}$ and $\pi_1, \sigma_1, \ldots, \pi_r, \sigma_r \in \Pi$ are allowed to depend on $f, g$. Then we would like to check only whether the $S$-polynomials $S(\pi_i f, \sigma_i g)$ reduce to zero upon division by $\Pi B$, and conclude that all $S(\pi f, \sigma g)$ reduce to zero.
For this we would like that $S(\pi\pi_i f, \pi\sigma_i g) = \pi S(\pi_i f, \sigma_i g)$, because letting $\pi$ act on the reduction of $S(\pi_i f, \sigma_i g)$ to zero yields a reduction of $S(\pi\pi_i f, \pi\sigma_i g)$ to zero. This desired $\Pi$-equivariance of $S$-polynomials does not follow from the assumptions so far, but it does follow if we make the further assumption that each $\pi \in \Pi$ preserves least common multiples, i.e., that $\mathrm{lcm}(\pi u, \pi v) = \pi\,\mathrm{lcm}(u, v)$ for all $u, v \in \mathrm{Mon}$. This is, in particular, the case if $\Pi$ maps variables to variables.

Theorem 2.6. Assume that Cartesian products $\Pi f \times \Pi g$ of $\Pi$-orbits on $K[X]$ are unions of finitely many diagonal $\Pi$-orbits, and assume that $\Pi$ preserves least common multiples of monomials. Let $S$ be a finite subset of $K[X]$ and consider the following algorithm:
(1) Set $B := S$ and $P := S_2 \cup \{(f, f) \mid f \in S\}$, where $S_2$ is the set of pairs of distinct elements from $S$.
(2) If $P = \emptyset$, then stop; otherwise pick $(f, g) \in P$ and remove it from $P$.
(3) Choose $r \in \mathbb{N}$ and $\pi_1, \sigma_1, \ldots, \pi_r, \sigma_r \in \Pi$ such that $\Pi f \times \Pi g = \bigcup_{i=1}^{r} \Pi(\pi_i f, \sigma_i g)$.
(4) For each $i = 1, \ldots, r$ do the following: reduce $S(\pi_i f, \sigma_i g)$ modulo $\Pi B$, and if the remainder $h$ is non-zero, then add $h$ to $B$ and consequently add $B \times \{h\}$ to $P$.
(5) Return to step 2.
If and when this algorithm terminates, then $B$ is a $\Pi$-Gröbner basis for the $\Pi$-stable ideal generated by $S$. Moreover, if $\mid_\Pi$ is a well-partial-order, then this algorithm does terminate.

As argued above, all but the last sentence of this theorem follows from the ordinary Buchberger criterion. The last sentence follows as for the ordinary Buchberger algorithm: if the algorithm does not terminate, then an infinite number of non-zero remainders $h_1, h_2, \ldots$ are added. If $\mid_\Pi$ is a well-partial-order, then there exist $i < j$ with $\mathrm{lm}(h_i) \mid_\Pi \mathrm{lm}(h_j)$, which means that $h_j$ was not reduced with respect to $h_i$, a contradiction.

One point to stress is that in initialising the pair set $P$ also pairs of two identical polynomials $(f, f)$, $f \in S$ need to be added; and that similarly, when adding a remainder $h$ to $B$, also the pair $(h, h)$ needs to be added to $P$. Indeed, already the $\Pi$-stable ideal generated by a single polynomial can be interesting, as the following example shows.

Example 2.7. Let $X = \{x_i \mid i \in \mathbb{N}\} \cup \{y_{ij} \mid i, j \in \mathbb{N},\ i > j\}$ and let $\Pi = \mathrm{Inc}(\mathbb{N})$ act on $X$ by $\pi x_i = x_{\pi(i)}$ and $\pi y_{ij} = y_{\pi(i)\pi(j)}$. Set $S := \{y_{21} - x_2 x_1\}$ and let $I$ denote the $\Pi$-stable ideal generated by $S$. We would like to compute an equivariant Gröbner basis of the elimination ideal $I \cap K[y_{ij} \mid i > j]$. To this end, we choose the lexicographic monomial order with $x_1 < x_2 < \ldots$ and $y_{ij} < y_{kl}$ if either $i < k$ or $i = k$ and $j < l$, and with $y_{kl} < x_i$ for all $i, k, l$. Note that $\Pi$ preserves the strict monomial order and least common multiples. To apply the equivariant Buchberger algorithm we must further check that Cartesian products of $\Pi$-orbits on $K[X]$ are finite unions of diagonal $\Pi$-orbits. For this, let $f, g$ be elements of $K[X]$ and let $p, q$ be such that all variables in $f, g$ have indices contained in $[p], [q]$, respectively.
Then $\pi f, \sigma g$ depend only on the restrictions of $\pi$ and $\sigma$ to $[p], [q]$, respectively. Enumerate all (finitely many) pairs $(\pi_i : [p] \to \mathbb{N},\ \sigma_i : [q] \to \mathbb{N})$, $i = 1, \ldots, r$ of increasing maps for which the union of $\mathrm{im}(\pi_i)$ and $\mathrm{im}(\sigma_i)$ equals some interval $[t] = \{1, \ldots, t\}$, necessarily with $t \le p + q$. Extend these $\pi_i$ and $\sigma_i$ arbitrarily to elements of $\Pi$. We claim that for any pair $\pi, \sigma \in \Pi$ we have $(\pi f, \sigma g) = \tau(\pi_i f, \sigma_i g)$ for some $i \in [r]$ and some $\tau \in \Pi$. Indeed, there exists a unique $i$ for which there exists an (again, unique) increasing map $\tau : [t] = \mathrm{im}(\pi_i) \cup \mathrm{im}(\sigma_i) \to \pi([p]) \cup \sigma([q])$ such that the restrictions of $\pi, \sigma$ to $[p], [q]$ equal the restrictions of $\tau \circ \pi_i, \tau \circ \sigma_i$ to $[p], [q]$, respectively. Extend $\tau$ in any manner to an element of $\Pi$ and we find $(\pi f, \sigma g) = \tau(\pi_i f, \sigma_i g)$, as desired.

This means that we can apply the equivariant Buchberger algorithm, but without the guarantee that it terminates, since $\mid_\Pi$ is not a well-partial-order (adapt Example 2.4 to see this). It turns out the algorithm does terminate, though, and yields the following equivariant Gröbner basis (after self-reduction):
$$B = \{x_1 x_2 - y_{21},\ x_3 y_{21} - x_2 y_{31},\ x_3 y_{21} - x_1 y_{32},\ x_2 y_{31} - x_1 y_{32},\ x_1^2 y_{32} - y_{31} y_{21},\ y_{43} y_{21} - y_{41} y_{32},\ y_{42} y_{31} - y_{41} y_{32}\}.$$
Since the monomial order is an elimination order, we conclude that $I \cap K[y_{ij} \mid i > j]$ has a $\Pi$-Gröbner basis given by the last two binomials. In particular, that ideal is generated, as an $\mathrm{Inc}(\mathbb{N})$-stable ideal, by these binomials. The result that we have just proved by computer first appeared as a theorem in [dLST95].

This example gives the ideal of the so-called second hypersimplex, or, with a slight modification, of the Gaussian one-factor model. The $k$-factor model for $k = 2$ and higher will be the subject of Sect. 6.
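A quick sanity check of Example 2.7 is possible without any Gröbner basis software. The $\mathrm{Inc}(\mathbb{N})$-orbit of $y_{21} - x_2 x_1$ consists of the binomials $y_{ij} - x_i x_j$ with $i > j$, so a binomial $m_1 - m_2$ lies in $I$ exactly when $m_1$ and $m_2$ have the same image under the substitution $y_{ij} \mapsto x_i x_j$. The sketch below only tests this membership for the seven binomials listed above; it does not reproduce the equivariant Buchberger computation. The encoding of monomials as lists of tagged tuples and the names `image`, `in_ideal` and `BINOMIALS` are assumptions made for illustration.

```python
from collections import Counter


def x(i):
    """The variable x_i, encoded as a tagged tuple."""
    return ("x", i)


def y(i, j):
    """The variable y_ij with i > j, encoded as a tagged tuple."""
    return ("y", i, j)


def image(monomial):
    """Exponent multiset of the image under x_i -> x_i, y_ij -> x_i * x_j."""
    exps = Counter()
    for var in monomial:
        for index in var[1:]:
            exps[index] += 1
    return exps


def in_ideal(m1, m2):
    """True if the binomial m1 - m2 vanishes under the substitution."""
    return image(m1) == image(m2)


# Each pair (m1, m2) stands for the binomial m1 - m2.
BINOMIALS = [
    ([x(1), x(2)], [y(2, 1)]),
    ([x(3), y(2, 1)], [x(2), y(3, 1)]),
    ([x(3), y(2, 1)], [x(1), y(3, 2)]),
    ([x(2), y(3, 1)], [x(1), y(3, 2)]),
    ([x(1), x(1), y(3, 2)], [y(3, 1), y(2, 1)]),
    ([y(4, 3), y(2, 1)], [y(4, 1), y(3, 2)]),
    ([y(4, 2), y(3, 1)], [y(4, 1), y(3, 2)]),
]

print(all(in_ideal(m1, m2) for m1, m2 in BINOMIALS))
```

By contrast, a binomial such as $y_{43} y_{21} - y_{42} y_{21}$ fails the test, since the two monomials map to different monomials in the $x_i$.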
3 Equivariant Noetherianity

In this section we establish a number of constructions of equivariantly Noetherian rings and spaces. For some of the material see [Dra10].

Given a ring $R$ (always commutative, with 1) and a monoid $\Pi$ with a left action on $R$ by means of (always unital) endomorphisms we say that $R$ is $\Pi$-Noetherian, or equivariantly Noetherian if $\Pi$ is clear from the context, if every chain $I_1 \subseteq I_2 \subseteq \ldots$ of $\Pi$-stable ideals in $R$ eventually stabilises, that is, if there exists an $n \in \mathbb{N}$ for which $I_n = I_{n+1} = \ldots$. This is equivalent to the condition that any $\Pi$-stable ideal $I$ is generated, as an ideal, by finitely many $\Pi$-orbits $\Pi f_1, \ldots, \Pi f_s$. We then say that $f_1, \ldots, f_s$ generate $I$ as a $\Pi$-stable ideal. We have seen a major example in Sect. 2, namely, for any fixed natural number $k$ and any field $K$ the ring $K[x_{ij} \mid i \in [k],\ j \in \mathbb{N}]$ with its action of $\mathrm{Inc}(\mathbb{N})$ on the second index is $\mathrm{Inc}(\mathbb{N})$-Noetherian.

There are several constructions of new equivariantly Noetherian rings from existing ones. The first and most obvious is that if $R$ is $\Pi$-Noetherian and $I \subseteq R$ is a $\Pi$-stable ideal, then $R/I$ is $\Pi$-Noetherian: any chain of $\Pi$-stable ideals in $R/I$ lifts to a chain of $\Pi$-stable ideals in $R$ containing $I$, and the first chain stabilises exactly when the second chain does.

A second construction takes a $\Pi$-Noetherian ring $R$ to the polynomial ring $R[x]$ in a variable $x$, where $\Pi$ acts only on the coefficients from $R$. The standard proof of Hilbert’s Basis Theorem, say from [Lan65], generalises word for word from trivial $\Pi$ to general $\Pi$.

It is not true, in general, that a subring of an equivariantly Noetherian ring is equivariantly Noetherian. Indeed, this is already not true for ordinary Noetherianity, where $\Pi$ is the trivial monoid. However, the following construction proves that certain well-behaved subrings of equivariantly Noetherian rings are again equivariantly Noetherian. Suppose that $S$ is a subring of $R$ with the property that $R$
splits as a direct sum $S \oplus M$ of $S$-modules. If $J$ is an ideal in $S$ and $I$ is the ideal in $R$ generated by $J$, then we claim that $S \cap I = J$. Indeed, any element $f$ of $S \cap I$ can be written as $f = \sum_i f_i g_i$ with the $f_i$ elements of $J$ and the $g_i$ elements of $R$. Applying the $S$-linear projection $\rho : R \to S$ along $M$ to both sides yields $f = \sum_i f_i \rho(g_i) \in J$, as claimed. If, moreover, the monoid $\Pi$ acts on $R$ and stabilises $S$, then $I$ is $\Pi$-stable if $J$ is. We conclude that if $R$ is $\Pi$-Noetherian, then any chain $I_1 \subseteq I_2 \subseteq \ldots$ of $\Pi$-stable ideals in $S$ generates such a chain $J_1 \subseteq J_2 \subseteq \ldots$ in $R$, and $J_n = J_{n+1} = \ldots$ implies that $I_n = J_n \cap S = J_{n+1} \cap S = I_{n+1} = \ldots$; so $S$ is $\Pi$-Noetherian.

A particularly important example of this situation is the following proposition, due to Kuttler. Suppose that a group $H$ acts on $R$ by means of ring automorphisms, and that the action of $H$ commutes with that of $\Pi$, i.e., for every $\pi \in \Pi$ and $h \in H$ and $f \in R$ we have $\pi h f = h \pi f$. Then the ring $R^H := \{f \in R \mid Hf = \{f\}\}$ of $H$-invariant elements of $R$ is stable under the action of $\Pi$.

Proposition 3.1. If on the one hand $R$ is $\Pi$-Noetherian and on the other hand $R$ splits as a direct sum of irreducible $\mathbb{Z}H$-modules, then $R^H$ is also $\Pi$-Noetherian.

Proof. By the discussion preceding the proposition, we need only prove that $R$ splits as a direct sum $R^H \oplus M$ of $R^H$-modules. For this, split $R$ as a direct sum $\bigoplus_i M_i$ of irreducible $\mathbb{Z}H$-modules $M_i$. Then $R^H$ is the direct sum of the $M_i$ with trivial $H$-action, and we set $M$ equal to the direct sum of the $M_i$ with non-trivial $H$-action. We want to show that $f M_i \subseteq M$ for every $f \in R^H$ and every $M_i \subseteq M$. To this end, let $\rho : R \to R^H$ be the projection along $M$ and consider the map $M_i \to R^H$ sending $m$ to $\rho(f m)$. By invariance of $f$ this map is $H$-equivariant, and by irreducibility of $M_i$ its kernel is either $\{0\}$ or all of $M_i$. But in the first case, the non-trivial $H$-module $M_i$ would be embedded into the trivial $H$-module $R^H$, which is impossible. Hence that kernel is all of $M_i$, and $f M_i \subseteq M$.
□

In our applications, $R$ will typically be an algebra over some field $K$, and $H$ will act $K$-linearly. Then it suffices that $R$ splits as a direct sum of irreducible $KH$-modules (as one can infer from the proposition by replacing $H$ by the group $K^* \times H$).

We give several applications of this proposition. First, we have seen in Example 2.4 that the ring of polynomials in the entries $y_{ij}$ of an $\mathbb{N} \times \mathbb{N}$-matrix is not $\mathrm{Inc}(\mathbb{N})$-Noetherian. The following corollaries show that interesting quotients of such rings are $\mathrm{Inc}(\mathbb{N})$-Noetherian.

Corollary 3.2. Let $k$ be a natural number. Consider the homomorphism $\phi : K[y_m \mid m \in \mathbb{N}^k] \to K[x_{ij} \mid i \in [k],\ j \in \mathbb{N}]$ sending $y_m$ to $\prod_{i=1}^{k} x_{i, m_i}$. The kernel of $\phi$ is generated by finitely many $\mathrm{Inc}(\mathbb{N})$-orbits of polynomials, and the quotient $K[(y_m)_m] / \ker \phi$ is $\mathrm{Inc}(\mathbb{N})$-Noetherian.

In more geometric language, that quotient is the coordinate ring of $k$-dimensional infinite-by-infinite-by-...-by-infinite tensors of rank one.
Proof. The first statement follows from the standard fact that the ideal of the variety of rank-one tensors is generated by the quadrics $y_{m_0 m_1} y_{m_0' m_1'} - y_{m_0 m_1'} y_{m_0' m_1}$, where $m_0, m_0'$ are multi-indices of length equal to some $\ell \le k$ and $m_1, m_1'$ are multi-indices of length $k - \ell$. The entries of the multi-indices $m_0, m_0', m_1, m_1'$ taken together form a set of cardinality at most $2k$, and this implies that each quadric of the form above is obtained by applying some element of $\mathrm{Inc}(\mathbb{N})$ to one of the finitely many such quadrics with all indices in the interval $[2k]$.

The second statement follows from the proposition (or rather the discussion preceding it): the quotient is isomorphic, as a ring with $\mathrm{Inc}(\mathbb{N})$-action, to the subring $S = \mathrm{im}\,\phi$ of $K[x_{ij} \mid i \in [k],\ j \in \mathbb{N}]$. The ring $S$ is spanned by all monomials in the $x_{ij}$ that involve equally many variables, counted with their exponents, from all of the $k$ rows of the $k \times \mathbb{N}$-matrix $(x_{ij})_{ij}$. If one writes $M$ for the vector space complement of $S$ spanned by all other monomials, then $M$ is an $S$-module, and the fact that $K[(x_{ij})_{ij}]$ is $\mathrm{Inc}(\mathbb{N})$-Noetherian implies that $S$ is. □

Alternatively, if $K$ is infinite, then one can characterise $S$ as the set of $H$-invariants, where $H$ is the subgroup of $(K^*)^k$ consisting of $k$-tuples with product 1 and where $h$ acts on $x_{ij}$ by $h x_{ij} = h_i x_{ij}$. Each monomial outside $S$ spans an irreducible, non-trivial $H$-module, and the proposition implies that $S$ is $\mathrm{Inc}(\mathbb{N})$-Noetherian.

A substantial generalisation of Corollary 3.2, which applies to a wide class of monomial maps into $K[x_{ij} \mid i \in [k],\ j \in \mathbb{N}]$, is proved in [DEKL13]. For stabilisation of appropriate lattice ideals, see [HMdC13].

The next corollary shows that determinantal quotients of the coordinate ring of infinite-by-infinite matrices are $\mathrm{Inc}(\mathbb{N})$-Noetherian, provided that the field has characteristic zero.

Corollary 3.3.
For any natural number $k$ and any field $K$ of characteristic zero, the quotient of the ring $K[y_{ij} \mid i, j \in \mathbb{N}]$ by the ideal $I_k$ generated by all $(k+1) \times (k+1)$-minors of the matrix $(y_{ij})_{ij}$ is $\mathrm{Inc}(\mathbb{N})$-Noetherian.

Note that the set of these determinants is the union of finitely many $\mathrm{Inc}(\mathbb{N})$-orbits of equations, so that the corollary implies that any $\mathrm{Inc}(\mathbb{N})$-stable ideal containing $I_k$ is generated by finitely many $\mathrm{Inc}(\mathbb{N})$-orbits.

Proof. Let the group $H = \mathrm{GL}_k$ act on the ring $K[x_{il} \mid i \in \mathbb{N},\ l \in [k]]$ by $h x_{il} := (xh)_{il}$, where $xh$ is the product of the $\mathbb{N} \times k$-matrix $x$ with variable entries and the $k \times k$-matrix $h$. Similarly, let $H$ act on the ring $K[z_{lj} \mid l \in [k],\ j \in \mathbb{N}]$ by $h z_{lj} := (h^{-1} z)_{lj}$. Note that both actions commute with the action of $\mathrm{Inc}(\mathbb{N})$ on the indices $i, j$, respectively. Let $R$ be the polynomial ring $K[x_{il}, z_{lj} \mid i, j \in \mathbb{N},\ l \in [k]]$, equipped with the natural $\mathrm{Inc}(\mathbb{N})$-action and $H$-action. Classical invariant theory tells us that rings, like $R$, on which $H$ acts as an algebraic group split into a direct sum of irreducible $KH$-modules; here we use that $\operatorname{char} K = 0$. So we may apply Proposition 3.1. The First Fundamental Theorem [GW09, Theorem 5.2.1] for $H$ states that the algebra $R^H$ of $H$-invariant elements of the ring $R$ is generated by all pairings
\[ p_{ij} := \sum_l x_{il} z_{lj} = (xz)_{ij}. \]
The Second Fundamental Theorem [GW09, Theorem 12.2.12] states that the kernel of the homomorphism $K[(y_{ij})_{ij}] \to R$ determined by $y_{ij} \mapsto p_{ij}$ is precisely $I_k$. Thus the quotient by $I_k$ is isomorphic, as a $K$-algebra with $\mathrm{Inc}(\mathbb{N})$-action, to $R^H$. This proves the corollary. □
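The easy half of the Second Fundamental Theorem used above, namely that every $(k+1) \times (k+1)$-minor vanishes on a matrix of pairings $p_{ij} = (xz)_{ij}$, can be checked on a finite truncation. The following is only a sketch on random integer data; the function names and the sizes $k = 2$, $n = 5$ are illustrative assumptions and not part of the notes.

```python
from itertools import combinations
import random

def det(m):
    """Exact determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def pairing_matrix(x, z):
    """Entries p_ij = sum_l x_il * z_lj, i.e. the matrix product x z."""
    k = len(z)
    return [[sum(x[i][l] * z[l][j] for l in range(k)) for j in range(len(z[0]))]
            for i in range(len(x))]

def minors(m, size):
    """Yield all size-by-size minors of the matrix m."""
    for rows in combinations(range(len(m)), size):
        for cols in combinations(range(len(m[0])), size):
            yield det([[m[i][j] for j in cols] for i in rows])

random.seed(0)
k, n = 2, 5                      # illustrative sizes: an n x k times a k x n product
x = [[random.randint(-5, 5) for _ in range(k)] for _ in range(n)]
z = [[random.randint(-5, 5) for _ in range(n)] for _ in range(k)]
p = pairing_matrix(x, z)

# The product has rank at most k, so every (k+1)-minor must vanish.
all_vanish = all(d == 0 for d in minors(p, k + 1))
```

The check only illustrates that the generators of $I_k$ lie in the kernel; that they generate the whole kernel is the content of the cited theorem and is not tested here.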
Similar results are obtained by taking other rings with group actions where the invariants and the polynomial relations among them are known. Here is an example, which first appeared in [Dra10].

Corollary 3.4. For any natural number $k$ and any field $K$ of characteristic zero, the kernel of the homomorphism
\[ K[y_m \mid m \in \mathbb{N}^k,\ m_1 < \dots < m_k] \to K[x_{ij} \mid i \in [k],\ j \in \mathbb{N}] \]
sending $y_m$ to the determinant of $x[m]$, the $k \times k$-submatrix of $x$ obtained by taking the columns indexed by $m$, is generated by finitely many $\mathrm{Inc}(\mathbb{N})$-orbits; and the quotient of $K[(y_m)_m]$ by this kernel is $\mathrm{Inc}(\mathbb{N})$-Noetherian.

Proof. Let $H = \mathrm{SL}_k$, the group of $k \times k$-matrices of determinant 1, act on $K[(x_{ij})_{ij}]$ by $h x_{ij} = (h^{-1} x)_{ij}$. The First Fundamental Theorem for $\mathrm{SL}_k$ says that the $k \times k$-minors of $x$ generate the invariant ring of $H$, and the Second Fundamental Theorem says that the Plücker relations among those determinants, which can be covered by finitely many $\mathrm{Inc}(\mathbb{N})$-orbits, generate the ideal of all relations. Now proceed as in the previous case. □

We remark that Alexei Krasilnikov showed that the Noetherianity of this corollary does not hold when $\operatorname{char} K = 2$ and $k = 2$. However, a weaker form of Noetherianity, which we introduce now, does hold.

Let $X$ be a topological space equipped with a right action of a monoid $\Pi$ by means of continuous maps $X \to X$. Then we call $X$ equivariantly Noetherian, or $\Pi$-Noetherian, if every chain $X = X_0 \supseteq X_1 \supseteq X_2 \supseteq \dots$ of closed, $\Pi$-stable subsets stabilises. If $R$ is a $K$-algebra with a left action of $\Pi$ by means of $K$-algebra endomorphisms, then for any $K$-algebra $A$ the set $X := R(A) := \mathrm{Hom}_K(R, A)$ of $A$-valued points of $R$ is a topological space with respect to the Zariski topology, in which closed sets are defined by the vanishing of elements of $R$. Moreover, the monoid $\Pi$ acts from the right on $R(A)$ by $(p\pi)(r) = p(\pi r)$. If $R$ is $\Pi$-Noetherian, then $R(A)$ is $\Pi$-Noetherian in the topological sense.
Conversely, if $R(A)$ is $\Pi$-Noetherian in the topological sense for every $K$-algebra $A$, then $R$ is $\Pi$-Noetherian: just take $A$ equal to $R$, so that the map that takes closed sets to vanishing ideals is a bijection. However, topological Noetherianity of, say, $R(K)$ does not necessarily imply Noetherianity of $R$. Krasilnikov's example illustrates this phenomenon: the ring $K[\det x[i, j] \mid i, j \in \mathbb{N},\ i < j]$, where $x$ is a $[2] \times \mathbb{N}$-matrix of variables, is not $\mathrm{Inc}(\mathbb{N})$-Noetherian if $\operatorname{char} K = 2$, but its set of $K$-valued points is. Indeed, this set of points is the image of $K^{[2] \times \mathbb{N}}$ under the $\mathrm{Inc}(\mathbb{N})$-equivariant map sending a matrix to the vector of its $2 \times 2$-determinants. Since $K^{[2] \times \mathbb{N}}$ is $\mathrm{Inc}(\mathbb{N})$-Noetherian, so is its image. More generally, $\Pi$-equivariant images of $\Pi$-Noetherian topological spaces are $\Pi$-Noetherian, and so are $\Pi$-stable subsets with the induced topology. Another construction that we shall make much use of is the following.
Proposition 3.5. Let $G$ be a group with a right action by homeomorphisms on a topological space $X$. Let $\Pi$ be a submonoid of $G$ and let $Z$ be a $\Pi$-stable subset of $X$. Assume that $Z$ is $\Pi$-Noetherian with the induced topology. Then $Y := ZG = \bigcup_{g \in G} Zg \subseteq X$ is $G$-Noetherian with the induced topology.

Proof. Let $Y = Y_1 \supseteq Y_2 \supseteq Y_3 \supseteq \dots$ be a chain of $G$-stable closed subsets of $Y$. Then each $Z_i := Y_i \cap Z$ is $\Pi$-stable and closed, hence by $\Pi$-Noetherianity of $Z$ there exists an $n$ with $Z_n = Z_{n+1} = \dots$. By definition of $Y$, for each $y \in Y_i$ there exist a $g \in G$ and a $z \in Z$ with $y = zg$, and by $G$-stability of $Y_i$ we have $z = y g^{-1} \in Z_i$. This means that $Y_i$ can be recovered from $Z_i$ as $Y_i = Z_i G$, and hence the chain $Y_1 \supseteq Y_2 \supseteq Y_3 \supseteq \dots$ stabilises at $Y_n$ as well. □
4 Chains of Varieties

In the remainder of these notes we study various chains of interesting embedded finite-dimensional varieties, for which we want to prove that from some member of the chain on, all equations for later members come from those of earlier members by applying symmetry. To use the infinite-dimensional techniques from the previous sections, we first pass to a projective limit, prove that the limit is defined by finitely many orbits of equations, and from this fact we derive the desired result concerning the finite-dimensional varieties. In this short section we set up the required framework for this, again without trying to be as general as possible. Most of this material is from [Dra10].

Thus let $K$ be a field and let $R_1, R_2, \dots$ be commutative $K$-algebras with 1. The algebra $R_i$ plays the role of coordinate ring of the ambient space of the $i$-th variety in our chain. Assume that the $R_i$ are linked by (unital) ring homomorphisms $\iota_i\colon R_i \to R_{i+1}$ and $\pi_i\colon R_{i+1} \to R_i$ satisfying $\pi_i \circ \iota_i = 1_{R_i}$. Then we can form the $K$-algebra $R_\infty := \bigcup_{i \in \mathbb{N}} R_i$ with respect to the inclusions $\iota_i$; the use of the $\pi_i$ will become clear later.

Suppose, next, that we are given ideals $I_i \subseteq R_i$ such that $\pi_i$ maps $I_{i+1}$ into $I_i$ and $\iota_i$ maps $I_i$ into $I_{i+1}$. The ideal $I_i$ plays the role of defining ideal of the $i$-th variety in our chain. Writing $S_i := R_i / I_i$ we find that the $\iota_i, \pi_i$ induce inclusions $S_i \to S_{i+1}$ and surjections $S_{i+1} \to S_i$, respectively, and we set $I_\infty := \bigcup_i I_i$ and $S_\infty := \bigcup_i S_i$, which also equals $R_\infty / I_\infty$.

Assume, next, that a group $G_i$ acts on $R_i$ from the left by means of $K$-algebra automorphisms, and that we are given embeddings $G_i \to G_{i+1}$ that render both $\iota_i$ and $\pi_i$ equivariant with respect to $G_i$. Suppose furthermore that each $I_i$ is $G_i$-stable, which expresses that the $i$-th variety has the same symmetries as imposed on the ambient space. We form the group $G_\infty := \bigcup_i G_i$, which acts on $R_\infty, I_\infty, S_\infty$ by means of automorphisms.
For any $K$-algebra $A$, we write $R_i(A), S_i(A), R_\infty(A), S_\infty(A)$ for the sets of $A$-valued points of these algebras, i.e., for the set of homomorphisms $R_i \to A$, etc. As customary in algebraic geometry, for a $p$ in these point sets, we write $f(p)$ rather than $p(f)$ for the evaluation of $p$ on an element $f$ in the corresponding algebra. These sets are topological spaces with respect to the Zariski topology, in which closed sets are of the form $\{p \in R_i(A) \mid J(p) = \{0\}\}$ for some ideal $J$ in $R_i$, and similarly for the other algebras. On these topological spaces $G_i$ or $G_\infty$ acts by means of homeomorphisms. Our set-up so far is summarised in the diagram of Fig. 3, where $\iota^*, \pi^*$ are the pull-backs of $\iota$ and $\pi$, respectively. The relation $\pi \circ \iota = 1$ implies that $\iota^*$ is surjective (and not just dominant) and that $\pi^*$ is injective; indeed, the latter is a closed embedding. Still, $\pi^*$ is needed only a bit later.

[Fig. 3 Chains of varieties. Diagram with the point sets $R_1(A), R_2(A), R_3(A), \dots, R_\infty(A)$ in one row, the point sets $S_1(A), S_2(A), S_3(A), \dots, S_\infty(A)$ in a second row, and the groups $G_1, G_2, G_3, \dots, G_\infty$ in a third row; consecutive columns are linked by the pull-back maps.]

The topological space $R_\infty(A)$ is canonically the same as the projective limit, in the category of topological spaces, of the spaces $R_i(A)$ with their Zariski topologies.

First, at the level of sets, an $A$-valued point $p$ of $R_\infty$ gives rise, by composition with the embeddings $R_i \to R_\infty$, to homomorphisms $p_i\colon R_i \to A$ for all $i \in \mathbb{N}$. The resulting sequence $(p_1, p_2, \dots)$ has the property that the pull-back of $\iota_i$ maps $p_{i+1}$ to $p_i$, i.e., it is a point of the inverse limit $\varprojlim_i R_i(A)$ of sets. Conversely, a point of this inverse limit gives homomorphisms $p_i\colon R_i \to A$ such that $p_{i+1} \circ \iota_i = p_i$, and together these define a homomorphism $R_\infty \to A$.

Second, the projective limit topology on $R_\infty(A)$ is the weakest topology that renders all maps $R_\infty(A) \to R_i(A)$ continuous. This means, in particular, that sets given by the vanishing of a single element of $R_i \subseteq R_\infty$ must be closed, and so must intersections of these, which are sets given by the vanishing of an ideal in $R_\infty$. This shows that the projective limit topology on $R_\infty(A)$ has at least as many closed sets as the Zariski topology, and the converse is also clear since the maps $R_\infty(A) \to R_i(A)$ are continuous in the Zariski topology.

The basic result that we shall use throughout the rest of the notes is the following, where the use of the $\pi_i$ becomes apparent.

Proposition 4.1.
Let $i_0 \in \mathbb{N}$ and assume that the set $S_\infty(A)$ is characterised inside $R_\infty(A)$ by the vanishing of all $gf \in R_\infty$ with $g \in G_\infty$ and $f \in I_{i_0}$. Then for $i \ge i_0$ the set $S_i(A)$ is characterised by the vanishing of all functions of the form $\pi_i \cdots \pi_{l-1}\, g f$ with $l \ge i$, $g \in G_l$, and $f \in I_{i_0}$.

Proof. That these functions vanish on $S_i(A)$ follows from the inclusion $I_{i_0} \subseteq I_l$, the fact that $G_l$ stabilises $I_l$, and the fact that $\pi_j$ maps $I_{j+1}$ into $I_j$. Conversely, suppose that $p_i \in R_i(A)$ is a zero of all functions in the proposition, and let $p = (p_1, p_2, \dots)$ be the point of $R_\infty(A)$ obtained by setting $p_{j+1} := \pi_j^* p_j$ for $j \ge i$ and $p_j := \iota_j^* p_{j+1}$ for $j < i$. Then $p$ is a point in $S_\infty(A)$ by the assumed characterisation of the latter set: any $g \in G_\infty$ lies in $G_l$ for some $l$ which we may take larger than $i$, and for $f \in I_{i_0}$ we have
\[ (gf)(p) = (gf)(p_l) = (gf)(\pi_{l-1}^* \cdots \pi_i^*\, p_i) = (\pi_i \cdots \pi_{l-1}\, g f)(p_i) = 0, \]
as desired. In particular, this means that $p_i$ lies in $S_i(A)$. □
It would be more elegant to characterise $S_i(A)$ for $i \ge i_0$ as the common vanishing set of $G_i I_{i_0}$, i.e., of all functions of the form $gf$ with $f \in I_{i_0}$ and $g \in G_i$. For this we introduce an additional condition. For $l \ge i$ write $\iota_{il}\colon R_i \to R_l$ for the composition $\iota_{l-1} \cdots \iota_i$ and $\pi_{li}\colon R_l \to R_i$ for the composition $\pi_i \cdots \pi_{l-1}$. The condition that we want is:

For all indices $l, i_0, i_1$ with $l \ge i_0, i_1$ and for all $g \in G_l$ there exist an index $j \le i_0, i_1$ and group elements $g_0 \in G_{i_0}$, $g_1 \in G_{i_1}$ such that
\[ \pi_{l i_1}\, g\, \iota_{i_0 l} = g_1\, \iota_{j i_1}\, \pi_{i_0 j}\, g_0 \tag{$*$} \]
holds as an equality of homomorphisms $R_{i_0} \to R_{i_1}$.
This guarantees that the functions $\pi_{li}\, g f = \pi_{li}\, g\, \iota_{i_0 l} f$ from the proposition can be written as $g_1\, \iota_{ji}\, \pi_{i_0 j}\, g_0 f$ for some $j \le i, i_0$ and $g_0 \in G_{i_0}$ and $g_1 \in G_i$. Since $I_{i_0}$ is $G_{i_0}$-stable and $\pi_{i_0 j}$ maps $I_{i_0}$ into $I_j \subseteq I_{i_0}$, the latter expression is an element of $G_i I_{i_0}$.

The discussion so far concerned an arbitrary, fixed $K$-algebra $A$. In several applications we shall just take $A$ equal to $K$, and the conclusion is that the point sets $S_i(A)$ for $i \ge i_0$ are defined set-theoretically by equations coming from $I_{i_0}$ using symmetry. However, if one assumes that $G_\infty I_{i_0}$ generates $I_\infty$, then the assumption in the proposition holds for all $K$-algebras $A$, hence so does the conclusion. From this one can conclude that for $i \ge i_0$ the functions featuring in the proposition generate the ideal $I_i$. Under the additional assumption ($*$) one finds that $G_i I_{i_0}$ generates the ideal $I_i$.

We conclude this section with a well-known example which paves the way for the treatment of the $k$-factor model in Sect. 6.

Example 4.2. Fix a natural number $k$. For $n \in \mathbb{N}$ let $R_n$ be the polynomial ring over $K$ in the $\binom{n+1}{2}$ variables $y_{ij} = y_{ji}$ with $i, j \le n$. Let $\iota_n$ be the natural inclusion $R_n \to R_{n+1}$ and let $\pi_n$ be the projection $R_{n+1} \to R_n$ mapping all variables to zero that have one or both indices equal to $n+1$. Then we have $\pi_n \circ \iota_n = 1$ as required. Let $I_n \subseteq R_n$ be the ideal of polynomials vanishing on all symmetric $n \times n$-matrices over $K$ of rank at most $k$. Then $\iota_n$ maps $I_n$ into $I_{n+1}$ since the upper-left $n \times n$-block of an $(n+1) \times (n+1)$-matrix of rank at most $k$ has itself rank at most $k$, and $\pi_n$ maps $I_{n+1}$ into $I_n$ since appending a zero last row and column to any matrix yields a matrix of the same rank. Let $G_n := \mathrm{Sym}(n)$ act on $R_n$ by $g y_{ij} = y_{g(i), g(j)}$, and embed $G_n$ into $G_{n+1}$ as the stabiliser of $n+1$. Then $G_n$ stabilises $I_n$ and the maps $\iota_n$ and $\pi_n$ are $G_n$-equivariant. So we are in the situation discussed in this section.

Even the additional assumption ($*$) holds. Indeed, consider the effect of appending $l - i_1$ zero rows and columns to a symmetric $i_1 \times i_1$-matrix, then simultaneously permuting rows and columns, and finally forgetting the last $l - i_0$ rows and columns, of which, say, $m$ come from the zero rows and columns introduced in the first step. Set $j := i_1 - (l - i_0 - m)$, which is the number of rows and columns of the original matrix surviving this operation. Then the same effect is obtained by first permuting (with $g_1$) rows and columns such that the $i_1 - j$ rows and columns to be forgotten are in the last $i_1 - j$ positions, then forgetting those rows and columns, then appending $l - i_1 - m = i_0 - j$ zero rows and columns, and finally suitably permuting rows and columns with a $g_0 \in \mathrm{Sym}(i_0)$.

It is known, of course, that if $K$ is infinite, then $I_n$ is generated by all $(k+1) \times (k+1)$-minors of the matrix $(y_{ij})_{ij}$. This implies that $I_\infty$ is generated by $G_\infty I_{2k+2}$; here $2k+2$ is the smallest size where representatives of all $\mathrm{Sym}(\mathbb{N})$-orbits of minors of an infinite symmetric matrix can be seen. But conversely, if through some other method (computational or otherwise) one can prove that $I_\infty$ is indeed generated by $G_\infty I_{2k+2}$, then by the discussion above this implies that $I_n$ is generated by $G_n I_{2k+2}$ for all $n \ge 2k+2$. Using the equivariant Gröbner basis techniques from Sect. 2 one can prove such a statement automatically for small $k$, say $k = 1$ and $k = 2$, much like we have done in Example 2.7.
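At the level of points, the maps of Example 4.2 act on symmetric matrices: the pull-back of the inclusion $R_n \to R_{n+1}$ forgets the last row and column, the pull-back of the projection $R_{n+1} \to R_n$ appends a zero row and column, and $\mathrm{Sym}(n)$ permutes rows and columns simultaneously. The sketch below, with hypothetical helper names `truncate`, `pad` and `permute` and an arbitrarily chosen rank-2 test matrix, checks on one instance that these operations compose to the identity where they should and do not increase the rank.

```python
from fractions import Fraction

def rank(m):
    """Rank over the rationals by Gaussian elimination with exact fractions."""
    a = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(a[0]) if a else 0):
        pivot = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, len(a)):
            f = a[i][c] / a[r][c]
            a[i] = [u - f * v for u, v in zip(a[i], a[r])]
        r += 1
    return r

def truncate(y):
    """Pull-back of the inclusion R_n -> R_{n+1}: keep the upper-left block."""
    return [row[:-1] for row in y[:-1]]

def pad(y):
    """Pull-back of the projection R_{n+1} -> R_n: append a zero row and column."""
    n = len(y)
    return [row + [0] for row in y] + [[0] * (n + 1)]

def permute(y, g):
    """Simultaneously permute rows and columns by the permutation g (a list)."""
    n = len(y)
    return [[y[g[i]][g[j]] for j in range(n)] for i in range(n)]

# A symmetric 4x4 matrix of rank 2, built as v v^T + w w^T (test data only).
v, w = [1, 2, 0, -1], [0, 1, 3, 1]
y = [[v[i] * v[j] + w[i] * w[j] for j in range(4)] for i in range(4)]
```

Composing `pad`, `permute` and `truncate` in that order is the matrix-level operation analysed in the verification of ($*$) above.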
5 The Independent Set Theorem

To appreciate the results of this section (though not to understand the proofs) one needs some familiarity with Markov bases and their relation to toric ideals. We formulate and prove the independent set theorem from [HS12], first conjectured in [SHS07], directly at the level of ideals.

Fix a natural number $m$, let $\Gamma$ be a subset of $2^{[m]}$ (thought of as a hypergraph with vertex set $[m]$; see Fig. 4 for an example), and fix an infinite field $K$; what follows will, in fact, not depend on $K$. To any $m$-tuple $r = (r_1, \dots, r_m) \in \mathbb{N}^{[m]}$ of natural numbers we associate the polynomial ring
\[ R_r := K[y_{i_1, \dots, i_m} \mid (i_1, \dots, i_m) \in [r_1] \times \dots \times [r_m]] \]
in $\prod_{t \in [m]} r_t$-many variables, and the polynomial ring
\[ Q_r := K[x_{F, (i_t)_{t \in F}} \mid F \in \Gamma \text{ and } \forall t \in F\colon i_t \in [r_t]]. \]
Furthermore, we define $I_r \subseteq R_r$ as the kernel of the homomorphism $R_r \to Q_r$ mapping $y_{(i_t)_{t \in [m]}}$ to $\prod_{F \in \Gamma} x_{F, (i_t)_{t \in F}}$. On $R_r, Q_r$ acts the group $G_r := \prod_t \mathrm{Sym}([r_t])$ by permutations of the variables, and the homomorphism defining $I_r$ is $G_r$-equivariant. We want to let some of the $r_t$, namely, those with $t$ in a given subset $T \subseteq [m]$, tend to infinity, and conclude that the ideals $I_r$ stabilise up to $G_r$-symmetry.

[Fig. 4 A hypergraph $\Gamma$ on the vertices 1, 2, 3, 4 with parameters $x_{ijl}, z_{ik}, u_{jk}$ and independent set $T$]

To put this statement in the context of Sect. 4, we give the $r_t$ with $t \notin T$ some fixed values, and take the $r_t$, $t \in T$, all equal, say to $n \in \mathbb{N}$. The corresponding rings and ideals are called $R_n, Q_n, I_n$. We have inclusions $\iota_n\colon R_n \to R_{n+1}$ obtained by inclusion of variables and projections $\pi_n\colon R_{n+1} \to R_n$ obtained by mapping all variables to zero that have at least one $T$-labelled index equal to $n+1$. This maps $I_{n+1}$ into $I_n$ because there is a compatible homomorphism $Q_{n+1} \to Q_n$ setting the relevant variables equal to zero. As in Sect. 4 we write $R_\infty, I_\infty$ for the union of all $R_n, I_n$. The group $\mathrm{Sym}(n)^T$ acts on $R_n, Q_n, I_n$, but in fact the independent set theorem only needs one copy of $\mathrm{Sym}(n)$ acting diagonally. The additional assumption ($*$) holds by the same reasoning as in Example 4.2.

Theorem 5.1 (Independent Set Theorem [HS12]). Suppose that $T \subseteq [m]$ is an independent set in $\Gamma$, i.e., that $T$ intersects any $F \in \Gamma$ in at most one element. Then there exists an $n_0$ such that $I_n$ is generated by $\mathrm{Sym}(n) I_{n_0}$ for all $n \ge n_0$.

The condition that $T$ is an independent set cannot simply be dropped. For instance, take $m = 3$ and $\Gamma = \{\{1,2\}, \{2,3\}, \{3,1\}\}$ (the model of no three-way interaction). If $r_1 = n$ tends to infinity for fixed $r_2, r_3$, then the ideal stabilises (see [AT03] for the case of $r_2 = r_3 = 3$); but if $r_1 = r_2 = n$ both tend to infinity and $r_3$ is fixed, say, to 2, then the ideal does not stabilise [DS98].

Example 5.2. As an example take $m = 4$ and $\Gamma = \{124, 13, 23\}$, where 124 is short-hand for $\{1, 2, 4\}$, etc. We write $y_{ijkl}, x_{ijl}, z_{ik}, u_{jk}$ instead of $y_{i_1, i_2, i_3, i_4}$, $x_{124, (i_1, i_2, i_4)}$, $x_{13, (i_1, i_3)}$, $x_{23, (i_2, i_3)}$, respectively; see Fig. 4. The ideal $I_n$ is the kernel of the homomorphism sending $y_{ijkl}$ to $x_{ijl} z_{ik} u_{jk}$. Take $T = \{3, 4\}$, an independent set in $\Gamma$. The ideal $I_\infty$ contains obvious quadratic binomials such as
\[ y_{ijkl}\, y_{ijk'l'} - y_{ijkl'}\, y_{ijk'l} \]
with $i \in [r_1]$, $j \in [r_2]$, $k, l, k', l' \in \mathbb{N}$. Indeed, the first monomial maps to $x_{ijl} z_{ik} u_{jk}\, x_{ijl'} z_{ik'} u_{jk'}$, and the second monomial maps to $x_{ijl'} z_{ik} u_{jk}\, x_{ijl} z_{ik'} u_{jk'}$, which is the same thing.

These obvious binomials generalise verbatim to the general case, where they read
\[ y_{jkl}\, y_{jk'l'} - y_{jkl'}\, y_{jk'l} \tag{1} \]
where now $j$ runs over the finite set $\prod_{t \in [m] \setminus T} [r_t]$, $k, k'$ run over $\mathbb{N}^S$ and $l, l'$ run over $\mathbb{N}^{T \setminus S}$ for some $S$ that runs over all subsets of $T$. Indeed, for any variable $x = x_{F, (i_t)_{t \in F}}$ at most one $t \in F$ lies in $T$. If such a $t$ exists for $x$ and lies in $S$, then whether $x$ appears in the image of a variable $y_{jkl}$ does not depend on the value of $l$. But disregarding that third index, the two monomials above are the same. A similar reasoning for the case where $t \in T \setminus S$ and for the case where $F \cap T = \emptyset$ shows that $x$ has the same exponent in the image of both monomials in the binomial above.

By Proposition 4.1 relating chains to infinite-dimensional varieties, we are done if we can prove that $I_\infty$ is generated by $\mathrm{Sym}(\infty) I_n$ for some finite $n$. Let $J_n \subseteq I_n$ and $J_\infty \subseteq I_\infty$ be the ideals generated by all quadratic binomials as in (1). We claim that $J_\infty$ is generated by $\mathrm{Sym}(\infty) J_{2|T|}$. Indeed, for any binomial as above, the set of all indices appearing in $k, l, k', l'$ has cardinality $n \le 2|T|$, and there exists a bijection in $\mathrm{Sym}(\infty)$ mapping its support bijectively onto $[n]$, witnessing that the binomial lies in $\mathrm{Sym}(\infty) J_{2|T|}$.

The remainder of the proof consists of showing that the quotient $R_\infty / J_\infty$ is, in fact, $\mathrm{Sym}(\infty)$-Noetherian. To this end, we introduce a new polynomial ring
\[ P := K\Big[\, y'_{jtq} \;\Big|\; j \in \prod_{t \in [m] \setminus T} [r_t],\ t \in T,\ \text{and } q \in \mathbb{N} \Big], \]
where the $y'_{jtq}$ are new variables, and consider the subring $R'_\infty$ of $P$ generated by all monomials $m_{ji} := \prod_{t \in T} y'_{j t i_t}$ with $j$ as before and $i \in \mathbb{N}^T$. The monomials $m_{ji}$ satisfy the binomial relations (1) (for all splittings of $i$ into two subsequences $k$ and $l$), and it is known that these binomials generate the ideal of relations among the $m_{ji}$. Indeed, the ring $R'_\infty$ is the coordinate ring of the Cartesian product of $\prod_{t \in [m] \setminus T} r_t$ many copies of the variety of pure $|T|$-dimensional tensors. Thus we have an isomorphism $R'_\infty \cong R_\infty / J_\infty$, and we want to show that $R'_\infty$ is $\mathrm{Sym}(\infty)$-Noetherian.

The enveloping polynomial ring $P \supseteq R'_\infty$ is $\mathrm{Sym}(\infty)$-Noetherian by Theorem 2.3 (only the index $q$ of the variables $y'_{jtq}$ is unbounded), but passing to a subring one may, in general, lose Noetherianity. However, let $H$ be the torus in $(K^*)^T$ consisting of $T$-tuples of non-zero scalars whose product is 1. Then the $m_{ji}$ are $H$-invariant, and these monomials generate the ring of $H$-invariant polynomials (if, as we may assume, $K$ is infinite). Hence by Proposition 3.1 we may conclude that $R'_\infty$ is $\mathrm{Sym}(\infty)$-Noetherian, and this concludes the proof of the independent set theorem.
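Both the independence condition of Theorem 5.1 and the membership of the obvious quadratic binomials of Example 5.2 in the kernel can be verified mechanically for the hypergraph $\{124, 13, 23\}$. In the sketch below a monomial is stored as a multiset of variables; the names `is_independent`, `image` and `binomial_in_kernel` are illustrative choices and not notation from the notes.

```python
from collections import Counter

def is_independent(T, hypergraph):
    """T is independent if it meets every hyperedge F in at most one vertex."""
    return all(len(set(T) & set(F)) <= 1 for F in hypergraph)

GAMMA = [(1, 2, 4), (1, 3), (2, 3)]   # the hypergraph of Example 5.2

def image(index):
    """Image of y_index under the monomial map: one x-variable per hyperedge F,
    labelled by the entries of `index` at the positions in F."""
    return Counter((F, tuple(index[t - 1] for t in F)) for F in GAMMA)

def monomial_image(indices):
    """Image of a product of y-variables, as a multiset of x-variables."""
    total = Counter()
    for index in indices:
        total += image(index)
    return total

def binomial_in_kernel(i, j, k, l, k2, l2):
    """Check that y_{ijkl} y_{ijk'l'} - y_{ijkl'} y_{ijk'l} maps to zero."""
    first = monomial_image([(i, j, k, l), (i, j, k2, l2)])
    second = monomial_image([(i, j, k, l2), (i, j, k2, l)])
    return first == second
```

For instance, `binomial_in_kernel(1, 1, 1, 1, 2, 2)` compares the images of $y_{1111} y_{1122}$ and $y_{1112} y_{1121}$. Swapping indices in the positions 1 and 2, which do not form an independent set, produces two monomials with different images.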
6 The Gaussian k-Factor Model

This section discusses finiteness results for a model from algebraic statistics known as the Gaussian $k$-factor model. General stabilisation results for this model were first conjectured in [DSS07], and for 1 factor established prior to that in [dLST95]. For 2 factors, a positive-definite variant was established in [DX10], and an ideal-theoretic variant in [BD11]. The ideal-theoretic version for more factors is open, but the set-theoretic version was established in [Dra10].

The Gaussian $k$-factor model consists of all covariance matrices for a large number $n$ of jointly Gaussian random variables consistent with the hypothesis that those variables can be written as linear combinations of a small number $k$ of hidden factors plus independent, individual noise. Algebraically, let $R_n$ be the $K$-algebra of polynomials in variables $y_{ij} = y_{ji}$ with $i, j \in [n]$, and let $P_{k,n}$ be the $K$-algebra of polynomials in the variables $x_{il}$, $i \in [n]$, $l \in [k]$, and further variables $z_1, \dots, z_n$. Let $I_{k,n}$ be the kernel of the homomorphism $R_n \to P_{k,n}$ that maps $y_{ij}$ to the $(i, j)$-entry of the matrix
\[ x \cdot x^T + \mathrm{diag}(z_1, \dots, z_n), \]
where we interpret $x$ as an $[n] \times [k]$-matrix of variables. Set $S_{k,n} := R_n / I_{k,n}$; the set $S_{k,n}(K) \subseteq K^{\binom{n+1}{2}}$ of $K$-valued points of $S_{k,n}$ is (the Zariski closure of) the Gaussian $k$-factor model. Observe that this homomorphism is $\mathrm{Sym}([n])$-equivariant, so that $I_{k,n}$ is $\mathrm{Sym}([n])$-stable. We are in the setting of Sect. 4, with the map $\pi_n\colon R_{n+1} \to R_n$ mapping $y_{i, n+1}$ to 0 for all $i$ and the map $\iota_n\colon R_n \to R_{n+1}$ the inclusion. The technical assumption ($*$) from that section holds for the same reason as in Example 4.2.

Theorem 6.1. For every fixed $k \in \mathbb{N}$, there exists an $n_k \in \mathbb{N}$ such that for all $n \ge n_k$ the variety $S_{k,n}(K) \subseteq K^{\binom{n+1}{2}}$ is cut out set-theoretically by the polynomials in $\mathrm{Sym}(n) I_{k,n_k}$.

Proof. By Proposition 4.1 and the discussion following its proof we need only prove that $S_{k,\infty}(K)$ is the zero set of $\mathrm{Sym}(\infty) I_{k,n_k}$, for suitable $n_k$.
Let $J_{k,\infty}$ denote the ideal generated by all $(k+1) \times (k+1)$-minors of the symmetric $\mathbb{N} \times \mathbb{N}$-matrix $y$ that do not involve diagonal entries of $y$, and set $S'_{k,\infty} := R_\infty / J_{k,\infty}$. Then surely $J_{k,\infty}$ is contained in $I_{k,\infty}$, so that, dually, $S'_{k,\infty}(K)$ contains $S_{k,\infty}(K)$.

We claim that $S'_{k,\infty}(K)$ is a $\mathrm{Sym}(\infty)$-Noetherian topological space, and to prove this claim we proceed by induction. For $k = 0$ the equations in $J_{k,\infty} = J_{0,\infty}$ force all off-diagonal entries of the matrix $y$ to be zero, so that $S'_{0,\infty}(K)$ is just the set of $K$-points of $K[y_{11}, y_{22}, \dots]$, with $\mathrm{Sym}(\infty)$ permuting the coordinates. The latter ring is $\mathrm{Sym}(\infty)$-Noetherian by Theorem 2.3, and hence its topological space of $K$-points is certainly $\mathrm{Sym}(\infty)$-Noetherian.

Next, assume that $S'_{k-1,\infty}(K)$ is $\mathrm{Sym}(\infty)$-Noetherian, and note that $S'_{k-1,\infty}(K)$ is a (closed) subset of $S'_{k,\infty}(K)$. On any point outside this closed subset at least one of the $k \times k$-determinants in $J_{k-1,\infty}$ is non-zero. Up to signs, these determinants form a single orbit under $\mathrm{Sym}(\infty)$, so if we set $\Delta := \det y[[k], [2k] \setminus [k]]$, then we have
\[ S'_{k,\infty}(K) = S'_{k-1,\infty}(K) \cup Z\,\mathrm{Sym}(\infty), \quad \text{where } Z = \{ y \in S'_{k,\infty}(K) \mid \Delta(y) \ne 0 \}. \]
The union of two $\mathrm{Sym}(\infty)$-Noetherian topological spaces is $\mathrm{Sym}(\infty)$-Noetherian, so it suffices to prove that $Z\,\mathrm{Sym}(\infty)$ is $\mathrm{Sym}(\infty)$-Noetherian. Observe that $Z$ itself is stable under the subgroup $H := \{ \sigma \in \mathrm{Sym}(\infty) \mid \sigma|_{[2k]} = 1|_{[2k]} \}$. Hence by Proposition 3.5 it suffices to prove that $Z$ is $H$-Noetherian. To this end, define $T \subseteq \mathbb{N} \times \mathbb{N}$ by
\[ T := \{ (i, j) \in \mathbb{N} \times \mathbb{N} \mid i = j \text{ or } (i < j \text{ and } i \in [2k]) \}. \]
Let $Q$ be the open subset of $K^T$ where the $[k] \times ([2k] \setminus [k])$-submatrix has non-zero determinant. The coordinate ring of $Q$ is $H$-Noetherian by Theorem 2.3 and the fact that adding finitely many $H$-fixed variables preserves $H$-Noetherianity. As a consequence, $Q$ is an $H$-Noetherian space.

We claim that the projection $\mathrm{pr}\colon Z \to Q$ that maps a matrix $y$ to its $T$-labelled entries is a closed, $H$-equivariant embedding. Equivariance is immediate. To see that $\mathrm{pr}$ is injective observe that, for $y \in Z$, any matrix entry $y_{ij}$ with $i < j$ (since we work with symmetric matrices) and $i \notin [2k]$ satisfies an equation (Fig. 5)
\[ 0 = \det y[[k] \cup \{i\}, ([2k] \setminus [k]) \cup \{j\}] = \Delta \cdot y_{ij} - E, \]
where $E$ is an expression involving only variables in $T$. Since $\Delta$ is non-zero, we find that $y_{ij}$ is determined by $\mathrm{pr}(y)$. This shows injectivity. That $\mathrm{pr}\colon Z \to Q$ is, in fact, a closed embedding follows by showing that the dual map $K[Q] \to K[Z]$ is surjective: the regular function $E / \Delta$ maps onto $y_{ij}|_Z$. Since $Q$ is $H$-Noetherian, so is $Z$, and as mentioned before this concludes the induction step.

[Fig. 5 The set $T$ labelling symmetric matrix entries (in light gray), and the $(k+1) \times (k+1)$-determinant expressing $y_{ij}$ as a rational function in the $T$-labelled variables (in dark gray)]
As $S'_{k,\infty}(K)$ is $\mathrm{Sym}(\infty)$-Noetherian, we find that, in particular, the Zariski closure $S_{k,\infty}(K)$ is cut out from $S'_{k,\infty}(K)$ by finitely many $\mathrm{Sym}(\mathbb{N})$-orbits of equations. Representatives of these orbits already lie in the coordinate ring $S'_{k,n_k}$ for suitable $n_k$. □
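Two steps of this proof can be tried out on a small instance: the minors that avoid the diagonal vanish on every matrix of the form $x x^T + \mathrm{diag}(z)$, and an entry $y_{ij}$ with $i, j \notin [2k]$ can be recovered from the $T$-labelled entries by solving the determinantal equation for $y_{ij}$. The sketch below uses $k = 2$, $n = 6$ and integer test data chosen by hand; the function names and the 0-based indexing are assumptions of this illustration only.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Exact determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def factor_model_matrix(x, z):
    """The matrix x x^T + diag(z) for an n x k matrix x and a vector z of length n."""
    n, k = len(x), len(x[0])
    return [[sum(x[i][l] * x[j][l] for l in range(k)) + (z[i] if i == j else 0)
             for j in range(n)] for i in range(n)]

def offdiagonal_minors_vanish(y, k):
    """Check all (k+1)x(k+1) minors whose row and column sets are disjoint."""
    n = len(y)
    for rows in combinations(range(n), k + 1):
        rest = [c for c in range(n) if c not in rows]
        for cols in combinations(rest, k + 1):
            if det([[y[r][c] for c in cols] for r in rows]) != 0:
                return False
    return True

def recover_entry(y, k, i, j):
    """Recover y[i][j] (0-based, 2k <= i < j) without reading y[i][j] itself."""
    rows = list(range(k)) + [i]
    cols = list(range(k, 2 * k)) + [j]
    block = [[y[r][c] for c in cols] for r in rows]
    block[k][k] = 0                              # hide the unknown entry
    delta = det([row[:k] for row in block[:k]])  # the determinant Delta
    if delta == 0:
        raise ValueError("Delta vanishes, so the point does not lie in Z")
    # det(full block) = Delta * y_ij + det(block), and the full minor is zero.
    return Fraction(-det(block), delta)

x = [[1, 0], [0, 1], [1, 1], [1, 2], [2, 1], [3, -1]]   # hand-picked test data
z = [7, 1, 4, 2, 9, 3]
y = factor_model_matrix(x, z)
```

In the notation of the proof, `-det(block)` plays the role of $E$, so that the returned value is $E / \Delta$.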
7 Tensors and Δ-Varieties

This section deals with finiteness results for a wide class of varieties of tensors, introduced by Snowden [Sno13] and called $\Delta$-varieties. The proof of this section's theorem is more involved than earlier proofs, and we have therefore decided to break the section up into more digestible subsections.
Δ-Varieties

We work over a ground field $K$, which we assume to be infinite to avoid anomalies with the Zariski topology. For any tuple $(V_1, \dots, V_n)$ of finite-dimensional vector spaces over $K$, we write $\mathcal{V}(V_1, \dots, V_n)$ for the tensor product $V_1^* \otimes \dots \otimes V_n^*$. These spaces have three types of interesting maps between them. First, given linear maps $f_i\colon V_i \to W_i$ there is a natural linear map $\mathcal{V}(W_1, \dots, W_n) \to \mathcal{V}(V_1, \dots, V_n)$, namely, the tensor product $\bigotimes_i f_i^*$ of the dual maps $f_i^*$. Second, given any $\sigma \in \mathrm{Sym}([n])$, there is a canonical map $\sigma\colon \mathcal{V}(V_{\sigma(1)}, \dots, V_{\sigma(n)}) \to \mathcal{V}(V_1, \dots, V_n)$. Third, there is a canonical flattening map $\mathcal{V}(V_1, \dots, V_n, V_{n+1}) \to \mathcal{V}(V_1, \dots, V_n \otimes V_{n+1})$, so called because, in coordinates, it takes an $(n+1)$-way table of numbers and transforms it into an $n$-way table; see Fig. 6.

A $\Delta$-variety is not a single variety, but rather a rule $X$ that takes as input a finite sequence $(V_j)_{j \in [n]}$ of finite-dimensional vector spaces over $K$, and assigns to it a subvariety $X(V_1, \dots, V_n)$ of $\mathcal{V}(V_1, \dots, V_n)$. To be a $\Delta$-variety, each of the three types of maps above must preserve $X$, i.e., $\bigotimes_i f_i^*$ must map $X(W_1, \dots, W_n)$ into $X(V_1, \dots, V_n)$, and $\sigma$ must map $X(V_{\sigma(1)}, \dots, V_{\sigma(n)})$ into $X(V_1, \dots, V_n)$, and the flattening map must map $X(V_1, \dots, V_{n+1})$ into $X(V_1, \dots, V_n \otimes V_{n+1})$.

The $\Delta$-varieties that we shall study will have a fourth, additional property, namely, that the inverse to the isomorphism $\mathcal{V}(V_1, \dots, V_n, K) \to \mathcal{V}(V_1, \dots, V_n \otimes K) = \mathcal{V}(V_1, \dots, V_n)$ maps $X(V_1, \dots, V_n)$ into $X(V_1, \dots, V_n, K)$; we call such $\Delta$-varieties good. Taking any linear function $f$ from an additional vector space $V_{n+1}$ to $K$ we then find that $X(V_1, \dots, V_n) \otimes f$, being the image of $X(V_1, \dots, V_n)$ under the map above followed by $1_{V_1}^* \otimes \dots \otimes 1_{V_n}^* \otimes f^*$, is contained in $X(V_1, \dots, V_n, V_{n+1})$.
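In coordinates the flattening map merely regroups indices. The sketch below flattens a $4 \times 2 \times 3$ table into a $4 \times 6$ table and checks that a pure tensor is sent to a rank-one matrix, which is the instance of the third axiom relevant for pure tensors. The function names are illustrative, and identifying the dual spaces with coordinate tables is an assumption of the example.

```python
from itertools import product

def pure_tensor(a, b, c):
    """The 3-way table with entries a_i * b_j * c_k."""
    return [[[ai * bj * ck for ck in c] for bj in b] for ai in a]

def flatten_last_two(t):
    """Flatten an n1 x n2 x n3 table into an n1 x (n2*n3) table."""
    return [[entry for row in slab for entry in row] for slab in t]

def all_2x2_minors_vanish(m):
    """True if and only if the matrix m has rank at most one."""
    rows, cols = range(len(m)), range(len(m[0]))
    return all(m[i][j] * m[k][l] == m[i][l] * m[k][j]
               for i, k in product(rows, repeat=2)
               for j, l in product(cols, repeat=2))

t = pure_tensor([1, 2, 3, 4], [1, -1], [2, 0, 5])   # a pure tensor in a 4 x 2 x 3 table
flat = flatten_last_two(t)                          # its 4 x 6 flattening
```

Adding a second, generic pure tensor to `t` before flattening typically produces a matrix of rank two, for which `all_2x2_minors_vanish` returns `False`.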
A (boring) example of a $\Delta$-variety that is not good is that for which $X(V_1, \dots, V_n)$ equals $\mathcal{V}(V_1, \dots, V_n)$ if $n < 10$ and the empty set otherwise. A typical example of a good $\Delta$-variety is Seg, the cone over Segre, which maps a tuple of vector spaces to the variety of pure tensors $v_1 \otimes \dots \otimes v_n$ in the tensor product of the duals. In fact, any non-empty good $\Delta$-variety contains Seg, but the class of good $\Delta$-varieties is much larger. For instance, it is closed under taking joins: if $X$ and $Y$ are $\Delta$-varieties, then the rule $X + Y$ that assigns to $(V_i)_{i \in [n]}$ the Zariski closure of $X(V_1, \dots, V_n) + Y(V_1, \dots, V_n)$ is also a $\Delta$-variety, and good if both $X$ and $Y$ are. Similarly, (good) $\Delta$-varieties are closed under taking tangential varieties, unions, and intersections.

[Fig. 6 Flattening an element of $K^4 \otimes K^2 \otimes K^3$ to an element of $K^4 \otimes (K^2 \otimes K^3)$]

Given some equations for an instance of a $\Delta$-variety $X$, one obtains equations for other instances of $X$ by pulling back along sequences of the maps appearing in the definition of a $\Delta$-variety. For instance, start with the $2 \times 2$-determinant defining $\mathrm{Seg}(K^2, K^2)$ inside $\mathcal{V}(K^2, K^2)$. Then we obtain generators for the ideal of $\mathrm{Seg}(K^m, K^n)$ by pulling the determinant back along duals of linear maps $f_1\colon K^2 \to K^m$ and $f_2\colon K^2 \to K^n$ (see footnote 1). And then, using the remaining two axioms, we also find equations for the variety of pure tensors in, say, $\mathcal{V}(K^2, K^2, K^2, K^2)$ through the flattening maps into $\mathcal{V}(K^2 \otimes K^2, K^2 \otimes K^2) \cong \mathcal{V}(K^4, K^4)$, $\mathcal{V}(K^2, K^2 \otimes K^2 \otimes K^2) \cong \mathcal{V}(K^2, K^8)$, etc. Indeed, one readily shows that one obtains generators of the ideals of all instances of Seg in this manner. The result in this section is that a similar result holds for any sufficiently small good $\Delta$-variety, at least at a topological level.

Theorem 7.1. Let $X$ be a good $\Delta$-variety which is bounded in the sense that there exist finite-dimensional vector spaces $W_1, W_2$ such that $X(W_1, W_2)$ is not all of $\mathcal{V}(W_1, W_2)$. Then there exist an $n_X \in \mathbb{N}$ and vector spaces $U_1, \dots, U_{n_X}$ such that $X$ equals the inclusion-wise largest $\Delta$-variety $Y$ with $Y(U_1, \dots, U_{n_X}) = X(U_1, \dots, U_{n_X})$.

This means, in more concrete terms, that the equations for $X(U_1, \dots, U_{n_X})$, pulled back along all four types of linear maps from the definition of a good $\Delta$-variety, yield equations that cut out all instances of $X$ from their ambient spaces.
In particular, there is a universal degree bound, depending only on $X$ but not on $n$ or $V_1, \dots, V_n$, on equations needed to define $X(V_1, \dots, V_n)$ set-theoretically within $\mathcal{V}(V_1, \dots, V_n)$.

Since $\mathrm{GL}(W_1) \times \mathrm{GL}(W_2)$ acts with a dense orbit on $\mathcal{V}(W_1, W_2)$, namely the two-tensors (or matrices) of full rank, the boundedness condition on $X$ implies that all two-tensors in instances of the form $X(V_1, V_2)$ have uniformly bounded rank. This readily implies that the boundedness condition on $X$ is also preserved under joins (by adding the rank bounds), tangential varieties, intersections, and unions, so that the theorem applies to a wide class of $\Delta$-varieties of interest in applications.
Footnote 1: Snowden chose the notion of $\Delta$-varieties contravariant in the linear maps $f_i$ so as to make defining ideals and more general $\Delta$-modules [Sno13] depend covariantly on them.
Related Literature

The boundedness condition on $\Delta$-varieties was first formulated, at an ideal-theoretic level, in [Sno13]. There it is conjectured that a generalisation of Theorem 7.1 should hold, for bounded $\Delta$-varieties, on the ideal-theoretic level; and not only for equations of instances of $X$, but also for their $q$-syzygies for any fixed $q \ge 1$. This general statement is proved for Seg in [Sno13]. The special case where $q = 1$, i.e., finiteness of equations, is known to hold for the tangential variety to Seg [OR11], confirming a conjecture from [LW07]; and for the variety $\mathrm{Seg} + \mathrm{Seg} = 2\,\mathrm{Seg}$ of tensors of border rank at most 2 [Rai12], confirming the GSS-conjecture from [GSS05] (a set-theoretic version of which was first proved in [LM04]). The set-theoretic theorem above was first proved in [DK13] for $k\,\mathrm{Seg}$, i.e., for any fixed secant variety of Seg; and a discussion with Snowden led to the insight that our proof generalises to bounded, good $\Delta$-varieties as in the theorem. Further recent keywords closely related to the topic of this section are $\mathrm{GL}_\infty$-algebras [SS12], twisted commutative algebras [SS12b], FI-modules [CEF12], and cactus varieties [BB13].
From a Δ-Variety to an Infinite-Dimensional Variety

We prove the theorem by embedding all relevant instances of $X$ into a single, infinite-dimensional variety given by determinantal equations, and showing that this variety is Noetherian up to symmetry preserving $X$. By the boundedness assumption, we can choose a number $p$ that is strictly greater than the ranks of all two-tensors in instances $X(V_1, V_2)$, independently of $V_1$ and $V_2$. Set $V := K^{[p]}$ and $X_n := X(V, \dots, V) \subseteq \mathcal{V}_n := \mathcal{V}(V, \dots, V)$, where the number of $V$s equals $n$.

We first argue that the equations for all (infinitely many) $X_n$, $n \in \mathbb{N}$, pull back to equations defining all instances of $X$. Indeed, let $V_1, \dots, V_n$ be vector spaces, and let $\omega \in \mathcal{V}(V_1, \dots, V_n)$ be a tensor. Then we claim that $\omega$ lies in $X(V_1, \dots, V_n)$ if and only if for all linear maps $f_i\colon V \to V_i$ the image of $\omega$ under $\bigotimes_{i \in [n]} f_i^*$ lies in $X_n$. The "only if" claim follows from the first axiom for $\Delta$-varieties. For the "if" claim, note that if $(\bigotimes_{i \in [n]} f_i^*)\,\omega$ lies in $X_n$ for all tuples of $f_i$, then for each $j \in [n]$, the linear map that $\omega$ induces from $\bigotimes_{i \ne j} V_i$ into $V_j^*$, being a flattening of $\omega$ in $\mathcal{V}(\bigotimes_{i \ne j} V_i, V_j)$, has image $U_j \subseteq V_j^*$ of dimension strictly smaller than $p$. Now take $f_j\colon V \to V_j$ such that $f_j^*$ restricts to an injection $U_j \to V^*$, and let $g_j\colon V_j \to V$ be such that $g_j^* \circ f_j^*$ restricts to the identity on $U_j$. Then the tensor $\omega' := \bigotimes_j f_j^*\,\omega$ lies in $X_n = X(V, \dots, V)$ by assumption. But then, by the first axiom, the tensor $\bigotimes_j g_j^*\,\omega' = \omega$ lies in $X(V_1, \dots, V_n)$, as claimed. This argument actually also works ideal-theoretically; only later shall we need to work purely topologically.

We now cast the chain of varieties $(X_n)_{n \in \mathbb{N}}$ into the framework of Sect. 4. To this end, let $R_n$ denote the symmetric $K$-algebra generated by $V^{\otimes [n]}$, which is the
Jan Draisma
coordinate ring of Vn . Pick a non-zero element x0 2 V and let n W Rn ! RnC1 be the homomorphism of K-algebras determined by the linear map V ˝Œn ! V ˝ŒnC1 ; x 7! x ˝ x0 . The group GL.V /Œn acts on V ˝Œn in the natural manner, and this extends to an action by algebra automorphisms on Rn . Similarly, the group Sym.Œn/ acts on V ˝Œn by permuting tensor factors. The embedding n is equivariant for the group Gn WD Sym.Œn/ Ë GL.V /Œn if we embed Sym.Œn/ into Sym.Œn C 1/ by fixing n and GL.V /Œn into GL.V /ŒnC1 by adding 1V in the last component. The linear map W VnC1 ! Vn maps XnC1 into Xn , and Xn is preserved by Gn . Letting Sn be the coordinate ring of Rn , we have all the arrows in the diagram of Fig. 3 except for the arrows to the right. To obtain these, we use that X is good, as follows. Given any e0 2 V such that e0 .x0 / D 1, the map n W Vn ! VnC1 ; ! 7! ! ˝ e0 maps Xn into XnC1 . The dual to this linear map, extended to an algebra homomorphism, is the required map W RnC1 ! Rn . This completes the diagram. The technical condition ( ) from page 47 is also satisfied, i.e., for all indices l; i0 ; i1 with l i0 ; i1 and for all g 2 Gl there exist an index j i0 ; i1 and group elements g0 2 Gi0 ; g1 2 Gi1 such that .li1 g i0 l / D .g1 ji1 i0 j g0 / : Indeed, the left-hand side is the composition of the map Vi1 ! Vl tensoring with e0˝li1 , followed by g, followed by the map Vl ! Vi0 contracting with x0˝li0 in the last l i0 factors. Let i1 j be the number of factors V in Vi1 that are moved, by g, into the last l i0 positions and hence end up being contracted in the last step. This means that j i0 ; i1 is the number of factors V in Vi1 that are not contracted. 
Hence the composition can also be obtained by first applying a $g_1 \in G_{i_1}$, ensuring that the $i_1 - j$ factors $V^*$ in $V_{i_1}$ that need to be contracted are in the last $i_1 - j$ positions; then contracting by a pure tensor in those positions, which for suitable choice of $g_1$ may be chosen $x_0^{\otimes (i_1 - j)}$; then tensoring with $i_0 - j$ copies of $e_0$; and finally applying a suitable element $g_0 \in G_{i_0}$. The upshot of this is that if we can prove that the projective limit $X_\infty := \varprojlim_n X_n$ is defined by finitely many $G_\infty := \bigcup_n G_n$-orbits of equations within $V_\infty := \varprojlim_n V_n$, then there exists an $n_X$ such that for all $n \geq n_X$ the variety $X_n$ is defined by the $G_n$-orbits of equations for $X_{n_X}$. This implies the theorem.
Flattening Varieties To prove this finiteness result, then, we show that $X_\infty$ is contained in a $G_\infty$-Noetherian subvariety $Y_\infty$ of $V_\infty$, which we call a flattening variety, and that $Y_\infty$ itself is defined by finitely many $G_\infty$-orbits of equations. To define $Y_\infty$, let $Y^{(k)}$ denote the largest $\Delta$-variety for which $Y^{(k)}(V_1, V_2)$ consists of two-tensors of rank at most $k$. Then $Y^{(k)}(V_1, \ldots, V_n)$ is defined by the vanishing of all $(k+1) \times (k+1)$-minors of the flattenings $\mathbf{V}(V_1, \ldots, V_n) \to \mathbf{V}(\bigotimes_{i \in A} V_i, \bigotimes_{i \in B} V_i)$ for all partitions
Noetherianity up to Symmetry
of $[n]$ into disjoint subsets $A$ and $B$. Set $Y_n^{(k)} := Y^{(k)}(V, \ldots, V) \subseteq V_n$, and $Y_\infty^{(k)} := \varprojlim_n Y_n^{(k)} \subseteq \varprojlim_n V_n$. By the boundedness assumption on $X$, $Y_\infty^{(p-1)}$ contains $X_\infty$. We first prove that each $Y_\infty^{(k)}$, $k \in \mathbb{N}$ is defined by finitely many $G_\infty$-orbits of equations. Unwinding the definitions, this statement boils down to the statement that if $\omega \in V_n$ with $n \gg 0$ does not lie in $Y_n^{(k)}$, then there exists an $i \in [n]$ and an $x \in V$ such that contracting $\omega$ with $x$ in the $i$-th position yields a tensor $\omega' \in V_{n-1}$ that does not lie in $Y_{n-1}^{(k)}$. In fact, we shall see that $n > 2k$ suffices. The condition that $\omega$ does not lie in $Y_n^{(k)}$ means that there is a partition $[n] = A \cup B$ such that $\omega$, regarded as a linear map $V^{\otimes A} \to (V^*)^{\otimes B}$, has rank strictly larger than $k$. Using that $n > 2k$ and after swapping $A$ and $B$ if necessary we may assume that $|B| > k$. Let $U \subseteq (V^*)^{\otimes B}$ be a $(k+1)$-dimensional subspace of the image of this linear map $\omega$. We claim that since $|B|$ is larger than $k$, there exists a position $i \in B$ and an $x \in V$ such that the image of $U$ under contraction with $x$ in the $i$-th position still has dimension $k+1$. Indeed, otherwise $U$ would be a point in the projective variety
$$Q := \{ W \in \mathrm{Gr}_{k+1}((V^*)^{\otimes B}) \mid \text{contracting } W \text{ with any } x \text{ in any position decreases the dimension} \}.$$
We claim that this variety is empty. To prove this, extend the distinguished vector $x_0 \in V$ from the definition of $V_\infty$ to a basis $x_0, \ldots, x_{p-1}$, where the distinguished $e_0 \in V^*$ vanishes on $x_1, \ldots, x_{p-1}$. Then the basis of $V^*$ dual to $x_0, \ldots, x_{p-1}$ starts with $e_0$; denote it $e_0, \ldots, e_{p-1}$.² If $Q$ is not empty, then by Borel's fixed point theorem [Bor91] $Q$ contains a $T^B$-fixed point $W$, where $T$ is the maximal torus in $\mathrm{GL}(V)$ consisting of invertible linear maps whose matrices with respect to $x_0, \ldots, x_{p-1}$ are diagonal. This means that $W$ has a basis of common eigenvectors $e_\alpha := \bigotimes_{i \in B} e_{\alpha_i}$, where $\alpha$ runs through some set $J \subseteq \{0, \ldots, p-1\}^B$ of cardinality $k+1$.
Think of the $\alpha \in J$ as $B$-labelled words over the alphabet $\{0, \ldots, p-1\}$. Contracting $e_\alpha$ with $x_0 + x_1 + \cdots + x_{p-1}$ at position $i \in B$ yields $e_{\alpha'}$, where $\alpha'$ is the word obtained from $\alpha$ by deleting the $i$-th letter. By assumption, the resulting words $e_{\alpha'}$, $\alpha \in J$ are linearly dependent, which means that at least two of them must coincide. Summing up, $J$ consists of $k+1$ distinct words of length $|B| \geq k+1$ with the property that for each $i \in B$ the collection $J$ contains two words that differ only at position $i$. By induction on $k$ we show that this is impossible, i.e., that for $k+1$ distinct words of length $k+1$ over any alphabet there exist $k$ positions restricted to which all words are distinct. For $k = 0$ this is immediate: restricting a single word to zero positions yields a single (empty) word. Assume that it is true for $k-1$, and consider $k+1$ words of length $k+1$. Set one word $\alpha$ apart. Then there exist $k-1$ positions restricted to which the remaining $k$ words are distinct. Restricted to those $k-1$ positions $\alpha$ equals at most one word $\alpha'$ of the remaining words. So by
² The reason for labelling with $\{0, \ldots, p-1\}$ rather than $[p]$ will become apparent soon.
adding to the $k-1$ positions a position where $\alpha$ and $\alpha'$ differ we obtain $k$ positions restricted to which all words are distinct. This contradiction shows that there exists an $i \in B$ and an $x \in V$ such that the contraction of $U$ with $x$ at position $i$ still has dimension $k+1$. As a consequence, contracting $\omega$ with $x$ at position $i$ yields a tensor outside $Y_{n-1}^{(k)}$, as claimed. Thus $Y_\infty^{(k)}$ is defined by finitely many $G_\infty$-orbits of equations. The variety $Y_\infty^{(k)}$ is defined by the vanishing of $(k+1) \times (k+1)$-determinants. For what follows, it will be convenient to understand these explicitly in terms of coordinates. The basis $x_0, \ldots, x_{p-1}$ gives rise to a basis $x_w$, $w \in \{0, \ldots, p-1\}^{[n]}$ of $V^{\otimes [n]}$. The ring $R_n$ is the polynomial ring in these variables. Under the embedding $R_n \to R_{n+1}$ the variable $x_w$ is mapped to $x_{w0}$. Hence $R_\infty$ is the polynomial ring in variables $x_w$ where $w$ runs over all infinite words in $\{0, \ldots, p-1\}^{\mathbb{N}}$ of finite support $\mathrm{supp}(w) := \{ j \in \mathbb{N} \mid w_j \neq 0 \}$; let us call these finitary words. In these coordinates, a determinantal equation for $Y_\infty^{(k)}$ looks as follows. Fix $k+1$ finitary words $w_i$, $i \in [k+1]$ and $k+1$ further finitary words $w'_j$, $j \in [k+1]$ with the requirement that $\mathrm{supp}(w_i) \cap \mathrm{supp}(w'_j) = \emptyset$ for all $i, j$. Then form the square matrix
$$x[(w_i)_i, (w'_j)_j] := (x_{w_i + w'_j})_{i,j \in [k+1]}$$
and its determinant
$$\Delta[(w_i)_i, (w'_j)_j] := \det x[(w_i)_i, (w'_j)_j].$$
All determinants defining $Y_\infty^{(k)}$ have this form.³
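The combinatorial step in the emptiness argument for $Q$ (any $k+1$ distinct words of length $k+1$ admit $k$ positions on which they remain distinct) can be checked by brute force for small parameters. The following is only an illustrative sketch, not part of the original proof; the function names `separating_positions` and `lemma_holds` are hypothetical.

```python
from itertools import combinations, product


def separating_positions(words):
    """Return k positions on which the k+1 given words stay pairwise distinct.

    `words` is a list of k+1 distinct tuples of equal length.
    Returns a tuple of positions, or None if no such choice exists.
    """
    k = len(words) - 1
    length = len(words[0])
    for positions in combinations(range(length), k):
        restricted = {tuple(w[i] for i in positions) for w in words}
        if len(restricted) == len(words):
            return positions
    return None


def lemma_holds(k, alphabet_size):
    """Exhaustively test the lemma for all sets of k+1 distinct words of
    length k+1 over an alphabet with `alphabet_size` letters."""
    all_words = list(product(range(alphabet_size), repeat=k + 1))
    return all(
        separating_positions(list(ws)) is not None
        for ws in combinations(all_words, k + 1)
    )
```

For instance, `lemma_holds(2, 3)` runs through all triples of distinct length-3 words over a three-letter alphabet. An exhaustive check of this kind is no substitute for the induction in the text; it only confirms small cases.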
Noetherianity of Flattening Varieties
Using this explicit understanding of the defining equations for $Y_\infty^{(k)}$, we prove that $Y_\infty^{(k)}$, with its Zariski-topology, is $G_\infty$-Noetherian. The proof is similar to that in Sect. 6 for the Gaussian $k$-factor model. In particular, we proceed by induction on $k$. For $k = 0$ the variety $Y_\infty^{(0)}$ consists of a single point, namely, 0, and is certainly Noetherian. Now assume that $Y_\infty^{(k-1)}$ is $G_\infty$-Noetherian. By the discussion of flattening varieties, $Y_\infty^{(k-1)}$ is defined by the orbits of finitely many $k \times k$-determinants of flattenings, say $q$ of them. Let $\Delta_a$, $a \in [q]$ be those determinants.
³ The convenient fact that the sum of two finitary words is again finitary explains our choice of labelling $x_0, \ldots, x_{p-1}$.
Then we may write
$$Y_\infty^{(k)} = Y_\infty^{(k-1)} \cup \bigcup_{a \in [q]} Z_a G_\infty, \quad \text{where} \quad Z_a := \{ \omega \in Y_\infty^{(k)} \mid \Delta_a(\omega) \neq 0 \}.$$
As in the case of the $k$-factor model, it suffices to show that $Z_a$ is Noetherian under a suitable subgroup of $G_\infty$ stabilising it. To this end, write $\Delta_a = \Delta[(w_i)_i, (w'_j)_j]$ for $k$ finitary words $w_i$ and $k$ finitary words $w'_j$ with $\mathrm{supp}(w_i) \cap \mathrm{supp}(w'_j) = \emptyset$ for all $i, j$. Set $n := \max \left( \bigcup_i \mathrm{supp}(w_i) \cup \bigcup_j \mathrm{supp}(w'_j) \right)$ and observe that $\Delta_a$ is fixed by $H := \{ \pi \in \mathrm{Sym}(\infty) \mid \pi|_{[n]} = 1_{[n]} \}$, and hence $Z_a$ is stabilised by $H$. We claim that $Z_a$ is $H$-Noetherian. To prove this, let $J$ be the set of finitary words $w$ with $|\mathrm{supp}(w) \setminus [n]| \leq 1$. In particular, all variables appearing in $\Delta_a$ are in $J$. Let $Q$ be the open subset of $K^J$ where $\Delta_a$ is non-zero. By Theorem 2.3 and the fact that adding finitely many $H$-fixed variables to an $H$-Noetherian ring preserves $H$-Noetherianity, the coordinate ring of $Q$ is $H$-Noetherian; here the crucial point is that "only one index runs off to infinity". We claim that the projection $\mathrm{pr} : Z_a \to Q$ mapping a point to its coordinates labelled by $J$ is an $H$-equivariant, closed embedding. Equivariance is immediate. To see that $\mathrm{pr}$ is injective, we prove that on $Z_a$ any variable $x_w$ has an expression in terms of the variables labelled by $J$. We proceed by induction on the cardinality of $\mathrm{supp}(w) \setminus [n]$. For cardinality 0 and 1 the word $w$ lies in $J$ and we are done. So assume that the cardinality is at least 2 and that the statement is true for all smaller cardinalities. Then we can split $w$ as $u + u'$ where $\mathrm{supp}(u) \cap \mathrm{supp}(u')$, $\mathrm{supp}(u) \cap \mathrm{supp}(w'_j)$, and $\mathrm{supp}(w_i) \cap \mathrm{supp}(u')$ are all empty and where both $\mathrm{supp}(u)$ and $\mathrm{supp}(u')$ contain at least one element of $\mathbb{N} \setminus [n]$. Then on $Z_a$ we have
$$0 = \Delta[(w_1, \ldots, w_k, u), (w'_1, \ldots, w'_k, u')] = \Delta_a \, x_{u+u'} - E,$$
where $E$ is an expression involving only variables whose supports contain fewer elements of $\mathbb{N} \setminus [n]$ than $\mathrm{supp}(w)$ does. By the induction hypothesis, these may be expressed in the variables labelled by $J$, and as $\Delta_a$ is non-zero on $Z_a$, so can $x_{u+u'} = x_w$. To show that $\mathrm{pr}$ is a closed embedding we note that the map $K[Q] \to$
$K[Z]$ is surjective: there is an expression for $E|_Z$ involving only $J$-labelled variables, and dividing by $\Delta_a$ yields such an expression for $x_w$. We conclude that $Z$ has the topology of a closed $H$-stable subspace of $Q$ and is hence $H$-Noetherian. By Proposition 3.5 and the fact that finite unions of equivariantly Noetherian spaces are equivariantly Noetherian, we find that $Y_\infty^{(k)}$ is $G_\infty$-Noetherian. This concludes the proof of the theorem. An important final remark is in order here: our proof of the theorem shows that $X_\infty$ is defined by finitely many $G_\infty$-orbits of equations, which is stronger than the theorem claims. In particular, this stronger statement can be used to show that for each fixed $\Delta$-variety there is a polynomial-time membership test. On the other hand,
it is typically not true that the ideal of $X_\infty$ is generated by finitely many $G_\infty$-orbits of polynomials; indeed, this statement is already false for the cone over Segre. How to reconcile this with the aforementioned conjecture [Sno13] that an ideal-theoretic version of the theorem should hold? Well, by pulling back equations along elements of $G_\infty$, we are implicitly pulling back equations along tensor products of linear maps, and along permutations of tensor factors, and along contractions, and along tensoring with $e_0$, but not along flattening maps (though we did use, in the proof, that $X$ was closed under flattening). This additional source of linear maps along which to pull back equations may allow for an ideal-theoretic version of the theorem. For details see [Sno13, DK13]. Acknowledgement The author was supported by a Vidi grant from the Netherlands Organisation for Scientific Research (NWO).
References
[AT03] S. Aoki, A. Takemura, Minimal basis for connected Markov chain over 3 × 3 × K contingency tables with fixed two dimensional marginals. Aust. N. Z. J. Stat. 45, 229–249 (2003)
[AH07] M. Aschenbrenner, C.J. Hillar, Finite generation of symmetric ideals. Trans. Am. Math. Soc. 359(11), 5171–5192 (2007)
[AH08] M. Aschenbrenner, C.J. Hillar, An algorithm for finding symmetric Gröbner bases in infinite dimensional rings (2008). Preprint available from http://arxiv.org/abs/0801.4439
[Bor91] A. Borel, Linear Algebraic Groups (Springer, New York, 1991)
[BD11] A.E. Brouwer, J. Draisma, Equivariant Gröbner bases and the two-factor model. Math. Comput. 80, 1123–1133 (2011)
[BB13] W. Buczyńska, J. Buczyński, Secant varieties to high degree Veronese reembeddings, catalecticant matrices and smoothable Gorenstein schemes. J. Algebr. Geom. 23, 63–90 (2014)
[CEF12] T. Church, J.S. Ellenberg, B. Farb, FI-modules: a new approach to stability for $S_n$-representations (2012). Preprint, available from http://arxiv.org/abs/1204.4533
[Coh67] D.E. Cohen, On the laws of a metabelian variety. J. Algebra 5, 267–273 (1967)
[Coh87] D.E. Cohen, Closure relations, Buchberger's algorithm, and polynomials in infinitely many variables, in Computation Theory and Logic. Lecture Notes in Computer Science, vol. 270 (Springer, Berlin, 1987), pp. 78–87
[dLST95] J.A. de Loera, B. Sturmfels, R.R. Thomas, Gröbner bases and triangulations of the second hypersimplex. Combinatorica 15, 409–424 (1995)
[DS98] P. Diaconis, B. Sturmfels, Algebraic algorithms for sampling from conditional distributions. Ann. Stat. 26(1), 363–397 (1998)
[Dra10] J. Draisma, Finiteness for the k-factor model and chirality varieties. Adv. Math. 223, 243–256 (2010)
[DK13] J. Draisma, J. Kuttler, Bounded-rank tensors are defined in bounded degree. Duke Math. J. 163(1), 35–63 (2014)
[DEKL13] J. Draisma, R.H. Eggermont, R. Krone, A. Leykin, Noetherianity for infinite-dimensional toric varieties (2013). Preprint available from http://arxiv.org/abs/1306.0828
[DS06] V. Drensky, R. La Scala, Gröbner bases of ideals invariant under endomorphisms. J. Symb. Comput. 41(7), 835–846 (2006)
[DX10] M. Drton, H. Xiao, Finiteness of small factor analysis models. Ann. Inst. Stat. Math. 62(4), 775–783 (2010)
[DSS07] M. Drton, B. Sturmfels, S. Sullivant, Algebraic factor analysis: tetrads, pentads and beyond. Probab. Theory Relat. Fields 138(3–4), 463–493 (2007)
[GSS05] L.D. Garcia, M. Stillman, B. Sturmfels, Algebraic geometry of Bayesian networks. J. Symb. Comput. 39(3–4), 331–355 (2005)
[GW09] R. Goodman, N.R. Wallach, Symmetry, Representations, and Invariants. Graduate Texts in Mathematics, vol. 255 (Springer, New York, 2009)
[Hig52] G. Higman, Ordering by divisibility in abstract algebras. Proc. Lond. Math. Soc. III. Ser. 2, 326–336 (1952)
[HMdC13] C.J. Hillar, A.M. del Campo, Finiteness theorems and algorithms for permutation invariant chains of Laurent lattice ideals. J. Symb. Comput. 50, 314–334 (2013)
[HS12] C.J. Hillar, S. Sullivant, Finite Gröbner bases in infinite dimensional polynomial rings and applications. Adv. Math. 221, 1–25 (2012)
[SHS07] S. Hoşten, S. Sullivant, A finiteness theorem for Markov bases of hierarchical models. J. Comb. Theory Ser. A 114(2), 311–321 (2007)
[Kru60] J.B. Kruskal, Well-quasi ordering, the tree theorem, and Vazsonyi's conjecture. Trans. Am. Math. Soc. 95, 210–225 (1960)
[LM04] J.M. Landsberg, L. Manivel, On the ideals of secant varieties of Segre varieties. Found. Comput. Math. 4(4), 397–422 (2004)
[LW07] J.M. Landsberg, J. Weyman, On tangential varieties of rational homogeneous varieties. J. Lond. Math. Soc. (2) 76(2), 513–530 (2007)
[Lan65] S. Lang, Algebra (Addison-Wesley, Reading, 1965)
[LSL09] R. La Scala, V. Levandovskyy, Letterplace ideals and non-commutative Gröbner bases. J. Symb. Comp. 44(10), 1374–1393 (2009)
[NW63] C.St.J.A. Nash-Williams, On well-quasi-ordering finite trees. Proc. Camb. Philos. Soc. 59, 833–835 (1963)
[OR11] L. Oeding, C. Raicu, Tangential varieties of Segre–Veronese varieties (2011). Preprint, available from http://arxiv.org/abs/1111.6202
[Rai12] C. Raicu, Secant varieties of Segre–Veronese varieties. Algebra Number Theory 6(8), 1817–1868 (2012)
[SS12] S.V. Sam, A. Snowden, GL-equivariant modules over polynomial rings in infinitely many variables (2012). Preprint, available from http://arxiv.org/abs/1206.2233
[SS12b] S.V. Sam, A. Snowden, Introduction to twisted commutative algebras. Preprint (2012), available from http://arxiv.org/abs/1209.5122
[Sno13] A. Snowden, Syzygies of Segre embeddings and $\Delta$-modules. Duke Math. J. 162(2), 225–277 (2013)
Likelihood Geometry June Huh and Bernd Sturmfels
Introduction Maximum likelihood estimation (MLE) is a fundamental computational problem in statistics, and it has recently been studied with some success from the perspective of algebraic geometry. In these notes we give an introduction to the geometry behind MLE for algebraic statistical models for discrete data. As is customary in algebraic statistics [LiAS], we shall identify such models with certain algebraic subvarieties of high-dimensional complex projective spaces. The article is organized into four sections. The first three sections correspond to the three lectures given at Levico Terme. The last section will contain proofs of new results. In Sect. 1, we start out with plane curves, and we explain how to identify the relevant punctured Riemann surfaces. We next present the definitions and basic results for likelihood geometry in Pn . Theorems 1.6 and 1.7 are concerned with the likelihood correspondence, the sheaf of differential one-forms with logarithmic poles, and the topological Euler characteristic. The ML degree of generic complete intersections is given in Theorem 1.10. Theorem 1.15 shows that the likelihood fibration behaves well over strictly positive data. Examples of Grassmannians and Segre varieties are discussed in detail. Our treatment of linear spaces in Theorem 1.20 will appeal to readers interested in matroids and hyperplane arrangements. Section 2 begins leisurely, with the question Does watching soccer on TV cause hair loss? [MSS]. This leads us to conditional independence and low rank matrices.
J. Huh, Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA
B. Sturmfels, Department of Mathematics, University of California, Berkeley, CA 94720, USA
A. Conca et al., Combinatorial Algebraic Geometry, Lecture Notes in Mathematics 2108, DOI 10.1007/978-3-319-04870-3_3, © Springer International Publishing Switzerland 2014
We study likelihood geometry of determinantal varieties, culminating in the duality theorem of Draisma and Rodriguez [DR]. The ML degrees in Theorems 2.2 and 2.6 were computed using the software Bertini [Bertini], underscoring the benefits of using numerical algebraic geometry for MLE. After a discussion of mixture models, highlighting the distinction between rank and nonnegative rank, we end Sect. 2 with a review of recent results in [ARSZ] on tensors of nonnegative rank 2. Section 3 starts out with toric models [PS, §1.22] and geometric programming [BoydVan, §4.5]. Theorem 3.2 identifies the ML degree of a toric variety with the Euler characteristic of the complement of a hypersurface in a torus. Theorem 3.7 furnishes the ML degree of a variety parametrized by generic polynomials. Theorem 3.10 characterizes varieties of ML degree 1 and it reveals a beautiful connection to the A-discriminant of [GKZ]. We introduce the ML bidegree and the sectional ML degree of an arbitrary projective variety in Pn , and we explain how these two are related. Section 3 ends with a study of the operations of intersection, projection, and restriction in likelihood geometry. This concerns the algebro-geometric meaning of the distinction between sampling zeros and structural zeros in statistical modeling. In Sect. 4 we offer precise definitions and technical explanations of more advanced concepts from algebraic geometry, including logarithmic differential forms, Chern–Schwartz–MacPherson classes, and schön very affine varieties. This enables us to present complete proofs of various results, both old and new, that are stated in the earlier sections. We close the introduction with a disclaimer regarding our overly ambitious title. There are many important topics in the statistical study of likelihood inference that should belong to “Likelihood Geometry” but are not covered in this article. 
Such topics include Watanabe’s theory of singular Bayesian integrals [Wat], differential geometry of likelihood in information geometry [AN], and real algebraic geometry of Gaussian models [Uhl]. We regret not being able to talk about these topics and many others. Our presentation here is restricted to the setting of [LiAS, §2.2], namely statistical models for discrete data viewed as projective varieties in Pn .
1 First Lecture Let us begin our discussion with likelihood on algebraic curves in the complex projective plane $\mathbb{P}^2$. We fix a system of homogeneous coordinates $p_0, p_1, p_2$ on $\mathbb{P}^2$. The set of real points in $\mathbb{P}^2$ with $\mathrm{sign}(p_0) = \mathrm{sign}(p_1) = \mathrm{sign}(p_2)$ is identified with the open triangle
$$\Delta_2 = \{ (p_0, p_1, p_2) \in \mathbb{R}^3 : p_0, p_1, p_2 > 0 \text{ and } p_0 + p_1 + p_2 = 1 \}.$$
Given three positive integers $u_0, u_1, u_2$, the corresponding likelihood function is
$$\ell_{u_0,u_1,u_2}(p_0, p_1, p_2) = \frac{p_0^{u_0} p_1^{u_1} p_2^{u_2}}{(p_0 + p_1 + p_2)^{u_0+u_1+u_2}}.$$
This defines a rational function on $\mathbb{P}^2$, and it restricts to a regular function on $\mathbb{P}^2 \setminus \mathcal{H}$, where
$$\mathcal{H} = \{ (p_0 : p_1 : p_2) \in \mathbb{P}^2 : p_0 p_1 p_2 (p_0 + p_1 + p_2) = 0 \}$$
is our arrangement of four distinguished lines. The likelihood function $\ell_{u_0,u_1,u_2}$ is positive on the triangle $\Delta_2$, it is zero on the boundary of $\Delta_2$, and it attains its maximum at the point
$$(\hat{p}_0, \hat{p}_1, \hat{p}_2) = \frac{1}{u_0 + u_1 + u_2} (u_0, u_1, u_2). \tag{1}$$
The corresponding point $(\hat{p}_0 : \hat{p}_1 : \hat{p}_2)$ is the only critical point of the function $\ell_{u_0,u_1,u_2}$ on the four-dimensional real manifold $\mathbb{P}^2 \setminus \mathcal{H}$. To see this, we consider the logarithmic derivative
$$\mathrm{dlog}(\ell_{u_0,u_1,u_2}) = \left( \frac{u_0}{p_0} - \frac{u_0+u_1+u_2}{p_0+p_1+p_2},\; \frac{u_1}{p_1} - \frac{u_0+u_1+u_2}{p_0+p_1+p_2},\; \frac{u_2}{p_2} - \frac{u_0+u_1+u_2}{p_0+p_1+p_2} \right).$$
We note that this equals $(0, 0, 0)$ if and only if $(p_0 : p_1 : p_2)$ is the point $(\hat{p}_0 : \hat{p}_1 : \hat{p}_2)$ in (1). Let $X$ be a smooth curve in $\mathbb{P}^2$ defined by a homogeneous polynomial $f(p_0, p_1, p_2)$. This curve plays the role of a statistical model, and our task is to maximize the likelihood function $\ell_{u_0,u_1,u_2}$ over its set $X \cap \Delta_2$ of positive real points. To compute that maximum algebraically, we examine the set of all critical points of $\ell_{u_0,u_1,u_2}$ on the complex curve $X \setminus \mathcal{H}$. That set of critical points is the likelihood locus. Using Lagrange Multipliers from Calculus, we see that it consists of all points of $X \setminus \mathcal{H}$ such that $\mathrm{dlog}(\ell_{u_0,u_1,u_2})$ lies in the plane spanned by $df$ and $(1, 1, 1)$ in $\mathbb{C}^3$. Thus, our task is to study the solutions in $\mathbb{P}^2 \setminus \mathcal{H}$ of the equations
$$f(p_0, p_1, p_2) = 0 \quad \text{and} \quad \det \begin{pmatrix} 1 & 1 & 1 \\ \dfrac{u_0}{p_0} & \dfrac{u_1}{p_1} & \dfrac{u_2}{p_2} \\ \dfrac{\partial f}{\partial p_0} & \dfrac{\partial f}{\partial p_1} & \dfrac{\partial f}{\partial p_2} \end{pmatrix} = 0. \tag{2}$$
Suppose that $X$ has degree $d$. Then, after clearing denominators, the second equation has degree $d + 1$. By Bézout's Theorem, we expect the likelihood locus to consist of $d(d+1)$ points in $\mathbb{P}^2 \setminus \mathcal{H}$. This is indeed what happens when $f$ is a generic polynomial of degree $d$. We define the maximum likelihood degree (or ML degree) of our curve $X$ to be the cardinality of the likelihood locus for generic choices of $u_0, u_1, u_2$. Thus a general plane curve of degree $d$ has ML degree $d(d+1)$. However, for special curves, the ML degree can be smaller.
Theorem 1.1. Let $X$ be a smooth curve of degree $d$ in $\mathbb{P}^2$, and $a = \#(X \cap \mathcal{H})$ the number of its points on the distinguished arrangement. Then the ML degree of $X$ equals $d^2 - 3d + a$.
This is a very special case of Theorem 1.7 which identifies the ML degree with the signed Euler characteristic of $X \setminus \mathcal{H}$. For a general curve of degree $d$ in $\mathbb{P}^2$, we have $a = 4d$, and so $d^2 - 3d + a = d(d+1)$ as predicted. However, the number $a$ of points in $X \cap \mathcal{H}$ can drop:
Example 1.2. Consider the case $d = 1$ of lines. A generic line has ML degree 2. The line $X = V(p_0 + c p_1)$ has ML degree 1 provided $c \notin \{0, 1\}$. The special line $X = V(p_0 + p_1)$ has ML degree 0: (2) has no solutions on $X \setminus \mathcal{H}$ unless $u_0 + u_1 = 0$. In the three cases, $X \setminus \mathcal{H}$ is the Riemann sphere $\mathbb{P}^1$ with four, three, or two points removed. $\diamond$
Example 1.3. Consider the case $d = 2$ of quadrics. A general quadric has ML degree 6. The Hardy–Weinberg curve, which plays a fundamental role in population genetics, is given by
$$f(p_0, p_1, p_2) = \det \begin{pmatrix} 2p_0 & p_1 \\ p_1 & 2p_2 \end{pmatrix} = 4 p_0 p_2 - p_1^2.$$
The curve has only three points on the distinguished arrangement:
$$X \cap \mathcal{H} = \{ (1 : 0 : 0),\; (0 : 0 : 1),\; (1 : -2 : 1) \}.$$
Hence the ML degree of the Hardy–Weinberg curve equals 1. This means that the maximum likelihood estimate (MLE) is a rational function of the data. Explicitly, the MLE equals
$$(\hat{p}_0, \hat{p}_1, \hat{p}_2) = \frac{1}{4(u_0 + u_1 + u_2)^2} \left( (2u_0 + u_1)^2,\; 2(2u_0 + u_1)(u_1 + 2u_2),\; (u_1 + 2u_2)^2 \right). \tag{3}$$
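The count in Theorem 1.1 is easy to reproduce for the curves mentioned so far. The sketch below uses hypothetical helper names; it evaluates $d^2 - 3d + a$ and checks that the three listed points lie on both the Hardy–Weinberg curve and the arrangement of four lines. It only tests membership of those three points and does not prove that no further intersection points exist.

```python
def ml_degree_smooth_plane_curve(d, a):
    """ML degree d^2 - 3d + a of a smooth plane curve of degree d
    meeting the distinguished arrangement in a points (Theorem 1.1)."""
    return d * d - 3 * d + a


def hardy_weinberg(p0, p1, p2):
    """Defining polynomial 4*p0*p2 - p1^2 of the Hardy-Weinberg curve."""
    return 4 * p0 * p2 - p1 * p1


def on_arrangement(p0, p1, p2):
    """True if the point lies on one of the four distinguished lines."""
    return p0 * p1 * p2 * (p0 + p1 + p2) == 0


# The three points of the Hardy-Weinberg curve on the arrangement.
points = [(1, 0, 0), (0, 0, 1), (1, -2, 1)]
points_ok = all(hardy_weinberg(*pt) == 0 and on_arrangement(*pt) for pt in points)

# d = 2 and a = 3 give ML degree 1, as stated in Example 1.3.
hw_ml_degree = ml_degree_smooth_plane_curve(2, len(points))
```

With $a = 4, 3, 2$ and $d = 1$ the same function returns the ML degrees 2, 1, 0 of the three kinds of lines in Example 1.2.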
In applications, the Hardy–Weinberg curve arises via its parametric representation
$$p_0(s) = s^2, \qquad p_1(s) = 2s(1-s), \qquad p_2(s) = (1-s)^2. \tag{4}$$
Here the parameter $s$ is the probability that a biased coin lands on tails. If we toss that same biased coin twice, then the above formulas represent the following probabilities:
$p_0(s)$ = probability of 0 heads
$p_1(s)$ = probability of 1 head
$p_2(s)$ = probability of 2 heads
Suppose now that the experiment of tossing the coin twice is repeated $N$ times. We record the following counts, where $N = u_0 + u_1 + u_2$ is the sample size of our repeated experiment:
$u_0$ = number of times 0 heads were observed
$u_1$ = number of times 1 head was observed
$u_2$ = number of times 2 heads were observed
The MLE problem is to estimate the unknown parameter $s$ by maximizing
$$\ell_{u_0,u_1,u_2} = p_0(s)^{u_0} p_1(s)^{u_1} p_2(s)^{u_2} = 2^{u_1} s^{2u_0 + u_1} (1-s)^{u_1 + 2u_2}.$$
The unique solution to this optimization problem is
$$\hat{s} = \frac{2u_0 + u_1}{2u_0 + 2u_1 + 2u_2}.$$
Substituting this expression into (4) gives the estimator $(p_0(\hat{s}), p_1(\hat{s}), p_2(\hat{s}))$ for the three probabilities in our model. The resulting rational function coincides with (3). $\diamond$
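The Hardy–Weinberg computation can be replayed in exact rational arithmetic. The following sketch uses hypothetical function names and assumes all counts are positive. It computes $\hat{s}$, substitutes it into the parametrization (4), compares the result with the closed formula (3), and evaluates the derivative of the log-likelihood in $s$, which should vanish at $\hat{s}$.

```python
from fractions import Fraction


def hardy_weinberg_mle(u0, u1, u2):
    """Return (s_hat, p_hat): the MLE of s and the point p(s_hat) from (4)."""
    n = u0 + u1 + u2
    s_hat = Fraction(2 * u0 + u1, 2 * n)
    p_hat = (s_hat ** 2, 2 * s_hat * (1 - s_hat), (1 - s_hat) ** 2)
    return s_hat, p_hat


def formula_3(u0, u1, u2):
    """The closed-form estimate (3) as a triple of exact fractions."""
    n = u0 + u1 + u2
    a, b = 2 * u0 + u1, u1 + 2 * u2
    denom = 4 * n * n
    return (Fraction(a * a, denom), Fraction(2 * a * b, denom), Fraction(b * b, denom))


def score(s, u0, u1, u2):
    """Derivative in s of log( s^(2u0+u1) * (1-s)^(u1+2u2) )."""
    return Fraction(2 * u0 + u1) / s - Fraction(u1 + 2 * u2) / (1 - s)


# Example counts (made up for illustration): 3, 5 and 2 observations.
s_hat, p_hat = hardy_weinberg_mle(3, 5, 2)
```

For the counts $(3, 5, 2)$ this gives $\hat{s} = 11/20$ and $\hat{p} = (121, 198, 81)/400$, which sums to 1 and agrees with (3).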
The ML degree is also defined when the given curve $X \subset \mathbb{P}^2$ is not smooth, but it counts critical points of $\ell_u$ only in the regular locus of $X$. Here is an example to illustrate this.
Example 1.4. A general cubic curve $X$ in $\mathbb{P}^2$ has ML degree 12. Suppose now that $X$ is a cubic which meets $\mathcal{H}$ transversally but has one isolated singular point in $\mathbb{P}^2 \setminus \mathcal{H}$. If the singular point is a node then the ML degree of $X$ is 10, and if the singular point is a cusp then the ML degree of $X$ is 9. The ML degrees are found by saturating the equations in (2) with respect to the homogeneous ideal of the singular point. $\diamond$
Moving beyond likelihood geometry in the plane, we shall introduce our objects in any dimension. We fix the complex projective space $\mathbb{P}^n$ with coordinates $p_0, p_1, \ldots, p_n$, representing probabilities. We summarize the observed data in a vector $u = (u_0, u_1, \ldots, u_n) \in \mathbb{N}^{n+1}$, where $u_i$ is the number of samples in state $i$. The likelihood function on $\mathbb{P}^n$ given by $u$ equals
$$\ell_u = \frac{p_0^{u_0} p_1^{u_1} \cdots p_n^{u_n}}{(p_0 + p_1 + \cdots + p_n)^{u_0 + u_1 + \cdots + u_n}}.$$
The unique critical point of this rational function on $\mathbb{P}^n$ is the data point itself:
$$(u_0 : u_1 : \cdots : u_n).$$
Moreover, this point is the global maximum of the likelihood function $\ell_u$ on the probability simplex $\Delta_n$. Throughout, we identify $\Delta_n$ with the set of all positive real points in $\mathbb{P}^n$. The linear forms in $\ell_u$ define an arrangement $\mathcal{H}$ of $n + 2$ distinguished hyperplanes in $\mathbb{P}^n$. The differential of the logarithm of the likelihood function is the vector of rational functions
$$\mathrm{dlog}(\ell_u) = \left( \frac{u_0}{p_0}, \frac{u_1}{p_1}, \ldots, \frac{u_n}{p_n} \right) - \frac{u_+}{p_+} (1, 1, \ldots, 1). \tag{5}$$
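As a quick sanity check on (5), one can evaluate the vector $\mathrm{dlog}(\ell_u)$ in exact arithmetic, writing $p_+$ and $u_+$ for the coordinate sums of $p$ and $u$. The sketch below (the function name `dlog_likelihood` is hypothetical) confirms on an example that the vector vanishes at the data point and at any rescaling of it, and that its pairing with $p$ is zero at every point, as expected for the logarithmic derivative of a function that is homogeneous of degree zero.

```python
from fractions import Fraction


def dlog_likelihood(u, p):
    """Evaluate formula (5): the vector with entries u_i/p_i - u_plus/p_plus.

    `u` and `p` are sequences of the same length; all p_i and their sum
    must be non-zero.
    """
    u_plus = sum(u)
    p_plus = sum(p)
    return [Fraction(ui) / pi - Fraction(u_plus) / p_plus for ui, pi in zip(u, p)]


def euler_pairing(u, p):
    """Sum of p_i times the i-th entry of dlog; zero by homogeneity."""
    return sum(pi * gi for pi, gi in zip(p, dlog_likelihood(u, p)))


# Example data vector (made up for illustration) in P^3.
u = (2, 3, 5, 7)
at_data_point = dlog_likelihood(u, u)
```

Here `at_data_point` is the zero vector, while at a point such as $(1, 1, 1, 1)$ the entries are non-zero.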
Here $p_+ = \sum_{i=0}^n p_i$ and $u_+ = \sum_{i=0}^n u_i$. The vector (5) represents a section of the sheaf of differential one-forms on $\mathbb{P}^n$ that have logarithmic singularities along $\mathcal{H}$. This sheaf is denoted $\Omega^1_{\mathbb{P}^n}(\log(\mathcal{H}))$. Our aim is to study the restriction of $\ell_u$ to a closed subvariety $X \subseteq \mathbb{P}^n$. We will assume that $X$ is defined over the real numbers, irreducible, and not contained in $\mathcal{H}$. Let $X_{\mathrm{sing}}$ denote the singular locus of $X$, and $X_{\mathrm{reg}}$ denote $X \setminus X_{\mathrm{sing}}$. When $X$ serves as a statistical model, the goal is to maximize the rational function $\ell_u$ on the semialgebraic set $X \cap \Delta_n$. To solve this problem algebraically, we determine all critical points of the log-likelihood function $\log(\ell_u)$ on the complex variety $X$. Here we must exclude points that are singular or lie in $\mathcal{H}$.
Definition 1.5. The maximum likelihood degree of $X$ is the number of complex critical points of the function $\ell_u$ on $X_{\mathrm{reg}} \setminus \mathcal{H}$, for generic data $u$. The likelihood correspondence $\mathcal{L}_X$ is the universal family of these critical points. To be precise, $\mathcal{L}_X$ is the closure in $\mathbb{P}^n \times \mathbb{P}^n$ of
$$\{ (p, u) : p \in X_{\mathrm{reg}} \setminus \mathcal{H} \text{ and } \mathrm{dlog}(\ell_u) \text{ vanishes at } p \}.$$
We sometimes write $\mathbb{P}^n_p \times \mathbb{P}^n_u$ for $\mathbb{P}^n \times \mathbb{P}^n$ to highlight that the first factor is the probability space, with coordinates $p$, while the second factor is the data space, with coordinates $u$. The first part of the following result appears in [Huh1, §2]. A precursor was [HKS, Proposition 3].
Theorem 1.6. The likelihood correspondence $\mathcal{L}_X$ of any irreducible subvariety $X$ in $\mathbb{P}^n_p$ is an irreducible variety of dimension $n$ in the product $\mathbb{P}^n_p \times \mathbb{P}^n_u$. The map $\mathrm{pr}_1 : \mathcal{L}_X \to \mathbb{P}^n_p$ is a projective bundle over $X_{\mathrm{reg}} \setminus \mathcal{H}$, and the map $\mathrm{pr}_2 : \mathcal{L}_X \to \mathbb{P}^n_u$ is generically finite-to-one.
See Sect. 4 for a proof. The degree of the map $\mathrm{pr}_2 : \mathcal{L}_X \to \mathbb{P}^n_u$ to data space is the ML degree of $X$. This number has a topological interpretation as an Euler characteristic, provided suitable assumptions on $X$ are being made. The relationship between the homology of a manifold and critical points of a suitable function on it is the topic of Morse theory.
The study of ML degrees was started in [CHKS, §2] by developing the connection to the sheaf $\Omega^1_X(\log(\mathcal{H}))$ of differential one-forms on $X$ with logarithmic poles along $\mathcal{H}$. It was shown in [CHKS, Theorem 20] that the ML degree of $X$ equals the signed topological Euler characteristic
$$(-1)^{\dim X} \cdot \chi(X \setminus \mathcal{H}),$$
provided $X$ is smooth and the intersection $\mathcal{H} \cap X$ defines a normal crossing divisor in $X \subseteq \mathbb{P}^n$. A major drawback of that early result was that the hypotheses are so restrictive that they essentially never hold for varieties $X$ that arise from statistical models used in practice. From a theoretical point of view, this issue can be addressed by passing to a resolution of singularities. However, in spite of existing algorithms for resolution in characteristic zero, these algorithms do not scale to problems of the sizes of interest in algebraic statistics. Thus, whatever computations we wish to do should not be based on resolution of singularities. The following result due to [Huh1] gives the same topological interpretation of the ML degree. The hypotheses here are much more realistic and inclusive than those in [CHKS, Theorem 20].
Theorem 1.7. If the very affine variety $X \setminus \mathcal{H}$ is smooth of dimension $d$, then the ML degree of $X$ equals the signed topological Euler characteristic $(-1)^d \chi(X \setminus \mathcal{H})$.
The term very affine variety refers to a closed subvariety of some algebraic torus $(\mathbb{C}^*)^m$. Our ambient space $\mathbb{P}^n \setminus \mathcal{H}$ is a very affine variety because it has a closed embedding
$$\mathbb{P}^n \setminus \mathcal{H} \to (\mathbb{C}^*)^{n+1}, \qquad (p_0 : \cdots : p_n) \mapsto \left( \frac{p_0}{p_+}, \ldots, \frac{p_n}{p_+} \right).$$
The study of such varieties is foundational for tropical geometry. The special case when $X \setminus \mathcal{H}$ is a Riemann surface with $a$ punctures, arising from a curve in $\mathbb{P}^2$, was seen in Theorem 1.1. We remark that Theorem 1.7 can be deduced from works of Gabber–Loeser [Gabber-Loeser] and Franecki–Kapranov [Franecki-Kapranov] on perverse sheaves on algebraic tori. The smoothness hypothesis is essential for Theorem 1.7 to hold. If $X$ is singular then, generally, neither $X \setminus \mathcal{H}$ nor $X_{\mathrm{reg}} \setminus \mathcal{H}$ has its signed Euler characteristic equal to the ML degree of $X$. Varieties $X$ that demonstrate this are the two singular cubic curves in Example 1.4.
Conjecture 1.8. For any projective variety $X \subseteq \mathbb{P}^n$ of dimension $d$, not contained in $\mathcal{H}$,
$$(-1)^d \cdot \chi(X \setminus \mathcal{H}) \geq \mathrm{MLdegree}(X).$$
In particular, the signed topological Euler characteristic $(-1)^d \cdot \chi(X \setminus \mathcal{H})$ is nonnegative.
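For smooth plane curves, the Euler characteristic in Theorem 1.7 can be computed by hand, which recovers the count of Theorem 1.1. The sketch below relies on two standard facts that are not stated in the text above: a smooth plane curve of degree $d$ has genus $(d-1)(d-2)/2$, and removing $a$ points from a compact Riemann surface of genus $g$ leaves Euler characteristic $2 - 2g - a$. The function names are hypothetical.

```python
def genus_smooth_plane_curve(d):
    """Genus (d-1)(d-2)/2 of a smooth plane curve of degree d."""
    return (d - 1) * (d - 2) // 2


def signed_euler_characteristic(d, a):
    """Return -chi(X minus H) for a smooth plane curve of degree d
    with a points on the arrangement, i.e. the sign (-1)^1 for a curve."""
    chi_compact = 2 - 2 * genus_smooth_plane_curve(d)
    chi_punctured = chi_compact - a
    return -chi_punctured


# A generic cubic meets the four lines in 12 points.
generic_cubic = signed_euler_characteristic(3, 12)
```

The value for a generic cubic is 12, matching the ML degree quoted in Example 1.4, and in general the result equals $d^2 - 3d + a$.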
Analogous conjectures can be made in the slightly more general setting of [Huh1]. In particular, we conjecture that the inequality $(-1)^d \cdot \chi(V) \geq 0$ holds for any closed $d$-dimensional subvariety $V \subseteq (\mathbb{C}^*)^m$.

Remark 1.9. We saw in Example 1.2 that the ML degree of a projective variety $X$ can be $0$. In all situations of statistical interest, the variety $X \subset \mathbb{P}^n$ intersects the open simplex $\Delta_n$ in a subset that is Zariski dense in $X$. If that intersection is smooth then $\mathrm{MLdegree}(X) \geq 1$. In fact, arguing as in [CHKS, Proposition 11], it can be shown that for smooth $X$,

\[ \mathrm{MLdegree}(X) \;\geq\; \#(\text{bounded regions of } X_{\mathbb{R}} \setminus \mathcal{H}). \]

Here a bounded region is a connected component of the semialgebraic set $X_{\mathbb{R}} \setminus \mathcal{H}$ whose classical closure is disjoint from the distinguished hyperplane $V(p_+)$ in $\mathbb{P}^n$. If $X$ is singular then the number of bounded regions of $X_{\mathbb{R}} \setminus \mathcal{H}$ can exceed $\mathrm{MLdegree}(X)$. For instance, let $X \subset \mathbb{P}^2$ be the cuspidal cubic curve defined by

\[ (p_0 + p_1 + p_2)(7p_0 - 9p_1 - 2p_2)^2 = (3p_0 + 5p_1 + 4p_2)^3. \]

The real part $X_{\mathbb{R}} \setminus \mathcal{H}$ consists of 8 bounded and 2 unbounded regions, but the ML degree of $X$ is 7. The bounded region that contains the cusp $(13 : 17 : -31)$ has no other critical points for $\ell_u$. $\diamondsuit$

In what follows we present instances that illustrate the computation of the ML degree. We begin with the case of generic complete intersections. Suppose that $X \subset \mathbb{P}^n$ is a complete intersection defined by $r$ generic homogeneous polynomials $g_1, \ldots, g_r$ of degrees $d_1, d_2, \ldots, d_r$.

Theorem 1.10. The ML degree of $X$ equals $D\, d_1 d_2 \cdots d_r$, where

\[ D \;=\; \sum_{i_1 + i_2 + \cdots + i_r \,\leq\, n - r} d_1^{i_1} d_2^{i_2} \cdots d_r^{i_r}. \tag{6} \]
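The count in Theorem 1.10 is easy to evaluate by brute force. The following Python sketch enumerates the exponent vectors in (6); the function name is ours, and the code is meant only as an illustration of the formula, not as an efficient implementation.

```python
from itertools import product
from math import prod


def ml_degree_generic_complete_intersection(n: int, degrees: list[int]) -> int:
    """ML degree D * d_1 * ... * d_r of a generic complete intersection in P^n."""
    r = len(degrees)
    if r > n:
        raise ValueError("need at most n defining equations in P^n")
    # D = sum of d_1^{i_1} ... d_r^{i_r} over all i_1 + ... + i_r <= n - r.
    big_d = 0
    for exponents in product(range(n - r + 1), repeat=r):
        if sum(exponents) <= n - r:
            big_d += prod(d ** i for d, i in zip(degrees, exponents))
    return big_d * prod(degrees)


if __name__ == "__main__":
    print(ml_degree_generic_complete_intersection(2, [3]))        # plane cubic
    print(ml_degree_generic_complete_intersection(3, [2, 3]))     # space curve
    print(ml_degree_generic_complete_intersection(3, [2, 2, 2]))  # three quadrics
```

For a hypersurface ($r = 1$) the sum collapses to $d + d^2 + \cdots + d^n$, and for $r = n$ it is the Bézout number $d_1 \cdots d_n$.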
Proof. By Bertini's Theorem, the generic complete intersection $X$ is smooth in $\mathbb{P}^n$. All critical points of the likelihood function $\ell_u$ on $X$ lie in the dense open subset $X \setminus \mathcal{H}$. Consider the following $(r+2) \times (n+1)$-matrix with entries in the polynomial ring $\mathbb{R}[p_0, p_1, \ldots, p_n]$:

\[
\begin{bmatrix} u \\ \tilde{J}(p) \end{bmatrix}
\;=\;
\begin{bmatrix}
u_0 & u_1 & \cdots & u_n \\
p_0 & p_1 & \cdots & p_n \\
p_0 \frac{\partial g_1}{\partial p_0} & p_1 \frac{\partial g_1}{\partial p_1} & \cdots & p_n \frac{\partial g_1}{\partial p_n} \\
p_0 \frac{\partial g_2}{\partial p_0} & p_1 \frac{\partial g_2}{\partial p_1} & \cdots & p_n \frac{\partial g_2}{\partial p_n} \\
\vdots & \vdots & \ddots & \vdots \\
p_0 \frac{\partial g_r}{\partial p_0} & p_1 \frac{\partial g_r}{\partial p_1} & \cdots & p_n \frac{\partial g_r}{\partial p_n}
\end{bmatrix}.
\tag{7}
\]

Let $Y$ denote the determinantal variety in $\mathbb{P}^n$ given by the vanishing of its $(r+2) \times (r+2)$ minors. The codimension of $Y$ is at most $n - r$, which is a general upper bound for ideals of maximal minors, and hence the dimension of $Y$ is at least $r$. Our genericity assumptions ensure that the matrix $\tilde{J}(p)$ has maximal row rank $r + 1$ for all $p \in X$. Hence a point $p \in X$ lies in $Y$ if and only if the vector $u$ is in the row span of $\tilde{J}(p)$. Moreover, by Theorem 1.6,

\[ (X_{\mathrm{reg}} \setminus \mathcal{H}) \cap Y \;=\; X \cap Y \]

is a finite subset of $\mathbb{P}^n$, and its cardinality is the desired ML degree of $X$. Since $X$ has dimension $n - r$, we conclude that $Y$ has the maximum possible codimension, namely $n - r$, and that the intersection of $X$ with the determinantal variety $Y$ is proper. We note that $Y$ is Cohen–Macaulay, since $Y$ has maximal codimension $n - r$, and ideals of minors of generic matrices are Cohen–Macaulay. Bézout's Theorem implies

\[ \mathrm{MLdegree}(X) \;=\; \mathrm{degree}(X) \cdot \mathrm{degree}(Y) \;=\; d_1 \cdots d_r \cdot \mathrm{degree}(Y). \]
The degree of the determinantal variety $Y$ equals the degree of the determinantal variety given by generic forms of the same row degrees. By the Thom–Porteous–Giambelli formula, this degree is the complete homogeneous symmetric function of degree $\mathrm{codim}(Y) = n - r$ evaluated at the row degrees of the matrix. Here, the row degrees are $0, 1, d_1, \ldots, d_r$, and the value of that symmetric function is precisely $D$. We conclude that $\mathrm{degree}(Y) = D$. Hence the ML degree of the generic complete intersection $X = V(g_1, \ldots, g_r)$ equals $D \cdot d_1 d_2 \cdots d_r$. $\square$

Example 1.11 ($r = 1$). A generic hypersurface of degree $d$ in $\mathbb{P}^n$ has ML degree

\[ d \cdot D \;=\; d + d^2 + d^3 + \cdots + d^n. \]

Example 1.12 ($r = 2$, $n = 3$). A space curve that is the generic intersection of two surfaces of degree $d$ and $e$ in $\mathbb{P}^3$ has ML degree $de + d^2 e + d e^2$. $\diamondsuit$

Remark 1.13. It was shown in [HKS, Theorem 5] that (6) is an upper bound for the ML degree of any variety $X$ of codimension $r$ that is defined by polynomials of degree $d_1, \ldots, d_r$. In fact, the same is true under the weaker hypothesis that $X$ is cut out by polynomials of degrees $d_1 \geq \cdots \geq d_r \geq d_{r+1} \geq \cdots \geq d_s$, so $X$ need not be a complete intersection. However, the hypothesis $\mathrm{codim}(X) = r$ is essential in order for $\mathrm{MLdegree}(X) \leq$ (6) to hold. That codimension hypothesis was forgotten when this upper bound was cited in [LiAS, Theorem 2.2.6] and in [PS, Theorem 3.31]. Hence these two book references are not correct as stated. Here is a simple counterexample. Let $n = 3$ and $d_1 = d_2 = d_3 = 2$. Then the bound (6) is the Bézout number 8, and this is also the correct ML degree for a general complete intersection of three quadrics in $\mathbb{P}^3$. Now let $X$ be a general rational normal curve in $\mathbb{P}^3$. The curve $X$ is defined by three quadrics, namely, the $2 \times 2$-minors of a $2 \times 3$-matrix filled with general linear forms in $p_0, p_1, p_2, p_3$. Since $X$ is a Riemann sphere with 15 punctures, Theorem 1.7 tells us that $\mathrm{MLdegree}(X) = 13$, and this exceeds the bound of 8. $\diamondsuit$

We now come to a variety that is ubiquitous in statistics, namely the model of independence for two binary random variables [LiAS, §1.1]. This model is represented by Segre's quadric surface $X$ in $\mathbb{P}^3$. By this we mean the surface defined by the $2 \times 2$-determinant:

\[ X \;=\; V(p_{00} p_{11} - p_{01} p_{10}) \;\subset\; \mathbb{P}^3. \]

The surface $X$ is isomorphic to $\mathbb{P}^1 \times \mathbb{P}^1$, so it is smooth, and we can apply Theorem 1.7 to find the ML degree. In other words, we seek to determine the Euler characteristic of the open complex surface $X \setminus \mathcal{H}$ where
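\[ \mathcal{H} \;=\; \bigl\{\, p \in \mathbb{P}^3 \,:\, p_{00}\, p_{01}\, p_{10}\, p_{11}\, (p_{00} + p_{01} + p_{10} + p_{11}) = 0 \,\bigr\}. \]

To this end, we write $X = \mathbb{P}^1 \times \mathbb{P}^1$ with coordinates $\bigl((x_0 : x_1), (y_0 : y_1)\bigr)$. Our surface is parametrized by $p_{ij} = x_i y_j$, and hence

\[
\begin{aligned}
X \setminus \mathcal{H}
&= (\mathbb{P}^1 \times \mathbb{P}^1) \setminus \bigl\{ x_0 x_1 y_0 y_1 (x_0 + x_1)(y_0 + y_1) = 0 \bigr\} \\
&= \bigl( \mathbb{P}^1 \setminus \{ x_0 x_1 (x_0 + x_1) = 0 \} \bigr) \times \bigl( \mathbb{P}^1 \setminus \{ y_0 y_1 (y_0 + y_1) = 0 \} \bigr) \\
&= \bigl( \text{2-sphere} \setminus \{\text{three points}\} \bigr) \times \bigl( \text{2-sphere} \setminus \{\text{three points}\} \bigr).
\end{aligned}
\]

Since the Euler characteristic is additive and multiplicative,

\[ \chi(X \setminus \mathcal{H}) \;=\; (-1) \cdot (-1) \;=\; 1. \]

This means that the map $u \mapsto \hat{p}$ from the data to the MLE is a rational function in each coordinate. The following "word problem for freshmen" is aimed at finding that function.

Example 1.14. Do this exercise: A biologist friend of yours wishes to test whether two binary random variables are independent. She collects data and records the matrix of counts

\[ u \;=\; \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix}. \]

How to ascertain whether $u$ lies close to the independence model $X = V(p_{00} p_{11} - p_{01} p_{10})$? A statistician who recently started working in her lab explains that, as the first step in the analysis of her data, the biologist should calculate the maximum likelihood estimate (MLE)

\[ \hat{p} \;=\; \begin{pmatrix} \hat{p}_{00} & \hat{p}_{01} \\ \hat{p}_{10} & \hat{p}_{11} \end{pmatrix}. \]

Can you help your friend by supplying the formula for $\hat{p}$ as a rational function in $u$? The solution to this word problem is as follows. The MLE is the rank 1 matrix

\[ \hat{p} \;=\; \frac{1}{(u_{++})^2} \begin{pmatrix} u_{0+} \\ u_{1+} \end{pmatrix} \begin{pmatrix} u_{+0} & u_{+1} \end{pmatrix}. \tag{8} \]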
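Formula (8) can be evaluated exactly with rational arithmetic. The sketch below is a minimal illustration: the function name is ours, and it is written for a table of any format because the recipe (row sums times column sums, divided by the squared sample size) does not depend on the format being $2 \times 2$.

```python
from fractions import Fraction


def independence_mle(u: list[list[int]]) -> list[list[Fraction]]:
    """Rank-one estimate: (row sums) x (column sums) / (sample size)^2."""
    total = sum(map(sum, u))
    row_sums = [sum(row) for row in u]
    col_sums = [sum(col) for col in zip(*u)]
    return [[Fraction(r * c, total ** 2) for c in col_sums] for r in row_sums]


if __name__ == "__main__":
    counts = [[1, 2], [3, 4]]  # hypothetical data, chosen only for illustration
    p_hat = independence_mle(counts)
    for row in p_hat:
        print([str(x) for x in row])
    # The estimate is a probability distribution of rank one.
    print(sum(map(sum, p_hat)))
    print(p_hat[0][0] * p_hat[1][1] - p_hat[0][1] * p_hat[1][0])
```

Using `Fraction` keeps the output a rational function of the data, in line with the ML degree being 1.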
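We illustrate the concepts introduced above by deriving this well-known formula. The likelihood correspondence $\mathcal{L}_X$ of $X = V(p_{00} p_{11} - p_{01} p_{10})$ is the subvariety of $X \times \mathbb{P}^3$ defined by

\[ U \cdot (p_{00}, p_{01}, p_{10}, p_{11})^T = 0, \tag{9} \]

where $U$ is the matrix

\[
U \;=\;
\begin{pmatrix}
0 & -u_{10} - u_{11} & 0 & u_{00} + u_{01} \\
u_{01} + u_{11} & -u_{00} - u_{10} & 0 & 0 \\
u_{10} + u_{11} & 0 & -u_{00} - u_{01} & 0 \\
0 & 0 & -u_{01} - u_{11} & u_{00} + u_{10}
\end{pmatrix}.
\]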
We urge the reader to derive (9) from Definition 1.5 using a computer algebra system. Note that the determinant of $U$ vanishes identically. In fact, for generic $u_{ij}$, the matrix $U$ has rank 3, so its kernel is spanned by a single vector. The coordinates of that vector are given by Cramer's rule, and we find them to be equal to the rational functions in (8). The locus where the function $u \mapsto \hat{p}$ is undefined consists of those $u$ where the matrix rank of $U$ drops below 3. A computation shows that the rank of $U$ drops to 2 on the variety

\[ V(u_{00} + u_{10},\, u_{01} + u_{11}) \;\cup\; V(u_{00} + u_{01},\, u_{10} + u_{11}), \]

and it drops to 0 on the point $V(u_{00} + u_{01},\, u_{10} + u_{11},\, u_{01} + u_{11})$. In particular, the likelihood function $\ell_u$ given by that point $u$ has infinitely many critical points in the quadric $X$. $\diamondsuit$

We note that all coefficients of the linear forms that define the exceptional loci in $\mathbb{P}^3_u$ for the independence model are positive. This means that data points $u$ with all coordinates positive can never be exceptional. We will prove in Sect. 4 that this usually holds. Let $\mathrm{pr}_1 : \mathcal{L}_X \to \mathbb{P}^n_p$ and $\mathrm{pr}_2 : \mathcal{L}_X \to \mathbb{P}^n_u$ be the projections from the likelihood correspondence to $p$-space and $u$-space respectively. We are interested in the fibers of $\mathrm{pr}_2$ over positive points $u$.
Theorem 1.15. Let $u \in \mathbb{R}^{n+1}_{>0}$, and let $X \subset \mathbb{P}^n$ be an irreducible variety such that no singular point of any intersection $X \cap \{p_i = 0\}$ lies in the hyperplane at infinity $\{p_+ = 0\}$. Then

(1) the likelihood function $\ell_u$ on $X$ has only finitely many critical points in $X_{\mathrm{reg}} \setminus \mathcal{H}$;
(2) if the fiber $\mathrm{pr}_2^{-1}(u)$ is contained in $X_{\mathrm{reg}}$, then its length equals the ML degree of $X$.

The hypothesis concerning "no singular point" will be satisfied for essentially all statistical models of interest. Here is an example which shows that this hypothesis is necessary.

Example 1.16. We consider the smooth cubic curve $X$ in $\mathbb{P}^2$ that is defined by

\[ f \;=\; (p_0 + p_1 + p_2)^3 + p_0 p_1 p_2. \]

The ML degree of the curve $X$ is 3. Each intersection $X \cap \{p_i = 0\}$ is a triple point that lies on the line at infinity $\{p_+ = 0\}$. The fiber $\mathrm{pr}_2^{-1}(u)$ of the likelihood fibration over the positive point $u = (1 : 1 : 1)$ is the entire curve $X$. $\diamondsuit$

If $u$ is not positive in Theorem 1.15, then the fiber of $\mathrm{pr}_2$ over $u$ may have positive dimension. We saw an instance of this at the end of Example 1.14. Such resonance loci have been studied extensively when $X$ is a linear subspace of $\mathbb{P}^n$. See [CDFV] and references therein.

The following cautionary example shows that the length of the scheme-theoretic fiber of $\mathcal{L}_X \to \mathbb{P}^n_u$ over special points $u$ in the open simplex $\Delta_n$ may exceed the ML degree of $X$.

Example 1.17. Let $X$ be the curve in $\mathbb{P}^2$ defined by the ternary cubic

\[ f \;=\; p_2 (p_1 - p_2)^2 + (p_0 - p_2)^3. \]

This curve intersects $\mathcal{H}$ in eight points, has ML degree 5, and has a cuspidal singularity at

\[ P := (1 : 1 : 1). \]

The prime ideal in $\mathbb{R}[p_0, p_1, p_2, u_0, u_1, u_2]$ for the likelihood correspondence $\mathcal{L}_X$ is minimally generated by five polynomials, having degrees $(3,0), (2,2), (3,1), (3,1), (3,1)$. They are obtained by saturating the two equations in (2) with respect to $\langle p_0 p_2 \rangle \cap \langle p_0 - p_1,\, p_2 - p_1 \rangle$. The scheme-theoretic fiber of $\mathrm{pr}_1$ over a general point of $X$ is a reduced line in the $u$-plane, while the fiber of $\mathrm{pr}_1$ over $P$ is the double line

\[ L := \bigl\{ (u_0 : u_1 : u_2) \in \mathbb{P}^2 \,:\, (2u_0 - u_1 - u_2)^2 = 0 \bigr\}. \]
The reader is invited to verify the following assertions using a computer algebra system:

(a) If $u$ is a general point of $\mathbb{P}^2_u$, then $\mathrm{pr}_2^{-1}(u)$ consists of 5 reduced points in $X_{\mathrm{reg}} \setminus \mathcal{H}$.
(b) If $u$ is a general point on the line $L$, then the locus of critical points $\mathrm{pr}_2^{-1}(u)$ consists of four reduced points in $X_{\mathrm{reg}} \setminus \mathcal{H}$ and the reduced point $P$.
(c) If $u$ is the point $(1 : 1 : 1) \in L$, then $\mathrm{pr}_2^{-1}(u)$ is a zero-dimensional scheme of length 6. This scheme consists of three reduced points in $X_{\mathrm{reg}} \setminus \mathcal{H}$ and $P$ counted with multiplicity 3.

In particular, the fiber in (c) is not algebraically equivalent to the general fiber (a). This example illustrates one of the difficulties classical geometers had to face when formulating the "principle of conservation of numbers". See [Fulton, Chap. 10] for a modern treatment. $\diamondsuit$

It is instructive to examine classical varieties from projective geometry from the likelihood perspective. For instance, we may study the Grassmannian in its Plücker embedding. Grassmannians are a nice test case because they are smooth, so that Theorem 1.7 applies.

Example 1.18. Let $X = G(2,4)$ denote the Grassmannian of lines in $\mathbb{P}^3$. In its Plücker embedding in $\mathbb{P}^5$, this Grassmannian is the quadric hypersurface defined by

\[ p_{12} p_{34} - p_{13} p_{24} + p_{14} p_{23} = 0. \tag{10} \]

As in (7), the critical equations for the likelihood function $\ell_u$ are the $3 \times 3$-minors of

\[
\begin{bmatrix}
u_{12} & u_{13} & u_{14} & u_{23} & u_{24} & u_{34} \\
p_{12} & p_{13} & p_{14} & p_{23} & p_{24} & p_{34} \\
p_{12} p_{34} & -p_{13} p_{24} & p_{14} p_{23} & p_{14} p_{23} & -p_{13} p_{24} & p_{12} p_{34}
\end{bmatrix}.
\tag{11}
\]
By Theorem 1.6, the likelihood correspondence $\mathcal{L}_X$ is a five-dimensional subvariety of $\mathbb{P}^5 \times \mathbb{P}^5$. The cohomology class of this subvariety can be represented by the bidegree of its ideal:

\[ B_X(p, u) \;=\; 4p^5 + 6p^4 u + 6p^3 u^2 + 6p^2 u^3 + 2p u^4. \tag{12} \]
This is the multidegree, in the sense of [ch3:MS, §8.5], of $\mathcal{L}_X$ with respect to the natural $\mathbb{Z}^2$-grading on the polynomial ring $\mathbb{R}[p, u]$. We can use [ch3:MS, Proposition 8.49] to compute the bidegree from the prime ideal of $\mathcal{L}_X$. Its leading coefficient 4 is the ML degree of $X$. Its trailing coefficient 2 is the degree of $X$. The polynomials $B_X(p, u)$ will be studied in Sect. 3.

The prime ideal of $\mathcal{L}_X$ is computed from the equations in (10) and (11) by saturation with respect to $\mathcal{H}$. It is minimally generated by the following eight polynomials in $\mathbb{R}[p, u]$:
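(a) one polynomial of degree $(2,0)$, namely the Plücker quadric,
(b) six polynomials of degree $(1,1)$, given by $2 \times 2$-minors of

\[
\begin{pmatrix}
p_{12} - p_{34} & p_{13} - p_{24} & p_{14} - p_{23} \\
u_{12} - u_{34} & u_{13} - u_{24} & u_{14} - u_{23}
\end{pmatrix}
\]

and

\[
\begin{pmatrix}
p_{12} + p_{13} + p_{23} & p_{12} + p_{14} + p_{24} & p_{13} + p_{14} + p_{34} & p_{23} + p_{24} + p_{34} \\
u_{12} + u_{13} + u_{23} & u_{12} + u_{14} + u_{24} & u_{13} + u_{14} + u_{34} & u_{23} + u_{24} + u_{34}
\end{pmatrix},
\]

(c) one polynomial of degree $(2,1)$, for instance

\[
2u_{24}\, p_{12} p_{34} + 2u_{34}\, p_{13} p_{24} + (u_{23} + u_{24} + u_{34})\, p_{14} p_{24} - (u_{13} + u_{14} + u_{34})\, p_{24}^2 - (u_{12} + 2u_{13} + u_{14} - u_{24})\, p_{24} p_{34}.
\]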
For a fixed positive data vector $u > 0$, these six polynomials in (b) reduce to three linear equations, and these cut out a plane $\mathbb{P}^2$ inside $\mathbb{P}^5$. To find the four critical points of $\ell_u$ on $X = G(2,4)$, we must then intersect the two conics (a) and (c) in that plane $\mathbb{P}^2$.

The ML degree of the Grassmannian $G(r, m)$ in $\mathbb{P}^{\binom{m}{r} - 1}$ is the signed Euler characteristic of the manifold $G(r, m) \setminus \mathcal{H}$ obtained by removing $\binom{m}{r} + 1$ distinguished hyperplane sections. It would be very interesting to find a general formula for this ML degree. At present, we only know that the ML degree of $G(2,5)$ is 26, and that the ML degree of $G(2,6)$ is 156. By Theorem 1.7, these numbers give the Euler characteristic of $G(2, m) \setminus \mathcal{H}$ for $m \leq 6$. $\diamondsuit$

We end this lecture with a discussion of the delightful case when $X$ is a linear subspace of $\mathbb{P}^n$, and the open variety $X \setminus \mathcal{H}$ is the complement of a hyperplane arrangement. In this context, following Varchenko [Varchenko], the likelihood function $\ell_u$ is known as the master function, and the statement of Theorem 1.7 was first proved by Orlik and Terao in [Orlik-Terao]. We assume that $X$ has dimension $d$, is defined over $\mathbb{R}$, and does not contain the vector $\mathbf{1} = (1, 1, \ldots, 1)$. We can regard $X$ as a $(d+1)$-dimensional linear subspace of $\mathbb{R}^{n+1}$. The orthogonal complement $X^\perp$ with respect to the standard dot product is a linear space of dimension $n - d$ in $\mathbb{R}^{n+1}$. The linear space $X^\perp + \mathbf{1}$ spanned by $X^\perp$ and the vector $\mathbf{1}$ has dimension $n - d + 1$ in $\mathbb{R}^{n+1}$, and hence can be viewed as a subspace of codimension $d$ in $\mathbb{P}^n_u$. In our next formula, the operation $\star$ is the Hadamard product or coordinatewise product.

Proposition 1.19. The likelihood correspondence $\mathcal{L}_X$ in $\mathbb{P}^n \times \mathbb{P}^n$ is defined by

\[ p \in X \quad \text{and} \quad u \in p \star (X^\perp + \mathbf{1}). \tag{13} \]

The prime ideal of $\mathcal{L}_X$ is obtained from these constraints by saturation with respect to $\mathcal{H}$.
Proof. If all $p_i$ are non-zero then $u \in p \star (X^\perp + \mathbf{1})$ says that

\[ u/p := \left( \frac{u_0}{p_0}, \frac{u_1}{p_1}, \ldots, \frac{u_n}{p_n} \right) \]

lies in the subspace $X^\perp + \mathbf{1}$. Equivalently, the vector obtained by adding a multiple of $(1, 1, \ldots, 1)$ to $u/p$ is perpendicular to $X$. We can take that vector to be the differential (5). Hence (13) expresses the condition that $p$ is a critical point of $\ell_u$ on $X$. $\square$

The intersection $X \cap \mathcal{H}$ is an arrangement of $n + 2$ hyperplanes in $X \simeq \mathbb{P}^d$. For special choices of the subspace $X$, it may happen that two or more hyperplanes coincide. Taking $\{p_+ = 0\}$ as the hyperplane at infinity, we view $X \cap \mathcal{H}$ as an arrangement of $n + 1$ hyperplanes in the affine space $\mathbb{R}^d$. A region of this arrangement is bounded if it is disjoint from $\{p_+ = 0\}$.

Theorem 1.20. The ML degree of $X$ is the number of bounded regions of the real affine hyperplane arrangement $X \cap \mathcal{H}$ in $\mathbb{R}^d$. The bidegree of the likelihood correspondence $\mathcal{L}_X$ is the $h$-polynomial of the broken circuit complex of the rank $d + 1$ matroid associated with $X \cap \mathcal{H}$.

We need to explain the second assertion. The hyperplane arrangement $X \cap \mathcal{H}$ consists of the intersections of the $n + 2$ hyperplanes in $\mathcal{H}$ with $X \simeq \mathbb{P}^d$. We regard these as hyperplanes through the origin in $\mathbb{R}^{d+1}$. They define a matroid $M$ of rank $d + 1$ on $n + 2$ elements. We identify these elements with the variables $x_1, x_2, \ldots, x_{n+2}$. For each circuit $C$ of $M$ let $m_C = \bigl( \prod_{i \in C} x_i \bigr) / x_j$ where $j$ is the smallest index such that $x_j \in C$. The broken circuit complex of $M$ is the simplicial complex with Stanley–Reisner ring $\mathbb{R}[x_1, \ldots, x_{n+2}] / \langle m_C : C \text{ circuit of } M \rangle$. See [ch3:MS, §1.1] for Stanley–Reisner basics. The Hilbert series of this graded ring has the form

\[ \frac{h_0 + h_1 z + \cdots + h_d z^d}{(1 - z)^{d+1}}. \]

What is being claimed in Theorem 1.20 is that the bidegree of $\mathcal{L}_X$ equals

\[ B_X(p, u) \;=\; \bigl( h_0 u^d + h_1 p u^{d-1} + h_2 p^2 u^{d-2} + \cdots + h_d p^d \bigr) \cdot p^{n-d}. \tag{14} \]
Equivalently, this is the class of $\mathcal{L}_X$ in the cohomology ring

\[ H^*(\mathbb{P}^n \times \mathbb{P}^n; \mathbb{Z}) \;=\; \mathbb{Z}[p, u] / \langle p^{n+1}, u^{n+1} \rangle. \]

There are several (purely combinatorial) definitions of the invariants $h_i$ of the matroid $M$. For instance, they are coefficients of the following specialization of the characteristic polynomial:

\[ \chi_M(q + 1) \;=\; q \cdot \bigl( h_0 q^d - h_1 q^{d-1} + \cdots + (-1)^{d-1} h_{d-1}\, q + (-1)^d h_d \bigr). \tag{15} \]
Theorem 1.20 was used in [Huh0] to prove a conjecture of Dawson, stating that the sequence $h_0, h_1, \ldots, h_d$ is log-concave, when $M$ is representable over a field of characteristic zero. The first assertion in Theorem 1.20 was proved by Varchenko in [Varchenko]. For definitions and characterizations of the characteristic polynomial $\chi_M$, and many pointers to matroid basics, we refer to [OTBook]. A proof of the second assertion was given by Denham et al. in a slightly different setting [Denham-Garrousian-Schulze, Theorem 1]. We give a proof in Sect. 4 following [Huh1, §3]. The ramification locus of the likelihood fibration $\mathrm{pr}_2 : \mathcal{L}_X \to \mathbb{P}^n_u$ is known as the entropic discriminant [SSV].

Example 1.21. Let $d = 2$ and $n = 4$, so $X$ is a plane in $\mathbb{P}^4$, defined by two linear forms

\[
\begin{aligned}
c_{10} p_0 + c_{11} p_1 + c_{12} p_2 + c_{13} p_3 + c_{14} p_4 &= 0, \\
c_{20} p_0 + c_{21} p_1 + c_{22} p_2 + c_{23} p_3 + c_{24} p_4 &= 0.
\end{aligned}
\tag{16}
\]
Following Theorem 1.20, we view $X \cap \mathcal{H}$ as an arrangement of five lines in the affine plane

\[ \{\, p \in X \,:\, p_0 + p_1 + p_2 + p_3 + p_4 \neq 0 \,\} \;\simeq\; \mathbb{C}^2. \]

Hence, for generic $c_{ij}$, the ML degree of $X$ is equal to 6, the number of bounded regions of this arrangement. The condition $u \in p \star (X^\perp + \mathbf{1})$ in Proposition 1.19 translates into

\[
\mathrm{rank}
\begin{bmatrix}
u_0 & u_1 & u_2 & u_3 & u_4 \\
p_0 & p_1 & p_2 & p_3 & p_4 \\
c_{10} p_0 & c_{11} p_1 & c_{12} p_2 & c_{13} p_3 & c_{14} p_4 \\
c_{20} p_0 & c_{21} p_1 & c_{22} p_2 & c_{23} p_3 & c_{24} p_4
\end{bmatrix}
\;\leq\; 3.
\tag{17}
\]
The $4 \times 4$-minors of this $4 \times 5$-matrix, together with the two linear forms defining $X$, form a system of equations that has six solutions in $\mathbb{P}^4$, for generic $c_{ij}$. All solutions have real coordinates. In fact, there is one solution in each bounded region of $X \setminus \mathcal{H}$. The likelihood correspondence $\mathcal{L}_X$ is the fourfold in $\mathbb{P}^4 \times \mathbb{P}^4$ given by the equations (16) and (17).

We now illustrate the second statement in Theorem 1.20. Suppose that the real numbers $c_{ij}$ are generic, so $M$ is the uniform matroid of rank three on six elements. The Stanley–Reisner ring of the broken circuit complex of $M$ equals

\[ \mathbb{R}[x_1, x_2, x_3, x_4, x_5, x_6] / \langle x_2 x_3 x_4,\, x_2 x_3 x_5,\, x_2 x_3 x_6,\, \ldots,\, x_4 x_5 x_6 \rangle. \]

The Hilbert series of this graded algebra is

\[ \frac{h_0 + h_1 z + h_2 z^2}{(1 - z)^3} \;=\; \frac{1 + 3z + 6z^2}{(1 - z)^3}. \]
We conclude that the bidegree (14) of the likelihood correspondence $\mathcal{L}_X$ equals

\[ B_X(p, u) \;=\; 6p^4 + 3p^3 u + p^2 u^2. \]
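The passage from the characteristic polynomial to the bidegree can be checked mechanically in the generic case. The Python sketch below assumes the standard closed form $\chi(t) = \sum_{k=0}^{r-1} (-1)^k \binom{N}{k} (t^{r-k} - 1)$ for the characteristic polynomial of the uniform matroid of rank $r$ on $N$ elements; this formula is not stated in the text. It then reads off the $h_i$ from $\chi_M(q+1)/q$ as in (15) and assembles the coefficients of (14). All function names are ours.

```python
from math import comb


def char_poly_uniform(rank: int, size: int) -> list[int]:
    """Coefficients of chi(t) for the uniform matroid; index k holds the t^k coefficient."""
    coeffs = [0] * (rank + 1)
    for k in range(rank):
        c = (-1) ** k * comb(size, k)
        coeffs[rank - k] += c  # contributes c * t^(rank - k)
        coeffs[0] -= c         # ... and -c, so that chi(1) = 0
    return coeffs


def shift_by_one(coeffs: list[int]) -> list[int]:
    """Coefficients of f(q + 1), given the coefficients of f(t)."""
    out = [0] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            out[j] += c * comb(k, j)
    return out


def h_vector_generic(d: int, n: int) -> list[int]:
    """h_0, ..., h_d for a generic d-dimensional linear subspace X of P^n."""
    shifted = shift_by_one(char_poly_uniform(d + 1, n + 2))
    assert shifted[0] == 0      # chi(q + 1) is divisible by q
    reduced = shifted[1:]       # coefficients of chi(q + 1) / q
    return [(-1) ** i * reduced[d - i] for i in range(d + 1)]


def bidegree_coefficients(d: int, n: int) -> dict[tuple[int, int], int]:
    """Map (a, b) -> coefficient of p^a u^b in the bidegree (14)."""
    h = h_vector_generic(d, n)
    return {(n - d + i, d - i): h[i] for i in range(d + 1)}


if __name__ == "__main__":
    print(h_vector_generic(2, 4))
    print(bidegree_coefficients(2, 4))
```

For $d = 2$ and $n = 4$ this reproduces the $h$-vector $(1, 3, 6)$ and the bidegree $6p^4 + 3p^3 u + p^2 u^2$ of Example 1.21; the last entry $h_d$ is the generic ML degree.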
For special choices of the coefficients $c_{ij}$ in (16), some triples of lines in the arrangement $X \cap \mathcal{H}$ may meet in a point. For such matroids, the ML degree drops from 6 to some integer between 0 and 5. We recommend it as an exercise to the reader to explore these cases. For instance, can you find explicit $c_{ij}$ so that the ML degree of $X$ equals 3? What are the prime ideal and the bidegree of $\mathcal{L}_X$ in that case? How can the ML degree of $X$ be 0 or 1? $\diamondsuit$

It would be interesting to know for which statistical models $X$ in $\mathbb{P}^n$ the likelihood correspondence $\mathcal{L}_X$ is a complete intersection in $\mathbb{P}^n \times \mathbb{P}^n$. When $X$ is a linear subspace of $\mathbb{P}^n$, this question is closely related to the concept of freeness of a hyperplane arrangement.

Proposition 1.22. If the hyperplane arrangement $X \cap \mathcal{H}$ in $X$ is free, then the likelihood correspondence $\mathcal{L}_X$ is an ideal-theoretic complete intersection in $\mathbb{P}^n \times \mathbb{P}^n$.

Proof. For the definition of freeness see §1 in the paper [CDFV] by Cohen, Denham, Falk and Varchenko. The proposition is implied by their [CDFV, Theorem 2.13] and [CDFV, Corollary 3.8]. $\square$

Using Theorem 1.20, this provides a likelihood geometry proof of Terao's theorem that the characteristic polynomial of a free arrangement factors into integral linear forms [Terao].
2 Second Lecture

In our newspaper we frequently read about studies aimed at proving that a behavior or food causes a certain medical condition. We begin the second lecture with an introduction to statistical issues arising in such studies. The "medical question" we wish to address is: Does Watching Soccer on TV Cause Hair Loss? We learned this amusing example from [MSS, §1].

In a fictional study, 296 British subjects aged between 40 and 50 were interviewed about their hair length and how many hours per week they watch soccer (a.k.a. "football") on TV. Their responses are summarized in the following contingency table of format $3 \times 3$:

\[
U \;=\;
\begin{array}{r|ccc}
 & \text{lots of hair} & \text{medium hair} & \text{little hair} \\
\hline
\leq 2\ \text{h} & 51 & 45 & 33 \\
2\text{–}6\ \text{h} & 28 & 30 & 29 \\
\geq 6\ \text{h} & 15 & 27 & 38
\end{array}
\]
For instance, 29 respondents reported having little hair and watching between 2 and 6 h of soccer on TV per week. Based on these data, are these two random variables independent, or are we inclined to believe that watching soccer on TV and hair loss are correlated? At first glance, the latter seems to be the case. Indeed, being independent means that the data matrix $U$ should be close to a rank 1 matrix. However, all $2 \times 2$-minors of $U$ are strictly positive, indeed by quite a margin, and this suggests a positive correlation.

Yet this interpretation is deceptive. A much better explanation of our data can be given by identifying a certain hidden random variable. That hidden variable is gender. Indeed, suppose that among the respondents 126 were males and 170 were females. Our data matrix $U$ is then the sum of the male table and the female table, maybe as follows:

\[
U \;=\;
\begin{pmatrix} 3 & 9 & 15 \\ 4 & 12 & 20 \\ 7 & 21 & 35 \end{pmatrix}
+
\begin{pmatrix} 48 & 36 & 18 \\ 24 & 18 & 9 \\ 8 & 6 & 3 \end{pmatrix}.
\tag{18}
\]
Both of these tables have rank 1, hence $U$ has rank 2. Hence, the appropriate null hypothesis $H_0$ for analyzing our situation is not independence but conditional independence:

\[ H_0 : \text{Soccer on TV and Hair Loss are Independent given Gender.} \]
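The claims about the decomposition (18) can be verified by exact integer arithmetic. The following Python sketch does so; the helper names are ours, and nothing beyond the standard library is assumed.

```python
from itertools import combinations

MALE = [[3, 9, 15], [4, 12, 20], [7, 21, 35]]
FEMALE = [[48, 36, 18], [24, 18, 9], [8, 6, 3]]
U = [[a + b for a, b in zip(row_m, row_f)] for row_m, row_f in zip(MALE, FEMALE)]


def minors_2x2(m: list[list[int]]) -> list[int]:
    """All 2x2 minors, over pairs of rows i < k and pairs of columns j < l."""
    rows, cols = range(len(m)), range(len(m[0]))
    return [m[i][j] * m[k][l] - m[i][l] * m[k][j]
            for i, k in combinations(rows, 2)
            for j, l in combinations(cols, 2)]


def det3(m: list[list[int]]) -> int:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


if __name__ == "__main__":
    print(U)
    print(all(x == 0 for x in minors_2x2(MALE)))    # male table has rank 1
    print(all(x == 0 for x in minors_2x2(FEMALE)))  # female table has rank 1
    print(min(minors_2x2(U)) > 0, det3(U) == 0)     # U has rank exactly 2
```

The vanishing $3 \times 3$ determinant together with nonzero $2 \times 2$ minors is what "rank 2" means here.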
And, based on the data $U$, we most definitely do not reject that null hypothesis. The key feature of the matrix $U$ above was that it has rank 2. We now define low rank matrix models in general. Consider two discrete random variables $X$ and $Y$ having $m$ and $n$ states respectively. Their joint probability distribution is written as an $m \times n$-matrix

\[
P \;=\;
\begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1n} \\
p_{21} & p_{22} & \cdots & p_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
p_{m1} & p_{m2} & \cdots & p_{mn}
\end{pmatrix}
\]

whose entries are nonnegative and sum to 1. Here $p_{ij}$ represents the probability that $X$ is in state $i$ and $Y$ is in state $j$. The set of all probability distributions is the standard simplex $\Delta_{mn-1}$ of dimension $mn - 1$. We write $\mathcal{M}_r$ for the manifold of rank $r$ matrices in $\Delta_{mn-1}$. The matrices $P$ in $\mathcal{M}_1$ represent independent distributions. Mixtures of $r$ independent distributions correspond to matrices in $\mathcal{M}_r$. As always in applied algebraic geometry, we can make any problem that involves semi-algebraic sets progressively easier by three steps:
• disregard inequalities,
• replace real numbers with complex numbers,
• replace affine space by projective space.

In our situation, this leads us to replacing $\mathcal{M}_r$ with its Zariski closure in complex projective space $\mathbb{P}^{mn-1}$. This Zariski closure is the projective variety $V_r$ of complex $m \times n$ matrices of rank $\leq r$. Note that $V_r$ is singular along $V_{r-1}$. The codimension of $V_r$ is $(m - r)(n - r)$. It is a non-trivial exercise to write the degree of $V_r$ in terms of $m, n, r$. Hint: [ch3:MS, Example 15.2].

Suppose now that i.i.d. samples are drawn from an unknown joint distribution on our two random variables $X$ and $Y$. We summarize the resulting data in a contingency table

\[
U \;=\;
\begin{pmatrix}
u_{11} & u_{12} & \cdots & u_{1n} \\
u_{21} & u_{22} & \cdots & u_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
u_{m1} & u_{m2} & \cdots & u_{mn}
\end{pmatrix}.
\]

The entries of the matrix $U$ are nonnegative integers whose sum is $u_{++}$. The likelihood function for the contingency table $U$ is the following function on $\Delta_{mn-1}$:

\[ P \;\mapsto\; \binom{u_{++}}{u_{11}\; u_{12}\; \cdots\; u_{mn}} \prod_{i=1}^{m} \prod_{j=1}^{n} p_{ij}^{u_{ij}}. \]

Assuming fixed sample size, this is the likelihood of observing the data $U$ given an unknown probability distribution $P$ in $\Delta_{mn-1}$. In what follows we suppress the multinomial coefficient. Furthermore, we regard the likelihood function as a rational function on $\mathbb{P}^{mn-1}$, so we write

\[ \ell_U \;=\; \frac{\prod_{i=1}^{m} \prod_{j=1}^{n} p_{ij}^{u_{ij}}}{p_{++}^{u_{++}}}. \]
We wish to find a low rank probability matrix $P$ that best explains the data $U$. Maximum likelihood estimation means solving the following optimization problem:

\[ \text{Maximize } \ell_U(P) \text{ subject to } P \in \mathcal{M}_r. \tag{19} \]
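To make the objective in (19) concrete, the following Python sketch evaluates $\log \ell_U$ for the soccer and hair loss table at two candidate matrices. It is only an illustration under stated assumptions: the function names are ours, and the comparison relies on the fact that the empirical table $U$ itself has rank 2 and is the unconstrained maximizer on the simplex, so for this particular data set it already solves (19) for $r = 2$.

```python
from math import log

U = [[51, 45, 33], [28, 30, 29], [15, 27, 38]]


def log_likelihood(u, p):
    """log of prod p_ij^u_ij / p_++^u_++ ; invariant under rescaling of p."""
    u_total = sum(map(sum, u))
    p_total = sum(map(sum, p))
    value = sum(u_ij * log(p_ij)
                for u_row, p_row in zip(u, p)
                for u_ij, p_ij in zip(u_row, p_row)
                if u_ij > 0)
    return value - u_total * log(p_total)


def rank_one_fit(u):
    """Outer product of row sums and column sums (independence fit, up to scaling)."""
    row_sums = [sum(row) for row in u]
    col_sums = [sum(col) for col in zip(*u)]
    return [[r * c for c in col_sums] for r in row_sums]


if __name__ == "__main__":
    print(log_likelihood(U, rank_one_fit(U)))  # best rank-1 explanation
    print(log_likelihood(U, U))                # the rank-2 table itself
```

Because $\ell_U$ is homogeneous of degree zero in $P$, the matrices need not be normalized to sum to 1 before they are passed in.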
The optimal solution $\hat{P}$ is a rank $r$ matrix. This is the maximum likelihood estimate for $U$. For $r = 1$, the independence model, the maximum likelihood estimate $\hat{P}$ is obtained from the data matrix $U$ by the following formula, already seen for $m = n = 2$ in (8). Multiply the vector of row sums with the vector of column sums and divide by the sample size:

\[
\hat{P} \;=\; \frac{1}{(u_{++})^2}
\begin{pmatrix} u_{1+} \\ u_{2+} \\ \vdots \\ u_{m+} \end{pmatrix}
\begin{pmatrix} u_{+1} & u_{+2} & \cdots & u_{+n} \end{pmatrix}.
\tag{20}
\]

Statisticians, scientists and engineers refer to such a formula as an "analytic solution". In our view, it would be more appropriate to call this an "algebraic solution". After all, we are here using algebra, not analysis. Our algebraic solution for $r = 1$ reveals the following points:

• The MLE $\hat{P}$ is a rational function of the data $U$.
• The function $U \mapsto \hat{P}$ is an algebraic function of degree 1.
• The ML degree of the independence model $V_1$ equals 1.

We next discuss the smallest case when the ML degree is larger than 1.

Example 2.1. Let $m = n = 3$ and $r = 2$. Our MLE problem is to maximize
\[ \ell_U \;=\; \frac{p_{11}^{u_{11}} p_{12}^{u_{12}} p_{13}^{u_{13}} p_{21}^{u_{21}} p_{22}^{u_{22}} p_{23}^{u_{23}} p_{31}^{u_{31}} p_{32}^{u_{32}} p_{33}^{u_{33}}}{p_{++}^{u_{++}}} \]

subject to the constraints $P \geq 0$ and $\mathrm{rank}(P) = 2$, where $P = (p_{ij})$ is a $3 \times 3$-matrix of unknowns. The equations that characterize the critical points of this optimization problem are

\[ \det(P) \;=\; p_{11} p_{22} p_{33} - p_{11} p_{23} p_{32} - p_{12} p_{21} p_{33} + p_{12} p_{23} p_{31} + p_{13} p_{21} p_{32} - p_{13} p_{22} p_{31} \;=\; 0 \]

and the vanishing of the $3 \times 3$-minors of the following $3 \times 9$-matrix:

\[
\begin{bmatrix}
u_{11} & u_{12} & u_{13} & u_{21} & u_{22} & u_{23} & u_{31} & u_{32} & u_{33} \\
p_{11} & p_{12} & p_{13} & p_{21} & p_{22} & p_{23} & p_{31} & p_{32} & p_{33} \\
p_{11} a_{11} & p_{12} a_{12} & p_{13} a_{13} & p_{21} a_{21} & p_{22} a_{22} & p_{23} a_{23} & p_{31} a_{31} & p_{32} a_{32} & p_{33} a_{33}
\end{bmatrix}
\]

where $a_{ij} = \frac{\partial \det(P)}{\partial p_{ij}}$ is the cofactor of $p_{ij}$ in $P$. For random positive data $u_{ij}$, these equations have ten solutions with $\mathrm{rank}(P) = 2$ in $\mathbb{P}^8 \setminus \mathcal{H}$. Hence the ML degree of $V_2$ is 10. If we regard the $u_{ij}$ as unknowns, then saturating the above determinantal equations with respect to $\mathcal{H} \cup V_1$ yields the prime ideal of the likelihood correspondence $\mathcal{L}_{V_2} \subset \mathbb{P}^8 \times \mathbb{P}^8$. See Example 4.8 for the bidegree and other enumerative invariants of the eight-dimensional variety $\mathcal{L}_{V_2}$. $\diamondsuit$

Recall from Definition 1.5 that the ML degree of a statistical model (or a projective variety) is the number of critical points of the likelihood function for generic data.

Theorem 2.2. The known values for the ML degrees of the determinantal varieties $V_r$ are

\[
\begin{array}{c|ccccccc}
(m,n) & (3,3) & (3,4) & (3,5) & (4,4) & (4,5) & (4,6) & (5,5) \\
\hline
r = 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
r = 2 & 10 & 26 & 58 & 191 & 843 & 3119 & 6776 \\
r = 3 & 1 & 1 & 1 & 191 & 843 & 3119 & 61326 \\
r = 4 & & & & 1 & 1 & 1 & 6776 \\
r = 5 & & & & & & & 1
\end{array}
\]
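The table is small enough to store and query directly. The Python sketch below records the values of Theorem 2.2 as a dictionary and checks the symmetry $r \leftrightarrow m - r + 1$ within each column, a pattern that the text returns to in Theorem 2.8. The data structure and function name are ours; the numbers are copied from the table.

```python
# Known ML degrees of V_r, indexed by the format (m, n) and then by the rank r.
ML_DEGREES = {
    (3, 3): {1: 1, 2: 10, 3: 1},
    (3, 4): {1: 1, 2: 26, 3: 1},
    (3, 5): {1: 1, 2: 58, 3: 1},
    (4, 4): {1: 1, 2: 191, 3: 191, 4: 1},
    (4, 5): {1: 1, 2: 843, 3: 843, 4: 1},
    (4, 6): {1: 1, 2: 3119, 3: 3119, 4: 1},
    (5, 5): {1: 1, 2: 6776, 3: 61326, 4: 6776, 5: 1},
}


def column_is_symmetric(m: int, column: dict[int, int]) -> bool:
    """Check that the entries for rank r and rank m - r + 1 agree."""
    return all(column[r] == column[m - r + 1] for r in column)


if __name__ == "__main__":
    for (m, n), column in ML_DEGREES.items():
        print((m, n), column_is_symmetric(m, column))
```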
The numbers 10 and 26 were computed back in 2004 using the symbolic software Singular, and they were reported in [HKS, §5]. The bold face numbers were found in 2012 in [HRS] using the numerical software Bertini. In what follows we shall describe some of the details.

Remark 2.3. Each determinantal variety $V_r$ is singular along the smaller variety $V_{r-1}$. Hence, the very affine variety $V_r \setminus \mathcal{H}$ is singular for $r \geq 2$, so Theorem 1.7 does not apply. Here, $\mathcal{H} = \{ p_{++} \prod p_{ij} = 0 \}$. According to Conjecture 1.8, the ML degree above provides a lower bound for the signed topological Euler characteristic of $V_r \setminus \mathcal{H}$. The difference between the two numbers reflects the nature of the singular locus $V_{r-1} \setminus \mathcal{H}$ inside $V_r \setminus \mathcal{H}$. For plane curves that have nodes and cusps, we encountered this issue in Examples 1.4 and 1.17.

We begin with a geometric description of the likelihood correspondence. An $m \times n$-matrix $P$ is a regular point in $V_r$ if and only if $\mathrm{rank}(P) = r$. The tangent space $T_P$ is a subspace of dimension $rn + rm - r^2$ in $\mathbb{C}^{m \times n}$. Its orthogonal complement $T_P^\perp$ has dimension $(m - r)(n - r)$. The partial derivatives of the log-likelihood function $\log(\ell_U)$ on $\mathbb{P}^{mn-1}$ are

\[ \frac{\partial \log(\ell_U)}{\partial p_{ij}} \;=\; \frac{u_{ij}}{p_{ij}} - \frac{u_{++}}{p_{++}}. \]

Proposition 2.4. An $m \times n$-matrix $P$ of rank $r$ is a critical point for $\log(\ell_U)$ on $V_r$ if and only if the linear subspace $T_P^\perp$ contains the matrix

\[ \left[ \frac{u_{ij}}{p_{ij}} - \frac{u_{++}}{p_{++}} \right]_{\substack{i = 1, \ldots, m \\ j = 1, \ldots, n}} \]
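For $r = 1$ the criterion of Proposition 2.4 can be tested by hand or by machine. The Python sketch below uses the elementary fact that the tangent space of $V_1$ at $P = a b^T$ consists of the matrices $a y^T + x b^T$, so a matrix $S$ lies in $T_P^\perp$ exactly when $a^T S = 0$ and $S b = 0$. That description of the tangent space, and the function names, are our assumptions for this illustration. The data are the soccer and hair loss counts.

```python
from fractions import Fraction

U = [[51, 45, 33], [28, 30, 29], [15, 27, 38]]


def score_matrix(u, p):
    """Matrix with entries u_ij / p_ij - u_++ / p_++ , computed exactly."""
    u_total = sum(map(sum, u))
    p_total = sum(map(sum, p))
    return [[Fraction(u_ij) / p_ij - Fraction(u_total) / p_total
             for u_ij, p_ij in zip(u_row, p_row)]
            for u_row, p_row in zip(u, p)]


def is_critical_rank_one(u, a, b):
    """Test a^T S = 0 and S b = 0 for P = a b^T and S the score matrix at P."""
    p = [[Fraction(a_i * b_j) for b_j in b] for a_i in a]
    s = score_matrix(u, p)
    left = [sum(a[i] * s[i][j] for i in range(len(a))) for j in range(len(b))]
    right = [sum(s[i][j] * b[j] for j in range(len(b))) for i in range(len(a))]
    return all(x == 0 for x in left + right)


if __name__ == "__main__":
    row_sums = [sum(row) for row in U]
    col_sums = [sum(col) for col in zip(*U)]
    print(is_critical_rank_one(U, row_sums, col_sums))    # the independence fit
    print(is_critical_rank_one(U, [1, 1, 1], [1, 1, 1]))  # uniform distribution
```

The first call confirms that the product of row sums and column sums is a critical point; the second shows that an arbitrary rank 1 matrix is not.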
In order to get to the numbers in Theorem 2.2, the geometric formulation was replaced in [HRS] with a parametric representation of the rank constraints. The following linear algebra formulation worked well for non-trivial computations. Assume $m \leq n$. Let $P_1$, $R_1$, $L_1$ and $\Lambda$ be matrices of unknowns of formats $r \times r$, $r \times (n - r)$, $(m - r) \times r$, and $(n - r) \times (m - r)$. Set

\[
L = \begin{pmatrix} L_1 & -I_{m-r} \end{pmatrix}, \qquad
P = \begin{pmatrix} P_1 & P_1 R_1 \\ L_1 P_1 & L_1 P_1 R_1 \end{pmatrix}, \qquad \text{and} \qquad
R = \begin{pmatrix} R_1 \\ -I_{n-r} \end{pmatrix},
\]

where $I_{m-r}$ and $I_{n-r}$ are identity matrices. In the next statement we use the symbol $\star$ for the Hadamard (entrywise) product of two matrices that have the same format.

Proposition 2.5. Fix a general $m \times n$ data matrix $U$. The polynomial system

\[ P \star (R \cdot \Lambda \cdot L)^T + u_{++} \cdot P \;=\; U \]

consists of $mn$ equations in $mn$ unknowns. For generic $U$, it has finitely many complex solutions $(P_1, L_1, R_1, \Lambda)$. The $m \times n$-matrices $P$ resulting from these solutions are precisely the critical points of the likelihood function $\ell_U$ on the determinantal variety $V_r$.

We next present the analogue to Theorem 2.2 for symmetric matrices

\[
P \;=\;
\begin{pmatrix}
2p_{11} & p_{12} & p_{13} & \cdots & p_{1n} \\
p_{12} & 2p_{22} & p_{23} & \cdots & p_{2n} \\
p_{13} & p_{23} & 2p_{33} & \cdots & p_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
p_{1n} & p_{2n} & p_{3n} & \cdots & 2p_{nn}
\end{pmatrix}.
\]
Such matrices, with nonnegative coordinates $p_{ij}$ that sum to 1, represent joint probability distributions for two identically distributed random variables with $n$ states. The case $n = 2$ and $r = 1$ is the Hardy–Weinberg curve, which we discussed in detail in Example 1.3.

Theorem 2.6. The known values for ML degrees of symmetric matrices of rank at most $r$ (mixtures of $r$ independent identically distributed random variables) are

\[
\begin{array}{c|ccccc}
n & 2 & 3 & 4 & 5 & 6 \\
\hline
r = 1 & 1 & 1 & 1 & 1 & 1 \\
r = 2 & 1 & 6 & 37 & 270 & 2341 \\
r = 3 & & 1 & 37 & 1394 & ? \\
r = 4 & & & 1 & 270 & ? \\
r = 5 & & & & 1 & 2341
\end{array}
\]
At present we do not know the common value of the ML degree for $n = 6$ and $r = 3, 4$. In what follows we take a closer look at the model for symmetric $3 \times 3$-matrices of rank 2.

Example 2.7. Let $n = 3$ and $r = 2$, so $X$ is a cubic hypersurface in $\mathbb{P}^5$. The likelihood correspondence $\mathcal{L}_X$ is a five-dimensional subvariety of $\mathbb{P}^5 \times \mathbb{P}^5$ having bidegree

\[ B_X(p, u) \;=\; 6p^5 + 12p^4 u + 15p^3 u^2 + 12p^2 u^3 + 3p u^4. \]
The bihomogeneous prime ideal of $\mathcal{L}_X$ is minimally generated by 23 polynomials, namely:

• One polynomial of bidegree $(3,0)$; this is the determinant of $P$.
• Three polynomials of degree $(1,1)$. These come from the underlying toric model $\{\mathrm{rank}(P) = 1\}$. As suggested in Proposition 3.5, they are the $2 \times 2$-minors of

\[
\begin{pmatrix}
2p_0 + p_1 + p_2 & p_1 + 2p_3 + p_4 & p_2 + p_4 + 2p_5 \\
2u_0 + u_1 + u_2 & u_1 + 2u_3 + u_4 & u_2 + u_4 + 2u_5
\end{pmatrix}.
\]

• One polynomial of degree $(2,1)$,
• three polynomials of degree $(2,2)$,
• nine polynomials of degree $(3,1)$,
• six polynomials of degree $(3,2)$.
It turns out that this ideal represents an expression for the MLE $\hat{P}$ in terms of radicals in $U$. We shall work this out for one numerical example. Consider the data matrix $U$ with
\[ u_{11} = 10, \quad u_{12} = 9, \quad u_{13} = 1, \quad u_{22} = 21, \quad u_{23} = 3, \quad u_{33} = 7. \]
For this choice, all six critical points of the likelihood function are real and positive:

  p11      p12      p13      p22      p23      p33      log ℓ_U(p)
  0.1037   0.3623   0.0186   0.3179   0.0607   0.1368   -82.18102
  0.1084   0.2092   0.1623   0.3997   0.0503   0.0702   -84.94446
  0.0945   0.2554   0.1438   0.3781   0.0472   0.0810   -84.99184
  0.1794   0.2152   0.0142   0.3052   0.2333   0.0528   -85.14678
  0.1565   0.2627   0.0125   0.2887   0.2186   0.0609   -85.19415
  0.1636   0.1517   0.1093   0.3629   0.1811   0.0312   -87.95759
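The table can be checked directly. The following is a minimal sketch, not part of the original notes: it recomputes the row sums and the values of $\log \ell_U$ from the rounded coordinates. The coordinate order $(p_{11}, p_{12}, p_{13}, p_{22}, p_{23}, p_{33})$ and the reading $p_{23} = 0.0472$ in the third row (the value for which that row sums to 1) are assumptions.

```python
from math import log

# Data of Example 2.7, ordered as (u11, u12, u13, u22, u23, u33).
u = [10, 9, 1, 21, 3, 7]

# Critical points in the same coordinate order, rounded to four decimals.
# Assumption: p23 in the third row is read as 0.0472, so that the row sums to 1.
points = [
    [0.1037, 0.3623, 0.0186, 0.3179, 0.0607, 0.1368],
    [0.1084, 0.2092, 0.1623, 0.3997, 0.0503, 0.0702],
    [0.0945, 0.2554, 0.1438, 0.3781, 0.0472, 0.0810],
    [0.1794, 0.2152, 0.0142, 0.3052, 0.2333, 0.0528],
    [0.1565, 0.2627, 0.0125, 0.2887, 0.2186, 0.0609],
    [0.1636, 0.1517, 0.1093, 0.3629, 0.1811, 0.0312],
]


def log_likelihood(p, u):
    """Logarithm of prod_i p_i^(u_i) / (sum_i p_i)^(sum_i u_i)."""
    return sum(ui * log(pi) for ui, pi in zip(u, p)) - sum(u) * log(sum(p))


row_sums = [sum(p) for p in points]
values = [log_likelihood(p, u) for p in points]
for s, v in zip(row_sums, values):
    print(f"sum = {s:.4f}   log-likelihood = {v:.3f}")
```

Because the coordinates are rounded to four decimals, the recomputed values agree with the last column only to about two decimals.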
The first three points are local maxima in $\Delta_5$ and the last three points are local minima. These six points define an algebraic field extension of degree 6 over $\mathbb{Q}$. One might expect that the Galois group of these six points over $\mathbb{Q}$ is the full symmetric group $S_6$. If this were the case then the above coordinates could not be written in radicals. However, that expectation is wrong. The Galois group of the likelihood fibration $\mathrm{pr}_2 : \mathcal{L}_X \to \mathbb{P}^5_U$ given by the $3 \times 3$ symmetric problem is a subgroup of $S_6$ isomorphic to the solvable group $S_4$. To be concrete, for the data above, the minimal polynomial for the MLE $\hat{p}_{33}$ equals
\[
\begin{aligned}
&9528773052286944\,p_{33}^6 - 4125267629399052\,p_{33}^5 + 713452955656677\,p_{33}^4 \\
&\quad - 63349419858182\,p_{33}^3 + 3049564842009\,p_{33}^2 - 75369770028\,p_{33} + 744139872 = 0.
\end{aligned}
\]
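As a sanity check on this equation, the sketch below locates its real roots in the interval $(0, 0.2)$ by a grid scan followed by bisection. It is only an illustration under one assumption: the signs of the coefficients alternate, which is forced when all six roots are positive. The roots found this way can be compared with the $p_{33}$ column of the table of critical points.

```python
# Coefficients of the degree-6 polynomial for p33, highest degree first.
# Assumption: alternating signs, as required when all six roots are positive.
coeffs = [
    9528773052286944,
    -4125267629399052,
    713452955656677,
    -63349419858182,
    3049564842009,
    -75369770028,
    744139872,
]


def evaluate(x):
    """Horner evaluation of the polynomial at x."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value


def bisect(lo, hi, steps=60):
    """Refine a sign change of the polynomial on the interval [lo, hi]."""
    f_lo = evaluate(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = evaluate(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Scan (0, 0.2) on a grid of width 0.001 and refine every sign change.
grid = [k / 1000 for k in range(0, 201)]
roots = [bisect(a, b) for a, b in zip(grid, grid[1:])
         if (evaluate(a) < 0) != (evaluate(b) < 0)]
print([round(r, 4) for r in roots])
```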
We solve this equation in radicals as follows: p33 D
16427 1
2 !2 66004846384302 C 12 !22 C 227664 19221271018849 14779904193 14779904193 1 2 2 211433981207339 211433981207339 !1 !2 C 2 !3 ;
where is a primitive third root of unity, !12 D 94834811=3, and 5992589425361 5992589425361 97163 !23D
150972770845322208
2 C 40083040181952 !1 ; 150972770845322208 5006721709 212309132509 212309132509 2409 2 ! !32D1248260766912 C 4242035935404
2 4242035935404 20272573168 17063004159 !12!2 158808750548335 2 17063004159 2 76885084075396 !2 C 422867962414678 422867962414678 !1 !2 :

The explanation for the extra symmetry stems from the duality theorem below. It furnishes an involution on the set of six critical points that allows us to express them in radicals. ♦

The tables in Theorems 2.2 and 2.6 suggest that the columns will always be symmetric. This fact was conjectured in [HRS] and subsequently proved by Draisma and Rodriguez in [DR].

Theorem 2.8. Fix $m \le n$ and consider the determinantal varieties $V_i$ for either general or symmetric matrices. Then the ML degrees for rank $r$ and for rank $m - r + 1$ coincide.

In fact, the main result in [DR] establishes the following more precise statement. Given a data matrix $U$ of format $m \times n$, we write $\Omega_U$ for the $m \times n$-matrix whose $(i,j)$ entry equals
\[ \frac{u_{ij}\, u_{i+}\, u_{+j}}{(u_{++})^3}. \]

Theorem 2.9. Fix $m \le n$ and $U$ an $m \times n$-matrix with strictly positive integer entries. There exists a bijection between the complex critical points $P_1, P_2, \ldots, P_s$ of the likelihood function $\ell_U$ on $V_r$ and the complex critical points $Q_1, Q_2, \ldots, Q_s$ of $\ell_U$ on $V_{m-r+1}$ such that
\[ P_1 \star Q_1 = P_2 \star Q_2 = \cdots = P_s \star Q_s = \Omega_U. \]
Thus, this bijection preserves reality, positivity, and rationality.

The key to computing the ML degree tables, and to formulating the duality conjectures in [HRS], was the use of numerical algebraic geometry. The software Bertini allowed for the computation of thousands of instances in which the formula of Theorem 2.9 was confirmed. Bertini is numerical software, based on homotopy continuation, for finding all complex solutions to a system of polynomial equations (and much more). The software is available at [Bertini]. The developers, Daniel Bates, Jonathan
Hauenstein, Andrew Sommese, Charles Wampler, have just completed a new textbook [BHSW] on the mathematics behind Bertini.

For the past two decades, algebraic geometers have increasingly employed computational methods as a tool for their research. However, these computations have almost always been symbolic (and hence exact). They relied on Gröbner-based software such as Singular or Macaulay2. Algebraists often feel a certain discomfort when asked to trust a numerical computation. We encourage discussion about this issue, by raising the following question.

Example 2.10. In the rightmost column of Theorem 2.6, it is asserted that the solution to a certain enumerative geometry problem is 2341. Which of these would you trust most:
• the output of a symbolic computation?
• the output of a numerical computation?
• a proof written by an algebraic geometer?
In the authors' view, it always pays off to be critical and double-check all computations, regardless of how they were carried out. And, this applies to all three of the above. ♦

One of the big advantages of numerical algebraic geometry over Gröbner bases when it comes to MLE is the separation between Preprocessing and Solving. For any particular variety $X \subset \mathbb{P}^n$, such as $X = V_r$, we preprocess by solving the likelihood equations once, for a generic data set $U_0$ chosen by us. The coordinates of $U_0$ may be complex (rather than real) numbers. We can choose them with stable numerics in mind, so as to compute all critical points up to high accuracy. This step can take a long time, but the output is highly reliable. After solving the equations once, for that generic $U_0$, all subsequent computations for any other data set $U$ are very fast. In particular, the computation is fully parallelizable. If we have $m$ processors at our disposal, where $m = \mathrm{MLdegree}(X)$, then each processor can track one of the paths. To be precise, homotopy continuation starts from the critical points of $\ell_{U_0}$ and transforms them into the critical points of $\ell_U$.
Geometrically speaking, for fixed $X$, the homotopy amounts to walking on the sheets of the likelihood fibration $\mathrm{pr}_2 : \mathcal{L}_X \to \mathbb{P}^n_u$. To illustrate this point, here are the timings (in seconds) that were reported in [HRS] for the determinantal varieties $X = V_r$. Those computations were carried out in Bertini on a 64-bit Linux cluster with 160 processors. The first row is the preprocessing time for solving the equations once. The second row is the time needed to solve any subsequent instance:

  (m, n, r)        (4,4,2)   (4,4,3)   (4,5,2)   (4,5,3)   (5,5,2)   (5,5,4)
  Preprocessing        257       427      1938      2902    348555    146952
  Solving                4         4        20        20        83        83
This table suggests that combining numerical algebraic geometry with existing tools from computational statistics might lead to a viable tool for certifiably solving MLE problems.

We are now at the point where it is essential to offer a disclaimer. The low rank model $\mathcal{M}_r$ does not correctly represent the notion of conditional independence. The model we should have used instead is the mixture model $\mathrm{Mix}_r$. By definition, $\mathrm{Mix}_r$ is the set of probability distributions $P$ in $\Delta_{mn-1}$ that are convex combinations of $r$ independent distributions, each taken from $\mathcal{M}_1$. Equivalently, the mixture model $\mathrm{Mix}_r$ consists of all matrices
\[ P = A \cdot \Lambda \cdot B, \tag{21} \]
where $A$ is a nonnegative $m \times r$-matrix whose columns sum to 1, $\Lambda$ is a nonnegative $r \times r$ diagonal matrix whose entries sum to 1, and $B$ is a nonnegative $r \times n$-matrix whose rows sum to 1. The formula (21) expresses $\mathrm{Mix}_r$ as the image of a trilinear map between polytopes:
\[ \phi : (\Delta_{m-1})^r \times \Delta_{r-1} \times (\Delta_{n-1})^r \to \Delta_{mn-1}, \qquad (A, \Lambda, B) \mapsto P. \]
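To make the parametrization (21) concrete, here is a small sketch with exact rational arithmetic. The parameter values are hypothetical. Each of the $r$ rank-one summands $\lambda_k \, a_k b_k$ is a probability table when the $k$-th column $a_k$ of $A$ and the $k$-th row $b_k$ of $B$ are points of the respective simplices, and this is the normalization used below.

```python
from fractions import Fraction as F


def mixture(A, lam, B):
    """Return P = A * diag(lam) * B for an m x r matrix A and an r x n matrix B."""
    m, r, n = len(A), len(lam), len(B[0])
    return [[sum(A[i][k] * lam[k] * B[k][j] for k in range(r))
             for j in range(n)] for i in range(m)]


# Hypothetical parameters for m = 3, n = 2, r = 2.
A = [[F(1, 2), F(1, 6)],
     [F(1, 4), F(1, 3)],
     [F(1, 4), F(1, 2)]]      # each column is a point of the simplex
lam = [F(1, 3), F(2, 3)]      # mixing weights, summing to 1
B = [[F(3, 4), F(1, 4)],
     [F(1, 5), F(4, 5)]]      # each row is a point of the simplex

P = mixture(A, lam, B)
total = sum(sum(row) for row in P)
print(P, total)
```

With this normalization the entries of $P$ are nonnegative and sum to 1, so $P$ is a point of $\Delta_{mn-1}$.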
The following result is well-known; see e.g. [LiAS, Example 4.1.2].

Proposition 2.11. Our low rank model $\mathcal{M}_r$ is the Zariski closure of the mixture model $\mathrm{Mix}_r$ in the probability simplex $\Delta_{mn-1}$. If $r \le 2$ then $\mathrm{Mix}_r = \mathcal{M}_r$. If $r \ge 3$ then $\mathrm{Mix}_r \subsetneq \mathcal{M}_r$.

The point here is the distinction between the rank and the nonnegative rank of a nonnegative matrix. Matrices in $\mathcal{M}_r$ have rank $\le r$ and matrices in $\mathrm{Mix}_r$ have nonnegative rank $\le r$. Thus elements of $\mathcal{M}_r \setminus \mathrm{Mix}_r$ are matrices whose nonnegative rank exceeds their rank.

Example 2.12. The following $4 \times 4$-matrix has rank 3 but nonnegative rank 4:
\[
P = \frac{1}{8}
\begin{pmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 0 & 1
\end{pmatrix}.
\]
This is the slack matrix of a regular square. It is an element of $\mathcal{M}_3 \setminus \mathrm{Mix}_3$. ♦
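The rank claim in Example 2.12 is easy to confirm by exact Gaussian elimination. The sketch below does only that; it does not certify the nonnegative rank, which is a harder combinatorial statement. The helper `rank` is written for this illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over the rationals, computed by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


# The matrix 8 * P from Example 2.12 (slack matrix of a square).
S = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1],
     [1, 0, 0, 1]]
print(rank(S))
```

The rank drops because the alternating sum of the four rows vanishes.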
Engineers and scientists care more about Mixr than Mr . In many applications, nonnegative rank is more relevant than rank. The reason can be seen in (18). In such a low-rank decomposition, we do not want the female table or the male table to have a negative entry. This raises the following important questions: How to maximize the likelihood function `U over Mixr ? What are the algebraic degrees associated with that optimization problem?
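For the first question, the standard numerical tool is the expectation-maximization (EM) algorithm on the parameters $(A, \Lambda, B)$ of (21). The following is a minimal sketch under stated assumptions: the data table `U`, the random starting point, and the function names are hypothetical, and the code makes no attempt to detect convergence to the boundary of the parameter polytope.

```python
import random
from math import log


def model_matrix(lam, a, b):
    """P = sum_k lam[k] * a[k] b[k]^T, a mixture of r rank-one tables."""
    r, m, n = len(lam), len(a[0]), len(b[0])
    return [[sum(lam[k] * a[k][i] * b[k][j] for k in range(r))
             for j in range(n)] for i in range(m)]


def em_mixture(U, r, iterations=100, seed=0):
    """Run EM on the table U; return the final P and the log-likelihood trace."""
    m, n = len(U), len(U[0])
    N = sum(map(sum, U))
    rng = random.Random(seed)

    def random_distribution(size):
        w = [rng.uniform(0.5, 1.5) for _ in range(size)]
        return [x / sum(w) for x in w]

    lam = random_distribution(r)
    a = [random_distribution(m) for _ in range(r)]   # a[k]: distribution on rows
    b = [random_distribution(n) for _ in range(r)]   # b[k]: distribution on columns
    trace = []
    for _ in range(iterations):
        P = model_matrix(lam, a, b)
        trace.append(sum(U[i][j] * log(P[i][j])
                         for i in range(m) for j in range(n)))
        for k in range(r):
            # E-step: expected number of observations in cell (i, j)
            # attributed to mixture component k (uses the old P).
            W = [[U[i][j] * lam[k] * a[k][i] * b[k][j] / P[i][j]
                  for j in range(n)] for i in range(m)]
            # M-step: renormalize the expected counts.
            total = sum(map(sum, W))
            lam[k] = total / N
            a[k] = [sum(W[i]) / total for i in range(m)]
            b[k] = [sum(W[i][j] for i in range(m)) / total for j in range(n)]
    return model_matrix(lam, a, b), trace


# Hypothetical 4 x 4 data table.
U = [[4, 2, 2, 0], [2, 4, 0, 2], [2, 0, 4, 2], [0, 2, 2, 4]]
P_hat, trace = em_mixture(U, r=3)
print(trace[0], trace[-1])
```

The trace of log-likelihood values is nondecreasing, which is the usual monotonicity property of EM. The limit depends on the starting point and need not be the global maximum.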
Statisticians seek to maximize the likelihood function $\ell_U$ on $\mathrm{Mix}_r$ by using the expectation-maximization (EM) algorithm in the space $(\Delta_{m-1})^r \times \Delta_{r-1} \times (\Delta_{n-1})^r$ of parameters $(A, \Lambda, B)$. In each iteration, the EM algorithm strictly decreases the Kullback–Leibler divergence from the current model point $P = \phi(A, \Lambda, B)$ to the empirical distribution $\frac{1}{u_{++}} \cdot U$. The hope in running the EM algorithm for given data $U$ is that it converges to the global maximum $\hat{P}$ on $\mathrm{Mix}_r$. For a presentation of the EM algorithm for discrete algebraic models see [PS, §1.3]. A study of the geometry of this algorithm for the mixture model $\mathrm{Mix}_r$ is undertaken in [KRS].

If the EM algorithm converges to a point that lies in the interior of the parameter polytope, and is non-singular with respect to $\phi$, then that point will be among the critical points on $\mathcal{M}_r$. These are characterized by Proposition 2.4. However, since $\mathrm{Mix}_r$ is properly contained in $\mathcal{M}_r$, it frequently happens that the true MLE $\hat{P}$ lies on the boundary of $\mathrm{Mix}_r$. In that case, $\hat{P}$ is not a critical point of $\ell_U$ on $\mathcal{M}_r$, meaning that $(\hat{P}, U)$ is not in the likelihood correspondence on $V_r$. Such points will never be found by the method described above.

In order to address this issue, we need to identify the divisors in the variety $V_r \subset \mathbb{P}^{mn-1}$ that appear in the algebraic boundary of $\mathrm{Mix}_r$. By this we mean the irreducible components $W_1, W_2, \ldots, W_s$ of the Zariski closure of $\partial \mathrm{Mix}_r$. Each of these $W_i$ has codimension 1 in $V_r$. Once the $W_i$ are identified, one would need to examine their ML degree, and also the ML degree of the various strata $W_{i_1} \cap \cdots \cap W_{i_s}$ in which $\ell_U$ might attain its maximum. At present we do not have this information even in the smallest non-trivial case $m = n = 4$ and $r = 3$.

Example 2.13. We illustrate this issue by describing one of the components $W$ of the algebraic boundary for the mixture model $\mathrm{Mix}_3$ when $m = n = 4$. Consider the equation
\[
\begin{pmatrix}
p_{11} & p_{12} & p_{13} & p_{14} \\
p_{21} & p_{22} & p_{23} & p_{24} \\
p_{31} & p_{32} & p_{33} & p_{34} \\
p_{41} & p_{42} & p_{43} & p_{44}
\end{pmatrix}
=
\begin{pmatrix}
0 & a_{12} & a_{13} \\
0 & a_{22} & a_{23} \\
a_{31} & 0 & a_{33} \\
a_{41} & a_{42} & 0
\end{pmatrix}
\cdot
\begin{pmatrix}
0 & b_{12} & b_{13} & b_{14} \\
b_{21} & 0 & b_{23} & b_{24} \\
b_{31} & b_{32} & b_{33} & 0
\end{pmatrix}.
\]
This parametrizes a 13-dimensional subvariety $W$ of the hypersurface $V_3 = \{\det(P) = 0\}$ in $\mathbb{P}^{15}$. The variety $W$ is a component in the algebraic boundary of $\mathrm{Mix}_3$. To see this, we choose the $a_{ij}$ and $b_{ij}$ to be positive, and we note that $P$ lies outside $\mathrm{Mix}_3$ when precisely one of the 0 entries gets replaced by $-\epsilon$. The prime ideal of $W$ in $\mathbb{Q}[p_{11}, \ldots, p_{44}]$ is obtained by eliminating the 17 unknowns $a_{ij}$ and $b_{ij}$ from the 16 scalar equations. A direct computation with Macaulay2 shows that the variety $W$ is Cohen–Macaulay of codimension 2. By the Hilbert–Burch Theorem, it is defined by the $4 \times 4$-minors of a $4 \times 5$-matrix. The following specific matrix representation was suggested to us by Aldo Conca and Matteo Varbaro:
\[
\begin{pmatrix}
p_{11} & p_{12} & p_{13} & p_{14} & 0 \\
p_{21} & p_{22} & p_{23} & p_{24} & 0 \\
p_{31} & p_{32} & p_{33} & p_{34} & p_{34}(p_{11}p_{22} - p_{12}p_{21}) \\
p_{41} & p_{42} & p_{43} & p_{44} & p_{41}(p_{12}p_{24} - p_{14}p_{22}) + p_{44}(p_{11}p_{22} - p_{12}p_{21})
\end{pmatrix}.
\]
The algebraic boundary of $\mathrm{Mix}_3$ consists of precisely 304 irreducible components, namely the 16 coordinate hyperplanes and 288 hypersurfaces that are all isomorphic to $W$. This is proved in [KRS]. In that paper, it is also shown that the ML degree of $W$ equals 633. ♦

The definition of rank varieties and mixture models extends to $m$-dimensional tensors $P$ of arbitrary format $d_1 \times d_2 \times \cdots \times d_m$. We refer to Landsberg's book [Land] for an introduction to tensors and their rank. Now, $V_r$ is the variety of tensors of border rank $\le r$, the model $\mathcal{M}_r$ is the set of all probability distributions in $V_r$, and the model $\mathrm{Mix}_r$ is the subset of tensors of nonnegative rank $\le r$. Unlike in the matrix case $m = 2$, the mixture model for border rank $r = 2$ is already quite interesting when $m \ge 3$. We state two theorems that characterize our objects. The set-theoretic version of Theorem 2.14 is due to Landsberg and Manivel [LM]. The ideal-theoretic statement was proved more recently by Raicu [Rai].

Theorem 2.14. The variety $V_2$ is defined by the $3 \times 3$-minors of all flattenings of $P$.

Here, flattening means picking any subset $A$ of $[m] = \{1, 2, \ldots, m\}$ with $1 \le |A| \le m - 1$ and writing the tensor $P$ as an ordinary matrix with $\prod_{i \in A} d_i$ rows and $\prod_{j \notin A} d_j$ columns.

Theorem 2.15. The mixture model $\mathrm{Mix}_2$ is the subset of supermodular distributions in $\mathcal{M}_2$.

This theorem was proved in [ARSZ]. Being supermodular means that $P$ satisfies a natural family of quadratic binomial inequalities. We explain these for $m = 3$, $d_1 = d_2 = d_3 = 2$.

Example 2.16. We consider $2 \times 2 \times 2$ tensors. Since secant lines of the Segre variety $\mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$ fill all of $\mathbb{P}^7$, we have that $V_2 = \mathbb{P}^7$ and $\mathcal{M}_2 = \Delta_7$. The mixture model $\mathrm{Mix}_2$ is an interesting, full-dimensional, closed, semi-algebraic subset of $\Delta_7$. By definition, $\mathrm{Mix}_2$ is the image of a 2-to-1 map $\phi : (\Delta_1)^7 \to \Delta_7$ analogous to (21). The branch locus is the $2 \times 2 \times 2$-hyperdeterminant, which is a hypersurface in $\mathbb{P}^7$ of degree 4 and ML degree 13.

The analysis in [ARSZ, §2] represents the model $\mathrm{Mix}_2$ as the union of four toric cells. One of these toric cells is the set of tensors satisfying
\[
\begin{array}{lll}
p_{111}p_{222} \ge p_{112}p_{221} & p_{111}p_{222} \ge p_{121}p_{212} & p_{111}p_{222} \ge p_{211}p_{122} \\
p_{112}p_{222} \ge p_{122}p_{212} & p_{121}p_{222} \ge p_{122}p_{221} & p_{211}p_{222} \ge p_{212}p_{221} \\
p_{111}p_{122} \ge p_{112}p_{121} & p_{111}p_{212} \ge p_{112}p_{211} & p_{111}p_{221} \ge p_{121}p_{211}
\end{array}
\tag{22}
\]
A nonnegative $2 \times 2 \times 2$-tensor $P$ in $\Delta_7$ is supermodular if it satisfies these inequalities, possibly after label swapping $1 \leftrightarrow 2$. We visualize $\mathrm{Mix}_2$ by restricting to the three-dimensional subspace $H$ given by $p_{111} = p_{222}$, $p_{112} = p_{221}$, $p_{121} = p_{212}$ and $p_{211} = p_{122}$. The intersection $H \cap \Delta_7$ is a tetrahedron, and we consider $H \cap \mathrm{Mix}_2$ inside that tetrahedron. The restricted model $H \cap \mathrm{Mix}_2$ is shown on the left in Fig. 1. It consists of four toric cells as shown on the right side. The boundary
is given by three quadratic surfaces, shown in red, green and blue, and which are obtained from either the first or the second row in (22) by restriction to $H$.

Fig. 1 A three-dimensional slice of the seven-dimensional model of $2 \times 2 \times 2$ tensors of nonnegative rank $\le 2$. Each toric cell is bounded by 3 quadrics and contains a vertex of the tetrahedron

The boundary analysis suggested in Example 2.13 turns out to be quite simple in the present example. All boundary strata of the model $\mathrm{Mix}_2$ are varieties of ML degree 1. One such boundary stratum for $\mathrm{Mix}_2$ is the five-dimensional toric variety
\[ X = V(p_{112}p_{222} - p_{122}p_{212},\; p_{111}p_{122} - p_{112}p_{121},\; p_{111}p_{222} - p_{121}p_{212}) \subset \mathbb{P}^7. \]
As a preview for what is to come, we report its ML bidegree and its sectional ML degree:
\[
\begin{aligned}
B_X(p,u) &= p^7 + 2p^6u + 3p^5u^2 + 3p^4u^3 + 3p^3u^4 + 3p^2u^5, \\
S_X(p,u) &= p^7 + 14p^6u + 30p^5u^2 + 30p^4u^3 + 15p^3u^4 + 3p^2u^5.
\end{aligned}
\tag{23}
\]
In the next section, we shall study the class of toric varieties and the class of varieties having ML degree 1. Our variety $X$ lies in the intersection of these two important classes. ♦
3 Third Lecture

In our third lecture we start out with the likelihood geometry of embedded toric varieties. Fix a $(d+1) \times (n+1)$ integer matrix $A = (a_0, a_1, \ldots, a_n)$ of rank $d+1$ that has $(1, 1, \ldots, 1)$ as its last row. This matrix defines an effective action of the torus $(\mathbb{C}^*)^d$ on projective space $\mathbb{P}^n$:
\[ (\mathbb{C}^*)^d \times \mathbb{P}^n \to \mathbb{P}^n, \qquad t \times (p_0 : p_1 : \cdots : p_n) \mapsto (t^{\tilde{a}_0} p_0 : t^{\tilde{a}_1} p_1 : \cdots : t^{\tilde{a}_n} p_n). \]
Here $\tilde{a}_i$ is the column vector $a_i$ with the last entry 1 removed. We also fix
\[ c = (c_0, c_1, \ldots, c_n) \in (\mathbb{C}^*)^{n+1}, \]
viewed as a point in $\mathbb{P}^n$. Let $X_c$ be the closure in $\mathbb{P}^n$ of the orbit $(\mathbb{C}^*)^d \cdot c$. This is a projective toric variety of dimension $d$, defined by the pair $(A, c)$. The ideal that defines $X_c$ is the familiar toric ideal $I_A$ as in [LiAS, §1.3], but with $p = (p_0, \ldots, p_n)$ replaced by
\[ p/c = \left( \frac{p_0}{c_0}, \frac{p_1}{c_1}, \ldots, \frac{p_n}{c_n} \right). \tag{24} \]

Example 3.1. Fix $d = 2$ and $n = 3$. The matrix
\[
A = \begin{pmatrix} 0 & 3 & 0 & 1 \\ 0 & 0 & 3 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}
\]
specifies the following family of toric surfaces of degree three in $\mathbb{P}^3$:
\[ X_c = \{ (c_0 : c_1 x_1^3 : c_2 x_2^3 : c_3 x_1 x_2) : (x_1, x_2) \in (\mathbb{C}^*)^2 \} = V(c_3^3\, p_0 p_1 p_2 - c_0 c_1 c_2\, p_3^3). \]
Of course, the prime ideal of any particular surface $X_c$ is the principal ideal generated by
\[ \frac{p_0}{c_0} \cdot \frac{p_1}{c_1} \cdot \frac{p_2}{c_2} - \left( \frac{p_3}{c_3} \right)^3. \]
How does the ML degree of $X_c$ depend on the parameter $c = (c_0, c_1, c_2, c_3) \in (\mathbb{C}^*)^4$? ♦

We shall express the ML degree of the toric variety $X_c$ in terms of the complement of a hypersurface in the torus $(\mathbb{C}^*)^d$. The pair $(A, c)$ defines the sparse Laurent polynomial
\[ f(x) = c_0 x^{\tilde{a}_0} + c_1 x^{\tilde{a}_1} + \cdots + c_n x^{\tilde{a}_n}. \]

Theorem 3.2. The ML degree of the $d$-dimensional toric variety $X_c \subset \mathbb{P}^n$ is equal to $(-1)^d$ times the Euler characteristic of the very affine variety
\[ X_c \setminus \mathcal{H} \;\simeq\; \{ x \in (\mathbb{C}^*)^d : f(x) \neq 0 \}. \tag{25} \]
For generic c, the ML degree agrees with the degree of Xc , which is the normalized volume of the d -dimensional lattice polytope conv.A/ obtained as the convex hull of the columns of A.
Proof. We first argue that the identification (25) holds. The map
\[ x \mapsto p = (c_0 x^{\tilde{a}_0} : c_1 x^{\tilde{a}_1} : \cdots : c_n x^{\tilde{a}_n}) \]
defines an injective group homomorphism from $(\mathbb{C}^*)^d$ into the dense torus of $\mathbb{P}^n$. Its image is equal to the dense torus of $X_c$, so we have an isomorphism between $(\mathbb{C}^*)^d$ and the dense torus of $X_c$. Under this isomorphism, the affine open set $\{f \neq 0\}$ in $(\mathbb{C}^*)^d$ is identified with the affine open set $\{p_0 + \cdots + p_n \neq 0\}$ in the dense torus of $X_c$. The latter is precisely $X_c \setminus \mathcal{H}$. Since $(\mathbb{C}^*)^d$ is smooth, we see that $X_c \setminus \mathcal{H}$ is smooth, so our first assertion follows from Theorem 1.7. The second assertion is a consequence of the description of the likelihood correspondence $\mathcal{L}_{X_c}$ via linear sections of $X_c$ that is given in Proposition 3.5 below. □

Example 3.3. We return to the cubic surface $X_c$ in Example 3.1. For a general parameter vector $c$, the ML degree of $X_c$ is 3. For instance, the surface $V(p_0 p_1 p_2 - p_3^3) \subset \mathbb{P}^3$ has ML degree 3. However, the ML degree of $X_c$ drops to 2 whenever the plane curve defined by
\[ f(x_1, x_2) = c_0 + c_1 x_1^3 + c_2 x_2^3 + c_3 x_1 x_2 \]
has a singularity in $(\mathbb{C}^*)^2$. For instance, this happens for $c = (1 : 1 : 1 : -3)$, where $f$ is singular at $(x_1, x_2) = (1, 1)$. The corresponding surface $V(27 p_0 p_1 p_2 + p_3^3) \subset \mathbb{P}^3$ has ML degree 2. ♦

The isomorphism (25) has a nice interpretation in terms of Convex Optimization. Namely, it implies that maximum likelihood estimation for toric varieties is equivalent to global minimization of posynomials, and hence to the most fundamental case of Geometric Programming. We refer to [BoydVan, §4.5] for an introduction to posynomials and geometric programming.

We write $|\cdot|$ for the one-norm on $\mathbb{R}^{n+1}$, we set $b = Au$, and we assume that $c = (c_0, c_1, \ldots, c_n)$ is in $\mathbb{R}^{n+1}_{>0}$. Maximum likelihood estimation for toric models is the problem
\[ \text{Maximize } \frac{p^u}{|p|^{|u|}} \quad \text{subject to } p \in X_c \cap \Delta_n. \tag{26} \]
Setting $p_i = c_i \cdot x^{\tilde{a}_i}$ as above, this problem becomes equivalent to the geometric program
\[ \text{Minimize } \frac{f(x)^{|u|}}{x^b} \quad \text{subject to } x \in \mathbb{R}^d_{>0}. \tag{27} \]
By construction, $f(x)^{|u|} / x^b$ is a posynomial whose Newton polytope contains the origin. Such a posynomial attains a unique global minimum on the open orthant $\mathbb{R}^d_{>0}$. This can be seen by convexifying as in [BoydVan, §4.5.3]. This global minimum of (27) corresponds to the solution of (26), which exists and is unique by Birch's Theorem [PS, Theorem 1.10].
Example 3.4. Consider the geometric program for the surfaces in Example 3.1, with
\[
A = \begin{pmatrix} 0 & 3 & 0 & 1 \\ 0 & 0 & 3 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}
\quad \text{and} \quad u = (0, 0, 0, 1).
\]
The problem (27) is to find the global minimum, over all positive $x = (x_1, x_2)$, of the function
\[ \frac{f(x_1, x_2)}{x_1 x_2} = c_0 x_1^{-1} x_2^{-1} + c_1 x_1^{2} x_2^{-1} + c_2 x_1^{-1} x_2^{2} + c_3. \]
This is equivalent to maximizing $p_3 / p_+$ subject to $p \in V(c_3^3\, p_0 p_1 p_2 - c_0 c_1 c_2\, p_3^3) \cap \Delta_3$. ♦

We now describe the toric likelihood correspondence $\mathcal{L}_{X_c}$ in $\mathbb{P}^n \times \mathbb{P}^n$ associated with the pair $(A, c)$. This is the likelihood correspondence of the toric variety $X_c \subset \mathbb{P}^n$ defined above.

Proposition 3.5. On the open subset $(X_c \setminus \mathcal{H}) \times \mathbb{P}^n$, the toric likelihood correspondence $\mathcal{L}_{X_c}$ is defined by the $2 \times 2$-minors of the $2 \times (d+1)$-matrix
\[ \begin{pmatrix} p/c \cdot A^T \\ u/c \cdot A^T \end{pmatrix}. \tag{28} \]
Here the notation $p/c$ is as in (24). In particular, for any fixed data vector $u$, the critical points of $\ell_u$ are characterized by a linear system of equations in $p$ restricted to $X_c$.

Proof. This is an immediate consequence of Birch's Theorem [PS, Theorem 1.10]. □

Example 3.6. The Hardy–Weinberg curve of Example 1.3 is the subvariety $X_c = V(p_1^2 - 4p_0 p_2)$ in the projective plane $\mathbb{P}^2$. As a toric variety, this plane curve is given by
\[ A = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 1 & 0 \end{pmatrix} \quad \text{and} \quad c = (1, 2, 1). \]
The likelihood correspondence of $X_c$ is the surface in $\mathbb{P}^2 \times \mathbb{P}^2$ given by
\[
\det \begin{pmatrix} 2p_0 & p_1 \\ p_1 & 2p_2 \end{pmatrix}
= \det \begin{pmatrix} p_1 + 2p_2 & 2p_0 + p_1 \\ u_1 + 2u_2 & 2u_0 + u_1 \end{pmatrix} = 0.
\tag{29}
\]
Note that the second determinant equals the determinant of the $2 \times 2$-matrix (28) times 4. Saturating (29) with respect to $p_0 + p_1 + p_2$ reveals two further equations of degree $(1,1)$:
\[ 2(u_1 + 2u_2)\, p_0 = (2u_0 + u_1)\, p_1 \quad \text{and} \quad (u_1 + 2u_2)\, p_1 = 2(2u_0 + u_1)\, p_2. \]
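These two bilinear equations can be checked against a closed form for the critical point. The sketch below assumes the familiar Hardy–Weinberg estimate $\hat{p} = \big( (2u_0+u_1)^2, \; 2(2u_0+u_1)(u_1+2u_2), \; (u_1+2u_2)^2 \big) / (4u_+^2)$; the data vector is hypothetical.

```python
from fractions import Fraction as F


def hardy_weinberg_mle(u0, u1, u2):
    """Closed-form critical point on the curve p1^2 = 4 p0 p2 (assumed formula)."""
    n = u0 + u1 + u2
    s, t = 2 * u0 + u1, u1 + 2 * u2
    return (F(s * s, 4 * n * n), F(2 * s * t, 4 * n * n), F(t * t, 4 * n * n))


u0, u1, u2 = 5, 3, 2          # hypothetical data vector
p0, p1, p2 = hardy_weinberg_mle(u0, u1, u2)

on_curve = (p1 * p1 == 4 * p0 * p2)
eq1 = (2 * (u1 + 2 * u2) * p0 == (2 * u0 + u1) * p1)
eq2 = ((u1 + 2 * u2) * p1 == 2 * (2 * u0 + u1) * p2)
print(p0, p1, p2, on_curve, eq1, eq2)
```

Exact rational arithmetic is used so that the three identities are tested as equalities rather than up to rounding.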
For fixed $u$, these equations have a unique solution in $\mathbb{P}^2$, given by the formula in (3). ♦

Toric varieties are rational varieties that are parametrized by monomials. We now examine those varieties that are parametrized by generic polynomials. Understanding these is useful for statistics since many widely used models for discrete data are given in the form
\[ f : \Theta \to \Delta_n, \]
where $\Theta$ is a $d$-dimensional polytope and $f$ is a polynomial map. The coordinates $f_0, f_1, \ldots, f_n$ are polynomial functions in the parameters $\theta = (\theta_1, \ldots, \theta_d)$ satisfying $f_0 + f_1 + \cdots + f_n = 1$. Such models include the mixture models in Proposition 2.11, phylogenetic models, Bayesian networks, hidden Markov models, and many others arising in computational biology [PS].

The model specified by the polynomials $f_0, \ldots, f_n$ is the semialgebraic set $f(\Theta) \subset \Delta_n$. We study its Zariski closure $X = \overline{f(\Theta)}$ in $\mathbb{P}^n$. Finding its equations is hard and interesting.

Theorem 3.7. Let $f_0, f_1, \ldots, f_n$ be polynomials of degrees $b_0, b_1, \ldots, b_n$ satisfying $\sum f_i = 1$. The ML degree of the variety $X$ is at most the coefficient of $z^d$ in the generating function
\[ \frac{(1 - z)^d}{(1 - z b_0)(1 - z b_1) \cdots (1 - z b_n)}. \]
Equality holds when the coefficients of $f_0, f_1, \ldots, f_n$ are generic relative to $\sum f_i = 1$.

Proof. This is the content of [CHKS, Theorem 1]. □
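The bound in Theorem 3.7 is a single power series coefficient, so it is easy to evaluate. The sketch below expands the generating function by repeated division by the linear factors; the function names are ours. As an illustration it uses $d = 2$ parameters and four polynomials of degree 2.

```python
from math import comb


def series_coefficients(d, degrees, order):
    """First order+1 coefficients of (1 - z)^d / prod_i (1 - b_i z)."""
    coeffs = [(-1) ** k * comb(d, k) for k in range(order + 1)]
    for b in degrees:
        # Dividing by (1 - b z) is a running sum: new[k] = old[k] + b * new[k-1].
        for k in range(1, order + 1):
            coeffs[k] += b * coeffs[k - 1]
    return coeffs


def ml_degree_bound(d, degrees):
    """Coefficient of z^d in the generating function of Theorem 3.7."""
    return series_coefficients(d, degrees, d)[d]


print(series_coefficients(2, [2, 2, 2, 2], 3))
print(ml_degree_bound(2, [2, 2, 2, 2]))
```

For $d = 2$ and degrees $(2,2,2,2)$ the series starts $1 + 6z + 25z^2 + 88z^3$, so the bound is 25.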
Example 3.8. We examine the case of quartic surfaces in $\mathbb{P}^3$. Let $d = 2$, $n = 3$, pick random affine quadrics $f_1, f_2, f_3$ in two unknowns and set $f_0 = 1 - f_1 - f_2 - f_3$. This defines a map
\[ f : \mathbb{C}^2 \to \mathbb{C}^3 \subset \mathbb{P}^3. \]
The ML degree of the image surface $X = \overline{f(\mathbb{C}^2)}$ in $\mathbb{P}^3$ is equal to 25 since
\[ \frac{(1 - z)^2}{(1 - 2z)^4} = 1 + 6z + 25z^2 + 88z^3 + \cdots \]
The rational surface $X$ is a Steiner surface (or Roman surface). Its singular locus consists of three lines that meet in a point $P$. To understand the graph of $f$, we
observe that the linear span of $\{f_0, f_1, f_2, f_3\}$ in $\mathbb{C}[x, y]$ has a basis $\{1, L^2, M^2, N^2\}$ where $L, M, N$ represent lines in $\mathbb{C}^2$. Let $l$ denote the line through $M \cap N$ parallel to $L$, $m$ the line through $L \cap N$ parallel to $M$, and $n$ the line through $L \cap M$ parallel to $N$. The map $\mathbb{C}^2 \to X$ is a bijection outside these three lines, and it maps each line 2-to-1 onto one of the lines in $X_{\mathrm{sing}}$. The fiber over the special point $P$ on $X$ consists of three points, namely, $l \cap m$, $l \cap n$ and $m \cap n$.

If the quadric $f_0$ were also picked at random, rather than as $1 - f_1 - f_2 - f_3$, then we would still get a Steiner surface $X \subset \mathbb{P}^3$. However, now the ML degree of $X$ increases to 33.

On the other hand, if we take $X$ to be a general quartic surface in $\mathbb{P}^3$, so $X$ is a smooth K3 surface of Picard rank 1, then $X$ has ML degree 84. This is the formula in Example 1.11 evaluated at $n = 3$ and $d = 4$. Here $X \setminus \mathcal{H}$ is the generic quartic surface in $\mathbb{P}^3$ with five plane sections removed. The number 84 is the Euler characteristic of that open K3 surface.

In the first case, $X \setminus \mathcal{H}$ is singular, so we cannot apply Theorem 1.7 directly to our Steiner surface $X$ in $\mathbb{P}^3$. However, we can work in the parameter space and consider the smooth very affine surface $\mathbb{C}^2 \setminus V(f_0 f_1 f_2 f_3)$. The number 25 is the Euler characteristic of that surface.

It is instructive to verify Conjecture 1.8 for our three quartic surfaces in $\mathbb{P}^3$. We found
\[
\begin{aligned}
\chi(X \setminus \mathcal{H}) &= 38 > 25 = \mathrm{MLdegree}(X), \\
\chi(X \setminus \mathcal{H}) &= 49 > 33 = \mathrm{MLdegree}(X), \\
\chi(X \setminus \mathcal{H}) &= 84 = 84 = \mathrm{MLdegree}(X).
\end{aligned}
\]
The Euler characteristics of the three surfaces were computed using Aluffi's method [AluJSC]. ♦

We now turn to the following question: which projective varieties $X$ have ML degree one? This question is important for likelihood inference because a model having ML degree one means that the MLE $\hat{p}$ is a rational function in the data $u$. It is known that Bayesian networks and decomposable graphical models enjoy this property, and it is natural to wonder which other statistical models are in this class.
The answer to this question was given by the first author in [Huh2]. We shall here present the result of [Huh2] from a slightly different angle. Our point of departure is the notion of the A-discriminant, as introduced and studied by Gel'fand, Kapranov and Zelevinsky in [GKZ].

We fix an $r \times m$ integer matrix $A = (a_1, a_2, \ldots, a_m)$ of rank $r$ which has $(1, 1, \ldots, 1)$ in its row space. The Zariski closure of
\[ \{ (t^{a_1} : t^{a_2} : \cdots : t^{a_m}) \in \mathbb{P}^{m-1} : t \in (\mathbb{C}^*)^r \} \]
is an $(r-1)$-dimensional toric variety $Y_A$ in $\mathbb{P}^{m-1}$. We here intentionally changed the notation relative to that used for toric varieties at the beginning of this section. The reason is that $d$ and $n$ are always reserved for the dimension and embedding dimension of a statistical model.
The dual variety $Y_A^*$ is an irreducible variety in the dual projective space $(\mathbb{P}^{m-1})^\vee$ whose coordinates are $x = (x_1 : x_2 : \cdots : x_m)$. We identify points $x$ in $(\mathbb{P}^{m-1})^\vee$ with hypersurfaces
\[ \{ t \in (\mathbb{C}^*)^r : x_1 t^{a_1} + x_2 t^{a_2} + \cdots + x_m t^{a_m} = 0 \}. \tag{30} \]
The dual variety $Y_A^*$ is the Zariski closure in $(\mathbb{P}^{m-1})^\vee$ of the locus of all hypersurfaces (30) that are singular. Typically, $Y_A^*$ is a hypersurface. In that case, $Y_A^*$ is defined by a unique (up to sign) irreducible polynomial $\Delta_A \in \mathbb{Z}[x_1, x_2, \ldots, x_m]$. The homogeneous polynomial $\Delta_A$ is called the A-discriminant. Many classical discriminants and resultants are instances of $\Delta_A$. So are determinants and hyperdeterminants. This is the punch line of the book [GKZ].

Example 3.9. Let $m = 4$, $r = 2$, and $A = \begin{pmatrix} 3 & 2 & 1 & 0 \\ 0 & 1 & 2 & 3 \end{pmatrix}$. The associated toric variety is the twisted cubic curve
\[ Y_A = \overline{\{ (1 : t : t^2 : t^3) \mid t \in \mathbb{C} \}} \subset \mathbb{P}^3. \]
The variety $Y_A^*$ that is dual to the curve $Y_A$ is a surface in $(\mathbb{P}^3)^\vee$. The surface $Y_A^*$ parametrizes all planes that are tangent to the curve $Y_A$. These represent univariate cubics
\[ x_1 + x_2 t + x_3 t^2 + x_4 t^3 \]
that have a double root. Here the A-discriminant is the classical discriminant
\[ \Delta_A = 27 x_1^2 x_4^2 - 18 x_1 x_2 x_3 x_4 + 4 x_1 x_3^3 + 4 x_2^3 x_4 - x_2^2 x_3^2. \]
The surface $Y_A^*$ in $(\mathbb{P}^3)^\vee$ defined by this equation is the discriminant of the univariate cubic. ♦

Theorem 3.10. Let $X \subset \mathbb{P}^n$ be a projective variety of ML degree 1. Each coordinate $\hat{p}_i$ of the rational function $u \mapsto \hat{p}$ is an alternating product of linear forms in $u_0, u_1, \ldots, u_n$.

The paper [Huh2] gives an explicit construction of the map $u \mapsto \hat{p}$ as a Horn uniformization. A precursor was [Kapranov]. We explain this construction. The point of departure is a matrix $A$ as above. We now take $\Delta_A$ to be any non-zero homogeneous polynomial that vanishes on the dual variety $Y_A^*$ of the toric variety $Y_A$. If $Y_A^*$ is a hypersurface then $\Delta_A$ is the A-discriminant.

First, we write $\Delta_A$ as a Laurent polynomial by dividing it by one of its monomials:
\[ \frac{1}{\text{monomial}} \cdot \Delta_A = 1 - c_0 x^{b_0} - c_1 x^{b_1} - \cdots - c_n x^{b_n}. \tag{31} \]
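This expression defines an $m \times (n+1)$ integer matrix $B = (b_0, \ldots, b_n)$ satisfying $AB = 0$. Second, we define $X$ to be the rational subvariety of $\mathbb{P}^n$ that is given parametrically by
\[ \frac{p_i}{p_0 + p_1 + \cdots + p_n} = c_i \cdot x^{b_i} \quad \text{for } i = 0, 1, \ldots, n. \tag{32} \]
The defining ideal of $X$ is obtained by eliminating $x_1, \ldots, x_m$ from the equations above. Then $X$ has ML degree 1, and, by Huh [Huh2], every variety of ML degree 1 arises in this manner.

Example 3.11. The following curve in $\mathbb{P}^3$ happens to be a variety of ML degree 1:
\[ X = V\big( 9 p_1 p_2 - 8 p_0 p_3, \; p_0^2 - 12 (p_0 + p_1 + p_2 + p_3)\, p_3 \big). \]
This curve comes from the discriminant of the univariate cubic in Example 3.9:
\[
\frac{1}{\text{monomial}} \cdot \Delta_A
= 1 - \left( \frac{2}{3} \frac{x_2 x_3}{x_1 x_4} \right)
- \left( -\frac{4}{27} \frac{x_2^3}{x_1^2 x_4} \right)
- \left( -\frac{4}{27} \frac{x_3^3}{x_1 x_4^2} \right)
- \left( \frac{1}{27} \frac{x_2^2 x_3^2}{x_1^2 x_4^2} \right).
\]
We derived the curve $X$ from the four parenthesized monomials via the formula (32). The maximum likelihood estimate for this model is given by the products of linear forms
\[
\hat{p}_0 = \frac{2}{3} \frac{x_2 x_3}{x_1 x_4}, \qquad
\hat{p}_1 = -\frac{4}{27} \frac{x_2^3}{x_1^2 x_4}, \qquad
\hat{p}_2 = -\frac{4}{27} \frac{x_3^3}{x_1 x_4^2}, \qquad
\hat{p}_3 = \frac{1}{27} \frac{x_2^2 x_3^2}{x_1^2 x_4^2},
\]
where
\[
\begin{aligned}
x_1 &= -u_0 - 2u_1 - u_2 - 2u_3, & x_2 &= u_0 + 3u_1 + 2u_3, \\
x_3 &= u_0 + 3u_2 + 2u_3, & x_4 &= -u_0 - u_1 - 2u_2 - 2u_3.
\end{aligned}
\]
Each $x_j$ is the linear form $\sum_i b_{ij} u_i$ read off from the exponent vectors $b_0 = (-1,1,1,-1)$, $b_1 = (-2,3,0,-1)$, $b_2 = (-1,0,3,-2)$, $b_3 = (-2,2,2,-2)$ of the four monomials. These expressions are the alternating products of linear forms promised in Theorem 3.10. ♦

We now give the formula for $\hat{p}_i$ in general. This is the Horn uniformization of [GKZ, §9.3].

Corollary 3.12. Let $X \subset \mathbb{P}^n$ be the variety of ML degree 1 with parametrization (32) derived from a scaled A-discriminant (31). The coordinates of the MLE function $u \mapsto \hat{p}$ are
\[ \hat{p}_k = c_k \cdot \prod_{j=1}^{m} \Big( \sum_{i=0}^{n} b_{ij} u_i \Big)^{b_{kj}}. \]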
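The formula of Corollary 3.12 is short enough to evaluate exactly. The sketch below implements it for the cubic discriminant of Example 3.9. The exponent vectors $b_i$ and coefficients $c_i$ are read off from the four monomials obtained by dividing the discriminant by $27 x_1^2 x_4^2$; this reading, the function name `horn_mle`, and the data vectors are assumptions of the sketch.

```python
from fractions import Fraction as F


def horn_mle(B, c, u):
    """p_k = c_k * prod_j (sum_i B[i][j] * u_i) ** B[k][j]  (Corollary 3.12).

    B[i] is the exponent vector b_i, so B is stored as a list of columns.
    """
    m = len(B[0])
    x = [sum(B[i][j] * u[i] for i in range(len(u))) for j in range(m)]
    p = []
    for k in range(len(B)):
        value = F(c[k])
        for j in range(m):
            value *= F(x[j]) ** B[k][j]
        p.append(value)
    return p


# Exponent vectors and coefficients for the discriminant of the cubic.
B = [(-1, 1, 1, -1), (-2, 3, 0, -1), (-1, 0, 3, -2), (-2, 2, 2, -2)]
c = [F(2, 3), F(-4, 27), F(-4, 27), F(1, 27)]

p = horn_mle(B, c, (1, 2, 3, 4))      # hypothetical data vector
print(p, sum(p))
```

For this data the coordinates sum to 1 and satisfy both equations of the curve $X$ in Example 3.11. As a further plausibility check, data concentrated on a single state, such as $u = (0, 1, 0, 0)$, returns the corresponding vertex of the simplex.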
It is not obvious (but true) that $\hat{p}_0 + \hat{p}_1 + \cdots + \hat{p}_n = 1$ holds in the formula above. In light of its monomial parametrization, our variety $X$ is toric in $\mathbb{P}^n \setminus \mathcal{H}$. In general, it is not toric in $\mathbb{P}^n$, due to appearances of the factor $(p_0 + p_1 + \cdots + p_n)$ in equations for $X$. Interestingly, there are numerous instances when this factor does not appear and $X$ is toric also in $\mathbb{P}^n$.

One toric instance is the independence model $X = V(p_{00} p_{11} - p_{01} p_{10})$, whose MLE was derived in Example 1.14. What is the matrix $A$ in this case? We shall answer this question for a slightly larger example, which serves as an illustration for decomposable graphical models.

Example 3.13. Consider the conditional independence model for three binary variables given by the graph •–•–•. We claim that this graphical model is derived from

          a00 a10 a01 a11 b00 b01 b10 b11 c0  c1  d
           1   1   1   1   1   1   1   1   1   1   1
  A =  x   1   1   0   0   0   0   0   0   1   0   0
       y   0   0   1   1   0   0   0   0   0   1   0
       z   0   0   0   0   1   1   0   0   1   0   0
       w   0   0   0   0   0   0   1   1   0   1   0

The discriminant of the corresponding family of hypersurfaces
\[ \{ (x, y, z, w) \in (\mathbb{C}^*)^4 \mid (a_{00} + a_{10})x + (a_{01} + a_{11})y + (b_{00} + b_{01})z + (b_{10} + b_{11})w + c_0 xz + c_1 yw + d = 0 \} \]
equals
\[
\Delta_A = c_0 c_1 d - a_{01} b_{10} c_0 - a_{11} b_{10} c_0 - a_{01} b_{11} c_0 - a_{11} b_{11} c_0 - a_{00} b_{00} c_1 - a_{10} b_{00} c_1 - a_{00} b_{01} c_1 - a_{10} b_{01} c_1.
\]
We divide this A-discriminant by its first term $c_0 c_1 d$ to rewrite it in the form (31) with $n = 7$. The parametrization of $X \subset \mathbb{P}^7$ given by (32) can be expressed as
\[ p_{ijk} = \frac{a_{ij}\, b_{jk}}{c_j\, d} \quad \text{for } i, j, k \in \{0, 1\}. \tag{33} \]
This is indeed the desired graphical model •–•–• with implicit representation
\[ X = V\big( p_{000} p_{101} - p_{001} p_{100}, \; p_{010} p_{111} - p_{011} p_{110} \big) \subset \mathbb{P}^7. \]
The linear forms used in the Horn uniformization of Corollary 3.12 are
\[ a_{ij} = u_{ij+}, \qquad b_{jk} = u_{+jk}, \qquad c_j = u_{+j+}, \qquad d = u_{+++}. \]
Substituting these expressions into (33), we obtain
\[ \hat{p}_{ijk} = \frac{u_{ij+} \cdot u_{+jk}}{u_{+j+} \cdot u_{+++}} \quad \text{for } i, j, k \in \{0, 1\}. \]
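This closed form is straightforward to test on data. The sketch below computes $\hat{p}$ for a hypothetical $2 \times 2 \times 2$ table of counts and checks that the result is a probability distribution lying on the variety $X$, that is, that the first and third variables are independent given the second. The function name `chain_mle` is ours.

```python
from fractions import Fraction as F
from itertools import product


def chain_mle(u):
    """p_ijk = u_ij+ * u_+jk / (u_+j+ * u_+++) for a 2x2x2 table u[i][j][k]."""
    idx = (0, 1)
    total = sum(u[i][j][k] for i, j, k in product(idx, repeat=3))
    p = {}
    for i, j, k in product(idx, repeat=3):
        u_ij = sum(u[i][j][c] for c in idx)                 # u_{ij+}
        u_jk = sum(u[a][j][k] for a in idx)                 # u_{+jk}
        u_j = sum(u[a][j][c] for a in idx for c in idx)     # u_{+j+}
        p[i, j, k] = F(u_ij * u_jk, u_j * total)
    return p


u = [[[3, 1], [4, 1]], [[5, 9], [2, 6]]]   # hypothetical counts u[i][j][k]
p = chain_mle(u)
print(sum(p.values()))
```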
This is the formula in Lauritzen's book [Lau] for MLE of decomposable graphical models. ♦

We now return to the likelihood geometry of an arbitrary $d$-dimensional projective variety $X$ in $\mathbb{P}^n$, as always defined over $\mathbb{R}$ and not contained in $\mathcal{H}$. We define the ML bidegree of $X$ to be the bidegree of its likelihood correspondence $\mathcal{L}_X \subset \mathbb{P}^n \times \mathbb{P}^n$. This is a binary form
\[ B_X(p, u) = (b_0 p^d + b_1 p^{d-1} u + \cdots + b_d u^d) \cdot p^{n-d}, \]
where $b_0, b_1, \ldots, b_d$ are certain positive integers. By definition, $B_X(p, u)$ is the multidegree [MS, §8.5] of the prime ideal of $\mathcal{L}_X$, with respect to the natural $\mathbb{Z}^2$-grading on the polynomial ring $\mathbb{R}[p, u] = \mathbb{R}[p_0, \ldots, p_n, u_0, \ldots, u_n]$. Equivalently, the ML bidegree $B_X(p, u)$ is the class defined by $\mathcal{L}_X$ in the cohomology ring
\[ H^*(\mathbb{P}^n \times \mathbb{P}^n; \mathbb{Z}) = \mathbb{Z}[p, u] / \langle p^{n+1}, u^{n+1} \rangle. \]
We already saw some examples, for the Grassmannian $G(2,4)$ in (12), for arbitrary linear spaces in (14), and for a toric model of ML degree 1 in (23). We note that the bidegree $B_X(p, u)$ can be computed conveniently using the command multidegree in Macaulay2.

To understand the geometric meaning of the ML bidegree, we introduce a second polynomial. Let $L_{n-i}$ be a sufficiently general linear subspace of $\mathbb{P}^n$ of codimension $i$, and define
\[ s_i = \mathrm{MLdegree}(X \cap L_{n-i}). \]
We define the sectional ML degree of $X$ to be the polynomial
\[ S_X(p, u) = (s_0 p^d + s_1 p^{d-1} u + \cdots + s_d u^d) \cdot p^{n-d}. \]

Example 3.14. The sectional ML degree of the Grassmannian $G(2,4)$ in (10) equals
\[ S_X(p, u) = 4p^5 + 20p^4 u + 24p^3 u^2 + 12p^2 u^3 + 2p u^4. \]
Thus, if $H_1, H_2, H_3$ denote generic hyperplanes in $\mathbb{P}^5$, then the threefold $G(2,4) \cap H_1$ has ML degree 20, the surface $G(2,4) \cap H_1 \cap H_2$ has ML degree 24, and the curve $G(2,4) \cap H_1 \cap H_2 \cap H_3$ has ML degree 12. Lastly, the coefficient 2 of $pu^4$ is simply the degree of $G(2,4)$ in $\mathbb{P}^5$. ♦
Conjecture 3.15. The ML bidegree and the sectional ML degree of any projective variety $X \subset \mathbb{P}^n$, not lying in $\mathcal{H}$, are related by the following involution on binary forms of degree $n$:
$$B_X(p,u) \,=\, \frac{u\cdot S_X(p,\,u-p) \,-\, p\cdot S_X(p,0)}{u-p}, \qquad S_X(p,u) \,=\, \frac{u\cdot B_X(p,\,u+p) \,+\, p\cdot B_X(p,0)}{u+p}.$$
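The involution in Conjecture 3.15 is easy to experiment with on coefficient vectors. The following Python sketch is an illustration under our own conventions, not code from the authors: a binary form $\sum_i c_i\, p^{n-i}u^i$ is stored as the list `[c_0, ..., c_n]`, and the names `involution`, `sectional_from_bidegree` and `bidegree_from_sectional` are hypothetical. As input it uses the sectional ML degree of $G(2,4)$ reported in Example 3.14.

```python
from math import comb


def involution(coeffs, sign):
    """Apply F(p,u) -> (u*F(p, u + sign*p) + sign*p*F(p, 0)) / (u + sign*p).

    coeffs[i] is the coefficient of p^(n-i) * u^i; sign is +1 or -1.
    """
    n = len(coeffs) - 1
    # Coefficients of F(p, u + sign*p), expanded with the binomial theorem.
    shifted = [
        sum(coeffs[i] * comb(i, k) * sign ** (i - k) for i in range(k, n + 1))
        for k in range(n + 1)
    ]
    # Numerator of degree n+1: multiplying by u shifts the index by one,
    # and sign*p*F(p,0) contributes only to the pure power of p.
    numerator = [sign * coeffs[0]] + shifted
    # Exact division by (u + sign*p), one coefficient at a time.
    quotient, previous = [], 0
    for k in range(n + 1):
        previous = sign * (numerator[k] - previous)
        quotient.append(previous)
    if numerator[n + 1] != quotient[n]:
        raise ValueError("numerator is not divisible by u + sign*p")
    return quotient


def sectional_from_bidegree(b):
    """Second formula of Conjecture 3.15: B_X -> S_X."""
    return involution(b, +1)


def bidegree_from_sectional(s):
    """First formula of Conjecture 3.15: S_X -> B_X."""
    return involution(s, -1)


# Sectional ML degree of G(2,4) from Example 3.14, as a form of degree 5.
s_grassmannian = [4, 20, 24, 12, 2, 0]
print(bidegree_from_sectional(s_grassmannian))  # -> [4, 6, 6, 6, 2, 0]
```

If the conjecture holds for $G(2,4)$, the printed list is the coefficient vector of its ML bidegree. Applying `sectional_from_bidegree` to the result returns the input, which reflects that the two formulas are inverse to each other.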
This conjecture is a theorem when $X\backslash\mathcal{H}$ is smooth and its boundary is schön. See Theorem 4.6 below. In that case, the ML bidegree is identified, by Huh [Huh1, Theorem 2], with the Chern–Schwartz–MacPherson (CSM) class of the constructible function on $\mathbb{P}^n$ that is 1 on $X\backslash\mathcal{H}$ and 0 elsewhere. Aluffi proved in [Alu, Theorem 1.1] that the CSM class of a locally closed subset of $\mathbb{P}^n$ satisfies such a log-adjunction formula. Our formula in Conjecture 3.15 is precisely the homogenization of Aluffi's involution. The combination of [Alu, Theorem 1.1] and [Huh1, Theorem 2] proves Conjecture 3.15 in cases such as generic complete intersections (Theorem 1.10) and arbitrary linear spaces (Theorem 1.20). In the latter case, it can also be verified using matroid theory. Conjecture 3.15 says that this holds for any $X$, indicating a deeper connection between likelihood correspondences and CSM classes.

We note that $B_X(p,u)$ and $S_X(p,u)$ always share the same leading term and the same trailing term, and this is compatible with our formulas. Both polynomials start and end like
$$\operatorname{MLdegree}(X)\cdot p^n \,+\, \cdots \,+\, \operatorname{degree}(X)\cdot p^{\operatorname{codim}(X)}u^{\dim(X)}.$$
We now illustrate Conjecture 3.15 by verifying it computationally for a few more examples.

Example 3.16. Let us examine some cubic fourfolds in $\mathbb{P}^5$. If $X$ is a generic hypersurface of degree 4 in $\mathbb{P}^5$ then its sectional ML degree and ML bidegree satisfy the conjectured formula:
$$S_X(p,u) \,=\, 1364p^5 + 448p^4u + 136p^3u^2 + 32p^2u^3 + 3pu^4,$$
$$B_X(p,u) \,=\, 1364p^5 + 341p^4u + 81p^3u^2 + 23p^2u^3 + 3pu^4.$$
Of course, in algebraic statistics, we are more interested in special hypersurfaces that are statistically meaningful. One such instance was seen in Example 2.7. The mixture model for two identically distributed ternary random variables is the fourfold $X \subset \mathbb{P}^5$ defined by
$$\det\begin{pmatrix} 2p_{11} & p_{12} & p_{13} \\ p_{12} & 2p_{22} & p_{23} \\ p_{13} & p_{23} & 2p_{33} \end{pmatrix} \,=\, 0. \qquad (34)$$
The sectional ML degree and the ML bidegree of this determinantal fourfold are
$$S_X(p,u) \,=\, 6p^5 + 42p^4u + 48p^3u^2 + 21p^2u^3 + 3pu^4,$$
$$B_X(p,u) \,=\, 6p^5 + 12p^4u + 15p^3u^2 + 12p^2u^3 + 3pu^4.$$
For the toric fourfold $X = V(p_{11}p_{22}p_{33} - p_{12}p_{13}p_{23})$, ML bidegree and sectional ML degree are
$$B_X(p,u) \,=\, 3p^5 + 3p^4u + 3p^3u^2 + 3p^2u^3 + 3pu^4,$$
$$S_X(p,u) \,=\, 3p^5 + 12p^4u + 18p^3u^2 + 12p^2u^3 + 3pu^4.$$
Now, taking $X = V(p_{11}p_{22}p_{33} + p_{12}p_{13}p_{23})$ instead, the leading coefficient 3 changes to 2. ♦

Remark 3.17. Conjecture 3.15 is true when $X_c$ is a toric variety with $c$ generic, as in Theorem 3.2. Here we can use Proposition 3.5 to infer that all coefficients of $B_X$ are equal to the normalized volume of the lattice polytope $\operatorname{conv}(A)$. In symbols, for generic $c$, we have
$$B_{X_c}(p,u) \,=\, \operatorname{degree}(X_c)\cdot\sum_{i=0}^{d} p^{n-i}u^i.$$
It is now an exercise to transform this into a formula for the sectional ML degree $S_{X_c}(p,u)$.

In general, it is hard to compute generators for the ideal of the likelihood correspondence.

Example 3.18. The following submodel of (34) was featured prominently in [HKS, §1]:
$$\det\begin{pmatrix} 12p_0 & 3p_1 & 2p_2 \\ 3p_1 & 2p_2 & 3p_3 \\ 2p_2 & 3p_3 & 12p_4 \end{pmatrix} \,=\, 0. \qquad (35)$$
This cubic threefold $X$ is the secant variety of a rational normal curve in $\mathbb{P}^4$, and it represents the mixture model for a binomial random variable (tossing a biased coin four times). It takes several hours in Macaulay2 to compute the prime ideal of the likelihood correspondence $\mathcal{L}_X \subset \mathbb{P}^4 \times \mathbb{P}^4$. That ideal has 20 minimal generators: one in degree $(1,1)$, one in degree $(3,0)$, five in degree $(3,1)$, ten in degree $(4,1)$ and three in degree $(3,2)$. After passing to a Gröbner basis, we use the formula in [ch3:MS, Definition 8.45] to compute the bidegree of $\mathcal{L}_X$:
$$B_X(p,u) \,=\, 12p^4 + 15p^3u + 12p^2u^2 + 3pu^3.$$
We now intersect $X$ with random hyperplanes in $\mathbb{P}^4$, and we compute the ML degrees of the intersections. Repeating this experiment many times reveals the sectional ML degree of $X$:
$$S_X(p,u) \,=\, 12p^4 + 30p^3u + 18p^2u^2 + 3pu^3.$$
The two polynomials satisfy our transformation rule, thus confirming Conjecture 3.15. We note that Conjecture 1.8 also holds for this example: using Aluffi's method [AluJSC], we find $\chi(X\backslash\mathcal{H}) = -13$. ♦

Our last topic is the operation of restriction and deletion. This is a standard tool for complements of hyperplane arrangements, as in Theorem 1.20. It was developed in [Huh1] for arbitrary very affine varieties, such as $X\backslash\mathcal{H}$. We motivate this by explaining the distinction between structural zeros and sampling zeros for contingency tables in statistics [BFH, §5.1.1].

Returning to the "hair loss due to TV soccer" example from the beginning of Sect. 2, let us consider the following questions. What is the difference between the data set
$$U \,=\, \begin{array}{c|ccc} & \text{lots of hair} & \text{medium hair} & \text{little hair} \\ \hline \le 2\ \text{h} & 15 & 0 & 9 \\ 2\text{–}6\ \text{h} & 20 & 24 & 12 \\ \ge 6\ \text{h} & 10 & 12 & 6 \end{array}$$
and the data set
$$\tilde U \,=\, \begin{array}{c|ccc} & \text{lots of hair} & \text{medium hair} & \text{little hair} \\ \hline \le 2\ \text{h} & 10 & 0 & 5 \\ 2\text{–}6\ \text{h} & 9 & 3 & 6 \\ \ge 6\ \text{h} & 7 & 9 & 8 \end{array}\ ?$$
How should we think about the zero entries in row 1 and column 2 of these two contingency tables? Would the rank 1 model $\mathcal{M}_1$ or the rank 2 model $\mathcal{M}_2$ be more appropriate? The first matrix $U$ has rank 2 and it can be completed to a rank 1 matrix by replacing the zero entry with 18. Thus, the model $\mathcal{M}_1$ fits perfectly except for the structural zero in row 1 and column 2. It seems that this zero is inherent in the structure of the problem: planet Earth simply has no people with medium hair length who rarely watch soccer on TV.

The second matrix $\tilde U$ also has rank 2, but it cannot be completed to rank 1. The model $\mathcal{M}_2$ is a perfect fit. The zero entry in $\tilde U$ appears to be an artifact of the particular group that was interviewed in this study. This is a sampling zero. It arose because, by chance, in this cohort nobody happened to have medium hair length and watch soccer on TV rarely. We refer to the book of Bishop et al. [BFH, Chap. 5] for an introduction.
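The rank statements about the two tables can be checked with exact arithmetic. The Python sketch below is only an illustration: the helper names `rank` and `rank_one_completion` are hypothetical, and the entries are the counts from the two tables above, with the zero in row 1 and column 2 (index `(0, 1)` in the code).

```python
from fractions import Fraction


def rank(rows):
    """Rank of a rational matrix by Gaussian elimination over Fraction."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def rank_one_completion(rows, i, j):
    """Return x such that replacing rows[i][j] by x gives rank 1, else None."""
    # Every 2x2 minor through (i, j) forces x = rows[i][l]*rows[k][j]/rows[k][l].
    candidates = {
        Fraction(rows[i][l] * rows[k][j], rows[k][l])
        for k in range(len(rows))
        for l in range(len(rows[0]))
        if k != i and l != j and rows[k][l] != 0
    }
    for x in candidates:
        trial = [list(row) for row in rows]
        trial[i][j] = x
        if rank(trial) == 1:
            return x
    return None


U = [[15, 0, 9], [20, 24, 12], [10, 12, 6]]
U_tilde = [[10, 0, 5], [9, 3, 6], [7, 9, 8]]

print(rank(U), rank_one_completion(U, 0, 1))              # -> 2 18
print(rank(U_tilde), rank_one_completion(U_tilde, 0, 1))  # -> 2 None
```

The first table admits the completion 18 described in the text, while no value in the zero position brings the second table down to rank 1, because its second and third rows are not proportional.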
We now consider an arbitrary projective variety $X \subseteq \mathbb{P}^n$, serving as our statistical model. Suppose that structural zeros or sampling zeros occur in the last coordinate $u_n$. Following [Rapallo, Theorem 4], we model structural zeros by the projection $\pi_n(X)$. This model is the variety in $\mathbb{P}^{n-1}$ that is the closure of the image of $X$ under the rational map
$$\pi_n : \mathbb{P}^n \dashrightarrow \mathbb{P}^{n-1}, \qquad (p_0 : p_1 : \cdots : p_{n-1} : p_n) \mapsto (p_0 : p_1 : \cdots : p_{n-1}).$$
Which projective variety is a good representation for sampling zeros? We propose that sampling zeros be modeled by the intersection $X \cap \{p_n = 0\}$. This is now to be regarded as a subvariety in $\mathbb{P}^{n-1}$.

In this manner, both structural zeros and sampling zeros are modeled by closed subvarieties of $\mathbb{P}^{n-1}$. Inside that ambient $\mathbb{P}^{n-1}$, our standard arrangement $\mathcal{H}$ consists of $n+1$ hyperplanes. Usually, none of these hyperplanes contains $X \cap \{p_n = 0\}$ or $\pi_n(X)$.

It would be desirable to express the (sectional) ML degree of $X$ in terms of those of the intersection $X \cap \{p_n = 0\}$ and the projection $\pi_n(X)$. As an alternative to the ML degree of the projection $\pi_n(X)$ into $\mathbb{P}^{n-1}$, here is a quantity in $\mathbb{P}^n$ that reflects the presence of structural zeros even more accurately. We denote by $\operatorname{MLdegree}(X|_{u_n=0})$ the number of critical points $\hat p = (\hat p_0 : \hat p_1 : \cdots : \hat p_{n-1} : \hat p_n)$ of $\ell_u$ in $X_{\rm reg}\backslash\mathcal{H}$ for those data vectors $u = (u_0, u_1, \ldots, u_{n-1}, 0)$ whose first $n$ coordinates $u_i$ are positive and generic.

Conjecture 3.19. The maximum likelihood degree satisfies the inductive formula
$$\operatorname{MLdegree}(X) \,=\, \operatorname{MLdegree}(X \cap \{p_n = 0\}) \,+\, \operatorname{MLdegree}(X|_{u_n=0}), \qquad (36)$$
provided $X$ and $X \cap \{p_n = 0\}$ are reduced, irreducible, and not contained in their respective $\mathcal{H}$.

We expect that an analogous formula will hold for the sectional ML degree $S_X(p,u)$. The intuition behind equation (36) is as follows. As the data vector $u$ moves from a general point in $\mathbb{P}^n_u$ to a general point on the hyperplane $\{u_n = 0\}$, the corresponding fiber $\mathrm{pr}_2^{-1}(u)$ of the likelihood fibration splits into two clusters. One cluster has size $\operatorname{MLdegree}(X|_{u_n=0})$ and stays away from $\mathcal{H}$. The other cluster moves onto the hyperplane $\{p_n = 0\}$ in $\mathbb{P}^n_p$, where it approaches the various critical points of $\ell_u$ in that intersection. This degeneration is the perfect scenario for a numerical homotopy, e.g. in Bertini, as discussed in Sect. 2. These homotopies are currently being studied for determinantal varieties by Elizabeth Gross and Jose Rodriguez [GR]. The formula (36) has been verified computationally for many examples. Also, Conjecture 3.19 is known to be true in the slightly different setting of [Huh1], under a certain smoothness assumption. This is the content of [Huh1, Corollary 3.2].
Example 3.20. Fix the space $\mathbb{P}^8$ of $3 \times 3$-matrices as in Sect. 2. For the rank 2 variety $X = V_2$, the formula (36) reads $10 = 5 + 5$. For the rank 1 variety $X = V_1$, it reads $1 = 0 + 1$. ♦

Example 3.21. If $X$ is a generic $(d,e)$-curve in $\mathbb{P}^3$, then
$$\operatorname{MLdegree}(X) \,=\, d^2e + de^2 + de \qquad \text{and} \qquad X \cap \{p_3 = 0\} \,=\, (d\cdot e \text{ distinct points}).$$
Computations suggest that
$$\operatorname{MLdegree}(X|_{u_3=0}) \,=\, d^2e + de^2 \qquad \text{and} \qquad \operatorname{MLdegree}(\pi_3(X)) \,=\, d^2e + de^2.$$
To derive the second equality geometrically, one may argue as follows. Both curves $X \subset \mathbb{P}^3$ and $\pi_3(X) \subset \mathbb{P}^2$ have degree $de$ and genus $\frac{1}{2}(d^2e + de^2) - 2de + 1$. Subtracting this from the expected genus $\frac{1}{2}(de-1)(de-2)$ of a plane curve of degree $de$, we find that $\pi_3(X)$ has $\frac{1}{2}d(d-1)e(e-1)$ nodes. Example 1.4 suggests that each node decreases the ML degree of a plane curve by 2. Assuming this to be the case, we conclude
$$\operatorname{MLdegree}(\pi_3(X)) \,=\, de(de+1) - d(d-1)e(e-1) \,=\, d^2e + de^2.$$
Here we are using that a general plane curve of degree $de$ has ML degree $de(de+1)$. ♦

This example suggests that, in favorable circumstances, the following identity would hold:
$$\operatorname{MLdegree}(X|_{u_n=0}) \,=\, \operatorname{MLdegree}(\pi_n(X)). \qquad (37)$$
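The arithmetic in Example 3.21 can be replayed for small values of $d$ and $e$. The following Python sketch is merely a sanity check of the stated formulas; the function names are hypothetical, and the rule that each node lowers the ML degree by 2 is taken as an assumption, exactly as in the example.

```python
from fractions import Fraction


def node_count(d, e):
    """Nodes of the plane projection of a generic (d,e)-curve in P^3."""
    degree = d * e
    genus_space_curve = Fraction(d * d * e + d * e * e, 2) - 2 * d * e + 1
    genus_plane_curve = Fraction((degree - 1) * (degree - 2), 2)
    return genus_plane_curve - genus_space_curve


def ml_degree_of_projection(d, e):
    """ML degree of the projected curve, assuming each node costs 2."""
    degree = d * e
    return degree * (degree + 1) - 2 * node_count(d, e)


for d in range(1, 6):
    for e in range(1, 6):
        # Node count stated in Example 3.21.
        assert node_count(d, e) == Fraction(d * (d - 1) * e * (e - 1), 2)
        # Conjectured value of MLdegree(pi_3(X)).
        assert ml_degree_of_projection(d, e) == d * d * e + d * e * e
        # Formula (36): MLdegree(X) = (d*e points) + MLdegree(X | u_3 = 0).
        assert d * d * e + d * e * e + d * e == d * e + (d * d * e + d * e * e)

print(node_count(2, 3), ml_degree_of_projection(2, 3))  # -> 6 30
```

For a $(2,3)$-curve, for instance, the projection is a plane sextic with 6 nodes, and the predicted ML degree is $42 - 12 = 30$.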
However, this is certainly not true in general. Here is a particularly telling example:

Example 3.22. Suppose that $X$ is a generic surface of degree $d$ in $\mathbb{P}^3$. Then
$$\begin{aligned} \operatorname{MLdegree}(X) &= d + d^2 + d^3, \\ \operatorname{MLdegree}(X \cap \{p_3 = 0\}) &= d + d^2, \\ \operatorname{MLdegree}(X|_{u_3=0}) &= d^3, \\ \operatorname{MLdegree}(\pi_3(X)) &= 1. \end{aligned}$$
Indeed, for most hypersurfaces $X \subset \mathbb{P}^n$, the same will happen, since $\pi_n(X) = \mathbb{P}^{n-1}$. ♦

As a next step, one might conjecture that (37) holds when the map $\pi_n$ is birational and the center $(0 : \cdots : 0 : 1)$ of the projection does not lie on the variety $X$. But this also fails:

Example 3.23. Let $X$ be the twisted cubic curve in $\mathbb{P}^3$ defined by the $2 \times 2$-minors of
$$\begin{pmatrix} p_0 + p_1 - p_2 & 2p_0 - p_2 + 9p_3 & p_0 - 6p_1 + 8p_2 \\ 2p_0 - p_2 + 9p_3 & p_0 - 6p_1 + 8p_2 & 7p_0 + p_1 + 2p_2 \end{pmatrix}.$$
The ML degree of $X$ is $13 = 3 + 10$, and $X$ intersects $\{p_3 = 0\}$ in three distinct points. The projection of the curve $X$ into $\mathbb{P}^2$ is a cuspidal cubic, as in Example 1.4. We have
$$\operatorname{MLdegree}(X|_{u_3=0}) = 10 \qquad \text{and} \qquad \operatorname{MLdegree}(\pi_3(X)) = 9.$$
It is also instructive to compare the number $13 = -\chi(X\backslash\mathcal{H})$ with the number 11 one gets in Theorem 3.7 for the special twisted cubic curve with $d = 1$, $n = 3$ and $b_0 = b_1 = b_2 = b_3 = 3$. There are many mysteries still to be explored in likelihood geometry, even within $\mathbb{P}^3$. ♦
4 Characteristic Classes

We start by giving an alternative description of the likelihood correspondence which reveals its intimate connection with the theory of Chern classes on possibly noncompact varieties. An important role will be played by the Lie algebra and cotangent bundle of the algebraic torus $(\mathbb{C}^*)^{n+1}$. This section ties our discussion to the work of Aluffi [AluJSC, AluLectures, Alu] and Huh [Huh1, Huh0, Huh2]. In particular, we introduce and explain Chern–Schwartz–MacPherson (CSM) classes. And, most importantly, we present proofs for Theorems 1.6, 1.7, 1.15, and 1.20.

Let $X \subseteq \mathbb{P}^n$ be a closed and irreducible subvariety of dimension $d$, not contained in our distinguished arrangement of $n+2$ hyperplanes,
$$\mathcal{H} \,=\, \bigl\{ (p_0 : p_1 : \cdots : p_n) \in \mathbb{P}^n \;\big|\; p_0\cdot p_1 \cdots p_n \cdot p_+ = 0 \bigr\}, \qquad p_+ = \sum_{i=0}^{n} p_i.$$
Let $\varphi_i$ denote the restriction of the rational function $p_i/p_+$ to $X\backslash\mathcal{H}$. The closed embedding
$$\varphi : X\backslash\mathcal{H} \to (\mathbb{C}^*)^{n+1}, \qquad \varphi = (\varphi_0, \ldots, \varphi_n),$$
shows that the variety $X\backslash\mathcal{H}$ is very affine. Let $x$ be a smooth point of $X\backslash\mathcal{H}$. We define
$$\gamma_x : T_xX \to T_{\varphi(x)}(\mathbb{C}^*)^{n+1} \to \mathfrak{g} := T_1(\mathbb{C}^*)^{n+1} \qquad (38)$$
to be the derivative of $\varphi$ at $x$ followed by that of left-translation by $\varphi(x)^{-1}$. Here $\mathfrak{g}$ is the Lie algebra of the algebraic torus $(\mathbb{C}^*)^{n+1}$. In local coordinates $(x_1, \ldots, x_d)$ around the smooth point $x$, the linear map $\gamma_x$ is represented by the logarithmic Jacobian matrix
$$\left( \frac{\partial \log \varphi_i}{\partial x_j} \right), \qquad 0 \le i \le n, \quad 1 \le j \le d.$$
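The linear map $\gamma_x$ in (38) is injective because $\varphi$ is injective. We write $q_0, \ldots, q_n$ for the coordinate functions on the torus $(\mathbb{C}^*)^{n+1}$. These functions define a $\mathbb{C}$-linear basis of the dual Lie algebra $\mathfrak{g}^\vee$ corresponding to differential forms
$$\operatorname{dlog}(q_0), \ldots, \operatorname{dlog}(q_n) \,\in\, H^0\bigl((\mathbb{C}^*)^{n+1}, \Omega^1_{(\mathbb{C}^*)^{n+1}}\bigr) \,\simeq\, \mathfrak{g}^\vee \,\simeq\, \mathbb{C}^{n+1}.$$
We fix this choice of basis of $\mathfrak{g}^\vee$, and we identify $\mathbb{P}(\mathfrak{g}^\vee)$ with the space of data vectors $\mathbb{P}^n_u$:
$$\mathfrak{g}^\vee \,\simeq\, \Bigl\{ \sum_{i=0}^{n} u_i \cdot \operatorname{dlog}(q_i) \;\Big|\; u = (u_0, \ldots, u_n) \in \mathbb{C}^{n+1} \Bigr\}.$$
Consider the vector bundle homomorphism defined by the pullback of differential forms
$$\gamma^\vee : \mathfrak{g}^\vee_{X_{\rm reg}\backslash\mathcal{H}} \to \Omega^1_{X_{\rm reg}\backslash\mathcal{H}}, \qquad (x,u) \mapsto \sum_{i=0}^{n} u_i \cdot \operatorname{dlog}(\varphi_i)(x). \qquad (39)$$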
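Here $\mathfrak{g}^\vee_{X_{\rm reg}\backslash\mathcal{H}}$ is the trivial vector bundle over $X_{\rm reg}\backslash\mathcal{H}$ modeled on the vector space $\mathfrak{g}^\vee$. The induced linear map $\gamma_x^\vee$ between the fibers over a smooth point $x$ is dual to the injective linear map $\gamma_x : T_xX \to \mathfrak{g}$. Therefore $\gamma^\vee$ is surjective and $\ker(\gamma^\vee)$ is a vector bundle over $X_{\rm reg}\backslash\mathcal{H}$. This vector bundle has positive rank $n - d + 1$, and hence its projectivization is nonempty.

Proof of Theorem 1.6. Under the identification $\mathbb{P}(\mathfrak{g}^\vee) \simeq \mathbb{P}^n_u$, the projective bundle $\mathbb{P}(\ker \gamma^\vee)$ corresponds to the following constructible subset of dimension $n$:
$$\mathcal{L}_X \cap \bigl( (X_{\rm reg}\backslash\mathcal{H}) \times \mathbb{P}^n_u \bigr) \,\subseteq\, \mathbb{P}^n_p \times \mathbb{P}^n_u.$$
Therefore its Zariski closure $\mathcal{L}_X$ is irreducible of dimension $n$, and $\mathrm{pr}_1 : \mathcal{L}_X \to \mathbb{P}^n_p$ is a projective bundle over $X_{\rm reg}\backslash\mathcal{H}$. The likelihood fibration $\mathrm{pr}_2 : \mathcal{L}_X \to \mathbb{P}^n_u$ is generically finite-to-one because the domain and the range are algebraic varieties of the same dimension. □

Our next aim is to prove Theorem 1.15. For this we fix a resolution of singularities
$$\begin{array}{ccccc} \pi^{-1}(X_{\rm reg}\backslash\mathcal{H}) & \longrightarrow & \tilde X & & \\ \downarrow & & \downarrow{\scriptstyle \pi} & & \\ X_{\rm reg}\backslash\mathcal{H} & \longrightarrow & X & \longrightarrow & \mathbb{P}^n, \end{array}$$
where $\pi$ is an isomorphism over $X_{\rm reg}\backslash\mathcal{H}$, the variety $\tilde X$ is smooth and projective, and the complement of $\pi^{-1}(X_{\rm reg}\backslash\mathcal{H})$ is a simple normal crossing divisor in $\tilde X$ with irreducible components $D_1, \ldots, D_k$. Each $\varphi_i$ lifts to a rational function on $\tilde X$ which is regular on $\pi^{-1}(X\backslash\mathcal{H})$. If $u = (u_0, \ldots, u_n)$ is an integer vector in $\mathbb{Z}^{n+1}$, then these functions satisfy
$$\operatorname{ord}_{D_j}(\ell_u) \,=\, \sum_{i=0}^{n} u_i \cdot \operatorname{ord}_{D_j}(\varphi_i). \qquad (40)$$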
If $u \in \mathbb{C}^{n+1}\backslash\mathbb{Z}^{n+1}$ then $\operatorname{ord}_{D_j}(\ell_u)$ is the complex number defined by Eq. (40) for $j = 1, \ldots, k$. We write $H_i := \{p_i = 0\}$ and $H_+ := \{p_+ = 0\}$ for the $n+2$ hyperplanes in $\mathcal{H}$.

Lemma 4.1. Suppose that $X \cap H_i$ is smooth along $H_+$, and let $D_j$ be a divisor in the boundary of $\tilde X$ such that $\pi(D_j) \subseteq \mathcal{H}$. Then the following three statements hold:
(1) If $\pi(D_j) \nsubseteq H_+$ then $\operatorname{ord}_{D_j}(\varphi_i)$ is positive if $\pi(D_j) \subseteq H_i$, and zero if $\pi(D_j) \nsubseteq H_i$.
(2) If $\pi(D_j) \subseteq H_+$ then $\operatorname{ord}_{D_j}(\varphi_i)$ is negative if $\pi(D_j) \nsubseteq H_i$, and nonpositive if $\pi(D_j) \subseteq H_i$.
(3) In each of the above two cases, $\operatorname{ord}_{D_j}(\varphi_i)$ is non-zero for at least one index $i$.

Proof. Write $H_i'$ and $H_+'$ for the pullbacks of $H_i$ and $H_+$ to $X$ respectively. Note that $\operatorname{ord}_{D_j}(\pi^*(H_i'))$ is positive if $D_j$ is contained in $\pi^{-1}(H_i')$ and otherwise zero. Since
$$\operatorname{ord}_{D_j}(\varphi_i) \,=\, \operatorname{ord}_{D_j}(\pi^*(H_i')) - \operatorname{ord}_{D_j}(\pi^*(H_+')),$$
this proves the first and second assertion, except for the case when $\pi(D_j) \subseteq H_i' \cap H_+'$. In this case, our assumption that $H_i'$ is smooth along $H_+'$ shows that $\pi(D_j) \subseteq X_{\rm reg}$ and the order of vanishing of $H_i'$ along $\pi(D_j)$ is 1. Therefore
$$\operatorname{ord}_{D_j}(\varphi_i) \,=\, -\operatorname{ord}_{D_j}(\pi^*(H_+')) + 1 \,\le\, 0.$$
The third assertion of Lemma 4.1 is derived by the following set-theoretic reasoning:
• If $\pi(D_j) \nsubseteq H_+$, then $\pi(D_j) \subseteq H_i$ for some $i$ because $\pi(D_j) \subseteq \mathcal{H}$ is irreducible.
• If $\pi(D_j) \subseteq H_+$, then $\pi(D_j) \nsubseteq H_i$ for some $i$ because $\bigcap_{i=0}^{n} H_i = \emptyset$. □

From Lemma 4.1 and Eq. (40) we deduce the following result. In Lemmas 4.2 and 4.3 we retain the hypothesis from Lemma 4.1 which coincides with that in Theorem 1.15.

Lemma 4.2. If $\pi(D_j) \subseteq \mathcal{H}$ and $u \in \mathbb{R}^{n+1}_{>0}$ is strictly positive, then $\operatorname{ord}_{D_j}(\ell_u)$ is nonzero.

Consider the sheaf of logarithmic differential one-forms $\Omega^1_{\tilde X}(\log D)$, where $D$ is the sum of the irreducible components of $\pi^{-1}(\mathcal{H})$. If $u$ is an integer vector, then the corresponding likelihood function $\ell_u$ on $\tilde X$ defines a global section of this sheaf:
$$\operatorname{dlog}(\ell_u) \,=\, \sum_{i=0}^{n} u_i \cdot \operatorname{dlog}(\varphi_i) \,\in\, H^0\bigl(\tilde X, \Omega^1_{\tilde X}(\log D)\bigr). \qquad (41)$$
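If $u \in \mathbb{C}^{n+1}\backslash\mathbb{Z}^{n+1}$ then we define the global section $\operatorname{dlog}(\ell_u)$ by the above expression (41).

Lemma 4.3. If $u \in \mathbb{R}^{n+1}_{>0}$ is strictly positive, then $\operatorname{dlog}(\ell_u)$ does not vanish on $\pi^{-1}(\mathcal{H})$.

Proof. Let $x \in \pi^{-1}(\mathcal{H})$ and $D_1, \ldots, D_l$ the irreducible components of $D$ containing $x$, with local equations $g_1, \ldots, g_l$ on a small neighborhood $G$ of $x$. Clearly, $l \ge 1$. By passing to a smaller neighborhood if necessary, we may assume that $\Omega^1_{\tilde X}(\log D)$ trivializes over $G$, and
$$\operatorname{dlog}(\ell_u) \,=\, \sum_{j=1}^{l} \operatorname{ord}_{D_j}(\ell_u)\cdot\operatorname{dlog}(g_j) \,+\, \psi,$$
where $\psi$ is a regular 1-form. Since the $\operatorname{dlog}(g_j)$ form part of a free basis of a trivialization of $\Omega^1_{\tilde X}(\log D)$ over $G$, Lemma 4.2 implies that $\operatorname{dlog}(\ell_u)$ is nonzero on $\pi^{-1}(\mathcal{H})$ if $u \in \mathbb{R}^{n+1}_{>0}$. □

Proof of Theorem 1.7. In the notation above, the logarithmic Poincaré–Hopf theorem states
$$\int_{\tilde X} c_d\bigl(\Omega^1_{\tilde X}(\log D)\bigr) \,=\, (-1)^d\,\chi\bigl(\tilde X\backslash\pi^{-1}(\mathcal{H})\bigr).$$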
See [AluLectures, Sect. 3.4] for example. If $X\backslash\mathcal{H}$ is smooth, then Lemma 4.3 shows that, for generic $u$, the zero-scheme of the section (41) is equal to the likelihood locus
$$\bigl\{ x \in X\backslash\mathcal{H} \;\big|\; \operatorname{dlog}(\ell_u)(x) = 0 \bigr\}.$$
Since the likelihood locus is a zero-dimensional scheme of length equal to the ML degree of $X$, the logarithmic Poincaré–Hopf theorem implies Theorem 1.7. □

Proof of Theorem 1.15. Suppose that the likelihood locus $\{x \in X_{\rm reg}\backslash\mathcal{H} \mid \operatorname{dlog}(\ell_u)(x) = 0\}$ contains a curve. Let $C$ and $\tilde C$ denote the closures of that curve in $X$ and $\tilde X$ respectively. Let $\pi^*(\mathcal{H})$ be the pullback of the divisor $\mathcal{H} \cap X$ of $X$. If $u \in \mathbb{R}^{n+1}_{>0}$ then Lemma 4.3 implies that $\pi^*(\mathcal{H})\cdot\tilde C$ is rationally equivalent to zero in $\tilde X$. It then follows from the Projection Formula that $\mathcal{H}\cdot C$ is also rationally equivalent to zero in $\mathbb{P}^n$. But this is impossible. Therefore the likelihood locus does not contain a curve. This proves the first part of Theorem 1.15.

For the second part, we first show that $\mathrm{pr}_2^{-1}(u)$ is contained in $X\backslash\mathcal{H}$ for a strictly positive vector $u$. This means there is no pair $(x,u) \in \mathcal{L}_X$ with $x \in \mathcal{H}$ which is a limit of the form
$$(x,u) \,=\, \lim_{t\to 0}\,(x_t,u_t), \qquad x_t \in X_{\rm reg}\backslash\mathcal{H}, \qquad \operatorname{dlog}(\ell_{u_t})(x_t) = 0.$$
If there is such a sequence $(x_t,u_t)$, then we can take its limit over $\tilde X$ to find a point $\tilde x \in \tilde X$ such that $\operatorname{dlog}(\ell_u)(\tilde x) = 0$, but this would contradict Lemma 4.3.

Now suppose that the fiber $\mathrm{pr}_2^{-1}(u)$ is contained in $X_{\rm reg}$, and hence in $X_{\rm reg}\backslash\mathcal{H}$. By Theorem 1.6, this fiber $\mathrm{pr}_2^{-1}(u)$ is contained in the smooth variety $(\mathcal{L}_X)_{\rm reg}$. Furthermore, by the first part of Theorem 1.15, $\mathrm{pr}_2^{-1}(u)$ is a zero-dimensional subscheme of $(\mathcal{L}_X)_{\rm reg}$. The assertion on the length of the fiber now follows from a standard result on intersection theory on Cohen–Macaulay varieties. More precisely, we have
$$\operatorname{MLdegree}(X) \,=\, (U_1\cdot\ldots\cdot U_n)_{\mathcal{L}_X} \,=\, (U_1\cdot\ldots\cdot U_n)_{(\mathcal{L}_X)_{\rm reg}} \,=\, \deg\bigl(\mathrm{pr}_2^{-1}(u)\bigr),$$
where the $U_i$ are pullbacks of sufficiently general hyperplanes in $\mathbb{P}^n_u$ containing $u$, and the two terms in the middle are the intersection numbers defined in [Fulton, Definition 2.4.2]. The fact that $(\mathcal{L}_X)_{\rm reg}$ is Cohen–Macaulay is used in the last equality [Fulton, Example 2.4.8]. □

Remark 4.4. If $X$ is a curve, then the zero-scheme of the section (41) is zero-dimensional for generic $u$, even if $X\backslash\mathcal{H}$ is singular. Furthermore, the length of this zero-scheme is at least as large as the ML degree of $X$. Therefore
$$-\chi(X\backslash\mathcal{H}) \,\ge\, -\chi\bigl(\tilde X\backslash\pi^{-1}(\mathcal{H})\bigr) \,\ge\, \operatorname{MLdegree}(X).$$
This proves that Conjecture 1.8 holds for $d = 1$.

Next we give a brief description of the Chern–Schwartz–MacPherson (CSM) class. For a gentle introduction we refer to [AluLectures]. The group $C(X)$ of constructible functions on a complex algebraic variety $X$ is a subgroup of the group of integer valued functions on $X$. It is generated by the characteristic functions $1_Z$ of all closed subvarieties $Z$ of $X$. If $f : X \to Y$ is a morphism between complex algebraic varieties, then the pushforward of constructible functions is the homomorphism
$$f_* : C(X) \to C(Y), \qquad 1_Z \mapsto \Bigl( y \mapsto \chi\bigl(f^{-1}(y) \cap Z\bigr) \Bigr), \qquad y \in Y.$$
If $X$ is a compact complex manifold, then the characteristic class of $X$ is the Chern class of the tangent bundle $c(TX) \cap [X] \in H_*(X;\mathbb{Z})$. A generalization to possibly singular or noncompact varieties is provided by the Chern–Schwartz–MacPherson class, whose existence was once a conjecture of Deligne and Grothendieck. In the next definition, we write $C$ for the functor of constructible functions from the category of complete complex algebraic varieties to the category of abelian groups.

Definition 4.5. The CSM class is the unique natural transformation
$$c_{SM} : C \to H_*$$
such that $c_{SM}(1_X) = c(TX) \cap [X] \in H_*(X;\mathbb{Z})$ when $X$ is smooth and complete.

The uniqueness follows from the naturality, the resolution of singularities over $\mathbb{C}$, and the requirement for smooth and complete varieties. We highlight two properties of the CSM class which follow directly from Definition 4.5:
1. The CSM class satisfies the inclusion–exclusion relation
$$c_{SM}(1_{U \cup U'}) \,=\, c_{SM}(1_U) + c_{SM}(1_{U'}) - c_{SM}(1_{U \cap U'}) \,\in\, H_*(X;\mathbb{Z}). \qquad (42)$$
2. The CSM class captures the topological Euler characteristic as its degree:
$$\chi(U) \,=\, \int_X c_{SM}(1_U) \,\in\, \mathbb{Z}. \qquad (43)$$
Here $U$ and $U'$ are arbitrary constructible subsets of a complete variety $X$.

What kind of information on a constructible subset is encoded in its CSM class? In likelihood geometry, $U$ is a constructible subset in the complex projective space $\mathbb{P}^n$, and we identify $c_{SM}(1_U)$ with its image in $H_*(\mathbb{P}^n;\mathbb{Z}) = \mathbb{Z}[p]/\langle p^{n+1}\rangle$. Thus $c_{SM}(1_U)$ is a polynomial of degree $\le n$ in one variable $p$. To be consistent with the earlier sections, we introduce a homogenizing variable $u$, and we write $c_{SM}(1_U)$ as a binary form of degree $n$ in $(p,u)$.

The CSM class of $U$ carries the same information as the sectional Euler characteristic
$$\chi_{\rm sec}(1_U) \,=\, \sum_{i=0}^{n} \chi(U \cap L_{n-i})\cdot p^{n-i}u^i.$$
Here $L_{n-i}$ is a generic linear subspace of codimension $i$ in $\mathbb{P}^n$. Indeed, it was proved by Aluffi in [Alu, Theorem 1.1] that $c_{SM}(1_U)$ is the transform of $\chi_{\rm sec}(1_U)$ under a linear involution on binary forms of degree $n$ in $(p,u)$. In fact, our involution in Conjecture 3.15 is nothing but the signed version of Aluffi's involution. This is explained by the following result.

Theorem 4.6. Let $X \subset \mathbb{P}^n$ be a closed subvariety of dimension $d$ that is not contained in $\mathcal{H}$. If the very affine variety $X\backslash\mathcal{H}$ is schön then, up to signs, the ML bidegree equals the CSM class and the sectional ML degree equals the sectional Euler characteristic. In symbols,
$$c_{SM}(1_{X\backslash\mathcal{H}}) \,=\, (-1)^{n-d}\cdot B_X(-p,u) \qquad \text{and} \qquad \chi_{\rm sec}(1_{X\backslash\mathcal{H}}) \,=\, (-1)^{n-d}\cdot S_X(-p,u).$$
Proof. The first identity is a special case of [Huh1, Theorem 2], here adapted to $\mathbb{P}^n$ minus $n+2$ hyperplanes, and the second identity follows from the first by way of [Alu, Theorem 1.1]. □

To make sense of the statement in Theorem 4.6, we need to recall the definition of schön. This term was coined by Tevelev in his study of tropical compactifications [Tevelev]. Let $U$ be an arbitrary closed subvariety of the algebraic torus $(\mathbb{C}^*)^{n+1}$. In our application, $U = X\backslash\mathcal{H}$. We consider the closures $\overline{U}$ of $U$ in various (not necessarily complete) normal toric varieties $Y$ with dense torus $(\mathbb{C}^*)^{n+1}$. The closure $\overline{U}$ is complete if and only if the support of the fan of $Y$ contains the tropicalization of $U$ [Tevelev, Proposition 2.3]. We say that $\overline{U}$ is a tropical compactification of $U$ if it is complete and the multiplication map
$$m : (\mathbb{C}^*)^{n+1} \times \overline{U} \to Y, \qquad (t,x) \mapsto t\cdot x$$
is flat and surjective. Tropical compactifications exist, and they are obtained from toric varieties $Y$ defined by sufficiently fine fan structures on the tropicalization of $U$ [Tevelev, §2]. The very affine variety $U$ is called schön if the multiplication is smooth for some tropical compactification of $U$. Equivalently, $U$ is schön if the multiplication is smooth for every tropical compactification of $U$, by Tevelev [Tevelev, Theorem 1.4].

Two classes of schön very affine varieties are of particular interest. The first is the class of complements of essential hyperplane arrangements. The second is the class of nondegenerate hypersurfaces. What we need from the schön hypothesis is the existence of a simple normal crossings compactification which admits sufficiently many differential one-forms which have logarithmic singularities along the boundary. For complements of hyperplane arrangements, such a compactification is provided by the wonderful compactification of De Concini and Procesi [DP]. For nondegenerate hypersurfaces, and more generally for nondegenerate complete intersections, the needed compactification has been constructed by Khovanskii [Hovanskii].
We illustrate this in the setting of likelihood geometry by a $d$-dimensional linear subspace $X \subset \mathbb{P}^n$. The intersection of $X$ with the distinguished hyperplanes $\mathcal{H}$ of $\mathbb{P}^n$ is an arrangement of $n+2$ hyperplanes in $X \simeq \mathbb{P}^d$, defining a matroid $M$ of rank $d+1$ on $n+2$ elements.

Proposition 4.7. If $X$ is a linear space of dimension $d$ then the CSM class of $X\backslash\mathcal{H}$ in $\mathbb{P}^n$ is
$$c_{SM}(1_{X\backslash\mathcal{H}}) \,=\, \sum_{i=0}^{d} (-1)^i h_i\, u^{d-i} p^{n-d+i},$$
where the $h_i$ are the signed coefficients of the shifted characteristic polynomial in (15).

Proof. This holds because the recursive formula for a triple of arrangement complements
$$c_{SM}(1_{U_1}) \,=\, c_{SM}(1_U - 1_{U_0}) \,=\, c_{SM}(1_U) - c_{SM}(1_{U_0})$$
agrees with the usual deletion-restriction formula [OTBook, Theorem 2.56]:
$$\chi_{M_1}(q+1) \,=\, \chi_M(q+1) - \chi_{M_0}(q+1).$$
Here our notation is as in [Huh1, §3]. We now use induction on the number of hyperplanes. □

Proof of Theorem 1.20. The very affine variety $X\backslash\mathcal{H}$ is schön when $X$ is linear. Hence the asserted formula for the ML bidegree of $X$ follows from Theorem 4.6 and Proposition 4.7. □

Rank constraints on matrices are important both in statistics and in algebraic geometry, and they provide a rich source of test cases for the theory developed here. We close our discussion with the enumerative invariants of three hypersurfaces defined by $3 \times 3$-determinants. It would be very interesting to compute these formulas for larger determinantal varieties.

Example 4.8. We record the ML bidegree, the CSM class, the sectional ML degree, and the sectional Euler characteristic for three singular hypersurfaces seen earlier in this paper. These examples were studied already in [HKS]. The classes we present are elements of $H^*(\mathbb{P}^n_p \times \mathbb{P}^n_u)$ and of $H_*(\mathbb{P}^n_p;\mathbb{Z})$ respectively, and they are written as binary forms in $(p,u)$ as before.
• The $3 \times 3$ determinantal hypersurface in $\mathbb{P}^8$ (Example 2.1) has
$$\begin{aligned} B_X(p,u) &= 10p^8 + 24p^7u + 33p^6u^2 + 38p^5u^3 + 39p^4u^4 + 33p^3u^5 + 12p^2u^6 + 3pu^7, \\ c_{SM}(1_{X\backslash\mathcal{H}}) &= -11p^8 + 26p^7u - 37p^6u^2 + 44p^5u^3 - 45p^4u^4 + 33p^3u^5 - 12p^2u^6 + 3pu^7, \\ S_X(p,u) &= 10p^8 + 182p^7u + 436p^6u^2 + 518p^5u^3 + 351p^4u^4 + 138p^3u^5 + 30p^2u^6 + 3pu^7, \\ \chi_{\rm sec}(1_{X\backslash\mathcal{H}}) &= -11p^8 + 200p^7u - 470p^6u^2 + 542p^5u^3 - 357p^4u^4 + 138p^3u^5 - 30p^2u^6 + 3pu^7. \end{aligned}$$
• The $3 \times 3$ symmetric determinantal hypersurface in $\mathbb{P}^5$ (Example 2.7) has
$$\begin{aligned} B_X(p,u) &= 6p^5 + 12p^4u + 15p^3u^2 + 12p^2u^3 + 3pu^4, \\ c_{SM}(1_{X\backslash\mathcal{H}}) &= 7p^5 - 14p^4u + 19p^3u^2 - 12p^2u^3 + 3pu^4, \\ S_X(p,u) &= 6p^5 + 42p^4u + 48p^3u^2 + 21p^2u^3 + 3pu^4, \\ \chi_{\rm sec}(1_{X\backslash\mathcal{H}}) &= 7p^5 - 48p^4u + 52p^3u^2 - 21p^2u^3 + 3pu^4. \end{aligned}$$
• The secant variety of the rational normal curve in $\mathbb{P}^4$ (Example 3.18) has
$$\begin{aligned} B_X(p,u) &= 12p^4 + 15p^3u + 12p^2u^2 + 3pu^3, \\ c_{SM}(1_{X\backslash\mathcal{H}}) &= -13p^4 + 19p^3u - 12p^2u^2 + 3pu^3, \\ S_X(p,u) &= 12p^4 + 30p^3u + 18p^2u^2 + 3pu^3, \\ \chi_{\rm sec}(1_{X\backslash\mathcal{H}}) &= -13p^4 + 34p^3u - 18p^2u^2 + 3pu^3. \end{aligned}$$
In all known examples, the coefficients of $B_X(p,u)$ are less than or equal to the absolute value of the corresponding coefficients of $c_{SM}(1_{X\backslash\mathcal{H}})$, and similarly for $S_X(p,u)$ and $\chi_{\rm sec}(1_{X\backslash\mathcal{H}})$. That this inequality holds for the first coefficient is Conjecture 1.8, which relates the ML degree of a singular $X$ to the signed Euler characteristic of the very affine variety $X\backslash\mathcal{H}$. ♦

Acknowledgements We thank Paolo Aluffi and Sam Payne for helpful communications, and the Mathematics Department at KAIST, Daejeon, for hosting both authors in May 2013. Bernd Sturmfels was supported by NSF (DMS-0968882) and DARPA (HR0011-12-1-0011).
References

[ARSZ] E. Allman, J. Rhodes, B. Sturmfels, P. Zwiernik, Tensors of nonnegative rank two, in Linear Algebra and its Applications. Special Issue on Statistics. http://www.sciencedirect.com/science/article/pii/S0024379513006812
[AluJSC] P. Aluffi, Computing characteristic classes of projective schemes. J. Symb. Comput. 35, 3–19 (2003)
[AluLectures] P. Aluffi, Characteristic classes of singular varieties, in Topics in Cohomological Studies of Algebraic Varieties. Trends in Mathematics (Birkhäuser, Basel, 2005), pp. 1–32
[Alu] P. Aluffi, Euler characteristics of general linear sections and polynomial Chern classes. Rend. Circ. Mat. Palermo. 62, 3–26 (2013)
[AN] S. Amari, H. Nagaoka, in Methods of Information Geometry. Translations of Mathematical Monographs, vol. 191 (American Mathematical Society, Providence, 2000)
[Bertini] D.J. Bates, J.D. Hauenstein, A.J. Sommese, C.W. Wampler, Bertini: Software for Numerical Algebraic Geometry (2006). www.nd.edu/~sommese/bertini
[BHSW] D.J. Bates, J.D. Hauenstein, A.J. Sommese, C.W. Wampler, Numerically Solving Polynomial Systems with Bertini (Society for Industrial and Applied Mathematics, Philadelphia, 2013). http://www.ec-securehost.com/SIAM/SE25.html
[BFH] Y. Bishop, S. Fienberg, P. Holland, Discrete Multivariate Analysis: Theory and Practice (Springer, New York, 1975)
[BoydVan] S. Boyd, L. Vandenberghe, Convex Optimization (Cambridge University Press, Cambridge, 2004)
[CHKS] F. Catanese, S. Hoşten, A. Khetan, B. Sturmfels, The maximum likelihood degree. Am. J. Math. 128, 671–697 (2006)
[CDFV] D. Cohen, G. Denham, M. Falk, A. Varchenko, Critical points and resonance of hyperplane arrangements. Can. J. Math. 63, 1038–1057 (2011)
[DP] C. De Concini, C. Procesi, Wonderful models of subspace arrangements. Selecta Math. New Ser. 1, 459–494 (1995)
[Denham-Garrousian-Schulze] G. Denham, M. Garrousian, M. Schulze, A geometric deletion-restriction formula. Adv. Math. 230, 1979–1994 (2012)
[DR] J. Draisma, J. Rodriguez, Maximum likelihood duality for determinantal varieties. Int. Math. Res. Not. [arXiv:1211.3196]. http://imrn.oxfordjournals.org/content/early/2013/07/03/imrn.rnt128.full.pdf
[LiAS] M. Drton, B. Sturmfels, S. Sullivant, in Lectures on Algebraic Statistics. Oberwolfach Seminars, vol. 39 (Birkhäuser, Basel, 2009)
[Franecki-Kapranov] J. Franecki, M. Kapranov, The Gauss map and a noncompact Riemann-Roch formula for constructible sheaves on semiabelian varieties. Duke Math. J. 104, 171–180 (2000)
[Fulton] W. Fulton, in Intersection Theory, 2nd edn. Ergebnisse der Mathematik und ihrer Grenzgebiete. A Series of Modern Surveys in Mathematics, vol. 2 (Springer, Berlin, 1998)
[Gabber-Loeser] O. Gabber, F. Loeser, Faisceaux pervers l-adiques sur un tore. Duke Math. J. 83, 501–606 (1996)
[GKZ] I.M. Gel'fand, M. Kapranov, A. Zelevinsky, Discriminants, Resultants, and Multidimensional Determinants (Birkhäuser, Boston, 1994)
[GR] E. Gross, J. Rodriguez, Maximum likelihood geometry in the presence of data zeros, http://front.math.ucdavis.edu/1310.4197
[HRS] J. Hauenstein, J. Rodriguez, B. Sturmfels, Maximum likelihood for matrices with rank constraints. J. Algebr. Stat. (to appear) [arXiv:1210.0198]
[HKS] S. Hoşten, A. Khetan, B. Sturmfels, Solving the likelihood equations. Found. Comput. Math. 5, 389–407 (2005)
[Hovanskii] A. Hovanskiĭ, Newton polyhedra and toroidal varieties. Akademija Nauk SSSR. Funkcional'nyi Analiz i ego Priloženija 11, 56–64 (1977)
[Huh1] J. Huh, The maximum likelihood degree of a very affine variety. Compositio Math. 149, 1245–1266 (2013)
[Huh0] J. Huh, h-vectors of matroids and logarithmic concavity. Preprint. http://arxiv.org/abs/1201.2915. [arXiv:1201.2915]
[Huh2] J. Huh, Varieties with maximum likelihood degree one. J. Algebr. Stat. (to appear). [arXiv:1301.2732]
[Kapranov] M. Kapranov, A characterization of A-discriminantal hypersurfaces in terms of the logarithmic Gauss map. Math. Ann. 290, 277–285 (1991)
[KRS] K. Kubjas, E. Robeva, B. Sturmfels, Fixed points of the EM algorithm and nonnegative rank boundaries, http://arxiv.org/abs/1312.5634 (in preparation)
[Land] J.M. Landsberg, Tensors: Geometry and Applications. Graduate Studies in Mathematics, vol. 128 (American Mathematical Society, Providence, 2012)
[LM] J.M. Landsberg, L. Manivel, On ideals of secant varieties of Segre varieties. Found. Comput. Math. 4, 397–422 (2004)
[Lau] S. Lauritzen, Graphical Models (Oxford University Press, Oxford, 1996)
[ch3:MS] E. Miller, B. Sturmfels, in Combinatorial Commutative Algebra. Graduate Texts in Mathematics, vol. 227 (Springer, New York, 2004)
[MSS] D. Mond, J. Smith, D. van Straten, Stochastic factorizations, sandwiched simplices and the topology of the space of explanations. R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 459, 2821–2845 (2003)
[OTBook] P. Orlik, H. Terao, in Arrangements of Hyperplanes. Grundlehren der Mathematischen Wissenschaften, vol. 300 (Springer, Berlin, 1992)
[Orlik-Terao] P. Orlik, H. Terao, The number of critical points of a product of powers of linear functions. Inventiones Math. 120, 1–14 (1995)
[PS] L. Pachter, B. Sturmfels, Algebraic Statistics for Computational Biology (Cambridge University Press, Cambridge, 2005)
[Rai] C. Raicu, Secant varieties of Segre–Veronese varieties. Algebra Number Theory 6, 1817–1868 (2012)
[Rapallo] F. Rapallo, Markov bases and structural zeros. J. Symb. Comput. 41, 164–172 (2006)
[SSV] R. Sanyal, B. Sturmfels, C. Vinzant, The entropic discriminant. Adv. Math. 244, 678–707 (2013)
[Terao] H. Terao, Generalized exponents of a free arrangement of hyperplanes and the Shepherd-Todd-Brieskorn formula. Invent. Math. 63, 159–179 (1981)
[Tevelev] J. Tevelev, Compactifications of subvarieties of tori. Am. J. Math. 129, 1087–1104 (2007)
[Uhl] C. Uhler, Geometry of maximum likelihood estimation in Gaussian graphical models. Ann. Stat. 40, 238–261 (2012)
[Varchenko] A. Varchenko, Critical points of the product of powers of linear functions and families of bases of singular vectors. Compositio Math. 97, 385–401 (1995)
[Wat] S. Watanabe, in Algebraic Geometry and Statistical Learning Theory. Monographs on Applied and Computational Mathematics, vol. 25 (Cambridge University Press, Cambridge, 2009)
Linear Toric Fibrations Sandra Di Rocco
1 Introduction

These notes are based on three lectures given at the 2013 CIME/CIRM summer school Combinatorial Algebraic Geometry. The purpose of this series of lectures is to introduce the notion of a toric fibration and to give its geometrical and combinatorial characterizations.

Toric fibrations $f : X \to Y$, together with a choice of an ample line bundle $L$ on $X$, are associated to convex polytopes called Cayley sums. Such a polytope is a convex polytope $P \subset \mathbb{R}^n$ obtained by assembling a number of lower dimensional polytopes $R_i$, whose normal fans define the same toric variety $Y$. Let $\mathbb{R}^n = M \otimes \mathbb{R}$ for a lattice $M$. The building blocks $R_i$ are glued together following their image via a surjective map of lattices $\pi : M \to \Lambda$, see Definition 3.7. In particular the normal fan of the polytope $\pi(P)$ defines the generic fiber of the map $f$. We will denote Cayley sums by $\mathrm{Cayley}(R_0, \ldots, R_t)_{\pi,Y}$.

Our aim is to illustrate how classical notions in projective geometry are captured by certain properties of the associated Cayley sum. When the image polytope $\pi(P)$ is a unimodular simplex $\Delta_k$ the generic fiber of the fibration $f$ is a projective space $\mathbb{P}^k$ embedded linearly, i.e. $L|_F = \mathcal{O}_{\mathbb{P}^k}(1)$. For this reason the fibration is called a linear toric fibration. The following picture illustrates a linear toric fibration and the representation of the associated polytope as a Cayley sum.
S. Di Rocco, Department of Mathematics, Royal Institute of Technology (KTH), 10044 Stockholm, Sweden. www.math.kth.se/~dirocco
A. Conca et al., Combinatorial Algebraic Geometry, Lecture Notes in Mathematics 2108, DOI 10.1007/978-3-319-04870-3_4, © Springer International Publishing Switzerland 2014
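[Figure: the linear toric fibration $f : \mathbb{P}(\mathcal{O}_{\mathbb{P}^1 \times \mathbb{P}^1}(1,1) \oplus \mathcal{O}_{\mathbb{P}^1 \times \mathbb{P}^1}(3,3)) \to \mathbb{P}^2$ and the associated polytope represented as a Cayley sum.]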
Section 3 will be devoted to defining these concepts and to giving the most relevant examples. In the following two sections we will present two characterizations of Cayley sums corresponding to linear toric fibrations. In both cases there are rich and interesting connections with classical projective geometry.

Section 4 discusses discriminants of polynomials. A polynomial supported on a subset $\mathcal{A} \subset \mathbb{Z}^n$ is a polynomial in $n$ variables $x = (x_1, \ldots, x_n)$ of the form $p_{\mathcal{A}} = \sum_{a \in \mathcal{A}} c_a x^a$. The $\mathcal{A}$-discriminant is again a polynomial in $|\mathcal{A}|$ variables, $\Delta_{\mathcal{A}}(c_a)$, vanishing whenever the corresponding polynomial has at least one singularity in the torus $(\mathbb{C}^*)^n$. Understanding whether the discriminant polynomial exists and, if it does, determining its degree, for given classes of point configurations $\mathcal{A}$, is highly desirable.

Finite subsets $\mathcal{A} \subset \mathbb{Z}^n$ define toric projective varieties $X_{\mathcal{A}} \subset \mathbb{P}^{|\mathcal{A}|-1}$. It is classical in Algebraic Geometry to associate to a given embedding $X \subset \mathbb{P}^m$ the variety parametrizing hyperplanes singular along $X$. This variety is called the dual variety and it is denoted by $X^\vee$. Understanding when the codimension of the dual variety is higher than one and giving efficient formulas for its degree is a long-standing problem. We will see that projective duality is a useful tool for describing the discriminants $\Delta_{\mathcal{A}}$ when the associated polytope $\mathrm{Conv}(\mathcal{A})$ is smooth or simple. In fact the case when $\Delta_{\mathcal{A}} = 1$ is completely characterized by Cayley sums and thus by toric fibrations. In the nonsingular case the following holds.

Characterization 1. If $P_{\mathcal{A}} = \mathrm{Conv}(\mathcal{A})$ is a smooth polytope then the following assertions are equivalent:
(a) $P_{\mathcal{A}} = \mathrm{Cayley}_{\pi,Y}(R_0, \ldots, R_t)$ with $t \geq \max(2, \frac{n+1}{2})$.
(b) $\mathrm{codim}(X_{\mathcal{A}}^\vee) > 1$.
(c) $\Delta_{\mathcal{A}} = 1$.

When the codimension of $X_{\mathcal{A}}^\vee$ is one then its degree is given by an alternating sum of volumes of the faces of the polytope $P_{\mathcal{A}}$. We will see that this formula corresponds to the top Chern class of the so-called first jet bundle. This interpretation has a useful consequence. When the codimension of $X_{\mathcal{A}}^\vee$ is higher than one this Chern class has to vanish. This leads to another characterization of Cayley sums.

Characterization 2. If $P_{\mathcal{A}} = \mathrm{Conv}(\mathcal{A})$ is a smooth polytope then the following assertions are equivalent:
(a) $P_{\mathcal{A}} = \mathrm{Cayley}_{\pi,Y}(R_0, \ldots, R_t)$ with $t \geq \max(2, \frac{n+1}{2})$.
(b) $\sum_{\emptyset \neq F \preceq P_{\mathcal{A}}} (-1)^{\mathrm{codim}(F)} (\dim(F) + 1)!\, \mathrm{Vol}(F) = 0$.
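To make the alternating sum in Characterization 2(b) concrete, the following Python sketch evaluates it for lattice polygons ($n = 2$). It is only an illustration and not part of the original lectures: the function name `dual_degree_polygon` is ours, and we assume that $\mathrm{Vol}(F)$ is measured with respect to the lattice induced on the affine span of $F$, i.e. Euclidean area for the polygon itself, lattice length for an edge, and 1 for a vertex.

```python
from fractions import Fraction
from math import gcd


def dual_degree_polygon(vertices):
    """Alternating face-volume sum of Characterization 2(b) for a lattice polygon.

    `vertices` lists the vertices in cyclic order. Volumes are normalized by the
    lattice of each face (an assumption of this sketch): Euclidean area for the
    polygon, lattice length for an edge, 1 for a vertex.
    """
    m = len(vertices)
    edges = [(vertices[i], vertices[(i + 1) % m]) for i in range(m)]
    # Shoelace formula for the Euclidean area of the polygon.
    twice_area = abs(sum(p[0] * q[1] - q[0] * p[1] for p, q in edges))
    area = Fraction(twice_area, 2)
    # Lattice length of an edge = gcd of its coordinate differences.
    lengths = [gcd(abs(q[0] - p[0]), abs(q[1] - p[1])) for p, q in edges]
    # codim 0: +3! * area;  codim 1: -2! * length;  codim 2: +1! * 1 per vertex.
    return 6 * area - 2 * sum(lengths) + m


square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # Conv(A) for the configuration of Example 4.1
double_triangle = [(0, 0), (2, 0), (0, 2)]  # Conv(A) for the configuration of Example 4.2
triangle = [(0, 0), (1, 0), (0, 1)]         # the unimodular simplex
```

Under these conventions the sum equals 2 for the unit square and 3 for the doubled triangle, which agrees with the degrees of the discriminants displayed in Examples 4.1 and 4.2, while it vanishes for the unimodular triangle.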
In Sect. 5 we discuss the problem of classifying convex polytopes and algebraic varieties. A classification is typically done via invariants. In recent years much attention has been concentrated on the notion of codegree of a convex polytope:

$$\mathrm{codeg}(P) = \min_{\mathbb{Z}} \{ t \mid tP \text{ has interior lattice points} \}.$$

The unimodular simplex, for example, has $\mathrm{codeg}(\Delta_n) = n + 1$. Batyrev and Nill conjectured that imposing this invariant to be large should force the polytope to be a Cayley sum. It turned out that a $\mathbb{Q}$-version of this invariant, which we denote by $\mu(P)$, corresponds to a classical invariant in the classification theory of algebraic varieties, called the log-canonical threshold. Let $(X_P, L_P)$ be the toric variety and ample line bundle associated to the polytope $P$. The canonical threshold $\mu(L_P)$ and the nef-value $\tau(L_P)$ are invariants used heavily in the classification theory of Gorenstein algebraic varieties. In particular Beltrametti-Sommese-Wisniewski conjectured that imposing $\mu(L_P)$ to be large should force the variety to have the structure of a fibration. Again in the toric setting we will see that these two stories intersect, making it possible to prove the above conjectures, at least in the smooth case, and leading to yet another characterization of Cayley sums.

Characterization 3. Let $P$ be a smooth polytope. The following assertions are equivalent:
(a) $\mathrm{codeg}(P) \geq (n + 3)/2$.
(b) $P$ is isomorphic to a Cayley sum $\mathrm{Cayley}(R_0, \ldots, R_t)_{\pi,Y}$ where $t + 1 = \mathrm{codeg}(P)$ with $t > \frac{n}{2}$.
(c) $\mu(L_P) = \tau(L_P) \geq (n + 3)/2$.

In fact the characterizations above extend to more general classes of polytopes, not necessarily smooth, as we explain in Sects. 4 and 5. Section 6 is devoted to giving a complete proof of these characterizations.
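For small examples the codegree introduced in this section can be found by brute force. The Python sketch below is an illustration under stated assumptions and not a method from the lectures: the polytope is given by integer inequalities $Ax \leq b$, it is assumed to lie in the cube $[0, \mathrm{box}]^n$, and the names `codegree`, `unimodular_simplex`, `box` and `t_max` are ours.

```python
from itertools import product


def codegree(A, b, box, t_max=50):
    """Smallest integer t >= 1 such that t*P has an interior lattice point.

    P = {x : A x <= b} is given by integer inequalities and is assumed to lie
    in the cube [0, box]^n (both conventions are assumptions of this sketch).
    """
    n = len(A[0])
    for t in range(1, t_max + 1):
        # t*P lies in [0, t*box]^n; interior points satisfy A x < t*b strictly.
        for x in product(range(t * box + 1), repeat=n):
            if all(sum(a * xi for a, xi in zip(row, x)) < t * bi
                   for row, bi in zip(A, b)):
                return t
    raise ValueError("no interior lattice point found up to t_max")


def unimodular_simplex(n):
    """Inequalities x_i >= 0 and x_1 + ... + x_n <= 1 describing Delta_n."""
    A = [[-1 if j == i else 0 for j in range(n)] for i in range(n)] + [[1] * n]
    b = [0] * n + [1]
    return A, b


# The unit square [0, 1]^2 in the same inequality format.
SQUARE = ([[-1, 0], [0, -1], [1, 0], [0, 1]], [0, 0, 1, 1])
```

For instance, `codegree(*unimodular_simplex(2), box=1)` returns 3, in agreement with $\mathrm{codeg}(\Delta_n) = n + 1$, and the unit square has codegree 2.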
2 Conventions and Notation

We assume basic knowledge of toric geometry and refer to [EW, FU, ODA] for the necessary background on toric varieties. We will moreover assume some knowledge of projective algebraic geometry. We refer the reader to [HA, FUb] for further details. Throughout this paper, we work over the field of complex numbers $\mathbb{C}$. By a polarized variety we mean a pair $(X, L)$ where $X$ is an algebraic variety and $L$ is an ample line bundle on $X$.
2.1 Toric Geometry

In this note a toric variety $X$ is always assumed to be normal and thus defined by a fan $\Sigma_X \subset N \otimes \mathbb{R}$ for a lattice $N$. By $\Sigma_X(t)$ we will denote the collection of $t$-dimensional cones of $\Sigma_X$. The invariant sub-variety of codimension $t$ associated to a cone $\sigma \in \Sigma(t)$ will be denoted by $V(\sigma)$.

For a lattice $\Gamma$ we set $\Gamma_{\mathbb{R}} = \Gamma \otimes_{\mathbb{Z}} \mathbb{R}$. We denote by $\Gamma^\vee = \mathrm{Hom}(\Gamma, \mathbb{Z})$ the dual lattice. If $\varphi : \Gamma \to \Gamma'$ is a morphism of lattices we denote by $\varphi_{\mathbb{R}} : \Gamma_{\mathbb{R}} \to \Gamma'_{\mathbb{R}}$ the induced $\mathbb{R}$-homomorphism. By a lattice polytope $P \subset \Gamma_{\mathbb{R}}$ we mean a polytope with vertices in $\Gamma$.

Let $P \subset \mathbb{R}^n$ be a lattice polytope of dimension $n$. Consider the graded semigroup $\Pi_P$ generated by $(\{1\} \times P) \cap (\mathbb{N} \times \mathbb{Z}^n)$. The polarized variety $(\mathrm{Proj}(\mathbb{C}[\Pi_P]), \mathcal{O}(1))$ is a toric variety associated to the polytope $P$. It will sometimes be denoted by $(X_P, L_P)$. Notice that the toric variety $X_P$ is defined by the (inner) normal fan of $P$. Vice versa the symbol $P_{(X,L)}$ will denote the lattice polytope associated to a polarized toric variety $(X, L)$.

Two polytopes are said to be normally equivalent if their normal fans are isomorphic. The symbol $\Delta_n$ denotes the smooth (unimodular) simplex of dimension $n$. Recall that an $n$-dimensional polytope is simple if through every vertex pass exactly $n$ edges. A lattice polytope is smooth if it is simple and the primitive vectors of the edges through every vertex form a lattice basis. Smooth polytopes are associated to smooth projective toric varieties. Simple polytopes are associated to $\mathbb{Q}$-factorial projective toric varieties.

When the toric variety is defined via a point configuration $\mathcal{A} \subset \mathbb{Z}^n$ we will use the symbol $(X_{\mathcal{A}}, L_{\mathcal{A}})$ for the associated polarized toric variety and $P_{\mathcal{A}} = \mathrm{Conv}(\mathcal{A})$ for the associated polytope. The corresponding fan is denoted by $\Sigma_{\mathcal{A}}$.
2.2 Vector Bundles

The notion of Chern classes of a vector bundle is an essential tool in some of the proofs. Let $E$ be a vector bundle of rank $k$ over an $n$-dimensional algebraic variety $X$. Recall that the $i$-th Chern class of $E$, $c_i(E)$, is the class of a codimension $i$ cycle on $X$ modulo rational equivalence. The top Chern class of a rank $k \geq n$ vector bundle is $c_n(E)$. The same symbol $c_n(E)$ will be used to denote the degree of the associated zero-dimensional subvariety.

The projectivization of a vector bundle plays a fundamental role throughout these notes. Let $S^l(E)$ denote the $l$-th symmetric power of a rank $r + 1$ vector bundle $E$. The projectivization of $E$ is $\mathbb{P}(E) = \mathrm{Proj}(\bigoplus_{l=0}^{\infty} S^l(E))$. It is a projective bundle with fiber $F = \mathbb{P}(E)_x = \mathbb{P}(E_x) = \mathbb{P}^r$. Let $\pi : \mathbb{P}(E) \to Y$ be the bundle map. There is a line bundle $\xi$ on $\mathbb{P}(E)$, called the tautological line bundle, defined by the property that $\xi|_F \cong \mathcal{O}_{\mathbb{P}^r}(1)$. When $E$ is a vector bundle on a toric variety $Y$ then
the projective bundle $\mathbb{P}(E)$ has the structure of a toric variety if and only if $E = L_1 \oplus \cdots \oplus L_k$ [DRS04, Lemma 1.1]. When the line bundles $L_i$ are ample then the tautological line bundle is also ample. We refer to [FU] for the necessary background on vector bundles and their characteristic classes.
3 Toric Fibrations

Definition 3.1. A toric fibration is a surjective flat map $f : X \to Y$ with connected fibers where
(a) $X$ is a toric variety,
(b) $Y$ is a normal algebraic variety,
(c) $\dim(Y) < \dim(X)$.

Remark 3.2. A surjective morphism $f : X \to Y$ with connected fibers between normal projective varieties induces a homomorphism from the connected component of the identity of the automorphism group of $X$ to the connected component of the identity of the automorphism group of $Y$, with respect to which $f$ is equivariant. It follows that if $f : X \to Y$ is a toric fibration then $Y$ and a general fiber $F$ admit a toric structure with respect to which $f$ becomes an equivariant morphism. Moreover if $X$ is smooth, respectively $\mathbb{Q}$-factorial, then $Y$ and $F$ are also smooth, respectively $\mathbb{Q}$-factorial.

Example 3.3. Let $L_0, \ldots, L_k$ be line bundles over a toric variety $Y$. The total space $\mathbb{P}(L_0 \oplus \cdots \oplus L_k)$ is a toric variety by [DRS04, Lemma 1.1], and the projective bundle $\pi : \mathbb{P}(L_0 \oplus \cdots \oplus L_k) \to Y$ is a toric fibration.
3.1 Combinatorial Characterization

A toric fibration has the following combinatorial characterization, see [EW, Chapter VI] for further details. Let $N \cong \mathbb{Z}^n$ be a lattice, $\Sigma \subset N \otimes \mathbb{R}$ be a fan and $X = X_\Sigma$ the associated toric variety. Let $i : \Delta \hookrightarrow N$ be a sub-lattice.

Proposition 3.4 ([EW]). The inclusion $i$ induces a toric fibration $f : X \to Y$ if and only if:
(a) $\Delta$ is a primitive lattice, i.e. $(\Delta \otimes \mathbb{R}) \cap N = \Delta$.
(b) For every $\sigma \in \Sigma(n)$, $\sigma = \tau + \eta$, where $\tau \in \Delta$ and $\eta \cap \Delta = \{0\}$ (i.e. $\Sigma$ is a split fan).

We briefly outline the construction. The projection $\pi : N \to N/\Delta$ induces a map of fans $\Sigma \to \pi(\Sigma)$ and thus a map of toric varieties $f : X \to Y$. The general fiber
$F$ is a toric variety defined by the fan $\Sigma_F = \{\sigma \in \Sigma \cap \Delta\}$. The invariant varieties $V(\tau)$ in $X$, where $\tau \in \Sigma$ is a maximal-dimensional cone in $\Sigma_F$, are called invariant sections of the fibration. The subvariety $V(\tau)$ is the invariant section passing through the fixed point of $F$ corresponding to the cone $\tau \in \Sigma_F$. Observe that they are all isomorphic, as toric varieties, to $Y$.

Example 3.5. In Example 3.3 let $\Sigma_Y \subset \mathbb{R}^n$ be the fan defining $Y$, and let $D_1, \ldots, D_s$ be the generators of $\mathrm{Pic}(Y)$ associated to the rays $\rho_1, \ldots, \rho_s$. The line bundle $L_i$ can be written as $L_i = \sum_j \phi_i(\rho_j) D_j$ where $\phi_i : \mathbb{R}^n \to \mathbb{R}$ are piecewise linear functions. Let $e_1, \ldots, e_k \in \mathbb{Z}^k$ be a lattice basis and let $e_0 = -e_1 - \cdots - e_k$. One can define a map $\psi : \mathbb{R}^n \to \mathbb{R}^{n+k}$ as

$$\psi(v) = \Big(v, \sum_i \phi_i(v) e_i\Big).$$

Consider now the fan $\Sigma' \subset \mathbb{R}^{n+k}$ given by the image of $\Sigma_Y$ under $\psi$, $\Sigma' = \{\psi(\sigma),\ \sigma \in \Sigma_Y\}$. Let $\Pi \subset \mathbb{Z}^k$ be the fan defining $\mathbb{P}^k$. The fan $\Sigma = \{\sigma' + \eta \mid \sigma' \in \Sigma',\ \eta \in \Pi\}$ is a split fan, defining the toric fibration $\pi : \mathbb{P}(L_0 \oplus \cdots \oplus L_k) \to Y$. See also [ODAb, Proposition 1.33].

Definition 3.6. A polarized toric fibration is a pair $(f : X \to Y, L)$, where $f$ is a toric fibration and $L$ is an ample line bundle on $X$.

Observe that for a general fiber $F$, the pair $(F, L|_F)$ is also a polarized toric variety. It follows that both pairs $(X, L)$ and $(F, L|_F)$ define lattice polytopes $P_{(X,L)}$, $P_{(F,L|_F)}$. The polytope $P_{(X,L)}$ is in fact a “twisted sum” of a finite number of lattice polytopes fibering over $P_{(F,L|_F)}$.

Definition 3.7. Let $R_0, \ldots, R_k \subset M_{\mathbb{R}}$ be lattice polytopes and let $k \geq 1$. Let $\pi : M \to \Lambda$ be a surjective map of lattices such that $\pi_{\mathbb{R}}(R_i) = v_i$ and such that $v_0, \ldots, v_k$ are distinct vertices of $\mathrm{Conv}(v_0, \ldots, v_k)$. We will call a Cayley $\pi$-twisted sum (or simply a Cayley sum) of $R_0, \ldots, R_k$ a polytope which is affinely isomorphic to $\mathrm{Conv}(R_0, \ldots, R_k)$. We will denote it by $[R_0 \star \cdots \star R_k]_\pi$. If the polytopes $R_i$ are additionally normally equivalent, i.e. they define the same normal fan $\Sigma_Y$, we will denote the Cayley sum by $\mathrm{Cayley}(R_0, \ldots, R_k)_{(\pi,Y)}$. We will see that these are the polytopes that are associated to polarized toric fibrations.

Proposition 3.8 ([CDR08]). Let $X = X_\Sigma$ be a toric variety of dimension $n$, where $\Sigma \subset N_{\mathbb{R}} \cong \mathbb{R}^n$, and let $i : \Delta \hookrightarrow N$ be a sublattice. Let $L$ be an ample line bundle
on $X$. Then the inclusion $i$ induces a polarized toric fibration $(f : X \to Y, L)$ if and only if $P_{(X,L)} = \mathrm{Cayley}(R_0, \ldots, R_k)_{(\pi,Y)}$, where $R_0, \ldots, R_k$ are normally equivalent polytopes on $Y$ and $\pi : M \to \Lambda$ is the lattice map dual to $i$.

Proof. We first prove the implication “$\Rightarrow$”. Assume that $i : \Delta \hookrightarrow N$ induces a toric fibration $f : X \to Y$ and consider the polarization $L$ on $X$. We will prove that $P_{(X,L)} = \mathrm{Cayley}(R_0, \ldots, R_k)_{(\pi,Y)}$ for some normally equivalent polytopes $R_0, \ldots, R_k$. Notice first that the fact that $\Delta$ is a primitive sub-lattice of $N$ implies that the dual map $\pi : M \to \Lambda$ is a surjection.

Let $F$ be a general fiber of $f$, and let $S := P_{(F,L|_F)} \subset \Lambda_{\mathbb{R}}$. Denote by $v_0, \ldots, v_k$ the vertices of $S$. Every $v_i$ corresponds to a fixed point of $F$; call $Y_i = V(\tau_i)$ the invariant section of $f$ passing through that point. Note that $\tau_i \in \Sigma_X$, $\dim \tau_i = \dim F$ and $\tau_i \subset \Delta_{\mathbb{R}}$. Let $R_i$ be the face of $P_{(X,L)}$ corresponding to $Y_i$. Observe that $\mathrm{Aff}(\tau_i) = \Delta_{\mathbb{R}}$, so that $\mathrm{Aff}(\tau_i^\perp) = \Delta_{\mathbb{R}}^\perp = \ker \pi_{\mathbb{R}}$. Then there exists $u_i \in M$ such that:
• $\mathrm{Aff}(R_i) + u_i = \ker(\pi_{\mathbb{R}})$;
• $R_i + u_i = P_{(Y_i, L|_{Y_i})}$.
This says that $R_0, \ldots, R_k$ are normally equivalent (because every $Y_i$ is isomorphic to $Y$), and that $\pi_{\mathbb{R}}(R_i)$ is a point. Since the $Y_i$'s are pairwise disjoint, the same holds for the $R_i$'s. If $s$ is the number of fixed points of $Y$, then each $R_i$ has $s$ vertices. On the other hand, we know that $F$ has $(k + 1)$ fixed points, and therefore $X$ must have $s(k + 1)$ fixed points. So $P_{(X,L)}$ has $s(k + 1)$ vertices, namely the union of all vertices of the $R_i$'s. We can conclude that $P_{(X,L)} = \mathrm{Conv}(R_0, \ldots, R_k)$.

Let $D = \sum_{x \in \Sigma(1)} a_x D_x$ be an invariant Cartier divisor on $X$ such that $L = \mathcal{O}_X(D)$. Since $F$ is a general fiber, we have $D_x \cap F \neq \emptyset$ if and only if $x \in \Delta$, and $D|_F = \sum_{x \in \Delta} a_x D_x|_F$. This implies that $\pi_{\mathbb{R}}(R_i) = v_i$ and $\pi_{\mathbb{R}}(P_{(X,L)}) = S$. We conclude that $P_{(X,L)} = \mathrm{Cayley}(R_0, \ldots, R_k)_{(\pi,Y)}$.

We now show the other direction: “$\Leftarrow$”.
Assume that $P_{(X,L)} = \mathrm{Cayley}(R_0, \ldots, R_k)_{(\pi,Y)}$. We will prove that the associated polarized toric variety is a polarized toric fibration. First observe that the fact that the dual map $\pi$ is a surjection implies that the sublattice $\Delta$ is primitive. Since $v_i$ is a vertex of $\pi_{\mathbb{R}}(P_{(X,L)})$, $R_i$ is a face of $P_{(X,L)}$ for every $i = 0, \ldots, k$. Let $Y$ be the projective toric variety defined by the polytopes $R_i$. Observe that $\mathrm{Aff}(R_i)$ is a translate of $\ker \pi_{\mathbb{R}}$, and $(\ker \pi)^\vee = N/\Delta$. So the fan $\Sigma_Y$ is contained in $(N/\Delta)_{\mathbb{R}}$.

Let $\nu \in \Sigma_Y(\dim(Y))$ and for every $i = 0, \ldots, k$ let $w_i$ be the vertex of $R_i$ corresponding to $\nu$. We will show that $Q := \mathrm{Conv}(w_0, \ldots, w_k)$ is a face of $P_{(X,L)}$. Observe first that $(\pi_{\mathbb{R}})|_{\mathrm{Aff}(Q)} : \mathrm{Aff}(Q) \to \Lambda_{\mathbb{R}}$ is bijective. Let $H$ be the linear subspace of $M_{\mathbb{R}}$ which is a translate of $\mathrm{Aff}(Q)$. Then we have $M_{\mathbb{R}} = H \oplus \ker \pi_{\mathbb{R}}$. Dually $N_{\mathbb{R}} = \Delta_{\mathbb{R}} \oplus H^\perp$, where $H^\perp$ projects isomorphically onto $(N/\Delta)_{\mathbb{R}}$.
Let $u \in H^\perp$ be such that its image in $(N/\Delta)_{\mathbb{Q}}$ is contained in the interior of $\nu$. Then for every $i = 0, \ldots, k$ we have that (see [FU, §1.5]):

$$(u, x) \geq (u, w_i) \ \text{for every } x \in R_i, \qquad (u, x) = (u, w_i) \ \text{if and only if } x = w_i.$$

Moreover $u$ is constant on $\mathrm{Aff}(Q)$, namely there exists $m_0 \in \mathbb{Q}$ such that $(u, z) = m_0$ for every $z \in Q$. Any $z \in P$ can be written as $z = \sum_{i=1}^{l} \lambda_i z_i$, with $z_i \in R_i$, $\lambda_i \geq 0$ and $\sum_{i=1}^{l} \lambda_i = 1$. Then

$$(u, z) = \sum_{i=1}^{l} \lambda_i (u, z_i) \geq \sum_{i=1}^{l} \lambda_i (u, w_i) = \sum_{i=1}^{l} \lambda_i m_0 = m_0.$$

Moreover, $(u, z) = m_0$ if and only if $(u, z_i) = (u, w_i)$ for every $i$ such that $\lambda_i > 0$. This happens if and only if $z \in Q$, implying that $Q$ is a face of $P_{(X,L)}$.

Let $\sigma \in \Sigma_X$ be a cone of maximal dimension, and let $w$ be the corresponding vertex of $P_{(X,L)}$. Then $\pi(w)$ is a vertex, say $v_1$, of $\pi_{\mathbb{Q}}(P_{(X,L)})$ and hence $w$ lies in $R_1$. Since $R_1$ is also a face of $P_{(X,L)}$, $w$ is a vertex of $R_1$ and hence it corresponds to a maximal dimensional cone $\nu$ in $\Sigma_Y$. In each $R_i$, consider the vertex $w_i$ corresponding to the same cone $\nu$ of $\Sigma_Y$. We set $w_1 = w$. We have shown that $Q := \mathrm{Conv}(w_0, \ldots, w_k)$ is a face of $P_{(X,L)}$, and $w = Q \cap R_1$. Now call $\tau$ and $\eta$ the cones of $\Sigma_X$ corresponding respectively to $R_1$ and $Q$. It follows that $\sigma = \tau + \eta$, $\tau \subset \Delta_{\mathbb{Q}}$, and $\eta \cap \Delta_{\mathbb{Q}} = \{0\}$. This concludes the proof. □

The previous proof shows the following corollary.

Corollary 3.9. Let $(f : X \to Y, L)$ be a polarized toric fibration and let $P_{(X,L)} = \mathrm{Cayley}(R_0, \ldots, R_k)_{(\pi,Y)}$ be the associated polytope. Let $F$ be a general fiber of the fibration, $Y_0, \ldots, Y_k$ be the invariant sections and $\pi(R_i) = v_i$. The following holds.
(a) The polarized toric variety $(F, L|_F)$ corresponds to the polytope $P_{(F,L|_F)} = \mathrm{Conv}(v_0, \ldots, v_k)$.
(b) The polarized toric varieties $(Y_i, L|_{Y_i})$ correspond to the polytopes $R_0 - u_0, \ldots, R_k - u_k$, where $u_i \in M$ is a point such that $\pi(u_i) = \pi(R_i)$.

Example 3.10. The toric surface obtained by blowing up $\mathbb{P}^2$ at a fixed point has the structure of a toric fibration, $\mathbb{P}(\mathcal{O}_{\mathbb{P}^1} \oplus \mathcal{O}_{\mathbb{P}^1}(1)) \to \mathbb{P}^1$. It is often referred to as the Hirzebruch surface $\mathbb{F}_1$. Consider the polarization given by the tautological line bundle $\xi = 2\phi^*(\mathcal{O}_{\mathbb{P}^2}(1)) - E$ where $\phi$ is the blow-up map and $E$ is the exceptional divisor. The associated polytope is $P = \mathrm{Cayley}(\Delta_1, 2\Delta_1)$, see the figure below.
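[Figure: the polytope $P = \mathrm{Cayley}(\Delta_1, 2\Delta_1)$, with the edge $\Delta_1$ lying over the vertex $v_0$ and the edge $2\Delta_1$ lying over the vertex $v_1$.]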
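The polytope of Example 3.10 can be assembled directly from Definition 3.7. The following Python sketch is only an illustration: the helper names `cayley_sum` and `project` are ours, and we assume the convention that $R_0$ is placed over the origin and $R_i$ over the $i$-th standard basis vector of $\Lambda = \mathbb{Z}^k$.

```python
def cayley_sum(polytopes):
    """Vertices of Conv(R_0 x {v_0}, ..., R_k x {v_k}) with v_0 = 0 and v_i = e_i.

    Each polytope is given by its list of vertices (tuples of integers).
    The placement over 0, e_1, ..., e_k is a convention of this sketch.
    """
    k = len(polytopes) - 1
    lifted = []
    for i, R in enumerate(polytopes):
        # v_0 is the zero vector; v_i is the i-th standard basis vector of Z^k.
        v = tuple(1 if j == i - 1 else 0 for j in range(k))
        lifted.extend(tuple(r) + v for r in R)
    return lifted


def project(points, k):
    """Image of the points under the projection onto the last k coordinates."""
    return {p[len(p) - k:] for p in points}


# Example 3.10: R_0 = Delta_1 and R_1 = 2 * Delta_1, both segments in Z^1.
P = cayley_sum([[(0,), (1,)], [(0,), (2,)]])
```

Here `P` consists of the four points $(0,0), (1,0), (0,1), (2,1)$, its projection is the pair of vertices of $\Delta_1$, and the number of vertices is $s(k+1) = 2 \cdot 2$, as in the count used in the proof of Proposition 3.8.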
Remark 3.11. The following are important classes of polarized toric fibrations, relevant both in Combinatorics and Algebraic Geometry.

Projective Bundles. When $\pi(P) = \Delta_t$ the polytope $\mathrm{Cayley}(R_0, \ldots, R_t)_{(\pi,Y)}$ defines the polarized toric fibration $(\mathbb{P}(L_0 \oplus \cdots \oplus L_t) \to Y, \xi)$, where the $L_i$ are ample line bundles on the toric variety $Y$ and $\xi$ is the tautological line bundle. In particular $L|_F = \mathcal{O}_{\mathbb{P}^t}(1)$. These fibrations play an important role in the theory of discriminants and resultants of polynomial systems. See Sect. 4 for more details.

Mori Fibrations. When $\pi(P)$ is a simplex (not necessarily smooth) the Cayley polytope $\mathrm{Cayley}(R_0, \ldots, R_k)_{(\pi,Y)}$ defines a Mori fibration, i.e. a surjective flat map onto a $\mathbb{Q}$-factorial toric variety whose generic fiber is reduced and has Picard number one. Fibrations of this type are important building blocks in the Minimal Model Program for toric varieties. See [CDR08] and [Re83] for more details.

$\mathbb{P}^k$-Bundles. When $\pi(P) = k\Delta_t$ then again the variety has the structure of a $\mathbb{P}^t$-fibration whose general fiber $F$ is embedded as a $k$-Veronese variety: $(F, L|_F) = (\mathbb{P}^t, \mathcal{O}_{\mathbb{P}^t}(k))$. These fibrations arise in the study of $k$-th toric duality, see [DDRP12].

In the polarized toric fibration $(\mathbb{P}(L_0 \oplus \cdots \oplus L_t), \xi)$ the fibers are embedded as linear spaces. For this reason the associated Cayley polytopes $\mathrm{Cayley}(R_0, \ldots, R_t)_{(\pi,Y)}$ can be referred to as linear toric fibrations.

Remark 3.12. For general Cayley sums $[R_0 \star \cdots \star R_k]_\pi$ one has the following geometrical interpretation. Let $(X, L)$ be the associated polarized toric variety and let $Y$ be the toric variety defined by the Minkowski sum $R_0 + \cdots + R_k$. The fan defining $Y$ is a refinement of the normal fans of the $R_i$ for $i = 0, \ldots, k$. Consider the associated birational maps $\phi_i : Y \to Y_i$, where $(Y_i, L_i)$ is the polarized toric variety defined by the polytope $R_i$. The line bundles $H_i = \phi_i^*(L_i)$ are nef line bundles on $Y$ and the polytopes $P_{(Y,H_i)}$ are affinely isomorphic to $R_i$. In particular $[R_0 \star \cdots \star R_k]_\pi$ is the polytope defined by the tautological line bundle $\xi$ on the toric fibration $\mathbb{P}(H_0 \oplus \cdots \oplus H_k) \to Y$. Notice that in this case the line bundle $\xi$ may not be ample.
If we want to relate $[R_0 \star \cdots \star R_k]_\pi$ to a polarized toric fibration we need to enlarge the polytopes $R_i$ in order to get an ample tautological line bundle. Consider the polytopes $P_i = P_{(Y,H_i)} + \sum_{j=0}^{k} R_j$. The normal fan of $P_i$ is isomorphic to the fan defining the common resolution $Y$, for $i = 0, \ldots, k$. Hence the polytopes $P_i$ are normally equivalent. Let $(Y, M_i)$ be the polarized toric variety associated to the polytope $P_i$. One can then define the Cayley sum $\mathrm{Cayley}(P_0, \ldots, P_k)_{(\pi,Y)}$, whose normal fan is in fact a refinement of the one defining $[R_0 \star \cdots \star R_k]_\pi$. Let $(\mathbb{P}(M_0 \oplus \cdots \oplus M_k) \to Y, \xi)$ be the polarized toric fibration associated to $\mathrm{Cayley}(P_0, \ldots, P_k)_{(\pi,Y)}$. There is a birational morphism $\psi : \mathbb{P}(M_0 \oplus \cdots \oplus M_k) \to X$.

Example 3.13. Consider the polytopes $R_0 = \Delta_2$, $R_1 = \Delta_1 \times \Delta_1$ in $\mathbb{Q}^2$. Consider the projection onto the first component $\pi : \mathbb{Z}^3 \to \mathbb{Z}$ and $P = \mathrm{Conv}(R_0 \times \{0\}, R_1 \times \{1\})$. The polytope $P$ is then isomorphic to $[R_0 \star R_1]_\pi$, and $\pi_{\mathbb{Q}}(P) = \Delta_1$. The common refinement defined by $R_0 + R_1$ is the fan of the blow-up of $\mathbb{P}^2$ at two fixed points, $\phi : Y \to \mathbb{P}^2$. The polytope $P_0$ defines the polarized toric variety $(Y, \phi^*(\mathcal{O}_{\mathbb{P}^2}(4)) - E_1 - E_2)$ and the polytope $P_1$ the pair $(Y, \phi^*(\mathcal{O}_{\mathbb{P}^2}(5)) - 2E_1 - 2E_2)$, where $E_i$ are the exceptional divisors. The polarized toric fibration $(\mathbb{P}(M_0 \oplus M_1) \to Y, \xi)$ is then

$$\big(\mathbb{P}([\phi^*(\mathcal{O}_{\mathbb{P}^2}(4)) - E_1 - E_2] \oplus [\phi^*(\mathcal{O}_{\mathbb{P}^2}(5)) - 2E_1 - 2E_2]) \to Y,\ \xi\big).$$
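[Figure: the polytopes $R_0$, $R_1$, $P_0$ and $P_1$, the Minkowski sum $R_0 + R_1$, and the Cayley sums $[R_0 \star R_1]_\pi$ and $\mathrm{Cayley}(P_0, P_1)_{(\pi,Y)}$.]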
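The Minkowski sums appearing in Remark 3.12 and Example 3.13 are easy to compute in the plane. The sketch below is an illustration only: the helper names `convex_hull` and `minkowski_sum` are ours, the hull is computed with the standard monotone chain method, and the coordinates chosen for $R_0 = \Delta_2$ and $R_1 = \Delta_1 \times \Delta_1$ are an assumption.

```python
def convex_hull(points):
    """Vertices of the convex hull of planar points (monotone chain, counter-clockwise)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when the turn o -> a -> b is counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each chain, since it repeats the start of the other.
    return lower[:-1] + upper[:-1]


def minkowski_sum(P, Q):
    """Vertices of the Minkowski sum of two planar polytopes given by vertices."""
    return convex_hull([(p[0] + q[0], p[1] + q[1]) for p in P for q in Q])


R0 = [(0, 0), (1, 0), (0, 1)]            # Delta_2
R1 = [(0, 0), (1, 0), (1, 1), (0, 1)]    # Delta_1 x Delta_1
SUM = minkowski_sum(R0, R1)              # polytope of the common refinement Y
P0 = minkowski_sum(R0, SUM)              # P_i = R_i + (R_0 + R_1)
P1 = minkowski_sum(R1, SUM)
```

With these coordinates `SUM` is a pentagon, matching the five fixed points of the blow-up of $\mathbb{P}^2$ at two fixed points, and `P0` and `P1` are again pentagons, as expected for normally equivalent polytopes.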
3.2 Historical Remark

The definition of a Cayley polytope originated from what is “classically” referred to as the Cayley trick, in connection with the resultant and discriminant of a system of polynomials. A system of $n$ polynomials in $n$ variables $x = (x_1, \ldots, x_n)$, $f_1(x), \ldots, f_n(x)$, is supported on $(\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_n)$, where $\mathcal{A}_i \subset \mathbb{Z}^n$, if $f_i = \sum_{a_j \in \mathcal{A}_i} c_j x^{a_j}$. The $(\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_n)$-resultant is a polynomial $R(\ldots, c_j, \ldots)$ in the coefficients $c_j$, which vanishes whenever the corresponding polynomials have a common zero. The discriminant of a finite subset $\mathcal{A} \subset \mathbb{Z}^n$, $\Delta_{\mathcal{A}}$, is also a polynomial $\Delta_{\mathcal{A}}(\ldots, c_j, \ldots)$ in the variables $c_j$, which vanishes whenever the corresponding
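polynomial supported on $\mathcal{A}$, $f = \sum_{a_j \in \mathcal{A}} c_j x^{a_j}$, has a singularity in the torus $(\mathbb{C}^*)^n$.

Theorem 3.14 ([GKZ] Cayley Trick). The $(\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_n)$-resultant equals the $\mathcal{A}$-discriminant where

$$\mathcal{A} = (\mathcal{A}_1 \times \{0\}) \cup (\mathcal{A}_2 \times \{e_1\}) \cup \cdots \cup (\mathcal{A}_n \times \{e_{n-1}\}) \subset \mathbb{Z}^{2n-1},$$

where $(e_1, \ldots, e_{n-1})$ is a lattice basis for $\mathbb{Z}^{n-1}$.

Let $R_i = N(f_i) \subset \mathbb{R}^n$ be the Newton polytopes of the polynomials $f_i$ supported on $\mathcal{A}_i$. The Newton polytope of the polynomial $f$ supported on $\mathcal{A}$ is the Cayley sum $N(f) = [R_1 \star \cdots \star R_n]_\pi$, where $\pi : \mathbb{Z}^{2n-1} \to \mathbb{Z}^{n-1}$ is the natural projection such that $\pi_{\mathbb{R}}([R_1 \star \cdots \star R_n]_\pi) = \Delta_{n-1}$.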
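The point configuration $\mathcal{A}$ of Theorem 3.14 is the support of the auxiliary polynomial $f = f_1 + y_1 f_2 + \cdots + y_{n-1} f_n$ in the extra variables $y_1, \ldots, y_{n-1}$. The following Python sketch builds this polynomial from the $f_i$; it is an illustration only, with polynomials stored as dictionaries from exponent vectors to coefficients, and the name `cayley_polynomial` and the toy input are ours.

```python
def cayley_polynomial(polys):
    """Return f = f_1 + y_1*f_2 + ... + y_{n-1}*f_n as {exponent: coefficient}.

    Each f_i is a dict mapping exponent tuples in the x-variables to
    coefficients. The exponent of the auxiliary variables y is appended to the
    exponent in x, so the keys of the result form the configuration
    (A_1 x {0}) U (A_2 x {e_1}) U ... U (A_n x {e_{n-1}}).
    """
    n = len(polys)
    f = {}
    for i, f_i in enumerate(polys):
        # f_1 gets y-exponent 0, and f_{i+1} gets the basis vector e_i.
        e = tuple(1 if j == i - 1 else 0 for j in range(n - 1))
        for a, c in f_i.items():
            f[tuple(a) + e] = c
    return f


# Toy data in one x-variable: f_1 = 1 + 2x and f_2 = 3 + 4x^2.
f = cayley_polynomial([{(0,): 1, (1,): 2}, {(0,): 3, (2,): 4}])
```

In this toy case the support of `f` is $\{(0,0), (1,0), (0,1), (2,1)\}$, the vertex set of the Cayley sum of the two Newton segments.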
4 Toric Discriminants and Toric Fibrations

The term “discriminant” is well known in relation with low degree equations or ordinary differential equations. We will study discriminants of polynomials in $n$ variables with prescribed monomials, i.e. polynomials whose exponents are given by lattice points in $\mathbb{Z}^n$. Polynomials in $n$ variables describe locally the hyperplane sections of a projective $n$-dimensional algebraic variety, $\phi : X \hookrightarrow \mathbb{P}^m$. The monomials are prescribed by the local representation of a basis of the vector space of global sections $H^0(X, \phi^*(\mathcal{O}_{\mathbb{P}^m}(1)))$. For this reason the term discriminant has also been classically used in Algebraic Geometry. In what follows we will describe discriminants from a combinatorial and an algebraic geometric perspective. The two points of view coincide when the projective embedding is toric.
4.1 The $\mathcal{A}$-Discriminant

Let $\mathcal{A} = \{a_0, \ldots, a_m\}$ be a subset of $\mathbb{Z}^n$. The discriminant of $\mathcal{A}$ (when it exists) is an irreducible homogeneous polynomial $\Delta_{\mathcal{A}}(c_0, \ldots, c_m)$ vanishing when the corresponding Laurent polynomial supported on $\mathcal{A}$, $f(x) = \sum_{a_i \in \mathcal{A}} c_i x^{a_i}$, has at least one singularity in the torus $(\mathbb{C}^*)^n$. Geometrically, the zero-locus of the discriminant is an irreducible algebraic variety of codimension one in the dual projective space $(\mathbb{P}^m)^\vee$, called the dual variety of the embedding $X_{\mathcal{A}} \hookrightarrow \mathbb{P}^m$.
Example 4.1. Consider the point configuration $\mathcal{A} = \{(0,0), (1,0), (0,1), (1,1)\} \subset \mathbb{Z}^2$. The discriminant is given by a homogeneous polynomial $\Delta_{\mathcal{A}}(a_0, a_1, a_2, a_3)$ vanishing whenever the quadric $a_0 + a_1 x + a_2 y + a_3 xy$ has a singular point in $(\mathbb{C}^*)^2$. It is well known that this locus corresponds to singular $2 \times 2$ matrices and it is thus described by the vanishing of the determinant: $\Delta_{\mathcal{A}}(a_0, a_1, a_2, a_3) = a_0 a_3 - a_1 a_2$. Similarly, one can associate the polynomials supported on $\mathcal{A}$ with local expansions of global sections in $H^0(\mathbb{P}^1 \times \mathbb{P}^1, \mathcal{O}(1,1))$ defining the Segre embedding of $\mathbb{P}^1 \times \mathbb{P}^1$ in $\mathbb{P}^3$.
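The determinantal description in Example 4.1 can be checked by solving the critical equations directly. The Python sketch below is an illustration with a function name of our choosing; it assumes integer coefficients and simply returns `None` in the degenerate cases where the critical point would leave the torus.

```python
from fractions import Fraction


def singular_point(a0, a1, a2, a3):
    """Singular point of f = a0 + a1*x + a2*y + a3*x*y in the torus, or None.

    Coefficients are assumed to be integers. Degenerate cases in which the
    critical point has a zero coordinate (or does not exist) return None.
    """
    if 0 in (a1, a2, a3):
        return None
    # df/dx = a1 + a3*y = 0 and df/dy = a2 + a3*x = 0 have a unique solution.
    x, y = Fraction(-a2, a3), Fraction(-a1, a3)
    value = a0 + a1 * x + a2 * y + a3 * x * y
    # The critical point is a singular point of {f = 0} only if f vanishes there.
    return (x, y) if value == 0 else None
```

At the critical point the value of $f$ is $a_0 - a_1 a_2 / a_3$, so a singular point in the torus exists exactly when $a_0 a_3 - a_1 a_2 = 0$. For example, the coefficients $(6, 2, 3, 1)$ give the singular point $(-3, -2)$, while $(1, 1, 1, 2)$ give none.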
Example 4.2. The 2-Veronese embedding $\nu_2 : \mathbb{P}^2 \hookrightarrow \mathbb{P}^5$ defined by the global sections of the line bundle $\mathcal{O}_{\mathbb{P}^2}(2)$ can be associated to the point configuration

$$\mathcal{A} = \{a_0, a_1, a_2, a_3, a_4, a_5\} = \{(0,0), (0,1), (1,0), (0,2), (1,1), (2,0)\}.$$

A simple computation shows that

$$\Delta_{\mathcal{A}} = \det \begin{pmatrix} c_0 & c_1 & c_2 \\ c_1 & c_3 & c_4 \\ c_2 & c_4 & c_5 \end{pmatrix}.$$
Projective duality is a classical subject in algebraic geometry. Given an embedding $i : X \hookrightarrow \mathbb{P}^m$ of an $n$-dimensional algebraic variety, the dual variety $X^\vee \subset (\mathbb{P}^m)^\vee$ is defined as the Zariski closure of the set of all hyperplanes $H \subset \mathbb{P}^m$ tangent to $X$ at some nonsingular point. We can speak of a defining homogeneous polynomial $\Delta(c_0, \ldots, c_m)$, and thus of a discriminant, only when the dual variety has codimension one. Embeddings whose dual variety has higher codimension are called dually defective and the discriminant is set to be 1. Finding formulas for the discriminant and giving a classification of the embeddings with discriminant 1 is a long-standing problem in algebraic geometry. In the case of a toric embedding defined by a point configuration, $X_{\mathcal{A}} \hookrightarrow \mathbb{P}^{|\mathcal{A}|-1}$, the problem is equivalent to finding formulas for the discriminant $\Delta_{\mathcal{A}}$ and giving a classification of the
dually defective point configurations, i.e. the point configurations with discriminant $\Delta_{\mathcal{A}} = 1$. Dickenstein-Sturmfels [DS02] characterized the case when $m = n + 2$; Cattani-Curran [CC07] extended the classification to $m = n + 3, n + 4$. In these cases the corresponding embedding is possibly very singular and the methods used are purely combinatorial. In [DiR06] and [CDR08] we completely characterize the case when $P_{\mathcal{A}} = \mathrm{Conv}(\mathcal{A})$ is smooth or simple. The latter characterization relies on tools from Algebraic Geometry which will be explained in the next paragraph.
4.2 The Dual Variety of a Projective Variety

The dual variety corresponds to the locus of singular hyperplane sections of a given embedding. By requiring the singularity to be of a given order, one can define more general dual varieties. Singularities of fixed multiplicity $k$ correspond to hyperplanes tangent “to the order $k$.” Consider an embedding $i : X \hookrightarrow \mathbb{P}^m$ of an $n$-dimensional variety, defined by the global sections of the line bundle $\mathcal{L} = i^*(\mathcal{O}_{\mathbb{P}^m}(1))$. For any smooth point $x$ of the embedded variety let

$$\mathrm{jet}_x^k : H^0(X, \mathcal{L}) \to H^0(X, \mathcal{L} \otimes \mathcal{O}_X / \mathfrak{m}_x^{k+1})$$

be the map assigning to a global section $s$ in $H^0(X, \mathcal{L})$ the tuple $\mathrm{jet}_x^k(s) = (s(x), \ldots, (\partial^t s / \partial x^t)(x), \ldots)_{t \leq k}$, where $x = (x_1, \ldots, x_n)$ is a local system of coordinates around $x$. The $k$-th osculating space at $x$ is defined as $\mathrm{Osc}_x^k = \mathbb{P}(\mathrm{Im}(\mathrm{jet}_x^k))$. As the map $\mathrm{jet}_x^1$ is surjective, the first osculating space is always isomorphic to $\mathbb{P}^n$ and it is classically called the projective tangent space. The jet maps of higher order do not necessarily have maximal rank and thus the dimension of the osculating spaces of order bigger than 1 can vary. The embeddings admitting osculating spaces of maximal dimension at every point are called $k$-jet spanned.

Definition 4.3. A line bundle $\mathcal{L}$ on $X$ is called $k$-jet spanned at $x$ if the map $\mathrm{jet}_x^k$ is surjective. It is called $k$-jet spanned if it is $k$-jet spanned at every smooth point $x \in X$.

Example 4.4. A line bundle $\mathcal{L} = \mathcal{O}_{\mathbb{P}^n}(a)$ on $\mathbb{P}^n$ is $k$-jet spanned for all $a \geq k$. In fact the map $\mathrm{jet}_x^k : H^0(\mathbb{P}^n, \mathcal{O}_{\mathbb{P}^n}(a)) \to J_k(\mathcal{O}_{\mathbb{P}^n}(a))_x$ is surjective for all $x \in \mathbb{P}^n$, as a local basis of the global sections of $\mathcal{O}_{\mathbb{P}^n}(a)$ consists of all the monomials in $n$ variables of degree up to $a$ and we are assuming $a \geq k$.
Example 4.5. Let $L$ be a line bundle on a nonsingular toric variety $X$. Then the following statements are equivalent, see [DiR01]:
(a) $L$ is $k$-jet spanned.
(b) All the edges of $P_L$ have length at least $k$.
(c) $L \cdot C \geq k$ for every invariant curve $C$ on $X$.

As an example consider the polytope $P$ in the figure below. The associated toric embedding is the embedding of the blow-up of $\mathbb{P}^2$ at the three fixed points, $\phi : X \to \mathbb{P}^2$, defined by the anticanonical line bundle $\phi^*(\mathcal{O}_{\mathbb{P}^2}(3)) - E_1 - E_2 - E_3$. Here $E_i$ denote the exceptional divisors. The embedded variety is a Del Pezzo surface of degree 6. Let $F$ be the set of the 6 fixed points on $X$ and $E = \{\phi^*(\mathcal{O}_{\mathbb{P}^2}(1)) - E_i - E_j,\ i \neq j\} \cup \{E_1, E_2, E_3\}$ be the set of invariant curves. The osculating spaces can easily be seen to be:

$$\mathrm{Osc}_p^2 = \begin{cases}
\mathbb{P}^3 = \langle \mathrm{jet}_p^2(1), \mathrm{jet}_p^2(x), \mathrm{jet}_p^2(y), \mathrm{jet}_p^2(xy) \rangle & \text{if } p \in F, \\
\mathbb{P}^4 = \langle \mathrm{jet}_p^2(1), \mathrm{jet}_p^2(x), \mathrm{jet}_p^2(y), \mathrm{jet}_p^2(xy), \mathrm{jet}_p^2(x^2 y) \rangle & \text{if } p \in E \setminus F, \\
\mathbb{P}^5 = \langle \mathrm{jet}_p^2(1), \mathrm{jet}_p^2(x), \mathrm{jet}_p^2(y), \mathrm{jet}_p^2(xy), \mathrm{jet}_p^2(x^2 y), \mathrm{jet}_p^2(x y^2) \rangle & \text{at a general point } p \in X \setminus E.
\end{cases}$$

The embedding defined by $P$ is not 2-jet spanned on the whole $X$. It is 2-jet spanned at every point in $X \setminus E$.

[Figure: the polytope $P$.]
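Criterion (b) of Example 4.5 is straightforward to test for lattice polygons. The Python sketch below is an illustration only: the function name `max_jet_order` is ours, and the hexagon coordinates are one possible choice for the polytope of the degree 6 Del Pezzo surface, assumed here since the figure is not reproduced.

```python
from math import gcd


def max_jet_order(vertices):
    """Largest k such that every edge of the lattice polygon has lattice length >= k.

    `vertices` lists the vertices in cyclic order. By criterion (b) of
    Example 4.5 this is the largest k for which the line bundle is k-jet
    spanned, provided the polygon is smooth.
    """
    m = len(vertices)
    return min(
        gcd(abs(vertices[(i + 1) % m][0] - vertices[i][0]),
            abs(vertices[(i + 1) % m][1] - vertices[i][1]))
        for i in range(m)
    )


# One choice of coordinates for the hexagon of the degree 6 Del Pezzo surface.
hexagon = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2), (0, 1)]
# The triangle 2 * Delta_2, corresponding to (P^2, O(2)).
double_triangle = [(0, 0), (2, 0), (0, 2)]
```

All edges of the hexagon have lattice length 1, so the anticanonical embedding is 1-jet spanned but not 2-jet spanned, in agreement with the osculating spaces listed above. The doubled triangle has all edges of length 2, in agreement with Example 4.4.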
Definition 4.6. A hyperplane $H \subset \mathbb{P}^m$ is tangent at $x$ to the order $k$ if it contains the $k$-th osculating space at $x$: $\mathrm{Osc}_x^k \subset H$.

Definition 4.7. The $k$-th order dual variety $X^k$ is:

$$X^k = \{H \in (\mathbb{P}^m)^\vee \text{ tangent to the order } k \text{ to } X \text{ at some nonsingular point}\}.$$

Notice that $X^1 = X^\vee$ and that $X^2$ is contained in the singular locus of $X^\vee$. General properties of the higher order dual variety have been studied by S. Kleiman and R. Piene. Because the definition is related to local osculating properties and generation of jets, it is useful to introduce the sheaf of jets, $J_k(\mathcal{L})$, associated to a polarized variety $(X, \mathcal{L})$. In the classical literature it is sometimes referred to as the sheaf of principal parts. Consider the projections $\pi_i : X \times X \to X$ and let $I_{\Delta_X}$ be the ideal sheaf of the diagonal in $X \times X$. The sheaf of $k$-th order jets of the line bundle $\mathcal{L}$ is defined as
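$$J_k(\mathcal{L}) = \pi_{2*}\big(\pi_1^*(\mathcal{L}) \otimes (\mathcal{O}_{X \times X} / I_{\Delta_X}^{k+1})\big).$$

When the variety $X$ is smooth $J_k(\mathcal{L})$ is a vector bundle of rank $\binom{n+k}{n}$, called the $k$-jet bundle.

Example 4.8. If $\mathcal{L} \neq \mathcal{O}_X$ is a globally generated line bundle then $J_k(\mathcal{L})$ splits as a sum of line bundles only if $X = \mathbb{P}^n$ and $\mathcal{L} = \mathcal{O}_{\mathbb{P}^n}(a)$. In fact:

$$J_k(\mathcal{O}_{\mathbb{P}^n}(a)) = \bigoplus_{1}^{\binom{n+k}{k}} \mathcal{O}_{\mathbb{P}^n}(a - k).$$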
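See [DRS01] for more details.

It is important to note that when the map $\mathrm{jet}^k_x$ is surjective for all smooth points $x$, properties of the higher dual variety $X^k$ can be related to the vanishing of Chern classes of the associated $k$-th jet bundle $J_k(\mathcal{L})$. We start by identifying the $k$-th dual variety with a projection of the conormal bundle. Let $X$ be a smooth algebraic variety and let $\mathcal{L}$ be a $k$-jet spanned line bundle on $X$. Consider the following commutative diagram.

$$
\begin{array}{ccccccccc}
 & & & & & & 0 & & \\
 & & & & & & \downarrow & & \\
 & & & & & & S^{k+1}\Omega^1_X \otimes \mathcal{L} & & \\
 & & & & & & \downarrow & & \\
 & & & & & & J_{k+1}(\mathcal{L}) & & \\
 & & & & & & \downarrow & & \\
0 & \to & K_k & \xrightarrow{\ \beta_k\ } & X \times H^0(X, \mathcal{L}) & \xrightarrow{\ \mathrm{jet}_k\ } & J_k(\mathcal{L}) & \to & 0 \\
 & & & & & & \downarrow & & \\
 & & & & & & 0 & &
\end{array}
\qquad (1)
$$

The diagram is completed by the two diagonal maps $\mathrm{jet}_{k+1} : X \times H^0(X, \mathcal{L}) \to J_{k+1}(\mathcal{L})$ and $II_k : K_k \to S^{k+1}\Omega^1_X \otimes \mathcal{L}$.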
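The vertical exact sequence is often called the $k$-jet sequence. The map $\mathrm{jet}_k$ is defined as $\mathrm{jet}_k(s, x) = \mathrm{jet}^k_x(s)$. The vector bundle $K_k$ is the kernel of the map $\mathrm{jet}_k$ (which has maximal rank!). The induced map $II_k^\vee$ can be identified with the dual of the $k$-th fundamental form. See [L94, GH79] for more details. By dualizing the map $\beta_k$ and projectivizing the corresponding vector bundles one gets the following maps:

$$
X \xleftarrow{\ \pi_k\ } \mathbb{P}(K_k^\vee) \xrightarrow{\ \alpha_k\ } (\mathbb{P}^m)^\vee,
\qquad \mathbb{P}(K_k^\vee) \hookrightarrow X \times \mathbb{P}(H^0(X, \mathcal{L})^\vee),
\qquad (2)
$$

where $\pi_k$ and $\alpha_k$ are the restrictions of the projections $pr_1$ and $pr_2$ of $X \times \mathbb{P}(H^0(X, \mathcal{L})^\vee)$ onto its two factors.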
It is straightforward to see that $X^k = \mathrm{Im}(\alpha_k)$. A simple dimension count shows that when the map $\mathrm{jet}_k$ has maximal rank one expects the codimension of the $k$-th dual variety to be $\mathrm{codim}(X^k) = \binom{n+k}{k} - n$. Notice that this is equivalent to requiring that the map $\alpha_k$ is generically finite. When the codimension is higher than the expected one the embedding is said to be $k$-th dually defective. The commutativity of diagram (1) has the following useful consequence.

Lemma 4.9. Let $(X, \mathcal{L})$ be a polarized variety, where $X$ is smooth and the line bundle $\mathcal{L}$ is $(k+1)$-jet spanned. Then the dual variety $X^k$ has the expected dimension.

Proof. We follow diagram (1). Because the line bundle $\mathcal{L}$ is $(k+1)$-jet spanned the map $II_k$ is surjective. This means that for every $x \in X$ and for every monomial $x_1^{t_1} \cdots x_n^{t_n}$ with $\sum t_i = k+1$ there is a hyperplane section that locally around $x$ is defined as

$$C\, x_1^{t_1} \cdots x_n^{t_n} + \text{higher order terms} = 0, \quad \text{where } C \neq 0.$$

In other words, hyperplanes tangent at a point $x$ to the order $k$ are in one-to-one correspondence with elements of the linear system $|\mathcal{O}_{\mathbb{P}^{n-1}}(k+1)|$. The map $\alpha_k$ having positive dimensional fibers is equivalent to saying that hyperplanes tangent at a point $x$ to the order $k$ are also tangent at nearby points $y \neq x$, which in turn implies that the linear system $|\mathcal{O}_{\mathbb{P}^{n-1}}(k+1)|$ has base points. This is a contradiction as the linear system is $(k+1)$-jet spanned and thus base-point free. □

When $k = 1$ the contact locus of a general singular hyperplane $H$, $\pi_k(\alpha_k^{-1}(H))$, is always a linear subspace. This implies that if $\alpha_1$ is finite then $\deg(\alpha_1) = 1$. For higher order tangencies, $k > 1$, the degree can be higher. When the map $\alpha_k$ is finite we set $n_k = \deg(\alpha_k)$.

Lemma 4.10 ([LM00, DDRP12]). Let $X$ be a smooth variety and let $\mathcal{L}$ be a $k$-jet spanned line bundle. Then $\mathrm{codim}(X^k) > \binom{n+k}{k} - n$ if and only if $c_n(J_k(\mathcal{L})) = 0$. Moreover, when $\mathrm{codim}(X^k) = \binom{n+k}{k} - n$ the degree of the $k$-dual variety is given by:

$$n_k \deg(X^k) = c_n(J_k(\mathcal{L})).$$

Proof.
Observe first that because the map $\mathrm{jet}_k$ is of maximal rank the vector bundle $J_k(\mathcal{L})$ is spanned by the global sections of the line bundle $\mathcal{L}$. This implies that, after fixing a basis $\{s_1, \dots, s_{m+1}\}$ of $H^0(X, \mathcal{L}) \cong \mathbb{C}^{m+1}$, the Chern class $c_n(J_k(\mathcal{L}))$ is represented by the set:

$$\{x \in X \mid \dim(\mathrm{Span}(\mathrm{jet}^k_x(s_1), \mathrm{jet}^k_x(s_2))) \leq 1\}.$$

Notice that a hyperplane in the linear span $\mathbb{P}^t = \langle s_1, \dots, s_{t+1} \rangle$ is tangent at a point $x$ to the order $k$ exactly when $\dim(\mathrm{Span}(\mathrm{jet}^k_x(s_1), \dots, \mathrm{jet}^k_x(s_{t+1}))) = t+1$. The map $\pi_k$ in diagram (2) defines a projective bundle of rank $m - \binom{n+k}{k}$. The statement $c_n(J_k(\mathcal{L})) = 0$ is then equivalent to $\alpha_k(\pi_k^{-1}(x)) \cap \mathbb{P}^1 = \emptyset$ for every $x \in X$ and for a general $\mathbb{P}^1 = \langle s_1, s_2 \rangle$. By Bertini this is equivalent to $\mathrm{codim}(X^k) > \binom{n+k}{k} - n$.

Assume now that $c_n(J_k(\mathcal{L})) \neq 0$ and thus that the generic fiber of the map $\alpha_k$ is finite. The degree of $X^k = \mathrm{Im}(\alpha_k)$ times the degree of the map $\alpha_k$ is given by the degree of the line bundle $\alpha_k^*(\mathcal{O}_{(\mathbb{P}^m)^\vee}(1))$, which corresponds to the tautological line bundle $\mathcal{O}_{\mathbb{P}(K_k^\vee)}(1)$:

$$n_k \deg(X^k) = c_1\left(\alpha_k^*(\mathcal{O}_{(\mathbb{P}^m)^\vee}(1))\right)^{m+n-\binom{n+k}{k}} = c_1\left(\mathcal{O}_{\mathbb{P}(K_k^\vee)}(1)\right)^{m+n-\binom{n+k}{k}}.$$

From diagram (1) we see that $c_n(J_k(\mathcal{L})) = c_n(K_k^\vee)^{-1} = s_n(K_k^\vee)$. Finally let $\pi : \mathbb{P}(K_k^\vee) \to X$ be the bundle map. By relating the Segre class $s_n(K_k^\vee)$ to the tautological bundle [FU, 3.1],

$$s_n(K_k^\vee) = \pi_*\left(c_1(\mathcal{O}_{\mathbb{P}(K_k^\vee)}(1))^{m+n-\binom{n+k}{k}}\right) = c_1\left(\mathcal{O}_{\mathbb{P}(K_k^\vee)}(1)\right)^{m+n-\binom{n+k}{k}},$$

we conclude that $n_k \deg(X^k) = c_n(J_k(\mathcal{L}))$. □
The case $k = 1$ is referred to as classical projective duality. When the codimension of the dual variety is one, the homogeneous polynomial in $m+1$ variables defining it is called the discriminant of the embedding. For a polarized variety the discriminant, when it exists, parametrizes the singular hyperplane sections.

4.3 The Toric Discriminant

In the case of singular varieties the sheaves of $k$-jets are not necessarily locally free and thus it is not possible to use Chern class techniques. For toric varieties, however, estimates of the degree of the dual varieties are possible even in the singular case, and rely on properties of the associated polytope. In the classical case $k = 1$ there is a precise characterization in any dimension. For higher order duality, results in dimension 3 and for $k = 2$ can be found in [DDRP12]. A generalization to higher dimension and higher order is an open problem.

Proposition 4.11 ([GKZ, DiR06, MT11]). Let $(X_A, \mathcal{L}_A)$ be a polarized toric variety associated to the polytope $P_A$. Set
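$$\delta_i = \sum_{\emptyset \neq F \preceq P_A} (-1)^{\mathrm{codim}(F)} \left\{ \binom{\dim(F)+1}{i} + (-1)^{i-1}(i-1) \right\} \mathrm{Vol}(F)\, \mathrm{Eu}(V(F)).$$

Then $\mathrm{codim}(X_A^\vee) = r = \min\{i : \delta_i \neq 0\}$ and $\deg(X_A^\vee) = \delta_r$.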
The function $\mathrm{Eu} : \{\text{invariant subvarieties of } X_A\} \to \mathbb{Z}$ in the above proposition assigns an integer to every invariant subvariety. Its value is different from 1 only when the variety is singular. In particular, when $X_A$ is smooth we have:

$$\mathrm{codim}(X_A^\vee) > 1 \iff \sum_{\emptyset \neq F \preceq P_A} (-1)^{\mathrm{codim}(F)} (\dim(F)+1)!\, \mathrm{Vol}(F) = 0.$$
In fact, in the smooth case one can prove this characterization using the vector bundle of 1-jets.

Proposition 4.12. Let $(X_A, \mathcal{L}_A)$ be an $n$-dimensional nonsingular polarized toric variety associated to the polytope $P_A$. Assume $A = P_A \cap \mathbb{Z}^n$. Then

$$c_n(J_1(\mathcal{L}_A)) = \sum_{\emptyset \neq F \preceq P_A} (-1)^{\mathrm{codim}(F)} (\dim(F)+1)!\, \mathrm{Vol}(F).$$
Proof. Chasing diagram (1) one sees:

$$c_n(J_1(\mathcal{L}_A)) = \sum_{i=0}^{n} (n+1-i)\, c_i(\Omega^1_{X_A}) \cdot \mathcal{L}_A^{n-i}.$$

Consider now the generalized Euler sequence for smooth toric varieties [BC94, 12.1]:

$$0 \to \Omega^1_{X_A} \to \bigoplus_{\xi \in \Sigma_A(1)} \mathcal{O}_{X_A}(-V(\xi)) \to \mathcal{O}_{X_A}^{|\Sigma_A(1)| - n} \to 0,$$

where $V(\xi)$ is the invariant divisor associated to the ray $\xi \in \Sigma_A(1)$. It follows that:

$$c_i(\Omega^1_{X_A}) = (-1)^i \sum_{\xi_1 \neq \xi_2 \neq \dots \neq \xi_i} [V(\xi_1)] \cdots [V(\xi_i)].$$

Recall that the intersection products $[V(\xi_1)] \cdots [V(\xi_i)]$ correspond to codimension $i$ invariant subvarieties and thus to faces of $P_A$ of dimension $n - i$. Moreover the degree of the embedded subvariety $[V(\xi_1)] \cdots [V(\xi_i)]$ is equal to $\mathcal{L}_A^{n-i} \cdot ([V(\xi_1)] \cdots [V(\xi_i)]) = (n-i)!\, \mathrm{Vol}(F)$, where $F$ is the corresponding face. Writing $i = \mathrm{codim}(F)$, we can then conclude:

$$c_n(J_1(\mathcal{L}_A)) = \sum_{\emptyset \neq F \preceq P_A} (n+1-i)(n-i)!\,(-1)^i\, \mathrm{Vol}(F) = \sum_{\emptyset \neq F \preceq P_A} (-1)^{\mathrm{codim}(F)} (\dim(F)+1)!\, \mathrm{Vol}(F). \qquad \square$$
Example 4.13. Consider the simplex $2\Delta_2$ in Example 4.2. All the edges have length equal to two and therefore the toric embedding is 2-jet spanned. The dual variety is then a hypersurface and the degree of the discriminant is given by $c_2(J_1(\mathcal{O}_{\mathbb{P}^2}(2))) = c_2(\mathcal{O}_{\mathbb{P}^2}(1) \oplus \mathcal{O}_{\mathbb{P}^2}(1) \oplus \mathcal{O}_{\mathbb{P}^2}(1)) = 3$. The volume formula gives in fact:

$$c_2(J_1(\mathcal{O}_{\mathbb{P}^2}(2))) = 6\, \mathrm{Vol}(2\Delta_2) - 2 \sum_{1}^{3} \mathrm{Vol}(2\Delta_1) + 3 = 12 - 12 + 3 = 3.$$
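Example 4.14. Consider the Segre embedding $\mathbb{P}^1 \times \mathbb{P}^2 \hookrightarrow \mathbb{P}^5$, associated to the polytope $Q$. Then

$$c_3(J_1(\mathcal{L})) = 4! \cdot \tfrac{1}{2} - 3!\left(1 + 1 + 1 + \tfrac{1}{2} + \tfrac{1}{2}\right) + 2(9) - 6 = 0.$$

This embedding is therefore dually defective.

[Figure: the polytope $Q$]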
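The two computations in Examples 4.13 and 4.14 can be reproduced mechanically from the face volumes. The sketch below is illustrative only: the function name and the input format (a dictionary from face dimension to the list of volumes of the faces of that dimension) are assumptions, and the volumes are entered by hand rather than computed from vertices. Volumes are taken with respect to the lattice in the affine span of each face, with vertices counted as volume 1, as in the examples.

```python
from fractions import Fraction
from math import factorial

def alternating_volume_sum(n, face_volumes):
    """Sum over nonempty faces F of (-1)^codim(F) * (dim(F)+1)! * Vol(F).

    n is the dimension of the polytope; face_volumes[d] lists the volumes
    of its d-dimensional faces (hypothetical input format).
    """
    total = Fraction(0)
    for d, volumes in face_volumes.items():
        sign = (-1) ** (n - d)
        total += sign * factorial(d + 1) * sum(Fraction(v) for v in volumes)
    return total

# Example 4.13: the triangle 2*Delta_2 (area 2, three edges of length 2, three vertices).
doubled_triangle = {2: [2], 1: [2, 2, 2], 0: [1, 1, 1]}

# Example 4.14: the prism Delta_1 x Delta_2 (volume 1/2, three unit squares,
# two unimodular triangles of area 1/2, nine unit edges, six vertices).
half = Fraction(1, 2)
prism = {3: [half], 2: [1, 1, 1, half, half], 1: [1] * 9, 0: [1] * 6}

print(alternating_volume_sum(2, doubled_triangle))
print(alternating_volume_sum(3, prism))
```

The first sum is $12 - 12 + 3$ and the second is $12 - 24 + 18 - 6$, matching the two examples.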
The following is an amusing observation, which is a simple consequence of the previous characterization.

Corollary 4.15. Let $P_A$ be a smooth polytope such that $A = P_A \cap \mathbb{Z}^n$. Then

$$\sum_{\emptyset \neq F \preceq P_A} (-1)^{\mathrm{codim}(F)} (\dim(F)+1)!\, \mathrm{Vol}(F) \geq 0.$$
Proof. Because the associated line bundle $\mathcal{L}_A$ defines an embedding of the variety $X_A$, the map $\mathrm{jet}_1$ has maximal rank and thus the vector bundle $J_1(\mathcal{L}_A)$ is spanned (by the global sections of $\mathcal{L}_A$). It follows that the degree of its Chern classes must be nonnegative, which implies the assertion. □

Now we can state the characterization of $\mathbb{Q}$-factorial and nonsingular toric embeddings admitting discriminant $\Delta_A = 1$. The theorem includes the combinatorial characterization and the equivalent algebro-geometric description. The proof in the nonsingular case will be given in Sect. 6.

Theorem 4.16 ([DiR06, CDR08]). Let $A = P_A \cap \mathbb{Z}^n$ and assume that $X_A$ is $\mathbb{Q}$-factorial. Then the following equivalent statements hold.
(a) The point configuration $A$ is dually defective if and only if $P_A$ is a Cayley sum of the form $P_A \cong \mathrm{Cayley}(R_0, \dots, R_t)_{(\pi, Y)}$, where $\pi(P)$ is a simplex (not necessarily unimodular) in $\mathbb{R}^t$ and $R_0, \dots, R_t$ are normally equivalent polytopes with $t > \frac{n}{2}$. If moreover $P_A$ is smooth then $\pi(P)$ is a unimodular simplex.
(b) The projective dual variety of the toric embedding $X_A \hookrightarrow \mathbb{P}^{|A \cap \mathbb{Z}^n| - 1}$ has codimension $s \geq 2$ if and only if $X_A$ is a Mori fibration, $X_A \to Y$, and $\dim(Y) < \dim(X)/2$. If moreover $X_A$ is nonsingular then $(X_A, \mathcal{L}_A) = (\mathbb{P}(L_0 \oplus \dots \oplus L_t), \xi)$, where the $L_i$ are line bundles on a toric variety $Y$ of dimension $m < t$.
Theorem 4.16 provides a characterization of the class of smooth polytopes achieving the minimal value 0 in Corollary 4.15.

Corollary 4.17. Let $P$ be a convex smooth lattice polytope. Then

$$\sum_{\emptyset \neq F \preceq P} (\dim(F)+1)!\, (-1)^{\mathrm{codim}(F)}\, \mathrm{Vol}(F) = 0$$

if and only if $P = \mathrm{Cayley}(R_0, \dots, R_t)$ for normally equivalent smooth lattice polytopes $R_i$ with $\dim(R_i) < t$.
5 Toric Fibrations and Adjunction Theory

The classification of projective algebraic varieties is a central problem in Algebraic Geometry dating back to the early nineteenth century. The way one can realistically carry out a classification theory is through invariants, such as the degree, genus and Hilbert polynomial. Modern adjunction theory and Mori theory are the basis for major advances in this area. Let $(X, \mathcal{L})$ be a polarized $n$-dimensional variety. Assume that $X$ is Gorenstein (i.e. the canonical class $K_X$ is a Cartier divisor). The two key invariants occurring in classification theory, see [Fuj90], are the effective log threshold $\mu(\mathcal{L})$ and the nef value $\tau(\mathcal{L})$:

$$\mu(\mathcal{L}) := \sup\nolimits_{\mathbb{R}} \{s \in \mathbb{Q} : \dim(H^0(K_X + s\mathcal{L})) = 0\},$$
$$\tau(\mathcal{L}) := \min\nolimits_{\mathbb{R}} \{s \in \mathbb{R} : K_X + s\mathcal{L} \text{ is nef}\}.$$

Both invariants are at most equal to $n+1$. Kawamata proved that $\tau(\mathcal{L})$ is indeed a rational number and recent advances in the minimal model program establish the same for $\mu(\mathcal{L})$. They can be visualized as follows. Traveling from $\mathcal{L}$ in the direction of the vector $K_X$ in the Néron-Severi space $NS(X) \otimes \mathbb{R}$ of divisors, $\mathcal{L} + (1/\mu(\mathcal{L})) K_X$ is the meeting point with the cone of effective divisors $\mathrm{Eff}(X)$ and $\mathcal{L} + (1/\tau(\mathcal{L})) K_X$ is the meeting point with the cone of nef divisors $\mathrm{Nef}(X)$; see Fig. 1. A multiple of the nef line bundle $K_X + \tau\mathcal{L}$ defines a morphism $X \to \mathbb{P}^M$ which can be decomposed (Remmert-Stein factorization) as a composition of a morphism $\varphi_\tau : X \to Y$ with connected fibers onto a normal variety $Y$ and a finite-to-one morphism $Y \to \mathbb{P}^M$. The map $\varphi_\tau$ is called the nef-value morphism. Kawamata showed that if one writes $r\tau = u/v$ for coprime integers $u, v$ (with $r$ the index of $X$, so $r = 1$ in the Gorenstein case), then:

$$u \leq r\left(1 + \max_{y \in Y} \dim(\varphi_\tau^{-1}(y))\right).$$

Fig. 1 Illustrating $\mu(\mathcal{L})$ and $\tau(\mathcal{L})$: the ray from $\mathcal{L}$ in the direction $K_X$ leaves the ample cone at $\mathcal{L} + \frac{1}{\tau} K_X$ and the effective cone $\mathrm{Eff}$ at $\mathcal{L} + \frac{1}{\mu} K_X$
Corollary 5.1. Let $(X, \mathcal{L})$ be a polarized variety. Then the nef value achieves the maximum value $\tau(\mathcal{L}) = n+1$ if and only if $(X, \mathcal{L}) = (\mathbb{P}^n, \mathcal{O}_{\mathbb{P}^n}(1))$.

Proof. Consider the nef-value morphism $\varphi_\tau : X \to Y$ and observe that

$$(n+1) \leq \left(1 + \max_{y \in Y} \dim(\varphi_\tau^{-1}(y))\right).$$

This implies that the dimension of a fiber of $\varphi_\tau$ must be $n$ and thus that the morphism contracts the whole space $X$ to a point. By construction, the fact that $\varphi_\tau$ contracts the whole space implies that $K_X + (n+1)\mathcal{L} = \mathcal{O}_X$. A celebrated criterion in projective geometry, called the Kobayashi-Ochiai theorem, asserts that if $\mathcal{L}$ is an ample line bundle such that $K_X + (n+1)\mathcal{L} = \mathcal{O}_X$ then $(X, \mathcal{L}) = (\mathbb{P}^n, \mathcal{O}_{\mathbb{P}^n}(1))$. □

Remark 5.2. Recall that the interior of the closure of the effective cone is the cone of big divisors, $(\overline{\mathrm{Eff}(X)})^\circ = \mathrm{Big}(X)$, and that the closure of the ample cone is the nef cone, $\overline{\mathrm{Ample}(X)} = \mathrm{Nef}(X)$. In particular the equality $\mu(\mathcal{L}) = \tau(\mathcal{L})$ occurs if and only if the line bundle $K_X + \tau(\mathcal{L})\mathcal{L}$ is nef and not big, which implies that $\varphi_\tau$ defines a fibration structure on $X$.

A fibration structure on an algebraic variety is a powerful geometrical tool, as many invariants are induced by corresponding invariants on the (lower dimensional) base and generic fiber. Criteria for a space to be a fibration are therefore highly desirable. Beltrametti, Sommese and Wiśniewski conjectured that if the effective log threshold is strictly bigger than half the dimension then the nef-value morphism should be a fibration.

Conjecture 5.3 ([BS94]). If $X$ is nonsingular and $\mu(\mathcal{L}) > (n+1)/2$ then $\mu(\mathcal{L}) = \tau(\mathcal{L})$.

Let us now assume that the algebraic variety is toric. In this case it is immediate to see that the invariants defined above are rational numbers, as the cones $\mathrm{Eff}(X)$, $\mathrm{Big}(X)$, $\mathrm{Ample}(X)$, $\mathrm{Nef}(X)$ are all rational cones.
We have seen in Sect. 3 that toric fibrations are associated to certain Cayley polytopes. Analogously to the classification theory of projective algebraic varieties, it is important to find invariants of polytopes that characterize a Cayley structure. One invariant which has attracted increasing attention in recent years is the codegree of a lattice polytope:

$$\mathrm{codeg}(P) = \min\{t \in \mathbb{Z}_{>0} \text{ such that } tP \text{ contains interior lattice points}\}.$$

Via Ehrhart theory one can conclude that $\mathrm{codeg}(P) \leq n+1$ and that $\mathrm{codeg}(P) = n+1$ if and only if $P = \Delta_n$. This is in fact a simple consequence of our previous observations.

Corollary 5.4. Let $P$ be a Gorenstein lattice polytope. Then $\mathrm{codeg}(P) = n+1$ if and only if $P = \Delta_n$.

Proof. Let $(X, \mathcal{L}_P)$ be the Gorenstein toric variety associated to $P$. Notice that, because $K_X = -\sum D_i$ where the $D_i$ are the invariant divisors, the polytope defined by the line bundle $K_X + t\mathcal{L}$ is the convex hull of the interior points of $tP$. The equality $\mathrm{codeg}(P) = n+1$ is equivalent to $H^0(K_X + t\mathcal{L}) = 0$ for $t \leq n$. Because nef line bundles must have sections (in particular, being nef is equivalent to being globally generated on toric varieties) we have $\tau(\mathcal{L}) \geq \mathrm{codeg}(P) = n+1$. It follows from Corollary 5.1 that $(X, \mathcal{L}) = (\mathbb{P}^n, \mathcal{O}_{\mathbb{P}^n}(1))$ and thus $P = \Delta_n$. □

Let us now examine the class of Cayley polytopes we encountered in the characterization of dually defective toric embeddings. We will see that this is a class of polytopes satisfying the strong lower bound $\mathrm{codeg}(P) > \frac{\dim(P)}{2} + 1$ and the equality $\mathrm{codeg}(P) = \mu(\mathcal{L})$.

Lemma 5.5. Let $P = \mathrm{Cayley}_{h,Y}(R_0, \dots, R_t)$ with $t > \frac{n}{2}$; then:

$$\mu(\mathcal{L}_P) = \tau(\mathcal{L}_P) = \mathrm{codeg}(P) = t+1 \geq \frac{n+3}{2}.$$
Proof. Observe that $X_P = \mathbb{P}(L_0 \oplus \dots \oplus L_t)$ for ample line bundles $L_i$ on the toric variety $Y$ and $\mathcal{L} = \xi$ is the tautological line bundle. Consider the projective bundle map $\pi : X_P \to Y$. The Picard group of $X_P$ is generated by the pull back of generators of $\mathrm{Pic}(Y)$ and by the tautological line bundle $\xi$. Moreover the canonical line bundle is given by the following expression:

$$K_{X_P} = \pi^*(K_Y + L_0 + \dots + L_t) - (t+1)\xi.$$

The toric nefness criterion says that a line bundle on a toric variety is nef if and only if the intersection with all the invariant curves is nonnegative, see for example [ODA]. On the toric variety $\mathbb{P}(L_0 \oplus \dots \oplus L_t)$ there are two types of rational invariant curves: the ones contained in the fibers $F \cong \mathbb{P}^t$, and the pull backs of rational invariant curves in $Y$, which will be denoted by $\pi^*(C)_i$ when contained in the invariant section defined by the polytope $R_i$. For any rational invariant curve $C \subset F$, it holds that $\xi|_C = \mathcal{O}_{\mathbb{P}^1}(1)$ and $\pi^*(D) \cdot C = 0$ for all divisors $D$ on $Y$. For every curve of the form $\pi^*(C)_i$ it holds that $\pi^*(C)_i \cdot \pi^*(D) = C \cdot D$ and $\xi \cdot \pi^*(C)_i = L_i \cdot C$. See [DiR06, Remark 3] for more details. We conclude that $K_{X_P} + s\mathcal{L}$ is nef if the following is satisfied:

$$[\pi^*(K_Y + L_0 + \dots + L_t) + (s-t-1)\xi] \cdot C = s - t - 1 \geq 0 \quad \text{if } C \subset F,$$
$$(K_Y + L_0 + \dots + (s-t)L_i + \dots + L_t) \cdot C \geq 0 \quad \text{if } C = \pi^*(C)_i.$$

In [MU02] Mustață proved a toric Fujita conjecture, showing that if a line bundle $H$ on an $n$-dimensional toric variety satisfies $H \cdot C \geq n$ for every invariant curve $C$, then the adjoint bundle $K + H$ is globally generated, unless $H = \mathcal{O}_{\mathbb{P}^n}(n)$. Because $[L_0 + \dots + (s-t)L_i + \dots + L_t] \cdot C \geq (s-t) + t$ it follows that $(K_Y + L_0 + \dots + (s-t)L_i + \dots + L_t) \cdot C \geq 0$ for all invariant curves $C = \pi^*(C)_i$ if $s \geq t$. This implies that $K_{X_P} + s\mathcal{L}$ is nef if and only if $s \geq t+1$ and thus $\tau(\mathcal{L}) = t+1$.

Consider now the projection $h : \mathbb{R}^n \to \mathbb{R}^t$ such that $h(P) = \Delta_t$. Under this projection interior points of a dilation $sP$ are mapped to interior points of the corresponding dilation $s\Delta_t$. This implies that $\mathrm{codeg}(P) = t+1$. Notice that $\mu(\mathcal{L}) \leq \mathrm{codeg}(P) = t+1$, as interior points of $sP$ correspond to global sections of $K_{X_P} + s\mathcal{L}$. On the other hand, see [HA, Ex. 8.4]:

$$H^0\left(u\,\pi^*(K_Y + L_0 + \dots + L_t) + (v - u(t+1))\xi\right) = H^0\left(\pi_*\left(u\,\pi^*(K_Y + L_0 + \dots + L_t) + (v - u(t+1))\xi\right)\right)$$
$$= H^0\left(u(K_Y + L_0 + \dots + L_t) + \pi_*((v - u(t+1))\xi)\right) = 0 \quad \text{if } v - u(t+1) < 0.$$

This implies that $\mu(\mathcal{L}) \geq t+1$, which proves the assertion. □
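For small polytopes the codegree can be checked by brute force directly from the definition. The following sketch is illustrative and not part of the source: the function name, the inequality description $P = \{x : Ax \geq b\}$ with integer data, and the user-supplied bounding box are all assumptions. A lattice point is interior to $tP$ exactly when it satisfies $Ax > tb$ strictly.

```python
from itertools import product

def codegree(A, b, box):
    """Smallest integer t >= 1 such that t*P has an interior lattice point.

    P = {x : A x >= b} is a full-dimensional lattice polytope and
    box[j] = (lo, hi) bounds the j-th coordinate on P (hypothetical inputs).
    """
    t = 1
    while True:
        # Scan every lattice point of the dilated bounding box.
        ranges = [range(t * lo, t * hi + 1) for lo, hi in box]
        for x in product(*ranges):
            if all(sum(a * xi for a, xi in zip(row, x)) > t * bi
                   for row, bi in zip(A, b)):
                return t
        t += 1

# The unimodular simplex Delta_2: x >= 0, y >= 0, x + y <= 1.
simplex_A, simplex_b = [[1, 0], [0, 1], [-1, -1]], [0, 0, -1]

# The prism Delta_1 x Delta_2 = Cayley of three unit segments (n = 3, t = 2).
prism_A = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, 0, 1], [0, -1, -1]]
prism_b = [0, -1, 0, 0, -1]

print(codegree(simplex_A, simplex_b, [(0, 1)] * 2))
print(codegree(prism_A, prism_b, [(0, 1)] * 3))
```

For the prism one has $t = 2 > n/2$, and the search returns $t + 1 = 3 = (n+3)/2$, consistent with Lemma 5.5. The loop terminates because $\mathrm{codeg}(P) \leq n+1$ for a full-dimensional lattice polytope.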
Recently Batyrev and Nill in [BN08] classified polytopes with $\mathrm{codeg}(P) = n$ and conjectured the following.

Conjecture 5.6 ([BN08]). There is a function $f(n)$ such that any $n$-dimensional polytope $P$ with $\mathrm{codeg}(P) \geq f(n)$ decomposes as a Cayley sum of lattice polytopes.

The above conjecture was proven by Haase, Nill and Payne in [HNP09]. They showed that $f(n)$ is at most quadratic in $n$. It is important to observe that, as interior lattice points of $tP$ correspond to global sections of $K_X + t\mathcal{L}$ for the associated toric embedding, $\mathrm{codeg}(P)$ can be considered as the integral variant of $\mu(\mathcal{L})$. This observation, together with techniques from toric Mori theory and adjunction theory, led to a proof of a stronger version of Conjectures 5.3 and 5.6 for smooth polytopes, giving yet another characterization of Cayley sums.

Theorem 5.7 ([DDRP09, DN10]). Let $P \subset \mathbb{R}^n$ be a smooth $n$-dimensional polytope. Then the following statements are equivalent.
(a) $\mathrm{codeg}(P) \geq (n+3)/2$.
(b) $P$ is affinely isomorphic to a Cayley sum $\mathrm{Cayley}(R_0, \dots, R_t)_{\pi,Y}$ where $t+1 = \mathrm{codeg}(P)$ with $t > \frac{n}{2}$.
(c) $\mu(\mathcal{L}_P) = \tau(\mathcal{L}_P) = t+1$ and $t > \frac{n}{2}$.
(d) $(X_P, \mathcal{L}_P) = (\mathbb{P}(L_0 \oplus \dots \oplus L_t), \xi)$ for ample line bundles $L_i$ on a nonsingular toric variety $Y$.

Notice that Theorem 5.7 proves the converse of Lemma 5.5. Conjectures 5.3 and 5.6, made independently in two apparently unrelated fields, constitute a beautiful example of the interplay between classical projective (toric) geometry and convex geometry. In view of the results above one could hope that in the toric setting the conjectures hold more generally.

Conjecture 5.8. Let $(X, \mathcal{L})$ be an $n$-dimensional toric polarized variety (not necessarily smooth or even Gorenstein). Then $\mu(\mathcal{L}) > (n+1)/2$ implies that $\mu(\mathcal{L}) = \tau(\mathcal{L})$.

The invariants $\mu(\mathcal{L})$, $\tau(\mathcal{L})$ in the non-Gorenstein case can be defined using corresponding invariants $\mu(P)$, $\tau(P)$ of the associated polytope, see below for a definition.

Conjecture 5.9. If an $n$-dimensional lattice polytope $P$ satisfies $\mathrm{codeg}(P) > (n+2)/2$, then it decomposes as a Cayley sum of lattice polytopes of dimension at most $2(n+1-\mathrm{codeg}(P))$.

Conjecture 5.8 is a toric version of Conjecture 5.3, extending the statement to possibly singular and non-Gorenstein varieties. Conjecture 5.9 states that the function $f(n)$ in Conjecture 5.6 should be equal to $(n+2)/2$. An important step towards proving these conjectures is to define the convex analog of $\mu(\mathcal{L}_P)$. Let $P \subset \mathbb{R}^n$ be a rational polytope of dimension $n$. Any such polytope $P$ can be described in a unique minimal way as

$$P = \{x \in \mathbb{R}^n : \langle a_i, x \rangle \geq b_i,\ i = 1, \dots, m\}$$

where the $a_i$ are the rows of an $m \times n$ integer matrix $A$, and $b \in \mathbb{Q}^m$. For any $s \geq 0$ we define the adjoint polytope $P^{(s)}$ as

$$P^{(s)} := \{x \in \mathbb{R}^n : Ax \geq b + s\mathbf{1}\},$$

where $\mathbf{1} = (1, \dots, 1)^T$. We call the study of such polytopes $P^{(s)}$ polyhedral adjunction theory (Fig. 2).

Definition 5.10.
We define the $\mathbb{Q}$-codegree of $P$ as

$$\mu(P) := \left(\sup\{s > 0 : P^{(s)} \neq \emptyset\}\right)^{-1},$$

and the core of $P$ to be $\mathrm{core}(P) := P^{(1/\mu(P))}$.

Fig. 2 Two examples of polyhedral adjunction

Fig. 3 $P^{(4)} \subset P$ for a three-dimensional lattice polytope $P$

Notice that in this case the supremum is actually a maximum. Moreover, since $P$ is a rational polytope, $\mu(P)$ is a positive rational number. One sees that for a lattice polytope $P$

$$\mu(P) \leq \mathrm{codeg}(P) \leq n+1.$$

Definition 5.11. The nef value of $P$ is given as

$$\tau(P) := \left(\sup\{s > 0 : \mathcal{N}(P^{(s)}) = \mathcal{N}(P)\}\right)^{-1} \in \mathbb{R}_{>0} \cup \{\infty\},$$

where $\mathcal{N}(P)$ denotes the normal fan of the polytope $P$.

Note that in contrast to the definition of the $\mathbb{Q}$-codegree, here the supremum is never a maximum. Figure 3 illustrates a polytope $P$ with $\tau(P)^{-1} = 2$ and $\mu(P)^{-1} = 6$. In this case $\mathrm{core}(P)$ is an interval. In [DRHNP13] the precise analogue of Conjecture 5.9 for the $\mathbb{Q}$-codegree is proven.

Theorem 5.12 ([DRHNP13]). Let $P$ be an $n$-dimensional lattice polytope. If $n$ is odd and $\mu(P) > (n+1)/2$, or if $n$ is even and $\mu(P) \geq (n+1)/2$, then $P$ is a Cayley polytope.

Results from [DRHNP13] show Conjecture 5.9 in two interesting cases: when $\lceil \mu(P) \rceil = \mathrm{codeg}(P)$ and when the normal fan of $P$ is Gorenstein and $\mu(P) = \tau(P)$.
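The $\mathbb{Q}$-codegree of Definition 5.10 can be computed exactly for small examples. The sketch below is an illustration under stated assumptions, not the method of [DRHNP13]: it maximizes $s$ over the polyhedron $\{(x, s) : Ax - s\mathbf{1} \geq b\}$ by enumerating the points where $n+1$ constraints are tight (a naive vertex enumeration, chosen here only because it needs no external solver). The function names are hypothetical, and the rows of $A$ are assumed to be those of the minimal description of $P$.

```python
from fractions import Fraction
from itertools import combinations

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def q_codegree(A, b):
    """mu(P) for the polytope P = {x : A x >= b}: 1 / max{s : P^(s) nonempty}."""
    n = len(A[0])
    best = None
    for idx in combinations(range(len(A)), n + 1):
        # Make n+1 constraints tight: a_i . x - s = b_i in the unknowns (x, s).
        z = solve([list(A[i]) + [-1] for i in idx], [b[i] for i in idx])
        if z is None:
            continue
        x, s = z[:-1], z[-1]
        feasible = all(sum(a * xi for a, xi in zip(row, x)) >= bi + s
                       for row, bi in zip(A, b))
        if feasible and (best is None or s > best):
            best = s
    return 1 / best

# 2*Delta_2: x >= 0, y >= 0, x + y <= 2.
print(q_codegree([[1, 0], [0, 1], [-1, -1]], [0, 0, -2]))
# Delta_2: x >= 0, y >= 0, x + y <= 1.
print(q_codegree([[1, 0], [0, 1], [-1, -1]], [0, 0, -1]))
```

For $2\Delta_2$ the adjoint polytope $P^{(s)}$ is nonempty exactly for $s \leq 2/3$, so $\mu = 3/2$, which is at most $\mathrm{codeg}(2\Delta_2) = 2$ as the inequality $\mu(P) \leq \mathrm{codeg}(P)$ requires. The enumeration is exponential in the number of facets and is only meant for toy inputs.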
6 Connecting the Three Characterizations

In Sect. 4 we have seen that a certain class of Cayley polytopes characterizes dually defective point configurations. Moreover this class corresponds to the polytopes achieving equality in Corollary 4.15. In Sect. 5 the same class of Cayley polytopes was characterized as corresponding to smooth configurations with codegree slightly larger than half the dimension. We will here assemble the three characterizations and provide proofs in the nonsingular case.

Theorem 6.1. Let $A \subset \mathbb{Z}^n$ be a point configuration such that $P_A \cap \mathbb{Z}^n = A$, $\dim(P_A) = n$ and such that $P_A$ is a smooth polytope. Then the following statements are equivalent.
(a) $P_A$ is affinely isomorphic to a Cayley sum $\mathrm{Cayley}(R_0, \dots, R_t)_{\pi,Y}$ where $t+1 = \mathrm{codeg}(P_A)$ and $t > \frac{n}{2}$.
(b) $\mathrm{codeg}(P_A) \geq \frac{n+1}{2} + 1$ and $\mu(P_A) = \tau(P_A)$.
(c) The discriminant $\Delta_A = 1$.
(d) $\sum_{\emptyset \neq F \preceq P_A} (\dim(F)+1)!\, (-1)^{\mathrm{codim}(F)}\, \mathrm{Vol}(F) = 0$.

Proof. [(d) ⇔ (c)]. The implication (d) ⇒ (c) follows from Lemma 4.10 and Proposition 4.12. The reverse implication follows from Corollary 4.17.

[(c) ⇒ (b)]. Assume now (c), i.e. assume that the configuration is dually defective. Consider the associated polarized toric manifold $(X_A, \mathcal{L}_A)$. It is a classical result that the generic tangent hyperplane is in fact tangent along a linear space in $X_A$. Therefore if $\mathrm{codim}(X_A^\vee) = k > 1$ then there is a linear $\mathbb{P}^k$ through a general point of $X_A$. By linear $\mathbb{P}^k$ we mean a subspace $Y \cong \mathbb{P}^k$ such that $\mathcal{L}_A|_Y = \mathcal{O}_{\mathbb{P}^k}(1)$. Moreover, by a result of Ein [E86],

$$N_{\mathbb{P}^k/X} = \left(\bigoplus_{1}^{\frac{n-k}{2}} \mathcal{O}_{\mathbb{P}^k}\right) \oplus \left(\bigoplus_{1}^{\frac{n-k}{2}} \mathcal{O}_{\mathbb{P}^k}(1)\right).$$

Observe that if we fix a point $x \in X_A$, a sequence $\{F_j\}$ of general linear subspaces $F_j \cong \mathbb{P}^k$ can be chosen so that $x \in \lim(F_j)$. Since the $F_j$ are all linear the limit space also has to be a linear $\mathbb{P}^k$. We can then assume that there is a linear $\mathbb{P}^k$ through every point of $X_A$. Let $L$ now be an invariant line in one of the $\mathbb{P}^k$ through a fixed point. Then:

$$[K_X + t\mathcal{L}]|_L = \mathcal{O}_{\mathbb{P}^1}\left(\frac{-n-2-k}{2} + t\right),$$

which implies $\mu(\mathcal{L}_A) \geq \frac{n+k}{2} + 1$. Assume now that $\tau(\mathcal{L}_A) > \frac{n+k}{2} + 1$ and let $L$ be again a line in the family of linear spaces covering $X$. The quantity $\nu = -K_X \cdot L - 2$ is called the normal degree of the family. In our case $\nu = (n+k)/2 - 1 \geq n/2$. By a result of Beltrametti-Sommese-Wiśniewski [BSW92], this assumption implies $\tau = \nu + 2$, proving $\tau(\mathcal{L}_A) = \frac{n+k}{2} + 1$. Notice that the nef-value morphism $\varphi_\tau$ contracts all the linear $\mathbb{P}^k$ of the covering family and thus it is a fibration. As a consequence the line bundle $K_{X_A} + \tau\mathcal{L}_A$ is not big and thus $\mu(\mathcal{L}_A) = \tau(\mathcal{L}_A)$. The inequality

$$\mathrm{codeg}(P_A) \geq \mu(\mathcal{L}_A) = \tau(\mathcal{L}_A) = \frac{n+k}{2} + 1$$

shows the implication (c) ⇒ (b).
[(b) ⇒ (a)]. Assume now (b). The nef-value morphism $\varphi_\tau$ is then a fibration and

$$\tau(\mathcal{L}_A) > \mathrm{codeg}(P_A) - 1 > \frac{n}{2}.$$

Notice that the nef-value morphism $\varphi_\tau$ contracts a face of the Mori cone and thus faces of the lattice polytope $P_A$, i.e. all the invariant curves with 0-intersection with the line bundle $K_{X_A} + \tau\mathcal{L}_A$. Let now $C$ be a generator of an extremal ray contracted by the morphism $\varphi_\tau$. If $\mathcal{L}_A \cdot C \geq 2$, then $-K_X \cdot C > n+1$, which is impossible. We can conclude that $C$ is a line and $\tau(\mathcal{L}_A) = -K_{X_A} \cdot C$ is an integer. It follows that $\tau(\mathcal{L}_A) \geq \frac{n+1}{2} + 1$. This inequality implies that $\varphi_\tau$ is the contraction of one extremal ray, by [BSW92, Cor. 2.5]. These morphisms are analyzed in detail in [Re83]. Because $X_A$ is smooth and toric and this contraction has connected fibres, the general fiber $F$ of the contraction is a smooth toric variety with Picard number one. It follows that $F$ is a projective space and thus $\varphi_\tau$ is a $\mathbb{P}^t$-bundle. Let $\mathcal{L}|_F = \mathcal{O}_{\mathbb{P}^t}(a)$. Observe that by construction $K_{X_A}|_F = K_F$. Consider a line $l \subset F$. It follows that

$$0 = (K_{X_A} + \tau\mathcal{L}_A) \cdot l = K_F \cdot l + \tau\mathcal{L}_A \cdot l = -t - 1 + \tau a$$

and thus $\tau = \frac{t+1}{a} \geq \frac{n+1}{2} + 1$, which implies $a = 1$ and $t \geq \frac{n+1}{2}$. Since $a = 1$ the fibers are embedded linearly and thus $(X_A, \mathcal{L}_A) = (\mathbb{P}(L_0 \oplus \dots \oplus L_t), \xi)$, for ample line bundles $L_i$ on a smooth toric variety $Y$. This proves the implication (b) ⇒ (a).

[(a) ⇒ (c)]. Assume now (a). Using notation as in (2), consider the commutative diagram:
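$$
\begin{array}{ccc}
\alpha^{-1}(\mathbb{P}(E)^\vee_{\mathrm{reg}}) & \xrightarrow{\ \alpha\ } & \mathbb{P}(E)^\vee_{\mathrm{reg}} \\
\downarrow {\scriptstyle \pi_1} & & \downarrow {\scriptstyle f} \\
\mathbb{P}(E) & \xrightarrow{\ \pi\ } & Y
\end{array}
$$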
where $E = L_0 \oplus \dots \oplus L_t$ and $(Y, L_i)$ is the smooth polarized variety associated to the polytope $R_i$. The commutativity of the diagram and the existence of $f$ follow from [DeB01, Lemma 1.15]. Let $y \in Y$ and let $F \cong \mathbb{P}^t \subset \mathbb{P}^{|A|-1}$ be the fiber $\pi^{-1}(y)$. Commutativity of the diagram implies that the contact locus $\pi_1(\alpha^{-1}(H))$ is included in $F$ for all $H \in f^{-1}(y)$. Moreover $\mathrm{Osc}_{F,y} \subset \mathrm{Osc}_{\mathbb{P}(E),y} \subset H$ implies that $H$ belongs to the dual variety $F^\vee$, with contact locus at least of the same dimension. Because the map $f$ is dominant we can conclude that:

$$\dim(F^\vee) \geq \dim(\mathbb{P}(E)^\vee) - \dim(Y),$$

which implies

$$\mathrm{codim}(\mathbb{P}(E)^\vee) \geq \mathrm{codim}(F^\vee) - \dim(Y).$$

Recall that the fibers $F$ are embedded linearly and thus $\mathrm{codim}(F^\vee) = \dim(F) + 1$. It follows that $\mathrm{codim}(\mathbb{P}(E)^\vee) \geq \dim(F) + 1 - \dim(Y) > 1$ and thus $\Delta_A = 1$. This proves (a) ⇒ (c). □

Acknowledgements The author was supported by a grant from the Swedish Research Council (VR). Special thanks to A. Lundman, B. Nill and B. Sturmfels for reading a preliminary version of the notes.
References

[BN08] V. Batyrev, in Combinatorial Aspects of Mirror Symmetry, vol. 452 of Contemp. Math., ed. by Matthias Beck et al. (AMS, 2008), pp. 35–66
[BC94] V.V. Batyrev, D. Cox, On the Hodge structures of projective hypersurfaces in toric varieties. Duke Math. 75, 293–338 (1994)
[BSW92] M. Beltrametti, in Complex Algebraic Varieties (Bayreuth, 1990), Lecture Notes in Math., vol. 1507 (Springer, Berlin, 1992), pp. 16–38. DOI 10.1007/BFb0094508. URL http://dx.doi.org/10.1007/BFb0094508
[BS94] M. Beltrametti, Classification of Algebraic Varieties (L'Aquila, 1992), vol. 162 of Contemp. Math. (Amer. Math. Soc., Providence, RI, 1994), pp. 31–48
[CDR08] C. Casagrande, Commun. Contemp. Math. 10(3), 363 (2008)
[CC07] E. Cattani, J. Symbolic Comput. 42(1-2), 115 (2007). DOI 10.1016/j.jsc.2006.02.006. URL http://dx.doi.org/10.1016/j.jsc.2006.02.006
[DeB01] O. Debarre, Higher-Dimensional Algebraic Geometry. Universitext (Springer, New York, 2001)
[DS02] A. Dickenstein, J. Symbolic Comput. 34(2), 119 (2002). DOI 10.1006/jsco.2002.0545. URL http://dx.doi.org/10.1006/jsco.2002.0545
[DDRP09] A. Dickenstein, S. Di Rocco, R. Piene, Classifying smooth lattice polytopes via toric fibrations. Adv. Math. 222(1), 240–254 (2009)
[DN10] A. Dickenstein, B. Nill, A simple combinatorial criterion for projective toric manifolds with dual defect. Math. Res. Lett. 17, 435–448 (2010)
[DDRP12] A. Dickenstein, S. Di Rocco, R. Piene, Higher order duality and toric embeddings. Ann. l'Institut Fourier (to appear)
[DiR01] S. Di Rocco, Generation of k-jets on toric varieties. Math. Z. 231, 169–188 (1999)
[DiR06] S. Di Rocco, Projective duality of toric manifolds and defect polytopes. Proc. Lond. Math. Soc. (3) 93(1), 85–104 (2006)
[DRS01] S. Di Rocco, A.J. Sommese, Line bundles for which a projectivized jet bundle is a product. Proc. A.M.S. 129(6), 1659–1663 (2001)
[DRS04] S. Di Rocco, A.J. Sommese, Chern numbers of ample vector bundles on toric surfaces. Trans. Am. Math. Soc. 356(2), 587–598 (2004)
[DRHNP13] S. Di Rocco, B. Nill, C. Haase, A. Paffenholz, Polyhedral adjunction theory. Algebra Number Theory (to appear)
[E86] L. Ein, Varieties with small dual varieties. Inv. Math. 96, 63–74 (1986)
[EW] G. Ewald, Combinatorial Convexity and Algebraic Geometry, Graduate Texts in Mathematics, vol. 168 (Springer, New York, 1996)
[FU] W. Fulton, Introduction to Toric Varieties, Annals of Mathematics Studies, vol. 131 (Princeton University Press, Princeton, NJ, 1993). The William H. Roever Lectures in Geometry
[FUb] W. Fulton, Intersection Theory, Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics], vol. 2, 2nd edn. (Springer, Berlin, 1998). DOI 10.1007/978-1-4612-1700-8. URL http://dx.doi.org/10.1007/978-1-4612-1700-8
[Fuj90] T. Fujita, in Classification Theories of Polarized Varieties. Lecture Note Series, vol. 155 (London Mathematical Society/Cambridge University Press, London/Cambridge, 1990)
[GKZ] I. Gel'fand, Discriminants, Resultants, and Multidimensional Determinants. Mathematics: Theory & Applications (Birkhäuser Boston, Boston, MA, 1994). DOI 10.1007/978-0-8176-4771-1. URL http://dx.doi.org/10.1007/978-0-8176-4771-1
[GH79] P. Griffiths, Ann. Sci. École Norm. Sup. (4) 12(3), 355 (1979). URL http://www.numdam.org/item?id=ASENS_1979_4_12_3_355_0
[HNP09] C. Haase, J. Reine Angew. Math. 637, 207 (2009)
[HA] R. Hartshorne, Algebraic Geometry (Springer, New York, 1977). Graduate Texts in Mathematics, No. 52
[L94] J. Landsberg, Invent. Math. 117(2), 303 (1994). DOI 10.1007/BF01232243. URL http://dx.doi.org/10.1007/BF01232243
[LM00] A.A. Lanteri, R. Mallavibarrena, Higher order dual varieties of generically k-regular surfaces. Arch. Math. (Basel) 75(1), 75–80 (2000). doi:10.1007/s000130050476
[MT11] Y. Matsui, Adv. Math. 226(2), 2040 (2011). DOI 10.1016/j.aim.2010.08.020. URL http://dx.doi.org/10.1016/j.aim.2010.08.020
[MU02] M. Mustata, Tohoku Math. J. 54(3), 4451 (2002)
[ODA] T. Oda, Algebraic Geometry Seminar (Singapore, 1987) (World Scientific, Singapore, 1988), pp. 89–94
[ODAb] T. Oda, Convex Bodies and Algebraic Geometry, Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], vol. 15 (Springer, Berlin, 1988). An introduction to the theory of toric varieties, Translated from the Japanese
[Re83] M. Reid, Arithmetic and Geometry (Progress in Math. 36, Birkhäuser, Boston, 1983), pp. 395–418
A Tour on Hermitian Symmetric Manifolds

Filippo Viviani

1 Introduction

A Hermitian symmetric manifold (or HSM for short) is a Hermitian manifold which is homogeneous and such that every point has a symmetry preserving the Hermitian structure. First studied by Élie Cartan [Car35], they are the specialization of the notion of Riemannian symmetric manifolds (introduced by Élie Cartan himself in [Car26-27]) to complex manifolds. HSMs (or more generally Riemannian symmetric manifolds) arise in a wide variety of mathematical contexts: representation theory, harmonic analysis, automorphic forms, complex analysis, differential geometry, algebra (Lie theory and Jordan theory), number theory and algebraic geometry. For example, in algebraic geometry, HSMs often arise as (orbifold) fundamental covers of moduli spaces, such as the moduli space of polarized abelian varieties (possibly with level structures or with fixed endomorphism algebras), the moduli space of polarized K3 surfaces, the moduli space of polarized irreducible symplectic manifolds, etc.

Due to their frequent occurrence in different areas of mathematics, there is a vast literature on HSMs (e.g. [AMRT10, Bor52, BJ06, FKKLR00, Hel78, Koe69, Loo69a, Loo69b, Loo77, Mok89, PS69, Sat80, Wol67]) dealing with the various aspects of the theory. This vast literature, however, makes it difficult for a non-expert to have a global overview of the subject. The aim of these notes is to survey the different points of view on HSMs, so that a beginner can orient himself inside the vast literature. For this reason, we have chosen to give very few proofs of the results presented, referring the reader to the relevant literature for complete proofs.
F. Viviani
Dipartimento di Matematica e Fisica, Università degli Studi Roma Tre, Largo S. Leonardo Murialdo 1, 00146 Roma, Italy
A. Conca et al., Combinatorial Algebraic Geometry, Lecture Notes in Mathematics 2108, DOI 10.1007/978-3-319-04870-3_5, © Springer International Publishing Switzerland 2014
Let us now examine the contents of the paper in more detail. In studying HSMs, the reader should keep in mind the (well-known) classification of HSMs of (complex) dimension one. Namely, a HSM of complex dimension one is isomorphic to one of the following HSMs:
1. The complex manifold $\mathbb{C}/\Lambda$, where $\Lambda \subset \mathbb{C}$ is a discrete additive subgroup, endowed with the Hermitian structure induced by the standard Euclidean metric $g = dx \otimes dx + dy \otimes dy$ on $\mathbb{C} = \mathbb{R}^2_{(x,y)}$ (which has constant zero curvature). The translations $z \mapsto z + a$ (with $a \in \mathbb{C}$) act transitively via holomorphic isometries and the inversion symmetry at $[0] \in \mathbb{C}/\Lambda$ is given by $s_{[0]} : z \mapsto -z$.
2. The upper half space $\mathbb{H} := \{z = x + iy \in \mathbb{C} : \operatorname{Im} z = y > 0\}$ endowed with a Hermitian structure induced by the hyperbolic metric $g = \frac{dx \otimes dx + dy \otimes dy}{y^2}$ (which has constant negative curvature). The group $\mathrm{SL}_2(\mathbb{R})$ acts transitively via Möbius transformations (which are holomorphic isometries)
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot z := \frac{az + b}{cz + d}$$
and the inversion symmetry at $i \in \mathbb{H}$ is given by $s_i : z \mapsto -\frac{1}{z}$.
3. The complex projective line $\mathbb{P}^1_{\mathbb{C}}$ with the Fubini–Study Hermitian metric (with constant positive curvature), which is induced by pulling back the Euclidean metric on the two-dimensional sphere $S^2 \subset \mathbb{R}^3$ via the diffeomorphism $\mathbb{P}^1_{\mathbb{C}} \cong S^2$ induced via stereographic projection from the north pole $N = (1, 0, 0) \in S^2$. The group $\mathrm{SO}_3(\mathbb{R})$ acts transitively on $S^2$ via rotations (which are holomorphic isometries of $\mathbb{P}^1_{\mathbb{C}} \cong S^2$) and the inversion symmetry at the north pole $N$ is given by the rotation $s_N : (x, y, z) \mapsto (x, -y, -z)$.
Note that, according to Riemann's uniformization theorem, the only simply connected complex manifolds of dimension one are $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{P}^1_{\mathbb{C}}$.
The above trichotomy in dimension one extends to arbitrary dimensions (see the Decomposition Theorem 2.9): any HSM can be written uniquely as the product of a HSM of the form $\mathbb{C}^n/\Lambda$ for some discrete additive subgroup $\Lambda \subset \mathbb{C}^n$ (which is called the Euclidean factor of the HSM), of a HSM of non-compact type (i.e. a product of irreducible non-compact HSMs) and a HSM of compact type (i.e. a product of irreducible compact HSMs). The rest of Sect. 2 is devoted to the study of non-Euclidean HSMs, i.e. those for which the Euclidean factor in the above decomposition is trivial.
Hermitian symmetric manifolds of compact or of non-compact type admit another natural incarnation. Namely, HSMs of compact type are exactly the cominuscule rational homogeneous projective varieties, i.e. those varieties isomorphic to a quotient of the form $G/P$, where $G$ is a semisimple complex Lie group and $P \subset G$ is a parabolic subgroup whose unipotent radical is abelian. We review this description in Sect. 2.6.
HSMs of non-compact type admit a canonical embedding (the so-called Harish-Chandra embedding) inside a complex vector space in such a way that they become bounded symmetric domains. And, conversely, any bounded symmetric domain becomes a HSM of non-compact type when it is endowed with the Bergman metric. We review this correspondence between HSMs of non-compact type and bounded symmetric domains in Sect. 2.5.
There is a natural correspondence between HSMs of non-compact type and HSMs of compact type, which we review in Sect. 2.3. Moreover, this correspondence satisfies the property that each HSM of non-compact type is canonically realized (via the so-called Borel embedding) as an open subset inside the associated HSM of compact type (which is called its compact dual). We review the Borel embedding in Sect. 2.4.
Irreducible HSMs of compact or non-compact type can be classified using Lie theory. Indeed, they are diffeomorphic to a quotient of the form $G/K$ where $G$ is a simple Lie group (compact in the compact type case and non-compact in the non-compact type case) and $K \subset G$ is a maximal compact proper subgroup whose center is equal to $S^1$. We review this description in Sect. 2.1. By passing to the associated Lie algebras, we get a correspondence between non-Euclidean HSMs and irreducible Hermitian symmetric Lie algebras, which is the datum of a simple real Lie algebra $\mathfrak{g}$ together with an involution $\theta : \mathfrak{g} \to \mathfrak{g}$ such that its $+1$ eigenspace has a one-dimensional center. See Sect. 2.2 for more details.
There is an alternative approach to the study of HSMs of non-compact type based on Jordan theory rather than Lie theory. Indeed, there is a natural bijection between HSMs of non-compact type and Hermitian Jordan triple systems; see Sect. 2.7 for more details.
Using either Lie theory or Jordan theory, it is possible to give a classification of irreducible HSMs of non-compact type (and of their compact duals).
They are divided into four infinite families, called (following Siegel's notation) $I_{p,q}$, $II_n$, $III_n$ and $IV_n$, and two exceptional cases, called $V$ and $VI$. Section 3 is devoted to a detailed analysis of each of the above-mentioned irreducible HSMs. In particular, we make explicit, in each of the cases, the general properties of HSMs presented in Sect. 2.
Section 4 is devoted to the study of the boundary components of HSMs of non-compact type. More precisely, fix a HSM of non-compact type and realize it as a bounded symmetric domain $D \subset \mathbb{C}^N$ via its Harish-Chandra embedding. The closure $\overline{D}$ of $D$ inside $\mathbb{C}^N$ can be partitioned into several equivalence classes for the equivalence relation of being connected through a chain of holomorphic disks. Each of these equivalence classes, called boundary components of $D$, is again a HSM of non-compact type which is realized as a bounded symmetric domain inside its linear span in $\mathbb{C}^N$. Boundary components can be classified via their normalizer subgroups, which turn out to be all the maximal parabolic subgroups of the group $G$ of automorphisms of $D$ (see Theorem 4.5). The structure of the normalizer subgroups of the boundary components is analyzed in detail in Sect. 4.1.
In Sect. 4.2, we show that, for every boundary component $F$ of $D$, the domain $D$ can be decomposed into the product of $F$, a real vector space $W(F)$ and a symmetric cone $C(F)$ associated to $F$, i.e. an open homogeneous cone inside a real vector space which is self-dual with respect to a suitable scalar product. In Sect. 4.3, we show how symmetric cones correspond bijectively to Euclidean Jordan algebras and we present the classification of irreducible symmetric cones via the classification of simple Euclidean Jordan algebras. In Sect. 4.4, we show how bounded symmetric domains can be realized in a unique way as Siegel domains (of the second type) associated to a suitable symmetric cone and to a suitable representation of the associated Euclidean Jordan algebra. In Sect. 4.5, we describe explicitly the boundary components of each of the irreducible bounded symmetric domains by computing their normalizer subgroups and their associated symmetric cones.
These notes were written for a Ph.D. course (held at the University of Roma Tre in Spring 2013) entitled "Toroidal compactifications of locally symmetric varieties" and a course in the Summer School "Combinatorial Algebraic Geometry" (held in Levico Terme in June 2013) with the same title. Thus, our original motivation was to write a survey on the construction of toroidal compactifications of locally symmetric varieties (i.e. quotients of Hermitian symmetric manifolds of non-compact type by arithmetic subgroups), by revisiting the original work of Ash–Mumford–Rapoport–Tai [AMRT10] (see also the work of Namikawa [Nam80], where the case of Siegel spaces is worked out explicitly). Due to limitations in space and time, we were unable to complete this project and we ended up with an attempt to write a survey on the beautiful and rich theory of Hermitian symmetric manifolds. We plan to write a sequel to these notes on the construction of toroidal compactifications of locally symmetric varieties.
Notations
1.1. Given a Lie group $G$, we denote by $G^o$ the connected component of $G$ containing the identity and by $Z_G$ its center. A semisimple Lie group $G$ is said to be adjoint if it has trivial center, or in symbols if $Z_G = \{e\}$.
1.2. Given a Lie algebra $\mathfrak{g}$, we denote its center by $Z(\mathfrak{g})$. For a real Lie algebra $\mathfrak{g}$, we denote by $\mathfrak{g}_{\mathbb{C}} := \mathfrak{g} \otimes_{\mathbb{R}} \mathbb{C} = \mathfrak{g} \oplus i\mathfrak{g}$ its complexification.
1.3. Given a real (resp. complex) finite-dimensional vector space $V$ and two symmetric (resp. Hermitian) linear operators $F, G \in \operatorname{End}(V)$, we write:
(a) $F > G$ (or $G < F$) if and only if $F - G$ is positive definite;
(b) $F \geq G$ (or $G \leq F$) if and only if $F - G$ is positive semidefinite.
1.4. Given a matrix $M \in M_{n,n}(F)$ with entries in $F = \mathbb{R}, \mathbb{C}, \mathbb{H}$, we will denote by $M^t$ its transpose and by $\overline{M}$ its conjugate with respect to:
• the trivial conjugation if $F = \mathbb{R}$;
• the conjugation $x_0 + ix_1 \mapsto x_0 - ix_1$ if $F = \mathbb{C}$;
• the conjugation $x_0 + ix_1 + jx_2 + kx_3 \mapsto x_0 - ix_1 - jx_2 - kx_3$ if $F = \mathbb{H}$.
Moreover, we set $M^* := \overline{M}^t$.
1.5. We will denote by $0$ the zero matrix of any size, by $I_n$ the identity $n \times n$ matrix, by $J_n$ the $2n \times 2n$ standard symplectic matrix, i.e. $J_n := \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$, and we set $S_n := \begin{pmatrix} 0 & I_n \\ I_n & 0 \end{pmatrix}$.
1.6. For the notation on simple real Lie groups, we will follow [Hel78, Chap. X, §2] (see also [Kna96, Chap. I, §17]).
2 Hermitian Symmetric Manifolds
The aim of this section is to introduce Hermitian symmetric manifolds and to establish their basic properties. Let us begin by recalling the definition of a complex structure and of an almost complex structure on a differentiable manifold $M$.
Definition 2.1.
(i) A complex manifold is a pair $(M, \mathcal{O}_M)$ consisting of a (connected) differentiable manifold $M$ and a sheaf $\mathcal{O}_M$ of $\mathbb{C}$-valued smooth functions on $M$ such that $(M, \mathcal{O}_M)$ is locally isomorphic to $(\mathbb{C}^N, \mathcal{O}_{\mathbb{C}^N})$, where $\mathcal{O}_{\mathbb{C}^N}$ is the sheaf of holomorphic functions on $\mathbb{C}^N$. The sheaf $\mathcal{O}_M$ is said to be a complex structure on the manifold $M$.
(ii) A quasi-complex (or almost complex) manifold is a pair $(M, J)$ consisting of a (connected) differentiable manifold $M$ and a smooth tensor field $J$ of type $(1, 1)$ such that for every $p \in M$ the induced linear map $J_p : T_pM \to T_pM$ satisfies $J_p^2 = -\mathrm{id}$, i.e. $J_p$ is a complex structure on the vector space $T_pM$. The tensor field $J$ is said to be a quasi-complex (or almost complex) structure on the manifold $M$.
Given a complex manifold $(M, \mathcal{O}_M)$, the local isomorphism of $(M, \mathcal{O}_M)$ with $(\mathbb{C}^n, \mathcal{O}_{\mathbb{C}^n})$, together with the natural complex structure on each tangent space $T_q\mathbb{C}^n \cong \mathbb{C}^n$ given by multiplication by $i$, induces a quasi-complex structure $J$ on $M$. A quasi-complex structure $J$ on $M$ induced by a complex structure is said to be integrable. Integrable quasi-complex structures are characterized by the following well-known theorem of Newlander–Nirenberg (see [Hel78, Chap. VIII, Thm. 1.2] and the references therein).
Theorem 2.2 (Newlander–Nirenberg). A quasi-complex structure $J$ on $M$ is induced by a complex structure on $M$ (i.e. it is integrable) if and only if
$$[JX, JY] = J[JX, Y] + J[X, JY] + [X, Y]$$
for any two vector fields $X$ and $Y$ on $M$. In this case, the complex structure on $M$ is uniquely determined by the almost complex structure $J$ on $M$.
Therefore giving a complex manifold is equivalent to giving an almost complex manifold $(M, J)$ such that $J$ is integrable.
There are three equivalent ways of giving a Hermitian structure on a complex manifold $(M, J)$, which we now recall.
Lemma-Definition 2.3. Let $(M, J)$ be a complex manifold. A Hermitian structure on $(M, J)$ is the assignment of one of the following equivalent structures:
(i) A smooth tensor field $h$ of type $(0, 2)$ such that $h_p : T_pM \times T_pM \to \mathbb{C}$ is a positive definite Hermitian form with respect to $J_p$ for any $p \in M$ (called a Hermitian metric), i.e.
• $h_p(x, y) = \overline{h_p(y, x)}$ for any $x, y \in T_pM$;
• $h_p(J_p x, y) = i\,h_p(x, y)$ for any $x, y \in T_pM$;
• $h_p(x, x) > 0$ for any $0 \neq x \in T_pM$.
(ii) A Riemannian metric $g$ such that $g_p : T_pM \times T_pM \to \mathbb{R}$ is compatible with $J_p$ for any $p \in M$, i.e.
• $g_p(x, y) = g_p(y, x)$ for any $x, y \in T_pM$;
• $g_p(J_p x, J_p y) = g_p(x, y)$ for any $x, y \in T_pM$;
• $g_p(x, x) > 0$ for any $0 \neq x \in T_pM$.
(iii) A 2-form $\omega$ such that $\omega_p : T_pM \times T_pM \to \mathbb{R}$ is compatible with $J_p$ and positive definite with respect to $J_p$ for any $p \in M$, i.e.
• $\omega_p(x, y) = -\omega_p(y, x)$ for any $x, y \in T_pM$;
• $\omega_p(J_p x, J_p y) = \omega_p(x, y)$ for any $x, y \in T_pM$;
• $\omega_p(x, J_p x) > 0$ for any $0 \neq x \in T_pM$.
One can pass from one assignment to the other two by means of the following formulas:
$$g(X, Y) = \operatorname{Re} h(X, Y) = \omega(X, JY),$$
$$\omega(X, Y) = -\operatorname{Im} h(X, Y) = g(JX, Y),$$
$$h(X, Y) = g(X, Y) - i\,g(JX, Y) = \omega(X, JY) - i\,\omega(X, Y),$$
for $X, Y$ any smooth vector fields on $M$.
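The conversion formulas of Lemma-Definition 2.3 can be checked on a single tangent space. The following is only a minimal sketch under assumptions that are ours and not part of the text: $T_pM$ is modelled as $\mathbb{C} = \mathbb{R}^2$, $J_p$ is multiplication by $i$, and $h$ is the standard Hermitian form $h(x, y) = x\overline{y}$. The function names are hypothetical.

```python
# Sketch: T_pM is modelled as C = R^2 and J_p is multiplication by i.
# Assumption (not from the text): h is the standard form h(x, y) = x * conj(y).

def J(x: complex) -> complex:
    """Complex structure on the tangent space: multiplication by i."""
    return 1j * x

def h(x: complex, y: complex) -> complex:
    """Hermitian form, C-linear in the first argument."""
    return x * y.conjugate()

def g(x: complex, y: complex) -> float:
    """Riemannian metric g = Re h."""
    return h(x, y).real

def omega(x: complex, y: complex) -> float:
    """2-form omega = -Im h."""
    return -h(x, y).imag

def check(x: complex, y: complex, tol: float = 1e-12) -> bool:
    """Verify the identities of Lemma-Definition 2.3 for one pair (x, y)."""
    return all([
        abs(h(x, y) - h(y, x).conjugate()) < tol,            # h is Hermitian
        abs(h(J(x), y) - 1j * h(x, y)) < tol,                # h(Jx, y) = i h(x, y)
        abs(g(J(x), J(y)) - g(x, y)) < tol,                  # g is J-invariant
        abs(omega(x, y) + omega(y, x)) < tol,                # omega is alternating
        abs(omega(x, y) - g(J(x), y)) < tol,                 # omega(x, y) = g(Jx, y)
        abs(g(x, y) - omega(x, J(y))) < tol,                 # g(x, y) = omega(x, Jy)
        abs(h(x, y) - (g(x, y) - 1j * omega(x, y))) < tol,   # h = g - i omega
    ])
```

For instance, `check(1 + 2j, 3 - 1j)` evaluates all seven identities at once, and `omega(x, J(x))` equals `g(x, x)`, which is positive for every nonzero `x`.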
We say that $(M, J, h)$ (resp. $(M, J, g)$, resp. $(M, J, \omega)$) is a Hermitian manifold if $(M, J)$ is a complex manifold and $h$ (resp. $g$, resp. $\omega$) defines a Hermitian structure on $(M, J)$. Sometimes, we will say that $M$ is a Hermitian manifold if there is no need to specify the complex structure and the Hermitian structure. Using partitions of unity, it is easy to show that every complex manifold can be endowed with a Hermitian structure. In what follows, we will be interested in complex manifolds that admit a special Hermitian structure in the following sense.
Definition 2.4. Let $(M, J, h)$ be a Hermitian manifold. Denote by $\operatorname{Aut}(M, J, h)$ the group of holomorphic isometries, i.e. the group of self-diffeomorphisms $\phi : M \to M$ such that $\phi^* J = J$ and $\phi^* h = h$. We say that
(i) $(M, J, h)$ is homogeneous if $\operatorname{Aut}(M, J, h)$ acts transitively on $M$;
(ii) $(M, J, h)$ is symmetric (or HSM for short) if it is homogeneous and for some $p \in M$ (or, equivalently, for any $p \in M$) there exists $s_p \in \operatorname{Aut}(M, J, h)$ (called a symmetry at $p$) such that $s_p^2 = \mathrm{id}$ and $p$ is an isolated fixed point of $s_p$.
Remark 2.5.
(i) If every point $p \in M$ admits a symmetry $s_p$ as above then $(M, J, h)$ is automatically homogeneous (see [Mil05, Prop. 1.6]).
(ii) The symmetry $s_p$ at $p$ can be characterized as the unique $s_p \in \operatorname{Aut}(M, J, h)$ such that $s_p(p) = p$ and $ds_p = -\mathrm{id}_{T_pM}$. It follows that $s_p$ is a geodesic symmetry at $p$, i.e. if $\gamma : (-a, a) \to M$ is any geodesic such that $\gamma(0) = p$ then $s_p(\gamma(t)) = \gamma(-t)$ for any $-a < t < a$ (see [Mil05, Prop. 1.11]).
(iii) If $(M, J, h = g - i\omega)$ is a Hermitian symmetric manifold then:
• $(M, g)$ is (geodesically) complete, i.e. $M$ is a complete metric space or, equivalently, every geodesic of the Riemannian manifold $(M, g)$ can be defined on the entire real line (see [Mil05, Prop. 1.11]);
• $(M, J, \omega)$ is Kähler, i.e. $\omega$ is a closed 2-form (see [Hel78, Chap. VIII, Thm. 4.1]).
(iv) A Riemannian manifold $(M, g)$ such that the group $\operatorname{Aut}(M, g)$ of isometries acts transitively on $M$ and for some $p \in M$ (or, equivalently, for any $p \in M$) there exists $s_p \in \operatorname{Aut}(M, g)$ which is a geodesic symmetry at $p$ is called a Riemannian symmetric manifold; see [Hel78] for an extensive study of Riemannian symmetric manifolds. Note that Hermitian symmetric manifolds are in particular Riemannian symmetric manifolds.
In dimension one, every Hermitian symmetric manifold is isomorphic to one of the following examples.
Example 2.6.
(1) Let $\Lambda \subset \mathbb{C}$ be a discrete additive subgroup (note that $\Lambda$ is isomorphic to $(0)$, $\mathbb{Z}$ or $\mathbb{Z}^2$). The quotient $\mathbb{C}/\Lambda$ is a complex manifold which we endow with the Hermitian structure induced by the standard Euclidean metric $g = dx \otimes dx + dy \otimes dy$ on $\mathbb{C} = \mathbb{R}^2_{(x,y)}$ (which has constant zero curvature). The translations $z \mapsto z + a$ (with $a \in \mathbb{C}$) act transitively via holomorphic isometries, so that $\mathbb{C}/\Lambda$ is a
homogeneous Hermitian manifold. If $o$ denotes the class of $0$ in the quotient $\mathbb{C}/\Lambda$, then the map $s_o : z \mapsto -z$ is a symmetry at $o$, which shows that $\mathbb{C}/\Lambda$ is a Hermitian symmetric manifold.
(2) Let $\mathbb{H} := \{z = x + iy \in \mathbb{C} : \operatorname{Im} z = y > 0\}$ be the upper half space. Then $\mathbb{H}$ inherits from $\mathbb{C}$ a complex structure and we endow it with a Hermitian structure induced by the hyperbolic metric $g = \frac{dx \otimes dx + dy \otimes dy}{y^2}$ (which has constant negative curvature). The group $\mathrm{SL}_2(\mathbb{R})$ acts transitively via Möbius transformations (which are holomorphic isometries)
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot z := \frac{az + b}{cz + d},$$
which shows that $\mathbb{H}$ is a homogeneous Hermitian manifold. The Möbius transformation $s_i : z \mapsto -\frac{1}{z}$ is a symmetry at $i \in \mathbb{H}$, so that $\mathbb{H}$ is a Hermitian symmetric manifold.
(3) Let $\mathbb{P}^1_{\mathbb{C}}$ be the complex projective line. Via stereographic projection from the north pole $N = (1, 0, 0)$, the two-dimensional sphere $S^2 \subset \mathbb{R}^3_{(x,y,z)}$ is diffeomorphic to $\mathbb{P}^1_{\mathbb{C}}$. Via this diffeomorphism, the restriction of the Euclidean metric to $S^2$ induces a metric $g$ on $\mathbb{P}^1_{\mathbb{C}}$ (with constant positive curvature) which is compatible with its complex structure, i.e. it induces a Hermitian structure on $\mathbb{P}^1_{\mathbb{C}}$, which is called the Fubini–Study metric. The group $\mathrm{SO}_3(\mathbb{R})$ acts transitively on $S^2$ via rotations, which are holomorphic isometries of $\mathbb{P}^1_{\mathbb{C}} \cong S^2$; hence $\mathbb{P}^1_{\mathbb{C}}$ is a homogeneous Hermitian manifold. The rotation $s_N : (x, y, z) \mapsto (x, -y, -z)$ is a symmetry at the north pole $N \in S^2 \cong \mathbb{P}^1_{\mathbb{C}}$, which shows that $\mathbb{P}^1_{\mathbb{C}}$ is a Hermitian symmetric manifold.
The trichotomy of the previous Example 2.6 extends to arbitrary dimension.
Definition 2.7. Let $M$ be a Hermitian symmetric manifold (HSM).
(i) $M$ is said to be of Euclidean type if $M$ is isomorphic to $\mathbb{C}^n/\Lambda$ for some discrete additive subgroup $\Lambda \subset \mathbb{C}^n$, where $\mathbb{C}^n/\Lambda$ is endowed with the complex structure and the Hermitian metric induced by $\mathbb{C}^n$.
(ii) $M$ is said to be irreducible if it is not Euclidean and it cannot be written as the product of two non-trivial HSMs.
(iii) $M$ is said to be non-Euclidean if it is the product of irreducible HSMs.
(iv) M is said to be of compact type (resp. non-compact type) if it is the product of compact (resp. non-compact) irreducible HSMs. Hermitian symmetric manifolds of non-compact type are also called Hermitian symmetric domains, due to the fact that they are biholomorphic to bounded symmetric domains (see Sect. 2.5). Remark 2.8. Clearly, Euclidean HSMs have identically zero Riemannian sectional curvature. On the other hand, HSMs of compact type (resp. of non-compact type) have semipositive (resp. seminegative) Riemannian sectional curvature (see
[Hel78, Chap. V, Thm. 3.1]) and therefore also semipositive (resp. seminegative) holomorphic bisectional curvature (see [Mok89, Chap. 2, (3.3), Prop. 1]). Moreover, irreducible HSMs of compact type (resp. of non-compact type) have positive (resp. negative) Ricci curvature (see [Mok89, Chap. 3, (1.3), Prop. 2]).
Every Hermitian symmetric manifold can be decomposed uniquely in the following way (see [Hel78, Chap. VIII, Prop. 4.4, Thm. 4.6, Prop. 5.5]).
Theorem 2.9 (Decomposition theorem). Every Hermitian symmetric manifold $M$ decomposes uniquely as
$$M = M_0 \times M_- \times M_+,$$
where $M_0$ is a Euclidean HSM, $M_-$ is a HSM of compact type and $M_+$ is a HSM of non-compact type. Moreover, $M_-$ (resp. $M_+$) is simply connected and it decomposes uniquely as a product of compact (resp. non-compact) irreducible HSMs.
In particular, any HSM is the product of a Euclidean HSM and of a non-Euclidean HSM. Since Euclidean HSMs are easy to understand (being isomorphic to $\mathbb{C}^n/\Lambda$, for some discrete additive subgroup $\Lambda \subset \mathbb{C}^n$), from now on we will focus on non-Euclidean HSMs.
An important invariant of a non-Euclidean HSM is its rank, which we are now going to define following [Hel78, Chap. V, §6]. Recall that a submanifold $N$ of a Riemannian manifold $(M, g)$ is called totally geodesic if for every $p \in N$ all the geodesics of $M$ through $p$ that are tangent to $N$ are contained in $N$. Moreover, in the case where $(M, g)$ is a Riemannian symmetric manifold [in the sense of Remark 2.5(iv)], $N$ is totally geodesic if and only if for every $p \in N$ we have that $s_p(N) = N$ (see [Mok89, Chap. 5, (1.1), Lemma 1.1]). Furthermore, $N$ is said to be flat if the restriction of the Riemannian metric $g$ to $N$ has identically zero curvature tensor.
Definition 2.10. Let $M$ be a non-Euclidean HSM. The rank of M is the maximal dimension of a flat totally geodesic submanifold of $M$.
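Before moving on, the defining properties of a HSM (Definition 2.4) can be tested numerically on the upper half space of Example 2.6(2). The sketch below is only an illustration: the function names, the sample points and the tolerances are our own choices, and the explicit matrix used in `to_point` is one convenient element of $\mathrm{SL}_2(\mathbb{R})$ sending $i$ to $x + iy$, not a construction taken from the text.

```python
import math

def mobius(a, b, c, d, z):
    """Action of the real matrix [[a, b], [c, d]] on z by z -> (az + b)/(cz + d)."""
    return (a * z + b) / (c * z + d)

def mobius_derivative(a, b, c, d, z):
    """Complex derivative of the Moebius map at z."""
    return (a * d - b * c) / (c * z + d) ** 2

def s_i(z):
    """Symmetry at i: z -> -1/z, i.e. the Moebius map of [[0, -1], [1, 0]]."""
    return -1 / z

def to_point(x, y):
    """Entries (a, b, c, d) of a determinant-one real matrix sending i to x + iy, y > 0."""
    r = math.sqrt(y)
    return (r, x / r, 0.0, 1 / r)

def hyperbolic_norm(z, v):
    """Length of the tangent vector v at z for g = (dx^2 + dy^2) / y^2."""
    return abs(v) / z.imag
```

With these helpers one can check that `mobius(*to_point(x, y), 1j)` returns `x + iy` (transitivity), that a determinant-one matrix preserves `hyperbolic_norm` when tangent vectors are pushed forward with `mobius_derivative` (isometry), and that `s_i` fixes `1j`, squares to the identity and has derivative $-1$ at $i$, in agreement with Remark 2.5(ii).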
2.1 Classifying Non-Euclidean HSM Via Lie Groups
The aim of this subsection is to classify non-Euclidean Hermitian symmetric manifolds in terms of Lie groups.
Let $M = (M, J, h)$ be a non-Euclidean Hermitian symmetric manifold and fix a point $o \in M$. The group $\operatorname{Aut}(M) = \operatorname{Aut}(M, J, h)$ of holomorphic isometries of $M$, endowed with the compact-open topology, becomes a (real) Lie group (see [Hel78, Chap. VIII, §4]). We denote by $\operatorname{Aut}(M)^o$ the connected component of $\operatorname{Aut}(M)$ containing the identity and by $\operatorname{Stab}(o)$ the Lie subgroup of $\operatorname{Aut}(M)^o$ consisting
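of all the elements that fix $o \in M$. The symmetry $s_o$ at $o$ induces the following involutive automorphism of $\operatorname{Aut}(M)^o$:
$$\sigma : \operatorname{Aut}(M)^o \to \operatorname{Aut}(M)^o, \qquad g \mapsto s_o\, g\, s_o.$$
Denote by $\operatorname{Fix}(\sigma)$ the closed Lie subgroup of $\operatorname{Aut}(M)^o$ consisting of all the elements that are fixed by $\sigma$ and let $\operatorname{Fix}(\sigma)^o$ be the connected component of $\operatorname{Fix}(\sigma)$ containing the identity.
Theorem 2.11. Notations as above.
(i) $\operatorname{Aut}(M)^o$ is a semisimple adjoint (i.e. with trivial center) Lie group and $\operatorname{Stab}(o)$ is a compact Lie subgroup of $\operatorname{Aut}(M)^o$ such that
$$\operatorname{Fix}(\sigma)^o \subseteq \operatorname{Stab}(o) \subseteq \operatorname{Fix}(\sigma).$$
(ii) The map
$$\operatorname{Aut}(M)^o / \operatorname{Stab}(o) \to M, \qquad [g] \mapsto g \cdot o$$
is an $\operatorname{Aut}(M)^o$-equivariant diffeomorphism.
(iii) The symmetry $s_o$ is contained in the identity component of the center of $\operatorname{Stab}(o)$.
Proof. See [Hel78, Chap. IV, Thm. 3.3; Chap. VIII, Thm. 4.5]. □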
A pair $(G, K)$ consisting of a connected semisimple Lie group $G$ and a compact subgroup $K$ for which there exists an involutive automorphism $\sigma$ of $G$ such that $\operatorname{Fix}(\sigma)^o \subseteq K \subseteq \operatorname{Fix}(\sigma)$ is a particular case of a Riemannian symmetric pair (see [Hel78, Chap. IV, §3]). In particular, Theorem 2.11(i) is saying that for any non-Euclidean Hermitian symmetric manifold $M$, the pair $(\operatorname{Aut}(M)^o, \operatorname{Stab}(o))$ is a Riemannian symmetric pair. Conversely, starting with a Riemannian symmetric pair $(G, K)$, the quotient manifold $G/K$ can always be endowed with a Riemannian metric $g$ such that $(G/K, g)$ is a Riemannian symmetric space (see [Hel78, Chap. IV, Prop. 3.4]). However, in order to endow $G/K$ with a complex structure $J$ and a Hermitian metric $h$ such that $(G/K, J, h)$ becomes a Hermitian symmetric manifold, the pair $(G, K)$ must satisfy some extra conditions. Theorem 2.11(iii) says that a necessary condition for a Riemannian symmetric pair $(G, K)$ to come from a Hermitian symmetric manifold is that the center of $K$ is not finite. Indeed, in the irreducible case, this last condition is also sufficient.
Theorem 2.12.
(i) Every irreducible HSM of non-compact type is diffeomorphic to $G/K$ for a unique pair $(G, K)$ such that $G$ is a connected non-compact simple adjoint Lie group and $K$ is a maximal connected and compact Lie subgroup of $G$ with non-discrete center $Z_K$ (or, equivalently, with $Z_K = S^1$).
(ii) Every irreducible HSM of compact type is diffeomorphic to $G/K$ for a unique pair $(G, K)$ such that $G$ is a connected compact simple adjoint Lie group and $K$ is a maximal connected and compact proper Lie subgroup of $G$ with non-discrete center $Z_K$ (or, equivalently, with $Z_K = S^1$).
Proof. See [Hel78, Chap. VIII, §6]. □

Table 1 Irreducible HSMs: $G/K$ is of non-compact type and $G^c/K$ is its compact dual

| Type | Group $G$ | Group $G^c$ | Compact subgroup $K$ | $\dim_{\mathbb{R}} M$ | Rank $M$ |
|---|---|---|---|---|---|
| $I_{p,q}$ $(p \geq q \geq 1)$ | $\mathrm{PSU}(p, q)$ | $\mathrm{PSU}(p + q)$ | $\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)$ | $2pq$ | $q$ |
| $II_n$ $(n \geq 2)$ | $\cong \mathrm{PSO}^*(2n)$ | $\cong \mathrm{PSO}(2n)$ | $\mathrm{U}(n)$ | $n(n - 1)$ | $\lfloor n/2 \rfloor$ |
| $III_n$ $(n \geq 1)$ | $\cong \mathrm{PSp}(n, \mathbb{R})$ | $\mathrm{PSp}(n)$ | $\mathrm{U}(n)$ | $n(n + 1)$ | $n$ |
| $IV_n$ $(2 \neq n \geq 1)$ | $\cong \mathrm{PSO}(2, n)$ | $\mathrm{PSO}(2 + n)$ | $\mathrm{SO}(2) \times \mathrm{SO}(n)$ | $2n$ | $\min\{2, n\}$ |
| $V$ | $E_{6(-14)}$ | $E_6^c$ | $\mathrm{SO}(10) \times \mathrm{SO}(2)$ | $32$ | $2$ |
| $VI$ | $E_{7(-25)}$ | $E_7^c$ | $E_6^c \times \mathrm{SO}(2)$ | $54$ | $3$ |
Using the Lie-theoretic representation given by Theorem 2.12, É. Cartan in [Car35] (based upon his previous work [Car26-27] in which he classifies Riemannian symmetric manifolds) was able to classify the irreducible HSMs of non-compact type (resp. of compact type) into six types (see also [Bor52] for a nice exposition of the work of Cartan). We list the six types (together with their real dimensions and their rank) in Table 1, referring to Sect. 3 for more details on each type and to Sect. 2.3 for an explanation of the duality between HSMs of non-compact type and HSMs of compact type.
Using the above presentation of HSMs, it is possible to give a Lie-theoretic description of the rank of a HSM of non-compact type (as in Definition 2.10).
Proposition 2.13. Let $M$ be a HSM of non-compact type and write $M \cong \operatorname{Aut}(M)^o / \operatorname{Stab}(o) = G/K$ as in Theorem 2.11. Then the rank of $M$ is equal to the dimension of any maximal $\mathbb{R}$-split torus $T$ contained in $G$.
Proof. See [Mor, Sec. 8C]. □
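The real dimensions in Table 1 can be cross-checked against the presentation $M \cong G/K$, since $\dim_{\mathbb{R}} M = \dim G - \dim K$. The following sketch does this with the standard dimension formulas for the classical Lie algebras and with the well-known values $\dim E_6 = 78$, $\dim E_7 = 133$; the function names are hypothetical and serve only this illustration.

```python
def dim_so(n: int) -> int:
    """Real dimension of so(n) (and of its other real forms)."""
    return n * (n - 1) // 2

def type_I(p: int, q: int):
    """Returns (dim G - dim K, value listed in Table 1) for I_{p,q}."""
    dim_G = (p + q) ** 2 - 1        # su(p, q)
    dim_K = p * p + q * q - 1       # s(u(p) + u(q))
    return dim_G - dim_K, 2 * p * q

def type_II(n: int):
    dim_G = dim_so(2 * n)           # so*(2n)
    dim_K = n * n                   # u(n)
    return dim_G - dim_K, n * (n - 1)

def type_III(n: int):
    dim_G = n * (2 * n + 1)         # sp(n, R)
    dim_K = n * n                   # u(n)
    return dim_G - dim_K, n * (n + 1)

def type_IV(n: int):
    dim_G = dim_so(n + 2)           # so(2, n)
    dim_K = dim_so(2) + dim_so(n)   # so(2) + so(n)
    return dim_G - dim_K, 2 * n

# Exceptional cases: dim E6 = 78, dim E7 = 133, dim so(10) = 45, dim so(2) = 1.
EXCEPTIONAL = {
    "V": (78 - (45 + 1), 32),
    "VI": (133 - (78 + 1), 54),
}
```

Each function returns a pair whose two entries agree, e.g. `type_I(3, 2)` gives `(12, 12)`.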
2.2 Classifying Non-Euclidean HSM Via Lie Algebras
The aim of this subsection is to classify non-Euclidean HSMs in terms of Lie algebra data.
Let $M = (M, J, h)$ be a non-Euclidean HSM with a fixed base point $o \in M$. We shorten the notation used in Sect. 2.1 by setting $G := \operatorname{Aut}(M)^o$ and $K := \operatorname{Stab}(o)$. Let $\mathfrak{g} = \operatorname{Lie}(G)$ be the real Lie algebra of $G$ [which is semisimple by
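Theorem 2.11(i)] and let $\theta = d\sigma : \mathfrak{g} \to \mathfrak{g}$ be the involution of $\mathfrak{g}$ given by the differential of the involution $\sigma$ of $G$. Denote by $\mathfrak{k}$ (resp. $\mathfrak{p}$) the eigenspace for $\theta$ relative to the eigenvalue $+1$ (resp. $-1$). Since $\theta^2 = \mathrm{id}$, we have a direct sum decomposition
$$\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p} \tag{1}$$
in such a way that the Lie bracket $[\cdot, \cdot]$ of $\mathfrak{g}$ satisfies
$$[\mathfrak{k}, \mathfrak{k}] \subseteq \mathfrak{k}, \qquad [\mathfrak{k}, \mathfrak{p}] \subseteq \mathfrak{p}, \qquad [\mathfrak{p}, \mathfrak{p}] \subseteq \mathfrak{k}. \tag{2}$$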
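The decomposition (1) and the bracket relations (2) can be verified by hand in the smallest case. The sketch below takes $\mathfrak{g} = \mathfrak{sl}_2(\mathbb{R})$, the Lie algebra of the automorphism group of the upper half space of Example 2.6(2), with the involution $\theta(X) = -X^t$ (the differential of conjugation by the symmetry $s_i$). The helper names and the chosen bases are ours, introduced only for illustration.

```python
from fractions import Fraction

def mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def theta(X):
    """Involution theta(X) = -X^t on sl_2(R)."""
    return [[-X[j][i] for j in range(2)] for i in range(2)]

def in_k(X):
    """X lies in the +1 eigenspace k of theta."""
    return theta(X) == X

def in_p(X):
    """X lies in the -1 eigenspace p of theta."""
    return theta(X) == [[-x for x in row] for row in X]

def split(X):
    """Decompose X = X_k + X_p as in (1)."""
    T = theta(X)
    X_k = [[Fraction(X[i][j] + T[i][j], 2) for j in range(2)] for i in range(2)]
    X_p = [[Fraction(X[i][j] - T[i][j], 2) for j in range(2)] for i in range(2)]
    return X_k, X_p

K_BASIS = [[[0, 1], [-1, 0]]]                       # skew-symmetric matrices: so(2)
P_BASIS = [[[1, 0], [0, -1]], [[0, 1], [1, 0]]]     # symmetric traceless matrices
```

Here $\mathfrak{k} = \mathfrak{so}(2)$ is one-dimensional and $\mathfrak{p}$ is two-dimensional, matching $\dim_{\mathbb{R}} \mathbb{H} = 2$; for example, `bracket(P_BASIS[0], P_BASIS[1])` is a nonzero multiple of the generator of $\mathfrak{k}$.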
From Theorem 2.11(i), it follows that the Lie subalgebra $\mathfrak{k} \subset \mathfrak{g}$ is equal to the Lie subalgebra $\operatorname{Lie}(K) \subset \operatorname{Lie}(G)$ of the Lie subgroup $K \subset G$. In particular, since $K$ is a compact Lie group, it follows that $\mathfrak{k}$ is a compactly embedded subalgebra of $\mathfrak{g}$ (in the sense of [Hel78, p. 130]). Therefore, the pair $(\mathfrak{g}, \theta)$ defined above is a special case of an effective orthogonal symmetric Lie algebra in the sense of [Hel78, Chap. V, §1].
Moreover, the $G$-equivariant diffeomorphism $G/K \xrightarrow{\cong} M$ of Theorem 2.11(ii) induces a canonical identification $\mathfrak{p} \cong T_oM$, where $T_oM$ is the tangent space of $M$ at $o \in M$. The above identification, together with the complex structure $J$ on $M$, induces a complex structure $J_o$ on $\mathfrak{p}$. We extend $J_o$ to a linear operator on $\mathfrak{g}$ by setting:
$$J = \begin{cases} 0 & \text{on } \mathfrak{k}, \\ J_o & \text{on } \mathfrak{p}. \end{cases} \tag{3}$$
Using the fact that the complex structure $J$ on $M$ is compatible with the Hermitian metric $h$ on $M$, one can prove the following (see [Sat80, Chap. II, §3]).
Lemma 2.14. $J$ is a derivation of $\mathfrak{g}$.
Since $\mathfrak{g}$ is semisimple, each derivation of $\mathfrak{g}$ is inner (see [Kna96, Prop. 1.98]) and the center of $\mathfrak{g}$ is trivial; therefore, $J = \operatorname{ad} H$ for a unique element $H \in \mathfrak{g}$. Using (3), it follows that $H$ belongs to the center $Z(\mathfrak{k})$ of $\mathfrak{k}$.
The properties of the triple $(\mathfrak{g}, \theta, H)$ are summarized in the following definition, which is a slight adaptation of [Sat80, p. 54].
Definition 2.15. A Hermitian symmetric Lie algebra (or, for short, a Hermitian SLA) is a triple $(\mathfrak{g}, \theta, H)$ consisting of a semisimple real Lie algebra $\mathfrak{g}$, an involution $\theta : \mathfrak{g} \to \mathfrak{g}$ whose associated decomposition $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$ into eigenspaces for the eigenvalues $+1$ and $-1$ is such that $\mathfrak{k}$ is a compactly embedded subalgebra of $\mathfrak{g}$, and an element $H \in Z(\mathfrak{k})$ such that $\operatorname{ad}(H)^2|_{\mathfrak{p}} = -\mathrm{id}_{\mathfrak{p}}$.
A Hermitian SLA $(\mathfrak{g}, \theta, H)$ is said to be irreducible if the complexification $\mathfrak{g}_{\mathbb{C}}$ of $\mathfrak{g}$ is a simple complex Lie algebra.
A Hermitian SLA $(\mathfrak{g}, \theta, H)$ is said to be
(i) of compact type if $\mathfrak{g}$ is a compact Lie algebra, i.e. its Killing form $B$ is negative definite;
(ii) of non-compact type if $\mathfrak{g}$ does not contain compact simple factors and $\theta$ is a Cartan involution of $\mathfrak{g}$, i.e. $B$ is negative definite on $\mathfrak{k}$ and positive definite on $\mathfrak{p}$.
We have seen before how to associate to a non-Euclidean HSM $M$ a Hermitian SLA $(\mathfrak{g}, \theta, H)$. Indeed, this construction is bijective and it is compatible with the decomposition of HSMs as in Theorem 2.9.
Theorem 2.16. There is a bijection
$$\{\text{non-Euclidean HSMs}\} \xrightarrow{\cong} \{\text{Hermitian SLAs}\} \tag{4}$$
obtained by sending a non-Euclidean Hermitian symmetric manifold $M$ to its associated Hermitian SLA $(\mathfrak{g}, \theta, H)$.
(i) If $M = M_1 \times \cdots \times M_r$ is the decomposition of $M$ into irreducible HSMs and $(\mathfrak{g}_i, \theta_i, H_i)$ is the Hermitian SLA associated to $M_i$, then
$$(\mathfrak{g}, \theta, H) = (\mathfrak{g}_1, \theta_1, H_1) \oplus \cdots \oplus (\mathfrak{g}_r, \theta_r, H_r), \tag{5}$$
meaning that $\mathfrak{g}$ decomposes as a direct sum of ideals $\mathfrak{g} = \mathfrak{g}_1 \oplus \cdots \oplus \mathfrak{g}_r$, $\theta$ is the unique involution on $\mathfrak{g}$ that preserves the above decomposition and such that $\theta|_{\mathfrak{g}_i} = \theta_i$, and $H = H_1 + \cdots + H_r$.
(ii) $M$ is an irreducible HSM if and only if $(\mathfrak{g}, \theta, H)$ is irreducible.
(iii) $M$ is of non-compact type if and only if $(\mathfrak{g}, \theta, H)$ is of non-compact type.
(iv) $M$ is of compact type if and only if $(\mathfrak{g}, \theta, H)$ is of compact type.
Proof. Part (i) follows from [Hel78, Prop. 4.4, Prop. 5.5]. According to [Hel78, Prop. 5.5, Thm. 5.3, Thm. 5.4], $M$ is irreducible if and only if $(\mathfrak{g}, \theta)$ belongs to one of the following four types:
(1) Type I: $\mathfrak{g}$ is a simple compact Lie algebra;
(2) Type II: $\mathfrak{g}$ is a compact Lie algebra which is the sum of two simple ideals $\mathfrak{g} = \mathfrak{g}_1 \oplus \mathfrak{g}_2$ which are interchanged by $\theta$;
(3) Type III: $\mathfrak{g}$ is a simple non-compact Lie algebra such that $\mathfrak{g}_{\mathbb{C}}$ is simple;
(4) Type IV: $\mathfrak{g}$ is a simple complex Lie algebra (regarded as a real Lie algebra).
Moreover, the existence of a non-trivial element $H \in Z(\mathfrak{k})$ rules out Type II and Type IV (see [Hel78, p. 518]). The remaining cases (Type I and Type III) are exactly those cases for which $\mathfrak{g}_{\mathbb{C}}$ is a simple complex Lie algebra. Part (ii) follows.
Parts (iii) and (iv) follow easily from (ii) and Definitions 2.7 and 2.15.
It remains to prove that the map from non-Euclidean HSMs to Hermitian SLAs is bijective. In order to prove that, we will construct the inverse map of (4). First of all, we have the following
Claim: Any Hermitian SLA admits a decomposition (as in (5)) into the direct sum of irreducible Hermitian SLAs.
Indeed, given a Hermitian SLA $(\mathfrak{g}, \theta, H)$, using [Hel78, Prop. 5.2, Thm. 5.3, Thm. 5.4], we can write $\mathfrak{g}$ as the direct sum of ideals
$$\mathfrak{g} = \mathfrak{g}_1 \oplus \cdots \oplus \mathfrak{g}_r,$$
in such a way that $\theta$ preserves this decomposition and each factor $(\mathfrak{g}_i, \theta_i := \theta|_{\mathfrak{g}_i})$ belongs to one of the four Types mentioned before. The element $H$ can be written as a sum $H = H_1 + \cdots + H_r$ in such a way that $(\mathfrak{g}_i, \theta_i, H_i)$ is a Hermitian SLA. As observed before, the existence of such an element $H_i \in Z(\mathfrak{k}_i)$ forces $(\mathfrak{g}_i, \theta_i)$ to be of Type I or Type III, so that each $(\mathfrak{g}_i, \theta_i, H_i)$ is irreducible, q.e.d.
Using the Claim and part (ii), it is now enough to construct an inverse of (4) for irreducible Hermitian SLAs.
Let $(\mathfrak{g}, \theta, H)$ be an irreducible Hermitian SLA and assume first that it is of non-compact type. Let $G$ be the unique connected adjoint Lie group with $\operatorname{Lie}(G) = \mathfrak{g}$ and let $K$ be the unique Lie subgroup of $G$ corresponding to the Lie subalgebra $\mathfrak{k} \subset \mathfrak{g}$. Since $\theta$ is a Cartan involution of $\mathfrak{g}$ and $\mathfrak{g}$ is simple, we deduce that $G$ is a simple non-compact Lie group and $K$ is a maximal compact Lie subgroup of $G$. On the quotient manifold $G/K$ (with base point $o = [e]$), consider the unique $G$-invariant almost complex structure $J$ such that $J_o$ is equal to $\operatorname{ad}(H)|_{\mathfrak{p}}$ via the identification $T_oM \cong \mathfrak{p}$ and the unique $G$-invariant Riemannian metric $g$ such that $g_o$ is equal to the Killing form of $\mathfrak{g}$ restricted to $\mathfrak{p}$. It follows from [Hel78, Chap. VIII, Prop. 4.2] that $(G/K, J, g)$ is a HSM, which is irreducible of non-compact type by Theorem 2.12. Moreover, it is easy to check that the Hermitian SLA associated to $(G/K, J, g)$ is the Hermitian SLA $(\mathfrak{g}, \theta, H)$ we started with, and we are done.
The case where $(\mathfrak{g}, \theta, H)$ is irreducible of compact type can be dealt with similarly and therefore it is left to the reader. □
Remark 2.17.
The bijection (4) becomes an equivalence of categories if the two sets are endowed with the following morphisms:
(i) A symmetric (or equivariant) morphism between two pointed non-Euclidean HSMs $(M, o)$ and $(M', o')$ is a pointed holomorphic map $\phi : (M, o) \to (M', o')$ such that
$$\phi \circ s_p = s'_{\phi(p)} \circ \phi \quad \text{for any } p \in M,$$
where $s_p$ is the symmetry of $M$ at $p \in M$ and $s'_{\phi(p)}$ is the symmetry of $M'$ at $\phi(p) \in M'$.
(ii) A morphism of Hermitian SLAs (also called an $H_1$-morphism) between two Hermitian SLAs $(\mathfrak{g}, \theta, H)$ and $(\mathfrak{g}', \theta', H')$ is a morphism of Lie algebras $\rho : \mathfrak{g} \to \mathfrak{g}'$ such that
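Table 2 Irreducible classical Hermitian SLAs of non-compact type

| Type | Lie algebra $\mathfrak{g}$ | Involution $\theta$ | Central element $H \in Z(\mathfrak{k})$ |
|---|---|---|---|
| $I_{p,q}$ | $\mathfrak{su}(p, q)$ | $\theta \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & Z_3 \end{pmatrix} = \begin{pmatrix} Z_1 & -Z_2 \\ -\overline{Z_2}^t & Z_3 \end{pmatrix}$ | $i \begin{pmatrix} \frac{q}{p+q} I_p & 0 \\ 0 & -\frac{p}{p+q} I_q \end{pmatrix}$ |
| $II_n$ | $\cong \mathfrak{so}^*(2n)$ | $\theta \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & -Z_1^t \end{pmatrix} = \begin{pmatrix} Z_1 & -Z_2 \\ -\overline{Z_2}^t & -Z_1^t \end{pmatrix}$ | $\frac{i}{2} \begin{pmatrix} I_n & 0 \\ 0 & -I_n \end{pmatrix}$ |
| $III_n$ | $\cong \mathfrak{sp}(n, \mathbb{R})$ | $\theta \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & -Z_1^t \end{pmatrix} = \begin{pmatrix} Z_1 & -Z_2 \\ -\overline{Z_2}^t & -Z_1^t \end{pmatrix}$ | $\frac{i}{2} \begin{pmatrix} I_n & 0 \\ 0 & -I_n \end{pmatrix}$ |
| $IV_n$ | $\cong \mathfrak{so}(2, n)$ | $\theta \begin{pmatrix} X_1 & iX_2 \\ -iX_2^t & X_3 \end{pmatrix} = \begin{pmatrix} X_1 & -iX_2 \\ iX_2^t & X_3 \end{pmatrix}$ | $\begin{pmatrix} 0 & 0 \\ 0 & J_1 \end{pmatrix}$ |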
Table 3 Irreducible classical Hermitian SLAs of compact type Type
Lie algebra g
Ip;q
su.p C q/
II n
Š so.2n/
III n
sp.n/
IV n
so.2 C n/
Involution
!
!
Z1 Z2 D t Z2 Z3 ! ! Z1 Z2 Z1 Z2 D t t Z2 Z1t Z2 Z1t ! ! Z1 Z2 Z1 Z2 D t t Z2 Z1t Z2 Z1t X1 X2 X1 X2 D t t X2 X3 X2 X3
Z1 Z2 t Z2 Z3
Central element H 2 Z.k/ ! q Ip 0 i pCq p 0 I pCq q 0 I n i 2 0 In In 0 i 2 0 In 0 0 0 J1
ı D 0 ı ; ı ad.H / D ad.H 0 / ı : See [Sat80, Chap. II, §8] and [AMRT10, Chap. III, §2.2]. The correspondence in Theorem 2.16 together with the classification of irreducible HSMs recalled in Sect. 2.1 gives a classification of irreducible Hermitian SLAs. We record the list of irreducible classical Hermitian SLAs of non-compact type (resp. of compact type) into Table 2 (resp. Table 3). Moreover, to each irreducible Hermitian SLA .g; ; H / of non-compact type in Table 2, its dual Hermitian SLA of compact type (as defined in Sect. 2.3) is denoted by .g ; ; H / in Table 3. We refer to Sect. 3 for more details. The rank of a non-Euclidean HSM as defined in Definition 2.10 can be read off from the associated Hermitian SLA as it follows. Proposition 2.18. Let M be a non-Euclidean HSM with base point o 2 M and let .g; ; H / its associated Hermitian SLA. Consider the decomposition g D k ˚ p as in (1). Then any two maximal abelian Lie subalgebras of g contained in p are conjugate by an element of Stab.o/ Aut.M /o , acting on g via the adjoint representation, and their common dimension is equal to the rank of M . Proof. See [Hel78, Chap. V, Prop. 6.1 and Lemma 6.3].
t u
2.3 Duality Between Compact and Non-compact HSMs

We are now going to define an involution on non-Euclidean HSMs that exchanges compact HSMs with non-compact HSMs. The involution is defined most easily in terms of Hermitian SLAs, using the bijection of Theorem 2.16.

Given a Hermitian SLA $(\mathfrak g, \theta, H)$, define a new Hermitian SLA $(\mathfrak g, \theta, H)^* = (\mathfrak g^*, \theta^*, H^*)$ as follows. The Lie algebra $\mathfrak g^*$ is the subalgebra of the complexification $\mathfrak g_{\mathbb C}$ given by
$$\mathfrak g^* := \mathfrak k \oplus i\mathfrak p \subset \mathfrak g_{\mathbb C} = (\mathfrak k \oplus \mathfrak p) \oplus i(\mathfrak k \oplus \mathfrak p),$$
where as usual $\mathfrak k$ and $\mathfrak p$ are the eigenspaces for $\theta$ relative to the eigenvalues $+1$ and $-1$. Since $\mathfrak g_{\mathbb C} = (\mathfrak g^*)_{\mathbb C}$ and the property of being semisimple is preserved by the complexification functor, it follows that $\mathfrak g^*$ is a semisimple real Lie algebra. The involution $\theta^*$ on $\mathfrak g^*$ is defined by
$$\theta^* = \begin{cases} +1 & \text{on } \mathfrak k, \\ -1 & \text{on } i\mathfrak p. \end{cases}$$
In other words, $\mathfrak g^* = \mathfrak k \oplus i\mathfrak p$ is the decomposition of $\mathfrak g^*$ into eigenspaces for $\theta^*$ relative to the eigenvalues $+1$ and $-1$. Finally, we set $H^* := H$. Note that $H^* = H \in Z(\mathfrak k)$ and that $\mathrm{ad}(H^*)^2|_{i\mathfrak p} = -\mathrm{id}_{i\mathfrak p}$. Therefore, $(\mathfrak g, \theta, H)^*$ is a Hermitian SLA, which is called the dual Hermitian SLA of $(\mathfrak g, \theta, H)$.

Theorem 2.19. The map (called the duality map)
$$\{\text{Hermitian SLAs}\} \to \{\text{Hermitian SLAs}\}, \qquad (\mathfrak g, \theta, H) \mapsto (\mathfrak g, \theta, H)^* := (\mathfrak g^*, \theta^*, H^*)$$
is an involution which satisfies the following properties:
(i) If $(\mathfrak g, \theta, H) = (\mathfrak g_1, \theta_1, H_1) \oplus \cdots \oplus (\mathfrak g_r, \theta_r, H_r)$ is the decomposition of $(\mathfrak g, \theta, H)$ into irreducible Hermitian SLAs, then the dual Hermitian SLA $(\mathfrak g, \theta, H)^*$ admits the following decomposition into irreducible Hermitian SLAs:
$$(\mathfrak g, \theta, H)^* = (\mathfrak g_1, \theta_1, H_1)^* \oplus \cdots \oplus (\mathfrak g_r, \theta_r, H_r)^*.$$
(ii) $(\mathfrak g, \theta, H)$ is of compact type (resp. of non-compact type) if and only if $(\mathfrak g, \theta, H)^*$ is of non-compact type (resp. of compact type).

Proof. The fact that the duality map is an involution follows immediately from the definition. Part (i) follows easily from the definitions of the dual and of the direct sum of Hermitian SLAs, together with the observation that $(\mathfrak g, \theta, H)$ is irreducible if and only if $(\mathfrak g, \theta, H)^*$ is irreducible, since $\mathfrak g_{\mathbb C} = (\mathfrak g^*)_{\mathbb C}$. Part (ii) follows from the well-known fact that an involution $\theta$ on a semisimple real Lie algebra $\mathfrak g$, with associated decomposition $\mathfrak g = \mathfrak k \oplus \mathfrak p$ into $+1$ and $-1$ eigenspaces, is a Cartan involution of $\mathfrak g$ if and only if $\mathfrak g^* = \mathfrak k \oplus i\mathfrak p$ is a compact (semisimple) Lie algebra (see e.g. [Hel78, Chap. III, Prop. 7.4]). □

We can now define the dual of a non-Euclidean HSM, using the bijection of Theorem 2.16.

Definition 2.20. Let $M$ be a non-Euclidean HSM whose associated Hermitian SLA is $(\mathfrak g, \theta, H)$. The dual HSM of $M$, denoted by $M^*$, is the unique non-Euclidean HSM whose associated Hermitian SLA is $(\mathfrak g, \theta, H)^*$.

From Theorems 2.19 and 2.16, we can deduce the following

Corollary 2.21. The map (called the duality map)
$$\{\text{non-Euclidean HSMs}\} \to \{\text{non-Euclidean HSMs}\}, \qquad M \mapsto M^*$$
is an involution which satisfies the following properties:
(i) If $M = M_1 \times \cdots \times M_r$ is the decomposition of $M$ into irreducible HSMs, then $M^*$ admits the following decomposition into irreducible HSMs: $M^* = M_1^* \times \cdots \times M_r^*$.
(ii) $M$ is of compact type (resp. of non-compact type) if and only if $M^*$ is of non-compact type (resp. of compact type).
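As an illustration of the duality construction, the following worked computation checks in the smallest case that the dual of $\mathfrak{su}(1,1)$ is the compact Lie algebra $\mathfrak{su}(2)$, in agreement with Theorem 2.19(ii). It is only a sketch: the explicit matrix model of $\mathfrak{su}(1,1)$ and the parameter names $\alpha, \beta, \gamma$ are choices made here for the example and are not taken from the text.

```latex
% Assumed model: su(1,1) = { X in sl(2,C) : \overline{X}^t J + J X = 0 }, J = diag(1,-1).
\[
\mathfrak g=\mathfrak{su}(1,1)=\left\{\begin{pmatrix} i\alpha & \beta\\ \overline{\beta} & -i\alpha\end{pmatrix}
 : \alpha\in\mathbb R,\ \beta\in\mathbb C\right\},
\qquad \theta(X)=-\overline{X}^{\,t}.
\]
% theta is +1 on the diagonal part and -1 on the off-diagonal part:
\[
\mathfrak k=\{\beta=0\},\qquad \mathfrak p=\{\alpha=0\},\qquad
H=\frac{i}{2}\begin{pmatrix}1&0\\0&-1\end{pmatrix}\in Z(\mathfrak k).
\]
% ad(H) acts on p as beta -> i*beta, hence ad(H)^2 = -id on p:
\[
\left[H,\begin{pmatrix}0&\beta\\ \overline{\beta}&0\end{pmatrix}\right]
=\begin{pmatrix}0&i\beta\\ \overline{i\beta}&0\end{pmatrix}
\ \Longrightarrow\ \operatorname{ad}(H)^2_{|\mathfrak p}=-\operatorname{id}_{\mathfrak p}.
\]
% Dual Lie algebra g^* = k + i p; substitute gamma = i*beta:
\[
\mathfrak g^{*}=\mathfrak k\oplus i\mathfrak p
=\left\{\begin{pmatrix} i\alpha & i\beta\\ i\overline{\beta} & -i\alpha\end{pmatrix}\right\}
=\left\{\begin{pmatrix} i\alpha & \gamma\\ -\overline{\gamma} & -i\alpha\end{pmatrix}
 : \alpha\in\mathbb R,\ \gamma\in\mathbb C\right\}=\mathfrak{su}(2).
\]
```

The last set consists exactly of the traceless skew-Hermitian $2 \times 2$ matrices, so $\mathfrak g^*$ is compact while $\mathfrak g$ is not.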
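2.4 Harish–Chandra and Borel Embeddings

The aim of this subsection is to realize a given HSM of non-compact type as an open subset of a complex vector space (Harish–Chandra embedding) and as an open subset of its dual HSM (Borel embedding), which is also called its compact dual.

Fix a HSM of non-compact type $M = (M, J, h)$ together with a base point $o \in M$. By Theorem 2.11(ii), we have a diffeomorphism $M \cong G/K$ where $G = \mathrm{Aut}(M)^o$ and $K = \mathrm{Stab}(o)$. Since $G$ is an adjoint semisimple Lie group [by Theorem 2.11(i)], the adjoint representation $\mathrm{Ad}: G \to \mathrm{GL}(\mathfrak g)$ is faithful. Therefore $G$ admits a natural complexification $G_{\mathbb C}$ (in the sense of [Kna96, p. 437]), namely the complex connected (semisimple) Lie subgroup of $\mathrm{GL}(\mathfrak g, \mathbb C)$ whose Lie algebra is the Lie subalgebra $\mathfrak g_{\mathbb C} \overset{\mathrm{ad}}{\hookrightarrow} \mathfrak{gl}(\mathfrak g, \mathbb C)$. Denote by $K_{\mathbb C}$ the unique connected Lie subgroup of $G_{\mathbb C}$ whose Lie algebra is $\mathfrak k_{\mathbb C} \subset \mathfrak g_{\mathbb C}$. Note that $K_{\mathbb C}$ is a complex reductive Lie group which is a complexification of $K$.

Denote by $(\mathfrak g, \theta, H)$ the Hermitian SLA associated to $M$, as in Sect. 2.2. Since $\mathrm{ad}(H)^2|_{\mathfrak p} = -\mathrm{id}_{\mathfrak p}$, the complexification $\mathfrak p \otimes_{\mathbb R} \mathbb C$ admits a decomposition
$$\mathfrak p \otimes_{\mathbb R} \mathbb C = \mathfrak p_+ \oplus \mathfrak p_-,$$
where $\mathfrak p_+$ (resp. $\mathfrak p_-$) is the eigenspace for $\mathrm{ad}(H)|_{\mathfrak p}$ relative to the eigenvalue $+i$ (resp. $-i$). Therefore, $\mathfrak g_{\mathbb C}$ admits the following decomposition
$$\mathfrak g_{\mathbb C} = \mathfrak k_{\mathbb C} \oplus \mathfrak p_+ \oplus \mathfrak p_- \qquad (6)$$
which satisfies the relations (see [Sat80, p. 53]):
$$[\mathfrak k_{\mathbb C}, \mathfrak p_\pm] \subseteq \mathfrak p_\pm, \qquad [\mathfrak p_+, \mathfrak p_-] \subseteq \mathfrak k_{\mathbb C}, \qquad [\mathfrak p_+, \mathfrak p_+] = 0, \qquad [\mathfrak p_-, \mathfrak p_-] = 0.$$
In particular, $\mathfrak p_+$ and $\mathfrak p_-$ are abelian subalgebras of $\mathfrak g_{\mathbb C}$ which are normalized by $\mathfrak k_{\mathbb C}$. Denote by $P_+$ (resp. $P_-$) the connected Lie subgroup of $G_{\mathbb C}$ whose Lie algebra is $\mathfrak p_+ \subset \mathfrak g_{\mathbb C}$ (resp. $\mathfrak p_- \subset \mathfrak g_{\mathbb C}$). Then $P_+$ and $P_-$ are abelian unipotent Lie groups that are stabilized by $K_{\mathbb C}$. It turns out that $P_+$ and $P_-$ are simply connected, so that the exponential map $\exp: \mathfrak p_\pm \to P_\pm$ is a diffeomorphism (see [Hel78, Chap. VIII, Lemma 7.8]). Moreover the multiplication map $(P_+ \times P_-) \rtimes K_{\mathbb C} \to G_{\mathbb C}$ is injective and the image contains $G$.

Finally, let $G_c$ be the Lie subgroup of $G_{\mathbb C}$ corresponding to the Lie subalgebra $\mathfrak g^* \subset (\mathfrak g^*)_{\mathbb C} = \mathfrak g_{\mathbb C}$. By Theorem 2.19 and Corollary 2.21, $G_c$ is a compact Lie group containing $K$ such that $M^* \cong G_c/K$. We can summarize the above discussion in the following commutative diagram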
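$$\begin{array}{ccccccc}
G & \hookrightarrow & (P_+ \times P_-) \rtimes K_{\mathbb C} & \hookrightarrow & G_{\mathbb C} & \hookleftarrow & G_c \\
\cup & & \cup & & \cup & & \cup \\
K & \hookrightarrow & P_- \rtimes K_{\mathbb C} & = & P_- \rtimes K_{\mathbb C} & \hookleftarrow & K
\end{array} \qquad (7)$$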
Theorem 2.22. By taking quotients in (7), we get a diagram of complex manifolds
$$M \cong G/K \xrightarrow{\ i_{HC}\ } \mathfrak p_+ \xrightarrow[\ \cong\ ]{\ \exp\ } P_+ \cong \frac{(P_+ \times P_-) \rtimes K_{\mathbb C}}{P_- \rtimes K_{\mathbb C}} \xrightarrow{\ j\ } \frac{G_{\mathbb C}}{P_- \rtimes K_{\mathbb C}} \xleftarrow[\ \cong\ ]{\ \varphi\ } G_c/K \cong M^*$$
in which $\varphi$ is a biholomorphism, $i_{HC}$ is a holomorphic open embedding and $j$ is a Zariski open embedding into the homogeneous projective variety $G_{\mathbb C}/(P_- \rtimes K_{\mathbb C})$.

Proof. See [Hel78, Chap. VIII, §7] or [Sat80, Chap. II, §4]. □

The embedding $i_{HC}$ is called the Harish–Chandra embedding, while the composition $\varphi^{-1} \circ j \circ i_{HC}$ is called the Borel embedding. Harish–Chandra embeddings and Borel embeddings for each irreducible HSM will be studied in detail in Sect. 3.
2.5 HSMs of Non-compact Type as Bounded Symmetric Domains

The Harish–Chandra embedding $i_{HC}: M \hookrightarrow \mathfrak p_+$ defined in Sect. 2.4 allows one to realize canonically a given HSM of non-compact type $M$ as a bounded symmetric domain.

Definition 2.23. A domain $D \subset \mathbb C^N$ (i.e. an open connected subset of $\mathbb C^N$) is said to be
(i) bounded if it is bounded as a subset of $\mathbb C^N$;
(ii) homogeneous if the group $\mathrm{Hol}(D)$ of biholomorphisms of $D$ acts transitively on $D$;
(iii) symmetric if it is homogeneous and for some $p \in D$ (or, equivalently, for any $p \in D$) there exists $s_p \in \mathrm{Hol}(D)$ (called a symmetry at $p$) such that $s_p^2 = \mathrm{id}$ and $p$ is an isolated fixed point of $s_p$;
(iv) irreducible if there does not exist a non-trivial decomposition $\mathbb C^N = \mathbb C^{N_1} \times \mathbb C^{N_2}$ and two domains $D_1 \subset \mathbb C^{N_1}$ and $D_2 \subset \mathbb C^{N_2}$ such that $D = D_1 \times D_2$.

Remark 2.24. If $D \subset \mathbb C^N$ is a bounded domain, then (see [Sat80, Chap. II, §4, Rmk. 2] and the references therein)
(i) the group $\mathrm{Hol}(D)$ admits a (unique) structure of Lie group compatible with the compact-open topology;
(ii) $D$ is symmetric if and only if it is homogeneous and $\mathrm{Hol}(D)$ is semisimple.
Moreover, É. Cartan [Car35] showed that, up to dimension three, every homogeneous bounded domain is also symmetric, and he asked whether this is true in every dimension. Counterexamples were found later, starting from dimension four, by Pyatetskii-Shapiro (see [PS69]).

The image of the Harish–Chandra embedding is a bounded symmetric domain inside the complex vector space $\mathfrak p_+$.

Theorem 2.25 (Hermann, Harish–Chandra). For any HSM of non-compact type $M$ together with a fixed base point $o \in M$, the image of the Harish–Chandra embedding $i_{HC}: M \hookrightarrow \mathfrak p_+$ is a bounded symmetric domain.

Proof. Since $i_{HC}$ is a holomorphic open embedding and $M$ is connected, the image $i_{HC}(M)$ is a domain inside $\mathfrak p_+$. The group $\mathrm{Aut}(M)$ of biholomorphic isometries of $M$ is a subgroup of finite index of the group of biholomorphisms $\mathrm{Hol}(M) = \mathrm{Hol}(i_{HC}(M))$ (see [Mil05, Prop. 1.6]). Therefore, the group $\mathrm{Aut}(M)^o = \mathrm{Hol}(i_{HC}(M))^o$ acts transitively on $i_{HC}(M)$ by Theorem 2.11(ii); hence $i_{HC}(M)$ is a homogeneous domain. Moreover, the fact that every point of $M$ has a symmetry in the sense of Definition 2.4(ii) implies that every point of $i_{HC}(M)$ has a symmetry in the sense of Definition 2.23; hence $i_{HC}(M)$ is a symmetric domain.

Finally, in order to prove that $i_{HC}(M)$ is a bounded domain, we need to recall an explicit description of the image of $i_{HC}$. Consider the Hermitian SLA of non-compact type $(\mathfrak g, \theta, H)$ associated to $M$ as in Sect. 2.2 and the induced decomposition $\mathfrak g_{\mathbb C} = \mathfrak k_{\mathbb C} \oplus \mathfrak p_+ \oplus \mathfrak p_-$ as in (6). Denote by $\sigma$ the complex conjugation on $\mathfrak g_{\mathbb C}$ corresponding to the real form $\mathfrak g^*$ of $\mathfrak g_{\mathbb C}$ introduced in Sect. 2.3. Since $(\mathfrak g, \theta, H)$ is of non-compact type, the algebra $\mathfrak g^*$ is compact by Theorem 2.19(ii). Therefore, $\sigma$ is a Cartan involution of $\mathfrak g_{\mathbb C}$ (see [Kna96, Prop. 6.14]), which implies that (see [Kna96, Chap. VI, §2])
$$B_\sigma: \mathfrak g_{\mathbb C} \times \mathfrak g_{\mathbb C} \to \mathbb C, \qquad (X, Y) \mapsto B_\sigma(X, Y) := -B(X, \sigma(Y))$$
is a positive definite Hermitian form, where $B$ denotes as usual the Killing form of $\mathfrak g_{\mathbb C}$. For any $X \in \mathfrak p_+$, define the linear operator
$$T(X): \mathfrak p_- \to \mathfrak k_{\mathbb C}, \qquad Y \mapsto [Y, X],$$
and denote by $T(X)^*: \mathfrak k_{\mathbb C} \to \mathfrak p_-$ the adjoint of $T(X)$ with respect to $B_\sigma$. With these notations, the image of $M$ via the Harish–Chandra embedding can be described as (see [AMRT10, Chap. III, Thm. 2.9]):
$$i_{HC}(M) = \{X \in \mathfrak p_+ : T(X)^* \circ T(X) < 2\, \mathrm{id}_{\mathfrak p_-}\}. \qquad (8)$$
From (8), it follows that $i_{HC}(M)$ is a bounded domain, as required. □
Remark 2.26. Let $M$ be a HSM of non-compact type of dimension $n$ and fix a base point $o \in M$.
(i) The Harish–Chandra embedding $i_{HC}: M \hookrightarrow \mathfrak p_+$ (with respect to the base point $o \in M$) can be characterized as the unique open holomorphic embedding of $M$ inside $\mathbb C^n$, up to linear complex isomorphisms, such that $i_{HC}(o) = 0$ and $i_{HC}(M)$ is a circular domain, i.e. it is stable under multiplication by $S^1 \subset \mathbb C^*$. See [Sat80, Chap. II, §4, Rmk. 1] and the references therein.
(ii) From the description (8), it follows easily that $i_{HC}(M)$ is a convex bounded domain (Hermann's convexity theorem). Conversely, Mok–Tsai proved that, if the rank of $M$ is greater than one, then the Harish–Chandra embedding is the unique embedding of $M$ inside $\mathbb C^n$, up to complex affine transformations, as a bounded convex domain; see [Mok89, Chap. 5, §2] and the references therein.

We are now going to show that, conversely, any bounded symmetric domain can be endowed with a canonical Hermitian metric with respect to which it becomes a HSM of non-compact type.

Let $D \subset \mathbb C^N$ be any bounded domain. Let $H^2(D)$ be the separable Hilbert space consisting of all holomorphic functions on $D$ that are square integrable with respect to the Euclidean measure $d\mu$ on $D$ (see [Hel78, Chap. VIII, Cor. 3.2]). Choose an orthonormal basis $\{e_n(z)\}_{n \in \mathbb N}$ of $H^2(D)$ and set
$$K_D: D \times D \to \mathbb C, \qquad (z, w) \mapsto K_D(z, w) := \sum_{n \in \mathbb N} e_n(z)\, \overline{e_n(w)}, \qquad (9)$$
where the right hand side converges absolutely and uniformly on any compact subset of $D \times D$ (see [Hel78, Chap. VIII, Thm. 3.3]). The function $K_D$, known as the Bergman kernel function of $D$, is independent of the choice of the orthonormal basis $\{e_n(z)\}$ (see [Hel78, Chap. VIII, Thm. 3.3]) and it can be intrinsically characterized (see [Sat80, Chap. II, §6]) as the unique function $K_D: D \times D \to \mathbb C$ such that
(i) $K_D(z, w) = \overline{K_D(w, z)}$ for any $z, w \in D$.
(ii) For any $w \in D$, the function $z \mapsto K_D(z, w)$ belongs to $H^2(D)$.
(iii) For any $f \in H^2(D)$, we have that
$$f(z) = \int_D K_D(z, w) f(w)\, d\mu(w).$$
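The series (9) can be made concrete for the unit disk. The sketch below assumes the standard orthonormal basis $e_n(z) = \sqrt{(n+1)/\pi}\, z^n$ of $H^2$ of the disk (this basis and the function names are assumptions of the example, not taken from the text) and compares a partial sum of (9) with the closed form $1/(\pi (1 - z\bar w)^2)$ obtained by summing $\sum_n (n+1) t^n = 1/(1-t)^2$.

```python
import math


def kernel_partial_sum(z: complex, w: complex, terms: int = 200) -> complex:
    """Partial sum of the series (9) for the unit disk.

    Uses the assumed orthonormal basis e_n(z) = sqrt((n + 1) / pi) * z**n,
    so that e_n(z) * conj(e_n(w)) = (n + 1) / pi * (z * conj(w))**n.
    """
    total = 0j
    zw = z * w.conjugate()
    power = 1 + 0j  # holds (z * conj(w))**n
    for n in range(terms):
        total += (n + 1) / math.pi * power
        power *= zw
    return total


def kernel_closed_form(z: complex, w: complex) -> complex:
    """Closed form 1 / (pi * (1 - z * conj(w))**2) of the same series."""
    return 1.0 / (math.pi * (1 - z * w.conjugate()) ** 2)


if __name__ == "__main__":
    z, w = 0.3 + 0.4j, -0.2 + 0.5j
    # Difference between the truncated series and the closed form.
    print(abs(kernel_partial_sum(z, w) - kernel_closed_form(z, w)))
```

Property (i) above can be checked on the closed form directly, since conjugating $K(w, z)$ swaps the roles of $z$ and $w$.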
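Fix now coordinates $z = (z_1, \ldots, z_N)$ of $\mathbb C^N$ and consider the smooth tensor of type $(0, 2)$ defined by
$$h_D = \sum_{1 \le i, j \le N} \frac{\partial^2}{\partial z_i\, \partial \bar z_j} \log K_D(z, z)\; dz_i\, d\bar z_j. \qquad (10)$$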
Theorem 2.27. Let $D \subset \mathbb C^N$ be a bounded domain and consider the complex structure $J_D$ on $D$ inherited from $\mathbb C^N$.
(i) The tensor $h_D$ defines a Hermitian metric (called the Bergman metric of $D$) on the complex manifold $(D, J_D)$, which is invariant under $\mathrm{Hol}(D)$. In particular, $\mathrm{Aut}(D, J_D, h_D) = \mathrm{Hol}(D)$.
(ii) If $D$ is a bounded symmetric domain, then $(D, J_D, h_D)$ is a HSM of non-compact type.

Proof. Part (i) is proved in [Hel78, Chap. VIII, Prop. 3.4, Prop. 3.5]. Part (ii): by assumption, $\mathrm{Hol}(D)$ acts transitively on $D$ and each point $p \in D$ has a symmetry $s_p \in \mathrm{Hol}(D)$. Since $\mathrm{Hol}(D) = \mathrm{Aut}(D, J_D, h_D)$ by part (i), it follows that $\mathrm{Aut}(D, J_D, h_D)$ acts transitively on $D$ and that the symmetry $s_p$ at the point $p \in D$ belongs to $\mathrm{Aut}(D, J_D, h_D)$. Therefore, $(D, J_D, h_D)$ is a Hermitian symmetric manifold. The fact that $(D, J_D, h_D)$ is of non-compact type is proved in [Hel78, Chap. VIII, Thm. 7.1(i)]. □

Remark 2.28.
(i) The Bergman metric $h_D$ of a bounded domain $D \subset \mathbb C^N$ is Kähler, i.e. $\mathrm{Im}\, h_D$ is a closed 2-form (see [Hel78, Chap. VIII, Prop. 3.4]).
(ii) If $D \subset \mathbb C^N$ is a homogeneous bounded domain, then $h_D$ is Kähler–Einstein, i.e. its Ricci curvature is proportional to the associated Riemannian metric $\mathrm{Re}\, h_D$ (see [Hel78, Chap. VIII, Prop. 3.6]).

Example 2.29. Consider the open unit disk
$$\Delta := \{z \in \mathbb C : |z| < 1\} \subset \mathbb C.$$
Clearly, $\Delta$ is a bounded domain. Its Bergman Hermitian metric is equal to (see [FK94, Chap. IX, §2])
$$h_\Delta = \frac{4}{(1 - |z|^2)^2}\, dz\, d\bar z.$$
The unit disk $\Delta$ is biholomorphic to the upper half space $\mathbb H$ of Example 2.6(2) via the Cayley transform
$$\psi: \mathbb H \xrightarrow{\ \cong\ } \Delta, \qquad \tau \mapsto \frac{\tau - i}{\tau + i}. \qquad (11)$$
The pull-back via $\psi$ of the Riemannian metric $\mathrm{Re}\, h_\Delta$ on $\Delta$ is the hyperbolic metric on $\mathbb H$ introduced in Example 2.6(2). Therefore, $\Delta$ is a bounded symmetric domain.

Putting together Theorems 2.25 and 2.27, we obtain the following correspondence between HSMs of non-compact type and bounded symmetric domains.

Theorem 2.30. The maps
$$\begin{array}{ccc}
\{\text{HSMs of non-compact type}\} & \longleftrightarrow & \{\text{Bounded symmetric domains}\} \\
(M, J, h) & \longmapsto & i_{HC}(M) \subset \mathfrak p_+ \\
(D, J_D, h_D) & \longleftarrow\!\shortmid & D \subset \mathbb C^N
\end{array} \qquad (12)$$
are bijections which are inverses of each other. Moreover, the above bijections send irreducible HSMs of non-compact type into irreducible bounded symmetric domains and conversely.

Proof. It follows from Theorems 2.25 and 2.27 that the above maps are well-defined. The fact that they are inverses of each other can be extracted from the proof of [Hel78, Chap. VIII, Thm. 7.1]. The last assertion is obvious. □

The rank of a bounded symmetric domain can be characterized in the following way.

Theorem 2.31 (Polydisk theorem). Let $D$ be a bounded symmetric domain with its Bergman metric $h_D$ and fix a base point $o \in D$. If the rank of $D$ is equal to $r$, then there exists a totally geodesic polydisk $\Delta^r \subset D$ of dimension $r$ such that the restriction of $h_D$ to $\Delta^r$ is equal to the Bergman metric of $\Delta^r$ and $D = \mathrm{Stab}(o) \cdot \Delta^r$.

Proof. See [Mok89, Chap. 5, (1.1), Thm. 1]. □
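Returning to Example 2.29, the claim about the Cayley transform can be checked numerically. The sketch below rests on two stated assumptions: that the metric on the disk has conformal factor $4/(1-|z|^2)^2$, and that the hyperbolic metric of Example 2.6(2) is normalized as $|d\tau|^2/(\mathrm{Im}\,\tau)^2$ (that example is not reproduced here, so this normalization is a guess). The function names are hypothetical.

```python
def cayley(tau: complex) -> complex:
    """Cayley transform tau -> (tau - i) / (tau + i)."""
    return (tau - 1j) / (tau + 1j)


def cayley_derivative(tau: complex) -> complex:
    """Derivative of the Cayley transform: 2i / (tau + i)**2."""
    return 2j / (tau + 1j) ** 2


def disk_factor(z: complex) -> float:
    """Assumed conformal factor 4 / (1 - |z|^2)^2 of the metric on the disk."""
    return 4.0 / (1.0 - abs(z) ** 2) ** 2


def pulled_back_factor(tau: complex) -> float:
    """Conformal factor of the pull-back metric at tau in the upper half plane."""
    return disk_factor(cayley(tau)) * abs(cayley_derivative(tau)) ** 2


def half_plane_factor(tau: complex) -> float:
    """Assumed conformal factor 1 / (Im tau)^2 of the hyperbolic metric."""
    return 1.0 / tau.imag ** 2


if __name__ == "__main__":
    for tau in (0.5 + 2j, -3 + 0.1j, 1j, 10 + 5j):
        print(abs(cayley(tau)), pulled_back_factor(tau), half_plane_factor(tau))
```

Under these assumptions the two conformal factors agree at every sample point, which is the pointwise form of the pull-back statement; the identity behind it is $1 - |\psi(\tau)|^2 = 4\,\mathrm{Im}\,\tau / |\tau + i|^2$.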
The correspondence in Theorem 2.30 together with the classification of irreducible HSMs of non-compact type recalled in Sect. 2.1 gives a classification of irreducible bounded symmetric domains. We record the irreducible bounded symmetric domains in their Harish–Chandra embeddings (together with their complex dimensions and their ranks) in Table 4, referring to Sect. 3 for more details.

Table 4 Irreducible bounded symmetric domains in their Harish–Chandra embeddings
Type | Bounded symmetric domain | $\dim_{\mathbb C}$ | Rank
$I_{p,q}$ | $\{Z \in M_{p,q}(\mathbb C) : \overline{Z}^t Z < I_q\}$ | $pq$ | $q$
$II_n$ | $\{Z \in M^{\mathrm{skew}}_{n,n}(\mathbb C) : \overline{Z}^t Z < I_n\}$ | $\binom{n}{2}$ | $\lfloor n/2 \rfloor$
$III_n$ | $\{Z \in M^{\mathrm{sym}}_{n,n}(\mathbb C) : \overline{Z}^t Z < I_n\}$ | $\binom{n+1}{2}$ | $n$
$IV_n$ | $\{Z \in \mathbb C^n : 2\overline{Z}^t Z < 1 + |Z^t Z|^2,\ \overline{Z}^t Z < 1\}$ | $n$ | $\min\{2, n\}$
$V$ | $D_V \subset \mathbb O_{\mathbb C}^2$ | 16 | 2
$VI$ | $D_{VI} \subset H_3(\mathbb O_{\mathbb C})$ | 27 | 3

Remark 2.32. Bounded symmetric domains play a crucial role in Hodge theory. Indeed, on the one hand, if a period domain $D$ is such that its universal family of Hodge structures satisfies Griffiths transversality, then $D$ is a bounded symmetric domain. On the other hand, Deligne has shown that every bounded symmetric domain can be realized as the subdomain of a period domain on which certain tensors for the universal family are of Hodge type. In particular, every HSM of non-compact type can be realized as a moduli space for Hodge structures plus tensors. We refer the reader to [Mil12, Sec. 7] and the references therein. HSMs of non-compact type embedded (equivariantly and horizontally) inside period domains have been recently characterized by R. Friedman and R. Laza [FL13].
2.6 HSMs of Compact Type as Cominuscule Homogeneous Varieties

As a consequence of the Borel embedding (see Theorem 2.22), we can describe HSMs of compact type as cominuscule homogeneous (projective) varieties.

Definition 2.33. A rational homogeneous projective variety $H/Q$, where $H$ is a semisimple complex Lie group (or, equivalently, algebraic group) and $Q$ is a parabolic subgroup of $H$, is said to be a cominuscule homogeneous variety if the unipotent radical of $Q$ is abelian.

Remark 2.34. If $H$ is a simple complex algebraic group, then $H/Q$ is a cominuscule homogeneous variety if and only if $Q$ is, up to conjugation, a standard maximal parabolic subgroup associated to a cominuscule (or special) simple root, i.e. a simple root occurring with coefficient 1 in the simple root decomposition of the highest positive root (see [RRS92, Lemma 2.2] for a proof). In this case, $H/Q$ is called an irreducible cominuscule homogeneous variety, and clearly any cominuscule homogeneous variety can be written uniquely as a product of irreducible ones.

In Sect. 2.4, we have seen that any HSM $M^* = G_c/K$ of compact type is isomorphic to $G_{\mathbb C}/(P_- \rtimes K_{\mathbb C})$ (see Theorem 2.22), which is a cominuscule homogeneous variety since $G_{\mathbb C}$ is a semisimple complex Lie group and $P_- \rtimes K_{\mathbb C}$ is a parabolic subgroup whose unipotent radical $P_-$ is abelian.

Theorem 2.35. The map
$$\{\text{HSMs of compact type}\} \to \{\text{Cominuscule homogeneous varieties}\}, \qquad G_c/K \mapsto G_{\mathbb C}/(P_- \rtimes K_{\mathbb C}) \qquad (13)$$
is a bijection sending irreducible HSMs of compact type into irreducible cominuscule homogeneous varieties.

Proof. See e.g. [RRS92, 5.5]. □

The rank of a cominuscule homogeneous variety can be characterized in the following way.

Theorem 2.36 (Polysphere theorem). Let $X$ be a cominuscule homogeneous variety with a base point $o$ and let $h$ be the Hermitian metric coming from the bijection of Theorem 2.35. If the rank of $X$ is equal to $r$, then there exists a totally geodesic polysphere $(\mathbb P^1)^r \subset X$ of dimension $r$ such that the restriction of $h$ to $(\mathbb P^1)^r$ is equal to the product of the Fubini–Study metrics on each factor and $X = \mathrm{Stab}(o) \cdot (\mathbb P^1)^r$.

Proof. See [Mok89, Chap. 5, (1.1), Thm. 1]. □

The correspondence in Theorem 2.35 together with the classification of irreducible HSMs of compact type recalled in Sect. 2.1 gives a classification of irreducible cominuscule homogeneous varieties (see also [LM02, Sec. 2.1, 3.1] and [LM03, Sec. 3]). We record the irreducible cominuscule homogeneous varieties together with their associated cominuscule simple roots (see Remark 2.34) in Table 5, referring to Sect. 3 for more details. Note that each of the varieties $\mathrm{Gr}(q, p+q)$, $\mathrm{Gr}_{\mathrm{ort}}(n, 2n)$ and $\mathbb P^2(\mathbb O)$ corresponds to two cominuscule simple roots; this is due to the fact that in each of the above cases there is an automorphism of the Dynkin diagram that exchanges the two roots, hence inducing an outer automorphism of the associated simple complex algebraic group that establishes an isomorphism of their associated cominuscule homogeneous varieties.

Remark 2.37. Theorem 2.35 and Remark 2.34, together with the correspondence between (irreducible) bounded symmetric domains and (irreducible) cominuscule varieties (see Corollary 2.21 and Theorem 2.30), provide an explicit bijection between irreducible bounded symmetric domains and cominuscule simple roots of Dynkin diagrams, up to the action of the automorphism group of the Dynkin diagram. This correspondence is described explicitly in [Mil05, Sec. 1] and [Mil12, Sec. 2] (following Deligne), without passing to the compact dual HSM.
2.7 Classifying HSMs of Non-compact Type Via Jordan Theory

The aim of this subsection is to explain an alternative approach to the classification of HSMs of non-compact type which is based on Jordan theory, rather than on Lie theory as in Sect. 2.2.

Table 5 Irreducible cominuscule homogeneous varieties and their associated cominuscule simple roots
Type | Cominuscule homogeneous variety | Cominuscule simple roots
$I_{p,q}$ | $\mathrm{Gr}(q, p+q)$ | [marked Dynkin diagram]
$II_n$ | $\mathrm{Gr}_{\mathrm{ort}}(n, 2n)^o$ | [marked Dynkin diagram]
$III_n$ | $\mathrm{Gr}_{\mathrm{sym}}(n, 2n)$ | [marked Dynkin diagram]
$IV_n$ | $Q^n$ | [marked Dynkin diagram]
$V$ | $\mathbb P^2_{\mathbb O}$ (Cayley plane) | [marked Dynkin diagram]
$VI$ | $F = \mathrm{Gr}_\omega(\mathbb O^3, \mathbb O^6)$ (Freudenthal variety) | [marked Dynkin diagram]

We start by giving the definition of Jordan triple systems, referring the interested reader to [FKKLR00, Part V] for an excellent survey on Jordan triple systems, with special emphasis on Hermitian positive JTSs.

Definition 2.38. A Jordan triple system over a field $F$ is a pair $(V, \{\cdot, \cdot, \cdot\})$ consisting of a (finite-dimensional) $F$-vector space $V$ together with an $F$-multilinear triple product $\{\cdot, \cdot, \cdot\}: V \times V \times V \to V$, which satisfies the following properties:
(JT1) $\{x, y, z\} = \{z, y, x\}$ for any $x, y, z \in V$;
(JT2) $[a \,\square\, b,\ x \,\square\, y] = ((a \,\square\, b)x) \,\square\, y - x \,\square\, ((b \,\square\, a)y)$ for any $a, b, x, y \in V$, where $a \,\square\, b$ is the endomorphism of $V$ defined by $(a \,\square\, b)x := \{a, b, x\}$, and $[\cdot, \cdot]$ is the usual bracket among endomorphisms of $V$.

A Jordan triple system $(V, \{\cdot, \cdot, \cdot\})$ over $F$ is said to be:
(i) semisimple if the trace form
$$\tau: V \times V \to F, \qquad (x, y) \mapsto \tau(x, y) := \mathrm{tr}(x \,\square\, y), \qquad (14)$$
is non-degenerate.
(ii) simple if $\tau$ is not identically zero and $(V, \{\cdot, \cdot, \cdot\})$ does not have proper ideals, i.e. proper subvector spaces $I \subset V$ such that $\{I, V, V\} \subset I$ and $\{V, I, V\} \subset I$.
(iii) Hermitian if $F = \mathbb R$ and $V$ has a complex structure with respect to which $\{\cdot, \cdot, \cdot\}$ is $\mathbb C$-linear in the first and third factor and $\mathbb C$-antilinear with respect to the second factor.
(iv) Hermitian positive (or, for short, a Hermitian positive JTS) if it is Hermitian and
(JTp) the trace form $\tau$ is positive definite.
Note that property (JTp) makes sense since the trace form $\tau$ is a Hermitian form on $V$ (see [Sat80, p. 55]).

Proposition 2.39. Any Hermitian positive (resp. complex semisimple) JTS decomposes uniquely as a product of simple Hermitian positive (resp. complex simple) JTSs.

Proof. See [FKKLR00, Part V, Prop. IV.1.4] for the case of Hermitian positive JTSs and [Loo75, Thm. 10.14] for the case of complex semisimple JTSs. □

Remark 2.40. There is a natural bijection
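$$\{\text{Hermitian positive JTSs}\} \xrightarrow{\ \cong\ } \{\text{Complex semisimple JTSs}\}, \qquad (V, \{\cdot, \cdot, \cdot\}) \mapsto (V, \{x, y, z\}' := \{x, \bar y, z\})$$
which preserves the decomposition into the product of simple Jordan triple systems.

Simple Hermitian positive JTSs (or equivalently simple complex JTSs by Remark 2.40) can be classified (see [Loo75, Sec. 17.4]).

Theorem 2.41. Every simple Hermitian positive JTS is isomorphic to one of the following:
(i) $M_{p,q}(\mathbb C)$ with Jordan triple product $\{M_1, M_2, M_3\} = \frac{1}{2}(M_1 \overline{M_2}^t M_3 + M_3 \overline{M_2}^t M_1)$.
(ii) $M^{\mathrm{skew}}_{n,n}(\mathbb C) := \{M \in M_{n,n}(\mathbb C) : M^t = -M\}$ with the same Jordan triple product as in (i).
(iii) $M^{\mathrm{sym}}_{n,n}(\mathbb C) := \{M \in M_{n,n}(\mathbb C) : M^t = M\}$ with the same Jordan triple product as in (i).
(iv) $\mathbb C^n$ with Jordan triple product $\{X, Y, Z\} = -(X^t Z)\overline{Y} + (Z^t \overline{Y})X + (X^t \overline{Y})Z$.
(v) $\mathbb O_{\mathbb C}^2$ with Jordan triple product
$$\left\{ \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \right\} = \begin{pmatrix} (a_1 \tilde b_1) c_1 + (c_1 \tilde b_1) a_1 + (a_1 b_2) \tilde c_2 + (c_1 b_2) \tilde a_2 \\ \tilde a_1 (b_1 c_2) + \tilde c_1 (b_1 a_2) + \tilde a_2 (b_2 c_2) + \tilde c_2 (b_2 a_2) \end{pmatrix}.$$
(vi) $H_3(\mathbb O_{\mathbb C}) := \{M \in M_{3,3}(\mathbb O_{\mathbb C}) : \tilde M^t = M\}$ with Jordan triple product $\{a, b, c\} = (a|b)c + (c|b)a - (a \times c) \times \bar b$.

Table 6 Simple Hermitian positive JTSs
Type | Complex vector space $\mathfrak p_+$ | Jordan triple product $\{\cdot, \cdot, \cdot\}$
$I_{p,q}$ | $M_{p,q}(\mathbb C)$ | $\{M_1, M_2, M_3\} = \frac{1}{2}(M_1 \overline{M_2}^t M_3 + M_3 \overline{M_2}^t M_1)$
$II_n$ | $M^{\mathrm{skew}}_{n,n}(\mathbb C)$ | $\{M_1, M_2, M_3\} = \frac{1}{2}(M_1 \overline{M_2}^t M_3 + M_3 \overline{M_2}^t M_1)$
$III_n$ | $M^{\mathrm{sym}}_{n,n}(\mathbb C)$ | $\{M_1, M_2, M_3\} = \frac{1}{2}(M_1 \overline{M_2}^t M_3 + M_3 \overline{M_2}^t M_1)$
$IV_n$ | $\mathbb C^n$ | $\{X, Y, Z\} = -(X^t Z)\overline{Y} + (Z^t \overline{Y})X + (X^t \overline{Y})Z$
$V$ | $\mathbb O_{\mathbb C}^2$ | the product of Theorem 2.41(v)
$VI$ | $H_3(\mathbb O_{\mathbb C})$ | $\{a, b, c\} = (a|b)c + (c|b)a - (a \times c) \times \bar b$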
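The axioms of Definition 2.38 can be tested numerically on the matrix triple product of type $I_{p,q}$, namely $\{x, y, z\} = \frac{1}{2}(x \bar y^t z + z \bar y^t x)$. The sketch below is only a sanity check on random $2 \times 3$ complex matrices, with (JT2) evaluated on a test vector $z$ in the form $\{a,b,\{x,y,z\}\} - \{x,y,\{a,b,z\}\} = \{\{a,b,x\},y,z\} - \{x,\{b,a,y\},z\}$. All helper names are hypothetical, and matrices are represented as plain nested lists so that no external library is needed.

```python
import random


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def conj_transpose(m):
    """Conjugate transpose of a matrix."""
    return [[m[i][j].conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]


def combine(a, b, ca=1.0, cb=1.0):
    """Entrywise linear combination ca * a + cb * b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def triple(x, y, z):
    """Triple product {x, y, z} = (x y* z + z y* x) / 2 on p x q matrices."""
    ys = conj_transpose(y)
    return combine(matmul(matmul(x, ys), z), matmul(matmul(z, ys), x), 0.5, 0.5)


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices of the same shape."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def random_matrix(rng, p, q):
    """Random complex p x q matrix with entries in the unit square."""
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(q)]
            for _ in range(p)]


def jt2_defect(a, b, x, y, z):
    """Difference of the two sides of (JT2), both applied to the vector z."""
    lhs = combine(triple(a, b, triple(x, y, z)),
                  triple(x, y, triple(a, b, z)), 1.0, -1.0)
    rhs = combine(triple(triple(a, b, x), y, z),
                  triple(x, triple(b, a, y), z), 1.0, -1.0)
    return max_abs_diff(lhs, rhs)


if __name__ == "__main__":
    rng = random.Random(0)
    a, b, x, y, z = (random_matrix(rng, 2, 3) for _ in range(5))
    print("JT1 defect:", max_abs_diff(triple(x, y, z), triple(z, y, x)))
    print("JT2 defect:", jt2_defect(a, b, x, y, z))
```

A matrix unit such as the one with a single entry 1 in position (1,1) satisfies $\{e, e, e\} = e$, which gives a simple example of the tripotents used later to describe the rank.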
We record the simple Hermitian positive JTSs in Table 6, referring to Sect. 3 for more details. Among the different types of simple Hermitian positive JTSs, there are the same isomorphisms specified in Table 7.

Table 7 Isomorphic types
Isomorphic types | $\mathbb R$-dimension | Rank
$I_{1,1} = II_2 = III_1 = IV_1$ | 2 | 1
$I_{3,1} = II_3$ | 6 | 1
$III_2 = IV_3$ | 6 | 2
$I_{2,2} = IV_4$ | 8 | 2
$II_4 = IV_6$ | 12 | 2

There is a way to construct a Hermitian positive JTS starting from a Hermitian SLA of non-compact type. Indeed, given a Hermitian SLA of non-compact type $(\mathfrak g, \theta, H)$, consider the decomposition $\mathfrak g_{\mathbb C} = \mathfrak k_{\mathbb C} \oplus \mathfrak p_+ \oplus \mathfrak p_-$ given in (6) and denote by $x \mapsto \bar x$ the complex conjugation on $\mathfrak g_{\mathbb C}$ with respect to the real form $\mathfrak g$ of $\mathfrak g_{\mathbb C}$. Then the complex vector space $\mathfrak p_+$ endowed with the triple product
$$\{\cdot, \cdot, \cdot\}: \mathfrak p_+ \times \mathfrak p_+ \times \mathfrak p_+ \to \mathfrak p_+, \qquad (x, y, z) \mapsto \{x, y, z\} := \frac{1}{2}[[x, \bar y], z], \qquad (15)$$
is a Hermitian positive JTS (see [Sat80, p. 55]).

Theorem 2.42. The map
$$\{\text{Hermitian SLAs of non-compact type}\} \to \{\text{Hermitian positive JTSs}\}, \qquad (\mathfrak g, \theta, H) \mapsto (\mathfrak p_+, \{\cdot, \cdot, \cdot\}) \qquad (16)$$
is a bijection sending irreducible Hermitian SLAs of non-compact type into simple Hermitian positive JTSs.

Proof. We limit ourselves to defining a map in the other direction, referring to [Sat80, Chap. II, Prop. 3.3] for the verification that it is the inverse of the map (16).

Let $(V, \{\cdot, \cdot, \cdot\})$ be a Hermitian positive JTS. Consider the graded complex vector space
$$S = S_{-1} \oplus S_0 \oplus S_1 := V \oplus (V \,\square\, V) \oplus \overline{V},$$
where $V \,\square\, V := \{x \,\square\, y : x, y \in V\} \subset \mathrm{End}(V)$ and $\overline{V}$ is the complex conjugate vector space of $V$, i.e. the complex vector space whose underlying real vector space is equal to the one of $V$ and in which multiplication by $i$ sends $v$ to $-iv$ for any $v \in V$. Observe that (JT2) of Definition 2.38 implies that $V \,\square\, V$ is a complex Lie subalgebra of $\mathrm{End}(V)$ with respect to the usual Lie bracket $[\cdot, \cdot]$ on $\mathrm{End}(V)$. Moreover, if we denote by $T^*$ the adjoint of an endomorphism $T \in \mathrm{End}(V)$ with respect to the Hermitian positive definite trace form $\tau$ (see (JTp) of Definition 2.38), then (JT2) implies that $(x \,\square\, y)^* = y \,\square\, x$. In particular, $V \,\square\, V \subset \mathrm{End}(V)$ is closed under the adjoint operator. Using these observations, we can define a Lie bracket on $S = V \oplus (V \,\square\, V) \oplus \overline{V}$ as follows:
$$[(a, T, b), (a', T', b')] := (Ta' - T'a,\ 2a' \,\square\, b + [T, T'] - 2a \,\square\, b',\ (T')^* b - T^* b').$$
It turns out that $(S, [\cdot, \cdot])$ is a graded semisimple complex Lie algebra (see [Sat80, Chap. I, Prop. 7.1]).

Consider now the map
$$\sigma: S \to S, \qquad (a, T, b) \mapsto (b, -T^*, a).$$
It is easily checked that $\sigma$ is a complex conjugation on the graded complex Lie algebra $S$, i.e. it is a $\mathbb C$-antilinear involution such that $[\sigma(X), \sigma(Y)] = \sigma([X, Y])$ for any $X, Y \in S$ and $\sigma(S_i) = S_{-i}$ for $i = -1, 0, 1$. Moreover, the fact that the trace form $\tau$ is positive definite (by (JTp) of Definition 2.38) implies that $\sigma$ is a Cartan involution of the complex semisimple Lie algebra $S$ (see [Sat80, Chap. I, §9]), or, equivalently, that the real form $S_\sigma := \{X \in S : \sigma(X) = X\}$ of the complex Lie algebra $S$ defined by $\sigma$ is a compact real form of $S$ (see [Kna96, Prop. 6.14]).

Finally, consider the $\mathbb C$-linear involution $\theta$ on $S$ which preserves the grading on $S$ and such that
$$\theta|_{S_i} = \begin{cases} \mathrm{id} & \text{if } i = 0, \\ -\mathrm{id} & \text{if } i = -1, 1. \end{cases}$$
Then the complex conjugation $X \mapsto \theta(\sigma(X))$ on $S$ defines another real form $\mathfrak g := \{X \in S : \theta(\sigma(X)) = X\}$ of $S$, on which $\theta$ induces a Cartan involution (see e.g. [Sat80, Chap. I, §4]). If $\mathfrak g = \mathfrak k \oplus \mathfrak p$ is the Cartan decomposition relative to $\theta$ (as in (1)), then by construction it follows that $\mathfrak k_{\mathbb C} = S_0$ and $\mathfrak p \otimes_{\mathbb R} \mathbb C = S_{-1} \oplus S_1$. Therefore, arguing as in Sect. 2.2, there exists a unique element $H \in Z(\mathfrak k)$ such that $\mathrm{ad}(H)^2|_{\mathfrak p} = -\mathrm{id}_{\mathfrak p}$ and such that $\mathfrak p \otimes_{\mathbb R} \mathbb C = S_{-1} \oplus S_1$ is the decomposition into eigenspaces for $\mathrm{ad}(H)|_{\mathfrak p}$ relative to the eigenvalues $+i$ and $-i$ (as in Sect. 2.4).

Summing up, starting with a Hermitian positive JTS $(V, \{\cdot, \cdot, \cdot\})$, we have constructed a Hermitian SLA of non-compact type $(\mathfrak g, \theta, H)$, and this defines the inverse of the map (16) (see [Sat80, Chap. II, Prop. 3.3]). □

Note that the correspondence in Theorem 2.42 together with Theorem 2.16 and the classification of simple Hermitian positive JTSs in Theorem 2.41 gives a new approach to the classification of HSMs of non-compact type, as recalled in Sect. 2.1.

Remark 2.43. The bijection (16) becomes an equivalence of categories if the left hand side is endowed with morphisms of Hermitian SLAs as in Remark 2.17 and the right hand side is endowed with morphisms of Hermitian positive JTSs defined as follows: a morphism of Hermitian positive JTSs between two Hermitian positive JTSs $(V, \{\cdot, \cdot, \cdot\})$ and $(V', \{\cdot, \cdot, \cdot\}')$ is a $\mathbb C$-linear map $\phi: V \to V'$ such that
$$\phi(\{x, y, z\}) = \{\phi(x), \phi(y), \phi(z)\}' \quad \text{for any } x, y, z \in V.$$
See [Sat80, Chap. II, §8].

Remark 2.44. Let $M$ be a HSM of non-compact type and consider its associated Hermitian positive JTS $(\mathfrak p_+, \{\cdot, \cdot, \cdot\})$ (via the bijections of Theorems 2.16 and 2.42). Then the image of the Harish–Chandra embedding $i_{HC}: M \hookrightarrow \mathfrak p_+$ (as in Theorem 2.22) is equal to
$$i_{HC}(M) = \{z \in \mathfrak p_+ : z \,\square\, z < \mathrm{id}_{\mathfrak p_+}\}. \qquad (17)$$
See [Sat80, Chap. II, Thm. 5.9] and the references therein.

Using the above presentation of HSMs, it is possible to give a Jordan theoretic description of the rank of a HSM of non-compact type (as in Definition 2.10). Recall that a tripotent (or idempotent) of a Jordan triple system $(V, \{\cdot, \cdot, \cdot\})$ is an element $e \in V$ such that $\{e, e, e\} = e$. Two tripotents $e_1$ and $e_2$ are said to be orthogonal if $\{e_1, e_2, x\} = 0$ for any $x \in V$. A Jordan frame of $(V, \{\cdot, \cdot, \cdot\})$ is a maximal collection $\{e_1, \ldots, e_n\}$ of pairwise orthogonal distinct tripotents.

Proposition 2.45. Let $M$ be a HSM of non-compact type and let $(\mathfrak p_+, \{\cdot, \cdot, \cdot\})$ be the corresponding Hermitian positive JTS according to Theorems 2.16 and 2.42. Then the rank of $M$ is equal to the cardinality of every Jordan frame of $(\mathfrak p_+, \{\cdot, \cdot, \cdot\})$.

Proof. See [Loo77]. □
3 Irreducible Hermitian Symmetric Manifolds Irreducible HSMs of non-compact type (resp. of compact type) have been classified by Cartan in [Car35] (based upon [Car26-27]) and they are divided in six types which according to the nowadays standard Siegel’s notation1 are called: Ip;q (with p q 1), II n (with n 2), III n (with n 1), IV n (with 2 ¤ n 1), V and VI. The first four families are called classical HSMs while the last two are called exceptional HSMs. For small values of the parameters there are some isomorphisms between the above types as shown in Table 7. For a modern proof of the above classification, the reader is referred to [Hel78, Chap. X, §6], where such a result is deduced as a corollary of the more general classification of irreducible Riemannian symmetric spaces. The notation used by Helgason differs from the one used by Siegel and the translation between the two different notations is given as it follows: Ip;q D AIII , II n D DIII , III n D CI, IV n D BDI.q D 2/, V D EVI and VI D EVII. A direct proof which avoids the classification of Riemannian symmetric manifolds appears in [Wol64]. In the subsections below, we describe in detail each of the above types following mainly [Wol72] and [Mok89, Chap. 4, §2] for classical HSMs and [Roo08] for exceptional HSMs.
3.1 Type $I_{p,q}$

The bounded symmetric domain of type $I_{p,q}$ (with $p \ge q \ge 1$) in its Harish–Chandra embedding is given by
\[
D_{I_{p,q}} := \{ Z \in M_{p,q}(\mathbb{C}) : \overline{Z}^t Z < I_q \} \subset M_{p,q}(\mathbb{C}). \tag{18}
\]
Note that, in the special case $q = 1$, $D_{I_{p,1}}$ is the unit ball $B^p := \{ (z_1, \dots, z_p) \in \mathbb{C}^p : \sum_i |z_i|^2 < 1 \} \subset \mathbb{C}^p$.
¹ Cartan's original notation permutes Type III and Type IV.
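Let $\mathrm{SU}(p,q)$ be the connected simple non-compact Lie subgroup of $\mathrm{SL}(p+q,\mathbb{C})$ that leaves invariant the Hermitian form on $\mathbb{C}^{p+q} \times \mathbb{C}^{p+q}$ given by $-x_1\overline{y}_1 - \dots - x_p\overline{y}_p + x_{p+1}\overline{y}_{p+1} + \dots + x_{p+q}\overline{y}_{p+q}$. More explicitly
\[
\begin{aligned}
\mathrm{SU}(p,q) &= \left\{ g \in \mathrm{SL}(p+q,\mathbb{C}) : \overline{g}^t \begin{pmatrix} -I_p & 0 \\ 0 & I_q \end{pmatrix} g = \begin{pmatrix} -I_p & 0 \\ 0 & I_q \end{pmatrix} \right\} \\
&= \left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{SL}(p+q,\mathbb{C}) :
\begin{array}{l}
\overline{A}^t A - \overline{C}^t C = I_p \\
\overline{D}^t D - \overline{B}^t B = I_q \\
\overline{A}^t B = \overline{C}^t D
\end{array} \right\}.
\end{aligned}
\]
The Lie group $\mathrm{SU}(p,q)$ acts transitively on $D_{I_{p,q}}$ via generalized Möbius transformations
\[
\mathrm{SU}(p,q) \times D_{I_{p,q}} \to D_{I_{p,q}}, \qquad
\left( \begin{pmatrix} A & B \\ C & D \end{pmatrix}, Z \right) \mapsto (AZ + B)(CZ + D)^{-1}. \tag{19}
\]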
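As a numerical illustration of the action (19), the following sketch checks one explicit element of $\mathrm{SU}(2,1)$ against the defining condition and verifies that it maps a point of the unit ball in $\mathbb{C}^2$ to another point of the ball. The helper names `mobius` and `in_domain`, the chosen group element and the sample point are illustrative assumptions, not taken from the text.

```python
import numpy as np

def mobius(g, Z, p, q):
    """Generalized Moebius action (AZ + B)(CZ + D)^{-1} of g = [[A, B], [C, D]] on a p x q matrix Z."""
    A, B = g[:p, :p], g[:p, p:]
    C, D = g[p:, :p], g[p:, p:]
    return (A @ Z + B) @ np.linalg.inv(C @ Z + D)

def in_domain(Z):
    """Z lies in the type I domain iff I_q - conj(Z)^t Z is positive definite."""
    q = Z.shape[1]
    return np.linalg.eigvalsh(np.eye(q) - Z.conj().T @ Z).min() > 0

p, q, t = 2, 1, 0.7
J = np.diag([-1.0] * p + [1.0] * q)            # matrix of the Hermitian form of signature (p, q)

# Hypothetical sample element: a hyperbolic rotation mixing the first and last coordinates.
g = np.array([[np.cosh(t), 0, np.sinh(t)],
              [0, 1, 0],
              [np.sinh(t), 0, np.cosh(t)]], dtype=complex)

Z = np.array([[0.3 + 0.2j], [-0.5j]])          # a point of the unit ball in C^2
W = mobius(g, Z, p, q)

print(np.allclose(g.conj().T @ J @ g, J))      # g preserves the Hermitian form
print(in_domain(Z), in_domain(W))              # the image stays inside the domain
```

The same element sends the base point $0$ to $(\tanh t, 0)^t$, which illustrates how the orbit of $0$ moves inside the ball.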
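Notice that the center $Z(\mathrm{SU}(p,q)) = \{ \lambda I_{p+q} : \lambda^{p+q} = 1 \}$ of $\mathrm{SU}(p,q)$ acts trivially on $D_{I_{p,q}}$; indeed, it turns out that the connected component of the group of biholomorphisms of $D_{I_{p,q}}$ is given by
\[
\mathrm{Hol}(D_{I_{p,q}})^o = \mathrm{SU}(p,q)/Z(\mathrm{SU}(p,q)) =: \mathrm{PSU}(p,q),
\]
which is the connected non-compact adjoint simple Lie group of type $A_{p+q-1}$. The symmetry of $D_{I_{p,q}}$ at the base point $0$ is given by the element
\[
s_0 = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix} \in \mathrm{PSU}(p,q), \tag{20}
\]
which acts on $D_{I_{p,q}}$ by sending $Z$ into $-Z$. The symmetry $s_0$ induces an involution on $\mathrm{SU}(p,q)$
\[
\sigma : \mathrm{SU}(p,q) \to \mathrm{SU}(p,q), \qquad
\begin{pmatrix} A & B \\ C & D \end{pmatrix} \mapsto
\begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix}
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
\begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix}^{-1}
= \begin{pmatrix} A & -B \\ -C & D \end{pmatrix}, \tag{21}
\]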
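whose fixed Lie subgroup is equal to the maximal compact Lie subgroup
\[
\left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} \in \mathrm{SU}(p,q) \right\}
= \left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} :
\begin{array}{l} \overline{A}^t A = I_p,\ \overline{D}^t D = I_q \\ \det(A)\det(D) = 1 \end{array} \right\}
=: \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q),
\]
which is also equal to the stabilizer of $0 \in D_{I_{p,q}}$. In particular, the pair $(\mathrm{SU}(p,q), \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q))$ is a Riemannian symmetric pair. Notice that the involution $\sigma$ descends to an involution of $\mathrm{PSU}(p,q)$ whose fixed locus is the maximal compact Lie subgroup $\overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)} := \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)/Z(\mathrm{SU}(p,q))$ of $\mathrm{PSU}(p,q)$. Therefore also the pair $(\mathrm{PSU}(p,q), \overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)})$ is a Riemannian symmetric pair. By the above discussion, we get the following presentation of $D_{I_{p,q}}$ as an irreducible HSM of non-compact type
\[
D_{I_{p,q}} \cong \mathrm{SU}(p,q)/\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q) = \mathrm{PSU}(p,q)/\overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)}, \tag{22}
\]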
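associated to the Riemannian symmetric pair $(\mathrm{SU}(p,q), \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q))$ (resp. to $(\mathrm{PSU}(p,q), \overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)})$). Notice that the last description of $D_{I_{p,q}}$ is the one appearing in Theorem 2.12(i).

The irreducible Hermitian SLA of non-compact type associated to the Riemannian symmetric pair $(\mathrm{SU}(p,q), \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q))$ (or equivalently to $(\mathrm{PSU}(p,q), \overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)})$) is given by the Lie algebra
\[
\begin{aligned}
\mathrm{Lie}\,\mathrm{SU}(p,q) = \mathfrak{su}(p,q) &= \left\{ M \in \mathfrak{sl}(p+q,\mathbb{C}) : \overline{M}^t \begin{pmatrix} -I_p & 0 \\ 0 & I_q \end{pmatrix} = - \begin{pmatrix} -I_p & 0 \\ 0 & I_q \end{pmatrix} M \right\} \\
&= \left\{ \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & Z_3 \end{pmatrix} \in \mathfrak{gl}(p+q,\mathbb{C}) :
\begin{array}{l} \overline{Z_1}^t = -Z_1,\ \overline{Z_3}^t = -Z_3 \\ Z_2 \in M_{p,q}(\mathbb{C}),\ \mathrm{Tr}(Z_1) + \mathrm{Tr}(Z_3) = 0 \end{array} \right\}
\end{aligned} \tag{23}
\]
endowed with the Cartan involution $\theta = d\sigma$ given by
\[
\theta \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & Z_3 \end{pmatrix} = \begin{pmatrix} Z_1 & -Z_2 \\ -\overline{Z_2}^t & Z_3 \end{pmatrix}
\]
and with the element
\[
H = i \begin{pmatrix} \frac{q}{p+q} I_p & 0 \\ 0 & -\frac{p}{p+q} I_q \end{pmatrix} \in \mathrm{Fix}(\theta) = \mathrm{Lie}\,\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q) = \left\{ \begin{pmatrix} Z_1 & 0 \\ 0 & Z_3 \end{pmatrix} \in \mathfrak{su}(p,q) \right\}.
\]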
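The block description of $\mathfrak{su}(p,q)$ and the role of the element $H$ can be checked numerically. The sketch below builds a random matrix of the block shape above, tests the defining condition, and verifies that $\mathrm{ad}(H)$ acts as multiplication by $i$ on the upper-right block, under the sign convention $H = i\,\mathrm{diag}(\tfrac{q}{p+q} I_p, -\tfrac{p}{p+q} I_q)$ assumed here. The helper `random_su_pq` and the sample sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 3, 2
n = p + q
J = np.diag([-1.0] * p + [1.0] * q)      # matrix of the Hermitian form of signature (p, q)

def random_su_pq():
    """Random matrix [[Z1, Z2], [conj(Z2)^t, Z3]] with Z1, Z3 skew-Hermitian and total trace 0."""
    X = rng.normal(size=(p, p)) + 1j * rng.normal(size=(p, p))
    Y = rng.normal(size=(q, q)) + 1j * rng.normal(size=(q, q))
    Z2 = rng.normal(size=(p, q)) + 1j * rng.normal(size=(p, q))
    M = np.block([[X - X.conj().T, Z2], [Z2.conj().T, Y - Y.conj().T]])
    # The trace is purely imaginary, so subtracting it keeps the diagonal blocks skew-Hermitian.
    return M - (np.trace(M) / n) * np.eye(n)

M = random_su_pq()
H = 1j * np.diag([q / n] * p + [-p / n] * q)

# A matrix supported on the upper-right block, identified with M_{p,q}(C).
E = np.zeros((n, n), dtype=complex)
E[:p, p:] = rng.normal(size=(p, q)) + 1j * rng.normal(size=(p, q))

print(np.allclose(M.conj().T @ J + J @ M, 0))   # defining condition of su(p, q)
print(np.allclose(H @ E - E @ H, 1j * E))       # ad(H) multiplies the upper-right block by i
```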
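The cominuscule homogeneous variety of type $I_{p,q}$ is the Grassmannian $\mathrm{Gr}(q,p+q)$ parametrizing $q$-dimensional subspaces of $\mathbb{C}^{p+q}$:
\[
\mathrm{Gr}(q,p+q) := \{ [W \subset \mathbb{C}^{p+q}] : \dim W = q \}. \tag{24}
\]
The Borel embedding of $D_{I_{p,q}}$ into $\mathrm{Gr}(q,p+q)$ is given by
\[
D_{I_{p,q}} \subset M_{p,q}(\mathbb{C}) \hookrightarrow \mathrm{Gr}(q,p+q), \qquad
Z \mapsto \left\{ \langle v_1, \dots, v_q \rangle : \{v_1, \dots, v_q\} \text{ are the column vectors of } \begin{pmatrix} Z \\ I_q \end{pmatrix} \right\}. \tag{25}
\]
The complex algebraic simple group $\mathrm{SL}(p+q,\mathbb{C})$ of type $A_{p+q-1}$ acts transitively on $\mathrm{Gr}(q,p+q)$ via
\[
\mathrm{SL}(p+q,\mathbb{C}) \times \mathrm{Gr}(q,p+q) \to \mathrm{Gr}(q,p+q), \qquad (g, [W \subset \mathbb{C}^{p+q}]) \mapsto [g(W) \subset \mathbb{C}^{p+q}].
\]
Note that the center $Z(\mathrm{SL}(p+q,\mathbb{C})) = \{ \lambda I_{p+q} : \lambda^{p+q} = 1 \}$ of $\mathrm{SL}(p+q,\mathbb{C})$ acts trivially on $\mathrm{Gr}(q,p+q)$; indeed, it turns out that the group of automorphisms of the algebraic variety $\mathrm{Gr}(q,p+q)$ is equal to
\[
\mathrm{PSL}(p+q,\mathbb{C}) := \mathrm{SL}(p+q,\mathbb{C})/Z(\mathrm{SL}(p+q,\mathbb{C})),
\]
which is the connected simple adjoint complex algebraic group of type $A_{p+q-1}$ and is the complexification of the Lie group $\mathrm{PSU}(p,q)$.

Consider now the base point $W_o := \langle e_{p+1}, \dots, e_{p+q} \rangle \in \mathrm{Gr}(q,p+q)$ with respect to the standard basis $\{e_1, \dots, e_{p+q}\}$ of $\mathbb{C}^{p+q}$. The stabilizer of $W_o$ is the maximal parabolic subgroup associated to the $q$-th simple root of the Dynkin diagram $A_{p+q-1}$ (which is cominuscule, see Table 5)
\[
Q_q := \left\{ \begin{pmatrix} A & 0 \\ C & D \end{pmatrix} \in \mathrm{SL}(p+q,\mathbb{C}) \right\} \subset \mathrm{SL}(p+q,\mathbb{C}),
\]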
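where $A \in M_{p,p}(\mathbb{C})$, $C \in M_{q,p}(\mathbb{C})$ and $D \in M_{q,q}(\mathbb{C})$. The parabolic group $Q_q$ admits the following Levi decomposition
\[
Q_q = R_u(Q_q) \rtimes L(Q_q) := \left\{ \begin{pmatrix} I_p & 0 \\ C & I_q \end{pmatrix} \right\} \rtimes \left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} \in \mathrm{SL}(p+q,\mathbb{C}) \right\},
\]
which coincides with the Levi decomposition appearing in Theorem 2.35. From the above discussion, we obtain the following explicit presentation of $\mathrm{Gr}(q,p+q)$ as a cominuscule homogeneous variety (as in Definition 2.33)
\[
\mathrm{Gr}(q,p+q) \cong \mathrm{SL}(p+q,\mathbb{C})/Q_q = \mathrm{PSL}(p+q,\mathbb{C})/\overline{Q_q}, \tag{26}
\]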
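where $\overline{Q_q} := Q_q/\{ \lambda I_{p+q} : \lambda^{p+q} = 1 \}$.

Consider now the compact real form of $\mathrm{SL}(p+q,\mathbb{C})$, which is the Lie subgroup $\mathrm{SU}(p+q) \subset \mathrm{SL}(p+q,\mathbb{C})$ that leaves invariant the positive definite Hermitian form $x_1\overline{y}_1 + \dots + x_{p+q}\overline{y}_{p+q}$ on $\mathbb{C}^{p+q} \times \mathbb{C}^{p+q}$. More explicitly
\[
\begin{aligned}
\mathrm{SU}(p+q) &= \{ g \in \mathrm{SL}(p+q,\mathbb{C}) : \overline{g}^t g = I_{p+q} \} \\
&= \left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{SL}(p+q,\mathbb{C}) :
\begin{array}{l}
\overline{A}^t A + \overline{C}^t C = I_p \\
\overline{D}^t D + \overline{B}^t B = I_q \\
\overline{A}^t B = -\overline{C}^t D
\end{array} \right\}.
\end{aligned}
\]
Similarly, the quotient of $\mathrm{SU}(p+q)$ by its center
\[
\mathrm{PSU}(p+q) := \mathrm{SU}(p+q)/Z(\mathrm{SU}(p+q)) = \mathrm{SU}(p+q)/\{ \lambda I_{p+q} : \lambda^{p+q} = 1 \}
\]
is the compact real form of $\mathrm{PSL}(p+q,\mathbb{C})$. The restriction of the action of $\mathrm{SL}(p+q,\mathbb{C})$ on $\mathrm{Gr}(q,p+q)$ to the subgroup $\mathrm{SU}(p+q) \subset \mathrm{SL}(p+q,\mathbb{C})$ is still transitive and the stabilizer of $W_o$ is the maximal proper connected and compact subgroup
\[
\mathrm{SU}(p+q) \cap Q_q = \left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} \in \mathrm{SU}(p+q) \right\} \cong \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q).
\]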
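The action of $\mathrm{SU}(p+q)$ on $\mathrm{Gr}(q,p+q)$ factors through a transitive action of $\mathrm{PSU}(p+q)$ in such a way that the stabilizer of $W_o$ is equal to the maximal proper connected and compact subgroup
\[
\mathrm{PSU}(p+q) \cap \overline{Q_q} = \left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} \in \mathrm{PSU}(p+q) \right\} \cong \overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)}.
\]
The pair $(\mathrm{SU}(p+q), \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q))$ is a Riemannian symmetric pair since $\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)$ is the fixed subgroup of the involution
\[
\sigma^* : \mathrm{SU}(p+q) \to \mathrm{SU}(p+q), \qquad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \mapsto \begin{pmatrix} A & -B \\ -C & D \end{pmatrix},
\]
and similarly for the pair $(\mathrm{PSU}(p+q), \overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)})$. By the above discussion, we get the following presentation of $\mathrm{Gr}(q,p+q)$ as the irreducible HSM of compact type
\[
\mathrm{Gr}(q,p+q) \cong \mathrm{SU}(p+q)/\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q) = \mathrm{PSU}(p+q)/\overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)}, \tag{27}
\]
associated to the Riemannian symmetric pair $(\mathrm{SU}(p+q), \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q))$ (resp. to $(\mathrm{PSU}(p+q), \overline{\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q)})$). In particular, the last description of $\mathrm{Gr}(q,p+q)$ is the one appearing in Theorem 2.12(ii). Notice that the symmetry at the base point $W_o$ of $\mathrm{Gr}(q,p+q)$, seen as a Hermitian symmetric manifold, is given by the element
\[
s_{W_o} = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix} \in \mathrm{PSU}(p+q).
\]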
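The irreducible Hermitian SLA of compact type associated to the Riemannian symmetric pair $(\mathrm{SU}(p+q), \mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q))$ is given by the Lie algebra
\[
\mathrm{Lie}\,\mathrm{SU}(p+q) = \mathfrak{su}(p+q) = \left\{ \begin{pmatrix} Z_1 & Z_2 \\ -\overline{Z_2}^t & Z_3 \end{pmatrix} :
\begin{array}{l} \overline{Z_1}^t = -Z_1,\ \overline{Z_3}^t = -Z_3 \\ Z_2 \in M_{p,q}(\mathbb{C}),\ \mathrm{Tr}(Z_1) + \mathrm{Tr}(Z_3) = 0 \end{array} \right\} \tag{28}
\]
endowed with the involution $\theta^* = d\sigma^*$
\[
\theta^* \begin{pmatrix} Z_1 & Z_2 \\ -\overline{Z_2}^t & Z_3 \end{pmatrix} = \begin{pmatrix} Z_1 & -Z_2 \\ \overline{Z_2}^t & Z_3 \end{pmatrix}
\]
and with the element
\[
H = i \begin{pmatrix} \frac{q}{p+q} I_p & 0 \\ 0 & -\frac{p}{p+q} I_q \end{pmatrix} \in \mathrm{Fix}(\theta^*) = \left\{ \begin{pmatrix} Z_1 & 0 \\ 0 & Z_3 \end{pmatrix} \in \mathfrak{su}(p+q) \right\} \cong \mathrm{Lie}\,\mathrm{S}(\mathrm{U}_p \times \mathrm{U}_q).
\]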
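Notice that the Hermitian SLA $(\mathfrak{su}(p+q), \theta^*, H)$ is the dual of the Hermitian SLA $(\mathfrak{su}(p,q), \theta, H)$ in the sense of Sect. 2.3. The complexification of the Lie algebras $\mathfrak{su}(p,q)$ and $\mathfrak{su}(p+q)$ is the complex simple Lie algebra of type $A_{p+q-1}$
\[
\mathrm{Lie}\,\mathrm{SL}(p+q,\mathbb{C}) = \mathfrak{sl}(p+q,\mathbb{C}) = \left\{ \begin{pmatrix} Z_1 & Z_2 \\ Z_4 & Z_3 \end{pmatrix} \in \mathfrak{gl}(p+q,\mathbb{C}) : \mathrm{Tr}(Z_1) + \mathrm{Tr}(Z_3) = 0 \right\}.
\]
The decomposition (6) of $\mathfrak{sl}(p+q,\mathbb{C})$ is given by
\[
\mathfrak{sl}(p+q,\mathbb{C}) = \left\{ \begin{pmatrix} 0 & Z_2 \\ 0 & 0 \end{pmatrix} \right\} \oplus \left\{ \begin{pmatrix} Z_1 & 0 \\ 0 & Z_3 \end{pmatrix} : \mathrm{Tr}(Z_1) + \mathrm{Tr}(Z_3) = 0 \right\} \oplus \left\{ \begin{pmatrix} 0 & 0 \\ Z_4 & 0 \end{pmatrix} \right\}.
\]
In particular, we have the identification
\[
M_{p,q}(\mathbb{C}) \xrightarrow{\ \cong\ } \mathfrak{p}_+, \qquad M \mapsto \begin{pmatrix} 0 & M \\ 0 & 0 \end{pmatrix}. \tag{29}
\]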
Using the above identification and the formula (15), $M_{p,q}(\mathbb{C})$ becomes a Hermitian positive JTS with respect to the triple product
\[
\{M_1, M_2, M_3\} = \frac{1}{2} \left( M_1 \overline{M_2}^t M_3 + M_3 \overline{M_2}^t M_1 \right). \tag{30}
\]
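To make the Jordan theoretic notions concrete, the following sketch implements the triple product (30) and checks that the elementary matrices $E_{11}, \dots, E_{qq}$ of $M_{p,q}(\mathbb{C})$ are pairwise orthogonal tripotents. Their number, $q$, agrees with the well-known rank of $D_{I_{p,q}}$ for $p \ge q$. The helper names `triple` and `unit`, and the sample sizes, are illustrative assumptions; maximality of the collection is not tested by the code.

```python
import numpy as np

def triple(M1, M2, M3):
    """Triple product (30): (M1 conj(M2)^t M3 + M3 conj(M2)^t M1) / 2."""
    M2h = M2.conj().T
    return 0.5 * (M1 @ M2h @ M3 + M3 @ M2h @ M1)

def unit(p, q, i, j):
    """Elementary p x q matrix with a single 1 in position (i, j)."""
    E = np.zeros((p, q), dtype=complex)
    E[i, j] = 1
    return E

p, q = 3, 2
frame = [unit(p, q, k, k) for k in range(q)]    # candidate Jordan frame E_11, ..., E_qq

rng = np.random.default_rng(1)
X = rng.normal(size=(p, q)) + 1j * rng.normal(size=(p, q))

# Any matrix with orthonormal columns (a partial isometry) is also a tripotent.
V, _ = np.linalg.qr(rng.normal(size=(p, q)) + 1j * rng.normal(size=(p, q)))

print(all(np.allclose(triple(e, e, e), e) for e in frame))   # each E_kk is a tripotent
print(np.allclose(triple(frame[0], frame[1], X), 0))          # E_11 and E_22 are orthogonal
print(np.allclose(triple(V, V, V), V))
```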
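3.2 Type $II_n$

The bounded symmetric domain of type $II_n$ ($n \ge 2$) in its Harish–Chandra embedding is given by
\[
D_{II_n} := \{ Z \in M^{skew}_{n,n}(\mathbb{C}) : \overline{Z}^t Z < I_n \} \subset M^{skew}_{n,n}(\mathbb{C}) := \{ Z \in M_{n,n}(\mathbb{C}) : Z^t = -Z \}. \tag{31}
\]
Let $\mathrm{SO}(2n,\mathbb{C})$ be the connected complex Lie subgroup of $\mathrm{SL}(2n,\mathbb{C})$ that leaves invariant the bilinear symmetric form on $\mathbb{C}^{2n} \times \mathbb{C}^{2n}$ given by $S(x,y) = x_1 y_{n+1} + \dots + x_n y_{2n} + x_{n+1} y_1 + \dots + x_{2n} y_n$.² The group $\mathrm{SO}(2n,\mathbb{C})$ is simple of type $D_n$ and it is explicitly given in $n \times n$ block notation as
\[
\begin{aligned}
\mathrm{SO}(2n,\mathbb{C}) &= \{ g \in \mathrm{SL}(2n,\mathbb{C}) : g^t S_n g = S_n \} \\
&= \left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{SL}(2n,\mathbb{C}) :
\begin{array}{l}
A^t C = -C^t A \\
B^t D = -D^t B \\
A^t D + C^t B = I_n
\end{array} \right\}.
\end{aligned}
\]
Consider the non-compact real form $\mathrm{SO}^{nc}(2n)$ of $\mathrm{SO}(2n,\mathbb{C})$ consisting of all the elements of $\mathrm{SO}(2n,\mathbb{C})$ that leave invariant the Hermitian form on $\mathbb{C}^{2n} \times \mathbb{C}^{2n}$ given by $-x_1\overline{y}_1 - \dots - x_n\overline{y}_n + x_{n+1}\overline{y}_{n+1} + \dots + x_{2n}\overline{y}_{2n}$. Explicitly,
\[
\begin{aligned}
\mathrm{SO}^{nc}(2n) = \mathrm{SO}(2n,\mathbb{C}) \cap \mathrm{SU}(n,n) &= \left\{ g \in \mathrm{SO}(2n,\mathbb{C}) : \overline{g}^t \begin{pmatrix} -I_n & 0 \\ 0 & I_n \end{pmatrix} g = \begin{pmatrix} -I_n & 0 \\ 0 & I_n \end{pmatrix} \right\} \\
&= \left\{ \begin{pmatrix} A & B \\ -\overline{B} & \overline{A} \end{pmatrix} \in \mathrm{SL}(2n,\mathbb{C}) :
\begin{array}{l}
\overline{A}^t A - B^t \overline{B} = I_n \\
\overline{A}^t B + B^t \overline{A} = 0
\end{array} \right\}.
\end{aligned}
\]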
Note that the group $\mathrm{SO}^{nc}(2n)$ is isomorphic to the classical real Lie group $\mathrm{SO}^*(2n)$ via the conjugation inside $\mathrm{SO}(2n,\mathbb{C})$ given by (see [Mok89, p. 74])
\[
\mathrm{SO}^{nc}(2n) \xrightarrow{\ \cong\ } \mathrm{SO}^*(2n) := \{ g \in \mathrm{SL}(2n,\mathbb{C}) : g^t g = I_{2n} \text{ and } \overline{g}^t J_n g = J_n \}, \qquad
h \mapsto \begin{pmatrix} I_n & iI_n \\ iI_n & I_n \end{pmatrix} h \begin{pmatrix} I_n & iI_n \\ iI_n & I_n \end{pmatrix}^{-1}. \tag{32}
\]

² Usually, one defines $\mathrm{SO}(2n,\mathbb{C})$ with respect to the standard bilinear symmetric form on $\mathbb{C}^{2n} \times \mathbb{C}^{2n}$ given by $x_1 y_1 + \dots + x_{2n} y_{2n}$. However, for our purposes it will be more convenient to use this alternative presentation.

The Lie group $\mathrm{SO}^{nc}(2n)$ acts transitively on $D_{II_n}$ via generalized Möbius transformations, as in (19). Notice that the center $Z(\mathrm{SO}^{nc}(2n)) = \{\pm I_{2n}\}$ of $\mathrm{SO}^{nc}(2n)$ acts trivially on $D_{II_n}$; indeed, it turns out that the connected component of the group of biholomorphisms of $D_{II_n}$ is given by
\[
\mathrm{Hol}(D_{II_n})^o = \mathrm{SO}^{nc}(2n)/Z(\mathrm{SO}^{nc}(2n)) =: \mathrm{PSO}^{nc}(2n),
\]
which is the connected non-compact adjoint simple Lie group of type $D_n$. The symmetry of $D_{II_n}$ at the base point $0$ is given by the element
\[
s_0 = \begin{pmatrix} iI_n & 0 \\ 0 & -iI_n \end{pmatrix} \in \mathrm{PSO}^{nc}(2n), \tag{33}
\]
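which acts on $D_{II_n}$ by sending $Z$ into $-Z$. The symmetry $s_0$ induces an involution on $\mathrm{SO}^{nc}(2n)$
\[
\sigma : \mathrm{SO}^{nc}(2n) \to \mathrm{SO}^{nc}(2n), \qquad
\begin{pmatrix} A & B \\ C & D \end{pmatrix} \mapsto
\begin{pmatrix} iI_n & 0 \\ 0 & -iI_n \end{pmatrix}
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
\begin{pmatrix} iI_n & 0 \\ 0 & -iI_n \end{pmatrix}^{-1}
= \begin{pmatrix} A & -B \\ -C & D \end{pmatrix},
\]
whose fixed Lie subgroup is equal to the maximal compact Lie subgroup
\[
\left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} \in \mathrm{SO}^{nc}(2n) \right\} = \left\{ \begin{pmatrix} A & 0 \\ 0 & \overline{A} \end{pmatrix} : \overline{A}^t A = I_n \right\} =: \mathrm{U}(n),
\]
which is also equal to the stabilizer of $0 \in D_{II_n}$. In particular, the pair $(\mathrm{SO}^{nc}(2n), \mathrm{U}(n))$ is a Riemannian symmetric pair. Notice that the involution $\sigma$ descends to an involution of $\mathrm{PSO}^{nc}(2n)$ whose fixed locus is the maximal compact Lie subgroup $\overline{\mathrm{U}(n)} := \mathrm{U}(n)/\{\pm I_n\}$ of $\mathrm{PSO}^{nc}(2n)$. Therefore, also the pair $(\mathrm{PSO}^{nc}(2n), \overline{\mathrm{U}(n)})$ is a Riemannian symmetric pair. By the above discussion, we get the following presentation of $D_{II_n}$ as an irreducible HSM of non-compact type
\[
D_{II_n} \cong \mathrm{SO}^{nc}(2n)/\mathrm{U}(n) = \mathrm{PSO}^{nc}(2n)/\overline{\mathrm{U}(n)}, \tag{34}
\]
associated to the Riemannian symmetric pair $(\mathrm{SO}^{nc}(2n), \mathrm{U}(n))$ (resp. to $(\mathrm{PSO}^{nc}(2n), \overline{\mathrm{U}(n)})$). Notice that the last description of $D_{II_n}$ is the one appearing in Theorem 2.12(i).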
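The description of $\mathrm{SO}^{nc}(2n)$ and of its action on $D_{II_n}$ can be tested on a small case. The sketch below takes $n = 2$ and an element of block shape $\begin{pmatrix} A & B \\ -\overline{B} & \overline{A} \end{pmatrix}$ with $A = \cosh(t) I_2$ and $B = \sinh(t) K$, where $K$ is a real skew-symmetric matrix; it checks both invariance conditions and that the Möbius action preserves skew-symmetry and the domain. The particular element, the point $Z$ and the helper `mobius` are hypothetical choices made only for illustration.

```python
import numpy as np

n, t = 2, 0.4
I, O = np.eye(n), np.zeros((n, n))
S = np.block([[O, I], [I, O]])             # symmetric form S_n
J = np.block([[-I, O], [O, I]])            # Hermitian form of signature (n, n)
K = np.array([[0.0, 1.0], [-1.0, 0.0]])    # real skew-symmetric 2 x 2 matrix

# g = [[A, B], [-conj(B), conj(A)]] with A = cosh(t) I and B = sinh(t) K
g = np.block([[np.cosh(t) * I, np.sinh(t) * K],
              [-np.sinh(t) * K, np.cosh(t) * I]]).astype(complex)

def mobius(g, Z):
    """Generalized Moebius action (AZ + B)(CZ + D)^{-1} in n x n block notation."""
    A, B, C, D = g[:n, :n], g[:n, n:], g[n:, :n], g[n:, n:]
    return (A @ Z + B) @ np.linalg.inv(C @ Z + D)

Z = np.array([[0, 0.3 + 0.4j], [-0.3 - 0.4j, 0]])   # skew-symmetric with conj(Z)^t Z = 0.25 I
W = mobius(g, Z)

print(np.allclose(g.T @ S @ g, S), np.allclose(g.conj().T @ J @ g, J))
print(np.allclose(W, -W.T))                                   # the image is again skew-symmetric
print(np.linalg.eigvalsh(np.eye(n) - W.conj().T @ W).min() > 0)
```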
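The irreducible Hermitian SLA of non-compact type associated to the Riemannian symmetric pair $(\mathrm{SO}^{nc}(2n), \mathrm{U}(n))$ (or equivalently to $(\mathrm{PSO}^{nc}(2n), \overline{\mathrm{U}(n)})$) is given by the Lie algebra
\[
\begin{aligned}
\mathfrak{so}^{nc}(2n) := \mathrm{Lie}\,\mathrm{SO}^{nc}(2n) &= \left\{ M \in \mathfrak{sl}(2n,\mathbb{C}) :
\begin{array}{l}
M^t S_n = -S_n M \\
\overline{M}^t \begin{pmatrix} -I_n & 0 \\ 0 & I_n \end{pmatrix} = -\begin{pmatrix} -I_n & 0 \\ 0 & I_n \end{pmatrix} M
\end{array} \right\} \\
&= \left\{ \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & -Z_1^t \end{pmatrix} \in \mathfrak{gl}(2n,\mathbb{C}) : \overline{Z_1}^t = -Z_1,\ Z_2^t = -Z_2 \right\}^3
\end{aligned} \tag{35}
\]
endowed with the Cartan involution $\theta = d\sigma$ given by
\[
\theta \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & -Z_1^t \end{pmatrix} = \begin{pmatrix} Z_1 & -Z_2 \\ -\overline{Z_2}^t & -Z_1^t \end{pmatrix}
\]
and with the element
\[
H = \frac{i}{2} \begin{pmatrix} I_n & 0 \\ 0 & -I_n \end{pmatrix} \in \mathrm{Fix}(\theta) = \left\{ \begin{pmatrix} Z_1 & 0 \\ 0 & -Z_1^t \end{pmatrix} : \overline{Z_1}^t = -Z_1 \right\} \cong \mathrm{Lie}\,\mathrm{U}(n).
\]
The cominuscule homogeneous variety of type $II_n$ is the connected component $\mathrm{Gr}_{ort}(n,2n)^o$ containing $[W_o] := \langle e_{n+1}, \dots, e_{2n} \rangle \subset \mathbb{C}^{2n}$ of the orthogonal Grassmannian $\mathrm{Gr}_{ort}(n,2n)$ parametrizing Lagrangian $n$-dimensional subspaces of $\mathbb{C}^{2n}$:
\[
\mathrm{Gr}_{ort}(n,2n) := \{ [W \subset \mathbb{C}^{2n}] : \dim W = n,\ S|_{W \times W} \equiv 0 \}, \tag{36}
\]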
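where $S$ is the bilinear symmetric non-degenerate form on $\mathbb{C}^{2n}$ which is represented by the matrix $S_n = \begin{pmatrix} 0 & I_n \\ I_n & 0 \end{pmatrix}$ in the standard basis of $\mathbb{C}^{2n}$. The Borel embedding of $D_{II_n}$ into $\mathrm{Gr}_{ort}(n,2n)^o$ is given by
\[
D_{II_n} \subset M^{skew}_{n,n}(\mathbb{C}) \hookrightarrow \mathrm{Gr}_{ort}(n,2n)^o, \qquad
Z \mapsto \left\{ \langle v_1, \dots, v_n \rangle : \{v_1, \dots, v_n\} \text{ are the column vectors of } \begin{pmatrix} Z \\ I_n \end{pmatrix} \right\}. \tag{37}
\]

³ Note that $\mathfrak{so}^{nc}(2n)$ is isomorphic to the classical real Lie algebra $\mathfrak{so}^*(2n) = \mathrm{Lie}\,\mathrm{SO}^*(2n)$ via the same conjugation map as in (32).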
The complex algebraic simple group $\mathrm{SO}(2n,\mathbb{C})$ acts transitively on $\mathrm{Gr}_{ort}(n,2n)^o$ via
\[
\mathrm{SO}(2n,\mathbb{C}) \times \mathrm{Gr}_{ort}(n,2n)^o \to \mathrm{Gr}_{ort}(n,2n)^o, \qquad (g, [W \subset \mathbb{C}^{2n}]) \mapsto [g(W) \subset \mathbb{C}^{2n}].
\]
Note that the center $Z(\mathrm{SO}(2n,\mathbb{C})) = \{\pm I_{2n}\}$ of $\mathrm{SO}(2n,\mathbb{C})$ acts trivially on $\mathrm{Gr}_{ort}(n,2n)^o$; indeed, it turns out that the group of automorphisms of the algebraic variety $\mathrm{Gr}_{ort}(n,2n)^o$ is equal to
\[
\mathrm{PSO}(2n,\mathbb{C}) := \mathrm{SO}(2n,\mathbb{C})/\{\pm I_{2n}\},
\]
which is the connected simple adjoint complex algebraic group of type $D_n$ and is the complexification of the Lie group $\mathrm{PSO}^{nc}(2n)$. The stabilizer of $[W_o] = \langle e_{n+1}, \dots, e_{2n} \rangle \subset \mathbb{C}^{2n} \in \mathrm{Gr}_{ort}(n,2n)^o$ is the maximal parabolic subgroup associated to the $n$-th simple root of the Dynkin diagram $D_n$ (which is a cominuscule simple root of $D_n$, see Table 5)⁴
\[
Q_n := \left\{ \begin{pmatrix} A & 0 \\ C & D \end{pmatrix} \in \mathrm{SO}(2n,\mathbb{C}) \right\} \subset \mathrm{SO}(2n,\mathbb{C}),
\]
where $A \in M_{n,n}(\mathbb{C})$, $C \in M_{n,n}(\mathbb{C})$ and $D \in M_{n,n}(\mathbb{C})$. The parabolic group $Q_n$ admits the following Levi decomposition
\[
Q_n = R_u(Q_n) \rtimes L(Q_n) := \left\{ \begin{pmatrix} I_n & 0 \\ C & I_n \end{pmatrix} : C^t = -C \right\} \rtimes \left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} \in \mathrm{SO}(2n,\mathbb{C}) \right\},
\]
which coincides with the Levi decomposition appearing in Theorem 2.35. From the above discussion, we obtain the following explicit presentation of $\mathrm{Gr}_{ort}(n,2n)^o$ as a cominuscule homogeneous variety (as in Definition 2.33)
\[
\mathrm{Gr}_{ort}(n,2n)^o \cong \mathrm{SO}(2n,\mathbb{C})/Q_n = \mathrm{PSO}(2n,\mathbb{C})/\overline{Q_n}, \tag{38}
\]
where $\overline{Q_n} := Q_n/\{\pm I_{2n}\}$.

⁴ As it is seen from Table 5, we could have chosen the $(n-1)$-th simple root and we would have gotten an isomorphic (although non-conjugate) parabolic subgroup.

Consider now the compact real form $\mathrm{SO}^c(2n)$ of $\mathrm{SO}(2n,\mathbb{C})$ consisting of all the elements of $\mathrm{SO}(2n,\mathbb{C})$ that leave invariant the positive definite Hermitian form $x_1\overline{y}_1 + \dots + x_{2n}\overline{y}_{2n}$ on $\mathbb{C}^{2n}$. More explicitly
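\[
\mathrm{SO}^c(2n) := \mathrm{SO}(2n,\mathbb{C}) \cap \mathrm{SU}(2n) = \{ g \in \mathrm{SO}(2n,\mathbb{C}) : \overline{g}^t g = I_{2n} \}
\]
[...]
\[
\left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{SL}(2n,\mathbb{C}) :
\begin{array}{l}
A^t C = C^t A \\
B^t D = D^t B \\
A^t D - C^t B = I_n
\end{array} \right\}.
\]
Consider the non-compact real form $\mathrm{Sp}^{nc}(n)$ of $\mathrm{Sp}(n,\mathbb{C})$ consisting of all the elements of $\mathrm{Sp}(n,\mathbb{C})$ that leave invariant the Hermitian form on $\mathbb{C}^{2n} \times \mathbb{C}^{2n}$ given by $-x_1\overline{y}_1 - \dots - x_n\overline{y}_n + x_{n+1}\overline{y}_{n+1} + \dots + x_{2n}\overline{y}_{2n}$. Explicitly,
\[
\mathrm{Sp}^{nc}(n) = \mathrm{Sp}(n,\mathbb{C}) \cap \mathrm{SU}(n,n) = \left\{ g \in \mathrm{Sp}(n,\mathbb{C}) : \overline{g}^t \begin{pmatrix} -I_n & 0 \\ 0 & I_n \end{pmatrix} g = \begin{pmatrix} -I_n & 0 \\ 0 & I_n \end{pmatrix} \right\}
\]
[...]
\[
= \left\{ \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & -Z_1^t \end{pmatrix} \in \mathfrak{gl}(2n,\mathbb{C}) : \overline{Z_1}^t = -Z_1,\ Z_2^t = Z_2 \right\}^6 \tag{47}
\]
endowed with the Cartan involution $\theta = d\sigma$ given by
\[
\theta \begin{pmatrix} Z_1 & Z_2 \\ \overline{Z_2}^t & -Z_1^t \end{pmatrix} = \begin{pmatrix} Z_1 & -Z_2 \\ -\overline{Z_2}^t & -Z_1^t \end{pmatrix}
\]
and with the element
\[
H = \frac{i}{2} \begin{pmatrix} I_n & 0 \\ 0 & -I_n \end{pmatrix} \in \mathrm{Fix}(\theta) = \left\{ \begin{pmatrix} Z_1 & 0 \\ 0 & -Z_1^t \end{pmatrix} : \overline{Z_1}^t = -Z_1 \right\} \cong \mathrm{Lie}\,\mathrm{U}(n).
\]
The cominuscule homogeneous variety of type $III_n$ is the symplectic Grassmannian $\mathrm{Gr}_{sym}(n,2n)$ parametrizing Lagrangian $n$-dimensional subspaces of $\mathbb{C}^{2n}$:
\[
\mathrm{Gr}_{sym}(n,2n) := \{ [W \subset \mathbb{C}^{2n}] : \dim W = n,\ J|_{W \times W} \equiv 0 \}, \tag{48}
\]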
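where $J$ is the standard symplectic form on $\mathbb{C}^{2n}$, which is represented by the matrix $J_n = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$ in the standard basis of $\mathbb{C}^{2n}$. The Borel embedding of $D_{III_n}$ into $\mathrm{Gr}_{sym}(n,2n)$ is given by
\[
D_{III_n} \subset M^{sym}_{n,n}(\mathbb{C}) \hookrightarrow \mathrm{Gr}_{sym}(n,2n), \qquad
Z \mapsto \left\{ \langle v_1, \dots, v_n \rangle : \{v_1, \dots, v_n\} \text{ are the column vectors of } \begin{pmatrix} Z \\ I_n \end{pmatrix} \right\}. \tag{49}
\]

⁶ Note that $\mathfrak{sp}^{nc}(n)$ is isomorphic to the classical real Lie algebra $\mathfrak{sp}(n,\mathbb{R})$ via the same conjugation given in formula (44).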
The complex simple algebraic group $\mathrm{Sp}(n,\mathbb{C})$ acts transitively on $\mathrm{Gr}_{sym}(n,2n)$ via
\[
\mathrm{Sp}(n,\mathbb{C}) \times \mathrm{Gr}_{sym}(n,2n) \to \mathrm{Gr}_{sym}(n,2n), \qquad (g, [W \subset \mathbb{C}^{2n}]) \mapsto [g(W) \subset \mathbb{C}^{2n}].
\]
Note that the center $Z(\mathrm{Sp}(n,\mathbb{C})) = \{\pm I_{2n}\}$ of $\mathrm{Sp}(n,\mathbb{C})$ acts trivially on $\mathrm{Gr}_{sym}(n,2n)$; indeed, it turns out that the group of automorphisms of the algebraic variety $\mathrm{Gr}_{sym}(n,2n)$ is equal to
\[
\mathrm{PSp}(n,\mathbb{C}) := \mathrm{Sp}(n,\mathbb{C})/\{\pm I_{2n}\},
\]
which is a connected semisimple complex algebraic group of adjoint type and is the complexification of the Lie group $\mathrm{PSp}(n,\mathbb{R})$.

Consider now the base point $W_o := \langle e_{n+1}, \dots, e_{2n} \rangle \in \mathrm{Gr}_{sym}(n,2n)$ with respect to the standard basis $\{e_1, \dots, e_{2n}\}$ of $\mathbb{C}^{2n}$ (recall that we have normalized $J$ so that it is represented by the standard symplectic matrix $J_n$ with respect to this basis). The stabilizer of $W_o$ is the maximal parabolic subgroup associated to the $n$-th simple root of the Dynkin diagram $C_n$ (which is the unique cominuscule simple root of $C_n$, see Table 5)
\[
Q_n := \left\{ \begin{pmatrix} A & 0 \\ C & D \end{pmatrix} \in \mathrm{Sp}(n,\mathbb{C}) \right\} \subset \mathrm{Sp}(n,\mathbb{C}),
\]
where $A \in M_{n,n}(\mathbb{C})$, $C \in M_{n,n}(\mathbb{C})$ and $D \in M_{n,n}(\mathbb{C})$. The parabolic group $Q_n$ admits the following Levi decomposition
\[
Q_n = R_u(Q_n) \rtimes L(Q_n) := \left\{ \begin{pmatrix} I_n & 0 \\ C & I_n \end{pmatrix} : C^t = C \right\} \rtimes \left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} \in \mathrm{Sp}(n,\mathbb{C}) \right\},
\]
which coincides with the Levi decomposition appearing in Theorem 2.35. From the above discussion, we obtain the following explicit presentation of $\mathrm{Gr}_{sym}(n,2n)$ as a cominuscule homogeneous variety (as in Definition 2.33)
\[
\mathrm{Gr}_{sym}(n,2n) \cong \mathrm{Sp}(n,\mathbb{C})/Q_n = \mathrm{PSp}(n,\mathbb{C})/\overline{Q_n}, \tag{50}
\]
where $\overline{Q_n} := Q_n/\{\pm I_{2n}\}$.

Consider now the compact real form of $\mathrm{Sp}(n,\mathbb{C})$, which is the Lie subgroup $\mathrm{Sp}(n) := \mathrm{Sp}(n,\mathbb{C}) \cap \mathrm{SU}(2n) \subset \mathrm{Sp}(n,\mathbb{C})$ that leaves invariant the positive definite Hermitian form $x_1\overline{y}_1 + \dots + x_{2n}\overline{y}_{2n}$ on $\mathbb{C}^{2n}$. More explicitly⁷

⁷ The Lie group $\mathrm{Sp}(n)$ admits another natural description in terms of matrices with coefficients in $\mathbb{H}$. Namely, there is an isomorphism of Lie groups [...]
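\[
\mathrm{Sp}(n) = \{ g \in \mathrm{Sp}(n,\mathbb{C}) : \overline{g}^t g = I_{2n} \}
\]
[...] The first connected component is the one that contains the origin $0 \in \mathbb{C}^n$ and it coincides with the domain $D_{IV_n}$. The domain $D_{IV_n}$ (which is also called the Lie ball) admits another real analytic incarnation in terms of $2 \times n$ real matrices, namely we have a real analytic diffeomorphism (see [Hua46, Sec. 12 and 13])
\[
D_{IV_n} \xrightarrow{\ \cong\ } \{ M \in M_{2,n}(\mathbb{R}) : M M^t < I_2 \}, \qquad
Z \mapsto 2 \begin{pmatrix} Z^t Z + 1 & i(Z^t Z - 1) \\ \overline{Z^t Z} + 1 & -i(\overline{Z^t Z} - 1) \end{pmatrix}^{-1} \begin{pmatrix} Z^t \\ \overline{Z}^t \end{pmatrix}. \tag{56}
\]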
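Consider the subgroup $\mathrm{SO}^{nc}(n,2)$ of $\mathrm{SO}(n+2,\mathbb{C})$ consisting of all the elements of $\mathrm{SO}(n+2,\mathbb{C})$ that leave invariant the Hermitian form on $\mathbb{C}^{n+2} \times \mathbb{C}^{n+2}$ given by $-x_1\overline{y}_1 - \dots - x_n\overline{y}_n + x_{n+1}\overline{y}_{n+1} + x_{n+2}\overline{y}_{n+2}$. Explicitly,
\[
\begin{aligned}
\mathrm{SO}^{nc}(n,2) &= \mathrm{SO}(n+2,\mathbb{C}) \cap \mathrm{U}(n,2) \\
&= \left\{ g \in \mathrm{SL}(n+2,\mathbb{C}) : g^t g = I_{n+2},\ \overline{g}^t \begin{pmatrix} -I_n & 0 \\ 0 & I_2 \end{pmatrix} g = \begin{pmatrix} -I_n & 0 \\ 0 & I_2 \end{pmatrix} \right\} \\
&= \left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{SL}(n+2,\mathbb{C}) :
\begin{array}{ll}
A^t A + C^t C = I_n, & \overline{A}^t A - \overline{C}^t C = I_n \\
D^t D + B^t B = I_2, & \overline{D}^t D - \overline{B}^t B = I_2 \\
A^t B = -C^t D, & \overline{A}^t B = \overline{C}^t D
\end{array} \right\}.
\end{aligned}
\]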
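The Lie group $\mathrm{SO}^{nc}(n,2)$ acts transitively on $D_{IV_n}$ via
\[
\mathrm{SO}^{nc}(n,2) \times D_{IV_n} \to D_{IV_n}, \qquad
\left( \begin{pmatrix} A & B \\ C & D \end{pmatrix}, Z \right) \mapsto
\frac{ 2iAZ + B \begin{pmatrix} 1 + Z^t Z \\ i - iZ^t Z \end{pmatrix} }{ (i, 1) \left[ 2iCZ + D \begin{pmatrix} 1 + Z^t Z \\ i - iZ^t Z \end{pmatrix} \right] }.
\]
Notice that the center $Z(\mathrm{SO}^{nc}(n,2)) = \{\pm I_{n+2}\}$ of $\mathrm{SO}^{nc}(n,2)$ acts trivially on $D_{IV_n}$; indeed, it turns out that the connected component of the group of biholomorphisms of $D_{IV_n}$ is given by
\[
\mathrm{Hol}(D_{IV_n})^o = \mathrm{SO}^{nc}(n,2)/Z(\mathrm{SO}^{nc}(n,2)) =: \mathrm{PSO}^{nc}(n,2),
\]
which is the connected non-compact adjoint simple Lie group of type $D_{n/2+1}$ if $n$ is even and $B_{(n+1)/2}$ if $n$ is odd. The symmetry of $D_{IV_n}$ at the base point $0$ is given by the element
\[
s_0 = \begin{pmatrix} I_n & 0 \\ 0 & -I_2 \end{pmatrix} \in \mathrm{PSO}^{nc}(n,2), \tag{57}
\]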
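which acts on $D_{IV_n}$ by sending $Z$ into $-Z$. The symmetry $s_0$ induces an involution on $\mathrm{SO}^{nc}(n,2)$
\[
\sigma : \mathrm{SO}^{nc}(n,2) \to \mathrm{SO}^{nc}(n,2), \qquad
\begin{pmatrix} A & B \\ C & D \end{pmatrix} \mapsto
\begin{pmatrix} I_n & 0 \\ 0 & -I_2 \end{pmatrix}
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
\begin{pmatrix} I_n & 0 \\ 0 & -I_2 \end{pmatrix}^{-1}
= \begin{pmatrix} A & -B \\ -C & D \end{pmatrix},
\]
whose fixed Lie subgroup is equal to the maximal compact Lie subgroup [...]
\[
\begin{aligned}
\mathfrak{so}^{nc}(n,2) := \mathrm{Lie}\,\mathrm{SO}^{nc}(n,2) &= \left\{ M \in \mathfrak{sl}(n+2,\mathbb{C}) :
\begin{array}{l}
M^t = -M \\
\overline{M}^t \begin{pmatrix} -I_n & 0 \\ 0 & I_2 \end{pmatrix} = -\begin{pmatrix} -I_n & 0 \\ 0 & I_2 \end{pmatrix} M
\end{array} \right\} \\
&= \left\{ \begin{pmatrix} X_1 & iX_2 \\ -iX_2^t & X_3 \end{pmatrix} \in \mathfrak{gl}(n+2,\mathbb{C}) :
\begin{array}{l}
\overline{X_1} = X_1,\ \overline{X_2} = X_2,\ \overline{X_3} = X_3 \\
X_1^t = -X_1,\ X_3^t = -X_3
\end{array} \right\}
\end{aligned} \tag{59}
\]
endowed with the involution $\theta = d\sigma$
\[
\theta \begin{pmatrix} X_1 & iX_2 \\ -iX_2^t & X_3 \end{pmatrix} = \begin{pmatrix} X_1 & -iX_2 \\ iX_2^t & X_3 \end{pmatrix}
\]
and with the element
\[
H = \begin{pmatrix} 0 & 0 \\ 0 & J_1 \end{pmatrix} \in \mathrm{Fix}(\theta) = \left\{ \begin{pmatrix} X_1 & 0 \\ 0 & X_3 \end{pmatrix} \in \mathfrak{gl}(n+2,\mathbb{R}) : X_1^t = -X_1,\ X_3^t = -X_3 \right\} \cong \mathrm{Lie}(\mathrm{SO}(n) \times \mathrm{SO}(2)).
\]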
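The cominuscule homogeneous variety of type $IV_n$ is the complex quadric hypersurface of dimension $n$:
\[
\mathcal{Q}^n := \{ [v] \in \mathbb{P}^{n+1} : Q(v,v) = 0 \} \subset \mathbb{P}^{n+1}, \tag{60}
\]
where $Q$ is the bilinear symmetric non-degenerate form on $\mathbb{C}^{n+2}$ given by $Q(v,v) = v_1^2 + \dots + v_{n+2}^2$. Observe that the complex quadric hypersurface $\mathcal{Q}^n \subset \mathbb{P}^{n+1}$ admits another real analytic incarnation. Namely, $\mathcal{Q}^n$ is real analytic diffeomorphic to the oriented real Grassmannian $\mathrm{Gr}^+_{\mathbb{R}}(2,n+2)$ parametrizing two-dimensional oriented subspaces of $\mathbb{R}^{n+2}$ via the map (see [Sat80, Appendix §6])
\[
\mathrm{Gr}^+_{\mathbb{R}}(2,n+2) \xrightarrow{\ \cong\ } \mathcal{Q}^n, \qquad \langle v_1, v_2 \rangle \mapsto [v_1 + iv_2]. \tag{61}
\]
The Borel embedding of $D_{IV_n}$ into $\mathcal{Q}^n$ is given by
\[
D_{IV_n} \subset \mathbb{C}^n \hookrightarrow \mathcal{Q}^n, \qquad
Z \mapsto \left[ \begin{pmatrix} 2iZ \\ 1 + Z^t Z \\ i - iZ^t Z \end{pmatrix} \right]. \tag{62}
\]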
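A quick numerical check of the Borel embedding (62): the sketch below confirms that the image vector is isotropic for $Q$, so that it defines a point of the quadric, and recovers $Z$ from the homogeneous coordinates on the chart where $i v_{n+1} + v_{n+2} \ne 0$. The function names `borel_embedding` and `recover` and the sample point are assumptions made for illustration.

```python
import numpy as np

def borel_embedding(Z):
    """Homogeneous coordinates (2iZ, 1 + Z^t Z, i - i Z^t Z) of the image of Z under (62)."""
    w = Z @ Z                       # bilinear product Z^t Z of a 1-D array (no conjugation)
    return np.concatenate([2j * Z, [1 + w, 1j - 1j * w]])

def recover(v):
    """Inverse of the embedding on the chart where i*v[-2] + v[-1] is non-zero."""
    return v[:-2] / (1j * v[-2] + v[-1])

Z = np.array([0.2 + 0.1j, -0.3j, 0.1])
v = borel_embedding(Z)

print(abs(v @ v) < 1e-12)                       # Q(v, v) = 0: the point lies on the quadric
print(np.allclose(recover((2 - 3j) * v), Z))    # Z depends only on the projective class [v]
```

For $Z = 0$ the embedding returns $(0, \dots, 0, 1, i)$, which serves as the base point of the quadric.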
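The complex algebraic simple group $\mathrm{SO}(n+2,\mathbb{C})$ acts transitively on $\mathcal{Q}^n$ via
\[
\mathrm{SO}(n+2,\mathbb{C}) \times \mathcal{Q}^n \to \mathcal{Q}^n, \qquad (g, [v]) \mapsto [g(v)].
\]
Note that the center
\[
Z(\mathrm{SO}(n+2,\mathbb{C})) = \begin{cases} \{\pm I_{n+2}\} & \text{if } n \text{ is even,} \\ \{I_{n+2}\} & \text{if } n \text{ is odd,} \end{cases} \tag{63}
\]
acts trivially on $\mathcal{Q}^n$; indeed, it turns out that the group of automorphisms of the algebraic variety $\mathcal{Q}^n$ is equal to
\[
\mathrm{PSO}(n+2,\mathbb{C}) := \mathrm{SO}(n+2,\mathbb{C})/Z(\mathrm{SO}(n+2,\mathbb{C})),
\]
which is the connected simple adjoint complex algebraic group of type $D_{n/2+1}$ if $n$ is even and $B_{(n+1)/2}$ if $n$ is odd. The stabilizer of $v_o = [(0, \dots, 0, 1, i)] \in \mathcal{Q}^n$ is the maximal parabolic subgroup associated to the first simple root of the Dynkin diagram $D_{n/2+1}$ if $n$ is even and of the Dynkin diagram $B_{(n+1)/2}$ if $n$ is odd (which are cominuscule simple roots, see Table 5)
\[
Q_1 := \left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{SO}(n+2,\mathbb{C}) : B \begin{pmatrix} 1 \\ i \end{pmatrix} = 0,\ D = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \text{ such that } ia - b = c + id \right\},
\]
where $A \in M_{n,n}(\mathbb{C})$, $B \in M_{n,2}(\mathbb{C})$, $C \in M_{2,n}(\mathbb{C})$ and $D \in M_{2,2}(\mathbb{C})$. From the above discussion, we obtain the following explicit presentation of $\mathcal{Q}^n$ as a cominuscule homogeneous variety (as in Definition 2.33)
\[
\mathcal{Q}^n \cong \mathrm{SO}(n+2,\mathbb{C})/Q_1 = \mathrm{PSO}(n+2,\mathbb{C})/\overline{P_1}, \tag{64}
\]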
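where $\overline{P_1} := Q_1/Z(\mathrm{SO}(n+2,\mathbb{C}))$.

Consider now the compact real form $\mathrm{SO}(n+2)$ of $\mathrm{SO}(n+2,\mathbb{C})$ consisting of all the real matrices in $\mathrm{SO}(n+2,\mathbb{C})$ or, equivalently, of all the elements in $\mathrm{SO}(n+2,\mathbb{C})$ that leave invariant the positive definite Hermitian form $x_1\overline{y}_1 + \dots + x_{n+2}\overline{y}_{n+2}$ on $\mathbb{C}^{n+2}$. More explicitly
\[
\begin{aligned}
\mathrm{SO}(n+2) := \mathrm{SO}(n+2,\mathbb{C}) \cap \mathrm{SU}(n+2) &= \{ g \in \mathrm{SO}(n+2,\mathbb{C}) : \overline{g} = g \} \\
&= \left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{SL}(n+2,\mathbb{R}) :
\begin{array}{l}
A^t A + C^t C = I_n \\
D^t D + B^t B = I_2 \\
A^t B = -C^t D
\end{array} \right\}.
\end{aligned}
\]
Similarly, the quotient of $\mathrm{SO}(n+2)$ by its center (which is given by (63))
\[
\mathrm{PSO}(n+2) := \mathrm{SO}(n+2)/Z(\mathrm{SO}(n+2))
\]
is a compact real form of $\mathrm{PSO}(n+2,\mathbb{C})$. The restriction of the action of $\mathrm{SO}(n+2,\mathbb{C})$ on $\mathcal{Q}^n$ to the subgroup $\mathrm{SO}(n+2) \subset \mathrm{SO}(n+2,\mathbb{C})$ is still transitive and the stabilizer of $v_o$ is the maximal proper connected and compact subgroup
\[
\mathrm{SO}(n+2) \cap Q_1 = \left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} :
\begin{array}{l} A^t A = I_n,\ \det(A) = 1 \\ D^t D = I_2,\ \det(D) = 1 \end{array} \right\} \cong \mathrm{SO}(n) \times \mathrm{SO}(2).
\]
The pair $(\mathrm{SO}(n+2), \mathrm{SO}(n) \times \mathrm{SO}(2))$ is a Riemannian symmetric pair since $\mathrm{SO}(n) \times \mathrm{SO}(2)$ is the connected component of the fixed subgroup of the involution
\[
\sigma^* : \mathrm{SO}(n+2) \to \mathrm{SO}(n+2), \qquad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \mapsto \begin{pmatrix} A & -B \\ -C & D \end{pmatrix},
\]
and similarly for the pair $(\mathrm{PSO}(n+2), \overline{\mathrm{SO}(n) \times \mathrm{SO}(2)})$, where $\overline{\mathrm{SO}(n) \times \mathrm{SO}(2)}$ is the image of $\mathrm{SO}(n) \times \mathrm{SO}(2)$ in $\mathrm{PSO}(n+2)$. By the above discussion, we get the following presentation of $\mathcal{Q}^n$ as the irreducible HSM of compact type
\[
\mathcal{Q}^n \cong \mathrm{SO}(n+2)/\mathrm{SO}(n) \times \mathrm{SO}(2) = \mathrm{PSO}(n+2)/\overline{\mathrm{SO}(n) \times \mathrm{SO}(2)}, \tag{65}
\]
associated to the Riemannian symmetric pair $(\mathrm{SO}(n+2), \mathrm{SO}(n) \times \mathrm{SO}(2))$ (resp. to $(\mathrm{PSO}(n+2), \overline{\mathrm{SO}(n) \times \mathrm{SO}(2)})$). In particular, the last description of $\mathcal{Q}^n$ is the one appearing in Theorem 2.12(ii). Notice that the symmetry at the base point $v_o$ of $\mathcal{Q}^n$, seen as a Hermitian symmetric manifold, is given by the element
\[
s_{v_o} = \begin{pmatrix} I_n & 0 \\ 0 & -I_2 \end{pmatrix} \in \mathrm{PSO}(n+2).
\]
The irreducible Hermitian SLA of compact type associated to the Riemannian symmetric pair $(\mathrm{SO}(n+2), \mathrm{SO}(n) \times \mathrm{SO}(2))$ is given by the Lie algebra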
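\[
\begin{aligned}
\mathfrak{so}(n+2) := \mathrm{Lie}\,\mathrm{SO}(n+2) &= \{ M \in \mathfrak{sl}(n+2,\mathbb{R}) : M^t = -M \} \\
&= \left\{ \begin{pmatrix} X_1 & X_2 \\ -X_2^t & X_3 \end{pmatrix} \in \mathfrak{gl}(n+2,\mathbb{R}) : X_1^t = -X_1,\ X_3^t = -X_3 \right\}
\end{aligned} \tag{66}
\]
endowed with the involution $\theta^* = d\sigma^*$
\[
\theta^* \begin{pmatrix} X_1 & X_2 \\ -X_2^t & X_3 \end{pmatrix} = \begin{pmatrix} X_1 & -X_2 \\ X_2^t & X_3 \end{pmatrix}
\]
and with the element
\[
H = \begin{pmatrix} 0 & 0 \\ 0 & J_1 \end{pmatrix} \in \mathrm{Fix}(\theta^*) = \left\{ \begin{pmatrix} X_1 & 0 \\ 0 & X_3 \end{pmatrix} : X_1^t = -X_1,\ X_3^t = -X_3 \right\} \cong \mathrm{Lie}(\mathrm{SO}(n) \times \mathrm{SO}(2)).
\]
Notice that the Hermitian SLA $(\mathfrak{so}(n+2), \theta^*, H)$ is the dual of the Hermitian SLA $(\mathfrak{so}^{nc}(n,2), \theta, H)$ in the sense of Sect. 2.3. The complexification of the Lie algebras $\mathfrak{so}^{nc}(n,2)$ and $\mathfrak{so}(n+2)$ is the complex simple Lie algebra of type $D_{n/2+1}$ if $n$ is even and $B_{(n+1)/2}$ if $n$ is odd:
\[
\mathrm{Lie}\,\mathrm{SO}(n+2,\mathbb{C}) = \mathfrak{so}(n+2,\mathbb{C}) = \left\{ \begin{pmatrix} Z_1 & Z_2 \\ -Z_2^t & Z_3 \end{pmatrix} \in \mathfrak{gl}(n+2,\mathbb{C}) : Z_1^t = -Z_1,\ Z_3^t = -Z_3 \right\}.
\]
The decomposition (6) of $\mathfrak{so}(n+2,\mathbb{C})$ is given by
\[
\mathfrak{so}(n+2,\mathbb{C}) = \left\{ \begin{pmatrix} 0 & (iZ', Z') \\ -(iZ', Z')^t & 0 \end{pmatrix} \right\} \oplus \left\{ \begin{pmatrix} Z_1 & 0 \\ 0 & Z_3 \end{pmatrix} : \begin{array}{l} Z_1^t = -Z_1 \\ Z_3^t = -Z_3 \end{array} \right\} \oplus \left\{ \begin{pmatrix} 0 & (Z'', iZ'') \\ -(Z'', iZ'')^t & 0 \end{pmatrix} \right\}.
\]
In particular, we have the identification
\[
\mathbb{C}^n \xrightarrow{\ \cong\ } \mathfrak{p}_+, \qquad Z \mapsto \begin{pmatrix} 0 & (iZ, Z) \\ -(iZ, Z)^t & 0 \end{pmatrix}. \tag{67}
\]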
Using the above identification and the formula (15), Cn becomes a Hermitian positive JTS with respect to the triple product fX; Y; Zg D .X t Z/Y .Z t Y /X .X t Y /Z:
(68)
3.5 Type VI

Let $\mathbb{O}$ be the $\mathbb{R}$-algebra of octonions or Cayley algebra (we refer the reader to [Bae02] for a beautiful introduction to the octonions). Recall that $\mathbb{O}$ is the alternative $\mathbb{R}$-algebra (neither associative nor commutative) of dimension 8 whose underlying vector space is equal to $\mathbb{H} \oplus \mathbb{H}$ and whose multiplication is equal to
\[
(a_1, b_1) \cdot (a_2, b_2) := (a_1 a_2 - b_2 \widetilde{b_1},\ \widetilde{a_1} b_2 + a_2 b_1),
\]
where $\mathbb{H}$ is the division $\mathbb{R}$-algebra of quaternions and $\widetilde{\ }$ denotes its involution
\[
\widetilde{\ } : \mathbb{H} \to \mathbb{H}, \qquad x_0 + ix_1 + jx_2 + kx_3 \mapsto x_0 - ix_1 - jx_2 - kx_3.
\]
The algebra $\mathbb{O}$ is endowed with the unity element $e = (1, 0)$ and with an involutive anti-automorphism
\[
(a, b) \mapsto \widetilde{(a, b)} := (\widetilde{a}, -b).
\]
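The doubling construction above is easy to experiment with. The following sketch represents a quaternion as a 4-tuple and an octonion as a pair of quaternions, implements the multiplication rule $(a_1, b_1) \cdot (a_2, b_2) = (a_1 a_2 - b_2 \widetilde{b_1}, \widetilde{a_1} b_2 + a_2 b_1)$, and checks on integer samples that the sum of squares of the eight real coordinates is multiplicative while associativity fails. All function names and sample elements are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as tuples (x0, x1, x2, x3)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    """Quaternionic involution x0 + i x1 + j x2 + k x3 -> x0 - i x1 - j x2 - k x3."""
    return (p[0], -p[1], -p[2], -p[3])

def qadd(p, q):
    return tuple(s + t for s, t in zip(p, q))

def qsub(p, q):
    return tuple(s - t for s, t in zip(p, q))

def omul(x, y):
    """Octonion product (a1, b1)(a2, b2) = (a1 a2 - b2 conj(b1), conj(a1) b2 + a2 b1)."""
    (a1, b1), (a2, b2) = x, y
    return (qsub(qmul(a1, a2), qmul(b2, qconj(b1))),
            qadd(qmul(qconj(a1), b2), qmul(a2, b1)))

def norm2(x):
    """Sum of squares of the eight real coordinates."""
    return sum(c * c for half in x for c in half)

zero, one = (0, 0, 0, 0), (1, 0, 0, 0)
i, j = (0, 1, 0, 0), (0, 0, 1, 0)

x = ((1, 2, -1, 0), (3, 0, 1, -2))
y = ((0, 1, 1, 2), (-1, 2, 0, 1))
print(norm2(omul(x, y)) == norm2(x) * norm2(y))        # the norm is multiplicative

u, v, w = (i, zero), (j, zero), (zero, one)
print(omul(omul(u, v), w) == omul(u, omul(v, w)))      # associativity fails for this triple
```

Working with integer coordinates keeps every comparison exact, so no floating-point tolerance is needed.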
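The above involution gives rise to a norm
\[
|\cdot|^2 : \mathbb{O} \to \mathbb{R}, \qquad (a, b) \mapsto |(a, b)|^2 := (a, b) \cdot \widetilde{(a, b)} = a\widetilde{a} + b\widetilde{b},
\]
which is a positive definite quadratic form and is multiplicative (i.e. $|(a_1, b_1) \cdot (a_2, b_2)|^2 = |(a_1, b_1)|^2 \cdot |(a_2, b_2)|^2$). Therefore the pair $(\mathbb{O}, |\cdot|^2)$ is a Euclidean composition algebra of dimension 8 and indeed it is the unique such algebra. We will denote by $\langle\,,\rangle$ the bilinear form associated to the quadratic form $|\cdot|^2$, i.e.
\[
\langle x, y \rangle := |x + y|^2 - |x|^2 - |y|^2,
\]
for any $x, y \in \mathbb{O}$.

Let $\mathbb{O}_{\mathbb{C}} := \mathbb{O} \otimes_{\mathbb{R}} \mathbb{C}$ be the complexification of $\mathbb{O}$ (it is called the complex Cayley algebra). The involution $\widetilde{\ }$ and the quadratic form $|\cdot|^2$ on $\mathbb{O}$ extend naturally to $\mathbb{O}_{\mathbb{C}}$ (by a slight abuse of notation, we will continue to denote them by the same symbols). Moreover, $\mathbb{O}_{\mathbb{C}}$ is endowed with a complex conjugation with respect to its real form $\mathbb{O}$:
\[
\lambda \otimes x \mapsto \overline{\lambda \otimes x} := \overline{\lambda} \otimes x,
\]
where $\lambda \in \mathbb{C}$ and $x \in \mathbb{O}$.

Consider the complex vector space $H_3(\mathbb{O}_{\mathbb{C}})$ consisting of Hermitian $3 \times 3$ matrices with entries in $\mathbb{O}_{\mathbb{C}}$
\[
H_3(\mathbb{O}_{\mathbb{C}}) := \{ a \in M_{3,3}(\mathbb{O}_{\mathbb{C}}) : \widetilde{a}^t = a \}
= \left\{ \begin{pmatrix} \alpha_1 & a_3 & \widetilde{a_2} \\ \widetilde{a_3} & \alpha_2 & a_1 \\ a_2 & \widetilde{a_1} & \alpha_3 \end{pmatrix} : \alpha_1, \alpha_2, \alpha_3 \in \mathbb{C};\ a_1, a_2, a_3 \in \mathbb{O}_{\mathbb{C}} \right\}. \tag{69}
\]
The complex vector space $H_3(\mathbb{O}_{\mathbb{C}})$ is endowed with a product (called the Freudenthal product) defined by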
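\[
a \times b := \begin{pmatrix} \alpha_1 & a_3 & \widetilde{a_2} \\ \widetilde{a_3} & \alpha_2 & a_1 \\ a_2 & \widetilde{a_1} & \alpha_3 \end{pmatrix} \times \begin{pmatrix} \beta_1 & b_3 & \widetilde{b_2} \\ \widetilde{b_3} & \beta_2 & b_1 \\ b_2 & \widetilde{b_1} & \beta_3 \end{pmatrix} := \tag{70}
\]
\[
\begin{pmatrix}
\alpha_2\beta_3 + \alpha_3\beta_2 - \langle a_1, b_1 \rangle & a_1 b_2 + b_1 a_2 - \alpha_3\widetilde{b_3} - \beta_3\widetilde{a_3} & \widetilde{b_1}\widetilde{a_3} + \widetilde{a_1}\widetilde{b_3} - \alpha_2 b_2 - \beta_2 a_2 \\
\widetilde{b_2}\widetilde{a_1} + \widetilde{a_2}\widetilde{b_1} - \alpha_3 b_3 - \beta_3 a_3 & \alpha_3\beta_1 + \alpha_1\beta_3 - \langle a_2, b_2 \rangle & a_2 b_3 + b_2 a_3 - \alpha_1\widetilde{b_1} - \beta_1\widetilde{a_1} \\
a_3 b_1 + b_3 a_1 - \alpha_2\widetilde{b_2} - \beta_2\widetilde{a_2} & \widetilde{b_3}\widetilde{a_2} + \widetilde{a_3}\widetilde{b_2} - \alpha_1 b_1 - \beta_1 a_1 & \alpha_1\beta_2 + \alpha_2\beta_1 - \langle a_3, b_3 \rangle
\end{pmatrix}.
\]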
Moreover, $H_3(\mathbb{O}_{\mathbb{C}})$ is endowed with a positive definite Hermitian form defined by
\[
(a|b) := \sum_{i=1}^{3} \alpha_i \overline{\beta_i} + \sum_{j=1}^{3} \langle a_j, \overline{b_j} \rangle, \tag{71}
\]
where $a, b \in H_3(\mathbb{O}_{\mathbb{C}})$ are written as in (70). Using the Freudenthal product and the above positive definite Hermitian form, we can define a Jordan triple product on $H_3(\mathbb{O}_{\mathbb{C}})$ via
\[
\{a, b, c\} := (a|b)c + (c|b)a - (a \times c) \times \overline{b}, \tag{72}
\]
where $\overline{b}$ is the element of $H_3(\mathbb{O}_{\mathbb{C}})$ obtained by conjugating all the entries of $b$ with respect to the complex conjugation of $\mathbb{O}_{\mathbb{C}}$. The pair $(H_3(\mathbb{O}_{\mathbb{C}}), \{\cdot,\cdot,\cdot\})$ is an irreducible Hermitian positive JTS of dimension 27 (see [Roo08, Sec. 2.2]), sometimes called the exceptional Hermitian positive JTS of dimension 27 or the Hermitian positive JTS of type VI.

From the above explicit description of the Hermitian positive JTS $(H_3(\mathbb{O}_{\mathbb{C}}), \{\cdot,\cdot,\cdot\})$ and formula (17), we can deduce an explicit expression of the associated bounded symmetric domain in its Harish–Chandra embedding. In order to do that, we need to introduce the determinant and the adjoint of an element of $H_3(\mathbb{O}_{\mathbb{C}})$. The determinant is defined by
\[
\det : H_3(\mathbb{O}_{\mathbb{C}}) \to \mathbb{O}_{\mathbb{C}}, \qquad
a \mapsto \frac{1}{3!}(a \times a | a) = \alpha_1\alpha_2\alpha_3 - \sum_{i=1}^{3} \alpha_i |a_i|^2 + a_1(a_2 a_3) + (\widetilde{a_3}\,\widetilde{a_2})\widetilde{a_1}, \tag{73}
\]
where $a \in H_3(\mathbb{O}_{\mathbb{C}})$ is written as in (70). The adjoint of $a \in H_3(\mathbb{O}_{\mathbb{C}})$ is defined by
\[
a^{\sharp} := \frac{a \times a}{2}. \tag{74}
\]
The relation between the determinant and the adjoint is given by the following formulas (see [Roo08, Sec. 2.1])
\[
\begin{cases} (x^{\sharp} | x) = 3\det(x), \\ (x^{\sharp})^{\sharp} = \det(x)\,x. \end{cases} \tag{75}
\]
The bounded symmetric domain of type VI in its Harish–Chandra embedding is given by (see [Roo08, Sec. 3.1])
\[
D_{VI} := \left\{ a \in H_3(\mathbb{O}_{\mathbb{C}}) :
\begin{array}{l}
1 - (a|a) + (a^{\sharp}|a^{\sharp}) - |\det(a)|^2 > 0 \\
3 - 2(a|a) + (a^{\sharp}|a^{\sharp}) > 0 \\
3 - (a|a) > 0
\end{array} \right\} \subset H_3(\mathbb{O}_{\mathbb{C}}).
\]
ˆ > ˆ 2 2 ˆ > ˆ 1 hxi ; xi i C .jxi j / C hx2 x3 ; x2 x3 i > 0> > ˆ ! > ˆ = < i D1 i D1 x1 2 O2C : DV WD x D 2 OC W > ˆ x2 2 > ˆ X > ˆ > ˆ > ˆ hxi ; xi i > 0 2 > ˆ ; : i D1
(80)
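The cominuscule homogeneous variety of type V is the Cayley plane
\[
\mathbb{P}^2_{\mathbb{O}} := \{ [a] \in \mathbb{P}(H_3(\mathbb{O}_{\mathbb{C}})) : a^{\sharp} = 0 \}
= \left\{ [a] \in \mathbb{P}(H_3(\mathbb{O}_{\mathbb{C}})) :
\begin{array}{lll}
\alpha_2\alpha_3 = |a_1|^2, & \alpha_3\alpha_1 = |a_2|^2, & \alpha_1\alpha_2 = |a_3|^2 \\
a_1 a_2 = \alpha_3\widetilde{a_3}, & a_2 a_3 = \alpha_1\widetilde{a_1}, & a_3 a_1 = \alpha_2\widetilde{a_2}
\end{array} \right\}, \tag{81}
\]
where $a \in H_3(\mathbb{O}_{\mathbb{C}})$ is written as in (69). The Cayley plane $\mathbb{P}^2_{\mathbb{O}}$ is homogeneous with respect to the natural action of the subgroup $\mathrm{SL}_3(\mathbb{O}_{\mathbb{C}}) \subset \mathrm{GL}(H_3(\mathbb{O}_{\mathbb{C}})) = \mathrm{GL}_{27}(\mathbb{C})$ consisting of the elements preserving the determinant (73) (see [LM03, Sec. 6.2]). The group $\mathrm{SL}_3(\mathbb{O}_{\mathbb{C}})$ is a complex simple Lie group of type $E_6$. Moreover, the stabilizer of any element is isomorphic to the maximal parabolic subgroup $P_6$ corresponding to the 6-th simple root of the diagram $E_6$ (which is a cominuscule simple root, see Table 5). Therefore, we obtain the following explicit presentation of $\mathbb{P}^2_{\mathbb{O}}$ as a cominuscule homogeneous variety (as in Definition 2.33)
\[
\mathbb{P}^2_{\mathbb{O}} \cong \mathrm{SL}_3(\mathbb{O}_{\mathbb{C}})/P_6. \tag{82}
\]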
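The Borel embedding of $D_V$ into $\mathbb{P}^2_{\mathbb{O}}$ is given by
\[
j : D_V \subset \mathbb{O}_{\mathbb{C}}^2 \hookrightarrow \mathbb{P}^2_{\mathbb{O}}, \qquad
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \left[ \begin{pmatrix} 1 & x_2 & \widetilde{x_1} \\ \widetilde{x_2} & |x_2|^2 & \widetilde{x_2}\,\widetilde{x_1} \\ x_1 & x_1 x_2 & |x_1|^2 \end{pmatrix} \right]. \tag{83}
\]
Indeed, it is easily checked that $j(\mathbb{O}_{\mathbb{C}}^2)$ is the Zariski open subset of $\mathbb{P}^2_{\mathbb{O}}$ consisting of all the matrices $[a] \in \mathbb{P}^2_{\mathbb{O}}$ whose $(1,1)$-entry is non-zero.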
4 Boundary Components

The aim of this section is to define and study the boundary components of a Hermitian symmetric manifold of non-compact type, or, equivalently, of a bounded symmetric domain, see Sect. 2.5.

Let $D \overset{i_{HC}}{\hookrightarrow} \mathbb{C}^N$ be a bounded symmetric domain in its Harish–Chandra embedding and let $D \overset{i_{HC}}{\hookrightarrow} \mathbb{C}^N \overset{j}{\hookrightarrow} D^c$ be the Borel embedding into the compact dual $D^c$ of $D$ (see Theorem 2.22). Denote by $\overline{D}$ the closure of $D$ inside $\mathbb{C}^N$ with respect to the Euclidean topology.
Definition 4.1. Consider the following equivalence relation on $\overline{D}$: $p \sim q$ if and only if there exist holomorphic maps $\lambda_1, \dots, \lambda_m : \Delta := \{ z \in \mathbb{C} : |z| < 1 \} \to \overline{D}$ (for some $m \in \mathbb{N}$) such that
• $\lambda_1(0) = p$ and $\lambda_m(0) = q$;
• $\mathrm{Im}\,\lambda_i \cap \mathrm{Im}\,\lambda_{i+1} \ne \emptyset$ for any $1 \le i \le m - 1$.
A boundary component $F$ of $D$ is an equivalence class for the above equivalence relation on $\overline{D}$. Given two boundary components $F_1$ and $F_2$ of $D$, we say that $F_1$ dominates $F_2$ (and we write $F_2 \le F_1$) if $F_2 \subseteq \overline{F_1}$.

In other words, two points $p$ and $q$ of $\overline{D}$ belong to the same boundary component if they can be connected by a finite chain of holomorphic disks contained in $\overline{D}$. Note that $D$ is always a boundary component of itself and that for every boundary component $F$ of $D$ it holds that $F \le D$.

Theorem 4.2. Let $D \subset \mathbb{C}^N$ be a bounded symmetric domain in its Harish–Chandra embedding. Then
(i) $\overline{D} = \coprod_{F \le D} F$ and $G = \mathrm{Hol}(D)^o$ preserves this decomposition.
(ii) Let $F \le D$ and denote by $\langle F \rangle$ the smallest linear subspace of $\mathbb{C}^N$ containing $F$, by $\overline{F}$ the Euclidean closure of $F$ inside $\mathbb{C}^N$ (or equivalently inside $\langle F \rangle$) and by $F^c$ the Zariski closure of $F$ inside $D^c$. Then $F$ is a Hermitian symmetric manifold of non-compact type such that
• $F \subset \langle F \rangle$ is the Harish–Chandra embedding of $F$;
• $F \subset \langle F \rangle \subset F^c$ is the Borel embedding of $F$.
Moreover the following diagram of inclusions is Cartesian
D
F
hF i
D
CN
Fc
(84)
Dc
(iii) If $F \le D$ and $F' \le F$ then $F' \le D$.
(iv) If $D = D_1 \times \dots \times D_r$ is the decomposition of $D$ into irreducible bounded symmetric domains, then the boundary components of $D$ are the products of the boundary components of the $D_i$'s.

Proof. See [AMRT10, Chap. III, Thm. 3.3]. □
Remark 4.3. It has been proved by Bott–Korányi (see [KW65, §3]) that the union of the zero-dimensional boundary components of a bounded symmetric domain $D \subset \mathbb{C}^N$ is the Bergman–Silov boundary of $D$, i.e. the smallest closed subset of the boundary $\partial D := \overline{D} \setminus D$ on which the absolute value of any function continuous on $\overline{D}$ and holomorphic on $D$ achieves its maximum.
4.1 The Normalizer Subgroup of a Boundary Component

The aim of this subsection is to study the normalizer subgroup of a boundary component $F$ of $D$.

Definition 4.4. The normalizer subgroup of a boundary component $F \le D$ is the subgroup
\[
N(F) := \{ g \in G = \mathrm{Hol}(D)^o : gF = F \} \subseteq G.
\]
We can classify the boundary components of $D$ in terms of their normalizer subgroups.

Theorem 4.5. Let $D = D_1 \times \dots \times D_s$ be the decomposition of $D$ into its irreducible bounded symmetric domains and let $\mathrm{Hol}(D)^o = G = G_1 \times \dots \times G_s = \mathrm{Hol}(D_1)^o \times \dots \times \mathrm{Hol}(D_s)^o$ be the associated decomposition of $G$ into its simple factors. Then there is a bijection
\[
\{ \text{Boundary components } F \le D \} \xrightarrow{\ \cong\ } \left\{ \begin{array}{l} \text{Subgroups } P_1 \times \dots \times P_s \subseteq G_1 \times \dots \times G_s \text{ such that} \\ P_i = G_i \text{ or } P_i \text{ is a maximal parabolic subgroup of } G_i \end{array} \right\}, \qquad F \mapsto N(F).
\]
Proof. See [AMRT10, Chap. III, Prop. 3.9]. □
We now take a closer look at the structure of the normalizer subgroup associated to a boundary component of $D$. We will need the following technical result.

Lemma 4.6. Let $F$ be a boundary component of $D$ and fix a base point $o \in D$. Then there exists a unique pair
\[ f_F : \Delta \to D, \qquad \phi_F : S^1 \times \mathrm{SL}_2(\mathbb{R}) \to G = \mathrm{Hol}(D)^o, \tag{85} \]
such that
(i) $f_F$ is a symmetric morphism (in the sense of Remark 2.17) such that $f_F(0) = o \in D$ and $o_F := f_F(1) := \lim_{z \to 1} f_F(z) \in F$;
(ii) $\phi_F$ is a morphism of Lie groups such that
\[ h_o(e^{i\theta}) := \phi_F\left( e^{i\theta}, \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \right) \]
belongs to $K = \mathrm{Stab}(o) \subset G$ and it acts on $T_oD$ as multiplication by $e^{2i\theta}$;
(iii) $f_F$ is equivariant with respect to the morphism $\phi_F$ and the natural actions of $G$ on $D$ and of $S^1 \times \mathrm{SL}_2(\mathbb{R})$ on $\Delta$ via
\[
S^1 \times \mathrm{SL}_2(\mathbb{R}) \times \Delta \to \Delta, \qquad \left( \left( e^{i\theta}, \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right), z \right) \mapsto \frac{[i(a+d) - (b-c)]\, z + [i(a-d) + (b+c)]}{[i(a-d) - (b+c)]\, z + [i(a+d) + (b-c)]}.
\]

Proof. See [AMRT10, Chap. III, Thm. 3.3(v) and Thm. 3.7]. □
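The formula of Lemma 4.6(iii) can be sanity-checked numerically. The following is a minimal sketch, assuming that the Cayley isomorphism is normalised as $\tau \mapsto (\tau - i)/(\tau + i)$ (the text's formula (11) is not reproduced here, so this normalisation is an assumption) and that $\mathrm{SL}_2(\mathbb{R})$ acts on the upper half-plane by Möbius transformations. The function names are illustrative only. Under these assumptions the disk action is the Cayley conjugate of the Möbius action, and the rotation matrix of Lemma 4.6(ii) acts as multiplication by $e^{2i\theta}$.

```python
import cmath
import math

def cayley(tau):
    """Cayley transform from the upper half-plane to the unit disk (assumed normalisation)."""
    return (tau - 1j) / (tau + 1j)

def cayley_inv(z):
    """Inverse Cayley transform from the unit disk to the upper half-plane."""
    return 1j * (1 + z) / (1 - z)

def mobius(g, tau):
    """Moebius action of g = (a, b, c, d) in SL_2(R) on the upper half-plane."""
    a, b, c, d = g
    return (a * tau + b) / (c * tau + d)

def disk_action(g, z):
    """Action on the unit disk given by the formula of Lemma 4.6(iii)."""
    a, b, c, d = g
    num = (1j * (a + d) - (b - c)) * z + (1j * (a - d) + (b + c))
    den = (1j * (a - d) - (b + c)) * z + (1j * (a + d) + (b - c))
    return num / den

g = (2.0, 1.0, 3.0, 2.0)   # determinant 2*2 - 1*3 = 1
z = 0.3 - 0.4j             # a point of the unit disk

# The disk action agrees with the Cayley conjugate of the Moebius action.
assert abs(disk_action(g, z) - cayley(mobius(g, cayley_inv(z)))) < 1e-12
assert abs(disk_action(g, z)) < 1

# The rotation matrix of Lemma 4.6(ii) acts as multiplication by exp(2*i*theta).
theta = 0.7
rot = (math.cos(theta), math.sin(theta), -math.sin(theta), math.cos(theta))
assert abs(disk_action(rot, z) - cmath.exp(2j * theta) * z) < 1e-12
```

The last check is the disk-level counterpart of the statement that $h_o(e^{i\theta})$ acts on the tangent space at the base point as multiplication by $e^{2i\theta}$.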
Remark 4.7.
(i) Since $f_F : \Delta \to D$ is a symmetric morphism, it extends uniquely to a morphism between their Harish–Chandra and Borel embeddings (see [AMRT10, Chap. III, Sec. 2.2])
\[
\begin{array}{ccccc}
\Delta & \subset & \mathbb{C} & \subset & \Delta^c = \mathbb{P}^1 \\
\downarrow f_F & & \downarrow \overline{f_F} & & \downarrow f_F^c \\
D & \subset & \mathbb{C}^N & \subset & D^c
\end{array}
\]
In particular $f_F(1) = \overline{f_F}(1) \in \overline{D}$.
(ii) For any non-Euclidean Hermitian symmetric domain $M$ with a fixed base point $o$ there exists a unique morphism $u_o : S^1 \to G = \mathrm{Aut}(M)^o$ such that $\operatorname{Im} u_o \subseteq K = \mathrm{Stab}(o) \subset G$ and $u_o(e^{i\theta})$ induces the multiplication by $e^{i\theta}$ on $T_oM$ (see [AMRT10, Chap. III, Sec. 2.1]). Therefore, part (ii) of Lemma 4.6 is equivalent to saying that $h_o^2 = u_o$.
(iii) The action of $\mathrm{SL}_2(\mathbb{R})$ on $\Delta$ in part (iii) of Lemma 4.6 is equivalent to the action of $\mathrm{SL}_2(\mathbb{R})$ on the upper half space $\mathbb{H}$ via Möbius transformations (see Example 2.6(2)) using the Cayley isomorphism $\mathbb{H} \cong \Delta$ of (11).

The connected component of the normalizer subgroup of a boundary component of $D$ admits the following decomposition, known as the 5-term decomposition.

Theorem 4.8. Let $F$ be a boundary component of $D$ and let $N(F)$ be its associated normalizer subgroup. Consider the one-parameter subgroup of $G = \mathrm{Hol}(D)^o$
\[ w_F : \mathbb{G}_m \to G, \qquad t \mapsto \phi_F\left( 1, \begin{pmatrix} t & 0 \\ 0 & t^{-1} \end{pmatrix} \right). \tag{86} \]
(i) The normalizer subgroup $N(F)$ of $F$ is equal to
\[ N(F) = \{ g \in G : \lim_{t \to 0} w_F(t) \cdot g \cdot w_F(t)^{-1} \text{ exists} \}. \]
(ii) The connected component $N(F)^o$ of $N(F)$ is equal to the semidirect product
\[ N(F)^o = Z(w_F)^o \ltimes W(F), \]
where:
• $W(F)$ is the unipotent radical of $N(F)^o$ and it is equal to
\[ W(F) := \{ g \in G : \lim_{t \to 0} w_F(t) \cdot g \cdot w_F(t)^{-1} = 1 \in G \}; \]
• $Z(w_F)^o$ is a Levi subgroup of $N(F)^o$ and it is the connected component of the centralizer $Z(w_F)$ of $w_F$, i.e.
\[ Z(w_F) := \{ g \in G : w_F(t) \cdot g \cdot w_F(t)^{-1} = g \text{ for any } t \in \mathbb{G}_m \}. \]
(iii) $W(F)$ is a two-step unipotent group which is given as an extension of two abelian groups
\[ 0 \to U(F) \to W(F) \to V(F) \to 0, \]
where $U(F)$ is the center of $W(F)$ and $V(F) := W(F)/U(F)$.
(iv) $Z(w_F)^o$ is a reductive group which is equal to the product modulo finite subgroups
\[ Z(w_F)^o = G_h(F) \cdot G_l(F) \cdot M(F), \]
where
• $M(F)$ is compact and semisimple;
• $G_h(F)$ is semisimple and it satisfies $G_h(F)/Z_{G_h(F)} = \mathrm{Aut}(F)^o$;
• $G_l(F)$ is reductive without compact factors.

Proof. See [AMRT10, Chap. III, Thm. 3.7, Thm. 3.10, §4.1]. □
4.2 The Decomposition of D Along a Boundary Component

The aim of this subsection is to deduce from the 5-term decomposition of the normalizer $N(F)$ of a boundary component $F$ of $D$ (see Theorem 4.8) a decomposition of $D$ along $F$.

Proposition 4.9. Let $D$ be a bounded symmetric domain and fix a boundary component $F \le D$. Then $N(F)^o$ acts transitively on $D$.

Proof. See [AMRT10, Chap. III, §4.3]. □
Recall that, fixing a base point $o \in D$, $D$ is diffeomorphic to $G/K$ where $G = \mathrm{Hol}(D)^o$ and $K = \mathrm{Stab}(o) \subset G$ is a maximal compact subgroup (see Theorem 2.11). Using this and Proposition 4.9, we have a diffeomorphism
\[ D \cong N(F)^o / (K \cap N(F)^o). \tag{87} \]
From the 5-term decomposition of $N(F)^o$ (see Theorem 4.8), it follows that
\[ K \cap N(F)^o = K \cap Z(w_F) = K_l(F) \cdot K_h(F) \cdot M(F) \subset G_l(F) \cdot G_h(F) \cdot M(F) = Z(w_F), \tag{88} \]
where $K_l(F) \subset G_l(F)$ and $K_h(F) \subset G_h(F)$ are maximal compact subgroups. Substituting (88) into (87), we get the diffeomorphism
\[
D \xrightarrow{\ \cong\ } \frac{N(F)^o}{K \cap N(F)^o} = \frac{[G_h(F) \cdot G_l(F) \cdot M(F)] \ltimes W(F)}{K_h(F) \cdot K_l(F) \cdot M(F)} = \frac{G_h(F)}{K_h(F)} \times \frac{G_l(F)}{K_l(F)} \times W(F). \tag{89}
\]
In order to get a better description of the above diffeomorphism, we will now describe more geometrically the right hand side of (89).

Theorem 4.10. Notations as above.
(i) Under the natural action of $G_h(F) \subset G$ on $\overline{D} \subset \mathbb{C}^N$, the orbit of the point $o_F := f_F(1)$ is equal to $F \subset \overline{D}$ and its stabilizer is equal to $K_h(F)$. Therefore, we have a diffeomorphism
\[ \frac{G_h(F)}{K_h(F)} \cong F. \tag{90} \]
(ii) Under the action of $G_l(F) \subset N(F)^o$ on $U(F)$ by conjugation, the orbit of the point $\Omega_F := \phi_F\left( 1, \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \right) \in U(F)$ is an open cone $C(F) \subset U(F)$ and its stabilizer is equal to $K_l(F)$. Therefore, we have a diffeomorphism
\[ \frac{G_l(F)}{K_l(F)} \cong C(F). \tag{91} \]

Proof. Part (i) follows from [AMRT10, Chap. III, Thm. 3.10 and Lemma 4.6]. Part (ii) follows from [AMRT10, Chap. III, Thm. 4.1]. □

Remark 4.11. Let $o \in D \subset \mathfrak{p}_+$ be a bounded symmetric domain (with a fixed base point $o$) in its Harish–Chandra embedding (see Theorem 2.22). Consider the Jordan triple product $\{\cdot,\cdot,\cdot\}$ of (15) with respect to which $(\mathfrak{p}_+, \{\cdot,\cdot,\cdot\})$ is a Hermitian positive JTS (see Sect. 2.7). Then there is a bijection (see [Sat80, Chap. III, §8, Rmk. 2])
\[ \{\text{Boundary components of } D\} \xrightarrow{\ \cong\ } \{\text{Tripotents of } (\mathfrak{p}_+, \{\cdot,\cdot,\cdot\})\}, \qquad F \mapsto o_F, \tag{92} \]
where a tripotent (or idempotent) of $(\mathfrak{p}_+, \{\cdot,\cdot,\cdot\})$ is an element $e \in \mathfrak{p}_+$ such that $\{e, e, e\} = e$. For a detailed study of the tripotents of a Jordan triple system, we refer the reader to [FKKLR00, Chap. V, Part V].

Using Theorem 4.10, the diffeomorphism (89) can be written as
\[ D \xrightarrow{\ \cong\ } F \times C(F) \times W(F), \qquad x \mapsto (\pi_F(x), \Phi_F(x), w(x)). \tag{93} \]
The smooth maps $w$, $\pi_F$ and $\Phi_F$ can be described explicitly as follows. By (87) and (88), we can write $x \in D$ as
\[ x = g_h \cdot g_l \cdot w \cdot o, \]
for some unique $w \in W(F)$ and for some $g_h \in G_h(F)$ (resp. $g_l \in G_l(F)$) which is unique up to multiplication by an element of $K_h(F)$ (resp. $K_l(F)$). Then we have that (see [AMRT10, Chap. III, §4.3]):
\[
\begin{cases}
w(x) := w \in W(F), \\
\pi_F(x) := g_h \cdot o_F \in F, \\
\Phi_F(x) := g_l \cdot \Omega_F \in C(F).
\end{cases}
\tag{94}
\]

The maps $\pi_F$ and $\Phi_F$ are closely related, as we are now going to explain. Recall that $D$ embeds as an open subset in its compact dual $D^c$ via the Borel embedding and that $D^c$ is a homogeneous projective variety with respect to the action of the complexification $G_{\mathbb{C}}$ of $G = \mathrm{Hol}(D)^o$ (see Theorem 2.22). We will denote by $U(F)_{\mathbb{C}} \subset G_{\mathbb{C}}$ the complexification of $U(F) \subset N(F)^o \subset G$.

Definition 4.12. Notations as above. Denote by $D(F)$ the analytic open subset of $D^c$:
\[ D \subset D(F) := U(F)_{\mathbb{C}} \cdot D = \bigcup_{g \in U(F)_{\mathbb{C}}} g \cdot D \subset D^c. \]

The open subset $D(F)$ admits the following Lie-theoretic description.
Lemma 4.13. We have a diffeomorphism
\[ D(F) \cong \frac{N(F)^o \cdot U(F)_{\mathbb{C}}}{K_h(F) \cdot G_l(F) \cdot M(F)}, \]
where $N(F)^o \cdot U(F)_{\mathbb{C}}$ is the subgroup of $G_{\mathbb{C}}$ generated by $N(F)^o$ and $U(F)_{\mathbb{C}}$.

Proof. The group $N(F)^o \cdot U(F)_{\mathbb{C}}$ acts transitively on $D(F)$ by Proposition 4.9 and Definition 4.12. The stabilizer of the point $o_F^{o} := s_o(o_F) = f_F(-1) \in D(F)$ is equal to $K_h(F) \cdot G_l(F) \cdot M(F)$ by Ash et al. [AMRT10, Chap. III, Lemma 4.6]. Hence, we get the desired diffeomorphism. □

Using the diffeomorphism in Lemma 4.13, we can define two smooth and surjective maps
\[
\begin{cases}
\widetilde{\pi}_F : D(F) \cong \dfrac{N(F)^o \cdot U(F)_{\mathbb{C}}}{K_h(F) \cdot G_l(F) \cdot M(F)} \longrightarrow \dfrac{G_h(F)}{K_h(F)} \cong F, \\[2ex]
\widetilde{\Phi}_F : D(F) \cong \dfrac{N(F)^o \cdot U(F)_{\mathbb{C}}}{K_h(F) \cdot G_l(F) \cdot M(F)} \longrightarrow \dfrac{N(F)^o \cdot U(F)_{\mathbb{C}}}{N(F)^o} \cong U(F),
\end{cases}
\tag{95}
\]
where the last diffeomorphism is obtained by projecting onto $i\,U(F) \subset U(F)_{\mathbb{C}}$.

Theorem 4.14. Notations as above.
(i) The smooth maps $\widetilde{\pi}_F$ and $\widetilde{\Phi}_F$ fit into the following commutative diagram
\[
\begin{array}{ccc}
C(F) & \subset & U(F) \\
\uparrow \Phi_F & & \uparrow \widetilde{\Phi}_F \\
D & \subset & D(F) \\
\downarrow \pi_F & & \downarrow \widetilde{\pi}_F \\
F & = & F
\end{array}
\]
where the upper square is Cartesian.
(ii) The smooth map $\widetilde{\pi}_F$ factors as
\[ D(F) \xrightarrow{\ \pi_F'\ } D(F)' := D(F)/U(F)_{\mathbb{C}} \xrightarrow{\ p_F\ } F = D(F)'/V(F), \]
in such a way that $\pi_F'$ is a trivial holomorphic $U(F)_{\mathbb{C}}$-torsor and $p_F$ is a smooth $V(F)$-torsor and, at the same time, a trivial complex vector bundle. In particular, we have a diffeomorphism
\[ D(F) \cong U(F)_{\mathbb{C}} \times \mathbb{C}^k \times F, \tag{96} \]
for some $k \in \mathbb{N}$.
(iii) Under the diffeomorphism (96), the smooth map $\widetilde{\Phi}_F$ can be written as
\[ \widetilde{\Phi}_F : D(F) \cong U(F)_{\mathbb{C}} \times \mathbb{C}^k \times F \to U(F), \qquad (x, y, z) \mapsto \operatorname{Im} x - h_z(y, y), \]
for some bilinear symmetric form $h_z : \mathbb{C}^k \times \mathbb{C}^k \to U(F)$ varying smoothly with $z \in F$.

Proof. See [AMRT10, Chap. III, §4.3]. □

From the above theorem, we deduce the following presentation of $D$ as a Siegel domain of the third kind.

Corollary 4.15. Notations as above. We have a diffeomorphism
\[ D \cong \{ (x, y, z) \in U(F)_{\mathbb{C}} \times \mathbb{C}^k \times F : \operatorname{Im} x - h_z(y, y) \in C(F) \}. \]
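As a sanity check in the simplest case, the following worked example (an illustration added here, not taken from [AMRT10]) treats $D = \Delta$ with the zero-dimensional boundary component $F = \{1\}$. It assumes that $D$ is identified with the upper half-plane $\mathbb{H}$ via the Cayley isomorphism, so that $F$ corresponds to the cusp $\infty$, and that $G$ is replaced by its double cover $\mathrm{SL}_2(\mathbb{R})$ with $\phi_F$ restricted to the second factor being the natural map. Under these assumptions the 5-term decomposition and Corollary 4.15 reduce to the following.

```latex
\begin{align*}
w_F(t) \begin{pmatrix} a & b \\ c & d \end{pmatrix} w_F(t)^{-1}
  &= \begin{pmatrix} a & t^{2} b \\ t^{-2} c & d \end{pmatrix},
  \qquad w_F(t) = \begin{pmatrix} t & 0 \\ 0 & t^{-1} \end{pmatrix}, \\
N(F) &= \left\{ \begin{pmatrix} a & b \\ 0 & a^{-1} \end{pmatrix} \right\}
  \quad \text{(the limit for } t \to 0 \text{ exists iff } c = 0\text{)}, \\
W(F) = U(F) &= \left\{ \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} : b \in \mathbb{R} \right\} \cong \mathbb{R},
  \qquad V(F) = 0, \qquad k = 0, \\
C(F) &= \left\{ s^{2} \cdot 1 : s > 0 \right\} = \mathbb{R}_{>0}
  \quad \text{(orbit of } \Omega_F \leftrightarrow b = 1 \text{ under conjugation by } \mathrm{diag}(s, s^{-1})\text{)}, \\
D &\cong \{ x \in U(F)_{\mathbb{C}} = \mathbb{C} : \operatorname{Im} x \in \mathbb{R}_{>0} \} = \mathbb{H}.
\end{align*}
```

In this case the Siegel domain of the third kind is simply the upper half-plane, i.e. a tube domain over the half-line $\mathbb{R}_{>0}$.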
4.3 Symmetric Cones and Euclidean Jordan Algebras

The cone $C(F)$ associated to a boundary component $F \le D$ (see Theorem 4.10(ii)) belongs to a special class of cones, namely the symmetric cones, that we now introduce.

Definition 4.16. Let $V$ be a real (finite-dimensional) vector space endowed with a scalar product $\langle \cdot, \cdot \rangle$ (i.e. a Euclidean space). An open (pointed and convex) cone $C \subset V$ is said to be:
(i) homogeneous if the group of automorphisms of $C$
\[ G(C) := \{ g \in \mathrm{GL}(V) : g \cdot C = C \} \subseteq \mathrm{GL}(V) \]
acts transitively on $C$;
(ii) symmetric if it is homogeneous and self-dual, i.e. $C$ is equal to its dual cone
\[ C^* = \{ x \in V : \langle x, y \rangle > 0 \text{ for any } y \in \overline{C} \setminus \{0\} \}. \]

Some basic properties of homogeneous and symmetric cones are contained in the following

Theorem 4.17. Let $C \subset (V, \langle \cdot, \cdot \rangle)$ be a homogeneous cone and fix a base point $o \in C$.
(i) The stabilizer subgroup of $o$
\[ G(C)_o := \{ g \in G(C) : g(o) = o \} \subseteq G(C) \]
is a maximal compact subgroup of $G(C)$ and, conversely, every maximal compact subgroup of $G(C)$ is the stabilizer subgroup of some point of $C$.
(ii) We have a diffeomorphism
\[ \frac{G(C)}{G(C)_o} \cong C. \]
(iii) $C$ is symmetric if and only if $G(C)$ is equal to its dual group
\[ G(C)^* = \{ g^* : g \in G(C) \}, \]
where $g^*$ denotes the adjoint of the element $g \in \mathrm{GL}(V)$ with respect to the scalar product $\langle \cdot, \cdot \rangle$. In particular, in this case, $G(C)$ is a reductive Lie group.

Proof. For part (i), see [Sat80, Chap. I, Prop. 8.4]. For part (ii), see [FK94, Chap. I, §4]. For part (iii), see [Sat80, Chap. I, Lemma 8.3]. □

Remark 4.18. It can be shown that symmetric cones are Riemannian symmetric manifolds in the sense of Remark 2.5(iv); see [FK94, Chap. I, §4] for a proof.

It turns out (see [FK94, Prop. III.4.5]) that any symmetric cone decomposes uniquely as the product of irreducible symmetric cones, defined as follows.

Definition 4.19. A symmetric cone $\Omega \subset V$ is said to be irreducible if and only if there does not exist a non-trivial decomposition $V = V_1 \oplus V_2$ and two symmetric cones $\Omega_1 \subset V_1$ and $\Omega_2 \subset V_2$ such that $\Omega = \Omega_1 + \Omega_2$ (in this case, we say that $\Omega$ is the product of $\Omega_1$ and $\Omega_2$).

Symmetric cones can be classified via Euclidean Jordan algebras, which we now introduce.

Definition 4.20. A Jordan algebra over a field $F$ is a (finite-dimensional) algebra $(A, \circ)$ over $F$ such that
(J1) $x \circ y = y \circ x$ for any $x, y \in A$;
(J2) $T_x$ and $T_{x \circ x}$ commute, where for any $x \in A$ we denote by $T_x$ the endomorphism of $A$ given by
\[ T_x : A \to A, \qquad y \mapsto T_x(y) := x \circ y. \]
A Jordan $F$-algebra $(A, \circ)$ is said to be
(i) semisimple if the trace form
\[ \tau : A \times A \to F, \qquad (x, y) \mapsto \tau(x, y) := \operatorname{tr}(T_{x \circ y}) \tag{97} \]
is non-degenerate;
(ii) simple if $\tau$ is not identically zero and $A$ does not contain proper ideals, i.e. proper vector subspaces $I \subset A$ such that for any $x \in I$ and $y \in A$ we have that $x \circ y \in I$;
(iii) Euclidean if $F = \mathbb{R}$ and the trace form $\tau$ is positive definite.

The following properties of Jordan algebras follow quite easily from the axioms (J1) and (J2).

Lemma 4.21. Let $(A, \circ)$ be a Jordan $F$-algebra. Then
(i) $(A, \circ)$ is power-associative, i.e. if we define inductively $x^p := x \circ x^{p-1}$ (for any $p \in \mathbb{Z}_{>0}$) then we have that $x^p \circ x^q = x^{p+q}$ for any $p, q \in \mathbb{Z}_{>0}$.
(ii) For any $x \in A$ and any $p, q \in \mathbb{Z}_{>0}$ the endomorphisms $T_{x^p}$ and $T_{x^q}$ commute.
(iii) The trace form $\tau$ is associative, i.e.
\[ \tau(x \circ y, z) = \tau(x, y \circ z) \quad \text{for any } x, y, z \in A. \]
In particular, if $(A, \circ)$ is semisimple then, for any $y \in A$, the endomorphism $T_y$ is self-adjoint with respect to $\tau$.
(iv) If $(A, \circ)$ is semisimple then $(A, \circ)$ has a unique unit element $e \in A$, i.e. an element $e \in A$ such that $e \circ x = x$ for any $x \in A$.

Proof. For (i) and (ii), see [FK94, Prop. II.1.2]. For (iii), see [FK94, Prop. II.4.3]. Part (iv): since $\tau$ is non-degenerate, there exists a unique $e \in A$ such that $\tau(e, x) = \operatorname{tr} T_x$ for any $x \in A$. Using the associativity of $\tau$, we get (for any $x, y \in A$)
\[ \tau(x, e \circ y) = \tau(x \circ y, e) = \operatorname{tr} T_{x \circ y} = \tau(x, y), \]
which implies (again by the non-degeneracy of $\tau$) that $e \circ y = y$, q.e.d. □
Example 4.22.
(1) Let $(A, \cdot)$ be an associative $F$-algebra. Then $A$ becomes a Jordan algebra with respect to the Jordan product
\[ x \circ y := \tfrac{1}{2}(x \cdot y + y \cdot x). \]
(2) Let $W$ be an $F$-vector space and let $B$ be a symmetric bilinear form on $W \times W$. Then $A = F \oplus W$ becomes a Jordan algebra with respect to the Jordan product
\[ (\lambda, u) \circ_B (\mu, v) := (\lambda\mu + B(u, v),\ \lambda v + \mu u). \tag{98} \]
It is easily checked that the Jordan algebra $(F \oplus W, \circ_B)$ is semisimple if and only if $B$ is non-degenerate and that it is Euclidean if and only if $F = \mathbb{R}$ and $B$ is positive definite.
(3) Let $\mathbb{D}$ be equal to $\mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$ and denote by $x \mapsto \overline{x}$ the natural involution. For any $n \ge 2$, the real vector space of Hermitian matrices of order $n$ with entries in $\mathbb{D}$
\[ \mathrm{Herm}_n(\mathbb{D}) := \{ M \in M_{n,n}(\mathbb{D}) : \overline{M}^{\,t} = M \} \]
becomes a Euclidean Jordan algebra with respect to the Jordan product (see [FK94, Chap. V, §2])
\[ M_1 \circ M_2 = \tfrac{1}{2}(M_1 M_2 + M_2 M_1). \tag{99} \]
If $\mathbb{D}$ is equal to the algebra of octonions $\mathbb{O}$, then $\mathrm{Herm}_n(\mathbb{O})$ with the product (99) is a Euclidean Jordan algebra if $n \le 3$ (see [FK94, Chap. V, §2]). In particular, $\mathrm{Herm}_3(\mathbb{O})$ is a Euclidean Jordan algebra of dimension 27, known as the Albert algebra.
The Jordan algebras $\mathrm{Herm}_2(\mathbb{D})$ for $\mathbb{D} = \mathbb{R}, \mathbb{C}, \mathbb{H}$ or $\mathbb{O}$ are isomorphic to the Jordan algebras associated to a suitable bilinear symmetric form as in Example 4.22(2); more precisely, we have that
\[
\begin{cases}
\mathrm{Herm}_2(\mathbb{R}) \cong (\mathbb{R} \oplus \mathbb{R}^2, \circ_Q), \\
\mathrm{Herm}_2(\mathbb{C}) \cong (\mathbb{R} \oplus \mathbb{R}^3, \circ_Q), \\
\mathrm{Herm}_2(\mathbb{H}) \cong (\mathbb{R} \oplus \mathbb{R}^5, \circ_Q), \\
\mathrm{Herm}_2(\mathbb{O}) \cong (\mathbb{R} \oplus \mathbb{R}^9, \circ_Q),
\end{cases}
\tag{100}
\]
where $\circ_Q$ is defined as in (98) with respect to the positive definite symmetric bilinear form $Q$ on the suitable vector space.
(4) Let $(A, \circ)$ be a Jordan algebra over $F$. Then $A$ becomes a Jordan triple system with respect to the triple product $\{\cdot,\cdot,\cdot\}_\circ$ defined by (see [Sat80, Chap. I, §6])
\[ \{x, y, z\}_\circ := (x \circ y) \circ z + (z \circ y) \circ x - (x \circ z) \circ y. \]
It turns out that $x \,\square\, y = T_{x \circ y} + [T_x, T_y]$ for any $x, y \in A$, which implies that the trace form of the Jordan algebra $(A, \circ)$ as defined in (97) coincides with the trace form of the JTS $(A, \{\cdot,\cdot,\cdot\}_\circ)$ as defined in (14). In particular, $(A, \circ)$ is semisimple if and only if $(A, \{\cdot,\cdot,\cdot\}_\circ)$ is semisimple.
If $(A, \circ)$ is a Jordan algebra over $\mathbb{R}$, then $A_{\mathbb{C}} := A \otimes_{\mathbb{R}} \mathbb{C}$ becomes a Hermitian JTS with respect to the triple product
\[ \{x, y, z\}'_\circ := (x \circ \overline{y}) \circ z + (z \circ \overline{y}) \circ x - (x \circ z) \circ \overline{y}, \]
where $y \mapsto \overline{y}$ is the complex conjugation corresponding to the real form $A \subset A_{\mathbb{C}}$ and the Jordan product $\circ$ is extended linearly to $A_{\mathbb{C}}$. Then $(A, \circ)$ is Euclidean if and only if $(A_{\mathbb{C}}, \{\cdot,\cdot,\cdot\}'_\circ)$ is a positive Hermitian JTS.

Observe that simple Jordan algebras are semisimple. Moreover, the direct sum of simple Jordan algebras is semisimple, where the direct sum of two Jordan algebras $(A_1, \circ_1)$ and $(A_2, \circ_2)$ is the vector space $A := A_1 \oplus A_2$ endowed with the component-wise Jordan product $(x_1, x_2) \circ (y_1, y_2) := (x_1 \circ_1 y_1,\ x_2 \circ_2 y_2)$. Conversely, we have the following decomposition theorem.

Proposition 4.23. Any semisimple (resp. Euclidean) Jordan algebra decomposes uniquely as the product of simple (resp. Euclidean and simple) Jordan algebras.

Sketch of the Proof. Let $(A, \circ)$ be a semisimple (resp. Euclidean) Jordan algebra. If $A$ is not simple, then there exists a proper ideal $I \subset A$. Consider the orthogonal complement of $I$
\[ I^\perp := \{ x \in A : \tau(x, y) = 0 \text{ for any } y \in I \}. \]
It is possible to prove (see [FK94, Prop. III.4.4]) that
(i) $I^\perp$ is an ideal of $A$;
(ii) $I$ and $I^\perp$ are semisimple (resp. Euclidean) Jordan algebras;
(iii) $A = I \oplus I^\perp$ as Jordan algebras.
Iterating this construction for $I$ and $I^\perp$, we get the existence of the decomposition. For the uniqueness, see loc. cit. □

Simple Euclidean Jordan algebras were classified by Jordan–von Neumann–Wigner (see [FK94, Chap. V] and the references therein).

Theorem 4.24. Every simple Euclidean Jordan algebra is isomorphic to one of the following Jordan algebras (see Table 8):
(i) $(\mathbb{R} \oplus \mathbb{R}^n, \circ_Q)$ where $Q$ is the standard scalar product on $\mathbb{R}^n$;
(ii) $\mathrm{Herm}_n(\mathbb{R})$ for $n \ge 3$;
(iii) $\mathrm{Herm}_n(\mathbb{C})$ for $n \ge 3$;
(iv) $\mathrm{Herm}_n(\mathbb{H})$ for $n \ge 3$;
(v) $\mathrm{Herm}_3(\mathbb{O})$ (the Albert algebra).

Table 8 Simple Euclidean Jordan algebras
Real vector space $V$ | Jordan product $\circ$
$\mathbb{R} \oplus \mathbb{R}^n$ | $x \circ y = (x_0 y_0 + \sum_{i=1}^n x_i y_i,\ x_0 y_1 + y_0 x_1,\ \ldots,\ x_0 y_n + y_0 x_n)$
$\mathrm{Herm}_n(\mathbb{R})$ $(n \ge 3)$ | $A \circ B = \tfrac{1}{2}(AB + BA)$
$\mathrm{Herm}_n(\mathbb{C})$ $(n \ge 3)$ | $A \circ B = \tfrac{1}{2}(AB + BA)$
$\mathrm{Herm}_n(\mathbb{H})$ $(n \ge 3)$ | $A \circ B = \tfrac{1}{2}(AB + BA)$
$\mathrm{Herm}_3(\mathbb{O})$ | $A \circ B = \tfrac{1}{2}(AB + BA)$
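The axioms (J1) and (J2) of Definition 4.20 can be checked numerically for the first family of Theorem 4.24. The following is a minimal sketch in exact rational arithmetic, assuming the product (98) with $B = Q$ the standard scalar product; the function names and the random sampling scheme are illustrative choices, not part of the text. A passing check on random samples is of course evidence, not a proof.

```python
from fractions import Fraction
import random

def dot(u, v):
    """Standard scalar product Q on R^n."""
    return sum(a * b for a, b in zip(u, v))

def jordan(x, y):
    """Product (98) with B = Q: (lam, u) o (mu, v) = (lam*mu + Q(u, v), lam*v + mu*u)."""
    lam, u = x
    mu, v = y
    return (lam * mu + dot(u, v), tuple(lam * b + mu * a for a, b in zip(u, v)))

def random_element(n, rng):
    """Random element of Q + Q^n with small rational coordinates."""
    def coord():
        return Fraction(rng.randint(-9, 9), rng.randint(1, 9))
    return (coord(), tuple(coord() for _ in range(n)))

def check_axioms(n=4, trials=200, seed=0):
    """Check (J1) and (J2) on random samples; returns False at the first failure."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = random_element(n, rng), random_element(n, rng)
        if jordan(x, y) != jordan(y, x):                          # (J1): commutativity
            return False
        xx = jordan(x, x)
        if jordan(x, jordan(xx, y)) != jordan(xx, jordan(x, y)):  # (J2): T_x T_{x o x} = T_{x o x} T_x
            return False
    return True

assert check_axioms()
```

Exact fractions are used so that the two sides of (J2) can be compared with `==` rather than up to a floating-point tolerance.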
Remark 4.25. The complexification of a real Jordan algebra $(A, \circ)$ is the complex vector space $A_{\mathbb{C}} := A \otimes_{\mathbb{R}} \mathbb{C}$ endowed with the Jordan product $\circ_{\mathbb{C}}$ obtained by extending linearly the Jordan product $\circ$ on $A$. The complexification induces a bijection
\[ \{\text{Euclidean Jordan algebras}\} \xrightarrow{\ \cong\ } \{\text{Semisimple Jordan } \mathbb{C}\text{-algebras}\}, \qquad (A, \circ) \mapsto (A_{\mathbb{C}}, \circ_{\mathbb{C}}), \tag{101} \]
which preserves the decomposition into the product of simple Jordan algebras (see [FK94, Chap. VIII]).

We now explain the relationship between Euclidean Jordan algebras and symmetric cones, due to the work of Koecher and Vinberg.

Let $(V, \circ)$ be a Euclidean Jordan algebra and denote by $e \in V$ the unit element of $V$ (see Lemma 4.21(iv)). Denote by $V^{\times}$ the set of invertible elements of $V$, i.e. the elements $x \in V$ for which there exists $y \in V$ such that $x \circ y = e$. Then we define an open cone inside $V$ by
\[ \Omega_{(V, \circ)} := \{ x^2 : x \in V^{\times} \} = \{ x : x \in V^{\times} \}^o = \{ x \in V : T_x > 0 \}, \tag{102} \]
where $\{\cdot\}^o$ denotes the connected component containing the identity $e \in V$. Indeed, $\Omega_{(V, \circ)}$ is a symmetric cone with respect to the positive definite form $\langle \cdot, \cdot \rangle := \tau(\cdot, \cdot)$ (see [FK94, Chap. III, §2]).

Conversely, let $\Omega \subset V$ be a symmetric cone with respect to a scalar product $\langle \cdot, \cdot \rangle$ on $V$ and choose a base point $e \in \Omega$. Let $\mathfrak{g}$ be the Lie algebra of $G(\Omega)$, $\mathfrak{k}$ the Lie algebra of the maximal compact subgroup $G(\Omega)_e \subset G(\Omega)$ and $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$ the associated Cartan decomposition. The action of $G(\Omega)$ on $V$ induces an action of $\mathfrak{g}$ on $V$. Since $\mathfrak{k}$ is the Lie algebra of the stabilizer of $e$, an element $X \in \mathfrak{g}$ belongs to $\mathfrak{k}$ if and only if $X \cdot e = 0$. Therefore, the map $\mathfrak{p} \to V$ sending $X$ into $X \cdot e$ is a bijection; hence, for any $x \in V$, there exists a unique $L_x \in \mathfrak{p}$ such that $L_x \cdot e = x$. Define a product $\circ_\Omega$ on $V$ by
\[ x \circ_\Omega y := L_x \cdot y. \tag{103} \]
The pair $(V, \circ_\Omega)$ is a Euclidean Jordan algebra with unit element $e$ (see [FK94, Chap. III, §3]).

Theorem 4.26. There is a bijection
\[
\begin{array}{ccc}
\{\text{Euclidean Jordan algebras}\} & \longleftrightarrow & \{\text{Symmetric cones}\} \\
(V, \circ) & \longmapsto & \Omega_{(V, \circ)} \subset V \\
(V, \circ_\Omega) & \longleftarrow & \Omega \subset V
\end{array}
\tag{104}
\]
preserving the decomposition of Euclidean Jordan algebras into simple ones and the decomposition of symmetric cones into irreducible ones.
Proof. See [FK94, Chap. III] or [Sat80, Chap. I, §8]. □
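For the Jordan algebra $(\mathbb{R} \oplus \mathbb{R}^n, \circ_Q)$ the last description of the cone in (102), namely $T_x > 0$, can be compared directly with the Lorentz cone condition $x_0 > \sqrt{x_1^2 + \cdots + x_n^2}$. The following is a small sketch under the assumption that $T_x$ is written in the basis $(1, 0), (0, e_1), \ldots, (0, e_n)$, where it is the symmetric matrix with first row $(\lambda, u)$ and $\lambda$ on the rest of the diagonal. Positive definiteness is tested by Gaussian elimination without pivoting; the helper names are illustrative.

```python
from fractions import Fraction

def t_matrix(lam, u):
    """Matrix of T_x for x = (lam, u) in the basis (1, 0), (0, e_1), ..., (0, e_n)."""
    lam = Fraction(lam)
    u = [Fraction(c) for c in u]
    n = len(u)
    rows = [[lam] + u]
    for i in range(n):
        rows.append([u[i]] + [lam if j == i else Fraction(0) for j in range(n)])
    return rows

def is_positive_definite(matrix):
    """A symmetric matrix is positive definite iff all elimination pivots are positive."""
    m = [row[:] for row in matrix]
    size = len(m)
    for k in range(size):
        if m[k][k] <= 0:
            return False
        for i in range(k + 1, size):
            factor = m[i][k] / m[k][k]
            for j in range(k, size):
                m[i][j] -= factor * m[k][j]
    return True

def in_lorentz_cone(lam, u):
    """Tests x_0 > sqrt(x_1^2 + ... + x_n^2) without taking square roots."""
    return lam > 0 and lam * lam > sum(c * c for c in u)

# Interior, boundary and exterior points of the Lorentz cone in R + R^2.
for lam, u in [(3, (1, 2)), (1, (1, 0)), (2, (1, 2)), (-3, (1, 0)), (0, (0, 0))]:
    assert is_positive_definite(t_matrix(lam, u)) == in_lorentz_cone(lam, u)
```

The agreement on these sample points reflects the fact that the eigenvalues of this matrix are $\lambda$ and $\lambda \pm \sqrt{Q(u, u)}$.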
Remark 4.27. The bijection of Theorem 4.26 becomes an equivalence of categories if the two sets are endowed with the following morphisms (see [Sat80, Chap. I, §9]):
(i) A unital Jordan algebra homomorphism between two Euclidean Jordan algebras $(V, \circ)$ and $(V', \circ')$ is a linear map $f : V \to V'$ such that
\[ f(x \circ y) = f(x) \circ' f(y) \quad \text{for any } x, y \in V, \qquad f(e) = e', \]
where $e$ (resp. $e'$) is the unit element of $(V, \circ)$ (resp. $(V', \circ')$).
(ii) An equivariant morphism between two symmetric cones $\Omega \subset (V, \langle \cdot, \cdot \rangle)$ and $\Omega' \subset (V', \langle \cdot, \cdot \rangle')$ is a linear map $\varphi : V \to V'$ sending $\Omega$ into $\Omega'$ and such that there exists a morphism of Lie algebras $\rho : \mathrm{Lie}\, G(\Omega) \to \mathrm{Lie}\, G(\Omega')$ satisfying
\[ \varphi(T \cdot x) = \rho(T) \cdot \varphi(x), \qquad \rho(T^t) = \rho(T)^t \quad \text{for any } x \in V \text{ and any } T \in \mathrm{Lie}\, G(\Omega), \]
where $t$ denotes the transpose with respect to either $\langle \cdot, \cdot \rangle$ or $\langle \cdot, \cdot \rangle'$.

As a consequence of the bijection between symmetric cones and Euclidean Jordan algebras in Theorem 4.26 and the classification of simple Euclidean Jordan algebras given in Theorem 4.24, we get the following classification of irreducible symmetric cones (see [Sat80, Chap. I, §8] and Table 9).

Theorem 4.28. Every irreducible symmetric cone is isomorphic to one of the following cones:
(i) $P(1, n) := \{ x \in \mathbb{R} \oplus \mathbb{R}^n : x_0 > \sqrt{x_1^2 + \cdots + x_n^2} \} \subset \mathbb{R} \oplus \mathbb{R}^n$ for $n \ge 1$ (the Lorentz or light cone);
(ii) $P_n(\mathbb{R}) = \mathrm{Herm}_n^{>0}(\mathbb{R}) := \{ M \in \mathrm{Herm}_n(\mathbb{R}) : M > 0 \} \subset \mathrm{Herm}_n(\mathbb{R})$ for $n \ge 3$;
(iii) $P_n(\mathbb{C}) = \mathrm{Herm}_n^{>0}(\mathbb{C}) := \{ M \in \mathrm{Herm}_n(\mathbb{C}) : M > 0 \} \subset \mathrm{Herm}_n(\mathbb{C})$ for $n \ge 3$;
(iv) $P_n(\mathbb{H}) = \mathrm{Herm}_n^{>0}(\mathbb{H}) := \{ M \in \mathrm{Herm}_n(\mathbb{H}) : M > 0 \} \subset \mathrm{Herm}_n(\mathbb{H})$ for $n \ge 3$;
(v) $P_3(\mathbb{O}) = \mathrm{Herm}_3^{>0}(\mathbb{O}) := \{ M \in \mathrm{Herm}_3(\mathbb{O}) : M > 0 \} \subset \mathrm{Herm}_3(\mathbb{O})$.

Table 9 Irreducible symmetric cones
Cone | Rank | Dimension
$P(1, n) = \{ x \in \mathbb{R} \oplus \mathbb{R}^n : x_0 > \sqrt{x_1^2 + \cdots + x_n^2} \}$ | $2$ | $n + 1$
$P_n(\mathbb{R}) = \mathrm{Herm}_n^{>0}(\mathbb{R})$ $(n \ge 3)$ | $n$ | $\binom{n+1}{2}$
$P_n(\mathbb{C}) = \mathrm{Herm}_n^{>0}(\mathbb{C})$ $(n \ge 3)$ | $n$ | $n^2$
$P_n(\mathbb{H}) = \mathrm{Herm}_n^{>0}(\mathbb{H})$ $(n \ge 3)$ | $n$ | $n(2n - 1)$
$P_3(\mathbb{O}) = \mathrm{Herm}_3^{>0}(\mathbb{O})$ | $3$ | $27$

The closure $\overline{\Omega}$ of a symmetric cone $\Omega \subset V$ can be decomposed into a disjoint union of boundary components, which we are now going to define.

Recall first that an idempotent of a Euclidean Jordan algebra $(V, \circ)$ is an element $e \in V$ such that $e \circ e = e$. The operator $T_e$ (see Definition 4.20) is self-adjoint with respect to the positive definite scalar product given by the trace form $\tau$ (see Lemma 4.21(iii)) and its eigenvalues are $0$, $1/2$ and $1$ (see [FK94, Prop. III.1.3]). Therefore, we get an orthogonal decomposition of $V$ into eigenspaces
\[ V = V(e, 1) \oplus V(e, 1/2) \oplus V(e, 0), \tag{105} \]
relative to, respectively, the eigenvalues $1$, $1/2$ and $0$.

Lemma 4.29. For any idempotent $e \in (V, \circ)$, we have that $V(e, 1)$ is a Euclidean Jordan subalgebra of $(V, \circ)$ such that $e \in V(e, 1)$ is the identity element.

Proof. See [FK94, Prop. IV.1.1]. □
Consider now the Euclidean Jordan algebra $(V, \circ_\Omega)$ corresponding to a symmetric cone $\Omega \subset V$ according to Theorem 4.26.

Definition 4.30. For an idempotent $e \in (V, \circ_\Omega)$, we define the boundary component of $\Omega$ associated to $e$ as the symmetric cone
\[ \Omega(e) := \Omega_{(V(e, 1), \circ_\Omega)} \subset V(e, 1) \]
corresponding to the Euclidean Jordan algebra $(V(e, 1), \circ_\Omega)$ according to Theorem 4.26.

The closure $\overline{\Omega}$ of the symmetric cone $\Omega$ in $V$ can be partitioned into the disjoint union of boundary components as follows.

Theorem 4.31. Notations as before.
(i) For any idempotent $e \in (V, \circ_\Omega)$, the intersection of $\overline{\Omega} \subset V$ with the subspace $V(e, 1) \subset V$ is equal to the closure $\overline{\Omega(e)}$. In particular, $\Omega(e)$ is contained in $\overline{\Omega}$.
(ii) We have that
\[ \overline{\Omega} = \coprod_{e} \Omega(e), \tag{106} \]
where the disjoint union varies over all the idempotents $e \in (V, \circ_\Omega)$ and, for any idempotent $e$, the closure of $\Omega(e)$ is a disjoint union of boundary components.
(iii) The group $G(\Omega)^o$ of automorphisms of $\Omega$ acts on $\overline{\Omega}$ by permuting its boundary components.

Proof. See [AMRT10, Chap. II, §3]. □
Using (ii), we can introduce an order relation on the set of idempotents of $(V, \circ_\Omega)$ by saying that $e \le e'$ if and only if $\Omega(e) \subseteq \overline{\Omega(e')}$. Indeed, it turns out that $e \le e'$ if and only if $e' = e + e''$ for a certain idempotent $e''$ such that $e \circ_\Omega e'' = 0$.

Example 4.32.
(i) For $F = \mathbb{R}, \mathbb{C}, \mathbb{H}, \mathbb{O}$ and $n \ge 2$ (with the convention that $n \le 3$ if $F = \mathbb{O}$), consider the symmetric cone $P_n(F) = \mathrm{Herm}_n^{>0}(F) \subset \mathrm{Herm}_n(F)$ (as in Theorem 4.28) which is associated to the Euclidean Jordan algebra $(\mathrm{Herm}_n(F), \circ)$ of Example 4.22(3). Every idempotent of $\mathrm{Herm}_n(F)$ is conjugate by $G(P_n(F))^o$ to the idempotent
\[ e_p := \begin{pmatrix} I_p & 0 \\ 0 & 0 \end{pmatrix} \]
for some $0 \le p \le n$. The eigenspaces (105) of $T_{e_p}$ are given by
\[
\begin{cases}
V(e_p, 1) = \left\{ \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix} : A \in \mathrm{Herm}_p(F) \right\}, \\[2ex]
V(e_p, 1/2) = \left\{ \begin{pmatrix} 0 & B \\ \overline{B}^{\,t} & 0 \end{pmatrix} : B \in M_{p, n-p}(F) \right\}, \\[2ex]
V(e_p, 0) = \left\{ \begin{pmatrix} 0 & 0 \\ 0 & C \end{pmatrix} : C \in \mathrm{Herm}_{n-p}(F) \right\}.
\end{cases}
\]
The boundary component associated to $e_p$ is equal to
\[ P_n(F)(e_p) = \left\{ \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix} : A \in \mathrm{Herm}_p^{>0}(F) \right\} \subset \mathrm{Herm}_n^{\ge 0}(F) = \overline{P_n(F)}. \]
Note that the above idempotents $\{e_p\}$ are such that $0 = e_0 \le \cdots \le e_p \le \cdots \le e_n = I_n$.
(ii) Consider the Lorentz cone $P(1, n)$ of Theorem 4.28 which is associated to the Euclidean Jordan algebra $(\mathbb{R} \oplus \mathbb{R}^n, \circ_B)$ of Example 4.22(2), where $B((x_1, \ldots, x_n)) := x_1^2 + \cdots + x_n^2$ is the standard quadratic form on $\mathbb{R}^n$. By abuse of notation, we will denote also by $B$ the symmetric bilinear form associated to the quadratic form $B$. Every non-trivial idempotent (i.e. different from $(0, 0)$ and $(1, 0)$) is of the form
\[ e_w = \left( \tfrac{1}{2}, w \right) \quad \text{where } B(w) = \tfrac{1}{4}. \]
The eigenspaces (105) of $T_{e_w}$ are given by
\[
\begin{cases}
V(e_w, 1) = \mathbb{R} \cdot e_w, \\
V(e_w, 1/2) = \{ (0, v) : B(v, w) = 0 \}, \\
V(e_w, 0) = \mathbb{R} \cdot e_{-w}.
\end{cases}
\]
The boundary component associated to $e_w$ is equal to
\[ P(1, n)(e_w) = \mathbb{R}_{>0} \cdot e_w \subset \overline{P(1, n)}. \]

For the symmetric cone $C(F)$ associated to a boundary component $F$ of a bounded symmetric domain $D$, as in Theorem 4.10(ii), we can explicitly describe its boundary components in terms of the boundary components of $D$ that dominate $F$.

Theorem 4.33. Let $D$ be a bounded symmetric domain and let $F \le D$ be a boundary component.
(i) If $F \le F'$ then $U(F) \supseteq U(F')$ and we have the equality $C(F') = \overline{C(F)} \cap U(F') \subset U(F)$. Moreover, $C(F')$ is a boundary component of the symmetric cone $C(F)$.
(ii) There is an order-reversing bijection
\[ \{ F' \le D : F \le F' \} \xrightarrow{\ \cong\ } \{\text{Boundary components of } C(F)\}, \qquad F' \mapsto C(F') \subset \overline{C(F)}. \tag{107} \]

Proof. See [AMRT10, Chap. III, Thm. 4.8]. □
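The Peirce decomposition of Example 4.32(ii) can be verified by direct computation. The following is a minimal sketch for $n = 3$, assuming the product (98) with $B$ the standard scalar product and the particular choice $w = (1/2, 0, 0)$, which satisfies $B(w) = 1/4$; the helper names are illustrative.

```python
from fractions import Fraction

def dot(u, v):
    """Standard symmetric bilinear form B on R^n."""
    return sum(a * b for a, b in zip(u, v))

def jordan(x, y):
    """Product (98): (lam, u) o (mu, v) = (lam*mu + B(u, v), lam*v + mu*u)."""
    lam, u = x
    mu, v = y
    return (lam * mu + dot(u, v), tuple(lam * b + mu * a for a, b in zip(u, v)))

half = Fraction(1, 2)
w = (half, Fraction(0), Fraction(0))            # B(w) = 1/4
e_w = (half, w)
e_minus_w = (half, tuple(-c for c in w))
zero = (Fraction(0), (Fraction(0),) * 3)

# e_w and e_{-w} are idempotents with e_w o e_{-w} = 0.
assert jordan(e_w, e_w) == e_w
assert jordan(e_minus_w, e_minus_w) == e_minus_w
assert jordan(e_w, e_minus_w) == zero           # e_{-w} lies in V(e_w, 0)

# Elements (0, v) with B(v, w) = 0 are eigenvectors of T_{e_w} with eigenvalue 1/2.
v = (Fraction(0), Fraction(3), Fraction(-5))
assert dot(v, w) == 0
assert jordan(e_w, (Fraction(0), v)) == (Fraction(0), tuple(half * c for c in v))

# e_w + e_{-w} is the unit element (1, 0), in agreement with e_w <= (1, 0).
unit = (e_w[0] + e_minus_w[0], tuple(a + b for a, b in zip(e_w[1], e_minus_w[1])))
assert unit == (Fraction(1), (Fraction(0),) * 3)
```

The last assertion illustrates the order relation on idempotents: the unit element decomposes as the sum of the two orthogonal idempotents $e_w$ and $e_{-w}$.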
4.4 Siegel Domains

The presentation of a bounded symmetric domain $D$ as a Siegel domain of the third kind with respect to a given boundary component $F \le D$ (see Corollary 4.15) assumes a nicer form when the boundary component $F$ is a point, in which case it gives rise to a presentation of $D$ as a Siegel domain of the second kind, or simply a Siegel domain (following the terminology of [Sat80]). The aim of this subsection is to introduce and study Siegel domains.

Definition 4.34. Let $\Omega$ be an open (convex and pointed) cone in a real vector space $U$. Let $V$ be a complex vector space and let $H : V \times V \to U_{\mathbb{C}}$ be a Hermitian map ($\mathbb{C}$-linear in the second variable and $\mathbb{C}$-antilinear in the first variable). Assume that $H$ is $\Omega$-positive, i.e.
\[ H(v, v) \in \overline{\Omega} \setminus \{0\} \quad \text{for any } 0 \neq v \in V. \]
The Siegel domain associated to $(U, V, \Omega, H)$ is given by
\[ S = S(U, V, \Omega, H) := \{ (u, v) \in U_{\mathbb{C}} \times V : \operatorname{Im} u - H(v, v) \in \Omega \} \subset U_{\mathbb{C}} \times V. \tag{108} \]
In the special case where $V = \{0\}$,
\[ S = S(U, \Omega) := \{ u \in U_{\mathbb{C}} : \operatorname{Im} u \in \Omega \} \subset U_{\mathbb{C}} \tag{109} \]
is called a Siegel domain of the first kind, or a tube domain.

The following result is due to Pyatetskii-Shapiro [PS69] (see also [Sat80, Chap. III, Prop. 6.1]).

Theorem 4.35. Every Siegel domain is holomorphically equivalent to a bounded domain.

There is a nice characterization (due to Satake) of the Siegel domains that are holomorphically equivalent to a bounded symmetric domain. In order to present such a characterization, we need to introduce some notations.

Consider the setting of Definition 4.34. Assume furthermore that $\Omega \subset U$ is symmetric with respect to a scalar product $\langle \cdot, \cdot \rangle$ on $U$ (see Definition 4.16) and extend $\langle \cdot, \cdot \rangle$ to a $\mathbb{C}$-bilinear symmetric form on $U_{\mathbb{C}} \times U_{\mathbb{C}}$. Choose a base point $e \in \Omega$ such that
\[ G(\Omega)_e = G(\Omega) \cap O(U, \langle \cdot, \cdot \rangle), \tag{110} \]
which is possible by Theorem 4.17(i). Denote by $\circ_\Omega$ the Jordan product on $U$ defined by means of (103) and extend it linearly to $U_{\mathbb{C}}$. Recall that $(U, \circ_\Omega)$ is a Euclidean Jordan algebra with unit element $e$ (see Theorem 4.26). Define now a positive definite Hermitian form $h$ on $V$ by
\[ h(v, v') = \langle e, H(v, v') \rangle \quad \text{for any } v, v' \in V. \tag{111} \]
Using $h$, we can define for any $u \in U_{\mathbb{C}}$ an endomorphism $R_u \in \mathrm{End}(V)$ by means of the formula
\[ \langle u, H(v, v') \rangle = 2 h(v, R_u v') \quad \text{for any } v, v' \in V. \tag{112} \]
It is easily checked from (112) that $R_u^* = R_{\overline{u}}$, where the adjoint is with respect to the Hermitian form $h$. In particular, if $u \in U$ then
\[ R_u \in \mathrm{Herm}(V, h) := \{ f \in \mathrm{End}(V) : f^* = f \}. \]

Theorem 4.36. Notations as above. The Siegel domain $S(U, V, \Omega, H)$ is biholomorphic to a bounded symmetric domain (in which case we say that it is symmetric) if and only if
(i) $\Omega \subset U$ is a symmetric cone with respect to a scalar product $\langle \cdot, \cdot \rangle$ on $U$;
(ii) $u \circ_\Omega H(v, v') = H(R_u v, v') + H(v, R_u v')$ for any $u \in U$ and any $v, v' \in V$;
(iii) $H(R_{H(v'', v')} v, v'') = H(v', R_{H(v, v'')} v'')$ for any $v, v', v'' \in V$.

Proof. See [Sat80, Chap. V, Thm. 3.5]. □
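To make Definition 4.34 and Theorem 4.35 concrete, consider the simplest Siegel domain with $V \neq 0$. The sketch below assumes $U = \mathbb{R}$, $V = \mathbb{C}$, $\Omega = \mathbb{R}_{>0}$ and $H(v, v') = \overline{v} v'$, so that $S = \{ (u, v) \in \mathbb{C}^2 : \operatorname{Im} u - |v|^2 > 0 \}$, and it uses the Cayley-type map $(u, v) \mapsto ((u - i)/(u + i),\ 2v/(u + i))$, which is a standard choice made here for illustration and is not taken from the text. The identity $1 - |z_1|^2 - |z_2|^2 = 4(\operatorname{Im} u - |v|^2)/|u + i|^2$ shows that this map sends $S$ into the unit ball of $\mathbb{C}^2$; the code checks the identity on a few sample points.

```python
def in_siegel_domain(u, v):
    """Membership in S(U, V, Omega, H) for U = R, V = C, Omega = R_{>0}, H(v, v') = conj(v) * v'."""
    return u.imag - abs(v) ** 2 > 0

def to_ball(u, v):
    """Cayley-type map (u, v) -> ((u - i)/(u + i), 2v/(u + i)), defined whenever u != -i."""
    return (u - 1j) / (u + 1j), 2 * v / (u + 1j)

def ball_defect(z1, z2):
    """1 - |z1|^2 - |z2|^2, which is positive exactly on the open unit ball of C^2."""
    return 1 - abs(z1) ** 2 - abs(z2) ** 2

# Three points inside the Siegel domain and one point outside.
samples = [(0.5 + 2j, 1 + 0.5j), (3j, 1j), (1 + 0.1j, 0.2 + 0j), (-2 + 1j, 1 - 1j)]
for u, v in samples:
    z1, z2 = to_ball(u, v)
    expected = 4 * (u.imag - abs(v) ** 2) / abs(u + 1j) ** 2
    assert abs(ball_defect(z1, z2) - expected) < 1e-12
    assert in_siegel_domain(u, v) == (ball_defect(z1, z2) > 0)
```

In this example the bounded domain is the unit ball of $\mathbb{C}^2$, which is a bounded symmetric domain, so the Siegel domain is symmetric in the sense of Theorem 4.36.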
For a tube domain, the conditions (ii) and (iii) of Theorem 4.36 are trivially satisfied. Therefore, we get the following

Corollary 4.37. The tube domain $S(U, \Omega)$ is biholomorphic to a bounded symmetric domain if and only if $\Omega \subset U$ is a symmetric cone.

Now we want to classify the symmetric Siegel domains, or equivalently the Siegel domains that satisfy the three conditions of Theorem 4.36. Actually, it is possible to classify the following bigger class of Siegel domains.

Definition 4.38. A Siegel domain $S(U, V, \Omega, H)$ is said to be quasi-symmetric if and only if it satisfies the first two conditions of Theorem 4.36.

Indeed, quasi-symmetric Siegel domains correspond to unital Jordan algebra representations of $(U, \circ_\Omega)$ into $\mathrm{Herm}(V, h)$.

Lemma 4.39. Fix a symmetric cone $\Omega \subset U$ and keep the notations as above.
(i) If $H : V \times V \to U_{\mathbb{C}}$ is an $\Omega$-positive Hermitian map (as in Definition 4.34) satisfying Theorem 4.36(ii), then
\[ \beta := 2R : (U, \circ_\Omega) \to \mathrm{Herm}(V, h), \qquad u \mapsto 2R_u \tag{113} \]
is a unital Jordan algebra homomorphism in the sense of Remark 4.27 (we call it a complex representation of $(U, \circ_\Omega)$).
(ii) Conversely, if we start from a unital Jordan algebra homomorphism (113) and we define a Hermitian map $H : V \times V \to U_{\mathbb{C}}$ by means of (112), then $H$ is $\Omega$-positive and it satisfies Theorem 4.36(ii).

Proof. See [Sat80, Chap. IV, Prop. 4.1]. □
Each quasi-symmetric Siegel domain is a product of irreducible quasi-symmetric Siegel domains, which we are now going to define.

Definition 4.40.
(i) Let $S_1 = S(U_1, V_1, \Omega_1, H_1)$ and $S_2 = S(U_2, V_2, \Omega_2, H_2)$ be two Siegel domains. The product $S_1 \times S_2$ is equal to the Siegel domain $S(U_1 \oplus U_2, V_1 \oplus V_2, \Omega_1 + \Omega_2, H = H_1 \oplus H_2)$, where $H = H_1 \oplus H_2 : (V_1 \oplus V_2) \times (V_1 \oplus V_2) \to (U_1 \oplus U_2)_{\mathbb{C}}$ is defined by $H|_{V_1 \times V_2} \equiv 0$, $H|_{V_1 \times V_1} \equiv H_1$ and $H|_{V_2 \times V_2} \equiv H_2$.
(ii) A Siegel domain is irreducible if and only if it cannot be written as the product of two non-trivial Siegel domains.

Theorem 4.41.
(i) A quasi-symmetric Siegel domain $S(U, V, \Omega, H)$ is irreducible if and only if $\Omega \subset U$ is irreducible.
(ii) Any quasi-symmetric (resp. symmetric) Siegel domain decomposes uniquely as the product of irreducible quasi-symmetric (resp. symmetric) Siegel domains.

According to Theorem 4.26, Lemma 4.39 and Theorem 4.41, an irreducible quasi-symmetric Siegel domain is built up from a simple Euclidean Jordan algebra $(U, \circ)$ together with a complex representation $\beta : (U, \circ) \to \mathrm{Herm}(V, h)$. Such pairs can be classified as follows.

Theorem 4.42. The complex representations of the simple Euclidean Jordan algebras are given as follows:
(i) Type $IV_{n;r,s}$ (even $n \ge 4$; $r \ge s \ge 0$): the representations $\beta_{r,s} = \mathrm{sp}_1^{\oplus r} \oplus \mathrm{sp}_2^{\oplus s}$ of $(\mathbb{R} \oplus \mathbb{R}^{n-1}, \circ_Q)$, where $\mathrm{sp}_1$ and $\mathrm{sp}_2$ are the two spin representations (see [Sat80, Appendix, §4–6]);
(ii) Type $IV_{n;r}$ (odd $n \ge 3$ or $n = 2$; $r \ge 0$): the representations $\beta_r = \mathrm{sp}^{\oplus r}$ of $(\mathbb{R} \oplus \mathbb{R}^{n-1}, \circ_Q)$, where $\mathrm{sp}$ is the spin representation (see [Sat80, Appendix, §4–6]);
(iii) Type $III_{n;r}$ ($n \ge 3$; $r \ge 0$): the representations $\beta_r = \mathrm{id}^{\oplus r}$ of $\mathrm{Herm}_n(\mathbb{R})$, where $\mathrm{id} : \mathrm{Herm}_n(\mathbb{R}) \to \mathrm{Herm}_n(\mathbb{C})$ is the natural injection;
(iv) Type $I_{n;r,s}$ ($n \ge 3$; $r \ge s \ge 0$): the representations $\beta_{r,s} = \mathrm{id}^{\oplus r} \oplus \overline{\mathrm{id}}^{\oplus s}$ of $\mathrm{Herm}_n(\mathbb{C})$, where $\mathrm{id} : \mathrm{Herm}_n(\mathbb{C}) \to \mathrm{Herm}_n(\mathbb{C})$ is the identity homomorphism and $\overline{\mathrm{id}} : \mathrm{Herm}_n(\mathbb{C}) \to \mathrm{Herm}_n(\mathbb{C})$ is given by sending $A$ into its complex conjugate $\overline{A}$;
(v) Type $II_{n;r}$ ($n \ge 3$; $r \ge 0$): the representations $\beta_r = \mathrm{id}^{\oplus r}$ of $\mathrm{Herm}_n(\mathbb{H})$, where
\[ \mathrm{id} : \mathrm{Herm}_n(\mathbb{H}) \to \mathrm{Herm}_{2n}(\mathbb{C}), \qquad A + jB \mapsto \begin{pmatrix} A & -\overline{B} \\ B & \overline{A} \end{pmatrix}; \]
(vi) Type $VI_0$: the trivial representation $\beta_0 = 0$ of $\mathrm{Herm}_3(\mathbb{O})$.

Proof. See [Sat80, Chap. V, §5] and the references therein. □
In Table 10, we have listed all the irreducible quasi-symmetric Siegel domains by specifying the irreducible symmetric cone and the complex representation of their associated simple Euclidean Jordan algebra. Moreover, in the last column, we have specified the quasi-symmetric Siegel domains that are also symmetric (see [Sat80, Chap. V, §5] and the references therein) together with the corresponding bounded symmetric domains (using the notations of Table 4) to which they are biholomorphic. By looking at the last column of Table 10 and using the isomorphisms between bounded symmetric domains of small dimension belonging to different types (see Table 7), it is easy to see that every bounded symmetric domain is biholomorphic to a unique Siegel domain. As a consequence, there exists a bijection between symmetric Siegel domains and Hermitian positive JTSs, which we are now going to make explicit.
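Table 10 Irreducible quasi-symmetric Siegel domains

Type | Symmetric cone | Complex representation | Symmetric cases
--- | --- | --- | ---
$IV_{n;r,s}$ ($r \geq s \geq 0$, even $n \geq 4$) | $P(1, n-1)$ | $\rho_{r,s} = \mathrm{sp}_1^{\oplus r} \oplus \mathrm{sp}_2^{\oplus s}$ | $IV_{n;0,0} = IV_n$; $IV_{4;r,0} = I_{r+2,2}$; $IV_{6;1,0} = II_5$; $IV_{8;1,0} = V$
$IV_{n;r}$ ($r \geq 0$, odd $n \geq 3$ or $n = 2$) | $P(1, n-1)$ | $\rho_r = \mathrm{sp}^{\oplus r}$ | $IV_{n;0} = IV_n$ ($n \geq 3$); $IV_{2;r} = I_{r+1,1}$
$III_{n;r}$ ($n \geq 3$, $r \geq 0$) | $P_n(\mathbb{R}) = \mathrm{Herm}^{>0}_n(\mathbb{R})$ | $\rho_r = \mathrm{id}^{\oplus r}$ | $III_{n;0} = III_n$
$I_{n;r,s}$ ($n \geq 3$, $r \geq s \geq 0$) | $P_n(\mathbb{C}) = \mathrm{Herm}^{>0}_n(\mathbb{C})$ | $\rho_{r,s} = \mathrm{id}^{\oplus r} \oplus \overline{\mathrm{id}}^{\oplus s}$ | $I_{n;r,0} = I_{n+r,n}$
$II_{n;r}$ ($n \geq 3$, $r \geq 0$) | $P_n(\mathbb{H}) = \mathrm{Herm}^{>0}_n(\mathbb{H})$ | $\rho_r = \mathrm{id}^{\oplus r}$ | $II_{n;r} = II_{2n+r}$ ($r = 0, 1$)
$VI_0$ | $P_3(\mathbb{O}) = \mathrm{Herm}^{>0}_3(\mathbb{O})$ | $\rho_0 = 0$ | $VI_0 = VI$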
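Start with a symmetric Siegel domain $S = S(U, V, \Omega, H)$. By Theorem 4.36(i), the cone $\Omega \subset U$ is symmetric with respect to a scalar product $\langle\,,\rangle$. Keeping the notations introduced before Theorem 4.36, we get a Jordan product $\circ_\Omega$ on $U$ which we extend linearly to $U_{\mathbb{C}}$. Using the fact that $(U, \circ_\Omega)$ is a Euclidean Jordan algebra, it can be checked (see [Sat80, Chap. I, §6]) that $U_{\mathbb{C}}$ becomes a Hermitian positive JTS with respect to the triple product
\[ \{u_1, u_2, u_3\}_\Omega := (u_1 \circ_\Omega \overline{u_2}) \circ_\Omega u_3 + (u_3 \circ_\Omega \overline{u_2}) \circ_\Omega u_1 - (u_1 \circ_\Omega u_3) \circ_\Omega \overline{u_2}, \tag{114} \]
where $u \mapsto \overline{u}$ denotes the complex conjugation of $U_{\mathbb{C}}$ with respect to $U$. Define a triple product on $U_{\mathbb{C}} \oplus V$ as follows
\[ \left\{ \begin{pmatrix} u_1 \\ v_1 \end{pmatrix}, \begin{pmatrix} u_2 \\ v_2 \end{pmatrix}, \begin{pmatrix} u_3 \\ v_3 \end{pmatrix} \right\}_S := \begin{pmatrix} \{u_1, u_2, u_3\}_\Omega + 2H(R_{\overline{u_3}} v_2, v_1) + 2H(R_{\overline{u_1}} v_2, v_3) \\ 2R_{u_3} R_{\overline{u_2}} v_1 + 2R_{u_1} R_{\overline{u_2}} v_3 + 2R_{H(v_2, v_1)} v_3 + 2R_{H(v_2, v_3)} v_1 \end{pmatrix}. \tag{115} \]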
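As a sanity check on formula (114), the following minimal sketch evaluates its right-hand side on real symmetric $2 \times 2$ matrices, taking the Jordan product to be $a \circ b = (ab + ba)/2$. This concrete model and all function names are assumptions made for illustration; they are not taken from [Sat80]. On real elements the conjugation plays no role, and the triple product should reduce to $(u_1 u_2 u_3 + u_3 u_2 u_1)/2$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(ca, a, cb, b):
    """Entrywise linear combination ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2 (assumed model of the product on U)."""
    return combine(HALF, matmul(a, b), HALF, matmul(b, a))

def triple_114(u1, u2, u3):
    """Right-hand side of (114), evaluated on real elements."""
    first = jordan(jordan(u1, u2), u3)
    second = jordan(jordan(u3, u2), u1)
    third = jordan(jordan(u1, u3), u2)
    return combine(1, combine(1, first, 1, second), -1, third)

def sandwich(u1, u2, u3):
    """(u1 u2 u3 + u3 u2 u1)/2 computed with ordinary matrix products."""
    return combine(HALF, matmul(matmul(u1, u2), u3),
                   HALF, matmul(matmul(u3, u2), u1))

# Three real symmetric test matrices and the identity element e.
a = [[1, 2], [2, -1]]
b = [[0, 1], [1, 3]]
c = [[2, -1], [-1, 5]]
e = [[1, 0], [0, 1]]

triple_matches_sandwich = triple_114(a, b, c) == sandwich(a, b, c)
unit_acts_trivially = triple_114(e, e, a) == a  # {e, e, a} = a
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.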
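Using that $S(U, V, \Omega, H)$ is symmetric, it can be shown that $(U_{\mathbb{C}} \oplus V, \{\cdot, \cdot, \cdot\}_S)$ is a Hermitian positive JTS (see [Sat80, Chap. V, Thm. 6.9] and the discussion following it). Observe that the element $\tilde{e} := \begin{pmatrix} e \\ 0 \end{pmatrix} \in U_{\mathbb{C}} \oplus V$ is a tripotent of the Hermitian positive JTS $(U_{\mathbb{C}} \oplus V, \{\cdot, \cdot, \cdot\}_S)$, i.e. $\{\tilde{e}, \tilde{e}, \tilde{e}\}_S = \tilde{e}$. Moreover, using the fact that $e = \overline{e}$ is the identity element of $(U_{\mathbb{C}}, \{\cdot, \cdot, \cdot\}_\Omega)$ and that $2R_e = \mathrm{id}_V$ by Lemma 4.39, we can easily compute
\[ (\tilde{e} \,\square\, \tilde{e}) \begin{pmatrix} u \\ v \end{pmatrix} := \left\{ \begin{pmatrix} e \\ 0 \end{pmatrix}, \begin{pmatrix} e \\ 0 \end{pmatrix}, \begin{pmatrix} u \\ v \end{pmatrix} \right\}_S = \begin{pmatrix} u \\ \frac{1}{2} v \end{pmatrix}. \tag{116} \]
In other words, $1$ and $\frac{1}{2}$ are the only eigenvalues of $\tilde{e} \,\square\, \tilde{e}$, with associated eigenspaces
\[ V(\tilde{e} \,\square\, \tilde{e}; 1) = U_{\mathbb{C}} \quad \text{and} \quad V(\tilde{e} \,\square\, \tilde{e}; 1/2) = V. \tag{117} \]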
Conversely, start with a Hermitian positive JTS $(W, \{\cdot, \cdot, \cdot\})$ and choose a tripotent $e \in W$, i.e. an element of $W$ such that $\{e, e, e\} = e$. The endomorphism $e \,\square\, e \in \mathrm{End}(W)$ is semisimple and it satisfies the equation $(e \,\square\, e - 1)(2\, e \,\square\, e - 1)(e \,\square\, e) = 0$ (see [Sat80, p. 242]). Therefore, the possible eigenvalues of $e \,\square\, e$ are $1$, $1/2$ and $0$. We can furthermore choose $e$ in such a way that $0$ is not an eigenvalue, in which case $e$ is called principal (see [Sat80, Chap. V, §6, Ex. 5]). With this assumption, we get a decomposition
\[ W = W_1 \oplus W_{1/2} = W(e \,\square\, e; 1) \oplus W(e \,\square\, e; 1/2) \tag{118} \]
into eigenspaces for $e \,\square\, e$ relative to the eigenvalues $1$ and $1/2$, respectively. The complex vector space $W_1$ becomes a Jordan algebra with unit element $e \in W_1$ with respect to the Jordan product (see [Sat80, Chap. V, Prop. 6.1])
\[ a \circ b = \{a, e, b\} \quad \text{for any } a, b \in W_1. \tag{119} \]
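The decomposition (118) and the product (119) can be illustrated on a small example. The sketch below takes $W = M_{3,2}(\mathbb{C})$ with triple product $\{x, y, z\} = (x \overline{y}^t z + z \overline{y}^t x)/2$ and the tripotent $e$ whose top $2 \times 2$ block is the identity. Both the normalisation of the triple product and the choice of $e$ are assumptions made for illustration. In this model $e \,\square\, e$ should fix the top block and halve the bottom row, and $\{a, e, b\}$ should restrict to $(ab + ba)/2$ on top blocks.

```python
def matmul(a, b):
    """Product of rectangular matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    """Conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def triple(x, y, z):
    """Assumed type-I triple product {x, y, z} = (x y* z + z y* x)/2."""
    ys = adjoint(y)
    t1 = matmul(matmul(x, ys), z)
    t2 = matmul(matmul(z, ys), x)
    return [[(t1[i][j] + t2[i][j]) / 2 for j in range(len(t1[0]))]
            for i in range(len(t1))]

# Tripotent e = (I_2; 0) in M_{3,2}(C).
e = [[1, 0], [0, 1], [0, 0]]
is_tripotent = triple(e, e, e) == e

# e box e fixes the top 2x2 block (eigenvalue 1) and halves the last row
# (eigenvalue 1/2), as in (118).
x = [[1 + 2j, 3], [-1j, 4], [5, 2 - 1j]]
peirce_image = triple(e, e, x)
peirce_expected = [x[0], x[1], [v / 2 for v in x[2]]]
peirce_ok = peirce_image == peirce_expected

# On W_1 (last row zero) the product (119) is the symmetrised matrix product.
a = [[1, 2j], [0, 3], [0, 0]]
b = [[2, 1], [1j, -1], [0, 0]]
ab = matmul(a[:2], b[:2])
ba = matmul(b[:2], a[:2])
jordan_expected = [[(ab[i][j] + ba[i][j]) / 2 for j in range(2)] for i in range(2)]
product_119 = triple(a, e, b)
jordan_ok = product_119[:2] == jordan_expected and product_119[2] == [0, 0]
```

All entries are small Gaussian integers or half-integers, so the floating-point comparisons above are exact.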
Moreover, the map $a \mapsto \overline{a} := \{e, a, e\}$ is a $\mathbb{C}$-antilinear involution on $W_1$ (see [Sat80, Chap. V, Prop. 6.1]). Therefore
\[ \left( W_1^+ := \{a \in W_1 : \overline{a} = a\}, \circ \right) \tag{120} \]
is a real Jordan algebra which turns out to be Euclidean (see [Sat80, p. 254]). We will denote by $\Omega_W \subset W_1^+$ its associated symmetric cone (see Theorem 4.26). The Jordan algebra $(W_1, \circ)$ comes with a unital Jordan algebra homomorphism (see [Sat80, Chap. V, Prop. 6.2])
\[ 2R : W_1 \to \mathrm{End}(W_{1/2}), \qquad a \mapsto 2R_a \ \text{ s.t. } \ R_a(x) := \{a, e, x\}. \tag{121} \]
It can be checked (see [Sat80, p. 247, Eq. (6.21)]) that $R_a^* = R_{\overline{a}}$, where $R_a^*$ is the adjoint of $R_a$ with respect to the positive definite hermitian form $h = \tau/2$ on $W_{1/2}$, with $\tau$ equal to the trace form of $(W, \{\cdot, \cdot, \cdot\})$. Therefore, by restriction, we get a unital Jordan algebra homomorphism
\[ 2R : W_1^+ \to \mathrm{Herm}(W_{1/2}, h). \tag{122} \]
Consider now the Hermitian map
\[ H_W : W_{1/2} \times W_{1/2} \to W_1, \qquad (x, y) \mapsto H(x, y) := \{e, x, y\}. \tag{123} \]
It can be checked (see [Sat80, p. 247, Eq. (6.25)]) that the map $H_W$ and the unital Jordan algebra homomorphism $2R$ satisfy formula (112), i.e.
\[ \langle a, H_W(x, y) \rangle = 2h(x, R_a y) \quad \text{for any } x, y \in W_{1/2} \text{ and any } a \in W_1, \]
where $\langle\,,\rangle$ is the trace form of the Jordan algebra $(W_1, \circ)$. Therefore, from Lemma 4.39 it follows that $H_W$ is $\Omega$-positive and it satisfies Theorem 4.36(ii). Moreover, it can be checked (see [Sat80, p. 245, Eq. (6.15'')]) that $H_W$ satisfies Theorem 4.36(iii). Therefore, using Theorem 4.36, we infer that $S(W_1^+, W_{1/2}, \Omega_W, H_W)$ is a symmetric Siegel domain.

Theorem 4.43. Notations as above. There is a bijection
\[ \begin{aligned} \{\text{Symmetric Siegel spaces}\} &\xrightarrow{\ \cong\ } \{\text{Hermitian positive JTSs}\} \\ S = S(U, V, \Omega, H) &\longmapsto (U_{\mathbb{C}} \oplus V, \{\cdot, \cdot, \cdot\}_S) \\ S(W_1^+, W_{1/2}, \Omega_W, H_W) &\longleftarrow (W, \{\cdot, \cdot, \cdot\}) \end{aligned} \tag{124} \]
sending irreducible symmetric Siegel domains into simple Hermitian positive JTSs.

Proof. See [Sat80, Chap. V, §6]. □
For symmetric Siegel spaces of the first kind (or symmetric tube domains), the bijection of Theorem 4.43 assumes a particularly simple form. Indeed, let $\Omega \subset U$ be a symmetric cone and choose a base point $e \in \Omega$ as in (110). Consider the associated Jordan product $\circ_\Omega$ on $U$ (see Theorem 4.26). Then the Hermitian positive JTS $(U_{\mathbb{C}}, \{\cdot, \cdot, \cdot\}_S)$ associated to the Siegel domain of the first kind $S = S(\Omega, U) \subset U_{\mathbb{C}}$ (as in (108)) is the one associated to the Euclidean Jordan algebra $(U, \circ_\Omega)$ as in Example 4.22. Consider now the bounded symmetric domain $D_\Omega \subset U_{\mathbb{C}}$ (in its Harish–Chandra embedding) corresponding to the Hermitian positive JTS $(U_{\mathbb{C}}, \{\cdot, \cdot, \cdot\}_S)$ (see Theorems 2.30 and 2.42). It is possible to describe explicitly the biholomorphism between $S(\Omega, U)$ and $D_\Omega$, generalizing the Cayley transform in dimension one (see Example 2.29).

Theorem 4.44. Notations as above. Then we have the following biholomorphism (called the generalized Cayley transform)
\[ c : D_\Omega \xrightarrow{\ \cong\ } S(\Omega, U), \qquad w \mapsto i(e + w) \circ_\Omega (e - w)^{-1}. \tag{125} \]
The inverse is given by the map sending $z \in S(\Omega, U)$ into $(z - ie) \circ_\Omega (z + ie)^{-1} \in D_\Omega$.

More generally, generalized Cayley transforms have been defined for all symmetric Siegel spaces by Korányi–Wolf in [KW65, Chap. VI]. Looking at the last column of Table 10, it is easy to see which irreducible bounded symmetric domains are biholomorphic to symmetric tube domains (we call them bounded symmetric domains of tube type).

Corollary 4.45. The irreducible bounded symmetric domains of tube type are the following:
(1) $I_{n,n}$ for any $n \geq 1$;
(2) $II_{2n}$ for any $n \geq 1$;
(3) $III_n$ for any $n \geq 1$;
(4) $IV_n$ for any $2 \neq n \geq 1$;
(5) $VI$.

We have collected the irreducible bounded symmetric domains of tube type (avoiding repetitions in small dimension, see Table 7) in the following Table 11, together with their associated symmetric cones.

Table 11 Irreducible bounded symmetric domains of tube type

Symmetric cone | Bounded symmetric domain of tube type
--- | ---
$P(1, n-1)$ ($n \geq 2$) | $IV_n$ if $n \geq 3$; $IV_1$ if $n = 2$
$P_n(\mathbb{R}) = \mathrm{Herm}^{>0}_n(\mathbb{R})$ ($n \geq 3$) | $III_n$
$P_n(\mathbb{C}) = \mathrm{Herm}^{>0}_n(\mathbb{C})$ ($n \geq 3$) | $I_{n,n}$
$P_n(\mathbb{H}) = \mathrm{Herm}^{>0}_n(\mathbb{H})$ ($n \geq 3$) | $II_{2n}$
$P_3(\mathbb{O}) = \mathrm{Herm}^{>0}_3(\mathbb{O})$ | $VI$
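The generalized Cayley transform (125) and its inverse can be tested numerically in a small tube-type case. The sketch below assumes the type $III_2$ model, in which $U_{\mathbb{C}}$ is the space of complex symmetric $2 \times 2$ matrices, $e$ is the identity matrix, $D_\Omega$ consists of symmetric $w$ with $I - \overline{w}^t w$ positive definite, and $S(\Omega, U)$ consists of symmetric $z$ with positive definite imaginary part. Since $e + w$ and $(e - w)^{-1}$ commute, the Jordan product in (125) is computed here as an ordinary matrix product. The helper names are illustrative only.

```python
def mul(a, b):
    """Product of 2x2 matrices."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def inv(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def lin(ca, a, cb, b):
    """Entrywise linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

E = [[1, 0], [0, 1]]  # base point e of the cone

def cayley(w):
    """w -> i (e + w)(e - w)^{-1}, the matrix form of (125)."""
    quotient = mul(lin(1, E, 1, w), inv(lin(1, E, -1, w)))
    return lin(1j, quotient, 0, E)

def cayley_inverse(z):
    """z -> (z - i e)(z + i e)^{-1}."""
    return mul(lin(1, z, -1j, E), inv(lin(1, z, 1j, E)))

def is_positive_definite(m):
    """Sylvester criterion for a real symmetric 2x2 matrix."""
    return m[0][0] > 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0

# A complex symmetric matrix of Frobenius norm < 1, hence a point of the disc model.
w = [[0.3 + 0.1j, 0.2j], [0.2j, -0.1 + 0.4j]]
z = cayley(w)

z_is_symmetric = abs(z[0][1] - z[1][0]) < 1e-12
imaginary_part = [[z[i][j].imag for j in range(2)] for i in range(2)]
z_in_tube = z_is_symmetric and is_positive_definite(imaginary_part)

w_back = cayley_inverse(z)
roundtrip_error = max(abs(w_back[i][j] - w[i][j])
                      for i in range(2) for j in range(2))
```

For $n = 1$ the same functions reduce to the classical Cayley transform $w \mapsto i(1 + w)/(1 - w)$ from the unit disc to the upper half plane.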
4.5 Boundary Components of Irreducible Bounded Symmetric Domains

In this subsection, we describe explicitly the boundary components of each of the irreducible bounded symmetric domains (see Sect. 3). We begin with the following result.

Theorem 4.46. Let $D$ be an irreducible bounded symmetric domain and let $G = \mathrm{Aut}(D)^o$. Then all the boundary components of rank $k$ are conjugate under the group $G$.

Proof. See [Wol72, p. 292]. □
By virtue of the above theorem, it is enough to describe, for each irreducible symmetric domain $D$ of rank $r$ and each $0 \leq k < r$, a boundary component $F \subset \overline{D}$ of rank $k$.

4.5.1 Type $I_{p,q}$ ($p \geq q \geq 1$)

Every boundary component of $D_{I_{p,q}}$ of rank $k$ (with $0 \leq k < q$) is conjugate to the following boundary component
\[ D^k_{I_{p,q}} := \left\{ \begin{pmatrix} Z' & 0 \\ 0 & I_{q-k} \end{pmatrix} : Z' \in D_{I_{p-q+k,k}} \right\} \cong D_{I_{p-q+k,k}}, \tag{126} \]
which we call the standard boundary component of rank k. The pair .fDk ; Dk / Ip;q
Ip;q
of Lemma 4.6 associated to DIkp;q is equal to fDk
Ip;q
W ! DIkp;q z 7!
D k
Ip;q
0 0 ; 0 zIqk
(127)
W S1 SL2 .R/ ! SU.p; q/ D Hol.DIp;q /o 0 i e IpqCk B ab 0 e i ; 7 B ! @ 0 cd 0
0
0 0
0
i
aCd Ci.bc/ Iqk 2
e
bCci.ad / Iqk 2
1
0
Ik
0
bCcCi.ad / Iqk C 2 C:
0
A
aCd i.bc/ Iqk 2
(128)
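The four scalar coefficients that multiply $I_{q-k}$ in (128) are $\frac{a+d+i(b-c)}{2}$, $\frac{b+c+i(a-d)}{2}$ and their complex conjugates. The sketch below arranges them as a $2 \times 2$ matrix $\begin{pmatrix} \alpha & \beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$ (this arrangement is an assumption about the layout of the display) and checks that the resulting map from $SL_2(\mathbb{R})$ lands in $SU(1,1)$ and is multiplicative. It also evaluates the map on $\mathrm{diag}(t, t^{-1})$, which is the case relevant for the one-parameter subgroup considered next.

```python
def embed(a, b, c, d):
    """Scalar 2x2 pattern read off from (128) for (a b; c d) in SL_2(R)."""
    alpha = complex(a + d, b - c) / 2
    beta = complex(b + c, a - d) / 2
    return [[alpha, beta], [beta.conjugate(), alpha.conjugate()]]

def mul(g, h):
    """Product of 2x2 matrices."""
    return [[g[i][0] * h[0][j] + g[i][1] * h[1][j] for j in range(2)]
            for i in range(2)]

def det(g):
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def preserves_form(g, tol=1e-12):
    """Check g* J g = J for the Hermitian form J = diag(1, -1)."""
    J = [[1, 0], [0, -1]]
    g_star = [[g[j][i].conjugate() for j in range(2)] for i in range(2)]
    lhs = mul(mul(g_star, J), g)
    return all(abs(lhs[i][j] - J[i][j]) < tol
               for i in range(2) for j in range(2))

# An element of SL_2(R): 2*2 - 3*1 = 1.
g1 = embed(2, 3, 1, 2)
in_su11 = preserves_form(g1) and abs(det(g1) - 1) < 1e-12

# Multiplicativity on one example: (2 3; 1 2)(1 1; 0 1) = (2 5; 1 3).
lhs = mul(g1, embed(1, 1, 0, 1))
rhs = embed(2, 5, 1, 3)
is_multiplicative = all(abs(lhs[i][j] - rhs[i][j]) < 1e-12
                        for i in range(2) for j in range(2))

# Image of diag(t, 1/t) for t = 2: entries (t + 1/t)/2 and +-i(t - 1/t)/2.
torus = embed(2, 0, 0, 0.5)
```

The identity $|\alpha|^2 - |\beta|^2 = ad - bc$ is what makes the first check work for every matrix of determinant one.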
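In particular, the one-parameter subgroup of $SU(p, q)$ associated to $D^k_{I_{p,q}}$ as in (86) is given by
\[ w_{D^k_{I_{p,q}}} : \mathbb{G}_m \to SU(p, q), \qquad t \mapsto \begin{pmatrix} I_{p-q+k} & 0 & 0 & 0 \\ 0 & \frac{t + t^{-1}}{2} I_{q-k} & 0 & \frac{i(t - t^{-1})}{2} I_{q-k} \\ 0 & 0 & I_k & 0 \\ 0 & -\frac{i(t - t^{-1})}{2} I_{q-k} & 0 & \frac{t + t^{-1}}{2} I_{q-k} \end{pmatrix}. \tag{129} \]
Using the above explicit expression of $w_{D^k_{I_{p,q}}}$ and Theorem 4.8, we can compute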
the Levi subgroup of the normalizer N.DIkp;q / subgroup of DIkp;q together with its decomposition as in Theorem 4.8(iv) Z.wDk /o Ip;q
80 A 0 ˆ ˆ ˆ > = 2 U.p q C k; k/>
E 2 GLqk .C/
> > > ;
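\[ = SU(p - q + k, k) \times GL^o_{q-k}(\mathbb{C}) \times S^1 = G_h(D^k_{I_{p,q}}) \times G_l(D^k_{I_{p,q}}) \times M(D^k_{I_{p,q}}), \tag{130} \]
where $GL^o_{q-k}(\mathbb{C}) := \{E \in GL_{q-k}(\mathbb{C}) : \det E \in \mathbb{R}^*\} \subset GL_{q-k}(\mathbb{C})$.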
Similarly, using again the above explicit expression of $w_{D^k_{I_{p,q}}}$ and Theorem 4.8, we can compute the unipotent radical of $N(D^k_{I_{p,q}})$:
W .DIkp;q / 80 9 1 F1 2 MpqCk;qk .C/; F2 2 Mk;qk .C/> ˆ IpqCk F1 0 iF1 ˆ > ˆ > t t A @ 0 iF2 Ik F2 ˆ > ˆ > t t t t t : ; F1 F1 F2 F2 D i.M M / iF1 M F2 Iqk iM
(131)

Moreover, the center $U(D^k_{I_{p,q}})$ of $W(D^k_{I_{p,q}})$ is equal to the set of all matrices of $W(D^k_{I_{p,q}})$ such that $F_1 = F_2 = 0$, and is therefore isomorphic to the abelian unipotent Lie group underlying the vector space $\mathrm{Herm}_{q-k}(\mathbb{C})$.

The orbit of the point $o_{D^k_{I_{p,q}}} = f_{D^k_{I_{p,q}}}(1) = \begin{pmatrix} 0 & 0 \\ 0 & I_{q-k} \end{pmatrix}$ under the natural action of the group $G_h(D^k_{I_{p,q}}) = SU(p - q + k, k)$ is equal to $D^k_{I_{p,q}}$ and its stabilizer subgroup is isomorphic to $K_h(D^k_{I_{p,q}}) = SU(p - q + k, k) \cap S(U_p \times U_q) = S(U_{p-q+k} \times U_k)$. Therefore, $D^k_{I_{p,q}}$ is a bounded symmetric domain of type $I_{p-q+k,k}$ and it is diffeomorphic to
\[ D^k_{I_{p,q}} \cong \frac{G_h(D^k_{I_{p,q}})}{K_h(D^k_{I_{p,q}})} = \frac{SU(p - q + k, k)}{S(U_{p-q+k} \times U_k)}. \tag{132} \]
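The action of $G_l(D^k_{I_{p,q}}) = GL^o_{q-k}(\mathbb{C})$ on $U(D^k_{I_{p,q}}) \cong \mathrm{Herm}_{q-k}(\mathbb{C})$ is given by $(E, M) \mapsto E M \overline{E}^t$. Under this action, the orbit of the point $\Omega_{D^k_{I_{p,q}}} = \frac{1}{2} I_{q-k} \in \mathrm{Herm}_{q-k}(\mathbb{C})$ is equal to the cone $P_{q-k}(\mathbb{C})$ of positive definite Hermitian complex quadratic forms of size $q - k$, and its stabilizer subgroup is equal to $K_l(D^k_{I_{p,q}}) = GL^o_{q-k}(\mathbb{C}) \cap S(U_p \times U_q) = U^o(q - k)$, where
\[ U^o(q - k) := \{E \in U(q - k) : \det E = \pm 1\}. \]
Therefore, $C(D^k_{I_{p,q}})$ is equal to the symmetric cone $P_{q-k}(\mathbb{C})$ (see Theorem 4.28) and it is diffeomorphic to
\[ C(D^k_{I_{p,q}}) \cong \frac{G_l(D^k_{I_{p,q}})}{K_l(D^k_{I_{p,q}})} = \frac{GL^o_{q-k}(\mathbb{C})}{U^o(q - k)} = \frac{GL_{q-k}(\mathbb{C})}{U(q - k)} = P_{q-k}(\mathbb{C}). \tag{133} \]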
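The orbit and stabilizer statements behind (133) can be illustrated for $q - k = 2$. The following minimal sketch applies the action $(E, M) \mapsto E M \overline{E}^t$ to the base point $\frac{1}{2} I_2$. It checks that an element $E$ with real nonzero determinant sends the base point to a positive definite Hermitian matrix, and that a unitary matrix of determinant $-1$ fixes it. The matrices chosen are arbitrary examples, and the helper names are not from the source.

```python
def mul(a, b):
    """Product of 2x2 matrices."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def act(E, M):
    """The action (E, M) -> E M E* on Hermitian matrices."""
    return mul(mul(E, M), adjoint(E))

def is_hermitian(m, tol=1e-12):
    return all(abs(m[i][j] - m[j][i].conjugate()) < tol
               for i in range(2) for j in range(2))

def is_positive_definite(m):
    """Sylvester criterion for a Hermitian 2x2 matrix."""
    minor = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    return complex(m[0][0]).real > 0 and minor > 0

omega = [[0.5, 0], [0, 0.5]]  # base point (1/2) I_2

# det E = 2 is real and nonzero, so E lies in the group GL^o_2(C) of (130).
E = [[2, 1j], [0, 1]]
image = act(E, omega)
image_in_cone = is_hermitian(image) and is_positive_definite(image)

# A unitary matrix with determinant -1 stabilizes the base point.
swap = [[0, 1], [1, 0]]
stabilized = act(swap, omega) == omega
```

Here `image` equals $\frac{1}{2} E \overline{E}^t$, which is the general form of a point in the orbit.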
4.5.2 Type $II_n$

Every boundary component of $D_{II_n}$ of rank $k$ (with $0 \leq k < \lfloor \frac{n}{2} \rfloor$) is conjugate to the following boundary component (which we call the standard boundary component of rank $k$)
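\[ D^k_{II_n} := \left\{ \begin{pmatrix} Z' & 0 \\ 0 & E_{n-2k-\varepsilon} \end{pmatrix} : Z' \in D_{II_{2k+\varepsilon}} \right\} \cong D_{II_{2k+\varepsilon}}, \tag{134} \]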
where $\varepsilon = 0$ (resp. $1$) if $n$ is even (resp. odd) and, for any $m \in \mathbb{N}$, we denote by $E_{2m}$ the $2m \times 2m$ matrix formed by $m$ diagonal blocks of the form $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.

The decomposition into factors (as in Theorem 4.8(iv)) of the Levi subgroup $L(D^k_{II_n})$ of the normalizer subgroup $N(D^k_{II_n})$ of $D^k_{II_n}$ is given by (see [Sat80, p. 116]) $L(D^k_{II_n}) = G_h(D^k_{II_n}) \times G_l(D^k_{II_n}) \times M(D^k_{II_n})$ 8 ˆ f1g GL n2 .H/ f1g if k D 0 and n is even; ˆ ˆ ˆ ˆ 1 ˆ .H/ S if k D 0 and n is odd; ˆ = > > > ;
(141)
9 80 1 > ˆ F 0 iF Ir > ˆ ˆ t = M C BF Ink C iM iF t k W .DIII n / D B : CW t > ˆ A F F F t F D i.M t M / @ 0 Ik F iF > ˆ > ˆ t ; : iF M F t Ink iM
(142)

Moreover, the center $U(D^k_{III_n})$ of $W(D^k_{III_n})$ is equal to the set of all matrices of $W(D^k_{III_n})$ such that $F = 0$, and is therefore isomorphic to the abelian unipotent Lie group underlying the vector space $\mathrm{Herm}_{n-k}(\mathbb{R})$.
The orbit of the point $o_{D^k_{III_n}} = f_{D^k_{III_n}}(1) = \begin{pmatrix} 0 & 0 \\ 0 & I_{n-k} \end{pmatrix}$ under the natural action of the group $G_h(D^k_{III_n}) = Sp^{nc}(k)$ is equal to $D^k_{III_n}$ and its stabilizer subgroup is isomorphic to $K_h(D^k_{III_n}) = Sp^{nc}(k) \cap U(n) = U(k)$. Therefore, $D^k_{III_n}$ is a bounded symmetric domain of type $III_k$ and it is diffeomorphic to
\[ D^k_{III_n} \cong \frac{G_h(D^k_{III_n})}{K_h(D^k_{III_n})} = \frac{Sp^{nc}(k)}{U(k)}. \tag{143} \]
The action of $G_l(D^k_{III_n}) = GL_{n-k}(\mathbb{R})$ on $U(D^k_{III_n}) \cong \mathrm{Herm}_{n-k}(\mathbb{R})$ is given by $(E, M) \mapsto E M E^t$. Under this action, the orbit of the point $\Omega_{D^k_{III_n}} = \frac{1}{2} I_{n-k} \in \mathrm{Herm}_{n-k}(\mathbb{R})$ is equal to the cone $P_{n-k}(\mathbb{R})$ of positive definite real quadratic forms of size $n - k$, and its stabilizer subgroup is isomorphic to $K_l(D^k_{III_n}) = GL_{n-k}(\mathbb{R}) \cap U(n) = O(n - k)$. Therefore, $C(D^k_{III_n})$ is equal to the symmetric cone $P_{n-k}(\mathbb{R})$ (see Theorem 4.28) and it is diffeomorphic to
\[ C(D^k_{III_n}) \cong \frac{G_l(D^k_{III_n})}{K_l(D^k_{III_n})} = \frac{GL_{n-k}(\mathbb{R})}{O(n - k)} = P_{n-k}(\mathbb{R}). \tag{144} \]
4.5.4 Type IV n .n 3/ The standard boundary component of DIV of rank k (for k D 0; 1) are given by 0 WD f.i; 0; : : : ; 0/t 2 Cn g; DIV n t 1Cz 1Cz 1 n ; ; 0; : : : ; 0 DIV WD i 2 C W jzj < 1 ; n 1z 1z
(145)
see [Wol72, p. 355]. The decomposition into factors (as in Theorem 4.8(iv)) of the Levi subgroup $L(D^k_{IV_n})$ of the normalizer subgroup $N(D^k_{IV_n})$ of $D^k_{IV_n}$ is given by (see [Sat80, p. 117])
\[ L(D^k_{IV_n}) = G_h(D^k_{IV_n}) \times G_l(D^k_{IV_n}) \times M(D^k_{IV_n}) = \begin{cases} SO(1, 1) \times GL_1(\mathbb{R}) \times SO(n - 2) & \text{if } k = 1, \\ \{1\} \times (SO(n - 1, 1) \times \mathbb{R}^*) \times \{1\} & \text{if } k = 0. \end{cases} \tag{146} \]
Therefore, $D^k_{IV_n}$ is a bounded symmetric domain of type $IV_1 = I_{1,1}$ if $k = 1$ and it is a point if $k = 0$. Moreover, the symmetric cone associated to $D^k_{IV_n}$ is equal to (see Theorem 4.28)
\[ C(D^k_{IV_n}) = \begin{cases} P(1, 0) & \text{if } k = 1, \\ P(1, n - 1) & \text{if } k = 0. \end{cases} \tag{147} \]
4.5.5 Type $V$

Using Remark 4.11, denote by $D^k_V$ (for $k = 0, 1$) the boundary component of $D_V$ of rank $k$ such that
\[ o_{D^k_V} = \begin{cases} \begin{pmatrix} 0 \\ 1 \end{pmatrix} & \text{if } k = 1, \\[2ex] \begin{pmatrix} 1 \\ 1 \end{pmatrix} & \text{if } k = 0. \end{cases} \]
We call $D^k_V$ the standard boundary component of $D_V$ of rank $k$. The decomposition into factors (as in Theorem 4.8(iv)) of the Levi subgroup $L(D^k_V)$ of the normalizer subgroup $N(D^k_V)$ of $D^k_V$ is given by (see [Sat80, p. 117])
\[ L(D^k_V) = G_h(D^k_V) \times G_l(D^k_V) \times M(D^k_V) = \begin{cases} SU(5, 1) \times GL_1(\mathbb{R}) \times \{1\} & \text{if } k = 1, \\ \{1\} \times (SO(7, 1) \times \mathbb{R}^*) \times S^1 & \text{if } k = 0. \end{cases} \tag{148} \]
Therefore, $D^k_V$ is a bounded symmetric domain of type $I_{5,1}$ if $k = 1$ and it is a point if $k = 0$. Moreover, the symmetric cone associated to $D^k_V$ is equal to (see Theorem 4.28)
\[ C(D^k_V) = \begin{cases} P(1, 0) & \text{if } k = 1, \\ P(1, 7) & \text{if } k = 0. \end{cases} \tag{149} \]
4.5.6 Type $VI$

Using Remark 4.11, denote by $D^k_{VI}$ (for $k = 0, 1, 2$) the boundary component of $D_{VI}$ of rank $k$ such that
\[ o_{D^k_{VI}} = \begin{pmatrix} 0 & 0 \\ 0 & I_{3-k} \end{pmatrix}. \]
We call $D^k_{VI}$ the standard boundary component of $D_{VI}$ of rank $k$. The decomposition into factors (as in Theorem 4.8(iv)) of the Levi subgroup $L(D^k_{VI})$ of the normalizer subgroup $N(D^k_{VI})$ of $D^k_{VI}$ is given by (see [Sat80, p. 117])
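\[ L(D^k_{VI}) = G_h(D^k_{VI}) \times G_l(D^k_{VI}) \times M(D^k_{VI}) = \begin{cases} SO(2, 10) \times GL_1(\mathbb{R}) \times \{1\} & \text{if } k = 2, \\ SU(1, 1) \times (SO(9, 1) \times \mathbb{R}^*) \times \{1\} & \text{if } k = 1, \\ \{1\} \times (E_{6(-26)} \times \mathbb{R}^*) \times \{1\} & \text{if } k = 0. \end{cases} \tag{150} \]
Therefore, $D^k_{VI}$ is a bounded symmetric domain of type $IV_{10}$ if $k = 2$, of type $I_{1,1}$ if $k = 1$ and it is a point if $k = 0$. Moreover, the symmetric cone associated to $D^k_{VI}$ is equal to (see Theorem 4.28)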
8 ˆ ˆ