This book provides a thorough introduction to the theory of classical integrable systems, discussing the various approaches to the subject and explaining their interrelations. The book begins by introducing the central ideas of the theory of integrable systems, based on Lax representations, loop groups and Riemann surfaces. These ideas are then illustrated with detailed studies of model systems. The connection between isomonodromic deformation and integrability is discussed, and integrable field theories are covered in detail. The KP, KdV and Toda hierarchies are explained using the notion of Grassmannian, vertex operators and pseudo-differential operators. A chapter is devoted to the inverse scattering method and three complementary chapters cover the necessary mathematical tools from symplectic geometry, Riemann surfaces and Lie algebras. The book contains many worked examples and is suitable for use as a textbook on graduate courses. It also provides a comprehensive reference for researchers already working in the field.

Olivier Babelon has been a member of the Centre National de la Recherche Scientifique (CNRS) since 1978. He works at the Laboratoire de Physique Théorique et Hautes Energies (LPTHE) at the University of Paris VI-Paris VII. His main fields of interest are particle physics, gauge theories and integrable systems.

Michel Talon has been a member of the CNRS since 1977. He works at the LPTHE at the University of Paris VI-Paris VII. He is involved in the computation of radiative corrections and anomalies in gauge theories and integrable systems.

Denis Bernard has been a member of the CNRS since 1988. He currently works at the Service de Physique Théorique de Saclay. His main fields of interest are conformal field theories and integrable systems, and other aspects of statistical field theories, including statistical turbulence.
CAMBRIDGE MONOGRAPHS ON MATHEMATICAL PHYSICS
General editors: P. V. Landshoff, D. R. Nelson, S. Weinberg

J. Ambjørn, B. Durhuus and T. Jonsson, Quantum Geometry: A Statistical Field Theory Approach
A. M. Anile, Relativistic Fluids and Magneto-Fluids
J. A. de Azcárraga and J. M. Izquierdo, Lie Groups, Lie Algebras, Cohomology and Some Applications in Physics†
O. Babelon, D. Bernard and M. Talon, Introduction to Classical Integrable Systems
V. Belinski and E. Verdaguer, Gravitational Solitons
J. Bernstein, Kinetic Theory in the Early Universe
G. F. Bertsch and R. A. Broglia, Oscillations in Finite Quantum Systems
N. D. Birrell and P. C. W. Davies, Quantum Fields in Curved Space†
M. Burgess, Classical Covariant Fields
S. Carlip, Quantum Gravity in 2 + 1 Dimensions
J. C. Collins, Renormalization†
M. Creutz, Quarks, Gluons and Lattices†
P. D. D’Eath, Supersymmetric Quantum Cosmology
F. de Felice and C. J. S. Clarke, Relativity on Curved Manifolds†
P. G. O. Freund, Introduction to Supersymmetry†
J. Fuchs, Affine Lie Algebras and Quantum Groups†
J. Fuchs and C. Schweigert, Symmetries, Lie Algebras and Representations: A Graduate Course for Physicists
Y. Fujii and K. Maeda, The Scalar Tensor Theory of Gravitation
A. S. Galperin, E. A. Ivanov, V. I. Ogievetsky and E. S. Sokatchev, Harmonic Superspace
R. Gambini and J. Pullin, Loops, Knots, Gauge Theories and Quantum Gravity†
M. Göckeler and T. Schücker, Differential Geometry, Gauge Theories and Gravity†
C. Gómez, M. Ruiz Altaba and G. Sierra, Quantum Groups in Two-dimensional Physics
M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory, volume 1: Introduction†
M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory, volume 2: Loop Amplitudes, Anomalies and Phenomenology†
S. W. Hawking and G. F. R. Ellis, The Large-Scale Structure of Space-Time†
F. Iachello and A. Arima, The Interacting Boson Model
F. Iachello and P. van Isacker, The Interacting Boson–Fermion Model
C. Itzykson and J.-M. Drouffe, Statistical Field Theory, volume 1: From Brownian Motion to Renormalization and Lattice Gauge Theory†
C. Itzykson and J.-M. Drouffe, Statistical Field Theory, volume 2: Strong Coupling, Monte Carlo Methods, Conformal Field Theory, and Random Systems†
C. V. Johnson, D-Branes
J. I. Kapusta, Finite-Temperature Field Theory†
V. E. Korepin, A. G. Izergin and N. M. Bogoliubov, The Quantum Inverse Scattering Method and Correlation Functions†
M. Le Bellac, Thermal Field Theory†
Y. Makeenko, Methods of Contemporary Gauge Theory
N. H. March, Liquid Metals: Concepts and Theory
I. M. Montvay and G. Münster, Quantum Fields on a Lattice†
A. Ozorio de Almeida, Hamiltonian Systems: Chaos and Quantization†
R. Penrose and W. Rindler, Spinors and Space-time, volume 1: Two-Spinor Calculus and Relativistic Fields†
R. Penrose and W. Rindler, Spinors and Space-time, volume 2: Spinor and Twistor Methods in Space-Time Geometry†
S. Pokorski, Gauge Field Theories, 2nd edition
J. Polchinski, String Theory, volume 1: An Introduction to the Bosonic String
J. Polchinski, String Theory, volume 2: Superstring Theory and Beyond
V. N. Popov, Functional Integrals and Collective Excitations†
R. G. Roberts, The Structure of the Proton†
J. M. Stewart, Advanced General Relativity†
A. Vilenkin and E. P. S. Shellard, Cosmic Strings and Other Topological Defects†
R. S. Ward and R. O. Wells Jr, Twistor Geometry and Field Theories†

† Issued as a paperback
Introduction to Classical Integrable Systems

OLIVIER BABELON
Laboratoire de Physique Théorique et Hautes Energies, Universités Paris VI–VII

DENIS BERNARD
Service de Physique Théorique de Saclay, Gif-sur-Yvette

MICHEL TALON
Laboratoire de Physique Théorique et Hautes Energies, Universités Paris VI–VII
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press
The Edinburgh Building, Cambridge, United Kingdom
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521822671

© O. Babelon, D. Bernard & M. Talon 2003

This book is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2003

ISBN-13 978-0-511-07050-1 eBook (EBL)
ISBN-10 0-511-07050-0 eBook (EBL)
ISBN-13 978-0-521-82267-1 hardback
ISBN-10 0-521-82267-X hardback
Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this book, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Contents

1 Introduction

2 Integrable dynamical systems
2.1 Introduction
2.2 The Liouville theorem
2.3 Action–angle variables
2.4 Lax pairs
2.5 Existence of an r-matrix
2.6 Commuting flows
2.7 The Kepler problem
2.8 The Euler top
2.9 The Lagrange top
2.10 The Kowalevski top
2.11 The Neumann model
2.12 Geodesics on an ellipsoid
2.13 Separation of variables in the Neumann model

3 Synopsis of integrable systems
3.1 Examples of Lax pairs with spectral parameter
3.2 The Zakharov–Shabat construction
3.3 Coadjoint orbits and Hamiltonian formalism
3.4 Elementary flows and wave function
3.5 Factorization problem
3.6 Tau-functions
3.7 Integrable field theories and monodromy matrix
3.8 Abelianization
3.9 Poisson brackets of the monodromy matrix
3.10 The group of dressing transformations
3.11 Soliton solutions

4 Algebraic methods
4.1 The classical and modified Yang–Baxter equations
4.2 Algebraic meaning of the classical Yang–Baxter equations
4.3 Adler–Kostant–Symes scheme
4.4 Construction of integrable systems
4.5 Solving by factorization
4.6 The open Toda chain
4.7 The r-matrix of the Toda models
4.8 Solution of the open Toda chain
4.9 Toda system and Hamiltonian reduction
4.10 The Lax pair of the Kowalevski top

5 Analytical methods
5.1 The spectral curve
5.2 The eigenvector bundle
5.3 The adjoint linear system
5.4 Time evolution
5.5 Theta-functions formulae
5.6 Baker–Akhiezer functions
5.7 Linearization and the factorization problem
5.8 Tau-functions
5.9 Symplectic form
5.10 Separation of variables and the spectral curve
5.11 Action–angle variables
5.12 Riemann surfaces and integrability
5.13 The Kowalevski top
5.14 Infinite-dimensional systems

6 The closed Toda chain
6.1 The model
6.2 The spectral curve
6.3 The eigenvectors
6.4 Reconstruction formula
6.5 Symplectic structure
6.6 The Sklyanin approach
6.7 The Poisson brackets
6.8 Reality conditions

7 The Calogero–Moser model
7.1 The spin Calogero–Moser model
7.2 Lax pair
7.3 The r-matrix
7.4 The scalar Calogero–Moser model
7.5 The spectral curve
7.6 The eigenvector bundle
7.7 Time evolution
7.8 Reconstruction formulae
7.9 Symplectic structure
7.10 Poles systems and double-Bloch condition
7.11 Hitchin systems
7.12 Examples of Hitchin systems
7.13 The trigonometric Calogero–Moser model

8 Isomonodromic deformations
8.1 Introduction
8.2 Monodromy data
8.3 Isomonodromy and the Riemann–Hilbert problem
8.4 Isomonodromic deformations
8.5 Schlesinger transformations
8.6 Tau-functions
8.7 Riccati equation
8.8 Sato’s formula
8.9 The Hirota equations
8.10 Tau-functions and theta-functions
8.11 The Painlevé equations

9 Grassmannian and integrable hierarchies
9.1 Introduction
9.2 Fermions and GL(∞)
9.3 Boson–fermion correspondence
9.4 Tau-functions and Hirota bilinear identities
9.5 The KP hierarchy and its soliton solutions
9.6 Fermions and Grassmannians
9.7 Schur polynomials
9.8 From fermions to pseudo-differential operators
9.9 The Segal–Wilson approach

10 The KP hierarchy
10.1 The algebra of pseudo-differential operators
10.2 The KP hierarchy
10.3 The Baker–Akhiezer function of KP
10.4 Algebro-geometric solutions of KP
10.5 The tau-function of KP
10.6 The generalized KdV equations
10.7 KdV Hamiltonian structures
10.8 Bihamiltonian structure
10.9 The Drinfeld–Sokolov reduction
10.10 Whitham equations
10.11 Solution of the Whitham equations

11 The KdV hierarchy
11.1 The KdV equation
11.2 The KdV hierarchy
11.3 Hamiltonian structures and Virasoro algebra
11.4 Soliton solutions
11.5 Algebro-geometric solutions
11.6 Finite-zone solutions
11.7 Action–angle variables
11.8 Analytical description of solitons
11.9 Local fields
11.10 Whitham’s equations

12 The Toda field theories
12.1 The Liouville equation
12.2 The Toda systems and their zero-curvature representations
12.3 Solution of the Toda field equations
12.4 Hamiltonian formalism
12.5 Conformal structure
12.6 Dressing transformations
12.7 The affine sinh-Gordon model
12.8 Dressing transformations and soliton solutions
12.9 N-soliton dynamics
12.10 Finite-zone solutions

13 Classical inverse scattering method
13.1 The sine-Gordon equation
13.2 The Jost solutions
13.3 Inverse scattering as a Riemann–Hilbert problem
13.4 Time evolution of the scattering data
13.5 The Gelfand–Levitan–Marchenko equation
13.6 Soliton solutions
13.7 Poisson brackets of the scattering data
13.8 Action–angle variables

14 Symplectic geometry
14.1 Poisson manifolds and symplectic manifolds
14.2 Coadjoint orbits
14.3 Symmetries and Hamiltonian reduction
14.4 The case M = T*G
14.5 Poisson–Lie groups
14.6 Action of a Poisson–Lie group on a symplectic manifold
14.7 The groups G and G*
14.8 The group of dressing transformations

15 Riemann surfaces
15.1 Smooth algebraic curves
15.2 Hyperelliptic curves
15.3 The Riemann–Hurwitz formula
15.4 The field of meromorphic functions of a Riemann surface
15.5 Line bundles on a Riemann surface
15.6 Divisors
15.7 Chern class
15.8 Serre duality
15.9 The Riemann–Roch theorem
15.10 Abelian differentials
15.11 Riemann bilinear identities
15.12 Jacobi variety
15.13 Theta-functions
15.14 The genus 1 case
15.15 The Riemann–Hilbert factorization problem

16 Lie algebras
16.1 Lie groups and Lie algebras
16.2 Semi-simple Lie algebras
16.3 Linear representations
16.4 Real Lie algebras
16.5 Affine Kac–Moody algebras
16.6 Vertex operator representations

Index
1 Introduction
The aim of this book is to introduce the reader to classical integrable systems. Because the subject has been developed by several schools having different perspectives, it may appear fragmented at first sight. We develop here the thesis that it has a profound unity and that the various approaches are simply changes of point of view on the same underlying reality. The more one understands each approach, the more one sees their unity. At the end one gets a very small set of interconnected methods. This fundamental fact sets the tone of the book. We hope in this way to convey to the reader the extraordinary beauty of the structures emerging in this field, which have illuminated many other branches of theoretical physics.

The field of integrable systems was born together with Classical Mechanics, with a quest for exact solutions to Newton’s equations of motion. It turned out that, apart from the Kepler problem which was solved by Newton himself, only a handful of other cases were found after two centuries of hard investigations. In the nineteenth century, Liouville finally provided a general framework characterizing the cases where the equations of motion are “solvable by quadratures”. All examples previously found indeed pertained to this setting. The subject stayed dormant until the second half of the twentieth century when Gardner, Greene, Kruskal and Miura invented the Classical Inverse Scattering Method for the Korteweg–de Vries equation, which had been introduced in fluid mechanics. Soon afterwards, the Lax formulation was discovered, and the connection with integrability was unveiled by Faddeev, Zakharov and Gardner. This was the signal for a revival of the domain leading to an enormous amount of results, and truly general structures emerged which organized the subject. More recently, the extension of these results to Quantum Mechanics has already led to remarkable results and is still a very active field of research.
Let us give a general overview of the ideas we present in this book. They all find their roots in the notion of Lax pairs. This notion consists of presenting the equations of motion of the system in the form $\dot L(\lambda) = [M(\lambda), L(\lambda)]$, where the matrices $L(\lambda)$ and $M(\lambda)$ depend on the dynamical variables and on a parameter $\lambda$ called the spectral parameter, and $[\ ,\ ]$ denotes the commutator of matrices. The importance of Lax pairs stems from the following simple remark: the Lax equation is an isospectral evolution equation for the Lax matrix $L(\lambda)$. It follows that the curve defined by the equation $\det(L(\lambda) - \mu I) = 0$ is time-independent. This curve, called the spectral curve, can be seen as a Riemann surface. Its moduli contain the conserved quantities. This immediately introduces the two main structures into the theory: groups enter through the Lie algebra involved in the commutator $[M, L]$, while complex analysis enters through the spectral curve.

As integrable systems are rather rare, one naturally expects strong constraints on the matrices $L(\lambda)$ and $M(\lambda)$. Constructing consistent Lax matrices may be achieved by appealing to factorization problems in appropriate groups. Taking into account the spectral parameter promotes this group to a loop group. The factorization problem may then be viewed as a Riemann–Hilbert problem, a central tool of this subject.

In the group theoretical setting, solving the equations of motion amounts to solving the factorization problem. In the analytical setting, solutions are obtained by considering the eigenvectors of the Lax matrix. At any point of the spectral curve there exists an eigenvector of $L(\lambda)$ with eigenvalue $\mu$. This defines an analytic line bundle $\mathcal{L}$ on the spectral curve with prescribed Chern class. The time evolution is described as follows: if $\mathcal{L}(t)$ is the line bundle at time $t$ then $\mathcal{L}(t)\mathcal{L}^{-1}(0)$ is of Chern class 0, i.e. is a point on the Jacobian of the spectral curve. It is a beautiful result that this point evolves linearly on the Jacobian. As a consequence, one can express the dynamical variables in terms of theta-functions defined on the Jacobian of the spectral curve. The two methods are related as follows: the factorization problem in the loop group defines transition functions for the line bundle $\mathcal{L}$.

The framework can be generalized by replacing the Lax matrix by the first order differential equation $(\partial_\lambda - M_\lambda(\lambda))\Psi = 0$, where $M_\lambda(\lambda)$ depends rationally on $\lambda$. The solution $\Psi$ acquires non-trivial monodromy when $\lambda$ describes a loop around a pole of $M_\lambda$. The isomonodromy problem consists of finding all $M_\lambda$ with prescribed monodromy data. The solutions depend, in general, on a number of continuous parameters. The deformation equations with respect to these parameters form an integrable system. The theta-functions of the isospectral approach are then promoted to more general objects called the tau-functions.
One can study the behaviour around each singularity of the differential operator quite independently. In the group theoretical version, the above extension of the framework corresponds to centrally extending the loop groups. Around a singularity the most general extended group is the group GL(∞) which corresponds to the KP hierarchy. It can be represented in a fermionic Fock space. Fermionic monomials acting on the vacuum yield decomposed vectors, which describe an infinite Grassmannian introduced by Sato. In this setting, the time flows are induced by the action of commuting one-parameter subgroups, and the tau-function is defined on the Grassmannian, i.e. the orbit of the vacuum, and characterizes it. Finally the Plücker equations of the Grassmannian are identified with the equations of motion, written in the bilinear Hirota form.

We have tried, as much as possible, to make the book self-contained, and to ensure that each chapter can be studied quite independently. Generally, we first explain methods and then show how they can be applied to particular examples, even though this does not correspond to the historical development of the subject.

In Chapter 2 we introduce the classical definition of integrable systems through the Liouville theorem. We present the Lax pair formulation, and describe the symplectic structure which is encoded into the so-called r-matrix form. In Chapter 3 we explain how to construct Lax pairs with spectral parameter, for finite and infinite-dimensional systems. The Lax matrix may be viewed as an element of a coadjoint orbit of a loop group. This introduces immediately a natural symplectic structure and a factorization problem in the loop group. We also introduce, at this early stage, the notion of tau-functions. In Chapter 4 we discuss the abstract group theoretical formulation of the theory. We then describe the analytical aspects of the theory in Chapter 5.
In this setting, the action variables are g moduli of the spectral curve, a Riemann surface of genus g, and the angle variables are g points on it. We illustrate the general constructions by the examples of the closed Toda chain in Chapter 6, and the Calogero model in Chapter 7. The following two Chapters, 8 and 9, describe respectively the isomonodromic deformation problem and the infinite Grassmannian. Soliton solutions are obtained using vertex operators. Chapters 10 and 11 are devoted to the classical study of the KP and KdV hierarchies. We develop and use the formalism of pseudo-differential operators which allows us to give simple proofs of the main formal properties. Finite-zone solutions of KdV allow us to make contact with integrable systems of finite dimensionality and soliton solutions. In the next Chapter, 12, we study the class of Toda and sine-Gordon field theories. We use this opportunity to exhibit the relations between
their conformal and integrable properties. The sine-Gordon model is presented in the framework of the Classical Inverse Scattering Method in Chapter 13. This very ingenious method is exploited to solve the sine-Gordon equation.

The last three chapters may be viewed as mathematical appendices, provided to help the reader. First we present the basic facts of symplectic geometry, which is the natural language to speak about Classical Mechanics and integrable systems. Since mathematical tools from Riemann surfaces and Lie groups are used almost everywhere, we have written two chapters presenting them in a concise way. We hope that they will be useful at least as an introduction and to fix notations.

Let us say briefly how we have limited our discussion. First we choose to remain consistently at a relatively elementary mathematical level, and have been obliged to exclude some important developments which require more advanced mathematics. We put the emphasis on methods and we have not tried to make an exhaustive list of integrable systems. Another aspect of the theory we have touched only very briefly, through the Whitham equations, is the study of perturbations of integrable systems. All these subjects are very interesting by themselves, but the present book is big enough!

A most active field of recent research is concerned with quantum integrable systems or the closely related field of exactly soluble models in statistical mechanics. When writing this book we always had the quantum theory present in mind, and have introduced all classical objects which have a well-known quantum counterpart, or are semi-classical limits of quantum objects. This explains our emphasis on Hamiltonian methods, Poisson brackets, classical r-matrices, Lie–Poisson properties of dressing transformations and the method of separation of variables.
Although there is nothing quantum in this book, a large part of the apparatus necessary to understand the literature on quantum integrable systems is in fact present. The bibliography for integrable systems would fill a book by itself. We have made no attempt to provide one. Instead, we give, at the end of each chapter, a short list of references, which complements and enhances the material presented in the chapter, and we highly encourage the reader to consult them. Of course these references are far from complete, and we apologize to the numerous authors having contributed to the domain, and whose due credit is not acknowledged. Finally we want to thank our many colleagues from whom we learned so much and with whom we have discussed many parts of this book.
2 Integrable dynamical systems
We introduce the definition of integrable systems through the Liouville theorem, i.e. systems for which $n$ conserved quantities in involution are known on a phase space of dimension $2n$. The Liouville theorem asserts that the equations of motion can then be solved by quadrature. The notion of Lax matrix is introduced. This is a matrix whose elements are dynamical and whose time evolution is isospectral, a central object in the theory. It is also shown that the Poisson brackets of the elements of the Lax matrix are expressed in the so-called r-matrix form. Finally, we present some historical examples of integrable systems which are solved by the method of separation of variables. This leads to linearization of the time evolution on the Jacobian of Riemann surfaces, another recurring theme in the book.

2.1 Introduction

In Classical Mechanics the state of the system is specified by a point in phase space. This is generally a space of even dimension with coordinates of position $q_i$ and momentum $p_i$. The Hamiltonian is a function on phase space, denoted $H(p_i, q_i)$. The equations of motion are a first order differential system taking the Hamiltonian form:

$$\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i} \tag{2.1}$$

Here and in the following, a dot will refer to a time derivative. For any function $F(p, q)$ on phase space, this implies that $F(p(t), q(t))$ obeys:

$$\dot F \equiv \frac{dF}{dt} = \{H, F\}$$
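where for any functions $F$ and $G$ the Poisson bracket $\{F, G\}$ is defined as:

$$\{F, G\} \equiv \sum_i \left( \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial q_i} - \frac{\partial G}{\partial p_i} \frac{\partial F}{\partial q_i} \right)$$

For the coordinates $p_i$, $q_i$ themselves we have

$$\{q_i, q_j\} = 0, \qquad \{p_i, p_j\} = 0, \qquad \{p_i, q_j\} = \delta_{ij} \tag{2.2}$$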
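The sign convention of this bracket is easy to get wrong, so a small numerical check may help. The sketch below is our own illustration, not part of the original text: it estimates the bracket by central differences and tests it on a single harmonic oscillator. The function names, the test point and the frequency value are all assumptions made for the example.

```python
def poisson_bracket(F, G, p, q, h=1e-5):
    """Central-difference estimate of {F, G} at the phase-space point (p, q).

    Convention of this chapter:
    {F, G} = sum_i (dF/dp_i * dG/dq_i - dG/dp_i * dF/dq_i).
    F and G take two lists, the momenta p and the positions q.
    """
    def partial(func, slot, i):
        # slot 0 differentiates with respect to p_i, slot 1 with respect to q_i.
        plus = [list(p), list(q)]
        minus = [list(p), list(q)]
        plus[slot][i] += h
        minus[slot][i] -= h
        return (func(*plus) - func(*minus)) / (2.0 * h)

    return sum(
        partial(F, 0, i) * partial(G, 1, i) - partial(G, 0, i) * partial(F, 1, i)
        for i in range(len(p))
    )


OMEGA = 2.0  # illustrative frequency, chosen arbitrarily


def hamiltonian(p, q):
    return 0.5 * (p[0] ** 2 + OMEGA ** 2 * q[0] ** 2)


def coord_p(p, q):
    return p[0]


def coord_q(p, q):
    return q[0]


point_p, point_q = [0.3], [-0.7]  # arbitrary test point

bracket_pq = poisson_bracket(coord_p, coord_q, point_p, point_q)   # canonical bracket {p, q}
q_dot = poisson_bracket(hamiltonian, coord_q, point_p, point_q)    # {H, q} should equal dH/dp = p
p_dot = poisson_bracket(hamiltonian, coord_p, point_p, point_q)    # {H, p} should equal -dH/dq
h_dot = poisson_bracket(hamiltonian, hamiltonian, point_p, point_q)  # {H, H} vanishes
```

With this convention $\{H, q\}$ and $\{H, p\}$ reproduce the right-hand sides of eq. (2.1), which is a quick way to confirm that $\dot F = \{H, F\}$ and $\{p_i, q_j\} = \delta_{ij}$ are mutually consistent.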
The quantity $H(p, q)$ is automatically conserved under time evolution, $\frac{d}{dt} H(p, q) = 0$, so that the motion takes place on the subvariety of phase space defined by $H = E$, with $E$ constant.

Historically, it proved very difficult to find dynamical systems such that eqs. (2.1) could be solved exactly. However, there is a general framework where the explicit solutions can be obtained by solving a finite number of algebraic equations and computing a finite number of integrals, i.e. the solution is obtained by quadratures. These dynamical systems are the Liouville integrable systems that we will consider in this book. A dynamical system on a phase space of dimension $2n$ is Liouville integrable if one knows $n$ independent functions $F_i$ on the phase space which Poisson commute, that is $\{F_i, F_j\} = 0$. The Hamiltonian is assumed to be a function of the $F_i$.

In order to understand the geometry of the situation, let us discuss a very simple example: the harmonic oscillator. The phase space is of dimension 2 and the Hamiltonian is $H = \frac{1}{2}(p^2 + \omega^2 q^2)$ with Poisson bracket $\{p, q\} = 1$. The phase space is fibred into ellipses $H = E$ except for the point $(0, 0)$ which is a stationary point. An adapted coordinate system $\rho$, $\theta$ is given by:

$$p = \rho \cos(\theta), \qquad q = \frac{\rho}{\omega} \sin(\theta) \tag{2.3}$$

and the non-vanishing Poisson bracket is $\{\rho, \theta\} = \omega/\rho$. In these coordinates the flow reads:

$$\rho = \sqrt{2E}, \qquad \theta = \omega t + \theta_0$$

i.e. the flow takes place on the above ellipse. This can be straightforwardly generalized to a direct sum of $n$ harmonic oscillators with

$$H = \sum_{i=1}^{n} \frac{1}{2} (p_i^2 + \omega_i^2 q_i^2)$$

and Poisson bracket eq. (2.2). We do have $n$ conserved quantities in involution, $F_i = \frac{1}{2}(p_i^2 + \omega_i^2 q_i^2)$, and the level manifold $M_f$, i.e. the set of points of phase space such that $F_i = f_i$, is an $n$-dimensional real torus, which is
explicitly a Cartesian product of $n$ topological circles. The motion takes place on these tori which foliate the phase space. We can introduce $n$ angles $\theta_i$ as above which evolve linearly in time with frequency $\omega_i$. An orbit of the dynamical flow is dense on the torus when the $\omega_i$ are rationally independent.

For Liouville integrable systems, we shall assume that the conserved quantities are well-behaved so that the $n$-dimensional surfaces $M_f$ defined by $F_i = f_i$ are generically regular, and foliate the phase space. This does not preclude the existence of singular points such as $p_i = q_i = 0$ in the above example of the harmonic oscillator. In this setting we are now going to prove the Liouville theorem and show that the geometry of the situation is analogous to that of the harmonic oscillator example.

2.2 The Liouville theorem

We consider a dynamical Hamiltonian system with phase space $M$ of dimension $2n$. Introduce canonical coordinates $p_i$, $q_i$ such that the non-degenerate Poisson bracket reads as in eq. (2.2). As usual a non-degenerate Poisson bracket on $M$ is equivalent to the data of a non-degenerate closed 2-form $\omega$, $d\omega = 0$, defined on $M$, called the symplectic form, see Chapter 14. In the canonical coordinates the symplectic form reads

$$\omega = \sum_j dp_j \wedge dq_j$$
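As a small consistency check, which is our own aside rather than part of the original argument, one can rewrite this 2-form for a single harmonic oscillator in the adapted coordinates $(\rho, \theta)$ of eq. (2.3). To avoid a clash of notation, the 2-form is written out as $dp \wedge dq$ below, and $\omega$ denotes the oscillator frequency only.

$$
\begin{aligned}
dp &= \cos\theta \, d\rho - \rho \sin\theta \, d\theta, \qquad
dq = \frac{1}{\omega}\left( \sin\theta \, d\rho + \rho \cos\theta \, d\theta \right), \\
dp \wedge dq &= \frac{\rho}{\omega}\left( \cos^2\theta + \sin^2\theta \right) d\rho \wedge d\theta
 = \frac{\rho}{\omega} \, d\rho \wedge d\theta
 = d\!\left( \frac{\rho^2}{2\omega} \right) \wedge d\theta .
\end{aligned}
$$

So $\rho^2/(2\omega) = E/\omega$ and $\theta$ form a canonical pair, in agreement with the bracket $\{\rho, \theta\} = \omega/\rho$ quoted for the oscillator.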
Let $H$ be the Hamiltonian of the system.

Definition. The system is Liouville integrable if it possesses $n$ independent conserved quantities $F_i$, $i = 1, \ldots, n$, $\{H, F_j\} = 0$, in involution

$$\{F_i, F_j\} = 0$$

The independence means that at generic points (i.e. anywhere except on a set of measure zero), the $dF_i$ are linearly independent, or that the tangent space of the surface $F_i = f_i$ exists everywhere and is of dimension $n$. There cannot be more than $n$ independent quantities in involution otherwise the Poisson bracket would be degenerate. In particular, the Hamiltonian $H$ is a function of the $F_i$.

The Liouville theorem. The solution of the equations of motion of a Liouville integrable system is obtained by “quadrature”.

Proof. Let $\alpha = \sum_i p_i \, dq_i$ be the canonical 1-form and $\omega = d\alpha = \sum_i dp_i \wedge dq_i$ be the symplectic 2-form on the phase space $M$. We will construct
a canonical transformation $(p_i, q_i) \to (F_i, \psi_i)$ such that the conserved quantities $F_i$ are among the new coordinates:

$$\omega = \sum_i dp_i \wedge dq_i = \sum_i dF_i \wedge d\psi_i$$

If we succeed in doing that, the equations of motion become trivial:

$$\dot F_j = \{H, F_j\} = 0, \qquad \dot\psi_j = \{H, \psi_j\} = \frac{\partial H}{\partial F_j} = \Omega_j \tag{2.4}$$

The $\Omega_j$ depend only on $F$ and so are constant in time. In these coordinates, the solution of the equations of motion reads:

$$F_j(t) = F_j(0), \qquad \psi_j(t) = \psi_j(0) + t \, \Omega_j$$
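To see this linear evolution at work, the following sketch integrates two uncoupled harmonic oscillators directly in the original variables $(p_i, q_i)$ and compares the result with the prediction in adapted coordinates. It is an illustration under our own assumptions: the leapfrog integrator, the frequencies, the initial data and all function names are choices made here, not taken from the text. For the oscillators, the angle $\theta_i$ of eq. (2.3) equals $\omega_i \psi_i$ when $H = \sum_i F_i$, so it should advance by $\omega_i t$ while each $F_i$ stays fixed.

```python
import math


def leapfrog(p, q, omegas, dt, steps):
    """Integrate q' = p, p' = -omega^2 q for uncoupled oscillators (kick-drift-kick)."""
    p, q = list(p), list(q)
    for _ in range(steps):
        for i, w in enumerate(omegas):
            p[i] -= 0.5 * dt * w * w * q[i]
            q[i] += dt * p[i]
            p[i] -= 0.5 * dt * w * w * q[i]
    return p, q


def conserved(p, q, omegas):
    # F_i = (p_i^2 + omega_i^2 q_i^2) / 2
    return [0.5 * (pi * pi + w * w * qi * qi) for pi, qi, w in zip(p, q, omegas)]


def angles(p, q, omegas):
    # theta_i defined by p = rho cos(theta), q = (rho / omega) sin(theta)
    return [math.atan2(w * qi, pi) for pi, qi, w in zip(p, q, omegas)]


omegas = [1.0, math.sqrt(2.0)]   # rationally independent frequencies
p0, q0 = [1.0, 0.5], [0.0, 0.3]  # arbitrary initial data
t_final, steps = 2.0, 20000

p1, q1 = leapfrog(p0, q0, omegas, t_final / steps, steps)

f0, f1 = conserved(p0, q0, omegas), conserved(p1, q1, omegas)
theta0, theta1 = angles(p0, q0, omegas), angles(p1, q1, omegas)

# Difference between the observed angle advance and omega * t, wrapped into [-pi, pi).
angle_errors = [
    abs((t1 - t0 - w * t_final + math.pi) % (2.0 * math.pi) - math.pi)
    for t0, t1, w in zip(theta0, theta1, omegas)
]
```

Up to the small discretization error of the integrator, `f1` agrees with `f0` and `angle_errors` is close to zero: all the dynamics is carried by angles that grow linearly in time.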
To construct this canonical transformation, we exhibit its so-called generating function $S$. Let $M_f$ be the level manifold $F_i(p, q) = f_i$. Suppose that on $M_f$ we can solve for $p_i$, $p_i = p_i(f, q)$, and consider the function

$$S(F, q) \equiv \int_{m_0}^{m} \alpha = \int_{q_0}^{q} \sum_i p_i(f, q) \, dq_i$$

where the integration path is drawn on $M_f$ and goes from the point of coordinate $(p(f, q_0), q_0)$ to the point $(p(f, q), q)$, where $q_0$ is some reference value. Suppose that this function exists, i.e. that it does not depend on the path from $m_0$ to $m$; then $p_j = \frac{\partial S}{\partial q_j}$. Defining $\psi_j$ by

$$\psi_j = \frac{\partial S}{\partial F_j}$$

we have

$$dS = \sum_j \left( \psi_j \, dF_j + p_j \, dq_j \right)$$

Since $d^2 S = 0$ we deduce that $\omega = \sum_j dp_j \wedge dq_j = \sum_j dF_j \wedge d\psi_j$. This shows that if $S$ is a well-defined function, then the transformation is canonical. To show that $S$ exists, we must prove that it is independent of the integration path. By Stokes' theorem, we have to prove that:

$$d\alpha|_{M_f} = \omega|_{M_f} = 0$$
Fig. 2.1. A leaf $M_f$ on phase space

Let $X_i$ be the Hamiltonian vector field associated with $F_i$, defined by $dF_i = \omega(X_i, \cdot)$,

$$X_i = \sum_k \left( \frac{\partial F_i}{\partial q_k} \frac{\partial}{\partial p_k} - \frac{\partial F_i}{\partial p_k} \frac{\partial}{\partial q_k} \right)$$

These vector fields are tangent to the manifold $M_f$ because the $F_j$ are in involution,

$$X_i(F_j) = -\{F_i, F_j\} = 0$$

Since the $F_j$ are assumed to be independent functions, the tangent space to the submanifold $M_f$ is generated at each point $m \in M_f$ by the vectors $X_i|_m$ ($i = 1, \ldots, n$). But then $\omega(X_i, X_j) = dF_i(X_j) = 0$ and we have proved that $\omega|_{M_f} = 0$, and therefore $S$ exists. We have effectively obtained the solution of the equations of motion through one quadrature (to calculate the function $S$) and some “algebraic manipulation” (to express the $p$ as functions of $q$ and $F$).

Remark 1. From the closedness of $\alpha$ on $M_f$, the function $S$ is unchanged by continuous deformations of the path $(m_0, m)$. However, if $M_f$ has non-trivial cycles, which is generically the case, $S$ is a multivalued function defined in a neighbourhood of $M_f$. The variation over a cycle

$$\Delta_{\text{cycle}} S = \oint_{\text{cycle}} \alpha$$

is a function of $F$ only. This induces a multivaluedness of the variables $\psi_j$: $\Delta_{\text{cycle}} \psi_j = \frac{\partial}{\partial F_j} \Delta_{\text{cycle}} S$. For instance, in the case of harmonic oscillators, we see that above each point $(q_1, \ldots, q_n)$ we have $2^n$ points on the $M_f$ level surface, due to the independent choices of sign in $p_i = \pm\sqrt{2 f_i - \omega_i^2 q_i^2}$. So we have many choices for the path of integration, reflecting the topology of the torus.
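For a single harmonic oscillator the multivaluedness described in Remark 1 can be evaluated explicitly. The sketch below is our own illustration, with hypothetical function names and arbitrary numerical values: it integrates $\alpha = p \, dq$ once around the ellipse $F = f$, using the parametrization of eq. (2.3), and then differentiates the result with respect to $f$ by a central difference.

```python
import math


def cycle_integral(f, omega, samples=2000):
    """Midpoint-rule estimate of the integral of p dq around the ellipse (p^2 + omega^2 q^2)/2 = f."""
    rho = math.sqrt(2.0 * f)
    d_theta = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        theta = (k + 0.5) * d_theta
        p = rho * math.cos(theta)                    # p = rho cos(theta)
        dq_dtheta = (rho / omega) * math.cos(theta)  # derivative of q = (rho / omega) sin(theta)
        total += p * dq_dtheta * d_theta
    return total


f_value, omega = 1.5, 2.0  # arbitrary level and frequency
step = 1e-4

delta_s = cycle_integral(f_value, omega)
# Jump of psi = dS/dF around the cycle, by a central difference in f.
delta_psi = (
    cycle_integral(f_value + step, omega) - cycle_integral(f_value - step, omega)
) / (2.0 * step)
```

One finds $\Delta_{\text{cycle}} S = 2\pi f/\omega$ and $\Delta_{\text{cycle}} \psi = 2\pi/\omega$. Since $\psi$ is conjugate to $F = H$ and therefore evolves like the time $t$, its jump around the cycle is simply the period of the oscillator.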
Remark 2. The definition we have given of a Liouville integrable system requires some care. Given any Hamiltonian $H$, the Darboux theorem, see Chapter 14, implies that we can always find locally a system of canonical coordinates on phase space $(P_1, \ldots, P_n; Q_1, \ldots, Q_n)$, with $H = P_1$, hence fulfilling the hypothesis of the Liouville theorem. For integrable systems we require that the conserved quantities are globally defined on a sufficiently large open set, and that the surfaces $F_i = f_i$ are well-behaved and foliate the phase space. This is not generally the case for the $P_i$ constructed by the Darboux theorem. Moreover, in all known examples, the conserved quantities are even algebraic functions of canonical coordinates on some open domain and the solutions of the equations of motion are analytic.

Remark 3. Using the Poisson commuting functions $F_i$, one can solve simultaneously the $n$ time evolution equations $dF/dt_i = \{F_i, F\}$, since:

$$\frac{\partial}{\partial t_i} \frac{\partial}{\partial t_j} F - \frac{\partial}{\partial t_j} \frac{\partial}{\partial t_i} F = \{F_i, \{F_j, F\}\} - \{F_j, \{F_i, F\}\} = \{\{F_i, F_j\}, F\} = 0$$

Since the Hamiltonian vector fields are well-defined and linearly independent everywhere, the flows define a locally free (no fixed points) and transitive (goes everywhere) action of a small open set in $\mathbb{R}^n$ on the surface $M_f$. Assuming that $M_f$ is connected and compact, the flows extend to all values of the times $t_i$ and fill the whole surface $M_f$, hence we have a surjective action of $\mathbb{R}^n$ on $M_f$. The stabilizer of a point is an Abelian discrete subgroup of $\mathbb{R}^n$ since the action is locally free, so it is of the form $\mathbb{Z}^n$. Thus $M_f$ appears as the quotient of $\mathbb{R}^n$ by $\mathbb{Z}^n$, i.e. a torus. This refinement, due to Arnold, of the Liouville theorem shows that, under suitable global hypotheses, the phase space is indeed foliated by $n$-dimensional tori, called the Liouville tori. It is remarkable that for small perturbations of integrable systems, there still exist Liouville tori “almost everywhere”. This is the content of the famous Kolmogorov–Arnold–Moser (KAM) theorem.
2.3 Action–angle variables

As already noticed in the proof of the Liouville theorem, the level manifold $M_f$ has non-trivial cycles. Under suitable compactness and connectivity conditions, the $M_f$ are n-dimensional tori $T^n$. This points to the introduction of angle variables to describe the motion along the cycles. The torus $T^n$ is isomorphic to a product of $n$ circles $C_i$. We may choose special angular coordinates on $M_f$ dual to the $n$ fundamental cycles $C_i$ (see eq. (2.5)).
The action variables $I_j$ are defined as the integrals of the canonical 1-form over the cycles $C_j$,
$$I_j = \frac{1}{2\pi}\oint_{C_j} \alpha$$
The $I_j$ are functions of the constants of motion $F_j$ and we suppose they are independent, so that if the values of $I_j$ ($j = 1, \dots, n$) are known, then $M_f$ is determined. Let us consider the canonical transformation generated by the same function as above:
$$S(I, q) = \int_{m_0}^{m} \alpha$$
but expressed in terms of the variables $I_i$ instead of $F_i$. Denoting by $\theta_j$ the variable conjugate to $I_j$, the canonical transformation generated by $S$ is defined by
$$p_k = \frac{\partial S}{\partial q_k}, \qquad \theta_k = \frac{\partial S}{\partial I_k}$$
The variables $\theta_k$ are canonically conjugated to the action variables $I_j$. We show that they can be regarded as normalized angular variables on the cycles $C_j$. That is,
$$\frac{1}{2\pi}\oint_{C_j} d\theta_k = \delta_{jk} \tag{2.5}$$
By definition of $\theta_k$,
$$\oint_{C_j} d\theta_k = \frac{\partial}{\partial I_k}\oint_{C_j} dS, \qquad dS = \sum_i \left(\frac{\partial S}{\partial q_i}\,dq_i + \frac{\partial S}{\partial I_i}\,dI_i\right)$$
Since on the manifold $M_f$, $dI_i = 0$, we have
$$\oint_{C_j} d\theta_k = \frac{\partial}{\partial I_k}\oint_{C_j} \sum_i \frac{\partial S}{\partial q_i}\,dq_i = \frac{\partial}{\partial I_k}\oint_{C_j} \alpha = 2\pi\delta_{jk}$$
This proves that the $\theta_k$ are angle variables.

2.4 Lax pairs

The new concept which emerged from the modern studies of integrable systems is the notion of Lax pairs. A Lax pair $L, M$ consists of two matrices, functions on the phase space of the system, such that the Hamiltonian evolution equations, eq. (2.1), may be written as
$$\frac{dL}{dt} \equiv \dot L = [M, L] \tag{2.6}$$
Here, $[M, L] = ML - LM$ denotes the commutator of the matrices $M$ and $L$. The immediate interest in the existence of such a pair lies in the fact that it allows for an easy construction of conserved quantities. Indeed, the solution of eq. (2.6) is of the form
$$L(t) = g(t)\,L(0)\,g(t)^{-1}$$
where the invertible matrix $g(t)$ is determined by the equation
$$M = \frac{dg}{dt}\,g^{-1}$$
It follows that if $I(L)$ is a function of $L$ invariant by conjugation $L \to gLg^{-1}$, then $I(L(t))$ is a constant of the motion. Such functions are functions of the eigenvalues of $L$. We say that the evolution equation (2.6) is isospectral, which means that the spectrum of $L$ is preserved by the time evolution.

Remark 1. Recall that integrability of the system in the sense of Liouville demands that (i) the number of independent conserved quantities equals the number of degrees of freedom, and that (ii) these conserved quantities are in involution.

Remark 2. A Lax pair is by no means unique. Even the size of the matrices may be changed. There is also a natural gauge group acting on the Lax pair:
$$L \longrightarrow gLg^{-1}, \qquad M \longrightarrow gMg^{-1} + \frac{dg}{dt}\,g^{-1}$$
where $g$ is an invertible matrix, a function on phase space.
Let us present some simple examples showing that the equations of motion can indeed be recast into Lax form.

Example 1. For any integrable system in the sense of Liouville, one may construct a Lax pair in a tautological way. Consider a finite-dimensional Hamiltonian system, with $n$ degrees of freedom, Poisson bracket $\{\ ,\ \}$ and Hamiltonian $H$. Suppose it is integrable in the sense of Liouville, which means that it possesses $n$ independent integrals of the motion $F_i$, $i = 1, \dots, n$, in involution. The Liouville theorem states that there exists, at least locally, a system of conjugate coordinates $I_i, \theta_i$, $i = 1, \dots, n$, where the $I_j$ are functions of the $F_i$ only. In these coordinates, the equations of motion take the very simple form
$$\dot I_j = 0, \qquad \dot\theta_j = \frac{\partial H}{\partial I_j} \tag{2.7}$$
Introduce the Lie algebra generated by $\{H_i, E_i,\ i = 1, \dots, n\}$ with relations
$$[H_i, H_j] = 0, \qquad [H_i, E_j] = 2\delta_{ij}E_j, \qquad [E_i, E_j] = 0$$
This Lie algebra has a natural representation by $2n \times 2n$ matrices. Set:
$$L = \sum_{j=1}^{n} \left(I_j H_j + 2 I_j \theta_j E_j\right), \qquad M = -\sum_{j=1}^{n} \frac{\partial H}{\partial I_j}\,E_j$$
The equation $\dot L = [M, L]$ is then equivalent to eq. (2.7). Thus $L, M$ form a Lax pair. However, this construction is useless since it requires the knowledge of the action–angle variables to build the Lax pair, but if these are known, there is no need for a Lax pair any more.

Example 2. As a second example we exhibit a Lax pair for the harmonic oscillator. Let:
$$L = \begin{pmatrix} p & \omega q \\ \omega q & -p \end{pmatrix}, \qquad M = \begin{pmatrix} 0 & -\omega/2 \\ \omega/2 & 0 \end{pmatrix} \tag{2.8}$$
We check immediately that the Lax equation, eq. (2.6), is equivalent to the equations of motion $\dot q = p$, $\dot p = -\omega^2 q$. Let us observe that the Hamiltonian $H$ can be written as $\frac{1}{4}\mathrm{Tr}\,L^2$. This example can be generalized to $n$ independent harmonic oscillators by writing the Lax matrices $L, M$ in a block diagonal form where each block is a two by two matrix as above. Now the conserved quantities are $\mathrm{Tr}\,L^{2p} = 2\sum_i (2F_i)^p$, with $2F_i = p_i^2 + \omega_i^2 q_i^2$, and $\mathrm{Tr}\,L^{2p+1} = 0$, so that they are equivalent to the collection of the $F_i$.

2.5 Existence of an r-matrix

A Lax pair provides us with conserved quantities without referring to a Poisson structure. The notion of Liouville integrability requires the knowledge of a Poisson structure together with the involution property of the conserved quantities. We shall now present the general form of Poisson brackets between the matrix elements of the Lax matrix which ensures the involution property for the conserved quantities. Suppose we are given a Lax pair $L, M$, which are $N \times N$ matrices, and suppose that the matrix $L$ can be diagonalized:
$$L = U \Lambda U^{-1} \tag{2.9}$$
The matrix elements λk of the diagonal matrix Λ are the conserved quantities. We will not consider here the question of the independence of these quantities.
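As a small numerical sanity check of the harmonic-oscillator Lax pair of eq. (2.8), the following sketch verifies that $\dot L = [M, L]$ holds when $\dot q = p$, $\dot p = -\omega^2 q$, and that the eigenvalues of $L$ stay fixed along the exact flow. The function names (`lax_residual`, `exact_flow`, and so on) and the sample values are illustrative choices, not part of the text.

```python
import math

def lax_matrix(p, q, omega):
    """Lax matrix L of eq. (2.8) as a nested list."""
    return [[p, omega * q], [omega * q, -p]]

def commutator(a, b):
    """Return [a, b] = ab - ba for 2x2 matrices given as nested lists."""
    def mul(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def lax_residual(p, q, omega):
    """Largest entry of dL/dt - [M, L] when q' = p and p' = -omega^2 q."""
    M = [[0.0, -omega / 2], [omega / 2, 0.0]]
    L = lax_matrix(p, q, omega)
    qdot, pdot = p, -omega ** 2 * q
    # L is linear in (p, q), so dL/dt is L evaluated on (pdot, qdot).
    Ldot = lax_matrix(pdot, qdot, omega)
    rhs = commutator(M, L)
    return max(abs(Ldot[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

def eigenvalues(p, q, omega):
    """Eigenvalues of the traceless symmetric L: +-sqrt(p^2 + omega^2 q^2)."""
    rho = math.sqrt(p * p + omega * omega * q * q)
    return (rho, -rho)

def exact_flow(p0, q0, omega, t):
    """Exact solution (p(t), q(t)) of q' = p, p' = -omega^2 q."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return p0 * c - omega * q0 * s, q0 * c + (p0 / omega) * s
```

Evaluating `eigenvalues` before and after `exact_flow` for any time `t` should give the same pair up to rounding, which is the isospectral property in its simplest instance.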
Let us first introduce some notations. Let $E_{ij}$ be the canonical basis of the $N \times N$ matrices, $(E_{ij})_{kl} = \delta_{ik}\delta_{jl}$. We can write
$$L = \sum_{ij} L_{ij} E_{ij}$$
The components $L_{ij}$ of the Lax matrix are functions on the phase space. We can evaluate the Poisson brackets $\{L_{ij}, L_{kl}\}$ and gather the results as follows. Let
$$L_1 \equiv L \otimes 1 = \sum_{ij} L_{ij}\,(E_{ij} \otimes 1), \qquad L_2 \equiv 1 \otimes L = \sum_{ij} L_{ij}\,(1 \otimes E_{ij})$$
The index 1 or 2 means that the matrix $L$ sits in the first or second factor in the tensor product. Similarly, for $T$ living in the tensor product of two copies of $N \times N$ matrices, we set
$$T = T_{12} = \sum_{ij,kl} T_{ij,kl}\, E_{ij} \otimes E_{kl}, \qquad T_{21} = \sum_{ij,kl} T_{ij,kl}\, E_{kl} \otimes E_{ij}$$
More generally, when we have tensor products with more copies of $N \times N$ matrices, we denote by $L_\alpha$ the embedding of $L$ in the $\alpha$ position, e.g. $L_3 = 1 \otimes 1 \otimes L \otimes 1 \otimes \cdots$, and by $T_{\alpha\beta}$ the embedding of $T$ in the $\alpha$ and $\beta$ positions. We shall also denote by $\mathrm{Tr}_\alpha$ the partial trace on the space in $\alpha$ position in a tensor product. For example
$$\mathrm{Tr}_1 T_{12} = \sum_{ij,kl} T_{ij,kl}\, \mathrm{Tr}(E_{ij})\, E_{kl}$$
Define $\{L_1, L_2\}$ as the matrix of Poisson brackets between the elements of $L$:
$$\{L_1, L_2\} = \sum_{ij,kl} \{L_{ij}, L_{kl}\}\, E_{ij} \otimes E_{kl}$$
For an integrable system, the Poisson brackets between the elements of the Lax matrix $L$ can be written in the following very special form:

Proposition. The involution property of the eigenvalues of $L$ is equivalent to the existence of a function, $r_{12}$, on the phase space such that:
$$\{L_1, L_2\} = [r_{12}, L_1] - [r_{21}, L_2] \tag{2.10}$$
Proof. Assume first that the eigenvalues of $L$ Poisson commute, $\{\lambda_i, \lambda_j\} = 0$. Recall that $L$ is diagonalized by $U$, eq. (2.9). Since $U$ is a function on phase space, we compute directly the Poisson brackets $\{L_1, L_2\} = \{U_1 \Lambda_1 U_1^{-1}, U_2 \Lambda_2 U_2^{-1}\}$ using the Leibniz rule. We get nine terms. Out of these, four terms involve the Poisson brackets $\{U_1, U_2\}$. Introducing the quantity $k_{12} = \{U_1, U_2\} U_1^{-1} U_2^{-1}$, these terms can be written as
$$[[k_{12}, L_2], L_1] = \tfrac{1}{2}[[k_{12}, L_2], L_1] - \tfrac{1}{2}[[k_{21}, L_1], L_2]$$
Four other terms involve $\{\Lambda_1, U_2\}$ and $\{U_1, \Lambda_2\}$. Introducing $q_{12} = U_2 \{U_1, \Lambda_2\} U_1^{-1} U_2^{-1}$ we can write them as $[q_{12}, L_1] - [q_{21}, L_2]$. Putting all this together, we get
$$\{L_1, L_2\} = U_1 U_2 \{\Lambda_1, \Lambda_2\} U_1^{-1} U_2^{-1} + [r_{12}, L_1] - [r_{21}, L_2]$$
where $r_{12} = q_{12} + \tfrac{1}{2}[k_{12}, L_2]$. This proves one part of the equivalence when $\{\Lambda_1, \Lambda_2\} = 0$.

Conversely, suppose we have eq. (2.10). Then, for any positive integers $n$ and $m$,
$$\{L_1^n, L_2^m\} = [a_{12}^{n,m}, L_1] + [b_{12}^{n,m}, L_2] \tag{2.11}$$
with
$$a_{12}^{n,m} = \sum_{p=0}^{n-1}\sum_{q=0}^{m-1} L_1^{n-p-1} L_2^{m-q-1}\, r_{12}\, L_1^p L_2^q, \qquad b_{12}^{n,m} = -\sum_{p=0}^{n-1}\sum_{q=0}^{m-1} L_1^{n-p-1} L_2^{m-q-1}\, r_{21}\, L_1^p L_2^q$$
Taking the trace of eq. (2.11), and using that the trace of a commutator is zero, we get that the quantities $\mathrm{Tr}(L^n)$ are in involution. This is equivalent to the involution of the eigenvalues $\lambda_k$ of $L$.

Although simple to prove, this proposition is important for developing formal aspects of integrable systems since it allows us to control the Poisson brackets of the Lax matrix. The Jacobi identity on the Poisson bracket, see Chapter 14, yields the following constraint on $r$:
$$[L_1, [r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{32}, r_{13}] + \{L_2, r_{13}\} - \{L_3, r_{12}\}] + \text{cyc. perm.} = 0 \tag{2.12}$$
where cyc. perm. means cyclic permutations of the tensor indices 1, 2, 3. In a sense, solving this equation amounts to classifying integrable Hamiltonian systems.
If $r$ happens to be constant, the only remaining terms in eq. (2.12) are the first ones. In particular, the Jacobi identity is satisfied if a constant r-matrix satisfies
$$[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{32}, r_{13}] = 0$$
When $r$ is antisymmetric, $r_{12} = -r_{21}$, this is called the classical Yang–Baxter equation. This case will be extensively studied in Chapter 4.

Remark 1. The form of the bracket is preserved by gauge transformations. If $\{L_1, L_2\} = [r_{12}, L_1] - [r_{21}, L_2]$ and $L' = gLg^{-1}$, then there exists a matrix function $r'_{12}$ such that
$$\{L'_1, L'_2\} = [r'_{12}, L'_1] - [r'_{21}, L'_2]$$
The function $r'_{12}$ can be expressed in terms of the function $r_{12}$ and of the Poisson brackets between $g$ and the Lax matrix $L$:
$$r'_{12} = g_1 g_2 \left( r_{12} + g_1^{-1}\{g_1, L_2\} + \tfrac{1}{2}[u_{12}, L_2] \right) g_1^{-1} g_2^{-1} \tag{2.13}$$
where $u_{12} = g_1^{-1} g_2^{-1} \{g_1, g_2\}$.

Remark 2. In the form (2.10) the antisymmetry property of the bracket is explicit, although $r$ has no special symmetry property. Furthermore, we have the freedom to redefine $r$ by
$$r_{12} \longrightarrow r_{12} + [\sigma_{12}, L_2] \tag{2.14}$$
where $\sigma$ is symmetric, $\sigma_{12} = \sigma_{21}$, without changing the Poisson bracket.
Example. Let us illustrate this construction in the simple case of the harmonic oscillator. The Lax matrix $L$ is given in eq. (2.8) and we introduce the action–angle coordinates $\rho, \theta$ as in eq. (2.3). In these coordinates the matrix $L$ is diagonalized by:
$$U = U^{-1} = \begin{pmatrix} \cos\frac{\theta}{2} & \sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & -\cos\frac{\theta}{2} \end{pmatrix}$$
Since $\{U_1, U_2\} = 0$, $r_{12} = q_{12}$, which is easily computed to be:
$$r_{12} = \frac{\omega}{2\rho^2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \otimes L$$
It is easy to verify that this r-matrix indeed satisfies eq. (2.10). Let us notice that it is a dynamical r-matrix, which means that it depends explicitly on the dynamical variables.
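The verification of eq. (2.10) for the harmonic oscillator can also be done numerically. The sketch below assumes the conventions $\{p, q\} = 1$ and $\rho^2 = p^2 + \omega^2 q^2$ (this is our reading of eq. (2.3), which is not reproduced here), builds $\{L_1, L_2\}$ from the derivatives of $L$ with respect to $p$ and $q$, and compares it with $[r_{12}, L_1] - [r_{21}, L_2]$. All helper names are illustrative.

```python
def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    return sub(matmul(a, b), matmul(b, a))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
EPSILON = [[0.0, 1.0], [-1.0, 0.0]]

def r_matrix_residual(p, q, omega):
    """Largest entry of {L1, L2} - ([r12, L1] - [r21, L2])."""
    L = [[p, omega * q], [omega * q, -p]]
    dL_dp = [[1.0, 0.0], [0.0, -1.0]]
    dL_dq = [[0.0, omega], [omega, 0.0]]
    # Assumed convention {p, q} = 1, so that
    # {L_ij, L_kl} = dL_ij/dp * dL_kl/dq - dL_ij/dq * dL_kl/dp.
    lhs = sub(kron(dL_dp, dL_dq), kron(dL_dq, dL_dp))
    rho2 = p * p + omega * omega * q * q      # assumed meaning of rho^2
    c = omega / (2.0 * rho2)
    r12 = scale(c, kron(EPSILON, L))
    r21 = scale(c, kron(L, EPSILON))          # r12 with the two factors swapped
    L1, L2 = kron(L, IDENTITY), kron(IDENTITY, L)
    rhs = sub(comm(r12, L1), comm(r21, L2))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
```

The residual should vanish up to rounding for any point with $\rho \neq 0$; the explicit factor $1/\rho^2$ is what makes this r-matrix dynamical.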
2.6 Commuting flows

The Poisson brackets, eq. (2.10), are equivalent to the involution of the eigenvalues of $L$. An equivalent set of commuting Hamiltonians, $H_n$, is given by the traces of the powers of the Lax matrix:
$$H_n = \mathrm{Tr}(L^n) \tag{2.15}$$
The Hamiltonians $H_n$ are in involution, $\{H_n, H_m\} = 0$, since they are symmetric polynomials in the eigenvalues. Furthermore, we show that the time evolution of the Lax matrix $L$ with Hamiltonian $H_n$ naturally takes the Lax form.

Proposition. Suppose that $\{L_1, L_2\} = [r_{12}, L_1] - [r_{21}, L_2]$. If we take $H_n = \mathrm{Tr}(L^n)$ as Hamiltonians, then the equations of motion admit a Lax representation:
$$\frac{dL}{dt_n} \equiv \{H_n, L\} = [M_n, L], \qquad \text{with } M_n = -n\,\mathrm{Tr}_1(L_1^{n-1} r_{21}) \tag{2.16}$$
Proof. Set $m = 1$ in eq. (2.11), and take the trace over the first space, to get $dL/dt_n = [M_n, L]$ with $M_n = -n\,\mathrm{Tr}_1(L_1^{n-1} r_{21})$.

Note that the matrices $M_n$ are unchanged under the transformation eq. (2.14). However, the matrices $M_n$ are not unique since adding any matrix commuting with $L$ does not change the equations of motion.

We close this chapter by presenting some of the few historical examples of integrable systems which were known at the end of the nineteenth century. Of course all systems with one degree of freedom (phase space of dimension 2) are integrable since the Hamiltonian $H$ is conserved. We discuss below more sophisticated examples with higher dimensional phase spaces.

2.7 The Kepler problem

The first historical integrable system is the Kepler two-body problem. In the centre of mass frame, the equations of motion take the form:
$$\frac{d^2 x_i}{dt^2} = -\frac{\partial V(r)}{\partial x_i}, \qquad r = \sqrt{x_1^2 + x_2^2 + x_3^2}$$
In the traditional Kepler problem, $V(r) = C/r$, but we will consider any centrally symmetric potential $V(r)$. This is a Hamiltonian system with
$$H = \frac{1}{2}\sum_{i=1}^{3} p_i^2 + V(r)$$
and Poisson brackets $\{p_i, x_j\} = \delta_{ij}$. The phase space is of dimension 6, and we have to exhibit three commuting conserved quantities. Due to central symmetry, the angular momentum
$$\vec J = (J_1, J_2, J_3), \qquad J_{ij} = x_i p_j - x_j p_i = \epsilon_{ijk} J_k$$
is conserved. Here $\epsilon_{ijk}$ is the totally antisymmetric Levi–Civita tensor. The three components $J_i$ are conserved but do not Poisson commute. However, a set of three independent Poisson commuting quantities is provided by $H$, $J_3 \equiv J_{12}$ and $J^2 \equiv J_{12}^2 + J_{23}^2 + J_{13}^2$. At this point, one may follow the standard solution, which takes advantage of the conservation of $\vec J$, and restrict oneself to the plane perpendicular to $\vec J$ where the motion takes place. Here we prefer to show the Liouville theorem at work, and use only the three commuting conserved quantities. Due to the spherical symmetry of the problem, it is convenient to use spherical coordinates:
$$x_1 = r \sin\theta \cos\phi, \qquad x_2 = r \sin\theta \sin\phi, \qquad x_3 = r \cos\theta$$
We introduce the conjugate momenta $p_r, p_\theta, p_\phi$ by writing the canonical 1-form $\alpha = \sum_i p_i\,dx_i = p_r\,dr + p_\theta\,d\theta + p_\phi\,d\phi$. In these coordinates the conserved quantities read:
$$H = \frac{1}{2}\left(p_r^2 + \frac{1}{r^2}\,p_\theta^2 + \frac{1}{r^2 \sin^2\theta}\,p_\phi^2\right) + V(r), \qquad J^2 = p_\theta^2 + \frac{1}{\sin^2\theta}\,p_\phi^2, \qquad J_3 = p_\phi \tag{2.17}$$
On the surface $M_f$ corresponding to fixed values of the conserved quantities, we solve for the momenta in terms of the position variables, yielding:
$$p_r = \sqrt{2\big(H - V(r)\big) - \frac{J^2}{r^2}}, \qquad p_\theta = \sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}, \qquad p_\phi = J_3$$
Note that on $M_f$, $p_r$ depends only on $r$, $p_\theta$ only on $\theta$ and $p_\phi$ only on $\phi$ (it is constant). The variables $r, \theta, \phi$ are then called separated variables. The 1-form $\alpha$ restricted to $M_f$ is then obviously closed. The action $S$ appearing in the Liouville theorem reads:
$$S = \int^{r} \sqrt{2\big(H - V(r)\big) - \frac{J^2}{r^2}}\;dr + \int^{\theta} \sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}\;d\theta + \int^{\phi} J_3\,d\phi$$
The angle variables corresponding to our action variables are given by
$$\psi_H = \frac{\partial S}{\partial H}, \qquad \psi_{J^2} = \frac{\partial S}{\partial J^2}, \qquad \psi_{J_3} = \frac{\partial S}{\partial J_3}$$
and have simple time evolution with respective frequencies $(1, 0, 0)$ by eq. (2.4). Hence $\psi_{J^2}$ and $\psi_{J_3}$ remain constant, while $\psi_H = t - t_0$. This gives the standard formula for the Kepler motion:
$$t - t_0 = \int^{r} \frac{dr}{\sqrt{2\big(H - V(r)\big) - \frac{J^2}{r^2}}}$$
Note that the constancy of $\psi_{J_3}$ implies:
$$\dot\phi = \frac{J_3}{\sin^2\theta\,\sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}}\;\dot\theta$$
This, in turn, implies the conservation of $J_1$, $J_2$:
$$J_1 = -J_3 \cot\theta \cos\phi - \sin\phi\,\sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}, \qquad J_2 = -J_3 \cot\theta \sin\phi + \cos\phi\,\sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}$$
so that the motion takes place in the plane perpendicular to $\vec J$, as expected. It is worth noticing that the present approach is the one which prevails in Quantum Mechanics, where the three components of $\vec J$ cannot be measured simultaneously.

2.8 The Euler top

We consider a rotating solid body attached to a fixed point. The Euler top corresponds to the case where there is no external force. It is very convenient to consider the equations of motion in a frame rotating with the body, as discovered by Euler. We choose the moving frame with origin at the fixed point of the top (that is, the point where the top is attached), and with axes along the principal inertia axes, which diagonalize the inertia tensor computed with respect to the fixed point, $I_{ij} = \int (x^2 \delta_{ij} - x_i x_j)\,\rho(x)\,dx$, with $\rho(x)$ the mass density. Let $\vec J$ be the angular momentum of the top seen in the moving frame. We have $\vec J = I \cdot \vec\omega$ where $I = \mathrm{Diag}(I_1, I_2, I_3)$ and $\vec\omega$ is the rotation vector of the moving frame. We shall assume that the principal moments of inertia $I_i$ are all different. The equation of motion reads:
$$\frac{d\vec J}{dt} = -\vec\omega \wedge \vec J$$
It expresses the conservation of $\vec J$ in the absolute frame. This can be recast into the Hamiltonian framework by defining the Poisson brackets:
$$\{J_i, J_j\} = \epsilon_{ijk} J_k$$
where $\epsilon_{ijk}$ is the usual antisymmetric tensor. The Hamiltonian reads:
$$H = \frac{1}{2}\sum_{i=1}^{3} \frac{J_i^2}{I_i}$$
This Poisson bracket is degenerate because $J^2$ Poisson commutes with everything. One must choose a symplectic leaf to get a well-defined Hamiltonian system. This is achieved by fixing $J^2$ to a numerical value. Then the phase space is of dimension 2, and the system is integrable with conserved quantity $H$. Note that the trajectories are immediately obtained as the intersection of the sphere $J_1^2 + J_2^2 + J_3^2 = J^2$ and the ellipsoid $J_1^2/I_1 + J_2^2/I_2 + J_3^2/I_3 = 2H$. These two relations express $J_2^2$ and $J_3^2$ as linear functions of $J_1^2$. Substituting into the equation of motion of $J_1$, i.e. $\dot J_1 = (I_3^{-1} - I_2^{-1}) J_2 J_3$, yields an equation of the form $\dot J_1 = \sqrt{\alpha + \beta J_1^2 + \gamma J_1^4}$, so that $J_1$ is an elliptic function of $t$.

2.9 The Lagrange top

When the top is in a gravitational field its weight has to be taken into account and the problem is more complicated. Let us assume that the rotating frame has its origin at the fixed point. The problem is integrable only in special cases. One case, found by Lagrange, is when two inertia moments are equal, for example $I_1 = I_2$, and the centre of mass is located at a position $(x_1 = 0, x_2 = 0, x_3 = h)$ with respect to the rotating frame. This situation is achieved when the top has an axis of symmetry (around the third axis) and is attached to a point on this axis. For any top in a gravitational field, the equations of motion in the rotating frame take the form:
$$\frac{d\vec J}{dt} = -\vec\omega \wedge \vec J + \vec h \wedge \vec P, \qquad \frac{d\vec P}{dt} = -\vec\omega \wedge \vec P \tag{2.18}$$
where $\vec P$ is the weight of the top, which is constant in the absolute frame, and $\vec h$ is the vector from the fixed point to the centre of mass, which is constant in the rotating frame. From these equations one can check the conservation of three quantities: $\vec P^2$, $\vec J \cdot \vec P$ and the energy
$$H = \frac{1}{2}\,(\vec J \cdot I^{-1} \vec J) - \vec P \cdot \vec h$$
In order to formulate the equations of motion in a Hamiltonian framework let us introduce the following Poisson brackets between the six dynamical quantities $J_i$ and $P_i$:
$$\{J_i, J_j\} = \epsilon_{ijk} J_k, \qquad \{J_i, P_j\} = \epsilon_{ijk} P_k, \qquad \{P_i, P_j\} = 0$$
The Hamilton equations of motion are precisely eqs. (2.18). The Poisson structure is degenerate, i.e. the two conserved quantities $\vec P^2$, $\vec J \cdot \vec P$ are in the centre. Hence the symplectic leaves are of dimension 4. The Hamiltonian $H$ provides one conserved quantity defined on the leaves and that is all in general. Using the particular hypothesis of the Lagrange top, namely the rotational symmetry around the third axis, it is easy to see that $\vec J \cdot \vec h$ is a second independent conserved quantity. Indeed, taking the scalar product of the first eq. (2.18) with $\vec h$, and using $\vec h \cdot (\vec h \wedge \vec P) = 0$, we find
$$\frac{d}{dt}\,\vec h \cdot \vec J = -\vec h \cdot (\vec\omega \wedge \vec J) = -h_3 (\omega_1 J_2 - \omega_2 J_1) = -h_3\,\omega_1 \omega_2\,(I_2 - I_1) = 0$$
where we have used that $\vec h$ is along the third axis. Since $\vec h \cdot \vec J$ is conserved, this quantity Poisson commutes with the Hamiltonian, hence the system is integrable.

To solve this integrable system we describe the top by Euler angles $(\theta, \phi, \psi)$. Recall that the rotation vector and the weight vector can be expressed in the moving frame as:
$$\begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \begin{pmatrix} \dot\phi \sin\theta \sin\psi + \dot\theta \cos\psi \\ \dot\phi \sin\theta \cos\psi - \dot\theta \sin\psi \\ \dot\phi \cos\theta + \dot\psi \end{pmatrix}, \qquad \begin{pmatrix} P_1 \\ P_2 \\ P_3 \end{pmatrix} = \begin{pmatrix} P \sin\theta \sin\psi \\ P \sin\theta \cos\psi \\ P \cos\theta \end{pmatrix}$$
The two quantities in the centre of the Poisson bracket are $\vec P^2 = P^2$ and $\vec P \cdot \vec J \equiv P J_z = P (I_1 \sin^2\theta + I_3 \cos^2\theta)\,\dot\phi + P I_3 \cos\theta\,\dot\psi$. Moreover, the Lagrange conserved quantity reads $\vec h \cdot \vec J \equiv h J_3 = h I_3 (\dot\phi \cos\theta + \dot\psi)$. This allows us to eliminate $\dot\phi$ and $\dot\psi$ in the Hamiltonian:
$$H = \frac{1}{2} I_1 (\sin^2\theta\,\dot\phi^2 + \dot\theta^2) + \frac{1}{2} I_3 (\dot\phi \cos\theta + \dot\psi)^2 - P \cos\theta$$
We get a one-dimensional system in the variable $\theta$ with Hamiltonian:
$$H = \frac{1}{2} I_1 \dot\theta^2 + \frac{1}{2 I_1}\,\frac{(J_z - J_3 \cos\theta)^2}{\sin^2\theta} + \frac{1}{2}\,\frac{J_3^2}{I_3} - P \cos\theta$$
It follows that $\theta$ is an elliptic function of the time $t$. Note that the choice of the Euler angle coordinates has disentangled the dynamics, since $\phi$ and $\psi$ no longer appear. This is related to the symmetry of the problem with respect to both the vertical axis and the axis of the top.
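The conservation laws of the Lagrange top can be observed numerically by integrating eqs. (2.18) directly. The following is a minimal sketch using a hand-written fourth-order Runge–Kutta step (our choice of integrator, not something prescribed by the text); the state is the 6-tuple $(J_1, J_2, J_3, P_1, P_2, P_3)$ and the function names are illustrative.

```python
def cross(a, b):
    """Vector product a ^ b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rhs(state, inertia, h):
    """Right-hand side of eqs. (2.18) with omega_i = J_i / I_i."""
    J, P = state[:3], state[3:]
    omega = tuple(J[i] / inertia[i] for i in range(3))
    wJ, wP, hP = cross(omega, J), cross(omega, P), cross(h, P)
    dJ = tuple(-wJ[i] + hP[i] for i in range(3))
    dP = tuple(-wP[i] for i in range(3))
    return dJ + dP

def rk4_step(state, dt, inertia, h):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, f):
        return tuple(s + f * dk for s, dk in zip(state, k))
    k1 = rhs(state, inertia, h)
    k2 = rhs(shifted(k1, dt / 2), inertia, h)
    k3 = rhs(shifted(k2, dt / 2), inertia, h)
    k4 = rhs(shifted(k3, dt), inertia, h)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def invariants(state, inertia, h):
    """Return (P^2, J.P, H, h.J) for the given state."""
    J, P = state[:3], state[3:]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    energy = 0.5 * sum(J[i] ** 2 / inertia[i] for i in range(3)) - dot(P, h)
    return (dot(P, P), dot(J, P), energy, dot(h, J))

def max_drift(state, inertia, h, dt=1e-3, steps=2000):
    """Largest change of any of the four quantities over the integration."""
    start = invariants(state, inertia, h)
    for _ in range(steps):
        state = rk4_step(state, dt, inertia, h)
    end = invariants(state, inertia, h)
    return max(abs(a - b) for a, b in zip(start, end))
```

With `inertia = (1.0, 1.0, 0.5)` and `h = (0.0, 0.0, 0.7)` all four quantities should stay constant up to the integration error. For a generic top with $I_1 \neq I_2$ the last one, $\vec h \cdot \vec J$, is no longer expected to be conserved.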
2.10 The Kowalevski top
There is another, much more hidden, case of integrability of the top, which was discovered by S. Kowalevski. As before we consider the motion of the top in a moving frame with origin at the fixed point of the top. Assume that the moments of inertia obey $I_1 = I_2 = 2I_3$. Assume further that the centre of mass is in the plane $x_3 = 0$, but away from the origin, so that the top has no rotational symmetry. However, we are free to choose the inertia axes up to a rotation around the third one, hence to assume that the centre of mass is on the first axis. We introduce the traditional notations
$$\vec\omega = \begin{pmatrix} p \\ q \\ r \end{pmatrix}, \qquad \vec P = \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{pmatrix}, \qquad \vec h = \begin{pmatrix} h \\ 0 \\ 0 \end{pmatrix}$$
and we write eqs. (2.18) in components, with $c_0 = h/I_3$:
$$2\dot p = qr, \qquad 2\dot q = -pr - c_0 \gamma_3, \qquad \dot r = c_0 \gamma_2$$
$$\dot\gamma_1 = r\gamma_2 - q\gamma_3, \qquad \dot\gamma_2 = p\gamma_3 - r\gamma_1, \qquad \dot\gamma_3 = q\gamma_1 - p\gamma_2$$
The Hamiltonian and Poisson brackets are the same as in the Lagrange case. Again the Hamiltonian
$$H = \frac{I_3}{2}\,(2p^2 + 2q^2 + r^2) - h\gamma_1$$
and the following quantities
$$\vec P^2 = \gamma_1^2 + \gamma_2^2 + \gamma_3^2, \qquad \vec P \cdot \vec J = I_3 (2p\gamma_1 + 2q\gamma_2 + r\gamma_3)$$
are conserved. The last two are in the centre of the Poisson bracket, so the symplectic leaves are of dimension 4, and we need one further conserved quantity to prove integrability. To introduce it naturally, consider $z = p + iq$ and $\xi = \gamma_1 + i\gamma_2$. The equations of motion give:
$$2\dot z = -irz - ic_0\gamma_3, \qquad \dot\xi = -ir\xi + i\gamma_3 z$$
We can eliminate $\gamma_3$ by considering the combination $z^2 + c_0\xi$ which obeys:
$$\frac{d}{dt}(z^2 + c_0\xi) = -ir(z^2 + c_0\xi), \qquad \frac{d}{dt}(\bar z^2 + c_0\bar\xi) = ir(\bar z^2 + c_0\bar\xi)$$
where the second equation is obtained from the first one by complex conjugation. It is then clear that $|z^2 + c_0\xi|^2$ is conserved. In terms of the original variables, we have obtained the Kowalevski conserved quantity:
$$K = (p^2 - q^2 + c_0\gamma_1)^2 + (2pq + c_0\gamma_2)^2 \tag{2.19}$$
Note that the conditions I1 = I2 = 2I3 are essential in this calculation. The solution of this model has been obtained by S. Kowalevski and is considerably more involved than in the previous cases. The main steps of the solution will be presented in Chapters 4 and 5.
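The conservation of the Kowalevski quantity can be checked numerically along a trajectory. The sketch below integrates the six component equations for $(p, q, r, \gamma_1, \gamma_2, \gamma_3)$ with a hand-written fourth-order Runge–Kutta step (an assumed choice of integrator) and monitors $H/I_3$, $\vec P^2$, $\vec P \cdot \vec J / I_3$ and $K$ of eq. (2.19). Function names and initial data are illustrative.

```python
def kowalevski_rhs(s, c0):
    """Component form of eqs. (2.18) for I1 = I2 = 2 I3 and h along axis 1."""
    p, q, r, g1, g2, g3 = s
    return (0.5 * q * r,
            0.5 * (-p * r - c0 * g3),
            c0 * g2,
            r * g2 - q * g3,
            p * g3 - r * g1,
            q * g1 - p * g2)

def rk4_step(s, dt, c0):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, f):
        return tuple(x + f * dx for x, dx in zip(s, k))
    k1 = kowalevski_rhs(s, c0)
    k2 = kowalevski_rhs(shifted(k1, dt / 2), c0)
    k3 = kowalevski_rhs(shifted(k2, dt / 2), c0)
    k4 = kowalevski_rhs(shifted(k3, dt), c0)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

def invariants(s, c0):
    """Return (H / I3, P^2, P.J / I3, K) for the given state."""
    p, q, r, g1, g2, g3 = s
    energy = 0.5 * (2 * p * p + 2 * q * q + r * r) - c0 * g1
    weight = g1 * g1 + g2 * g2 + g3 * g3
    pj = 2 * p * g1 + 2 * q * g2 + r * g3
    K = (p * p - q * q + c0 * g1) ** 2 + (2 * p * q + c0 * g2) ** 2
    return (energy, weight, pj, K)

def max_drift(s, c0, dt=1e-3, steps=2000):
    """Largest change of any of the four quantities over the integration."""
    start = invariants(s, c0)
    for _ in range(steps):
        s = rk4_step(s, dt, c0)
    end = invariants(s, c0)
    return max(abs(a - b) for a, b in zip(start, end))
```

For example, `max_drift((0.4, -0.3, 0.8, 0.6, 0.0, 0.8), 1.2)` should be of the order of the integration error only. Changing the factors of 2 in `kowalevski_rhs`, which encode $I_1 = I_2 = 2I_3$, is a quick way to see that $K$ then generally stops being conserved.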
2.11 The Neumann model

This model deals with the motion of a particle on a sphere $S^{N-1}$ submitted to harmonic forces with generically different frequencies in each direction. It was first introduced by Neumann. An easy formulation is achieved by introducing a Lagrange multiplier, $\Lambda$, and writing the Lagrangian:
$$\mathcal{L} = \sum_{k=1}^{N} \frac{1}{2}\left(\dot x_k^2 - a_k x_k^2\right) + \frac{1}{2}\,\Lambda \left(\sum_{l=1}^{N} x_l^2 - 1\right)$$
The equations of motion are:
$$\ddot x_k = -a_k x_k + \Lambda x_k, \qquad \sum_l x_l^2 = 1$$
To compute $\Lambda$, we multiply by $x_k$ and sum over $k$, which yields $\Lambda = -\sum_k (\dot x_k^2 - a_k x_k^2)$, where we used that the constraint implies $\sum_k (x_k \ddot x_k + \dot x_k^2) = 0$. This leads to the non-linear Newton equations of motion for the particle:
$$\ddot x_k = -a_k x_k - x_k \sum_l \left(\dot x_l^2 - a_l x_l^2\right) \tag{2.20}$$
Conversely, if we start with initial conditions satisfying $\sum_k x_k^2 = 1$ and $\sum_k x_k \dot x_k = 0$, these conditions are preserved by the time evolution.

It is important for us to cast this model in the Hamiltonian formulation. This is achieved by introducing a larger phase space and then reducing by a symmetry. Consider a $2N$-dimensional phase space with coordinates $x_n, y_n$, $n = 1, \dots, N$ and canonical Poisson brackets
$$\{x_n, y_m\} = \delta_{nm} \tag{2.21}$$
and introduce the “angular momentum” antisymmetric matrix:
$$J_{kl} = x_k y_l - x_l y_k$$
and the Hamiltonian:
$$H = \frac{1}{4}\sum_{k \neq l} J_{kl}^2 + \frac{1}{2}\sum_k a_k x_k^2 \tag{2.22}$$
We shall assume in the following that $a_1 < a_2 < \cdots < a_N$. The Hamiltonian equations are, with $X = (x_k)$, $Y = (y_k)$, and the diagonal constant matrix $L_0 = (a_k \delta_{kl})$:
$$\dot X = -JX, \qquad \dot Y = -JY - L_0 X \tag{2.23}$$
The Hamiltonian and the symplectic form have a symmetry
$$Y \to Y + \lambda X, \qquad X \to X$$
and we can perform a Hamiltonian reduction under this symmetry group (see Chapter 14). The moment map is given by $M = \frac{1}{2}\sum_k x_k^2$ since $\{M, y_k\} = x_k$, $\{M, x_k\} = 0$. We fix the moment to $M = \frac{1}{2}$. The reduced phase space is then obtained by taking the quotient by the stabilizer of the moment, which is here the whole symmetry group. This amounts to imposing some gauge condition, e.g. $(X, Y) = 0$. The reduced phase space has the correct dimension $2N - 2$ for a point on a sphere.

Remark. The reduced equations of motion are equivalent to the equations of motion for the Neumann model. Indeed, the reduced system is characterized by the conditions ${}^t X X = 1$ and ${}^t X Y = 0$, but the equations of motion eq. (2.23) do not preserve the second condition. We need to perform simultaneously a time-dependent gauge transformation $Y \to Y + \lambda(t) X$ to keep the motion on the gauge surface. Writing:
$$0 = \frac{d}{dt}(X, Y + \lambda X) = (-JX, Y + \lambda X) + (X, -JY - L_0 X + \dot\lambda X - \lambda JX) = -(X, L_0 X) + \dot\lambda$$
since $J$ is antisymmetric, gives $\dot\lambda = (X, L_0 X)$. The equation of motion for $Y' = Y + \lambda X$ on the gauge surface is thus:
$$\dot Y' = (-JY - L_0 X) - \lambda JX + \dot\lambda X = -JY' - L_0 X + (X, L_0 X)\,X$$
Since $J' = J$ we have $\dot X = -J'X = Y'$ and $\dot Y' = -(Y', Y')\,X - L_0 X + (X, L_0 X)\,X$, so that eliminating $Y'$ we finally get:
$$\ddot X = -L_0 X - \big((\dot X, \dot X) - (X, L_0 X)\big)\,X$$
which is identical to eq. (2.20).
The Liouville integrability of this system is a consequence of the existence of $(N - 1)$ independent quantities in involution, first found by K. Uhlenbeck:
$$F_k = x_k^2 + \sum_{l \neq k} \frac{J_{kl}^2}{a_k - a_l}, \qquad \sum_k F_k = 1 \tag{2.24}$$
Notice that the Hamiltonian of the Neumann model can be expressed in terms of the $F_k$ as
$$H = \frac{1}{2}\sum_k a_k F_k$$
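These relations are easy to test numerically. The sketch below computes the Uhlenbeck quantities of eq. (2.24), checks $\sum_k F_k = \sum_k x_k^2$ and $H = \frac{1}{2}\sum_k a_k F_k$ with $H$ as in eq. (2.22), and then follows the unreduced flow of eq. (2.23) with a hand-written fourth-order Runge–Kutta integrator (an assumed choice) to observe that each $F_k$ stays constant. Function names and sample data are illustrative.

```python
def angular(x, y):
    """Antisymmetric matrix J_kl = x_k y_l - x_l y_k."""
    n = len(x)
    return [[x[k] * y[l] - x[l] * y[k] for l in range(n)] for k in range(n)]

def uhlenbeck(x, y, a):
    """Conserved quantities F_k of eq. (2.24)."""
    n, J = len(x), angular(x, y)
    return [x[k] ** 2 + sum(J[k][l] ** 2 / (a[k] - a[l])
                            for l in range(n) if l != k)
            for k in range(n)]

def hamiltonian(x, y, a):
    """Neumann Hamiltonian of eq. (2.22)."""
    n, J = len(x), angular(x, y)
    quad = sum(J[k][l] ** 2 for k in range(n) for l in range(n) if k != l)
    return 0.25 * quad + 0.5 * sum(a[k] * x[k] ** 2 for k in range(n))

def flow_rhs(x, y, a):
    """Right-hand side of eq. (2.23): X' = -J X, Y' = -J Y - L0 X."""
    n, J = len(x), angular(x, y)
    xdot = [-sum(J[k][l] * x[l] for l in range(n)) for k in range(n)]
    ydot = [-sum(J[k][l] * y[l] for l in range(n)) - a[k] * x[k]
            for k in range(n)]
    return xdot, ydot

def rk4(x, y, a, dt, steps):
    """Integrate eq. (2.23) with classical Runge-Kutta steps."""
    add = lambda u, k, f: [ui + f * ki for ui, ki in zip(u, k)]
    for _ in range(steps):
        k1x, k1y = flow_rhs(x, y, a)
        k2x, k2y = flow_rhs(add(x, k1x, dt / 2), add(y, k1y, dt / 2), a)
        k3x, k3y = flow_rhs(add(x, k2x, dt / 2), add(y, k2y, dt / 2), a)
        k4x, k4y = flow_rhs(add(x, k3x, dt), add(y, k3y, dt), a)
        x = [xi + dt / 6 * (p + 2 * q + 2 * r + s)
             for xi, p, q, r, s in zip(x, k1x, k2x, k3x, k4x)]
        y = [yi + dt / 6 * (p + 2 * q + 2 * r + s)
             for yi, p, q, r, s in zip(y, k1y, k2y, k3y, k4y)]
    return x, y
```

Both algebraic identities hold for arbitrary $X$ and $Y$; the normalization $\sum_k F_k = 1$ only uses $\sum_k x_k^2 = 1$.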
Alternatively we can implement the Hamiltonian reduction by considering functions of $X$ and $Y$ invariant under $Y \to Y + \lambda X$. Such invariant functions are functions of $X$ and $J$. Given $X$ and $J$, an antisymmetric rank 2 matrix whose image contains $X$, we can always find a vector $Y$, up to the above symmetry, such that $J_{kl} = x_k y_l - x_l y_k$. The equations of motion can be written in terms of the two gauge invariant matrices $J = X\,{}^tY - Y\,{}^tX$ and $K = X\,{}^tX$:
$$\dot K = -[J, K], \qquad \dot J = [L_0, K]$$
That eq. (2.23) implies the above is a simple computation. Conversely, knowing $K$ it is easy to compute $X$ since $K$ is a projector on $X$, whose length is 1, and then one can compute $Y$ knowing $J$, up to a gauge.

2.12 Geodesics on an ellipsoid

The Neumann problem was shown by Moser to contain, in particular, the geodesic motion on an ellipsoid which was found to be integrable by Jacobi. Consider the Hamiltonian belonging to the Neumann conserved quantities, but different from the usual Neumann Hamiltonian:
$$H = \sum_k \frac{1}{a_k}\,F_k = Q(X, X) - Q(X, X)\,Q(Y, Y) + Q^2(X, Y)$$
where we have defined the quadratic form:
$$Q(X, Y) = \sum_k \frac{1}{a_k}\,x_k y_k$$
The Hamiltonian $H$ is of course conserved, so we can restrict ourselves to the surface $H = 0$. Note that if one defines the quantity $\xi$, invariant under the gauge transformation $Y \to Y + \lambda X$, by
$$\xi = Y - \frac{Q(X, Y)}{Q(X, X)}\,X$$
we have $H = Q(X, X)\,(1 - Q(\xi, \xi))$, so that the condition $H = 0$ is equivalent to the condition $Q(\xi, \xi) = 1$. The flow of $H$, for $H = 0$, leaves $\xi$ on the ellipsoid $Q(\xi, \xi) = 1$.
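The three expressions for this Hamiltonian, and the gauge invariance of $\xi$, can be compared numerically at an arbitrary phase-space point. This is only a sketch with illustrative function names; `uhlenbeck` recomputes the $F_k$ of eq. (2.24) so that the block is self-contained.

```python
def q_form(x, y, a):
    """Quadratic form Q(X, Y) = sum_k x_k y_k / a_k."""
    return sum(xk * yk / ak for xk, yk, ak in zip(x, y, a))

def uhlenbeck(x, y, a):
    """Conserved quantities F_k of eq. (2.24)."""
    n = len(x)
    J = [[x[k] * y[l] - x[l] * y[k] for l in range(n)] for k in range(n)]
    return [x[k] ** 2 + sum(J[k][l] ** 2 / (a[k] - a[l])
                            for l in range(n) if l != k)
            for k in range(n)]

def xi_vector(x, y, a):
    """Gauge invariant vector xi = Y - (Q(X, Y) / Q(X, X)) X."""
    s = q_form(x, y, a) / q_form(x, x, a)
    return [yk - s * xk for xk, yk in zip(x, y)]

def geodesic_hamiltonian_forms(x, y, a):
    """Return the three expressions for H, which should coincide."""
    from_F = sum(Fk / ak for Fk, ak in zip(uhlenbeck(x, y, a), a))
    qxx, qyy, qxy = q_form(x, x, a), q_form(y, y, a), q_form(x, y, a)
    from_Q = qxx - qxx * qyy + qxy ** 2
    xi = xi_vector(x, y, a)
    from_xi = qxx * (1.0 - q_form(xi, xi, a))
    return from_F, from_Q, from_xi
```

Replacing `y` by `y + lam * x` for any number `lam` should leave `xi_vector` unchanged, which is the gauge invariance used in the text.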
We want to show that the trajectories of $\xi$ are geodesics on the ellipsoid. We compute the time evolution of $\xi$, and for this we remark that since $\xi$ is gauge invariant, one can use the unreduced equations of motion:
$$\dot x_k = \frac{\partial H}{\partial y_k} = -2Q(X, X)\,\frac{y_k}{a_k} + 2Q(X, Y)\,\frac{x_k}{a_k} = -2Q(X, X)\,\frac{\xi_k}{a_k}$$
$$\dot y_k = -\frac{\partial H}{\partial x_k} = -2\big(1 - Q(Y, Y)\big)\frac{x_k}{a_k} - 2Q(X, Y)\,\frac{y_k}{a_k}$$
Note that, since $X$ is gauge independent, so is its time derivative. Denoting $s = Q(X, Y)/Q(X, X)$ we then have:
$$\dot\xi_k = -2\left(1 - Q(Y, Y) + \frac{Q^2(X, Y)}{Q(X, X)}\right)\frac{x_k}{a_k} - \dot s\,x_k$$
The bracket equals $H/Q(X, X)$, so on the surface $H = 0$ this becomes simply:
$$\frac{d\xi_k}{dt} = -\dot s\,x_k \implies \frac{d\xi_k}{ds} = -x_k$$
Since $\sum_k x_k^2 = 1$ the length of the vector $d\xi/ds$ is 1, which means that $s$ is the length parameter on the trajectory of $\xi$. Next we compute:
$$\frac{d^2\xi_k}{ds^2} = -\frac{1}{\dot s}\,\frac{dx_k}{dt} = 2\,\frac{Q(X, X)}{\dot s}\,\frac{\xi_k}{a_k} \tag{2.25}$$
The vector with components $\xi_k/a_k$ is proportional to the gradient of $Q(\xi, \xi)$, hence is normal to the ellipsoid $Q(\xi, \xi) = 1$. Equation (2.25) shows that the second derivative of $\xi$ with respect to the length parameter $s$ is normal to the surface. This characterizes geodesics, as we now show. Indeed, to find geodesics on the surface $f(\xi) = 0$, we have to minimize the arc length:
$$\int \left(\sqrt{\dot\xi \cdot \dot\xi} + \Lambda f(\xi)\right) dt$$
where $\Lambda$ is a Lagrange parameter. The Euler–Lagrange equation reads:
$$\frac{d}{dt}\,\frac{\dot\xi}{\sqrt{\dot\xi \cdot \dot\xi}} = \Lambda \nabla f(\xi)$$
So the derivative of the normalized velocity vector is perpendicular to the surface.

The geodesic motion on an ellipsoid was solved originally by Jacobi by introducing ellipsoidal coordinates which separate the variables for this problem. As they separate the variables of the Neumann model as well, we now explain this method for the Neumann model.
2.13 Separation of variables in the Neumann model

Following Jacobi and Neumann, we introduce $(N - 1)$ parameters on the sphere $\zeta_1, \dots, \zeta_{N-1}$. They are the roots of the equation:
$$u(\zeta) \equiv \sum_k \frac{x_k^2}{\zeta - a_k} = 0$$
This equation is invariant by $x_k \to \lambda x_k$ so that $\zeta_1 < \zeta_2 < \cdots < \zeta_{N-1}$ are indeed defined on the sphere. Conversely, by definition of the $\zeta_j$ we have for $x \in S^{N-1}$:
$$u(\zeta) = \frac{\prod_j (\zeta - \zeta_j)}{\prod_k (\zeta - a_k)} \implies x_k^2 = \frac{\prod_j (a_k - \zeta_j)}{\prod_{l \neq k} (a_k - a_l)} \tag{2.26}$$
Considering the graph of $u(\zeta)$ it is easy to see that:
$$a_1 < \zeta_1 < a_2 < \zeta_2 < a_3 < \cdots < \zeta_{N-1} < a_N$$
and we have a bijection of this domain $D$ of the $\zeta_j$ on the “quadrant” $x_k > 0\ \forall k$ of the sphere. The $(\zeta_j)$ define an orthogonal system of coordinates on the sphere, of ellipsoidal type. Let us consider, for each root $\zeta_j$, the vector:
$$\frac{\partial x}{\partial \zeta_j} = \frac{1}{2}\,v_j, \qquad v_j = \left(\frac{x_1}{\zeta_j - a_1}, \dots, \frac{x_N}{\zeta_j - a_N}\right) \tag{2.27}$$
Since $\zeta_j$ solves $u(\zeta) = 0$, we have $x \cdot v_j = 0$. Moreover,
$$v_j \cdot v_{j'} = \sum_k \frac{x_k^2}{(\zeta_j - a_k)(\zeta_{j'} - a_k)} = -\frac{u(\zeta_j) - u(\zeta_{j'})}{\zeta_j - \zeta_{j'}} = 0, \qquad \text{if } j \neq j'$$
Therefore the vectors $v_j$ are $(N - 1)$ orthogonal vectors in the tangent plane to the sphere $S^{N-1}$ at the point $x$. As a byproduct, since $v_j^2 = -u'(\zeta_j)$ we also get the metric tensor:
$$g_{jj'} = \frac{\partial x}{\partial \zeta_j} \cdot \frac{\partial x}{\partial \zeta_{j'}} = -\frac{1}{4}\,\delta_{jj'}\,u'(\zeta_j)$$
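The ellipsoidal coordinates can be computed numerically from a point of the sphere. The sketch below assumes that all $x_k$ are non-zero, so that $u(\zeta)$ decreases from $+\infty$ to $-\infty$ on each interval $(a_j, a_{j+1})$, and locates the single root in each interval by bisection (our choice of root finder). It then rebuilds $x_k^2$ from eq. (2.26) and forms the vectors $v_j$ of eq. (2.27). Function names are illustrative.

```python
def u(zeta, x, a):
    """u(zeta) = sum_k x_k^2 / (zeta - a_k)."""
    return sum(xk * xk / (zeta - ak) for xk, ak in zip(x, a))

def ellipsoidal_coordinates(x, a, iterations=200):
    """Roots zeta_j of u, one in each interval (a_j, a_{j+1}); a must be sorted."""
    roots = []
    for lo, hi in zip(a[:-1], a[1:]):
        # u is positive just above lo and negative just below hi.
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if u(mid, x, a) > 0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

def squares_from_zetas(zetas, a):
    """x_k^2 reconstructed from eq. (2.26)."""
    result = []
    for k, ak in enumerate(a):
        num = 1.0
        for z in zetas:
            num *= ak - z
        den = 1.0
        for l, al in enumerate(a):
            if l != k:
                den *= ak - al
        result.append(num / den)
    return result

def tangent_vectors(x, zetas, a):
    """Vectors v_j of eq. (2.27), one per root zeta_j."""
    return [[xk / (z - ak) for xk, ak in zip(x, a)] for z in zetas]
```

For $N = 3$ and a unit vector `x`, the two roots returned should interlace with `a`, `squares_from_zetas` should reproduce the squares of the components of `x`, and the two tangent vectors should be orthogonal to each other and to `x`.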
To compute the momenta conjugated to the variables $\zeta_j$, we consider the canonical 1-form $\alpha = \sum_k y_k\,dx_k$ associated with the Poisson bracket eq. (2.21). We write it as $\alpha = \sum_j p_j\,d\zeta_j$. One gets
$$p_j = \sum_k y_k\,\frac{\partial x_k}{\partial \zeta_j} = \frac{1}{2}\,y \cdot v_j$$
These $(N - 1)$ equations determine $y$ up to a vector proportional to $x$, which does not affect the value of $J_{kl}$. A solution is $y = \frac{1}{2}\sum_j g^{jj} p_j v_j$, which easily gives:
$$J_{kl} = -\frac{1}{2}\,(a_k - a_l) \sum_j v_j^k v_j^l\, g^{jj} p_j$$
With this, we can compute the conserved quantities $F_k$, eq. (2.24), in terms of the new canonical coordinates $\zeta_j, p_j$:
$$F_k = x_k^2 + \frac{1}{4}\sum_{jj'} \left(a_k\, v_j \cdot v_{j'} - \sum_l a_l\, v_j^l v_{j'}^l\right) v_j^k v_{j'}^k\, g^{jj} g^{j'j'} p_j p_{j'}$$
Noting that $v_j \cdot v_{j'} = 4 g_{jj}\,\delta_{jj'}$ and $\sum_l a_l\, v_j^l v_{j'}^l = 4 \zeta_j\, g_{jj}\,\delta_{jj'}$, we obtain:
$$F_k = x_k^2 \left(1 - \sum_j \frac{g^{jj} p_j^2}{\zeta_j - a_k}\right)$$
It is convenient to introduce the generating function for the $F_k$:
$$H(\lambda) \equiv \sum_k \frac{F_k}{\lambda - a_k} = \frac{\prod_n (\lambda - b_n)}{\prod_k (\lambda - a_k)} \tag{2.28}$$
for appropriate $b_n$, $n = 1, \dots, N - 1$, and we have used $\sum_k F_k = 1$ to normalize the leading coefficient in the numerator. By a simple calculation we find:
$$H(\lambda) = u(\lambda) \left(1 - \sum_j \frac{g^{jj} p_j^2}{\zeta_j - \lambda}\right) \tag{2.29}$$
Following the general strategy of the Liouville theorem, we express the momenta $p_j$ in terms of the conserved quantities $F_k$ and the $\zeta_j$. We have:
$$g^{jj} p_j^2 = \lim_{\lambda \to \zeta_j} \frac{\lambda - \zeta_j}{u(\lambda)}\,H(\lambda) \implies p_j^2 = -\frac{1}{4}\,H(\zeta_j)$$
where we have taken into account the value of the metric tensor and eq. (2.26). Notice that, on $M_f$, the momentum $p_j$ is a function of $\zeta_j$ only, so that the coordinates $\zeta_j$ form a set of separated variables for the Neumann model.
2.13 Separation of variables in the Neumann model The function S = 1 S= 2 j
ζj
j
29
pj dζj reads
1 −H(ζ) dζ = 2
ζj
j
(ζ − bn ) − n dζ k (ζ − ak )
(2.30)
and is a sum of terms, one for each separated variable. Choosing as independent action variables the (N −1) independent quantities bn instead of the N dependent quantities Fk , the conjugate variables ψn are: ∂S 1 ζj dζ ψn = =− −H(ζ) (2.31) ∂bn 4 ζ − bn j
By the Liouville theorem, the time evolution of the ψn under the Hamiltonian H(λ) is linear: ∂H(λ) H(λ) 1 ζ˙j −H(ζj ) ˙ ψn = =− =− (2.32) ∂bn λ − bn 4 ζj − bn j
where the last equality is obtained by differentiating eq. (2.31) with respect to time. Equation (2.32) gives the evolution of the variables $\zeta_j$. This can be formulated in a more geometrical way by introducing the polynomial of degree $(2N-1)$:
$$P(\zeta) = \prod_{i=1}^{N}(\zeta - a_i)\prod_{n=1}^{N-1}(\zeta - b_n)$$
We can rewrite eq. (2.32) as the set of $N-1$ equations:
$$\sum_j \frac{Q_n(\zeta_j)\, d\zeta_j}{\sqrt{-P(\zeta_j)}} = 4\,\frac{Q_n(\lambda)}{\prod_i(\lambda - a_i)}\, dt, \qquad n = 1,\dots,N-1$$
with $Q_n(\lambda) = \prod_{m\neq n}(\lambda - b_m)$. Since the $Q_n(\lambda)$ are $N-1$ linearly independent polynomials of degree $N-2$, this system is equivalent to the following one:
$$\sum_j \frac{\zeta_j^k\, d\zeta_j}{\sqrt{-P(\zeta_j)}} = 4\,\frac{\lambda^k}{\prod_i(\lambda - a_i)}\, dt, \qquad k = 0,\dots,N-2 \tag{2.33}$$
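As a minimal illustration of eq. (2.33), consider the lowest case $N = 2$. This choice is an assumption made only for the example: there is then a single separated variable $\zeta$ and a single parameter $b$, and $t$ is the time generated by $H(\lambda)$ at fixed $\lambda$. Only the equation $k = 0$ survives and $P$ is a cubic, so $\zeta(t)$ is obtained by inverting an elliptic integral:

```latex
% Neumann model with N = 2: one separated variable \zeta, one parameter b
P(\zeta) = (\zeta - a_1)(\zeta - a_2)(\zeta - b), \qquad
\frac{d\zeta}{\sqrt{-P(\zeta)}} = \frac{4\, dt}{(\lambda - a_1)(\lambda - a_2)}
% integrating between \zeta(0) and \zeta(t):
\int_{\zeta(0)}^{\zeta(t)} \frac{d\zeta}{\sqrt{-(\zeta - a_1)(\zeta - a_2)(\zeta - b)}}
  = \frac{4\, t}{(\lambda - a_1)(\lambda - a_2)}
```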
The quantities $\zeta^k d\zeta/\sqrt{-P(\zeta)}$, $k = 0,\dots,N-2$, are the Abelian differentials of the first kind on the hyperelliptic Riemann surface of genus $g = N-1$ given by the equation $s^2 + P(\zeta) = 0$, see Chapter 15. The sums appearing in the left-hand side are thus Abel sums, and describe a point in the Jacobian of the curve. Equation (2.33) shows that this point moves linearly in time. This relationship between integrability, separation of variables, Riemann surfaces and linear flows on their Jacobian will reappear in a much broader context in the following chapters.
References
[1] J.L. Lagrange, Mécanique analytique (1788). In Oeuvres de Lagrange, XII, Gauthier–Villars (1889), Paris.
[2] C. Jacobi, Vorlesungen über Dynamik, Gesammelte Werke, Supplementband (1884) Berlin.
[3] J. Liouville, Note sur l'intégration des équations différentielles de la dynamique. Journal de Mathématiques (Journal de Liouville) XX (1855) 137.
[4] C. Neumann, De problemate quodam mechanico, quod ad primam integralium ultraellipticorum classem revocatur. Crelle Journal 56 (1859) 46–63.
[5] Sophie Kowalevski, Sur le problème de la rotation d'un corps solide autour d'un point fixe. Acta Mathematica 12 (1889) 177–232.
[6] C.S. Gardner, J.M. Greene, M.D. Kruskal and R.M. Miura, Method for solving the Korteweg–de Vries equation. Phys. Rev. Lett. 19 (1967) 1095.
[7] P.D. Lax, Integrals of non-linear equations of evolution and solitary waves. Comm. Pure Appl. Math. 21 (1968) 467.
[8] L. Landau and E. Lifchitz, Mécanique. MIR (1969) Moscow.
[9] K. Uhlenbeck, Minimal 2-spheres and tori in $S^k$. Preprint (1975).
[10] V. Arnold, Méthodes mathématiques de la mécanique classique. MIR (1976) Moscow.
[11] J. Moser, Various aspects of integrable Hamiltonian systems. Proc. CIME Bressanone, Progress in Mathematics, 8 Birkhäuser (1978) 233.
[12] E.K. Sklyanin, On the complete integrability of the Landau–Lifchitz equation. Preprint LOMI E-3-79. Leningrad, 1979.
[13] M. Semenov-Tian-Shansky, What is a classical r-matrix? Funct. Anal. and Appl. 17 4 (1983) 17.
[14] D. Mumford, Tata lectures on Theta II. Progress in Mathematics, 43 (1984) Birkhäuser.
3 Synopsis of integrable systems
In this chapter, we introduce Lax pairs with spectral parameters. These are Lax matrices L(λ) and M(λ) depending analytically on a parameter λ. The study of the analytical properties of the Lax equation $\dot L(\lambda) = [M(\lambda), L(\lambda)]$ yields considerable insight into its structure and, in fact, quickly introduces many of the major objects and concepts which will be developed in depth in the subsequent chapters. The first important result is that the possible forms of M(λ) are completely determined by eq. (3.15). This form of M(λ) is such that the commutator [M(λ), L(λ)] has the same polar structure as L(λ). The Lax equation has then a natural interpretation as a flow on a coadjoint orbit of a loop group. This has in turn the important consequence of introducing a symplectic structure into the theory, allowing us to connect with Liouville integrability. Moreover, this geometric interpretation of the Lax equation lends itself to its solution by factorization in a loop group, which is a Riemann–Hilbert problem. Studying the analytic structure of M(λ), we are led to consider an infinite family of elementary flows, depending on the order of the poles. This introduces a connection between time flows and the spectral parameter dependence, which finds a striking expression in Sato's formula expressing the wave function in terms of tau-functions, eq. (3.61). The same ideas are exploited to analyse field theories. Here the Lax equation is replaced by a zero curvature equation for a Lax connection depending on a spectral parameter. We show that the role of the Lax matrix is played by the so-called monodromy matrix. Starting from a linear Poisson bracket in the r-matrix form for the Lax connection, we get a quadratic Poisson bracket for the monodromy matrix. The factorization problem allows us to define a group of transformations, the dressing group, acting on the solutions of the equations of motion. It is shown that this
action is a Lie–Poisson action whose generator is the monodromy matrix. Finally, we use simple dressing elements to produce the so-called soliton solutions.
3.1 Examples of Lax pairs with spectral parameter
We begin our study of Lax pairs depending on a complex parameter λ, called the spectral parameter, by giving a few examples.
Example 1. Our first example will be provided by the Euler top, see section (2.8) in Chapter 2. In this case, a Lax pair appears naturally. Let us introduce the 3 × 3 matrices $J_{ij} = \epsilon_{ijk} J_k$ and $\Omega_{ij} = \epsilon_{ijk}\omega_k$. Then the equation of motion $\frac{d\vec J}{dt} = -\vec\omega\wedge\vec J$ can be recast in matrix form:
$$\frac{dJ}{dt} = [\Omega, J]$$
This is a Lax pair with $L = J$ and $M = \Omega$, but unfortunately the conserved quantities, $\mathrm{Tr}\, L^n$, either vanish or are functions of $\vec J^{\,2}$, and therefore the Hamiltonian is not included in this set of conserved quantities (recall that $\vec J^{\,2}$ is in the centre of the Poisson bracket). To cure this problem some modifications are needed. Let us introduce a diagonal matrix:
$$\mathcal{I} = \mathrm{Diag}(\mathcal{I}_1, \mathcal{I}_2, \mathcal{I}_3) \quad\text{with}\quad \mathcal{I}_k = \tfrac{1}{2}(I_i + I_j - I_k)$$
where $(i, j, k)$ is a cyclic permutation of $(1, 2, 3)$. With these notations we have
$$J = \mathcal{I}\,\Omega + \Omega\,\mathcal{I}$$
We assume that all $\mathcal{I}_j$ are different and we set:
$$L(\lambda) = \mathcal{I}^2 + \frac{1}{\lambda}J, \qquad M(\lambda) = \lambda\,\mathcal{I} + \Omega \tag{3.1}$$
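where λ is a free arbitrary parameter, the so-called spectral parameter. To check that the Lax equation gives back the equations of motion, we compute:
$$\dot L(\lambda) - [M(\lambda), L(\lambda)] = [J, \mathcal{I}] + [\mathcal{I}^2, \Omega] + \frac{1}{\lambda}\big(\dot J + [J, \Omega]\big)$$
The first two terms cancel, while the vanishing of the 1/λ-term gives the equations of motion. This Lax pair is much better than the previous one because:
$$\mathrm{Tr}\, L^2(\lambda) = \mathrm{Tr}\,\mathcal{I}^4 - \frac{2}{\lambda^2}\vec J^{\,2}$$
$$\mathrm{Tr}\, L^3(\lambda) = \mathrm{Tr}\,\mathcal{I}^6 + \frac{3}{\lambda^2}\mathrm{Tr}\,\mathcal{I}^2 J^2 = \mathrm{Tr}\,\mathcal{I}^6 - \frac{3}{\lambda^2}\Big(\tfrac{1}{2}\big[(\mathrm{Tr}\,\mathcal{I})^2 + \mathrm{Tr}\,\mathcal{I}^2\big]\vec J^{\,2} - I_1 I_2 I_3\sum_i \frac{J_i^2}{I_i}\Big)$$
where $\sum_i J_i^2/I_i$ is, up to normalization, the Hamiltonian $H$ of the Euler top,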
hence we now do have the Hamiltonian among the conserved quantities of the form $\mathrm{Tr}\, L^n(\lambda)$. The new important point is that the Lax matrix depends on a spectral parameter λ and this was necessary to generate the proper conserved quantities. Furthermore, the Lax equation holds true identically in λ.
Example 2. As a second example we consider the Lagrange top. The matrices L(λ) and M(λ) are written as 4 × 4 matrices in block form,
$$L(\lambda) = \begin{pmatrix} 0 & I\,{}^t h + \lambda^{-2}\,{}^t P \\ I h + \lambda^{-2} P & \lambda^{-1} J \end{pmatrix}, \qquad M(\lambda) = \begin{pmatrix} 0 & \lambda\,{}^t h \\ \lambda h & \Omega \end{pmatrix} \tag{3.2}$$
where the 3 × 3 matrices $J$ and $\Omega$ are as in the previous example, and $h$ and $P$ are 3 × 1 matrices corresponding to the vectors $\vec h$ and $\vec P$ of the Lagrange top, see section (2.9) in Chapter 2. Moreover, $I$ stands for the two equal moments of inertia of the top, $I = I_1 = I_2$. Let us write the Lax equation $0 = \dot L - [M, L]$, or:
$$0 = \begin{pmatrix} 0 & I\,{}^t h\,\Omega - {}^t h J + \lambda^{-2}({}^t\dot P + {}^t P\,\Omega) \\ -I\,\Omega h + J h + \lambda^{-2}(\dot P - \Omega P) & \lambda^{-1}(\dot J + [J, \Omega] + P\,{}^t h - h\,{}^t P) \end{pmatrix}$$
Due to the Lagrange condition $I = I_1 = I_2$ we have $I\,\Omega h = J h$, and the vanishing of the other elements reduces to the equations of motion $\dot J + [J, \Omega] + P\,{}^t h - h\,{}^t P = 0$ and $\dot P = \Omega P$.
Example 3. Finally consider the Neumann model. As we have seen in section (2.11) in Chapter 2, the equations of motion on gauge invariant quantities are:
$$\dot K = -[J, K], \qquad \dot J = [L_0, K]$$
To recast these two relations into the Lax form we introduce the matrices
$$L(\lambda) = L_0 + \frac{1}{\lambda}J - \frac{1}{\lambda^2}K, \qquad M(\lambda) = -\frac{1}{\lambda}K \tag{3.3}$$
and compute:
$$\dot L(\lambda) - [M(\lambda), L(\lambda)] = \frac{1}{\lambda}\big(\dot J - [L_0, K]\big) - \frac{1}{\lambda^2}\big(\dot K + [J, K]\big)$$
The Lax equation with spectral parameter is equivalent to the vanishing of the two coefficients $\dot J - [L_0, K]$ and $\dot K + [J, K]$. Hence the Lax equation is equivalent to the equations of motion of the Neumann model. There also exists a Lax pair with spectral parameter for the Kowalevski top; however, its construction is more involved and will be discussed in Chapter 4.
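Returning to Example 1, the statements about the Euler top can be checked with exact rational arithmetic. The sketch below is only an illustration: the sample values of the moments of inertia, the angular momentum and λ are arbitrary assumptions, the helper names (`mul`, `lin`, `eps`, ...) are introduced here and do not come from the text, and `cI` stands for the diagonal matrix with entries $(I_i + I_j - I_k)/2$. It verifies that $J$ is the anticommutator of that diagonal matrix with $\Omega$, that the λ-independent terms of the Lax equation cancel, that the Lax equation holds on the equations of motion, and the trace formulas, including the expression of the $1/\lambda^2$ term of $\mathrm{Tr}\,L^3$ through $\vec J^{\,2}$ and $\sum_i J_i^2/I_i$.

```python
from fractions import Fraction as Fr

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lin(A, B, s=1):
    """Entrywise A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, c):
    return [[c * a for a in row] for row in A]

def comm(A, B):
    return lin(mul(A, B), mul(B, A), -1)

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def eps(v):
    """Antisymmetric 3x3 matrix with entries eps_ijk v_k."""
    x, y, z = v
    return [[0, z, -y], [-z, 0, x], [y, -x, 0]]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def is_zero(A):
    return all(x == 0 for row in A for x in row)

# Sample data, chosen arbitrarily for the check (not taken from the text).
I = [Fr(3), Fr(4), Fr(5)]                 # moments of inertia I_1, I_2, I_3
Jv = [Fr(1), Fr(2), Fr(-3)]               # angular momentum vector
w = [Jv[k] / I[k] for k in range(3)]      # omega_k = J_k / I_k
Jdot_v = [-c for c in cross(w, Jv)]       # equation of motion dJ/dt = -omega ^ J

# Diagonal matrix with entries (I_i + I_j - I_k)/2; here Diag(3, 2, 1).
cal = [(I[(k + 1) % 3] + I[(k + 2) % 3] - I[k]) / 2 for k in range(3)]
cI = [[cal[i] if i == j else Fr(0) for j in range(3)] for i in range(3)]
cI2 = mul(cI, cI)
J, Om, Jdot = eps(Jv), eps(w), eps(Jdot_v)

lam = Fr(7, 3)                            # arbitrary value of the spectral parameter
L = lin(cI2, scale(J, 1 / lam))           # L = cI^2 + J / lambda
M = lin(scale(cI, lam), Om)               # M = lambda cI + Omega
lax_residual = lin(scale(Jdot, 1 / lam), comm(M, L), -1)

J2 = sum(c * c for c in Jv)               # squared norm of the vector J
tr_I2J2 = trace(mul(cI2, mul(J, J)))
checks = {
    "J = cI.Om + Om.cI": J == lin(mul(cI, Om), mul(Om, cI)),
    "[J, cI] + [cI^2, Om] = 0": is_zero(lin(comm(J, cI), comm(cI2, Om))),
    "Lax equation": is_zero(lax_residual),
    "Tr L^2": trace(mul(L, L)) == trace(mul(cI2, cI2)) - 2 * J2 / lam**2,
    "Tr L^3": trace(mul(L, mul(L, L)))
              == trace(mul(cI2, mul(cI2, cI2))) + 3 * tr_I2J2 / lam**2,
    "Tr cI^2 J^2": tr_I2J2 == -(trace(cI)**2 + trace(cI2)) * J2 / 2
                   + I[0] * I[1] * I[2] * sum(Jv[k]**2 / I[k] for k in range(3)),
}
print(checks)
```

Because `Fraction` arithmetic is exact, each entry of `checks` is a genuine equality test rather than a comparison up to rounding.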
3.2 The Zakharov–Shabat construction
Given an integrable system, there does not yet exist a useful algorithm to construct a Lax pair. There does exist, however, a general procedure, due to Zakharov and Shabat, to construct consistent Lax pairs giving rise to integrable systems. This is a general method to construct matrices L(λ) and M(λ), depending on a spectral parameter λ, such that the Lax equation
$$\partial_t L(\lambda) = [M(\lambda), L(\lambda)] \tag{3.4}$$
is equivalent to the equations of motion of an integrable system. The method consists of specifying the analytical properties of the matrices L(λ) and M(λ), λ ∈ C. We consider here systems with a finite number of degrees of freedom. The main result is eq. (3.15) expressing the possible forms of the matrix M in the Lax pair. We will end the section by showing that the previous examples do fit into this framework. We first introduce a notation. For any matrix valued rational function $f(\lambda)$ with poles of order $n_k$ at points $\lambda_k$ at finite distance, we can decompose $f(\lambda)$ as
$$f(\lambda) = f_0 + \sum_k f_k(\lambda), \quad\text{with}\quad f_k(\lambda) = \sum_{r=-n_k}^{-1} f_{k,r}(\lambda - \lambda_k)^r$$
with $f_0$ a constant. The quantity $f_k(\lambda)$ is called the polar part at $\lambda_k$. When there is no ambiguity about the pole we are considering, we will often use the alternative notation $f_-(\lambda) \equiv f_k(\lambda)$. Around one of the points $\lambda_k$, $f(\lambda)$ may be decomposed as follows:
$$f(\lambda) = f(\lambda)_+ + f(\lambda)_- \tag{3.5}$$
with $f(\lambda)_+$ regular at the point $\lambda_k$ and $f(\lambda)_- = f_k(\lambda)$ being the polar part. Let us now consider matrices L(λ) and M(λ) of dimension N × N. We will assume that the matrices L(λ) and M(λ) are rational functions of the parameter λ. Let $\{\lambda_k\}$ be the set of their poles, namely the poles of L(λ) and those of M(λ). With the above notations, assuming no pole at infinity, we can write quite generally:
$$L(\lambda) = L_0 + \sum_k L_k(\lambda), \quad\text{with}\quad L_k(\lambda) \equiv \sum_{r=-n_k}^{-1} L_{k,r}(\lambda - \lambda_k)^r \tag{3.6}$$
and
$$M(\lambda) = M_0 + \sum_k M_k(\lambda), \quad\text{with}\quad M_k(\lambda) \equiv \sum_{r=-m_k}^{-1} M_{k,r}(\lambda - \lambda_k)^r \tag{3.7}$$
Here $n_k$ and $m_k$ refer to the order of the poles at the corresponding point $\lambda_k$. The coefficients $L_{k,r}$ and $M_{k,r}$ are matrices. We will assume that the positions of the poles $\lambda_k$ are constants independent of time. The Lax equation (3.4), with L(λ) and M(λ) given by eqs. (3.6, 3.7), must hold identically in λ. Looking at eq. (3.4) we see that the pole $\lambda_k$ in the left-hand side is a priori of order $n_k$ while in the right-hand side it is potentially of order $n_k + m_k$. Hence we have two types of equation. The first type does not contain time derivatives and comes from setting to zero the coefficients of the poles of order greater than $n_k$ in the right-hand side of the equation. This will be interpreted as $m_k$ constraint equations on $M_k$. The equations of the second type are obtained by matching the coefficients of the poles of order less than or equal to $n_k$ on both sides of the equation. These equations contain time derivatives and are thus the true dynamical equations.
Proposition. Assuming that L(λ) has distinct eigenvalues in a neighbourhood of $\lambda_k$, one can perform a regular similarity transformation $g^{(k)}(\lambda)$ diagonalizing L(λ) in a vicinity of $\lambda_k$:
$$L(\lambda) = g^{(k)}(\lambda)\, A^{(k)}(\lambda)\, g^{(k)-1}(\lambda) \tag{3.8}$$
where $A^{(k)}(\lambda)$ is diagonal and has a pole of order $n_k$ at $\lambda_k$. As a result, the decomposition of L(λ) and M(λ) in polar parts reads:
$$L = L_0 + \sum_k L_k, \quad\text{with}\quad L_k = \big(g^{(k)} A^{(k)} g^{(k)-1}\big)_- \tag{3.9}$$
$$M = M_0 + \sum_k M_k, \quad\text{with}\quad M_k = \big(g^{(k)} B^{(k)} g^{(k)-1}\big)_- \tag{3.10}$$
where $B^{(k)}(\lambda)$ has a pole of order $m_k$ at $\lambda_k$. Moreover, the Lax equation implies that $B^{(k)}(\lambda)$ is diagonal.
Proof. If $\lambda_k$ is a pole of L(λ), demanding that L(λ) has distinct eigenvalues in a neighbourhood of $\lambda_k$ means that $L_{k,-n_k}$ has distinct eigenvalues. Then the matrix $Q(\lambda) = (\lambda - \lambda_k)^{n_k} L(\lambda)$, which is regular at $\lambda_k$, can be diagonalized in a vicinity of $\lambda_k$ with a regular matrix $g^{(k)}(\lambda)$. This proves eq. (3.8). Then defining $B^{(k)}(\lambda)$ by
$$M(\lambda) = g^{(k)}(\lambda)\, B^{(k)}(\lambda)\, g^{(k)-1}(\lambda) + \partial_t g^{(k)}(\lambda)\, g^{(k)-1}(\lambda) \tag{3.11}$$
the Lax equation becomes:
$$\dot A^{(k)}(\lambda) = [B^{(k)}(\lambda), A^{(k)}(\lambda)]$$
This implies $\dot A^{(k)} = 0$ as expected (because the commutator with a diagonal matrix has no element on the diagonal), and, moreover, if we assume that the diagonal elements of $A^{(k)}$ are all distinct this equation implies that $B^{(k)}$ is also diagonal. Finally, the term $\partial_t g^{(k)}\, g^{(k)-1}$ is regular and does not contribute to the singular part $M_k$ of $M$ at $\lambda_k$. Hence $M_k = (g^{(k)} B^{(k)} g^{(k)-1})_-$, which only depends on $B^{(k)}_-$.
It is worth noting that the first $n$ coefficients of the expansion of $g^{(k)}(\lambda)$ only depend on the first $n$ coefficients of the expansion of $Q(\lambda)$. The matrix $g^{(k)}(\lambda)$ is defined up to a right multiplication by an arbitrary analytic diagonal matrix. Note that this simultaneous diagonalization of L(λ) and M(λ) works around any point where L(λ) has distinct eigenvalues.
This proposition clarifies the structure of the Lax pair. Only the singular parts of $A^{(k)}$ and $B^{(k)}$ contribute to $L_k$ and $M_k$. The independent parameters in L(λ) are thus $L_0$, the singular diagonal matrices $A^{(k)}_-$ of the form
$$A^{(k)}_- = \sum_{r=-n_k}^{-1} A_{k,r}(\lambda - \lambda_k)^r \tag{3.12}$$
and jets of regular matrices $g^{(k)}$ of order $n_k - 1$, defined up to right multiplication by a regular diagonal matrix $d^{(k)}(\lambda)$:
$$g^{(k)} = \sum_{r=0}^{n_k - 1} g_{k,r}(\lambda - \lambda_k)^r \tag{3.13}$$
From these data, we can reconstruct the Lax matrix L(λ) by defining $L = L_0 + \sum_k L_k$ with
$$L_k \equiv \big(g^{(k)} A^{(k)}_- g^{(k)-1}\big)_- \tag{3.14}$$
Then around each $\lambda_k$ one can diagonalize $L(\lambda) = g^{(k)} A^{(k)} g^{(k)-1}$. This yields an extension of the matrices $A^{(k)}_-$ and of the jets $g^{(k)}$ to complete series $A^{(k)}$ and $g^{(k)}$ in $(\lambda - \lambda_k)$. Finally, to define $M(\lambda) = M_0 + \sum_k M_k$, we choose a set of polar matrices $(B^{(k)}(\lambda))_-$ and use the series $g^{(k)}$ to define $M_k$ by eq. (3.10). In the vicinity of a singularity, L(λ) and M(λ) can be simultaneously diagonalized if the Lax equation holds true. In this diagonal gauge, the
Lax equation simply states that the matrix $A^{(k)}(\lambda)$ is conserved and that $B^{(k)}(\lambda)$ is diagonal. When we transform these results into the original gauge, we get the general solution of the non-dynamical constraints on M(λ):
Proposition. Let L(λ) be a Lax matrix of the form eq. (3.6). The general form of the matrix M(λ) such that the orders of the poles match on both sides of the Lax equation is $M = M_0 + \sum_k M_k$ with
$$M_k = \big(P^{(k)}(L, \lambda)\big)_- \tag{3.15}$$
where $P^{(k)}(L, \lambda)$ is a polynomial in L(λ) with coefficients rational in λ and $(\ )_-$ denotes the singular part at $\lambda = \lambda_k$.
Proof. It is easy to show that this is indeed a solution. We have to check that the order of the poles is correct. Let us look at what happens around $\lambda = \lambda_k$. Using a beautiful argument first introduced by Gelfand and Dickey we write:
$$[M_k, L]_- = \Big[\big(P^{(k)}(L,\lambda)\big)_-, L\Big]_- = \Big[P^{(k)}(L,\lambda) - \big(P^{(k)}(L,\lambda)\big)_+, L\Big]_- = -\Big[\big(P^{(k)}(L,\lambda)\big)_+, L\Big]_-$$
where we used that a polynomial in $L$ commutes with $L$. Since $(P^{(k)})_+$ is regular at $\lambda_k$, we see from this that the order of the pole at $\lambda_k$ does not exceed $n_k$. To show that this is a general solution, recall eqs. (3.8, 3.10). Since $A^{(k)}(\lambda)$ is a diagonal N × N matrix with all its elements distinct in a vicinity of $\lambda_k$, its powers 0 up to N − 1 span the space of diagonal matrices and one can write
$$B^{(k)} = P^{(k)}(A^{(k)}, \lambda) \tag{3.16}$$
where $P^{(k)}(A^{(k)}, \lambda)$ is a polynomial of degree N − 1 in $A^{(k)}$. The coefficients of $P^{(k)}$ are rational combinations of the matrix elements of $A^{(k)}$ and $B^{(k)}$, hence admit Laurent expansions in $\lambda - \lambda_k$ in a vicinity of $\lambda_k$. Inserting eq. (3.16) into eq. (3.10) one gets $M_k = \big(P^{(k)}(L, \lambda)\big)_-$. Moreover, in this formula the Laurent expansions of the coefficients of $P^{(k)}$ can be truncated at some positive power of $\lambda - \lambda_k$ since a high enough power cannot contribute to the singular part, yielding a polynomial with coefficients Laurent polynomials in $\lambda - \lambda_k$. It is important to realize that the dynamical variables are the matrix elements of the Lax matrix, or the matrix elements of the $L_{k,r}$. Choosing
the number and the order of the poles of the Lax matrix amounts to specifying a particular model. Choosing the polynomials P (k) (L, λ) amounts to specifying the dynamical flows. The above propositions give the general form of M (λ) as far as the matrix structure and the λ-dependence is concerned. One should keep in mind however that the coefficients of the polynomials P (k) (L, λ) are a priori functions of the matrix elements of L and require further characterizations in order to get an integrable system. In the setting of the next section these coefficients will be constants. Remark 1. If λk is a pole of M (λ) and not a pole of L(λ), one can redefine M (λ) without changing the Lax equations of motion so as to eliminate the singularities of M (λ) at λk . Indeed, redefining: M (λ) → M (λ) − P (k) (L, λ) does not change the Lax equation. The new M (λ) is regular at λk . Of course we cannot eliminate the poles common to L(λ) and M (λ) by this procedure.
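The pole-matching statement behind eq. (3.15) can be illustrated on a small example. The sketch below makes several assumptions for the sake of illustration: a single pole at λ = 0 of order 2, arbitrary integer 2 × 2 coefficient matrices, the particular choice $P(L, \lambda) = \lambda L^3$, and a dictionary representation `{power: matrix}` of Laurent polynomials with helper names introduced here. It builds $M = (\lambda L^3)_-$ and compares the pole order of $[M, L]$ with the naive bound given by the sum of the pole orders of $M$ and $L$.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B, s=1):
    """Entrywise A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_zero(A):
    return all(x == 0 for row in A for x in row)

# A Laurent polynomial in lambda with matrix coefficients is stored as {power: matrix}.
def l_mul(A, B):
    out = {}
    for p, a in A.items():
        for q, b in B.items():
            term = mat_mul(a, b)
            out[p + q] = mat_add(out[p + q], term) if p + q in out else term
    return out

def l_comm(A, B):
    AB, BA = l_mul(A, B), l_mul(B, A)      # same set of powers in both products
    return {p: mat_add(AB[p], BA[p], -1) for p in AB}

def shift(A, k):
    """Multiply by lambda**k."""
    return {p + k: a for p, a in A.items()}

def polar_part(A):
    """Singular part at lambda = 0."""
    return {p: a for p, a in A.items() if p < 0}

def pole_order(A):
    powers = [p for p, a in A.items() if p < 0 and not is_zero(a)]
    return -min(powers) if powers else 0

# L = L0 + L1/lambda + L2/lambda^2 with arbitrary sample coefficients.
L = {0: [[1, 0], [0, 2]], -1: [[0, 1], [3, 1]], -2: [[2, 1], [1, -1]]}
P = shift(l_mul(L, l_mul(L, L)), 1)        # P(L, lambda) = lambda * L^3
M = polar_part(P)                          # M = (P)_-
orders = (pole_order(L), pole_order(M), pole_order(l_comm(M, L)))
print(orders)
```

In this example $M$ has a pole of order 5, so the commutator could a priori have a pole of order 7; the Gelfand–Dickey argument predicts that its order does not exceed that of $L$, which is what the last entry of `orders` tests.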
Remark 2. The Lax equation is invariant under similarity transformations,
$$L \to L' = g L g^{-1}, \qquad M \to M' = g M g^{-1} + \partial_t g\, g^{-1} \tag{3.17}$$
If this similarity transformation is independent of λ, it will not spoil the analytic properties of L(λ) and M(λ). We can use the gauge freedom eq. (3.17) to diagonalize $L_0$, $L_0 = \mathrm{Diag}(a_1, \dots, a_N)$. Consistency of eq. (3.4) then requires $M_0$ to be diagonal also and thus $\dot L_0 = [M_0, L_0] = 0$. Hence $M_0$ is a polynomial $P$ of $L_0$, so that replacing $M(\lambda) \to M(\lambda) - P(L(\lambda))$ gets rid of $M_0$.
Remark 3. For Lax matrices L(λ) and M(λ) rational functions of λ, we can easily compare the number of variables to the number of equations contained in eq. (3.4). The variables are the matrices $L_0$, $L_{k,r}$ and $M_0$, $M_{k,r}$. A naive counting, assuming that L(λ) and M(λ) are generic and independent matrices, gives in units of $N^2$:
$$\text{number of variables} = 2 + \sum_k n_k + \sum_k m_k = 2 + l + m$$
$$\text{number of equations} = 1 + \sum_k (n_k + m_k) = 1 + l + m$$
where $l$ and $m$ are the total order (degree of the divisor) of the poles of L(λ) and M(λ) respectively. Therefore there is one more variable than the number of equations, which reflects the gauge invariance eq. (3.17) of the Lax equation. If we assume however that λ belongs to a higher genus Riemann surface with genus $g$, the situation is very different. Indeed, suppose that L(λ) and M(λ) have poles of total multiplicity $l$ and $m$ respectively. Let us count the number of meromorphic functions on which L(λ) and M(λ) can be expanded with constant matrix coefficients. By the
Riemann–Roch theorem, L(λ) can be expanded on a basis of $(l - g + 1)$ independent meromorphic functions (in the generic case), and M(λ) on $(m - g + 1)$ functions. So we have
$$\text{number of variables} = 2 + l + m - 2g$$
Similarly, the commutator [M(λ), L(λ)] has poles of total multiplicity $l + m$ and can be expanded on a basis of $(l + m - g + 1)$ independent functions. Therefore
$$\text{number of equations} = 1 + l + m - g$$
So (number of equations − number of variables) $= g - 1$. Taking into account the gauge symmetry of the Lax equation, we see that the number of equations is always greater than the number of unknowns when $g > 0$. This shows that if λ belongs to a Riemann surface of genus $g \geq 1$ (and such systems exist, at least for $g = 1$, see Chapter 7), one has to consider a non-generic situation.
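The counting of Remark 3 is simple arithmetic and can be tabulated. The sketch below is only an illustration: the function name `lax_count` and the sample pole orders are assumptions introduced here. It returns the naive numbers of variables and equations, in units of $N^2$, for total pole orders $l$, $m$ on a curve of genus $g$.

```python
def lax_count(l, m, g=0):
    """Naive count, in units of N**2, of variables and equations in the Lax equation."""
    variables = (l - g + 1) + (m - g + 1)   # expansions of L and M (Riemann-Roch, generic case)
    equations = l + m - g + 1               # expansion of the commutator [M, L]
    return variables, equations

# Sample pole orders: a double pole for L and a simple pole for M.
for g in (0, 1, 2):
    v, e = lax_count(l=2, m=1, g=g)
    print(f"genus {g}: variables = {v}, equations = {e}, difference = {e - v}")
```

For every genus the difference equals $g - 1$, so only $g = 0$ leaves room for the one gauge degree of freedom of eq. (3.17).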
Remark 4. As we already mentioned, the dynamical variables are the matrix elements of the $L_{k,r}$. In most cases this is too general and we may try to impose more restrictions on the matrix elements of the $L_{k,r}$. For instance, assuming that the $\lambda_k$ are all real, then clearly one can impose that all the matrices $L_k$ and $M_k$ are anti-Hermitian. Another simple example is provided by the Neumann model whose Lax matrices (3.3) are such that ${}^t L(-\lambda) = L(\lambda)$, ${}^t M(-\lambda) = -M(\lambda)$. More generally, we may define the action of a reduction group and impose that L(λ) and M(λ) are invariant under it. This may be done, for example, by demanding that the Lax pair L(λ) and M(λ) satisfies:
$$R^{-1} L(r(\lambda)) R = L(\lambda), \qquad R^{-1} M(r(\lambda)) R = M(\lambda) \tag{3.18}$$
for some representations $R$ and $r(\lambda)$ of the reduction group. This type of restriction is always compatible with the Lax equation. It provides a way to lower the number of degrees of freedom. An example of this procedure can be found later in this chapter, see eq. (3.84).
We end this section by illustrating these constructions on the examples of the Euler and Lagrange tops and of the Neumann model. We verify that the matrices L(λ) and M(λ) are indeed related as in eq. (3.15).
Example 1. Let us consider the Euler top. We see that L(λ), eq. (3.1), has a pole at 0 and M(λ) has a pole at ∞. Let us apply the above procedure to remove this pole. There exists a polynomial $P(x) = \alpha x^2 + \beta x + \gamma$ such that $P(\mathcal{I}^2) = \mathcal{I}$. We will need the coefficient $\alpha = -1/(I_1 I_2 I_3)$. Redefining M(λ) to $M(\lambda) - \lambda P(L(\lambda))$ one gets $M = M_0 - (\alpha/\lambda) J^2$ with $M_0 = \Omega - \alpha(\mathcal{I}^2 J + J \mathcal{I}^2) - \beta J$. One can check that $M_0 = 0$. (Hint: for $i \neq j$ compute $(\mathcal{I}_i - \mathcal{I}_j)(M_0)_{ij}$ using $P(\mathcal{I}_i^2) = \mathcal{I}_i$.) Hence for the Euler top we can choose
$$M(\lambda) = -\frac{\alpha}{\lambda} J^2 \tag{3.19}$$
We see that this new M(λ) is such that $M(\lambda) = -\alpha(\lambda L^2)_-$. The Lax matrix of the Euler top $L(\lambda) = \mathcal{I}^2 + \lambda^{-1} J$ is of the form $L_0 + L_-$ with $L_0$ diagonal non-dynamical. The eigenvalues of $J$ are $(0, i\sqrt{\vec J^{\,2}}, -i\sqrt{\vec J^{\,2}})$, which are non-dynamical since $\vec J^{\,2}$ belongs to the centre of the Poisson bracket and has been fixed to a numerical value.
Example 2. For the Lagrange top L(λ), eq. (3.2), has a pole at 0 and M(λ) has a pole at ∞. One can remove this pole by redefining $M(\lambda) \to M(\lambda) - I^{-1}\lambda L(\lambda)$. Notice however that since the eigenvalues of $L_0$ are degenerate one cannot express $M_0$ as a polynomial in $L_0$. The new M(λ) can be expressed as $M = M_0 + M_-$ with $M_-(\lambda) = -I^{-1}(\lambda L(\lambda))_-$. For the Lagrange top the Lax matrix is again of the form $L_0 + L_-$ where $L_0$ is non-dynamical. For the singular part one gets, since $J$ is obviously antisymmetric, $A_- = \lambda^{-2}\,\mathrm{Diag}(\sqrt{\vec P^{\,2}}, -\sqrt{\vec P^{\,2}}, 0, 0)$, which again belongs to the centre of the Poisson bracket and is non-dynamical.
Example 3. Finally, let us consider the Neumann model. We see from eq. (3.3) that we have $M(\lambda) = (\lambda L(\lambda))_-$ where we project on the singular part at λ = 0. The Lax matrix of the Neumann model is of the form $L = L_0 + L_-$ with $L_0$ a numerical diagonal matrix, and the singular part $L_-$ at the only pole λ = 0 is given by $L_- = \lambda^{-1} J - \lambda^{-2} K$. This is a rank 2 matrix whose image is spanned by the vectors $x$ and $y$. It is easy to diagonalize in this subspace and one gets the singular diagonal part $A_- = \lambda^{-2}\,\mathrm{Diag}(1, 0, \dots, 0)$. It is again a numerical matrix.
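The two claims made for the Euler top in Example 1, the value of α and the vanishing of $M_0$, lend themselves to a quick exact check. In the sketch below the numerical moments of inertia and angular momentum are arbitrary assumptions, `cal` holds the diagonal entries $(I_i + I_j - I_k)/2$, and the interpolating polynomial $P$ is built with Newton divided differences, which is one convenient choice made here rather than a method prescribed by the text.

```python
from fractions import Fraction as Fr

# Sample data, chosen arbitrarily for the check (not taken from the text).
I = [Fr(3), Fr(4), Fr(5)]                      # moments of inertia I_1, I_2, I_3
Jv = [Fr(1), Fr(2), Fr(-3)]                    # angular momentum vector
w = [Jv[k] / I[k] for k in range(3)]           # omega_k = J_k / I_k

def eps(v):
    """Antisymmetric 3x3 matrix with entries eps_ijk v_k."""
    x, y, z = v
    return [[0, z, -y], [-z, 0, x], [y, -x, 0]]

J, Om = eps(Jv), eps(w)

# Diagonal entries (I_i + I_j - I_k)/2 and their squares.
cal = [(I[(k + 1) % 3] + I[(k + 2) % 3] - I[k]) / 2 for k in range(3)]
sq = [c * c for c in cal]

# Quadratic P(x) = alpha x^2 + beta x + gamma with P(cal_i^2) = cal_i,
# obtained from Newton divided differences on the three nodes sq[i].
d01 = (cal[1] - cal[0]) / (sq[1] - sq[0])
d12 = (cal[2] - cal[1]) / (sq[2] - sq[1])
alpha = (d12 - d01) / (sq[2] - sq[0])
beta = d01 - alpha * (sq[0] + sq[1])
gamma = cal[0] - alpha * sq[0] ** 2 - beta * sq[0]

def P(x):
    return alpha * x * x + beta * x + gamma

# M0 = Omega - alpha (cI^2 J + J cI^2) - beta J, written entrywise since cI is diagonal.
M0 = [[Om[i][j] - (alpha * (sq[i] + sq[j]) + beta) * J[i][j] for j in range(3)]
      for i in range(3)]

print("alpha =", alpha, " -1/(I1 I2 I3) =", -1 / (I[0] * I[1] * I[2]))
print("M0 =", M0)
```

With these sample values the diagonal entries are 3, 2, 1 and the product $I_1 I_2 I_3$ is 60, so α should come out as −1/60 and every entry of `M0` as zero.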
We have seen in these three examples that the singular parts, $A^{(k)}_-$, of the matrix $A^{(k)}(\lambda)$ are independent of the dynamical variables. We will show in the next section that, in this case, eq. (3.14) admits an important interpretation as a coadjoint orbit.
3.3 Coadjoint orbits and Hamiltonian formalism
In this section we show that the Zakharov–Shabat construction, when the matrices $A^{(k)}_-$ are non-dynamical, can be interpreted as coadjoint orbits. This introduces a natural symplectic structure in the problem and gives a Hamiltonian interpretation to the Lax equation. This also allows us to compute the Poisson brackets of the matrix elements of the Lax matrix in terms of an r-matrix. We first recall some notions about adjoint and coadjoint actions of Lie algebras and Lie groups, see Chapter 14. Let $G$ be a connected Lie group with Lie algebra $\mathcal{G}$. The group $G$ acts on $\mathcal{G}$ by the adjoint action denoted Ad:
$$X \longrightarrow (\mathrm{Ad}\, g)(X) = g X g^{-1}, \qquad g \in G,\ X \in \mathcal{G}$$
Similarly the coadjoint action of $G$ on the dual $\mathcal{G}^*$ of the Lie algebra $\mathcal{G}$ (i.e. the vector space of linear forms on the Lie algebra) is defined by:
$$(\mathrm{Ad}^* g\cdot\Xi)(X) = \Xi(\mathrm{Ad}\, g^{-1}(X)), \qquad g \in G,\ \Xi \in \mathcal{G}^*,\ X \in \mathcal{G}$$
The infinitesimal version of these actions provides actions of the Lie algebra $\mathcal{G}$ on $\mathcal{G}$ and $\mathcal{G}^*$, denoted ad and ad$^*$ respectively and given by:
$$\mathrm{ad}\, X(Y) = [X, Y], \qquad X, Y \in \mathcal{G}$$
$$\mathrm{ad}^* X\cdot\Xi(Y) = -\Xi([X, Y]), \qquad X, Y \in \mathcal{G},\ \Xi \in \mathcal{G}^*$$
To see how these notions relate to our problem, let us consider first a Lax matrix with only one polar singularity at λ = 0:
$$L(\lambda) = \big(g(\lambda)\, A_-(\lambda)\, g^{-1}(\lambda)\big)_- \tag{3.20}$$
with $A_-(\lambda) = \sum_{r=-n}^{-1} A_r \lambda^r$, and $g(\lambda)$ has a regular expansion around λ = 0. Let $G$ be the loop group of invertible matrix valued power series expansions around λ = 0. The elements of $G$ are regular series $g(\lambda) = \sum_{r=0}^{\infty} g_r \lambda^r$. The product law is the pointwise product: $(gh)(\lambda) = g(\lambda) h(\lambda)$. Formally, the Lie algebra $\mathcal{G}$ of $G$ consists of elements of the form $X = \sum_{r=0}^{\infty} X_r \lambda^r$. Its Lie bracket is given by the pointwise commutator. The dual $\mathcal{G}^*$ of $\mathcal{G}$ can be identified with the set of polar matrices $\Xi(\lambda) = \sum_{r\geq 1} \Xi_r \lambda^{-r}$, where the sum contains a finite but arbitrarily large number of terms, by the pairing:
$$\langle \Xi, X\rangle \equiv \mathrm{Tr}\,\mathrm{Res}_{\lambda=0}\big(\Xi(\lambda) X(\lambda)\big) = \sum_r \mathrm{Tr}\,(\Xi_{r+1} X_r)$$
where $\mathrm{Res}_{\lambda=0}$ is defined to be the coefficient of $\lambda^{-1}$. The coadjoint action of $G$ on $\mathcal{G}^*$ is defined by $((\mathrm{Ad}^* g)\cdot\Xi)(X) = \Xi(g^{-1} X g)$ for $\Xi \in \mathcal{G}^*$ and any $X \in \mathcal{G}$. Using the above model for $\mathcal{G}^*$, and since $\langle \Xi, g^{-1} X g\rangle = \langle g \Xi g^{-1}, X\rangle = \langle (g \Xi g^{-1})_-, X\rangle$, we get
$$(\mathrm{Ad}^* g)\cdot\Xi(\lambda) = \big(g\cdot\Xi\cdot g^{-1}\big)_-$$
This is precisely eq. (3.20). The Lax matrix can thus be interpreted as belonging to the coadjoint orbit of the element $A_-(\lambda)$ of $\mathcal{G}^*$ under the loop group $G$. With this interpretation, the Lax equation reads:
$$\dot L = \mathrm{ad}^* M\cdot L = [M, L] \tag{3.21}$$
This shows that the equation of motion is a flow on the coadjoint orbit.
Coadjoint orbits in $\mathcal{G}^*$ are equipped with the canonical Kostant–Kirillov symplectic structure. Choosing two linear functions $h_1(\Xi) = \Xi(X)$ and $h_2(\Xi) = \Xi(Y)$ with $X, Y \in \mathcal{G}$, so that $dh_1 = X$ and $dh_2 = Y$, the Kostant–Kirillov Poisson bracket reads:
$$\{\Xi(X), \Xi(Y)\} = \Xi([X, Y])$$
where the right-hand side is the linear function $\Xi \to \Xi([X, Y])$. This Poisson bracket is very natural but one has to be aware that it is degenerate. The kernel is the set of Ad$^*$-invariant functions. Let us specialize this construction to our case. We identify $\mathcal{G}^*$ with series expansions singular at λ = 0 using the linear form induced by $\mathrm{Tr}\,\mathrm{Res}_{\lambda=0}$. We parametrize the orbit of the element $A_-(\lambda)$ by the group element $g(\lambda)$. Consider the 1-form α on the group given by
$$\alpha = -\mathrm{Tr}\,\mathrm{Res}_{\lambda=0}\big(A_-\, g^{-1}\delta g\big)$$
The pullback on the group of the Kostant–Kirillov symplectic form reads (see Chapter 14):
$$\omega = \delta\alpha = \mathrm{Tr}\,\mathrm{Res}_{\lambda=0}\big(A_-\, g^{-1}\delta g \wedge g^{-1}\delta g\big) \tag{3.22}$$
This interpretation of L(λ) as a coadjoint orbit assumes that $A_-(\lambda)$ is not a dynamical variable. This construction can be extended to the multi-pole case. We consider the direct sum of loop algebras $\mathcal{G}_k$ around $\lambda = \lambda_k$:
$$\mathcal{G} \equiv \bigoplus_k \mathcal{G}_k$$
An element of this Lie algebra has the form of a multiplet $X(\lambda) = (X_1(\lambda), X_2(\lambda), \dots)$ where $X_k(\lambda)$, defined around $\lambda_k$, is of the form $X_k(\lambda) = \sum_{n\geq 0} X_{k,n}(\lambda - \lambda_k)^n$. The Lie bracket is such that $[X_k(\lambda), X_l(\lambda)] = 0$ if $k \neq l$. The group $G$ is the direct product of the groups $G_k$ of regular invertible matrices at $\lambda_k$:
$$G \equiv (G_1, G_2, \dots) \tag{3.23}$$
The dual $\mathcal{G}^*$ of this Lie algebra consists of multiplets $\Xi = (\Xi_1(\lambda), \Xi_2(\lambda), \dots)$ where $\Xi_k(\lambda)$ around $\lambda_k$ is of the form $\Xi_k(\lambda) = \sum_{r\geq 1} \Xi_{k,r}(\lambda - \lambda_k)^{-r}$. In this sum the number of terms is finite but arbitrary. The pairing is simply
$$\langle \Xi, X\rangle \equiv \sum_k \langle \Xi_k, X_k\rangle = \sum_k \mathrm{Tr}\,\mathrm{Res}_{\lambda_k}\big(\Xi_k(\lambda) X_k(\lambda)\big)$$
The coadjoint action of $G$ on $\mathcal{G}^*$ is given by the usual formula: if $g = (g_1, g_2, \dots) \in G$ and $\Xi = (\Xi_1, \Xi_2, \dots) \in \mathcal{G}^*$,
$$(\mathrm{Ad}^* g)\cdot\Xi(\lambda) = \big((g_1 \Xi_1 g_1^{-1})_-, (g_2 \Xi_2 g_2^{-1})_-, \dots\big)$$
A coadjoint orbit consists of elements $\Xi_k$ with a fixed maximal order of the pole. Then, we can interpret eq. (3.9) as the coadjoint orbit of the element $((A_1)_-, (A_2)_-, \dots)$. Alternatively, we can consider the function on $\mathcal{G}^*$
$$L(\lambda) = L_0 + \sum_k \Xi_k \tag{3.24}$$
with poles at the points $\lambda_k$. Given this function we can recover the $\Xi_k$ by extracting the polar parts. The constant matrix $L_0$ is added to match the formula for the Lax matrix, eq. (3.9). By choice it is assumed to be invariant by coadjoint action. The pairing can be rewritten as
$$\langle L, X\rangle = \sum_k \mathrm{Tr}\,\mathrm{Res}_{\lambda_k}\big(L(\lambda) X_k(\lambda)\big)$$
Note that only $\Xi_k$ contributes to the residue at $\lambda_k$ and the formula is compatible with the matrix $L_0$ being invariant by coadjoint action. It is interesting to compute the dimension of this coadjoint orbit. In the formula $L_k = (g^{(k)} A^{(k)}_- g^{(k)-1})_-$, the matrices $A^{(k)}_-$ characterize the orbit and are non-dynamical. The dynamical variables are the jets of order $(n_k - 1)$ of the $g^{(k)}$, which gives $N^2 n_k$ parameters. But $L_k$ is invariant under $g^{(k)} \to g^{(k)} d^{(k)}$ with $d^{(k)}$ a jet of diagonal matrices of the same order. Hence the dimension of the $L_k$ orbit is $(N^2 - N) n_k$, and the dimension of the orbit is the even number:
$$\dim M = (N^2 - N)\sum_k n_k$$
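As a quick sanity check of this count, take the smallest generic case, assumed here only for illustration: $N = 2$ with a single simple pole ($n_1 = 1$). The Lax matrix is then $\lambda^{-1} g_0 A_{-1} g_0^{-1}$ with $A_{-1}$ diagonal with distinct entries, and its orbit under conjugation is two-dimensional, in agreement with the formula:

```latex
\dim M = (N^2 - N)\sum_k n_k \,\Big|_{N = 2,\ n_1 = 1} = (4 - 2)\cdot 1 = 2
```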
In the multi-pole case, the pullback of the symplectic form reads:
$$\omega = \sum_k \mathrm{Tr}\,\mathrm{Res}_{\lambda_k}\big(A^{(k)}_-\, g^{(k)-1}\delta g^{(k)} \wedge g^{(k)-1}\delta g^{(k)}\big) \tag{3.25}$$
We can now use this symplectic form to evaluate the Poisson brackets of the elements of the Lax matrix. To write them we use the tensor notation of section (2.5) in Chapter 2 and show that they take the r-matrix form. We assume that each $L_k(\lambda)$ is a generic element of an orbit of the loop group $GL(N)[\lambda]$, that $L_0$ and the $A^{(k)}_-$ are non-dynamical, and the symplectic form is given by eq. (3.25).
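Proposition. With the symplectic structure eq. (3.25), the Poisson brackets of the matrix elements of L(λ) can be written as:
$$\{L_1(\lambda), L_2(\mu)\} = -\Big[\frac{C_{12}}{\lambda - \mu},\, L_1(\lambda) + L_2(\mu)\Big] \tag{3.26}$$
with $C_{12} = \sum_{i,j} E_{ij}\otimes E_{ji}$, where the $E_{ij}$ are the canonical basis matrices. The commutator in the right-hand side of eq. (3.26) is the usual matrix commutator.
Proof. Let us first assume that we have only one pole and $L = (g A_- g^{-1})_-$. Because we are dealing with a Kostant–Kirillov bracket for the loop algebra of $gl(N)$, we can immediately write the Poisson bracket of the Lax matrix using the defining relation $\{L(X), L(Y)\} = L([X, Y])$. Using $L(X) = \mathrm{Tr}\,\mathrm{Res}_{\lambda=0}(L(\lambda) X(\lambda))$, this gives:
$$\{L(X), L(Y)\} = \mathrm{Tr}\,\mathrm{Res}_{\lambda=0}\big(L(\lambda)[X(\lambda), Y(\lambda)]\big) \tag{3.27}$$
By definition of the notation $\{L_1, L_2\}$, we have:
$$\{L(X), L(Y)\} = \big\langle \{L_1(\lambda), L_2(\mu)\},\, X(\lambda)\otimes Y(\mu)\big\rangle$$
where $\langle\ ,\ \rangle = \mathrm{Tr}_{12}\,\mathrm{Res}_\lambda\,\mathrm{Res}_\mu$. We need to factorize $X(\lambda)\otimes Y(\mu)$ in eq. (3.27). To this end, we introduce a Casimir operator
$$\mathcal{C}_{12} = \sum_\alpha E_\alpha \otimes E^*_\alpha \in \mathcal{G}\otimes\mathcal{G}^*$$
where $E_\alpha$ and $E^*_\alpha$ are two dual bases of $\mathcal{G}$ and $\mathcal{G}^*$ respectively. We choose
$$E^n_{ij} = \lambda^n E_{ij}, \qquad E^{*n}_{ij} = \lambda^{-n-1} E_{ji}, \qquad n \geq 0$$
so that under the pairing Tr Res we have $\langle E^{*n}_{ij}, E^m_{kl}\rangle = \delta_{ik}\delta_{jl}\delta_{nm}$. The Casimir operator is such that for $Y \in \mathcal{G}$, we have $Y_1 = \mathcal{C}_{12}(Y_2)$. Then we want to write $\langle L, [X, Y]\rangle = \langle [\mathcal{C}_{12}, L_1], X\otimes Y\rangle$; however, this formula does not make sense as it stands because $[\mathcal{C}_{12}, L_1] = \sum_\alpha [E^n_{ij}, L]\otimes E^{*n}_{ij}$ and $E^n_{ij} \in \mathcal{G}$ while $L \in \mathcal{G}^*$ and the commutator is not defined. To overcome this problem we embed $\mathcal{G}$ and its dual $\mathcal{G}^*$ into the full loop algebra $\tilde{\mathcal{G}}$ generated by $E^n_{ij}$, $n \in \mathbb{Z}$:
$$\tilde{\mathcal{G}} = \mathcal{G} + \mathcal{G}^* \tag{3.28}$$
Note that in this sum, $\mathcal{G}$ and $\mathcal{G}^*$ do not commute. Let us compute $\mathcal{C}_{12}$, assuming $|\lambda| < |\mu|$:
$$\mathcal{C}_{12}(\lambda, \mu) = C_{12}\sum_{n=0}^{\infty}\frac{\lambda^n}{\mu^{n+1}} = -\frac{C_{12}}{\lambda - \mu}, \qquad C_{12} = \sum_{i,j} E_{ij}\otimes E_{ji}$$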
We can now write $\langle L(\lambda), [X(\lambda), Y(\lambda)]\rangle = \langle [\mathcal{C}_{12}(\lambda, \mu), L(\lambda)\otimes 1],\, X(\lambda)\otimes Y(\mu)\rangle$. Consider the rational function of λ: $\varphi(\lambda) = \{L_1(\lambda), L_2(\mu)\} - [\mathcal{C}_{12}(\lambda, \mu), L(\lambda)\otimes 1]$. By inspection $\varphi$ contains only negative powers of µ, and we have $\langle\varphi, X(\lambda)\otimes Y(\mu)\rangle = 0$. Hence $\varphi$ contains only positive powers of λ and is regular at λ = 0. It has a pole at λ = µ, due to the form of $\mathcal{C}_{12}(\lambda, \mu)$. We remove this pole by subtracting from $\varphi$ the quantity $[\mathcal{C}_{12}(\lambda, \mu), 1\otimes L(\mu)]$, which contains only positive powers of λ and is therefore in the kernel of $\langle\,\cdot\,, X(\lambda)\otimes Y(\mu)\rangle$. The pole at λ = µ disappears since $[C_{12}, L(\mu)\otimes 1 + 1\otimes L(\mu)] = 0$. The redefined $\varphi$ is regular everywhere and vanishes for λ → ∞, hence vanishes identically. This proves eq. (3.26) in the one-pole case. We can now study the multi-pole situation occurring in eq. (3.9). Consider $L = L_0 + \sum_{k\geq 1} L_k$. Each $L_k$ lives in a coadjoint orbit as above equipped with its own symplectic structure. From eq. (3.25) they have vanishing mutual Poisson brackets $\{(L_j)_1, (L_k)_2\} = 0$ for $j, k = 0, \dots, N$ and $j \neq k$. We assume further that $L_0$ does not contain dynamical variables:
$$\{(L_0)_1, (L_0)_2\} = 0, \qquad \{(L_0)_1, (L_k)_2\} = 0$$
(here the indices 1 and 2 refer to the tensorial notation). Then since $C_{12}/(\lambda - \mu)$ is independent of the pole $\lambda_k$, it is obvious that the r-matrix relations for each orbit combine by addition to give eq. (3.26) for the complete Lax matrix L(λ).
Remark. The quantity

$$C_{12} = \sum_{i,j}E_{ij}\otimes E_{ji} \tag{3.29}$$

often occurs when calculating r-matrices. It is called the tensor Casimir of $gl(N)$. Its main properties are

$$[C_{12}, g\otimes g] = 0,\qquad \mathrm{Tr}_2\, C_{12}\, g_2 = g_1,\qquad \forall g\in GL(N) \tag{3.30}$$
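The two properties in eq. (3.30) are easy to test numerically. The following is a minimal sketch, assuming NumPy and a random $3\times 3$ matrix $g$; the helper names (`unit`, `tensor_casimir`, `trace_2`) are hypothetical and introduced only for this illustration.

```python
import numpy as np

def unit(i, j, n):
    """Elementary matrix E_ij of size n x n."""
    e = np.zeros((n, n))
    e[i, j] = 1.0
    return e

def tensor_casimir(n):
    """C12 = sum_{i,j} E_ij (x) E_ji acting on C^n (x) C^n."""
    return sum(np.kron(unit(i, j, n), unit(j, i, n))
               for i in range(n) for j in range(n))

def trace_2(m, n):
    """Partial trace over the second tensor factor of an n^2 x n^2 matrix."""
    return np.einsum("ikjk->ij", m.reshape(n, n, n, n))

n = 3
rng = np.random.default_rng(0)
g = rng.normal(size=(n, n))        # generic test matrix (an assumption)
C = tensor_casimir(n)
gg = np.kron(g, g)                 # g (x) g
g2 = np.kron(np.eye(n), g)         # g_2 = 1 (x) g

commutator_norm = np.linalg.norm(C @ gg - gg @ C)    # tests [C12, g (x) g] = 0
trace_error = np.linalg.norm(trace_2(C @ g2, n) - g)  # tests Tr_2 C12 g_2 = g_1
```

Both `commutator_norm` and `trace_error` should vanish up to rounding, reflecting the fact that $C_{12}$ acts as the permutation of the two tensor factors.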
This proposition shows that the generic Zakharov–Shabat system, equipped with this symplectic structure, is an integrable Hamiltonian system (the precise counting of independent conserved quantities will be done in Chapter 5). It also gives us a very simple formula for the r-matrix specifying the Poisson bracket of $L(\lambda)$:

$$r_{12}(\lambda,\mu) = -r_{21}(\mu,\lambda) = -\frac{C_{12}}{\lambda-\mu} \tag{3.31}$$
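As a numerical sanity check on eq. (3.31), the sketch below tests the classical Yang–Baxter equation $[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] = 0$ for $r_{ij} = -C_{ij}/(\lambda_i-\lambda_j)$ on a triple tensor product. The matrix size and the three spectral parameters are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import numpy as np

def permutation(n):
    """C12 = sum_{i,j} E_ij (x) E_ji, i.e. the swap operator on C^n (x) C^n."""
    p = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            p[i * n + j, j * n + i] = 1.0
    return p

def casimirs_on_triple(n):
    """C12, C13, C23 embedded in End(C^n (x) C^n (x) C^n)."""
    p, one = permutation(n), np.eye(n)
    c12 = np.kron(p, one)
    c23 = np.kron(one, p)
    c13 = c23 @ c12 @ c23   # conjugating C12 by the swap of factors 2 and 3
    return c12, c13, c23

def comm(a, b):
    return a @ b - b @ a

def cybe_residual(n, lam1, lam2, lam3):
    """Left-hand side of the classical Yang-Baxter equation for r = -C/(lam - mu)."""
    c12, c13, c23 = casimirs_on_triple(n)
    r12 = -c12 / (lam1 - lam2)
    r13 = -c13 / (lam1 - lam3)
    r23 = -c23 / (lam2 - lam3)
    return comm(r12, r13) + comm(r12, r23) + comm(r13, r23)

residual = np.linalg.norm(cybe_residual(3, 0.3, 1.7, -0.9))
```

The residual vanishes up to rounding for any distinct values of the three spectral parameters.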
The Jacobi identity is satisfied because this r-matrix verifies the classical Yang–Baxter equation (see eq. (2.12) in Chapter 2):

$$[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] = 0$$

where $r_{ij}$ stands for $r_{ij}(\lambda_i,\lambda_j)$. Note that $r_{12}$ is antisymmetric: $r_{12}(\lambda_1,\lambda_2) = -r_{21}(\lambda_2,\lambda_1)$. As in Chapter 2, these Poisson brackets for the Lax matrix ensure that one can define commuting quantities. The associated equations of motion take the Lax form.

Proposition. The functions on phase space

$$H^{(n)}(\lambda) \equiv \mathrm{Tr}\, L^n(\lambda)$$

are in involution. The equations of motion associated with $H^{(n)}(\mu)$ can be written in the Lax form with $M = \sum_k M_k$:

$$M_k(\lambda) = -n\left(\frac{L^{n-1}(\lambda)}{\lambda-\mu}\right)_k \tag{3.32}$$

Proof. The quantities $H^{(n)}(\lambda)$ are in involution because

$$\{\mathrm{Tr}\,L^n(\lambda), \mathrm{Tr}\,L^m(\mu)\} = nm\,\mathrm{Tr}_{12}\,\{L_1(\lambda), L_2(\mu)\}L_1^{n-1}(\lambda)L_2^{m-1}(\mu)$$
$$= -\frac{nm}{\lambda-\mu}\,\mathrm{Tr}_{12}\left([C_{12}, L_1^n(\lambda)]L_2^{m-1}(\mu) + [C_{12}, L_2^m(\mu)]L_1^{n-1}(\lambda)\right) = 0$$

where we have used that the trace of a commutator vanishes. Similarly, we have:

$$\dot L(\lambda) = \{H^{(n)}(\mu), L(\lambda)\} = n\,\mathrm{Tr}_2\left[\frac{C_{12}}{\lambda-\mu}L_2^{n-1}(\mu),\, L_1(\lambda)\right]$$

Performing the trace and remembering that $\mathrm{Tr}_2(C_{12}M_2) = M_1$, we get

$$\dot L(\lambda) = [M^{(n)}(\lambda,\mu), L(\lambda)],\qquad M^{(n)}(\lambda,\mu) = n\,\frac{L^{n-1}(\mu)}{\lambda-\mu} \tag{3.33}$$

This $M^{(n)}(\lambda,\mu)$ has a pole at $\lambda = \mu$ and is otherwise regular. According to the general procedure we can remove this pole by subtracting some polynomial in $L(\lambda)$ without changing the equations of motion. Obviously one can redefine:

$$M^{(n)}(\lambda,\mu) \to M^{(n)}(\lambda,\mu) - n\,\frac{L^{n-1}(\lambda)}{\lambda-\mu} = -n\,\frac{L^{n-1}(\lambda) - L^{n-1}(\mu)}{\lambda-\mu}$$
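This new $M$ has poles at all $\lambda_k$ and is regular at $\lambda = \mu$. Decomposing it into its polar parts, we write $M = \sum_k M_k$ with

$$M_k(\lambda) = -n\left(\frac{L^{n-1}(\lambda)}{\lambda-\mu}\right)_k$$

This is of the form eq. (3.15) with

$$P^{(k)}(L,\lambda) = -n\,\frac{L^{n-1}(\lambda)}{\lambda-\mu} \tag{3.34}$$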
Notice that the coefficients of the polynomial $P^{(k)}(L,\lambda)$ are pure numerical constants.

Example 1. In the case of the Euler top $L(\lambda) = I^2 + \frac{1}{\lambda}J$. The singular part satisfies ${}^tL_-(-\lambda) = L_-(\lambda)$. This is not preserved by a general coadjoint action $L_-(\lambda) \to (g(\lambda)L_-(\lambda)g^{-1}(\lambda))_-$. To overcome this problem we consider the subgroup of matrices satisfying ${}^tg^{-1}(-\lambda) = g(\lambda)$, which may be called graded orthogonal. Its Lie algebra consists of matrices $X(\lambda)$ such that ${}^tX(-\lambda) = -X(\lambda)$. Its dual under the pairing Tr Res consists of matrices $L(\lambda)$ such that ${}^tL(-\lambda) = L(\lambda)$, and having an expansion in a finite sum of strictly negative powers of $\lambda$. The matrix $L_-(\lambda)$ is an orbit under this coadjoint action. The symplectic structure on this orbit is obtained by applying eq. (3.27):

$$\{J_{ij}, J_{kl}\} = -\frac{1}{2}\left(\delta_{jk}J_{il} - \delta_{jl}J_{ik} + \delta_{il}J_{jk} - \delta_{ik}J_{jl}\right) \tag{3.35}$$

The computation of the r-matrix of the Euler top is similar to the proof of eq. (3.26), except that the loop algebra being different, the Casimir operator has to be recomputed. We keep the factor $-1/2$ in eq. (3.35) in order to match the general considerations. Of course if this factor is omitted we must multiply the final r-matrix by $-2$. A basis $E_\alpha$ of the graded loop algebra is given for $n = 0, 1, \ldots$ by:

$$(E_{ij} - E_{ji})\lambda^{2n},\ i<j,\qquad (E_{ij} + E_{ji})\lambda^{2n+1},\ i<j,\qquad E_{ii}\lambda^{2n+1}$$

The dual basis $E_\alpha^*$ under Tr Res is given respectively for $n = 0, 1, \ldots$ by:

$$-\frac{1}{2}(E_{ij} - E_{ji})\lambda^{-2n-1},\ i<j,\qquad \frac{1}{2}(E_{ij} + E_{ji})\lambda^{-2n-2},\ i<j,\qquad E_{ii}\lambda^{-2n-2}$$

Then one gets for $C_{12} = \sum_\alpha E_\alpha\otimes E_\alpha^*$:

$$C_{12}(\lambda,\mu) = -\frac{1}{2}\,\frac{1}{\lambda-\mu}\sum_{ij}E_{ij}\otimes E_{ji} - \frac{1}{2}\,\frac{1}{\lambda+\mu}\sum_{ij}E_{ij}\otimes E_{ij}$$
This implies that poles at $\lambda = \pm\mu$ appear in $\varphi(\lambda) = \{L_1(\lambda), L_2(\mu)\} - [C_{12}(\lambda,\mu), L(\lambda)\otimes 1]$. To cancel these poles we now subtract $[C_{21}(\mu,\lambda), 1\otimes L(\mu)]$, which only contains positive powers of $\lambda$. Indeed, the residue at $\lambda = \mu$ is $[C_{12}, L_1(\mu) + L_2(\mu)]$, with $C_{12} = \sum_{ij}E_{ij}\otimes E_{ji}$, and therefore vanishes as previously. The residue at $\lambda = -\mu$ reads $[D_{12}, L_1(-\mu) - L_2(\mu)]$ with $D_{12} = \sum_{ij}E_{ij}\otimes E_{ij}$. Now $L(-\mu) = {}^tL(\mu)$ and one checks that for any matrix $A$ the commutator $[D_{12}, A_1 - {}^tA_2]$ vanishes. Hence $\varphi(\lambda) - [C_{21}(\mu,\lambda), 1\otimes L(\mu)] = 0$. Strictly speaking, one should have done this calculation on the polar part $L_-(\lambda)$ and added the $L_0$ part afterwards. However, since the full Lax matrix satisfies ${}^tL(-\lambda) = L(\lambda)$, the reasoning is actually valid for the full Lax matrix. Finally one gets:

$$\{L_1(\lambda), L_2(\mu)\} = [r_{12}(\lambda,\mu), L_1(\lambda)] - [r_{21}(\mu,\lambda), L_2(\mu)]$$

with $r_{12}(\lambda,\mu) = C_{12}(\lambda,\mu)$. Note that this is a two-pole r-matrix with poles at $\lambda = \pm\mu$.

Example 2. We consider next the Neumann model. Recall that the Lax matrix reads $L(\lambda) = L_0 + \frac{1}{\lambda}J - \frac{1}{\lambda^2}K$. As in the Euler top it satisfies ${}^tL(-\lambda) = L(\lambda)$. Hence we are dealing with the graded orthogonal group ${}^tg^{-1}(-\lambda) = g(\lambda)$. Let us check that the matrix $L_-(\lambda)$ is an orbit under the coadjoint action. As a matter of fact:

$$(g(\lambda)L_-(\lambda)g^{-1}(\lambda))_- = -\frac{1}{\lambda^2}\,g_0K\,{}^tg_0 + \frac{1}{\lambda}\left(g_0J\,{}^tg_0 - g_1K\,{}^tg_0 + g_0K\,{}^tg_1\right)$$

with $g(\lambda) = g_0 + \lambda g_1 + \ldots$. Recalling that $K = X\,{}^tX$ and $J = X\,{}^tY - Y\,{}^tX$, we see that this is exactly of the same form as $L_-(\lambda)$ with

$$X\to g_0X,\qquad Y\to g_0Y + g_1X$$

One can check that the Kostant–Kirillov bracket on this orbit reproduces the canonical Poisson bracket on the variables $X$ and $Y$. It follows that the Neumann model has the same r-matrix as the Euler top.

3.4 Elementary flows and wave function

We have found that the time evolution of a Zakharov–Shabat system is given by matrices $M(\lambda)$ of the form eq. (3.10). This leaves an infinite number of choices for $M(\lambda)$. We introduce an infinite number of elementary times corresponding to these choices. We will show that these flows are pairwise commuting. This defines a so-called integrable hierarchy. The elementary flows correspond to diagonal matrices $B^{(k)}_-$ having a single pole of order $n$ at $\lambda_k$ in matrix diagonal entry $\alpha$. Here, and in the
following, we use the multi-index $i = (k, n, \alpha)$. We thus define matrices $M_i$ by:

$$M_i \equiv \left(g^{(k)}\xi_i\,g^{(k)-1}\right)_k,\qquad \xi_i \equiv \xi_{(k,n,\alpha)} = \frac{1}{(\lambda-\lambda_k)^n}E_{\alpha\alpha} \tag{3.36}$$

We call $t_i = t_{(k,n,\alpha)}$ the time variable associated with $M_i$ through the Lax equation:

$$\partial_{t_i}L = [M_i, L] \tag{3.37}$$

A general flow is a linear combination of these elementary ones. Note that $M_i(\lambda)$, which is a priori defined around $\lambda_k$, is a rational function of $\lambda$, with only a polar part at $\lambda_k$, hence is defined in the whole $\lambda$-plane. The Lax equation, eq. (3.37), has a meaning in the whole $\lambda$-plane and defines the time evolution of the quantities locally defined at $\lambda_{k'}$, such as $g^{(k')}$, with respect to the times associated with $\lambda_k$. We will need some notation. Let:

$$\xi^{(k)}(\lambda,t) = \sum_{n,\alpha}\xi_{(k,n,\alpha)}\,t_{(k,n,\alpha)} \tag{3.38}$$

$$\xi(\lambda,t) = \sum_k\xi^{(k)}(\lambda,t) = \sum_{i=(k,n,\alpha)}\xi_i\,t_i \tag{3.39}$$
These are generating functions with coefficients rational in $\lambda$. The function $\xi^{(k)}(\lambda,t)$ involves all the times above the singularity $\lambda_k$, while $\xi(\lambda,t)$ involves all the times of the hierarchy. It is easy to find the Hamiltonians generating these elementary flows.

Proposition. The Hamiltonian generating the flow $t_i$ is

$$H_i = \mathrm{Tr}\,\mathrm{Res}_{\lambda_k}\frac{A^{(k)}(\lambda)E_{\alpha\alpha}}{(\lambda-\lambda_k)^n} = \oint_{\Gamma^{(k)}}\frac{d\lambda}{2i\pi}\,\frac{\mathrm{Tr}(A^{(k)}(\lambda)E_{\alpha\alpha})}{(\lambda-\lambda_k)^n}$$

where $A^{(k)}(\lambda)$ is the diagonal form of $L(\lambda)$, with singular part $A^{(k)}_-(\lambda)$, and $\Gamma^{(k)}$ a small contour around $\lambda_k$.

Proof. Let us introduce the differential $dH_i(L)$ defined by $\delta H_i = \langle\delta L, dH_i\rangle$ for any variation $\delta L$ of the Lax matrix. To compute it we start from eq. (3.8), written as $A^{(k)} = g^{(k)-1}Lg^{(k)}$, so that $\delta A^{(k)} = g^{(k)-1}\delta L\,g^{(k)} + [A^{(k)}, g^{(k)-1}\delta g^{(k)}]$. Hence $\delta\,\mathrm{Tr}(A^{(k)}(\lambda)E_{\alpha\alpha}) = \mathrm{Tr}(\delta L\,g^{(k)}E_{\alpha\alpha}g^{(k)-1})$, where we used that $[E_{\alpha\alpha}, A^{(k)}] = 0$. We get the formula:

$$dH_i(L) = (\lambda-\lambda_k)^{-n}\,g^{(k)}E_{\alpha\alpha}\,g^{(k)-1} \tag{3.40}$$
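Next, $\partial_{t_i}L(\mu) = \{H_i, L(\mu)\}$, so we have:

$$\partial_{t_i}L(\mu) = \mathrm{Tr}_1\,\mathrm{Res}_{\lambda=\lambda_k}\left(dH_i(\lambda)\otimes 1\,[r_{12}(\lambda,\mu), L_1(\lambda) + L_2(\mu)]\right)$$
$$= \mathrm{Tr}_1\,\mathrm{Res}_{\lambda=\lambda_k}\left(r_{12}(\lambda,\mu)[L(\lambda), dH_i(\lambda)]_1\right) + \left[\mathrm{Tr}_1\,\mathrm{Res}_{\lambda=\lambda_k}\left(r_{12}(\lambda,\mu)\,dH_i(\lambda)\otimes 1\right),\, L_2(\mu)\right]$$

with $r_{12}(\lambda,\mu)$ given by eq. (3.31). In the first term, $[L(\lambda), dH_i(\lambda)]$ is proportional to $g^{(k)}[A^{(k)}, E_{\alpha\alpha}]g^{(k)-1}$ and vanishes since $A^{(k)}$ and $E_{\alpha\alpha}$ are both diagonal. In the second term we expand $r_{12}(\lambda,\mu)$ in positive powers of $(\lambda-\lambda_k)/(\mu-\lambda_k)$ to get something polar in $(\mu-\lambda_k)$:

$$\mathrm{Tr}_1\,\mathrm{Res}_{\lambda=\lambda_k}\left(r_{12}(\lambda,\mu)\,dH_i(\lambda)\otimes 1\right) = \sum_{m=0}^{\infty}\mathrm{Res}_{\lambda=\lambda_k}\left(\frac{(\lambda-\lambda_k)^m}{(\mu-\lambda_k)^{m+1}}\,dH_i(\lambda)\right)$$
$$= (dH_i(\mu))_k = \left(g^{(k)}\frac{E_{\alpha\alpha}}{(\mu-\lambda_k)^n}\,g^{(k)-1}\right)_k = M_i(\mu)$$

In the last step we used that for any function $f(\lambda) = \sum_{m=-\infty}^{+\infty}\lambda^m f_m$, one has the identity

$$\sum_{m=0}^{\infty}\mathrm{Res}(\lambda^m f(\lambda))\,\mu^{-m-1} = \sum_{m=0}^{\infty}f_{-m-1}\,\mu^{-m-1} = (f(\mu))_-$$

Comparing eq. (3.36) and eq. (3.40) we find the useful relation:

$$dH_i(L) = g^{(k)}\xi_i\,g^{(k)-1},\qquad M_i = (dH_i)_- \tag{3.41}$$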
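The residue identity used in the last step of the proof can be illustrated on a Laurent polynomial. This is only a sketch: the dictionary representation `{power: coefficient}` and the sample coefficients are assumptions made for the example, and the function names are hypothetical.

```python
def residue(f):
    """Coefficient of lambda^{-1} in a Laurent polynomial {power: coefficient}."""
    return f.get(-1, 0)

def shift(f, m):
    """Multiply f(lambda) by lambda^m."""
    return {p + m: c for p, c in f.items()}

def polar_part(f):
    """(f)_- : keep only the strictly negative powers."""
    return {p: c for p, c in f.items() if p < 0}

def polar_part_from_residues(f, m_max):
    """sum_{m=0}^{m_max} Res(lambda^m f(lambda)) mu^{-m-1}, as a dict in powers of mu."""
    out = {}
    for m in range(m_max + 1):
        c = residue(shift(f, m))
        if c != 0:
            out[-m - 1] = c
    return out

f = {-3: 2.0, -1: -1.0, 0: 5.0, 2: 4.0}   # hypothetical sample coefficients f_m
lhs = polar_part_from_residues(f, m_max=10)
rhs = polar_part(f)
```

Here `lhs` and `rhs` coincide: each residue $\mathrm{Res}(\lambda^m f)$ picks out $f_{-m-1}$, so the sum rebuilds exactly the polar part of $f$.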
We now verify directly that the flows $\partial_{t_i}$ defined by eqs. (3.37) and (3.36) all commute. This amounts to showing that $[L, \partial_{t_i}M_j - \partial_{t_j}M_i - [M_i, M_j]] = 0$. We even get the stronger result:

Proposition. The matrices $M_i$ defining the time evolution $t_i$ satisfy the zero curvature condition

$$\partial_{t_i}M_j - \partial_{t_j}M_i - [M_i, M_j] = 0 \tag{3.42}$$

As a consequence, the flows defined by eqs. (3.36, 3.37) are all commuting.

Proof. Let $i = (k, n, \alpha)$ and $j = (k', n', \alpha')$. Diagonalizing $L = g^{(k')}A^{(k')}g^{(k')-1}$ around $\lambda_{k'}$, the Lax equation gives

$$\partial_{t_i}g^{(k')} = M_i\,g^{(k')} + g^{(k')}d_i^{(k')} \tag{3.43}$$
where $d_i^{(k')}$ is an unknown diagonal matrix. This equation holds true in a vicinity of $\lambda_{k'}$. If $i = (k, n, \alpha)$ and $k' = k$, this implies that $d_i^{(k)} = -\xi_i + \text{regular}$, because $\partial_{t_i}g^{(k)}$ is regular and $M_i = (g^{(k)}\xi_i\,g^{(k)-1})_-$, while if $k'\neq k$, we only conclude that $d_i^{(k')}$ is regular around $\lambda_{k'}$. Note that $g^{(k)}$ is known only up to a right multiplication by a regular diagonal matrix $g^{(k)}\to g^{(k)}d^{(k)}$ and this changes $d_i^{(k')}\to d_i^{(k')} - d^{(k')-1}\partial_{t_i}d^{(k')}$. From eqs. (3.36) and (3.43), we get

$$\partial_{t_j}M_i = [M_j, g^{(k)}\xi_i\,g^{(k)-1}]_k + \left(g^{(k)}[d_j^{(k)}, \xi_i]\,g^{(k)-1}\right)_k$$

Since the commutator in the second term involves diagonal matrices, it vanishes. Hence we have $\partial_{t_j}M_i = [M_j, g^{(k)}\xi_i\,g^{(k)-1}]_k$.

Let us assume first that $\lambda_k\neq\lambda_{k'}$. Then $M_j$ is regular at $\lambda_k$ and only $(g^{(k)}\xi_i\,g^{(k)-1})_k = M_i$ contributes to the polar part of the above commutator, yielding

$$\partial_{t_j}M_i = [M_j, M_i]_k$$

Similarly we have

$$\partial_{t_i}M_j = [M_i, M_j]_{k'}$$

The zero curvature condition follows because $[M_i, M_j]$ is a rational function with poles only at $\lambda_k$ and $\lambda_{k'}$ and vanishes at infinity, so that

$$[M_i, M_j] = [M_i, M_j]_k + [M_i, M_j]_{k'}$$

Assume next that $\lambda_k = \lambda_{k'}$. We still have

$$\partial_{t_j}M_i = [M_j, g^{(k)}\xi_i\,g^{(k)-1}]_k,\qquad \partial_{t_i}M_j = [M_i, g^{(k)}\xi_j\,g^{(k)-1}]_k$$

where now all projections are at $\lambda_k$. But $M_i - g^{(k)}\xi_i\,g^{(k)-1} = O(1)$ and $M_j - g^{(k)}\xi_j\,g^{(k)-1} = O(1)$ so that $[M_i - g^{(k)}\xi_i\,g^{(k)-1}, M_j - g^{(k)}\xi_j\,g^{(k)-1}]_k = 0$, or

$$[M_i, M_j] - [g^{(k)}\xi_i\,g^{(k)-1}, M_j]_k - [M_i, g^{(k)}\xi_j\,g^{(k)-1}]_k + [g^{(k)}\xi_i\,g^{(k)-1}, g^{(k)}\xi_j\,g^{(k)-1}]_k = 0$$

The last term vanishes because $[\xi_i, \xi_j] = 0$. From this the zero curvature condition readily follows.

When parametrizing $L(\lambda)$ as in eq. (3.9), the dynamical variables are the $g^{(k)}(\lambda)$, modulo gauge transformations consisting of right multiplication by regular diagonal matrices. Let us write the equations of motion on the variables $g^{(k)}(\lambda)$.
Proposition. There exists a gauge choice such that the equations of motion read, for each $i = (k, n, \alpha)$:

$$\partial_{t_i}g^{(k')} = M_i\,g^{(k')} - g^{(k')}\,\partial_{t_i}\xi^{(k')}(\lambda,t)\,\delta_{kk'} \tag{3.44}$$

Proof. Equation (3.43) means that $d_i^{(k')} = g^{(k')-1}(\partial_{t_i} - M_i)g^{(k')}$ is the gauge transform of $M_i$ by $g^{(k')}$. The zero curvature equation being invariant under gauge transformation implies that $\partial_{t_i}d_j^{(k')} - \partial_{t_j}d_i^{(k')} - [d_i^{(k')}, d_j^{(k')}] = 0$ for any indices $(i, j, k')$. Since the matrices $d_i^{(k)}$ are diagonal, the commutator vanishes, and the condition implies that $d_i^{(k')} = \partial_{t_i}h^{(k')}$ for some diagonal matrix $h^{(k')}$. We can now use the freedom $g^{(k')}\to g^{(k')}d^{(k')}$ to suppress the regular part of $h^{(k')}$ around $\lambda_{k'}$ by choosing $d^{(k')} = \exp(h_+^{(k')})$. This is a choice of gauge. We already noticed that the singular part of $d_i^{(k')}$ around $\lambda_{k'}$ is fixed, $d_i^{(k')} = 0$ if $k\neq k'$ and $d_i^{(k')} = -\xi_i$ if $k = k'$; this determines $d_i^{(k')}$ completely: $d_i^{(k')} = -\partial_{t_i}\xi^{(k)}(\lambda,t)\,\delta_{kk'}$.

The set of Lax equations, eq. (3.37), for all the times $t_i$ is what is called an integrable hierarchy. Written on the variables $g^{(k)}$, eq. (3.44) reads in detail, when $i = (k', n, \alpha)$ and $k' = k$:

$$\partial_{t_i}g^{(k)} = \left(g^{(k)}\xi_i\,g^{(k)-1}\right)_-g^{(k)} - g^{(k)}\,\partial_{t_i}\xi^{(k)} = -\left(g^{(k)}\xi_i\,g^{(k)-1}\right)_+g^{(k)} \tag{3.45}$$

and when $i = (k', n, \alpha)$ and $k'\neq k$:

$$\partial_{t_i}g^{(k)} = M_i\,g^{(k)} \tag{3.46}$$
In this equation, $M_i$, which is a rational function of $\lambda$ with only one pole at $\lambda_{k'}$, is regular around $\lambda_k$. The zero curvature condition also allows us to introduce the “wave function”:

Definition. The wave function $\Psi(\lambda; t_1, t_2, \ldots)$ is a matrix function depending on all the times simultaneously, and satisfying

$$\partial_{t_i}\Psi = M_i\Psi,\qquad \Psi(\lambda,t)|_{t=0} = 1 \tag{3.47}$$

Locally around each $\lambda_k$ we have

$$\Psi(\lambda,t) = g^{(k)}(\lambda,t)\,e^{\xi^{(k)}(\lambda,t)}\,g^{(k)-1}(\lambda,0) \tag{3.48}$$

The compatibility conditions of eqs. (3.47) are precisely the zero curvature equations, eqs. (3.42). Equation (3.48) follows because $g^{(k)}(\lambda,t)\,e^{\xi^{(k)}(\lambda,t)}$ is easily seen to satisfy eq. (3.47). Multiplying on the right by $g^{(k)-1}(\lambda,0)$ enforces the initial condition $\Psi(\lambda,t)|_{t=0} = 1$.
3.5 Factorization problem
We now show that solving the hierarchy amounts to solving a factorization problem in a loop group. This is in fact solving a Riemann–Hilbert factorization problem. These two aspects, group theory and analytic properties, will be fundamental in Chapters 4 and 5. At the end of the section we make contact with the wave function. In the construction of coadjoint orbits, we introduced a loop algebra $G$ of elements $X = \sum_{n\ge 0}X_n\lambda^n$ regular at $\lambda = 0$. Its dual space $G^*$ was identified with the set of elements $\Xi = \sum_{n<0}\ldots$
n>0
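Lemma 2. The action of $\nabla_\alpha^{(k)}$ on the tau-function is given by:

$$\nabla_\alpha^{(k)}\log\tau = -\lambda\,\mathrm{Tr}\left(E_{\alpha\alpha}\,h^{(k)-1}\partial_\lambda h^{(k)}\right) \tag{3.64}$$

Its action on $h^{(k)}$ is given by:

$$h^{(k)-1}\nabla_\alpha^{(k)}h^{(k)} = -\lambda\,[h^{(k)-1}\partial_\lambda h^{(k)}, E_{\alpha\alpha}] - h^{(k)-1}E_{\alpha\alpha}h^{(k)} + E_{\alpha\alpha} \tag{3.65}$$

Proof. In the definition of $\log\tau$, eq. (3.59), we can replace $g^{(k)}$ by $h^{(k)}$ because $g_0^{(k)}$ is independent of $\lambda$. Hence, using eq. (3.62),

$$-\nabla_\alpha^{(k)}\log\tau = \sum_{n=1}^{\infty}\lambda^n\,\mathrm{Tr}\,\mathrm{Res}\left(\lambda^{-n}E_{\alpha\alpha}\,h^{(k)-1}\partial_\lambda h^{(k)}\right) = \lambda\,\mathrm{Tr}\left(E_{\alpha\alpha}\,h^{(k)-1}\partial_\lambda h^{(k)}\right)$$

To prove the second formula, we start from the equations of the hierarchy, eq. (3.45), with $\xi_i = E_{\alpha\alpha}\lambda^{-n}$. They read:

$$\partial_{t_i}g_0^{(k)} = -g_0^{(k)}\left(h^{(k)}E_{\alpha\alpha}\lambda^{-n}h^{(k)-1}\right)_0$$

$$\partial_{t_i}h^{(k)} = -\left(h^{(k)}E_{\alpha\alpha}\lambda^{-n}h^{(k)-1}\right)_{++}h^{(k)}$$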
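where for any series $f(\lambda) = \sum_{i\ge 0}f_i\lambda^i$ we have defined $(f(\lambda))_0 = f_0$ and $f(\lambda)_{++} = \sum_{i\ge 1}f_i\lambda^i$. Using eq. (3.62) again and $\sum_{n=1}^{\infty}\lambda^n(\lambda^{-n}f(\lambda))_0 = f(\lambda) - f_0$, we get

$$\nabla_\alpha^{(k)}h^{(k)} = -\sum_{n=1}^{\infty}\lambda^n\left(h^{(k)}E_{\alpha\alpha}\lambda^{-n}h^{(k)-1}\right)_{++}h^{(k)}$$
$$= \left[-\lambda\partial_\lambda\left(h^{(k)}E_{\alpha\alpha}h^{(k)-1}\right) + h^{(k)}E_{\alpha\alpha}h^{(k)-1} - E_{\alpha\alpha}\right]h^{(k)}$$

This is equivalent to eq. (3.65).

We can now give the proof of the theorem:

Proof. Let us take the matrix element $\beta, \alpha$ of eq. (3.65). Separating the cases $\beta\neq\alpha$ and $\beta = \alpha$, we can write it as:

$$\left(h^{(k)-1}\nabla_\alpha^{(k)}h^{(k)}\right)_{\beta\alpha} = -\lambda\left(h^{(k)-1}\partial_\lambda h^{(k)}\right)_{\beta\alpha} - \left(h^{(k)-1}E_{\alpha\alpha}h^{(k)} - E_{\alpha\alpha}\right)_{\beta\alpha} + \delta_{\beta\alpha}\,\lambda\left(h^{(k)-1}\partial_\lambda h^{(k)}\right)_{\alpha\alpha}$$

Multiplying on the left by $h^{(k)}$ and remembering that, by eq. (3.64), we have $\lambda(h^{(k)-1}\partial_\lambda h^{(k)})_{\alpha\alpha} = -\nabla_\alpha^{(k)}\log\tau$, we obtain

$$\nabla_\alpha^{(k)}h^{(k)}_{\beta\alpha} = -\lambda\partial_\lambda h^{(k)}_{\beta\alpha} - \left(E_{\alpha\alpha}h^{(k)}\right)_{\beta\alpha} + \left(h^{(k)}E_{\alpha\alpha}\right)_{\beta\alpha} - \left(\nabla_\alpha^{(k)}\log\tau\right)h^{(k)}_{\beta\alpha}$$

If $\beta = \alpha$ we set $X^{(k)}_{\alpha\alpha} = \tau\,h^{(k)}_{\alpha\alpha}$ and if $\beta\neq\alpha$, we set $X^{(k)}_{\beta\alpha} = \lambda^{-1}\tau\,h^{(k)}_{\beta\alpha}$. The equation becomes $(\nabla_\alpha^{(k)} + \lambda\partial_\lambda)X^{(k)}_{\beta\alpha} = 0$. This means that $X^{(k)}_{\beta\alpha}(t,\lambda)$ is of the form $X^{(k)}_{\beta\alpha}(t,\lambda) = \tau^{(k)}_{\beta\alpha}(\ldots, t_{(k,n,\alpha)} - \lambda^n/n, \ldots)$. Since $h^{(k)}_{\alpha\alpha}(\lambda)|_{\lambda=0} = 1$, the function $\tau^{(k)}_{\alpha\alpha}(t)$ is in fact equal to $\tau(t)$.

3.7 Integrable field theories and monodromy matrix

For a system with a finite number of degrees of freedom, we have seen that a Lax matrix could be interpreted as a coadjoint orbit. It is possible to adapt this interpretation to field theory by properly choosing the Lie algebra involved. We shall consider two-dimensional field theory on a cylinder with space variable $x\in[0, 2\pi]$ and time variable $t\in[-\infty, +\infty]$. To introduce the space variable $x$, we consider the loop algebra $\tilde G$ of (periodic) maps from the circle $S^1$ to some Lie algebra $G$, i.e. maps $S^1\to G$. The simplest case corresponds to choosing $G$ to be the algebra of $N\times N$ matrices, but more frequently, it will be an element of a loop algebra with spectral parameter $\lambda$ as in the finite-dimensional case. So we are dealing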
with double loop algebras. In order to introduce some structure in the $x$ direction, we consider the central extension of the $x$-loop algebra (see Chapter 16):

$$\hat G = \tilde G + \mathbb{C}K$$

The commutator of two elements $X_i = \tilde X_i(x) + c_iK$ reads by definition:

$$[X_1, X_2] = [\tilde X_1(x), \tilde X_2(x)] + \int_0^{2\pi}\left(\tilde X_1(x), \partial_x\tilde X_2(x)\right)dx\;K$$

where $(\,,)$ is an invariant non-degenerate bilinear form on $G$ (such as Tr or Tr Res). The dual space $\hat G^*$ of $\hat G$ can be identified with the space of pairs of elements of the form $\Xi = (\tilde\Xi(x), \zeta)$ with pairing

$$\Xi(X) = \int_0^{2\pi}\left(\tilde\Xi(x), \tilde X(x)\right)dx + \zeta c$$

The coadjoint action is defined as usual, $(\mathrm{ad}^*X\cdot\Xi)(Y) = -\Xi([X, Y])$, and takes the form

$$(\mathrm{ad}^*X\cdot\Xi)(Y) = -\int_0^{2\pi}\left(\tilde\Xi(x), [\tilde X(x), \tilde Y(x)]\right)dx - \zeta\int_0^{2\pi}\left(\tilde X(x), \partial_x\tilde Y(x)\right)dx$$
$$= \int_0^{2\pi}\left(-[\tilde\Xi(x), \tilde X(x)] + \zeta\,\partial_x\tilde X(x),\; \tilde Y(x)\right)dx$$

so that

$$\mathrm{ad}^*X\cdot\Xi = \left(-[\tilde\Xi(x), \tilde X(x)] + \zeta\,\partial_x\tilde X(x),\; 0\right) \tag{3.66}$$

We see that the element $\zeta$ is invariant by coadjoint action, and we will choose orbits with $\zeta = 1$. In this setting, the Lax equation eq. (3.21) reads, for $L = (U, 1)$ and $M = V$:

$$\partial_tU - \partial_xV - [V, U] = 0 \tag{3.67}$$

This is a zero curvature condition. Alternatively, one can say that the variable $x$ behaves like one of the times of finite-dimensional systems, see eq. (3.42). It is important, however, to realise that in the field theory case, the construction of commuting quantities is more complicated because we have to construct functions invariant under the coadjoint action eq. (3.66). For this, the right object to consider is the so-called monodromy matrix, which we now introduce. The zero curvature condition (3.67) expresses the compatibility condition of the associated linear system

$$(\partial_x - U)\Psi = 0,\qquad (\partial_t - V)\Psi = 0 \tag{3.68}$$
The matrices $U$ and $V$ can be thought of as the $x$ and $t$ components of a connection. This connection will be called the Lax connection. Given $U$ and $V$, the linear system (3.68) determines the matrix $\Psi$ up to multiplication on the right by a constant matrix, which we can fix by requiring $\Psi(\lambda, 0, 0) = 1$. This $\Psi$ will be called the wave function. Choosing a path $\gamma$ from the origin to the point $(x, t)$, the wave function can be written symbolically as

$$\Psi(x,t) = \overleftarrow{\exp}\left(\int_\gamma (U\,dx + V\,dt)\right) \tag{3.69}$$

where $\overleftarrow{\exp}$ denotes the path-ordered exponential. This is just the parallel transport along the curve $\gamma$ with the connection $(U, V)$. Since the Lax connection satisfies the zero curvature relation (3.67), the value of the path-ordered exponential is independent of the choice of this path. In particular, if $\gamma$ is the path $x\in[0, 2\pi]$ with fixed time $t$, we call $\Psi(2\pi, t)$ the monodromy matrix $T(\lambda, t)$:

$$T(\lambda,t) \equiv \overleftarrow{\exp}\left(\int_0^{2\pi}U(\lambda,x,t)\,dx\right) \tag{3.70}$$

where we assume that $U(\lambda,x,t)$ and $V(\lambda,x,t)$ depend on a spectral parameter $\lambda$.

Proposition. Assume that all fields are periodic in $x$ with period $2\pi$. Let $T(\lambda,t)$ be the monodromy matrix and let

$$H^{(n)}(\lambda) = \mathrm{Tr}\left(T^n(\lambda,t)\right) \tag{3.71}$$

Then, $H^{(n)}(\lambda)$ is independent of time. Hence traces of powers of the monodromy matrix generate conserved quantities.

Proof. Thinking of the path-ordered exponential on $[a, b]$ as

$$\overleftarrow{\exp}\int_a^bU(x)\,dx \sim (1 + \delta x\,U(x_n))\cdots(1 + \delta x\,U(x_1))$$

with a subdivision $x_1 = a < x_2 < \cdots < x_n = b$ such that $x_{i+1} - x_i = \delta x\to 0$, we get (all exponentials are path-ordered exponentials):

$$\partial_tT(\lambda,t) = \int_0^{2\pi}dx\;e^{\int_x^{2\pi}U\,dx}\,\dot U(\lambda,x)\,e^{\int_0^xU\,dx}$$
$$= \int_0^{2\pi}dx\;e^{\int_x^{2\pi}U\,dx}\left(\partial_xV + [V, U]\right)e^{\int_0^xU\,dx}$$
$$= \int_0^{2\pi}dx\;\partial_x\left(e^{\int_x^{2\pi}U\,dx}\,V\,e^{\int_0^xU\,dx}\right)$$
Performing the integral,

$$\partial_tT(\lambda,t) = V(\lambda,2\pi,t)\,T(\lambda,t) - T(\lambda,t)\,V(\lambda,0,t) \tag{3.72}$$

So, if the fields are periodic, we have $V(\lambda,2\pi,t) = V(\lambda,0,t)$ and the relation becomes

$$\partial_tT(\lambda,t) = [V(\lambda,0,t), T(\lambda,t)]$$

This is a Lax equation. It implies that $H^{(n)}(\lambda)$ is time-independent. Expanding in $\lambda$ we obtain an infinite set of conserved quantities. It is the monodromy matrix which plays the role of the Lax matrix in the field theoretical context.

3.8 Abelianization

We now discuss the analogue of the Zakharov–Shabat construction for field theory. We consider the linear system eq. (3.68) where $U(\lambda,x,t)$ and $V(\lambda,x,t)$ are matrices depending in a rational way on a parameter $\lambda$, having poles at constant values $\lambda_k$:

$$U = U_0 + \sum_kU_k\quad\text{with}\quad U_k = \sum_{r=-n_k}^{-1}U_{k,r}(\lambda-\lambda_k)^r \tag{3.73}$$

$$V = V_0 + \sum_kV_k\quad\text{with}\quad V_k = \sum_{r=-m_k}^{-1}V_{k,r}(\lambda-\lambda_k)^r \tag{3.74}$$

The compatibility condition of the linear system (3.68) is the zero curvature condition (3.67). We demand that it holds identically in $\lambda$. These conditions are always compatible, since by the same naive counting argument as for finite-dimensional systems there is one more variable than the number of equations. The origin of this indeterminacy is the same: the zero curvature condition is invariant by gauge transformations. If the gauge transformation is independent of $\lambda$, it will not spoil the analytic properties of $U$ and $V$. Notice that eq. (3.67) implies that $U_0$ and $V_0$ are pure gauge, i.e. there exists a group valued function $h$ such that

$$U_0 = \partial_xh\,h^{-1}\quad\text{and}\quad V_0 = \partial_th\,h^{-1} \tag{3.75}$$
Remark 1. Using a $\lambda$-independent gauge transformation periodic in $x$, we can always choose a gauge in which $U_0$ is constant diagonal and $V_0 = 0$. To show this we start from eq. (3.75). Writing that $U_0(x)$ is periodic implies that $\partial_x(h^{-1}(x)h(x+2\pi)) = 0$. We change basis so that $h^{-1}(x)h(x+2\pi)$ is diagonal and denote it by $\exp(2\pi P)$ with $P$ diagonal. Hence we can write $h(x,t) = \tilde h(x,t)\exp(Px)$ with $\tilde h(x+2\pi,t) = \tilde h(x,t)$ and we gauge transform under $\tilde h$. Then we have $\tilde U_0 = P$ and $\tilde V_0 = x\,\partial_tP$. But $\tilde V_0$ is periodic in $x$, hence $\partial_tP = 0$.
As for finite-dimensional systems, we first make a local analysis around each pole $\lambda_k$ in order to understand solutions of eq. (3.67). We show that around each singularity $\lambda_k$, one can perform a gauge transformation bringing simultaneously $U(\lambda)$ and $V(\lambda)$ to a diagonal form. The important new feature we want to emphasize, as compared to the finite-dimensional case, is that this construction is local in $x$. Let us assume that the pole is located at $\lambda = 0$. The rational functions $U(\lambda,x,t)$, $V(\lambda,x,t)$ can be expanded in a Laurent series in a neighbourhood of this pole:

$$U(\lambda,x,t) = \sum_{r=-n}^{\infty}U_r(x,t)\lambda^r,\qquad V(\lambda,x,t) = \sum_{r=-m}^{\infty}V_r(x,t)\lambda^r \tag{3.76}$$

We have:

Proposition. There exists a local, periodic, gauge transformation

$$\partial_x - U = g(\partial_x - A)g^{-1},\qquad \partial_t - V = g(\partial_t - B)g^{-1} \tag{3.77}$$

where $g(\lambda)$, $A(\lambda)$ and $B(\lambda)$ are formal series in $\lambda$:

$$g = \sum_{r=0}^{\infty}g_r\lambda^r,\qquad A = \sum_{r=-n}^{\infty}A_r\lambda^r,\qquad B = \sum_{r=-m}^{\infty}B_r\lambda^r$$

such that the matrices $A(\lambda)$ and $B(\lambda)$ are diagonal. Moreover $\partial_tA(\lambda) - \partial_xB(\lambda) = 0$.

Proof. Let $g_0$ be the matrix diagonalizing the leading term in eq. (3.76),

$$U_{-n} = g_0A_{-n}g_0^{-1}$$

Let $\partial_x - U = g_0(\partial_x - \tilde U)g_0^{-1}$, $\tilde U = \sum_{r=-n}^{\infty}\tilde U_r\lambda^r$. Since $\lambda = 0$ is a pole of $U$ and $g_0$ is regular we have $\tilde U_{-n} = A_{-n}$. We set $g = g_0h$. Equation (3.77) becomes $\partial_x - \tilde U = h(\partial_x - A)h^{-1}$, or $\partial_xh - \tilde Uh + hA = 0$. Expanding in powers of $\lambda$, we get

$$\partial_xh_l - \sum_{r=-n}^{l}\left(\tilde U_rh_{l-r} - h_{l-r}A_r\right) = 0,\qquad l = -n, \ldots, \infty \tag{3.78}$$
Of course $h_l = 0$ if $l < 0$; $h_0 = 1$. In the sum, we separate the first and the last term:

$$\partial_xh_l - [A_{-n}, h_{l+n}] - \sum_{r=-n+1}^{l-1}\left(\tilde U_rh_{l-r} - h_{l-r}A_r\right) - \tilde U_l + A_l = 0,\qquad l = -n, \ldots, \infty \tag{3.79}$$

Projecting this equation on the diagonal matrices, we determine $A_l$ in terms of $A_k$, $k < l$, and $h_k$, $k < l+n$. (The term $[A_{-n}, h_{l+n}]$ does not contribute to the diagonal since $A_{-n}$ is itself diagonal.) Similarly, the off-diagonal part of this equation determines the off-diagonal part of $h_{l+n}$ in terms of the same variables. We can make the solution unique by requiring, for instance, $\mathrm{diag}(h_{l+n}) = 0$. Therefore $h$ and $A$ are determined recursively. Note that this is a purely algebraic computation, so $g$ and $A$ are periodic functions of $x$ and are algebraic functions of the matrix elements of $U$ and their derivatives.

Under the gauge transformation $g$ we obviously have $\partial_tA - \partial_xB = [B, A]$, or $\partial_tA_l - \partial_xB_l = \sum_{r=-n}^{\infty}[B_{l-r}, A_r]$. If $B$ is regular at $\lambda = 0$ the expansion of $B$ starts at $B_0$ and $\partial_tA_{-n} = [B_0, A_{-n}]$. Hence the off-diagonal part of the commutator is zero and therefore $B_0$ is diagonal. If, however, $B$ is singular, the expansion of $B$ starts at $B_{-m}$ and the most singular term in the commutator is $\lambda^{-n-m}[B_{-m}, A_{-n}]$, and this has to vanish because $n + m > \max(n, m)$. Hence $B_{-m}$ is diagonal. We finish the proof by induction on $l$. Assume $B_r$ is diagonal until $B_{l+n-1}$. Then $\partial_tA_l - \partial_xB_l = \sum_{r=-n}^{l+m}[B_{l-r}, A_r] = [B_{l+n}, A_{-n}]$, hence $B_{l+n}$ is diagonal.

It is important to notice that this procedure only requires local computations. There is no differential equation to be integrated when recursively diagonalizing the Lax connection around its poles. As for finite-dimensional systems, we can reconstruct all the matrices $U_k$ and $V_k$, and therefore the Lax connection, from simple data:

$$U = U_0 + \sum_kU_k,\quad\text{with}\quad U_k \equiv \left(g^{(k)}A^{(k)}_-g^{(k)-1}\right)_- \tag{3.80}$$

$$V = V_0 + \sum_kV_k,\quad\text{with}\quad V_k \equiv \left(g^{(k)}B^{(k)}_-g^{(k)-1}\right)_- \tag{3.81}$$
Remark 2. Let us check that the reconstruction formulae (3.80, 3.81) for $U$ and $V$ are such that the order of the poles in the commutator appearing in the zero curvature condition matches the order of the poles in the derivative terms. Indeed the polar part at $\lambda_k$ of the commutator is $[U_k, V]_- + [V_k, U]_-$ and can be higher than the order of the poles of the derivatives $\partial_tU_k$ and $\partial_xV_k$. So we have to show that the formula ensures that the poles of these commutators, which are naively of order $n_k + m_k$, are actually of order less than $\max(n_k, m_k)$. Indeed, consider for example the commutator $[U_k, V]_-$. Similarly as for finite-dimensional systems, we may write it as:

$$[U_k, V]_- = \left[\left(g^{(k)}A^{(k)}_-g^{(k)-1}\right)_-, V\right]_- \tag{3.82}$$
$$= \left[g^{(k)}A^{(k)}_-g^{(k)-1} - \left(g^{(k)}A^{(k)}_-g^{(k)-1}\right)_+, V\right]_-$$
$$= -\left[g^{(k)}A^{(k)}_-g^{(k)-1}, g^{(k)}\partial_tg^{(k)-1}\right]_- - \left[\left(g^{(k)}A^{(k)}_-g^{(k)-1}\right)_+, V\right]_-$$

In the last line we use the fact that $\partial_t - V$ is diagonalized by $g^{(k)}$, i.e. $V = -g^{(k)}\partial_tg^{(k)-1} + g^{(k)}B^{(k)}g^{(k)-1}$ with $B^{(k)}$ diagonal. All the terms of the last line have a pole of order at most $n_k$. Similarly one shows that the order of the pole of $[V_k, U]_-$ is at most $m_k$. This shows that the constraints hidden in eq. (3.67) are solved by the formulas (3.80, 3.81).
We now use this diagonal gauge to compute the conserved quantities.

Proposition. The quantities $Q^{(k)}(\lambda) = \int_0^{2\pi}A^{(k)}(\lambda,x,t)\,dx$ are local conserved quantities of the field theory. They are related to eq. (3.71) by

$$H^{(n)}(\lambda) = \mathrm{Tr}\exp\left(n\int_0^{2\pi}A^{(k)}(\lambda,x,t)\,dx\right) = \mathrm{Tr}\exp\left(nQ^{(k)}(\lambda)\right)$$

Proof. Around each pole, in the diagonal gauge, the zero curvature condition reduces to

$$\partial_tA^{(k)}(\lambda,x,t) - \partial_xB^{(k)}(\lambda,x,t) = 0$$

It is the equation of conservation of a current. The charge $Q^{(k)}(\lambda) = \int_0^{2\pi}A^{(k)}(\lambda,x,t)\,dx$ is conserved because

$$\partial_tQ^{(k)}(\lambda) = \int_0^{2\pi}dx\,\partial_xB^{(k)}(\lambda,x,t) = B^{(k)}(\lambda,2\pi,t) - B^{(k)}(\lambda,0,t) = 0$$
where we have used the fact that $A^{(k)}(\lambda,x,t)$ and $B^{(k)}(\lambda,x,t)$ are local in terms of the coefficients of the Lax connection and are therefore periodic in $x$. Expanding $Q^{(k)}(\lambda)$ in powers of $\lambda - \lambda_k$ produces an infinite number of conserved quantities.

Under a gauge transformation, $U\to{}^gU = g^{-1}Ug - g^{-1}\partial_xg$, the monodromy matrix is transformed into ${}^gT(\lambda,t)$ with

$${}^gT(\lambda,t) = g^{-1}(2\pi,t)\,T(\lambda,t)\,g(0,t)$$

Thus, if $g(x,t)$ is periodic, $g(2\pi,t) = g(0,t)$, one has

$$\mathrm{Tr}\left({}^gT(\lambda,t)\right) = \mathrm{Tr}\left(T(\lambda,t)\right)$$
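The transformation law of the monodromy matrix under a periodic gauge transformation can be checked numerically. The sketch below is an illustration under stated assumptions: the $2\times 2$ connection `U`, the gauge matrix `g` and all numerical parameters are hypothetical test data, and the path-ordered exponential is approximated by integrating $\partial_x\Psi = U\Psi$ with a fourth-order Runge–Kutta scheme rather than by the product formula used in the text.

```python
import numpy as np

LAM = 2.0         # hypothetical value of the spectral parameter
A, B = 0.4, 0.6   # hypothetical parameters of the gauge transformation

def U(x):
    """A hypothetical 2x2 connection, periodic in x with period 2*pi."""
    return np.array([[0.5 * np.cos(x), 1.0 / LAM],
                     [(1.0 + 0.3 * np.sin(2 * x)) / LAM, -0.5 * np.cos(x)]])

def g(x):
    """Periodic gauge transformation with det g = 1."""
    s, c = A * np.sin(x), B * np.cos(x)
    return np.array([[1.0, s], [c, 1.0 + s * c]])

def dg(x):
    """Derivative of g with respect to x."""
    return np.array([[0.0, A * np.cos(x)],
                     [-B * np.sin(x), A * B * np.cos(2 * x)]])

def U_gauged(x):
    """Gauge-transformed connection g^{-1} U g - g^{-1} d_x g."""
    return np.linalg.solve(g(x), U(x) @ g(x) - dg(x))

def monodromy(conn, steps=4000):
    """Path-ordered exponential over [0, 2*pi] via RK4 on dPsi/dx = conn(x) Psi."""
    h = 2 * np.pi / steps
    psi = np.eye(2)
    for s in range(steps):
        x = s * h
        k1 = conn(x) @ psi
        k2 = conn(x + h / 2) @ (psi + h / 2 * k1)
        k3 = conn(x + h / 2) @ (psi + h / 2 * k2)
        k4 = conn(x + h) @ (psi + h * k3)
        psi = psi + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return psi

T = monodromy(U)
T_gauged = monodromy(U_gauged)
expected = np.linalg.solve(g(2 * np.pi), T @ g(0.0))   # g^{-1}(2 pi) T g(0)
```

Within the integration error, `T_gauged` agrees with `expected`, and the traces of `T` and `T_gauged` coincide because `g` is periodic.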
In the diagonal gauge around $\lambda = \lambda_k$, the monodromy matrix is easily computed and one gets:

$$H^{(n)}(\lambda) = \mathrm{Tr}\exp\left(n\int_0^{2\pi}A^{(k)}(\lambda,x,t)\,dx\right)$$

There is no problem of ordering in the exponential since the matrices $A^{(k)}$ are diagonal.

Remark 3. The abelianization procedure can be used to give a local expression of the wave function,

$$\Psi(\lambda,x,t) = g^{(k)}(\lambda,x,t)\;e^{\int_0^{(x,t)}\left(A^{(k)}(\lambda)\,dx + B^{(k)}(\lambda)\,dt\right)}\;g^{(k)-1}(\lambda,0,0)$$

As for finite-dimensional systems, this shows that $\Psi(\lambda,x,t)$ possesses essential singularities at the points $\lambda_k$. By the Poincaré theorem on differential equations, these are the only singularities of $\Psi(\lambda,x,t)$.
We now give some examples of two-dimensional field theories having a zero curvature representation.

Example 1. The first example is the non-linear $\sigma$ model. For simplicity, we look for a Lax connection in which $U$ and $V$ each have only one simple pole, located at two different points, and $U_0 = V_0 = 0$. Choosing these points to be at $\lambda = \pm 1$, we can thus parametrize $U$ and $V$ as:

$$U = \frac{1}{\lambda-1}J_x,\qquad V = -\frac{1}{\lambda+1}J_t \tag{3.83}$$

with $J_x$ and $J_t$ taking values in some Lie algebra. Decomposing the zero curvature condition $[\partial_x - U, \partial_t - V] = 0$ over its simple poles, using $\frac{1}{(\lambda-1)(\lambda+1)} = \frac{1}{2}\left(\frac{1}{\lambda-1} - \frac{1}{\lambda+1}\right)$, gives two equations: $\partial_tJ_x - \frac{1}{2}[J_x, J_t] = 0$, $\partial_xJ_t + \frac{1}{2}[J_x, J_t] = 0$. Taking the difference implies that $[\partial_t + J_t, \partial_x + J_x] = 0$. Thus $J$ is a pure gauge and there exists $g$ such that $J_t = g^{-1}\partial_tg$ and $J_x = g^{-1}\partial_xg$. Taking now the sum of the two equations implies $\partial_tJ_x + \partial_xJ_t = 0$, or equivalently,

$$\partial_t(g^{-1}\partial_xg) + \partial_x(g^{-1}\partial_tg) = 0$$

This is the field equation of the so-called non-linear sigma model, with $x$, $t$ as light-cone coordinates.

Example 2. Another important example is the sinh-Gordon model. It also has a two-pole Lax connection, with one pole at $\lambda = 0$ and the other at $\lambda = \infty$. Moreover, we require that in the light-cone coordinates, $x_\pm = x\pm t$, $U(\lambda, x_\pm)$ has a simple pole at $\lambda = 0$ and $V(\lambda, x_\pm)$ a simple pole at $\lambda = \infty$. The most general $2\times 2$ system of this form is:

$$(\partial_{x_+} - U)\Psi = 0,\qquad U = U_0 + \lambda^{-1}U_1$$
$$(\partial_{x_-} - V)\Psi = 0,\qquad V = V_0 + \lambda V_1$$
The matrices $U_i$, $V_i$ are taken to be traceless matrices, so contain 12 parameters. One can reduce this number by imposing a symmetry condition under a discrete group, as in eq. (3.18). Namely, we consider the group $Z_2$ acting by:

$$\Psi(\lambda)\longrightarrow\sigma_z\Psi(-\lambda)\sigma_z^{-1},\qquad \sigma_z = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix} \tag{3.84}$$

and we demand that $\Psi$ be invariant by this action. This restriction means that the wave function belongs to the twisted loop group. It follows that $\sigma_zU(-\lambda)\sigma_z = U(\lambda)$ and $\sigma_zV(-\lambda)\sigma_z = V(\lambda)$. We still have the possibility of performing a gauge transformation by an element $g$, independent of $\lambda$ in order to preserve the pole structure of the connection, and commuting with the action of $Z_2$, i.e. $g$ diagonal. This gauge freedom can be used to set $(V_0)_{ii} = 0$. The symmetry condition then gives:

$$U = \begin{pmatrix}u_0 & \lambda^{-1}u_1\\ \lambda^{-1}u_2 & -u_0\end{pmatrix},\qquad V = \begin{pmatrix}0 & \lambda v_1\\ \lambda v_2 & 0\end{pmatrix}$$

In this gauge, the zero curvature equation $\partial_{x_-}U - \partial_{x_+}V + [U, V] = 0$ reduces to:

$$\partial_{x_-}u_0 + u_1v_2 - v_1u_2 = 0 \tag{3.85}$$
$$\partial_{x_-}u_1 = 0,\qquad \partial_{x_-}u_2 = 0 \tag{3.86}$$
$$\partial_{x_+}v_1 - 2v_1u_0 = 0,\qquad \partial_{x_+}v_2 + 2v_2u_0 = 0 \tag{3.87}$$

From eq. (3.86) we have $u_1 = \alpha(x_+)$, $u_2 = \beta(x_+)$. We set $u_0 = \partial_{x_+}\varphi$. Then, from eq. (3.87) we have $v_1 = \gamma(x_-)\exp 2\varphi$ and $v_2 = \delta(x_-)\exp(-2\varphi)$. Finally, eq. (3.85) becomes:

$$\partial_{x_+}\partial_{x_-}\varphi + \alpha(x_+)\delta(x_-)e^{-2\varphi} - \beta(x_+)\gamma(x_-)e^{2\varphi} = 0$$

This is the sinh-Gordon equation. The arbitrary functions $\alpha(x_+)$, $\beta(x_+)$ and $\gamma(x_-)$, $\delta(x_-)$ are irrelevant: they can be absorbed into a redefinition
of the field $\phi$ and a change of the coordinates $x_+$, $x_-$. Taking them as constants, equal to $m$, we finally get

$$U = \begin{pmatrix} \partial_{x_+}\phi & m\lambda^{-1} \\ m\lambda^{-1} & -\partial_{x_+}\phi \end{pmatrix}; \qquad V = \begin{pmatrix} 0 & m\lambda e^{2\phi} \\ m\lambda e^{-2\phi} & 0 \end{pmatrix} \tag{3.88}$$

Hence the Lax connection of the sinh-Gordon model is naturally recovered from two-pole systems with $Z_2$ symmetry. This construction generalizes to other Lie algebras, the reduction group being generated by the Coxeter automorphism, and yields the Toda field theories.

Remark 4. There is a relation between the linear system eq. (3.88) and what are called Bäcklund transformations. These transformations produce new solutions of a non-linear partial differential equation from old ones. Assume that $\phi$ satisfies a second order non-linear partial differential equation. The Bäcklund transformation requires that the new function $\tilde\phi$ is obtained by solving a first order system:

$$\partial_{x_+}\tilde\phi = P(\phi, \tilde\phi, \partial_{x_+}\phi, \partial_{x_-}\phi), \qquad \partial_{x_-}\tilde\phi = Q(\phi, \tilde\phi, \partial_{x_+}\phi, \partial_{x_-}\phi)$$

Of course the compatibility condition of this system puts strong constraints on $P$ and $Q$. We say the transformation is auto-Bäcklund if $\tilde\phi$ satisfies the same equation as $\phi$. In the sinh-Gordon case, the transformation is defined by:

$$\partial_{x_+}(\phi + \tilde\phi) = 2m\lambda^{-1}\sinh(\phi - \tilde\phi), \qquad \partial_{x_-}(\phi - \tilde\phi) = 2m\lambda\sinh(\phi + \tilde\phi) \tag{3.89}$$
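where $\lambda$ is an arbitrary parameter. The compatibility condition reads:

$$\partial_{x_+}\partial_{x_-}\phi = -\partial_{x_+}\partial_{x_-}\tilde\phi + 4m^2\cosh(\phi - \tilde\phi)\sinh(\phi + \tilde\phi) = \partial_{x_+}\partial_{x_-}\tilde\phi + 4m^2\cosh(\phi + \tilde\phi)\sinh(\phi - \tilde\phi)$$

and, on adding the two expressions, reduces to the sinh-Gordon equation $\partial_{x_+}\partial_{x_-}\phi = 2m^2\sinh(2\phi)$. Moreover, subtracting them we find $\partial_{x_+}\partial_{x_-}\tilde\phi = 2m^2\sinh(2\tilde\phi)$, so if $\phi$ solves the sinh-Gordon equation, so does the transformed field $\tilde\phi$.

The relation between eq. (3.89) and the linear system eq. (3.88) is obtained by setting $e^{\tilde\phi} = e^{\phi}\,\frac{u}{v}$. The Bäcklund transformation then reads:

$$u\,(-\partial_{x_+}v + \partial_{x_+}\phi\, v + m\lambda^{-1}u) + v\,(\partial_{x_+}u + \partial_{x_+}\phi\, u - m\lambda^{-1}v) = 0$$

$$u\,(\partial_{x_-}v - m\lambda e^{2\phi}u) + v\,(-\partial_{x_-}u + m\lambda e^{-2\phi}v) = 0$$

Requiring the vanishing of the four terms in the parentheses yields exactly the linear system:

$$(\partial_{x_+} - U)\begin{pmatrix} v \\ u \end{pmatrix} = 0, \qquad (\partial_{x_-} - V)\begin{pmatrix} v \\ u \end{pmatrix} = 0$$

where the connection is given in eq. (3.88). Conversely, if we have a solution $(u, v)$ of the linear system associated with $\phi$, then $e^{\tilde\phi} = e^{\phi}\,\frac{u}{v}$ is a solution of the Bäcklund transformation. In general the relation between $\phi$ and $\tilde\phi$ is non-local. However, when expanding $\tilde\phi$ in formal powers of either $\lambda$ or $1/\lambda$, each term of the infinite series is a local function of $\phi$. This remark can be used to deduce an infinite set of local conserved currents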
in the sinh-Gordon model. Indeed, from the defining relations (3.89) of the Bäcklund transformation, we see that the current $(J_{x_+}, J_{x_-})$ with components:

$$J_{x_+} = \lambda^{-1}\cosh(\phi - \tilde\phi), \qquad J_{x_-} = -\lambda\cosh(\phi + \tilde\phi)$$

is conserved: $\partial_{x_-}J_{x_+} + \partial_{x_+}J_{x_-} = 0$. Expanding it in power series of either $\lambda$ or $1/\lambda$ gives two infinite series of local conserved currents.
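The Bäcklund relations (3.89) can be tried out on the simplest seed, the vacuum $\phi = 0$. A minimal numerical sketch, assuming this seed: with $\phi = 0$ the two equations integrate to $\tanh(\tilde\phi/2) = C\exp[-2m(x_+/\lambda + \lambda x_-)]$, and the code checks by finite differences that this $\tilde\phi$ satisfies (3.89), the sinh-Gordon equation, and the conservation of the current. The names `phi_new`, `residuals` and the values of `M`, `LAM`, `C` are illustrative choices, not taken from the text.

```python
import math

# Illustrative values: mass m, Baecklund parameter lambda, integration constant C.
M, LAM, C = 1.0, 1.3, 0.2

def phi_new(xp, xm):
    """Field obtained from the vacuum phi = 0 by integrating eq. (3.89)."""
    return 2.0 * math.atanh(C * math.exp(-2.0 * M * (xp / LAM + LAM * xm)))

def d_plus(f, xp, xm, h=1e-4):
    """Central finite difference with respect to x_+."""
    return (f(xp + h, xm) - f(xp - h, xm)) / (2.0 * h)

def d_minus(f, xp, xm, h=1e-4):
    """Central finite difference with respect to x_-."""
    return (f(xp, xm + h) - f(xp, xm - h)) / (2.0 * h)

def residuals(xp, xm):
    """Residuals of (3.89) with phi = 0, of sinh-Gordon, and of current conservation."""
    p = phi_new(xp, xm)
    # (3.89) with phi = 0: d_+ phi~ = -2 m/lambda sinh(phi~), d_- phi~ = -2 m lambda sinh(phi~)
    r_plus = d_plus(phi_new, xp, xm) + 2.0 * M / LAM * math.sinh(p)
    r_minus = d_minus(phi_new, xp, xm) + 2.0 * M * LAM * math.sinh(p)
    # sinh-Gordon equation: d_+ d_- phi~ = 2 m^2 sinh(2 phi~)
    mixed = d_plus(lambda a, b: d_minus(phi_new, a, b), xp, xm)
    r_sg = mixed - 2.0 * M**2 * math.sinh(2.0 * p)
    # Current with phi = 0: J_+ = cosh(phi~)/lambda, J_- = -lambda cosh(phi~)
    j_plus = lambda a, b: math.cosh(phi_new(a, b)) / LAM
    j_minus = lambda a, b: -LAM * math.cosh(phi_new(a, b))
    r_cur = d_minus(j_plus, xp, xm) + d_plus(j_minus, xp, xm)
    return r_plus, r_minus, r_sg, r_cur

if __name__ == "__main__":
    print(residuals(0.3, 0.2))  # four numbers that should be small (finite-difference accuracy)
```

The residuals vanish only up to discretization error, so the check is a consistency test rather than a proof.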
3.9 Poisson brackets of the monodromy matrix

As we just saw, the zero curvature equation leads to the construction of an infinite set of conserved currents. We want to compute the Poisson brackets of the conserved charges associated with these conserved currents. For this we will compute the Poisson brackets of the matrix elements of the monodromy matrix. In order to do so we assume the existence of an r-matrix relation such that:

$$\{U_1(\lambda, x), U_2(\mu, y)\} = [\,r_{12}(\lambda - \mu),\; U_1(\lambda, x) + U_2(\mu, y)\,]\,\delta(x - y) \tag{3.90}$$

We assume that $r$ is a non-dynamical r-matrix such as eq. (3.31). We say that the Poisson bracket eq. (3.90) is ultralocal because it involves $\delta(x - y)$ only, and none of its derivatives. This hypothesis actually covers a large class of integrable field theories. Since we are computing Poisson brackets, let us fix the time $t$, and consider the transport matrix from $x$ to $y$:

$$T(\lambda; y, x) = \overleftarrow{\exp}\left(\int_x^y U(\lambda, z)\,dz\right)$$
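The ordered exponential can be made concrete numerically. The sketch below approximates $T(\lambda; y, x)$ by a product of short-step exponentials in which later points multiply from the left, and checks two consequences: $\det T = 1$ for traceless $U$, and the composition property $T(y, x) = T(y, z)\,T(z, x)$. The sample connection `U` is a hypothetical choice made only for illustration; `transport`, `expm` and the step count are likewise assumptions of this sketch.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def expm(a, terms=20):
    """Matrix exponential of a small 2x2 matrix by its Taylor series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def U(lam, z):
    """Hypothetical traceless sample connection (not a model from the text)."""
    return [[math.cos(z), 1.0 / lam], [1.0 / lam, -math.cos(z)]]

def transport(lam, y, x, steps=2000):
    """Approximate T(lam; y, x); factors at later points z are placed to the left."""
    dz = (y - x) / steps
    t = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(steps):
        z_mid = x + (k + 0.5) * dz
        step = expm([[entry * dz for entry in row] for row in U(lam, z_mid)])
        t = matmul(step, t)
    return t

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

if __name__ == "__main__":
    full = transport(2.0, 1.0, 0.0)
    split = matmul(transport(2.0, 1.0, 0.4), transport(2.0, 0.4, 0.0))
    print(det(full), full, split)
```

Placing each new factor on the left is what the arrow over the exponential encodes; it is equivalent to the equation $\partial_y T(y, x) = U(y)\,T(y, x)$.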
In particular the monodromy matrix is $T(\lambda) = T(\lambda; 2\pi, 0)$. The matrix elements $[T]_{ij}$ of $T(\lambda; y, x)$ are functions on phase space. As in Chapter 2, section (2.5), we use the tensor notation to arrange the table of their Poisson brackets.

Proposition. If eq. (3.90) holds, we have the fundamental Sklyanin relation for the transport matrix:

$$\{T_1(\lambda; y, x), T_2(\mu; y, x)\} = [\,r_{12}(\lambda, \mu),\; T_1(\lambda; y, x)\,T_2(\mu; y, x)\,] \tag{3.91}$$

As a consequence, the traces of powers of the monodromy matrix, $H^{(n)}(\lambda) = \mathrm{Tr}\,(T^n(\lambda))$, generate Poisson commuting quantities:

$$\{H^{(n)}(\lambda), H^{(m)}(\mu)\} = 0 \tag{3.92}$$
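The step from (3.91) to (3.92) rests on two elementary trace identities in the tensor product: $\mathrm{Tr}_{12}(A \otimes B) = \mathrm{Tr}(A)\,\mathrm{Tr}(B)$, and the vanishing of the trace of any commutator. A small exact check with integer matrices is sketched below; `T_LAM`, `T_MU` and `R12` are arbitrary stand-ins for $T(\lambda)$, $T(\mu)$ and $r_{12}$, not actual monodromy data.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Tensor product a (x) b, with composite index (i, k) -> i * len(b) + k."""
    nb = len(b)
    size = len(a) * nb
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(size)] for i in range(size)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def matpow(a, n):
    result = [[int(i == j) for j in range(len(a))] for i in range(len(a))]
    for _ in range(n):
        result = matmul(result, a)
    return result

T_LAM = [[2, 1], [1, 1]]   # stand-in for T(lambda)
T_MU = [[1, 3], [0, 1]]    # stand-in for T(mu)
R12 = [[1, 2, 0, -1], [0, 3, 1, 4], [2, -2, 5, 0], [1, 0, -3, 2]]  # arbitrary 4x4 matrix

def check(n, m):
    """Return (Tr12 factorizes?, Tr12 of [R12, T1^n T2^m])."""
    t1n_t2m = kron(matpow(T_LAM, n), matpow(T_MU, m))
    factorizes = trace(t1n_t2m) == trace(matpow(T_LAM, n)) * trace(matpow(T_MU, m))
    return factorizes, trace(commutator(R12, t1n_t2m))

if __name__ == "__main__":
    print([check(n, m) for n in range(1, 4) for m in range(1, 4)])
```

Both identities are purely algebraic, which is why the commutativity (3.92) holds for any r-matrix once the quadratic relation (3.91) is available.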
Proof. Let us first prove the relation (3.91) for the Poisson brackets of the transport matrices. Notice that $\lambda$ is attached to $T_1$ and $\mu$ to $T_2$, so that
there is no ambiguity if we do not write explicitly the $\lambda$ and $\mu$ dependence. The transport matrix $T(y, x)$ satisfies the differential equations

$$\partial_x T(y, x) + T(y, x)\,U(x) = 0, \qquad \partial_y T(y, x) - U(y)\,T(y, x) = 0 \tag{3.93}$$

Since Poisson brackets satisfy the Leibniz rule, we have

$$\{T_1(y, x), T_2(y, x)\} = \int_x^y\!\!\int_x^y du\,dv\; T_1(y, u)\,T_2(y, v)\,\{U_1(u), U_2(v)\}\,T_1(u, x)\,T_2(v, x) \tag{3.94}$$

Replacing $\{U_1(u), U_2(v)\}$ by eq. (3.90), and using the differential equations satisfied by $T(y, x)$, this yields:

$$\begin{aligned}
\{T_1(y, x), T_2(y, x)\} &= \int_x^y\!\!\int_x^y du\,dv\;\delta(u - v)\,\Big[\,T_1(y, u)T_2(y, v)\; r_{12}\; (\partial_u + \partial_v)\big(T_1(u, x)T_2(v, x)\big) \\
&\qquad\qquad + (\partial_u + \partial_v)\big(T_1(y, u)T_2(y, v)\big)\; r_{12}\; T_1(u, x)T_2(v, x)\,\Big] \\
&= \int_x^y dz\;\partial_z\Big(T_1(y, z)T_2(y, z)\cdot r_{12}\cdot T_1(z, x)T_2(z, x)\Big)
\end{aligned}$$

Integrating this exact derivative gives the relation (3.91).

Let us now show that the traces of powers of the monodromy matrix, $H^{(n)}(\lambda)$, generate Poisson commuting quantities. Equation (3.91) implies

$$\{T_1^n(\lambda), T_2^m(\mu)\} = [\,r_{12}(\lambda, \mu),\; T_1^n(\lambda)\,T_2^m(\mu)\,]$$

We take the trace of this relation. On the left-hand side we use the fact that $\mathrm{Tr}_{12}(A \otimes B) = \mathrm{Tr}(A)\,\mathrm{Tr}(B)$ and get $\{H^{(n)}(\lambda), H^{(m)}(\mu)\}$. The right-hand side gives zero because it is the trace of a commutator.

Let us emphasize that it is the integration process involved in the transport matrix which leads from the linear Poisson bracket eq. (3.90) to the quadratic Sklyanin Poisson bracket eq. (3.91).

The proposition shows that we may take as Hamiltonian any element of the family generated by $H^{(n)}(\mu)$. We show that the corresponding equations of motion take the form of a zero curvature condition.

Proposition. Taking $H^{(n)}(\mu)$ as Hamiltonian, we have

$$\dot U(\lambda, x) \equiv \{H^{(n)}(\mu), U(\lambda, x)\} = \partial_x V^{(n)}(\lambda, \mu, x) + [\,V^{(n)}(\lambda, \mu, x),\; U(\lambda, x)\,] \tag{3.95}$$
where

$$V^{(n)}(\lambda, \mu; x) = n\,\mathrm{Tr}_1\Big(T_1(\mu; 2\pi, x)\; r_{12}(\mu, \lambda)\; T_1(\mu; x, 0)\; T_1^{\,n-1}(\mu; 2\pi, 0)\Big)$$

This provides the equations of motion for a hierarchy of times, when we expand in $\mu$.

Proof. To simplify the notation, we do not explicitly write the $\lambda$, $\mu$ dependence as above, noting that $\mu$ is attached to the tensorial index 1 and $\lambda$ to the tensorial index 2. We have:

$$\begin{aligned}
\{T_1(2\pi, 0), U_2(x)\} &= \int_0^{2\pi} dy\; T_1(2\pi, y)\,\{U_1(y), U_2(x)\}\,T_1(y, 0) \\
&= T_1(2\pi, x)\,[\,r_{12},\; U_1(x) + U_2(x)\,]\,T_1(x, 0)
\end{aligned}$$

Expanding the commutator we get four terms

$$\begin{aligned}
\{T_1(2\pi, 0), U_2(x)\} &= T_1(2\pi, x)\cdot r_{12}\cdot \underbrace{U_1(x)\,T_1(x, 0)}_{\text{use diff. eq.}} \;+\; T_1(2\pi, x)\cdot r_{12}\cdot \underbrace{U_2(x)\,T_1(x, 0)}_{\text{commute}} \\
&\quad - \underbrace{T_1(2\pi, x)\,U_1(x)}_{\text{use diff. eq.}}\cdot r_{12}\cdot T_1(x, 0) \;-\; \underbrace{T_1(2\pi, x)\,U_2(x)}_{\text{commute}}\cdot r_{12}\cdot T_1(x, 0)
\end{aligned}$$

Using the differential equations (3.93) and commuting factors as indicated gives

$$\{T_1(2\pi, 0), U_2(x)\} = \partial_x V_{12}(x) + [\,V_{12}(x),\; U_2(x)\,]$$

where we have introduced $V_{12}(x) = T_1(2\pi, x)\cdot r_{12}\cdot T_1(x, 0)$. From this we get

$$\{T_1^n(2\pi, 0), U_2(x)\} = \partial_x V_{12}^{(n)}(x) + [\,V_{12}^{(n)}(x),\; U_2(x)\,], \qquad V_{12}^{(n)}(x) = \sum_i T_1^{\,n-i-1}\,V_{12}(x)\,T_1^{\,i}$$

Taking the trace over the first space, remembering that $H^{(n)}(\mu) = \mathrm{Tr}\,T^n(\mu)$, and setting $V^{(n)}(\lambda, \mu, x) = \mathrm{Tr}_1 V_{12}^{(n)}(x)$, we find eq. (3.95).

3.10 The group of dressing transformations

We now introduce a very important notion, the group of dressing transformations, which is related to the Zakharov–Shabat construction. These transformations provide a way to construct new solutions of the field equations of motion from old ones. They define a group action on the space of classical solutions of the model, and therefore on the phase space of the model. Dressing transformations are special non-local gauge transformations preserving the analytical structure of the Lax connection. These transformations are intimately related to the Riemann–Hilbert problem which we have discussed in the section on factorization.
We choose a contour $\Gamma$ in the $\lambda$-plane such that none of the poles $\lambda_k$ of the Lax connection are on $\Gamma$. We will take for $\Gamma$ the sum of contours $\Gamma^{(k)}$, each one surrounding a pole $\lambda_k$ as in the factorization problem. To define the dressing transformation, we pick a group valued function $g(\lambda) \in \tilde G$ on $\Gamma$. From the Riemann–Hilbert problem, eqs. (3.49, 3.53), $g(\lambda)$ can be factorized as:

$$g(\lambda) = g_-^{-1}(\lambda)\,g_+(\lambda)$$

where $g_+(\lambda)$ and $g_-(\lambda)$ are analytic inside and outside the contour $\Gamma$ respectively. In the following discussion we assume that $g(\lambda)$ is close enough to the identity so that there are no indices.

Let $U$, $V$ be a solution of the zero curvature equation eq. (3.67) with the prescribed singularities specified in eqs. (3.73, 3.74). Let $\Psi \equiv \Psi(\lambda; x, t)$ be the solution of the linear system (3.68) normalized by $\Psi(\lambda; 0, 0) = 1$. We set:

$$\theta(\lambda; x, t) = \Psi(\lambda; x, t)\cdot g(\lambda)\cdot \Psi(\lambda; x, t)^{-1} \tag{3.96}$$

At each space–time point $(x, t)$, we perform a $\lambda$ decomposition of $\theta(\lambda; x, t)$ according to the Riemann–Hilbert problem as:

$$\theta(\lambda; x, t) = \theta_-^{-1}(\lambda; x, t)\cdot\theta_+(\lambda; x, t) \tag{3.97}$$

with $\theta_+$ and $\theta_-$ analytic inside and outside the contour $\Gamma$ respectively. Then,

Proposition. The following function, defined for $\lambda$ on the contour $\Gamma$,

$$\Psi^g(\lambda; x, t) = \theta_\pm(\lambda; x, t)\cdot\Psi(\lambda; x, t)\cdot g_\pm^{-1}(\lambda) \tag{3.98}$$

extends to a function $\Psi^g_+$, defined inside $\Gamma$ except at the points $\lambda_k$ where it has essential singularities, and a function $\Psi^g_-$ defined outside $\Gamma$. On $\Gamma$ we have $(\Psi^g_-)^{-1}\Psi^g_+|_\Gamma = 1$. So $\Psi^g_\pm$ define a unique function $\Psi^g$ which is normalized by $\Psi^g(\lambda, 0) = 1$ and is a solution of the linear system (3.68) with Lax connection $U^g$ and $V^g$ given by

$$U^g(\lambda; x, t) = \theta_\pm\cdot U\cdot\theta_\pm^{-1} + \partial_x\theta_\pm\cdot\theta_\pm^{-1} \tag{3.99}$$

$$V^g(\lambda; x, t) = \theta_\pm\cdot V\cdot\theta_\pm^{-1} + \partial_t\theta_\pm\cdot\theta_\pm^{-1} \tag{3.100}$$

The matrices $U^g$ and $V^g$, which satisfy the zero curvature equation (3.67), are meromorphic functions on the whole complex $\lambda$ plane with the same analytic structure as the components $U(\lambda)$ and $V(\lambda)$ of the original Lax connection.
Proof. First it follows directly from the definitions of $g_\pm$ and $\theta_\pm$ that for $\lambda$ on $\Gamma$,

$$\theta_+(\lambda; x, t)\cdot\Psi(\lambda; x, t)\cdot g_+^{-1}(\lambda) = \theta_-(\lambda; x, t)\cdot\Psi(\lambda; x, t)\cdot g_-^{-1}(\lambda)$$

so that the two expressions on the right-hand side of eq. (3.98) with the $+$ and $-$ signs are equal, and effectively define a unique function $\Psi^g$ on $\Gamma$. It is clear that this function can be extended into two functions $\Psi^g_\pm$ respectively defined inside and outside this contour by:

$$\Psi^g_\pm = \theta_\pm\cdot\Psi\cdot g_\pm^{-1}$$

These functions have the same essential singularities as $\Psi$ at the points $\lambda_k$. By construction, they are such that $(\Psi^g_-)^{-1}\Psi^g_+|_\Gamma = 1$. We may use $\Psi^g_\pm$ to define the Lax connection $U^g_\pm$, $V^g_\pm$ inside and outside the contour $\Gamma$. Explicitly:

$$U^g_\pm = \partial_x\Psi^g_\pm\cdot(\Psi^g_\pm)^{-1} = \partial_x\theta_\pm\,\theta_\pm^{-1} + \theta_\pm U\theta_\pm^{-1}$$

$$V^g_\pm = \partial_t\Psi^g_\pm\cdot(\Psi^g_\pm)^{-1} = \partial_t\theta_\pm\,\theta_\pm^{-1} + \theta_\pm V\theta_\pm^{-1}$$

Since $(\Psi^g_-)^{-1}\Psi^g_+|_\Gamma = 1$ we see that $U^g_+$ coincides with $U^g_-$ on the contour $\Gamma$, and similarly $V^g_+ = V^g_-$ for $\lambda \in \Gamma$, and hence the pairs $U^g_\pm$, $V^g_\pm$ define a connection $U^g$, $V^g$ on the whole $\lambda$-plane. Since $\theta_\pm$ are regular in their respective domains of definition, we see that $U^g$, $V^g$ have the same singularities as $U$, $V$.

This proposition effectively states that the dressing transformations (3.98) map solutions of the equations of motion into new solutions. Given a solution $U$, $V$ of the zero curvature equation with the prescribed pole structure and an element of the loop group $\tilde G$, we produce a new solution of the zero curvature equation with the same analytical structure. But, since this analytic structure is the main information which specifies the model, we have produced a new solution of the equations of motion.

Thus, the dressing transformations $\Psi \to \Psi^g$ act on the solution space. They form a group, called the dressing group and denoted by $G_R$. This group is modeled on $\tilde G$ but is not isomorphic to it since its composition law is different. Indeed,

Proposition. Let $g = g_-^{-1}g_+$ and $h = h_-^{-1}h_+$ be two elements of the dressing group $G_R$. The composition law of dressing transformations is given by

$$h \bullet g = (h_-g_-)^{-1}\,(h_+g_+) \tag{3.101}$$
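The difference between the dressing product (3.101) and the ordinary product can be seen on a toy example. In the sketch below the factors $g_\pm$, $h_\pm$ are replaced by constant integer matrices of determinant one, standing in for the values of the analytic factors at one fixed $\lambda$; this is an illustrative simplification, since no analyticity is modeled. The names `element`, `compose`, `G` and `H` are assumptions of the sketch.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inv(a):
    """Inverse of a 2x2 matrix of determinant one."""
    return [[a[1][1], -a[0][1]], [-a[1][0], a[0][0]]]

def element(pair):
    """Group element g = g_minus^{-1} g_plus represented by the pair (g_minus, g_plus)."""
    g_minus, g_plus = pair
    return matmul(inv(g_minus), g_plus)

def compose(h, g):
    """Dressing product of pairs: (h_-, h_+) . (g_-, g_+) = (h_- g_-, h_+ g_+)."""
    return (matmul(h[0], g[0]), matmul(h[1], g[1]))

IDENTITY = [[1, 0], [0, 1]]
G = ([[1, 1], [0, 1]], [[1, 0], [2, 1]])   # (g_minus, g_plus), illustrative values
H = ([[2, 1], [1, 1]], [[1, 3], [0, 1]])   # (h_minus, h_plus), illustrative values

if __name__ == "__main__":
    print("dressing product:", element(compose(H, G)))
    print("ordinary product:", matmul(element(H), element(G)))
```

In the pair representation a pure minus element $(h_-, 1)$ and a pure plus element $(1, g_+)$ commute, which is the statement that the two components of the dressing group commute.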
Representing the elements of the dressing group by the pairs $(g_-, g_+)$ and $(h_-, h_+)$ we may write the composition law as: $(h_-, h_+) \bullet (g_-, g_+) = (h_-g_-, h_+g_+)$. In particular the plus and minus components commute.

Proof. Consider two elements $g = g_-^{-1}g_+$ and $h = h_-^{-1}h_+$ and transform $\Psi$ successively, $\Psi \to \Psi^g \to (\Psi^g)^h$; we have:

$$\Psi^g = \theta^g_\pm\,\Psi\,g_\pm^{-1} \quad\text{with}\quad \theta^g_\pm = (\Psi g\Psi^{-1})_\pm$$

$$(\Psi^g)^h = \theta^{hg}_\pm\,\Psi^g\,h_\pm^{-1} \quad\text{with}\quad \theta^{hg}_\pm = (\Psi^g h\,\Psi^{g\,-1})_\pm \tag{3.102}$$

The factorization of $\Psi^g h\,\Psi^{g\,-1}$ can be written as follows:

$$(\theta^{hg}_-)^{-1}\,\theta^{hg}_+ \equiv \Psi^g h\,\Psi^{g\,-1} = \theta^g_-\,\Psi\,(h_-g_-)^{-1}(h_+g_+)\,\Psi^{-1}\,(\theta^g_+)^{-1}$$

or, equivalently,

$$\theta^{hg}_\pm = \big(\Psi\,(h_-g_-)^{-1}(h_+g_+)\,\Psi^{-1}\big)_\pm\,(\theta^g_\pm)^{-1}$$

Inserting this formula into eq. (3.102) proves the multiplication law for the dressing transformations.

The Lie algebra $\mathcal{G}_R$ of the dressing group is composed of the two commuting subalgebras $\mathcal{G}_\pm$ that we have introduced in the section on factorization. Recall that $\mathcal{G}_-$ consists of maps $X_-(\lambda)$ extendable outside $\Gamma$, while $\mathcal{G}_+ = \oplus_k\,\mathcal{G}_k$, $[\mathcal{G}_k, \mathcal{G}_{k'}] = 0$ for $k \neq k'$, consists of a collection of maps $X_k(\lambda)$ regular inside $\Gamma^{(k)}$. As a vector space, $\mathcal{G}_R$ is isomorphic to $\tilde{\mathcal{G}}$, but in the dressing Lie algebra, $[\mathcal{G}_-, \mathcal{G}_+] = 0$.

The infinitesimal form of the dressing transformation, eq. (3.98), for any $\tilde X = X_+ - X_-$, with $X_\pm \in \mathcal{G}_\pm$, is:

$$\delta_{\tilde X}\Psi = (\Psi\tilde X\Psi^{-1})_\pm\,\Psi - \Psi X_\pm \tag{3.103}$$
We end these general considerations on dressing transformations by clarifying their relation to the Poisson structure of the theory. We shall assume the Poisson bracket eq. (3.90) for the Lax connection with

$$r_{12}(\lambda, \mu) = -\frac{C_{12}}{\lambda - \mu}, \qquad C_{12} = \sum_{ij} E_{ij}\otimes E_{ji}$$

The Poisson bracket of the wave function is thus, from eqs. (3.69, 3.91):

$$\{\Psi_1(\lambda; x), \Psi_2(\mu; x)\} = [\,r_{12}(\lambda, \mu),\; \Psi_1(\lambda; x)\,\Psi_2(\mu; x)\,] \tag{3.104}$$

The r-matrix is related to the factorization problem in the loop algebra $\tilde{\mathcal{G}}$ whose elements are maps $\tilde X(\lambda)$ defined on $\Gamma$. Recall that $\tilde X(\lambda)$ can be
decomposed as $\tilde X = X_+ - X_-$ with $X_\pm \in \mathcal{G}_\pm$. Its component $X_-(\lambda)$ can be computed by:

$$X_-(\lambda) = \int_\Gamma \frac{d\mu}{2i\pi}\,\mathrm{Tr}_2\big(r_{12}(\lambda, \mu)\,\tilde X_2(\mu)\big), \qquad \lambda \text{ outside } \Gamma$$

Its component $X_+(\lambda) = (X_1(\lambda), X_2(\lambda), \ldots)$ reads:

$$X_k(\lambda) = \int_\Gamma \frac{d\mu}{2i\pi}\,\mathrm{Tr}_2\big(r_{12}(\lambda, \mu)\,\tilde X_2(\mu)\big), \qquad \lambda \text{ inside } \Gamma^{(k)}$$

We have to verify that $(X_k - X_-)|_{\Gamma^{(k)}} = \tilde X_k$, where $\tilde X_k$ denotes the component of $\tilde X(\lambda)$ on $\Gamma^{(k)}$. Recalling the formula

$$\frac{1}{x + i0} - \frac{1}{x - i0} = -2i\pi\,\delta(x)$$

and taking $\lambda_\pm$ to be two values of $\lambda$ pinching the contour $\Gamma^{(k)}$ from inside and outside, we can write:

$$r_{12}(\lambda_+, \mu) - r_{12}(\lambda_-, \mu) = 2i\pi\,C_{12}\,\delta(\lambda - \mu) \tag{3.105}$$

with $\delta(\lambda - \mu)$ the Dirac measure. Then we have:

$$(X_k - X_-)(\lambda)|_{\Gamma^{(k)}} = \sum_l \int_{\Gamma^{(l)}} \frac{d\mu}{2i\pi}\,\mathrm{Tr}_2\Big(\big(r_{12}(\lambda_+, \mu) - r_{12}(\lambda_-, \mu)\big)\,\tilde X_{2,l}(\mu)\Big) = \tilde X_k(\lambda)$$

Introducing two maps $R^\pm$ acting on the loop algebra $\tilde{\mathcal{G}}$ by

$$R^\pm(\tilde X)(\lambda) = X_\pm(\lambda) = \int_\Gamma \frac{d\mu}{2i\pi}\,\mathrm{Tr}_2\big(r_{12}(\lambda_\pm, \mu)\,\tilde X_2(\mu)\big) \tag{3.106}$$

we have shown that $\tilde X = X_+ - X_-$. This is equivalent to $R^+ - R^- = \mathrm{Id}$, the identity operator.

With this result at hand we can spell out the Poisson property of dressing transformations. The action eq. (3.103) does not naively preserve the Poisson brackets eq. (3.104). However, a good Poisson action is recovered if the dressing group itself is equipped with a non-trivial Poisson structure. It is then called a Poisson–Lie group and the action a Lie–Poisson action. The theory of Poisson–Lie groups and Lie–Poisson actions is sketched in Chapter 14. Here we only need to know that infinitesimal actions are generated by so-called non-Abelian Hamiltonians $T$. This means that there exists a function on phase space, $T$, taking value in the dual group, such that for any function $f$ on phase space

$$\delta_X f = \langle X,\; T^{-1}\{T, f\}\rangle \tag{3.107}$$
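Returning to the decomposition $\tilde X = X_+ - X_-$ obtained from (3.105) and (3.106): in the simplest setting it is just the splitting of a Laurent series by Cauchy integrals. The sketch below assumes a scalar reduction (so that $\mathrm{Tr}_2\,C_{12}X_2$ is replaced by $X$ itself) and a single contour, the unit circle; both are illustrative simplifications, and `COEFFS`, `cauchy`, `x_plus`, `x_minus` are hypothetical names. The contour integral is evaluated by the trapezoidal rule.

```python
import cmath
import math

# Laurent coefficients of a sample function X~(mu) on the unit circle (illustrative).
COEFFS = {-2: 0.5, -1: 1.0, 0: 2.0, 1: 3.0}

def x_tilde(mu):
    return sum(c * mu**n for n, c in COEFFS.items())

def cauchy(lam, samples=400):
    """Integral over |mu| = 1 of dmu/(2 i pi) * r(lam, mu) * X~(mu), with r = -1/(lam - mu)."""
    total = 0j
    for k in range(samples):
        mu = cmath.exp(2j * math.pi * k / samples)
        dmu = 1j * mu * (2.0 * math.pi / samples)
        total += -x_tilde(mu) / (lam - mu) * dmu
    return total / (2j * math.pi)

def x_plus(lam):
    """Part of X~ that extends analytically inside the contour."""
    return sum(c * lam**n for n, c in COEFFS.items() if n >= 0)

def x_minus(lam):
    """Part that extends outside the contour, with the sign convention X~ = X_+ - X_-."""
    return -sum(c * lam**n for n, c in COEFFS.items() if n < 0)

if __name__ == "__main__":
    inside, outside = 0.3 + 0.2j, 2.0 - 1.0j
    print(cauchy(inside), x_plus(inside))
    print(cauchy(outside), x_minus(outside))
```

Evaluating the same integral with $\lambda$ inside or outside the contour returns $X_+$ or $X_-$ respectively, which is the scalar shadow of $R^+ - R^- = \mathrm{Id}$.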
Here $X$ is an element of the Lie algebra of the Poisson–Lie group acting on the manifold, and $T^{-1}\{T, f\}$ belongs to the dual of this Lie algebra; $\langle\;,\;\rangle$ is the pairing, see Chapter 14. In the Abelian case, writing $T = \exp P$, we get $\delta_X f = \langle X, \{P, f\}\rangle = \{H(X), f\}$, where $H(X) = P(X)$. This is the standard formula showing that the action is symplectic in this case. In our situation $X$ is an element of the Lie algebra of the dressing group $\mathcal{G}_R = (\mathcal{G}_+, \mathcal{G}_-)$ and $T$ an element of the loop group $\exp\tilde{\mathcal{G}}$.

Proposition. The action of dressing transformations is a Lie–Poisson action. The non-Abelian generator is the monodromy matrix.

Proof. Introduce the monodromy matrix $T(\mu) = \Psi(\mu, 2\pi)$. Using the ultralocality property, its Poisson bracket with the wave function is, for $0 \le x \le 2\pi$:

$$\{\Psi_1(\lambda, x), T_2(\mu)\} = T_2(\mu)\,\Psi_2^{-1}(\mu, x)\,[\,r_{12}(\lambda, \mu),\; \Psi_1(\lambda, x)\,\Psi_2(\mu, x)\,]$$

In this formula, we can freely replace $\lambda$ on a contour $\Gamma^{(k)}$ by either of the two values $\lambda_\pm$ pinching it from inside and outside $\Gamma^{(k)}$. This is because for $\mu$ outside $\Gamma^{(k)}$, this replacement has no effect since $r_{12}(\lambda, \mu)$ is regular, while for $\mu$ on $\Gamma^{(k)}$, by eq. (3.105), the difference is the product of the Dirac measure $\delta(\lambda - \mu)$ and the commutator $[C_{12}, \Psi_1(\lambda)\Psi_2(\lambda)]$, which vanishes. Therefore, for any $\tilde X \in \mathcal{G}_R$ we have:

$$\int_\Gamma \frac{d\mu}{2i\pi}\,\mathrm{Tr}_2\Big(\tilde X_2(\mu)\,T_2^{-1}(\mu)\,\{\Psi_1(\lambda_\pm), T_2(\mu)\}\Big) = R^\pm\big(\Psi\tilde X\Psi^{-1}\big)(\lambda)\,\Psi(\lambda) - \Psi(\lambda, x)\,R^\pm(\tilde X)(\lambda)$$

where the two signs $\pm$ give the same answer. Since the maps $R^\pm$ project on the subalgebras $\mathcal{G}_\pm$, see eq. (3.106), this reads:

$$\int_\Gamma \frac{d\mu}{2i\pi}\,\mathrm{Tr}_2\Big(\tilde X_2(\mu)\,T_2^{-1}(\mu)\,\{\Psi_1(\lambda_\pm, x), T_2(\mu)\}\Big) = \delta_{\tilde X}\Psi(\lambda, x)$$

where $\delta_{\tilde X}\Psi(\lambda, x)$ is the infinitesimal form of the dressing transformation, eq. (3.103). Comparing with eq. (3.107), this proves that $T(\mu)$ is the non-Abelian Hamiltonian and shows that the action is Lie–Poisson.

3.11 Soliton solutions

In general, a matrix Riemann–Hilbert problem like eq. (3.49) cannot be solved explicitly by analytical methods. This statement applies to the fundamental solution of the Riemann–Hilbert problem, i.e. the one satisfying the conditions $\det\theta_\pm \neq 0$. However, once the fundamental solution
is known, new solutions “with zeroes” can easily be constructed from it. This can be used to produce new solutions to the equations of motion. Starting from a trivial vacuum solution, we obtain in this way the so-called soliton solutions.

To define the Riemann–Hilbert problem with zeroes, we first introduce a definition. We say that a matrix function $\theta(\lambda)$ has a zero at the point $\lambda_0$ if $\det\theta(\lambda_0) = 0$ and if in the vicinity of this point

$$\theta(\lambda) = F_0 + (\lambda - \lambda_0)F_1 + O(\lambda - \lambda_0)^2, \qquad \theta^{-1}(\lambda) = \frac{1}{\lambda - \lambda_0}\,C_0 + C_1 + O(\lambda - \lambda_0)$$

Since $\theta(\lambda)\theta^{-1}(\lambda) = \theta^{-1}(\lambda)\theta(\lambda) = \mathrm{Id}$, we have $F_0C_0 = C_0F_0 = 0$, $C_0F_1 + C_1F_0 = \mathrm{Id}$. In particular

$$\mathrm{Ker}\,F_0 = \mathrm{Im}\,C_0, \qquad \mathrm{Ker}\,C_0 = \mathrm{Im}\,F_0$$
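A concrete matrix function with a zero makes these relations easy to test. The sketch below uses, as an illustrative assumption, $\theta(\lambda) = 1 - \frac{z_0 - p_0}{\lambda - p_0}P$ with $P$ a rank-one projector, which has a zero at $z_0$ and a pole at $p_0$; its inverse is $1 - \frac{p_0 - z_0}{\lambda - z_0}P$. The expansion coefficients $F_0, F_1, C_0, C_1$ at $z_0$ are written down by hand and the identities are checked in exact rational arithmetic. All names and numerical values are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def one_plus(alpha, p):
    """Return the matrix 1 + alpha * p."""
    return [[Fraction(int(i == j)) + alpha * p[i][j] for j in range(2)] for i in range(2)]

# Rank-one projector P = w n^T / (n . w), built from illustrative vectors w and n.
W, N_VEC = (1, 2), (3, 1)
DOT = W[0] * N_VEC[0] + W[1] * N_VEC[1]
P = [[Fraction(W[i] * N_VEC[j], DOT) for j in range(2)] for i in range(2)]

Z0, P0 = Fraction(2), Fraction(-1)   # position of the zero and of the pole (illustrative)

def theta(lam):
    return one_plus(-(Z0 - P0) / (lam - P0), P)

def theta_inv(lam):
    return one_plus(-(P0 - Z0) / (lam - Z0), P)

# Expansion coefficients around the zero z0.
F0 = theta(Z0)                                                     # equals 1 - P
F1 = [[P[i][j] / (Z0 - P0) for j in range(2)] for i in range(2)]   # derivative of theta at z0
C0 = [[(Z0 - P0) * P[i][j] for j in range(2)] for i in range(2)]   # residue of theta^{-1} at z0
C1 = one_plus(Fraction(0), P)                                      # constant term: identity

ZERO = [[0, 0], [0, 0]]
IDENTITY = [[1, 0], [0, 1]]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

if __name__ == "__main__":
    print(det(F0), matmul(F0, C0), add(matmul(C0, F1), matmul(C1, F0)))
```

Here $\mathrm{Ker}\,F_0 = \mathrm{Im}\,C_0$ is the line spanned by $w$, and $\mathrm{Ker}\,C_0 = \mathrm{Im}\,F_0$ is the line annihilated by $n$.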
Let now $\Gamma$ be a closed contour in the $\lambda$-plane. As in the previous section $\Gamma$ could be a sum of small contours $\Gamma^{(k)}$ around each point $\lambda_k$, but here we can also consider more general contours provided no point $\lambda_k$ sits on them. Let $g(\lambda)$ be a matrix defined on $\Gamma$, and consider the Riemann–Hilbert problem

$$g(\lambda) = \theta_-^{-1}(\lambda)\,\theta_+(\lambda) \tag{3.108}$$

where $\theta_+(\lambda)$ is analytic inside $\Gamma$ and has $N$ zeroes located at the points $\mu_1, \ldots, \mu_N$, and $\theta_-^{-1}(\lambda)$ is analytic outside $\Gamma$ and has $N$ zeroes at the points $\lambda_1, \ldots, \lambda_N$. We emphasize that it is $\theta_-^{-1}(\lambda)$ which has zeroes, and not $\theta_-(\lambda)$. Let us fix the two sets of subspaces:

$$V_n = \mathrm{Im}\,\theta_-^{-1}(\lambda)|_{\lambda = \lambda_n}, \qquad W_n = \mathrm{Ker}\,\theta_+(\lambda)|_{\lambda = \mu_n}$$

Then we have:

Proposition. The choice of the subspaces $V_n$, $W_n$ specifies uniquely the solution of the factorization problem eq. (3.108), up to a left multiplication by a constant matrix. This factorization problem is called a Riemann–Hilbert problem with zeroes.

Proof. Suppose $\theta_\pm$ and $\tilde\theta_\pm$ are two solutions of the Riemann–Hilbert problem. Then the function $\chi = \tilde\theta_-\theta_-^{-1} = \tilde\theta_+\theta_+^{-1}$ is a meromorphic function in the whole complex plane. Its possible poles are located at $\lambda_n$, $\mu_n$. But around such a pole, let us say $\mu_n$, we have

$$\theta_+(\lambda) = F_n + O(\lambda - \mu_n), \qquad \theta_+^{-1}(\lambda) = \frac{1}{\lambda - \mu_n}\,C_n + O(1)$$

$$\tilde\theta_+(\lambda) = \tilde F_n + O(\lambda - \mu_n), \qquad \tilde\theta_+^{-1}(\lambda) = \frac{1}{\lambda - \mu_n}\,\tilde C_n + O(1)$$
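Since $\mathrm{Ker}\,F_n = W_n = \mathrm{Im}\,C_n$, $\mathrm{Ker}\,\tilde F_n = W_n = \mathrm{Im}\,\tilde C_n$, we see that around $\lambda = \mu_n$, $\chi = \frac{1}{\lambda - \mu_n}\tilde F_nC_n + O(1)$ is regular because $\tilde F_nC_n = 0$. The same analysis holds true at $\lambda_n$, and so $\chi(\lambda)$ is regular everywhere, hence a constant.

There is a simple way to construct the solution of the Riemann–Hilbert problem with zeroes, from its fundamental solution. We begin by adding a pair of zeroes at $(\mu_N, \lambda_N)$.

Proposition. Let $\theta_+$ and $\theta_-^{-1}$ be the solution of the Riemann–Hilbert problem eq. (3.108) with zeroes at $\mu_N$ and $\lambda_N$ respectively, such that $W_N = \mathrm{Ker}\,\theta_+(\mu_N)$ and $V_N = \mathrm{Im}\,\theta_-^{-1}(\lambda_N)$ are fixed. Let $\tilde\theta_\pm$ be a solution of the Riemann–Hilbert problem without zeroes at these points. Then

$$\theta_+(\lambda) = \chi_0^{-1}\left(1 - \frac{\mu_N - \lambda_N}{\lambda - \lambda_N}\,P_N\right)\tilde\theta_+(\lambda), \qquad \theta_-^{-1}(\lambda) = \tilde\theta_-^{-1}(\lambda)\left(1 - \frac{\lambda_N - \mu_N}{\lambda - \mu_N}\,P_N\right)\chi_0 \tag{3.109}$$

where $\chi_0$ is a constant matrix and $P_N$ is a projector such that

$$\mathrm{Im}\,P_N = \tilde\theta_+(\mu_N)\,W_N, \qquad \mathrm{Ker}\,P_N = \tilde\theta_-(\lambda_N)\,V_N$$

Proof. Introduce the matrix $\chi(\lambda) = \tilde\theta_-\theta_-^{-1} = \tilde\theta_+\theta_+^{-1}$ as above. It is a meromorphic function in the $\lambda$-plane with possible simple poles at $\lambda_N$ and $\mu_N$, so we can parametrize $\chi$ as

$$\chi = \chi_0 + \frac{1}{\lambda - \mu_N}\,\chi_1, \qquad \chi^{-1} = \chi_2 + \frac{1}{\lambda - \lambda_N}\,\chi_3$$

From the condition $\chi(\lambda)\chi^{-1}(\lambda) = 1$, we find $\chi_2 = \chi_0^{-1}$ and the two relations

$$\chi_1\chi_0^{-1} + \frac{1}{\mu_N - \lambda_N}\,\chi_1\chi_3 = 0, \qquad \chi_0\chi_3 + \frac{1}{\lambda_N - \mu_N}\,\chi_1\chi_3 = 0$$

Adding these two equations gives $\chi_0\chi_3 = -\chi_1\chi_0^{-1}$. Let us define

$$P_N = \frac{\chi_0\chi_3}{\lambda_N - \mu_N} = -\frac{\chi_1\chi_0^{-1}}{\lambda_N - \mu_N}$$

This matrix is a projector since $P_N^2 = P_N$, because $\chi_1\chi_3 = (\lambda_N - \mu_N)\,\chi_1\chi_0^{-1}$. We can rewrite $\chi(\lambda)$ in terms of $P_N$:

$$\chi(\lambda) = \left(1 - \frac{\lambda_N - \mu_N}{\lambda - \mu_N}\,P_N\right)\chi_0; \qquad \chi^{-1}(\lambda) = \chi_0^{-1}\left(1 - \frac{\mu_N - \lambda_N}{\lambda - \lambda_N}\,P_N\right)$$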
from which eqs. (3.109) follow. Next we demand $W_N = \mathrm{Ker}\,\theta_+(\mu_N) = \mathrm{Ker}\,(1 - P_N)\tilde\theta_+(\mu_N)$, so that $(1 - P_N)\tilde\theta_+(\mu_N)W_N = 0$, or $\mathrm{Ker}\,(1 - P_N) = \mathrm{Im}\,P_N = \tilde\theta_+(\mu_N)W_N$. Similarly, $V_N = \mathrm{Im}\,\theta_-^{-1}(\lambda_N) = \mathrm{Im}\,\tilde\theta_-^{-1}(\lambda_N)(1 - P_N)\chi_0$, or what is the same, $V_N = \tilde\theta_-^{-1}(\lambda_N)\,\mathrm{Ker}\,P_N$, so that $\mathrm{Ker}\,P_N = \tilde\theta_-(\lambda_N)V_N$.

Repeated applications of this result allow one to build a solution of the Riemann–Hilbert problem with zeroes at $\mu_1, \ldots, \mu_N$, $\lambda_1, \ldots, \lambda_N$:

$$\theta_+(\lambda) = \chi_N^{-1}\left(1 - \frac{\mu_N - \lambda_N}{\lambda - \lambda_N}\,P_N\right)\cdots\;\chi_1^{-1}\left(1 - \frac{\mu_1 - \lambda_1}{\lambda - \lambda_1}\,P_1\right)\tilde\theta_+(\lambda)$$

$$\theta_-^{-1}(\lambda) = \tilde\theta_-^{-1}(\lambda)\left(1 - \frac{\lambda_1 - \mu_1}{\lambda - \mu_1}\,P_1\right)\chi_1\;\cdots\left(1 - \frac{\lambda_N - \mu_N}{\lambda - \mu_N}\,P_N\right)\chi_N$$

Here $\tilde\theta_\pm(\lambda)$ refers to the fundamental solution of the Riemann–Hilbert problem.

We now extend the method of dressing transformations to the case of a Riemann–Hilbert problem with zeroes.

Proposition. Given a Lax connection satisfying the zero curvature condition and the associated wave function $\Psi(\lambda, x, t)$, and given vector spaces $V_n(0)$, $W_n(0)$, we define

$$V_n(x, t) = \Psi(\lambda_n, x, t)\,V_n(0), \qquad W_n(x, t) = \Psi(\mu_n, x, t)\,W_n(0) \tag{3.110}$$

We use $V_n(x, t)$, $W_n(x, t)$ to define a Riemann–Hilbert problem with zeroes at $\lambda_1, \lambda_2, \ldots$ and $\mu_1, \mu_2, \ldots$. Then for any $g(\lambda) = g_-^{-1}(\lambda)g_+(\lambda)$ on $\Gamma$, the transformation $\Psi \to \Psi^g$,

$$\Psi^g = \theta_\pm\,\Psi\,g_\pm^{-1}, \qquad \theta_-^{-1}\theta_+ = \Psi g\Psi^{-1}$$

is a dressing transformation, i.e. preserves the analytic structure of the Lax connection.

Proof. We start with the linear system

$$(\partial_x - U(\lambda, x, t))\Psi = 0, \qquad (\partial_t - V(\lambda, x, t))\Psi = 0$$

and dress it with a solution with zeroes of the Riemann–Hilbert problem, according to eqs. (3.99, 3.100):

$$U^g = \theta_\pm\cdot U\cdot\theta_\pm^{-1} + \partial_x\theta_\pm\cdot\theta_\pm^{-1}, \qquad V^g = \theta_\pm\cdot V\cdot\theta_\pm^{-1} + \partial_t\theta_\pm\cdot\theta_\pm^{-1}$$

In general, the components of the dressed connection will have simple poles at the points $\mu_n$, $\lambda_n$. We must require that the residues of these
poles vanish. At $\lambda = \mu_n$ we have

$$\theta_+(\lambda) = F_n + O(\lambda - \mu_n), \qquad \theta_+^{-1}(\lambda) = \frac{1}{\lambda - \mu_n}\,C_n + O(1)$$

Requiring that the residue at $\lambda = \mu_n$ vanishes yields

$$F_n\,(\partial_x - U|_{\lambda = \mu_n})\,C_n = 0, \qquad F_n\,(\partial_t - V|_{\lambda = \mu_n})\,C_n = 0$$

This means that the space $W_n = \mathrm{Ker}\,F_n = \mathrm{Im}\,C_n$ should be invariant under the action of the operators $\partial_x - U|_{\lambda = \mu_n}$ and $\partial_t - V|_{\lambda = \mu_n}$. Similarly, at $\lambda = \lambda_n$ we have

$$\theta_-^{-1}(\lambda) = F_n + O(\lambda - \lambda_n), \qquad \theta_-(\lambda) = \frac{1}{\lambda - \lambda_n}\,C_n + O(1)$$

and setting the residues to zero gives

$$C_n\,(\partial_x - U|_{\lambda = \lambda_n})\,F_n = 0, \qquad C_n\,(\partial_t - V|_{\lambda = \lambda_n})\,F_n = 0$$

This means that the space $V_n = \mathrm{Im}\,F_n = \mathrm{Ker}\,C_n$ should be invariant under the action of the operators $\partial_x - U|_{\lambda = \lambda_n}$ and $\partial_t - V|_{\lambda = \lambda_n}$. The simplest solution is to choose the spaces $V_n(x, t)$, $W_n(x, t)$ as in eq. (3.110).

The interest of this procedure is that it yields non-trivial results even if the Riemann–Hilbert problem is trivial, i.e. $g(\lambda) = \mathrm{Id}$. Then its fundamental solution is also trivial, $\tilde\theta_\pm(\lambda) = \mathrm{Id}$, and the solutions with zeroes are constructed by purely algebraic means. The resulting $\theta_\pm(\lambda)$ are rational functions of $\lambda$. To make this method effective, we need a simple solution of the zero curvature condition $\partial_tU - \partial_xV - [V, U] = 0$ to start with. Simple solutions can be found in the form

$$U = A(\lambda, x), \qquad V = B(\lambda, t), \qquad [A, B] = 0$$

Then $\Psi = \exp\left(\int_0^x A\,dx + \int_0^t B\,dt\right)$. The solutions obtained by dressing this simple type of solutions by the trivial Riemann–Hilbert problem with zeroes are called soliton solutions.

Example. Let us illustrate this construction on the non-linear sigma model. There, one has (see eq. (3.83))

$$U = \frac{1}{\lambda - 1}\,J_x, \qquad V = -\frac{1}{\lambda + 1}\,J_t$$
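We consider the case of $2 \times 2$ matrices. The non-linear sigma model field is related to $J_x$, $J_t$ by $J_x = g^{-1}\partial_xg$, $J_t = g^{-1}\partial_tg$. The matrix $g$ can be easily related to the solution $\Psi$ of the linear system:

$$g = \Psi^{-1}(\lambda = 0, x, t)$$

A simple solution of the equations of motion is

$$J_x = a\sigma_3, \qquad J_t = b\sigma_3, \qquad \Psi_0 = \exp\left[\left(\frac{ax}{\lambda - 1} - \frac{bt}{\lambda + 1}\right)\sigma_3\right], \qquad g_0 = e^{-(ax + bt)\sigma_3}$$

We want to dress this solution by solving a Riemann–Hilbert problem with zeroes at $\lambda_1$ and $\mu_1$. According to the general formulae, we have

$$\theta_+(\lambda) = \chi_0^{-1}\left(1 - \frac{\mu_1 - \lambda_1}{\lambda - \lambda_1}\,P\right), \qquad \theta_-^{-1}(\lambda) = \left(1 - \frac{\lambda_1 - \mu_1}{\lambda - \mu_1}\,P\right)\chi_0$$

with $\chi_0$ a constant matrix and $P$ a projector. It can be parametrized by two vectors $w$ and $n$:

$$P_{ij} = \frac{w_in_j}{w\cdot n}$$

The spaces $V_1$ and $W_1$ are defined by

$$V_1 = \mathrm{Im}\,\theta_-^{-1}(\lambda)|_{\lambda = \lambda_1} = \mathrm{Im}\,(1 - P)\chi_0 = \mathrm{Ker}\,P$$

$$W_1 = \mathrm{Ker}\,\theta_+(\lambda)|_{\lambda = \mu_1} = \mathrm{Ker}\,\chi_0^{-1}(1 - P) = \mathrm{Im}\,P$$

Hence $W_1$ is spanned by the vector $w$, and $V_1$ is spanned by the vector $n^\perp$ perpendicular to $n$. The invariance properties eq. (3.110) are ensured if we set

$$w(x, t) = \Psi(x, t, \mu_1)\,w(0), \qquad n^\perp(x, t) = \Psi(x, t, \lambda_1)\,n^\perp(0)$$

In components, this reads

$$w_1(x, t) = e^{\frac{ax}{\mu_1 - 1} - \frac{bt}{\mu_1 + 1}}\,w_1(0), \qquad w_2(x, t) = e^{-\left(\frac{ax}{\mu_1 - 1} - \frac{bt}{\mu_1 + 1}\right)}\,w_2(0)$$

and

$$n_1(x, t) = e^{-\left(\frac{ax}{\lambda_1 - 1} - \frac{bt}{\lambda_1 + 1}\right)}\,n_1(0), \qquad n_2(x, t) = e^{\frac{ax}{\lambda_1 - 1} - \frac{bt}{\lambda_1 + 1}}\,n_2(0)$$

These formulae completely determine the projector $P(x, t)$. The dressed field is then reconstructed by

$$g(x, t) = \Psi^{-1}|_{\lambda = 0} = \Psi_0^{-1}|_{\lambda = 0}\;\theta_-^{-1}|_{\lambda = 0} = g_0(x, t)\left(1 + \frac{\lambda_1 - \mu_1}{\mu_1}\,P(x, t)\right)\chi_0$$

The generalization to the $N$-soliton case is clear.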
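The component formulae above are easy to evaluate. The sketch below builds $w(x, t)$, $n(x, t)$ and the projector $P(x, t)$ for illustrative real values of $a$, $b$, $\lambda_1$, $\mu_1$ and of the initial vectors (all of them assumptions of the sketch), and checks that $P$ is a projector with image spanned by $w$ and kernel spanned by $n^\perp = (-n_2, n_1)$, and that $n^\perp$ evolves with the diagonal matrix $\Psi_0(\lambda_1)$.

```python
import math

A, B = 0.7, 0.4                      # constants a, b of the vacuum solution (illustrative)
LAM1, MU1 = 3.0, -2.0                # positions lambda_1, mu_1 of the zeroes (illustrative)
W0, N0 = (1.0, 0.5), (2.0, 1.0)      # initial vectors w(0), n(0) (illustrative)

def exponent(spec, x, t):
    """Exponent a x/(spec - 1) - b t/(spec + 1) of the vacuum wave function."""
    return A * x / (spec - 1.0) - B * t / (spec + 1.0)

def vectors(x, t):
    """Vectors w(x, t) and n(x, t) from the component formulae of the example."""
    e_mu, e_lam = exponent(MU1, x, t), exponent(LAM1, x, t)
    w = (math.exp(e_mu) * W0[0], math.exp(-e_mu) * W0[1])
    n = (math.exp(-e_lam) * N0[0], math.exp(e_lam) * N0[1])
    return w, n

def perp(n):
    """A vector perpendicular to n."""
    return (-n[1], n[0])

def projector(x, t):
    """P_ij = w_i n_j / (w . n)."""
    w, n = vectors(x, t)
    dot = w[0] * n[0] + w[1] * n[1]
    return [[w[i] * n[j] / dot for j in range(2)] for i in range(2)]

def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))

if __name__ == "__main__":
    x, t = 0.8, -0.3
    p = projector(x, t)
    w, n = vectors(x, t)
    print(p, apply(p, w), apply(p, perp(n)))
```

With these sign choices $w\cdot n$ stays positive for all $(x, t)$; for other initial vectors it can vanish along a line, where the projector becomes singular.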
4 Algebraic methods
We abstract the group theoretical settings of integrable systems. In this framework, the Lax matrix is viewed as a coadjoint orbit of a Lie algebra $\mathcal{G}$. When the r-matrix is non-dynamical the Jacobi identity simplifies and one can use it to define a second Lie algebra structure on $\mathcal{G}$. Hence $\mathcal{G}$ has a structure of Lie bi-algebra, and conversely, such a structure defines an r-matrix. One can then build dynamical systems admitting Lax representations and conserved quantities in involution. Furthermore, the solution of the equations of motion is reduced to a factorization problem in Lie group theory. We illustrate these constructions in the case of finite-dimensional Lie groups, with the open Toda chain model which we solve completely by algebraic methods. Finally, we demonstrate the versatility of the algebraic setting by constructing the Lax pair with spectral parameter for the Kowalevski top.

4.1 The classical and modified Yang–Baxter equations

In Chapter 2, we have computed the Poisson brackets of the entries $L_{ij}$ of the Lax matrix $L = \sum_{ij} L_{ij}E_{ij}$, and have shown that they can be expressed with an r-matrix (see section (2.5) in Chapter 2):

$$\{L_1, L_2\} = [r_{12}, L_1] - [r_{21}, L_2] \tag{4.1}$$

with $r_{12} = \sum_{ij,kl} r_{ij,kl}\,E_{ij}\otimes E_{kl}$, where $E_{ij}$ is the canonical basis of $gl(N)$. So $r_{12}$ can be viewed as an element of $gl(N)\otimes gl(N)$. We can generalize this setup immediately by considering a Lie algebra $\mathcal{G}$ equipped with a non-degenerate invariant scalar product $(\;,\;)$, also denoted by $\mathrm{Tr}\,(\;)$. We will use a basis $\{T_a\}$ of $\mathcal{G}$, and denote the matrix of scalar products by $g_{ab} = (T_a, T_b) = \mathrm{Tr}\,(T_aT_b)$ and its inverse by $g^{ab}$.
The proper interpretation of the Lax matrix $L$ is as an element of $\mathcal{G}^*$, i.e. a linear form on $\mathcal{G}$. It can also be viewed as an element of $\mathcal{G}$, since the invariant scalar product allows us to identify $\mathcal{G}$ and its dual $\mathcal{G}^*$:

$$X \in \mathcal{G} \longrightarrow L(X) \equiv (L, X)$$

With $L$ viewed as an element of $\mathcal{G}$, eq. (4.1) shows that $r_{12}$ is an element of $\mathcal{G}\otimes\mathcal{G}$. The nice structural aspects, however, appear when one interprets the r-matrix as a linear map $R : \mathcal{G} \longrightarrow \mathcal{G}$. If $r_{12} = \sum_{ab} r^{ab}\,T_a\otimes T_b$, then

$$R(X) = \sum_{ab} r^{ab}\,T_a\,(T_b, X) = \mathrm{Tr}_2\,(r_{12}X_2) \tag{4.2}$$
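Formula (4.2) can be evaluated directly with a partial trace. The sketch below takes, as an illustrative choice, $\mathcal{G} = gl(2)$ with the trace form and $r_{12} = \frac{1}{2}\sum_{ij}E_{ij}\otimes E_{ji}$, and checks that the associated map is $R = \frac{1}{2}\,\mathrm{Id}$, together with the invariance $[\sum_{ij}E_{ij}\otimes E_{ji},\, X_1 + X_2] = 0$. The helper names are hypothetical.

```python
from fractions import Fraction

N = 2  # work in gl(2)

def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)] for i in range(size)]

def kron(a, b):
    """Tensor product a (x) b, with composite index (i, k) -> i * N + k."""
    return [[a[i // N][j // N] * b[i % N][j % N] for j in range(N * N)] for i in range(N * N)]

def unit(i, j):
    """Canonical basis matrix E_ij."""
    return [[int((r, c) == (i, j)) for c in range(N)] for r in range(N)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(len(a))] for i in range(len(a))]

def partial_trace_2(m):
    """Trace over the second tensor factor of an (N*N) x (N*N) matrix."""
    return [[sum(m[i * N + k][j * N + k] for k in range(N)) for j in range(N)] for i in range(N)]

IDENTITY = [[int(i == j) for j in range(N)] for i in range(N)]

C12 = [[0] * (N * N) for _ in range(N * N)]
for i in range(N):
    for j in range(N):
        C12 = add(C12, kron(unit(i, j), unit(j, i)))

R12 = [[Fraction(entry, 2) for entry in row] for row in C12]

def R(x):
    """R(X) = Tr_2(r12 X_2), with X_2 = 1 (x) X."""
    return partial_trace_2(matmul(R12, kron(IDENTITY, x)))

if __name__ == "__main__":
    X = [[1, 2], [3, 4]]
    print(R(X))
```

Replacing `R12` by any other element of $gl(2)\otimes gl(2)$ gives the corresponding linear map $R$ in the same way.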
The r-matrix relation, eq. (4.1), can be presented in a dual form:

$$\{L(X), L(Y)\} = L([X, Y]_R) \tag{4.3}$$

with the R-bracket, $[\;,\;]_R$, defined as

$$[X, Y]_R = [R(X), Y] + [X, R(Y)] \tag{4.4}$$
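The passage from (4.1) to (4.3) uses only the invariance of the scalar product: for any linear map $R$ one has $([R(Y), L], X) - ([R(X), L], Y) = (L, [X, Y]_R)$. A small exact check in $gl(3)$ with the trace form is sketched below. The particular map `R` used here (one half of the strictly upper triangular part minus one half of the strictly lower triangular part) is an illustrative assumption, as are the sample matrices.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
DIM = 3

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)] for i in range(DIM)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(DIM)] for i in range(DIM)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(DIM)] for i in range(DIM)]

def pair(a, b):
    """Invariant scalar product (A, B) = Tr(AB)."""
    return sum(matmul(a, b)[i][i] for i in range(DIM))

def R(x):
    """Illustrative linear map: +1/2 on strictly upper, -1/2 on strictly lower, 0 on diagonal."""
    return [[HALF * x[i][j] if i < j else (-HALF * x[i][j] if i > j else Fraction(0))
             for j in range(DIM)] for i in range(DIM)]

def r_bracket(x, y):
    """[X, Y]_R = [R(X), Y] + [X, R(Y)], eq. (4.4)."""
    return add(comm(R(x), y), comm(x, R(y)))

X = [[1, 2, 0], [-1, 3, 4], [2, 0, -2]]
Y = [[0, 1, -3], [5, 2, 1], [-1, 4, 0]]
L = [[2, -1, 1], [0, 1, 3], [4, -2, 5]]

if __name__ == "__main__":
    lhs = pair(comm(R(Y), L), X) - pair(comm(R(X), L), Y)
    print(lhs, pair(L, r_bracket(X, Y)))
```

The identity holds for every linear `R`; the extra conditions on $R$ only enter when the Jacobi identity is imposed.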
To prove these formulae, take the invariant scalar product of both sides of eq. (4.1) with $X\otimes Y$. On the left-hand side we get $\{L(X), L(Y)\}$, while on the right-hand side we get $([R(Y), L], X) - ([R(X), L], Y)$, which is equal to $(L, [X, Y]_R)$ by invariance of the scalar product.

In this dual formalism, the Jacobi identity, eq. (2.12) in Chapter 2, becomes an equation on $R$:

$$L\big(\,[X, J(Y, Z)] + \text{cyc. perm.}\,\big) = 0 \tag{4.5}$$

where cyc. perm. means cyclic permutation of $(X, Y, Z)$ and the quantity $J(Y, Z)$ is defined as

$$J(Y, Z) = \{L(Y), R(Z)\} - \{L(Z), R(Y)\} + [R(Y), R(Z)] - R([Y, Z]_R)$$

If the $R$ matrix is a constant on phase space (independent of the dynamical variables), the Jacobi identity becomes:

$$L\big(\,[X, [R(Y), R(Z)] - R([Y, Z]_R)] + \text{cyc. perm.}\,\big) = 0$$

A particular way to fulfil this equation is to set

$$[R(X), R(Y)] - R\big(\,[X, R(Y)] + [R(X), Y]\,\big) = -\frac{1}{4}\,[X, Y] \tag{4.6}$$
so that it reduces to the Jacobi identity in $\mathcal{G}$. Equation (4.6) is called the modified Yang–Baxter equation and will be extensively studied below. The factor 1/4 is conventional and can be changed by a rescaling of $R$.

Example. The simplest example of an equation of the type eq. (4.3) is provided by the case in which the map $R$ is proportional to the identity map. It corresponds to the Kostant–Kirillov bracket on $\mathcal{G}^*$, which is defined by:

$$\{L(X), L(Y)\}_K = L([X, Y])$$

for any $X$, $Y$ in $\mathcal{G}$. Comparing with eq. (4.3) we see that this bracket corresponds to $R_K = \frac{1}{2}\,\mathrm{Id}$. The modified Yang–Baxter equation (4.6) is satisfied with this value of $R_K$. Under dualization, the Poisson bracket can be written in the form (4.1) with $r_{12} = \frac{1}{2}C_{12}$, where $C_{12}$ is the tensor Casimir of $\mathcal{G}$,

$$C_{12} = \sum_{ab} g^{ab}\,T_a\otimes T_b \tag{4.7}$$

Recall that the tensor Casimir has the two main properties:

$$[C_{12}, X_1 + X_2] = 0, \qquad X_1 = \mathrm{Tr}_2\,(C_{12}X_2), \qquad X \in \mathcal{G}$$

where we used the tensorial notations of section (2.5) in Chapter 2.

Before studying the modified Yang–Baxter equation, we would like to explain its relation with another important equation appearing in this domain: the classical Yang–Baxter equation. For any $r$ in $\mathcal{G}\otimes\mathcal{G}$ it reads:

$$[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] = 0 \tag{4.8}$$

This equation is important because it is the classical limit of the quantum Yang–Baxter equation, which is one of the main tools in the study of many quantum integrable models. In dualized form eq. (4.8) reads

$$[R(X), R(Y)] - R\big(\,[X, R(Y)] - [{}^tR(X), Y]\,\big) = 0 \tag{4.9}$$

where we defined ${}^tR(X) = \mathrm{Tr}_2\,(r_{21}X_2)$. Note the subtle difference between the left-hand sides of eq. (4.6) and eq. (4.9). The two expressions agree if ${}^tR = -R$, i.e. if the r-matrix is antisymmetric: $r_{12} = -r_{21}$. The relation between the solutions of the modified Yang–Baxter equation and of the classical Yang–Baxter equation is as follows:

Proposition. Let $R$ be an antisymmetric solution of the modified Yang–Baxter equation, then $R^\pm = R \pm \frac{1}{2}\,\mathrm{Id}$ both satisfy the classical Yang–Baxter equation.
Proof. We compute

$$R^\pm([X, Y]_R) = R([X, Y]_R) \pm \frac{1}{2}\,[X, Y]_R = [R^\pm(X), R^\pm(Y)] \tag{4.10}$$
The statement follows from the observation that [X, Y ]R = [R∓ (X), Y ] + [X, R± (Y )] and the fact that R∓ = −t R± when R is antisymmetric. ± Viewed as an element of G ⊗ G, R± corresponds to r12 with ± = r12 ± 12 C12 r12
(4.11)
where r12 is the dualized form of the antisymmetric solution of the modified Yang–Baxter equation, and C12 is the tensor Casimir element in G ⊗ G. 4.2 Algebraic meaning of the classical Yang–Baxter equations We now undertake an abstract study of the bracket eq. (4.3), for a generic linear mapping R : G −→ G solution of the modified Yang–Baxter equation. This will naturally lead us to introduce a factorization problem in the Lie group associated with G. Proposition. Let R be a solution of the modified Yang–Baxter equation (4.6), then the antisymmetric bracket [ , ]R satisfies the Jacobi identity. It thus defines a second Lie algebra structure on G. Proof. We have to prove that the bracket [ , ]R satisfies the Jacobi identity: [X, [Y, Z]R ]R + [Z, [X, Y ]R ]R + [Y, [Z, X]R ]R = 0 Expanding the external R-brackets, we get: [R(X), [Y, Z]R ] + [R(Z), [X, Y ]R ] + [R(Y ), [Z, X]R ] + [X, R([Y, Z]R )] + [Z, R([X, Y ]R )] + [Y, R([Z, X]R )] = 0 Developing the terms in the first line and using the Jacobi identity on the Lie bracket [ , ], we can rewrite them as −[X, [R(Y ), R(Z)]] − [Y, [R(Z), R(X)]] − [Z, [R(X), R(Y )]] They combine with the terms in the second line to yield: [X, [R(Y ), R(Z)] − R([Y, Z]R )] + cyclic permutations = 0
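This is satisfied if $R$ is a solution of the modified Yang–Baxter equation: each inner expression $[R(Y), R(Z)] - R([Y, Z]_R)$ is then equal to $-\frac{1}{4}[Y, Z]$, and the original Lie bracket $[\ ,\ ]$ obeys the Jacobi identity.

We are then in a very special situation where the vector space $\mathcal{G}$ is equipped with two Lie algebra structures defined by the two brackets $[\ ,\ ]$ and $[\ ,\ ]_R$. This is called a Lie bi-algebra. We will denote by $\mathcal{G}_R$ the Lie algebra with underlying vector space $\mathcal{G}$ but with Lie bracket $[\ ,\ ]_R$. The relation between these two Lie algebra structures is described by the following:

Proposition. Let us define $R^\pm = R \pm \frac{1}{2}\,\mathrm{Id}$, $\mathcal{K}_\pm = \mathrm{Ker}(R^\mp)$ and $\mathcal{G}_\pm = \mathrm{Im}(R^\pm)$, then
(i) $R^\pm : \mathcal{G}_R \to \mathcal{G}$ are Lie algebra homomorphisms,
(ii) $\mathcal{G}_\pm \subset \mathcal{G}$ are Lie subalgebras of $\mathcal{G}$,
(iii) $\mathcal{K}_\pm \subset \mathcal{G}_R$ are ideals of $\mathcal{G}_R$ and $\mathcal{G}_\pm \simeq \mathcal{G}_R / \mathcal{K}_\mp$.

Proof. This proposition is a straightforward consequence of the modified Yang–Baxter equation written as in eq. (4.10).

Given the maps $R^+$ and $R^-$, we thus construct two subalgebras $\mathcal{G}_\pm \subset \mathcal{G}$. Since $R^+ - R^- = \mathrm{Id}$, any element $X \in \mathcal{G}$ can be written, perhaps not uniquely, as
$$X = X_+ - X_- \quad \text{with} \quad X_\pm = R^\pm(X) \in \mathcal{G}_\pm = \mathrm{Im}(R^\pm) \tag{4.12}$$
Note that the bracket $[\ ,\ ]_R$ takes the simple form:
$$[X, Y]_R = [X_+, Y_+] - [X_-, Y_-] \tag{4.13}$$
We define the Lie algebra $\mathcal{G}_+ \oplus \mathcal{G}_-$ as the Cartesian product of $\mathcal{G}_+$ and $\mathcal{G}_-$ in which $[\mathcal{G}_+, \mathcal{G}_-] = 0$. We can embed $\mathcal{G}_R$ into $\mathcal{G}_+ \oplus \mathcal{G}_-$ by the map $X \to (R^+(X), R^-(X))$. From eq. (4.13) we see that $\mathcal{G}_R$ is a subalgebra of $\mathcal{G}_+ \oplus \mathcal{G}_-$ through this embedding. The question then arises to determine the image $\tilde{\mathcal{G}}_R$ of $\mathcal{G}_R$ in $\mathcal{G}_+ \oplus \mathcal{G}_-$. This is the object of the next two propositions.

Proposition. (i) $\mathcal{K}_\pm \subset \mathcal{G}_\pm$ are ideals in $\mathcal{G}_\pm$, hence $\mathcal{G}_\pm / \mathcal{K}_\pm$ are Lie algebras.
(ii) The mapping $\nu : \mathcal{G}_+ / \mathcal{K}_+ \to \mathcal{G}_- / \mathcal{K}_-$ defined by $\nu : R^+ X \to R^- X$ is a Lie algebra isomorphism.

Proof. To prove the first part of the proposition, we remark that $\mathcal{K}_\pm \subset \mathcal{G}_\pm$ since, for $X \in \mathcal{K}_\pm$, we have by definition $RX = \pm\frac{1}{2} X$, so that $X = \pm R^\pm X$. For the same reason, on $\mathcal{K}_\pm$ we have $[\ ,\ ]_R = \pm[\ ,\ ]$ and $\mathcal{K}_\pm$ are indeed subalgebras in $\mathcal{G}_\pm$. To prove that they are ideals, we consider $X \in \mathcal{K}_\pm$, $Y \in \mathcal{G}_\pm$. We can write $Y = R^\pm Z$, then
$$[X, Y] = [X, R^\pm Z] = [X, RZ] \pm \tfrac{1}{2}[X, Z] = [X, RZ] + [RX, Z] = [X, Z]_R$$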
but $[X, Z]_R \in \mathcal{K}_\pm$ since $\mathcal{K}_\pm$ are ideals in $\mathcal{G}_R$. To prove the second part let us denote by $\bar{X}_\pm$ the equivalence class
$$\bar{X}_\pm = R^\pm X \ [\mathrm{mod}\ \mathcal{K}_\pm]$$
First $\nu : \mathcal{G}_+ / \mathcal{K}_+ \to \mathcal{G}_- / \mathcal{K}_-$ is well-defined, since an element of $\mathcal{K}_+$ is mapped to 0. The mapping $\nu$ is surjective because it is induced by the surjective mapping $\mathcal{G}_+ \to \mathcal{G}_-$ given by $R^+ X \to R^- X$. It is injective because if $R^- X \in \mathcal{K}_-$ one has $R^+(R^- X) = 0$ by definition of $\mathcal{K}_-$; since $R^+$ and $R^-$ commute, this gives $R^-(R^+ X) = 0$, i.e. $R^+ X \in \mathcal{K}_+$. Finally we prove that $\nu$ is a Lie algebra isomorphism, i.e. $[\nu(x), \nu(y)] = \nu[x, y]$ for any $x, y$ of the form $x = R^+ X$, $y = R^+ Y$. Recalling that $[R^\pm X, R^\pm Y] = R^\pm [X, Y]_R$ and the definition of the Lie algebra bracket on $\mathcal{G}_\pm / \mathcal{K}_\pm$, we have:
$$\nu([R^+ X, R^+ Y]) = \nu(R^+ [X, Y]_R) = R^- [X, Y]_R = [R^- X, R^- Y] = [\nu(R^+ X), \nu(R^+ Y)]$$

Finally, we have the following important result:

Proposition. Consider the two maps:
$$\mathcal{G}_R \xrightarrow{\ (R^+,\, R^-)\ } \mathcal{G}_+ \oplus \mathcal{G}_- \xrightarrow{\ (1,\, -1)\ } \mathcal{G}$$
that is $X \to (R^+(X), R^-(X))$ and $(X_+, X_-) \to X_+ - X_-$ respectively. The first map is a Lie algebra injective homomorphism. Let $\tilde{\mathcal{G}}_R$ be its image. The second map, when restricted to $\tilde{\mathcal{G}}_R$, is bijective. Finally, $\tilde{\mathcal{G}}_R$ is characterized by the set of couples $(R^+ X, R^- Y)$ such that $\nu(R^+ X) = R^- Y$.

Proof. The first map is injective and the second one is surjective because $R^+ - R^- = \mathrm{Id}$. The elements of $\tilde{\mathcal{G}}_R$ are of the form $(R^+ X, R^- X)$ hence satisfy the condition $\nu(R^+ X) = R^- X$. Conversely, let us start from a pair $(R^+ X, R^- Y)$ such that $\nu(R^+ X) = R^- Y$. This means $R^- Y = R^- X$ modulo $\mathcal{K}_-$, hence there exists $K_- \in \mathcal{K}_-$ such that $R^-(X - Y) = K_- = -R^- K_-$. It follows that $(X - Y + K_-) = K_+$ belongs to $\mathrm{Ker}\, R^- = \mathcal{K}_+$. Then we have $R^+ X = R^+ \tilde{X}$, $R^- Y = R^- \tilde{X}$ with $\tilde{X} = X + K_- = Y + K_+$, so that the point $(R^+ X, R^- Y)$ belongs to $\tilde{\mathcal{G}}_R$.

This proposition shows that we can decompose uniquely any element $X \in \mathcal{G}$ as $X = X_+ - X_-$ with $X_\pm \in \mathcal{G}_\pm$ and $\nu(X_+) = X_-$. This will be the basis for the factorization problem below. Another important consequence of these results is that the algebraic structure we have exhibited is equivalent to the existence of an r-matrix.
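This is the starting point of the classification theorems of the solutions of the Yang–Baxter equation by Belavin–Drinfeld and Semenov-Tian-Shansky. In the next sections we content ourselves with giving simple examples of this algebraic setting. In the Adler–Kostant–Symes scheme $\nu$ is trivial, and in the open Toda chain $\nu$ is non-trivial.

4.3 Adler–Kostant–Symes scheme

A class of solutions of the modified Yang–Baxter equation is produced by the Adler–Kostant–Symes scheme. Let $\mathcal{G}$ be a Lie algebra and assume that we have two Lie subalgebras $\mathcal{A}$ and $\mathcal{B}$ such that, as a vector space, $\mathcal{G}$ is the direct sum
$$\mathcal{G} = \mathcal{A} + \mathcal{B}$$
Note that $\mathcal{A}$ and $\mathcal{B}$ are Lie subalgebras but they are not assumed to commute ($[\mathcal{A}, \mathcal{B}] \neq 0$). We denote by $P_{\mathcal{A}}$ the projection on $\mathcal{A}$ along $\mathcal{B}$ and $P_{\mathcal{B}} = 1 - P_{\mathcal{A}}$.

Proposition. The linear map $R = \frac{1}{2}(P_{\mathcal{A}} - P_{\mathcal{B}})$ satisfies the modified Yang–Baxter equation.

Proof. By definition we have $[X, Y]_R = [P_{\mathcal{A}} X, P_{\mathcal{A}} Y] - [P_{\mathcal{B}} X, P_{\mathcal{B}} Y]$, where we used $P_{\mathcal{A}} + P_{\mathcal{B}} = 1$, thus:
$$R([X, Y]_R) = \tfrac{1}{2}\big([P_{\mathcal{A}} X, P_{\mathcal{A}} Y] + [P_{\mathcal{B}} X, P_{\mathcal{B}} Y]\big)$$
Since
$$[RX, RY] = \tfrac{1}{4}\big([P_{\mathcal{A}} X, P_{\mathcal{A}} Y] - [P_{\mathcal{A}} X, P_{\mathcal{B}} Y] - [P_{\mathcal{B}} X, P_{\mathcal{A}} Y] + [P_{\mathcal{B}} X, P_{\mathcal{B}} Y]\big)$$
we obtain
$$[RX, RY] - R([X, Y]_R) = -\tfrac{1}{4}[X, Y]$$
This is the modified Yang–Baxter equation.

This construction is a particular example of the general discussion. With the notations of the previous section, $R^+ = R + \frac{1}{2} = P_{\mathcal{A}}$ and $R^- = R - \frac{1}{2} = -P_{\mathcal{B}}$, and $\mathcal{G}_+ = \mathcal{K}_+ = \mathcal{A}$, $\mathcal{G}_- = \mathcal{K}_- = \mathcal{B}$. Thus $\mathcal{G}_\pm / \mathcal{K}_\pm = \{0\}$ so we have the decomposition of $\mathcal{G}_R$ as a direct sum of Lie algebras:
$$\mathcal{G}_R = \mathcal{A} \oplus \mathcal{B}$$
In the Lie algebra $\mathcal{G}_R$, $\mathcal{A}$ and $\mathcal{B}$ commute. It is then clear that any element $X \in \mathcal{G}$ can be decomposed as $X = X_+ - X_-$ with $X_+ = P_{\mathcal{A}} X$ and $X_- = -P_{\mathcal{B}} X$.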
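The Adler–Kostant–Symes statement can be checked mechanically on a small example. The sketch below is only an illustration under assumptions that are not in the text: it takes $\mathcal{G} = gl(3)$, with $\mathcal{A}$ the upper triangular matrices (diagonal included) and $\mathcal{B}$ the strictly lower triangular ones, and all function names are hypothetical. Because the identities are multilinear, it is enough to test them on the elementary matrices; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import product

N = 3  # matrix size; gl(3) is an illustrative choice, not taken from the text

def add(x, y, cx=1, cy=1):
    """Linear combination cx*x + cy*y of two N x N matrices (nested lists)."""
    return [[cx * x[i][j] + cy * y[i][j] for j in range(N)] for i in range(N)]

def bracket(x, y):
    """Ordinary matrix commutator [x, y]."""
    xy = [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)] for i in range(N)]
    yx = [[sum(y[i][k] * x[k][j] for k in range(N)) for j in range(N)] for i in range(N)]
    return add(xy, yx, 1, -1)

def proj_a(x):
    """Projection on A (upper triangular part, diagonal included) along B."""
    return [[x[i][j] if i <= j else Fraction(0) for j in range(N)] for i in range(N)]

def proj_b(x):
    """Projection on B (strictly lower triangular part) along A."""
    return [[x[i][j] if i > j else Fraction(0) for j in range(N)] for i in range(N)]

def r_map(x):
    """R = (P_A - P_B) / 2."""
    return add(proj_a(x), proj_b(x), Fraction(1, 2), Fraction(-1, 2))

def r_bracket(x, y):
    """[X, Y]_R = [RX, Y] + [X, RY]."""
    return add(bracket(r_map(x), y), bracket(x, r_map(y)))

def mybe_defect(x, y):
    """[RX, RY] - R([X, Y]_R) + 1/4 [X, Y]; zero when the modified YBE holds."""
    lhs = add(bracket(r_map(x), r_map(y)), r_map(r_bracket(x, y)), 1, -1)
    return add(lhs, bracket(x, y), 1, Fraction(1, 4))

def jacobi_defect(x, y, z):
    """Cyclic sum of nested R-brackets; zero when [ , ]_R is a Lie bracket."""
    t1 = r_bracket(x, r_bracket(y, z))
    t2 = r_bracket(z, r_bracket(x, y))
    t3 = r_bracket(y, r_bracket(z, x))
    return add(add(t1, t2), t3)

def unit(i, j):
    """Elementary matrix E_ij."""
    return [[Fraction(int((a, b) == (i, j))) for b in range(N)] for a in range(N)]

BASIS = [unit(i, j) for i, j in product(range(N), repeat=2)]
ZERO = [[Fraction(0)] * N for _ in range(N)]

def check_all():
    """Return (modified YBE holds, Jacobi identity for [ , ]_R holds) on the basis."""
    ok_mybe = all(mybe_defect(x, y) == ZERO for x in BASIS for y in BASIS)
    ok_jacobi = all(jacobi_defect(x, y, z) == ZERO
                    for x in BASIS for y in BASIS for z in BASIS)
    return ok_mybe, ok_jacobi
```

Calling `check_all()` tests both the modified Yang–Baxter equation and the Jacobi identity for $[\ ,\ ]_R$. Swapping in another pair of complementary subalgebras only requires changing `proj_a` and `proj_b`.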
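Example. Let $\tilde{\mathcal{G}}$ be the loop algebra $\tilde{\mathcal{G}} = \mathcal{G} \otimes \mathbb{C}[[\lambda, \lambda^{-1}]]$ with $\mathcal{G}$ a finite-dimensional simple Lie algebra. Elements of $\tilde{\mathcal{G}}$ are linear combinations of elements $X \otimes \lambda^n$ with $X \in \mathcal{G}$ and $n \in \mathbb{Z}$. The commutation relations in $\tilde{\mathcal{G}}$ are:
$$[X \otimes \lambda^n, Y \otimes \lambda^m] = [X, Y] \otimes \lambda^{n+m} \tag{4.14}$$
The two subalgebras $\mathcal{A} = \mathcal{G}_+$, $\mathcal{B} = \mathcal{G}_-$ are spanned by elements of the form $X \otimes \lambda^n$ with $n \geq 0$ and $n < 0$ respectively. Clearly $[\mathcal{A}, \mathcal{B}] \neq 0$ for the Lie algebra structure of $\tilde{\mathcal{G}}$. For any $X \in \mathcal{G}$ and any formal power series $f(\lambda) = \sum_{n \in \mathbb{Z}} f_n \lambda^n$, the maps $R^\pm$ are defined by:
$$R^+(X \otimes f(\lambda)) = X \otimes f_+(\lambda), \qquad R^-(X \otimes f(\lambda)) = -X \otimes f_-(\lambda) \tag{4.15}$$
where $f_+(\lambda) = \sum_{n \geq 0} f_n \lambda^n$ and $f_-(\lambda) = \sum_{n < 0} f_n \lambda^n$.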
[...]

hence $l_\alpha = \exp(\alpha(q))$ for $\alpha$ simple and $l_\alpha = 0$ otherwise. We have obtained the Lax matrix of the Toda chain:
$$L = p + \sum_{\alpha\ \mathrm{simple}} e^{\alpha(q)} (E_\alpha + E_{-\alpha}) \tag{4.52}$$
The set of the $(Q, L)$ is obviously a $(2r)$-dimensional subvariety $S$ of $N_\mu$. For any point $(g, \xi)$ in $N_\mu$ one can write uniquely $g = nQk$ and $\xi = k^{-1} L k$. To compute the reduced symplectic form in the coordinates $(Q, L)$, it is enough to evaluate the canonical 1-form:
$$\alpha = 2\xi(g^{-1}\delta g) = 2L(Q^{-1}\delta Q) = 2L(\delta q) = 2\sum_i p_i\, \delta q_i$$
This completes the identification of the reduced geodesic flow with the open Toda chain.

An interesting feature of this approach is that it allows us to compute naturally the r-matrix of the Toda model. We refer to Chapter 14 for the exposition of the general method. The function $L(X)$ (for any $X \in \mathcal{M}$) on the reduced phase space has a uniquely defined extension on $T^* G_0$, invariant under the action of the group $N_+ \times K$:
$$F_X(g, \xi) = (\xi, k^{-1} X k) \tag{4.53}$$
where $k = k(g)$ is uniquely determined by the Iwasawa decomposition $g = nQk$. In this situation one has simply the reduced Poisson bracket:
$$\{L(X), L(Y)\} = \{F_X, F_Y\} = \{(k\xi k^{-1}, X), (k\xi k^{-1}, Y)\}$$
We evaluate this bracket with the help of eqs. (4.48), and then restrict to $n = k = e$. For a function $f : G \to G$ we have:
$$\{\xi(X), f(g)\} = \frac{1}{2} \frac{d}{dt} f(g e^{tX})\Big|_{t=0}$$
Applying this to the function $k = k(g)$, we get:
$$\{\xi(X), k(g)\} = \frac{1}{2}\nabla k(X), \quad \text{where} \quad \nabla k(X) = \frac{\partial}{\partial t} k(Q \exp(tX))\Big|_{t=0}$$
We then immediately obtain:
$$\{L(X), L(Y)\} = \frac{1}{2}\Big(L,\ [X, Y] - [X, \nabla k(Y)] - [\nabla k(X), Y]\Big)$$
Notice that $[X, Y] \in \mathcal{K}$ because $X, Y \in \mathcal{M}$, hence $L([X, Y]) = 0$, because $L \in \mathcal{M}$. From this equation we get an r-matrix for the Toda system given by:
$$R' X = -\frac{1}{2}\nabla k(X) \in \mathcal{K}$$
We can compute this r-matrix as follows. Given an element $X \in \mathcal{M}$, we write the Iwasawa decomposition of $Q e^{tX}$ as:
$$Q e^{tX} = e^{tX_+} \cdot Q e^{tX_a} \cdot e^{tX_K}$$
This uniquely defines the quantities (a priori depending on $t$) $X_+ \in \mathcal{N}_+$, $X_a \in \mathcal{A}$, and $X_K \in \mathcal{K}$. Letting $t$ tend to zero, we see that $\nabla k(X) = X_K$ and moreover
$$QX = X_+ Q + X_a Q + Q X_K \ \Rightarrow\ X = Q^{-1} X_+ Q + X_a + X_K \tag{4.54}$$
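Any $X \in \mathcal{M}$ can be written in the form $X = \sum_i x_i H_i + \sum_{\alpha > 0} x_\alpha (E_\alpha + E_{-\alpha})$, which we rewrite as
$$X = 2\sum_{\alpha > 0} x_\alpha E_\alpha + \sum_i x_i H_i - \sum_{\alpha > 0} x_\alpha (E_\alpha - E_{-\alpha}) \tag{4.55}$$
Noticing that $Q^{-1} X_+ Q \in \mathcal{N}_+$, we get by comparison of eqs. (4.54) and (4.55), $Q^{-1} X_+ Q = 2\sum_{\alpha > 0} x_\alpha E_\alpha$, $X_a = \sum_i x_i H_i$, and finally
$$R' X = -\frac{1}{2}\nabla k(X) = -\frac{1}{2} X_K = \frac{1}{2}\sum_{\alpha > 0} x_\alpha (E_\alpha - E_{-\alpha})$$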
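The identification $\nabla k(X) = X_K = -\sum_{\alpha>0} x_\alpha (E_\alpha - E_{-\alpha})$ can be probed numerically in the smallest case. The following sketch assumes $G_0 = SL(2, \mathbb{R})$ with the conventions $H = \mathrm{diag}(1, -1)$, $E_\alpha$ the upper triangular unit matrix, $N_+$ the upper unipotent matrices and $K = SO(2)$; these normalizations and all function names are assumptions made for the illustration, not taken from the text. Writing $k = \exp(\theta (E_\alpha - E_{-\alpha}))$, the claim becomes $d\theta/dt = -x$ at $t = 0$, independently of $Q$ and of the Cartan component of $X$.

```python
import math

def exp_sym_traceless(h, x, t):
    """exp(t X) for X = [[h, x], [x, -h]], using X^2 = (h^2 + x^2) * identity."""
    rho = math.hypot(h, x)
    c = math.cosh(t * rho)
    s = math.sinh(t * rho) / rho if rho else t
    return [[c + s * h, s * x], [s * x, c - s * h]]

def iwasawa_angle(g):
    """Angle theta of the K factor in g = n a k, with k = [[cos, sin], [-sin, cos]].

    The product n a is upper triangular with positive diagonal, so the bottom
    row of g is a positive multiple of (-sin(theta), cos(theta)).
    """
    c, d = g[1]
    return math.atan2(-c, d)

def grad_k_coefficient(q, h, x, eps=1e-6):
    """Central finite-difference estimate of d(theta)/dt at t = 0 for g(t) = Q exp(tX)."""
    def theta(t):
        e = exp_sym_traceless(h, x, t)
        diag_q = [math.exp(q), math.exp(-q)]  # Q = diag(e^q, e^-q)
        g = [[diag_q[i] * e[i][j] for j in range(2)] for i in range(2)]
        return iwasawa_angle(g)
    return (theta(eps) - theta(-eps)) / (2 * eps)
```

For instance `grad_k_coefficient(0.7, 1.3, 0.4)` should be close to `-0.4`, whatever values are chosen for `q` and `h`, which is the content of the comparison between eqs. (4.54) and (4.55) in this rank-one case.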
By dualization, $R' X = \mathrm{Tr}_2(r'_{12} \cdot 1 \otimes X)$, we obtain the r-matrix:
$$r'_{12} = \frac{1}{4}\sum_{\alpha > 0} \frac{(E_\alpha - E_{-\alpha}) \otimes (E_\alpha + E_{-\alpha})}{(E_\alpha, E_{-\alpha})} \tag{4.56}$$
We want to relate $r'_{12}$ to the r-matrix, $r_{12}$, we computed in eq. (4.26). For this, we recall that $L$ and $r_{12}$ satisfy the equation $\{L_1, L_2\} = [r_{12}, L_1] - [r_{21}, L_2]$. Applying the automorphism $\sigma_1 = \sigma \otimes 1$ to this equation and noting that $\sigma(L) = -L$, we get $\{L_1, L_2\} = [\sigma_1 r_{12}, L_1] + [\sigma_1 r_{21}, L_2]$. Adding the two equations, we get a similar r-matrix relation:
$$\{L_1, L_2\} = \Big[\tfrac{1}{2}(r_{12} + \sigma_1 r_{12}), L_1\Big] - \Big[\tfrac{1}{2}(r_{21} - \sigma_1 r_{21}), L_2\Big]$$
It is straightforward to check that:
$$\tfrac{1}{2}(r_{12} + \sigma_1 r_{12}) = r'_{12}, \qquad \tfrac{1}{2}(r_{21} - \sigma_1 r_{21}) = r'_{21}$$
hence we recover eq. (4.56).

4.10 The Lax pair of the Kowalevski top

Applying the algebraic setting of this chapter to classical Lie groups and low-dimensional coadjoint orbits, one can find a rich variety of integrable systems. This pragmatic approach was rewarded by the discovery of a Lax pair with spectral parameter for the Kowalevski top by Reyman and Semenov-Tian-Shansky. We now explain this construction.

Consider a Lie group $G$ with an involutive automorphism $\sigma$. This involution defines a linear involution on the Lie algebra $\mathcal{G}$ that we also call $\sigma$. Let $K$ be the subgroup of fixed points of $\sigma$, and $\mathcal{K}$ its Lie algebra. We have
$$\mathcal{G} = \mathcal{K} \oplus \mathcal{R}$$
where $\sigma = +1$ on $\mathcal{K}$ and $\sigma = -1$ on $\mathcal{R}$. Since $\sigma$ is a Lie algebra automorphism of order 2, we have
$$[\mathcal{K}, \mathcal{K}] \subset \mathcal{K}, \qquad [\mathcal{K}, \mathcal{R}] \subset \mathcal{R}, \qquad [\mathcal{R}, \mathcal{R}] \subset \mathcal{K}$$
By exponentiation, the vector space $\mathcal{R}$ is a representation space for the Lie group $K$ acting by conjugation.

From these data we first construct the loop algebra $\tilde{\mathcal{G}} = \mathcal{G} \otimes \mathbb{C}[[\lambda, \lambda^{-1}]]$ with the commutation relations eq. (4.14), and then the twisted loop algebra $\tilde{\mathcal{G}}^\sigma$ as the set of fixed points of the induced involution $\sigma(X(\lambda)) = (\sigma \cdot X)(-\lambda)$. Elements of the twisted loop algebra are of the form $X(\lambda) = \sum_n X_n \lambda^n$ with $X_n \in \mathcal{K}$ for $n$ even, and $X_n \in \mathcal{R}$ for $n$ odd. As in section (4.3), this algebra decomposes as a sum of two subalgebras:
$$\tilde{\mathcal{G}}^\sigma = \tilde{\mathcal{G}}^\sigma_+ + \tilde{\mathcal{G}}^\sigma_-$$
where elements of $\tilde{\mathcal{G}}^\sigma_+$ have vanishing or positive powers of $\lambda$ while elements of $\tilde{\mathcal{G}}^\sigma_-$ have strictly negative powers of $\lambda$. Applying the Adler–Kostant–Symes scheme, we define a pair of matrices $R^\pm$ whose actions are given by
$$R^+(X(\lambda)) = -X_+(\lambda), \qquad R^-(X(\lambda)) = X_-(\lambda)$$
As compared with eq. (4.15), we have introduced a minus sign to match the conventions of Chapter 2. Hence we have two Lie algebra brackets on $\tilde{\mathcal{G}}^\sigma$, the original one, and the one in which $\tilde{\mathcal{G}}^\sigma_\pm$ commute, defining the Lie algebra $\tilde{\mathcal{G}}^\sigma_R$, with which is associated the Lie group $\tilde{G}_R$.

The invariant bilinear form on $\tilde{\mathcal{G}}^\sigma$ is still given by eq. (4.16). It allows us to write elements of the dual $\tilde{\mathcal{G}}^{\sigma *}$ as series $L(\lambda) = \sum_n l_n \lambda^{n-1}$ with $l_n \in \mathcal{K}$ for $n$ even and $l_n \in \mathcal{R}$ for $n$ odd. On this dual we consider the Poisson algebra defined by the Poisson brackets $\{\ ,\ \}_R$. To construct an integrable system we select a particularly simple orbit:

Proposition. Let $A$ and $B$ be two fixed elements of $\mathcal{R}$.
(i) The set of elements
$$L(\lambda; \xi, k) = A + \xi\lambda^{-1} + (k^{-1} B k)\lambda^{-2}, \qquad \xi \in \mathcal{K},\ k \in K \tag{4.57}$$
is a coadjoint orbit of $\tilde{G}_R$.
(ii) Setting $L = l_{-1}\lambda^{-2} + l_0\lambda^{-1} + l_1$, with $l_0 \in \mathcal{K}$ and $l_{\pm 1} \in \mathcal{R}$, the Poisson brackets of this Lax matrix are given for $x, y \in \mathcal{K}$, $z \in \mathcal{R}$ by:
$$\{(l_0, x), (l_0, y)\}_R = -(l_0, [x, y]), \qquad \{(l_0, x), (l_{-1}, z)\}_R = -(l_{-1}, [x, z]) \tag{4.58}$$
All other Poisson brackets vanish. In particular, $l_1$ is in the centre of the Poisson algebra.
(iii) Equation (4.57) shows that there is a map from $T^*(K)$ to the orbit. This is a Poisson map.

Proof. We first show that the elements of the form $L = l_{-1}\lambda^{-2} + l_0\lambda^{-1} + l_1$ form a coadjoint orbit of $\tilde{G}_R$. It is enough to show that such elements are stable under the coadjoint action of $\tilde{\mathcal{G}}^\sigma_R$. For $X = \sum x_n \lambda^n$ we have by eq. (4.13) $(\mathrm{ad}^*_R X \cdot L)(Y) = L([X, Y]_R) = -L([X_+, Y_+] + [X_-, Y_-])$, so that:
$$(\mathrm{ad}^*_R X \cdot L, Y) = -([X_+, L], Y_+) + ([X_-, L], Y_-) = -\Big([x_0, l_{-1}]\lambda^{-2} + ([x_0, l_0] + [x_1, l_{-1}])\lambda^{-1},\ Y_+\Big)$$
We see that under this coadjoint action $l_1$ remains invariant, $l_{-1}$ is transformed by the coadjoint action of $K$, and $l_0$ is transformed to a generic element of $\mathcal{K}$:
$$\delta l_1 = 0, \qquad \delta l_{-1} = -[x_0, l_{-1}], \qquad \delta l_0 = -[x_0, l_0] - [x_1, l_{-1}]$$
Next we compute the Poisson brackets of $L$ induced by the Kostant–Kirillov bracket $\{\ ,\ \}_R$. In the formula $\{(L, X), (L, Y)\}_R = (L, [X, Y]_R)$, we choose $X = x_p \lambda^p$, $p = -1, 0, 1$, and similarly for $Y$, to probe the three components of $L$. For example $\{(l_1, x_{-1}), (l_0, y_0)\}_R = (L, [x_{-1}, y_0]_R) = 0$ because $x_{-1}$ and $y_0$ lie in the two subspaces of $\tilde{\mathcal{G}}^\sigma$ which commute in $\tilde{\mathcal{G}}^\sigma_R$. Similarly, $\{(l_0, x_0), (l_{-1}, y_1)\}_R = -(L, [x_0, y_1]\lambda) = -(l_{-1}, [x_0, y_1])$, etc.

To show that these Poisson brackets are the same as those of $T^*(K)$, recall the parametrization $(k, \xi)$ of the generic element of the cotangent bundle $T^*(K)$ and the Poisson bracket formulae (with an extra minus sign compared to Chapter 14):
$$\{\xi(x), \xi(y)\} = -\xi([x, y]), \qquad \{\xi(x), k\} = -kx, \qquad \{k, k\} = 0 \tag{4.59}$$
The first of eq. (4.59) reproduces the Kostant–Kirillov bracket of $\mathcal{K}$ as in the first of eq. (4.58) with the identification $\xi = l_0$. The second of eq. (4.59) is equivalent to $\{\xi(x), k^{-1} B k\} = -[k^{-1} B k, x]$, which coincides with the second of eq. (4.58) since $(l_{-1}, [x, z]) = ([l_{-1}, x], z)$ and $l_{-1} = k^{-1} B k$. It follows that the map from $T^*(K)$ to the orbit preserves the symplectic structure.

In the general Adler–Kostant–Symes construction, any function invariant under the coadjoint action of $\tilde{\mathcal{G}}^\sigma$ yields an integrable flow on our coadjoint orbit. Such functions are given by $\mathrm{Res}\, \mathrm{Tr}(\lambda^m L^n(\lambda))$ for any $m, n$. In particular, taking $m = 1$ and $n = 2$ we get the Hamiltonian:
$$H = -\mathrm{Tr}(\xi^2) - 2\,\mathrm{Tr}(k^{-1} B k A) \tag{4.60}$$
where the minus sign is appropriate to get a positive Hamiltonian when $K$ is a compact group. Recall that $K$ acts on the left and on the right on $T^*(K)$ by $((h_L, h_R), (k, \xi)) \to (h_L k h_R^{-1}, h_R \xi h_R^{-1})$, so this Hamiltonian is invariant under the subgroup of $K_L$ of elements which stabilize $B$, $h_L^{-1} B h_L = B$, and under the subgroup of $K_R$ of elements which stabilize $A$, $h_R^{-1} A h_R = A$. More generally, under these special subgroups, the Lax matrix $L$ is invariant under $h_L$ and gets conjugated under $h_R$, so that any Hamiltonian of the hierarchy is invariant. One can then perform a Hamiltonian reduction under the action of such subgroups. These reduced Hamiltonians are in involution using the reduced Poisson bracket, because reduced Poisson brackets coincide with unreduced ones for invariant functions.

This scheme is particularly interesting when $G$ is a real Lie group and $\sigma$ is a Cartan involution. Then $K$ is a maximal compact subgroup of $G$, see Chapter 16, so that $G/K$ is a symmetric space. The situation which leads to the Kowalevski top is obtained by considering $G = SO(p, q)$ and $K = SO(p) \times SO(q)$ with $p \geq q$. The Kowalevski case corresponds to $p = 3$, $q = 2$. The group $SO(p, q)$ is the real Lie group of pseudo-orthogonal transformations of $\mathbb{R}^{p+q}$ leaving invariant the metric $x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+q}^2$. Elements of its Lie algebra may be represented as matrices of the form:
$$X = \begin{pmatrix} X_p & D \\ {}^tD & X_q \end{pmatrix}, \qquad (X_p, X_q) \in so(p) \oplus so(q) \tag{4.61}$$
where $D$ is an arbitrary $(p, q)$ matrix. The subalgebra $\mathcal{K}$ consists of block diagonal matrices $(X_p, X_q)$, while the subspace $\mathcal{R}$ consists of the off-diagonal terms.

Let us write the $L$ matrix eq. (4.57) in this context, with $k = (k_p, k_q) \in SO(p) \times SO(q)$ and $\xi = (\xi_p, \xi_q) \in so(p) \oplus so(q)$:
$$L = \begin{pmatrix} 0 & a \\ {}^ta & 0 \end{pmatrix} + \begin{pmatrix} \xi_p & 0 \\ 0 & \xi_q \end{pmatrix}\lambda^{-1} + \begin{pmatrix} 0 & k_p^{-1} b k_q \\ k_q^{-1}\, {}^tb\, k_p & 0 \end{pmatrix}\lambda^{-2} \tag{4.62}$$
Here, we have written the matrices $A, B \in \mathcal{R}$ in eq. (4.57) in the form
$$A = \begin{pmatrix} 0 & a \\ {}^ta & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & b \\ {}^tb & 0 \end{pmatrix}$$
where $a$ and $b$ are rectangular $p \times q$ matrices. We have also computed $k^{-1} B k$ explicitly.
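The block description of $so(p, q)$ and the grading $[\mathcal{K}, \mathcal{K}] \subset \mathcal{K}$, $[\mathcal{K}, \mathcal{R}] \subset \mathcal{R}$, $[\mathcal{R}, \mathcal{R}] \subset \mathcal{K}$ are easy to test on explicit matrices. The sketch below does this for $p = 3$, $q = 2$ with integer entries; the sample matrices and the helper names are arbitrary choices made for the illustration.

```python
P, Q = 3, 2        # signature (p, q); p = 3, q = 2 is the Kowalevski case
DIM = P + Q

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(DIM)] for i in range(DIM)]

def assemble(xp, xq, d):
    """Build X = [[Xp, D], [D^T, Xq]] following the block form of eq. (4.61)."""
    x = [[0] * DIM for _ in range(DIM)]
    for i in range(P):
        for j in range(P):
            x[i][j] = xp[i][j]
        for a in range(Q):
            x[i][P + a] = d[i][a]   # upper right block D
            x[P + a][i] = d[i][a]   # lower left block D^T
    for a in range(Q):
        for b in range(Q):
            x[P + a][P + b] = xq[a][b]
    return x

def preserves_metric(x):
    """True if X^T eta + eta X = 0 for eta = diag(1_p, -1_q)."""
    eta = [1] * P + [-1] * Q
    return all(x[j][i] * eta[j] + eta[i] * x[i][j] == 0
               for i in range(DIM) for j in range(DIM))

def is_block_diagonal(x):
    """True if x lies in K (both off-diagonal blocks vanish)."""
    return all(x[i][j] == 0 and x[j][i] == 0
               for i in range(P) for j in range(P, DIM))

def is_off_diagonal(x):
    """True if x lies in R (both diagonal blocks vanish)."""
    return (all(x[i][j] == 0 for i in range(P) for j in range(P))
            and all(x[i][j] == 0 for i in range(P, DIM) for j in range(P, DIM)))

ZERO_P = [[0] * P for _ in range(P)]
ZERO_Q = [[0] * Q for _ in range(Q)]
ZERO_D = [[0] * Q for _ in range(P)]

# Two sample elements of K and two of R (entries chosen arbitrarily).
K1 = assemble([[0, 1, -2], [-1, 0, 3], [2, -3, 0]], [[0, 5], [-5, 0]], ZERO_D)
K2 = assemble([[0, 4, 1], [-4, 0, -2], [-1, 2, 0]], [[0, -3], [3, 0]], ZERO_D)
R1 = assemble(ZERO_P, ZERO_Q, [[1, 2], [3, 4], [5, 6]])
R2 = assemble(ZERO_P, ZERO_Q, [[-1, 0], [2, 7], [1, -4]])
```

With these definitions, `commutator(R1, R2)` is block diagonal and `commutator(K1, R1)` is off-diagonal, and every commutator again satisfies `preserves_metric`, in agreement with the decomposition $\mathcal{G} = \mathcal{K} \oplus \mathcal{R}$.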
To obtain the Kowalevski top we choose a specific orbit, i.e. we specify $B$. This amounts to choosing the rectangular $p \times q$ matrix $b$, which we take as:
$$b = \begin{pmatrix} 1_q \\ 0_{p-q} \end{pmatrix}$$
where $1_q$ means the $q \times q$ identity matrix, and $0_{p-q}$ means the $(p - q) \times q$ zero matrix. The subgroup of $K_L$ which stabilizes $B$ consists of matrices of the form:
$$h_L = \begin{pmatrix} \hat{h}_q & 0 \\ 0 & h_q \end{pmatrix}, \qquad \hat{h}_q \equiv \begin{pmatrix} h_q & 0 \\ 0 & 1_{p-q} \end{pmatrix} \in SO(p) \quad \text{for } h_q \in SO(q)$$
where the map $h_q \to \hat{h}_q$ embeds $SO(q)$ into $SO(p)$ and the map $h_q \to h_L$ embeds the group $SO(q)$ into $SO(p) \times SO(q)$. Hence the reduction subgroup identifies with $SO(q)$. At the Lie algebra level, it is realized into the Lie algebra $so(p) \oplus so(q)$ by pairs of matrices
$$X_L = (\hat{X}_q, X_q)$$
Using these embeddings we can write $k_p^{-1} b k_q = rb$, where $r = k_p^{-1}\hat{k}_q \in SO(p)$. The quantity $r$ is invariant under the action of the reduction group since $h_q \cdot (k_p, k_q) = (\hat{h}_q k_p, h_q k_q)$. Moreover, the action of $K_L$ leaves $\xi$ invariant, so we have the manifestly invariant expression for $L$:
$$L = \begin{pmatrix} 0 & a \\ {}^ta & 0 \end{pmatrix} + \begin{pmatrix} \xi_p & 0 \\ 0 & \xi_q \end{pmatrix}\lambda^{-1} + \begin{pmatrix} 0 & rb \\ {}^tb\, r^{-1} & 0 \end{pmatrix}\lambda^{-2}, \qquad r = k_p^{-1}\hat{k}_q$$
The moment map is given by eq. (4.49). For any $X_L$ in the Lie algebra of the reduction group it is given by:
$$(P^L(k, \xi), X_L) = (k_p \xi_p k_p^{-1}, \hat{X}_q) + (k_q \xi_q k_q^{-1}, X_q) = (k_q \xi_q k_q^{-1} + \pi_q(k_p \xi_p k_p^{-1}), X_q)$$
where $X_q \in so(q)$, $k = (k_p, k_q) \in K_L$, $\xi = (\xi_p, \xi_q)$ and $\pi_q(M)$ is the projection operator which restricts the $p \times p$ matrix $M$ to its $q \times q$ upper left corner. This projector appears because $\hat{X}_q$ is a matrix with $X_q$ in the upper left $q \times q$ corner and 0 in the lower right $(p - q) \times (p - q)$ corner. We choose to reduce on the surface of zero momentum which is given by $k_q \xi_q k_q^{-1} + \pi_q(k_p \xi_p k_p^{-1}) = 0$, that is:
$$\xi_q = -\frac{1}{2}\pi_q(J), \qquad J = 2r^{-1}\xi_p r \in so(p) \tag{4.63}$$
The factor of 2 in the definition of $J$ is introduced for later convenience. We still need to quotient by the stability group of the moment, i.e. the whole group $SO(q)$. However, since we deal only with invariant quantities like $L$, $r$, the Poisson brackets on the reduced phase space can be computed by simply using the Poisson brackets on the unreduced phase
space, see Chapter 14. From eq. (4.59), we easily compute the Poisson brackets that we will need later:
$$\{\xi_p(r^{-1} X r), \xi_p(r^{-1} Y r)\} = \xi_p(r^{-1}[X, Y] r), \qquad \{\xi_p(X), r\} = Xr, \qquad \{\xi_q(X), r\} = -rX \tag{4.64}$$
The Hamiltonian eq. (4.60) takes the form
$$H = -\mathrm{Tr}(\xi^2) - 4\,\mathrm{Tr}({}^tb F), \qquad F = r^{-1} a \tag{4.65}$$
Using the explicit form of $b$, we have $\mathrm{Tr}({}^tb F) = \mathrm{Tr}\,\pi_q(F)$, while $\mathrm{Tr}(\xi^2) = \mathrm{Tr}(\xi_p^2) + \mathrm{Tr}(\xi_q^2)$. By eq. (4.63), we have $\mathrm{Tr}(\xi_p^2) = \frac{1}{4}\mathrm{Tr}(J^2)$ and $\mathrm{Tr}(\xi_q^2) = \frac{1}{4}\mathrm{Tr}\,\pi_q(J)^2$. In particular when $(p = 3, q = 2)$ we use that $J$ is a $3 \times 3$ antisymmetric matrix which we write as $J_{ij} = \epsilon_{ijk} J_k$, and we express the $3 \times 2$ matrix $F$ as $F = (P, P')$ for two vectors $P$, $P'$ of components $\gamma_i$, $\gamma'_i$, $i = 1, 2, 3$. Then $\mathrm{Tr}\,\pi_q(F) = \gamma_1 + \gamma'_2$, and $\mathrm{Tr}\,\pi_q(J)^2 = -2J_3^2$, while $\mathrm{Tr}\,J^2 = -2\vec{J}^{\,2}$, so the Hamiltonian eq. (4.60) takes the form:
$$H = \frac{1}{2}(J_1^2 + J_2^2 + 2J_3^2) - 4(\gamma_1 + \gamma'_2)$$
The Poisson brackets eq. (4.64) translate into the following non-zero Poisson brackets for the quantities $J_i$, $\gamma_i$, $\gamma'_i$:
$$\{J_i, J_j\} = \epsilon_{ijk} J_k, \qquad \{J_i, \gamma_j\} = \epsilon_{ijk}\gamma_k, \qquad \{J_i, \gamma'_j\} = \epsilon_{ijk}\gamma'_k$$
We recover exactly the dynamical data of the Kowalevski top, except that it is generalized to a situation with two external forces $\gamma$ and $\gamma'$. The special Kowalevski case corresponds to $\gamma' = 0$. It is now clear that $J$ is the angular momentum. The matrix $r$ can be identified as the rotation relating the absolute and the moving frames, so that $\Omega = -r^{-1}\dot{r}$ is the rotation vector. We have $r^{-1}\dot{r} = r^{-1}\{H, r\} = -r^{-1}\{\mathrm{Tr}(\xi_p^2) + \mathrm{Tr}(\xi_q^2), r\}$, where in the last equality we dropped the $F$ term in the Hamiltonian, eq. (4.65), because $\{r_1, r_2\} = 0$. Using eq. (4.64), we get $r^{-1}\dot{r} = -2r^{-1}\xi_p r + 2\xi_q = -J + 2\xi_q$. It follows that, on the zero momentum surface, we have:
$$\Omega = J + \pi_q(J)$$
For $(p = 3, q = 2)$ we denote $\Omega_{ij} = \epsilon_{ijk}\omega_k$ and we see that $J_k = I_k \omega_k$ with $I_1 = I_2 = 1$ and $I_3 = 1/2$. Finally, $F = r^{-1} a$ represents external forces in the rotating frame. These forces are constant in the absolute frame and given by the constant matrix $a$.
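It is convenient to conjugate the Lax matrix on the zero momentum surface by the block diagonal $(p + q) \times (p + q)$ matrix $D = \mathrm{Diag}(r, 1_q)$ so that the expression of the rotated Lax matrix is:
$$\tilde{L} \equiv D^{-1} L D = \begin{pmatrix} 0 & F \\ {}^tF & 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} J & 0 \\ 0 & -\pi_q(J) \end{pmatrix}\lambda^{-1} + \begin{pmatrix} 0 & b \\ {}^tb & 0 \end{pmatrix}\lambda^{-2} \tag{4.66}$$
In this way the Lax matrix is naturally expressed only in terms of quantities defined in the moving frame. The Lax equation describing the motion of the matrix $\tilde{L}$ can be written in the form:
$$\frac{d}{dt}\tilde{L}(\lambda) = [\tilde{M}(\lambda), \tilde{L}(\lambda)], \qquad \tilde{M}(\lambda) = 2\begin{pmatrix} 0 & b \\ {}^tb & 0 \end{pmatrix}\lambda^{-1} + \begin{pmatrix} \Omega & 0 \\ 0 & 0 \end{pmatrix}$$
In fact we have $\tilde{M}(\lambda) = 2(\lambda\tilde{L}(\lambda))_- - D^{-1}\dot{D}$: the first term follows from the general theory, and the second term is produced by the conjugation by $D$ in $\tilde{L}$. Explicitly, this Lax equation reads:
$$\dot{F} = \Omega F, \qquad \dot{J} = -[J, \Omega] - 4(F\,{}^tb - b\,{}^tF), \qquad \pi_q(\dot{J}) = 4({}^tF b - {}^tb F)$$
which in components, with $\omega = (p, q, r)$, gives:
$$2\dot{p} = qr + 8\gamma'_3, \qquad 2\dot{q} = -pr - 8\gamma_3, \qquad \dot{r} = -8(\gamma'_1 - \gamma_2)$$
$$\dot{\gamma}_1 = r\gamma_2 - q\gamma_3, \qquad \dot{\gamma}_2 = p\gamma_3 - r\gamma_1, \qquad \dot{\gamma}_3 = q\gamma_1 - p\gamma_2$$
$$\dot{\gamma}'_1 = r\gamma'_2 - q\gamma'_3, \qquad \dot{\gamma}'_2 = p\gamma'_3 - r\gamma'_1, \qquad \dot{\gamma}'_3 = q\gamma'_1 - p\gamma'_2$$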
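As a consistency check on the component equations, one can verify that the Hamiltonian $H = \frac{1}{2}(J_1^2 + J_2^2 + 2J_3^2) - 4(\gamma_1 + \gamma'_2)$, with $J = (p, q, r/2)$, and the three quadratic invariants of the two force vectors have vanishing time derivative along the flow. The sketch below does this exactly with rational numbers. It assumes that the second force vector carries the primes as written in this version of the equations (the primes are a reading of the printed formulas), and the function names are hypothetical.

```python
from fractions import Fraction

def velocity(state):
    """Right-hand side of the component equations.

    state = (p, q, r, g, gp), where g and gp are 3-tuples holding the
    components of gamma and gamma' (the second external force).
    """
    p, q, r, g, gp = state
    dp = (q * r + 8 * gp[2]) / 2
    dq = (-p * r - 8 * g[2]) / 2
    dr = -8 * (gp[0] - g[1])
    dg = (r * g[1] - q * g[2], p * g[2] - r * g[0], q * g[0] - p * g[1])
    dgp = (r * gp[1] - q * gp[2], p * gp[2] - r * gp[0], q * gp[0] - p * gp[1])
    return dp, dq, dr, dg, dgp

def hamiltonian_rate(state):
    """dH/dt for H = (p^2 + q^2 + r^2/2)/2 - 4 (gamma_1 + gamma'_2)."""
    p, q, r, g, gp = state
    dp, dq, dr, dg, dgp = velocity(state)
    return p * dp + q * dq + r * dr / 2 - 4 * (dg[0] + dgp[1])

def casimir_rates(state):
    """Time derivatives of gamma.gamma, gamma.gamma' and gamma'.gamma'."""
    _, _, _, g, gp = state
    _, _, _, dg, dgp = velocity(state)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return 2 * dot(g, dg), dot(dg, gp) + dot(g, dgp), 2 * dot(gp, dgp)

# An arbitrary rational sample point of the phase space.
SAMPLE = (Fraction(3, 2), Fraction(-5, 7), Fraction(11, 3),
          (Fraction(1, 4), Fraction(2), Fraction(-3, 5)),
          (Fraction(7, 9), Fraction(-1, 2), Fraction(4)))
```

Evaluating `hamiltonian_rate(SAMPLE)` and `casimir_rates(SAMPLE)` gives exact zeros; since the sample point is arbitrary, this is a quick sanity test of the signs in the equations of motion rather than a proof.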
These equations generalize the equations of the Kowalevski top given in Chapter 2, which are recovered if $\gamma' = 0$ and $c_0 = 8$ or $h = 4$.

Finally, it is convenient to switch to a more compact representation of the $so(3, 2)$ Lie algebra by $4 \times 4$ matrices instead of $5 \times 5$ matrices. We first consider the complex Lie algebra $so(5, \mathbb{C})$ and view $so(3, 2)$ as one of its non-compact real forms, obtained using the conjugation $X \to \eta \bar{X} \eta$ where $\eta = \mathrm{Diag}(1, 1, 1, -1, -1)$ is the metric left-invariant by $so(3, 2)$. Any element of the Lie algebra $so(3, 2)$, in the vector representation, can be written as in eq. (4.61) with:
$$X_p = \begin{pmatrix} 0 & x_3 & -x_2 \\ -x_3 & 0 & x_1 \\ x_2 & -x_1 & 0 \end{pmatrix}, \qquad X_q = \begin{pmatrix} 0 & y \\ -y & 0 \end{pmatrix}, \qquad D = (\gamma, \gamma')$$
This matrix is mapped to an antisymmetric matrix of $so(5, \mathbb{C})$ by a change of basis, namely conjugation by $\mathrm{Diag}(1, 1, 1, i, i)$. In this basis the matrix reads:
$$\tilde{X} = \begin{pmatrix} X_p & iD \\ -i\,{}^tD & X_q \end{pmatrix} \tag{4.67}$$
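Let $T_{\mu\nu} = E_{\mu\nu} - E_{\nu\mu}$ for $\mu < \nu$ be the standard basis of antisymmetric matrices, obeying the $so(5)$ algebra:
$$[T_{\mu\nu}, T_{\rho\sigma}] = T_{\mu\sigma}\delta_{\nu\rho} - T_{\rho\nu}\delta_{\mu\sigma} + T_{\sigma\nu}\delta_{\mu\rho} - T_{\mu\rho}\delta_{\nu\sigma}$$
It is well known, and easy to check, that this algebra can be realized with $\Gamma$ matrices satisfying the anticommutation relations $\{\Gamma_\mu, \Gamma_\nu\} = 2\delta_{\mu\nu}$, $\mu, \nu = 1, \ldots, 5$. This is called the spinorial representation of $so(5, \mathbb{C})$. The $T_{\mu\nu}$ are given by $T_{\mu\nu} = \frac{1}{4}[\Gamma_\mu, \Gamma_\nu]$. It remains to notice that the $\Gamma_\mu$ can be represented by $4 \times 4$ matrices as follows:
$$\Gamma_j = \sigma_1 \otimes \sigma_j, \qquad \Gamma_4 = \sigma_2 \otimes \sigma_0, \qquad \Gamma_5 = \sigma_3 \otimes \sigma_0$$
where $\sigma_0 = 1$ and the $\sigma_j$ are the $2 \times 2$ Pauli matrices:
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{4.68}$$
obeying the relations $\sigma_i^2 = 1$ and $\sigma_i \sigma_j = i\,\epsilon_{ijk}\sigma_k$ for $i \neq j$. The matrix $\tilde{X}$ in eq. (4.67) expands on the $T_{\mu\nu}$ as
$$\tilde{X} = x_3 T_{12} - x_2 T_{13} + x_1 T_{23} + y T_{45} + i\sum_{j=1}^{3}(\gamma_j T_{j4} + \gamma'_j T_{j5})$$
Plugging the representation of $T_{\mu\nu}$ in terms of the $\Gamma_\mu$, we get:
$$\tilde{X} = \frac{1}{2}\begin{pmatrix} i x \cdot \sigma - \gamma \cdot \sigma & i y \sigma_0 - i \gamma' \cdot \sigma \\ i y \sigma_0 + i \gamma' \cdot \sigma & i x \cdot \sigma + \gamma \cdot \sigma \end{pmatrix}$$
We finally write the Lax pair in this $4 \times 4$ representation:
$$\tilde{L} = \frac{1}{2}\begin{pmatrix} -\gamma \cdot \sigma & -i\gamma' \cdot \sigma \\ i\gamma' \cdot \sigma & \gamma \cdot \sigma \end{pmatrix} + \frac{1}{4\lambda}\begin{pmatrix} iJ \cdot \sigma & -iJ_3 \sigma_0 \\ -iJ_3 \sigma_0 & iJ \cdot \sigma \end{pmatrix} + \frac{1}{2\lambda^2}\begin{pmatrix} -\sigma_1 & -i\sigma_2 \\ i\sigma_2 & \sigma_1 \end{pmatrix}$$
$$\tilde{M} = \frac{i}{2}\begin{pmatrix} \omega \cdot \sigma & 0 \\ 0 & \omega \cdot \sigma \end{pmatrix} + \frac{1}{\lambda}\begin{pmatrix} -\sigma_1 & -i\sigma_2 \\ i\sigma_2 & \sigma_1 \end{pmatrix} \tag{4.69}$$
It is a simple computation to check directly that the Lax equation with this Lax pair reproduces the equations of motion of the Kowalevski top.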
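The spinorial realization can be verified by brute force. The sketch below builds the five $\Gamma$ matrices as Kronecker products, with $\Gamma_5 = \sigma_3 \otimes \sigma_0$ (the choice for which all five matrices anticommute), checks the Clifford relations and the $so(5)$ commutation relations for $T_{\mu\nu} = \frac{1}{4}[\Gamma_\mu, \Gamma_\nu]$, and compares the expansion of $\tilde{X}$ on the $T_{\mu\nu}$ with the closed $2 \times 2$ block form. Function names are hypothetical, and the Kronecker ordering (first factor labels the blocks) is an assumption consistent with the block matrices written above.

```python
S0 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
SIGMA = [S1, S2, S3]

def kron(a, b):
    """Kronecker product of two 2x2 matrices; the first factor labels the blocks."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lin(cx, x, cy, y):
    """Linear combination cx*x + cy*y."""
    n = len(x)
    return [[cx * x[i][j] + cy * y[i][j] for j in range(n)] for i in range(n)]

GAMMA = [kron(S1, s) for s in SIGMA] + [kron(S2, S0), kron(S3, S0)]

def t_gen(mu, nu):
    """T_{mu nu} = 1/4 [Gamma_mu, Gamma_nu], indices running from 1 to 5."""
    a, b = GAMMA[mu - 1], GAMMA[nu - 1]
    return lin(0.25, mul(a, b), -0.25, mul(b, a))

def clifford_ok():
    """Check {Gamma_mu, Gamma_nu} = 2 delta_{mu nu}."""
    two = [[2 * (i == j) for j in range(4)] for i in range(4)]
    zero = [[0] * 4 for _ in range(4)]
    return all(lin(1, mul(GAMMA[m], GAMMA[n]), 1, mul(GAMMA[n], GAMMA[m]))
               == (two if m == n else zero)
               for m in range(5) for n in range(5))

def so5_ok():
    """Check the so(5) commutation relations quoted in the text, for all indices."""
    d = lambda a, b: 1 if a == b else 0
    idx = range(1, 6)
    for m in idx:
        for n in idx:
            for r in idx:
                for s in idx:
                    lhs = lin(1, mul(t_gen(m, n), t_gen(r, s)),
                              -1, mul(t_gen(r, s), t_gen(m, n)))
                    rhs = lin(d(n, r), t_gen(m, s), -d(m, s), t_gen(r, n))
                    rhs = lin(1, rhs, d(m, r), t_gen(s, n))
                    rhs = lin(1, rhs, -d(n, s), t_gen(m, r))
                    if lhs != rhs:
                        return False
    return True

def spinor_image(x, y, g, gp):
    """x3 T12 - x2 T13 + x1 T23 + y T45 + i sum_j (g_j T_j4 + gp_j T_j5)."""
    terms = [(x[2], t_gen(1, 2)), (-x[1], t_gen(1, 3)),
             (x[0], t_gen(2, 3)), (y, t_gen(4, 5))]
    terms += [(1j * g[j], t_gen(j + 1, 4)) for j in range(3)]
    terms += [(1j * gp[j], t_gen(j + 1, 5)) for j in range(3)]
    total = [[0] * 4 for _ in range(4)]
    for c, t in terms:
        total = lin(1, total, c, t)
    return total

def dot_sigma(v):
    """v . sigma as a 2x2 matrix."""
    return [[sum(v[a] * SIGMA[a][i][j] for a in range(3)) for j in range(2)]
            for i in range(2)]

def block_formula(x, y, g, gp):
    """1/2 [[i x.s - g.s, i y - i gp.s], [i y + i gp.s, i x.s + g.s]]."""
    xs, gs, ps = dot_sigma(x), dot_sigma(g), dot_sigma(gp)
    out = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            e = 1 if i == j else 0
            out[i][j] = 0.5 * (1j * xs[i][j] - gs[i][j])
            out[i][j + 2] = 0.5 * (1j * y * e - 1j * ps[i][j])
            out[i + 2][j] = 0.5 * (1j * y * e + 1j * ps[i][j])
            out[i + 2][j + 2] = 0.5 * (1j * xs[i][j] + gs[i][j])
    return out
```

All entries that occur are multiples of $1/4$ with small integer coefficients, so the floating-point comparisons are exact. With integer inputs, `spinor_image(x, y, g, gp)` and `block_formula(x, y, g, gp)` agree entry by entry.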
References

[1] M. Toda, Vibration of a chain with non-linear interaction. J. Phys. Soc. Japan 22 (1967) 431.
[2] B. Kostant, The solution to the generalized Toda lattice and representation theory. Adv. Math. 34 (1979) 195–338.
[3] M. Adler, On a trace functional for formal pseudodifferential operators and symplectic structure of the Korteweg–de Vries type equations. Inv. Math. 50 (1979) 219.
[4] E.K. Sklyanin, On the complete integrability of the Landau–Lifchitz equation. Preprint LOMI E-3-79. Leningrad, 1979.
[5] W. Symes, Systems of Toda type, inverse spectral problems and representation theory. Inv. Math. 59 (1980) 13.
[6] M. Olshanetsky and A. Perelomov, Classical integrable finite-dimensional systems related to Lie algebras. Phys. Rep. 71 (1981) 313–400.
[7] A. Belavin and V. Drinfeld, On the solutions of the classical Yang–Baxter equation for simple Lie algebras. Funct. Anal. Appl. 16, 3 (1982) 1–29.
[8] V. Drinfeld, Hamiltonian structures on Lie groups, Lie bialgebras, and the geometrical meaning of the Yang–Baxter equations. Dokl. Akad. Nauk SSSR 268 (1983) 285–287.
[9] D. Olive and N. Turok, Algebraic structure of Toda systems. Nucl. Phys. B220 (1983) 491–507.
[10] M. Semenov-Tian-Shansky, What is a classical r-matrix? Funct. Anal. Appl. 17, 4 (1983) 17.
[11] L. Ferreira and D. Olive, Non-compact symmetric spaces and the Toda molecule equations. Comm. Math. Phys. 99 (1985) 365–384.
[12] A. Bobenko, A. Reyman and M. Semenov-Tian-Shansky, The Kowalevski top 99 years later: a Lax pair, generalizations and explicit solutions. Comm. Math. Phys. 122 (1989) 321–354.
[13] A. Reyman and M. Semenov-Tian-Shansky, Group-theoretical methods in the theory of finite-dimensional integrable systems. Encyclopaedia of Mathematical Sciences, 16. Springer-Verlag (1990).
5 Analytical methods
In this chapter, we present the general ideas for solving the Lax equations with spectral parameter. The spectral curve $\Gamma$ is the characteristic equation for the eigenvalues of the Lax matrix: $\det(L(\lambda) - \mu) = 0$. Since the Lax equation $\dot{L}(\lambda) = [M(\lambda), L(\lambda)]$ is isospectral, the eigenvalues of $L(\lambda)$ are time-independent and so is the spectral curve. At any point of the spectral curve there exists, by definition, an eigenvector of $L(\lambda)$ with eigenvalue $\mu$. We explain how we can reconstruct the eigenvector from its analyticity properties on $\Gamma$. In particular all the dynamical information is contained in the divisor of its poles which we call the dynamical divisor. The time evolution of this divisor is equivalent to a linear flow on the Jacobian of $\Gamma$. We give three proofs of this result. The first one proceeds by explicitly computing the time evolution of the image of the dynamical divisor by the Abel map in the Jacobian. The second one uses a special type of functions on the Riemann surface, the Baker–Akhiezer functions, to reconstruct the eigenvectors explicitly. Finally, the linearization property also follows very directly by properly interpreting the group theoretical factorization method in its Riemann–Hilbert incarnation. As a result, one can express the dynamical variables in terms of $\theta$ functions defined on the Jacobian of the spectral curve. We then show that the symplectic structure can be nicely written in terms of coordinates of the points of the dynamical divisor, hence exhibiting the interplay between analytical data and separation of variables. We finally present the application of these ideas to sketch the solution of the Kowalevski top.
5.1 The spectral curve

Let us consider an $N \times N$ Lax matrix $L(\lambda)$, depending, as in Chapter 3, rationally on a spectral parameter $\lambda \in \mathbb{C}$ with poles at points $\lambda_k$:
$$L(\lambda) = L_0 + \sum_k L_k(\lambda) \tag{5.1}$$
$L_0$ is independent of $\lambda$ and $L_k(\lambda)$ is the polar part of $L(\lambda)$ at $\lambda_k$, i.e. $L_k = \sum_{r=-n_k}^{-1} L_{k,r}(\lambda - \lambda_k)^r$. The analytical method of solution of integrable systems is based on the study of the eigenvector equation:
$$(L(\lambda) - \mu 1)\,\Psi(\lambda, \mu) = 0 \tag{5.2}$$
where $\Psi(\lambda, \mu)$ is the eigenvector with eigenvalue $\mu$. The characteristic equation for the eigenvalue problem (5.2) is:
$$\Gamma : \ \Gamma(\lambda, \mu) \equiv \det(L(\lambda) - \mu 1) = 0 \tag{5.3}$$
This defines an algebraic curve in $\mathbb{C}^2$ which is called the spectral curve. We are considering here the smooth compact curve obtained from this equation by the desingularization procedure explained in Chapter 15, even if we do not mention it explicitly. A point on $\Gamma$ is a pair $(\lambda, \mu)$ satisfying eq. (5.3).

If $N$ is the dimension of the Lax matrix, the equation of the curve is of the form:
$$\Gamma : \ \Gamma(\lambda, \mu) \equiv (-\mu)^N + \sum_{q=0}^{N-1} r_q(\lambda)\mu^q = 0 \tag{5.4}$$
The coefficients $r_q(\lambda)$ are polynomials in the matrix elements of $L(\lambda)$ and therefore have poles at $\lambda_k$. Since the Lax equation $\dot{L} = [M, L]$ is isospectral, these coefficients are time-independent and are related to the action variables.

From eq. (5.4) we see that the spectral curve appears as an $N$-sheeted covering of the Riemann sphere. To a given point $\lambda$ on the Riemann sphere there correspond $N$ points on the curve whose coordinates are $(\lambda, \mu_1), \ldots, (\lambda, \mu_N)$, where the $\mu_i$ are the solutions of the algebraic equation $\Gamma(\lambda, \mu) = 0$. By definition $\mu_i$ are the eigenvalues of $L(\lambda)$.

Our goal is to determine the analytical properties of the eigenvector $\Psi(\lambda, \mu)$ and see how much of $L(\lambda)$ can be reconstructed from them. The result is that one can reconstruct $L(\lambda)$ up to global (independent of $\lambda$) similarity transformations. This is not too surprising since the analytical properties of $L(\lambda)$ and the spectral curve are invariant under global gauge
transformations consisting of similarity transformations by constant invertible matrices. So from analyticity we can only hope to recover the system where global gauge transformations have been factored away. In general, we may fix the gauge by diagonalizing $L(\lambda)$ for one value of $\lambda$. To be specific, we choose to diagonalize at $\lambda = \infty$, i.e. we diagonalize the coefficient $L_0$:
$$L_0 = \lim_{\lambda \to \infty} L(\lambda) = \mathrm{diag}(a_1, \ldots, a_N) \tag{5.5}$$
We assume for simplicity that all the $a_i$ are different. Then on the spectral curve we have $N$ points above $\lambda = \infty$:
$$Q_i \equiv (\lambda = \infty, \mu_i = a_i)$$
In the gauge (5.5) there remains a residual action which consists of conjugating the Lax matrix by constant diagonal matrices. Generically, these transformations form a group of dimension $N - 1$ and we will have to factor it out.

Before doing complex analysis on $\Gamma$, one has to determine its genus. A general strategy is as follows. As we have seen, $\Gamma$ is an $N$-sheeted covering of the Riemann sphere. There is a general formula expressing the genus $g$ of an $N$-sheeted covering of a Riemann surface of genus $g_0$ (in our case $g_0 = 0$). It is the Riemann–Hurwitz formula:
$$2g - 2 = N(2g_0 - 2) + \nu \tag{5.6}$$
where $\nu$ is the branching index of the covering, see Chapter 15. Let us assume for simplicity that the branch points are all of order 2. To compute $\nu$ we observe that this is the number of values of $\lambda$ where $\Gamma(\lambda, \mu)$ has a double root in $\mu$. This is also the number of zeroes of $\partial_\mu \Gamma(\lambda, \mu)$ on the surface $\Gamma(\lambda, \mu) = 0$. But $\partial_\mu \Gamma(\lambda, \mu)$ is a meromorphic function on $\Gamma$, and therefore the number of its zeroes is equal to the number of its poles and it is enough to count the poles. These poles can only be located where the matrix $L(\lambda)$ itself has a pole. So we are down to a local analysis around the points of $\Gamma$ such that $L(\lambda)$ has a pole.

Let us apply this idea to the matrix (5.1). Above a pole $\lambda_k$, we have $N$ branches of the form $\mu_j = l_j/(\lambda - \lambda_k)^{n_k} + \cdots$, where $l_j$ are the eigenvalues of $L_{k,-n_k}$ that are assumed all distinct. On such a branch we have $\partial_\mu \Gamma(\lambda, \mu)|_{(\lambda, \mu_j(\lambda))} = \prod_{i \neq j}(\mu_j(\lambda) - \mu_i(\lambda))$, which thus has a pole of order $(N - 1)n_k$. Summing on all branches, the total order of the poles over $\lambda_k$ is $N(N - 1)n_k$. Summing on all poles $\lambda_k$ of $L(\lambda)$, we see that the total branching index is $\nu = N(N - 1)\sum_k n_k$. This gives:
$$g = \frac{N(N - 1)}{2}\sum_k n_k - N + 1$$
For consistency of the method it is important to observe that the genus is related to the dimension of the phase space and to the number of action variables occurring as independent parameters in eq. (5.4), which should also be half the dimension of the phase space. As we have seen in Chapter 3, $L_k = (g_k \cdot (A_k)_- \cdot g_k^{-1})_-$, where the $(A_k)_-$ characterize the orbit and are non-dynamical. The $g_k$ are defined modulo right multiplication by diagonal matrices, and we have in addition to quotient by global gauge transformations.

Proposition. The phase space M has dimension 2g and there are g proper action variables in eq. (5.4).

Proof. The dynamical variables are the jets of order $(n_k - 1)$ of the $g_k$. This gives $N^2 n_k$ parameters. But $L_k$ is invariant under $g_k \to g_k d_k$ with $d_k$ a jet of diagonal matrices of the same order. Hence the dimension of the $L_k$ orbit is $(N^2 - N)n_k$. We also have the residual global gauge invariance by diagonal matrices acting as $g_k \to d g_k$, or $L(\lambda) \to d L(\lambda) d^{-1}$. This preserves the diagonal form of $L_0$. The orbits of this action are of dimension (N − 1), since the identity does not act. The phase space M is obtained by Hamiltonian reduction by this action (see Chapter 14). First one fixes the momentum, yielding (N − 1) conditions, and then one takes the quotient by the stabilizer of the momentum, which is here the whole group since it is Abelian. As a result, the dimension of the phase space is reduced by 2(N − 1), yielding:
$$\dim M = (N^2 - N) \sum_k n_k - 2(N - 1) = 2g$$
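The dimension count can be checked mechanically. The sketch below is only an arithmetic illustration with hypothetical function names: it adds up the orbit dimensions, subtracts the 2(N − 1) removed by the Hamiltonian reduction, and compares the result with twice the genus.

```python
def orbit_dimension(n_sheets, pole_order):
    """Jets of g_k (N^2 n_k parameters) modulo diagonal jets d_k (N n_k parameters)."""
    return n_sheets**2 * pole_order - n_sheets * pole_order


def phase_space_dimension(n_sheets, pole_orders):
    """Sum of orbit dimensions, reduced by 2 (N - 1) for the diagonal action."""
    total = sum(orbit_dimension(n_sheets, n_k) for n_k in pole_orders)
    return total - 2 * (n_sheets - 1)


def genus(n_sheets, pole_orders):
    """g = N (N - 1) / 2 * sum_k n_k - N + 1."""
    return n_sheets * (n_sheets - 1) // 2 * sum(pole_orders) - n_sheets + 1


# Example: 3 x 3 matrices with one simple and one double pole.
dim_m = phase_space_dimension(3, [1, 2])
g = genus(3, [1, 2])
```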
Let us now count the number of independent coefficients in eq. (5.4). It is clear that $r_j(\lambda)$ is a rational function of λ. The value of $r_j$ at ∞ is known since $\mu_j \to a_j$. Note that $r_j$ is the symmetrical function $\sigma_j(\mu_1, \ldots, \mu_N)$, where the $\mu_i$ are the eigenvalues of L(λ). Above $\lambda = \lambda_k$, they can be written as
$$\mu_j = \sum_{n=1}^{n_k} \frac{c_n^{(j)}}{(\lambda - \lambda_k)^n} + \text{regular} \tag{5.7}$$
where all the coefficients $c_1^{(j)}, \ldots, c_{n_k}^{(j)}$ are fixed and non-dynamical because they are the matrix elements of the diagonal matrices $(A_k)_-$, while the regular part is dynamical. We see that $r_j(\lambda)$ has a pole of order $j n_k$ at $\lambda = \lambda_k$, and so can be expressed using $j \sum_k n_k$ parameters, namely the coefficients of all these poles. Summing over j we have altogether $\frac{1}{2} N(N + 1) \sum_k n_k$ parameters. They are not all independent however, because in eq. (5.7) the coefficients $c_n^{(j)}$ are non-dynamical. This implies that
the $n_k$ highest order terms in $r_j$ are fixed, which yields $N \sum_k n_k$ constraints on the coefficients of $r_j$. We are left with $\frac{1}{2} N(N - 1) \sum_k n_k$ parameters, that is g + N − 1 parameters.

It remains to take the symplectic quotient by the action of constant diagonal matrices. We assume that the system is equipped with the Poisson bracket (3.26) of Chapter 3. Consider the Hamiltonians $H_n = (1/n)\, \mathrm{Res}_{\lambda=\infty} \mathrm{Tr}(L^n(\lambda))\, d\lambda$, i.e. the term in 1/λ in $\mathrm{Tr}(L^n(\lambda))$. These are functions of the $r_j(\lambda)$ in eq. (5.4). We show that they are the generators of the diagonal action. First we have:
$$\mathrm{Res}_{\lambda=\infty} \mathrm{Tr}(L^n(\lambda))\, d\lambda = n\, \mathrm{Res}_{\lambda=\infty} \mathrm{Tr}\Big(L_0^{n-1} \sum_k L_k(\lambda)\Big)\, d\lambda = n\, \mathrm{Res}_{\lambda=\infty} \mathrm{Tr}\big(L_0^{n-1} L(\lambda)\big)\, d\lambda \tag{5.8}$$
since all $L_k(\lambda)$ are of order 1/λ at ∞. Using the Poisson bracket we get
$$\{H_n, L(\mu)\} = -\mathrm{Res}_{\lambda=\infty} \mathrm{Tr}_1 \left( L_0^{n-1} \otimes 1 \left[ \frac{C_{12}}{\lambda - \mu},\ L(\lambda) \otimes 1 + 1 \otimes L(\mu) \right] \right) d\lambda$$
The term L(λ) ⊗ 1 in the commutator does not contribute because the $L_0$ part produces a vanishing contribution by cyclicity of the trace and all other terms are of order at least $1/\lambda^2$. The term 1 ⊗ L(µ) yields $-[L_0^{n-1}, L(\mu)]$, which is the coadjoint action of a diagonal matrix on L(µ). Since $L_0$ is generic, the $L_0^n$ generate the space of all diagonal matrices, so we get exactly N − 1 generators $H_1, \ldots, H_{N-1}$. In the Hamiltonian reduction procedure, the $H_n$ are the moments of the group action and are to be set to fixed (non-dynamical) values. Hence when the system is properly reduced we are left with exactly g action variables.

Example. Let us consider the example of the Neumann model. Recall the Lax matrix (see Chapter 3):
$$L(\lambda) = L_0 + \frac{1}{\lambda} J - \frac{1}{\lambda^2} K$$
The spectral curve can be computed as follows:
$$\det(L(\lambda) - \mu) = \det\left(L_0 - \mu + \frac{1}{\lambda} J - \frac{1}{\lambda^2} K\right) = \det(L_0 - \mu)\, \det\left(1 + (L_0 - \mu)^{-1}\Big(\frac{1}{\lambda} J - \frac{1}{\lambda^2} K\Big)\right) = \prod_i (a_i - \mu) \left(1 + \frac{1}{\lambda^2} \sum_k \frac{F_k}{\mu - a_k}\right) \tag{5.9}$$
where the conserved quantities $F_k$ are given by:
$$F_k = x_k^2 + \sum_{j \neq k} \frac{J_{kj}^2}{a_k - a_j} \tag{5.10}$$
This is because J is a projector of rank 2 and K is a projector of rank 1. The matrix $P = (L_0 - \mu)^{-1}\left(\frac{1}{\lambda} J - \frac{1}{\lambda^2} K\right)$ is of rank 2. Its image is spanned by the two vectors $v_1 = (L_0 - \mu)^{-1} X$ and $v_2 = (L_0 - \mu)^{-1} Y$ while its kernel is the (N − 2)-dimensional space orthogonal to X and Y, which is generically supplementary to the image. We have P = 0 on the kernel, and:
$$P v_1 = \left(\frac{1}{\lambda} V(\mu) - \frac{1}{\lambda^2} U(\mu)\right) v_1 - \frac{1}{\lambda} U(\mu)\, v_2$$
$$P v_2 = \left(\frac{1}{\lambda} (W(\mu) + 1) - \frac{1}{\lambda^2} V(\mu)\right) v_1 - \frac{1}{\lambda} V(\mu)\, v_2$$
where the functions U(µ), V(µ), W(µ) are defined as:
$$U(\mu) = \sum_i \frac{x_i^2}{a_i - \mu}, \qquad V(\mu) = \sum_i \frac{x_i y_i}{a_i - \mu}, \qquad W(\mu) = -1 + \sum_i \frac{y_i^2}{a_i - \mu} \tag{5.11}$$
From this it follows that:
$$\det(1 + P) = 1 - \frac{1}{\lambda^2}\left(V^2(\mu) - U(\mu) W(\mu)\right) = 1 + \frac{1}{\lambda^2} \sum_k \frac{F_k}{\mu - a_k} \tag{5.12}$$
which yields the result. Incidentally this proves formula (2.24) in Chapter 2. Since we have already found an r-matrix for the Neumann system this proves its integrability. The spectral curve can be written in the form:
$$\lambda^2 = -\frac{\sum_k F_k \prod_{i \neq k} (\mu - a_i)}{\prod_i (\mu - a_i)} = -\frac{\prod_{i=1}^{N-1} (\mu - b_i)}{\prod_{i=1}^{N} (\mu - a_i)} \tag{5.13}$$
Performing the birational transformation (see Chapter 15) $\tilde{\lambda} = \lambda \prod_i (\mu - a_i)$, we get:
$$\tilde{\lambda}^2 = -\prod_{i=1}^{N} (\mu - a_i) \prod_{i=1}^{N-1} (\mu - b_i) \tag{5.14}$$
which is a hyperelliptic curve of genus g = N − 1. Note that the phase space is of dimension 2(N − 1) and that we have (N − 1) independent conserved quantities, namely the N quantities $F_k$ modulo the relation $\sum_k F_k = 1$. Let us remark that in this case we do not quotient by the diagonal action. This is because the diagonal action does not preserve the particular form of the Lax matrix in terms of the vectors X and Y. In this case the Hamiltonians $H_n = \mathrm{Tr}(L_0^{n-1} J)$ identically vanish.

To illustrate the discussion of N-sheeted coverings, it is instructive to consider the covering projection (λ, µ) → λ which allows us to see the spectral curve as an N-sheeted covering of the Riemann sphere of the variable λ. To compute the branching index of this covering we have to find the total number of poles of $\partial_\mu \Gamma(\lambda, \mu)$. Such poles can only occur when λ = ∞ or µ = ∞. First, above λ = ∞ we have the N points $\mu = a_i$. These are not branch points and the local parameter is 1/λ. Around such a point we have by eq. (5.9):
$$(\lambda, \mu) \to Q_i = (\infty, a_i), \qquad \mu = a_i - F_i/\lambda^2 + O(1/\lambda^4) \tag{5.15}$$
hence $\partial_\mu \Gamma = \lambda^2 \prod_{j \neq i} (a_i - a_j) + O(1)$. We thus have N double poles at these points. When µ → ∞ we have, by eq. (5.9), $\lambda^2 = -1/\mu \to 0$ and λ is again a local parameter,
$$(\lambda, \mu) \to (0, \infty), \qquad \mu = -\frac{1}{\lambda^2} + O(1) \tag{5.16}$$
At this point we have $\partial_\mu \Gamma = -(-1/\lambda^2)^{N-2} + \cdots$, hence we have a pole of order 2N − 4. So the branching index is ν = 4N − 4. This yields g = N − 1. Note that Γ(λ, µ) is a very non-generic polynomial of degree 2N − 1 and the orbit to which L(λ) belongs is a very low-dimensional one, nevertheless all the numbers fit nicely.
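The factorized form (5.9) of the Neumann spectral curve can be checked on sample data. The sketch below uses exact rational arithmetic from the Python standard library; it assumes J = X Yᵀ − Y Xᵀ and K = X Xᵀ (consistent with the action of J and K used in this chapter), and the numerical values of a, X, Y, λ, µ are arbitrary illustrative choices with X·X = 1 and X·Y = 0.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def neumann_lax_minus_mu(a, x, y, lam, mu):
    """L(lam) - mu, with L = L0 + J/lam - K/lam^2 (J, K as assumed above)."""
    n = len(a)
    return [
        [
            (a[i] - mu if i == j else Fraction(0))
            + (x[i] * y[j] - y[i] * x[j]) / lam
            - x[i] * x[j] / lam**2
            for j in range(n)
        ]
        for i in range(n)
    ]


def conserved_quantities(a, x, y):
    """F_k = x_k^2 + sum_{j != k} J_kj^2 / (a_k - a_j), eq. (5.10)."""
    n = len(a)
    return [
        x[k] ** 2
        + sum((x[k] * y[j] - x[j] * y[k]) ** 2 / (a[k] - a[j])
              for j in range(n) if j != k)
        for k in range(n)
    ]


def spectral_rhs(a, f, lam, mu):
    """Right-hand side of eq. (5.9)."""
    prod = Fraction(1)
    for a_i in a:
        prod *= a_i - mu
    return prod * (1 + sum(f_k / (mu - a_k) for f_k, a_k in zip(f, a)) / lam**2)


# Sample data for N = 3: X on the unit sphere, Y orthogonal to X.
a = [Fraction(1), Fraction(2), Fraction(5)]
x = [Fraction(2, 3), Fraction(1, 3), Fraction(2, 3)]
y = [Fraction(1), Fraction(-2), Fraction(0)]
F = conserved_quantities(a, x, y)
lam, mu = Fraction(3, 7), Fraction(-4, 5)
lhs = determinant(neumann_lax_minus_mu(a, x, y, lam, mu))
rhs = spectral_rhs(a, F, lam, mu)
```

Because the arithmetic is exact, `lhs` and `rhs` can be compared with `==`, and `sum(F)` reproduces the relation $\sum_k F_k = 1$ for a unit vector X.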
5.2 The eigenvector bundle

The aim of this chapter is to present the general procedure for solving integrable models using analytical properties on the spectral curve. In order to simplify the exposition we shall assume that all functions or matrices are generic. We wish to examine how the Lax matrix can be reconstructed from the analytic data characterizing its eigenvectors. This analysis will exhibit the special role played by the divisor D of the poles at finite distances of the eigenvector Ψ(P). This divisor contains all the dynamical information.

Let P be a point on the spectral curve. We assume that P = (λ, µ) is not a branch point so that all eigenvalues of L(λ) are distinct and the eigenspace at P is one-dimensional. Let Ψ(P) be an eigenvector, and $\psi_j(P)$ its N components:
$$\Psi(P) = \begin{pmatrix} \psi_1(P) \\ \vdots \\ \psi_N(P) \end{pmatrix}$$
Since the normalization of the eigenvector Ψ(λ, µ) is arbitrary, one has to make a choice before making a statement about its analytical properties. We choose to normalize it such that its first component is equal to one, i.e. $\psi_1(P) = 1$ at any point P ∈ Γ.
It is then clear that the $\psi_j(P)$ depend locally analytically on P. As a matter of fact:

Proposition. With the above normalization, the components of the eigenvectors Ψ(P) at the point P = (λ, µ) are meromorphic functions on the spectral curve Γ.

Proof. For a generic point P on the curve Γ, i.e. for a pair P = (λ, µ) satisfying eq. (5.3), there exists a unique eigenvector Ψ(P) of the matrix L(λ) normalized by the condition $\psi_1(P) = 1$. The un-normalized components $\psi_i(P)$ can be taken as suitable minors $\Delta_i(P)$ of the matrix L(λ) − µ1, and are thus meromorphic functions on Γ. After dividing by $\Delta_1(\lambda, \mu)$ to normalize the first component, all the other components $\psi_j(P)$ are still meromorphic functions on Γ.

With each point P = (λ, µ) on Γ we associate a meromorphic eigenvector Ψ(P). At a branch point however, special care must be taken since there could be several eigenvectors associated with that point. We show that, for a generic Lax matrix, the eigenspaces are one-dimensional even at a branch point P. Moreover, the eigenspaces around P admit a unique analytic continuation at P, irrespective of the branch chosen.

Proposition. With each point P in Γ we associate the eigenspace at P. This allows us to define an analytic line bundle that we call the eigenvector bundle.

Proof. The first point is to show that the eigenspace at P is of dimension 1 at each point of Γ, even at a branch point, in the generic case. Consider the matrix A ≡ L(λ) − µ. The fact that we are on a branch point of the curve Γ is expressed by two algebraic equations in the coefficients of A: Γ(λ, µ) = 0 and $\partial_\mu \Gamma(\lambda, \mu) = 0$. To say that the kernel of A at P is of dimension ≥ 2 means that the dimension of its image is ≤ (N − 2). Let us show that this implies at least three algebraic equations on the coefficients of A. Let $v_1, \ldots, v_N$ be the columns of A. We assume for simplicity that the kernel of A at P is of dimension 2 and that $v_1, \ldots, v_{N-2}$ are independent. First we impose that $v_1, \ldots, v_N$ are linearly dependent, producing one condition det A = 0, which is the equation of the spectral curve. Then we impose that $v_1, \ldots, v_{N-1}$ are dependent on $v_1, \ldots, v_{N-2}$. This is expressed by the vanishing of two (N − 1) × (N − 1) minors. One of these conditions is equivalent to $\partial_\mu \Gamma(\lambda, \mu) = 0$ but there remains another independent one, hence the variety of such matrices is of codimension 1. Another way of saying this is that when a matrix has coinciding eigenvalues it can be put in the Jordan form, but is not diagonalizable in general.

We now construct an abstract line bundle starting from the dimension 1 eigenspace $E_P$ at each point P, see Chapter 15. We call $e_1, \ldots, e_N$ the canonical basis of the ambient space in which the eigenvectors live. Define N open sets $U_i$ on the curve Γ by the conditions $U_i = \{P \in \Gamma \mid \exists\, V \in E_P \text{ s.t. } V_i = (V, e_i) \neq 0\}$, meaning that the eigenspace $E_P$ is not perpendicular to $e_i$. Obviously the $U_i$ form an open covering of Γ. On each intersection $U_i \cap U_j$ we define transition functions $t_{ij}(P)$ by $t_{ij}(P) = V_i/V_j$ where V is any non-zero eigenvector at P. The quotient is independent of the choice of V and the components $V_i$, $V_j$ do not vanish on $U_i \cap U_j$. In view of the argument of the previous proposition it is clear that $t_{ij}(P)$ is analytic with respect to the point P on Γ and non-vanishing. Finally, the cocycle condition $t_{ij} t_{jk} = t_{ik}$ is trivially satisfied on $U_i \cap U_j \cap U_k$. Hence these transition functions define an analytic line bundle which we call the eigenvector bundle.

Any meromorphic section of this bundle can be described as a collection $(V_1(P), \ldots, V_N(P))$, where $V_i(P)$ is defined and meromorphic on $U_i$ and $V_i(P) = t_{ij}(P) V_j(P)$ on $U_i \cap U_j$. One can see this collection as a P-dependent vector lying in the eigenspace $E_P$ with components $V_i(P)$ on $e_i$.
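The two properties of the transition functions used in the proof (independence of the choice of V and the cocycle condition) are elementary; the following sketch makes them explicit at a single point P. The vector `v` stands for an arbitrary eigenvector with non-vanishing components, i.e. a point assumed to lie in all the $U_i$; its numerical values are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def transition(v, i, j):
    """t_ij = V_i / V_j for a non-zero eigenvector V with V_j != 0."""
    return v[i] / v[j]


# A sample eigenvector at a point P lying in U_1, U_2 and U_3.
v = [Fraction(2), Fraction(-3), Fraction(1, 2)]
# The same eigenspace described by a rescaled eigenvector.
rescaled = [Fraction(7, 5) * component for component in v]

indices = range(len(v))
scale_independent = all(
    transition(v, i, j) == transition(rescaled, i, j)
    for i, j in product(indices, repeat=2)
)
cocycle_holds = all(
    transition(v, i, j) * transition(v, j, k) == transition(v, i, k)
    for i, j, k in product(indices, repeat=3)
)
```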
Remark 1. Alternatively one can define normalized eigenvectors $\psi^{(i)}(P)$ with $\psi_i^{(i)}(P) = 1$. Then $U_i$ is the open set where $\psi^{(i)}(P)$ remains finite. The transition function $t_{ij}(P) = \psi_i^{(j)}(P) = 1/\psi_j^{(i)}(P)$ has no zero nor pole on $U_i \cap U_j$.
Remark 2. It may be useful to understand the situation at branch points on the simple example of 2 × 2 matrices. Let
$$L(\lambda) = \begin{pmatrix} a(\lambda) & b(\lambda) \\ c(\lambda) & d(\lambda) \end{pmatrix}$$
which has eigenvalues $\mu_\pm(\lambda) = \frac{1}{2}(a(\lambda) + d(\lambda)) \pm \frac{1}{2}\sqrt{\Delta(\lambda)}$ with $\Delta(\lambda) = (a(\lambda) - d(\lambda))^2 + 4 b(\lambda) c(\lambda)$. The corresponding normalized eigenvectors are:
$$\Psi_\pm = \begin{pmatrix} 1 \\ \psi_\pm \end{pmatrix}, \qquad \psi_\pm = \frac{d(\lambda) - a(\lambda)}{2b(\lambda)} \pm \frac{\sqrt{\Delta(\lambda)}}{2b(\lambda)}$$
Assume that $\lambda_0$ is a root of Δ(λ) = 0. It is obvious that when $\lambda \to \lambda_0$, $\Psi_+$ and $\Psi_-$ tend smoothly to the same limit except if one has also $b(\lambda_0) = 0$. If $b(\lambda_0) \neq 0$ one can express $L(\lambda_0)$ in the basis given by $\Psi(\lambda_0)$ and (0, 1), getting
$$L(\lambda_0) \to \begin{pmatrix} \frac{1}{2}(a(\lambda_0) + d(\lambda_0)) & b(\lambda_0) \\ 0 & \frac{1}{2}(a(\lambda_0) + d(\lambda_0)) \end{pmatrix}$$
from which it is obvious that $L(\lambda_0)$ is of the Jordan form and has just one eigenvector. If, however, $b(\lambda_0) = 0$ then we also have $a(\lambda_0) = d(\lambda_0)$. Assuming that d(λ) − a(λ) and b(λ) vanish to first order in $\lambda - \lambda_0$, then (d(λ) − a(λ))/2b(λ) tends to some limit $\psi_e$. We see that $\psi_\pm \sim \psi_e \pm \sqrt{c(\lambda)/b(\lambda)}$. Hence if $c(\lambda_0) \neq 0$ we still have only one eigenvector of the form (0, 1), while if c(λ) also vanishes to first order in $\lambda - \lambda_0$ the matrix $L(\lambda_0)$ is diagonalizable, and the eigenvectors $\Psi_\pm$ tend generically to different limits at $\lambda_0$. However, in this case we have $\Delta \sim (\lambda - \lambda_0)^2$ so that the corresponding point $(\lambda_0, \mu_0)$ of the spectral curve is not a branch point, but a singular point. Upon desingularization it blows up to two points and the two values of Ψ are perfectly natural. Of course this analysis clearly covers what happens at a branch point of order 2 in the general case.
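A concrete instance of the case $b(\lambda_0) \neq 0$ may help. The sketch below uses a made-up 2 × 2 matrix, with a(λ) = λ, b = 1, c(λ) = −λ, d(λ) = −λ, chosen only so that Δ vanishes at $\lambda_0 = 1$; it checks that there is a single eigenvector at the branch point and that L takes the Jordan form given above in the basis (Ψ(λ0), (0, 1)).

```python
from fractions import Fraction


def lax(lam):
    """Illustrative 2 x 2 matrix: a = lam, b = 1, c = -lam, d = -lam."""
    return [[lam, Fraction(1)], [-lam, -lam]]


def discriminant(m):
    """Delta = (a - d)^2 + 4 b c."""
    (a, b), (c, d) = m
    return (a - d) ** 2 + 4 * b * c


def matmul(p, q):
    """Product of two 2 x 2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


lam0 = Fraction(1)
L0 = lax(lam0)
(a, b), (c, d) = L0
delta_at_branch = discriminant(L0)   # vanishes: lam0 is a branch point
mu0 = (a + d) / 2                    # the double eigenvalue
psi0 = (d - a) / (2 * b)             # second component of Psi(lam0)

# Change of basis to (Psi(lam0), (0, 1)): the columns of S.
S = [[Fraction(1), Fraction(0)], [psi0, Fraction(1)]]
S_inv = [[Fraction(1), Fraction(0)], [-psi0, Fraction(1)]]
jordan = matmul(S_inv, matmul(L0, S))

# Away from the branch point the discriminant does not vanish.
delta_generic = discriminant(lax(Fraction(2)))
```

Here `jordan` has the double eigenvalue on the diagonal and $b(\lambda_0)$ above it, so $L(\lambda_0)$ is not diagonalizable.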
We now compute the Chern class of the eigenvector bundle. To do that we view Ψ(P) as a meromorphic section of our bundle in the above way, i.e. as the collection defined respectively on $U_1, \ldots, U_N$: $(\psi_1(P) = 1, \psi_2(P), \ldots, \psi_N(P))$. Notice that this section does not vanish because $\psi_i(P)$ does not vanish on $U_i$ by definition. We compute the number of poles of this section, which yields the Chern class.

The number of poles of the normalized eigenvectors cannot be deduced by simply counting the number of zeroes of minors. Indeed, let $\hat{\Delta}(\lambda, \mu)$ be the matrix of cofactors of (L(λ) − µ1), which, by definition, is such that $(L(\lambda) - \mu 1)\hat{\Delta} = \Gamma(\lambda, \mu) 1$. Therefore at P = (λ, µ) ∈ Γ, the matrix $\hat{\Delta}(P)$ is a matrix of rank 1, since the kernel of (L(λ) − µ1) is of dimension 1. Hence, for P ∈ Γ the matrix elements of $\hat{\Delta}(P)$ are of the form $\alpha_i(P)\beta_j(P)$ and the components of the normalized eigenvector are
$$\psi_i(P) = \frac{\alpha_i(P)\beta_1(P)}{\alpha_1(P)\beta_1(P)} = \frac{\alpha_i(P)}{\alpha_1(P)}.$$
We thus expect cancellations to occur when we take the ratio of the minors and we cannot deduce the number of poles of the normalized eigenvector by simply counting the number of zeroes of the first minor.

Proposition. We say that the vector Ψ(P) possesses a pole if one of its components has a pole. The number of poles of the normalized vector Ψ(P) is:
$$m = g + N - 1 \tag{5.17}$$
Proof. Let us introduce the function W(λ) of the complex variable λ defined by:
$$W(\lambda) = \left(\det \hat{\Psi}(\lambda)\right)^2$$
where $\hat{\Psi}(\lambda)$ is the matrix of eigenvectors of L(λ) defined as follows:
$$\hat{\Psi}(\lambda) = \begin{pmatrix} \psi_1(P_1) & \psi_1(P_2) & \cdots & \psi_1(P_N) \\ \vdots & \vdots & \ddots & \vdots \\ \psi_N(P_1) & \psi_N(P_2) & \cdots & \psi_N(P_N) \end{pmatrix} \tag{5.18}$$
where the points $P_i$ are the N points above λ. In this formula $\psi_1(P_j) = 1$. Changing the normalization of the eigenvectors $\Psi(P_j)$ amounts to multiplying $\hat{\Psi}(\lambda)$ on the right by a diagonal matrix. By definition $\hat{\Psi}(\lambda)$ is the matrix diagonalizing L(λ). The function W(λ) is well-defined as a rational function of λ on the Riemann sphere since the square of the determinant does not depend on the order of the $P_j$. It has a double pole where Ψ(P) has a simple pole. To count its poles, we count its zeroes. We show that W(λ) has a simple zero for values of λ corresponding to a branch point of the covering, therefore m = ν/2. Recall that from eq. (5.6) the number of branch points is ν = 2(N + g − 1).

First notice that W(λ) only vanishes on branch points where there are at least two identical columns. Indeed, let $P_i = (\lambda, \mu_i)$ be the N points above λ. Then the $\Psi(P_i)$ are the eigenvectors of L(λ) corresponding to the eigenvalues $\mu_i$ and are thus linearly independent when all the $\mu_i$ are different. Therefore W(λ) cannot vanish at such a point. The other possibility for the vanishing of W(λ) would be that the vector Ψ(P) itself vanishes at some point (all components have a common zero at this point), but this is impossible because the first component is always 1.

Let us assume now that $\lambda_0$ corresponds to a branch point, which is generically of order 2. At such a point W(λ) has a simple zero. Indeed, let z be an analytical parameter on the curve around the branch point. The covering projection P → λ gets expressed as $\lambda = \lambda_0 + \lambda_1 z^2 + O(z^3)$. The determinant vanishes to order z, hence W vanishes to order $z^2$. This is precisely proportional to $\lambda - \lambda_0$. A similar analysis can be performed if the branch point is of higher order.

We now need to examine the behaviour of the eigenvector around λ = ∞. At the N points $Q_i$ above λ = ∞, the eigenvectors are proportional to the canonical vectors $e_i$, $(e_i)_k = \delta_{ik}$, since L(λ = ∞) is diagonal, cf. eq. (5.5).
While this is compatible with the normalization $\psi_1(P) = 1$ at the point $Q_1$, it is not compatible at the points $Q_i$, i ≥ 2, if the proportionality factor remains finite. The situation is described more precisely by the following:

Proposition. The k-th component $\psi_k(P)$ of Ψ(P) has a simple pole at $Q_k$ and vanishes at $Q_1$ for k = 2, 3, . . . , N.

Proof. Around $Q_k$ (λ = ∞, µ = $a_k$), k = 1, . . . , N, the eigenspace of L(λ) is spanned by a vector of the form $V_k(\lambda) = e_k + O(1/\lambda)$. The first component of $V_k$ is $V_k^1 = \delta_{1k} + O(1/\lambda)$. To get the normalized Ψ one has to divide $V_k$ by $V_k^1$. So we get:
$$\Psi(P)\big|_{P \sim Q_1} = \begin{pmatrix} 1 \\ O(1/\lambda) \\ \vdots \\ O(1/\lambda) \end{pmatrix}, \qquad \Psi(P)\big|_{P \sim Q_k} = \begin{pmatrix} 1 \\ O(1) \\ \vdots \\ O(\lambda) \\ O(1) \\ \vdots \\ O(1) \end{pmatrix}, \quad k \geq 2 \tag{5.19}$$
where O(λ) is the announced pole of the k-th component of $\Psi(P)|_{P \sim Q_k}$.

The previous proposition shows that fixing the gauge by imposing that L(λ) is diagonal at λ = ∞ introduces N − 1 poles at the positions $Q_i$, i = 2, . . . , N. The location of these poles is independent of time, and is really part of the choice of the gauge condition. These poles do not contain any dynamical information. Only the positions of the other g poles have a dynamical significance. Let D be the divisor of these dynamical poles. We call it the dynamical divisor. Recall that the vector Ψ(P) possesses a pole if one of its components has a pole. Therefore the two previous propositions tell us that the divisor of the k-th component of the eigenvector Ψ(P) is bigger than $(-D + Q_1 - Q_k)$. This information is enough to reconstruct the eigenvectors and the Lax matrix.

Proposition. Let D be a generic divisor on Γ of degree g. Up to normalization, there is a unique meromorphic function $\psi_k(P)$ with divisor $(\psi_k) \geq -D + Q_1 - Q_k$.

Proof. This is a direct application of the Riemann–Roch theorem, since $\psi_k$ is required to have g + 1 poles and one prescribed zero. Hence it is generically unique apart from multiplication by a constant $\psi_k \to d_k \psi_k$.
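The counting behind this application of Riemann–Roch can be spelled out as follows. The sketch assumes the divisors involved are non-special, which is the genericity assumption made throughout, so that l(E) = deg E − g + 1; the function names are illustrative only.

```python
def riemann_roch_nonspecial(degree, genus):
    """l(E) = deg E - g + 1 for a non-special divisor E (assumption: i(E) = 0)."""
    return degree - genus + 1


def dimension_for_component(genus):
    """Functions with poles allowed on D + Q_k and a zero at Q_1.

    The divisor E = D + Q_k - Q_1 has degree g + 1 - 1 = g.
    """
    return riemann_roch_nonspecial(genus + 1 - 1, genus)


def dimension_for_all_components(genus, n_sheets):
    """Functions with poles allowed on D + Q_2 + ... + Q_N (degree g + N - 1)."""
    return riemann_roch_nonspecial(genus + n_sheets - 1, genus)


# For any genus, psi_k is unique up to a constant ...
unique_up_to_scale = dimension_for_component(5)
# ... and the N functions 1, psi_2, ..., psi_N fill the space of functions
# whose poles lie on the full pole divisor of Psi (here g = 5, N = 3).
full_space = dimension_for_all_components(5, 3)
```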
Equipped with these functions $\psi_k(P)$ for k = 2, . . . , N we construct a vector function with values in $\mathbb{C}^N$:
$$\Psi(P) = \begin{pmatrix} 1 \\ \psi_2(P) \\ \vdots \\ \psi_N(P) \end{pmatrix}$$
The normalization ambiguity of the $\psi_k$ translates into left multiplication of the vector Ψ(P) by a constant diagonal matrix $d = \mathrm{diag}(1, d_2, \ldots, d_N)$. We have constructed a line bundle on the Riemann surface Γ, which is the line bundle associated to the divisor $-D - Q_2 - \cdots - Q_N$ of degree −(g + N − 1) (see Chapter 15). In fact we have constructed an embedding of this line bundle into $\Gamma \times \mathbb{C}^N$.

Theorem. Given the spectral curve Γ, such that above the points $\lambda_k$ the N branches satisfy eq. (5.7), there exists a unique matrix L(λ), rational in λ, such that
$$(L(\lambda) - \mu 1)\Psi(P) = 0$$
This matrix has poles at the points $\lambda_k$ and satisfies the boundary condition: $\lim_{\lambda \to \infty} L(\lambda) = \mathrm{diag}(a_1, \ldots, a_N)$.

Proof. Consider the matrix $\hat{\Psi}(\lambda)$ whose columns are the vectors $\Psi(P_i)$, where $P_i = (\lambda, \mu_i)$ are the N points above λ, cf. eq. (5.18). This matrix depends on the ordering of the columns, i.e. on the ordering of the points $P_i$. However, the matrix
$$L(\lambda) = \hat{\Psi}(\lambda) \cdot \hat{\mu} \cdot \hat{\Psi}^{-1}(\lambda) \tag{5.20}$$
does not depend on this ordering and is a well-defined function on the base curve. Here $\hat{\mu}$ is the diagonal matrix $\hat{\mu} = \mathrm{diag}(\mu_1, \ldots, \mu_N)$. One has to examine the poles of the right-hand side of eq. (5.20). At a generic branch point two columns of the matrix $\hat{\Psi}$ coalesce and its determinant has a simple zero with respect to the local parameter. These zeroes are the only zeroes of $\det \hat{\Psi}(\lambda)$. This is because the meromorphic function $W(\lambda) = (\det \hat{\Psi})^2(\lambda)$ is a function of λ and has 2(N + g − 1) poles, since $\hat{\Psi}$ has (g + N − 1) poles. The function W(λ) has the same number of zeroes and poles, hence has also 2(N + g − 1) zeroes. At the branch points it behaves like $z^2 \sim (\lambda - \lambda_b)$ (where z is a local parameter), hence has a simple zero. Thus the branch points contribute ν = 2(N + g − 1) zeroes, which are all the zeroes of W(λ).
We now show that, at a branch point, the matrix (5.20) is regular. Recall that if $\hat{\Delta}$ is the matrix of cofactors of $\hat{\Psi}$ we have:
$$\hat{\Psi} \cdot \hat{\Delta} = \det \hat{\Psi}\ 1, \qquad L(\lambda) = \frac{1}{\det \hat{\Psi}}\, \hat{\Psi} \cdot \hat{\mu} \cdot \hat{\Delta} \tag{5.21}$$
At the branch point, $\hat{\Psi} \hat{\Delta} = 0$, thus $\mathrm{Im}\, \hat{\Delta} = \mathrm{Ker}\, \hat{\Psi}$ which is one-dimensional. We may assume without loss of generality that the two eigenvectors that coalesce are the first two columns of $\hat{\Psi}$. So, at the branch point, the kernel of $\hat{\Psi}$ is spanned by $e_1 - e_2$, where the $e_i$ are the canonical base vectors $(e_i)_j = \delta_{ij}$. Since the first two diagonal elements of $\hat{\mu}$ also become equal we see that $\hat{\mu}$ acts as a scalar on $\mathrm{Im}\, \hat{\Delta}$, so $\hat{\Psi}(\lambda) \cdot \hat{\mu} \cdot \hat{\Delta} = 0$. Hence the numerator of eq. (5.21) has a simple zero at the branch point, cancelling the simple pole of the determinant. Therefore the matrix L(λ) is a rational function of the parameter λ. It has poles only at the projections of the points where µ has poles, i.e. at the points $\lambda_k$, see eq. (5.7). At λ = ∞, the leading part of $\hat{\Psi}(\lambda)$ is diagonal since it is dominated by the functions $\psi_k(P)$ with P approaching $Q_k$. Therefore at infinity L(λ) goes to $\hat{\mu}|_{\lambda=\infty} = \mathrm{diag}(a_1, \ldots, a_N)$.

This theorem is a crucial step of this method of resolution. It says that, once the spectral curve has been given, which amounts to giving the values of the integrals of motion, all remaining dynamical data are encoded into the divisor D. In other words, this theorem teaches us that the dynamical variables are the action variables and the points of this divisor. It should be emphasized, however, that Ψ(P) is defined up to left multiplication by diagonal matrices $\psi_k \to d_k \psi_k$. On the Lax matrix L(λ) this amounts to a conjugation by a constant diagonal matrix. Hence the object we reconstruct is actually the Hamiltonian reduction of the dynamical system by this group of diagonal matrices, as emphasized at the beginning of this chapter.

Remark. It is worth comparing the reconstruction formula (5.20) for the Lax matrix with the local analysis of section (3.2) in Chapter 3. Recall that in this chapter we explained that the pair of matrices L(λ) and M(λ) could be diagonalized simultaneously, locally around each pole $\lambda_k$. Explicitly, the diagonalization formula (3.8) for L(λ) was $L(\lambda) = g_k A_k g_k^{-1}$ with $g_k$ and $A_k$ power series in $(\lambda - \lambda_k)$ and $A_k$ diagonal. Of course $g_k$ is determined up to right multiplication by a diagonal matrix. The expression (5.20) of L(λ) in terms of the eigenvectors is simply a global version of the previous local statement.

Example. Let us illustrate the analytical properties of the eigenvector bundle on the example of the Neumann model. There is a simple
description of the eigenvectors in this case. Indeed, from eq. (3.3) in Chapter 3,
$$L(\lambda)\Psi = \mu\Psi \implies \Psi = -(L_0 - \mu)^{-1}\left(\frac{1}{\lambda} J\Psi - \frac{1}{\lambda^2} K\Psi\right) \tag{5.22}$$
Since JΨ = (Y · Ψ)X − (X · Ψ)Y and KΨ = (X · Ψ)X we see that Ψ is known once we know its projections on X and Y. Projecting eq. (5.22) on X and Y one gets a 2 × 2 system:
$$\begin{pmatrix} 1 - \frac{1}{\lambda} V(\mu) - \frac{1}{\lambda^2} U(\mu) & \frac{1}{\lambda} U(\mu) \\ -\frac{1}{\lambda}(1 + W(\mu)) - \frac{1}{\lambda^2} V(\mu) & 1 + \frac{1}{\lambda} V(\mu) \end{pmatrix} \begin{pmatrix} X \cdot \Psi \\ Y \cdot \Psi \end{pmatrix} = 0$$
The vanishing of the determinant of this linear system is precisely the equation of the spectral curve:
$$\lambda^2 = V^2(\mu) - U(\mu) W(\mu)$$
Solving this system for X · Ψ and Y · Ψ, and inserting back into eq. (5.22), one gets:
$$\frac{\psi_k}{\psi_1} = \frac{a_1 - \mu}{a_k - \mu} \cdot \frac{(\lambda - V(\mu)) x_k + U(\mu) y_k}{(\lambda - V(\mu)) x_1 + U(\mu) y_1}$$
Let us check the general results on these explicit formulae. Recalling the expansion (5.15), we see that $(a_1 - \mu)/(a_k - \mu)$ has a double zero at $Q_1$ (λ = ∞, µ = $a_1$) and a double pole at $Q_k$ (λ = ∞, µ = $a_k$). Consider the meromorphic function $\phi_k = (\lambda - V(\mu)) x_k + U(\mu) y_k$. It has poles at the points $Q_i$. Using eq. (5.15) we see that it has double poles at $Q_i$, i ≠ k, and a simple pole at $Q_k$. In fact $(-V(\mu) x_k + U(\mu) y_k) = \lambda^2 x_i J_{ki}/F_i + O(1)$. This shows that $\psi_k/\psi_1$ has a simple zero at $Q_1$ and a simple pole at $Q_k$. To find the other poles of $\psi_k/\psi_1$ we study the zeroes of $\phi_k$, which has (2N − 1) poles and therefore (2N − 1) zeroes. Among them, N are common to all functions $\phi_k$ and cancel in $\psi_k/\psi_1$. Indeed, a common zero is such that λ − V(µ) = U(µ) = 0. By eq. (5.12) the points satisfying these two equations are on the spectral curve. The equation U(µ) = 0 has (N − 1) roots, $\mu_j$, at finite µ, and $\lambda_j = V(\mu_j)$ selects one of the two points above $\mu_j$. In addition we have the point λ = 0, µ = ∞ which is a simple zero in view of eq. (5.16). Finally, $\psi_k/\psi_1$ has a zero at $Q_1$, a pole at $Q_k$, and g = N − 1 poles at finite distance depending on the dynamical data, in agreement with the general considerations of this section.

5.3 The adjoint linear system

In view of eq. (5.20) for the Lax matrix, it is important to compute $\hat{\Psi}^{-1}(\lambda)$ in an efficient way. This is achieved by introducing the solution $\Psi^+(P)$ of the adjoint linear system:
$$\Psi^+(P)\,(L(\lambda) - \mu 1) = 0 \tag{5.23}$$
Here $\Psi^+(P)$ is a row vector. The precise relation between $\Psi^+(P)$ and the matrix $\hat{\Psi}^{-1}(\lambda)$ is provided by the following:

Proposition. Let $\Psi^+(P)$ be a solution of the adjoint system (5.23). The inverse of the matrix $\hat{\Psi}(\lambda)$ defined in eq. (5.18) is the matrix whose rows are the N row vectors
$$\Psi^{(-1)}(P_j) = \Psi^+(P_j) / \langle \Psi^+(P_j) \Psi(P_j) \rangle \tag{5.24}$$
with $P_j$ the N points above λ and $\langle V W \rangle = \sum_i V_i W_i$.

Proof. One has to show that
$$\sum_{k=1}^{N} \frac{\psi_k^+(P_j)\, \psi_k(P_i)}{\langle \Psi^+(P_j) \Psi(P_j) \rangle} = \delta_{ij}$$
where $P_i$ and $P_j$ are two points of Γ above the same λ. This is obvious for i = j, and for i ≠ j, $\Psi(P_i)$ and $\Psi^+(P_j)$ are orthogonal because computing $\Psi^+(P_j) L(\lambda) \Psi(P_i)$ in two different ways we find $\mu(P_j) \langle \Psi^+(P_j) \Psi(P_i) \rangle = \langle \Psi^+(P_j) \Psi(P_i) \rangle \mu(P_i)$, hence the scalar product vanishes if $\mu(P_i)$ and $\mu(P_j)$ are different. So we get:
$$\hat{\Psi}^{-1}(\lambda) = \begin{pmatrix} \dfrac{\Psi^+(P_1)}{\langle \Psi^+(P_1) \Psi(P_1) \rangle} \\ \vdots \\ \dfrac{\Psi^+(P_N)}{\langle \Psi^+(P_N) \Psi(P_N) \rangle} \end{pmatrix} \tag{5.25}$$
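Formulae (5.24), (5.25) and the reconstruction formula (5.20) can be tested at a fixed value of λ on any matrix with distinct eigenvalues. In the sketch below the matrix `L` is a made-up 3 × 3 example with eigenvalues 1, 2, 3, not a Lax matrix from the text. Right and left eigenvectors are both read off the matrix of cofactors of L − µ1, which has rank one on the spectral curve, and both are normalized to have first component 1 (assumed non-zero, as in the generic case).

```python
from fractions import Fraction


def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """Transposed matrix of cofactors, so that m * adjugate(m) = det(m) * 1."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)]
            for i in range(n)]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def eigen_pair(lax, mu):
    """Right (column) and left (row) eigenvectors, first components set to 1."""
    n = len(lax)
    a = [[lax[i][j] - (mu if i == j else 0) for j in range(n)] for i in range(n)]
    adj = adjugate(a)          # rank one: adj[i][j] = alpha_i * beta_j
    scale = adj[0][0]          # assumed non-zero (generic case)
    right = [adj[i][0] / scale for i in range(n)]
    left = [adj[0][j] / scale for j in range(n)]
    return right, left


L = [[Fraction(v) for v in row] for row in ([0, 1, 0], [0, 0, 1], [6, -11, 6])]
eigenvalues = [Fraction(1), Fraction(2), Fraction(3)]
n = len(L)

pairs = [eigen_pair(L, mu) for mu in eigenvalues]
# Columns of psi_hat are the right eigenvectors Psi(P_j), as in eq. (5.18).
psi_hat = [[pairs[j][0][i] for j in range(n)] for i in range(n)]
# Rows of psi_inv are Psi^+(P_j) / <Psi^+(P_j) Psi(P_j)>, as in eq. (5.25).
psi_inv = []
for right, left in pairs:
    norm = sum(l * r for l, r in zip(left, right))
    psi_inv.append([l / norm for l in left])

mu_hat = [[eigenvalues[i] if i == j else Fraction(0) for j in range(n)]
          for i in range(n)]
product_check = matmul(psi_inv, psi_hat)
reconstructed = matmul(matmul(psi_hat, mu_hat), psi_inv)
```

With exact rationals, `product_check` is the identity and `reconstructed` equals `L`, which is formula (5.20) at one value of λ.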
We now use this relation between $\Psi^+(P)$ and $\hat{\Psi}^{-1}(\lambda)$ to reconstruct them from their analyticity properties. One may perform on $\Psi^+(P)$ the same analysis as for the vector Ψ(P). Normalizing the first component of $\Psi^+(P)$ to 1, one sees that the vector $\Psi^+(P)$ has g poles at a divisor $D^+$ and (N − 1) simple poles at $Q_2, \ldots, Q_N$ above λ = ∞. Moreover, $\psi_k^+(P)$, k ≥ 2, has a zero at $Q_1$. Our first task is to relate the divisor $D^+$ to D.

Proposition. Let $\Psi^+(P)$ be the solution of eq. (5.23) normalized by $\psi_1^+(P) = 1$. The differential form
$$\Omega \equiv \frac{d\lambda}{\langle \Psi^+(P) \Psi(P) \rangle} \tag{5.26}$$
is an Abelian differential of the second kind with a double pole at $Q_1$ and zeroes at D and $D^+$. Conversely, there is a unique differential Ω of the second kind with a double pole at the point $Q_1$ and having among its zeroes the g points of D. Its g other zeroes are then completely fixed and define $D^+$. Its image under the Abel map is given by:
$$A(D^+) = -A(D) + A(B) - 2 \sum_{j=2}^{N} A(Q_j) \tag{5.27}$$
where B is the divisor of branch points of the covering (λ, µ) → λ.

Proof. Consider the meromorphic function $f(P) = \langle \Psi^+(P) \Psi(P) \rangle$. It has 2(g + N − 1) poles coming from the poles of Ψ and $\Psi^+$. Their divisor is $D + D^+ + 2 \sum_{j=2}^{N} Q_j$. Therefore it has also 2(g + N − 1) zeroes, which are in fact the branch points of the covering, as we now see (recall that the covering has 2(g + N − 1) branch points by the Riemann–Hurwitz formula). So let $P = (\lambda_0, \mu_0)$ be a branch point and consider two points $P_1$ and $P_2$ above the same λ close to $\lambda_0$ on the two sheets of the covering that coalesce at P. Because $\Psi^+(P_1)$ and $\Psi(P_2)$ are dual eigenvectors corresponding to different eigenvalues, they are orthogonal:
$$\langle \Psi^+(P_1) \Psi(P_2) \rangle = 0 \tag{5.28}$$
The assertion then follows by continuity, since $\Psi(P_2) \to \Psi(P)$ and $\Psi^+(P_1) \to \Psi^+(P)$ (recall that the line bundles are analytic at P). At a branch point P we have $(\mu - \mu_0)^2 \sim (\lambda - \lambda_0)$, so $\mu - \mu_0$ is a local parameter and $d\lambda \sim (\mu - \mu_0)\, d\mu$ vanishes at P. Moreover, dλ has double poles at the points $Q_1, \ldots, Q_N$ above λ = ∞. We see that Ω is regular at the branch points, has a double pole at $Q_1$, and has zeroes at $D + D^+$.

Recall that given a point P ∈ Γ and a divisor $D = (\gamma_1, \ldots, \gamma_g)$ of degree g on Γ, the Abel map with base point $P_0$ is defined by:
$$A_j(P) = \int_{P_0}^{P} \omega_j \qquad \text{and} \qquad A_j(D) = \sum_{i=1}^{g} \int_{P_0}^{\gamma_i} \omega_j \tag{5.29}$$
where the $\omega_j$ form a normalized basis of Abelian differentials. Applying Abel's theorem to the meromorphic function Ω/dλ one gets eq. (5.27).

Conversely, assume that we have two such forms Ω and Ω′ with a double pole at $Q_1$ and divisor of zeroes $D + D^+$ and $D + D'^+$ respectively. Their quotient is a meromorphic function with a divisor of poles of degree g, i.e. a constant generically. Hence the differential Ω is unique.

The outcome of this proposition is that Ω is uniquely characterized by its behaviour at infinity and its zeroes at the points of D. Therefore, given
5.3 The adjoint linear system
141
the dynamical divisor D, we know the form Ω and we can find the divisor D+ as the complementary set of zeroes of Ω. This information on the divisor D+ can now be used to reconstruct the vector Ψ+ (P ). Its components ψk+ (P ) are uniquely determined up to normalization once we know the divisor D+ . Here, however, we have no freedom on these normalizations since we must preserve the orthogonality conditions (5.28). Let ψ˜k+ (P ) be any choice of such meromorphic function and let ψk+ (P ) = (1/ck ) ψ˜k+ (P ). We want to determine the constants ck . We require that ψk+ (P ) satisfies an orthogonality relation of the form Ψ+ (Pj )Ψ(Pi ) = f (Pj )δij . This means that the matrices of elements ψi+ (Pj )/f (Pj ) and ψj (Pi ) are inverse to each other. By uniqueness of the inverse matrix we also have: 1 1 ψ˜i+ (Pk )ψj (Pk ) ψi+ (Pk )ψj (Pk ) = δij , or = ci δij f (Pk ) f (Pk ) k
k
We have an independent characterization of f (P ). By eq. (5.26) f −1 (P )dλ = Ω(P ), and therefore f (P ) is known from its analyticity properties. This allows us to compute ci as follows. Consider on the Riemann surface the form ψ˜i+ (P )ψj (P )Ω(P ). For i = j, it has a double pole at Qi and for i = j, it has simple poles at Qi and Qj and no other singularity. This is because the poles at D and D+ in Ψ ˜ + cancel against the zeroes of Ω, and the double pole of Ω at Q1 and Ψ ˜ + at this point. Finally we define a form combines with zeroes of Ψ and Ψ on the Riemann sphere λ by: ωij = ψ˜i+ (Pk )ψj (Pk )Ω(Pk ) k
where the $N$ points $P_k$ are the points above $\lambda$. If there are no branch points among the $P_k$, $\lambda$ is a local parameter around each $P_k$ and $\omega_{ij} = g_{ij}(\lambda)\,d\lambda$. If there is a branch point a short computation shows that this still holds. Since $\omega_{ij}$ is regular for finite $\lambda$, the function $g_{ij}(\lambda)$ is in fact a polynomial in $\lambda$. Moreover, $\omega_{ij}$ has poles of order at most 1 for $i \neq j$ and 2 for $i = j$ at $\lambda = \infty$. Since $d\lambda$ has a double pole at $\infty$, this implies $\omega_{ij} = 0$ for $i \neq j$ and $\omega_{ii} = c_i\,d\lambda$ for some constants $c_i$. We have obtained orthogonality relations:
$$\omega_{ij} = c_i\,\delta_{ij}\,d\lambda, \qquad c_i = \lim_{P\to Q_i} \tilde\psi_i^+(P)\,\psi_i(P)\,\frac{\Omega}{d\lambda}(P) \tag{5.30}$$
These orthogonality relations show that the inverse of the matrix $\widehat\Psi$ is given by:
$$\left(\widehat\Psi^{-1}\right)_{ij} = \frac{\Omega(P_i)}{d\lambda}\,\frac{1}{c_j}\,\tilde\psi_j^+(P_i) \tag{5.31}$$
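As a quick consistency check, one can verify directly that eq. (5.31) inverts the matrix of eigenvectors. The sketch below assumes the convention $\widehat\Psi_{kj} = \psi_k(P_j)$ (components label rows, the points above $\lambda$ label columns), which is inferred from the orthogonality relations rather than restated in this passage.

```latex
\begin{align*}
\big(\widehat\Psi\,\widehat\Psi^{-1}\big)_{kj}
  &= \sum_i \psi_k(P_i)\,\frac{\Omega(P_i)}{d\lambda}\,\frac{1}{c_j}\,\tilde\psi_j^+(P_i) \\
  &= \frac{1}{c_j}\,\frac{\omega_{jk}}{d\lambda}
   = \frac{1}{c_j}\,c_j\,\delta_{jk}
   = \delta_{kj} .
\end{align*}
```

The second equality is the definition of $\omega_{jk}$, and the third is eq. (5.30).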
5 Analytical methods
It is worth noticing that this expression is invariant under a change of normalization of the components of $\tilde\Psi^+$, i.e. $\tilde\psi_j^+ \to d_j\tilde\psi_j^+$ yields $c_j \to d_j c_j$ and $d_j$ cancels. On the other hand it transforms appropriately under a change of normalization of the components of $\Psi$. One then gets $\Omega = d\lambda/(\Psi^+\Psi)$ if one sets the $k$th component of $\Psi^+$ to the invariant value $(1/c_k)\,\tilde\psi_k^+$. Let us summarize the situation in a proposition:

Proposition. Given an effective divisor $D$ of degree $g$, the functions $\psi_k(P)$ with divisor $\geq -(D + Q_k - Q_1)$ are unique up to normalization. There is a unique form $\Omega$ having a double pole at $Q_1$ and vanishing on $D$. It has $g$ other zeroes at an effective divisor $D^+$ of degree $g$. There exists a unique set of functions $\psi_k^+(P)$ of divisor $\geq -(D^+ + Q_k - Q_1)$ such that $\sum_k \psi_i^+(P_k)\psi_j(P_k)\Omega(P_k) = \delta_{ij}\,d\lambda$. The inverse of the matrix $\widehat\Psi$ is $(\widehat\Psi^{-1})_{ij} = \psi_j^+(P_i)\,\Omega(P_i)/d\lambda$.

5.4 Time evolution

The aim of this section is to solve the equations of motion by looking at the time evolution of the dynamical divisor $D$. The outcome is the beautiful fact that the dynamical flow linearizes on the Jacobian of the spectral curve. Recall that the time evolution is governed by the Lax equation,
$$\frac{d}{dt}L(\lambda) = [M(\lambda), L(\lambda)]$$
As we have seen in Chapter 3, the matrix $M(\lambda)$ is of the form
$$M = \sum_k M_k, \quad\text{with}\quad M_k(\lambda) = \left(P^{(k)}(L,\lambda)\right)_- \tag{5.32}$$
where $P^{(k)}(L,\lambda)$ is a polynomial in $L(\lambda)$ with constant rational coefficients in $\lambda$, and $(P^{(k)}(L,\lambda))_-$ denotes its polar part at $\lambda_k$. Suppose that, at time $t$, we made the analysis of the previous section and built the normalized eigenvector $\Psi(t,P)$. If $\Psi(t,P)$ is an eigenvector of $L(\lambda)$ with eigenvalue $\mu$, the Lax equation eq. (5.32) implies $(L(\lambda) - \mu)\left(\frac{d\Psi}{dt} - M\Psi\right) = 0$. It follows that
$$\frac{d}{dt}\Psi(t,P) = \left(M(\lambda) - C(t,P)\,\mathbf{1}\right)\Psi(t,P) \tag{5.33}$$
where $C(t,P)$ is a scalar function. Normalizing the eigenvector $\Psi(t,P)$ such that its first component equals 1 gives:
$$C(t,P) = \sum_j M_{1j}(\lambda)\,\psi_j(t,P)$$
By the analysis of the previous section, the normalized eigenvector $\Psi(t,P)$ has poles at the dynamical divisor $D(t)$ and at the $N-1$ points $Q_k$, $k = 2,\dots,N$. Consider the function $N(t,dt,P) \equiv 1 + C(t,P)\,dt$, with $dt$ infinitesimal:
$$N(t,dt,P) = 1 + dt\sum_j M_{1j}(\lambda)\,\psi_j(t,P)$$
One can rewrite eq. (5.33) in the equivalent form:
$$N(t,dt,P)\,\Psi(t+dt,P) = \left(1 + dt\,M(\lambda)\right)\cdot\Psi(t,P) + O(dt^2) \tag{5.34}$$
We see that the meromorphic function $N(t,dt,P)$ of $P\in\Gamma$ normalizes the eigenvector whose time evolution is naturally induced by the Lax equation $\dot\Psi = M\Psi$. The divisor of this meromorphic function reads:
$$(N) = D(t+dt) + \sum_{k,i}\sum_{\alpha=1}^{m_k} P^{\alpha}_{k,i} - D(t) - \sum_{k,i} m_k\,P_{k,i}$$
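The passage from eq. (5.33) to eq. (5.34) is a first-order expansion in $dt$. Spelled out, using only those two equations and the definition $N = 1 + C\,dt$:

```latex
\begin{align*}
\Psi(t+dt,P) &= \Psi(t,P) + dt\,\big(M(\lambda) - C(t,P)\big)\Psi(t,P) + O(dt^2), \\
\big(1 + C(t,P)\,dt\big)\,\Psi(t+dt,P)
  &= \Psi(t,P) + dt\,(M - C)\,\Psi(t,P) + dt\,C\,\Psi(t,P) + O(dt^2) \\
  &= \big(1 + dt\,M(\lambda)\big)\,\Psi(t,P) + O(dt^2) .
\end{align*}
```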
From eq. (5.34) we see that $N$ cancels the poles of $\Psi(t+dt,P)$ at $D(t+dt)$ and produces the poles of $\Psi(t,P)$ at $D(t)$. The poles at $Q_2,\dots,Q_N$ are the same on both sides and do not appear in $N$. Moreover, since $M_k(\lambda)$ has a pole of order $m_k$ at $\lambda_k$, $N$ has poles of order $m_k$ at the $N$ points $P_{k,i}$ above $\lambda_k$. Finally, $N$ has extra zeroes $P^{\alpha}_{k,i}$ to match the number of its poles. Since $dt$ is small, and $N = 1$ for $dt = 0$, the zeroes are close to the poles, $D(t+dt)$ is close to $D(t)$, and on each sheet $i$ of the covering there are exactly $m_k$ zeroes $P^{\alpha}_{k,i}$ close to the $m_k$th order pole $P_{k,i}$.

Theorem. Let $\gamma_j(t)$ with $j = 1,\dots,g$ be the points of the dynamical divisor $D(t)$. Let $\omega$ be any holomorphic differential on $\Gamma$. The time evolution of the points $\gamma_j(t)$ induced by the Lax equation $\dot L(\lambda) = [M(\lambda), L(\lambda)]$ with $M(\lambda) = \sum_k (P^{(k)}(L,\lambda))_-$ is such that:
$$\frac{d}{dt}\sum_{j=1}^{g}\int^{\gamma_j(t)}\omega = \sum_k\sum_{i=1}^{N}\mathrm{Res}_{P_{k,i}}\,\omega\,P^{(k)}(\mu,\lambda) \tag{5.35}$$
where the points $P_{k,i}$ are the $N$ points above $\lambda_k$. Notice that the right-hand side is independent of time.

Proof. Since $N(t,dt,P)$ is a meromorphic function, Abel's theorem tells us that:
$$\sum_{j=1}^{g}\int_{\gamma_j(t)}^{\gamma_j(t+dt)}\omega + \sum_{k,i,\alpha}\int_{P_{k,i}}^{P^{\alpha}_{k,i}(dt)}\omega = 0 \tag{5.36}$$
for any holomorphic differential $\omega$. This equation will give us the time evolution of the divisor as in eq. (5.35) if we can evaluate the second sum. For this we need the following lemma:

Lemma. Consider a point $P\in\mathbb{C}$, a holomorphic differential $\omega$ in the neighbourhood $V$ of $P$ and an analytic function $u$ in $V$ with a pole of order $m$ at $P$. Consider $\epsilon\in\mathbb{C}$ small enough, so that the $m$ points $P_\alpha(\epsilon)$, where $u(P_\alpha(\epsilon)) + \epsilon^{-1} = 0$, belong to $V$. Then
$$\lim_{\epsilon\to 0}\frac{1}{\epsilon}\sum_{\alpha=1}^{m}\int_{P}^{P_\alpha(\epsilon)}\omega = -\mathrm{Res}_P(\omega u)$$
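Proof. Let $\omega = d\sigma$ with $\sigma(P) = 0$. For any path $\pi$ enclosing the zeroes $P_\alpha(\epsilon)$ of $u + \epsilon^{-1}$ and the point $P$, we have
$$\frac{1}{\epsilon}\sum_{\alpha=1}^{m}\int_{P}^{P_\alpha(\epsilon)}\omega = \frac{1}{\epsilon}\sum_\alpha\sigma(P_\alpha(\epsilon)) = \frac{1}{\epsilon}\sum_\alpha\mathrm{Res}_{P_\alpha(\epsilon)}\,\sigma\,\frac{u'}{u+\epsilon^{-1}}\,dz = \frac{1}{2i\pi}\oint_\pi\frac{u'}{1+\epsilon u}\,\sigma\,dz$$
We used that the integrand is regular at $P$ because $\sigma(P) = 0$. When $\epsilon$ tends to zero, the right-hand side tends to $\frac{1}{2i\pi}\oint_\pi u'\sigma\,dz = -\frac{1}{2i\pi}\oint_\pi u\,\omega = -\mathrm{Res}_P(u\omega)$.

Returning to the proof of the Theorem, we decompose the second sum in eq. (5.36) as a sum of terms associated with each pole $P_{k,i}$ of $M(\lambda)$ on $\Gamma$. The points $P^{\alpha}_{k,i}$, close to $P_{k,i}$, are by definition solutions of $1 + dt\,u_k(P^{\alpha}_{k,i}) = 0$ with $u_k = \sum_j (M_k)_{1j}\psi_j$. Thus, by the Lemma and eq. (5.36), one finds:
$$\sum_{j=1}^{g}\int_{\gamma_j(t)}^{\gamma_j(t+dt)}\omega = dt\sum_{k,i}\mathrm{Res}_{P_{k,i}}\,\omega\sum_j (M_k)_{1j}\,\psi_j(t,P) \tag{5.37}$$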
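Before continuing, the Lemma can be checked on a simple special case. The choice $u = a\,z^{-m}$ and $\omega = z^{n}\,dz$ with $n \geq 0$, in a local coordinate $z$ vanishing at $P$, is an illustrative assumption and not part of the original argument.

```latex
\begin{align*}
u(z_\alpha) + \epsilon^{-1} = 0
  &\iff z_\alpha^{\,m} = -a\,\epsilon, \qquad \alpha = 1,\dots,m, \\
\frac{1}{\epsilon}\sum_{\alpha=1}^{m}\int_0^{z_\alpha} z^{n}\,dz
  &= \frac{1}{\epsilon\,(n+1)}\sum_{\alpha=1}^{m} z_\alpha^{\,n+1}
   = \begin{cases}
       \dfrac{(-a)^{q}\,\epsilon^{\,q-1}}{q} & \text{if } n+1 = q\,m,\ q \geq 1, \\[1ex]
       0 & \text{otherwise,}
     \end{cases} \\
\lim_{\epsilon\to 0}\frac{1}{\epsilon}\sum_{\alpha=1}^{m}\int_0^{z_\alpha} z^{n}\,dz
  &= -a\,\delta_{n,\,m-1}
   = -\mathrm{Res}_{z=0}\big(a\,z^{\,n-m}\,dz\big) .
\end{align*}
```

The power sum over the $m$ roots vanishes unless $m$ divides $n+1$, and only the case $q = 1$ survives the limit, in agreement with $-\mathrm{Res}_P(\omega u)$.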
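Recall that $M_k(\lambda)$ is the polar part of $P^{(k)}(L,\lambda)$, i.e.
$$M_k(\lambda) = \left(P^{(k)}(L,\lambda)\right)_- = P^{(k)}(L,\lambda) - \left(P^{(k)}(L,\lambda)\right)_+$$
with $\left(P^{(k)}(L,\lambda)\right)_+$ regular at $\lambda_k$, which therefore does not contribute to eq. (5.37). Therefore,
$$\mathrm{Res}_{P_{k,i}}\,\omega\sum_j (M_k)_{1j}\,\psi_j(t,P) = \mathrm{Res}_{P_{k,i}}\,\omega\left(P^{(k)}(L,\lambda)\Psi(t,P)\right)_1 = \mathrm{Res}_{P_{k,i}}\,\omega\,P^{(k)}(\mu,\lambda)\,\psi_1(t,P) = \mathrm{Res}_{P_{k,i}}\,\omega\,P^{(k)}(\mu,\lambda)$$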
where we have used the fact that $\Psi(t,P)$ is an eigenvector and the normalization $\psi_1(t,P) = 1$. Equation (5.35) can alternatively be written in terms of Abel's map.

Theorem. The flow induced by the Lax equation (5.32) on the eigenvector bundle is a linear flow on the Jacobian of the spectral curve:
$$A(D(t)) - A(D(0)) = -t\,U^{(M)} \tag{5.38}$$
with
$$U^{(M)}_j = -\sum_{k,i}\mathrm{Res}_{P_{k,i}}\,\omega_j\,P^{(k)}(\mu,\lambda) \tag{5.39}$$
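To see how eq. (5.38) follows from eq. (5.35), apply the latter to each basis differential $\omega_j$ and integrate in time. The sketch assumes that the Abel map of a divisor is the sum $A_j(D) = \sum_i \int_{P_0}^{\gamma_i}\omega_j$ with an arbitrary base point $P_0$, which is the standard definition but is not restated in this section.

```latex
\begin{align*}
\frac{d}{dt}A_j(D(t))
  &= \frac{d}{dt}\sum_{i=1}^{g}\int_{P_0}^{\gamma_i(t)}\omega_j
   = \sum_{k,i}\mathrm{Res}_{P_{k,i}}\,\omega_j\,P^{(k)}(\mu,\lambda)
   = -U^{(M)}_j, \\
A_j(D(t)) - A_j(D(0)) &= -t\,U^{(M)}_j .
\end{align*}
```

The integration in the last line is immediate because the right-hand side of eq. (5.35) is independent of time.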
5.5 Theta-functions formulae

Since the motion linearizes on the Jacobian $\mathrm{Jac}(\Gamma)$, it is natural to express the solution in terms of Riemann's theta-functions. We use the notations and results of Chapter 15, in particular $K$ is the vector of Riemann's constants. We first recall the way to parametrize meromorphic functions on a Riemann surface $\Gamma$, in terms of theta-functions, on its Jacobian. For $e$ in an open set of the divisor $\Theta = \{z\in\mathrm{Jac}(\Gamma)\,|\,\theta(z) = 0\}$, the function $y\to\theta(e+\int_x^y\omega)$ does not vanish identically in $y$. By Riemann's theorem it has $g$ zeroes $y_1,\dots,y_g$ for given $x$. Since $e\in\Theta$, one of these zeroes is $x$ and we choose $y_1 = x$. We now show that $y_2,\dots,y_g$ are independent of $x$. Indeed, by Riemann's theorem we have: $A(y_1)+\cdots+A(y_g) = A(x) - e - K$, so that $y_2,\dots,y_g$ are determined by $A(y_2)+\cdots+A(y_g) = -e-K$. This equation is independent of $x$. As a side remark note that for such a $y_j$, $j\geq 2$, we have $\theta(-e+\int_{y_j}^{x}\omega) = 0$ for all $x$. This means that some translation of the curve $\Gamma$ embedded into the Jacobian by the Abel map is entirely contained in $\Theta$, hence one has to be careful in the choice of the vector $e$.

To use this result to construct meromorphic functions on the Riemann surface, notice that the building block $\theta(e+\int_{x_1}^{y}\omega)/\theta(e+\int_{x_2}^{y}\omega)$ has a zero at $y = x_1$ and a pole at $y = x_2$. The extra $(g-1)$ zeroes at the numerator and denominator cancel. We then assemble such blocks so that the product has no monodromy.

Explicit expressions for the matrix $\widehat\Psi(P)$ and its inverse are easily written. Let $D(t) = (\gamma_1(t),\dots,\gamma_g(t))$ be the dynamical divisor. Let $U^{(M)}$ be the $g$-dimensional vector eq. (5.39) with components $U^{(M)}_j$. Then,
$$\psi_k(t,P) = d_k\,\frac{\theta(A(P)-A(Q_k)+A(Q_1)-\zeta_{D(t)})}{\theta(A(P)-\zeta_{D(t)})}\,\frac{\theta(e+\int_{Q_1}^{P}\omega)}{\theta(e+\int_{Q_k}^{P}\omega)} \tag{5.40}$$
In this equation $\zeta_D = A(D)+K$. Equation (5.38) implies $\zeta_{D(t)} = \zeta_{D(0)} - t\,U^{(M)}$. The $d_k$ are constants ($d_1 = 1$) related to the residual diagonal action of the gauge group. Note that the sum of the arguments of the theta functions in the numerator is equal to the sum of the arguments in the denominator so that the whole expression has no monodromy when the point $P$ loops around non-trivial cycles of $\Gamma$. Hence eq. (5.40) defines a meromorphic function on $\Gamma$ with the correct zeroes and poles. Similarly one has:
$$\psi_k^+(t,P) = \frac{1}{c_k}\,\frac{\theta(A(P)-A(Q_k)+A(Q_1)-\zeta_{D^+(t)})}{\theta(A(P)-\zeta_{D^+(t)})}\,\frac{\theta(e^++\int_{Q_1}^{P}\omega)}{\theta(e^++\int_{Q_k}^{P}\omega)} \tag{5.41}$$
Here $D^+$ is the divisor given by eq. (5.27) and the normalization constants $c_k$ are defined in eq. (5.30). We now compute them. To do that we express the meromorphic function $\Omega/d\lambda$ in terms of theta-functions. Let $B$ be the set of branch points of the covering $(\lambda,\mu)\to\lambda$. We decompose the set of $2(g+N-1)$ points of $B$ into four subsets $B_0, B_0', B_1, B_1'$ such that $\mathrm{card}\,B_0 = \mathrm{card}\,B_0' = N-1$ and $\mathrm{card}\,B_1 = \mathrm{card}\,B_1' = g$. This decomposition is arbitrary but does not affect the final formulae. Then we can write ($\prod_{B_i}$ means $\prod_{b_k\in B_i}$):
$$\frac{\Omega}{d\lambda}(P) = \frac{\prod_{j=2}^{N}\theta(e+\int_{Q_j}^{P}\omega)\,\theta(e^++\int_{Q_j}^{P}\omega)}{\prod_{B_0}\theta(e+\int_{b_k}^{P}\omega)\,\prod_{B_0'}\theta(e^++\int_{b_k}^{P}\omega)}\;\frac{\theta(A(P)-\zeta_{D(t)})\,\theta(A(P)-\zeta_{D^+(t)})}{\theta(A(P)-\zeta_{B_1})\,\theta(A(P)-\zeta_{B_1'})} \tag{5.42}$$
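This expression of $\Omega/d\lambda$ has the correct zeroes and poles, and has no monodromy in view of eq. (5.27). To compute $c_k$ we can now use eq. (5.30), yielding:
$$c_k = d_k\times\frac{\theta(A(Q_1)-\zeta_{D(t)})\,\theta(A(Q_1)-\zeta_{D^+(t)})\,\prod_{j\neq k}\theta(e+\int_{Q_j}^{Q_k}\omega)\,\theta(e^++\int_{Q_j}^{Q_k}\omega)}{\prod_{B_0}\theta(e+\int_{b_k}^{Q_k}\omega)\,\prod_{B_0'}\theta(e^++\int_{b_k}^{Q_k}\omega)\;\theta(A(Q_k)-\zeta_{B_1})\,\theta(A(Q_k)-\zeta_{B_1'})} \tag{5.43}$$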
This gives an expression of the elements of the matrix $\widehat\Psi^{-1}$ as products of theta-functions, through eq. (5.31).

Example. Let us apply the above formalism to find the solution of the Neumann model. In this case since ${}^tL(\lambda) = L(-\lambda)$ the normalized $\Psi^+(\lambda,\mu)$ is equal to ${}^t\Psi(-\lambda,\mu)$. The transformation $(\lambda,\mu)\to(-\lambda,\mu)$ is
just the hyperelliptic involution $\sigma$ on the spectral curve of the Neumann model. The fixed points of $\sigma$ are the $(2g+2)$ points $(\lambda,\mu)$ lying above $\lambda = \infty$, namely the $Q_j$, and above $\lambda = 0$, namely the point $P_\infty$ $(\lambda = 0, \mu = \infty)$ and the $N-1$ points of coordinates $(\lambda = 0, \mu = \beta_i)$, see eq. (5.14). We take the point $P_\infty$ as base point of the Abel map, and note that the hyperelliptic involution changes the sign of the Abelian differentials which are of the form $p(\mu)\,d\mu/\lambda$, so that for any point $P$ we get $A(\sigma(P)) = -A(P)$ modulo periods. The branch points $B$ of the covering $(\lambda,\mu)\to\lambda$ are solutions of $\partial_\mu\Gamma(\lambda,\mu) = 0$, i.e. $\sum_k F_k/(\mu-a_k)^2 = 0$. This equation has $(2N-2)$ roots at finite distance, each one giving rise to two points related by the hyperelliptic involution. Hence the set $B$ is globally invariant under $\sigma$ and one can choose $B_0' = \sigma(B_0)$ and $B_1' = \sigma(B_1)$ in the above construction. Considering eq. (5.27), we see that $D^+ = \sigma(D)$. Indeed, since $B = \sum_i (b_i+\sigma(b_i))$ with $\sigma(b_i)\neq b_i$ we have $A(B) = \sum_i (A(b_i)+A(\sigma(b_i))) = 0$ up to periods. Similarly, since $\sigma(Q_j) = Q_j$ we have $A(Q_j) = -A(Q_j)$ so that $2A(Q_j) = 0$ modulo periods. This shows that $A(Q_j)$ is a half-period on the Jacobian torus. Finally, we get $A(D^+) = -A(D)$ modulo periods so that $D^+ = \sigma(D)$ since $D$ is generic. From this we understand that the requirement $\psi_k^+(t,P) = \psi_k(t,\sigma(P))$ fixes the constants $d_k$ and $e^+$ in the expressions (5.40) and (5.41). Let us compare for instance the theta-functions $\theta(A(P)-A(Q_k)+A(Q_1)-\zeta_{D^+(t)})$ and $\theta(A(\sigma(P))+A(Q_k)-A(Q_1)-\zeta_{D(t)})$. The sum of the two arguments vanishes modulo periods, because in the hyperelliptic case the vector of Riemann's constants $K$ is some half-period, hence these two theta-functions have the same zeroes.
Applying a similar argument to the other theta-functions and choosing $e^+ = -e$, we see that:
$$\psi_k^+(t,P) = \frac{1}{c_k}\,\frac{\theta(-A(P)+A(Q_k)-A(Q_1)-\zeta_{D(t)})}{\theta(-A(P)-\zeta_{D(t)})}\,\frac{\theta(e-\int_{Q_1}^{P}\omega)}{\theta(e-\int_{Q_k}^{P}\omega)} \tag{5.44}$$
Starting from the expressions (5.40) for $\psi_k$, (5.44) for $\psi_k^+$ and (5.42) for $\Omega/d\lambda$, one computes $c_k$ according to eq. (5.30). Then the relation $\psi_k^+(t,P) = \psi_k(t,\sigma(P))$ fixes the constant $d_k$.

To compute the solution of the Neumann model note that the diagonal element $L_{kk}$ of the Lax matrix reads $L_{kk}(\lambda) = a_k - x_k^2/\lambda^2$. On the other hand, from the reconstruction formula eq. (5.20), we have $L_{kk}(\lambda) = \sum_i \psi_k(P_i)\,\mu(P_i)\,(\psi^{(-1)})_k(P_i)$, where the $P_i$ are the $N$ points above $\lambda$. When $\lambda\to 0$ only $\mu(P_\infty)$ diverges as $-1/\lambda^2$ while all the other terms remain finite. Hence:
$$x_k^2 = \psi_k(P_\infty)\,(\psi^{(-1)})_k(P_\infty)$$
Note that this formula implies $\sum_k x_k^2 = 1$ as it should be in the Neumann model, and moreover this expression is independent of the constant $d_k$. Inserting the above expressions we immediately find:
$$x_k^2(t) = \alpha_k\,\frac{\theta(A(Q_k)-A(Q_1)+\zeta_{D(t)})\,\theta(A(Q_k)-A(Q_1)-\zeta_{D(t)})}{\theta(A(Q_1)-\zeta_{D(t)})\,\theta(A(Q_1)+\zeta_{D(t)})}$$
where $\alpha_k$ is given by ratios of theta-functions completely independent of the dynamical divisor $D(t)$. It depends only on the geometry of the spectral curve. It is convenient to express this result in terms of theta-functions with characteristics $\theta[\eta](z)$ which are essentially translates of the theta-function by half-periods:
$$\theta[\eta](z) = e^{i\pi\left({}^t\eta' B\eta' + 2\,{}^t\eta'(z+\eta'')\right)}\,\theta(z+\eta) \tag{5.45}$$
where $\eta = B\eta'+\eta''$ is a half-period, i.e. $\eta',\eta''\in(\mathbb{Z}/2)^g$. Note that $A(Q_i)$ are half-periods. By redefining $\zeta_D\to\zeta_D+A(Q_1)$ one gets rid of $A(Q_1)$ in the theta-functions. Indeed the first factors in the numerator and denominator get translated by the period $2A(Q_1)$. This produces exponential factors whose dependence on $\zeta_D$ cancels. Similarly, the dependence on $D$ in the exponential factor in eq. (5.45) cancels as well, this time between the two factors in the numerator, and separately between the two factors in the denominator of $x_k^2$. Finally $A(Q_k) = \int_{P_\infty}^{Q_k}\omega = B\eta'_{2k-1}+\eta''_{2k-1}$ is an even non-singular characteristic. Hence one gets
$$x_k^2(t) = \alpha_k\,\frac{\theta^2[\eta_{2k-1}](\zeta_{D(t)})}{\theta^2[0](\zeta_{D(t)})}$$
One could in principle evaluate the coefficients $\alpha_k$ directly by using the very special properties of theta-functions on hyperelliptic curves. However, we have a short cut by appealing to the Frobenius formula, only valid on hyperelliptic curves, which states that:
$$\sum_{k=1}^{g+1}\frac{\theta^2[\eta_{2k-1}](z)\,\theta^2[\eta_{2k-1}](0)}{\theta^2[0](z)\,\theta^2[0](0)} = 1$$
so we finally get:
$$x_k^2(t) = \frac{\theta^2[\eta_{2k-1}](z_0-Ut)\,\theta^2[\eta_{2k-1}](0)}{\theta^2[0](z_0-Ut)\,\theta^2[0](0)} \tag{5.46}$$
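As a consistency check, the solution (5.46) satisfies the constraint of the Neumann model automatically. The sketch assumes $g+1 = N$ for the Neumann curve, which matches the count of $2g+2 = 2N$ fixed points of the involution given earlier, and simply evaluates the Frobenius formula at $z = z_0 - Ut$.

```latex
\begin{equation*}
\sum_{k=1}^{N} x_k^2(t)
  = \sum_{k=1}^{g+1}
    \frac{\theta^2[\eta_{2k-1}](z)\,\theta^2[\eta_{2k-1}](0)}
         {\theta^2[0](z)\,\theta^2[0](0)}\Bigg|_{z = z_0 - Ut}
  = 1 \qquad \text{for all } t .
\end{equation*}
```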
Here $U$ is obtained by applying eq. (5.35) with $P^{(0)}(\mu,\lambda) = \lambda\mu$ (recall that for the Neumann model there is only a singularity at $\lambda = 0$ and $M(\lambda) = (\lambda L(\lambda))_-$, see Chapter 3). This yields:
$$U_j = -\sum_{P_i}\mathrm{Res}_{P_i}\,(\lambda\mu)\,\omega_j(P) = -\mathrm{Res}_{P_\infty}\,(\lambda\mu)\,\omega_j(P) = \omega_j(P_\infty)$$
where $P_i$ are the $N$ points above $\lambda = 0$, and we used that for $\lambda = 0$, the product $(\lambda\mu)$ has only a simple pole at $P_\infty$ $(\lambda = 0, \mu = \infty)$. The $\omega_j$ are the normalized Abelian differentials.

5.6 Baker–Akhiezer functions

Baker–Akhiezer functions are special functions with essential singularities on Riemann surfaces. With them, we have a very natural parametrization of eigenvectors of the linear system. We also get a very simple proof of the linearization theorem.

Definition. Let $P_1,\dots,P_l$ be points on a Riemann surface $\Gamma$ of genus $g$. Let $w_i(P)$, with $w_i(P_i) = 0$, be local parameters around these points. Let $S_i(P) = \sum_{r=-m_i}^{-1} S_{i,r}\,w_i^{r}$ be some singular parts around $P_i$. Let $D$ be a divisor on $\Gamma$. A Baker–Akhiezer function, $\Psi_{BA}(P)$, defined with these data, is a function such that: (1) it is meromorphic on $\Gamma$ outside the points $P_i$ with the divisor of its poles and zeroes satisfying $(\Psi_{BA})+D\geq 0$, (2) for $P\to P_i$ the product $\Psi_{BA}(P)\,e^{-S_i(w_i(P))}$ is analytic.

It is important to keep in mind the data involved in the definition of the Baker–Akhiezer functions. First we need a set of punctures $P_i$ on the Riemann surface. Second, we need a set of local parameters $w_i(P)$ allowing us to define a set of singular parts $S_i$ in the neighbourhood of each puncture. Notice that this definition is not invariant under change of local parameters $w_i$. Assuming for the moment that such functions exist it is worth making a few remarks.

Remark 1. If Baker–Akhiezer functions associated with a given set of data exist, they form a vector space. However, the sum of two Baker–Akhiezer functions with different singular parts $S_i$ is not a Baker–Akhiezer function.
Remark 2. The ratio of two Baker–Akhiezer functions associated with a given set of singular parts is a meromorphic function. This allows one to use standard analysis on Riemann surfaces to study Baker–Akhiezer functions. Remark 3. Even though Baker–Akhiezer functions are not meromorphic functions, they have the same number of poles and zeroes. The differential form d(log f ) is a meromorphic form. The sum of its residues is the number of zeroes minus the number
of poles of f and has to vanish. Essential singularities do not contribute because around Pi we have d(log f ) = dSi + regular and dSi has no residue.
We now give a fundamental formula expressing the Baker–Akhiezer functions in terms of Riemann theta-functions. Recall that a differential of the second kind is a meromorphic differential with poles of order $\geq 2$. See Chapter 15 for more details. Let $\Omega^{(S)}$ be the unique Abelian differential of the second kind, normalized with vanishing $a$-periods, and with singular part at the points $P_i$ of the form $dS_i(w_i(P))$. Thus, near the points $P_i$,
$$\Omega^{(S)} = d\left(\sum_{r=-m_i}^{-1} S_{i,r}\,w_i^{r}\right) + \text{regular}$$
Denote by $2i\pi U^{(S)}$ the vector of $b$-periods of $\Omega^{(S)}$. Its $g$ components $U^{(S)}_j$, $j = 1,\dots,g$ are:
$$U^{(S)}_j = \frac{1}{2i\pi}\oint_{b_j}\Omega^{(S)} \tag{5.47}$$
Proposition. If $D = \sum_{i=1}^{g}\gamma_i$ is a generic divisor of degree $g$, the following expression defines a Baker–Akhiezer function with $D$ as divisor of poles:
$$\Psi_{BA}(P) = \mathrm{const.}\,\exp\left(\int_{P_0}^{P}\Omega^{(S)}\right)\frac{\theta(A(P)+U^{(S)}-\zeta)}{\theta(A(P)-\zeta)} \tag{5.48}$$
Here $\zeta = A(D)+K$, where $K$ is the vector of Riemann's constants and $A$ denotes the Abel map with base point $P_0$, cf. eq. (5.29).

Proof. It is enough to check that the function defined by the formula (5.48) is well-defined (i.e., it does not depend on the path of integration between $P_0$ and $P$) and has the desired analytical properties. Indeed, when $P$ describes some $a$-cycle, nothing happens because the theta-functions are $a$-periodic and $\Omega^{(S)}$ is normalized. If $P$ describes the $b_j$-cycle the quotient of theta-functions is multiplied by $\exp(-2i\pi U^{(S)}_j)$ (see Chapter 15) while the exponential factor changes by $\exp(2i\pi U^{(S)}_j)$, so that $\Psi_{BA}$ is well-defined. Clearly it has the right poles if $\deg D = g$.

Remark 4. For a generic divisor $D$ of degree $\geq g$, the dimension of the vector space of Baker–Akhiezer functions is equal to $\deg(D)-g+1$. In particular for $\deg D = g$ the above formula gives the unique Baker–Akhiezer function having poles at $D$ up to a constant. If we have two Baker–Akhiezer functions, their ratio is a meromorphic function with $d = \deg D$ poles. By the Riemann–Roch theorem the dimension of the space of such functions is $d-g+1$.
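The single-valuedness argument in the proof of eq. (5.48) can be made explicit. The sketch assumes the usual quasi-periodicity $\theta(z+B_j) = e^{-i\pi B_{jj}-2i\pi z_j}\,\theta(z)$, with $B_j$ the $j$th column of the period matrix; this normalization is an assumption about the conventions of Chapter 15, which are not reproduced here.

```latex
\begin{align*}
P \to P + b_j:\qquad A(P) &\to A(P) + B_j, \\
\frac{\theta(A(P)+U^{(S)}-\zeta)}{\theta(A(P)-\zeta)}
  &\to \frac{e^{-i\pi B_{jj} - 2i\pi\,(A_j(P)+U^{(S)}_j-\zeta_j)}}
            {e^{-i\pi B_{jj} - 2i\pi\,(A_j(P)-\zeta_j)}}\;
       \frac{\theta(A(P)+U^{(S)}-\zeta)}{\theta(A(P)-\zeta)}
   = e^{-2i\pi U^{(S)}_j}\,
       \frac{\theta(A(P)+U^{(S)}-\zeta)}{\theta(A(P)-\zeta)}, \\
\exp\Big(\int_{P_0}^{P}\Omega^{(S)}\Big)
  &\to \exp\Big(\oint_{b_j}\Omega^{(S)}\Big)\exp\Big(\int_{P_0}^{P}\Omega^{(S)}\Big)
   = e^{+2i\pi U^{(S)}_j}\exp\Big(\int_{P_0}^{P}\Omega^{(S)}\Big) .
\end{align*}
```

The two factors cancel, so the product in eq. (5.48) returns to its initial value around every $b$-cycle.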
It is worth noticing that generically we get a non-trivial Baker–Akhiezer function with only $g$ poles, while to get a non-trivial meromorphic function we need generically $(g+1)$ poles. To understand why Baker–Akhiezer functions arise naturally in the construction of the eigenvectors, let us consider the unnormalized eigenvector $\Psi_{un}(t,P)$ whose time evolution is governed by the equation
$$\partial_t\Psi_{un}(t,P) = M(\lambda)\,\Psi_{un}(t,P) \tag{5.49}$$
The normalized eigenvector $\Psi(t,P)$ and the unnormalized eigenvector $\Psi_{un}(t,P)$ are related by multiplication by a scalar function: $\Psi_{un}(t,P) = f(t,P)\,\Psi(t,P)$ and $(\Psi_{un})_1(t,P) = f(t,P)$. Taking the first component of eq. (5.49), one gets:
$$\dot f = Cf \quad\text{with}\quad C = \sum_j M_{1j}\,\psi_j$$
where C is the same object appearing in eq. (5.33). Let us describe the singularities of f (t, P ). Note that C(P ) has poles where Ψ has poles or at points above the poles λk of M (λ) (recall that in general the poles of M (λ) are a subset of the poles of L(λ)). Consider first the points Pk,i , i = 1, . . . , N above a point λk . In the vicinity of Pk,i we have:
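$$\dot f = (M\Psi)_1\,f = \left(\left(\left(P^{(k)}(L,\lambda)\right)_- + \text{regular}\right)\Psi\right)_1 f = \left(\left(P^{(k)}(L,\lambda) + \text{regular}\right)\Psi\right)_1 f = \left(\left(P^{(k)}(\mu,\lambda)\right)_- + \text{regular}\right) f$$
where we have used the fact that $\Psi$ is an eigenvector of $L(\lambda)$ with eigenvalue $\mu$ and $\Psi_1 = 1$. The quantity $\dot f/f$ has poles at points $(\lambda,\mu)$ such that $(P^{(k)}(\mu,\lambda))_-\neq 0$. The projection $(\ )_-$ is computed using the local parameter $\lambda$. Notice that $(P^{(k)}(\mu(\lambda),\lambda))_-$ is independent of time, therefore $f(t,P)$ has an essential singularity at $P_{k,i}$ of the form:
$$f(t,P) = e^{\,t\left(P^{(k)}(\mu,\lambda)\right)_-}\times\text{regular}$$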
Let us now consider the poles of C coming from Ψ. First at λ = ∞, while Ψ has poles, M vanishes so that C is regular, and nothing special happens for f . At a point γ(t) of the dynamical divisor D(t) we have
$\psi_i\sim\alpha_i(t)/(\lambda-\gamma(t))$ and $C\sim r(t)/(\lambda-\gamma(t))$. Comparing the second order pole in both sides of eq. (5.33), we find $r(t) = -\dot\gamma(t)$. Thus $\partial_t\log f = \partial_t\log(\lambda-\gamma)+\text{regular}$, showing that $f(t,P)$ vanishes at the points of the dynamical divisor $D(t)$. Finally, let us remark that the poles of $f(t,P)$ are independent of time. Indeed, assuming that $f(t,P)$ has a pole of order $k$ at a point $\gamma(t)$, which by the previous argument is not a pole of $C$, the orders of the poles on both sides of the equation $\dot f = Cf$ are different if $\dot\gamma\neq 0$. Considering the solution such that $f(t=0) = 1$ we see that the divisor of its zeroes is the dynamical divisor $D(t)$ and the divisor of its poles is $D(0)$. Moreover, $f$ has essential singularities at the points $P_{k,i}$ with prescribed singular parts, hence $f$ is the unique Baker–Akhiezer function with these essential singularities and the $g$ poles corresponding to the divisor $D(0)$, so that:
$$f(t,P) = \exp\left(t\int_{P_0}^{P}\Omega^{(M)}\right)\frac{\theta(A(P)+t\,U^{(M)}-\zeta_{D(0)})}{\theta(A(P)-\zeta_{D(0)})} \tag{5.50}$$
The linear time dependence in the theta-function in the numerator of this equation arises from the form of the singular exponential and the requirement that there is no monodromy. This provides another quick proof of the linearization of the flow on the Jacobian. If $D(t)$ is the divisor of the zeroes of $f(t,P)$, by the Riemann theorem it satisfies:
$$A(D(t)) - A(D_0) = -t\,U^{(M)} = -\frac{t}{2i\pi}\oint_{b}\Omega^{(M)} \tag{5.51}$$
This shows that the flow is linear on the Jacobian! On the other hand we know that $A(D(t))$ is given by eq. (5.38), which coincides with eq. (5.51) because by Riemann's bilinear identity
$$\frac{1}{2i\pi}\oint_{b_j}\Omega^{(M)} = -\sum_k\sum_{i=1}^{N}\mathrm{Res}_{P_{k,i}}\,\omega_j\,P^{(k)}(\mu,\lambda)$$
with $\omega_j$ the normalized Abelian differentials. See Chapter 15. We finally give the expression of the components of the unnormalized eigenvector $\Psi_{un}(t,P) = f(t,P)\,\Psi(t,P)$:
$$(\Psi_{un})_k(t,P) = d_k\exp\left(t\int_{P_0}^{P}\Omega^{(M)}\right)\frac{\theta(A(P)-A(Q_k)+A(Q_1)-\zeta_{D(t)})}{\theta(A(P)-\zeta_{D(0)})}\,\frac{\theta(e+\int_{Q_1}^{P}\omega)}{\theta(e+\int_{Q_k}^{P}\omega)}$$
In the product f (t, P )ψk (t, P ), the zeroes of f cancel the dynamical poles of ψk which are replaced by the constant poles of f .
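This cancellation can be seen directly by multiplying eq. (5.50) by eq. (5.40) and using $\zeta_{D(t)} = \zeta_{D(0)} - t\,U^{(M)}$. Nothing beyond these three relations is assumed.

```latex
\begin{align*}
\theta\big(A(P)+t\,U^{(M)}-\zeta_{D(0)}\big) &= \theta\big(A(P)-\zeta_{D(t)}\big), \\
f(t,P)\,\psi_k(t,P)
  &= d_k\exp\Big(t\int_{P_0}^{P}\Omega^{(M)}\Big)\,
     \frac{\theta(A(P)-\zeta_{D(t)})}{\theta(A(P)-\zeta_{D(0)})}\,
     \frac{\theta(A(P)-A(Q_k)+A(Q_1)-\zeta_{D(t)})}{\theta(A(P)-\zeta_{D(t)})}\,
     \frac{\theta(e+\int_{Q_1}^{P}\omega)}{\theta(e+\int_{Q_k}^{P}\omega)} \\
  &= d_k\exp\Big(t\int_{P_0}^{P}\Omega^{(M)}\Big)\,
     \frac{\theta(A(P)-A(Q_k)+A(Q_1)-\zeta_{D(t)})}{\theta(A(P)-\zeta_{D(0)})}\,
     \frac{\theta(e+\int_{Q_1}^{P}\omega)}{\theta(e+\int_{Q_k}^{P}\omega)} .
\end{align*}
```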
Remark 5. As explained in Chapter 3, different functions P (k) (L, λ) correspond to different dynamical flows. Therefore, different Abelian differentials of the second kind with poles at the points above λk correspond to different time flows. In other words, all the different dynamics are encoded into the singular differentials Ω(M ) .
5.7 Linearization and the factorization problem

We show that the solution of integrable systems by factorization can be interpreted as the time evolution of the eigenvector bundle. This gives a third very short proof that the flows linearize on the Jacobian of the spectral curve. We consider small disks $U_k$ around the poles $\lambda_k$ of $L(\lambda)$ and define the open set $U_+$ as the union of $U_k$, while $U_-$ is an open set slightly larger than the complement of $U_+$ in the complex plane. On these open sets we have defined in Chapter 3 a factorization problem
$$\theta_-^{-1}(\lambda,t)\,\theta_+(\lambda,t) = e^{-\sum_i t_i\,dH_i(L(\lambda,0))}$$
where $\theta_\pm$ are analytic and invertible in $U_\pm$ respectively. Recall that we have shown that the matrix $\widehat\Psi(t)\,\widehat\Psi^{-1}(0)$ has different expressions on the patches $U_+$ and $U_-$ given by eqs. (3.56, 3.57). Multiplying these equations on the right by $\widehat\Psi(0)$ we get on $U_+$ and $U_-$ respectively:
$$\widehat\Psi(t) = \theta_+(\lambda,t)\,\widehat\Psi(0)\,e^{\sum_i t_i\,dH_i(\hat\mu)}, \qquad \widehat\Psi(t) = \theta_-(\lambda,t)\,\widehat\Psi(0)$$
where $\hat\mu$ is the diagonal matrix of eigenvalues of $L(\lambda,0)$. We used that $dH(L) = \widehat\Psi\,dH(\hat\mu)\,\widehat\Psi^{-1}$ because $H(L)$ is an ad-invariant function.

We can interpret this matrix equation in $\lambda$ as a vector equation on the Riemann surface $\Gamma$. First we lift each disc $U_k$ around $\lambda_k$ to $N$ disks around the $P_{k,i}$, and still define $U_+$ as the union of these disks, and $U_-$ as an open set in $\Gamma$ containing the closure of $U_+$. Each column of these matrix equations can be viewed as vector equations at a point $P$ above $\lambda$:
$$\Psi(t,P)\,e^{-\sum_i t_i\,dH_i(\mu(P))} = \theta_+(\lambda,t)\,\Psi(0,P), \qquad \Psi(t,P) = \theta_-(\lambda,t)\,\Psi(0,P) \tag{5.52}$$
valid on the open sets U± respectively. The vector Ψ(0, P ) is a section of the eigenvector bundle, E0 , at time t = 0. As explained in Chapter 15, (θ+ (λ, t)Ψ(0, P ), θ− (λ, t)Ψ(0, P )) defines a section of a line bundle isomorphic to E0 due to the regularity properties of the matrices θ± . In the left-hand sides of eqs. (5.52), we write Ψ(t, P ) = f (t, P )Ψm (t, P ), where Ψm (t, P ) is meromorphic with first component equal to 1, and f (t, P ) is the Baker–Akhiezer function (5.50). Note
that, by definition, $\Psi_m(t,P)$ is a section of the eigenvector bundle $E_t$. We introduce the line bundle $F_t$ with transition function $e^{-\sum_i t_i\,dH_i(\mu(P))}$ on $U_+\cap U_-$ which possesses the section $(f\,e^{-\sum_i t_i\,dH_i(\mu(P))}, f)$ on $(U_+,U_-)$ since the first term is regular in $U_+$ by eq. (5.52). Recall that the product of bundles admits the product of sections. It is now clear that $E_t\sim E_0\otimes F_t$. The bundle $F_t$ is of Chern class 0 because $E_t$ and $E_0$ have the same Chern class at least for $t$ small. Hence $F_t$ defines a point in the Jacobian $\mathrm{Jac}(\Gamma)$. This point moves linearly in time since the addition law on the Jacobian corresponds to taking the product of transition functions.

5.8 Tau-functions

We now wish to relate the formula for the Baker–Akhiezer function to the so-called tau-function (see Chapter 3). Let us consider for simplicity the case of only one singular point $P_\infty$, and let $z$ be a local parameter for a point $P$ in the vicinity of $P_\infty$ such that $z(P_\infty) = 0$. The general case is similar but more cumbersome to present. In order to be able to describe at once all possible singular parts, we introduce an infinite set of elementary time variables $t_k$. Denote by $\Omega^{(k)}$ the normalized differential of the second kind with singular part $d(z^{-k})$ at $P_\infty$ and denote by $U^{(k)}$ its $b$-periods,
$$U^{(k)}_j = \frac{1}{2i\pi}\oint_{b_j}\Omega^{(k)}$$
Proposition. Let $\psi_{BA}(P)$ be the Baker–Akhiezer function with divisor $D$ of poles of degree $g$ and singular part $\xi(t,z) = \sum_{k=1}^{\infty} t_k z^{-k}$ at $P_\infty$ (this should be understood in the sense of formal series), normalized such that:
$$\psi_{BA}(P) = e^{\xi(t,z)}\,(1+O(z)), \quad z\sim 0$$
Define $t-[z] = \{t_k - \frac{1}{k}z^k\}$. Then we have in the vicinity of $P_\infty$:
$$\psi_{BA}(P) = e^{\xi(t,z)}\,\frac{\tau(t-[z])}{\tau(t)} \tag{5.53}$$
The function $\tau(t)$ may be expressed in terms of Riemann's theta-function,
$$\tau(t) = e^{\alpha(t)+\beta(t,t)}\,\theta\left(A(P_\infty)+\sum_k t_k U^{(k)}-\zeta\right) \tag{5.54}$$
Here α(t) and β(t, t) are a linear and a quadratic form in the times t respectively, and ζ = A(D) + K. A(D) is the Abel map and K the vector of Riemann’s constants.
Proof. From eq. (5.48), the normalized Baker–Akhiezer function can be written as:
$$\psi(P) = e^{\xi(t,z)}\exp\left(\sum_k t_k\int_{P_\infty}^{P}\left(\Omega^{(k)}-d\,z^{-k}\right)\right)\times\frac{\theta(A(P)+\sum_k t_k U^{(k)}-\zeta)\,\theta(A(P_\infty)-\zeta)}{\theta(A(P)-\zeta)\,\theta(A(P_\infty)+\sum_k t_k U^{(k)}-\zeta)} \tag{5.55}$$
Indeed, eq. (5.48) contains $\int_{P_0}^{P}\Omega$ which can be written $\xi(t,z)+\int_{P_\infty}^{P}(\Omega-d\xi)+\text{const.}$ where $(\Omega-d\xi) = \sum_k t_k(\Omega^{(k)}-d\,z^{-k})$ is regular at $P_\infty$. Moreover, eq. (5.55) is obviously correctly normalized. On the other hand, if $\tau(t)$ is assumed to be of the form (5.54), we have
$$\frac{\tau(t-[z])}{\tau(t)} = e^{-\alpha([z])-2\beta(t,[z])+\beta([z],[z])}\,\frac{\theta(A(P_\infty)+\sum_k (t_k-\frac{z^k}{k})U^{(k)}-\zeta)}{\theta(A(P_\infty)+\sum_k t_k U^{(k)}-\zeta)}$$
We need to compare eq. (5.53) and eq. (5.55). Recall that near $P_\infty$, $\int_{P_\infty}^{P}(\Omega^{(k)}-d\,z^{-k}) = b_k(z)$ is regular. So we first choose $\alpha(t)$ and $\beta(t,t)$ such that:
$$\beta(t,[z]) = -\frac{1}{2}\sum_k t_k\,b_k(z), \qquad \alpha([z]) = \beta([z],[z])+\log\frac{\theta(A(P)-\zeta)}{\theta(A(P_\infty)-\zeta)}$$
This defines $\alpha$ and $\beta$ uniquely and consistently. For this, we must check that the coefficients $\beta_{kj}$ of $\beta$ are symmetric. We have $\beta_{kj} = -\frac{1}{2}\,j\,b_{kj}$, where $b_k(z) = \sum_j b_{kj}z^j$. We apply the Riemann bilinear identity, eq. (15.8) in Chapter 15, to the two normalized second kind Abelian differentials $\Omega^{(k)}$ and $\Omega^{(l)}$. The left-hand side in this identity vanishes because the integrals over $a$-cycles vanish, while the right-hand side yields $k\,b_{lk} = l\,b_{kl}$. This choice takes care of the exponential prefactor and two of the theta functions in eq. (5.55). To deal with the other two theta functions in (5.55) we Taylor expand the Abel map $A(P)-A(P_\infty)$ around $P_\infty$. Writing $\omega_j = \sum_{i=0}^{\infty} c^{(j)}_i z^i\,dz$ in the vicinity of the point $P_\infty$ and Taylor expanding using Riemann bilinear identities, one deduces:
$$A_j(P)-A_j(P_\infty) = \sum_{i=1}^{\infty} c^{(j)}_{i-1}\frac{z^i}{i} = -\sum_{k=1}^{\infty}\frac{z^k}{k}\,\frac{1}{2\pi i}\oint_{b_j}\Omega^{(k)} = -\sum_{k=1}^{\infty}\frac{z^k}{k}\,U^{(k)}_j$$
Using this relation we get:
$$A(P)+\sum_k t_k U^{(k)}-\zeta = A(P_\infty)+\sum_k\left(t_k-\frac{z^k}{k}\right)U^{(k)}-\zeta \tag{5.56}$$
Gathering all this we obtain eq. (5.53).
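For completeness, the exponential prefactor in the ratio $\tau(t-[z])/\tau(t)$ follows from a short expansion. The sketch assumes, as stated in the proposition, that $\alpha$ is linear in the times and that $\beta$ is a symmetric bilinear form evaluated on the diagonal.

```latex
\begin{align*}
\alpha(t-[z]) - \alpha(t) &= -\alpha([z]), \\
\beta(t-[z],\,t-[z]) - \beta(t,t) &= -2\,\beta(t,[z]) + \beta([z],[z]), \\
\frac{e^{\alpha(t-[z])+\beta(t-[z],\,t-[z])}}{e^{\alpha(t)+\beta(t,t)}}
  &= e^{-\alpha([z]) - 2\beta(t,[z]) + \beta([z],[z])}
   = \exp\Big(\sum_k t_k\,b_k(z)\Big)\,
     \frac{\theta(A(P_\infty)-\zeta)}{\theta(A(P)-\zeta)} .
\end{align*}
```

The last equality uses the choices of $\alpha$ and $\beta$ made in the proof; it reproduces the prefactor of eq. (5.55), and the remaining theta quotients agree by eq. (5.56).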
The formula (5.53) relating Baker–Akhiezer functions to tau-functions is usually called the Sato formula, cf. eq. (3.61) in Chapter 3. It may easily be generalized for several punctures, see Chapter 8 for more details. It shows that the local parameter $z$ can be generated from translations on the infinite set of times. The left-hand side of eq. (5.56) gives a convergent expression for the formal series of the right-hand side. Moreover, the Baker–Akhiezer function provides a global meaning to Sato's formula in this case.

5.9 Symplectic form

Our aim is to express the symplectic form inherited from the coadjoint orbit structure in terms of the dynamical divisor. We consider a rational Lax matrix of the form $L(\lambda) = L_0+\sum_k L_k(\lambda)$, where each $L_k$ may be written as:
$$L_k(\lambda) = \left(g_k\cdot(A_k)_-\cdot g_k^{-1}\right)_-$$
with $(A_k)_-$ diagonal matrices. Locally around $\lambda_k$, $L(\lambda)$ can be diagonalized as, see Chapter 3:
$$L(\lambda) = g_k A_k g_k^{-1}$$
Both matrices $A_k$ and $g_k$ depend on $\lambda$. $A_k$ has poles at $\lambda_k$ but $g_k$ is regular. By definition $L_0$ is non-dynamical. The variables $(A_k)_-$ are also chosen to be not dynamical, and specify the coadjoint orbit. The dynamical variables are the matrix elements of the jets $g^{(k)}$, cf. eq. (3.13) in Chapter 3. The pullback on the loop group of the Kirillov symplectic form on the coadjoint orbit is:
$$\omega = \sum_k\mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\left((A_k)_-\,g_k^{-1}\delta g_k\wedge g_k^{-1}\delta g_k\right)d\lambda$$
The dynamical variables $g^{(k)}$ and $g^{(k')}$ Poisson commute for $k\neq k'$. We have seen that the Lax pair description of a dynamical system naturally provides coordinates on phase space, namely $g = \mathrm{genus}(\Gamma)$ independent action variables $F_i$ which parametrize the spectral curve $\Gamma$, and $g$ points $\gamma_i = (\lambda_{\gamma_i},\mu_{\gamma_i})$ on the spectral curve, which we called the dynamical divisor. It is important to express the symplectic form in these coordinates. The phase space appears as a fibred space whose base is the space of moduli of the spectral curve, explicitly described as coefficients of the equation $\Gamma(\lambda,\mu) = 0$ of the spectral curve, and the fibre at a given $\Gamma$ is the Jacobian of the curve $\Gamma(\lambda,\mu) = 0$. On this space we introduce a
differential $\delta$ which varies the dynamical variables $F_i$, $\lambda_{\gamma_i}$, $\mu_{\gamma_i}$ subjected to the constraint $\Gamma_{\{F_i\}}(\lambda_{\gamma_i},\mu_{\gamma_i}) = 0$. We will need an auxiliary fibre bundle above the same base whose fibre is $\Gamma\times\mathrm{Jac}(\Gamma)$. We extend $\delta$ to this space by keeping the previous definition on the $\mathrm{Jac}(\Gamma)$ part and on the $\Gamma$ part, we differentiate any function of $F_i$, $\lambda$, $\mu$ with $\Gamma_{\{F_i\}}(\lambda,\mu) = 0$, by keeping $\lambda$ constant. This definition makes sense because the bundle of curves is given by the family of equations $\Gamma(\lambda,\mu) = 0$, where the coefficients of $\Gamma$ depend on the moduli which parametrize the base space. This provides a universal definition of the meromorphic function $\lambda$ on the whole family of curves. So differentiating on the bundle of curves, keeping $\lambda$ constant, provides a horizontal direction, i.e. a connection. Explicitly, for a function $f(P;F_i)$ we take $\lambda$ as a local parameter, then $\delta f = \sum_i\partial_{F_i}f\,\delta F_i$. At a branch point, however, the local parameter is $\mu$, and we have:
$$\delta f = \partial_\mu f\,\delta\mu+\sum_i\partial_{F_i}f\,\delta F_i \tag{5.57}$$
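To compute $\delta\mu$ we differentiate the equation $\Gamma_{\{F_i\}}(\lambda,\mu) = 0$ at $\lambda$ constant, getting:
$$\delta\mu = -\frac{1}{\partial_\mu\Gamma_{\{F_i\}}(\lambda,\mu)}\sum_i\partial_{F_i}\Gamma_{\{F_i\}}(\lambda,\mu)\,\delta F_i \tag{5.58}$$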
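Eq. (5.58) is the implicit differentiation of the curve equation at fixed $\lambda$. As an illustration, the second line below applies it to a hyperelliptic curve $\mu^2 = R(\lambda;F)$; this particular curve and the symbol $R$ are hypothetical choices made only for the example.

```latex
\begin{align*}
0 = \delta\,\Gamma_{\{F_i\}}(\lambda,\mu)\Big|_{\lambda}
  &= \partial_\mu\Gamma\,\delta\mu + \sum_i \partial_{F_i}\Gamma\,\delta F_i
  \;\Longrightarrow\;
  \delta\mu = -\frac{1}{\partial_\mu\Gamma}\sum_i \partial_{F_i}\Gamma\,\delta F_i, \\
\Gamma = \mu^2 - R(\lambda;F):\qquad
  \delta\mu &= \frac{1}{2\mu}\sum_i \partial_{F_i}R(\lambda;F)\,\delta F_i .
\end{align*}
```

In this example the branch points of the covering $(\lambda,\mu)\to\lambda$ are the points with $\mu = 0$, which is exactly where $\delta\mu$ develops a pole.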
At a branch point of the covering $(\lambda, \mu) \to \lambda$, we have $\partial_\mu \Gamma_{\{F_i\}}(\lambda, \mu) = 0$, hence the differential $\delta f$ acquires a pole even though $f$ is regular. Note, however, that if $f(P)$ depends on $P$ only through $\lambda(P)$, $\delta f$ is regular at the branch points.

Recall that at each point $P(\lambda, \mu)$ on $\Gamma$ a column eigenvector $\Psi(P)$ of the Lax matrix is defined, up to normalization, and that we have defined a dual row eigenvector $\Psi^{(-1)}(P)$ such that $\Psi^{(-1)}(P)\Psi(P) = 1$. This allows us to define a 3-form $K$ on our extended fibre bundle. We regard it as a 1-form on $\Gamma$ whose coefficients are 2-forms on phase space:

$$K = K_1 + K_2, \qquad K_1 = \Psi^{(-1)}(P)\, \delta L(\lambda) \wedge \delta\Psi(P)\, d\lambda, \qquad K_2 = \Psi^{(-1)}(P)\, \delta\mu \wedge \delta\Psi(P)\, d\lambda \tag{5.59}$$

Of course $\Psi(P)$ is defined, knowing the dynamical divisor, up to multiplication by a diagonal matrix independent of $P$. We normalize the eigenvectors at $\infty$ so that

$$\psi_i(Q_j) = \lambda\, \delta_{ij} + O(1), \quad \text{for } i, j = 2, \ldots, N \tag{5.60}$$
5 Analytical methods
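Proposition. Define the 2-form on phase space: $\omega = \sum_{k,i} \mathrm{Res}_{P_{k,i}} K$, where $P_{k,i}$ are the points above the poles $\lambda_k$ of $L(\lambda)$. Then we have:

$$\omega = 2 \sum_{i=1}^{g} \delta\lambda_{\gamma_i} \wedge \delta\mu_{\gamma_i} \tag{5.61}$$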
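where $(\lambda_{\gamma_i}, \mu_{\gamma_i})$ are the coordinates of the points of the dynamical divisor $D$.

Proof. The sum of the residues of $K$, seen as a form on $\Gamma$, vanishes. The poles of $K$ are located at four different places: first the dynamical poles of $\Psi$, then the poles at the $P_{k,i}$ coming from $L$ and $\mu$, next the poles above $\lambda = \infty$ coming from $\Psi$ and $d\lambda$, and finally the poles at the branch points of the covering coming from the poles of $\Psi^{(-1)}$ and from eq. (5.58).

Let us compute the residues at the dynamical poles $(\gamma_1, \ldots, \gamma_g)$. We write the coordinates of these points as $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$ for $i = 1, \ldots, g$. Near such a point we can choose $\lambda$ as a universal local parameter and $\Psi = \frac{1}{\lambda - \lambda_{\gamma_i}} \Psi_{\mathrm{reg}}$, hence:

$$\delta\Psi = \frac{\delta\lambda_{\gamma_i}}{\lambda - \lambda_{\gamma_i}} \left(\Psi + O(1)\right), \quad \text{so that} \quad K_1 \sim \Psi^{(-1)} \delta L\, \Psi \wedge \delta\lambda_{\gamma_i}\, \frac{d\lambda}{\lambda - \lambda_{\gamma_i}}$$

Since $(L - \mu)\Psi = 0$ and $\Psi^{(-1)}(L - \mu) = 0$, we have $(\delta L - \delta\mu)\Psi + (L - \mu)\delta\Psi = 0$. Multiplying by $\Psi^{(-1)}$ we get $\Psi^{(-1)} \delta L\, \Psi = \delta\mu$, therefore:

$$\mathrm{Res}_{\gamma_i} K_1 = \delta\mu|_{\gamma_i} \wedge \delta\lambda_{\gamma_i} \tag{5.62}$$

Here $\delta\mu$ is to be seen as a meromorphic function on $\Gamma$ given by eq. (5.58), that is $\sum_j \partial_{F_j}\Gamma|_{\gamma_i}\, \delta F_j + \partial_\mu \Gamma|_{\gamma_i}\, \delta\mu|_{\gamma_i} = 0$. However, varying $\Gamma(\lambda_{\gamma_i}, \mu_{\gamma_i}) = 0$ we obtain $\sum_j \partial_{F_j}\Gamma\, \delta F_j + \partial_\lambda \Gamma\, \delta\lambda_{\gamma_i} + \partial_\mu \Gamma\, \delta\mu_{\gamma_i} = 0$. Comparing these equations we get:

$$\delta\mu|_{\gamma_i} = \delta\mu_{\gamma_i} + \left.\frac{\partial_\lambda \Gamma}{\partial_\mu \Gamma}\right|_{\gamma_i} \delta\lambda_{\gamma_i}$$

and the second term does not contribute to the wedge product in eq. (5.62). The contribution of $K_2$ is exactly the same. So we finally get:

$$\mathrm{Res}_{\gamma_i} K = 2\, \delta\mu_{\gamma_i} \wedge \delta\lambda_{\gamma_i} \tag{5.63}$$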
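We now show that there are no residues at the branch points due to the proper choice of $K_2$. Let us look at the term $K_1$. At a branch point $b$, $\Psi^{(-1)}$ has a simple pole, $\delta L$ is regular, $\delta\Psi$ has a simple pole due to eq. (5.58) and the form $d\lambda$ has a simple zero, hence $K_1$ has a simple pole at $b$. To compute its residue it is enough to keep the polar part in $\delta\Psi$, i.e. to replace $\delta\Psi$ by $\partial_\mu \Psi\, \delta\mu$ (recall that $\mu$ is a good local parameter around $b$). We get:

$$\mathrm{Res}_b K_1 = \mathrm{Res}_b\, \Psi^{(-1)} \delta L\, \partial_\mu \Psi \wedge \delta\mu\, d\lambda = \mathrm{Res}_b\, \Psi^{(-1)} (\delta L - \delta\mu)\, \partial_\mu \Psi \wedge \delta\mu\, d\lambda$$

where in the last equality we have used the antisymmetry of the wedge product to replace $\delta L$ by $\delta L - \delta\mu$. Using again the eigenvector equation $(L - \mu)\Psi = 0$, and varying the point $(\lambda, \mu)$ on the curve around $b$, one gets

$$(L - \mu)\, \partial_\mu \Psi = \Psi - \frac{d\lambda}{d\mu} \frac{dL}{d\lambda} \Psi \tag{5.64}$$

where $d\lambda/d\mu$ vanishes at the branch point. We then differentiate with $\delta$ and multiply on the left by $\Psi^{(-1)}$ to get:

$$\mathrm{Res}_b\, \Psi^{(-1)} (\delta L - \delta\mu)\, \partial_\mu \Psi \wedge \delta\mu\, d\lambda = \mathrm{Res}_b\, \Psi^{(-1)} \delta\Psi \wedge \delta\mu\, d\lambda - \mathrm{Res}_b\, \Psi^{(-1)} \delta\!\left(\frac{d\lambda}{d\mu} \frac{dL}{d\lambda} \Psi\right) \wedge \delta\mu\, d\lambda$$

It is easy to see that the first term is exactly cancelled by the term $\mathrm{Res}_b K_2$. The second term gives a non-vanishing contribution

$$\mathrm{Res}_b\, \frac{\delta\mu_b}{\mu - \mu_b} \wedge \delta\mu\, d\lambda \tag{5.65}$$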
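To show it, note that the quantity $\zeta = (d\lambda/d\mu)(dL/d\lambda)\Psi$ vanishes at $b = (\lambda_b, \mu_b)$. Writing $\zeta = (\mu - \mu_b)\zeta_1$, we get $\delta\zeta = -\frac{\delta\mu_b}{\mu - \mu_b}\, \zeta + \delta\mu\, \zeta_1 + \zeta_2$ with $\zeta_2$ regular. The $\zeta_1$ term does not contribute due to the antisymmetry of the wedge product and the $\zeta_2$ term has no residue. Using eq. (5.64) we have $\Psi^{(-1)} \frac{d\lambda}{d\mu} \frac{dL}{d\lambda} \Psi = 1$, yielding eq. (5.65).

This contribution is exactly cancelled by the contribution of a new form $K_3$:

$$K_3 = \delta(\log \partial_\mu \Gamma) \wedge \delta\mu\, d\lambda$$

We will see that $K_3$ has poles only at the branch points. At the branch point $b$, $\partial_\mu \Gamma$ has a zero, so we write $\partial_\mu \Gamma = (\mu - \mu_b) S$ with $S$ regular. The contribution of the point $b$ to $K_3$ is:

$$\mathrm{Res}_b\, \frac{\delta\, \partial_\mu \Gamma}{\partial_\mu \Gamma} \wedge \delta\mu\, d\lambda$$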
The variation of $\partial_\mu \Gamma$ reads $\delta\, \partial_\mu \Gamma = \delta(\mu - \mu_b)\, S + (\mu - \mu_b)\, \delta S$. The second term does not contribute to the residue because $S$ is regular, while the variation $\delta\mu$ cancels due to the antisymmetry of the wedge product, and we are left with the contribution of $\delta\mu_b$ which exactly cancels eq. (5.65).

We now compute the residues above $\lambda = \infty$. Recall that we consider a reduced Hamiltonian system under the action of diagonal matrices. Recall the normalization of the eigenvectors at $\infty$, eq. (5.60). Notice that $L = L_0 + O(1/\lambda)$, where $L_0$ is non-dynamical so $\delta L_0 = 0$, and that $\mu = a_i + O(1/\lambda)$ around $Q_i$, hence $\delta L$ and $\delta\mu$ are $O(1/\lambda)$. Moreover, $\Psi^{(-1)}$ vanishes at $Q_i$ and $d\lambda$ has a double pole. Altogether $K_1$ and $K_2$ are regular at $Q_i$ since $(\delta\Psi)(Q_i) = O(1)$ due to the normalization condition. Finally, $K_3$ is also regular since, on the sheet $\mu = \mu_i(\lambda)$, one can write $\partial_\mu \Gamma = \prod_{j \neq i} (\mu_i - \mu_j)$, yielding $\delta \log \partial_\mu \Gamma = O(1/\lambda)$. Hence $\delta \log \partial_\mu \Gamma \wedge \delta\mu = O(1/\lambda^2)$ has a double zero which compensates the double pole of $d\lambda$ at infinity. All this shows that $K$ has no residues above $\lambda = \infty$.

It remains to show that $K_3$ has no other poles. Obviously, $K_3$ is regular at the points of the dynamical divisor and does not contribute to the residues at these points. To compute the residue of $K_3$ at the points $P_{k,i}$ above $\lambda_k$, we note that if $\partial_\mu \Gamma$ has a pole of some order $m$, it can be written $\partial_\mu \Gamma = c(\lambda)/(\lambda - \lambda_k)^m$, where $c(\lambda)$ is regular and non-vanishing. Since $\delta\lambda = 0$ and $\delta\lambda_k = 0$ we get $\delta(\log \partial_\mu \Gamma) = \delta \log c(\lambda)$, which is regular. At $\lambda_k$ we remark that $\delta\mu$ is regular on all sheets above $\lambda_k$. This is because, due to the form of $L(\lambda)$, we have $\mu = (A_k)_- + \text{regular}$. Since $(A_k)_-$ characterizes the coadjoint orbit and is not dynamical, one has to take $\delta (A_k)_- = 0$. Hence $K_3$ has no residue.

Proposition. The form eq. (5.61) is given by:

$$\omega = 2 \sum_{i=1}^{g} \delta\lambda_{\gamma_i} \wedge \delta\mu_{\gamma_i} = 2 \sum_k \mathrm{Res}_{\lambda_k} \mathrm{Tr}\left( (A_k)_-\, g_k^{-1} \delta g_k \wedge g_k^{-1} \delta g_k \right) d\lambda \tag{5.66}$$
where $(\lambda_{\gamma_i}, \mu_{\gamma_i})$, $i = 1, \ldots, g$, are the coordinates of the points of the dynamical divisor $D$. This shows that $\omega$ is the symplectic form on the orbit.

Proof. Let us compute the residues at the poles $\lambda_k$ of $K_1$, where only $L_k$ contributes. Recall the local diagonalization theorem of Chapter 3, eq. (3.8), which allows us to write the Lax matrix as $L = g_k A_k g_k^{-1}$ around $\lambda = \lambda_k$. Thus locally around $\lambda_k$ we may identify the matrix $\hat\Psi(\lambda)$ with $g_k$. More precisely, by eq. (5.20), we have $\hat\Psi(\lambda) = g_k d_k$ and $\hat\Psi^{-1}(\lambda) = d_k^{-1} g_k^{-1}$ with $d_k$ a diagonal matrix. The residues are obtained by integrating over small circles surrounding each of the $N$ points $P_{k,i}$ above $\lambda_k$. We can choose these small circles so that they project on the base $\lambda$ on a single small circle surrounding $\lambda_k$. Then we get

$$\sum_{i=1}^{N} \mathrm{Res}_{P_{k,i}} K_1 = \sum_{i=1}^{N} \frac{1}{2i\pi} \oint_{C_{k,i}} \Psi^{(-1)}(P_i)\, \delta L(\lambda) \wedge \delta\Psi(P_i)\, d\lambda = \frac{1}{2i\pi} \oint_{C_k} \mathrm{Tr}\left( \hat\Psi^{-1}(\lambda)\, \delta L(\lambda) \wedge \delta\hat\Psi(\lambda) \right) d\lambda \tag{5.67}$$

where we used the fact that $\hat\Psi^{-1}(\lambda)$ is equal to the matrix whose rows are the vectors $\Psi^{(-1)}(P_i)$. The trace has been reconstructed in eq. (5.67) because $\Psi(P_i)$, $i = 1, \ldots, N$, form a basis of eigenvectors. Using the identification of $\hat\Psi(\lambda)$ in terms of $g_k$ gives:

$$\mathrm{Res}_{\lambda_k} K_1 = \mathrm{Res}_{\lambda_k} \mathrm{Tr}\left( d_k^{-1} g_k^{-1} \left( \delta g_k\, (A_k)_-\, g_k^{-1} - g_k (A_k)_-\, g_k^{-1} \delta g_k\, g_k^{-1} \right) \wedge \left( \delta g_k\, d_k + g_k\, \delta d_k \right) \right) d\lambda$$

$$= -2\, \mathrm{Res}_{\lambda_k} \mathrm{Tr}\left( (A_k)_-\, g_k^{-1} \delta g_k \wedge g_k^{-1} \delta g_k + g_k^{-1} \delta g_k\, [(A_k)_-, \delta d_k\, d_k^{-1}] \right) d\lambda$$

The last term vanishes because it involves the commutator of two diagonal matrices. Finally, $K_2$ is regular at $\lambda_k$ because, as we already remarked, $\delta\mu$ is regular on all the sheets above $\lambda_k$.

This proposition means that the coordinates $(\lambda_{\gamma_i}, \mu_{\gamma_i})$ of the point $\gamma_i$ of the dynamical divisor are canonical coordinates.

Remark 1. This result shows the nice interplay between the analytical and the group-theoretical approaches to integrable systems. We are able to show that $(\lambda_{\gamma_i}, \mu_{\gamma_i})$ are canonical coordinates using only the fact that $L$ parametrizes a coadjoint orbit, specified by constant matrices $(A_k)_-$ and $L_0$.

Remark 2. In practice, to perform this calculation, one has to compute at each pole of $L$ the quantity $\omega_k = \mathrm{Res}_{\lambda_k} \mathrm{Tr}\left( \delta L \wedge \delta\hat\Psi\, \hat\Psi^{-1} \right) d\lambda$. In the residue at $\lambda_k$, only $\delta L_k$ appears. From eq. (5.20) one has

$$\delta L_k = [(\delta\hat\Psi\, \hat\Psi^{-1})_{\mathrm{jet}}, L_k] \tag{5.68}$$

where $(\,)_{\mathrm{jet}}$ is the expansion to order $n_k - 1$. This equation determines $(\delta\hat\Psi\, \hat\Psi^{-1})_{\mathrm{jet}}$ up to a quantity commuting with $L_k$. It is easy to see that

$$\omega_k = \mathrm{Res}_{\lambda_k} \mathrm{Tr}\left( [\delta\hat\Psi\, \hat\Psi^{-1}, L_k] \wedge \delta\hat\Psi\, \hat\Psi^{-1} \right) d\lambda$$

is not affected by this ambiguity, using the antisymmetry of the wedge product and the cyclicity of the trace.
Example. We consider the example of the Neumann model. The above analysis can be applied to this model, except for one feature. As we have already stressed, there is no residual action of diagonal matrices on $L$, hence one has to pay special attention to the residues above $\lambda = \infty$. The sum of residues at the poles above $\lambda = \infty$ is $\mathrm{Res}_\infty \mathrm{Tr}\left( \delta L \wedge \delta\hat\Psi\, \hat\Psi^{-1} \right) d\lambda$. One can see that $\delta\hat\Psi\, \hat\Psi^{-1}$ is regular at $\infty$, while $d\lambda$ has a pole of order 2. Since $\delta L$ has a zero of order 1, one generally gets a residue. However $\delta\hat\Psi\, \hat\Psi^{-1}$ is constrained by

$$\delta L = [\delta\hat\Psi\, \hat\Psi^{-1}, L] + \hat\Psi\, \delta\hat\mu\, \hat\Psi^{-1}$$

At $\lambda = \infty$ the second term vanishes and $L$ tends to $D$, hence the order 0 term in $\delta\hat\Psi\, \hat\Psi^{-1}$ is diagonal. The leading term in $\delta L$ is $\frac{1}{\lambda} \delta J$ (see eq. (3.3) in Chapter 3) and has no diagonal element, consequently the considered trace vanishes. Similarly the term involving $K_2$ has no residue because $\delta\hat\mu$ vanishes to order 2 at $\lambda = \infty$. To compute the residue at $\lambda = 0$ (a second order pole of $L$) we remark that the jet

$$\delta\hat\Psi\, \hat\Psi^{-1} = \delta X\, {}^t\! X - X\, \delta\, {}^t\! X + \lambda \left( \delta Y\, {}^t\! X + X\, \delta\, {}^t Y - \delta X\, {}^t Y - Y\, \delta\, {}^t\! X \right) + O(\lambda^2)$$

solves eq. (5.68) and has the correct symmetry properties. One gets the expression of the symplectic form of the Neumann model in terms of the dynamical divisor:

$$\omega = 4 \sum_{i=1}^{N} \delta y_i \wedge \delta x_i = 2 \sum_{j=1}^{N-1} \delta\lambda_{\gamma_j} \wedge \delta\mu_{\gamma_j} \tag{5.69}$$
5.10 Separation of variables and the spectral curve

Let us call $F_i$, $i = 1, \ldots, g$, the action variables which are also the moduli of the spectral curve. For fixed $F_i$, we have seen that the motion takes place on the Jacobian $\mathrm{Jac}(\Gamma_{\{F_i\}})$. When varying initial conditions, the $\{F_i\}$ will eventually vary and we get a foliation of the (complexified) phase space in terms of the Jacobian tori of $\Gamma_{\{F_i\}}$. So we are back to the situation described in Liouville's theorem, cf. Chapter 2. Let us check that the symplectic form does vanish when we restrict ourselves to one of the tori of the foliation. We view $\mathrm{Jac}(\Gamma)$ as the $g$-th symmetric product $\Gamma^g$. Solving the equation $\Gamma(\lambda_{\gamma_j}, \mu_{\gamma_j}; \{F_i\}) = 0$, one has $\mu_{\gamma_j} = \mu_{\gamma_j}(\{F_i\}, \lambda_{\gamma_j})$, which depends on $\lambda_{\gamma_j}$ only and not on the other $\lambda$. The symplectic form can then be written as:

$$\omega = \delta\alpha = \sum_{j=1}^{g} \delta\mu_{\gamma_j} \wedge \delta\lambda_{\gamma_j} = \sum_{i,j} \frac{\partial \mu_{\gamma_j}}{\partial F_i}\, \delta F_i \wedge \delta\lambda_{\gamma_j}, \qquad \alpha = \sum_{j=1}^{g} \mu_{\gamma_j}\, \delta\lambda_{\gamma_j}$$
If we restrict ourselves to a level manifold $F_i = f_i$, we have $\delta F_i = 0$ and $\omega|_f = 0$. Let us explain why the conjugate variables $(\lambda_{\gamma_j}, \mu_{\gamma_j})$ form a set of separated variables. The construction is similar to the method used for proving the Liouville theorem. Consider the function

$$S(\{F_i\}, \{\lambda_{\gamma_j}\}) = \int_{m_0}^{m} \alpha = \sum_j \int_{\lambda_0}^{\lambda_{\gamma_j}} \mu(\lambda)\, d\lambda$$

The integration contour is drawn on the level manifold $F_i = f_i$. Just as in the Liouville case, this function does not depend on local variations of the integration path. It is explicitly separated since it can be written as a sum of functions each depending on only one variable $\lambda_{\gamma_j}$:

$$S(\{F_i\}, \{\lambda_{\gamma_j}\}) = \sum_j S_j(\{F_i\}, \lambda_{\gamma_j})$$

Since $\partial S_j / \partial \lambda_{\gamma_j} = \mu_{\gamma_j}$ and since the point $(\lambda_{\gamma_j}, \mu_{\gamma_j})$ belongs to the curve $\Gamma$ with equation $\Gamma(\lambda, \mu) = 0$, each function $S_j$ is a solution of the differential equation:

$$\Gamma\!\left( \lambda_{\gamma_j}, \frac{\partial S_j}{\partial \lambda_{\gamma_j}}; \{F_i\} \right) = 0, \quad j = 1, \ldots, g \tag{5.70}$$

Of course the coefficients of the function $\Gamma(\lambda, \mu)$ depend on the values of the integrals of motion $F_i$. This is an equation of the form

$$\left( -\frac{\partial S_j}{\partial \lambda_{\gamma_j}} \right)^{N} + \sum_{q=1}^{N-1} r_q(\lambda_{\gamma_j}) \left( \frac{\partial S_j}{\partial \lambda_{\gamma_j}} \right)^{q} = 0$$

where the coefficients $r_q(\lambda)$ are defined in eq. (5.4).

Remark. Equation (5.70) plays an important role in the quantum case. It is the separated Schrödinger equation, also known as the Baxter equation in some cases.

The commuting Hamiltonians $\{F_i\}$ are functions of the $2g$ coordinates $\lambda_{\gamma_j}$, $\mu_{\gamma_j}$. To find them we write that the curve $\Gamma(\lambda, \mu; \{F_i\}) = 0$ passes through the $g$ points $(\lambda_{\gamma_j}, \mu_{\gamma_j})$ of the dynamical divisor $D$. Hence the equations of the Liouville torus $F_i = f_i$ in these coordinates read

$$\Gamma(\lambda_{\gamma_j}, \mu_{\gamma_j}; \{f_i\}) = 0 \tag{5.71}$$
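The statement that the Hamiltonians are obtained by making the curve pass through the divisor points, and that they then commute, can be tried out numerically. The sketch below uses a hypothetical two-parameter family µ² = λ³ + F₁λ + F₂ (an assumption made purely for illustration, not a curve from the text), solves eq. (5.71) for (F₁, F₂) from two points, and evaluates a canonical bracket in the coordinates (λ₁, µ₁, λ₂, µ₂) by central finite differences. The sign convention of the bracket is also an assumption; it does not affect whether the bracket vanishes.

```python
def moduli(lam1, mu1, lam2, mu2):
    """Solve mu_j**2 = lam_j**3 + F1*lam_j + F2 (j = 1, 2) for (F1, F2)."""
    r1 = mu1**2 - lam1**3
    r2 = mu2**2 - lam2**3
    F1 = (r1 - r2) / (lam1 - lam2)
    F2 = r1 - F1 * lam1
    return F1, F2

def partial(f, x, i, h=1e-5):
    """Central finite difference of f with respect to coordinate number i."""
    up = list(x)
    up[i] += h
    down = list(x)
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

def bracket(f, g, x):
    """Canonical bracket in the coordinates x = (lam1, mu1, lam2, mu2)."""
    total = 0.0
    for lam_index, mu_index in ((0, 1), (2, 3)):
        total += (partial(f, x, mu_index) * partial(g, x, lam_index)
                  - partial(f, x, lam_index) * partial(g, x, mu_index))
    return total

def F1(*x):
    return moduli(*x)[0]

def F2(*x):
    return moduli(*x)[1]

def lam1_coordinate(*x):
    return x[0]

point = (0.3, 1.1, -0.7, 0.4)
commutator = bracket(F1, F2, point)            # expected to vanish up to rounding
control = bracket(F1, lam1_coordinate, point)  # a bracket that does not vanish
```

The vanishing of `commutator` reflects the general mechanism: each point enters only its own copy of the same equation, so the Jacobian of the map from the points to the moduli factorizes point by point.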
The standard Hamilton–Jacobi equation is obtained by setting $\mu_{\gamma_j} = \partial_{\lambda_{\gamma_j}} S$, where $S$ is the action. Due to the form of eq. (5.71), it is clear that one can take $S(\{\lambda_{\gamma_j}\}) = \sum_j s(\lambda_{\gamma_j})$, where the unique function $s(\lambda)$ obeys the one-variable equation $\Gamma(\lambda, \partial_\lambda s; \{f_i\}) = 0$. This shows that the Hamilton–Jacobi equation separates into $g$ identical one-variable equations, using the variables $\lambda_{\gamma_j}$. This is a particularly striking example of separation of variables.

Remark. It is sometimes advantageous to consider $\lambda_{\gamma_j}$ as a function of $\mu_{\gamma_j}$. Defining $\tilde S = \sum_j \int_{\mu_0}^{\mu_{\gamma_j}} \lambda_{\gamma_j}\, d\mu_{\gamma_j}$, we get $\lambda_{\gamma_j} = \partial \tilde S / \partial \mu_{\gamma_j}$. The relation between $S$ and $\tilde S$ is simply a Legendre transform: $\tilde S = \sum_j \mu_{\gamma_j} \lambda_{\gamma_j} - S$.
Example. In the Neumann model, we see that eq. (2.30) in Chapter 2 is exactly of the form

$$S = \frac{1}{2} \sum_j \int^{\mu_{\gamma_j}} \lambda_{\gamma_j}\, d\mu_{\gamma_j}$$

where the points $(\lambda_{\gamma_j}, \mu_{\gamma_j})$ belong to the spectral curve eq. (5.13). So the results of Chapter 2 are particular cases of the general theory explained in this chapter. Moreover, we see in eq. (5.14) that the spectral curve depends on $g = N - 1$ dynamical moduli $b_i$, while the $a_i$ are non-dynamical. Asking that a curve of the form eq. (5.13) passes through the $g$ points $(\lambda_{\gamma_j}, \mu_{\gamma_j})$ determines the symmetric functions of the coefficients $b_i$ in terms of the $(\lambda_{\gamma_j}, \mu_{\gamma_j})$. In fact, setting $P(\mu) = \prod_i (\mu - b_i)$, we find the conditions $P(\mu_{\gamma_j}) = -\lambda_{\gamma_j}^2 \prod_i (\mu_{\gamma_j} - a_i)$. By the Lagrange interpolation formula we reconstruct $P(\mu)$:

$$P(\mu) = \prod_j (\mu - \mu_{\gamma_j}) - \sum_j \lambda_{\gamma_j}^2\, \frac{\prod_i (\mu_{\gamma_j} - a_i)}{\prod_{k \neq j} (\mu_{\gamma_j} - \mu_{\gamma_k})} \prod_{k \neq j} (\mu - \mu_{\gamma_k}) \tag{5.72}$$
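The interpolation formula (5.72) can be checked directly with exact rational arithmetic. The sketch below evaluates P(µ) from a set of divisor points and confirms that it takes the prescribed values at the µ of each point. The sample values of the aᵢ and of the points are arbitrary test data, and the function name is illustrative.

```python
from fractions import Fraction
from math import prod

def neumann_P(mu, points, a):
    """Evaluate P(mu) of eq. (5.72).

    points: list of (lam_j, mu_j) for the g points of the dynamical divisor.
    a:      list of the N non-dynamical constants a_i.
    """
    mus = [m for _, m in points]
    # Monic part prod_j (mu - mu_j): it vanishes at every interpolation node.
    value = prod((mu - m for m in mus), start=Fraction(1))
    for j, (lam_j, mu_j) in enumerate(points):
        others = [m for k, m in enumerate(mus) if k != j]
        # Prescribed value P(mu_j) = -lam_j^2 * prod_i (mu_j - a_i).
        target = -lam_j**2 * prod((mu_j - ai for ai in a), start=Fraction(1))
        # Lagrange basis polynomial: 1 at mu_j, 0 at the other nodes.
        basis = prod(((mu - m) / (mu_j - m) for m in others), start=Fraction(1))
        value += target * basis
    return value

# Arbitrary sample data with N = 3 and g = N - 1 = 2.
a = [Fraction(1), Fraction(2), Fraction(3)]
points = [(Fraction(1, 2), Fraction(5)), (Fraction(2), Fraction(-1))]

values_at_nodes = [neumann_P(mu_j, points, a) for _, mu_j in points]
expected = [-lam_j**2 * prod((mu_j - ai for ai in a), start=Fraction(1))
            for lam_j, mu_j in points]
```

Since the Lagrange part has degree g − 1, the polynomial stays monic of degree g, as required for P(µ) = ∏(µ − bᵢ).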
Using the canonical Poisson bracket eq. (5.69), it is a simple exercise to check that $\{P(\mu), P(\mu')\} = 0$, as it should be.

5.11 Action–angle variables

So far, we dealt with complexified dynamical systems. We found that the phase space of this complexified system can be viewed as a fibration where the base is the moduli space of the spectral curve and the fibre is the Jacobian of the spectral curve corresponding to the specific values of the moduli parameters. This is very similar to the situation in the Liouville theorem, but the Liouville tori are real tori of dimension $g$, while the Jacobian has real dimension $2g$. We need to choose a real slice of this complex phase space.

This can be done as follows. On $\Gamma$ we choose a canonical basis of $2g$ cycles $a_k$, $b_k$. The cycles $a_k$ are non-intersecting. Once they are chosen, we can adapt the basis of Abelian differentials of the first kind $\omega_k$ such that they are normalized by $\oint_{a_k} \omega_l = \delta_{kl}$. The real slice can be defined by restricting the $g$ points of the dynamical divisor $D$ to move along these $g$ non-intersecting cycles, each point on a different cycle. This obviously is a product of $g$ real circles. One has to be aware that a real slice has in general several connected components, and the above description applies to one of them. Finally, explicit models correspond to specific cycles and not only to homology classes of cycles. The $g$ angle variables are given by

$$\theta_k = \sum_{i=1}^{g} \int_{m_0}^{\gamma_i} \omega_k = \sum_{i=1}^{g} \int_{\lambda_0}^{\lambda_{\gamma_i}} \sigma_k(\lambda)\, d\lambda \tag{5.73}$$
where the integration paths are taken along the cycles $a_k$ and the Abelian differentials $\omega_k$ are written in terms of the local parameter $\lambda$ as $\omega_k = \sigma_k(\lambda)\, d\lambda$. With these assumptions the angles have real periods. The angles being defined, one may find the conjugated action variables. To do this, we need a lemma.

Lemma. The conditions characterizing the dynamical moduli can be summarized into the single statement:

$$\delta(\mu\, d\lambda) \ \text{is a regular form} \tag{5.74}$$

where $\delta$ is the differentiation with respect to the moduli, keeping $\lambda$ constant.

Proof. Taking the variation at $\lambda$ constant produces poles at the branch points of the covering which are cancelled by corresponding zeroes of $d\lambda$. The form $\mu\, d\lambda$ has poles at finite distance where $L(\lambda)$ has poles. Around a pole $\lambda_k$, we have

$$L_k(\lambda) = \left( g^{(k)}(\lambda)\, A^{(k)}_-(\lambda)\, g^{(k)-1}(\lambda) \right)_-$$

and we assume that the diagonal polar part $A^{(k)}_-(\lambda)$ is non-dynamical. Hence the singular part of $\mu$ is kept fixed under $\delta$, and $\delta(\mu\, d\lambda)$ is regular at $\lambda_k$. At $\lambda = \infty$, $d\lambda$ has a double pole. The dominant term of $\mu\, d\lambda$ near $Q_i = (\infty, a_i)$ is $a_i\, d\lambda$, and is kept fixed because we assume that $L_0$ is non-dynamical. The subdominant term is also kept fixed because of the reduction by the group of conjugation by diagonal matrices.
The Hamiltonians $H_n$ generating this group action are given in eq. (5.8). Setting $\mu_i = a_i + \frac{b_i}{\lambda} + \cdots$, we have

$$H_n = \sum_i a_i^{n-1} b_i$$

After Hamiltonian reduction, these quantities are to be kept fixed. So both $a_i$ and $b_i$ are non-dynamical and $\delta(\mu\, d\lambda)$ is regular at infinity.

We emphasize that all the conditions specifying the non-dynamical variables in $L(\lambda)$ are accounted for by eq. (5.74). Under these conditions, we have seen at the beginning of this chapter that the counting of parameters leaves a phase space of dimension $2g$. The $g$ action variables are now easily constructed:

Proposition. Assume that $\delta(\mu\, d\lambda)$ is regular. Then we have:

$$\omega = \sum_{i=1}^{g} \delta\mu_{\gamma_i} \wedge \delta\lambda_{\gamma_i} = \sum_{i=1}^{g} \delta I_i \wedge \delta\theta_i \tag{5.75}$$

where the action variable $I_k$, canonically conjugated to $\theta_k$, is

$$I_k = \oint_{a_k} \mu\, d\lambda \tag{5.76}$$
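Proof. By eq. (5.74), $\delta(\mu\, d\lambda)$ decomposes on the basis of holomorphic Abelian differentials: $\delta(\mu\, d\lambda) = \sum_i \alpha_i \omega_i$. To find the coefficients $\alpha_i$, we integrate both sides on the cycles $a_l$. We get

$$\alpha_l = \oint_{a_l} \delta(\mu\, d\lambda) = \delta \oint_{a_l} \mu\, d\lambda = \delta I_l \tag{5.77}$$

Hence we have, with $\omega_k = \sigma_k(\lambda)\, d\lambda$:

$$\delta(\mu\, d\lambda) = \sum_k \delta I_k\, \omega_k = \sum_k \delta I_k\, \sigma_k(\lambda)\, d\lambda$$

Since the variations are taken at $\lambda$ constant so that $\delta(\mu\, d\lambda) = \delta(\mu)\, d\lambda$, and since $\delta\mu$ decomposes on the $\delta I_k$ by eq. (5.58), we have

$$\delta\mu = \sum_k \frac{\partial \mu}{\partial I_k}\, \delta I_k, \qquad \frac{\partial \mu}{\partial I_k} = \sigma_k(\lambda)$$

By the definition of the angular variables in eq. (5.73) one has, using $\delta\sigma_i(\lambda) = \sum_k \frac{\partial \sigma_i(\lambda)}{\partial I_k}\, \delta I_k$:

$$\delta\theta_i = \sum_{j=1}^{g} \sigma_i(\lambda_{\gamma_j})\, \delta\lambda_{\gamma_j} + \sum_{j=1}^{g} \sum_k \left( \int_{\lambda_0}^{\lambda_{\gamma_j}} \frac{\partial^2 \mu}{\partial I_i\, \partial I_k}\, d\lambda \right) \delta I_k$$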
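Finally, we obtain:

$$\omega = \sum_i \delta\mu_{\gamma_i} \wedge \delta\lambda_{\gamma_i} = \sum_i \delta I_i \wedge \sum_j \sigma_i(\lambda_{\gamma_j})\, \delta\lambda_{\gamma_j} = \sum_i \delta I_i \wedge \delta\theta_i - \sum_{i,j} \sum_k \left( \int_{\lambda_0}^{\lambda_{\gamma_j}} \frac{\partial^2 \mu}{\partial I_i\, \partial I_k}\, d\lambda \right) \delta I_i \wedge \delta I_k = \sum_i \delta I_i \wedge \delta\theta_i$$

where the second term vanishes because $\partial_{I_i} \partial_{I_k} \mu$ is symmetrical in the indices $i$, $k$ and $\delta I_i \wedge \delta I_k$ is antisymmetric. This shows that the $I_i$ are canonically conjugated to the $\theta_i$. At the level of Poisson brackets, we have

$$\{I_i, I_j\} = 0, \qquad \{I_i, \theta_j\} = \delta_{ij}, \qquad \{\theta_i, \theta_j\} = 0$$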
5.12 Riemann surfaces and integrability

We are now in a position to clarify the link between integrable systems and Riemann surfaces. Let $\Gamma$ be a Riemann surface of genus $g$ and let $\lambda$ be a meromorphic function on it. We assume that $\lambda$ takes each value $N$ times. Any other meromorphic function $\mu$ on $\Gamma$ is related to $\lambda$ by an algebraic relation $\Gamma(\lambda, \mu) = 0$. One can choose $\mu$ such that this relation is irreducible. Then the field of meromorphic functions on $\Gamma$ is the field of rational functions of $\lambda$ and $\mu$. The choice of these functions allows us to present $\Gamma$ as an $N$-sheeted covering of the Riemann sphere by $(\lambda, \mu) \to \lambda$.

We can interpret $\Gamma$ as the spectral curve of a Lax matrix $L(\lambda)$ in the following way. Let $Q_1, Q_2, \ldots, Q_N$ be the $N$ points above $\lambda = \infty$, with $\mu(Q_i) = a_i$. Choose a divisor $D$ of $g$ points on $\Gamma$. From these data, we construct $N$ linearly independent meromorphic functions, $\psi_1 = 1$ and $\psi_k$ with a zero at $Q_1$ and poles at $D + Q_k$ for $k = 2, \ldots, N$. This determines $\psi_k$ uniquely up to multiplication by a constant $c_k$. Let $P_i = (\lambda, \mu_i)$ be the $N$ points above $\lambda$. Define the $N \times N$ matrices $\hat\Psi_{ij} = \psi_i(P_j)$ and $\hat\mu = \mathrm{diag}(\mu_i)$, and let

$$L = \hat\Psi\, \hat\mu\, \hat\Psi^{-1}$$

This matrix is a rational function of $\lambda$ because it is a rational function of $\lambda, \mu_1, \ldots, \mu_N$, invariant by permutations of the $\mu_j$. It tends to the diagonal matrix $\mathrm{diag}(a_i)$ at $\infty$, and $\Gamma$ is the spectral curve of $L(\lambda)$. Note that $L$ is defined only up to conjugation by diagonal matrices due to the indeterminacy in the normalization of the functions $\psi_k$.
We now introduce time evolutions such that $\Gamma$ is time-independent, but the divisor $D$ depends on time. This is enough to assert the existence of a rational Lax equation

$$\dot L(\lambda) = [M(\lambda), L(\lambda)], \qquad M(\lambda) = \dot{\hat\Psi}\, \hat\Psi^{-1}$$

We are thus exactly in the situation of the Zakharov–Shabat construction. To relate to Liouville integrable systems, we have to introduce a symplectic structure on the dynamical variables. We have seen that imposing coadjoint orbit structure at the poles of $L(\lambda)$ automatically yields integrable systems once we have performed the Hamiltonian reduction by the diagonal group action of dimension $N - 1$. This produces a dynamical system of dimension $2g$. The $g$ angle variables are given by the dynamical divisor, which evolves linearly on $\mathrm{Jac}(\Gamma)$, and the $g$ action variables are contained in the moduli of the curve. The conditions we impose on the moduli, coming from the coadjoint orbit structure and the Hamiltonian reduction, can be written in a very concise way:

$$\delta(\mu\, d\lambda) \ \text{is a holomorphic differential}$$

where $\delta$ is the differential with respect to the dynamical moduli. This means that the polar parts of $\mu\, d\lambda$ are non-dynamical. Since $\delta\mu\, d\lambda = \sum_{k=1}^{g} \omega_k\, \delta I_k$, we see that we have exactly $g$ dynamical moduli. In this setting, the standard symplectic form on the variables $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$ is equal to the Kirillov symplectic form on $L(\lambda)$, as we have shown in eq. (5.61). Moreover, due to eq. (5.75), the angle variables $\theta_i = \sum_j \int^{\gamma_j} \omega_i$ are canonically conjugated to the action variables $I_i = \oint_{a_i} \mu\, d\lambda$, i.e.

$$\omega_K = \sum_{i=1}^{g} \delta\mu_{\gamma_i} \wedge \delta\lambda_{\gamma_i} = \sum_{i=1}^{g} \delta I_i \wedge \delta\theta_i \tag{5.78}$$
We want to emphasize the meaning of this result. Starting from a Riemann surface $\Gamma(\lambda, \mu) = 0$, we specify $g$ dynamical moduli $F_1, \ldots, F_g$, by imposing that $\delta(\mu\, d\lambda)$ be regular. We take $g$ arbitrary points $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$ and impose the symplectic structure $\omega = \sum_i \delta\mu_{\gamma_i} \wedge \delta\lambda_{\gamma_i}$ on these data. We determine the $g$ moduli $F_i$ by solving the $g$ equations meaning that the curve passes through the points $\gamma_i$:

$$\Gamma(\lambda_{\gamma_i}, \mu_{\gamma_i}; F_1, \ldots, F_g) = 0$$

This determines the $F_i$ as symmetric functions of the $\lambda_{\gamma_j}$, $\mu_{\gamma_j}$. The beautiful result is that these functions Poisson commute, $\{F_i, F_j\} = 0$, because, by eq. (5.78), the action variables $I_i$ Poisson commute and they are independent functions of the $F_j$. See eq. (5.72) for an example of this situation.

Remark 1. The above construction can be generalized by imposing conditions such as

$$\frac{\delta\mu\, d\lambda}{\mu^n \lambda^m} = \sum_{k=1}^{g} \omega_k\, \delta I_k^{(n,m)}$$

for $g$ moduli $I_k^{(n,m)}$. This will modify the symplectic form as well. An example of this is given in Chapter 6.
Remark 2. If the Riemann surface $\Gamma$ can be viewed as a covering of the Riemann sphere in different ways, one can construct Lax matrices $L(\lambda)$ of different sizes in the above way. In particular, if $\Gamma$ is hyperelliptic, one can construct a $2 \times 2$ Lax matrix.
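Remark 3. Lax matrices with elliptic dependence on the spectral parameter can be viewed as particular cases of this setup when the covering of the Riemann sphere $\lambda : \Gamma \to S$ factorizes as $\lambda : \Gamma \to T \to S$, where $T$ is the torus. The rational Lax matrix has a size twice as big as the elliptic one.

5.13 The Kowalevski top

We now briefly discuss the algebro-geometric solution of the Kowalevski top. It is more convenient to start from a slightly different Lax matrix from the one in eq. (4.69) in Chapter 4, obtained by conjugation $L(\lambda) \to \lambda P^{-1} L(\lambda) P$:

$$P = \begin{pmatrix} i\sigma_0 + \sigma_3 & \sigma_1 + \sigma_2 \\ \sigma_0 + i\sigma_3 & i\sigma_1 - i\sigma_2 \end{pmatrix}$$

The new Lax matrix reads:

$$L(\lambda) = \frac{i}{2} \begin{pmatrix} 0 & -\lambda \xi_2 & \frac{1}{2} z_2 & \lambda \gamma_3 \\ \lambda \xi_1 & 0 & -\lambda \gamma_3 & -\frac{1}{2} z_1 \\ \frac{1}{2} z_1 & \lambda \gamma_3 & -J_3 & \frac{2}{\lambda} + \lambda \xi_1 \\ -\lambda \gamma_3 & -\frac{1}{2} z_2 & -\frac{2}{\lambda} - \lambda \xi_2 & J_3 \end{pmatrix} \tag{5.79}$$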
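where we have used Kowalevski's variables $z_1 = J_1 + iJ_2$, $z_2 = J_1 - iJ_2$, $\xi_1 = \gamma_1 + i\gamma_2$, $\xi_2 = \gamma_1 - i\gamma_2$. We restrict ourselves to the pure Kowalevski case $\gamma = 0$. In this basis the Lax matrix satisfies the symmetry properties:

$${}^t L(-\lambda) = -\Sigma_1^{-1} L(\lambda)\, \Sigma_1, \qquad {}^t L(\lambda) = -\Sigma_2^{-1} L(\lambda)\, \Sigma_2, \qquad L(-\lambda) = \Sigma_3^{-1} L(\lambda)\, \Sigma_3 \tag{5.80}$$

where the matrices $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ are given by:

$$\Sigma_1 = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_1 \end{pmatrix}, \qquad \Sigma_2 = \begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix}, \qquad \Sigma_3 = \begin{pmatrix} \sigma_3 & 0 \\ 0 & \sigma_3 \end{pmatrix}$$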
The matrix $P$ has been chosen to simplify the expression of these symmetries and in particular to diagonalize $\Sigma_3$. The first of eqs. (5.80) expresses the fact that $L(\lambda)$ belongs to a twisted loop algebra. The second one says that $L(\lambda)$ belongs to $sp(4)$, which is well-known to be isomorphic to $so(3, 2)$, while the third is a combination of the first two. The equation of the spectral curve $\Gamma : \det(L(\lambda) - \mu) = 0$ reads:

$$\mu^4 - \left( \frac{\lambda^2}{2} \vec\gamma^{\,2} - \frac{1}{4} H + \frac{1}{\lambda^2} \right) \mu^2 + \frac{\lambda^4}{16} (\vec\gamma^{\,2})^2 + \frac{\lambda^2}{16} \left( (\vec J \cdot \vec\gamma)^2 - H \vec\gamma^{\,2} \right) + \frac{K}{256} = 0 \tag{5.81}$$

The Hamiltonians $H$ and $K$ in this formula are given by:

$$H = \frac{1}{2} (J_1^2 + J_2^2 + 2J_3^2) - 4\gamma_1 = \frac{1}{2} z_1 z_2 + J_3^2 - 2(\xi_1 + \xi_2)$$

$$K = (J_1^2 - J_2^2 + 8\gamma_1)^2 + (2J_1 J_2 + 8\gamma_2)^2 = (z_1^2 + 8\xi_1)(z_2^2 + 8\xi_2)$$

while $\vec\gamma^{\,2}$ and $\vec J \cdot \vec\gamma$ are in the centre of the Poisson algebra. Note that the coordinates $\lambda$ and $\mu$ on the spectral curve appear only through $\lambda^2$ and $\mu^2$, which is a consequence of the symmetries eqs. (5.80). It will be necessary in the following to have a clear picture of the solutions $\mu(\lambda)$ of eq. (5.81) around $\lambda = 0$ and $\lambda = \infty$. Around $\lambda = \infty$, we have four branches:

$$\mu = \epsilon \left( \frac{\sqrt{\vec\gamma^{\,2}}}{2}\, \lambda + i \epsilon' \frac{\vec J \cdot \vec\gamma}{4 \sqrt{\vec\gamma^{\,2}}} + O\!\left( \frac{1}{\lambda} \right) \right) \tag{5.82}$$

where $\epsilon$, $\epsilon'$ are independent signs. Around $\lambda = 0$, we get two branches with $\mu \to 0$ and two branches with $\mu \to \infty$:

$$\mu = \epsilon \frac{\sqrt{K}}{16}\, \lambda + O(\lambda^3), \qquad \mu = \epsilon \left( \frac{1}{\lambda} - \frac{H}{8}\, \lambda \right) + O(\lambda^3) \tag{5.83}$$

Of course all these branches properly exchange under the symmetries $\lambda \to -\lambda$ and $\mu \to -\mu$. At this point it is important to recall that $\Gamma$ is defined as the desingularization of the curve defined by eq. (5.81).

We are going to study $\Gamma$ by considering it as successive coverings of simpler curves. Setting $\lambda^2 = z$ in eq. (5.81) yields a curve $C$ of equation:

$$\mu^4 - \left( \frac{z}{2} \vec\gamma^{\,2} - \frac{1}{4} H + \frac{1}{z} \right) \mu^2 + \frac{z^2}{16} (\vec\gamma^{\,2})^2 + \frac{z}{16} \left( (\vec J \cdot \vec\gamma)^2 - H \vec\gamma^{\,2} \right) + \frac{K}{256} = 0$$

and $\Gamma$ is a two-sheeted cover of $C$. Setting $\mu^2 = y$ we get the curve $E$ of equation:

$$y^2 - \left( \frac{z}{2} \vec\gamma^{\,2} - \frac{1}{4} H + \frac{1}{z} \right) y + \frac{z^2}{16} (\vec\gamma^{\,2})^2 + \frac{z}{16} \left( (\vec J \cdot \vec\gamma)^2 - H \vec\gamma^{\,2} \right) + \frac{K}{256} = 0$$

and $C$ is a two-sheeted branched cover of $E$.

First, $E$ is an elliptic curve of genus 1. Indeed, setting $t = 1/z$ and $Y = ty - \frac{1}{4} \vec\gamma^{\,2} + \frac{H}{8} t - \frac{1}{2} t^2$, the equation of $E$ takes the form $Y^2 = t P_3(t)$, where $P_3(t)$ is the polynomial of degree 3:

$$P_3(t) = \frac{1}{4} t^3 - \frac{H}{8} t^2 + \left( \frac{H^2}{64} + \frac{\vec\gamma^{\,2}}{4} - \frac{K}{256} \right) t - \frac{(\vec J \cdot \vec\gamma)^2}{16}$$

The four branch points are obtained for $Y = 0$, so there is a branch point at $t = 0$ (or $z = \infty$) and three branch points at $t$ (or $z$) finite.

We now study the covering $C \to E$, coming from $\mu \to y = \mu^2$. This two-sheeted covering can only be branched at $y = 0$ and $y = \infty$. The meromorphic function $y$ on $E$ takes each value three times because, given $y$, $z$ is determined by a third degree equation. Hence it has three zeroes and three poles. Setting $y = 0$ in the equation of $E$, one gets

$$\frac{1}{16} (\vec\gamma^{\,2})^2 + \frac{1}{16}\, t \left( (\vec J \cdot \vec\gamma)^2 - H \vec\gamma^{\,2} \right) + \frac{K}{256}\, t^2 = 0$$

yielding two points $(y = 0, t = t_1)$ and $(y = 0, t = t_2)$, where $t_1$, $t_2$ are the two roots of this second degree equation. The third point with $y = 0$ and the three points with $y = \infty$ occur when $t = 0$ and $t = \infty$. For $t \to \infty$ we have two points $P_1$, $P_2$ on the curve $E$ corresponding to the branches:

$$P_1 : \ y = t - \frac{H}{4} + O\!\left( \frac{1}{t} \right), \qquad P_2 : \ y = \frac{K}{256}\, \frac{1}{t} + O\!\left( \frac{1}{t^2} \right) \tag{5.84}$$

Since $t$ is a good local parameter at $\infty$, $P_1$ is a pole of $y$, while $P_2$ provides the third zero of $y$. For $t \to 0$, a good local parameter on $E$ is $\sqrt{t}$ and we find two branches:

$$P_3 : \ y = \frac{\vec\gamma^{\,2}}{4}\, \frac{1}{t} \pm i\, \frac{\vec J \cdot \vec\gamma}{4}\, \frac{1}{\sqrt{t}} + O(1) \tag{5.85}$$
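The change of variables that brings E to the form Y² = tP₃(t) is a short but error-prone computation, so a mechanical check may be useful. The sketch below verifies, in exact rational arithmetic, the polynomial identity Y² − tP₃(t) = t²·E(y, 1/t), where E(y, z) denotes the left-hand side of the equation of E as written in this section. The shorthand `g2` for the squared vector γ² and `Jg` for J·γ, as well as the sample values, are illustrative assumptions.

```python
from fractions import Fraction

def curve_E(y, z, H, K, g2, Jg):
    """Left-hand side of the equation of E (g2 = gamma^2, Jg = J.gamma)."""
    B = z * g2 / 2 - H / 4 + 1 / z
    C = z**2 * g2**2 / 16 + z * (Jg**2 - H * g2) / 16 + K / 256
    return y**2 - B * y + C

def P3(t, H, K, g2, Jg):
    """Cubic polynomial appearing in Y^2 = t * P3(t)."""
    return (t**3 / 4 - H * t**2 / 8
            + (H**2 / 64 + g2 / 4 - K / 256) * t - Jg**2 / 16)

def weierstrass_defect(y, t, H, K, g2, Jg):
    """Y^2 - t*P3(t) - t^2 * E(y, 1/t); this should vanish identically."""
    Y = t * y - g2 / 4 + H * t / 8 - t**2 / 2
    return Y**2 - t * P3(t, H, K, g2, Jg) - t**2 * curve_E(y, 1 / t, H, K, g2, Jg)

# Arbitrary rational sample values (y, t, H, K, gamma^2, J.gamma).
samples = [
    (Fraction(3, 7), Fraction(2, 5), Fraction(1, 3), Fraction(5), Fraction(2), Fraction(-1, 4)),
    (Fraction(-11, 2), Fraction(9), Fraction(-4), Fraction(1, 6), Fraction(7, 3), Fraction(8)),
    (Fraction(1), Fraction(-3, 8), Fraction(10), Fraction(-2), Fraction(1, 9), Fraction(0)),
]
defects = [weierstrass_defect(*sample) for sample in samples]
```

Because the identity holds for arbitrary y and t, points with E(y, 1/t) = 0 are exactly those with Y² = tP₃(t), which is the statement used to read off the four branch points.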
showing that $y$ has a double pole at this point $P_3$ $(t = 0, y = \infty)$ on $E$. Of these six poles and zeroes of $y$ only four are branch points of the covering $C \to E$. This is because at the point $P_3$ the equation of $C$ is singular. Since $C$ is the desingularized curve, $P_3$ blows up to two points $\tilde P_3$ and $\tilde P_3'$ of $C$, and the point $P_3$ has two pre-images as its neighbours. On the other hand, $P_1$ and $P_2$ are branch points and have just one pre-image each, $\tilde P_1$ and $\tilde P_2$ on $C$. Using the Riemann–Hurwitz formula $2g - 2 = N(2g_0 - 2) + \nu$, where $g_0 = 1$, $N = 2$ and $\nu = 4$, we find that $C$ has genus 3.

Finally, we study the covering $\Gamma \to C$, coming from $\lambda \to z = \lambda^2$, which can possibly be ramified only at $z = 0, \infty$. Generically, given $z$, there are four values of $\mu$ satisfying the equation of $C$, so that the meromorphic function $z$ has four zeroes and four poles. We have already obtained the four branches of $\Gamma$ above $\lambda = \infty$ in eq. (5.82), which correspond by definition to four points on the smooth curve $\Gamma$. They project on the two branches of $C$ given by eq. (5.85), hence the covering at the two points $\tilde P_3$ and $\tilde P_3'$ is unbranched. Similarly, above $\lambda = 0$ we have the four branches of $\Gamma$ given in eq. (5.83). The two points of $\Gamma$, $Q_1$ $(\mu \sim \lambda \sqrt{K}/16)$ and $Q_2$ $(\mu \sim -\lambda \sqrt{K}/16)$, project on $\tilde P_2$, and the two points $Q_3$ $(\mu \sim 1/\lambda)$ and $Q_4$ $(\mu \sim -1/\lambda)$ project on $\tilde P_1$, as seen from eq. (5.84). So the covering is unbranched at these points. This exhausts the zeroes and poles of $z$, hence the covering $\Gamma \to C$ is unbranched. Applying the Riemann–Hurwitz formula with $g_0 = 3$, $N = 2$, $\nu = 0$, we find that the genus of the spectral curve $\Gamma$ is equal to 5.

We see that, in the case of the Kowalevski top, the Jacobian of the spectral curve is of dimension 5 while the Liouville tori are of dimension 2. This non-generic situation is related to the symmetry properties, eqs. (5.80), of the Lax matrix $L(\lambda)$, as we now show. Consider the eigenvector $\Psi(\lambda, \mu)$ satisfying $L(\lambda)\Psi = \mu\Psi$ at the point $(\lambda, \mu)$ of the spectral curve, normalized such that the first component is equal to 1. According to the general discussion $\Psi$ has $g + N - 1 = 8$ poles on the spectral curve. It is easy to get the following expansions for the eigenvector at the four points $Q_1$, $Q_2$, $Q_3$, $Q_4$ above $\lambda = 0$. The matrix $\hat\Psi(\lambda)$ reads

$$\hat\Psi(\lambda) = \begin{pmatrix} 1 & 1 & 1 & 1 \\ i\zeta + O(\lambda) & -i\zeta + O(\lambda) & i \frac{z_1}{z_2} + O(\lambda) & -i \frac{z_1}{z_2} + O(\lambda) \\ -\frac{i z_2}{4} \zeta \lambda + O(\lambda^2) & \frac{i z_2}{4} \zeta \lambda + O(\lambda^2) & -\frac{4i}{z_2} \frac{1}{\lambda} + O(1) & \frac{4i}{z_2} \frac{1}{\lambda} + O(1) \\ -\frac{1}{4} z_1 \lambda + O(\lambda^2) & -\frac{1}{4} z_1 \lambda + O(\lambda^2) & -\frac{4}{z_2} \frac{1}{\lambda} + O(1) & -\frac{4}{z_2} \frac{1}{\lambda} + O(1) \end{pmatrix} \tag{5.86}$$

The column $i$ corresponds to an expansion at the point $Q_i$. We have denoted $\zeta = \sqrt{(z_1^2 + 8\xi_1)/(z_2^2 + 8\xi_2)}$. Note that the eigenvector $\Psi(\lambda, \mu)$ has simple poles at $Q_3$ and $Q_4$. This can be understood in the context of the general analysis of this chapter. Indeed, choosing a basis where the constant coefficient of $1/\lambda$ in $L(\lambda)$ (which plays the role of $L_0$ in the general discussion) is diagonal, we expect poles at the points above $\lambda = 0$. However, because we have two degenerate vanishing eigenvalues, we have only two poles instead of the expected three. They are on the sheets corresponding to the non-vanishing eigenvalues, hence at the two points $Q_3$, $Q_4$. When returning to our basis, we have to make linear combinations of the last two components of $\Psi$ and this explains our formulae.

Recall that the Lax matrix obeys eq. (5.80), so that if $L(\lambda)\Psi(\lambda, \mu) = \mu\Psi(\lambda, \mu)$ then $L(-\lambda)\Sigma_3 \Psi(\lambda, \mu) = \mu \Sigma_3 \Psi(\lambda, \mu)$. Since $\Sigma_3$ is diagonal and the first component of $\Psi$ is equal to one, we get:

$$\Psi(-\lambda, \mu) = \Sigma_3 \Psi(\lambda, \mu) \tag{5.87}$$
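The two genus computations made earlier in this section with the Riemann–Hurwitz formula, and the resulting pole count g + N − 1 of the eigenvector, amount to a few lines of integer arithmetic. The sketch below simply encodes 2g − 2 = N(2g₀ − 2) + ν; the function and variable names are illustrative.

```python
def riemann_hurwitz_genus(base_genus, sheets, branching):
    """Genus g of a cover, from 2g - 2 = N(2*g0 - 2) + nu."""
    twice_g_minus_two = sheets * (2 * base_genus - 2) + branching
    if twice_g_minus_two % 2:
        raise ValueError("inconsistent data: 2g - 2 must be even")
    return twice_g_minus_two // 2 + 1

# C -> E: two sheets over the genus 1 curve E, branched at four points.
genus_C = riemann_hurwitz_genus(base_genus=1, sheets=2, branching=4)

# Gamma -> C: two sheets over C, unbranched.
genus_Gamma = riemann_hurwitz_genus(base_genus=genus_C, sheets=2, branching=0)

# Number of poles of the normalized eigenvector of the 4 x 4 Lax matrix.
N = 4
number_of_poles = genus_Gamma + N - 1
```

With these inputs the function returns 3 for C and 5 for Γ, and the pole count is 8, matching the values quoted in the text.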
We have seen that $\Gamma$ is a two-sheeted unbranched cover of $C$. The two sheets are exchanged under the involution $\tau : (\lambda, \mu) \to (-\lambda, \mu)$. Note that $\tau$ exchanges the points $Q_1$, $Q_2$ and also $Q_3$, $Q_4$. It exchanges the corresponding sheets in a vicinity of $\lambda = 0$. On the explicit solution, eq. (5.86), we have $\hat\Psi(-\lambda) = \Sigma_3 \hat\Psi(\lambda) \Sigma_1$. The matrix $\Sigma_1$ on the right accounts for the exchange of the sheets under $\tau$. As a result of eq. (5.87) the meromorphic functions $\psi_i(P)$, $i = 2, 3, 4$, on the curve $\Gamma$ obey the symmetry properties:

$$\psi_2(\tau \cdot P) = -\psi_2(P), \qquad \psi_3(\tau \cdot P) = \psi_3(P), \qquad \psi_4(\tau \cdot P) = -\psi_4(P)$$
In particular the divisor of the poles of $\Psi$ is invariant under the involution $\tau$. Two of these poles are the points $Q_3$, $Q_4$ which are exchanged by this involution. The remaining six poles thus come in three pairs $(\gamma_j, \tau \cdot \gamma_j)$, $j = 1, 2, 3$. These pairs can be seen as points on $C$, also denoted by $\gamma_i$. The dynamical divisor on $C$, $D(t) = \sum_{i=1}^{3} \gamma_i(t)$, is of degree 3 and moves linearly on the dimension 3 Jacobian of the curve $C$, as we now show. Considering the Lax pair $L$, $M$ given in eq. (4.69) in Chapter 4 and remembering that $L$ has been rescaled, $L \to \lambda L$, we see that the polar parts of $L$ and $M$ at $\lambda = 0$ are related by $M_- = \frac{1}{2} L_-$. This relation is not affected by the further similarity $L \to P^{-1} L P$ that we have used in this section. Proceeding exactly as in eq. (5.35), we get:

$$\frac{d}{dt} \sum_{i=1}^{3} \left( \int^{\gamma_i(t)} \omega + \int^{\tau(\gamma_i(t))} \omega \right) = \frac{1}{2} \sum_{i=3}^{4} \mathrm{Res}_{Q_i}\, \mu\omega \tag{5.88}$$

where $Q_3$, $Q_4$ are the two points on $\Gamma$ with $\lambda = 0$ and $\mu = \infty$, and $\omega$ is any Abelian differential of the first kind on $\Gamma$. In particular, choosing for $\omega$ the pullback on $\Gamma$ of an Abelian differential $\omega$ on $C$, the right-hand side becomes twice the residue at the point $\tilde P_1$ on $C$ of the form $\left( \frac{1}{2} \mu\omega \right)$. This is because in the vicinity of the corresponding points $\mu(-\lambda) = -\mu(\lambda)$, and the pullback of a form on $C$ has a local expression $\sigma(\lambda)\, d\lambda$ with $\sigma(-\lambda) = -\sigma(\lambda)$. Note that non-vanishing higher order Hamiltonians in the Kowalevski hierarchy are traces of an even power of $L$, ultimately yielding an odd power of $\mu$ in eq. (5.88), so the same conclusion applies to all these flows. Similarly, the left-hand side doubles between a pair of corresponding points $\gamma_i$, by definition of a pullback. Finally, we get:

$$\frac{d}{dt} \sum_{i=1}^{3} \int^{\gamma_i(t)} \omega = \mathrm{Res}_{\tilde P_1}\, \frac{1}{2} \mu\omega \tag{5.89}$$
Since C is of genus 3, we have three independent forms ω and this proves that the flow is linear on the Jacobian of C. One can view the functions $\psi_i(P)$ as functions on C in the following way. First, $\psi_3(P)$ is a well-defined meromorphic function on C (since it is even under τ) with three poles at D, one pole at $\tilde P_1$ and a zero at $\tilde P_2$, and is therefore uniquely determined. The functions $\psi_2(P)$ and $\psi_4(P)$ require special treatment since they are odd under τ, hence are multivalued on C. However, we can consider the function $\lambda = \sqrt{z}$ defined on Γ, which is odd under τ, and the functions $\lambda\psi_2(P)$ and $\lambda\psi_4(P)$ which, being even under τ, yield well-defined meromorphic functions on C. These functions are uniquely characterized by their analyticity properties: $\lambda\psi_2(P)$ has three simple poles at D, two simple zeroes at $\tilde P_1$ and $\tilde P_2$ and simple poles at $\tilde P_3$ and $\tilde P_3'$, while $\lambda\psi_4(P)$ has three simple poles at D, a double zero at $\tilde P_2$, and two simple poles at $\tilde P_3$ and $\tilde P_3'$. Hence we have shown that one can work only with the genus 3 curve C, together with the extra multivalued function $\lambda = \sqrt{z}$. We still have three points in D, while we need only two degrees of freedom. A further restriction is provided by the other symmetry (λ, µ) → (λ, −µ) induced by the second eq. (5.80). Note that the right-hand side of eq. (5.89) contains only odd powers of µ for the general Kowalevski flow. At the point $\tilde P_1$ (which is a branch point of C → E) µ is a good local parameter. Assume that ω is the pullback of a form on E, hence has a local expression σ(µ)dµ with σ(µ) odd; then $\mu^{2k+1}\sigma(\mu)$ is even, and has no 1/µ term, so that the right-hand side of eq. (5.89) vanishes. Since E is of genus 1, we get one condition which restricts the flow. We finally see that the flow occurs on a two-dimensional subvariety of Jac(C), the so-called Prym variety of the covering C → E. It is defined as the subvariety such that any tangent vector is in the kernel of the pullback to C of any Abelian form of E.
Here the action of ω on a tangent vector to Jac (C) is defined by the left-hand side of eq. (5.89). We have recovered a four-dimensional phase space for the Kowalevski top. At this point one can solve the equations of motion with theta-functions on Jac (C) and reduce them to two-dimensional theta-functions by using the Prym condition. Finally, Kowalevski has directly solved the system by using a curve of genus 2. For a study of these approaches and their relations we refer to the literature.
5.14 Infinite-dimensional systems
In the field theory case, we can use the previous constructions to find particular classes of solutions to the field equations, called finite-zone solutions. The equations we have to solve are the first order differential system:
$$(\partial_x - U(\lambda))\Psi = 0 \qquad (5.90)$$
$$(\partial_t - V(\lambda))\Psi = 0 \qquad (5.91)$$
whose compatibility conditions are equivalent to the field equations. The situation is very different as compared to the finite-dimensional case. As we saw in Chapter 3, the analogue of the spectral curve is
$$\det(T(\lambda) - \mu) = 0 \qquad (5.92)$$
where T(λ) is the monodromy matrix of the linear system (5.90, 5.91). This equation does not define an algebraic curve of finite genus. This was to be expected since, in field theory, we need an infinite number of action variables, which is incompatible with a finite genus of the spectral curve. Thus we cannot directly apply the previous construction. However, if we restrict our goal to finding only particular solutions to eqs. (5.90, 5.91), then the knowledge acquired in this chapter becomes directly applicable. In fact, the two equations (5.90, 5.91) are exactly of the type of eq. (5.33), whose solution was built in terms of Baker–Akhiezer functions. One can adapt this construction to solve them simultaneously. The idea consists of interpreting the two equations (5.90, 5.91) as evolution equations with respect to two different “times” for a system with a finite number of degrees of freedom associated with some Lax matrix L(λ). This Lax matrix should satisfy:
$$[\partial_x - U(\lambda), L(\lambda)] = 0,\qquad [\partial_t - V(\lambda), L(\lambda)] = 0 \qquad (5.93)$$
To exhibit such Lax matrices, we consider the higher order flows as described in eq. (3.95) of Chapter 3. They provide a family of compatible linear equations $(\partial_{t_i} - V_i)\Psi = 0$ for i = 1, 2, 3, ..., where we have identified $t_1 = x$, $V_1 = U$ and $t_2 = t$, $V_2 = V$. Since these equations are compatible they satisfy a zero-curvature condition:
$$F_{ij} \equiv \partial_{t_i}V_j - \partial_{t_j}V_i - [V_i, V_j] = 0,\qquad \forall\, i, j = 1,\ldots,\infty$$
We now look for particular solutions which are stationary for some given time $t_n$, i.e. $\partial_{t_n}V_i = 0$ for all i. The zero-curvature conditions $F_{ni} = 0$ reduce to a system of Lax equations:
$$\frac{dL}{dt_i} = [M_i, L],\quad i = 1,\ldots,\infty,\qquad\text{with } L = V_n,\ M_i = V_i$$
This is an integrable hierarchy for a finite-dimensional dynamical system described by the Lax matrix L. Taking n larger and larger, the genus of the corresponding spectral curve usually increases and we get families of solutions involving more and more parameters. We give an example of this procedure in Chapter 11. The methods of this chapter deal in fact with systems with a finite number of degrees of freedom. In Chapter 13 we present the inverse scattering method which deals directly with systems with an infinite number of degrees of freedom.

References
[1] Sophie Kowalevski, Sur le problème de la rotation d’un corps solide autour d’un point fixe. Acta Mathematica 12 (1889) 177–232.
[2] B. Dubrovin, V. Matveev and S. Novikov, Non-linear equations of Korteweg–de Vries type, finite-zone linear operators, and Abelian varieties. Russian Math. Surveys 31 (1976) 59–146.
[3] P. van Moerbeke and D. Mumford, The spectrum of difference operators and algebraic curves. Acta Math. 143 (1979) 93–154.
[4] A.G. Reyman and M.A. Semenov-Tian-Shansky, Reduction of Hamiltonian systems, affine Lie algebras and Lax equations. Inventiones Mathematicae 54 (1979) 81–100.
[5] M. Adler and P. van Moerbeke, Linearization of Hamiltonian systems, Jacobi varieties and representation theory. Advances in Mathematics 38 (1980) 318–379.
[6] A.G. Reyman and M.A. Semenov-Tian-Shansky, Reduction of Hamiltonian systems, affine Lie algebras and Lax equations II. Inventiones Mathematicae 63 (1981) 425–432.
[7] D. Mumford, Tata Lectures on Theta Vols. I and II, Birkhäuser (1983–1984).
[8] A. Bobenko, A. Reyman and M. Semenov-Tian-Shansky, The Kowalevski top 99 years later: a Lax pair, generalizations and explicit solutions. Comm. Math. Phys. 122 (1989) 321–354.
[9] E. Horozov and P. van Moerbeke, The full geometry of Kowalevski’s top and (1,2)-Abelian surfaces. Comm. Pure Appl. Math. 42 (1989) 357–407.
[10] B.A. Dubrovin, I.M. Krichever and S.P. Novikov, Integrable Systems I. Encyclopedia of Mathematical Sciences, Dynamical Systems IV, Springer (1990) 173–281.
[11] A. Beauville, Jacobiennes des courbes spectrales et systèmes hamiltoniens complètement intégrables. Acta Math. 164 (1990) 211–235.
[12] I.M. Krichever and D.H. Phong, On the integrable geometry of soliton equations and N=2 supersymmetric gauge theories. J. Diff. Geom. 45 (1997) 349–389.
6 The closed Toda chain
In contrast to open Toda chains, the closed Toda chain is associated with loop algebras. This introduces a spectral parameter into the theory. The aim of this chapter is to construct the general solution of the closed Toda chain by means of the analytical method. We shall do this in two ways. The first method, based on an (n+1)×(n+1) Lax matrix, follows closely Chapter 5. This canonical example illustrates the general constructions of this chapter, for instance, linearization of the flows on the Jacobian of the spectral curve, separation of variables and the corresponding Hamilton–Jacobi equations. The second method, which is based on 2 × 2 matrices, can be regarded as a lattice version of the field theoretical considerations of Chapter 3. A monodromy matrix is introduced which satisfies a quadratic Poisson bracket. This provides short cuts which in general are not available, and makes contact with Sklyanin’s method of separation of variables. We take advantage of this particular example to discuss the reality conditions, which are frequently quite subtle.
6.1 The model
We consider a chain of (n + 1) points with positions $q_i$ and momenta $p_i$ with equations of motion given by:
$$\dot q_i = p_i,\qquad \dot p_i = 2e^{2(q_{i-1}-q_i)} - 2e^{2(q_i-q_{i+1})},\qquad i = 1,\ldots,n+1 \qquad (6.1)$$
The fact that the chain is closed is implemented by setting $(p_0, q_0) = (p_{n+1}, q_{n+1})$ and $(p_{n+2}, q_{n+2}) = (p_1, q_1)$, which gives sense to the above equations for i = 1 and i = n + 1. Alternatively, one can view the points as sitting on a circle. This is a Hamiltonian system with canonical Poisson brackets $\{p_i, q_j\} = \frac12\delta_{ij}$ and Hamiltonian:
$$H = \sum_i p_i^2 + 2\sum_i \exp 2(q_i - q_{i+1}) \qquad (6.2)$$
The system has translational symmetry $q_i \to q_i + a$ and one can eliminate the centre of mass motion by imposing the two conditions $\sum_{i=1}^{n+1}p_i = 0$ and $\sum_{i=1}^{n+1}q_i = 0$, so we are left with a phase space of dimension 2n. In contrast to the open Toda chain which is associated with the finite-dimensional Lie algebra sl(n + 1), the closed Toda chain is associated with the infinite-dimensional Kac–Moody algebra. In fact we consider only its loop representation with vanishing central charge: $\widetilde{sl}_{n+1} \equiv sl_{n+1}\otimes C(\lambda,\lambda^{-1})$. Its rank is (n + 1), see Chapter 16 for more details. The generators $E_{\pm\alpha_i}$ associated with the simple roots are represented by $E_{\alpha_i} = E_{i,i+1}$, $E_{-\alpha_i} = E_{i+1,i}$, for i = 1, ..., n, together with $E_{\alpha_{n+1}} = \lambda E_{n+1,1}$, and $E_{-\alpha_{n+1}} = \lambda^{-1}E_{1,n+1}$, where $E_{jk}$ are the (n + 1) × (n + 1) canonical matrices $(E_{jk})_{mn} = \delta_{jm}\delta_{kn}$. The elements of the Cartan subalgebra are represented by (n + 1) × (n + 1) diagonal traceless matrices. Applying the general construction of the Toda models, see Chapter 4, the Lax pair for the closed Toda chain is given by:
$$L(\lambda) = \sum_{i=1}^{n+1}p_iE_{ii} + \sum_{i=1}^{n+1}a_i(E_{\alpha_i} + E_{-\alpha_i}),\qquad M(\lambda) = -\sum_{i=1}^{n+1}a_i(E_{\alpha_i} - E_{-\alpha_i}) \qquad (6.3)$$
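where the coefficients $a_i$ are given by $a_i = \exp(q_i - q_{i+1})$, i = 1, ..., n, and $a_{n+1} = \exp(q_{n+1} - q_1)$. Note that $(q_i - q_{i+1}) = \alpha_i(q)$, where q is the traceless diagonal matrix $\sum_i q_iE_{ii}$ and $\alpha_i$ is the simple root associated with the root vector $E_{i,i+1}$. The quantities $a_i$ satisfy the condition $\prod_{i=1}^{n+1}a_i = 1$. Explicitly, the Lax pair reads:
$$L(\lambda) = \begin{pmatrix} p_1 & a_1 & 0 & \cdots & \lambda^{-1}a_{n+1}\\ a_1 & p_2 & a_2 & \cdots & 0\\ \vdots & \ddots & \ddots & \ddots & \vdots\\ 0 & \cdots & a_{n-1} & p_n & a_n\\ \lambda a_{n+1} & \cdots & 0 & a_n & p_{n+1}\end{pmatrix} \qquad (6.4)$$
$$M(\lambda) = \begin{pmatrix} 0 & -a_1 & 0 & \cdots & \lambda^{-1}a_{n+1}\\ a_1 & 0 & -a_2 & \cdots & 0\\ \vdots & \ddots & \ddots & \ddots & \vdots\\ 0 & \cdots & a_{n-1} & 0 & -a_n\\ -\lambda a_{n+1} & \cdots & 0 & a_n & 0\end{pmatrix} \qquad (6.5)$$
In both matrices the generic row i carries $a_{i-1}$, the diagonal entry, and $\pm a_i$ in columns i − 1, i, i + 1.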
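The Lax pair of eqs. (6.4, 6.5) can be tested numerically against the equations of motion (6.1). The following is only a sketch in plain Python and is not part of the original text: the helper names (`periodic_matrix`, `bond_variables`, `lax_pair`, `lax_derivative`, `commutator`) and the sample values of p, q and λ are illustrative assumptions. It builds L(λ) and M(λ), computes dL/dt from eq. (6.1) together with $\dot a_i = a_i(p_i - p_{i+1})$, and compares the result with the commutator [M, L]. It also compares Tr L² with the Hamiltonian (6.2).

```python
import math

def periodic_matrix(diag, off, lam, up, down):
    """Matrix with diag[i] on the diagonal, up*off[i] along E_{alpha_i}
    and down*off[i] along E_{-alpha_i} (0-based site index i).

    E_{alpha_i} = E_{i,i+1}, except for the last site where it is
    lam*E_{n+1,1}; E_{-alpha_i} is the transpose with lam -> 1/lam.
    """
    size = len(diag)
    mat = [[0j] * size for _ in range(size)]
    for i in range(size):
        j = (i + 1) % size
        weight = lam if i == size - 1 else 1.0
        mat[i][i] += diag[i]
        mat[i][j] += up * off[i] * weight
        mat[j][i] += down * off[i] / weight
    return mat

def bond_variables(q):
    """a_i = exp(q_i - q_{i+1}) with cyclic indices."""
    size = len(q)
    return [math.exp(q[i] - q[(i + 1) % size]) for i in range(size)]

def lax_pair(p, q, lam):
    a = bond_variables(q)
    L = periodic_matrix(p, a, lam, +1, +1)               # eq. (6.4)
    M = periodic_matrix([0.0] * len(p), a, lam, -1, +1)  # eq. (6.5)
    return L, M

def lax_derivative(p, q, lam):
    """dL/dt obtained by inserting the equations of motion (6.1)."""
    size = len(p)
    a = bond_variables(q)
    p_dot = [2 * a[i - 1] ** 2 - 2 * a[i] ** 2 for i in range(size)]
    a_dot = [a[i] * (p[i] - p[(i + 1) % size]) for i in range(size)]
    return periodic_matrix(p_dot, a_dot, lam, +1, +1)

def commutator(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] - B[i][k] * A[k][j] for k in range(size))
             for j in range(size)] for i in range(size)]

# Illustrative data for n + 1 = 4 sites (arbitrary values, sum p = sum q = 0).
p = [0.3, -1.1, 0.5, 0.3]
q = [0.2, -0.4, 0.1, 0.1]
lam = 0.7 + 0.4j

L, M = lax_pair(p, q, lam)
L_dot = lax_derivative(p, q, lam)
M_L = commutator(M, L)
lax_residual = max(abs(L_dot[i][j] - M_L[i][j])
                   for i in range(4) for j in range(4))

# Tr L^2 compared with the Hamiltonian (6.2); lam drops out of the trace.
trace_L2 = sum(L[i][k] * L[k][i] for i in range(4) for k in range(4))
hamiltonian = sum(x * x for x in p) + 2 * sum(x * x for x in bond_variables(q))
```

Changing `lam` leaves both checks unaffected, which reflects the fact that the spectral parameter only enters through the two corner entries.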
The matrices L(λ) and M(λ) have poles at λ = 0 and λ = ∞. The equations of motion eq. (6.1) are equivalent to the Lax equation:
$$\dot L(\lambda) = [M(\lambda), L(\lambda)]$$
as one can check easily by an explicit computation. Notice that the correct equations of motion for $q_1$ and $q_{n+1}$ are obtained thanks to the λ-dependent terms in L(λ) and M(λ). As usual, the Lax equation ensures that $\mathrm{Tr}(L^p(\lambda))$ are conserved quantities, in particular the Hamiltonian eq. (6.2) reads $H = \mathrm{Tr}(L^2(\lambda))$ which is independent of λ. As for all Toda models, the Poisson bracket of the Lax matrix can be written in terms of an r-matrix. This implies that the closed Toda chain is integrable. There are two natural r-matrices, $r^\pm$, which can be computed according to the general formula (4.34) in Chapter 4. In particular:
$$r^+_{12}(\lambda,\lambda') = \rho(\lambda)\otimes\rho(\lambda')\Big(\frac12\sum_i H_i\otimes H_i + \sum_{\alpha>0}E_\alpha\otimes E_{-\alpha}\Big)$$
where ρ(λ) is the loop representation given above with spectral parameter λ. To compute $r^+$, we recall that the positive roots in $\widetilde{sl}_{n+1}$ are $\lambda^nE_{ij}$ with i < j, n ≥ 0 or i ≥ j, n > 0, and the corresponding negative roots are $\lambda^{-n}E_{ji}$. We immediately get:
$$r^+_{12}(\lambda,\lambda') = \frac12\,\frac{\lambda'+\lambda}{\lambda'-\lambda}\sum_i E_{ii}\otimes E_{ii} + \frac{1}{\lambda'-\lambda}\Big(\lambda'\sum_{i<j}E_{ij}\otimes E_{ji} + \lambda\sum_{i>j}E_{ij}\otimes E_{ji}\Big) \qquad (6.6)$$
This expression is valid for |λ| < |λ'|. The formula for $r^-(\lambda,\lambda')$ is the same as for $r^+(\lambda,\lambda')$ but valid in the region |λ| > |λ'|. So we consider in general the rational function $r_{12}(\lambda,\lambda')$ which is the extension of the right-hand side of eq. (6.6). Notice that $r_{12}(\lambda,\lambda') = -r_{21}(\lambda',\lambda)$ so that
$$\{L_1(\lambda), L_2(\lambda')\} = [r_{12}(\lambda,\lambda'), L_1(\lambda) + L_2(\lambda')]$$
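The r-matrix relation can also be tested numerically for a small chain. The sketch below is an illustration under stated assumptions, not part of the original text: all function names and sample values are hypothetical, tensor products are stored as (n+1)² × (n+1)² nested lists, and the left-hand side is computed directly from the canonical brackets $\{p_i, q_j\} = \frac12\delta_{ij}$ by differentiating the entries of L with respect to $p_m$ and $q_m$.

```python
import math

def periodic_matrix(diag, off, lam):
    """diag on the diagonal, off[i] along E_{alpha_i} and E_{-alpha_i}."""
    size = len(diag)
    mat = [[0j] * size for _ in range(size)]
    for i in range(size):
        j = (i + 1) % size
        weight = lam if i == size - 1 else 1.0
        mat[i][i] += diag[i]
        mat[i][j] += off[i] * weight
        mat[j][i] += off[i] / weight
    return mat

def kron(A, B):
    """Tensor product; rows are indexed by (i, k), columns by (j, l)."""
    n = len(A)
    return [[A[i][j] * B[k][l] for j in range(n) for l in range(n)]
            for i in range(n) for k in range(n)]

def matmul(X, Y):
    m = len(X)
    return [[sum(X[r][s] * Y[s][c] for s in range(m)) for c in range(m)]
            for r in range(m)]

def r_matrix(size, lam, lam_p):
    """Rational r_12(lam, lam') of eq. (6.6)."""
    r = [[0j] * size ** 2 for _ in range(size ** 2)]
    for i in range(size):
        for j in range(size):
            if i == j:
                coeff = 0.5 * (lam_p + lam) / (lam_p - lam)
            elif i < j:
                coeff = lam_p / (lam_p - lam)
            else:
                coeff = lam / (lam_p - lam)
            # E_ij (x) E_ji sits at row (i, j), column (j, i)
            r[i * size + j][j * size + i] += coeff
    return r

def bracket_L1_L2(p, q, lam, lam_p):
    """{L_1(lam), L_2(lam')} computed from {p_m, q_m} = 1/2."""
    size = len(p)
    a = [math.exp(q[i] - q[(i + 1) % size]) for i in range(size)]
    zero = [0.0] * size
    total = [[0j] * size ** 2 for _ in range(size ** 2)]
    for m in range(size):
        unit = [1.0 if i == m else 0.0 for i in range(size)]
        da = [0.0] * size
        da[m] += a[m]            # d a_m / d q_m
        da[m - 1] -= a[m - 1]    # d a_{m-1} / d q_m (cyclic index)
        dL_dp = periodic_matrix(unit, zero, lam)   # equals E_mm
        first = kron(dL_dp, periodic_matrix(zero, da, lam_p))
        second = kron(periodic_matrix(zero, da, lam), dL_dp)
        for r in range(size ** 2):
            for c in range(size ** 2):
                total[r][c] += 0.5 * (first[r][c] - second[r][c])
    return total

# Illustrative data for n + 1 = 3 sites.
p, q = [0.4, -0.9, 0.5], [0.3, -0.2, -0.1]
lam, lam_p = 0.6 + 0.2j, -1.3 + 0.5j
size = len(p)
a = [math.exp(q[i] - q[(i + 1) % size]) for i in range(size)]
identity = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
L1 = kron(periodic_matrix(p, a, lam), identity)
L2 = kron(identity, periodic_matrix(p, a, lam_p))
total_L = [[L1[r][c] + L2[r][c] for c in range(size ** 2)]
           for r in range(size ** 2)]
r12 = r_matrix(size, lam, lam_p)
left, right = matmul(r12, total_L), matmul(total_L, r12)
lhs = bracket_L1_L2(p, q, lam, lam_p)
bracket_residual = max(abs(lhs[r][c] - (left[r][c] - right[r][c]))
                       for r in range(size ** 2) for c in range(size ** 2))
```

A vanishing `bracket_residual` (up to rounding) is consistent with the bracket being given by the commutator with the rational extension of eq. (6.6).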
Using the dualization Tr Res, we define from $r^\pm_{12}$ two maps $R^\pm$. Let us define $M_\pm = -2R^\pm(L)$:
$$M_+ = -\sum_i p_iH_i - 2\sum_{i=1}^{n+1}a_iE_{\alpha_i},\qquad M_- = \sum_i p_iH_i + 2\sum_{i=1}^{n+1}a_iE_{-\alpha_i}$$
We have from eqs. (6.4, 6.5):
$$M(\lambda) = \frac12\left(M_+(\lambda) + M_-(\lambda)\right),\qquad L(\lambda) = \frac12\left(-M_+(\lambda) + M_-(\lambda)\right)$$
so that the Lax equation can be written:
$$\dot L(\lambda) = [M(\lambda), L(\lambda)] = [M_+(\lambda), L(\lambda)] = [M_-(\lambda), L(\lambda)]$$

6.2 The spectral curve
The spectral curve Γ is the smooth algebraic curve defined by:
$$\Gamma:\quad \det(L(\lambda) - \mu) = 0 \qquad (6.7)$$
For a fixed λ, this is the equation for the eigenvalues of L(λ). By expanding the determinant we see that it is of the form:
$$\Gamma(\lambda,\mu) \equiv (\lambda + \lambda^{-1}) - 2t(\mu) = 0 \qquad (6.8)$$
where $2t(\mu) = \mu^{n+1} - \big(\sum_{i=1}^{n+1}p_i\big)\mu^n + \cdots$ is a polynomial of degree (n + 1). The spectral curve is a hyperelliptic curve since it can be written as
$$s^2 = t^2(\mu) - 1,\qquad\text{with } s = \lambda - t(\mu) \qquad (6.9)$$
Let us compute the genus of the curve Γ. The polynomial $t^2(\mu)$ is of degree 2n + 2 and generically the equation $t^2(\mu) - 1 = 0$ has no double roots, so the genus of the curve Γ is g = n. This is equal to the number of degrees of freedom, i.e. half the dimension of phase-space. In the following, we shall always assume that we are in this generic situation. Notice that the hyperelliptic curve eq. (6.9) is very special because the polynomial of degree 2n + 2 in the right-hand side is expressed in terms of a polynomial of degree n + 1. This also shows that the number of action variables is precisely g = n when $\sum_{i=1}^{n+1}p_i = 0$.
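The form (6.8) of the spectral curve can be checked numerically. In the sketch below (plain Python, with hypothetical helper names and arbitrary sample data, none of which come from the original text) the overall sign is fixed by the assumption $\det(L(\lambda) - \mu) = (-1)^n[(\lambda + \lambda^{-1}) - 2t(\mu)]$, which follows from expanding the determinant along the corner entries with $\prod_i a_i = 1$. The code extracts 2t(µ) at two different values of λ, and then solves eq. (6.9) for a point (λ, µ) on the curve.

```python
import cmath
import math

def toda_lax_matrix(p, q, lam):
    """L(lam) of eq. (6.4) as a nested list (0-based indices)."""
    size = len(p)
    mat = [[0j] * size for _ in range(size)]
    for i in range(size):
        j = (i + 1) % size
        a_i = math.exp(q[i] - q[j])
        weight = lam if i == size - 1 else 1.0
        mat[i][i] += p[i]
        mat[i][j] += a_i * weight
        mat[j][i] += a_i / weight
    return mat

def determinant(mat):
    """Determinant by Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in mat]
    size = len(rows)
    det = 1.0 + 0j
    for c in range(size):
        pivot = max(range(c, size), key=lambda r: abs(rows[r][c]))
        if abs(rows[pivot][c]) == 0:
            return 0j
        if pivot != c:
            rows[c], rows[pivot] = rows[pivot], rows[c]
            det = -det
        det *= rows[c][c]
        for r in range(c + 1, size):
            factor = rows[r][c] / rows[c][c]
            for k in range(c, size):
                rows[r][k] -= factor * rows[c][k]
    return det

def char_det(p, q, lam, mu):
    mat = toda_lax_matrix(p, q, lam)
    for i in range(len(p)):
        mat[i][i] -= mu
    return determinant(mat)

def two_t(p, q, mu, lam):
    """2 t(mu) read off from det(L(lam) - mu); lam should drop out."""
    n = len(p) - 1
    return (lam + 1 / lam) - (-1) ** n * char_det(p, q, lam, mu)

# Illustrative data for n + 1 = 4 sites.
p = [0.3, -1.1, 0.5, 0.3]
q = [0.2, -0.4, 0.1, 0.1]
mu = 0.3 + 0.2j

value_a = two_t(p, q, mu, 0.7 + 0.4j)
value_b = two_t(p, q, mu, 2.5 - 1.0j)

# A point on the curve: s^2 = t^2 - 1 with lam = t + s, eq. (6.9).
t = value_a / 2
lam_on_curve = t + cmath.sqrt(t * t - 1)
on_curve_residual = abs(char_det(p, q, lam_on_curve, mu))
```

The other root of the quadratic, $\lambda = t - s = 1/\lambda_{\text{on curve}}$, gives the second point of the curve above the same µ.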
Let us see how we can recover this result from the general analysis in Chapter 5, when looking at the curve as an (n + 1)-sheeted cover of the λ plane. We recall the Riemann–Hurwitz formula for computing the genus: 2g − 2 = (2g0 − 2)(n + 1) + ν, where g0 = 0 and ν is the number of branch points. To find ν, we count the zeroes of $\partial_\mu\Gamma(\lambda,\mu) = -2\,\partial_\mu t(\mu)$. This is a polynomial of degree n in µ, independent of λ. Hence we have n zeroes at finite distance. To each value of µ correspond two values of λ, and the total contribution of the branch points at finite distance is 2n. We now look at µ = ∞ which corresponds to λ = ∞ or λ = 0. For λ = ∞, we have $\lambda \simeq \mu^{n+1}$, which means that we have a branch point of order n + 1, contributing n to ν. Similarly, λ = 0 is a branch point of order n + 1, contributing also n to ν. Adding everything, we find ν = 4n and g = n. We will call $P^+$ and $P^-$ the two points above λ = ∞ and λ = 0 respectively. In the neighbourhood of $P^\pm$ the local parameter is $\mu^{-1}$ and we have by direct expansion of eq. (6.8):
$$P^+:\quad \lambda = \mu^{n+1}\Big(1 - \mu^{-1}\big(\sum_{j=1}^{n+1}p_j\big) + O(\mu^{-2})\Big) \qquad (6.10)$$
$$P^-:\quad \lambda = \mu^{-n-1}\Big(1 + \mu^{-1}\big(\sum_{j=1}^{n+1}p_j\big) + O(\mu^{-2})\Big) \qquad (6.11)$$
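The Riemann–Hurwitz count above is simple arithmetic, and can be spelled out as follows. This is only an illustrative sketch: the function names are hypothetical, and the list of ramification orders encodes the counting described in the text (2n simple branch points at finite distance, plus $P^+$ and $P^-$ of order n + 1).

```python
def genus_from_riemann_hurwitz(sheets, base_genus, ramification_orders):
    """Genus g from 2g - 2 = sheets*(2*g0 - 2) + nu, where a branch
    point of order e contributes e - 1 to nu."""
    nu = sum(order - 1 for order in ramification_orders)
    euler = sheets * (2 * base_genus - 2) + nu
    if euler % 2:
        raise ValueError("inconsistent ramification data")
    return euler // 2 + 1

def closed_toda_genus(n):
    # Two values of lambda for each of the n zeroes of t'(mu) give 2n
    # simple branch points; P+ and P- are branch points of order n + 1.
    orders = [2] * (2 * n) + [n + 1, n + 1]
    return genus_from_riemann_hurwitz(n + 1, 0, orders)

genera = [closed_toda_genus(n) for n in range(1, 7)]
```

For each n the result agrees with the hyperelliptic count g = n obtained from eq. (6.9).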
6.3 The eigenvectors
Equation (6.7) is the condition for µ to be an eigenvalue of L(λ). Therefore, with each point P on Γ one can associate an (n + 1)-dimensional eigenvector Ψ(P):
$$(L(\lambda) - \mu)\Psi(P) = 0,\qquad P = (\lambda,\mu)\in\Gamma$$
Writing this equation explicitly for $\Psi(P) = (\psi_1,\ldots,\psi_{n+1})^t$, we find the following system of linear equations:
$$\begin{aligned} p_1\psi_1 + a_1\psi_2 + \lambda^{-1}a_{n+1}\psi_{n+1} &= \mu\psi_1\\ a_{i-1}\psi_{i-1} + p_i\psi_i + a_i\psi_{i+1} &= \mu\psi_i\\ \lambda a_{n+1}\psi_1 + a_n\psi_n + p_{n+1}\psi_{n+1} &= \mu\psi_{n+1}\end{aligned} \qquad (6.12)$$
We extend the definition of the coefficients $a_i$, $p_i$ by periodicity, $a_{i+n+1} = a_i$, $p_{i+n+1} = p_i$, and introduce a second order difference operator D:
$$(D\Psi)_i \equiv a_{i-1}\psi_{i-1} + p_i\psi_i + a_i\psi_{i+1}$$
This operator is a discrete version of a Schroedinger operator with periodic potential. Equations (6.12) are then equivalent to:
$$(D\Psi)_i = \mu\psi_i,\qquad\text{with } \psi_{i+n+1} = \lambda\psi_i \qquad (6.13)$$
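The quasi-periodic problem (6.13) can be illustrated by solving the three-term recursion directly. The sketch below is an assumption-laden illustration rather than the construction used in the text: it propagates two independent solutions over one period (a 2 × 2 transfer matrix, in the spirit of the second method announced in the chapter introduction), picks a multiplier λ from its characteristic equation, and normalizes $\psi_0 = 1$. All names and sample values are hypothetical.

```python
import cmath
import math

def bond_variables(q):
    size = len(q)
    return [math.exp(q[i] - q[(i + 1) % size]) for i in range(size)]

def propagate(p, a, mu, psi0, psi1):
    """Solve a_{i-1} psi_{i-1} + p_i psi_i + a_i psi_{i+1} = mu psi_i for
    i = 1..n+1 and return [psi_0, ..., psi_{n+2}].

    The 1-based a_i, p_i are stored at a[i-1], p[i-1]; a_0 = a_{n+1}.
    """
    psi = [psi0, psi1]
    for i in range(1, len(p) + 1):
        nxt = ((mu - p[i - 1]) * psi[i] - a[i - 2] * psi[i - 1]) / a[i - 1]
        psi.append(nxt)
    return psi

def bloch_wave(p, q, mu):
    """Return (lam, psi, det_T) with psi_0 = 1 and psi_{i+n+1} = lam psi_i."""
    size = len(p)
    a = bond_variables(q)
    u = propagate(p, a, mu, 1.0, 0.0)   # (psi_0, psi_1) = (1, 0)
    v = propagate(p, a, mu, 0.0, 1.0)   # (psi_0, psi_1) = (0, 1)
    # (psi_{n+2}, psi_{n+1}) = T (psi_1, psi_0) with
    # T = [[v_{n+2}, u_{n+2}], [v_{n+1}, u_{n+1}]].
    trace = v[size + 1] + u[size]
    det_T = v[size + 1] * u[size] - u[size + 1] * v[size]
    lam = (trace + cmath.sqrt(trace * trace - 4)) / 2
    psi1 = (lam - u[size]) / v[size]    # fixes psi_{n+1} = lam
    psi = [u[i] + psi1 * v[i] for i in range(size + 2)]
    return lam, psi, det_T

# Illustrative data for n + 1 = 4 sites; mu is taken off the real axis.
p = [0.3, -1.1, 0.5, 0.3]
q = [0.2, -0.4, 0.1, 0.1]
mu = 0.3 + 0.2j
lam, psi, det_T = bloch_wave(p, q, mu)
size = len(p)
```

Because the transfer matrix has unit determinant, the two multipliers are λ and 1/λ, in line with the two points of Γ above a given µ.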
Thus, Ψ is a Bloch wave for the difference operator D with a Bloch momentum λ. We choose the normalization condition: $\psi_0(P) = 1$ for all P ∈ Γ or alternatively $\psi_{n+1}(P) = \lambda$. This is slightly different from the convention of Chapter 3 where we normalized $\psi_1(P) = 1$ but it will prove to be more convenient in the following. We need to know the analyticity properties of Ψ at infinity.
Proposition. Let us normalize $\psi_0(P) = 1$. Then, at the points $P^+$ and $P^-$ above λ = ∞ and λ = 0, the eigenvector Ψ(P) behaves as:
$$\psi_i(P) = e^{q_i - q_0}\mu^{i}\Big(1 - \mu^{-1}\big(\sum_{j=0}^{i-1}p_j\big) + O(\mu^{-2})\Big),\qquad P\sim P^+ \qquad (6.14)$$
$$\psi_i(P) = e^{-q_i + q_0}\mu^{-i}\Big(1 + \mu^{-1}\big(\sum_{j=1}^{i}p_j\big) + O(\mu^{-2})\Big),\qquad P\sim P^- \qquad (6.15)$$
where the $q_i$ are the Toda position parameters ($q_0 = q_{n+1}$, $p_0 = p_{n+1}$).
Proof. It is easy to check that this is consistent with eq. (6.12). The result then follows by the uniqueness of the eigenvector.
From the general theory, see eq. (5.17) in Chapter 5, we expect g + (n + 1) − 1 = 2n poles for the eigenvector. From eq. (6.14), we see that we have a fixed pole of order n at $P^+$. There remain g = n poles at finite distance. These are the dynamical poles. We shall denote by $\gamma_1,\ldots,\gamma_n$ their positions. Note that their number equals the number of degrees of freedom. Recall from Chapter 5 that these dynamical poles contain all the relevant information to reconstruct the eigenvector. We now consider the time evolution of the eigenvector. The Lax equation implies
$$\frac{d}{dt}\Psi(t,P) = \left(M(\lambda) - C(t,P)\,1\right)\Psi(t,P)$$
with C(t, P) some scalar function, determined by imposing the condition $\psi_0(t,P) = 1$ or equivalently $\psi_{n+1}(t,P) = \lambda$, which yields $C(t,P) = \sum_j M_{n+1,j}\psi_j$. Alternatively, one can use the natural time evolution
$$\frac{d}{dt}\Psi(t,P) = M(\lambda)\Psi(t,P) \qquad (6.16)$$
but then, at t ≠ 0 the eigenvector Ψ(t, P) is a Baker–Akhiezer function. It has n poles at finite distance, independent of time, and essential singularities at the points $P^\pm$ given by:
Proposition. Let Ψ(t, P) be the eigenvector evolving according to the natural eq. (6.16). Then
$$\psi_i(t,\mu) = e^{q_i(t)}e^{-\mu t}\mu^{i}\left(1 + O(\mu^{-1})\right),\qquad P\to P^+ \qquad (6.17)$$
$$\psi_i(t,\mu) = e^{-q_i(t)}e^{\mu t}\mu^{-i}\left(1 + O(\mu^{-1})\right),\qquad P\to P^- \qquad (6.18)$$
Proof. The time evolution of the eigenvector $(L\Psi)_i = \mu\psi_i$ is:
$$\frac{d}{dt}\psi_i = (M\Psi)_i = ((L + M_+)\Psi)_i = ((-L + M_-)\Psi)_i$$
Let us see what happens near $P^+$. Since Ψ(t, P) is also a solution of (L(λ) − µ)Ψ(t, P) = 0, the results of eqs. (6.14, 6.15) still apply with $q_i(t)$ now depending on t. Hence we can write
$$\psi_i(t,P) = f(t,P)\,e^{q_i(t)-q_0(t)}\mu^{i}\left(1 + O(\mu^{-1})\right)$$
There is an extra multiplicative factor f(t, P), independent of n, because we relax the condition $\psi_0 = 1$ for t ≠ 0. Writing
$$\dot\psi_{n+1} = \dot f\lambda = -((L - M_-)\Psi)_{n+1} = -\mu\lambda f + p_{n+1}\lambda f + 2a_n\psi_n$$
we get $\dot f = (-\mu + p_{n+1} + O(\mu^{-1}))f$, hence $f = \exp(-\mu t + q_0(t))(1 + O(\mu^{-1}))$, where we used $p_{n+1} = \dot q_0$. The analysis near $P^-$ is similar, using $M = M_+ + L$.

6.4 Reconstruction formula
Starting from a curve of the particular form eq. (6.8), we reconstruct the eigenvector Ψ(P) from its analyticity properties. From this the whole closed Toda chain model and its solution is obtained. Consider first what happens at time t = 0. Let the algebro-geometrical data be specified by the curve eq. (6.8), with a divisor of g = n points on it, $D = \gamma_1 + \cdots + \gamma_g$, and two punctures which are the points $P^\pm$ at infinity. From the previous section, the divisor of the component $\psi_i(P)$ of Ψ(P) is:
$$(\psi_i) = -D + i\,P^- - i\,P^+ \qquad (6.19)$$
Applying the Riemann–Roch theorem with deg $(\psi_i)$ = g, we see that there exists a unique meromorphic function, up to a proportionality constant, having this divisor. This function is not constant, for i > 0, since it vanishes at $P^-$. We can fix the proportionality constant (up to a sign) by requiring that the coefficients of $\mu^{\mp i}$ at $P^\pm$ are inverse to each other. Denote these coefficients by $e^{\pm(q_i - q_0)}$. Note that this eliminates the residual gauge invariance by diagonal matrices which is present in the general theory. The function $\psi_i(P)$ will have the form for all i ≥ 0:
$$\psi_i(\mu) = e^{q_i - q_0}\mu^{i}\left(1 - \mu^{-1}\xi_i^+ + O(\mu^{-2})\right),\qquad P\to P^+ \qquad (6.20)$$
$$\psi_i(\mu) = e^{-q_i + q_0}\mu^{-i}\left(1 - \mu^{-1}\xi_i^- + O(\mu^{-2})\right),\qquad P\to P^- \qquad (6.21)$$
Here $\xi_i^\pm$ are just Taylor coefficients. Of course, since $\psi_i(P)$ is a meromorphic function, it also possesses g extra zeroes. We now show that these properties imply that the functions $\psi_i(P)$ constructed with these analyticity requirements are solutions of eq. (6.13):
Proposition. Let $\psi_i(P)$ be the unique meromorphic function having simple poles at the g points $\gamma_i$, a pole of order i at $P^+$, and a zero of order i at $P^-$, and normalized as in eqs. (6.20, 6.21). Then,
(i) The $\psi_i(P)$ satisfy the Schroedinger equations (D − µ)Ψ = 0, eq. (6.13), with coefficients $a_i$, $p_i$ given by:
$$a_i = \exp(q_i - q_{i+1}),\qquad p_i = \xi_{i+1}^+ - \xi_i^+ \qquad (6.22)$$
(ii) The functions $\psi_i(P)$ are quasi-periodic:
$$\psi_{n+1+i}(P) = \lambda(P)\,\psi_i(P) \qquad (6.23)$$
Therefore, the $\psi_i(P)$ are the components of the eigenvector of a Lax matrix L(λ) of the form eq. (6.4).
Proof. Let us consider the function $\tilde\psi_i(P) = ((D - \mu)\Psi)_i(P)$. Since the coefficients of (D − µ) are regular outside $P^\pm$, $\tilde\psi_i$ possesses g poles at the $\gamma_i$. Consider now the behaviour at $P^\pm$. We have:
$$((D-\mu)\Psi)_i = \mu^{i+1}\left(a_ie^{q_{i+1}-q_0} - e^{q_i-q_0}\right) - \mu^{i}\left(a_ie^{q_{i+1}-q_0}\xi_{i+1}^+ - e^{q_i-q_0}(p_i + \xi_i^+)\right) + O(\mu^{i-1}),\qquad P\to P^+$$
$$((D-\mu)\Psi)_i = \mu^{-i+1}\left(a_{i-1}e^{q_0-q_{i-1}} - e^{q_0-q_i}\right) + O(\mu^{-i}),\qquad P\to P^-$$
If we choose the coefficients $a_i$ and $p_i$ as in eqs. (6.22), the function $\tilde\psi_i = ((D - \mu)\Psi)_i$ has a pole of order (i − 1) at $P^+$, and a zero of order i at $P^-$. Therefore the divisor of $\tilde\psi_i(P)$ is greater than $-D' = -D + i\,P^- - (i-1)P^+$ and we have deg D' = g − 1. Thus, by the Riemann–Roch theorem, (D − µ)Ψ = 0. Notice that the coefficients $a_i$ and $p_i$ are chosen in order to decrease the degree of this divisor by one unit. Let us now prove the periodicity relation $\psi_{n+1+i}(P) = \lambda\psi_i(P)$. We apply the Riemann–Roch theorem to the functions in each member of this equation. The divisor of the function λ is $(\lambda) = (n+1)P^- - (n+1)P^+$. The poles at finite distance of $\psi_{n+1+i}$ and $\psi_i$ are located at the same positions $\gamma_1,\ldots,\gamma_g$. So we have:
$$(\psi_{n+1+i}) \ge -D + (n+1+i)P^- - (n+1+i)P^+$$
$$(\lambda\psi_i) \ge -D + i\,P^- - i\,P^+ + (n+1)P^- - (n+1)P^+$$
Notice that these divisors are both greater than the degree g divisor $-D + (n+1+i)P^- - (n+1+i)P^+$. By the Riemann–Roch theorem, there is only one function satisfying this property and the two functions are therefore proportional. The proportionality constant is determined by comparing the behaviour at $P^+$ or $P^-$, and is found to be equal to one.
Let us now consider what happens at t ≠ 0. We define, for all integer i ≥ 0, the function $\psi_i(t,P)$ as the Baker–Akhiezer function with g poles at the same positions $\gamma_1,\ldots,\gamma_n$ as above, and with essential singularities at $P^\pm$ given by:
$$\psi_i(t,\mu) = e^{q_i(t)}e^{-\mu t}\mu^{i}\left(1 - \mu^{-1}\xi_i^+(t) + \cdots\right),\qquad P\to P^+ \qquad (6.24)$$
$$\psi_i(t,\mu) = e^{-q_i(t)}e^{\mu t}\mu^{-i}\left(1 - \mu^{-1}\xi_i^-(t) + \cdots\right),\qquad P\to P^- \qquad (6.25)$$
with the normalization parameters $q_i(t)$ obtained by requiring that the expansions at $P^+$ and $P^-$ start with inverse coefficients. Then the Taylor coefficients $\xi_i^\pm(t)$ are fixed. By the Riemann–Roch theorem this function exists and is unique. Moreover, we have $\psi_{n+1+i}(t,P) = \lambda\psi_i(t,P)$ by the same method as above. Taking into account the expansions eqs. (6.10, 6.11) of λ and the expansions eqs. (6.24, 6.25) of $\psi_i(t,P)$ near $P^\pm$, it follows that the Taylor coefficients $\xi_i^\pm(t)$ are periodic, i.e. $\xi_{i+n+1}^\pm(t) = \xi_i^\pm(t)$, when $\sum_i p_i = 0$ (centre of mass system).
Proposition. The Baker–Akhiezer functions defined above satisfy the eigenvalue equation and the evolution equation:
$$L(\lambda)\Psi(t,P) = \mu\,\Psi(t,P),\qquad \frac{d}{dt}\Psi(t,P) = M(\lambda)\Psi(t,P)$$
with M(λ) defined in eq. (6.3) with $a_i(t) = \exp(q_i(t) - q_{i+1}(t))$.
Proof. The proof of the eigenvalue equation and the quasi-periodicity property (6.23) is the same as for the initial eigenvector at t = 0. To prove the evolution equation let us consider for i = 1, ..., n the expression:
$$E_i \equiv \frac{d}{dt}\psi_i + (a_i\psi_{i+1} - a_{i-1}\psi_{i-1})$$
Since the poles of the $\psi_i$ at finite distance are independent of time, and the $a_i$ are constant on the Riemann surface, $E_i$ has the same g poles at finite distance as $\psi_i$. Its behaviour at infinity is easily obtained as:
$$\begin{aligned} E_i &= -e^{-\mu t}e^{q_i}\mu^{i}\left(\xi_{i+1}^+ - \xi_i^+ - \dot q_i\right) + O(\mu^{i-1}), & P\to P^+\\ E_i &= e^{\mu t}e^{-q_i}\mu^{-i}\left(\xi_{i-1}^- - \xi_i^- - \dot q_i\right) + O(\mu^{-i-1}), & P\to P^-\end{aligned} \qquad (6.26)$$
By the Riemann–Roch theorem $E_i$ is proportional to $\psi_i$, and we write $E_i = d_i(t)\psi_i$. Using the quasi-periodicity property eq. (6.23) we can restate this result as:
$$\dot\Psi = M\Psi + d\,\Psi \qquad (6.27)$$
where M is the matrix given in eq. (6.5) and d is the time-dependent diagonal matrix d = Diag$(d_1,\ldots,d_{n+1})$, which is constant on the Riemann surface. Differentiating the relation (L(λ) − µ)Ψ = 0 with respect to time we get, using eq. (6.27), $(\dot L - [M, L] - [d, L])\Psi = 0$. For any value of λ which is not a branch point of the covering (λ, µ) → λ, we have n + 1 independent (see Chapter 5) vectors $\Psi(\lambda,\mu_k)$ for which this equation is true, hence we get the matrix equation $\dot L(\lambda) - [M(\lambda), L(\lambda)] = [d, L(\lambda)]$, which remains true for all values of λ by analytic continuation. In particular it is true when λ → 1/λ. Note that ${}^tL(\lambda) = L(1/\lambda)$, ${}^tM(\lambda) = -M(1/\lambda)$, so taking the transpose of the above equation evaluated at 1/λ and comparing it with the original equation we get [d, L(λ)] = 0. This implies that d is proportional to the identity matrix, d = δ(t)I. Then, comparing eq. (6.26) with $E_i = \delta(t)\psi_i$, we find $\dot q_i = \xi_{i+1}^+ - \xi_i^+ + \delta(t)$. In the centre of mass system, $\sum_{i=0}^{n}q_i = 0$, so that δ(t) = 0 by the periodicity of $\xi_i^+$.
At this point we have completely reconstructed the closed Toda chain, starting from an appropriate spectral curve. Moreover, the procedure also provides the solution of the equations of motion. To get explicit expressions for $q_i(t)$ we need explicit formulae for the Baker–Akhiezer functions. This is done using Riemann’s theta-functions. Let us fix a canonical set of cycles $(a_i, b_j)$ on Γ, a base point $Q_0$ on Γ and a set of holomorphic Abelian differentials $\omega_j$, dual to the a-cycles
(see Chapter 15). Let $\Omega^{(i)}$ be the meromorphic differential analytic on Γ outside the points $P^\pm$, and obeying the following normalization conditions:
$$\oint_{a_k}\Omega^{(i)} = 0,\qquad \Omega^{(i)}(P) = \pm\left(\mu^{i-1} + O(\mu^{-2})\right)d\mu,\qquad P\to P^\pm \qquad (6.28)$$
The notation assumes that some multivalued primitives $\int_{Q_0}^{P}\Omega^{(i)}$ have been chosen for these differentials. Since the local parameter around $P^\pm$ is 1/µ, $\Omega^{(i)}$ has poles of order (i + 1) at the points $P^\pm$. Note that $\Omega^{(i)}$, i ≥ 1, are Abelian differentials of the second kind, whereas $\Omega^{(0)}$ is a normalized Abelian differential of the third kind. Let us also define, for each i, the g-dimensional vectors $U^{(i)}$ whose components are the b-periods of the forms $\Omega^{(i)}$:
$$U_k^{(i)} = \frac{1}{2\pi i}\oint_{b_k}\Omega^{(i)} \qquad (6.29)$$
Given these data, the Baker–Akhiezer functions defined in eqs. (6.24, 6.25) are expressed as follows:
Proposition. Let A(P) be the Abel map, $A_k(P) = \int_{Q_0}^{P}\omega_k$. Then the Baker–Akhiezer function $\psi_i(t,P)$ has the following expression:
$$\psi_i(P) = r_i(t)\,e^{\,i\int_{Q_0}^{P}\Omega^{(0)} - t\int_{Q_0}^{P}\Omega^{(1)}}\;\frac{\theta(A(P) + i\,U^{(0)} - tU^{(1)} - \zeta_0)}{\theta(A(P) - \zeta_0)} \qquad (6.30)$$
where $r_i(t)$ is independent of P on Γ and
$$\zeta_0 = \sum_{i=1}^{g}A(\gamma_i) + K \qquad (6.31)$$
with K the vector of Riemann constants.
Proof. First, one checks that this function is well-defined on Γ, i.e. it does not depend on the path of integration between $Q_0$ and P. This is done using the formulae of Chapter 15 on theta functions. Then one checks that it has the right poles at finite distance. They are given by the zeroes of the theta-function $\theta(A(P) - \zeta_0)$. By Riemann’s theorem, they are located at the points $\gamma_1,\ldots,\gamma_g$ because we chose the vector $\zeta_0$ according to eq. (6.31). Finally, we check that it has the right behaviour in the neighbourhood of the points at infinity $P^\pm$. The theta functions are regular at infinity. Therefore, the behaviour at infinity is governed by the differentials $\Omega^{(0)}$ and $\Omega^{(1)}$. From eq. (6.28), we deduce that when $P\to P^\pm$:
$$\int_{Q_0}^{P}\Omega^{(0)} = \pm\log\mu + O(1),\qquad \int_{Q_0}^{P}\Omega^{(i)} = \pm\frac{\mu^{i}}{i} + O(1),\quad i\ge 1$$
These expressions hold modulo periods, but as we have seen, this does not affect the global expression. Therefore, when $P\to P^\pm$:
$$\exp\Big(i\int_{Q_0}^{P}\Omega^{(0)} - t\int_{Q_0}^{P}\Omega^{(1)}\Big) = \mathrm{const.}\;\mu^{\pm i}e^{\mp\mu t}\left(1 + O(\mu^{-1})\right)$$
This proves the result.
In eq. (6.30), the coefficient $r_i(t)$ is fixed up to a sign by the requirement that the leading terms in the expansions of $\psi_i(P)$ at $P^\pm$ are inverse to each other. In the following we only need to consider ratios of the values of $\psi_i(P)$ in the vicinity of the points $P^+$ and $P^-$, in which $r_i(t)$ cancels out. So we do not give the value of $r_i(t)$.
Proposition. Let $\tau_i(t)$ be the n tau-functions defined by:
$$\tau_i(t) = e^{\frac12\beta_0i^2}\,\theta(iU^{(0)} - tU^{(1)} - \zeta_0) \qquad (6.32)$$
where $\beta_0$ and $\zeta_0$ are constants. Then the solution of the equations of motion of the closed Toda chain is given by:
$$e^{2(q_i(t) - q_{i+1}(t))} = \frac{\tau_{i+1}(t)\,\tau_{i-1}(t)}{\tau_i^2(t)} \qquad (6.33)$$
Proof. From eqs. (6.24, 6.25), we have:
$$\frac{\psi_i(P\to P^+)}{\psi_i(P\to P^-)} = e^{2q_i}\mu^{2i}e^{-2\mu t}\left(1 + O(\mu^{-1})\right),\qquad \mu\to\infty$$
hence substituting the expression (6.30) of $\psi_i$ we obtain an expression of the form:
$$e^{2q_i} = e^{\beta_0i + \beta_1t}\times\frac{\theta(A(P^+) + i\,U^{(0)} - tU^{(1)} - \zeta_0)}{\theta(A(P^+) - \zeta_0)}\;\frac{\theta(A(P^-) - \zeta_0)}{\theta(A(P^-) + i\,U^{(0)} - tU^{(1)} - \zeta_0)}$$
The exponential prefactor comes from $\lim_{P\to P^\pm}\big(\int_{Q_0}^{P}\Omega^{(0)} \mp \log\mu\big) = \beta_0^\pm$ and $\lim_{P\to P^\pm}\big(\int_{Q_0}^{P}\Omega^{(1)} \mp \mu\big) = \beta_1^\pm$. Then $\beta_k = \beta_k^+ - \beta_k^-$ and the singular part cancels with $\mu^{2i}e^{-2\mu t}$. Taking the quotient of these expressions for i and i + 1 gives:
$$e^{2(q_i - q_{i+1})} = e^{-\beta_0}\times\frac{\theta(i\,U^{(0)} - tU^{(1)} - \zeta_0 + A(P^+))\;\theta((i+1)U^{(0)} - tU^{(1)} - \zeta_0 + A(P^-))}{\theta(i\,U^{(0)} - tU^{(1)} - \zeta_0 + A(P^-))\;\theta((i+1)U^{(0)} - tU^{(1)} - \zeta_0 + A(P^+))}$$
To end the proof of eq. (6.33), we show that:
$$A(P^-) - A(P^+) = U^{(0)}$$
This is a direct consequence of Riemann’s bilinear relations which, for a normalized Abelian differential of the third kind like $\Omega^{(0)}$, with residue −1 at $P^+$ and +1 at $P^-$, imply:
$$U_k^{(0)} \equiv \frac{1}{2i\pi}\oint_{b_k}\Omega^{(0)} = \int_{P^+}^{P^-}\omega_k = A_k(P^-) - A_k(P^+)$$
Then the result follows by redefining $\zeta_0 \to \zeta_0 - A(P^+) - U^{(0)}$ and inserting the definition of the tau-function.
The explicit formula of the Baker–Akhiezer function in terms of theta-functions also shows that the Abel map linearizes the dynamics. Indeed, consider the component $\psi_0(t,P)$. It has poles at fixed positions $\gamma_1,\ldots,\gamma_n$ with divisor $\zeta_0$, and zeroes at n other points $\gamma_1(t),\ldots,\gamma_n(t)$. At t = 0, $\gamma_i(t=0) = \gamma_i$ since $\psi_0(t=0)$ is equal to one. The points $\gamma_i(t)$ are the zeroes of the theta-function which is in the numerator of eq. (6.30) taken for i = 0. Thus, by Riemann’s theorem,
$$\sum_{i=1}^{n}\left(A(\gamma_i(t)) - A(\gamma_i)\right) = tU^{(1)} \qquad (6.34)$$
This flow is linear.
Remark. The formula (6.30) can be generalized by considering all flows associated with the other conserved Hamiltonians. If we denote by $t_p$ the time associated with these Hamiltonians, the generalized tau-functions are obtained by replacing $tU^{(1)} \to \sum_p t_pU^{(p)}$.
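To make formulas (6.32, 6.33) concrete, here is a sketch of how they could be evaluated once the period data are known. Everything specific in it is an assumption made for illustration: the theta-function convention $\theta(z) = \sum_{m\in\mathbb{Z}^g}\exp(i\pi\, m\cdot Bm + 2i\pi\, m\cdot z)$ (the convention of Chapter 15 may differ), the truncation of the lattice sum, the function names, and above all the numerical values of B, $U^{(0)}$, $U^{(1)}$, $\zeta_0$ and $\beta_0$, which are made up and do not come from an actual Toda spectral curve. Computing the true periods from a curve (6.8) is not attempted here.

```python
import cmath
import itertools
import math

def riemann_theta(z, B, cutoff=8):
    """Truncated sum over m in Z^g of exp(i*pi*m.B.m + 2*i*pi*m.z).

    B must be symmetric with positive definite imaginary part
    (assumed convention, see the lead-in).
    """
    g = len(z)
    total = 0j
    for m in itertools.product(range(-cutoff, cutoff + 1), repeat=g):
        quad = sum(m[j] * B[j][k] * m[k] for j in range(g) for k in range(g))
        lin = sum(m[j] * z[j] for j in range(g))
        total += cmath.exp(1j * math.pi * quad + 2j * math.pi * lin)
    return total

def tau(i, t, U0, U1, zeta0, B, beta0):
    """tau_i(t) of eq. (6.32)."""
    z = [i * U0[k] - t * U1[k] - zeta0[k] for k in range(len(U0))]
    return cmath.exp(0.5 * beta0 * i * i) * riemann_theta(z, B)

def exp_two_bond(i, t, U0, U1, zeta0, B, beta0):
    """Right-hand side of eq. (6.33), i.e. exp(2(q_i - q_{i+1}))."""
    args = (U0, U1, zeta0, B, beta0)
    return tau(i + 1, t, *args) * tau(i - 1, t, *args) / tau(i, t, *args) ** 2

# Hypothetical genus-2 data (n = 2), chosen only to exercise the code.
B = [[1.2j, 0.3j], [0.3j, 0.9j]]
U0 = [0.31, -0.17]
U1 = [0.05, 0.11]
zeta0 = [0.13 + 0.05j, -0.40 + 0.02j]
beta0 = 0.2

bond_value = exp_two_bond(1, 0.7, U0, U1, zeta0, B, beta0)
```

With genuine period data, reality of the positions $q_i(t)$ would further require the reality conditions mentioned in the chapter introduction; the made-up data above do not satisfy them, so `bond_value` is in general complex.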
6.5 Symplectic structure
We now want to prove that the coordinates $(\lambda_{\gamma_i}, \mu_{\gamma_i})$ of the points of the dynamical divisor form a set of separated canonical coordinates. Recall that the Poisson bracket is the standard canonical one:
$$\{q_i, q_j\} = \{p_i, p_j\} = 0,\qquad \{p_i, q_j\} = \frac12\delta_{ij} \qquad (6.35)$$
Proposition. Let Γ be the spectral curve eq. (6.7). Let $(\lambda_{\gamma_i}, \mu_{\gamma_i})$, i = 1, ..., g, be the g points of the dynamical divisor D. Then
$$\omega = 2\sum_i\delta q_i\wedge\delta p_i = \sum_i\frac{\delta\lambda_{\gamma_i}}{\lambda_{\gamma_i}}\wedge\delta\mu_{\gamma_i} \qquad (6.36)$$
Proof. According to the general procedure explained in Chapter 5, we start from the 2-form K on phase space with values in the 1-forms on Γ defined by:
$$K = K_1 + K_2 + K_3 \qquad (6.37)$$
$$K_1 = \langle\Psi^{(-1)}(P)\,\delta L(\lambda)\wedge\delta\Psi(P)\rangle\frac{d\lambda}{\lambda},\qquad K_2 = \langle\Psi^{(-1)}(P)\,\delta\mu\wedge\delta\Psi(P)\rangle\frac{d\lambda}{\lambda},\qquad K_3 = \delta\left(\log\partial_\mu\Gamma(\lambda,\mu)\right)\wedge\delta\mu\,\frac{d\lambda}{\lambda}$$
Here Γ(λ, µ) = 0 is the equation of the spectral curve Γ. We have included the form $K_3$ in the definition of K, although its role is auxiliary. More importantly, notice the factor 1/λ as compared to the analysis of Chapter 5. This is necessary to get the right Poisson bracket on the variables $q_i$, $p_i$. We write that the sum of the residues of the form K seen as a 1-form on Γ vanishes. The poles of K are located at three different places, first the dynamical poles of Ψ, then the poles at $P^\pm$ coming from Ψ, L and dλ/λ, and finally the poles at the branch points of the covering coming from the poles of $\Psi^{(-1)}$ and from the fact that the δ differential is taken at fixed λ. The evaluation of the residues at the dynamical poles and the branch points is exactly as in Chapter 5, yielding the sum $2\sum_k\delta\mu_{\gamma_k}\wedge\delta\lambda_{\gamma_k}/\lambda_{\gamma_k}$. So we are left with the computation of the residues at $P^\pm$. The residue at the point $P^+$, for instance, is obtained by integrating over a small contour enclosing it. But the Riemann surface seen as a branched cover of the λ sphere has a branch point of order n + 1 at $P^+$.
So, a closed contour around $P^+$ runs on the $n + 1$ sheets of the covering before returning to its starting point. Therefore, we can write
$$\mathrm{Res}_{P^+} K = \frac{1}{2i\pi}\oint K(P) = \frac{1}{2i\pi}\oint \sum_i K(P_i)$$
where in the last expression, the points $P_i$ are the $n + 1$ points of the contour over the point on the base with coordinate $\lambda$. This sum is independent of the order of the sheets and is therefore a 1-form on the base. The integral is also taken on the base. The sum $\sum_i K_1(P_i)$ can be written as a trace since the vectors $\Psi(P_i)$, with $P_i$ the $(n + 1)$ points over $\lambda$, form a basis of eigenvectors of $L(\lambda)$. As in Chapter 5, it is convenient to introduce the $(n + 1) \times (n + 1)$ matrix $\widehat\Psi(\lambda)$ whose columns are the $(n + 1)$ vectors $\Psi(P_i)$. We thus have:
$$\mathrm{Res}_{P^+}(K_1) = -\frac{1}{2i\pi}\oint \mathrm{Tr}\left(\delta\widehat\Psi(\lambda)\,\widehat\Psi^{-1}(\lambda) \wedge \delta L(\lambda)\right)\frac{d\lambda}{\lambda}$$
The matrix $\delta\widehat\Psi(\lambda)\,\widehat\Psi^{-1}(\lambda)$ does not depend on the order of the sheets and is therefore a meromorphic function of $\lambda$ which we now calculate.

Lemma.
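$$\left(\delta\widehat\Psi\,\widehat\Psi^{-1}\right)_{ij}(\lambda) = -(\delta q_i - \delta q_0)\,\delta_{ij} + \delta\eta_i\left(\frac{1}{a_i}\,\delta_{i,j-1} + \frac{\lambda}{a_{n+1}}\,\delta_{i,n+1}\delta_{j,1}\right) + O(\lambda^2), \qquad P \to P^- \tag{6.38}$$
$$\left(\delta\widehat\Psi\,\widehat\Psi^{-1}\right)_{ij}(\lambda) = (\delta q_i - \delta q_0)\,\delta_{ij} - \delta\xi_i\left(\frac{1}{a_{i-1}}\,\delta_{i,j+1} + \frac{1}{\lambda a_{n+1}}\,\delta_{i,1}\delta_{j,n+1}\right) + O(\lambda^{-2}), \qquad P \to P^+ \tag{6.39}$$
where $\eta_i = \sum_{j=1}^{i} p_j$ and $\xi_i = \sum_{j=0}^{i-1} p_j$.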
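Proof. Let us first consider the behaviour near $P^-$. From eq. (6.15), we have
$$\widehat\Psi_{ij} = \Psi_i(P_j) = e^{-q_i + q_0}\,\mu_j^{-i}\left(1 + \eta_i\,\mu_j^{-1} + O(\mu_j^{-2})\right)$$
$$\widehat\Psi^{-1}_{ij} = \frac{1}{n+1}\,e^{q_j - q_0}\,\mu_i^{\,j}\left(1 + O(\mu_i^{-1})\right)$$
where $\mu_j^{-1}$ is the local coordinate of the point $P_j$ above $\lambda$, i.e. $\mu_j = \lambda^{-\frac{1}{n+1}}\alpha_j + \cdots$ with $\alpha_j$ an $(n + 1)$-th root of unity.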
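Taking variations, we limit ourselves to $O(\mu^{-2})$ so that we can set $\delta\mu = 0$ (using eq. (6.11) where $\sum_j p_j = 0$), and obtain: $\delta\widehat\Psi_{ij} = -(\delta q_i - \delta q_0)\,\widehat\Psi_{ij} + \delta\eta_i\,\widehat\Psi_{ij}\,\mu_j^{-1} + O(\mu_j^{-2})$. Hence
$$\left(\delta\widehat\Psi\,\widehat\Psi^{-1}\right)_{ij}(\lambda) = -(\delta q_i - \delta q_0)\,\delta_{ij} + \delta\eta_i \sum_k \widehat\Psi_{ik}\,\mu_k^{-1}\,\widehat\Psi^{-1}_{kj}$$
Note that this shows that $\widehat\Psi^{-1}$ needs to be computed to leading order only. The last term is equal to:
$$\sum_k \widehat\Psi_{ik}\,\mu_k^{-1}\,\widehat\Psi^{-1}_{kj} = \frac{e^{q_j - q_i}}{n+1}\sum_k \mu_k^{-i+j-1} = \frac{e^{q_j - q_i}}{n+1}\,\lambda^{\frac{i-j+1}{n+1}}\sum_k \alpha_k^{-i+j-1}$$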
The last sum is over the roots of unity. It vanishes unless $i - j + 1 \equiv 0 \mod [n + 1]$, which is possible only if $i = j - 1$ ($i = 1, \ldots, n$), or $i = n + 1$, $j = 1$. Evaluating these two types of terms yields eq. (6.38). The analysis at $P^+$ is similar.

We now finish the proof of the proposition, by first analysing the residue of $K_1$ at $P^\pm$. It is clear that only the terms written in eqs. (6.38, 6.39) contribute to the residue. Indeed, if one keeps terms of order $\mu^{-2}$ the same computation as above yields contributions to $\delta\widehat\Psi\,\widehat\Psi^{-1}$ which are non-vanishing only if $i - j \equiv \pm 2 \mod [n + 1]$, which cannot contribute to the trace with $\delta L$. The contributions to the residues at $P^\pm$ of the first terms in eqs. (6.38, 6.39) add to $2\sum_i \delta p_i \wedge \delta q_i$ (remember that $d\lambda/\lambda$ has residue $\mp 1$ at $P^\pm$). Similarly, the contributions of the second terms at $P^\pm$ are also equal and add up to $2\sum_i \delta p_i \wedge \delta q_i$. Finally, since $\delta\mu$ has a simple zero at $P^\pm$ the forms $K_2$ and $K_3$ are regular at these points. The quantities $(\log\lambda_{\gamma_k}, \mu_{\gamma_k})$ are therefore canonical coordinates.

Since the points $(\lambda_{\gamma_k}, \mu_{\gamma_k})$ of the dynamical divisor are on the spectral curve, we have the $n$ equations $\Gamma(\lambda_{\gamma_k}, \mu_{\gamma_k}) = 0$, where $\Gamma(\lambda, \mu) = 0$ is the equation of the spectral curve:
$$\lambda_{\gamma_k} + \lambda_{\gamma_k}^{-1} = 2t(\mu_{\gamma_k}), \qquad k = 1, \ldots, n$$
As explained in Chapter 5, this implies that the variables $(\lambda_{\gamma_k}, \mu_{\gamma_k})$ are separated and, furthermore, this allows us to construct the action–angle variables.

6.6 The Sklyanin approach

We now introduce an equivalent description of the Toda chain. This approach can be viewed as a lattice version of the constructions of integrable
field theories of Chapter 3. It is based on the use of a $2 \times 2$ transfer matrix whose Poisson brackets are quadratic, as in eq. (3.91) in Chapter 3. This provides a very simple way to find the separated canonical variables and the corresponding separated Hamilton–Jacobi equations. The linearization of the flow on the Jacobian of the spectral curve is also obtained in this approach.

We first introduce $2 \times 2$ matrices by replacing the linear second order system eq. (6.13) by the linear $2 \times 2$ first order system:
$$\begin{pmatrix} \psi_i \\ \psi_{i+1} \end{pmatrix} = T_i(\mu) \begin{pmatrix} \psi_{i-1} \\ \psi_i \end{pmatrix} \tag{6.40}$$
where the $T_i$ are given by:
$$T_i(\mu) = \begin{pmatrix} 0 & 1 \\ -a_{i-1}/a_i & -(p_i - \mu)/a_i \end{pmatrix} \tag{6.41}$$
The solution of eq. (6.40) with $\psi_0 = \lambda^{-1}\psi_{n+1}$ is:
$$\begin{pmatrix} \psi_i \\ \psi_{i+1} \end{pmatrix} = T_i(\mu)\,T_{i-1}(\mu) \cdots T_1(\mu) \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix}$$
Each matrix $T_i(\mu)$ may be viewed as an elementary transport matrix on a small segment at position $i$, and the product of such matrices is a transport matrix from site 1 to site $i$. The periodicity condition, $\psi_{n+1+i} = \lambda\psi_i$, translates into the eigenvalue problem:
$$T(\mu) \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix} = \lambda \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix}$$
where $T(\mu)$ is the monodromy matrix defined by:
$$T(\mu) = T_{n+1}(\mu)\,T_n(\mu) \cdots T_1(\mu) \equiv \begin{pmatrix} A(\mu) & B(\mu) \\ C(\mu) & D(\mu) \end{pmatrix} \tag{6.42}$$
Here $A(\mu)$, $B(\mu)$, $C(\mu)$, $D(\mu)$ are polynomials in $\mu$ of degrees $\deg A = n - 1$, $\deg B = \deg C = n$, and $\deg D = n + 1$. Since $\det T_i(\mu) = a_{i-1}/a_i$ one has $\det T(\mu) = 1$, and the characteristic equation $\det(T(\mu) - \lambda) = 0$ reads $\lambda^2 - 2t(\mu)\lambda + 1 = 0$, with
$$2t(\mu) = \mathrm{Tr}\,(T(\mu)) = A(\mu) + D(\mu) \tag{6.43}$$
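The properties of the monodromy matrix quoted above can be checked on a small example. The following is a minimal sketch and not part of the original derivation: the function names and the sample values of $a_i$, $p_i$ and $\mu$ are illustrative assumptions, chosen periodic with $\prod_i a_i = 1$ and $\sum_i p_i = 0$. It builds $T(\mu)$ from eq. (6.41) in exact rational arithmetic, so that $\det T(\mu) = 1$ can be tested exactly, and then computes the two roots of $\lambda^2 - 2t(\mu)\lambda + 1 = 0$ in floating point.

```python
from fractions import Fraction
import cmath

def transfer(a_prev, a_i, p_i, mu):
    """Elementary transfer matrix T_i(mu) of eq. (6.41)."""
    return [[Fraction(0), Fraction(1)],
            [-a_prev / a_i, (mu - p_i) / a_i]]

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[r][k] * Y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def monodromy(a, p, mu):
    """T(mu) = T_{n+1}(mu) ... T_1(mu) for a periodic chain with len(a) sites."""
    N = len(a)  # N = n + 1; indices are taken modulo N (a_N = a_0, p_N = p_0)
    T = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for i in range(1, N + 1):
        # multiply on the left so that T_1 acts first
        T = matmul(transfer(a[(i - 1) % N], a[i % N], p[i % N], mu), T)
    return T

# Illustrative (hypothetical) data: three sites, prod(a) = 1, sum(p) = 0.
a = [Fraction(1, 2), Fraction(2), Fraction(1)]
p = [Fraction(1), Fraction(-3, 2), Fraction(1, 2)]
mu = Fraction(3, 7)

(A, B), (C, D) = monodromy(a, p, mu)
det_T = A * D - B * C                 # exact rational number
t = float(A + D) / 2                  # t(mu) of eq. (6.43)
root = cmath.sqrt(t * t - 1)
lam_plus, lam_minus = t + root, t - root
print(det_T, t, lam_plus, lam_minus, lam_plus * lam_minus)
```

Only the periodicity $a_{n+1} = a_0$ is needed for the determinant; the normalization $\prod_i a_i = 1$ affects the leading coefficient of $t(\mu)$ but not the checks above.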
Clearly $t(\mu)$ is a polynomial in $\mu$ of degree $(n + 1)$: $t(\mu) = \frac{1}{2}\mu^{n+1} + \cdots$. Moreover, eq. (6.43) also expresses the existence of an eigenvector of $L(\lambda)$, hence is equivalent to eq. (6.7). By using $T(\mu)$ instead of $L(\lambda)$ we have exchanged the role played by $\mu$ and $\lambda$. The spectral curve $\Gamma$ is now presented as a two-sheeted covering of the $\mu$-plane. Above each point $\mu$ there are two points corresponding to the two roots $\lambda_\pm = t(\mu) \pm \sqrt{t^2(\mu) - 1}$ of eq. (6.43) such that $\lambda_+\lambda_- = 1$. In particular, above $\mu = \infty$ we have the two points $P^+$ ($\lambda = \infty$) and $P^-$ ($\lambda = 0$). In the following, we shall choose the normalization condition $\psi_0(P) = 1$.

It is useful to introduce a standard basis of solutions of the linear system eq. (6.40), which we denote by $\chi_i^{(0)}$, $\chi_i^{(1)}$, and specified by the boundary conditions:
$$\begin{pmatrix} \chi_0^{(0)} \\ \chi_1^{(0)} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} \chi_0^{(1)} \\ \chi_1^{(1)} \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{6.44}$$
These solutions are polynomials in $\mu$ with $\deg\chi_i^{(0)} = i - 2$, $\deg\chi_i^{(1)} = i - 1$. We can expand any other solution on this basis. In particular:
$$\psi_i = \chi_i^{(0)} + \psi_1\,\chi_i^{(1)}$$
The coefficient $\psi_1$ is determined by the periodicity condition $\psi_{n+1+i} = \lambda\psi_i$. This is equivalent to the eigenvalue equation $T(\mu)\begin{pmatrix} 1 \\ \psi_1 \end{pmatrix} = \lambda\begin{pmatrix} 1 \\ \psi_1 \end{pmatrix}$ and gives:
$$\psi_i^\pm = \chi_i^{(0)} + \frac{\lambda^\pm - A(\mu)}{B(\mu)}\,\chi_i^{(1)} \tag{6.45}$$
We see that the two functions $\psi_i^\pm$ corresponding to the two Bloch waves are in fact the values of a unique meromorphic function $\psi_i(P)$ evaluated at the two points $(\mu, \lambda^\pm)$ above $\mu$, on $\Gamma$. The poles at finite distance of $\psi_i$ are thus the same as those of $\psi_1$ and are located above the $n$ zeroes of $B(\mu)$. When $B(\mu_{\gamma_k}) = 0$, the two eigenvalues of $T(\mu_{\gamma_k})$ are $A(\mu_{\gamma_k})$ and $D(\mu_{\gamma_k})$ so that the numerator $\lambda - A(\mu)$ vanishes on one of the two points above $\mu_{\gamma_k}$. Therefore the function $\psi_i(P)$ has only one pole at $(\mu_{\gamma_k}, \lambda_{\gamma_k} = D(\mu_{\gamma_k}))$. Hence the dynamical divisor is exactly the same as in the $(n + 1) \times (n + 1)$ matrix approach. As we already know, the coordinates $(\mu_{\gamma_k}, \log\lambda_{\gamma_k})$ of these points form a set of conjugate canonical variables. We present an alternative way of deriving this result in the following section.
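The formula for $\psi_1$ behind eq. (6.45) is easy to test on a generic unimodular $2 \times 2$ matrix. The sketch below is illustrative only: the entries are made-up numbers, with $C$ fixed by $AD - BC = 1$, and the function name is an assumption. The first row of the eigenvalue equation holds by construction of $\psi_1 = (\lambda - A)/B$, so the code evaluates the residual of the second row.

```python
import cmath

def bloch_eigenvectors(A, B, C, D):
    """Eigenvalues lambda_pm and components psi_1 = (lambda - A) / B, cf. eq. (6.45).

    Assumes A*D - B*C = 1, so the eigenvalues are t +/- sqrt(t^2 - 1)."""
    t = (A + D) / 2
    root = cmath.sqrt(t * t - 1)
    return [(lam, (lam - A) / B) for lam in (t + root, t - root)]

# Hypothetical entries of a monodromy matrix with unit determinant.
A, B, D = 0.5, 2.0, 2.5
C = (A * D - 1.0) / B

for lam, psi1 in bloch_eigenvectors(A, B, C, D):
    # residual of the second row of T (1, psi_1)^T = lambda (1, psi_1)^T
    residual = C + D * psi1 - lam * psi1
    print(lam, psi1, abs(residual))
```

As $B \to 0$ with the other entries fixed, $\psi_1$ stays finite only on the sheet where $\lambda \to A$, which is the mechanism that places the pole on the other sheet, $\lambda = D$.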
6.7 The Poisson brackets
We first establish an explicit formula for the Poisson brackets of the matrix elements of $T(\mu)$. It is actually more convenient to perform a gauge transformation before computing these Poisson brackets. Let
$$T_i(\mu) \to T'_i(\mu) = D_i\,T_i(\mu)\,D_{i-1}^{-1}$$
with $D_i$ periodic, $D_{n+1+i} = D_i$. Since $T(\mu) = T_{n+1}(\mu) \cdots T_1(\mu)$ is the product of the matrices $T_i(\mu)$, it gets conjugated by $D_0$: $T(\mu) \to T'(\mu) = D_0\,T(\mu)\,D_0^{-1}$. In particular, the spectral curve $\det(T(\mu) - \lambda) = 0$ is preserved by such a gauge transformation. We choose the matrices $D_i$ so that the matrices $T'_i(\mu)$ are local, i.e. $T'_i(\mu)$ only depends on the canonical variables $q_i$ and $p_i$. We take $D_i = \mathrm{Diag}(d_i, d_{i+1}^{-1})$ with $d_i = \exp(q_i)$. Notice that $a_i = d_i/d_{i+1}$. The explicit expressions for $T'_i(\mu)$ are:
$$T'_i(\mu) = \begin{pmatrix} 0 & e^{2q_i} \\ -e^{-2q_i} & (\mu - p_i) \end{pmatrix} \tag{6.46}$$
Note that $\det T'_i(\mu) = 1$ for all $i$. The monodromy matrix $T'(\mu)$ is equal to $T'(\mu) = T'_{n+1}(\mu) \cdots T'_1(\mu)$.

Proposition. The Poisson brackets of the matrix elements of $T'(\mu)$ are given by:
$$\{T'_1(\mu), T'_2(\mu')\} = \left[r_{12}(\mu - \mu'),\, T'_1(\mu)\,T'_2(\mu')\right] \tag{6.47}$$
where the r-matrix is given by
$$r_{12}(\mu - \mu') = \frac{C_{12}}{\mu - \mu'}, \qquad C_{12} = \sum_{ij} E_{ij} \otimes E_{ji}$$
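For a single local matrix of the form (6.46) the quadratic bracket can be spot-checked numerically. The sketch below is a hedged illustration: it assumes the bracket $\{f, g\} = \frac{1}{2}(\partial_p f\,\partial_q g - \partial_q f\,\partial_p g)$, which reproduces $\{p, q\} = \frac{1}{2}$ of eq. (6.35), and it uses the component form of the commutator with $C_{12}$, whose coefficient of $E_{ac} \otimes E_{bd}$ is $T_{bc}(\mu)T_{ad}(\mu') - T_{ad}(\mu)T_{bc}(\mu')$. Function names and sample numbers are illustrative, and `nu` stands for $\mu'$.

```python
import math
from itertools import product

def site_matrix(q, p, mu):
    """Entries of the local matrix (6.46) together with their q- and p-derivatives."""
    u = math.exp(2 * q)
    val = [[0.0, u], [-1 / u, mu - p]]
    dq = [[0.0, 2 * u], [2 / u, 0.0]]
    dp = [[0.0, 0.0], [0.0, -1.0]]
    return val, dq, dp

def bracket_defect(q, p, mu, nu):
    """Largest violation of the single-site r-matrix relation over all 16 components."""
    T, Tq, Tp = site_matrix(q, p, mu)
    S, Sq, Sp = site_matrix(q, p, nu)
    worst = 0.0
    for a, b, c, d in product(range(2), repeat=4):
        # {T_ac(mu), T_bd(nu)} with the assumed normalization {p, q} = 1/2
        lhs = 0.5 * (Tp[a][c] * Sq[b][d] - Tq[a][c] * Sp[b][d])
        # component of [C_12, T_1(mu) T_2(nu)] / (mu - nu)
        rhs = (T[b][c] * S[a][d] - T[a][d] * S[b][c]) / (mu - nu)
        worst = max(worst, abs(lhs - rhs))
    return worst

print(bracket_defect(q=0.3, p=-1.2, mu=0.7, nu=2.5))
```

The check covers one site only; the statement for the full monodromy matrix rests on the product argument, not on this computation.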
Proof. We first prove this relation for each individual $T'_i(\mu)$. Specifically,
$$\{T'_{1,i}(\mu), T'_{2,i}(\mu')\} = \left[r_{12}(\mu - \mu'),\, T'_{1,i}(\mu)\,T'_{2,i}(\mu')\right] \tag{6.48}$$
This is shown by a direct computation using the explicit formula (6.46) for $T'_i(\mu)$. We then prove that if two matrices $T'_i$ and $T'_j$, whose entries Poisson-commute with each other, each satisfy (6.48), then so does their product $T'_j(\mu)\,T'_i(\mu)$. Indeed, using the fact that the Poisson bracket and the Lie bracket are both derivations,
$$\begin{aligned}
\{T'_{1,j}(\mu)T'_{1,i}(\mu),\, T'_{2,j}(\mu')T'_{2,i}(\mu')\} &= T'_{1,j}(\mu)T'_{2,j}(\mu')\,\{T'_{1,i}(\mu), T'_{2,i}(\mu')\} + \{T'_{1,j}(\mu), T'_{2,j}(\mu')\}\,T'_{1,i}(\mu)T'_{2,i}(\mu') \\
&= \left[r_{12}(\mu - \mu'),\, T'_{1,j}(\mu)T'_{1,i}(\mu)\,T'_{2,j}(\mu')T'_{2,i}(\mu')\right]
\end{aligned}$$
The claim then follows by induction because $\{T'_{1,i}(\mu), T'_{2,j}(\mu')\} = 0$ if $i \neq j$ by the locality of $T'_i(\mu)$. Note that this also implies the integrability of the Toda chain.

Let $A(\mu)$, $B(\mu)$, $C(\mu)$ and $D(\mu)$ be the matrix elements of $T'(\mu)$:
$$T'(\mu) = \begin{pmatrix} A(\mu) & B(\mu) \\ C(\mu) & D(\mu) \end{pmatrix}$$
In terms of the matrix elements of $T(\mu)$ introduced in the previous section, written here with a tilde to distinguish them, $A(\mu) = \tilde A(\mu)$, $D(\mu) = \tilde D(\mu)$ but $B(\mu) = (d_0 d_1)\,\tilde B(\mu)$ and $C(\mu) = \tilde C(\mu)/(d_0 d_1)$. In particular one finds (recalling that $\tilde B(\mu) = (d_0/d_1)\mu^n + \cdots$):
$$B(\mu) = d_0^2\Big(\mu^n - \mu^{n-1}\sum_{j=1}^{n} p_j + \cdots\Big)$$
This is a polynomial of degree $n$ with $n$ zeroes, which we denote by $\mu_{\gamma_k}$, $k = 1, \ldots, n$:
$$B(\mu) = d_0^2 \prod_{k=1}^{n} (\mu - \mu_{\gamma_k})$$
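As explained in the previous section, these zeroes are the $\mu$-coordinates of the poles of the Baker functions $\psi_i$.

Proposition. Let $\mu_{\gamma_k}$ be the zeroes of $B(\mu)$ and $\lambda_{\gamma_k}$ be the values of $D(\mu)$ at $\mu = \mu_{\gamma_k}$. The $n$ points $(\lambda_{\gamma_k}, \mu_{\gamma_k})$ are the points of the dynamical divisor:
$$B(\mu_{\gamma_k}) = 0 \quad \text{and} \quad \lambda_{\gamma_k} = D(\mu_{\gamma_k}) \tag{6.49}$$
These parameters obey the following Poisson brackets:
$$\{\mu_{\gamma_k}, \mu_{\gamma_{k'}}\} = \{\lambda_{\gamma_k}, \lambda_{\gamma_{k'}}\} = 0, \qquad \{\lambda_{\gamma_k}, \mu_{\gamma_{k'}}\} = \lambda_{\gamma_k}\,\delta_{kk'} \tag{6.50}$$
The variables $(\log\lambda_{\gamma_k}, \mu_{\gamma_k})$ form a set of canonical coordinates.

Proof. Recall that the matrix $T'(\mu)$ is lower triangular at $\mu = \mu_{\gamma_k}$, since $B(\mu_{\gamma_k}) = 0$. Therefore, at $\mu = \mu_{\gamma_k}$ we have $1 = \det T'(\mu_{\gamma_k}) = A(\mu_{\gamma_k})D(\mu_{\gamma_k})$ and $2t(\mu_{\gamma_k}) = A(\mu_{\gamma_k}) + D(\mu_{\gamma_k})$, hence the points $(\lambda_{\gamma_k}, \mu_{\gamma_k})$ are on the spectral curve:
$$\lambda_{\gamma_k} + \lambda_{\gamma_k}^{-1} = 2t(\mu_{\gamma_k}) \tag{6.51}$$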
The Poisson brackets (6.47) for $T'(\mu)$ imply the following:
$$\{A(\mu), A(\mu')\} = \{B(\mu), B(\mu')\} = \{C(\mu), C(\mu')\} = \{D(\mu), D(\mu')\} = 0$$
$$\{A(\mu), B(\mu')\} = \frac{A(\mu)B(\mu') - B(\mu)A(\mu')}{\mu - \mu'} \tag{6.52}$$
$$\{A(\mu), D(\mu')\} = \frac{C(\mu)B(\mu') - B(\mu)C(\mu')}{\mu - \mu'} \tag{6.53}$$
$$\{B(\mu), D(\mu')\} = \frac{D(\mu)B(\mu') - B(\mu)D(\mu')}{\mu - \mu'} \tag{6.54}$$
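The first equation directly implies that the $\mu_{\gamma_k}$ Poisson commute. Let $P(\mu)$ be a polynomial in $\mu$ with coefficients functions on phase space, and $F$ be an arbitrary function on phase space. Then the Poisson bracket between $F$ and the value of $P(\mu)$ at $\mu_{\gamma_k}$ is:
$$\{F, P(\mu_{\gamma_k})\} = \{F, P(\mu)\}\big|_{\mu = \mu_{\gamma_k}} + \{F, \mu_{\gamma_k}\}\,\partial_\mu P(\mu_{\gamma_k}) \tag{6.55}$$
We apply this to $F = D(\mu)$ and $P(\mu) = B(\mu)$. Since $B(\mu_{\gamma_k}) = 0$, we get:
$$0 = \{D(\mu), B(\mu_{\gamma_k})\} = \{D(\mu), B(\mu')\}\big|_{\mu' = \mu_{\gamma_k}} + \{D(\mu), \mu_{\gamma_k}\}\,B'(\mu_{\gamma_k})$$
where $B'(\mu) = \partial_\mu B(\mu)$. We evaluate the first term by using eq. (6.54), obtaining
$$\{D(\mu), \mu_{\gamma_k}\} = \frac{\lambda_{\gamma_k}}{\mu - \mu_{\gamma_k}}\,\frac{B(\mu)}{B'(\mu_{\gamma_k})}$$
Next we apply the same formula to $F = \mu_{\gamma_{k'}}$ and $P = D$, evaluated at $\mu = \mu_{\gamma_k}$. Since $\{\mu_{\gamma_{k'}}, \mu_{\gamma_k}\} = 0$, one gets
$$\{\mu_{\gamma_{k'}}, \lambda_{\gamma_k}\} = \{\mu_{\gamma_{k'}}, D(\mu_{\gamma_k})\} = \{\mu_{\gamma_{k'}}, D(\mu)\}\big|_{\mu = \mu_{\gamma_k}}$$
So we have to evaluate $B(\mu)/(\mu - \mu_{\gamma_{k'}})$ at $\mu = \mu_{\gamma_k}$, which gives $\delta_{kk'}\,B'(\mu_{\gamma_k})$. The equation $\{\lambda_{\gamma_k}, \mu_{\gamma_{k'}}\} = \delta_{kk'}\,\lambda_{\gamma_k}$ follows.

Finally, we need to prove $\{\lambda_{\gamma_k}, \lambda_{\gamma_{k'}}\} = 0$. We compute $\{D(\mu_{\gamma_k}), D(\mu_{\gamma_{k'}})\}$ by expanding this expression completely using eq. (6.55), $\{D(\mu), D(\mu')\} = 0$ and $\{\mu_{\gamma_k}, \mu_{\gamma_{k'}}\} = 0$. We get:
$$\{\lambda_{\gamma_k}, \lambda_{\gamma_{k'}}\} = \partial_\mu D(\mu_{\gamma_{k'}})\,\{D(\mu), \mu_{\gamma_{k'}}\}\big|_{\mu = \mu_{\gamma_k}} - \partial_\mu D(\mu_{\gamma_k})\,\{D(\mu), \mu_{\gamma_k}\}\big|_{\mu = \mu_{\gamma_{k'}}}$$
We have already seen that each of these two terms vanishes when $k \neq k'$ and they obviously cancel each other when $k = k'$.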
In particular, the symplectic two-form $\omega$ can be written as:
$$\omega = \sum_{k=1}^{n} \frac{\delta\lambda_{\gamma_k}}{\lambda_{\gamma_k}} \wedge \delta\mu_{\gamma_k}$$
where $\delta$ denotes the differential on the phase space.

The fact that the equations of motion linearize on the Jacobian variety of the spectral curve may also be derived in this framework. The generating function for the conserved Hamiltonians is the trace of the monodromy matrix $\mathrm{Tr}\,(T'(u)) = 2t(u) = A(u) + D(u)$. Thus the equations of motion of any function $F$ on phase space for this generic Hamiltonian $t(u)$ are:
$$\dot F \equiv \{t(u), F\}$$
Proposition. The equations of motion, relative to the Hamiltonian $t(u)$, of the zeroes $\mu_{\gamma_k}$ of $B(\mu)$ are:
$$\dot\mu_{\gamma_k} = \{t(u), \mu_{\gamma_k}\} = \frac{1}{2}\,\frac{B(u)}{(u - \mu_{\gamma_k})\,B'(\mu_{\gamma_k})}\,\left(A(\mu_{\gamma_k}) - D(\mu_{\gamma_k})\right) \tag{6.56}$$
Their linearization, under the Abel map $\mathcal{A}$, is given by the following relations:
$$\dot{\mathcal{A}}_j = \sum_{k=1}^{n} \frac{\mu_{\gamma_k}^{\,j}\,\dot\mu_{\gamma_k}}{\sqrt{t^2(\mu_{\gamma_k}) - 1}} = u^j, \qquad 0 \le j \le n - 1 \tag{6.57}$$
Proof. The Poisson bracket $\{t(u), \mu_{\gamma_k}\}$ can be computed in the same way as in the previous proposition. One obtains:
$$\dot\mu_{\gamma_k} = \{t(u), \mu_{\gamma_k}\} = \frac{B(u)}{2(u - \mu_{\gamma_k})\,B'(\mu_{\gamma_k})}\,\left(A(\mu_{\gamma_k}) - D(\mu_{\gamma_k})\right) \tag{6.58}$$
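Hence $\dot\mu_{\gamma_k}$ is a polynomial in $u$ of degree $n - 1$ which vanishes at the points $u = \mu_{\gamma_{k'}}$ for $k' \neq k$ and takes the value $\frac{1}{2}\left(A(\mu_{\gamma_k}) - D(\mu_{\gamma_k})\right) = \sqrt{t^2(\mu_{\gamma_k}) - 1}$ for $k' = k$. On a hyperelliptic curve the Abelian differentials are the
$$\omega_j = \frac{\mu^j}{\sqrt{t^2(\mu) - 1}}\,d\mu, \qquad 0 \le j \le n - 1$$
so that the time derivative of the Abel map $\mathcal{A}_j \equiv \sum_k \int^{P_k} \omega_j$ is given by:
$$\dot{\mathcal{A}}_j = \sum_k \frac{\mu_{\gamma_k}^{\,j}\,\dot\mu_{\gamma_k}}{\sqrt{t^2(\mu_{\gamma_k}) - 1}}$$
Replacing $\dot\mu_{\gamma_k}$ by its value given in eq. (6.58), we see that $\dot{\mathcal{A}}_j$ is a polynomial in $u$ of degree at most $n - 1$. If one evaluates it at $u = \mu_{\gamma_l}$, only the term $k = l$ contributes in the above sum, which gives $\mu_{\gamma_l}^{\,j}$. Hence the polynomial itself is $u^j$.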
6.8 Reality conditions
Up to now the dynamical variables of the Toda chain were complex valued. We now come to the important and generally difficult question of choosing a real slice in the complexified phase space. This complex phase space is a fibred space with basis the action variables and fibre the Jacobian variety of the corresponding spectral curve. While it is easy to take real action variables, it is less trivial to choose a proper real slice on the Jacobian which will be identified with the Liouville torus.

Let us first remark that the equation of the spectral curve, eq. (6.8), shows that it is a hyperelliptic curve with hyperelliptic involution given by $(\lambda, \mu) \to (1/\lambda, \mu)$. Hence the branch points, which are invariant under this involution, are located at $\lambda = 1/\lambda$, i.e. $\lambda = \pm 1$. For such values of $\lambda$ the Lax matrix $L(\lambda)$, eq. (6.4), is symmetric. If, moreover, the dynamical variables $p_i$, $q_i$ are real, the matrices $L(\lambda = \pm 1)$ are real symmetric, hence each has $(n + 1)$ real eigenvalues which are precisely the locations of the branch points. We denote by $\beta_0 < \beta_1 < \cdots < \beta_{2n+1}$ these $2(n + 1)$ real branch points which completely characterize the curve. This implies in particular that the action variables, i.e. the coefficients of eq. (6.8), are real. Writing the equation of the spectral curve as:
$$\lambda + \frac{1}{\lambda} = 2t(\mu) \tag{6.59}$$
where $t(\mu)$ is a polynomial of degree $(n + 1)$ with real coefficients, the $2n + 2$ branch points are located at $t(\mu) = \pm 1$. It follows that the graph of $t(\mu)$ has the form shown in Fig. 6.1 (here assuming $n + 1$ even). For $n + 1$ odd, the left branch (where $\mu \to -\infty$) goes to $-\infty$.

Note that for real $\mu$ the two values of $\lambda$ such that $\lambda^2 - 2t(\mu)\lambda + 1 = 0$ are either real or complex conjugate with modulus one. Borrowing the terminology of solid state physics, the first case is called the forbidden zone, while the second case is called the allowed zone. We see that the forbidden zones correspond to the intervals $[\beta_i, \beta_{i+1}]$ for $i = 1, 3, \ldots, 2n - 1$ (for both $n$ even and odd), to which is added one extra zone passing through $\mu = \infty$ given by $\mu < \beta_0$ or $\mu > \beta_{2n+1}$. So there are $n$ compact forbidden zones.

We now have to locate the points of the dynamical divisor in agreement with the reality conditions. Recall that they are given by the points $B(\mu) = 0$ and $\lambda = D(\mu)$. We show that the zeroes of the polynomial $B(\mu)$ of degree $n$ are real so that the corresponding $\lambda$ are also real, since $D(\mu)$ has real coefficients. This immediately implies that the zeroes of $B(\mu)$ lie in the forbidden zones.
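The dichotomy between allowed and forbidden zones depends only on the real number $t(\mu)$, so it can be illustrated without building a chain. The sketch below is illustrative: the function names are assumptions, and the branch points $|t| = 1$ are lumped with the allowed zones for simplicity. It returns both roots of $\lambda^2 - 2t\lambda + 1 = 0$ and labels the zone in the sense used above.

```python
import cmath

def bloch_multipliers(t):
    """Both roots of lambda^2 - 2 t lambda + 1 = 0 for a real value t = t(mu)."""
    root = cmath.sqrt(complex(t * t - 1.0))
    return t + root, t - root

def zone(t):
    """Zone label for a real mu with t = t(mu); |t| = 1 is a branch point,
    treated here as part of the allowed zone."""
    return "forbidden" if abs(t) > 1.0 else "allowed"

for t in (0.3, -0.8, 2.0, -2.5):
    lam1, lam2 = bloch_multipliers(t)
    print(t, zone(t), lam1, lam2, abs(lam1))
```

In the allowed case the two roots are complex conjugates on the unit circle; in the forbidden case they are real and inverse to each other.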
Fig. 6.1. The graph of the function $t(\mu)$.

Proposition. The points of the dynamical divisor have real coordinates, and there is exactly one such point in each of the $n$ forbidden zones $[\beta_i, \beta_{i+1}]$ for $i = 1, 3, \ldots, 2n - 1$.

Proof. We first show that the zeroes of $B(\mu)$ are real. This is because the zeroes of $B(\mu)$ are also the poles of the eigenvector of $L(\lambda)$ normalized by $\psi_0 = 1$. The eigenvector equation being:
$$\begin{pmatrix} p_0 - \mu & a_0 & \cdots & \lambda^{-1}a_n \\ a_0 & p_1 - \mu & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \lambda a_n & 0 & \cdots & p_n - \mu \end{pmatrix} \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_n \end{pmatrix} = 0$$
the poles of the normalized $\Psi$ are included in the zero set of the minor of $(p_0 - \mu)$, see Chapter 5. This minor is the characteristic polynomial of a real symmetric $n \times n$ matrix, hence has $n$ real zeroes $\mu_{\gamma_i}$. To show that there is just one zero of $B(\mu)$ in each $[\beta_i, \beta_{i+1}]$, we show that $B(\mu)$ has the same sign as $a_0\,dt(\mu)/d\mu$ in the allowed zones, hence changes sign between $\beta_i$ and $\beta_{i+1}$, $i$ odd, because so does $dt(\mu)/d\mu$. Since
$B(\mu)$ has $n$ zeroes the only possibility is that it has exactly one zero in each forbidden zone.

The analysis of the sign of $a_0\,dt(\mu)/d\mu$ is quite involved. It rests on the analysis of the two fundamental solutions, eq. (6.44), of the Schroedinger equation, eq. (6.12). In particular, recall that $(\chi_{n+1}(\mu), \chi_{n+2}(\mu))$ is related to the initial values $(\chi_0, \chi_1)$ by the monodromy matrix $T(\mu)$, so that we have $A(\mu) = \chi^{(0)}_{n+1}(\mu)$, $B(\mu) = \chi^{(1)}_{n+1}(\mu)$, $C(\mu) = \chi^{(0)}_{n+2}(\mu)$, $D(\mu) = \chi^{(1)}_{n+2}(\mu)$, and $2t(\mu) = \chi^{(0)}_{n+1}(\mu) + \chi^{(1)}_{n+2}(\mu)$. Moreover, since $\det T(\mu) = 1$, we get the Wronskian relation:
$$\chi^{(0)}_{n+1}(\mu)\,\chi^{(1)}_{n+2}(\mu) - \chi^{(0)}_{n+2}(\mu)\,\chi^{(1)}_{n+1}(\mu) = 1 \tag{6.60}$$
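We then consider two solutions $\chi_i^{(j)}(\mu)$ and $\chi_i^{(j')}(\mu')$ of the Schroedinger equation corresponding to different values $\mu$ and $\mu'$:
$$a_{i-1}\,\chi^{(j)}_{i-1}(\mu) + p_i\,\chi^{(j)}_i(\mu) + a_i\,\chi^{(j)}_{i+1}(\mu) = \mu\,\chi^{(j)}_i(\mu)$$
$$a_{i-1}\,\chi^{(j')}_{i-1}(\mu') + p_i\,\chi^{(j')}_i(\mu') + a_i\,\chi^{(j')}_{i+1}(\mu') = \mu'\,\chi^{(j')}_i(\mu')$$
Here $j, j'$ take the values 0, 1. We multiply the first equation by $\chi_i^{(j')}(\mu')$, the second by $\chi_i^{(j)}(\mu)$ and subtract. Adding the resulting equations for $i = 1, 2, \ldots, n + 1$ we get:
$$a_0\left[\chi_0^{(j)}(\mu)\chi_1^{(j')}(\mu') - \chi_1^{(j)}(\mu)\chi_0^{(j')}(\mu')\right] - a_{n+1}\left[\chi_{n+1}^{(j)}(\mu)\chi_{n+2}^{(j')}(\mu') - \chi_{n+2}^{(j)}(\mu)\chi_{n+1}^{(j')}(\mu')\right] = (\mu - \mu')\left(\chi^{(j)}(\mu), \chi^{(j')}(\mu')\right)$$
where we have denoted $\left(\chi^{(j)}(\mu), \chi^{(j')}(\mu')\right) = \sum_{i=1}^{n+1} \chi_i^{(j)}(\mu)\,\chi_i^{(j')}(\mu')$. Consider the four above equations for $(j, j') = (0, 0), (1, 1), (0, 1), (1, 0)$ and multiply them respectively by $\chi^{(1)}_{n+1}(\mu)$, $-\chi^{(0)}_{n+2}(\mu)$, $\chi^{(1)}_{n+2}(\mu)$ and $-\chi^{(0)}_{n+1}(\mu)$. Adding them and taking into account eq. (6.60), we get:
$$2a_0\,\frac{t(\mu) - t(\mu')}{\mu - \mu'} = B(\mu)\left(\chi^{(0)}(\mu), \chi^{(0)}(\mu')\right) - C(\mu)\left(\chi^{(1)}(\mu), \chi^{(1)}(\mu')\right) + D(\mu)\left(\chi^{(0)}(\mu), \chi^{(1)}(\mu')\right) - A(\mu)\left(\chi^{(1)}(\mu), \chi^{(0)}(\mu')\right)$$
We can now set $\mu = \mu'$ and obtain our final relation:
$$2a_0\,\frac{dt(\mu)}{d\mu} = B(\mu)\left(\chi^{(0)}(\mu), \chi^{(0)}(\mu)\right) - C(\mu)\left(\chi^{(1)}(\mu), \chi^{(1)}(\mu)\right) + \left(D(\mu) - A(\mu)\right)\left(\chi^{(0)}(\mu), \chi^{(1)}(\mu)\right) \tag{6.61}$$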
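The two-point identity that leads to eq. (6.61) is polynomial in all variables, so it can be verified exactly with rational numbers. The following is a sketch under stated assumptions: the chain data, the names `fundamental_solution`, `pairing` and `check_two_point_identity`, and the sample values of $\mu$ and $\mu'$ (called `nu` in the code) are all illustrative. The recursion is the Schroedinger equation solved for $\chi_{i+1}$, with indices taken modulo $n + 1$ so that $a_{n+1} = a_0$.

```python
from fractions import Fraction

def fundamental_solution(a, p, mu, chi0, chi1):
    """Solve a_{i-1} chi_{i-1} + p_i chi_i + a_i chi_{i+1} = mu chi_i for i = 1..N.

    Returns [chi_0, ..., chi_{N+1}], with N = len(a) and indices taken modulo N."""
    N = len(a)
    chi = [Fraction(chi0), Fraction(chi1)]
    for i in range(1, N + 1):
        chi.append(((mu - p[i % N]) * chi[i] - a[(i - 1) % N] * chi[i - 1]) / a[i % N])
    return chi

def pairing(u, v):
    """(u, v) = sum over i = 1..N of u_i v_i, where N = len(u) - 2."""
    return sum(x * y for x, y in zip(u[1:-1], v[1:-1]))

def check_two_point_identity(a, p, mu, nu):
    """Return the Wronskian (6.60) and both sides of the identity preceding (6.61)."""
    N = len(a)
    x0 = fundamental_solution(a, p, mu, 1, 0)   # chi^(0)(mu)
    x1 = fundamental_solution(a, p, mu, 0, 1)   # chi^(1)(mu)
    y0 = fundamental_solution(a, p, nu, 1, 0)   # chi^(0)(mu')
    y1 = fundamental_solution(a, p, nu, 0, 1)   # chi^(1)(mu')
    A, C = x0[N], x0[N + 1]
    B, D = x1[N], x1[N + 1]
    t_mu = (A + D) / 2
    t_nu = (y0[N] + y1[N + 1]) / 2
    wronskian = A * D - C * B
    lhs = 2 * a[0] * (t_mu - t_nu) / (mu - nu)
    rhs = (B * pairing(x0, y0) - C * pairing(x1, y1)
           + D * pairing(x0, y1) - A * pairing(x1, y0))
    return wronskian, lhs, rhs

# Illustrative (hypothetical) three-site chain.
a = [Fraction(1, 2), Fraction(2), Fraction(1)]
p = [Fraction(1), Fraction(-3, 2), Fraction(1, 2)]
print(check_two_point_identity(a, p, Fraction(3, 7), Fraction(-5, 3)))
```

Equation (6.61) is the limit $\mu' \to \mu$ of the identity tested here; working at two distinct points avoids any numerical differentiation.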
The right-hand side is a sum over $i = 1, \ldots, n + 1$ of quadratic forms in the variables $\chi_i^{(0)}(\mu)$ and $\chi_i^{(1)}(\mu)$, with discriminant $(D(\mu) - A(\mu))^2 + 4B(\mu)C(\mu) = 4(t^2(\mu) - 1)$. In the allowed zones we have $t^2(\mu) - 1 < 0$ and the quadratic form has a definite sign, which is the sign of $B(\mu)$. Equation (6.61) shows that this is also the sign of $a_0\,dt(\mu)/d\mu$, as asserted above.

We can now describe the real part of the spectral curve. From eq. (6.59), we see that when $t(\mu)^2 > 1$, i.e. in the forbidden zones, we have two real solutions for $\lambda$, which are inverse to each other. For $\mu = \beta_j$, these two solutions coincide. In the allowed zones, $t(\mu)^2 < 1$, there is no real solution for $\lambda$. Hence the real slice of the spectral curve has $n$ components $C_k$ at finite distance, and one component which extends to $\infty$. See Fig. 6.2. The dynamical divisor has $n$ points $\gamma_k$ with just one point in each of the components $C_k$ at finite distance. As time goes on, each of the points $\gamma_k$ runs along the cycle $C_k$, hence the whole motion lies on the real torus $C_1 \times C_2 \times \cdots \times C_n$, which is the Liouville torus. For the Hamiltonian $t(u)$, the time evolution of $\mu_{\gamma_k}$ is given by eq. (6.58) and the one of $\lambda_{\gamma_k}$ is given similarly by:
$$\dot\lambda_{\gamma_k} = -\frac{1}{u - \mu_{\gamma_k}}\,\frac{B(u)}{B'(\mu_{\gamma_k})}\,\lambda_{\gamma_k}\,t'(\mu_{\gamma_k})$$

Fig. 6.2. The real slice of the spectral curve.
Hence, when the point $\gamma_k$ hits the line $\lambda = \pm 1$, we have $\dot\mu_{\gamma_k} = 0$ and $\dot\lambda_{\gamma_k} \neq 0$, so that the point $\gamma_k$ continues in the same direction and loops around $C_k$.

It is interesting to relate the Liouville torus to the real slice of the Jacobian variety of the spectral curve. One can define an antiholomorphic involution of the Jacobian by sending $\sum_k \gamma_k$ to $\sum_k \bar\gamma_k$, where the image of $\gamma = (\lambda, \mu)$ is $\bar\gamma = (\bar\lambda, \bar\mu)$, which also lies on the spectral curve since $t(\mu)$ has real coefficients. A real slice of the Jacobian can be defined as the set of fixed points of this involution. Note that $\sum_k \gamma_k$ is invariant if the $\gamma_k$ are real or occur in complex conjugate pairs. The various combinations of these two possibilities define several connected components of the real slice of the Jacobian. The Liouville torus is just one of these components. Note that we have discussed the reality condition for the dynamical variables $a_i$ and $p_i$, but one should further ensure that the variables $a_i$ are positive to get real $q_i$.

Alternatively, one can view the Liouville torus as some choice of a-cycles on the spectral curve. Consider it as a two-sheeted covering of the complex $\mu$-plane, with cuts between branch points as shown in Fig. 6.3. Here the a-cycles are drawn on the first sheet, so in the limit where the ellipses go to the segments $[\beta_i, \beta_{i+1}]$ we see that $\lambda$ goes to the two real roots of the spectral equation, thereby producing the cycles in the previous drawing. The b-cycles are also drawn. Starting from $\infty$, they cut the corresponding a-cycle once, then pass on the second sheet through the cut, and return to $\infty$ without cutting any more of the a-cycles.
Fig. 6.3. Γ as the cut µ-plane.
7 The Calogero–Moser model
The elliptic Calogero–Moser model provides an example in which the Lax matrix is not a rational function of the spectral parameter but lives on an elliptic curve. As pointed out in Chapter 3, integrable systems whose spectral parameter belongs to Riemann surfaces of higher genus are highly non-generic. Furthermore, this model gives an example of an integrable system whose spectral curve is non-hyperelliptic. Nevertheless, most of the results obtained in the rational case extend to this case with slight but interesting adaptations. A special feature of this model is that its r-matrix explicitly contains dynamical variables. Another remarkable fact is the relation between doubly periodic solutions of the KP hierarchy and the Calogero–Moser model. Finally, we show that this model is a particular example of a general construction, due to Hitchin, which allows us to construct models with spectral parameter lying on higher genus Riemann surfaces.
7.1 The spin Calogero–Moser model

The Calogero–Moser model consists of $N$ identical particles on a line, with positions $q_i$ and momenta $p_i$, with pairwise interactions and Hamiltonian:
$$H = \frac{1}{2}\sum_{i=1}^{N} p_i^2 + \frac{1}{2}\sum_{\substack{i,j=1 \\ i \neq j}}^{N} \frac{\gamma^2}{(q_i - q_j)^2}$$
This dynamical system is integrable; moreover it remains integrable when the potential is replaced by an elliptic potential, i.e. $1/q^2 \to \wp(q)$ with $\wp(q)$ the Weierstrass elliptic function defined on a two dimensional torus with periods $\omega_1$ and $\omega_2$.
It is rewarding to consider a slightly generalized model, the so-called spin Calogero–Moser model, which contains extra dynamical spin variables. Let us introduce a set of dynamical variables $(q_i, p_i)$ and $(f_{ij})$, $i, j = 1, \ldots, N$, together with the Poisson brackets:
$$\{p_i, q_j\} = \delta_{ij} \tag{7.1}$$
$$\{f_{ij}, f_{kl}\} = -\delta_{jk}\,f_{il} + \delta_{li}\,f_{kj} \tag{7.2}$$
The Poisson bracket eq. (7.2) is a Kostant–Kirillov bracket for the coadjoint action of the group $GL(N)$. The Hamiltonian reads
$$H = \frac{1}{2}\sum_{i=1}^{N} p_i^2 - \frac{1}{2}\sum_{\substack{i,j=1 \\ i \neq j}}^{N} f_{ij}f_{ji}\,V(q_i - q_j) \tag{7.3}$$
where the potential $V(q) \equiv \wp(q)$. The equations of motion are easily derived:
$$\begin{aligned}
\dot q_i &= p_i, \qquad \dot f_{ii} = 0 \\
\dot p_i &= \sum_{\substack{j=1 \\ j \neq i}}^{N} f_{ij}f_{ji}\,V'(q_{ij}), \qquad q_{ij} \equiv q_i - q_j \\
\dot f_{ij} &= \sum_{\substack{k=1 \\ k \neq i,j}}^{N} f_{ik}f_{kj}\left[V(q_{ik}) - V(q_{jk})\right] + (f_{ii} - f_{jj})\,f_{ij}\,V(q_{ij}), \qquad i \neq j
\end{aligned} \tag{7.4}$$
From (7.4) we see that $f_{ii}$ are integrals of motion, and we can restrict the system to the submanifold
$$f_{ii} = \alpha \tag{7.5}$$
where α is a constant independent of i. In this case the last term in eqs. (7.4) vanishes. These constraints are related to a Hamiltonian reduction, and we will see that it is this reduced system which admits a Lax pair and is integrable. Let us count the number of degrees of freedom. This is not a completely trivial matter due to the degeneracy of the Poisson bracket eq. (7.2), and the necessary reduction to the manifold fii = α. The symplectic leaves of the Kostant–Kirillov bracket, eq. (7.2), are the coadjoint orbits of GL(N ) acting on the matrix F = (fij ) by conjugation. These orbits are generically characterized by the eigenvalues of the matrix F which are in the centre of the Poisson bracket. Here we shall consider matrices F of rank l, with l different non-vanishing eigenvalues. Moreover,
we shall assume that the matrix $F$ is diagonalizable, so that the orbit is of the form
$$\{C\nu C^{-1} \mid C \in GL(N)\}, \quad \text{with } \nu = \mathrm{Diag}(\nu_1, \ldots, \nu_l, 0, \ldots, 0) \tag{7.6}$$
The Hamiltonian eq. (7.3) is not invariant under the above $GL(N)$ but it is preserved by special subgroups. First we have the discrete subgroup of permutation matrices, i.e. the Weyl group of $GL(N)$, which simply operates by permutation of the $N$ indices $i$. More importantly, we have the group of diagonal matrices, i.e. the Cartan torus, which operates by:
$$f_{ij} \to d_i^{-1}\,f_{ij}\,d_j \tag{7.7}$$
This action preserves the Hamiltonian which only depends on $f_{ij}f_{ji}$. We consider the dynamical system on an orbit of rank $l$, reduced under this diagonal action whose generators will be shown to be the $f_{ii}$.

Proposition. The dimension of the reduced phase space $M$ is
$$\dim M = 2\left(Nl - \frac{l(l + 1)}{2} + 1\right) \tag{7.8}$$
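The count behind eq. (7.8) can be reproduced by brute force. The sketch below is illustrative and uses assumed function names: it obtains the dimension of the orbit (7.6) by counting the index pairs $(i, j)$ with $\nu_i \neq \nu_j$, adds the $2N$ coordinates $(q_i, p_i)$, subtracts $2(N - 1)$ for the reduction by the diagonal group, and compares the result with the closed form.

```python
def orbit_dimension(N, l):
    """Number of pairs (i, j) with nu_i != nu_j for nu = Diag(nu_1..nu_l, 0..0)."""
    nu = list(range(1, l + 1)) + [0] * (N - l)   # l distinct non-zero values, then zeros
    return sum(1 for i in range(N) for j in range(N) if nu[i] != nu[j])

def reduced_dimension(N, l):
    """2N for (q_i, p_i), plus the orbit, minus 2(N - 1) from the reduction."""
    return 2 * N + orbit_dimension(N, l) - 2 * (N - 1)

def formula_7_8(N, l):
    """Closed form of eq. (7.8); l(l + 1) is always even, so // is exact."""
    return 2 * (N * l - l * (l + 1) // 2 + 1)

for N in range(1, 6):
    for l in range(1, N + 1):
        print(N, l, orbit_dimension(N, l), reduced_dimension(N, l), formula_7_8(N, l))
```

For $l = 1$ the count gives $2N$, the same as the number of coordinates $(q_i, p_i)$.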
Proof. The tangent space to the orbit (7.6) at $F = (f_{ij})$ is the set of matrices $U = [F, X]$ for any $X \in gl(N)$. In a basis where $F$ is diagonal this equation reads $U_{ij} = (\nu_i - \nu_j)X_{ij}$, hence $U_{ij}$ vanishes when $\nu_i = \nu_j$ but is otherwise arbitrary. So the dimension of the orbit is $2Nl - l^2 - l$. The action of the subgroup of diagonal matrices induces a fibring of the orbits with fibres of dimension $N - 1$ because the identity does not act. The moment associated with this action is the collection of diagonal elements $f_{ii}$, that is $N - 1$ non-trivial moments since on the orbit, the eigenvalues of $F$ being fixed, so is $\mathrm{Tr}(F)$. Indeed, under infinitesimal action $d_i = 1 + \epsilon_i$, $f_{ij}$ changes as $\delta f_{ij} = (\epsilon_j - \epsilon_i)f_{ij}$, and if $P = \sum_i \epsilon_i f_{ii}$, we have $\{P, f_{ij}\} = (\epsilon_j - \epsilon_i)f_{ij}$, so $P$ is the corresponding Hamiltonian. We consider the reduced dynamical system obtained by first fixing the moments to a common value $f_{ii} = \alpha$, and then quotienting by the stabilizer of this moment, which is the whole diagonal group. We can now count the number of degrees of freedom. We have $2N$ degrees of freedom for the $q_i$, $p_i$, plus $2Nl - l^2 - l$ for the orbit, minus $2(N - 1)$ due to the Hamiltonian reduction, which ends up with a phase space of dimension $2(Nl - l(l + 1)/2 + 1)$.

7.2 Lax pair

The first step in proving that the reduced spin Calogero–Moser model is integrable consists of finding a Lax pair formulation. Here we will just give the Lax pair, but in later sections we will explain methods to derive it. We will need the Lamé function $\Phi$:
$$\Phi(q, \lambda) = \frac{\sigma(\lambda - q)}{\sigma(\lambda)\,\sigma(q)}\,e^{\zeta(\lambda)q} \tag{7.9}$$
where $\sigma$ and $\zeta$ are defined in Chapter 15. It is an elliptic function of the parameter $\lambda$ and satisfies the equation
$$\left(\frac{d^2}{dx^2} - 2\wp(x)\right)\Phi(x, \lambda) = \wp(\lambda)\,\Phi(x, \lambda) \tag{7.10}$$
The Lamé function is used to construct the Lax pair of the spin Calogero–Moser model.

Proposition. The equations of motion of the spin Calogero–Moser system are equivalent to the Lax equation
$$\dot L(\lambda) = [M(\lambda), L(\lambda)] \tag{7.11}$$
where the Lax matrices, with spectral parameter $\lambda$, are given by:
$$L_{ij}(t, \lambda) = \dot q_i\,\delta_{ij} + (1 - \delta_{ij})\,f_{ij}\,\Phi(q_i - q_j, \lambda) \tag{7.12}$$
$$M_{ij}(t, \lambda) = -(1 - \delta_{ij})\,f_{ij}\,\Phi'(q_i - q_j, \lambda) \tag{7.13}$$
The prime in eq. (7.13) refers to the derivative with respect to $q$.

Proof. The Lax equation (7.11) reads:
$$\ddot q_i\,\delta_{ij} + (1 - \delta_{ij})\left(\dot f_{ij}\,\Phi(q_{ij}, \lambda) + f_{ij}\,\dot q_{ij}\,\Phi'(q_{ij}, \lambda)\right) = (1 - \delta_{ij})\,f_{ij}\,\dot q_{ij}\,\Phi'(q_{ij}, \lambda) + \sum_{k \neq i,j} f_{ik}f_{kj}\left[\Phi(q_{ik}, \lambda)\,\Phi'(q_{kj}, \lambda) - \Phi'(q_{ik}, \lambda)\,\Phi(q_{kj}, \lambda)\right]$$
This reduces to the equations of motion, in the case where $f_{ii} = \alpha$, if we use an identity satisfied by the Lamé function:
$$\Phi'(x, \lambda)\,\Phi(y, \lambda) - \Phi'(y, \lambda)\,\Phi(x, \lambda) = [\wp(y) - \wp(x)]\,\Phi(x + y, \lambda) \tag{7.14}$$
which also implies, by taking the limit $y \to -x$:
$$\Phi'(x, \lambda)\,\Phi(-x, \lambda) - \Phi'(-x, \lambda)\,\Phi(x, \lambda) = -\wp'(x)$$
To show eq. (7.14) we compare the analyticity and monodromy properties of both sides of the equation in the $x$ variable, using the properties given in Chapter 15.
By a similar method one shows that the Lamé function obeys the identity:
$$\Phi(q, \lambda)\,\Phi(-q, \lambda) = \wp(\lambda) - \wp(q) \tag{7.15}$$
hence
$$\frac{1}{2}\,\mathrm{Tr}\,L^2 = H + \wp(\lambda)\sum_{i<j} f_{ij}f_{ji}$$
where $H$ is the Hamiltonian of the spin Calogero–Moser model, and $\sum_{i \neq j} f_{ij}f_{ji} = \mathrm{Tr}\,F^2 - N\alpha^2$ is an orbit invariant in the centre of the Poisson algebra.

Let us comment on the trigonometric and rational limits of the above formulae. The trigonometric limit is obtained when one of the periods $\omega \to \infty$. We choose the other one as $i\pi$. In this limit the function $\Phi$ becomes:
$$\Phi(q, \lambda) \to (\coth q - \coth\lambda)\,e^{q\coth\lambda}$$
The exponential factor in $\Phi(q, \lambda)$ comes from the factor $\exp(\zeta(\lambda)q)$ which is necessary in the elliptic case to ensure the double periodicity of $\Phi(q, \lambda)$ in $\lambda$. In the trigonometric case, however, this exponential factor can be eliminated by performing a similarity transformation on $L(\lambda)$ without affecting the periodicity properties of the matrix elements of $L(\lambda)$. So we may define
$$L^{\mathrm{trigo}}(\lambda) = \mathrm{Diag}\left(e^{-q_i\coth\lambda}\right)\,\lim_{\omega \to \infty} L^{\mathrm{elliptic}}(\lambda)\;\mathrm{Diag}\left(e^{q_i\coth\lambda}\right) = \sum_i p_i\,E_{ii} + \sum_{i \neq j} f_{ij}\,(\coth q_{ij} - \coth\lambda)\,E_{ij} \tag{7.16}$$
The potential $V(q)$ becomes $V(q) = 1/\sinh^2(q)$. The rational limit is obtained straightforwardly from the trigonometric limit by sending the second period $\omega \to \infty$. The functions $V(q)$ and $\Phi(q, \lambda)$ become
$$V(q) \to \frac{1}{q^2}, \qquad \Phi(q, \lambda) \to \frac{1}{q} - \frac{1}{\lambda}$$
Remark. The diagonal action in eq. (7.7) is equivalent to conjugation of the Lax matrix by a diagonal matrix. The Hamiltonian reduction we have performed is similar to the one appearing in the general rational case, see Chapter 5.
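The identities (7.14) and (7.15) survive in both limits, with $\wp$ replaced by the limiting potential $V$, and this is simple to test numerically. The sketch below is a hedged illustration with assumed function names and arbitrary sample points; the derivatives of $\Phi$ with respect to $q$ are written out by hand. Using $V$ in place of $\wp$ is harmless here because only differences of $\wp$ enter the two identities.

```python
import math

def coth(x):
    return math.cosh(x) / math.sinh(x)

# Rational limit: Phi = 1/q - 1/lam, V = 1/q^2.
def phi_rat(q, lam):
    return 1 / q - 1 / lam

def dphi_rat(q, lam):
    return -1 / q**2

def v_rat(q):
    return 1 / q**2

# Trigonometric limit: Phi = (coth q - coth lam) exp(q coth lam), V = 1/sinh^2 q.
def phi_trig(q, lam):
    c = coth(lam)
    return (coth(q) - c) * math.exp(q * c)

def dphi_trig(q, lam):
    c = coth(lam)
    return (-1 / math.sinh(q)**2 + c * (coth(q) - c)) * math.exp(q * c)

def v_trig(q):
    return 1 / math.sinh(q)**2

def defect_7_14(phi, dphi, v, x, y, lam):
    lhs = dphi(x, lam) * phi(y, lam) - dphi(y, lam) * phi(x, lam)
    return abs(lhs - (v(y) - v(x)) * phi(x + y, lam))

def defect_7_15(phi, v, q, lam):
    return abs(phi(q, lam) * phi(-q, lam) - (v(lam) - v(q)))

for name, phi, dphi, v in (("rational", phi_rat, dphi_rat, v_rat),
                           ("trigonometric", phi_trig, dphi_trig, v_trig)):
    print(name, defect_7_14(phi, dphi, v, 0.4, 0.9, 1.3), defect_7_15(phi, v, 0.4, 1.3))
```

The rational function used here is the one quoted in the text, i.e. after the exponential factor has been removed by the similarity transformation.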
7.3 The r-matrix

In this section we compute the r-matrix associated with the Lax matrix
$$L(\lambda) = \sum_i p_i\,E_{ii} + \sum_{i \neq j} \Phi(q_{ij}, \lambda)\,f_{ij}\,E_{ij}. \tag{7.17}$$
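One should emphasize that it is only the reduced system ($f_{ii} = \alpha$) which is integrable. The Lax matrix (7.17) is not a function on the reduced phase space and so is not expected to have an r-matrix in the usual sense. This accounts for the extra term in eq. (7.18) below.

Proposition. The Poisson bracket of the Lax matrix eq. (7.17) is given by:
$$\{L_1(\lambda), L_2(\mu)\} = [r_{12}(\lambda, \mu), L_1(\lambda)] - [r_{21}(\mu, \lambda), L_2(\mu)] + [\mathcal{D}, r_{12}(\lambda, \mu)] \tag{7.18}$$
where $\mathcal{D} = \sum_i f_{ii}\,\frac{\partial}{\partial q_i}$ and the r-matrix is expressed as:
$$r_{12}(\lambda, \mu) = a(\lambda, \mu)\sum_i E_{ii} \otimes E_{ii} - \sum_{i \neq j} b(q_{ij}, \lambda, \mu)\,E_{ij} \otimes E_{ji} \tag{7.19}$$
with
$$a(\lambda, \mu) = \zeta(\lambda - \mu) - \zeta(\lambda) + \zeta(\mu), \qquad b(q, \lambda, \mu) = \frac{\sigma(\lambda - \mu - q)}{\sigma(\lambda - \mu)\,\sigma(q)}\,e^{(\zeta(\lambda) - \zeta(\mu))q}$$
Proof. We compute the various terms of eq. (7.18), and collect them according to the number of equal matrix indices. We have (all written indices are different, and sums are implied):
$$\begin{aligned}
\{L_1(\lambda), L_2(\mu)\} = {} & -\Phi(q_{ij}, \lambda)\,\Phi(q_{ji}, \mu)\,(f_{ii} - f_{jj})\,E_{ij} \otimes E_{ji} \\
& + \Phi'(q_{ij}, \mu)\,f_{ij}\,(E_{ii} - E_{jj}) \otimes E_{ij} - \Phi'(q_{ij}, \lambda)\,f_{ij}\,E_{ij} \otimes (E_{ii} - E_{jj}) \\
& + \Phi(q_{ij}, \lambda)\,\Phi(q_{ki}, \mu)\,f_{kj}\,E_{ij} \otimes E_{ki} - \Phi(q_{ij}, \lambda)\,\Phi(q_{jk}, \mu)\,f_{ik}\,E_{ij} \otimes E_{jk}
\end{aligned}$$
Similarly one has, using the antisymmetry of the r-matrix:
$$\begin{aligned}
[r_{12}, L_1(\lambda) + L_2(\mu)] = {} & \left(a(\lambda, \mu)\,\Phi(q_{ij}, \mu) + b(q_{ji}, \lambda, \mu)\,\Phi(q_{ij}, \lambda)\right) f_{ij}\,(E_{ii} - E_{jj}) \otimes E_{ij} \\
& + \left(a(\lambda, \mu)\,\Phi(q_{ij}, \lambda) + b(q_{ij}, \lambda, \mu)\,\Phi(q_{ij}, \mu)\right) f_{ij}\,E_{ij} \otimes (E_{ii} - E_{jj}) \\
& + \left(b(q_{ij}, \lambda, \mu)\,\Phi(q_{kj}, \mu) - b(q_{ik}, \lambda, \mu)\,\Phi(q_{kj}, \lambda)\right) f_{kj}\,E_{ij} \otimes E_{ki} \\
& + \left(b(q_{kj}, \lambda, \mu)\,\Phi(q_{ik}, \lambda) - b(q_{ij}, \lambda, \mu)\,\Phi(q_{ik}, \mu)\right) f_{ik}\,E_{ij} \otimes E_{jk}
\end{aligned}$$
Finally, we have:
$$[\mathcal{D}, r_{12}] = -b'(q_{ij}, \lambda, \mu)\,(f_{ii} - f_{jj})\,E_{ij} \otimes E_{ji}$$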
212
7 The Calogero–Moser model
Using the relations $a(\mu,\lambda) = -a(\lambda,\mu)$ and $b(-q,\mu,\lambda) = -b(q,\lambda,\mu)$, we see that eq. (7.18) reduces to the three identities:

$$\begin{aligned}
\Phi'(q,\mu) &= a(\lambda,\mu)\Phi(q,\mu) + b(-q,\lambda,\mu)\Phi(q,\lambda) \\
\Phi(q,\lambda)\Phi(q',\mu) &= b(q,\lambda,\mu)\Phi(q+q',\mu) - b(-q',\lambda,\mu)\Phi(q+q',\lambda) \\
b'(q,\lambda,\mu) &= \Phi(q,\lambda)\Phi(-q,\mu)
\end{aligned}$$

The first identity, written in terms of Weierstrass functions, reads:

$$\frac{\sigma(\lambda-q)\,\sigma(\mu)\,\sigma(\lambda-\mu+q)}{\sigma(\lambda)\,\sigma(\mu-q)\,\sigma(\lambda-\mu)\,\sigma(q)} = \zeta(\mu-q) + \zeta(q) + \zeta(\lambda-\mu) - \zeta(\lambda)$$

One observes that both sides are elliptic functions of $\lambda$ (and $\mu$) and have the same poles and residues, hence are equal. The second identity reads:

$$\sigma(q+q')\sigma(\lambda-\mu)\sigma(\lambda-q)\sigma(\mu-q') = \sigma(\lambda-\mu-q)\sigma(q')\sigma(\lambda)\sigma(\mu-q-q') + \sigma(\lambda-\mu+q')\sigma(\mu)\sigma(q)\sigma(\lambda-q-q')$$

which is true because both sides vanish at $\lambda = q$ and $\lambda = \mu$, are equal for $\lambda = 0$, and have the same monodromy properties when shifting $\lambda$ by periods. Finally, the third identity is a consequence of the first one.

Notice that $a(\lambda,\mu)$ and $b(q,\lambda,\mu)$ are true elliptic functions of both $\lambda$ and $\mu$. Note also that the r-matrix is antisymmetric, i.e. $r_{12}(\lambda,\mu) = -r_{21}(\mu,\lambda)$. The most remarkable feature of this r-matrix is that it depends on the dynamical variables $q_{ij}$, hence the name dynamical r-matrix.

Remark. Equation (7.18) holds for the non-reduced dynamical system (7.3). We see that the $\mathrm{Tr}\,L^n(\lambda)$ are not in involution for this system. Indeed, we have:

$$\{\mathrm{Tr}\,L^n(\lambda), \mathrm{Tr}\,L^m(\mu)\} = -nm \sum_{\substack{i,j=1 \\ i\neq j}}^{N} \Phi(q_{ij},\lambda)\Phi(q_{ji},\mu)(f_{ii}-f_{jj})\,[L^{n-1}(\lambda)]_{ij}\,[L^{m-1}(\mu)]_{ji} \tag{7.20}$$
As we have already seen, the fii are the moments of the diagonal group which acts on L(λ) by conjugation. The quantities Tr Ln (λ) are invariant under this action. It follows that one can compute their reduced Poisson bracket on the manifold (fii = α)i=1,...,N by just setting fii = α in eq. (7.20), see Chapter 14. Therefore Tr Ln (λ) are in involution for the reduced system. This proves integrability of the spin Calogero–Moser model. We will count the number of action variables later on.
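The three functional identities used in the proof of eq. (7.18) can be spot-checked numerically in the trigonometric limit. The sketch below is not from the original text. It assumes the degenerate forms $\Phi(q,\lambda) = (\coth q - \coth\lambda)e^{q\coth\lambda}$, $a(\lambda,\mu) = \coth(\lambda-\mu) - \coth\lambda + \coth\mu$ and $b(q,\lambda,\mu) = (\coth q - \coth(\lambda-\mu))\,e^{(\coth\lambda-\coth\mu)q}$, obtained by replacing $\sigma$ with $\sinh$ and $\zeta$ with $\coth$ (the Gaussian and linear corrections of the degenerate Weierstrass functions cancel in these combinations). The function names and sample points are illustrative.

```python
import math

def coth(x):
    return math.cosh(x) / math.sinh(x)

def phi(q, lam):
    """Trigonometric Phi(q, lambda), exponential factor included."""
    return (coth(q) - coth(lam)) * math.exp(q * coth(lam))

def dphi(q, lam):
    """Derivative of phi with respect to q."""
    c = coth(lam)
    return (c * (coth(q) - c) - 1.0 / math.sinh(q) ** 2) * math.exp(q * c)

def a(lam, mu):
    return coth(lam - mu) - coth(lam) + coth(mu)

def b(q, lam, mu):
    return (coth(q) - coth(lam - mu)) * math.exp((coth(lam) - coth(mu)) * q)

def db(q, lam, mu):
    """Derivative of b with respect to q."""
    c = coth(lam) - coth(mu)
    return (c * (coth(q) - coth(lam - mu)) - 1.0 / math.sinh(q) ** 2) * math.exp(c * q)

def identity_gaps(q, qp, lam, mu):
    """Left-hand side minus right-hand side for the three identities."""
    first = dphi(q, mu) - (a(lam, mu) * phi(q, mu) + b(-q, lam, mu) * phi(q, lam))
    second = phi(q, lam) * phi(qp, mu) - (
        b(q, lam, mu) * phi(q + qp, mu) - b(-qp, lam, mu) * phi(q + qp, lam)
    )
    third = db(q, lam, mu) - phi(q, lam) * phi(-q, mu)
    return first, second, third

gaps = identity_gaps(0.6, 0.35, 1.1, 0.45)
```

A vanishing residual at a few generic points is of course only a sanity check, not a substitute for the analytic argument on poles and periodicity given in the proof.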
Proposition. The classical r-matrix, eq. (7.19), satisfies the identity $YB = 0$ with

$$YB \equiv -\{L_1, r_{23}\} + \{L_2, r_{13}\} - \{L_3, r_{12}\} + [r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] \tag{7.21}$$

Proof. It is done by direct calculation. It is more interesting, however, to see how this is related to the Jacobi identity. Let us call

$$Z_{12} = [D, r_{12}], \qquad D = \sum_i f_{ii}\frac{\partial}{\partial q_i}$$

The Jacobi identity reads

$$\begin{aligned}
& \big[L_1,\ [r_{12}, r_{23}] + [r_{12}, r_{13}] + [r_{32}, r_{13}] + \{L_2, r_{13}\} - \{L_3, r_{12}\}\big] + \text{cyclic perm.} \\
& + [r_{23}, Z_{12}] + [r_{31}, Z_{23}] + [r_{12}, Z_{31}] - [r_{32}, Z_{13}] - [r_{13}, Z_{21}] - [r_{21}, Z_{32}] \\
& + \{L_1, Z_{23}\} + \{L_2, Z_{31}\} + \{L_3, Z_{12}\} = 0
\end{aligned}$$

Note that the term commuted with $L_1$ is not quite equal to the left-hand side of eq. (7.21). We want to show that the missing term, $\{L_1, r_{23}\}$, is produced by all the $Z_{ij}$ contributions. Using $r_{12}(\lambda,\mu) = -r_{21}(\mu,\lambda)$, we find

$$[r_{23}, Z_{12}] + [r_{31}, Z_{23}] + [r_{12}, Z_{31}] - [r_{32}, Z_{13}] - [r_{13}, Z_{21}] - [r_{21}, Z_{32}] = -\big[D,\ [r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}]\big]$$

Moreover, we easily compute $\{L_1, D\} = -\sum_k [L_1, E_{kk}]\,\partial_{q_k}$ so that:

$$\{L_1, Z_{23}\} = \{L_1, [D, r_{23}]\} = [D, \{L_1, r_{23}\}] + [\{L_1, D\}, r_{23}] = [D, \{L_1, r_{23}\}] - [L_1, \{L_1, r_{23}\}]$$

The Jacobi identity becomes

$$[L_1 + L_2 + L_3, YB] - [D, YB] = 0$$

where we used that $YB$ is invariant under cyclic permutations of the indices 1, 2, 3. As a result, the Jacobi identity is satisfied if eq. (7.21) holds, thereby giving a motivation to the computation showing that $YB = 0$.

Remark. Let us comment on the trigonometric limit of these formulae. Using the Lax matrix, eq. (7.16), we find

$$\{L_1(\lambda), L_2(\mu)\} = [r_{12}(\lambda,\mu), L_1(\lambda)] - [r_{21}(\mu,\lambda), L_2(\mu)] + \sum_{\substack{i,j=1 \\ i\neq j}}^{N} \frac{f_{ii}-f_{jj}}{\sinh^2(q_i-q_j)}\,E_{ij}\otimes E_{ji}$$

where

$$r_{12}(\lambda,\mu) = \coth(\lambda-\mu)\,C_{12} - \sum_{\substack{i,j=1 \\ i\neq j}}^{N} \coth(q_i-q_j)\,E_{ij}\otimes E_{ji}$$

and $C_{12}$ is the Casimir element of $sl(N)$:

$$C_{12} = \sum_{i,j=1}^{N} E_{ij}\otimes E_{ji}.$$
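As a small illustration of the tensor $C_{12} = \sum_{i,j} E_{ij}\otimes E_{ji}$, the sketch below (not from the original text) builds it as an $N^2\times N^2$ matrix in the usual Kronecker-product index ordering and applies it to a product vector $u\otimes v$. It acts as the permutation operator, $C_{12}(u\otimes v) = v\otimes u$, which is a standard fact. Function names are illustrative.

```python
def casimir(n):
    """C12 = sum_{i,j} E_ij (x) E_ji as an n^2 x n^2 nested list.

    Kronecker ordering: (A (x) B)[(a, c), (b, d)] = A[a][b] * B[c][d],
    with the pair (a, c) flattened to the index a * n + c.
    """
    size = n * n
    c12 = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            # E_ij (x) E_ji has a single 1 at row (i, j), column (j, i).
            c12[i * n + j][j * n + i] += 1
    return c12

def kron_vec(u, v):
    """Components of u (x) v in the same flattened ordering."""
    return [ui * vj for ui in u for vj in v]

def apply(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

u = [1.0, 2.0, 3.0]
v = [-1.0, 0.5, 4.0]
swapped = apply(casimir(3), kron_vec(u, v))
```

Here `swapped` coincides with `kron_vec(v, u)`, which is why the leading term $\coth(\lambda-\mu)\,C_{12}$ has the familiar form of a trigonometric r-matrix built on the permutation operator.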
7.4 The scalar Calogero–Moser model

The scalar Calogero–Moser model is defined by the Hamiltonian:

$$H_{\rm Cal} = \frac{1}{2}\sum_{i=1}^{N} p_i^2 + \frac{1}{2}\gamma^2 \sum_{\substack{i,j=1 \\ i\neq j}}^{N} V(q_i - q_j) \tag{7.22}$$
We show here that this model and its r-matrix can be obtained from eq. (7.18) by a Hamiltonian reduction procedure. Quite generally, we can parametrize the matrix $F = (f_{ij})$ of rank $l$ as follows:

$$f_{ij} = \sum_{r=1}^{l} b_i^r a_j^r \tag{7.23}$$

The $l$ vectors $b^r$ form a basis of the image of $F$, and the vectors $a^r$ form a basis of a supplementary space of the kernel of $F$. Moreover, the Poisson bracket, eq. (7.2), is reproduced if we set

$$\{a_i^r, b_j^s\} = -\delta_{rs}\delta_{ij}$$

The equations of motion for the quantities $a_i^r$, $b_i^r$ read

$$\dot a_i^r = \{H, a_i^r\} = -\sum_{k\neq i} V(q_{ik}) f_{ki}\, a_k^r, \qquad \dot b_i^r = \{H, b_i^r\} = \sum_{k\neq i} V(q_{ik}) f_{ik}\, b_k^r$$

These equations of motion reproduce eq. (7.4) on the reduced manifold $f_{ii} = \alpha$.

To recover the scalar Calogero–Moser model, we choose $l = 1$ in eq. (7.23). So there is only one pair of vectors $a$ and $b$. We simply denote their components by $a_i$, $b_i$. On these variables, the diagonal action eq. (7.7) reads:

$$a_i \longrightarrow d_i a_i, \qquad b_i \longrightarrow d_i^{-1} b_i \tag{7.24}$$

The integrable system is obtained by applying the method of Hamiltonian reduction under this group. We fix the moment to $f_{ii} = a_i b_i = \alpha = \sqrt{-1}\,\gamma$, which removes $N$ degrees of freedom. Then we have to quotient by the isotropy subgroup of the moment, which is again the group of diagonal matrices. At the end the $2N$ degrees of freedom $a_i$, $b_i$ are eliminated, leaving as reduced Hamiltonian the scalar Calogero–Moser model eq. (7.22). Note that $f_{ij}f_{ji} = \alpha^2 = -\gamma^2$.

In order to perform the reduction at the level of the Lax matrix, we remark that if $g = \mathrm{Diag}\,(a_i^{-1})_{i=1,\ldots,N}$, the matrix

$$L^{\rm Cal}(\lambda) = g L(\lambda) g^{-1} = \sum_i p_i E_{ii} + \alpha \sum_{i\neq j} \Phi(q_{ij},\lambda)\, E_{ij}$$
is invariant under the diagonal action and so is a function on the reduced phase space. This is the Lax matrix of the scalar Calogero–Moser model.

Proposition. The Poisson bracket of the Lax matrix $L^{\rm Cal}(\lambda)$ takes the r-matrix form:

$$\{L_1^{\rm Cal}(\lambda), L_2^{\rm Cal}(\mu)\} = [r_{12}^{\rm Cal}(\lambda,\mu), L_1^{\rm Cal}(\lambda)] - [r_{21}^{\rm Cal}(\mu,\lambda), L_2^{\rm Cal}(\mu)] \tag{7.25}$$

with

$$r_{12}^{\rm Cal}(\lambda,\mu) = a(\lambda,\mu)\sum_i E_{ii}\otimes E_{ii} - \sum_{i\neq j} b(q_{ij},\lambda,\mu)\, E_{ij}\otimes E_{ji} - \frac{1}{2}\sum_{i\neq j} \Phi(q_{ij},\mu)\,(E_{ii}+E_{jj})\otimes E_{ij}$$

This is a dynamical r-matrix which is no longer antisymmetric.

Proof. Since $L^{\rm Cal}(\lambda)$ is invariant under the symmetry group, we can compute the Poisson brackets of its matrix elements directly. Using eq. (2.13) in Chapter 2, we get:

$$r_{12}^{\rm Cal}(\lambda,\mu) = g_1 g_2 \Big( r_{12}(\lambda,\mu) + g_1^{-1}\{g_1, L_2(\mu)\} + \frac{1}{2}[u_{12}, L_2(\mu)] \Big) g_1^{-1} g_2^{-1}$$

where $u_{12} = g_1^{-1} g_2^{-1}\{g_1, g_2\}$ is here equal to zero. We get

$$r_{12}^{\rm Cal}(\lambda,\mu) = r_{12}(\lambda,\mu) - \sum_{i\neq j} \Phi(q_{ij},\mu)\, E_{ii}\otimes E_{ij}$$

Redefining

$$r_{12}^{\rm Cal}(\lambda,\mu) \longrightarrow r_{12}^{\rm Cal}(\lambda,\mu) + \frac{1}{2\alpha}\Big[\sum_i E_{ii}\otimes E_{ii},\ L_2^{\rm Cal}(\mu)\Big]$$

does not change eq. (7.25) and yields the r-matrix of the scalar Calogero–Moser model.

Finally, we give the formula for the matrix $M^{\rm Cal} = g M g^{-1} + \dot g g^{-1}$. We find $M_{ij}^{\rm Cal} = -(\dot a_i/a_i)\,\delta_{ij} - (1-\delta_{ij})\,\alpha\,\Phi'(q_i - q_j,\lambda)$. Using the equation of motion $\dot a_i = -\sum_{k\neq i} \alpha V(q_i - q_k)\, a_i$, we obtain

$$M_{ij}^{\rm Cal} = \alpha\,\delta_{ij} \sum_{k\neq i} V(q_i - q_k) - (1-\delta_{ij})\,\alpha\,\Phi'(q_i - q_j,\lambda)$$
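The redefinition step in the proof of eq. (7.25) is a purely algebraic statement: adding $\frac{1}{2\alpha}\sum_i E_{ii}\otimes[E_{ii}, L^{\rm Cal}(\mu)]$ to $-\sum_{i\neq j}\Phi(q_{ij},\mu)E_{ii}\otimes E_{ij}$ gives $-\frac12\sum_{i\neq j}\Phi(q_{ij},\mu)(E_{ii}+E_{jj})\otimes E_{ij}$. The following sketch, which is not from the original text, checks this with arbitrary numbers standing in for $\Phi(q_{ij},\mu)$ (the identity does not depend on their actual values). Elements of $\mathrm{Mat}_N\otimes\mathrm{Mat}_N$ are stored as four-index arrays `t[a][b][c][d]`, the coefficient of $E_{ab}\otimes E_{cd}$; all names are illustrative.

```python
def zero_tensor(n):
    return [[[[0j] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]

def gauge_shift(phi):
    """-sum_{i != j} phi[i][j] E_ii (x) E_ij."""
    n = len(phi)
    t = zero_tensor(n)
    for i in range(n):
        for j in range(n):
            if i != j:
                t[i][i][i][j] -= phi[i][j]
    return t

def redefinition(phi, p, alpha):
    """(1 / (2 alpha)) sum_i E_ii (x) [E_ii, L], with L_ij = p_i delta_ij + alpha phi_ij."""
    n = len(phi)
    lax = [[p[i] if i == j else alpha * phi[i][j] for j in range(n)] for i in range(n)]
    t = zero_tensor(n)
    for i in range(n):
        for c in range(n):
            for d in range(n):
                # [E_ii, L]_{cd} = delta_{ci} L_{id} - L_{ci} delta_{di}
                comm = (lax[i][d] if c == i else 0) - (lax[c][i] if d == i else 0)
                t[i][i][c][d] += comm / (2 * alpha)
    return t

def symmetric_form(phi):
    """-(1/2) sum_{i != j} phi[i][j] (E_ii + E_jj) (x) E_ij."""
    n = len(phi)
    t = zero_tensor(n)
    for i in range(n):
        for j in range(n):
            if i != j:
                t[i][i][i][j] -= phi[i][j] / 2
                t[j][j][i][j] -= phi[i][j] / 2
    return t

def max_gap(phi, p, alpha):
    n = len(phi)
    lhs_a, lhs_b, rhs = gauge_shift(phi), redefinition(phi, p, alpha), symmetric_form(phi)
    return max(
        abs(lhs_a[a][b][c][d] + lhs_b[a][b][c][d] - rhs[a][b][c][d])
        for a in range(n) for b in range(n) for c in range(n) for d in range(n)
    )

phi_values = [[0.0, 1.3, -0.7], [0.4, 0.0, 2.1], [-1.9, 0.6, 0.0]]  # stand-ins for Phi(q_ij, mu)
gap = max_gap(phi_values, p=[0.2, -1.1, 0.5], alpha=0.8j)
```

The momenta drop out of the commutator, so only the off-diagonal part of $L^{\rm Cal}$ contributes, which is why the factor $1/\alpha$ in the redefinition cancels the $\alpha$ in the Lax matrix.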
7.5 The spectral curve
The spectral curve of the spin Calogero–Moser model is defined as usual:

$$\Gamma:\quad \Gamma(\lambda,\mu) \equiv \det\,(L(t,\lambda) - \mu I) = 0 \tag{7.26}$$

The curve $\Gamma$ is time-independent due to the Lax equation, eq. (7.11). Note that $\Gamma$ is invariant under the symmetries eq. (7.7).

Proposition. The equation of the spectral curve takes the form:

$$\Gamma(\lambda,\mu) \equiv \sum_{i=0}^{N} r_i(\lambda)\,\mu^i = 0 \tag{7.27}$$

where the $r_i(\lambda)$ are elliptic functions of $\lambda$, independent of $t$, which can be expanded on the Weierstrass $\wp$ function and its derivatives as:

$$r_i(\lambda) = I_i^0 + \sum_{s=0}^{N-i-2} I_{i,s}\,\partial_\lambda^s \wp(\lambda) \tag{7.28}$$

In a neighbourhood of $\lambda = 0$, the function $\Gamma(\lambda,\mu)$ can be factorized as:

$$\Gamma(\lambda,\mu) = \prod_{i=1}^{N} (\mu - \mu_i(\lambda)); \qquad \mu_i(\lambda) = (\alpha - \nu_i)\lambda^{-1} + h_i(\lambda) \tag{7.29}$$

where the $h_i(\lambda)$ are regular functions of $\lambda$, and the $\nu_i$ are the eigenvalues of the matrix $F = (f_{ij})$, so that $\nu_i = 0$, $i > l$ and $f_{ii} = \alpha$.

Proof. The matrix elements $L_{ij}(t,\lambda)$ of the Lax matrix are elliptic functions of the variable $\lambda$ having an essential singularity at $\lambda = 0$. The functions $r_i(\lambda)$, however, are meromorphic because the essential singularity in $L(\lambda)$ can be gauged away near $\lambda = 0$ since we can write:

$$L(t,\lambda) = G(t,\lambda)\,\tilde L(t,\lambda)\,G^{-1}(t,\lambda), \qquad G_{ij} = \delta_{ij}\exp(\zeta(\lambda) q_i(t)) \tag{7.30}$$

where the $\tilde L_{ij}(t,\lambda)$ are meromorphic functions of $\lambda$ in a neighbourhood of the point $\lambda = 0$. Incorporating the constraint $f_{ii} = \alpha$, we have:

$$\tilde L(t,\lambda) = \frac{1}{\lambda}\,(\alpha I - F(t)) + O(\lambda^0) \tag{7.31}$$
where $F(t)$ is the matrix of elements $f_{ij}(t)$. Therefore the elliptic functions $r_i(\lambda)$ have poles of degree at most $N - i$ at the point $\lambda = 0$, so that they can be expanded as linear combinations of the function $\wp(\lambda)$ and its derivatives. We can always factorize the polynomial in $\mu$, $\Gamma(\lambda,\mu)$, around $\lambda = 0$. The branches $\mu_i(\lambda)$ in eq. (7.29) have simple poles, of the stated form due to eq. (7.31). In particular, since $F$ is of rank $l$, the eigenvalue $\nu_i = 0$ has multiplicity $N - l$.

The coefficients $I_i^0$, $I_{i,s}$ of this expansion are the integrals of motion of the spin Calogero–Moser model. They define the moduli of the algebraic curve $\Gamma$. We are now in a position to compute the number of action variables.

Proposition. The number of action variables is half the dimension of the phase space:

$$Nl - l(l+1)/2 + 1 = \frac{1}{2}\dim M$$

Proof. The spectral equation $\Gamma(\lambda,\mu) = 0$ depends on $N(N+1)/2$ parameters $I_i^0$, $I_{i,s}$. However, they are not all independent. The constraints come from the conditions $\nu_i = 0$, $i > l$ and $\nu_i$ non-dynamical for $1 \le i \le l$ in eq. (7.29). To see how they translate on the parameters $I_i^0$, $I_{i,s}$, let us introduce the variable $\tilde\mu = \mu - \alpha\lambda^{-1}$. Then we have $\Gamma(\lambda, \tilde\mu + \alpha\lambda^{-1}) = \tilde\Gamma(\lambda,\tilde\mu)$, which can be expanded as:

$$\tilde\Gamma(\lambda,\tilde\mu) = \sum_{i=0}^{N} \tilde\Gamma_i(\tilde\mu)\,\lambda^{-i} + R(\lambda,\tilde\mu), \tag{7.32}$$

where the $\tilde\Gamma_i(\tilde\mu)$ are polynomials in $\tilde\mu$ and $R(\lambda,\tilde\mu) = O(\lambda)$ is regular at $\lambda = 0$. One can check easily that the degree of $\tilde\Gamma_i(\tilde\mu)$ is $N - i$ and that its coefficients are linear combinations of the parameters $I_i^0$, $I_{i,s}$. The conditions $\nu_i = 0$, $i > l$, imply that

$$\tilde\Gamma_i(\tilde\mu) = 0, \qquad i > l. \tag{7.33}$$

Altogether this is equivalent to a set of $(N-l)(N-l+1)/2$ linear equations on the parameters $I_i^0$, $I_{i,s}$. The total number of independent parameters is therefore equal to $Nl - l(l-1)/2$. Next, recall that the expansion around $\lambda = 0$ of the branch $\mu_i$ for $i = 1,\ldots,l$ reads $\mu_i = (\alpha - \nu_i)\lambda^{-1} + O(1)$. This yields $l - 1$ additional relations (since the $\nu_i$ are constants characterizing the orbit of $F$, not to be counted as dynamical variables). Note that this gives only $l - 1$ constraints, and not $l$, because we have $N\alpha = \sum_i \nu_i = \mathrm{Tr}\,F$. This condition also accounts for the fact that the elliptic function $r_{N-1}(\lambda)$ is constant, since it cannot have a single pole of order 1. Finally, the number of independent parameters is equal to $Nl - l(l-1)/2 - (l-1)$, which is exactly half the dimension of the reduced phase space.

We now compute the genus of the spectral curve $\Gamma$.
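The bookkeeping in this proof is easy to check mechanically. The sketch below is not from the original text: it recomputes the number of independent parameters from the three counts quoted above and compares it with the closed form $Nl - l(l+1)/2 + 1$. It also evaluates the Riemann–Hurwitz relation $2g - 2 = N(2g_0 - 2) + \nu$ with $g_0 = 1$ and $\nu = l(N-1) + (N-l)l$, the branch-point count used in the genus proposition that follows. Function names are illustrative.

```python
def independent_parameters(n, l):
    """Count parameters of the spectral curve after imposing the constraints."""
    total = n * (n + 1) // 2                    # all I_i^0, I_{i,s}
    rank_constraints = (n - l) * (n - l + 1) // 2  # from nu_i = 0 for i > l
    orbit_constraints = l - 1                   # non-dynamical nu_i, i <= l
    return total - rank_constraints - orbit_constraints

def action_variable_count(n, l):
    """Closed form N l - l (l + 1) / 2 + 1 stated in the proposition."""
    return n * l - l * (l + 1) // 2 + 1

def genus_from_riemann_hurwitz(n, l, base_genus=1):
    """Solve 2g - 2 = N (2 g0 - 2) + nu for g."""
    nu = l * (n - 1) + (n - l) * l
    return (n * (2 * base_genus - 2) + nu) // 2 + 1

checks = [
    (n, l, independent_parameters(n, l), action_variable_count(n, l),
     genus_from_riemann_hurwitz(n, l))
    for n in range(1, 9) for l in range(1, n + 1)
]
```

For every pair $(N, l)$ in the list, the three numbers agree; for the scalar model ($l = 1$) they all reduce to $N$.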
Proposition. For generic values of the action variables the genus of the spectral curve is given by:

$$g = Nl - \frac{l(l+1)}{2} + 1 \tag{7.34}$$

Proof. The idea of the proof is the same as in Chapter 5 and uses the Riemann–Hurwitz theorem. There is a difference, however, because here the base curve is of genus 1. Equation (7.26) presents the compact Riemann surface $\Gamma$ as an $N$-sheeted branched covering of the base curve of the variable $\lambda$. The sheets are the $N$ roots in $\mu$. By the Riemann–Hurwitz formula we have $2g - 2 = N(2g_0 - 2) + \nu$, where $g_0$ is the genus of the base curve, $g_0 = 1$, and $\nu$ is the number of branch points, i.e. the number of values of $\lambda$ for which $\Gamma(\lambda,\mu)$ has a double root in $\mu$. This is the number of zeroes of $\partial_\mu\Gamma(\lambda,\mu)$ on the surface $\Gamma(\lambda,\mu) = 0$. But $\partial_\mu\Gamma(\lambda,\mu)$ is a meromorphic function on the surface, hence it has as many zeroes as poles. The poles are located above $\lambda = 0$, or $\mu = \infty$ which is the same, and are easy to count. Let $P_i$ be the points of $\Gamma$ lying on the different sheets over the point $\lambda = 0$. In the neighbourhood of $P_i$ the function $\mu$ has the expansion $\mu_i = (\alpha - \nu_i)\lambda^{-1} + h_i(\lambda)$. It follows that the function $\partial\Gamma/\partial\mu$ in the neighbourhood of $P_i$ has the form

$$\partial\Gamma/\partial\mu = \prod_{j\neq i} \big[(\nu_j - \nu_i)\lambda^{-1} - (h_j(\lambda) - h_i(\lambda))\big]$$

From this, we see that on each of the $l$ sheets $(\lambda, \mu_i(\lambda))$ $(i = 1,\ldots,l)$ we have one pole of order $N - 1$. On each of the $N - l$ sheets $(\lambda, \mu_i(\lambda))$ $(i = l+1,\ldots,N)$ we have one pole of order $l$. Finally, $\nu = l(N-1) + (N-l)l$. Inserting this value in the Riemann–Hurwitz formula yields the result.

The last two propositions show that the number of independent action variables is exactly the genus of the spectral curve. Among these, one is the total momentum associated with translation invariance and we will have to factorize by this symmetry.

7.6 The eigenvector bundle

As in Chapter 5, we consider at any point $P = (\lambda,\mu)$ of the spectral curve the unique eigenvector $\Psi(0,P)$ of $L(0,\lambda)$ with eigenvalue $\mu$, normalized by $\psi_1(0,P) = 1$. We want to study the analyticity properties in $\lambda$ of this eigenvector. A first difference with the case of rational Lax matrices is that it has an essential singularity at the points $P_i$ above $\lambda = 0$.
Proposition. In the neighbourhood of the point $P_i$ the component $\psi_j(0,P)$ has the form

$$\psi_j(0,P) = \exp\big[\zeta(\lambda)(q_j(0) - q_1(0))\big]\,\big(c_j^{(i)} + O(\lambda)\big) \tag{7.35}$$

where the $c_j^{(i)}$ are the components of the eigenvectors of the matrix $F$ corresponding to the non-zero eigenvalue $\nu_i$ for $i = 1,\ldots,l$:

$$\sum_{j=1}^{N} f_{kj}\, c_j^{(i)} = \nu_i\, c_k^{(i)}$$

while for $i > l$ the $c^{(i)}$ form a basis of the kernel of $F$.

Proof. From equation (7.30), we have $\Psi(0,P) = G(0,\lambda)\tilde\Psi(0,P)$, where $\tilde\Psi(0,P)$ is an eigenvector of $\tilde L(0,\lambda)$. Using eq. (7.31), we have $\tilde\Psi(0,P) = c^{(i)} + O(\lambda)$, where $c^{(i)}$ is an eigenvector of $F$. More precisely, for $i = 1,\ldots,l$, $c^{(i)}$ is the unique eigenvector of $F$ with eigenvalue $\nu_i \neq 0$ normalized by $c_1^{(i)} = 1$, while for $i > l$ it is determined by the limit for $P \to P_i$ of the normalized eigenvector of $\tilde L(P)$ corresponding to the eigenvalue $\mu_i(P)$. The degeneracy has been lifted by higher order terms in $\lambda$. Therefore we have $\psi_j(0,P) = (c_j^{(i)} + O(\lambda))\exp(\zeta(\lambda) q_j(0))$. Normalizing $\psi_1(0,P) = 1$ yields the result.

We can now compute the number of poles of $\Psi$ on $\Gamma$. The result is slightly different from the case of rational Lax matrices, where it was found to be $g + N - 1$.

Proposition. The number of poles of $\Psi(0,P)$ is:

$$m = Nl - \frac{l(l+1)}{2} = g - 1 \tag{7.36}$$
Proof. As in the case of rational Lax matrices, let us introduce the function $W(\lambda)$ of the complex variable $\lambda$ defined by:

$$W(\lambda) = (\mathrm{Det}\,|\psi_i(M_j)|)^2$$

where the $M_j$ are the $N$ points above $\lambda$. It is well-defined on the base curve since the $\mathrm{Det}^2$ does not depend on the order of the $M_j$. This function has an essential singularity of the form $e^{2\zeta(\lambda)\sum_i (q_i(0) - q_1(0))}$ at $\lambda = 0$. This does not affect the property that the number of poles of $W(\lambda)$ is equal to the number of its zeroes. This property is obtained by considering the sum of residues of $W'/W$ on the $\lambda$-torus, and noting that $\zeta'(\lambda) = -\wp(\lambda) = -\lambda^{-2} + O(\lambda^2)$ is elliptic and has no residue. Clearly $W(\lambda)$ has a double pole where there exists a point $P$ above $\lambda$ at which $\Psi(P)$ has a simple pole. As in the rational case, we show that $W(\lambda)$ has a simple zero for values of $\lambda$ corresponding to a branch point of the covering, hence $m = \nu/2 = g - 1$, by Riemann–Hurwitz. Here lies the difference with the rational case, because $g_0 = 1$, see Chapter 5.

This result looks different from what we got in the case of rational Lax matrices. There, we had $g + N - 1$ poles for a meromorphic eigenvector at time $t = 0$. However, $N - 1$ poles were located above $\lambda = \infty$ and were not dynamical. Here all poles are dynamical and we have $g - 1$ of them. This is a surprising result. In fact, from eq. (7.35), we see that the components of the eigenvector $\psi_i(0,P)$ at time $t = 0$ are Baker–Akhiezer functions. But generically, such a function has at least $g$ poles, not $g - 1$. So, we have to admit that we are not in a generic situation. It was noted in Chapter 3 that when the genus $g$ of the base curve is greater than or equal to 1, the consistency equations of the Lax pair become overdetermined, preventing genericity of the Lax matrix. In fact, this will be a crucial ingredient in the solution of the Calogero–Moser model.

7.7 Time evolution

The next step is to compute the time evolution of the eigenvectors. We let the eigenvector evolve according to the natural equation:

$$\frac{d\Psi}{dt} = M\Psi \tag{7.37}$$
We choose as initial condition the eigenvector $\Psi(0,P)$ normalized with its first component equal to 1. Of course, at subsequent times this normalization will not hold any more. However, we know that, in this setting, the poles of $\Psi(t,P)$ do not evolve with time, see Chapter 5.

Proposition. The coordinates $\psi_j(t,P)$ of the vector-function $\Psi(t,P)$ are meromorphic functions on $\Gamma$ except at the points $P_i$ above $\lambda = 0$. Their poles $\gamma_1,\ldots,\gamma_{g-1}$ do not depend on $t$. In the neighbourhood of $P_i$ they have the form

$$\psi_j(t,P) = c_j^{(i)}(t,\lambda)\,\exp\big[\zeta(\lambda)(q_j(t) - q_1(0)) + m_i(\lambda)\,t\big] \tag{7.38}$$

where the $c_j^{(i)}(t,\lambda)$ are regular functions of $\lambda$ in a neighbourhood of $\lambda = 0$, $c_j^{(i)}(t,\lambda) = c_j^{(i)}(t) + O(\lambda)$. Here the $c^{(i)}(t)$ are eigenvectors of the matrix $F(t) = (f_{ij}(t))$ corresponding to the eigenvalues $\nu_i$, and:

$$m_i(\lambda) = (-\alpha + \nu_i)\lambda^{-2} - h_i(0)\lambda^{-1} \tag{7.39}$$
is the singular part of $-\lambda^{-1}\mu_i(\lambda)$ at $\lambda = 0$, see eq. (7.29).

Proof. Let us consider the vector $\tilde\Psi(t,P)$ defined as

$$\Psi(t,P) = G(t,\lambda)\,\tilde\Psi(t,P) \tag{7.40}$$

where $G(t,\lambda)$ is defined in eq. (7.30) and let $\Psi(t,P)$ evolve according to eq. (7.37). The vector $\tilde\Psi(t,P)$ is an eigenvector of the matrix $\tilde L(t,\lambda)$ and evolves according to the equation

$$(\partial_t - \tilde M(t,\lambda))\,\tilde\Psi(t,P) = 0, \qquad \tilde M = -G^{-1}\partial_t G + G^{-1} M G \tag{7.41}$$

From eqs. (7.12, 7.13) it follows that:

$$\tilde M_{ij} = -\delta_{ij}\,\zeta(\lambda)\,\dot q_i - (1-\delta_{ij})\, f_{ij}\, \frac{\sigma(\lambda - q_{ij})}{\sigma(\lambda)\,\sigma(q_{ij})}\,\big[\zeta(q_{ij} - \lambda) - \zeta(q_{ij}) + \zeta(\lambda)\big]$$

so that collecting the coefficient of the $1/\lambda$ term we can write:

$$\tilde M(t,\lambda) = -\lambda^{-1}\tilde L(t,\lambda) + O(\lambda^0) \tag{7.42}$$

Hence around $P_i$ we have:

$$\partial_t \tilde\Psi(t,P) = \Big(-\frac{1}{\lambda}\mu_i(\lambda) + O(1)\Big)\,\tilde\Psi(t,P) \tag{7.43}$$

The quantity

$$m_i(\lambda) = -(\lambda^{-1}\mu_i(\lambda))_- = (-\alpha + \nu_i)\lambda^{-2} - h_i(0)\lambda^{-1}$$

where the subscript $-$ denotes the singular part at $\lambda = 0$, is independent of time because so is $\mu_i(\lambda)$. Integrating eq. (7.43), multiplying by $G(t,\lambda)$ and normalizing $\psi_1(0,P) = 1$, we get the result.

7.8 Reconstruction formulae

We now reconstruct the original dynamical variables in terms of the Riemann surface $\Gamma$ and the poles of the eigenvectors. From eq. (7.38) we see that the components $\psi_i(t,P)$ of the eigenvector are Baker–Akhiezer functions. Their behaviour above $\lambda = 0$ is

$$\psi_i(P) = \exp\big[(q_i(t) - q_1(0))\lambda^{-1} + m_j(\lambda)\,t\big]\,\big(c_i^{(j)}(t) + O(\lambda)\big), \qquad P \to P_j \tag{7.44}$$

As we already mentioned, there is a paradox with the number of poles of the Baker–Akhiezer function. Generically, such a function has $g$ poles.
The function then exists and is unique up to normalization. In particular, its zeroes are completely determined. One way to construct a Baker–Akhiezer function with $g - 1$ poles is to let one of the zeroes of the generic function cancel one of its poles. Clearly, this gives a relation between the parameters defining the function, specifically the $q_i(t)$ defining the essential singularity and the moduli of the curve.

Let $\psi_i(t,P)$ be the Baker–Akhiezer function with the singularities eq. (7.44) at the points $P_j$ above $\lambda = 0$, and a divisor of $g$ poles $(\gamma_0, \gamma_1, \ldots, \gamma_{g-1})$. We will denote by $(\eta_0, \eta_1, \ldots, \eta_{g-1})$ the divisor of its zeroes. By the general formula eq. (5.48) in Chapter 5, we have:

$$\psi_i(t,P) = d_i(t)\, e^{\left[(q_i(t) - q_1(0))\int^P \Omega_1 + t\int^P \Omega_2\right]}\; \frac{\theta\big(A(P) - U_1(q_i(t) - q_1(0)) - U_2 t - \zeta\big)}{\theta(A(P) - \zeta)}$$

where $A(P)$ is the Abel map, and $\Omega_1$ and $\Omega_2$ are normalized second kind Abelian differentials with singularities $d\lambda^{-1}$ and $dm_j(\lambda)$ at the points $P_j$ above $\lambda = 0$ respectively. Note that the differentials are independent of the index $i$ of the component considered, and so are the vectors $U_1$ and $U_2$ of their $b$-periods. The functions $d_i(t)$ are arbitrary normalizations. According to Riemann's theorem, the divisors of the poles and zeroes of $\psi_i(t,P)$ satisfy:

$$A(\eta_0) + \sum_{k=1}^{g-1} A(\eta_k) - U_1(q_i(t) - q_1(0)) - U_2 t - \zeta = -K$$

$$A(\gamma_0) + \sum_{k=1}^{g-1} A(\gamma_k) - \zeta = -K$$

Let us assume that the zero $\eta_0$ coincides with the pole $\gamma_0$. We then get the condition

$$U_1 q_i(t) + U_2 t + V = -K + \sum_{k=1}^{g-1} A(\eta_k) \tag{7.45}$$

where we have denoted $V = -K - U_1 q_1(0) + \sum_{k=1}^{g-1} A(\gamma_k)$, a vector in $\mathrm{Jac}\,(\Gamma)$ independent of time. Now the expression in the right-hand side of this equation belongs to the zero divisor of the theta-function (see Chapter 15). This is the constraint we were looking for. As a consequence we have

Proposition. The quantities $q_i(t)$ satisfying the equation

$$\theta(U_1 q_i(t) + U_2 t + V) = 0 \tag{7.46}$$
are the solutions of the equations of motion of the spin Calogero–Moser model. Moreover, the spin variables are reconstructed with the matrix $C$ of eigenvectors of $F = C\nu C^{-1}$ with:

$$c_i^{(j)}(t) = e^{\alpha_j q_i(t)}\,\theta(U_1 q_i(t) + U_2 t + V_j) \tag{7.47}$$

where $V_j$ is a constant vector.

Proof. Equation (7.46) expresses the fact that the left-hand side of eq. (7.45) belongs to the zero divisor of the theta-function. Note that all $q_i(t)$ satisfy the same equation which describes, for all values of time, the intersection of a straight line in the Jacobian torus with the theta divisor.

To proceed to the reconstruction of the spin variables, it is enough to compute the matrix of eigenvectors $C$. As we have shown, the columns of $C$ are the vectors $c^{(j)}$ whose components are given by the limit when $\lambda \to 0$ of the following expression, taken for $P = (\lambda, \mu_j(\lambda))$:

$$c_i^{(j)}(t) = \lim_{\lambda\to 0}\, d_i(t)\, e^{\left[(q_i(t) - q_1(0))\left(\int^P \Omega_1 - \frac{1}{\lambda}\right) + t\left(\int^P \Omega_2 - m_j(\lambda)\right)\right]} \times \frac{\theta\big(A(P) - U_1(q_i(t) - q_1(0)) - U_2 t - \zeta\big)}{\theta(A(P) - \zeta)}$$

The limits $\alpha_j = \lim_{\lambda\to 0}\big(\int^P \Omega_1 - \frac{1}{\lambda}\big)$ and $\beta_j = \lim_{\lambda\to 0}\big(\int^P \Omega_2 - m_j(\lambda)\big)$ are well-defined and depend on the point $P_j$. Note that if $c_i^{(j)} \to d_i c_i^{(j)}$ we have $f_{ij} \to d_i f_{ij} d_j^{-1}$ and we know that we must quotient by this diagonal action. Moreover, if we change the normalization of the vector $c^{(j)}$, i.e. $c_i^{(j)} \to d_j c_i^{(j)}$, the matrix $F$ is invariant. Hence we can drop all factors that can be absorbed in these invariances, and we can choose $c_i^{(j)}$ as in eq. (7.47) with the constant vector $V_j = \zeta - U_1 q_1(0) - A(P_j)$.
7.9 Symplectic structure

To compute the symplectic form we need to consider the inverse $\hat\Psi^{-1}(\lambda)$ of the matrix $\hat\Psi(\lambda)$ whose columns are the eigenvectors at the $N$ points above $\lambda$. The matrix $\hat\Psi^{-1}(\lambda)$ is built by considering the adjoint system $\Psi^+(P)(L(\lambda) - \mu) = 0$. The row vector $\Psi^+(P)$ is reconstructed from its analytical properties, exactly like $\Psi(P)$. The rows of the matrix $\hat\Psi^{-1}(\lambda)$ are the values of the row vector:

$$\Psi^{(-1)}(P) \equiv \frac{\Psi^+(P)}{\Psi^+(P)\Psi(P)}$$

at the $N$ points $P_j$ above $\lambda$. As before $\Psi^+(P)\Psi(P)$ is a meromorphic function on $\Gamma$ with zeroes at the branch points of the covering $\mu \to \lambda$. Note that this function is regular at $\lambda = 0$, in particular the essential singularities cancel. It follows that $\Psi^{(-1)}(P)$ has poles at the branch points, and zeroes at the divisor of poles of $\Psi$.

We remarked already several times that the components of the eigenvector can be written as ratios of suitable minors of the matrix $L(\lambda) - \mu$. In particular, the poles of $\Psi(P)$ are obtained as the zeroes of the first minor. Note that in this minor, the variables $q_1$, $p_1$ have disappeared. This corresponds to a reduction of the system by translational symmetry, which leaves a phase space of dimension $2(g-1)$, hence the $g - 1$ poles of $\Psi(P)$. In particular one can choose the origin such that at initial time $q_1(0) = 0$.

Proposition. The symplectic form of the spin Calogero–Moser model is

$$\omega = \sum_{i=1}^{g-1} d\lambda_{\gamma_i} \wedge d\mu_{\gamma_i} = \sum_{i=1}^{N} dp_i \wedge dq_i - \omega_K \tag{7.48}$$
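where $(\lambda_{\gamma_i}, \mu_{\gamma_i})$ are the coordinates of the dynamical divisor, and $\omega_K$ is the Kirillov symplectic form on the orbit defined by $F = C\nu C^{-1}$, i.e. $\omega_K = \mathrm{Tr}\,\big(\nu\, C^{-1}\delta C \wedge C^{-1}\delta C\big)$.

Proof. As usual, we consider the form $K = K_1 + K_2 + K_3$ with:

$$K_1 = \Psi^{(-1)}(P)\,\delta L(\lambda) \wedge \delta\Psi(P)\, d\lambda, \qquad K_2 = \Psi^{(-1)}(P)\,\delta\mu \wedge \delta\Psi(P)\, d\lambda, \qquad K_3 = \delta(\log \partial_\mu\Gamma) \wedge \delta\mu\, d\lambda$$

and write that the sum of its residues vanishes to prove the equality of the two expressions for $\omega$. The poles of $K$ are the poles of $\Psi(P)$, $\{\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i}),\ i = 1,\ldots,g-1\}$, the branch points, and the points $P_j$ above $\lambda = 0$. At the points $\gamma_i$, the computation is exactly similar to that in Chapter 5, yielding a sum of residues $2\sum_{i=1}^{g-1} \delta\mu_{\gamma_i} \wedge \delta\lambda_{\gamma_i}$ coming from equal contributions of $K_1$ and $K_2$. At the branch points the residues from $K_1$, $K_2$, $K_3$ cancel, as in the rational case. The new features occur above $\lambda = 0$. It is easy to see that $K_3$ is regular above $\lambda = 0$ because $\delta\mu$ is regular at these points, since the coefficients of the polar part of $\mu_j$ are non-dynamical. The sum of residues of $K_1$ and $K_2$ at the $P_j$ can be written in matrix form as the residue of forms on the base as follows:

$$E_1 \equiv \sum_j \mathrm{Res}_{P_j} K_1 = \mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\,\big(\hat\Psi^{-1}\delta L(\lambda) \wedge \delta\hat\Psi\big)\, d\lambda$$

$$E_2 \equiv \sum_j \mathrm{Res}_{P_j} K_2 = -\mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\,\big(\delta\hat\Psi \wedge \delta\hat\mu\, \hat\Psi^{-1}\big)\, d\lambda$$

where $\hat\mu$ is the diagonal matrix of the $\mu_j(\lambda)$. Since this is a local computation we first extract the essential singularities by setting $\hat\Psi = G\tilde\Psi$ and $L = G\tilde L G^{-1}$ with $G$ given in eq. (7.30). We get

$$E_1 = \mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\,\Big(\big([G^{-1}\delta G, \tilde L] + \delta\tilde L\big) \wedge \big(G^{-1}\delta G + \delta\tilde\Psi\,\tilde\Psi^{-1}\big)\Big)\, d\lambda$$

$$E_2 = -\mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\,\Big(G^{-1}\delta G \wedge \tilde\Psi\,\delta\hat\mu\,\tilde\Psi^{-1} + \delta\tilde\Psi \wedge \delta\hat\mu\,\tilde\Psi^{-1}\Big)\, d\lambda$$

Note that $\mathrm{Tr}\,[G^{-1}\delta G, \tilde L] \wedge G^{-1}\delta G = 0$ because $G$ is diagonal, and $\mathrm{Res}_{\lambda=0}\, \delta\tilde\Psi \wedge \delta\hat\mu\,\tilde\Psi^{-1} = 0$ since $\delta\hat\mu$ is regular at $\lambda = 0$. Collecting the remaining terms, we have:

$$E_1 + E_2 = \mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\,\Big(\delta\tilde L \wedge G^{-1}\delta G + \delta\tilde L \wedge \delta\tilde\Psi\,\tilde\Psi^{-1} + [G^{-1}\delta G, \tilde L] \wedge \delta\tilde\Psi\,\tilde\Psi^{-1} - G^{-1}\delta G \wedge \tilde\Psi\,\delta\hat\mu\,\tilde\Psi^{-1}\Big)\, d\lambda$$

Using $\tilde L\tilde\Psi = \tilde\Psi\hat\mu$ and $\tilde\Psi^{-1}\tilde L = \hat\mu\,\tilde\Psi^{-1}$, we get

$$\mathrm{Tr}\,[G^{-1}\delta G, \tilde L] \wedge \delta\tilde\Psi\,\tilde\Psi^{-1} = \mathrm{Tr}\, G^{-1}\delta G \wedge (\tilde L\,\delta\tilde\Psi - \delta\tilde\Psi\,\hat\mu)\tilde\Psi^{-1} = \mathrm{Tr}\, G^{-1}\delta G \wedge (\tilde\Psi\,\delta\hat\mu - \delta\tilde L\,\tilde\Psi)\tilde\Psi^{-1}$$

where in the last step we have used $\tilde L\,\delta\tilde\Psi - \delta\tilde\Psi\,\hat\mu = \tilde\Psi\,\delta\hat\mu - \delta\tilde L\,\tilde\Psi$. Hence our expression simplifies to:

$$E_1 + E_2 = \mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\,\Big(2\,\delta\tilde L \wedge G^{-1}\delta G + \delta\tilde L \wedge \delta\tilde\Psi\,\tilde\Psi^{-1}\Big)\, d\lambda$$

Note that $(G^{-1}\delta G)_{ij} = \lambda^{-1}\delta q_i\,\delta_{ij} + O(1)$ so that the first term contributes a residue $2\sum_i \delta p_i \wedge \delta q_i$. For the second term we note that $\tilde\Psi = C + O(\lambda)$, in the reduced system where $q_1(0) = 0$, and $\tilde L = -\lambda^{-1}(F - \alpha) + O(1)$ so that the residue is $-\mathrm{Tr}\,\big(\delta F \wedge \delta C\, C^{-1}\big)$ since $\delta\alpha = 0$. Remembering that $F = C\nu C^{-1}$, this reads $-2\,\mathrm{Tr}\,\big(\nu\, C^{-1}\delta C \wedge C^{-1}\delta C\big)$. We recognize the Kostant–Kirillov form on the coadjoint orbit of $\nu$, see Chapter 3. Writing that the sum of residues vanishes proves the proposition.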
This result shows that $\lambda_{\gamma_i}$, $\mu_{\gamma_i}$ are canonically conjugate variables. Since the $(\lambda_{\gamma_i}, \mu_{\gamma_i})$ belong to the spectral curve, they form a set of separated variables.

7.10 Poles systems and double-Bloch condition

In this section we present a natural construction of the Lax pair for the spin Calogero–Moser model by relating it to the matrix KP equation. Before delving into this particular example it is illuminating to present the general context of this sort of connection.

Let us consider a linear differential operator $D$ in two variables $x$ and $t$ (there could be more than one time) and consider the differential equation $D\Psi(x,t) = 0$. We will consider as an example the KP operator $D = \partial_t - \partial_x^2 + u(x,t)$, see eq. (10.10) in Chapter 10. Assume, moreover, that the coefficients of $D$ are doubly periodic meromorphic functions of the complex variable $x$ with periods $2\omega_i$, $i = 1, 2$. We require nothing concerning the $t$-dependence. In general we know that for generic simply periodic potentials we can find a solution of the equation $D\Psi(x,t) = 0$ which is quasi-periodic, i.e. $\Psi(x + 2\omega_1, t) = B_1\Psi(x,t)$. Such solutions are called Floquet or Bloch solutions. In the case of elliptic potentials, since we have a second period, it is natural to require that $\Psi$ be double Bloch, i.e.

$$\Psi(x + 2\omega_i, t) = B_i\,\Psi(x,t), \qquad i = 1, 2$$

In contrast to the one-period situation, it turns out that this is in general impossible. Nevertheless, one can find double Bloch solutions for very special potentials, from which the Calogero–Moser model will automatically spring out.

To understand the restrictions coming from the double Bloch condition let us assume that the function $x \to \Psi(x,t)$ is a meromorphic function and has $N$ poles at positions $q_i$ on the torus with periods $2\omega_i$. Applying the Riemann–Roch theorem with $g = 1$ one sees that such functions form a vector space of dimension $N$. Indeed, for any two such functions $\Psi_1$, $\Psi_2$ with the same Bloch multipliers, the quotient $\Psi_2/\Psi_1$ is a meromorphic function on the torus with $N$ poles (since $\Psi_1$ has the same number of zeroes and poles, because $\Psi_1'/\Psi_1$ is elliptic), hence lives in a space of dimension $N - g + 1 = N$. The existence of such functions comes from their explicit construction using the Lamé function. Take any sum of the form:

$$\Psi(x,t,z) = \sum_{i=1}^{N} c_i(t,z)\,\Phi(x - q_i, z)\, e^{kx}$$

where $\Phi(q,z)$ is the Lamé function defined in eq. (7.9). Recall that we have $\Phi(x + 2\omega_i, z) = T_i(z)\Phi(x,z)$, with $T_i(z) = \exp(2\omega_i\zeta(z) - 2\eta_i z)$ (see Chapter 15), so that $\Psi(x + 2\omega_i, t, z) = B_i\Psi(x,t,z)$ with $B_i = T_i(z)\exp(2k\omega_i)$. Given the two Bloch multipliers $B_i$, we can adjust $k$ and $z$ to achieve these values. So we have found an explicit form of the basis of the $N$-dimensional vector space of double Bloch functions. If we now require that this function $\Psi$ obeys the equation $D\Psi = 0$, we impose in fact more than $N$ conditions. This is because $D\Psi$ is a double Bloch function with the same multipliers, but differentiations and multiplication by potentials can only increase the degree of its divisor of poles. Hence $D\Psi$ lives in a space of dimension greater than $N$, and its vanishing requires more than $N$ linear conditions on the $N$ coefficients $c_i$. This means that for a general operator $D$ with elliptic coefficients there are no double Bloch solutions.

On the other hand, given a Riemann surface $\Gamma$ of genus $g$, one can construct Baker–Akhiezer functions $\Psi$ on it. It is well known that such Baker–Akhiezer functions satisfy differential equations of the form $D\Psi = 0$ for some specific operator $D$, see an example below, eq. (7.49). A generic Baker–Akhiezer function depends on many parameters. It is defined first by the choice of the Riemann surface $\Gamma$, which depends on $3g - 3$ moduli, i.e. on $3g - 3$ complex parameters, and second by the choice of punctures $P_\alpha$, local parameters $w_\alpha$ around $P_\alpha$, and singular parts of order $n_\alpha$ at $P_\alpha$. Only the first $n_\alpha$ coefficients in the expansion of $w_\alpha$ are relevant to the definition of the singular part. Altogether this produces a total of $(3g - 3) + \sum_\alpha (1 + n_\alpha)$ parameters. In the case of one puncture, the Baker–Akhiezer function is of the generic form:

$$\Psi(P,t) = e^{\sum_i t_i \int^P \Omega^{(i)}}\; \frac{\theta\big(A(P) + \sum_i U^{(i)} t_i + \zeta\big)}{\theta(A(P) + \zeta)}$$

where $A(P)$ is the Abel map.

We assume that $x$ is $t_1$, but we could take for $x$ any combination $x = \sum_i \alpha_i t_i$ of the elementary times $t_i$ of the hierarchy. The condition for such a function $\Psi$ to be double Bloch is that $2\omega_i U^{(1)}$ belongs to the lattice of periods of $\mathrm{Jac}\,(\Gamma)$. This means $2g$ conditions on the parameters of the Baker–Akhiezer functions. The dimension of the parameter space of Baker–Akhiezer functions is large enough to accommodate the $2g$ double Bloch conditions. This provides families of differential operators possessing double Bloch solutions. In this case the overdetermined linear system on the coefficients $c_i(t,z)$ becomes compatible. The compatibility conditions eventually take the form of a Lax equation.

Let us apply this strategy to a simple example: we consider a smooth Riemann surface of genus $g$ and $l$ punctures $P_\beta$, $\beta = 1,\ldots,l$ with local
parameters wβ (P ) around Pβ (wβ (Pβ ) = 0). Fix a divisor of degree g+l−1 in general position. Then there exists a unique Baker–Akhiezer function ψα having poles at this divisor and behaving in the neighbourhood of each Pβ as: ∞ −1 −2 ξsαβ (x, t)wβs ψα (x, t, P ) = ewβ x+wβ t δαβ + s=1
In fact, the degree of the divisor of poles being $g + l - 1$, we have a vector space of functions of dimension $l$ and we impose a system of $l$ linear inhomogeneous normalization conditions.

Proposition. Let $|\Psi(x, t, P)\rangle$ be the vector with $l$ components $\psi_\alpha(x, t, P)$. It satisfies the equation:
$$\big(\partial_t - \partial_x^2 + u(x, t)\big)\,|\Psi(x, t, P)\rangle = 0 \tag{7.49}$$
where the $l \times l$ matrix $u$ is given by $u_{\alpha\beta}(x, t) = 2\partial_x \xi_1^{\alpha\beta}(x, t)$. Such potentials are called finite-zone potentials.

Proof. In the vicinity of each puncture $P_\beta$ one can write:
$$(\partial_t - \partial_x^2)\,\psi_\alpha(x, t, P) = e^{w_\beta^{-1} x + w_\beta^{-2} t}\big(-2\partial_x \xi_1^{\alpha\beta} + O(w_\beta)\big)$$
Since the left-hand side is meromorphic except at the $P_j$, has the same $g + l - 1$ poles as $\Psi$, and has an appropriate essential singularity at each puncture, it can be expanded on the $\psi_\beta$, so that one can write $(\partial_t - \partial_x^2)\psi_\alpha = -\sum_\beta u_{\alpha\beta}\psi_\beta$, for some $u_{\alpha\beta}(x, t)$ independent of $P \in \Gamma$. Comparing with the right-hand side around $P_\beta$ we find $u_{\alpha\beta} = 2\partial_x \xi_1^{\alpha\beta}$.

We now express the condition that the potential $u$ is elliptic and obtain its precise form.

Proposition. If the finite-zone potential $u$ is elliptic, it necessarily has the form:
$$u(x, t) = \sum_{i=1}^{N} \rho_i(t)\,\wp(x - q_i(t))$$
where $\rho_i(t)$ is an $l \times l$ matrix of rank 1 of the form $\rho_i = |a_i\rangle\langle b_i|$, with $|a_i\rangle$ an $l$-vector and $\langle b_i|$ an $l$-covector.

Proof. We need the explicit form of $|\Psi\rangle$ as a Baker–Akhiezer function on some curve $\Gamma$. Let $P_1, \ldots, P_l$ be the punctures and $\gamma_1, \ldots, \gamma_{g+l-1}$ be the poles of the Baker–Akhiezer function. There exists a unique meromorphic function $h_\alpha(P)$ with poles at the $\gamma_i$ and such that $h_\alpha(P_\beta) = \delta_{\alpha\beta}$. In particular it has $l - 1$ zeroes at the $P_\beta$, $\beta \neq \alpha$, and $g$ other zeroes at a divisor $D_\alpha$. Applying Abel's theorem, it follows that $Z_0 + K \equiv A(P_\alpha) - A(D_\alpha) = \sum_\beta A(P_\beta) - \sum_i A(\gamma_i)$ is independent of $\alpha$. The Baker–Akhiezer function $\psi_\alpha(P)$ reads:
$$\psi_\alpha(P) = h_\alpha(P)\, \frac{\theta\big(A(P) + U_1 x + U_2 t + Z_0 - A(P_\alpha)\big)\,\theta(Z_0)}{\theta\big(A(P) + Z_0 - A(P_\alpha)\big)\,\theta(U_1 x + U_2 t + Z_0)}\; e^{x\int^P \Omega_1 + t\int^P \Omega_2}$$
In this formula the first theta-function in the denominator cancels the extra $g$ zeroes of $h_\alpha(P)$ at $D_\alpha$. Then the first theta-function in the numerator is obtained by requiring that there is no monodromy. Finally, the two other theta-functions are necessary to ensure the correct normalization of $\psi_\alpha(P)$ at $P_\alpha$.

It is now easy to identify the poles in $x$ of $|\Psi(x, t, P)\rangle$. They occur at positions $x = q_i(t)$ with (compare with eq. (7.46)):
$$\theta\big(U_1 q_i(t) + U_2 t + Z_0\big) = 0$$
This defines the functions $q_i(t)$. Consider the residue of $\psi_\alpha(x, t, P)$ at $x = q_i(t)$. As a function of $P$ it is a Baker–Akhiezer function, having poles at the points $\gamma_i$. Moreover, in the neighbourhood of each puncture $P_\beta$ it has the form $\exp(w_\beta^{-1} q_i(t) + w_\beta^{-2} t)\, O(w_\beta)$, i.e. the coefficient in front of the exponential vanishes. This is because at $P_\beta$, $\beta \neq \alpha$, the function $h_\alpha$ vanishes, while at $P_\alpha$ the theta-function $\theta(A(P) + U_1 q_i(t) + U_2 t + Z_0 - A(P_\alpha))$ vanishes, due to the definition of $q_i(t)$. In general such a Baker–Akhiezer function vanishes identically; however, for the special values of the parameters $(x = q_i(t), t)$ it exists, and is a fortiori unique, up to a normalization constant. We choose some normalization and call it $\sigma_i(t, P)$. This means that:
$$\psi_\alpha(x, t, P) = \frac{a_{i\alpha}(t)\,\sigma_i(t, P)}{x - q_i(t)} + O(1) \tag{7.50}$$
Here $a_{i\alpha}(t)$ is the normalization constant, independent of $P$. The potential $u_{\alpha\beta}(x, t)$ is obtained by computing the expansion of $\psi_\alpha(x, t, P)$ around $P_\beta$. Writing the expansion of $\sigma_i(t, P)$ as $\sigma_i(t, P) = -\frac{1}{2}\big(w_\beta b_i^\beta + O(w_\beta^2)\big)\exp(w_\beta^{-1} q_i(t) + w_\beta^{-2} t)$, we find around $x = q_i$ and $P = P_\beta$:
$$\xi_1^{\alpha\beta}(x, t) = -\frac{1}{2}\,\frac{a_{i\alpha}(t)\, b_i^\beta(t)}{x - q_i(t)} + O(1)$$
Finally, $u_{\alpha\beta} = 2\partial_x \xi_1^{\alpha\beta}$ has double poles at $x = q_i$ with coefficient $\rho_i^{\alpha\beta} = a_{i\alpha}(t)\, b_i^\beta(t)$. If we now impose that $u$ is elliptic, the double pole gives rise to a Weierstrass function $\wp(x - q_i(t))$ with a matrix coefficient $\rho_i$ of rank 1.
We now require that $|\Psi\rangle$ is double Bloch in $x$, hence can also be written as:
$$|\Psi\rangle = \sum_{i=1}^{N} |s_i(t, \lambda, \mu)\rangle\, \Phi(x - q_i(t), \lambda)\, e^{-\frac{\mu}{2} x + \frac{\mu^2}{4} t} \tag{7.51}$$
In this formula $\lambda$ and $\mu$ are free parameters, but the double Bloch condition will turn out to specify the Riemann surface on which $(\lambda, \mu)$ are coordinates. Moreover, $\mu$ will become infinite at the punctures, and $\lambda$ will be the local parameter required to define the essential singularity of the Baker–Akhiezer function. Finally $|s_i\rangle$, in view of eq. (7.50), is proportional to $|a_i(t)\rangle$. We denote by $\psi_i(t, \lambda, \mu)$ the proportionality coefficient, $|s_i\rangle = \psi_i(t, \lambda, \mu)|a_i(t)\rangle$ (do not confuse $\psi_i(t, \lambda, \mu)$, $i = 1, \ldots, N$, with $\psi_\alpha(x, t, P)$, $\alpha = 1, \ldots, l$).

Proposition. The equation
$$\Big(\partial_t - \partial_x^2 + \sum_{i=1}^{N} \rho_i(t)\,\wp(x - q_i(t))\Big)|\Psi\rangle = 0 \tag{7.52}$$
has solutions $|\Psi\rangle$ of the form
$$|\Psi\rangle = \sum_{i=1}^{N} \psi_i(t, \lambda, \mu)\,\Phi(x - q_i(t), \lambda)\, e^{-\frac{\mu}{2} x + \frac{\mu^2}{4} t}\, |a_i(t)\rangle \tag{7.53}$$
if and only if $q_i(t)$ and the quantities $f_{ij} = \langle b_i|a_j\rangle$ satisfy the equations of motion of the spin Calogero–Moser system, eqs. (7.4), and the constraints $f_{ii} = 2$.

Proof. Inserting equation (7.53) into equation (7.52), we find the condition:
$$E \equiv \sum_{i=1}^{N}\Big[\frac{d(\psi_i|a_i\rangle)}{dt}\,\Phi(x - q_i, \lambda) - (\dot q_i - \mu)\,\Phi'(x - q_i, \lambda)\,\psi_i|a_i\rangle - \Phi''(x - q_i, \lambda)\,\psi_i|a_i\rangle + \sum_{j=1}^{N} f_{ji}\,\wp(x - q_j)\,\Phi(x - q_i, \lambda)\,\psi_i|a_j\rangle\Big] = 0$$
where $\Phi' = \partial_x\Phi$ and so on. The vanishing of the triple pole $(x - q_i)^{-3}$ gives the condition:
$$\langle b_i|a_i\rangle\,|a_i\rangle = 2|a_i\rangle \tag{7.54}$$
Using this condition and the Lamé equation (7.10), we can identify the double pole $(x - q_i)^{-2}$. Its vanishing gives the condition:
$$(\dot q_i - \mu)\,\psi_i|a_i\rangle + \sum_{j \neq i} f_{ij}\,\Phi(q_i - q_j, \lambda)\,\psi_j|a_i\rangle = 0 \tag{7.55}$$
We finally identify the residue of the simple pole and obtain the condition:
$$\Big(\frac{d}{dt} - \wp(\lambda)\Big)\psi_i|a_i\rangle + \sum_{j \neq i} f_{ji}\,\wp(q_i - q_j)\,\psi_i|a_j\rangle + \sum_{j \neq i} f_{ij}\,\Phi'(q_i - q_j, \lambda)\,\psi_j|a_i\rangle = 0 \tag{7.56}$$
Inserting back eqs. (7.54, 7.55, 7.56) into the expression of $E$ one sees that $E$ vanishes identically due to the functional equation (7.14). Hence the vector function $|\Psi\rangle$ given by eq. (7.53) satisfies eq. (7.52) if and only if the conditions (7.54, 7.55, 7.56) are fulfilled.

From eq. (7.54) it follows that the constraints $f_{ii} = 2$ should hold. Equation (7.55) can then be rewritten as a matrix equation for the $N$-dimensional vector $\Psi = (\psi_i)$ (not to be confused with the $l$-dimensional object $|\Psi\rangle$):
$$(L(t, \lambda) - \mu I)\,\Psi = 0 \tag{7.57}$$
where the matrix $L(t, \lambda)$ is given by:
$$L_{ij}(t, \lambda) = \dot q_i\,\delta_{ij} + (1 - \delta_{ij})\, f_{ij}\,\Phi(q_i - q_j, \lambda) \tag{7.58}$$
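We recognize the Lax matrix of the Calogero–Moser model, eq. (7.12), so that the $N$-vector $\Psi$ identifies with the eigenvector considered in section (7.6). We can rewrite equation (7.56) as:
$$|\dot a_i\rangle = -\Lambda_i|a_i\rangle - \sum_{j \neq i} f_{ji}\,\wp(q_i - q_j)\,|a_j\rangle \tag{7.59}$$
where we have defined:
$$\Lambda_i = \frac{\dot\psi_i}{\psi_i} - \wp(\lambda) + \sum_{j \neq i} f_{ij}\,\Phi'(q_i - q_j, \lambda)\,\frac{\psi_j}{\psi_i}$$
But this last equation can be written:
$$(\partial_t - \wp(\lambda) I - \Lambda - M)\,\Psi = 0 \tag{7.60}$$
where $\Lambda = \mathrm{Diag}(\Lambda_i)$ and the matrix $M(t, \lambda)$ is given by:
$$M_{ij}(t, \lambda) = -(1 - \delta_{ij})\, f_{ij}\,\Phi'(q_i - q_j, \lambda) \tag{7.61}$$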
We recognize the second matrix $M$ of the Lax pair (7.13). The compatibility condition of eqs. (7.57, 7.60) reads $\dot L = [M + \Lambda + \wp(\lambda) I, L]$. Of course the term $\wp(\lambda) I$ does not contribute to the commutator. Moreover, we can get rid of the diagonal matrix $\Lambda$ by performing a conjugation by a diagonal matrix on $L$, which amounts to quotienting out the toral action. In this way we have exactly recovered the Lax pair of the Calogero–Moser model, eq. (7.11), hence the $q_i$ and $f_{ij}$ have to satisfy the Calogero–Moser equations of motion in order for $|\Psi\rangle$ to be double Bloch.

In the course of the proof, $\lambda$ and $\mu$ have been identified as coordinates on the spectral curve of the Calogero–Moser model, i.e. they are related by the equation $\det(L(\lambda) - \mu) = 0$. One can see that the punctures $P_\beta$ are $l$ among the $N$ points above $\lambda = 0$, and $\lambda$ is a local parameter around each of them.

The outcome of this analysis is that the double Bloch condition singles out very specific finite-zone potentials. It amounts to an overdetermined linear system on the coefficients of the expansion of $|\Psi\rangle$ which is equivalent to the Lax equation. In particular, we have obtained in a simple and natural way the Lax matrices of the Calogero–Moser model in eqs. (7.58, 7.61). The method is clearly general and lends itself to extensions by Baker–Akhiezer functions with different patterns of essential singularities. Note that in our construction, we have considered only two "times", $x$ and $t$, which parametrize the singularity at each puncture. This provides a whole variety of integrable systems with spectral parameter lying on a genus 1 curve. In view of the counting of parameters in Lax equations in Chapter 3 this is a notable fact.
7.11 Hitchin systems

A remarkable construction, due to Hitchin, provides integrable systems with spectral parameter lying on a curve of arbitrary genus. The Calogero–Moser model can also be seen as a particular case of this construction.

Let $\Sigma$ be a Riemann surface of genus $g$, and let $G$ be a complex semi-simple Lie group. Let $\mathcal{A}$ be the space of type $(0, 1)$ fields on $\Sigma$, i.e. fields of the form $A = A_{\bar z}(z, \bar z)\,d\bar z$ for some local coordinate system $z$, with values in the Lie algebra of $G$. We define the "gauge group" $\mathcal{G}$ to be the space of maps from $\Sigma$ to $G$, so that $h \in \mathcal{G}$ is a function $h(z, \bar z)$ with values in $G$. The gauge group acts on $\mathcal{A}$ as follows:
$$A \longrightarrow A^h \equiv h^{-1} A h + h^{-1}\bar\partial h \tag{7.62}$$
Note that the differences $A - A'$ form a vector space compatible with the gauge group action, so that $\mathcal{A}$ can be seen as an affine space with group action. We call $\mathcal{N} = \mathcal{A}/\mathcal{G}$ the orbit space of $\mathcal{A}$ under $\mathcal{G}$.

A tangent vector at the point $A \in \mathcal{A}$ is of the form $X = X_{\bar z}(z, \bar z)\,d\bar z$ with values in the Lie algebra of $G$. A covector $\Phi$ at the point $A$ is of the form $\Phi = \Phi_z(z, \bar z)\,dz$, the pairing between vectors and covectors at the point $A$ being given by:
$$(\Phi, X) = \int_\Sigma \mathrm{Tr}(\Phi_z X_{\bar z})\,dz\,d\bar z$$
The gauge group acts on vectors and covectors by adjoint action, $X^h = h^{-1} X h$, $\Phi^h = h^{-1}\Phi h$, and this leaves the pairing invariant. The starting point of Hitchin's construction is the cotangent bundle $T^*\mathcal{A}$, whose points are pairs $(A, \Phi)$, where $\Phi$ is a cotangent vector at $A \in \mathcal{A}$. The canonical symplectic form on this space reads:
$$\omega = \int_\Sigma \mathrm{Tr}(\delta\Phi_z \wedge \delta A_{\bar z})\,dz\,d\bar z$$
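Note that this symplectic form is invariant under the gauge group action, so we can perform a Hamiltonian reduction by this group. To do that we need to compute the moment $\mu$ of this action. In the case of a cotangent bundle it is shown in Chapter 14 that the Hamiltonian generating the infinitesimal group action is given by
$$H_\epsilon \equiv (\mu, \epsilon) = \int_\Sigma \mathrm{Tr}(\mu_{z\bar z}\,\epsilon)\,dz\,d\bar z = \alpha(X_\epsilon(A, \Phi)), \qquad \epsilon \in \mathrm{Lie}(\mathcal{G})$$
where $\alpha$ is the canonical 1-form $\alpha = \int_\Sigma \mathrm{Tr}(\Phi\,\delta A)$ and $X_\epsilon(A, \Phi)$ is the infinitesimal variation of the point $(A, \Phi)$ under gauge group action, namely:
$$X_\epsilon A = \bar\partial_A\epsilon \equiv \bar\partial\epsilon + [A, \epsilon], \qquad X_\epsilon\Phi = [\Phi, \epsilon]$$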
One gets after an integration by parts:
$$\mu_{z\bar z}\,dz\,d\bar z = -\Big(\frac{\partial\Phi_z}{\partial\bar z} + [A_{\bar z}, \Phi_z]\Big)dz\,d\bar z, \qquad \mu = \bar\partial_A\Phi \equiv \bar\partial\Phi + A \wedge \Phi + \Phi \wedge A$$
The phase space $\mathcal{P}$ of the Hitchin system is obtained by choosing the moment equal to 0. The stability group of this moment is therefore the whole gauge group $\mathcal{G}$, so that we have:
$$\mathcal{P} = \mu^{-1}(0)/\mathcal{G}$$
Choosing $\mu = 0$ means that $\bar\partial_A\Phi = 0$, and this has a nice geometric interpretation. A cotangent vector at a point $n \in \mathcal{N}$, which is the class of $A \in \mathcal{A}$ under the $\mathcal{G}$ action, may be viewed as a linear form on $T_A\mathcal{A}$ vanishing on vectors tangent to the fibre, that is, such that $(\Phi, \bar\partial_A\epsilon) = 0$ for all $\epsilon$. By integration by parts, this condition is equivalent to $\bar\partial_A\Phi = 0$. This interpretation being covariant under the gauge group action, it follows that $\mathcal{P} = T^*\mathcal{N}$, where $\mathcal{N}$ is the orbit space (avoiding non-generic orbits to obtain a good manifold). By a theorem of Narasimhan and Seshadri, the space $\mathcal{N}$ is known to be isomorphic to the moduli space of (stable) holomorphic $G$-bundles on $\Sigma$, and this implies in particular that it is finite-dimensional.

Theorem. The phase space $\mathcal{P} = T^*\mathcal{N}$ is of finite dimension: for $g > 1$ and $G$ a semi-simple Lie group, we have
$$\dim\mathcal{P} = 2\dim\mathcal{N} = 2(g - 1)\dim G$$

Proof. Let us sketch some ideas of the proof. The first step is to relate $\mathcal{N}$ to holomorphic $G$-bundles. Given $A \in \mathcal{A}$ and a sufficiently fine covering $U_\alpha$ of $\Sigma$, one solves, for some $C^\infty$ functions $h_\alpha \in G$ defined in $U_\alpha$, the equation
$$h_\alpha^{-1}\bar\partial h_\alpha = A_\alpha \tag{7.63}$$
where $A_\alpha \equiv A|_{U_\alpha}$. We define a principal $G$-bundle by the transition functions $g_{\alpha\beta} = h_\alpha h_\beta^{-1}$ on $U_\alpha \cap U_\beta$. The action of $\mathcal{G}$ on $h_\alpha$ reads $h_\alpha^h = h_\alpha h$, so that $g_{\alpha\beta}$ is gauge invariant, and the $G$-bundle is really attached to a point of $\mathcal{N}$. This bundle is holomorphic, $\bar\partial g_{\alpha\beta} = 0$, because $g_{\alpha\beta}^{-1}\bar\partial g_{\alpha\beta} = h_\beta(A_\alpha - A_\beta)h_\beta^{-1} = 0$ since $A_\alpha = A_\beta$ on $U_\alpha \cap U_\beta$. Of course, $h_\alpha$ is defined up to left multiplication by a holomorphic function $f_\alpha$, but this yields an equivalent presentation of the same bundle, with transition functions $g'_{\alpha\beta} = f_\alpha g_{\alpha\beta} f_\beta^{-1}$.

Next we remark that the associated determinant bundle has vanishing Chern class. Viewing $G$ as a group of matrices, define the determinant bundle as a line bundle whose transition functions are $\det g_{\alpha\beta}$. By definition of $h_\alpha$ we have $\bar\partial\log\det h_\alpha = \mathrm{Tr}\,A_\alpha = 0$ because we assume $G$ semi-simple. So $\det h_\alpha$ is in fact holomorphic. Then $\det g_{\alpha\beta} = \det h_\alpha/\det h_\beta$ defines a trivial holomorphic line bundle, and its Chern class vanishes.

To compute the dimension of $\mathcal{N}$, one computes the dimension of the cotangent space of $\mathcal{N}$ at a point $n \in \mathcal{N}$. Take a representative element $A \in \mathcal{A}$ of $n$.
As we have seen, the cotangent space at $n$ identifies with the space of forms $\Phi$ of type $(1, 0)$ satisfying $\bar\partial_A\Phi = 0$. Consider the $G$-bundle attached to $n$, defined using some choice of functions $h_\alpha$ as in eq. (7.63). Define on each $U_\alpha$ the forms $\tilde\Phi_\alpha = h_\alpha\Phi h_\alpha^{-1}$, with values in the Lie algebra of $G$. By construction we have $\bar\partial\tilde\Phi_\alpha = 0$ and $\tilde\Phi_\alpha = g_{\alpha\beta}\tilde\Phi_\beta g_{\alpha\beta}^{-1}$, i.e. the $\tilde\Phi_\alpha$ define a global holomorphic section with values in 1-forms on $\Sigma$ of the associated bundle $\mathrm{Ad}\,P$, the bundle with fibres the Lie algebra of $G$ and adjoint group action. This shows that the cotangent space to $\mathcal{N}$ at $n$ identifies with $T_n^*\mathcal{N} = H^0(\Sigma, \kappa \otimes \mathrm{Ad}\,P_n)$, where $P_n$ is the principal $G$-bundle attached to $n$, and $\kappa$ is the canonical bundle.

The Riemann–Roch theorem has an extension to vector bundles which reads in our case:
$$\dim H^0(\Sigma, \kappa \otimes \mathrm{Ad}\,P) - \dim H^0(\Sigma, \mathrm{Ad}\,P) = (g - 1)\dim G - c(\det\mathrm{Ad}\,P)$$
where the determinant bundle $\det\mathrm{Ad}\,P$ has transition functions $\det g_{\alpha\beta}$, and so has vanishing Chern class, $c(\det\mathrm{Ad}\,P) = 0$. We proceed to show that, generically, the vector bundle $\mathrm{Ad}\,P$ has no global holomorphic section, i.e. $\dim H^0(\Sigma, \mathrm{Ad}\,P) = 0$. Indeed, if $\{f_\alpha\}$ with patching condition $f_\alpha = g_{\alpha\beta} f_\beta g_{\alpha\beta}^{-1}$ defines such a section, for any integer $m$ the quantities $\mathrm{Tr}\,f_\alpha^m$ define a global holomorphic function on $\Sigma$, hence a constant. This means that the eigenvalues of $f_\alpha$ are constants, independent of $\alpha$, and one can write $f_\alpha = u_\alpha\Lambda u_\alpha^{-1}$ for some $u_\alpha$ holomorphic on $U_\alpha$ and a constant traceless diagonal matrix $\Lambda$. The patching condition reads $[\Lambda, u_\alpha^{-1} g_{\alpha\beta} u_\beta] = 0$. This implies that the matrices $u_\alpha^{-1} g_{\alpha\beta} u_\beta$ are block diagonal (blocks correspond to coincident eigenvalues of $\Lambda$). But this would mean that the fibre bundle $P$ is decomposable, since the transition functions $u_\alpha^{-1} g_{\alpha\beta} u_\beta$ provide an equivalent description of our bundle. We exclude this situation because it is not generic. Finally, the dimension of $T_n^*\mathcal{N}$ is $(g - 1)\dim G$, and $\dim T^*\mathcal{N} = 2(g - 1)\dim G$.

Remark. In the case of line bundles, we have $\dim H^0(\Sigma, \mathrm{Ad}\,P) = 1$, since we are here considering global holomorphic functions, and the same argument shows that $\dim T_n^*\mathcal{N} = g$.
It will be useful to have a heuristic picture of the moduli. First, it is known that all vector bundles on a non-compact Riemann surface are trivial. This implies that one can describe all vector bundles on a compact Riemann surface by using a covering with only two open sets $U_0$ and $U_\infty$, where $U_0$ is a small disc around a point, say $z = 0$, and $U_\infty$ is the Riemann surface with the point $z = 0$ removed. The bundles are then described by giving only one transition function $g_{0\infty}$ defined on the annulus $U_0 - \{z = 0\}$. We get an equivalent bundle by changing the transition function
$$g_{0\infty}(z) \to g'_{0\infty}(z) = f_0(z)\, g_{0\infty}(z)\, f_\infty^{-1}(z)$$
where $f_0$ is analytic non-vanishing on $U_0$, and $f_\infty$ is analytic non-vanishing on $U_\infty$, i.e. on the whole of $\Sigma - \{z = 0\}$. The moduli space of vector bundles is the space of transition functions $g_{0\infty}$ modulo the above redefinitions. In the case of a line bundle, we can write quite generally
$$g_{0\infty}(z) = z^k\, e^{\sum_{n=-\infty}^{\infty} a_n z^n} \tag{7.64}$$
where $k$ is an integer because $g_{0\infty}(z)$ is single valued on the annulus. Consider now on $U_\infty$ the function $f_\infty(z) = \exp\big(\sum_{n=g+1}^{\infty} a_{-n}\varphi_{-n}(z)\big)$, where $\varphi_{-n}(z)$ is a meromorphic function on $\Sigma$ such that around $z = 0$ we have $\varphi_{-n}(z) = z^{-n} + O(z^{-g})$ and $\varphi_{-n}(z)$ is regular everywhere else. Notice that such a function exists and is unique for $n \geq g + 1$. Using this $f_\infty$, we can get rid of all the terms $z^{-n}$, $n \geq g + 1$, in eq. (7.64). Then using $f_0(z) = \exp\big(-\sum_{n=0}^{\infty} a_n z^n\big)$, we can get rid of all the terms $z^n$, $n \geq 0$, as well. Hence, we are left with
$$g_{0\infty}(z) = z^k\, e^{\sum_{n=1}^{g} a_{-n} z^{-n}} \tag{7.65}$$
So the line bundles on $\Sigma$ are holomorphically classified by an integer $k$, the Chern class of the bundle, and $g$ continuous moduli, which describe in fact the Picard variety of $\Sigma$ (the Jacobian for $k = 0$).

In the case of higher rank vector bundles, things are much more complicated. If $\Sigma$ is a sphere, we know by Riemann–Hilbert factorization that we can always decompose a matrix $g_{0\infty}(z)$ as
$$g_{0\infty}(z) = f_0(z)\,\lambda(z)\, f_-(z)^{-1} \tag{7.66}$$
where $\lambda(z)$ is a diagonal matrix with diagonal elements $z^{k_i}$, for some integers $k_i$, $f_0$ has an expansion in $z$, and $f_-$ has an expansion in $1/z$. This exactly means that vector bundles on the sphere are classified by the integers $k_i$ and have no continuous moduli. In other words, on the sphere $\dim\mathcal{N} = 0$. The integers $k_i$ are holomorphic invariants of the bundle. The Chern class of the corresponding determinant bundle is $\sum_i k_i$.

In the case of a general Riemann surface $\Sigma$, we can still use Birkhoff's theorem in the small disc around $z = 0$ to write the transition function on the annulus in the form of eq. (7.66). Note that $f_-(z)$ is here only defined on the annulus and cannot in general be extended to $\Sigma - \{0\}$. However, one can hope that a mechanism similar to the case of line bundles allows us to get rid of powers $z^{-n}$ for $n \geq g + 1$, so that we can write the transition function as in eq. (7.65), but with $k$ a diagonal matrix with integer entries $k_i$, and $a_{-1}, \ldots, a_{-g}$ matrices. If all the integers $k_i$ vanish, one can use the freedom of redefinition of the bundle by constant matrices $f_0$ and $f_\infty$ to diagonalize $a_{-1}$. We can still quotient away by conjugating by constant diagonal matrices while preserving this form. If $g > 1$, there remains a space of $(g - 1)\dim G$ parameters (in fact $\mathrm{rank}\,G + [(g - 1)\dim G - \mathrm{rank}\,G]$) which can plausibly be taken as coordinates on an open set of the moduli space. If $g = 1$, however, we are left with
$$g_{0\infty}(z) = \exp(a_{-1}/z) \tag{7.67}$$
where $a_{-1}$ is diagonal, so that we have $\mathrm{rank}\,G$ parameters, which is known to be the correct dimension of the moduli space in that case. We will content ourselves with this interpretation in the following.

The construction of Hitchin integrable systems on $\mathcal{P}$ will now be done by defining a very simple set of Poisson commuting functions on $T^*\mathcal{A}$, invariant under the gauge group, and by reducing them to $T^*\mathcal{N}$. Let $P(X)$ be an invariant polynomial on the Lie algebra of $G$, i.e. $P(h^{-1} X h) = P(X)$. Recall that the ring of such invariant polynomials is freely generated (Chevalley's theorem) by homogeneous polynomials $P_i$, $i = 1, \ldots, \mathrm{rank}\,G$, of degrees $m_i$, the so-called exponents of the Lie algebra. These numbers are such that:
$$\sum_{i=1}^{\mathrm{rank}\,G} (2m_i - 1) = \dim G$$
For any given invariant polynomial $P$ of degree $m$, e.g. $P(X) = \mathrm{Tr}\,X^m$, consider the function on phase space taking values in differentials of type $(m, 0)$ (i.e. of the form $\omega(z, \bar z)\,dz^m$):
$$(A, \Phi) \in T^*\mathcal{A} \to P(\Phi)$$
The differential $P(\Phi)$ is holomorphic, since $\bar\partial P(\Phi) = m\,\mathrm{Tr}(\bar\partial\Phi\,\Phi^{m-1}) = m\,\mathrm{Tr}(\bar\partial_A\Phi\,\Phi^{m-1}) = 0$, where we have used cyclicity of the trace. Introducing a basis of holomorphic differentials of type $(m, 0)$, say $\omega_j^{(m)}$, we can write
$$P(\Phi) = \sum_j H_{P,j}(\Phi)\,\omega_j^{(m)}$$
The functions $H_{P,j}$ on phase space are $\mathcal{G}$-invariant, and define the Hamiltonians of the Hitchin systems. Note that the basis of differentials $\omega_j^{(m)}$ does not contain dynamical variables, since the Riemann surface $\Sigma$ is a fixed parameter of the construction.

Proposition. The functions $H_{P_m,j}$ associated with the primitive polynomials $P_m$, $m = 1, \ldots, \mathrm{rank}\,G$, which generate the ring of invariant polynomials, are in involution. Their number is $(g - 1)\dim G$, so they define a Hamiltonian integrable system on $\mathcal{P} = T^*\mathcal{N}$.

Proof. The functions $H_{P_m,j}$ seen as functions on $T^*\mathcal{A}$ are in involution, $\{H_{P_m,j}, H_{P_n,k}\} = 0$, because they depend only on the momenta $\Phi$. Since the polynomials $P$ are $G$-invariant, the functions $H_{P,j}$ are gauge invariant. They are thus well-defined on the symplectic quotient and in involution there, because one can compute directly their reduced Poisson brackets, see Chapter 14. It remains to show that the number of the Hamiltonians is half the dimension of the phase space. Let us count them. We need the number of holomorphic differentials of type $(m, 0)$. By the Riemann–Roch theorem,
$$\dim H^0(\Sigma, \kappa^m) - \dim H^0(\Sigma, \kappa^{1-m}) = c(\kappa^m) + 1 - g$$
where $c(\kappa^m) = m(2g - 2)$ and $\dim H^0(\Sigma, \kappa^{1-m}) = 0$ because $1 - m < 0$, so that $\dim H^0(\Sigma, \kappa^m) = (2m - 1)(g - 1)$. The total number of independent Hamiltonians $H_{P_i,j}$ is therefore:
$$\sum_{i=1}^{\mathrm{rank}\,G} \dim H^0(\Sigma, \kappa^{m_i}) = \sum_{i=1}^{\mathrm{rank}\,G} (2m_i - 1)(g - 1) = (g - 1)\dim G$$
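The counting just made can be checked mechanically for a few Lie algebras. The following Python sketch is only an illustration: the table of degrees $m_i$ and dimensions lists standard values, and the names `DEGREES`, `DIMENSIONS`, `dim_differentials`, `count_hamiltonians` and `half_phase_space_dim` are hypothetical helpers introduced here, not notation of the text. It assumes $g \geq 2$ and $m_i \geq 2$, as in the Riemann–Roch argument above.

```python
# Degrees m_i of the generators of the ring of invariant polynomials for a few
# simple Lie algebras, and the dimensions of these algebras (standard values).
DEGREES = {
    "sl(2)": [2],
    "sl(3)": [2, 3],
    "sl(4)": [2, 3, 4],
    "so(5)": [2, 4],
    "G2": [2, 6],
    "E8": [2, 8, 12, 14, 18, 20, 24, 30],
}
DIMENSIONS = {"sl(2)": 3, "sl(3)": 8, "sl(4)": 15, "so(5)": 10, "G2": 14, "E8": 248}


def dim_differentials(m, g):
    """Number of holomorphic differentials of type (m, 0) on a genus g surface.

    Uses dim H^0(Sigma, kappa^m) = (2m - 1)(g - 1), valid for m >= 2, g >= 2.
    """
    if m < 2 or g < 2:
        raise ValueError("the formula (2m - 1)(g - 1) assumes m >= 2 and g >= 2")
    return (2 * m - 1) * (g - 1)


def count_hamiltonians(algebra, g):
    """Total number of Hamiltonians H_{P_i, j}, summed over the generators P_i."""
    return sum(dim_differentials(m, g) for m in DEGREES[algebra])


def half_phase_space_dim(algebra, g):
    """Half the dimension of the phase space, (g - 1) dim G."""
    return (g - 1) * DIMENSIONS[algebra]


if __name__ == "__main__":
    for algebra in DEGREES:
        for g in (2, 3, 5):
            print(algebra, g, count_hamiltonians(algebra, g),
                  half_phase_space_dim(algebra, g))
```

For each algebra in the table the two counts agree, which is just the identity $\sum_i (2m_i - 1) = \dim G$ multiplied by $g - 1$.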
This is half the dimension of the phase space.

This counting works for $g > 1$. For $g = 0$ there are no regular differentials on the sphere, so that $\dim H^0(\Sigma, \kappa^m) = 0$. The Hitchin construction in this case yields no interesting system. For $g = 1$ one has $\dim H^0(\Sigma, \kappa^m) = 1$, since this space is spanned by $dz^m$, so that we find $\sum_{i=1}^{\mathrm{rank}\,G} 1 = \mathrm{rank}\,G$ Hamiltonians.

In genus 0 and 1 the Hitchin construction does not provide useful dynamical systems. This is a motivation to generalize it to Riemann surfaces with marked points. Let $z_k \in \Sigma$ be $N$ points on the Riemann surface $\Sigma$. To each of these points we associate an element $u_k$ in the dual of the Lie algebra of $G$, which we shall identify with the Lie algebra itself using the invariant bilinear form. Instead of choosing the moment in the Hamiltonian reduction equal to zero, we now choose:
$$\mu(A, \Phi) \equiv \bar\partial_A\Phi = 2i\pi\sum_k u_k\,\delta_{z_k} \tag{7.68}$$
where $\delta_{z_k}$ is the Dirac measure at the point $z_k$, represented locally around $z_k$ by $\delta(z - z_k)\,dz\,d\bar z$. The stability group of this momentum is the subgroup of gauge transformations leaving $u_k$ invariant: $\mathcal{G}_{z;u} = \{h \in \mathcal{G} \text{ s.t. } h(z_k) \in G_k\}$, where $G_k$ is the stabilizer of $u_k$. The reduced phase space is as usual:
$$\mathcal{P}_{z;u} \equiv \mu^{-1}\Big(2i\pi\sum_k u_k\,\delta_{z_k}\Big)\Big/\mathcal{G}_{z;u}$$
In other words, we have:
$$\mathcal{P}_{z;u} \equiv \Big\{(A, \Phi)\,\Big|\,\bar\partial_A\Phi = 2i\pi\sum_k u_k\,\delta_{z_k}\Big\}\Big/\mathcal{G}_{z;u} \tag{7.69}$$
The equation $\bar\partial_A\Phi = 2i\pi\sum_k u_k\,\delta_{z_k}$ specifies the behaviour of $\Phi$ around the marked points. Indeed, let us parametrize $A$ locally around $z_k$ as $A = g_k^{-1}\bar\partial g_k$ with $g_k(z_k) = 1$. Then the condition that $(A, \Phi)$ belongs to $\mathcal{P}_{z;u}$ translates locally into the condition $\bar\partial(g_k\Phi g_k^{-1}) = 2i\pi u_k\,\delta_{z_k}$, so that $g_k\Phi g_k^{-1}$ is holomorphic in an open set around $z_k$ excluding $z_k$, and behaves locally as
$$g_k\Phi g_k^{-1} = \frac{u_k\,dz}{z - z_k} + O(1) \tag{7.70}$$
(we used $\bar\partial\,\frac{dz}{z - z_0} = 2i\pi\,\delta_{z_0}$).

The construction of the commuting Hamiltonians then works as above. For $P_i$ an invariant polynomial of degree $m_i$ on the Lie algebra of $G$, one considers the function
$$(A, \Phi) \in \mathcal{P}_{z;u} \to P_i(\Phi) \tag{7.71}$$
which takes values in the space of meromorphic forms of type $(m_i, 0)$ with poles at the points $z_k$ of order at most $m_i$. These functions are in involution and define an integrable system.

7.12 Examples of Hitchin systems

Example 1. Let us illustrate this construction by considering the Riemann sphere with $N$ marked points $z_1, \ldots, z_N$ and fixed parameters $u_k$ in the dual of the Lie algebra of $G$. We take $A$ of the form
$$A = h^{-1}\bar\partial h \tag{7.72}$$
with $h(z, \bar z) \in G$ globally defined on the sphere, so that the attached principal bundle is trivial. The condition $\bar\partial_A\Phi = 2i\pi\sum_k u_k\,\delta_{z_k}$ imposes that $h\Phi h^{-1}$ possesses a simple pole at $z = z_k$ with residue $h(z_k) u_k h^{-1}(z_k)$. Since there are no holomorphic differentials on the sphere, the only solution is:
$$\tilde\Phi = h\Phi h^{-1} = \sum_k \frac{\tilde u_k}{z - z_k}\,dz \quad\text{with}\quad \tilde u_k = h(z_k)\, u_k\, h^{-1}(z_k) \tag{7.73}$$
In contrast to eq. (7.70), $h(z_k)$ may be different from one. Requiring that $h\Phi h^{-1}$ is regular at infinity yields:
$$\sum_k \tilde u_k = 0 \tag{7.74}$$
The data $(A, \Phi)$ are parametrized by $h$ through the equations (7.72, 7.73), up to left multiplication $h \to lh$ with $l$ holomorphic, hence constant, and together with the constraint eq. (7.74). Notice that this constraint is invariant under $h \to lh$. Gauge transformations act on $h$ as $h \to hg$, where $g$ is such that $g(z_k) \in G_k$, the stability group of $u_k$. Note that $\tilde\Phi$ and $\tilde u_k$ are invariant under this action. Quotienting by $\mathcal{G}_{z;u}$ allows us to gauge $h$ away except at the marked points. At these points only a copy of $G/G_k$ survives. This is equivalent to the orbit $\mathcal{O}_k$ of $u_k$ under coadjoint action of $G$. Noting that $\tilde u_k$ describes the orbit $\mathcal{O}_k$, we see that:
$$\mathcal{P}_{z;u} = \Big\{(\tilde u_k) \in \prod_k \mathcal{O}_k \text{ s.t. } \sum_k \tilde u_k = 0\Big\}\Big/G \tag{7.75}$$
where $G$ acts on $\tilde u_k$ by $\tilde u_k \to l\,\tilde u_k\, l^{-1}$ for all $k$.

The reduced symplectic structure is the usual symplectic structure on coadjoint orbits. Indeed, consider the canonical 1-form on $T^*\mathcal{A}$, i.e. $2i\pi\alpha = \int_\Sigma \mathrm{Tr}(\Phi\,\delta A)$. Using the parametrization $A = h^{-1}\bar\partial h$, one finds $2i\pi\alpha = -\int_\Sigma \mathrm{Tr}(h^{-1}\delta h\,\bar\partial_A\Phi)$ which, using eq. (7.68), reduces to:
$$\alpha = -\sum_k \mathrm{Tr}(u_k\, h_k^{-1}\delta h_k)$$
where $h_k = h(z_k)$.
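We recognize that $\delta\alpha$ is the Kostant–Kirillov symplectic form on the product of coadjoint orbits $\prod_k \mathcal{O}_k$, which means that the Poisson brackets read:
$$\{\mathrm{Tr}(\epsilon\,\tilde u_k), \mathrm{Tr}(\epsilon'\,\tilde u_{k'})\} = \delta_{kk'}\,\mathrm{Tr}([\epsilon, \epsilon']\,\tilde u_k)$$
We still have to quotient by the left action of $G$. The infinitesimal action is given by $\delta_\epsilon\tilde u_k = [\epsilon, \tilde u_k]$ and is generated by the Hamiltonian $H_\epsilon = \sum_k \mathrm{Tr}(\epsilon\,\tilde u_k)$, so the moment is $\mu = \sum_k \tilde u_k$ and eq. (7.74) shows that the phase space can be identified with the symplectic quotient $\mathcal{P}_{z;u} = \mu^{-1}(0)/G$.

Choosing the invariant polynomial $P_2(\Phi) = \mathrm{Tr}(\Phi^2)$ gives the Hamiltonian:
$$P_2(\Phi) = \sum_{k,l} \frac{\mathrm{Tr}(\tilde u_k\tilde u_l)}{(z - z_k)(z - z_l)} = \sum_k \frac{H_k}{z - z_k} \tag{7.76}$$
with
$$H_k = \sum_{l \neq k} \frac{\mathrm{Tr}(\tilde u_k\tilde u_l)}{z_k - z_l}$$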
The Hk form a family of commuting Hamiltonians usually called the Gaudin Hamiltonians, which are very closely related to the Neumann model.
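A small numerical experiment makes the structure of the Gaudin Hamiltonians concrete. The Python sketch below is an illustration under stated assumptions, not the construction of the text: each $\tilde u_k$ is replaced by a real 3-vector $S_k$, the pairing $\mathrm{Tr}(\tilde u_k\tilde u_l)$ by the dot product $S_k \cdot S_l$, and the Kostant–Kirillov bracket by the Lie–Poisson bracket $\{S^a, S^b\} = \epsilon_{abc} S^c$ on each site (an so(3) realization, with conventional normalization). All function names are hypothetical. The pole expansion checked here keeps the double-pole terms $S_k \cdot S_k/(z - z_k)^2$ and the factor 2 coming from the symmetric double sum explicit; eq. (7.76) is read as giving the simple-pole coefficients up to that normalization.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def gaudin_hamiltonian(k, spins, zs):
    """H_k = sum over l != k of S_k . S_l / (z_k - z_l)."""
    return sum(dot(spins[k], spins[l]) / (zs[k] - zs[l])
               for l in range(len(zs)) if l != k)


def gradient(k, m, spins, zs):
    """Gradient of H_k with respect to the vector S_m."""
    if m == k:
        comps = [0.0, 0.0, 0.0]
        for l in range(len(zs)):
            if l != k:
                for a in range(3):
                    comps[a] += spins[l][a] / (zs[k] - zs[l])
        return tuple(comps)
    return tuple(s / (zs[k] - zs[m]) for s in spins[k])


def poisson_bracket(j, k, spins, zs):
    """{H_j, H_k} = sum over sites m of S_m . (grad_m H_j x grad_m H_k)."""
    return sum(dot(spins[m], cross(gradient(j, m, spins, zs),
                                   gradient(k, m, spins, zs)))
               for m in range(len(zs)))


def quadratic_form(z, spins, zs):
    """Double sum over k, l of S_k . S_l / ((z - z_k)(z - z_l))."""
    n = len(zs)
    return sum(dot(spins[k], spins[l]) / ((z - zs[k]) * (z - zs[l]))
               for k in range(n) for l in range(n))


def pole_expansion(z, spins, zs):
    """Same quantity written as double poles plus simple poles 2 H_k / (z - z_k)."""
    return sum(dot(spins[k], spins[k]) / (z - zs[k]) ** 2
               + 2.0 * gaudin_hamiltonian(k, spins, zs) / (z - zs[k])
               for k in range(len(zs)))


if __name__ == "__main__":
    rng = random.Random(1)
    zs = [0.0, 1.0, 2.5, -1.3]
    spins = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in zs]
    print("sum of H_k:", sum(gaudin_hamiltonian(k, spins, zs) for k in range(4)))
    print("{H_0, H_1}:", poisson_bracket(0, 1, spins, zs))
    print("difference:", quadratic_form(0.37, spins, zs) - pole_expansion(0.37, spins, zs))
```

In this model the brackets $\{H_j, H_k\}$ vanish because the three contributions attached to each third site cancel through the identity $(z_k - z_l) - (z_j - z_l) + (z_j - z_k) = 0$, and $\sum_k H_k = 0$ by antisymmetry of $1/(z_k - z_l)$.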
Example 2. Let us explain how one may rederive the elliptic spin Calogero–Moser system from this general construction. Consider the torus $T_\tau \equiv \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$ of periods 1 and $\tau$, $\mathrm{Im}\,\tau > 0$. We shall use a coordinate $z$ on $T_\tau$ with the identification $z \sim z + 1$ and $z \sim z + \tau$. We consider the case with one marked point at $z = 0$. Let $u$ be an element of the (dual of the) Lie algebra attached to this marked point. We need a Cartan decomposition of the Lie algebra of $G$, i.e. consider the basis $(H_i, E_\alpha)$, with $H_i$ in the Cartan subalgebra of the Lie algebra of $G$ and $E_\alpha$ the root generators.

We describe any fibre bundle on the torus $T_\tau$ using two open sets: $U_0$, a small disc around $z = 0$, and $U_\infty = T_\tau - \{0\}$. The transition function is $g_{0\infty} = h_\infty h_0^{-1}$, where $h_0$ and $h_\infty$ are $C^\infty$ functions solving $A = h^{-1}\bar\partial h$ in the respective open sets $U_0$ and $U_\infty$. It follows from the above discussion, eq. (7.67), that the transition function $g_{0\infty}$ is equivalent to one of the form
$$g_{0\infty} = \exp(q/z), \quad\text{with } q \text{ diagonal}$$
Note that in general $h_0$ and $h_\infty$ are defined up to left multiplication by holomorphic functions, but the condition $h_\infty h_0^{-1} = g_{0\infty} = \exp(q/z)$ completely determines $h_0$ and $h_\infty$ up to left multiplication by the same constant diagonal matrix.

The condition $\bar\partial_A\Phi = 2i\pi u\,\delta_0$ implies, in $U_0$, that $\tilde\Phi_0 = h_0\Phi h_0^{-1}$ has a simple pole at $z = 0$ with residue $\tilde u = h_0(0)\, u\, h_0(0)^{-1}$. Similarly in $U_\infty$, $\tilde\Phi_\infty = h_\infty\Phi h_\infty^{-1}$ is regular. We have $\tilde\Phi_\infty = g_{0\infty}\tilde\Phi_0 g_{0\infty}^{-1}$, so that $\tilde\Phi$ defines a holomorphic section of our holomorphic vector bundle. Writing the patching condition when $z \to 0$ we see that $\tilde\Phi_\infty$ is regular on $T_\tau$ except at $z = 0$, where it behaves as (in the case where all holomorphic indices $k_i$ vanish):
$$\tilde\Phi_\infty = e^{q/z}\Big(\frac{\tilde u}{z} + O(1)\Big)e^{-q/z}$$
In the following we drop the subscript $\infty$ in $\tilde\Phi_\infty$. Introducing the root decompositions:
$$\tilde u = \sum_i \tilde u_i H_i + \sum_\alpha \tilde u_\alpha E_\alpha, \qquad \tilde\Phi = \Big(\sum_i \tilde\Phi_i H_i + \sum_\alpha \tilde\Phi_\alpha E_\alpha\Big)dz$$
we see that $\tilde\Phi_i \sim \frac{\tilde u_i}{z}$, and $\tilde\Phi_\alpha \sim \exp\big(\frac{\alpha(q)}{z}\big)\frac{\tilde u_\alpha}{z}$. The first condition means that $\tilde\Phi_i$ is an elliptic function with a single pole of order 1, which means that it is a constant. Hence $\tilde u_i = 0$ and $\tilde\Phi_i = p_i$, constant. The second condition means that $\tilde\Phi_\alpha$ has an essential singularity of the Baker–Akhiezer type at $z = 0$. The solution is the Lamé function:
$$\tilde\Phi_\alpha = -\tilde u_\alpha\,\frac{\sigma(z - \alpha(q))}{\sigma(z)\,\sigma(\alpha(q))}\, e^{\alpha(q)\zeta(z)}$$
The section $\tilde\Phi \equiv \tilde\Phi_\infty$ finally reads:
$$\tilde\Phi = \Big(\sum_i p_i H_i + \sum_\alpha \tilde\Phi_\alpha E_\alpha\Big)dz \tag{7.77}$$
In the $sl(n)$ case this is exactly of the same form as the Lax matrix of the elliptic spin Calogero–Moser model. The variables $(p_i, q_i)$, where $q = \sum_i q_i H_i$, are the momenta and positions of the particles of the Calogero–Moser model, while the $\tilde u_\alpha$ are the spin variables. Choosing now $P_2(\Phi) = \mathrm{Tr}(\Phi^2)$ and using the property of Lamé functions, eq. (7.15), we can relate $P_2(\tilde\Phi)$ to the Hamiltonian of the spin Calogero–Moser model:
$$P_2(\tilde\Phi) = H + \sum_\alpha \mathrm{Tr}(\tilde u_\alpha\tilde u_{-\alpha})\,\wp(z) \quad\text{with}\quad H = \sum_i p_i^2 - \sum_\alpha \wp(\alpha(q))\,\mathrm{Tr}(\tilde u_\alpha\tilde u_{-\alpha})$$
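As in the previous example, taking the quotient by the gauge group leaves only the variables $q_i$ and $p_i$, which parametrize respectively the moduli space $\mathcal{N}$ and the Cartan component of $\tilde\Phi$, and the spin variables $\tilde u = h_0(0)\, u\, h_0(0)^{-1}$ describing the $G$-orbit of $u$, restricted by $\tilde u_i = 0$. Moreover, since $h$ is defined up to left multiplication by a constant diagonal matrix, and this leaves the constraint $\tilde u_i = 0$ invariant, the phase space is:
$$\mathcal{P}_u = \{(p, q, \tilde u \in \mathcal{O}_u) \text{ s.t. } \tilde u_i = 0\}/H$$
where $H$ is the Cartan torus, acting on $\tilde u$ by conjugation.

It remains to show that our dynamical variables have the correct Poisson brackets. To do that we compute the canonical 1-form:
$$2i\pi\alpha = \int_\Sigma \mathrm{Tr}(\Phi\,\delta A) = \int_{U_0} \mathrm{Tr}(\Phi\,\delta A) + \int_{\Sigma - U_0} \mathrm{Tr}(\Phi\,\delta A)$$
where we use the description of the bundles on $T_\tau$ with two open sets. Writing $A = h_0^{-1}\bar\partial h_0 = h_\infty^{-1}\bar\partial h_\infty$ in the respective open sets, one computes:
$$\int_{U_0} \mathrm{Tr}(\Phi\,\delta A) = -\int_{U_0} \mathrm{Tr}(h_0^{-1}\delta h_0\,\bar\partial_A\Phi) + \int_{\partial U_0} \mathrm{Tr}(\Phi\, h_0^{-1}\delta h_0)$$
$$\int_{\Sigma - U_0} \mathrm{Tr}(\Phi\,\delta A) = -\int_{\Sigma - U_0} \mathrm{Tr}(h_\infty^{-1}\delta h_\infty\,\bar\partial_A\Phi) + \int_{\partial(\Sigma - U_0)} \mathrm{Tr}(\Phi\, h_\infty^{-1}\delta h_\infty)$$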
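First we have $\int_{\partial(\Sigma - U_0)} = -\int_{\partial U_0}$. Second, since $\bar\partial_A\Phi = 2i\pi u\,\delta_0$, the integral $\int_{\Sigma - U_0}$ vanishes. We get:
$$\alpha = -\mathrm{Tr}\big(u\, h_0(0)^{-1}\delta h_0(0)\big) + \frac{1}{2i\pi}\int_{\partial U_0} \mathrm{Tr}\big(\Phi\,(h_0^{-1}\delta h_0 - h_\infty^{-1}\delta h_\infty)\big)$$
The contour integral can be rewritten as $\int_{\partial U_0} \mathrm{Tr}\big(\tilde\Phi_\infty(h_\infty h_0^{-1}\delta h_0 h_\infty^{-1} - \delta h_\infty h_\infty^{-1})\big)$. Varying the equation $h_\infty h_0^{-1} = g_{0\infty} = \exp(q/z)$ we have:
$$\frac{\delta q}{z} = \delta h_\infty h_\infty^{-1} - h_\infty h_0^{-1}\delta h_0 h_\infty^{-1}$$
so that we finally get:
$$\alpha = -\mathrm{Tr}\big(u\, h_0(0)^{-1}\delta h_0(0)\big) - \frac{1}{2i\pi}\int_{\partial U_0} \mathrm{Tr}\Big(\tilde\Phi_\infty\,\frac{\delta q}{z}\Big) = -\sum_i p_i\,\delta q_i - \mathrm{Tr}\big(u\, h(0)^{-1}\delta h(0)\big)$$
To evaluate the contour integral, we noticed that, $\delta q$ being diagonal, only the diagonal part of $\tilde\Phi$ in eq. (7.77) contributes to the trace. It is clear that the constraint $\tilde u_i = 0$ and the quotient by the Cartan torus is a symplectic quotient of the coadjoint orbit, so finally the symplectic form reads:
$$\omega = -\delta\alpha = \sum_i \delta p_i \wedge \delta q_i + \omega_K$$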
Remark 1. The solution of the Hitchin systems can be viewed as the projection, under taking the quotient by the gauge group, of a straight line motion on $T^*\mathcal{A}$, since on this space the equations of motion are of the form $\dot\Phi = 0$ and $\dot A = P(\Phi) = \text{Const}$.

Remark 2. Note that the constraints $\tilde u_i = 0$ are the same as the constraints $f_{ii} = \alpha$ in eq. (7.5), because the Cartan algebra for $sl(N)$ is generated by $E_{ii} - E_{jj}$. Moreover, it is clear that $\tilde\Phi$ occurring in eq. (7.77) is the same as the Lax matrix in eq. (7.12). This provides some insight into the true nature of the Lax matrix, which appears as a form-valued holomorphic section of a vector bundle (depending on the dynamical data).

Remark 3. The above construction naturally yields a Lax pair formulation of the equations of motion. Since $\tilde\Phi = h\Phi h^{-1}$, we have $d\tilde\Phi/dt = [M, \tilde\Phi]$ with $M = \dot h h^{-1}$. As was emphasized above, $h$ is determined up to a constant (in $z$) diagonal matrix; the diagonal action on these data is given by $\tilde\Phi \to l\,\tilde\Phi\, l^{-1}$ and $M \to l M l^{-1} + \dot l\, l^{-1}$.
244
7 The Calogero–Moser model 7.13 The trigonometric Calogero–Moser model
In this section we present the construction, due to Olshanetsky and Perelomov, of the trigonometric spin Calogero–Moser model by Hamiltonian reduction of the geodesic motion on a symmetric space. The Lax matrix of the elliptic spin Calogero–Moser model, eq. (7.12), reads in the trigonometric limit:
$$L_{ij} = p_i\,\delta_{ij} + (1-\delta_{ij})\, f_{ij}\,\big(\coth(q_{ij}) - \coth(\lambda)\big)\, e^{q_{ij}\coth(\lambda)}$$
In the limit $\lambda \to -\infty$ we have $\coth(\lambda) \to -1$, and we get:
$$L_{ij} = p_i\,\delta_{ij} + (1-\delta_{ij})\,\frac{f_{ij}}{\sinh(q_i - q_j)}$$
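The limit above can be checked numerically. The following is a minimal sketch, assuming real separations $q = q_i - q_j$; the function names `offdiag_factor` and `limit_factor` are illustrative and do not come from the text. It compares the factor multiplying $f_{ij}$ at a large negative $\lambda$ with $1/\sinh(q)$.

```python
import math

def coth(x):
    """Hyperbolic cotangent for real x != 0."""
    return math.cosh(x) / math.sinh(x)

def offdiag_factor(q, lam):
    """Factor multiplying f_ij in the trigonometric Lax matrix, with q = q_i - q_j."""
    return (coth(q) - coth(lam)) * math.exp(q * coth(lam))

def limit_factor(q):
    """Claimed limit of offdiag_factor as lam -> -infinity."""
    return 1.0 / math.sinh(q)

# Compare the two expressions for a few sample separations (hypothetical values).
for q in (0.3, -1.1, 2.0):
    print(q, offdiag_factor(q, -30.0), limit_factor(q))
```

The agreement follows from $(\coth q + 1)\,e^{-q} = e^{q}e^{-q}/\sinh q$.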
This Lax matrix, without spectral parameter, is a good Lax matrix in the case of the scalar trigonometric Calogero–Moser model, as it produces $N$ commuting conserved quantities. In the spin case, however, it does not provide enough commuting conserved quantities. We will construct this Lax matrix by a Hamiltonian reduction procedure, starting from a finite-dimensional space. The construction is similar to that in the case of the open Toda chain, see Chapter 4. One starts from the symplectic space $T^*G$, where $G$ is a complex Lie group, and reduces under the action on the left by a subgroup $H_L$ and on the right by a subgroup $H_R$. The general theory, see Chapter 14, shows that the Poisson brackets in the coordinates $(g, \xi)$ are expressed as:
$$\{\xi(X), \xi(Y)\} = \xi([X, Y]),\qquad \{\xi(X), g\} = gX,\qquad \{g, g\} = 0 \tag{7.78}$$
Recall that here $g$ is the matrix of functions on $G$ whose elements are the matrix elements of $g$ in a faithful representation. As in the case of the Toda chain we reduce the geodesic motion on $G$. The Hamiltonian on $T^*G$ reads $H_{\rm geod} = \frac{1}{2}\mathrm{Tr}\,(\xi^2)$. Notice that $H_{\rm geod}$ is bi-invariant, so one can attempt to reduce this dynamical system using Lie subgroups $H_L$ and $H_R$ of $G$, acting respectively on the left and on the right on $T^*G$. Recall that the corresponding moments are, see eq. (4.49) in Chapter 4:
$$P^L(g, \xi) = -P_{\mathcal{H}_L^*}\, g\xi g^{-1},\qquad P^R(g, \xi) = P_{\mathcal{H}_R^*}\,\xi \tag{7.79}$$
where we have introduced the projectors on $\mathcal{H}_{L,R}^*$ of elements of $\mathcal{G}^*$ induced by the restriction of these elements (linear forms) to $\mathcal{H}_{L,R}$, the Lie algebras of $H_{L,R}$.

Let us consider an involutive automorphism $\sigma$ of $G$ and let $H$ be the subgroup of its fixed points. Then $H$ acts on the right on $G$, defining a principal fibre bundle of total space $G$ and base $G/H$, which is a symmetric space. Moreover, $G$ acts on the left on $G/H$ and in particular so does $H$ itself. We shall consider the reduction of the geodesic motion on $G$ under the product $H_L \times H_R$ with $H_L = H_R = H$. The reduction is achieved with an adequate choice of the momentum $\mu = (-\mu_L, \mu_R)$ such that $(P^L, P^R) = \mu$. We take $\mu_R = 0$ so that the isotropy group of the right action is $H_R$ itself. In this way we are in fact dealing with motions on $G/H$.

The derivative of $\sigma$ at the unit element of $G$ is an involutive automorphism of the Lie algebra $\mathcal{G}$, also denoted by $\sigma$. Let us consider its eigenspaces $\mathcal{H}$ and $\mathcal{K}$ associated with the eigenvalues $+1$ and $-1$ respectively. We have a decomposition:
$$\mathcal{G} = \mathcal{H} \oplus \mathcal{K} \tag{7.80}$$
in which $\mathcal{H}$ is the Lie algebra of $H$. Note that $h\mathcal{K}h^{-1} = \mathcal{K}$ for $h \in H$.

We shall consider the particular case obtained by taking $G = SL(N, \mathbb{C})$ and $H = SU(N)$. Here, we view $SL(N, \mathbb{C})$ as a real Lie group. Counting real parameters, $\dim G = 2N^2 - 2$ (note that $\det M = 1$ yields two real conditions) and $\dim H = N^2 - 1$. The automorphism $\sigma$ is $\sigma(g) = (g^*)^{-1}$, where $*$ means transpose and complex conjugate. The set of its fixed points is $SU(N)$. The symmetric space $K = G/H$ may be identified with the space of positive Hermitian matrices. Indeed, for any matrix $M \in SL(N, \mathbb{C})$ we can uniquely write the Cayley (polar) decomposition $M = KU$ with $U \in SU(N)$ and $K$ positive Hermitian. This is because $MM^*$ is Hermitian and strictly positive, so we can write $MM^* = VDV^{-1}$ with $V \in SU(N)$ and $D$ diagonal real positive. We define $K = V\sqrt{D}\,V^{-1}$, where $\sqrt{D}$ is the positive square root. Then we check that $K^{-1}M$ is in $SU(N)$.

At the Lie algebra level we have the decomposition (7.80), where $\mathcal{H}$ is the Lie algebra of anti-Hermitian matrices and $\mathcal{K}$ is the space of Hermitian matrices. Finally, $\sigma$ reads $X \to -X^*$, and is an automorphism for the structure of a real Lie algebra. The dual of $\mathcal{G}$ is identified with $\mathcal{G}$ using the symmetric real invariant bilinear form $(X, Y) = \mathrm{Tr}\,(XY + X^*Y^*)$.

The choice of the moment $\mu_L$ is of course of crucial importance. We will consider $\mu_L$ in the (dual of the) Lie algebra of $H$, i.e. an anti-Hermitian matrix, such that its isotropy subgroup $H_\mu$ is a maximal proper Lie subgroup of $H_L$, so that the phase space of the reduced system is of minimal dimension but non-trivial. As a matter of fact, the dimension of the reduced phase space is $2\dim G - \dim(H \times H) - \dim(H_\mu \times H)$. This is because $\dim T^*G = 2\dim G$, the constraint $(P^L, P^R) = (-\mu_L, 0)$ yields $\dim(H \times H)$ equations, and we still have to quotient by $H_\mu \times H$. To analyse the stabilizer of $\mu$ we have to solve $h^{-1}\mu h = \mu$. In a basis where $\mu$ is diagonal, we see that if $\mu$ has $N - l$ equal eigenvalues and the other $l$ eigenvalues all different, the stabilizer is $U(1)^l \times U(N - l)/U(1)$, of dimension $l + (N-l)^2 - 1$. This yields a reduced phase space of dimension $2Nl - l(l + 1)$. Note that this is the dimension of the phase space of the spin Calogero–Moser model, eq. (7.8), apart from a discrepancy of 2 which corresponds to a reduction by the centre of mass motion, as we shall see.

We now introduce coordinates on the group $G$. Using the Cayley decomposition, we can write uniquely $g = KU$ with $K$ Hermitian positive and $U$ unitary. Diagonalizing $K$ we write $K = h_L Q h_L^{-1}$ with $Q$ diagonal with real positive entries, and $h_L \in SU(N)$. The columns of $h_L$ form an orthonormal basis of eigenvectors of $K$, hence are defined up to a phase. So we can write $g = h_L Q h_R^{-1}$, where $h_L, h_R \in SU(N)$ are defined up to right multiplication by the same unitary diagonal matrix. The set of points of $T^*G$ with given moment $(-\mu_L, 0)$ is defined by the equations:
$$P_{\mathcal{H}}(g\xi g^{-1}) = P_{\mathcal{H}}(h_L Q h_R^{-1}\,\xi\, h_R Q^{-1} h_L^{-1}) = \mu_L,\qquad P_{\mathcal{H}}(\xi) = 0 \tag{7.81}$$
The second condition means that $\xi \in \mathcal{K}$ is a Hermitian traceless matrix. Setting $L = h_R^{-1}\xi h_R \in \mathcal{K}$, the first condition reads $(h_L QLQ^{-1} h_L^{-1}, X) = (\mu_L, X)$, $\forall X \in \mathcal{H}$. Equivalently $(QLQ^{-1}, h_L^{-1}Xh_L) = (h_L^{-1}\mu_L h_L, h_L^{-1}Xh_L)$, yielding:
$$P_{\mathcal{H}}(QLQ^{-1}) = h_L^{-1}\mu_L h_L$$
Setting $Q = \mathrm{Diag}\,(e^{q_i})$ with $\sum_i q_i = 0$ (hence the reduction by the centre of mass motion), we have $(QLQ^{-1})_{ij} = \exp(q_i - q_j)\,L_{ij}$. The projection on $\mathcal{H}$ amounts to taking the anti-Hermitian part of this matrix, giving the equation $\sinh(q_{ij})\,L_{ij} = \tilde\mu_{ij}$ with $\tilde\mu_{ij} = (h_L^{-1}\mu_L h_L)_{ij}$. The solution of this equation is:
$$L_{ij} = \frac{\tilde\mu_{ij}}{\sinh(q_{ij})},\quad i \neq j,\qquad L_{ii} = p_i,\qquad \sum_i p_i = 0,\qquad \tilde\mu_{ii} = 0$$
where the $p_i$ are arbitrary real numbers. We have found:
$$L = \sum_i p_i\,E_{ii} + \sum_{i \neq j} \frac{\tilde\mu_{ij}}{\sinh(q_{ij})}\,E_{ij}$$
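The solution of the moment constraint can be verified on a small example. The sketch below is illustrative only: the sample values of `q`, `p` and `mu` (standing for the projected moment written $\tilde\mu$ or $\mu$ in the text) are hypothetical, and the helper names are not from the text. It builds $L$ from the formula above, checks that $L$ is Hermitian, and checks that the anti-Hermitian part of $QLQ^{-1}$ reproduces `mu`.

```python
import math

def dagger(m):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def lax_matrix(p, q, mu):
    """L_ii = p_i and L_ij = mu_ij / sinh(q_i - q_j) for i != j."""
    n = len(p)
    return [[complex(p[i]) if i == j else mu[i][j] / math.sinh(q[i] - q[j])
             for j in range(n)] for i in range(n)]

def antihermitian_part_QLQinv(L, q):
    """Anti-Hermitian part (A - A^*)/2 of A = Q L Q^{-1}, with Q = Diag(exp(q_i))."""
    n = len(q)
    A = [[math.exp(q[i] - q[j]) * L[i][j] for j in range(n)] for i in range(n)]
    Ad = dagger(A)
    return [[(A[i][j] - Ad[i][j]) / 2 for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

# Hypothetical sample data: q and p sum to zero, mu is anti-Hermitian
# with vanishing diagonal.
q = [0.7, -0.2, -0.5]
p = [1.0, -0.4, -0.6]
mu = [[0j, 0.3 + 0.5j, -0.2 + 0.1j],
      [-0.3 + 0.5j, 0j, 0.4j],
      [0.2 + 0.1j, 0.4j, 0j]]

L = lax_matrix(p, q, mu)
projected = antihermitian_part_QLQinv(L, q)
print(max_abs_diff(L, dagger(L)), max_abs_diff(projected, mu))
```

Both printed numbers should be at the level of rounding errors. The diagonal of the projection vanishes identically, which is why the constraint forces the diagonal of the projected moment to be zero.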
We recognize the Lax matrix of the trigonometric spin Calogero–Moser model, with the spin variables $\tilde\mu_{ij}$ describing the coadjoint $H$-orbit of the momentum $\mu_L$. In view of the definition of $L$, the ambiguity in the definition of $h_L$, $h_R$ amounts to quotienting by a conjugation by a diagonal unitary matrix.

We still have not explored the implications of the conditions $\tilde\mu_{ii} = 0$ on $h_L$. We do this for the case of the scalar model, i.e. $\mu_L$ has $N - 1$ equal eigenvalues. This means that $\mu_L$ is of the form $\mu_L = i(VV^* - \alpha I)$, where $V$ is an $N$-vector and $\alpha$ is determined so that $\mu_L$ is traceless. We then have $\tilde\mu = i(WW^* - \alpha I)$, where $W = h_L^{-1}V$. The conditions $\tilde\mu_{ii} = 0$ read $W_i\bar W_i = \alpha$ for all $i$. Since $h_L$ is defined up to right multiplication by a diagonal unitary matrix, we can always assume that $W_i$ is real positive, so that $W_i = \sqrt{\alpha}$. There always exists some fixed element $h_0^{-1} \in SU(N)$ mapping $V$ to the vector $W = \sqrt{\alpha}\,(1, \ldots, 1)^t$, and any other solution $h_L$ is of the form $h_L = h_0 h_W$, where $h_W$ runs over the stability group of $W$, which is exactly $h_0^{-1}H_\mu h_0$, where $H_\mu$ is the stabilizer of $\mu$. Here the element $h_0$ plays essentially no role, and one usually takes $\mu_L = i(WW^* - \alpha I)$. Then all solutions of the constraints $(P^L, P^R) = (-\mu_L, 0)$ are of the form $(h_L, h_R).(Q, L)$ with $Q$, $L$ as above, and $h_L \in H_\mu$, as it should be. Quotienting by $H_\mu \times H$, according to the general procedure, yields the phase space of the Calogero–Moser model. A similar analysis can be performed in the case of the spin Calogero–Moser model.

Finally, it is easy to see that the symplectic structure of the cotangent bundle we started with gives the symplectic structure of the spin Calogero–Moser model under reduction. It is enough to compute the canonical 1-form:
$$\alpha = (\xi, g^{-1}\delta g) = (QLQ^{-1}, h_L^{-1}\delta h_L) + (L, Q^{-1}\delta Q) - (L, h_R^{-1}\delta h_R)$$
Using the constraint eq. (7.81), the first term is $(\mu_L, \delta h_L h_L^{-1})$, that is the Kirillov structure on the spin variables, the second term is $\sum_i p_i\,\delta q_i$, and the last term vanishes because the constraint $P_{\mathcal{H}}(\xi) = 0$ also implies $P_{\mathcal{H}}(L) = 0$.

References
[1] F. Calogero, Exactly solvable one-dimensional many-body systems. Lett. Nuovo Cimento 13 (1975) 411–415.
[2] J. Moser, Three integrable Hamiltonian systems connected with isospectral deformations. Adv. Math. 16 (1975) 197–220.
[3] M.A. Olshanetsky and A.M. Perelomov, Classical integrable finite-dimensional systems related to Lie algebras. Physics Reports 71 (1981).
[4] I.M. Krichever, Elliptic solutions of Kadomtsev–Petviashvili equation and integrable systems of particles. Func. Anal. App. 14 (1980) no. 4, 282–290.
[5] J. Gibbons and T. Hermsen, A generalization of the Calogero–Moser system. Physica 11D (1984) 337.
[6] H. Airault, H. McKean and J. Moser, Rational and elliptic solutions of the KdV equation and related many-body problem. Comm. Pure Appl. Math. 30 (1977) 95–125.
[7] N. Hitchin, Stable bundles and integrable systems. Duke Math. Journ. 54 (1987) 91.
[8] M.S. Narasimhan and C.S. Seshadri, Holomorphic vector bundles on a compact Riemann surface. Math. Ann. 155 (1964) 69–80.
[9] R.C. Gunning, Lectures on vector bundles over Riemann surfaces, Princeton University Press (1967).
[10] O. Forster, Lectures on Riemann Surfaces, Springer (1981).
8 Isomonodromic deformations
In this chapter, we consider isomonodromic deformations of the first order linear differential operator $\partial_\lambda - M_\lambda(\lambda)$. Here the problem is to determine $M_\lambda$ such that solutions of this differential equation have a given monodromy. In general, the solution is not unique, and depends on a number of continuous parameters, the so-called isomonodromic deformation parameters. We show that the deformation equations with respect to these parameters are an integrable system. Ordinary integrable systems with a Lax matrix appear as particular cases of such systems, namely when the group generated by the monodromies is finite. However, the new setting is much more general. Just as solutions of Lax equations were written in terms of theta-functions, in the general case, the solutions can be written in terms of new functions called tau-functions. We express the dynamical variables in terms of tau-functions and their so-called Schlesinger transforms. We show that, in terms of tau-functions, the equations of motion take the form of bilinear Hirota equations. Finally, we show that the Painlevé equations can be interpreted as isomonodromic deformation equations.

8.1 Introduction

In Chapter 3, we have seen that the Zakharov–Shabat construction yields hierarchies of integrable equations of the form:
$$\partial_{t_i}\Psi(\lambda) = M_i(\lambda)\Psi(\lambda),\qquad M_i(\lambda) = \big(g(\lambda)\,\xi_i(\lambda)\,g^{-1}(\lambda)\big)_- \tag{8.1}$$
Here $i$ is a multi-index $i = (k, n, \alpha)$, $\xi_i(\lambda)$ is the diagonal matrix $\xi_i(\lambda) = (\lambda - \lambda_k)^{-n}E_{\alpha\alpha}$, $g(\lambda)$ is a regular matrix, and
$$\Psi(\lambda) = g(\lambda)\,e^{\sum_i \xi_i(\lambda)\,t_i}$$
is a matrix of size $N \times N$. The notation $(\ )_-$ means taking the polar part at $\lambda_k$. In Chapter 3, the wave function $\Psi(\lambda)$ was defined through the Lax equation $L(\lambda)\Psi(\lambda) = \Psi(\lambda)\,\mu$, where $\mu$ is the diagonal matrix of the eigenvalues of $L(\lambda)$. We have shown that the isospectral flows defined by eq. (8.1) are all commuting and satisfy the zero curvature equations:
$$\partial_{t_i}M_j - \partial_{t_j}M_i - [M_i, M_j] = 0 \tag{8.2}$$
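In this chapter we enlarge the framework by replacing the Lax equation by a linear differential equation in the spectral parameter $\lambda$:
$$\partial_\lambda\Psi = M_\lambda\Psi \tag{8.3}$$
We assume that the entries of the matrix $M_\lambda(\lambda)$ are rational functions of $\lambda$:
$$M_\lambda(\lambda) = \sum_{k=1}^{K}\left(\frac{A_1^{(k)}}{\lambda - \lambda_k} + \cdots + \frac{A_{n_k+1}^{(k)}}{(\lambda - \lambda_k)^{n_k+1}}\right) - A_0^{(\infty)} - \cdots - A_{n_\infty-1}^{(\infty)}\lambda^{n_\infty-1} \tag{8.4}$$
This is of the form $M_\lambda(\lambda) = \sum_k M_\lambda^{(k)}(\lambda)$, where $M_\lambda^{(k)}$ is the polar part at $\lambda_k$ (including $\infty$). The polar parts $M_\lambda^{(k)}$ at $\lambda_k$ are readily obtained from the expression of $M_\lambda$, but some care must be taken concerning the polar part at $\infty$. To define $M_\lambda^{(\infty)}$ one chooses a local parameter $z = 1/\lambda$ and writes the differential equation as $\partial_z\Psi(z) = (M_\lambda^{(\infty)}(z) + \text{regular})\Psi(z)$, so that using eq. (8.4) one gets:
$$M_\lambda^{(\infty)} \equiv -\left(\frac{1}{z^2}M_\lambda\Big(\frac{1}{z}\Big)\right)_- = A_0^{(\infty)}z^{-2} + \cdots + A_{n_\infty-1}^{(\infty)}z^{-n_\infty-1} - \sum_k A_1^{(k)}z^{-1} \tag{8.5}$$
The solutions $\Psi(\lambda)$ of the linear differential equation (8.3) have essential singularities at the points $\lambda = \lambda_k$ and at $\lambda = \infty$. They are otherwise analytic but have non-trivial monodromies around the singularities. So, they are in fact defined on a Riemann surface with an infinite (in general) number of sheets. This will be understood in the following. Again we will show that around each pole $\lambda_k$ of $M_\lambda$, the function $\Psi(\lambda)$ admits an expansion of the form
$$\Psi(\lambda) \sim \Psi_{\rm asy}^{(k)}(\lambda) = g^{(k)}(\lambda)\,e^{\xi^{(k)}(\lambda)}$$
where $g^{(k)}(\lambda) = g_0^{(k)}\big(1 + g_1^{(k)}(\lambda - \lambda_k) + \cdots\big)$ is regular at $\lambda_k$ and
$$\xi^{(k)} = B_0^{(k)}\log(\lambda - \lambda_k) + \sum_{n=1}^{n_k}\sum_{\alpha}\frac{t_{(k,n,\alpha)}}{(\lambda - \lambda_k)^n}\,E_{\alpha\alpha},\qquad \xi^{(\infty)} = B_0^{(\infty)}\log\frac{1}{\lambda} + \sum_{n=1}^{n_\infty}\sum_{\alpha}t_{(\infty,n,\alpha)}\,\lambda^n E_{\alpha\alpha} \tag{8.6}$$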
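The polar part at infinity stated in eq. (8.5) can be checked on a scalar instance of eq. (8.4). The sketch below uses exact rational arithmetic and assumes one finite pole of order 2 (so $n_1 = 1$) and $n_\infty = 2$; the names `a1`, `a2`, `c0`, `c1`, `lam1` and the sample values are hypothetical stand-ins for $A_1^{(1)}$, $A_2^{(1)}$, $A_0^{(\infty)}$, $A_1^{(\infty)}$ and $\lambda_1$.

```python
from fractions import Fraction as F

def M_lambda(lam, lam1, a1, a2, c0, c1):
    """Scalar instance of eq. (8.4): a double pole at lam1 plus a polynomial part."""
    return a1 / (lam - lam1) + a2 / (lam - lam1) ** 2 - c0 - c1 * lam

def polar_part_at_infinity(z, a1, c0, c1):
    """Right-hand side of eq. (8.5) for the same data, in the variable z = 1/lambda."""
    return c0 / z**2 + c1 / z**3 - a1 / z

def remainder(z, lam1, a1, a2, c0, c1):
    """-(1/z^2) M_lambda(1/z) minus the claimed polar part; must stay finite at z = 0."""
    full = -M_lambda(1 / z, lam1, a1, a2, c0, c1) / z**2
    return full - polar_part_at_infinity(z, a1, c0, c1)

# Hypothetical sample coefficients.
lam1, a1, a2, c0, c1 = 2, 3, 5, 7, 11
for k in (2, 4, 6):
    z = F(1, 10**k)
    print(k, float(remainder(z, lam1, a1, a2, c0, c1)))
```

As $z \to 0$ the remainder tends to the finite value $-a_1\lambda_1 - a_2$, so only the terms kept in eq. (8.5) are singular.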
While the equations look the same as in the isospectral context, there are important differences. One of them is that the quantity $\xi^{(k)}$ now includes a logarithmic term, and another is that the expansions are now only valid in the asymptotic sense (in general), hence the notation $\Psi_{\rm asy}(\lambda)$. Nevertheless, this is sufficient to show that the polar part $M_\lambda^{(k)}$ of $M_\lambda$ at $\lambda_k$ is given by
$$M_\lambda^{(k)} = \big(g^{(k)}\,\partial_\lambda\xi^{(k)}\,g^{(k)\,-1}\big)_-$$
The solution $\Psi(\lambda)$ of eq. (8.3) has non-trivial monodromy properties when $\lambda$ makes a loop around a singularity of $M_\lambda(\lambda)$. We call isomonodromic deformations the deformations of $M_\lambda(\lambda)$ such that the monodromy properties of $\Psi(\lambda)$ are kept fixed. Our aim is to write evolution equations which describe these deformations. The isomonodromic deformation parameters will include as before the $t_{(k,n,\alpha)}$ occurring in eq. (8.6). Under these time evolutions we have:
$$\partial_{t_i}\Psi = M_i\Psi,\qquad M_i = \big(g^{(k)}\,\partial_{t_i}\xi^{(k)}\,g^{(k)\,-1}\big)_-,\qquad i = (k, n, \alpha)$$
Moreover, this set of commuting flows will be enlarged by varying the positions $\lambda_k$ of the poles of $M_\lambda(\lambda)$:
$$\partial_{\lambda_k}\Psi = M_{\lambda_k}\Psi,\qquad M_{\lambda_k} = \big(g^{(k)}\,\partial_{\lambda_k}\xi^{(k)}\,g^{(k)\,-1}\big)_- \tag{8.7}$$
All these flows will be interpreted as isomonodromic deformations of the differential equation $\partial_\lambda\Psi = M_\lambda\Psi$, and will be shown to commute.

8.2 Monodromy data

In this section, we define precisely the monodromy properties of a linear differential equation of the form eq. (8.3). Our purpose is to clarify the relation between the true solution $\Psi(\lambda)$, and local expansions $\Psi_{\rm asy}(\lambda) = g(\lambda)\exp(\xi(\lambda))$ around each pole $\lambda_k$. We start from a wave-function $\Psi(\lambda)$, solution of the linear differential equation eq. (8.3), where $M_\lambda(\lambda)$ is a globally defined rational function of $\lambda$. That is, it is a sum of polar parts $M_\lambda^{(k)}(\lambda)$ at given poles $\lambda_k$, as in eq. (8.4). Here we consider the parameters $A_j^{(k)}$ as given and our aim is to reconstruct the asymptotic expansions from the differential equation. We begin by recalling some basic definitions and facts concerning linear differential equations with singular points.

Definition. The point $\lambda_k$ is a regular singular point of the linear differential equation eq. (8.3), if $M_\lambda(\lambda)$ has a simple pole at $\lambda_k$. When $M_\lambda(\lambda)$ has a pole of order $n_k + 1 \geq 2$, the point $\lambda_k$ is called an irregular singular point of rank $n_k$.

A more refined definition should include the notion of apparent singularities removable through simple changes of variables. Special care must be taken at the point $\lambda = \infty$. To include it into this definition we set $z = 1/\lambda$. The equation becomes:
$$\partial_z\Psi = -\frac{1}{z^2}\,M_\lambda(1/z)\,\Psi \tag{8.8}$$
When $M_\lambda(\lambda)$ is rational in $\lambda$, this equation is of the same form as the original one, so that we can in fact consider that it is defined on the Riemann sphere. Assume first that $M_\lambda(\lambda)$ is a sum of poles at finite distance, i.e. all coefficients $A_j^{(\infty)} = 0$ in eq. (8.4). We then have $M_\lambda(\lambda) = O(1/\lambda)$ at $\infty$ so that $\frac{1}{z^2}M_\lambda(1/z)$ has a simple pole at $z = 0$, which is thus a regular singular point. On the other hand, if $A_{n_\infty-1}^{(\infty)} \neq 0$ then $\frac{1}{z^2}M_\lambda(1/z) \sim 1/z^{n_\infty+1}$ and we have an irregular singularity of rank $n_\infty$ (in particular this is the case if we have only a constant term corresponding to $n_\infty = 1$).

The behaviour of solutions at the two types of singularities is very different. Assume first that eq. (8.3) has a regular singularity at $\lambda = 0$, i.e. one can write it $\lambda\,\partial_\lambda\Psi(\lambda) = A(\lambda)\Psi(\lambda)$, with $A(\lambda)$ analytic around $\lambda = 0$:
$$A(\lambda) = A_0 + A_1\lambda + A_2\lambda^2 + \cdots$$
We also assume that all eigenvalues of $A_0$ are different and do not differ by integers.

Theorem. Let $\lambda = 0$ be a regular singularity. There exists a fundamental matrix of solutions in some neighbourhood of $\lambda = 0$ of the form:
$$\Psi(\lambda) = g(\lambda)\,e^{B\log\lambda}$$
where $g(\lambda)$ is analytic around $\lambda = 0$, and $B = \mathrm{Diag}\,(\alpha_i)$, where $\alpha_i$ are the eigenvalues of $A_0$.

The coefficients of the expansion of $g(\lambda)$ are obtained by plugging a series expansion into the differential equation, and determining the coefficients recursively. Indeed, by a constant similarity transformation, we can assume that $A_0$ is diagonal. Setting $g = 1 + g_1\lambda + g_2\lambda^2 + \cdots$ (so that $g_0 = 1$) we get the recursive system:
$$r\,g_r - [A_0, g_r] = \sum_{i=0}^{r-1}A_{r-i}\,g_i$$
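The recursion can be run mechanically. The following is a minimal sketch in exact rational arithmetic, assuming $A_0$ is already diagonal and that $A(\lambda)$ is given by finitely many Taylor coefficients; the function names and the sample matrices are hypothetical. It solves the recursion componentwise by dividing by $r - \alpha_i + \alpha_j$, and then checks order by order that $\lambda g' + gA_0 - Ag$ vanishes, which is the equation obeyed by $g$ when $\Psi = g\,\lambda^{A_0}$.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def series_coefficients(A, order):
    """Return [g_0, ..., g_order] for lambda * dPsi/dlambda = A(lambda) Psi.

    A = [A_0, A_1, ...] holds the Taylor coefficients of A(lambda). A_0 must be
    diagonal, with entries alpha_i such that r - alpha_i + alpha_j never vanishes.
    """
    n = len(A[0])
    alpha = [A[0][i][i] for i in range(n)]
    g = [[[F(int(i == j)) for j in range(n)] for i in range(n)]]  # g_0 = identity
    for r in range(1, order + 1):
        # Right-hand side: sum over i = 0..r-1 of A_{r-i} g_i (known coefficients only).
        R = [[F(0)] * n for _ in range(n)]
        for i in range(r):
            if r - i < len(A):
                term = matmul(A[r - i], g[i])
                R = [[R[a][b] + term[a][b] for b in range(n)] for a in range(n)]
        # Componentwise solution of r g_r - [A_0, g_r] = R.
        g.append([[R[a][b] / (r - alpha[a] + alpha[b]) for b in range(n)]
                  for a in range(n)])
    return g

def residual(A, g, r):
    """Coefficient of lambda^r in lambda g' + g A_0 - A g; zero for a solution."""
    n = len(A[0])
    gA0 = matmul(g[r], A[0])
    out = [[r * g[r][a][b] + gA0[a][b] for b in range(n)] for a in range(n)]
    for i in range(r + 1):
        if r - i < len(A):
            term = matmul(A[r - i], g[i])
            out = [[out[a][b] - term[a][b] for b in range(n)] for a in range(n)]
    return out

# Hypothetical sample: A(lambda) = A_0 + A_1 lambda with alpha = (1/3, 0).
A = [[[F(1, 3), F(0)], [F(0), F(0)]],
     [[F(1), F(2)], [F(3), F(-1)]]]
g = series_coefficients(A, 6)
print(g[1])
```

In the scalar case $A(\lambda) = \alpha + a\lambda$ the recursion gives $g_r = a^r/r!$, i.e. $g = e^{a\lambda}$, in agreement with the explicit solution $\Psi = \lambda^\alpha e^{a\lambda}$.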
In components, the recursion reads $(r - \alpha_i + \alpha_j)(g_r)_{ij} = R_{ij}^{(r-1)}$, where $R^{(r-1)}$ only depends on $g_1, \ldots, g_{r-1}$, so that $g_r$ is uniquely determined if the eigenvalues $\alpha_i$ do not differ by integers. Moreover, it is easy to show that one gets a series with non-vanishing radius of convergence.

If now we have an irregular singularity at $\lambda = 0$, the situation is very different. Let us assume that the equation is of the form $\lambda^{n+1}\partial_\lambda\Psi(\lambda) = A(\lambda)\Psi(\lambda)$ with $n \geq 1$, and that the most singular term, i.e. $A_0$, has distinct eigenvalues. One can find a formal expansion $g(\lambda) = g_0 + g_1\lambda + \cdots$ by plugging it into the differential equation, but one finds a system of linear equations of the form $[g_r, A_0] = R^{(r-1)}$, where the right-hand side depends on lower order terms. One can obtain formal solutions, but the resulting series is in general divergent. The precise meaning of $\Psi \sim \Psi_{\rm asy}$ is that $\Psi(\lambda)\exp(-\xi(\lambda))$ is asymptotically equal to the formal series $g(\lambda)$.

Theorem. Let $\lambda = 0$ be an irregular singularity. The differential equation has a fundamental system of formal solutions in the neighbourhood of $\lambda = 0$ of the form:
$$\Psi_{\rm asy}(\lambda) = g(\lambda)\,e^{\xi(\lambda)} \tag{8.9}$$
where $\xi(\lambda) = B_0\log\lambda + \cdots$ is a diagonal matrix, and the dots represent a polynomial in $1/\lambda$ with dominant term $-\frac{1}{n\lambda^n}\mathrm{Diag}\,(\alpha_1, \ldots, \alpha_N)$, where the $\alpha_i$ are the eigenvalues of the matrix $A_0$, and $g(\lambda)$, which is determined up to a right multiplication by a constant diagonal matrix, has a formal expansion of the form $g(\lambda) = g_0 + g_1\lambda + \cdots$. In each angular sector $S$ of angle slightly bigger than $\pi/n$ with vertex $\lambda = 0$, there exists a unique true solution $\Psi(\lambda)$ which admits, in $S$, an asymptotic expansion given by the above formal series $\Psi_{\rm asy}$.

Proof. We shall not give a complete proof of these theorems here (see the References, particularly Wasow) but sketch a few of the ideas involved. It is simpler to assume that the singularity is at infinity, so we set $z = 1/\lambda$.
We consider the equation $z^{-q}\Psi'(z) = A(z)\Psi(z)$ around the singular point $z = \infty$ of order $(q + 1)$. We assume that $A(z) = A_0 + \frac{1}{z}A_1 + \cdots$, and make the further simplifying assumption that all eigenvalues of $A_0$ are different, hence there is no restriction in taking $A_0$ diagonal. We show below that there is a matrix $P(z)$ such that under the transformation $\Psi(z) = P(z)W(z)$, the differential equation becomes $W'(z) = z^qQ(z)W(z)$ with $Q(z)$ diagonal. This equation is readily solved, yielding:
$$\Psi(z) = P(z)\,e^{\int^z\zeta^qQ(\zeta)\,d\zeta} \tag{8.10}$$
The matrix $P(z)$ must obey the equation:
$$z^{-q}P'(z) = -P(z)Q(z) + A(z)P(z) \tag{8.11}$$
in which $Q(z)$ is diagonal. Of course the transformation $P(z)$ is defined up to multiplication on the right by a diagonal matrix, since this would leave the matrix $Q(z)$ diagonal. We fix this ambiguity by requiring $P_{kk}(z) = 1$. Taking the diagonal element $kk$ of eq. (8.11) one gets $Q_{kk}(z) = (A(z)P(z))_{kk}$. Plugging this into eq. (8.11) one gets a non-linear system for the $N(N - 1)$ non-diagonal elements $P_{ij}(z)$ with $i \neq j$:
$$z^{-q}P'_{ij}(z) = (A(z)P(z))_{ij} - P_{ij}\,(A(z)P(z))_{jj} \tag{8.12}$$
We first show that this equation admits a unique formal solution of the form:
$$P(z) = 1 + \sum_{r \geq 1}\frac{1}{z^r}P_r,\qquad (P_r)_{kk} = 0,\qquad Q(z) = A_0 + \sum_{r \geq 1}\frac{1}{z^r}Q_r$$
Inserting into eq. (8.11), the coefficient of $1/z^r$ for $r \geq 1$ gives:
$$A_0P_r - P_rA_0 = Q_r + H_r \tag{8.13}$$
where $H_r$ depends on $P_j$, $Q_j$ for $j < r$. This is solved recursively uniquely by setting $(Q_r)_{kk} = -(H_r)_{kk}$, which reproduces the solution $Q_{kk} = (AP)_{kk}$, and $(P_r)_{ij} = \frac{1}{\alpha_i - \alpha_j}(H_r)_{ij}$ for $i \neq j$, where we recall that $A_0 = \mathrm{Diag}\,(\alpha_1, \ldots, \alpha_N)$. Inserting this expansion for $P(z)$ and $Q(z)$ into eq. (8.10), and using $\log z = -\log\lambda$, we get eq. (8.9) with $B_0 = -Q_{q+1}$ and $\xi = B_0\log\lambda + \sum_{i=0}^{q}Q_i\,z^{q-i+1}/(q - i + 1)$, where $Q_0 = A_0$. Moreover, $g(z) = P(z)\exp\big(\sum_{i=q+2}^{\infty}Q_i\,z^{q-i+1}/(q - i + 1)\big)$ is of the form $g = 1 + O(1/z)$ and this uniquely determines this expansion. Note that the most singular term in $\xi$ is indeed of the form $A_0z^n/n$ with $n = q + 1$ and $A_0$ diagonal. If $A_0$ is not diagonal, write $A_0 = g_0Dg_0^{-1}$. Then $g = g_0(1 + O(1/z))$. Of course $g_0$ is determined up to a right multiplication by any constant diagonal matrix. In contrast to the regular singularity case this expansion is, however, only valid in the asymptotic sense.

We then show that there exists a true solution in the sector $S$ having the above formal solution as asymptotics when $z \to \infty$. Writing $P = 1 + \tilde P$, eq. (8.12) takes the form $z^{-q}\tilde P'(z) = f_0(z) + f_1(z, \tilde P) + f_2(z, \tilde P)$, where $f_1(z, \tilde P)$ is linear in $\tilde P$, and $f_2(z, \tilde P)$ is quadratic in $\tilde P$. Moreover, when $z \to \infty$, $f_1(z, \tilde P) = [A_0, \tilde P] + \cdots$, hence this linear map is non-singular (since $\tilde P$ has no diagonal element). Finally, we know this equation
has a formal solution $\sum_{r \geq 1}P_r/z^r$. It is a known theorem that there exists, in the interior of any sector of angle less than $2\pi$, an analytic function $\Phi(z)$ with the given asymptotics $\sum_{r \geq 1}P_r/z^r$ when $z \to \infty$. We set $\tilde P(z) = U(z) + \Phi(z)$ and get a transformed equation for $U(z)$ of the same form, $z^{-q}U'(z) = g_0(z) + g_1(z, U(z)) + g_2(z, U(z))$, but now when $z \to \infty$, one can show that $g_0(z)$ is asymptotic to 0, and the leading term of $g_1(z, U(z))$ is of the form $[A_0, U(z)]$. Finally, our equation can be written:
$$z^{-q}U'(z) = [A_0, U(z)] + R(z, U)$$
where $R(z, U)$ will be treated as a perturbation. The unperturbed equation is readily solved and has the fundamental solution:
$$V(z) = e^{\frac{z^{q+1}}{q+1}A_0}\;V_0\;e^{-\frac{z^{q+1}}{q+1}A_0}$$
We now replace the differential equation by a system of coupled integral equations. Written in components, they are of the form:
$$U_{ij}(z) = \int_{\gamma_{ij}}e^{\frac{z^{q+1} - t^{q+1}}{q+1}(\alpha_i - \alpha_j)}\,t^q\,R_{ij}(t, U(t))\,dt,\qquad i \neq j$$
The integration path $\gamma_{ij}$ ends at $z$ but its origin may depend on $ij$. The origins of the paths $\gamma_{ij}$ represent the $N(N - 1)$ integration constants of the problem. So we need to carefully specify the integration paths, and it is here that the sector $S$ appears. To be able to control the exponentials in the kernel of the integral equation one chooses paths $\gamma_{ij}(t)$ such that $\mathrm{Re}\,\big[(z^{q+1} - t^{q+1})(\alpha_i - \alpha_j)\big] < 0$. Let us examine the conditions under which such a choice is possible. We use the variables $\zeta = z^{q+1}$ and $\tau = t^{q+1}$. In the $\tau$-plane we draw the lines $\mathrm{Re}\,\tau(\alpha_i - \alpha_j) = 0$, called Stokes lines. Let $\Sigma$ be a sector of angle slightly larger than $\pi$, such that each Stokes line intersects the interior of $\Sigma$ on just one half-line. The pre-image of $\Sigma$ in the $z$-plane (under $z \to \zeta$) is a sector $S$:
$$\theta_0 \leq \theta \leq \theta_0 + \pi/n + \delta \tag{8.14}$$
with $n = q + 1$, and some small positive $\delta$. The Stokes line $\mathrm{Re}\,\tau(\alpha_i - \alpha_j) = 0$ divides $\Sigma$ in two regions where $\mathrm{Re}\,\tau(\alpha_i - \alpha_j)$ is positive and negative respectively. The path $\gamma_{ij}$ is taken so that its image in $\Sigma$ is a straight line from $\infty$ to $\zeta$ with a slope such that $\mathrm{Re}\,\tau(\alpha_i - \alpha_j) < 0$. The origins of the paths are chosen at $\infty$ so that $U_{ij}(z) \to 0$ when $z \to \infty$. The solution $U_{ij}(z)$ is uniquely determined by this requirement, which subsequently yields uniqueness of the solution $\Psi(\lambda)$ having the considered
asymptotic expansion in the sector $S$. With this choice of path, one can check that $U_{ij}(z)$ is asymptotic to 0 when $z \to \infty$, thereby justifying the treatment of $R(z, U)$ as a perturbation. It is clear that for $\tau$ on the paths $\gamma_{ij}$ one can write:
$$\left|e^{\frac{z^{q+1} - t^{q+1}}{q+1}(\alpha_i - \alpha_j)}\right| \leq e^{-\beta|z^{q+1} - t^{q+1}|}$$
for some positive fixed constant $\beta$ (recall that $\alpha_i \neq \alpha_j$ for $i \neq j$). From this point, one can solve the integral equation by successive approximations and prove that this yields a series for $U$ which is convergent in the sector $S$ and asymptotic to zero. We will not reproduce this analysis here.

Note that the paths occur in pairs $\gamma_{ij}$ and $\gamma_{ji}$ such that their images in $\Sigma$ are straight lines, one on each side of the line $\mathrm{Re}\,(\zeta - \tau)(\alpha_i - \alpha_j) = 0$. Moreover, we keep the slopes fixed when $\zeta$ varies in $\Sigma$, and this can be done through the whole sector $\Sigma$ bounded by some Stokes lines. Note, however, that if $\zeta$ goes beyond the boundary, one of the pair of paths has to be modified (see Fig. 8.1), yielding a different value of the integral.

In general the true solution $\Psi_1(\lambda)$ with asymptotic expansion $\Psi_{\rm asy}(\lambda)$ in the sector $S_1$ can be analytically continued beyond the sector $S_1$, but it is very important to realize that its asymptotic expansion will be different there. If we consider another sector $S_2$ adjacent to $S_1$, there exists another true solution $\Psi_2$ which has in $S_2$ the given asymptotic expansion $\Psi_{\rm asy}(\lambda)$. Since the sectors $S_1$ and $S_2$ overlap, there is a constant matrix, also denoted $S_1$, such that in this overlap $\Psi_2 = \Psi_1S_1$. This relation remains true in $S_1 \cup S_2$ by analytic continuation. This phenomenon was first noticed by Stokes and the matrix $S_1$ is called a Stokes multiplier. Of course, the origin of the Stokes phenomenon is that subdominant exponentials in one sector become dominant in the next sector.

Note that there exists a permutation matrix $P$ such that $P^{-1}S_1P$ is triangular with diagonal elements equal to 1. Indeed, let $P$ be a permutation matrix. Then $\Psi P$ results from $\Psi$ by some permutation of columns.
We have $\Psi P \sim gP\exp(P^{-1}\xi P)$, where the matrix $P^{-1}\xi P$ is diagonal and results from $\xi$ by the corresponding permutation of the diagonal elements. In the smallest sector bounded by Stokes rays containing the overlap $S_1 \cap S_2$, one can choose $P$ such that $\mathrm{Re}\,(\alpha_{P(1)}\tau) < \mathrm{Re}\,(\alpha_{P(2)}\tau) < \cdots < \mathrm{Re}\,(\alpha_{P(N)}\tau)$. Since $\Psi_1P$ and $\Psi_2P$ have the same asymptotics in the overlap, we have necessarily $\Psi_2P = \Psi_1PS_1'$, where $S_1'$ is lower triangular. This is because $\exp(-\alpha_i\tau)$ is asymptotically negligible with respect to $\exp(-\alpha_j\tau)$ when $i > j$. Moreover, comparing the asymptotics of $\Psi_1P$ and $\Psi_2P$ we see that the diagonal elements of $S_1'$ are equal to 1. Finally $S_1 = PS_1'P^{-1}$. Altogether the Stokes matrix $S_1$ depends on $N(N - 1)/2$ continuous parameters.

Fig. 8.1. Here the sector $\Sigma$ is bounded by the bold lines, close to the Stokes rays, i.e. half Stokes lines, labelled 23 and 12. When the point $\zeta$ crosses the boundary 12, in the lower left side of the picture, the pair of paths $\gamma_{12}$ and $\gamma_{21}$ has to be modified as indicated by the dashed lines. Notice that this new path cannot be continuously deformed into the previous one. The sector $\Sigma$ is of angle greater than $\pi$, so its pre-image is as in eq. (8.14).

The monodromy matrix around $\lambda = 0$ is the matrix $M$ such that
$$\Psi_1(e^{2i\pi}\lambda) = \Psi_1(\lambda)\,M \tag{8.15}$$
where the left-hand side means the analytic continuation of the solution $\Psi_1(\lambda)$, with asymptotics $\Psi_{\rm asy}$ on $S_1$, around a closed contour around $\lambda = 0$. We can easily relate the matrix $M$ to the Stokes multipliers as follows. Let us cover a neighbourhood of $\lambda = 0$ in the plane by $2n$ sectors of angle $\pi/n + \delta$, denoted by $S_1, S_2, \ldots, S_{2n}$. More precisely, the sector $S_j$ is defined by $(j - 1)\pi/n \leq \arg(\lambda) \leq j\pi/n + \delta$. First in the sector $S_2$, we have $\Psi_1(\lambda) = \Psi_2(\lambda)S_1^{-1}$ (where $\Psi_2$ has the asymptotics $\Psi_{\rm asy}$ in $S_2$) since this is true on the overlap. By recursion on the sector $S_j$ we have $\Psi_1(\lambda) = \Psi_j(\lambda)S_{j-1}^{-1}\cdots S_1^{-1}$ for $\lambda \in S_j$. Making a complete $2\pi$ rotation around $\lambda = 0$, we get a sector $S_{2n+1}$ which projects over $S_1$, and we have on this sector $\Psi_1(e^{2i\pi}\lambda) = \Psi_{2n+1}(e^{2i\pi}\lambda)\,S_{2n}^{-1}\cdots S_1^{-1}$. But the asymptotic expansion of $\Psi_{2n+1}(e^{2i\pi}\lambda)$ is by definition $\Psi_{\rm asy}(e^{2i\pi}\lambda) = \Psi_{\rm asy}(\lambda)\exp(2i\pi B_0)$, so we see that:
$$\Psi_1(e^{2i\pi}\lambda) \sim \Psi_{\rm asy}(\lambda)\,e^{2i\pi B_0}\,S_{2n}^{-1}\cdots S_1^{-1} \tag{8.16}$$
By comparing the asymptotic expansions in eqs. (8.15, 8.16) we have:
$$M = \exp(2i\pi B_0)\,S_{2n}^{-1}\cdots S_1^{-1}$$
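The formula for $M$ is easy to evaluate in a small case. The sketch below assumes $N = 2$ and $n = 1$, so there are two Stokes multipliers; the numerical values of `b0`, `S1` and `S2` are hypothetical, with `S1` and `S2` taken unipotent triangular as described in the text. It builds $M$ and checks that $\det M = \exp(2i\pi\,\mathrm{Tr}\,B_0)$, since each Stokes multiplier has unit determinant.

```python
import cmath

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def monodromy(b0, stokes):
    """M = exp(2 i pi B_0) S_{2n}^{-1} ... S_1^{-1}, with stokes = [S_1, ..., S_{2n}]."""
    M = [[cmath.exp(2j * cmath.pi * b0[0]), 0j],
         [0j, cmath.exp(2j * cmath.pi * b0[1])]]
    # Multiply on the right by S_{2n}^{-1} first and by S_1^{-1} last.
    for S in reversed(stokes):
        M = matmul2(M, inv2(S))
    return M

# Hypothetical data: diagonal of B_0 and two unipotent Stokes multipliers.
b0 = (0.25, -0.1)
S1 = [[1, 2.0], [0, 1]]
S2 = [[1, 0], [3.0, 1]]
M = monodromy(b0, [S1, S2])
print(det2(M), cmath.exp(2j * cmath.pi * sum(b0)))
```

With an empty list of Stokes multipliers the function returns $\exp(2i\pi B_0)$, which is the regular singular case.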
In the case of regular singularities, there are no Stokes matrices, and the monodromy matrix reduces to $\exp(2i\pi B_0)$.

We now extend these results to the case of several singular points $\lambda_k$, including the point $\infty$. Hence we consider the differential equation eq. (8.3), where $M_\lambda(\lambda)$ is the general rational function of $\lambda$ given in eq. (8.4). To describe the monodromy data of eq. (8.3), we have to patch the local descriptions around each singularity $\lambda_k$. At these points we have formal solutions:
$$\Psi_{\rm asy}^{(k)}(\lambda) = g^{(k)}(\lambda)\,e^{\xi^{(k)}(\lambda)},\qquad g^{(k)}(\lambda) = g_0^{(k)}\big(1 + g_1^{(k)}(\lambda - \lambda_k) + \cdots\big)$$
Since they obey eq. (8.3) one can readily identify the polar part of $M_\lambda$ at $\lambda_k$. We compute $\partial_\lambda\Psi\cdot\Psi^{-1}$ in a sector so that we can replace $\Psi$ by its asymptotic expansion. We get $\partial_\lambda g\,g^{-1} + g\,\partial_\lambda\xi\,g^{-1} \sim M_\lambda$. Keeping the polar part yields the equality between a finite number of terms:
$$M_\lambda^{(k)} = \big(\Psi_{\rm asy}^{(k)}(\lambda)\,\partial_\lambda\xi^{(k)}\,\Psi_{\rm asy}^{(k)\,-1}(\lambda)\big)_- \tag{8.17}$$
Due to our definition of $M_\lambda^{(\infty)}$ we have a similar formula at $\infty$ using the parameter $z = \lambda^{-1}$:
$$M_\lambda^{(\infty)}(z) = \big(\Psi_{\rm asy}^{(\infty)}\,\partial_z\xi^{(\infty)}\,\Psi_{\rm asy}^{(\infty)\,-1}\big)_- \tag{8.18}$$
The global monodromy problem is specified by fixing paths $\gamma_k$ from a reference point, which we choose to be $\infty$, to each $\lambda_k$. Around $\lambda_k$ there are sectors $S_j^{(k)}$ and corresponding solutions $\Psi_j^{(k)}$ with the given asymptotics. Starting from the solution $\Psi_1^{(\infty)}$ we want to see how it changes when it is analytically continued around singularities. We can continue $\Psi_1^{(\infty)}$ along the path $\gamma_k$ and compare the result to $\Psi_1^{(k)}$. This defines matrices $C^{(k)}$ such that $\Psi_1^{(\infty)}(\lambda) = \Psi_1^{(k)}(\lambda)\,C^{(k)}$. Then the monodromy matrix of $\Psi_1^{(\infty)}$ around $\lambda_k$ is
$$M^{(k)} = C^{(k)\,-1}\,e^{2i\pi B_0^{(k)}}\,S_{2n_k}^{(k)\,-1}\cdots S_1^{(k)\,-1}\,C^{(k)} \tag{8.19}$$

Fig. 8.2. The paths $\gamma_i$ and the various Stokes sectors at the points $\lambda_k$.

Around $\lambda = \infty$ we have the same formula with $C^{(\infty)} = 1$. Since the path around all singularities is contractible, these matrices are subject to the relation:
$$M^{(\infty)}\cdot M^{(1)}\cdots M^{(K)} = 1 \tag{8.20}$$
where $K$ is the number of singularities at finite distance. The monodromy group is the group generated by the $M^{(k)}$ subject to the above relation. Note that the determinant of each Stokes matrix is equal to 1 since there exists a permutation of the basis in which it is triangular with 1 on the diagonal. It follows that $\det(M^{(k)}) = \exp(2i\pi\,\mathrm{Tr}\,B_0^{(k)})$. Taking the determinant of eq. (8.20) we see that $\sum_k\mathrm{Tr}\,B_0^{(k)}$ is an integer, where the sum includes $\infty$. In fact there is a stronger compatibility condition for
the existence of $\Psi(\lambda)$, called the Fuchs condition, stating that this integer vanishes:
$$\sum_k\mathrm{Tr}\,(B_0^{(k)}) + \mathrm{Tr}\,(B_0^{(\infty)}) = 0 \tag{8.21}$$
To show this we first take the trace of eq. (8.18), getting:
$$\mathrm{Tr}\,M_\lambda^{(\infty)} = \mathrm{Tr}\,\big(\partial_z\xi^{(\infty)}\big)_- = \mathrm{Tr}\,B_0^{(\infty)}\,\frac{1}{z} + \cdots$$
On the other hand, considering the $1/z$ term in eq. (8.5), we get:
$$\mathrm{Tr}\,M_\lambda^{(\infty)} = -\frac{1}{z}\sum_k\mathrm{Tr}\,A_1^{(k)} + \cdots$$
so we get
$$\mathrm{Tr}\,B_0^{(\infty)} = -\sum_k\mathrm{Tr}\,A_1^{(k)} \tag{8.22}$$
Similarly, at each singularity $\lambda_k$ at finite distance, we take the trace of eq. (8.17):
$$\mathrm{Tr}\,M_\lambda^{(k)} = \mathrm{Tr}\,\big(\partial_\lambda\xi^{(k)}\big)_- = \mathrm{Tr}\,B_0^{(k)}\,\frac{1}{\lambda - \lambda_k} + \cdots$$
Considering the $1/(\lambda - \lambda_k)$ term in eq. (8.4), we get:
$$\mathrm{Tr}\,M_\lambda^{(k)} = \frac{1}{\lambda - \lambda_k}\,\mathrm{Tr}\,A_1^{(k)} + \cdots$$
so that $\mathrm{Tr}\,B_0^{(k)} = \mathrm{Tr}\,A_1^{(k)}$. Inserting this into eq. (8.22), we find the Fuchs relation.

Our next task is to relate the coefficients of $M_\lambda(\lambda)$ and the monodromy data.

Definition. We define the monodromy data at $\lambda_k$ to be the $2n_k$ Stokes matrices $S_j^{(k)}$ and the connection matrices $C^{(k)}$. We define the singularity data at $\lambda_k$ to be the coefficients of the singular terms $\xi^{(k)}$. These definitions apply to the point at $\infty$ as well.

It is important to note that given $M_\lambda$, the monodromy data are defined only up to a group of diagonal matrices at each $\lambda_k$. In fact we have seen that they are uniquely defined once we have chosen an asymptotic expansion around $\lambda_k$. More precisely, once the most singular part of $M_\lambda^{(k)}$
has been diagonalized, there is a canonical asymptotic expansion of the form $\Psi_{asy}^{(k)}(\lambda) = (1 + O(\lambda-\lambda_k))\exp(\xi^{(k)}(\lambda))$. Conjugating the matrix $M_\lambda$ by a constant matrix, one can assume that the most singular term at $\infty$ is diagonal. Then there is a canonical choice of $\Psi_1^{(\infty)}$. At the other singular points, however, one has simply $\Psi_{asy}^{(k)}(\lambda) = g_0^{(k)}(1 + O(\lambda-\lambda_k))\exp(\xi^{(k)}(\lambda))$, where $g_0^{(k)}$ is the matrix diagonalizing the leading singularity in $M_\lambda^{(k)}$. It is defined only up to multiplication on the right by a diagonal matrix $d^{(k)}$. Hence the $\Psi_1^{(k)}$ are defined only up to right multiplication by $d^{(k)}$. This changes $\xi^{(k)} \to \xi^{(k)}$, $S_j^{(k)} \to d^{(k)-1} S_j^{(k)} d^{(k)}$, and $C^{(k)} \to C^{(k)} d^{(k)}$. Since the monodromy data are significant only up to this diagonal action, we define reduced monodromy data to be the quotient set.

We are now in a position to compare the number of parameters in the matrix $M_\lambda$ and in the reduced monodromy and singularity data set. Assuming that the leading singular coefficient at $\infty$ is diagonal, the matrix $M_\lambda$ depends on $N^2(n_\infty + \sum_{k=1}^K n_k + K - 1) + N$ parameters. On the other hand, each Stokes matrix depends on $N(N-1)/2$ parameters, and there are $2n_k$ such matrices at each singular point including $\infty$. Similarly each $\xi^{(k)}$ depends on $(n_k+1)N$ parameters, while the matrix $C^{(k)}$ (for $k = 1,\ldots,K$) depends on $N^2$ parameters. Altogether the set of monodromy and singularity data contains $N^2(n_\infty + \sum_{k=1}^K n_k + K) + N(K+1)$ parameters. These parameters are not all independent, since we have to take into account the relation eq. (8.20) between the monodromy matrices, which removes $N^2$ parameters, and we have to quotient by the diagonal group action at each $\lambda_k$, which removes $NK$ parameters. One then gets exactly the number of free parameters in $M_\lambda$.
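The counting above can be replayed mechanically. The following is a minimal sketch, not part of the original text: the function names `params_in_M` and `params_in_data` are hypothetical, and the list `n_finite` is assumed to hold the integers $n_1,\ldots,n_K$.

```python
def params_in_M(N, n_inf, n_finite):
    """Parameters of M_lambda when its leading coefficient at infinity is diagonal."""
    K = len(n_finite)
    return N * N * (n_inf + sum(n_finite) + K - 1) + N


def params_in_data(N, n_inf, n_finite):
    """Return (raw, reduced) counts for the monodromy and singularity data."""
    K = len(n_finite)
    orders = [n_inf] + list(n_finite)              # n_k at every singular point, infinity included
    stokes = sum(2 * n * (N * (N - 1) // 2) for n in orders)   # 2 n_k Stokes matrices per point
    singular = sum((n + 1) * N for n in orders)    # coefficients of each xi^(k)
    connection = K * N * N                         # C^(k) for k = 1..K only
    raw = stokes + singular + connection
    # eq. (8.20) removes N^2 parameters, the diagonal action at each finite point removes N*K
    reduced = raw - N * N - N * K
    return raw, reduced
```

For instance, with $N = 2$, $n_\infty = 1$ and two finite singularities with $n_1 = 0$, $n_2 = 2$, both `params_in_M` and the reduced count returned by `params_in_data` give the same number, as the text asserts in general.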
It is then reasonable to expect that these two sets of data are equivalent, that is to say, for any monodromy and singularity data satisfying the appropriate consistency conditions, one can find a unique differential equation of the form eqs. (8.3, 8.4) with these given monodromy and singularity data. This is called the generalized Riemann problem, which was first studied by Birkhoff and more recently by Malgrange and Sibuya. The uniqueness part is easy. Assume that we have two differential equations with the same monodromy and singularity data. Consider corresponding solutions $\Psi(\lambda)$ and $\tilde\Psi(\lambda)$, normalized as $(1 + O(\lambda^{-1}))\exp(\xi^{(\infty)}(\lambda))$ at $\infty$. Let us consider $P(\lambda) = \tilde\Psi(\lambda)\Psi^{-1}(\lambda)$. We see that $P(\lambda)$ is single valued. The singular parts cancel at $\lambda_k$, so that $P(\lambda)$ is holomorphic at $\lambda_k$ and therefore holomorphic everywhere. Since $P(\lambda = \infty) = 1$, we get $P(\lambda) = 1$.
The same argument applied to $\frac{\partial\Psi(\lambda)}{\partial\lambda}\Psi^{-1}(\lambda)$ shows that $M_\lambda$ is a rational function with a pole of order $n_k + 1$ at $\lambda_k$.
The existence part, i.e. the reconstruction of $\Psi$ from its monodromy and singularity data, is much more difficult. We present in the next section a sketch of this construction in the case of regular singularities, by relating it to the Riemann–Hilbert factorization problem. In the following we shall take the general result for granted, and refer to the literature for its proof. However, assuming that $\Psi$ exists, one can write linear differential equations on $\Psi$ which characterize deformations where the monodromy data are fixed and we vary the singularity data. The compatibility conditions of this system are a set of non-linear differential equations on the coefficients of $M_\lambda$. We will directly prove the integrability, in the Frobenius sense, of these equations. Hence there exist locally solutions depending on exactly the number of deformation parameters.

8.3 Isomonodromy and the Riemann–Hilbert problem

In the same way that the Riemann–Hilbert factorization problem is fundamental for the study of isospectral integrable systems, it is also the key ingredient in the construction of $\Psi$ with given monodromy. We shall only consider the restricted Riemann problem, i.e. find a differential equation with first order poles whose solutions have a prescribed monodromy group. Let us fix $K$ points $\lambda_1,\ldots,\lambda_K$ (including $\lambda_K = \infty$) on the Riemann sphere, and $K$ matrices $M_{(1)},\ldots,M_{(K)}$ whose product $M_{(K)}\cdots M_{(1)} = 1$. We construct a multivalued function $\Psi(\lambda)$ such that $\Psi(\lambda) \to \Psi(\lambda)M_{(k)}$ when $\lambda$ describes a loop around $\lambda_k$. Finally, we show that $\det\Psi \neq 0$ and that $\partial_\lambda\Psi\,\Psi^{-1}$ is rational with simple poles at the $\lambda_k$ at finite distance. Following Birkhoff, we draw a simple closed path $D$ visiting all the $\lambda_k$, but leaving them outside, and small circles $C_k$ around each $\lambda_k$ as in Fig. 8.3. We define matrices $M_1,\ldots,M_{K+1}$ with $M_1$ arbitrary, and $M_{k+1}^{-1}M_k = M_{(k)}$. Since $\prod_k M_{(k)} = 1$ we have $M_{K+1} = M_1$.
We then define a $C^\infty$ invertible matrix $M(\lambda)$ on the path $D$, which is constant equal to $M_k$ between $C_{k-1}$ and $C_k$, and which inside $C_k$ is a $C^\infty$ interpolation between the two constant values $M_k$ and $M_{k+1}$. With these data we can consider the following Riemann–Hilbert factorization problem:

$$U_+(\lambda) = U_-(\lambda)M(\lambda) \quad \text{on } D$$

where $U_+(\lambda)$ is analytic inside $D$ and $U_-(\lambda)$ is analytic outside $D$. As explained in Chapter 3, indices can appear in the solution of the factorization problem, see eq. (3.49). We have absorbed the diagonal matrix of
indices in $U_-$, which is thus allowed to have a pole or a zero at $\infty$, otherwise $\det U_\pm \neq 0$. Note that $U_+$ has an analytic continuation outside $D$, through the segment of $D$ between $C_{k-1}$ and $C_k$, given by $U_- M_k$. This is because $M_k$ being constant, $U_-(\lambda)M_k$ is analytic outside $D$ and coincides with $U_+$ on this segment, hence analytically continues $U_+(\lambda)$. Similarly, the analytic continuation of $U_-$ through the segment of $D$ between $C_k$ and $C_{k+1}$ is given by $U_+ M_{k+1}^{-1}$. So, if we perform a loop around $\lambda_k$, $U_+(\lambda)$ gets multiplied on the right by $M_{(k)} = M_{k+1}^{-1}M_k$. We have obtained a multivalued matrix $U(\lambda)$, the analytic continuation of $U_+(\lambda)$, which is analytic outside the $C_k$, has non-vanishing determinant there, apart possibly for a pole at $\infty$, and has the given monodromy properties around the $C_k$.

Fig. 8.3. The path $D$ and the small circles $C_k$ around the $\lambda_k$.

We now consider a second Riemann–Hilbert factorization problem, relative to the union of the contours $C_k$. Let $Z_k(\lambda)$ be the matrix defined inside $C_k$ by:

$$Z_k(\lambda) = (\lambda-\lambda_k)^{\frac{1}{2i\pi}\log M_{(k)}}$$

for some determination of the logarithms. Of course, the matrix $Z_k(\lambda)$ undergoes a right multiplication by $M_{(k)}$ when $\lambda$ performs a loop around $\lambda_k$. We define the functions $A_k(\lambda)$ on each $C_k$ as $U(\lambda)Z_k^{-1}(\lambda)$. Notice that $A_k(\lambda)$ is univalued and analytic in a vicinity of $C_k$. Let us denote by $A(\lambda)$ the collection of functions $A_k(\lambda)$ on $C_k$ and consider the
factorization problem:

$$V_+(\lambda) = V_-(\lambda)A(\lambda) \quad \text{on } \cup\, C_k$$

where $V_+(\lambda)$ is a set of invertible matrices $V_{k+}(\lambda)$ analytic inside $C_k$, $V_{K+}$ having a possible pole at $\lambda_K = \infty$ (we have absorbed the indices here) and $V_-(\lambda)$ is an invertible matrix analytic outside the $C_k$. Finally, on each $C_k$ we have $V_{k+}(\lambda) = V_-(\lambda)A_k(\lambda)$. With the solution of these two Riemann–Hilbert problems at hand, we define:

$$\Psi(\lambda) = V_-(\lambda)U(\lambda) \quad \text{outside the } C_k$$

The analytic continuation of this function inside $C_k$ is $V_{k+}(\lambda)Z_k(\lambda)$ by the definition of the second factorization problem. Note that $V_{k+}(\lambda)Z_k(\lambda)$ is analytic inside $C_k$ except at $\lambda_k$. Finally, $\Psi(\lambda)$ has the same monodromy around $C_k$ as $Z_k(\lambda)$, hence has the prescribed monodromies $M_{(k)}$. It is clear that $\partial_\lambda\Psi\,\Psi^{-1}$ is well-defined and analytic except at the $\lambda_k$, where it has a simple pole. This is because inside $C_k$:

$$\partial_\lambda\Psi\,\Psi^{-1} = \partial_\lambda V_{k+}\,V_{k+}^{-1} + V_{k+}\,(\partial_\lambda Z_k\,Z_k^{-1})\,V_{k+}^{-1}$$
and $\partial_\lambda Z_k\,Z_k^{-1}$ has a simple pole at $\lambda_k$. This means that $\Psi$ is a solution of eq. (8.3) with $M_\lambda$ having only simple poles at the $\lambda_k$, thereby solving the restricted Riemann problem. There is still a problem left: we cannot be sure that the pole at $\lambda_K = \infty$ is simple. When the corresponding monodromy matrix $M_{(\infty)}$ is diagonalizable, Birkhoff has shown that one can carefully choose $Z_K$ so as to achieve a simple pole at $\infty$. The basic idea is that the eigenvalues of $\frac{1}{2i\pi}\log M_{(k)}$ are only defined up to integers, and one can choose them to remove the left-over poles. It was subsequently found, however, that the pole may remain of higher order if $M_{(\infty)}$ is not diagonalizable. The case of irregular singularities has been treated by similar methods by Birkhoff. Alternatively, one can see higher order poles as obtained from first order poles by confluence of singularities. As an example, consider eq. (8.3) when $M_\lambda(\lambda)$ is a sum of two nearby polar terms:

$$M_\lambda(\lambda) = \frac{1}{2}\,\frac{A_1 - A_2\,\epsilon^{-1}}{\lambda+\epsilon} + \frac{1}{2}\,\frac{A_1 + A_2\,\epsilon^{-1}}{\lambda-\epsilon} = \frac{A_1}{\lambda} + \frac{A_2}{\lambda^2} + O(\epsilon^2)$$
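The confluence formula is linear in $A_1$ and $A_2$, so it can be checked entry by entry with scalars. The sketch below is an illustration only, with hypothetical function names; it uses exact rational arithmetic, and `a1`, `a2` stand for any fixed matrix element of $A_1$, $A_2$.

```python
from fractions import Fraction


def two_pole(lam, eps, a1, a2):
    """Sum of two simple poles at lam = -eps and lam = +eps."""
    half = Fraction(1, 2)
    return half * (a1 - a2 / eps) / (lam + eps) + half * (a1 + a2 / eps) / (lam - eps)


def confluent(lam, a1, a2):
    """Limit eps -> 0: a simple pole plus a double pole at lam = 0."""
    return a1 / lam + a2 / lam ** 2


def remainder(lam, eps, a1, a2):
    """Exact difference between the two expressions; it carries a factor eps^2."""
    return two_pole(lam, eps, a1, a2) - confluent(lam, a1, a2)
```

Combining the two fractions gives $(a_1\lambda + a_2)/(\lambda^2-\epsilon^2)$, so the remainder equals $(a_1\lambda + a_2)\,\epsilon^2/(\lambda^2(\lambda^2-\epsilon^2))$, which makes the $O(\epsilon^2)$ statement explicit.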
We see that it is easy to produce poles of order $n$ by letting $n$ simple poles move to a single point.

8.4 Isomonodromic deformations

As we have seen in the previous section, the matrix $\Psi(\lambda)$ is determined by its monodromy data and its singularity data at the $\lambda_k$. These are two
independent sets of quantities which can be specified independently. We will examine here the matrices $\Psi(\lambda)$, with prescribed monodromy data $S_j^{(k)}$, $C^{(k)}$, fixed parameters $B_0^{(k)}$, and varying the parameters $t_i$ in $\xi^{(k)}$ and the singularity positions $\lambda_k$. Assuming that $\Psi$ exists, we will show that $\Psi(\lambda, \{t_i\}, \{\lambda_k\})$ has to satisfy a hierarchy of equations of the form eq. (8.1) with respect to the times $t_i$, and new equations, that we call Schlesinger's equations, with respect to the $\lambda_k$. Notice that these are all the possible deformation parameters, since once they are fixed, together with the monodromy data, they determine the function $\Psi(\lambda)$ uniquely.

Let us assume that some multivalued function $\Psi(\lambda)$ exists, with given monodromy and singularity data. Specifically, suppose that we are given the positions $\lambda_k$ of the singularities, some Stokes matrices $S_j^{(k)}$ at these singularities (which satisfy the triangularity condition) and some connection matrices $C^{(k)}$, so that eq. (8.20) is obeyed. By hypothesis, we have an asymptotic expansion of the form $\Psi_{asy}^{(\infty)}(\lambda,t) = (1 + O(1/\lambda))\,e^{\xi^{(\infty)}(\lambda,t)}$ in the first sector at $\infty$, and asymptotic expansions in the sector $\mathcal{S}_j^{(k)}$ at $\lambda_k$ of the form:

$$\Psi(\lambda) \sim \Psi_{asy}^{(k)}(\lambda)\,S_j^{(k)-1}\cdots S_1^{(k)-1}\,C^{(k)} \tag{8.23}$$

where

$$\Psi_{asy}^{(k)}(\lambda) = g_0^{(k)}\big(1 + g_1^{(k)}(\lambda-\lambda_k) + \cdots\big)\,e^{\xi^{(k)}(\lambda,t)} \tag{8.24}$$
With these assumptions, the matrix $M_\lambda = \partial_\lambda\Psi\,\Psi^{-1}$ is a rational function of $\lambda$, as was noticed above. Hence it is of the form eq. (8.4) and the differential equation $\partial_\lambda\Psi = M_\lambda\Psi$ has a solution $\Psi(\lambda)$ with the above monodromy and singularity data. Denote by $\{\tau_i\}$ the set of variables $\{t_i\} \cup \{\lambda_k\}$, and by $d$ the exterior differentiation with respect to the parameters $\tau_i$, $d = \sum_i d\tau_i\,\partial_{\tau_i}$.

Theorem. The monodromy data for $\Psi(\lambda)$ are independent of the deformation parameters if and only if the function $\Psi(\lambda)$ satisfies differential equations with respect to the deformation parameters:

$$d\Psi = M\Psi \tag{8.25}$$

where $M = \sum_i M_i\,d\tau_i$ is a 1-form with coefficients $M_i$ rational functions of $\lambda$.
Proof. Let us consider the function $\Psi(\lambda)$ obeying all the above constraints, and consider the 1-form $d\Psi(\lambda)\Psi^{-1}(\lambda)$, as a function of $\lambda$. This 1-form is single valued around $\lambda_k$ because when we turn around $\lambda_k$, the matrix $\Psi(\lambda)$ gets multiplied by the monodromy matrix eq. (8.19), which cancels in $d\Psi\,\Psi^{-1}$ because it is assumed to be independent of the deformation parameters. Moreover, in the vicinity of $\lambda_k$ its asymptotic expansion can be computed in any sector $\mathcal{S}_j$ by inserting the asymptotic expansion eq. (8.23), yielding $d\Psi\,\Psi^{-1} \sim d\Psi_{asy}^{(k)}\,\Psi_{asy}^{(k)-1}$, where we again use the independence of the monodromy data from the deformation parameters, and the known fact that one can differentiate an asymptotic expansion valid in a sector. From the explicit form of $\Psi_{asy}^{(k)}$ we see that the singularity of the asymptotic expansion of $d\Psi(\lambda)\Psi^{-1}(\lambda)$ is a pole at $\lambda_k$. Explicitly, $\partial_{t_i}\Psi(\lambda)\,\Psi^{-1}(\lambda)$ has a pole of order $n_k$ at $\lambda_k$, and $\partial_{\lambda_k}\Psi(\lambda)\,\Psi^{-1}(\lambda)$ has a pole of order $n_k + 1$ at $\lambda_k$. Hence $M(\lambda)$ is a rational function of $\lambda$.

Conversely, assume that $M_\lambda(\lambda)$ is parametrized by some parameters $\tau_i$ and that one can write equations $\partial_{\tau_i}\Psi = M_i(\lambda)\Psi$, where $M_i(\lambda)$ are rational functions of $\lambda$. Considering, for example, two adjacent sectors $\mathcal{S}_1$, $\mathcal{S}_2$ at a singularity, the solution $\Psi(\lambda)$ has the asymptotic $\Psi_{asy}$ in $\mathcal{S}_1$ and $\Psi_{asy}S$ in $\mathcal{S}_2$. Then $\partial_i\Psi$ has the asymptotic $\partial_i\Psi_{asy} = M_i\Psi_{asy}$ in $\mathcal{S}_1$. This relation on $\Psi_{asy}$ remains true in all sectors. In $\mathcal{S}_2$, $\partial_i\Psi$ has the asymptotic $\partial_i\Psi_{asy}\,S + \Psi_{asy}\,\partial_i S$, but this should also be equal to $M_i\Psi$, which has the asymptotic $M_i\Psi_{asy}S$. Hence $\partial_i S = 0$. Similarly, one shows that the connection matrices $C^{(k)}$ and the $B_0^{(k)}$ are independent of the deformation parameters $\tau_i$.

We now change our point of view and directly study the system of equations:

$$\partial_\lambda\Psi = M_\lambda\Psi, \qquad \partial_{\tau_i}\Psi = M_i(\lambda)\Psi \tag{8.26}$$

with $M_\lambda$ of the form eq. (8.4) and $M_i$ rational matrices. This system of equations is compatible if the rational matrices $M_\lambda$, $M_i$ obey the zero-curvature conditions:

$$\partial_{\tau_i}M_\lambda = \partial_\lambda M_i + [M_i, M_\lambda] \tag{8.27}$$

$$\partial_{\tau_i}M_j - \partial_{\tau_j}M_i - [M_i, M_j] = 0 \tag{8.28}$$
In this case, the Frobenius theorem asserts that one can find solutions Ψ(λ) depending on the maximal number of parameters. In our situation the maximal number of isomonodromic parameters τi and compatible equations that one can introduce is just the set of times ti and the λk . We want to prove the compatibility of the system eqs. (8.27, 8.28) directly, for properly defined Mi , yielding the existence of Ψ(λ; τi ). We start from the rational matrix Mλ of the form eq. (8.4) and consider the differential equation eq. (8.3). Around λk there exists a formal solution
$\Psi_{asy}^{(k)} = g^{(k)}\exp(\xi^{(k)})$. Hence $g^{(k)}$ obeys the equation:

$$\partial_\lambda g^{(k)} = M_\lambda\,g^{(k)} - g^{(k)}\,\partial_\lambda\xi^{(k)} \tag{8.29}$$

Next we define the rational matrices:

$$M_i(\lambda) = \big(g^{(k)}\,\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1}\big)_-$$

where $\tau_i$ stands for $t_{(k,n,\alpha)}$ and $\lambda_k$. Note that the matrices $M_i$ are algebraic functions of the matrix elements of $M_\lambda$, since to compute $M_i$, we need the expansion of $g^{(k)}$ to some finite order, which is obtained algebraically from $M_\lambda$ by the recurrence relations eq. (8.13). From $M_i$, we define some vector field $X_i$ acting on $M_\lambda$ by:

$$X_i M_\lambda = \partial_\lambda M_i + [M_i, M_\lambda] \tag{8.30}$$
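It is important to notice that the polar structure of the right-hand side of this equation allows us to consider the flows $X_i$ as acting on the coefficients $A_j^{(k)}$ in eq. (8.4) of $M_\lambda$ and the $\lambda_k$. In particular $X_i$ and $\partial_\lambda$ commute. To show that, one has to examine the order of the pole at $\lambda_k$ in both sides of eq. (8.30). Let $i = (k, n, \alpha)$. Around $\lambda_{k'}$, $k' \neq k$, the matrix $M_i$ is regular and eq. (8.30) is compatible with the pole structure of $M_\lambda$ at $\lambda_{k'}$. Around $\lambda_k$, since $\partial_\lambda(A)_- = (\partial_\lambda A)_-$ for any rational matrix $A(\lambda)$, we have

$$\partial_\lambda M_i = \big(g^{(k)}\,\partial_\lambda\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1}\big)_- + \big(\big[\partial_\lambda g^{(k)}\,g^{(k)-1},\; g^{(k)}\,\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1}\big]\big)_-$$

In the second term, we can replace $\partial_\lambda g^{(k)}\,g^{(k)-1} = M_\lambda - g^{(k)}\,\partial_\lambda\xi^{(k)}\,g^{(k)-1}$ by $M_\lambda$ because $g^{(k)}\,\partial_\lambda\xi^{(k)}\,g^{(k)-1}$ does not contribute to the commutator since $\xi^{(k)}$ is diagonal. Similarly, writing $g^{(k)}\,\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1} = M_i + \big(g^{(k)}\,\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1}\big)_+$ we get

$$\partial_\lambda M_i = \big(g^{(k)}\,\partial_\lambda\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1}\big)_- - [M_i, M_\lambda]_- + \big[M_\lambda,\; \big(g^{(k)}\,\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1}\big)_+\big]_-$$

hence

$$\partial_\lambda M_i + [M_i, M_\lambda] = \big(g^{(k)}\,\partial_\lambda\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1}\big)_- + [M_i, M_\lambda]_+ + \big[M_\lambda,\; \big(g^{(k)}\,\partial_{\tau_i}\xi^{(k)}\,g^{(k)-1}\big)_+\big]_-$$

from which we see that the pole structure is the same as the one of $\partial_{\tau_i}M_\lambda$ (it is the term $\partial_\lambda\partial_{\tau_i}\xi^{(k)}$ which controls this assertion; the action of $\partial_{\tau_i}$ on the $A_j^{(k)}$ will be determined later on).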
We are now in a position to prove the main theorem of this section:

Theorem. The flows $X_i$ are all commuting, and we can identify $X_i = \partial_{\tau_i}$.

Proof. The commutation of the flows $X_i$, $X_j$ is expressed by $\partial_\lambda F_{ij} - [M_\lambda, F_{ij}] = 0$, where:

$$F_{ij} = X_i M_j - X_j M_i - [M_i, M_j]$$

We show in fact a stronger result, i.e. $F_{ij} = 0$. Our first task is to find the action of the flow $X_i$ on the variables $g^{(k)}$ and $\xi^{(k)}$ around any pole $\lambda_k$. To do that we apply $X_i$ to eq. (8.29), getting:

$$\partial_\lambda\big(X_i g^{(k)} - M_i g^{(k)}\big) = M_\lambda\big(X_i g^{(k)} - M_i g^{(k)}\big) - \big(X_i g^{(k)} - M_i g^{(k)}\big)\partial_\lambda\xi^{(k)} - g^{(k)}\,\partial_\lambda\big(X_i\xi^{(k)}\big)$$

Writing $X_i g^{(k)} - M_i g^{(k)} = g^{(k)}h_i$ and using eq. (8.29), we get for $h_i$ the linear equation:

$$\partial_\lambda h_i - [\partial_\lambda\xi^{(k)}, h_i] = -X_i\,\partial_\lambda\xi^{(k)}$$

Since $X_i\xi^{(k)}$ is diagonal, a particular solution is $h_i = -X_i\xi^{(k)}$. The general solution is obtained by adding to it $e^{\xi^{(k)}}D_i\,e^{-\xi^{(k)}}$ for any matrix $D_i$ independent of $\lambda$. Through its definition, we see that $h_i$ has at most poles at $\lambda_k$, hence $D_i$ must be diagonal, otherwise essential singularities appear in $e^{\xi^{(k)}}D_i\,e^{-\xi^{(k)}}$. Finally, we get:

$$X_i g^{(k)} = M_i g^{(k)} - g^{(k)}X_i\xi^{(k)} + g^{(k)}D_i$$

Looking at the polar part of this equation we see that $\big(g^{(k)}X_i\xi^{(k)}g^{(k)-1}\big)_- = \big(g^{(k)}\partial_{\tau_i}\xi^{(k)}g^{(k)-1}\big)_-$, from which it follows that $X_i\xi^{(k)} = \partial_{\tau_i}\xi^{(k)}$, assuming that $g_0^{(k)}$ is generic. Hence $X_i$ identifies to $\partial_{\tau_i}$ on $\xi^{(k)}$.

We can now compute $F_{ij}$. It is simpler to use a compact notation: let $M$ be the 1-form $\sum_i M_i\,d\tau_i$, and let $\delta$ be the vector field $\sum_i d\tau_i\,X_i$ with values in differentials. Note that on $\xi^{(k)}$, $\delta$ identifies to $d$ so that $\delta^2\xi^{(k)} = 0$. The equation on $g^{(k)}$ becomes, with $D = \sum_i D_i\,d\tau_i$:

$$\delta g^{(k)} = M g^{(k)} - g^{(k)}\,\delta\xi^{(k)} + g^{(k)}D \tag{8.31}$$

The conditions $F_{ij} = 0$ read $\delta M - M\wedge M = 0$. With the help of the above equation one can compute the polar part of $\delta M$ at $\lambda_k$. Since $M^{(k)} = \big(g^{(k)}\,\delta\xi^{(k)}\,g^{(k)-1}\big)_-$, we get:

$$\delta M^{(k)} = \big(g^{(k)}[D, \delta\xi^{(k)}]\,g^{(k)-1}\big)_- + \big[M,\; g^{(k)}\,\delta\xi^{(k)}\,g^{(k)-1}\big]_-$$
where for $M = \sum_i M_i\,d\tau_i$, $N = \sum_j N_j\,d\tau_j$, we define

$$[M, N] = \sum_{i\neq j}[M_i, N_j]\,d\tau_i\wedge d\tau_j$$

The first term vanishes because it involves commutators of diagonal matrices. To evaluate the second term we remark that $M - g^{(k)}\,\delta\xi^{(k)}\,g^{(k)-1} = O(1)$, hence, squaring it, we get $M\wedge M - [M, g^{(k)}\,\delta\xi^{(k)}\,g^{(k)-1}] = O(1)$ since again the commutator of diagonal matrices vanishes. Taking the polar part of this relation, we conclude that $[M, g^{(k)}\,\delta\xi^{(k)}\,g^{(k)-1}]_- = (M\wedge M)_-$, hence $\delta M^{(k)} = (M\wedge M)_-$. Since this is true at each pole $\lambda_k$ and since $M$ is a rational function of $\lambda$, we see that $F_{ij} = 0$ for all $i, j$.

We have shown that $[X_i, X_j] = 0$. The Frobenius theorem implies that one can simultaneously solve (locally) $\partial_{\tau_i}M_\lambda = X_i M_\lambda$, thereby obtaining rational matrices $M_\lambda$, $M_i$ satisfying eqs. (8.27, 8.28). Since $X_i$ identifies with $\partial_{\tau_i}$ on $\xi^{(k)}$, the flow parameters coincide with the $\tau_i$, and $\partial_{\tau_i}$ identifies to $X_i$ everywhere.

Remark 1. The term $D$ in eq. (8.31) appeared because $g^{(k)}$ is defined only up to right multiplication by a diagonal matrix. This gauge transformation did not affect the calculation above because only gauge invariant quantities are considered. For any gauge choice, we have an equation of the form eq. (8.31) for some specific $D$. Now that $\delta$ is identified to $d = \sum_i d\tau_i\,\partial_{\tau_i}$, we have $d^2 = 0$, and this implies $dD = 0$. Hence we have $D = dh$, so one can choose a gauge where $D = 0$. In this gauge the evolution equations of $g^{(k)}$ read:

$$dg^{(k)} = M g^{(k)} - g^{(k)}\,d\xi^{(k)} \tag{8.32}$$

This should be compared to eq. (3.44) in Chapter 3.
Remark 2. The compatibility conditions, eqs. (8.27, 8.28), are non-linear differential equations on the coefficients of $M_\lambda$. Once we have a complete solution of these equations, one can find $\Psi$ by solving eq. (8.26).

Example. The Schlesinger equations. Consider a differential equation $\partial_\lambda\Psi = M_\lambda\Psi$, where $M_\lambda$ has only regular singularities, i.e.

$$M_\lambda(\lambda) = \sum_k \frac{A^{(k)}}{\lambda-\lambda_k}, \qquad A^{(k)} = g_0^{(k)}\,B_0^{(k)}\,g_0^{(k)-1} \tag{8.33}$$

We assumed, according to the general analysis, that the matrices $A^{(k)}$ are diagonalizable. Note that there is a hidden regular singularity at $\infty$. The asymptotic expansions of $\Psi(\lambda)$ at $\lambda_k$ are easily computed and found to be of the form:

$$\Psi_{asy}^{(k)}(\lambda) = g_0^{(k)}\big(1 + O(\lambda-\lambda_k)\big)\,e^{\xi^{(k)}(\lambda)}, \qquad \xi^{(k)}(\lambda) = B_0^{(k)}\log(\lambda-\lambda_k)$$
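This means that the times $t_i$ are all set to 0 and that the only deformation parameters are the positions of the poles $\lambda_k$. The deformation equations $\partial_{\lambda_k}\Psi = M_{\lambda_k}\Psi$ are constructed with the help of the general formula:

$$M_{\lambda_k} = \big(\Psi_{asy}^{(k)}\,\partial_{\lambda_k}\xi^{(k)}\,\Psi_{asy}^{(k)-1}\big)_- = -\frac{A^{(k)}}{\lambda-\lambda_k}$$

The zero curvature conditions read:

$$\partial_{\lambda_l}A^{(k)} = \frac{[A^{(k)}, A^{(l)}]}{\lambda_k-\lambda_l}, \qquad l \neq k \tag{8.34}$$

$$\partial_{\lambda_k}A^{(k)} = -\sum_{l\neq k}\frac{[A^{(k)}, A^{(l)}]}{\lambda_k-\lambda_l} \tag{8.35}$$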
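Two elementary consequences of eqs. (8.34, 8.35) can be checked on arbitrary input: the sum $\sum_k A^{(k)}$ is conserved (consistent with the normalization at $\infty$), and the flows act on each $A^{(k)}$ by commutators, so that $\mathrm{Tr}\,A^{(k)}$ and $\mathrm{Tr}\,(A^{(k)})^2$ are conserved. The sketch below uses exact rational arithmetic; the helper names are hypothetical and matrices are plain nested lists.

```python
from fractions import Fraction as Fr


def mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def comm(a, b):
    """Commutator [a, b]."""
    ab, ba = mul(a, b), mul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]


def tr(a):
    return sum(a[i][i] for i in range(len(a)))


def dA(A, lam, l, k):
    """Right-hand side of eqs. (8.34)-(8.35): derivative of A[k] with respect to lam[l]."""
    n = len(A[0])
    if k != l:
        return [[x / (lam[k] - lam[l]) for x in row] for row in comm(A[k], A[l])]
    out = [[Fr(0)] * n for _ in range(n)]
    for m in range(len(A)):
        if m != k:
            c = comm(A[k], A[m])
            out = [[o - x / (lam[k] - lam[m]) for o, x in zip(ro, rc)]
                   for ro, rc in zip(out, c)]
    return out


def total_flow(A, lam, l):
    """Derivative of sum_k A[k] with respect to lam[l]; expected to vanish."""
    n = len(A[0])
    acc = [[Fr(0)] * n for _ in range(n)]
    for k in range(len(A)):
        d = dA(A, lam, l, k)
        acc = [[x + y for x, y in zip(r, s)] for r, s in zip(acc, d)]
    return acc
```

The pole positions `lam` are assumed to be distinct `Fraction` values, so that every division stays exact.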
These equations are called Schlesinger equations. One can check easily that both equations are contained in eq. (8.27), while eq. (8.28) is a direct consequence of them, in agreement with the general theory.

8.5 Schlesinger transformations

We have studied all continuous isomonodromic deformations. They are parametrized by the times $t_i$ and the $\lambda_k$. However, there remain discrete isomonodromic deformations. The basic remark is that, although the matrices $B_0^{(k)}$ are not allowed to change continuously, a discrete change $B_0^{(k)} \to B_0^{(k)} + L^{(k)}$, where $L^{(k)}$ is a diagonal matrix with integer entries, does not change the monodromy data. The singularity data are modified by $e^{\xi^{(k)}(\lambda)} \to (\lambda-\lambda_k)^{L^{(k)}}e^{\xi^{(k)}(\lambda)}$, i.e. we add extra zeroes or poles at the singularities. This is a discrete analogue of the $t_i$ deformations. These transformations are called Schlesinger transformations. So, one can consider that $\Psi(\lambda)$ depends not only on the continuous variables $t_i$ and $\lambda_k$ but also on a set of integers, a diagonal matrix with integer entries above each singularity. For the continuous isomonodromic transformations we have deformation equations $d\Psi = M\Psi$ with $M$ a rational matrix. We show that for Schlesinger transformations we can write analogously difference equations with respect to the integers.

Proposition. The two wave-functions $\Psi(\lambda)$ associated with the data $B_0^{(k)}$ and $\Psi'(\lambda)$ associated with the data $B_0^{(k)} + L^{(k)}$ are related by:

$$\Psi'(\lambda) = R(\lambda)\Psi(\lambda) \tag{8.36}$$

where the matrix $R(\lambda)$ is a rational function of $\lambda$.

Proof. Consider the matrix $\Psi'(\lambda)\Psi^{-1}(\lambda)$. The essential singularities cancel and it has no monodromy. Hence, it is a rational function.
Note that the integer matrices $L^{(k)}$ have to be restricted by the Fuchs condition, eq. (8.21):

$$\sum_k \mathrm{Tr}\,L^{(k)} + \mathrm{Tr}\,L^{(\infty)} = 0$$

To study Schlesinger's transformations, it is enough to concentrate on elementary ones:

Definition. An elementary Schlesinger transformation is associated with the matrices

$$L\begin{bmatrix} k & k' \\ \alpha & \alpha' \end{bmatrix}^{(l)} = \delta_{kl}\,E_{\alpha\alpha} - \delta_{k'l}\,E_{\alpha'\alpha'} \tag{8.37}$$

This shifts the $\alpha$th diagonal element by $+1$ above $\lambda_k$ and the $\alpha'$th diagonal element by $-1$ above $\lambda_{k'}$ in order to fulfil the Fuchs condition. In the following we restrict ourselves to Schlesinger transformations involving singularities at finite distance.

Proposition. The matrix $R(\lambda)$ in eq. (8.36) associated with the elementary Schlesinger transformation eq. (8.37) is of the form:

$$R(\lambda) = 1 - \frac{R_0}{\lambda-\lambda_{k'}}, \qquad R^{-1}(\lambda) = 1 + \frac{R_0}{\lambda-\lambda_k} \tag{8.38}$$

where the matrix $R_0$ is given by:

$$R_0 = \frac{\lambda_k-\lambda_{k'}}{\big(g_0^{(k')-1}g_0^{(k)}\big)_{\alpha'\alpha}}\; g_0^{(k)}\,E_{\alpha\alpha'}\,g_0^{(k')-1}, \qquad \text{if } k \neq k' \tag{8.39}$$

$$R_0 = \frac{1}{\big(g_1^{(k)}\big)_{\alpha'\alpha}}\; g_0^{(k)}\,E_{\alpha\alpha'}\,g_0^{(k)-1}, \qquad \text{if } k = k' \tag{8.40}$$

The matrices $g_i^{(k)}$ are defined in the asymptotic expansion eq. (8.24). We have $R_0^2 = (\lambda_k-\lambda_{k'})R_0$.

Proof. The conditions determining $R(\lambda)$ are

$$R(\lambda)\,g^{(l)}(\lambda) = (g')^{(l)}(\lambda)\,(\lambda-\lambda_l)^{L^{(l)}} \tag{8.41}$$

so that $R(\lambda)$ has a simple pole at $\lambda = \lambda_{k'}$. Similarly, the inverse Schlesinger transform is obtained by changing $L^{(l)}$ to $-L^{(l)}$, so that $R^{-1}(\lambda)$ has a simple pole at $\lambda_k$. Asymptotic expansion at $\infty$ shows that $R(\lambda)$ and $R^{-1}(\lambda)$ tend to 1 at $\infty$. This motivates eqs. (8.38), which are moreover consistent if $R_0^2 = (\lambda_k-\lambda_{k'})R_0$.
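The algebra behind eqs. (8.38, 8.39) can be verified on a small example. The sketch below is restricted to $2\times 2$ matrices and to the case $k \neq k'$; the function names are hypothetical, indices are 0-based, and the inputs `g0k`, `g0kp` play the role of $g_0^{(k)}$, $g_0^{(k')}$ (any invertible matrices with $(g_0^{(k')-1}g_0^{(k)})_{\alpha'\alpha} \neq 0$).

```python
from fractions import Fraction as Fr


def mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(a):
    """Inverse of a 2x2 matrix with Fraction entries."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def unit(i, j):
    """Elementary matrix E_ij."""
    return [[Fr(int((r, c) == (i, j))) for c in range(2)] for r in range(2)]


def r0_matrix(g0k, g0kp, alpha, alphap, lam_k, lam_kp):
    """R_0 of eq. (8.39), case k != k'."""
    g0kp_inv = inv2(g0kp)
    c = mul(g0kp_inv, g0k)[alphap][alpha]
    core = mul(mul(g0k, unit(alpha, alphap)), g0kp_inv)
    return [[(lam_k - lam_kp) / c * x for x in row] for row in core]


def R(lam, R0, lam_kp):
    """R(lambda) = 1 - R_0 / (lambda - lambda_k')."""
    return [[Fr(int(i == j)) - R0[i][j] / (lam - lam_kp) for j in range(2)]
            for i in range(2)]


def R_inv(lam, R0, lam_k):
    """Candidate inverse 1 + R_0 / (lambda - lambda_k)."""
    return [[Fr(int(i == j)) + R0[i][j] / (lam - lam_k) for j in range(2)]
            for i in range(2)]
```

Since $R_0$ is a rank-one matrix whose trace equals $\lambda_k - \lambda_{k'}$, the relation $R_0^2 = (\lambda_k-\lambda_{k'})R_0$ holds automatically, and the product of `R` and `R_inv` is the identity at any $\lambda$ away from the two poles.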
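Suppose first $k \neq k'$. Let us write the condition eq. (8.41) in more detail for $l = k', k$. They read

$$\Big(1 - \frac{1}{\lambda-\lambda_{k'}}R_0\Big)g^{(k')} = (g')^{(k')}\Big(1 + \Big(\frac{1}{\lambda-\lambda_{k'}} - 1\Big)E_{\alpha'\alpha'}\Big), \qquad l = k'$$

$$\Big(1 - \frac{1}{\lambda-\lambda_{k'}}R_0\Big)g^{(k)} = (g')^{(k)}\big(1 + (\lambda-\lambda_k-1)E_{\alpha\alpha}\big), \qquad l = k$$

Looking at the polar terms in the first of these equations, we get $R_0\,g_0^{(k')} = -g_0'^{(k')}E_{\alpha'\alpha'}$, or $R_0 = -g_0'^{(k')}E_{\alpha'\alpha'}\,g_0^{(k')-1}$. The matrix element $\gamma\alpha$ of the right-hand side of the second equation vanishes at $\lambda = \lambda_k$. So we have, using the value of $R_0$:

$$\big(g_0^{(k)}\big)_{\gamma\alpha} + \frac{1}{\lambda_k-\lambda_{k'}}\,\big(g_0'^{(k')}\big)_{\gamma\alpha'}\,\big(g_0^{(k')-1}g_0^{(k)}\big)_{\alpha'\alpha} = 0$$

Solving for $\big(g_0'^{(k')}\big)_{\gamma\alpha'}$ and inserting back into the formula for $R_0$ yields eq. (8.39). With this expression one checks immediately that $R_0^2 = (\lambda_k-\lambda_{k'})R_0$.

Suppose next $k = k'$, which implies $\alpha \neq \alpha'$ in order to get a non-trivial transformation. The equation determining $R(\lambda)$ now reads:

$$\Big(1 - \frac{1}{\lambda-\lambda_k}R_0\Big)g^{(k)} = (g')^{(k)}\Big(\mathrm{Id} + (\lambda-\lambda_k-1)E_{\alpha\alpha} + \Big(\frac{1}{\lambda-\lambda_k} - 1\Big)E_{\alpha'\alpha'}\Big)$$

Comparing the terms of order $(\lambda-\lambda_k)^{-1}$ one gets $R_0 = -g_0'^{(k)}E_{\alpha'\alpha'}\,g_0^{(k)-1}$ as before. Next the matrix element $\gamma\alpha$ of the right-hand side vanishes at $\lambda = \lambda_k$ so that, considering the terms of order $(\lambda-\lambda_k)^0$ in the left-hand side, one gets (recall that $g^{(k)}(\lambda) = g_0^{(k)}(1 + (\lambda-\lambda_k)g_1^{(k)} + \cdots)$):

$$\big(g_0^{(k)}\big)_{\gamma\alpha} - \big(R_0\,g_0^{(k)}g_1^{(k)}\big)_{\gamma\alpha} = 0$$

This is eq. (8.40). Here one checks that $R_0^2 = 0$.

8.6 Tau-functions

Consider the differential equation eq. (8.3), where $M_\lambda$ is a rational function of $\lambda$ depending on isomonodromic deformation parameters. At each singularity $\lambda_k$ we have asymptotic expansions of the form $\Psi(\lambda) \sim g^{(k)}(\lambda)\,e^{\xi^{(k)}(\lambda)}$. With any solution of the deformation equations, eqs. (8.25), we can associate a 1-form $\Upsilon$:

$$\Upsilon = -\sum_k \mathrm{Res}_{\lambda=\lambda_k}\,\mathrm{Tr}\big(g^{(k)-1}\,\partial_\lambda g^{(k)}\,d\xi^{(k)}\big)\,d\lambda \tag{8.42}$$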
The sum is over all singularities including ∞.
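Theorem. The deformation equations imply that $\Upsilon$ is closed: $d\Upsilon = 0$.

Proof. We have already proved this equation in a more restricted setting in Chapter 3. Let us repeat the proof of this important result in this more general context. Recall that $d$ is the differential with respect to the isomonodromic deformation parameters $t_i$ and $\lambda_k$.

$$d\Upsilon = \sum_k \mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(g^{(k)-1}dg^{(k)}\,g^{(k)-1}\partial_\lambda g^{(k)}\,d\xi^{(k)} - g^{(k)-1}\partial_\lambda dg^{(k)}\,d\xi^{(k)}\big)\,d\lambda$$

From the deformation equation eq. (8.32) we get (using that $d\xi^{(k)}\wedge d\xi^{(k)} = 0$ since the matrix $\xi^{(k)}$ is diagonal):

$$d\Upsilon = \sum_k \mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(d\partial_\lambda\xi^{(k)}\wedge d\xi^{(k)} - \partial_\lambda M\wedge g^{(k)}d\xi^{(k)}g^{(k)-1}\big)\,d\lambda$$

The first term vanishes because the order of the pole is at least 3. For the same reason

$$\mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(\partial_\lambda(g^{(k)}d\xi^{(k)}g^{(k)-1})\wedge(g^{(k)}d\xi^{(k)}g^{(k)-1})\big)\,d\lambda = \mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(d\partial_\lambda\xi^{(k)}\wedge d\xi^{(k)}\big)\,d\lambda = 0 \tag{8.43}$$

Next we write $g^{(k)}d\xi^{(k)}g^{(k)-1} = M + N^{(k)}$, where $N^{(k)}$ is regular at $\lambda_k$. Then eq. (8.43) reads

$$\mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(\partial_\lambda(M + N^{(k)})\wedge(M + N^{(k)})\big)\,d\lambda = 0$$

Since the residue of a derivative of a function of $\lambda$ vanishes, we can replace $\partial_\lambda N^{(k)}\wedge M$ by $\partial_\lambda M\wedge N^{(k)}$, getting:

$$\mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(\partial_\lambda M\wedge N^{(k)}\big)\,d\lambda = -\frac{1}{2}\,\mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(\partial_\lambda M\wedge M\big)\,d\lambda$$

It follows that

$$d\Upsilon = -\sum_k \mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(\partial_\lambda M\wedge(M + N^{(k)})\big)\,d\lambda = -\frac{1}{2}\sum_k \mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\big(\partial_\lambda M\wedge M\big)\,d\lambda$$

But now $\mathrm{Tr}(\partial_\lambda M\wedge M)\,d\lambda$ is a rational 1-form on the $\lambda$ Riemann sphere, hence the sum of the residues vanishes.

Example. Let us give the form $\Upsilon$ in the Schlesinger case of regular singularities and deformation parameters $\lambda_k$. In that case (see eq. (8.33)):

$$d\xi = -\sum_k \frac{B_0^{(k)}}{\lambda-\lambda_k}\,d\lambda_k$$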
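and we have only to keep the constant term in $g^{(k)-1}\partial_\lambda g^{(k)}$, yielding:

$$\Upsilon = \sum_k \mathrm{Tr}\big(g_1^{(k)}B_0^{(k)}\big)\,d\lambda_k$$

Starting from $\partial_\lambda\Psi = \sum_l \big(g^{(l)}\,\partial_\lambda\xi^{(l)}\,g^{(l)-1}\big)_-\Psi$, and expanding $\Psi = g_0^{(k)}\big(1 + (\lambda-\lambda_k)g_1^{(k)} + \cdots\big)e^{\xi^{(k)}}$ we get:

$$g_0^{(k)}\big(g_1^{(k)} - [B_0^{(k)}, g_1^{(k)}]\big)g_0^{(k)-1} = \sum_{l\neq k}\frac{g_0^{(l)}B_0^{(l)}g_0^{(l)-1}}{\lambda_k-\lambda_l}$$

so that

$$\Upsilon = \frac{1}{2}\sum_{k\neq l}\mathrm{Tr}\big(A^{(k)}A^{(l)}\big)\,\frac{d\lambda_k - d\lambda_l}{\lambda_k-\lambda_l}$$

We can verify that this form is closed using the Schlesinger equations eqs. (8.34, 8.35). By the closedness of $\Upsilon$, we can introduce a function $\tau(\{t_i\}, \{\lambda_k\})$, defined up to a multiplicative constant, by:

Definition. The tau-function is defined by

$$\Upsilon = d\log\tau \tag{8.44}$$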
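The closedness of $\Upsilon$ in the Schlesinger case can be tested directly. Writing $\Upsilon = \sum_k H_k\,d\lambda_k$ with $H_k = \sum_{l\neq k}\mathrm{Tr}(A^{(k)}A^{(l)})/(\lambda_k-\lambda_l)$, closedness amounts to $\partial_{\lambda_m}H_k = \partial_{\lambda_k}H_m$. The sketch below (hypothetical helper names, exact rational arithmetic) evaluates these derivatives along the flows of eqs. (8.34, 8.35); a short computation suggests that each side reduces to $\mathrm{Tr}(A^{(k)}A^{(m)})/(\lambda_k-\lambda_m)^2$, which is what the check compares against.

```python
from fractions import Fraction as Fr


def mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def comm(a, b):
    """Commutator [a, b]."""
    ab, ba = mul(a, b), mul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]


def tr(a):
    return sum(a[i][i] for i in range(len(a)))


def dA(A, lam, l, k):
    """Derivative of A[k] with respect to lam[l], eqs. (8.34)-(8.35)."""
    n = len(A[0])
    if k != l:
        return [[x / (lam[k] - lam[l]) for x in row] for row in comm(A[k], A[l])]
    out = [[Fr(0)] * n for _ in range(n)]
    for m in range(len(A)):
        if m != k:
            c = comm(A[k], A[m])
            out = [[o - x / (lam[k] - lam[m]) for o, x in zip(ro, rc)]
                   for ro, rc in zip(out, c)]
    return out


def dH(A, lam, m, k):
    """Derivative of H_k with respect to lam[m], for m != k."""
    # explicit dependence of 1/(lam_k - lam_m) on lam_m
    total = tr(mul(A[k], A[m])) / (lam[k] - lam[m]) ** 2
    # implicit dependence through the matrices A
    for l in range(len(A)):
        if l != k:
            moved = tr(mul(dA(A, lam, m, k), A[l])) + tr(mul(A[k], dA(A, lam, m, l)))
            total += moved / (lam[k] - lam[l])
    return total
```

The pole positions `lam` are assumed to be distinct `Fraction` values; symmetry of `dH` in its two indices is the statement $d\Upsilon = 0$ restricted to the $\lambda_k$ directions.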
With each solution of the deformation equations eq. (8.25), one can associate a tau-function. Hence, with the Schlesinger transformed solution we can associate a transformed tau-function. There is a simple relation between the original tau-function and its transform by an elementary Schlesinger transformation.

Proposition. In the gauge eq. (8.32), we have:

$$\tau\begin{bmatrix} k & k' \\ \alpha & \alpha' \end{bmatrix}(t) = \tau(t)\times\begin{cases} \dfrac{1}{\lambda_k-\lambda_{k'}}\big(g_0^{(k')-1}g_0^{(k)}\big)_{\alpha'\alpha} & \text{if } k \neq k' \\[2ex] \big(g_1^{(k)}\big)_{\alpha'\alpha} & \text{if } k = k' \end{cases} \tag{8.45}$$

where the left-hand side denotes the transform of the tau-function under the elementary Schlesinger transformation eq. (8.37).

Proof. Let us denote by $\Upsilon'$ the form associated with the transformed solutions. Using eq. (8.41), we can write:

$$\Upsilon' - \Upsilon = -E_1 - E_2 + E_3$$
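where we defined the expressions

$$E_1 = \sum_l \big\langle R^{-1}\partial_\lambda R\; g^{(l)}\,d\xi'^{(l)}\,g^{(l)-1}\big\rangle$$

$$E_2 = \sum_l \big\langle g^{(l)-1}\partial_\lambda g^{(l)}\; d(\xi'^{(l)} - \xi^{(l)})\big\rangle$$

$$E_3 = \sum_l \big\langle L^{(l)}(\lambda-\lambda_l)^{-1}\,d\xi'^{(l)}\big\rangle$$

In this section we use the notation $\langle X^{(l)}\rangle = \mathrm{Res}_{\lambda_l}\mathrm{Tr}(X^{(l)})\,d\lambda$. The term $E_3$ vanishes because the pole is of order at least 2. Using $\xi'^{(l)} = \xi^{(l)} + L^{(l)}\log(\lambda-\lambda_l)$, the second term $E_2$ is equal to:

$$E_2 = -\sum_l \mathrm{Tr}\big(g_1^{(l)}L^{(l)}\big)\,d\lambda_l = -\big(g_1^{(k)}\big)_{\alpha\alpha}\,d\lambda_k + \big(g_1^{(k')}\big)_{\alpha'\alpha'}\,d\lambda_{k'}$$

In the above sum over $l$, only $l = k, k'$ contribute since otherwise $L^{(l)}$ vanishes. To compute the first term $E_1$, we split it into two parts:

$$E_1 = E_1' + E_1''$$

$$E_1' = \sum_l \big\langle R^{-1}\partial_\lambda R\; g^{(l)}\,d\xi^{(l)}\,g^{(l)-1}\big\rangle$$

$$E_1'' = \sum_l \big\langle R^{-1}\partial_\lambda R\; g^{(l)}\,(d\xi'^{(l)} - d\xi^{(l)})\,g^{(l)-1}\big\rangle$$

Using the explicit form for $R(\lambda)$ we can compute (assuming $\lambda_k \neq \lambda_{k'}$)

$$R^{-1}\partial_\lambda R = \frac{R_0}{(\lambda-\lambda_k)(\lambda-\lambda_{k'})} = \frac{1}{\lambda_k-\lambda_{k'}}\Big(\frac{R_0}{\lambda-\lambda_k} - \frac{R_0}{\lambda-\lambda_{k'}}\Big)$$

one gets for the first term $E_1'$:

$$E_1' = \frac{1}{\lambda_k-\lambda_{k'}}\sum_l \Big\langle\Big(\frac{1}{\lambda-\lambda_k} - \frac{1}{\lambda-\lambda_{k'}}\Big)R_0\; g^{(l)}\,d\xi^{(l)}\,g^{(l)-1}\Big\rangle$$

To evaluate this expression, we use the following identity valid for any function $f(\lambda)$ with an expansion around $\lambda_l$ of the form $f(\lambda) = \sum_{i=-N}^{\infty} f_i\,(\lambda-\lambda_l)^i$:

$$\mathrm{Res}_{\lambda_l}\,\frac{1}{\lambda-\lambda_k}\,f(\lambda) = \begin{cases} -f_-(\lambda)|_{\lambda=\lambda_k} & \text{if } \lambda_l \neq \lambda_k \\ f_0 & \text{if } \lambda_l = \lambda_k \end{cases}$$

We immediately get:

$$E_1' = \frac{1}{\lambda_k-\lambda_{k'}}\Big\{-\sum_{l\neq k}\mathrm{Tr}\big(R_0\,(g^{(l)}d\xi^{(l)}g^{(l)-1})_-\big)\big|_{\lambda=\lambda_k} + \mathrm{Tr}\big(R_0\,(g^{(k)}d\xi^{(k)}g^{(k)-1})_0\big) - (k\to k')\Big\}$$

To rewrite these terms, consider the equation of motion eq. (8.32) and expand it around $\lambda = \lambda_k$. We find

$$dg_0^{(k)}g_0^{(k)-1} - g_0^{(k)}g_1^{(k)}g_0^{(k)-1}\,d\lambda_k = \sum_{l\neq k}M^{(l)}\big|_{\lambda=\lambda_k} + \big(M^{(k)} - g^{(k)}d\xi^{(k)}g^{(k)-1}\big)\big|_{\lambda=\lambda_k}$$

Now we have

$$M^{(l)}\big|_{\lambda=\lambda_k} = \big(g^{(l)}d\xi^{(l)}g^{(l)-1}\big)_-\big|_{\lambda=\lambda_k}$$

and

$$\big(M^{(k)} - g^{(k)}d\xi^{(k)}g^{(k)-1}\big)\big|_{\lambda=\lambda_k} = -\big(g^{(k)}d\xi^{(k)}g^{(k)-1}\big)_0$$

so that we can rewrite

$$E_1' = \frac{-1}{\lambda_k-\lambda_{k'}}\,\mathrm{Tr}\,R_0\Big\{dg_0^{(k)}g_0^{(k)-1} - g_0^{(k)}g_1^{(k)}g_0^{(k)-1}\,d\lambda_k - (k\to k')\Big\}$$

Similarly, the term $E_1''$ reads:

$$E_1'' = \frac{1}{\lambda_k-\lambda_{k'}}\Big\{\sum_{l\neq k}\mathrm{Tr}\Big(R_0\,g_0^{(l)}\frac{L^{(l)}}{\lambda_k-\lambda_l}g_0^{(l)-1}\Big)d\lambda_l - \mathrm{Tr}\big(R_0\,g_0^{(k)}[g_1^{(k)}, L^{(k)}]\,g_0^{(k)-1}\big)d\lambda_k - (k\to k')\Big\}$$

In the right-hand side, only $l = k, k'$ contribute to the sums over $l$ because $L^{(l)}$ vanishes otherwise. After substituting the explicit value of $R_0$, they produce the contribution

$$\frac{d\lambda_k - d\lambda_{k'}}{\lambda_k-\lambda_{k'}}$$

The terms depending on $g_1$ in $E_1''$ give:

$$\big(g_1^{(k)}\big)_{\alpha\alpha}\,d\lambda_k - \big(g_1^{(k')}\big)_{\alpha'\alpha'}\,d\lambda_{k'} - \frac{\big(g_0^{(k')-1}g_0^{(k)}g_1^{(k)}\big)_{\alpha'\alpha}\,d\lambda_k - \big(g_1^{(k')}g_0^{(k')-1}g_0^{(k)}\big)_{\alpha'\alpha}\,d\lambda_{k'}}{\big(g_0^{(k')-1}g_0^{(k)}\big)_{\alpha'\alpha}}$$

they cancel with those coming from $E_1'$ and $E_2$. Hence, putting everything together, we get:

$$\Upsilon' - \Upsilon = \mathrm{Tr}\,R_0\,\frac{dg_0^{(k)}g_0^{(k)-1} - dg_0^{(k')}g_0^{(k')-1}}{\lambda_k-\lambda_{k'}} - \frac{d\lambda_k - d\lambda_{k'}}{\lambda_k-\lambda_{k'}}$$

or

$$d\log\frac{\tau'}{\tau} = d\log\frac{\big(g_0^{(k')-1}g_0^{(k)}\big)_{\alpha'\alpha}}{\lambda_k-\lambda_{k'}}$$

Integrating the above equation proves eq. (8.45) for $k \neq k'$. The integration constant has been normalized to 1. The case $k = k'$ is proved similarly.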
8.7 Riccati equation

Notice that the right-hand side of eqs. (8.45) is the product of $\tau(t)$ by the leading term in the expansion of

$$G^{(kk')}(\lambda, \lambda') = \frac{\delta_{kk'}\,\mathrm{Id} - g^{(k)-1}(\lambda)\,g^{(k')}(\lambda')}{\lambda-\lambda'} \tag{8.46}$$

in powers of $z_k = \lambda-\lambda_k$ and $z'_{k'} = \lambda'-\lambda_{k'}$. This double expansion has only positive powers of $z_k$ and $z'_{k'}$. This is clear when $k \neq k'$, and for $k = k'$ the zero in the denominator is cancelled by a zero in the numerator. The matrix elements of $G^{(kk')}(\lambda, \lambda')$ are algebraic functions of the dynamical variables occurring in $M_\lambda$. We can recast the equations of motion of the hierarchy in terms of these new variables. They take a particularly simple Riccati type form. Let us consider the generating function for the flows associated with the pole $\lambda_l$:

$$\nabla_\alpha^{(l)}(\lambda) = \sum_{n>0}(\lambda-\lambda_l)^{n-1}\,\frac{\partial}{\partial t_{(l,n,\alpha)}} \tag{8.47}$$

Strictly speaking, in our formalism there were a finite number of times $t_{(k,n,\alpha)}$ with $n \leq n_k$. We consider now, formally, differential equations $\partial_\lambda\Psi - M_\lambda\Psi = 0$, where $M_\lambda$ is allowed to have poles of arbitrary order at each $\lambda_k$.

Proposition. The quantity $G^{(kk')}(\lambda, \lambda')$ defined in eq. (8.46) obeys the Riccati type equation:

$$\nabla_\alpha^{(l)}(\lambda'')\,G^{(kk')}(\lambda, \lambda') = G^{(kl)}(\lambda, \lambda'')\,E_{\alpha\alpha}\,G^{(lk')}(\lambda'', \lambda') + \delta_{kl}\,E_{\alpha\alpha}\,\frac{G^{(kk')}(\lambda, \lambda') - G^{(kk')}(\lambda'', \lambda')}{\lambda-\lambda''} - \delta_{k'l}\,\frac{G^{(kk')}(\lambda, \lambda') - G^{(kk')}(\lambda, \lambda'')}{\lambda'-\lambda''}\,E_{\alpha\alpha} \tag{8.48}$$

Similarly, the equation of motion relative to the position of the pole $\lambda_l$ takes the form:

$$\partial_{\lambda_l}G^{(kk')}(\lambda, \lambda') = \mathrm{Res}_{\lambda''=\lambda_l}\,G^{(kl)}(\lambda, \lambda'')\,\partial_{\lambda_l}\xi^{(l)}(\lambda'')\,G^{(lk')}(\lambda'', \lambda') + \delta_{kl}\big(\partial_{\lambda_k}\xi^{(k)}(\lambda)\,G^{(kk')}(\lambda, \lambda')\big)_+ - \delta_{k'l}\big(G^{(kk')}(\lambda, \lambda')\,\partial_{\lambda_{k'}}\xi^{(k')}(\lambda')\big)_+$$

where $(\;)_+$ means taking the positive power part in the expansions around $\lambda_k$ and $\lambda_{k'}$ respectively.
8 Isomonodromic deformations
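Proof. We need the following identities: let $f(\lambda) = \sum_{j=1}^{\infty} f_j \lambda^j$, then we have:
$$\sum_{n=1}^{\infty} \lambda'^{\,n-1}\left(\lambda^{-n} f(\lambda)\right)_+ = \frac{f(\lambda) - f(\lambda')}{\lambda-\lambda'}, \qquad \sum_{n=1}^{\infty} \lambda'^{\,n-1}\left(\lambda^{-n} f(\lambda)\right)_- = \frac{f(\lambda')}{\lambda-\lambda'}$$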
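The two summation identities can be checked on a small example. The sketch below is only an illustration under stated assumptions: `f` is taken to be a polynomial stored as a dictionary of coefficients, and the helper names `poly_eval`, `plus_sum` and `minus_sum` are hypothetical. The first identity is verified exactly with rational arithmetic; the second involves an infinite sum, so it is truncated and checked numerically for $|\lambda'| < |\lambda|$.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate f(x) = sum_j coeffs[j] * x**j (all j >= 1 here)."""
    return sum(c * x**j for j, c in coeffs.items())

def plus_sum(coeffs, lam, lam_p):
    """sum_n lam_p**(n-1) * (lam**(-n) f(lam))_+ , keeping terms with j >= n."""
    total = Fraction(0)
    for n in range(1, max(coeffs) + 1):
        part = sum(c * lam**(j - n) for j, c in coeffs.items() if j >= n)
        total += lam_p**(n - 1) * part
    return total

def minus_sum(coeffs, lam, lam_p, n_max=200):
    """Truncation of sum_n lam_p**(n-1) * (lam**(-n) f(lam))_- (terms with j < n)."""
    total = 0.0
    for n in range(1, n_max + 1):
        part = sum(c * lam**(j - n) for j, c in coeffs.items() if j < n)
        total += lam_p**(n - 1) * part
    return total

# Example polynomial f(x) = 2x - 3x^2 + 5x^3 (an arbitrary choice).
F = {1: Fraction(2), 2: Fraction(-3), 3: Fraction(5)}
lam, lam_p = Fraction(3), Fraction(1, 2)
lhs_plus = plus_sum(F, lam, lam_p)
rhs_plus = (poly_eval(F, lam) - poly_eval(F, lam_p)) / (lam - lam_p)
lhs_minus = minus_sum(F, 2.0, 0.5)
rhs_minus = poly_eval(F, 0.5) / (2.0 - 0.5)
```

The geometric series behind the second identity converges only when $|\lambda'| < |\lambda|$, which is why the numerical check uses $\lambda = 2$, $\lambda' = 1/2$.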
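Recalling the equations of the hierarchy expressed on the $g^{(k)}$:
$$\frac{\partial}{\partial t_{(l,n,\alpha)}}\, g^{(k)}(\lambda) = \left(g^{(l)} E_{\alpha\alpha} (\lambda-\lambda_l)^{-n} g^{(l)-1}\right)_- g^{(k)} - \delta_{kl}\, g^{(k)} E_{\alpha\alpha} (\lambda-\lambda_k)^{-n}$$
we see that they can be recast in the form:
$$\nabla^{(l)}_\alpha(\lambda'')\, g^{(k)}(\lambda) = -g^{(k)}(\lambda) E_{\alpha\alpha}\,\frac{\delta_{kl}}{\lambda-\lambda''} + \frac{g^{(l)}(\lambda'') E_{\alpha\alpha}\, g^{(l)-1}(\lambda'')\, g^{(k)}(\lambda)}{\lambda-\lambda''}$$
which proves the first part of the proposition. Similarly, using the identity for the function $f(\lambda) = \sum_{n=-\infty}^{\infty} f_n \lambda^n$:
$$f_-(\lambda) = \mathrm{Res}_{\lambda'=0}\, \frac{1}{\lambda-\lambda'}\, f(\lambda')$$
we can write the equation of motion for $g^{(k)}$ in the form:
$$\partial_{\lambda_l}\, g^{(k)}(\lambda) = \mathrm{Res}_{\lambda''=\lambda_l}\, \frac{g^{(l)}(\lambda'')\,\partial_{\lambda_l}\xi^{(l)}(\lambda'')\, g^{(l)-1}(\lambda'')}{\lambda-\lambda''}\cdot g^{(k)}(\lambda) - \delta_{kl}\, g^{(k)}(\lambda)\,\partial_{\lambda_k}\xi^{(k)}(\lambda)$$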
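from which the second statement follows. We will need, in the next section, the limits of eqs. (8.48) when $l = k$, $\lambda'' \to \lambda$ and $l = k'$, $\lambda'' \to \lambda'$. We get respectively (if $k \neq k'$):
$$\nabla^{(k)}_\alpha(\lambda)\, G^{(kk')}(\lambda,\lambda') = G^{(kk)}(\lambda,\lambda)\, E_{\alpha\alpha}\, G^{(kk')}(\lambda,\lambda') + E_{\alpha\alpha}\,\partial_\lambda G^{(kk')}(\lambda,\lambda') \tag{8.49}$$
$$\nabla^{(k')}_\alpha(\lambda')\, G^{(kk')}(\lambda,\lambda') = G^{(kk')}(\lambda,\lambda')\, E_{\alpha\alpha}\, G^{(k'k')}(\lambda',\lambda') - \partial_{\lambda'} G^{(kk')}(\lambda,\lambda')\, E_{\alpha\alpha} \tag{8.50}$$

8.8 Sato's formula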
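It is remarkable that the complete matrix $G^{(kk')}(\lambda,\lambda')$ can be reconstructed from the tau-function and its elementary Schlesinger transforms. As a consequence we can express the matrix elements of $G^{(kk')}(\lambda,\lambda')$ as quotients of tau-functions, as in Sato's formula. We still denote $z_k = \lambda-\lambda_k$ and $z'_{k'} = \lambda'-\lambda_{k'}$ and introduce the notation:
$$t \to t + [z_k]_\alpha \quad\text{means}\quad t_{(l,n,\gamma)} \to t_{(l,n,\gamma)} + \delta_{kl}\,\delta_{\gamma\alpha}\,\frac{z_k^n}{n}$$
Proposition. Denote by $G^{(kk')}_{\alpha\alpha'}(\lambda,\lambda')$ the matrix element $\alpha\alpha'$ of $G^{(kk')}(\lambda,\lambda')$. We have:
$$\tau(t)\, G^{(kk')}_{\alpha\alpha'}(\lambda,\lambda') = \tau\begin{Bmatrix} k & k' \\ \alpha & \alpha' \end{Bmatrix}\!\left(t + [z_k]_\alpha - [z'_{k'}]_{\alpha'}\right), \quad\text{if } (k,\alpha) \neq (k',\alpha')$$
$$\tau(t)\, G^{(kk)}_{\alpha\alpha}(\lambda,\lambda') = \frac{\tau(t) - \tau\!\left(t + [z_k]_\alpha - [z'_k]_\alpha\right)}{\lambda-\lambda'} \tag{8.51}$$
Proof. This is a generalization of the proof of eq. (3.61) in Chapter 3. From the definition of the tau-function, eq. (8.44), we have: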
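$$\frac{\partial \log\tau}{\partial t_{(l,n,\alpha)}} = -\mathrm{Res}_{\lambda_l}\, \mathrm{Tr}\left(g^{(l)-1}(\lambda)\,\partial_\lambda g^{(l)}(\lambda)\, E_{\alpha\alpha}\, (\lambda-\lambda_l)^{-n}\right)$$
Using the identity eq. (3.62) in Chapter 3, we get
$$\nabla^{(k)}_\alpha(\lambda)\, \log\tau = -\mathrm{Tr}\left(g^{(k)-1}(\lambda)\,\partial_\lambda g^{(k)}(\lambda)\, E_{\alpha\alpha}\right) = -G^{(kk)}_{\alpha\alpha}(\lambda,\lambda)$$
From this, it follows, using eqs. (8.49, 8.50) and the definition of $G^{(kk')}(\lambda,\lambda')$, that:
$$\left(\nabla^{(k)}_\alpha(\lambda) - \partial_\lambda\right)\left(\tau(t)\, G^{(kk')}_{\alpha\alpha'}(\lambda,\lambda')\right) = 0 \tag{8.52}$$
$$\left(\nabla^{(k')}_{\alpha'}(\lambda') + \partial_{\lambda'}\right)\left(\tau(t)\, G^{(kk')}_{\alpha\alpha'}(\lambda,\lambda')\right) = 0 \tag{8.53}$$
These are differential equations relating the $\lambda$-dependence to the time dependence. Their unique solution allows us to express $\tau(t)\,G^{(kk')}_{\alpha\alpha'}(\lambda,\lambda')$ in the form:
$$\tau(t)\, G^{(kk')}_{\alpha\alpha'}(\lambda,\lambda') = \tau_{\alpha\alpha'}\!\left(t + [z_k]_\alpha - [z'_{k'}]_{\alpha'}\right)$$
To find the functions $\tau_{\alpha\alpha'}$, it is enough to compare the two sides of the equation at $z_k = z'_{k'} = 0$. But there, comparing with eq. (8.45), we find
$$\tau(t)\, G^{(kk')}_{\alpha\alpha'}(\lambda,\lambda') \to \tau\begin{Bmatrix} k & k' \\ \alpha & \alpha' \end{Bmatrix}(t)$$
This proves the proposition if $k \neq k'$. If $k = k'$, the right-hand side of eq. (8.49) contains the extra term
$$-\frac{G^{(kk)}(\lambda,\lambda') - G^{(kk)}(\lambda,\lambda)}{\lambda-\lambda'}\, E_{\alpha\alpha}$$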
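If $\alpha \neq \alpha'$ this does not affect eq. (8.52), but if $\alpha = \alpha'$ it becomes
$$\left(\nabla^{(k)}_\alpha(\lambda) - \partial_\lambda\right)\left(\tau(t)\left[(\lambda-\lambda')\, G^{(kk)}_{\alpha\alpha}(\lambda,\lambda') - 1\right]\right) = 0$$
Using the analogous equation for $\lambda'$, we deduce that
$$\tau(t)\left[(\lambda-\lambda')\, G^{(kk)}_{\alpha\alpha}(\lambda,\lambda') - 1\right] = \tau_\alpha\!\left(t + [z_k]_\alpha - [z'_k]_\alpha\right)$$
To find the function $\tau_\alpha(t)$ we notice that $\tau(t)\left[(\lambda-\lambda')\, G^{(kk)}_{\alpha\alpha}(\lambda,\lambda') - 1\right] \to -\tau(t)$ when $\lambda \to \lambda_k$ and $\lambda' \to \lambda_k$, hence $\tau_\alpha(t) = -\tau(t)$. This yields the second half of the proposition.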
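Many remarkable relations can be extracted from this result. In particular, setting $k' = k$ and $z_k = 0$ in eq. (8.51), renaming the remaining variable $\lambda'$ as $\lambda$, and introducing the matrix $h^{(k)}(\lambda)$ by $g^{(k)}(\lambda) = g_0^{(k)}\, h^{(k)}(\lambda)$, we find
$$h^{(k)}_{\alpha\alpha'}(\lambda) = (\lambda-\lambda_k)\,\frac{\tau\begin{Bmatrix} k & k \\ \alpha & \alpha' \end{Bmatrix}\!\left(t - [z_k]_{\alpha'}\right)}{\tau(t)}, \quad \alpha \neq \alpha'$$
$$h^{(k)}_{\alpha\alpha}(\lambda) = \frac{\tau\!\left(t - [z_k]_\alpha\right)}{\tau(t)} \tag{8.54}$$
We have already met these equations in Chapter 3; they are the Sato formulae. We see that we have completely identified the functions $\tau_{\alpha\alpha'}$ occurring in the numerator of eq. (3.61) as the Schlesinger transforms of the tau-function in the denominator.

8.9 The Hirota equations

Hirota noticed that many integrable equations could be recast into a bilinear form in terms of tau-functions. Specifically, introducing the Hirota differential operators $D_i$ with the definition:
$$D_i^n\, f\cdot g = \left.\left(\frac{\partial}{\partial y_i}\right)^{\!n} f(x+y)\, g(x-y)\right|_{y=0} \tag{8.55}$$
the equations of motion take the symbolic form $P(D)\,\tau\cdot\tau = 0$, where $P$ is a polynomial in $D$. For instance, the equation
$$\left(D_1^4 + 3D_2^2 - 4D_1 D_3\right)\tau\cdot\tau = 0 \tag{8.56}$$
is the Hirota form of the KP equation. As a matter of fact, setting
$$u = -2\,\frac{\partial^2}{\partial t_1^2}\log\tau \tag{8.57}$$
we get the Kadomtsev–Petviashvili equation:
$$3\,\frac{\partial^2 u}{\partial t_2^2} + \frac{\partial}{\partial t_1}\left(-4\,\frac{\partial u}{\partial t_3} - 6u\,\frac{\partial u}{\partial t_1} + \frac{\partial^3 u}{\partial t_1^3}\right) = 0 \tag{8.58}$$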
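To make the definition (8.55) concrete, here is a small sketch that evaluates $D^n f\cdot g$ for polynomials in one variable through the Leibniz-type expansion $D^n f\cdot g = \sum_k \binom{n}{k}(-1)^{n-k} f^{(k)} g^{(n-k)}$, which follows directly from (8.55). The coefficient-list representation and all function names are assumptions made for the illustration. The last two functions check a standard fact used as an assumption here: for $\tau = 1 + \exp(p t_1 + q t_2 + r t_3)$ the bilinear equation (8.56) reduces to $P(p,q,r) = p^4 + 3q^2 - 4pr = 0$, which holds for $p = a-b$, $q = a^2-b^2$, $r = a^3-b^3$.

```python
from math import comb

def derivative(p, k=1):
    """k-th derivative of a polynomial stored as [p0, p1, p2, ...]."""
    for _ in range(k):
        p = [j * c for j, c in enumerate(p)][1:] or [0]
    return p

def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def hirota(f, g, n):
    """Coefficients of D^n f.g = sum_k C(n,k) (-1)^(n-k) f^(k) g^(n-k)."""
    total = [0] * (len(f) + len(g) - 1)
    for k in range(n + 1):
        term = multiply(derivative(f, k), derivative(g, n - k))
        coeff = comb(n, k) * (-1) ** (n - k)
        for i, c in enumerate(term):
            total[i] += coeff * c
    return total

def kp_symbol(p, q, r):
    """Polynomial P of eq. (8.56) evaluated on the exponents of one exponential."""
    return p**4 + 3 * q**2 - 4 * p * r

def soliton_exponents(a, b):
    """Exponents (p, q, r) of a one-soliton tau-function built from a and b."""
    return (a - b, a**2 - b**2, a**3 - b**3)

f = [1, 0, 1]                      # f(x) = 1 + x^2
first = hirota(f, f, 1)            # D f.f vanishes identically
second = hirota(f, f, 2)           # D^2 f.f = 2 (f f'' - f'^2)
dispersion = kp_symbol(*soliton_exponents(3, 1))
```

Odd powers of $D$ acting on $f\cdot f$ vanish by antisymmetry, which is why only even polynomials $P$ give non-trivial bilinear equations of the form $P(D)\,\tau\cdot\tau = 0$.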
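We show in this section that this is a general phenomenon. We have the:

Proposition. In terms of the tau-function and its elementary Schlesinger transforms, the hierarchy equations take the Hirota bilinear form.

Proof. The proof is just a rewriting of the Riccati equation eq. (8.48) in terms of tau-functions. Let us do it in a simple case (the other cases are similar). We assume for simplicity that $l$, $k$, $k'$ are all different. Multiplying eq. (8.48) by $\tau^2(t)$ we get:
$$\tau^2(t)\,\nabla^{(l)}_\beta(\lambda'')\left[\frac{1}{\tau(t)}\,\tau\begin{Bmatrix} k & k' \\ \alpha & \alpha' \end{Bmatrix}\!\left(t + [z_k]_\alpha - [z'_{k'}]_{\alpha'}\right)\right] = \tau\begin{Bmatrix} k & l \\ \alpha & \beta \end{Bmatrix}\!\left(t + [z_k]_\alpha - [z''_l]_\beta\right)\,\tau\begin{Bmatrix} l & k' \\ \beta & \alpha' \end{Bmatrix}\!\left(t + [z''_l]_\beta - [z'_{k'}]_{\alpha'}\right)$$
The left-hand side can be rewritten in terms of Hirota differential operators, using the identity:
$$f^2\,\frac{\partial}{\partial t}\,\frac{g}{f} = -(\dot f g - \dot g f) = -D_t\, f\cdot g$$
Introducing the generating function for Hirota differential operators:
$$D^{(l)}_\alpha(\lambda'') = \sum_{n>0} (\lambda''-\lambda_l)^{n-1}\, D_{t_{(l,n,\alpha)}}$$
and shifting the variables $t$ by $t \to t - [z_k]_\alpha/2 + [z'_{k'}]_{\alpha'}/2$, we get:
$$D^{(l)}_\beta(\lambda'')\;\tau\!\left(t - \frac{[z_k]_\alpha}{2} + \frac{[z'_{k'}]_{\alpha'}}{2}\right)\cdot\tau\begin{Bmatrix} k & k' \\ \alpha & \alpha' \end{Bmatrix}\!\left(t + \frac{[z_k]_\alpha}{2} - \frac{[z'_{k'}]_{\alpha'}}{2}\right) = -\tau\begin{Bmatrix} k & l \\ \alpha & \beta \end{Bmatrix}\!\left(t + \frac{[z_k]_\alpha}{2} + \frac{[z'_{k'}]_{\alpha'}}{2} - [z''_l]_\beta\right)\times\tau\begin{Bmatrix} l & k' \\ \beta & \alpha' \end{Bmatrix}\!\left(t - \frac{[z_k]_\alpha}{2} - \frac{[z'_{k'}]_{\alpha'}}{2} + [z''_l]_\beta\right)$$
We now expand this formula in powers of $z_k$, $z'_{k'}$ and $z''_l$ using the equation
$$f(t+z)\, g(t-z) = e^{zD_t}\, f\cdot g$$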
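The expansion formula $f(t+z)\,g(t-z) = e^{zD_t} f\cdot g$ can be verified exactly when $f$ and $g$ are polynomials, because the exponential series then terminates. The following is a minimal sketch under that assumption; the helper names are hypothetical and the Hirota derivative is evaluated pointwise from its binomial expansion.

```python
from fractions import Fraction
from math import comb, factorial

def deriv(p, k):
    """k-th derivative of a polynomial stored as [p0, p1, p2, ...]."""
    for _ in range(k):
        p = [j * c for j, c in enumerate(p)][1:]
    return p

def peval(p, x):
    """Evaluate the polynomial p at x (an empty list is the zero polynomial)."""
    return sum(c * x**j for j, c in enumerate(p))

def hirota_at(f, g, n, t):
    """Value of D_t^n f.g at the point t."""
    return sum(
        comb(n, k) * (-1) ** (n - k) * peval(deriv(f, k), t) * peval(deriv(g, n - k), t)
        for k in range(n + 1)
    )

def shifted_product(f, g, t, z):
    """Left-hand side f(t+z) g(t-z)."""
    return peval(f, t + z) * peval(g, t - z)

def exp_series(f, g, t, z):
    """Right-hand side: sum_n z^n / n! * D_t^n f.g, finite for polynomials."""
    n_max = len(f) + len(g) - 2      # D^n f.g = 0 beyond deg f + deg g
    return sum(z**n / factorial(n) * hirota_at(f, g, n, t) for n in range(n_max + 1))

f = [1, 2, 0, 1]                     # f(x) = 1 + 2x + x^3
g = [3, 0, -1]                       # g(x) = 3 - x^2
t, z = Fraction(2, 3), Fraction(-5, 7)
lhs = shifted_product(f, g, t, z)
rhs = exp_series(f, g, t, z)
```

Each coefficient of $z^n$ in this expansion is a Hirota derivative, which is the mechanism by which the shifted tau-functions in the proof produce bilinear equations.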
It is clear that the coefficients in this expansion have the form of Hirota bilinear equations. Exactly the same method applies to the other Ricatti equations, and shows that they can be recast in Hirota form. This form is very remarkable as it allows for some easy particular solutions and, moreover, lends itself to a beautiful geometric interpretation as Pl¨ ucker relations in an infinite Grassmannian which will be described in Chapter 9. 8.10 Tau-functions and theta-functions In this section we show how the Lax matrix approach fits into the isomonodromy approach. We show that it corresponds to very special matrices Mλ . In that case, the tau-functions are essentially Riemann’s theta-functions. We start from a rational Lax matrix L(λ) of size N × N , with poles at λk . We consider the associated spectral curve, Γ : det (L(λ) − µ) = 0, which is a compact Riemann surface presented as an N -sheeted branched covering of the Riemann sphere. For any value of the spectral parame ter λ one may consider the (multivalued) N × N matrix Ψ(λ) given by (Ψ(P1 ), . . . , Ψ(PN )), where the Pj are the N points above λ in some order, and Ψ(Pj ) is the eigenvector of L(λ) for the corresponding eigenvalue µj (λ). It has been explained in Chapter 5 that requiring the time evolution = Mi Ψ implies that the components of the eigenvector are equations ∂ti Ψ Baker–Akhiezer functions on Γ. In general they have essential singularities at all points on Γ above λk . Around each puncture λk , the matrix Ψ(λ) has an expansion of the form Ψ(λ) = g (k) (λ) exp (ξ (k) (λ)) where g (k) (k) is regular at λ = λk and ξ is a diagonal matrix singular at λk : ξ (k) (λ) =
n,α
t(k,n,α)
Eαα (λ − λk )n
Here t(k,n,α) are the times describing all the integrable flows of the hierarchy. Moreover, Ψ(P ) has g + N − 1 poles on the Riemann surface, g of them at finite distance being the dynamical divisor D, and the other N − 1 being at the points Qi , i = 2, . . . , N above λ = ∞. Note that at λk , Ψ(λ) has the behaviour considered in this chapter. Hence it is natural to ·Ψ −1 . consider the matrix Mλ = ∂λ Ψ Proposition. The matrix Mλ is a rational function of λ. It has poles at the λk , simple poles at the projections of the branch points of the covering (λ, µ) → λ, and at the projections of the poles of Ψ(λ).
is multivalued, i.e. its columns Proof. The main point is that while Ψ(λ) undergo permutations when one performs a loop around a branch point, Ψ −1 is independent of the ordering of the columns of Ψ(λ), ∂λ Ψ· hence it is well-defined as a function of λ. Using the expansions around the punctures λk we see that the singular part of Mλ at λk is given by (g (k) ∂λ ξ (k) g (k)−1 )− . It has a finite order pole if there are a finite number of time variables. Let us consider now a branch point, and for simplicity assume that it is of order 2, that is we assume that the first two columns of Ψ(λ) coalesce for λ = λb . The corresponding eigenvalues (µ1 , µ2 , . . .) are such that µ1 , µ2 also coalesce. Locally the equation of Γ√is of the form λ−λb = (µ−µb )2 and the local parameter is z = µ − µb = λ − λb . The first two eigenvectors are just the evaluation of one meromorphic vector valued function of z on the two sheets above λ. Splitting the even and odd powers of z, we can write the matrix Ψ(λ) in the form: √ √ Ψ(λ) = (Ψe (z) + zΨo (z), Ψe (z) − zΨo (z), Ψ3 (z), . . .) 1 1 0 ··· 1 √0 0 · · · 0 z 0 ··· 1 −1 0 · · · = (Ψe (z), Ψo (z), Ψ3 (z), . . .) 0 0 1 0 0 1 .. .. .. .. .. .. . . . . . . The first matrix g(z) = (Ψe (z), Ψo (z), Ψ3 (z), . . .) is regular around z = 0. The third matrix is an inessential invertible constant matrix, and the second matrix can be identified with exp ξ(λ) with ξ(λ) = 12 log (λ − λb )E22 . This produces in Mλ a polar part (g∂λ ξg −1 )− =
1 g(λb )E22 g −1 (λb ) 2 (λ − λb )
At a pole of Ψ(λ) (above λc ) at finite distance, one column (say the j th one) has a pole. Hence one can write, up to right multiplication by a constant invertible matrix, Ψ(λ) = g(λ) exp (ξ(λ)) with ξ(λ) = − log (λ − λc )Ejj . This again yields a simple pole in Mλ : (g∂λ ξg −1 )− = −
g(λc )Ejj g −1 (λc ) (λ − λc )
Finally, above λ = ∞, in the setup of Chapter 5, we have N points Q1 , . . . , QN on Γ, and Ψ(λ) has simple poles at the N − 1 points Q2 , . . . , QN . When λ → ∞, the wave function Ψ(λ) has the asymptotic (∞) (∞) (∞) expansion g0 (1 + O(1/λ)) exp (B0 log (1/λ))C , where C (∞) is a (∞) constant invertible matrix and B0 = − N i=2 Eii . This implies that
around $\lambda = \infty$ the matrix $M_\lambda$ has the form:
$$\left(g\,\partial_\lambda\xi\, g^{-1}\right)_- = -\frac{g_0^{(\infty)}\, B_0^{(\infty)}\, g_0^{(\infty)-1}}{\lambda} + O(1/\lambda^2)$$
It is now clear that $M_\lambda$ is a rational function of $\lambda$ vanishing at $\infty$ and is the sum of its polar parts at finite distance.

Remark 1. Note that the last statement of the proof implies that $M_\lambda$ tends to 0 at $\infty$, so we must have:
$$M_\lambda = \sum_k \left(g^{(k)}\,\partial_\lambda\xi^{(k)}\, g^{(k)-1}\right)_- = -\frac{1}{\lambda}\, g_0^{(\infty)}\, B_0^{(\infty)}\, g_0^{(\infty)-1} + O\!\left(\frac{1}{\lambda^2}\right)$$
where the sum over $k$ runs over poles at finite distance. In particular, looking at the $1/\lambda$ terms and taking the trace one gets the Fuchs relation eq. (8.21) again. It is interesting to check it in this context. First we have $\mathrm{Tr}\, B_0^{(\infty)} = -N+1$. The poles at finite distance are the branch points, each one contributing $1/2$ to the trace, and the $g$ points of $D$, each one contributing $-1$. Finally, at the punctures, we have $\mathrm{Tr}\left(g^{(k)}(\lambda)\,\partial_\lambda\xi^{(k)}\, g^{(k)-1}(\lambda)\right)_- = \mathrm{Tr}\,\partial_\lambda\xi^{(k)} = O(1/\lambda^2)$ at $\infty$, hence they do not contribute to the $1/\lambda$ terms. The Fuchs condition therefore reads $g = \nu/2 - N + 1$. This is just the Riemann–Hurwitz formula.
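As a quick arithmetic illustration of the relation $g = \nu/2 - N + 1$, the sketch below computes the genus of an $N$-sheeted covering of the sphere with $\nu$ branch points, assuming (as in the proof above) that every branch point is of order 2. The function name `genus` is hypothetical.

```python
def genus(num_sheets, num_branch_points):
    """Genus from g = nu/2 - N + 1, assuming only order-2 branch points."""
    if num_branch_points % 2:
        raise ValueError("the number of simple branch points must be even")
    return num_branch_points // 2 - num_sheets + 1

elliptic = genus(2, 4)        # two sheets, four branch points
hyperelliptic = genus(2, 6)   # two sheets, six branch points
three_sheets = genus(3, 4)    # three sheets, four branch points
```

For a two-sheeted covering this reproduces the familiar count of $2g+2$ branch points for a hyperelliptic curve of genus $g$.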
Remark 2. The matrix Mλ (λ) has more poles than L(λ). However, at the extra
and the branch points, the singularity of Mλ is of regular poles, namely the poles of Ψ type. The corresponding singularity data ξ contains only a logarithmic term. At the poles of L(λ) we have the whole singularity structure for singularities of irregular type, but without a logarithmic term. The matrix Mλ (λ) embodies the data allowing us to In particular, it contains data pertaining to the spectral curve, through reconstruct Ψ. its branch points, and data pertaining to the eigenvector bundle, through the divisor D.
It follows from the Proposition that the matrix Ψ(λ) constructed from the eigenvector bundle satisfies a differential equation (∂λ −Mλ (λ))Ψ(λ) = 0, where Mλ is a rational function of λ of the type studied in this chapter. However, Mλ is a very particular function of λ since the solution Ψ(λ) has no monodromy at the punctures, i.e. at the poles of L(λ). Its only non–trivial monodromy occurs around the branch points, and acts by permutations of corresponding columns. Globally, the monodromy group is finite, and this is a special feature of the differential equations coming from Lax equations. Finally, there are no Stokes matrices at any singularity. Hence the monodromy data, as defined above, consist only of the connection matrices C (k) , which are time-independent matrices, function of the moduli of the spectral curve. The general theory nevertheless applies to this very special situation, and in particular the tau-functions can be computed explicitly in terms
of theta-functions. We have already met this situation in Chapter 5, but we make here the analysis in a broader context in order to be able to understand the action of Schlesinger transformations. We will see that Schlesinger transformations reduce to very simple translations in the argument of the theta-functions, as we now show. Let Pkα be the N points above λk . We take zk = λ − λk as local parameter around each Pkα . We introduce the singular parts t(k,n,α) zk−n (8.59) ξkα (zk ) = lkα log (zk ) + n≥1
We introduced logarithmic terms as in eq. (8.6), but we assume that the coefficients lkα are integers in order to be able to construct Baker– Akhiezer functions. These logarithmic terms will introduce extra zeroes or poles in the Baker–Akhiezer functions at the punctures, and will be useful to help us understand Schlesinger transformations. Consider Baker–Akhiezer functions with singular parts given by eq. (8.59) at each Pkα . We introduce poles at the g + N − 1 given points and Q2 , . . . , QN (above ∞). We D = (γ1 , . . . , γg ) (the dynamical divisor) choose the numbers lkα such that kα lkα = 0 so that the degree of the divisor of prescribed zeroes and poles is still g + N − 1. The dimension of the space of such functions is N , by the Riemann–Roch theorem. We fix a particular λk and consider the N sheets above it. For each α, one (k) can define a unique Baker–Akhiezer function ψα (P ) satisfying the N conditions: ψα(k) (P ) = eξkα (zk ) (δαβ + O(zk )),
when P → Pkβ
We put these N functions in a column vector Ψ(k) (P ) and form as usual the N × N matrix (k) (λ) = (Ψ(k) (P1 ), . . . , Ψ(k) (PN )) Ψ
(8.60)
where the Pi are the N points above λ. Note that, around our particular λk , we have the expansion (k) (λ) = h(k) (λ)eξ(k) (zk ) Ψ
(8.61)
with h(k) (λ) = 1 + O(zk ) and ξ (k) being the diagonal matrix α ξkα Eαα , (k) i.e. we have chosen the normalization g0 = 1 in the expansion eq. (8.24). We are going to identify the tau-function by comparing the expression of (k) in terms of theta-functions with the expression eq. (8.54). the matrix Ψ
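Proposition. The Baker–Akhiezer function $\psi^{(k)}_\alpha(P)$ can be written in terms of theta-functions as:
$$\psi^{(k)}_\alpha(P) = e^{\int_{P_{k\alpha}}^{P}\Omega}\cdot\frac{\theta\!\left(A(P) + V - K\right)\,\theta\!\left(A(P_{k\alpha}) - A(D) - K\right)}{\theta\!\left(A(P_{k\alpha}) + V - K\right)\,\theta\!\left(A(P) - A(D) - K\right)} \tag{8.62}$$
where the vector $V$ is given by:
$$V = \sum_{k n \alpha} t_{(k,n,\alpha)}\, U^{(k,n,\alpha)} + \sum_{k'\beta} l_{k'\beta}\, A(P_{k'\beta}) - A(P_{k\alpha}) + A(Q_1) - A(D)$$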
Proof. Following the general procedure of Chapter 5, we construct
$$\psi^{(k)}_\alpha(P) = C\cdot e^{\int_{P_{k\alpha}}^{P}\Omega}\cdot\frac{\theta\!\left(A(P) + V - K\right)}{\theta\!\left(A(P) - A(D) - K\right)}$$
where C is a normalization constant, Ω is an Abelian differential chosen (k) so that ψα has the required properties at the punctures and at the Qj , the theta-function at the denominator has been introduced to take care of the poles at the dynamical divisor D (K is the vector of Riemann’s constants), and the vector V in the theta-function in the numerator is determined by requiring that the resulting function has no monodromy. Let us first determine the form Ω. We write it as a sum of three pieces (k) Ω = Ω(t) + Ω(l) + Ω(q) , where Ω(t) ensures that ψα (P ) has the correct essential singularities at the punctures. One can write (n) t(k ,n,β) ΩP (8.63) Ω(t) = k ,n,β
(n)
k β
where ΩP is the normalized (i.e. the a-periods vanish) second kind k β Abelian differential with just one singularity at Pk β and such that around P (n) (n) this point ΩP = d(zk−n ΩPkα )+holomorphic. Note that the integral P kα k β is ill-defined. We take as its definition the unique primitive which around Pkα behaves as zk−n +O(zk ) (no constant term). The form Ω(l) is computed to produce a zero or pole of order lk β at the puncture Pk β . Noting that lk β = 0, it is the unique normalized Abelian differential of the third kind with first order poles at these points with corresponding residues lk β . As in the previous case, the integral of this form with origin at Pkα is ill-defined. We take it to be the primitive behaving as lkα log (zk ) + O(zk ), mod 2iπ. Finally, the form Ω(q) is introduced to get N − 1 poles at the points Q2 , . . . , QN and N − 1 zeroes at the points Pkβ with β = α (for the special k). It is given by the unique third kind differential with residues −1 at the Qj and +1 at the Pkβ .
P It is now easy to compute the monodromy of the function exp ( Pkα Ω). Since the differentials are normalized, there is no monodromy around the a-cycles. Around the cycle bj the monodromy is given by exp ( bj Ω). Using the monodromy property of theta-functions, eq. (15.14) in Chapter 15, we can cancel this monodromy by taking: 1 Vj = Ω − Aj (D) 2iπ bj Decomposing Ω into its three components, we note that bj Ω(t) is a linear form in the times with coefficients independent of the indices (k, α) since all punctures enter symmetrically in its definition. One can compute more precisely the other two contributions by using Riemann’s bilinear identity for third kind differentials. Let Ω3 be a third kind differential with first order poles at some points Pl with residue rl ( rl = 0). The Riemann bilinear identity reads: Ω3 = 2iπ rl Aj (Pl ) bj
l
Applying this to the form Ω(l) , one gets: 1 Ω(l) = lk β Aj (Pk β ) 2iπ bj kβ
where the last sum is over all punctures. Similarly, we find: 1 2iπ
Ω(q) = bj
β=α
Aj (Pkβ ) −
N
Aj (Ql ) = −Aj (Pkα ) + Aj (Q1 )
l=2
To get the last equation, notethat if P1 , . . . , PN are the N points above some λ0 then the Abel sum j A(Pj ) is a constant independent of λ0 . This is because the meromorphic function on Γ: f (P ) = (λ − λ0 )/(λ − λ1 ) has zeroes at the points above λ0 and poles at the points above λ1 , hence these two divisors are mapped to the same point in Jac (Γ) due to the Abel theorem. In particular β Aj (Pkβ ) = N l=1 Aj (Ql ). There remains to compute the normalization constant C, such that (k) ψα (P ) = eξkα (zk ) (1 + O(zk )) when P → Pkα . Thanks to our definition of P the primitive of the form Ω, we have Pkα Ω = ξkα (zk ) + O(zk ). Hence we need only to normalize the quotient of theta-functions, and we find the final result, eq. (8.62).
To identify the tau-function, we have to expand this formula for P → Pkβ (β may be equal to α) in powers of zk and compare the result with eq. (8.54). Proposition. The tau-function associated with the algebro-geometric integrable system is given by: τ (t) = eσ(t,t)+ρ(t) θ(U · t + W )
(8.64)
where σ(t, t) is bilinear in the times ti and ρ(t) is linear. Proof. We first look at P → Pkα . Note that, due to the explicit form of V , one can write for P close to Pkα : P
ψα(k) (zk ) = eξkα (zk ) · e
W =
Pkα (Ω−dξkα )
θ(A(Pkα ) − A(D) − K) θ(A(P ) − A(D) − K) θ(A(P ) − A(Pkα ) + U · t + W ) × θ(U · t + W )
lk β A(Pk β ) + A(Q1 ) − A(D) − K
k β (k)
(k)
which compares to ψα (zk ) = eξkα (zk ) hαα (zk ). The middle term is regular when zk → 0 and tends to one, so one can write it in the form P
e
Pkα (Ω−dξkα )
θ(A(Pkα ) − A(D) − K) = exp (t.bα (zk ) + aα (zk )) θ(A(P ) − A(D) − K)
where t.bα (zk ) is an expression linear in the times t(k,n,α) , and both aα (zk ) and bα (zk ) are of order O(zk ). Considering the last term, note that it can be rewritten as θ(A(P ) − A(Pkα ) + U · t + W ) θ(U · (t − [z]) + W ) = θ(U · t + W ) θ(U · t + W ) To see that we Taylor expand the Abel map A(P ) − A(Pkα ) around Pkα . ∞ (j) i Writing the first kind differential ωj = i=0 ci zk dzk one gets, using Riemann’s bilinear identities (see eq. (15.9) in Chapter 15): ∞ ∞ ∞ n zkn (k,n,α) −1 zkn (j) zk (n) Aj (P ) − Aj (Pkα ) = cn−1 ΩPkα = − = U n 2πi n bj n j n=1
n=1
This invites us to look for a tau-function of the form: τ (t) = eρ(t)+σ(t,t) θ ti U (i) + W i
n=1
where ρ(t) is linear in t and σ(t, t) is a quadratic form in t. One gets a condition on ρ and σ: t · bα (zk ) + aα (zk ) = −ρ([zk ]α ) − 2σ(t, [zk ]α ) + σ([zk ]α , [zk ]α )
(8.65)
One can always choose ρ and σ satisfying this equation provided that bα obeys the adequate symmetry property stemming from the fact that the quadratic form σ is symmetric in its arguments. Explicitly, we have the expansion b(k ,n ,α ),α (zk )t(k ,n ,α ) t · bα (zk ) = For (k , α ) = (k, α) (the equality case was treated in Chapter 5) we have by eq. (8.63): P (n ) b(k ,n ,α ),α (zk ) = ΩP = b(k ,n ,α ),(k,n,α) zkn Pkα
k α
n
The condition (8.65) implies σ(k ,n ,α ),(k,n,α) = 12 nb(k ,n ,α ),(k,n,α) . So we must have the relation nb(k ,n ,α ),(k,n,α) = n b(k,n,α),(k ,n ,α ) (in this equation, the function b(k,n,α) (zk ) is obtained by performing the same construction as above but starting from the privilegied point Pk α ). This is a consequence of Riemann’s bilinear identities: apply the identity eq. (15.8) (n) (n ) in Chapter 15 to the second kind differentials ΩPkα and ΩP . Since k α these differentials are normalized their a-periods vanish and the left-hand side of the identity vanishes. We get Res (b(k ,n ,α ) db(k,n,α) ) = 0. There are two poles, one at Pkα and the other at Pk α . Computing the residues yields the required relation. Altogether this shows that the quadratic form σ exists, is completely determined, and independent of the choice of the particular point Pkα . The computation of the linear form ρ(t) is then straightforward. We are now in a position to discuss the effect of a Schlesinger transformation on this tau-function since this only amounts to changing the inte gers lkα . These integers only occur in the contribution k β lk β A(Pk β ) to the vector W in eq. (8.64), and in a term linear in lkα in ρ(t) (appearing in the exponential prefactor). We see that an elementary Schlesinger transformation (which adds a zero at Pkα and a pole at Pk α ) is obtained by changing lkα → lkα + 1 and lk α → lk α − 1. The effect of such a transformation on the theta-function is remarkably simple and amounts to a simple translation of its argument. Up to the exponential prefactor, we have: k k τ (t) = θ(U · t + W + A(Pkα ) − A(Pk α )) α α
Remark. This allows us to perform an interesting check of the first of eqs. (8.54). (k) From the definitions, eqs. (8.60, 8.61), the matrix element hαα is obtained by evalu(k) ating ψα (P ) around the point Pkα in eq. (8.62). The time-dependent theta-function in the numerator of this equation is then exactly what we expect from the numerator in eq. (8.54). We have found that the tau-function in the algebro-geometric situation is essentially the Riemann theta-function. Moreover, the various matrix elements of h(k) (t, λ) are obtained by simple shifts of the arguments of the theta-function. The isomonodromic context of this chapter is a generalization of the algebro-geometric context, so that the tau-functions can be viewed as generalizations of Riemann’s theta-functions and should enjoy many of their remarkable properties. In Chapter 9 another framework is proposed allowing us to directly define the tau-function, using the geometry of the infinite Grassmannian. 8.11 The Painlev´ e equations An important application of the theory of isomonodromic deformations concerns the Painlev´e equations which can be interpreted as isomonodromic deformation equations, but not as isospectral deformations. The Painlev´e property deals with singularities of solutions of differential equations. In this respect there is a striking difference between linear differential equations and non-linear ones. The solutions of linear differential equations have singularities (poles, branch points, essential singularities) only where the coefficients of the equation have singularities, i.e. the position of these singularities are fixed. In contrast, the solutions of non-linear equations can develop singularities at arbitrary points, depending on initial conditions. For example y˙ = y 2 has the general solution y = −1/(t − t0 ). We call these singularities depending on the initial conditions movable singularities. 
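The example $\dot y = y^2$ can be explored numerically. The sketch below is an illustration with hypothetical helper names: it integrates the equation with a classical fourth-order Runge–Kutta scheme and compares the result with the closed form $y = -1/(t - t_0)$, where the initial condition $y(0) = y_0$ fixes the pole at $t_0 = 1/y_0$. Changing $y_0$ moves the pole, which is the sense in which the singularity is movable.

```python
def pole_position(y0):
    """Location t0 of the pole of y = -1/(t - t0) with y(0) = y0 (y0 != 0)."""
    return 1.0 / y0

def exact(t, y0):
    """Closed-form solution of y' = y^2 with y(0) = y0."""
    return -1.0 / (t - pole_position(y0))

def rk4(y0, t_end, steps=1000):
    """Integrate y' = y^2 from t = 0 to t_end with the classical RK4 scheme."""
    rhs = lambda y: y * y
    h = t_end / steps
    y = y0
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

numeric = rk4(1.0, 0.5)          # stop well before the pole at t0 = 1
closed_form = exact(0.5, 1.0)
poles = [pole_position(y0) for y0 in (0.5, 1.0, 2.0)]
```

The integration is stopped before $t_0$ on purpose: a fixed-step scheme cannot pass through the pole, and the growth of the numerical solution as $t \to t_0$ is the practical signature of the singularity.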
In general one cannot avoid movable poles, however, one can try to find equations whose solutions have no movable singularities other than poles, i.e. the branch points and essential singularities of all solutions are fixed. This is called the Painlev´e property. In fact Painlev´e and Gambier have classified all differential equations of the form: d2 y dy = R(t, y, ) 2 dt dt where R is a rational function of its arguments, satisfying the Painlev´e property. Up to trivial redefinitions they found 50 such equations. Of all
these equations only six could not be integrated in terms of already known functions, and are listed below.
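(i) $$\frac{d^2y}{dt^2} = 6y^2 + t$$
(ii) $$\frac{d^2y}{dt^2} = 2y^3 + ty + \alpha$$
(iii) $$\frac{d^2y}{dt^2} = \frac{1}{y}\left(\frac{dy}{dt}\right)^2 - \frac{1}{t}\frac{dy}{dt} + \frac{1}{t}\left(\alpha y^2 + \beta\right) + \gamma y^3 + \frac{\delta}{y}$$
(iv) $$\frac{d^2y}{dt^2} = \frac{1}{2y}\left(\frac{dy}{dt}\right)^2 + \frac{3}{2}y^3 + 4ty^2 + 2(t^2-\alpha)y + \frac{\beta}{y}$$
(v) $$\frac{d^2y}{dt^2} = \left(\frac{1}{2y} + \frac{1}{y-1}\right)\left(\frac{dy}{dt}\right)^2 - \frac{1}{t}\frac{dy}{dt} + \frac{(y-1)^2}{t^2}\left(\alpha y + \frac{\beta}{y}\right) + \frac{\gamma y}{t} + \frac{\delta y(y+1)}{y-1}$$
(vi) $$\frac{d^2y}{dt^2} = \frac{1}{2}\left(\frac{1}{y} + \frac{1}{y-1} + \frac{1}{y-t}\right)\left(\frac{dy}{dt}\right)^2 - \left(\frac{1}{t} + \frac{1}{t-1} + \frac{1}{y-t}\right)\frac{dy}{dt} + \frac{y(y-1)(y-t)}{t^2(t-1)^2}\left(\alpha + \frac{\beta t}{y^2} + \frac{\gamma(t-1)}{(y-1)^2} + \frac{\delta t(t-1)}{(y-t)^2}\right)$$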
All these equations can be understood in the framework of isomonodromy deformations. The fact that the Painlev´e property appears in integrable systems is not an accident. The dynamical variables, i.e. the matrix elements of the h(k) in eq. (8.54), are ratios of tau-functions. In the algebrogeometric case the tau-functions are theta-functions which are entire functions, so that we only have movable poles at the zeroes of one thetafunction. This remark has been generalized to the isomonodromy case by Malgrange and Miwa. Here we shall only consider equations (ii) and (vi) in order to illustrate various aspects of the method. Example 1. The Painlev´e (ii) equation. We apply the general construction to the case where all matrices are of size 2 × 2, with just one singularity at λ = ∞. Moreover, we require that Ψ belongs to the group SL(2) so that Mλ belongs to the Lie algebra sl(2). We limit ourselves to n∞ = 3, so that Mλ = A0 + A1 λ + A2 λ2 , where A2 is diagonal and traceless. Altogether there are seven parameters in Mλ . The point at ∞ is an irregular singularity of order 3, so that there are six Stokes sectors, and therefore the Stokes matrices depend on six parameters. The monodromy matrix at ∞ must be equal to 1, yielding three relations, so the
monodromy data are expressed by three parameters. Finally, the singularity data depend on four parameters. Since we are looking for an ordinary non-linear differential equation we introduce only one time, $t$, and assume that the singularity data is given by:
$$\xi(\lambda) = \left(\frac{\lambda^3}{3} + \frac{t\lambda}{2} + \theta\log\frac{1}{\lambda}\right)\sigma_3, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
Clearly one could extend this construction to a whole hierarchy of times. In the following we shall parametrize $M_\lambda$ by three natural parameters and find their time dependence so that the flow is isomonodromic. We have introduced a branch point parametrized by $\theta$ (which will account for the parameter $\alpha$ in Painlevé (ii)), and this is consistent with the Fuchs condition since $\mathrm{Tr}(\sigma_3) = 0$. Finally, for irrational $\theta$, any Lax pair interpretation of the differential equation is prohibited since Baker–Akhiezer functions do not have such infinitely branched points. We set $\Psi(\lambda) = g(\lambda)\,e^{\xi(\lambda)}$ and take
$$g(\lambda) = 1 + \frac{g_1}{\lambda} + \frac{g_2}{\lambda^2} + \frac{g_3}{\lambda^3} + \cdots$$
There is no restriction in assuming that the leading term is 1. We also impose that $\det g(\lambda) = 1$, since $\Psi(\lambda)$ belongs to $SL(2)$. The $g_i$ are such that the matrix $\Psi(\lambda)$ satisfies the linear differential equation $\partial_\lambda\Psi = M_\lambda\Psi$ with $M_\lambda = (g\,\partial_\lambda\xi\, g^{-1})_-$. Here the symbol $(\;)_-$ means taking the polynomial part of the considered expression. Finally, the isomonodromic deformation equation is $\partial_t\Psi = M_t\Psi$ with $M_t = (g\,\partial_t\xi\, g^{-1})_-$. One gets:
$$M_\lambda = \lambda^2\sigma_3 + \lambda\,[g_1,\sigma_3] + \frac{t}{2}\sigma_3 + [g_2,\sigma_3] - [g_1,\sigma_3]\,g_1$$
$$M_t = \frac{1}{2}\lambda\sigma_3 + \frac{1}{2}[g_1,\sigma_3]$$
We see that $M_\lambda$ depends only on $g_1$, $g_2$, and $\partial_\lambda g = M_\lambda g - g\,\partial_\lambda\xi$ determines the $g_3, g_4, \ldots$ in terms of these two matrices. One finds for $i \geq 3$:
$$[g_i,\sigma_3] = (i-3)\,g_{i-3} + \theta\, g_{i-3}\sigma_3 - \frac{t}{2}[g_{i-2},\sigma_3] + [g_1,\sigma_3]\,g_{i-1} + [g_2,\sigma_3]\,g_{i-2} - [g_1,\sigma_3]\,g_1 g_{i-2} \tag{8.66}$$
We parametrize the matrix $g_i$ in the form:
$$g_i = \begin{pmatrix} \Delta_i + a_i & b_i \\ c_i & \Delta_i - a_i \end{pmatrix}$$
where $\Delta_i$ is obtained by requiring that $\det g(\lambda) = 1$, yielding $\Delta_1 = 0$, $\Delta_2 = (a_1^2 + b_1c_1)/2$, $\Delta_3 = a_1a_2 + (b_1c_2 + b_2c_1)/2$, etc. The left-hand side
of eq. (8.66) has vanishing diagonal elements. This provides a constraint on the lower order $g_i$. The off-diagonal elements determine $b_i$ and $c_i$. In the case $i = 3$, we find one constraint:
$$b_1c_2 + b_2c_1 = \frac{\theta}{2}$$
and get the off-diagonal elements:
$$-b_3 = \frac{t}{2}b_1 + a_2b_1 + a_1b_2 + b_1\Delta_2, \qquad c_3 = -\frac{t}{2}c_1 + a_2c_1 + a_1c_2 - c_1\Delta_2$$
Similarly for $i = 4$ we get the constraint:
$$a_1 = -(t + 4\Delta_2)\,b_1c_1 + 2b_2c_2 + 2a_1(b_1c_2 - b_2c_1)$$
while $a_2$ is determined by the equation at order $i = 5$. Finally there are only three free parameters in $g_1$ and $g_2$. The equations of motion $\partial_t g = M_t g - g\,\partial_t\xi$ read:
$$\dot g_1 = -\frac{1}{2}[g_2,\sigma_3] + \frac{1}{2}[g_1,\sigma_3]\,g_1, \qquad \dot g_2 = -\frac{1}{2}[g_3,\sigma_3] + \frac{1}{2}[g_1,\sigma_3]\,g_2$$
Note that since $[g_3,\sigma_3]$ is known in terms of $g_1$, $g_2$, these equations close on these two matrices. In components we get:
$$\dot a_1 = -b_1c_1, \qquad \dot b_1 = b_2 + a_1b_1, \qquad \dot c_1 = -c_2 + a_1c_1$$
$$\dot b_2 = -\frac{t}{2}b_1 - a_1b_2 - 2b_1\Delta_2, \qquad \dot c_2 = \frac{t}{2}c_1 - a_1c_2 + 2c_1\Delta_2$$
It is now natural to introduce the three parameters:
$$x = \frac{b_1}{c_1}, \qquad y = a_1 + \frac{b_2}{b_1}, \qquad z = b_1c_1$$
With these parameters one has $a_1 = -tz - 2z^2 - 2zy^2 + \theta y$. Then the equations of motion give:
$$\dot x = \frac{1}{2}\,\theta\,\frac{x}{z}, \qquad \dot y = -\frac{t}{2} - 2z - y^2, \qquad \dot z = 2zy - \frac{\theta}{2}$$
Eliminating $z$ yields the equation:
$$\frac{d^2y}{dt^2} = 2y^3 + yt - \frac{1}{2} + \theta$$
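The elimination of $z$ is a one-line chain-rule computation, and it can be checked with exact rational arithmetic. The sketch below (function names are hypothetical) differentiates $\dot y = -t/2 - 2z - y^2$ once more, substitutes the first-order equations for $\dot y$ and $\dot z$, and compares the result with $2y^3 + ty + \alpha$ for $\alpha = \theta - 1/2$ at a few arbitrary rational points.

```python
from fractions import Fraction

def y_dot(t, y, z):
    """First-order equation for y."""
    return -t / 2 - 2 * z - y**2

def z_dot(t, y, z, theta):
    """First-order equation for z."""
    return 2 * z * y - theta / 2

def y_ddot(t, y, z, theta):
    """d/dt of y_dot by the chain rule: -1/2 - 2 z_dot - 2 y y_dot."""
    return Fraction(-1, 2) - 2 * z_dot(t, y, z, theta) - 2 * y * y_dot(t, y, z)

def painleve_ii_rhs(t, y, alpha):
    """Right-hand side of the Painleve (ii) equation."""
    return 2 * y**3 + t * y + alpha

# Arbitrary rational sample points (t, y, z, theta).
samples = [
    (Fraction(1, 3), Fraction(-2, 5), Fraction(7, 4), Fraction(3, 2)),
    (Fraction(-5, 2), Fraction(4, 3), Fraction(-1, 6), Fraction(2, 7)),
    (Fraction(0), Fraction(1), Fraction(1), Fraction(0)),
]
residuals = [
    y_ddot(t, y, z, theta) - painleve_ii_rhs(t, y, theta - Fraction(1, 2))
    for (t, y, z, theta) in samples
]
```

All dependence on $z$ cancels in the second derivative, which is why a closed second-order equation for $y$ alone is obtained.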
One recognizes the Painlev´e (ii) equation with α = θ − 1/2. Once the solution y(t) of the Painlev´e equation is known, it is easy to reconstruct the matrix Mλ (λ; t) in terms of x(t), y(t), z(t). The subtle nature of the t-dependence of Mλ (λ; t) ensuring the isomonodromy property is here particularly striking. The Schlesinger transformations take a particularly simple form on this example. They just amount to changing θ → θn = θ + n, so that the first column of Ψ acquires a zero of order n at λ = ∞ while the second column acquires a pole of order n. Let Ψ(n, λ) be the wave-function constructed with the parameter θn , then we have for an elementary Schlesinger transformation Ψ(n + 1, λ) = R(n, λ)Ψ(n, λ), where R(n, λ) is a matrix rational in λ. Plugging in the asymptotic expansions at ∞, Ψ(n, λ) ∼ g(n, λ) exp (ξ(n, λ)), one gets R(n, λ)g(n, λ) = g(n+1, λ)(λ−1 E11 +λE22 ). Expanding to order λ−3 , one finds: 0 b1,n+1 R(n, λ) = −1/b1,n+1 λ − yn+1 where b1,n+1 and yn+1 are the parameters entering in g(n+1, λ). Moreover, one also gets: c2,n a1,n+1 = , c1,n+1 = c3,n − c1,n (∆2,n + a2,n ) + c2,n (a1,n − a1,n+1 ) c1,n (8.67) a1,n 1 , b2,n+1 = − (8.68) b1,n+1 = c1,n c1,n From this we obtain the recursion relations: t θn 2 − zn , yn+1 = −yn + zn+1 = − − yn+1 2 2zn It is now illuminating to introduce the tau-functions. First we compute the closed form Υ = −Res∞ Tr (g −1 (λ)∂λ dξ)dλ. Here we simply have dξ = λσ3 dt/2, so that we find: 1 Υ = − Tr (g1 σ3 )dt = −a1 dt 2 Hence τn is defined by τ˙n /τn = −a1,n . One can express the matrix elements of g(λ) in terms of tau-functions according to the general results of the previous sections. Here however, things are so simple that it is even more straightforward to rederive the appropriate results. From eq. (8.67) we have a1,n+1 = −τ˙n+1 /τn+1 = c2,n /c1,n . It follows immediately that c2,n τ˙n τ˙n+1 d τn yn+1 = −a1,n + = − = log c1,n τn τn+1 dt τn+1
On the other hand, by the equations of motion we have $y_n = d\log b_{1,n}/dt$. Moreover, by eq. (8.68) we have $c_{1,n} = 1/b_{1,n+1}$. Hence we can take:
$$b_{1,n} = \frac{\tau_{n-1}}{\tau_n}, \qquad c_{1,n} = \frac{\tau_{n+1}}{\tau_n}, \qquad z_n = \frac{\tau_{n+1}\tau_{n-1}}{\tau_n^2}$$
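The Hirota bilinear form of the equations of motion also follows straightforwardly. Let us write the three equations of motion:
$$\dot a_{1,n} = -z_n \;\to\; \tau_n\ddot\tau_n - \dot\tau_n^2 - \tau_{n+1}\tau_{n-1} = 0$$
$$\dot z_n = 2y_nz_n - \frac{1}{2}\theta_n \;\to\; \tau_{n+1}\dot\tau_{n-1} - \tau_{n-1}\dot\tau_{n+1} - \frac{1}{2}\theta_n\tau_n^2 = 0$$
$$\dot y_n = -2z_n - y_n^2 - \frac{t}{2} \;\to\; \tau_{n-1}\ddot\tau_n + \tau_n\ddot\tau_{n-1} - 2\dot\tau_n\dot\tau_{n-1} + \frac{1}{2}t\,\tau_n\tau_{n-1} = 0$$
In the last equation we have used the first one to simplify the result.

Example 2. The Painlevé (vi) equation. We consider the case of three regular singularities at finite distance which we put at $\lambda = 0, 1, t$, and we study the isomonodromic deformation equations with respect to the parameter $t$. We choose the singularity data at these three points as:
$$\xi^{(k)}(\lambda) = \begin{pmatrix} \theta_k & 0 \\ 0 & 0 \end{pmatrix}\log(\lambda - \lambda_k), \qquad k = 0, 1, t, \qquad \lambda_0 = 0,\ \lambda_1 = 1,\ \lambda_t = t$$
With this we construct the wave-function $\Psi(\lambda)$ having the asymptotic expansion $\Psi(\lambda) \sim g^{(k)}(\lambda)\exp(\xi^{(k)}(\lambda))$, with $k = 0, 1, t$, and satisfying the differential equation $\partial_\lambda\Psi = M_\lambda\Psi$. This implies:
$$M_\lambda(\lambda) = \frac{A^{(0)}}{\lambda} + \frac{A^{(1)}}{\lambda - 1} + \frac{A^{(t)}}{\lambda - t}, \qquad A^{(k)} = g_0^{(k)}\begin{pmatrix} \theta_k & 0 \\ 0 & 0 \end{pmatrix} g_0^{(k)\,-1}$$
In particular $A^{(k)}$ is of rank 1, hence $\det A^{(k)} = 0$ and $\mathrm{Tr}\,A^{(k)} = \theta_k$. The equation of motion reads $\partial_t\Psi = M_t\Psi$ with:
$$M_t = \left(g^{(t)}\,\partial_t\xi^{(t)}\,g^{(t)\,-1}\right)_- = -\frac{A^{(t)}}{\lambda - t}$$
We are free to make a global gauge transformation by a matrix which is constant in $\lambda$, and we use this freedom to diagonalize $M_\lambda$ at $\infty$, so that we assume:
$$M_\lambda(\lambda) = \frac{1}{\lambda}\begin{pmatrix} \kappa_1 & 0 \\ 0 & \kappa_2 \end{pmatrix} + O(\lambda^{-2}), \qquad \lambda \to \infty$$
The differential equation $\partial_\lambda\Psi = M_\lambda\Psi$ has a regular singularity at $\infty$, and the Fuchs condition reads $\kappa_1 + \kappa_2 = \theta_0 + \theta_1 + \theta_t$.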
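We can write $M_\lambda$ as:
$$M_\lambda(\lambda) = \begin{pmatrix} m_{11}(\lambda) & m_{12}(\lambda) \\ m_{21}(\lambda) & m_{22}(\lambda) \end{pmatrix} = \frac{1}{\lambda(\lambda - 1)(\lambda - t)}\,A(\lambda)$$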
where $A_{ii}(\lambda)$ are second degree polynomials in $\lambda$ with leading term $\kappa_i\lambda^2$, and $A_{12}(\lambda)$ and $A_{21}(\lambda)$ are degree one polynomials in $\lambda$. We set $A_{12}(\lambda) = \gamma(\lambda - y)$, introducing the important parameter $y$ which will end up satisfying the Painlevé (vi) equation. Taking the trace of $M_\lambda(\lambda)$, we have
$$A_{22}(\lambda) = \theta_0(\lambda - 1)(\lambda - t) + \theta_1\lambda(\lambda - t) + \theta_t\lambda(\lambda - 1) - A_{11}(\lambda)$$
which eliminates the parameters in $A_{22}(\lambda)$. Moreover, we know that $\det A(\lambda)$ vanishes for $\lambda = 0, 1, t$, so we get three conditions on $A_{21}(\lambda)$:
$$A_{21}(\lambda) = \frac{A_{11}(\lambda)A_{22}(\lambda)}{\gamma(\lambda - y)}, \qquad \lambda = 0, 1, t$$
The first two allow us to express $A_{21}(\lambda)$ in terms of $A_{11}(\lambda)$ and $A_{12}(\lambda)$, while the last one provides, since $A_{21}(\lambda)$ is a linear function of $\lambda$, a constraint on the parameters in $A_{11}(\lambda)$:
$$A_{21}(t) = (1 - t)A_{21}(0) + tA_{21}(1)$$
In order to write this constraint more explicitly, we parametrize $A_{11}(\lambda)$ by the values it takes at $\lambda = t$ and $\lambda = y$, i.e. we write the interpolation formula:
$$A_{11}(\lambda) = (\lambda - t)(\lambda - y)\kappa_1 + \frac{(\lambda - y)A_{11}(t) - (\lambda - t)A_{11}(y)}{t - y}$$
Substituting this into the previous equations, one observes that the constraint is quadratic in $A_{11}(y)$ but, unexpectedly, is linear in $A_{11}(t)$. We solve for $A_{11}(t)$ in terms of $A_{11}(y)$, getting:
$$A_{11}(t) = -\frac{\kappa_1^2(y - t)(y - 1 + t)}{\kappa_1 - \kappa_2} + \kappa_1\frac{(\theta_0 t + \theta_1(t - 1))(y - t) + 2A_{11}(y)}{\kappa_1 - \kappa_2} - A_{11}(y)\,\frac{A_{11}(y) + \theta_1 y(t - 1) + \theta_0 t(y - 1)}{(\kappa_1 - \kappa_2)\,y(y - 1)}$$
The dynamical variables are now $y$, $\gamma$, $A_{11}(y)$ and we have to write the equations of motion $\partial_t M_\lambda = \partial_\lambda M_t + [M_t, M_\lambda]$. Everything can be expressed in terms of the matrix $A(\lambda)$ since we have
$$M_t(\lambda) = -\frac{1}{t(t - 1)(\lambda - t)}\,A(t)$$
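We compute the equation of motion and afterwards we set $\lambda = y$. We get
$$\begin{pmatrix} \dot m_{11}(\lambda) & \dot m_{12}(\lambda) \\ \dot m_{21}(\lambda) & \dot m_{22}(\lambda) \end{pmatrix}_{\lambda = y} = \frac{1}{t(t - 1)(y - t)^2}\left\{ A(t) - \frac{1}{y(y - 1)}[A(t), A(y)] \right\}$$
Taking the matrix element 12 of this equation and evaluating it at $\lambda = y$ gives:
$$\dot y = \frac{y(y - 1)(y - t)}{t(t - 1)}\left( 2m_{11}(y) - \frac{\theta_0}{y} - \frac{\theta_1}{y - 1} - \frac{\theta_t - 1}{y - t} \right)$$
Taking the matrix element 11 evaluated at $\lambda = y$ gives:
$$\dot m_{11}(\lambda)|_{\lambda = y} = \frac{A_{11}(t)}{t(t - 1)(y - t)^2} + \frac{\gamma}{t(t - 1)}\,m_{21}(y)$$
We introduce the dynamical variable $z = m_{11}(y)$ and compute
$$\dot z = \dot m_{11}(\lambda)|_{\lambda = y} + (\partial_\lambda m_{11})|_{\lambda = y}\cdot\dot y$$
We find after some algebra:
$$t(t - 1)\dot z = -(3y^2 - 2(1 + t)y + t)z^2 + \left[ 2(\kappa_1 + \kappa_2 - 1)y + 1 - \theta_0 - \theta_t - (\theta_0 + \theta_1)t \right] z + \kappa_1(1 - \kappa_2)$$
Eliminating $z$ between the equations for $\dot y$ and $\dot z$ yields the Painlevé (vi) equation, with the following values of the parameters:
$$\alpha = \frac{1}{2}(\kappa_1 - \kappa_2 + 1)^2, \qquad \beta = -\frac{1}{2}\theta_0^2, \qquad \gamma = \frac{1}{2}\theta_1^2, \qquad \delta = \frac{1}{2}(1 - \theta_t^2)$$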
It is known that the other Painlevé equations can be obtained from this one by various limiting procedures.

References

[1] G.D. Birkhoff, Collected mathematical papers, Vol. 1. Dover Publications Inc. (1968) 259–306.
[2] E.L. Ince, Ordinary differential equations. Dover Publications Inc. (1956).
[3] W. Wasow, Asymptotic expansions for ordinary differential equations. Interscience Publishers (1965).
[4] H. Flaschka and A. Newell, Monodromy and Spectrum preserving deformations I. Commun. Math. Phys. 76 (1980) 65–166.
[5] M. Jimbo, T. Miwa and K. Ueno, Monodromy preserving deformations of ordinary differential equations with rational coefficients I. Physica 2D (1981) 306–352.
[6] M. Jimbo and T. Miwa, Monodromy preserving deformations of ordinary differential equations with rational coefficients II. Physica 2D (1981) 407–448.
[7] M. Jimbo and T. Miwa, Monodromy preserving deformations of ordinary differential equations with rational coefficients III. Physica 4D (1981) 26–46.
[8] T. Miwa, Painlevé property of monodromy preserving deformation equations and the analyticity of τ-functions. RIMS 17 (1981) 703–721.
[9] B. Malgrange, Sur les déformations isomonodromiques. Mathematics and Physics (Paris, 1979/1982), Progr. Math. 37, Birkhäuser, Boston, Mass. (1983).
[10] Y. Sibuya, Linear differential equations in the complex domain: problems of analytic continuation. AMS Providence (1990).
[11] D. Anosov and A. Bolibruch, The Riemann–Hilbert problem. Aspects of Mathematics Vol. 22 (1994).
9 Grassmannian and integrable hierarchies
We learned in previous chapters that when the Lax matrix has several singularities, the situation mainly reduces to local studies around each singularity. In this chapter we consider the case of one singularity in all its generality, which amounts to the study of the Kadomtsev–Petviashvili equation. This finds a natural presentation in Sato's Grassmannian approach. We give an explicit realization of the Grassmannian in a fermionic Fock space. In this setting we obtain a remarkable formula expressing the τ-function as $\tau(t) = \langle 0|e^{\sum_i H_i t_i} g|0\rangle$. Soliton solutions are then obtained by choosing for $g$ particular elements which have a simple expression in terms of vertex operators. Hirota equations are interpreted as the Plücker equations of an infinite-dimensional Grassmannian, on which the infinite-dimensional group $GL(\infty)$ acts. The time flows are interpreted as the action of a natural infinite-dimensional Abelian subgroup. We also find particularly simple tau-functions expressed as Schur polynomials. Finally, we show that the full KP hierarchy admits an elegant formulation in terms of pseudo-differential operators.
9.1 Introduction

In Chapter 3 we showed that one can simultaneously solve the hierarchy of equations:
$$\partial_{t_j}\Psi = M_j\Psi, \qquad M_j = (g\,\xi_j\,g^{-1})_-, \qquad \Psi = g\,e^{\sum_j \xi_j t_j} \tag{9.1}$$
with $g = 1 + O(1/z)$ regular and $\xi_j$ a diagonal constant matrix singular at the unique puncture that we take at $z = \infty$. Here $\Psi$ is a matrix of fundamental solutions of this system. In this chapter we denote by $z$ the spectral parameter and the singularity is at $z = \infty$. The matrix elements $\Psi_{\alpha\alpha}$ admit a remarkable expression in terms of tau-functions, see eq. (3.61) in Chapter 3:
$$\Psi_{\alpha\alpha}(z; t_1, t_2, \ldots) = \frac{\tau(t - [z^{-1}]_\alpha)}{\tau(t)}\,\exp\Big(\sum_{n\ge 1} z^n t_{(n,\alpha)}\Big) \tag{9.2}$$
where $(t - [z^{-1}]_\alpha)$ means that the times $t_{(n,\alpha)}$ are shifted by $-z^{-n}/n$. The noticeable feature of this expression is that the $z$-dependence is folded into the infinitely many times $t_{(n,\alpha)}$. The local formula eq. (9.2) was also recovered when analyzing isospectral flows in Chapter 5, where we observed that, in this context, the tau-functions are essentially Riemann's theta-functions. In the isomonodromic approach of Chapter 8, it was shown that the notion of tau-function is in fact much more general than that of theta-function. We have also seen in Chapter 8 that the equations of motion can be recast as bilinear Hirota equations on the tau-function, $P(D)\tau\cdot\tau = 0$. They form an infinite set of bilinear identities which are equivalent to the hierarchy equations.

In this chapter we elaborate on the relation between Sato's formula eq. (9.2), Hirota's bilinear equations and the representation theory of infinite-dimensional affine Kac–Moody algebras. A motivation for introducing affine Kac–Moody algebras may be seen in the structure of Sato's formula, which, using $\exp(a\,\partial_t)f(t) = f(t + a)$, can be written as:
$$\Psi(z, t) = \frac{V(z, t)\,\tau(t)}{\tau(t)}, \qquad V(z, t) = e^{\sum_{n\ge 1} z^n t_n}\; e^{-\sum_{n\ge 1}\frac{z^{-n}}{n}\frac{\partial}{\partial t_n}} \tag{9.3}$$
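The shift $t \to t - [z^{-1}]$ in Sato's formula can be made concrete with a few lines of code. The following is only a minimal sketch with exact rational arithmetic: the helper names `shift_times` and `sato_ratio` and the toy choice $\tau(t) = t_1$ are assumptions made for illustration, and the exponential prefactor $\exp(\sum_n z^n t_n)$ is deliberately left out.

```python
from fractions import Fraction

def shift_times(t, z):
    """Return the times t - [z^{-1}], i.e. t_n -> t_n - z^{-n}/n for n = 1, 2, ..."""
    return [tn - Fraction(1, n) / z**n for n, tn in enumerate(t, start=1)]

def sato_ratio(tau, t, z):
    """Ratio tau(t - [z^{-1}]) / tau(t), the tau-function part of Sato's formula."""
    return tau(shift_times(t, z)) / tau(t)

def tau_toy(t):
    """Hypothetical toy tau-function used only for illustration: tau(t) = t_1."""
    return t[0]

t = [Fraction(3), Fraction(1), Fraction(0)]   # truncated list (t_1, t_2, t_3)
z = Fraction(2)
shifted = shift_times(t, z)                   # (t_1 - 1/z, t_2 - 1/(2 z^2), t_3 - 1/(3 z^3))
ratio = sato_ratio(tau_toy, t, z)             # (t_1 - 1/z) / t_1
```

Truncating to finitely many times is harmless here because the toy tau-function depends on $t_1$ only; for a general tau-function one would keep as many times as it depends on.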
The operator $V(z, t)$ is a typical vertex operator, which is known to play a central role in the construction of representations of affine Kac–Moody algebras, see Chapter 16. A further motivation can be found in the observation that highest weight representations of affine Kac–Moody algebras by vertex operators naturally produce tau-functions obeying bilinear identities.

We now sketch, in a rather informal way, the ideas allowing us to associate Hirota bilinear equations with vertex operator highest weight representations of affine Kac–Moody algebras. These ideas also underlie the fermionic constructions of the following sections.

Let $\mathcal{G}$ be an affine Kac–Moody algebra and $\mathcal{H}$ a Cartan subalgebra of $\mathcal{G}$. We will use freely, in this section, the notion of a group $G$ associated with $\mathcal{G}$. Let $X^a$ be a basis of $\mathcal{G}$. It will often be a Cartan–Weyl basis $(H_i, E_\alpha)$, where $H_i$ is a basis of $\mathcal{H}$, and $E_\alpha$ are the root vectors associated to the roots $\alpha$, normalized by $(E_\alpha, E_{-\alpha}) = 1$. The tensor Casimir $C_{12}$ of $\mathcal{G}$ is defined by:
$$C_{12} = \sum_a X^a \otimes X_a = \sum_i H^i \otimes H_i + \sum_\alpha E_\alpha \otimes E_{-\alpha}$$
where the indices are raised or lowered with the Killing form. The fundamental property of $C_{12}$ is that for all $g \in G$, we have:
$$C_{12}\; g \otimes g = g \otimes g\; C_{12}$$
Let us give a proof of this property, which will be adapted in the fermionic case. We have to show that
$$\sum_a g^{-1}X^a g \otimes g^{-1}X_a g = \sum_a X^a \otimes X_a$$
but $g^{-1}X^a g = \alpha^a_{\ d}X^d$, and $g^{-1}X_b\, g = \beta_b^{\ c}X_c$. The Killing form is such that $(X^a, X_b) = \delta^a_b$ and is invariant under the adjoint action. Hence we have $\alpha^a_{\ c}\,\beta_b^{\ c} = \delta^a_b$, which implies our property.

If $|\Lambda\rangle$ and $|\Lambda'\rangle$ are two highest weight vectors, we have:
$$C_{12}\,|\Lambda\rangle \otimes |\Lambda'\rangle = (\Lambda, \Lambda')\,|\Lambda\rangle \otimes |\Lambda'\rangle$$
To see it, we write $C_{12}$ in the Cartan–Weyl basis and use that highest weight vectors $|\Lambda\rangle$ are eigenstates of $H_i$ and are annihilated by the generators $E_\alpha$ for $\alpha$ any positive root. These two relations imply the following identity:
$$C_{12}\; g|\Lambda\rangle \otimes g|\Lambda'\rangle = (\Lambda, \Lambda')\; g|\Lambda\rangle \otimes g|\Lambda'\rangle, \qquad \forall\, g \in G \tag{9.4}$$
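The invariance $C_{12}\, g \otimes g = g \otimes g\, C_{12}$ can be checked by hand in a finite-dimensional toy model. The sketch below is not the affine case treated in the text: it assumes the algebra $gl(2)$ in its defining representation, with the trace form $(E_{ij}, E_{ji}) = 1$ in place of the Killing form, so that $C_{12} = \sum_{i,j} E_{ij} \otimes E_{ji}$ is the permutation of the two tensor factors. All function names are illustrative.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i][k] * b[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]

def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def unit(n, i, j):
    """Matrix unit E_ij of size n x n."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def tensor_casimir(n):
    """C_12 = sum_{i,j} E_ij (x) E_ji for gl(n), dual basis taken w.r.t. the trace form."""
    c = [[0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            term = kron(unit(n, i, j), unit(n, j, i))
            c = [[x + y for x, y in zip(rc, rt)] for rc, rt in zip(c, term)]
    return c

# An arbitrary invertible 2 x 2 matrix playing the role of the group element g.
g = [[Fraction(2), Fraction(1)], [Fraction(1, 3), Fraction(5)]]
C12 = tensor_casimir(2)
gg = kron(g, g)
commutes = matmul(C12, gg) == matmul(gg, C12)
```

In this toy model the statement reduces to the familiar fact that swapping the two factors of $v \otimes w$ commutes with acting by the same $g$ on both of them.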
Suppose now that the representation of $\mathcal{G}$, with highest weight vector $|\Lambda\rangle$, admits a vertex operator construction in terms of bosonic operators $\alpha_n$, $n \in \mathbb{Z}$, satisfying $[\alpha_n, \alpha_m] = n\,\delta_{n+m,0}$, see Chapter 16. The representation space is the bosonic Fock space with vacuum $|\Lambda\rangle$ such that $\alpha_n|\Lambda\rangle = 0$ for $n > 0$. It is generated by acting on $|\Lambda\rangle$ with the operators $\alpha_n$, $n < 0$. A basis consists of states of the form $\alpha_{-n_1}\cdots\alpha_{-n_k}|\Lambda\rangle$. A vertex operator is an operator acting on the Fock space and has the typical form:
$$V(z) = e^{-\sum_{n<0}\frac{z^{-n}}{n}\alpha_n}\; e^{-\sum_{n>0}\frac{z^{-n}}{n}\alpha_n} \tag{9.5}$$
When we expand $V(z)$ in powers of $z$, the coefficients are polynomials in the bosonic operators $\alpha_n$. Together with the $\alpha_n$ themselves, they represent the elements $X^a$ of the Lie algebra. Note that taking for $\alpha_{-n}$, $n > 0$, the operator of multiplication by $n\,t_n$, and for $\alpha_n$, $n > 0$, the derivation operator $\partial/\partial t_n$, the vertex operator eq. (9.5) exactly reproduces the vertex operator appearing in eq. (9.3).
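Let us introduce the generating function $H(t) = \sum_{n>0}\alpha_n t_n$, depending on the infinite number of variables $t_n$. Note that $[H(t), H(t')] = 0$. We also need the dual vacuum $\langle\Lambda|$ such that $\langle\Lambda|\alpha_n = 0$ for $n < 0$. We define the tau-function by:
$$\tau^g_\Lambda(t) = \langle\Lambda|e^{H(t)}g|\Lambda\rangle, \qquad \text{with } H(t) = \sum_{n>0}\alpha_n t_n, \quad g \in G \tag{9.6}$$
This is a function on the orbit of $|\Lambda\rangle$ under $G$. Since $X^a \in \mathcal{G}$ is represented by some polynomial in the $\alpha_n$, there exist differential operators $D^a_\Lambda(t, \partial_t)$ in the variables $t_n$ such that:
$$\langle\Lambda|e^{H(t)}X^a g|\Lambda\rangle = D^a_\Lambda(t, \partial_t)\,\langle\Lambda|e^{H(t)}g|\Lambda\rangle \tag{9.7}$$
This easily follows from:
$$\langle\Lambda|e^{H(t)}\alpha_m\, g|\Lambda\rangle = \begin{cases} \dfrac{\partial}{\partial t_m}\langle\Lambda|e^{H(t)}g|\Lambda\rangle & \text{for } m > 0 \\[2mm] -m\,t_{-m}\,\langle\Lambda|e^{H(t)}g|\Lambda\rangle & \text{for } m < 0 \end{cases} \tag{9.8}$$
The case $m > 0$ is obvious, while for $m < 0$ one uses $e^{H(t)}\alpha_m e^{-H(t)} = \alpha_m + [H(t), \alpha_m] = \alpha_m - m\,t_{-m}$ to bring $\alpha_m$ to the left, where it is annihilated by the dual vacuum. This agrees with the identification of $\alpha_{-n}$, $n > 0$, with multiplication by $n\,t_n$.

We now multiply the fundamental bilinear relation eq. (9.4) by $e^{H(t')} \otimes e^{H(t'')}$. Let $x_n = \frac{1}{2}(t'_n + t''_n)$, $y_n = \frac{1}{2}(t'_n - t''_n)$, so that
$$e^{H(t')} \otimes e^{H(t'')} = e^{H(y)} \otimes e^{-H(y)} \cdot e^{H(x)} \otimes e^{H(x)}$$
The second factor on the right-hand side of this relation commutes with $C_{12}$, so that we can push it to its right. Then, taking the scalar product with $\langle\Lambda| \otimes \langle\Lambda'|$, we get
$$\sum_a \langle\Lambda|e^{H(y)}X^a e^{H(x)}g|\Lambda\rangle\,\langle\Lambda'|e^{-H(y)}X_a e^{H(x)}g|\Lambda'\rangle = (\Lambda, \Lambda')\,\tau^g_\Lambda(x + y)\,\tau^g_{\Lambda'}(x - y)$$
Using now eq. (9.7) we obtain:
$$\sum_a D^a_\Lambda(y, \partial_y)\,D_a^{\Lambda'}(-y, -\partial_y)\;\tau^g_\Lambda(x + y)\,\tau^g_{\Lambda'}(x - y) = (\Lambda, \Lambda')\,\tau^g_\Lambda(x + y)\,\tau^g_{\Lambda'}(x - y)$$
These are Hirota type equations. Expanding in $y$, we get an infinite number of bilinear Hirota equations which are in fact equations characterizing the orbits of the highest weight vectors $|\Lambda\rangle$ and $|\Lambda'\rangle$.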
Clearly this construction can be applied to any vertex operator representation of an affine Kac–Moody algebra $\mathcal{G}$. The hierarchy of Hirota equations we obtain in this way depends on the affine Kac–Moody algebra but also on the choice of the vertex operator representation. In the next section we make these ideas precise in the case of the group $GL(\infty)$.

9.2 Fermions and GL(∞)

Let $gl(\infty)$ be the Lie algebra of infinite-dimensional band matrices of the form $M_{rs}$ with $r, s \in \mathbb{Z} + \frac{1}{2}$ (the $\frac{1}{2}$ is added for convenience) and such that $M_{rs} = 0$ for $|r - s|$ large. Note that the sum and the product of two such matrices are well-defined and of the same type. Finally, $\widehat{gl}(\infty)$ is a central extension of $gl(\infty)$ that will be described later on, by displaying a representation of this algebra on some Hilbert space.

We introduce fermionic operators $\beta_r$, $\beta^*_r$, $r \in \mathbb{Z} + \frac{1}{2}$, with the following anticommutation relations:
$$\{\beta_r, \beta_s\} = 0, \qquad \{\beta^*_r, \beta^*_s\} = 0, \qquad \{\beta_r, \beta^*_s\} = \delta_{r+s,0} \tag{9.9}$$
where $\{\ ,\ \}$ denotes the anticommutator, i.e. $\{a, b\} = ab + ba$ for any operators $a$ and $b$. By convention, we will say that the fermions $\beta_r$ have charge $(+1)$ while the fermionic operators $\beta^*_s$ have charge $(-1)$. We introduce a vacuum vector $|0\rangle$ such that:
$$\beta_r|0\rangle = \beta^*_r|0\rangle = 0 \qquad \text{for } r \ge \tfrac{1}{2}$$
The Fock space is, by definition, generated by acting successively on the vacuum with the fermionic operators. A basis of the fermionic Fock space is the following set of states:
$$\beta_{-s_n}\cdots\beta_{-s_1}\cdot\beta^*_{-r_m}\cdots\beta^*_{-r_1}|0\rangle \tag{9.10}$$
with $s_j, r_j \ge \frac{1}{2}$. We introduce a Hermitian structure on the Fock space by defining the dual vacuum vector:
$$\langle 0|\beta_r = \langle 0|\beta^*_r = 0 \qquad \text{for } r \le -\tfrac{1}{2}$$
such that $\langle 0|0\rangle = 1$, and the adjoint of the fermionic operators:
$$\beta_r^\dagger = \beta^*_{-r}, \qquad (\beta^*_r)^\dagger = \beta_{-r}$$
This implies that the states eq. (9.10) form an orthonormal basis of the Fock space. Since the fermions $\beta_s$ have charge $(+1)$ and the fermions $\beta^*_r$ charge $(-1)$, the state eq. (9.10) has charge $(n - m)$.
Let $|p\rangle$ be the vector defined by:
$$|p\rangle \equiv \begin{cases} \beta_{-p+\frac{1}{2}}\cdots\beta_{-\frac{1}{2}}|0\rangle, & \text{for } p > 0, \\[1mm] \beta^*_{p+\frac{1}{2}}\cdots\beta^*_{-\frac{1}{2}}|0\rangle, & \text{for } p < 0, \end{cases} \tag{9.11}$$
These states have charge $p$ and satisfy $\beta_{r-p}|p\rangle = \beta^*_{r+p}|p\rangle = 0$ for $r \ge \frac{1}{2}$. We call them charged vacuum vectors. The states with charge $p$ are obtained by acting on the vacuum $|p\rangle$ with an equal number of $\beta_s$ and $\beta^*_r$. In the following we will mostly restrict ourselves to neutral states, which are linear combinations of states of the form eq. (9.10) with the same number of $\beta$ and $\beta^*$.
We will need the notion of normal order of fermionic operators. This order consists in writing all operators $\beta_r$, $\beta^*_r$ with positive indices $r$ on the right and multiplying by the signature of the considered permutation. The important property of normal ordered products is that their vacuum expectation value vanishes. The relation between a monomial and its normal ordered form is given by Wick's theorem. For two fermions it reads:
$$\beta_r\beta^*_s = {:}\beta_r\beta^*_s{:} + \langle 0|\beta_r\beta^*_s|0\rangle, \qquad {:}\beta_r\beta^*_s{:} = -{:}\beta^*_s\beta_r{:}$$
This can be seen as follows. To bring the left-hand side to its normal ordered form we have to perform at most one anticommutation, hence adding at most a c-number, which can only be the vacuum expectation value of the left-hand side since the normal product has vanishing vacuum expectation value. This scalar is called the contraction of the two operators. For more than two operators Wick's theorem is expressed by the induction formula:
$$\phi_1\,{:}\phi_2\cdots\phi_n{:} = {:}\phi_1\phi_2\cdots\phi_n{:} + \sum_{i=2}^n (-1)^i\,\langle 0|\phi_1\phi_i|0\rangle\;{:}\phi_2\cdots\hat\phi_i\cdots\phi_n{:}$$
where $\phi_j$ is either some $\beta_r$ or some $\beta^*_s$ and the notation $\hat\phi_i$ means omission of the factor $\phi_i$. It follows that $\langle 0|\phi_1\phi_2\cdots\phi_n|0\rangle$ is equal to the sum of all possible products of contractions with appropriate signs.

We now construct the algebra $\widehat{gl}(\infty)$. Consider neutral bilinear operators of the form $\beta_r\beta^*_s$. They form a closed algebra under commutation. For example:
$$[\beta_r\beta^*_s, \beta_n\beta^*_m] = \delta_{s+n,0}\,(\beta_r\beta^*_m) - \delta_{r+m,0}\,(\beta_n\beta^*_s)$$
Let $M_{rs}$ be a band matrix, i.e. an infinite-dimensional matrix such that $M_{rs} = 0$ for $|r - s| > N$ for some given $N$. We can define the formal operator $X = \sum_{r,s} M_{rs}\,\beta_r\beta^*_{-s}$. The commutator of two such objects can be computed without ambiguity, due to the band structure of the matrices $M$ and $N$:
$$[X, Y] = \sum_{r,s,n,m} M_{rs}N_{nm}\,[\beta_r\beta^*_{-s}, \beta_n\beta^*_{-m}] = \sum_{r,m}[M, N]_{rm}\,\beta_r\beta^*_{-m}$$
We reproduce the Lie algebra of (band) matrices. However, objects like $X$ cannot be represented on the Fock space since their matrix elements on this space may be infinite. For example, $\langle 0|X|0\rangle = \sum_{r>0}M_{rr}$, which can be infinite. To overcome this problem we normal order the bilinear fermionic operator. Then all matrix elements are finite between states with a finite number of particles. However, this induces a modification of the commutation rules:
$$\left[{:}\beta_r\beta^*_{-s}{:}\,,\,{:}\beta_n\beta^*_{-m}{:}\right] = \delta_{s,n}\,{:}\beta_r\beta^*_{-m}{:} - \delta_{r,m}\,{:}\beta_n\beta^*_{-s}{:} + \delta_{s,n}\langle 0|\beta_r\beta^*_{-m}|0\rangle - \delta_{r,m}\langle 0|\beta_n\beta^*_{-s}|0\rangle \tag{9.12}$$
The additional c-number term is the central extension of $gl(\infty)$, and is equal to $\delta_{rm}\delta_{sn}(\theta(m) - \theta(s))$, where $\theta(m) = 1$ for $m > 0$ and vanishes for $m < 0$.

Definition. The algebra $\widehat{gl}(\infty)$ is the infinite-dimensional Lie algebra of elements of the form:
$$\widehat{gl}(\infty) = \Big\{ X = \sum_{r,s} M_{rs}\,{:}\beta_r\beta^*_{-s}{:}\,,\quad M_{rs} = 0 \text{ if } |r - s| \gg 0 \Big\} \oplus \mathbb{C} \tag{9.13}$$
equipped with the Lie bracket eq. (9.12). Here $|r - s| \gg 0$ means that there exists $N > 0$ such that $M_{rs} = 0$ when $|r - s| > N$.

We will need to consider the group $GL(\infty)$ associated with this Lie algebra. Its precise definition is somewhat tricky. We will adopt here a very naive point of view, and consider group elements of the form:
$$g = e^{X_1}e^{X_2}\cdots e^{X_k}$$
with $X_1, \ldots, X_k \in \widehat{gl}(\infty)$ defined by finite sums of bilinears in the fermionic operators. It is then clear that such group elements are well represented in the Fock space. Of course we will quickly encounter the need to extend this setting to infinite sums of bilinears; then one has to be very careful about convergence properties, but we will not attempt to define a general framework and refer instead to the literature. The group $GL(\infty)$ acts on the fermions by conjugation.
Proposition. For any $g \in GL(\infty)$ we have
$$g\,\beta_s\,g^{-1} = \sum_r \beta_r\,a_{rs} \tag{9.14}$$
$$g\,\beta^*_{-s}\,g^{-1} = \sum_r \beta^*_{-r}\,(a^{-1})_{sr} \tag{9.15}$$
for some c-number matrix $a_{rs}$. With our definition of $GL(\infty)$ the matrices $a_{rs}$ only have a finite but arbitrarily large number of non-zero entries.

Proof. We use a matrix notation: $\beta$ is an $\infty$ line vector with elements $\beta_s$, and $\beta^*$ is an $\infty$ column vector with elements $\beta^*_{-s}$. We want to show that for a given $g$,
$$g\,\beta\,g^{-1} = \beta\cdot a, \qquad g^{-1}\beta^* g = a\cdot\beta^*$$
with the same matrix $a$. Note that if $g_1\beta g_1^{-1} = \beta a_1$ and $g_2\beta g_2^{-1} = \beta a_2$ then $g_1g_2\,\beta\,g_2^{-1}g_1^{-1} = \beta a_1a_2$, and similarly if $g_1^{-1}\beta^* g_1 = b_1\beta^*$, $g_2^{-1}\beta^* g_2 = b_2\beta^*$, then $g_2^{-1}g_1^{-1}\beta^* g_1g_2 = b_1b_2\beta^*$. So, it is enough to verify that $a = b$ on simple generators. We take
$$g = e^{{:}\beta_r\beta^*_s{:}} = e^{-\langle 0|\beta_r\beta^*_s|0\rangle}\,e^{\beta_r\beta^*_s}$$
We see that the normal ordering produces simple scalar factors which do not contribute to the adjoint action, and we can omit it. If $(r + s) \neq 0$, we have $g = e^{\beta_r\beta^*_s} = 1 + \beta_r\beta^*_s$, and therefore,
$$g\,\beta_k\,g^{-1} = \beta_k + [\beta_r\beta^*_s, \beta_k] = \sum_l \beta_l\,(\delta_{kl} + \delta_{rl}\delta_{s+k,0})$$
$$g^{-1}\beta^*_{-l}\,g = \beta^*_{-l} - [\beta_r\beta^*_s, \beta^*_{-l}] = \sum_k (\delta_{kl} + \delta_{rl}\delta_{s+k,0})\,\beta^*_{-k}$$
If $(r + s) = 0$ then $g = e^{\beta_r\beta^*_{-r}}$, and
$$g\,\beta_k\,g^{-1} = \beta_k\,e^{\delta_{rk}}, \qquad g^{-1}\beta^*_{-k}\,g = \beta^*_{-k}\,e^{\delta_{rk}}$$
This proves the result.

It is convenient to work with generating functions $\beta(z)$ and $\beta^*(z)$, also called fermionic fields, defined by:
$$\beta(z) = \sum_{r\in\mathbb{Z}+\frac{1}{2}} \beta_r\,z^{-r-1/2} \tag{9.16}$$
$$\beta^*(z) = \sum_{r\in\mathbb{Z}+\frac{1}{2}} \beta^*_r\,z^{-r-1/2} \tag{9.17}$$
Notice that $\beta(z)$ and $\beta^*(z)$ are single valued, i.e. $\beta(ze^{2i\pi}) = \beta(z)$ and similarly for $\beta^*(z)$.
Proposition. The fermionic fields satisfy the following "operator product expansion":
$$\beta(z)\beta^*(w) = {:}\beta(z)\beta^*(w){:} + \frac{1}{z - w}, \qquad \text{for } |z| > |w| \tag{9.18}$$
Proof. It is enough to compute the vacuum expectation value of the left-hand side. This is given by $\sum_{r,s}\langle 0|\beta_r\beta^*_s|0\rangle\,z^{-r-\frac{1}{2}}w^{-s-\frac{1}{2}}$. Since $\langle 0|\beta_r\beta^*_s|0\rangle = 1$ when $r = -s > 0$ and vanishes otherwise, we get $\langle 0|\beta(z)\beta^*(w)|0\rangle = \frac{1}{z}\sum_{i=0}^\infty (w/z)^i = 1/(z - w)$ when $|z| > |w|$.
By Wick's theorem this can be generalized to any product of fields, yielding:
$$\langle 0|\prod_{j=1}^N \beta(z_j)\prod_{j=1}^N \beta^*(w_j)|0\rangle = (-1)^{\frac{N(N-1)}{2}}\det\left(\frac{1}{z_i - w_j}\right) \tag{9.19}$$
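By Cauchy's determinant identity, the right-hand side of eq. (9.19) can also be written as the ratio of products $\prod_{i<j}(z_i - z_j)(w_i - w_j)/\prod_{i,j}(z_i - w_j)$. The following sketch checks this numerically with exact rational arithmetic; the function names and the sample points are assumptions chosen for illustration, and the Leibniz-formula determinant is only meant for small $N$.

```python
from fractions import Fraction
from itertools import permutations

def det(m):
    """Determinant by the Leibniz formula (adequate for small matrices)."""
    n = len(m)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # Sign of the permutation, computed from its number of inversions.
        inv = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        term = Fraction(-1) ** inv
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total

def fermion_correlator(z, w):
    """Right-hand side of eq. (9.19): (-1)^{N(N-1)/2} det[1/(z_i - w_j)]."""
    n = len(z)
    matrix = [[Fraction(1) / (zi - wj) for wj in w] for zi in z]
    return (-1) ** (n * (n - 1) // 2) * det(matrix)

def cauchy_product(z, w):
    """prod_{i<j} (z_i - z_j)(w_i - w_j) / prod_{i,j} (z_i - w_j)."""
    n = len(z)
    num = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            num *= (z[i] - z[j]) * (w[i] - w[j])
    den = Fraction(1)
    for zi in z:
        for wj in w:
            den *= zi - wj
    return num / den

z = [Fraction(7), Fraction(5), Fraction(11)]
w = [Fraction(1), Fraction(2), Fraction(3)]
lhs = fermion_correlator(z, w)
rhs = cauchy_product(z, w)
```

For $N = 2$ the agreement can be seen directly: the two Wick contractions combine into $-(z_1 - z_2)(w_1 - w_2)$ over the common denominator.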
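A direct consequence of eqs. (9.14, 9.15) is that there exists a fermionic analogue of the tensor Casimir. This fact is the crucial point of the fermionic construction of the Hirota equations.

Proposition. Let $S_{12}$ be defined by:
$$S_{12} = \sum_{r\in\mathbb{Z}+\frac{1}{2}} \beta_r \otimes \beta^*_{-r} = \oint \frac{dz}{2i\pi}\,\beta(z) \otimes \beta^*(z) \tag{9.20}$$
Then
$$S_{12}\; g \otimes g = g \otimes g\; S_{12}, \qquad \forall g \in GL(\infty) \tag{9.21}$$
The operator $(S^\dagger S)$ is the tensor Casimir for $gl(\infty)$.

Proof. Equation (9.21) is equivalent to:
$$\sum_{r\in\mathbb{Z}+1/2} \beta_r\,g \otimes \beta^*_{-r}\,g = \sum_{r\in\mathbb{Z}+1/2} g\,\beta_r \otimes g\,\beta^*_{-r}, \qquad \forall g \in GL(\infty)$$
From eqs. (9.14, 9.15), we have $\beta_r\,g = \sum_s g\,\beta_s\,(a^{-1})_{sr}$ and $\beta^*_{-r}\,g = \sum_s a_{rs}\,g\,\beta^*_{-s}$, so that
$$\sum_r \beta_r\,g \otimes \beta^*_{-r}\,g = \sum_{r,s,k} g\,\beta_s \otimes g\,\beta^*_{-k}\,(a^{-1})_{sr}\,a_{rk} = \sum_s g\,\beta_s \otimes g\,\beta^*_{-s}$$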
Notice that the definition of the vacuum implies that $\beta_r|0\rangle \otimes \beta^*_{-r}|0\rangle = 0$, $\forall r$, since either $\beta_r|0\rangle = 0$ or $\beta^*_{-r}|0\rangle = 0$. Therefore
$$S_{12}\,|0\rangle \otimes |0\rangle = \sum_r \beta_r|0\rangle \otimes \beta^*_{-r}|0\rangle = 0 \tag{9.22}$$
Combining eqs. (9.21, 9.22) we obtain:
$$S_{12}\; g|0\rangle \otimes g|0\rangle = 0 \tag{9.23}$$
for any $g \in GL(\infty)$. This relation is the fermionic analogue of the bosonic equation eq. (9.4) and will be very important in the following.

Remark. The transformations eqs. (9.14, 9.15) can alternatively be written in a more compact way by introducing the function $A^g_n(z)$ for $n \in \mathbb{Z}$:
$$A^g_n(z) = \sum_{m\in\mathbb{Z}} z^m\,a_{m+\frac{1}{2},\,n+\frac{1}{2}} \tag{9.24}$$
We then have:
$$g\,\beta_s\,g^{-1} = \oint \frac{dz}{2i\pi}\,\beta(z)\,A^g_{s-\frac{1}{2}}(z)$$
In eq. (9.24) the sum is finite as long as we use the previous definition for the group $GL(\infty)$. As pointed out above, extending this construction such that the definition of the function $A^g_n(z)$ involves an infinite sum requires choosing an appropriate completion of $GL(\infty)$.
9.3 Boson–fermion correspondence

We now come to a truly remarkable result known as the boson–fermion correspondence. This will be the main technical tool used in the forthcoming sections. We construct bosonic operators from the fermions and conversely. The bosonic operators $H_n$ are simply bilinear in the fermionic operators.

Proposition. Define the operators $H_n$ by
$$H(z) = {:}\beta(z)\beta^*(z){:} = \sum_{n\in\mathbb{Z}} z^{-n-1}H_n \tag{9.25}$$
or equivalently $H_n = \sum_r {:}\beta_r\beta^*_{-r+n}{:}$. They obey the following commutation relations:
$$[H_n, \beta(z)] = z^n\beta(z), \qquad [H_n, \beta^*(z)] = -z^n\beta^*(z), \qquad [H_n, H_m] = n\,\delta_{n+m,0} \tag{9.26}$$
Proof. The proof follows from standard manipulations using Wick's theorem. By definition, we have:
$$H_n = \oint \frac{dz}{2i\pi}\,z^n\,{:}\beta(z)\beta^*(z){:}$$
Using Wick's theorem, we may write the product $H_n\beta(w)$ as:
$$H_n\beta(w) = \oint \frac{dz}{2i\pi}\,z^n\,{:}\beta(z)\beta^*(z){:}\,\beta(w) = \oint \frac{dz}{2i\pi}\,z^n\,{:}\beta(z)\beta^*(z)\beta(w){:} - \oint \frac{dz}{2i\pi}\,z^n\,\frac{1}{w - z}\,\beta(z)$$
Here $|w| < |z|$, so the integral is over a circle $C_1$ around the origin enclosing $w$. The product $\beta(w)H_n$ is given by the same expression with $|w| > |z|$, where the integral is over a circle $C_2$ around the origin not enclosing $w$. So the commutator $[H_n, \beta(w)]$ is:
$$[H_n, \beta(w)] = -\oint_{C_1 - C_2}\frac{dz}{2i\pi}\,z^n\,\frac{1}{w - z}\,\beta(z) = \oint_{C_w}\frac{dz}{2i\pi}\,z^n\,\frac{1}{z - w}\,\beta(z) = w^n\beta(w) \tag{9.27}$$
where $C_w$ is a small circle around $w$. The other relations are proved similarly.

From eq. (9.26), we see that the $H_n$ obey bosonic commutation relations. Moreover, we have $H_n^\dagger = H_{-n}$. The operator $H_0$ is both Hermitian and in the centre of the bosonic algebra.

Let us now describe the bosonic Fock space. The vacuum vector $|l\rangle$ is such that:
$$H_n|l\rangle = 0 \quad \text{for } n > 0, \qquad H_0|l\rangle = l\,|l\rangle, \qquad l \in \mathbb{Z}$$
Here we choose $l \in \mathbb{Z}$ because in terms of fermionic operators $H_0$ has integer spectrum. Over each state $|l\rangle$ we construct a bosonic Fock space by acting with the creation operators $H_n$ with $n < 0$. A basis of the bosonic Fock space consists of the states:
$$H_{-n_1}\cdots H_{-n_k}|l\rangle$$
We will say that these states have charge $l$, so the charge is the $H_0$ eigenvalue. In particular, the neutral bosonic Fock space is spanned by the states obtained by acting with the $H_n$ on the vacuum $|0\rangle$.

We have constructed bosons from fermions; the unexpected result is that fermions can be reconstructed from bosons. We start with bosonic operators obeying the commutation relations $[H_n, H_m] = n\,\delta_{n+m,0}$. As already noticed, $H_0$ is in the centre of this algebra, so we call it "momentum" and denote it $p$. We enlarge the algebra by introducing a "position" operator $q$ such that $[p, q] = -i$. On the vacua $|l\rangle$, the operator $e^{iq}$ acts as a translation operator: $e^{iq}|l\rangle = |l + 1\rangle$, compatible with $p|l\rangle = l|l\rangle$.

Proposition. Let $[p, q] = -i$ and define
$$\phi(z) = q - ip\log z + i\sum_{n\neq 0}\frac{H_n}{n}\,z^{-n}$$
Then, the fermionic fields are reconstructed by the formulae:
$$\beta(z) = V_+(z), \qquad \beta^*(z) = V_-(z) \tag{9.28}$$
where $V_\pm(z) = {:}\exp(\pm i\phi(z)){:}$. The colons $:\ :$ denote the bosonic normal ordering, which consists of writing the operators $H_n$ with $n > 0$ on the right, and the $e^{iq}$ factor on the left. Explicitly, the operators $V_\pm(z)$ read
$$V_\pm(z) = e^{\pm iq}\,z^{\pm p}\,\exp\Big(\mp\sum_{n<0}\frac{z^{-n}}{n}H_n\Big)\exp\Big(\mp\sum_{n>0}\frac{z^{-n}}{n}H_n\Big) \tag{9.29}$$
The operators $V_\pm(z)$ are called vertex operators.

Proof. The proof relies on the following relations:
$$V_\pm(z)V_\pm(w) = (z - w)\,{:}e^{\pm i\phi(z)}e^{\pm i\phi(w)}{:}\,, \qquad |z| > |w|$$
$$V_\pm(z)V_\mp(w) = \frac{1}{z - w}\,{:}e^{\pm i\phi(z)}e^{\mp i\phi(w)}{:}\,, \qquad |z| > |w| \tag{9.30}$$
They follow immediately from the standard Campbell–Hausdorff formula $e^Ae^B = e^Be^Ae^{[A,B]}$ for two operators $A$ and $B$ whose commutator $[A, B]$ is a c-number. To compute the anticommutator $\{\beta_r, \beta^*_s\}$, we represent the fermionic operators as contour integrals, $\beta_r = \oint \frac{dz}{2i\pi}\,z^{r-1/2}\,{:}\exp(i\phi(z)){:}$ and similarly for $\beta^*_s$. We then use the above relations and apply contour integral manipulations as in the previous proposition:
$$\{\beta_r, \beta^*_s\} = \left(\oint\!\!\oint_{|z|>|w|} - \oint\!\!\oint_{|z|<|w|}\right)\frac{dz\,dw}{(2i\pi)^2}\,\frac{z^{r-1/2}\,w^{s-1/2}}{z - w}\,{:}e^{i\phi(z)}e^{-i\phi(w)}{:} = \oint\frac{dw}{2i\pi}\,w^{r+s-1} = \delta_{r+s,0}$$
The bosonic vacua $|l\rangle$ satisfy $H_n|l\rangle = 0$ for $n > 0$. Using the bosonic construction of the fermions, one verifies that these states $|l\rangle$ also satisfy $\beta_{r-l}|l\rangle = \beta^*_{r+l}|l\rangle = 0$ for $r \ge \frac{1}{2}$. Thus they coincide with the states that we defined in eq. (9.11), i.e. fermionic charged vacua of charge $l$.

The boson–fermion correspondence may also be viewed at the level of vacuum expectation values as a consequence of Cauchy's determinant formula. Wick's theorem allows us to compute expectation values of vertex operators:
$$\langle 0|\prod_{j=1}^N {:}e^{i\phi(z_j)}{:}\prod_{j=1}^N {:}e^{-i\phi(w_j)}{:}|0\rangle = \frac{\prod_{i<j}(z_i - z_j)(w_i - w_j)}{\prod_{i,j}(z_i - w_j)} \tag{9.31}$$
Similarly we have:
$$\langle 0|\prod_{j=1}^N {:}e^{i\phi(z_j)}{:}|{-N}\rangle = \prod_{i<j}(z_i - z_j)\prod_i z_i^{-N} \tag{9.32}$$
The bosonization formula will follow if the fermionic expectation value eq. (9.19) coincides with the bosonic expectation value eq. (9.31), specifically, if we have:
$$\frac{\prod_{i<j}(z_i - z_j)(w_i - w_j)}{\prod_{i,j}(z_i - w_j)} = (-1)^{\frac{N(N-1)}{2}}\det\left(\frac{1}{z_i - w_j}\right) \tag{9.33}$$
This is nothing but the Cauchy determinant formula.

9.4 Tau-functions and Hirota bilinear identities

The tau-functions are functions on the orbit of the vacuum $|0\rangle$ under the action of the group $GL(\infty)$. To define them as in eq. (9.6), we again introduce the function $H(t)$ by:
$$H(t) = \sum_{n>0} t_nH_n$$
Definition. We define the tau-function $\tau(t)$ as the vacuum expectation value:
$$\tau(t; g) = \langle 0|e^{H(t)}g|0\rangle, \qquad g \in GL(\infty) \tag{9.34}$$
We wish to prove that the tau-functions obey Hirota equations. We repeat the argument of section (9.1), but using the operator $S_{12}$ in place of $C_{12}$. From eq. (9.23) we get:
$$S_{12}\cdot g|0\rangle \otimes g|0\rangle = \oint\frac{dz}{2i\pi}\,\beta(z)g|0\rangle \otimes \beta^*(z)g|0\rangle = 0 \tag{9.35}$$
This is an identity on vectors in Fock space belonging to the orbit of the vacuum vector. It translates readily into bilinear identities on the tau-functions.

Proposition. Let $\tau(t; g)$ be the tau-function defined in eq. (9.34). It satisfies the identities:
$$\oint\frac{dz}{2i\pi}\,\exp\Big(\sum_{n>0} 2z^ny_n\Big)\exp\Big(-\sum_{n>0}\frac{z^{-n}}{n}\frac{\partial}{\partial y_n}\Big)\,\tau(t + y; g)\,\tau(t - y; g) = 0 \tag{9.36}$$
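Proof. Applying $e^{H(t')} \otimes e^{H(t'')}$ to eq. (9.35), and taking the inner product with $\langle +1| \otimes \langle -1|$, we get:
$$\oint\frac{dz}{2i\pi}\,\langle +1|e^{H(t')}\beta(z)g|0\rangle\,\langle -1|e^{H(t'')}\beta^*(z)g|0\rangle = 0$$
To compute $\langle +1|e^{H(t')}\beta(z)g|0\rangle$ and $\langle -1|e^{H(t'')}\beta^*(z)g|0\rangle$ we use the bosonization formula. This gives, as we show below:
$$\langle +1|e^{H(t')}\beta(z)g|0\rangle = V_+(z, t')\,\tau(t'; g), \qquad \langle -1|e^{H(t'')}\beta^*(z)g|0\rangle = V_-(z, t'')\,\tau(t''; g) \tag{9.37}$$
where $V_\pm(z; t)$ are the vertex operators in their differential operator representation:
$$V_\pm(z; t) = \exp\Big(\pm\sum_{n>0} z^nt_n\Big)\exp\Big(\mp\sum_{n>0}\frac{z^{-n}}{n}\frac{\partial}{\partial t_n}\Big)$$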
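Indeed, inserting eq. (9.28) into the left-hand side of eq. (9.37) we get:
$$\langle +1|e^{H(t')}\beta(z)g|0\rangle = \langle +1|e^{H(t')}\,e^{iq}z^p\exp\Big(-\sum_{n<0}\frac{z^{-n}}{n}H_n\Big)\exp\Big(-\sum_{n>0}\frac{z^{-n}}{n}H_n\Big)g|0\rangle$$
Noticing that $\langle +1|e^{iq}z^p = \langle 0|$, and commuting $\exp\big(-\sum_{n<0}\frac{z^{-n}}{n}H_n\big)$ to the left of $e^{H(t')}$, where it acts trivially on the dual vacuum and leaves the c-number factor $\exp\big(\sum_{n>0}z^nt'_n\big)$, we get:
$$\langle +1|e^{H(t')}\beta(z)g|0\rangle = \exp\Big(\sum_{n>0}z^nt'_n\Big)\,\langle 0|e^{H(t')}\exp\Big(-\sum_{n>0}\frac{z^{-n}}{n}H_n\Big)g|0\rangle = \exp\Big(\sum_{n>0}z^nt'_n\Big)\exp\Big(-\sum_{n>0}\frac{z^{-n}}{n}\frac{\partial}{\partial t'_n}\Big)\langle 0|e^{H(t')}g|0\rangle$$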
With these formulae, the bilinear identity becomes:
$$\oint\frac{dz}{2i\pi}\,V_+(z; t')\,V_-(z; t'')\,\tau(t'; g)\,\tau(t''; g) = 0 \tag{9.38}$$
which is eq. (9.36) if we use the variables $t = \frac{1}{2}(t' + t'')$ and $y = \frac{1}{2}(t' - t'')$.

The identity eq. (9.36) contains an infinite set of bilinear Hirota equations. Let us introduce the Hirota operators, see eq. (8.55) in Chapter 8, for the infinite set of variables $t_n$: $D = \{D_1, D_2, \ldots, D_n, \ldots\}$. Equation (9.36) can be rewritten in terms of these operators as:
$$\oint\frac{dz}{2i\pi}\,e^{\sum_{n>0}2z^ny_n}\,e^{-\sum_{n>0}\frac{z^{-n}}{n}D_n}\,e^{\sum_{n>0}y_nD_n}\;\tau(t; g)\cdot\tau(t; g) = 0$$
Using the definition of the elementary Schur polynomials $P_k(t)$:
$$e^{\xi(t,z)} = \exp\Big(\sum_{n=1}^\infty t_nz^n\Big) = \sum_{k=0}^\infty P_k(t)\,z^k \tag{9.39}$$
we can expand in $z$ the quantity $\exp\big(\sum_{n>0}2z^ny_n\big)$. Integrating over $z$ shows that the bilinear identity eq. (9.36) is equivalent to:
$$\sum_{j=0}^\infty P_j(2y)\,P_{j+1}(-\tilde D)\,\exp\Big(\sum_{n>0}y_nD_n\Big)\,\tau(t; g)\cdot\tau(t; g) = 0, \qquad \forall y_n \tag{9.40}$$
where $\tilde D = \{D_1, \frac{1}{2}D_2, \ldots, \frac{1}{n}D_n, \ldots\}$.

Expanding this equation in powers of $y$, we get an infinite set of Hirota equations called the KP hierarchy. For instance, the coefficient of $y_1^3$ gives the equation
$$(D_1^4 + 3D_2^2 - 4D_1D_3)\,\tau\cdot\tau = 0$$
which is the Hirota form of the KP equation, see eq. (8.56) in Chapter 8.

Remark. Since the states $|l\rangle$ satisfy $\beta_{r-l}|l\rangle = \beta^*_{r+l}|l\rangle = 0$ for $r \ge \frac{1}{2}$, we also have, for $l \ge l'$, the relation
$$S_{12}\cdot g|l\rangle \otimes g|l'\rangle = \sum_r \beta_r\,g|l\rangle \otimes \beta^*_{-r}\,g|l'\rangle = 0$$
Thus we can similarly prove more general bilinear identities for the τ-functions $\tau_l(t; g) = \langle l|e^{H(t)}g|l\rangle$:
$$\oint\frac{dz}{2i\pi}\,z^{l-l'}\,V_-(z; t'')\,V_+(z; t')\,\tau_l(t'; g)\,\tau_{l'}(t''; g) = 0, \qquad \forall t', t'', l, l', \quad l \ge l' \tag{9.41}$$
9.5 The KP hierarchy and its soliton solutions

To produce a solution of the KP hierarchy, we need to choose an element $g \in GL(\infty)$ and calculate the τ-function. A particularly simple choice is:
$$g_{1\text{-sol}} = e^{a\beta(p)\beta^*(q)} = 1 + a\,\beta(p)\beta^*(q), \qquad \text{with } |p| > |q|$$
This corresponds to the one-soliton solution. In principle one should normal order $\beta(p)\beta^*(q)$ in the exponential, but this would amount to a multiplicative prefactor which is irrelevant in the bilinear Hirota equations. The $N$-soliton solution is obtained by choosing the group element to be the product of $N$ one-soliton factors.

Proposition. The $N$-soliton tau-function of the KP hierarchy is given by:
$$\tau_N(t; g) = \langle 0|e^{H(t)}g_1g_2\cdots g_N|0\rangle, \qquad \text{with } g_i = 1 + a_i\,\beta(p_i)\beta^*(q_i)$$
Explicitly, it is equal to:
$$\tau_N(t;g) = 1 + \sum_{r=1}^{N}\ \sum_{\substack{I\subset\{1,\ldots,N\} \\ |I|=r}}\ \prod_{i\in I} e^{\eta_i} \cdot \prod_{i<j\in I} \frac{(p_i-p_j)(q_i-q_j)}{(p_i-q_j)(q_i-p_j)} \tag{9.42}$$
where $\eta_i = \xi(t,p_i) - \xi(t,q_i) + \log\dfrac{a_i}{p_i-q_i}$, with $\xi(t,p) = \sum_n p^n t_n$.
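As a sanity check of the closed form, the sketch below compares it with the sum of principal minors of a matrix $W_{ij} = X_i/(p_i - q_j)$, where $X_i$ stands for $a_i e^{\xi(t,p_i)-\xi(t,q_i)}$. The normalisation $\langle 0|\beta(p)\beta^*(q)|0\rangle = 1/(p-q)$ used to build $W$ is an assumption of this example, and the function names and sample values are purely illustrative. Exact rational arithmetic is used, so the comparison is an equality and not a floating-point approximation.

```python
from fractions import Fraction
from itertools import combinations, permutations


def det(m):
    """Determinant by the Leibniz formula (only meant for tiny matrices)."""
    n = len(m)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # sign of the permutation = (-1)^(number of inversions)
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        term = Fraction(sign)
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total


def tau_from_minors(X, p, q):
    """1 + sum of all principal minors of W_ij = X_i / (p_i - q_j).

    ASSUMPTION: the fermionic two-point function is 1/(p - q).
    """
    n = len(X)
    W = [[X[i] / (p[i] - q[j]) for j in range(n)] for i in range(n)]
    total = Fraction(1)
    for r in range(1, n + 1):
        for I in combinations(range(n), r):
            total += det([[W[i][j] for j in I] for i in I])
    return total


def tau_closed_form(X, p, q):
    """Closed form: sum over subsets I of prod e^{eta_i} times the pair factors,
    with e^{eta_i} = X_i / (p_i - q_i)."""
    n = len(X)
    total = Fraction(1)
    for r in range(1, n + 1):
        for I in combinations(range(n), r):
            term = Fraction(1)
            for i in I:
                term *= X[i] / (p[i] - q[i])
            for i, j in combinations(I, 2):  # pairs with i < j
                term *= (p[i] - p[j]) * (q[i] - q[j]) / ((p[i] - q[j]) * (q[i] - p[j]))
            total += term
    return total


# Hypothetical sample data for N = 3 solitons.
X = [Fraction(1, 2), Fraction(3), Fraction(2, 5)]
p = [Fraction(5), Fraction(7), Fraction(11)]
q = [Fraction(1), Fraction(2), Fraction(3)]
print(tau_from_minors(X, p, q) == tau_closed_form(X, p, q))
```

The agreement of the two functions is the Cauchy determinant identity applied to each principal minor.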
Proof. Using the commutation relation eq. (9.26) of the fermions with $H(t)$, we get the “evolution equations” of the fermionic fields:
$$e^{H(t)}\beta(z)e^{-H(t)} = e^{\xi(t,z)}\beta(z) \tag{9.43}$$
$$e^{H(t)}\beta^*(z)e^{-H(t)} = e^{-\xi(t,z)}\beta^*(z) \tag{9.44}$$
This allows us to commute $e^{H(t)}$ to the right, where it acts trivially on the vacuum since $H(t)|0\rangle = 0$, getting:
$$\tau_N(t;g) = \langle 0|\prod_{i=1}^{N}\Big(1 + a_i\, e^{\xi(t,p_i)-\xi(t,q_i)}\,\beta(p_i)\beta^*(q_i)\Big)|0\rangle \tag{9.45}$$
Expanding this formula with the Wick theorem, we get the tau-function as a sum of determinants:
$$\tau_N(t;g) = 1 + \sum_i W_{ii} + \sum_{i<j}\det[W_{(ij)}]_{2\times 2} + \sum_{i<j<k}\det[W_{(ijk)}]_{3\times 3} + \cdots$$
[…] $p > 0$, a partition of $p$ is defined as a collection of integers $n_j$, $j = 1,\ldots,M$ such that $p = \sum_{j=1}^{M} n_j$ and $1 \le n_1 \le n_2 \le \cdots \le n_M$. With such a partition one associates a diagram of boxes with $M$ lines, and $n_M$ boxes in the first line, $n_{M-1}$ boxes in the second line, up to $n_1$ boxes in the last line. This is called a Young diagram with $p$ boxes. With the set of numbers $(k_1,\ldots,k_M)$ with $1 \le k_1 < \cdots < k_M \le N$ (note that $k_i \ne k_j$ for $i \ne j$) we associate the Young diagram $Y = [n_j]$
with the $j$th line of length $n_j = k_j - j + 1$ (now $1 \le n_1 \le \cdots \le n_M \le N$). These Young diagrams have at most $M$ lines and $N$ columns.
[Fig. 9.1. A Young diagram $Y = [k_j - j + 1]$; the line lengths are $n_1 = k_1$, ..., $n_j = k_j - j + 1$, ..., $n_M = k_M - M + 1$.]
Consider variables $u_1, u_2, \ldots, u_k$, where $k$ is arbitrarily large. Symmetric polynomials of degree $d$ in these variables may be expressed on several bases. To construct one of them consider the elementary basic symmetric polynomials $h_j$ of degree $j$ (usually called the complete homogeneous symmetric polynomials) defined by:
$$\prod_{i=1}^{k}\frac{1}{1-u_i s} = \sum_{j=0}^{\infty} h_j(u)\, s^j$$
With any partition of $d$ into at most $k$ parts, namely $(\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k \ge 0)$ such that $d = \sum_i \lambda_i$, we associate the Young diagram $Y$ with at most $k$ lines, whose $i$th line has $\lambda_i$ boxes, and the symmetric polynomial of degree $d$:
$$h_Y = \prod_i h_{\lambda_i}$$
It is known that the $h_Y$ form an integral basis of the space of symmetric polynomials of degree $d$: any symmetric polynomial with integer coefficients can be expanded on this basis with integer coefficients. With the same Young diagram $Y$ one can associate another symmetric polynomial, known as the Schur polynomial:
$$S_Y(u) = \frac{\det\big(u_i^{\lambda_j + k - j}\big)}{\Delta(u)}, \qquad \Delta(u) = \prod_{i<j}(u_i - u_j) \tag{9.62}$$
where $\Delta(u) = \det\big(u_i^{k-j}\big)$ is the Vandermonde determinant, and $i, j = 1, \ldots, k$. Note that the numerator and the denominator are antisymmetric polynomials, and that the denominator divides the numerator so that $S_Y$ is a symmetric polynomial of degree $d$. It can also be shown that the Schur polynomials form an integral basis of symmetric polynomials of degree $d$ indexed by the
Young diagrams. Hence we can express $S_Y$ in terms of the $h_Y$, and this relation is given by the formula
$$S_Y = \det\big(h_{\lambda_i + j - i}\big)$$
We will prove this formula below using the boson–fermion correspondence, see eq. (9.65). We can also relate these definitions to the definition of “elementary Schur polynomials” introduced in eq. (9.39). We have
$$h_i(u) = P_i(t), \quad \text{for } t_n = \frac{1}{n}\sum_j u_j^n$$
This may be seen by writing:
$$\sum_i h_i(u)\, s^i = e^{-\sum_j \log(1-u_j s)} = e^{\sum_{i=1}^{\infty}\left(\frac{1}{i}\sum_j u_j^i\right)s^i} = \sum_i P_i(t)\, s^i$$
Any polynomial $f(t_1, t_2, \ldots)$ can be seen as a symmetric polynomial of the variables $u$ and conversely. This is because symmetric polynomials in $u_j$ can be expressed on the Newton sums $t_n$. In particular, we shall denote by $\chi_Y(t)$ the Schur polynomial $S_Y(u)$ expressed in terms of the variables $t_n$:
$$\chi_Y(t) = S_Y(u), \qquad t_n = \frac{1}{n}\sum_j u_j^n$$
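The relation $h_i(u) = P_i(t)$ with $t_n = \frac{1}{n}\sum_j u_j^n$ can be checked on sample values. The sketch below is only an illustration: the helper names are hypothetical, the $h_i$ are read off from the generating function $\prod_i (1-u_i s)^{-1}$, and the $P_k$ are obtained from the recursion $k P_k = \sum_{n=1}^{k} n\, t_n P_{k-n}$, which follows by differentiating $\exp(\sum_n t_n s^n)$ with respect to $s$.

```python
from fractions import Fraction


def complete_homogeneous(u, kmax):
    """h_0..h_kmax from prod_i 1/(1 - u_i s) = sum_j h_j s^j."""
    h = [Fraction(1)] + [Fraction(0)] * kmax
    for ui in u:
        # Multiply the truncated series by 1/(1 - ui*s):
        # new_j = old_j + ui * new_{j-1}, done in place for increasing j.
        for j in range(1, kmax + 1):
            h[j] += ui * h[j - 1]
    return h


def elementary_schur(t, kmax):
    """P_0..P_kmax from exp(sum_{n>=1} t_n s^n) = sum_k P_k s^k, with t[n-1] = t_n."""
    P = [Fraction(1)]
    for k in range(1, kmax + 1):
        # k * P_k = sum_{n=1}^{k} n * t_n * P_{k-n}
        P.append(sum(n * t[n - 1] * P[k - n] for n in range(1, k + 1)) / k)
    return P


def newton_times(u, kmax):
    """t_n = (1/n) * sum_j u_j^n for n = 1..kmax."""
    return [sum(ui ** n for ui in u) / n for n in range(1, kmax + 1)]


# Hypothetical sample values of the variables u_j.
u = [Fraction(2), Fraction(-1, 3), Fraction(5, 7)]
kmax = 6
h = complete_homogeneous(u, kmax)
P = elementary_schur(newton_times(u, kmax), kmax)
print(h == P)
```

Rational arithmetic keeps the comparison exact; with floats one would compare up to a tolerance instead.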
The next proposition computes the expectation value $\langle 0|e^{H(t)}|Y\rangle$, using the boson–fermion correspondence.
Proposition. Let $|Y\rangle = \beta_{-s_m}\cdots\beta_{-s_1}\cdot\beta^*_{-r_m}\cdots\beta^*_{-r_1}|0\rangle$ be a neutral state with $s_m > \cdots > s_1$ and $r_m > \cdots > r_1$, then
$$\langle 0|e^{H(t)}|Y\rangle = \det P_{n_a + a - b}(t) \tag{9.63}$$
with $n_a$ the number of boxes in the $a$th line, counted from the bottom, for a Young diagram $Y = [n_a]$ with $L = (r_m + \tfrac{1}{2})$ lines and $C = (s_m + \tfrac{1}{2})$ columns, enclosed in a hook¹ of width $m$. The first $m$ lines have length $(s_{m-k+1} + k - \tfrac{1}{2})$, and the first $m$ columns have length $(r_{m-k+1} + k - \tfrac{1}{2})$ for $k = 1, \ldots, m$. See Fig. 9.2.
Proof. Let us give the details of the proof of eq. (9.63) in the simplest case $m = 1$. Thus we consider the state $|Y\rangle = \beta_{-s}\beta^*_{-r}|0\rangle$. Using the definition eq. (9.11) for the charged vacuum $|-r-\tfrac{1}{2}\rangle$ we can write:
$$|Y\rangle = \beta_{-s}\,\beta_{\frac{1}{2}}\cdots\beta_{r-1}\,|-r-\tfrac{1}{2}\rangle$$
¹ This means that there is no box in the rectangle of size $(L-m)\times(C-m)$ in the lower right corner.
[Fig. 9.2. The Young diagram associated with the state with $m$ particles and holes.]
Expressing the fermionic operators as contour integrals, $\beta_k = \oint \frac{dz}{2i\pi}\, z^{k-\frac{1}{2}}\beta(z)$, we obtain
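$$\langle 0|e^{H(t)}|Y\rangle = \oint \prod_{a=1}^{r+\frac{1}{2}} \frac{dz_a}{2i\pi}\, z_a^{\,r-m_a-\frac{1}{2}}\ \langle 0|e^{H(t)}\beta(z_{r+1/2})\cdots\beta(z_1)|-r-\tfrac{1}{2}\rangle$$
with $(m_1, \ldots, m_{r+1/2}) = (1, 2, \ldots, r-\tfrac{1}{2}, r+s)$. Using the evolution equations eq. (9.43) and the fact that $H(t)$ annihilates $|-r-\tfrac{1}{2}\rangle$, we find:
$$\langle 0|e^{H(t)}|Y\rangle = \oint \prod_{a=1}^{r+\frac{1}{2}} \frac{dz_a}{2i\pi}\, z_a^{\,r-m_a-\frac{1}{2}}\, e^{\xi(t,z_a)}\ \langle 0|\beta(z_{r+1/2})\cdots\beta(z_1)|-r-\tfrac{1}{2}\rangle$$
The expectation value can be evaluated using the bosonization formulae eqs. (9.28, 9.30):
$$\langle 0|\beta(z_{r+1/2})\cdots\beta(z_1)|-r-\tfrac{1}{2}\rangle = \prod_{a<b}(z_a - z_b)\ \prod_a z_a^{-r-\frac{1}{2}}$$
[…]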
We show that $\Psi(t,z)$ is the Baker–Akhiezer function. For this, it is sufficient to prove that it is expressed by Sato’s formula, eq. (9.77), in terms of the tau-function.
Proposition. Let $\tau(t;g)$ be defined by eq. (9.70), and let $\Psi(t,z)$ be the function defined in eq. (9.72). We have:
$$\Psi(z,t) = \frac{\tau(t-[z^{-1}];g)}{\tau(t;g)}\, e^{\xi(z,t)} = \frac{\langle 1|e^{H(t)}\beta(z)\,g|0\rangle}{\langle 0|e^{H(t)}g|0\rangle} \tag{9.73}$$
The second expression is just a rewriting of the first in terms of fermions.
Proof. By definition, $\tau(t;g)\,\hat w(z,t)$ is the following determinant:
$$\tau(t;g)\,\hat w(z,t) = \det\begin{pmatrix} 1 & z^{-1} & z^{-2} & \cdots \\ F^g_{1,0} & \partial^{-1}F^g_{1,0} & \partial^{-2}F^g_{1,0} & \cdots \\ F^g_{2,0} & \partial^{-1}F^g_{2,0} & \partial^{-2}F^g_{2,0} & \cdots \\ \vdots & \vdots & \vdots & \cdots \end{pmatrix}$$
Subtracting $z^{-1}$ times the $j$th column from the $(j+1)$th column reduces the first line to $(1, 0, 0, \ldots)$. Expanding the determinant with respect to this first line gives:
$$\tau(t;g)\,\hat w(z,t) = \det\Big((1 - z^{-1}\partial)\,\partial^{-n}F^g_{m,0}\Big)_{n,m\ge 1}$$
On the other hand, since $e^{\xi(\zeta,\,t-[z^{-1}])} = (1-\zeta/z)\,e^{\xi(\zeta,t)}$ for $|\zeta| < |z|$, we have:
$$F^g_{m,n}(t-[z^{-1}]) = (1 - z^{-1}\partial)\,F^g_{m,n}(t)$$
hence $\tau(t;g)\,\hat w = \tau(t-[z^{-1}];g)$ which proves eq. (9.73).
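The shift identity $e^{\xi(\zeta,\,t-[z^{-1}])} = (1-\zeta/z)\,e^{\xi(\zeta,t)}$ used in this proof follows from $\sum_{n\ge 1}\frac{1}{n}(\zeta/z)^n = -\log(1-\zeta/z)$, and it can be checked numerically. The sketch below truncates the infinite set of times at a hypothetical cutoff `nmax`; the function names and the sample values are assumptions of the example.

```python
import cmath


def xi(t, zeta):
    """xi(zeta, t) = sum_{n>=1} t_n zeta^n, with t[n-1] = t_n."""
    return sum(tn * zeta ** n for n, tn in enumerate(t, start=1))


def shifted_times(t, z, nmax):
    """Times t - [z^{-1}], i.e. t_n -> t_n - z^{-n}/n, truncated at n = nmax."""
    padded = list(t) + [0.0] * (nmax - len(t))
    return [tn - z ** (-n) / n for n, tn in enumerate(padded, start=1)]


# Hypothetical sample data with |zeta| < |z|, so that the series converges.
t = [0.3, -0.2, 0.1]
zeta = 0.3 + 0.1j
z = 2.0 - 1.0j

lhs = cmath.exp(xi(shifted_times(t, z, nmax=60), zeta))
rhs = (1 - zeta / z) * cmath.exp(xi(t, zeta))
print(abs(lhs - rhs))
```

With $|\zeta/z| \approx 0.14$ the truncation error at 60 terms is far below double precision, so the printed difference is at the level of rounding noise.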
The pseudo-differential operator $\Phi$ has coefficients depending on the times of the KP hierarchy. One can express its time-evolution simply:
Proposition. We have:
$$\frac{\partial}{\partial t_n}\Phi = (\Phi\partial^n\Phi^{-1})_+\Phi - \Phi\partial^n = -(\Phi\partial^n\Phi^{-1})_-\Phi \tag{9.74}$$
where the subscript $+$ refers to the projection on the differential part of the operator, and the subscript $-$ refers to the projection on the negative powers of $\partial$.
Proof. We introduce differential operators $D_N = \partial^N + \cdots$ which are finite order approximations of $\Phi$:
$$\Phi = \lim_{N\to\infty} D_N\,\partial^{-N}$$
The operator $D_N$ is defined by truncating the determinant eq. (9.71) to a finite-dimensional determinant:
$$D_N\cdot f = \frac{1}{\tau_N(t;g)}\det\begin{pmatrix} \partial^N f & \partial^{N-1} f & \cdots & f \\ F^g_{1,0} & \partial^{-1}F^g_{1,0} & \cdots & \partial^{-N}F^g_{1,0} \\ \vdots & \vdots & & \vdots \\ F^g_{N,0} & \partial^{-1}F^g_{N,0} & \cdots & \partial^{-N}F^g_{N,0} \end{pmatrix}$$
with $\tau_N(t;g) = \det\big(\partial^{-n}F^g_{m,0}\big)_{n,m=1,\ldots,N}$. The time evolution of $D_N$ is easy to find:
$$\frac{\partial}{\partial t_n}D_N = (D_N\partial^n D_N^{-1})_+ D_N - D_N\partial^n$$
Since both sides of this equation are differential operators of order $(N-1)$, to prove this equality it is enough to check it on $N$ linearly independent functions. We choose them to be $F^g_{p,N} = \partial^{-N}F^g_{p,0}$, for $p = 1, \ldots, N$, which span the kernel of $D_N$. We have:
$$\frac{\partial D_N}{\partial t_n}\cdot F^g_{p,N} = -D_N\cdot\frac{\partial F^g_{p,N}}{\partial t_n} = \Big(-D_N\partial^n + (D_N\partial^n D_N^{-1})_+ D_N\Big)\cdot F^g_{p,N}$$
The first equality follows by applying $\frac{\partial}{\partial t_n}$ to $D_N F^g_{p,N} = 0$. To prove the second equality we substitute $\partial_{t_n}F^g_{p,N} = \partial^n F^g_{p,N}$ and we use again the fact that $F^g_{p,N}$ is in the kernel of $D_N$ to add the second term which vanishes on $F^g_{p,N}$. This term is chosen so that, combined with the first one, we get a differential operator of degree $N-1$. This proves the evolution equation for $D_N$. Taking the limit $N\to\infty$, we get the evolution equation of $\Phi$.
Note that eq. (9.74) has exactly the form of eq. (3.45) in Chapter 3, but the element of the loop group, $g^{(k)}(\lambda)$, is here replaced by the pseudo-differential operator $\Phi(\partial^{-1})$. This formulation of the KP hierarchy will be studied in detail in Chapter 10.
9.9 The Segal–Wilson approach
Up to now, we have used the description of the Grassmannian embedded into projective space by Plücker coordinates, using the fermionic language. This has the advantage of providing a well-defined computational framework by regularizing potential infinities using normal ordering. In this section we shall look at the Grassmannian as the space of suitable subspaces of a Hilbert space, selected by imposing appropriate functional constraints.
We start again from the space $V_\infty$ with basis $z^n$, $n\in\mathbb{Z}$, which can be seen as the space of functions on the unit circle $|z| = 1$. With a function on the circle we associate its Fourier expansion $f(z) = \sum_n a_n z^n$. The space $V_\infty$ is turned into a Hilbert space $H$ by introducing the $L^2$ norm $\int |f|^2$, or what amounts to the same, $\sum_n |a_n|^2$. We can decompose $H$ as a direct sum of subspaces $H^\pm$, where $H^+$ is generated by $z^n$, $n \ge 0$ and $H^-$ by $z^n$, $n < 0$. We consider the set of subspaces $W$ which are comparable to $H^+$ in the following sense:
$$W \in \mathrm{Gr} \quad\text{iff}\quad \begin{cases} \mathrm{pr}_+ : W \to H^+ \text{ is Fredholm} \\ \mathrm{pr}_- : W \to H^- \text{ is compact} \end{cases}$$
The fact that the projection $\mathrm{pr}_-$ is compact means that it is a norm limit of operators with finite-dimensional images. In our fermionic language these finite-dimensional images correspond to states with a finite number of particles. The fact that the projection $\mathrm{pr}_+$ is Fredholm means that the kernel of $\mathrm{pr}_+$ is finite-dimensional, and its image is closed and of finite codimension. In the fermionic language this means a finite number of holes.
Let us illustrate these conditions by an example: assume that $W$ is spanned by $e_{-2},\ e_1 + e_{-1},\ e_3,\ e_4, \ldots$. Then $\mathrm{pr}_+ W$ is spanned by $e_1, e_3, e_4, \ldots$ and is of codimension 2 in $H^+$ since $e_0$ and $e_2$ are missing. Its kernel is spanned by $e_{-2}$ and is of dimension 1. Similarly $\mathrm{pr}_- W$ is spanned by $e_{-1}, e_{-2}$ and is of finite dimension 2. Under the Plücker embedding $W$ goes to $e_{-2}\wedge(e_1 + e_{-1})\wedge e_3\wedge e_4\wedge\cdots$ which expands on two semi-infinite wedge products. These two terms have the property $\mathrm{Index}(\mathrm{pr}_+) = -1$ where the index of the Fredholm operator $\mathrm{pr}_+$ is defined by
$$\mathrm{Index}(\mathrm{pr}_+) \equiv \dim\mathrm{Ker}\,\mathrm{pr}_+ - \mathrm{codim}\,\mathrm{Im}\,\mathrm{pr}_+ = \text{no. of particles} - \text{no. of holes}$$
Notice that this is also the common fermionic charge of all the above states. In general we have $\mathrm{Ker}\,\mathrm{pr}_+ \subset \mathrm{Im}\,\mathrm{pr}_-$, but the first one is of finite dimension while the second one may become infinite-dimensional under the limiting procedure yielding a general compact operator.
We recall some properties of compact and Fredholm operators. If $u$ is compact and $v$ is continuous (i.e. bounded) then $uv$ and $vu$ are compact. The product of a Fredholm operator and a compact operator is compact. Finally, the sum of a Fredholm operator and a compact operator is Fredholm, and the product of two Fredholm operators is Fredholm. This allows us to consider the group $GL(\infty)$ of matrices having the following block structure on the decomposition $H = H^+ \oplus H^-$:
$$h = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad a, d \text{ Fredholm and } b, c \text{ compact}$$
The product of two such elements is of the same form. Moreover, a group element $h$ acts on the Grassmannian, moving $W$ to $hW$, where the projections for $hW$ are given, for $w \in W$, by:
$$\begin{pmatrix} \mathrm{pr}_+(hw) \\ \mathrm{pr}_-(hw) \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} \mathrm{pr}_+(w) \\ \mathrm{pr}_-(w) \end{pmatrix}$$
and have therefore the required properties.
A subgroup denoted by $\Gamma_+$ is of particular interest. An element of $\Gamma_+$ is given by the multiplication by an $L^2$ non-vanishing function $h$ on the unit circle, extending to a non-vanishing analytic function $h$ in the unit disc and normalized by $h(0) = 1$. In particular the expansion $h(z)$ has only positive powers of $z$. It can be represented in the block form as:
$$h \in \Gamma_+, \qquad h = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$$
where $a$ and $d$ are invertible, hence Fredholm, and $b$ is compact. Indeed, let us write $h(z) = 1 + a_1 z + a_2 z^2 + \cdots$ and consider its action by multiplication on $g(z) = \sum_k b_k z^k$. If $g$ has only positive powers, $hg$ has only positive powers, moreover, $1/h$ can be expanded on positive powers, so that $a$ is
invertible. It is continuous by the Schwarz inequality and so is Fredholm. If $g$ has only negative powers of $z$, we remark that
$$h(z)\,z^{-n} = z^{-n} + a_1 z^{-n+1} + \cdots + a_{n-1}z^{-1} + \sum_{j\ge 0} a_{n+j}\,z^j$$
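The block structure of multiplication by $h(z) = 1 + a_1 z + a_2 z^2 + \cdots$ can be made concrete on a truncated basis. The sketch below is an illustration with a hypothetical polynomial $h$ and a hypothetical cutoff `K`: the entry in row $z^{\text{row}}$ and column $z^{\text{col}}$ is the coefficient $a_{\text{row}-\text{col}}$, so the $H^+ \to H^-$ block vanishes, the $H^- \to H^-$ block is triangular with 1 on the diagonal, and the $H^- \to H^+$ block contains the coefficients $a_{n+j}$ of the formula above.

```python
def coeff(a, row, col):
    """Coefficient of z^row in h(z) * z^col, for h(z) = sum_m a[m] z^m."""
    m = row - col
    return a[m] if 0 <= m < len(a) else 0


# Hypothetical sample: h(z) = 1 + 5z - 2z^2 + 7z^3, with a[0] = 1 as required.
a = [1, 5, -2, 7]
K = 4                          # keep the powers z^{-K}, ..., z^{K-1}
neg = range(-1, -K - 1, -1)    # exponents -1, -2, ..., -K spanning H^-
pos = range(0, K)              # exponents 0, 1, ..., K-1 spanning H^+

d_block = [[coeff(a, r, c) for c in neg] for r in neg]   # H^- -> H^-
b_block = [[coeff(a, r, c) for c in neg] for r in pos]   # H^- -> H^+
c_block = [[coeff(a, r, c) for c in pos] for r in neg]   # H^+ -> H^-

for row in d_block:
    print(row)
```

Because this sample $h$ is a polynomial, the $b$ block has only finitely many non-zero entries, which is the finite-rank situation used in the compactness argument.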
We see that the $H^-$ part induces a triangular system with 1 on the diagonal, hence is invertible, so that $d$ is Fredholm. To show that $b$ is compact, consider the truncation $h_N = \sum_{j=0}^{N} a_j z^j$ which is such that $\dim\{\mathrm{pr}_+(h_N g)\,|\,g\in H^-\} \le N$, i.e. the corresponding $b_N$ is of finite rank. We have $\|h - h_N\| \to 0$ when $N\to\infty$ because $\|(h-h_N)g\|^2 \le \|g\|^2\sum_{j=N}^{\infty}|a_j|^2$ by the Schwarz inequality.
In the following we restrict ourselves to spaces $W$ which are transversal to $H^-$, i.e. such that $\mathrm{pr}_+ : W \to H^+$ is an isomorphism. Such $W$ have charge 0, and are “small” deformations of the vacuum $|0\rangle$. In this case one can define an operator $A : H^+ \to H^-$ as follows. With any $f\in H^+$ one can associate a unique element $w\in W$ such that $f = \mathrm{pr}_+(w)$. We denote $w = \mathrm{pr}_+^{-1}(f)$ and form
$$f\in H^+ \ \to\ A(f) = \mathrm{pr}_-(w), \qquad w = \mathrm{pr}_+^{-1}(f)$$
The operator $A$ is compact, and we have $(\mathrm{pr}_+|_W)^{-1}(f) = w = f + Af$ with $Af\in H^-$.
We now want to understand the tau-function in this context. Let us fix an element $W$ of the Grassmannian and let $h\in\Gamma_+$ be given by $h = \exp(\sum_{i>0} t_i z^i)$. This introduces the times of the KP hierarchy. We denote
$$h^{-1} = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$$
Definition. Assuming that $W$ is transverse to $H^-$ so that $(\mathrm{pr}_+|_W)^{-1}$ exists, we set:
$$\tau_W(h) = \det\big(h\,\mathrm{pr}_+\,h^{-1}(\mathrm{pr}_+|_W)^{-1}\big) = \det(1 + a^{-1}bA) \tag{9.75}$$
To understand the second formula, consider $f\in H^+$. Then we have $\mathrm{pr}_+\,h^{-1}(\mathrm{pr}_+|_W)^{-1}(f) = af + bAf \in H^+$ and $h\,\mathrm{pr}_+\,h^{-1}(\mathrm{pr}_+|_W)^{-1}(f) = (1 + a^{-1}bA)f$. We have here an operator from $H^+$ to $H^+$ of the form 1 + compact and we want to take its determinant. This is possible if the
operator $a^{-1}bA$ is of trace class, which is ensured if $h$ is sufficiently regular on the unit circle. Note that the operator $a : H^+ \to H^+$ is triangular with 1 on the diagonal, hence one can set $\det a = 1$. If it were allowed to write $\det MN = \det M\cdot\det N$ for infinite-dimensional matrices, we could content ourselves with defining $\tau_W = \det(a + bA)$. The definition eq. (9.75) performs a regularization of this too naive expression.
It is important to show that this definition agrees with the previous construction in eq. (9.34). Recall the expression eq. (9.60) of the Plücker embedding of some element $W$ of the Grassmannian. The two definitions agree if we identify
$$A^g_s(z) = (\mathrm{pr}_+|_W)^{-1}(z^s)$$
Hence $A^g_0(z)\wedge A^g_1(z)\wedge\cdots$ represents the Plücker embedding of $W$. By definition the action of $h^{-1}$ on $W$ is $A^g_s(z) \to e^{\xi(t,z)}A^g_s(z)$, which under the Plücker embedding is represented by multiplication by $e^{H(t)}$ due to eq. (9.43). Next the projection $\mathrm{pr}_+$ becomes $|0\rangle\langle 0|$ and the multiplication by $h$ is achieved by multiplying by $e^{-H(t)}$, so that:
$$h\,\mathrm{pr}_+\,h^{-1}(\mathrm{pr}_+|_W)^{-1}(H^+)\ \xrightarrow{\ \text{Plücker}\ }\ e^{-H(t)}|0\rangle\langle 0|e^{H(t)}\,A^g_0(z)\wedge A^g_1(z)\wedge\cdots$$
Taking the scalar product with $\langle 0|$ produces the determinant of the operator $h\,\mathrm{pr}_+\,h^{-1}(\mathrm{pr}_+|_W)^{-1}$. Since $\langle 0|e^{-H(t)}|0\rangle = 1$ due to normal ordering of $H$, we reproduce exactly eq. (9.61).
When $M$, $N$ are of the form $1+m$, $1+n$ with $m$ and $n$ of trace class, one is allowed to write $\det MN = \det M\det N$. For $h\in\Gamma_+$ one can show that $a^{-1}bA$ is of trace class. In particular one obtains:
$$\tau_W(h_1h_2) = \tau_W(h_1)\,\tau_{h_1^{-1}W}(h_2) \tag{9.76}$$
Indeed,
$$\begin{aligned} \tau_W(h_1)\,\tau_{h_1^{-1}W}(h_2) &= \det\big(h_2\,\mathrm{pr}_+\,h_2^{-1}(\mathrm{pr}_+|_{h_1^{-1}W})^{-1}\big)\cdot\det\big(\mathrm{pr}_+\,h_1^{-1}(\mathrm{pr}_+|_W)^{-1}h_1\big) \\ &= \det\big(h_1h_2\,\mathrm{pr}_+\,h_2^{-1}(\mathrm{pr}_+|_{h_1^{-1}W})^{-1}\,\mathrm{pr}_+\,h_1^{-1}(\mathrm{pr}_+|_W)^{-1}\big) \\ &= \tau_W(h_1h_2) \end{aligned}$$
This is because if $f\in H^+$ then $w = (\mathrm{pr}_+|_W)^{-1}f\in W$ so that $h_1^{-1}w\in h_1^{-1}W$. Under $\mathrm{pr}_+$ this gives some $g\in H^+$ which under $(\mathrm{pr}_+|_{h_1^{-1}W})^{-1}$ reproduces $h_1^{-1}w = h_1^{-1}(\mathrm{pr}_+|_W)^{-1}f$.
To define Baker–Akhiezer functions, we assume that $h\in\Gamma_+$ and is such that $h^{-1}W$ is transverse to $H^-$.
Definition. Let $h(z;t) = \exp(\sum_{i>0} t_i z^i) = e^{\xi(z,t)}\in\Gamma_+$. The Baker–Akhiezer function $\Psi_W(h,z)$ is the unique function such that $h^{-1}\Psi_W(h,z)$ is the inverse image under $\mathrm{pr}_+|_{h^{-1}W}$ of the constant function $1\in H^+$.
This means that $h^{-1}\Psi_W(h,z)\in h^{-1}W$ and can be written as $1 + \sum_{i=1}^{\infty} a_i(h)\,z^{-i}$. That is to say, the Baker–Akhiezer function is the unique function $\Psi_W(h,z)\in W$ having the form:
$$\Psi_W(h,z) = h(z;t)\Big(1 + \sum_{i=1}^{\infty} a_i(h)\,z^{-i}\Big)$$
We see that $\Psi_W$ has an essential singularity at $z = \infty$. This is one of the essential features of the Baker–Akhiezer functions. The Baker–Akhiezer function $\Psi_W$ becomes a function of $t$ and $z$ and can be explicitly expressed in terms of the tau-function by means of Sato’s formula:
Proposition. We have
$$\Psi_W(t,z) = \frac{\tau_W(t-[z^{-1}])}{\tau_W(t)}\,e^{\xi(t,z)} \tag{9.77}$$
where the notation $t-[z^{-1}]$ refers to the substitution of $t_n$ by $t_n - z^{-n}/n$.
Proof. To avoid notational conflicts, in this proof we denote by $\zeta$ the current variable on the circle (previously denoted by $z$). The group element $h(\zeta;t-[z^{-1}])$ is given by $\exp\big(\sum_{n>0}(t_n - z^{-n}/n)\zeta^n\big) = h(\zeta;t)\,q_z$ where $q_z = 1-\zeta/z\in\Gamma_+$. Equation (9.77) is equivalent to
$$e^{-\xi(t,z)}\Psi_W(t,z) = \frac{\tau_W(hq_z)}{\tau_W(h)}$$
Since $\tau_W(hq_z) = \tau_W(h)\,\tau_{h^{-1}W}(q_z)$, we have to show that $e^{-\xi(z,t)}\Psi_W(t,z) = \tau_{h^{-1}W}(q_z)$. By definition, $e^{-\xi(z,t)}\Psi_W(t,z) = (\mathrm{pr}_+|_{h^{-1}W})^{-1}(1)$. Replacing $h^{-1}W$ by $W$, we have to show the equality of the two functions of $z$: $(\mathrm{pr}_+|_W)^{-1}(1) = \tau_W(q_z) = \det(1 + a^{-1}bA)$, where the operators $a$ and $b$ are the ones appearing in the block representation of $q_z^{-1}$ and $A$ is the operator induced by $W$. Since $q_z^{-1}(\zeta) = \sum_{n\ge 0}\zeta^n/z^n$, we have:
$$b(\zeta^{-n}) = \mathrm{pr}_+\Big(\frac{\zeta^{-n}}{1-\zeta/z}\Big) = z^{-n}\,q_z^{-1}(\zeta)$$
while the action of $a^{-1}$ on this element of $H^+$ is simply represented by the multiplication by $q_z(\zeta)$. Hence for any element $g\in H^-$ we have $a^{-1}bg = g(z)\cdot 1\in H^+$, i.e. the operator is of rank 1, so that $\det(1 + a^{-1}bA) = 1 + \mathrm{Tr}(a^{-1}bA)$. Since the image is spanned by the function 1, it is enough
to compute the action of $a^{-1}bA$ on 1 to get the trace. But by definition $A$ sends the basis element 1 on the negative power part of $(\mathrm{pr}_+|_W)^{-1}(1)$ denoted by $f(\zeta)$, so that $1 + \mathrm{Tr}(a^{-1}bA) = 1 + f(z)$.
Let us explain how the algebro-geometric solutions of KP fit into this setting (see Chapter 10 for a description of these solutions). Let $\Gamma$ be a compact Riemann surface of genus $g$ and $L$ be a line bundle on $\Gamma$, see Chapter 15. Fix a puncture $x_\infty$ on $\Gamma$ and let $z^{-1}$ be a local parameter around $x_\infty$. Let $D_\infty$ be a small disc around $x_\infty$. Sections of $L$ locally appear as functions on $D_\infty$. Consider also the open set $\Gamma_0$ which is the complement of the disc $D_\infty$. The two open sets $D_\infty$ and $\Gamma_0$ cover $\Gamma$. With this set of data one associates an element $W$ of the Grassmannian such that $w\in W$ if $w$ is the boundary value on the circle of a holomorphic section of the restriction of $L$ on $\Gamma_0$. One can show that $\mathrm{pr}_-$ is compact. To show that $\mathrm{pr}_+$ is Fredholm one first shows that $\mathrm{Ker}(\mathrm{pr}_+ : W\to H^+) = H^0(\Gamma, L_\infty)$, where $L_\infty = L - [x_\infty]$ is the difference of the line bundle $L$ and the point bundle at $x_\infty$, i.e. a section of $L_\infty$ arises from a section of $L$ vanishing at $x_\infty$. Recall that $H^0(\Gamma, L_\infty)$ is the set of global holomorphic sections of $L_\infty$. A function belongs to $\mathrm{Ker}(\mathrm{pr}_+ : W\to H^+)$ if and only if it has only strictly negative powers of $z$, hence extends to the interior of $D_\infty$ and vanishes at $x_\infty$. By definition it extends to $\Gamma_0$ so providing a global section of $L_\infty$.
Next we show that $H^+/\mathrm{pr}_+W = H^1(\Gamma, L_\infty)$. It is a well-known fact that the non-compact Riemann surface $\Gamma_0$ has no sheaf cohomology, $H^1(\Gamma_0, L_\infty) = 0$, and it is obviously the same for $D_\infty$. Hence any non-vanishing element in $H^1(\Gamma, L_\infty)$ comes from some analytic function $\varphi_{0\infty}$ on the annulus $\Gamma_0\cap D_\infty$ which cannot be written as $\varphi_0 - \varphi_\infty$, where $\varphi_0$ extends to an analytic section of $L$ on $\Gamma_0$ and $\varphi_\infty$ is an analytic function on $D_\infty$ vanishing at $x_\infty$. But the part of $\varphi_{0\infty}$ with strictly negative powers of $z$ extends uniquely to a function $\varphi_\infty$ vanishing at $x_\infty$. So $H^1(\Gamma, L_\infty)$ is isomorphic to the set of analytic functions on $S^1$ with only non-negative powers of $z$ modulo those which extend to sections of $L$. This is precisely the definition of $H^+/\mathrm{pr}_+W$. Thus we arrived at the conclusion that the Fredholm index of $\mathrm{pr}_+$ is given by the Riemann–Roch theorem:
$$\mathrm{Index}(W) = \dim H^0(L_\infty) - \dim H^1(L_\infty) = 1 - g + c(L_\infty) = c(L) - g$$
where $c(L)$ is the Chern class of $L$ and $g$ is the genus of the Riemann surface. Recall that $c(L_\infty) = c(L) - 1$ (see Chapter 15). In particular, in the interesting case where $W$ is transverse to $H^-$ the index of $W$ vanishes, which needs $c(L) = g$.
In practice one takes a puncture $x_\infty$ and a set of $g$ points in generic position and one considers meromorphic functions with poles at these $g$ points and an essential singularity at $x_\infty$. These data define a unique $W$ transverse to $H^-$ in the Grassmannian. Elements of $W$ are boundary values on the circle of meromorphic functions on $\Gamma_0$ with poles only at these $g$ points. The Baker–Akhiezer function specified by $W$ has an essential singularity at $x_\infty$ and $g$ poles. It identifies with those defined in Chapter 5.
10 The KP hierarchy
In the previous chapter we showed that the equations of the KP hierarchy can be written as:
$$\partial_{t_n}\Phi = -(\Phi\partial^n\Phi^{-1})_-\Phi$$
where $\Phi$ is a pseudo-differential operator. This is identical to the standard form of the equations of an integrable hierarchy, but we are dealing here with the algebra of pseudo-differential operators, instead of a loop algebra. In this chapter, we explain this setting and investigate the corresponding hierarchy. We show that the general solution can be expressed with the Grassmannian tau-function. With any Riemann surface one can associate particular finite-zone solutions. More generally, we construct solutions corresponding to slow modulations of the algebro-geometric solutions following the Whitham procedure. We also present the reduction of this hierarchy to the generalized KdV equations, and discuss their Poisson structures. We show that these Poisson structures can be obtained by Hamiltonian reduction from the Kostant–Kirillov bracket on a Kac–Moody algebra.
10.1 The algebra of pseudo-differential operators
We briefly present the theory of pseudo-differential operators first introduced in this context by Gelfand and Dickey.
The algebra of differential operators is the algebra generated by $\mathbb{C}$-valued functions of one variable $x$ and the derivation symbol $\partial$, with the usual Leibniz rule, $\partial\cdot a = a\cdot\partial + (\partial a)$, where $(\partial a)$ means the derivative $(\partial_x a)(x)$ of the function $a(x)$. This defines the multiplication law between the symbol $\partial$ and the functions. An element in this algebra is a finite sum $A = \sum_{i=0}^{N} a_i\partial^i$, with $N$ finite but arbitrary. The coefficients $a_i$ are
functions of $x$. To define the algebra of pseudo-differential operators, we extend the algebra of differential operators by introducing the “integration” symbol, $\partial^{-1}$ and its powers, with the following algebraic rules:
$$\partial^{-1}\partial = \partial\,\partial^{-1} = 1, \qquad \partial^{-1}a = \sum_{i=0}^{\infty}(-1)^i(\partial^i a)\,\partial^{-i-1} \tag{10.1}$$
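The second rule can be tested on exponentials. If one lets $\partial^{-1}$ act on $e^{zx}$ by $\partial^{-i-1}e^{zx} = z^{-i-1}e^{zx}$ (an assumption of this sketch), the rule gives $\partial^{-1}(a\,e^{zx}) = G(x)\,e^{zx}$ with $G = \sum_i (-1)^i(\partial^i a)\,z^{-i-1}$, and the sum is finite when $a$ is a polynomial. Applying $\partial$ must return $a\,e^{zx}$, i.e. $G' + zG = a$. The code below checks this with polynomials stored as coefficient lists; all helper names are hypothetical.

```python
from fractions import Fraction


def deriv(poly):
    """Derivative of a polynomial given by its coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(poly)][1:]


def add(p, q):
    """Sum of two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [x + y for x, y in zip(p, q)]


def scale(c, p):
    """Product of a polynomial by a constant."""
    return [c * x for x in p]


def integrate_symbol(a, z):
    """Polynomial G with d^{-1}(a e^{zx}) = G e^{zx}, built from the rule
    d^{-1} a = sum_i (-1)^i (d^i a) d^{-i-1}."""
    G, term, i = [], list(a), 0
    while term:  # the derivatives of a polynomial eventually vanish
        G = add(G, scale((-1) ** i * z ** (-i - 1), term))
        term, i = deriv(term), i + 1
    return G


# Hypothetical sample: a(x) = 1 - 2x + 3x^3 and z = 2.
a = [Fraction(1), Fraction(-2), Fraction(0), Fraction(3)]
z = Fraction(2)
G = integrate_symbol(a, z)
print(add(deriv(G), scale(z, G)) == a)
```

The check works because the two sums in $G' + zG$ telescope, leaving only the $i = 0$ term $a$.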
The algebra of pseudo-differential operators consists of elements which are semi-infinite sums of the form $A = \sum_{i=-\infty}^{N} a_i\partial^i$. Equations (10.1) define the multiplication by $\partial^{-1}$ in the pseudo-differential algebra since they allow us to push all $\partial^{-1}$ symbols to the right. Here we don’t have to deal with the convergence questions involved in this reshuffling, since only a finite number of terms appear at each order in $\partial^{-i}$. This rule is motivated by the integration by parts formula. Symbolically we have:
$$\partial^{-1}(a\cdot u) = \int dx\; a u = \int dx\; a\,\partial(\partial^{-1}u) = a\,\partial^{-1}u - \int dx\,(\partial a)\,\partial^{-1}u = a\,\partial^{-1}u - (\partial a)\,\partial^{-2}u + \int dx\,(\partial^2 a)\,\partial^{-2}u$$
and so on. The algebra of pseudo-differential operators is an associative algebra with a unit. It possesses a natural anti-homomorphism, that is $(AB)^* = B^*A^*$, which we call the formal adjoint, defined by:
$$(a\,\partial^i)^* \equiv (-\partial)^i a \tag{10.2}$$
for any function $a$. We summarize these facts in the following definitions:
Let $\mathcal{P} = \{A = \sum_{-\infty}^{N} a_i\partial^i\}$ be the set of formal pseudo-differential operators in one variable. We denote by $\mathcal{P}_+ = \{A = \sum_{i=0}^{N} a_i\partial^i\}$ the subalgebra of differential operators, and by $\mathcal{P}_- = \{A = \sum_{-\infty}^{-1} a_i\partial^i\}$ the subalgebra of integral operators. We have the direct sum decomposition of $\mathcal{P}$ as a vector space:
$$\mathcal{P} = \mathcal{P}_+ \oplus \mathcal{P}_-$$
Notice that $\mathcal{P}$ is naturally a Lie algebra. $\mathcal{P}_+$ and $\mathcal{P}_-$ are Lie subalgebras, but $\mathcal{P}_+$ and $\mathcal{P}_-$ do not commute. For $A\in\mathcal{P}$, we define its residue, denoted by $\mathrm{Res}_\partial A$, as the coefficient of $\partial^{-1}$ in $A$,
$$\mathrm{Res}_\partial A \equiv a_{-1}(x) \tag{10.3}$$
On $\mathcal{P}$ there exists a natural linear form:
Proposition. The algebra $\mathcal{P}$ is equipped with a linear form, denoted by $\langle\ \rangle$, called the Adler trace. It is defined for any element $A = \sum_{-\infty}^{N} a_i\partial^i$ by:
$$\langle A\rangle = \int dx\;\mathrm{Res}_\partial A = \int dx\; a_{-1}(x) \tag{10.4}$$
This linear form satisfies the fundamental trace property $\langle AB\rangle = \langle BA\rangle$, hence defines an ad-invariant non-degenerate scalar product on $\mathcal{P}$ by: $(A,B) = \langle AB\rangle$. Using this bilinear form we have the duality: $\mathcal{P}_+^* = \mathcal{P}_-$.
Proof. There is a unique way of writing a pseudo-differential operator in the form $A = \sum_{-\infty}^{N} a_i\partial^i$, i.e. with all $\partial^i$ on the right. Let us prove the trace property. It is sufficient to verify it on operators of the form $A = a\,\partial^k$ and $B = b\,\partial^j$. If $k$ and $j$ are both positive or both strictly negative, we clearly have $\langle AB\rangle = \langle BA\rangle = 0$. Thus we take $A = a\,\partial^k$ and $B = b\,\partial^{-j-1}$ with $k, j\ge 0$. Using the relation:
$$\partial^{-j-1}a = \sum_{v=0}^{\infty}(-1)^v\binom{j+v}{v}(\partial^v a)\,\partial^{-j-1-v} \tag{10.5}$$
which is shown by induction, starting from the definition relation eq. (10.1), and the identity between binomial coefficients
$$\binom{j+1+\mu}{\mu} = \sum_{\nu=0}^{\mu}\binom{j+\nu}{\nu}$$
we get:
$$\langle BA\rangle = (-1)^{k-j}\binom{k}{j}\int dx\; b\,\partial^{k-j}a = \binom{k}{j}\int dx\; a\,\partial^{k-j}b$$
Similarly, using the Leibniz rule, $AB = \sum_{v=0}^{\infty}\binom{k}{v}(a\,\partial^v b)\,\partial^{k-j-v-1}$, we find that $\langle AB\rangle$ is given by:
$$\langle AB\rangle = \binom{k}{k-j}\int dx\; a\,\partial^{k-j}b$$
Clearly, $\langle AB\rangle$ and $\langle BA\rangle$ coincide.
The invariance of the scalar product means $(A,[B,C]) = ([A,B],C)$. It follows from the trace property. We already noticed that $\mathcal{P}_+$ and $\mathcal{P}_-$ are isotropic with respect to the trace, $(\mathcal{P}_\pm,\mathcal{P}_\pm) = 0$. To check that $\mathcal{P}_+^* = \mathcal{P}_-$, we consider in $\mathcal{P}_+$ elements
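$a_i\partial^i$, $i = 0, 1, \ldots, \infty$. They are paired with elements $\partial^{-i-1}b_i$ in $\mathcal{P}_-$ since:
$$\big(\partial^{-i-1}b_i,\ a_j\partial^j\big) = \delta_{ij}\int dx\,(a_i b_i)$$
Choosing the coefficients $a_i$ and $b_i$ in an orthonormal basis under $\int dx\; ab$, we get dual bases of $\mathcal{P}_+$ and $\mathcal{P}_-$.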
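The identity between binomial coefficients used in the proof of the trace property, $\binom{j+1+\mu}{\mu} = \sum_{\nu=0}^{\mu}\binom{j+\nu}{\nu}$, is easy to confirm on a range of values. The snippet below is a plain numerical check, not a proof, and the function name is hypothetical.

```python
from math import comb


def binomial_identity_holds(j, mu):
    """Check C(j + 1 + mu, mu) == sum_{nu=0}^{mu} C(j + nu, nu)."""
    return comb(j + 1 + mu, mu) == sum(comb(j + nu, nu) for nu in range(mu + 1))


# Scan a small grid of non-negative integers j and mu.
print(all(binomial_identity_holds(j, mu) for j in range(12) for mu in range(12)))
```

The identity is what makes the induction step work when $\partial^{-1}$ is applied once more to the expansion of $\partial^{-j-1}a$.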
10.2 The KP hierarchy
We introduce the KP flows by applying the Adler–Kostant–Symes construction to the Lie algebra of pseudo-differential operators. Consider the formal group $G = \exp(\mathcal{P}_-)$, called the Volterra group. We have $G\sim 1 + \mathcal{P}_-$ because powers of elements in $\mathcal{P}_-$ are in $\mathcal{P}_-$. Let $\Phi$ be an element of $G$:
$$\Phi = 1 + \sum_{i=1}^{\infty} w_i\,\partial^{-i} \in (1 + \mathcal{P}_-) \tag{10.6}$$
The element $\Phi$ has an inverse because, writing $\Phi^{-1} = 1 + \sum_{i=1}^{\infty} w'_i\,\partial^{-i}$ and demanding $\Phi^{-1}\Phi = 1$, one recursively computes the coefficients $w'_i$. One finds that they are of the form $w'_i = -w_i + p_i(w_1,\ldots,w_{i-1})$, where $p_i$ is a polynomial in its arguments and their derivatives (up to order $i-2$). For example:
$$w'_1 = -w_1, \qquad w'_2 = -w_2 + w_1^2, \qquad w'_3 = -w_3 - w_1(\partial w_1) + 2w_1w_2 - w_1^3$$
The left and right inverses are identical.
In the Adler–Kostant–Symes scheme we consider the decomposition of the Lie algebra $\mathcal{P} = \mathcal{P}_+ + \mathcal{P}_-$ and introduce a second Lie algebra structure on the underlying vector space, called $\mathcal{P}_R$. In $\mathcal{P}_R$, $\mathcal{P}_+$ and $\mathcal{P}_-$ commute, see eq. (4.13) in Chapter 4. The Lie algebra $\mathcal{P}_R$ acts on the Volterra group by
$$\delta_A\Phi = (\Phi A\Phi^{-1})_+\Phi - \Phi A_+ = -(\Phi A\Phi^{-1})_-\Phi + \Phi A_-$$
for any $A = A_+ + A_-\in\mathcal{P}_R$. We have $[\delta_A,\delta_B] = \delta_{[A,B]_R}$. See eq. (14.35) in Chapter 14. The signs are slightly different from those in that chapter because we use here the decomposition $A = A_+ + A_-$ instead of $A = A_+ - A_-$ in order to conform ourselves with the usual conventions in KP theory.
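The recursion for the coefficients of $\Phi^{-1}$ can be illustrated in the special case where the $w_i$ do not depend on $x$. This restriction is an assumption of the sketch: all derivative terms such as $w_1(\partial w_1)$ then drop out, $\partial^{-1}$ commutes with the coefficients, and inverting $\Phi$ reduces to inverting a power series in $\partial^{-1}$. The function name is hypothetical.

```python
from fractions import Fraction


def volterra_inverse_constant(w, order):
    """Coefficients w'_1..w'_order of (1 + sum_i w_i d^{-i})^{-1}.

    ASSUMPTION: the w_i are constants, so no derivative terms appear.
    """
    phi = [Fraction(1)] + [Fraction(x) for x in w] + [Fraction(0)] * order
    inv = [Fraction(1)]
    for i in range(1, order + 1):
        # The coefficient of d^{-i} in Phi^{-1} Phi must vanish.
        inv.append(-sum(inv[k] * phi[i - k] for k in range(i)))
    return inv[1:]


# Hypothetical constant sample values of w_1, w_2, w_3.
w1, w2, w3 = Fraction(2), Fraction(-3), Fraction(5)
inverse = volterra_inverse_constant([w1, w2, w3], 3)
expected = [-w1, -w2 + w1 ** 2, -w3 + 2 * w1 * w2 - w1 ** 3]
print(inverse == expected)
```

For $x$-dependent coefficients the same recursion applies, but each product must be reordered with rule (10.1), which is where the derivative terms come from.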
If $A\in\mathcal{P}_+$, the formula simplifies to $\delta_A\Phi = -(\Phi A\Phi^{-1})_-\Phi$. The KP flows are defined by taking $A$ in the Abelian subalgebra spanned by the $\partial^k$ for $k > 0$.
Definition. For any $\Phi\in(1 + \mathcal{P}_-)$, define the $k$th KP flow by:
$$\partial_{t_k}\Phi = -\big(\Phi\cdot\partial^k\cdot\Phi^{-1}\big)_-\Phi \tag{10.7}$$
These flows coincide with eq. (9.74) in Chapter 9. By construction they commute, but it is instructive to check this essential commutativity property directly.
Proposition. The KP flows $\partial_{t_k}$ all commute.
Proof. Consider the pseudo-differential operators $\Theta_k = \Phi\,\partial^k\Phi^{-1}$, which satisfy $\partial_{t_k}\Theta_l = -[\Theta_{k-},\Theta_l]$. One has:
$$\partial_{t_k}\partial_{t_l}\Phi - \partial_{t_l}\partial_{t_k}\Phi = \Big([\Theta_{k-},\Theta_l] - [\Theta_{l-},\Theta_k] + [\Theta_{l-},\Theta_{k-}]\Big)_-\Phi$$
Replace $\Theta_k = \Theta_{k-} + \Theta_{k+}$ and similarly for $\Theta_l$. One gets:
$$[\partial_{t_k},\partial_{t_l}]\Phi = \Big([\Theta_{k-},\Theta_{l+}]_- + [\Theta_{k+},\Theta_{l-}]_- + [\Theta_{k-},\Theta_{l-}]_-\Big)\Phi = [\Theta_k,\Theta_l]_-\,\Phi$$
The result follows because $[\Theta_k,\Theta_l] = 0$.
From $\Phi$, we construct the pseudo-differential operator:
$$Q = \Phi\cdot\partial\cdot\Phi^{-1}, \qquad Q = \partial + \sum_{i=1}^{\infty} q_{-i}\,\partial^{-i}$$
It is easy to check that there is no $\partial^0$ term. Given $Q$ one can reconstruct $\Phi$ up to some constants:
Proposition. The pseudo-differential operator $\Phi$ is determined by $Q$ up to the transformation $\Phi\to\Phi C$, where $C = 1 + \sum_{i=1}^{\infty} c_i\,\partial^{-i}$ and the $c_i$ are constants independent of $x$.
Proof. Obviously $\Phi\to\Phi C$, with $C$ independent of $x$, leaves $Q$ invariant. Using the expression of $\Phi^{-1}$, we find immediately $q_{-i} = -(\partial w_i) + h_i(w_1,\ldots,w_{i-1})$, where $h_i$ is a differential polynomial in its arguments. The derivatives are at most of order $(i-1)$. Conversely, $w_i$ is recursively determined by $q_{-i}$ as $w_i = \int^x (h_i - q_{-i})\,dx$, up to an integration constant. These constants can be absorbed in $C$.
On the pseudo-differential operator $Q$ the KP evolution equations take the Lax form:
Proposition. The time evolutions of $Q$ are given by:
$$\partial_{t_k}Q = [B_k, Q] \quad\text{with}\quad B_k = \big(Q^k\big)_+ \tag{10.8}$$
Moreover, the differential operators $B_k$ satisfy the zero curvature condition:
$$\partial_{t_k}B_l - \partial_{t_l}B_k - [B_k,B_l] = 0$$
Proof. We have to compute $\partial_{t_k}Q$. Using the definition eq. (10.7) written as $\partial_{t_k}\Phi = -(Q^k)_-\Phi$, we find:
$$\partial_{t_k}Q = \partial_{t_k}(\Phi\,\partial\,\Phi^{-1}) = -\big[(Q^k)_-,\,Q\big] = \big[(Q^k)_+,\,Q\big]$$
The zero-curvature condition follows from a direct computation:
$$\partial_{t_k}B_l - \partial_{t_l}B_k = \big(\partial_{t_k}Q^l\big)_+ - \big(\partial_{t_l}Q^k\big)_+ = [B_k,Q^l]_+ - [B_l,Q^k]_+$$
Therefore, using the decomposition $Q^k = (Q^k)_+ + (Q^k)_-$, we obtain:
$$\partial_{t_k}B_l - \partial_{t_l}B_k - [B_k,B_l] = \Big([B_k,Q^l] - [B_l,Q^k] - [B_k,B_l]\Big)_+ = -\big[B_k - Q^k,\ B_l - Q^l\big]_+ = -\big[(Q^k)_-,\,(Q^l)_-\big]_+ = 0$$
The Lax equation eq. (10.8) shows that $Q$ is the analogue of the Lax matrix $L$, and the differential operator $B_k$ is the analogue of $M_k$. The pseudo-differential operator $\Phi$ is the analogue of $g(\lambda)$ in the loop algebra situation of Chapter 3. In particular the evolution equation eq. (10.7) is analogous to eq. (3.50) for $g$. We will keep, however, the traditional notations in this chapter. This analogy can be pursued to get conserved quantities as traces of powers of the Lax matrix if we replace the ordinary trace by the Adler trace.
Proposition. The quantities $H_k = \langle Q^k\rangle$ are conserved.
Proof. Using eq. (10.8) we have $\partial_{t_l}H_k = \langle[B_l,Q^k]\rangle$ which vanishes due to the cyclicity of Adler’s trace.
Remark 1. The equations (10.8) are consistent in the sense that $[B_k,Q]\in\mathcal{P}_-$. This is because $[B_k,Q] = [(Q^k)_+,Q] = [Q^k - (Q^k)_-,Q] = -[(Q^k)_-,Q]$ which expands
on the negative powers $\partial^{-j}$, $j \geq 1$. Hence the Lax equations (10.8) produce non-linear equations of motion for the coefficients of $Q$, i.e. for the functions $\{q_{-i}\}$.
Remark 2. The first KP-flow $\partial_1$ is identified with $\partial$, because we have $Q_+ = \partial$ so that the first flow reads:
$$\partial_1 Q = [(Q)_+, Q] = [\partial, Q] = \sum_{i=1}^{\infty} (\partial q_{-i})\,\partial^{-i}$$
Therefore, the KP-time $t_1$ is naturally identified with the variable $x$ introduced in the definition of the algebra $\mathcal{P}$.
Example. To illustrate these formulae we compute the first few equations of motion. First we have:
$$(Q^2)_+ = \partial^2 + 2q_{-1}, \qquad (Q^3)_+ = \partial^3 + 3q_{-1}\partial + 3(\partial q_{-1}) + 3q_{-2}$$
The time evolution with respect to $t_2$ reads:
$$\begin{aligned}
\partial_{t_2} q_{-1} &= \partial^2 q_{-1} + 2\partial q_{-2}\\
\partial_{t_2} q_{-2} &= \partial^2 q_{-2} + 2\partial q_{-3} + 2q_{-1}\partial q_{-1}\\
&\;\;\vdots
\end{aligned}$$
Similarly, the time evolution with respect to $t_3$ of $q_{-1}$ is given by:
$$\partial_{t_3} q_{-1} = \partial^3 q_{-1} + 3\partial^2 q_{-2} + 3\partial q_{-3} + 6q_{-1}\partial q_{-1}$$
Eliminating $q_{-2}$ and $q_{-3}$ between these equations and renaming $u = -2q_{-1}$, one gets:
$$3\partial_{t_2}^2 u = \partial\big(4\partial_{t_3} u + 6u\,\partial u - \partial^3 u\big)$$
This is the KP equation, see eq. (8.58) in Chapter 8. It is the first of an infinite hierarchy of non-linear partial differential equations for $q_{-1}, q_{-2}, \ldots$.

10.3 The Baker–Akhiezer function of KP

By analogy with the Lax situation, we consider eigenvectors of the operator $Q$, together with their time evolutions under the KP flows, that is we look for an eigenfunction $\Psi(t, z)$ of $Q$ such that:
$$(Q - z)\Psi = 0, \quad Q = \Phi\partial\Phi^{-1}; \qquad (\partial_{t_m} - B_m)\Psi = 0, \quad B_m = (Q^m)_+ \tag{10.9}$$
For $m = 2$, we find
$$(\partial_{t_2} - \partial^2 + u)\Psi = 0 \tag{10.10}$$
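The elimination step behind the KP equation in the Example of Section 10.2 can be spelled out. The following is a worked sketch using only the three flow equations quoted there; the shorthand $a = q_{-1}$, $b = q_{-2}$, $c = q_{-3}$ and primes for $\partial$ are introduced here for brevity and are not notation from the text.

```latex
\begin{aligned}
\partial_{t_2} a &= a'' + 2b', \qquad
\partial_{t_2} b = b'' + 2c' + 2aa', \qquad
\partial_{t_3} a = a''' + 3b'' + 3c' + 6aa',\\
\partial_{t_2}^2 a &= (\partial_{t_2} a)'' + 2(\partial_{t_2} b)'
  = a'''' + 4b''' + 4c'' + 4(aa')',\\
4c'' &= \tfrac{4}{3}\,\partial\big(\partial_{t_3} a - a''' - 3b'' - 6aa'\big),\\
3\,\partial_{t_2}^2 a &= \partial\big(4\,\partial_{t_3} a - a''' - 12\,aa'\big).
\end{aligned}
```

Substituting $a = -u/2$ and multiplying by $-2$ gives $3\partial_{t_2}^2 u = \partial(4\partial_{t_3} u + 6u\,\partial u - \partial^3 u)$, with the same $u = -2q_{-1}$ that enters eq. (10.10).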
In the algebro-geometric case such eigenfunctions were shown to be Baker–Akhiezer functions and we will continue to call them by this name. In order to make connection with previous expressions of the Baker–Akhiezer function, we define the action of pseudo-differential operators on exponentials:
$$\partial^i e^{zx} = z^i e^{zx}, \quad \text{for all } i \in \mathbb{Z}$$
This extends to the action of any pseudo-differential operator by writing it first in normal form with all $\partial^i$ on the right. This definition is compatible with the algebra structure, in particular $(\Phi_1\Phi_2)\, e^{zx} = \Phi_1((\Phi_2)\, e^{zx})$.

Proposition. The Baker–Akhiezer function $\Psi(t, z)$ obeying eqs. (10.9) can be written as:
$$\Psi(t, z) = \Phi e^{\xi(t,z)} = (1 + w_1 z^{-1} + w_2 z^{-2} + \cdots)\, e^{\xi(t,z)} \equiv w(t, z)\, e^{\xi(t,z)} \tag{10.11}$$
where $\xi(t, z) = \sum_{i=1}^{\infty} t_i z^i$. This defines $w(t, z) = 1 + w_1 z^{-1} + w_2 z^{-2} + \cdots$

Proof. The expansion $w(t, z)$ results clearly from the definition of the action of $\Phi$ on $\exp \xi(t, z)$, noting that $t_1 = x$. Then $Q\Psi = (\Phi\partial\Phi^{-1})\Phi e^{\xi(t,z)} = z\Psi$. Similarly, using $\partial_{t_m} e^{\xi(t,z)} = \partial^m e^{\xi(t,z)} = z^m e^{\xi(t,z)}$ the evolution of $\Psi$ with respect to $t_m$ is given by:
$$\begin{aligned}
\partial_{t_m}\Psi &= \partial_{t_m}(\Phi e^{\xi(t,z)}) = (\partial_{t_m}\Phi)e^{\xi(t,z)} + \Phi\,\partial_{t_m} e^{\xi(t,z)}\\
&= -(Q^m)_-\Phi e^{\xi(t,z)} + \Phi\,\partial^m e^{\xi(t,z)} = \big(-(Q^m)_- + Q^m\big)\Phi e^{\xi(t,z)} = (Q^m)_+\Psi
\end{aligned}$$
In the last equality, we have written $\Phi\partial^m e^{\xi} = (\Phi\partial^m\Phi^{-1})\Phi e^{\xi} = Q^m\Psi$ and we used the decomposition $Q^m = (Q^m)_+ + (Q^m)_-$.

It is useful to introduce the adjoint Baker–Akhiezer function by:
$$\Psi^* = (\Phi^*)^{-1} e^{-\xi(t,z)}$$
where $\Phi^*$ is the formal adjoint of the pseudo-differential operator $\Phi$, defined in eq. (10.2). This adjoint function satisfies the adjoint system:

Proposition. The adjoint Baker–Akhiezer function obeys:
$$(Q^* - z)\Psi^* = 0, \quad Q^* = -(\Phi^*)^{-1}\partial\,\Phi^*; \qquad (\partial_{t_m} + B_m^*)\Psi^* = 0, \quad B_m^* = (Q^m)^*_+$$
Proof. Since the formal adjoint is an antihomomorphism we have $Q^* = (\Phi\cdot\partial\cdot\Phi^{-1})^* = -(\Phi^*)^{-1}\cdot\partial\cdot\Phi^*$. As a consequence, we have:
$$Q^*\Psi^* = -(\Phi^*)^{-1}\partial\, e^{-\xi(t,z)} = z\,\Psi^*$$
Similarly, we have:
$$\partial_{t_m}\Psi^* = \big(\partial_{t_m}(\Phi^*)^{-1}\big) e^{-\xi(t,z)} + (\Phi^*)^{-1}\partial_{t_m} e^{-\xi(t,z)} = \big((Q^m)^*_- - (Q^m)^*\big)\Psi^* = -(Q^m)^*_+\Psi^*$$

Note that, compared to the algebro-geometric Baker–Akhiezer functions, the puncture is at $z = \infty$ where the exponential factor $\exp \xi(t, z)$ has an essential singularity, while $w(x, z)$ is formally regular. The fact that the Baker–Akhiezer functions are solutions of the linear system eq. (10.9) implies that they satisfy an important bilinear identity:

Theorem. The following bilinear identity holds for all $(i_1, \ldots, i_m)$, $i_j \geq 0$:
$$\oint \frac{dz}{2i\pi}\, \big(\partial_{t_1}^{i_1}\cdots\partial_{t_m}^{i_m}\Psi(t, z)\big)\cdot\Psi^*(t, z) = 0 \tag{10.12}$$
It can be rewritten more compactly as:
$$\oint \frac{dz}{2i\pi}\, \Psi(t, z)\cdot\Psi^*(t', z) = 0, \qquad \forall\, t, t' \tag{10.13}$$
The integrals over $z$ are residues around $z = \infty$, i.e. integrals on big circles around $z = \infty$.

Proof. Notice that the integrands are meromorphic functions around $\infty$ because the essential singularities cancel. We first need a formula expressing the Adler residue, eq. (10.3), of a product of two pseudo-differential operators $D = \sum_i d_i\partial^i$ and $F = \sum_i f_i\partial^i$:

Lemma.
$$\oint \frac{dz}{2i\pi}\, (D e^{zx})(F e^{-zx}) = \mathrm{Res}_\partial (D F^*) \tag{10.14}$$

Proof. The left-hand side is the coefficient of $z^{-1}$ in the integrand:
$$\oint \frac{dz}{2i\pi}\, (D e^{zx})(F e^{-zx}) = \oint \frac{dz}{2i\pi} \sum_{i,j} d_i z^i f_j (-z)^j = \sum_j (-1)^j d_{-j-1} f_j$$
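Similarly, we compute the Adler residue, i.e. the coefficient of $\partial^{-1}$ in $(DF^*)$, using $F^* = \sum_i (-\partial)^i f_i$:
$$\mathrm{Res}_\partial (DF^*) = \mathrm{Res}_\partial \Big( \sum_{i,j} d_i\partial^i (-\partial)^j f_j \Big) = \sum_j (-1)^j d_{-j-1} f_j$$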
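The two sides of the lemma can be compared on a small example. The sketch below is only an illustration under stated assumptions: the coefficients $d_i$, $f_j$ are treated as numbers (their values at a fixed $x$), operators are stored as `{power: coefficient}` dictionaries, and the function names are hypothetical helpers, not notation from the text.

```python
def sign(j):
    """Return (-1)**j as an exact integer, also for negative j."""
    return 1 if j % 2 == 0 else -1


def laurent_mul(a, b):
    """Multiply two Laurent polynomials stored as {exponent: coefficient}."""
    out = {}
    for i, ai in a.items():
        for j, bj in b.items():
            out[i + j] = out.get(i + j, 0) + ai * bj
    return out


def contour_side(d, f):
    """Coefficient of z^-1 in (D e^{zx})(F e^{-zx}).

    D e^{zx} = sum_i d_i z^i e^{zx} and F e^{-zx} = sum_j f_j (-z)^j e^{-zx},
    so the exponentials cancel and a Laurent polynomial in z remains.
    """
    symbol_d = dict(d)
    symbol_f = {j: sign(j) * fj for j, fj in f.items()}
    return laurent_mul(symbol_d, symbol_f).get(-1, 0)


def residue_side(d, f):
    """Closed form from the lemma: sum_j (-1)^j d_{-j-1} f_j."""
    return sum(sign(j) * d.get(-j - 1, 0) * fj for j, fj in f.items())


if __name__ == "__main__":
    d = {1: 1, 0: 2, -1: 3, -2: 5}   # D = d_1 ∂ + d_0 + d_-1 ∂^-1 + d_-2 ∂^-2
    f = {1: 11, 0: 7, -1: 13}        # F = f_1 ∂ + f_0 + f_-1 ∂^-1
    print(contour_side(d, f), residue_side(d, f))
```

Only pairs with $i + j = -1$ contribute, which is why the double sum collapses to a single sum over $j$.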
We can now prove the identity eq. (10.12). Since $\partial_{t_m}\Psi = B_m\Psi$, where $B_m$ is a polynomial in $\partial$ (only a finite number of positive powers appear), it is sufficient to prove this equality for $(i, 0, \ldots, 0)$, $i \geq 0$. In this case:
$$\begin{aligned}
\oint \frac{dz}{2i\pi}\, (\partial^i\Psi)\cdot\Psi^* &= \oint \frac{dz}{2i\pi}\, (\partial^i\Phi e^{\xi(t,z)})\cdot\big((\Phi^*)^{-1} e^{-\xi(t,z)}\big)\\
&= \oint \frac{dz}{2i\pi}\, (\partial^i\Phi e^{zx})\cdot\big((\Phi^*)^{-1} e^{-zx}\big)\\
&= \mathrm{Res}_\partial(\partial^i\Phi\cdot\Phi^{-1}) = \mathrm{Res}_\partial(\partial^i) = 0
\end{aligned}$$
The compact expression eq. (10.13) is obtained formally by Taylor expanding around $t = t'$.

In the previous proposition, we proved that the KP equations imply that the Baker–Akhiezer functions satisfy the bilinear identities eq. (10.13). We now establish the converse statement, meaning that the whole KP hierarchy is equivalent to these bilinear identities.

Proposition. Consider two formal series:
$$\Psi = \Phi e^{\xi(t,z)}, \qquad \Psi^* = \Phi^{\star} e^{-\xi(t,z)}$$
where $\Phi$ and $\Phi^{\star}$ are two pseudo-differential operators of the form:
$$\Phi = 1 + \sum_{i=1}^{\infty} w_i\,\partial^{-i}, \qquad \Phi^{\star} = 1 + \sum_{i=1}^{\infty} w_i^{\star}\,(-\partial)^{-i}$$
where $\{w_i, w_i^{\star}\}$ are functions of the variables $\{t_i\}$. Let us assume that the bilinear identity eq. (10.13) is satisfied. Then one has
$$\Phi^{\star} = (\Phi^*)^{-1} \tag{10.15}$$
Moreover, defining $Q = \Phi\partial\Phi^{-1}$, we have $\partial_{t_m}\Phi = -(Q^m)_-\Phi$. Hence $\Psi$ is the Baker–Akhiezer function of the KP hierarchy.

Proof. We have used the notation $\Phi^{\star}$ (with a different star) to avoid introducing a new letter, but at this stage it is an independent pseudo-differential operator. By definition the functions $\Psi$ and $\Psi^*$ are:
$$\Psi = \Big(1 + \sum_{i=1}^{\infty} w_i z^{-i}\Big) e^{\xi(t,z)}, \qquad \Psi^* = \Big(1 + \sum_{i=1}^{\infty} w_i^{\star} z^{-i}\Big) e^{-\xi(t,z)}$$
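and we assume that $\oint \frac{dz}{2i\pi}\, \partial^i\Psi\cdot\Psi^* = 0$ for any $i \geq 0$. We first prove that this implies that $\Phi$ and $(\Phi^{\star})^*$ are inverse to each other. Indeed, using eq. (10.14), we have for any $i \geq 0$:
$$\mathrm{Res}_\partial\big(\partial^i\Phi\,(\Phi^{\star})^*\big) = \oint \frac{dz}{2i\pi}\, (\partial^i\Phi e^{\xi(t,z)})(\Phi^{\star} e^{-\xi(t,z)}) = \oint \frac{dz}{2i\pi}\, \partial^i\Psi\cdot\Psi^* = 0$$
where the hypothesis is used in the last step. But by construction $\Phi(\Phi^{\star})^* = 1 + X$, with $X \in \mathcal{P}_-$, therefore the above equation implies $\mathrm{Res}\,(\partial^i X) = 0$ for all $i \geq 0$ and thus $X = 0$, so that $\Phi^{\star} = (\Phi^*)^{-1}$.

Now let $Q = \Phi\partial\Phi^{-1}$, and $B_m = (Q^m)_+$. We show that $\partial_{t_m}\Phi = -(Q^m)_-\Phi$. First, observe that using $\partial_{t_m} e^{\xi(t,z)} = \partial^m e^{\xi(t,z)}$ and $\Phi\partial^m = Q^m\Phi$, we have:
$$\begin{aligned}
\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big) e^{\xi(t,z)} &= \partial_{t_m}\Phi e^{\xi(t,z)} - \big(\Phi\partial_{t_m} - (Q^m)_-\Phi\big) e^{\xi(t,z)}\\
&= \partial_{t_m}\Phi e^{\xi(t,z)} - \big(\Phi\partial^m - (Q^m)_-\Phi\big) e^{\xi(t,z)}\\
&= \big(\partial_{t_m} - Q^m + (Q^m)_-\big)\Phi e^{\xi(t,z)}\\
&= \big(\partial_{t_m} - (Q^m)_+\big)\Phi e^{\xi(t,z)}
\end{aligned}$$
By hypothesis, $\Psi$ and $\Psi^*$ satisfy the bilinear identity eq. (10.13). Therefore, since $(Q^m)_+$ is a differential polynomial we have, for any $i \geq 0$:
$$\begin{aligned}
0 &= \oint \frac{dz}{2i\pi}\, \partial^i\big(\partial_{t_m} - (Q^m)_+\big)\Phi e^{\xi(t,z)}\cdot\Phi^{\star} e^{-\xi(t,z)}\\
&= \oint \frac{dz}{2i\pi}\, \partial^i\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big) e^{\xi(t,z)}\cdot\Phi^{\star} e^{-\xi(t,z)}
\end{aligned}$$
Equivalently, one can write:
$$0 = \mathrm{Res}_\partial\Big(\partial^i\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big)(\Phi^{\star})^*\Big) = \mathrm{Res}_\partial\Big(\partial^i\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big)\Phi^{-1}\Big)$$
Since this is true for any $i \geq 0$, it implies: $\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big)\Phi^{-1} = 0$. Multiplying on the right by $\Phi$ proves the result.

10.4 Algebro-geometric solutions of KP

It is quite a remarkable fact that with any Riemann surface of genus $g$ one can associate a solution of the KP hierarchy. We explain this construction in this section. Let $\Gamma$ be a smooth algebraic curve of genus $g$. Fix a point $P_\infty$ on $\Gamma$ and a local coordinate $w(P) = z^{-1}$ in a neighbourhood of the puncture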
$P_\infty$ ($z = \infty$). Then for each set of $g$ points $\gamma_1, \ldots, \gamma_g$ in a general position there exists a unique function $\Psi(t, P)$ of the variable $P \in \Gamma$ which is meromorphic outside $P_\infty$ and has at most simple poles at the points $\gamma_s$, and in the neighbourhood of the puncture $P_\infty$ one requires:
$$\Psi(t, P) = e^{\xi(t,z)}\Big(1 + \sum_{s=1}^{\infty} w_s(t) z^{-s}\Big) \tag{10.16}$$
We now recall the fundamental formula expressing the Baker–Akhiezer functions in terms of Riemann theta functions (see Chapter 5). Let $\Omega^{(i)}$ be the unique normalized meromorphic differential with a pole at $P_\infty$, of the form $\Omega^{(i)} = d(z^i + O(z^{-1}))$ and holomorphic everywhere else. The normalization condition is that all its $a$-periods vanish,
$$\oint_{a_k} \Omega^{(i)} = 0$$
With it, we define a vector $U^{(i)}$ with coordinates
$$U^{(i)}_k = \frac{1}{2\pi i} \oint_{b_k} \Omega^{(i)} \tag{10.17}$$
The Baker–Akhiezer function $\Psi(t, P)$ is equal to
$$\Psi(t, P) = \exp\Big(\sum_i t_i \int_{P_\infty}^{P} \Omega^{(i)}\Big)\, \frac{\theta(A(P) + U^{(1)}x + U^{(2)}t - \zeta)\,\theta(\zeta)}{\theta(A(P) - \zeta)\,\theta(U^{(1)}x + U^{(2)}t - \zeta)} \tag{10.18}$$
where $\int_{P_\infty}^{P} \Omega^{(i)}$ is the unique primitive of $\Omega^{(i)}$ behaving as $z^i + O(z^{-1})$ modulo periods in the vicinity of $P_\infty$. The vector $\zeta$ is equal to $\zeta = A(D) + K$ with $D = \gamma_1 + \cdots + \gamma_g$ and $K$ is the vector of Riemann constants. Finally, $A(P)$ is the Abel map with origin at $P_\infty$.

Remark. The Baker–Akhiezer function is intrinsically defined by its analyticity properties. In the above formula, the choice of $a$-cycles and $b$-cycles and the normalization of the differentials is at our disposal. Another more canonical normalization is obtained by requiring that the forms $\tilde\Omega^{(j)}$ have pure imaginary periods on any cycle. They are obtained by the transformation
$$\Omega^{(j)} = \tilde\Omega^{(j)} + \sum_{i=1}^{g} \alpha_i^{(j)}\omega_i \tag{10.19}$$
where the $\omega_i$ are the $g$ normalized holomorphic differentials. Writing these normalization conditions gives $2g$ real conditions on the $g$ complex parameters $\alpha_i^{(j)}$ which can be solved. Indeed, taking the integral of this formula over the cycle $a_i$, we get
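$\alpha_i^{(j)} = -\oint_{a_i} \tilde\Omega^{(j)}$ is pure imaginary. Taking the integral over the cycle $b_i$, we get the unique solution
$$\alpha_i^{(j)} = 2i\pi\,(\mathrm{Im}\,B)^{-1}_{ik}\,\mathrm{Im}\,U_k^{(j)}$$
where $B$ is the matrix of $b$-periods of the $\omega_i$. With such $\tilde\Omega^{(j)}$, we can write
$$\Psi(t, P) = \exp\Big(\sum_j t_j \int_{P_\infty}^{P} \tilde\Omega^{(j)}\Big)\, \varphi\Big(\sum_j t_j \tilde U^{(j)}, P\Big) \tag{10.20}$$
where the vector $\tilde U^{(j)}$ has $2g$ components
$$\tilde U_i^{(j)} = -\frac{1}{2i\pi} \oint_{a_i} \tilde\Omega^{(j)}, \qquad \tilde U_{g+i}^{(j)} = \frac{1}{2i\pi} \oint_{b_i} \tilde\Omega^{(j)}, \qquad j = 1, \ldots, g$$
The function $\varphi(z, P)$ is equal to
$$\varphi(z, P) = e^{2i\pi \sum_{i=1}^{g} z_i A_i(P)}\, \frac{\theta\big(A(P) + \sum_i (z_{i+g} I_i + z_i B_i) - \zeta\big)\,\theta(\zeta)}{\theta(A(P) - \zeta)\,\theta\big(\sum_i (z_{i+g} I_i + z_i B_i) - \zeta\big)}$$
The vectors $I_i$ and $B_i$ are such that $(I_i)_j = \delta_{ij}$ and $(B_i)_j = B_{ij}$. This is obtained by plugging eq. (10.19) in eq. (10.18). Note that the function $\varphi(z, P)$ is periodic with period 1 in each of the $2g$ variables $z_i$. This form will be useful in the considerations of the last sections on Whitham equations.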
The Baker–Akhiezer function automatically produces solutions of the KP hierarchy as follows. Consider eq. (10.16) for the asymptotic form of the Baker–Akhiezer function around $P_\infty$. Let us rewrite it as
$$\Psi(t, P) = \Phi e^{\xi(t,z)}, \qquad \Phi = 1 + \sum_{s=1}^{\infty} w_s(t)\,\partial^{-s}$$
where $\partial = \partial_{t_1} = \partial_x$ and $\xi(t, z) = \sum_i t_i z^i$. This defines the pseudo-differential operator $\Phi$. From it we define
$$Q = \Phi\partial\Phi^{-1}$$
Then we have

Proposition. Let $\Psi(t, P)$ be the above Baker–Akhiezer function. Then it satisfies the equations of motion of the KP hierarchy
$$(Q - z)\Psi = 0, \qquad (\partial_{t_i} - (Q^i)_+)\Psi = 0$$

Proof. The first equation has a meaning as an expansion around $P_\infty$ and directly follows from the definition of $Q$. To prove the second equation,
consider the function $(\partial_{t_i} - (Q^i)_+)\Psi$ on $\Gamma$. It has the same analyticity properties as $\Psi$, apart from the behavior around $P_\infty$ where we have
$$(\partial_{t_i} - (Q^i)_+)\Psi = (\partial_{t_i} - Q^i + (Q^i)_-)\Psi = O(z^{-1})\, e^{\xi(t,z)}$$
We used that $\Phi\,\partial_{t_i} e^{\xi(t,z)} = Q^i\Phi e^{\xi(t,z)}$. Hence the expression in the left-hand side identically vanishes by the uniqueness of the Baker–Akhiezer function.
Remark. We stress that this construction associates solutions of the KP hierarchy with any Riemann surface. Special curves may lead to additional interesting structures, as we have seen in Chapter 7 on Calogero–Moser systems.

We now give the global definition of the adjoint Baker–Akhiezer function. For any set of $g$ points in general position there exists a unique meromorphic differential $\Omega$ with a double pole at $P_\infty$:
$$\Omega = dz\,(1 + O(z^{-2})) \tag{10.21}$$
and zeroes at the points $\gamma_s$:
$$\Omega(\gamma_s) = 0, \qquad s = 1, \ldots, g \tag{10.22}$$
Besides $\gamma_s$ this differential has $g$ other zeroes that we denote by $\gamma_s^*$. The adjoint Baker–Akhiezer function is the unique function $\Psi^*(t, P)$ of the variable $P \in \Gamma$ which is meromorphic outside $P_\infty$, has at most simple poles at the points $\gamma_s^*$ (if all of them are distinct), and behaves in the neighbourhood of the puncture $P_\infty$ as
$$\Psi^*(t, P) = e^{-\xi(t,z)}\Big(1 + \sum_{s=1}^{\infty} w_s^*(t) z^{-s}\Big)$$
The adjoint Baker–Akhiezer function $\Psi^*(t, P)$ is equal to
$$\Psi^*(t, P) = \exp\Big(-\sum_i t_i \int_{P_\infty}^{P} \Omega^{(i)}\Big)\, \frac{\theta(A(P) - U^{(1)}x - U^{(2)}t - \zeta^*)\,\theta(\zeta^*)}{\theta(A(P) - \zeta^*)\,\theta(U^{(1)}x + U^{(2)}t + \zeta^*)}$$
where $\zeta^* = A(D^*) + K$ and $D^* = \gamma_1^* + \cdots + \gamma_g^*$.
Proposition. The adjoint Baker–Akhiezer function satisfies the equations:
$$(Q^* - z)\Psi^* = 0, \qquad (\partial_{t_i} + (Q^i)^*_+)\Psi^* = 0 \tag{10.23}$$
where $Q^*$ is the formal adjoint of $Q$.

Proof. Consider, for any positive integer $i$, the form $(\partial^i\Psi)\Psi^*\Omega$, where $\Omega$ is defined by eqs. (10.21, 10.22). This is a meromorphic 1-form on $\Gamma$ with a unique pole at $P_\infty$ of order $2 + i$ because the poles of $\Psi$ and $\Psi^*$ are cancelled by the zeroes of $\Omega$. Moreover, the essential singularities of $\Psi$ and $\Psi^*$ at $P_\infty$ cancel. Around $P_\infty$, we have $\Psi = \Phi e^{\xi(t,z)}$ and $\Psi^* = \Phi^{\star} e^{-\xi(t,z)}$, where $\Phi^{\star}$ is defined from the expansion of $\Psi^*$. So we have
$$(\partial^i\Phi e^{\xi})\,\Phi^{\star} e^{-\xi}\,\Omega = z^i dz\,(1 + O(1/z))$$
Since the sum of residues of any meromorphic 1-form must vanish, we get:
$$\oint (\partial^i\Psi)\cdot\Psi^*\,\Omega = 0, \qquad \forall i \geq 0$$
where the integral is taken on a small circle around $P_\infty$. This means that $\Psi$ and $\Psi^*\Omega$ satisfy the bilinear identities eq. (10.13), and therefore by eq. (10.15) we have $\Phi^{\star}\,\Omega = (\Phi^*)^{-1} dz$. The adjoint equations of motion follow because $\Omega$ is independent of $t$.

We have shown that the adjoint Baker–Akhiezer function is equal to the formal adjoint of $\Psi$, up to the factor $\Omega/dz$. This also shows that Baker–Akhiezer functions constructed from Riemann surfaces automatically satisfy the fundamental bilinear identities eq. (10.13).

10.5 The tau-function of KP

The bilinear identity eq. (10.13) allows us to express the Baker–Akhiezer function in terms of a tau-function.

Proposition. Assume that $\Psi$ and $\Psi^*$ are Baker–Akhiezer functions of the KP hierarchy satisfying eq. (10.13). Then there exists a function $\tau$ such that
$$\Psi(t, z) = \frac{\tau(t - [z^{-1}])}{\tau(t)}\, e^{\xi(t,z)}, \qquad \Psi^*(t, z) = \frac{\tau(t + [z^{-1}])}{\tau(t)}\, e^{-\xi(t,z)} \tag{10.24}$$
where $[z^{-1}] = \big(\frac{1}{z}, \frac{1}{2z^2}, \frac{1}{3z^3}, \cdots\big)$ and $\xi(t, z) = \sum_{i=1}^{\infty} t_i z^i$.
Proof. Note that for $f(z) = 1 + \sum_{i=1}^{\infty} f_i z^{-i}$ we have, by the residue theorem:
$$\oint \frac{dz}{2i\pi}\, \frac{f(z)}{1 - z/z'} = z'\,(f(z') - 1) \tag{10.25}$$
where $z'$ is big enough to be inside the integration contour around $\infty$. The two terms correspond to the poles at $z = z'$ and $z = \infty$. The bilinear identity applied to $t$ and $t' = t - [z_1^{-1}]$ yields:
$$0 = \oint \frac{dz}{2i\pi}\, \Psi(t, z)\Psi^*(t - [z_1^{-1}], z) = \oint \frac{dz}{2i\pi}\, \frac{w(t, z)\, w^*(t - [z_1^{-1}], z)}{1 - z/z_1}$$
where $w(t, z)$ is defined in eq. (10.11). To show the second equality, we used that
$$e^{-\xi(t - [z_1^{-1}], z)} = e^{-\xi(t,z)}\, \frac{1}{1 - z/z_1}$$
Up to now the arguments are similar to the proof of eq. (9.77) in Chapter 9. Notice that the Cauchy kernel in the right-hand side is produced by the very specific choice of shift we consider. Applying the residue formula eq. (10.25), we get $w(t, z_1)\, w^*(t - [z_1^{-1}], z_1) = 1$, or:
$$w^*(t - [z_1^{-1}], z_1) = \frac{1}{w(t, z_1)} \tag{10.26}$$
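Two ingredients of this step lend themselves to a quick check: the residue formula eq. (10.25), read as a coefficient extraction with the kernel expanded in powers of $z/z'$, and the Cauchy kernel produced by the shift $[z_1^{-1}]$, which comes from $\sum_{i\ge 1} (z/z_1)^i/i = -\log(1 - z/z_1)$. The sketch below assumes a series $f$ truncated to finitely many terms; the function names are hypothetical helpers introduced for this illustration.

```python
import math
from fractions import Fraction


def residue_with_kernel(f_coeffs, zp):
    """Coefficient of z^-1 in f(z) * sum_{k>=0} (z/zp)^k.

    f_coeffs[i-1] is f_i in f(z) = 1 + sum_{i>=1} f_i z^-i.
    The term f_i z^-i pairs with (z/zp)^(i-1) to give z^-1.
    """
    zp = Fraction(zp)
    return sum(Fraction(fi) / zp ** (i - 1)
               for i, fi in enumerate(f_coeffs, start=1))


def closed_form(f_coeffs, zp):
    """Right-hand side of eq. (10.25): z' (f(z') - 1)."""
    zp = Fraction(zp)
    f_at_zp = 1 + sum(Fraction(fi) / zp ** i
                      for i, fi in enumerate(f_coeffs, start=1))
    return zp * (f_at_zp - 1)


def shifted_exponent(z, z1, terms=60):
    """xi([z1^-1], z) = sum_i z^i / (i z1^i), truncated to `terms` terms."""
    return sum((z / z1) ** i / i for i in range(1, terms + 1))


if __name__ == "__main__":
    f = [2, -3, 5]
    print(residue_with_kernel(f, 4), closed_form(f, 4))
    print(math.exp(shifted_exponent(0.3, 1.0)), 1 / (1 - 0.3))
```

The first pair of numbers agrees exactly; the second agrees up to truncation and rounding, since the series converges geometrically for $|z| < |z_1|$.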
Similarly, applying the bilinear identity to $t$ and $t' = t - [z_1^{-1}] - [z_2^{-1}]$, we see that the following quantity vanishes:
$$\oint \frac{dz}{2i\pi}\, \Psi(t, z)\Psi^*(t - [z_1^{-1}] - [z_2^{-1}], z) = \oint \frac{dz}{2i\pi}\, \frac{w(t, z)\, w^*(t - [z_1^{-1}] - [z_2^{-1}], z)}{(1 - z/z_1)(1 - z/z_2)}$$
Since there is no residue at $\infty$ we get $w(t, z_1)\, w^*(t - [z_1^{-1}] - [z_2^{-1}], z_1) = w(t, z_2)\, w^*(t - [z_1^{-1}] - [z_2^{-1}], z_2)$. Eliminating $w^*$ using eq. (10.26), we obtain the functional equation:
$$\frac{w(t - [z_2^{-1}], z_1)}{w(t, z_1)} = \frac{w(t - [z_1^{-1}], z_2)}{w(t, z_2)}$$
We want to show that this equation implies:
$$w(t, z) = \frac{\tau(t - [z^{-1}])}{\tau(t)} \tag{10.27}$$
It is trivial to verify that this solves the equation. We now proceed to show that this is the general solution. Taking the logarithm of the functional equation, we are led to study an equation of the form
$$f(t - [u^{-1}], v) - f(t, v) = f(t - [v^{-1}], u) - f(t, u) \tag{10.28}$$
where the function $f$ is:
$$f(t, v) = \log w(t, v) = \frac{1}{v}\, w_1(t) + \frac{1}{v^2}\Big(w_2(t) - \frac{1}{2} w_1^2(t)\Big) + \cdots$$
Introducing the generating function of time derivatives:
$$\nabla_v = \sum_{i=1}^{\infty} v^{-i-1}\,\partial_{t_i}$$
we remark that $(\partial_v - \nabla_v)\,\phi(t - [v^{-1}]) = 0$ for any function $\phi(t)$. Applying $(\partial_v - \nabla_v)$ to eq. (10.28), we get
$$(\partial_v - \nabla_v) f(t - [u^{-1}], v) - (\partial_v - \nabla_v) f(t, v) = -(\partial_v - \nabla_v) f(t, u)$$
Expanding in $v^{-1}$ and setting $(\partial_v - \nabla_v) f(t, v) = \sum_{i=1}^{\infty} \gamma_i(t)\, v^{-i-1}$, this reads:
$$\gamma_i(t - [u^{-1}]) - \gamma_i(t) = \partial_{t_i} f(t, u) \tag{10.29}$$
Considering $F_{ij}(t) = \partial_{t_i}\gamma_j(t) - \partial_{t_j}\gamma_i(t)$ we get the condition $F_{ij}(t - [u^{-1}]) = F_{ij}(t)$. Expanding in powers of $u^{-1}$ one sees that $F_{ij}(t)$ is independent of all the time variables $t_1, t_2, \ldots$. But by construction, $F_{ij}$ is a local differential polynomial in $w_1(t), w_2(t), \ldots$ (for example $F_{12} = \partial_{t_2} w_1 - 2\partial_{t_1} w_2 + 2w_1\partial_{t_1} w_1 - \partial_1^2 w_1$, etc.). Using the equations of motion of the KP hierarchy we can replace all the $\partial_{t_k} w_l$ for $k \geq 2$ by higher derivatives of $w_l$ with respect to $t_1 = x$. Hence we can write $F_{ij}$ as a polynomial in the $\partial^k w_l$. But the monomials in $F_{ij}$ are independent, and we know that $F_{ij}$ is constant, hence it reduces to its constant term. Since it vanishes for the particular solution $w = 0$, we see that $F_{ij} = 0$. So we can write $\gamma_i(t) = \partial_{t_i} \log \tau(t)$. Finally, inserting this into eq. (10.29) we get eq. (10.27). The formula for $\Psi^*(t, z)$ is then a straightforward consequence of eq. (10.26) where one substitutes $t \to t + [z_1^{-1}]$.

Equation (10.24) is Sato's formula. Using it, the bilinear identity eq. (10.13) can be rewritten as bilinear identities for the tau-functions. Of course they coincide with the Hirota bilinear identities we obtained using vertex operators, eq. (9.38) in Chapter 9.

Remark 1. We recall that in terms of the fermionic description of the previous chapter, the tau-function and the Baker–Akhiezer function have a particularly elegant formulation: $\tau(t; g) = \langle 0|e^{H(t)} g|0\rangle$ and
$$\Psi(t, z) = \frac{\langle 1|e^{H(t)}\beta(z) g|0\rangle}{\tau(t; g)}, \qquad \Psi^*(t, z) = \frac{\langle -1|e^{H(t)}\beta^*(z) g|0\rangle}{\tau(t; g)}$$
where g is an element of the group GL(∞), and β(z) and β ∗ (z) are the fermionic operators.
Remark 2. The Grassmannian formulation also shows that $\tau(t)$ is given by an infinite determinant. In particularly interesting cases, it degenerates to a finite determinant.

10.6 The generalized KdV equations

The KP hierarchy is a system of evolution equations for the infinite set of functions $w_i(t)$ or equivalently the coefficients $q_{-i}(t)$ appearing in $Q$. To reduce the system to a finite number of coefficients, we remark that since $Q$ obeys the Lax equations $\partial_{t_k} Q = [(Q^k)_+, Q]$, any power $Q^{n+1}$ also obeys the same equation. The main remark is that one can impose consistently that $Q^{n+1}$ is a differential operator, i.e. $(Q^{n+1})_- = 0$. This is because one then has $[(Q^k)_+, Q^{n+1}]_- = 0$, and so $\partial_{t_k} Q^{n+1}$ is a differential operator. Moreover, the two sides of the Lax equation are differential operators of the same order because one can also write $\partial_{t_k} Q^{n+1} = [-(Q^k)_-, Q^{n+1}]$, which is in fact of order $\partial^{n-1}$. It follows that one can further impose that the coefficient of $\partial^n$ in $Q^{n+1}$ vanishes. To summarize, we impose that $Q^{n+1} = L$ is a differential operator:
$$L = \partial^{n+1} - \sum_{i=0}^{n-1} u_i\,\partial^i \tag{10.30}$$
With this $L$ one can write Lax equations which define flows on the finite number of functions $u_i$. These flows close on the $u_i$ because given $L$ one can reconstruct $Q$ such that $Q^{n+1} = L$, so that $(Q^k)_+$ may be viewed as a function of the $u_i$.

Proposition. Let $L$ be the differential operator eq. (10.30). There exists a unique pseudo-differential operator $Q = \partial + q_{-1}\partial^{-1} + \cdots$ such that $Q^{n+1} = L$. We will denote it by $Q = L^{\frac{1}{n+1}}$.

Proof. If $Q = \partial + q_0 + q_{-1}\partial^{-1} + \cdots$, one first sees that $Q^{n+1} = \partial^{n+1} + (n+1) q_0\,\partial^n + \cdots$. Since there is no term $\partial^n$ in $L$ one has $q_0 = 0$. Then by induction one shows that:
$$\begin{aligned}
(n+1)\, q_{-1} &= -u_{n-1}, \qquad (n+1)\, q_{-2} = -u_{n-2} - \frac{n(n+1)}{2}\,\partial q_{-1}\\
(n+1)\, q_{-i} &= -u_{n-i} + p_i(q_{-1}, \ldots, q_{-i+1})
\end{aligned}$$
where $p_i$ is a differential polynomial in its arguments. Knowing the $u_i$, this system uniquely determines the $q_i$ recursively.
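To make the recursion concrete, here are the first two coefficients of $Q = L^{\frac{1}{n+1}}$ in the two lowest cases, obtained by specializing the formulas of the proof. This is a worked sketch: for $n = 1$ we write $u_0 = u$ and use that $L$ contains no $u_{-1}$ term.

```latex
\begin{aligned}
n = 1,\quad L = \partial^2 - u:\qquad
  & 2q_{-1} = -u, \quad 2q_{-2} = -\partial q_{-1}
  \;\Longrightarrow\; q_{-1} = -\tfrac{1}{2}u, \quad q_{-2} = \tfrac{1}{4}(\partial u),\\
n = 2,\quad L = \partial^3 - u_1\partial - u_0:\qquad
  & 3q_{-1} = -u_1, \quad 3q_{-2} = -u_0 - 3\,\partial q_{-1}
  \;\Longrightarrow\; q_{-1} = -\tfrac{1}{3}u_1, \quad q_{-2} = \tfrac{1}{3}\big(\partial u_1 - u_0\big).
\end{aligned}
```

In both cases each new coefficient is fixed by one coefficient of $L$ and the previously determined $q_{-j}$, which is the content of the uniqueness statement.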
We can rewrite the reduced KP flows directly in terms of $L$. These systems are called the generalized KdV hierarchies. The KdV hierarchy corresponds to $n = 1$, and the generalized ones to $n = 2, 3, \ldots$. It is worth writing these equations once more:

Proposition. Let $L$ be the differential operator eq. (10.30). Then the Lax equations
$$\partial_{t_k} L = \Big[\big(L^{\frac{k}{n+1}}\big)_+, L\Big] \tag{10.31}$$
are consistent for all $k \in \mathbb{N}$.

Proof. We introduce the pseudo-differential operator $Q = L^{\frac{1}{n+1}}$. Notice that $Q^k$, $\forall k \in \mathbb{N}$, commutes with $L$ since $LQ^k = Q^{n+1+k} = Q^k L$. Then we have:
$$\big[(Q^k)_+, L\big] = \big[Q^k, L\big] - \big[(Q^k)_-, L\big] = -\big[(Q^k)_-, L\big]$$
From the last equality, it follows that the differential operator $[(Q^k)_+, L]$ is of order less or equal to $n - 1$, so that the Lax equation eq. (10.31) is an equation on the coefficients of $L$.

Example. Let us consider the KdV case $n = 1$. The operator $L$ is the second order differential operator
$$L = \partial^2 - u$$
We first find $Q$ such that $Q^2 = L$. One has $Q^2 = \partial^2 + 2q_{-1} + (2q_{-2} + \partial q_{-1})\partial^{-1} + \cdots$ so that $q_{-1} = -\frac{1}{2}u$, $q_{-2} = \frac{1}{4}\partial u$, etc.
$$Q = \partial - \frac{1}{2}\, u\,\partial^{-1} + \frac{1}{4}(\partial u)\,\partial^{-2} + \cdots$$
We again check on this simple example that all the $q_{-j}$ are recursively determined in terms of $u$ by requiring that no $\partial^{-j}$ terms occur in $Q^2$. To obtain the KdV flows, we only have to compute $(Q^k)_+$, $k = 1, 2, \ldots$. For $k = 1$, we have $(Q)_+ = \partial$, and $\partial_1 L = [\partial, L]$. This reduces to the identification $\partial_{t_1} = \partial$. For $k = 2$, we have $(Q^2)_+ = L$, and we get the trivial equation $\partial_{t_2} L = 0$. The first non-trivial case is $k = 3$. We have:
$$(Q^3)_+ = \partial^3 - \frac{3}{2}\, u\,\partial - \frac{3}{4}(\partial u)$$
so the Lax equation reads $\partial_{t_3} L = -\partial_{t_3} u = [(Q^3)_+, \partial^2 - u]$. This is the Korteweg–de Vries equation:
$$4\,\partial_{t_3} u = \partial^3 u - 6u\,(\partial u)$$
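The $k = 3$ computation can be checked mechanically. The sketch below is an illustration under stated assumptions, not part of the text: it applies $B = \partial^3 - \frac{3}{2}u\partial - \frac{3}{4}(\partial u)$ and $L = \partial^2 - u$ to polynomial test functions in $x$ (lists of exact coefficients, lowest power first) and confirms that $[B, L]$ acts as multiplication by $-\frac{1}{4}(\partial^3 u - 6u\,\partial u)$. All function names are hypothetical helpers.

```python
from fractions import Fraction


def padd(a, b):
    """Add two polynomials given as coefficient lists (lowest power first)."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def pscale(c, a):
    return [c * x for x in a]


def pmul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def pder(a, order=1):
    """Differentiate a polynomial `order` times."""
    for _ in range(order):
        a = [i * a[i] for i in range(1, len(a))]
    return a


def trim(a):
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def L_op(u, f):
    """L f = f'' - u f."""
    return padd(pder(f, 2), pscale(-1, pmul(u, f)))


def B_op(u, f):
    """B f = f''' - (3/2) u f' - (3/4) u' f, i.e. (Q^3)_+ applied to f."""
    return padd(padd(pder(f, 3), pscale(Fraction(-3, 2), pmul(u, pder(f)))),
                pscale(Fraction(-3, 4), pmul(pder(u), f)))


def commutator_on(u, f):
    """[B, L] f = B(L f) - L(B f)."""
    return trim(padd(B_op(u, L_op(u, f)), pscale(-1, L_op(u, B_op(u, f)))))


def kdv_rhs(u):
    """(u''' - 6 u u') / 4, the right-hand side of the KdV equation."""
    return pscale(Fraction(1, 4),
                  padd(pder(u, 3), pscale(-6, pmul(u, pder(u)))))


def kdv_check(u, f):
    """True if [B, L] f equals -(kdv_rhs(u)) f, i.e. [B, L] = -du/dt3."""
    return commutator_on(u, f) == trim(pscale(-1, pmul(kdv_rhs(u), f)))


if __name__ == "__main__":
    u = [Fraction(0), Fraction(1), Fraction(0), Fraction(2)]   # u = x + 2x^3
    f = [Fraction(1), Fraction(0), Fraction(3), Fraction(0), Fraction(1)]
    print(kdv_check(u, f))
```

The $\partial^2$ and $\partial$ terms of the commutator cancel, so that $[B, L]$ is a pure multiplication operator; together with $\partial_{t_3} L = -\partial_{t_3} u$ this reproduces the Korteweg–de Vries equation above.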
This is the first of a hierarchy of equations obtained by taking $k = 3, 5, 7, \ldots$, called the KdV hierarchy (note that for $k$ even we get trivial equations), which will be studied in detail in Chapter 11.

We now show that the generalized KdV equations are Hamiltonian systems. The differential operator $L$ is an element of $\mathcal{P}_+$. So we have to specify a Poisson structure on the space $\mathcal{F}(\mathcal{P}_+)$ of functions on $\mathcal{P}_+$. If we view $\mathcal{P}_+$ as the dual of the Lie algebra $\mathcal{P}_-$ through the Adler trace, there is a natural Poisson bracket on $\mathcal{P}_+$: the Kostant–Kirillov bracket. For any functions $f$ and $g$ on $\mathcal{P}_+$, it is defined as usual by:
$$\{f, g\}_1(L) = \langle L, [df, dg] \rangle \qquad \forall\, L \in \mathcal{P}_+ \tag{10.32}$$
where we understand that $df, dg \in \mathcal{P}_-$. In particular, for any $X = \sum_{j=0}^{\infty} \partial^{-j-1} x_j \in \mathcal{P}_-$, we define the linear function $f_X(L)$ by:
$$f_X(L) = \langle L, X \rangle \tag{10.33}$$
and we have $df_X = X \in \mathcal{P}_-$. Therefore $\{f_X, f_Y\}_1 = f_{[X,Y]} = \langle L, [X, Y] \rangle$ for any $X, Y \in \mathcal{P}_-$.

Proposition. Let $L \in \mathcal{P}_+$ be the differential operator of order $n + 1$ as in eq. (10.30). Define the functions of $L$ by
$$H_k(L) = \frac{n+1}{n+k+1}\, \Big\langle L^{\frac{k}{n+1}+1} \Big\rangle$$
They are the conserved quantities of the generalized KdV hierarchy. Then:

(i) The quantities $H_k$ are the Hamiltonians of the generalized KdV flows under the bracket eq. (10.32):
$$\dot L = \{H_k, L\}_1 = \Big[\big(L^{\frac{k}{n+1}}\big)_+, L\Big] \tag{10.34}$$
(ii) The functions $H_k(L)$ are in involution with respect to this bracket.

Proof. Recall that $Q = L^{\frac{1}{n+1}}$ so that $H_k$ is proportional to $\langle Q^{k+n+1} \rangle$, which are the conserved quantities of the KP hierarchy, and so are conserved also in the generalized KdV hierarchies. We first need to compute the differential of the Hamiltonian $H_k$. Let $L$ and $\delta L$ be differential operators of the form eq. (10.30). One has, using the cyclicity of Adler's trace:
$$\langle (L + \delta L)^{\nu} \rangle = \langle L^{\nu} \rangle + \nu\, \langle L^{\nu-1}\delta L \rangle + \cdots$$
which implies $d\langle L^{\nu} \rangle = \nu\,(L^{\nu-1})_{(-n)}$, where the notation $(\;)_{(-n)}$ means projection on $\mathcal{P}_-$ truncated at the first $n$ terms. This projection appears
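because $\delta L = -\delta u_{n-1}\partial^{n-1} - \cdots - \delta u_0$, which is dual to elements of the form $\partial^{-1} x_0 + \cdots + \partial^{-n} x_n$ under the Adler trace. Hence:
$$dH_k(L) = \big(L^{\frac{k}{n+1}}\big)_{(-n)} = \big(Q^k\big)_{(-n)} \in \mathcal{P}_- \tag{10.35}$$
We define $\theta^{(k)}_{-(n+1)}$ as the terms left over in the truncation:
$$\big(L^{\frac{k}{n+1}}\big)_- = dH_k + \theta^{(k)}_{-(n+1)} \tag{10.36}$$
We now prove eq. (10.34). Consider the function $f_X(L) = \langle LX \rangle$, then
$$\dot f_X = \{H_k, f_X\}(L) = \langle L, [dH_k, df_X] \rangle = \langle [L, dH_k], X \rangle$$
where we used the invariance of the Adler trace. Since $X \in \mathcal{P}_-$, only $[L, dH_k]_+$ contributes to this expression. But
$$[L, dH_k]_+ = \Big[L, \big(L^{\frac{k}{n+1}}\big)_-\Big]_+ - \Big[L, \theta^{(k)}_{-(n+1)}\Big]_+ = \Big[\big(L^{\frac{k}{n+1}}\big)_+, L\Big]_+$$
where we have used $[L^{\frac{k}{n+1}}, L] = 0$, and the fact that $[L, \theta^{(k)}_{-(n+1)}]_+ = 0$. So $[L, dH_k]_+$ is a differential operator of order at most $n - 1$, and this is enough to prove eq. (10.34).

Next we show that the Hamiltonians $H_k$ are in involution. We have:
$$\{H_k, H_l\}_1(L) = \langle L, [dH_k, dH_l] \rangle = \langle [L, dH_k]_+, dH_l \rangle = \Big\langle \Big[\big(L^{\frac{k}{n+1}}\big)_+, L\Big], dH_l \Big\rangle$$
Using again the fact that $[L, dH_k]_+$ is of order at most $n - 1$, we can replace $dH_l$ by $\big(L^{\frac{l}{n+1}}\big)_-$, and get:
$$\{H_k, H_l\}_1(L) = \Big\langle \Big[\big(L^{\frac{k}{n+1}}\big)_+, L\Big] \big(L^{\frac{l}{n+1}}\big)_- \Big\rangle = \Big\langle \Big[\big(L^{\frac{k}{n+1}}\big)_+, L\Big]\, L^{\frac{l}{n+1}} \Big\rangle$$
In the last step we used that $\langle \mathcal{P}_+, \mathcal{P}_+ \rangle = 0$ in order to replace $\big(L^{\frac{l}{n+1}}\big)_-$ by $L^{\frac{l}{n+1}}$. Finally, from the invariance of the trace, we obtain:
$$\{H_k, H_l\}_1(L) = \Big\langle \big(L^{\frac{k}{n+1}}\big)_+ \Big[L, L^{\frac{l}{n+1}}\Big] \Big\rangle = 0$$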
This proposition shows that the generalized KdV hierarchies are Hamiltonian systems. In the next section we show that there exists in fact another local Hamiltonian structure for the same hierarchy.
10.7 KdV Hamiltonian structures

We will establish recursion relations between the equations of motion written with Hamiltonians $H_k$ and $H_{k+n+1}$. These relations are called Lenard recursion relations. They suggest introducing a second Poisson bracket on $\mathcal{F}(\mathcal{P}_+)$ such that the generalized KdV equations can be written as Hamilton equations with respect to both Poisson brackets; such systems are called bihamiltonian systems.

Proposition. Let $L$ be the differential operator eq. (10.30). Let us introduce two operators $D_1$ and $D_2$ acting on any $X \in \mathcal{P}_-$ by:
$$\begin{aligned}
D_1(X) &= [L, X]_+\\
D_2(X) &= (LX)_+ L - L(XL)_+ - \frac{1}{n+1}\,\big[L, (\partial^{-1}[X, L]_{-1})\big]
\end{aligned}$$
Then, the functions $H_k(L)$ satisfy the following recursion relation:
$$D_1(dH_{k+n+1}) = D_2(dH_k) \tag{10.37}$$
Proof. Note that for any $X \in \mathcal{P}_-$ we have $0 = \langle [X, L] \rangle = \int dx\, [X, L]_{-1}$ so that $[X, L]_{-1}$ is a total derivative and the object $(\partial^{-1}[X, L]_{-1})$ appearing in the definition of $D_2$ is by definition a primitive of this total differential, i.e. it is local. For example, we have $[\partial^2 - u, \sum_{i=1}^{\infty} x_i\partial^{-i}]_{-1} = \partial(\partial x_1 + 2x_2)$. Using eq. (10.36) for $dH_{k+n+1}$, we have:
$$D_1(dH_{k+n+1}) = [L, dH_{k+n+1}]_+ = [L, Q^{k+n+1}_-]_+ - [L, \theta^{(k+n+1)}_{-(n+1)}]_+ = -[L, Q^{k+n+1}_+]$$
We used $[L, \theta^{(k+n+1)}_{-(n+1)}]_+ = 0$ and the fact that $Q^{k+n+1} = Q^{k+n+1}_+ + Q^{k+n+1}_-$ commutes with $L$. Now, the simple recursion relation: $Q^k L = Q^{k+n+1}$ implies:
$$Q^{k+n+1}_+ = (Q^k L)_+ = (Q^k_+ L)_+ + (Q^k_- L)_+ = Q^k_+ L + (Q^k_- L)_+$$
Thus we obtain:
$$\begin{aligned}
D_1(dH_{k+n+1}) &= -[L, Q^{k+n+1}_+] = -[L, Q^k_+ L] - [L, (Q^k_- L)_+]\\
&= [L, dH_k]_+ L - [L, (dH_k L)_+] - [L, (\theta^{(k)}_{-(n+1)} L)_+]
\end{aligned} \tag{10.38}$$
where we used again the decomposition eq. (10.36) for $dH_k$. The remarkable fact is that the last term involving $\theta^{(k)}_{-(n+1)}$ can also be expressed
in terms of $dH_k$. Indeed, defining $v_0$ by $\theta^{(k)}_{-(n+1)} = v_0\,\partial^{-(n+1)} + \cdots$, we have $(\theta^{(k)}_{-(n+1)} L)_+ = v_0$. Also, $[(L^{\frac{k}{n+1}})_-, L]_- = -[(L^{\frac{k}{n+1}})_+, L]_- = 0$, and looking at the Adler residue gives:
$$0 = \big[(L^{\frac{k}{n+1}})_-, L\big]_{-1} = [dH_k, L]_{-1} + [\theta^{(k)}_{-(n+1)}, L]_{-1} = [dH_k, L]_{-1} - (n+1)\,\partial v_0$$
Because the residue of a commutator is a total derivative, we can integrate this equation, and write consistently that:
$$v_0 = \frac{1}{n+1}\,\partial^{-1}[dH_k, L]_{-1}$$
Inserting this result into eq. (10.38), we get:
$$[L, dH_{k+n+1}]_+ = (L\, dH_k)_+ L - L\,(dH_k L)_+ - \frac{1}{n+1}\,\big[L, (\partial^{-1}[dH_k, L]_{-1})\big]$$
which is the claimed statement.

This recursion relation led Adler to conjecture that the operators $D_1$ and $D_2$ define two Poisson brackets on $\mathcal{F}(\mathcal{P}_+)$ denoted by $\{\,,\}_1$ and $\{\,,\}_2$ through the equations:
$$\{f, g\}_1 = \langle D_1(df)\cdot dg \rangle, \qquad \{f, g\}_2 = \langle D_2(df)\cdot dg \rangle$$
More precisely, let us define as in eq. (10.33) the functions $f_X(L) = \langle LX \rangle$ such that $df_X = X$. The explicit expressions for the two brackets are then:
$$\begin{aligned}
\{f_X, f_Y\}_1(L) &= \langle LXY - LYX \rangle\\
\{f_X, f_Y\}_2(L) &= \langle (LX)_+ (LY)_- - (XL)_+ (YL)_- \rangle - \frac{1}{n+1} \int dx\, (\partial^{-1}[L, X]_{-1})\, [L, Y]_{-1}
\end{aligned} \tag{10.39}$$
The first bracket is the Kostant–Kirillov bracket, eq. (10.32). We now prove that $\{\,,\}_2$ is a Poisson bracket. The antisymmetry can easily be checked using the cyclicity of Adler trace and isotropy of $\mathcal{P}_\pm$. To check the Jacobi identity, we change variables, from the coefficients $u_i$ of $L$, see eq. (10.30), to new variables $p_j$ in which the Jacobi identity is obvious. This change of variables is called the Miura transformation. The Poisson bracket $\{\,,\}_2$ becomes very simple in the variables $p_j$. This is the content of the Kupershmidt–Wilson theorem.
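For orientation, the simplest instance of the Miura transformation is $n = 1$. Assuming the factorized form of the Kupershmidt–Wilson theorem, $L = (\partial - p_2)(\partial - p_1)$ with $p_1 + p_2 = 0$, and writing $p = p_1$ (a shorthand introduced here), a direct expansion gives:

```latex
L = (\partial + p)(\partial - p)
  = \partial^2 - (\partial p) - p\,\partial + p\,\partial - p^2
  = \partial^2 - \big(p^2 + (\partial p)\big),
\qquad\text{so that}\qquad u = p^2 + (\partial p).
```

The single KdV field $u$ is thus a quadratic differential polynomial in the Miura variable $p$.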
Theorem. Let us write $L = L_{n+1} L_n \cdots L_1$, with $L_i = \partial - p_i$:
$$L = (\partial - p_{n+1})\cdot(\partial - p_n)\cdots(\partial - p_1), \qquad \sum_{i=1}^{n+1} p_i = 0 \tag{10.40}$$
Let us define a Poisson bracket on the functions $p_k(x)$ by:
$$\{p_k(x), p_l(y)\} = \Big(\delta_{kl} - \frac{1}{n+1}\Big)\,\delta'(x - y) \tag{10.41}$$
Then we have $\{f_X, f_Y\}(L) = \{f_X, f_Y\}_2(L)$. This shows that $\{\,,\}_2$ satisfies the Jacobi identity.

Proof. Viewing $f_X(L)$ as a function of $p_k$, we have:
$$\{f_X, f_Y\}(L) = \sum_{k,l} \int dx\, dy\, \frac{\delta f_X}{\delta p_k(x)}\, \{p_k(x), p_l(y)\}\, \frac{\delta f_Y}{\delta p_l(y)}$$
We now check that inserting eq. (10.41) into this formula reproduces the second Poisson structure. Define the operators:
$$L_{ij} = L_i L_{i-1} \cdots L_j \quad \text{for } i \geq j, \qquad L_{01} = 1, \quad L_{n+1,n+2} = 1 \tag{10.42}$$
From the expression of $f_X$, we have:
$$\delta f_X / \delta p_k(x) = -(L_{k-1,1}\, X\, L_{n+1,k+1})_{-1}(x)$$
Thus,
$$\{f_X, f_Y\}(L) = \sum_{k,l} \int dx\, dy\, (L_{k-1,1} X L_{n+1,k+1})_{-1}(x) \times \{p_k(x), p_l(y)\} \times (L_{l-1,1} Y L_{n+1,l+1})_{-1}(y)$$
Using the Poisson bracket of the $p_k$, this is rewritten as:
$$\{f_X, f_Y\}(L) = E_1 - E_2 \tag{10.43}$$
where $E_1$ is produced by the Kronecker delta in eq. (10.41) while $E_2$ is produced by the $1/(n+1)$ term. We have:
$$\begin{aligned}
E_1 &= \sum_{k=1}^{n+1} \int dx\, \big(\partial (L_{k-1,1} X L_{n+1,k+1})_{-1}\big)\, (L_{k-1,1} Y L_{n+1,k+1})_{-1}\\
E_2 &= \frac{1}{n+1} \sum_{k,l=1}^{n+1} \int dx\, \big(\partial (L_{k-1,1} X L_{n+1,k+1})_{-1}\big)\, (L_{l-1,1} Y L_{n+1,l+1})_{-1}
\end{aligned}$$
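We can write $\partial (L_{k-1,1} X L_{n+1,k+1})_{-1} = ([L_k, L_{k-1,1} X L_{n+1,k+1}])_{-1}$. This is because $L_k = \partial - p_k$ and one checks that for any pseudo-differential $f = \sum_n f_n\partial^n$ one has $[\partial, f]_{-1} = (\partial f_{-1})$ and $[p_k, f]_{-1} = 0$. Thus:
$$E_1 = \sum_{k=1}^{n+1} \int dx\, \Big[ (L_{k,1} X L_{n+1,k+1})_{-1} (L_{k-1,1} Y L_{n+1,k+1})_{-1} - (L_{k-1,1} X L_{n+1,k})_{-1} (L_{k-1,1} Y L_{n+1,k+1})_{-1} \Big] \tag{10.44}$$
$$E_2 = \frac{1}{n+1} \sum_{k,l=1}^{n+1} \int dx\, \Big[ (L_{k,1} X L_{n+1,k+1})_{-1} (L_{l-1,1} Y L_{n+1,l+1})_{-1} - (L_{k-1,1} X L_{n+1,k})_{-1} (L_{l-1,1} Y L_{n+1,l+1})_{-1} \Big]$$
We use the identity, true for all $p$,
$$\int dx\, (U)_{-1} (V)_{-1} = \langle U_- (\partial - p) V_- \rangle = \langle (\partial - p)\, U_- V_- \rangle \tag{10.45}$$
to rewrite the expression $E_1$ as:
$$E_1 = \sum_{k=1}^{n+1} \Big\langle (L_{k,1} X L_{n+1,k+1})_-\, L_k\, (L_{k-1,1} Y L_{n+1,k+1})_- - L_k\, (L_{k-1,1} X L_{n+1,k})_-\, (L_{k-1,1} Y L_{n+1,k+1})_- \Big\rangle$$
Next replace all the terms $U_-$ by $U - U_+$, to get
$$E_1 = \sum_{k=1}^{n+1} \Big\langle -(L_{k,1} X L_{n+1,k+1})_+\, L_{k,1} Y L_{n+1,k+1} + (L_{k-1,1} X L_{n+1,k})_+\, L_{k-1,1} Y L_{n+1,k} \Big\rangle$$
In the sum, the terms cancel two by two, and we are left with
$$E_1 = \langle (XL)_+\, YL - (LX)_+\, LY \rangle$$
We now turn to the summation over $k$ in $E_2$. Again, terms cancel two by two and we are left with:
$$\begin{aligned}
E_2 &= \frac{1}{n+1} \sum_{l=1}^{n+1} \int dx\, [X, L]_{-1}\, (L_{l-1,1} Y L_{n+1,l+1})_{-1}\\
&= \frac{1}{n+1} \sum_{l=1}^{n+1} \big\langle [X, L]_{-1}\, L_{l-1,1} Y L_{n+1,l+1} \big\rangle\\
&= \frac{1}{n+1} \sum_{l=1}^{n+1} \big\langle L_{n+1,l+1}\, [X, L]_{-1}\, L_{l-1,1} Y \big\rangle
\end{aligned}$$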
But we have $[L, f] = \sum_{l=1}^{n+1} L_{n+1,l+1}\,(\partial f)\,L_{l-1,1}$ for any function $f$. Thus the sum can finally be written as:
$$E_2 = \frac{1}{n+1} \int dx\, (\partial^{-1}[X, L]_{-1})\, [Y, L]_{-1}$$
This completes the proof that $\{\,,\} = \{\,,\}_2$. Since $\{\,,\}$ obviously satisfies the Jacobi identity, so does $\{\,,\}_2$.

Proposition. The Hamiltonians $H_k$ are in involution with respect to both Poisson brackets.

Proof. We already know that $\{H_k, H_l\}_1 = 0$. Using the recursion relation eq. (10.37), we have:
$$\{H_k, H_l\}_2 = \langle D_2\, dH_k, dH_l \rangle = \langle D_1\, dH_{n+1+k}, dH_l \rangle = \{H_{n+1+k}, H_l\}_1 = 0$$
Example. For $n = 1$, we have $L = \partial^2 - u$ with $u = p^2 + p'$. The Poisson bracket eq. (10.41) becomes $\{p(x), p(y)\} = \frac{1}{2}\,\delta'(x - y)$. This implies
$$\{u(x), u(y)\}_2 = \Big[u(x)\,\partial_x + \partial_x u(x) - \frac{1}{2}\,\partial_x^3\Big]\,\delta(x - y)$$
We recognize the Virasoro algebra (see Chapter 11).

10.8 Bihamiltonian structure

The two Poisson brackets $\{\,,\}_1$ and $\{\,,\}_2$ have a remarkable compatibility property, called the Magri compatibility condition.

Proposition. The two Poisson structures $\{\,,\}_1$ and $\{\,,\}_2$ are compatible, in the sense that for any $\lambda_1, \lambda_2$ the application $(f, g) \to \lambda_1\{f, g\}_1 + \lambda_2\{f, g\}_2$ is a Poisson bracket.

Proof. The condition that the sum of two Poisson brackets $\{f, g\}_1 = \sum_{ij} P^{(1)}_{ij}\,\partial_i f\,\partial_j g$ and $\{f, g\}_2 = \sum_{ij} P^{(2)}_{ij}\,\partial_i f\,\partial_j g$ satisfies the Jacobi identity reads:
$$\Big(P^{(1)}_{il}\,\partial_l P^{(2)}_{jk} + P^{(2)}_{il}\,\partial_l P^{(1)}_{jk}\Big)\,\partial_i f\,\partial_j g\,\partial_k h + \text{cyc. perm.} = 0$$
Since no second order derivatives occur, it is sufficient to check it on the linear functions $f = f_X$, $g = f_Y$, and $h = f_Z$. The condition becomes:
$$\big\langle D_1(X), d\langle D_2(Y), Z \rangle \big\rangle + \big\langle D_2(X), d\langle D_1(Y), Z \rangle \big\rangle + \text{cyc. perm.} = 0$$
where $d$ is the differential on phase space. We have:
\[
\begin{aligned}
d\langle D_1(X),Y\rangle &= [X,Y]\\
d\langle D_2(X),Y\rangle &= X(LY)_- - (YL)_-X + (XL)_-Y - Y(LX)_- + (YLX-XLY)_-\\
&\quad + \frac{1}{n+1}\Big([(\partial^{-1}[Y,L]_{-1}),X] - [(\partial^{-1}[X,L]_{-1}),Y]\Big)
\end{aligned}
\]
Inserting this into the above condition, one verifies that it is indeed satisfied.

One of the main advantages of bihamiltonian structures is that they automatically produce commuting Hamiltonians, as we now explain. Let $\{\ ,\ \}_1$ and $\{\ ,\ \}_2$ be two compatible Poisson brackets, and consider the linear combination
\[
\{\ ,\ \}_\lambda = \{\ ,\ \}_1 - \lambda\{\ ,\ \}_2
\]
Let us assume the existence of $H_\lambda$, a Casimir function of the Poisson bracket $\{\ ,\ \}_\lambda$. This means
\[
\{H_\lambda,f\}_\lambda = 0\quad\forall f \tag{10.46}
\]
Suppose that we can expand $H_\lambda = \sum_{n=0}^{\infty}H_n\lambda^n$. Then the above relation gives
\[
\{H_0,f\}_1 = 0,\qquad \{H_n,f\}_1 = \{H_{n-1},f\}_2,\quad\forall f \tag{10.47}
\]
This shows that the flows generated by the $H_n$ are related by recursion relations of the Lenard type. Moreover, the $H_n$ are in involution with respect to both Poisson brackets $\{\ ,\ \}_{1,2}$. The proof is done by induction. First, by eq. (10.47), one has $\{H_0,H_n\}_1 = 0$, $\forall n$. Suppose that we have shown that $\{H_m,H_n\}_1 = 0$, $\forall n$; then we have $\{H_{m+1},H_n\}_1 = 0$, $\forall n$. This is because, using the recursion relations, we have
\[
\{H_{m+1},H_n\}_1 = \{H_m,H_n\}_2 = -\{H_n,H_m\}_2 = -\{H_{n+1},H_m\}_1 = 0
\]
Hence the $H_n$ are in involution with respect to the first Poisson bracket. But by the recursion relations, we have $\{H_m,H_n\}_2 = \{H_{m+1},H_n\}_1 = 0$, so they are also in involution with respect to the second bracket. All this means that bihamiltonian structures are consubstantial to integrable systems. The unique feature of the present situation is that both Poisson brackets are local.

10.9 The Drinfeld–Sokolov reduction

The two compatible Poisson brackets $\{\ ,\ \}_1$ and $\{\ ,\ \}_2$ have a nice Lie algebraic interpretation, which we now explain. They can be obtained,
through Hamiltonian reduction, from the Kostant–Kirillov bracket on coadjoint orbits of central extensions of loop algebras. Consider the loop algebra of traceless $(n+1)\times(n+1)$ matrices $U(x)$, with matrix elements functions of $x$. Let $G$ be the central extension of this loop algebra. It consists of pairs $(U(x),u)$, also denoted by $U(x)+u\,K$, where $u$ is a number, and $K$ is called the central element. The commutator on $G$ is defined as:
\[
[(U(x),u),(V(x),v)] \equiv \big([U(x),V(x)],\ \omega(U,V)\big)
\]
where in the right-hand side, $[U(x),V(x)]$ is the loop algebra commutator and $\omega(U,V)$ is a bilinear antisymmetric form. The central element $K=(0,1)$ commutes with everything. The Jacobi identity for this bracket reduces to the cocycle condition on $\omega$:
\[
\omega([U,V],W) + \omega([W,U],V) + \omega([V,W],U) = 0
\]
Note that this is a linear condition on $\omega$, so the sum of two cocycles is a cocycle. Trivial cocycles are given by $\Sigma([U,V])$ where $\Sigma$ is a linear form on the loop algebra. Such a linear form can be written as
\[
\Sigma(U) = \int dx\,\mathrm{Tr}\,\big(\Sigma(x)U(x)\big)
\]
where we used the natural invariant scalar product on the loop algebra:
\[
(U,V) = \int dx\,\mathrm{Tr}\,\big(U(x)V(x)\big)
\]
The standard non-trivial cocycle is:
\[
\omega_0(U,V) = \int dx\,\mathrm{Tr}\,\big(U(x)\,\partial_xV(x)\big)
\]
and we can take for $\omega$ any linear combination
\[
\omega(U,V) = \omega_0(U,V) + \Sigma([U,V])
\]
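As an illustration, the cocycle condition and the antisymmetry of $\omega_0$ can be checked exactly on sample data. The sketch below is only a sanity check under stated assumptions, not part of the construction: loops are taken $2\pi$-periodic and complex valued, they are stored as dictionaries of Fourier modes of $2\times 2$ traceless integer matrices, and all function names (`loop_mul`, `loop_comm`, `omega0`, `random_loop`) are hypothetical. With $U=\sum_k U_k e^{ikx}$, the integral $\int_0^{2\pi}dx\,\mathrm{Tr}(U\partial_xV)$ equals $2\pi i\sum_k k\,\mathrm{Tr}(U_{-k}V_k)$, so the code works with the integer $\sum_k k\,\mathrm{Tr}(U_{-k}V_k)$.

```python
import random

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def loop_mul(U, V):
    """Pointwise product of two loops given by Fourier modes {k: 2x2 matrix}."""
    out = {}
    for k, A in U.items():
        for l, B in V.items():
            P = mat_mul(A, B)
            acc = out.setdefault(k + l, [[0, 0], [0, 0]])
            for i in range(2):
                for j in range(2):
                    acc[i][j] += P[i][j]
    return out

def loop_comm(U, V):
    """Loop algebra commutator [U, V], mode by mode."""
    UV, VU = loop_mul(U, V), loop_mul(V, U)
    zero = [[0, 0], [0, 0]]
    return {k: [[UV.get(k, zero)[i][j] - VU.get(k, zero)[i][j] for j in range(2)]
                for i in range(2)]
            for k in set(UV) | set(VU)}

def omega0(U, V):
    """Integral over [0, 2*pi] of Tr(U dV/dx), divided by 2*pi*i (an exact integer here)."""
    total = 0
    for k, B in V.items():
        A = U.get(-k)
        if A is not None:
            P = mat_mul(A, B)
            total += k * (P[0][0] + P[1][1])
    return total

def random_loop(rng, max_mode=2):
    """Hypothetical helper: random traceless integer modes for k = -max_mode..max_mode."""
    loop = {}
    for k in range(-max_mode, max_mode + 1):
        a, b, c = (rng.randint(-5, 5) for _ in range(3))
        loop[k] = [[a, b], [c, -a]]
    return loop

rng = random.Random(0)
U, V, W = (random_loop(rng) for _ in range(3))

# Cocycle condition: omega0([U,V],W) + omega0([W,U],V) + omega0([V,W],U) = 0
cocycle = (omega0(loop_comm(U, V), W) + omega0(loop_comm(W, U), V)
           + omega0(loop_comm(V, W), U))
# Antisymmetry: omega0(U,V) = -omega0(V,U)
antisymmetry = omega0(U, V) + omega0(V, U)

# A loop pair on which omega0 does not vanish: U = H e^{-ix}, V = H e^{ix}
H = [[1, 0], [0, -1]]
sample = omega0({-1: H}, {1: H})
```

Both `cocycle` and `antisymmetry` vanish identically because the integrands are total derivatives; the last line only shows that $\omega_0$ itself is not identically zero.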
The dual $G^*$ of $G$ can be identified with $G$ using the non-degenerate bilinear form $(U+u\,K,\,V+v\,K) = (U,V)+uv$. Then the coadjoint action of $G$ on $G^*$ reads:
\[
\mathrm{ad}^*_{(V,v)}(U,u) = \big(u\,\partial_xV - [U+\Sigma,V],\ 0\big) \tag{10.48}
\]
To see it, we apply the definition
\[
\big(\mathrm{ad}^*_{(V,v)}(U,u)\big)(W,w) = -\big((U,u),[(V,v),(W,w)]\big) = -(U,[V,W]) - u\,\omega(V,W)
= \int dx\,\mathrm{Tr}\,\Big(\big(-[U,V] + u\,\partial V - [\Sigma,V]\big)W\Big)
\]
We see that $u$ is invariant under the coadjoint action eq. (10.48), so in the following we fix it to the value $u=1$. The coadjoint action of $(V,v)$ becomes a gauge transformation on the operator $\partial-U-\Sigma$, namely:
\[
\mathrm{ad}^*_{(V,v)}(U,1) = (\delta U,0),\qquad\text{with}\quad \delta U = \partial V - [U+\Sigma,V] \tag{10.49}
\]
By construction any orbit of the gauge action in $G^*$ is equipped with an invariant symplectic form, the Kostant–Kirillov form. Explicitly, at the point $U$, the induced Poisson bracket reads:
\[
\{f,g\}(U) = \big(\mathrm{ad}^*_{df}(U)\big)(dg) = \big(\partial df - [U,df],\ dg\big) - \big(\Sigma,\ [df,dg]\big) \tag{10.50}
\]
where the differentials $df$ and $dg$ of functions on $G^*$ are viewed as elements of $G$. The two terms in the right-hand side of eq. (10.50) obviously satisfy the Jacobi identity separately, and so does their sum, hence they define two compatible Poisson brackets. We show below that, with a proper choice of $\Sigma$ and an appropriate symplectic reduction, these two Poisson brackets reduce to the brackets $\{\ ,\ \}_1$ and $\{\ ,\ \}_2$ considered in the previous section, which are then compatible by construction.

We choose to reduce by a subgroup of the loop group, namely the loop group $N_-$ of lower triangular matrices with 1 on the diagonal. The dual of its Lie algebra can be identified with the loop algebra $N_+$ of strictly upper triangular matrices. The Hamiltonian which generates the gauge action by $(V,v)$ is simply $H_{(V,v)}(U,u) = (U,V)$ as in the general theory, see Chapter 14. Alternatively, this follows from eq. (10.50) with $f_V(U) = (U,V)$ and $df_V = V$. The moment at the point $(U,1)$ is $P(U) = P_{N_+}U$. To perform the Hamiltonian reduction, we must set $P(U)$ to a fixed value $\mu\in N_+$, which determines the nature of the reduced symplectic manifold. We take:
\[
\mu = \sum_{i=1}^{n}E_{i,i+1} = E_+ \in N_+
\]
where $E_+$ is the sum of simple root vectors of the Lie algebra $sl(n+1)$ in the vector representation. So far, we did not specify the form $\Sigma(U)$ in the central extension we considered. We require now that $\Sigma\in N_-$, and more precisely:
\[
\Sigma = \alpha\,\Sigma_0,\qquad\text{with}\quad \Sigma_0 = E_{n+1,1}\in N_- \tag{10.51}
\]
This choice will lead to the Poisson bracket $\{\ ,\ \}_1$. It has the important property that $[\Sigma,V]=0$ for any $V\in N_-$, so that the coadjoint action of
$N_-$ reduces to a gauge action on $\partial-U$. Moreover, the stability group of $\mu$ is the whole group $N_-$, as we now show. The variation of the moment under the coadjoint action of $N_-$ is given by $\delta_V\mu = P_{N_+}(\partial V - [\mu+\Sigma,V])$ where $V\in N_-$. Due to the specific form of the momentum $\mu$, the commutator $[\mu,V]$ cannot have matrix elements above the diagonal and is killed by the projection on $N_+$. Similarly, $\partial V$ is lower triangular and does not survive the projection. Finally, we recall that $[\Sigma,V]=0$ for the specific choice of $\Sigma$ we made. Altogether $\delta_V\mu = 0$.

The matrices $U(x)$ such that $P_{N_+}U = \mu$ have the form $U(x) = B(x)+\mu$, where $B(x)$ is a lower triangular matrix, including the diagonal. The reduced phase space is obtained by quotienting by the group $N_-$ with group action $U\to\delta U = \partial V + [U,V]$ with $V\in N_-$. Note that this leaves the form of $U$ invariant. Alternatively, the reduced phase space can be identified with the set of differential operators $\partial-B-\mu$ quotiented by the group $N_-$ acting by gauge transformations. One can use this gauge action to bring $\partial-B-\mu$ to either one of the two forms (note that $B$ has $(n+1)$ more parameters than $N_-$):
\[
D_p = \partial - \begin{pmatrix} p_1 & 1 & 0 & \cdots & 0\\ 0 & p_2 & 1 & \cdots & 0\\ \vdots & & \ddots & \ddots & \vdots\\ 0 & \cdots & 0 & p_n & 1\\ 0 & \cdots & \cdots & 0 & p_{n+1}\end{pmatrix},\qquad
D_u = \partial - \begin{pmatrix} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & & & \ddots & \vdots\\ 0 & \cdots & \cdots & 0 & 1\\ u_0 & u_1 & \cdots & \cdots & u_n\end{pmatrix} \tag{10.52}
\]
Since we consider the loop algebra of traceless matrices we have $\sum_ip_i = 0$ and $u_n = 0$. The sets $(p_1,\ldots,p_{n+1})$ and $(u_0,\ldots,u_n)$ constitute two different coordinate systems on the reduced phase space $F_\mu = M_\mu/N_-$.

Note that with any point in $M_\mu$, i.e. with any matrix differential operator $D = \partial-B-\mu$, one can associate a scalar differential operator of order $(n+1)$. To do that we consider the matrix differential equation $D\Psi = 0$ and write the differential equation of order $(n+1)$ induced on the first component $\psi_1$ of $\Psi$, which is of the form $L\psi_1 = 0$.
Since the group action of $N_-$ on $\Psi$ leaves $\psi_1$ invariant, this differential equation is invariant under gauge transformations, so the coefficients of the equation are invariant functions under $N_-$. For the two particular forms $D_p$, $D_u$, we get:
\[
L = (\partial-p_{n+1})\cdots(\partial-p_1),\qquad L = \partial^{n+1} - u_{n-1}\partial^{n-1} - \cdots - u_1\partial - u_0
\]
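For $n=1$ the elimination can be followed by hand: the first row of $D_p\Psi=0$ gives $\psi_2 = (\partial-p_1)\psi_1$, the second row gives $(\partial-p_2)\psi_2=0$, and with $p_1=p$, $p_2=-p$ the resulting operator is $\partial^2-(p^2+p')$, in agreement with the Miura form $u=p^2+p'$. The sketch below checks this on sample polynomial data; it is an exact check on one example, not a proof. Polynomials in $x$ are stored as coefficient lists, and the helper names (`padd`, `pmul`, `pdiff`, `apply_factorized`) are hypothetical.

```python
# Polynomials in x are coefficient lists: [c0, c1, c2, ...] means c0 + c1*x + c2*x**2 + ...
def padd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def pscale(c, a):
    return [c * t for t in a]

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, s in enumerate(a):
        for j, t in enumerate(b):
            out[i + j] += s * t
    return out

def pdiff(a, n=1):
    for _ in range(n):
        a = [k * a[k] for k in range(1, len(a))]
    return a

def apply_factorized(ps, psi):
    """Apply (d - p_{n+1}) ... (d - p_1) to psi, where ps = [p_1, ..., p_{n+1}].

    Each step is one row of D_p Psi = 0: psi_{k+1} = (d - p_k) psi_k.
    """
    for p in ps:
        psi = padd(pdiff(psi), pscale(-1, pmul(p, psi)))
    return psi

p = [1, 2, 0, -1]        # sample p(x) = 1 + 2x - x^3
psi1 = [2, 0, 1, 3]      # sample test function psi_1(x)

# Gauge D_p with n = 1: the traceless condition gives p_1 = p, p_2 = -p
from_Dp = apply_factorized([p, pscale(-1, p)], psi1)

# Gauge D_u with n = 1: L = d^2 - u_0, with the Miura relation u_0 = p^2 + p'
u0 = padd(pmul(p, p), pdiff(p))
from_Du = padd(pdiff(psi1, 2), pscale(-1, pmul(u0, psi1)))

difference = padd(from_Dp, pscale(-1, from_Du))
```

The list `difference` contains only zeros, so both gauges define the same scalar operator acting on the test function. For larger $n$, `apply_factorized` can be called with a longer list of $p_k$ whose sum vanishes.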
It remains to express the reduced Poisson bracket in terms of the invariant operator L. We recall that the reduced bracket of invariant
functions can be computed straightforwardly using the Poisson bracket on the unreduced phase space, see Chapter 14. We take as invariant functions the functions $f_X = \langle LX\rangle$, where $X$ is any pseudo-differential operator in $P_-$, and $\langle\ \rangle$ is the Adler trace. Hence we have at the point $(p+\mu)$ (where $p$ is diagonal)
\[
\{f_X,f_Y\}_{\rm reduced} = \big(\mathrm{ad}^*_{df_X}(p+\mu),\ df_Y\big) \tag{10.53}
\]
Separating the $\omega_0$ and $\Sigma$ parts in the cocycle definition, we can write also $\{f_X,f_Y\}_{\rm reduced} = \{f_X,f_Y\}_{\omega_0} + \{f_X,f_Y\}_\Sigma$ with
\[
\{f_X,f_Y\}_{\omega_0} = \big(\partial df_X - [p+\mu,df_X],\ df_Y\big),\qquad \{f_X,f_Y\}_\Sigma = -\big([\Sigma,df_X],\ df_Y\big)
\]
To compute $df_X$ and $df_Y$, we first need to compute the variation of $L$ when $p+\mu\to p+\mu+b$, where $b$ is a small lower triangular matrix (including diagonal). Writing the system $(\partial-p-b-\mu)\Psi = 0$ in terms of $\psi_1$ and keeping only terms of first order in $b$ we find, with the notations of eq. (10.42):
\[
\delta L = -\sum_{i\geq j}L_{n+1,i+1}\,b_{ij}\,L_{j-1,1}
\]
The differential $df_X$ is defined by the relation $(df_X,b) = \langle\delta L\,X\rangle$ so that $df_X$ is the upper triangular traceless matrix:
\[
(df_X)_{ji} = -\big(L_{j-1,1}XL_{n+1,i+1}\big)_{-1} + \frac{1}{n+1}\sum_k\big(L_{k-1,1}XL_{n+1,k+1}\big)_{-1}\,\delta_{ij} \tag{10.54}
\]
We are now in a position to prove the main result of this section:

Proposition. One has
\[
\{f_X,f_Y\}_{\omega_0} = \{f_X,f_Y\}_2,\qquad \{f_X,f_Y\}_\Sigma = \{f_X,f_Y\}_1
\]
Proof. We start with the coadjoint action:
\[
\mathrm{ad}^*_{df_X}(p+\mu) = \partial df_X - [p+\mu+\Sigma,\ df_X]
\]
Noting that $df_Y$ is upper triangular in eq. (10.53), we need only keep the lower triangular part in $\mathrm{ad}^*_{df_X}(p+\mu)$. Using eq. (10.49) (with $V = df_X$ and $U = p+\mu$), we remark that the $\Sigma$ independent term in this expression is upper triangular, so that we need only keep the diagonal part in this term, that is $(df_X)_{kk}$. We get:
\[
\{f_X,f_Y\}_{\rm reduced} = \sum_k\int dx\,\big(\partial(df_X)_{kk}\big)(df_Y)_{kk} - \big([\Sigma,df_X],\ df_Y\big)
\]
The first term is just $\{f_X,f_Y\}_{\omega_0}$. Substituting the expressions of $df_X$ and $df_Y$ it immediately yields eq. (10.43), which shows that it coincides with the bracket $\{\ ,\ \}_2$. This also shows that the Poisson bracket of the diagonal coordinates $p_i(x)$ is given by eq. (10.41).

We now look at the $\Sigma$ dependent term which is $\{f_X,f_Y\}_\Sigma$ and show that it reproduces the bracket $\{\ ,\ \}_1$. Due to the choice $\Sigma = \alpha\Sigma_0$ we have to compute:
\[
\big([\Sigma_0,df_X],\ df_Y\big) = \int dx\sum_{j=1}^{n+1}\Big[(df_X)_{1,j}(df_Y)_{j,n+1} - (df_Y)_{1,j}(df_X)_{j,n+1}\Big]
\]
Inserting the expressions eq. (10.54) for $df_X$ and $df_Y$, and noting that the terms proportional to the Kronecker deltas do not contribute, we get:
\[
\big([\Sigma_0,df_X],\ df_Y\big) = \int dx\sum_{j=1}^{n+1}\Big[(XL_{n+1,j+1})_{-1}(L_{j-1,1}Y)_{-1} - (YL_{n+1,j+1})_{-1}(L_{j-1,1}X)_{-1}\Big]
\]
Using eq. (10.45), this can be rewritten in terms of Adler traces:
\[
\big([\Sigma_0,df_X],\ df_Y\big) = \sum_{j=1}^{n+1}\Big\langle (XL_{n+1,j+1})_-\,L_j\,(L_{j-1,1}Y)_- - (YL_{n+1,j+1})_-\,L_j\,(L_{j-1,1}X)_-\Big\rangle
\]
Substituting everywhere $U_- = U-U_+$ and using the isotropy of $P_\pm$ so that $\langle UV_+\rangle = \langle U_-V\rangle$, this reads:
\[
\big([\Sigma_0,df_X],\ df_Y\big) = \sum_{j=1}^{n+1}\Big\langle XLY - (XL_{n+1,j})_-\,L_{j-1,1}Y - (XL_{n+1,j+1})_+\,L_{j,1}Y\Big\rangle - (X\leftrightarrow Y)
\]
In the sum the first term yields $(n+1)\langle XLY\rangle$, while the second and third terms regroup themselves as
\[
-\sum_{j=2}^{n+1}\Big\langle\big((XL_{n+1,j})_- + (XL_{n+1,j})_+\big)L_{j-1,1}Y\Big\rangle = -n\,\langle XLY\rangle
\]
and we finally get:
\[
-\big([\Sigma_0,df_X],\ df_Y\big) = -\langle XLY\rangle + \langle YLX\rangle = \langle L[X,Y]\rangle = \{f_X,f_Y\}_1
\]
This construction, due to Drinfeld and Sokolov, can be generalized to Lie algebras other than $sl(n+1)$, replacing $E_+$ by the sum of simple root vectors and $\Sigma_0$ by the root vector $E_{-\alpha}$, where $\alpha$ is the longest root. This also shows the nice interplay between different Lie algebra structures (the one induced by the algebra of pseudo-differential operators, and the Kac–Moody one) producing the same Kostant–Kirillov Poisson brackets, after suitable Hamiltonian reduction.

10.10 Whitham equations

In many cases, solutions of non-linear partial differential equations take the form of modulated wavetrains, i.e. at small scale they look like sinusoidal solutions, but at large scale the parameters of the sinusoid slowly evolve. Whitham equations describe the slow variations of these parameters. It turns out that algebro-geometric solutions of KP are particularly well suited examples of Whitham analysis. In the algebro-geometric solutions, the field $u = -2q_{-1}$ of the KP hierarchy is of the form:
\[
u(t) = u_0\Big(\sum_it_iU^{(i)}(m),\ m\Big)
\]
where $t_i$ are the KP time variables and $m$ denotes the moduli of the Riemann surface, $\Gamma$, used to build the solution. The quantities $U^{(i)}(m)$, defined in eq. (10.17), are functions of the moduli only. The vector
\[
V = \sum_it_iU^{(i)}(m) \tag{10.55}
\]
lives on the Jacobian of $\Gamma$ and $u(t)$ is a pseudo-periodic function of each time $t_i$.

We now look for solutions of KP, close to these algebro-geometric solutions, but where the moduli $m$ slowly evolve. To describe the slow modulation, we introduce the large scale variables $T_i = \epsilon t_i$, and we express the idea of a modulated wavetrain by searching for $u(t)$ in the form
\[
u(t) = u_0\big(\epsilon^{-1}S(T),\ m(T)\big) + \epsilon\,u_1(T) + \cdots,\qquad T_i = \epsilon t_i \tag{10.56}
\]
Our purpose is to find the equations for $S(T)$ and $m(T)$ such that $u(t)$ in eq. (10.56) is a solution of the KP equation to first order in $\epsilon$, valid over a time scale $t\sim\epsilon^{-1}$. This means that the first order term must remain uniformly bounded over a period of time $\epsilon^{-1}$.

To this aim, we take advantage of the fact that the time dependence in the algebro-geometric solution is entirely contained in the variable $V$, eq. (10.55), so
that we can write it explicitly as a function of $V$ and $m$. Once this is done, we consider $V$ and $m$ as independent variables. We postulate the equations of motion for $V = \epsilon^{-1}S$:
\[
\partial_{T_i}S = U^{(i)}(m(T)) \tag{10.57}
\]
These equations are obviously satisfied for the modulated solutions. As a consequence, we can write the time derivatives of $u(t)$ in eq. (10.56) to order $\epsilon$ as
\[
\partial_{t_i}u(t) = U^{(i)}\cdot\partial_Vu_0 + \epsilon\,\big(\partial_{T_i}u_0 + \partial_{t_i}u_1(t)\big) \tag{10.58}
\]
where the slow time derivatives are defined by:
\[
\partial_{T_i}u_0 = \sum_j(\partial_{T_i}m_j)\,\partial_{m_j}u_0
\]
They come from the variation of the moduli only. Note that eq. (10.57) already imposes constraints on the time evolution of the moduli. Specifically, integrability conditions of eq. (10.57) imply
\[
\partial_{T_i}U^{(j)} = \partial_{T_j}U^{(i)} \tag{10.59}
\]
The slow modulation equations will have to be compatible with these constraints.

The equations for the time evolution of the moduli we are aiming at are the Whitham equations, eqs. (10.77). The main idea of the derivation, which is rather long, is to average over fast oscillations and retain terms involving only slow modulations. In the algebro-geometric setting, the specific feature we use is that, the time flow being a linear motion on the Jacobian torus, by the ergodicity theorem we can replace the fast time average by an average over the torus.

Let us now start from the linear system satisfied by the Baker–Akhiezer function, and limit ourselves to the first three times:
\[
\partial_{t_2}\Psi = (Q^2)_+\Psi,\qquad \partial_{t_3}\Psi = (Q^3)_+\Psi
\]
or
\[
(\partial_{t_2}-L)\Psi = 0,\qquad L\equiv\partial_x^2 - u \tag{10.60}
\]
\[
(\partial_{t_3}-A)\Psi = 0,\qquad A\equiv\partial_x^3 - \frac32\,u\,\partial_x - v \tag{10.61}
\]
where we set $v = -\frac32\partial_xu - 3q_{-2}$, and we identify $t_1 = x$, $T_1 = X$. The compatibility condition of this system, $F\equiv\partial_{t_2}A - \partial_{t_3}L - [L,A] = 0$, is the KP equation.
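To see how the compatibility condition works out, it helps to compute the commutator $[L,A]$ explicitly. A direct expansion (a derived step, not taken from the text) gives $[L,A] = \big(\tfrac32u_{xx}-2v_x\big)\partial_x + \big(u_{xxx}-v_{xx}-\tfrac32uu_x\big)$, with no $\partial_x^2$ or higher terms. The sketch below checks this formula exactly on sample polynomial data in $x$, treating $v$ as an independent function; the helper names are hypothetical and the sample polynomials are arbitrary.

```python
from fractions import Fraction

# Polynomials in x are coefficient lists: [c0, c1, c2, ...] means c0 + c1*x + c2*x**2 + ...
def padd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def pscale(c, a):
    return [c * t for t in a]

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, s in enumerate(a):
        for j, t in enumerate(b):
            out[i + j] += s * t
    return out

def pdiff(a, n=1):
    for _ in range(n):
        a = [k * a[k] for k in range(1, len(a))]
    return a

def L_op(u, psi):
    """L psi = psi_xx - u psi, eq. (10.60)."""
    return padd(pdiff(psi, 2), pscale(-1, pmul(u, psi)))

def A_op(u, v, psi):
    """A psi = psi_xxx - (3/2) u psi_x - v psi, eq. (10.61)."""
    return padd(pdiff(psi, 3),
                padd(pscale(Fraction(-3, 2), pmul(u, pdiff(psi))),
                     pscale(-1, pmul(v, psi))))

def commutator_LA(u, v, psi):
    """[L, A] psi computed by applying the operators in both orders."""
    return padd(L_op(u, A_op(u, v, psi)), pscale(-1, A_op(u, v, L_op(u, psi))))

def expected_LA(u, v, psi):
    """((3/2) u_xx - 2 v_x) psi_x + (u_xxx - v_xx - (3/2) u u_x) psi."""
    c1 = padd(pscale(Fraction(3, 2), pdiff(u, 2)), pscale(-2, pdiff(v)))
    c0 = padd(pdiff(u, 3),
              padd(pscale(-1, pdiff(v, 2)),
                   pscale(Fraction(-3, 2), pmul(u, pdiff(u)))))
    return padd(pmul(c1, pdiff(psi)), pmul(c0, psi))

u = [0, 1, -1, 0, 2]         # sample u(x)
v = [3, 0, 1, 1]             # sample v(x), treated as independent of u
psi = [1, 2, 0, -1, 0, 1]    # sample test function

residual = padd(commutator_LA(u, v, psi), pscale(-1, expected_LA(u, v, psi)))
```

The list `residual` contains only zeros. Since $\partial_{t_2}A = -\tfrac32u_{t_2}\partial_x - v_{t_2}$ and $\partial_{t_3}L = -u_{t_3}$, setting the two coefficients of $F$ to zero gives $2v_x = \tfrac32(u_{t_2}+u_{xx})$ and $u_{t_3}-v_{t_2} = u_{xxx}-v_{xx}-\tfrac32uu_x$. Eliminating $v$ then yields $\partial_x\big(u_{t_3}-\tfrac14u_{xxx}+\tfrac32uu_x\big) = \tfrac34u_{t_2t_2}$, which reduces to the KdV equation when $u$ does not depend on $t_2$.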
Proposition. Let us denote by $L = L_0+\epsilon L_1+\cdots$ and $A = A_0+\epsilon A_1+\cdots$ the operators corresponding to the small perturbation eq. (10.56) of a finite-zone solution, with $L_1$, $A_1$ linear in $u_1$. To order $\epsilon$ the zero curvature condition reduces to:
\[
F_1 + \partial_{T_2}A_0 - \partial_{T_3}L_0 + L_0^{(1)}\partial_XA_0 - A_0^{(1)}\partial_XL_0 = 0 \tag{10.62}
\]
with $L_0^{(1)} = -2\partial_x$, $A_0^{(1)} = -3\partial_x^2+\frac32u_0$ and
\[
F_1 = \partial_{t_2}A_1 - \partial_{t_3}L_1 - [L_1,A_0] - [L_0,A_1]
\]
Proof. We suppose that the perturbed operators satisfy the zero curvature equation. Since $L = L_0+\epsilon L_1+\cdots$ and $A = A_0+\epsilon A_1+\cdots$, one has
\[
F = F_0 + \epsilon F_1 + \cdots \tag{10.63}
\]
It is important to realize, however, that the leading term $F_0$ also produces a correction of order $\epsilon$ due to the deformation of $\partial_{t_i}$, see eq. (10.58). To extract this term, let us write $L_0 = \sum_il_i\partial^i$, $A_0 = \sum_ja_j\partial^j$, with $l_i$, $a_j$ functions of $u_0$ and its derivatives. The product is
\[
L_0A_0 = \sum_{i,j,k}\binom{i}{k}\,l_i\,(\partial^ka_j)\,\partial^{i+j-k}
\]
The term of order $\epsilon$ induced by eq. (10.58) in $(\partial^ka_j)$ is $\epsilon\,k\,\partial_X\partial^{k-1}a_j$. Hence the first order term is equal to $-\epsilon L_0^{(1)}\partial_XA_0$, where $L_0^{(1)} = -\sum_{i>0}i\,l_i\,\partial^{i-1}$. One gets the similar contribution $-\epsilon A_0^{(1)}\partial_XL_0$ from the product $A_0L_0$, with a similar definition of $A_0^{(1)}$.

We will get rid of $F_1$ in eq. (10.63) by an averaging procedure, leading to a direct determination of the variation of the moduli in terms of the slow variables. To work conveniently with these averages, we introduce a definition. For $D$ a differential operator, we define the differential operators $D^{(j)}$ by
\[
(D^*f)\,g = \sum_{j\geq0}\partial^j\big(f\,D^{(j)}g\big) \tag{10.64}
\]
To show how this definition works, consider $D = a\partial^i$, so that $(D^*f)\cdot g = ((-1)^i\partial^i(af))\cdot g$. Let us write $\partial = \partial_1+\partial_2$, where $\partial_{1,2}$ act respectively on the first and second factor around the dot:
\[
\partial(f\cdot g) = \partial_1f\cdot g + f\cdot\partial_2g \equiv \partial f\cdot g + f\cdot\partial g
\]
This is just a way to encode the Leibniz rule. Then
\[
(D^*f)\cdot g = (\partial_2-\partial)^i\,af\cdot g = \sum_{j\geq0}(-1)^j\binom{i}{j}\,\partial^j\big(f\,a\,\partial^{i-j}g\big)
\]
So, for $D = a\partial^i$, we get $D^{(j)} = (-1)^j\binom{i}{j}\,a\,\partial^{i-j}$. In particular, by linearity, we get for any differential operator $D = \sum_ia_i\partial^i$
\[
D^{(0)} = D,\qquad D^{(1)} = -\sum_{i>0}i\,a_i\,\partial^{i-1}
\]
Note that the notation $D^{(1)}$ is consistent with the notation $L_0^{(1)}$, $A_0^{(1)}$ introduced earlier. We also have the identity:
\[
(D_1D_2)^{(j)} = \sum_{k=0}^{j}D_1^{(k)}D_2^{(j-k)} \tag{10.65}
\]
This follows from the binomial identity n m − k mn + m − k = j k p j−p p
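The defining relation eq. (10.64) can be tested directly for a monomial $D = a\partial^i$, using $D^*f = (-1)^i\partial^i(af)$ and $D^{(j)} = (-1)^j\binom{i}{j}a\partial^{i-j}$. The sketch below compares both sides exactly on sample polynomials; it is a check on examples rather than a proof, and the helper names (`lhs`, `rhs`, and the polynomial utilities) are hypothetical.

```python
from math import comb

# Polynomials in x are coefficient lists: [c0, c1, c2, ...] means c0 + c1*x + c2*x**2 + ...
def padd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def pscale(c, a):
    return [c * t for t in a]

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, s in enumerate(a):
        for j, t in enumerate(b):
            out[i + j] += s * t
    return out

def pdiff(a, n=1):
    for _ in range(n):
        a = [k * a[k] for k in range(1, len(a))]
    return a

def lhs(i, a, f, g):
    """(D* f) g for D = a d^i, with D* f = (-1)^i d^i (a f)."""
    return pscale((-1) ** i, pmul(pdiff(pmul(a, f), i), g))

def rhs(i, a, f, g):
    """Sum over j of d^j ( f D^(j) g ), with D^(j) = (-1)^j C(i, j) a d^(i-j)."""
    total = []
    for j in range(i + 1):
        Dj_g = pscale((-1) ** j * comb(i, j), pmul(a, pdiff(g, i - j)))
        total = padd(total, pdiff(pmul(f, Dj_g), j))
    return total

a = [1, 2, 0, 1]         # sample coefficient a(x)
f = [3, -1, 2]           # sample f(x)
g = [0, 1, 1, 0, 5]      # sample g(x)

# One residual per order i = 0..4; each should be the zero polynomial
residuals = [padd(lhs(i, a, f, g), pscale(-1, rhs(i, a, f, g))) for i in range(5)]
```

Every entry of `residuals` is a list of zeros. The case $j=1$ in `rhs` uses $\binom{i}{1}=i$, which is the origin of the formula $D^{(1)} = -\sum_{i>0}i\,a_i\partial^{i-1}$.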
Let $\Psi_0$ and $\Psi_0^*$ be the Baker–Akhiezer functions corresponding to the exact algebro-geometric solution $u_0$. Recall that $\Psi_0$ and $\Psi_0^*$ can be written in the form (see eq. (10.20))
\[
\Psi_0 = e^{\sum_it_iP^{(i)}(P)}\,\varphi(P,V,m),\qquad \Psi_0^* = e^{-\sum_it_iP^{(i)}(P)}\,\varphi^*(P,V,m) \tag{10.66}
\]
where $\varphi(P,V,m)$, $\varphi^*(P,V,m)$ are periodic, of period 1 in each component of $V$, and so bounded in $V$. We have introduced the notation
\[
P^{(i)}(P) = \int_{P_\infty}^{P}\Omega^{(i)}
\]
where the forms $\Omega^{(i)}$ have purely imaginary periods over any cycle. In these Baker–Akhiezer functions, we make the substitution $V\to S/\epsilon$. They satisfy the equations $(\partial_{t_2}-L_0)\Psi_0 = O(\epsilon)$ and $(\partial_{t_3}-A_0)\Psi_0 = O(\epsilon)$. The right-hand sides are not zero because in $L_0$ and $A_0$ we made the substitution $V\to S/\epsilon$.

Proposition. We have the identity
\[
\partial_{t_2}(\Psi_0^*A_1\Psi_0) - \partial_{t_3}(\Psi_0^*L_1\Psi_0) = \Psi_0^*F_1\Psi_0 + \sum_{j\geq1}\partial^j\Big(\Psi_0^*\big(A_0^{(j)}L_1 - L_0^{(j)}A_1\big)\Psi_0\Big) + O(\epsilon) \tag{10.67}
\]
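Proof. From eq. (10.23), and the renaming of the operators $Q^i$ as in eqs. (10.60, 10.61), we have
\[
\begin{aligned}
\partial_{t_2}(\Psi_0^*A_1\Psi_0) &= -(L_0^*\Psi_0^*)A_1\Psi_0 + \Psi_0^*\,\partial_{t_2}A_1\,\Psi_0 + \Psi_0^*A_1L_0\Psi_0 + O(\epsilon)\\
&= \Psi_0^*\big(\partial_{t_2}A_1 + [A_1,L_0]\big)\Psi_0 - \sum_{j\geq1}\partial^j\big(\Psi_0^*L_0^{(j)}A_1\Psi_0\big) + O(\epsilon)
\end{aligned}
\]
Writing the second term $\partial_{t_3}(\Psi_0^*L_1\Psi_0)$ in a similar way yields the result.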
We now take the average of eq. (10.67) over the times $t_1,t_2,t_3$. This average is taken over a time scale which is large compared to 1 but small compared to $1/\epsilon$. For the quantity $O$, we denote this average by
\[
\langle O\rangle_{t_i} = \frac{1}{2\ell}\int_{-\ell}^{\ell}O\,dt_i
\]
Over this time scale, the point $V$ describes in general an almost dense trajectory on the torus, so that the average can also be interpreted as an average on the torus. The time scales are chosen so that the moduli can be considered as constant in the averaging. In agreement with our hypothesis, we assume that $L_1$ and $A_1$ remain bounded when the $t_i$ evolve in an interval of order $O(\epsilon^{-1})$. Note that in eq. (10.67), the exponential factors cancel between $\Psi_0$ and $\Psi_0^*$. Since the average of derivatives of bounded functions vanishes, only one term survives in the averaging of eq. (10.67) and we get:
\[
\langle\Psi_0^*F_1\Psi_0\rangle_{t_1,t_2,t_3} = 0
\]
Hence, by averaging eq. (10.62), we get an equation, valid at order $\epsilon$:
\[
\Big\langle\Psi_0^*\big(\partial_{T_2}A_0 - \partial_{T_3}L_0 + L_0^{(1)}\partial_XA_0 - A_0^{(1)}\partial_XL_0\big)\Psi_0\Big\rangle_{t_1,t_2,t_3} = 0 \tag{10.68}
\]
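In this equation all quantities are computed with the exact algebro-geometric solution $u_0$. In the following we shall drop the suffix 0. The next two propositions are devoted to the computation of the various terms in this equation.

Proposition. With the parametrization eq. (10.66) of the Baker–Akhiezer function, we have:
\[
\langle\Psi^*\,\partial_{T_3}L\,\Psi\rangle = \partial_{T_3}P^{(2)}\,\langle\varphi^*\varphi\rangle + \partial_{T_3}U_j^{(2)}\,\langle\varphi^*\varphi_j\rangle + \partial_{T_3}U_j^{(1)}\,\langle\varphi^*\widetilde{L}^{(1)}\varphi_j\rangle \tag{10.69}
\]
\[
\langle\Psi^*\,\partial_{T_2}A\,\Psi\rangle = \partial_{T_2}P^{(3)}\,\langle\varphi^*\varphi\rangle + \partial_{T_2}U_j^{(3)}\,\langle\varphi^*\varphi_j\rangle + \partial_{T_2}U_j^{(1)}\,\langle\varphi^*\widetilde{A}^{(1)}\varphi_j\rangle \tag{10.70}
\]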
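and
\[
\big\langle\Psi^*\big(L^{(1)}\partial_XA - A^{(1)}\partial_XL\big)\Psi\big\rangle = \partial_XP^{(3)}\,\langle\Psi^*L^{(1)}\Psi\rangle - \partial_XP^{(2)}\,\langle\Psi^*A^{(1)}\Psi\rangle + \partial_XU_j^{(3)}\,\langle\varphi^*\widetilde{L}^{(1)}\varphi_j\rangle - \partial_XU_j^{(2)}\,\langle\varphi^*\widetilde{A}^{(1)}\varphi_j\rangle \tag{10.71}
\]
where $\varphi_j = \partial_{V_j}\varphi$ and $\widetilde{L}^{(1)} = e^{-\sum_it_iP^{(i)}}\,L^{(1)}\,e^{\sum_it_iP^{(i)}}$, and similarly for $\widetilde{A}^{(1)}$.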
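Proof. Let us choose two Riemann surfaces $\Gamma(m)$ and $\Gamma'\equiv\Gamma(m+\delta m)$. Comparison of functions defined on different Riemann surfaces requires a "connection". This is achieved by choosing a meromorphic function on each Riemann surface and keeping it fixed. We choose to keep $P^{(1)}$ fixed. Let $\Psi$ and $\Psi'$ be corresponding Baker–Akhiezer functions on $\Gamma$ and $\Gamma'$. Consider the expression
\[
\begin{aligned}
\partial_{t_2}(\Psi^*\Psi') &= -(L^*\Psi^*)\Psi' + \Psi^*L'\Psi' = \Psi^*L'\Psi' - \sum_{j\geq0}\partial^j\big(\Psi^*L^{(j)}\Psi'\big)\\
&= \Psi^*(L'-L)\Psi' - \sum_{j\geq1}\partial^j\big(\Psi^*L^{(j)}\Psi'\big)
\end{aligned}
\]
Subtracting the same equation for $\Psi' = \Psi$, we get
\[
\partial_{t_2}\big(\Psi^*(\Psi'-\Psi)\big) = \Psi^*(L'-L)\Psi' - \sum_{j\geq1}\partial^j\big(\Psi^*L^{(j)}(\Psi'-\Psi)\big)
\]
If $\Psi' = \Psi+\delta\Psi$, this gives:
\[
\partial_{t_2}(\Psi^*\delta\Psi) = (\Psi^*\,\delta L\,\Psi) - \sum_{j\geq1}\partial^j\big(\Psi^*L^{(j)}\delta\Psi\big) \tag{10.72}
\]
Now from eq. (10.66) we have
\[
\delta\Psi = e^{\sum_it_iP^{(i)}}\Big(\delta m\,\partial_m\varphi + \sum_{i,j}t_i\,\delta U_j^{(i)}\,\varphi_j + \sum_it_i\,\delta P^{(i)}\,\varphi\Big) \tag{10.73}
\]
where we recall that $\varphi_j = \partial_{V_j}\varphi$.

We now average eq. (10.72). In the left-hand side, in the average over $t_2$, the terms which do not contain an explicit factor $t_2$ vanish because they are the average of a derivative of a bounded function. The terms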
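containing an explicit factor $t_2$ are treated by first averaging over $t_2$ on the interval $[-\ell,\ell]$:
\[
\big\langle\partial_{t_2}\big(t_2f(t_1,t_2,t_3)\big)\big\rangle_{t_1,t_2,t_3} = \frac12\big\langle f(t_1,\ell,t_3) + f(t_1,-\ell,t_3)\big\rangle_{t_1,t_3} = \langle f\rangle
\]
where $\langle f\rangle$ means average on the torus. We treat similarly the average over $t_1 = x$ in the right-hand side. Note that since we have kept $P^{(1)}$ fixed, there is no $\delta P^{(1)}$ contribution. Interpreting $\delta$ as a small variation of the moduli $m$ in the direction $T_3$, we arrive at
\[
\langle\Psi^*\,\partial_{T_3}L\,\Psi\rangle = \partial_{T_3}P^{(2)}\,\langle\varphi\varphi^*\rangle + \partial_{T_3}U_j^{(2)}\,\langle\varphi_j\varphi^*\rangle + \partial_{T_3}U_j^{(1)}\,\langle\varphi^*\widetilde{L}^{(1)}\varphi_j\rangle
\]
where the last term comes from the term $j=1$ in eq. (10.72). In the same way, we get
\[
\langle\Psi^*\,\partial_{T_2}A\,\Psi\rangle = \partial_{T_2}P^{(3)}\,\langle\varphi\varphi^*\rangle + \partial_{T_2}U_j^{(3)}\,\langle\varphi_j\varphi^*\rangle + \partial_{T_2}U_j^{(1)}\,\langle\varphi^*\widetilde{A}^{(1)}\varphi_j\rangle
\]
This proves eqs. (10.69, 10.70).

To prove eq. (10.71), note that the vanishing of the curvature, $F=0$, implies $0 = (F^*g)f = \sum_j\partial^j(g\,F^{(j)}f)$, and we deduce that $F^{(j)} = 0$, $\forall j$. By eq. (10.65), this can be written as
\[
\partial_{t_3}L^{(j)} - \partial_{t_2}A^{(j)} + \sum_{k=0}^{j}\big[L^{(k)},A^{(j-k)}\big] = 0
\]
This relation implies, by performing the time derivatives with eqs. (10.60, 10.61), the identity
\[
\sum_{j\geq1}\partial^{j-1}\Big(\partial_{t_3}\big(\Psi^*L^{(j)}\Psi'\big) - \partial_{t_2}\big(\Psi^*A^{(j)}\Psi'\big)\Big) = \sum_{j\geq1}\partial^{j-1}\Big(\Psi^*L^{(j)}(A'-A)\Psi' - \Psi^*A^{(j)}(L'-L)\Psi'\Big)
\]
Averaging this equation with $\Psi' = \Psi+\delta\Psi$ we obtain:
\[
\big\langle\partial_{t_3}\big(\Psi^*L^{(1)}\delta\Psi\big) - \partial_{t_2}\big(\Psi^*A^{(1)}\delta\Psi\big)\big\rangle = \big\langle\Psi^*L^{(1)}\,\delta A\,\Psi - \Psi^*A^{(1)}\,\delta L\,\Psi\big\rangle
\]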
Indeed, the order zero term, i.e. $\Psi' = \Psi$, gives vanishing averages because it is always a derivative of a bounded function. The first order term (in $\delta\Psi$) produces potentially dangerous terms linear in the time variables. However, the averages vanish when $j\geq2$ because we have at least two derivatives. The average finally reduces to eq. (10.71) when we interpret $\delta = \partial_X$.

Using the results of this proposition, eq. (10.68) becomes:
\[
0 = \big(\partial_{T_2}P^{(3)} - \partial_{T_3}P^{(2)}\big)\langle\Psi^*\Psi\rangle + \partial_XP^{(3)}\,\langle\Psi^*L^{(1)}\Psi\rangle - \partial_XP^{(2)}\,\langle\Psi^*A^{(1)}\Psi\rangle \tag{10.74}
\]
The terms $\langle\varphi^*\varphi_j\rangle$ cancel because we assumed $U^{(i)} = \partial_{T_i}S$, and so the compatibility condition eq. (10.59) holds. For the same reason the terms $\langle\varphi^*\widetilde{L}^{(1)}\varphi_j\rangle$ and $\langle\varphi^*\widetilde{A}^{(1)}\varphi_j\rangle$ also cancel.

The last step in our derivation of the Whitham equations consists of evaluating the averages $\langle\Psi^*L^{(1)}\Psi\rangle$ and $\langle\Psi^*A^{(1)}\Psi\rangle$.

Proposition. Let $\Omega^{(i)}$ be the second kind Abelian differentials with pure imaginary periods used to construct the Baker–Akhiezer function on $\Gamma$. We have:
\[
\langle\Psi^*L^{(1)}\Psi\rangle\,\Omega^{(1)} = -\langle\Psi^*\Psi\rangle\,\Omega^{(2)} \tag{10.75}
\]
\[
\langle\Psi^*A^{(1)}\Psi\rangle\,\Omega^{(1)} = -\langle\Psi^*\Psi\rangle\,\Omega^{(3)} \tag{10.76}
\]
Proof. Consider eq. (10.72) with $\delta = d$ now representing the differential on the curve $\Gamma$. Since $\delta L = 0$, it reduces to:
\[
\partial_{t_2}(\Psi^*d\Psi) = -\partial_x\big(\Psi^*L^{(1)}d\Psi\big) - \sum_{j\geq2}\partial^j\big(\Psi^*L^{(j)}d\Psi\big)
\]
We have, recalling that $dP^{(i)} = \Omega^{(i)}$,
\[
d\Psi = e^{\sum_it_iP^{(i)}}\Big(\sum_it_i\,\Omega^{(i)}\,\varphi + d\varphi\Big)
\]
By averaging, the terms with $\partial^j$, $j\geq2$, all vanish. Treating carefully the terms linear in the times as in the proof of the previous proposition, we get:
\[
\Omega^{(2)}\,\langle\Psi^*\Psi\rangle = -\Omega^{(1)}\,\langle\Psi^*L^{(1)}\Psi\rangle
\]
The other formula is proved similarly.
Inserting these formulae into eq. (10.74) we get our final result:

Proposition. The slow modulations obey the Whitham equations
\[
\Big(\partial_{T_2} - \frac{\Omega^{(2)}}{\Omega^{(1)}}\,\partial_X\Big)P^{(3)} = \Big(\partial_{T_3} - \frac{\Omega^{(3)}}{\Omega^{(1)}}\,\partial_X\Big)P^{(2)} \tag{10.77}
\]
Had we kept fixed any meromorphic function on the Riemann surfaces instead of $P^{(1)}$, the Whitham equation would have taken the more symmetric form
\[
\Omega^{(1)}\big(\partial_{T_2}P^{(3)} - \partial_{T_3}P^{(2)}\big) + \Omega^{(2)}\big(\partial_{T_3}P^{(1)} - \partial_{T_1}P^{(3)}\big) + \Omega^{(3)}\big(\partial_{T_1}P^{(2)} - \partial_{T_2}P^{(1)}\big) = 0
\]
It is important to check the consistency equations, eqs. (10.59). When the point $P$ on $\Gamma$ describes a non-trivial cycle, the forms $\Omega^{(i)}(P)$ do not change, but the functions $P^{(i)}(P)$ change by a period $U^{(i)}$. Hence the above equation implies
\[
\Omega^{(1)}(P)\big(\partial_{T_2}U^{(3)} - \partial_{T_3}U^{(2)}\big) + \text{cyclic perm.} = 0
\]
which implies eqs. (10.59) because the $\Omega^{(i)}$ are linearly independent.

In the KdV case, there is no time $T_2$. Keeping $\int_{P_\infty}^{P}\Omega^{(2)}$ fixed, we get
\[
\partial_{T_3}P^{(1)} - \partial_{T_1}P^{(3)} = 0
\]
Differentiating with respect to the point $P$ on the Riemann surface, we get the Whitham equations in their usual form:
\[
\partial_{T_3}\Omega^{(1)} - \partial_{T_1}\Omega^{(3)} = 0 \tag{10.78}
\]
We will recover this equation in Chapter 11 where other proofs are available. In the KP case, however, the above derivation of Whitham equations, which is due to Krichever, is the only one known.

Remark. Assuming that the Riemann surface $\Gamma$ is generic, the forms $\Omega^{(1)}$ and $\Omega^{(2)}$ have no common zero. Let us assume that $\langle\Psi^*\Psi\rangle$ and $\langle\Psi^*L^{(1)}\Psi\rangle$ are meromorphic functions. They have respectively $2g$ and $2g+1$ poles, hence $2g$ and $2g+1$ zeroes. Looking at eq. (10.75), we see that the zeroes of $\langle\Psi^*\Psi\rangle$ are the $2g$ zeroes of $\Omega^{(1)}$. The form
\[
\Omega = \frac{\Omega^{(1)}}{\langle\Psi^*\Psi\rangle} \tag{10.79}
\]
has a double pole at $\infty$, is otherwise regular and has zeroes at the poles of $\Psi$ and $\Psi^*$. It coincides with the form defined in eq. (10.21).
10.11 Solution of the Whitham equations

There is a simple method to find explicit solutions to the Whitham equations. Let us present it in the simple case of hyperelliptic curves, which is appropriate to the KdV equation. In this case the curve is of the form
\[
\mu^2 = R(\lambda) = \prod_{i=1}^{2g+1}(\lambda-\lambda_i)
\]
where the $\lambda_i$ are slowly modulated. Recall that in KdV, only the odd times survive. The forms $\Omega^{(2i-1)}$ are given by
\[
\Omega^{(2i-1)} = \frac{2i-1}{2}\,\frac{\lambda^{g+i-1} + P^{(2i-1)}(\lambda)}{\sqrt{R(\lambda)}}\,d\lambda \tag{10.80}
\]
where the polynomial $P^{(2i-1)}(\lambda)$ is of degree $g-1$ and chosen so that all the periods of $\Omega^{(2i-1)}$ are pure imaginary. At infinity
\[
\Omega^{(2i-1)} = d\big(z^{2i-1} + O(z^{-1})\big),\qquad z = \sqrt{\lambda}
\]
Let us introduce the normalized form
\[
S = \sum_iT_{2i-1}\,\Omega^{(2i-1)} + \Omega^{(n)}
\]
where $n$ is chosen at will and is a free parameter of the solution.

Proposition. Let us assume that for each branch point $\lambda_j$, either $\lambda_j$ is independent of the times $T_{2i-1}$ or $S$ vanishes at $\lambda_j$. This is a system of $2g+1$ equations for the $2g+1$ quantities $\lambda_j$, which allows us to express them in terms of the $T_{2i-1}$. Then
\[
\partial_{T_{2i-1}}S = \Omega^{(2i-1)}
\]
It follows that the Whitham equations, eqs. (10.78), are satisfied. We have more generally:
\[
\partial_{T_{2i-1}}\Omega^{(2j-1)} = \partial_{T_{2j-1}}\Omega^{(2i-1)} \tag{10.81}
\]
\[
U_j^{(2i-1)} = \partial_{T_{2i-1}}S_j,\qquad S_j = \oint_{c_j}S
\]
where $c_j$ is a basis of non-trivial cycles on $\Gamma$.

Proof. Let us consider the analyticity properties of $\partial_{T_{2i-1}}S$. First, at infinity we have
\[
S = d\Big(\sum_iT_{2i-1}\,z^{2i-1} + z^{2n-1} + O(z^{-1})\Big)
\]
hence $\partial_{T_{2i-1}}S = d(z^{2i-1} + O(z^{-1}))$. At finite distance, we have
\[
\partial_{T_{2i-1}}S = \Omega^{(2i-1)} + \frac12\sum_k\frac{\partial_{T_{2i-1}}\lambda_k}{\lambda-\lambda_k}\,S + \sum_kc_k^{(i)}\,\omega_k
\]
The second term in the right-hand side comes from the derivation of the factor $1/\sqrt{R(\lambda)}$ in eq. (10.80), while the last term comes from differentiating the polynomials $P^{(2i-1)}(\lambda)$ and $P^{(n)}$. The $\omega_k$ are the holomorphic differentials. The right-hand side is regular at finite distance since either $\partial_{T_{2i-1}}\lambda_k = 0$ or $S|_{\lambda_k} = 0$. Finally, all periods of $\partial_{T_{2i-1}}S$ are purely imaginary for $T_{2i-1}$ real. Hence we have $\partial_{T_{2i-1}}S = \Omega^{(2i-1)}$. This in turn implies eq. (10.81). Moreover, we have
\[
U_j^{(2i-1)} = \oint_{c_j}\Omega^{(2i-1)} = \oint_{c_j}\partial_{T_{2i-1}}S = \partial_{T_{2i-1}}S_j
\]
So we have solved both eq. (10.57) and the Whitham equations, eq. (10.78).

References

[1] G.B. Whitham, Linear and Nonlinear Waves. Wiley (1974).
[2] I.M. Gelfand and L.A. Dickey, Fractional powers of operators and Hamiltonian systems. Funkz. Analys. Priloz. 10 (1976) 13–29.
[3] F. Magri, A simple model of integrable Hamiltonian equation. J. Math. Phys. 19 (1978) 1156–1162.
[4] M. Adler, On a trace functional for formal pseudo-differential operators and the symplectic structure of the Korteweg–de Vries equations. Invent. Math. 50 (1979) 219–248.
[5] D.R. Lebedev and Yu.I. Manin, Hamiltonian Gelfand–Dickey operator and coadjoint representation of the Volterra group. Funkz. Analys. Priloz. 13 (1979) 40–46.
[6] H. Flaschka, M.G. Forest and D.W. McLaughlin, Korteweg–de Vries equation. Comm. Pure Appl. Math. 33 (1980) 739–784.
[7] A.G. Reyman and M.A. Semenov-Tian-Shansky, Family of Hamiltonian structures, hierarchy of Hamiltonians, and reduction for matrix first order differential operators. Funkz. Analys. Priloz. 14 (1980) 77–78.
[8] V.G. Drinfeld and V.V. Sokolov, Equations of the Korteweg–de Vries type and simple Lie algebras. Doklady AN SSSR 258 (1981) 11–16.
[9] M. Jimbo and T. Miwa, Solitons and infinite-dimensional Lie algebras. RIMS 19 (1983) 943–1001.
[10] I. Krichever, Method of averaging for two dimensional integrable equations. Funkz. Analys. Priloz. 22 (1988) 37–52.
[11] L.A. Dickey, Soliton Equations and Hamiltonian Systems. World Scientific (1991).
11 The KdV hierarchy
In this chapter we study the Korteweg–de Vries equation, which occupies a central place in the modern theory of integrable systems. All the aspects of integrable systems discussed so far converge in this chapter to draw a particularly rich landscape. In particular, the methods of pseudo-differential operators allow us to easily discuss the formal aspects, the tau-functions yield soliton solutions, and the algebro-geometric methods yield finite-zone solutions. The soliton solutions which we obtained in the Grassmannian setting by using vertex operators are also degenerate cases of these finite-zone solutions. Finally, we use a fermionic formalism to analyse the structure of the local fields and show that the equations of the hierarchy can be recast in a very compact form. This is used to give a new derivation of the Whitham equations in the KdV case.

11.1 The KdV equation

The Korteweg–de Vries (KdV) equation was introduced historically as an approximation of the equations of hydrodynamics, describing unidimensional long waves in shallow water. In their pioneering work, Gardner, Greene, Kruskal and Miura found an unexpected connection with the inverse scattering problem of the Schroedinger equation. More recently, the Hamiltonian aspects of KdV theory connected it to conformal field theory. The KdV equation reads:
\[
4\,\partial_tu = -6u\,\partial_xu + \partial_x^3u \tag{11.1}
\]
The numerical factors in front of each term in eq. (11.1) can be modified by rescaling $u$, $x$ and $t$. The KdV equation can be written as the zero curvature condition
\[
F_{xt}\equiv\partial_xA_t - \partial_tA_x - [A_x,A_t] = 0
\]
with the connection Ax , At , depending on a spectral parameter λ: 1 ∂x u 4λ − 2u 0 1 , At = Ax = −∂x u λ+u 0 4 4λ2 + 2λu + ∂x2 u − 2u2 (11.2) Note that ∂x − Ax = Du − λΣ0 with the notations of eqs. (10.51, 10.52) in Chapter 10. Alternatively, one can recast the KdV equation in the Lax form ∂t L = [M, L], where L and M are the following differential operators: L = ∂2 − u (11.3) 3 1 1 M = (4∂ 3 − 3u∂ − 3∂u) = (4∂ 3 − 6u∂ − 3(∂x u)) = (L 2 )+ 4 4 3
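The operator $\partial$ acts as $\partial_x$, and the notation $(L^{3/2})_+$ refers to the pseudodifferential operator formalism introduced in Chapter 10. In the Lax equation $[M, L]$ is the commutator of differential operators. Of course these two descriptions are not independent. To relate them, consider the linear system:

$$(\partial_x - A_x)\begin{pmatrix}\Psi \\ \chi\end{pmatrix} = 0, \qquad (\partial_t - A_t)\begin{pmatrix}\Psi \\ \chi\end{pmatrix} = 0 \tag{11.4}$$

The $x$-equation yields $\chi = \partial_x\Psi$ and

$$(L - \lambda)\Psi = 0 \quad\text{with}\quad L = \partial_x^2 - u \tag{11.5}$$

The time evolution of $\Psi$ is given by $4\partial_t\Psi = \partial_x u\cdot\Psi + (4\lambda - 2u)\,\partial_x\Psi$. Using eq. (11.5), this may be rewritten as:

$$(\partial_t - M)\Psi = 0 \quad\text{with}\quad M = \frac{1}{4}(4\partial^3 - 3u\partial - 3\partial u) \tag{11.6}$$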
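The step from the linear system to eq. (11.6) can be spelled out as follows. This is a short worked computation that uses only $\partial_x^2\Psi = (\lambda + u)\Psi$ from eq. (11.5) and the expanded form $M = \partial^3 - \tfrac32 u\partial - \tfrac34(\partial_x u)$.

```latex
\begin{aligned}
\partial^3\Psi &= \partial_x\big((\lambda+u)\Psi\big)
               = (\partial_x u)\,\Psi + (\lambda+u)\,\partial_x\Psi, \\
M\Psi &= \partial^3\Psi - \tfrac32\,u\,\partial_x\Psi - \tfrac34\,(\partial_x u)\,\Psi
       = \tfrac14\,(\partial_x u)\,\Psi + \big(\lambda - \tfrac12 u\big)\,\partial_x\Psi, \\
4M\Psi &= (\partial_x u)\,\Psi + (4\lambda - 2u)\,\partial_x\Psi = 4\,\partial_t\Psi .
\end{aligned}
```

So on solutions of the Schroedinger equation the third order operator $M$ acts as a first order one, and this first order operator is the top row of $A_t$.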
The compatibility condition of eqs. (11.5, 11.6) is the Lax equation $\partial_t L = [M, L]$, which is equivalent to the KdV equation. Equation (11.5) is the Schroedinger equation with potential $u$. The parameter $\lambda$ gets an interpretation as a point of the spectrum of this operator. This is the origin of the terminology “spectral parameter”.

As explained in Chapter 3, a general consequence of the zero curvature condition is the existence of non-trivial conserved quantities. This requires, however, imposing appropriate boundary conditions. We shall consider here either potentials $u(x)$ fast decreasing at $x \to \pm\infty$, or potentials periodic under $x \to x + \ell$. Let us assume for definiteness that $u(x)$ is periodic. Since $A_x$ and $A_t$ are local in $u$, they are also periodic. In this case conserved quantities are generated by $\mathrm{Tr}\,T(\lambda)$, where $T(\lambda)$ is the monodromy matrix associated with the linear system eq. (11.4):

$$T(\lambda) = \overleftarrow{\exp}\int_0^{\ell} dx\,A_x(x, t, \lambda) \tag{11.7}$$

To compute the trace $\mathrm{Tr}\,T(\lambda)$, we remark that it is invariant under periodic gauge transformations. We build a gauge in which the connection is diagonal, making the calculation of the monodromy matrix and its trace simple. We shall present this computation for general connections whose components $A_x$ and $A_t$ belong to the $sl(2)$ algebra, so that it can be applied to a wider class of systems. The commutation relations of the $sl(2)$ algebra are:

$$[H, E_\pm] = \pm 2E_\pm, \qquad [E_+, E_-] = H$$

Its fundamental representation is given by:

$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad E_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad E_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$
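Proposition. Let $A_x = A_h H + A_- E_- + A_+ E_+$, where $A_h(x, \lambda)$, $A_\pm(x, \lambda)$ are periodic functions of $x$. There exists a periodic gauge transformation $g(x, \lambda)$,

$$A_x \to {}^gA_x \equiv g^{-1}A_x g - g^{-1}\partial_x g,$$

such that ${}^gA_- = {}^gA_+ = 0$ and the remaining component is constant,

$${}^gA_x = \frac{1}{\ell}P(\lambda)\,H,$$

where $P(\lambda)$, independent of $x$, is given by $P(\lambda) = \int_0^{\ell} dx\,v(x, \lambda)$. The function $v(x, \lambda)$ is a solution of the Riccati equation:

$$v' + v^2 = V \quad\text{with}\quad V = A' + A^2 + A_-A_+, \qquad A = A_h - \frac{1}{2}\frac{A_+'}{A_+} \tag{11.8}$$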
Proof. The proof consists of performing successively three gauge transformations: the first one annihilates the component along $E_-$, the second one annihilates the component along $E_+$ and the third one is chosen to ensure that the component along $H$ is constant. Let us perform first a gauge transformation with $g = g_1 = \exp(f_-E_-)$, then

$${}^gA_x = (A_h + A_+f_-)H - (f_-' + 2A_hf_- + A_+f_-^2 - A_-)E_- + A_+E_+$$

where prime ($'$) means derivative with respect to $x$. The coefficient of $E_-$ vanishes if $f_-$ is a solution of the Riccati equation

$$f_-' + 2A_hf_- + A_+f_-^2 - A_- = 0$$

If one sets $f_- = \frac{1}{A_+}(v - A)$ with $A = A_h - \frac{1}{2}\frac{A_+'}{A_+}$, this equation becomes the Riccati equation (11.8). As usual the substitution $v = y'/y$ linearizes the equation, which becomes the Schroedinger equation:

$$y'' - Vy = 0 \tag{11.9}$$
The potential $V(x, \lambda)$ being periodic, one can take for $y(x, \lambda)$ any one of the two quasi-periodic Bloch waves (Floquet solutions), $y_\pm(x, \lambda)$:

$$y_\pm(x + \ell, \lambda) = \exp(\pm P(\lambda))\,y_\pm(x, \lambda) \tag{11.10}$$

$P(\lambda)$ is called the quasi-momentum. For definiteness, we shall take $v = y_+'/y_+$, which is periodic. We shall, moreover, assume that the Wronskian of $y_+$ and $y_-$ is normalized by $y_+y_-' - y_+'y_- = 1$. Notice that

$$P(\lambda) = \ln\frac{y_+(\ell, \lambda)}{y_+(0, \lambda)} = \int_0^{\ell} dx\,v(x, \lambda) \tag{11.11}$$

Similarly, let us define $g_2 = \exp(f_+E_+)$ and compute the matrix ${}^gA_x$ with $g = g_1g_2$:

$${}^gA_x = (A_h + A_+f_-)H + \big(-f_+' + 2(A_h + A_+f_-)f_+ + A_+\big)E_+$$

We reduce ${}^gA_x$ to the diagonal form if we choose for $f_+$ the periodic solution of the equation $-f_+' + 2(A_h + A_+f_-)f_+ + A_+ = 0$. This solution is $f_+ = A_+y_+y_-$, which is also periodic. Finally, taking $g_3 = \exp(hH)$, the gauge transformed matrix ${}^gA_x$ with $g = g_1g_2g_3$ reads:

$${}^gA_x = \big(-h' + A_h + A_+f_-\big)H = \Big(-h' + \frac{y_+'}{y_+} + \frac{1}{2}\frac{A_+'}{A_+}\Big)H$$

Choosing for $h$ the periodic function $h = \frac{1}{2}\ln\big(A_+y_+^2\,e^{-2P(\lambda)x/\ell}\big)$ reduces the coefficient of $H$ to the constant $\frac{1}{\ell}P(\lambda)$.
Conserved quantities are obtained by looking at the trace of the monodromy matrix $\mathrm{Tr}\,T(\lambda)$. Once the connection has been diagonalized this trace is easy to compute. It follows from the previous proposition that the two eigenvalues of $T(\lambda)$ are $\exp(\pm P(\lambda))$, hence:

$$\mathrm{Tr}\,T(\lambda) = 2\cosh P(\lambda), \qquad P(\lambda) = \int_0^{\ell} dx\,v(x, \lambda)$$

The function $P(\lambda)$ can serve, as well as $\mathrm{Tr}\,T(\lambda)$, as a generating function for the integrals of motion. To construct them we only have to solve the Riccati equation (11.8) for $v(x, \lambda)$.
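The relation $\mathrm{Tr}\,T(\lambda) = 2\cosh P(\lambda)$ can be illustrated numerically. The sketch below is not part of the text: it assumes the KdV form $A_x = \begin{pmatrix}0 & 1\\ \lambda+u & 0\end{pmatrix}$, integrates $\partial_x M = A_x M$ over one period with a classical Runge–Kutta step, and uses a constant potential $u = c$ as a test case, for which $v = \sqrt{\lambda + c}$ and $P(\lambda) = \ell\sqrt{\lambda + c}$. The function names `monodromy_trace`, `mat_mul` and `mat_axpy` are hypothetical helpers introduced for the illustration.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]

def mat_axpy(m, k, s):
    """Return m + s*k for 2x2 matrices."""
    return [[m[i][j] + s * k[i][j] for j in range(2)] for i in range(2)]

def monodromy_trace(u, lam, ell, steps=2000):
    """Trace of the monodromy matrix of d/dx M = A_x(x) M, M(0) = 1, over [0, ell].

    A_x = [[0, 1], [lam + u(x), 0]] is the KdV connection; RK4 is an
    illustrative choice of integrator, not something fixed by the text.
    """
    def rhs(x, m):
        return mat_mul([[0.0, 1.0], [lam + u(x), 0.0]], m)

    m = [[1.0, 0.0], [0.0, 1.0]]
    h = ell / steps
    for n in range(steps):
        x = n * h
        k1 = rhs(x, m)
        k2 = rhs(x + h / 2, mat_axpy(m, k1, h / 2))
        k3 = rhs(x + h / 2, mat_axpy(m, k2, h / 2))
        k4 = rhs(x + h, mat_axpy(m, k3, h))
        m = [[m[i][j] + h * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j]) / 6
              for j in range(2)] for i in range(2)]
    return m[0][0] + m[1][1]

# Constant potential u = c: the Riccati equation gives v = sqrt(lam + c).
c, lam, ell = 0.5, 2.0, 1.0
numeric = monodromy_trace(lambda x: c, lam, ell)
exact = 2.0 * math.cosh(ell * math.sqrt(lam + c))
```

For a non-constant periodic `u` the same routine returns $\mathrm{Tr}\,T(\lambda)$, from which $P(\lambda)$ follows as `math.acosh(trace / 2)` whenever the trace is at least 2.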
Let us apply the above proposition to the KdV equation. In view of the expression of the KdV connection (11.2), we have $A_h = 0$, $A_- = \lambda + u$ and $A_+ = 1$. Thus $V = \lambda + u$ and the Riccati equation reads:

$$v' + v^2 = \lambda + u \tag{11.12}$$
The Schroedinger equation associated with the Riccati eq. (11.12) coincides with eq. (11.5). The quantity $P(\lambda)$ is the quasi-momentum of the Bloch eigenfunctions of the differential operator $L = \partial^2 - u$ with periodic potential $u$ and eigenvalue $\lambda$. To obtain local conserved quantities we expand $P(\lambda)$ around $\lambda = \infty$. When $\lambda \to \infty$, the solution of the Riccati equation admits the asymptotic expansion:

$$v = \sqrt{\lambda} + \sum_{n\ge 0}\frac{(-1)^n v_n}{(\sqrt{\lambda})^n}, \qquad 2v_{n+1} = v_n' + \sum_{p=0}^{n} v_pv_{n-p} \quad (n \ge 1), \qquad v_0 = 0, \quad 2v_1 = -u$$

This gives a recursion relation for computing the coefficients $v_n$. Since its solution does not require any integration it leads to local integrals of motion. The first few coefficients are:

$$v_1 = -\tfrac{1}{2}u, \qquad v_2 = -\tfrac{1}{4}u', \qquad v_3 = \tfrac{1}{8}(u^2 - u''),$$

$$v_4 = \tfrac{1}{2}v_3' + \tfrac{1}{8}uu', \qquad v_5 = \tfrac{1}{2}v_4' + \tfrac{1}{32}u'^2 + \tfrac{1}{16}uu'' - \tfrac{1}{16}u^3$$
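The recursion for the $v_n$ is purely algebraic, so it can be run mechanically. The following sketch is an illustration only: it represents a differential polynomial in $u$ as a dictionary from exponent tuples $(e_0, e_1, \ldots)$, meaning $u^{e_0}(u')^{e_1}\cdots$, to exact rational coefficients. The names `add`, `mul`, `ddx` and `riccati_coefficients` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def _trim(mono):
    """Drop trailing zero exponents so that each monomial has a unique key."""
    mono = list(mono)
    while mono and mono[-1] == 0:
        mono.pop()
    return tuple(mono)

def add(p, q):
    """Sum of two differential polynomials."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, Fraction(0)) + c
        if out[mono] == 0:
            del out[mono]
    return out

def mul(p, q):
    """Product of two differential polynomials (exponents add)."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            n = max(len(m1), len(m2))
            mono = _trim([(m1[i] if i < len(m1) else 0) + (m2[i] if i < len(m2) else 0)
                          for i in range(n)])
            out[mono] = out.get(mono, Fraction(0)) + c1 * c2
            if out[mono] == 0:
                del out[mono]
    return out

def ddx(p):
    """x-derivative, using d/dx u^(k) = u^(k+1) and the Leibniz rule."""
    out = {}
    for mono, c in p.items():
        for k, e in enumerate(mono):
            if e == 0:
                continue
            new = list(mono) + [0]
            new[k] -= 1
            new[k + 1] += 1
            key = _trim(new)
            out[key] = out.get(key, Fraction(0)) + c * e
            if out[key] == 0:
                del out[key]
    return out

def riccati_coefficients(nmax):
    """Return [v_0, ..., v_nmax] from 2 v_{n+1} = v_n' + sum_p v_p v_{n-p}, n >= 1."""
    v = [dict(), {(1,): Fraction(-1, 2)}]  # v_0 = 0, v_1 = -u/2
    for n in range(1, nmax):
        total = ddx(v[n])
        for p in range(n + 1):
            total = add(total, mul(v[p], v[n - p]))
        v.append({mono: c / 2 for mono, c in total.items()})
    return v

v = riccati_coefficients(5)
# v[3] is {(2,): 1/8, (0, 0, 1): -1/8}, i.e. (u^2 - u'')/8 as quoted above.
```

Exact `Fraction` arithmetic is used so that the output can be compared term by term with the listed coefficients.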
The conserved quantities are given by the integral:

$$P(\lambda) = \int_0^{\ell} v\,dx = \ell\sqrt{\lambda} + \frac{1}{2\sqrt{\lambda}}\int_0^{\ell} u\,dx - \frac{1}{(2\sqrt{\lambda})^3}\int_0^{\ell} u^2\,dx + \frac{1}{(2\sqrt{\lambda})^5}\int_0^{\ell}(u'^2 + 2u^3)\,dx - \cdots \tag{11.13}$$

11.2 The KdV hierarchy

We now particularize the formalism of pseudo-differential operators studied in Chapter 10 to the KdV situation. This amounts to studying the implications of the condition that $L = Q^2$ is a second order differential operator:

$$Q^2 = \Phi\partial^2\Phi^{-1} = L = \partial^2 - u, \qquad \Phi = 1 + \sum_{i=1}^{\infty} w_i\partial^{-i}$$
A first consequence is that only odd times survive in the KdV hierarchy. Indeed, recalling the equations of motion of the KP hierarchy,

$$\partial_{t_i}\Phi = -(\Phi\partial^i\Phi^{-1})_-\Phi$$

we see that for $i = 2j$, $\Phi\partial^{2j}\Phi^{-1} = L^j$ is a differential operator, so that its projection $(\ )_-$ vanishes. Recall that we have defined two formal Baker–Akhiezer functions, see eq. (10.11) in Chapter 10:

$$\Psi(t, z) = \Phi e^{\xi(t,z)}, \qquad \Psi^*(t, z) = (\Phi^*)^{-1}e^{-\xi(t,z)}, \qquad \xi(t, z) = \sum_{i=1}^{\infty} t_{2i-1}z^{2i-1}$$

where $t_1 = x$. The function $\Psi(t, z)$ is an eigenfunction of $L$ with the eigenvalue $\lambda = z^2$ and $\Psi^*(t, z)$ is its formal adjoint (see eq. (10.2) in Chapter 10). Since $L$ is obviously formally self-adjoint, $\Psi^*(t, z)$ is also an eigenfunction of $L$ with the same eigenvalue,

$$(\partial_x^2 - u)\Psi(t, z) = \lambda\Psi(t, z), \qquad (\partial_x^2 - u)\Psi^*(t, z) = \lambda\Psi^*(t, z) \tag{11.14}$$
The Wronskian of these two solutions is a constant that we now compute.

Proposition. The Wronskian of the two Baker–Akhiezer functions $\Psi$ and $\Psi^*$ is given by:

$$W(\Psi, \Psi^*) \equiv \Psi'(t, z)\Psi^*(t, z) - \Psi^{*\prime}(t, z)\Psi(t, z) = 2z \tag{11.15}$$

where we have denoted $\Psi' \equiv \partial_x\Psi$.

Proof. From the definition of the Baker–Akhiezer functions we see that the essential singularities cancel in $W(\Psi, \Psi^*)$ and that $W$ admits a power series expansion in $z$ around $\infty$ of the form $W(z) = 2z + \alpha + \beta/z + \cdots$. We prove that only the first term is present by showing that the residue of $W(z)z^i$ vanishes for $i \ge -1$.

$$\oint\frac{dz}{2i\pi}\,z^iW(z) = \oint\frac{dz}{2i\pi}\Big[\big(\partial\Phi\partial^ie^{zx}\big)\big((\Phi^*)^{-1}e^{-zx}\big) - \big(\Phi e^{zx}\big)\big(\partial(\Phi^*)^{-1}(-\partial)^ie^{-zx}\big)\Big]$$

Note that the terms involving the times $t_3, t_5, \ldots$ in $\xi(t, z)$ cancel because there are only derivatives with respect to $x = t_1$. Using eq. (10.14) in Chapter 10, we can rewrite this as an Adler residue in the pseudodifferential algebra:

$$\oint\frac{dz}{2i\pi}\,z^iW(z) = \mathrm{Res}_\partial\big(\partial\Phi\partial^i\Phi^{-1} + \Phi\partial^i\Phi^{-1}\partial\big) = \mathrm{Res}_\partial\big(\partial L^{i/2} + L^{i/2}\partial\big)$$

For $i$ even, this vanishes because $L^{i/2}$ is a differential operator. For $i$ odd we are going to show that:

$$(L^{i/2})^* = -L^{i/2}, \qquad i\ \text{odd}$$

so that $\partial L^{i/2} + L^{i/2}\partial$ is formally self-adjoint and so cannot have a residue. It is sufficient to show that $(L^{1/2})^* = -L^{1/2}$. But $L^{1/2} = \partial - \frac{1}{2}u\partial^{-1} + \frac{1}{4}u'\partial^{-2} + \cdots$ is the unique solution of the equation $(L^{1/2})^2 = L$ with leading term $\partial$. Similarly, $-L^{1/2}$ is the unique solution of the same equation with leading term $-\partial$. Since $\big((L^{1/2})^*\big)^2 = L^* = L$ and $(L^{1/2})^* = -\partial + \cdots$ the result follows.

We introduce the quantity $S(t, \lambda)$ which will be useful in expanding the compact pseudo-differential expressions. It will also play an important role in the last two sections on Whitham theory. It is defined by

$$S(t, \lambda) \equiv \Psi^*(t, z)\Psi(t, z) = 1 + \sum_{j=1}^{\infty}\lambda^{-j}S_{2j}(t)$$
Note that the essential singularities cancel in $S(t, \lambda)$ and that $S(t, \lambda)$ is a function of $\lambda = z^2$. This is because $\Psi^*(t, z)$ and $\Psi(t, -z)$ are solutions of the same eq. (11.14), and have the same behaviour at $z \to \infty$. So we have

$$\Psi^*(t, z) = c(z, t_3, \ldots)\,\Psi(t, -z) \tag{11.16}$$

Inserting this into eq. (11.15) we see that $c(z, t)$ is even in $z$, and so is $\Psi^*(t, z)\Psi(t, z)$. The aim of the following propositions is to show that the whole KdV hierarchy can be written in terms of $S(t, \lambda)$.

Proposition. The Baker–Akhiezer functions can be expressed as:

$$\Psi(t, z) = S^{1/2}(t, \lambda)\,e^{X(t,z)}, \qquad \Psi^*(t, z) = S^{1/2}(t, \lambda)\,e^{-X(t,z)} \tag{11.17}$$

where

$$\partial_xX(t, z) = \frac{z}{S(t, \lambda)}, \qquad X(t, z) = \xi(t, z) + O(1/z)$$

Proof. This parametrization obviously satisfies $\Psi^*(t, z)\Psi(t, z) = S(t, \lambda)$ and this defines $X(t, z)$. Inserting it into eq. (11.15) yields the equation relating $X(t, z)$ and $S(t, \lambda)$. The asymptotic form of $X(t, z)$ when $z \to \infty$ follows by comparing the asymptotic expansions of $\Psi(t, z)$.
As a consequence, eq. (11.14) translates into an equation on $S(t, \lambda)$. It is convenient to write it in the form of the Riccati equation (11.12) with:

$$v(t, z) = \frac{\partial_x\Psi(t, z)}{\Psi(t, z)} = \frac{1}{2}\partial_x\log S(t, \lambda) + \frac{z}{S(t, \lambda)} \tag{11.18}$$
There is a simple expression of the coefficients $S_{2j}$ as residues of fractional powers of $L$.

Proposition. The coefficients $S_{2j}$ are the local densities of the conserved quantities of the KdV hierarchy, as computed in Chapter 10.

$$S_{2j} = \mathrm{Res}_\partial\,L^{\frac{2j-1}{2}} \tag{11.19}$$

As a consequence, the Hamiltonians of the KdV hierarchy are given by:

$$H_{2j-1}(L) = \frac{2}{2j+1}\int dx\,S_{2j+2}(x) \tag{11.20}$$

Proof.

$$S_{2j} = \oint\frac{dz}{2i\pi}\,z^{2j-1}\Psi^*(t, z)\Psi(t, z) = \oint\frac{dz}{2i\pi}\,\big((\Phi^*)^{-1}e^{-zx}\big)\cdot\big(\Phi\partial^{2j-1}e^{zx}\big) = \mathrm{Res}_\partial\big(\Phi\partial^{2j-1}\Phi^{-1}\big) = \mathrm{Res}_\partial\,L^{\frac{2j-1}{2}}$$
where we have used eq. (10.14) in Chapter 10.

Due to eq. (11.16), replacing $\Psi(t, z)$ by $\Psi^*(t, z)$ in eq. (11.18) amounts to changing $z \to -z$. In particular this shows that the coefficients $v_{2n}$ are derivatives with respect to $x$ of local densities. One can compute the coefficients $S_{2j}$ by induction as follows. Since $\Psi$ and $\Psi^*$ obey eq. (11.14), their product $S(t, \lambda) = \Psi^*(t, z)\Psi(t, z)$ obeys the third order equation:

$$\Big(\frac{1}{4}\partial^3 - \frac{1}{2}u' - u\partial - \lambda\partial\Big)S(t, \lambda) = 0$$

Expanding in $z$ one gets $\partial S_{2j+2} = \big(\frac{1}{4}\partial^3 - \frac{1}{2}u' - u\partial\big)S_{2j}$. This recursion relation can also be understood as the Lenard recursion relation, eq. (10.37) in Chapter 10. In fact, since

$$H_k(L) = \frac{2}{k+2}\big\langle L^{\frac{k+2}{2}}\big\rangle = \frac{2}{k+2}\int dx\,\mathrm{Res}_\partial\,L^{\frac{k}{2}+1} = \frac{2}{k+2}\int dx\,S_{k+3} \tag{11.21}$$

we get, using eq. (10.35) with $n = 1$, $dH_k = S_{k+1}\partial^{-1}$. The Lenard relation becomes $D_1(S_{2j+2}\partial^{-1}) = D_2(S_{2j}\partial^{-1})$, which is exactly the previous recursion relation. The first few values of $S_{2j}$ are:

$$S_2 = -\frac{1}{2}u, \qquad S_4 = -\frac{1}{8}u'' + \frac{3}{8}u^2, \qquad S_6 = -\frac{5}{16}u^3 - \frac{5}{32}u'^2 + \frac{5}{16}(uu')' - \frac{1}{32}u^{(iv)}$$
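The first two steps of the recursion can be followed by hand. The worked computation below is a sketch that starts from $S_0 = 1$ and assumes that every integration constant is set to zero, which is what the expansion $S = 1 + \sum_j\lambda^{-j}S_{2j}$ with differential-polynomial coefficients requires.

```latex
% Recursion: \partial S_{2j+2} = (\tfrac14\partial^3 - \tfrac12 u' - u\,\partial)\,S_{2j},  S_0 = 1
\begin{aligned}
\partial S_2 &= -\tfrac12\,u'
  &&\Rightarrow\quad S_2 = -\tfrac12\,u, \\
\partial S_4 &= \tfrac14\,S_2''' - \tfrac12\,u'S_2 - u\,S_2'
   = -\tfrac18\,u''' + \tfrac14\,uu' + \tfrac12\,uu'
   = \partial\Big(-\tfrac18\,u'' + \tfrac38\,u^2\Big)
  &&\Rightarrow\quad S_4 = -\tfrac18\,u'' + \tfrac38\,u^2 .
\end{aligned}
```

With eq. (11.20) this gives $H_1 = \frac{2}{3}\int dx\,S_4 = \frac{1}{4}\int dx\,u^2$, the total derivative $u''$ dropping out of the integral.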
One can recast the equations of motion of the KdV hierarchy as equations on $S(t, \lambda)$. It is convenient to introduce a generating function for all the time derivatives:

$$\nabla(\lambda) = \sum_{j\ge 1}\lambda^{-j}\partial_{2j-1} \tag{11.22}$$
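Then we have:

Proposition. The equations of the KdV hierarchy are equivalent to:

$$\nabla(\lambda)S(t, \lambda') = \frac{S(t, \lambda)\cdot\partial_xS(t, \lambda') - S(t, \lambda')\cdot\partial_xS(t, \lambda)}{\lambda - \lambda'}, \qquad |\lambda| > |\lambda'| \tag{11.23}$$

Proof. In these equations, we have $\lambda = z^2$, $\lambda' = z'^2$. We shall first prove a similar equation on $\Psi(t, z')$:

$$\nabla(\lambda)\Psi(t, z') = \frac{2S(t, \lambda)\partial_x - \partial_xS(t, \lambda)}{2(\lambda - \lambda')}\,\Psi(t, z'), \qquad\text{for } |\lambda| > |\lambda'| \tag{11.24}$$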
Recall that, from eq. (10.9) in Chapter 10, the equation of motion for $\Psi$ is

$$\partial_{2j-1}\Psi(t, z) = \big(L^{(2j-1)/2}\big)_+\Psi(t, z) \tag{11.25}$$

Now, there is a simple recursion relation:

$$\big(L^{\frac{2j+1}{2}}\big)_+ = \big(L^{\frac{2j-1}{2}}\big)_+L + S_{2j}\partial_x - \frac{1}{2}(\partial_xS_{2j}) \tag{11.26}$$

Indeed, $L$ being a differential operator, we have:

$$\big(L^{\frac{2j+1}{2}}\big)_+ = \big(L^{\frac{2j-1}{2}}L\big)_+ = \big(L^{\frac{2j-1}{2}}\big)_+L + \Big(\big(L^{\frac{2j-1}{2}}\big)_-L\Big)_+$$

To compute the plus part in the last term, we only have to keep the first two terms in the expansion of $(L^{(2j-1)/2})_-$ because $L$ is a second order differential operator. We have:

$$\big(L^{\frac{2j-1}{2}}\big)_- = S_{2j}\partial^{-1} - \frac{1}{2}S_{2j}'\partial^{-2} + \cdots$$

where the first coefficient (which is the residue of the considered pseudodifferential operator) is determined by eq. (11.19), and the second coefficient is then fixed by the fact that the left-hand side is formally anti self-adjoint. The recursion relation eq. (11.26) follows immediately. It implies in turn:

$$\big(L^{\frac{2j-1}{2}}\big)_+ = \sum_{i=0}^{j-1}\Big(S_{2i}\partial - \frac{1}{2}S_{2i}'\Big)L^{j-i-1} \tag{11.27}$$
Using $L\Psi(t, z') = \lambda'\Psi(t, z')$, we have:

$$\partial_{2j-1}\Psi(t, z') = \sum_{i=0}^{j-1}\Big(S_{2i}\partial - \frac{1}{2}S_{2i}'\Big)\lambda'^{\,j-i-1}\,\Psi(t, z')$$
The computation of $\nabla(\lambda)\Psi(t, z')$ is then straightforward, yielding eq. (11.24). Changing $z' \to -z'$, we see that the Baker–Akhiezer function $\Psi^*(t, z')$ also obeys eq. (11.24), and eq. (11.23) follows immediately for $S = \Psi^*\Psi$.

It is worth noticing that eq. (11.23) can be rewritten as a local conservation law:

$$\nabla(\lambda)\frac{1}{S(\lambda')} = \partial_x\Big[\frac{1}{\lambda - \lambda'}\,\frac{S(\lambda)}{S(\lambda')}\Big]$$

Using this formalism we now show that the conserved quantities given in eq. (11.13) are the same as the ones in eq. (11.21). This amounts to showing that $v(t, z)$ and $S(t, \lambda)$ differ by the derivative in $x$ of a local function.

Proposition. We have the relation:

$$\frac{dv(t, z)}{dz} = S(t, \lambda) + \frac{1}{2}\partial_x\Big(2z\frac{d}{d\lambda} + \nabla(\lambda)\Big)\log S(t, \lambda) \tag{11.28}$$

where $v = \partial_x\Psi/\Psi$, $\lambda = z^2$.

Proof. From eq. (11.23) we have:

$$\nabla(\lambda)\log S(t, \lambda') = \frac{S(\lambda)}{\lambda - \lambda'}\Big(\partial_x\log S(t, \lambda') - \partial_x\log S(t, \lambda)\Big)$$

We substitute eq. (11.18) in this equation and take the limit $\lambda' \to \lambda$. One gets:

$$\nabla(\lambda)\log S(t, \lambda) = -\frac{S(t, \lambda)}{z}\frac{dv(t, z)}{dz} + \frac{1}{z} - 2z\frac{d}{d\lambda}\log S(t, \lambda)$$

We differentiate this expression with respect to $x$ and substitute $\partial_xv = \lambda + u - v^2$ to get the result.
Remark 1. As for the KP hierarchy, there exists a tau-function $\tau(t)$ such that:

$$\Psi(t, z) = e^{\xi(t,z)}\frac{\tau(t - [z^{-1}])}{\tau(t)}, \qquad \Psi^*(t, z) = e^{-\xi(t,z)}\frac{\tau(t + [z^{-1}])}{\tau(t)}$$

with $[z^{-1}] = \big(\ldots, \frac{z^{-2j+1}}{2j-1}, \ldots\big)$. It is related to the generating function $S(t, \lambda)$ by:

$$S(t, \lambda) = \frac{\tau(t + [z^{-1}])\,\tau(t - [z^{-1}])}{\tau^2(t)} = 1 + \partial_x\nabla(\lambda)\log\tau(t) \tag{11.29}$$

This last formula follows by inserting the parametrization of $\Psi$, $\Psi^*$ in terms of tau-functions in eq. (11.28).
Remark 2. The Schroedinger equations, eq. (11.14), can also be translated into an equation on $S(t, \lambda)$. Using the value of the Wronskian, it is straightforward to show that

$$2\partial_x^2\log S(t, \lambda) + \big(\partial_x\log S(t, \lambda)\big)^2 + 4\lambda S^{-2}(t, \lambda) = 4(u + \lambda) \tag{11.30}$$
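Remark 3. We can give more information on the decomposition eq. (11.17). Using eq. (11.24), we have:

$$\nabla(\lambda)X(t, z') = \frac{z'}{\lambda - \lambda'}\,\frac{S(t, \lambda)}{S(t, \lambda')} = \frac{z'}{\lambda - \lambda'} + \frac{z'}{\lambda - \lambda'}\,\frac{S(t, \lambda) - S(t, \lambda')}{S(t, \lambda')}, \qquad\text{for } |\lambda| > |\lambda'| \tag{11.31}$$

Notice that we can expand $X(t, z)$ as:

$$X(t, z) = \xi(t, z) + \tilde{X}(t, z), \quad\text{with}\quad \xi(t, z) = \sum_{j\ge 1} z^{2j-1}t_{2j-1} \tag{11.32}$$

with $\tilde{X}(t, z)$ regular at $z = \infty$. This decomposition follows from the fact that $\nabla(\lambda)\xi(z', t) = \frac{z'}{\lambda - \lambda'}$ for $|\lambda| > |\lambda'|$.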
11.3 Hamiltonian structures and Virasoro algebra

Recall eq. (10.39) in Chapter 10 which defined two Poisson structures:

$$\{f_X, f_Y\}_1(L) = \langle LXY - LYX\rangle$$

$$\{f_X, f_Y\}_2(L) = \big\langle(LX)_+(LY)_- - (XL)_+(YL)_-\big\rangle - \frac{1}{2}\int dx\,\big(\partial^{-1}[L, X]_{-1}\big)\,[L, Y]_{-1}$$

Here $L = \partial^2 - u$, $X = X_{-1}\partial^{-1} + X_{-2}\partial^{-2} + \cdots$ and $f_X(L) = \langle LX\rangle$, where $\langle\ \rangle$ denotes the Adler trace. To compute the Poisson brackets of $u$ it is enough to take $X = X(x)\partial^{-1}$ so that $f_X(L) = -\int dx\,u(x)X(x)$. The two Poisson brackets become:

$$\Big\{\int dx\,uX,\ \int dx\,uY\Big\}_1 = \int dx\,(X'Y - XY') \tag{11.33}$$

$$\Big\{\int dx\,uX,\ \int dx\,uY\Big\}_2 = -\int dx\,u\,(X'Y - XY') - \frac{1}{2}\int dx\,XY''' \tag{11.34}$$
Equivalently, we can write:

$$\{u(x), u(y)\}_1 = -(\partial_x - \partial_y)\delta(x - y) \tag{11.35}$$

$$\{u(x), u(y)\}_2 = \frac{1}{2}\big(u(x) + u(y)\big)(\partial_x - \partial_y)\delta(x - y) - \frac{1}{4}(\partial_x^3 - \partial_y^3)\delta(x - y) \tag{11.36}$$

A noticeable feature of these two Poisson brackets is that they are both local in $x$. Alternatively, we can expand the field $u(x)$ in Fourier series: $u(x) = \sum_k u_ke^{ikx}$ (we chose $\ell = 2\pi$). Taking $X = e^{-inx}$ and $Y = e^{-imx}$, eqs. (11.33, 11.34) become respectively:

$$\{u_n, u_m\}_1 = -\frac{i}{\pi}\,n\,\delta_{n+m,0} \tag{11.37}$$

$$\{u_n, u_m\}_2 = -\frac{1}{2i\pi}\Big[(n - m)u_{n+m} - \frac{n^3}{2}\delta_{n+m,0}\Big] \tag{11.38}$$

The bracket $\{\ ,\ \}_1$ is called the Gardner–Faddeev–Zakharov bracket, while the bracket $\{\ ,\ \}_2$ is called the Magri–Virasoro bracket. It coincides with the Kostant–Kirillov bracket associated with the Virasoro algebra.

The Lax operator $L$ can also be written in factorized form $L = (\partial + p)(\partial - p)$ so that $u = p' + p^2$. This is the Miura transformation. The second Poisson bracket has a simple expression in this parametrization:

$$\{p(x), p(y)\}_2 = \frac{1}{4}(\partial_x - \partial_y)\delta(x - y) \tag{11.39}$$

or in Fourier modes $p(x) = \sum_k p_ke^{ikx}$:

$$\{p_n, p_m\}_2 = \frac{in}{4\pi}\,\delta_{n+m,0}$$
It was shown in Chapter 10 that these Poisson brackets are compatible in the sense that their sum is again a Poisson bracket. Moreover, the Hamiltonians $H_n$ of the KdV hierarchy are in involution with respect to both Poisson brackets. Let us give an example of the Hamiltonian equations of motion. Taking

$$H_1 = \frac{1}{4}\int dx\,u^2, \qquad H_3 = -\frac{1}{16}\int dx\,(u'^2 + 2u^3)$$

one gets:

$$\partial_{t_3}u = \{H_3, u\}_1 = \{H_1, u\}_2 = \frac{1}{4}(-6uu' + u''')$$

illustrating the fact that one finds the same equations of motion with the two Poisson brackets, but with different Hamiltonians.
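The two evaluations can be written out explicitly. The computation below is a sketch under the assumption that boundary terms vanish (periodic or fast decreasing $u$); it uses $(\partial_x - \partial_y)\delta(x-y) = 2\delta'(x-y)$ and $(\partial_x^3 - \partial_y^3)\delta(x-y) = 2\delta'''(x-y)$, where $\delta'$ denotes the derivative of $\delta$ with respect to its argument.

```latex
\begin{aligned}
% first bracket, eq. (11.35):  \{u(x),u(y)\}_1 = -2\,\delta'(x-y)
\{H_3, u(y)\}_1 &= \int dx\,\frac{\delta H_3}{\delta u(x)}\,\{u(x), u(y)\}_1
   = 2\,\partial_y\frac{\delta H_3}{\delta u(y)},
 \qquad \frac{\delta H_3}{\delta u} = \tfrac18\,u'' - \tfrac38\,u^2, \\
 &\Rightarrow\quad \{H_3, u\}_1 = \tfrac14\,(u''' - 6uu'), \\[4pt]
% second bracket, eq. (11.36):  \{u(x),u(y)\}_2 = (u(x)+u(y))\,\delta'(x-y) - \tfrac12\,\delta'''(x-y)
\{H_1, u(y)\}_2 &= \int dx\,\tfrac12\,u(x)\Big[\big(u(x) + u(y)\big)\,\delta'(x-y) - \tfrac12\,\delta'''(x-y)\Big]
   = -\tfrac32\,uu' + \tfrac14\,u''' .
\end{aligned}
```

Both lines reproduce the right-hand side of the KdV equation (11.1) divided by 4, with $t_3 = t$.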
11.4 Soliton solutions

Considering the KP hierarchy with $Q = L^{1/2}$, we see that the equations of motion $\partial_{t_i}\Phi = -(L^{i/2})_-\Phi$ imply that $\Phi$ is stationary with respect to the even times $t_{2j}$. Conversely, any solution of the KP hierarchy which is stationary for any even time is such that $(Q^{2j})_- = 0$, hence, in particular, $Q^2 = L$ is a differential operator. This solution is thus a solution of the KdV hierarchy. At the tau-function level, to obtain the decoupling of the even time variables it is sufficient to have:

$$\tau_{KP}(t_{even}, t_{odd}) = e^{\sum_{n\ \mathrm{even}} c_nt_n}\,\tau_{KdV}(t_{odd}) \tag{11.40}$$
This is because the action of a Hirota differential operator with respect to even time variables on such a KP tau-function vanishes:

$$D_{t_{2k}}^m\,e^{c_{2k}t_{2k}}\tau_{KdV}(t_{odd})\cdot e^{c_{2k}t_{2k}}\tau_{KdV}(t_{odd}) = \partial_y^m\,e^{c_{2k}(t_{2k}+y)}\tau_{KdV}(t_{odd})\,e^{c_{2k}(t_{2k}-y)}\tau_{KdV}(t_{odd})\Big|_{y=0} = 0$$

The Hirota equations of the KdV hierarchy are thus obtained from the Hirota equations of the KP hierarchy by simply erasing the even times. We get, for instance, the Hirota form of the KdV equation:

$$(D_1^4 - 4D_1D_3)\,\tau\cdot\tau = 0 \tag{11.41}$$

(compare with eq. (8.56) in Chapter 8). Setting

$$u = -2\frac{\partial^2}{\partial t_1^2}\log\tau = -\frac{D_1^2\,\tau\cdot\tau}{\tau^2} \tag{11.42}$$

we recover the KdV equation on $u$. Indeed, one has:

$$-u'' + 3u^2 = \frac{D_1^4\,\tau\cdot\tau}{\tau^2}, \qquad D_1^4\,\tau\cdot\tau = 2\tau^{(iv)}\tau - 8\tau'''\tau' + 6(\tau'')^2$$

$$-\dot{u} = \partial_x\frac{D_1D_3\,\tau\cdot\tau}{\tau^2}, \qquad D_1D_3\,\tau\cdot\tau = 2(\dot{\tau}'\tau - \tau'\dot{\tau})$$
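As an illustration, the Hirota equation (11.41) can be checked directly on the simplest non-trivial tau-function. The example below is a sketch that assumes $\tau = 1 + E$ with $E = e^{2px + 2p^3t}$, where $x = t_1$, $t = t_3$ and $p$ is an arbitrary constant; this is the one-soliton form used later in this section.

```latex
% Derivatives of \tau = 1 + E,  E = e^{2px + 2p^3 t}:
%   \tau' = 2pE,  \tau'' = 4p^2E,  \tau''' = 8p^3E,  \tau^{(iv)} = 16p^4E,
%   \dot\tau = 2p^3E,  \dot\tau' = 4p^4E
\begin{aligned}
D_1^4\,\tau\cdot\tau &= 2\cdot 16p^4E\,(1+E) - 8\cdot 8p^3E\cdot 2pE + 6\cdot 16p^4E^2 = 32\,p^4E, \\
D_1D_3\,\tau\cdot\tau &= 2\big(4p^4E\,(1+E) - 2pE\cdot 2p^3E\big) = 8\,p^4E, \\
(D_1^4 - 4D_1D_3)\,\tau\cdot\tau &= 32\,p^4E - 32\,p^4E = 0 .
\end{aligned}
```

The quadratic terms in $E$ cancel separately in each bilinear expression, which is why a single exponential always solves a Hirota equation once its dispersion relation, here $\omega = p^3$, is satisfied.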
Combining these expressions we see that the KdV equation is equivalent to the Hirota equation (11.41).

Recall that the KP tau-functions are constructed by choosing an element $g \in GL(\infty)$ (see eq. (9.34) in Chapter 9):

$$\tau_{KP}(t) = \langle 0|e^{H(t)}g|0\rangle$$

with $H(t) = \sum_n H_nt_n$ and where $H_n$ are bosonic oscillators (not to be confused with the Hamiltonians). See Chapter 9. This tau-function satisfies the bilinear identity (9.36), which reduces to the KdV Hirota bilinear identity whenever $\tau_{KP}(t)$ is of the form eq. (11.40). The main problem is to find the group elements $g$ such that this property holds. Recall that the Lie algebra of $GL(\infty)$ consists of fermionic bilinears of the form $X = \sum_{rs} M_{rs}:\beta_r\beta^*_{-s}:$ (see eq. (9.13) in Chapter 9). If $g$ commutes with the $H_{2k}$, one can push the term $\exp\big(\sum_k t_{2k}H_{2k}\big)$ in $\tau_{KP}$ to the right, where it hits the vacuum and disappears since $H_{2k}|0\rangle = 0$. In fact commutation up to a central element is sufficient since a central term would produce an exponential of a linear combination of the even times. Using eq. (9.26) in Chapter 9, we have:

$$[H_n, \beta_r] = \beta_{r+n}, \qquad [H_n, \beta^*_s] = -\beta^*_{s+n}$$

so that the $H_{2k}$ commute with $X$, up to central terms, if $M_{rs} = M_{r+2,s-2}$. This means that the infinite band matrix $M_{rs}$ is made of diagonals whose elements reproduce themselves with period 2. This characterizes the Lie algebra $\widehat{sl(2)} \subset gl(\infty)$, see Chapter 16. We have found:

Proposition. The tau-function of the KdV hierarchy is given by:

$$\tau_{KdV}(t) = \langle 0|e^{H(t)}g|0\rangle, \quad\text{with}\quad H(t) = \sum_{k>0} t_{2k-1}H_{2k-1} \quad\text{and}\quad g \in \widehat{SL(2)}$$
As an application, we can find the KdV soliton solutions. Recall that the KP soliton solutions are obtained for $g = \prod_i g_i$ with $g_i = 1 + a_i\beta(p_i)\beta^*(q_i)$, where $\beta(p)$ and $\beta^*(q)$ are the fermionic fields. Since we have:

$$[H_n, \beta(z)\beta^*(w)] = (z^n - w^n):\beta(z)\beta^*(w): + \frac{z^n - w^n}{z - w}$$

we see that $g_i$ commutes with $H_{2k}$ if $p_i = -q_i$. Hence we have:

Proposition. The $n$-soliton tau-functions of the KdV hierarchy are:

$$\tau_n(t, g) = \langle 0|e^{H(t)}\prod_{i=1}^{n}\big(1 + a_i\beta(p_i)\beta^*(-p_i)\big)|0\rangle \tag{11.43}$$
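Explicitly, they are equal to:

$$\tau_n(X|p) = 1 + \sum_{p=1}^{n}\ \sum_{\substack{I\subset\{1,\ldots,n\} \\ |I|=p}}\ \prod_{i\in I}X_i\prod_{i<j\in I}\Big(\frac{p_i - p_j}{p_i + p_j}\Big)^2 = 1 + \sum_i X_i + \sum_{i<j}X_iX_j\Big(\frac{p_i - p_j}{p_i + p_j}\Big)^2 + \cdots \tag{11.44}$$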
where $\xi(p, t) = \sum_{n\ \mathrm{odd}} p^nt_n$, and $X_i = \frac{a_i}{2p_i}e^{2\xi(p_i,t)}$. This can be written in compact form as:

$$\tau_n(X|p) = \det(1 + W), \quad\text{with}\quad W_{ij} = \sqrt{X_i}\,\frac{\sqrt{4p_ip_j}}{p_i + p_j}\,\sqrt{X_j} \tag{11.45}$$
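The equality of the subset expansion (11.44) and the determinant form (11.45) can be illustrated numerically. The sketch below is not from the text: it assumes real positive $p_i$ and $a_i$ (so that the square roots are real), keeps only the times $t_1 = x$ and $t_3$, and uses a small Gaussian-elimination determinant so that it stays within the standard library. The names `tau_subsets`, `tau_det`, `det` and `soliton_X` are hypothetical.

```python
import math
from itertools import combinations

def soliton_X(a, p, x, t3):
    """X_i = a_i/(2 p_i) * exp(2 xi(p_i, t)) with only t_1 = x and t_3 switched on."""
    return [ai / (2.0 * pi) * math.exp(2.0 * (pi * x + pi**3 * t3)) for ai, pi in zip(a, p)]

def tau_subsets(X, p):
    """Subset expansion of eq. (11.44)."""
    n = len(X)
    total = 1.0
    for size in range(1, n + 1):
        for I in combinations(range(n), size):
            term = 1.0
            for i in I:
                term *= X[i]
            for i, j in combinations(I, 2):
                term *= ((p[i] - p[j]) / (p[i] + p[j])) ** 2
            total += term
    return total

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def tau_det(X, p):
    """Determinant form of eq. (11.45): det(1 + W)."""
    n = len(X)
    M = [[(1.0 if i == j else 0.0)
          + math.sqrt(X[i]) * math.sqrt(4.0 * p[i] * p[j]) / (p[i] + p[j]) * math.sqrt(X[j])
          for j in range(n)] for i in range(n)]
    return det(M)

p = [1.5, 1.0, 0.4]
a = [1.0, 2.0, 0.5]
X = soliton_X(a, p, x=0.3, t3=-0.2)
tau_a, tau_b = tau_subsets(X, p), tau_det(X, p)
```

The two numbers agree to rounding error because each principal minor of $W$ is a Cauchy determinant, which is the content of the proof that follows.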
Proof. We have $e^{H(t)}\beta(p_i)e^{-H(t)} = e^{\xi(t,p_i)}\beta(p_i)$ and $e^{H(t)}\beta^*(-p_i)e^{-H(t)} = e^{\xi(t,p_i)}\beta^*(-p_i)$, so that pushing $e^{H(t)}$ to the right amounts to replacing $a_i \to (2p_i)X_i$. The expression (11.44) is obtained by applying Wick’s theorem.

$$\langle 0|\prod_i\big(1 + 2p_iX_i\beta(p_i)\beta^*(-p_i)\big)|0\rangle = 1 + \sum_i 2p_iX_i\,\langle 0|\beta(p_i)\beta^*(-p_i)|0\rangle + \sum_{i<j}4p_ip_jX_iX_j\,\langle 0|\beta(p_i)\beta^*(-p_i)\beta(p_j)\beta^*(-p_j)|0\rangle + \cdots$$
Each of these vacuum expectation values is a determinant (see eq. (9.19) in Chapter 9). We get: 0| (1 + 2pi Xi β(pi )β ∗ (−pi ))|0 i
=1+
Xi +
i
Xi Xj det
i<j
1
+ ··· pi + pj
(11.46)
By the Cauchy formula, eq. (9.33), these determinants can be rewritten as in eq. (11.44). On the other hand, eq. (11.45) reduces to eq. (11.46) by virtue of the expansion formula, eq. (9.47) in Chapter 9.

Remark 1. Using the bosonization formula of Chapter 9, one recognizes that the operator

$$V(p) = p\,\beta(p)\beta^*(-p)$$

coincides with the vertex operator defining the level one vertex representation of the affine $\widehat{sl(2)}$ algebra (see Chapter 16). In the bosonic representation, the group elements $g_i$ can be written:

$$g_i \equiv 1 + a_i\beta(p_i)\beta^*(-p_i) = 1 + \frac{a_i}{p_i}V(p_i)$$

Hence the soliton solutions of KdV are directly related to vacuum expectation values of vertex operators.
The one-soliton solution is obtained when $\tau = 1 + X$, with $X = e^{2p(x - x_0) + 2p^3t}$. One gets:

$$u(x, t) = -\frac{2p^2}{\cosh^2\big(p(x - x_0) + p^3t\big)}$$
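A quick numerical sanity check of the one-soliton formula against eq. (11.1) is sketched below. It is an illustration only: derivatives are approximated by central finite differences with a hypothetical step `h`, so the residual is small but not exactly zero, and the function names are not from the text.

```python
import math

def u_soliton(x, t, p, x0=0.0):
    """One-soliton profile u = -2 p^2 / cosh^2(p (x - x0) + p^3 t)."""
    return -2.0 * p**2 / math.cosh(p * (x - x0) + p**3 * t) ** 2

def kdv_residual(x, t, p, h=1e-3):
    """Finite-difference estimate of 4 u_t + 6 u u_x - u_xxx, which eq. (11.1) sets to zero."""
    def u(xx, tt):
        return u_soliton(xx, tt, p)
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xxx = (u(x + 2 * h, t) - 2.0 * u(x + h, t)
             + 2.0 * u(x - h, t) - u(x - 2 * h, t)) / (2.0 * h**3)
    return 4.0 * u_t + 6.0 * u(x, t) * u_x - u_xxx

# Sample the residual at a few points for p = 0.8.
residuals = [kdv_residual(x, t, 0.8) for (x, t) in [(0.0, 0.0), (0.7, -0.3), (-1.2, 0.5)]]
```

The individual terms are of order one at these points, so a residual several orders of magnitude smaller indicates that the profile, with phase $p(x - x_0) + p^3t$, is compatible with the normalization of eq. (11.1).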
It corresponds to a bump (or rather a dip) of height $2p^2$ propagating with velocity $-p^2$. In sharp contrast with the case of linear partial differential equations where wave packets spread out in time, here the bump preserves its shape for all times. Note that the centre of the bump is located at $X(x, t) = 1$.

Consider now the general $n$-soliton solution, eq. (11.44). We can analyse its shape asymptotically when $t \to \pm\infty$. Assume that $p_1 > p_2 > \cdots > p_n > 0$. We want to show that we have asymptotically $n$ solitons moving from right to left with velocities $-(p_1)^2, \ldots, -(p_n)^2$. Let us assume $t \to -\infty$ and consider what happens around $X_i = 1$, i.e. $x_i(t) = -p_i^2t$. The values of the other $X_j$ when $x \sim x_i(t)$ are: $X_j \sim (a_j/2p_j)\exp\big(2p_j(p_j^2 - p_i^2)t\big)$. Hence for large negative time, if $p_j^2 < p_i^2$ then $X_j(x_i(t), t)$ is very large, while if $p_j^2 > p_i^2$, then $X_j(x_i(t), t)$ is very small. So we can split the indices $j$ into two subsets, relative to the index $i$. One subset $I_+$ is such that $p_j^2 < p_i^2$ and corresponds to $X_j$ very large. The other one $I_-$ is such that $p_j^2 > p_i^2$ and corresponds to $X_j$ very small. To evaluate the tau functions, eq. (11.44), when $x \sim x_i(t)$, one has to keep the terms containing the maximum number of $X_j$, $j \in I_+$. There are two such terms, yielding: $\tau(x, t)|_{x\sim x_i(t)} \sim$
j∈I+
Xj
j1
p2j−1 i
i
So the complete set of conserved quantities is provided by:

$$H_{2j-1} = \frac{4}{2j+1}\sum_{i=1}^{n} p_i^{2j+1} \tag{11.47}$$
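For a single soliton the values in eq. (11.47) can be compared with a direct numerical integration of the densities $H_1 = \frac14\int dx\,u^2$ and $H_3 = -\frac{1}{16}\int dx\,(u'^2 + 2u^3)$ quoted in Section 11.3. The sketch below is an illustration under stated assumptions: the profile is taken at $t = 0$ with $x_0 = 0$, the infinite line is truncated to a hypothetical interval `[-half_width, half_width]`, and the trapezoid rule is used.

```python
import math

def soliton(x, p):
    """Return (u, u_x) for u = -2 p^2 / cosh^2(p x)."""
    s = 1.0 / math.cosh(p * x) ** 2
    return -2.0 * p**2 * s, 4.0 * p**3 * s * math.tanh(p * x)

def trapezoid(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def h1(p, half_width=40.0, n=20000):
    """H_1 = (1/4) * integral of u^2."""
    return trapezoid(lambda x: 0.25 * soliton(x, p)[0] ** 2, -half_width, half_width, n)

def h3(p, half_width=40.0, n=20000):
    """H_3 = -(1/16) * integral of (u_x^2 + 2 u^3)."""
    def density(x):
        u, ux = soliton(x, p)
        return -(ux**2 + 2.0 * u**3) / 16.0
    return trapezoid(density, -half_width, half_width, n)

p = 1.3
values = (h1(p), h3(p))
expected = (4.0 / 3.0 * p**3, 4.0 / 5.0 * p**5)
```

The integrands decay exponentially, so the truncation error is negligible for the chosen interval.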
Because the Hamiltonians $H_{2j-1}$ are conserved for any $j$, it follows that in the scattering process of solitons, only permutations of the $p_i$ can occur. The scattering of solitons is completely described by this permutation and the time delays $\delta_i$.

11.5 Algebro-geometric solutions

We wish to apply the analytical methods of Chapter 5 to construct solutions of the KdV equation. As explained in that chapter, one way to get a Lax matrix compatible with the equations of the KdV hierarchy is to seek stationary solutions with respect to some higher time $t_j$. Then the zero-curvature condition, $\partial_iA_j - \partial_jA_i - [A_i, A_j] = 0$, becomes a Lax equation since the stationarity condition with respect to time $t_j$ means $\partial_jA_i = 0$, for all $i$. The Lax matrix is $A_j$ and its associated spectral curve is independent of all times $t_i$. A very simple example of this situation occurs when $u$ is stationary with respect to $t_3 = t$. In that case the Lax matrix is $A_t$ given in eq. (11.2). The associated spectral curve is:

$$\Gamma:\quad \det(A_t - \mu) = \mu^2 - \lambda^3 + \frac{1}{4}(3u^2 - u'')\lambda + \frac{1}{16}(2uu'' - u'^2 - 4u^3) = 0$$
The zero-curvature condition becomes the Lax equation $\partial_xA_t = [A_x, A_t]$ and reduces to the stationary KdV equation $6uu' - u''' = 0$. Integrating, one gets $3u^2 - u'' = C_1$ and $2u^3 - u'^2 = 2C_1u + C_2$ for some constants $C_1$, $C_2$. So the spectral curve reads $\mu^2 = \lambda^3 - C_1\lambda/4 - C_2/16$, and is independent of $x$ as it should be. This is a genus 1 curve, so that $u$ is given by an elliptic integral.

More interesting solutions will be obtained by assuming $u$ to be stationary with respect to some higher time $t_{2j-1}$. Let us compute the matrices $A_{t_{2j-1}}$. We start from the equations of the hierarchy, eq. (11.25). Since $L^{\frac{2j-1}{2}}$ is anti self-adjoint, for either one of the two solutions $\Psi$ and $\Psi^*$ of the KdV hierarchy we have:

$$\partial_{t_{2j-1}}\begin{pmatrix}\Psi \\ \partial_x\Psi\end{pmatrix} = \begin{pmatrix}(L^{\frac{2j-1}{2}})_+\Psi \\ \partial_x(L^{\frac{2j-1}{2}})_+\Psi\end{pmatrix} = A_{t_{2j-1}}\begin{pmatrix}\Psi \\ \partial_x\Psi\end{pmatrix}$$

Using the identity (11.27) and $\partial_x^2\Psi = (\lambda + u)\Psi$ one gets:

$$A_{t_{2j-1}} = \sum_{i=0}^{j-1}\lambda^{j-i-1}\begin{pmatrix} -\frac{1}{2}S_{2i}' & S_{2i} \\ (\lambda + u)S_{2i} - \frac{1}{2}S_{2i}'' & \frac{1}{2}S_{2i}' \end{pmatrix}$$

Notice that $A_{t_{2j-1}}$ depends only on $\lambda$, in agreement with the fact that $\Psi$ and $\Psi^*$ play the same role. In particular, for $j = 1$ one finds $A_x$, and for $j = 2$ one finds $A_t$. Writing $(L^{\frac{2j-1}{2}})_+ = L^{\frac{2j-1}{2}} - (L^{\frac{2j-1}{2}})_-$, we see that:

$$A_{t_{2j-1}}\begin{pmatrix}\Psi \\ \partial_x\Psi\end{pmatrix} = \lambda^{\frac{2j-1}{2}}\begin{pmatrix}\Psi \\ \partial_x\Psi\end{pmatrix} + O(1)$$

Hence we have identified, asymptotically for $\lambda \to \infty$, the eigenvectors of $A_{t_{2j-1}}$ and the eigenvalues:

$$\mu = \lambda^{\frac{2j-1}{2}} + O(1) \tag{11.48}$$
The matrix $A_{t_{2j-1}}$ being traceless $2\times 2$, its associated spectral curve is a hyperelliptic curve of genus $(j - 1)$ of the form $\mu^2 = R(\lambda)$ with $R(\lambda) = \prod_{i=1}^{2j-1}(\lambda - \lambda_i) = -\det A_{t_{2j-1}}$. This curve is not a general hyperelliptic Riemann surface of genus $(j - 1)$. In fact, since $\mu = \sqrt{R(\lambda)}$ is an eigenvalue of $A_{t_{2j-1}}(\lambda)$, it has to be of the specific form eq. (11.48), showing that $R(\lambda)$ has the very special form $R(\lambda) = \lambda^{2j-1} + C_1\lambda^{j-1} + C_2\lambda^{j-2} + \cdots$. To overcome this peculiarity, we notice that the stationarity condition can be generalized by imposing the condition:

$$\sum_j c_j\,\partial_{t_{2j-1}}u = 0$$

for some constant coefficients $c_j$. Then the corresponding Lax matrix becomes $\sum_j c_jA_{t_{2j-1}}$. For any time $t_{2i-1}$, the zero curvature condition implies the Lax equation (because the $A_{t_{2j-1}}$ depend only on $u$):

$$\partial_{t_{2i-1}}\sum_j c_jA_{t_{2j-1}} = \Big[A_{t_{2i-1}},\ \sum_j c_jA_{t_{2j-1}}\Big]$$
In the following we consider the hyperelliptic curve $\Gamma$ constructed from such a Lax matrix. It is of the generic form:

$$\Gamma:\quad \mu^2 = R(\lambda) = \prod_{i=1}^{2g+1}(\lambda - \lambda_i) \tag{11.49}$$

The point at $\infty$ is a branch point, and a local parameter around that point is $z = \sqrt{\lambda}$. We want to construct a section of the eigenvector bundle on $\Gamma$, obeying

$$(\partial_{t_{2i-1}} - A_{t_{2i-1}})\begin{pmatrix}\Psi \\ \partial_x\Psi\end{pmatrix} = 0, \qquad \forall i$$

The choice $\partial_x\Psi$ for the second component is dictated by the equation for $i = 1$, which then reduces to:

$$(\partial_x^2 - u)\Psi = \lambda\Psi$$

So it is enough to consider $\Psi$. A consequence of eq. (11.48) is that $\Psi$ has the asymptotic behaviour at infinity $\Psi = e^{\xi(t,z)}(1 + O(z^{-1}))$, where $z = \sqrt{\lambda}$. We know (see Chapter 5) that the eigenvector $(\Psi, \partial_x\Psi)$ has $g + N - 1 = g + 1$ poles on $\Gamma$ ($N$ is the size of the Lax matrix). Here one of the poles is at $\infty$ because $\partial_x\Psi \sim z\Psi$ and $z = \sqrt{\lambda}$ is the local parameter at $\infty$. Hence we require that $\Psi$ has $g$ poles $(\gamma_1, \ldots, \gamma_g)$ at finite distance. Recall that the positions of these poles are independent of all the times $t_{2j-1}$. With these data we
construct the Baker–Akhiezer function on $\Gamma$ which is the unique function with the following analytical properties:

• It has an essential singularity at the point $P$ at infinity:

$$\Psi(t, z) = e^{\xi(t,z)}\Big(1 + \frac{\alpha(t)}{z} + O(1/z^2)\Big) \tag{11.50}$$

where $z = \sqrt{\lambda}$ and $\xi(t, z) = \sum_{i\ge 1} z^{2i-1}t_{2i-1}$.

• It has $g$ simple poles, independent of all times. The divisor of these poles is $D = (\gamma_1, \ldots, \gamma_g)$.

This Baker–Akhiezer function solves the KdV hierarchy equations as the following two propositions show.

Proposition. There exists a function $u(x, t)$ such that

$$(\partial_x^2 - u)\Psi = \lambda\Psi \tag{11.51}$$

Proof. Consider on $\Gamma$ the function $\partial_x^2\Psi - \lambda\Psi$. To define this object as a function on the curve, $\lambda$ is viewed as a meromorphic function on $\Gamma$. Note that $\lambda$ has only a double pole at $\infty$ and such a function exists only if $\Gamma$ is hyperelliptic. We see that $\partial_x^2\Psi - \lambda\Psi$ has the same analytical properties as $\Psi$ itself at finite distance on $\Gamma$. At infinity we have by eq. (11.50):

$$\partial_x^2\Psi - \lambda\Psi = e^{\xi(t,z)}\big(2\partial_x\alpha + O(1/z)\big), \qquad z = \sqrt{\lambda}$$

So it is a Baker–Akhiezer function, but with a normalization $2\partial_x\alpha$ instead of 1 at infinity. By the uniqueness theorem of such functions, we have:

$$\partial_x^2\Psi - \lambda\Psi = u\Psi, \qquad u = 2\partial_x\alpha \tag{11.52}$$
Having found the potential $u$, we construct the differential operator $L = \partial^2 - u$ and show that the Baker–Akhiezer function $\Psi$ obeys all the equations of the associated KdV hierarchy.

Proposition. The evolution of $\Psi$ is given by:

$$\partial_{t_{2i-1}}\Psi = \big(L^{\frac{2i-1}{2}}\big)_+\Psi$$

where $L = \partial^2 - u$ is the KdV operator constructed above.

Proof. Consider the function $\partial_{t_{2i-1}}\Psi - (L^{\frac{2i-1}{2}})_+\Psi$. It has the same analytical properties as $\Psi$ at finite distance on $\Gamma$. At infinity we have $\partial_{t_{2i-1}}\Psi = z^{2i-1}\Psi + e^{\xi}O(1/z)$ and $(L^{\frac{2i-1}{2}})_+\Psi = L^{\frac{2i-1}{2}}\Psi - (L^{\frac{2i-1}{2}})_-\Psi = z^{2i-1}\Psi + e^{\xi}O(1/z)$, where we have used $L\Psi = z^2\Psi$. Summarizing, we get:

$$\partial_{t_{2i-1}}\Psi - \big(L^{\frac{2i-1}{2}}\big)_+\Psi = e^{\xi(t,z)}O(z^{-1}), \qquad z \to \infty$$

By uniqueness, this Baker–Akhiezer function which vanishes at $\infty$ vanishes identically.

Remark. Because the Schroedinger operator in eq. (11.51) is self-adjoint, the adjoint Baker–Akhiezer function $\Psi^*(P)$ is easily related to $\Psi(P)$. In fact one can choose $\Psi^*(P) = \Psi(\sigma(P))$ where $\sigma$ is the hyperelliptic involution on $\Gamma$. Note, however, that this choice does not correspond to the normalization selected by the definition of $\Psi^*(P)$ in terms of pseudodifferential operators.
The Baker–Akhiezer function $\Psi(t, z)$ can be written explicitly in terms of theta-functions, see Chapter 5. Let $\Omega^{(2j-1)}$ be the unique normalized second kind differential (all the $a$-periods vanish) with a pole of order $2j$ at infinity, such that:

$$\Omega^{(2j-1)} = d\big(z^{2j-1} + \text{regular}\big), \qquad\text{for } z \to \infty$$

Let $U_k^{(2j-1)}$ be its $b$-periods:

$$U_k^{(2j-1)} = \frac{1}{2i\pi}\oint_{b_k}\Omega^{(2j-1)}$$

In terms of these data we have:

Proposition. The Baker–Akhiezer function with the divisor of poles $D = (\gamma_1, \ldots, \gamma_g)$ can be expressed as:

$$\Psi(t, P) = e^{\sum_j t_{2j-1}\int_\infty^P\Omega^{(2j-1)}}\ \frac{\theta\big(A(P) + \sum_j t_{2j-1}U^{(2j-1)} - \zeta\big)\,\theta(\zeta)}{\theta\big(A(P) - \zeta\big)\,\theta\big(\sum_j t_{2j-1}U^{(2j-1)} - \zeta\big)} \tag{11.53}$$

where $A(P)$ is the Abel map on $\Gamma$ with base point $\infty$, and $\zeta = A(D) + K$ with $K$ the Riemann’s constant vector. The KdV field, $u$, is given by the Its–Matveev formula:

$$u(x, t) = -2\partial_x^2\log\theta\Big(\sum_j t_{2j-1}U^{(2j-1)} - \zeta\Big) + \text{const.} \tag{11.54}$$
403
11.5 Algebro-geometric solutions
P Proof. In eq. (11.53) the integral ∞ has to be understood in the follow P ing sense: for z in a vicinity of ∞, one defines ∞ Ω(2j−1) as the unique primitive of Ω(2j−1) which behaves as z 2j−1 + O(1/z) (no constant term). Of course, when this is analytically continued on the Riemann surface, b-periods will appear. However, they will cancel out in eq. (11.53) due to the monodromy properties of theta-functions, leaving us with a welldefined normalized Baker–Akhiezer function. The formula for the KdV field is found by using: λ + u = (∂x2 log Ψ) + (∂x log Ψ)2 Setting Ω(1) (z) = d(z + we have:
β z
+ O(z −2 )), where β does not depend on times,
∂x log Ψ = z + ∂x log θ A(P ) + t2j−1 U (2j−1) − ζ −∂x log θ
j
j
β t2j−1 U (2j−1) − ζ + + O(z −2 ) z
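We evaluate this expression when $z \to \infty$. Using Riemann's bilinear identities, we can expand the Abel map $A(P)$ around $\infty$ (see eq. (5.56) in Chapter 5), and we have:
$$\theta\Big(A(P) + \sum_j t_{2j-1} U^{(2j-1)} - \zeta\Big) = \theta\Big(\sum_j \Big(t_{2j-1} - \frac{z^{-2j+1}}{2j-1}\Big) U^{(2j-1)} - \zeta\Big)$$
Keeping the $1/z$ terms, we obtain:
$$\partial_x \log \Psi = z - \frac{1}{z}\, \partial_x^2 \log \theta\Big(\sum_j t_{2j-1} U^{(2j-1)} - \zeta\Big) + \frac{\beta}{z} + O\Big(\frac{1}{z^2}\Big)$$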
Differentiating once more with respect to $x$, we also get $\partial_x^2 \log \Psi = O(1/z)$. It follows that $z^2 + u = z^2 - 2\partial_x^2 \log \theta + 2\beta + O(1/z)$, proving the result.
On a hyperelliptic curve $\Gamma$, one can easily express the Baker–Akhiezer function $\Psi$ knowing the divisor of its zeroes. We concentrate first on the $x$ dependence. The higher times $t_{2j-1}$ will be considered next. Let $D(x)$ be the divisor of the zeroes of the Baker–Akhiezer function $\Psi$. It is of degree $g$, and coincides with the divisor of the poles, $D$, for $x = 0$:
$$D(x) \equiv \{\gamma_1(x), \ldots, \gamma_g(x)\} \tag{11.55}$$
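The points $\gamma_i(x)$ have coordinates $\big(\lambda_{\gamma_i(x)},\ \mu_{\gamma_i(x)} = \sqrt{R(\lambda_{\gamma_i(x)})}\big)$. In this formula, the expression $\sqrt{R(\lambda_{\gamma_i(x)})}$ refers to the determination of the square root corresponding to the sheet to which the point $\gamma_i(x)$ belongs, while $-\sqrt{R(\lambda_{\gamma_i(x)})}$ corresponds to the other sheet. Let us define the polynomial in $\lambda$:
$$B(\lambda, x) = \prod_{i=1}^{g} \big(\lambda - \lambda_{\gamma_i(x)}\big)$$
In terms of these data, we have:
Proposition. The Baker–Akhiezer function is equal to:
$$\frac{\Psi(x, \lambda)}{\Psi(x_0, \lambda)} = \sqrt{\frac{B(\lambda, x)}{B(\lambda, x_0)}}\; \exp\left(\int_{x_0}^{x} \frac{\sqrt{R(\lambda)}}{B(\lambda, x)}\, dx\right) \tag{11.56}$$
The KdV potential $u$ is expressed as:
$$u = 2 \sum_{i=1}^{g} \lambda_{\gamma_i(x)} - \sum_{j=1}^{2g+1} \lambda_j \tag{11.57}$$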
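Proof. We need to find the equations governing the $x$ dependence of the divisor $D(x)$. To this end we consider the function $\partial_x \Psi / \Psi$. It is a meromorphic function on $\Gamma$, has poles at the points $\gamma_i(x)$ and behaves like $z + O(1/z)$ at infinity. Hence we can write
$$\frac{\partial_x \Psi}{\Psi} = \frac{\sqrt{R(\lambda)} + Q(\lambda, x)}{\prod_{i=1}^{g} (\lambda - \lambda_{\gamma_i(x)})} \tag{11.58}$$
where $Q(\lambda, x)$ is a polynomial of degree $g - 1$ in $\lambda$. We determine $Q(\lambda, x)$ by requiring that $\partial_x \Psi / \Psi$ has a pole above $\lambda = \lambda_{\gamma_i(x)}$ on the sheet $\mu_{\gamma_i(x)}$ and not on $-\mu_{\gamma_i(x)}$. Thus we find the $g$ conditions $Q(\lambda_{\gamma_i(x)}, x) = \mu_{\gamma_i(x)}$ which completely determine the polynomial:
$$Q(\lambda, x) = \sum_i \mu_{\gamma_i(x)}\, \frac{\prod_{j \neq i} (\lambda - \lambda_{\gamma_j(x)})}{\prod_{j \neq i} (\lambda_{\gamma_i(x)} - \lambda_{\gamma_j(x)})}$$
On the other hand, in the vicinity of $\lambda_{\gamma_i(x)}$, we have:
$$\frac{\partial_x \Psi}{\Psi} = -\frac{\partial_x \lambda_{\gamma_i(x)}}{\lambda - \lambda_{\gamma_i(x)}} + O(1) \tag{11.59}$$
because in the vicinity of the zero $\lambda_{\gamma_i}$, we have $\Psi(x, \lambda) \sim (\lambda - \lambda_{\gamma_i})\tilde{\Psi}$. Comparing the residues of the poles at $\lambda = \lambda_{\gamma_i}$ in eq. (11.58) and in eq. (11.59), we get the equations of motion for the divisor $D(x)$:
$$\partial_x \lambda_{\gamma_i(x)} = -2\, \frac{\mu_{\gamma_i(x)}}{\prod_{j \neq i} (\lambda_{\gamma_i(x)} - \lambda_{\gamma_j(x)})} \tag{11.60}$$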
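One can now reconstruct the Baker–Akhiezer function itself. Indeed, inserting eq. (11.60) into eq. (11.58), we get:
$$\frac{\partial_x \Psi}{\Psi} = \frac{\sqrt{R(\lambda)}}{B(\lambda, x)} - \frac{1}{2} \sum_i \frac{\partial_x \lambda_{\gamma_i(x)}}{\lambda - \lambda_{\gamma_i(x)}}$$
Since the last sum equals $-\partial_x \log B(\lambda, x)$, integrating this formula from $x_0$ to $x$ gives eq. (11.56). To compute the potential $u(x)$, we insert eq. (11.56) into $(\partial_x^2 - u)\Psi = \lambda \Psi$. We get the polynomial identity
$$R = -\frac{1}{2} B B'' + \frac{1}{4} B'^2 + (u + \lambda) B^2$$
where $' = \partial_x$. Comparing the terms in $\lambda^{2g}$ we obtain eq. (11.57).
One can find the generalization of eq. (11.60) for any flow of the KdV hierarchy.
Proposition. On the coordinates $\lambda_{\gamma_i}$ of the divisor $D(x)$ the equations of motion with respect to the time $t_{2j-1}$ read:
$$\partial_{t_{2j-1}} \lambda_{\gamma_i} = -2\, \frac{P_j(\lambda_{\gamma_i})\, \mu_{\gamma_i}}{\prod_{l \neq i} (\lambda_{\gamma_i} - \lambda_{\gamma_l})}, \qquad P_j(\lambda) = \left( \lambda^{j-1}\, \frac{\prod_{i=1}^{g} (1 - \lambda_{\gamma_i} \lambda^{-1})}{\sqrt{\prod_{i=1}^{2g+1} (1 - \lambda_i \lambda^{-1})}} \right)_+ \tag{11.61}$$
where the $(\ )_+$ means taking the polynomial part in the expansion at $\lambda = \infty$.
Proof. The only difference from the previous discussion is that, in the derivation of eq. (11.58), the behaviour at $\infty$ is replaced by:
$$\partial_{t_{2j-1}} \Psi / \Psi = z^{2j-1} + O(1/z)$$
Following the same reasoning as before, we can write:
$$\frac{\partial_{t_{2j-1}} \Psi}{\Psi} = \frac{P_j(\lambda) \sqrt{R(\lambda)} + Q_j(\lambda)}{B(\lambda)} \tag{11.62}$$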
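where $P_j(\lambda)$ is a polynomial in $\lambda$ of degree $(j - 1)$ and $Q_j(\lambda)$ is a polynomial in $\lambda$ of degree $(g - 1)$. The polynomial $P_j(\lambda) = \lambda^{j-1} + \cdots$ is uniquely determined by imposing the asymptotic behaviour on eq. (11.62), which gives $(j - 1)$ linear conditions. The solution is given by eq. (11.61). The polynomial $Q_j$ is determined as above, yielding $Q_j(\lambda_{\gamma_i}) = \mu_{\gamma_i} P_j(\lambda_{\gamma_i})$. One gets the equation of motion for the divisor $D(x)$:
$$\partial_{t_{2j-1}} \lambda_{\gamma_i} = -2\, \frac{P_j(\lambda_{\gamma_i})\, \mu_{\gamma_i}}{\prod_{l \neq i} (\lambda_{\gamma_i} - \lambda_{\gamma_l})}$$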
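This can be used to give a direct proof of the linearization of the flows on the Jacobian of the hyperelliptic curve $\Gamma$. For this purpose, it is sufficient to consider the time evolutions of the Abel sums
$$\frac{\partial}{\partial t_{2j-1}} \sum_i \int^{\lambda_{\gamma_i}} \frac{\lambda^k\, d\lambda}{\sqrt{R(\lambda)}} = \sum_i \frac{\lambda_{\gamma_i}^k}{\sqrt{R(\lambda_{\gamma_i})}}\, \partial_{t_{2j-1}} \lambda_{\gamma_i} = -2 \sum_i \frac{\lambda_{\gamma_i}^k\, P_j(\lambda_{\gamma_i})}{\prod_{l \neq i} (\lambda_{\gamma_i} - \lambda_{\gamma_l})}$$
The right-hand side is equal to the integral:
$$-\frac{2}{2i\pi} \oint_{\Upsilon} \frac{\lambda^k P_j(\lambda)}{B(\lambda)}\, d\lambda, \qquad B(\lambda) = \prod_l (\lambda - \lambda_{\gamma_l})$$
where $\Upsilon$ is a loop surrounding all the $\lambda_{\gamma_i}$. We deform the contour to a loop around $\infty$ so that:
$$-2 \sum_i \frac{\lambda_{\gamma_i}^k\, P_j(\lambda_{\gamma_i})}{\prod_{l \neq i} (\lambda_{\gamma_i} - \lambda_{\gamma_l})} = 2\, \mathrm{Res}_{\lambda = \infty}\, \frac{P_j(\lambda)}{B(\lambda)}\, \lambda^k\, d\lambda$$
To compute this residue at $\lambda = \infty$, we write:
$$2\, \mathrm{Res}_{\lambda = \infty}\, \frac{P_j(\lambda)}{B(\lambda)}\, \lambda^k\, d\lambda = \mathrm{Res}_{\infty}\, \frac{P_j(\lambda) \sqrt{R(\lambda)}}{B(\lambda)}\, \frac{\lambda^k\, d\lambda}{\sqrt{R(\lambda)}}$$
where the right-hand side is a residue computed on the curve $\Gamma$. The factor 2 appears because $\infty$ is a branch point of the covering $z \to \lambda = z^2$ around that point, so that the residue on $\Gamma$ has to be computed with $1/z$ as local parameter. The first factor behaves as $z^{2j-1} + O(1/z)$ by construction. So we have:
$$\frac{\partial}{\partial t_{2j-1}} \sum_i \int^{\gamma_i} \omega_k = \mathrm{Res}_{z' = 0} \big(z'^{-2j+1} + O(z')\big)\, \omega_k$$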
where $z' = 1/z$ is the local parameter at $\infty$, and $\omega_k = \lambda^k d\lambda / \sqrt{R(\lambda)}$ is an unnormalized Abelian differential. This shows that the flow linearizes on the Jacobian because the right-hand side is a constant. Since this equation is linear in $\omega_k$ it remains true for normalized Abelian differentials. One can then use Riemann's bilinear identities to evaluate the residue further. The time derivatives of the Abel map are given by:
$$\frac{\partial}{\partial t_{2j-1}} A_k(D) = \mathrm{Res}_{\infty} \big(z'^{-2j+1} + O(z')\big)\, \omega_k = -\frac{1}{2i\pi} \oint_{b_k} \Omega^{(2j-1)} = -U_k^{(2j-1)}$$
We recover exactly the expected slope of the linear flow on the Jacobian.
Remark. The equation of motion of the divisor $D$ can be recast in compact form by introducing the generating function of time derivatives, eq. (11.22). Remembering that for a function $f(\lambda) = \sum_{n=0}^{\infty} f_n \lambda^{-n}$ we have:
$$\sum_{j=1}^{\infty} \lambda'^{-j} \big(f(\lambda)\, \lambda^{j-1}\big)_+ = \frac{f(\lambda')}{\lambda' - \lambda}$$
we get:
$$\nabla(\lambda)\, \lambda_{\gamma_i} = -2\, \frac{\sqrt{R(\lambda_{\gamma_i})}}{\lambda - \lambda_{\gamma_i}}\; \frac{\sqrt{\lambda}\, B(\lambda)}{\sqrt{R(\lambda)}}\; \frac{1}{\prod_{l \neq i} (\lambda_{\gamma_i} - \lambda_{\gamma_l})} \tag{11.63}$$
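The resummation identity used in this Remark can be checked numerically. The following is a minimal sketch, not part of the original text: the coefficients `f_coeffs` and the sample points `lam`, `lam_prime` are arbitrary illustrative choices, and the sum over $j$ is truncated at a hypothetical cutoff `j_max`, which is harmless when $|\lambda| < |\lambda'|$.

```python
def poly_part(f_coeffs, lam, j):
    """Polynomial part in lam of f(lam) * lam**(j-1), with f(lam) = sum_n f_n lam**(-n)."""
    # Only the terms with n <= j-1 carry a non-negative power of lam.
    return sum(f_coeffs[n] * lam ** (j - 1 - n)
               for n in range(min(j, len(f_coeffs))))


def generating_sum(f_coeffs, lam, lam_prime, j_max=200):
    """Truncated sum over j >= 1 of lam_prime**(-j) * (f(lam) * lam**(j-1))_+ ."""
    return sum(poly_part(f_coeffs, lam, j) / lam_prime ** j
               for j in range(1, j_max + 1))


def f_value(f_coeffs, lam):
    """Evaluate f(lam) = sum_n f_n lam**(-n)."""
    return sum(c / lam ** n for n, c in enumerate(f_coeffs))


f_coeffs = [1.0, -0.5, 0.25]   # hypothetical coefficients f_0, f_1, f_2
lam, lam_prime = 0.7, 3.0      # |lam| < |lam_prime| so the geometric series converges
lhs = generating_sum(f_coeffs, lam, lam_prime)
rhs = f_value(f_coeffs, lam_prime) / (lam_prime - lam)
print(lhs, rhs)
```

The two printed numbers should agree to rounding error. Applying the same resummation with $f = \sqrt{\lambda}\, B / \sqrt{R}$ evaluated at $\lambda_{\gamma_i}$ is what turns the family of flows (11.61) into the single formula (11.63).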
Finally, we show that eq. (11.57) can be generalized to a whole set of so-called trace identities.
Proposition. The generating function $S(t, \lambda)$ of the local quantities $S_{2n}$ has a simple expression in terms of the divisor $D(x)$:
$$S(t, \lambda) = \sqrt{\lambda}\, \frac{B(\lambda, x)}{\sqrt{R(\lambda)}} = \frac{\prod_{i=1}^{g} \big(1 - \lambda_{\gamma_i(t)} \lambda^{-1}\big)}{\sqrt{\prod_{j=1}^{2g+1} \big(1 - \lambda_j \lambda^{-1}\big)}} \tag{11.64}$$
Proof. Recall that we have defined $S(t, \lambda) = \Psi \Psi^*$, where $\Psi$ and $\Psi^*$ are normalized such that their Wronskian is equal to $2z$. We can evaluate $S(t, \lambda)$ using eq. (11.56) for $\Psi$ and a similar equation for $\Psi^*$ with the sign of the exponential reversed, provided that we normalize these expressions to have the correct Wronskian. Using eq. (11.56), one gets $W(\Psi, \Psi^*) = 2 \Psi(x_0, \lambda) \Psi^*(x_0, \lambda) \sqrt{R(\lambda)} / B(\lambda, x_0)$, and the expression of $S(t, \lambda)$ follows.
Taking the logarithm of eq. (11.64), we get:
$$\sum_i \sum_{n=1}^{\infty} \frac{1}{n}\, \lambda^{-n} \lambda_{\gamma_i}^n = \frac{1}{2} \sum_i \sum_{n=1}^{\infty} \frac{1}{n}\, \lambda^{-n} \lambda_i^n - \log S(\lambda)$$
Identifying the powers of $\lambda^{-n}$, we find:
$$\sum_i \lambda_{\gamma_i} = \frac{1}{2} \sum_i \lambda_i + \frac{u}{2}, \qquad \sum_i \lambda_{\gamma_i}^2 = \frac{1}{2} \sum_i \lambda_i^2 + \frac{1}{4} u'' - \frac{1}{2} u^2$$
The first equation is eq. (11.57). We see that all the symmetric functions of the $\lambda_{\gamma_i}$ have a local expression in terms of the potential $u$.

11.6 Finite-zone solutions

The Its–Matveev formula, eq. (11.54), shows that, as a function of the real variable $x$, the potential $u(x, t)$ is almost periodic. Specifically, the argument of the theta-function is a straight line, $Y(x) = U^{(1)} x + Y_0$, which wraps densely around the Jacobian torus, and for sufficiently large $\ell$, $Y(x + \ell)$ returns arbitrarily close to $Y(x)$. This means that $Y(x + \ell) \simeq Y(x) + n + Bm$, where $n, m \in \mathbb{Z}^g$ and $B$ is the matrix of b-periods. The translation by $n + Bm$ does not affect the potential because of the second order derivative in front of the logarithm. Hence $u(x + \ell) \simeq u(x)$. One can choose the moduli of the curve $\Gamma$ so that the potential is exactly periodic. This amounts to the condition:
$$\ell\, U^{(1)} = n + Bm \tag{11.65}$$
which gives $2g$ real conditions on the $2g + 1$ complex parameters $\lambda_i$, the branch points of the hyperelliptic curve $\Gamma$:
$$\Gamma : \quad \mu^2 = R(\lambda) = \prod_{i=1}^{2g+1} (\lambda - \lambda_i)$$
If, however, the parameters $\lambda_i$ are real (as will be the case in the following), these conditions quantize all the moduli of $\Gamma$ (up to a translation $\lambda_i \to \lambda_i + \alpha$ which does not change the periods $U^{(1)}$). Under these conditions the potential $u(x)$ given by eq. (11.57) is periodic, and eq. (11.53) shows that the Baker–Akhiezer function $\Psi(x, \lambda)$ is a Bloch wave:
$$\Psi(x + \ell, \lambda) = e^{P(\lambda)}\, \Psi(x, \lambda) \tag{11.66}$$
Up to now, the potential u(x) was complex. We determine below the conditions on branch points λi and the divisor D(x) ensuring that u(x) is real and periodic. To do this, we first derive general properties of Bloch waves for generic real periodic potentials. We will then particularize these properties to the algebro-geometric solutions. We will get in this way the very special finite-zone potentials.
Consider the Schroedinger equation $\Psi'' - u \Psi = \lambda \Psi$ with a real periodic potential $u(x)$ with period $\ell$. The space of solutions is spanned by the two solutions $y_1(x, \lambda)$ and $y_2(x, \lambda)$ with initial conditions:
$$y_1(0, \lambda) = 1, \quad y_1'(0, \lambda) = 0, \qquad y_2(0, \lambda) = 0, \quad y_2'(0, \lambda) = 1$$
Because the initial conditions are independent of $\lambda$, $y_1$ and $y_2$ are entire functions of $\lambda$. Since $u(x)$ is periodic, $y_1(x + \ell, \lambda)$ and $y_2(x + \ell, \lambda)$ are two other solutions and we can write:
$$\begin{pmatrix} y_1(x + \ell, \lambda) \\ y_2(x + \ell, \lambda) \end{pmatrix} = T(\lambda) \begin{pmatrix} y_1(x, \lambda) \\ y_2(x, \lambda) \end{pmatrix}$$
where $T(\lambda)$ is a $2 \times 2$ matrix, the monodromy matrix. Because the Wronskian $W(y_1, y_2) = 1$, we have $\det T(\lambda) = 1$. Notice that for real $\lambda$, the $y_i$ are real so that $T(\lambda)$ is real, and in general $\overline{T(\lambda)} = T(\bar{\lambda})$. The Bloch waves are the two solutions which diagonalize the monodromy matrix. Denote by $e^{\pm P(\lambda)}$ its eigenvalues (with product 1). The two Bloch waves are such that $\Psi_\pm(x + \ell, \lambda) = e^{\pm P(\lambda)} \Psi_\pm(x, \lambda)$, and we choose to normalize them by $\Psi_\pm(0, \lambda) = 1$. The quantity $P(\lambda)$ is called the quasi-momentum. Writing $\Psi_\pm(x, \lambda) = y_1(x, \lambda) + \beta\, y_2(x, \lambda)$, the Bloch condition on $\Psi_\pm(x, \lambda)$ and $\partial_x \Psi_\pm(x, \lambda)$ gives:
$$\Psi_\pm(x, \lambda) = y_1(x, \lambda) + \frac{e^{\pm P(\lambda)} - y_1(\ell, \lambda)}{y_2(\ell, \lambda)}\, y_2(x, \lambda) \tag{11.67}$$
where $P(\lambda)$ is obtained by solving the equation:
$$e^{P(\lambda)} + e^{-P(\lambda)} = y_1(\ell, \lambda) + y_2'(\ell, \lambda) \equiv t(\lambda) \tag{11.68}$$
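The monodromy matrix and the trace $t(\lambda)$ of eq. (11.68) are easy to explore numerically. The sketch below is an illustration only, not the method of the text: it integrates $y'' = (u + \lambda) y$ over one period with a classical fourth-order Runge–Kutta scheme, and the function names, the step count `steps` and the sample potential are assumptions made for the example.

```python
import math


def monodromy(u, lam, period, steps=2000):
    """Integrate y'' = (u(x) + lam) * y over one period for y1 and y2.

    Returns ((y1(l), y1'(l)), (y2(l), y2'(l))), i.e. the rows of T(lam).
    """
    def rhs(x, y, dy):
        return dy, (u(x) + lam) * y

    h = period / steps
    rows = []
    for y, dy in ((1.0, 0.0), (0.0, 1.0)):   # initial data of y1 and y2
        x = 0.0
        for _ in range(steps):
            k1y, k1d = rhs(x, y, dy)
            k2y, k2d = rhs(x + h / 2, y + h / 2 * k1y, dy + h / 2 * k1d)
            k3y, k3d = rhs(x + h / 2, y + h / 2 * k2y, dy + h / 2 * k2d)
            k4y, k4d = rhs(x + h, y + h * k3y, dy + h * k3d)
            y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
            dy += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
            x += h
        rows.append((y, dy))
    return tuple(rows)


def trace_t(u, lam, period, steps=2000):
    """t(lam) = y1(l, lam) + y2'(l, lam), the right-hand side of eq. (11.68)."""
    (y1, _), (_, dy2) = monodromy(u, lam, period, steps)
    return y1 + dy2


def zone(t):
    """Classify a real spectral value from t(lam): |t| < 2 gives imaginary P."""
    return "allowed" if abs(t) < 2.0 else "forbidden"


def free(x):
    return 0.0                                # u = 0, used as a consistency check


print(trace_t(free, 1.0, 1.0), zone(trace_t(free, 1.0, 1.0)))
print(trace_t(free, -4.0, 1.0), zone(trace_t(free, -4.0, 1.0)))
```

For $u = 0$ the exact values are $t = 2\cosh(\sqrt{\lambda}\,\ell)$ for $\lambda > 0$ and $t = 2\cos(\sqrt{-\lambda}\,\ell)$ for $\lambda < 0$, which gives a simple way to validate the integrator before trying a non-trivial periodic potential such as `lambda x: math.cos(2 * math.pi * x)`.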
Since $y_1$ and $y_2$ are entire in $\lambda$, eq. (11.68) shows that $t(\lambda) = \mathrm{Tr}\, T(\lambda)$ is an entire function of $\lambda$, hence $e^{P(\lambda)}$ has no pole or zero at finite distance. One can discuss the nature of the solutions of eq. (11.68) when $\lambda$ is real, which also implies that $t(\lambda)$ is real. When $\Delta(\lambda) \equiv t^2(\lambda) - 4$ is negative, the quasi-momentum $P(\lambda)$ is pure imaginary. In this case $\Psi(x, \lambda)$ has an oscillatory behaviour in $x$ at large scale corresponding to propagation of waves. This defines what is called the allowed zones in the spectrum. When $\Delta(\lambda)$ is positive, $\Psi(x, \lambda)$ has an exponential behaviour at large scale, so waves cannot propagate. This defines the forbidden zones in the spectrum. Finally, when $\Delta(\lambda) = 0$, we have $t(\lambda) = \pm 2$ and $e^{P(\lambda)} = \pm 1$ (same sign). This means that $\Psi(x + \ell, \lambda) = \pm \Psi(x, \lambda)$ so that $\Psi(x, \lambda)$ is periodic or antiperiodic. Hence the periodic and antiperiodic levels are the boundaries of allowed and forbidden zones.
When $\lambda \to \infty$ one can as a first approximation neglect the potential $u$. Then for $\lambda \to +\infty$ we have $\Psi(x, \lambda) \sim e^{\pm \sqrt{\lambda}\, x}$ (forbidden zone) and $t(\lambda) \sim 2 \cosh(\sqrt{\lambda}\, \ell)$. For $\lambda \to -\infty$ we have similarly $\Psi(x, \lambda) \sim e^{\pm i \sqrt{-\lambda}\, x}$ (allowed zone) and $t(\lambda) \sim 2 \cos(\sqrt{-\lambda}\, \ell)$. In fact, when $u \neq 0$ one can show that for generic potentials there are an infinite number of forbidden zones of exponentially decreasing sizes extending in the region $\lambda \to -\infty$. Finite-zone potentials are such that this phenomenon does not occur, i.e. there are a finite number of forbidden zones.
The classical theory of Sturm–Liouville equations gives rather detailed information on the poles of the Bloch waves and the boundaries of the zones. We recall the main facts. Consider a differential equation $y'' - u y = \lambda y$ with a real periodic potential $u$. We have:
$$\int_0^\ell f\, \partial^2 g = \int_0^\ell g\, \partial^2 f + \big[f g' - f' g\big]_0^\ell$$
so that we get a self-adjoint problem when the boundary conditions are such that the term $[f g' - f' g]_0^\ell$ vanishes. In this case the spectrum is real.
Proposition. The poles of the Bloch wave $\Psi(x, \lambda)$ and the periodic and antiperiodic levels are all real. The periodic and antiperiodic levels form a sequence $\beta_1 > \beta_2 \geq \beta_3 > \beta_4 \geq \beta_5 > \cdots$ such that $\beta_1$ is a periodic level, $\beta_2, \beta_3$ are antiperiodic levels, $\beta_4, \beta_5$ are periodic, and so on. The allowed zones are the intervals $[\beta_{2j}, \beta_{2j-1}]$. The forbidden zones are the intervals $[\beta_1, +\infty]$ and the $[\beta_{2j+1}, \beta_{2j}]$. There is at least one pole of $\Psi(x, \lambda)$ in each forbidden zone, except $[\beta_1, +\infty]$.
Proof. It follows from eq. (11.67) that the poles in $\lambda$ of $\Psi_\pm(x, \lambda)$ are located where $y_2(\ell, \lambda) = 0$. For these values of $\lambda$, $y_2(x, \lambda)$ is a solution of the Sturm–Liouville problem with boundary conditions $y(0, \lambda) = y(\ell, \lambda) = 0$. This problem is self-adjoint, so that these values of $\lambda$ are real. Similarly, the periodic and antiperiodic levels correspond to the boundary conditions (valid in the case of a periodic potential) $y(\ell, \lambda) = \pm y(0, \lambda)$ and $y'(\ell, \lambda) = \pm y'(0, \lambda)$, which also lead to a self-adjoint problem. These levels are therefore also real. We have shown that all the roots of $\Delta(\lambda)$ are real.
We now show that $\partial_\lambda t(\lambda)$ has a definite sign in the allowed zones. We have $\partial_\lambda t(\lambda) = \partial_\lambda y_1(\ell, \lambda) + \partial_\lambda y_2'(\ell, \lambda)$. Since $v = \partial_\lambda y_j$ obeys the differential equation $v'' - (u + \lambda) v = y_j$ with initial conditions $v(0) = v'(0) = 0$, and since $W(y_1, y_2) = 1$, variation of constants gives:
$$\partial_\lambda y_j(x, \lambda) = \int_0^x \big(y_1(\xi) y_2(x) - y_1(x) y_2(\xi)\big)\, y_j(\xi)\, d\xi, \quad j = 1, 2$$
$$\partial_\lambda y_j'(x, \lambda) = \int_0^x \big(y_1(\xi) y_2'(x) - y_1'(x) y_2(\xi)\big)\, y_j(\xi)\, d\xi$$
so that:
$$\partial_\lambda t(\lambda) = -\int_0^\ell \big(A(\lambda)\, y_2^2(\xi) + B(\lambda)\, y_1(\xi) y_2(\xi) + C(\lambda)\, y_1^2(\xi)\big)\, d\xi$$
where $A(\lambda) = y_1'(\ell, \lambda)$, $B(\lambda) = y_1(\ell, \lambda) - y_2'(\ell, \lambda)$, $C(\lambda) = -y_2(\ell, \lambda)$. The quadratic form appearing in the integrand has discriminant $B^2 - 4AC = (y_1 - y_2')^2 + 4 y_2 y_1' = t^2(\lambda) - 4 = \Delta(\lambda) < 0$ so it is of fixed sign in the whole domain of integration. It follows that $\partial_\lambda t(\lambda) \neq 0$ in an allowed zone, and that the sign of $\partial_\lambda t(\lambda)$ is the same as the sign of $-C(\lambda) = y_2(\ell, \lambda)$. In particular $y_2(\ell, \lambda)$ cannot vanish in an allowed zone. We see that $t(\lambda)$ either crosses the lines $t = \pm 2$ or is tangent to them, but cannot have an extremum in the region $|t(\lambda)| < 2$. From this, it follows that the periodic and antiperiodic levels are distributed as indicated in the proposition. Obviously, the sign of $\partial_\lambda t(\lambda)$ changes when one goes from one allowed zone to the next one, so that $y_2(\ell, \lambda)$ has at least one zero in each forbidden zone $[\beta_{2j+1}, \beta_{2j}]$ with $\beta_{2j+1} \neq \beta_{2j}$ and $j \geq 1$.
Fig. 11.1. The graph of the function t(λ) for real λ.
We now return to the case where the periodic potential $u(x)$ is produced by the algebro-geometric construction on the hyperelliptic curve $\Gamma(\lambda, \mu)$. As we have seen, the two Bloch waves $\Psi_\pm(x, \lambda)$ are the two values of the Baker–Akhiezer function $\Psi(x, (\lambda, \mu))$ at the two points $(\lambda, \pm\mu)$ above $\lambda$. At a branch point, we have $\Psi_+(x, \lambda) = \Psi_-(x, \lambda)$ so that the quasi-momentum satisfies $e^{P(\lambda)} = \pm 1$, i.e. the branch points are zone boundaries and therefore real. Since the region $\lambda \to \infty$ is a forbidden zone and corresponds to $R(\lambda) > 0$, we see, following sign changes, that allowed zones correspond to $R(\lambda) < 0$ and forbidden zones to $R(\lambda) > 0$. In particular, the branch points form a sequence $\lambda_1 > \lambda_2 > \lambda_3 > \cdots > \lambda_{2g+1}$ with $\lambda_1 = \beta_1$ and $\{\lambda_i\} \subset \{\beta_i\}$. There may be degenerate forbidden zones with $\beta_{2j+1} = \beta_{2j}$ which do not correspond to branch points of the spectral curve.
To compare eq. (11.56) with this discussion, we set $x_0 = 0$, and we see that for $x = 0$, the divisor $D(0)$ of zeroes of $\Psi(x, \lambda)$ coincides with the divisor of its poles. By eq. (11.67) and the discussion which follows it, we see that the elements of $D(0)$ are all real, and lie in forbidden zones. We know that the divisor $D(0)$ is of degree $g$ if $\Gamma$ is of genus $g$, so we put one pole of $\Psi(x, \lambda)$ in each forbidden zone $[\lambda_{2j+1}, \lambda_{2j}]$, for $j = 1, \ldots, g$, and no pole in $[\lambda_1, +\infty]$ to get a periodic motion. The equations of motion, eq. (11.60), are thus regular and show that all points of $D(x)$ stay real for all $x$, since $R(\lambda_{\gamma_i(x)}) > 0$, and remain in forbidden zones. Conversely, if all $\lambda_j$ and $\lambda_{\gamma_i(x)}$ are real, eq. (11.57) shows that $u$ is real. We have also to choose the curve so that the potential is periodic. This will be the case if the points of $D(x)$ describe cyclic motions on $g$ cycles belonging to the real slice of $\Gamma$ (i.e. $\lambda$ and $\mu$ real and so $R(\lambda) > 0$) and if this motion is periodic of period $\ell$ (so that $u(x)$ has period $\ell$). As we already remarked, this requires that the moduli be quantized.
It is worth mentioning that, once a genus $g$ Riemann surface $\Gamma$ is chosen with moduli quantized to produce a periodic finite-zone real potential, the KdV flows with respect to higher times preserve these conditions, and automatically produce a family of continuous deformation parameters. To express the periodicity conditions further, we view the curve $\Gamma$ as a two-sheeted covering of the $\lambda$ plane with cuts on the forbidden zones.
Fig. 11.2. The curve Γ(λ, µ) seen as a two sheeted cover of the λ plane with cuts along the forbidden zones.
We choose for the a-cycles loops around the compact cuts on the upper sheet. Then the b-cycles start from $\infty$ on the upper sheet, cross the a-cycles once, go to the lower sheet through the cut and return to $\infty$. The regular Abelian differentials are of the form $\lambda^j d\lambda / s$ for $j = 0, \ldots, g - 1$. Their a-periods are real and their b-periods are pure imaginary. Hence normalized regular Abelian differentials are real combinations of the above, so the matrix of b-periods is pure imaginary. Moreover, the second kind Abelian differential $\Omega^{(1)}$ is of the form $\lambda^g d\lambda / s$ plus a combination of first kind differentials with real coefficients, so that $U^{(1)}$ is real. More generally, all $\Omega^{(i)}$ will have all their periods pure imaginary. They are thus the forms considered in eq. (10.19) in Chapter 10. The periodicity condition eq. (11.65) can only be realized with $m = 0$ and becomes
$$\ell\, U_j^{(1)} = \frac{\ell}{2i\pi} \oint_{b_j} \Omega^{(1)} = n_j$$
The Bloch momentum $P$ is naturally defined only up to $2i\pi\mathbb{Z}$. This is consistent with the following:
Proposition. The Bloch quasi-momentum $P(\lambda)$ is the primitive of the form $\ell\, \Omega^{(1)}$:
$$dP = \ell\, \Omega^{(1)} \tag{11.69}$$
Proof. Recall that $\mu = e^{P(\lambda)}$ satisfies $\mu + \mu^{-1} = t(\lambda)$, where $t(\lambda)$ is an entire function of $\lambda$. Hence $\mu$ has no pole or zero at finite distance. Moreover, for periodic motions of the divisor $D(x)$, $\mu$ is a well-defined function on the curve $\Gamma$ (minus the point at $\infty$) since, by eq. (11.66), we can write it as the quotient of two Baker–Akhiezer functions:
$$\mu(P) = \frac{\Psi(P, x + \ell)}{\Psi(P, x)}$$
We also deduce from this formula the behaviour at $\infty$: $\mu \sim e^{\ell \sqrt{\lambda}}$. Hence $dP = d\mu / \mu$ is an Abelian differential on $\Gamma$, without singularities at finite distance, and having a double pole at $\infty$, i.e. $d\mu / \mu \sim \ell\, dz$ where $z = \sqrt{\lambda}$ is the local parameter. For real $\lambda$, $\mu$ is real of fixed sign on a forbidden zone, so that the a-periods of $dP = d\mu / \mu$ vanish. This characterizes $dP = \ell\, \Omega^{(1)}$. On the other hand in an allowed zone, $\mu$ is of modulus one, so the b-periods of $d\mu / \mu$ are pure imaginary, and of the form $2i\pi n_j$ since $\mu$ is well-defined on the curve, in agreement with the periodicity condition.
From eq. (11.56), we can get still another expression for the quasi-momentum. In the case of a periodic motion of the divisor $D(x)$, it is given by:
$$P(\lambda) = \int_0^\ell \frac{\sqrt{R(\lambda)}}{B(\lambda, x)}\, dx \tag{11.70}$$
At a point $\beta_j$ which is a boundary of a non-degenerate forbidden zone, $P(\lambda)$ changes from real to pure imaginary, which implies that $R(\lambda)$ changes sign. Hence all such zone boundaries appear among the branch points of $\Gamma$. All other periodic or antiperiodic levels are therefore points where $t(\lambda)$ is tangent to $t = \pm 2$, and there are an infinite number of them. These remarks also allow us to compare the spectral curve $\Gamma_T$ of the monodromy matrix $T(\lambda)$ corresponding to a finite-zone potential $u$ with the finite genus curve $\Gamma$ used to build the algebro-geometric solution. By definition, the “curve” $\Gamma_T$ is given by the equation $\mu^2 - t(\lambda)\mu + 1 = 0$. Setting $s = 2\mu - t(\lambda)$, we get the hyperelliptic type equation $s^2 = \Delta(\lambda)$, where $\Delta(\lambda) = t^2(\lambda) - 4$. The entire function $\Delta(\lambda)$ is not a polynomial, but admits an infinite product representation:
$$\Delta(\lambda) = \prod_{i=1}^{2g+1} \Big(1 - \frac{\lambda}{\lambda_i}\Big) \prod_{j=1}^{\infty} \Big(1 - \frac{\lambda}{\beta_j}\Big)^2$$
where the $\beta_j$ are the points where $t(\lambda)$ is tangent to the lines $\pm 2$, i.e. correspond to the degenerate forbidden zones, and the $\lambda_i$ are the points where $t(\lambda)$ crosses the lines $\pm 2$, i.e. correspond to the branch points of $\Gamma$. This infinite product is convergent because $t(\lambda) \sim 2 \cos(\sqrt{-\lambda}\, \ell)$ when $\lambda$ is large, or else $\beta_j \sim -j^2 \pi^2 / \ell^2$ when $j$ is large. Note that the potential $u$ is determined by $\Gamma$, so the $\beta_j$ are really functions of the moduli $\lambda_i$ of $\Gamma$.
The bianalytic transformation $s = s' \prod_{j=1}^{\infty} (1 - \frac{\lambda}{\beta_j})$ transforms the equation of $\Gamma_T$ into the equation $s'^2 = \prod_{i=1}^{2g+1} (1 - \frac{\lambda}{\lambda_i})$, that is the equation of $\Gamma$. All points $(s = 0, \lambda = \beta_j)$ are singular points of $\Gamma_T$ and $\Gamma$ is the desingularization of $\Gamma_T$. Conversely, $\Gamma_T$ is obtained from $\Gamma$ by identifying pairs of points $(\lambda = \beta_j, s' = \pm s'(\beta_j))$ which accumulate at $\infty$.
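The product representation can be made concrete in the simplest case $u = 0$, which is offered here only as an illustrative check and is not treated in the text. Then $t(\lambda) = 2\cosh(\sqrt{\lambda}\,\ell)$, the only simple zero of $\Delta$ is $\lambda = 0$ (genus 0), and all other zeroes $\beta_j = -j^2\pi^2/\ell^2$ are double. Because the simple zero sits at the origin, the sketch assumes the normalization $\Delta(\lambda) = 4\ell^2 \lambda \prod_j (1 - \lambda/\beta_j)^2$, which follows from the classical product for $\sinh$; the truncation order `n_terms` is an arbitrary choice.

```python
import cmath
import math


def delta_free(lam, period):
    """Delta(lam) = t(lam)**2 - 4 for u = 0, with t = 2*cosh(sqrt(lam)*period)."""
    t = 2.0 * cmath.cosh(cmath.sqrt(lam) * period).real
    return t * t - 4.0


def delta_product(lam, period, n_terms=200000):
    """Truncated product 4 l^2 lam * prod_j (1 - lam/beta_j)^2, beta_j = -(j pi / l)^2."""
    prod = 1.0
    for j in range(1, n_terms + 1):
        beta_j = -(j * math.pi / period) ** 2   # degenerate (double) zone boundary
        prod *= (1.0 - lam / beta_j) ** 2
    return 4.0 * period ** 2 * lam * prod


exact = delta_free(1.0, 1.0)
approx = delta_product(1.0, 1.0)
print(exact, approx)
```

The truncated product converges slowly, with a relative error of order $\lambda\ell^2/(\pi^2 N)$ after $N$ factors, which reflects the growth $\beta_j \sim -j^2\pi^2/\ell^2$ quoted above.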
11.7 Action-angle variables

We want to express the restriction of the symplectic forms corresponding to the two Poisson brackets $\{\ \}_{1,2}$ on the finite-zone solutions of the KdV hierarchy. We begin with the first Poisson bracket (11.35). This Poisson bracket is degenerate, and the kernel is $\int_0^\ell u(x)\, dx$. So we consider a symplectic leaf where this integral is kept constant. On such a leaf, the associated symplectic form reads:
$$\omega_1 = \frac{1}{4} \int_0^\ell dx\, \delta u(x) \wedge \int_0^x dy\, \delta u(y)$$
To build finite-zone solutions one chooses a genus $g$ hyperelliptic Riemann surface, and a divisor $D = (\gamma_1, \ldots, \gamma_g)$ of degree $g$ on it. These data determine the potential $u$. The variations $\delta u$ are expressed through the variations of the moduli of the curve, and the variations of the divisor, $D$, of the poles of the Baker–Akhiezer function. When all the times $t_{2j-1}$ are set to zero, the Baker–Akhiezer function is equal to one, and its zeroes coincide with its poles. In agreement with eq. (11.55) we denote $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$, where $\mu_{\gamma_i} = \sqrt{R(\lambda_{\gamma_i})}$.
Proposition. The restriction of the symplectic form $\omega_1$ on the finite-zone solution constructed from the curve
$$\Gamma : \quad \mu^2 = R(\lambda) = \prod_{i=1}^{2g+1} (\lambda - \lambda_i)$$
is expressed in terms of the divisor of poles $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$ of the Baker–Akhiezer function as:
$$\omega_1 = \sum_{i=1}^{g} \delta P(\gamma_i) \wedge \delta \lambda_{\gamma_i}$$
where $P$ is the Bloch momentum such that $dP = \ell\, \Omega^{(1)}$ and $P = \ell z + O(1/z)$ at infinity.
Proof. By analogy with the discussion in Chapter 5, we introduce a 1-form on the curve $\Gamma$ with values in 2-forms on phase space:
$$K = \langle \Psi^* \delta L \wedge \delta \Psi \rangle\, \Omega$$
where $\Omega$ is the form defined in eq. (10.21) in Chapter 10. There are several important differences coming from the field theoretical context. First, the notation $\langle\ \rangle$ means:
$$\langle f(x) \rangle = \lim_{\ell \to \infty} \frac{1}{\ell} \int_0^\ell f(x)\, dx \tag{11.71}$$
This is a Whitham average and is also an average on the Liouville torus. Second, variations $\delta \Psi$ have to be defined by keeping the primitive $k = \int^P \Omega^{(1)}$ (behaving as $k = z + O(z^{-1})$ at $\infty$) fixed instead of keeping $\lambda$ fixed. Otherwise, eq. (10.66) in Chapter 10 shows that a term linear in $x$ occurs in $\delta \Psi$ and the average $\langle\ \rangle$ is not defined. Comparing with eq. (10.79) in the same chapter, we can write:
$$K = \langle \Psi^* \delta L \wedge \delta \Psi \rangle\, \frac{dk}{\langle \Psi^* \Psi \rangle}, \qquad \Omega = \frac{dk}{\langle \Psi^* \Psi \rangle} \tag{11.72}$$
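Just as taking the variation $\delta$ by keeping $\lambda$ fixed introduces poles at the branch points, that is where $d\lambda = 0$, taking the variation $\delta$ by keeping $k$ fixed introduces poles at the points where $dk = \Omega^{(1)} = 0$. We write as usual that the sum of residues of $K$ on $\Gamma$ vanishes. The residue of $K$ at $\infty$ is precisely the form $\omega_1$, up to normalization. Indeed, using $k$ as a local parameter, we have $\Psi = e^{kx}(1 + \alpha/k + \cdots)$ and $\Psi^* = e^{-kx}(1 + \cdots)$. Since we vary while keeping $k$ fixed, we have $\delta \Psi = e^{kx}(\delta\alpha / k + O(1/k^2))$. Remembering that $\delta L = -\delta u$, we get
$$\mathrm{Res}_\infty K = \langle \delta u \wedge \delta \alpha \rangle$$
There is an extra minus sign because the local parameter is really $1/k$. To relate $\alpha$ to the potential $u$, notice that by definition $\Omega^{(1)} = d(z + O(z^{-1}))$ at $\infty$, so that $\lambda = k^2 + c + O(k^{-1})$ for some constant $c$. Reproducing the reasoning leading to eq. (11.52), we find $u = 2\partial_x \alpha - c$. When the curve $\Gamma$ is such that the potential $u$ is periodic, we have $\Psi(x + \ell) = e^{k\ell} \Psi(x)$, so that $\alpha$ is periodic. Hence $\langle u \rangle = -c$. Recalling that we keep $\langle u \rangle$ fixed, we have $\delta c = 0$, and so $\delta \alpha = \frac{1}{2} \int^x \delta u$, giving
$$\mathrm{Res}_\infty K = \frac{2}{\ell}\, \omega_1$$
The form $K$ has poles at finite distance at the poles of $\Psi$ (the poles of $\Psi^*$ are cancelled by $\Omega$) and at the zeroes of $\Omega^{(1)}$, coming from $\delta \Psi$. At a pole $\gamma_i$ of $\Psi$ we have $\delta \Psi = (\delta k_{\gamma_i} / (k - k_{\gamma_i}))(\Psi + O(1))$, since we keep $k$ constant in the variation. We get a contribution:
$$\mathrm{Res}_{\gamma_i} K = \mathrm{Res}_{\gamma_i}\, \frac{\langle \Psi^* \delta L \Psi \rangle}{\langle \Psi^* \Psi \rangle} \wedge \frac{\delta k_{\gamma_i}}{k - k_{\gamma_i}}\, dk$$
Varying $L\Psi = \lambda \Psi$, we have $\langle \Psi^* \delta L \Psi \rangle = -\langle \Psi^* (L - \lambda) \delta \Psi \rangle + \delta\lambda\, \langle \Psi^* \Psi \rangle$. Integrating by parts, and using $(L - \lambda)\Psi^* = 0$, we get:
$$\langle \Psi^* (L - \lambda) \delta \Psi \rangle = \frac{1}{\ell} \Big[ \Psi^* \partial_x \delta \Psi - \partial_x \Psi^*\, \delta \Psi \Big]_0^\ell = \frac{1}{\ell}\, W(\delta \Psi, \Psi^*) \Big|_0^\ell$$
Taking $\ell$ large but close enough to an almost period of $u$, we write:
$$\Psi(x + \ell) = e^{k\ell}\, \Psi(x), \qquad \Psi^*(x + \ell) = e^{-k\ell}\, \Psi^*(x)$$
so that this quantity vanishes. Finally, $\langle \Psi^* \delta L \Psi \rangle = \delta\lambda\, \langle \Psi^* \Psi \rangle$ and the contribution of the pole $\gamma_i$ is:
$$\mathrm{Res}_{\gamma_i} K = \delta \lambda_{\gamma_i} \wedge \delta k_{\gamma_i}$$
We now analyze what happens at zeroes $s_i$ of $dk$. We have:
$$\delta \Psi = \frac{d\Psi}{d\lambda}\, \delta\lambda + \sum_j \frac{d\Psi}{dm_j}\, \delta m_j$$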
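where $m_j$ are the moduli of $\Gamma$. The polar part at $s_i$ comes from the first term and we can replace $\delta \Psi \to \frac{d\Psi}{d\lambda} \delta\lambda$. Moreover, $\delta L$ is the multiplication operator by $-\delta u$, so
$$\Big\langle \Psi^* \delta L\, \frac{d\Psi}{d\lambda} \Big\rangle \wedge \delta\lambda\; \Omega = \Big\langle (\delta L - \delta\lambda) \Psi^*\, \frac{d\Psi}{d\lambda} \Big\rangle \wedge \delta\lambda\; \Omega$$
where we added the $\delta\lambda$ term which cancels in the wedge product. Varying $(L - \lambda)\Psi^* = 0$, we can write
$$\mathrm{Res}_{s_i} K = -\mathrm{Res}_{s_i} \Big\langle (L - \lambda) \delta \Psi^*\, \frac{d\Psi}{d\lambda} \Big\rangle \wedge \delta\lambda\; \Omega$$
Integrating by parts gives
$$\mathrm{Res}_{s_i} K = -\mathrm{Res}_{s_i} \left( \Big\langle \delta \Psi^* (L - \lambda) \frac{d\Psi}{d\lambda} \Big\rangle + \frac{1}{\ell}\, W\Big(\delta \Psi^*, \frac{d\Psi}{d\lambda}\Big) \Big|_0^\ell \right) \wedge \delta\lambda\; \Omega$$
Differentiating $(L - \lambda)\Psi = 0$ with respect to $\lambda$, we have $(L - \lambda) \frac{d\Psi}{d\lambda} = \Psi$. Using that $\Psi$ and $\Psi^*$ are Bloch waves and the fact that $\delta k = 0$, we find
$$\frac{1}{\ell}\, W\Big(\delta \Psi^*, \frac{d\Psi}{d\lambda}\Big) \Big|_0^\ell = \frac{dk}{d\lambda}\, W(\delta \Psi^*, \Psi) \Big|_0$$
Since $dk$ vanishes at $s_i$, to get a residue for this term one has to take the polar contribution in $\delta \Psi^*$, proportional to $\delta\lambda$, which disappears due to the wedge product. We get finally
$$\mathrm{Res}_{s_i} K = -\mathrm{Res}_{s_i} K_1, \qquad K_1 = \langle \delta \Psi^*\, \Psi \rangle \wedge \delta\lambda\; \Omega$$
To evaluate the sum of residues of $K_1$ at $s_i$, we write that the sum of residues of $K_1$ vanishes. Poles of $K_1$ are at the $s_i$, the poles $\gamma_i^*$ of $\Psi^*$, and at infinity. At infinity, $\Omega$ has a double pole but $\delta\lambda$ and $\langle \delta \Psi^* \Psi \rangle$ have a simple zero so that there is no residue. At $\gamma_i^*$, we write $\delta \Psi^* = \frac{\delta k_{\gamma_i^*}}{k - k_{\gamma_i^*}} (\Psi^* + \cdots)$, and using eq. (11.72) for $\Omega$, we find
$$\sum_{s_i} \mathrm{Res}_{s_i} K_1 = -\sum_{\gamma_i^*} \delta k_{\gamma_i^*} \wedge \delta \lambda_{\gamma_i^*}$$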
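Recalling that $\gamma_i^* = \sigma(\gamma_i)$, where $\sigma$ is the hyperelliptic involution $(\lambda, k) \to (\lambda, -k)$, we get $\lambda_{\gamma_i^*} = \lambda_{\gamma_i}$ and $k_{\gamma_i^*} = -k_{\gamma_i}$. Hence the result
$$\sum_{s_i} \mathrm{Res}_{s_i} K = -\sum_{\gamma_i} \delta k_{\gamma_i} \wedge \delta \lambda_{\gamma_i}$$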
from which the proposition follows.
We can analyse the second symplectic structure in a similar way. The symplectic form $\omega_2$ is expressed most simply in terms of the Miura variable $p$ such that $u = p' + p^2$. Since the second Poisson bracket is given by eq. (11.39), it has a kernel $\int_0^\ell p(x)\, dx$ in the variable $p$. So this quantity has to be fixed. The symplectic form in the variable $p$ reads:
$$\omega_2 = -\int_0^\ell dx\, \delta p(x) \wedge \int_0^x dy\, \delta p(y)$$
With the same notations as in the previous proposition:
Proposition. The restriction of the symplectic form $\omega_2$ on the finite-zone solution is expressed as:
$$\omega_2 = 2 \sum_{i=1}^{g} \delta P(\gamma_i) \wedge \frac{\delta \lambda_{\gamma_i}}{\lambda_{\gamma_i}}$$
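Proof. Now we introduce the 1-form $K$ on the curve $\Gamma$, with values in 2-forms on phase space:
$$K = \langle \Psi^* \delta L \wedge \delta \Psi \rangle\, \frac{\Omega}{\lambda}$$
Again, the variations are done by keeping the meromorphic function $k$ fixed. In contrast to the previous case, there is no pole at $\infty$, but a new pole appears at $\lambda = 0$. Let us compute the residue of $K$ at $\lambda = 0$. We will show that
$$\mathrm{Res}_{\lambda = 0} K = \frac{1}{\ell}\, \omega_2$$
The Miura variable $p$ is related to $\Psi$ by
$$p = \frac{\partial_x \Psi}{\Psi} \Big|_{\lambda = 0}$$
and we can express $\Psi(x) \equiv \Psi(\lambda = 0, x)$ in terms of $p$ as:
$$\Psi(x) = e^{\int_0^x p(y)\, dy}$$
When $u$ is periodic of period $\ell$, we can choose $p$ periodic and $\Psi$ is a Bloch wave with Bloch momentum given by $P(\lambda = 0) = \int_0^\ell p(y)\, dy$, hence in the kernel of $\omega_2$. Using $\delta L = -(\delta p' + 2p\, \delta p)$ and $\delta \Psi = \Psi \int_0^x \delta p$, the residue of $K$ at $\lambda = 0$ is obtained from:
$$\langle \Psi^* \delta L \wedge \delta \Psi \rangle = -\Big\langle \Psi^* \Psi\, (\delta p' + 2p\, \delta p) \wedge \int_0^x \delta p \Big\rangle$$
$$= -\Big\langle \Psi^* \Psi\, 2p\, \delta p \wedge \int_0^x \delta p \Big\rangle + \Big\langle \delta p \wedge \partial_x \Big( \Psi^* \Psi \int_0^x \delta p \Big) \Big\rangle - \frac{1}{\ell} \Big[ \Psi^* \Psi\, \delta p \wedge \int_0^x \delta p \Big]_0^\ell$$
$$= -\frac{W(\Psi, \Psi^*)}{\ell} \int_0^\ell \delta p \wedge \int_0^x \delta p - \frac{1}{\ell} \Big[ \Psi^* \Psi\, \delta p \wedge \int_0^x \delta p \Big]_0^\ell$$
where we have used $(\Psi^* \Psi)' = -W(\Psi, \Psi^*) + 2p\, \Psi^* \Psi$ and the fact that $\Psi^* \Psi$ and $p$ are periodic, so the last term is proportional to $\delta \int_0^\ell p$, hence vanishes. Now the residue is easily computed once we notice that
$$\Omega = \frac{d\lambda}{W(\Psi, \Psi^*)}$$
This is because the right-hand side has zeroes at the poles of $\Psi$ and $\Psi^*$. Moreover, the Wronskian vanishes at the branch points, thus cancelling the zeroes of $d\lambda$. Finally, at $\infty$ we can normalize $\Psi$ and $\Psi^*$ such that $W(\Psi, \Psi^*) \simeq 2z$, so that the form behaves as $dz$ ($\lambda = z^2$). This uniquely identifies $\Omega$.
The analysis of the residues of $K$ at the poles of $\Psi$ and $\Psi^*$ is exactly the same as in the previous case, replacing $\Omega$ by $\Omega / \lambda$ everywhere, which accounts for the result $\delta P(\gamma_i) \wedge \delta \lambda_{\gamma_i} / \lambda_{\gamma_i}$. Finally, at the points $s_i$, the analysis is also similar but the form $K_1$ involves $\Omega / \lambda$ and so has no pole at $\infty$ but a possible pole at $\lambda = 0$. In fact, there is no pole at $\lambda = 0$ because $\delta\lambda$ vanishes there: the function $k(\lambda)$ has an expansion of the form
$$k(\lambda) = \frac{1}{\ell} \int_0^\ell p(y)\, dy + a\lambda + \cdots$$
where the first term is the Bloch momentum at $\lambda = 0$. Variations are done by keeping $k$ fixed and $\int_0^\ell p(y)\, dy$ fixed. So $\delta\lambda = -(\delta \log a)\lambda + \cdots$ vanishes at $\lambda = 0$.

11.8 Analytical description of solitons

The soliton solutions can be viewed as a singular limit of the finite-zone solutions in which the branch points $\lambda_j$ coincide in pairs. On these degenerate Riemann surfaces, everything becomes rational and all calculations can be performed to the end.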
Consider the curve $s^2 = \lambda \prod_{i=1}^{n} (\lambda - \lambda_i)^2$, where we set $\lambda_i = p_i^2$ with $0 < p_1 < p_2 < \cdots$. This singular curve is desingularized by setting $s = z \prod_i (\lambda - \lambda_i)$, which leads to $\lambda = z^2$. Hence the desingularized curve is the Riemann sphere, which we identify with the complex $z$ plane. The singular curve is obtained from the $z$ plane by identifying the points $z = \pm p_i$ which are both mapped to the point $(s = 0, \lambda = \lambda_i)$. A general Baker–Akhiezer function $\Psi(t, z)$ on the Riemann sphere is given by:
$$\Psi(t, z) = e^{\xi(z, t)}\, \Psi(0, z) \prod_{i=1}^{n} \frac{z - z_i(t)}{z - z_i(0)} \tag{11.73}$$
If such a function comes from a function defined on the singular curve, and this is the case if it is a singular limit of a finite-zone solution, it must take the same value at identified points, so we must have $\Psi(t, p_i) = \Psi(t, -p_i)$, or
$$\prod_{j=1}^{n} \frac{p_i + z_j(t)}{p_i - z_j(t)} = a_i\, e^{2\xi(p_i, t)}, \quad i = 1, \ldots, n \tag{11.74}$$
where $a_i$ are time-independent constants. Equation (11.74) is a linear system for the symmetric functions of the $z_j(t)$. It determines the time evolution of the divisor of zeroes of $\Psi$. On the other hand, the soliton solution is well known in terms of tau-functions, see eq. (11.44). From this we can construct the Baker–Akhiezer function through the Sato formula:
$$\frac{\Psi(t, z)}{\Psi(0, z)} = e^{\xi(z, t)}\, \frac{\tau(t - [z^{-1}])}{\tau(t)}\; \frac{\tau(t)|_{t=0}}{\tau(t - [z^{-1}])|_{t=0}} \tag{11.75}$$
Recall that $\tau(t)$ depends on times only through the variables $X_i(t) = X_i(0)\, e^{2\xi(p_i, t)}$ so that:
$$X_i(t - [z^{-1}]) = \frac{z - p_i}{z + p_i}\, X_i(t)$$
Plugging this into eq. (11.44) for $\tau(t)$, we find a formula of the type:
$$\frac{\tau(t - [z^{-1}])}{\tau(t)} = \frac{\prod_i (z - z_i(t))}{\prod_i (z + p_i)} \tag{11.76}$$
This shows that the right-hand side of eq. (11.75) is exactly of the form eq. (11.73). We shall now prove that the $z_i(t)$ defined by eq. (11.74) and eq. (11.76) are the same. For this it is sufficient to show that the $z_i(t)$ from eq. (11.76) satisfy eq. (11.74). We note the relation
$$\tau_n(X) = \tau_{n-1}(X) + X_i\, \tau_{n-1}\Big( \frac{(p_i - p_k)^2}{(p_i + p_k)^2}\, X_k \Big)$$
where $\tau_{n-1}(X)$ is given by the formula for $(n - 1)$ solitons ($X_i$ removed), and in the coefficient of $X_i$ all arguments $X_k$ in $\tau_{n-1}(X)$ are replaced as indicated. This formula is obvious from eq. (11.44). With its help, we can compare the residue of the pole $z = -p_i$ on both sides of eq. (11.76). We find:
$$-\frac{2p_i}{z + p_i}\, X_i\, \frac{\tau_{n-1}\big( \frac{p_i - p_j}{p_i + p_j} X_j \big)}{\tau_n(X)} = -\frac{1}{z + p_i}\, \frac{\prod_j (p_i + z_j(t))}{\prod_{j \neq i} (p_i - p_j)}$$
Similarly, comparing the two sides of eq. (11.76) for $z = p_i$, we find:
$$\frac{\tau_{n-1}\big( \frac{p_i - p_j}{p_i + p_j} X_j \big)}{\tau_n(X)} = \frac{1}{2p_i}\, \frac{\prod_j (p_i - z_j(t))}{\prod_{j \neq i} (p_i + p_j)}$$
Combining the two equations yields
$$\prod_j \frac{p_i + z_j(t)}{p_i - z_j(t)} = \prod_{j \neq i} \frac{p_i - p_j}{p_i + p_j}\; X_i(t) \tag{11.77}$$
This is exactly eq. (11.74). We also found the value of the constants $a_i$ in that equation:
$$a_i = \prod_{j \neq i} \frac{p_i - p_j}{p_i + p_j}\; X_i(0)$$
The equations of motion for the divisor $D(t) = \{z_i(t)\}$ are obtained by taking the limit of eq. (11.63) in which we have to replace $\lambda_{\gamma_i}(t) = z_i^2(t)$. They read:
$$\nabla(z^2)\, z_i = -\frac{\prod_j (z_i^2 - p_j^2)\, \prod_{j\neq i}(z^2 - z_j^2)}{\prod_j (z^2 - p_j^2)\, \prod_{j\neq i}(z_i^2 - z_j^2)}$$
In particular, taking the coefficient of $z^{-2}$ we find:
$$\partial_x z_i = -\frac{\prod_j (z_i^2 - p_j^2)}{\prod_{j\neq i}(z_i^2 - z_j^2)} \tag{11.78}$$
The solution of these equations is of course given by solving the algebraic system eq. (11.74).
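As an illustration, the following minimal Python sketch treats the case n = 2. It solves the linear system (11.74) for the symmetric functions e1 = z1 + z2 and e2 = z1 z2, and then checks that the velocities given by eq. (11.78) satisfy the relation sum_j (dz_j/dx)/(p_i^2 - z_j^2) = 1, which follows from differentiating the logarithm of eq. (11.74) with respect to x under the assumption that d xi(p_i, t)/dx = p_i. The function names and the numerical values of p_i and of A_i = a_i exp(2 xi(p_i, t)) are illustrative choices, not taken from the text.

```python
import cmath

def solve_divisor(p, A):
    """Zeros z1, z2 of the n = 2 system (11.74).

    For each i, (p_i+z1)(p_i+z2) = A_i (p_i-z1)(p_i-z2) is linear in
    e1 = z1 + z2 and e2 = z1*z2:
        p_i (1 + A_i) e1 + (1 - A_i) e2 = (A_i - 1) p_i**2
    """
    (p1, p2), (A1, A2) = p, A
    a11, a12, b1 = p1 * (1 + A1), 1 - A1, (A1 - 1) * p1**2
    a21, a22, b2 = p2 * (1 + A2), 1 - A2, (A2 - 1) * p2**2
    det = a11 * a22 - a12 * a21
    e1 = (b1 * a22 - a12 * b2) / det          # Cramer's rule
    e2 = (a11 * b2 - a21 * b1) / det
    root = cmath.sqrt(e1 * e1 - 4 * e2)       # z is a root of z^2 - e1 z + e2
    return (e1 + root) / 2, (e1 - root) / 2

def x_velocity(z, p, i):
    """Right-hand side of eq. (11.78) for the zero z[i]."""
    num = 1
    for pj in p:
        num *= z[i]**2 - pj**2
    den = 1
    for j, zj in enumerate(z):
        if j != i:
            den *= z[i]**2 - zj**2
    return -num / den

p = (1.0, 2.0)
A = (3.0, -0.5)   # illustrative values of a_i * exp(2 xi(p_i, t))
z = solve_divisor(p, A)

# Residuals of eq. (11.74) at the computed zeros.
residuals = [(pi + z[0]) * (pi + z[1]) / ((pi - z[0]) * (pi - z[1])) - Ai
             for pi, Ai in zip(p, A)]

# Differentiated form of (11.74): sum_j z_j' / (p_i^2 - z_j^2) should equal 1.
flow_check = [sum(x_velocity(z, p, j) / (pi**2 - z[j]**2) for j in range(2))
              for pi in p]
```

The second check is purely algebraic: it holds for any distinct values of the z_j, which is the Lagrange interpolation identity behind the inversion of the Cauchy matrix used in the proof below.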
We can also compute the degenerate limit of the Baker–Akhiezer function starting from eq. (11.56). It suffices to notice that:
$$\left.\frac{\sqrt{R(\lambda)}}{B(\lambda, x)}\right|_{\lambda = z^2} = z - \frac{1}{2}\sum_i \left(\frac{1}{z - z_i} + \frac{1}{z + z_i}\right) \partial_x z_i \tag{11.79}$$
so that performing the integral in eq. (11.56) we get eq. (11.73). One can use the rationality of the spectral curve in this degenerate case to express the tau-function as a rational function of the divisor $D(t)$.
Proposition. The tau-function admits a simple expression in terms of the divisor $D(t)$:
$$\tau(t) = (-1)^{\frac{n(n-1)}{2}}\; 2^n \prod_{j=1}^{n} p_j\; \prod_{i<j}(p_i + p_j)\; \frac{\prod_{i<j}(z_i + z_j)}{\prod_{i,j}(p_i - z_j)}$$
Proof. We start from the logarithm of eq. (11.76) and apply the operator $\partial_z - \nabla(z^2)$, which vanishes on $\log \tau(t - [z^{-1}])$. We get:
$$\nabla(z^2) \log \tau(t) = \sum_i \left(\frac{1}{z - z_i} - \frac{1}{z + p_i}\right) + \sum_i \frac{1}{z - z_i}\, \nabla(z^2) z_i$$
Since the tau-function depends on times only through the $z_i(t)$, we have $\nabla(z^2) \log \tau(t) = \sum_i \partial_{z_i} \log \tau(t)\, \nabla(z^2) z_i$. Identifying the residues of the poles in $z$ at $z = \pm p_l$ and $z = z_l$, one finds the $n$ conditions:
$$\sum_i \frac{1}{p_l^2 - z_i^2}\; \partial_{z_i} \log \tau(t)\; \partial_x z_i = \sum_i \frac{1}{p_l - z_i}\; \frac{1}{p_l^2 - z_i^2}\; \partial_x z_i, \qquad l = 1, \ldots, n$$
This is a linear system for the quantities $\partial_{z_i} \log \tau(t)\, \partial_x z_i$, the solution of which requires the inversion of the Cauchy matrix $M_{ij} = 1/(p_i^2 - z_j^2)$. Recalling eq. (9.33) in Chapter 9 we find:
$$(M^{-1})_{ij} = (-1)^{n-1}\, \frac{\prod_{k\neq i}(p_j^2 - z_k^2)\; \prod_k (p_k^2 - z_i^2)}{\prod_{k\neq j}(p_j^2 - p_k^2)\; \prod_{k\neq i}(z_i^2 - z_k^2)} = \partial_x z_i\; \frac{\prod_{k\neq i}(p_j^2 - z_k^2)}{\prod_{k\neq j}(p_j^2 - p_k^2)}$$
This gives:
$$\partial_{z_i} \log \tau = (-1)^{n-1} \sum_{j,l} \frac{\prod_{k\neq i}(p_j^2 - z_k^2)\; \prod_{k\neq j}(p_k^2 - z_l^2)}{\prod_{k\neq j}(p_j^2 - p_k^2)\; \prod_{k\neq l}(z_l^2 - z_k^2)}\; \frac{1}{p_j - z_l}$$
We consider this expression as a rational function of $z_i$. It has poles at $z_i = p_j$ for any $j$ and at $z_i = \pm z_j$ for $j \neq i$, and goes to 0 at $z_i = \infty$.
One sees easily that the residue at $z_i = p_j$ is equal to $-1$, and that the residue at $z_i = z_j$ vanishes. We finally get:
$$\partial_{z_i} \log \tau = \sum_j \frac{1}{p_j - z_i} + \sum_{l\neq i} \frac{r_l}{z_i + z_l}$$
with
$$r_l = \sum_j \frac{\prod_{k\neq j}(z_l^2 - p_k^2)}{\prod_{k\neq j}(p_j^2 - p_k^2)}\; \frac{\prod_{k\neq i,l}(p_j^2 - z_k^2)}{\prod_{k\neq l,i}(z_l^2 - z_k^2)}$$
In fact we have $r_l = 1$. This is equivalent to the identity:
$$\sum_j \frac{\prod_{k\neq i,l}(p_j^2 - z_k^2)}{\prod_{k\neq j}(p_j^2 - p_k^2)}\; \frac{1}{z_l^2 - p_j^2} = \frac{\prod_{k\neq i,l}(z_l^2 - z_k^2)}{\prod_k (z_l^2 - p_k^2)}$$
which is easily checked by comparing the residues in $z_l^2$ on both sides. This gives the final simple result:
$$\partial_{z_i} \log \tau = \sum_j \frac{1}{p_j - z_i} + \sum_{l\neq i} \frac{1}{z_i + z_l}$$
Integrating this formula, we find:
$$\tau = C\; \frac{\prod_{i<j}(z_i + z_j)}{\prod_{i,j}(p_i - z_j)} \tag{11.80}$$
where the constant $C$ does not depend on the $z_i$, hence is independent of times. To find it, note that when the times are set to $-\infty$ we have $\tau = 1$. Equation (11.76) implies that the $z_i(-\infty)$ are equal to the $-p_i$ up to ordering. Inserting these values into eq. (11.80) gives
$$C = (-1)^{\frac{n(n-1)}{2}}\; 2^n \prod_j p_j\; \prod_{i<j}(p_i + p_j)$$
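The proposition can be checked numerically for n = 2. The sketch below assumes that eq. (11.44) reduces, for two solitons, to tau = 1 + X1 + X2 + ((p1 - p2)/(p1 + p2))^2 X1 X2, which is the form implied by the recursion relation between tau_n and tau_{n-1} quoted above. It obtains the zeros z1, z2 from eq. (11.77) and compares eq. (11.80), with the constant C just computed, to that direct expression. Function names and sample values are illustrative.

```python
import cmath

def two_soliton_tau(p, X):
    """Assumed n = 2 form of eq. (11.44)."""
    (p1, p2), (X1, X2) = p, X
    A12 = ((p1 - p2) / (p1 + p2))**2
    return 1 + X1 + X2 + A12 * X1 * X2

def zeros_from_X(p, X):
    """Solve eq. (11.77) for n = 2 via e1 = z1 + z2, e2 = z1*z2."""
    (p1, p2), (X1, X2) = p, X
    rhs = ((p1 - p2) / (p1 + p2) * X1, (p2 - p1) / (p2 + p1) * X2)
    rows = [(pi * (1 + Ai), 1 - Ai, (Ai - 1) * pi**2) for pi, Ai in zip(p, rhs)]
    (a11, a12, b1), (a21, a22, b2) = rows
    det = a11 * a22 - a12 * a21
    e1 = (b1 * a22 - a12 * b2) / det
    e2 = (a11 * b2 - a21 * b1) / det
    root = cmath.sqrt(e1 * e1 - 4 * e2)
    return (e1 + root) / 2, (e1 - root) / 2

def tau_from_divisor(p, z):
    """Eq. (11.80) for n = 2, with C = (-1)^(n(n-1)/2) 2^n p1 p2 (p1 + p2)."""
    p1, p2 = p
    C = -4 * p1 * p2 * (p1 + p2)
    denominator = 1
    for pi in p:
        for zj in z:
            denominator *= pi - zj
    return C * (z[0] + z[1]) / denominator

p = (1.0, 2.0)
samples = [(1.0, 1.0), (0.3, 2.5)]
differences = [tau_from_divisor(p, zeros_from_X(p, X)) - two_soliton_tau(p, X)
               for X in samples]
```

For X = (1, 1) and p = (1, 2) both expressions give 28/9.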
To end this section, we would like to discuss the Poisson structures of the KdV equation in these variables. As we know, we have a whole hierarchy of these structures. Let $\omega_k$, $k = 1, 2$, be the restriction of the $k$-th symplectic form on the manifold of finite-zone solutions. We have seen that:
$$\omega_k = c_k \sum_{i=1}^{n} \delta P(\lambda_{\gamma_i}) \wedge \frac{\delta \lambda_{\gamma_i}}{\lambda_{\gamma_i}^{k-1}} \tag{11.81}$$
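where $c_k$ is a normalization constant and the Bloch momentum $P(\lambda)$ is defined as:
$$P(\lambda) = \frac{1}{2} \log \frac{\Psi(\lambda, x = \ell)}{\Psi(\lambda, x = -\ell)}$$
To compute this Bloch momentum in the soliton limit, we use eq. (11.73) and we send $\ell \to \infty$. Equation (11.74) shows that, up to ordering, we have $z_i(\ell) \to p_i$, $z_i(-\ell) \to -p_i$. Using eq. (11.70) and eq. (11.79), which is symmetric in the $z_i$, we get:
$$P(z) = \ell z - \frac{1}{2} \sum_j \log \frac{z + p_j}{z - p_j} \mod i\pi$$
Hence, recalling that $\lambda_{\gamma_i} = z_i^2$, we find:
$$\omega_k = c_k \sum_{i=1}^{n} \delta \left(\sum_j \log \frac{z_i - p_j}{z_i + p_j}\right) \wedge \frac{\delta z_i}{z_i^{2k-3}} = 2 c_k \sum_{i,j} \frac{\delta z_i \wedge \delta p_j}{(z_i^2 - p_j^2)\, z_i^{2k-4}}$$
The form $\omega_2$, normalized with $c_2 = 2$, corresponds to the Magri–Virasoro bracket:
$$\omega_2 = 4 \sum_{i,j} \frac{\delta z_i \wedge \delta p_j}{z_i^2 - p_j^2} = 4 \sum_j \left(\sum_i \frac{\delta z_i}{z_i^2 - p_j^2}\right) \wedge \delta p_j \tag{11.82}$$
In the coordinates $(X_i(0), p_i)$, it reads
$$\omega_2 = 2 \sum_i \frac{\delta p_i}{p_i} \wedge \frac{\delta X_i}{X_i} - 8 \sum_{i<j} \frac{\delta p_i \wedge \delta p_j}{p_i^2 - p_j^2} \tag{11.83}$$
To see that, we differentiate eq. (11.77), getting
$$2 p_i \sum_j \frac{\delta z_j}{z_j^2 - p_i^2} = -\frac{\delta X_i}{X_i} + 2 p_i \sum_{j\neq i} \frac{\delta p_j}{p_i^2 - p_j^2} + \Lambda_i\, \delta p_i$$
where $\Lambda_i$ is a coefficient whose value does not matter. Inserting into eq. (11.82), we arrive at eq. (11.83). We can write the Poisson brackets
$$\{p_i, p_j\}_2 = 0, \qquad \{p_i, X_j\}_2 = \frac{1}{2}\, p_i X_j\, \delta_{ij}, \qquad \{X_i, X_j\}_2 = 2\, \frac{p_i p_j X_i X_j}{p_i^2 - p_j^2}$$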
As a simple consistency check, we use the Hamiltonians, eq. (11.47), and compute
$$\partial_{t_{2j+1}} X_i = \{H_{2j-1}, X_i\}_2 = 2 p_i^{2j+1} X_i$$
in agreement with the fact that $H_{2j-1}$ generate the flow $t_{2j+1}$ with $\{\ ,\ \}_2$.
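A further check, sketched here in Python for n = 2, is that the brackets above are inverse to the two-form (11.83). In the coordinates (p1, p2, X1, X2) we build the antisymmetric matrix W of the form, the antisymmetric matrix B of the brackets, and verify that B W = -Id. The overall sign in this relation depends on the convention relating a symplectic form to its bracket, which is an assumption here; the helper names and sample values are illustrative.

```python
def antisymmetric(entries, size=4):
    """Antisymmetric matrix from its strictly upper-triangular entries."""
    M = [[0.0] * size for _ in range(size)]
    for (a, b), value in entries.items():
        M[a][b] = value
        M[b][a] = -value
    return M

def omega2_matrix(p, X):
    """Components of eq. (11.83) in the coordinates (p1, p2, X1, X2)."""
    (p1, p2), (X1, X2) = p, X
    return antisymmetric({
        (0, 2): 2 / (p1 * X1),
        (1, 3): 2 / (p2 * X2),
        (0, 1): -8 / (p1**2 - p2**2),
    })

def bracket_matrix(p, X):
    """Brackets {p_i, p_j}, {p_i, X_j}, {X_i, X_j} in the same ordering."""
    (p1, p2), (X1, X2) = p, X
    return antisymmetric({
        (0, 2): 0.5 * p1 * X1,
        (1, 3): 0.5 * p2 * X2,
        (2, 3): 2 * p1 * p2 * X1 * X2 / (p1**2 - p2**2),
    })

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

p, X = (1.0, 2.0), (0.7, 1.3)
product = matmul(bracket_matrix(p, X), omega2_matrix(p, X))

# Largest deviation of B W from minus the identity.
deviation = max(abs(product[a][b] + (1.0 if a == b else 0.0))
                for a in range(4) for b in range(4))
```

The non-trivial cancellation occurs in the (X_i, p_j) entries with i different from j, where the {X_i, X_j} bracket compensates the delta p_i wedge delta p_j term of the form.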
11.9 Local fields
In this and the next section, we study the Whitham average of local fields in the KdV hierarchy. By definition, the space of local fields is the vector space generated by monomials of the form $O(u, u', u'', \ldots)$ where the prime denotes $\partial_x$. In particular, we wish to know when the Whitham average of a local field vanishes. One such case is when the local field we consider is the derivative, with respect to any time of the hierarchy, of another local field. This is a motivation for presenting local fields in the form:
$$O(u, u', u'', \ldots) = \sum_{|\nu| \geq 0} \partial^{\nu} E_{O,\nu}(S_2, S_4, \ldots) \tag{11.84}$$
where $\nu = (i_1, i_3, \ldots)$ is a multi-index, $\partial^{\nu} = \partial_{t_1}^{i_1} \partial_{t_3}^{i_3} \cdots$, and $|\nu| = i_1 + 3 i_3 + \cdots$. This is not the whole story, however, because some linear combinations in the right-hand side of eq. (11.84) identically vanish by the equations of motion of the KdV hierarchy. We define a null vector as such a combination which vanishes when we take the equations of motion into account. Our first task is to describe the null vectors. We will analyse the Whitham average in the next section. The possibility of going from one presentation to the other in eq. (11.84) relies on the possibility of replacing the odd derivatives $\partial_x^{2j-1} u$ by the higher time derivatives $\partial_{t_{2j-1}}$, according to the equations of motion
$$\partial_{t_{2j-1}} u = [(L^{\frac{2j-1}{2}})_+, L] = \frac{1}{2^{2j-2}}\, u^{(2j-1)} + \cdots$$
Similarly, the even derivatives $\partial_x^{2j} u$ can be replaced by the densities of the integrals of motion $S_{2j}$:
$$S_{2j} = \mathrm{Res}_{\partial}\, L^{\frac{2j-1}{2}} = -\frac{1}{2^{2j-1}}\, u^{(2j-2)} + \cdots$$
Let us compare the dimensions of the space $L$ of local fields and the space $E$ generated by elements of the form $\partial^{\nu} E(S_2, S_4, \ldots)$, where $E(S_2, S_4, \ldots)$ is a monomial in $S_2, S_4, \ldots$. Since both spaces are infinite-dimensional, we introduce a grading by attributing to $\partial_x$ weight 1 and to $u$ weight 2. As a consequence, $\partial_{t_{2j-1}}$ has weight $2j - 1$ and $S_{2j}$ has weight $2j$. This makes the two spaces $L$ and $E$ graded vector spaces:
$$L = \bigoplus_{n=0}^{\infty} L_n, \qquad E = \bigoplus_{n=0}^{\infty} E_n$$
At each grade $n$, the vector spaces $L_n$ and $E_n$ are finite-dimensional. We define the characters:
$$\chi(L) = \sum_{n=0}^{\infty} q^n \dim L_n, \qquad \chi(E) = \sum_{n=0}^{\infty} q^n \dim E_n$$
The space $L_n$ is made of monomials in $u, u', u'', \ldots$ of weight $n$. It is quite clear that
$$\chi(L) = \prod_{j\geq 2} \frac{1}{1 - q^j} = (1 - q) \prod_{j\geq 1} \frac{1}{1 - q^j} = 1 + q^2 + q^3 + 2q^4 + 2q^5 + \cdots$$
because the number of local fields of weight $k$ is the number of ways to write $2n_1 + 3n_2 + \cdots = k$, so that $\chi(L) = \sum_{n_1} q^{2n_1} \sum_{n_2} q^{3n_2} \cdots$. Similarly, the character of the vector space $E$ can be easily computed and is found to be:
$$\chi(E) = \prod_{j\geq 1} \frac{1}{1 - q^{2j-1}} \prod_{j\geq 1} \frac{1}{1 - q^{2j}} = \prod_{j\geq 1} \frac{1}{1 - q^j} = 1 + q + 2q^2 + 3q^3 + 5q^4 + 7q^5 + \cdots$$
The first infinite product counts the factors $\partial^{\nu}$ and the second product counts the factors $E_{O,\nu}$ in eq. (11.84). Note that we have $\chi(E) > \chi(L)$, meaning that for each $n$, $\dim E_n \geq \dim L_n$. However, the equations of motion of the KdV hierarchy imply that many expressions of the form of those in the right-hand side of eq. (11.84) vanish. So the equality in eq. (11.84) is meant modulo the equations of motion of the KdV hierarchy. Let us give some examples of null vectors:
level 1: $\partial_{t_1} \cdot 1 = 0$
level 2: $\partial_{t_1}^2 \cdot 1 = 0$
level 3: $\partial_{t_1}^3 \cdot 1 = 0$, $\partial_{t_3} \cdot 1 = 0$
level 4: $\partial_{t_1}^4 \cdot 1 = 0$, $\partial_{t_1} \partial_{t_3} \cdot 1 = 0$, $(\partial_{t_1}^2 S_2 - 4 S_4 + 6 S_2^2) = 0$
level 5: $\partial_{t_1}^5 \cdot 1 = 0$, $\partial_{t_1}^2 \partial_{t_3} \cdot 1 = 0$, $\partial_{t_5} \cdot 1 = 0$, $\partial_{t_1}(\partial_{t_1}^2 S_2 - 4 S_4 + 6 S_2^2) = 0$, $(\partial_{t_3} S_2 - \partial_{t_1} S_4) = 0$
We have written all the null vectors explicitly to show that their numbers exactly match the character formula. The non-trivial null vector at level 4 expresses $S_4$ in terms of $u$: $8 S_4 = -u'' + 3u^2$. With this identification, the non-trivial null vector at level 5, $\partial_{t_3} S_2 - \partial_{t_1} S_4 = 0$, gives the KdV equation eq. (11.1).
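The counting above can be reproduced with a few lines of Python. The sketch below expands the two characters as power series in q and takes the difference dim E_n - dim L_n, which is the number of null vectors expected at level n. The helper name `inverse_product` and the truncation order are illustrative choices.

```python
def inverse_product(parts, order):
    """Coefficients of prod_{j in parts} 1/(1 - q^j), up to q^order."""
    coeffs = [1] + [0] * order
    for j in parts:
        # Multiplying by 1/(1 - q^j) is a running sum with stride j.
        for n in range(j, order + 1):
            coeffs[n] += coeffs[n - j]
    return coeffs

ORDER = 8
chi_L = inverse_product(range(2, ORDER + 1), ORDER)   # local fields: parts >= 2
chi_E = inverse_product(range(1, ORDER + 1), ORDER)   # space E: all parts >= 1
null_counts = [e - l for e, l in zip(chi_E, chi_L)]

print(chi_L[:6])        # [1, 0, 1, 1, 2, 2]
print(chi_E[:6])        # [1, 1, 2, 3, 5, 7]
print(null_counts[:6])  # [0, 1, 1, 2, 3, 5]
```

The differences 1, 1, 2, 3, 5 at levels 1 to 5 agree with the number of null vectors listed above. Since chi(E) - chi(L) = q chi(E), the number of null vectors at level n equals dim E_{n-1}.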
The goal of our study is to find all the null vectors. They will have a simple description, given in eqs. (11.90, 11.97) below, at the price of introducing a fermionic language. A generating function for the monomials spanning the vector space $E$ is:
$$E(u, v) \equiv e^{\sum_j u_{2j-1} \partial_{t_{2j-1}}}\; e^{\sum_n v_{2n} J_{2n}} \tag{11.85}$$
where the coefficients $J_{2n}$ are defined by
$$\log S(\lambda) \equiv -\sum_{n>0} \frac{1}{n}\, J_{2n}\, \lambda^{-n} \tag{11.86}$$
One can express any monomial in the $S_{2n}$ in terms of the $J_{2n}$ and vice versa. To describe the null vectors, we introduce fermionic and bosonic fields as in Chapter 9:
$$\beta(\lambda) = \sum_{r \in \mathbb{Z} + \frac{1}{2}} \beta_r\, \lambda^{-r-1/2}, \qquad \beta^*(\lambda) = \sum_{r \in \mathbb{Z} + \frac{1}{2}} \beta_r^*\, \lambda^{-r-1/2} \tag{11.87}$$
$$H(\lambda) = \sum_{n \in \mathbb{Z}} H_n\, \lambda^{-n-1} = \; :\!\beta(\lambda) \beta^*(\lambda)\!:$$
The vacuum $|0\rangle$ is characterized by $\beta_r |0\rangle = 0$, $\beta_r^* |0\rangle = 0$, for $r > 0$. By the Campbell–Hausdorff formula we have:
$$e^{\sum_n v_{2n} J_{2n}} = \langle 0|\, e^{-\sum_{n>0} v_{2n} H_n}\; e^{-\sum_{n>0} \frac{1}{n} J_{2n} H_{-n}}\, |0\rangle$$
Hence the monomials in $J_{2n}$ are in one-to-one correspondence with the components of the vector $Z|0\rangle$ in Fock space, where:
$$Z \equiv e^{-\sum_{n>0} \frac{1}{n} J_{2n} H_{-n}} = e^{\oint \frac{d\lambda}{2i\pi} \log S(\lambda)\, H(\lambda)} \tag{11.88}$$
The full set of elements of the space $E$ is obtained by acting on this vector with the time derivatives $\partial_{t_{2j-1}}$. Null vectors admit a particularly simple description in this setting.
Proposition. Let $\nabla(\lambda) = \sum_{j\geq 1} \lambda^{-j} \partial_{t_{2j-1}}$ be the operator defined in eq. (11.22), and $Q$ be the fermionic operator:
$$Q = \oint \frac{d\lambda}{2i\pi}\, \beta(\lambda) \nabla(\lambda) \tag{11.89}$$
Then the equations of motion of the KdV hierarchy imply:
$$Q\, Z|0\rangle = 0 \tag{11.90}$$
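Proof. We need the two formulae
$$\nabla(\lambda) Z = Z \oint \frac{d\lambda'}{2i\pi}\, S^{-1}(\lambda') \nabla(\lambda) S(\lambda')\, H(\lambda') \tag{11.91}$$
and
$$\beta(\lambda) Z = S^{-1}(\lambda)\, Z\, \beta(\lambda) \tag{11.92}$$
These formulae are straightforward consequences of the fact that all $H_n$, $n > 0$, mutually commute, and eq. (9.43) in Chapter 9 applies. So we can write
$$Q\, Z|0\rangle = Z \oint \frac{d\lambda}{2i\pi} \oint \frac{d\lambda'}{2i\pi}\, S^{-1}(\lambda) S^{-1}(\lambda') \nabla(\lambda) S(\lambda')\, \beta(\lambda) H(\lambda') |0\rangle$$
$$= Z \oint \frac{d\lambda}{2i\pi} \oint \frac{d\lambda'}{2i\pi}\, \frac{\partial_x \log S(\lambda') - \partial_x \log S(\lambda)}{\lambda - \lambda'}\, \beta(\lambda) H(\lambda') |0\rangle \tag{11.93}$$
In the last step we used eq. (11.23), valid for $|\lambda| > |\lambda'|$. We examine separately the $\partial_x \log S(\lambda')$ and the $\partial_x \log S(\lambda)$ terms. The $\partial_x \log S(\lambda')$ term reads:
$$\oint_{|\lambda| > |\lambda'|} \frac{d\lambda}{2i\pi} \frac{d\lambda'}{2i\pi}\, \frac{1}{\lambda - \lambda'}\, \partial_x \log S(\lambda')\, \beta(\lambda) H(\lambda') |0\rangle$$
One can do the integral over $\lambda$. Poles can occur at $\lambda = 0$ and $\lambda = \lambda'$. At $\lambda = 0$, the integrand is actually regular. In fact, potentially dangerous terms come from $\beta(\lambda) H(\lambda') |0\rangle$. But this is regular at $\lambda = 0$ because we have
$$\beta(\lambda) H(\lambda') = \; :\!\beta(\lambda) \beta(\lambda') \beta^*(\lambda')\!: - \frac{1}{\lambda - \lambda'}\, \beta(\lambda'), \qquad |\lambda| > |\lambda'|$$
and by definition of the vacuum and normal ordered product, the term $:\!\beta(\lambda) H(\lambda')\!: |0\rangle$ is regular at $\lambda = 0$. The same formula is used to analyse the poles at $\lambda = \lambda'$. One has two terms. The first one is
$$\oint \frac{d\lambda}{2i\pi}\, \frac{1}{\lambda - \lambda'}\, :\!\beta(\lambda) \beta(\lambda') \beta^*(\lambda')\!: |0\rangle$$
which is zero because at $\lambda = \lambda'$ we get the product of two fermionic fields at the same point inside the normal product, and this vanishes. The second term is equal to
$$-\oint \frac{d\lambda}{2i\pi}\, \frac{1}{(\lambda - \lambda')^2}\, \beta(\lambda') |0\rangle$$
and this obviously vanishes. Next, we examine the $\partial_x \log S(\lambda)$ term:
$$\oint_{|\lambda| > |\lambda'|} \frac{d\lambda}{2i\pi} \frac{d\lambda'}{2i\pi}\, \frac{1}{\lambda - \lambda'}\, \partial_x \log S(\lambda)\, \beta(\lambda) H(\lambda') |0\rangle$$
This time we can do the $\lambda'$ integral. But, by the properties of the vacuum vector, the integrand is regular at $\lambda' = 0$ because $H(\lambda') |0\rangle = O(1)$, and the integral vanishes. We have shown that $Q\, Z|0\rangle = 0$.
Equation (11.90) is not sufficient to characterize null vectors. We now exhibit a second equation which, together with the first one, will provide a complete set of constraints.
Proposition. Let $C$ be the fermionic operator $C = C_0 + C_1$, where
$$C_0 = \oint \frac{d\lambda}{2i\pi}\, \beta(\lambda)\, \lambda \frac{d}{d\lambda} \beta(\lambda)$$
$$C_1 = \oint_{|\lambda_1| < |\lambda_2|} \frac{d\lambda_1}{2i\pi} \frac{d\lambda_2}{2i\pi}\, \log\left(1 - \frac{\lambda_1}{\lambda_2}\right) \beta(\lambda_1) \beta(\lambda_2) \nabla(\lambda_1) \nabla(\lambda_2) \tag{11.94}$$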
|λ1 ||λ] 2iπ λ1 − λ So, one has to evaluate the integral dλ1 dλ λ1 ∂x log S(λ) − ∂x log S(λ1 ) log 1 − β(λ1 )H(λ)|0 λ2 λ1 − λ |λ1 |>|λ] 2iπ 2iπ
This is exactly the same type of integral we met in eq. (11.93). Again, we see that the term $\partial_x \log S(\lambda_1)$ vanishes because in the $\lambda$ integral, the integrand is regular at $\lambda = 0$. Only the term $\partial_x \log S(\lambda)$ contributes. This time, however, the double pole gives a non-vanishing contribution:
$$\oint_{|\lambda_1| > |\lambda|} \frac{d\lambda_1}{2i\pi}\, \frac{1}{(\lambda_1 - \lambda)^2}\, \log\left(1 - \frac{\lambda_1}{\lambda_2}\right) = -\frac{1}{\lambda_2 - \lambda}$$
Hence, we find $C_1 Z|0\rangle = -$
dλ2 β(λ2 )∇(λ2 )Z 2iπ
|λ]|λ| 2iπ 2iπ ∂x2 log S(λ) − ∂x2 log S(λ2 ) − (∂x log S(λ2 ))2 : β(λ2 )β(λ) : |0 + (λ2 − λ)2 In the second term, one can always perform one of the integrals. In the first term, we take the half sum with λ and λ2 exchanged, getting an integral which localizes at λ = λ2 . Putting everything together, we get 1 d dλ C1 Z|0 = Z [2∂x2 log S(λ) + (∂x log S(λ))2 ]β(λ)λ β(λ)|0 4 2iπλ dλ Remembering eq. (11.30), we finally obtain: dλ d (C0 + C1 )Z|0 = (u + λ)β(λ)λ β(λ)|0 2iπλ dλ d β(λ)|0 = O(λ). This integral vanishes because β(λ)λ dλ
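Remark. It is useful to write the operators $Q$ and $C$ explicitly. For $Q$ we find
$$Q = \sum_{j\geq 1} \beta_{-j+\frac{1}{2}}\, \partial_{t_{2j-1}}$$
From this expression it is quite clear that $Q^2 = 0$. Similarly, we find
$$C = \sum_{r \geq \frac{1}{2}} \beta_{-r} \left(-2r \beta_r + \sum_{j_1 \geq 1} \sum_{j_2 = 1}^{r - \frac{1}{2}} \frac{1}{r - j_2 + \frac{1}{2}}\, \beta_{r+1-j_1-j_2}\, \partial_{t_{2j_1-1}} \partial_{t_{2j_2-1}}\right)$$
Note that $Q$ and $C$ commute.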
We now show that eqs. (11.90, 11.97) contain all the information about the KdV hierarchy by enumerating all null vectors and comparing characters. Let $F^*_{-n}$ be the Fock space consisting of elements of the form
$$\langle 0|\, \beta_{s_1} \cdots \beta_{s_m}\, \beta^*_{r_1} \cdots \beta^*_{r_{m+n}}$$
with $s_i, r_i \geq \frac{1}{2}$ and all different. Note that we have $n + m$ operators $\beta^*_{r_i}$ of charge $-1$, and $m$ operators $\beta_{s_i}$ of charge $+1$, so that the total charge of the state is $-n$. Attributing a weight $2s$ to $\beta_s$ and $2r$ to $\beta^*_r$ turns the dual Fock space $F^*_{-n}$ of charge $-n$ into a graded vector space. Introducing a parameter $q$ to count the weight and $x$ to count the charge, the character of $F^*_{-n}$ is easily calculated:
$$\chi(F^*_{-n}) = \oint \frac{dx}{2i\pi x}\, x^n \prod_{j\geq 1} (1 + q^{2j-1} x)(1 + q^{2j-1} x^{-1})$$
Changing $x = q^2 x'$ in the above integral, we find the relation $\chi(F^*_{-n}) = q^{2n-1} \chi(F^*_{-n+1})$, so that $\chi(F^*_{-n}) = q^{n^2} \chi(F^*_0)$. On the other hand, $F^*_0$ is isomorphic to the bosonic Fock space generated by the $H_n$, which has weight $2n$. Its character is therefore $\prod_{j\geq 1} (1 - q^{2j})^{-1}$. Hence we have shown
$$\chi(F^*_{-n}) = q^{n^2} \prod_{j\geq 1} \frac{1}{1 - q^{2j}} \tag{11.98}$$
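Formula (11.98) is an instance of the Jacobi triple product identity and can be checked order by order. The Python sketch below expands the product of the factors (1 + q^(2j-1) x)(1 + q^(2j-1)/x) as a Laurent polynomial in x with coefficients truncated at a chosen order in q, and compares the coefficient of x^(-n) with q^(n^2) times the expansion of the product of 1/(1 - q^(2j)). Function names and the truncation order are illustrative.

```python
def fock_characters(order):
    """Laurent coefficients in x of prod_j (1 + q^(2j-1) x)(1 + q^(2j-1)/x).

    Returns {power of x: [coefficient of q^0, ..., q^order]}.
    Factors with 2j - 1 > order only contribute beyond the truncation.
    """
    poly = {0: [1] + [0] * order}
    for m in range(1, order + 1, 2):        # m = 2j - 1
        for shift in (+1, -1):              # factor (1 + q^m x^shift)
            new = {k: c[:] for k, c in poly.items()}
            for k, c in poly.items():
                target = new.setdefault(k + shift, [0] * (order + 1))
                for n in range(order + 1 - m):
                    target[n + m] += c[n]
            poly = new
    return poly

def even_parts_series(order):
    """Coefficients of prod_{j>=1} 1/(1 - q^(2j)), up to q^order."""
    coeffs = [1] + [0] * order
    for j in range(2, order + 1, 2):
        for n in range(j, order + 1):
            coeffs[n] += coeffs[n - j]
    return coeffs

def predicted(n, order):
    """Right-hand side of eq. (11.98), truncated at q^order."""
    base = even_parts_series(order)
    return [0] * min(n * n, order + 1) + base[:max(order + 1 - n * n, 0)]

ORDER = 12
chars = fock_characters(ORDER)
matches = [chars.get(-n, [0] * (ORDER + 1)) == predicted(n, ORDER)
           for n in range(4)]
```

The contour integral in x selects the coefficient of x^(-n), which by the symmetry of the product under x -> 1/x equals the coefficient of x^n.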
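Proposition. Let $F^*_{-2}$ be the dual Fock space of charge $-2$. The map $C : F^*_{-2} \to F^*_0$ is injective.
Proof. Introduce a linear transformation on the space of fermions $\tilde\beta_r = \sum_{r'} N_{rr'} \beta_{r'}$, where the matrix $N_{rr'}$ is defined by
$$\tilde\beta_r = \beta_r, \qquad r \leq -\frac{1}{2}$$
$$\tilde\beta_r = 2r \beta_r - \sum_{j_1 \geq 1} \sum_{j_2 = 1}^{r - \frac{1}{2}} \frac{1}{r - j_2 + \frac{1}{2}}\, \beta_{r+1-j_1-j_2}\, \partial_{t_{2j_1-1}} \partial_{t_{2j_2-1}}, \qquad r \geq \frac{1}{2}$$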
The dual fermions are $\tilde\beta^*_{-r} = \sum_{r'} ({}^t N^{-1})_{r,r'}\, \beta^*_{-r'}$, so that $\tilde\beta$, $\tilde\beta^*$ satisfy the canonical anticommutation relations. Because the transformation $N$ is triangular and leaves $\beta_{-r}$, $\beta^*_r$, $r \geq \frac{1}{2}$, invariant, the vacuum is invariant and the Fock spaces are the same. In the $\tilde\beta_r$ basis we can write
$$C = \sum_{r \geq \frac{1}{2}} \tilde\beta_{-r}\, \tilde\beta_r$$
Let us call $X_+ \equiv C$ and introduce $X_- = \sum_{r \geq \frac{1}{2}} \tilde\beta^*_{-r}\, \tilde\beta^*_r$. We have the $sl_2$ algebra
$$[X_+, X_-] = H_0, \qquad [H_0, X_\pm] = \pm 2 X_\pm$$
where $H_0 = \sum_r :\!\beta_r \beta^*_{-r}\!: \; = \sum_r :\!\tilde\beta_r \tilde\beta^*_{-r}\!:$ is the charge operator. The spaces $F^*_{-n}$ are eigenspaces of $H_0$. So, we have a representation of the $sl_2$ algebra on the full Fock space $F^* = \bigoplus_n F^*_n$. Note, moreover, that $X_\pm$, $H_0$ are of grade zero, so their action on $F^*_{-n}$ preserves the gradation of these spaces, and we can restrict them to subspaces of given grade. Subspaces of given grade are finite-dimensional. For each grade, we can decompose $F^*$ into a direct sum of irreducible finite-dimensional representations of $sl_2$. Finally, on a finite-dimensional irreducible representation, $X_+ : F^*_{n-2} \to F^*_n$ is injective for $n \leq 0$.
The weights were so chosen that, taking into account the weights of $\partial_{t_{2j-1}}$ defined above, the operators $Q$ and $C$ have weight zero. Hence they preserve the gradings of the spaces on which they act. This is an important observation in the proof of the following:
Proposition. The character of the space of vectors in $E$, solutions of the equations eq. (11.90) and eq. (11.97), is
$$\chi_1 = \prod_{j\geq 2} \frac{1}{1 - q^j}$$
It is equal to the character of the space of local fields. As a consequence eqs. (11.90, 11.97) capture the complete information about the KdV hierarchy.
Proof. The character we are looking for is of the form
$$\chi_1 = \prod_{j\geq 1} \frac{1}{1 - q^{2j-1}} \cdot \chi$$
where the first factor comes from the $u$-exponential in eq. (11.85), while the second factor, $\chi$, comes from the $v$-exponential, subjected to the conditions eqs. (11.90, 11.97). To compute $\chi$, we count the dimension of the dual spaces. The $C$ conditions are taken into account by considering the space $F^*_0 / F^*_{-2} C$. Due to the previous proposition, the character of this space is $\chi(F^*_0) - \chi(F^*_{-2})$. Next, we have to take into account the $Q$ conditions. Because $Q$ and $C$ commute, this is achieved by replacing the spaces $F^*_{-n}$ by the spaces $\mathcal{H}^*_{-n} = F^*_{-n} / F^*_{-n-1} Q$. It follows that:
$$\chi = \chi(\mathcal{H}^*_0) - \chi(\mathcal{H}^*_{-2}) \tag{11.99}$$
To compute the characters $\chi(\mathcal{H}^*_{-n})$, we take into account that $Q$ is a nilpotent operator with trivial cohomology. In fact, we can construct a homotopy:
$$Q^* = \sum_{j\geq 1} \beta^*_{j-\frac{1}{2}}\, \partial^{-1}_{t_{2j-1}}$$
The operator $Q Q^* + Q^* Q$ acting on any vector reproduces it up to a constant. Hence
$$\mathrm{Ker}\,(Q : F^*_{-n-1} \to F^*_{-n}) = \mathrm{Im}\,(Q : F^*_{-n-2} \to F^*_{-n-1})$$
Summing over this complex, we get, using eq. (11.98):
$$\chi(\mathcal{H}^*_{-n}) = \left[q^{n^2} - q^{(n+1)^2} + q^{(n+2)^2} - \cdots\right] \prod_{j\geq 1} \frac{1}{1 - q^{2j}}$$
Inserting into eq. (11.99), we get $\chi = (1 - q) \prod_{j\geq 1} (1 - q^{2j})^{-1}$.
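The last steps of this proof are easy to verify with truncated power series. The following Python sketch builds the alternating sums of eq. (11.98) that give the characters of the quotient spaces at charges 0 and -2, forms chi as their difference as in eq. (11.99), multiplies by the odd-part factor, and compares the result with the character of the space of local fields. Function names and the truncation order are illustrative.

```python
def inverse_product(parts, order):
    """Coefficients of prod_{j in parts} 1/(1 - q^j), up to q^order."""
    coeffs = [1] + [0] * order
    for j in parts:
        for n in range(j, order + 1):
            coeffs[n] += coeffs[n - j]
    return coeffs

def multiply(a, b, order):
    """Product of two truncated power series."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b[: order + 1 - i]):
                out[i + j] += ai * bj
    return out

def alternating_squares(n, order):
    """Coefficients of q^(n^2) - q^((n+1)^2) + q^((n+2)^2) - ..., up to q^order."""
    coeffs = [0] * (order + 1)
    k = 0
    while (n + k) ** 2 <= order:
        coeffs[(n + k) ** 2] += (-1) ** k
        k += 1
    return coeffs

ORDER = 30
even = inverse_product(range(2, ORDER + 1, 2), ORDER)
odd = inverse_product(range(1, ORDER + 1, 2), ORDER)

chi_H0 = multiply(alternating_squares(0, ORDER), even, ORDER)
chi_H2 = multiply(alternating_squares(2, ORDER), even, ORDER)
chi = [a - b for a, b in zip(chi_H0, chi_H2)]            # eq. (11.99)

chi_1 = multiply(odd, chi, ORDER)                        # character of solutions
chi_L = inverse_product(range(2, ORDER + 1), ORDER)      # local fields
```

In the difference of the two alternating sums all terms beyond 1 - q cancel pairwise, which is why chi reduces to (1 - q) times the even-part product.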
Equations (11.90, 11.97), which code for the non-linear KdV hierarchy, are actually linear in $Z$. The non-linearity comes from the explicit form of $Z$ as an exponential, as in eq. (11.88).
11.10 Whitham's equations
We study eqs. (11.90, 11.97) in the case of finite-zone solutions. They acquire a beautiful geometrical meaning when we consider their average over the Liouville torus in the spirit of the Whitham theory (see Chapter 10). Specifically, $C$ reflects the Riemann bilinear identities and $Q$ gives rise to the Whitham equations directly. For any local quantity $O(u, u', \ldots)$, the Whitham average is defined by
$$\langle O \rangle = \lim_{\ell \to \infty} \frac{1}{2\ell} \int_{-\ell}^{\ell} O(u(x, t), \ldots)\, dx = \int_0^{2\pi} \frac{d\theta_1}{2\pi} \cdots \int_0^{2\pi} \frac{d\theta_g}{2\pi}\, O(u(\theta, \{\lambda_i\}), \ldots)$$
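where, in the second expression, we have replaced the average over $x$ by an average over the Liouville torus, which is justified for almost all trajectories. Let us remark that if the expression $O$ is the $x$-derivative of a bounded function, its Whitham average vanishes. However, we will show below that there exist more subtle vanishing conditions. We consider finite-zone solutions constructed with the hyperelliptic curve eq. (11.49) and the dynamical divisor $D(x) = \{\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})\}$, defined in eq. (11.55). With these data, we can write:
$$\langle O \rangle = N^{-1} \oint_{a_1} d\lambda_{\gamma_1} \cdots \oint_{a_g} d\lambda_{\gamma_g}\; O(u(\{\lambda_{\gamma_i}\}, \{\lambda_i\}), \ldots) \left|\frac{\partial \theta}{\partial \gamma}\right|$$
where the normalization factor $N$ is defined so that $\langle 1 \rangle = 1$. The Jacobian determinant $|\partial\theta / \partial\gamma|$ is easily computed.
Proposition. We have
$$N^{-1} \left|\frac{\partial \theta}{\partial \gamma}\right| = \Delta^{-1}\, \frac{\prod_{i<j} (\lambda_{\gamma_i} - \lambda_{\gamma_j})}{\prod_i \sqrt{R(\lambda_{\gamma_i})}} \qquad \text{where} \qquad \Delta = \det \left(\oint_{a_i} \frac{\lambda^{j-1}\, d\lambda}{\sqrt{R(\lambda)}}\right)_{i,j = 1, \ldots, g} \tag{11.100}$$
Proof. The angles on the torus are given by $\theta_k = \sum_i \int^{\gamma_i} \omega_k$, where $\omega_k$ are the normalized Abelian differentials. So
$$d\theta_k = \sum_i \omega_k(\gamma_i)$$
But $\omega_k = \sum_{j=1}^{g} c_{kj}\, \frac{\lambda^{j-1}}{\sqrt{R(\lambda)}}\, d\lambda$, where the coefficients $c_{kj}$ are determined by normalizing the $a$-periods. Hence
$$\left|\frac{\partial \theta}{\partial \gamma}\right| = \det c\; \det \left(\frac{\lambda_{\gamma_i}^{j-1}}{\sqrt{R(\lambda_{\gamma_i})}}\right) = \det c\; \frac{\prod_{i<j} (\lambda_{\gamma_i} - \lambda_{\gamma_j})}{\prod_i \sqrt{R(\lambda_{\gamma_i})}}$$
The factor $\det c$ cancels out in the normalization factor, which is found by imposing $\langle 1 \rangle = 1$, giving $N = \det c\; \Delta$.
By eqs. (11.84, 11.64), with every local field $O$ we can associate a symmetric function of the points $\gamma_i$, $L_O(\lambda_{\gamma_1}, \ldots, \lambda_{\gamma_g}) = E_{O,0}(S_2, S_4, \ldots)$. This function depends on the moduli $\lambda_i$ and the $\lambda_{\gamma_i}$. In the Whitham average,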
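all terms with $\nu > 0$ in eq. (11.84) vanish because they are exact derivatives. We have
$$\langle O \rangle = \Delta^{-1} \oint_{a_1} \frac{d\lambda_{\gamma_1}}{\sqrt{R(\lambda_{\gamma_1})}} \cdots \oint_{a_g} \frac{d\lambda_{\gamma_g}}{\sqrt{R(\lambda_{\gamma_g})}}\; L_O(\lambda_{\gamma_1}, \ldots, \lambda_{\gamma_g}) \prod_{i<j} (\lambda_{\gamma_i} - \lambda_{\gamma_j}) \tag{11.101}$$
We can expand the symmetric function $L_O(\lambda_{\gamma_1}, \ldots, \lambda_{\gamma_g})$ on the Schur polynomials introduced in eq. (9.62) in Chapter 9:
$$L_O = \sum_Y L_O^Y\, S_Y$$
Using the determinant formula for $S_Y$ associated with the Young diagram $Y = [n_i]$, we see that averaging reduces to computing
$$\langle O \rangle = \Delta^{-1} \sum_Y L_O^Y\, \det \left(\oint_{a_j} \frac{\lambda^{n_i + j - i}\, d\lambda}{\sqrt{R(\lambda)}}\right)$$
These formulae are particularly useful for describing the Whitham average because the variables $\lambda_{\gamma_i}$ are separated and we have only one-dimensional integrals to compute. By eq. (11.64), the values of the coefficients $J_{2n}$ defined in eq. (11.86) are:
$$J_{2n} = \sum_{i=1}^{g} \lambda_{\gamma_i}^n - \frac{1}{2} \sum_{i=1}^{2g+1} \lambda_i^n$$
Using the boson–fermion correspondence of Chapter 9, we can write $Z$ in eq. (11.88) as:
$$Z|0\rangle = G_\Gamma\; \frac{\prod_{i=1}^{g} \lambda_{\gamma_i}^g}{\prod_{i<j} (\lambda_{\gamma_i} - \lambda_{\gamma_j})}\; \beta^*(\lambda_{\gamma_1}) \beta^*(\lambda_{\gamma_2}) \cdots \beta^*(\lambda_{\gamma_g}) |g\rangle \tag{11.102}$$
where $G_\Gamma$ depends only on the curve $\Gamma$:
$$G_\Gamma = \exp \left(\frac{1}{2} \sum_{n\geq 1} \frac{1}{n} \sum_{i=1}^{2g+1} \lambda_i^n\, H_{-n}\right)$$
In the Whitham theory, we assume that the moduli $\lambda_i$ become functions of the slow modulation times $T_{2j-1} = \epsilon\, t_{2j-1}$:
$$\partial_{t_{2j-1}} O = \partial_{t_{2j-1}} O|_{\lambda_i} + \sum_i \frac{\partial \lambda_i}{\partial t_{2j-1}}\, \partial_{\lambda_i} O$$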
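Upon averaging, the first term drops out and we are left with
$$\langle \partial_{t_{2j-1}} O \rangle = \sum_i \frac{\partial \lambda_i}{\partial t_{2j-1}}\, \partial_{\lambda_i} \langle O \rangle \equiv \epsilon\, \partial_{T_{2j-1}} \langle O \rangle$$
The modulation equations are obtained by keeping the leading terms in $\epsilon$ in eqs. (11.90, 11.97). They become
$$Q_0 \langle Z \rangle |0\rangle = 0, \qquad C_0 \langle Z \rangle |0\rangle = 0 \tag{11.103}$$
with
$$Q_0 = \sum_{j\geq 1} \beta_{-j+\frac{1}{2}}\, \partial_{T_{2j-1}} \tag{11.104}$$
and
$$C_0 = \oint \frac{d\lambda}{2i\pi}\, \beta(\lambda)\, \lambda \frac{d}{d\lambda} \beta(\lambda) \tag{11.105}$$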
The rest of this section is devoted to the analysis of these equations. We need some preparation. On a Riemann surface, there is a natural pairing between meromorphic differentials. If $\Omega_1$ and $\Omega_2$ are two such differentials on $\Gamma$, we define
$$(\Omega_1 \bullet \Omega_2) = \sum_{j=1}^{g} \left(\oint_{a_j} \Omega_1 \oint_{b_j} \Omega_2 - \oint_{a_j} \Omega_2 \oint_{b_j} \Omega_1\right) \tag{11.106}$$
The Riemann bilinear identities express this quantity in terms of residues (see Chapter 15). We also have a pairing between cycles. If $C_1$ and $C_2$ are two cycles, the pairing is simply
$$C_1 \circ C_2 = \sum_{j=1}^{g} (n_{1j} m_{2j} - m_{1j} n_{2j})$$
where $C_i = \sum_{j=1}^{g} (n_{ij} a_j + m_{ij} b_j)$. We can write this intersection number in a way similar to eq. (11.106).
Proposition. Let $\omega_i$ be the normalized holomorphic differentials. Let $\eta_i$, $i = 1, \ldots, g$, be the second kind differentials dual to the $\omega_i$, normalized by
$$(\omega_i \bullet \eta_j) = \delta_{ij}, \qquad (\eta_i \bullet \eta_j) = 0, \qquad (\omega_i \bullet \omega_j) = 0$$
then
$$C_1 \circ C_2 = \sum_{j=1}^{g} \left(\oint_{C_1} \omega_j \oint_{C_2} \eta_j - \oint_{C_2} \omega_j \oint_{C_1} \eta_j\right) \tag{11.107}$$
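Proof. The normalization conditions of $\omega_j$ and $\eta_j$ mean that the matrix $P$, defined by
$$P = \begin{pmatrix} \oint_{a_j} \omega_i & \oint_{b_j} \omega_i \\ \oint_{a_j} \eta_i & \oint_{b_j} \eta_i \end{pmatrix}$$
is a symplectic matrix:
$$P\, J\, {}^t P = J, \qquad J = \begin{pmatrix} 0 & \mathrm{Id} \\ -\mathrm{Id} & 0 \end{pmatrix}$$
Since $J^2 = -\mathrm{Id}$, the right inverse of $P$ is $-J\, {}^t P\, J$. Using the fact that the right inverse and the left inverse are the same, we deduce that ${}^t P\, J\, P = J$. Now let $\langle C_i| = (n_{ij}, m_{ij})$. We can rewrite the intersection number as $C_1 \circ C_2 = \langle C_1| J |C_2\rangle$. Using the relation ${}^t P\, J\, P = J$, this is equal to $C_1 \circ C_2 = \langle C_1|\, {}^t P\, J\, P\, |C_2\rangle$. This is equivalent to eq. (11.107) because $\langle C_1|\, {}^t P$ is the vector of periods of the forms $\omega_j$, $\eta_j$ along the cycle $C_1$, and similarly for $P |C_2\rangle$.
For a hyperelliptic curve, one has the explicit formula:
Proposition. The intersection number is given by
$$C_1 \circ C_2 = \frac{1}{4i\pi} \oint_{C_1} \frac{d\lambda_1}{\sqrt{R(\lambda_1)}} \oint_{C_2} \frac{d\lambda_2}{\sqrt{R(\lambda_2)}}\; C(\lambda_1, \lambda_2) \tag{11.108}$$
where the antisymmetric polynomial $C(\lambda_1, \lambda_2)$ is defined by
$$C(\lambda_1, \lambda_2) = \sqrt{R(\lambda_1)}\, \frac{\partial}{\partial \lambda_1} \left(\frac{\sqrt{R(\lambda_1)}}{\lambda_1 - \lambda_2}\right) - (\lambda_1 \leftrightarrow \lambda_2) \tag{11.109}$$
Proof. The first term in the expression of $C_1 \circ C_2$ reads
$$\frac{1}{4i\pi} \oint_{C_2} \frac{d\lambda_2}{\sqrt{R(\lambda_2)}} \oint_{C_1} d\lambda_1\, \frac{d}{d\lambda_1} \left(\frac{\sqrt{R(\lambda_1)}}{\lambda_1 - \lambda_2}\right)$$
The integral over $\lambda_1$ can be performed and gets contributions only at intersection points of $C_1$ and $C_2$. Let the curves have a positive intersection at $\lambda = \lambda_0$. We get a contribution
$$\sqrt{R(\lambda_0)} \left(-\frac{1}{\lambda_0 + i\epsilon - \lambda_2} + \frac{1}{\lambda_0 - i\epsilon - \lambda_2}\right) = 2i\pi\, \sqrt{R(\lambda_0)}\, \delta(\lambda_0 - \lambda_2)$$
The integral over $\lambda_2$ now gives 1/2. The second term is treated similarly and also gives 1/2.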
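The linear-algebra step used above for eq. (11.107), namely that P J tP = J implies tP J P = J and that the intersection pairing is unchanged when J is replaced by tP J P, can be illustrated with exact integer arithmetic. In the Python sketch below the matrix P is an arbitrary integer symplectic matrix for g = 2, built as a product of two elementary symplectic factors; it is not an actual period matrix, and all names and values are illustrative.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# J = [[0, Id], [-Id, 0]] for g = 2.
J = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]

# Elementary symplectic factors [[Id, S], [0, Id]] and [[Id, 0], [T, Id]],
# with S and T symmetric.
U = [[1, 0, 2, 1],
     [0, 1, 1, 3],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
L = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [1, -1, 1, 0],
     [-1, 4, 0, 1]]
P = matmul(U, L)

right = matmul(matmul(P, J), transpose(P))   # P J tP
left = matmul(matmul(transpose(P), J), P)    # tP J P

def pairing(c1, c2, M):
    """<c1| M |c2> for integer vectors c = (n_1, n_2, m_1, m_2)."""
    return sum(c1[a] * M[a][b] * c2[b] for a in range(4) for b in range(4))

C1 = [1, 0, 2, -1]
C2 = [0, 3, 1, 1]
# Intersection number sum_j (n_1j m_2j - m_1j n_2j).
direct = sum(C1[j] * C2[2 + j] - C1[2 + j] * C2[j] for j in range(2))
```

Here `direct`, `pairing(C1, C2, J)` and `pairing(C1, C2, left)` all evaluate to 4.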
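It is important to realize that the average eq. (11.101) can vanish for particular antisymmetric polynomials:
$$M_O(\lambda_{\gamma_1}, \ldots, \lambda_{\gamma_g}) \equiv \prod_{i<j} (\lambda_{\gamma_i} - \lambda_{\gamma_j})\, L_O(\lambda_{\gamma_1}, \ldots, \lambda_{\gamma_g})$$
There are two general reasons why such an integral can vanish:
• The first one is when $M_O$ is "an exact form"
$$M_O(\lambda_{\gamma_1}, \ldots, \lambda_{\gamma_g}) = \sum_i (-1)^i\, M(\lambda_{\gamma_1}, \ldots, \widehat{\lambda_{\gamma_i}}, \ldots, \lambda_{\gamma_g})\, P(\lambda_{\gamma_i})$$
where $P(\lambda)$ is a polynomial such that $P(\lambda)/\sqrt{R(\lambda)}$ has vanishing $a$-periods. In particular, if $\deg P \geq 2g$, we can write
$$\frac{P}{\sqrt{R}} = \frac{S}{\sqrt{R}} + \frac{d}{d\lambda} \left(Q \sqrt{R}\right)$$
with $\deg S \leq 2g - 1$ and $\deg Q = \deg P - 2g$. The exact derivative term has vanishing periods.
• The second one, less trivial, is when
$$M_O(\lambda_{\gamma_1}, \ldots, \lambda_{\gamma_g}) = \sum_{i,j} (-1)^{i+j}\, M(\lambda_{\gamma_1}, \ldots, \widehat{\lambda_{\gamma_i}}, \ldots, \widehat{\lambda_{\gamma_j}}, \ldots, \lambda_{\gamma_g})\, C(\lambda_{\gamma_i}, \lambda_{\gamma_j}) \tag{11.110}$$
since we are integrating over non-intersecting $a$-cycles. This second condition, which is a direct consequence of Riemann bilinear identities, ensures that the second of eqs. (11.103) is automatically satisfied.
Proposition. The equation
$$C_0 \langle Z \rangle |0\rangle = 0 \tag{11.111}$$
follows from eq. (11.110).
Proof. We have to evaluate
$$C_0 \langle Z \rangle |0\rangle = \Delta^{-1} \prod_{i=1}^{g} \oint_{a_i} \frac{\lambda_{\gamma_i}^g\, d\lambda_{\gamma_i}}{\sqrt{R(\lambda_{\gamma_i})}}\; C_0\, G_\Gamma\, \beta^*(\lambda_{\gamma_1}) \cdots \beta^*(\lambda_{\gamma_g}) |g\rangle$$
The commutation relation
$$\beta(\lambda)\, G_\Gamma = \lambda^{-g-\frac{1}{2}} \sqrt{R(\lambda)}\; G_\Gamma\, \beta(\lambda)$$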
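implies
$$C_0\, G_\Gamma = G_\Gamma \oint \frac{d\lambda}{2i\pi}\, \lambda^{-2g-1} R(\lambda)\, \beta(\lambda)\, \lambda \frac{d}{d\lambda} \beta(\lambda)$$
So, we have to compute
$$\oint \frac{d\lambda}{2i\pi}\, \lambda^{-2g-1} R(\lambda)\, \beta(\lambda)\, \lambda \frac{d}{d\lambda} \beta(\lambda)\; \beta^*(\lambda_{\gamma_1}) \cdots \beta^*(\lambda_{\gamma_g}) |g\rangle$$
This is done using Wick's theorem. Since we are using the charged vacuum, the contraction is
$$\langle g| \beta(z) \beta^*(w) |g\rangle = \frac{z^g w^{-g}}{z - w}, \qquad |z| > |w|$$
We have three terms corresponding to zero, one and two contractions respectively. The term with zero contraction reads
$$E_0 = \oint \frac{d\lambda}{2i\pi}\, \lambda^{-2g} R(\lambda)\; :\!\beta(\lambda) \frac{d}{d\lambda} \beta(\lambda)\, \beta^*(\lambda_{\gamma_1}) \cdots \beta^*(\lambda_{\gamma_g})\!: |g\rangle$$
This term vanishes because $\beta(\lambda) \frac{d}{d\lambda} \beta(\lambda) |g\rangle = \lambda^{2g} |g + 2\rangle + O(\lambda^{2g+1})$. The term with one contraction can be written as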
E1 = 2
g i=1
dλ λ−g d γi (−1) R(λ) R(λ)λn 2iπ λ − λγi dλ i
n≥0
∗ (λ ) · · · β ∗ (λ ) : |g : β−g−n− 1 β ∗ (λγ1 ) · · · β γi γg 2
When we perform the λ integral, we get a contribution at the pole λγi which produces an exact form and therefore does not contribute in the Whitham average. The point λ = 0 does not contribute because the integrand is regular there. The term with two contractions reads E2 =
ij
(−1)i+j
dλ −g −g λ λ 2iπ γi γj
R(λ) d R(λ) −i↔j λ − λγi dλ λ − λγj
∗ (λ ) · · · β ∗ (λ ) · · · β ∗ (λ ) : |g : β ∗ (λγ1 ) · · · β γi γj γg
where the hat means that the corresponding quantity is omitted. Performing the λ integral, we get an expression of the form eq. (11.110) vanishing under the Whitham average.
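Proposition. The equation
$$Q_0 \langle Z \rangle |0\rangle = 0 \tag{11.112}$$
implies the modulation equations
$$\frac{\partial}{\partial T_{2p-1}}\, \Omega^{(2q-1)} = \frac{\partial}{\partial T_{2q-1}}\, \Omega^{(2p-1)}$$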
where Ω(2p−1) are the normalized second kind differentials with a pole at ∞ such that Ω(2p−1) = d(z 2p−1 + O(z −1 )) at ∞. Proof. To extract this particular modulation equation, one has to extract a particular component in eq. (11.112). Consider the co-vector ∗ ∗ λ2 λ−1/2 dλ 0| : βp− 1β q− 1 1
2
2
d 1 λ 2 β(λ) : dλ
where the factor λ−1/2 dλ = 2dz is introduced to get a 1-form on Γ. ∗ Applying the Wick theorem, with contraction 0|βp− 1 β−j+ 1 |0 = δpj , we 2
have
2
d 1 ∗ ∗ Q λ 2 β(λ)βp− 1β q− 12 0 2 dλ 1 d 1 1 d 1 ∗ ∗ = 0|λ 2 λ 2 β(λ)βp− λ 2 β(λ)βq− 1 ∂T2q−1 − 0|λ 2 1 ∂T2p−1 2 2 dλ dλ On the other hand, we easily get 1
0|λ 2
1 d 1 1 ∗ ∗ 0|β ∗ (λ)βp− ) 0|β ∗ (λ)βp− 1 − 0|λ 2 λ 2 β(λ)βp− 1 C0 = (p − 1 2 2 2 2 dλ hence, having in mind eq. (11.111), we can write
d 1 ∗ ∗ Q λ 2 β(λ)βp− 1β q− 12 0 2 dλ 1 1 = (p − ) 0|β ∗ (λ)βp− 1 ∂T2q−1 − (q − ) 0|β ∗ (λ)βq− 1 ∂T2p−1 2 2 2 2 This yields the equation 1
0|λ 2
(2q − 1)∂T2p−1 0| : βq− 1 β ∗ (λ) : Z
|0 2
= (2p − 1)∂T2q−1 0| : βp− 1 β ∗ (λ) : Z
|0 2
It remains to evaluate 0|β ∗ (λ)βp− 1 Z
|0 2 g λgγ dλγi −1 i =∆ 0|β ∗ (λ)βp− 1 GΓ β ∗ (λγ1 ) · · · β ∗ (λγg )|g 2 R(λ ) γi i=1 ai
Pushing $G_\Gamma$ to the left and writing $\beta_{p-\frac{1}{2}} = \oint \frac{d\lambda_1}{2i\pi}\, \lambda_1^{p-1} \beta(\lambda_1)$, we arrive at
1
λg+ 2 0|β (λ)βp− 1 GΓ β (λγ1 ) · · · β (λγg )|g = 2 R(λ) 3 dλ1 p−g− 2 λ R(λ1 ) 0|β ∗ (λ)β(λ1 )β ∗ (λγ1 ) · · · β ∗ (λγg )β−g+ 1 · · · β− 1 |0 2 2 2iπ 1 ∗
∗
∗
The last vacuum expectation value is just the determinant of a (g + 1) × (g + 1) matrix 1 λ−j+1 λ−λ1 det 1 λ−j+1 γi λγ −λ1 i,j=1,...,g
i
The λ1 integral can be evaluated: 1 1
λg+ 2 1 λg+ 2 p−g− 3 dλ1 2 R(λ1 ) = R(λ) λ λ − λ1 + R(λ) |λ1 |>|λ| 2iπ R(λ) where ()+ means the polynomial part at ∞. Putting back the factor λ−1/2 dλ, we finally get: λ−1/2 dλ 0|β ∗ (λ)βp− 1 Z
|0 2
3 g √λ λp−g− 2 R(λ) dλ R(λ) + = ∆−1 det dλ p−g− 32 λgγi γi √ λ γi R(λγi ) ai 2iπ R(λγi )
+
g−j+1 λ √ dλ R(λ) g−j+1 dλγi λγi √ ai 2iπ R(λγi )
Here the indices $i, j = 1, \ldots, g$, so that the matrix inside the determinant is $(g + 1) \times (g + 1)$. Note that this is a second kind differential. At infinity it behaves as $(z^{2p-2} + O(1))\, dz$. Moreover, it is evident that its $a$-periods all vanish. It is equal to $\frac{1}{2p-1}\, \Omega^{(2p-1)}$.
12 The Toda field theories
In this chapter, we study Toda field theories, which are generalizations of the Liouville equation. The equations of motion are both integrable, i.e. they admit a zero curvature representation, and conformally invariant. They allow us to see the interplay between conformal symmetry and the classical Yang–Baxter equation. The sine-Gordon theory is a Toda field theory associated with the affine Lie algebra $\widehat{sl}_2$. In particular, soliton solutions are constructed by the action of very special elements of the dressing group on the vacuum solution. Compared to the KP situation, we have here an example with two singularities, each one accommodating one of the two relativistic light-cone variables $x \pm t$, and their corresponding hierarchy of times. We study soliton and finite-zone solutions of the sine-Gordon equation. In the next chapter we will apply the inverse scattering method to discuss further solutions.
12.1 The Liouville equation
In his studies on surfaces with constant curvature, Liouville introduced the equation
$$(\partial_t^2 - \partial_x^2)\varphi = -4 e^{2\varphi} \tag{12.1}$$
In the light-cone coordinates $x_\pm = x \pm t$, $\partial_{x_\pm} = \frac{1}{2}(\partial_x \pm \partial_t)$, the equation reads $\partial_{x_+} \partial_{x_-} \varphi = e^{2\varphi}$. Remarkably, Liouville was able to give the general solution of this non-linear partial differential equation:
$$e^{2\varphi} = -\frac{\partial F(x_+)\, \partial G(x_-)}{(F(x_+) - G(x_-))^2} \tag{12.2}$$
where $F$ and $G$ are arbitrary functions of the single variables $x_+$ and $x_-$ respectively and $\partial F(x_+) = \partial_{x_+} F(x_+)$, $\partial G(x_-) = \partial_{x_-} G(x_-)$. Such functions are called chiral functions. A very important property of this equation is its invariance under changes of coordinates:
$$x'_+ = f(x_+), \qquad x'_- = g(x_-), \qquad \varphi' = \varphi - \frac{1}{2} \log(\partial f\, \partial g)$$
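Liouville's general solution (12.2) lends itself to a quick numerical check. The Python sketch below picks two sample chiral functions, F(x+) = exp(x+) and G(x-) = -exp(x-), chosen only so that the right-hand side of (12.2) is positive, and compares a finite-difference estimate of the mixed light-cone derivative of phi with exp(2 phi). The choice of F and G, the sample point and the step size are assumptions made for illustration.

```python
import math

def phi(xp, xm):
    """phi from eq. (12.2) with F(x+) = exp(x+), G(x-) = -exp(x-)."""
    F, dF = math.exp(xp), math.exp(xp)
    G, dG = -math.exp(xm), -math.exp(xm)
    return 0.5 * math.log(-dF * dG / (F - G) ** 2)

def mixed_derivative(f, xp, xm, h=1e-3):
    """Central finite-difference estimate of d^2 f / (dx+ dx-)."""
    return (f(xp + h, xm + h) - f(xp + h, xm - h)
            - f(xp - h, xm + h) + f(xp - h, xm - h)) / (4 * h * h)

xp, xm = 0.4, -0.7
lhs = mixed_derivative(phi, xp, xm)   # d+ d- phi
rhs = math.exp(2 * phi(xp, xm))       # exp(2 phi)
```

For this choice phi = (x+ + x-)/2 - log(exp(x+) + exp(x-)), and the two sides agree up to the O(h^2) discretization error.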
We see that the equation for the primed variables is the same as for the unprimed ones. We call this invariance the conformal invariance of the Liouville equation. There is an important connection between the Liouville equation and the Schroedinger equation, eq. (11.5) in Chapter 11. The field $e^{-\varphi}$ satisfies two chiral equations:
\[ (\partial_{x_-}^2 - u(x_-))\,e^{-\varphi} = 0, \qquad u = (\partial_{x_-}\varphi)^2 - \partial_{x_-}^2\varphi, \qquad \partial_{x_+}u = 0 \]
\[ (\partial_{x_+}^2 - \bar u(x_+))\,e^{-\varphi} = 0, \qquad \bar u = (\partial_{x_+}\varphi)^2 - \partial_{x_+}^2\varphi, \qquad \partial_{x_-}\bar u = 0 \]
Indeed, the two Schroedinger equations are obtained readily by computing $\partial_{x_\pm}^2 e^{-\varphi}$, and the chirality of $u$ and $\bar u$ is then proved using the Liouville equation. Conversely, starting from two arbitrary chiral potentials $u(x_-)$ and $\bar u(x_+)$, we construct solutions $\xi_i(x_-)$ and $\bar\xi_i(x_+)$, $i = 1, 2$, of the two Schroedinger equations, normalized such that their Wronskians are:
\[ W(\xi_1, \xi_2) = 1, \qquad W(\bar\xi_1, \bar\xi_2) = -1 \]
where the Wronskian is defined as $W(f, g) = f'g - fg'$. It is easy to check that $\varphi(x_+, x_-)$, given by the formula
\[ e^{-\varphi(x_+,x_-)} = \xi_1(x_-)\bar\xi_1(x_+) + \xi_2(x_-)\bar\xi_2(x_+) \qquad (12.3) \]
satisfies the Liouville equation. One can relate this solution to Liouville's solution eq. (12.2) by writing it as:
\[ e^{-\varphi} = \frac{F - G}{\sqrt{-\partial F\,\partial G}} = \frac{F}{\sqrt{\partial F}}\,\frac{1}{\sqrt{-\partial G}} - \frac{1}{\sqrt{\partial F}}\,\frac{G}{\sqrt{-\partial G}} \]
This leads us to set:
\[ \bar\xi_1 = \frac{F}{\sqrt{\partial F}}, \qquad \bar\xi_2 = -\frac{1}{\sqrt{\partial F}}, \qquad \xi_1 = \frac{1}{\sqrt{-\partial G}}, \qquad \xi_2 = \frac{G}{\sqrt{-\partial G}} \]
The Wronskian conditions $W(\xi_1, \xi_2) = 1$ and $W(\bar\xi_1, \bar\xi_2) = -1$ are automatically satisfied. It follows that $F = -\bar\xi_1/\bar\xi_2$ and $G = \xi_2/\xi_1$. Finally, one can use this identification to compute the potentials $u$ and $\bar u$ in terms of $F$ and $G$:
\[ \bar u = -\frac{1}{2}S(F), \qquad u = -\frac{1}{2}S(G), \qquad S(f) = \frac{f'''}{f'} - \frac{3}{2}\left(\frac{f''}{f'}\right)^2 \]
where $S(f)$ is the so-called Schwarzian derivative of $f$. Note that the two solutions $\xi_1, \xi_2$ are defined up to a linear transformation of determinant 1. This translates into a homographic transformation for $G$ which leaves the potential $u$ invariant, and similarly for $F$ and $\bar u$. When the two homographic transformations of $F$ and $G$ are inverse to each other, the Liouville field $\varphi$ is invariant. Invariance of the Liouville equation under change of coordinates is reflected in covariance properties of the Schroedinger equation. Changing $x = f(x')$, we have:
\[ \xi'(x') = \frac{\xi(x)}{\sqrt{\partial f}}, \qquad u'(x') = u(x)(\partial f)^2 - \frac{1}{2}S(f), \qquad (\partial'^2 - u')\xi' = (\partial f)^{3/2}(\partial^2 - u)\xi \]
These equations exhibit the covariance properties of the objects involved. Specifically, $\xi$ is a differential of weight $-\frac{1}{2}$, $u$ is a Schwarzian connection, and $(\partial^2 - u)\xi$ is a differential of weight $\frac{3}{2}$. In the next section, we generalize this setup to a large class of two-dimensional field theories, called Toda field theories, based on Lie algebras.

12.2 The Toda systems and their zero-curvature representations

Toda field theories are two-dimensional generalizations of the Toda chains studied in Chapter 4. We use the same notations for Lie algebras as in that chapter. Let $\mathcal{G}$ be a simple Lie algebra of rank $r$, and consider a Cartan decomposition:
\[ \mathcal{G} = \mathcal{N}_- \oplus \mathcal{H} \oplus \mathcal{N}_+ \]
Let $\Phi(x, t)$ be the Toda field taking values in the Cartan subalgebra
\[ \Phi(x, t) = \sum_{i=1}^{r} \Phi_i(x, t)\,H_i \]
where $H_i$ form an orthonormal basis of the Cartan algebra. By a straightforward generalization of eq. (4.24), we define the Toda field theories by their equations of motion:
\[ (\partial_t^2 - \partial_x^2)\Phi = -2\sum_{\alpha\ \mathrm{simple}} H_\alpha \exp(2\alpha(\Phi)) \qquad (12.4) \]
To write these equations in components, we introduce $\varphi_i = \Lambda^{(i)}(\Phi)$ for $i = 1, \ldots, r$, where $\Lambda^{(i)}$ are the $r$ fundamental weights of $\mathcal{G}$. Since we have $\Lambda^{(i)}(H_{\alpha_j}) = \frac{(\alpha_i, \alpha_i)}{2}\delta_{ij}$, and $\alpha_i(\Phi) = \sum_j \Lambda^{(j)}(\Phi)\,a_{ji}$, where $a_{ji}$ are the elements of the Cartan matrix, the Toda equations of motion can be written as:
\[ (\partial_t^2 - \partial_x^2)\varphi_i = -\alpha_i^2\, e^{2\sum_j \varphi_j a_{ji}} \qquad (12.5) \]
In particular in the $sl_2$ case, the Toda field has only one component $\varphi_1 = \Lambda(\Phi)$, and setting $\varphi_1 = \frac{1}{2}\varphi$, we recover eq. (12.1). All these equations share with the Liouville equation the important property of being invariant under a change of coordinates:

Proposition. The Toda field equations are invariant under the transformations $x_\pm \to x'_\pm$ with $x'_+ = f(x_+)$, $x'_- = g(x_-)$, and
\[ \Phi'(x_+, x_-) = \Phi(f(x_+), g(x_-)) + \frac{1}{2}H_\rho \ln(\partial f(x_+)\,\partial g(x_-)) \]
Here $H_\rho \in \mathcal{H}$ is the Weyl vector characterized by the property $\alpha_j(H_\rho) = 1$ for all simple roots $\alpha_j$.

Proof. The proof is obvious once we write the field equations in the light-cone coordinates:
\[ \partial_{x_+}\partial_{x_-}\Phi = \frac{1}{2}\sum_{\alpha\ \mathrm{simple}} H_\alpha\, e^{2\alpha(\Phi)} \]
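We now introduce a zero curvature representation for the Toda field equations.

Proposition. Let
\[ A_x = \partial_t\Phi + \sum_{\alpha\ \mathrm{simple}} n_\alpha\, e^{\alpha(\Phi)}(E_\alpha + E_{-\alpha}) \qquad (12.6) \]
\[ A_t = \partial_x\Phi - \sum_{\alpha\ \mathrm{simple}} n_\alpha\, e^{\alpha(\Phi)}(E_\alpha - E_{-\alpha}) \qquad (12.7) \]
Then the Toda field equations eq. (12.4) can be rewritten as the zero curvature equation $\partial_x A_t - \partial_t A_x - [A_x, A_t] = 0$. The constants $n_\alpha$ are such that $n_\alpha^2(E_\alpha, E_{-\alpha}) = 1$.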
Proof. The proof is exactly the same as for the open Toda chain, and we do not repeat it.

It will often be convenient to work with the light-cone coordinates. In these coordinates, $A_{x_\pm} = \frac{1}{2}(A_x \pm A_t)$. Define the elements of $\mathcal{G}$:
\[ E_\pm = \sum_{\alpha_j\ \mathrm{simple}} n_{\alpha_j} E_{\pm\alpha_j} \]
then
\[ A_{x_+} = \partial_{x_+}\Phi + e^{-\mathrm{ad}\,\Phi}E_-, \qquad A_{x_-} = -\partial_{x_-}\Phi + e^{\mathrm{ad}\,\Phi}E_+ \qquad (12.8) \]
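Example. Let us give the example associated with the Lie algebra $\mathcal{G} = sl_2$. Let $E_+$, $E_-$, $H$ be its generators with commutation relations
\[ [H, E_\pm] = \pm 2E_\pm, \qquad [E_+, E_-] = H \]
We have $\Phi = \frac{1}{2}\varphi H$, and the Lax connection reads, in the fundamental representation:
\[ A_{x_+} = \begin{pmatrix} \frac{1}{2}\partial_{x_+}\varphi & 0 \\ e^{\varphi} & -\frac{1}{2}\partial_{x_+}\varphi \end{pmatrix} \quad\text{and}\quad A_{x_-} = \begin{pmatrix} -\frac{1}{2}\partial_{x_-}\varphi & e^{\varphi} \\ 0 & \frac{1}{2}\partial_{x_-}\varphi \end{pmatrix} \]
The zero curvature condition $\partial_{x_+}A_{x_-} - \partial_{x_-}A_{x_+} - [A_{x_+}, A_{x_-}] = 0$ is equivalent to the Liouville equation (12.1).

12.3 Solution of the Toda field equations

The Toda field equations being conformally invariant, one can solve them by splitting the chiralities, as in the Liouville case. To do that, we use the zero-curvature representation. When $\Phi$ is a solution of the Toda field equations, we have $F_{x_+x_-} = 0$ (and conversely), so we can solve the linear system $(\partial_{x_\pm} - A_{x_\pm})\Psi = 0$. Equivalently, we can write $A_{x_\pm}$ as a pure gauge
\[ A_{x_\pm} = \partial_{x_\pm}\Psi\cdot\Psi^{-1} \qquad (12.9) \]
We denote by $\mathcal{B}_\pm = \mathcal{H} \oplus \mathcal{N}_\pm$ the two Borel subalgebras of $\mathcal{G}$.

Proposition. Let $Q_\pm \in \exp\mathcal{B}_\pm$ be defined by two decompositions of $\Psi$ as:
\[ \Psi = e^{\pm\Phi}N_\mp Q_\pm \quad\text{with}\quad N_\mp \in \exp\mathcal{N}_\mp, \quad Q_\pm \in \exp\mathcal{B}_\pm \qquad (12.10) \]
then $Q_\pm$ satisfy the following equations:
\[ \partial_{x_-}Q_-\,Q_-^{-1} = 0, \qquad \partial_{x_+}Q_-\,Q_-^{-1} = -\bar P + E_- \qquad (12.11) \]
\[ \partial_{x_+}Q_+\,Q_+^{-1} = 0, \qquad \partial_{x_-}Q_+\cdot Q_+^{-1} = P + E_+ \qquad (12.12) \]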
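where $P(x_-)$ and $\bar P(x_+)$ are chiral fields with values in the Cartan subalgebra.

Proof. We write $\Psi$ in two different ways: $\Psi = e^{-\Phi}G_1 = e^{\Phi}G_2$. Plugging this into eq. (12.9) we get:
\[ \partial_{x_-}G_1\cdot G_1^{-1} = e^{2\,\mathrm{ad}\,\Phi}E_+, \qquad \partial_{x_-}G_2\cdot G_2^{-1} = -2\partial_{x_-}\Phi + E_+ \qquad (12.13) \]
\[ \partial_{x_+}G_1\cdot G_1^{-1} = 2\partial_{x_+}\Phi + E_-, \qquad \partial_{x_+}G_2\cdot G_2^{-1} = e^{-2\,\mathrm{ad}\,\Phi}E_- \qquad (12.14) \]
Let us prove eqs. (12.11). Using the Gauss decomposition $G_1 = N_+Q_-$ with $N_+ \in \exp\mathcal{N}_+$ and $Q_- \in \exp\mathcal{B}_-$, we obtain:
\[ N_+(\partial_{x_-}Q_-\,Q_-^{-1})N_+^{-1} + \partial_{x_-}N_+\,N_+^{-1} = e^{2\,\mathrm{ad}\,\Phi}E_+ \]
or, multiplying on the right by $N_+$ and on the left by $N_+^{-1}$,
\[ \partial_{x_-}Q_-\,Q_-^{-1} = -N_+^{-1}\partial_{x_-}N_+ + N_+^{-1}\,e^{2\,\mathrm{ad}\,\Phi}E_+\,N_+ \]
Since the left-hand side is in $\mathcal{B}_-$ and the right-hand side is in $\mathcal{N}_+$, they both vanish, so that $\partial_{x_-}Q_-\,Q_-^{-1} = 0$ and $\partial_{x_-}N_+\,N_+^{-1} = e^{2\,\mathrm{ad}\,\Phi}E_+$. This proves that $Q_-$ only depends on $x_+$. Next, using eq. (12.14) and again the decomposition $G_1 = N_+Q_-$, we obtain:
\[ N_+(\partial_{x_+}Q_-\,Q_-^{-1})N_+^{-1} + \partial_{x_+}N_+\,N_+^{-1} = 2\partial_{x_+}\Phi + E_- \qquad (12.15) \]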
The right-hand side has lowest height $-1$ given by the $E_-$ term. So the lowest height term in $\partial_{x_+}Q_-\,Q_-^{-1}$ is also equal to $E_-$. Since $\partial_{x_+}Q_-\,Q_-^{-1} \in \mathcal{B}_-$, it is necessarily of the form $-\bar P + E_-$, with $\bar P \in \mathcal{H}$ only depending on $x_+$. The equations for $Q_+$ are proved similarly.

One can reconstruct the Toda field, $\Phi$, from the knowledge of $Q_\pm$. Let $|\Lambda^{(i)}\rangle$, $i = 1, \ldots, r = \mathrm{rank}\,\mathcal{G}$ be highest weight vectors for the fundamental representations of $\mathcal{G}$. Recall the main properties of $|\Lambda^{(i)}\rangle$:
\[ H|\Lambda^{(i)}\rangle = \Lambda^{(i)}(H)|\Lambda^{(i)}\rangle, \qquad E_\alpha|\Lambda^{(i)}\rangle = 0 \ \text{for}\ \alpha > 0 \]
We denote by $\langle\Lambda^{(i)}|$ the conjugate highest weight which satisfies (see Chapter 16):
\[ \langle\Lambda^{(i)}|H = \Lambda^{(i)}(H)\langle\Lambda^{(i)}|, \qquad \langle\Lambda^{(i)}|E_{-\alpha} = 0 \ \text{for}\ \alpha > 0 \]
We can compute any scalar product of the form $\langle\Lambda^{(i)}|X|\Lambda^{(i)}\rangle$, where $X$ is any element of the universal enveloping algebra, by pushing all the $E_\alpha$ to the right and all the $E_{-\alpha}$ to the left using the commutation relations, and the fact that $|\Lambda^{(i)}\rangle$ is a common eigenvector of all the $H$. Finally, we normalize the scalar product by $\langle\Lambda^{(i)}|\Lambda^{(i)}\rangle = 1$. With these definitions at hand, we have:

Proposition. For any fundamental representation with highest weight $\Lambda^{(i)}$, define:
\[ \xi^{(i)} = \langle\Lambda^{(i)}|\,e^{-\Phi}\Psi = \langle\Lambda^{(i)}|\,Q_+, \qquad \bar\xi^{(i)} = \Psi^{-1}e^{-\Phi}|\Lambda^{(i)}\rangle = Q_-^{-1}|\Lambda^{(i)}\rangle \qquad (12.16) \]
The vectors $\xi^{(i)}$ and $\bar\xi^{(i)}$ are chiral: $\partial_{x_+}\xi^{(i)} = 0$ and $\partial_{x_-}\bar\xi^{(i)} = 0$. The Toda field $\Phi$ can be reconstructed by the formula:
\[ e^{-2\Lambda^{(i)}(\Phi)} = \xi^{(i)}\cdot\bar\xi^{(i)} = \langle\Lambda^{(i)}|\,Q_+Q_-^{-1}\,|\Lambda^{(i)}\rangle \qquad (12.17) \]
Proof. First $e^{-\Phi}\Psi = N_-Q_+$ and $\langle\Lambda^{(i)}|N_- = \langle\Lambda^{(i)}|$ by the highest weight condition. So $\langle\Lambda^{(i)}|\,e^{-\Phi}\Psi = \langle\Lambda^{(i)}|\,Q_+$ depends only on $x_-$, since $\partial_{x_+}Q_+ = 0$ by eq. (12.12). Similarly $\Psi^{-1}e^{-\Phi}|\Lambda^{(i)}\rangle = Q_-^{-1}|\Lambda^{(i)}\rangle$ depends only on $x_+$. By the definition of $\xi^{(i)}$ and $\bar\xi^{(i)}$, we see that $\Psi$ and $\Psi^{-1}$ cancel in the scalar product $\xi^{(i)}\cdot\bar\xi^{(i)}$, leaving $\langle\Lambda^{(i)}|\,e^{-2\Phi}\,|\Lambda^{(i)}\rangle = e^{-2\Lambda^{(i)}(\Phi)}$. The knowledge of these quantities for $i = 1, \ldots, r$ completely characterizes $\Phi$.

Equations (12.11) and (12.12) determine $Q_\pm$ in terms of the two chiral fields $P$ and $\bar P$ with values in $\mathcal{H}$. For example, consider the equation $\partial_{x_-}Q_+ = (P + E_+)Q_+$ in the Liouville case, $\mathcal{G} = sl_2$. We have:
\[ Q_+ = \begin{pmatrix} q_{11} & q_{12} \\ 0 & q_{22} \end{pmatrix}, \qquad (P + E_+) = \begin{pmatrix} p & 1 \\ 0 & -p \end{pmatrix} \]
The vector $\xi$ reads $\xi = (1, 0)\,Q_+ = (q_{11}, q_{12})$ and the first order differential equation for $Q_+$ yields:
\[ (\partial_{x_-}^2 - u)\,\xi = 0, \qquad u = p' + p^2 \]
The relation between $u$ and $p$ is called the Miura transformation. The first order linear system for $Q_+$ is a matrix version of the Schroedinger equation, which is recovered as an equation for the first row of the matrix $Q_+$.
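The passage from the first order system to the Schroedinger equation can be illustrated on an explicit case. The sketch below assumes the arbitrary choice $p(x) = x$ (not taken from the text), for which the triangular system $q_{11}' = p\,q_{11}$, $q_{22}' = -p\,q_{22}$, $q_{12}' = p\,q_{12} + q_{22}$ has closed-form solutions, and checks by finite differences that both entries of the first row satisfy $(\partial^2 - u)\xi = 0$ with $u = p' + p^2$. All function names are hypothetical.

```python
import math

def p(x):
    """Illustrative chiral field (an assumption of this sketch)."""
    return x

def u(x, h=1e-4):
    """Miura transformation u = p' + p^2, with p' from a central difference."""
    return (p(x + h) - p(x - h)) / (2.0 * h) + p(x) ** 2

def q11(x):
    """Solves q11' = p q11 with q11(0) = 1."""
    return math.exp(x * x / 2.0)

def q22(x):
    """Solves q22' = -p q22 with q22(0) = 1."""
    return math.exp(-x * x / 2.0)

def q12(x):
    """Solves q12' = p q12 + q22 with q12(0) = 0."""
    return math.exp(x * x / 2.0) * 0.5 * math.sqrt(math.pi) * math.erf(x)

def schroedinger_residual(q, x, h=1e-3):
    """Finite-difference value of q'' - u q, which should be close to zero."""
    second = (q(x + h) - 2.0 * q(x) + q(x - h)) / (h * h)
    return second - u(x) * q(x)

# Both components of the first row xi = (q11, q12) solve the same equation.
print(schroedinger_residual(q11, 0.7), schroedinger_residual(q12, 0.7))
```

The entry $q_{22}$ does not enter the final second order equation; it only mediates between the two first order equations for the first row, which is why the potential depends on $p$ alone.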
The chiral fields $P$ and $\bar P$ are the two arbitrary functions parametrizing the general solution of Toda field equations. Note that the splitting of chiralities in Toda field theories brings us back to the Drinfeld–Sokolov linear systems of Chapter 10.

We describe more explicitly the case of $sl(n+1)$. The $n$ fundamental representations are the vector representation and its wedge products. The vector representation acts on $\mathbb{C}^{n+1}$ with basis $|j\rangle$, $j = 1, \ldots, (n+1)$. The highest weight vector is $|1\rangle$. The elements of the Cartan algebra are traceless diagonal matrices, and $E_\pm = \sum_i E_{i,i\pm 1}$, where $E_{ij} = |i\rangle\langle j|$ are the canonical matrices acting on $\mathbb{C}^{n+1}$. In the vector representation the chiral fields are $\xi(x) = \langle 1|\,Q_+$ and $\bar\xi(x) = Q_-^{-1}|1\rangle$. Let us decompose the fields $P$ and $\bar P$ as $P = \sum_j P_j E_{jj}$ and $\bar P = \sum_j \bar P_j E_{jj}$, with $\sum_j P_j = \sum_j \bar P_j = 0$. Denote by $\xi_j$ and $\bar\xi_j$ the components of the chiral fields: $\xi(x) = \sum_j \xi_j\langle j|$, and $\bar\xi(x) = \sum_j \bar\xi_j|j\rangle$. The functions $\xi_j$ and $\bar\xi_j$ satisfy the differential equations of order $(n+1)$:
\[ (\partial_{x_+} - \bar P_{n+1})\cdots(\partial_{x_+} - \bar P_1)\,\bar\xi_j = 0 \qquad (12.18) \]
\[ (\partial_{x_-} - P_{n+1})\cdots(\partial_{x_-} - P_1)\,\xi_j = 0 \qquad (12.19) \]
Indeed, the first row of $Q_+$ is $(\xi_1, \ldots, \xi_{n+1})$ and the explicit form of $E_+$ immediately yields eqs. (12.19).

Proposition. The components of the Toda field along the fundamental weights $\Lambda^{(p)}$ are given by:
\[ e^{-2\Lambda^{(p)}(\Phi)} = \det\left(\partial_{x_-}^{i}\xi\cdot\partial_{x_+}^{j}\bar\xi\right), \qquad i, j = 0, \ldots, p-1 \qquad (12.20) \]
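Proof. The highest weight vector of the representation $\Lambda^{(p)}$ is the wedge product $|\Lambda^{(p)}\rangle = |1\rangle\wedge\cdots\wedge|p\rangle$, and we have $E_-^{p-1}|1\rangle = |p\rangle$ for $p = 1, \ldots, n$. Thus,
\[ |\Lambda^{(p)}\rangle = |1\rangle\wedge E_-|1\rangle\wedge\cdots\wedge E_-^{p-1}|1\rangle \]
Acting with $Q_-^{-1}$ we obtain an expression for the chiral field $\bar\xi^{(p)}$:
\[ \bar\xi^{(p)} = Q_-^{-1}|\Lambda^{(p)}\rangle = Q_-^{-1}|1\rangle\wedge Q_-^{-1}E_-|1\rangle\wedge\cdots\wedge Q_-^{-1}E_-^{p-1}|1\rangle \qquad (12.21) \]
We now use the equation of motion (12.11) to express $Q_-^{-1}E_-^{j}|1\rangle$ in terms of the derivatives $\partial_{x_+}^{j}\bar\xi$. Using eq. (12.11) and differentiating it, we get:
\[ Q_-^{-1}E_-|1\rangle = -\partial_{x_+}Q_-^{-1}|1\rangle + Q_-^{-1}\bar P|1\rangle \]
\[ Q_-^{-1}E_-^2|1\rangle = \partial_{x_+}^2Q_-^{-1}|1\rangle + Q_-^{-1}\left(\bar PE_- + E_-\bar P - \bar P^2 - \partial_{x_+}\bar P\right)|1\rangle \]
\[ Q_-^{-1}E_-^{j}|1\rangle = (-1)^j\,\partial_{x_+}^{j}Q_-^{-1}|1\rangle + \cdots \]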
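The extra terms cancel in the wedge product (12.21). Therefore,
\[ \bar\xi^{(p)} = (-1)^{\frac{p(p-1)}{2}}\, Q_-^{-1}|1\rangle\wedge(\partial_{x_+}Q_-^{-1})|1\rangle\wedge\cdots\wedge(\partial_{x_+}^{p-1}Q_-^{-1})|1\rangle \]
\[ \phantom{\bar\xi^{(p)}} = (-1)^{\frac{p(p-1)}{2}}\, \bar\xi\wedge\partial_{x_+}\bar\xi\wedge\cdots\wedge\partial_{x_+}^{p-1}\bar\xi \]
\[ \phantom{\bar\xi^{(p)}} = (-1)^{\frac{p(p-1)}{2}} \sum_{j_1 < \cdots} |j_1\rangle\wedge\cdots\wedge|j_p\rangle\, \det\begin{pmatrix} \bar\xi_{j_1} & \partial_{x_+}\bar\xi_{j_1} & \cdots \\ \vdots & & \end{pmatrix} \]
[...]

[...] for $x > y$ we can write $\Psi(x) = \Psi(x, y)\Psi(y)$, where $\Psi(x, y)$ is the transport matrix from $y$ to $x$. Using the ultralocality property of the Poisson bracket, we have $\{\Psi_1(x), \Psi_2(y)\} = \Psi_1(x, y)\{\Psi_1(y), \Psi_2(y)\}$. Equation (12.30) then yields:
\[ \{\Psi_1(x), \Psi_2(y)\} = \Psi_1(x)\Psi_2(y)\left[-r_{12} + \Psi_1^{-1}(y)\Psi_2^{-1}(y)\, r_{12}\, \Psi_1(y)\Psi_2(y)\right] \qquad (12.33) \]
Combining this with eq. (12.32), we finally evaluate the Poisson brackets between components of the chiral field $\xi(x)$ as:
\[ \{\xi_1^{(r)}(x), \xi_2^{(r')}(y)\} = \langle\Lambda^{(r)}|\otimes\langle\Lambda^{(r')}|\, \exp[-\Phi_1(x)]\exp[-\Phi_2(y)] \cdot \left[-\Psi_1(x)\Psi_2(y)\cdot r_{12} + \Psi_1(x)\Psi_1^{-1}(y)\cdot[r_{12} - C_{12}^0]\cdot\Psi_1(y)\Psi_2(y)\right] \]
In this formula, one can take $r_{12} = r_{12}^\pm$. Choosing $r_{12} = r_{12}^+$, we have $1\otimes\langle\Lambda^{(r')}|\,r_{12}^+ = 1\otimes\langle\Lambda^{(r')}|\,C_{12}^0$, and we see that the last term vanishes. This proves the first of eqs. (12.31) for $x > y$. The other cases are proved similarly.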
12.5 Conformal structure

The solutions of the Toda field theory were parametrized by two chiral fields $P$ and $\bar P$. The transformation which relates the original fields to these chiral fields is highly non-local. There exist, however, remarkable quantities which are local in terms of both sets of fields. The purpose of this section is to describe them and to find their Poisson bracket algebra. As we will see, the Virasoro algebra appears as a subalgebra, so that we are in fact dealing with the conformal symmetry algebra of the theory.

We consider only one chirality, and recall the linear systems eq. (12.12) and eq. (12.13):
\[ [\partial_{x_-} - P - E_+]\,Q_+ = 0 \qquad (12.34) \]
\[ [\partial_{x_-} + 2\partial_{x_-}\Phi - E_+]\,N_-Q_+ = 0 \qquad (12.35) \]
These equations have exactly the same structure, but one is expressed in terms of $P$ while the other is expressed in terms of $\partial_{x_-}\Phi$, the derivative of the original field of the theory. The relation between the two equations is a gauge transformation $N_- \in \exp\mathcal{N}_-$. Notice that the fields $\xi^{(r)}(x)$ are invariant under this gauge transformation because $\langle\Lambda^{(r)}|N_- = \langle\Lambda^{(r)}|$. The vector $\xi^{(r)}(x)$ is just the first row of $Q_+$ or $N_-Q_+$. For finite-dimensional highest weight representations of $\mathcal{G}$, one can deduce from the first order linear systems eq. (12.34) and eq. (12.35) a single differential equation of higher order for the components of the vector $\xi^{(r)}(x)$,
\[ L\cdot\xi = 0 \qquad (12.36) \]
Since the components of $\xi^{(r)}(x)$ are invariant under the gauge transformation $N_-$, the coefficients of this equation will also be invariant. Moreover, these coefficients are local differential polynomials in terms of $P$ or $\Phi$. These are the local quantities we were looking for.

In the following, we shall restrict ourselves to the $sl(n+1)$ case in the fundamental representation. In this representation, the vector $\xi(x)$ has $(n+1)$ components, $\xi(x) = (\xi_i(x))$, $i = 1, \ldots, (n+1)$, and $E_+ = \sum_i E_{i,i+1}$, where $E_{ij}$ is the canonical basis of $(n+1)\times(n+1)$ matrices. Finally $P = \sum_i P_iE_{ii}$ with $\sum_i P_i = 0$. Using eq. (12.34) or eq. (12.35) one can write the operator $L$ in eq. (12.36) explicitly, cf. eq. (12.19):
\[ L = \partial_{x_-}^{n+1} - \sum_{i=0}^{n-1} u_i\,\partial_{x_-}^{i} \quad\text{with}\quad L = (\partial_{x_-} - P_{n+1})\cdots(\partial_{x_-} - P_1), \qquad \sum_i P_i = 0 \qquad (12.37) \]
From eq. (12.35) $L$ admits a similar expression but with $P \to -2\partial_{x_-}\Phi$. It is easy to compute $u_{n-1}$:
\[ u_{n-1} = \frac{1}{2}(P, P) + (H_\rho, \partial_{x_-}P) = 2(\partial_{x_-}\Phi, \partial_{x_-}\Phi) - 2(H_\rho, \partial_{x_-}^2\Phi) \qquad (12.38) \]
where $H_\rho$ is the element in $\mathcal{H}$ such that $[H_\rho, E_+] = E_+$, namely in our case $H_\rho = \mathrm{Diag}\left(\frac{n}{2}, \frac{n}{2} - 1, \ldots, -\frac{n}{2}\right)$, yielding
\[ (H_\rho, \partial_{x_-}P) = n\,\partial_{x_-}P_1 + (n-1)\,\partial_{x_-}P_2 + \cdots + \partial_{x_-}P_n \]
Using the Toda equations of motion to eliminate the higher order time derivatives of $\Phi$, we also have $u_{n-1} = \mathcal{H} - \mathcal{P}$, where:
\[ \mathcal{H} = \frac{1}{2}(\Pi, \Pi) + \frac{1}{2}(\Phi_x, \Phi_x) + \sum_{\alpha\ \mathrm{simple}} e^{2\alpha(\Phi)} - (H_\rho, \Phi_{xx}) \]
\[ \mathcal{P} = (\Pi, \Phi_x) - (H_\rho, \Pi_x) \]
These are the energy and momentum densities respectively. The function $u_{n-1}$ in eq. (12.38) is the generalization of the potential $u$ introduced in the Liouville theory. We now compute its Poisson brackets. As before we set $t = 0$ and identify $x_- = x$.

Proposition. Let $C_{12}^0 = \sum_i H_i\otimes H_i$. Then
\[ \{P_1(x), P_2(y)\} = \frac{1}{2}(\partial_x - \partial_y)\,\delta(x - y)\,C_{12}^0 \qquad (12.39) \]
\[ \{u_{n-1}(x), u_{n-1}(y)\} = \left[u_{n-1}\partial_x + \partial_x u_{n-1} - (H_\rho, H_\rho)\,\partial_x^3\right]\delta(x - y) \]
We recognize the standard Virasoro algebra, see Chapter 11. The value of the central charge is $(H_\rho, H_\rho) = n(n+1)(n+2)/12$.

Proof. We have to compute the Poisson brackets of $P$. To do this we start from the exchange algebra, and remark that the scalar product $\langle\xi^{(r)}(x)|\Lambda^{(r)}\rangle$ satisfies:
\[ \left\{\log\langle\xi^{(r)}(x)|\Lambda^{(r)}\rangle, \log\langle\xi^{(r')}(y)|\Lambda^{(r')}\rangle\right\} = -\frac{1}{2}\sum_i \Lambda^{(r)}(H_i)\,\Lambda^{(r')}(H_i)\,\epsilon(x - y) \qquad (12.40) \]
where $\epsilon(x)$ is the sign of $x$. Note also that eq. (12.34) implies that
\[ \langle\xi^{(r)}(x)|\Lambda^{(r)}\rangle = e^{\int^x \Lambda^{(r)}(P)}\,\theta \]
where $\theta$ is some integration constant. Equation (12.39) follows by differentiating eq. (12.40) with respect to $x$ and $y$ and using the independence of the fundamental weights. The Poisson bracket for $u_{n-1}$ then follows readily, either from the expression of $u_{n-1}$ in terms of $P$ or in terms of $\Phi$ and $\Pi$.

We now explore the Poisson algebra of the other invariant coefficients $u_i$ in eq. (12.37). Since the quantity $u_{n-1}$ is the generator of the conformal
symmetry underlying the theory, we call this Poisson bracket algebra an extended conformal algebra. Comparing with Chapter 10, we know that the Poisson brackets of the $u_i$ coincide with the second Hamiltonian structure of the generalized KdV equation. We are going to rederive this result starting from the exchange algebra eq. (12.31). In the course of this computation we will exhibit the extended conformal properties of the fields $\xi_i$.

Any component $\xi_i(x)$, $i = 1, \ldots, n+1$, of the vector $\xi(x)$ obeys the differential equation $L\cdot\xi_i = 0$ with $L$ a differential operator of order $n+1$. Once we know the functions $\xi_i(x)$, the operator $L$ can be reconstructed as:
\[ L\cdot\xi = \det\begin{pmatrix} \xi & \xi_1 & \cdots & \xi_{n+1} \\ \xi' & \xi_1' & \cdots & \xi_{n+1}' \\ \vdots & \vdots & & \vdots \\ \xi^{(n+1)} & \xi_1^{(n+1)} & \cdots & \xi_{n+1}^{(n+1)} \end{pmatrix} = 0 \qquad (12.41) \]
It follows that the coefficients $u_i$ of $L$ are given by Wronskian type expressions of the $\xi_i$, and we can directly calculate their Poisson brackets knowing those of the $\xi_i$. To express the result conveniently we need to recall some notations about pseudo-differential operators, see Chapter 10. We have introduced the derivation symbol $\partial = \partial_x$ (here identified with $\partial_{x_-}$), with the usual Leibniz rule $\partial\cdot a = a\cdot\partial + (\partial a)$, where $(\partial a)$ means $\partial_x a(x)$, and the integration symbol $\partial^{-1}$ with the following computational rules:
\[ \partial^{-1}\partial = \partial\partial^{-1} = 1, \qquad \partial^{-i-1}f = \sum_{v=0}^{\infty} (-1)^v \binom{i+v}{v} (\partial^v f)\,\partial^{-i-1-v} \qquad (12.42) \]
The elements $A = \sum_{i=-\infty}^{N} a_i\partial^i$ form an associative algebra with unit, called the algebra of formal pseudo-differential operators in one variable. It is equipped with a linear form $\langle\ \rangle$ satisfying the fundamental trace property $\langle AB\rangle = \langle BA\rangle$, called the Adler trace, and defined by:
\[ \langle A\rangle = \int dx\, a_{-1} \]
With these notations we have:

Proposition. Let $X = \sum_i \partial^{-i-1}X_i$ and let $f_X(L) = \langle LX\rangle$. Then one has:
\[ \{f_X(L), \xi_i(x)\} = \left[(XL)_+ + \frac{1}{n+1}\,\partial^{-1}\left([X, L]_{-1}\right)\right]\xi_i(x) \qquad (12.43) \]
Proof. Using eq. (12.42) we find:
\[ (XL)_+ = \sum_{k>i\geq 0}\ \sum_{s=0}^{k-i-1} (-1)^{k-i-s}\binom{k-1-s}{i}\,\partial^{k-i-s-1}(X_iu_k)\,\partial^s \]
and
\[ [X, L]_{-1} = -\sum_{k>i\geq 0} (-1)^{k+i}\binom{k}{i}\,\partial^{k-i}(X_iu_k) \]
So we have to prove that:
\[ \{f_X(L), \xi_q(x)\} = -\frac{1}{n+1}\sum_{k>i\geq 0} (-1)^{k+i}\binom{k}{i}\,\partial^{k-i-1}(X_iu_k)\,\xi_q(x) + \sum_{k>i\geq 0}\ \sum_{s=0}^{k-i-1} (-1)^{k+i+s}\binom{k-s-1}{i}\,\partial^{k-i-s-1}(X_iu_k)\,\partial^s\xi_q \qquad (12.44) \]
From eq. (12.41), the coefficients $u_i$ in the operator $L$ are given by $u_i = (-1)^{i+1}\det M_i$, where $M_i$ is the matrix of elements
\[ (M_i)_{ab} = \partial^a\xi_b \quad\text{with}\quad a = 0, 1, \ldots, n+1,\ a \neq i, \qquad b = 1, \ldots, n+1 \]
Let $[\Delta_i(x)]_{kl}$ with $k = 0, \ldots, n+1$, $k \neq i$ and $l = 1, \ldots, n+1$ be the minors of the matrix $M_i$. By using the Leibniz rules for the Poisson brackets, we obtain
\[ \{f_X(L), \xi_q(x)\} = -\int dz\, X_i(z)\,(-1)^i\,[\Delta_i(z)]_{kl}\,\partial_z^k\{\xi_l(z), \xi_q(x)\} \qquad (12.45) \]
where all repeated indices are summed over. The Poisson brackets of the chiral fields were obtained in eq. (12.31):
\[ \{\xi_1(z), \xi_2(x)\} = -\xi_1(z)\xi_2(x)\,r_{12}^- - \theta(z - x)\,\xi_1(z)\xi_2(x)\,(r_{12}^+ - r_{12}^-) \]
We have thus two contributions to eq. (12.45). The contribution of the term independent of $\theta(z - x)$ is
\[ \int dz\, X_i(z)\,(-1)^i\,[\Delta_i(z)]_{kl}\,\partial^k\xi_m(z)\,\xi_n(x)\,r^-_{mn,lq} \]
Remembering that k=i [∆i (z)]kl ∂ k ξm (z) = (det Mi ) δlm , we see that the − = above expression vanishes since the r-matrices are such that: Tr1 r12 − − Tr2 r12 = Tr12 r12 = 0. Calculating the θ(z − x) dependent term, we get k Xi (z)(−1)i [∆i (z)]kl ∂za ξm (z)ξn (x) dz {fX (L), ξq (x)} = a a=k
·(r+ − r− )mn,lq ∂zk−a θ(z − x) The term a = k in the sum may be excluded, again due to the trace ± properties of the matrices r12 . To evaluate this expression further we + − remark that r12 −r12 = C12 , where C12 is the Casimir element. To evaluate C12 for two vector representations of sln+1 we first compute it in gln+1 according to eq. (12.27). Here the Eα are the Eab for a < b, and E−α = E ba , while Hi = Eii , so we get for C12 the permutation operator P12 = Eab ⊗ Eba , or Pij,kl = δil δjk . We restrict this to sln+1 by requiring that 1 1 + P12 . Using this the partial traces of C12 vanish, yielding C12 = − n+1 result, we finally write the Poisson brackets (12.45) as: {fX (A), ξq (x)} =
1 Uq (x) − Vq (x) (n + 1)
where Uq (x) comes from the identity factor in C12 and Vq (x) from the factor P12 . k i ∂za ξl (z)ξq (x)∂zk−a−1 δ(z − x) Uq (x) = dzXi (z)(−1) [∆i (z)]kl a a=k k Vq (x) = ∂za ξq (z)ξl (x)∂zk−a−1 δ(z − x) dzXi (z)(−1)i [∆i (z)]kl a a=k
The Uq (x) term is easy to deal with. Noticing that, for k = i and a ≤ n+1, we have [∆i (z)]kl ∂ a ξl (z) = (det Mi ) δka + (−1)k+i+1 (det Mk ) δia (12.46) l
we get Uq (x) = −
k>i
(−1)k+i
k ∂ k−i−1 (Xi uk )ξq (x) i
After integrating over z the expression of Vq (x) becomes k i+k+a+1 Vq (x) = (−1) ∂ k−a−1 (Xi [∆i ]kl ∂ a ξq )ξl (x) a a=k
(12.47)
The identity
(∂ c A)B
=
c b=0
(−1)b
c ∂ c−b (A∂ b B) allows us to rewrite d
this as k−a−1 k k−a−1 k+a+i+b+1 (−1) Vq (x) = a b a=k
b=0
×∂ k−a−b−1 (Xi [∆i ]kl ∂ a ξq ∂ b ξl ) Suming over l using eq. (12.46) and performing the derivatives, we obtain k−1−s k+i (−1) Vq (x) = − i s≥0 9 : k k − 1 − a ∂ k−i−1−s (Xi uk )∂ s ξq (−1)a × a k−1−s a 0} N + − − = {E (2n+1) = λ2n+1 E+ , E (2n+1) = λ2n+1 E− , H (2n) = λ2n H, n < 0} N + − (12.59) In particular, the simple root vectors can be taken as E±α1 = λ±1 E± and E±α2 = λ±1 E∓ . The commutation relations are: H (r) , H (s) = Kr δr+s,0 (s) (r+s) H (r) , E± = ±2E± K (r) (s) E+ , E− = H (r+s) + rδr+s,0 2 Following the general construction of Toda field theories, we define E± = E±α1 + E±α2 = λ±1 (E+ + E− ) The connection Ax± is given by Ax± = ±∂x± Φ + me∓adΦ E∓
(12.60)
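where the field $\Phi$ takes values in the Cartan subalgebra $\hat{\mathcal{H}}$ of $\widehat{sl}_2$. Let us decompose it on the generators $H$, $d$ and $K$ of $\hat{\mathcal{H}}$:
\[ \Phi = \frac{1}{2}H\varphi + d\,\eta + \frac{1}{4}K\zeta \]
The zero curvature condition, $\partial_{x_+}A_{x_-} - \partial_{x_-}A_{x_+} - [A_{x_+}, A_{x_-}] = 0$, can be worked out using only the Lie algebra structure of $\widehat{sl}_2$. It gives
\[ \partial_{x_+}\partial_{x_-}\varphi = m^2e^{2\eta}\left(e^{2\varphi} - e^{-2\varphi}\right) \qquad (12.61) \]
\[ \partial_{x_+}\partial_{x_-}\eta = 0 \qquad (12.62) \]
\[ \partial_{x_+}\partial_{x_-}\zeta = m^2e^{2\eta}\left(e^{2\varphi} + e^{-2\varphi}\right) \qquad (12.63) \]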
Thanks to the field $\eta$, the above equations are conformally invariant. This is in contrast with the sinh-Gordon equation which is not conformally invariant. In fact, performing a change of coordinates $x'_+ = f(x_+)$ and $x'_- = g(x_-)$, the equations are invariant if we redefine the fields by:
\[ \varphi'(x_+, x_-) = \varphi(f(x_+), g(x_-)) \]
\[ \zeta'(x_+, x_-) = \zeta(f(x_+), g(x_-)) \]
\[ \eta'(x_+, x_-) = \eta(f(x_+), g(x_-)) + \frac{1}{2}\log(\partial f\,\partial g) \]
The factor $\frac{1}{2}$ is fixed by requiring that $e^{2\eta'} = \partial f\,\partial g\; e^{2\eta}$, which compensates the Jacobian produced by $\partial_{x_+}\partial_{x_-}$ in eqs. (12.61) and (12.63). There are two real forms of eqs. (12.61–12.63). One is when $\varphi$, $\zeta$ and $\eta$ are all real, and the other one is when $\varphi$ is pure imaginary and $\zeta$, $\eta$ are real. These two forms correspond to the sinh-Gordon and sine-Gordon case respectively (when one sets $\eta = 0$, which can be done consistently and decouples the equation for $\varphi$).

In a loop representation, $K = 0$, the field $\zeta$ and its equation of motion, eq. (12.63), disappear. Setting $\eta = 0$, we are left with the standard sinh-Gordon equation. Let us write the Lax connection in this case:
\[ A_x = \begin{pmatrix} \frac{1}{2}\partial_t\varphi & m(\lambda e^{\varphi} + \lambda^{-1}e^{-\varphi}) \\ m(\lambda e^{-\varphi} + \lambda^{-1}e^{\varphi}) & -\frac{1}{2}\partial_t\varphi \end{pmatrix} \qquad (12.64) \]
\[ A_t = \begin{pmatrix} \frac{1}{2}\partial_x\varphi & -m(\lambda e^{\varphi} - \lambda^{-1}e^{-\varphi}) \\ -m(\lambda e^{-\varphi} - \lambda^{-1}e^{\varphi}) & -\frac{1}{2}\partial_x\varphi \end{pmatrix} \qquad (12.65) \]
The benefit of having extended the loop algebra to the full affine algebra is that we now have at our disposal highest weight representations, and the general structures of Toda field theories can be applied straightforwardly, provided that the action of "group" elements on the highest weight vector is defined. In the following we shall restrict ourselves to integrable highest weights and work freely with formal expressions. Final formulae will be evaluated explicitly in the level 1 representations of $\widehat{sl}_2$, where they will be seen to make sense.

As usual with Toda field theories, with a highest weight vector $|\Lambda\rangle$ one associates two sets of fields $\xi(x, t)$ and $\bar\xi(x, t)$ defined by:
\[ \xi(x, t) = \langle\Lambda|\,e^{-\Phi}\Psi(x, t), \qquad \bar\xi(x, t) = \Psi^{-1}(x, t)\,e^{-\Phi}|\Lambda\rangle \qquad (12.66) \]
These fields are chiral: $\partial_{x_+}\xi = 0$ and $\partial_{x_-}\bar\xi = 0$. For any highest weight $\Lambda$ we can reconstruct $\Lambda(\Phi)$ by the formula:
\[ \exp(-2\Lambda(\Phi)) = \xi(x_-)\cdot\bar\xi(x_+) \]
The affine $\widehat{sl}_2$ algebra has two fundamental highest weights, which we shall denote by $\Lambda_-$ and $\Lambda_+$, see Chapter 16. They are characterized by $\Lambda_\pm(H) = \pm\frac{1}{2}$, $\Lambda_\pm(K) = 1$ and $\Lambda_\pm(d) = 0$, so that we have $\Lambda_\pm(\Phi) = \frac{1}{4}(\pm\varphi + \zeta)$. This is enough to reconstruct the fields $\varphi$ and $\zeta$. The field $\eta$ cannot be obtained by these highest weight projections. This is not a problem if we are interested in the sinh-Gordon model, since then the free field $\eta$ is equal to 0. Defining the tau-functions:
\[ \tau_\pm = \exp\left(-2\Lambda_\pm(\Phi)\right) \]
we have:
\[ e^{-\varphi} = \frac{\tau_+}{\tau_-} \quad\text{and}\quad e^{-\zeta} = \tau_+\tau_- \qquad (12.67) \]
In terms of the tau-functions, the equations of motion take the form:
\[ \tau_\pm(\partial_{x_-}\partial_{x_+}\tau_\pm) - (\partial_{x_+}\tau_\pm)(\partial_{x_-}\tau_\pm) = -m^2e^{2\eta}\,\tau_\mp^2 \qquad (12.68) \]
When $\eta = 0$, this is just the Hirota bilinear form of the sinh-Gordon equation.

We now describe the chiral fields $\xi(x_-)$ and $\bar\xi(x_+)$ for the simplest solution, the vacuum solution of eq. (12.61):
\[ \varphi_{vac} = 0, \qquad \eta_{vac} = 0, \qquad \zeta_{vac} = 2m^2x_+x_- \]
One can insert this solution into the linear system and compute the vacuum wave function $\Psi_{vac}(x, t)$. We have $\Phi_{vac} = \frac{1}{2}m^2x_+x_-K$ and the linear system becomes:
\[ \left(\partial_{x_+} - \frac{1}{2}m^2x_-K - m\mathcal{E}_-\right)\Psi = 0, \qquad \left(\partial_{x_-} + \frac{1}{2}m^2x_+K - m\mathcal{E}_+\right)\Psi = 0 \]
The first equation is readily solved by $\Psi = e^{\frac{1}{2}m^2x_+x_-K}\,e^{mx_+\mathcal{E}_-}\,\tilde\Psi(x_-)$. Inserting this into the second equation, and using $[\mathcal{E}_+, \mathcal{E}_-] = K$, which implies $\exp(-mx_+\,\mathrm{ad}\,\mathcal{E}_-)\,\mathcal{E}_+ = \mathcal{E}_+ + mx_+K$, one gets $(\partial_{x_-} - m\mathcal{E}_+)\tilde\Psi = 0$. We finally obtain:
\[ \Psi_{vac}(x, t) = e^{\frac{m^2}{2}x_+x_-K}\,e^{mx_+\mathcal{E}_-}\,e^{mx_-\mathcal{E}_+} = e^{-\frac{m^2}{2}x_+x_-K}\,e^{mx_-\mathcal{E}_+}\,e^{mx_+\mathcal{E}_-} \qquad (12.69) \]
The two expressions are equal thanks to the Campbell–Hausdorff formula. One can then compute the chiral fields:
\[ \xi_{vac}(x_-) = \langle\Lambda|\,e^{mx_-\mathcal{E}_+}, \qquad \bar\xi_{vac}(x_+) = e^{-mx_+\mathcal{E}_-}|\Lambda\rangle \qquad (12.70) \]
The reconstruction formula for $\Phi$ reads
\[ \exp(-2\Lambda(\Phi)) = \xi_{vac}(x_-)\cdot\bar\xi_{vac}(x_+) = \exp\left(-m^2x_+x_-\Lambda(K)\right) \]
as it should be. The tau-functions of the vacuum solution are:
\[ \tau_+ = \tau_- = \tau_0 = \exp[-m^2x_+x_-] \qquad (12.71) \]
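As a consistency check, the vacuum tau-function can be inserted into the bilinear equation (12.68) with $\eta = 0$ and $\tau_+ = \tau_- = \tau_0$. The sketch below does this with finite differences and also recovers $\zeta_{vac}$ from eq. (12.67); the numerical value of the mass parameter and the function names are arbitrary choices made for illustration.

```python
import math

M = 0.8  # mass parameter m; the value is an arbitrary choice for this sketch

def tau0(xp, xm):
    """Vacuum tau-function of eq. (12.71)."""
    return math.exp(-M * M * xp * xm)

def hirota_residual(xp, xm, h=1e-3):
    """tau (d- d+ tau) - (d+ tau)(d- tau) + m^2 tau^2, cf. eq. (12.68) at eta = 0."""
    t = tau0(xp, xm)
    d_plus = (tau0(xp + h, xm) - tau0(xp - h, xm)) / (2.0 * h)
    d_minus = (tau0(xp, xm + h) - tau0(xp, xm - h)) / (2.0 * h)
    d_mixed = (tau0(xp + h, xm + h) - tau0(xp + h, xm - h)
               - tau0(xp - h, xm + h) + tau0(xp - h, xm - h)) / (4.0 * h * h)
    return t * d_mixed - d_plus * d_minus + M * M * t * t

def zeta_from_tau(xp, xm):
    """Invert exp(-zeta) = tau_+ tau_- of eq. (12.67) for tau_+ = tau_- = tau0."""
    return -2.0 * math.log(tau0(xp, xm))

print(hirota_residual(0.4, 0.9))                       # close to zero
print(zeta_from_tau(0.4, 0.9), 2 * M * M * 0.4 * 0.9)  # the two values agree
```

The cancellation is exact analytically: the terms of order $m^4$ produced by the mixed derivative and by the product of first derivatives compensate, leaving $-m^2\tau_0^2$.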
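Remark. We can embed the vacuum equations of motion $\partial_{x_+}\partial_{x_-}\zeta_{vac} = 2m^2$ into a larger hierarchy. Introduce the variables $x_\pm^{(r)}$ for $r$ odd, and consider the following connection, generalizing eqs. (12.60) evaluated at $\Phi_{vac}$,
\[ A^{vac}_{x_\pm^{(r)}} = \pm\frac{1}{4}\,\partial_{x_\pm^{(r)}}\zeta\, K + m\,\mathcal{E}_\mp^{(r)} \]
where $\mathcal{E}_\pm^{(r)} = \lambda^{\pm r}(E_+ + E_-)$, with $r$ odd. Since $[\mathcal{E}_+^{(r)}, \mathcal{E}_-^{(s)}] = K\,r\,\delta_{rs}$, the zero curvature condition reduces to $\partial_{x_+^{(r)}}\partial_{x_-^{(s)}}\zeta_{vac} = 2m^2\,r\,\delta_{rs}$. A solution is $\zeta_{vac} = 2m^2\sum_r r\,x_+^{(r)}x_-^{(r)}$. In the same way as before, we can calculate $\Psi_{vac}(x_+^{(r)}, x_-^{(r)})$:
\[ \Psi_{vac}(x_+^{(r)}, x_-^{(r)}) = e^{-\frac{m^2}{2}\sum_r r\,x_+^{(r)}x_-^{(r)}K}\; e^{m\sum_r x_-^{(r)}\mathcal{E}_+^{(r)}}\; e^{m\sum_r x_+^{(r)}\mathcal{E}_-^{(r)}} \]
\[ \phantom{\Psi_{vac}(x_+^{(r)}, x_-^{(r)})} = e^{\frac{m^2}{2}\sum_r r\,x_+^{(r)}x_-^{(r)}K}\; e^{m\sum_r x_+^{(r)}\mathcal{E}_-^{(r)}}\; e^{m\sum_r x_-^{(r)}\mathcal{E}_+^{(r)}} \]
The vacuum chiral fields are thus:
\[ \xi_{vac}(x_-^{(r)}) = \langle\Lambda|\,e^{m\sum_r x_-^{(r)}\mathcal{E}_+^{(r)}}, \qquad \bar\xi_{vac}(x_+^{(r)}) = e^{-m\sum_r x_+^{(r)}\mathcal{E}_-^{(r)}}|\Lambda\rangle \qquad (12.72) \]
The $x_\pm^{(r)}$ are the collection of elementary times defining the sinh-Gordon hierarchy, see Chapter 3. Each chirality is attached separately to the poles at $\lambda = 0, \infty$ in the loop representation.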
12.8 Dressing transformations and soliton solutions

As shown in eq. (12.53), dressing transformations act on the chiral fields $\xi$ and $\bar\xi$ by:
\[ \xi^g(x_-) = \xi(x_-)\cdot g_-^{-1}, \qquad \bar\xi^g(x_+) = g_+\cdot\bar\xi(x_+) \]
In our case, we start from an element $g \in \exp\widehat{sl}_2$. The dressing group element $(g_+, g_-)$ is defined by the factorization problem
\[ g = g_-^{-1}g_+ \quad\text{with}\quad g_\pm \in B_\pm = (\exp\hat{\mathcal{H}})(\exp\hat{\mathcal{N}}_\pm) \]
and, moreover, we require that $g_-$ and $g_+$ have inverse components on the Cartan torus. When we dress the vacuum solution with an element $g = g_-^{-1}g_+$, we get a new solution $\Phi^g$ such that:
\[ e^{-2\Lambda(\Phi^g)} = \langle\Lambda|\,e^{mx_-\mathcal{E}_+}\; g\; e^{-mx_+\mathcal{E}_-}|\Lambda\rangle \]
To construct new solutions of the sinh-Gordon equation, one has to choose particular elements $g$ of the affine group. Remarkable elements can be constructed with vertex operators in the two level one representations of the affine $\widehat{sl}_2$ algebra. Let us recall here
the main facts, and refer to Chapter 16 for more details. One introduces bosonic oscillators $p_n$ for $n$ odd, such that $[p_m, p_n] = m\,\delta_{n+m,0}$ and $p_n^\dagger = p_{-n}$. They generate a Fock space over the vacuum state $|0\rangle$ which is specified by $p_n|0\rangle = 0$ for $n > 0$. Let $Z(\lambda)$ be the generating function:
\[ Z(\lambda) = -i\sqrt{2}\sum_{n\ \mathrm{odd}} p_{-n}\frac{\lambda^n}{n} \]
The level one vertex operator representations with highest weight $\Lambda_\pm$ of the Lie algebra $\widehat{sl}_2$ are then obtained by setting:
\[ \sum_{n\ \mathrm{odd}} \lambda^{-n}(E_+^n + E_-^n) = \frac{i}{\sqrt{2}}\,\lambda\frac{d}{d\lambda}Z(\lambda) \]
\[ \sum_{n\ \mathrm{even}} \lambda^{-n}H^n + \sum_{n\ \mathrm{odd}} \lambda^{-n}(E_+^n - E_-^n) = \pm V(\lambda) \qquad (12.73) \]
where $V(\lambda)$ denotes the vertex operator:
\[ V(\lambda) = \frac{1}{2}:e^{-i\sqrt{2}Z(\lambda)}: \;=\; \frac{1}{2}:e^{i\sqrt{2}Z(-\lambda)}: \]
The double dots mean that the expressions are normal ordered, i.e. we write all $p_n$ with $n > 0$ to the right. The representations with highest weight $\Lambda_\pm$ correspond to the plus and minus sign respectively, in eq. (12.73). Recall the formula (see eq. (16.25)):
\[ V(\lambda)V(\mu) = \left(\frac{\lambda - \mu}{\lambda + \mu}\right)^2 :V(\lambda)V(\mu):, \qquad |\lambda| > |\mu| \]
This means that inside expectation values $V^2(\mu) = 0$. So there is an element $g \in \exp\widehat{sl}_2$ such that $\rho_{\Lambda_+}(g) = \rho_{\Lambda_+}(\exp aV(\mu)) = 1 + aV(\mu)$. In the representation $\rho_{\Lambda_-}$ we have $\rho_{\Lambda_-}(g) = 1 - aV(\mu)$ due to the sign change in eq. (12.73). More generally, we consider the product of such elements $g = g_1g_2\cdots g_N$. We have:
\[ \rho_{\Lambda_\pm}(g) = (1 \pm 2a_1V(\mu_1))(1 \pm 2a_2V(\mu_2))\cdots(1 \pm 2a_NV(\mu_N)) \]
Proposition. Let $\tau_\pm^{(N)}$ be the tau-functions:
\[ \tau_\pm^{(N)}(x_\pm) = \langle\Lambda_\pm|\,e^{mx_-\mathcal{E}_+}\prod_{i=1}^{N}\left(1 \pm 2a_iV(\mu_i)\right)e^{-mx_+\mathcal{E}_-}|\Lambda_\pm\rangle \qquad (12.74) \]
Then we have: τ± =1+ (±)p τ0 (N )
N
p=1
k1 0 is such that th to bound the n iterate in the following way: ∞ ∞ n ˜0 dx1 K(x, x1 ) · · · K(xn−1 , xn )f˜10 | ≤ |(K f1 )(x)| = | x xn−1
n 1 ∞ 0 ˜ dx1 · · · dxn M (x1 ) · · · M (xn ) = dyM (y) |f˜10 | |f1 | n! x x≤x1 ≤···≤xn
Because of the $n!$ in the denominator, the series is absolutely and uniformly bounded by $\exp\left(\int_x^\infty dy\,M(y)\right)$. This is the main observation in the theory of Volterra integral equations. Each term in the expansion is analytic in $\lambda$ for $\mathrm{Im}\,\lambda > 0$, hence the uniform limit is also analytic. Note, however, that for $\lambda = 0$, $V(y, \lambda)$ has a pole so that singularities are expected for $f_1$ and $f_2$. The equation for $f_2$ is treated in the same way.
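The factorial bound behind this argument can be seen on a scalar toy model. The sketch below is only an illustration under stated assumptions: it replaces the matrix kernel by the hypothetical positive weight $M(y) = e^{-y}$, solves $f(x) = 1 + \int_x^\infty dy\,M(y)f(y)$ by Picard iteration on a truncated grid with the trapezoidal rule, and compares the result with the exact solution $f(x) = \exp\left(\int_x^\infty dy\,M(y)\right)$. The $n$-th correction at $x = 0$ is then close to $\frac{1}{n!}\left(\int_0^\infty M\right)^n = \frac{1}{n!}$.

```python
import math

def picard_volterra(weight, x_max=30.0, n_grid=3000, n_iter=25):
    """Iterate f -> 1 + int_x^x_max weight(y) f(y) dy on a uniform grid.

    Returns the grid, the last iterate, and the size of each successive
    correction at x = 0 (the n-th term of the Neumann series there).
    """
    h = x_max / n_grid
    xs = [i * h for i in range(n_grid + 1)]
    m = [weight(x) for x in xs]
    f = [1.0] * (n_grid + 1)              # zeroth iterate
    increments = []
    for _ in range(n_iter):
        new = [1.0] * (n_grid + 1)
        tail = 0.0                        # trapezoidal integral from xs[i] to x_max
        for i in range(n_grid - 1, -1, -1):
            tail += 0.5 * h * (m[i] * f[i] + m[i + 1] * f[i + 1])
            new[i] = 1.0 + tail
        increments.append(abs(new[0] - f[0]))
        f = new
    return xs, f, increments

xs, f, increments = picard_volterra(lambda y: math.exp(-y))
print(f[0], math.e)   # the iterate at x = 0 approaches exp(1)
for n in range(1, 6):
    print(n, increments[n - 1], 1.0 / math.factorial(n))
```

The rapid decay of the corrections does not depend on the weight being small, only on its integrability, which is the feature of Volterra equations used in the proof.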
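Remark 1. If we assume, in the previous proof, that the potential $\varphi$ is such that $V_1$, $V_2$ have compact support (i.e. the potential reaches its limiting values $0$, $Q\pi/\beta$ at finite distance), then the integration domain in the integral equation is finite, and one can bound $e^{2ik(\lambda)(y-x)}$ even for $\mathrm{Im}(k) < 0$. It follows that $f_1(x, \lambda)$ and $f_2(x, \lambda)$ are analytic functions of $\lambda$ in the whole $\lambda$-plane except at $\lambda = 0$ and $\lambda = \infty$. In this case, the scattering data $a(\lambda)$, $b(\lambda)$ can be analytically continued in the $\lambda$-plane. This remains true if $V_1$, $V_2$ vanish at $\pm\infty$ rapidly enough to compensate for the growth of $e^{2ik(\lambda)(y-x)}$ when $\mathrm{Im}(k) < 0$. In particular, choosing $\lambda$ to be a zero $\lambda_n$ of $a(\lambda)$ in the upper half-plane, and looking at the asymptotic expansions of $f_1$, $f_2$ when $x = -\infty$, we see that $c_n = -1/b(\lambda_n)$ in eq. (13.14).

The next task is to compute the asymptotics of the Jost solutions when $\lambda \to 0$ and $\lambda \to \infty$.

Proposition. The Jost solutions $f_1$, $f_2$ have the following asymptotics in $\lambda$, valid for any $x$:
\[ f_1 = e^{ik(\lambda)x}\,e^{\frac{i\beta\varphi}{2}\sigma_3}\left[\begin{pmatrix} e^{-\frac{i\pi Q}{2}} \\ e^{-\frac{i\pi Q}{2}} \end{pmatrix} + O\left(\frac{1}{|\lambda|}\right)\right], \qquad |\lambda| \to \infty \]
\[ f_2 = e^{-ik(\lambda)x}\,e^{\frac{i\beta\varphi}{2}\sigma_3}\left[\begin{pmatrix} 1 \\ -1 \end{pmatrix} + O\left(\frac{1}{|\lambda|}\right)\right], \qquad |\lambda| \to \infty \]
\[ f_1 = e^{ik(\lambda)x}\,e^{-\frac{i\beta\varphi}{2}\sigma_3}\left[\begin{pmatrix} e^{\frac{i\pi Q}{2}} \\ e^{\frac{i\pi Q}{2}} \end{pmatrix} + O(|\lambda|)\right], \qquad |\lambda| \to 0 \]
\[ f_2 = e^{-ik(\lambda)x}\,e^{-\frac{i\beta\varphi}{2}\sigma_3}\left[\begin{pmatrix} 1 \\ -1 \end{pmatrix} + O(|\lambda|)\right], \qquad |\lambda| \to 0 \qquad (13.18) \]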
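Proof. It is easily checked that the factors $e^{\pm\frac{i\beta\varphi}{2}\sigma_3}$ are such that the right-hand sides of these equations have the correct asymptotics for $f_1$, $f_2$ when $x \to \pm\infty$ respectively, see eq. (13.9). Let us consider the case $\lambda \to \infty$ and analyse the asymptotics of $f_1$ for definiteness, the other cases being similar. We introduce $\tilde f_1$ by performing the gauge transformation:
\[ f_1 = e^{ik(\lambda)x}\,e^{i\frac{\beta\varphi}{2}\sigma_3}\,\tilde f_1 \]
It obeys the transformed equation $[\partial_x + ik(\lambda)(1 - \sigma_1) - \tilde A_x]\tilde f_1 = 0$ with:
\[ \tilde A_x = i\begin{pmatrix} \frac{\beta}{2}(\dot\varphi - \varphi') & m\lambda^{-1}(1 - e^{-2i\beta\varphi}) \\ m\lambda^{-1}(1 - e^{2i\beta\varphi}) & -\frac{\beta}{2}(\dot\varphi - \varphi') \end{pmatrix} \]
Notice that the boundary conditions on $\varphi$ are such that $\tilde A_x$ rapidly vanishes when $x \to \pm\infty$. Equivalently, $\tilde f_1$ is a solution of the integral equation:
\[ \tilde f_1(x, \lambda) = \tilde f_1^0 - \int_x^\infty dy\; e^{ik(\lambda)(y-x)(1-\sigma_1)}\,\tilde A_x(y, \lambda)\,\tilde f_1(y, \lambda), \qquad \tilde f_1^0 = e^{-i\frac{\pi Q}{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
We decompose this equation on the eigenvectors of the matrix $(1 - \sigma_1)$ corresponding to the eigenvalues 2, 0 respectively:
\[ \tilde f_1(x, \lambda) = F(x, \lambda)\begin{pmatrix} 1 \\ -1 \end{pmatrix} + G(x, \lambda)\begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
This yields the two coupled scalar integral equations:
\[ F(x, \lambda) = -i\int_x^\infty dy\; e^{2ik(\lambda)(y-x)}\left\{\frac{\beta}{2}(\dot\varphi - \varphi')\,G(y, \lambda) - m\lambda^{-1}\left[(1 - \cos(2\beta\varphi))\,F(y, \lambda) - i\sin(2\beta\varphi)\,G(y, \lambda)\right]\right\} \]
\[ G(x, \lambda) = e^{-i\frac{\pi Q}{2}} - i\int_x^\infty dy\left\{\frac{\beta}{2}(\dot\varphi - \varphi')\,F(y, \lambda) + m\lambda^{-1}\left[(1 - \cos(2\beta\varphi))\,G(y, \lambda) - i\sin(2\beta\varphi)\,F(y, \lambda)\right]\right\} \]
This is a Volterra system, so the iteration procedure converges, exactly as above. Note that there is a Fourier exponential in the first equation, but not in the second one. Now for $h(x)$ rapidly decreasing at $+\infty$ we have:
\[ \int_x^\infty dy\; e^{i\lambda y}h(y) = O(\lambda^{-1}) \]
It follows that the iteration procedure yields $G(x, \lambda) = e^{-i\frac{\pi Q}{2}} + O(\lambda^{-1})$ and $F(x, \lambda) = O(\lambda^{-1})$.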
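Proposition. The Jost function $a(\lambda)$ is analytic in the upper half-plane $\mathrm{Im}(\lambda) > 0$. Furthermore,

$$a(\lambda) = e^{-\frac{i\pi Q}{2}} + O\!\left(\frac{1}{|\lambda|}\right), \quad |\lambda| \to \infty; \qquad a(\lambda) = e^{\frac{i\pi Q}{2}} + O(|\lambda|), \quad |\lambda| \to 0$$

Recall also that $a(-\lambda) = e^{-i\pi Q} a^*(\lambda)$ for real $\lambda$.

Proof. This follows from the relation $a(\lambda) = -\frac{1}{2} W(f_1, f_2)$, and from the fact that $f_1$, $f_2$ are analytic in the upper half-plane.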
Remark 2. When $x \to -\infty$, we see that

$$\hat f_1 \sim a(\lambda)\begin{pmatrix} 1 \\ 1 \end{pmatrix} - b(\lambda)\, e^{-2ik(\lambda)x}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

Letting $x \to -\infty$ in the above Volterra system for $F$, $G$, we can directly identify:

$$a(\lambda) = e^{-i\frac{\pi Q}{2}} - i\int_{-\infty}^{\infty} dy\left\{\frac{\beta}{2}(\dot\phi - \phi')F(y,\lambda) + m\lambda^{-1}\Big[(1 - \cos(2\beta\phi))G(y,\lambda) - i\sin(2\beta\phi)F(y,\lambda)\Big]\right\}$$

$$b(\lambda) = -i\int_{-\infty}^{\infty} dy\, e^{2ik(\lambda)y}\left\{\frac{\beta}{2}(\dot\phi - \phi')G(y,\lambda) - m\lambda^{-1}\Big[(1 - \cos(2\beta\phi))F(y,\lambda) - i\sin(2\beta\phi)G(y,\lambda)\Big]\right\}$$

We see in the expression for $a(\lambda)$ that it can readily be extended in the upper half-plane, since $F$ and $G$ have such an extension. By contrast, since $b(\lambda)$ is a Fourier transform on the whole real axis, it cannot be extended outside the real $\lambda$ axis in general. However, if the field $\phi$ attains its asymptotic values at finite distance, the integral is over a finite interval, and $b(\lambda)$ admits an analytic continuation in the whole $\lambda$-plane. Similarly, if we assume that the field $\phi(x)$ is $C^\infty$, the functions $F$, $G$, solutions of the integral equation, will also be $C^\infty$ in $x$, so that $\lambda^n b(\lambda) \to 0$ for $\lambda \to \infty$ for any $n > 0$, and $\lambda^{-n} b(\lambda) \to 0$ for $\lambda \to 0$. This is because, integrating by parts,

$$\int_{-\infty}^{\infty} dx\, e^{i\lambda x} h(x) = -\frac{1}{i\lambda}\int_{-\infty}^{\infty} dx\, e^{i\lambda x} h'(x)$$

where the second integral exists by hypothesis. Iterating this formula, we see that the Fourier transform of a $C^\infty$ function $h(x)$ goes to 0 faster than any power of $\lambda$ when $\lambda \to \infty$. Noting that $|a|^2 + |b|^2 = 1$, we see that the modulus of $a$ tends rapidly to 1 for $\lambda \to 0, \infty$ and so all information is contained in its phase.
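The integration-by-parts identity behind the rapid decay of $b(\lambda)$ can be checked numerically. The sketch below is only an illustration: the Gaussian test function `h`, the cutoff `half_width` and the midpoint rule are assumptions made for the example, not objects from the text.

```python
import cmath
import math

def fourier(func, lam, half_width=12.0, n=24000):
    """Midpoint-rule approximation of the integral of exp(i*lam*x) * func(x) over the real line."""
    step = 2.0 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * step
        total += cmath.exp(1j * lam * x) * func(x)
    return total * step

def h(x):
    # smooth, rapidly decreasing test function (illustrative choice)
    return math.exp(-x * x)

def h_prime(x):
    # derivative of h
    return -2.0 * x * math.exp(-x * x)

lam = 3.0
lhs = fourier(h, lam)                       # Fourier transform of h
rhs = -fourier(h_prime, lam) / (1j * lam)   # same quantity after one integration by parts
exact = math.sqrt(math.pi) * math.exp(-lam * lam / 4.0)  # closed form for the Gaussian
far = abs(fourier(h, 8.0))                  # value at larger lambda, already tiny
```

For this Gaussian the transform is $\sqrt{\pi}\,e^{-\lambda^2/4}$, which indeed decreases faster than any power of $\lambda$.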
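We will need alternative representations of the Jost solutions $f_1$ and $f_2$, in which the $\lambda$ dependence is explicit. It is convenient to first get rid of the explicit $\phi$ dependence in the asymptotic expansions eq. (13.18), by defining $\tilde f_1$, $\tilde f_2$ through:

$$f_1 = e^{\frac{i}{2}(\beta\phi - Q\pi)\sigma_3}\,\tilde f_1, \qquad f_2 = e^{\frac{i}{2}\beta\phi\sigma_3}\,\tilde f_2$$

Proposition. These functions admit the following Fourier representations:

$$\tilde f_1(x,\lambda) = e^{ik(\lambda)x} f_1^0 + \int_x^\infty dy\left(\widetilde U_1(x,y) + \lambda^{-1}\widetilde W_1(x,y)\right) e^{ik(\lambda)y} \tag{13.19}$$

$$\tilde f_2(x,\lambda) = e^{-ik(\lambda)x} f_2^0 + \int_{-\infty}^x dy\left(\widetilde U_2(x,y) + \lambda^{-1}\widetilde W_2(x,y)\right) e^{-ik(\lambda)y}$$

where $\widetilde U_i(x,y)$, $\widetilde W_i(x,y)$ are two component vectors, and

$$f_1^0 = \begin{pmatrix} 1 \\ e^{i\pi Q} \end{pmatrix}, \qquad f_2^0 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

Proof. Consider a function $f(\lambda)$ analytic in the upper half-plane. Since $k(\lambda) = m(\lambda - \lambda^{-1})$, the map $\lambda \to k$ covers twice the upper half $k$-plane, the two values $\lambda$ and $-1/\lambda$ map to the same value of $k$. So the function $f(\lambda)$ can be written:

$$f(\lambda) = \frac{1}{2}\left[f(\lambda) + f\!\left(-\frac{1}{\lambda}\right)\right] + \frac{1}{2}\left[f(\lambda) - f\!\left(-\frac{1}{\lambda}\right)\right] = g_1(k) + \left(\lambda + \frac{1}{\lambda}\right) g_2(k)$$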
where $g_1$ and $g_2$ are analytic functions of $k$. If $f$ is bounded at $\lambda = 0$ and $\lambda = \infty$, this implies that $g_1(k)$ is bounded and $g_2(k)$ tends to zero at $k \to \infty$. Alternatively, since $\lambda + \frac{1}{\lambda} = 2\lambda^{-1} + \frac{1}{m}k$, we can write $f(\lambda) = h_1(k) + \lambda^{-1} h_2(k)$, and the function $h_1$ is bounded when $k \to \infty$, while $h_2$ tends to zero. By the classical Paley–Wiener theorem, one can represent the functions $h_i$ in the form:

$$h_i(k) = c_i + \int_0^\infty u_i(y)\, e^{iky}\, dy \tag{13.20}$$
where the functions $u_i(y)$ are sufficiently regular so that the Fourier integral tends to zero when $k \to \infty$, and $c_i$ is then the limiting value of $h_i$. The analyticity of $h_i$ in the upper half-plane is accounted for by the support of the Fourier transform on the positive half-line. To apply this to the Jost solutions, we note that $e^{-ikx}\tilde f_1$ is analytic in the upper half-plane. Moreover, $e^{-ikx}\tilde f_1$ is bounded at $\lambda \to 0, \infty$ by eq. (13.18), hence one can represent it as in eq. (13.20). Multiplying by $e^{ikx}$ and changing variables $y + x \to y$ in the integral (13.20), we arrive at eq. (13.19). The other equation is obtained similarly.

The kernels $\widetilde U_i$, $\widetilde W_i$ are determined by eq. (13.7) as follows. The function $\tilde f_1$ obeys the gauge transformed equation $(\partial_x - \tilde A_x)\tilde f_1 = 0$ with the connection:

$$\tilde A_x = i\begin{pmatrix} \frac{\beta}{2}(\dot\phi - \phi') & m e^{i\pi Q}(\lambda - \lambda^{-1} e^{-2i\beta\phi}) \\ m e^{i\pi Q}(\lambda - \lambda^{-1} e^{2i\beta\phi}) & -\frac{\beta}{2}(\dot\phi - \phi') \end{pmatrix}$$

We will use the notation

$$\tilde A_x = \lambda \tilde A_1 + \tilde A_0 + \lambda^{-1}\tilde A_{-1}$$

When we insert the representation eq. (13.19) into that equation, terms in $\lambda$ and $\lambda^{-2}$ appear. To rewrite them, we use:

$$\lambda^{-2} e^{ik(\lambda)y} = e^{ik(\lambda)y} - \frac{1}{im\lambda}\partial_y e^{ik(\lambda)y}, \qquad \lambda e^{ik(\lambda)y} = \frac{1}{im}\partial_y e^{ik(\lambda)y} + \frac{1}{\lambda} e^{ik(\lambda)y}$$

and perform integration by parts. The differential equation then translates into the following relations on the kernels $\widetilde U_1(x,y)$ and $\widetilde W_1(x,y)$:

$$\left(\partial_x + \frac{1}{im}\tilde A_1(x)\partial_y\right)\widetilde U_1 = \tilde A_0(x)\widetilde U_1 + \left(\tilde A_1(x) + \tilde A_{-1}(x)\right)\widetilde W_1$$

$$\left(\partial_x - \frac{1}{im}\tilde A_{-1}(x)\partial_y\right)\widetilde W_1 = \tilde A_0(x)\widetilde W_1 + \left(\tilde A_1(x) + \tilde A_{-1}(x)\right)\widetilde U_1$$

and the boundary conditions:

$$\left(1 - \frac{1}{im}\tilde A_1(x)\right)\widetilde U_1(x,x) = -\tilde A_0(x) f_1^0$$

$$\left(1 + \frac{1}{im}\tilde A_{-1}(x)\right)\widetilde W_1(x,x) = -\left(\tilde A_{-1}(x) + im\right) f_1^0$$

Alternatively, knowing $\widetilde U_i(x,y)$ and $\widetilde W_i(x,y)$, these equations allow us to reconstruct the connection $\tilde A_x(x,\lambda)$ and therefore the field $\phi(x,t)$. In particular, the boundary conditions yield

$$e^{2i\beta\phi(x)} = \frac{im + e^{i\pi Q}(\widetilde W_1)_2(x,x)}{im + (\widetilde W_1)_1(x,x)} \tag{13.21}$$
The Gelfand–Levitan–Marchenko equation which we will present below relates $\widetilde U_i(x,y)$ and $\widetilde W_i(x,y)$ directly to the scattering data.

13.3 Inverse scattering as a Riemann–Hilbert problem

One can of course define Jost solutions, $f_3$, $f_4$, analytic in the lower half-plane by choosing the appropriate boundary conditions:

$$f_3 \sim e^{-ik(\lambda)x}\begin{pmatrix} 1 \\ -e^{i\pi Q} \end{pmatrix}, \quad x \to +\infty; \qquad f_4 \sim e^{ik(\lambda)x}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad x \to -\infty \tag{13.22}$$

These solutions are linear combinations of $f_1$, $f_2$. By comparing at $x = \pm\infty$, we find the relations, valid for real $\lambda$:

$$f_3 = \frac{b^*(\lambda)}{a(\lambda)} e^{-iQ\pi} f_1 + \frac{e^{-iQ\pi}}{a(\lambda)} f_2, \qquad f_4 = \frac{1}{a(\lambda)} f_1 + \frac{b(\lambda)}{a(\lambda)} f_2 \tag{13.23}$$
We can write $b(\lambda) = -\frac{1}{2} W(f_1, f_4)$. Since $f_1$ and $f_4$ are not analytic in the same half-plane, we recover the fact that $b(\lambda)$ cannot be extended outside the real axis in general. Let us define the matrices $\Theta_\pm(\lambda)$, analytic in the upper and lower half-plane respectively, by (recall that $f_i$ are two-component vectors):

$$(f_2, f_1) = \Theta_+(\lambda)\, e^{-ik(\lambda)\sigma_3 x}, \qquad \left(\frac{1}{a^*(\lambda)} e^{iQ\pi} f_3,\ \frac{1}{a^*(\lambda)} f_4\right) = \Theta_-(\lambda)\, e^{-ik(\lambda)\sigma_3 x}$$

The factors $e^{-ik(\lambda)x}$ are introduced so that $\Theta_\pm(\lambda)$ have finite limits when $\lambda \to 0$ and $\lambda \to \infty$ in their respective domains of analyticity. We can write eq. (13.23) as

$$\Theta_-^{-1}\Theta_+ = \begin{pmatrix} 1 & -b(\lambda) e^{-2ik(\lambda)x} \\ -b^*(\lambda) e^{2ik(\lambda)x} & 1 \end{pmatrix}$$

This is a Riemann–Hilbert problem, typical of a dressing transformation. Note, however, that the matrix $\Theta_+$ is degenerate at the zeroes of $a(\lambda)$ in the upper half-plane and the matrix $\Theta_-^{-1}$ is degenerate at the zeroes of $a^*(\lambda)$ in the lower half-plane. We are thus led to a Riemann–Hilbert problem with zeroes, as discussed in Chapter 3. In the following, we propose another route to the solution of the inverse scattering problem, by transforming it to the Gelfand–Levitan–Marchenko linear integral equation.

13.4 Time evolution of the scattering data

In the previous sections, from a field $\phi(x, 0)$ at time $t = 0$, we have defined the scattering data $a(\lambda)$, $b(\lambda)$, $\lambda_n$ and $c_n$. The second step in the classical inverse scattering method is to compute the time evolution of the scattering data, which turns out to be beautifully simple.

Proposition. The time evolution of the sine-Gordon theory linearizes on the scattering data:

$$\dot a(\lambda,t) = 0, \qquad \dot b(\lambda,t) = 2im(\lambda + \lambda^{-1})\, b(\lambda,t)$$

$$\dot\lambda_n = 0, \qquad \dot c_n = -2im(\lambda_n + \lambda_n^{-1})\, c_n \tag{13.24}$$

Proof. Recall that for the Jost solution $f_1(x,\lambda)$ we have

$$f_1(x,\lambda)\big|_{x\to+\infty} \sim e^{ik(\lambda)x}\begin{pmatrix} 1 \\ e^{iQ\pi} \end{pmatrix}$$

$$f_1(x,\lambda)\big|_{x\to-\infty} \sim a(\lambda)\, e^{ik(\lambda)x}\begin{pmatrix} 1 \\ 1 \end{pmatrix} - b(\lambda)\, e^{-ik(\lambda)x}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
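Consider now the time evolution of $\Psi$ given by the second equation of the linear system: $\frac{\partial\Psi}{\partial t} - A_t\Psi = 0$. In the limit $x \to +\infty$, it reduces to

$$\frac{\partial\Psi}{\partial t} + i e^{iQ\pi} m(\lambda + \lambda^{-1})\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\Psi = 0 \tag{13.25}$$

Choose $\Psi = \alpha(t) f_1(x,t,\lambda)$. Then the asymptotic time evolution equation at $x \to +\infty$ gives $\dot\alpha = -im(\lambda + \lambda^{-1})\alpha$ and for $x \to -\infty$ it gives

$$\frac{d}{dt}(\alpha a) + im(\lambda + \lambda^{-1})(\alpha a) = 0$$

$$\frac{d}{dt}(\alpha b) - im(\lambda + \lambda^{-1})(\alpha b) = 0$$

Therefore $\dot a(\lambda,t) = 0$ and $\dot b(\lambda,t) = 2im(\lambda + \lambda^{-1}) b(\lambda,t)$ as claimed in the proposition. For bound states, we have by definition $a(\lambda_n) = 0$. So $\lambda_n$ does not evolve. Consider now the wave function $f_n(x) \equiv f_2(x,\lambda_n) = c_n f_1(x,\lambda_n)$. We have

$$f_n(x)\big|_{x\to-\infty} \sim e^{-ik(\lambda_n)x}\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \qquad f_n(x)\big|_{x\to+\infty} \sim c_n\, e^{ik(\lambda_n)x}\begin{pmatrix} 1 \\ e^{iQ\pi} \end{pmatrix}$$

Take $\Psi_n = \alpha(t) f_n$. For $x \to -\infty$ we have $\dot\alpha = im(\lambda_n + \lambda_n^{-1})\alpha$, while for $x \to +\infty$ we have $\dot c_n + 2im(\lambda_n + \lambda_n^{-1}) c_n = 0$.

Integrating eqs. (13.24), we get simple time evolutions of the scattering data:

$$a(\lambda,t) = a(\lambda,0), \qquad b(\lambda,t) = e^{+2im(\lambda + \lambda^{-1})t}\, b(\lambda,0)$$

$$\lambda_n(t) = \lambda_n(0), \qquad c_n(t) = e^{-2im(\lambda_n + \lambda_n^{-1})t}\, c_n(0) \tag{13.26}$$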
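The linear evolution of eq. (13.26) is simple enough to write down directly. The following is a minimal sketch; the function name `evolve_scattering_data`, the list-of-pairs layout for the bound states and all numerical values are illustrative assumptions, not part of the text.

```python
import cmath
import math

def evolve_scattering_data(a, b, bound_states, lam, m, t):
    """Apply eq. (13.26) at real spectral parameter lam.

    a and the zeros lambda_n are constant; b and the c_n pick up exponential factors.
    bound_states is a list of (lambda_n, c_n) pairs (hypothetical data layout).
    """
    b_t = cmath.exp(2j * m * (lam + 1 / lam) * t) * b
    evolved = [(ln, cmath.exp(-2j * m * (ln + 1 / ln) * t) * cn)
               for ln, cn in bound_states]
    return a, b_t, evolved

# illustrative numbers: |a|^2 + |b|^2 = 1 and one zero on the imaginary axis
m, lam, t = 1.0, 2.0, 0.3
a0, b0 = 0.8, 0.6j
mu = 1.5
a_t, b_t, bound_t = evolve_scattering_data(a0, b0, [(1j * mu, 1.0)], lam, m, t)

# for lambda_n = i*mu the factor multiplying c_n is real: exp(2 m (mu - 1/mu) t)
growth = math.exp(2 * m * (mu - 1 / mu) * t)
```

For real $\lambda$ the factor multiplying $b$ is a pure phase, so $|a|^2 + |b|^2 = 1$ is preserved, while for a zero $\lambda_n = i\mu_n$ on the imaginary axis $c_n$ is rescaled by the real factor $e^{2m(\mu_n - \mu_n^{-1})t}$.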
13.5 The Gelfand–Levitan–Marchenko equation

We now explain the inverse problem which amounts to reconstructing the potential from the scattering data. The Gelfand–Levitan–Marchenko equation is a linear integral equation which determines the kernels $\widetilde U_i(x,y)$ and $\widetilde W_i(x,y)$ from the scattering data. Once these kernels are known, the local fields are reconstructed from their boundary values $\widetilde U_i(x,x)$, $\widetilde W_i(x,x)$ by eq. (13.21). Recall eq. (13.11), which we write in the form:

$$\frac{1}{a(\lambda)} f_2(x,\lambda) = r(\lambda) f_1(x,\lambda) + i\bar f_1(x,\lambda) \quad\text{with}\quad r(\lambda) = -\frac{b^*(\lambda)}{a(\lambda)} \tag{13.27}$$

To introduce the kernels $\widetilde U_i(x,y)$, $\widetilde W_i(x,y)$, we need to rewrite this equation in terms of the functions $\tilde f_i$. Recall that $f_1 = g\tilde f_1$ with $g = \exp\left(\frac{i}{2}(\beta\phi - Q\pi)\sigma_3\right)$. We multiply eq. (13.27) by $g^{-1}$. Notice that $g^{-1} f_2 = e^{i\frac{\pi Q}{2}\sigma_3}\tilde f_2$ and $g^{-1}\bar f_1 = \bar{\tilde f}_1$, because $g^{-1}\bar f_1 = g^{-1}\sigma_2 f_1^* = g^{-1}\sigma_2 g^* \tilde f_1^*$ and $g^{-1}\sigma_2 g^* = \sigma_2$. We get:

$$e^{i\frac{\pi Q}{2}\sigma_3}\frac{1}{a(\lambda)}\tilde f_2(x,\lambda) = r(\lambda)\tilde f_1(x,\lambda) + i\bar{\tilde f}_1(x,\lambda) \tag{13.28}$$

The Gelfand–Levitan–Marchenko equation is essentially the Fourier transform of this equation. To perform this transformation we need a lemma:

Lemma. We have the relations

$$\int_{-\infty}^{+\infty} e^{ik(\lambda)x}\, d\lambda = \frac{2\pi}{m}\delta(x)$$

$$\int_{-\infty}^{+\infty} e^{ik(\lambda)x}\, \frac{d\lambda}{\lambda} = 0$$

$$\int_{-\infty}^{+\infty} e^{ik(\lambda)x}\, \frac{d\lambda}{\lambda^2} = \frac{2\pi}{m}\delta(x) \tag{13.29}$$
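Proof. Because of the singularities at $\lambda = 0, \infty$, one has to give a careful definition of these integrals on the real axis. We give a principal part definition, that is we set:

$$\int_{-\infty}^{\infty} d\lambda = \lim_{\epsilon\to 0}\left(\int_{-1/\epsilon}^{-\epsilon} d\lambda + \int_{\epsilon}^{1/\epsilon} d\lambda\right) \tag{13.30}$$

Let us prove the first formula. Recall that $k(\lambda) = m(\lambda - \lambda^{-1})$, so that $k(-\lambda^{-1}) = k(\lambda)$. In the integral from $-1/\epsilon$ to $-\epsilon$, change $\lambda \to -\lambda^{-1}$ to get

$$\int_{-\infty}^{+\infty} e^{ik(\lambda)x}\, d\lambda = \lim_{\epsilon\to 0}\int_{\epsilon}^{1/\epsilon} e^{ik(\lambda)x}(1 + \lambda^{-2})\, d\lambda = \frac{1}{m}\int_{-\infty}^{+\infty} e^{ikx}\, dk = \frac{2\pi}{m}\delta(x)$$

The same technique applied to the second formula yields:

$$\int_{-\infty}^{+\infty} e^{ik(\lambda)x}\, \frac{d\lambda}{\lambda} = \lim_{\epsilon\to 0}\int_{\epsilon}^{1/\epsilon} e^{ik(\lambda)x}\left(-\frac{1}{\lambda} + \frac{1}{\lambda}\right) d\lambda = 0$$

Finally, the last equation is equivalent to the first one changing $\lambda$ to $-1/\lambda$.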
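The change of variables used in this proof can be tested on a smooth function of $k$ instead of the distribution $e^{ikx}$. The sketch below is an illustration under stated assumptions: the test function `g`, the value `M = 1.0` for the mass $m$, the cutoff `eps` and the midpoint rule are all choices made for the example. Under the principal-part prescription (13.30) one expects the weights $1$ and $\lambda^{-2}$ to give $\frac{1}{m}\int g(k)\,dk$, and the weight $\lambda^{-1}$ to give zero.

```python
import math

M = 1.0  # mass parameter m (illustrative value)

def k(lam):
    """k(lambda) = m (lambda - 1/lambda); note k(-1/lambda) = k(lambda)."""
    return M * (lam - 1.0 / lam)

def g(kk):
    # smooth test function of k, deliberately not even in k
    return math.exp(-(kk - 1.0) ** 2)

def midpoint(func, lo, hi, n=40000):
    """Plain midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (j + 0.5) * step) for j in range(n))

def principal_part(weight, eps=0.05):
    """Integral of g(k(lam)) * weight(lam) over [-1/eps, -eps] and [eps, 1/eps], cf. eq. (13.30)."""
    f = lambda lam: g(k(lam)) * weight(lam)
    return midpoint(f, -1.0 / eps, -eps) + midpoint(f, eps, 1.0 / eps)

first = principal_part(lambda lam: 1.0)
second = principal_part(lambda lam: 1.0 / lam)
third = principal_part(lambda lam: 1.0 / lam ** 2)
expected = math.sqrt(math.pi) / M  # (1/m) * integral of g(k) dk for this Gaussian
```

The cutoff does not need to be very small here because the Gaussian in $k$ is already negligible at $|k| \approx 20$.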
Theorem. The kernels $\widetilde U_1(x,y)$, $\widetilde W_1(x,y)$, $y \geq x$, appearing in the Fourier transform of $\tilde f_1$, eq. (13.19), satisfy the linear integral equations:

$$-\frac{2i\pi}{m}\bar{\widetilde U}_1(x,y) = F_0(x+y) f_1^0 + \int_x^\infty dz\left[F_0(y+z)\widetilde U_1(x,z) + F_{-1}(y+z)\widetilde W_1(x,z)\right]$$

$$-\frac{2i\pi}{m}\bar{\widetilde W}_1(x,y) = F_{-1}(x+y) f_1^0 + \int_x^\infty dz\left[F_{-1}(y+z)\widetilde U_1(x,z) + F_{-2}(y+z)\widetilde W_1(x,z)\right] \tag{13.31}$$

The functions $F_j(x)$ are directly computed in terms of the scattering data by:

$$F_j(x) = \int_{-\infty}^{\infty} d\lambda\, \lambda^j e^{ik(\lambda)x} r(\lambda) - 2i\pi\sum_n e^{ik(\lambda_n)x}\lambda_n^j m_n \tag{13.32}$$

where we defined the parameters

$$m_n = \frac{c_n}{a'(\lambda_n)}$$
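Proof. Note that, in eq. (13.31), $\widetilde U_1$ and $\widetilde W_1$ always appear with their second argument greater than the first one, in agreement with their definition. Hence the system of two equations (13.31) determines these two quantities in their domain of definition. We multiply eq. (13.28) by $\lambda^j e^{ik(\lambda)y}$ (for $j = 0, -1$) and integrate over $\lambda$ from $-\infty$ to $+\infty$, with a principal part prescription, getting:

$$e^{i\frac{\pi Q}{2}\sigma_3}\int_{-\infty}^{\infty} d\lambda\, \lambda^j\, \frac{\tilde f_2(x,\lambda)}{a(\lambda)}\, e^{ik(\lambda)y} = \int_{-\infty}^{\infty} d\lambda\, \lambda^j e^{ik(\lambda)y}\left(r(\lambda)\tilde f_1(x,\lambda) + i\bar{\tilde f}_1(x,\lambda)\right) \tag{13.33}$$

Recall the Fourier representations:

$$\tilde f_1(x) = e^{ikx} f_1^0 + \int_x^\infty \left(\widetilde U_1(x,z) + \lambda^{-1}\widetilde W_1(x,z)\right) e^{ikz}\, dz$$

$$\bar{\tilde f}_1(x) = e^{-ikx}\bar f_1^0 + \int_x^\infty \left(\bar{\widetilde U}_1(x,z) + \lambda^{-1}\bar{\widetilde W}_1(x,z)\right) e^{-ikz}\, dz$$

where the second equation is derived from the first by complex conjugating and multiplying by $\sigma_2$. We evaluate the right-hand side of eq. (13.33) using the lemma. We find for $j = 0, -1$ respectively:

$$\int_{-\infty}^{\infty} d\lambda\, e^{iky}\left(r(\lambda)\tilde f_1(x) + i\bar{\tilde f}_1(x)\right) = R_0(x+y) f_1^0 + i\frac{2\pi}{m}\delta(x-y)\bar f_1^0 + i\frac{2\pi}{m}\theta(y-x)\bar{\widetilde U}_1(x,y) + \int_x^\infty dz\left[R_0(y+z)\widetilde U_1(x,z) + R_{-1}(y+z)\widetilde W_1(x,z)\right] \tag{13.34}$$

and similarly:

$$\int_{-\infty}^{\infty} d\lambda\, \lambda^{-1} e^{iky}\left(r(\lambda)\tilde f_1(x) + i\bar{\tilde f}_1(x)\right) = R_{-1}(x+y) f_1^0 + i\frac{2\pi}{m}\theta(y-x)\bar{\widetilde W}_1(x,y) + \int_x^\infty dz\left[R_{-1}(y+z)\widetilde U_1(x,z) + R_{-2}(y+z)\widetilde W_1(x,z)\right]$$

where we have introduced the notation:

$$R_j(x) = \int_{-\infty}^{\infty} d\lambda\, \lambda^j e^{ik(\lambda)x} r(\lambda), \qquad j = 0, -1, -2$$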
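To evaluate the left-hand side of eq. (13.33) we use the analyticity properties of the Jost solutions, and the residue theorem. Recall that the $\lambda$ integrals are defined with the prescription (13.30). We close the contour in the upper half-plane by introducing a small half-circle $C_\epsilon$ of center 0 and radius $\epsilon$ and a large half-circle $C_{1/\epsilon}$ of center 0 and radius $1/\epsilon$. In the upper half-plane, the integrand has poles at the zeroes $\lambda_n$ of $a(\lambda)$. So the residue theorem gives:

$$e^{i\frac{\pi Q}{2}\sigma_3}\int_{-\infty}^{\infty} d\lambda\, \lambda^j\, \frac{\tilde f_2(x,\lambda)}{a(\lambda)}\, e^{ik(\lambda)y} = 2i\pi\, e^{i\frac{\pi Q}{2}\sigma_3}\sum_n e^{ik(\lambda_n)y}\lambda_n^j\, \frac{\tilde f_2(x,\lambda_n)}{a'(\lambda_n)} + e^{i\frac{\pi Q}{2}\sigma_3}\int_{C_\epsilon + C_{1/\epsilon}} d\lambda\, \lambda^j\, \frac{\tilde f_2(x,\lambda)}{a(\lambda)}\, e^{ik(\lambda)y} \tag{13.35}$$

where $a'(\lambda) = \frac{da}{d\lambda}$. To proceed we need to evaluate the integrals on the half-circles. Using the asymptotic expansions eq. (13.18) we see that on these circles $\tilde f_2(x,\lambda) = e^{-ik(\lambda)x} \times$ regular, and similarly $a(\lambda)$ is regular. In particular, at $\infty$ we have

$$\frac{\tilde f_2(x,\lambda)}{a(\lambda)} = e^{\frac{i\pi Q}{2}}\, e^{-ik(\lambda)x}\left(f_2^0 + O(\lambda^{-1})\right)$$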
We consider, first for $j = 0$, integrals of the form $\int_{C_\epsilon} e^{ik(\lambda)(y-x)}\, d\lambda$, where $k$ is large. The existence of such integrals requires $y \geq x$, otherwise the exponential explodes. Assuming in the following that this condition is satisfied, we have $|e^{ik(\lambda)(y-x)}| \leq 1$ for $\lambda$ in the upper half-plane, hence the integral on $C_\epsilon$ is bounded by $\pi\epsilon$ and can be neglected when $\epsilon \to 0$. On the other hand, the integral over $C_{1/\epsilon}$ reduces by the residue theorem to the integral on the real axis:

$$\int_{C_{1/\epsilon}} e^{ik(\lambda)(y-x)}\, d\lambda \sim \int_{-\infty}^{\infty} e^{im\lambda(y-x)}\, d\lambda = \frac{2\pi}{m}\delta(x-y)$$

so that the last term in eq. (13.35) is equal to $\frac{2\pi}{m}\delta(x-y)\, e^{\frac{i\pi Q(1+\sigma_3)}{2}} f_2^0$. Taking into account the value of $\bar f_1^0$, this term precisely cancels the $\delta(x-y)$ term in the right-hand side of eq. (13.34). For $j = -1$, we have to consider integrals of the form $\int_{C_\epsilon} e^{ik(\lambda)x}\, d\lambda/\lambda$ with $x > 0$. Write $\lambda = \epsilon e^{i\theta}$ with $0 < \theta < \pi$, and fix $\eta > 0$ small enough. The integral takes the form:

$$\int_{C_\epsilon} e^{ik(\lambda)x}\lambda^{-1}\, d\lambda \sim i\int_0^\pi d\theta\, e^{-\frac{mx}{\epsilon}\sin\theta - \frac{imx}{\epsilon}\cos\theta}$$

On the interval $[\eta, \pi - \eta]$ the integrand decays exponentially when $\epsilon \to 0$. On the intervals $[0, \eta]$ and $[\pi - \eta, \pi]$ the integrand is bounded by 1, and the integral by $2\eta$. So the contribution on $C_\epsilon$ can be neglected. A similar analysis shows that the contribution on $C_{1/\epsilon}$ can also be neglected. We now evaluate $\tilde f_2(x,\lambda_n)$ appearing in eq. (13.35). Precisely at the zeroes $\lambda_n$ of $a(\lambda)$ we have $f_2(x,\lambda_n) = c_n f_1(x,\lambda_n)$, which translates into $\tilde f_2(x,\lambda_n) = c_n e^{-\frac{i}{2}\pi Q\sigma_3}\tilde f_1(x,\lambda_n)$. Replacing $\tilde f_1(x,\lambda_n)$ by its expression, eq. (13.19), one gets:

$$e^{ik(\lambda_n)y}\lambda_n^j\, \frac{\tilde f_2(x,\lambda_n)}{a'(\lambda_n)} = e^{ik(\lambda_n)y}\, \frac{c_n\lambda_n^j}{a'(\lambda_n)}\, e^{-\frac{i}{2}\pi Q\sigma_3}\left[e^{ik(\lambda_n)x} f_1^0 + \int_x^\infty dz\, e^{ik(\lambda_n)z}\left(\widetilde U_1(x,z) + \lambda_n^{-1}\widetilde W_1(x,z)\right)\right]$$
Combining everything finally yields the Gelfand–Levitan–Marchenko equations.

13.6 Soliton solutions

The solution of the Gelfand–Levitan–Marchenko equation (13.31) is particularly simple when we take $R_j(x) = 0$. This corresponds to $b(\lambda) = 0$, which means that there is no reflection in the auxiliary scattering problem. Corresponding potentials are called reflectionless potentials. Then the kernels $F_j$ are degenerate and the Gelfand–Levitan–Marchenko equations reduce to a finite linear system. The sine-Gordon solutions $\phi(x,t)$ we get in this way are just the multi-soliton solutions. If there is no reflection, the scattering data are $\lambda_n$ and $m_n$. The Gelfand–Levitan–Marchenko kernels read:

$$F_j(x+y) = -2i\pi\sum_n m_n\lambda_n^j\, e^{ik(\lambda_n)(x+y)}$$

Looking at the $y$ dependence in the Gelfand–Levitan–Marchenko equation shows that $\widetilde U_1(x,y)$ and $\widetilde W_1(x,y)$ can be expanded as:

$$\widetilde U_1(x,y) = \sum_n e^{-ik(\lambda_n^*)(x+y)}\, m_n^*\, u_n(x), \qquad \widetilde W_1(x,y) = \sum_n e^{-ik(\lambda_n^*)(x+y)}\, m_n^*\, w_n(x)$$

The $y$ exponentials have been chosen to remain bounded when $y \to \infty$, so that the $z$ integrals in the Gelfand–Levitan–Marchenko equations converge. The factor $m_n^* e^{-ik(\lambda_n^*)x}$ has been introduced to simplify later formulae. Inserting these forms into the Gelfand–Levitan–Marchenko equations and identifying the coefficient of $\exp(ik(\lambda_n)y)$ on both sides yields:

$$\frac{1}{m}\bar u_n(x) = f_1^0 + \sum_p \frac{i m_p^*}{k(\lambda_n) - k(\lambda_p^*)}\, e^{-2ik(\lambda_p^*)x}\left(u_p(x) + \lambda_n^{-1} w_p(x)\right)$$

$$\frac{\lambda_n}{m}\bar w_n(x) = f_1^0 + \sum_p \frac{i m_p^*}{k(\lambda_n) - k(\lambda_p^*)}\, e^{-2ik(\lambda_p^*)x}\left(u_p(x) + \lambda_n^{-1} w_p(x)\right)$$

Since the right-hand sides of these equations are identical, we get $w_n(x) = \frac{1}{\lambda_n^*} u_n(x)$. Substituting back into the equation for $u_n$ and noting that $k(\lambda_n) - k(\lambda_p^*) = m(\lambda_n - \lambda_p^*)(1 + 1/(\lambda_n\lambda_p^*))$, the equation simplifies to:

$$\bar u_n(x) = m f_1^0 + \sum_p \frac{i m_p^*}{\lambda_n - \lambda_p^*}\, e^{-2ik(\lambda_p^*)x}\, u_p(x)$$

In the following, we restrict ourselves to pure soliton and antisoliton solutions (no breathers), so that the $\lambda_p$ are pure imaginary. To connect with the notations of Chapter 12 we set $\lambda_p = i\mu_p$. The Gelfand–Levitan–Marchenko equation becomes, in matrix notation:

$$\bar u = m f + V u \tag{13.36}$$
where $u$ is the vector with components $u_n = \begin{pmatrix} u_{1n} \\ u_{2n} \end{pmatrix}$, $f$ is the vector with components $f_n = f_1^0 e_n$, where $e_n = 1$ for all $n$, and we have defined the matrix:

$$V_{np} = \frac{\mu_p}{\mu_n + \mu_p} X_p, \qquad X_p = \frac{m_p}{\mu_p}\, e^{-2m(\mu_p + \mu_p^{-1})x} \tag{13.37}$$

Note that, since $a(-\lambda^*) = e^{i\pi Q}(a(\lambda))^*$ and $c_n^* = -e^{i\pi Q} c_n$, we have $m_p^* = m_p$, and the matrix $V$ is real in this pure solitonic case. Note that, if one includes the time dependence, i.e. $m_p \to m_p e^{2m(\mu_p - \mu_p^{-1})t}$, the exponential appearing in $V$ becomes $\exp(-2m[\mu_p(x - t) + \mu_p^{-1}(x + t)])$, which is the familiar form encountered in Chapter 12.

Taking the bar of eq. (13.36) and using that $\bar{\bar u} = -u$, we get $-u = m\bar f + V\bar u$ so that, eliminating $\bar u$:

$$(1 + V^2) u = -m\bar f - mVf$$

Writing $(1 + V^2) = (1 + ie^{i\pi Q}V)(1 - ie^{i\pi Q}V)$, this is solved by:

$$u_1 = i m e^{i\pi Q}(1 - ie^{i\pi Q}V)^{-1} e, \qquad u_2 = -im(1 + ie^{i\pi Q}V)^{-1} e$$
We can then compute the field $\phi$ by eq. (13.21). This is done in the:

Proposition. We have:

$$im + (\widetilde W_1)_1(x,x) = im\, \frac{\det(1 + ie^{i\pi Q}V)}{\det(1 - ie^{i\pi Q}V)} \tag{13.38}$$

$$im + e^{i\pi Q}(\widetilde W_1)_2(x,x) = im\, \frac{\det(1 - ie^{i\pi Q}V)}{\det(1 + ie^{i\pi Q}V)} \tag{13.39}$$

so that

$$e^{-i\beta\phi} = \frac{\tau_+}{\tau_-}$$

as in eqs. (12.67, 12.76) in Chapter 12.
Proof. Let us prove the first equation. We have

$$(\widetilde W_1)_1(x,x) = i\sum_n X_n (u_1)_n = -m e^{i\pi Q}\, \mathrm{Tr}(M), \qquad M \equiv (1 - ie^{i\pi Q}V)^{-1} e\otimes X$$

where $X$ is the vector of components $X_n$ as in eq. (13.37). Hence

$$im + (\widetilde W_1)_1(x,x) = im\left(1 + ie^{i\pi Q}\, \mathrm{Tr}(M)\right)$$

Note that $M$ is of rank 1, so $1 + \mathrm{Tr}(ie^{i\pi Q}M) = \det(1 + ie^{i\pi Q}M)$. Moreover:

$$1 + ie^{i\pi Q}M = (1 - ie^{i\pi Q}V)^{-1}\left(1 - ie^{i\pi Q}(V - e\otimes X)\right)$$

Finally, we remark that $(V - e\otimes X) = -\mu V\mu^{-1}$ so that

$$1 + ie^{i\pi Q}M = (1 - ie^{i\pi Q}V)^{-1}\mu(1 + ie^{i\pi Q}V)\mu^{-1}$$

Taking the determinant proves eq. (13.38). Equation (13.39) is obtained similarly. Equation (13.21) gives:

$$e^{2i\beta\phi} = \frac{\det^2(1 - ie^{i\pi Q}V)}{\det^2(1 + ie^{i\pi Q}V)}$$

which identifies with the tau-function formula. The parameters $a_n$, $\mu_n$ in Chapter 12 are related to $m_n$, $\lambda_n$ in this chapter by:

$$\lambda_n = i\mu_n, \qquad m_n = -2ie^{i\pi Q}\mu_n a_n \tag{13.40}$$
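For a single zero $\lambda_1 = i\mu$ the matrix $V$ of eq. (13.37) is the number $X_1/2$, and the determinant formula reduces to $e^{2i\beta\phi} = \left(\frac{1 - iw}{1 + iw}\right)^2$ with $w = e^{i\pi Q}V$. The sketch below evaluates the resulting profile. It rests on assumptions that are not in the text: the parameter `w0` stands for $e^{i\pi Q}m_1/(2\mu)$ and is taken real and nonzero, the numerical values are arbitrary, and the formula only fixes $\phi$ modulo $\pi/\beta$, so the continuous branch with $\phi(-\infty) = 0$ is chosen by hand.

```python
import cmath
import math

def one_soliton_phi(x, w0, mu, m, beta):
    """Continuous branch of phi(x) for N = 1.

    w(x) = w0 * exp(-2 m (mu + 1/mu) x), where w0 plays the role of
    e^{i pi Q} m_1 / (2 mu) (assumed real and nonzero).
    Keep |x| moderate: the exponential overflows for very negative x.
    """
    w = w0 * math.exp(-2.0 * m * (mu + 1.0 / mu) * x)
    sign = 1.0 if w0 > 0 else -1.0
    return (sign * math.pi - 2.0 * math.atan(w)) / beta

# illustrative parameters
m, mu, beta, w0 = 1.0, 1.5, 0.5, -0.7

phi_left = one_soliton_phi(-10.0, w0, mu, m, beta)   # close to 0
phi_right = one_soliton_phi(10.0, w0, mu, m, beta)   # close to sign(w0) * pi / beta

# consistency with e^{2 i beta phi} = ((1 - i w) / (1 + i w))^2 at a sample point
x0 = 0.2
w_x0 = w0 * math.exp(-2.0 * m * (mu + 1.0 / mu) * x0)
lhs = cmath.exp(2j * beta * one_soliton_phi(x0, w0, mu, m, beta))
rhs = ((1 - 1j * w_x0) / (1 + 1j * w_x0)) ** 2
```

The field interpolates between $0$ and $\pm\pi/\beta$, that is a topological charge $Q = \pm 1$, with the sign given by the sign of `w0` in this sketch.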
13.7 Poisson brackets of the scattering data

We now consider the sine-Gordon equation from the Hamiltonian point of view. The aim is to compute the Poisson brackets of the scattering data defined in the previous sections. We follow the method of the classical r-matrix introduced by Faddeev, Sklyanin and Takhtajan. We start from the canonical Poisson brackets:

$$\{\pi(x), \phi(y)\} = \delta(x - y)$$

and define the Hamiltonian on the interval $[-\ell, \ell]$, eventually $\ell$ will tend to $\infty$:

$$H = \int_{-\ell}^{\ell} dx\left[\frac{1}{2}\pi^2(x) + \frac{1}{2}(\partial_x\phi)^2(x) + \frac{4m^2}{\beta^2}\left(1 - \cos(2\beta\phi)\right)\right]$$

The equations of motion read $\dot\phi = \pi$ and $\dot\pi = \phi'' - \frac{8m^2}{\beta}\sin(2\beta\phi)$, which reproduces eq. (13.1). Consider the auxiliary linear problem eq. (13.7) on the interval $[-\ell, \ell]$. Let $\Psi(-\ell)$ be the value of a solution at $x = -\ell$ and $\Psi(\ell)$ its value at $x = \ell$. Then we can write:

$$\Psi(\ell) = T_\ell(\lambda)\Psi(-\ell)$$

where $T_\ell(\lambda)$ is the monodromy matrix.
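A quick sanity check of the relation between the monodromy matrix and the scattering data, eq. (13.41), is the vacuum $\phi = 0$, $Q = 0$, where one expects $a = 1$ and $b = 0$. The sketch below makes two assumptions that should be compared with the text: the vacuum connection is taken to be $A_x = ik(\lambda)\sigma_1$ (as read off from the gauge-transformed connection at $\phi = 0$), and the matrices `e_left`, `e_right` are the conjugating matrices $E_L$, $E_R$ as reconstructed from eq. (13.41). The function names are hypothetical.

```python
import cmath

def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][s] * q[s][j] for s in range(2)) for j in range(2)]
            for i in range(2)]

def vacuum_monodromy(k, ell):
    """exp(2 i k ell sigma_1): transport from -ell to +ell for the constant connection i k sigma_1."""
    c = cmath.cos(2 * k * ell)
    s = 1j * cmath.sin(2 * k * ell)
    return [[c, s], [s, c]]

def scattering_matrix(t_ell, k, ell, q=0):
    """(1/2) E_L T_ell E_R, with E_L and E_R as reconstructed from eq. (13.41)."""
    sign = cmath.exp(1j * cmath.pi * q)
    ep, em = cmath.exp(1j * k * ell), cmath.exp(-1j * k * ell)
    e_left = [[sign * ep, -ep], [em, sign * em]]
    e_right = [[ep, em], [-ep, em]]
    prod = matmul(matmul(e_left, t_ell), e_right)
    return [[0.5 * z for z in row] for row in prod]

m, lam, ell = 1.0, 1.7, 5.0
k = m * (lam - 1.0 / lam)
T = scattering_matrix(vacuum_monodromy(k, ell), k, ell)
# T should be the identity: a = T[0][0] = 1 and b = T[0][1] = 0, independently of ell
```

The $\ell$ dependence cancels exactly in the vacuum, which is the role of the oscillating factors in $E_L$ and $E_R$.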
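Proposition. The monodromy matrix and the scattering data are directly related by:

$$T(\lambda) \equiv \begin{pmatrix} a(\lambda) & b(\lambda) \\ -b^*(\lambda) & a^*(\lambda) \end{pmatrix} = \frac{1}{2}\lim_{\ell\to\infty}\begin{pmatrix} e^{i\pi Q} e^{ik(\lambda)\ell} & -e^{ik(\lambda)\ell} \\ e^{-ik(\lambda)\ell} & e^{i\pi Q} e^{-ik(\lambda)\ell} \end{pmatrix} T_\ell(\lambda)\begin{pmatrix} e^{ik(\lambda)\ell} & e^{-ik(\lambda)\ell} \\ -e^{ik(\lambda)\ell} & e^{-ik(\lambda)\ell} \end{pmatrix} \tag{13.41}$$

Proof. From the asymptotic form of the Jost solutions and the relation $|a|^2 + |b|^2 = 1$ we find

$$\frac{1}{a} f_1 + \frac{b}{a} f_2 \sim \begin{cases} e^{ik(\lambda)x}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, & x \to -\infty \\[2mm] a^* e^{ik(\lambda)x}\begin{pmatrix} 1 \\ e^{i\pi Q} \end{pmatrix} + b\, e^{-ik(\lambda)x}\begin{pmatrix} e^{i\pi Q} \\ -1 \end{pmatrix}, & x \to +\infty \end{cases}$$

Any solution $\Psi(x)$ can be written as $\psi_1\left(\frac{1}{a} f_1 + \frac{b}{a} f_2\right) + \psi_2 f_2$ and behaves at $x = -\ell$ as:

$$\Psi(-\ell) = \begin{pmatrix} e^{-ik(\lambda)\ell} & e^{ik(\lambda)\ell} \\ e^{-ik(\lambda)\ell} & -e^{ik(\lambda)\ell} \end{pmatrix}\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$$

while at $x = +\ell$ it behaves as:

$$\Psi(\ell) = \begin{pmatrix} a^* e^{ik(\lambda)\ell} + b e^{i\pi Q} e^{-ik(\lambda)\ell} & -b^* e^{ik(\lambda)\ell} + a e^{i\pi Q} e^{-ik(\lambda)\ell} \\ a^* e^{i\pi Q} e^{ik(\lambda)\ell} - b e^{-ik(\lambda)\ell} & -b^* e^{i\pi Q} e^{ik(\lambda)\ell} - a e^{-ik(\lambda)\ell} \end{pmatrix}\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$$

The result follows from writing $\Psi(\ell) = T_\ell(\lambda)\Psi(-\ell)$, and conjugating the relation obtained by $\sigma_1$.

To compute the Poisson brackets of the scattering data we need the r-matrix relation:

Proposition.

$$\{T_{\ell,1}(\lambda), T_{\ell,2}(\mu)\} = [r_{12}(\lambda,\mu), T_\ell(\lambda)\otimes T_\ell(\mu)] \tag{13.42}$$

where the matrix $r_{12}(\lambda,\mu)$ is given by

$$r_{12}(\lambda,\mu) = \frac{\beta^2}{4}\left[\frac{\lambda^2 + \mu^2}{\lambda^2 - \mu^2}\, H\otimes H + \frac{4\lambda\mu}{\lambda^2 - \mu^2}\left(E_+\otimes E_- + E_-\otimes E_+\right)\right] \tag{13.43}$$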
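Proof. As we know from Chapter 3, see eqs. (3.90, 3.91), it suffices to prove the much simpler local relation:

$$\{A_{x,1}(\lambda,x), A_{x,2}(\mu,y)\} = [r_{12}(\lambda,\mu), A_x(\lambda,x)\otimes I + I\otimes A_x(\mu,y)]\,\delta(x - y) \tag{13.44}$$

The r-matrix is obtained, up to a factor, by the general formula for Toda models,

$$r_{12} = \frac{1}{2}\left(r_{12}^+ + r_{12}^-\right)$$

The formulae for $r_{12}^\pm$ are given by eqs. (12.28, 12.29) in Chapter 12 in which we insert the root decomposition, eq. (12.59), in the same chapter. We get in the loop representation:

$$r_{12}^+ = \frac{1}{4} H\otimes H + \sum_{n>0}\left(\frac{1}{2}\lambda^{2n} H\otimes\mu^{-2n} H + \lambda^{2n-1} E_+\otimes\mu^{-2n+1} E_- + \lambda^{2n-1} E_-\otimes\mu^{-2n+1} E_+\right)$$

(we take as invariant bilinear form on $sl(2)$ the trace in the $2\times 2$ representation, so that $(H, H) = 2$ and $(E_+, E_-) = 1$. This accounts for the relative factors). Summing the geometric series yields eq. (13.43) in which $|\lambda| < |\mu|$. The formula for $r_{12}^-$ is the same but with $|\lambda| > |\mu|$. The antisymmetric matrix $r_{12}$ is just the half sum of these two identical rational functions for $\lambda \neq \mu$. It is easy to check that the relation eq. (13.44) holds true, and this allows us to adjust the factor $\beta^2$.

Proposition. The complete list of Poisson brackets of the scattering data is:

(1) continuum–continuum:

$$\{a(\lambda), b(\mu)\} = \beta^2\, \frac{\lambda\mu}{(\lambda + \mu)(\lambda - \mu + i0)}\, a(\lambda) b(\mu)$$

$$\{a(\lambda), b^*(\mu)\} = -\beta^2\, \frac{\lambda\mu}{(\lambda + \mu)(\lambda - \mu + i0)}\, a(\lambda) b^*(\mu)$$

$$\{b(\lambda), b^*(\mu)\} = -i\pi\beta^2\lambda\, |a(\lambda)|^2\, \delta(\lambda - \mu)$$

$$\{a(\lambda), a(\mu)\} = 0, \qquad \{a(\lambda), a^*(\mu)\} = 0, \qquad \{b(\lambda), b(\mu)\} = 0$$

(2) continuum–discrete:

$$\{a(\lambda), \lambda_n\} = 0, \qquad \{a(\lambda), \lambda_n^*\} = 0$$

$$\{b(\lambda), \lambda_n\} = 0, \qquad \{b(\lambda), \lambda_n^*\} = 0$$

$$\{a(\lambda), c_n\} = -\beta^2\, \frac{\lambda\lambda_n}{\lambda^2 - \lambda_n^2}\, a(\lambda) c_n, \qquad \{a(\lambda), c_n^*\} = \beta^2\, \frac{\lambda\lambda_n^*}{\lambda^2 - \lambda_n^{*2}}\, a(\lambda) c_n^*$$

$$\{b(\lambda), c_n\} = 0, \qquad \{b(\lambda), c_n^*\} = 0$$

(3) discrete–discrete:

$$\{\lambda_n, \lambda_m\} = 0, \qquad \{\lambda_n, \lambda_m^*\} = 0$$

$$\{c_n, c_m\} = 0, \qquad \{c_n, c_m^*\} = 0$$

$$\{\lambda_n, c_m\} = \frac{\beta^2}{2}\lambda_n c_m\delta_{nm}, \qquad \{\lambda_n^*, c_m\} = 0 \ \text{ unless } \lambda_m = -\lambda_n^* \tag{13.45}$$

Proof. Define $E_L(\lambda)$ and $E_R(\lambda)$ as the matrices:

$$E_L(\lambda) = \begin{pmatrix} e^{i\pi Q} e^{ik(\lambda)\ell} & -e^{ik(\lambda)\ell} \\ e^{-ik(\lambda)\ell} & e^{i\pi Q} e^{-ik(\lambda)\ell} \end{pmatrix}, \qquad E_R(\lambda) = \begin{pmatrix} e^{ik(\lambda)\ell} & e^{-ik(\lambda)\ell} \\ -e^{ik(\lambda)\ell} & e^{-ik(\lambda)\ell} \end{pmatrix}$$

so that $T(\lambda) = \frac{1}{2}\lim_{\ell\to\infty} E_L(\lambda) T_\ell(\lambda) E_R(\lambda)$. Inserting this into eq. (13.42), we get the Poisson brackets of the elements of the matrix $T(\lambda)$:

$$\{T_1(\lambda), T_2(\mu)\} = r_+(\lambda,\mu) T_1(\lambda) T_2(\mu) - T_1(\lambda) T_2(\mu) r_-(\lambda,\mu)$$

where we have defined:

$$r_+(\lambda,\mu) = E_L(\lambda)\otimes E_L(\mu)\, r_{12}(\lambda,\mu)\, E_L^{-1}(\lambda)\otimes E_L^{-1}(\mu), \quad \ell\to\infty$$

$$r_-(\lambda,\mu) = E_R^{-1}(\lambda)\otimes E_R^{-1}(\mu)\, r_{12}(\lambda,\mu)\, E_R(\lambda)\otimes E_R(\mu), \quad \ell\to\infty$$

In order to evaluate these r-matrices, one first computes:

$$E_L H E_L^{-1} = e^{i\pi Q}\left(e^{2ik\ell} E_+ + e^{-2ik\ell} E_-\right)$$

$$E_L E_\pm E_L^{-1} = \frac{1}{2}\left(-e^{i\pi Q} H \pm e^{2ik\ell} E_+ \mp e^{-2ik\ell} E_-\right)$$

$$E_R^{-1} H E_R = e^{-2ik\ell} E_+ + e^{2ik\ell} E_-$$

$$E_R^{-1} E_\pm E_R = \frac{1}{2}\left(-H \pm e^{-2ik\ell} E_+ \mp e^{2ik\ell} E_-\right)$$

and then obtain $r_\pm$, before taking the $\ell \to \infty$ limit:

$$\frac{4}{\beta^2} r_\pm(\lambda,\mu) = \frac{2\lambda\mu}{\lambda^2 - \mu^2}\, H\otimes H + \frac{\lambda - \mu}{\lambda + \mu}\, e^{\pm 2i(k(\lambda) + k(\mu))\ell}\, E_+\otimes E_+ + \frac{\lambda - \mu}{\lambda + \mu}\, e^{\mp 2i(k(\lambda) + k(\mu))\ell}\, E_-\otimes E_-$$

$$\qquad\qquad + \frac{\lambda + \mu}{\lambda - \mu}\, e^{\pm 2i(k(\lambda) - k(\mu))\ell}\, E_+\otimes E_- + \frac{\lambda + \mu}{\lambda - \mu}\, e^{\mp 2i(k(\lambda) - k(\mu))\ell}\, E_-\otimes E_+$$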
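We take the limit $\ell \to \infty$ in the sense of distribution theory. This is done using the formula:

$$\lim_{\ell\to\infty} P\, \frac{1}{x}\, e^{\pm i\ell x} = \pm i\pi\delta(x)$$

Indeed, if $f$ is analytic with slow growth at $\infty$ and $x > 0$ one can compute

$$\left(\int_{-\infty}^{-\epsilon} + \int_{\epsilon}^{\infty}\right)\frac{1}{x}\, e^{\pm i\ell x} f(x)\, dx = i\pi f(0)$$

by considering the closed contour obtained by adding a half-circle $C_\epsilon$ and a half-circle at infinity. The integral on this last circle vanishes in the limit $\ell \to \infty$, and the integral on $C_\epsilon$ gives $-i\pi f(0)$. From this we deduce a formula more suited to our case,

$$\lim_{\ell\to\infty} P\, \frac{1}{\lambda - \mu}\, e^{\pm 2i(k(\lambda) - k(\mu))\ell} = \pm i\pi\delta(\lambda - \mu)$$

which is obtained by the change of variables $\lambda - \mu \to 2(k(\lambda) - k(\mu))$ in the delta function. We have similar formulae for $\lambda + \mu$, but due to the symmetry properties under $\lambda \to -\lambda$, $a(-\lambda) = e^{i\pi Q} a^*(\lambda)$ and $b(-\lambda) = -e^{-i\pi Q} b^*(\lambda)$, we can restrict ourselves to $\lambda > 0$ and $\mu > 0$, and ignore the terms in $\delta(\lambda + \mu)$. We can now take the limit $\ell \to \infty$ in $r_\pm$, getting:

$$r_\pm(\lambda,\mu) = \frac{\beta^2}{8}\left[P\, \frac{\lambda + \mu}{\lambda - \mu} - \frac{\lambda - \mu}{\lambda + \mu}\right] H\otimes H \mp \frac{i\pi\beta^2}{4}(\lambda + \mu)\left(E_+\otimes E_- - E_-\otimes E_+\right)\delta(\lambda - \mu)$$

From this we compute the Poisson brackets of the scattering data. For instance:

$$\{a(\lambda), b(\mu)\} = \beta^2\left[\frac{\lambda\mu}{\lambda^2 - \mu^2}\, a(\lambda) b(\mu) - \frac{i\pi}{4}(\lambda + \mu)\delta(\lambda - \mu)\, b(\lambda) a(\mu)\right]$$

$$= \frac{\beta^2}{4}\left[(\lambda + \mu)\left(P\, \frac{1}{\lambda - \mu} - i\pi\delta(\mu - \lambda)\right) - \frac{\lambda - \mu}{\lambda + \mu}\right] a(\lambda) b(\mu)$$

$$= \beta^2\, \frac{\lambda\mu}{(\lambda - \mu + i0)(\lambda + \mu)}\, a(\lambda) b(\mu)$$

where in the last step we have used the identity:

$$\frac{1}{x \pm i0} = P\, \frac{1}{x} \mp i\pi\delta(x)$$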
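The distributional limit $P\,\frac{1}{x}e^{i\ell x} \to i\pi\delta(x)$ used above can be watched numerically on a test function. The sketch below takes $f(x) = e^{-x^2}$, an assumption made for the example. For an even $f$ the cosine part of the principal-value integral cancels by parity, so only $\int \sin(\ell x) f(x)/x\, dx$ survives, and for this Gaussian it equals $\pi\,\mathrm{erf}(\ell/2)$, which tends to $\pi f(0) = \pi$. The grid size and cutoff are illustrative.

```python
import math

def oscillatory_pv(ell, half_width=8.0, n=20000):
    """Midpoint approximation of the integral of sin(ell*x) * exp(-x^2) / x over the real line.

    With n even, the symmetric midpoint grid never hits x = 0.
    """
    step = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        x = -half_width + (j + 0.5) * step
        total += math.sin(ell * x) * math.exp(-x * x) / x
    return total * step

# numerical value versus the closed form pi * erf(ell / 2)
values = {ell: (oscillatory_pv(ell), math.pi * math.erf(ell / 2.0))
          for ell in (1.0, 5.0, 20.0)}
```

Already at $\ell = 5$ the value is within a fraction of a percent of $\pi$, which illustrates how quickly the oscillating factor localizes the integral at $x = 0$.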
Note that the left-hand side and the right-hand side are analytic in the upper $\lambda$ half-plane, as it should be. The other Poisson brackets are computed similarly. There are sixteen Poisson brackets in $\{T_1, T_2\}$ but the independent ones are listed in the proposition. The Poisson brackets for the discrete spectrum would require a detailed analysis, but they can be obtained quickly using the following trick. We insisted already on the fact that the function $b(\mu)$ cannot be analytically continued in the upper half-plane. The situation changes, however, if the field $\phi(x)$ is compactly supported. Then $b(\mu)$ is analytic in the plane, and we have seen that:

$$c_n = -\frac{1}{b(\lambda_n)}$$

Assuming that we are in such a case, setting $\mu = \lambda_m$ into the equation for $\{a(\lambda), b(\mu)\}$, we get immediately $\{a(\lambda), c_m\} = -\beta^2\lambda\lambda_m/(\lambda^2 - \lambda_m^2)\, a(\lambda) c_m$. Letting further $a(\lambda) = (\lambda - \lambda_n)\tilde a_n$ in this equation with $\lambda \to \lambda_n$, one gets $\{\lambda_n, c_m\} = \frac{\beta^2}{2}\lambda_n c_m\delta_{nm}$. The remaining Poisson brackets are computed similarly.

13.8 Action–angle variables

Due to the boundary conditions of the field $\phi$, which differ at $x = -\infty$ and $x = +\infty$, the generating function of conserved quantities is a modified trace of the monodromy matrix, specifically $\mathrm{Tr}(T_\ell(\lambda)\rho)$, where $\rho$ is the diagonal matrix $\rho = \mathrm{Diag}(e^{-\frac{i\pi Q}{2}}, e^{\frac{i\pi Q}{2}})$. This is a consequence of the zero curvature condition which implies (see eq. (3.72) in Chapter 3):

$$\partial_t T_\ell(\lambda,t) = A_t(\lambda,\ell) T_\ell(\lambda,t) - T_\ell(\lambda,t) A_t(\lambda,-\ell)$$

and the explicit values $A_t(\lambda,\ell) = -im(\lambda + \lambda^{-1}) e^{i\pi Q}\sigma_1$ and $A_t(\lambda,-\ell) = -im(\lambda + \lambda^{-1})\sigma_1$ for $\ell \to +\infty$. Computing $T_\ell(\lambda,t)$ from eq. (13.41), we can express $\mathrm{Tr}(T_\ell(\lambda)\rho)$ in terms of the scattering data:

$$\mathrm{Tr}(T_\ell(\lambda)\rho) \sim e^{\frac{i\pi Q}{2}} e^{-2ik(\lambda)\ell} a(\lambda) + e^{-\frac{i\pi Q}{2}} e^{2ik(\lambda)\ell} a^*(\lambda)$$

So the generating functional for conserved quantities can be taken as $a(\lambda)$. Since $\{a(\lambda), a(\mu)\} = 0$ these conserved quantities Poisson commute. Remembering the asymptotics

$$a(\lambda) = e^{-\frac{i\pi Q}{2}} + O\!\left(\frac{1}{|\lambda|}\right), \quad |\lambda| \to \infty; \qquad a(\lambda) = e^{\frac{i\pi Q}{2}} + O(|\lambda|), \quad |\lambda| \to 0$$

we see that we can expand $\log a(\lambda)$ around $\lambda = 0$ and $\lambda = \infty$:

$$\log a(\lambda) = -\frac{i\pi Q}{2} + \sum_{n=1}^{\infty} I_n^+ (i\lambda)^{-n}, \quad |\lambda| \to \infty \tag{13.46}$$

$$\log a(\lambda) = \frac{i\pi Q}{2} + \sum_{n=1}^{\infty} I_n^- (i\lambda)^{n}, \quad |\lambda| \to 0 \tag{13.47}$$
We will now calculate the $I_n^\pm$ in two different ways. The first one will give $I_n^\pm$ in terms of the original sine-Gordon field $\phi$, while the second one will express $I_n^\pm$ in terms of the scattering data. In order to compute $I_n^+$ in terms of $\phi$, we perform a gauge transformation $\Psi \to e^{i\frac{\beta}{2}\phi\sigma_3}\tilde\Psi$, so that the gauge transformed connection $\tilde A_x$ reads:

$$\tilde A_x = i\frac{\beta}{2}(\dot\phi - \phi') H + im\lambda(1 - \lambda^{-2} e^{-2i\beta\phi}) E_+ + im\lambda(1 - \lambda^{-2} e^{2i\beta\phi}) E_-$$

Note that $\tilde A_x$ takes the same value at $x = \pm\infty$ and we have $\mathrm{Tr}(\tilde T_\ell(\lambda)) = \mathrm{Tr}(T_\ell(\lambda)\rho)$. To compute this trace we can directly apply eq. (11.8) in Chapter 11, where we have found that

$$\mathrm{Tr}(\tilde T_\ell(\lambda)) = e^{P(\lambda)} + e^{-P(\lambda)}$$

For smooth ($C^\infty$) fields $\phi$, the quantity $P(\lambda)$ admits asymptotic expansions for $\lambda \to 0$ and $\lambda \to \infty$ which can be found using the Riccati equation. The coefficients of this asymptotic expansion are integrals over local densities in the field $\phi$ containing higher and higher derivatives. Hence the smoothness condition is essential for their existence. On the other hand, for such smooth fields we have seen that $b(\lambda)$ vanishes at $\lambda = 0, \infty$ as well as all its derivatives. This means that $b(\lambda)$ has zero asymptotic expansion at these points. Since $|a|^2 = 1 - |b|^2$ we see that $|a| = 1$ in the asymptotic sense, or $a(\lambda)^* \sim 1/a(\lambda)$. Using this fact, we can compare the asymptotic expansions in both sides of the equation:

$$e^{P(\lambda)} + e^{-P(\lambda)} = e^{\frac{i\pi Q}{2}} e^{-2ik(\lambda)\ell} a(\lambda) + e^{-\frac{i\pi Q}{2}} e^{2ik(\lambda)\ell} a^*(\lambda)$$

and identify in the asymptotic sense:

$$P(\lambda) = \frac{i\pi Q}{2} - 2ik(\lambda)\ell + \log a(\lambda), \quad \lambda \to \infty$$

To compute the left-hand side, we recall that $P(\lambda) = \int_{-\ell}^{\ell} v(x,\lambda)\, dx$, where $v(x,\lambda)$ is a solution of the Riccati equation $v' + v^2 = V$ and $V$ is determined in eq. (11.8) in terms of $\tilde A_x$. We compute this expansion at the lowest non-trivial order, so that $O(\lambda^{-2})$ terms are neglected. We get:

$$V = -m^2\lambda^2 - \frac{\beta^2}{4}(\dot\phi - \phi')^2 + 2m^2\cos(2\beta\phi) - i\frac{\beta}{2}(\dot\phi - \phi')' + O(\lambda^{-2})$$
One inserts $v = \pm im\lambda + \cdots$ and observes that there is no $O(1)$ term. Up to a choice of sign, and substituting $k(\lambda) = m\lambda - m\lambda^{-1}$, we have:

$$v = -ik(\lambda) - \frac{im}{\lambda}\left[\frac{\beta^2}{8m^2}(\dot\phi - \phi')^2 + 1 - \cos(2\beta\phi) + \frac{i\beta}{4m^2}(\dot\phi - \phi')'\right] + O\!\left(\frac{1}{\lambda^2}\right)$$

Since $(\dot\phi - \phi')$ vanishes at $x = \pm\infty$, we obtain:

$$P(\lambda) = -2ik(\lambda)\ell - im\lambda^{-1}\frac{\beta^2}{4m^2}(H - P) + O(\lambda^{-2})$$

where:

$$H = \int_{-\ell}^{\ell}\left[\frac{1}{2}\dot\phi^2 + \frac{1}{2}\phi'^2 + \frac{4m^2}{\beta^2}\left(1 - \cos(2\beta\phi)\right)\right] dx, \qquad P = \int_{-\ell}^{\ell}\dot\phi\,\phi'\, dx$$

Similarly, to compute $I_n^-$ ($\lambda \to 0$) we perform the gauge transformation $\Psi \to e^{-i\frac{\beta}{2}\phi\sigma_3}\tilde\Psi$, so that the gauge transformed connection $\tilde A_x$ reads:

$$\tilde A_x = i\frac{\beta}{2}(\dot\phi + \phi') H - im\lambda^{-1}(1 - \lambda^2 e^{2i\beta\phi}) E_+ - im\lambda^{-1}(1 - \lambda^2 e^{-2i\beta\phi}) E_-$$

Note, however, that, due to the sign change in the gauge transformation, we now have $\mathrm{Tr}\,\tilde T_\ell = \mathrm{Tr}(T_\ell\rho^{-1}) = e^{i\pi Q}\,\mathrm{Tr}(T_\ell\rho)$. The same computation as above yields:

$$P(\lambda) = -2ik(\lambda)\ell + im\lambda\frac{\beta^2}{4m^2}(H + P) + O(\lambda^2)$$

Comparing with the asymptotic expansions in:

$$\mathrm{Tr}\,\tilde T_\ell = e^{P(\lambda)} + e^{-P(\lambda)} = e^{-i\pi Q}\left(e^{\frac{i\pi Q}{2}} e^{-2ik(\lambda)\ell} a(\lambda) + e^{-\frac{i\pi Q}{2}} e^{2ik(\lambda)\ell} a^*(\lambda)\right)$$

we find

$$P(\lambda) = -\frac{i\pi Q}{2} - 2ik(\lambda)\ell + \log a(\lambda), \quad \lambda \to 0$$

from which we identify:

$$I_1^\pm = \frac{\beta^2}{4m}(H \mp P)$$
513
13.8 Action--angle variables iπQ
iπQ
and obeys (e 2 a(−λ∗ ))∗ = e 2 a(λ). Therefore we can reconstruct a(λ) if we know its modulus on the real axis and the position of its zeroes. ∞ λ − λi iπQ 1 log |a(µ)| exp − e 2 a(λ) = dµ (13.48) λ − λ∗i iπ −∞ λ − µ + i0 i
The right-hand side is analytic in the upper half-plane. It is invariant under complex conjugation and changing $\lambda \to -\lambda^*$ because for each $\lambda_i$ there is a $\lambda_j = -\lambda_i^*$ and one can change variables $\mu \to -\mu$ in the integral ($|a(-\mu)| = |a(\mu)|$). For $\lambda \to \infty$ one gets the correct asymptotic value 1. Finally, for $\lambda$ real, using:
$$\frac{1}{\lambda - \mu + i0} = P\,\frac{1}{\lambda - \mu} - i\pi\,\delta(\lambda - \mu)$$
the modulus of the right-hand side is $\exp\left(\int \log |a(\mu)|\,\delta(\lambda - \mu)\,d\mu\right) = |a(\lambda)|$. Using the asymptotic value at $\lambda = 0$, and noting that the integral vanishes for $\lambda = 0$ (by $\mu \to -\mu$), we must have $e^{i\pi Q} = \prod_i (\lambda_i / \lambda_i^*)$. Note that a pair of roots symmetric with respect to the imaginary axis contribute a $+1$ in this product, but pure imaginary roots contribute a $-1$, so, modulo 2, $Q$ must be equal to the number of pure imaginary roots, i.e. the total number of solitons and antisolitons must have the same parity as the topological charge $Q$. This is as it should be since a soliton has $Q = 1$ and an antisoliton has $Q = -1$.

On the real axis we have $|a(\lambda)|^2 = 1 - |b(\lambda)|^2$ and we can replace $\frac{1}{\pi} \log |a(\lambda)|$ by
$$\rho(\lambda) = \frac{1}{2\pi} \log\left(1 - |b(\lambda)|^2\right)$$
in eq. (13.48). Thus $a(\lambda)$ can be reconstructed from the knowledge of $|b(\lambda)|$ on the real axis and the zeroes $\lambda_n$. For a smooth field $\varphi$, $\rho(\mu)$ decreases fast at $\mu \to 0, \infty$, so that one can expand $1/(\lambda - \mu)$ in powers of $\lambda/\mu$ or $\mu/\lambda$ in the integral. We get asymptotic expansions at $\lambda \to \infty$:
$$\log a(\lambda) = -\frac{i\pi Q}{2} + \sum_{n=1}^{\infty} \frac{1}{\lambda^n} \left[ -\sum_j \frac{\lambda_j^n - \lambda_j^{*n}}{n} + i \int_{-\infty}^{\infty} \mu^{n-1} \rho(\mu)\, d\mu \right]$$
and at $\lambda \to 0$:
$$\log a(\lambda) = \frac{i\pi Q}{2} + \sum_{n=1}^{\infty} \lambda^n \left[ -\frac{1}{n} \sum_j \left( \frac{1}{\lambda_j^n} - \frac{1}{\lambda_j^{*n}} \right) - i \int_{-\infty}^{\infty} \mu^{-n-1} \rho(\mu)\, d\mu \right]$$
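The solitonic part of these expansions can be checked numerically. The sketch below assumes a reflectionless configuration ($\rho = 0$), so that $e^{i\pi Q/2} a(\lambda)$ reduces to the product over zeroes in eq. (13.48); the function names and the sample zeroes are illustrative choices, not taken from the text. It compares the product with the exponentiated $1/\lambda$ series, and evaluates $\prod_j \lambda_j/\lambda_j^*$ to see the parity rule for $Q$.

```python
import cmath

def a_solitonic(lam, zeros):
    """prod_j (lam - l_j)/(lam - conj(l_j)): e^{i pi Q/2} a(lam) when rho = 0 (assumption)."""
    out = 1.0 + 0.0j
    for z in zeros:
        out *= (lam - z) / (lam - z.conjugate())
    return out

def log_a_series_at_infinity(lam, zeros, n_max=60):
    """Truncated series -sum_n lam^{-n} sum_j (l_j^n - conj(l_j)^n)/n."""
    total = 0.0j
    for n in range(1, n_max + 1):
        coeff = sum(z**n - z.conjugate()**n for z in zeros) / n
        total -= coeff / lam**n
    return total

# Hypothetical zeroes: one pure imaginary root and one pair symmetric
# with respect to the imaginary axis (closed under l -> -conj(l)).
zeros = [0.5j, 0.3 + 0.4j, -0.3 + 0.4j]
lam = 3.0 + 1.0j

exact = a_solitonic(lam, zeros)
series = cmath.exp(log_a_series_at_infinity(lam, zeros))
reflected = a_solitonic(-lam.conjugate(), zeros).conjugate()
phase_at_zero = a_solitonic(0.0, zeros)  # equals prod_j l_j / conj(l_j)
```

With one pure imaginary root, `phase_at_zero` comes out as $-1$, corresponding to odd $Q$, while the symmetric pair contributes $+1$.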
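Comparing with eqs. (13.46, 13.47), we identify the $I_n^{\pm}$. We find $I_n^{\pm} = 0$ for $n$ even, and for $n$ odd $I_n^{\pm} = \pm I_{\pm n}$, where $I_n$ is defined for $n \in \mathbb{Z}$ by:
$$I_n = (-1)^{(n-1)/2} \left[ \sum_j \frac{\lambda_j^n - \lambda_j^{*n}}{i n} - \int_{-\infty}^{\infty} \mu^{n-1} \rho(\mu)\, d\mu \right] \in \mathbb{R}$$
In particular, for $n = \pm 1$ we obtain:
$$H \pm P = \frac{4m}{\beta^2} \left[ \pm \sum_j \frac{\lambda_j^{\pm 1} - \lambda_j^{*\pm 1}}{i} - \int_{-\infty}^{\infty} \mu^{\pm 1} \rho(\mu)\, \frac{d\mu}{\mu} \right]$$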
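If, for simplicity, we don't consider breathers and set $\lambda_j = i\xi_j$, we get:
$$H = \frac{4m}{\beta^2} \left[ \sum_j (\xi_j + \xi_j^{-1}) + \int_0^{\infty} (\mu + \mu^{-1}) |\rho(\mu)|\, \frac{d\mu}{\mu} \right]$$
$$P = \frac{4m}{\beta^2} \left[ \sum_j (\xi_j - \xi_j^{-1}) + \int_0^{\infty} (\mu - \mu^{-1}) |\rho(\mu)|\, \frac{d\mu}{\mu} \right]$$
Setting $k_j = \frac{8}{\beta^2} \frac{m}{2} (\xi_j - \xi_j^{-1})$, $M = \frac{8m}{\beta^2}$ and $k = \frac{m}{2} (\mu - \mu^{-1})$, this can be written as:
$$H = \sum_j \sqrt{k_j^2 + M^2} + \frac{8}{\beta^2} \int_{-\infty}^{\infty} \sqrt{k^2 + m^2}\, |\rho(k)|\, \frac{dk}{\sqrt{k^2 + m^2}}$$
$$P = \sum_j k_j + \frac{8}{\beta^2} \int_{-\infty}^{\infty} k\, |\rho(k)|\, \frac{dk}{\sqrt{k^2 + m^2}}$$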
which nicely exhibits the decomposition of the theory into a sum of relativistic modes. Note that the soliton $j$ has mass $M$ and momentum $k_j$, while the continuous spectrum is a superposition of modes of mass $m$.

Remark. One can extract the Poisson brackets of the solitonic modes from eq. (13.45) and recover eq. (12.81) in Chapter 12. The parameters $a_n$, $\mu_n$ are related to $m_n$, $\lambda_n$ by eq. (13.40). Moreover, from eqs. (13.32, 13.48), for purely solitonic solutions we have:
$$m_n = \frac{c_n}{a'(\lambda_n)}, \qquad a(\lambda) = e^{-i\frac{\pi Q}{2}} \prod_j \frac{\lambda - \lambda_j}{\lambda + \lambda_j}$$
so that:
$$a_n = 2i\, e^{i\frac{\pi Q}{2}}\, c_n \prod_{j \neq n} \frac{\mu_n + \mu_j}{\mu_n - \mu_j}$$
Then a straightforward computation using eqs. (13.45), which mean that $\log \mu_n$ and $\log c_n$ are canonically conjugated variables, yields:
$$\{\mu_i, \mu_j\} = 0, \qquad \{\mu_i, a_j\} = \frac{\beta^2}{2}\, \mu_i a_j\, \delta_{ij}, \qquad \{a_i, a_j\} = \frac{\beta^2}{2}\, \frac{4 \mu_i \mu_j\, a_i a_j}{\mu_i^2 - \mu_j^2}$$
which identifies with eq. (12.81) up to the factor $\beta^2$ (we should set $\beta = -i$ to compare the two formulae).
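The relativistic form of $H$ and $P$ obtained above rests on two elementary identities: $\sqrt{k_j^2 + M^2} = \frac{4m}{\beta^2}(\xi_j + \xi_j^{-1})$ for the solitons, and $d\mu/\mu = dk/\sqrt{k^2 + m^2}$ for the continuous modes. A minimal numerical check follows; the parameter values and function names are arbitrary illustrations.

```python
import math

def soliton_energy_momentum(xi, m, beta):
    """Soliton contributions (4m/beta^2)(xi + 1/xi) and (4m/beta^2)(xi - 1/xi)."""
    pref = 4.0 * m / beta**2
    return pref * (xi + 1.0 / xi), pref * (xi - 1.0 / xi)

def check_soliton(xi, m=1.3, beta=0.7):
    """Return (energy, sqrt(k^2 + M^2)) with M = 8m/beta^2; the two should agree."""
    energy, k = soliton_energy_momentum(xi, m, beta)
    big_m = 8.0 * m / beta**2
    return energy, math.sqrt(k * k + big_m * big_m)

def check_measure(mu, m=1.3, h=1e-6):
    """Return (dk/dmu by central difference, sqrt(k^2 + m^2)/mu) for k = (m/2)(mu - 1/mu)."""
    def k(u):
        return 0.5 * m * (u - 1.0 / u)
    dk_dmu = (k(mu + h) - k(mu - h)) / (2.0 * h)
    return dk_dmu, math.sqrt(k(mu) ** 2 + m * m) / mu

soliton_pairs = [check_soliton(xi) for xi in (0.2, 1.0, 3.5)]
measure_pairs = [check_measure(mu) for mu in (0.3, 1.0, 4.0)]
```

Each pair in `soliton_pairs` and `measure_pairs` should agree up to rounding and finite-difference error.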
With this, we end this chapter on the classical inverse scattering method, thereby paying due tribute to Gardner, Greene, Kruskal and Miura, without whom this book would not exist.

References
[1] C.S. Gardner, J.M. Greene, M.D. Kruskal and R.M. Miura, Method for solving the Korteweg–de Vries equation. Phys. Rev. Lett. 19 (1967) 1095.
[2] E.K. Sklyanin, On the complete integrability of the Landau–Lifchitz equation. Preprint LOMI E-3-79. Leningrad (1979).
[3] S. Novikov, S.V. Manakov, L.P. Pitaevskii and V.E. Zakharov, Theory of Solitons, the Inverse Scattering Method. Consultants Bureau, NY (1984).
[4] L.D. Faddeev and L.A. Takhtajan, Hamiltonian Methods in the Theory of Solitons. Springer (1986).
14 Symplectic geometry
The aim of this chapter is to provide a concise presentation of classical mechanics in the framework of symplectic geometry. This geometrical approach to mechanics is essential to gain any understanding of integrable systems theory. We assume the reader has some basic knowledge of elementary differential geometry and differential forms, but we present all the symplectic theory we need. We then explain the notion of symplectic reduction under a Lie group action, a concept which frequently appears in our discussion of integrable systems. The chapter ends with a discussion of a more recent topic, Poisson–Lie groups, which is used in the analysis of dressing transformations, and whose importance has to be stressed in connection with quantum group theory.
14.1 Poisson manifolds and symplectic manifolds

In this chapter we investigate the formulation of classical mechanics using Poisson brackets. We work on a phase space $M$ and consider the differentiable functions on $M$. We denote by $\mathcal{F}(M)$ the algebra of such functions. A Poisson bracket is a bilinear antisymmetric derivation of the algebra $\mathcal{F}(M)$, $\{\ ,\ \} : \mathcal{F}(M) \times \mathcal{F}(M) \to \mathcal{F}(M)$, such that:
$$\begin{aligned}
&\{f_1, f_2\} = -\{f_2, f_1\}, && \text{antisymmetry} \\
&\{f_1, \alpha f_2 + \beta f_3\} = \alpha \{f_1, f_2\} + \beta \{f_1, f_3\}, && \alpha, \beta \text{ constants} \\
&\{f_1, f_2 f_3\} = \{f_1, f_2\} f_3 + f_2 \{f_1, f_3\}, && \text{Leibniz rule} \\
&\{f_1, \{f_2, f_3\}\} + \{f_3, \{f_1, f_2\}\} + \{f_2, \{f_3, f_1\}\} = 0, && \text{Jacobi identity}
\end{aligned}$$
Since the Poisson bracket is linear in $f$ and obeys the Leibniz rule, with any function $H \in \mathcal{F}(M)$ we can associate a vector field $X_H$ on $M$ by $X_H f = \{H, f\}$. When $H$ is the Hamiltonian of the system, this vector field defines the time evolution by
$$\dot{f} = X_H f = \{H, f\} \qquad \forall f \in \mathcal{F}(M)$$

Definition. A Poisson manifold $M$ is a manifold on which a Poisson bracket is defined.

The important feature of Poisson brackets in classical mechanics is that if $f_1$ and $f_2$ are two conserved quantities, i.e. $\{H, f_1\} = \{H, f_2\} = 0$, then $\{f_1, f_2\}$ is also conserved due to the Jacobi identity. In general a Poisson bracket is degenerate, which means that there are functions $f$ on $M$ such that $\{f, g\} = 0$ for all $g$. The set of such functions is called the centre of the Poisson algebra. If the centre is non-trivial, i.e. contains non-constant functions, one can reduce the dynamical system by setting all functions of the centre to constant values. This defines a foliation of $M$ and the Poisson bracket is non-degenerate on the leaves. In a local coordinate system $x^j$ on $M$ we can write:
$$\{f_1, f_2\}_M(x) = \sum_{ij} P^{ij}(x)\, \frac{\partial f_1}{\partial x^i} \frac{\partial f_2}{\partial x^j}$$
due to the bilinearity and Leibniz rule. Antisymmetry requires that $P^{ij}(x) = -P^{ji}(x)$. The Jacobi identity reads:
$$\sum_s \left( P^{is} \partial_s P^{jk} + P^{ks} \partial_s P^{ij} + P^{js} \partial_s P^{ki} \right) = 0$$
where $\partial_s = \partial / \partial x^s$. Assume now that the matrix $P^{ij}$ is invertible. In particular, this means that the centre of the Poisson algebra is trivial. This can occur only when the dimension of $M$ is even. Denote by $(P^{-1})_{ij}$ the inverse matrix of $P^{ij}$ so that $\partial_s P^{ij} = -\sum_{a,b} P^{ia}\, \partial_s (P^{-1})_{ab}\, P^{bj}$. Inserting this into the Jacobi identity yields:
$$\sum_{s,a,b} P^{is} P^{ja} P^{kb} \left[ \partial_s (P^{-1})_{ab} + \partial_b (P^{-1})_{sa} + \partial_a (P^{-1})_{bs} \right] = 0$$
Since $P^{ij}$ is invertible, this is equivalent to the linear conditions:
$$\partial_a (P^{-1})_{bc} + \partial_b (P^{-1})_{ca} + \partial_c (P^{-1})_{ab} = 0$$
These conditions can be interpreted as the closedness of the 2-form:
$$\omega = -\sum_{i<j} (P^{-1})_{ij}\, dx^i \wedge dx^j$$
that is $d\omega = 0$. This 2-form is invariant under changes of coordinates, and so is globally defined on $M$. We emphasize that the matrices entering the definition of the Poisson bracket and the symplectic form are inverse to each other.
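The coordinate form of the Jacobi identity is easy to test numerically. The sketch below is restricted to three dimensions, where every antisymmetric tensor can be written $P^{ij} = \sum_k \epsilon_{ijk} B_k(x)$ for some vector field $B$; this parametrization and both sample fields are assumptions made for illustration only. The choice $B(x) = x$ gives the linear bracket associated with $so(3)$, which is degenerate but still satisfies the identity, while the second choice is antisymmetric yet fails it.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def tensor_from_field(B):
    """Build P^{ij}(x) = sum_k eps_{ijk} B_k(x) from a vector field B on R^3."""
    def P(x):
        b = B(x)
        return [[sum(levi_civita(i, j, k) * b[k] for k in range(3))
                 for j in range(3)] for i in range(3)]
    return P

def jacobi_defect(P, x, h=1e-5):
    """Largest |P^{is} d_s P^{jk} + P^{ks} d_s P^{ij} + P^{js} d_s P^{ki}| over i, j, k."""
    n = len(x)
    derivs = []
    for s in range(n):
        xp, xm = list(x), list(x)
        xp[s] += h
        xm[s] -= h
        Pp, Pm = P(xp), P(xm)
        derivs.append([[(Pp[i][j] - Pm[i][j]) / (2.0 * h) for j in range(n)]
                       for i in range(n)])
    P0 = P(x)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                val = sum(P0[i][s] * derivs[s][j][k]
                          + P0[k][s] * derivs[s][i][j]
                          + P0[j][s] * derivs[s][k][i] for s in range(n))
                worst = max(worst, abs(val))
    return worst

point = [0.4, -1.1, 2.0]
so3_defect = jacobi_defect(tensor_from_field(lambda x: list(x)), point)
twisted_defect = jacobi_defect(tensor_from_field(lambda x: [-x[1], x[0], 1.0]), point)
```

Here `so3_defect` vanishes up to rounding, whereas `twisted_defect` does not, so the second tensor does not define a Poisson bracket even though it is antisymmetric.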
Definition. A symplectic manifold $(M, \omega)$ is a manifold $M$ equipped with a non-degenerate closed 2-form $\omega$, $d\omega = 0$.

Given a function $H$ on a symplectic manifold, the Hamiltonian vector field $X_H$ is defined using the interior product $i_X$ by:
$$dH = -i_{X_H} \omega, \quad \text{i.e.} \quad dH = -\omega(X_H, \cdot)$$
Using local coordinates $x^i$ on $M$ we have
$$\omega = \sum_{i<j} \omega_{ij}\, dx^i \wedge dx^j$$
Setting $X_H = \sum_i X_H^i \partial_i$, we get $X_H^i = \sum_j \omega^{ij} \partial_j H$, where $\omega^{ij}$ is the inverse matrix of $\omega_{ij}$. Knowing the symplectic form $\omega$ one can reconstruct the Poisson bracket as follows:
$$\{f_1, f_2\} = X_{f_1}(f_2) = -X_{f_2}(f_1) = \omega(X_{f_1}, X_{f_2}) \tag{14.1}$$
In components, we have
$$\{f_1, f_2\} = -\sum_{ij} \omega^{ij}\, \partial_i f_1\, \partial_j f_2$$
On a symplectic space, one can define the notion of symplectic transformations. Consider a bijection $\gamma : M \to M$ and the transform of the symplectic form $\omega$ under $\gamma$. This is the form $(\gamma^* \omega)_m (V, W) = \omega_{\gamma(m)} (\gamma_* V, \gamma_* W)$ where $m$ is a point of $M$, and $\gamma_*$ is the differential of $\gamma$, sending a tangent vector at $m$ to a tangent vector at $\gamma(m)$. We say that the transformation $\gamma$ is symplectic if $\gamma^* \omega = \omega$. For an infinitesimal transformation, $\gamma$ is specified by a vector field $X$ on $M$, and this condition is equivalent to $L_X \omega = 0$, where $L_X$ is the Lie derivative, $L_X = d\, i_X + i_X\, d$. We now translate the symplecticity of $\gamma$ into a property of Poisson brackets and show that it reads:
$$\gamma \{f, g\} = \{\gamma f, \gamma g\} \tag{14.2}$$
We recall that the action of $\gamma$ on functions is given by $(\gamma f)(m) = f(\gamma^{-1}(m))$. Note that we have introduced the inverse of $\gamma$ so that the property $\gamma_1 \cdot (\gamma_2 \cdot f) = (\gamma_1 \gamma_2) \cdot f$ holds. If $\gamma$ is symplectic, we have:
$$X_{\gamma f}(m) = \gamma_* X_f (\gamma^{-1}(m))$$
This is because $d$ commutes with the pullback operation, i.e. $d(\gamma f) = \gamma^{-1*} df$, so applying this to $V \in T_m M$ we get $d(\gamma f)_m (V) = df_{\gamma^{-1}(m)} (\gamma_*^{-1} V)$, that is:
$$\omega_m (X_{\gamma f}, V) = \omega_{\gamma^{-1}(m)} (X_f(\gamma^{-1}(m)), \gamma_*^{-1} V) = \omega_m (\gamma_* X_f \circ \gamma^{-1}(m), V)$$
where in the last step we have used that $\gamma$ is symplectic. We use this to prove eq. (14.2). We have
$$\{\gamma f, \gamma g\}_m = \omega_m (X_{\gamma f}(m), X_{\gamma g}(m)) = \omega_{\gamma(\gamma^{-1}(m))} (\gamma_* X_f \circ \gamma^{-1}, \gamma_* X_g \circ \gamma^{-1})$$
Using that $\gamma$ is symplectic, this is equal to
$$\omega_{\gamma^{-1}(m)} (X_f \circ \gamma^{-1}(m), X_g \circ \gamma^{-1}(m)) = \{f, g\}(\gamma^{-1}(m)) = (\gamma \{f, g\})(m)$$

Proposition. Any Hamiltonian flow is a symplectic transformation.

Proof. Let $H$ be the Hamiltonian with associated vector field $X_H$ such that $dH = -i_{X_H} \omega$. We have:
$$L_{X_H} \omega = (i_{X_H} d + d\, i_{X_H}) \omega = d(i_{X_H} \omega) = -d^2 H = 0$$
where we have used $d\omega = 0$.

Example 1. A standard example of symplectic space is given by $M = \mathbb{R}^{2n}$ with coordinates $(p_i, q_i)$, and symplectic form:
$$\omega = \sum_i dp_i \wedge dq_i$$
These coordinates are called canonical coordinates. The corresponding Poisson bracket reads:
$$\{q_i, q_j\} = 0, \qquad \{p_i, p_j\} = 0, \qquad \{p_i, q_j\} = \delta_{ij}$$
The Hamiltonian vector field corresponding to the function $H$ is:
$$X_H = \sum_i \left( -\frac{\partial H}{\partial q_i} \frac{\partial}{\partial p_i} + \frac{\partial H}{\partial p_i} \frac{\partial}{\partial q_i} \right)$$
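In canonical coordinates with one degree of freedom, a map is symplectic exactly when its Jacobian determinant equals 1, so the proposition on Hamiltonian flows can be observed numerically. The sketch below integrates $\dot{q} = \partial H/\partial p$, $\dot{p} = -\partial H/\partial q$ (the components of $X_H$ above) with a Runge–Kutta scheme and differentiates the resulting flow by central differences. The Hamiltonian $H = p^2/2 + q^4/4$, the integrator and the step sizes are illustrative assumptions; the integrator only approximates the flow, so the determinant is 1 up to discretization error.

```python
def hamiltonian_field(dH_dq, dH_dp):
    """Components (dq/dt, dp/dt) of X_H for the convention {p, q} = 1."""
    def field(q, p):
        return dH_dp(q, p), -dH_dq(q, p)
    return field

def rk4_flow(field, q, p, t, steps=2000):
    """Approximate the time-t flow of the vector field with classical RK4."""
    dt = t / steps
    for _ in range(steps):
        k1q, k1p = field(q, p)
        k2q, k2p = field(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = field(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = field(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q, p

def flow_jacobian_det(field, q, p, t, h=1e-5):
    """Determinant of d(q_t, p_t)/d(q_0, p_0), estimated by central differences."""
    qa, pa = rk4_flow(field, q + h, p, t)
    qb, pb = rk4_flow(field, q - h, p, t)
    qc, pc = rk4_flow(field, q, p + h, t)
    qd, pd = rk4_flow(field, q, p - h, t)
    dq_dq, dp_dq = (qa - qb) / (2.0 * h), (pa - pb) / (2.0 * h)
    dq_dp, dp_dp = (qc - qd) / (2.0 * h), (pc - pd) / (2.0 * h)
    return dq_dq * dp_dp - dq_dp * dp_dq

def energy(q, p):
    """Sample Hamiltonian H = p^2/2 + q^4/4 (hypothetical choice)."""
    return 0.5 * p * p + 0.25 * q**4

field = hamiltonian_field(lambda q, p: q**3, lambda q, p: p)
q_end, p_end = rk4_flow(field, 0.8, -0.3, 1.0)
det = flow_jacobian_det(field, 0.8, -0.3, 1.0)
energy_drift = energy(q_end, p_end) - energy(0.8, -0.3)
```

The same experiment with a non-Hamiltonian field, for instance one with friction added to $\dot{p}$, would give a determinant different from 1.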
In fact this example is generic, at least locally. This is the Darboux theorem.
Fig. 14.1. Foliation of phase space by the surfaces $\Sigma_t$ and $\Phi_s$.

Theorem. On any symplectic manifold $(M, \omega)$ one can introduce, locally around a point $m_0$, canonical coordinates $(q_i, p_i)$ such that $\omega = \sum_i dp_i \wedge dq_i$. Moreover, one can choose $p_1$ to be any given function $H$ on $M$ such that $dH(m_0) \neq 0$.

Proof. The proof will be done by induction on the dimension ($2n$) of $M$. We choose a coordinate system $y \in \mathbb{R}^{2n}$ around $m_0$, and define $p_1(y) = H(y)$. We can assume that $p_1(m_0) = 0$. We then introduce the Hamiltonian vector field $X_{p_1}$ associated with $p_1$, which is non-vanishing in the considered neighbourhood of $m_0$. Choose a hypersurface $\Sigma_0$ passing through $m_0$, and transverse to $X_{p_1}$, i.e. such that $X_{p_1}$ is not tangent to $\Sigma_0$. We want to define $q_1$ such that $\{p_1, q_1\} = 1$ and $q_1 = 0$ on $\Sigma_0$. This means that on any trajectory of the flow of $X_{p_1}$ we want to achieve $\dot{q}_1 = \{p_1, q_1\} = 1$. Hence $q_1(y) - q_1(z) = q_1(y) = t$, where $z$ is the point at which the trajectory crosses $\Sigma_0$, that is $q_1(z) = 0$. By the assumption of transversality, for any $y$ close to $\Sigma_0$ one defines $q_1(y)$ as the time needed to go from $z \in \Sigma_0$ to $y$ along the trajectory of the flow $X_{p_1}$. The neighbourhood of $m_0$ is foliated by the hypersurfaces $\Sigma_t$ where $q_1(y) = t$. On the other hand, it is also foliated by the hypersurfaces $\Phi_s$ where $p_1(y) = s$. We now show that one can simultaneously solve:
$$\partial_s y = -X_{q_1}(y), \qquad \partial_t y = X_{p_1}(y)$$
thereby allowing us to write $y = y(s, t, z)$ with $z \in \Sigma_0 \cap \Phi_0$. This is because for any function $f(y)$ one has by definition $\partial_s f = -X_{q_1} f = \{-q_1, f\}$ and $\partial_t f = X_{p_1} f = \{p_1, f\}$ so that, using the Jacobi identity:
$$\partial_s (\partial_t f) - \partial_t (\partial_s f) = -\{q_1, \{p_1, f\}\} + \{p_1, \{q_1, f\}\} = \{\{p_1, q_1\}, f\} = 0$$
since $\{p_1, q_1\} = 1$ is constant. The vector field $X_{p_1}$ is tangent to $\Phi_0$ (in fact to any $\Phi_s$, because $X_{p_1} p_1 = \{p_1, p_1\} = 0$), and similarly $X_{q_1}$ is tangent to $\Sigma_t$. They are both transverse to $\Sigma_0 \cap \Phi_0$ and independent, because $\{p_1, q_1\} = \omega(X_{p_1}, X_{q_1}) = 1 \neq 0$. For any vector $V$ tangent to $\Sigma_0 \cap \Phi_0$ we have $\omega(V, X_{p_1}) = \omega(V, X_{q_1}) = 0$ because $\omega(V, X_{p_1}) = V \cdot p_1$, $\omega(V, X_{q_1}) = V \cdot q_1$, and $p_1 = q_1 = 0$ are constant on this intersection. It follows that the restriction of $\omega$ to $\Sigma_0 \cap \Phi_0$ is non-degenerate. By the induction hypothesis we assume that we have found canonical coordinates $p_i, q_i$, $i \geq 2$ on the $(2n - 2)$-dimensional symplectic manifold $\Sigma_0 \cap \Phi_0$. We extend these coordinates as functions on $M$ by setting $p_j(y) = p_j(z)$ and $q_j(y) = q_j(z)$ for $j \geq 2$ and $y = y(p_1, q_1, z)$ with $z \in \Sigma_0 \cap \Phi_0$. This amounts to keeping them constant along the flows of $X_{p_1}$ and $X_{q_1}$ so that
$$\{p_1, p_j\} = \{p_1, q_j\} = \{q_1, p_j\} = \{q_1, q_j\} = 0$$
It remains to show that the symplectic form $\omega$ on $M$ is equal to $\tilde{\omega} = \sum_{i=1}^{n} dp_i \wedge dq_i$. We first show that this is true at any point $p_1 = q_1 = 0$ of $\Sigma_0 \cap \Phi_0$. Any vector $V$ tangent to $M$ at this point can be decomposed as a sum of a vector $V_1$ in the space spanned by $X_{p_1}$ and $X_{q_1}$, and a vector $V_2$ tangent to $\Sigma_0 \cap \Phi_0$. Computing $\omega(V_1 + V_2, W_1 + W_2)$ we have seen that $\omega(V_1, W_2) = \omega(V_2, W_1) = 0$, while by the induction hypothesis $\omega(V_2, W_2) = (\sum_{j \geq 2} dp_j \wedge dq_j)(V_2, W_2)$. Finally, $\omega(V_1, W_1) = (dp_1 \wedge dq_1)(V_1, W_1)$, because it is sufficient to compute this for $V_1 = X_{p_1}$ and $W_1 = X_{q_1}$, and both members are then equal to $\{p_1, q_1\} = 1$. To show that the equality holds on $M$, consider the Hamiltonian evolution $U_{(s,t)}$ under the flows of $X_{p_1}$ and $X_{q_1}$. In the coordinates $(p_i, q_i)$ it reads $(p_1, q_1, p_2, q_2, \ldots) \to (p_1 + s, q_1 + t, p_2, q_2, \ldots)$ so that the form $\tilde{\omega} = \sum_i dp_i \wedge dq_i$ is obviously invariant.
On the other hand, the symplectic form $\omega$ is also invariant under Hamiltonian evolutions. Since we have shown that $\omega = \tilde{\omega}$ for $p_1 = q_1 = 0$, these two forms coincide everywhere.

Example 2. Another very natural example of symplectic space is the cotangent bundle $M = T^* N$ of any differentiable manifold $N$. On this bundle is defined a canonical 1-form $\alpha$ given by:
$$\alpha_x (X) = p(\pi_* X)$$
where $X \in T_x (T^* N)$, and $x \in T^* N$. One can write $x = (q, p)$, with $q = \pi(x) \in N$ ($\pi$ is the projection on $N$), and $p$ is a 1-form belonging to the fibre of $q$. We define the symplectic form as the closed form $\omega = d\alpha$. To show that it is non-degenerate we express it in terms of local coordinates. If $q = (q_1, \ldots, q_n)$ is a system of local coordinates on $N$, a basis of the tangent space of $N$ at $q$ is $(\partial_{q_1}, \ldots, \partial_{q_n})$ and a basis of the cotangent space at $q$ is $(dq_1, \ldots, dq_n)$. In particular, any point $p$ in the fibre of $T^* N$ above $q$ is of the form $p = \sum_i p_i\, dq_i$. A tangent vector $X = (\delta q, \delta p)$ at the point $(q, p)$ of $T^* N$ has projection $\pi_* X = \delta q = \sum_i (\delta q)_i\, \partial_{q_i}$, so that $\alpha(X) = \sum_i p_i (\delta q)_i$, and the canonical form reads:
$$\alpha = p_1\, dq_1 + \cdots + p_n\, dq_n$$
Then $\omega = d\alpha = \sum_i dp_i \wedge dq_i$ is non-degenerate.

14.2 Coadjoint orbits
Our aim is to present a non-trivial example of symplectic structure on the coadjoint orbits of a Lie group, which plays an important role in the study of integrable systems. We first recall some notions about adjoint and coadjoint actions of Lie algebras and Lie groups. Let $G$ be a connected Lie group with Lie algebra $\mathcal{G}$. The group $G$ acts on $\mathcal{G}$ by the adjoint action denoted Ad:
$$X \longrightarrow (\mathrm{Ad}\, g)(X) = g X g^{-1}, \qquad g \in G,\ X \in \mathcal{G}$$
Similarly, the coadjoint action of $G$ on the dual $\mathcal{G}^*$ of the Lie algebra $\mathcal{G}$ (i.e. the vector space of linear forms on the Lie algebra) is defined by:
$$(\mathrm{Ad}^* g \cdot \Xi)(X) = \Xi(\mathrm{Ad}\, g^{-1}(X)), \qquad g \in G,\ \Xi \in \mathcal{G}^*,\ X \in \mathcal{G}$$
The infinitesimal version of these actions provides actions of the Lie algebra $\mathcal{G}$ on $\mathcal{G}$ and $\mathcal{G}^*$, denoted by ad and ad$^*$, given by:
$$\mathrm{ad}\, X (Y) = [X, Y], \qquad X, Y \in \mathcal{G}$$
$$\mathrm{ad}^* X \cdot \Xi (Y) = -\Xi([X, Y]), \qquad X, Y \in \mathcal{G},\ \Xi \in \mathcal{G}^*$$
Coadjoint orbits in $\mathcal{G}^*$ are equipped with the canonical Kostant–Kirillov symplectic structure. Before defining it we need some simple facts concerning the functions on $\mathcal{G}^*$. Denote the space of such functions by $\mathcal{F}(\mathcal{G}^*)$. The coadjoint action of $G$ on $\mathcal{G}^*$ induces a coadjoint action of $G$ on functions on $\mathcal{G}^*$, also denoted by Ad$^*$. If $g \in G$ and $h \in \mathcal{F}(\mathcal{G}^*)$,
$$(\mathrm{Ad}^* g \cdot h)(\Xi) = h(\mathrm{Ad}^* g^{-1}(\Xi))$$
and a similar formula for the infinitesimal action ad$^*$. For a function $h$ on $\mathcal{G}^*$ the differential $dh$ may be viewed as an element of $\mathcal{G}$. This is because, $\mathcal{G}^*$ being a vector space, the differential $dh$ is a linear form on $\mathcal{G}^*$, i.e. an element of $\mathcal{G}^{**} \sim \mathcal{G}$, and one can write for $\delta\Xi \in \mathcal{G}^*$:
$$h(\Xi + \delta\Xi) = h(\Xi) + \delta\Xi(dh) + O\left((\delta\Xi)^2\right)$$
From the Lie algebra structure on $\mathcal{G}$, we can construct a Poisson bracket on the space $\mathcal{F}(\mathcal{G}^*)$ of functions on $\mathcal{G}^*$. This is the Kostant–Kirillov bracket.

Definition. If $h_1$ and $h_2 \in \mathcal{F}(\mathcal{G}^*)$, the Kostant–Kirillov bracket is defined by:
$$\{h_1, h_2\}(\Xi) = \Xi([dh_1, dh_2])$$
It is obvious that the bracket $\{\ ,\ \}$ is antisymmetric and verifies the Jacobi identity. Choosing two linear functions $h_1(\Xi) = \Xi(X)$ and $h_2(\Xi) = \Xi(Y)$ with $X, Y \in \mathcal{G}$, we have $dh_1 = X$ and $dh_2 = Y$, and the Kostant–Kirillov Poisson bracket reads:
$$\{\Xi(X), \Xi(Y)\} = \Xi([X, Y])$$
The right-hand side is the linear function $\Xi \to \Xi([X, Y])$. This Poisson bracket is very natural but one has to be aware that it is a degenerate Poisson bracket.

Proposition. The kernel of the Kostant–Kirillov bracket $\{\ ,\ \}$ is the set $I(\mathcal{G}^*)$ of Ad$^*$-invariant functions.

Proof. Let us first express the Ad$^*$-invariance property of a function $h$ on $\mathcal{G}^*$. Performing an infinitesimal transformation, we have:
$$h(\Xi + t\, \mathrm{ad}^* X \cdot \Xi) = h(\Xi) + t\, (\mathrm{ad}^* X \cdot \Xi)(dh) + O(t^2)$$
so $h$ is invariant if $(\mathrm{ad}^* X \cdot \Xi)(dh) = \Xi([dh, X]) = 0$ for all $\Xi \in \mathcal{G}^*$ and all $X \in \mathcal{G}$. Assuming now that $k \in \mathcal{F}(\mathcal{G}^*)$ is in the kernel of $\{\ ,\ \}$, we have $\{k, f\}(\Xi) = \Xi([dk, df]) = 0$, $\forall f \in \mathcal{F}(\mathcal{G}^*)$, in particular $f = \Xi(X)$ for any $X \in \mathcal{G}$. Thus $k$ is ad$^*$-invariant. The converse is obvious.

This proposition means that the kernel of the Kostant–Kirillov bracket $\{\ ,\ \}$ is the set of functions which are constant on the orbits of the coadjoint action of $G$. Let $I_1, I_2, \ldots$ be the primitive invariant functions, i.e. any invariant function is a function of them, and denote by $\mathcal{I}_1, \mathcal{I}_2, \ldots$ the constant values they take on a specific orbit. We consider the ideal of the function algebra generated by the non-constant functions $I_1 - \mathcal{I}_1, I_2 - \mathcal{I}_2, \ldots$. It is also an ideal of the Poisson algebra.
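For a concrete instance of the Kostant–Kirillov bracket, one can take $\mathcal{G} = so(3)$, identified here with $\mathbb{R}^3$ equipped with the cross product, and identify $\mathcal{G}^*$ with $\mathbb{R}^3$ through the dot product. These identifications, and the numerical gradient standing in for $dh$, are assumptions of this sketch. The bracket becomes $\{h_1, h_2\}(\Xi) = \Xi \cdot (\nabla h_1 \times \nabla h_2)$, and the invariant function $|\Xi|^2$ lies in its kernel, in line with the proposition above.

```python
def cross(a, b):
    """Cross product on R^3, playing the role of the Lie bracket of so(3)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def grad(h, xi, eps=1e-6):
    """Central-difference gradient, standing in for the differential dh."""
    g = []
    for i in range(3):
        up, dn = list(xi), list(xi)
        up[i] += eps
        dn[i] -= eps
        g.append((h(up) - h(dn)) / (2.0 * eps))
    return g

def kk_bracket(h1, h2, xi):
    """Kostant-Kirillov bracket {h1, h2}(xi) = xi([dh1, dh2])."""
    return dot(xi, cross(grad(h1, xi), grad(h2, xi)))

xi = [0.4, -1.1, 2.0]
X, Y = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]

linear = kk_bracket(lambda v: dot(v, X), lambda v: dot(v, Y), xi)
expected_linear = dot(xi, cross(X, Y))

casimir = lambda v: dot(v, v)                 # invariant under rotations
generic = lambda v: v[0] ** 3 + v[1] * v[2]   # arbitrary test function
casimir_bracket = kk_bracket(casimir, generic, xi)
```

On the spheres $|\Xi|^2 = \text{const}$, which are the coadjoint orbits in this example, the bracket is non-degenerate away from the origin.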
The quotient of the function algebra by this ideal can be identified with the functions on the orbit, and the quotient Poisson bracket yields a Poisson bracket on the orbit, which by construction is non-degenerate, and therefore defines a symplectic structure on the orbit. More explicitly:

Proposition. For any two tangent vectors at the point $\Xi$ of the orbit, $V_X = \mathrm{ad}^* X \cdot \Xi$ and $V_Y = \mathrm{ad}^* Y \cdot \Xi$, define
$$\omega_K (V_X, V_Y) = \Xi([X, Y]) \tag{14.3}$$
This form is closed and non-degenerate on any $G$-orbit. It induces the Kostant–Kirillov bracket.

Proof. First note that since we are on an orbit of $G$, the vectors $\mathrm{ad}^* X \cdot \Xi$, $X \in \mathcal{G}$ describe all the tangent space at $\Xi$. To show that $\omega_K$ is closed, let us recall the definition of exterior differentiation:
$$d\eta(X_0, \ldots, X_p) = \sum_{j=0}^{p} (-1)^j X_j \cdot \eta(X_0, \ldots, \widehat{X_j}, \ldots, X_p) + \sum_{0 \leq i < j \leq p} (-1)^{i+j} \eta([X_i, X_j], X_0, \ldots, \widehat{X_i}, \ldots, \widehat{X_j}, \ldots, X_p)$$
In our case, the first term vanishes because $V_X \cdot \Xi([Y, Z]) = -\Xi([X, [Y, Z]])$, and we apply the Jacobi identity. The second term vanishes for the same reason once one notices that $[V_X, V_Y] = -V_{[X,Y]}$. This is because, by the general definition of the Lie derivative, we have:
$$L_{X \cdot m}\, Y \cdot m \equiv [X \cdot m, Y \cdot m] = \frac{d}{dt}\, e^{-Xt}\, Y \cdot (e^{Xt} m) \Big|_{t=0} = -[X, Y] \cdot m \tag{14.4}$$
To show that the form $\omega_K$ is non-degenerate on the orbit, assume that the tangent vector $V_X$ is such that $\omega_K(V_X, V_Y) = 0$ for all $Y \in \mathcal{G}$. This means $\Xi([X, Y]) = 0$, $\forall Y$, that is $\mathrm{ad}^* X \cdot \Xi (Y) = 0$, $\forall Y$. Hence $V_X = \mathrm{ad}^* X \cdot \Xi = 0$. To compute the Poisson bracket associated with $\omega_K$, we need the Hamiltonian vector field of any function $f$:
$$X_f (\Xi) = -\mathrm{ad}^* df \cdot \Xi$$
To show it, note that $df(\mathrm{ad}^* Y \cdot \Xi) = \Xi([df, Y])$, which is also $\omega_K (\mathrm{ad}^* df \cdot \Xi, \mathrm{ad}^* Y \cdot \Xi)$. Hence $\omega_K (X_f, X_g)(\Xi) = \Xi([df, dg])$.

The 2-form $\omega_K$ defining the Kostant–Kirillov symplectic structure on coadjoint orbits is closed, but not exact. However, the coadjoint action defines a map $\varphi$ from the group $G$ to the orbit $\mathcal{O}_{\Xi_0}$ by
$\varphi : g \to \mathrm{Ad}^* g \cdot \Xi_0 \equiv \Xi$. The pullback $\omega = \varphi^* \omega_K$ of $\omega_K$ on $G$ is exact and one can write
$$\omega = \delta\alpha, \qquad \text{with} \quad \alpha = -\Xi_0 (g^{-1} \delta g)$$
To check this, note that $\varphi_* (gX) = \mathrm{ad}^* (g X g^{-1}) \cdot \Xi$. Hence
$$\varphi^* \omega_K (gX, gY) = \omega_K (\varphi_* (gX), \varphi_* (gY)) = \Xi(g [X, Y] g^{-1}) = \Xi_0 ([X, Y])$$
On the other hand,
$$\delta\alpha (gX, gY) = gX \cdot \alpha(gY) - gY \cdot \alpha(gX) - \alpha([gX, gY])$$
The first two terms vanish because they are derivatives of constant functions ($\alpha(gY) = -\Xi_0 (Y)$). By definition of the Lie bracket $[gX, gY] = g[X, Y]$ so that the last term is equal to $\Xi_0 ([X, Y])$, hence $\varphi^* \omega_K = \delta\alpha$.

14.3 Symmetries and Hamiltonian reduction

Let $G$ be a Lie group and $\mathcal{G}$ its Lie algebra. Consider a symplectic manifold $M$ on which $G$ operates. We say that the action of $G$ is symplectic if for any $g \in G$ the transformation $m \to gm$ is symplectic. In view of eq. (14.2) this means:
$$\{f_1 (gm), f_2 (gm)\} = \{f_1, f_2\}(gm)$$
For any $X \in \mathcal{G}$, consider the one-parameter group $g^t = \exp(tX)$. In the limit $t \to 0$ we define the action of $X$ on functions by:
$$X \cdot f(m) = \frac{d}{dt} f(e^{-tX} \cdot m) \Big|_{t=0} \tag{14.5}$$
so that we get a representation on functions of the Lie algebra $\mathcal{G}$:
$$X \cdot (Y \cdot f) - Y \cdot (X \cdot f) = [X, Y] \cdot f \tag{14.6}$$
Notice that $X \cdot f = -L_{X \cdot m} f$, where $L$ is the Lie derivative. Finally, the symplecticity condition reads:
$$\{X \cdot f_1, f_2\} + \{f_1, X \cdot f_2\} = X \cdot \{f_1, f_2\} \tag{14.7}$$

Proposition. Let $G$ be a Lie group acting on $M$ by symplectic diffeomorphisms. The action of any one-parameter subgroup of $G$ is locally Hamiltonian. This means that there exists a function $H_X$, locally defined on $M$, such that:
$$X \cdot f = \{H_X, f\} \tag{14.8}$$
Proof. The condition eq. (14.7) is obviously necessary to have $X \cdot f = \{H_X, f\}$. It is also sufficient. To show it, we use the canonical Darboux coordinates. Writing $X \cdot m = \sum_i (X_p^i\, \partial_{p_i} + X_q^i\, \partial_{q_i})$, we have:
$$\begin{aligned}
\{X \cdot f, h\} + \{f, X \cdot h\} - X \cdot \{f, h\}
&= \sum_{i,j} \left[ (\partial_{p_j} X_q^i - \partial_{p_i} X_q^j)\, \partial_{q_i} f\, \partial_{q_j} h + (\partial_{p_j} X_p^i + \partial_{q_i} X_q^j)\, \partial_{p_i} f\, \partial_{q_j} h \right] \\
&\quad - \sum_{i,j} \left[ (\partial_{q_j} X_q^i + \partial_{p_i} X_p^j)\, \partial_{q_i} f\, \partial_{p_j} h + (\partial_{q_j} X_p^i - \partial_{q_i} X_p^j)\, \partial_{p_i} f\, \partial_{p_j} h \right]
\end{aligned}$$
The condition that this vanishes identically for all $f$ and $h$ is equivalent to $d\Omega_X = 0$, where $\Omega_X = -i_{X \cdot m}\, \omega = \sum_i (X_q^i\, dp_i - X_p^i\, dq_i)$. So there exists, at least locally, a function $H_X$ such that $\Omega_X = dH_X$ or:
$$X_q^i = \frac{\partial H_X}{\partial p_i}, \qquad X_p^i = -\frac{\partial H_X}{\partial q_i}$$
Then
$$X \cdot f = \sum_i \left( -\frac{\partial H_X}{\partial q_i} \frac{\partial f}{\partial p_i} + \frac{\partial H_X}{\partial p_i} \frac{\partial f}{\partial q_i} \right) = \{H_X, f\}$$
This proves eq. (14.8).

If one knows that there is an invariant 1-form $\alpha$ such that $\omega = d\alpha$, one can give an explicit formula for the function $H_X$, which is then globally defined:
$$H_X (m) = \alpha(X \cdot m) \tag{14.9}$$
Indeed, since $\alpha$ is invariant we have $L_X \alpha = 0$. Then $0 = L_X \alpha = (i_X d + d\, i_X) \alpha = \omega(X, \cdot) + d\alpha(X)$, so that comparing with $dH_X = -i_X \omega$ we see that $H_X = \alpha(X \cdot m)$.

Using eq. (14.8) and the Jacobi identity, we find that $X \cdot (Y \cdot f) - Y \cdot (X \cdot f) = \{\{H_X, H_Y\}, f\}$, so that eq. (14.6) yields $\{H_{[X,Y]} - \{H_X, H_Y\}, f\} = 0$. Because constants commute with any function we cannot conclude that $H_{[X,Y]} = \{H_X, H_Y\}$. This motivates the:

Definition. Consider a Lie group $G$ acting on a symplectic manifold $M$ by symplectic action. This action is said to be Poissonian if the Hamiltonians $H_X$ of the one-parameter subgroups are globally defined, depend linearly on $X$, and are such that
$$H_{[X,Y]} = \{H_X, H_Y\}$$
In the previous case, where there exists an invariant 1-form $\alpha$, this property is always satisfied. In this case, we have $\{H_X, H_Y\} = \omega(X, Y) = d\alpha(X, Y) = X \cdot \alpha(Y) - Y \cdot \alpha(X) - \alpha([X, Y])$. By invariance of $\alpha$ we have $X \cdot \alpha(Y) = L_X \alpha(Y) = \alpha([X, Y]) = -Y \cdot \alpha(X)$, so that $\{H_X, H_Y\} = \alpha([X, Y]) = H_{[X,Y]}$.

Example. This particular situation is important because it occurs in the case of a cotangent bundle. In this case we already know that the 1-form $\alpha$ exists and is globally defined. We consider particular diffeomorphisms of $M = T^* N$, namely those which are induced by a diffeomorphism of the base $N$. We are going to show that $\alpha$ is invariant under such diffeomorphisms, and in particular under those which are induced by group actions on $N$. Any diffeomorphism $\phi$ of $N$ induces a transformation on $T^* N$ as follows: a point $(q, p)$ of $T^* N$ is determined by a point $q \in N$ and a linear form $p$ on the tangent space $T_q N$ to $N$ at $q$. The differential of $\phi$ at $q$, which we denote by $\phi_*$, maps $T_q N$ into $T_{\phi(q)} N$. Its transpose $\phi^*$ maps $T^*_{\phi(q)} N$ to $T^*_q N$, hence $\phi^{*-1}$ maps $T^*_q N$ to $T^*_{\phi(q)} N$. The induced transformation $\tilde{\phi}$ on $M = T^* N$ is given by
$$\tilde{\phi}(q, p) = (\phi(q), \phi^{*-1}(p))$$

Proposition. The 1-form $\alpha$ on $T^* N$ is invariant under transformations induced by transformations of the base manifold $N$.

Proof. The transformation $\tilde{\phi}$ of $M$ induces a transformation $\tilde{\phi}_*$ on $TM$ given by $\tilde{\phi}_* (\delta q, \delta p) = (\phi_* \delta q, \phi^{*-1} \delta p)$. Recall the definition $\alpha(\delta q, \delta p) = p(\delta q)$. Quite generally, we have:
$$(\tilde{\phi}^* \alpha)_{(q,p)} (\delta q, \delta p) = \alpha_{\tilde{\phi}(q,p)} (\tilde{\phi}_* (\delta q, \delta p)) = \phi^{*-1}(p)(\phi_* \delta q) = p(\delta q) = \alpha(\delta q, \delta p)$$
In particular, assuming that a Lie group $G$ acts on the base manifold $N$, this action lifts to a Poissonian action on $M = T^* N$, and for any $X \in \mathcal{G}$ we have $H_X (m) = \alpha(X \cdot m) = p(X \cdot q)$ for $m = (q, p)$.

When the action of a Lie group $G$ on a symplectic manifold $M$ is Poissonian, any $X \in \mathcal{G}$ is associated with a function $H_X$ such that $X \cdot f = \{H_X, f\}$, and $X \to H_X$ is linear. Hence there exists a function $\mathcal{P} : M \longrightarrow \mathcal{G}^*$ such that one can write $H_X (m) = \langle \mathcal{P}(m), X \rangle$, where $\langle\ ,\ \rangle$ is the pairing between $\mathcal{G}$ and its dual.
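As an illustration of the lifted action, take $N = \mathbb{R}^3$ with the rotation group acting on it. The sketch below assumes that $X \cdot q$ denotes the velocity of $q \mapsto e^{-tX} q$, i.e. $X \cdot q = -x \times q$ when $X$ is identified with a vector $x \in \mathbb{R}^3$ and the Lie bracket with the cross product; this sign convention is an assumption chosen so that $X \cdot f = \{H_X, f\}$ holds with $\{p_i, q_j\} = \delta_{ij}$. With it, $H_X = p(X \cdot q) = -x \cdot (q \times p)$, so the moment map is the angular momentum up to sign, and the Poissonian property $\{H_X, H_Y\} = H_{[X,Y]}$ can be checked numerically.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def lifted_hamiltonian(x):
    """H_X(q, p) = p(X . q) with X . q = -x cross q (sign convention assumed)."""
    def H(q, p):
        return -dot(p, cross(x, q))
    return H

def shifted(v, i, s):
    """Copy of v with component i shifted by s."""
    w = list(v)
    w[i] += s
    return w

def bracket(f, g, q, p, h=1e-4):
    """{f, g} = sum_i (df/dp_i dg/dq_i - df/dq_i dg/dp_i), so that {p_i, q_j} = delta_ij."""
    total = 0.0
    for i in range(3):
        df_dq = (f(shifted(q, i, h), p) - f(shifted(q, i, -h), p)) / (2.0 * h)
        df_dp = (f(q, shifted(p, i, h)) - f(q, shifted(p, i, -h))) / (2.0 * h)
        dg_dq = (g(shifted(q, i, h), p) - g(shifted(q, i, -h), p)) / (2.0 * h)
        dg_dp = (g(q, shifted(p, i, h)) - g(q, shifted(p, i, -h))) / (2.0 * h)
        total += df_dp * dg_dq - df_dq * dg_dp
    return total

x, y = [0.3, -1.2, 0.7], [1.1, 0.4, -0.5]
q, p = [0.2, 0.9, -0.4], [-0.6, 0.1, 1.3]

lhs = bracket(lifted_hamiltonian(x), lifted_hamiltonian(y), q, p)
rhs = lifted_hamiltonian(cross(x, y))(q, p)
angular = -dot(x, cross(q, p))
```

Because the Hamiltonians are bilinear in $(q, p)$, the central differences are exact up to rounding, and `lhs` agrees with `rhs`.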
Definition. The application $m \to \mathcal{P}(m) \in \mathcal{G}^*$ is called the moment map.

The moment map has the following covariance property with respect to the action of $G$:

Proposition. The value of the moment at the point $g \cdot m$ is related to its value at the point $m$ by:
$$\mathcal{P}(g \cdot m) = \mathrm{Ad}^*_g\, \mathcal{P}(m)$$

Proof. This is equivalent to $H_X (g \cdot m) = H_{g^{-1} X g}(m)$. Since we assume that $G$ is connected, it is sufficient to show this for an infinitesimal $g = 1 + Y$. Then we have to show that $dH_X (Y \cdot m) = H_{[X,Y]}(m)$. Using eq. (14.5) we have $dH_X (Y \cdot m) = -Y \cdot H_X = -\{H_Y, H_X\}$, where we used eq. (14.8) in the last step. Since the group action is Poissonian, this is $H_{[X,Y]}$.

The moment map associates conserved quantities with symmetries of the Hamiltonian. This is the Noether theorem.

Theorem. Let $G$ be a Lie group acting by Poissonian action on a symplectic manifold $M$, and let $H$ be a Hamiltonian invariant under the action of $G$. Then the moment $\mathcal{P}$ is conserved under the flow of $H$.

Proof. Let us fix $X \in \mathcal{G}$ and consider the function on $M$: $\langle \mathcal{P}(m), X \rangle = H_X (m)$. Its time derivative under the flow of $H$ is $\partial_t H_X = \{H, H_X\} = -X \cdot H = 0$ because $H$ is invariant under the group action.

In the situation of the theorem, one can use the conserved quantities to reduce the number of degrees of freedom. This is called Hamiltonian reduction, because one is able to define a new symplectic manifold of smaller dimension on which the reduced motion takes place. Let $M$ be a symplectic manifold and let $G$ be a group acting on $M$ by Poissonian action. Let $\mathcal{P}$ be the moment map. Let us fix a particular value $\mu$ of the moment and consider the set of points of phase space where $\mathcal{P}(m) = \mu$. By the Noether theorem the motion takes place on this set:
$$M_\mu \equiv \mathcal{P}^{-1}(\mu), \qquad \mu \in \mathcal{G}^*$$
We have to assume that $\mu$ is not a critical value of $\mathcal{P}$, that is, at all $m \in M_\mu$ we have $d\mathcal{P}(m) \neq 0$. Hence there exists a tangent space at $m$ to $M_\mu$. Since $M_\mu$ is defined by $\dim G$ equations, we have:
$$\dim M_\mu = \dim M - \dim G \tag{14.10}$$
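The Noether theorem can be observed on a small example. The sketch below takes $M = T^* \mathbb{R}^2$ with $G = SO(2)$ acting by rotations, so that $\dim M_\mu = 4 - 1 = 3$, and uses the function $q_1 p_2 - q_2 p_1$ as the moment, which is correct up to a sign and normalization fixed by conventions. The potential, the symmetry-breaking term `tilt`, the initial data and the integrator are all hypothetical choices for illustration.

```python
def make_field(tilt):
    """Hamilton's equations for H = |p|^2/2 + r^2/2 + r^4/4 + tilt * q1 on T*R^2."""
    def field(s):
        q1, q2, p1, p2 = s
        radial = 1.0 + (q1 * q1 + q2 * q2)   # V'(r)/r for V = r^2/2 + r^4/4
        return (p1, p2, -radial * q1 - tilt, -radial * q2)
    return field

def rk4(field, s, t, steps=2000):
    """Approximate the time-t flow with classical RK4."""
    dt = t / steps
    for _ in range(steps):
        k1 = field(s)
        k2 = field(tuple(x + 0.5 * dt * k for x, k in zip(s, k1)))
        k3 = field(tuple(x + 0.5 * dt * k for x, k in zip(s, k2)))
        k4 = field(tuple(x + dt * k for x, k in zip(s, k3)))
        s = tuple(x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for x, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s

def moment(s):
    """Generator of rotations in the plane (angular momentum)."""
    q1, q2, p1, p2 = s
    return q1 * p2 - q2 * p1

start = (1.0, 0.0, 0.0, 0.8)
drift_invariant = moment(rk4(make_field(0.0), start, 0.5)) - moment(start)
drift_tilted = moment(rk4(make_field(0.5), start, 0.5)) - moment(start)
```

With `tilt = 0` the Hamiltonian is rotation invariant and the moment stays constant up to integration error; a non-zero `tilt` breaks the symmetry and the moment drifts.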
Note that Mµ is not in general a symplectic variety, and is not in general of even dimension. However, there is a residual action of the group G
on $M_\mu$. We have seen that the action of $G$ is transformed by $\mathcal{P}$ into the coadjoint action on $\mathcal{G}^*$. So the stabilizer $G_\mu$ of the moment $\mu$, that is the group of $g \in G$ such that $\mathrm{Ad}^*_g\, \mu = \mu$, preserves $M_\mu$.

Definition. The reduced phase space $F_\mu$ is the quotient:
$$F_\mu \equiv M_\mu / G_\mu = \mathcal{P}^{-1}(\mu) / G_\mu$$
Here we assume that the quotient is well-defined as a differentiable manifold. In general there are particular values of $\mu$ where this quotient is ill-defined. However, for a generic situation we don't have to enter into such subtleties. The nice feature of the reduced phase space is that it is naturally equipped with a symplectic structure, and in particular is of even dimension.

Proposition. Let $\xi$ and $\eta$ be two vectors tangent to $F_\mu$ at the point $f$. Consider at a point $m \in M_\mu$ above $f$ any two tangent vectors to $M_\mu$, $\xi'$ and $\eta'$, projecting on $\xi$, $\eta$ respectively. We then set:
$$\omega_f (\xi, \eta) = \omega_m (\xi', \eta')$$
This is independent of the chosen representatives $m$, $\xi'$, $\eta'$ and defines a symplectic form on $F_\mu$.

Proof. We first show that given $m$, $\omega_m (\xi', \eta')$ is independent of the choices of representatives $\xi'$, $\eta'$. This amounts to showing that $\omega_m (V, W) = 0$ for $V$ vertical, i.e. tangent to the orbit of $G_\mu$, and $W$ tangent to $M_\mu$. Since $V$ is vertical we can write $V = X \cdot m$ with $X \in \mathcal{G}_\mu$. We can consider the Hamiltonian $H_X$ defined on $M$ and note that $\omega_m (V, W) = -dH_X (W)$. But on $M_\mu$ we have $H_X = \mathcal{P}(X) = \mu(X)$ is constant. Since $dH_X (W)$ is the derivative of $H_X$ in the direction of $W$, which is tangent to $M_\mu$, this derivative vanishes. The quantity $\omega_m (\xi', \eta')$ is independent of the choice of the point $m$ above $f$ since by invariance of $\omega$ we have for any $g \in G_\mu$, $\omega_{gm} (g\xi', g\eta') = \omega_m (\xi', \eta')$. This shows that $\omega_F$, the form that we have defined on $F_\mu$, is well-defined, bilinear and antisymmetric.

To show that $\omega_F$ is closed, note that the restriction $\omega|_{M_\mu}$ of $\omega$ to $M_\mu$ is obviously closed. On the other hand, if $\pi$ is the projection $M_\mu \to M_\mu / G_\mu$ we have just shown that $\omega|_{M_\mu} = \pi^* \omega_F$. Since $d$ commutes with $\pi^*$ we have $\pi^* d\omega_F = 0$. This means that $d\omega_F (\pi_* X, \pi_* Y) = 0$ for all $X, Y$. Since $\pi_*$ is surjective we have $d\omega_F = 0$.

Finally, we have to show that $\omega_F$ is non-degenerate. Vertical vectors are in the kernel of $\omega|_{M_\mu}$, in fact they are the whole kernel of $\omega|_{M_\mu}$ as we now explain. We have seen that vertical vectors are orthogonal under $\omega$ to $T_m (M_\mu)$. More precisely, we have $(\mathcal{G} \cdot m)^\perp = T_m (M_\mu)$ because both sides have the same dimension, $\dim M - \dim G$, since $\omega$ is non-degenerate. But the kernel of the restriction of $\omega$ to $M_\mu$ is
$$\mathrm{Ker}\, \omega|_{M_\mu} (m) = T_m (M_\mu) \cap T_m^\perp (M_\mu) = \mathcal{G}_\mu \cdot m$$
These vectors project to 0 when we take the quotient under $G_\mu$, hence $\omega_F$ is non-degenerate.

We often need to compute the Poisson brackets of functions on the reduced phase space $F_\mu$, knowing the Poisson brackets on $M$. Any function $\tilde{f}$ on $F_\mu$ uniquely extends to a $G_\mu$-invariant function on $M_\mu$. However, to be able to compute Poisson brackets on $M$ we have to extend this function further to a function $f$ defined in a vicinity of $M_\mu$ in $M$. Even requiring complete $G$ invariance is not sufficient to lift $\tilde{f}$ to $M$. This is because while $\dim M = \dim M_\mu + \dim G$, the fibre along $G$ at $m \in M_\mu$ is not transverse to $M_\mu$, since $G_\mu$ leaves $M_\mu$ invariant. The general procedure consists of choosing arbitrary extensions $f$ of $\tilde{f}$ outside of $M_\mu$. Two extensions differ by a function vanishing on $M_\mu$. Then we will show how to compute the reduced Poisson bracket as some modification of $\{f, g\}$ (computed on $M$) independent of the arbitrary choices (see eq. (14.11) below). The difference of the Hamiltonian vector fields of two extensions of the same function on $F_\mu$ is controlled by the following:

Lemma. Let $f$ be a function defined in a vicinity of $M_\mu$ and vanishing on $M_\mu$. Then the Hamiltonian vector field $X_f$ associated with $f$ is tangent to the orbit $G \cdot m$ at any point $m \in M_\mu$.

Proof. The submanifold $M_\mu$ is defined by the equations $H_{X_i} = \mu_i$ for some basis $X_i$ of $\mathcal{G}$. Since $f$ vanishes on $M_\mu$ one can write $f = \sum_i (H_{X_i} - \mu_i) f_i$ for some functions $f_i$ defined in the vicinity of $M_\mu$. For any tangent vector $V$ at a point $m \in M_\mu$ one has, using that $(H_{X_i} - \mu_i)\, \partial f_i$ vanishes on $M_\mu$:
$$df(m)(V) = \sum_i dH_{X_i}(m)(V)\, f_i (m) = -\omega\Big( \sum_i f_i (m)\, X_i \cdot m,\ V \Big)$$
where we used that the Hamiltonian vector field associated with $H_{X_i}$ is $X_i \cdot m$. Hence $X_f = \sum_i f_i (m)\, X_i \cdot m \in \mathcal{G} \cdot m$.
As a consequence of this lemma we have a method for computing the reduced Poisson bracket. We take two functions defined on Mµ and invariant under Gµ and extend them arbitrarily. Then we compute their Hamiltonian vector fields on M . It turns out that they can be “projected” on the tangent space to Mµ by adding a vector tangent to the orbit G · m. These projections are independent of the extensions and the reduced Poisson bracket is given by the value of the symplectic form on M acting on them. Proposition. Let f be a function defined in a vicinity of Mµ and Gµ invariant on Mµ . At each point m ∈ Mµ one can choose a vector Vf · m ∈ G · m such that Xf + Vf · m ∈ Tm (Mµ ) and Vf · m is determined up to a vector in Gµ · m. Proof. Recall that the symplectic orthogonal of G · m is exactly Tm (Mµ ), so we want to solve ω (Xf + Vf · m, X · m) = 0, ∀X ∈ G. Note that for X, Y ∈ G, ω(X ·m, Y ·m) = {HX , HY } = H[X,Y ] = P ([X, Y ]) = µ ([X, Y ]) since the action is Poissonian and m ∈ Mµ . So the equation to be solved reads LX·m f = µ ([Vf , X]). Both members are linear in X ∈ G and vanish when X ∈ Gµ (the left-hand side because f is Gµ -invariant, the righthand side because Gµ stabilizes µ), so the equation can be seen as an ¯ Y¯ ) → µ([X, Y ]) is equation on G/Gµ . On this quotient, the mapping (X, a skew-symmetric non-degenerate (since we have quotiented by the kernel Gµ ) bilinear form, hence the equation can be uniquely solved for V¯f . Proposition. Let f˜, g˜ be two functions on the reduced phase space Fµ . We lift them to functions f, g defined on a vicinity of Mµ , and Gµ invariant on Mµ . The reduced Poisson bracket is given by: {f˜, g˜}red = ω (Xf + Vf · m, Xg + Vg · m) = {f, g} − µ([Vf , Vg ]) (14.11) Proof. We want to show that the Hamiltonian vector field associated with f˜ for the reduced symplectic form is π∗ (Xf (m) + Vf · m) for m ∈ Mµ . Note that this is independent of the choice of Vf modulo Gµ , and that Xf (m) + Vf · m is by construction tangent to Mµ . 
If V is an arbitrary tangent vector to Mµ at m, by definition of the reduced symplectic form, one has to check that:

df̃(π∗V) = −ωred(π∗(Xf(m) + Vf · m), π∗V) = −ωm(Xf(m) + Vf · m, V)   (14.12)

We have seen that the symplectic orthogonal of G · m is Tm(Mµ), so that ωm(Vf · m, V) = 0. On the other hand, since f is Gµ-invariant we have df̃(π∗V) = df(V) = −ωm(Xf, V). This proves eq. (14.12) and the first equality in eq. (14.11). To get the second form of the reduced Poisson bracket, note that ω(Xf, Vg · m) = −µ([Vf, Vg]). This is because, since Xf + Vf · m ∈ (G · m)⊥, one has, for m ∈ Mµ:

ω(Xf, Vg · m) = −ω(Vf · m, Vg · m) = −H[Vf,Vg](m) = −µ([Vf, Vg])

Remark. Note that if f|Mµ = 0 we have Xf + Vf · m ∈ Gµ · m, hence {f̃, g̃}red = ω(Xf + Vf · m, Xg + Vg · m) = 0, so that eq. (14.11) is independent of the arbitrary choices of f and g.
In the applications, we are often given functions f and g on M which are G-invariant. It is then obvious that {f, g} is also G-invariant (invariance of ω), hence its restriction to Mµ is Gµ-invariant. Moreover, the associated Hamiltonian vector fields Xf, Xg are tangent to Mµ since the G-invariance of f implies:

0 = df(X · m) = −ω(Xf, X · m) = −dHX(Xf),   X ∈ G

therefore the functions HX are constant along Xf, i.e. Xf is tangent to Mµ. In that case, one can take Vf = 0. It follows that for such functions the reduced Poisson bracket on Fµ is simply given by the ordinary Poisson bracket on M.

Proposition. If f and g are G-invariant functions on M they define functions f̃ and g̃ on the reduced phase space Fµ and we have:

{f̃, g̃}red = {f, g}

where the right-hand side is also G-invariant and defines a function on Fµ.

14.4 The case M = T ∗ G

Let us now apply these general considerations to the case where N is a Lie group G and M = T ∗ G. We have two actions of G on itself, namely

$$L_g : (g, n) \to gn, \qquad R_g : (g, n) \to n g^{-1}$$
The first one is called left-action and the second one right-action. The differential Lg∗ of the map n → gn sends Tn G → Tgn G. In particular if n = e, the unit element of G, this differential maps Te G = G to Tg G. For any fixed X ∈ G, we get a left-invariant vector field g · X on G. As above, the action on functions on G is defined by (gf )(n) = f (g −1 n) so
that (g(hf)) = ((gh)f). For an infinitesimal transformation g = exp(tX) with t small we have:

$$X \cdot f(n) = \frac{d}{dt} f(e^{-tX} \cdot n)\Big|_{t=0} = -L_{X \cdot n} f$$
To build a phase space we need the cotangent bundle M = T ∗ G. We can use left translations to identify M with G × G ∗ , where G ∗ is the dual of the Lie algebra G:

$$p \in T_g^* G \longrightarrow (g, \xi) \quad \text{where} \quad p = L^*_{g^{-1}} \xi$$
Indeed, $L_{g^{-1}*}$ maps $T_g G$ to G, hence its transpose $L^*_{g^{-1}}$ maps G ∗ to $T_g^* G$. Explicitly, $p(V) = \xi(g^{-1}V)$, which is simply ξ(X) when V = g · X is a left-invariant vector field. Note that if Xi is a basis of the Lie algebra G, the left-invariant vector fields g · Xi provide a basis of each Tg G.

Since M = T ∗ G there is a canonical 1-form α defined as follows: if (v, κ) is a tangent vector to T ∗ G at the point (g, ξ), so that v ∈ Tg G and κ ∈ G ∗ , we have:

$$\alpha(v, \kappa) = \xi(g^{-1} v)$$

This canonical 1-form is both left-invariant and right-invariant, because, according to the general construction, it is invariant under any diffeomorphism of the base N = G of the cotangent bundle. Hence the symplectic form ω = dα is invariant under both actions.

We can compute the Hamiltonians which generate infinitesimal left and right translations on functions. We have seen that in the case of a cotangent bundle $H_X(m) = \alpha(-X \cdot m)$, where X · m is the infinitesimal group action on M and the minus sign is introduced because we consider the action on functions. For right translations $g \to g h^{-1}$ and $h = e^{tX_R}$ we get

$$H_{X_R}(g, \xi) = \alpha(g \cdot X_R) = \xi(X_R)$$

For left translations $g \to hg$ and $h = e^{tX_L}$ we get

$$H_{X_L}(g, \xi) = \alpha(-X_L \cdot g) = -\xi(g^{-1} X_L g)$$

From this, one sees that the moment maps for left and right actions are given by

$$P_L(X_L) = -(g \xi g^{-1}, X_L), \qquad P_R(X_R) = (\xi, X_R)$$

In applications, we consider the case when only subgroups HL and HR act by left and right translations on G. The moments live in the duals of the Lie algebras HL and HR , hence require natural projections from G ∗ , induced by restriction to the subalgebras. Specifically, if ξ is a linear form
on G its restriction to H is an element of H∗ , and we denote it by $P_{H^*}\xi$. We have shown the:

Proposition. Let HL and HR be two subgroups of G acting by left and right translations respectively on T ∗ G. The moment maps associated with these actions are:

$$P_L(g, \xi) = -P_{H_L^*}\, g \xi g^{-1}, \qquad P_R(g, \xi) = P_{H_R^*}\, \xi$$
We often need to compute Poisson brackets of functions on T ∗ G. We have natural elementary functions on this phase space, namely the quantities ξ(X) for any given X ∈ G, and the matrix elements ρij (g) of g in any faithful representation of G. Any other function can be expressed as a function of these elementary ones. So it is enough to give the Poisson brackets of these elementary functions. Proposition. The Poisson brackets of the elementary functions on T ∗ G read: {ρij (g), ρkl (g)} = 0, {ξ(X), ρij (g)} = ρij (gX), {ξ(X), ξ(Y )} = ξ([X, Y ]) Proof. These relations are consequences of the fact that HX = ξ(X) is the Hamiltonian generating right translations. Since, by the general theory, the action is Poissonian, we have {HX , HY } = H[X,Y ] and this gives the last equation. Moreover, {HX , ρij (g)} is the infinitesimal action of the right translation by X on the function ρij (g), namely the action (hρij )(g) = ρij (gh) (note that we have here gh because this is an action on functions). So {HX , ρij (g)} = ρij (gX), proving the second equation. Finally, the first equation is obvious since the ρij (g) only depend on the position variables and not the momenta. The Poisson bracket on ξ is the Kirillov bracket. We often drop the explicit reference to the representation ρ in the above formulae, but it is important to keep in mind that the g occurring in these equations is a function on phase space, and not a point on the base.
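As an illustration of the Kirillov bracket {ξ(X), ξ(Y)} = ξ([X, Y]), here is a minimal sketch in Python. It assumes the Lie algebra so(3), identified with R³ so that the Lie bracket is the cross product; this choice, the sample vectors and all function names are illustrative assumptions and do not come from the text. The sketch checks antisymmetry and the Jacobi identity on linear functions.

```python
# Minimal sketch: the Kirillov bracket {xi(X), xi(Y)} = xi([X, Y]) for so(3).
# Assumption: so(3) is identified with R^3, the Lie bracket being the cross product.

def bracket(x, y):
    """Lie bracket of so(3) ~ R^3, i.e. the cross product."""
    return (
        x[1] * y[2] - x[2] * y[1],
        x[2] * y[0] - x[0] * y[2],
        x[0] * y[1] - x[1] * y[0],
    )

def pair(xi, x):
    """The linear function xi(X) on the dual of the Lie algebra."""
    return sum(a * b for a, b in zip(xi, x))

def kirillov(xi, x, y):
    """Value at xi of the bracket {xi(X), xi(Y)} = xi([X, Y])."""
    return pair(xi, bracket(x, y))

def jacobiator(xi, x, y, z):
    """{xi(X), {xi(Y), xi(Z)}} plus cyclic permutations, evaluated at xi.

    The inner bracket is again a linear function, xi([Y, Z]), so the
    nested bracket is xi([X, [Y, Z]]).
    """
    return (
        kirillov(xi, x, bracket(y, z))
        + kirillov(xi, y, bracket(z, x))
        + kirillov(xi, z, bracket(x, y))
    )

xi = (2, -1, 3)                              # arbitrary sample point of the dual
X, Y, Z = (1, 0, 2), (0, 3, -1), (4, 1, 1)   # arbitrary sample algebra elements

print(kirillov(xi, X, Y), kirillov(xi, Y, X))  # opposite values (antisymmetry)
print(jacobiator(xi, X, Y, Z))                 # vanishes (Jacobi identity)
```

The Jacobi identity of the bracket on linear functions reduces here to the Jacobi identity of the Lie algebra itself, which is why the last number printed is zero for any choice of sample vectors.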
14.5 Poisson–Lie groups

Consider two Poisson manifolds M1 and M2 . The cartesian product M1 × M2 is also equipped with a natural Poisson structure as follows. The space of functions on M1 × M2 is the tensor product of the space of functions on M1 and the space of functions on M2 . That is, one can write any such function in the form $f(x, y) = \sum_i f_i^{(1)}(x) f_i^{(2)}(y)$, where the sum is in general infinite and requires some topology for its precise definition. We then define for two functions f(x, y) and g(x, y) the Poisson bracket:

$$\{f, g\}_{M_1 \times M_2} = \sum_{ij} \left( \{f_i^{(1)}, g_j^{(1)}\}_{M_1}\, f_i^{(2)} g_j^{(2)} + \{f_i^{(2)}, g_j^{(2)}\}_{M_2}\, f_i^{(1)} g_j^{(1)} \right)$$

This obeys all the properties of a Poisson bracket, and implies that functions on M1 Poisson commute with functions on M2 . In particular, if G is a Lie group endowed with a Poisson structure, the product G × G has a Poisson structure and one may wonder whether the multiplication (g, h) → gh from G × G to G is compatible with the respective Poisson structures. More precisely, if we have two Poisson manifolds M and N and a map φ : M → N , this map is said to be Poisson if for any two functions f1 , f2 on N we have {f1 ◦ φ, f2 ◦ φ}M = {f1 , f2 }N ◦ φ. In our case the multiplication is Poisson if we have:

$$\{f_1(gh), f_2(gh)\}_{G \times G} = \{f_1, f_2\}_G (gh)$$

where in the left-hand side $f_{1,2}(gh)$ are to be viewed as functions on G × G.

Definition. A Poisson–Lie group G is a Lie group G equipped with a Poisson structure such that the multiplication in G, viewed as a map G × G → G, is a Poisson mapping.

To describe the Poisson structure on G one can use the Lie algebra G to label the derivatives of any function at a point g ∈ G. We first choose a basis $E_a$ of the Lie algebra G:

$$[E_a, E_b]_G = f_{ab}^{c} E_c \qquad (14.13)$$

and consider the right-invariant vector fields $\nabla^R_a$ defined as:

$$\nabla^R_a f(g) = \frac{d}{dt} f(e^{tE_a} g)\Big|_{t=0}$$

which form a basis of the tangent space Tg G. The Poisson bracket of two functions f1 , f2 on G can be written as a bilinear combination of derivatives with coefficients $\eta^{ab}(g)$ as:

$$\{f_1, f_2\}_G(g) = \sum_{a,b} \eta^{ab}(g)\, (\nabla^R_a f_1)(g)\, (\nabla^R_b f_2)(g) \qquad (14.14)$$

The coefficients $\eta^{ab}(g)$ contain all the information on the Poisson structure, and we will express the Lie–Poisson condition on them. For this, it is convenient to introduce the element η(g) ∈ G ⊗ G:

$$\eta(g) = \sum_{a,b} \eta^{ab}(g)\, E_a \otimes E_b$$
Any function on G can be expressed in terms of elementary functions, i.e. the matrix elements of a faithful representation ρ of G. It is sufficient to express the Poisson brackets of such elementary functions. If we consider the function g → ρ(g) we have $\nabla^R_a \rho(g) = \rho(E_a g) = \rho(E_a)\rho(g)$. The Poisson bracket of two matrix elements then reads:

$$\{\rho_{ij}(g), \rho_{kl}(g)\}_G = \left(\rho_{ij} \otimes \rho_{kl}\right)\left(\eta(g)\, g \otimes g\right)$$

In the following we drop the explicit mention of the representation ρ and write this formula in the usual tensor notation:

$$\{g_1, g_2\}_G = \eta_{12}(g)\; g \otimes g \qquad (14.15)$$
Note that the antisymmetry of the Poisson bracket (14.14) requires $\eta_{12} = -\eta_{21}$.

Proposition. The Lie–Poisson property is equivalent to the following cocycle condition on η:

$$\eta(gh) = \eta(g) + \mathrm{Ad}_g \cdot \eta(h)$$

where the adjoint action on G ⊗ G is defined as $\mathrm{Ad}_g \cdot \eta(h) = g \otimes g\; \eta(h)\; g^{-1} \otimes g^{-1}$.

Proof. We have $\{(gh)_1, (gh)_2\}_{G \times G} = \{g_1, g_2\}_G\, h_1 h_2 + g_1 g_2\, \{h_1, h_2\}_G$ since ρ(gh) = ρ(g)ρ(h). This has to be equal to $\eta(gh)\,(gh)_1 (gh)_2$. Comparing the two expressions, the cocycle condition follows.

This condition is called a cocycle condition because it means that η is a 1-cocycle for the Hochschild group cohomology of G with values in the representation G ⊗ G. Looking at the vicinity of the identity e in G, that is writing g = exp(tX) and h = exp(t′Y), with t, t′ small and $\eta(e^{tX}) = \eta(e) + t\, d_e\eta(X) + \cdots$, the cocycle condition implies, at order 0 in t, t′, that η(e) = 0, and at second order that

$$d_e\eta([X, Y]) = [\Delta X, d_e\eta(Y)] - [\Delta Y, d_e\eta(X)] \qquad (14.16)$$

with ∆X = X ⊗ 1 + 1 ⊗ X. This means that the linear function $d_e\eta$ on G is a 1-cocycle for the Lie algebra cohomology with values in the same representation G ⊗ G.

One can use $d_e\eta$ to introduce a Lie algebra structure on G ∗ . For any function f on G the differential $d_e f$ at the identity is a linear function on G, hence an element of G ∗ . Considering two functions f1 and f2 , eq. (14.14) shows that the differential $d_e\{f_1, f_2\}$ only depends on $d_e f_1$ and $d_e f_2$ and is proportional to $d_e\eta$, since η(e) = 0.
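The cocycle condition can be checked numerically on a simple family of examples. The sketch below assumes 2×2 matrices and takes η(g) = r − Ad_g r for a constant tensor r, a choice made here only for illustration; the particular r, g and h and all helper names are assumptions of the sketch. It verifies η(gh) = η(g) + Ad_g · η(h), and also that the product rule for the bracket (14.15) on G × G reproduces η(gh) (gh ⊗ gh).

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product a ⊗ b (rows indexed by (i, k), columns by (j, l))."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inv2(g):
    """Exact inverse of an invertible 2x2 matrix."""
    det = Fraction(g[0][0] * g[1][1] - g[0][1] * g[1][0])
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def Ad(g, t):
    """Adjoint action on tensors: (g ⊗ g) t (g^-1 ⊗ g^-1)."""
    gi = inv2(g)
    return matmul(matmul(kron(g, g), t), kron(gi, gi))

def eta(g, r):
    """Illustrative cocycle eta(g) = r - Ad_g r for a constant tensor r."""
    return sub(r, Ad(g, r))

E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
r = sub(kron(E, F), kron(F, E))   # sample constant tensor, chosen antisymmetric
g = [[2, 1], [1, 1]]              # sample group elements (determinant 1)
h = [[1, 2], [1, 3]]
gh = matmul(g, h)

# Cocycle condition eta(gh) = eta(g) + Ad_g eta(h).
cocycle_ok = eta(gh, r) == add(eta(g, r), Ad(g, eta(h, r)))

# Product rule {g1, g2} h1 h2 + g1 g2 {h1, h2} versus eta(gh) (gh ⊗ gh).
lhs = add(matmul(matmul(eta(g, r), kron(g, g)), kron(h, h)),
          matmul(matmul(kron(g, g), eta(h, r)), kron(h, h)))
rhs = matmul(eta(gh, r), kron(gh, gh))
multiplicative_ok = lhs == rhs

print(cocycle_ok, multiplicative_ok)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to a floating-point tolerance.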
Definition. The Poisson bracket { , }G defines a Lie algebra structure on G ∗ by the following formula:

$$[d_e f_1, d_e f_2]_{G^*} = d_e \{f_1, f_2\}_G \qquad (14.17)$$

The Jacobi identity for the Lie algebra structure is a direct consequence of the Jacobi identity for the Poisson bracket on G.

Introducing a basis $(E^a)$ in G ∗ , dual to the basis $(E_a)$ in G, the differential at the identity can be written as $d_e f = \sum_a E^a (\nabla_a f) \in G^*$, where $\nabla_a f = \frac{d}{dt} f(e^{tE_a})|_{t=0}$. Similarly,

$$d_e\eta = E^c\, \nabla_c \eta = C_c^{ab}\; E^c \otimes E_a \otimes E_b$$

With these notations eq. (14.17) reads:

$$[E^a, E^b]_{G^*} = C_c^{ab} E^c \qquad (14.18)$$

so the structure constants are $C_c^{ab} = (\nabla_c \eta^{ab})$. We shall denote by G∗ the connected Lie group with Lie algebra G ∗ .

Let G be a Poisson–Lie group and let D = G ⊕ G ∗ . This is called the classical double. One can introduce a Lie algebra structure on D which extends the Lie algebra structures on G and G ∗ and such that the elements of G and G ∗ do not commute. In terms of a basis $(E_a)$ ∈ G and its dual $(E^a)$ ∈ G ∗ , this structure reads:

$$[E_a, E_b] = f_{ab}^{c} E_c, \qquad [E^a, E_b] = f_{bc}^{a} E^c - C_b^{ac} E_c, \qquad [E^a, E^b] = C_c^{ab} E^c \qquad (14.19)$$

Proposition. The above brackets define a Lie algebra structure on D.

Proof. One defines also $[E_b, E^a] = -[E^a, E_b]$ so the antisymmetry is obvious. We need to verify the Jacobi identity. Due to the Jacobi identity on G and G ∗ one has only to verify the two cases $[E^a, [E^b, E_c]] + [E_c, [E^a, E^b]] + [E^b, [E_c, E^a]] = 0$ and $[E^a, [E_b, E_c]] + [E_c, [E^a, E_b]] + [E_b, [E_c, E^a]] = 0$. These relations reduce to:

$$C_d^{ab} f_{cl}^{d} - C_l^{bd} f_{dc}^{a} + C_l^{ad} f_{dc}^{b} + C_c^{db} f_{ld}^{a} - C_c^{da} f_{ld}^{b} = 0$$

when using the Jacobi identity on the structure constants $C_c^{ab}$ and $f_{bc}^{a}$. This is just the cocycle relation, eq. (14.16).
Remark. With this bracket on the double, one can construct a solution of the Yang–Baxter equation:

$$r_{12} = \sum_a E_a \otimes E^a \in G \otimes G^*$$

The Yang–Baxter equation:

$$[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] = \sum_{a,b} \left( [E_a, E_b] \otimes E^a \otimes E^b + E_a \otimes [E^a, E_b] \otimes E^b + E_a \otimes E_b \otimes [E^a, E^b] \right) = 0$$

is identically satisfied due to the definition of the commutators on D.
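The same equation can be tested numerically on a concrete matrix example. The sketch below does not use the double D: it assumes the standard r-matrix of sl(2) in its 2×2 representation, r⁺ = E ⊗ F + ¼ H ⊗ H, together with r⁻ = −F ⊗ E − ¼ H ⊗ H, which satisfies r⁻₁₂ = −r⁺₂₁. These tensors and all helper names are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product a ⊗ b."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    """Matrix commutator [a, b]."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

I2 = [[1, 0], [0, 1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

def embed(terms, i, j):
    """Sum of c * (a in slot i) ⊗ (b in slot j) in a triple tensor product (slots 0, 1, 2)."""
    total = [[0] * 8 for _ in range(8)]
    for c, a, b in terms:
        factors = [I2, I2, I2]
        factors[i], factors[j] = a, b
        total = add(total, scale(c, kron(kron(factors[0], factors[1]), factors[2])))
    return total

def cybe(terms):
    """Left-hand side [r12, r13] + [r12, r23] + [r13, r23] as an 8x8 matrix."""
    r12, r13, r23 = embed(terms, 0, 1), embed(terms, 0, 2), embed(terms, 1, 2)
    return add(add(comm(r12, r13), comm(r12, r23)), comm(r13, r23))

def is_zero(m):
    return all(x == 0 for row in m for x in row)

quarter = Fraction(1, 4)
r_plus = [(1, E, F), (quarter, H, H)]      # assumed sample r-matrix
r_minus = [(-1, F, E), (-quarter, H, H)]   # minus its flip

print(is_zero(cybe(r_plus)), is_zero(cybe(r_minus)))
```

Each r-matrix is stored as a list of (coefficient, left factor, right factor) terms so that the same data can be embedded in slots 12, 13 and 23 of the triple tensor product.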
14.6 Action of a Poisson–Lie group on a symplectic manifold

Let G be a Poisson–Lie group and M be a symplectic manifold. We assume that G acts on M . In this case the natural compatibility condition between the group action and the Poisson structures on G and M is:

Definition. The action of a Poisson–Lie group on a symplectic manifold is a Lie–Poisson action if for any g ∈ G and any functions f1 and f2 on M , we have:

$$\{f_1(g \cdot m), f_2(g \cdot m)\}_{G \times M} = \{f_1, f_2\}_M (g \cdot m) \qquad (14.20)$$

Here the Poisson structure on G × M is the product Poisson structure. At the infinitesimal level, let X ∈ G and denote by X · m the vector field on M corresponding to the infinitesimal transformation generated by X. For any function f on M we define the action of X on f by:

$$(X.f)(m) = \frac{d}{dt} f(e^{-tX} \cdot m)\Big|_{t=0} = \langle \zeta_f(m), X \rangle$$

where ⟨ , ⟩ denotes the pairing between G and G ∗ . This defines a function $\zeta_f : m \to \zeta_f(m) \in G^*$.

Proposition. The infinitesimal form of eq. (14.20) for a Lie–Poisson action is:

$$\{X \cdot f_1, f_2\}_M + \{f_1, X \cdot f_2\}_M + \langle [\zeta_{f_1}, \zeta_{f_2}]_{G^*}, X \rangle = X \cdot \{f_1, f_2\}_M \qquad (14.21)$$

or, equivalently, $\{\zeta_{f_1}, f_2\}_M + \{f_1, \zeta_{f_2}\}_M + [\zeta_{f_1}, \zeta_{f_2}]_{G^*} = \zeta_{\{f_1, f_2\}_M}$.

Proof. The definition of the product Poisson bracket is equivalent to:

$$\{f_1(g \cdot m), f_2(g \cdot m)\}_{G \times M} = \{f_1(g \cdot m), f_2(g \cdot m)\}_M + \{f_1(g \cdot m), f_2(g \cdot m)\}_G$$

In the right-hand side of this formula, in the term { , }M the functions $f_{1,2}(g \cdot m)$ are viewed as functions of m, and g is a parameter, while in the term { , }G they are viewed as functions of g and m is a parameter. So the Lie–Poisson condition reads:

$$\{f_1(g \cdot m), f_2(g \cdot m)\}_M + \{f_1(g \cdot m), f_2(g \cdot m)\}_G = \{f_1, f_2\}_M (g \cdot m)$$
Setting $g = e^{-tX}$ and taking t infinitesimal, the first Poisson bracket becomes $-\{X \cdot f_1(m), f_2(m)\}_M - \{f_1(m), X \cdot f_2(m)\}_M$ and the right-hand side becomes $-X \cdot \{f_1, f_2\}_M(m)$. The second Poisson bracket is $-\langle d_e\{f_1(g \cdot m), f_2(g \cdot m)\}_G, X \rangle$, where $d_e$ is the differential of a function on G taken at g = e. By definition of the Lie bracket on G ∗ this is $-\langle [d_e f_1, d_e f_2]_{G^*}, X \rangle$. One gets eq. (14.21) noting that $\zeta_f(m) = -d_e f(g \cdot m)$.

Introducing two dual bases of the Lie algebras G and G ∗ , $E_a \in G$ and $E^a \in G^*$ with $\langle E^a, E_b \rangle = \delta^a_b$, eq. (14.21) becomes, using $\zeta_f = \sum_a (E_a \cdot f) E^a$:

$$\{E_a \cdot f_1, f_2\}_M + \{f_1, E_a \cdot f_2\}_M - C_a^{bc} (E_b \cdot f_1)(E_c \cdot f_2) = E_a.\{f_1, f_2\}_M \qquad (14.22)$$

It follows immediately from eq. (14.21) that a Lie–Poisson action cannot be symplectic if the algebra G ∗ is non-Abelian. Hence we cannot expect that infinitesimal group actions are locally generated by Hamiltonians as in the symplectic case. There is, however, a generalization of this notion, in the Lie–Poisson case, by what are called non-Abelian Hamiltonians.

Proposition. Assume that a Poisson–Lie group G acts on M by a Lie–Poisson action. Then, there exists a function Γ, locally defined on M and taking values in the group G∗ , with Lie algebra G ∗ , such that for any function f on M ,

$$X.f = \langle \Gamma^{-1} \{f, \Gamma\}_M, X \rangle, \qquad \forall X \in G \qquad (14.23)$$

Equivalently, $\zeta_f(m) = \Gamma^{-1}\{f, \Gamma\}_M(m)$. We will refer to Γ as the non-Abelian Hamiltonian of the Lie–Poisson action.

Proof. Introduce the Darboux coordinates $(q^i, p_i)$ on the symplectic manifold M . For any X ∈ G expand $X \cdot m = \sum_i \left( X^{p_i} \partial_{p_i} + X^{q^i} \partial_{q^i} \right)$ and introduce the form $\Omega_X = \sum_i \left( X^{q^i} dp_i - X^{p_i} dq^i \right)$. Finally, let Ω be the G ∗ -valued 1-form $\Omega = \sum_a E^a\, \Omega_{E_a}$. As in the symplectic case, eq. (14.21) is then equivalent to the following zero-curvature condition for Ω:

$$d\Omega + [\Omega, \Omega]_{G^*} = 0$$

Therefore, locally on M , $\Omega = \Gamma^{-1} d\Gamma$ with Γ ∈ G∗ . This proves eq. (14.23). The converse is true: an action generated by a non-Abelian Hamiltonian as in eq. (14.23) is Lie–Poisson since we have:

$$X \cdot \{f_1, f_2\}_M - \{X \cdot f_1, f_2\}_M - \{f_1, X \cdot f_2\}_M = \left\langle \left[ \Gamma^{-1}\{f_1, \Gamma\}_M, \Gamma^{-1}\{f_2, \Gamma\}_M \right]_{G^*}, X \right\rangle$$
In the Abelian case, we have Γ(m) = exp(−P(m)), where P is the momentum taking values in the Abelian Lie algebra G ∗ . This is because, in the symplectic case, eq. (14.23) becomes $X \cdot f = \langle \{P, f\}, X \rangle = \{H_X, f\}$. Hence Γ is the non-Abelian generalization of the moment map.

14.7 The groups G and G∗

As a preparation for our study of dressing transformations, we apply the previous results to a more specific situation. Let G be a Lie algebra with a bilinear invariant form denoted by Tr, with associated connected Lie group G. We denote by C the tensor Casimir in G ⊗ G. We equip G with a Lie–Poisson structure by choosing a cocycle η(g). A simple way to fulfil the cocycle condition is to take for η(g) a coboundary $\eta_{12}(g) = r_{12} - \mathrm{Ad}_g\, r_{12}$ where r is a constant element in G ⊗ G. Then eq. (14.15) becomes:

$$\{g_1, g_2\}_G = [r_{12}, g_1 g_2] \qquad (14.24)$$
This is the Sklyanin bracket. In this case the Lie–Poisson property is easy to check. One has to show that:

$$\{(gh)_1, (gh)_2\}_{G \times G} = [r_{12}, (gh)_1 (gh)_2]$$

This is obvious when we notice that, in the product Poisson structure on G × G, g and h Poisson commute. The computation then reduces to the computation we have done for the Sklyanin approach to the closed Toda chain, see Chapter 6. Since in G × G, g and h Poisson commute, we have:

$$\{(gh)_1, (gh)_2\}_{G \times G} = \{g_1, g_2\}_G\, h_1 h_2 + g_1 g_2\, \{h_1, h_2\}_G = [r_{12}, g_1 g_2]\, h_1 h_2 + g_1 g_2\, [r_{12}, h_1 h_2] = [r_{12}, g_1 h_1 g_2 h_2]$$

Note that $r_{12}$ is defined only up to the addition of a multiple of the Casimir element which drops out of η(g). We define $r^{\pm}_{12} = r_{12} \pm \frac{1}{2} C_{12}$. Antisymmetry of the Poisson bracket is ensured by choosing $r_{12}$ antisymmetric, so that $r^{\pm}_{12} = -r^{\mp}_{21}$. The Jacobi identity for the Poisson bracket is ensured by requiring that $r^{\pm}_{12}$ are solutions of the classical Yang–Baxter equation:

$$[r^{\pm}_{12}, r^{\pm}_{13}] + [r^{\pm}_{12}, r^{\pm}_{23}] + [r^{\pm}_{13}, r^{\pm}_{23}] = 0 \qquad (14.25)$$

Equation (14.25) is an equation in G ⊗ G ⊗ G, and the indices on $r^{\pm}$ refer to the copies of G on which $r^{\pm}$ is acting. Using the bilinear form Tr to identify the vector spaces G ∗ and G, the elements $r^{\pm}_{12}$ of G ⊗ G can be mapped into elements $R^{\pm} \in G \otimes G^* \cong \mathrm{End}\, G$ defined by:

$$R^{\pm}(X) = \mathrm{Tr}_2 \left( r^{\pm}_{12} (1 \otimes X) \right), \qquad \forall X \in G \qquad (14.26)$$
Note that we have $R^+ - R^- = \mathrm{Id}$. The Poisson bracket (14.24) on G induces a Lie algebra structure on G ∗ by eq. (14.17). Identifying the vector spaces G and G ∗ by Tr, the bracket on G ∗ is mapped to the R-bracket:

$$[X, Y]_R = [R^{\pm}(X), Y] + [X, R^{\mp}(Y)] = [R(X), Y] + [X, R(Y)] \qquad (14.27)$$

with $R = \frac{1}{2}(R^+ + R^-)$. We gave a detailed analysis of this bracket in Chapter 4, and all the results apply here. We simply recall that because $R^+ - R^- = \mathrm{Id}$, any X ∈ G admits a decomposition as:

$$X = X_+ - X_-, \qquad X_{\pm} = R^{\pm}(X) = \mathrm{Tr}_2 \left( r^{\pm}_{12} (1 \otimes X) \right) \qquad (14.28)$$

In terms of the components $X_+$ and $X_-$, the commutator in G ∗ becomes:

$$[X, Y]_R = [X_+, Y_+] - [X_-, Y_-]$$

In particular, the plus and minus components commute in G ∗ . Moreover, $R^{\pm}$ are Lie algebra homomorphisms so that $X_{\pm}$ live in two subalgebras $G_{\pm}$ of G. Recall also that the image of G ∗ in $G_- \oplus G_+$ is the set of $X = X_+ - X_-$ such that $\theta(X_+) = X_-$, see Chapter 4. By exponentiation, the subalgebras $G_{\pm}$ correspond to connected Lie subgroups $G_{\pm}$ of G, and the group G∗ can be viewed as the set of pairs $(g_-, g_+)$, subjected to some condition $\theta(g_+) = g_-$, with product law:

$$(g_-, g_+) \cdot (h_-, h_+) = (g_- h_-, g_+ h_+) \qquad (14.29)$$

Any element g ∈ G (in a neighbourhood of the identity) admits a unique factorization as:

$$g = g_-^{-1} g_+ \qquad (14.30)$$

with $\theta(g_+) = g_-$. This associates with g ∈ G a unique element $(g_-, g_+)$ of G∗ through a factorization problem. So as sets, G and G∗ are identified, but they have different group structures. The group G∗ itself becomes a Poisson–Lie group if we introduce on it the Semenov-Tian-Shansky Poisson bracket:

$$\begin{aligned}
\{(g_+)_1, (g_+)_2\}_{G^*} &= -\left[ r^{\pm}_{12}, (g_+)_1 (g_+)_2 \right] \\
\{(g_-)_1, (g_-)_2\}_{G^*} &= -\left[ r^{\mp}_{12}, (g_-)_1 (g_-)_2 \right] \\
\{(g_-)_1, (g_+)_2\}_{G^*} &= -\left[ r^{-}_{12}, (g_-)_1 (g_+)_2 \right] \\
\{(g_+)_1, (g_-)_2\}_{G^*} &= -\left[ r^{+}_{12}, (g_+)_1 (g_-)_2 \right]
\end{aligned} \qquad (14.31)$$
or, for the factorized element $g = g_-^{-1} g_+$:

$$\{g_1, g_2\}_{G^*} = -g_1\, r^{+}_{12}\, g_2 - g_2\, r^{-}_{12}\, g_1 + g_1 g_2\, r^{\pm}_{12} + r^{\mp}_{12}\, g_1 g_2 \qquad (14.32)$$
The multiplication in G∗ is a Poisson map for the brackets (14.31). The group G∗ is therefore a Poisson–Lie group.

14.8 The group of dressing transformations

We use the above results to understand the Poisson structure of the dressing transformations introduced in Chapter 3.

Definition. Let G be a Poisson–Lie group associated with an r-matrix, and G∗ its dual Poisson–Lie group. We define an action of G∗ on G which we call a dressing transformation. We identify G∗ to G as sets, via the factorization problem $g = g_-^{-1} g_+$. The dressing of x ∈ G by g ∈ G∗ is defined by:

$$(g = g_-^{-1} g_+ \in G^*,\; x \in G) \;\longrightarrow\; {}^{g}x = (x g x^{-1})_{\pm}\; x\; g_{\pm}^{-1} \in G \qquad (14.33)$$

In this equation $(x g x^{-1})_{\pm}$ refers to the factorization

$$(x g x^{-1})_-^{-1}\, (x g x^{-1})_+ = (x g x^{-1})$$

and this implies that the two signs give the same result for ${}^{g}x$.

Proposition. The action $x \to {}^{g}x$ is a group action of G∗ on G.

Proof. We have to show that ${}^{g}({}^{h}x) = {}^{gh}x$, that is:

$$\left( {}^{h}x\; g\; {}^{h}x^{-1} \right)_{\pm}\; {}^{h}x\; g_{\pm}^{-1} = \left( x (gh) x^{-1} \right)_{\pm}\; x\; (gh)_{\pm}^{-1} \qquad (14.34)$$

Introducing the notation, for any h ∈ G∗ : $\Theta^h_{\pm} = (x h x^{-1})_{\pm}$, so that ${}^{h}x = \Theta^h_{\pm}\, x\, h_{\pm}^{-1}$, and using the sign freedom to write the first ${}^{h}x$ in $({}^{h}x\, g\, {}^{h}x^{-1})_{\pm}$ with the minus sign and the second one with the plus sign, the left-hand side of eq. (14.34) reads:

$$\left( \Theta^h_-\, x\, h_-^{-1}\, g\, h_+\, x^{-1}\, (\Theta^h_+)^{-1} \right)_{\pm}\; \Theta^h_{\pm}\, x\, h_{\pm}^{-1}\, g_{\pm}^{-1}$$

Since $g = g_-^{-1} g_+$, and due to the definition of the group law in G∗ , $h_-^{-1}\, g\, h_+ = (gh)$. Moreover, since by definition $x (gh) x^{-1} = (\Theta^{(gh)}_-)^{-1}\, \Theta^{(gh)}_+$ we get:

$$\Theta^h_-\, x\, h_-^{-1}\, g\, h_+\, x^{-1}\, (\Theta^h_+)^{-1} = \Theta^h_-\, (\Theta^{(gh)}_-)^{-1}\, \Theta^{(gh)}_+\, (\Theta^h_+)^{-1}$$
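so that one reads the factorization:

$$\left( \Theta^h_-\, x\, h_-^{-1}\, g\, h_+\, x^{-1}\, (\Theta^h_+)^{-1} \right)_{\pm} = \Theta^{(gh)}_{\pm}\, (\Theta^h_{\pm})^{-1}$$

From this the result follows immediately.

The infinitesimal form of eq. (14.33) is, for any X ∈ G with $X = X_+ - X_-$:

$$\delta_X x = Y_{\pm}\, x - x\, X_{\pm} \quad \text{with} \quad Y_{\pm} = (x X x^{-1})_{\pm} \qquad (14.35)$$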
One of the main properties of this action is that it is a Lie–Poisson action of G∗ on G if the groups G and G∗ are equipped with the Poisson structures defined in eqs. (14.24) and (14.31), i.e. we have:

$$\{{}^{g}x_1, {}^{g}x_2\}_{G^* \times G} = \left[ r^{\pm}_{12},\; {}^{g}x_1\; {}^{g}x_2 \right]$$

We will prove this fact by exhibiting a non-Abelian Hamiltonian for dressing transformations, so that they are automatically Lie–Poisson.

Proposition. The non-Abelian Hamiltonian of the dressing transformations (14.33), which is an element of $G^{**} \cong G$, is the identity function of G, i.e. it is x itself:

$$X \cdot x = -\delta_X x = \mathrm{Tr}_2 \left( x_2^{-1} \{x_2, x_1\}_G\, X_2 \right), \qquad \forall X \in G,\; x \in G \qquad (14.36)$$

Proof. First note that, as usual, the action on functions is defined with the inverse group element, so that the action of X on the function x is $-\delta_X x = -(x X x^{-1})_{\pm}\, x + x X_{\pm}$. From eq. (14.24), in which we take Γ(x) = x, we have:

$$\begin{aligned}
\mathrm{Tr}_2 \left( x_2^{-1} \{x_2, x_1\}_G\, X_2 \right) &= -\mathrm{Tr}_2 \left( x_2^{-1} [r^{\pm}_{12}, x_1 x_2]\, X_2 \right) \\
&= -\mathrm{Tr}_2 \left( r^{\pm}_{12}\, x_1 x_2 X_2 x_2^{-1} \right) + \mathrm{Tr}_2 \left( x_1\, r^{\pm}_{12}\, X_2 \right) \\
&= -(x X x^{-1})_{\pm}\, x + x X_{\pm} = -\delta_X x
\end{aligned}$$

It is a remarkable fact that there exists a non-Abelian Hamiltonian, since the group G with the Sklyanin bracket is not a symplectic manifold, the bracket being degenerate.

References

[1] V. Arnold, Méthodes mathématiques de la Mécanique classique. MIR, Moscow (1976).
[2] R. Abraham and J. Marsden, Foundations of Mechanics. Benjamin, Reading, Massachusetts (1978).
[3] M. Semenov-Tian-Shansky, Dressing transformations and Poisson group actions. Publ. RIMS 21 (1985) 1237.
[4] V. Drinfeld, Hamiltonian structures on Lie groups, Lie bialgebras and the geometric meaning of classical Yang–Baxter equations. Soviet Math. Dokl. 27 (1983) no. 1, 68–71.
[5] J.H. Lu, Multiplicative and Affine Poisson Structures on Lie Groups. PhD Thesis, University of California at Berkeley (1990).
15 Riemann surfaces
Riemann surfaces play a ubiquitous role in the analytic study of integrable systems. Here it is fundamental to see Riemann surfaces both as smooth analytical one-dimensional varieties and as the desingularization of the locus of an algebraic equation P(x, y) = 0. We explain the notion of line bundle which arises naturally in the study of integrable systems. This allows us to provide a proof of the Riemann–Roch theorem. This theorem is the main enumerative tool in our applications. Riemann himself discovered the profound implications of theta functions and notably the geometry of the theta divisor in the subject. The starting point is Riemann’s theorem which we use to exhibit explicit solutions of integrable systems in terms of theta-functions. We close the chapter by sketching Birkhoff’s proof of the Riemann–Hilbert factorization theorem, which plays a central role throughout the book.

15.1 Smooth algebraic curves

Riemann surfaces are compact smooth analytic varieties of dimension 1. This means that around each point p there is a neighbourhood and a local parameter z(p) mapping it homeomorphically to an open disc |z| < 1 of the complex numbers. Moreover, in the intersection of two such neighbourhoods the corresponding local parameters z1(p) and z2(p) must be related by an analytic bijection. Hence, locally a smooth curve looks like the complex line. Finally, a Riemann surface is compact, hence it is a closed surface without boundary.

For our purposes it is very important to look at Riemann surfaces from an algebraic viewpoint, that is as the locus in C² of an algebraic equation P(x, y) = 0, where P is a polynomial in the complex variables x and y.
At a generic point of this locus one has ∂x P ≠ 0 and ∂y P ≠ 0, hence both x and y can be taken as analytic local parameters in the vicinity of this point. If P has degree N in y we can find N analytic solutions y = fj(x) for j = 1, . . . , N defined in some open set of the x variable. These are the N branches of the curve presented as an N-fold covering of the complex line of the x variable. This situation can be analytically continued until one gets to a point where some branches meet, that is the equation in y, P(x, y) = 0, has a multiple root, so that ∂y P = 0. Generically one still has ∂x P ≠ 0 at such a point and one can choose y instead of x as local parameter in the vicinity of this point which is a perfectly smooth point on the curve. The covering projection (x, y) → x, however, is branched at this point, that is several branches coalesce to one point. We say that P(x, y) = 0 expresses the curve as an N-fold branched covering of the complex line. Branch points occur when the discriminant of P(x, y), viewed as a polynomial in y, vanishes. This is a polynomial in x, hence has a finite number of roots above which the covering projection branches.

A more complicated situation arises at points where both partial derivatives of P(x, y) vanish. This means that locally the curve looks like the intersection of several lines, hence is not smooth. In order to make contact with our general definition of a Riemann surface one has to perform an operation called desingularization, which basically consists of replacing the singular point by several ordinary points while leaving the analytic structure of the neighbourhood untouched. An easy way to understand how this can be achieved is to consider birational transformations. They are mappings of the complex plane to itself (x, y) → (x′ = x′(x, y), y′ = y′(x, y)), where x′ and y′ are rational functions of x and y such that one can also express x and y as rational functions of x′ and y′.

At a non-singular point of such a transformation it is clearly bijective and preserves the analytic structure. A simple example is the quadratic transformation x′ = 1/x and y′ = 1/y, which is obviously bijective and analytic on its domain of definition. When transforming the equation P(x, y) = 0 under such a mapping chosen to have its singular set precisely at a singular point of the curve one may blow up the singular point into several ordinary points of the transformed curve.

Let us see how this works on a simple example. Consider the curve y² = x² + x³, which is singular at (0, 0), and the birational transformation x = x′, y = x′y′, which can be inverted by y′ = y/x. It is obviously bijective except at the singular point (0, 0) where y′ is indeterminate. Substituting into the equation of the curve one gets y′² = 1 + x′ and we see that the two ordinary points x′ = 0 and y′ = ±1 project to the one singular point x = y = 0, while any other point of the transformed
curve bijectively corresponds to just one point of the initial curve. This bijection preserves the analytic structure of both curves. A similar construction can be done around any singular point of an algebraic curve and then patched with the analytic structure around ordinary points to get a smooth Riemann surface associated with the equation P(x, y) = 0. We say that this Riemann surface is the desingularized curve of the equation P(x, y) = 0.

Finally, to get a compact surface one has to consider points at infinity. We transform the equation P(x, y) = 0 under the quadratic transformation x = 1/x′, y = 1/y′ so that a chart around ∞ becomes an analytic chart around 0. In general one gets a singular point at (0, 0) which has to be desingularized by the above method. We shall give a simple example of this procedure in the next section, in which we study a type of Riemann surface frequently occurring in integrable models: the hyperelliptic curves.
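The blow-up of the nodal cubic y² = x² + x³ can be checked with exact arithmetic. The sketch below is an illustration only: the rational parametrisation (x′, y′) = (t² − 1, t) of the transformed curve y′² = 1 + x′, the sample values of t and all function names are assumptions introduced here. It verifies that points of the transformed curve are sent to the original curve by x = x′, y = x′y′, and that the two points (0, ±1) are both sent to the singular point (0, 0).

```python
from fractions import Fraction

def on_nodal_cubic(x, y):
    """True if (x, y) lies on the singular curve y^2 = x^2 + x^3."""
    return y * y == x * x + x ** 3

def on_blown_up_curve(xp, yp):
    """True if (x', y') lies on the transformed curve y'^2 = 1 + x'."""
    return yp * yp == 1 + xp

def blow_down(xp, yp):
    """The birational map x = x', y = x' y'."""
    return xp, xp * yp

def smooth_point(t):
    """Hypothetical helper: the point (x', y') = (t^2 - 1, t) on y'^2 = 1 + x'."""
    return t * t - 1, t

samples = [Fraction(n, 3) for n in range(-6, 7)]   # includes t = -1 and t = +1
images = [blow_down(*smooth_point(t)) for t in samples]

print(all(on_blown_up_curve(*smooth_point(t)) for t in samples))
print(all(on_nodal_cubic(x, y) for x, y in images))
print(blow_down(*smooth_point(1)), blow_down(*smooth_point(-1)))
```

Away from t = ±1 the parameter is recovered from the image as t = y/x, which is the inverse transformation y′ = y/x of the text, so distinct sample points have distinct images.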
15.2 Hyperelliptic curves

A hyperelliptic curve is the locus of an equation of the form:

$$y^2 = P(x), \qquad P(x) = \prod_{i=1}^{N} (x - a_i) \qquad (15.1)$$

where P(x) is a polynomial in x of degree N . In order not to have singular points we shall assume $a_i \neq a_j$ for $i \neq j$. There is an analytic involution of the curve into itself (x, y) → (x, −y) which is called the hyperelliptic involution, and the existence of such an automorphism in fact characterizes hyperelliptic curves. Of course the curve can be expressed as a two-sheeted covering of the complex line branched above the points $x = a_i$. Around such a point $(x = a_i, y = 0)$ one can take y as a perfectly smooth local parameter. For example, if $P(x) = x P_1(x)$ with $P_1(0) \neq 0$ one can express $x = y^2 + O(y^3)$ around (0, 0). Note that the situation would be entirely similar for an equation of the type $y^n = x P_1(x)$ around (0, 0) except that now n branches coalesce at the origin.

The only tricky point is to consider the situation at infinity. Performing the transformation x = 1/x′, y = 1/y′ one gets

$$y'^2 \prod_{i=1}^{N} (1 - x' a_i) - x'^N = 0$$

and we see that the origin is singular for N > 1. It is now necessary to distinguish two cases, N odd and even.
When N = 2g + 1, for some integer g, one can perform the birational transformation $x' = x''$ and $y' = \left( x''^{\,g} / \prod_i (1 - x'' a_i) \right) y''$ and the transformed equation reads $y''^2 - x'' \prod_i (1 - x'' a_i) = 0$. We see that the singular point (x′ = 0, y′ = 0) gives rise to a single point (x″ = 0, y″ = 0) on the desingularised curve, hence we say that there is just one point at infinity, but it is a branch point of the covering (x, y) → x. The local parameter around ∞ is y″ and we have locally:

$$x = \frac{1}{y''^2} \left( 1 + O(y'') \right), \qquad y = \frac{1}{y''^{\,2g+1}} \left( 1 + O(y'') \right)$$

Note that x and y are meromorphic functions on the Riemann surface with poles only at ∞ where, for example, x has a pole of order 2. Finally, one frequently says that $\sqrt{x}$ is a local parameter around ∞ since this is equivalent to 1/y″.

When N = 2g + 2 one performs the birational transformation $x' = x''$ and $y' = \left( x''^{\,g+1} / \prod_i (1 - x'' a_i) \right) y''$, which yields $y''^2 = \prod_i (1 - x'' a_i)$. Hence we are in the generic situation, i.e. there are two points corresponding to the singular point (x′ = 0, y′ = 0), specifically ∞+ = (x″ = 0, y″ = +1) and ∞− = (x″ = 0, y″ = −1). In other words, all the points of the desingularised curve bijectively and analytically correspond to points of the original curve except the two points ∞± which map to the same ∞ = (x′ = 0, y′ = 0). We say that the singular curve is obtained from the non-singular one by identifying two points. Around ∞± either x″ or y″ are good local parameters and one gets for example

$$x = \frac{1}{x''}, \qquad y = \frac{1}{x''^{\,g+1}} \left( 1 + O(x'') \right)$$

We see that the meromorphic function x has two simple poles on the curve, one at ∞+ and the other at its hyperelliptic conjugate ∞− . The curve can be seen as a two-sheeted branched covering of the Riemann sphere branched over the 2g + 2 points $(a_k, 0)$.

Let us remark that in both cases, N = 2g + 1 and N = 2g + 2, eq. (15.1) allows us to present the hyperelliptic curve as a two-sheeted branched covering of the Riemann sphere with 2g + 2 branch points. In fact, for N = 2g + 1 we have the 2g + 1 points $(a_k, 0)$ plus the single point at infinity ∞, while for N = 2g + 2 we have two points at infinity and they are not branch points of the covering. This will allow us to conclude that in both cases the underlying topological surface is of genus g as we now explain.
15.3 The Riemann–Hurwitz formula

Let us recall that for a triangulated surface where the triangulation has $F$ faces, $V$ vertices and $A$ edges one defines the Euler–Poincaré characteristic $\chi = F - A + V = 2 - 2g$, which is a topological invariant. Here $g$ is called the genus of the Riemann surface. Any surface of genus $g$ is homeomorphic to a sphere with $g$ handles.

Let us now assume that a Riemann surface is presented as an $N$-sheeted branched covering of some base space of Euler characteristic $\chi_0$. Choose a triangulation of the base space such that all branch points are included as vertices. Now, for each triangle consider its $N$ pre-images under the covering projection, and similarly the $N$ pre-images of each edge. While ordinary vertices have $N$ pre-images, each base point corresponding to a branch point has fewer than $N$ pre-images, and the reduction is given by the order of the branch point, i.e. the number of branches which coalesce at the branch point minus 1. This is also called the index of the branch point. Altogether one gets a triangulation of the Riemann surface with $NF$ triangles, $NA$ edges and $NV - B$ vertices, where $B$ is the total number of branch points counted with their index on the Riemann surface. Hence the Euler characteristic $\chi$ of the Riemann surface is related to $\chi_0$ by $\chi = N\chi_0 - B$. This is the Riemann–Hurwitz formula:

$$2g - 2 = N(2g_0 - 2) + B \qquad (15.2)$$
Let us apply this formula to the computation of the genus of a hyperelliptic curve. The base is the Riemann sphere for which there is a triangulation with eight faces, twelve edges and six vertices, so that χ0 = 2 or g0 = 0. We have seen that the covering has B = 2g + 2 branch points of index 1. Since N = 2, the genus computed by the Riemann–Hurwitz formula is precisely the number g parametrizing the degree of the polynomial P (x) in the equation of the curve y 2 = P (x).
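The genus count above can be checked mechanically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and the branch-point count for a hyperelliptic curve $y^2 = P(x)$ of degree $N$ is taken from the discussion of the two cases $N = 2g+1$ and $N = 2g+2$ (the point at infinity is a branch point only for odd $N$).

```python
def euler_characteristic(faces: int, edges: int, vertices: int) -> int:
    """Euler-Poincare characteristic chi = F - A + V of a triangulation."""
    return faces - edges + vertices


def genus_from_riemann_hurwitz(sheets: int, base_genus: int, branch_indices) -> int:
    """Solve 2g - 2 = N (2 g0 - 2) + B for g, where B is the sum of the indices."""
    total_index = sum(branch_indices)
    two_g_minus_2 = sheets * (2 * base_genus - 2) + total_index
    if two_g_minus_2 % 2 != 0:
        raise ValueError("inconsistent data: 2g - 2 must be even")
    return two_g_minus_2 // 2 + 1


def hyperelliptic_genus(degree: int) -> int:
    """Genus of y^2 = P(x) with deg P = degree and distinct roots."""
    # Odd degree: the roots of P plus the point at infinity are branch points.
    # Even degree: only the roots of P are branch points.
    branch_points = degree + 1 if degree % 2 == 1 else degree
    # Every branch point of a two-sheeted covering has index 1.
    return genus_from_riemann_hurwitz(2, 0, [1] * branch_points)


# The octahedral triangulation of the sphere quoted in the text.
sphere_chi = euler_characteristic(faces=8, edges=12, vertices=6)
genus_degree_5 = hyperelliptic_genus(5)
genus_degree_6 = hyperelliptic_genus(6)
```

Here both degree 5 and degree 6 give the same genus, in line with the remark that $N = 2g+1$ and $N = 2g+2$ describe coverings with $2g+2$ branch points.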
15.4 The field of meromorphic functions of a Riemann surface Consider a Riemann surface of genus g. A complex valued function on this surface is analytic around a point if its expression in terms of a local parameter is analytic. Similarly, it has a pole of order n if its expression in terms of a local parameter has a pole of order n, and so on. These definitions are invariant under analytic reparametrizations. It is impossible to get an everywhere non-constant analytic function on a compact
Riemann surface; this is basically the Liouville theorem. We are generally interested in meromorphic functions which have a finite number of poles. Obviously the meromorphic functions form a field which is called the function field of the surface. When the curve is given by an equation $P(x, y) = 0$, any rational function of $x$ and $y$ is a meromorphic function on the curve with poles located at arbitrary points, and one can show that the most general meromorphic function can be so constructed. The field of meromorphic functions is just the field of rational functions of $x$ and $y$ modulo the equation of the curve. This allows us to give a completely algebraic description of Riemann surfaces.

Conversely, let $f$ be a non-trivial meromorphic function on a Riemann surface. The Cauchy theorem still holds true on a Riemann surface. Integrating over a small circle whose interior contains no poles or zeroes of $f - a$, one gets:

$$0 = \frac{1}{2\pi i}\oint \frac{df}{f - a} = \text{number of zeroes of } (f - a) - \text{number of poles of } f$$

which shows that $f$ takes each value the same number of times, say $N$ times. This allows us to present a general Riemann surface as an $N$-sheeted branched covering of the Riemann sphere by $p \to z = f(p)$. For any $z$ on the Riemann sphere consider its $N$ pre-images $p_j$ under $f$ on the Riemann surface. Now take any other meromorphic function $h$, and consider the elementary symmetric functions of the $h(p_j)$. They do not depend on the order of the sheets, hence define meromorphic functions of $z$ on the Riemann sphere, that is rational functions of $z$. It is then clear that $h$ obeys a polynomial equation $P(f, h) = 0$. By similar arguments one can show that one can choose $h$ so that the polynomial $P$ is irreducible, and then any other meromorphic function is a rational function of $f$ and $h$. We see that any abstract Riemann surface may be viewed as a smooth compact algebraic curve.

Let us consider the example of a hyperelliptic curve $y^2 = P_{2g+1}(x)$, where $P_{2g+1}$ is a polynomial of degree $2g + 1$.
Any meromorphic function can be written as $f(x, y) = (A(x) + yB(x))/C(x)$ with $A$, $B$, $C$ polynomials in $x$, since one can eliminate $y^2$ using the equation of the curve and eliminate $y$ in the denominator. In this case we get a simple construction of a meromorphic function with $g + 1$ poles at $(x_1, y_1), \ldots, (x_{g+1}, y_{g+1})$ by taking

$$f(x, y) = \frac{y + Q(x)}{\prod_i (x - x_i)}$$

where $Q(x)$ is a polynomial of degree $g$ determined by requiring that $Q(x_i) = y_i$, so that the numerator of $f$ vanishes at the point $(x_i, -y_i)$. Note
that $f$ vanishes at $\infty$ and has $g$ zeroes at finite distance. Also note that the meromorphic function $x$ has a double pole at $\infty$, hence takes each value twice. The existence of such a function characterizes hyperelliptic curves.

15.5 Line bundles on a Riemann surface

It is important to generalize the notion of function on a Riemann surface by considering line bundles on the surface, and sections of these bundles. Such line bundles occur naturally in the study of integrable systems. Let us consider a covering $\{U_\alpha\}$ of the Riemann surface by open sets $U_\alpha$ and assume that for each non-void intersection $U_\alpha \cap U_\beta$ some continuous functions (called transition functions) $t_{\alpha\beta}$ are given which are neither vanishing nor $\infty$ on the intersection. Moreover, we assume that on each non-void triple intersection $U_\alpha \cap U_\beta \cap U_\gamma$ one has $t_{\alpha\beta} t_{\beta\gamma} t_{\gamma\alpha} = 1$ and that $t_{\alpha\beta} t_{\beta\alpha} = 1$. This defines a line bundle $\xi$. When the transition functions are differentiable it is a differentiable bundle, while if the transition functions are analytic it is an analytic bundle. We shall be concerned with analytic bundles on a Riemann surface.

A section of the bundle $\xi$ is a collection of functions $f_\alpha$ on each $U_\alpha$ such that on each intersection we have $f_\alpha = t_{\alpha\beta} f_\beta$. If all functions $f_\alpha$ are holomorphic this is a holomorphic section of $\xi$, and the space of such analytic sections will be denoted by $\Gamma(\xi)$. If all functions $f_\alpha$ are meromorphic it is called a meromorphic section. If $f_\alpha$ has a pole or a zero of order $m$ at some point of $U_\alpha \cap U_\beta$, it is the same for $f_\beta$ since $t_{\alpha\beta}$ is analytic without zero, and we say that the section has a zero or a pole of order $m$.

Geometrically, an analytic line bundle can be seen as a triple $(E, B, \pi)$ such that for any point $b \in B$ there exists an open set $U_b$ and an analytic isomorphism $\pi^{-1}(U_b) \simeq U_b \times \mathbb{C}$. Any point above $U_\alpha$ can be written $(p, \varphi_\alpha)$ with $p \in U_\alpha$ and $\varphi_\alpha$ a number. If, moreover, $p \in U_\beta$, the same point can be written $(p, \varphi_\beta)$.
The two descriptions patch if one can write φα = tαβ (p)φβ for some analytic non-vanishing function tαβ defined on Uα ∩Uβ . The triple intersection condition is obviously satisfied. Moreover, for any non-vanishing analytic functions fα defined on Uα an equivalent description of the line bundle is obtained by sending the point (p, φα ) of the line bundle to (p, fα (p)φα ). In the new description, the transition functions are now tαβ (p)fα (p)/fβ (p), so that transition functions differing by multiplication by a ratio fα /fβ define the same line bundle ξ. A local or global section of the line bundle ξ can be viewed intrinsically as a map which associates with each point of the Riemann surface a point in the fibre above it. In a local trivialization Uα × C this point can be written (p, fα (p)). In the intersection Uα ∩ Uβ the two descriptions are related by fα = tαβ fβ . We recover the above definition of sections.
Note that the quotient of two sections is a meromorphic function since the transition functions cancel, and the product of a section by a function is a section. One can differentiate differentiable sections of $\xi$ with respect to $\bar z$ since $\partial_{\bar z} f_\alpha = t_{\alpha\beta}\, \partial_{\bar z} f_\beta$, hence $\{\partial_{\bar z} f_\alpha\}$ is a section.

Example 1. A very important example of line bundle is provided by the canonical bundle. Consider a covering $U_\alpha$ such that each open set $U_\alpha$ is analytically isomorphic to a domain of the complex numbers by a coordinate $p \to z_\alpha(p)$. In each non-trivial intersection $U_\alpha \cap U_\beta$ the coordinate $z_\beta$ can be expressed as an analytic function of the coordinate $z_\alpha$, and conversely. One can define:

$$\kappa_{\alpha\beta}(p) = \frac{dz_\beta}{dz_\alpha}(p)$$

which is holomorphic non-vanishing in the intersection. The canonical bundle is defined by these transition functions. This definition is canonical since under change of local parameters $z_\alpha \to w_\alpha$ the transition functions transform according to $\kappa_{\alpha\beta} \to (w'_\beta / w'_\alpha)\, \kappa_{\alpha\beta}$, where $w'_\alpha = dw_\alpha/dz_\alpha$, and the $w'_\alpha$ are analytic non-vanishing on $U_\alpha$. They therefore define the same bundle. A global section of this bundle is a collection of analytic functions $f_\alpha$ such that $f_\alpha = \kappa_{\alpha\beta} f_\beta$ on intersections, that is $f_\alpha\, dz_\alpha = f_\beta\, dz_\beta$. It can be viewed as a globally defined holomorphic differential form of the type $(1, 0)$, i.e. a form involving $dz$ only.

Example 2. Another important example is provided by the so-called point bundles. Let the covering be as above and assume that $p$ is a point on the Riemann surface belonging to some $U_\alpha$ and to no other $U_\beta$. One can assume that $z_\alpha(p) = 0$. Define the transition functions $t_{\alpha\beta} = z_\alpha$ on non-trivial $U_\alpha \cap U_\beta$ and choose $t_{\beta\gamma} = 1$ on all other non-trivial intersections. This defines a line bundle $\xi_p$. Note that our point bundle admits at least one analytic section, $\sigma$, namely $\sigma_\alpha(z_\alpha) = z_\alpha$ and $\sigma_\beta = 1$. The patching conditions $\sigma_\beta = t_{\beta\gamma} \sigma_\gamma$ are obeyed for any $\beta, \gamma$ including $\alpha$. This section has just one zero at $p$. We see that an analytic line bundle may have non-trivial holomorphic sections while holomorphic functions on a compact Riemann surface are constant. Also, note that a bundle $\xi$ is the trivial bundle if and only if it admits an analytic non-vanishing section $f_\alpha$, since in this case one can write $t_{\alpha\beta} = f_\alpha / f_\beta$.

One can introduce a group structure on line bundles. Given two line bundles $\xi$ and $\sigma$ one can assume that they are defined on a common suitably refined covering $U_\alpha$ by transition functions $t_{\alpha\beta}$ and $s_{\alpha\beta}$. Then
the product $\xi\sigma$ is defined by the transition functions $t_{\alpha\beta} s_{\alpha\beta}$ and the inverse $\xi^{-1}$ is defined by $1/t_{\alpha\beta}$, while the neutral element is just the trivial bundle, i.e. the cartesian product of the Riemann surface by the complex numbers, which can be defined by $t_{\alpha\beta} = 1$. Note that the above definitions are coherent with the redefinitions $t'_{\alpha\beta} = (f_\alpha / f_\beta)\, t_{\alpha\beta}$. This group law is commutative, hence an additive notation is frequently used. If $\{f_\alpha\}$ is a section of $\xi$ and $\{g_\alpha\}$ a section of $\sigma$ then $\{f_\alpha g_\alpha\}$ is a section of $\xi\sigma$ and $\{1/f_\alpha\}$ is a meromorphic section of $\xi^{-1}$.

15.6 Divisors

One can build more complicated bundles from point bundles at points $p_1, \ldots, p_k$ by taking their product, that is $\sum_j n_j \xi_{p_j}$ in additive notation. Here the $n_j$ are positive or negative integers. A formal sum of points $p_j$ on the Riemann surface with multiplicities $n_j$ is called a divisor, denoted by $D = \sum_j n_j p_j$. The line bundle $\xi = \sum_j n_j \xi_{p_j}$ is the line bundle associated with the divisor $D(\xi) = \sum_j n_j p_j$. For any meromorphic function $f$ on the Riemann surface with zeroes at $p_j$ of order $n_j$ and poles at $q_k$ of order $m_k$ one defines the divisor of the function $f$ as the formal sum

$$D(f) = \sum_j n_j p_j - \sum_k m_k q_k$$

The divisor of a section of a line bundle is similarly defined. A divisor is positive if all $n_j$ are positive, hence a meromorphic section is analytic if and only if its divisor is positive. Since $\xi_p$ has a section with divisor $p$, the line bundle $\xi$ associated with $D$ has a section $f_\xi$ with divisor $D$.

When the divisor $D$ is the divisor of a meromorphic function $f$, the associated bundle $\xi$ has a section $f_\xi$ with divisor $D$. The section $f^{-1} f_\xi$ is analytic non-vanishing so that $\xi$ is the trivial bundle. Conversely, if $\xi$ associated with $D$ is trivial, the section $f_\xi$ gives rise to a meromorphic function of divisor $D$, since we can divide $f_\xi$ by an analytic non-vanishing section. This allows us to introduce an equivalence relation between divisors: two divisors $D_1$ and $D_2$ are equivalent if their associated bundles $\xi_1$ and $\xi_2$ are such that $\xi_1 - \xi_2$ is the trivial bundle, i.e. if $D_1 - D_2$ is the divisor of a meromorphic function. With a divisor $D = \sum_j n_j p_j$ is associated a number

$$\deg(D) = \sum_j n_j$$
called the degree of the divisor. Two equivalent divisors have the same degree since a meromorphic function has the same number of zeroes and
poles. Note that if the line bundle ξ has a meromorphic section, σ, of divisor D and η is the line bundle associated with D with meromorphic section τ having the same divisor D, ξ−η has a holomorphic non-vanishing section σ/τ , hence ξ = η. 15.7 Chern class It is well known from differential geometry that one can associate with differential vector bundles an integer called the Chern class which can be computed as the integral of some curvature form. For line bundles it can be shown that this integer classifies differential bundles. Moreover, the Chern class is compatible with the group structure, hence the Chern class of the “product” ξ + σ is just the sum c(ξ) + c(σ) of the Chern classes of ξ and σ. Simple proofs of these facts, adapted from the differential geometric case, can be found in the references. This leads to an index theorem stating that for any meromorphic section f of a line bundle ξ one has: deg (D (f )) = c(ξ) The trivial bundle has Chern class 0, hence this theorem reduces in this case to the above mentioned fact that a meromorphic function has the same number of zeroes and poles. This implies in particular that only bundles with positive Chern class may have holomorphic sections. Note that the point bundle ξp has Chern class 1 since it has a holomorphic section with just one zero at p. 15.8 Serre duality Having introduced these natural definitions, we can now study the space of most interest to us, i.e. the space Γ(ξ) of holomorphic sections of the line bundle ξ. In order to do that it turns out to be very useful to introduce new spaces called H 1 (Σ, O(ξ)) associated with the Riemann surface Σ and the line bundle ξ. Their definition is analogous to that of line bundles, except that the multiplicative structure is replaced by an additive one. Specifically, consider a covering Uα (and we assume that each Uα is connected and simply connected) and for each non-void intersection Uα ∩ Uβ a holomorphic local section fαβ of ξ. 
Such sections are assumed to obey fαβ + fβγ + fγα = 0 on each non-void triple intersection Uα ∩ Uβ ∩ Uγ . The space H 1 (Σ, O(ξ)) is the space of {fαβ } modulo trivial ones, i.e. modulo those of the form fαβ = fα − fβ for holomorphic sections fα of ξ over Uα . Note that there is no non-vanishing condition here. If trivialisations of the bundle are defined over the Uα one can represent these sections
by complex valued functions related by transition functions. Let $f_\alpha$ and $f_{\alpha\beta}$ be represented over $U_\alpha$ by functions $f_\alpha^\alpha$ and $f_{\alpha\beta}^\alpha$, while $f_{\alpha\beta}$ and $f_\beta$ are represented on $U_\beta$ by $f_{\alpha\beta}^\beta$ and $f_\beta^\beta$ respectively. Then the equation expressing the triviality of $f_{\alpha\beta}$ reads on $U_\alpha \cap U_\beta$: $f_{\alpha\beta}^\alpha = f_\alpha^\alpha - t_{\alpha\beta} f_\beta^\beta$, or a similar expression for the $\beta$ trivialization. Note that one can define a $\bar\partial$ operator on sections of $\xi$ since $\bar\partial$ vanishes on transition functions.

Let us give an alternative description of $H^1(\Sigma, O(\xi))$. One can always solve $f_{\alpha\beta} = f_\alpha - f_\beta$ with $f_\alpha$ differentiable sections of $\xi$ on $U_\alpha$ by partition of unity arguments. (If $\sum_\alpha r_\alpha = 1$ with $\mathrm{supp}(r_\alpha) \subset U_\alpha$, define $f_\alpha = \sum_\gamma r_\gamma f_{\alpha\gamma}$, where $r_\gamma f_{\alpha\gamma}$ is extended to 0 in $U_\alpha - U_\gamma$, and note that $f_\alpha - f_\beta = \sum_\gamma r_\gamma (f_{\alpha\gamma} - f_{\beta\gamma}) = \sum_\gamma r_\gamma f_{\alpha\beta} = f_{\alpha\beta}$.) Consider the type $(0, 1)$ forms (i.e. forms in $d\bar z$)

$$\bar\partial f_\alpha = \frac{\partial f_\alpha}{\partial \bar z}\, d\bar z$$

On an intersection we have $\bar\partial f_\alpha - \bar\partial f_\beta = \bar\partial f_{\alpha\beta} = 0$, hence these forms patch to a globally defined section $\sigma$ of $\xi$ with values in type $(0, 1)$ forms. Conversely, if such a differentiable section is given, one can write it as $\bar\partial f_\alpha$ on each $U_\alpha$ because, by an application of the Cauchy integral formula, one can always solve $\bar\partial f = \sigma$ on a connected set with $f$ a differentiable function. Then one can define on an intersection $f_{\alpha\beta} = f_\alpha - f_\beta$, which is analytic since $\bar\partial$ vanishes on it. Of course $f_{\alpha\beta}$ vanishes if and only if the set of all $f_\alpha$ define a section of $\xi$, i.e. if $f_\alpha = f_\beta$. Hence $H^1(\Sigma, O(\xi))$ can be identified with the set of differentiable sections of type $(0, 1)$ of $\xi$, denoted by $\Gamma_d^1(\xi)$, modulo the image by $\bar\partial$ of the set of differentiable sections of $\xi$:

$$H^1(\Sigma, O(\xi)) \equiv \Gamma_d^1(\xi) / \bar\partial\, \Gamma_d(\xi)$$

It can be shown that $H^1(\Sigma, O(\xi))$ is finite-dimensional. This description is useful for understanding the Serre duality between the spaces $H^1(\Sigma, O(\xi))$ and the space $\Gamma(\kappa - \xi)$, defined by a non-singular pairing:

$$H^1(\Sigma, O(\xi)) \times \Gamma(\kappa - \xi) \to \mathbb{C}$$

As we have seen in the discussion of the canonical bundle, a section of $\kappa - \xi$ can be viewed as a holomorphic section of $-\xi$ with values in type $(1, 0)$ forms. Take a differentiable element of $\Gamma_d^1(\xi)$, locally of the form $f(z, \bar z)\, d\bar z$, and a holomorphic section of $\kappa - \xi$, locally of the form $g(z)\, dz$, and consider their wedge product, locally $f(z, \bar z)\, g(z)\, dz \wedge d\bar z$. In the product, the transition functions $t_{\alpha\beta}$ of $\xi$ and $t_{\alpha\beta}^{-1}$ of $-\xi$ cancel so that one
ends up with a globally defined volume form on the Riemann surface that we integrate. This defines the pairing. Note that if $f$ is identified with 0, that is if $f(z, \bar z)\, d\bar z = \bar\partial h$, one can integrate by parts, obtaining the integral of $h\, \bar\partial g \wedge dz$, which vanishes since $g$ is holomorphic. Hence the pairing is well-defined between the considered spaces. Finally, the essential point is that the pairing is non-singular. Indeed, any linear form acting on $f\, d\bar z$ may be written $\int f(z, \bar z)\, g(z, \bar z)\, dz \wedge d\bar z$ for some distribution $g$; since this form vanishes when $f\, d\bar z = \bar\partial h$, the distribution $g$ is weakly analytic, hence is an analytic function by the classical Weyl lemma. It follows that $\Gamma(\kappa - \xi)$ and $H^1(\Sigma, O(\xi))$ are finite-dimensional spaces of the same dimension.
15.9 The Riemann–Roch theorem

Let $\Gamma(\xi)$ be the finite-dimensional space of holomorphic sections of the line bundle $\xi$ and $c(\xi)$ its Chern class.

Theorem. On a Riemann surface of genus $g$ with canonical bundle $\kappa$ we have, for any line bundle $\xi$:

$$\dim \Gamma(\xi) - \dim \Gamma(\kappa - \xi) = c(\xi) + 1 - g \qquad (15.3)$$

Proof. We first show that $\chi(\xi) = \dim \Gamma(\xi) - \dim H^1(\Sigma, O(\xi)) - c(\xi)$ is independent of the line bundle $\xi$. As a first step, we show that $\chi(\xi + \xi_p) = \chi(\xi)$ for any point bundle $\xi_p$ (see Example 2), with analytic section $\sigma_p$ vanishing at $p$. Note that any local or global analytic section of $\xi$ can be multiplied by $\sigma_p$ to produce a section of $\xi + \xi_p$ vanishing at $p$. This is clearly an injective homomorphism of the space of sections. It fails to be surjective if some global sections of $\xi + \xi_p$ do not vanish at $p$. Hence we have two cases:

(a) There exists a global section of $\xi + \xi_p$ non-vanishing at $p$. We have $\dim \Gamma(\xi + \xi_p) = \dim \Gamma(\xi) + 1$.

(b) All global sections of $\xi + \xi_p$ vanish at $p$. We have $\dim \Gamma(\xi + \xi_p) = \dim \Gamma(\xi)$.

Now let us consider local sections over intersections $U_\alpha \cap U_\beta$ (we assume that $p$ does not belong to such intersections and $p \in U_\alpha$). The homomorphism $f_{\alpha\beta} \to \sigma_p f_{\alpha\beta}$ is clearly bijective in this case since $\sigma_p$ does not vanish outside $p$, and trivial elements are mapped to trivial elements. This induces a surjective homomorphism $H^1(\Sigma, O(\xi)) \to H^1(\Sigma, O(\xi + \xi_p))$ which fails to be injective if there exists a non-trivial set of sections $f_{\alpha\beta}$ of $\xi$ mapping to a trivial set of sections $f'_\alpha - f'_\beta$ of $\xi + \xi_p$, that is $\sigma_p f_{\alpha\beta} = f'_\alpha - f'_\beta$.
In case (a) above, we can subtract a multiple of the global section non-vanishing at $p$ from all $f'_\gamma$ so as to achieve $f'_\alpha(p) = 0$. Then one can divide all $f'_\gamma$ by $\sigma_p$, getting $f_\alpha$ such that $f_{\alpha\beta} = f_\alpha - f_\beta$. Hence the mapping is bijective and we have $\dim H^1(\Sigma, O(\xi)) = \dim H^1(\Sigma, O(\xi + \xi_p))$.

In case (b) we will show that $\dim H^1(\Sigma, O(\xi)) = \dim H^1(\Sigma, O(\xi + \xi_p)) + 1$ by constructing a section $f_{\alpha\beta}$ which maps under $\sigma_p$ on a trivial section $f'_{\alpha\beta} = f'_\alpha - f'_\beta$, where we have chosen $f'_\alpha(p) \neq 0$. If we had $f_{\alpha\beta} = f_\alpha - f_\beta$ we would get $f'_\alpha - \sigma_p f_\alpha = f'_\beta - \sigma_p f_\beta$, hence defining a global section of $\xi + \xi_p$ which by hypothesis vanishes at $p$, yielding $f'_\alpha(p) = 0$, a contradiction. If, however, $f'_\alpha(p) = 0$ one can divide it by $\sigma_p$ and $f_{\alpha\beta}$ is trivial. This shows that non-trivial elements $\{f_{\alpha\beta}\}$ are parametrized by $f'_\alpha(p)$ and form a space of dimension 1 modulo trivial elements.

Since $c(\xi + \xi_p) = c(\xi) + 1$, in both cases we get $\chi(\xi + \xi_p) = \chi(\xi)$. Starting from $\xi - \xi_p$ we get $\chi(\xi) = \chi(\xi - \xi_p)$. Hence for any divisor $D$ and line bundle $\eta$ associated with $D$ we have $\chi(\xi + \eta) = \chi(\xi)$.

The rest of the proof is easier. There exists $D$ such that $\dim \Gamma(\xi + \eta) \neq 0$. Otherwise one gets $c(\eta) = C - \dim H^1(\Sigma, O(\xi + \eta)) \leq C$, where the constant $C = \dim H^1(\Sigma, O(\xi)) - \dim \Gamma(\xi)$ is independent of $\eta$. This is impossible since $c(\eta)$ can be arbitrarily large. Let $\sigma$ be a non-trivial analytic section of $\xi + \eta$. Since $\eta$ has a meromorphic section $\sigma_\eta$ of divisor $D$ we see that $\xi$ has a non-trivial meromorphic section $\sigma/\sigma_\eta$. Hence $\xi$ is the line bundle associated with this meromorphic section. We have obtained the:

Proposition. Any line bundle on the Riemann surface $\Sigma$ has a non-trivial meromorphic section of divisor $D$ and is the line bundle associated with this divisor.

We can now take for $\xi$ the trivial bundle in $\chi(\xi + \eta) = \chi(\xi)$ and we see that $\chi(\eta)$ is independent of $\eta$, as previously claimed.
Taking into account the Serre duality formula, we get:

$$\chi = \dim \Gamma(\xi) - \dim \Gamma(\kappa - \xi) - c(\xi)$$

When $\xi$ is the trivial bundle, $\dim \Gamma(\xi) = 1$ since global analytic functions on $\Sigma$ are constants, hence $\chi = 1 - \dim \Gamma(\kappa)$. When $\xi = \kappa$ one gets $\chi = \dim \Gamma(\kappa) - 1 - c(\kappa)$, hence $\chi = -(1/2)\, c(\kappa)$. To compute $c(\kappa)$ we view the Riemann surface $\Sigma$ as an $n$-sheeted covering of the Riemann sphere using a meromorphic function $f$ on $\Sigma$. We start from a meromorphic differential $\omega$ on the Riemann sphere and take its pullback $f^*(\omega)$ on $\Sigma$. On the sphere we can take $\omega = dz$. Its divisor has degree $-2$, since it has a double pole at infinity. Hence its pullback has $n$ poles of order 2 above $\infty$. Moreover, for each branch point of order $\nu$ the mapping $f$ is locally $z = f(w) = w^\nu$, so that $f^*(dz) = \nu w^{\nu - 1}\, dw$ has a zero of order $\nu - 1$ which
is the multiplicity index of the branch point. So the total multiplicity of the zeroes is precisely the total multiplicity $B$ of the branch points of the covering, which by the Riemann–Hurwitz formula, eq. (15.2), with $g_0 = 0$ is equal to $2g - 2 + 2n$. The Chern class of $\kappa$ is therefore $2g - 2$. Hence $\chi = 1 - g$, yielding the Riemann–Roch theorem. Moreover, we see that $\dim \Gamma(\kappa) = g$, which means that the space of globally defined analytic one-forms is of dimension $g$.

Consider now the meromorphic functions on $\Sigma$. Let $M(-D)$ be the set of meromorphic functions on $\Sigma$ whose divisor is bigger than $-D$; i.e. $f \in M(-D)$ iff the orders of its poles are less than or equal to those specified by $-D$ and the orders of its zeroes are greater than or equal to those specified by $-D$. Let $\xi$ be the line bundle associated with the divisor $D$. The space of holomorphic sections of $\xi$ is isomorphic to $M(-D)$, because this line bundle has a meromorphic section of divisor $D$ and any other section is obtained by multiplication by a meromorphic function. The section will be holomorphic iff the divisor of the function is greater than $-D$. Hence $\dim M(-D) = \dim \Gamma(\xi)$. We define $i(D) = \dim \Gamma(\kappa - \xi)$. This is the dimension of the space of differentials with a divisor greater than $D$. Recalling that $c(\xi) = \deg D$, the Riemann–Roch theorem can be written as:

$$\dim M(-D) = i(D) + \deg D - g + 1 \qquad (15.4)$$

In general the Riemann–Roch formula eq. (15.3) relates two unknown quantities. However, if $c(\xi) > 2g - 2$, we see that $c(\kappa - \xi) < 0$, hence $\dim \Gamma(\kappa - \xi) = 0$ (because $c(\kappa - \xi)$ is the degree of the divisor of any meromorphic section of $\kappa - \xi$, which then necessarily has poles), and we get:

$$\dim \Gamma(\xi) = c(\xi) + 1 - g, \quad \text{if } c(\xi) > 2(g - 1) \qquad (15.5)$$
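As a small illustration of eqs. (15.3) and (15.5), the sketch below tabulates what the degree alone tells us about $\dim \Gamma(\xi)$. The helper names are hypothetical and not from the text. The only facts used are those stated above: $c(\kappa) = 2g - 2$, bundles of negative Chern class have no holomorphic sections, and eq. (15.5) holds when $c(\xi) > 2g - 2$. In the intermediate range only the lower bound implied by eq. (15.3) is returned.

```python
from typing import Optional


def canonical_degree(genus: int) -> int:
    """Chern class of the canonical bundle, c(kappa) = 2g - 2."""
    return 2 * genus - 2


def riemann_roch_lower_bound(chern: int, genus: int) -> int:
    """Lower bound on dim Gamma(xi) from eq. (15.3), since dim Gamma(kappa - xi) >= 0."""
    return max(0, chern + 1 - genus)


def dim_sections(chern: int, genus: int) -> Optional[int]:
    """Return dim Gamma(xi) when the degree alone fixes it, otherwise None."""
    if chern < 0:
        # A holomorphic section would have a divisor of negative degree.
        return 0
    if chern > canonical_degree(genus):
        # Eq. (15.5): the correction term dim Gamma(kappa - xi) vanishes.
        return chern + 1 - genus
    # 0 <= c(xi) <= 2g - 2: the answer depends on the bundle, not just on its degree.
    return None


# On the Riemann sphere (g = 0) with D = 3 * (point at infinity), M(-D) is the
# space of polynomials of degree at most 3, which has dimension 4.
sphere_example = dim_sections(chern=3, genus=0)
```

The sphere example agrees with the direct count of polynomials of degree at most 3.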
Corollary. The dimension of the space of meromorphic functions having at most $k$ prescribed poles and at least $l$ prescribed zeroes is greater than or equal to $k - l + 1 - g$. Equality occurs when $k - l > 2g - 2$, as in eq. (15.5). If $k - l \geq g$ the equality is satisfied for generic positions of the prescribed zeroes and poles.

Let us take for simplicity $l = 0$. Then $\dim \Gamma(\kappa - \xi)$ is the dimension of the space of holomorphic differentials having $k$ prescribed zeroes. But the space of holomorphic differentials is of dimension $g$ and we want to impose $k$ linear conditions. This is generically impossible for $k \geq g$. Hence the useful statement:

$$\deg D \geq g \;\Rightarrow\; i(D) = 0 \quad \text{generically}$$
Note that we have previously illustrated this situation by constructing a meromorphic function with g + 1 prescribed poles on a hyperelliptic surface.
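The construction referred to here, $f = (y + Q(x))/\prod_i (x - x_i)$ with $Q(x_i) = y_i$, can be made concrete. The following is only a sketch under stated assumptions: the curve $y^2 = x^3 + 1$ (genus $g = 1$), the two pole positions, and all function names are chosen here for illustration and do not come from the text. $Q$ is obtained by Lagrange interpolation through the $g + 1$ prescribed points, using exact rational arithmetic.

```python
from fractions import Fraction


def lagrange_interpolate(points):
    """Coefficients (lowest degree first) of the polynomial Q with Q(x_i) = y_i."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i, (xi, yi) in enumerate(points):
        basis = [Fraction(1)]   # running product of (x - x_j) for j != i
        denom = Fraction(1)     # running product of (x_i - x_j) for j != i
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            shifted = [Fraction(0)] * (len(basis) + 1)
            for k, c in enumerate(basis):
                shifted[k] -= c * xj      # contribution of -x_j
                shifted[k + 1] += c       # contribution of x
            basis = shifted
            denom *= (xi - xj)
        for k, c in enumerate(basis):
            coeffs[k] += Fraction(yi) * c / denom
    return coeffs


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficients (Horner scheme)."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


def on_curve(x, y):
    """Hypothetical example curve y^2 = x^3 + 1, of genus 1."""
    return y * y == x ** 3 + 1


def numerator(coeffs, x, y):
    """Numerator y + Q(x) of the function f."""
    return y + evaluate(coeffs, x)


# g + 1 = 2 prescribed poles on the example curve.
poles = [(0, 1), (2, 3)]
Q = lagrange_interpolate(poles)

# The numerator vanishes at the conjugate points (x_i, -y_i), so the zeroes of
# the denominator there are cancelled and f is regular at those points.
conjugate_values = [numerator(Q, x, -y) for (x, y) in poles]
# At the prescribed points the numerator is non-zero, so f has genuine poles.
pole_values = [numerator(Q, x, y) for (x, y) in poles]
```

With these two poles one finds $Q(x) = x + 1$, so $f = (y + x + 1)/(x(x - 2))$. Together with the constants this gives two independent functions, matching $\deg D + 1 - g = 2$ from eq. (15.5).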
15.10 Abelian differentials

Consider a Riemann surface $\Sigma$ of genus $g$. Let $a_i$, $b_i$ be a basis of cycles on $\Sigma$ with canonical intersection matrix $(a_i \cdot a_j) = (b_i \cdot b_j) = 0$, $(a_i \cdot b_j) = \delta_{ij}$. This means that one can take differentiable loops $t \to a_i(t)$ and $t \to b_i(t)$ such that there is no intersection between loops $a_i$ and $a_j$ or $b_j$ with $j \neq i$, while $a_i$ and $b_i$ intersect at just one point $p$. Moreover, at $p$ the tangent vectors to $a_i$ and $b_i$ form a positively oriented basis of the tangent space at $p$ (for the orientation given by the complex space structure). One can then continuously deform these loops without changing the intersection index, which is the sum of signs $\pm 1$ at each intersection according to the orientation of the tangent vectors. In particular, one can deform the loops $a_i$ and $b_i$ so that they have a common base point and then cut the Riemann surface along them, getting a polygon with some edges identified. The boundary of this polygon can be described as $a_1 \cdot b_1 \cdot a_1^{-1} \cdot b_1^{-1} \cdots a_g \cdot b_g \cdot a_g^{-1} \cdot b_g^{-1}$, where the identifications are obvious. The common base point becomes all the vertices of the polygon.

The globally defined analytic 1-forms on $\Sigma$ are called Abelian differentials of the first kind. They form a space of dimension $g$ over the complex numbers. Note that such a differential has $2g - 2$ zeroes on the Riemann surface since $c(\kappa) = 2g - 2$ and it has no pole. There is a natural pairing between these forms and loops obtained by integrating the form along the loop. It can be shown that the pairing between $a$-cycles and differentials is non-degenerate (note they have the same dimension $g$). This is a consequence of the Riemann bilinear identities that we shall describe below. We choose a basis of first kind Abelian differentials, which we denote by $\omega_j$, $j = 1, \ldots, g$, normalized with respect to the $a$-cycles:

$$\oint_{a_j} \omega_i = \delta_{ij} \qquad (15.6)$$

The matrix of $b$-periods is then defined as the matrix $B$ with matrix elements:

$$B_{ij} = \oint_{b_i} \omega_j \qquad (15.7)$$
Taking the example of a hyperelliptic surface y 2 = P2g+1 (x), where P (x) is a polynomial of degree 2g + 1, a basis of regular Abelian differentials is
provided by the forms

$$\omega_j = \frac{x^j\, dx}{y} \quad \text{for } j = 0, \ldots, g - 1$$
These forms are regular except perhaps at the branch points and at $\infty$. At a branch point the local parameter is $y$ and we have $y^2 = a(x - b) + \cdots$, hence $x^j\, dx/y = (2b^j/a)(1 + \cdots)\, dy$, which is regular. At $\infty$ we take $x = 1/x'$ and $y = y'/x'^{(g+1)}$ so that $y'^2 = a x' + \cdots$ and $x^j\, dx/y = b\, y'^{2(g-j-1)}\, dy'$ (with $b$ now a constant), which is regular for $j \leq g - 1$ since $y'$ is the local parameter. Of course these forms are unnormalized. Note that their $(2g - 2)$ zeroes are located at $x = 0$ and $x = \infty$.

Similarly, Abelian differentials of the second kind are meromorphic differentials with poles of order greater than or equal to 2. Given a point $p$ on $\Sigma$, there exists an Abelian differential of the second kind whose only singularity is a pole of second order at $p$. Indeed, by eq. (15.3) we see that $\dim \Gamma(\kappa + 2\xi_p) \geq c(\kappa + 2\xi_p) + 1 - g = g + 1$. An element in $\Gamma(\kappa + 2\xi_p)$ comes from an Abelian differential multiplied by $\sigma_p^2$, where $\sigma_p$ is the section of the point bundle $\xi_p$ vanishing at $p$. Since the space of regular differentials is of dimension $g$ we see that there exists at least one section in $\Gamma(\kappa + 2\xi_p)$ whose division by $\sigma_p^2$ has a pole at $p$, which is necessarily of second order. Adding a proper combination of differentials of the first kind, one can always ensure that all $a$-periods of the second kind differential vanish. Such a second kind differential is called normalized.

We can apply the Cauchy theorem to a meromorphic globally defined 1-form of type $(1, 0)$, yielding the vanishing of the sum of residues of first order poles. Note that the residue of a meromorphic 1-form is intrinsically defined since $\mathrm{Res} = (1/2\pi i) \oint \omega$, where the contour is a small loop around the given singularity. So we define Abelian differentials of the third kind as general meromorphic differentials with first order poles whose sum of residues vanishes.
Given two points $p$ and $q$ there exists a unique normalized (all $a$-periods vanish) third kind differential whose only singularities are a pole of order 1 at $p$ with residue 1, and a pole of order 1 at $q$ with residue $-1$.

15.11 Riemann bilinear identities

On a Riemann surface on which we have chosen canonical cycles there is a pairing between meromorphic differentials. Specifically, let $\Omega_1$ and $\Omega_2$ be two meromorphic differentials on $\Sigma$. The pairing $(\Omega_1 \bullet \Omega_2)$ is defined by integrating them along the canonical cycles as follows:

$$(\Omega_1 \bullet \Omega_2) = \sum_{j=1}^{g} \left( \oint_{a_j} \Omega_1 \oint_{b_j} \Omega_2 - \oint_{a_j} \Omega_2 \oint_{b_j} \Omega_1 \right)$$
The Riemann bilinear identity expresses this quantity in terms of residues:

Proposition. Let $g_1$ be a function defined on the Riemann surface, cut along the canonical cycles, and such that $dg_1 = \Omega_1$. We have:

$$(\Omega_1 \bullet \Omega_2) = 2i\pi \sum_{\text{poles}} \mathrm{res}(g_1 \Omega_2) \qquad (15.8)$$

Proof. One computes $\oint g_1 \Omega_2$ on the boundary of the polygon representing the Riemann surface. On one hand, this produces the sum of residues in eq. (15.8). On the other hand, we compute this integral explicitly:

$$\int_{a_j} g_1 \Omega_2 + \int_{a_j^{-1}} g_1 \Omega_2 = \int_{a_j} g_1 \Omega_2 - \int_{a_j} \Big( g_1 + \oint_{b_j} \Omega_1 \Big) \Omega_2$$

since $g_1$ is shifted by $\oint_{b_j} \Omega_1$ when crossing the cut $a_j$. Similarly, one gets:

$$\int_{b_j} g_1 \Omega_2 + \int_{b_j^{-1}} g_1 \Omega_2 = \int_{b_j} g_1 \Omega_2 - \int_{b_j} \Big( g_1 - \oint_{a_j} \Omega_1 \Big) \Omega_2$$

Adding the contributions for all $j$ one gets $(\Omega_1 \bullet \Omega_2)$.

Corollary. The matrix of $b$-periods $B$ is symmetric.

Proof. The pairing between the normalized holomorphic differentials is trivial: $(\omega_i \bullet \omega_j) = 0$ for $i, j = 1, \ldots, g$. This reads $B_{ij} = B_{ji}$.

Corollary. Let $\Omega_2$ be a normalized differential of the second kind with a pole of order $n$, with principal part $z^{-n}\, dz$ at $z = 0$ for some local parameter $z$. Let $\Omega_1 = \omega_k$ be a normalized holomorphic differential expanded as

$$\omega_k = \Big( \sum_{i=0}^{\infty} c_i z^i \Big)\, dz$$

around $z = 0$. One has:

$$\oint_{b_k} \Omega_2 = 2\pi i\, \frac{c_{n-2}}{n - 1}$$

By linearity, if $\Omega^{(P)}$ is a normalized second kind differential with principal part $dP(z)$, where $P(z) = \sum_{n=1}^{N} p_n z^{-n}$, then we have

$$\frac{1}{2i\pi} \oint_{b_k} \Omega^{(P)} = -\mathrm{Res}\,(\omega_k P) \qquad (15.9)$$
15.12 Jacobi variety
We have seen that differentiable line bundles on a Riemann surface are classified by their Chern class. This is not so for analytic line bundles. To describe their classification, it is sufficient to consider the different analytic structures on line bundles of Chern class 0. The space of such structures is called the Jacobi variety. Since analytic line bundles are the same as equivalence classes of divisors on the Riemann surface, the Jacobi variety identifies with equivalence classes of divisors of degree 0. It is thus necessary to characterize the divisors of meromorphic functions. In order to do that, consider a divisor of degree 0, which can always be written $D = \sum_i (p_i - q_i)$, with not necessarily distinct points. Choose paths $\gamma_i$ from $q_i$ to $p_i$ and associate with $D$ the point in $\mathbb{C}^g$ of coordinates:

$$\rho_k(D) = \sum_i \int_{\gamma_i} \omega_k, \qquad k = 1, \dots, g$$
Such sums are called Abel sums. If the paths are homotopically deformed these integrals remain constant by the Cauchy theorem. If one makes a loop around $a_k$, then $\rho_l \to \rho_l + \delta_{kl}$. If one makes a loop around $b_k$, then $\rho_l \to \rho_l + B_{kl}$. Hence the maps $\rho_k$ give a well-defined point on the torus:

$$J(\Sigma) = \mathbb{C}^g / (\mathbb{Z}^g + B\mathbb{Z}^g) \qquad (15.10)$$
where $B$ is the matrix of the b-periods. If one permutes the points $p_i$ and $q_i$ independently, the point in the torus does not change. To see it, let the paths $\gamma'_1$ connect $q_1$ to $p_2$, $\gamma'_2$ connect $q_2$ to $p_1$, and $\sigma$ connect $q_2$ to $q_1$. One has $\int_{\gamma'_1} \omega = \int_{\gamma_2} \omega - \int_\sigma \omega$ up to periods and $\int_{\gamma'_2} \omega = \int_{\gamma_1} \omega + \int_\sigma \omega$ up to periods, so $\int_\sigma \omega$ cancels in the sum. Note that $J(\Sigma)$ is an additive group and that the above map from line bundles to points of $J(\Sigma)$ is a homomorphism. The theorems of Abel and Jacobi state that this point on the torus $J(\Sigma)$ characterizes the divisor $D$ up to equivalence, so that the Jacobian variety can be identified with the $g$-dimensional complex torus $J(\Sigma)$.

Theorem (Abel). A divisor $D = \sum_i (p_i - q_i)$ is the divisor of a meromorphic function if and only if, for any first kind Abelian differential $\omega$, the Abel sum $\sum_i \int_{\gamma_i} \omega$ vanishes modulo $\mathbb{Z} + B\mathbb{Z}$ for any choice of paths $\gamma_i$ from $q_i$ to $p_i$.

Theorem (Jacobi). For any point $\lambda \in J(\Sigma)$ and a fixed reference divisor $D_0 = \sum_{i=1}^{g} q_i$, one can find a divisor of $g$ points $D = p_1 + \cdots + p_g$ on $\Sigma$ such that $\rho_k(D - D_0)$ maps to $\lambda$. Moreover, for generic $\lambda$ the divisor $D$ is unique.
Proof. Let $f$ be a meromorphic function with divisor $D = \sum_i (p_i - q_i)$ and consider for $\lambda \in \mathbb{C}$ the pencil of divisors $D_\lambda = \sum_i (p_i^\lambda - q_i)$ of the meromorphic functions $f + \lambda$ (the poles are those of $f$, the zeroes vary analytically with $\lambda$). Finally, let $\phi(\lambda)$ be the point in $\mathbb{C}/(\mathbb{Z} + B\mathbb{Z})$ obtained by integrating $\omega$ along paths from $q_i$ to $p_i^\lambda$. The map $\phi$ is obviously analytic and can be extended to an analytic map on the Riemann sphere because when $\lambda \to \infty$, we have $p^\lambda \to q$. But such a map from the Riemann sphere to a torus is necessarily constant, since $dz$ is a regular differential on the torus, hence $\phi^*(dz)$ has to vanish on the Riemann sphere. Since $\phi(\infty) = 0$ one gets $\phi = 0$.

In order to prove the converse and the Jacobi theorem one has to exhibit particular functions or divisors. This will also be a consequence of the powerful Riemann theorem that we will show later on. To show the generic uniqueness of the $g$ points mapping to a given $\lambda$, note that we have $g$ equations for $g$ unknowns, hence the space of solutions is generically of dimension 0. If we have two solutions, there exists a meromorphic function $f$ whose poles and zeroes are respectively these solutions. Then $f + \lambda$ for $\lambda \in \mathbb{C}$ relates the divisor of poles to a one-parameter family of equivalent divisors, hence the solution space would be of dimension 1, a contradiction.

One can embed the Riemann surface $\Sigma$ into its Jacobian $J(\Sigma)$ by the Abel map. Specifically, choose a point $q_0 \in \Sigma$ and define the vector $A(p)$ with coordinates $A_k(p)$ modulo the lattice of periods:

$$A : \Sigma \longrightarrow J(\Sigma) \qquad (15.11)$$

$$A_k(p) = \int_{q_0}^{p} \omega_k \qquad (15.12)$$
Clearly, the Abel map depends on the point $q_0$. But changing this point just amounts to a translation in $J(\Sigma)$. The Abel map is an analytic embedding of the Riemann surface into the $g$-dimensional torus $J(\Sigma)$, i.e. is injective.

15.13 Theta-functions

One can show using Riemann bilinear type identities that the imaginary part of the period matrix $B$ is a positive definite quadratic form. This allows us to define the Riemann theta-function:

$$\theta(z_1, \dots, z_g) = \sum_{m \in \mathbb{Z}^g} e^{2\pi i (m, z) + \pi i (Bm, m)} \qquad (15.13)$$

Since the series is convergent, it defines an analytic function on $\mathbb{C}^g$.
The theta-function has simple automorphy properties with respect to the period lattice of the Riemann surface: for any $l \in \mathbb{Z}^g$ and $z \in \mathbb{C}^g$

$$\theta(z + l) = \theta(z), \qquad \theta(z + Bl) = \exp[-i\pi (Bl, l) - 2i\pi (l, z)] \, \theta(z) \qquad (15.14)$$
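As a numerical illustration of eq. (15.14), the following minimal sketch evaluates a truncated theta series in genus 1, where $B$ is a single complex number with positive imaginary part. The function name `theta`, the truncation `cutoff`, and the sample values of `B`, `z` and `l` are illustrative assumptions, not part of the text.

```python
import cmath

def theta(z, B, cutoff=30):
    """Truncated genus-1 theta series: sum over m of exp(2*pi*i*m*z + pi*i*B*m*m)."""
    total = 0j
    for m in range(-cutoff, cutoff + 1):
        total += cmath.exp(2j * cmath.pi * m * z + 1j * cmath.pi * B * m * m)
    return total

B = 0.3 + 1.1j    # illustrative period, Im B > 0 ensures convergence
z = 0.17 - 0.23j  # arbitrary test point
l = 1             # lattice shift

# First automorphy property: theta(z + l) = theta(z)
periodic_lhs = theta(z + l, B)
periodic_rhs = theta(z, B)

# Second automorphy property: theta(z + B l) = exp(-i pi B l^2 - 2 i pi l z) theta(z)
quasi_lhs = theta(z + B * l, B)
quasi_rhs = cmath.exp(-1j * cmath.pi * B * l * l - 2j * cmath.pi * l * z) * theta(z, B)

print(abs(periodic_lhs - periodic_rhs), abs(quasi_lhs - quasi_rhs))
```

Both printed differences should be at the level of floating-point rounding, since the neglected terms of the series are far below machine precision for this choice of `B`.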
The divisor of the theta-function is the set of points in the Jacobian torus where $\theta(z) = 0$. Note that this is an analytic subvariety of dimension $g - 1$ of the torus, well-defined due to the automorphy property. The fundamental theorem of Riemann expresses the intersection of the image of the embedding of $\Sigma$ into $J(\Sigma)$ with the divisor of the theta-function.

Theorem. Let $w = (w_1, \dots, w_g) \in \mathbb{C}^g$ be arbitrary. Either the function $\theta(A(p) - w)$ vanishes identically for $p \in \Sigma$ or it has exactly $g$ zeroes $p_1, \dots, p_g$ such that:

$$A(p_1) + \cdots + A(p_g) = w - K \qquad (15.15)$$
where $K$ is the so-called vector of Riemann's constants, depending on the curve $\Sigma$ and the point $q_0$ but independent of $w$.

Proof. We first dissect the Riemann surface $\Sigma$ as explained above, obtaining a polygon with boundary $a_1 \cdot b_1 \cdot a_1^{-1} \cdot b_1^{-1} \cdots a_g \cdot b_g \cdot a_g^{-1} \cdot b_g^{-1}$ in $\mathbb{C}$. Consider the analytic function on the polygon (or more precisely on the Riemann surface cut along the previous loops):

$$f(p) = \theta(A(p) - w)$$

Assuming that $f$ does not vanish identically, it has discrete zeroes $p_i$. Since it has no pole the number of these zeroes is given by:

$$\text{number of zeroes} = \frac{1}{2\pi i} \oint \frac{df}{f}$$

where the integral is taken on the boundary of the polygon. This integral is a sum of terms on the arcs $a_k$ and $b_k$. The integrals on the arcs $b_k$ and $b_k^{-1}$ are related by a translation by the $a_k$ period, hence cancel. Similarly the difference of the integrals on $a_k$ and $a_k^{-1}$ reduces by the automorphy property to $2i\pi \oint_{a_k} \omega_k = 2i\pi$. Thus number of zeroes $= g$. To prove the second identity we proceed similarly by considering the integral:

$$\frac{1}{2\pi i} \oint g_k \frac{df}{f}, \quad \text{with } dg_k = \omega_k, \text{ and } g_k(q_0) = 0$$
computed on the edge of the polygon. On one hand, this integral is equal to the sum of residues which occur at the zeroes of $f$, and produces:

$$\sum_{j=1}^{g} g_k(p_j) = \sum_j A_k(p_j)$$
On the other hand, it can be computed as a sum over arcs using the automorphy properties of the theta-function:

$$\int_{a_j + a_j^{-1}} g_k \, d(\log f) = \int_{a_j} g_k \, d(\log f) - \int_{a_j} (g_k + B_{jk}) \big( d(\log f) - 2i\pi \, dA_j(p) \big) \equiv 2i\pi \int_{a_j} g_k \omega_j$$

modulo $B$ periods. But we have:

$$\int_{b_j + b_j^{-1}} g_k \, d(\log f) = \int_{b_j} (g_k + \delta_{jk}) \, d(\log f) - \int_{b_j} g_k \, d(\log f) \equiv \delta_{jk} \big( 2i\pi w_j - 2i\pi A_j(q_1) \big)$$

modulo periods, where $q_1$ is the base point of all the loops at the boundary. Putting everything together one gets the Riemann formula with $K$ given by a complicated expression, independent of $w$.

Corollary (Jacobi's theorem). Any point in the Jacobian $J(\Sigma)$ is the image of some degree $g$ divisor $p_1 + \cdots + p_g$ on $\Sigma$.

Proof. One has to find $g$ points such that $A(p_1) + \cdots + A(p_g) = z$ modulo periods for given $z$. We find these points by solving the equation $\theta(A(p) - K - z) = 0$.

The divisor of the zeroes of a theta-function also has a nice characterization in terms of points on the Riemann surface.

Theorem. The zero divisor of a theta-function can be written as

$$X = -K - \sum_{i=1}^{g-1} A(\eta_i)$$
Proof. Let $X$ be a point of the $\Theta$ divisor, i.e. $\theta(X) = 0$. Consider the function

$$\theta(A(p) + X) \qquad (15.16)$$
By Riemann's theorem, the zeroes of this function are such that (if it does not vanish identically)

$$A(p_1) + \cdots + A(p_g) + X = -K$$

Among these zeroes, one has necessarily $q_0$, the base point of Abel's map (since $\theta(X) = 0$), say $p_g = q_0$. Hence we have $X = -K - \sum_{i=1}^{g-1} A(p_i)$. Conversely, if $X$ is of this form, one solves eq. (15.16) producing a divisor of $g$ points $\sum_{i=1}^{g} q_i$ such that $\sum_{i=1}^{g-1} A(p_i) = \sum_{i=1}^{g} A(q_i)$. A solution is obviously $p_i = q_i$ for $i = 1, \dots, g - 1$ and $q_g = q_0$. This solution is generically unique up to permutation due to the Jacobi theorem. Equation (15.16) for $p = q_0$ reads $\theta(X) = 0$. Note that the space of points of the form $X = -K - \sum_{i=1}^{g-1} A(p_i)$ and the solution of $\theta(X) = 0$ are both closed in the Jacobian, hence they are equal.

The Riemann theorem can be used to express meromorphic functions in terms of theta-functions. Let $f$ be a meromorphic function with $g$ poles at points $\delta_1, \dots, \delta_g$ and an additional pole at the point $q^+$ and one of its zeroes at a specified point $q^-$; i.e. with divisor greater than $D = -\delta_1 - \cdots - \delta_g - q^+ + q^-$. By the Riemann–Roch theorem there is a unique such function generically. Let $w$, $w^+$, $w^-$, $w^0$ be vectors defined by the formulae:

$$w = \sum_{s=1}^{g} A(\delta_s) + K$$

$$w^+ = A(q^+) + \sum_{s=2}^{g} A(\delta_s) + K$$

$$w^- = A(q^-) + \sum_{s=2}^{g} A(\delta_s) + K$$

$$w^0 = w + w^+ - w^-$$

Let us define the function

$$f(p) = \frac{\theta(A(p) - w^-) \, \theta(A(p) - w^0)}{\theta(A(p) - w) \, \theta(A(p) - w^+)} \qquad (15.17)$$
From the Riemann theorem it follows that the two factors in the denominator vanish generically at the points $\delta_1, \dots, \delta_g$ and $q^+, \delta_2, \dots, \delta_g$, respectively. Similarly, the two factors in the numerator vanish at $q^-, \delta_2, \dots, \delta_g$ and $g$ other points. The zeroes at $\delta_2, \dots, \delta_g$ cancel between the numerator and the denominator, thereby leaving us with the correct divisor of
zeroes and poles. It remains to show that the function $f$ is well-defined on $\Sigma$. This is because, due to the definition of $w^0$, the automorphy factors of the theta functions in eq. (15.14) cancel between the numerator and the denominator when $p$ describes b-cycles on $\Sigma$. The converse of Abel's theorem results from an analogous construction.

15.14 The genus 1 case

The application of these ideas to the genus 1 case is the classical theory of elliptic functions. A genus 1 analytic surface can be viewed as the quotient of $\mathbb{C}$ by a lattice, whose periods are denoted by a classical convention $2\omega_1$, $2\omega_2$. In this case, the theorems of Abel and Jacobi identify the curve and its Jacobi variety. Note that $dz$ is a well-defined regular analytic differential on this torus, and spans the one-dimensional space of Abelian differentials. For any meromorphic function $f$ on the torus, i.e. periodic with respect to the lattice, the differential $f \, dz$ is a meromorphic differential, so the sum of its residues vanishes. Thus $f$ has at least two poles or a pole of order 2, and a meromorphic function with just one simple pole is impossible. The main example is the Weierstrass $\wp$-function:

$$\wp(z) = \frac{1}{z^2} + \sum_{(m,n) \neq (0,0)} \left[ \frac{1}{(z - 2m\omega_1 - 2n\omega_2)^2} - \frac{1}{(2m\omega_1 + 2n\omega_2)^2} \right]$$
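An analogue of the theta-function is provided by the Weierstrass sigma-function:

$$\sigma(z) = z \prod_{(m,n) \neq (0,0)} \left( 1 - \frac{z}{\omega_{mn}} \right) \exp\left[ \frac{z}{\omega_{mn}} + \frac{1}{2} \left( \frac{z}{\omega_{mn}} \right)^2 \right] \qquad (15.18)$$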
with $\omega_{mn} = 2m\omega_1 + 2n\omega_2$. This function is related to the $\wp$-function by the equations:

$$\zeta(z) = \frac{\sigma'(z)}{\sigma(z)}, \qquad \wp(z) = -\zeta'(z) \qquad (15.19)$$
The $\wp$-function is doubly periodic, and the sigma-function and zeta-function transform according to:

$$\zeta(z + 2\omega_l) = \zeta(z) + 2\eta_l, \qquad \sigma(z + 2\omega_l) = -\sigma(z) \, e^{2\eta_l (z + \omega_l)}$$
The Riemann bilinear identity applied to the forms $\Omega_1 = -\wp(z) dz$ and $\Omega_2 = dz$ yields $2(\eta_1 \omega_2 - \eta_2 \omega_1) = i\pi$. These functions have the symmetries

$$\sigma(-z) = -\sigma(z), \qquad \zeta(-z) = -\zeta(z), \qquad \wp(-z) = \wp(z).$$
Their behaviour in the neighbourhood of zero is

$$\sigma(z) = z + O(z^5), \qquad \zeta(z) = z^{-1} + O(z^3), \qquad \wp(z) = z^{-2} + O(z^2)$$
It is useful for the study of the elliptic Calogero model to introduce the Lamé function:

$$\Phi(x, z) = \frac{\sigma(z - x)}{\sigma(x) \, \sigma(z)} \, e^{\zeta(z) x} \qquad (15.20)$$
It has the symmetry property $\Phi(-x, z) = -\Phi(x, -z)$. The function $\Phi(x, z)$ is a doubly-periodic function of the variable $z$, $\Phi(x, z + 2\omega_l) = \Phi(x, z)$, and has an expansion of the form:

$$\Phi(x, z) = \big( -z^{-1} + \zeta(x) + O(z) \big) \, e^{\zeta(z) x} \qquad (15.21)$$
at the point $z = 0$. As a function of $x$, it has the following monodromy properties:

$$\Phi(x + 2\omega_l, z) = \Phi(x, z) \exp 2\big( \zeta(z) \omega_l - \eta_l z \big) \qquad (15.22)$$
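and has a pole at the point $x = 0$: $\Phi(x, z) = x^{-1} + O(x)$. The function $\Phi$ is a solution of the Lamé equation:

$$\left( \frac{d^2}{dx^2} - 2\wp(x) \right) \Phi(x, z) = \wp(z) \, \Phi(x, z) \qquad (15.23)$$

Choosing the periods $\omega_1 = \infty$ and $\omega_2 = i\frac{\pi}{2}$, we obtain the hyperbolic functions:

$$\sigma(z) \to \sinh(z) \exp\left( -\frac{z^2}{6} \right), \qquad \zeta(z) \to \coth(z) - \frac{z}{3}, \qquad \wp(z) \to \frac{1}{\sinh^2(z)} + \frac{1}{3}$$

and

$$\Phi(x, z) \to \frac{\sinh(z - x)}{\sinh(z) \sinh(x)} \, e^{x \coth z}$$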
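In the rational limit, we have

$$\sigma(z) \to z, \qquad \zeta(z) \to \frac{1}{z}, \qquad \wp(z) \to \frac{1}{z^2}, \qquad \Phi(x, z) \to \left( \frac{1}{x} - \frac{1}{z} \right) e^{x/z} \qquad (15.24)$$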
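The hyperbolic limit lends itself to a quick numerical sanity check of the Lamé equation (15.23). The sketch below builds $\Phi$ from the limiting forms of $\sigma$, $\zeta$ and $\wp$ quoted above and approximates the second derivative by a central finite difference. The function names, the step `h` and the sample point `(x0, z0)` are illustrative assumptions.

```python
import math

def wp(u):
    """Hyperbolic limit of the Weierstrass p-function."""
    return 1.0 / math.sinh(u) ** 2 + 1.0 / 3.0

def zeta(u):
    """Hyperbolic limit of the Weierstrass zeta-function."""
    return 1.0 / math.tanh(u) - u / 3.0

def sigma(u):
    """Hyperbolic limit of the Weierstrass sigma-function."""
    return math.sinh(u) * math.exp(-u * u / 6.0)

def phi(x, z):
    """Lame function of eq. (15.20) built from the limiting sigma and zeta."""
    return sigma(z - x) / (sigma(x) * sigma(z)) * math.exp(zeta(z) * x)

def second_derivative(func, x, h=1e-4):
    """Central finite-difference approximation of the second derivative."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

x0, z0 = 0.7, 1.3  # arbitrary sample point away from the poles

# Left- and right-hand sides of the Lame equation (15.23)
lhs = second_derivative(lambda x: phi(x, z0), x0) - 2.0 * wp(x0) * phi(x0, z0)
rhs = wp(z0) * phi(x0, z0)

# Closed form of Phi in the hyperbolic limit
closed_form = (math.sinh(z0 - x0) / (math.sinh(z0) * math.sinh(x0))
               * math.exp(x0 / math.tanh(z0)))

print(lhs - rhs, phi(x0, z0) - closed_form)
```

The first difference is limited by the finite-difference error, the second only by rounding, because the Gaussian factors of $\sigma$ cancel against the $-z/3$ term of $\zeta$.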
15.15 The Riemann–Hilbert factorization problem

In this section, we give the proof of the Riemann–Hilbert theorem on the Riemann sphere, see eq. (3.49) in Chapter 3. Let $U_+$ be the disc $|x| < 1 + \eta$, and $U_-$ be the disc $|x| > 1 - \eta$. Let $C$ be the circle $|x| = 1$.
Let us give ourselves a matrix $A$ analytic in the ring $U_+ \cap U_-$, such that $\det A \neq 0$. We consider the kernel:

$$K(x, t) = A^{-1}(x) A(t) - 1$$

which is analytic in the above ring and vanishes for $x = t$. On continuous functions on $C$ we define an operator $\mathcal{F}$:

$$(\mathcal{F} F)(x) = F(x) + \frac{1}{2i\pi} \oint_C \frac{K(x, t)}{x - t} F(t) \, dt$$

This is a Fredholm operator because it is of the form 1 plus a compact operator. Hence, $\mathrm{Im} \, \mathcal{F}$ is closed and of finite codimension. So one can choose a matrix of polynomials $P(x)$ such that $\det P(x) \neq 0$ for $|x|$ sufficiently large and such that there exists a function $F$ satisfying

$$(\mathcal{F} F)(x) = A^{-1}(x) P(x), \qquad x \in C$$

The function $F(x)$ has an analytical extension on the ring $U_+ \cap U_-$ given by

$$F(x) = A^{-1}(x) P(x) + \frac{1}{2i\pi} \oint_C \frac{K(x, t)}{t - x} F(t) \, dt \qquad (15.25)$$

It follows that one can expand $F(x)$ in a Laurent series and write $F(x) = F_+(x) + F_-(x)$, where $F_\pm(x)$ are analytic in $U_\pm$ respectively and $F_-(x)$ vanishes at $x = \infty$. Note that for $|x| > 1$, one has

$$F_-(x) = -\frac{1}{2i\pi} \oint_C \frac{F(t)}{t - x} \, dt$$

Subtracting from eq. (15.25), we get for $1 < |x| < 1 + \eta$:

$$F_+(x) = A^{-1}(x) P(x) + \frac{1}{2i\pi} \oint_C \frac{F(t) + K(x, t) F(t)}{t - x} \, dt$$

Multiplying by $A(x)$, we get in the same domain

$$A(x) F_+(x) = P(x) + \frac{1}{2i\pi} \oint_C \frac{A(t) F(t)}{t - x} \, dt \equiv \tilde{F}_-(x)$$

The function $\tilde{F}_-(x)$ is in fact analytic for $|x| > 1$ and behaves as $P(x)$ for $x \to \infty$, so that its determinant does not vanish for $|x|$ sufficiently large. It follows that $\det F_+(x)$ and $\det \tilde{F}_-(x)$ do not vanish identically and therefore have a finite number of zeroes at finite distance. We remove each zero successively by the following procedure. Consider an equation of the form

$$A G_+ = G_- \qquad (15.26)$$
Suppose that $\det G_+(x)$ has a simple zero at $x_0$ in $U_+$. This means that there is a linear combination of its columns vanishing at $x_0$. This combination is realized by multiplying on the right by a constant matrix $M$ such that $\det M \neq 0$. We have of course $A G_+ M = G_- M$. Let us assume that it is the first column of $G_+ M$ which vanishes at $x_0$. Multiplying on the right by $\mathrm{diag}(1/(x - x_0), 1, \dots, 1)$, we remove the zero $x_0$ without modifying the analytic properties of the right-hand side. Iterating the procedure, we get a pair $G_+$, $G_-$ satisfying eq. (15.26) with $\det G_+(x) \neq 0$ for $x \in U_+$. It follows that the zeroes of $\det G_-(x)$ are not in $U_+$ and can be removed by the same procedure without modifying the analytic properties of $G_+(x)$. At the end we get a matrix $G_-(x)$ behaving at $\infty$ as

$$G_-(x) = A_-(x) \, \mathrm{diag}(x^{\kappa_1}, \dots, x^{\kappa_N}), \qquad \det A_-(\infty) \neq 0$$

Setting $A_+ = G_+$, we have finally factorized

$$A = A_- \, \mathrm{diag}(x^{\kappa_1}, \dots, x^{\kappa_N}) \, A_+^{-1}, \qquad \det A_\pm \neq 0$$
Remark. When $A$ is close to the identity matrix one can write $A = 1 + \epsilon \tilde{A}$ and the map $\mathcal{F}$ is surjective for sufficiently small $\epsilon$. In that case one can take $P(x) = 1$, and it is then clear that $F_+(x) = 1 + O(\epsilon)$, $\tilde{F}_-(x) = 1 + O(\epsilon)$, so that their determinants do not vanish. Hence all the indices $\kappa_i$ vanish.
16 Lie algebras
We present basic facts about Lie groups and Lie algebras. We describe semi-simple Lie algebras and their representations, which can be characterized in terms of roots and weights. We discuss infinite-dimensional Lie algebras, called affine Kac–Moody algebras, which are at the heart of the study of field theoretical integrable systems. In particular we construct the so-called level one representations using the techniques of Fock spaces and vertex operators introduced in Chapter 9.

16.1 Lie groups and Lie algebras

A Lie group is a group $G$ which is at the same time a differentiable manifold, and such that the group operation $(g, h) \to g h^{-1}$ is differentiable. Due to a theorem of Montgomery and Zippin, the differentiable structure is automatically real analytic. The maps $h \to gh$ and $h \to hg$ are called respectively left and right translations by $g$. Their differentials at the point $h$ map the tangent space $T_h(G)$ respectively to $T_{gh}(G)$ and $T_{hg}(G)$. We will denote by $g \cdot X$ and $X \cdot g$ the images of $X \in T_h(G)$ by these maps. This notation is coherent because, differentiating the associativity condition in $G$, one gets $(g \cdot X) \cdot h = g \cdot (X \cdot h)$, and $g \cdot (h \cdot X) = (gh) \cdot X$. In particular, this last relation shows that, for any $X$ in the tangent space of $G$ at the unit element $e$, the vector field with value $g \cdot X$ at $g$ is invariant under any left translation. Conversely, any such left-invariant vector field is of the form $g \cdot X$. So in the following we identify left-invariant vector fields on $G$ and elements of $T_e(G)$. This finite-dimensional vector space is called the Lie algebra of $G$ and will be denoted by $\mathcal{G}$.

Alternatively, one can define the vector field $X(g)$ by its action on any function $f$, that is $(X \cdot f)(g)$ is the derivative of $f$ along the tangent
vector $X$ at $g$. This defines a new function $X \cdot f$. The left invariance of the vector field $X(g)$ means that $(X \cdot f)(hg) = (X \cdot {}^h f)(g)$, where ${}^h f : g \to f(hg)$. In other words, the differential operator $X$ commutes with left translations. More generally, for $X_1, X_2, \dots \in \mathcal{G}$, one can consider linear combinations of differential operators (of any order) of the form $X_1 \cdot (X_2 \cdots (X_k \cdot f) \cdots)$ which obviously form an associative algebra of left-invariant differential operators. The Lie algebra is embedded into this associative algebra as the set of first order differential operators. One defines the Lie bracket $[X, Y] = XY - YX$, in terms of the associative algebra product. The main point is that, while $XY$ and $YX$ are second order differential operators, their commutator is a first order differential operator, so that $[X, Y]$ belongs to the Lie algebra. It is clear from this definition that $(X, Y) \to [X, Y]$ is bilinear antisymmetric and obeys the Jacobi identity:

$$[[X, Y], Z] + [[Z, X], Y] + [[Y, Z], X] = 0 \qquad (16.1)$$
The associative algebra of left-invariant differential operators on $G$ is called the universal enveloping algebra of the Lie algebra $\mathcal{G}$ and will be denoted by $\mathcal{U}(\mathcal{G})$. Finally, there is a natural action of the Lie group $G$ on its Lie algebra $\mathcal{G}$ called the adjoint action. Note that for any $X$ in the tangent space at $e$ and any $g$, $g \cdot X \cdot g^{-1}$ is also in the tangent space at $e$ to $G$. We define the adjoint action:

$$\mathrm{Ad}(g)(X) = g \cdot X \cdot g^{-1}, \qquad X \in \mathcal{G} \qquad (16.2)$$
and note that $\mathrm{Ad}_{gh} = \mathrm{Ad}_g \mathrm{Ad}_h$, so this is a group action of $G$ on $\mathcal{G}$. Let $X \in \mathcal{G}$ and consider the left-invariant vector field $g \cdot X$. For small $t$ we can solve the differential equation $\dot{g} = g \cdot X$ with initial condition $g(0) = e$. The solution $g(t) \in G$ is such that $g(s + t) = g(s) g(t)$ for $s$, $t$ small. This is because both members solve the above differential equation with initial value $g(s)$ for $t = 0$. One can then use this property to show that the solution of the differential equation extends to a domain larger than initially defined (if we have a solution for $|s| \leq \epsilon$ then $g(s + t)$ is a solution for $|s + t| \leq 2\epsilon$ and still solves the equation there), and successively extends to all of $\mathbb{R}$. The solution, defined for all $t$, is denoted by $\exp(tX)$. In particular, for $t = 1$ this defines the exponential map from $\mathcal{G}$ to $G$ (obviously $\exp(tX)$ belongs to the connected component of $e$ in $G$, so in the following we assume that $G$ is connected). The exponential map allows us to relate subgroups of $G$ to subalgebras of $\mathcal{G}$. Let $\mathcal{H}$ be a subalgebra of $\mathcal{G}$, and $H$ be the smallest subgroup of $G$ containing all the $\exp(X)$ for $X \in \mathcal{H}$. One can show that $H$ is a Lie
subgroup of $G$ with Lie algebra $\mathcal{H}$. The precise definition of a Lie subgroup is quite tricky, in particular $H$ may be embedded in a very complicated way in $G$, as shown by the simple example below. Conversely, given $H$ its Lie algebra $\mathcal{H}$ is the set of $X \in \mathcal{G}$ such that $t \to e^{tX}$ is a continuous curve in $H$.

Example. Consider the torus $\mathbb{R}^2 / \mathbb{Z}^2$ with the Abelian group law induced by the addition in $\mathbb{R}^2$. This is a Lie group $G$, with Abelian Lie algebra $\mathbb{R}^2$. Consider the one-parameter subgroup $H = \{\exp(tX) \,|\, t \in \mathbb{R}\}$, which is a Lie subgroup. When $X$ has irrational slope, this subgroup is dense in $G$. In particular, any neighbourhood $U$ of $e$ contains infinitely many components of $H$.

A much nicer situation is obtained when $H$ is closed in $G$, and fortunately this is the situation of interest for our purposes. We are mostly interested in the case when the Lie group $G$ acts differentiably on a manifold $M$, and $H$ is the stabilizer of a point $m \in M$. In this case, since the operation is continuous, $H$ is automatically closed in $G$. Remarkably, this is sufficient to ensure that $H$ is a closed Lie subgroup of $G$, thanks to a theorem of E. Cartan.

Theorem. If $H$ is a closed subgroup of a Lie group $G$, there exists a unique analytic structure on $H$ such that $H$ is a Lie subgroup of $G$.

Proof. We sketch the proof. The idea is to define $\mathcal{H} = \{X \in \mathcal{G} \,|\, \exp(tX) \in H, \ \forall t\}$ and show that this is a subalgebra of $\mathcal{G}$. Easy computations show that:

$$\left( \exp\Big(\frac{t}{n} X\Big) \exp\Big(\frac{t}{n} Y\Big) \right)^{n} = \exp\big( t(X + Y) + O(1/n) \big)$$

$$\left( \exp\Big(\frac{t}{n} X\Big) \exp\Big(\frac{t}{n} Y\Big) \exp\Big(\frac{-t}{n} X\Big) \exp\Big(\frac{-t}{n} Y\Big) \right)^{n^2} = \exp\big( t^2 [X, Y] + O(1/n) \big)$$

so that, if $X, Y \in \mathcal{H}$, both $(X + Y)$ and $[X, Y]$ are in $\mathcal{H}$ since $H$ is closed. One then uses the closedness of $H$ to show that the Lie subgroup of $G$ of Lie subalgebra $\mathcal{H}$ is in fact equal to $H$ (more precisely to the connected component of the identity in $H$).

In the situation described above where $G$ acts on $M$, one can show that the application $g \to g \cdot m$ of $G$ onto the orbit $O_m$ of $m$ is open so that $O_m$ is isomorphic to the homogeneous space $G/H$.
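The two limit formulas used in the proof sketch can be checked numerically on small matrices. The following sketch is only an illustration under stated assumptions: it takes $X$ and $Y$ to be the raising and lowering matrices of $sl(2)$, implements the matrix exponential by a truncated Taylor series, and all helper names (`mat_exp`, `mat_pow`, and so on) as well as the values of `t` and `n` are hypothetical choices made for this example.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=30):
    """Matrix exponential by a truncated Taylor series (fine for small norms)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, a))
        result = mat_add(result, term)
    return result

def mat_pow(a, n):
    """Integer power by repeated squaring."""
    result = identity(len(a))
    base = a
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
bracket = mat_add(mat_mul(X, Y), mat_scale(-1.0, mat_mul(Y, X)))  # [X, Y]
t, n = 0.5, 200

ex, ey = mat_exp(mat_scale(t / n, X)), mat_exp(mat_scale(t / n, Y))
emx, emy = mat_exp(mat_scale(-t / n, X)), mat_exp(mat_scale(-t / n, Y))

# (exp(tX/n) exp(tY/n))^n compared with exp(t(X + Y))
sum_err = max_abs_diff(mat_pow(mat_mul(ex, ey), n),
                       mat_exp(mat_scale(t, mat_add(X, Y))))

# (exp(tX/n) exp(tY/n) exp(-tX/n) exp(-tY/n))^(n^2) compared with exp(t^2 [X, Y])
group_comm = mat_mul(mat_mul(ex, ey), mat_mul(emx, emy))
comm_err = max_abs_diff(mat_pow(group_comm, n * n),
                        mat_exp(mat_scale(t * t, bracket)))

print(sum_err, comm_err)
```

Both errors are expected to shrink roughly like $1/n$ as `n` grows, in line with the $O(1/n)$ corrections in the formulas.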
Examples. The most natural examples of Lie groups are provided by so-called algebraic subgroups of the general linear group $GL(n)$. These are subvarieties of $GL(n)$ defined by polynomial equations compatible with the multiplication law. For example, the special linear group is defined by the equation $\det g = 1$, and the product of two such matrices has determinant 1. Hence these are naturally closed Lie subgroups of $GL(n)$. The other standard examples are the subgroups of orthogonal and symplectic matrices.

For any Lie group $G$, a representation $\rho$ on a vector space $V$ is a differentiable group homomorphism $G \to GL(V)$. For $g \in G$ and $v \in V$ one denotes $g \cdot v = \rho(g)(v)$ so that $(gh) \cdot v = g \cdot (h \cdot v)$. The differential of $\rho$ at $e$ maps the Lie algebra $\mathcal{G}$ to the Lie algebra $gl(V)$ of $GL(V)$. Similarly, left-invariant vector fields on $G$ are mapped on left-invariant vector fields on $GL(V)$, and so are their Lie brackets. Hence we get a Lie algebra homomorphism from $\mathcal{G}$ to $gl(V)$, i.e. a representation of $\mathcal{G}$, which we shall also denote by $\rho$. In other words, we have $[\rho(X), \rho(Y)] = \rho([X, Y])$. Such a representation is faithful if $\rho : \mathcal{G} \to gl(V)$ is injective (i.e. for $X \neq 0$ there exists $v \in V$ such that $\rho(X)(v) \neq 0$). In this case $\mathcal{G}$ may be seen as a subalgebra of $gl(V)$. There is a natural representation of any Lie group on its Lie algebra, i.e. $V = \mathcal{G}$, given by the adjoint representation. This induces a representation of $\mathcal{G}$ on $\mathcal{G}$, also called the adjoint representation:

$$\mathrm{ad}_X(Y) = [X, Y] \qquad (16.3)$$
It is easy to check that this is a representation of $\mathcal{G}$, using the Jacobi identity. Almost all results on Lie algebras are obtained by studying this representation.
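To make the representation property of the adjoint map concrete, here is a minimal sketch for $sl(2)$ with the standard brackets $[H, E] = 2E$, $[H, F] = -2F$, $[E, F] = H$. The basis ordering, the array `f` of structure constants and the helper names are assumptions introduced for this illustration. The code checks $[\mathrm{ad}_X, \mathrm{ad}_Y] = \mathrm{ad}_{[X, Y]}$ on all pairs of basis elements.

```python
DIM = 3
H, E, F = 0, 1, 2  # basis ordering, chosen for this example

# f[a][b][c] is the coefficient of T_c in [T_a, T_b]
f = [[[0] * DIM for _ in range(DIM)] for _ in range(DIM)]

def set_bracket(a, b, c, value):
    """Record [T_a, T_b] = value * T_c together with its antisymmetric partner."""
    f[a][b][c] = value
    f[b][a][c] = -value

set_bracket(H, E, E, 2)
set_bracket(H, F, F, -2)
set_bracket(E, F, H, 1)

def ad(a):
    """Matrix of ad(T_a): the entry in row c, column b is f[a][b][c]."""
    return [[f[a][b][c] for b in range(DIM)] for c in range(DIM)]

def mat_mul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def commutator(m, n):
    mn, nm = mat_mul(m, n), mat_mul(n, m)
    return [[mn[i][j] - nm[i][j] for j in range(DIM)] for i in range(DIM)]

def ad_of_bracket(a, b):
    """ad applied to [T_a, T_b] = sum_c f[a][b][c] T_c, by linearity."""
    result = [[0] * DIM for _ in range(DIM)]
    for c in range(DIM):
        m = ad(c)
        for i in range(DIM):
            for j in range(DIM):
                result[i][j] += f[a][b][c] * m[i][j]
    return result

is_representation = all(
    commutator(ad(a), ad(b)) == ad_of_bracket(a, b)
    for a in range(DIM) for b in range(DIM)
)
print(is_representation)
```

Since all entries are integers the comparison is exact. The check succeeds precisely because these structure constants satisfy the Jacobi identity.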
16.2 Semi-simple Lie algebras

Because there is such an interplay between Lie groups and Lie algebras, we study here Lie algebras from an algebraic viewpoint. In this section we consider Lie algebras over the complex numbers, e.g. complexifications $\mathcal{G}_{\mathbb{R}} \otimes \mathbb{C}$ of real Lie algebras. We will often use a basis $(T_a)$, $a = 1, \dots, \dim \mathcal{G}$, on the complex Lie algebra $\mathcal{G}$. The Lie bracket is then expressed as:

$$[T_a, T_b] = f_{ab}^{c} T_c$$

The coefficients $f_{ab}^{c}$ are called structure constants. Note that in this basis the matrix elements of the adjoint representation are $(\mathrm{ad} \, T_a)^{c}_{\ b} = f_{ab}^{c}$.
The adjoint representation eq. (16.3) allows us to define a natural bilinear form on $\mathcal{G}$, also called the Killing form, by:

$$(X, Y) = \mathrm{Tr}(\mathrm{ad}_X \, \mathrm{ad}_Y)$$

This bilinear form is invariant in the sense that:

$$([X, Y], Z) = (X, [Y, Z])$$

This results immediately from the cyclic invariance of the trace and the fact that $X \to \mathrm{ad}_X$ is a representation. The invariance property also means that $(\mathrm{ad}_Y X, Z) + (X, \mathrm{ad}_Y Z) = 0$.

A Lie algebra is said to be semi-simple if it does not contain any non-trivial Abelian ideal. The Cartan criterion says that this is the case if and only if the Killing form is non-degenerate. In one direction this is easy. If $X$ belongs to an Abelian ideal $I$, and $Y$ is arbitrary, choose a basis $T_a$ of $\mathcal{G}$ such that $T_a$ is a basis of $I$ for $a = 1, \dots, r$. Then $\mathrm{ad}_Y \mathrm{ad}_X (T_a)$ vanishes for $a \leq r$ and belongs to $I$ for $a > r$, hence $\mathrm{ad}_Y \mathrm{ad}_X$ has no diagonal element in this basis, and its trace vanishes. We see that any Abelian ideal is in the kernel of the Killing form.

A Lie algebra $\mathcal{G}$ is called simple if it is semi-simple and its only ideals are either $\{0\}$ or the algebra $\mathcal{G}$ itself. For any ideal $I$ in a semi-simple algebra, its orthogonal under the Killing form is also an ideal. Moreover, by invariance and non-degeneracy of the Killing form one sees that $I \cap I^\perp$ is an Abelian ideal (for $X \in I$, $Y \in I^\perp$, and $Z$ arbitrary, $([X, Y], Z) = (X, [Y, Z]) = 0$), hence vanishes. It follows that $\mathcal{G}$ is the direct sum of its simple ideals, this being an orthogonal decomposition.

We now introduce the concept of a Cartan subalgebra in a semi-simple Lie algebra. First, an element $X$ of $\mathcal{G}$ is called semi-simple if $\mathrm{ad}_X$ is a diagonalizable matrix in the adjoint representation. A Cartan subalgebra $\mathcal{H}$ of a semi-simple Lie algebra $\mathcal{G}$ is a maximal Abelian subalgebra of $\mathcal{G}$ whose elements are all semi-simple. The existence of such a subalgebra is a very non-trivial result.
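The Killing form and the easy direction of the Cartan criterion discussed above can be illustrated on two small examples. The sketch below assumes the standard $sl(2)$ brackets and, as a non-semi-simple counterexample, the two-dimensional algebra with $[T_0, T_1] = T_1$, in which $T_1$ spans an Abelian ideal. All function names and the dictionary encoding of the brackets are hypothetical choices for this example.

```python
def structure_constants(dim, brackets):
    """brackets maps (a, b) with a < b to {c: coeff}, meaning [T_a, T_b] = sum coeff T_c."""
    f = [[[0] * dim for _ in range(dim)] for _ in range(dim)]
    for (a, b), coeffs in brackets.items():
        for c, value in coeffs.items():
            f[a][b][c] = value
            f[b][a][c] = -value
    return f

def adjoint_matrices(f):
    """ad(T_a) has the entry f[a][b][c] in row c, column b."""
    dim = len(f)
    return [[[f[a][b][c] for b in range(dim)] for c in range(dim)]
            for a in range(dim)]

def killing_form(f):
    """Matrix of (T_a, T_b) = Tr(ad T_a ad T_b)."""
    ads = adjoint_matrices(f)
    dim = len(f)
    return [[sum(x[i][j] * y[j][i] for i in range(dim) for j in range(dim))
             for y in ads] for x in ads]

def det(m):
    """Determinant by Laplace expansion (adequate for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_invariant(f, k):
    """Check ([T_a, T_b], T_c) = (T_a, [T_b, T_c]) on all basis triples."""
    dim = len(f)
    return all(
        sum(f[a][b][d] * k[d][c] for d in range(dim))
        == sum(f[b][c][d] * k[a][d] for d in range(dim))
        for a in range(dim) for b in range(dim) for c in range(dim)
    )

# sl(2) in the basis (H, E, F): [H, E] = 2E, [H, F] = -2F, [E, F] = H
sl2 = structure_constants(3, {(0, 1): {1: 2}, (0, 2): {2: -2}, (1, 2): {0: 1}})
# Two-dimensional algebra [T0, T1] = T1, with Abelian ideal spanned by T1
affine = structure_constants(2, {(0, 1): {1: 1}})

killing_sl2 = killing_form(sl2)
killing_affine = killing_form(affine)
print(killing_sl2, det(killing_sl2))
print(killing_affine, det(killing_affine))
```

In the second example the row and column of the Killing matrix belonging to $T_1$ vanish, which is the statement that an Abelian ideal lies in the kernel of the Killing form.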
To construct a Cartan subalgebra one starts with a regular element, that is an element $X$ of $\mathcal{G}$ such that $\mathrm{ad}_X$ has a maximal number of distinct eigenvalues (as a matrix in the adjoint representation). Then the subalgebra of $\mathcal{G}$ on which $\mathrm{ad}_X$ is nilpotent is a Cartan subalgebra. One can show that any two Cartan subalgebras are related by a Lie algebra automorphism, and their common dimension is called the rank of the Lie algebra and will be denoted by $\mathrm{rank} \, \mathcal{G}$. In the adjoint representation, the endomorphisms $\mathrm{ad}(H)$ for $H \in \mathcal{H}$ form a system of commuting diagonalizable endomorphisms. We can thus diagonalize them simultaneously. Let $E_\alpha \in \mathcal{G}$ be the common eigenvectors:

$$\mathrm{ad}(H) \cdot E_\alpha = \alpha(H) \, E_\alpha$$
The application $\alpha : H \in \mathcal{H} \to \alpha(H) \in \mathbb{C}$ is a linear form defined over $\mathcal{H}$. That is, $\alpha$ belongs to the dual of the Cartan algebra: $\alpha \in \mathcal{H}^*$. These forms are called the roots of the Lie algebra. We shall denote their set by $\Delta$. They satisfy a few simple properties: (i) if $\alpha$ is a root then so is $-\alpha$, (ii) a non-zero root is non-degenerate (i.e. the eigenspace is of dimension 1), (iii) if $\alpha$ is a root and $t \in \mathbb{C}$, $t\alpha$ is not a root, except for $t = \pm 1$. Let $\{H_i\}$ be a basis of the Cartan subalgebra. Then $\{H_i, E_\alpha\}$ form a basis of the Lie algebra $\mathcal{G}$, on which the Killing form has a very simple structure, namely:

$$(H_i, E_\alpha) = 0, \qquad (E_\alpha, E_\beta) = 0 \ \text{ if } \ \alpha + \beta \neq 0 \qquad (16.4)$$
This is because $(H, [H', E_\alpha]) = \alpha(H')(H, E_\alpha) = ([H, H'], E_\alpha) = 0$, and $([H, E_\alpha], E_\beta) = \alpha(H)(E_\alpha, E_\beta) = -([H, E_\beta], E_\alpha) = -\beta(H)(E_\alpha, E_\beta)$. As a consequence, the restriction of the Killing form to the Cartan subalgebra is non-degenerate, otherwise the Killing form would be degenerate on the Lie algebra. It is convenient to introduce the isomorphism between $\mathcal{H}$ and its dual $\mathcal{H}^*$ induced by the Killing form:

$$\alpha \in \mathcal{H}^* \to H_\alpha \in \mathcal{H} \quad \text{with} \quad \alpha(H) = (H_\alpha, H), \quad \forall H \in \mathcal{H}$$

This defines, for any $\alpha \in \mathcal{H}^*$, an element $H_\alpha \in \mathcal{H}$ depending linearly on $\alpha$. We may then define a bilinear form on $\mathcal{H}^*$ by:

$$(\alpha, \beta) = (H_\alpha, H_\beta) = \alpha(H_\beta), \qquad \alpha, \beta \in \mathcal{H}^*$$
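This form is non-degenerate because the Killing form is non-degenerate on the Cartan subalgebra. Moreover, $H_\alpha$ for $\alpha \in \Delta$ span the Cartan subalgebra. In the basis $\{H_i, E_\alpha\}$ the Lie bracket reads:

$$[H_i, H_j] = 0$$

$$[H_i, E_\alpha] = \alpha(H_i) E_\alpha \qquad (16.5)$$

$$[E_\alpha, E_\beta] = \begin{cases} (E_\alpha, E_{-\alpha}) H_\alpha & \text{if } \alpha + \beta = 0 \\ C_{\alpha,\beta} E_{\alpha+\beta} & \text{if } \alpha + \beta \in \Delta \\ 0 & \text{if } \alpha + \beta \notin \Delta \end{cases}$$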
with $C_{\alpha,\beta}$ some structure constants. Here we remark that $[E_\alpha, E_\beta]$ either vanishes or is proportional to $E_{\alpha+\beta}$ if $\alpha + \beta \neq 0$ is a root, because $[H, [E_\alpha, E_\beta]] = [[H, E_\alpha], E_\beta] + [E_\alpha, [H, E_\beta]] = (\alpha(H) + \beta(H))[E_\alpha, E_\beta]$. If,
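however, $\alpha + \beta = 0$, this shows that $[E_\alpha, E_{-\alpha}]$ is in the Cartan subalgebra, and we have $(H, [E_\alpha, E_{-\alpha}]) = ([H, E_\alpha], E_{-\alpha}) = \alpha(H)(E_\alpha, E_{-\alpha}) = (H, (E_\alpha, E_{-\alpha}) H_\alpha)$.

For each root $\alpha$ the three generators $H_\alpha$, $E_\alpha$, $E_{-\alpha}$ form an $sl(2)$ subalgebra of $\mathcal{G}$. This allows us to study the $\alpha$-chain through $\beta$, that is the set of roots of the form $\beta + n\alpha$, using the commutation relations:

$$[H_\alpha, E_{\pm\alpha}] = \pm\alpha(H_\alpha) E_{\pm\alpha}, \qquad [E_\alpha, E_{-\alpha}] = (E_\alpha, E_{-\alpha}) H_\alpha$$

The vectors $\mathrm{ad}^j_{E_{\pm\alpha}} E_\beta$ for $j \in \mathbb{N}$ are obviously linearly independent root vectors in $\mathcal{G}$, for the roots $\beta \pm j\alpha$, if they don't vanish. They span a representation space for the considered $sl(2)$ and since this representation is of finite dimension, the chain must be of finite length. Let $p \leq 0$ be the minimal index such that $\beta + p\alpha$ is a root, and $q \geq 0$ be the maximal index such that $\beta + q\alpha$ is a root. Let $\beta' = \beta + p\alpha$ and consider the vectors $v_j = \mathrm{ad}^j_{E_\alpha} E_{\beta'}$ for $j \in \mathbb{N}$. By the minimality of $p$, we have $\mathrm{ad} \, E_{-\alpha} \, v_0 = 0$. Using this property, we can compute:

$$\mathrm{ad} \, H_\alpha \, v_j = \big( \beta'(H_\alpha) + j\alpha(H_\alpha) \big) v_j$$

$$\mathrm{ad} \, E_{-\alpha} \, v_j = -j (E_\alpha, E_{-\alpha}) \Big( \beta'(H_\alpha) + (j - 1) \frac{\alpha(H_\alpha)}{2} \Big) v_{j-1} \qquad (16.6)$$

Since $v_{q-p+1}$ vanishes, but $v_{q-p}$ does not, we have $\beta'(H_\alpha) + (q - p)\alpha(H_\alpha)/2 = 0$ or:

$$2 \frac{(\beta, \alpha)}{(\alpha, \alpha)} = -(p + q) \in \mathbb{Z} \qquad (16.7)$$

This result allows us to show that the Killing form induces a positive definite scalar product on the real vector space $\sum_{\alpha \in \Delta} \mathbb{R} H_\alpha$. By duality this defines a positive definite scalar product on $\sum_{\alpha \in \Delta} \mathbb{R} \alpha$. Indeed, computing the Killing form on the basis of $\mathcal{G}$ provided by the $H_i$ and the $E_{\pm\alpha}$, we have

$$(\alpha, \beta) = (H_\alpha, H_\beta) = \mathrm{Tr}(\mathrm{ad} \, H_\alpha \, \mathrm{ad} \, H_\beta) = \sum_\gamma (\alpha, \gamma)(\beta, \gamma)$$

Taking $\alpha = \beta$ and dividing by $(\alpha, \alpha)^2$, we get:

$$\frac{1}{(\alpha, \alpha)} = \sum_\gamma \left( \frac{(\alpha, \gamma)}{(\alpha, \alpha)} \right)^2$$

so that $(\alpha, \alpha)$ is a rational number. It follows that $(\alpha, \beta)$ is a rational number, hence is real. Then for any $x = \sum_\alpha x_\alpha \alpha$ with $x_\alpha \in \mathbb{R}$ we have $(x, x) = \sum_\gamma (x, \gamma)^2 \geq 0$ and this vanishes only if $x = 0$.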
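The chain relation (16.7) can be verified by brute force on a small root system. The sketch below uses the eight roots of $B_2$, realized as integer vectors in the plane with the Euclidean scalar product; this realization is a standard one but is an assumption not made in the text, and the scalar product is only proportional to the one induced by the Killing form, which does not affect the ratio in (16.7). The helper names are hypothetical.

```python
# Roots of B2 as integer vectors: four short roots and four long roots
ROOTS = {(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def shifted(beta, alpha, n):
    """Return beta + n * alpha."""
    return tuple(b + n * a for a, b in zip(alpha, beta))

def chain_bounds(alpha, beta, roots):
    """Return (p, q) with p <= 0 <= q such that beta + n*alpha is a root for p <= n <= q."""
    p = 0
    while shifted(beta, alpha, p - 1) in roots:
        p -= 1
    q = 0
    while shifted(beta, alpha, q + 1) in roots:
        q += 1
    return p, q

checks = []
for alpha in ROOTS:
    minus_alpha = tuple(-a for a in alpha)
    for beta in ROOTS:
        if beta in (alpha, minus_alpha):
            continue  # the chain through +-alpha passes through 0, which is not a root
        p, q = chain_bounds(alpha, beta, ROOTS)
        # eq. (16.7) written without division: 2 (beta, alpha) = -(p + q) (alpha, alpha)
        checks.append(2 * dot(beta, alpha) == -(p + q) * dot(alpha, alpha))

print(all(checks), chain_bounds((1, 0), (-1, 1), ROOTS))
```

For $\alpha = (1, 0)$ and $\beta = (-1, 1)$ the chain is $\beta, \beta + \alpha, \beta + 2\alpha$, so that $p = 0$ and $q = 2$.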
For later use let us write another consequence of eq. (16.6):

$$[E_{-\alpha}, [E_\alpha, E_\beta]] = (E_\alpha, E_{-\alpha}) \, q(1 - p) \, \frac{(\alpha, \alpha)}{2} \, E_\beta \qquad (16.8)$$
Both members are homogeneous in the normalizations of $E_{\pm\alpha}$ and $E_\beta$ so one can replace $E_\beta = v_{-p}$ with the notations of eq. (16.6), and then use eq. (16.7).

With any root $\alpha$ one can associate a reflection $w_\alpha$ acting on $\mathcal{H}^*$ by:

$$w_\alpha(x) = x - 2 \frac{(\alpha, x)}{(\alpha, \alpha)} \alpha$$

These orthogonal reflections are called Weyl reflections. They preserve the root system, i.e. if $\beta$ is a root so is $w_\alpha(\beta)$. This is because, using eq. (16.7), $w_\alpha(\beta) = \beta + (p + q)\alpha$ is in the chain $\beta + p\alpha, \dots, \beta + q\alpha$. The Weyl group is the discrete group generated by these reflections.

While roots span $\mathcal{H}^*$, they are not linearly independent in general, and one can choose a subset of them which forms a basis. There exists a subset $\Pi$ of roots $\alpha_i$, $i = 1, \dots, r$, such that any other root $\alpha$ can be written $\alpha = \sum_i n_i \alpha_i$, where the $n_i$ are integers all of the same sign. When all $n_i$ are $\geq 0$ we say that $\alpha$ is a positive root and otherwise $\alpha$ is called a negative root. The $\alpha_i$ are called simple roots. So they are positive roots which cannot be written as the sum of two positive roots. The choice of $\Pi$ is not unique, but any two such choices are related by a unique transformation of the Weyl group.

To show the existence of a basis of simple roots, choose a hyperplane in $\mathcal{H}^*$ which does not contain any root. Half of the roots are then on one side of this hyperplane, and we call them positive roots. If a positive root can be written as a sum of two positive roots we call it decomposable, otherwise we call it simple. Obviously, any positive root can be written as a linear combination of simple roots with positive integer coefficients. In particular the simple roots span $\mathcal{H}^*$. We show that they are linearly independent. Note that for two simple roots $\alpha$ and $\beta$ their difference $(\beta - \alpha)$ is not a root, because if $(\beta - \alpha)$ is a positive root we can decompose $\beta = (\beta - \alpha) + \alpha$ as a sum of positive roots, in contradiction with the simplicity of $\beta$, while if $(\alpha - \beta)$ is positive, we get similarly a contradiction with the simplicity of $\alpha$. Hence the simple root condition means that $p = 0$ in eq. (16.7), so that:

$$-2 \frac{(\alpha, \beta)}{(\alpha, \alpha)} = n \in \mathbb{N}$$

In particular, the scalar product $(\alpha, \beta)$ is negative for $\alpha$, $\beta$ simple roots. The $\alpha$-chain through $\beta$ consists of the roots of the form $\beta + n\alpha$
for $n = 0, 1, \dots, -2(\alpha,\beta)/(\alpha,\alpha)$. Assume now that there is a linear relation between the simple roots that we can write in the form $\sum_s r_s\alpha_s = \sum_{s'} r_{s'}\alpha_{s'}$ with $r_s$ and $r_{s'}$ real and positive, and the set $\{s\}$ disjoint from the set $\{s'\}$. From this equality one gets $\left(\sum_s r_s\alpha_s\right)^2 = \sum_{s,s'} r_s r_{s'}(\alpha_s,\alpha_{s'})$. The left-hand side is obviously positive or zero, while the right-hand side is negative or zero because $s \neq s'$, so $(\alpha_s,\alpha_{s'}) \leq 0$. Both sides therefore vanish and it follows that $r_s = r_{s'} = 0$, so the simple roots are linearly independent.

At the extreme opposite of the simple roots are highest roots. They are of the form $\theta = \sum_i n_i\alpha_i$, where the $n_i$ are maximal $\geq 0$ integers. For a simple Lie algebra one can show that the highest root is unique, and that all $n_i > 0$.

Let $\alpha_i$ be a set of simple roots. One defines the Cartan matrix, which is independent of the choice of basis (since two bases are related by the Weyl group) by:
$$a_{ij} = \frac{2(\alpha_j,\alpha_i)}{(\alpha_i,\alpha_i)}, \qquad i,j = 1,\dots,\mathrm{rank}\,\mathcal{G} \tag{16.9}$$
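To make eq. (16.9) concrete, here is a minimal sketch that builds the Cartan matrix from a set of simple roots and checks its elementary properties. The Euclidean realizations of the simple roots of $A_2 = sl(3)$ and $B_2 = so(5)$, and the function names, are illustrative assumptions.

```python
from fractions import Fraction

def dot(x, y):
    """Euclidean scalar product, standing in for the form ( , ) on H*."""
    return sum(a * b for a, b in zip(x, y))

def cartan_matrix(simple_roots):
    """a_ij = 2 (alpha_j, alpha_i) / (alpha_i, alpha_i), as in eq. (16.9)."""
    return [[Fraction(2 * dot(aj, ai), dot(ai, ai)) for aj in simple_roots]
            for ai in simple_roots]

def has_cartan_properties(a):
    """a_ii = 2; off-diagonal entries are integers <= 0; zeros are symmetric;
    0 <= a_ij a_ji <= 4."""
    n = len(a)
    for i in range(n):
        if a[i][i] != 2:
            return False
        for j in range(n):
            if i == j:
                continue
            if a[i][j] > 0 or a[i][j].denominator != 1:
                return False
            if (a[i][j] == 0) != (a[j][i] == 0):
                return False
            if not 0 <= a[i][j] * a[j][i] <= 4:
                return False
    return True

A2_SIMPLE = [(1, -1, 0), (0, 1, -1)]   # sl(3)
B2_SIMPLE = [(1, -1), (0, 1)]          # so(5): alpha_1 long, alpha_2 short

for name, roots in (("A2", A2_SIMPLE), ("B2", B2_SIMPLE)):
    a = cartan_matrix(roots)
    print(name, [[int(x) for x in row] for row in a], has_cartan_properties(a))
```

The $B_2$ case shows that the Cartan matrix need not be symmetric when the simple roots have different lengths.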
It is such that $a_{ii} = 2$ and $a_{ij} \leq 0$, $a_{ij} = 0 \Rightarrow a_{ji} = 0$ for $i \neq j$. Moreover, the $a_{ij}$ for $i \neq j$ are negative or zero integers such that $0 \leq a_{ij}a_{ji} \leq 4$. This last condition comes from the fact that the scalar product is positive definite on $\mathcal{H}^*$. The Cartan matrix is non-degenerate: $\det(a) \neq 0$, because $\det(a)$ is proportional to the determinant of the matrix of scalar products of the simple roots, which are linearly independent.

With the Cartan matrix, we can give a presentation of the Lie algebra $\mathcal{G}$ by generators and relations. For each simple root $\alpha_i$ the elements:
$$h_i = \frac{2}{(\alpha_i,\alpha_i)}\,H_{\alpha_i}, \qquad e_i^+ = E_{\alpha_i}, \qquad e_i^- = \frac{1}{(E_{\alpha_i},E_{-\alpha_i})}\,E_{-\alpha_i}$$
generate an $sl(2)$ subalgebra with standard commutation relations. The Cartan matrix allows us to reconstruct the Lie algebra from this set of $sl(2)$ subalgebras. Given a Cartan matrix $a_{ij}$ satisfying the properties mentioned above, we may define $\mathcal{G}$ as the Lie algebra generated by the sets $(h_i, e_i^+, e_i^-)$ with the relations (called the Serre relations):
$$\begin{aligned}
{}[h_i, h_j] &= 0 \\
[h_i, e_j^\pm] &= \pm a_{ij}\,e_j^\pm \\
[e_i^+, e_j^-] &= \delta_{ij}\,h_i \\
(\mathrm{ad}\,e_i^\pm)^{1-a_{ij}} \cdot e_j^\pm &= 0 \quad \text{for } i \neq j
\end{aligned} \tag{16.10}$$
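The Serre relations of eq. (16.10) can be checked by hand in the smallest non-trivial case. The sketch below does this for $sl(3)$ in its defining $3\times3$ representation, taking $e_1^+ = E_{12}$, $e_2^+ = E_{23}$ and their transposes for $e_i^-$. This realization and the helper names are assumptions chosen for illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    """Matrix commutator [a, b]."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

def unit(i, j, n=3):
    """Elementary matrix with a single 1 in position (i, j)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

ZERO = [[0] * 3 for _ in range(3)]
E_PLUS = [unit(0, 1), unit(1, 2)]
E_MINUS = [unit(1, 0), unit(2, 1)]
H = [comm(E_PLUS[i], E_MINUS[i]) for i in range(2)]
A = [[2, -1], [-1, 2]]  # Cartan matrix of A2

def serre_relations_hold():
    ok = True
    for i in range(2):
        for j in range(2):
            ok &= comm(H[i], H[j]) == ZERO
            ok &= comm(H[i], E_PLUS[j]) == scale(A[i][j], E_PLUS[j])
            ok &= comm(H[i], E_MINUS[j]) == scale(-A[i][j], E_MINUS[j])
            ok &= comm(E_PLUS[i], E_MINUS[j]) == (H[i] if i == j else ZERO)
            if i != j:
                for e in (E_PLUS, E_MINUS):
                    x = e[j]
                    for _ in range(1 - A[i][j]):   # apply ad(e_i) 1 - a_ij times
                        x = comm(e[i], x)
                    ok &= x == ZERO
    return ok

print(serre_relations_hold())
```

Applying $\mathrm{ad}\,e_1^+$ only once to $e_2^+$ gives $E_{13} \neq 0$, which is the root vector of $\alpha_1+\alpha_2$; it is the second application that vanishes.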
The last condition is just the condition that the $\alpha_i$-chain starting at $\alpha_j$ is of length $-a_{ij}$. The elements $h_i$ generate the Cartan subalgebra $\mathcal{H}$. The fact that these relations yield a finite-dimensional Lie algebra is a theorem by J.P. Serre. Let $\mathcal{N}_\pm$ be the subalgebra generated by the $e_i^\pm$. We have:
$$\mathcal{G} = \mathcal{N}_- \oplus \mathcal{H} \oplus \mathcal{N}_+$$
The subalgebras $\mathcal{B}_\pm = \mathcal{H} \oplus \mathcal{N}_\pm$ are called Borel subalgebras.

The classification of finite-dimensional simple Lie algebras is then reduced to the classification of finite-dimensional Cartan matrices satisfying the mentioned properties. This leads to four infinite series $A_n = sl(n+1)$, $B_n = so(2n+1)$, $C_n = sp(2n)$, $D_n = so(2n)$ and a few exceptional algebras called $E_6$, $E_7$, $E_8$, $F_4$ and $G_2$ (see the References).

A consequence of Serre's theorem is the existence of an involutive automorphism $\omega$ of the Lie algebra $\mathcal{G}$, called the Chevalley automorphism. To define it we give its action on the generators:
$$\omega(h_i) = -h_i, \qquad \omega(e_i^+) = -e_i^-, \qquad \omega(e_i^-) = -e_i^+$$
and check that it preserves the relations. Hence it extends to the whole Lie algebra. For any root $\alpha$ of the Lie algebra one can choose $E_{-\alpha} = -\omega(E_\alpha)$. Changing $E_\alpha \to \lambda E_\alpha$, we have $(E_\alpha,E_{-\alpha}) \to \lambda^2(E_\alpha,E_{-\alpha})$, so that we can always achieve the condition $(E_\alpha,E_{-\alpha}) = 1$. Notice that in general $\lambda$ will be a complex number.

16.3 Linear representations

Recall that a linear representation on a vector space $V$ of a finite-dimensional Lie algebra $\mathcal{G}$ is a homomorphism $\rho$ from $\mathcal{G}$ to $\mathrm{End}\,V$. We can define the sum and the product of two representations $(\rho_1, V_1)$ and $(\rho_2, V_2)$. The sum is the representation on the direct sum $V_1 \oplus V_2$ such that $\rho_{V_1\oplus V_2}$ maps elements of $\mathcal{G}$ into block diagonal endomorphisms whose restrictions to $V_{1,2}$ coincide with their images under $\rho_{1,2}$. The product is the representation on the tensor product $V_1 \otimes V_2$ with $\rho_{V_1\otimes V_2}(X) = \rho_1(X)\otimes 1 + 1\otimes\rho_2(X)$ for any element $X \in \mathcal{G}$. A representation on a vector space $V$ is said to be indecomposable if it cannot be decomposed into the sum of subrepresentations.

Note that for a general algebra $\mathcal{A}$ the sum of two representations is a representation, but the tensor product is not. The Lie algebra case appears as very special, and this is due to the existence of the algebra homomorphism, called the coproduct:
$$\Delta : U(\mathcal{G}) \to U(\mathcal{G}) \otimes U(\mathcal{G}), \qquad \Delta(X) = X\otimes 1 + 1\otimes X \tag{16.11}$$
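The statement that $\rho_1(X)\otimes1 + 1\otimes\rho_2(X)$ is again a representation can be verified directly on a small case. The sketch below takes both factors to be the two-dimensional representation of $sl(2)$ and checks that the coproduct of eq. (16.11) respects all brackets of the generators. The matrix helpers and the name `delta` are assumptions for this example.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

def kron(a, b):
    """Kronecker product: rows indexed by (i, k), columns by (j, l)."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

I2 = [[1, 0], [0, 1]]
H = [[1, 0], [0, -1]]
E_PLUS = [[0, 1], [0, 0]]
E_MINUS = [[0, 0], [1, 0]]

def delta(x):
    """Action of Delta(X) = X (x) 1 + 1 (x) X on the tensor product V (x) V."""
    return add(kron(x, I2), kron(I2, x))

def is_homomorphism():
    gens = [H, E_PLUS, E_MINUS]
    return all(comm(delta(x), delta(y)) == delta(comm(x, y))
               for x in gens for y in gens)

print(is_homomorphism())
```

By contrast, the naive choice $X \mapsto \rho_1(X)\otimes\rho_2(X)$ is not even linear in $X$, which is why the primitive coproduct is needed.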
The elements of the Cartan subalgebra are represented by a family of commuting diagonalizable endomorphisms. They can thus be simultaneously diagonalized. Let $|\lambda\rangle$ be an eigenvector, and $\lambda(H)$ be the corresponding eigenvalues, which depend linearly on $H$. The various $\lambda$ are linear forms acting on the Cartan subalgebra $\mathcal{H}$, i.e. $\lambda \in \mathcal{H}^*$, and are called weights. The weights $\lambda(H)$ may have multiplicities, so we denote by $|\lambda_a\rangle$ the weight vectors with the same weight $\lambda$:
$$H|\lambda_a\rangle = \lambda(H)|\lambda_a\rangle$$
The weight vectors $|\lambda_a\rangle$ form a basis of the representation space $V$. Their number, degeneracy included, is the dimension of the representation. The representation space $V$ contains representations of the $sl(2)$ subalgebras generated by $(H_\alpha, E_\alpha, E_{-\alpha})$ for any root $\alpha$. From the knowledge of finite-dimensional representations of $sl(2)$ we get the basic integrality condition:
$$2\,\frac{(\lambda,\alpha)}{(\alpha,\alpha)} \in \mathbb{Z} \quad \text{for all } \alpha \in \Delta$$
Note that the weight system of a representation is invariant under the action of the Weyl group. In other words, if $\lambda$ is a weight, so is $w_\alpha(\lambda)$ for any root $\alpha$. The difference of two weights of a given irreducible representation always belongs to the root lattice, i.e. it is an integer combination of roots. We may thus introduce an order between weights of a representation by $\lambda_1 > \lambda_2$ iff $\lambda_1 - \lambda_2 > 0$. Any finite-dimensional representation possesses a highest weight since its number of weights is finite. This weight is unique for irreducible representations. Let $\Lambda$ be this highest weight, which is thus non-degenerate. The corresponding eigenvector $|\Lambda\rangle$ is called the highest weight vector of the representation. It is such that:
$$H|\Lambda\rangle = \Lambda(H)|\Lambda\rangle, \qquad E_{\alpha_i}|\Lambda\rangle = 0$$
for $H \in \mathcal{H}$ and $\alpha_i$ any simple positive root. This follows from the fact that since $\Lambda$ is a highest weight, $\Lambda + \alpha_i$ is not a weight.

Given a representation with highest weight $\Lambda$, one defines its Dynkin indices $\delta_i$ by:
$$\delta_i = 2\,\frac{(\Lambda,\alpha_i)}{(\alpha_i,\alpha_i)} \in \mathbb{N}$$
with $\alpha_i$ the simple roots. The proof that this is a positive integer is the same as in the case of the adjoint representation. By definition, the fundamental weights $\Lambda_j$ are the highest weights with Dynkin indices $\delta_{ij}$. They are specified by:
$$2\,\frac{(\Lambda_j,\alpha_i)}{(\alpha_i,\alpha_i)} = \delta_{ij}$$
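As an illustration of the defining condition of the fundamental weights, the sketch below works out the case of $A_2 = sl(3)$. The Euclidean realization of the simple roots, the explicit expressions $\Lambda_1 = (2\alpha_1+\alpha_2)/3$ and $\Lambda_2 = (\alpha_1+2\alpha_2)/3$, and the function names are assumptions made for this example.

```python
from fractions import Fraction as F

ALPHA = [(1, -1, 0), (0, 1, -1)]  # simple roots of A2 = sl(3)

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def combo(coeffs, vecs):
    """Linear combination sum_k coeffs[k] * vecs[k]."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vecs))
                 for k in range(len(vecs[0])))

def dynkin_indices(weight):
    """delta_i = 2 (weight, alpha_i) / (alpha_i, alpha_i)."""
    return [F(2 * dot(weight, a), dot(a, a)) for a in ALPHA]

# Candidate fundamental weights, written on the basis of simple roots.
LAMBDA = [combo((F(2, 3), F(1, 3)), ALPHA),
          combo((F(1, 3), F(2, 3)), ALPHA)]

THETA = combo((1, 1), ALPHA)  # highest root, highest weight of the adjoint

print([dynkin_indices(w) for w in LAMBDA])
print(dynkin_indices(THETA))
```

The coefficients $2/3$ and $1/3$ are the entries of the inverse of the $A_2$ Cartan matrix, and the adjoint representation has Dynkin indices $(1,1)$, so its highest weight is $\Lambda_1+\Lambda_2$.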
The number of fundamental weights is equal to the rank of $\mathcal{G}$, and there exist representations whose highest weights are the fundamental weights (called fundamental representations). Any highest weight $\Lambda$ of a finite-dimensional representation may be decomposed on the fundamental weights: $\Lambda = \sum_j \delta_j\Lambda_j$ with $\delta_j$ the Dynkin indices. More generally, the weight lattice is the set of $\lambda \in \mathcal{H}^*$ such that $2(\lambda,\alpha)/(\alpha,\alpha) \in \mathbb{Z}$ for any root $\alpha$. Any weight of any representation is on the weight lattice. As a $\mathbb{Z}$-module, the weight lattice has a basis provided by the fundamental weights $\Lambda_i$, such that $2(\Lambda_i,\alpha_j)/(\alpha_j,\alpha_j) = \delta_{ij}$ for any simple root $\alpha_j$.

The highest weight representations may also be viewed as quotients of Verma modules. The Verma module associated with a highest weight vector $|\Lambda\rangle$ is the space $U(\mathcal{N}_-)|\Lambda\rangle$, with $U(\mathcal{N}_-)$ the enveloping algebra of $\mathcal{N}_-$. Then the irreducible representation with highest weight $\Lambda$ is isomorphic to the quotient:
$$\big(U(\mathcal{N}_-)|\Lambda\rangle\big)/M_\Lambda$$
where $M_\Lambda$ is the maximal submodule of $U(\mathcal{N}_-)|\Lambda\rangle$, which is shown to exist, and is unique (by maximality). The above quotient is finite-dimensional when $\Lambda$ belongs to the weight lattice. The Verma module construction shows that any weight of the weight lattice is conjugated by the Weyl group to the highest weight of some representation. The roots themselves generate a lattice called the root lattice. It is a sublattice of the weight lattice.

On the universal enveloping algebra, one can define a $\dagger$ operation such that $(XY)^\dagger = Y^\dagger X^\dagger$ and $(\lambda X)^\dagger = \bar{\lambda}X^\dagger$. In a Cartan–Weyl basis it reads $H^\dagger = H$, $E_\alpha^\dagger = E_{-\alpha}$, which is compatible with the commutation relations since $C_{-\alpha,-\beta} = -C_{\alpha,\beta}$ and $C_{\alpha,\beta}$ and the $\alpha(H)$ are real. In the highest weight representation, the $\dagger$ operation is just Hermitian conjugation and allows us to introduce complex conjugated representations. In particular, the state $\langle\Lambda|$, dual to the highest weight vector $|\Lambda\rangle$, satisfies:
$$\langle\Lambda|H = \Lambda(H)\langle\Lambda|, \qquad \langle\Lambda|E_{-\alpha} = 0 \quad \text{for } \alpha > 0$$
since $\langle\Lambda|E_{-\alpha} = (E_\alpha|\Lambda\rangle)^\dagger = 0$.

The Casimir operator $C$ is the following operator, quadratic in the Lie algebra generators $(T_a)$ forming a basis of the Lie algebra, hence living in the universal enveloping algebra of $\mathcal{G}$:
$$C = \sum_{a,b} T_a K^{ab} T_b$$
with $K^{ab}$ the matrix inverse of the Killing form $K_{ab}$. Its main property is that it is in the centre of the enveloping algebra, so that, in any given representation, the Casimir operator commutes with the endomorphisms representing the elements of the Lie algebra. It thus acts proportionally to the identity on irreducible representations. If $\Lambda$ is the highest weight of the representation, its value is:
$$C(\Lambda) = (\Lambda, \Lambda + 2\rho)$$
with $\rho$ the Weyl vector equal to the sum of the fundamental weights:
$$\rho = \sum_j \Lambda_j$$
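The value of the Casimir operator can be computed by hand for $sl(2)$, which gives a useful check of the normalizations. The sketch below assumes the unnormalized Killing form of $sl(2)$, for which $(H,H) = 8$ and $(E_+,E_-) = 4$, so that $C = H^2/8 + (E_+E_- + E_-E_+)/4$. It builds the irreducible representation of dimension $m+1$ and confirms that $C$ acts as the scalar $m(m+2)/8$, that is $(\Lambda,\Lambda) + 2(\Lambda,\rho)$ with $\Lambda(H) = m$ and $\rho(H) = 1$. The basis convention in `rep` and all function names are choices made for this example.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rep(m):
    """Matrices of H, E+, E- on the basis v_0, ..., v_m with
    H v_k = (m - 2k) v_k, E- v_k = v_{k+1}, E+ v_k = k (m - k + 1) v_{k-1}.
    Entry [r][k] is the coefficient of v_r in the image of v_k."""
    n = m + 1
    h = [[m - 2 * k if r == k else 0 for k in range(n)] for r in range(n)]
    ep = [[k * (m - k + 1) if r == k - 1 else 0 for k in range(n)] for r in range(n)]
    em = [[1 if r == k + 1 else 0 for k in range(n)] for r in range(n)]
    return h, ep, em

def casimir(m):
    """C = H^2/8 + (E+ E- + E- E+)/4 for the unnormalized Killing form."""
    h, ep, em = rep(m)
    hh, pm, mp = matmul(h, h), matmul(ep, em), matmul(em, ep)
    n = m + 1
    return [[Fraction(hh[r][k], 8) + Fraction(pm[r][k] + mp[r][k], 4)
             for k in range(n)] for r in range(n)]

def casimir_value(m):
    """Return the scalar by which C acts, after checking C is a multiple of 1."""
    c = casimir(m)
    n = m + 1
    assert all(c[r][k] == (c[0][0] if r == k else 0)
               for r in range(n) for k in range(n))
    return c[0][0]

for m in range(5):
    print(m, casimir_value(m))
```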
We frequently meet the tensor Casimir operator living in $\mathcal{G}\otimes\mathcal{G}$ given by $C_{12} = \sum_{a,b} K^{ab}\,T_a\otimes T_b$. Note that we have, using eq. (16.11):
$$C_{12} = \frac{1}{2}\left(\Delta C - C\otimes 1 - 1\otimes C\right)$$
The main property of the tensor Casimir is that $[C_{12}, \Delta(X)] = 0$ for any $X \in \mathcal{G}$.

16.4 Real Lie algebras

Up to now, we considered complex Lie algebras. Examples are provided by complexification of real Lie algebras. More precisely, let $\mathcal{G}$ be a real Lie algebra. This means that we have a basis $X_a$ of the Lie algebra such that the structure constants are real, and we consider linear combinations of the $X_a$ with real coefficients. Its complexification $\mathcal{G}_C$ is the set of elements $Z = X + iY$ with $X, Y \in \mathcal{G}$. On $\mathcal{G}_C$ we define a conjugation $c : X + iY \to X - iY$. We have
$$c^2 = 1, \qquad [c(Z_1), c(Z_2)] = c([Z_1,Z_2]), \qquad c(\lambda Z) = \bar{\lambda}\,c(Z)$$
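Conversely, given any such conjugation $c$, we can write $\mathcal{G}_C = \mathcal{G}_+ \oplus \mathcal{G}_-$, where $c|_{\mathcal{G}_\pm} = \pm 1$, and $\mathcal{G}_C$ can be viewed as the complexification of $\mathcal{G}_+$, which is a real Lie algebra.

Different real Lie algebras may have the same complexification. For example, $sl(2,\mathbb{C})$ is the common complexification of the two real Lie algebras $sl(2,\mathbb{R})$ and $su(2)$. The algebra $sl(2,\mathbb{R})$ is the Lie algebra of $2\times2$ traceless real matrices, with basis:
$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad E_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad E_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$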
and commutation relations $[H, E_\pm] = \pm2E_\pm$ and $[E_+, E_-] = H$. On the other hand, $su(2)$ is the Lie algebra of anti-Hermitian traceless $2\times2$ matrices, i.e. linear combinations with real coefficients of the matrices $t_k = i\sigma_k$, where $\sigma_k$ are the Pauli matrices:
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
The commutation relations are $[t_i, t_j] = -2\epsilon_{ijk}t_k$. We see that the structure constants of $su(2)$ are real. The Lie algebras $sl(2,\mathbb{R})$ and $su(2)$ are referred to as the non-compact and compact real forms of $sl(2,\mathbb{C})$ respectively. Notice that although the algebra $su(2)$ is real, the matrices representing it have complex entries.

It is an important problem to classify the real forms of a given complex Lie algebra. This amounts to classifying the conjugations $c$. For this purpose note that one can build a basis of the Lie algebra $\mathcal{G}_C$ such that all the structure constants are real, as follows. Choose the basis $E_{\pm\alpha}$, $H_\alpha$ such that $\omega(E_\alpha) = -E_{-\alpha}$ (where $\omega$ is the Chevalley automorphism) and $(E_\alpha, E_{-\alpha}) = 1$. Then $[E_\alpha, E_{-\alpha}] = H_\alpha$, and $[H_\alpha, E_{\pm\beta}] = \pm\beta(H_\alpha)E_{\pm\beta}$, where the $\beta(H_\alpha)$ are real. Setting $[E_\alpha, E_\beta] = C_{\alpha,\beta}E_{\alpha+\beta}$ and applying $\omega$, we get the relation:
$$C_{-\alpha,-\beta} = -C_{\alpha,\beta} \tag{16.12}$$
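The $su(2)$ relations quoted above, $[t_i,t_j] = -2\epsilon_{ijk}t_k$ with $t_k = i\sigma_k$, are easy to confirm numerically. The following sketch does so with plain nested lists of complex numbers and also checks that the $t_k$ are anti-Hermitian. The helper names are assumptions for this example.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

SIGMA = [[[0, 1], [1, 0]],
         [[0, -1j], [1j, 0]],
         [[1, 0], [0, -1]]]
T = [scale(1j, s) for s in SIGMA]  # t_k = i sigma_k

def eps(i, j, k):
    """Totally antisymmetric symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def su2_relations_hold():
    for i in range(3):
        for j in range(3):
            rhs = [[0, 0], [0, 0]]
            for k in range(3):
                rhs = add(rhs, scale(-2 * eps(i, j, k), T[k]))
            if comm(T[i], T[j]) != rhs:
                return False
    return True

def is_antihermitian(a):
    return all(a[i][j] == -a[j][i].conjugate()
               for i in range(2) for j in range(2))

print(su2_relations_hold(), all(is_antihermitian(t) for t in T))
```

All entries are small Gaussian integers, so the floating-point comparisons are exact here.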
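To show that these structure constants are real, we compute:
$$[[E_\alpha,E_\beta],[E_{-\alpha},E_{-\beta}]] = -C_{\alpha,\beta}^2\,H_{\alpha+\beta} = -C_{\alpha,\beta}^2\,(H_\alpha + H_\beta)$$
On the other hand, using the Jacobi identity and eq. (16.8), we get:
$$\begin{aligned}
{}[[E_\alpha,E_\beta],[E_{-\alpha},E_{-\beta}]] &= [E_{-\beta},[E_{-\alpha},[E_\alpha,E_\beta]]] + [E_{-\alpha},[E_{-\beta},[E_\beta,E_\alpha]]] \\
&= -q(1-p)\,\frac{(\beta,\beta)}{2}\,H_\alpha - q'(1-p')\,\frac{(\alpha,\alpha)}{2}\,H_\beta
\end{aligned}$$
Here $p$ and $q$ refer to the $\beta$-chain through $\alpha$, and $p'$ and $q'$ to the $\alpha$-chain through $\beta$. If $C_{\alpha,\beta} \neq 0$, i.e. $\alpha+\beta$ is a root, $H_\alpha$ and $H_\beta$ are linearly independent, so identifying the coefficients we get:
$$C_{\alpha,\beta}^2 = q'(1-p')\,\frac{(\alpha,\alpha)}{2} = q(1-p)\,\frac{(\beta,\beta)}{2}$$
Recalling that $p \leq 0 \leq q$ in a chain and that $(\alpha,\alpha) > 0$, we see that $C_{\alpha,\beta}$ is real. The basis of the Lie algebra that we have constructed is called a Weyl basis.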
The real Lie algebra $\mathcal{G}'$ spanned over $\mathbb{R}$ by $E_{\pm\alpha}$ and $H_\alpha$ is the analogue of the non-compact $sl(2,\mathbb{R})$ in the general case. We obviously have $\mathcal{G}_C = \mathcal{G}' + i\mathcal{G}'$ and the real form is selected by the conjugation $c'(X+iY) = X - iY$.

It is a theorem by H. Weyl that for any semi-simple complex Lie algebra, there exists a real form which is the Lie algebra of a compact Lie group. The Lie algebra of a semi-simple compact Lie group is characterized by the fact that its Killing form is negative definite. Indeed, if $G$ is a compact Lie group, choose any positive definite scalar product on $\mathcal{G}$, its Lie algebra. Since $G$ is compact, one can use the Haar integral on $G$ to take the average of this bilinear form and obtain a positive definite invariant scalar product on $\mathcal{G}$. This means that the Lie group $G$ acts by orthogonal matrices in the adjoint representation (for this scalar product). Hence, in an orthonormal basis the matrices $\mathrm{ad}X$ are antisymmetric and the Killing form $(X,X) = -\sum_{ij}(\mathrm{ad}X)_{ij}^2 \leq 0$ vanishes only when $\mathrm{ad}X = 0$, i.e. when $X$ is in the centre of $\mathcal{G}$. We see that $\mathcal{G}$ decomposes as the orthogonal sum of its centre and a semi-simple algebra $[\mathcal{G},\mathcal{G}]$ on which the Killing form is negative definite.

Conversely, starting from the Weyl basis, one can construct the compact form as follows: consider the generators
$$X_\alpha = (E_\alpha - E_{-\alpha}), \qquad Y_\alpha = i(E_\alpha + E_{-\alpha}), \qquad Z_\alpha = iH_\alpha \tag{16.13}$$
The real vector space spanned by these elements is a real Lie algebra thanks to eq. (16.12). Moreover, it has a negative definite Killing form. To show this recall the orthogonality relations, eq. (16.4), so it is sufficient to look at each subspace $(X_\alpha, Y_\alpha, Z_\alpha)$ independently, where the check is simple, thereby proving the Weyl theorem. For instance, $(X_\alpha,X_\alpha) = (E_\alpha - E_{-\alpha}, E_\alpha - E_{-\alpha}) = -2(E_\alpha,E_{-\alpha}) = -2$ and $(X_\alpha,Y_\alpha) = (E_\alpha - E_{-\alpha}, i(E_\alpha + E_{-\alpha})) = 0$, and so on. This compact real form corresponds to the conjugation:
$$c(E_{\pm\alpha}) = -E_{\mp\alpha}, \qquad c(H_\alpha) = -H_\alpha \tag{16.14}$$
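For $sl(2)$ the construction of eq. (16.13) can be checked explicitly. The sketch below assumes the trace form $(A,B) = \mathrm{Tr}(AB)$ of the defining representation, which is proportional to the Killing form and satisfies $(E_+,E_-) = 1$. It verifies that the Gram matrix of $(X, Y, Z)$ is $-2$ times the identity and that the brackets close with real coefficients. All helper names are assumptions for this example.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def form(a, b):
    """Trace form Tr(ab), proportional to the Killing form of sl(2)."""
    ab = matmul(a, b)
    return ab[0][0] + ab[1][1]

H = [[1, 0], [0, -1]]
E_PLUS = [[0, 1], [0, 0]]
E_MINUS = [[0, 0], [1, 0]]

X = add(E_PLUS, scale(-1, E_MINUS))   # E+ - E-
Y = scale(1j, add(E_PLUS, E_MINUS))   # i (E+ + E-)
Z = scale(1j, H)                      # i H
BASIS = [X, Y, Z]

GRAM = [[form(a, b) for b in BASIS] for a in BASIS]
print(GRAM)
print(comm(X, Y) == scale(2, Z), comm(Y, Z) == scale(2, X), comm(Z, X) == scale(2, Y))
```

The relations $[X,Y] = 2Z$ and its cyclic permutations show that the real span of $X$, $Y$, $Z$ is indeed the $su(2)$ algebra.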
This conjugation selects the analogue of $su(2)$ in the $sl(2)$ case.

In general, the representations of a (real) compact Lie group $G$ may be complex. Choosing on the representation space $V$ an arbitrary sesquilinear form and averaging it by the group $G$, one gets an invariant sesquilinear form, i.e. $\langle gv, gw\rangle = \langle v, w\rangle$. Hence all elements of $G$ are represented by unitary matrices and elements of $\mathcal{G}$ by anti-Hermitian matrices. The generators of eq. (16.13) are such that $X_\alpha^+ = -X_\alpha$, $Y_\alpha^+ = -Y_\alpha$ and $Z_\alpha^+ = -Z_\alpha$. This also reads $E_\alpha^+ = E_{-\alpha}$, $H_\alpha^+ = H_\alpha$, or more abstractly, for any $X$ in the complexified Lie algebra, $X^+ = -c(X)$.
In particular, any maximal Abelian subalgebra of $\mathcal{G}$ is a Cartan subalgebra, because anti-Hermitian matrices are always diagonalizable with purely imaginary eigenvalues. Its image by the exponential map is the Weyl torus. One can choose a basis $H_j$ of the Cartan algebra such that any element of the torus is of the form $h = \exp\left(\sum_j \theta_j H_j\right)$ with $\exp(2\pi H_j) = 1$ for all $j$. If $|\lambda\rangle$ is a weight vector in $V$, we have $h|\lambda\rangle = \chi_\lambda(h)|\lambda\rangle$, where $\chi_\lambda(h)$ is a character, i.e. $\chi_\lambda(hh') = \chi_\lambda(h)\chi_\lambda(h')$. So we have $\chi_\lambda(h) = \exp\left(\sum_j \theta_j\lambda(H_j)\right)$. The condition $\exp(2\pi H_j) = 1$ gives $\lambda(H_j) \in i\mathbb{Z}$ for all $j$. This defines a lattice in $\mathcal{H}^*$ called the weight lattice of the group $G$, which is a sublattice of the weight lattice of the Lie algebra. Moreover, since the adjoint representation of $G$ is well-defined, the root lattice is a sublattice of the weight lattice of $G$.

In general the weight lattice of $G$ is a sublattice of the weight lattice of $\mathcal{G}$, and is equal to it only when $G$ is simply connected. This is because any positive weight of the weight lattice of $\mathcal{G}$ is the highest weight of some representation of $\mathcal{G}$ which can then be lifted to $G$. Hence, for a compact semi-simple simply connected Lie group, the weight lattices of the group and its Lie algebra are the same. Note that $2\lambda(H_\alpha)/(\alpha,\alpha) \in \mathbb{Z}$ for any weight $\lambda$ of $\mathcal{G}$, so in this case the elements $H_j$ such that $\exp(2\pi H_j) = 1$ are of the form $H_j = 2iH_{\alpha_j}/(\alpha_j,\alpha_j)$, where $\alpha_j$ are the simple roots.

When $G$ is a semi-simple compact connected Lie group, its centre is a finite group contained in all maximal tori of $G$. Assuming that the centre of $G$ is trivial, its weight lattice is equal to the root lattice, because in this case the adjoint representation is a faithful representation of $G$ and so generates all representations of $G$, taking tensor products. It follows that the root lattice generates the weight lattice.

This allows us to describe the various compact Lie groups $G$ with Lie algebra $\mathcal{G}$. Starting from the universal cover of any one of them, which we call $G$ (and can be shown to be compact), having centre $Z$, the other compact Lie groups are of the form $G/D$, where $D$ is any subgroup of the discrete Abelian group $Z$. They have centre $Z/D$ isomorphic to the quotient of their weight lattice by the root lattice. Moreover, their first homotopy group is isomorphic to the quotient of the weight lattice of $G$ by their own weight lattice. We see that global topological properties of compact Lie groups are remarkably encoded in the structure of their tangent space at the unit element.

The classification of real forms of complex Lie algebras is also the basis of the study of symmetric spaces. We have obtained two conjugations $c'$ and $c$ which select non-compact and compact real forms of $\mathcal{G}_C$. Note that $c'$ commutes with the conjugation $c$ defined in eq. (16.14). Hence we
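can diagonalize $c'$ in the eigenspaces of $c$ and conversely. This yields the decompositions, called Cartan decompositions, of the real Lie algebras $\mathcal{G}$ and $\mathcal{G}'$:
$$\mathcal{G} = \mathfrak{t} \oplus \mathfrak{p}, \qquad \mathcal{G}' = \mathfrak{t} \oplus i\mathfrak{p}$$
The Lie algebra $\mathfrak{t} = \mathcal{G} \cap \mathcal{G}'$ is generated by the $(E_\alpha - E_{-\alpha})$, and $\mathfrak{p}$ is spanned by the $i(E_\alpha + E_{-\alpha})$ and the $iH_\alpha$. Moreover, we have the relations:
$$[\mathfrak{t},\mathfrak{t}] \subset \mathfrak{t}, \qquad [\mathfrak{t},\mathfrak{p}] \subset \mathfrak{p}, \qquad [\mathfrak{p},\mathfrak{p}] \subset \mathfrak{t}$$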
and similarly with $\mathfrak{p} \to i\mathfrak{p}$. For example in the case of $sl(n)$, this corresponds to the decomposition into symmetric and antisymmetric matrices. At the Lie group level, with the algebra $\mathfrak{t}$ corresponds a compact group $K$, and with the Lie algebras $\mathcal{G}$ and $\mathcal{G}'$ correspond appropriate Lie groups $G$ and $G'$, respectively compact and non-compact. One gets symmetric spaces $G/K$ and $G'/K$ of the compact and non-compact type respectively. This is the situation we have encountered in Chapter 7. Many more conjugations exist, but we will not enter into this subject.

16.5 Affine Kac–Moody algebras

We start from a finite-dimensional simple Lie algebra, $\mathcal{G}$, and construct the loop algebra which consists of formal Laurent polynomials $\tilde{\mathcal{G}} = \mathcal{G} \otimes \mathbb{C}[\lambda,\lambda^{-1}]$ with Lie bracket:
$$[X\otimes\lambda^n, Y\otimes\lambda^m] = [X,Y]\otimes\lambda^{n+m}$$
The affine Kac–Moody algebra, $\hat{\mathcal{G}}$, is the central extension of the loop algebra $\tilde{\mathcal{G}}$ by a central element denoted by $K$ (this means that the formal element $K$ commutes with all other elements). It is convenient to further extend this algebra by including the derivation $d = \lambda\partial_\lambda$. Thus, the affine Kac–Moody algebra is:
$$\hat{\mathcal{G}} = \tilde{\mathcal{G}} \oplus \mathbb{C}K \oplus \mathbb{C}d$$
and the Lie brackets are defined (with $(X,Y)$ the Killing form on $\mathcal{G}$) by:
$$\begin{aligned}
{}[X\otimes\lambda^n, Y\otimes\lambda^m] &= [X,Y]\otimes\lambda^{n+m} + \tfrac{1}{2}\,n\,\delta_{m+n,0}\,(X,Y)\,K \\
[d, X\otimes\lambda^n] &= n\,X\otimes\lambda^n \\
[K, X\otimes\lambda^n] &= [K,d] = 0
\end{aligned} \tag{16.15}$$
Note that the coefficient
$$\omega(X\otimes\lambda^n, Y\otimes\lambda^m) \equiv \tfrac{1}{2}\,n\,\delta_{m+n,0}\,(X,Y) \tag{16.16}$$
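of the central element $K$ satisfies the cocycle condition $\omega([X,Y],Z) + \omega([Z,X],Y) + \omega([Y,Z],X) = 0$, ensuring that the Jacobi identity is satisfied. An invariant bilinear form on the Kac–Moody algebra is given by:
$$(X\otimes\lambda^n, Y\otimes\lambda^m) = (X,Y)\,\delta_{n+m,0}, \qquad (K,K) = (d,d) = 0, \qquad (K,d) = 1$$
and $(K, X\otimes\lambda^n) = (d, X\otimes\lambda^n) = 0$. The fact that this form is invariant is easy to check by direct computation. Alternatively, denoting a general element of $\hat{\mathcal{G}}$ by $\hat{X} = \tilde{X}(\lambda) + X_K K + X_d\,d$, we have in the affine Kac–Moody algebra:
$$\begin{aligned}
{}[\tilde{X}, \tilde{Y}]_{\hat{\mathcal{G}}} &= [\tilde{X}, \tilde{Y}] + \frac{1}{2}\oint\frac{d\lambda}{2i\pi}\,\big(\partial_\lambda\tilde{X}(\lambda), \tilde{Y}(\lambda)\big)\,K \\
[K, \hat{X}] &= [K,d] = 0 \\
[d, \tilde{X}] &= \lambda\frac{d}{d\lambda}\tilde{X}(\lambda)
\end{aligned} \tag{16.17}$$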
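The cocycle condition stated for the coefficient of eq. (16.16) can be tested on random elements of the $sl(2)$ loop algebra. In the sketch below an element $X\otimes\lambda^n$ is stored as a pair `(X, n)`, and the Killing form is replaced by the trace form $\mathrm{Tr}(XY)$ of the defining representation, to which it is proportional. This replacement, the data layout and the function names are assumptions for this example.

```python
import random
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

def form(x, y):
    """Trace form Tr(xy), proportional to the Killing form of sl(2)."""
    return sum(x[i][k] * y[k][i] for i in range(2) for k in range(2))

def loop_bracket(a, b):
    """Bracket in the loop algebra: [X l^n, Y l^m] = [X, Y] l^(n+m)."""
    (x, n), (y, m) = a, b
    return (comm(x, y), n + m)

def omega(a, b):
    """omega(X l^n, Y l^m) = (1/2) n delta_{n+m,0} (X, Y), as in eq. (16.16)."""
    (x, n), (y, m) = a, b
    return Fraction(n, 2) * form(x, y) if n + m == 0 else Fraction(0)

def cocycle_defect(a, b, c):
    """omega([a,b],c) + omega([c,a],b) + omega([b,c],a); should vanish."""
    return (omega(loop_bracket(a, b), c) + omega(loop_bracket(c, a), b)
            + omega(loop_bracket(b, c), a))

def random_element(rng):
    p, q, r = (rng.randint(-5, 5) for _ in range(3))
    return ([[p, q], [r, -p]], rng.randint(-3, 3))  # traceless matrix, degree

rng = random.Random(0)
samples = [tuple(random_element(rng) for _ in range(3)) for _ in range(200)]
print(all(cocycle_defect(a, b, c) == 0 for a, b, c in samples))
```

The vanishing follows from the cyclicity of the trace together with the fact that the three degrees add up to zero whenever any term is non-zero.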
It is worth noticing that affine Kac–Moody algebras are subalgebras of the Lie algebra $gl(\infty)$ introduced in Chapter 9. To see it, one has to associate an infinite-dimensional matrix with $\lambda^nX$, where $X$ is a $k\times k$ matrix. We represent $\lambda$ by the shift operator $S$ with matrix elements $S_{IJ} = \delta_{I+1,J}$ for $I, J \in \mathbb{Z}$ and $\lambda^nX$ by $X\otimes S^n$. In other words, one has
$$(\lambda^nX)_{i+kI,\,j+kJ} = X_{ij}\,\delta_{I+n,J}$$
The loop algebra structure is obviously preserved by this identification. Moreover, one can check that the cocycles eq. (9.12) and eq. (16.16) also match. Hence $\widehat{gl}(k)$ is embedded into $gl(\infty)$ as the subalgebra of infinite matrices with period $k$ along the diagonal.

Let $\alpha_i$ be the simple roots of the finite-dimensional simple Lie algebra $\mathcal{G}$ and $\theta$ its highest root. The affine Kac–Moody algebra $\mathcal{G}\otimes\mathbb{C}[\lambda,\lambda^{-1}] \oplus \mathbb{C}K$ is generated by the following elements:
$$(E_{\alpha_i}, H_{\alpha_i}, E_{-\alpha_i}), \quad i = 1,\dots,\mathrm{rank}\,\mathcal{G}, \qquad (E_{-\theta}\otimes\lambda,\; K - H_\theta,\; E_\theta\otimes\lambda^{-1}) \tag{16.18}$$
Each triplet forms an $sl(2)$ subalgebra. These triplets are associated with the simple roots of the affine Kac–Moody algebra. The derivation $d = \lambda\partial_\lambda$ is not in the algebra generated by these elements and has to be added by hand. The $\lambda$ dependence in this presentation corresponds to what is
called the homogeneous gradation. The gradation is defined by the degree in $\lambda$, which is counted by $d$.

A slight modification of this construction allows us to define the twisted affine Kac–Moody algebras. Assume that $\mathcal{G}$ has an automorphism $\tau$ of order $N$, i.e. $\tau^N = 1$. Let $\zeta = e^{2i\pi/N}$. One extends $\tau$ to an automorphism $\hat{\tau}$ of the Kac–Moody algebra by setting:
$$\hat{\tau}(X\otimes\lambda^n) = \tau(X)\otimes(\zeta\lambda)^n, \qquad \hat{\tau}(K) = K, \qquad \hat{\tau}(d) = d \tag{16.19}$$
Since $\hat{\tau}$ is an automorphism, the set of its fixed points is a Lie algebra, which is called the twisted affine Kac–Moody algebra associated with $\tau$, and denoted by $\hat{\mathcal{G}}_\tau$. If the automorphism $\tau$ is an inner automorphism, $\hat{\mathcal{G}}_\tau$ is isomorphic to the untwisted algebra $\hat{\mathcal{G}}$. If, however, $\tau$ is not an inner automorphism, one gets an essentially different algebra. It is known that this situation occurs only when $N = 2$ or $N = 3$ and only for particular simple Lie algebras.

Let us illustrate the use of an inner automorphism to obtain the presentation of the affine Kac–Moody algebra in the principal gradation. Consider the Weyl vector $\rho = \sum_i \Lambda_i$, where the $\Lambda_i$ are the fundamental weights of $\mathcal{G}$. So we have $(\rho,\alpha_i) = 1$ for any simple root $\alpha_i$ of $\mathcal{G}$. Moreover, if $\theta$ is the highest root of $\mathcal{G}$ we define the dual Coxeter number $h^*$ by $(\rho,\theta) = h^* - 1$. Let $\tau$ be the inner automorphism of $\mathcal{G}$:
$$\tau(X) = e^{-\frac{2i\pi}{h^*}H_\rho}\,X\,e^{\frac{2i\pi}{h^*}H_\rho}, \qquad \tau(H) = H, \qquad \tau(E_\alpha) = e^{-\frac{2i\pi}{h^*}(\rho,\alpha)}\,E_\alpha$$
and extend it to an automorphism $\hat{\tau}$ of the affine algebra as in eq. (16.19) with $\zeta = e^{\frac{2i\pi}{h^*}}$. The algebra of its fixed points is isomorphic to our affine algebra, and is linearly generated by the $H\otimes\lambda^{mh^*}$ and the $E_\alpha\otimes\lambda^{mh^*+(\rho,\alpha)}$. It follows that the elements of degree $\pm1, 0$ are:
$$(E_{\alpha_i}\otimes\lambda,\; H_{\alpha_i},\; E_{-\alpha_i}\otimes\lambda^{-1}), \quad i = 1,\dots,\mathrm{rank}\,\mathcal{G}, \qquad (E_{-\theta}\otimes\lambda,\; K - H_\theta,\; E_\theta\otimes\lambda^{-1}) \tag{16.20}$$
These elements generate the whole fixed point algebra. This presentation differs from the presentation in eq. (16.18) by the way in which the degrees in $\lambda$ are distributed.

Affine Kac–Moody algebras may also be presented by generators and relations using Cartan matrices and their associated set of generators. Specifically, an affine Cartan matrix is a finite-dimensional matrix $a_{ij}$ such that: $a_{ii} = 2$; $a_{ij} \leq 0$; $a_{ij} = 0 \Rightarrow a_{ji} = 0$; the $a_{ij}$ are negative or zero integers for $i \neq j$; and the dimension of its kernel is 1. Note that the only difference with the Cartan matrix of a semi-simple algebra is that its determinant vanishes. A classification of such Cartan matrices may be found in the References, where it is shown that irreducible ones yield exactly the
standard and twisted algebras constructed above, for any simple Lie algebra $\mathcal{G}$. In analogy to the finite-dimensional case, the affine Kac–Moody algebra with Cartan matrix $a_{ij}$ is defined as the Lie algebra generated by the elements $(e_i^+, e_i^-, h_i)$ with Serre relations:
$$\begin{aligned}
{}[h_i, h_j] &= 0 \\
[h_i, e_j^\pm] &= \pm a_{ij}\,e_j^\pm \\
[e_i^+, e_j^-] &= \delta_{ij}\,h_i \\
(\mathrm{ad}\,e_i^\pm)^{1-a_{ij}} \cdot e_j^\pm &= 0 \quad \text{for } i \neq j
\end{aligned} \tag{16.21}$$
One gets an infinite-dimensional algebra because $\det(a) = 0$. The elements $h_i$ generate the Cartan subalgebra $\hat{\mathcal{H}}$. Let $n_j$ be such that $\sum_i n_i a_{ij} = 0$. Since, by hypothesis, the kernel of $a_{ij}$ is one-dimensional, such coefficients are unique up to a multiplicative constant. It is usually convenient to normalize them such that $\sum_j n_j = h^*$ with $h^*$ the dual Coxeter number; the coefficients $n_j$ are then all non-negative integers. By construction, the element $K$,
$$K = \sum_i n_i h_i$$
is central. The derivation $d$ is not an element of the algebra generated by the $(e_i^+, e_i^-, h_i)$. It has to be added by hand. Its commutation relations depend on the gradation one chooses. For example, the principal gradation obtained in eq. (16.20) assigns degree $+1$ to the $e_i^+$, degree $0$ to the $h_i$ and degree $-1$ to the $e_i^-$, which corresponds to the choice:
$$[d, e_i^+] = e_i^+, \qquad [d, h_i] = 0, \qquad [d, e_i^-] = -e_i^-$$
In particular, the rank of the (untwisted) affine Kac–Moody algebra $\hat{\mathcal{G}}$ is $(1 + \mathrm{rank}\,\mathcal{G})$ if one does not include the derivation in its definition and is $(2 + \mathrm{rank}\,\mathcal{G})$ if one does include it.

As for finite-dimensional Lie algebras, one has the decomposition:
$$\hat{\mathcal{G}} = \hat{\mathcal{N}}_- \oplus \hat{\mathcal{H}} \oplus \hat{\mathcal{N}}_+$$
with $\hat{\mathcal{N}}_\pm$ the subalgebra generated by the $e_i^\pm$ and $\hat{\mathcal{H}}$ the Cartan subalgebra. One may also introduce roots, which are points in the dual $\hat{\mathcal{H}}^*$ of the Cartan subalgebra $\hat{\mathcal{H}}$, and systems of simple roots. However, in contrast to the finite-dimensional case, the number of roots is infinite and roots may have multiplicities.

Weights are elements of $\hat{\mathcal{H}}^*$. The fundamental weight vectors $\Lambda_j$ are such that
$$\Lambda_j(h_i) = \delta_{ij} \tag{16.22}$$
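The role of the coefficients $n_i$ can be illustrated on the two simplest affine Cartan matrices. The sketch below assumes the standard matrices of the untwisted affine algebras built on $sl(2)$ and $sl(3)$, checks that $n = (1,\dots,1)$ satisfies $\sum_i n_i a_{ij} = 0$, that no other vector with small positive entries does unless it is a multiple, and evaluates $\sum_i n_i\delta_i$ on a set of Dynkin indices. The matrices, the search bound and the function names are assumptions for this example.

```python
from itertools import product

A1_AFFINE = [[2, -2],
             [-2, 2]]
A2_AFFINE = [[2, -1, -1],
             [-1, 2, -1],
             [-1, -1, 2]]

def is_left_null(n, a):
    """True if sum_i n_i a_ij = 0 for every column j."""
    size = len(a)
    return all(sum(n[i] * a[i][j] for i in range(size)) == 0
               for j in range(size))

def null_vectors(a, bound=3):
    """All vectors with entries in 1..bound annihilated by a (brute force)."""
    return [n for n in product(range(1, bound + 1), repeat=len(a))
            if is_left_null(n, a)]

def central_value(n, dynkin):
    """Value sum_i n_i delta_i taken by K = sum_i n_i h_i on a highest weight."""
    return sum(ni * di for ni, di in zip(n, dynkin))

print(null_vectors(A1_AFFINE))
print(null_vectors(A2_AFFINE))
print(central_value((1, 1, 1), (1, 0, 0)))
```

In both cases the primitive solution sums to the dual Coxeter number, 2 for $sl(2)$ and 3 for $sl(3)$.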
By definition, integrable highest weights $\Lambda$ are integer linear combinations of the fundamental weights: $\Lambda = \sum_j \delta_j\Lambda_j$ with $\delta_j$ integers. The coefficients $\delta_j$ are called Dynkin indices.

Unitary highest weight representations may be defined as in the finite-dimensional case. Let $\Lambda$ be an integrable highest weight and $|\Lambda\rangle$ be the corresponding highest weight vector. By definition, one assumes that
$$h_i|\Lambda\rangle = \Lambda(h_i)|\Lambda\rangle, \qquad e_j^+|\Lambda\rangle = 0$$
The highest weight representation $V(\Lambda)$, with highest weight $\Lambda$, is then defined as:
$$V(\Lambda) = \big(U(\hat{\mathcal{N}}_-)|\Lambda\rangle\big)/M_\Lambda$$
with $U(\hat{\mathcal{N}}_-)$ the universal enveloping algebra of $\hat{\mathcal{N}}_-$ and $M_\Lambda$ the maximal submodule of $U(\hat{\mathcal{N}}_-)|\Lambda\rangle$. More concretely, vectors in $V(\Lambda)$ are obtained by multiple action of the generators $e_i^-$ on the highest weight vector $|\Lambda\rangle$. Note that if $\Lambda = \sum_i \delta_i\Lambda_i$, then the central element $K = \sum_i n_i h_i$ acts on $V(\Lambda)$ as the C-number $K = \sum_i n_i\delta_i$. This number is called the level of the representation. Note that the adjoint representation is not a highest weight representation.

Let us present the affine Kac–Moody algebra $\widehat{sl}(2)$ in more detail. Let $E_+$, $E_-$, $H$ be the three generators of the Lie algebra $sl(2)$:
$$[H, E_\pm] = \pm2E_\pm, \qquad [E_+, E_-] = H$$
We normalize the Killing form on $sl(2)$ by $(H,H) = 2$, $(E_+,E_-) = 1$. The loop algebra $\widetilde{sl}(2)$ is the Lie algebra of traceless $2\times2$ matrices with entries Laurent polynomials in $\lambda$: $\widetilde{sl}(2) = sl(2)\otimes\mathbb{C}[\lambda,\lambda^{-1}]$. The affine Lie algebra $\widehat{sl}(2)$ is the central extension of $\widetilde{sl}(2)$: $\widehat{sl}(2) = \widetilde{sl}(2) \oplus \mathbb{C}K \oplus \mathbb{C}d$, with $K$ the central element and $d$ the derivation $d = \lambda\frac{\partial}{\partial\lambda}$. Let us write the decomposition: $\widehat{sl}(2) = \hat{\mathcal{N}}_- \oplus \hat{\mathcal{H}} \oplus \hat{\mathcal{N}}_+$.

First one can choose, as in eq. (16.18), the simple root vectors $E_{\alpha_1} = E_+$, $E_{\alpha_2} = \lambda E_-$. Together with the Cartan algebra generators $H$, $K$, $d$ and $E_{-\alpha_1} = E_-$, $E_{-\alpha_2} = \lambda^{-1}E_+$ they generate the whole algebra. The simple root vectors are of degree 0 and 1. This is called the homogeneous gradation.

It is more convenient to define a gradation such that simple root vectors have degree 1, the so-called principal gradation. To do that we choose simple root vectors $E_{\alpha_1} = \lambda E_+$, $E_{\alpha_2} = \lambda E_-$, and $E_{-\alpha_1} = \lambda^{-1}E_-$, $E_{-\alpha_2} = \lambda^{-1}E_+$. Together with the Cartan algebra generators $H$, $K$, $d$, they generate the algebra. The degree 0 elements are:
$$\hat{\mathcal{H}} = \{H, d, K\}$$
The positive degree ones ($n > 0$) are:
$$\hat{\mathcal{N}}_+ = \{E_+^{(2n-1)} = E_+\otimes\lambda^{2n-1},\; E_-^{(2n-1)} = E_-\otimes\lambda^{2n-1},\; H^{(2n)} = H\otimes\lambda^{2n}\}$$
and the negative degree ones ($n < 0$) are:
$$\hat{\mathcal{N}}_- = \{E_+^{(2n+1)} = E_+\otimes\lambda^{2n+1},\; E_-^{(2n+1)} = E_-\otimes\lambda^{2n+1},\; H^{(2n)} = H\otimes\lambda^{2n}\}$$
We can exhibit the isomorphism between the affine algebra $\widehat{sl}(2)$ in the homogeneous gradation and this presentation, which corresponds to the principal gradation. First, replace the parameter $\lambda$ by $\lambda^2$ in the homogeneous presentation. Then the simple root vectors of the homogeneous gradation are $E_+$ and $\lambda^2E_-$. Then perform a conjugation by $\exp(\log(\lambda)H/2)$. This conjugation sends $E_+$ to $\lambda E_+$ and $E_-$ to $\lambda^{-1}E_-$ and extends to an isomorphism of the two algebras. Note that $d \to d - \frac{1}{2}H$.

In the principal gradation, the commutation relations read:
$$\begin{aligned}
{}[H^{(r)}, H^{(s)}] &= r\,\delta_{r+s,0}\,K \\
[H^{(r)}, E_\pm^{(s)}] &= \pm2E_\pm^{(r+s)} \\
[E_+^{(r)}, E_-^{(s)}] &= H^{(r+s)} + \tfrac{1}{2}\,r\,\delta_{r+s,0}\,K
\end{aligned} \tag{16.23}$$
In the notation of eq. (16.21), we have $h_1 = H + \frac{1}{2}K$ and $h_2 = -H + \frac{1}{2}K$. Let $H^*$, $K^*$, $d^*$ be the dual basis of the basis $H$, $K$, $d$ of the Cartan algebra. With the root vectors $E_\pm^{(r)}$ are associated the roots $\pm2H^* + rd^*$, and with the root vectors $H^{(r)}$ are associated the roots $rd^*$. Let us draw the root diagram of $\widehat{sl}(2)$, see Fig. 16.1.

The affine $\widehat{sl}(2)$ algebra possesses two fundamental highest weights, denoted by $\Lambda_+$ and $\Lambda_-$. They are characterized by eq. (16.22). Expanding on the dual basis $H^*$, $K^*$, one gets $\Lambda_\pm = \pm\frac{1}{2}H^* + K^*$, or equivalently:
$$\Lambda_\pm(H) = \pm\tfrac{1}{2}; \qquad \Lambda_\pm(K) = 1; \qquad \Lambda_\pm(d) = 0$$
Note that the levels of these fundamental representations are equal to one, i.e. $K$ takes the value 1 on them.

16.6 Vertex operator representations

We now recall the vertex operator construction of the level one representations of $\widehat{sl}(2)$, in the principal gradation. We introduce oscillators $p_n$ for $n$ odd, such that
$$[p_m, p_n] = m\,\delta_{n+m,0}$$
[Fig. 16.1. The root diagram of $\widehat{sl}(2)$. Horizontal axis $H^*$, vertical axis $d^*$. The points shown are the roots attached to $E_\pm^{(\pm1)}$, $E_\pm^{(\pm3)}$ and $H^{(\pm2)}$; the simple roots $\alpha_1$ and $\alpha_2$ are marked at $E_+^{(1)}$ and $E_-^{(1)}$.]

Note that the choice $n$ odd ensures that there is no centre in this algebra. Assume $p_n^+ = p_{-n}$. The vacuum $|0\rangle$ is defined by $p_n|0\rangle = 0$ for $n > 0$. Its dual $\langle0|$ is defined by $\langle0|p_n = 0$ for $n < 0$, and the normalization condition $\langle0|0\rangle = 1$. This allows us to compute the vacuum expectation value, $\langle0|O|0\rangle$, of any operator $O$. The representation space is the Fock space generated by the $p_{-n}$ acting on the vacuum. We define the normal ordering on monomials of the $p_n$ by putting $p_n$, $n > 0$, to the right. We denote it by $:\;:$. Define the operators acting on the Fock space:
$$Q(z) = -i\sqrt{2}\sum_{n\ \mathrm{odd}}\frac{z^n}{n}\,p_{-n}$$
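We have
$$\langle0|Q(z_1)Q(z_2)|0\rangle = \log\left(\frac{z_1+z_2}{z_1-z_2}\right), \qquad |z_1| > |z_2| \tag{16.24}$$
This is because
$$\langle0|Q(z_1)Q(z_2)|0\rangle = -2\sum_{n_1,n_2}\frac{z_1^{n_1}z_2^{n_2}}{n_1n_2}\,\langle0|p_{-n_1}p_{-n_2}|0\rangle$$
Here $n_1$, $n_2$ are odd integers, but the properties of the vacuum select $n_1 < 0$ and $n_2 > 0$ and we have $\langle0|p_{-n_1}p_{-n_2}|0\rangle = -n_1\,\delta_{n_1+n_2,0}$ using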
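$p_{-n_1}p_{-n_2} = p_{-n_2}p_{-n_1} - n_1\,\delta_{n_1+n_2,0}$. The sum reduces to:
$$\langle0|Q(z_1)Q(z_2)|0\rangle = 2\sum_{n_2>0}\frac{1}{n_2}\left(\frac{z_2}{z_1}\right)^{n_2} = \log\left(\frac{z_1+z_2}{z_1-z_2}\right), \qquad |z_1| > |z_2|$$
where the sum runs over odd $n_2$.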
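The resummation used in eq. (16.24), $2\sum_{n>0,\ n\ \mathrm{odd}} x^n/n = \log\frac{1+x}{1-x}$ with $x = z_2/z_1$, can be checked numerically. The sketch below compares a truncated series with the closed form at a complex point with $|z_1| > |z_2|$. The truncation order, the sample point and the function names are arbitrary choices for this example.

```python
import cmath

def two_point_series(z1, z2, terms=200):
    """Truncated sum 2 * sum_{n odd > 0} (z2/z1)^n / n."""
    x = z2 / z1
    return sum(2 * x ** n / n for n in range(1, 2 * terms, 2))

def two_point_closed(z1, z2):
    """Closed form log((z1 + z2)/(z1 - z2)), principal branch."""
    return cmath.log((z1 + z2) / (z1 - z2))

z1, z2 = 2 + 1j, 0.5 - 0.3j
print(abs(two_point_series(z1, z2) - two_point_closed(z1, z2)))
```

For $|z_2/z_1| < 1$ the ratio $(1+x)/(1-x)$ has positive real part, so the principal branch of the logarithm is the one matching the series.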
The vertex operator $V(r, z)$ is defined by:

$$V(r, z) = \frac{1}{2} : \exp\left(irQ(z)\right) :$$

Proposition. The normal ordered form of a product of two vertex operators is given by:

$$V(r, z_1)V(s, z_2) = \left(\frac{z_1-z_2}{z_1+z_2}\right)^{rs} : V(r, z_1)V(s, z_2) :, \qquad |z_1| > |z_2| \tag{16.25}$$

Proof. Let

$$Q_\pm(z) = -i\sqrt{2} \sum_{\mp n>0} \frac{z^n}{n}\, p_{-n}$$

so that $Q = Q_+ + Q_-$ and $Q_+|0\rangle = 0$. Then, by definition of the normal order, $V(r, z) = \frac{1}{2} e^{irQ_-(z)} e^{irQ_+(z)}$. To compute $: V(r, z_1)V(s, z_2) :$ we need to commute $\exp(irQ_+(z_1))$ to the right of $\exp(isQ_-(z_2))$. Now it is clear that the commutator of $Q_+(z_1)$ and $Q_-(z_2)$ is a C-number. To evaluate this number one can take its vacuum expectation value. One has, using $Q_+|0\rangle = 0$ and $\langle 0|Q_- = 0$,

$$\langle 0|[Q_+(z_1), Q_-(z_2)]|0\rangle = \langle 0|Q_+(z_1)Q_-(z_2)|0\rangle = \langle 0|Q(z_1)Q(z_2)|0\rangle$$

which is given by eq. (16.24). Moreover, if $A$ and $B$ are two operators such that $[A, B]$ is a C-number, one has $e^A e^B = e^B e^A e^{[A,B]}$. So we arrive at eq. (16.25), since for $A = irQ_+(z_1)$ and $B = isQ_-(z_2)$ one has $e^{[A,B]} = \left((z_1-z_2)/(z_1+z_2)\right)^{rs}$.

The level one vertex operator representations of the affine Lie algebra $\widehat{sl}(2)$ are obtained as follows:

$$\sum_{n\ \mathrm{odd}} z^{-n}\left(E_+^{(n)} + E_-^{(n)}\right) = P(z) \equiv \sum_{n\ \mathrm{odd}} p_n z^{-n} \tag{16.26}$$

$$\sum_{n\ \mathrm{even}} z^{-n} H^{(n)} + \sum_{n\ \mathrm{odd}} z^{-n}\left(E_+^{(n)} - E_-^{(n)}\right) = \pm V(z) \tag{16.27}$$

where $V(z)$ denotes the vertex operator:

$$V(z) = V(-\sqrt{2}, z) = \frac{1}{2} : e^{-i\sqrt{2}Q(z)} : \; = \frac{1}{2} : e^{i\sqrt{2}Q(-z)} : \tag{16.28}$$
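The proof of eq. (16.25) rests on the identity $e^A e^B = e^B e^A e^{[A,B]}$ for operators whose commutator is central. As an illustration only, the sketch below checks it exactly on the $3\times 3$ Heisenberg matrices. These are a finite-dimensional stand-in chosen here for convenience, not the Fock-space operators $Q_\pm$, and all helper names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def exp_nilpotent(a):
    """Exact exponential I + a + a^2/2, valid because a^3 = 0 for
    strictly upper triangular 3x3 matrices."""
    identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
    return add(add(identity, a), scale(Fraction(1, 2), matmul(a, a)))

# Heisenberg generators: [X, Y] = Z, and Z commutes with X and Y.
X = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
Y = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]

A = scale(Fraction(3), X)
B = scale(Fraction(-5, 2), Y)
C = commutator(A, B)  # central element, plays the role of the C-number

lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))
rhs = matmul(matmul(exp_nilpotent(B), exp_nilpotent(A)), exp_nilpotent(C))
```

Exact rational arithmetic is used so that `lhs == rhs` is a strict equality rather than a floating-point comparison. Dropping the factor `exp_nilpotent(C)` breaks the equality, which is the matrix analogue of the prefactor in eq. (16.25).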
Proposition. The operators $E_\pm^{(n)}$ and $H^{(n)}$ defined in eq. (16.27) provide representations of the affine algebra eq. (16.23) with $K = 1$. These representations correspond to the fundamental highest weights $\Lambda_\pm$ according to the sign in eq. (16.27). They are the fundamental level one representations of $\widehat{sl}(2)$.

Proof. We differentiate eq. (16.25) with respect to $z_1$ and get for $|z_1| > |z_2|$:

$$\begin{aligned}
\frac{1}{r}\frac{\partial}{\partial z_1} V(r, z_1)V(s, z_2) &= \frac{i}{2} : \frac{dQ(z_1)}{dz_1}\, e^{irQ(z_1)} : V(s, z_2) \\
&= \frac{2sz_2}{z_1^2 - z_2^2}\left(\frac{z_1-z_2}{z_1+z_2}\right)^{rs} : V(r, z_1)V(s, z_2) : \\
&\quad + \frac{i}{2}\left(\frac{z_1-z_2}{z_1+z_2}\right)^{rs} : \frac{dQ(z_1)}{dz_1}\, e^{irQ(z_1)}\, V(s, z_2) :
\end{aligned}$$

Here we have used the fact that in the normal ordered product everything commutes, so that one can differentiate the exponential straightforwardly. We then set $r = 0$, using $V(0, z_1) = \frac{1}{2}$ and $z\, dQ(z)/dz = -i\sqrt{2}\, P(z)$. Defining:

$$\Gamma(z_1, z_2) = \frac{\sqrt{2}\, s z_1 z_2}{z_1^2 - z_2^2}\, V(s, z_2) + : P(z_1)V(s, z_2) :$$

we get:

$$P(z_1)V(s, z_2) = \Gamma(z_1, z_2), \qquad |z_1| > |z_2|$$

Similarly, we differentiate eq. (16.25) with respect to $z_2$ and then perform the exchange $(z_1, z_2, r) \to (z_2, z_1, s)$. We get:

$$V(s, z_2)P(z_1) = \Gamma(z_1, z_2), \qquad |z_1| < |z_2|$$

Expanding $P(z) = \sum_n p_n z^{-n}$, where $n$ is odd, so that

$$p_n = E_+^{(n)} + E_-^{(n)} = \oint_C \frac{dz}{2i\pi}\, z^{n-1} P(z)$$

we can write the commutator $[p_n, V(s, z_2)]$ as:

$$[p_n, V(s, z_2)] = \oint_{C_1 - C_2} \frac{dz_1}{2i\pi}\, z_1^{n-1}\, \Gamma(z_1, z_2)$$

where $C_1$ is a circle around the origin with $|z_1| > |z_2|$ while $C_2$ is a circle around the origin with $|z_1| < |z_2|$. This contour integral is given by the residues at the two poles $z_1 = \pm z_2$ and we finally obtain, setting $s = -\sqrt{2}$:

$$[E_+^{(n)} + E_-^{(n)}, V(z)] = -2z^n V(z) \tag{16.29}$$
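The coefficient $-2z^n$ in eq. (16.29) comes from the C-number term of $\Gamma(z_1, z_2)$, whose poles at $z_1 = \pm z_2$ lie between the two contours; the normal-ordered term is the same on both circles and drops out of the difference. A minimal numerical sketch of that residue computation follows. The function names, the choice of radii $2|z_2|$ and $|z_2|/2$, and the sample count are assumptions made here for illustration.

```python
import cmath
import math

def circle_integral(f, radius, samples=4096):
    """Approximate (1 / (2*pi*i)) * contour integral of f over |z| = radius
    with the trapezoid rule in the angle (dz / (2*pi*i) = z dtheta / (2*pi))."""
    total = 0
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        total += f(z) * z
    return total / samples

def c_number_integrand(n, s, z2):
    """z1^(n-1) times the C-number coefficient of V(s, z2) in Gamma(z1, z2)."""
    return lambda z1: z1 ** (n - 1) * math.sqrt(2) * s * z1 * z2 / (z1 ** 2 - z2 ** 2)

def commutator_coefficient(n, s, z2):
    """Integral over C1 (outside |z2|) minus integral over C2 (inside |z2|)."""
    f = c_number_integrand(n, s, z2)
    r = abs(z2)
    return circle_integral(f, 2 * r) - circle_integral(f, r / 2)

if __name__ == "__main__":
    z2 = 0.7 + 0.4j
    for n in (-3, 1, 5):  # n must be odd
        print(n, commutator_coefficient(n, -math.sqrt(2), z2), -2 * z2 ** n)
```

For negative $n$ the integrand also has a pole at the origin, but it lies inside both circles and cancels in the difference, so only the poles at $\pm z_2$ contribute.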
Similarly, starting from eq. (16.25) and setting $V(z) = \sum_n V_n z^{-n}$, one gets:

$$[V_n, V(z_2)] = \oint_{C_1 - C_2} \frac{dz_1}{2i\pi}\, z_1^{n-1} \left(\frac{z_1-z_2}{z_1+z_2}\right)^{2} : V(z_1)V(z_2) :$$
The residue is at $z_1 = -z_2$ and is easily computed, noting that $: V(z_2)V(-z_2) : \; = 1/4$. One finds $[V_n, V(z)] = 2(-1)^n z^n P(z) + (-1)^n n z^n$. Since by eq. (16.27) $V_n = \pm H^{(n)}$ for $n$ even and $V_n = \pm(E_+^{(n)} - E_-^{(n)})$ for $n$ odd, separating $n$ even and odd this reads (with the sign of eq. (16.27)):

$$\pm[E_+^{(n)} - E_-^{(n)}, V(z)] = -2z^n P(z) - nz^n, \qquad \pm[H^{(n)}, V(z)] = 2z^n P(z) + nz^n \tag{16.30}$$

From eqs. (16.29, 16.30), one gets by expanding $V(z)$ into its components (note that $\pm$ cancels):

$$[H^{(n)}, E_\pm^{(m)}] = \pm 2E_\pm^{(n+m)} \tag{16.31}$$

$$[E_\pm^{(n)}, E_+^{(m)} - E_-^{(m)}] = -H^{(n+m)} \mp \frac{1}{2}\, n\,\delta_{n+m,0} \tag{16.32}$$
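The step of expanding into components is not written out in the text. The following is a sketch of how it can be done, under the reading that $\pm V(z)$ is the generating function of eq. (16.27), with $p_n = E_+^{(n)} + E_-^{(n)}$ and the shorthand $D^{(n)} = E_+^{(n)} - E_-^{(n)}$ introduced here for brevity. Comparing coefficients of $z^{-m}$ gives the intermediate relations below, from which eqs. (16.31) and (16.32) follow by taking half-sums and half-differences.

```latex
% Shorthand (introduced here): D^{(n)} = E_+^{(n)} - E_-^{(n)},  p_n = E_+^{(n)} + E_-^{(n)}.
% From [p_n, \pm V(z)] = -2 z^n (\pm V(z)), coefficient of z^{-m}:
\begin{align*}
[p_n, H^{(m)}] &= -2\, D^{(n+m)}  && (m \text{ even}), \\
[p_n, D^{(m)}] &= -2\, H^{(n+m)}  && (m \text{ odd}).
\end{align*}
% From [H^{(n)}, \pm V(z)] = 2 z^n P(z) + n z^n  (n even):
\begin{align*}
[H^{(n)}, D^{(m)}] &= 2\, p_{n+m}, &
[H^{(n)}, H^{(m)}] &= n\, \delta_{n+m,0}.
\end{align*}
% From [D^{(n)}, \pm V(z)] = -2 z^n P(z) - n z^n  (n odd):
\begin{align*}
[D^{(n)}, H^{(m)}] &= -2\, p_{n+m}, &
[D^{(n)}, D^{(m)}] &= -n\, \delta_{n+m,0}.
\end{align*}
% Using E_\pm^{(m)} = \tfrac12 (p_m \pm D^{(m)}):
\begin{align*}
[H^{(n)}, E_\pm^{(m)}]
  &= \tfrac12\left( 2 D^{(n+m)} \pm 2 p_{n+m} \right)
   = \pm 2\, E_\pm^{(n+m)}, \\
[E_\pm^{(n)}, D^{(m)}]
  &= \tfrac12\left( -2 H^{(n+m)} \mp n\, \delta_{n+m,0} \right)
   = -H^{(n+m)} \mp \tfrac12\, n\, \delta_{n+m,0}.
\end{align*}
```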
Finally, we have $P(z) = \sum_n p_n z^{-n}$ so that $E_+^{(n)} + E_-^{(n)} = p_n$, and from the commutation relations of $p_n$ we get $[E_+^{(n)} + E_-^{(n)}, E_+^{(m)} + E_-^{(m)}] = n\,\delta_{n+m,0}$. Combining with eq. (16.30), this gives:

$$[E_\pm^{(n)}, E_+^{(m)} + E_-^{(m)}] = \pm H^{(n+m)} + \frac{1}{2}\, n\,\delta_{n+m,0} \tag{16.33}$$
Equations (16.31, 16.32, 16.33) are equivalent to the commutation relations eq. (16.23) for $K = 1$. We have obtained a level one representation of $\widehat{sl}(2)$. It remains to identify the highest weight. It is provided by the vacuum vector $|0\rangle$ because

$$V(z)|0\rangle = \frac{1}{2}\, e^{-i\sqrt{2}Q_-(z)} |0\rangle \tag{16.34}$$

contains only non-negative powers of $z$. It follows that $(E_+^{(n)} - E_-^{(n)})|0\rangle = 0$ and $H^{(n)}|0\rangle = 0$ for $n > 0$. Moreover, since $E_+^{(n)} + E_-^{(n)} = p_n$ we have $(E_+^{(n)} + E_-^{(n)})|0\rangle = 0$ for $n > 0$. This implies that the vacuum is annihilated by all positive root vectors, hence is a highest weight vector. The $z^0$ term of eq. (16.34) also gives $H^{(0)}|0\rangle = \pm\frac{1}{2}|0\rangle$, with the sign of eq. (16.27), so that the corresponding weight is $\pm\frac{1}{2}$. It is known that this representation on Fock space is irreducible.
References
[1] S. Helgason, Differential Geometry and Symmetric Spaces. Academic Press (1962).
[2] J.E. Humphreys, Introduction to Lie Algebras and Representation Theory. Springer (1972).
[3] V.G. Kac, Infinite Dimensional Lie Algebras. Cambridge University Press (1985).
Index
Abel map, 29, 136, 184, 399, 552, 553
Abelian differentials, 183, 280, 549
abelianization, 35, 64
action–angle variables, 10, 161, 500
adjoint action, 563
adjoint linear system, 135, 339
adjoint representation, 565
Adler trace, 333, 451
AKS scheme, 90, 113, 335
Arnold theorem, 10
Bäcklund transformations, 69
Baker–Akhiezer function, 145, 182, 184, 215, 222, 279, 322, 328, 338, 342, 380, 393
bihamiltonian structure, 356, 385
bilinear identities, 339
Bloch solutions, 191, 221, 378, 401
bosonization, 303
Calogero–Moser model, 202, 208, 236, 238
canonical bundle, 543
canonical coordinates, 157, 189, 193, 510
canonical cycles, 549
canonical transformation, 7, 8
Cartan matrix, 569
Cartan subalgebra, 107, 437, 566
Casimir operator, 44, 45, 47, 100, 294, 301, 573
Cauchy determinant, 305
central extension, 61, 358
chiral fields, 436, 460
classical double, 527
coadjoint action, 40, 92
coadjoint orbit, 40, 43, 92, 101, 113, 124, 152
conformal invariance, 386, 436, 454, 460
coproduct, 571
cotangent bundle, 228, 512
Darboux theorem, 9, 510
degenerate Poisson bracket, 508
degree of divisor, 544
desingularization, 130, 166, 537, 539
divisor, 544
double, see classical double
dressing transformation, 72, 74, 454, 463, 532
Drinfeld–Sokolov construction, 358, 441, 448
dualization, 85
dynamical divisor, 127, 132, 138, 139, 179, 189, 215, 396
eigenvector bundle, 128, 150, 178, 213
elementary flows, 48
elliptic functions, 203, 557
equivalent divisors, 544
Euler top, 19, 32, 39, 47
exchange algebra, 447
exponential map, 563
factorization, 54, 56, 94, 150, 454, 487
fermions, 297, 388
finite zone solutions, 170, 222, 400, 474
Fock space, 297
Fuchs relation, 254, 264, 278
Gaudin model, 235
Gelfand–Dickey, 37, 66, 332
Gelfand–Levitan–Marchenko, 490
genus, 123, 540
geodesics, 25, 107
Grassmannian, 309, 325
Hamilton–Jacobi equation, 160
Hamiltonian reduction, 24, 107, 116, 125, 203, 228, 238, 359, 519
Hamiltonian vector field, 9, 509
hierarchy, 52, 72, 244, 335
highest root, 569
highest weight, 440, 572
Hirota equation, 274, 296, 307, 387, 443
Hirota operators, 274, 296, 307, 387
Hitchin systems, 227
hyperelliptic curve, 177, 372, 393, 474, 538
irregular singular points, 246
isomonodromic flows, 260, 261
isospectral, 12
Iwasawa decomposition, 107
Jacobi identity, 15
Jacobi matrices, 103
Jacobi problem, 25
Jacobi–Trudi formula, 317
Jacobian torus, 139, 552
Jacobian variety, 29, 552
Jost solutions, 479
Kac–Moody algebra, 175, 459, 578
Kadomtsev–Petviashvili, see KP
KdV equation, 350
KdV Hamiltonians, 350, 382
KdV hierarchy, 348, 379
Kepler problem, 17
Killing form, 565
Kirillov symplectic form, 514
Korteweg–de Vries, see KdV
Kostant–Kirillov bracket, 41, 43, 86, 153, 202, 350, 513, 525
Kowalevski top, 22, 118, 165
KP equation, 338
KP hierarchy, 222, 274, 308, 323, 335
Lagrange top, 20, 33, 40
Lamé function, 203, 558
Lax connection, 62, 72, 438, 460
Lax equation, 11, 34, 41, 92, 139, 176, 204, 392
Lax pair, 11, 13, 92, 93, 119, 175
level of representation, 583
Lie–Poisson action, 528
line bundle, 542
linear system, 52, 62, 122, 182, 394, 477
linearization, 140, 399
Liouville equation, 435
Liouville theorem, 7, 10
Liouville tori, 10, 160, 198
loop algebra, 41, 91, 578
matrix of periods, 550
Miura transformation, 354, 441, 448
moment map, 109, 228, 240, 359, 518, 524
monodromy matrix, 62, 70, 77, 190, 254, 377, 458, 496
Neumann model, 23, 26, 33, 40, 48, 125, 134, 143, 158, 160
Noether theorem, 518
non-Abelian Hamiltonian, 76, 458, 529, 533
normal order, 298
null vectors, 418
operator ∇, 59, 271, 347, 383
Painlevé property, 284
path-ordered exponential, 62
phase shift, 390, 466
Plücker relations, 310
point bundle, 543
Poisson bracket, 6, 507
Poisson manifold, 508
Poisson–Lie group, 76, 457, 525, 526
Poissonian action, 517
polar part, 34
product of bundles, 543
Prym variety, 170
pseudo-differential operators, 322, 333, 379, 451
R± projectors, 76, 177, 456
r-bracket, 85, 531
r-matrix, 14, 43, 70, 76, 85, 98, 100, 176, 206, 497
rank of an algebra, 566
reality conditions, 195, 402
reduced Poisson bracket, 111, 521, 522
reduction group, 39
regular singular points, 246
representation, 565
representation of Lie algebras, 106
Riccati equations, 271, 379
Riemann bilinear identity, 152, 282, 430, 551, 552
Riemann problem, 256
Riemann surface, 29
Riemann theorem, 217, 554
Riemann’s constants, 554
Riemann–Hilbert, 53, 56, 72, 78, 231, 256, 487, 559
Riemann–Hurwitz formula, 123, 168, 177, 212, 278, 540
Riemann–Roch, 132, 183, 229, 546, 548, 549
root, 95, 107, 566
Ruijsenaars–Schneider model, 471
Sato formula, 59, 274, 328, 346
scattering data, 478, 481, 488
Schlesinger equations, 263, 267
Schlesinger transformation, 265
Schroedinger discrete, 178
Schroedinger equation, 376
Schur polynomials, 317
Schwarzian derivative, 437
section of bundle, 542
Semenov-Tian-Shansky bracket, 457, 532
semi-infinite wedge product, 312
semi-simple Lie algebra, 566
separation of variables, 18, 27, 28, 158, 410
Serre relations, 570, 580
sigma model, 67, 81
sine-Gordon hierarchy, 462, 464
sine-Gordon model, 68, 458, 477
Sklyanin bracket, 70, 192, 446, 497, 530
soliton solutions, 77, 308, 388, 412, 463, 493
spectral curve, 122, 177, 211, 391
spectral parameter, 32, 34, 61
stationary flows, 171
Stokes matrix, 251
Stokes sectors, 248, 251
symmetric spaces, 107, 108, 115, 239
symplectic form, 43, 157, 186, 194, 219, 407, 415, 466, 508
symplectic transformations, 509
tau-function, 57, 106, 151, 185, 268, 281, 295, 305, 315, 327, 346, 387, 413, 443, 463
tensor notation, 13
theta-function, 141, 147, 184, 218, 395, 554
Toda chain, closed, 175
Toda chain, open, 95
Toda field, 437
topological charge, 465, 478
twisted algebra, 579
ultralocality, 70, 447
Verma module, 572
vertex operator, 295, 304, 463, 584
Virasoro algebra, 356, 386, 449
Volterra group, 335
wave function, 52, 56, 62, 67, 147, 179, 221, 322, 338, 376, 439
weight, 571, 581
Weyl group, 103, 569
Weyl vector, 438
Whitham average, 367, 425
Whitham equations, 371, 431
Wick theorem, 298
Yang–Baxter equation, 15, 45, 86, 100, 207, 528
Yang–Baxter modified equation, 85, 90, 100
Young diagram, 316
Zakharov–Shabat, 34, 63, 72, 244
zero curvature, 50, 62, 245, 438
zones, allowed, 196
zones, forbidden, 196