Commun. Math. Phys. 212, 1 – 27 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
On the Magnetization of a Charged Bose Gas in the Canonical Ensemble Horia D. Cornean Institute of Mathematics of the Romanian Academy, P.O. Box 1-764, 70700 Bucharest, Romania. E-mail:
[email protected];
[email protected] Received: 15 July 1999 / Accepted: 29 November 1999
Abstract: Consider a charged Bose gas without self-interactions, confined in a three dimensional cubic box of side L ≥ 1 and subjected to a constant magnetic field B ≠ 0. If the bulk density of particles ρ and the temperature T are fixed, then define the canonical magnetization as the partial derivative with respect to B of the reduced free energy. Our main result is that it admits a thermodynamic limit for all strictly positive ρ, T and B. It is also proven that the canonical and grand canonical magnetizations (the latter at fixed average density) are equal up to surface order corrections.

1. Introduction

Much work has been done on the thermodynamic behavior of large systems composed of independent quantum particles in the presence of external magnetic fields. As is well known, the fundamental problem consists in proving the existence of the thermodynamic limit for the potentials and the equations of state defined at finite volume. In the particular case of the canonical magnetization (defined as the partial derivative with respect to the magnetic field of the reduced free energy), one has to prove that the derivative (performed at finite volume) commutes with the thermodynamic limit of the reduced free energy. Although the quantum canonical ensemble is (from the physical point of view) the most important one, most of the previous works were carried out either using the Maxwell–Boltzmann statistics or in the framework of the quantum grand canonical ensemble, because in those settings many physically relevant quantities can be expressed in terms of the integral kernel of the Gibbs semigroup associated to the one particle problem. Moreover, one is able to go beyond the bulk terms and investigate finite size effects. Take for example the grand canonical pressure of a quantum gas in a constant magnetic field.
The rigorous proof of its thermodynamic limit goes back at least to Angelescu and Corciovei [A, A-C]; its surface correction (in the regime in which the fugacity is less than one) was obtained by Kunz [K]. As for the Maxwell–Boltzmann magnetization,
nice results were obtained by Macris et al. [M-M-P 1,2]; they even wrote down the corner corrections. Notice that in these papers the domain Λ was allowed to be more general, typically convex with piecewise smooth boundary. Another result concerning the thermodynamic limit and the surface corrections for the magnetization and susceptibility of a Fermi gas at zero magnetic field was obtained by Angelescu et al. [A-B-N 2]. Because this paper motivated our work, we give some more details about it. Firstly, as in our setting, their domain was a rectangular parallelepiped and the magnetic field was oriented along the third direction. They defined the grand canonical magnetization $m_\Lambda(\beta,z)$ (susceptibility $\chi_\Lambda(\beta,z)$) as the first (second) derivative with respect to the magnetic field of the grand canonical pressure at $B=0$, for all $z\in\mathbb{C}\setminus(-\infty,-1]$. Their main result can be roughly stated as follows:
i. $m_\Lambda(\beta,z)=0$, $\forall z\in\mathbb{C}\setminus(-\infty,-1]$;
ii. There exists $\chi_\infty(\beta,z)$ analytic in $\mathbb{C}\setminus(-\infty,-1]$ such that for any compact $K\subset\mathbb{C}\setminus(-\infty,-1]$ one has:
$$\lim_{\Lambda\to\infty}\sup_{z\in K}|\chi_\Lambda(\beta,z)-\chi_\infty(\beta,z)|=0.$$
Moreover, they even gave the surface correction for the susceptibility and proved that this expansion is uniform on compacts. Because the relation between the fugacity and the grand canonical average density of Fermi particles can always be inverted, they were able to express the grand canonical susceptibility in terms of the canonical parameters ρ and β. Let us stress that B = 0 and Λ a rectangular parallelepiped were crucial ingredients in [A-B-N 2], the uniform convergence on compacts being obtained via a substantial use of the explicit formula of the integral kernel of the Gibbs semigroup associated to the Dirichlet Laplacian. In this paper, we study the “true” canonical problem for a Bose gas at nonzero magnetic field $B_0 > 0$ (in order to avoid the Bose condensation). Using a standard procedure (see [K-U-Z, H]) of deriving the canonical partition function from the grand canonical pressure (see (2.27)), we are able to transform the uniform convergence on compacts of the grand canonical magnetization (see Lemma 1) into a pointwise convergence (β, ρ fixed and L → ∞) of the canonical magnetization; this result is given in Theorem 2. Moreover, we obtain that the canonical magnetization $m_L$ (see (2.29)) and the grand canonical magnetization at fixed average density (see (2.30)) are equal up to surface order corrections. Two natural questions arise: what about Fermi statistics, and what about higher derivatives with respect to B (the susceptibility for example)? Partial answers and a few open problems are outlined at the end of the proofs.
2. Preliminaries and the Results

Let $\Lambda=\{\mathbf{x}\in\mathbb{R}^3 \mid -L/2<x_j<L/2,\ j\in\{1,2,3\}\}$, $L>1$, be a cubic box with its side equal to $L$. Then the “one particle” Hilbert space is $\mathcal{H}_{1,L}:=L^2(\Lambda)$; denote by $\mathcal{H}_{n,L}$ the proper subspace of $\otimes_{j=1}^{n}\mathcal{H}_{1,L}\cong L^2(\Lambda^n)$ which contains all totally symmetric functions. Denote by $\mathcal{H}_{0,L}=\mathbb{C}$ the space with no particles; then the Fock space is defined as $\mathcal{F}_L:=\bigoplus_{n\ge 0}\mathcal{H}_{n,L}$. One can introduce the “number of particles” operator $N_L$ as the unique self-adjoint extension of the multiplication by $n$ on each $\mathcal{H}_{n,L}$.
Assume that the particles (each having an electric charge $e$) are subjected to a constant magnetic field $\mathbf{B}=B\mathbf{e}_3$, which corresponds to a magnetic vector potential $B\mathbf{a}=\frac{B}{2}\mathbf{e}_3\wedge\mathbf{x}$. If $c$ stands for the speed of light, define $\omega:=(e/c)B$. Then the “one particle” Hamiltonian (denoted by $H_{1,L}(\omega)$) is the Friedrichs extension of the symmetric and positive operator $\frac{1}{2}(-i\nabla-\omega\mathbf{a})^2$ defined on $C_0^\infty(\Lambda)$. Due to the regularity of $\Lambda$, $H_{1,L}(\omega)$ is essentially self-adjoint on
$$D=\left\{f\in C^2(\Lambda)\cap C^1(\overline{\Lambda}),\ f|_{\partial\Lambda}=0,\ \Delta f\in L^2(\Lambda)\right\}.$$
The Hamiltonian which describes $n$ particles reads:
$$H_{n,L}(\omega)=\underbrace{H_{1,L}(\omega)\otimes\cdots\otimes I+\cdots+I\otimes\cdots\otimes H_{1,L}(\omega)}_{n\ \text{terms}}. \tag{2.1}$$
The second quantized Hamiltonian $H_L(\omega)$ is defined as the unique self-adjoint operator on $\mathcal{F}_L$ whose restrictions to $\mathcal{H}_{n,L}$ coincide with $H_{n,L}(\omega)$. If $T>0$ stands for the temperature and $\mu\in\mathbb{R}$ for the chemical potential, then define $\beta=\frac{1}{k_BT}>0$ and $z=\exp(\beta\mu)$ (the fugacity), where $k_B$ is the Boltzmann constant. When working in the canonical ensemble, one considers that the bulk density of particles $\rho$ is constant, therefore the number of particles is defined as $N(L):=\rho L^3$. As is well known (see [R-S 4]), $H_{1,L}(\omega)$ is positive, unbounded and has compact resolvent; these imply that its spectrum is purely discrete with accumulation point at infinity. Moreover, from the min-max principle it follows:
$$\inf\sigma(H_{1,L}(\omega))\ge\inf\sigma(H_{1,\infty}(\omega))=\frac{\omega}{2}. \tag{2.2}$$
It is also known that the semigroup $W_L(\beta,\omega):=\exp(-\beta H_{1,L}(\omega))$ is trace class and admits an integral kernel $G_{\omega,L}(\mathbf{x},\mathbf{x}';\beta)$, which is continuous in both its “spatial” variables. The diamagnetic inequality at finite volume (see [B-H-L]) reads:
$$|G_{\omega,L}(\mathbf{x},\mathbf{x}';\beta)|\le G_{0,L}(\mathbf{x},\mathbf{x}';\beta)\le\frac{1}{(2\pi\beta)^{3/2}}\exp\left(-\frac{|\mathbf{x}-\mathbf{x}'|^2}{2\beta}\right). \tag{2.3}$$
If $I_1(L^2(\Lambda))$ denotes the Banach space of trace class operators, it follows that:
$$\|W_L(\beta,\omega)\|_{I_1}=\mathrm{tr}\,W_L(\beta,\omega)\le\frac{L^3}{(2\pi\beta)^{3/2}}. \tag{2.4}$$
Denote by $\{E_j(\omega)\}_{j\in\mathbb{N}}$ the set of the eigenvalues of $H_{1,L}(\omega)$. If $\mu<0$, the grand canonical partition function reads:
$$\Xi_L(\beta,z,\omega)=\mathrm{tr}_{\mathcal{F}_L}\exp[-\beta(H_L(\omega)-\mu N_L)]=\prod_{j=0}^{\infty}[1-z\exp(-\beta E_j(\omega))]^{-1}. \tag{2.5}$$
The canonical partition function of our system is:
$$Z_L(\beta,\rho,\omega)=\mathrm{tr}_{\mathcal{H}_{N(L),L}}\exp(-\beta H_{N(L),L}(\omega)). \tag{2.6}$$
The link between them is contained in the following equality:
$$\Xi_L(\beta,z,\omega)=\sum_{n=0}^{\infty}z^n\,\mathrm{tr}_{\mathcal{H}_{n,L}}\exp[-\beta H_{n,L}(\omega)]. \tag{2.7}$$
Throughout the entire paper, by $\log z$ we shall understand the logarithm function restricted to $\mathbb{C}\setminus(-\infty,0]$. Let $\mathcal{C}$ be a contour which surrounds the origin, does not intersect the cut $[1,\infty)$ but contains the spectrum of the trace class operator $zW_L$, where $z\in\mathbb{C}\setminus[\exp(\beta\omega/2),\infty)$. Let $q(\xi)=\frac{1}{\xi}\log(1-\xi)$ be an analytic function in the interior of $\mathcal{C}$. Define the following bounded operator:
$$q(zW_L)=\frac{1}{2\pi i}\int_{\mathcal{C}}d\xi\,q(\xi)(\xi-zW_L)^{-1}. \tag{2.8}$$
It is easy to see that $\log(1-zW_L)=zW_L\cdot q(zW_L)$, and using (2.5) one obtains:
$$\log\Xi_L(\beta,z,\omega)=-\mathrm{tr}\left(zW_L\cdot q(zW_L)\right). \tag{2.9}$$
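Employing the above expression, one can easily prove that the grand canonical potential (seen as a function of $z$) is analytic in $\mathbb{C}\setminus[\exp(\beta\omega/2),\infty)$. When $|z|<1$, (2.9) becomes:
$$\log\Xi_L(\beta,z,\omega)=\sum_{n=1}^{\infty}\frac{z^n}{n}\,\mathrm{tr}\,W_L^n=\sum_{n=1}^{\infty}\frac{z^n}{n}\int_{\Lambda}d\mathbf{x}\,G_{\omega,L}(\mathbf{x},\mathbf{x};n\beta). \tag{2.10}$$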
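The agreement between the product formula (2.5) and the trace expansion (2.10) can be checked numerically on a small example. The following is a minimal sketch in Python, assuming a hypothetical finite toy spectrum `LEVELS` in place of the true (infinite) spectrum of $H_{1,L}(\omega)$; the function names are illustrative and do not come from the paper.

```python
import math

# Hypothetical toy spectrum standing in for the eigenvalues E_j(omega).
# It is an assumption made for illustration only.
LEVELS = [0.5, 1.5, 2.5, 2.5, 4.0]

def log_xi_product(beta, z, levels=LEVELS):
    """log Xi from the product formula (2.5): -sum_j log(1 - z e^{-beta E_j})."""
    return -sum(math.log(1.0 - z * math.exp(-beta * e)) for e in levels)

def log_xi_series(beta, z, levels=LEVELS, n_max=400):
    """log Xi from (2.10): sum_n (z^n / n) tr W^n, with tr W^n = sum_j e^{-n beta E_j}."""
    total = 0.0
    for n in range(1, n_max + 1):
        trace_wn = sum(math.exp(-n * beta * e) for e in levels)
        total += z ** n / n * trace_wn
    return total

if __name__ == "__main__":
    print(log_xi_product(1.0, 0.9), log_xi_series(1.0, 0.9))
```

Both routines should return the same number as long as $|z|<1$, which is the regime in which (2.10) is stated.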
The grand canonical pressure and density are defined as:
$$P_L(\beta,z,\omega):=\frac{1}{\beta L^3}\log\Xi_L(\beta,z,\omega)=-\frac{1}{\beta L^3}\sum_{j}\log\left(1-ze^{-\beta E_j(\omega)}\right), \tag{2.11}$$
and
$$\rho_L(\beta,z,\omega):=\beta z\frac{\partial P_L}{\partial z}(\beta,z,\omega). \tag{2.12}$$
Let us remark that $\rho_L(\beta,x,\omega)$ is an increasing function if $0<x<\exp(\beta\omega/2)$:
$$\frac{\partial\rho_L}{\partial x}(\beta,x,\omega)=\frac{1}{L^3}\,\mathrm{tr}\left[(1-xW_L)^{-2}W_L\right]>0. \tag{2.13}$$
The proof of the thermodynamic limit for these two quantities goes back at least to Angelescu and Corciovei [A, A-C]. Because this result plays an important role in our work, we shall reproduce it here. In order to do that, let us define ($\omega>0$):
$$P_\infty(\beta,z,\omega):=\omega\,\frac{1}{(2\pi\beta)^{3/2}}\sum_{k=0}^{\infty}g_{3/2}\left(ze^{-(k+1/2)\omega\beta}\right), \tag{2.14}$$
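and
$$\rho_\infty(\beta,z,\omega):=\beta z\frac{\partial P_\infty}{\partial z}(\beta,z,\omega)=\beta\omega\,\frac{1}{(2\pi\beta)^{3/2}}\sum_{k=0}^{\infty}g_{1/2}\left(ze^{-(k+1/2)\omega\beta}\right), \tag{2.15}$$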
where $g_\sigma(\zeta)$ are the usual Bose functions:
$$g_\sigma(\zeta)=\frac{\zeta}{\Gamma(\sigma)}\int_0^\infty dt\,\frac{t^{\sigma-1}e^{-t}}{1-\zeta e^{-t}}, \tag{2.16}$$
analytic in $\mathbb{C}\setminus[1,\infty)$, and if $|\zeta|<1$ they are given by the following expansion:
$$g_\sigma(\zeta)=\sum_{n=1}^{\infty}\frac{\zeta^n}{n^\sigma}.$$
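The two representations of the Bose functions, the integral (2.16) and the power series, can be compared numerically. The sketch below is illustrative only: the function names, the substitution $t=u^2$ and the trapezoid rule are choices made here, not taken from the paper, and the quadrature is only expected to be accurate for the half-integer values $\sigma=1/2$ and $\sigma=3/2$ used in (2.14) and (2.15), where the transformed integrand is smooth and even in $u$.

```python
import math

def bose_series(sigma, zeta, tol=1e-16, n_max=1_000_000):
    """g_sigma(zeta) = sum_{n>=1} zeta^n / n^sigma, valid for |zeta| < 1."""
    total, power = 0.0, 1.0
    for n in range(1, n_max + 1):
        power *= zeta
        term = power / n ** sigma
        total += term
        if abs(term) < tol:
            break
    return total

def bose_integral(sigma, zeta, u_max=10.0, steps=4000):
    """g_sigma(zeta) from (2.16) after substituting t = u^2 (assumes sigma >= 1/2)."""
    def integrand(u):
        e = math.exp(-u * u)
        # t^{sigma-1} e^{-t} dt  becomes  2 u^{2 sigma - 1} e^{-u^2} du
        return 2.0 * u ** (2.0 * sigma - 1.0) * e / (1.0 - zeta * e)
    h = u_max / steps
    acc = 0.5 * (integrand(0.0) + integrand(u_max))
    acc += sum(integrand(k * h) for k in range(1, steps))
    return zeta / math.gamma(sigma) * acc * h

if __name__ == "__main__":
    for sigma in (0.5, 1.5):
        print(sigma, bose_series(sigma, 0.5), bose_integral(sigma, 0.5))
```

The series converges slowly as $\zeta\to 1$, which is one practical reason to keep the integral form at hand near the end of the analyticity domain.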
Then the following result is true (see [A, A-C]):

Theorem 1. Let $K\subset\mathbb{C}\setminus[\exp(\beta\omega/2),\infty)$ be a compact set. Then the grand canonical pressure and density admit the thermodynamic limit, i.e.:
$$\lim_{L\to\infty}\sup_{z\in K}|P_L(\beta,z,\omega)-P_\infty(\beta,z,\omega)|=0, \tag{2.17}$$
and
$$\lim_{L\to\infty}\sup_{z\in K}|\rho_L(\beta,z,\omega)-\rho_\infty(\beta,z,\omega)|=0. \tag{2.18}$$
Firstly, because $P_L$ and $\rho_L$ are analytic functions, it follows via the Cauchy integral formula that all their complex derivatives admit a limit which is uniform on compacts. In particular:
$$\lim_{L\to\infty}\sup_{z\in K}\left|\frac{\partial\rho_L}{\partial z}(\beta,z,\omega)-\frac{\partial\rho_\infty}{\partial z}(\beta,z,\omega)\right|=0. \tag{2.19}$$
It can be seen from (2.15) that $\lim_{x\nearrow e^{\beta\omega/2}}\rho_\infty(\beta,x,\omega)=\infty$, which means that the Bose condensation is absent when a nonzero magnetic field is present. A very important consequence of the theorem is that the relation between the fugacity and density can be inverted for all temperatures and moreover, if $0<x_\infty(\beta,\rho,\omega)<e^{\beta\omega/2}$ is the unique real and positive solution of the equation $\rho_\infty(\beta,x,\omega)=\rho$ and if $x_L(\beta,\rho,\omega)$ is the unique real and positive solution of $\rho_L(\beta,x,\omega)=\rho$, then $\lim_{L\to\infty}x_L=x_\infty$. Let us perform the Legendre transform at finite volume:
$$\tilde f_L(\beta,\rho,\omega):=-P_L(\beta,x_L(\beta,\rho,\omega),\omega)+\frac{\rho}{\beta}\log x_L(\beta,\rho,\omega). \tag{2.20}$$
A straightforward result is that $\tilde f_L$ has the following limit:
$$f_\infty(\beta,\rho,\omega):=-P_\infty(\beta,x_\infty(\beta,\rho,\omega),\omega)+\frac{\rho}{\beta}\log x_\infty(\beta,\rho,\omega). \tag{2.21}$$
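The inversion of the fugacity-density relation and the Legendre transform (2.21) can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the paper's procedure: it evaluates (2.14) and (2.15) by truncated series, finds $x_\infty$ by plain bisection on $(0,e^{\beta\omega/2})$ using the monotonicity of $\rho_\infty$ in $x$, and is practical only when $x_\infty$ is not too close to $e^{\beta\omega/2}$ (the series for $g_{1/2}$ converges slowly there). All function names are hypothetical.

```python
import math

def bose_series(sigma, zeta, tol=1e-16, n_max=2_000_000):
    """g_sigma(zeta) = sum_{n>=1} zeta^n / n^sigma for 0 <= zeta < 1."""
    total, power = 0.0, 1.0
    for n in range(1, n_max + 1):
        power *= zeta
        term = power / n ** sigma
        total += term
        if term < tol:
            break
    return total

def landau_sum(sigma, beta, z, omega, tol=1e-16):
    """sum_{k>=0} g_sigma(z e^{-(k+1/2) omega beta}), stopped once terms fall below tol."""
    total, k = 0.0, 0
    while True:
        term = bose_series(sigma, z * math.exp(-(k + 0.5) * omega * beta))
        total += term
        if term < tol:
            return total
        k += 1

def pressure_inf(beta, z, omega):
    """P_infinity from (2.14)."""
    return omega / (2.0 * math.pi * beta) ** 1.5 * landau_sum(1.5, beta, z, omega)

def density_inf(beta, z, omega):
    """rho_infinity from (2.15)."""
    return beta * omega / (2.0 * math.pi * beta) ** 1.5 * landau_sum(0.5, beta, z, omega)

def fugacity_inf(beta, rho, omega, iterations=60):
    """Bisection for x_infinity in (0, e^{beta omega / 2}) solving rho_infinity(x) = rho."""
    lo, hi = 0.0, math.exp(0.5 * beta * omega)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if density_inf(beta, mid, omega) < rho:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def free_energy_inf(beta, rho, omega):
    """f_infinity from (2.21)."""
    x = fugacity_inf(beta, rho, omega)
    return -pressure_inf(beta, x, omega) + rho / beta * math.log(x)

if __name__ == "__main__":
    x = fugacity_inf(1.0, 0.1, 1.0)
    print(x, density_inf(1.0, x, 1.0), free_energy_inf(1.0, 0.1, 1.0))
```

A useful consistency check is that $\partial f_\infty/\partial\rho=\beta^{-1}\log x_\infty$: the terms containing $\partial x_\infty/\partial\rho$ cancel because $\beta x\,\partial P_\infty/\partial x=\rho$ at $x=x_\infty$.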
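Let us denote by $\frac{\partial W_L}{\partial\omega}(\beta,\omega_0)$ the following integral, which makes sense in the norm topology of $B(L^2(\Lambda))$ (see [A-B-N 1]):
$$-\int_0^\beta d\tau\,W_L(\beta-\tau,\omega_0)\,[\mathbf{a}\cdot(\mathbf{p}-\omega_0\mathbf{a})]\,W_L(\tau,\omega_0). \tag{2.22}$$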
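A particular case of the problem treated in [A-B-N 1] is that $\frac{\partial W_L}{\partial\omega}(\beta,\omega_0)$ is even trace class and moreover, for $\delta\omega$ sufficiently small one has:
$$\left\|W_L(\beta,\omega_0+\delta\omega)-W_L(\beta,\omega_0)-\delta\omega\,\frac{\partial W_L}{\partial\omega}(\beta,\omega_0)\right\|_{I_1}=O((\delta\omega)^2). \tag{2.23}$$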
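Using for $P_L(\beta,z,\omega_0)$ the following expression:
$$-\frac{1}{\beta L^3}\,\mathrm{tr}[\log(1-zW_L(\beta,\omega_0))],$$
the estimate (2.23) justifies the definition of the grand canonical magnetization:
$$\Gamma_L(\beta,z,\omega_0):=-\frac{e}{c}\frac{\partial P_L}{\partial\omega}(\beta,z,\omega_0)=-\frac{ez}{c\beta L^3}\,\mathrm{tr}\left[(1-zW_L(\beta,\omega_0))^{-1}\frac{\partial W_L}{\partial\omega}\right]. \tag{2.24}$$
From its definition, one can easily see that $\Gamma_L$ has the same domain of analyticity in $z$. Now let us define the natural candidate for its thermodynamic limit:
$$\Gamma_\infty(\beta,z,\omega_0):=-\frac{e}{c}\frac{\partial P_\infty}{\partial\omega}(\beta,z,\omega_0). \tag{2.25}$$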
Our main technical result is presented in the following lemma:

Lemma 1. Let $K\subset\mathbb{C}\setminus[\exp(\beta\omega/2),\infty)$ be a compact set. Then the grand canonical magnetization admits the thermodynamic limit, i.e.:
$$\lim_{L\to\infty}\sup_{z\in K}|\Gamma_L(\beta,z,\omega)-\Gamma_\infty(\beta,z,\omega)|=0. \tag{2.26}$$
Let us go back to the canonical ensemble. From (2.7), (2.11) and (2.6), one can write down a useful representation of the canonical partition function:
$$Z_L(\beta,\rho,\omega)=\frac{1}{2\pi i}\int_{\mathcal{C}_1}d\xi\,\frac{1}{\xi}\left[\frac{1}{\xi}\exp\left(\frac{\beta}{\rho}P_L(\beta,\xi,\omega)\right)\right]^{N(L)}, \tag{2.27}$$
where $\mathcal{C}_1$ is a contour which surrounds the origin and avoids the cut. The reduced free energy reads:
$$f_L(\beta,\rho,\omega):=-\frac{1}{\beta L^3}\log Z_L(\beta,\rho,\omega). \tag{2.28}$$
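The representation (2.27) is the Cauchy coefficient formula applied to the generating function (2.7), since $\exp(\beta L^3P_L)=\Xi_L$ and $L^3=N(L)/\rho$. A minimal numerical sketch, assuming a hypothetical three-level toy spectrum (not the spectrum of $H_{1,L}(\omega)$) and a circular contour of radius below $e^{\beta E_0}$, compares the contour integral with a direct sum over bosonic occupation numbers; the names and the quadrature are illustrative choices.

```python
import cmath
import math

# Hypothetical toy one-particle spectrum, used for illustration only.
LEVELS = [0.5, 1.5, 2.5]

def grand_partition(beta, z, levels=LEVELS):
    """Xi(z) from the product formula (2.5); z may be complex."""
    out = 1.0 + 0.0j
    for e in levels:
        out /= (1.0 - z * math.exp(-beta * e))
    return out

def canonical_by_contour(beta, n, radius=0.5, points=256, levels=LEVELS):
    """Z_n = (1 / 2 pi i) * contour integral of Xi(xi) xi^{-n-1} over |xi| = radius.

    With xi = radius * e^{i phi} this is the angular average of Xi(xi) xi^{-n},
    approximated here by the trapezoid rule on a uniform grid in phi.
    """
    acc = 0.0 + 0.0j
    for k in range(points):
        xi = radius * cmath.exp(2j * math.pi * k / points)
        acc += grand_partition(beta, xi, levels) * xi ** (-n)
    return (acc / points).real

def canonical_by_counting(beta, n, levels=LEVELS):
    """Z_n as a sum over occupation numbers (n_1, ..., n_J) with n_1 + ... + n_J = n."""
    def rec(j, remaining):
        if j == len(levels) - 1:
            return math.exp(-beta * remaining * levels[j])
        return sum(math.exp(-beta * m * levels[j]) * rec(j + 1, remaining - m)
                   for m in range(remaining + 1))
    return rec(0, n)

if __name__ == "__main__":
    for n in (0, 1, 6):
        print(n, canonical_by_contour(1.0, n), canonical_by_counting(1.0, n))
```

The radius only needs to keep the contour away from the singularities of $\Xi$; the proof of Theorem 2 makes the specific choice (3.1), which passes through the saddle point $x_L$.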
The canonical magnetization is defined as follows:
$$m_L(\beta,\rho,\omega):=\frac{e}{c}\frac{\partial f_L}{\partial\omega}(\beta,\rho,\omega). \tag{2.29}$$
We expect that $m_L$ should be close to the following quantity:
$$\tilde m_L(\beta,\rho,\omega):=\frac{e}{c}\frac{\partial\tilde f_L}{\partial\omega}(\beta,\rho,\omega)=\Gamma_L(\beta,x_L(\beta,\rho,\omega),\omega), \tag{2.30}$$
which converges to $\Gamma_\infty(\beta,x_\infty(\beta,\rho,\omega),\omega)$. We are now able to give our main result:

Theorem 2. Fix $0<\delta<1/2$. For all strictly positive temperatures, bulk densities and magnetic fields, the canonical magnetization $m_L(\beta,\rho,\omega)$ admits the thermodynamic limit. Moreover, there exist two positive constants $C_\delta(\beta,\rho,\omega)$ and $L_\delta(\beta,\rho,\omega)$ such that for all $L\ge L_\delta$ one has:
$$|m_L-\tilde m_L|\le C_\delta L^{-3/2+\delta}. \tag{2.31}$$
Remark. It is clear that $\tilde m_L$ is a much more convenient quantity. If (at least for dilute gases) one were able to write down an expansion for $\tilde m_L$ which takes into account the surface corrections:
$$\tilde m_L=m_\infty+\frac{1}{L}m_S+o(1/L), \tag{2.32}$$
then the estimate (2.31) would imply that the same expansion is true for $m_L$, too.

3. The Proof of Theorem 2

At this point, we shall consider that Lemma 1 is true and give its proof in the next section. In order to simplify the notations, we shall drop the dependence on β, ρ and ω but we shall reintroduce it when needed. The main idea of the proof consists in isolating the principal part of the integral from (2.27). Although this procedure is far from being new (in the physical literature it is known as the Darwin-Fowler method; see [K-U-Z, H] and references therein), we decided to give a rather detailed proof in order to have a clearer image of the remainder from (2.31). Firstly, let us choose the contour $\mathcal{C}_1$ as follows:
$$\mathcal{C}_1:=\{x_Le^{i\phi},\ \phi\in[-\pi,\pi]\}. \tag{3.1}$$
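Using (2.20), the formula (2.27) can be rewritten as:
$$Z_L=\exp(-\beta\tilde f_LL^3)\,\frac{1}{2\pi}\int_{-\pi}^{\pi}d\phi\,\exp\left\{\frac{N(L)\beta\,(P_L(x_Le^{i\phi})-P_L(x_L))}{\rho}\right\}e^{-iN(L)\phi}=\exp(-\beta\tilde f_LL^3)\,N(L)^{-1/2}A_L, \tag{3.2}$$
where $A_L(\beta,\rho,\omega)$ is given by:
$$A_L=\frac{\sqrt{N(L)}}{2\pi}\int_{-\pi}^{\pi}d\phi\,\exp\left\{\frac{N(L)\beta}{\rho}\left(P_L(x_Le^{i\phi})-P_L(x_L)\right)-iN(L)\phi\right\}.$$
[…]

[…] 1 and $\omega\in\Omega$. We claim that $\mathcal{C}$ can be chosen as the union $\mathcal{C}_1\cup\mathcal{C}_2$, where $\mathcal{C}_2$ is given by ($\eta>0$):
$$\{(1+t,\pm\eta)\mid-\eta\le t\le 2d\}\cup\{(1-\eta,t)\mid-\eta\le t\le\eta\},$$
and $\mathcal{C}_1$ is chosen such that if $\xi\in\mathcal{C}_1$, then $|\xi|\ge 2d+1$. It is not difficult to prove that by choosing $\eta$ sufficiently small one has:
$$\sup_{z\in K}\ \sup_{\xi\in\mathcal{C}}\ \sup_{0\le r\le e^{-\beta\omega_0/2}}|(\xi-zr)^{-1}|\le M<\infty, \tag{4.5}$$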
and via the Spectral Theorem, (4.3) holds. If we manage to prove the existence of a numerical constant $c(\beta,K,\omega_0)$ such that for all $\delta\omega\in(0,\omega_1-\omega_0)$ and $z\in K$ one has:
$$\sup_{z}\sup_{\delta\omega}\frac{1}{\delta\omega}|P_L(\beta,z,\omega_0+\delta\omega)-P_L(\beta,z,\omega_0)|\le c(\beta,K,\omega_0), \tag{4.6}$$
then the magnetization will be bounded by the same constant. This estimate is straightforward if a stronger one holds:
$$\sup_{z}\sup_{\xi}\sup_{\delta\omega}\frac{1}{L^3\delta\omega}\left|\mathrm{tr}[g_{\omega_0+\delta\omega}(\xi,z;\beta,\beta)]-\mathrm{tr}[g_{\omega_0}(\xi,z;\beta,\beta)]\right|\le C(\beta,K,\omega_0). \tag{4.7}$$
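Our main task will consist in constructing a trace class operator $A_{\omega_0+\delta\omega}(\xi,z;\beta)$ having the following two properties: $\mathrm{tr}\,A_{\omega_0+\delta\omega}(\xi,z;\beta)=\mathrm{tr}\,g_{\omega_0}(\xi,z;\beta,\beta)$ (i.e. its trace does not depend on $\delta\omega$) and moreover,
$$\sup_{z}\sup_{\xi}\sup_{\delta\omega}\frac{1}{L^3\delta\omega}\left\|g_{\omega_0+\delta\omega}(\xi,z;\beta,\beta)-A_{\omega_0+\delta\omega}(\xi,z;\beta)\right\|_{I_1}\le C'(\beta,K,\omega_0), \tag{4.8}$$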
which would clearly settle the problem. We will see that ($\omega=\omega_0+\delta\omega$) $A_\omega(\xi,z;\beta)$ can be chosen as a product $\tilde g_\omega(\xi,z;\beta,\beta/2)S_\omega(\beta/2)$ where the first factor is bounded, the second one is trace class and moreover:
$$\sup_{z}\sup_{\xi}\left\|g_\omega(\xi,z;\beta,\beta/2)-\tilde g_\omega(\xi,z;\beta,\beta/2)\right\|_{B(L^2)}\le C_1(\beta,K,\omega_0)\,\delta\omega, \tag{4.9}$$
and
$$\|W_L(\beta/2,\omega)-S_\omega(\beta/2)\|_{I_1}\le C_2(\beta,\omega_0)L^3\delta\omega. \tag{4.10}$$
In particular, (4.9) and (4.10) imply:
$$\|S_\omega(\beta/2)\|_{I_1}\le C(\beta,\omega_0)L^3\quad\text{and}\quad\sup_{z}\sup_{\xi}\|\tilde g_\omega(\xi,z;\beta,\beta/2)\|_{B(L^2)}\le C_3(\beta,K,\omega_0). \tag{4.11}$$
Employing these estimates together with the following identity (see (4.1)):
$$g_\omega(\xi,z;\beta,\beta)=g_\omega(\xi,z;\beta,\beta/2)\,W_L(\beta/2,\omega),$$
the proof of (4.8) follows easily. The rest of this subsection is dedicated to the rigorous proofs of these estimates and will be structured in a sequence of technical propositions. We start with a well known result, given without further comment:
Proposition 1. The Dirichlet Laplacian defined in $\Lambda$ admits a trace class semigroup which has an integral kernel given by the following formula:
$$G_{0,L}(\mathbf{x},\mathbf{x}';\beta)=\prod_{j=1}^{3}g_{0,L}(x_j,x_j';\beta), \tag{4.12}$$
where the “one dimensional” kernels read:
$$g_{0,L}(x,x';\beta)=\frac{1}{(2\pi\beta)^{1/2}}\sum_{m\in\mathbb{Z}}\left\{\exp\left[-\frac{(x-x'+2mL)^2}{2\beta}\right]-\exp\left[-\frac{(x+x'-2mL-L)^2}{2\beta}\right]\right\}=:g_{0,\infty}(x,x';\beta)+\zeta_{0,L}(x,x';\beta). \tag{4.13}$$
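Using the previous proposition, one can write:
$$G_{0,L}(\mathbf{x},\mathbf{x}';\beta)=G_{0,\infty}(\mathbf{x},\mathbf{x}';\beta)+Z_{0,L}(\mathbf{x},\mathbf{x}';\beta), \tag{4.14}$$
where:
$$G_{0,\infty}(\mathbf{x},\mathbf{x}';\beta)=\frac{1}{(2\pi\beta)^{3/2}}\exp\left(-\frac{|\mathbf{x}-\mathbf{x}'|^2}{2\beta}\right). \tag{4.15}$$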
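The image-sum form of the one dimensional Dirichlet kernel in (4.13) can be compared with its eigenfunction expansion. The sketch below is an illustration under explicit assumptions: the generator is taken to be $-\frac{1}{2}\frac{d^2}{dx^2}$ on $(-L/2,L/2)$ with Dirichlet conditions (consistent with the Gaussian $(2\pi\beta)^{-1/2}\exp(-(x-x')^2/(2\beta))$), with eigenfunctions $\sqrt{2/L}\,\sin(k\pi(x+L/2)/L)$ and eigenvalues $\frac{1}{2}(k\pi/L)^2$; the truncation parameters and function names are arbitrary choices made here.

```python
import math

def kernel_images(x, xp, beta, L, m_max=50):
    """One dimensional Dirichlet kernel g_{0,L} from the image sum (4.13)."""
    total = 0.0
    for m in range(-m_max, m_max + 1):
        total += math.exp(-(x - xp + 2 * m * L) ** 2 / (2 * beta))
        total -= math.exp(-(x + xp - 2 * m * L - L) ** 2 / (2 * beta))
    return total / math.sqrt(2 * math.pi * beta)

def kernel_spectral(x, xp, beta, L, k_max=200):
    """Same kernel from the Dirichlet eigenfunctions of -(1/2) d^2/dx^2 on (-L/2, L/2)."""
    total = 0.0
    for k in range(1, k_max + 1):
        energy = 0.5 * (k * math.pi / L) ** 2
        phi_x = math.sqrt(2.0 / L) * math.sin(k * math.pi * (x + L / 2) / L)
        phi_xp = math.sqrt(2.0 / L) * math.sin(k * math.pi * (xp + L / 2) / L)
        total += math.exp(-beta * energy) * phi_x * phi_xp
    return total

if __name__ == "__main__":
    print(kernel_images(0.1, -0.2, 0.1, 1.0), kernel_spectral(0.1, -0.2, 0.1, 1.0))
    # The kernel vanishes on the boundary x = L/2:
    print(kernel_images(0.5, 0.2, 0.1, 1.0))
```

The image sum converges fastest for small $\beta$ and the spectral sum for large $\beta$, which is why the estimates on the remainder $Z_{0,L}$ are carried out on the image form.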
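The purpose of the next proposition is to give a few properties of smoothness and localization of the remainder $Z_{0,L}(\mathbf{x},\mathbf{x}';\beta)$:

Proposition 2. For all $\beta>0$ and $L>1$, there exist two positive numerical constants $c_1$ and $c_2$ such that:
i. $\left|\frac{\partial Z_{0,L}}{\partial x_j}(\mathbf{x},\mathbf{x}';\beta)\right|\le c_1\frac{1+\beta}{\beta^{1/2}}\,G_{0,\infty}(\mathbf{x},\mathbf{x}';c_2\beta)$;
ii. $\max\left\{\left|\frac{\partial Z_{0,L}}{\partial\beta}(\mathbf{x},\mathbf{x}';\beta)\right|,\ \left|\frac{\partial^2Z_{0,L}}{\partial x_j\partial x_k'}(\mathbf{x},\mathbf{x}';\beta)\right|\right\}\le c_1\frac{1+\beta}{\beta}\,G_{0,\infty}(\mathbf{x},\mathbf{x}';c_2\beta)$;
iii. $|Z_{0,L}|(\mathbf{x},\mathbf{x}';\beta)\le c_1(1+\beta)\,G_{0,\infty}(\mathbf{x},\mathbf{x}';c_2\beta)$.

Proof. For $x,x'\in(-L/2,L/2)$ define:
$$\zeta_1(x,x';\beta)=\frac{1}{\sqrt{2\pi\beta}}\sum_{m\in\mathbb{Z}\setminus\{0\}}\exp\left[-\frac{(x-x'+2mL)^2}{2\beta}\right],\qquad\zeta_2(x,x';\beta)=\frac{1}{\sqrt{2\pi\beta}}\sum_{m\in\mathbb{Z}}\exp\left[-\frac{[x+x'-(2m+1)L]^2}{2\beta}\right]. \tag{4.16}$$
It is clear that if one obtains uniform estimates in $L$ and $\beta$ for these two quantities, the same would be true for $Z_{0,L}$, too. A very useful estimate is the following:
$$\forall t\ge 0,\qquad te^{-t}=2(t/2)e^{-t/2}e^{-t/2}\le 2e^{-t/2}. \tag{4.17}$$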
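Then we have the inequalities:
$$\left|\frac{\partial\zeta_1}{\partial x}(x,x';\beta)\right|\le\frac{\mathrm{const.}}{\beta}\sum_{m\ne 0}\exp\left[-\frac{(x-x'+2mL)^2}{4\beta}\right],\qquad\left|\frac{\partial^2\zeta_1}{\partial x\partial x'}(x,x';\beta)\right|\le\frac{\mathrm{const.}}{\beta^{3/2}}\sum_{m\ne 0}\exp\left[-\frac{(x-x'+2mL)^2}{4\beta}\right], \tag{4.18}$$
and similarly for $\zeta_2$. At this point we have to control the summation over $m$. Let us prove the following inequality:
$$\sum_{m\ne 0}\exp\left[-\frac{(x-x'+2mL)^2}{4\beta}\right]\le c_1(1+\beta)\exp\left[-\frac{(x-x')^2}{c_2\beta}\right]. \tag{4.19}$$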
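The inequality (4.19) can be probed numerically before reading its proof. The sketch below assumes the explicit constants $c_1=2$ and $c_2=4$, which are the values reached at the end of the estimate (4.21); the grid, the parameter ranges and the function names are arbitrary illustrative choices, and a finite scan of this kind is of course only a sanity check, not a proof.

```python
import math

def image_tail(d, beta, L, m_max=200):
    """Left-hand side of (4.19) with d = x - x': sum over m != 0 of exp(-(d + 2mL)^2 / (4 beta))."""
    return sum(math.exp(-(d + 2 * m * L) ** 2 / (4 * beta))
               for m in range(-m_max, m_max + 1) if m != 0)

def gaussian_bound(d, beta):
    """Right-hand side of (4.19) with the assumed constants c1 = 2 and c2 = 4."""
    return 2.0 * (1.0 + beta) * math.exp(-d * d / (4 * beta))

def worst_ratio(sides=(1.0, 2.0, 5.0), betas=(0.05, 1.0, 20.0), grid=41):
    """Largest value of LHS / RHS over a grid of |d| < L, for L >= 1."""
    worst = 0.0
    for L in sides:
        for beta in betas:
            for i in range(grid):
                d = -L + 2.0 * L * (i + 0.5) / grid  # cell midpoints, so |d| < L
                worst = max(worst, image_tail(d, beta, L) / gaussian_bound(d, beta))
    return worst

if __name__ == "__main__":
    print(worst_ratio())
```

A ratio that stays below one on the scanned grid is consistent with (4.19); the ratio is largest when $|x-x'|$ is close to $L$, where the nearest image contributes almost as much as the free Gaussian.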
Because $|x-x'|<L$ one has ($|m|,L\ge 1$):
$$(x-x'+2mL)^2=(x-x')^2+4mL(x-x')+4m^2L^2\ge(x-x')^2+4(|m|-1). \tag{4.20}$$
Therefore:
$$\sum_{m\ne 0}\exp\left[-\frac{(x-x'+2mL)^2}{4\beta}\right]\le\exp\left[-\frac{(x-x')^2}{4\beta}\right]2\left[1+\sum_{m\ge 1}\exp\left(-\frac{m}{\beta}\right)\right]=2\exp\left[-\frac{(x-x')^2}{4\beta}\right]\left[1+\frac{\exp\left(-\frac{1}{2\beta}\right)}{2\sinh\left(\frac{1}{2\beta}\right)}\right]\le 2(1+\beta)\exp\left[-\frac{(x-x')^2}{4\beta}\right]. \tag{4.21}$$
In order to control $\zeta_2$, one has to study the following quantity:
$$A:=\sum_{m\in\mathbb{Z}}\exp\left[-\frac{[x+x'-(2m+1)L]^2}{4\beta}\right]. \tag{4.22}$$
Denote $\xi=x+L/2$ and $\xi'=x'+L/2$; then $0<\xi,\xi'<L$, $x-x'=\xi-\xi'$ and $(\xi+\xi')^2\ge(\xi-\xi')^2$. It follows:
$$[x+x'-(2m+1)L]^2=[\xi+\xi'-2(m+1)L]^2=(\xi+\xi')^2-4(m+1)L(\xi+\xi')+4(m+1)^2L^2. \tag{4.23}$$
If $m\le-1$, then:
$$[x+x'-(2m+1)L]^2\ge(x-x')^2+4(|m|-1).$$
If $m=0$, then:
$$[x+x'-(2m+1)L]^2=[(L/2-x)+(L/2-x')]^2\ge(x-x')^2.$$
If $m\ge 1$, then:
$$[x+x'-(2m+1)L]^2\ge(\xi+\xi')^2+4L^2(m+1)(m-1)\ge(x-x')^2+4(m-1),$$
and we can repeat the summation procedure used in (4.19). Putting all these things together, the proof is completed. □

The next proposition is a variant of the perturbation theory for self-adjoint Gibbs semigroups (see [H-P]). Instead of starting with a perturbation of its generator, we start with an approximation of the semigroup. Although simple, this proposition contains the main technical core of our paper.

Proposition 3. Let $\mathcal{H}:=L^2(\Lambda)$ and let $H$ be a self-adjoint and positive operator having the domain $D$. Fix $\beta_0>0$. Assume that there exists a map $0<\beta\le\beta_0\to S(\beta)\in B(\mathcal{H})$ with the following properties: A. sup0 […] [1/β]):
$$T_n(\beta):=\int_{1/n}^{\beta-1/n}d\tau\,\exp[-(\beta-\tau)H]\,R(\tau)$$
converges in norm; let $T(\beta)$ be its limit;
ii. The following equality holds in $B(\mathcal{H})$:
$$\exp(-\beta H)=S(\beta)-T(\beta). \tag{4.25}$$
Proof. i. The norm convergence is assured by the integrability condition imposed on the norm of $R(\beta)$. Moreover, when $\beta$ is near zero:
$$\sup_n\|T_n(\beta)\|\le c(\alpha)\beta^{1-\alpha}, \tag{4.26}$$
therefore the same is true for $T(\beta)$.
ii. Let $0<\beta_1<\beta<\beta_0$. If $n>[1/\beta_1]$ and $\phi\in\mathcal{H}$, define the vector:
$$\psi_n(\beta):=\exp(-\beta H)\phi-S(\beta)\phi+T_n(\beta)\phi. \tag{4.27}$$
From (4.26) and condition A it follows:
$$\lim_{n\to\infty}\psi_n(\beta)=:\psi(\beta)=\exp(-\beta H)\phi-S(\beta)\phi+T(\beta)\phi\quad\text{and}\quad\sup_{n,\beta}\|\psi_n(\beta)\|\le\mathrm{const}. \tag{4.28}$$
Define $f_n(\beta)=\|\psi_n(\beta)\|^2$ and $f(\beta)=\|\psi(\beta)\|^2$. From the strong convergence to one of $S(\beta)$ when $\beta$ goes to zero and from the norm convergence to zero of $T(\beta)$, it follows that $\lim_{\beta\searrow 0}f(\beta)=0$. If we manage to prove that $f(\beta)$ is decreasing, then it would be identically zero and this would end the proof. □
Notice that $T_n(\beta)$ is norm differentiable and:
$$\frac{\partial T_n}{\partial\beta}\phi+HT_n(\beta)\phi=\exp\left(-\frac{1}{n}H\right)R(\beta-1/n)\phi. \tag{4.29}$$
The positivity of $H$ implies: 1 ∂fn ≤ 2 […]

[…] ($t>0$, $n\ge 1$):
$$t^n\exp\left(-\frac{t^2}{\beta}\right)\le\mathrm{const}(n)\,\beta^{n/2}\exp\left(-\frac{t^2}{2\beta}\right). \tag{4.38}$$
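Let us get back to the proof of i. Firstly, because
$$|e^{i\omega\varphi(\mathbf{x},\mathbf{x}')}G_{0,L}(\mathbf{x},\mathbf{x}';\beta)|\le G_{0,\infty}(\mathbf{x},\mathbf{x}';\beta),$$
it follows that $S(\beta)$ obeys the condition (4.34) with $C\le 1$, which means that it is uniformly bounded in $\beta>0$. Let us show now that the operator given by the integral kernel
$$(e^{i\omega\varphi(\mathbf{x},\mathbf{x}')}-1)\,G_{0,L}(\mathbf{x},\mathbf{x}';\beta)$$
converges in norm to zero. Let us remark first that $|\varphi(\mathbf{x},\mathbf{x}')|\le L|\mathbf{x}-\mathbf{x}'|$ and moreover:
$$|e^{i\omega\varphi(\mathbf{x},\mathbf{x}')}-1|\le\omega|\varphi(\mathbf{x},\mathbf{x}')|\le\omega L|\mathbf{x}-\mathbf{x}'|. \tag{4.39}$$
Then (using (4.38) with $n=1$):
$$|(e^{i\omega\varphi(\mathbf{x},\mathbf{x}')}-1)\,G_{0,L}(\mathbf{x},\mathbf{x}';\beta)|\le\mathrm{const}\,L\,\beta^{1/2}\,G_{0,\infty}(\mathbf{x},\mathbf{x}';2\beta), \tag{4.40}$$
therefore its operator norm behaves near zero like $\beta^{1/2}$ at least. The proof of i is now straightforward.
ii. We will prove that the map is in fact norm differentiable. For $\beta>0$ and $\delta\beta$ sufficiently small, one has:
$$S(\beta+\delta\beta)f-S(\beta)f=\delta\beta\int_\Lambda d\mathbf{x}'\,e^{i\omega\varphi(\cdot,\mathbf{x}')}\frac{\partial G_{0,L}}{\partial\beta}(\cdot,\mathbf{x}';\beta)f(\mathbf{x}')+\frac{(\delta\beta)^2}{2}\int_\Lambda d\mathbf{x}'\,e^{i\omega\varphi(\cdot,\mathbf{x}')}\frac{\partial^2G_{0,L}}{\partial\beta^2}(\cdot,\mathbf{x}';\tilde\beta)f(\mathbf{x}'), \tag{4.41}$$
where $\tilde\beta$ is situated between $\beta$ and $\beta+\delta\beta$. It is not difficult now to see that the “operator derivative” is an integral operator whose kernel is the derivative with respect to $\beta$ of the initial one.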
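iii. Denote by $D_0$ the common domain of essential self-adjointness for $H_{1,L}(\omega)$, $\omega\ge 0$:
$$D_0=\{\psi\in C^2(\Lambda)\cap C^1(\overline{\Lambda})\mid\psi|_{\partial\Lambda}=0,\ \Delta\psi\in L^2(\Lambda)\}. \tag{4.42}$$
The action of $H_{1,L}(\omega)$ on a function from $D_0$ is as follows:
$$[H_{1,L}(\omega)\psi](\mathbf{x})=-(\Delta\psi)(\mathbf{x})+2i\omega\,\mathbf{a}(\mathbf{x})\cdot(\nabla\psi)(\mathbf{x})+\omega^2\mathbf{a}^2(\mathbf{x})\psi(\mathbf{x}). \tag{4.43}$$
Now take $f\in C_0^\infty(\Lambda)$. After integration by parts, using (4.33) and the fact that $G_{0,L}(\mathbf{x},\mathbf{x}';\beta)$ solves the heat equation in the interior of $\Lambda$, one obtains:
$$\langle H_{1,L}(\omega)\psi,S(\beta)f\rangle=\int_{\Lambda^2}d\mathbf{x}'\,d\mathbf{x}\,\overline{\psi(\mathbf{x})}f(\mathbf{x}')\,e^{i\omega\varphi(\mathbf{x},\mathbf{x}')}\left[-\Delta_{\mathbf{x}}+2i\omega\,\mathbf{a}(\mathbf{x}-\mathbf{x}')\cdot\nabla_{\mathbf{x}}+\omega^2\mathbf{a}^2(\mathbf{x}-\mathbf{x}')\right]G_{0,L}(\mathbf{x},\mathbf{x}';\beta)=-\langle\psi,S'(\beta)f\rangle+\langle\psi,R(\beta)f\rangle. \tag{4.44}$$
The result follows easily after a density argument and with the remark that:
$$\mathbf{a}(\mathbf{x}-\mathbf{x}')\cdot\nabla_{\mathbf{x}}G_{0,\infty}(\mathbf{x},\mathbf{x}';\beta)=0. \tag{4.45}$$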
Finally, let us notice that the norm of $R(\beta)$ is independent of $L$ and is integrable at zero. To see that, one has to employ the estimates from Proposition 2, (4.38) and the criterion from (4.34). □

In order to perform a similar perturbative treatment of the semigroup near a nonzero magnetic field, we need the estimate given by the next proposition:

Proposition 5. Let $\mathbf{n}$ be a unit vector in $\mathbb{R}^3$. Then there exist three positive numerical constants $s$, $c_4$ and $c_5$ such that for all $\omega\in\Omega$, $\mathbf{x},\mathbf{x}'\in\Lambda$ and $\beta>0$, one has the following uniform estimate in $L$:
$$|\mathbf{n}\cdot(-i\nabla-\omega\mathbf{a}(\mathbf{x}))\,G_{\omega,L}(\mathbf{x},\mathbf{x}';\beta)|\le c_4\frac{(1+\beta)^s}{\beta^2}\exp\left(-\frac{|\mathbf{x}-\mathbf{x}'|^2}{c_5\beta}\right). \tag{4.46}$$
Proof. Proposition 3 allows us to write down the following integral equation:
$$W_L(\beta,\omega)=S(\beta)-\int_0^\beta d\tau\,W_L(\beta-\tau,\omega)\,R(\tau). \tag{4.47}$$
Because $S(\beta)$ and $W_L$ are self-adjoint, one can rewrite (4.47) as:
$$W_L(\beta,\omega)=S(\beta)-\int_0^\beta d\tau\,R^*(\tau)\,W_L(\beta-\tau,\omega). \tag{4.48}$$
In terms of integral kernels, (4.48) reads:
$$G_{\omega,L}(\mathbf{x},\mathbf{x}';\beta)=S(\mathbf{x},\mathbf{x}';\beta)-\int_0^\beta d\tau\int_\Lambda d\mathbf{y}\,R^*(\mathbf{x},\mathbf{y};\tau)\,G_{\omega,L}(\mathbf{y},\mathbf{x}';\beta-\tau), \tag{4.49}$$
where the equality is between continuous functions in $C(\Lambda\times\Lambda)$ and the integral in $\tau$ has to be understood as “$\int_\epsilon^{\beta-\epsilon}$” in the limit $\epsilon\searrow 0$.
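Because the kernel of $R^*$ reads $R^*(\mathbf{x},\mathbf{y};\tau)=\overline{R(\mathbf{y},\mathbf{x};\tau)}$, by direct computation one can obtain the estimate (see Proposition 2):
$$|\mathbf{n}\cdot(-i\nabla-\omega\mathbf{a}(\mathbf{x}))\,R^*(\mathbf{x},\mathbf{x}';\tau)|\le c_4'\frac{(1+\tau)^{s'}}{\tau^2}\exp\left(-\frac{|\mathbf{x}-\mathbf{x}'|^2}{c_5'\tau}\right), \tag{4.50}$$
where one has to apply (4.33) and then use the estimates from Proposition 2; finally, introducing (4.50), (2.3), (4.35) and (4.38) in (4.49), and because the singularity in $\tau$ is integrable, the result for $G_{\omega,L}$ is straightforward. □

Take $\omega=\omega_0+\delta\omega\in\Omega$. The analogue of Proposition 4 at nonzero magnetic field is:

Proposition 6. The bounded operator denoted by $S_\omega(\beta)$ and given by the kernel $e^{i\delta\omega\varphi(\mathbf{x},\mathbf{x}')}G_{\omega_0,L}(\mathbf{x},\mathbf{x}';\beta)$ has the following properties:
i. $(0,\infty)\ni\beta\mapsto S_\omega(\beta)$ is strongly differentiable and $\mathrm{s\text{-}lim}_{\beta\searrow 0}S_\omega(\beta)=1$;
ii. $\mathrm{Ran}\,S_\omega(\beta)\subset\mathrm{Dom}(H_{1,L}(\omega))$ and $\frac{\partial}{\partial\beta}S_\omega(\beta)f+H_{1,L}(\omega)S_\omega(\beta)f=R_\omega(\beta)f$, where $R_\omega(\beta)$ is given by:
$$R_\omega(\mathbf{x},\mathbf{x}';\beta)=e^{i(\delta\omega)\varphi(\mathbf{x},\mathbf{x}')}\left[(\delta\omega)^2\mathbf{a}^2(\mathbf{x}-\mathbf{x}')\,G_{\omega_0,L}(\mathbf{x},\mathbf{x}';\beta)+2(\delta\omega)\,\mathbf{a}(\mathbf{x}-\mathbf{x}')\cdot(i\nabla_{\mathbf{x}}+\omega_0\mathbf{a}(\mathbf{x}))\,G_{\omega_0,L}(\mathbf{x},\mathbf{x}';\beta)\right]. \tag{4.51}$$
Proof. i. Rewriting the integral kernel of $S_\omega(\beta)$ as:
$$S_\omega(\mathbf{x},\mathbf{x}';\beta)=G_{\omega_0,L}(\mathbf{x},\mathbf{x}';\beta)+\left[e^{i(\delta\omega)\varphi(\mathbf{x},\mathbf{x}')}-1\right]G_{\omega_0,L}(\mathbf{x},\mathbf{x}';\beta), \tag{4.52}$$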
and using the diamagnetic inequality, one can reproduce the argument from (4.40) in order to prove that the second term converges in norm to zero. Clearly, the first one converges strongly to one. If $\{\psi_j\}$ and $\{E_j\}$ denote the sets of eigenvectors and eigenvalues of $H_{1,L}(\omega)$, then:
$$G_{\omega,L}(\mathbf{x},\mathbf{x}';\beta)=\sum_je^{-\beta E_j}\psi_j(\mathbf{x})\overline{\psi_j(\mathbf{x}')}, \tag{4.53}$$
where the series is absolutely and uniformly convergent on $\Lambda\times\Lambda$. This can be seen from the fact that the semigroup is trace class and that the eigenfunctions belong to $D_0$ and admit the estimate:
$$|\psi_j|(\mathbf{x})\le\mathrm{const}(L)\,(E_j+1), \tag{4.54}$$
obtained from the fact that the resolvent $[H_{1,L}(\omega)+1]^{-1}$ is bounded between $L^2(\Lambda)$ and $L^\infty(\Lambda)$. It follows that uniformly in $\Lambda$:
$$\left|G_{\omega,L}(\cdot,\cdot;\beta+\delta\beta)-G_{\omega,L}(\cdot,\cdot;\beta)-\delta\beta\,\frac{\partial G_{\omega,L}}{\partial\beta}(\cdot,\cdot;\beta)\right|\le\mathrm{const}(L)\,(\delta\beta)^2, \tag{4.55}$$
which is sufficient for the strong differentiability (see also (4.41)). ii. One has to make the same steps as in the proof of the third point of Proposition 4. As for the norm of R(β), let us see that is independent of L and is integrable in zero: from
(4.51), (4.46), (4.38) and (2.3), one can obtain an estimate on the kernel of $R_\omega(\beta)$ of the following form:
$$|R_\omega(\mathbf{x},\mathbf{x}';\beta)|\le c_9\,\delta\omega\,(1+\beta)^s\,G_{0,\infty}(\mathbf{x},\mathbf{x}';c_{10}\beta), \tag{4.56}$$
which implies that its $B(L^2(\Lambda))$ norm is bounded by a constant multiplied by $\delta\omega$ (see (4.34)). □

We now give without proof a result which provides sufficient conditions for an operator defined in $B(L^2(\Lambda))$ to be trace class:

Proposition 7. Let $\{T_n\}$ be a sequence of trace class operators, converging to $T$ in $B(L^2(\Lambda))$. If $\sup_n\|T_n\|_{I_1}\le c<\infty$, then $T\in I_1$ and $\|T\|_{I_1}\le c$.

Remark. Assume that an operator $T$ is defined by a $B(L^2(\Lambda))$-norm Riemann integral on the interval $[a,b]$, with a continuous trace class integrand $S(t)$. If $\int_a^bdt\,\|S(t)\|_{I_1}\le c<\infty$, then $T$ is trace class and
$$\|T\|_{I_1}\le c,\qquad\mathrm{tr}\,T=\int_a^bdt\,\mathrm{tr}\,S(t).$$
Denote by $\tilde R(\omega,\beta)$ the bounded operator given by the kernel:
$$\tilde R(\mathbf{x},\mathbf{x}';\beta)=2\,\mathbf{a}(\mathbf{x}-\mathbf{x}')\cdot(-i\nabla-\omega_0\mathbf{a}(\mathbf{x}))\,G_{\omega_0,L}(\mathbf{x},\mathbf{x}';\beta). \tag{4.57}$$
Among other things, the next proposition proves (4.10):

Proposition 8. Take $\beta>0$ and $\omega=\omega_0+\delta\omega\in\Omega$.
i. The operator $S_\omega(\beta)$ is trace class and moreover, there exists a positive numerical constant $c$ such that:
$$\|W_L(\beta,\omega)-S_\omega(\beta)\|_{B(L^2(\Lambda))}\le c\,\delta\omega \tag{4.58}$$
and
$$\|W_L(\beta,\omega)-S_\omega(\beta)\|_{I_1}\le c\,\delta\omega\,L^3. \tag{4.59}$$
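ii. For all $\mathbf{x},\mathbf{x}'\in\Lambda$ and uniformly in $L$, one has:
$$G_{\omega,L}(\mathbf{x},\mathbf{x}';\beta)=e^{i\delta\omega\varphi(\mathbf{x},\mathbf{x}')}G_{\omega_0,L}(\mathbf{x},\mathbf{x}';\beta)+\delta\omega\int_0^\beta d\tau\int_\Lambda d\mathbf{y}\,e^{i\delta\omega\varphi(\mathbf{x},\mathbf{y})}G_{\omega_0,L}(\mathbf{x},\mathbf{y};\beta-\tau)\,e^{i\delta\omega\varphi(\mathbf{y},\mathbf{x}')}\tilde R(\mathbf{y},\mathbf{x}';\tau)+O((\delta\omega)^2). \tag{4.60}$$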
Proof. i. We know that as bounded operators:
$$W_L(\beta,\omega)=S_\omega(\beta)-\int_0^\beta d\tau\,W_L(\beta-\tau,\omega)\,R_\omega(\tau). \tag{4.61}$$
We have already seen that the $B(L^2(\Lambda))$ norm of $R_\omega(\tau)$ is bounded by a constant multiplied by $\delta\omega$ (see (4.56)). Its Hilbert-Schmidt norm is bounded by (see (4.37)):
$$\|R_\omega(\tau)\|_{I_2}\le\mathrm{const}\,\delta\omega\,\frac{(1+\tau)^s}{\tau^{3/4}}\,L^{3/2}. \tag{4.62}$$
For the semigroup, we know that $\|W_L\|_{B(L^2(\Lambda))}\le 1$ and from (2.3) and (4.37) it follows that:
$$\|W_L(\beta-\tau,\omega)\|_{I_2}\le\mathrm{const}\,\frac{1}{(\beta-\tau)^{3/4}}\,L^{3/2}. \tag{4.63}$$
With the help of the well known inequality $\|AB\|_{I_1}\le\|A\|_{I_2}\|B\|_{I_2}$, it follows that in both situations ($B(L^2)$ and $I_1(L^2)$) the singularities in $\tau$ are integrable (see the previous remark), and the desired bounds follow easily.
ii. The formula (4.60) is obtained from (4.61) by isolating the term which contains $\delta\omega$. □

The next proposition gives sufficient conditions for the trace of a trace class integral operator to equal the integral of the kernel's diagonal (see [R-S 1]):

Proposition 9. Let $T\in I_1(L^2(\Lambda))$, given by the integral kernel $T(\mathbf{x},\mathbf{x}')\in C(\Lambda\times\Lambda)$. Then:
$$\mathrm{tr}\,T=\int_\Lambda d\mathbf{x}\,T(\mathbf{x},\mathbf{x}). \tag{4.64}$$
The rest of this subsection is dedicated to the proof of (4.9). Fix $\beta,\tau>0$. Let $0<\omega_0<\omega_1$ and let $\Omega=[\omega_0,\omega_1]$. Clearly, $g_\omega(\xi,z;\beta,\tau)$ is trace class and admits a continuous integral kernel given by the following series, which is absolutely and uniformly convergent on $\Lambda\times\Lambda$ (see also (4.53)):
$$T_\omega(\mathbf{x},\mathbf{x}')=\sum_j[\xi-z\exp(-\beta E_j)]^{-1}z\exp(-\tau E_j)\,\psi_j(\mathbf{x})\overline{\psi_j(\mathbf{x}')}. \tag{4.65}$$
Notice that in order to simplify the notations, we did not specify the dependence on $\xi$, $z$, $\beta$ and $\tau$. Let us start with the equation satisfied by $T_\omega(\mathbf{x},\mathbf{x}')$:

Proposition 10. As continuous functions:
$$\xi T_\omega(\mathbf{x},\mathbf{x}')-z\int_\Lambda d\mathbf{y}\,G_{\omega,L}(\mathbf{x},\mathbf{y};\beta)\,T_\omega(\mathbf{y},\mathbf{x}')=z\,G_{\omega,L}(\mathbf{x},\mathbf{x}';\tau). \tag{4.66}$$
Proof. The above equality is nothing but the rewriting in terms of integral kernels of an identity between bounded operators:
$$[\xi-zW_L(\beta,\omega)]\,g_\omega(\xi,z;\beta,\tau)=zW_L(\tau,\omega). \tag{4.67}$$
□

For further purposes, we shall prove that for all $L\ge 1$, $|T_\omega(\mathbf{x},\mathbf{x}')|\sim e^{-\alpha|\mathbf{x}-\mathbf{x}'|}$ for some positive $\alpha$. We need first a few definitions: let $\rho(\mathbf{x}):=(1+\mathbf{x}^2)^{1/2}$ and $\alpha\ge 0$. It is known that the partial derivatives up to the second order of $\rho$ are bounded by a numerical constant and moreover:
$$\forall\mathbf{x}\in\mathbb{R}^3,\qquad e^{\pm\alpha\rho(\mathbf{x})}e^{\mp\alpha|\mathbf{x}|}\le\mathrm{const}(\alpha). \tag{4.68}$$
Fix $\mathbf{x}_0\in\Lambda$. Denote by $A(\alpha)$ the operator of multiplication by $e^{-\alpha\rho(\cdot-\mathbf{x}_0)}$. Then $A(\alpha)$ and $A^{-1}(\alpha)=A(-\alpha)$ are bounded operators and leave $D_0$ invariant (see (4.42)). A useful result is contained in the next proposition:
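Proposition 11. i. The operator $A(\alpha)W_L(\tau,\omega)A(-\alpha)$ belongs to $B(L^2)$, and has a norm which is uniformly bounded in $L$, $\mathbf{x}_0\in\Lambda$, $\omega\in\Omega$ and $0<\tau\le\beta$;
ii. The operator $A(\alpha)W_L(\beta,\omega)A(-\alpha)$ belongs to $B(L^2,L^\infty)$, having a norm which is uniformly bounded in $L$, $\omega$ and $\mathbf{x}_0$.

Proof. i. The inequality (4.68) allows us to replace $A$ with the multiplication operator given by $e^{\alpha|\mathbf{x}-\mathbf{x}_0|}$. Let $\psi\in L^2(\Lambda)$ and define:
$$\phi:=A(\alpha)W_L(\tau,\omega)A(-\alpha)\psi=\int_\Lambda d\mathbf{y}\,e^{-\alpha\rho(\cdot-\mathbf{x}_0)}G_{\omega,L}(\cdot,\mathbf{y};\tau)\,e^{\alpha\rho(\mathbf{y}-\mathbf{x}_0)}\psi(\mathbf{y}). \tag{4.69}$$
Let us remark an elementary estimate, which is true for all $0<\tau\le\beta$:
$$e^{\alpha|\mathbf{x}-\mathbf{x}'|}\exp\left(-\frac{|\mathbf{x}-\mathbf{x}'|^2}{4\tau}\right)\le\mathrm{const}(\alpha,\beta). \tag{4.70}$$
Applying (4.68), the triangle inequality, (2.3) and (4.70) one obtains:
$$|\phi|(\mathbf{x})\le\mathrm{const}(\alpha)\int_\Lambda d\mathbf{x}'\,e^{\alpha|\mathbf{x}-\mathbf{x}'|}\,|G_{\omega,L}(\mathbf{x},\mathbf{x}';\tau)|\,|\psi|(\mathbf{x}')\le\mathrm{const}(\alpha,\beta)\int_\Lambda d\mathbf{x}'\,G_{0,\infty}(\mathbf{x},\mathbf{x}';2\tau)\,|\psi|(\mathbf{x}'). \tag{4.71}$$
The result follows from (4.36) and (4.34).
ii. We apply the Schwarz inequality in (4.71) with $\tau=\beta$ and then use (4.37). □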
Proposition 12. Under the same conditions as above, there exists a sufficiently small $0<\alpha<1$ such that the following inequalities are true in $B(L^2(\Lambda))$, uniformly in $L$, $\omega$ and $\mathbf{x}_0$:
i. $\|W_L(\beta,\omega)-A(\alpha)W_L(\beta,\omega)A(-\alpha)\|\le\alpha\,\mathrm{const}(\beta)$;
ii. Uniformly in $\xi\in\mathcal{C}$ and $z\in K$ one has: $\|A(\alpha)[\xi-zW_L(\beta,\omega)]^{-1}A(-\alpha)\|_{B(L^2(\Lambda))}\le\mathrm{const}(\beta)$.

Proof. i. Let $S(\beta)=A(\alpha)W_L(\beta,\omega)A(-\alpha)$. We will see that $S(\beta)$ obeys the conditions of Proposition 3. Condition A follows from Proposition 11. Then $S(\beta)$ is strongly differentiable, has its range included in the domain of $H_{1,L}(\omega)$ and converges strongly to one. Define $B:=H_{1,L}(\omega)-A(\alpha)H_{1,L}(\omega)A(-\alpha)$, or in another form:
$$B=2i\alpha(\mathbf{p}-\omega\mathbf{a})\cdot\nabla\rho(\cdot-\mathbf{x}_0)+\alpha(\Delta\rho)(\cdot-\mathbf{x}_0)-\alpha^2|\nabla\rho(\cdot-\mathbf{x}_0)|^2. \tag{4.72}$$
A well known result says (see [S]) that $\left\|B[H_{1,L}(\omega)+1]^{-1/2}\right\|\le\mathrm{const}$. By direct computation, $R(\tau)=B\,A(\alpha)W_L(\tau,\omega)A(-\alpha)$; a rough estimate gives $\|R(\tau)\|\le\mathrm{const}(L)/\sqrt{\tau}$ and even if the constant behaves badly with $L$, the norm is integrable at zero with respect to $\tau$. In conclusion:
$$W_L(\beta,\omega)-S(\beta)=-\int_0^\beta d\tau\,W_L(\beta-\tau,\omega)R(\tau)=-\int_0^\beta d\tau\,W_L(\beta-\tau,\omega)\,B\,S(\tau), \tag{4.73}$$
On the Magnetization of a Charged Bose Gas in the Canonical Ensemble
23
where the integral converges in norm. But uniformly in L and x0 there exists a numerical constant such that: const . ||WL (β − τ, ω)B|| ≤ α √ β −τ
(4.74)
From (4.74), (4.73) and point i of Proposition 11, the needed estimate follows. ii. Using point i, the estimate (4.3) and the identity (α small enough): A(α)[ξ − zWL (β, ω)]−1 A(−α) = [ξ − zA(α)WL (β, ω)A(−α)]−1 =
X [ξ − zWL (β, ω)]−1 zj j ≥0
n
· [A(α)WL (β, ω)A(−α) − WL (β, ω)][ξ − zWL (β, ω)]−1 the result follows.
oj
,
t u
Corollary 1. The operator $A(\alpha)g_\omega(\xi,z;\beta,\tau)A(-\alpha)$ belongs to $B(L^2,L^\infty)$ if $\alpha$ is small enough, and uniformly in $\xi$, $z$, $x_0$, $\omega$ and $L$ one has: i. $\|A(\alpha)g_\omega(\xi,z;\beta,\tau)A(-\alpha)\|\le \mathrm{const}(\beta,\tau)$; ii. $\int_\Lambda dy\, e^{2\alpha|y-x_0|}\,|T_\omega(x_0,y)|^2\le \mathrm{const}(\beta,\tau)$; iii. $e^{\alpha|x-y|}\,|T_\omega(x,y)|\le \mathrm{const}(\beta,\tau)$.

Proof. i. It is an immediate consequence of Propositions 11 and 12. ii. Let $\phi=A(\alpha)g_\omega(\xi,z;\beta,\tau)A(-\alpha)\psi$, where $\phi$ is bounded and continuous. From i it follows:
\[ |\phi(x_0)| = \left|\int_\Lambda dy\, e^{\alpha\rho(y-x_0)}\,T_\omega(x_0,y)\,\psi(y)\right| \le \mathrm{const}(\beta,\tau)\,\|\psi\|_{L^2}, \tag{4.75} \]
and the result follows from the representation theorem of linear and continuous functionals on $L^2$. iii. Rewrite the identity $g_\omega(\xi,z;\beta,\tau)=g_\omega(\xi,z;\beta,\tau/2)W_L(\tau/2,\omega)$ in terms of integral kernels, use (4.68), (2.3), ii, the triangle and Schwarz inequalities, and the proof is completed. □

Let $\delta\omega>0$ be such that $\omega=\omega_0+\delta\omega\in\Omega$. Define the bounded operator $\tilde g_\omega(\xi,z;\beta,\tau)$ given by the following integral kernel:
\[ \tilde T_\omega(x,x') := e^{i\delta\omega\,\varphi(x,x')}\,T_{\omega_0}(x,x'). \tag{4.76} \]
Equations (4.3) and (4.58) imply that if $\delta\omega$ is sufficiently small then there exists a numerical constant such that uniformly in $L$:
\[ \sup_\xi\sup_z \|[\xi-zS_\omega(\beta)]^{-1}\|\le \mathrm{const} \tag{4.77} \]
and:
\[ \sup_\xi\sup_z \|[\xi-zS_\omega(\beta)]^{-1}-[\xi-zW_L(\beta,\omega)]^{-1}\|\le \mathrm{const}\,\delta\omega. \tag{4.78} \]
We state now an important property of $\tilde g_\omega(\xi,z;\beta,\tau)$:

Proposition 13. Under the above conditions, there exists a numerical constant such that if $\delta\omega$ is small enough, then uniformly in $\xi$, $z$ and $L$, the following $B(L^2)$ estimate takes place:
\[ \|[\xi-zS_\omega(\beta)]^{-1}zS_\omega(\tau)-\tilde g_\omega(\xi,z;\beta,\tau)\|\le \mathrm{const}\,\delta\omega. \tag{4.79} \]
Proof. The integral kernel of the operator $[\xi-zS_\omega(\beta)]\tilde g_\omega(\xi,z;\beta,\tau)$ is given by:
\[ \xi\,\tilde T_\omega(x,x') - z\int_\Lambda dy\, S_\omega(x,y;\beta)\,\tilde T_\omega(y,x'). \tag{4.80} \]
Let us notice a crucial property of the magnetic phase:
\[ \varphi(x,y)+\varphi(y,x') = \varphi(x,x') + fl(x,y,x'), \tag{4.81} \]
where $fl(x,y,x') = \tfrac12\,B\cdot[(y-x')\wedge(x-y)]$. Using (4.81) and (4.66) in (4.80) we obtain:
\[ [\xi-zS_\omega(\beta)]\,\tilde g_\omega(\xi,z;\beta,\tau) = zS_\omega(\tau) + R, \tag{4.82} \]
where $R$ is an integral operator given by:
\[ -z\,e^{i\delta\omega\,\varphi(x,x')}\int_\Lambda dy\,\bigl(e^{i\delta\omega\, fl(x,y,x')}-1\bigr)\,G_{\omega_0,L}(x,y;\beta)\,T_{\omega_0}(y,x'). \tag{4.83} \]
Because $|e^{i\delta\omega\, fl(x,y,x')}-1|\le \delta\omega\,|x-y|\,|y-x'|$, denoting with $P$ the operator given by $|x-y|\,|zG_{\omega_0,L}(x,y;\beta)|$ and with $Q$ the operator corresponding to $|y-x'|\,|T_{\omega_0}(y,x')|$, it follows that $\|R\|\le \delta\omega\,\|P\|\,\|Q\|$. Using (2.3), Corollary 1 iii., (4.34) and (4.77), the proof is completed. □

Employing (4.58), (4.79), (4.78) and (4.77) in the next equality:
\[ \begin{aligned} g_\omega(\xi,z;\beta,\beta/2)-\tilde g_\omega(\xi,z;\beta,\beta/2) ={}& [(\xi-zW_L(\beta,\omega))^{-1}-(\xi-zS_\omega(\beta))^{-1}]\,zW_L(\beta/2,\omega) \\ &+ (\xi-zS_\omega(\beta))^{-1}z\,[W_L(\beta/2,\omega)-S_\omega(\beta/2)] \\ &+ (\xi-zS_\omega(\beta))^{-1}zS_\omega(\beta/2)-\tilde g_\omega(\xi,z;\beta,\beta/2), \end{aligned} \tag{4.84} \]
(4.9) is straightforward. Let us end this subsection by proving that the operator $A_\omega(\xi,z;\beta)=\tilde g_\omega(\xi,z;\beta,\beta/2)S_\omega(\beta/2)$ has the same trace as $g_{\omega_0}(\xi,z;\beta,\beta)$. Indeed, because $A_\omega$ fulfills the conditions of Proposition 9 and noticing that $\varphi(x,x')=-\varphi(x',x)$, one can write:
\[ \mathrm{tr}\,A_\omega(\xi,z;\beta) = \int_{\Lambda^2} dx\,dx'\, T_{\omega_0}(\xi,z;\beta,\beta/2;x,x')\,G_{\omega_0,L}(x',x;\beta/2) = \int_\Lambda dx\, T_{\omega_0}(\xi,z;\beta,\beta;x,x) = \mathrm{tr}\,g_{\omega_0}(\xi,z;\beta,\beta). \tag{4.85} \]
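4.2. The proof of II and III. The analyticity of $\Gamma_\infty(\beta,z,\omega_0)$ in $D$ follows from the bound (see (2.16)) $|g_\sigma(\zeta)|\le \mathrm{const}(\sigma,K)\,|\zeta|$ where $K$ is some compact in $\mathbb{C}\setminus[1,\infty)$. In what follows, we will prove that if $z\in D_0:=\{|z|<1\}$, then:
\[ \lim_{L\to\infty}\Gamma_L(\beta,z,\omega_0) = \Gamma_\infty(\beta,z,\omega_0). \tag{4.86} \]
Because $|z|<1$, the grand canonical pressure will be (see (2.10)):
\[ P_L(\beta,z,\omega) = \sum_{n=1}^\infty \frac{z^n}{n}\,\frac{1}{\beta L^3}\,\mathrm{tr}\,W_L(n\beta,\omega) = \sum_{n=1}^\infty \frac{z^n}{n}\,\frac{1}{\beta L^3}\int_\Lambda dx\, G_{\omega,L}(x,x;n\beta). \tag{4.87} \]
Under the same conditions, the magnetization reads as:
\[ \Gamma_L(\beta,z,\omega_0) = -\frac{e}{c}\sum_{n=1}^\infty \frac{z^n}{n}\,\frac{1}{\beta L^3}\,\mathrm{tr}\,\frac{\partial W_L}{\partial\omega}(n\beta,\omega_0). \tag{4.88} \]
An important quantity is the integral kernel of the semigroup defined on the whole space:
\[ G_{\omega,\infty}(x,x';\beta) = \frac{e^{i\omega\varphi(x,x')}}{(2\pi\beta)^{3/2}}\,\frac{\omega\beta/2}{\sinh(\omega\beta/2)}\cdot\exp\left\{-\frac{1}{2\beta}\left[\frac{\omega\beta/2}{\tanh(\omega\beta/2)}\,(e_3\wedge(x-x'))^2 + (e_3\cdot(x-x'))^2\right]\right\}. \tag{4.89} \]
Denote with:
\[ g(\beta,\omega) = G_{\omega,\infty}(x,x;\beta) = \frac{1}{(2\pi\beta)^{3/2}}\,\frac{\omega\beta/2}{\sinh(\omega\beta/2)}. \tag{4.90} \]
Then it is easy to see that if $|z|<1$:
\[ P_\infty(\beta,z,\omega_0) = \sum_{n=1}^\infty \frac{z^n}{n}\,\frac{g(n\beta,\omega_0)}{\beta}, \qquad \Gamma_\infty(\beta,z,\omega_0) = -\frac{e}{\beta c}\sum_{n=1}^\infty \frac{z^n}{n}\,\frac{\partial g}{\partial\omega}(n\beta,\omega_0). \]
One of the results in [M-M-P 1] can be adapted to our problem and gives:
\[ \lim_{L\to\infty}\frac{1}{L^3}\,\mathrm{tr}\,\frac{\partial W_L}{\partial\omega}(\beta,\omega_0) = \frac{\partial g}{\partial\omega}(\beta,\omega_0). \tag{4.91} \]
If we prove the existence of a positive function $f$ with at most polynomial growth such that:
\[ \left|\frac{1}{L^3}\,\mathrm{tr}\,\frac{\partial W_L}{\partial\omega}(n\beta,\omega_0)\right| \le f(n\beta), \tag{4.92} \]
then (4.86) would be true.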
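The infinite-volume series for $P_\infty$ and $\Gamma_\infty$ converge geometrically for $|z|<1$ and are easy to evaluate numerically. The following is a minimal sketch under stated assumptions: the prefactor $e/c$ is set to 1, the series are truncated at a hypothetical cutoff `n_max`, and the function names are illustrative. It uses the closed form of $g(\beta,\omega)$ given above and its $\omega$-derivative computed by hand.

```python
import math


def g(beta, omega):
    """Diagonal of the free-space kernel: (2 pi beta)^(-3/2) * u / sinh(u), u = omega*beta/2."""
    u = 0.5 * omega * beta
    ratio = 1.0 if u == 0.0 else u / math.sinh(u)
    return ratio / (2.0 * math.pi * beta) ** 1.5


def dg_domega(beta, omega):
    """Derivative of g with respect to omega: (beta/2) * d/du [u / sinh(u)] * prefactor."""
    u = 0.5 * omega * beta
    if u == 0.0:
        return 0.0
    s = math.sinh(u)
    return 0.5 * beta * (s - u * math.cosh(u)) / (s * s) / (2.0 * math.pi * beta) ** 1.5


def pressure_inf(beta, z, omega, n_max=200):
    """Truncated series for P_infinity; n_max is an illustrative cutoff."""
    return sum(z ** n / n * g(n * beta, omega) for n in range(1, n_max + 1)) / beta


def magnetization_inf(beta, z, omega, e_over_c=1.0, n_max=200):
    """Truncated series for Gamma_infinity; e_over_c = 1 is an assumption."""
    total = sum(z ** n / n * dg_domega(n * beta, omega) for n in range(1, n_max + 1))
    return -e_over_c * total / beta


print(pressure_inf(1.0, 0.5, 1.0), magnetization_inf(1.0, 0.5, 1.0))
```

Termwise, the second series is $-(e/c)$ times the $\omega$-derivative of the first, which gives a quick consistency check against a finite-difference derivative of `pressure_inf`.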
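The next corollary is a direct consequence of Proposition 8 ii.:

Corollary 2. Under the conditions of Proposition 8 one has:
\[ \lim_{\delta\omega\searrow 0}\frac{1}{\delta\omega}\int_\Lambda dx\,[G_{\omega_0+\delta\omega,L}(x,x;\beta)-G_{\omega_0,L}(x,x;\beta)] = \mathrm{tr}\,\frac{\partial W_L}{\partial\omega}(\beta,\omega_0) = 2\int_0^\beta d\tau\int_{\Lambda^2} dx\,dy\; G_{\omega_0,L}(x,y;\beta-\tau)\,a(y-x)\cdot(-i\nabla_y-\omega_0 a(y))\,G_{\omega_0,L}(y,x;\tau). \tag{4.93} \]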
Use (2.3), (4.46), (4.38) and (4.35) in (4.93) and (4.92) follows. The proof of (4.86) is now completed.

Remarks. 1. What can we say about the same problem for Fermi particles (say electrons, where $H_{1,L}(\omega)$ should be replaced with the Pauli operator)? Knowing that in this case $\inf\sigma(H_{1,\infty}(\omega))=0$ (i.e. is independent of $\omega$), the grand canonical result (Lemma 1) can be easily restated in terms of Fermi statistics: the only thing that changes is the domain on which the limit takes place, i.e. $\mathbb{C}\setminus(-\infty,-1]$. As for Theorem 2, its proof was based on the fact that there exists a compact $K\subset\mathbb{C}\setminus[e^{\beta\omega/2},\infty)$ which contains the circle centered in the origin with radius equal to $x_L(\beta,\rho,\omega)$, for all $L\ge L_0$ and for all strictly positive $\beta$, $\omega$ and $\rho$. For Fermi particles, if one fixes $\rho$ and $\omega$ but makes $\beta$ very large (lowers the temperature), then $x_L(\beta,\rho,\omega)$ would be a very large positive quantity, therefore the above circle could intersect the negative cut. Of course, if the gas is diluted (say $\rho$ and $\omega$ fixed and $\beta$ small), then it could happen that $x_\infty(\beta,\rho,\omega)<1$, which means that the circle never intersects the cut when $L\ge L_0$, therefore a similar proof can be provided. Our conclusion is that the extension of Theorem 2 to a Fermi gas at low temperature is not trivial and remains an interesting problem.

2. What about the higher derivatives with respect to $\omega$ (the susceptibility for example) at $\omega_0\ne 0$? This also remains an open problem, even for the grand canonical ensemble. Nevertheless, we think that our approach (the modified perturbation theory for Gibbs semigroups) could provide an answer to it.

Acknowledgement. Part of this work was done during a visit to Centre de Physique Théorique in Marseille, at the invitation of Professor P. Duclos. The financial support of CNCSU grant 13(C) is hereby gratefully acknowledged. Finally, the author wishes to thank Professors N. Angelescu, M. Bundaru and G. Nenciu for their encouragement and fruitful discussions.
References
[A] Angelescu, N.: Ph.D thesis, I.F.A., Bucharest, 1976
[A-B-N 1] Angelescu, N., Bundaru, M., Nenciu, G.: On the perturbation of Gibbs semigroups. Commun. Math. Phys. 42, 29–30 (1975)
[A-B-N 2] Angelescu, N., Bundaru, M., Nenciu, G.: On the Landau diamagnetism. Commun. Math. Phys. 42, 9–28 (1975)
[A-C] Angelescu, N., Corciovei, A.: On free quantum gases in a homogeneous magnetic field. Rev. Roum. Phys. 20, 661–671 (1975)
[B-H-L] Broderix, K., Hundertmark, D., Leschke, H.: Continuity properties of Schrödinger semigroups with magnetic fields. Mathematical-Physics preprint archive of University of Texas at Austin.
[C-N] Cornean, H.D., Nenciu, G.: On eigenfunction decay for two dimensional magnetic Schrödinger operators. Commun. Math. Phys. 192, 671–685 (1998)
[H] Huang, K.: Statistical mechanics. New York–London: John Wiley & Sons, Inc., 1963
[H-P] Hille, E., Phillips, R.S.: Functional analysis and semi-groups. Providence, RI: Am. Math. Soc., 1957
[K-U-Z] Kac, M., Uhlenbeck, G.E., Ziff, R.M.: The ideal Bose-Einstein gas, revisited. Phys. Rep. 32C, no 4, 169–248 (1977)
[K] Kunz, H.: Surface orbital magnetism. J. Stat. Phys. 76, 183–207 (1994)
[M-M-P 1] Macris, N., Martin, Ph.A., Pulé, J.V.: Diamagnetic currents. Commun. Math. Phys. 117, 215–241 (1988)
[M-M-P 2] Macris, N., Martin, Ph.A., Pulé, J.V.: Large volume asymptotics of Brownian integrals and orbital magnetism. Ann. I.H.P. Phys. Theor. 66, 147–183 (1997)
[R-S 1,4] Reed, M., Simon, B.: Methods of modern mathematical physics I, IV. New York: Academic Press, 1975
[S] Simon, B.: Schrödinger semigroups. Bull. Am. Math. Soc. (N. S.) 7, 447–510 (1982)
Communicated by H. Araki
Commun. Math. Phys. 212, 29 – 61 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Trace Construction of a Basis for the Solution Space of slN qKZ Equation Atsushi Nakayashiki Graduate School of Mathematics, Kyushu University, Ropponmatsu 4-2-1, Fukuoka 810-8560, Japan. E-mail:
[email protected] Received: 26 March 1999 / Accepted: 4 January 2000
Abstract: The trace of intertwining operators over the level one irreducible highest weight modules of the quantum affine algebra of type $A^{(1)}_{N-1}$ is studied. It is proved that the trace function gives a basis of the solution space of the qKZ equation at a generic level. The highest-highest matrix elements of the composition of intertwining operators are explicitly determined as rational functions up to an overall scalar function. The integral formula for the trace is presented.
1. Introduction
In this paper we shall study solutions of the quantized Knizhnik–Zamolodchikov (qKZ) equation associated with the quantum group $U_q(sl_N)$. The idea in this paper stems from the study of solvable lattice models. The qKZ equation was introduced in [6] as the equation satisfied by the highest-highest matrix elements of the intertwining operators of quantum affine algebra. For generic values of parameters the set of matrix elements gives a basis of the solution space over the field of appropriate periodic functions. The connection matrix of two solutions with different asymptotics has been calculated from the commutation relation of intertwining operators. The solutions of the qKZ equation associated with $U_q(sl_2)$ are systematically studied by Tarasov and Varchenko [15] (see also references therein). In [15] the solutions are described as multidimensional q-hypergeometric integrals. It is proved that, for generic values of parameters, the q-hypergeometric solutions give a basis of the solution space over the field of appropriate periodic functions. The connection matrix is determined as the representation of Felder's elliptic quantum group. In this paper we propose another basis of the solution space of the qKZ equation as the trace of intertwining operators of quantum affine algebra. The trace depends on two kinds of variables. It satisfies a qKZ equation in one kind of variable and another qKZ
equation in the other kind of variables. Those two qKZ equations are dual to each other. The solutions to one qKZ equation are parametrized by solutions to the dual equation.

Let us consider the $U_q(sl_N)$ modules $V_1,\dots,V_n$ and the trigonometric R matrix $R_{ij}(z_i/z_j)$ acting on the tensor product $V_i\otimes V_j$. The qKZ equation is the q-difference equation for the $V_1\otimes\cdots\otimes V_n$ valued function $f(z_1,\dots,z_n)$ of the form
\[ f(\dots,pz_j,\dots) = R_{jj-1}(pz_j/z_{j-1})\cdots R_{j1}(pz_j/z_1)\,(\kappa^{-H})_j\, R_{jn}(z_j/z_n)\cdots R_{jj+1}(z_j/z_{j+1})\,f(z_1,\dots,z_n), \tag{1} \]
where $\kappa^{-H}=\prod_{i=1}^{N-1}\kappa_i^{-h_i}$, $h_1,\dots,h_{N-1}$ is a basis of the Cartan subalgebra of $sl_N$ and $(\kappa^{-H})_j$ means that $\kappa^{-H}$ acts on $V_j$. The complex numbers $p$ and $\kappa_i$'s are the parameters of the equation. If we write $p=q^{2(k+N)}$ the number $k$ is called level.

Let $\Lambda_i$ ($0\le i\le N-1$) be the fundamental weights of $\widehat{sl}_N$. We identify $\Lambda_i$ ($1\le i\le N-1$) with the fundamental weights of $sl_N$. In this paper we consider the case where all $V_i$ are isomorphic to the $N$ dimensional irreducible $U_q(sl_N)$ module $V$ with the highest weight $\Lambda_1$ or $\Lambda_{N-1}$.

Let $V(\Lambda_i)$ be the irreducible highest weight $U_q(\widehat{sl}_N)$ module with the highest weight $\Lambda_i$ and $V_\zeta$ the evaluation module of $V$. Then there exist, up to normalization, unique intertwining operators $\Phi(\zeta)$ and $\Psi^*(\xi)$:
\[ \Phi(\zeta): V(\Lambda_{i+1})\longrightarrow V(\Lambda_i)\otimes V_\zeta, \qquad \Psi^*(\xi): V_\xi\otimes V(\Lambda_i)\longrightarrow V(\Lambda_{i+1}). \]
We extend the index $i$ of $\Lambda_i$ to the set of integers and read it modulo $N$. The operators $\Phi(\zeta)$ and $\Psi^*(\xi)$ are sometimes called of type I and type II respectively [8]. The difference between type I and type II is in the place where the evaluation module is. For type I it is on the right of the highest weight module while for type II it is on the left. Denote by $D$ the grading operator of the principal gradation of $V(\Lambda_i)$ and consider the trace of the form
\[ G(\zeta_1,\dots,\zeta_m|\xi_1,\dots,\xi_n|x,\kappa) = F(\zeta|\xi|x)^{-1}\sum_{i=0}^{N-1}\mathrm{tr}_{V(\Lambda_i)}\bigl(x^D\kappa^H\,\Phi(\zeta_1)\cdots\Phi(\zeta_m)\,\Psi^*(\xi_n)\cdots\Psi^*(\xi_1)\bigr) \tag{2} \]
which is a function taking the value in $\mathrm{Hom}_{\mathbb{C}}(V^{\otimes n},V^{\otimes m})$. Here $F(\zeta|\xi|x)$ is some scalar function (cf. (16)). By the commutation relation of the intertwining operators, the cyclic property of the trace and the functional equations of $F(\zeta|\xi|x)$, $G$ satisfies
\[ G(\zeta|\dots,x\xi_i,\dots|x,\kappa) = G(\zeta|\xi|x,\kappa)\,\bar R_{ii+1}(\xi_i/\xi_{i+1})\cdots\bar R_{in}(\xi_i/\xi_n)\,(\kappa^{-H})_{\xi_i}\,\bar R_{i1}(x\xi_i/\xi_1)\cdots\bar R_{ii-1}(x\xi_i/\xi_{i-1}), \tag{3} \]
\[ G(\dots,x^{-1}\zeta_i,\dots|\xi|x,\kappa) = \bar R_{ii-1}(x^{-1}\zeta_i/\zeta_{i-1})\cdots\bar R_{i1}(x^{-1}\zeta_i/\zeta_1)\,(\kappa^{-H})_{\zeta_i}\,\bar R_{im}(\zeta_i/\zeta_m)\cdots\bar R_{ii+1}(\zeta_i/\zeta_{i+1})\,G(\zeta|\xi|x,\kappa), \tag{4} \]
where $\bar R(\zeta)$ is the trigonometric R matrix (cf. (7)), $\bar R_{ii+1}(\xi_i/\xi_{i+1})$ acts non-trivially on $V_{\xi_i}\otimes V_{\xi_{i+1}}$ in $V^{\otimes n}$, etc. Equation (4) has precisely the same form as the qKZ equation (1). Let ${}^tG$ be the transpose of $G$, that is, ${}^tG\in\mathrm{Hom}_{\mathbb{C}}(V^{*\otimes m},V^{*\otimes n})$, $V^*$ being the dual vector space of $V$. Then, as the equation for ${}^tG$, (3) is of the same form as (1). Since we use the principal gradation in this paper, to make a precise correspondence between the parameter $x$ and the parameter $p$ in (1) we need to consider $G$ as a function of $z_j=\zeta_j^N$
and $u_j=\xi_j^N$. Then if $x^N=p=q^{2(k+N)}$, ${}^tG$ and $G$ satisfy the qKZ equation of level $k$ and level $-k-2N$ in the variables $u$ and $z$ respectively. In this paper, if $x^{-N}=q^{2(k+N)}$, we call (4) the qKZ equation of level $k$ with the value in $V^{\otimes m}$. Let $S^n_k$ and $S^{*n}_k$ be the space of meromorphic solutions of the qKZ equation of level $k$ with the value in $V^{\otimes n}$ and $V^{*\otimes n}$ respectively and $F$ the field of $x$ periodic meromorphic functions in $n$ variables. Then the function $G$ defines two maps simultaneously:
\[ {}^tG(\zeta|\cdot|x,\kappa): V^{*\otimes m}\otimes F\longrightarrow S^{*n}_k, \tag{5} \]
\[ G(\cdot|\xi|x,\kappa): V^{\otimes n}\otimes F\longrightarrow S^m_{-k-2N}. \tag{6} \]
In (5), $\zeta_1,\dots,\zeta_m$ are parameters of the map and in (6), $\xi_1,\dots,\xi_n$ are parameters of the map. We consider the case $n=m$. We assume $|x|<1$. We shall prove that if $x$ and $\kappa$ are generic, (5) is an isomorphism for the generic values of $\zeta_1,\dots,\zeta_m$ and (6) is an isomorphism for the generic values of $\xi_1,\dots,\xi_n$. It is proved by showing that the determinant of $G$ does not vanish identically. We calculate the determinant at $x=0$, where $G$ reduces to the highest-highest matrix element. For the level one irreducible module $V(\Lambda_i)$ the matrix elements can be calculated explicitly as rational functions up to an overall scalar function. This is expected because at $q=1$ such a formula is given with the help of the Frenkel–Kac bosonization of $V(\Lambda_i)$ [5]. In the case of $U_q(\widehat{sl}_2)$ the formula of this type is given in [8]. For $U_q(\widehat{sl}_N)$ it is possible to carry out the integral of the integral formula, which is obtained from the bosonization of intertwining operators based on the Frenkel–Jing bosonization of $V(\Lambda_i)$ in [10], in a similar manner to the $N=2$ case.

The case $x=q^2$ is relevant to the physical quantities in solvable lattice models. In fact at this value of $x$, if we further specialize the variables $\zeta_i$ and $\xi_j$ appropriately, the trace functions give correlation functions and form factors of the solvable lattice model whose Boltzmann weight is given by $\bar R(\zeta)$. We have calculated the determinant of $G$ for $N=n=2$ and $x=q^2$ explicitly. By the $q$ series expansion we checked that $\det G$ does not vanish identically for $n=3$. We conjecture that the determinant does not vanish identically at $x=q^2$. This suggests that the trace description can be effective for the completeness problem of the space of local fields [13, 1].

The bosonization of intertwining operators makes it possible not only to derive the integral formula for the matrix elements but also to derive the integral formula for the trace. Therefore the integral formulae of the basis of the solution space of (3) and (4) are given.

The plan of this paper is as follows. In the second section we summarize necessary notations of quantum affine algebra $U_q(\widehat{sl}_N)$. We review the properties of the intertwining operators for the level one integrable $U_q(\widehat{sl}_N)$ modules in the principal picture in Sect. 3. In Sect. 4 we give the relation between principal picture and homogeneous picture. It serves for translating the results in the references, in which the homogeneous gradation is used, into principal picture and vice versa. In Sect. 5 the trace of intertwining operators is introduced and the equations satisfied by it are derived. The non-vanishing of the determinant of the trace function is properly formulated and proved in Sect. 6. In Sect. 7 an example of the concrete expression of the determinant of the trace of intertwining operators in the case $N=2$ is given. In Sect. 8 we give the integral formulae for the matrix elements of the intertwining operators. The rational function formula for the extremal component of the normalized matrix element is given in Sect. 9. In Sect. 10 the integral formula of
the trace of intertwining operators is presented. In Appendix A we refer to the integral formula for the trace of intertwining operators of $U_q(\widehat{sl}_2)$ in [8], since in this case it is possible to simplify the formula a bit. This simplification is used in the calculation in the example of Sect. 7. The bosonic expressions of the intertwining operators are reviewed in Appendix B. The list of the expressions of the operators in terms of their normal ordered operators is given in Appendix C. In Appendix D a derivation of the integral formula for the trace of intertwining operators is briefly explained. The explicit formulae of the constants appearing in the formulae of matrix elements and the trace are given in Appendix E.

2. Preliminary
Let
\[ P = \oplus_{i=0}^{N-1}\mathbb{Z}\Lambda_i\oplus\mathbb{Z}\delta \]
be the weight lattice of $\widehat{sl}_N$ and
\[ P^* = \mathrm{Hom}_{\mathbb{Z}}(P,\mathbb{Z}) = \oplus_{i=0}^{N-1}\mathbb{Z}h_i\oplus\mathbb{Z}d \]
its dual. The pairing is given by
\[ \langle\Lambda_i,h_j\rangle=\delta_{ij},\qquad \langle\Lambda_i,d\rangle=0,\qquad \langle\delta,h_i\rangle=0,\qquad \langle\delta,d\rangle=1. \]
We say $\Lambda_i$ ($0\le i\le N-1$) are the fundamental weights. Simple roots are given by
\[ \alpha_i = -\Lambda_{i-1}+2\Lambda_i-\Lambda_{i+1}+\delta_{i0}\,\delta, \]
where the index should be read modulo $N$. If we set $a_{ij}=\langle\alpha_i,h_j\rangle$, $(a_{ij})$ is the generalized Cartan matrix of type $A^{(1)}_{N-1}$.

The quantum affine algebra $U'_q(\widehat{sl}_N)$ is the Hopf algebra generated by $e_i$, $f_i$, $t_i^{\pm1}$ ($0\le i\le N-1$) with the following defining relations:
\[ t_it_j=t_jt_i,\qquad t_i^{\pm1}t_i^{\mp1}=1,\qquad t_ie_jt_i^{-1}=q^{\langle h_i,\alpha_j\rangle}e_j,\qquad t_if_jt_i^{-1}=q^{-\langle h_i,\alpha_j\rangle}f_j, \]
\[ [e_i,f_j]=\delta_{ij}\,\frac{t_i-t_i^{-1}}{q-q^{-1}}, \]
\[ \sum_{k=0}^{1-a_{ij}}(-1)^k e_i^{(k)}e_je_i^{(1-a_{ij}-k)} = \sum_{k=0}^{1-a_{ij}}(-1)^k f_i^{(k)}f_jf_i^{(1-a_{ij}-k)} = 0\qquad (i\ne j), \]
where $e^{(k)}=e^k/[k]!$ and similarly for $f^{(k)}$, $[k]!=[k]\cdots[2][1]$, $[k]=(q^k-q^{-k})/(q-q^{-1})$. The coproduct $\Delta$ and the antipode $S$ are given by
\[ \Delta(e_i)=e_i\otimes1+t_i\otimes e_i,\qquad \Delta(f_i)=f_i\otimes t_i^{-1}+1\otimes f_i,\qquad \Delta(t_i)=t_i\otimes t_i, \]
and
\[ S(e_i)=-t_i^{-1}e_i,\qquad S(f_i)=-f_it_i,\qquad S(t_i)=t_i^{-1}. \]
We extend the Hopf algebra $U'_q(\widehat{sl}_N)$ by adding the element $D$ such that
\[ [D,e_i]=e_i,\qquad [D,f_i]=-f_i,\qquad [D,t_i^{\pm1}]=0,\qquad \Delta(D)=D\otimes1+1\otimes D. \]
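The q-integers $[k]$ and q-factorials $[k]!$ that enter the divided powers $e^{(k)}=e^k/[k]!$ are easy to tabulate. The sketch below is an illustration only: it evaluates them at a rational value of $q$ using exact arithmetic, and the function names `q_int` and `q_factorial` are hypothetical, not notation from the paper.

```python
from fractions import Fraction


def q_int(k, q):
    """[k] = (q^k - q^(-k)) / (q - q^(-1)), evaluated exactly for rational q."""
    return (q ** k - q ** (-k)) / (q - q ** (-1))


def q_factorial(k, q):
    """[k]! = [k] [k-1] ... [2] [1], with the empty product [0]! = 1."""
    result = Fraction(1)
    for j in range(1, k + 1):
        result *= q_int(j, q)
    return result


q = Fraction(1, 2)  # sample value, chosen only for illustration
print(q_int(2, q), q_int(3, q), q_factorial(3, q))
```

For instance $[2]=q+q^{-1}$ and $[3]=q^2+1+q^{-2}$, and both $[k]$ and $[k]!$ are invariant under $q\mapsto q^{-1}$.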
The resulting algebra is denoted by $U_q(\widehat{sl}_N)$. We say that an element $X\in U_q(\widehat{sl}_N)$ has degree $n$ if $[D,X]=nX$. For a highest weight $U_q(\widehat{sl}_N)$ module $M$ with a highest weight vector $v$, $D$ defines a grading on $M$ by $D(Xv)=nXv$ for an element $X$ of degree $n$ in $U_q(\widehat{sl}_N)$. The evaluation module $V_\zeta=\oplus_{j=0}^{N-1}\mathbb{C}v_j$ of $U'_q(\widehat{sl}_N)$ associated with the vector representation of $U_q(sl_N)$ is given by
\[ f_iv_j=\zeta^{-1}\delta_{ij+1}v_{j+1},\qquad e_iv_j=\zeta\,\delta_{ij}v_{j-1},\qquad t_iv_j=q^{-\delta_{ij}+\delta_{ij+1}}v_j, \]
where the index of $v_j$ should be read modulo $N$. In particular the weight of $v_j$, which we denote by $\mathrm{wt}\,v_j$, is given by $\mathrm{wt}\,v_j=\Lambda_{j+1}-\Lambda_j$.

We denote the binomial coefficient by ${}_nC_r$, that is, $(1+x)^n=\sum_{r=0}^n{}_nC_r\,x^r$. In this paper two kinds of variables appear, one is $u$ and $z$, the other is $\xi$ and $\zeta$. They are always related by the relation $u=\xi^N$ and $z=\zeta^N$ except in Appendix A where $u=-\xi^2$ and $z=\zeta^2$.

3. Intertwining Operators
In [2,10] the evaluation module, R matrices and intertwining operators are described in terms of the homogeneous grading. We shall rewrite them to the principal picture so that the description is consistent with the $N=2$ case in [8] and that the equations for the trace of intertwining operators are free from cumbersome factors.

Let $P$ be the permutation operator, $P(v\otimes w)=w\otimes v$, and $P\bar R(\zeta_1/\zeta_2)$ the intertwining operator from $V_{\zeta_1}\otimes V_{\zeta_2}$ to $V_{\zeta_2}\otimes V_{\zeta_1}$ normalized as $\bar R(\zeta_1/\zeta_2)(v_0\otimes v_0)=v_0\otimes v_0$. We define the components of $\bar R(\zeta)$ by
\[ \bar R(\zeta)(v_i\otimes v_j)=\sum_{i',j'}\bar R(\zeta)^{i'j'}_{ij}\,v_{i'}\otimes v_{j'}. \]
Explicitly they are given by (cf. [2])
\[ \bar R(\zeta)^{jj}_{jj}=1,\qquad \bar R(\zeta)^{jk}_{jk}=b(\zeta)=\frac{q(1-\zeta^N)}{1-q^2\zeta^N}\quad(j\ne k),\qquad \bar R(\zeta)^{jk}_{kj}=c_{jk}(\zeta)=\frac{1-q^2}{1-q^2\zeta^N}\,\zeta^{N\theta(k-j)+j-k}\quad(j\ne k), \tag{7} \]
where $\theta(k)=1$ ($k\ge0$), $=0$ (otherwise) and $0\le j,k\le N-1$.

Let $V(\Lambda_i)$ be the irreducible highest weight $U_q(\widehat{sl}_N)$ module with the highest weight $\Lambda_i$ and the highest weight vector $|\Lambda_i\rangle$, $V(\Lambda_i)^*$ the restricted dual right highest weight module of $V(\Lambda_i)$ with the highest weight vector $\langle\Lambda_i|$ such that $\langle\langle\Lambda_i|,|\Lambda_i\rangle\rangle=1$, where $\langle\ ,\ \rangle$ is the dual pairing. We denote $\langle\langle\Lambda_j|,X|\Lambda_i\rangle\rangle=\langle\langle\Lambda_j|X,|\Lambda_i\rangle\rangle=\langle\Lambda_j|X|\Lambda_i\rangle$ for any $X\in\mathrm{Hom}_{\mathbb{C}}(V(\Lambda_i),V(\Lambda_j))$, where $X$ acts on $V(\Lambda_j)^*$ from the right.

The type I and type II intertwining operators $\Phi^{(i)}(\zeta)$ and $\Psi^{*(i)}(\xi)$ are the $U'_q(\widehat{sl}_N)$ linear operators of the form
\[ \Phi^{(i)}(\zeta): V(\Lambda_{i+1})\longrightarrow V(\Lambda_i)\otimes V_\zeta,\qquad \Phi^{(i)}(\zeta)=\sum_{\epsilon=0}^{N-1}\Phi^{(i)}_\epsilon(\zeta)\otimes v_\epsilon, \]
\[ \Psi^{*(i)}(\xi): V_\xi\otimes V(\Lambda_i)\longrightarrow V(\Lambda_{i+1}),\qquad \Psi^{*(i)}(\xi)(v_\mu\otimes\cdot)=\Psi^{*(i)}_\mu(\xi). \]
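We normalize them by the condition that
\[ \langle\Lambda_i|\Phi^{(i)}_i(\zeta)|\Lambda_{i+1}\rangle=1,\qquad \langle\Lambda_{i+1}|\Psi^{*(i)}_i(\zeta)|\Lambda_i\rangle=1. \]
Under these normalizations the operators $\Phi^{(i)}(\zeta)$ and $\Psi^{*(i)}(\xi)$ are unique. We sometimes omit the upper index $(i)$ of $\Phi^{(i)}(\zeta)$ and $\Psi^{*(i)}(\xi)$ for the sake of simplicity. The intertwining operators $\Phi(\zeta)$ and $\Psi^*(\xi)$ satisfy the following commutation relations ([2]):
\[ R(\zeta_1/\zeta_2)\,\Phi(\zeta_1)\Phi(\zeta_2)=\Phi(\zeta_2)\Phi(\zeta_1), \tag{8} \]
\[ \Psi^*(\xi_2)\Psi^*(\xi_1)\,R^*(\xi_1/\xi_2)=\Psi^*(\xi_1)\Psi^*(\xi_2), \tag{9} \]
\[ \Phi(\zeta)\Psi^*(\xi)=\tau(\zeta/\xi)\,\Psi^*(\xi)\Phi(\zeta), \tag{10} \]
where
\[ \tau(\zeta)=\zeta^{1-N}\,\frac{\theta_{q^{2N}}(q\zeta^N)}{\theta_{q^{2N}}(q\zeta^{-N})} \]
and for any complex number $p$ such that $|p|<1$ we set
\[ (z;p)_\infty=\prod_{k=0}^\infty(1-p^kz),\qquad \theta_p(z)=(z;p)_\infty\,(pz^{-1};p)_\infty\,(p;p)_\infty. \]
The matrices $R(\zeta)$ and $R^*(\zeta)$ are given by
\[ R(\zeta)=r(\zeta)\bar R(\zeta),\qquad R^*(\zeta)=r^*(\zeta)\bar R(\zeta), \]
with
\[ r(\zeta)=\zeta^{-1}\,\frac{(q^{2N}z^{-1};q^{2N})_\infty\,(q^2z;q^{2N})_\infty}{(q^{2N}z;q^{2N})_\infty\,(q^2z^{-1};q^{2N})_\infty},\qquad r^*(\zeta)=-\zeta^{-1}\,\frac{(q^{2N}z^{-1};q^{2N})_\infty\,(q^{2N-2}z;q^{2N})_\infty}{(q^{2N}z;q^{2N})_\infty\,(q^{2N-2}z^{-1};q^{2N})_\infty}. \]
In (8)–(10) we use the following notation: for $v_i\otimes v_j\in V_{\zeta_1}\otimes V_{\zeta_2}$ and $v_{j'}\otimes v_{i'}\in V_{\zeta_2}\otimes V_{\zeta_1}$, the equation $v_i\otimes v_j=v_{j'}\otimes v_{i'}$ means $v_i=v_{i'}$ and $v_j=v_{j'}$. This is for the sake of simplifying the description of the equation. Thus in terms of components (8), (9) and (10) are written as
\[ R(\zeta_1/\zeta_2)^{\epsilon_1\epsilon_2}_{\epsilon_1'\epsilon_2'}\,\Phi_{\epsilon_1'}(\zeta_1)\Phi_{\epsilon_2'}(\zeta_2)=\Phi_{\epsilon_2}(\zeta_2)\Phi_{\epsilon_1}(\zeta_1), \tag{11} \]
\[ R^*(\xi_1/\xi_2)^{\epsilon_1'\epsilon_2'}_{\epsilon_1\epsilon_2}\,\Psi^*_{\epsilon_2'}(\xi_2)\Psi^*_{\epsilon_1'}(\xi_1)=\Psi^*_{\epsilon_1}(\xi_1)\Psi^*_{\epsilon_2}(\xi_2), \tag{12} \]
\[ \Phi_\epsilon(\zeta)\Psi^*_\mu(\xi)=\tau(\zeta/\xi)\,\Psi^*_\mu(\xi)\Phi_\epsilon(\zeta). \tag{13} \]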
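The infinite product $(z;p)_\infty$ and the theta function $\theta_p(z)$ defined above converge rapidly for $|p|<1$, so truncated products give a quick way to experiment with $\tau(\zeta)$. The following is a minimal numerical sketch; the truncation length `terms`, the function names and the sample parameter values are assumptions made for illustration. It checks the quasi-periodicity $\theta_p(pz)=-z^{-1}\theta_p(z)$, which follows directly from the product definition, and the relation $\tau(\zeta)\tau(\zeta^{-1})=1$, which follows from the formula for $\tau$.

```python
def qpoch(z, p, terms=200):
    """Truncated infinite product (z; p)_infinity = prod_{k>=0} (1 - p^k z), |p| < 1."""
    prod = 1.0
    for k in range(terms):
        prod *= 1.0 - p ** k * z
    return prod


def theta(z, p, terms=200):
    """theta_p(z) = (z; p)_inf * (p / z; p)_inf * (p; p)_inf."""
    return qpoch(z, p, terms) * qpoch(p / z, p, terms) * qpoch(p, p, terms)


def tau(zeta, q, N):
    """tau(zeta) = zeta^(1-N) * theta_{q^(2N)}(q zeta^N) / theta_{q^(2N)}(q zeta^(-N))."""
    p = q ** (2 * N)
    return zeta ** (1 - N) * theta(q * zeta ** N, p) / theta(q * zeta ** (-N), p)


# Sample values, chosen only for illustration.
p, z = 0.3, 0.7 + 0.2j
print(abs(theta(p * z, p) + theta(z, p) / z))
print(abs(tau(1.3, 0.4, 3) * tau(1 / 1.3, 0.4, 3) - 1.0))
```

Both printed quantities should be at the level of floating point rounding, since the neglected factors differ from 1 by terms of order $p^{200}$.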
Let $\sigma$ be the automorphism of $U_q(\widehat{sl}_N)$ induced by the Dynkin diagram automorphism, $\sigma(e_i)=e_{i+1}$, $\sigma(f_i)=f_{i+1}$, $\sigma(h_i)=h_{i+1}$. The indices should be read modulo $N$. The automorphism $\sigma$ induces the linear automorphism of $V_\zeta$, the linear isomorphism between the left highest weight modules $V(\Lambda_i)$ and $V(\Lambda_{i+1})$, the linear isomorphism between the right highest weight modules $V(\Lambda_i)^*$ and $V(\Lambda_{i+1})^*$ by $\sigma(v_j)=v_{j+1}$, $\sigma(|\Lambda_i\rangle)=|\Lambda_{i+1}\rangle$, $\sigma(\langle\Lambda_i|)=\langle\Lambda_{i-1}|$ with the properties $\sigma(Xv)=\sigma(X)\sigma(v)$ and $\sigma(v^*X)=\sigma(v^*)\sigma^{-1}(X)$ for $X\in U_q(\widehat{sl}_N)$, $v\in V(\Lambda_i)$, $v^*\in V(\Lambda_i)^*$. Then the intertwining operators satisfy the following relations:
\[ \Phi^{(i)}(\zeta)=(\sigma\otimes\sigma)\,\Phi^{(i-1)}(\zeta)\,\sigma^{-1},\qquad \Psi^{*(i)}(\xi)=\sigma\,\Psi^{*(i-1)}(\xi)\,(\sigma^{-1}\otimes\sigma^{-1}). \tag{14} \]
In terms of components these are
\[ \Phi^{(i)}_\epsilon(\zeta)=\sigma\,\Phi^{(i-1)}_{\epsilon-1}(\zeta)\,\sigma^{-1},\qquad \Psi^{*(i)}_\mu(\xi)=\sigma\,\Psi^{*(i-1)}_{\mu-1}(\xi)\,\sigma^{-1}. \]
These relations are proved by checking the intertwining properties and the normalization conditions of the right hand side of the equations using the relation $(\sigma\otimes\sigma)\Delta=\Delta\sigma$. The R matrix $\bar R(\zeta)$ is also invariant with respect to $\sigma$;
\[ \bar R(\zeta)^{\sigma(i)\sigma(j)}_{\sigma(i')\sigma(j')}=\bar R(\zeta)^{ij}_{i'j'}, \]
where $\sigma(i)=i+1$ ($0\le i\le N-2$), $\sigma(N-1)=0$.

4. Principal-Homogeneous Correspondence
We shall give relations between the intertwining operators defined in the previous section and those in [10,2]. Let $V^{(h)}_z=\oplus_{j=0}^{N-1}\mathbb{C}v_j$ be the homogeneous evaluation module given by
\[ f_iv_j=z^{-\delta_{i0}}\delta_{ij+1}v_{j+1},\qquad e_iv_j=z^{\delta_{i0}}\delta_{ij}v_{j-1},\qquad t_iv_j=q^{-\delta_{ij}+\delta_{ij+1}}v_j. \]
The map $V_\zeta\longrightarrow V^{(h)}_z$ given by $v_i\mapsto v_i\zeta^i$ commutes with the action of $U'_q(\widehat{sl}_N)$, where $z=\zeta^N$.

Let $\tilde\Phi^{\Lambda_iV}_{\Lambda_{i+1}}(z)$ and $\tilde\Phi^{V^*\Lambda_{i+1}}_{\Lambda_i}(z)$ be the intertwining operators in [10];
\[ \tilde\Phi^{\Lambda_iV}_{\Lambda_{i+1}}(z): V(\Lambda_{i+1})\longrightarrow V(\Lambda_i)\otimes V^{(h)}_z,\qquad \tilde\Phi^{V^*\Lambda_{i+1}}_{\Lambda_i}(z): V(\Lambda_i)\longrightarrow V^{(h)*}_z\otimes V(\Lambda_{i+1}). \]
We set
\[ \tilde\Phi^{h(i)}(z)=\tilde\Phi^{\Lambda_iV}_{\Lambda_{i+1}}(z),\qquad \tilde\Phi^{V^*\Lambda_{i+1}}_{\Lambda_i}(z)=\sum_{j=0}^{N-1}v_j^*\otimes\tilde\Psi^{*h(i)}_j(z), \]
where $\{v_j^*\}$ is the dual basis to $\{v_j\}$, $\langle v_i,v_j^*\rangle=\delta_{ij}$. Then
\[ \Phi^{(i)}_j(\zeta)=\zeta^{i-j}\,\tilde\Phi^{h(i)}_j(\zeta^N),\qquad \Psi^{*(i)}_j(\zeta)=\zeta^{j-i}\,\tilde\Psi^{*h(i)}_j(\zeta^N), \]
where $0\le i,j\le N-1$. We remark that the dual representation $V^*$ in $\tilde\Phi^{V^*\Lambda_{i+1}}_{\Lambda_i}(z)$ in [10] is with respect to the antipode inverse. Let $\bar R^{(h)}(z)=\bar R_{V^{(1)}V^{(1)}}(z)$ be the R matrix in [2]. Then
\[ \bar R(\zeta)^{i'j'}_{ij}=\bar R^{(h)}(\zeta^N)^{i'j'}_{ij}\,\zeta^{i-i'}. \]
5. Trace of Intertwining Operators
In order to normalize the trace of intertwining operators we first introduce scalar functions which satisfy some functional equations. For complex numbers $p_1,\dots,p_k$ such that $|p_i|<1$ for any $i$, we define
\[ (z;p_1,\dots,p_k)_\infty=\prod_{r_1,\dots,r_k=0}^\infty(1-p_1^{r_1}\cdots p_k^{r_k}z). \]
We assume $|q|<1$, $|x|<1$ and set
\[ \{z\}=(z;q^{2N},x^N)_\infty,\qquad h^{(\sigma)}(z|x)=\frac{\{q^{1+\sigma}x^Nz^{-1}\}\{q^{1+\sigma}z\}}{\{q^{2N-1+\sigma}x^Nz^{-1}\}\{q^{2N-1+\sigma}z\}}, \]
where $\sigma=0,\pm1$. Let us define
\[ \bar F(z|u|x)=\prod h^{(+)}\Bigl(\frac{z_b}{z_a}\Big|x\Bigr)\Bigl(\prod h^{(0)}\Bigl(\frac{u_a}{z_b}\Big|x\Bigr)\Bigr)^{-1}\prod h^{(-)}\Bigl(\frac{u_a}{u_b}\Big|x\Bigr), \]
a 0 such that
\[ \langle u,Au\rangle\ge -Ch\,\|u\|_{H_{sc}^{-1/2,1/2}(X)}. \tag{2.2} \]
Semiclassical Estimates in Asymptotically Euclidean Scattering
In particular, if $A\in\Psi^{2m,-2l,0}_{scc,h}(X)$ has principal symbol $a\ge0$, then $A\ge hR$ for some $R\in\Psi^{2m-1,-2l+1,0}_{scc,h}(X)$.
Proof. The inequality is well known in the case of $\mathbb{R}^n$ – see Sect. 18.4 of [8] (easily adapted to the semi-classical setting), [3] and [6]. The localization argument presented in the Appendix of [16] then gives the lemma. □

Now, let $g$ be a scattering metric on $X$, that is, a metric which near $\partial X$ takes the form
\[ \frac{dx^2}{x^4}+\frac{h'}{x^2},\qquad h'|_{\partial X}=h \text{ is a metric on } \partial X. \tag{2.3} \]
This defines an asymptotically Euclidean structure near $\partial X$: a neighbourhood of $\partial X$ is isometric to a perturbation of the large end of the cone $\mathbb{R}_+\times\partial X$ with the metric $dr^2+r^2h$. We will consider the following self-adjoint, classically elliptic operators in $\mathrm{Diff}^{2,0,0}_{h,sc}(X)\subset\Psi^{2,0,0}_{h,sc}(X)$:
\[ P=h^2\Delta_g+V, \tag{2.4} \]
where in any compact set, $V$ is a second order semiclassical operator ($V=\sum_{|\alpha|\le2}v_\alpha(z,h)(hD_z)^\alpha$ in local coordinates) and near the boundary $\partial X$, in local coordinates $y\in\partial X$,
\[ V=x^\gamma V_0,\qquad V_0=\sum_{|\alpha|+k\le2}v_{k\alpha}(x,y,h)(hx^2D_x)^k(hD_y)^\alpha,\qquad v_{k\alpha}-v^0_{k\alpha}\in hS^{0,0,0}(X),\quad v^0_{k\alpha}\in S^{0,0}(X),\quad \gamma>0. \tag{2.5} \]
The condition that the coefficients are symbols independent of the fiber variables means that $|(x\partial_x)^l\partial_y^\beta v_{k,\alpha}|\le C_{l\beta}$. In the Euclidean setting it corresponds to assuming that the coefficients are symbols in the Euclidean base variables. Due to the vanishing of $v_{k\alpha}-v^0_{k\alpha}$ in $S^{0,0,0}(X)$ when $h=0$, the semiclassical principal symbol of $P$ is
\[ p=g+x^\gamma\sum_{|\alpha|+k\le2}v^0_{k\alpha}(x,y)\,\tau^k\mu^\alpha, \tag{2.6} \]
where $g$ also denotes the (dual) metric function of the metric $g$. Thus, $p$ can be represented by an $h$-independent function, which will be convenient for the construction in the last section of this paper. Note, however, that in (2.5), $v_{k\alpha}-v^0_{k\alpha}\in hS^{0,0,0}(X)$ could be replaced by $v_{k\alpha}-v^0_{k\alpha}\in h^\rho S^{0,0,0}(X)$, $\rho>0$, or indeed by the assumption that $v_{k\alpha}$ is continuous on $[0,1)_h$ with values in $S^{0,0}(X)$, at the expense of minor changes in the next section.

For obtaining the uniform resolvent estimates in $h$ for $R(\lambda\pm i0)$, we make the assumption that the Hamiltonian is non-trapping at energy $\lambda$:
\[ \text{for any } \xi\in T^*X^\circ \text{ satisfying } p(\xi)=\lambda,\qquad \lim_{t\to\pm\infty}x(\exp(tH_p)(\xi))=0. \tag{2.7} \]
A. Vasy, M. Zworski
As discussed in [5], this implies that an interval of energies around $\lambda$ is non-trapping:
\[ \exists\,\delta_0>0 \text{ such that for any } \xi\in T^*X^\circ \text{ satisfying } p(\xi)\in(\lambda-\delta_0,\lambda+\delta_0),\qquad \lim_{t\to\pm\infty}x(\exp(tH_p)(\xi))=0. \tag{2.8} \]
The symbolic functional calculus applies in the semiclassical setting as well – see [3] and references given there. Here, we will restrict the discussion to the operator $P$ given by (2.4). The formula
\[ f(P)=\frac{1}{2\pi i}\int_{\mathbb{C}}\bar\partial_z\tilde f(z)\,(P-z)^{-1}\,d\bar z\wedge dz,\qquad \tilde f\in C_c^\infty(\mathbb{C}),\quad \tilde f|_{\mathbb{R}}=f,\quad \bar\partial\tilde f=O(|\mathrm{Im}\,z|^\infty), \]
($\tilde f$ is an almost analytic extension of $f$) shows that for $f\in C_c^\infty(\mathbb{R})$,
\[ f(P)\in\Psi^{-\infty,0,0}_{sc,h}(X). \]
Also $\sigma^{\star,0,0}_{h,sc}(f(P))=f(p)$. If $\psi\in C_c^\infty(\mathbb{R})$, $\psi\equiv1$ near $\lambda$, then for $t\in\mathbb{R}$, $1-\psi(\sigma)=\tilde\psi_t(\sigma)(\sigma-(\lambda+it))$, $\tilde\psi_t\in S^{-1}_{cl}(\mathbb{R})$ satisfying uniform symbol estimates as $t$ varies over compact sets, so $\tilde\psi_t(P)\in\Psi^{-2,0,0}_{sc,h}(X)$, and we have proved the following lemma:

Lemma 2.2. Let $P$ be as in (2.4). Suppose that $\psi\in C_c^\infty(\mathbb{R})$, $\psi\equiv1$ near $\lambda$, and suppose that $r,s\in\mathbb{R}$. Then there exists $C>0$, independent of $t$ as long as $t$ varies in compact sets, such that for all $u\in C^{-\infty}(X)$ with $(P-(\lambda+it))u\in H_{sc}^{r-2,s}(X)$, the following estimate holds:
\[ \|(\mathrm{Id}-\psi(P))u\|_{H_{sc}^{r,s}(X)}\le C\,\|(P-(\lambda+it))u\|_{H_{sc}^{r-2,s}(X)}. \tag{2.9} \]
3. Semiclassical Estimates In this section we will prove the semi-classical resolvent estimates under the assumption that there exists q ∈ S 0,−,0 (X), ∈ (0, 41 ), such that 2qHp q = −bψ(p)2 , b ∈ S 0,1−2,0 (X), ψ ∈ Cc∞ (R; [0, 1]), ψ ≡ 1 near λ, b ≥ c0 x
1+2
(3.1)
> 0.
The existence of q under global non-trapping assumptions will be established in Sect.4. If we write Q = Op(q) and B = (Op(b) + Op(b)∗ )/2 then, as reviewed in Sect.2, i[Q∗ Q, P ] = hψ(P )Bψ(P ) + h2 R
(3.2)
0,2−2,0 −∞,0,0 (X). Note that ψ(P ) ∈ 9scc,h (X), i.e. it is smoothing, so the with R ∈ 9scc,h r,s differentiability order r in the weighted Sobolev spaces Hsc (X) is mostly irrelevant 0, 1 +
below. Suppose that u ∈ Hsc 2
(X). Then for t > 0,
hu, i[Q∗ Q, P ]ui = −2 Imhu, Q∗ Q(P − (λ + it))ui − 2tkQuk2 .
(3.3)
Semiclassical Estimates in Asymptotically Euclidean Scattering
211
0,−2,0 (Note that Q∗ QP and P Q∗ Q are in 9scc,h (X), so the expressions of the form ∗ hu, Q QP ui, make sense.) Thus, taking into account that 2tkQuk2 ≥ 0,
hhu, ψ(P )Bψ(P )ui ≤ 2|hu, Q∗ Q(P − (λ + it))ui| + h2 |hu, Rui|.
(3.4)
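By the Cauchy–Schwarz inequality we have, for any $\delta>0$,
\begin{align*}
|\langle u,Q^*Q(P-(\lambda+it))u\rangle| &\le \|x^{\frac12+\epsilon}u\|\,\|x^{-\frac12-\epsilon}Q^*Q(P-(\lambda+it))u\|\\
&\le \delta h\|x^{\frac12+\epsilon}u\|^2 + \delta^{-1}h^{-1}\|x^{-\frac12-\epsilon}Q^*Q(P-(\lambda+it))u\|^2. \tag{3.5}
\end{align*}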
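The second inequality in (3.5) is the elementary weighted arithmetic-geometric mean inequality. As a quick check, with $a$ and $b$ standing for the two norms in the first line of (3.5) (these letters are introduced only for this note):

```latex
% a, b >= 0 and \delta, h > 0; a and b are shorthand introduced for this check.
\[
ab = \big(\sqrt{2\delta h}\,a\big)\Big(\frac{b}{\sqrt{2\delta h}}\Big)
   \le \delta h\,a^2 + \frac{b^2}{4\delta h}
   \le \delta h\,a^2 + \delta^{-1}h^{-1}b^2 .
\]
```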
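Note that $x^{-\frac12-\epsilon}Rx^{-\frac12-\epsilon}\in\Psi^{0,1-4\epsilon,0}_{scc,h}(X)$, hence bounded on $L^2_{sc}(X)$ since $\epsilon\in(0,\frac14)$. Similarly, $x^{-\frac12-\epsilon}Q^*Qx^{\frac12+3\epsilon}\in\Psi^{0,0,0}_{scc,h}(X)$ is also bounded on the $L^2$ space. Thus,
\begin{align*}
& h\langle u,\psi(P)B\psi(P)u\rangle - \big(\delta h + h^2\|x^{-\frac12-\epsilon}Rx^{-\frac12-\epsilon}\|_{\mathcal{B}(L^2_{sc}(X))}\big)\|x^{\frac12+\epsilon}u\|^2\\
&\qquad\le \delta^{-1}h^{-1}\|x^{-\frac12-\epsilon}Q^*Qx^{\frac12+3\epsilon}\|^2_{\mathcal{B}(L^2_{sc}(X))}\,\|x^{-\frac12-3\epsilon}(P-(\lambda+it))u\|^2. \tag{3.6}
\end{align*}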
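We will now use the last assumption in (3.1): $x^{-1+2\epsilon}b\,\psi(p)\ge c_0x^{2\epsilon}\psi(p)$. Hence by the sharp Gårding estimate,
\[ x^{-\frac12+\epsilon}\psi(P)B\psi(P)x^{-\frac12+\epsilon} \ge c_0^2\,x^{2\epsilon}\psi(P)^2x^{2\epsilon} + hR_1,\qquad R_1\in\Psi^{-\infty,1,0}_{sc,h}(X). \tag{3.7} \]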
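Adding $c_0^2\,x^{2\epsilon}(\mathrm{Id}-\psi(P)^2)x^{2\epsilon}$ to both sides gives
\[ x^{-\frac12+\epsilon}\psi(P)B\psi(P)x^{-\frac12+\epsilon} + c_0^2\,x^{2\epsilon}(\mathrm{Id}-\psi(P)^2)x^{2\epsilon} \ge c_0^2\,x^{4\epsilon} + hR_1. \tag{3.8} \]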
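We also note that $|\langle x^{\frac12-\epsilon}u,R_1x^{\frac12-\epsilon}u\rangle|\le C'\|x^{1-\epsilon}u\|^2$. Thus, applying both sides of (3.8) to $x^{\frac12-\epsilon}u$, and pairing with $x^{\frac12-\epsilon}u$ afterwards yields
\begin{align*}
c_0^2\|x^{\frac12+\epsilon}u\|^2 &\le \langle u,\psi(P)B\psi(P)u\rangle + c_0^2\,|\langle(\mathrm{Id}+\psi(P))x^{\frac12+\epsilon}u,(\mathrm{Id}-\psi(P))x^{\frac12+\epsilon}u\rangle| + C'h\|x^{1-\epsilon}u\|^2\\
&\le \langle u,\psi(P)B\psi(P)u\rangle + 2c_0^2\delta\|x^{\frac12+\epsilon}u\|^2 + \delta^{-1}\|(\mathrm{Id}-\psi(P))x^{\frac12+\epsilon}u\|^2 + C'h\|x^{1-\epsilon}u\|^2. \tag{3.9}
\end{align*}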
The last term is clearly bounded by $C'h\|x^{\frac12+\epsilon}u\|^2$ and the second to last term can be estimated using (2.9). Choosing $\delta<1/4$, $h_1=c_0^2/4C'$ gives that for $h\in(0,h_1)$,
\[ \|x^{\frac12+\epsilon}u\|^2 \le C_1\langle u,\psi(P)B\psi(P)u\rangle + C_2\|(P-(\lambda+it))u\|^2_{H^{-2,-\frac12-\epsilon}_{sc}(X)}. \tag{3.10} \]
The norm in the second term on the right hand side can be replaced by the $H^{-2,\frac12+\epsilon}_{sc}(X)$ norm. Combining (3.6) and (3.10), we thus conclude that there exists $h_0>0$ such that for $h\in(0,h_0)$,
\[ \langle u,\psi(P)B\psi(P)u\rangle \le Ch^{-2}\|x^{-\frac12-3\epsilon}(P-(\lambda+it))u\|^2. \tag{3.11} \]
Again using (3.10), we conclude that for all $\epsilon>0$,
\[ \|u\|_{H^{0,-1/2-\epsilon}_{sc}(X)} \le Ch^{-1}\|(P-(\lambda+it))u\|_{H^{0,1/2+3\epsilon}_{sc}(X)},\qquad h\in(0,h_0). \tag{3.12} \]
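The way (3.6) and (3.10) combine into (3.11) and (3.12) can be made explicit. The following is a sketch under stated assumptions: the shorthand $A$, $V$, $W$, $\rho$, $K$ is introduced only here, $\delta$ is the parameter of (3.5) and (3.6), and $C_2'$ absorbs the comparison of the $H^{-2,\frac12+\epsilon}_{sc}$ norm with $W$, which uses that $x$ is bounded on $X$.

```latex
% Shorthand used only in this sketch (not the authors' notation):
%   A = <u, psi(P) B psi(P) u>,        V = ||x^{1/2+eps} u||^2,
%   W = ||x^{-1/2-3eps}(P-(lambda+it))u||^2,
%   rho = ||x^{-1/2-eps} R x^{-1/2-eps}||,  K = ||x^{-1/2-eps} Q^*Q x^{1/2+3eps}||^2.
\begin{align*}
hA &\le (\delta h + h^2\rho)\,V + \delta^{-1}h^{-1}K\,W && \text{(3.6)}\\
V  &\le C_1A + C_2'W                                    && \text{(3.10)}
\intertext{Choose $\delta = 1/(4C_1)$ and $h_0\le\min(h_1,1)$ with $h_0\rho\,C_1\le 1/4$,
so that $(\delta+h\rho)C_1\le 1/2$ for $h\in(0,h_0)$. Then}
hA &\le \tfrac{h}{2}A + \big((\delta h+h^2\rho)C_2' + \delta^{-1}h^{-1}K\big)W,\\
A  &\le 2\big((\delta+h\rho)C_2' + \delta^{-1}h^{-2}K\big)W \le Ch^{-2}W && \text{(3.11)}\\
V  &\le C_1A + C_2'W \le C''h^{-2}W
   \;\Longrightarrow\;
   \|x^{\frac12+\epsilon}u\| \le Ch^{-1}\|x^{-\frac12-3\epsilon}(P-(\lambda+it))u\| && \text{(3.12)}
\end{align*}
```

The loss of $h^{-2}$ in (3.11) comes entirely from the factor $\delta^{-1}h^{-1}$ in (3.6) after dividing by $h$; taking square roots then produces the $h^{-1}$ of (3.12).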
We can modify this argument slightly by inserting $(P+i)(P+i)^{-1}$ in (3.5) between $Q$ and $P-(\lambda+it)$, to see that the last factor in (3.6) can be replaced by $\|(P+i)^{-1}x^{-\frac12-3\epsilon}(P-(\lambda+it))u\|^2$, and correspondingly the norm on the right-hand side of (3.12) can be replaced by $\|(P-(\lambda+it))u\|_{H^{-2,1/2+3\epsilon}_{sc}(X)}$. A further slight modification in the same spirit allows us to conclude that the smoothness order $r$ in $H^{r,s}_{sc}(X)$ can be shifted by the same amount on both sides of (3.12):
\[ \|u\|_{H^{r,-1/2-\epsilon}_{sc}(X)} \le Ch^{-1}\|(P-(\lambda+it))u\|_{H^{r-2,1/2+3\epsilon}_{sc}(X)},\qquad h\in(0,h_0). \tag{3.13} \]
Now let $u=u_t=R(\lambda+it)f$, $f\in H^{r,1/2+3\epsilon}_{sc}(X)$. Since $R(\lambda+it)=(P-(\lambda+it))^{-1}\in\Psi^{-2,0,0}_{scc,h}(X)$ for $t>0$, we see that $u_t\in H^{r+2,1/2+3\epsilon}_{sc}(X)$ for $t>0$. Thus, the above estimate is applicable and we conclude that
\[ \|R(\lambda+it)f\|_{H^{r+2,-1/2-\epsilon}_{sc}(X)} \le Ch^{-1}\|f\|_{H^{r,1/2+3\epsilon}_{sc}(X)},\qquad h\in(0,h_0). \tag{3.14} \]
Note that for a fixed $\psi$, we can let $\lambda$ be arbitrary inside the region where $\psi\equiv 1$, so a compactness argument gives the uniform estimate in $\lambda$ as stated in our main Theorem.

Remark 3.1. As in Melrose's paper [10], using these estimates one can show that for fixed $h>0$, the limits $R(\lambda\pm i0)f$ exist in $H^{r+2,-1/2-\epsilon}_{sc}(X)$ for $f\in H^{r,1/2+\epsilon}_{sc}(X)$, $\epsilon>0$. Hence, (3.14) yields
\[ \|R(\lambda+i0)f\|_{H^{r+2,-1/2-\epsilon}_{sc}(X)} \le Ch^{-1}\|f\|_{H^{r,1/2+\epsilon}_{sc}(X)},\qquad h\in(0,h_0), \tag{3.15} \]
as well.

4. Symbol Construction

Let $p$ be the principal symbol of $P$. Thus, near $\partial X$,
\[ p = \tau^2 + g_\partial(y,\mu) + x^\gamma r,\qquad r\in S^{2,0}(X), \tag{4.1} \]
where $g_\partial$ is the metric on the boundary, and we denote the metric function on the cotangent bundle the same way. Its Hamilton vector field $H_p$ is of the form
\[ x\big(2\tau(x\partial_x+\mu\cdot\partial_\mu) - 2g_\partial\partial_\tau + H_{g_\partial}\big) + x^{1+\gamma}W,\qquad W\in\mathcal{V}_b({}^{sc}T^*X)\otimes S^{1,0}(X); \tag{4.2} \]
see [10, Eq. (8.17)] for a detailed calculation. Here we will be mainly concerned with the $(x,\tau)$ variables, so we rewrite this as
\[ H_p = x(2\tau+x^\gamma a)(x\partial_x) - x(2g_\partial+x^\gamma b)\partial_\tau + 2x\tau\,\mu\cdot\partial_\mu + xH_{g_\partial} + x^{1+\gamma}W', \tag{4.3} \]
where $a,b\in S^{1,0}(X)$, and $W'$ is now a vector field tangent to the $\partial X$ fibers, i.e. it is a vector field in $\partial_y$ and $\partial_\mu$ with coefficients in $S^{1,0}(X)$. In this section we take $\lambda^2$, not $\lambda$, as the spectral parameter.
[Figure 3 not reproduced.]
Fig. 3. The projection of the characteristic variety $\Sigma(P-\lambda^2)$ and the bicharacteristics of $H_p$ inside it to the $(\tau,\mu)$-plane
As indicated, we make the assumption that a small interval of energies around $\lambda^2$ is non-trapping, i.e. there exists $\delta_0>0$ such that for any $\xi\in T^*X^\circ$ satisfying $p(\xi)\in(\lambda^2-\delta_0,\lambda^2+\delta_0)$,
\[ \lim_{t\to\pm\infty} x(\exp(tH_p)(\xi)) = 0. \tag{4.4} \]
Now,
\[ H_p(x^{-1}\tau) = -2(\tau^2+g_\partial) + x^\gamma f,\qquad f\in S^{1,0}(X), \tag{4.5} \]
so there exists $\epsilon_1>0$ such that for $\xi\in{}^{sc}T^*X$ satisfying $p(\xi)\in(\lambda^2/2,2\lambda^2)$, $x<\epsilon_1$, we have $-(H_p(x^{-1}\tau))(\xi)\ge c_0>0$. Since $p$ is constant along integral curves of $H_p$, we see that if $\exp(-tH_p)(\xi)$, $t\ge T$, stays in $x<\epsilon_1$ (which holds under our non-trapping assumption for sufficiently large $T$), then $x^{-1}\tau$ tends to $+\infty$; in particular $\tau$ is nonnegative for all large $t$. By reducing $\epsilon_1>0$ if necessary, we also see that there exist $\delta_1>0$, $\epsilon_1>0$ such that for $\xi\in{}^{sc}T^*X$,
\[ |p(\xi)-\lambda^2|<\delta_1,\quad x(\xi)<\epsilon_1,\quad |\tau|<7\lambda/8 \;\Rightarrow\; g_\partial(\xi)\ge c_1>0. \tag{4.6} \]
Reducing $\epsilon_1>0$ further if necessary, we can thus arrange that
\[ |p(\xi)-\lambda^2|<\delta_1,\quad x<\epsilon_1,\quad |\tau|<7\lambda/8 \;\Rightarrow\; -x^{-1}H_p\tau(\xi)\ge c_1>0. \tag{4.7} \]
Thus, we see that given any $x_0>0$, $\xi\in T^*X^\circ$ with $|p(\xi)-\lambda^2|<\delta_1$, there exists $T>0$ such that
\[ t\ge T \;\Rightarrow\; \tau(\exp(-tH_p)(\xi))>2\lambda/3,\quad x(\exp(-tH_p)(\xi))<x_0/2. \tag{4.8} \]
We now define a symbol $q\in S^{-\infty,0}(X)$ whose most important properties are that
\[ q\ge 0 \quad\text{and}\quad x^{-1}H_pq\le 0. \tag{4.9} \]
We will always use a localization in the energy via a factor $\psi(p)$, where $\psi\in C_c^\infty(\mathbb{R})$ is supported in $(\lambda^2-\delta,\lambda^2+\delta)$, where $\delta\in(0,\lambda^2)$ is a fixed small constant with $\delta<\delta_1$, $\delta_1$ as above. Let
\[ M = \sup\{|a(\xi)|+|b(\xi)| : p(\xi)\le 2\lambda^2\} < +\infty; \tag{4.10} \]
here we used that $p^{-1}((-\infty,2\lambda^2])$ is a compact subset of ${}^{sc}T^*X$. Also, let
\[ x_0 = \min\{(\lambda/6(M+1))^{1/\gamma},\ (c_1/2(M+1))^{1/\gamma},\ \epsilon_1\}. \tag{4.11} \]
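Let $\chi_-\in C^\infty(\mathbb{R})$ be supported in $(\lambda/3,+\infty)$, identically 1 on $(2\lambda/3,+\infty)$, with $\chi_-'\ge 0$, and similarly let $\chi_+\in C^\infty(\mathbb{R})$ be supported in $(-\infty,-\lambda/3)$, identically 1 on $(-\infty,-2\lambda/3)$, with $\chi_+'\le 0$. Also, let $\chi_\partial\in C_c^\infty(\mathbb{R})$ be supported in $(-7\lambda/8,7\lambda/8)$, with $\chi_\partial'\ge(6\lambda/c_1)\chi_\partial\ge 0$ on $(-7\lambda/8,3\lambda/4)$, and $\chi_\partial(-3\lambda/4)>0$. Let $\rho\in C_c^\infty([0,+\infty))$ be identically 1 on $[0,1/2]$, supported in $[0,1)$, $\rho'\le 0$ on $[0,+\infty)$. In the incoming region we will take the symbol
\[ q_- = x^{-\epsilon}\chi_-(\tau)\psi(p)\rho(x/x_0), \tag{4.12} \]
in the outgoing one the symbol
\[ q_+ = x^{\epsilon}\chi_+(\tau)\psi(p)\rho(x/x_0), \tag{4.13} \]
with $\epsilon\in(0,\frac14)$. In the intermediate region we take
\[ q_\partial = x^{-\epsilon}\chi_\partial(\tau)\psi(p)\rho(x/x_0). \tag{4.14} \]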
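Note that for any $\alpha\in\mathbb{R}$, $\chi,\rho\in C^\infty(\mathbb{R})$,
\[ x^{-\alpha-1}H_p\big(x^\alpha\chi(\tau)\rho(x/r)\big) = (2\tau+x^\gamma a)\big(\alpha\rho(x/r)+r^{-1}\rho'(x/r)\big)\chi(\tau) - (2g_\partial(\xi)+x^\gamma b)\rho(x/r)\chi'(\tau). \tag{4.15} \]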
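Note that in the definition of $q_-$, $\alpha=-\epsilon<0$, so $\alpha\rho(x/r)+r^{-1}\rho'(x/r)\le 0$ everywhere. Moreover, on $\operatorname{supp}\chi_-$, $\tau>\lambda/3>0$, so for $x(\xi)\le x_0$, $\xi\in\operatorname{supp}\psi(p)$, $\tau(\xi)\in\operatorname{supp}\chi_-$, we have $2\tau+x^\gamma a\ge\lambda/3>0$. In addition, $\tau\le 2\lambda/3$ on $\operatorname{supp}\chi_-'$, so if $\xi\in\operatorname{supp}(\rho(x/x_0)\chi_-'(\tau)\psi(p))$ then $g_\partial\ge c_1>0$, hence $2g_\partial+x^\gamma b\ge c_1>0$ there. Thus,
\[ x^{-1+\epsilon}H_pq_-\le 0. \tag{4.16} \]
Moreover, $x\le x_0/2$ implies $\rho'(x/x_0)=0$, and $\tau\ge 2\lambda/3$ implies $\chi_-'(\tau)=0$, so
\[ x\le x_0/2,\quad \tau\ge 2\lambda/3 \;\Rightarrow\; -x^{-1+\epsilon}H_pq_-\ge c_2\psi(p),\qquad c_2>0. \tag{4.17} \]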
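The two lower bounds used for (4.16) are where the choice of $x_0$ in (4.11) enters. The following sketch assumes $\gamma>0$ and reads the first two entries of (4.11) as $x_0^\gamma\le\lambda/(6(M+1))$ and $x_0^\gamma\le c_1/(2(M+1))$; both are interpretations made for this note.

```latex
% Assumptions of this sketch: \gamma > 0, x \le x_0, p(\xi) \le 2\lambda^2,
% so that |a|, |b| \le M by (4.10).
\begin{align*}
x^\gamma|a| &\le x_0^\gamma M \le \frac{\lambda M}{6(M+1)} < \frac{\lambda}{6},
&
x^\gamma|b| &\le x_0^\gamma M \le \frac{c_1M}{2(M+1)} < \frac{c_1}{2},\\
\tau>\tfrac{\lambda}{3} &\;\Rightarrow\; 2\tau + x^\gamma a > \tfrac{2\lambda}{3}-\tfrac{\lambda}{6} = \tfrac{\lambda}{2} \ge \tfrac{\lambda}{3},
&
g_\partial\ge c_1 &\;\Rightarrow\; 2g_\partial + x^\gamma b > 2c_1-\tfrac{c_1}{2} \ge c_1 .
\end{align*}
```

The bound $g_\partial\ge c_1$ is available on $\operatorname{supp}\chi_-'$ because there $\lambda/3\le\tau\le 2\lambda/3$, so (4.6) applies with $x<x_0\le\epsilon_1$ and $\delta<\delta_1$.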
The difference between $q_-$ and $q_+$ is that $\tau\rho'$ is positive on $\operatorname{supp}\chi_+$, and $-\chi_+'$ is also positive, so the negativity estimate only holds away from $\operatorname{supp}\rho'$ and $\operatorname{supp}\chi_+'$. Thus, there is no analogue of (4.16), but the following analogue of (4.17) still holds:
\[ x\le x_0/2,\quad \tau\le -2\lambda/3 \;\Rightarrow\; -x^{-1-\epsilon}H_pq_+\ge c_3\psi(p),\qquad c_3>0. \tag{4.18} \]
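Next, $q_\partial$ provides the connection between the incoming and outgoing regions. Since $\chi_\partial'$ can be used to estimate $\chi_\partial$ on $(-7\lambda/8,3\lambda/4)$, we see that
\[ \tau(\xi)\in(-7\lambda/8,3\lambda/4),\quad x(\xi)\le x_0/2,\quad \xi\in\operatorname{supp}\psi(p) \;\Rightarrow\; |(2\tau+x^\gamma a)\chi_\partial(\tau)|\le c_1\chi_\partial'(\tau)/2. \tag{4.19} \]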
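Since $\alpha=-\epsilon$, $|\alpha|<1$, so we conclude that
\[ \tau(\xi)\in(-7\lambda/8,3\lambda/4),\quad x(\xi)\le x_0/2,\quad \xi\in\operatorname{supp}\psi(p) \;\Rightarrow\; -x^{-1+\epsilon}H_pq_\partial\ge c_1\chi_\partial'(\tau)\psi(p)\ge 0. \tag{4.20} \]
Note that on $(-3\lambda/4,3\lambda/4)$, $\chi_\partial'\ge C>0$, so
\[ x(\xi)\le x_0/2,\quad \tau(\xi)\in(-3\lambda/4,3\lambda/4),\quad \xi\in\operatorname{supp}\psi(p) \;\Rightarrow\; -x^{-1+\epsilon}H_pq_\partial\ge c_4\psi(p),\qquad c_4>0. \tag{4.21} \]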
For $\xi\in T^*X^\circ$ with $p(\xi)\in(\lambda^2-\delta_0,\lambda^2+\delta_0)$, take $T=T_\xi>0$ as in (4.8), so for $t\ge T$ we have $\tau(\exp(-tH_p)(\xi))>2\lambda/3$, $x(\exp(-tH_p)(\xi))<x_0/2$. We will define a symbol $q_\xi$ which is supported in a neighborhood of the bicharacteristic segment $\{\exp(-tH_p)(\xi) : t\in[0,T+1]\}$, and which satisfies $H_pq_\xi\le 0$ over
\[ K' = \{\xi'\in T^*X^\circ : x(\xi')\ge x_0/2 \ \text{ or }\ (x(\xi')\le x_0/2 \ \text{ and }\ \tau(\xi')\le 2\lambda/3)\}. \tag{4.22} \]
Namely, let $\Sigma$ be a hypersurface through $\xi$ which is transversal to $H_p$. Then there is a neighborhood $U_\xi$ of $\xi$, such that $V_\xi=\{\exp(-tH_p)(U_\xi\cap\Sigma) : t\in(-1,T+2)\}$ is a neighborhood of the above bicharacteristic segment, which we can think of as a product $(-1,T+2)\times(U_\xi\cap\Sigma)$, and $(T+1/2,T+2)\times(U_\xi\cap\Sigma)$ is disjoint from $K'$. Now let $\phi_\xi\in C_c^\infty(U_\xi\cap\Sigma)$ be identically 1 near $\xi$, and let $\chi_\xi\in C_c^\infty(\mathbb{R})$ be supported in $(-1,T+2)$, $\chi_\xi\ge 0$, $\chi_\xi'\ge 0$ on $(-1,T+2/3)$. Using the product coordinates, we can think of $\phi_\xi$ and $\chi_\xi$ as functions on ${}^{sc}T^*X$ with compact support in $V_\xi$. Let
\[ q_\xi = \chi_\xi\phi_\xi\psi(p), \tag{4.23} \]
so
\[ H_pq_\xi = -\chi_\xi'\phi_\xi\psi(p). \tag{4.24} \]
Thus, for $\xi'\in K'$, $H_pq_\xi(\xi')\le 0$. Now let $K\subset T^*X^\circ$ be the compact set
\[ K = \{\xi\in T^*X^\circ : \xi\in\operatorname{supp}\psi(p),\ x(\xi)\ge x_0/4\}. \tag{4.25} \]
Since $K$ is compact, applying the previous argument for every $\xi\in K$ gives a $U_\xi$, and a $U_\xi'\subset U_\xi$ on which $\phi_\xi=1$. Since $\{U_\xi' : \xi\in K\}$ covers $K$, the compactness of $K$ shows that we can pass to a finite subcover, $\{U_{\xi_j}' : j=1,\dots,N\}$. We let
\[ q_\circ = \sum_{j=1}^N q_{\xi_j}. \tag{4.26} \]
The symbol we use in the positive commutator estimate is
\[ q = q_- + C''q_\partial + Cq_\circ + C'q_+, \tag{4.27} \]
with $C,C',C''>0$ chosen appropriately. Namely note that in the region $x\le x_0/2$, $\tau\ge 2\lambda/3$, which is the only place where $H_pq_\circ$ is positive, we have the estimate $-x^{-1+\epsilon}H_pq_-\ge c_2>0$. Since $x^{-1+\epsilon}H_pq_\circ$ is bounded, we can choose $C>0$ sufficiently small so that $-x^{-1+\epsilon}H_p(q_-+Cq_\circ)$ is still bounded below by a positive constant
[Figure 4 not reproduced; the regions in it are labelled supp $q_+$, supp $q_\partial$, supp $q_\circ$ and supp $q_-$.]
Fig. 4. Supports of $q_+$, $q_-$, $q_\circ$ and $q_\partial$ for $X=\mathbb{R}^n$. ${}^{sc}T^*X$ is identified with $\mathbb{R}^n\times\mathbb{R}^n_\xi$, and the covector $\xi$ is fixed on the picture
in this region. Then $-x^{-1+\epsilon}H_p(q_-+Cq_\circ)$ is non-negative everywhere, and it is bounded below by a positive constant on $x\ge x_0/2$ as well as on $x\le x_0/2$, $\tau\ge 2\lambda/3$. But this is the only region where the bounded function $x^{-1+\epsilon}H_pq_\partial$ is positive, so by choosing $C''>0$ sufficiently small, we can arrange that $-x^{-1+\epsilon}H_p(q_-+Cq_\circ+C''q_\partial)$ is non-negative everywhere, and it is bounded below by a positive constant on $x\ge x_0/2$, as well as on $x\le x_0/2$, $\tau\ge -3\lambda/4$. But this is the only region where $x^{-1-\epsilon}H_pq_+>0$. Thus, by choosing $C'>0$ sufficiently small, and taking into account that $x^{-1+\epsilon}H_pq_+ = x^{2\epsilon}\,x^{-1-\epsilon}H_pq_+$, with $x^{-1-\epsilon}H_pq_+$ as well as $x^{2\epsilon}$ bounded, we can arrange that $-x^{-1+\epsilon}H_pq$ is nonnegative everywhere, and $-x^{-1-\epsilon}H_pq$ is bounded below by a positive constant everywhere. In summary, we have proved the proposition needed in Sect. 3 (see (3.1)).

Proposition 4.1. There exist functions $q\in S^{-\infty,-\epsilon}(X)$, $\psi\in C_c^\infty(\mathbb{R};[0,1])$, $\psi\equiv 1$ near $\lambda^2$, and $c_0,c_0'>0$ such that
\[ q\ge c_0x^{\epsilon}\psi(p),\qquad -H_pq\ge c_0'x^{1+\epsilon}\psi(p). \tag{4.28} \]
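To see how (4.28) feeds into the hypothesis (3.1), one can multiply the two inequalities, both of whose sides are non-negative. The following is only a sketch: it treats the constant on the right as the lower bound required in (3.1), even though the letter $c_0$ is used with different meanings in the two places.

```latex
% Both factors are non-negative by (4.28), so the inequalities may be multiplied.
\[
-2q\,H_pq \;\ge\; 2\,\big(c_0x^{\epsilon}\psi(p)\big)\big(c_0'x^{1+\epsilon}\psi(p)\big)
 \;=\; 2c_0c_0'\,x^{1+2\epsilon}\,\psi(p)^2 ,
\]
% i.e. 2qH_pq = -b\,\psi(p)^2 with b bounded below by a positive multiple of x^{1+2\epsilon}.
```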
Thus, the results of the previous section show that there exists $h_0>0$ such that for $h\in(0,h_0)$,
\[ \|R(\lambda^2+it)f\|_{H^{*,-1/2-\epsilon}_{sc}(X)} \le C_0h^{-1}\|f\|_{H^{*,1/2+3\epsilon}_{sc}(X)}. \tag{4.29} \]
Acknowledgements. The authors are grateful to the National Science Foundation for partial support grants number DMS-99-70607 and DMS-99-70614.
References
1. Agmon, Sh.: Spectral theory of Schrödinger operators on Euclidean and non-Euclidean spaces. Comm. Pure Appl. Math. 39, 3–16 (1986)
2. Bruneau, V., and Petkov, V.: Semiclassical resolvent estimates for trapping perturbations. To appear in Commun. Math. Phys.
3. Dimassi, M., and Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. Cambridge University Press, 1999
4. Gérard, Ch.: Semiclassical resolvent estimates for two and three body Schrödinger operators. Comm. P.D.E. 15, 1161–1178 (1990)
5. Gérard, Ch., and Martinez, A.: Principe d’absorption limite pour des opérateurs de Schrödinger à longue portées. C.R. Acad. Sci. Paris 306, 121–123 (1988)
6. Helffer, B., and Sjöstrand, J.: Resonances en limite semi-classique. Mém. Soc. Math. France (N.S.) 24-25, (1986)
7. Hörmander, L.: On the existence and the regularity of solutions of linear pseudo-differential equations. Enseignement Math. 17 (2), 99–163 (1971)
8. Hörmander, L.: Linear partial differential equations. Vol. 3, Berlin: Springer Verlag, 1985
9. Jensen, A., Mourre, E. and Perry, P.: Multiple commutator estimates and resolvent smoothness in quantum scattering theory. Ann. Inst. H. Poincaré (phys. théor.) 41, 207–225 (1984)
10. Melrose, R.B.: Spectral and scattering theory for the Laplacian on asymptotically Euclidean spaces. In: Spectral and scattering theory (M. Ikawa, ed.), New York: Marcel Dekker, 1994, pp. 85–130
11. Melrose, R.B., and Zworski, M.: Scattering metrics and geodesic flow at infinity. Invent. Math. 124, 389–436 (1996)
12. Robert, D.: Asymptotique de la phase de diffusion à haute énergie pour des perturbations du second ordre du laplacien. Ann. Sci. École Norm. Sup. 25, 107–134 (1992)
13. Robert, D.: Relative time-delay for perturbations of elliptic operators and semiclassical asymptotics. J. Funct. Anal. 126, 36–82 (1994)
14. Robert, D., and Tamura, H.: Semiclassical estimates for resolvents and asymptotics for total scattering cross-sections. Ann. Inst. H. Poincaré (phys. théor.) 47, 415–442 (1987)
15. Wang, X.P.: Time-decay of scattering solutions and classical trajectories. Ann. Inst. H. Poincaré (phys. théor.) 47, 25–37 (1987)
16. Wunsch, J., and Zworski, M.: Distribution of resonances for asymptotically Euclidean manifolds. Preprint, 1999

Communicated by B. Simon
Commun. Math. Phys. 212, 219 – 243 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
On the Constraints Defining BPS Monopoles
C. J. Houghton, N. S. Manton, N. M. Romão
Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Silver Street, Cambridge CB3 9EW, UK
Received: 15 October 1999 / Accepted: 19 January 2000
Abstract: We discuss the explicit formulation of the transcendental constraints defining spectral curves of $SU(2)$ BPS monopoles in the twistor approach of Hitchin, following Ercolani and Sinha. We obtain an improved version of the Ercolani–Sinha constraints, and show that the Corrigan–Goddard conditions for constructing monopoles of arbitrary charge can be regarded as a special case of these. As an application, we study the spectral curve of the tetrahedrally symmetric 3-monopole, an example where the Corrigan–Goddard conditions need to be modified. A particular 1-cycle on the spectral curve plays an important rôle in our analysis.

1. Introduction

BPS monopoles for the $SU(2)$ Yang–Mills–Higgs gauge theory have been studied for over twenty years, using a number of different approaches. Twistor methods, relating the solutions of the integrable differential equations of the model to holomorphic vector bundles over a so-called twistor space, were first introduced by Ward, adapting previous work on the self-duality equations for the pure Yang–Mills theory on $\mathbb{R}^4$. They enabled solutions of magnetic charge $k>1$ to be constructed for the first time [15]. A twistor approach intrinsic to the geometry of $\mathbb{R}^3$ was developed later by Hitchin in [7] and [8]. In his formulation, a monopole is associated to a spectral curve, a compact complex curve in $T'\mathbb{CP}^1$, the total space of the holomorphic tangent bundle of $\mathbb{CP}^1$, satisfying a number of conditions which were stated in [8]. Based on this approach, new solutions have been constructed and new characterisations of monopoles developed; we refer to [14] for a brief overview.

The (reduced) moduli space $N_k$ of gauge-inequivalent BPS monopoles of a given charge $k$ is a $(4k-1)$-dimensional manifold, which has been described in several ways. If we adopt the twistor formulation in terms of spectral curves, it can be characterised as the space of complex curves in $T'\mathbb{CP}^1$ satisfying a number of transcendental constraints.
For the case where the curve is nonsingular, Ercolani and Sinha attempted to formulate
these constraints explicitly in [4]; they followed essentially the method of Hurtubise in [12], who achieved a satisfactory description of $N_2$. Their approach leads to a method of determining constraints on spectral curves by analysing objects of the function theory obtained on them. These constraints parallel the Corrigan–Goddard conditions [3] for constructing $SU(2)$ monopoles, and we shall clarify how they relate to each other.

This paper is organised as follows. We start by introducing the relevant aspects of the twistor approach for monopoles in Sect. 2, in order to fix the notation. In Sect. 3, we review the method of Ercolani and Sinha, and present a new version (22) of their constraint equations which involves a special 1-cycle $c$ on the spectral curve. The Corrigan–Goddard conditions were originally formulated in terms of integrals around the equator of $\mathbb{CP}^1$, but we show in Sect. 4 how to interpret them equivalently as integrals on the spectral curve. Moreover, we establish that the conditions in the two methods agree except for one detail: the Corrigan–Goddard approach enforces $c$ to be of a special sort, namely a combination of lifts of the equator of $\mathbb{CP}^1$ to the spectral curve. In Sect. 5, we apply the Ercolani–Sinha method to compute the spectral curve of the tetrahedral 3-monopole discussed in [9]. Thereby, the corresponding 1-cycle $c$ is determined and the result shows that the Corrigan–Goddard assumption about $c$ is too restrictive in general; we also consider the action of the tetrahedral group $A_4\subset SO(3)$ on the homology of the spectral curve and show that it leaves $c$ invariant. Finally, we present some concluding remarks in Sect. 6.

2. Twistor Methods for BPS Monopoles

Magnetic BPS monopoles with gauge group $SU(2)$ are defined as gauge equivalence classes of solutions $(A,\phi)$ to the Bogomol'nyĭ equations in $\mathbb{R}^3$,
\[ *F_A = \pm\nabla_A\phi, \tag{1} \]
satisfying boundary conditions (see [2]) that ensure finiteness of the energy functional; here $A$ is a connection 1-form (with covariant derivative $\nabla_A$ and curvature 2-form $F_A$), and $\phi$ (the Higgs field) a function, both taking values in $\mathfrak{su}(2)$. Such solutions can be interpreted as particle-like solitons carrying discrete magnetic charge. They are associated with an integer $k\in\mathbb{Z}$ (with $\pm k>0$ according to the sign in Eq. (1)), which corresponds to the magnetic charge of the field configuration in suitable units and classifies the solutions homotopically; we take $k>0$ throughout. The Bianchi identity together with (1) implies that BPS monopoles are also static classical solutions of the corresponding Yang–Mills–Higgs theory in the BPS limit, in which the Higgs potential is set to zero, and they correspond exactly to the minima of the energy functional.

Equations (1) are integrable and their solutions can be studied using methods of complex algebraic geometry. This was formulated by Hitchin in [7] as follows. The space $\mathbf{T}$ of oriented geodesics (straight lines) of $\mathbb{R}^3$ is a 4-dimensional manifold: a point on it can be specified by a pair of vectors $(u,v)\in\mathbb{R}^{3+3}$, where $u$ has unit length and defines the orientation of the line, while $v$ gives the position of the point on the line closest to the origin and is thus orthogonal to $u$. This manifold admits a natural integrable almost complex structure, given at each point $(u,v)$ by taking the cross product with $u$ of each of the pair of vectors representing a tangent vector. It turns out that $\mathbf{T}$, endowed with this complex structure, is isomorphic as a complex surface to the total space $T'\mathbb{CP}^1$ of the holomorphic tangent bundle to the Riemann sphere. The isomorphism takes $u$ to the corresponding point in $\mathbb{CP}^1\cong S^2$ and $v$ to the obvious complex coordinate on the fibre. We will consider the standard affine pieces $U_0$ and $U_1$ of $\mathbb{CP}^1$, identifying
the affine coordinate $\zeta$ on $U_0$ with the stereographic projection from the south pole and letting $\eta$ denote the corresponding coordinate on the fibre; thus a tangent vector $\eta\,\frac{\partial}{\partial\zeta}\big|_\zeta$ is assigned the pair $(\eta,\zeta)$. We let $\pi$ be the natural projection $\mathbf{T}\to\mathbb{CP}^1$, given in these coordinates by $(\eta,\zeta)\mapsto\zeta$, and denote again by $U_0$, $U_1$ the pre-images under $\pi$ of the affine pieces of $\mathbb{CP}^1$.

In the literature, $\mathbf{T}$ is often called mini-twistor space. It admits a real structure $\tau:\mathbf{T}\to\mathbf{T}$, which is the anti-holomorphic involution corresponding to the reversal of direction of oriented lines in $\mathbb{R}^3$; it obviously has no fixed points. In terms of our coordinates, it can be seen to be given by
\[ \tau:(\eta,\zeta)\mapsto\Big(-\frac{\bar\eta}{\bar\zeta^2},\,-\frac{1}{\bar\zeta}\Big). \tag{2} \]
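The stated properties of $\tau$ can be checked directly from (2). The computation is a sketch in the coordinates $(\eta,\zeta)$ on $U_0\cap U_1$, where $\zeta\neq 0$; the primed symbols $(\eta',\zeta')=\tau(\eta,\zeta)$ are notation introduced for this check.

```latex
% (\eta',\zeta') := \tau(\eta,\zeta) = (-\bar\eta/\bar\zeta^2,\ -1/\bar\zeta); notation of this note only.
\begin{align*}
\tau^2:\quad & -\frac{1}{\overline{\zeta'}} = -\frac{1}{-1/\zeta} = \zeta,
\qquad
-\frac{\overline{\eta'}}{\overline{\zeta'}^{\,2}} = -\frac{-\eta/\zeta^2}{1/\zeta^2} = \eta
&& \text{(involution)}\\
\text{fixed points:}\quad & \zeta = -\frac{1}{\bar\zeta} \iff |\zeta|^2 = -1
&& \text{(impossible)}\\
\text{the function } \eta/\zeta:\quad & \frac{\eta'}{\zeta'} = \frac{-\bar\eta/\bar\zeta^2}{-1/\bar\zeta} = \overline{\Big(\frac{\eta}{\zeta}\Big)}
&& \text{(conjugated by } \tau)
\end{align*}
```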
The group $SO(3)$ of rotations in $\mathbb{R}^3$ induces an action on $\mathbf{T}$, which can be easily described in the coordinates $(\eta,\zeta)$ in terms of the corresponding $PSU(2)$ transformations: The matrix
\[ \begin{pmatrix} p & q\\ -\bar q & \bar p \end{pmatrix}\in PSU(2),\qquad |p|^2+|q|^2=1 \]
acts on the affine coordinate $\zeta$ as
\[ \zeta\mapsto\frac{\bar p\,\zeta-\bar q}{q\zeta+p} \tag{3} \]
and this corresponds to a rotation by $\theta$ around the direction $n\in S^2$ with $n_1\sin\frac\theta2=\operatorname{Im}q$, $n_2\sin\frac\theta2=-\operatorname{Re}q$, $n_3\sin\frac\theta2=-\operatorname{Im}p$ and $\cos\frac\theta2=\operatorname{Re}p$; $\eta$ transforms by multiplication by the derivative of (3),
\[ \eta\mapsto\frac{\eta}{(q\zeta+p)^2}, \]
since it is the fibre coordinate of $T'\mathbb{CP}^1$ corresponding to $\zeta$. It is clear from the definitions that the action of $SO(3)$ commutes with the $\mathbb{Z}_2$ action generated by $\tau$.

For each $s\in\mathbb{R}$ and $k\in\mathbb{Z}$, we define a holomorphic line bundle $L^s(k)$ on $\mathbf{T}$ through the transition function
\[ g_{01}^{(s,k)}: U_0\cap U_1\to\mathbb{C}^*,\qquad (\eta,\zeta)\mapsto e^{-s\eta/\zeta}\zeta^k \]
with respect to the trivialising cover $\{U_0,U_1\}$ of $\mathbf{T}$; this definition is independent of the stereographic projection used on $\mathbb{CP}^1$. We shall use the notation $L^s$ for $L^s(0)$ and $\mathcal{O}(k)$ for $L^0(k)$. These line bundles play a rôle in the formulation of the twistor correspondence for monopoles, which we now describe.

To a monopole $(A,\phi)$ we associate the complex vector bundle $E\to\mathbf{T}$, whose fibre at an oriented line $\gamma\in\mathbf{T}$ is the complex 2-dimensional space of solutions $u:\gamma\to\mathbb{C}^2$ to the equation
\[ (\nabla_\gamma-i\phi)u=0, \]
where $\nabla_\gamma$ is the restriction of $\nabla_A$ to $\gamma$. The Bogomol'nyĭ equations (1) imply that $E$ is holomorphic; it can be regarded as an extension
\[ 0\to L^\pm\to E\to(L^\pm)^*\to 0 \tag{4} \]
of the line subbundles $L^\pm\subset E$ of solutions decaying exponentially as $t\to\pm\infty$, where $t\in\mathbb{R}$ is the natural coordinate on $\gamma$. It can be shown that, for any monopole of charge $k$, $L^\pm$ is isomorphic to $L^{\pm1}(-k)$; different monopoles correspond to different extensions $E$. Given the two short exact sequences (4), we consider the composite morphism $L^-\to E\to(L^+)^*$, which defines a holomorphic section $P$ of the line bundle $(L^-\otimes L^+)^*\cong\mathcal{O}(2k)$. It will determine a compact curve $S\subset\mathbf{T}$, which is given in our coordinates by an equation
\[ P(\eta,\zeta)=\eta^k+\alpha_1(\zeta)\eta^{k-1}+\dots+\alpha_k(\zeta)=0, \tag{5} \]
where each $\alpha_j$ is a complex polynomial of degree not exceeding $2j$. Notice that the real structure $\tau$ induces antiholomorphic morphisms $L^\pm\xrightarrow{\;\simeq\;}L^\mp$ and thus restricts to a real structure on $S$. This implies that the polynomials $\alpha_j$ in Eq. (5) must satisfy the reality conditions
\[ \alpha_j(\zeta)=(-1)^j\zeta^{2j}\,\overline{\alpha_j\big(-\tfrac{1}{\bar\zeta}\big)}. \tag{6} \]
It can be shown that the three independent real coefficients of $\alpha_1(\zeta)$ may be interpreted as giving the center $(x_1,x_2,x_3)$ of the monopole in $\mathbb{R}^3$,
\[ \alpha_1(\zeta)=k(x_-\zeta^2+2x_3\zeta-x_+), \]
where $x_\pm:=x_1\pm ix_2$, and are thus trivial moduli in the solution, related to the translational symmetry of (1). In the following, we shall only consider centred monopoles; these are defined as having the origin as center and thus have $\alpha_1(\zeta)=0$.

In [8], Hitchin proved that, conversely, any compact real curve $S$ of the linear system $|\mathcal{O}(2k)|$ on $\mathbf{T}$ for which $L^2|_S$ is trivial determines a charge $k$ monopole, which will be smooth if the additional condition
\[ H^0\big(S,L^s(k-2)\big)=0 \tag{7} \]
holds for $0<s<2$. $S$ is called the spectral curve of the monopole and completely determines the gauge equivalence class of the field configuration. It encodes all the information about the monopole; in particular, its genus $g$ is related to the magnetic charge $k$ by
\[ g=(k-1)^2 \tag{8} \]
and every symmetry of $S$ is also a symmetry of the corresponding solution to (1).

3. A New Version of the Ercolani–Sinha Conditions

In [4], Ercolani and Sinha rephrase the condition of triviality of the line bundle $L^2|_S$ in terms of $g$ equations involving periods of 1-forms on the spectral curve $S$. Starting with these equations, which they call the “quantisation conditions”, they propose an algorithm for constructing monopoles in the case where the underlying spectral curve is nonsingular. We now review their argument.

Recall that when $S$ is nonsingular the group $H^0(S,\Omega^1_S)$ of global holomorphic 1-forms on $S$ is a finite-dimensional $\mathbb{C}$-vector space, whose dimension is the genus $g$ of $S$. Locally, these forms can be described, using the adjunction formula, as Poincaré
residues of meromorphic 1-forms on $\mathbf{T}$ with at most simple poles along $S$. Imposing global regularity, it is easy to show that they can be written in our coordinates as
\[ \Omega=\frac{\beta_0\eta^{k-2}+\beta_1(\zeta)\eta^{k-3}+\dots+\beta_{k-2}(\zeta)}{\partial P/\partial\eta}\,d\zeta \tag{9} \]
(on $U_0\cap S$ and away from the branch points of $\pi|_S$), where each $\beta_j$ is a polynomial of degree at most $2j$ with arbitrary coefficients. It is clear from this formula that Eq. (8) indeed holds.

From Eq. (5), it is clear that the spectral curve $S$ can be described as a $k$-sheeted branched cover of $\mathbb{CP}^1$, with projection $\pi|_S:S\to\mathbb{CP}^1$. The reality symmetry implies that the number of branch points is even and that they occur in antipodal pairs. To define the sheets of the cover, which we will label by integers $1,\dots,k$, we have to introduce appropriate branch cuts. We may start by choosing a great circle on the sphere passing through no branch points, and joining the branch points in one of the corresponding hemispheres by non-intersecting cuts; then we apply the antipodal map to these to get further cuts joining the branch points on the other side of the great circle we have chosen. To ensure that each sheet is simply connected, we have to make one last cut, connecting the cuts introduced on the two hemispheres. For the spectral curves we shall consider below, one can argue that this last cut has trivial monodromy and is thus unnecessary; in this situation, the reality structure maps cuts to cuts and can therefore be described in terms of the antipodal map together with an order two permutation of the sheets.

We will be interested in the local behaviour of certain meromorphic forms at the points of the fibre above 0, which we shall denote by $0_j$, $j=1,\dots,k$, and assume to be distinct; this is no loss of generality since there is the freedom of rotating the monopole. Consider the meromorphic function on $S$ defined by $\eta/\zeta$ on $U_0\cap S$; it is easy to see that it has simple poles at the $2k$ points of $(\pi|_S)^{-1}(\{0,\infty\})$ and is holomorphic elsewhere. In a neighbourhood of $0_j$,
\[ d\Big(\frac{\eta}{\zeta}\Big)=\Big(-\frac{\eta_j(0)}{\zeta^2}+O(1)\Big)d\zeta\quad\text{as }\zeta\to0, \tag{10} \]
where $\eta_j(\zeta)$ denotes the local solution of (5) on the $j$th sheet. Given a global holomorphic 1-form $\Omega$, we introduce the notation $g_j$ for the coefficient of $\Omega$ at the point $0_j$ in terms of the local coordinate $\zeta$, i.e.
\[ \Omega|_{0_j}=:g_j\,d\zeta|_{\zeta=0}. \tag{11} \]
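Both the local expansion (10) and the genus count read off from (9) are short computations. The following is a sketch, assuming that $\eta_j$ is holomorphic near $\zeta=0$, which holds because the points $0_j$ are assumed distinct and so are not branch points.

```latex
% Assumption: \eta_j(\zeta) = \eta_j(0) + \eta_j'(0)\,\zeta + O(\zeta^2) near \zeta = 0.
\[
d\Big(\frac{\eta_j}{\zeta}\Big)
 = \Big(\frac{\eta_j'(\zeta)}{\zeta}-\frac{\eta_j(\zeta)}{\zeta^2}\Big)d\zeta
 = \Big(-\frac{\eta_j(0)}{\zeta^2}+O(1)\Big)d\zeta ,
\]
% the 1/\zeta contributions cancel: \eta_j'(0)/\zeta - \eta_j'(0)/\zeta = 0.

% Free coefficients in (9): \beta_j has degree at most 2j, hence 2j+1 coefficients.
\[
\sum_{j=0}^{k-2}(2j+1) = (k-1)^2 = g .
\]
```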
The triviality of the line bundle $L^2|_S$ is equivalent to the existence of a nowhere vanishing holomorphic section $f$; with respect to the trivialisation of $L^2|_S$ over the open sets $U_0\cap S$ and $U_1\cap S$, $f$ is given by two nowhere vanishing holomorphic functions $f_0$ and $f_1$ on $U_0\cap S$, $U_1\cap S$ respectively, satisfying
\[ f_0(\eta,\zeta)=e^{-2\eta/\zeta}f_1(\eta,\zeta) \]
for $(\eta,\zeta)\in U_0\cap U_1\cap S$. This implies that the meromorphic 1-forms $d\log f_0$ $(:=df_0/f_0)$ and $d\log f_1$ are related by
\[ d\log f_0=-2\,d\Big(\frac{\eta}{\zeta}\Big)+d\log f_1 \tag{12} \]
on $U_0\cap U_1\cap S$. Notice that
\[ \oint_\lambda d\log f_\alpha\in2\pi i\,\mathbb{Z},\qquad \alpha=0,1 \tag{13} \]
for any homology 1-cycle $\lambda \in H_1(U_\alpha \cap S, \mathbb{Z})$; moreover, these integrals are nonzero in general, since the 1-forms $d\log f_\alpha$ do not have to be exact. From Eqs. (10) and (12), we conclude that $d\log f_1$ must have the local behaviour near $0_j$

$$d\log f_1 = \left(-\frac{2\eta_j(0)}{\zeta^2} + O(1)\right)d\zeta \quad \text{as } \zeta \to 0$$

in order for $f_0$ not to have an essential singularity at $0_j \in U_0 \cap S$. It should be noted that the section $f$ is uniquely determined up to a multiplicative constant, since the quotient of $f$ by any other nowhere vanishing section of $L^2|_S$ yields a global holomorphic function on the compact Riemann surface S. Notice also that the modulus of this constant can be fixed by imposing the symmetry

$$f_1(\eta, \zeta) = \frac{1}{\overline{f_0 \circ \tau(\eta, \zeta)}}$$

since the right-hand side has the regularity and nowhere vanishing properties of $f_1$, and $\eta/\zeta$ is conjugated under pull-back by $\tau$.

Let $\{a_1, \ldots, a_g, b_1, \ldots, b_g\}$ be a canonical basis of $H_1(S, \mathbb{Z}) \cong \mathbb{Z}^{\oplus 2g}$, i.e. satisfying the orthonormality conditions

$$\sharp(a_i, b_j) = \delta_{ij}, \qquad \sharp(a_i, a_j) = 0 = \sharp(b_i, b_j) \qquad (14)$$
for the intersection pairing. Following Ercolani and Sinha, we apply the reciprocity law for differentials of the first and second kinds (cf. [6], p. 241) to an arbitrary holomorphic 1-form $\Omega$ and $d\log f_1$ to get

$$\sum_{i=1}^{k} (-2\eta_i(0))\, g_i = \frac{1}{2\pi i} \sum_{j=1}^{g} \begin{vmatrix} \oint_{a_j}\Omega & \oint_{a_j} d\log f_1 \\ \oint_{b_j}\Omega & \oint_{b_j} d\log f_1 \end{vmatrix}. \qquad (15)$$

Let $m_j$ and $n_j$ be the integers

$$m_j := -\frac{1}{2\pi i}\oint_{a_j} d\log f_1 \quad \text{and} \quad n_j := \frac{1}{2\pi i}\oint_{b_j} d\log f_1, \qquad (16)$$

consistently with (13), and let us define the 1-cycle

$$c := \sum_{j=1}^{g} (n_j a_j + m_j b_j). \qquad (17)$$

Then Eq. (15) can be rewritten as

$$-2\sum_{i=1}^{k} \eta_i(0)\, g_i = \oint_c \Omega. \qquad (18)$$
Constraints Defining BPS Monopoles
The existence of $c \in H_1(S, \mathbb{Z})$ satisfying (18) is equivalent to the line bundle $L^2|_S$ being trivial. Unfortunately, the condition (7) which would ensure smoothness cannot be implemented directly in the Ercolani–Sinha approach if $k > 2$, but we can include a weaker statement in the analysis as follows. Since for $k \ge 2$ there is an inclusion $H^0(S, L^s) \hookrightarrow H^0(S, L^s(k-2))$ given by tensoring with a section of $O(k-2)|_S$, the condition

$$H^0(S, L^s) = 0 \qquad (19)$$
is necessary for (7) to hold. Now we can repeat the argument above to investigate the existence of global sections of $L^s|_S$, arriving at the same Eq. (18) with 2 replaced by $s$, and we can conclude that there will be no nontrivial global sections of $L^s|_S$ for $0 < s < 2$ if and only if $c$ is primitive in $H_1(S, \mathbb{Z})$.

We can still simplify the left-hand side of (18). Consider a global holomorphic 1-form $\Omega$ on S, as given by (9). After defining the branch cuts, we can write

$$P(\eta, \zeta) = \prod_{j=1}^{k} \left(\eta - \eta_j(\zeta)\right),$$

and so

$$\frac{\partial P}{\partial\eta}(\eta, \zeta) = \sum_{i=1}^{k} \prod_{j \ne i}^{k} \left(\eta - \eta_j(\zeta)\right).$$

On sheet $i$, $\eta = \eta_i(\zeta)$ and all the terms in the sum above vanish except one,

$$\left.\frac{\partial P}{\partial\eta}(\eta, \zeta)\right|_{\text{sheet } i} = \prod_{j \ne i}^{k} \left(\eta_i(\zeta) - \eta_j(\zeta)\right). \qquad (20)$$

We can use this to write the coefficient $g_i$ in (11) for $\Omega$ as

$$g_i = \frac{\beta_0\eta_i^{k-2}(0) + \beta_1(0)\eta_i^{k-3}(0) + \ldots + \beta_{k-2}(0)}{\prod_{j \ne i}^{k} \left(\eta_i(0) - \eta_j(0)\right)},$$

so the left-hand side of (18) takes the form

$$-2\sum_{i=1}^{k} \eta_i(0)\, g_i = -2\sum_{i=1}^{k} \frac{\beta_0\eta_i^{k-1}(0) + \beta_1(0)\eta_i^{k-2}(0) + \ldots + \beta_{k-2}(0)\eta_i(0)}{\prod_{j \ne i}^{k} \left(\eta_i(0) - \eta_j(0)\right)}.$$

This appears to be a very complicated expression, but we can simplify it considerably if we make use of the identity

$$\sum_{i=1}^{k} x_i^n \prod_{j \ne i}^{k} \frac{1}{x_i - x_j} = \begin{cases} 0, & 0 \le n \le k-2 \\ 1, & n = k-1 \end{cases}. \qquad (21)$$

Taking $x_i = \eta_i(0)$, we obtain

$$-2\sum_{i=1}^{k} \eta_i(0)\, g_i = -2\beta_0$$
and substitution in (18) yields

$$\oint_c \Omega = -2\beta_0. \qquad (22)$$
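The identity (21), which reduces the left-hand side of (18) to $-2\beta_0$, is easy to test in exact arithmetic. The following is only an illustrative sketch: the function name and the sample points are hypothetical choices, and any $k$ distinct rational numbers would serve equally well.

```python
from fractions import Fraction

def weighted_power_sum(xs, n):
    """Left-hand side of identity (21): sum_i x_i^n * prod_{j != i} 1/(x_i - x_j)."""
    total = Fraction(0)
    for i, xi in enumerate(xs):
        denom = Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                denom *= xi - xj
        total += xi ** n / denom
    return total

# Hypothetical sample points (k = 4); they only need to be pairwise distinct.
sample_points = [Fraction(2), Fraction(-1, 3), Fraction(5, 7), Fraction(4)]
k = len(sample_points)

# One value for each n = 0, ..., k-1.
identity_values = [weighted_power_sum(sample_points, n) for n in range(k)]
print(identity_values)
```

According to (21), the list should vanish for $n \le k-2$ and equal 1 for $n = k-1$; exact rationals are used so that no rounding tolerance is needed.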
So our version of the Ercolani–Sinha conditions amounts to the existence of a primitive 1-cycle $c$ such that Eq. (22) is satisfied for every global holomorphic 1-form $\Omega$, where $\beta_0$ is the coefficient in (9) for $\Omega$.

To prove (21), we first note that the cases $0 \le n \le k-2$ follow from the $n = k-1$ case: A translation $x_i \mapsto x_i - y$ of all the $x_i$'s leaves the denominators in the sum invariant,

$$\sum_{i=1}^{k} (x_i - y)^{k-1} \prod_{j \ne i}^{k} \frac{1}{x_i - x_j} = 1,$$

so by expanding the binomials and collecting equal powers of $y$ we get the statement for all $0 \le n \le k-2$. The proof of the $n = k-1$ case by induction on $k$ is rather lengthy and we prefer to argue as follows. It is readily seen that the whole sum is symmetric under the action of the symmetric group $S_k$ permuting the $x_i$'s. Reducing to a common fraction yields as denominator

$$\Delta(x_1, \ldots, x_k) = \prod_{i<j}^{k} (x_i - x_j)$$

and this polynomial is completely antisymmetric under $S_k$; in fact, the space of antisymmetric polynomials in $k$ variables is generated by $\Delta$ over the ring of symmetric polynomials. The numerator is then necessarily antisymmetric and a homogeneous polynomial of degree $\frac{1}{2}k(k-1)$, which is also the degree of $\Delta$, so it has to be equal to $\Delta$ times a constant. Taking the asymptotic limit $x_1 \to \infty$ in the original sum, we conclude that this constant has to be 1.

It is convenient, when we come to investigate particular examples, to introduce bases for both the global holomorphic 1-forms and the homology 1-cycles on S. An obvious basis $\{\Omega^{(\ell)}, 1 \le \ell \le g\}$ for $H^0(S, \Omega^1_S)$ corresponds to taking monomials $\eta^r\zeta^s$ for the allowed powers $r$ and $s$ (in lexicographical order of decreasing $r$ and increasing $s$) as numerators of (9),

$$\Omega^{(1)} = \frac{\eta^{k-2}d\zeta}{\partial P/\partial\eta}, \quad \Omega^{(2)} = \frac{\eta^{k-3}d\zeta}{\partial P/\partial\eta}, \quad \Omega^{(3)} = \frac{\eta^{k-3}\zeta\, d\zeta}{\partial P/\partial\eta}, \quad \ldots, \quad \Omega^{(g)} = \frac{\zeta^{2k-4}d\zeta}{\partial P/\partial\eta}. \qquad (23)$$

The condition (22) for a general $\Omega$ is then equivalent to the $g$ conditions

$$\oint_c \Omega^{(\ell)} = -2\delta_{1\ell}. \qquad (24)$$

Let us also fix a canonical basis (14) for $H_1(S, \mathbb{Z})$. The $(g \times 2g)$ period matrix for S corresponding to the two choices of bases is then defined as usual by $P = [A|B]$, where $A$ and $B$ are square matrices with entries

$$A_{\ell j} := \oint_{a_j} \Omega^{(\ell)} \quad \text{and} \quad B_{\ell j} := \oint_{b_j} \Omega^{(\ell)}.$$
Recalling (17), Eq. (24) can now be written as

$$\sum_{j=1}^{g} (A_{\ell j} n_j + B_{\ell j} m_j) = -2\delta_{1\ell}. \qquad (25)$$

Although the number of integers to be determined in (25) is $2g$, they still have to satisfy constraints coming from the reality structure of S. We prove below that these imply that $c$ is antisymmetric under the action of $\tau$ on the first homology group,

$$\tau_* c = -c. \qquad (26)$$

This imposes $g$ linear constraints on the $2g$ components of $c$. In fact, since $\tau$ is antiholomorphic, $\sharp(a, b) = -\sharp(\tau_* a, \tau_* b)$ for any $a, b \in H_1(S, \mathbb{Z})$, and this shows that the matrix $\boldsymbol{\tau}$ representing $\tau_*$ in the canonical basis (14) of $H_1(S, \mathbb{Z})$ satisfies

$$\boldsymbol{\tau}^t = J(-\boldsymbol{\tau}^{-1})J^{-1}, \qquad (27)$$

where $J$ is the matrix representing the intersection pairing in this basis,

$$J = \begin{pmatrix} 0 & 1_g \\ -1_g & 0 \end{pmatrix}. \qquad (28)$$

Since $\boldsymbol{\tau}^2 = 1_{2g}$, $\boldsymbol{\tau}$ is diagonalisable and has eigenvalues $\pm 1$; then (27) implies that these have to occur with equal multiplicities. Hence the antisymmetric 1-cycles lie in a $\mathbb{Z}^{\oplus g}$ subgroup of $H_1(S, \mathbb{Z})$.

To prove (26), we consider the basis (23). Since $\tau$ is antiholomorphic, it pulls back holomorphic 1-forms on S to antiholomorphic 1-forms and vice-versa; the forms above are mapped as

$$\tau^*\left(\frac{\eta^r\zeta^s\, d\zeta}{\partial P/\partial\eta(\eta, \zeta)}\right) = (-1)^{k+r+s+1}\, \overline{\left(\frac{\eta^r\zeta^{2(k-r-2)-s}\, d\zeta}{\partial P/\partial\eta(\eta, \zeta)}\right)} \qquad (29)$$

for $0 \le r \le k-2$ and $0 \le s \le 2(k-r-2)$. Using (29) and (24), we obtain

$$\oint_{\tau_* c + c} \Omega^{(1)} = -\overline{\oint_c \Omega^{(1)}} + \oint_c \Omega^{(1)} = 2 - 2 = 0$$

and for $\ell \ne 1$

$$\oint_{\tau_* c + c} \Omega^{(\ell)} = \pm\overline{\oint_c \Omega^{(\ell')}} + \oint_c \Omega^{(\ell)} = \pm 0 + 0 = 0$$

for some $\ell' \ne 1$. We conclude that the integral of any global holomorphic 1-form around $\tau_* c + c$ vanishes, and this implies (26).

To illustrate how we can use the conditions (25) to determine spectral curves of monopoles, we take as example the well-known charge 2 monopole ([12, 2]), which is
also considered in [4]. The general spectral curve for a centred monopole of charge 2, after imposing the reality conditions (6), has the form

$$\eta^2 + (\gamma_0\zeta^4 + \gamma_1\zeta^3 + \gamma_2\zeta^2 - \bar\gamma_1\zeta + \bar\gamma_0) = 0,$$

where $\gamma_2$ is real. The four roots of the polynomial in brackets occur in antipodal pairs; we can use the SO(3) action to take one pair to $\pm 1$ and the other one to $\pm e^{\pm 2i\theta}$, where $0 \le \theta \le \frac{\pi}{4}$. A further rotation by $\zeta \mapsto e^{i\theta}\zeta$ then takes the spectral curve to

$$\eta^2 + \left(\frac{\kappa}{2}\right)^2 \left(\zeta^4 - 2\cos(2\theta)\zeta^2 + 1\right) = 0, \qquad (30)$$

where $\kappa$ is a real number to be determined in terms of $\theta$. Equation (30) defines a double cover of $\mathbb{CP}^1$ with branch points at the four roots of the polynomial in brackets,

$$w_1 = e^{i\theta}, \qquad w_2 = -e^{-i\theta}, \qquad z_1 = -e^{i\theta}, \qquad z_2 = e^{-i\theta}.$$
We will be interested in the generic case where S is nonsingular; this happens if and only if all the points above are distinct. S is an elliptic curve and can be constructed by gluing together two copies of the Riemann sphere along two branch cuts, that we choose to be on the equator $\{\zeta : |\zeta| = 1\}$. We label the two sheets of S by $j = 1, 2$, which correspond to the two possible choices of sign for $\eta$ when solving (30); sheet $j$ is defined by the function $\eta_j$ obtained by analytic continuation, avoiding the cuts above, of

$$\zeta \mapsto (-1)^{j-1}\, \frac{i\kappa}{2} \sqrt{\zeta^4 - 2\cos(2\theta)\zeta^2 + 1}$$

regarded as a germ at $0 \in \mathbb{C}$. Here, and elsewhere, we consider the principal branch of the root, viz $-\frac{\pi}{q} < \arg z^{1/q} \le \frac{\pi}{q}$, $\forall z \in \mathbb{C}^*$.

Fig. 1. Branch cuts and 1-homology basis for the spectral curve of the charge 2 monopole
We choose a canonical basis $\{a, b\}$ of $H_1(S, \mathbb{Z})$ as in Fig. 1, where we draw the paths as dashed or dotted lines if they lie on sheets 1 or 2, and write $c = na + mb$. In this case $H^0(S, \Omega^1_S)$ is 1-dimensional and a generator is

$$\Omega = \frac{d\zeta}{2\eta}.$$

The periods can be expressed in terms of Legendre's complete elliptic integral of the first kind,

$$A = \oint_a \Omega = \frac{2}{\kappa\sin\theta} \int_0^{\theta} \frac{du}{\sqrt{1 - \csc^2\theta\, \sin^2 u}} = \frac{2}{\kappa} K(\sin\theta),$$

$$B = \oint_b \Omega = \frac{2i}{\kappa\cos\theta} \int_0^{\frac{\pi}{2}-\theta} \frac{du}{\sqrt{1 - \sec^2\theta\, \sin^2 u}} = \frac{2i}{\kappa} K(\cos\theta).$$

So Eq. (25) reads

$$\frac{2}{\kappa} K(\sin\theta)\, n + \frac{2i}{\kappa} K(\cos\theta)\, m = -2.$$

Therefore $m = 0$, and $n$ must then be a generator of $\mathbb{Z}$, which we can take to be $-1$, obtaining $\kappa = K(\sin\theta)$. This can be checked to agree with the result of Hurtubise [12]. Note that in this case Eqs. (7) and (19) are equivalent, so the method recovers all nonsingular spectral curves of (centred and suitably oriented) monopoles of charge 2.

In this example, the special 1-cycle $c$ in Eq. (22) is thus $-a$. It is readily checked that it is antisymmetric under $\tau$. We point out that, although here

$$\oint_a d\log f_1 = 0,$$
the a-periods of $d\log f_1$ do not vanish for general monopoles, and this cannot be avoided by just rescaling $f$ as claimed in [4]. This will be illustrated in Sect. 5.1, where we consider a monopole with a spectral curve of higher genus.

4. The Corrigan–Goddard Conditions

In [3], Corrigan and Goddard used the so-called $A_k$ Ansatz of Atiyah–Ward for instantons to construct a charge $k$ solution to the Bogomol'nyĭ equations (1) with $\dim N_k = 4k - 1$ free parameters. This construction was also obtained independently by Forgács et al. [5], and has been applied [13] to study monopoles in situations where the equations involved are simplified. Unlike the method we presented in Sect. 3, the Corrigan–Goddard approach does not assume smoothness of the underlying spectral curves; indeed, it can be used to obtain for example the axially symmetric monopole of arbitrary charge $k$, whose spectral curve is reducible to $k$ spherical components.

In the notation we have introduced, the construction goes as follows. Start with a polynomial $P(\eta, \zeta)$ as in (5), satisfying the reality constraints (6). Orient the monopole
so that there is an open annulus $A$ in $\mathbb{CP}^1$ which contains the equator $E = \{\zeta : |\zeta| = 1\}$ but does not contain any of the branch points of $\pi|_S$. Assume that $A$ lifts to $k$ disjoint annuli on the spectral curve; then one can define the branch cuts so that sheet $j$ contains one of the lifted annuli, which we denote by $A_j$. On $\pi^{-1}(A)$, consider the function

$$\Theta(\eta, \zeta) := 2\pi i \sum_{j=1}^{k} \frac{\nu_j}{2} \prod_{\ell \ne j}^{k} \frac{\eta - \eta_\ell(\zeta)}{\eta_j(\zeta) - \eta_\ell(\zeta)}, \qquad (31)$$

where $\nu_j$ are some integers to be determined. This is a Lagrange interpolation polynomial in $\eta$ of the $k$ conditions that $\Theta$ should take the value $\pi i\nu_j$ on $A_j$. For $\zeta \in A$, define the functions $\Theta_r$ from the coefficients of $\eta^r$ in $\Theta$ as follows:

$$\Theta(\eta, \zeta) =: 2\pi i \sum_{r=0}^{k-1} \Theta_r(\zeta) \left(\frac{\eta}{2\zeta}\right)^r. \qquad (32)$$

Corrigan and Goddard's analysis then leads to the conditions

$$\oint_E \Theta_1(\zeta)\, \frac{d\zeta}{\zeta} = 2 \qquad (33)$$

and

$$\oint_E \Theta_r(\zeta)\, \zeta^s\, \frac{d\zeta}{\zeta} = 0, \qquad 2 \le r \le k-1, \quad |s| \le r-1. \qquad (34)$$
These are $(k-1)^2$ constraints on the $k^2 + 2k$ coefficients of $P(\eta, \zeta)$, just as one obtains using the Ercolani–Sinha algorithm. When the spectral curve is nonsingular, we would expect them to be equivalent to (24). We now clarify how they relate to each other.

Denoting by $s_i$ the $i$th elementary symmetric polynomial in a given number of variables, we can expand the numerator of (31) to obtain

$$\Theta(\eta, \zeta) = 2\pi i \sum_{r=0}^{k-1} \sum_{j=1}^{k} (-1)^{k-r-1}\, \frac{\nu_j}{2}\, \frac{s_{k-r-1}(\eta_1(\zeta), \ldots, \widehat{\eta_j(\zeta)}, \ldots, \eta_k(\zeta))\, \eta^r}{\prod_{\ell \ne j}^{k} \left(\eta_j(\zeta) - \eta_\ell(\zeta)\right)}.$$

The elementary symmetric polynomials satisfy the recurrence relation

$$s_i(x_1, \ldots, \widehat{x_j}, \ldots, x_k) = s_i(x_1, \ldots, x_k) - x_j\, s_{i-1}(x_1, \ldots, \widehat{x_j}, \ldots, x_k)$$

for $0 \le i \le k$ (taking $s_0 := 1$), and iterating this one finds

$$s_i(x_1, \ldots, \widehat{x_j}, \ldots, x_k) = \sum_{h=0}^{i} (-1)^h x_j^h\, s_{i-h}(x_1, \ldots, x_k).$$

Clearly, $(-1)^j s_j(\eta_1(\zeta), \ldots, \eta_k(\zeta))$ are just the polynomials $\alpha_j(\zeta)$ in (5) for each $0 \le j \le k$ (with $\alpha_0 := 1$). Therefore, we can read off the functions $\Theta_r$ in (32) as

$$\Theta_r(\zeta) = \sum_{j=1}^{k} \sum_{h=0}^{k-r-1} \frac{\nu_j}{2}\, \frac{\eta_j^h(\zeta)\, \alpha_{k-r-h-1}(\zeta)}{\prod_{\ell \ne j}^{k} \left(\eta_j(\zeta) - \eta_\ell(\zeta)\right)}\, (2\zeta)^r.$$
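The iterated recurrence for the elementary symmetric polynomials with one variable omitted, on which the formula for $\Theta_r$ rests, can be verified on sample data. The sketch below uses exact rational arithmetic; the helper names and the sample values are hypothetical and carry no meaning beyond the check itself.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elem_sym(xs, i):
    """Elementary symmetric polynomial s_i(xs), with s_0 = 1 and s_i = 0 for i > len(xs)."""
    if i == 0:
        return Fraction(1)
    return sum((prod(c) for c in combinations(xs, i)), Fraction(0))

def elem_sym_omit(xs, j, i):
    """s_i evaluated on xs with the j-th variable omitted (direct computation)."""
    return elem_sym(xs[:j] + xs[j + 1:], i)

def iterated_formula(xs, j, i):
    """Right-hand side of the iterated recurrence: sum_h (-1)^h x_j^h s_{i-h}(xs)."""
    return sum(((-1) ** h * xs[j] ** h * elem_sym(xs, i - h) for h in range(i + 1)),
               Fraction(0))

# Hypothetical sample values standing in for eta_1(zeta), ..., eta_k(zeta) at a fixed zeta.
sample = [Fraction(3), Fraction(-2), Fraction(1, 2), Fraction(5, 3)]
k = len(sample)

recurrence_checks = [
    elem_sym_omit(sample, j, i) == iterated_formula(sample, j, i)
    for j in range(k)
    for i in range(k + 1)
]
print(all(recurrence_checks))
```

The case $i = k$ is included on purpose: the omitted-variable polynomial then vanishes, and the right-hand side reduces to $\prod_\ell (x_\ell - x_j)$, which contains the zero factor $\ell = j$.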
So far, we have shown that, for the $j$th term in the sum, the numerator depends only on $\eta_j(\zeta)$ and $\zeta$. Using (20), we can eliminate altogether the dependence on the functions $\eta_\ell$ with $\ell \ne j$, and this allows us to write for $1 \le r \le k-1$,

$$\oint_E \Theta_r(\zeta)\, \frac{d\zeta}{\zeta} = \oint_{\sum_{j=1}^{k} \nu_j E_j} \frac{\sum_{h=0}^{k-r-1} \eta^h\, \alpha_{k-r-h-1}(\zeta)}{\frac{\partial P}{\partial\eta}(\eta, \zeta)}\, (2\zeta)^{r-1}\, d\zeta,$$

where $E_j := (\pi|_S)^{-1}(E) \cap A_j$ is the lift of $E$ to sheet $j$. The integrand no longer depends on the sheet label. It becomes clear now how to write the left-hand side of the Corrigan–Goddard conditions as integrals over 1-cycles on S. If we define the holomorphic 1-form $-\Xi_r$ on $\cup_{j=1}^{k} A_j$ to be the integrand in the above expression, then the conditions (33) and (34) can be written respectively as

$$\oint_{\sum_{j=1}^{k} \nu_j E_j} \Xi_1 = -2 \qquad (35)$$

and

$$\oint_{\sum_{j=1}^{k} \nu_j E_j} \zeta^s\, \Xi_r = 0 \qquad (36)$$

for $2 \le r \le k-1$ and $|s| \le r-1$. Equations (35) and (36) are very similar to the version (24) of the Ercolani–Sinha conditions. In fact, they turn out to be precisely equivalent to (24), provided we assume $c$ to be of the form

$$c = \sum_{j=1}^{k} \nu_j E_j \qquad (37)$$
rather than a general 1-cycle as in (17). To see this, we first remark that all the integrands in (35) and (36) are of the form (9), and hence global holomorphic 1-forms on S. For each $1 \le r \le k-1$, the highest power of $\eta$ in the numerator of $\Xi_r$ never exceeds $k-r-1$, and the coefficient of $\eta^{k-r-1}$ can be seen to be equal to $-(2\zeta)^{r-1}$. So multiplication of $\Xi_r$ by $\zeta^s$ with $-r+1 \le s \le r-1$ as in (36) gives monomials in $\zeta$ of all degrees between 0 and $2(r-1)$ as coefficients for $\eta^{k-r-1}$. We conclude that all the homogeneous Eqs. ($\ell \ne 1$) in (24) can be obtained from (36) if we consider first the $2k-3$ equations corresponding to $r = k-1$ and continue decreasing $r$ down to 2, using at each stage the vanishing of the integrals for greater $r$ from the previous steps. The $\ell = 1$ equation also agrees with (35), since we can use (36) and the coefficient of $\eta^{k-2}$ in the numerator of $\Xi_1$ is $-1$. Conversely, the Ercolani–Sinha conditions in the form (24) also imply the Corrigan–Goddard conditions (35) and (36) if (37) holds.

The question to put now is of course: Is the Ansatz (37) for the special cycle $c$ in Eq. (22) valid in general? In the next section, we show that this is not the case, by explicit computation of $c$ for the tetrahedral 3-monopole.
5. The Tetrahedral 3-Monopole Revisited

5.1. Spectral curve. Now we apply the method of Sect. 3 to investigate the spectral curve of the tetrahedrally symmetric monopole of charge 3. This was first studied in [9], where the existence of the monopole was proved by imposing tetrahedral symmetry to simplify Nahm's equations and solve them in terms of elliptic functions. A numerical treatment of the ADHMN construction was developed and applied to this monopole in [10], which allowed the fields to be computed and, using these, level surfaces for the energy density were plotted.

As in [9], we start with the Ansatz

$$\eta^3 + \alpha(\zeta^6 + 5\sqrt{2}\,\zeta^3 - 1) = 0 \qquad (38)$$

for the spectral curve S, where $\alpha$ is a nonzero constant to be determined; the reality conditions imply $\alpha \in \mathbb{R}$. The branch points occur at the zeroes of the polynomial in brackets,

$$w_1 = \frac{\sqrt{3}-1}{\sqrt{2}}, \quad w_2 = \omega w_1, \quad w_3 = \bar\omega w_1, \quad z_1 = -\frac{\sqrt{3}+1}{\sqrt{2}}, \quad z_2 = \omega z_1, \quad z_3 = \bar\omega z_1,$$

where $\omega := e^{2\pi i/3}$. These are equidistant points on the Riemann sphere, antipodal in pairs, which are related by radial projection to the midpoints of the edges of a tetrahedron inscribed in the sphere. In the configuration we have chosen, the tetrahedron has a vertex at 0 and is oriented such that the radial projection of one of the three edges containing 0 passes through 1, as shown in Fig. 2.

Fig. 2. The inscribed tetrahedron underlying the symmetry of the spectral curve
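The stated branch points can be checked numerically against the polynomial in (38), together with the claim that they are antipodal in pairs. This is a rough floating-point sketch; the function names are hypothetical, and the antipodal map is taken to be $\zeta \mapsto -1/\bar\zeta$, the standard one on the Riemann sphere.

```python
import cmath
import math

SQRT2, SQRT3 = math.sqrt(2.0), math.sqrt(3.0)
omega = cmath.exp(2j * math.pi / 3)

def branch_polynomial(zeta):
    """The polynomial in brackets in (38)."""
    return zeta**6 + 5 * SQRT2 * zeta**3 - 1

def antipode(zeta):
    """Antipodal map on the Riemann sphere (assumed form: zeta -> -1/conj(zeta))."""
    return -1 / zeta.conjugate()

w1 = complex((SQRT3 - 1) / SQRT2)
z1 = complex(-(SQRT3 + 1) / SQRT2)
ws = [w1, omega * w1, omega.conjugate() * w1]
zs = [z1, omega * z1, omega.conjugate() * z1]

# Largest residual of the polynomial over the six candidate branch points.
max_residual = max(abs(branch_polynomial(p)) for p in ws + zs)
# Largest deviation from the pairing w_i <-> z_i under the antipodal map.
antipodal_gap = max(abs(antipode(w) - z) for w, z in zip(ws, zs))
print(max_residual, antipodal_gap)
```

Both printed numbers should be at the level of rounding error, reflecting that $w_i^3$ and $z_i^3$ are the two roots of $t^2 + 5\sqrt{2}\,t - 1 = 0$ and that $w_1 z_1 = -1$.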
To define the branch cuts, we choose to connect the wi ’s and the zi ’s together along arcs of circles centred at the origin and antipodal to each other as shown in Fig. 3. No more cuts are needed, since each branch point is of cube root type and so any closed path on CP1 enclosing zero mod 3 branch points lifts to a closed path on S. Now we can
label the three sheets as before: for $j = 1, 2, 3$, we define sheet $j$ to correspond to the analytic continuation $\eta_j$ of

$$\zeta \mapsto -\omega^{j-1} \alpha^{1/3}\, \zeta\, \sqrt[3]{\zeta^3 + 5\sqrt{2} - \zeta^{-3}} \qquad (39)$$

regarded as a germ at $1 \in \mathbb{C}$. In particular, notice that on each sheet $\eta$ is indeed given by (39) for all $\zeta$ in the annulus $C := \{\zeta : |w_1| < |\zeta| < |z_1|\}$. With these conventions, it can be checked that the rules for crossing the branch cuts are as given in Fig. 3, where the encircled $\pm$ signs mean that the label $j$ is to be increased/decreased by 1 mod 3 when the corresponding cut is crossed.

Fig. 3. Branch cuts for the spectral curve of the tetrahedral 3-monopole
It is not hard to see that one obtains a compact Riemann surface of genus four when three copies of the Riemann sphere are identified along the branch cuts as specified in Fig. 3. In fact, by identifying three copies of the upper or lower hemispheres along the pair of cuts as above, one obtains a torus with three discs removed; the circles of the boundary correspond to the equators of the spheres we started with. Gluing together the two surfaces obtained in this way along their boundaries gives a compact curve of genus four. This is sketched in Fig. 4; the three circles shown project under $\pi|_S$ to the equator $E$ of $\mathbb{CP}^1$, and they will be referred to as the equators on a given sheet. We shall adopt the convention of drawing the paths as dash-dotted, dashed or dotted curves if they lie on sheets 1, 2 or 3, respectively.

Now we choose a canonical basis for $H_1(S, \mathbb{Z}) \cong \mathbb{Z}^{\oplus 8}$ as in Fig. 5. The first two 1-cycles $a_1$ and $b_1$ are drawn close to the cut connecting the $z_i$'s so as to have the desired intersection number; for $a_2$ and $b_2$ we choose the equator on sheet 2 and a distorted meridian intersecting it as required; all the other intersections between these four 1-cycles are zero. Then we act with the reality map $\tau$ on these cycles to get the other
elements of the basis:

$$a_3 := \tau_* a_2, \quad a_4 := \tau_* a_1, \quad b_3 := -\tau_* b_2, \quad b_4 := -\tau_* b_1. \qquad (40)$$

Fig. 4. Spectral curve of the tetrahedral 3-monopole
Our choice of branch cuts is such that τ sends cuts to cuts and hence maps a given sheet onto another sheet. It is easy to check that for ζ ∈ R, η as given by (39) for j = 1 also takes real values (cf. Eq. (2)). We then conclude that sheet 1 is invariant under τ , while the other two sheets are interchanged. It follows that the second half of our homology basis is as drawn in Fig. 5, and all the remaining intersection numbers for the elements in the basis are as required by (14).
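Fig. 5. The basis for $H_1(S, \mathbb{Z})$

Our chosen basis for $H^0(S, \Omega^1_S)$ is

$$\Omega^{(1)} = \frac{d\zeta}{3\eta}, \quad \Omega^{(2)} = \frac{d\zeta}{3\eta^2}, \quad \Omega^{(3)} = \frac{\zeta\, d\zeta}{3\eta^2}, \quad \Omega^{(4)} = \frac{\zeta^2\, d\zeta}{3\eta^2}.$$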
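According to (29) these forms are pulled back by the reality structure as

$$\tau^*\Omega^{(1)} = -\overline{\Omega^{(1)}}, \quad \tau^*\Omega^{(2)} = \overline{\Omega^{(4)}}, \quad \tau^*\Omega^{(3)} = -\overline{\Omega^{(3)}}, \quad \tau^*\Omega^{(4)} = \overline{\Omega^{(2)}}. \qquad (41)$$

We are now ready to compute the period matrix. The reality properties (40) and (41) imply that the periods around $a_2$, $a_1$, $b_2$ and $b_1$ determine those around $a_3$, $a_4$, $b_3$ and $b_4$, respectively. For example,

$$B_{23} = \oint_{-\tau_* b_2} \overline{\tau^*\Omega^{(4)}} = -\overline{\oint_{b_2} \Omega^{(4)}} = -\overline{B_{42}}.$$

This means that we only have to calculate half of the 32 entries of the period matrix. First we consider the periods around the equator $a_2$. Notice that $\zeta^3 + 5\sqrt{2} - \zeta^{-3}$ is invariant under the change of variable $\zeta \mapsto \omega\zeta$. So

$$A_{22} = -\frac{(1 + \omega + \bar\omega)}{3\alpha^{2/3}} \int_1^{\omega} \frac{d\zeta}{\zeta^2 \left(\zeta^3 + 5\sqrt{2} - \zeta^{-3}\right)^{2/3}} = 0$$

and similarly $A_{42} = 0$. The two integrals $A_{12}$ and $A_{32}$ can be expressed in terms of the hypergeometric function ${}_2F_1$. Letting $F := {}_2F_1(\frac{1}{6}, \frac{2}{3}; 1; -\frac{2}{25})$, we find

$$A_{12} = -\frac{2i\bar\omega}{3(5\sqrt{2}\,\alpha)^{1/3}} \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \frac{du}{\left(1 - \frac{i\sqrt{2}}{5}\sin u\right)^{1/3}} = -\frac{2\pi i\bar\omega F}{3\sqrt[3]{5}\sqrt[6]{2}\,\alpha^{1/3}}$$

and, using the relation (see [1], p. 559)

$${}_2F_1\left(\frac{1}{3}, \frac{5}{6}; 1; -\frac{2}{25}\right) = \frac{\sqrt[3]{5}}{\sqrt{3}}\; {}_2F_1\left(\frac{1}{6}, \frac{2}{3}; 1; -\frac{2}{25}\right),$$

we obtain

$$A_{32} = \frac{2i\omega}{3(5\sqrt{2}\,\alpha)^{2/3}} \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \frac{du}{\left(1 - \frac{i\sqrt{2}}{5}\sin u\right)^{2/3}} = \frac{2\pi i\omega F}{3\sqrt{3}\sqrt[3]{10}\,\alpha^{2/3}}.$$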
Our choice of $a_1$ and $b_1$ implies that the periods around these two cycles are related by conjugation, $A_{i1} = \overline{B_{i1}}$. This follows from the fact that the paths $\pi \circ a_1$ and $\pi \circ b_1$ are complex conjugate, while $\eta_2(\bar\zeta) = \overline{\eta_3(\zeta)}$ from the definition in (39).

There remain eight integrals to be calculated. By resorting to numerical integration, we have established that they are related to the periods around $a_2$ by simple numerical
factors. The conclusion is that the two blocks A and B of the period matrix can be written as 2π i ωF ¯ 2π iωF 2π F 2πF − √ − √ − √ √ √ √ √ √ √ √ 3 3 3 3 6 6 6 6 3 3 5 2α 1/3 3 5 2α 1/3 3 5 2α 1/3 3 3 5 2α 1/3 √ 4 2π ωF ¯ 0 0 0 √ 3 9 10α 2/3 A= 2π F 2πF 2π iωF 2π i ωF ¯ − √ √ √ √ √ √ 3 3 2/3 3 3 3 10α 2/3 2/3 9 3 10α 2/3 3 3 10α 9 10α √ 4 2πωF 0 0 0 √ 9 3 10α 2/3 and 4π F 4π F 2π F 2πF − √ √ − √ √ √ √ √ √ √ √ √ √ 3 3 3 3 6 6 6 6 3 3 5 2α 1/3 3 3 5 2α 1/3 3 3 5 2α 1/3 3 3 5 2α 1/3 √ √ √ 4 2π iF 4 2π iF 4 2π ωF 0 − √ √ − √ √ √ 9 3 3 10α 2/3 9 3 3 10α 2/3 9 3 10α 2/3 . B= 4π iF 2πF 4π iF 2π F − √ √ √ √ √ √ 3 3 3 2/3 2/3 2/3 9 3 10α 2/3 9 3 10α 9 3 10α 9 10α √ √ √ 4 2π ωF 4 2π iF ¯ 4 2π iF − √ √ 0 √ √ √ 3 3 3 9 10α 2/3 9 3 10α 2/3 9 3 10α 2/3
We have now all that is needed to determine $\alpha$ from the conditions (25). For a given $\alpha$, this is a system of eight real linear equations in eight (integer) unknowns. It has a solution given by

$$n = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad m = m \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix},$$

where $m$ satisfies

$$m = \frac{3\sqrt{3}\,\sqrt[3]{5}\,\sqrt[6]{2}\,\alpha^{1/3}}{4\pi F}.$$

Now $m$ must be either 1 or $-1$ for $(m, n)$ to be primitive in $\mathbb{Z}^{\oplus 8}$. If we take $m = 1$,

$$\alpha = \frac{32\sqrt{2}\,\pi^3 F^3}{405\sqrt{3}}, \qquad (42)$$

while $m = -1$ reverses the sign of $\alpha$. The two solutions can be seen to be a rotation of each other by $\zeta \mapsto -\frac{1}{\zeta}$; to fix ideas, we take $\alpha$ positive from now on. It can be checked
numerically that (42) agrees with the solution obtained in [9]. For our orientation of the spectral curve (38), the latter is given by

$$\alpha = \frac{\sqrt{2}}{3\sqrt{3}} \cdot \frac{\Gamma(\frac{1}{6})^3\, \Gamma(\frac{1}{3})^3}{48\sqrt{3}\,\pi^{3/2}} = \frac{\Gamma(\frac{1}{3})^9}{48\sqrt{6}\,\pi^3}. \qquad (43)$$

In Sect. 5.2, we use a change of variables projecting S onto an elliptic curve to relate analytically the two results. The special 1-cycle $c$ for the tetrahedral 3-monopole with $\alpha$ positive is

$$c = b_2 + b_3. \qquad (44)$$
It is sketched on the Riemann sphere in Fig. 6, after simplification using relations in homology – for example, the sum of the three equators with the same orientation is homologous to zero, which is clear from Fig. 4. Clearly, it is not a combination of a lift of equators, since the projections of b2 and b3 enclose different branch points on the Riemann sphere. In the next section, it will be proved that c is invariant under the tetrahedral group.
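The numerical agreement between (42) and (43) mentioned above can be reproduced with a few lines of standard-library Python. This is only a sketch under stated assumptions: the hypergeometric function is evaluated by its Gauss series (adequate here because $|z| = 2/25$ is small), and the function and variable names are hypothetical.

```python
import math

def hyp2f1_series(a, b, c, z, terms=200):
    """Gauss series for 2F1(a, b; c; z); assumes |z| < 1 so that the series converges."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total

# F as defined in the text: 2F1(1/6, 2/3; 1; -2/25).
F = hyp2f1_series(1 / 6, 2 / 3, 1.0, -2 / 25)

# Eq. (42): alpha from the period computation with m = 1.
alpha_42 = 32 * math.sqrt(2) * math.pi**3 * F**3 / (405 * math.sqrt(3))

# Eq. (43): both closed forms in terms of Gamma functions.
alpha_43_first = (math.sqrt(2) / (3 * math.sqrt(3))) * (
    math.gamma(1 / 6) ** 3 * math.gamma(1 / 3) ** 3 / (48 * math.sqrt(3) * math.pi**1.5)
)
alpha_43_second = math.gamma(1 / 3) ** 9 / (48 * math.sqrt(6) * math.pi**3)

print(alpha_42, alpha_43_first, alpha_43_second)
```

The three printed values should coincide to within floating-point accuracy, which is the numerical counterpart of the analytic comparison carried out in Sect. 5.2.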
Fig. 6. The special 1-cycle c for the tetrahedral 3-monopole
5.2. Action of the tetrahedral group. The spectral curve S defined by (38) admits an action of the tetrahedral group $A_4 \subset SO(3)$ determined by $PSU(2)$ transformations on $\zeta$; the corresponding rotations are the symmetries of the tetrahedron drawn in Fig. 2. This induces an action on $H_1(S, \mathbb{Z})$, which we now describe.

Recall that $A_4$ is generated by the 3-cycle (123) and the double transposition (12)(34). We represent these as the rotation by $\frac{2\pi}{3}$ about the direction defined by the top vertex 0 of the tetrahedron in Fig. 2,

$$R : \zeta \mapsto e^{\frac{2\pi i}{3}}\zeta, \qquad (45)$$
and the rotation by $\pi$ around the axis connecting the edge midpoints $w_1$ and $z_1$,

$$T : \zeta \mapsto \frac{\sqrt{2} - \zeta}{1 + \sqrt{2}\,\zeta}, \qquad (46)$$

respectively. Later, we will also be interested in another element of order two,

$$V = R\,T\,R^2, \qquad (47)$$
which corresponds to a rotation by $\pi$ about the axis connecting $w_2$ and $z_2$. We also denote by $R$, $T$ and $V$ the maps induced on S by (45), (46) and (47).

On the complex plane, $R$ is of course just the rotation by $\frac{2\pi}{3}$ about the origin, while $T$ and $V$ are elliptic Möbius transformations of order two with $w_1, z_1$ and $w_2, z_2$ as fixed points, respectively. A way to visualize the action of $T$ or $V$ is to draw the (invariant) circles of Apollonius corresponding to the two fixed points; the other four branch points of $\pi|_S$ all lie on one of these circles and it is easy to verify that they are permuted as expected under the two transformations.

To describe the action of $A_4$ on $H_1(S, \mathbb{Z})$, we start by computing the matrices representing the generators $R$ and $T$. The effect of $R$ is easy to understand, since it leaves the three annuli over $C = \{\zeta : |w_1| < |\zeta| < |z_1|\}$ invariant. $T$ is harder to describe since it does not preserve the annuli, on which we can easily keep track of the sheet labels by using the expression (39) for $\eta_j(\zeta)$. But we can still use (39) when $\zeta$ is in the smaller region $C_+ \cup C_-$, where $C_\pm := \{\zeta \in C \cap T(C) : \pm\mathrm{Im}\,\zeta > 0\}$ are mapped onto each other by $T$. Denoting by $C_{\pm,j}$ the intersection of $(\pi|_S)^{-1}(C_\pm)$ with sheet $j$, it can be concluded that $T$ sends $C_{\pm,j}$ to $C_{\mp,j\mp 1}$, where the labels are taken mod 3. The sheet that contains the image under $R$ or $T$ of any point on S can now be easily identified from these data and analytic continuation. In particular, we conclude that the 1-cycles in our basis for $H_1(S, \mathbb{Z})$ are mapped as shown in Fig. 7.

We can now use the perfect intersection pairing (14) to compute the matrices of the maps $R_*$ and $T_*$ induced on homology from the intersection numbers of the 1-cycles $a_i$ and $b_i$ with their images. Let $c_i := a_i$ and $c_{4+i} := b_i$ for $i = 1, \ldots, 4$. Defining

$$M_{ij} := \sharp(R_* c_i, c_j), \qquad N_{ij} := \sharp(T_* c_i, c_j),$$

we obtain the entries of the matrices $\mathbf{R}$ and $\mathbf{T}$ representing $R_*$ and $T_*$ as

$$\mathbf{R}_{ij} = \sum_{k=1}^{8} J_{ik} M_{jk}, \qquad \mathbf{T}_{ij} = \sum_{k=1}^{8} J_{ik} N_{jk},$$

where, as in (28),

$$J_{ij} := \sharp(c_i, c_j) = \begin{pmatrix} 0 & 1_4 \\ -1_4 & 0 \end{pmatrix}_{ij}.$$
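The geometric statements about the Möbius maps (45)–(47) lend themselves to a quick floating-point check: $T$ and $V$ should be involutions with fixed points $w_1, z_1$ and $w_2, z_2$ respectively, and $T$ should permute the remaining four branch points. In the sketch below, $V = RTR^2$ is read as a composition of maps on $\zeta$ with $R^2$ applied first; this reading is an assumption, chosen because it reproduces the stated fixed points. The helper `close` and the sample point are hypothetical.

```python
import cmath
import math

SQRT2, SQRT3 = math.sqrt(2.0), math.sqrt(3.0)
omega = cmath.exp(2j * math.pi / 3)

def R(zeta):
    """Eq. (45): rotation by 2*pi/3 about the origin."""
    return omega * zeta

def T(zeta):
    """Eq. (46): rotation by pi about the axis through w1 and z1."""
    return (SQRT2 - zeta) / (1 + SQRT2 * zeta)

def V(zeta):
    """Eq. (47), read as the composition R o T o R o R (assumed convention)."""
    return R(T(R(R(zeta))))

def close(a, b, tol=1e-9):
    return abs(a - b) < tol

w1 = complex((SQRT3 - 1) / SQRT2)
z1 = complex(-(SQRT3 + 1) / SQRT2)
w2, w3 = omega * w1, omega.conjugate() * w1
z2, z3 = omega * z1, omega.conjugate() * z1

sample = 0.3 + 0.8j  # arbitrary test point away from the pole of T
t_is_involution = close(T(T(sample)), sample)
v_is_involution = close(V(V(sample)), sample)
t_fixes_w1_z1 = close(T(w1), w1) and close(T(z1), z1)
v_fixes_w2_z2 = close(V(w2), w2) and close(V(z2), z2)

others = [w2, w3, z2, z3]
t_permutes_others = all(any(close(T(p), q) for q in others) for p in others)
print(t_is_involution, v_is_involution, t_fixes_w1_z1, v_fixes_w2_z2, t_permutes_others)
```

The check only concerns the action on the base $\mathbb{CP}^1$; tracking sheet labels, as needed for the matrices of $R_*$ and $T_*$, requires the analytic continuation described in the text.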
Fig. 7. The action of R and T on the basis of $H_1(S, \mathbb{Z})$
The intersection numbers Mij and Nij 0 0 0 0 1 0 0 0 1 0 0 0 R= 1 0 0 0 0 0 0 0 0 0 0 0 and
T=
can be just read off from Fig. 7, and we get 0 0 0 −1 0 0 −1 0 −1 0 1 0 0 1 0 0 0 0 1 0 0 −1 −1 1 0 0 1 0 0 0 0 0 1 0 0 1 −1 −1 −1 0
0 0 0 0 −1 0 −1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 −1 0 −1 0 0 0 0 0 1 0 0 −1 0 0 0 0 1 0 −1 −1 0 −1 0 0 1 0 1 −1 0 1 0 0 0 0 −1 0 0 −1 0
.
So the characters of the A4 representation on 1-cycles are tr 18 = 8,
tr R = 2 = tr R2 ,
tr T = 0,
and this shows that H1 (S, C) splits as 1⊕2 ⊕ 3⊕2 . Another way to see this is to consider the action of A4 by pull-back on the holomorphic 1-forms (`) by R and T and calculate the characters to conclude that H0 (S, 1S ) splits as 1 ⊕ 3 under A4 (with (1) spanning the trivial singlet and being orthogonal to the triplet), and use Poincaré duality. Using the matrices for R∗ and T∗ , we can compute the projection onto the subspace 1⊕2 ⊂ H1 (S, C) as 0 −1 −1 0 0 0 0 0 0 2 2 0 0 0 0 0 0 2 2 0 0 0 0 0 1 0 −1 −1 0 0 0 0 0 1 X σ = = . 1 1 0 0 0 0 0 |A4 | 4 0 −1 0 −3 1 −1 2 2 −1 σ ∈A4 −1 3 0 1 −1 2 2 −1 0 −1 −1 0 0 0 0 0 The range of this matrix is spanned by t 0 0 0 0 0 1 1 0 and
t 2 −4 −4 2 −2 3 −3 2 ,
(48)
so we conclude that the special cycle c given in (44) is invariant under the action of A4 . Notice that in (48) the first vector is antisymmetric whereas the second is symmetric under reality. We can explore the action of the Vierergruppe D2 ⊂ A4 generated by the two elements T and V to express the value α given by (42) in terms of elliptic integrals, as in [9]. The
actions of both $T$ and $V$ are much easier to describe in an alternative orientation of the monopole, obtained by rotation of (38) under

$$\zeta \mapsto \frac{(z_3 - z_1)(\zeta - w_1)}{(z_3 - w_1)(\zeta - z_1)}.$$

Then the spectral curve is taken to the form

$$\eta^3 + \frac{3\sqrt{3}}{\sqrt{2}}\, \alpha i\, \zeta(\zeta^4 - 1) = 0, \qquad (49)$$

which can be described as a covering of $\mathbb{CP}^1$ with branch points at $0, \pm 1, \pm i$ and $\infty$. In this configuration, $T$ is just $\zeta \mapsto -\zeta$, while $V$ is $\zeta \mapsto -\frac{1}{\zeta}$. The map

$$p : \zeta \mapsto \frac{1}{2}\left(\zeta^2 + \frac{1}{\zeta^2}\right) =: z$$

identifies points in the same orbit of $D_2$, having the first quadrant as fundamental region. Under the map induced on $T'\mathbb{CP}^1$ by $p$, the spectral curve (49) goes to

$$w^3 + 24\sqrt{6}\, \alpha i\, (z^2 - 1)^2 = 0,$$

which is a torus by the Riemann–Hurwitz formula and corresponds to the quotient $S/D_2$. The two pairs of branch cuts on the original Riemann sphere are both identified with a cut connecting the new branch points $1$, $\infty$ and $-1$ along the real axis. With some care, it can be seen that the image of the 1-cycle $c$ in (44) can be identified with a cycle going four times along the imaginary axis in the negative direction, on the sheet containing the point $(w, z) = (2\sqrt[6]{2}\sqrt{3}\,\alpha^{1/3} i, 0)$. On the other hand, it is easy to see that the 1-form $\Omega^{(1)}$ is given by the same expression in the new orientation, and

$$p^*\left(\frac{dz}{3w}\right) = \frac{d\zeta}{3\eta} = \Omega^{(1)}.$$

Thus we can write

$$\oint_c \Omega^{(1)} = \oint_{p_* c} \frac{dz}{3w} = 4\int_{i\infty}^{-i\infty} \frac{dz}{6\sqrt[6]{2}\sqrt{3}\,\omega\,\alpha^{1/3}\, \sqrt[3]{-i(z^2-1)^2}}$$

and this can be reduced to an elliptic integral, yielding

$$-\frac{\Gamma(\frac{1}{3})^3}{\sqrt{6}\,\pi\,\alpha^{1/3}}.$$

Now this has to be equal to $-2$ by (24). Thus we get

$$\alpha = \frac{\Gamma(\frac{1}{3})^9}{48\sqrt{6}\,\pi^3}$$

in agreement with (43).
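The passage from the curve (49) to its quotient can be checked pointwise. The sketch below assumes that the map induced by $p$ on the fibre coordinate is $w = \eta\, dz/d\zeta$, which is consistent with $p^*(dz/3w) = d\zeta/3\eta$; it picks a hypothetical sample point, solves (49) for one cube root $\eta$, and tests the quotient equation, the $D_2$-invariance of $p$, and the final value of the period.

```python
import math

SQRT2, SQRT3, SQRT6 = math.sqrt(2.0), math.sqrt(3.0), math.sqrt(6.0)

# alpha as in (43); any nonzero value would do for the pointwise check of the quotient curve.
alpha = math.gamma(1 / 3) ** 9 / (48 * SQRT6 * math.pi**3)

def p(zeta):
    """Quotient map zeta -> z = (zeta^2 + zeta^-2) / 2."""
    return 0.5 * (zeta**2 + zeta**-2)

def dp(zeta):
    """Derivative dz/dzeta of the quotient map."""
    return zeta - zeta**-3

zeta = 0.7 + 0.4j  # hypothetical sample point, away from the branch points
# One solution eta of (49) above zeta (principal cube root; the choice of root is immaterial).
eta = (-(3 * SQRT3 / SQRT2) * alpha * 1j * zeta * (zeta**4 - 1)) ** (1 / 3)

z = p(zeta)
w = eta * dp(zeta)  # assumed induced map on the fibre coordinate

quotient_residual = abs(w**3 + 24 * SQRT6 * alpha * 1j * (z * z - 1) ** 2)
invariance_gap = max(abs(p(-zeta) - z), abs(p(-1 / zeta) - z))

# Magnitude of the period of Omega^(1) around c for alpha as in (43); (24) requires 2.
period_magnitude = math.gamma(1 / 3) ** 3 / (SQRT6 * math.pi * alpha ** (1 / 3))
print(quotient_residual, invariance_gap, period_magnitude)
```

The first two numbers should be at rounding level and the third should equal 2, since $(48\sqrt{6})^{1/3} = 2\sqrt{6}$.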
6. Discussion

The version of the Ercolani–Sinha constraints that we derived in Sect. 3 generalises the Corrigan–Goddard conditions to all monopoles with a nonsingular spectral curve. An interesting aspect is the existence of a distinguished 1-cycle $c$ on the spectral curve. The premises in the Corrigan–Goddard approach lead to the constraint (37) for $c$, but their conditions are otherwise equivalent to Eq. (22). In Sect. 5, we have applied (22) to rederive the scale parameter $\alpha$ in the spectral curve of the tetrahedrally symmetric monopole of charge 3. We also verified that this monopole provides an example where our condition (22) can be satisfied but those of Corrigan and Goddard are not.

Let us make some remarks about the nature of the special 1-cycle $c$. Given a nonsingular spectral curve S in T, $c$ is uniquely determined as the solution to Eq. (22); we have established that it is always antisymmetric under the real structure. Moreover, although the left-hand side of (22) depends on the spatial orientation of the monopole, $c$ remains constant along the SO(3) orbit of S in the moduli space $N_k$. In fact, its components in a given homology basis are integer solutions to a linear equation and cannot change when the spectral curve is rotated, since the period matrix occurring in (15) never becomes singular. This argument applies to more general deformations in $N_k$ that do not pass through monopoles with a singular spectral curve. It also implies that $c$ has to be invariant under any rotational symmetry of the spectral curve, and this imposes further restrictions – for example, in the case of the tetrahedrally symmetric 3-monopole that we studied in Sect. 5, this consideration together with the $\tau$-antisymmetry completely determines $c$ up to sign.
As implied in [4], the components of the 1-cycle $c$ are the characteristics of the line bundle $L^2|_S$ and can thus be interpreted as giving the direction of the linear flow determined by Nahm's equations on the Jacobian of the spectral curve S. Another interpretation for $c$ is afforded by Eq. (16). Recall that the triviality of $L^2|_S$ provides for two nowhere vanishing functions $f_0$ and $f_1$ on the open sets $U_0 \cap S$ and $U_1 \cap S$. We may wonder whether we can define logarithms of these functions. And of course the answer is no: the nonzero components of $c$ correspond to nontrivial periods of both $d\log f_0$ and $d\log f_1$, and so they cannot be exact 1-forms. To define the logarithms, one should eliminate the 1-cycles corresponding to the nonzero periods, by cutting S along their conjugate homology 1-cycles in the canonical basis (14). But we can see from (17) that this is equivalent to cutting S along $c$. The Riemann surface of $\log f_0$ or $\log f_1$ is then obtained from the cut surfaces $U_0 \cap S$ or $U_1 \cap S$ by analytic continuation across the cuts, and this yields an infinite cover of the original open sets. So we may regard $c$ as a topological obstruction to defining the logarithms of the nowhere vanishing functions $f_0$ and $f_1$ on the spectral curve punctured at the points lying over $\zeta = \infty$ and $\zeta = 0$, respectively.

We should emphasise that the Ercolani–Sinha algorithm is still not sufficient to ensure smoothness of the fields if $k > 2$, since it does not include the condition (7). The family of nonsingular spectral curves of monopoles has codimension zero in the family of real curves in $|O(2k)|$ satisfying Eq. (22), but the inclusion is proper in general. For example, it can be shown that the icosahedrally symmetric curve

$$\eta^6 + \alpha\zeta(\zeta^{10} + 11\zeta^5 - 1) = 0 \qquad (50)$$
satisfies (22) for some constant α, but not (7); this follows from the conclusion in [9] that there is no 6-monopole with icosahedral symmetry. It is known [11] that an icosahedrally
symmetric monopole of charge 7 exists, and its spectral curve is reducible to a projective line and a smooth genus 25 curve of the form (50), with $\alpha = \dfrac{3^3\,\Gamma(1/3)^{18}}{2^8\,\pi^6}$. An interesting question is to understand how (22) degenerates when a spectral curve becomes singular. Some singularities arise by imposing interesting symmetries on the monopoles, as in the case of the axially symmetric monopoles that we have mentioned already. We may expect that the condition still holds for other singular spectral curves, but it is not clear how the 1-cycle c is to be determined in general.

Acknowledgements. We thank Roger Bielawski for advice. CJH thanks Fitzwilliam College, Cambridge, for a research fellowship. NMR is supported by Fundação para a Ciência e a Tecnologia, Portugal, through the research grant BD/15939/98.
References
1. Abramowitz, M. and Stegun, I.A.: Handbook of Mathematical Functions. National Bureau of Standards, 1965
2. Atiyah, M.F. and Hitchin, N.J.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, NJ: Princeton University Press, 1988
3. Corrigan, E. and Goddard, P.: An n Monopole Solution with 4n − 1 Degrees of Freedom. Commun. Math. Phys. 80, 575–587 (1981)
4. Ercolani, N. and Sinha, A.: Monopoles and Baker Functions. Commun. Math. Phys. 125, 385–416 (1989)
5. Forgács, P., Horváth, Z. and Palla, L.: Finitely Separated Multimonopoles Generated as Solitons. Phys. Lett. B 109, 200–204 (1982)
6. Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. New York: Wiley, 1978
7. Hitchin, N.J.: Monopoles and Geodesics. Commun. Math. Phys. 83, 579–602 (1982)
8. Hitchin, N.J.: On the Construction of Monopoles. Commun. Math. Phys. 89, 145–190 (1983)
9. Hitchin, N.J., Manton, N.S. and Murray, M.K.: Symmetric Monopoles. Nonlinearity 8, 661–692 (1995); dg-ga/9503016
10. Houghton, C.J. and Sutcliffe, P.M.: Tetrahedral and Cubic Monopoles. Commun. Math. Phys. 180, 343–361 (1996); hep-th/9601146
11. Houghton, C.J. and Sutcliffe, P.M.: Octahedral and Dodecahedral Monopoles. Nonlinearity 9, 385–401 (1996); hep-th/9601147
12. Hurtubise, J.: SU(2) Monopoles of Charge 2. Commun. Math. Phys. 92, 195–202 (1983)
13. O’Raifeartaigh, L., Rouhani, S. and Singh, L.P.: Explicit Solution of the Corrigan–Goddard Conditions for n Monopoles for Small Values of the Parameters. Phys. Lett. B 112, 369–372 (1982)
14. Sutcliffe, P.M.: BPS Monopoles. Int. J. Mod. Phys. A 12, 4663–4705 (1997); hep-th/9707009
15. Ward, R.S.: A Yang–Mills–Higgs Monopole of Charge 2. Commun. Math. Phys. 79, 317–325 (1981)

Communicated by A. Kupiainen
Commun. Math. Phys. 212, 245 – 256 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
On the Statistical Mechanics of 2D Euler Equation Raoul Robert Institut Fourier CNRS, BP 74, 38402 Saint Martin d’Hères Cedex, France. E-mail:
[email protected] Received: 25 November 1997 / Accepted: 27 January 2000
Abstract: We address the issue of a rigorous justification of the statistical mechanics of 2D Euler equation. We construct a converging sequence of approximations of this equation for which a Liouville theorem holds and such that the sequence of Liouville measures has a large deviation property. This provides an important step in the justification of the use of the entropy functional previously introduced in [8, 11, 13].

1. Introduction

The most striking feature of 2D hydrodynamical turbulence is the emergence of a large-scale organization of the flow, leading to structures usually called coherent structures (see references in [2, 13]). Jupiter’s Great Red Spot, a huge vortex persisting for more than three centuries in the turbulent shear between two zonal jets, is probably related to this general property [7, 14]. Such hydrodynamical vortices, whose dynamics is governed by Euler equation or some quasi-geostrophic variant, occur in a wide variety of geophysical phenomena. The common remarkable feature of these structures is that they occur and persist in a strongly turbulent environment, and their robustness demands a general understanding. Onsager [10] was the first to suggest that an explanation might be found in terms of statistical mechanics of Euler equation.

Our previous work [8] was an attempt to provide a rigorous basis for the statistical theory of such systems. In that paper we focused on the Sanov-type large deviation estimates for empirical Young measures which were necessary to justify the thermodynamic limit. But the only rigorous link between the statistics and the dynamics of the system that was appealed to was the invariance by the flow of a concentration property associated to our entropy functional (this property was called an ersatz of Liouville theorem). At the time, it was argued by Eyink and Spohn [3] that this invariance argument was too weak and did not provide a good justification of the statistical theory.
Their criticism was justified since we can have many different notions of concentration conserved by the
flow and corresponding to different entropy functionals. Then we tried to give a stronger argument: we tried to construct an approximation of the flow on the finite dimensional space of piecewise constant vorticity functions, the approximate dynamical system preserving the natural product measure for which we have derived large deviation estimates. Despite some efforts, related in [8], we did not succeed, and the problem remained open. Meanwhile experimental and numerical works made some progress, accumulating evidences that the (vorticity – stream function) relationship derived from our entropy functional was fairly well satisfied inside the coherent structures in a variety of cases [12, 15, 16]. So the theoretical justification of the special form of the entropy became a crucial question. Our aim in this paper is to address as far as possible this remaining issue. Let us formulate precisely the problem at hand. To provide an appropriate justification I think we have to construct a sequence of finite dimensional approximations of the Euler flow (of course with good convergence properties such as strong L2 convergence uniformly on any finite time interval), satisfying the two following properties: (i) A Liouville theorem holds for the finite dimensional approximations. (ii) For the family of measures given by (i), we can prove the Sanov-type large deviation estimates for empirical Young measures which are necessary to take the thermodynamic limit (as in [8]). Of course, it is easy to satisfy the point (i) by considering the spectral approximation; but then, it is a very difficult issue to prove that the associated family of measures satisfies (ii). Our approach here is to get a Liouville theorem for a general class of approximations, including approximations on spaces of functions which are spatially localized like finite element approximants. 
For such approximations we are not able to prove directly the large deviation estimates (ii) but we can use them as an intermediate to construct the final approximation on the space of piecewise constant functions for which the large deviation estimates hold (see [8]); so that the use of the finite-element approximants appears here as an essential intermediate step in order to both insure the convergence of the approximations and keep the large deviation estimates. One may worry about the fact that our approximate dynamical system retains only the enstrophy among the infinite family of the Casimir functionals which are conserved by the continuous system (in contrast with the finite mode hamiltonian approximation of [20, 21]). Of course it would be more satisfactory to construct approximations having in addition a large number of constants of the motion. But we think that this is not truly necessary to our microcanonical approach. Indeed if we are interested in the long-time behavior of 2D-Euler flow, and if we believe that a statistical mechanics approach can bring some light to this issue, then we expect that we will finally have to solve some constrained variational problem: find the maximum value of some entropy functional under a set of constraints. But while we have no doubts about the set of constraints which is directly derived from the constants of the motion of the system (energy, integrals of functions of the vorticity field...), it is hard to guess what the relevant entropy functional is. So a key issue is to find a pertinent justification for our special form of the entropy functional. But in our microcanonical approach the entropy is not related to the fact that many constants of the motion are (or are not) exactly conserved by the approximate flow but it is only associated to large deviation estimates for the invariant measures. 
Up to now we only considered the issue of the justification of the entropy functional via invariant measures of approximate systems and large deviation estimates. Of course this necessary step is not sufficient to give a conclusive justification of the equilibrium
statistical mechanics. Such a task would involve intricate dynamical considerations (involving an ergodicity assumption and a precise estimate of the mixing time for the approximations). It seems that such an analysis is out of reach at the present time. Nevertheless we give in Sect. 5 some elements of discussion which may help us to delimit the field of validity of the theory.

2. 2D Euler Equation

Euler equation. The motion of a two-dimensional incompressible inviscid fluid in a bounded domain $\Omega$ is governed by Euler equation, which we write in the classical velocity-vorticity formulation:
\[ (E)\qquad \begin{cases} \omega_t + \operatorname{div}(\omega u) = 0, \\ \operatorname{curl} u = \omega, \quad \operatorname{div} u = 0, \quad u \cdot n = 0 \ \text{on } \partial\Omega, \end{cases} \]
where $u(t,x)$ is the velocity field of the fluid, $\omega = \operatorname{curl} u$ the scalar vorticity, and $n$ the outward unit normal vector to $\partial\Omega$. Because of incompressibility we introduce the stream function $\psi(t,x)$:
\[ \omega = -\Delta\psi, \qquad \psi = 0 \ \text{on } \partial\Omega. \]
The constants of the motion of this dynamical system are:
– the energy
\[ \Xi(\omega) = \frac12 \int_\Omega u^2 \, dx = \frac12 \int_\Omega \psi\,\omega \, dx; \]
– the integrals
\[ F_\theta(\omega) = \int_\Omega \theta(\omega(x)) \, dx, \]
for any continuous function $\theta$. These constants of the motion, which are associated to the degeneracy of the (infinite dimensional) Hamiltonian system, are usually called Casimir functionals.
– If $\Omega$ is the ball $B(0,R)$, we must also consider the angular momentum with respect to 0:
\[ M(\omega) = \int_\Omega x \wedge u(x) \, dx = \frac12 \int_\Omega \left(R^2 - x^2\right) \omega(x) \, dx \; e_3 . \]
The Cauchy problem. Youdovitch’s theorem [18] gives a satisfactory existence-uniqueness result for the Cauchy problem for (E): for any given initial datum $\omega_0(x)$ in the space $L^\infty(\Omega)$, there is a unique weak solution of (E); this solution $\omega(t,x)$ is in $L^\infty(\Omega)$ for all t, and furthermore belongs to the space $C\left([0,\infty[\,; L^p(\Omega)\right)$ for all p, $1 \le p < \infty$. We will define the flow $\Gamma_t$ of the Euler equation on the phase space $L^\infty(\Omega)$ by $\omega(t,\cdot) = \Gamma_t\,\omega_0$. Furthermore this weak solution satisfies the following useful stability property: if $\omega_0^\varepsilon$ is a bounded sequence in the space $L^\infty(\Omega)$ which converges in the strong $L^2$ topology towards $\omega_0$, then $\Gamma_t\,\omega_0^\varepsilon$ converges $L^2$-strongly towards $\Gamma_t\,\omega_0$, uniformly on any bounded time interval.
3. The Entropy Associated to the Turbulent Mixing Process

The mechanism of turbulent mixing responsible for the self organization of the flow in Euler equation is studied at a physical level in [2]. Our concern here is to justify, as rigorously as possible, the introduction of the entropy functional which we use to give a precise content to the vague notion of turbulent disorder of the flow. As previously discussed [8], this issue is based on the existence of finite dimensional approximations which admit invariant Liouville measures. This is the very root of any thermodynamical approach.

It is well known that, at a formal level, Euler equation is an infinite dimensional Hamiltonian system; but, in contrast with the finite dimensional case, this does not imply the existence of an invariant Liouville measure on the natural phase space $L^\infty$. Although we can find finite dimensional approximations of Euler equation which preserve the Hamiltonian structure [20, 21], this structure is broken by any kind of approximation of practical use. But for the needs of thermodynamics the Hamiltonian structure is not truly necessary: it is the Liouville theorem and the constants of the motion which are the key ingredients.

In the case of Euler equation, it is well known that a Liouville theorem holds for the usual spectral approximation. We shall show that this is a particular case of a general property: there is a natural way to approximate Euler equation on any finite dimensional space in such a way that the volume measure is conserved. It does not seem that this simple fact was previously noticed. Then the problem of defining an equilibrium statistical mechanics for (E) amounts to the study of families of measures. For an arbitrary choice of the approximating spaces the study of the asymptotic behavior of these measures seems intractable, but fortunately we can choose spaces for which the thermodynamic limit of these measures can be carried out [8].
3.1. Finite dimensional approximations. A classical way to construct finite dimensional approximations of Euler equation is as follows. Let $F_N$ be an N-dimensional subspace of $L^\infty(\Omega)$ and denote by $P_N$ the orthogonal projector from $L^2(\Omega)$ onto $F_N$. Then we define the approximate solution $\omega^N(t)$ as the solution of the ordinary differential equation in $F_N$:
\[ (E_N)\qquad \begin{cases} \omega^N_t + P_N\left(u^N \cdot \nabla\omega^N\right) = 0, \\ \omega^N(0) = P_N\,\omega_0, \end{cases} \]
where $u^N = \operatorname{curl}\psi^N$, and $-\Delta\psi^N = \omega^N$, $\psi^N = 0$ on $\partial\Omega$. If $F_N$ is properly chosen and $\omega_0$ regular enough, then $\omega^N(t)$ converges towards $\omega(t)$ for the strong $L^2$ topology, uniformly on any bounded time interval [6]. The constants of the motion of the dynamical system $(E_N)$ are:
– the energy
\[ \frac12 \int_\Omega \psi^N \omega^N \, dx, \]
– the enstrophy
\[ \int_\Omega \left(\omega^N\right)^2 dx. \]
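The conservation of the enstrophy by $(E_N)$ can be checked in one line. The following is a formal sketch, assuming (beyond what the text states) that the elements of $F_N$ are regular enough for the integration by parts. It uses that $P_N$ is self-adjoint in $L^2(\Omega)$ with $P_N\omega^N = \omega^N$, that $\operatorname{div} u^N = 0$, and that $u^N \cdot n = 0$ on $\partial\Omega$ because $\psi^N$ vanishes there:

```latex
\frac{d}{dt}\,\frac12 \int_\Omega \left(\omega^N\right)^2 dx
  = -\int_\Omega P_N\!\left(u^N \cdot \nabla\omega^N\right) \omega^N \, dx
  = -\int_\Omega \left(u^N \cdot \nabla\omega^N\right) \omega^N \, dx
  = -\frac12 \int_\Omega \operatorname{div}\!\left(u^N \left(\omega^N\right)^2\right) dx
  = 0 .
```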
Let us notice here that $(E_N)$ is a differential system with quadratic non-linearity, so that the solution always exists on a small time interval; but due to the conservation of the enstrophy the solution cannot blow up and it exists globally in time. Now, it is well known that if we take for $F_N$ a subspace generated by N eigenvectors of the operator $-\Delta$ (with the Dirichlet boundary condition), the volume measure on $F_N$ is conserved by $(E_N)$. This is in fact a particular case of what follows. We consider the modification of $(E_N)$ which consists in replacing, in the definition of $\psi^N$, the Dirichlet problem by the variational formulation:
\[ \psi^N \in F_N \quad\text{and}\quad \int_\Omega \nabla\psi^N \cdot \nabla\varphi \, dx = \int_\Omega \omega^N \varphi \, dx, \qquad \forall \varphi \in F_N . \]
For the sake of simplicity, from now on we shall also denote by $(E_N)$ this modified dynamical system. Of course, we shall suppose at least that $F_N$ is included in the Sobolev space $H_0^1(\Omega)$, so that for any given $\omega^N$, the above variational problem possesses a unique solution $\psi^N$ (by the Lax–Milgram theorem). One can easily check that the energy and the enstrophy are still conserved, but now we have in addition:

Theorem 1. The volume measure on $F_N$ is conserved by the dynamical system $(E_N)$.

Proof. $F_N$ is endowed with the $L^2$ scalar product. Let us write $(E_N)$ in the form $\omega^N_t = G_N(\omega^N)$, where $G_N(\omega^N) = -P_N\left(u^N \cdot \nabla\omega^N\right)$ is a nonlinear transformation of $F_N$. Then to prove the theorem it suffices to show that the trace of the derivative $G_N'(\omega^N)$ vanishes. Let us compute the first variation of $G_N$ corresponding to a small variation $\delta\omega^N$:
\[ \delta G_N = G_N'(\omega^N)\left[\delta\omega^N\right] = -P_N\left(\delta u^N \cdot \nabla\omega^N\right) - P_N\left(u^N \cdot \nabla\delta\omega^N\right). \]
By definition, we have $\operatorname{tr} G_N'(\omega^N) = \sum_i \left(G_N'(\omega^N)[e_i],\, e_i\right)$, for any orthonormal basis $e_i$ of $F_N$. Let us denote by $u_i$ the vector field associated to $e_i$; we have:
\[ \left(G_N'(\omega^N)[e_i],\, e_i\right) = -\int_\Omega \left(u_i \cdot \nabla\omega^N\right) e_i \, dx - \int_\Omega \left(u^N \cdot \nabla e_i\right) e_i \, dx, \]
but since $\operatorname{div} u^N = 0$, the last term vanishes, and after integration by parts we get:
\[ \left(G_N'(\omega^N)[e_i],\, e_i\right) = \int_\Omega \omega^N \operatorname{curl}\psi_i \cdot \nabla e_i \, dx. \]
Let us consider now the positive definite and symmetric linear operator A defined on $F_N$ by $\int_\Omega \nabla\psi \cdot \nabla\varphi \, dx = (A\psi, \varphi)$, and take for $e_i$ an orthonormal basis of eigenvectors of A. We obviously have $e_i = \lambda_i \psi_i$ ($\lambda_i$ is the eigenvalue corresponding to $e_i$), so that $\operatorname{curl}\psi_i \cdot \nabla e_i = 0$ and $\operatorname{tr} G_N'(\omega^N) = 0$. □

Two main concerns then remain.
(i) Prove the convergence (when N → ∞) of the approximate solution $\omega^N(t)$ towards the solution $\omega(t)$ of the Euler equation.
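The mechanism behind Theorem 1 (a quadratic vector field whose Jacobian has zero trace, hence a flow that preserves volume) can be illustrated on a much smaller system. The sketch below is only an analogy and is not the system $(E_N)$: it uses the classical Euler rigid-body equations on $\mathbb{R}^3$, which also have a quadratic nonlinearity and a conserved quadratic energy. The function names and the choice of inertia values are assumptions made for the illustration.

```python
def rigid_body_field(w, inertia=(1.0, 2.0, 3.0)):
    """Euler rigid-body equations: a quadratic vector field on R^3 (toy analogue)."""
    i1, i2, i3 = inertia
    x, y, z = w
    return [
        (i2 - i3) / i1 * y * z,
        (i3 - i1) / i2 * z * x,
        (i1 - i2) / i3 * x * y,
    ]


def jacobian_trace(field, w, eps=1e-6):
    """Central finite-difference estimate of the divergence (Jacobian trace) at w."""
    trace = 0.0
    for i in range(len(w)):
        plus, minus = list(w), list(w)
        plus[i] += eps
        minus[i] -= eps
        trace += (field(plus)[i] - field(minus)[i]) / (2.0 * eps)
    return trace


def energy_rate(w, inertia=(1.0, 2.0, 3.0)):
    """Time derivative of (1/2) * sum_k I_k w_k^2 along the vector field."""
    f = rigid_body_field(w, inertia)
    return sum(i_k * w_k * f_k for i_k, w_k, f_k in zip(inertia, w, f))


state = [0.3, -1.2, 0.7]
print(jacobian_trace(rigid_body_field, state))  # divergence of the field
print(energy_rate(state))                       # rate of change of the energy
```

Both printed quantities vanish up to rounding: the first is the finite-dimensional Liouville property, the second the conservation of a quadratic invariant.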
(ii) In order to properly define an equilibrium statistical mechanics, one has to study the asymptotic behavior of the (N-dependent) family of invariant probability distributions on $F_N$:
\[ \mu_N = \frac1Z \exp\left(-\alpha \left\|\omega^N\right\|^2_{L^2(\Omega)}\right) d\omega^N, \]
where $d\omega^N$ is the volume measure on $F_N$ given by the $L^2$ metric and the exponential factor is introduced to normalize to a probability.

Point (ii) will be addressed later, and we will now focus on (i). We shall take for approximating space $F_N$ the space $F_h(\Omega)$ of the finite-element approximation of the Sobolev space $H^m(\mathbb{R}^2)$, with compact support in $\Omega$ (m is an integer > 5 and h a small positive parameter, see the Appendix). Then we have the following convergence result whose proof is classical.

Proposition 2. Let $\omega(t)$ be any weak solution of (E), with $\omega_0(x)$ in the space $L^\infty(\Omega)$, and let T > 0 be fixed. Then for all ε > 0, there is h(ε) > 0, such that for all h, 0 < h ≤ h(ε), there is a solution $\omega^h(t)$ of $(E_h)$ such that:
\[ \left\|\omega(t) - \omega^h(t)\right\|_{L^2(\Omega)} \le \varepsilon, \quad \text{for all } t \text{ in } [0,T]. \]
The measures $\mu_h$ on $F_h(\Omega)$ associated to this approximation are not easy to handle, but it appears that a slight change in the approximating dynamical system greatly improves the situation with a view to (ii). Let us denote by $\Gamma^h_t$ the flow on $F_h(\Omega)$ defined by the system $(E_h)$. Let $p_h : L^2_h \to F_h$ be the classical prolongation operator of the finite-element method (see the Appendix), and $\pi_h = p_h^{-1}$. Let us define $L_h(\Omega) = \pi_h F_h(\Omega)$, and denote by $\Theta^h_t = \pi_h \circ \Gamma^h_t \circ p_h$ the flow induced on $L_h(\Omega)$. Obviously $\Theta^h_t$ preserves the volume measure on $L_h(\Omega)$. And from Proposition 2 we deduce the following.

Corollary 3. Let $\omega(t)$ be any weak solution of (E), with $\omega_0(x)$ in the space $L^\infty(\Omega)$, and let T > 0 be fixed. Then for all ε > 0, there is h(ε) > 0 such that for all h, 0 < h ≤ h(ε), there is $\omega_{0h}$ in $L_h(\Omega)$ such that:
\[ \left\|\omega(t) - \Theta^h_t\,\omega_{0h}\right\|_{L^2(\Omega)} \le \varepsilon, \quad \text{for all } t \text{ in } [0,T]. \]
Proof. By the $L^2$-stability property of Euler equation, we only need to prove the result for $\omega_0$ in $C_c^\infty(\Omega)$. Using Proposition 2, we have, for h ≤ h(ε): $\left\|\omega(t) - \omega^h(t)\right\| \le \varepsilon$ on [0, T]. Let us denote $\omega_h(t) = \pi_h\,\omega^h(t)$; we have:
\[ \|\omega(t) - \omega_h(t)\| \le \|\omega(t) - r_h\omega(t)\| + \|r_h\omega(t) - \omega_h(t)\|, \]
where $r_h$ is the classical restriction operator (see the Appendix). But since
\[ \|r_h\omega(t) - \omega_h(t)\| \le c\,\left\|p_h r_h\omega(t) - \omega^h(t)\right\|, \]
we obtain:
\[ \|\omega(t) - \omega_h(t)\| \le \|\omega(t) - r_h\omega(t)\| + c\,\|p_h r_h\omega(t) - \omega(t)\| + c\,\left\|\omega(t) - \omega^h(t)\right\|. \]
Now we have (see the Appendix)
\[ \|\omega(t) - r_h\omega(t)\| \le c\,h\,\|\omega(t)\|_{H^1(\Omega)} \le C(T)\,h \quad \text{on } [0,T], \]
and similarly
\[ \|p_h r_h\omega(t) - \omega(t)\| \le C(T)\,h, \]
thus
\[ \|\omega(t) - \omega_h(t)\| \le C(T)\,h + c\,\varepsilon, \]
and the result follows. □
Let us summarize our results: we have constructed a flow $\Theta^h_t$ on $L_h(\Omega)$ which approximates the Euler flow and preserves the measure $d\omega_h = \otimes_j\, d\omega_h^j$, where $\omega_h(x) = \sum_j \omega_h^j\, \chi\!\left(\frac{x}{h} - j\right)$ (finite sum).

3.2. Long time dynamics and Young measures. As we have seen, Euler system describes the advection of a scalar function (the vorticity) by an incompressible velocity field, thus the vorticity ω remains bounded in $L^\infty(\Omega)$. The functionals $C_\Theta(\omega) = \int_\Omega \Theta(\omega(x))\,dx$ are constants of the motion (for any continuous function Θ). That is to say, the distribution measure of ω, $\pi_\omega$, defined by $\langle \pi_\omega, \Theta\rangle = C_\Theta(\omega)$, is conserved by the flow.

Let us consider an initial datum $\omega_0$. It is well known that, in general, as time evolves, $\Gamma_t\omega_0$ becomes a very intricate oscillating function. Let us denote $r = \|\omega_0\|_{L^\infty(\Omega)}$. Since the measure $\pi_\omega$ is conserved, $\Gamma_t\omega_0$ will remain, for all time, in the ball $L^\infty_r = \{\omega : \|\omega\|_\infty \le r\}$. Extracting a subsequence (if necessary), we may suppose that, as time goes to infinity, $\Gamma_t\omega_0$ converges weakly (for the weak-star topology $\sigma(L^\infty, L^1)$) towards some function $\omega^*$:
\[ \Gamma_t\omega_0 \overset{w}{\longrightarrow} \omega^* . \]
We can easily see that $C_\Theta(\Gamma_t\omega_0)$ does not converge towards $C_\Theta(\omega^*)$ if Θ is nonlinear, whereas some other invariants can converge, as is the case for the energy. So much information (given by the constants of the motion) is lost in this limit process. Thus the weak space $L^\infty(\Omega)$ is not the right one to describe the long-time limits of our system.

Fortunately, the relevant space to do this is well known. The need to describe in some macroscopic way the small-scale oscillations of functions was understood a long time ago by L.C. Young [19]. To solve problems from the calculus of variations, Young introduced a natural generalization of the notion of function: at each point x in Ω we no longer associate a well determined real value, but only some probability distribution on R (such a mapping is called a Young measure on Ω × R).
More precisely, a Young measure ν on Ω × R is a measurable mapping $x \mapsto \nu_x$ from Ω to the set $M_1(\mathbb{R})$ of the Borel probability measures on R, endowed with the narrow topology (weak topology associated to the continuous bounded functions). Clearly, ν defines a positive Borel measure on Ω × R (that we will also denote by ν) by:
\[ \langle \nu, \phi \rangle = \int_\Omega \langle \nu_x, \phi(x,\cdot) \rangle \, dx, \]
for every real function φ(x, z), continuous and compactly supported on Ω × R. To any measurable real function g on Ω, we associate the Young measure $\delta_g : x \mapsto \delta_{g(x)}$, the Dirac mass at g(x). We shall denote by M the convex set of Young measures on Ω × R, and we recall some useful properties:
– M is closed in the space of all bounded Borel measures on Ω × R (with the narrow topology). In the sequel, M will be endowed with the narrow topology.
– If we replace R by the compact interval [−r, r], the space $M_r$ of Young measures on Ω × [−r, r] is compact.
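The loss of information under weak convergence described above can be seen on a standard one-dimensional example, which is an illustration chosen here and not taken from the text. The functions $f_n(x) = \sin(2\pi n x)$ on [0, 1] converge weakly to 0, yet for the nonlinear choice $\Theta(z) = z^2$ the integral of $\Theta(f_n)$ stays equal to 1/2, whereas $\Theta(0) = 0$. The Young measure limit (here the arcsine law on [−1, 1] at every x) retains this value. The helper `average` is a hypothetical name.

```python
import math


def average(theta, n, samples=20_000):
    """Midpoint-rule average of theta(sin(2*pi*n*x)) over x in [0, 1]."""
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) / samples
        total += theta(math.sin(2.0 * math.pi * n * x))
    return total / samples


def linear(z):
    return z


def square(z):
    return z * z


for n in (1, 5, 25):
    # The linear functional tends to that of the weak limit 0;
    # the quadratic one does not tend to square(0) = 0.
    print(n, average(linear, n), average(square, n))
```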
We can now identify the long time limits of the system as Young measures. Indeed, $M_r$ is a suitable compactification of $L^\infty_r$, since the narrow convergence (when t goes to infinity) of $\delta_{\Gamma_t\omega_0}$ towards some Young measure ν preserves the information given by the constants of the motion, that is, for all functions Θ(z):
\[ \int_\Omega \Theta\left(\Gamma_t\omega_0(x)\right) dx \;\to\; \int_\Omega \langle \nu_x, \Theta \rangle \, dx, \]
but the left-hand side is constant and equal to $\langle \pi_{\omega_0}, \Theta \rangle$, so that:
\[ \int_\Omega \nu_x \, dx = \pi_{\omega_0}. \tag{$*$} \]
The same kind of argument applies to the other invariants. For example, since $\Gamma_t\omega_0$ converges weakly towards $\bar\nu(x) = \int z \, d\nu_x(z)$, we have, for the energy, $\Xi(\Gamma_t\omega_0) \to \Xi(\bar\nu)$, which is the energy of the Young measure ν, and thus $\Xi(\bar\nu) = \Xi(\omega_0)$. We shall denote by (∗∗) the set of constraints (associated to the constants of the motion) other than (∗) that $\bar\nu$ has to satisfy: (∗∗) = {energy constraint, angular momentum constraint (when applicable)}. Thus we see that the constants of the motion bring the constraints (∗), (∗∗) on the possible long time limits. Since we do not know anything (in the general case) about the long time behavior of the solutions of Euler equation, we will consider Young measures merely as a convenient framework in which we can perform the thermodynamic limit of a family of invariant measures.
3.3. A large deviation property. In order to define relevant statistical equilibrium states, we have to take the thermodynamic limit of the invariant Liouville measures with the conditioning given by all the constants of the motion.

Let Ω be a bounded open subset of $\mathbb{R}^d$. The space $F_h(\Omega)$ is composed of the functions of the form $\sum_j f_h^j\, \beta\!\left(\frac{x}{h} - j\right)$ which are compactly supported in Ω (see the Appendix). And the space $L_h(\Omega) = \pi_h(F_h(\Omega))$ is composed of functions of the form $\sum_j f_h^j\, \chi\!\left(\frac{x}{h} - j\right)$ which vanish in a neighborhood of the boundary ∂Ω (whose width goes to zero with h).

Let us write a function of $L_h(\Omega)$ as $f_h = \sum_{j \in J_h} f_h^j\, \chi\!\left(\frac{x}{h} - j\right)$. We denote $df_h = \otimes_{j \in J_h}\, df_h^j$, and
\[ \mu_h = \frac1Z \exp\left(-\frac{1}{h^d} \int_\Omega f_h^2 \, dx\right) df_h \]
the probability measure on $L_h(\Omega)$, where the scaling factor $1/h^d$ is introduced in order to give a finite value to the mean
\[ \int_{L_h(\Omega)} \left(\int_\Omega f_h^2 \, dx\right) d\mu_h(f_h) \]
in the limit h → 0. We will write $\mu_h = \otimes_{j \in J_h}\, d\pi_*\!\left(f_h^j\right)$, where $d\pi_*(y) = \frac{1}{\sqrt{\pi}}\, e^{-y^2}\, dy$. We will consider now $f_h$ as a random function with probability distribution $\mu_h$. Thus $\delta_{f_h}$ is a random Young measure on Ω × R.
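The role of the scaling factor $1/h^d$ can be checked numerically. The sketch below rests on two assumptions made for the illustration: the index set $J_h$ is taken to contain about $|\Omega|/h^d$ cells, and the function names are hypothetical. Under $\mu_h$ each coefficient $f_h^j$ has density $e^{-y^2}/\sqrt{\pi}$, that is, a centered Gaussian with variance 1/2, and each cell carries volume $h^d$, so the mean of $\int f_h^2\,dx$ is close to $|\Omega|/2$ whatever the value of h.

```python
import math
import random


def exact_mean_square(h, d=2, volume=1.0):
    """E[ integral of f_h^2 dx ] = (number of cells) * h^d * Var(f_h^j)."""
    cells = round(volume / h**d)  # assumed size of the index set J_h
    return 0.5 * cells * h**d


def sampled_mean_square(h, d=2, volume=1.0, trials=200, seed=0):
    """Monte Carlo estimate of the same mean, sampling f_h from mu_h."""
    rng = random.Random(seed)
    cells = round(volume / h**d)
    sigma = math.sqrt(0.5)  # density exp(-y^2)/sqrt(pi) has variance 1/2
    total = 0.0
    for _ in range(trials):
        total += h**d * sum(rng.gauss(0.0, sigma) ** 2 for _ in range(cells))
    return total / trials


for h in (0.2, 0.1, 0.05):
    print(h, exact_mean_square(h), sampled_mean_square(h))
```

Without the factor $1/h^d$ in the exponent, the variance of each coefficient would scale like $h^{-d}$ and the same mean would diverge as h → 0.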
It follows from Theorem 3.1 in [8] that the family (depending on h) of the random Young measures $\delta_{f_h}$ has the large deviation property with constants $1/h^d$ and rate function $I_\pi(\nu)$, where we denote $\pi = dx \otimes \pi_*$, and $I_\pi(\nu)$ is the classical Kullback information functional, defined on M by:
\[ I_\pi(\nu) = \begin{cases} \displaystyle\int \operatorname{Log}\frac{d\nu}{d\pi} \, d\nu, & \text{if } \nu \text{ is absolutely continuous with respect to } \pi, \\[2ex] +\infty & \text{otherwise.} \end{cases} \]
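For readers less familiar with the Kullback information functional, a discrete analogue may help. This is a minimal sketch on finite probability vectors, with a hypothetical function name; the functional in the text is defined on Young measures, not on vectors. It shows the two branches of the definition: the integral of the logarithm of the density when ν is absolutely continuous with respect to π, and +∞ otherwise.

```python
import math


def kullback_information(nu, pi):
    """Discrete analogue of I_pi(nu) = integral of log(d nu / d pi) d nu."""
    if len(nu) != len(pi):
        raise ValueError("nu and pi must have the same length")
    total = 0.0
    for p, q in zip(nu, pi):
        if p == 0.0:
            continue  # convention 0 * log 0 = 0
        if q == 0.0:
            return math.inf  # nu is not absolutely continuous w.r.t. pi
        total += p * math.log(p / q)
    return total


print(kullback_information([0.5, 0.5], [0.5, 0.5]))  # identical measures
print(kullback_information([1.0, 0.0], [0.5, 0.5]))  # concentrated measure
print(kullback_information([0.5, 0.5], [1.0, 0.0]))  # not absolutely continuous
```

The functional vanishes only when the two measures coincide, which is why minimizing it under constraints selects the states closest to the reference measure.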
A straightforward consequence of this large deviation property is that the random Young measures $\delta_{f_h}$ which in addition satisfy the constraints (∗), (∗∗) are exponentially concentrated (see [8] for a precise statement) about the set $E^*$ of the solutions of the variational problem
\[ I_\pi(\nu^*) = \inf\left\{ I_\pi(\nu) : \nu \in E \right\}, \]
where E is the closed subset of the Young measures on Ω × R satisfying the constraints (∗), (∗∗). Notice that this variational problem has at least one solution, since E is non-empty and closed and $I_\pi(\nu)$ is a lower semi-continuous and inf-compact functional on M.

Now, let us denote $\pi_0 = \frac{1}{|\Omega|}\,\pi_{\omega_0}$, and $\pi^0 = dx \otimes \pi_0$. For all ν satisfying (∗), one can easily get the relationship:
\[ I_\pi(\nu) = I_{\pi^0}(\nu) + |\Omega|\, I_{\pi_*}(\pi_0). \]
Thus if $I_{\pi_*}(\pi_0) < \infty$, minimizing $I_\pi$ or $I_{\pi^0}$ on E gives the same equilibrium set $E^*$. In fact the use of the functional $I_{\pi^0}$ is more natural since it is associated to the invariant distribution $\pi_{\omega_0}$.

To justify the use of $I_{\pi^0}$ in the degenerate case $I_{\pi_*}(\pi_0) = \infty$, one can, for instance, modify the definition of the measures $\mu_h$, and consider $\mu_h = \otimes_{j \in J_h}\, d\pi_h\!\left(f_h^j\right)$, where $d\pi_h(y) = \frac1Z \exp\left(-Q_h(y)\right) dy$, and the polynomial function $Q_h(y)$ is such that $\pi_h$ converges towards $\pi_0$ in the narrow topology when h → 0. Of course, we have
\[ \mu_h = \frac1Z \exp\left(-\frac{1}{h^d} \int_\Omega Q_h(f_h) \, dx\right) df_h . \]
It is not hard to see that the proof of Theorem 3.1 in [8] works for these measures; it follows that $\delta_{f_h}$ has the large deviation property with constants $1/h^d$ and rate function $I_{\pi^0}(\nu)$. Notice that $-I_{\pi^0}(\nu)$ is the entropy, that is, the functional which measures the disorder created in the fluid by the turbulent mixing.

Remark 1. For Euler equation we have d = 2.

Remark 2. In order to get probability measures, we multiply $df_h$ by $\frac1Z \exp\left(-\frac{1}{h^d} \int_\Omega f_h^2 \, dx\right)$ despite the fact that this functional is possibly not conserved by the flow $\Theta^h_t$. Indeed, we consider it an authorized trick to multiply the measures by any functional which is conserved by the flow of the infinite dimensional dynamical system.
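The relationship between $I_\pi$ and $I_{\pi^0}$ used in this section can be recovered by a short computation. The following is a sketch under the assumptions that $\nu_x$ is absolutely continuous with respect to $\pi_0$ for almost every x and that $\pi_0$ is absolutely continuous with respect to $\pi_*$; it uses the constraint (∗) in the form $\int_\Omega \nu_x\,dx = \pi_{\omega_0} = |\Omega|\,\pi_0$:

```latex
I_\pi(\nu)
  = \int_\Omega \int_{\mathbb{R}} \log\frac{d\nu_x}{d\pi_*}\, d\nu_x \, dx
  = \int_\Omega \int_{\mathbb{R}} \log\frac{d\nu_x}{d\pi_0}\, d\nu_x \, dx
    + \int_{\mathbb{R}} \log\frac{d\pi_0}{d\pi_*}\; d\!\left(\int_\Omega \nu_x \, dx\right)
  = I_{\pi^0}(\nu) + |\Omega| \int_{\mathbb{R}} \log\frac{d\pi_0}{d\pi_*}\, d\pi_0
  = I_{\pi^0}(\nu) + |\Omega|\, I_{\pi_*}(\pi_0).
```

The second term does not depend on ν, which is why both functionals have the same minimizers on E when it is finite.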
4. The Statistical Equilibrium States

Once we have identified the relevant entropy functional, the determination of the equilibrium states comes down to the solution of a variational problem: find the minimum value of $I_{\pi^0}(\nu)$ under the constraints given by the constants of the motion of the system. After that it remains to discuss at a physical level the relevance of these states. The discussion of the equilibrium states for Euler equation was done at a mathematical and physical level in [11, 13, 15, 16], and we refer to these papers.

5. Comments

Let us now address, at a heuristic level, the remaining difficult issue of a complete justification of this equilibrium statistical mechanics. We will consider the well known phenomenon of the formation of coherent structures in 2D turbulence. We can observe, in meteorology, experiments or numerical simulations, that such structures form. Let us scrutinize what chain of logic would lead us to identify these structures with the statistical equilibrium states previously described.

Notice first that we observe the phenomenon (the formation of the structure) over some finite time interval [0, T]. Obviously the turbulent real fluid has some very small dimensionless viscosity, so that we may suppose that in our time interval the flow is well approximated, in a strong $L^2$ sense, by a solution of 2D Euler (we consider for example the case of periodic boundary conditions to avoid the problem of boundary layer formation). Then we can approximate (still in a strong $L^2$ sense), uniformly over [0, T], the flow by a solution of our finite dimensional system, taking the number of degrees of freedom N large enough. Now we have to make the assumption that this finite dimensional system is ergodic and comes close to equilibrium in a mixing time T(N) which is less than T.
Of course to have a good approximation of the flow over [0, T ] we have to take N very large (this is well known in hydrodynamical simulations) and the crucial question is: how T (N ) increases with N ? We clearly don’t have any rigorous argument to insure that T (N ) does not increase dramatically with N so that the above justification might fail. From a careful examination of the results of many tests in various situations emerge the following facts (see [12] and references therein). 1) If the turbulent flow reaches an equilibrium state after a mixing process occupying the whole domain, then the description of the final state as a global maximum entropy state is accurate. 2) In many cases the flow reaches a kind of equilibrium which is not a statistical equilibrium in the whole domain occupied by the flow. This indicates clearly that difficulties may arise with the ergodic hypothesis. 3) In such cases, inside the subdomain occupied by the coherent structure, the relationship (vorticity-stream function) associated to our entropy functional is fairly well satisfied. This indicates also clearly that our entropy functional retains some relevance even when ergodicity fails. In conclusion, from the above considerations, it seems unrealistic to seek a complete mathematical justification (involving the dynamics) of the statistical equilibrium states. Nevertheless we can go on to study at a physical level and investigate the relaxation process about the equilibrium; this can bring some light to the dynamical mechanisms responsible for a possible lack of ergodicity [12].
The picture of a turbulent mixing of the vorticity driving the system towards its equilibrium is very similar to the violent relaxation process in Vlasov–Poisson system that was suggested by astrophysicists to explain the formation of galaxies [5]. But in the case of stellar systems a true difficulty occurs: the stars are not naturally confined in a bounded container, and there is no equilibrium state in the whole space. One can put forward physical arguments [2] to impose such a confinement, but this point is rather controversial at this time. Nevertheless, once a spatial confinement is imposed, the above analysis on 2D Euler extends with only minor technical changes to the 6D Vlasov–Poisson system.

Appendix

Finite-Element Approximation. For the convenience of the reader, we briefly recall some standard notations and properties [1].

Approximation of the Sobolev space $H^m(\mathbb{R}^d)$. We denote $Q_d = \,]-1/2, 1/2[^d$, χ the characteristic function of $Q_d$, and $\beta = \chi * \dots * \chi$ (m + 1 terms). For a given parameter h > 0, we define a prolongation operator $p_h$: to any function $f_h = \sum_j f_h^j\, \chi\!\left(\frac{x}{h} - j\right)$ (j belongs to $\mathbb{Z}^d$), we associate the function
\[ p_h f_h = \sum_j f_h^j\, \beta\!\left(\frac{x}{h} - j\right). \]
Let us consider now a compactly supported measurable bounded function λ(x) satisfying $\int \lambda(x)\,dx = 1$ and
\[ \iint \beta(x)\,\lambda(y)\,(x - y)^k \, dx \, dy = \begin{cases} 1 & \text{for } k = 0, \\ 0 & \text{for } 0 < |k| \le m, \end{cases} \]
where $k = (k_1, \dots, k_d)$, $|k| = k_1 + \dots + k_d$, and $(x - y)^k = (x_1 - y_1)^{k_1} \cdots (x_d - y_d)^{k_d}$. Then we define a restriction operator $r_h$: for $f \in L^1_{loc}(\mathbb{R}^d)$, we denote
\[ f_h^j = \frac{1}{h^d} \int \lambda\!\left(\frac{x}{h} - j\right) f(x) \, dx, \]
and define
\[ r_h f = \sum_j f_h^j\, \chi\!\left(\frac{x}{h} - j\right). \]
We have the well known estimates (where c denotes different constants which do not depend on h):
(1) If $f \in L^2(\mathbb{R}^d)$, we have $\|r_h f\|_{L^2} \le c\,\|f\|_{L^2}$.
(2) If $f_h \in L^2(\mathbb{R}^d)$, we have $p_h f_h \in H^m$ and $\|p_h f_h\|_{H^m} \le \frac{c}{h^m}\,\|f_h\|_{L^2}$; moreover $c\,\|f_h\|_{L^2} \le \|p_h f_h\|_{L^2} \le \|f_h\|_{L^2}$, where c > 0 does not depend on h. It follows that $p_h$ is an isomorphism from the space $L^2_h$ of the functions $f_h$ which are square integrable onto a subspace $F_h$ of $H^m$.
(3) If $f \in H^{m+1}(\mathbb{R}^d)$, for 0 ≤ k ≤ s ≤ m + 1 and k ≤ m, we have: $\|f - p_h r_h f\|_{H^k} \le c\,h^{s-k}\,\|f\|_{H^s}$.
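The two-sided $L^2$ bound in estimate (2) can be made concrete in the simplest case. The sketch below is an illustration under restrictive assumptions that are not those of the paper: d = 1 and β = χ ∗ χ (two factors only), so that β is the hat function and $p_h f_h$ is the piecewise linear interpolant of the coefficients. The function names are hypothetical, and the lower constant $1/\sqrt{3}$ checked in the usage is an elementary fact about this particular case, not a value given in the text.

```python
import math


def l2_norm_coefficients(coeffs, h):
    """L2 norm of the piecewise constant f_h = sum_j f^j chi(x/h - j), d = 1."""
    return math.sqrt(h * sum(c * c for c in coeffs))


def l2_norm_prolongation(coeffs, h):
    """L2 norm of p_h f_h = sum_j f^j beta(x/h - j), beta the hat function.

    On each cell of length h the prolongation is linear between two
    consecutive coefficients a and b, and the integral of its square
    over that cell equals h * (a^2 + a*b + b^2) / 3.
    """
    padded = [0.0] + list(coeffs) + [0.0]
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        total += h * (a * a + a * b + b * b) / 3.0
    return math.sqrt(total)


coeffs = [1.0, -2.0, 0.5, 3.0]
h = 0.1
print(l2_norm_prolongation(coeffs, h), l2_norm_coefficients(coeffs, h))
```

The prolonged function is smoother than the piecewise constant one but never larger in $L^2$, which is the right-hand inequality of estimate (2) in this special case.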
The periodic case. Let us suppose that $h = 1/N$. We denote by $H^m_{per}$ the Sobolev space $H^m$ on the $d$-dimensional torus $(\mathbb{R}/\mathbb{Z})^d$, and by $F_h(Q_d)$ the space of the restrictions to $Q_d$ of the functions of the form $\sum_j f_h^j \beta\left(\frac{x}{h} - j\right)$ which are $\mathbb{Z}^d$-periodic (i.e. $f_h^{j+Ni} = f_h^j$, for all $j$, $i$ in $\mathbb{Z}^d$). $F_h(Q_d)$ is endowed with the $L^2$ scalar product. Obviously, if $f$ and $f_h$ are $\mathbb{Z}^d$-periodic, so are $r_h f$ and $p_h f_h$. And the following estimates hold:
(1') If $f \in L^2_{per}$, we have $\|r_h f\|_{L^2(Q_d)} \le c\|f\|_{L^2(Q_d)}$.
(2') If $f_h$ is $\mathbb{Z}^d$-periodic, we have $\|p_h f_h\|_{H^m_{per}} \le \frac{c}{h^m}\|f_h\|_{L^2(Q_d)}$, and $c\|f_h\|_{L^2(Q_d)} \le \|p_h f_h\|_{L^2(Q_d)} \le \|f_h\|_{L^2(Q_d)}$.
(3') If $f \in H^{m+1}_{per}$, for $0 \le k \le s \le m + 1$ and $k \le m$, we have $\|f - p_h r_h f\|_{H^k_{per}} \le c h^{s-k}\|f\|_{H^s_{per}}$.
References
1. Aubin, J.P.: Approximation of elliptic boundary-value problems. New York: Wiley-Interscience, 1972
2. Chavanis, P.H., Sommeria, J., Robert, R.: Statistical mechanics of two-dimensional vortices and collisionless stellar systems. The Astrophysical J. 471, 385–399 (1996)
3. Eyink, G.L., Spohn, H.: Negative states and large-scale long-lived vortices in two-dimensional turbulence. J. Stat. Phys. 70, 833–886 (1993)
4. Jordan, R.: A statistical equilibrium model of coherent structures in magnetohydrodynamics. Nonlinearity 8, 585–614 (1995)
5. Lynden-Bell, D.: Statistical mechanics of violent relaxation in stellar systems. Mon. Not. R. Astr. Soc. 181, 405 (1967)
6. Marchioro, C., Pulvirenti, M.: Mathematical Theory of Incompressible Nonviscous Fluids. New York: Springer-Verlag, 1994
7. Michel, J., Robert, R.: Statistical mechanical theory of the great red spot of Jupiter. J. Stat. Phys. 77 3/4, 645–666 (1994)
8. Michel, J., Robert, R.: Large deviations for Young measures and statistical mechanics of infinite dimensional dynamical systems with conservation law. Commun. Math. Phys. 159, 195–215 (1994)
9. Miller, J., Weichman, P.B., Cross, M.C.: Statistical mechanics, Euler equations, and Jupiter’s red spot. Phys. Rev. A 45, 2328–2359 (1992)
10. Onsager, L.: Statistical hydrodynamics. Nuovo Cimento suppl. 6, 279 (1949)
11. Robert, R.: A maximum entropy principle for two-dimensional Euler equations. J. Stat. Phys. 65, 3/4, 531–553 (1991)
12. Robert, R., Rosier, C.: On the modelling of small scales for 2D turbulent flows. J. Stat. Phys. 86, 3/4, 1997
13. Robert, R., Sommeria, J.: Statistical equilibrium states for two-dimensional flows. J. Fluid Mech. 229, 291–310 (1991)
14. Sommeria, J., Nore, C., Dumont, T., Robert, R.: Théorie statistique de la tache rouge de Jupiter. C. R. Acad. Sci. Paris, 312 Série II, 999–1005 (1991)
15. Sommeria, J., Staquet, C., Robert, R.: Final equilibrium state of a two-dimensional shear layer. J. Fluid Mech. 233, 661–689 (1991)
16. Thess, A., Sommeria, J., Jüttner, B.: Inertial organization of a two-dimensional turbulent vortex street. Phys. Fluids 6 (7), 2417–2429 (1994)
17. Turkington, B., Jordan, R.: Turbulent relaxation of a magnetofluid: A statistical equilibrium model. In: Proceedings, International Conference on Advances in Geometric Analysis and Continuum Mechanics. Stanford University, August 1993
18. Youdovitch, V.I.: Non-stationary flow of an incompressible liquid. Zh. Vych. Mat. 3, 1032–1066 (1963)
19. Young, L.C.: Generalized surfaces in the calculus of variations. Ann. Math. 43, 84–103 (1942)
20. Zachos, C.K.: Hamiltonian flows, SU(8), SO(8), USp(8) and strings. In: Differential geometric methods in theoretical physics, L.L. Chau, W. Nahm (eds.). New York: Plenum Press, 1990
21. Zeitlin, V.: Finite mode analogs of 2D ideal hydrodynamics: Coadjoint orbits and local canonical structure. Physica D 49, 353–362 (1991)

Communicated by J. L. Lebowitz
Commun. Math. Phys. 212, 257 – 275 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
The Stability of Magnetic Vortices∗

S. Gustafson†, I. M. Sigal

Dept. of Mathematics, University of Toronto, 100 St. George St., Toronto, ON, Canada, M5S 3G3

Received: 16 November 1998 / Accepted: 3 January 2000

∗ Research on this paper was supported by NSERC under grant N7901
† Present address: Courant Institute, 251 Mercer St., New York, NY 10012, USA. E-mail: [email protected]

Abstract: We study the linearized stability of n-vortex ($n \in \mathbb{Z}$) solutions of the magnetic Ginzburg–Landau (or Abelian Higgs) equations. We prove that the fundamental vortices ($n = \pm 1$) are stable for all values of the coupling constant, $\lambda$, and we prove that the higher-degree vortices ($|n| \ge 2$) are stable for $\lambda < 1$, and unstable for $\lambda > 1$. This resolves a long-standing conjecture (see, e.g., [JT]).

1. Introduction

In this paper, we determine the stability of magnetic (or Abelian Higgs) vortices. These are certain critical points of the energy functional
\[ E(\psi, A) = \frac{1}{2} \int_{\mathbb{R}^2} \left\{ |\nabla_A \psi|^2 + (\nabla \times A)^2 + \frac{\lambda}{4}(|\psi|^2 - 1)^2 \right\} \tag{1} \]
for the fields
\[ A : \mathbb{R}^2 \to \mathbb{R}^2 \quad \text{and} \quad \psi : \mathbb{R}^2 \to \mathbb{C}. \]
Here $\nabla_A = \nabla - iA$ is the covariant gradient, and $\lambda > 0$ is a coupling constant. For a vector, $A$, $\nabla \times A$ is the scalar $\partial_1 A_2 - \partial_2 A_1$, and for a scalar $\xi$, $\nabla \times \xi$ is the vector $(-\partial_2 \xi, \partial_1 \xi)$. Critical points of $E(\psi, A)$ satisfy the Ginzburg–Landau (GL) equations
\[ -\Delta_A \psi + \frac{\lambda}{2}(|\psi|^2 - 1)\psi = 0, \tag{2} \]
\[ \nabla \times \nabla \times A + \operatorname{Im}(\bar\psi \nabla_A \psi) = 0, \tag{3} \]
where $\Delta_A = \nabla_A \cdot \nabla_A$.

Physically, the functional $E(\psi, A)$ gives the difference in free energy between the superconducting and normal states near the transition temperature in the Ginzburg–Landau theory. $A$ is the vector potential ($\nabla \times A$ is the induced magnetic field), and $\psi$ is an order parameter. The modulus of $\psi$ is interpreted as describing the local density of superconducting Cooper pairs of electrons.

The functional $E(\psi, A)$ also gives the energy of a static configuration in the Yang-Mills-Higgs classical gauge theory on $\mathbb{R}^2$, with abelian gauge group $U(1)$. In this case $A$ is a connection on the principal $U(1)$-bundle $\mathbb{R}^2 \times U(1)$, and $\psi$ is the Higgs field (see [JT] for details).

A central feature of the functional $E(\psi, A)$ (and the GL equations) is its infinite-dimensional symmetry group. Specifically, $E(\psi, A)$ is invariant under $U(1)$ gauge transformations,
\[ \psi \mapsto e^{i\gamma}\psi, \tag{4} \]
\[ A \mapsto A + \nabla\gamma \tag{5} \]
for any smooth $\gamma : \mathbb{R}^2 \to \mathbb{R}$. In addition, $E(\psi, A)$ is invariant under coordinate translations, and under the coordinate rotation transformation
\[ \psi(x) \mapsto \psi(g^{-1}x), \qquad A(x) \mapsto gA(g^{-1}x) \tag{6} \]
for $g \in SO(2)$.

Finite energy field configurations satisfy
\[ |\psi| \to 1 \quad \text{as} \quad |x| \to \infty \tag{7} \]
which leads to the definition of the topological degree, $\deg(\psi)$, of such a configuration:
\[ \deg(\psi) = \deg\left( \left.\frac{\psi}{|\psi|}\right|_{|x|=R} : S^1 \to S^1 \right) \]
($R$ sufficiently large). The degree is related to the phenomenon of flux quantization. Indeed, an application of Stokes’ theorem shows that a finite-energy configuration satisfies
\[ \deg(\psi) = \frac{1}{2\pi} \int_{\mathbb{R}^2} (\nabla \times A). \]
We study, in particular, “radially-symmetric” or “equivariant” fields of the form
\[ \psi^{(n)}(x) = f_n(r)e^{in\theta}, \qquad A^{(n)}(x) = n\,\frac{a_n(r)}{r}\,\hat{x}^\perp, \tag{8} \]
where $(r, \theta)$ are polar coordinates on $\mathbb{R}^2$, $\hat{x}^\perp = \frac{1}{r}(-x_2, x_1)^t$, $n$ is an integer, and $f_n, a_n : [0, \infty) \to \mathbb{R}$. It is easily checked that such configurations (if they satisfy (7)) have degree $n$. The existence of critical points of this form is well-known (see Sect. 2.1). They are called n-vortices.
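As a consistency check, the flux formula can be evaluated directly on the equivariant ansatz (8). The sketch below assumes the boundary behaviour $a_n(0) = 0$ and $a_n(r) \to 1$ as $r \to \infty$, which is the behaviour recorded for the vortex profiles in Theorem 2.

```latex
% For A^{(n)} = n (a_n(r)/r) \hat{x}^\perp the only nonzero polar component is
% A_\theta = n a_n(r)/r, so the scalar curl is
\[
  \nabla \times A^{(n)} = \frac{1}{r}\,\partial_r\bigl(r A_\theta\bigr) = \frac{n\,a_n'(r)}{r}.
\]
% Integrating over R^2 in polar coordinates:
\[
  \frac{1}{2\pi}\int_{\mathbb{R}^2} \nabla \times A^{(n)}
  = \frac{1}{2\pi}\int_0^{2\pi}\!\!\int_0^\infty \frac{n\,a_n'(r)}{r}\, r\,dr\,d\theta
  = n\bigl[a_n(\infty) - a_n(0)\bigr] = n .
\]
```

Under these assumptions the flux of an n-vortex is $2\pi n$, in agreement with $\deg(\psi^{(n)}) = n$.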
Our main results concern the stability of these n-vortex solutions. Let
\[ L^{(n)} = \operatorname{Hess} E(\psi^{(n)}, A^{(n)}) \]
be the linearized operator for GL around the n-vortex, acting on the space $X = L^2(\mathbb{R}^2; \mathbb{C}) \oplus L^2(\mathbb{R}^2; \mathbb{R}^2)$. The symmetry group of $E(\psi, A)$ gives rise to an infinite-dimensional subspace of $\ker(L^{(n)}) \subset X$ (see Sect. 3.2), which we denote here by $Z_{sym}$. We say the n-vortex is (linearly) stable if for some $c > 0$,
\[ L^{(n)}|_{Z_{sym}^\perp} \ge c, \]
and unstable if $L^{(n)}$ has a negative eigenvalue. The basic result of this paper is the following linearized stability statement:

Theorem 1.
1. (Stability of fundamental vortices) For all $\lambda > 0$, the ±1-vortex is stable.
2. (Stability / instability of higher-degree vortices) For $|n| \ge 2$, the n-vortex is stable for $\lambda < 1$, unstable for $\lambda > 1$.

Theorem 1 is the basic ingredient in a proof of the nonlinear dynamical stability / instability of the n-vortex for certain dynamical versions of the GL equations. These include the GL gradient flow equations, and the Abelian Higgs (Lorentz-invariant) equations. These dynamical stability results are established in a separate work ([G2]). Other work on dynamics of magnetic vortices appears in [DS, S, S2].

The statement of Theorem 1 was conjectured in [JT] on the basis of numerical observations (see [JR]). Bogomolnyi ([B]) gave an argument for instability of vortices for $\lambda > 1$, $|n| \ge 2$. Our result rigorously establishes this property. The instability of higher-degree vortices for sufficiently large $\lambda$ was established in [ABG]. The stability of vortices of Ginzburg–Landau equations without magnetic field was studied in [LL, M, OS1]. The stability of “monopole” solutions of a non-abelian generalization of (2)–(3) was studied in [AD] (see also [G1]).

The solutions of (2)–(3) are well-understood in the case of critical coupling, $\lambda = 1$.
In this case, the Bogomolnyi method ([B]) gives a pair of first-order equations whose solutions are global minimizers of E(ψ, A) among fields of fixed degree (and hence solutions of the GL equations). Taubes ([T1,T2]) has shown that all solutions of GL with λ = 1 are solutions of these first-order equations, and that for a given degree n, the gauge-inequivalent solutions form a 2|n|-parameter family. The 2|n| parameters describe the locations of the zeros of the scalar field. This is discussed in more detail in [JT] (see also [BGP]) and Sect. 6. We remark that for λ = 1, an n-vortex solution (8) corresponds to the case when all |n| zeros of the scalar field lie at the origin. The remainder of this paper is organized as follows. In Sect. 2 we describe in detail various properties of the n-vortex. In particular, we establish an important estimate on the n-vortex profiles which differentiates between the cases λ < 1 and λ > 1. In Sect. 3, we introduce the linearized operator, fix the gauge on the space of perturbations, and identify the zero-modes due to symmetry-breaking. Sections 4 through 7 comprise
a proof of Theorem 1. A block-decomposition for the linearized operator is described in Sect. 4. This approach is similar to that used to study the stability of non-magnetic vortices in [OS1] and [G1]. In Sect. 5, we establish the positivity of certain blocks (those corresponding to the radially-symmetric variational problem, and those containing the translational zero-modes) for all $\lambda$, which completes the stability proof for the ±1-vortices. The basic techniques are the characterization of symmetry-breaking in terms of zero-modes of the Hessian (or linearized operator), and a Perron-Frobenius type argument, based on a version of the maximum principle for systems (Proposition 6), which shows that the translational zero-modes correspond to the bottom of the spectrum of the linearized operator. A more careful analysis is needed for $|n| \ge 2$. This requires us to review some aspects of the critical case ($\lambda = 1$) in Sect. 6. The stability / instability proof for $|n| \ge 2$ is completed in Sect. 7. We use an extension of Bogomolnyi’s instability argument, and another application of the Perron-Frobenius theory.

2. The n-Vortex

In this section we discuss the existence, and properties, of n-vortex solutions.

2.1. Vortex solutions. The existence of solutions of (GL) of the form (8) is well-known:

Theorem 2 (Vortex existence; [P, BC]). For every integer $n$, and every $\lambda > 0$, there is a solution
\[ \psi^{(n)}(x) = f_n(r)e^{in\theta}, \qquad A^{(n)}(x) = n\,\frac{a_n(r)}{r}\,\hat{x}^\perp \tag{9} \]
of the variational equations (2)–(3). In particular, the radial functions $(f_n, a_n)$ minimize the radial energy functional
\[ E_r^{(n)}(f, a) = \frac{1}{2} \int_0^\infty \left\{ (f')^2 + n^2 \frac{(1-a)^2 f^2}{r^2} + n^2 \frac{(a')^2}{r^2} + \frac{\lambda}{4}(f^2 - 1)^2 \right\} r\,dr \tag{10} \]
(which is the full energy functional (1) restricted to fields of the form (8)) in the class
\[ \left\{ f, a : [0, \infty) \to \mathbb{R} \;\middle|\; 1 - f \in H^1(r\,dr),\ \frac{a}{r} \in L^2_{loc}(r\,dr),\ \frac{a'}{r} \in L^2(r\,dr) \right\}. \]
The functions $f_n$, $a_n$ are smooth, and have the following properties (for $n \ne 0$):
1. $0 < f_n < 1$, $0 < a_n < 1$ on $(0, \infty)$,
2. $f_n', a_n' > 0$,
3. $f_n \sim c r^n$, $a_n \sim d r^2$, as $r \to 0$ ($c > 0$ and $d > 0$ are constants),
4. $1 - f_n$, $1 - a_n \to 0$ as $r \to \infty$, with an exponential rate of decay.

We call $(\psi^{(n)}, A^{(n)})$ an n-vortex (centred at the origin). It follows immediately that the functions $f_n$ and $a_n$ satisfy the ODEs
\[ -\Delta_r f_n + \frac{n^2(1 - a_n)^2}{r^2} f_n + \frac{\lambda}{2}(f_n^2 - 1)f_n = 0 \tag{11} \]
and
\[ -a_n'' + \frac{a_n'}{r} - f_n^2(1 - a_n) = 0. \tag{12} \]
Remark 1. The n-vortex is known to be the unique solution of (GL) of the form (8) when $\lambda \ge 2n^2$ [ABGi]. In the appendix, we show that for $\lambda \ge 2n^2$, any such solution minimizes $E_r^{(n)}$.

Remark 2. The functions $f_n$ and $a_n$ also depend on $\lambda$, but we suppress this dependence for ease of notation. When it will cause no confusion, we will also drop the subscript $n$.

Remark 3. The discrete symmetry $\psi \mapsto \bar\psi$, $A \mapsto -A$ of (GL) interchanges $(\psi^{(n)}, A^{(n)})$ and $(\psi^{(-n)}, A^{(-n)})$. Thus, we can assume $n \ge 0$.

2.2. An estimate on the vortex profiles. The following inequality, relating the exponentially decaying quantities $f'$ and $1 - a$, plays a crucial role in the stability / instability proof.

Proposition 1. We have
\[ \begin{cases} f'(r) > \dfrac{n(1 - a(r))}{r} f(r) & \text{for } \lambda < 1, \\[2mm] f'(r) < \dfrac{n(1 - a(r))}{r} f(r) & \text{for } \lambda > 1. \end{cases} \tag{13} \]

Proof. Define $e(r) \equiv f'(r) - \frac{n(1 - a(r))}{r} f(r)$. The properties listed in Theorem 2 imply that $e(r) \to 0$ as $r \to 0$ and as $r \to \infty$. Using the ODEs (11)–(12) we can derive the equation
\[ (-\Delta_r + \alpha)e + \frac{e}{f}\,e' = (1 - \lambda) f^2 f', \]
where
\[ \alpha(r) = \frac{1 + n(1 - a)}{r^2}\left(1 + \frac{r f'}{f}\right) + f^2 + \frac{n a'}{r} > 0, \]
and the result follows from the maximum principle. □

3. The Linearized Operator

In this section, we introduce the linearized operator (or Hessian) around the n-vortex, and identify its symmetry zero-modes.

3.1. Definition of the linearized operator. We work on the real Hilbert space $X = L^2(\mathbb{R}^2; \mathbb{C}) \oplus L^2(\mathbb{R}^2; \mathbb{R}^2)$ with inner-product
\[ \langle (\xi, B), (\eta, C) \rangle_X = \int_{\mathbb{R}^2} \{ \operatorname{Re}(\bar\xi \eta) + B \cdot C \}. \]
We define the linearized operator, $L_{\psi,A}$ (= the Hessian of $E(\psi, A)$) at a solution $(\psi, A)$ of (2)–(3) through the quadratic form
\[ \left.\frac{\partial^2}{\partial\epsilon\,\partial\delta} E(\psi + \epsilon\xi + \delta\eta, A + \epsilon B + \delta C)\right|_{\epsilon=\delta=0} = \langle (\eta, C), L_{\psi,A}(\xi, B) \rangle_X \]
for all $(\xi, B), (\eta, C) \in X$. The result is
\[ L_{\psi,A} \begin{pmatrix} \xi \\ B \end{pmatrix} = \begin{pmatrix} [-\Delta_A + \frac{\lambda}{2}(2|\psi|^2 - 1)]\xi + \frac{\lambda}{2}\psi^2\bar\xi + i[2\nabla_A\psi + \psi\nabla] \cdot B \\ \operatorname{Im}([\overline{\nabla_A\psi} - \bar\psi\nabla_A]\xi) + (-\Delta + \nabla\nabla + |\psi|^2) \cdot B \end{pmatrix}. \]
3.2. Symmetry zero-modes. We identify the part of the kernel of the operator $L^{(n)} \equiv L_{\psi^{(n)},A^{(n)}}$ which is due to the symmetry group.

Proposition 2. We have
1.
\[ L^{(n)} \begin{pmatrix} i\gamma\psi^{(n)} \\ \nabla\gamma \end{pmatrix} = 0 \tag{14} \]
for any $\gamma : \mathbb{R}^2 \to \mathbb{R}$.
2.
\[ L^{(n)} \begin{pmatrix} \partial_j\psi^{(n)} \\ \partial_j A^{(n)} \end{pmatrix} = 0 \tag{15} \]
for $j = 1, 2$.

Proof. We use the basic result that the generator of a one-parameter group of symmetries of $E(\psi, A)$, applied to the n-vortex, lies in the kernel of $L^{(n)}$. The vector in (14) is easily seen to be the generator of a one-parameter family of gauge transformations (4)–(5) applied to the n-vortex. Similarly, the vector in (15) is the generator of coordinate translations applied to the n-vortex. □

Remark 4. Applying the generator of the coordinate rotational symmetry (6) to the n-vortex gives us nothing new. This is covered by the gauge-symmetry case.

We define $Z_{sym}$ to be the subspace of $X$ spanned by the $L^2$ zero-modes described in Proposition 2. We recall that the n-vortex is called stable if there is a constant $c > 0$ such that
\[ L^{(n)}|_{Z_{sym}^\perp} \ge c, \tag{16} \]
and unstable if $L^{(n)}$ has a negative eigenvalue.

3.3. Gauge fixing. In order to remove the infinite dimensional kernel of $L^{(n)}$ arising from gauge symmetry, we restrict the class of perturbations. Specifically, we restrict $L^{(n)}$ to the space of those perturbations $(\xi, B) \in X$ which are orthogonal to the $L^2$ gauge zero-modes (14). That is,
\[ \left\langle \begin{pmatrix} \xi \\ B \end{pmatrix}, \begin{pmatrix} i\gamma\psi^{(n)} \\ \nabla\gamma \end{pmatrix} \right\rangle_X = 0 \]
for all $\gamma$. Integration by parts gives the gauge condition
\[ \operatorname{Im}(\overline{\psi^{(n)}}\,\xi) = \nabla \cdot B. \tag{17} \]
As is done in [S], we consider a modified quadratic form $\tilde{L}^{(n)}$, defined by
\[ \langle \alpha, \tilde{L}^{(n)}\alpha \rangle = \langle \alpha, L^{(n)}\alpha \rangle + \int (\operatorname{Im}(\overline{\psi^{(n)}}\,\xi) - \nabla \cdot B)^2 \]
for $\alpha = (\xi, B) \in X$. Clearly, $\tilde{L}^{(n)}$ agrees with $L^{(n)}$ on the subspace of $X$ specified by the gauge condition (17). This modification has the important effect of shifting the essential spectrum away from zero (see (26)). A straightforward computation gives the following expression for $\tilde{L}^{(n)}$:
\[ \tilde{L}^{(n)} \begin{pmatrix} \xi \\ B \end{pmatrix} = \begin{pmatrix} [-\Delta_A + \frac{\lambda}{2}(2|\psi|^2 - 1) + \frac{1}{2}|\psi|^2]\xi + \frac{1}{2}(\lambda - 1)\psi^2\bar\xi + 2i\nabla_A\psi \cdot B \\ 2\operatorname{Im}[\overline{\nabla_A\psi}\,\xi] + [-\Delta + |\psi|^2]B \end{pmatrix}. \]
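The integration by parts behind the gauge condition (17) is short enough to spell out. The following is a sketch, assuming $\gamma$ is smooth and decays fast enough that boundary terms vanish (an assumption made here only for the computation).

```latex
\[
  \left\langle \begin{pmatrix}\xi\\ B\end{pmatrix},
               \begin{pmatrix} i\gamma\psi^{(n)}\\ \nabla\gamma\end{pmatrix}\right\rangle_X
  = \int_{\mathbb{R}^2} \bigl\{ \operatorname{Re}\bigl(\bar\xi\, i\gamma\,\psi^{(n)}\bigr) + B\cdot\nabla\gamma \bigr\}
  = \int_{\mathbb{R}^2} \gamma\,\bigl\{ \operatorname{Im}\bigl(\overline{\psi^{(n)}}\,\xi\bigr) - \nabla\cdot B \bigr\}.
\]
% Used: Re(i z) = -Im(z), and -Im(\bar\xi \psi) = Im(\bar\psi \xi);
% the term B . grad(gamma) is integrated by parts.
% Vanishing for every admissible gamma forces Im(\bar\psi^{(n)} \xi) = div B, which is (17).
```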
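To establish Theorem 1, it suffices to prove that $\tilde{L}^{(n)} \ge c > 0$ on the subspace of $X$ orthogonal to the translational zero-modes (15).

$\tilde{L}^{(n)}$ is a real-linear operator on $X$. It is convenient to identify $L^2(\mathbb{R}^2; \mathbb{R}^2)$ with $L^2(\mathbb{R}^2; \mathbb{C})$ through the correspondence
\[ B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} \leftrightarrow B^c \equiv B_1 - iB_2, \tag{18} \]
and then to complexify the space $X \mapsto \tilde{X} = [L^2(\mathbb{R}^2; \mathbb{C})]^4$ via
\[ (\xi, B) \mapsto (\xi, \bar\xi, B^c, \bar{B}^c). \tag{19} \]
As a result, $\tilde{L}^{(n)}$ is replaced by the complex-linear operator
\[ \tilde{\tilde{L}}^{(n)} = \operatorname{diag}\{-\Delta_A, -\overline{\Delta_A}, -\Delta, -\Delta\} + V^{(n)}, \]
where
\[ V^{(n)} = \begin{pmatrix}
\frac{\lambda}{2}(2|\psi|^2 - 1) + \frac{1}{2}|\psi|^2 & \frac{1}{2}(\lambda - 1)\psi^2 & -i(\partial_A^*\psi) & i(\partial_A\psi) \\
\frac{1}{2}(\lambda - 1)\bar\psi^2 & \frac{\lambda}{2}(2|\psi|^2 - 1) + \frac{1}{2}|\psi|^2 & -i\overline{(\partial_A\psi)} & i\overline{(\partial_A^*\psi)} \\
i\overline{(\partial_A^*\psi)} & i(\partial_A\psi) & |\psi|^2 & 0 \\
-i\overline{(\partial_A\psi)} & -i(\partial_A^*\psi) & 0 & |\psi|^2
\end{pmatrix}. \]
Here we have used the notation $\partial_A \equiv \partial_z - iA$, where $\partial_z = \partial_1 - i\partial_2$ (and the superscript $c$ has been dropped from the complex function $A$ obtained from the vector-field $A$ via (18)). The components of $V^{(n)}$ are bounded, and it follows from standard results ([RSII]) that $\tilde{\tilde{L}}^{(n)}$ is a self-adjoint operator on $\tilde{X}$, with domain
\[ D(\tilde{\tilde{L}}^{(n)}) = [H^2(\mathbb{R}^2; \mathbb{C})]^4. \]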
4. Block Decomposition

We write functions on $\mathbb{R}^2$ in polar coordinates. Precisely,
\[ \tilde{X} = [L^2(\mathbb{R}^2; \mathbb{C})]^4 = [L^2_{rad} \otimes L^2(S^1; \mathbb{C})]^4, \tag{20} \]
where $L^2_{rad} \equiv L^2(\mathbb{R}^+, r\,dr)$. Let $\rho_n : U(1) \to \operatorname{Aut}([L^2(S^1; \mathbb{C})]^4)$ be the representation whose action is given by
\[ \rho_n(e^{i\theta})(\xi, \eta, B, C)(x) = (e^{in\theta}\xi, e^{-in\theta}\eta, e^{-i\theta}B, e^{i\theta}C)(R_{-\theta}x), \]
where $R_\alpha$ is a counter-clockwise rotation in $\mathbb{R}^2$ through the angle $\alpha$. It is easily checked that the linearized operator $\tilde{\tilde{L}}^{(n)}$ commutes with $\rho_n(g)$ for any $g \in U(1)$. It follows that $\tilde{\tilde{L}}^{(n)}$ leaves invariant the eigenspaces of $d\rho_n(s)$ for any $s \in i\mathbb{R} = \operatorname{Lie}(U(1))$. The resulting block decomposition of $\tilde{\tilde{L}}^{(n)}$, which is described in this section, is essential to our analysis. In particular, the translational zero-modes each lie within a single subspace of this decomposition.

4.1. The decomposition of $L^{(n)}$. In what follows, we define, for convenience, $b(r) = \frac{n(1 - a(r))}{r}$.

Proposition 3. There is an orthogonal decomposition
\[ \tilde{X} = \bigoplus_{m \in \mathbb{Z}} \left( e^{i(m+n)\theta}L^2_{rad} \oplus e^{i(m-n)\theta}L^2_{rad} \oplus -ie^{i(m-1)\theta}L^2_{rad} \oplus ie^{i(m+1)\theta}L^2_{rad} \right), \tag{21} \]
under which the linearized operator around the vortex, $\tilde{\tilde{L}}^{(n)}$, decomposes as
\[ \tilde{\tilde{L}}^{(n)} = \bigoplus_{m \in \mathbb{Z}} \hat{L}_m^{(n)}, \]
where
\[ \hat{L}_m^{(n)} = -\Delta_r(Id) + \hat{V}_m^{(n)} \]
with
\[ \hat{V}_m^{(n)} = \frac{1}{r^2}\operatorname{diag}\{[m + n(1 - a)]^2, [m - n(1 - a)]^2, [m - 1]^2, [m + 1]^2\} + V' \]
and
\[ V' = \begin{pmatrix}
\frac{\lambda}{2}(2f^2 - 1) + \frac{1}{2}f^2 & \frac{1}{2}(\lambda - 1)f^2 & f' - bf & -[f' + bf] \\
\frac{1}{2}(\lambda - 1)f^2 & \frac{\lambda}{2}(2f^2 - 1) + \frac{1}{2}f^2 & -[f' + bf] & f' - bf \\
f' - bf & -[f' + bf] & f^2 & 0 \\
-[f' + bf] & f' - bf & 0 & f^2
\end{pmatrix}. \tag{22} \]
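Proof. The decomposition (21) of $\tilde{X}$ follows from the usual Fourier decomposition of $L^2(S^1; \mathbb{C})$, and the relation (20). An easy computation shows that $\tilde{\tilde{L}}^{(n)}$ preserves the space of vectors of the form
\[ (\xi e^{i(m+n)\theta}, \eta e^{i(m-n)\theta}, -i\alpha e^{i(m-1)\theta}, i\beta e^{i(m+1)\theta}) \tag{23} \]
and that it acts on such vectors via (22). □

It follows that $\hat{L}_m^{(n)}$ is self-adjoint on $[L^2_{rad}]^4$.

It will also be convenient to work with a rotated version of the operator $\hat{L}_m^{(n)}$,
\[ L_m^{(n)} \equiv \begin{cases} R \hat{L}_m^{(n)} R^T & m \ge 0, \\ R' \hat{L}_m^{(n)} (R')^T & m < 0, \end{cases} \]
where
\[ R = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix}, \qquad R' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix}. \]
We have
\[ L_m^{(n)} = -\Delta_r(Id) + V_m^{(n)}, \tag{24} \]
where
\[ V_m^{(n)} = \begin{pmatrix}
\frac{m^2}{r^2} + b^2 + \frac{\lambda}{2}(3f^2 - 1) & -2|m|\frac{b}{r} & -2bf & 0 \\
-2|m|\frac{b}{r} & \frac{m^2}{r^2} + b^2 + \frac{\lambda}{2}(f^2 - 1) + f^2 & 0 & -2f' \\
-2bf & 0 & \frac{m^2 + 1}{r^2} + f^2 & -2\frac{|m|}{r^2} \\
0 & -2f' & -2\frac{|m|}{r^2} & \frac{m^2 + 1}{r^2} + f^2
\end{pmatrix}. \]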
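The passage from (22) to (24) is a purely algebraic conjugation, so it can be checked numerically at sample values. The following is a minimal sketch for $m \ge 0$ only; the helper names (`v_hat`, `v_rotated`, `max_deviation`) and the sample arguments are illustrative choices, not notation from the paper, and `f`, `df`, `a` stand for arbitrary values of $f$, $f'$, $a$ at a radius `r`.

```python
import math

# Rotation R from the text (its rows are orthonormal).
S = 1.0 / math.sqrt(2.0)
R = [
    [S, S, 0.0, 0.0],
    [-S, S, 0.0, 0.0],
    [0.0, 0.0, S, S],
    [0.0, 0.0, S, -S],
]


def matmul(A, B):
    """Plain-Python product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def v_hat(m, n, lam, r, f, df, a):
    """Unrotated potential: (1/r^2) diag{...} + V', as in (22)."""
    b = n * (1.0 - a) / r
    p = 0.5 * lam * (2.0 * f * f - 1.0) + 0.5 * f * f
    q = 0.5 * (lam - 1.0) * f * f
    u = df - b * f
    v = -(df + b * f)
    V = [
        [p, q, u, v],
        [q, p, v, u],
        [u, v, f * f, 0.0],
        [v, u, 0.0, f * f],
    ]
    s = n * (1.0 - a)
    diag = [(m + s) ** 2, (m - s) ** 2, (m - 1) ** 2, (m + 1) ** 2]
    for i in range(4):
        V[i][i] += diag[i] / r ** 2
    return V


def v_rotated(m, n, lam, r, f, df, a):
    """Closed form of the rotated potential V_m^{(n)} below (24), for m >= 0."""
    b = n * (1.0 - a) / r
    k = abs(m)
    d1 = m * m / r ** 2 + b * b + 0.5 * lam * (3.0 * f * f - 1.0)
    d2 = m * m / r ** 2 + b * b + 0.5 * lam * (f * f - 1.0) + f * f
    d3 = (m * m + 1.0) / r ** 2 + f * f
    return [
        [d1, -2.0 * k * b / r, -2.0 * b * f, 0.0],
        [-2.0 * k * b / r, d2, 0.0, -2.0 * df],
        [-2.0 * b * f, 0.0, d3, -2.0 * k / r ** 2],
        [0.0, -2.0 * df, -2.0 * k / r ** 2, d3],
    ]


def max_deviation(m, n, lam, r, f, df, a):
    """Largest entrywise gap between R * V_hat * R^T and the closed form."""
    lhs = matmul(matmul(R, v_hat(m, n, lam, r, f, df, a)), transpose(R))
    rhs = v_rotated(m, n, lam, r, f, df, a)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))


if __name__ == "__main__":
    for m in (0, 1, 2, 5):
        print(m, max_deviation(m, 3, 0.7, 1.3, 0.6, 0.25, 0.4))
```

Evaluating `v_rotated` at large `r` with `f = 1`, `df = 0`, `a = 1` also illustrates why the potential tends to $\operatorname{diag}\{\lambda, 1, 1, 1\}$ at infinity.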
4.2. Properties of $L_m^{(n)}$.

Proposition 4. We have the following:
1.
\[ L_m^{(n)} = L_{-m}^{(n)}. \tag{25} \]
2.
\[ \sigma_{ess}(L_m^{(n)}) = [\min(1, \lambda), \infty). \tag{26} \]
3. For $|n| = 1$ and $|m| \ge 2$,
\[ L_m^{(n)} - L_1^{(n)} \ge 0 \text{ with no zero-eigenvalue.} \tag{27} \]

Proof. The first statement is obvious. The second statement follows in a standard way from the fact that
\[ \lim_{r \to \infty} V_m^{(n)}(r) = \operatorname{diag}\{\lambda, 1, 1, 1\}. \]
To prove the third statement, we compute
\[ \hat{L}_m^{(n)} - \hat{L}_1^{(n)} = \frac{m - 1}{r^2}\operatorname{diag}\{m + 1 + 2n(1 - a), m + 1 - 2n(1 - a), m - 1, m + 3\}, \]
which is non-negative, with no zero-eigenvalue for $m \ge 2$, $n = 1$. □
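The diagonal formula used for the third statement is a difference-of-squares computation. As a worked check, write $s = n(1 - a)$ (a shorthand introduced only for this display):

```latex
\begin{align*}
  [m+s]^2 - [1+s]^2 &= (m-1)(m+1+2s), \\
  [m-s]^2 - [1-s]^2 &= (m-1)(m+1-2s), \\
  [m-1]^2 - [1-1]^2 &= (m-1)(m-1), \\
  [m+1]^2 - [1+1]^2 &= (m-1)(m+3).
\end{align*}
% Dividing by r^2 gives the stated diagonal matrix; V' does not depend on m
% and cancels in the difference.
```

For $n = 1$, Theorem 2 gives $0 < s < 1$ on $(0, \infty)$, so for $m \ge 2$ all four entries are strictly positive, including $m + 1 - 2s > m - 1 > 0$.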
Remark 5. In light of (25), we can assume from now on that m ≥ 0. This degeneracy is a result of the complexification (19) of the space of perturbations.
4.3. Translational zero-modes. The gauge fixing (Sect. 3.3) has eliminated the zero-modes arising from gauge symmetry. The translational zero-modes remain. As written in (15), the translational zero-modes fail to satisfy the gauge condition (17). Further, they do not lie in $L^2$. A straightforward computation shows that if we adjust the vectors in (15) by gauge zero-modes given by (14) with $\gamma = -A_j$, $j = 1, 2$, we obtain
\[ T_1 = \begin{pmatrix} (\nabla_A\psi)_1 \\ (\nabla \times A)e_2 \end{pmatrix}, \qquad T_2 = \begin{pmatrix} (\nabla_A\psi)_2 \\ -(\nabla \times A)e_1 \end{pmatrix}, \]
where $e_1 = (1, 0)$ and $e_2 = (0, 1)$. $T_1$ and $T_2$ satisfy (17), and are zero-modes of the linearized operator. Note also that $T_{\pm 1}$ decay exponentially as $|x| \to \infty$, and hence lie in $L^2$.

It is easily checked that $T_1 \pm iT_2$ lie in the $m = \pm 1$ blocks for $\hat{L}_m^{(n)}$. After rotation by $R$, we have
\[ L_{\pm 1}^{(n)} T = 0, \]
where
\[ T = \left( f', bf, n\frac{a'}{r}, n\frac{a'}{r} \right). \]

5. Stability of the Fundamental Vortices

In this section we prove the first part of Theorem 1. Specifically, we show that for some $c > 0$, $L_m^{(\pm 1)} \ge c$ for $m \ne 1$, and $L_1^{(\pm 1)}|_{T^\perp} \ge c$. In light of the discussions in Sects. 3.3, 4.1, and 4.3, this will establish the stability of the ±1-vortices.
5.1. Non-negativity of $L_0^{(n)}$ and radial minimization.

Proposition 5. $L_0^{(n)} \ge 0$ for all $\lambda$.

Proof. From the expression (24) we see that $L_0^{(n)}$ breaks up:
\[ L_0^{(n)} = N_0 \oplus M_0 \tag{28} \]
(abusing notation slightly) where $M_0 = -\Delta_r(Id) + W_0$ with
\[ W_0 = \begin{pmatrix} b^2 + \frac{\lambda}{2}(3f^2 - 1) & -2bf \\ -2bf & \frac{1}{r^2} + f^2 \end{pmatrix} \]
and
\[ N_0 = \begin{pmatrix} -\Delta_r + b^2 + \frac{\lambda}{2}(f^2 - 1) + f^2 & -2f' \\ -2f' & -\Delta_r + \frac{1}{r^2} + f^2 \end{pmatrix}. \]
An easy computation shows that $M_0$ is precisely the Hessian of the radial energy, $\operatorname{Hess} E_r^{(n)}$ (see (10)). Since the n-vortex minimizes $E_r^{(n)}$, we have $M_0 \ge 0$. It remains to show $N_0 \ge 0$. We establish the stronger result, $N_0 > 0$. Note that $N_0 = G_0^* G_0$, where
\[ G_0 = \begin{pmatrix} \partial_r - f'/f & f \\ f & \partial_r + 1/r \end{pmatrix}. \]
In fact, $G_0$ has no zero-eigenvalue. To see this, we exploit some known results about the kernel of $G_0$ at $\lambda = 1$. In Sect. 6, we will show that at $\lambda = 1$, the full linearized operator is the square of a first-order differential operator, $F$: $\tilde{L}^{(n)}|_{\lambda=1} = F^* F$. The operator $F$ was analyzed in [S], where it was shown to be Fredholm with index $2|n|$. The operator $F_0 \equiv G_0|_{\lambda=1}$ is $F$ restricted to a particular invariant subspace. Thus $F_0$ is a Fredholm operator from its domain to $L^2_{rad}$. The kernels of $F$ and $F^*$ are known precisely (see [S] and Sect. 6), and it follows that $F_0$ has index zero. Now, $G_0$ is a relatively compact perturbation of $F_0$ (due to the decay of the field components; see, again, [S]), and hence $G_0$ is also Fredholm with index zero. Finally, it is a simple matter to check that $G_0^*$ has trivial kernel. If
\[ G_0^* \begin{pmatrix} \xi \\ \beta \end{pmatrix} = 0, \]
it follows that $(-\Delta_r + f^2)\beta = 0$ and hence that $\beta = 0$, and so $\xi = 0$. The relation $N_0 > 0$ follows from this, and the fact that $\sigma_{ess}(N_0) = [1, \infty)$. □
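The factorization $N_0 = G_0^* G_0$ can be checked entry by entry. The following is a sketch, assuming adjoints are taken in $L^2_{rad} = L^2(\mathbb{R}^+, r\,dr)$, so that $\partial_r^* = -\partial_r - 1/r$, and assuming $\Delta_r = \partial_r^2 + \frac{1}{r}\partial_r$ denotes the radial Laplacian, so that the ODE (11) reads $\Delta_r f = b^2 f + \frac{\lambda}{2}(f^2 - 1)f$.

```latex
\[
  G_0^* = \begin{pmatrix} -\partial_r - \frac{1}{r} - \frac{f'}{f} & f \\[1mm] f & -\partial_r \end{pmatrix},
\]
\begin{align*}
 (G_0^*G_0)_{11} &= \Bigl(-\partial_r - \tfrac{1}{r} - \tfrac{f'}{f}\Bigr)\Bigl(\partial_r - \tfrac{f'}{f}\Bigr) + f^2
   = -\Delta_r + \frac{\Delta_r f}{f} + f^2
   = -\Delta_r + b^2 + \tfrac{\lambda}{2}(f^2-1) + f^2, \\
 (G_0^*G_0)_{12} &= \Bigl(-\partial_r - \tfrac{1}{r} - \tfrac{f'}{f}\Bigr) f + f\Bigl(\partial_r + \tfrac{1}{r}\Bigr) = -2f', \\
 (G_0^*G_0)_{22} &= f^2 - \partial_r\Bigl(\partial_r + \tfrac{1}{r}\Bigr) = -\Delta_r + \frac{1}{r^2} + f^2 .
\end{align*}
```

The same adjoint also gives the kernel statement: the second row of $G_0^*(\xi, \beta)^t = 0$ reads $f\xi = \beta'$, and substituting $\xi = \beta'/f$ into the first row yields $(-\Delta_r + f^2)\beta = 0$.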
5.2. A maximum principle argument. Removing the equality in Proposition 5 requires more work. First, we establish an extension of the maximum principle to systems (see, e.g., [LM, PA] for related results). We will use this also in the proof that the translational zero-mode is the ground state of $L_1^{(n)}$ (Sect. 5.4).

Proposition 6. Let $L$ be a self-adjoint operator on $L^2(\mathbb{R}^n; \mathbb{R}^d)$ of the form $L = -\Delta(Id) + V$, where $V$ is a $d \times d$ matrix-multiplication operator with smooth entries. Suppose that $L \ge 0$ and that for $i \ne j$, $V_{ij}(x) \le 0$ for all $x$. Further, suppose $V$ is irreducible in the sense that for any splitting of the set $\{1, \dots, d\}$ into disjoint sets $S_1$ and $S_2$, there is an $i \in S_1$ and a $j \in S_2$ with $V_{ij}(x) < 0$ for all $x$. Finally, suppose that $L\xi = \eta \in L^2$ with $\eta \ge 0$ component-wise, and $\xi \not\equiv 0$. Then either
1. $\xi > 0$, or
2. $\eta \equiv 0$ and $\xi < 0$.

Proof. We write $\xi = \xi^+ - \xi^-$ with $\xi^+, \xi^- \ge 0$ component-wise, and compute
\[ 0 \le \langle \xi^-, L\xi^- \rangle = \langle \xi^-, L\xi^+ \rangle - \langle \xi^-, L\xi \rangle. \]
Since $\xi_j^+$ and $\xi_j^-$ have disjoint support, we have
\[ \text{r.h.s.} = \sum_{j \ne k} \langle \xi_j^-, V_{jk}\xi_k^+ \rangle - \langle \xi^-, \eta \rangle \le 0. \]
Thus we have
1. $0 = \langle \xi^-, L\xi^- \rangle$.
2. $0 = \langle \xi_j^-, V_{jk}\xi_k^+ \rangle$ for all $j \ne k$.
Since $L \ge 0$, the first of these implies $L\xi^- = 0$ and hence $L\xi^+ = \eta$. So if $\eta \not\equiv 0$, then $\xi^+ \not\equiv 0$. If $\eta \equiv 0$ and $\xi^+ \equiv 0$, replace $\xi$ with $-\xi$ in what follows. An application of the strong maximum principle (e.g. [GT], Thm. 8.19) to each component of the equation $L\xi^+ = \eta$ now allows us to conclude that for each $k$, either $\xi_k^+ > 0$ or $\xi_k^+ \equiv 0$. We know that for some $k$, $\xi_k^+ > 0$. Looking back at the second listed equation above, and using the irreducibility of $V$, we then see that $\xi_j^- \equiv 0$ for all $j$. Finally, we can easily rule out the possibility $\xi_k \equiv 0$ for some $k$, by looking back at the equation satisfied by $\xi_k$. Thus we have $\xi > 0$. □
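The mechanism behind Proposition 6 has a familiar finite-dimensional analogue: a symmetric matrix with non-positive off-diagonal entries that is irreducible has a ground state whose components all share one sign. The sketch below illustrates this analogue only; it is not the argument of the proof. The two-component chain, its parameters, and the function names are hypothetical choices made for the illustration.

```python
def build_operator(n_sites=5, coupling=0.3, diag_pot=(0.4, 0.1)):
    """Two-component chain: (discrete -Laplacian) x Id + V with V_12 = V_21 = -coupling."""
    dim = 2 * n_sites
    L = [[0.0] * dim for _ in range(dim)]
    for i in range(n_sites):
        for c in range(2):
            row = 2 * i + c
            L[row][row] = 2.0 + diag_pot[c]   # Dirichlet Laplacian diagonal + V_cc
            if i > 0:
                L[row][row - 2] = -1.0        # coupling to the left neighbour
            if i < n_sites - 1:
                L[row][row + 2] = -1.0        # coupling to the right neighbour
        L[2 * i][2 * i + 1] = -coupling       # off-diagonal entries of V are <= 0
        L[2 * i + 1][2 * i] = -coupling
    return L


def matvec(L, v):
    return [sum(lij * vj for lij, vj in zip(row, v)) for row in L]


def ground_state(L, iterations=2000):
    """Power iteration on (shift*Id - L); returns (lowest eigenvalue, unit eigenvector).

    The shift (largest diagonal entry + 1) is adequate for this small example;
    it is not a general-purpose choice.
    """
    dim = len(L)
    shift = max(L[i][i] for i in range(dim)) + 1.0
    v = [(-1.0) ** i * (i + 1.0) for i in range(dim)]  # deliberately mixed-sign start
    for _ in range(iterations):
        Lv = matvec(L, v)
        w = [shift * vi - lvi for vi, lvi in zip(v, Lv)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    if sum(v) < 0:                                     # fix the overall sign
        v = [-x for x in v]
    Lv = matvec(L, v)
    mu = sum(vi * lvi for vi, lvi in zip(v, Lv))       # Rayleigh quotient
    return mu, v


def residual(L, mu, v):
    """Max-norm of L v - mu v."""
    Lv = matvec(L, v)
    return max(abs(lvi - mu * vi) for lvi, vi in zip(Lv, v))


if __name__ == "__main__":
    L = build_operator()
    mu, v = ground_state(L)
    print("lowest eigenvalue:", mu)
    print("all components positive:", all(x > 0 for x in v))
    print("residual:", residual(L, mu, v))
```

Starting from a mixed-sign vector makes the point that positivity of the limit comes from the structure of the matrix, not from the initial guess.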
5.3. Positivity of $L_0^{(n)}$. Now we apply Proposition 6 to show $M_0 > 0$. The trick here is to find a function $\xi$ which satisfies $M_0\xi \ge 0$. This allows us to rule out the existence of a zero-eigenvector, which would be positive by Proposition 6. To obtain such a $\xi$, we differentiate the vortex with respect to the parameter $\lambda$. Specifically, differentiation of the Ginzburg–Landau equations with respect to $\lambda$ results in
\[ M_0\xi = \eta, \tag{29} \]
where
\[ \xi = \begin{pmatrix} \partial_\lambda f \\ n\partial_\lambda a / r \end{pmatrix} \quad \text{and} \quad \eta = \begin{pmatrix} \frac{1}{2}(1 - f^2)f \\ 0 \end{pmatrix} \ge 0. \]
We can now establish

Proposition 7. For all $\lambda$, $L_0^{(n)} \ge c > 0$.

Proof. We have already shown in the proof of Proposition 5 that $N_0 > 0$ and $M_0 \ge 0$. Hence, due to (28) and (26), it suffices to show that $\operatorname{Null}(M_0) = \{0\}$. Suppose $M_0\zeta = 0$, $\zeta \not\equiv 0$. Proposition 6 then implies $\zeta > 0$ (or else take $-\zeta$). Now
\[ 0 = \langle M_0\zeta, \xi \rangle = \langle \zeta, M_0\xi \rangle = \langle \zeta, \eta \rangle > 0 \]
gives a contradiction. □
Remark 6. Proposition 6 applied to Eq. (29) also gives $\xi > 0$. That is, the vortex profiles increase monotonically with $\lambda$. This can be used to show that the rescaled vortex $(f_n(r/\sqrt{\lambda}), a_n(r/\sqrt{\lambda}))$ converges as $\lambda \to \infty$ to $(f^*, 0)$, where $f^*$ is the profile of the n-vortex solution of the ordinary GL equation:
\[ -\Delta_r f^* + n^2 f^*/r^2 + ({f^*}^2 - 1)f^* = 0. \]
This result was established by different means in [ABG].

5.4. Positivity of $L_1^{(\pm 1)}$.

Proposition 8. $L_1^{(\pm 1)} \ge 0$ with non-degenerate zero-eigenvalue given by $T$.

Proof. Let $\mu = \inf \operatorname{spec} L_1^{(\pm 1)} \le 0$, which is an eigenvalue by (26). Suppose $L_1^{(\pm 1)} S = \mu S$. Applying Proposition 6 to $L_1^{(\pm 1)} - \mu$ (note that $V_1^{(\pm 1)}$ satisfies the irreducibility requirement) gives $S > 0$ (or $S < 0$). Further, $\mu$ is non-degenerate, as if $\mu$ were degenerate, we would have two strictly positive eigenfunctions which are orthogonal, an impossibility. Now if $\mu < 0$, we have $\langle S, T \rangle = 0$, which is also impossible. Thus $S$ is a multiple of $T$, and $\mu = 0$. □

5.5. Completion of stability proof for $n = \pm 1$. We are now in a position to complete the proof of the first statement of Theorem 1. By Proposition 7, $L_0^{(\pm 1)} \ge c > 0$. By Proposition 8 and (26), $L_1^{(\pm 1)}|_{T^\perp} \ge \tilde{c} > 0$. Finally, by (27), $L_m^{(\pm 1)} \ge c' > 0$ for $|m| \ge 2$. It follows from Proposition 3 that $\tilde{L}^{(n)} \ge c > 0$ on the subspace of $X$ orthogonal to the translational zero-modes. By the discussion of Sect. 3.3, this gives Theorem 1 for $n = \pm 1$. □

6. The Critical Case, $\lambda = 1$

In order to prove the remainder of Theorem 1, we exploit some results from the $\lambda = 1$ case.
6.1. The first-order equations. Following [B], we use an integration by parts to rewrite the energy (1) as
\[ E(\psi, A) = \frac{1}{2} \int_{\mathbb{R}^2} \left\{ |\partial_A^*\psi|^2 + \left[ \nabla \times A + \frac{1}{2}(|\psi|^2 - 1) \right]^2 + \frac{1}{4}(\lambda - 1)(|\psi|^2 - 1)^2 \right\} + \pi\deg(\psi) \tag{30} \]
(recall, since we work in dimension two, $\nabla \times A$ is a scalar) where $\deg(\psi)$ is the topological degree of $\psi$, defined in the introduction. We assume, without loss of generality, that $\deg(\psi) \ge 0$. Clearly, when $\lambda = 1$, a solution of the first-order equations
\[ \partial_A^*\psi = 0, \tag{31} \]
\[ \nabla \times A + \frac{1}{2}(|\psi|^2 - 1) = 0 \tag{32} \]
minimizes the energy within a fixed topological sector, $\deg(\psi) = n$, and hence solves GL. Note that we have identified the vector-field $A$ with a complex field as in (18). The n-vortices (9) are solutions of these equations (when $\lambda = 1$). Specifically,
\[ n\frac{a'}{r} = \frac{1}{2}(1 - f^2) \tag{33} \]
and
\[ f' = n\frac{(1 - a)f}{r}. \tag{34} \]
In fact, it is shown in [T2] that for $\lambda = 1$, any solution of the variational equations solves the first-order equations (31)–(32). Beginning from expression (30) for the energy, the variational equations (previously written as (2)–(3)) can be written as
\[ \partial_A[\partial_A^*\psi] + \psi\left[\nabla \times A + \frac{1}{2}(|\psi|^2 - 1)\right] + \frac{1}{2}(\lambda - 1)(|\psi|^2 - 1)\psi = 0, \tag{35} \]
\[ i\bar\psi[\partial_A^*\psi] - i\partial_{\bar z}\left[\nabla \times A + \frac{1}{2}(|\psi|^2 - 1)\right] = 0 \tag{36} \]
(here $\partial_A^* \equiv -\partial_{\bar z} + i\bar{A}$ is the adjoint of $\partial_A$).

6.2. First-order linearized operator. We show that the linearized operator at $\lambda = 1$ is the square of the linearized operator for the first-order equations.

Linearizing the first-order equations (31)–(32) about a solution, $(\psi, A)$ (of the first-order equations) results in the following equations for the perturbation, $\alpha \equiv (\xi, B)$:
\[ \partial_A^*\xi + i\psi\bar{B} = 0, \]
\[ \nabla \times B + \operatorname{Re}(\bar\psi\xi) = 0. \]
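Now using $i\partial_{\bar z}B = \nabla \times B + i(\nabla \cdot B)$, and adding in the gauge condition (17), we can rewrite this as
\[ F\alpha = 0, \tag{37} \]
where
\[ F = \begin{pmatrix} \partial_A^* & i\psi(\bar{\ \cdot\ }) \\ \psi(\bar{\ \cdot\ }) & i\partial_{\bar z} \end{pmatrix}. \]
If we linearize the full (second order) variational equations (in the form (35)–(36)) around $(\psi, A)$, we obtain
\[ \partial_A[\partial_A^*\xi + i\bar{B}\psi] + i\bar{B}[\partial_A^*\psi] + \psi[\nabla \times B + \operatorname{Re}(\bar\psi\xi)] + \xi\left[\nabla \times A + \frac{1}{2}(|\psi|^2 - 1)\right] + \frac{1}{2}(\lambda - 1)[(|\psi|^2 - 1)\xi + 2\psi\operatorname{Re}(\bar\psi\xi)] = 0 \]
and
\[ i\bar\psi[\partial_A^*\xi + i\bar{B}\psi] + i\bar\xi[\partial_A^*\psi] - i\partial_{\bar z}[\nabla \times B + \operatorname{Re}(\bar\psi\xi)] = 0. \]

Proposition 9. When $\lambda = 1$, these linearized equations can also be written $F^*F\alpha = 0$.

Proof. This is a simple computation using the fact that the first-order equations (31)–(32) hold. □

This relation holds also on the level of the blocks. A straightforward computation gives
\[ L_m^{(n)}|_{\lambda=1} = F_m^* F_m, \]
where
\[ F_m = \begin{pmatrix}
\partial_r - b & \frac{m}{r} & f & 0 \\
\frac{m}{r} & \partial_r - b & 0 & f \\
f & 0 & \partial_r + 1/r & -\frac{m}{r} \\
0 & f & -\frac{m}{r} & \partial_r + 1/r
\end{pmatrix}. \]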
6.3. Zero-modes for $\lambda = 1$. It was predicted in [W] (and proved rigorously in [S]) that for $\lambda = 1$, the linearized operator around any degree-$n$ solution of the first-order equations has a $2|n|$-dimensional kernel (modulo gauge transformations). This kernel arises because the Taubes solutions form a $2|n|$-parameter family, and all have the same energy. The zero-eigenvalues are identified in [B], and we describe them here. Let $\chi_m$ be the unique solution of
\[ \left(-\Delta_r + \frac{m^2}{r^2} + f^2\right)\chi_m = 0 \]
on $(0, \infty)$ with
\[ \chi_m \sim r^{-m} \quad \text{as} \quad r \to 0 \]
and
\[ \chi_m \to 0 \quad \text{as} \quad r \to \infty \]
for $m = 1, 2, \dots, n$. Then it is easy to check that when $\lambda = 1$,
\[ F_m W_m = 0, \tag{38} \]
where
\[ W_m = \begin{pmatrix} f\chi_m \\ f\chi_m \\ -(\chi_m' + m\chi_m/r) \\ -(\chi_m' + m\chi_m/r) \end{pmatrix}. \]
We remark that
\[ \chi_1 = \frac{1 - a}{r} \]
and it is easily verified that for $\lambda = 1$, $W_1 = \frac{1}{n}T$ gives the translational zero-modes.

7. The (In)stability Proof for $|n| \ge 2$

Here we complete the proof of Theorem 1.

The idea is to decompose $L_m^{(n)}$ into a sum of two terms, each of which has the same (translational) zero-mode (for $m = 1$) as $L_m^{(n)}$. One term is manifestly positive, and the other satisfies restrictions of Perron-Frobenius theory. We begin by modifying $F_m$, and defining, for any $\lambda$,
\[ \tilde{F}_m \equiv \begin{pmatrix}
(\partial_r - \frac{f'}{f}) \cdot q & \frac{m}{r} & f & 0 \\
\frac{m}{r}q & \partial_r - \frac{f'}{f} & 0 & f \\
fq & 0 & \partial_r + 1/r & -\frac{m}{r} \\
0 & f & -\frac{m}{r} & \partial_r + 1/r
\end{pmatrix}, \]
where we have defined
\[ q(r) \equiv \frac{n(1 - a)f}{rf'} \tag{39} \]
and $\partial_r \cdot q$ denotes an operator composition. By (34), we have $q \equiv 1$ for $\lambda = 1$. We also set, for $m = 1, \dots, n$,
\[ \tilde{W}_m = \begin{pmatrix} q^{-1}f\chi_m \\ f\chi_m \\ -(\chi_m' + m\frac{\chi_m}{r}) \\ -(\chi_m' + m\frac{\chi_m}{r}) \end{pmatrix}. \]
Now $\tilde{W}_m$ has the following properties:
1. $\tilde{W}_1$ is the translational zero-mode $\frac{1}{n}T$ for all $\lambda$.
2. When λ = 1, W˜ m = Wm , m = 1, . . . , n, give the 2|n| zero-modes (38) of the linearized operator. These W˜ m were chosen in [B] as candidates for directions of energy decrease (for |m| ≥ 2) when λ > 1. Intuitively, we think of W˜ m as a perturbation that tends to break the n-vortex into separate vortices of lower degree. Now, F˜m was designed to have the following properties: 1. F˜m = Fm when λ = 1 (this is clear). 2. F˜m W˜ m = 0 for all m and λ (this is easily checked). A straightforward computation gives ˜∗ ˜ L(n) m = Fm Fm + J Mm ,
(40)
where J = diag{1, 0, 0, 0} and Mm = lm − qlm q + (λ − q 2 )f 2 with m2 λ + b2 + (f 2 − 1). 2 r 2 By construction, when m = 1, the second term in the decomposition (40) must have a zero-mode corresponding to the original translational zero-mode. In fact, one can easily check that M1 f 0 = 0. lm = −1r +
Proposition 10. For |n| ≥ 2, $M_1$ has a non-degenerate zero-eigenvalue corresponding to $f'$, and
\[ M_1 \ge 0 \ \text{ for } \lambda < 1, \qquad M_1 \le 0 \ \text{ for } \lambda > 1 \]
on $L^2_{rad}$.

Proof. We recall inequality (13), which implies that for λ < 1, q < 1, and for λ > 1, q > 1. The operator $M_1$ is of the form
\[ M_1 = (1 - q^2)(-\Delta_r) + \text{first order} + \text{multiplication}. \tag{41} \]
One can show that $M_1$ is bounded from below (resp. above) for λ < 1 (resp. λ > 1). We stick with the case λ < 1 for concreteness. Suppose $M_1 \eta = \mu\eta$ with $\mu = \inf \operatorname{spec} M_1 \le 0$. Applying the maximum principle (e.g. Proposition 6 for d = 1) to (41), we conclude that η > 0. If µ < 0, we have $\langle \eta, f' \rangle = 0$, a contradiction. Thus µ = 0, and is non-degenerate by a similar argument. □

We also have

Lemma 1. For m ≥ 2, $M_m - M_1$ is non-negative for λ < 1, non-positive for λ > 1, and has no zero-eigenvalue.

Proof. This follows from the equation
\[ M_m - M_1 = (1 - q^2)\,\frac{m^2 - 1}{r^2}. \]
□
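The identity used in the proof of Lemma 1 can be checked in a few lines. The following is a short verification, assuming only the definitions $M_m = l_m - q\,l_m\,q + (\lambda - q^2)f^2$ and $l_m = -\Delta_r + m^2/r^2 + b^2 + \frac{\lambda}{2}(f^2-1)$ that accompany the decomposition (40), and that q acts as multiplication by a function of r (so it commutes with multiplication by $1/r^2$):

```latex
\begin{align*}
l_m - l_1 &= \frac{m^2 - 1}{r^2}, \\
M_m - M_1 &= (l_m - l_1) - q\,(l_m - l_1)\,q \\
          &= \frac{m^2 - 1}{r^2} - q^2\,\frac{m^2 - 1}{r^2}
           = (1 - q^2)\,\frac{m^2 - 1}{r^2}.
\end{align*}
```

The difference is therefore a pure multiplication operator. With q < 1 for λ < 1 and q > 1 for λ > 1, its sign for m ≥ 2 is the one stated in the lemma.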
Completion of the proof of Theorem 1. Suppose now λ < 1. Since $\tilde F_m^{*}\tilde F_m$ is manifestly non-negative, and $M_m > M_1$ for m ≥ 2, we have $L_m^{(n)} \ge 0$ for m ≥ 1 (with only the translational 0-mode). Combined with (26) and Propositions 7 and 3, this gives stability of the n-vortex for λ < 1. Now suppose λ > 1. By (40), Proposition 10 and Lemma 1, we have for m = 2, . . . , n,
\[ \langle \tilde W_m, L_m^{(n)} \tilde W_m \rangle < 0. \]
We remark that $\tilde W_m$ corresponds to an element of the un-complexified space X, and so $L^{(n)}$ has negative eigenvalues. This establishes the instability of the n-vortex for |n| ≥ 2, λ > 1, and completes the proof of Theorem 1. □

8. Appendix: Vortex Solutions are Radial Minimizers

Proposition 11. For λ ≥ 2n², a solution of Eqs. (11)–(12) locally minimizes $E_r^{(n)}$.

Proof. It suffices to show $M_0 = \operatorname{Hess} E_r^{(n)} > 0$ (see Sect. 5.1). We write $M_0 = L_0 + Z_0$, where $L_0 = \operatorname{diag}\{l, -\Delta_r\}$ with $l = -\Delta_r + b^2 + \frac{\lambda}{2}(f^2 - 1)$ and
\[ Z_0 = \begin{pmatrix} 2\lambda f^2 & -2bf \\ -2bf & \frac{1}{r^2} + f^2 \end{pmatrix}. \]
We note that lf = 0 (one of the GL equations). It follows from the fact that f > 0 and a Perron-Frobenius type argument (see [OS1]) that l ≥ 0 with no zero-eigenvalue. It suffices to show $Z_0 \ge 0$. Clearly $\operatorname{tr}(Z_0) > 0$, and
\[ \det(Z_0) = 2\lambda f^4 + \frac{2f^2}{r^2}\left[\lambda - 2n^2(1-a)^2\right] \]
is strictly positive for λ ≥ 2n². □
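The determinant formula in the proof of Proposition 11 can be checked directly from the entries of $Z_0$. The following is a short verification, assuming $b = n(1-a)/r$ (this identification is inferred from the stated result and is not spelled out in this section) and $0 \le a \le 1$:

```latex
\begin{align*}
\det Z_0 &= 2\lambda f^2\left(\frac{1}{r^2} + f^2\right) - (2bf)^2 \\
         &= 2\lambda f^4 + \frac{2f^2}{r^2}\left[\lambda - 2b^2 r^2\right] \\
         &= 2\lambda f^4 + \frac{2f^2}{r^2}\left[\lambda - 2n^2(1-a)^2\right].
\end{align*}
```

Under the assumption $(1-a)^2 \le 1$, the bracket is non-negative whenever λ ≥ 2n², and the term $2\lambda f^4$ is strictly positive wherever f > 0.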
Acknowledgements. The first author would like to thank the Courant Institute for its hospitality during part of the preparation of this paper, and especially J. Shatah for some helpful discussions. Part of this work is toward fulfillment of the requirements of the first author’s PhD at the University of Toronto. The second author thanks Yu. N. Ovchinnikov for many fruitful discussions. The authors would also like to thank the referee for helpful remarks.
References
[ABG] Almeida, L., Bethuel, F., Guo, Y.: A remark on the instability of symmetric vortices with large coupling constant. Commun. Pure Appl. Math. 50, 1295–1300 (1997)
[ABGi] Alama, S., Bronsard, L., Giorgi, T.: Uniqueness of symmetric vortex solutions in the Ginzburg–Landau model of superconductivity. Preprint (1998)
[AD] Androulakis, G., Dostoglou, S.: On the stability of monopole solutions. Nonlinearity 11, 377–408 (1998)
[BC] Berger, M.S., Chen, Y.Y.: Symmetric vortices for the nonlinear Ginzburg–Landau equations of superconductivity, and the nonlinear desingularization phenomenon. J. Funct. Anal. 82, 259–295 (1989)
[B] Bogomol’nyi, E.B.: The stability of classical solutions. Yad. Fiz. 24, 861–870 (1976)
[BGP] Boutet de Monvel–Berthier, A., Georgescu, V., Purice, R.: A boundary value problem related to the Ginzburg–Landau model. Commun. Math. Phys. 142, 1–23 (1991)
[DS] Demoulini, S., Stuart, D.: Gradient flow of the superconducting Ginzburg–Landau functional on the plane. Commun. Anal. Geom. 5, no. 1, 121–198 (1997)
[GT] Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer-Verlag, 1977
[G1] Gustafson, S.: Symmetric solutions of Ginzburg–Landau equations in all dimensions. Intern. Math. Res. Notices No. 16, 807–816 (1997)
[G2] Gustafson, S.: Dynamic stability of magnetic vortices. In preparation.
[JT] Jaffe, A., Taubes, C.: Vortices and Monopoles. Boston: Birkhäuser, 1980
[JR] Jacobs, L., Rebbi, C.: Interaction of superconducting vortices. Phys. Rev. B19, 4486–4494 (1979)
[LL] Lieb, E.H., Loss, M.: Symmetry of the Ginzburg–Landau Minimizer in a Disc. Math. Res. Lett. 1, 701–715 (1994)
[LM] Lopez-Gomez, J., Molina-Meyer, M.: The maximum principle for cooperative weakly coupled elliptic systems and some applications. Diff. Int. Eqns. 7, no. 2, 383–398 (1994)
[M] Mironescu, P.: On the stability of radial solutions of the Ginzburg–Landau equation. J. Funct. Anal. 130, 334–344 (1995)
[OS1] Ovchinnikov, Y., Sigal, I.M.: Ginzburg–Landau equation I: Static vortices. In: Partial Differential Equations and their Applications, Greiner et al., eds. Providence, RI: AMS, 1997, pp. 199–220
[P] Plohr, B.: The existence, regularity, and behaviour at infinity of isotropic solutions of classical gauge field theories. Princeton thesis
[PA] Pao, C.V.: Nonlinear elliptic systems in unbounded domains. Nonlinear Analysis: Theory, Methods, and Applications 22, No. 11, 1391–1407 (1994)
[RSII] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
[RSIV] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol IV: Analysis of Operators. New York: Academic Press, 1978
[S] Stuart, D.: Dynamics of Abelian Higgs vortices in the near Bogomolny regime. Commun. Math. Phys. 159, 51–91 (1994)
[S2] Stuart, D.: Periodic solutions of the Abelian Higgs model and rigid rotation of vortices. GAFA 9, 568–595 (1999)
[T1] Taubes, C.: Arbitrary n-vortex solutions to the first order Ginzburg–Landau equations. Commun. Math. Phys. 72, 277–292 (1980)
[T2] Taubes, C.: On the equivalence of the first and second order equations for gauge theories. Commun. Math. Phys. 75, 207–227 (1980)
[W] Weinberg, E.: Multivortex solutions of the Ginzburg–Landau equations. Phys. Rev. D 19, 3008–3012 (1979)

Communicated by A. Jaffe
Commun. Math. Phys. 212, 277 – 296 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Stochastic Stability for Contracting Lorenz Maps and Flows∗ R. J. Metzger Instituto de Matemática y Ciencias Afines, IMCA, UNI. Jr. Ancash 536, Casa de las Trece Monedas, Lima 1, Perú. E-mail:
[email protected] Received: 24 February 1999 / Accepted: 7 January 2000
Abstract: In a previous work [M], we proved the existence of absolutely continuous invariant measures for contracting Lorenz-like maps, and constructed Sinai–Ruelle–Bowen measures for the flows that generate them. Here, we prove stochastic stability for such one-dimensional maps and use this result to prove that the corresponding flows generating these maps are stochastically stable under small diffusion-type perturbations, even though, as shown by Rovella [Ro], they are persistent only in a measure theoretical sense in a parameter space. For the one-dimensional maps we also prove strong stochastic stability in the sense of Baladi and Viana [BV].

1. Introduction

Lorenz flows are related to the system numerically studied in [Lo] as a truncation of a Navier-Stokes equation. Guckenheimer and Williams [GW] introduced a geometric model called the expanding Lorenz attractor, in which it was supposed that the eigenvalues λ2 < λ1 < 0 < λ3 at the singularity of the flow satisfy the expanding condition λ1 + λ3 > 0. In [Ro], the expanding condition is replaced by the contracting condition λ1 + λ3 < 0. The general assumptions used to construct the geometric models also permit the reduction of the 3-dimensional problem, first to a 2-dimensional Poincaré section and then to a one-dimensional map. These maps are also called Lorenz-like. In a previous work we proved the existence and uniqueness of an ergodic absolutely continuous invariant measure (a.c.i.m.) for certain one-dimensional Lorenz-like maps (see [M]). In the same work we related this result to the case of flows and constructed an SRB measure for them. Since the a.c.i.m. found for the one-dimensional case is unique, the SRB measure constructed for the flow is also unique. SRB measures are related to the statistical properties of a system.
On the other hand, stochastic stability means, in a general sense, that the statistical properties are persistent under small random perturbations. It can be stated as follows. Consider the family of measures $P^\varepsilon(t, x, \cdot)$ on M given for every x ∈ M, for t ∈ ℝ or t ∈ ℤ⁺, and for ε > 0 small enough, and define Markov chains $x_t^\varepsilon$, t ∈ ℝ, in the following way: if $x_t^\varepsilon = x$ then $x_{t+\tau}^\varepsilon$ has probability $P^\varepsilon(\tau, x, \Gamma)$ of being in Γ. The Markov chain $x_t^\varepsilon$ for t ∈ ℝ is called a small random perturbation of a flow $f^t$ if for every continuous function h on M, we have
\[ \lim_{\varepsilon \to 0} \left| \int_M P^\varepsilon(t, x, dy)\,h(y) - h(f^t(x)) \right| = 0. \]
Similarly, the Markov chain $x_n^\varepsilon$ for n ∈ ℤ⁺ is called a small random perturbation of a map f if for every continuous function h on M, we have
\[ \lim_{\varepsilon \to 0} \left| \int_M P^\varepsilon(n, x, dy)\,h(y) - h(f^n(x)) \right| = 0. \]
We say that $\nu^\varepsilon$ on M is an invariant measure for the Markov chain $x_t^\varepsilon$ if for all Borel sets Γ and any τ > 0,
\[ \int_M \nu^\varepsilon(dx)\,P^\varepsilon(\tau, x, \Gamma) = \nu^\varepsilon(\Gamma). \]

∗ This work was partially supported by CNPq-Brazil, during a stay at IMPA.
Stochastic stability concerns the convergence of these measures. Under general hypotheses, weak limits of invariant measures $\nu^\varepsilon$ as ε → 0 are invariant under the flow (or the map, depending on the case). So the question is whether all limits are equal. At least for the case when there is only a finite number of ergodic attractors (that is, when only a finite number of SRB measures exist), we say that a system is stochastically stable if all these limits are the same, and in this case the common limit ν is in the convex hull of the SRB measures (cf. [Ki2, Theorem II.5.5]). For the Lorenz-like maps, we prove their stochastic stability in Theorem A, in the general setting of [KK], for the a.c.i.m. obtained in [M]. This case will help us to prove stochastic stability for flows under small diffusion-type random perturbations in Theorem B, even though, as proved in [Ro], they are not robust in parameter space but only persistent in a measure theoretical sense: the contracting Lorenz-like attractors only exist with positive Lebesgue probability in parameter space. Perturbations of diffusion type are often introduced when we try to model the actual behavior of systems; they account for Brownian motion or random collisions of particles in a medium, see [Y]. Finally, for the case of maps, it is common to consider stochastic stability for local perturbations, i.e. when the measures $P^\varepsilon$ have compact support. Actually, for the contracting Lorenz-like maps we prove more (Theorem C): they are strongly stochastically stable in the sense of [BV]. Let us remark that the case when the perturbations are not local helps us to prove stochastic stability for flows, where the diffusion type perturbations are considered. For the stochastic stability of the Lorenz flow, it is important that there exists only one ergodic attractor because that allows us to reduce the problem to a compact neighborhood of the attractor and to Markov chains that remain in it, as in [Ki2].
The systems commonly considered have finitely many SRB measures µ1, . . . , µN, and their basins of attraction cover Lebesgue almost all of the phase space M. Also, it seems that finding SRB measures is a necessary condition to show stochastic stability. A global point of view on this subject is the Palis conjecture: every dynamical system can be approximated by another having only finitely many attractors, supporting physical measures that describe the time average of Lebesgue almost all points, and, moreover, the statistical properties of these measures are stable under small random perturbations, see [Pa,Vi2]. It is in this spirit that the present work was done. Concerning SRB measures, we were much inspired by the works of Sinai, Ruelle, Bowen and Kifer [Si, Ru, BR, Ki1, Ki2], and more recently [Vi1].

2. Lorenz-Like Maps and Flows

In this section we recall the strange attractor first discovered by Lorenz [Lo], as a truncation of a Navier-Stokes equation. Actually, we will be dealing with the geometric model introduced by Guckenheimer and Williams in [GW], called the expanding Lorenz attractor. This is a family of $C^r(\mathbb{R}^3)$ vector fields, each linear in a neighborhood of the origin containing the cube {(x, y, z) ∈ ℝ³ : |x|, |y|, |z| ≤ 1}, with eigenvalues λ1, λ2, λ3 satisfying λ2 < λ1 < 0 < λ3 and λ1 + λ3 > 0, and with both trajectories of the unstable manifold intersecting the top of the cube, as in Fig. 1. So if we call U the union of the cube with a neighborhood of the unstable manifold, there exists an attractor $\Lambda = \cap_{t \ge 0} X_t(U)$, where $X_t$ is the flow of the vector field.
Fig. 1. The Lorenz flow (figure labels: Q, Σ)
The contracting Lorenz attractor arises in a similar way if we replace the expanding condition λ1 + λ3 > 0 by the contracting condition λ1 + λ3 < 0, see [Ro]. By construction, the top of the cube is a cross section Q for the flow. More explicitly, there exists a curve Σ that we can assume to be the intersection of Q with the plane {x = 0}, so there exists a first return map (a Poincaré map) of the form
\[ P : Q \setminus \Sigma \to Q, \qquad (x, y) \mapsto P(x, y) = (f(x), g(x, y)). \]
This Poincaré map reduces, in a wide sense, the study of the dynamics of the Lorenz attractor to the study of the map P. Moreover, the form of this map, which says that the leaves with x = const are mapped to leaves with x = f(const), allows another simplification if we project along the stable leaves, see [Ro]. In other words, studying the one-dimensional map defined by f gives a great amount of information on the flow that generates it. The one-dimensional maps constructed in this way are called Lorenz-like maps.

Let I ⊂ [−1, 1] be a compact interval and f : I → I be a map such that f(I) ⊂ I with a discontinuity at the origin. Set $c_{\pm k} = \lim_{x \to 0^\pm} f^k(x)$ for k ≥ 0. We will require f to satisfy conditions A0-A3 below.
A0) Outside the origin f is of class $C^3$ with negative Schwarzian derivative, and also satisfies $K_2|x|^{s-1} \le f'(x) \le K_1|x|^{s-1}$ for some constants $K_1$, $K_2$ and s with s > 1.
A1) $(f^n)'(c_{\pm 1}) > \lambda_c^n$, for some $\lambda_c > 1$, and for n ≥ 1.
A2) $|f^{n-1}(c_{\pm 1})| > e^{-\alpha n}$ for some α small enough, and all n ≥ 1.
A3) For any interval J ⊂ I there exists a number n(J) > 0 such that $I_* \subset f^n(J)$ (f is topologically mixing on $I_* = [c_{+1}, c_{-1}]$).

Sinai–Ruelle–Bowen measures, or physical measures, are those measures for which the Birkhoff averages converge to a constant on a set of positive Lebesgue measure. More precisely: if f : M → M is a transformation on a manifold M, we call an f-invariant measure µ an SRB measure if there exists a positive Lebesgue measure set B(µ) of points x ∈ M such that
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \varphi(f^i(x)) = \int_M \varphi\,d\mu \quad \text{for every } \varphi \in C^0(M, \mathbb{R}), \]
and the set B(µ) is called the (ergodic) basin of attraction of µ. For a flow $f^t : M \to M$ the definition is
\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T \varphi(f^t(x))\,dt = \int_M \varphi\,d\mu \quad \text{for every } \varphi \in C^0(M, \mathbb{R}). \]
It is clear from our definitions that if µ is an absolutely continuous invariant measure for f and ergodic then it is an SRB measure. In [M] we have shown the following:

Theorem 1. Under Conditions A0–A3, f admits an absolutely continuous invariant probability measure. This measure is unique and ergodic.

This theorem implies also that there exists a unique SRB measure for the original Lorenz flow, see [Vi1,M].

3. General Random Perturbations

Consider the family of measures $Q^\varepsilon(x, \cdot)$ on I given for every x ∈ I and ε > 0 small enough. Define the Markov chain $x_n^\varepsilon$ in the following way: if $x_n^\varepsilon = x$ then $x_{n+1}^\varepsilon$ has probability $Q^\varepsilon(fx, \Gamma)$ of being in Γ. The Markov chain $x_n^\varepsilon$ is called a small random perturbation of f if for every function h continuous on I, we have
\[ \lim_{\varepsilon \to 0} \left| \int_I Q^\varepsilon(x, dy)\,h(y) - h(x) \right| = 0. \]
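The defining limit for a small random perturbation can be observed numerically for a concrete kernel. The following is a minimal sketch, assuming a hypothetical uniform density $q_x^\varepsilon(y) = 1/(2\varepsilon)$ on $[x - \varepsilon, x + \varepsilon]$ (this is an illustrative choice, not a kernel taken from [KK]) and the test function h(y) = cos(πy); the names `smoothed_value` and `perturbation_error` are introduced here for illustration only.

```python
import math


def smoothed_value(h, x, eps, samples=2000):
    """Approximate the integral of Q^eps(x, dy) h(y) for the (assumed) uniform
    kernel q_x^eps(y) = 1/(2*eps) on [x - eps, x + eps], via the midpoint rule."""
    total = 0.0
    for k in range(samples):
        u = -1.0 + (2.0 * k + 1.0) / samples  # midpoints of a uniform grid on [-1, 1]
        total += h(x + eps * u)
    return total / samples


def perturbation_error(h, x, eps):
    """| integral of Q^eps(x, dy) h(y) - h(x) |, the quantity that must tend to 0."""
    return abs(smoothed_value(h, x, eps) - h(x))


def h(y):
    """A continuous test function on [-1, 1]."""
    return math.cos(math.pi * y)


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, perturbation_error(h, 0.3, eps))
```

For this kernel the error shrinks as ε decreases, which is the behaviour required by the definition. The point x = 0.3 is chosen away from the end points of the interval so that boundary effects play no role in the sketch.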
We say that $\mu^\varepsilon$ on I is an invariant measure for the Markov chain $x_n^\varepsilon$ if for all Borel sets Γ we have
\[ \int_I \mu^\varepsilon(dx)\,P^\varepsilon(x, \Gamma) = \mu^\varepsilon(\Gamma), \]
where $P^\varepsilon(x, \Gamma) = Q^\varepsilon(fx, \Gamma)$, and we will consider the family $Q_x^\varepsilon(\cdot)$ to be of the form
\[ Q_x^\varepsilon(\Gamma) = \int_\Gamma q_x^\varepsilon(y)\,dy, \tag{1} \]
and we impose some restrictions on the density $q_x^\varepsilon$. These conditions are Assumption A in [KK], namely:
1. Transition probabilities of Markov chains $x_n^\varepsilon$ have the form (1).
2. There exist constants α < 1, C > 0 and a family of non-negative functions $\{r_x(\xi),\ x \in I,\ \xi \in \mathbb{R}\}$ such that
\[ q_x^\varepsilon(y) \le C\varepsilon^{-1} e^{-\frac{\alpha}{\varepsilon}\operatorname{dist}(x,y)} \quad \text{for all } x, y \in I, \tag{2} \]
where
\[ \operatorname{dist}(x, y) = \min(|y - x|, |y - x + 2|, |y - x - 2|), \tag{3} \]
and $q_x^\varepsilon(y) \le (1 + \varepsilon^\alpha)\,\varepsilon^{-1}\, r_x\!\left(\frac{\sigma(x,y)}{\varepsilon}\right)$, provided that $\operatorname{dist}(x, y) \le \varepsilon^{1-\alpha}$, where σ(x, y) equals one of the numbers (y − x), (y − x + 2), or (y − x − 2) so that |σ(x, y)| = dist(x, y). Definition (3) is used mainly because we are considering the interval [−1, 1] with identification of end points.¹
3. The functions $r_x(\xi)$, x ∈ I, ξ ∈ ℝ, satisfy:
– $\int_{\mathbb{R}} r_x(\xi)\,d\xi = 1$.
– $r_x(\xi) \le Ce^{-\alpha|\xi|}$ for α, C > 0 independent of x and ξ.
– There exists C > 0 such that if $V_x^+ = \{\xi : r_x(\xi) > 0\}$ and $\partial V_x^+(\delta)$ denotes the δ-neighborhood in ℝ of the boundary $\partial V_x^+$ of $V_x^+$ then
\[ \int_{\partial V_x^+(\delta)} r_x(\xi)\,d\xi \le C\delta \]
and
\[ r_x(\xi) \le r_y(\eta) + C\rho + \chi_{\partial V_x^+(C\rho)}(\xi)\,r_x(\xi), \tag{4} \]
where $\rho = \rho((x, \xi), (y, \eta)) = \operatorname{dist}(x, y) + \|\xi - \pi_{yx}\eta\|$, and $\pi_{yx}$ is the parallel transport.

These assumptions imply that instead of taking into account whole Markov chains we can work only with Markov chains that are δ-pseudo-orbits. This is shown in greater generality in Lemma 1.1 of [Ki2, Chapter 2], which says that the error made by computing the n-step transition probability of arriving at a Borel set Γ only over δ-pseudo-orbits is of the order of $Cn\varepsilon^2 m(\Gamma)e^{-\frac{\alpha\delta}{2\varepsilon}}$. That is to say
\[ \left| P^\varepsilon(n, x, \Gamma) - P_x^\varepsilon\left\{ \operatorname{dist}(f(x_i^\varepsilon), x_{i+1}^\varepsilon) < \delta;\ i = 0, \dots, n-1 \text{ and } x_n^\varepsilon \in \Gamma \right\} \right| \le Cn\varepsilon^2 m(\Gamma)e^{-\frac{\alpha\delta}{2\varepsilon}} \tag{5} \]
for appropriately chosen constants.

¹ We can consider it also without identification.
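The distance (3) and the signed displacement σ(x, y) are simple to compute. The following is a minimal sketch of both, together with the right-hand side of the exponential bound (2); the function names `sigma`, `dist` and `density_bound` are introduced here for illustration, and the values of α and C passed in the usage are arbitrary placeholders rather than constants from [KK].

```python
import math


def sigma(x, y):
    """Signed displacement from x to y on [-1, 1] with end points identified:
    whichever of y - x, y - x + 2, y - x - 2 has the smallest absolute value."""
    candidates = (y - x, y - x + 2.0, y - x - 2.0)
    return min(candidates, key=abs)


def dist(x, y):
    """Distance (3): min(|y - x|, |y - x + 2|, |y - x - 2|)."""
    return abs(sigma(x, y))


def density_bound(x, y, eps, alpha, C):
    """Right-hand side of (2): C * eps^(-1) * exp(-(alpha / eps) * dist(x, y))."""
    return C / eps * math.exp(-alpha * dist(x, y) / eps)


if __name__ == "__main__":
    # Points near opposite end points are close once the end points are identified.
    print(dist(0.9, -0.9), sigma(0.9, -0.9), sigma(-0.9, 0.9))
    print(density_bound(0.0, 0.5, eps=0.1, alpha=0.5, C=2.0))
```

Because of the identification, 0.9 and −0.9 are at distance about 0.2 rather than 1.8, and σ records on which side the shorter path lies.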
3.1. The shadowing lemma. To make a proof of stochastic stability similar to that of [KK] we need the following lemma.

Lemma 1 (Shadowing). Suppose that f satisfies hypotheses A0-A3. Let $x_0, \dots, x_n$ be an $\varepsilon^\alpha$-pseudo-orbit of f, i.e.
\[ \operatorname{dist}(f x_i, x_{i+1}) < \varepsilon^\alpha, \quad i = 0, \dots, n-1, \tag{6} \]
where dist is defined by (3) and ε > 0 is small enough. There exists C > 0 depending only on f such that if 0 < β ≤ α/s and
\[ |x_i - c_0| \ge 2C\varepsilon^\beta, \quad i = 0, \dots, n, \tag{7} \]
then one can find a point y ∈ I so that
\[ \operatorname{dist}(f^i(y), x_i) \le C\varepsilon^{\alpha - \beta(s-1)}, \quad i = 0, \dots, n. \tag{8} \]
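The two quantities in the lemma, the pseudo-orbit condition (6) and the shadowing error in (8), can be made concrete in code. The following is a minimal sketch, assuming a hypothetical stand-in map (doubling on the circle [−1, 1) with end points identified), which is not a contracting Lorenz-like map and is used only so the example runs; all function names are introduced here for illustration.

```python
def dist(x, y):
    """Distance (3) on [-1, 1] with end points identified."""
    d = y - x
    return min(abs(d), abs(d + 2.0), abs(d - 2.0))


def wrap(z):
    """Representative of z in [-1, 1)."""
    return (z + 1.0) % 2.0 - 1.0


def toy_map(x):
    """Hypothetical stand-in for f (doubling map); NOT a Lorenz-like map."""
    return wrap(2.0 * x)


def is_pseudo_orbit(f, xs, delta):
    """Condition (6): dist(f(x_i), x_{i+1}) < delta for i = 0, ..., n-1."""
    return all(dist(f(xs[i]), xs[i + 1]) < delta for i in range(len(xs) - 1))


def shadowing_error(f, y, xs):
    """max over i of dist(f^i(y), x_i), the quantity bounded in (8)."""
    worst, z = 0.0, y
    for x in xs:
        worst = max(worst, dist(z, x))
        z = f(z)
    return worst


def noisy_orbit(f, x0, n, noise):
    """Orbit of f with a deterministic alternating kick of size `noise` at each step."""
    xs = [x0]
    for i in range(n):
        kick = noise if i % 2 == 0 else -noise
        xs.append(wrap(f(xs[-1]) + kick))
    return xs


if __name__ == "__main__":
    xs = noisy_orbit(toy_map, 0.3, 20, 5e-4)
    print(is_pseudo_orbit(toy_map, xs, 1e-3))  # kicks of size 5e-4 stay below 1e-3
    print(shadowing_error(toy_map, xs[0], xs))
```

The printed shadowing error uses the naive candidate y = x_0; the lemma asserts that a better point y exists under (7), not that x_0 itself shadows the pseudo-orbit.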
Proof. Choose $\rho_3 < \rho_0/2$ for sufficiently small $\rho_0$ such that
\[ f(U_{2\rho_3}(c_0)) \cap U_{2\rho_3}(c_0) = \emptyset. \]
Let $i_1 < \dots < i_k$ be such that $x_{i_j} \in U_{\rho_3}(c_0)$ for j = 1, . . . , k and $x_l \notin U_{\rho_3}(c_0)$ if $l \ne i_j$ for j = 1, . . . , k. Put also $i_0 = 0$ and $i_{k+1} = n$. Fixing $\rho_2$, we have fixed $M_{3\rho_3/4}$, so if $x_i \notin U_{\rho_3}(c_0)$ for $i = l+1, \dots, l + M_{3\rho_3/4} - 1$, then $f^q(x_l) \notin U_{3\rho_3/4}(c_0)$ for $q = 1, \dots, M_{3\rho_3/4} - 1$, from relation (6).

Therefore, in our case, a lemma similar to Lemma 5.2 of [Vi1] (shown in [Ro]) enables us to employ the standard argument yielding the shadowing in the expanding case for pieces $x_{i_j+1}, \dots, x_{i_{j+1}}$ of the pseudo-orbit to conclude that there exist $C_{\rho_3} > 0$ independent of the whole pseudo-orbit and some points $y_j$, j = 0, . . . , k, such that
\[ \operatorname{dist}(x_{i_j+l}, f^l(y_j)) \le C_{\rho_3}\varepsilon^\alpha \tag{9} \]
for all $l = 1, \dots, i_{j+1} - i_j$ and j = 0, . . . , k. Next, we shall prove that there exists a point $y \in f^{-i_k} y_k$ satisfying (8). By (6) and (9)
\[ \operatorname{dist}(f x_{i_j}, f y_j) \le (C_{\rho_3} + 1)\varepsilon^\alpha, \tag{10} \]
and from (A0) and (10) we have
\[ \left| |x_{i_j}|^s - |y_j|^s \right| \le \frac{s(C_{\rho_3} + 1)}{K_2}\,\varepsilon^\alpha, \tag{11} \]
and since s > 1 we have
\[ \frac{\left| |x_{i_j}|^s - |y_j|^s \right|}{|x_{i_j}|^s} \ge \frac{\left| |x_{i_j}| - |y_j| \right|}{|x_{i_j}|}, \]
so (11) becomes
\[ |x_{i_j} - y_j| \le |x_{i_j}|\,\frac{s(C_{\rho_3} + 1)}{K_2 |x_{i_j}|^s}\,\varepsilon^\alpha \le |x_{i_j}|\,\frac{s(C_{\rho_3} + 1)}{K_2 (2C)^s}\,\varepsilon^{\alpha - s\beta}, \]
since $|x_{i_j}| = |x_{i_j} - c_0| \ge 2C\varepsilon^\beta$ and sβ ≤ α. Now, by (9), if $C_6$ is chosen big enough we have
\[ \operatorname{dist}(y_j, f^{i_j - i_{j-1}} y_{j-1}) \le \tilde C_{\rho_3}\,\frac{\varepsilon^\alpha}{|x_{i_j}|^{s-1}} \tag{12} \]
for some $\tilde C_{\rho_3}$ independent of ε, j and points $\{x_i\}$. Since $|x_{i_j}| \ge 2C\varepsilon^\beta$, for ε small enough it follows from (9) and (12) that
\[ \operatorname{dist}(y_j, c_0) \le \operatorname{dist}(y_j, x_{i_j}) + \operatorname{dist}(x_{i_j}, c_0) \le \tilde C_{\rho_3}\,\frac{\varepsilon^\alpha}{|x_{i_j}|^{s-1}} + \rho_3 \le \frac{2}{3}\rho_0. \]
Also we have $\operatorname{dist}(f^{i_j - i_{j-1}} y_{j-1}, c_0) \le \frac{2}{3}\rho_0$. From these two relations and the corresponding lemmas similar to Lemmas (3.2) and (3.5) of [KK] we have
\[ \operatorname{dist}\left(f^{-l} y_j, f^{i_j - i_{j-1} - l} y_{j-1}\right) \le C_1 \tilde C_{\rho_3}\,\frac{\varepsilon^\alpha}{|x_{i_j}|^{s-1}}\,\gamma_0^{-l} \]
for appropriate preimages of $y_j$, where l ≥ 0 and $C_1 > 0$ depends only on $\rho_0$. It follows from here that
\[ \operatorname{dist}\left(f^{-(i_k - i_j) + l} y_k, f^l y_j\right) \le \tilde C_{\rho_3}\,\varepsilon^{\alpha - (s-1)\beta} \sum_{r=j}^{k} C_1^{\,r-j}\,\gamma_0^{-(i_j - i_{j-1}) + l} \tag{13} \]
for corresponding preimages of $y_k$, where $l = 1, \dots, i_{j+1} - i_j$. From the assumptions on our maps, $(i_{j+1} - i_j)$ is of the order $C_2^{-1}\log\rho_3^{-1}$, where $C_2$ is independent of $\rho_3$, $i_j$ and the choice of points $\{x_i\}$, $\{y_j\}$. Therefore if $\rho_3$ is taken small enough, then $C_1\gamma_0^{-(i_{j+1} - i_j)} < 1$ and the sum in the right side of (13) is bounded. This, together with (9) and (13), yields (8) for some $y \in f^{-i_k}(y_k)$, and proves Lemma 1. □

In [KK], there are two crucial lemmas: Lemma 4.1 and Lemma 4.4. Lemma 4.4 shows that the closure of the critical orbit has Lebesgue measure zero. This is necessary since [KK] proves Lemma 4.1 for intervals that are far from the critical orbit. In other words, [KK] shows that the limit measure µ of a weakly convergent sequence $\mu^{\varepsilon_i}$ of stationary measures is absolutely continuous with respect to Lebesgue in I \ A, where m(A) can be made arbitrarily small. We do the same, only choosing carefully the subset A. More precisely, we are going to estimate probabilities of arriving at intervals Γ ⊂ I such that $\Gamma = \pi(\hat f^n(\eta))$ for some $\eta \subset E_0$ and η ∈ Q(n, N), where N is chosen in such a way that the sum of the measures of the intervals that do not belong to Q(n, N) is less than $\tilde\varepsilon$. The definitions of $\hat f$ and the tower extension $\hat I$ (π is the natural projection of the tower) can be found in Section 5 and in [M]. The collection of intervals Q(n, N) is defined in Sect. 6 of [M] and the principal property we are using is stated in Lemma 6.1 of the same reference (similar definitions and properties can be found in [Vi1]). Let Γ ⊂ I be a Borel set, and define
\[ J_1^\varepsilon(\rho, n, x, \Gamma) = P_x\left\{ \min_{0 \le k \le n-1} \operatorname{dist}(x_k^\varepsilon, c_0) > \rho \text{ and } x_n^\varepsilon \in \Gamma \right\} \]
\[ = \int_{I \setminus U_\rho(c_0)} \cdots \int_{I \setminus U_\rho(c_0)} \int_\Gamma q^\varepsilon_{f(x)}(y_1)\,q^\varepsilon_{f(y_1)}(y_2) \cdots q^\varepsilon_{f(y_{n-1})}(y_n)\,dy_1 \cdots dy_n. \]
Lemma 2. For any $\tilde\varepsilon$ there exists N such that if $\Gamma = \pi(\hat f^n(\eta))$, where η ∈ Q(n, N) as before, then there exists $\gamma_0$ such that for any x ∈ I we have
\[ J_1^\varepsilon(\varepsilon^\gamma, n, x, \Gamma) \le D\,m(\Gamma), \]
provided that $(\log\varepsilon)^4 \ge n \ge (\log\varepsilon)^2$, $\gamma \le \gamma_0$ and ε is small enough.

Proof. This lemma is similar to a corollary in [KK]. Let us give a sketch of the proof. First we define $J_2^\varepsilon(\rho, \delta_1, n, x, \Gamma)$ similarly to $J_1$ but considering only $\delta_1$-pseudo-orbits. This lets us approximate $J_1$ by $J_2$ with the same error as in (5). From here we use the shadowing lemma to reduce the problem to calculating the probability $J_3^\varepsilon(\delta_2, n, x; z, \Gamma)$ for n-step Markov chains beginning in x and ending in Γ that can be $\delta_2$-shadowed by z, i.e. for Markov chains that stay in iterates of a dynamical ball. After this we can conclude the lemma as in [KK], using the corresponding bounded distortion properties of [Vi1] shown in [M]. □

As in [KK] we shall take care of the Markov chain $x_k^\varepsilon$ which sometimes approaches the critical point $c_0$. This is done in Lemma 4.3 of [KK], which can be translated with few modifications. So we already have

Theorem 2 (Theorem B). Stationary measures $\mu^\varepsilon$ for perturbations converge weakly to the a.c.i.m. $\mu_0$ of f.

Proof. We are assuming that the a.c.i.m. for f is unique, and the methods in [KK] give that if $\mu^\varepsilon \to \mu$ then µ is an a.c.i.m., therefore $\mu = \mu_0$. □

4. Stochastic Stability for the Lorenz Flow

For the previously defined Lorenz flow $X_t$ we consider the family of measures $P^\varepsilon(t, x, \cdot)$ on M given for every x ∈ M, t ∈ ℝ and ε > 0 small enough. Define the Markov chain $x_t^\varepsilon$, t ∈ ℝ, in the following way: if $x_t^\varepsilon = x$ then $x_{t+\tau}^\varepsilon$ has probability $P^\varepsilon(\tau, x, \Gamma)$ of being in Γ. The Markov chain $x_t^\varepsilon$ is called a small random perturbation of $X_t$ if for every function h continuous on M, we have
\[ \lim_{\varepsilon \to 0} \left| \int_M P^\varepsilon(t, x, dy)\,h(y) - h(X_t(x)) \right| = 0. \]
We say that $\nu^\varepsilon$ on M is an invariant measure for the Markov chain $x_t^\varepsilon$ if for all Borel sets Γ and any τ > 0
\[ \int_M \nu^\varepsilon(dx)\,P^\varepsilon(\tau, x, \Gamma) = \nu^\varepsilon(\Gamma). \]
It is a standard fact that in the case we are treating, weak limits of invariant measures $\nu^\varepsilon$ as ε → 0 exist and are invariant for the flow itself. As in [Ki2], we are going to deal with diffusion type random perturbations. This is the most common perturbation used for flows because it models a particle that moves under the action of the vector field and is also affected by random collisions in a medium. The operator which models this process has the following form:
\[ L^\varepsilon = \varepsilon^2 L + B, \]
where B is the vector field (Lorenz type in our case) and L is the Laplace–Beltrami operator (a second order elliptic operator acting on the space of coordinates [IW]). This operator generates a Markov diffusion process $x_t^\varepsilon$ with transition probability $P^\varepsilon(t, x, \cdot)$ having densities $p^\varepsilon(t, x, y)$ with respect to the Riemannian volume satisfying Kolmogorov's equation
\[ \frac{\partial p^\varepsilon(t, x, y)}{\partial t} = L^\varepsilon p^\varepsilon(t, x, y), \]
where $L^\varepsilon$ acts in the variable x, see [IW,Y]. It is known that if we consider $P^\varepsilon(x, \Gamma) = P^\varepsilon(\tau, x, \Gamma)$ for τ fixed, with $P^\varepsilon(\tau, x, \cdot)$ being the diffusion transition probabilities, we arrive at Markov chains $x_n = x_{n\tau}$ that satisfy properties similar to those in Assumption A of [KK]. We already know the existence of a Poincaré section which has all the good properties, including the stochastic stability for diffeomorphisms. Before going to the theorem we need the following lemma, as required in [Ki2].

Lemma 3. Let U be a sufficiently small neighborhood of the attractor Λ. There exists a constant C > 0 such that if $x_i$, i = 0, . . . , n, is an $\varepsilon^\alpha$-pseudo-orbit of $X_1$ staying in U and satisfying
\[ \min_{0 \le i \le n} \operatorname{dist}(x_i, O) > C\varepsilon^\beta, \tag{14} \]
then we can find a point y ∈ U such that
\[ \max_{0 \le i \le n} \operatorname{dist}(x_i, X_i(y)) \le Cn\varepsilon^{\alpha - \beta(s-1)}, \]
where O in (14) represents the origin and dist here is the Euclidean distance.

Proof. This lemma is the combination of the shadowing property in the Poincaré section with the shadowing property in the neighborhood of the hyperbolic fixed point of the flow. □

Our main theorem is the following

Theorem 3 (Theorem C). Let $x_t^\varepsilon$ be a diffusion type small random perturbation of the Lorenz flow introduced in Sect. 2 and let $\nu^\varepsilon$ be an invariant measure for this process. If $\nu^\varepsilon \to \nu$ then ν is an SRB measure for the Lorenz flow.

Proof. Since invariant measures for the perturbation of the flow are also invariant measures for the diffeomorphism $X_\tau$ for τ fixed, we can consider an invariant measure $\nu^\varepsilon$ for this diffeomorphism. Let $\nu^{\varepsilon_i}$ be a weakly convergent subsequence of measures having as limit the measure ν. Define a measure $\nu^*$ on the Poincaré section Σ (see Sect. 2 for the definition of Σ), as follows:
\[ \nu^*(\Gamma) = \left. \frac{d\,\nu\left(\cup_{0 \le t \le s} X_t(\Gamma)\right)}{ds} \right|_{s=0} \tag{15} \]
for all Γ ⊂ Σ. We claim that $\nu^*$ satisfies the property that the measure µ defined as $\mu(B) = \nu^*(\pi^{-1}(B))$ is absolutely continuous with respect to Lebesgue in the interval I.
From the claim the theorem follows, since invariant measures for diffusion type perturbations are unique ([IW], Theorem 4.5), and since the SRB measure for the Lorenz flow is uniquely defined by the property required in the claim and by definition (15). The claim follows from the methods in [Ki2] provided that we prove a distortion property in "rectangles" formed by a cartesian product of boxes in the flow, the stable manifold and the Poincaré section. If we choose carefully the partition to make the "rectangles", the distortion property will be an easy consequence of a similar property in the one-dimensional case. We know the existence of Q(n, N), which is a partition except for a small Lebesgue set (small depending only on N). With this "partition" we induce a similar one in the Poincaré section Σ. That is, property (15) is shown for Γ ⊂ Σ which belongs to this induced "partition", similarly to Lemma 2 and using the methods of Chapter 2 of [Ki2]. □

5. Strong Stochastic Stability

In what follows, we use the same constants as in [M]. It will be clear in the development of the sections that if these constants change they do so in such a way that the proofs that use them remain true. We will use here other constants, for example β1 and β2, that are very close to β. From now on we will use an open interval only a little larger than I, and it will be still denoted by I. Fix some small $\varepsilon_0$ so that $f_t(I) \subset I$ for all $|t| < \varepsilon_0$, where $f_t(x) = f(x) + t$, and we also write $f_t^n = f^n_{t_n \dots t_1} = f_{t_n} \circ \dots \circ f_{t_1}$ for each n ≥ 1 and $t = (t_1, \dots, t_n)$. We are interested in Markov chains $x_n^\varepsilon$, for $0 < \varepsilon < \varepsilon_0$, whose transition probabilities $P^\varepsilon(x, \cdot)$ have densities $\theta_\varepsilon(y - f(x))$. Each $\theta_\varepsilon$ is a probability distribution on [−ε, ε], bounded from below as in [BY]. We assume also
\[ \operatorname{supp}\theta_\varepsilon \subset [-\varepsilon, \varepsilon] \quad \text{and} \quad \int \theta_\varepsilon(x)\,dx = 1. \]
We also assume that $\theta_\varepsilon$ satisfies $M = \sup_\varepsilon(\varepsilon \sup|\theta_\varepsilon|) < \infty$. Denote $J_\varepsilon = \{t : \theta_\varepsilon(t) > 0\}$.
It is clear that $J_\varepsilon$ is an interval containing zero and we assume that $\varphi_\varepsilon = \log(\theta_\varepsilon|J_\varepsilon)$ is concave. Clearly, $\varphi_\varepsilon$ is concave if $\theta_\varepsilon|J_\varepsilon$ is. On the other hand, $\theta_\varepsilon|J_\varepsilon$ is at most two-to-one if $\varphi_\varepsilon$ is concave, since log is a homeomorphism in (0, ∞). It follows from our assumptions on $P^\varepsilon$ that, for all ε small enough, the Markov process $x_n^\varepsilon$ has a unique invariant probability measure $\mu_\varepsilon$, and this measure is absolutely continuous with respect to Lebesgue. The uniqueness comes from the property that $\theta_\varepsilon$ is bounded from below. Uniqueness also implies that $\mu_\varepsilon$ satisfies an ergodic property, namely, the product measure $\mu_\varepsilon \times \theta_\varepsilon^{\mathbb{N}}$ is ergodic (and invariant) with respect to the map on $I \times \mathbb{R}^{\mathbb{N}}$ defined by $(x, t_1, t_2, t_3, \dots) \mapsto (f_{t_1}(x), t_2, t_3, \dots)$ (see [Ki1], Theorem 2.1). It follows, using the Ergodic Theorem, that Birkhoff averages of random trajectories $x_j = f_{t_j \dots t_1}(x)$ converge to $\mu_\varepsilon$ for Lebesgue almost every $(x, t_1, t_2, t_3, \dots) \in \operatorname{supp}(\mu_\varepsilon) \times \operatorname{supp}\theta_\varepsilon^{\mathbb{N}}$. In this context we say that the dynamics of f is strongly stochastically stable if the densities $h_\varepsilon$ of $\mu_\varepsilon$ converge to the density of $\mu_0$ as ε goes to zero in the BV topology, where $\mu_0$ is the unique invariant measure of f. In this section we are going to prove

Theorem 4 (Theorem D). The dynamics of f is strongly stochastically stable.
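The additive perturbation scheme of this section, $f_t(x) = f(x) + t$ with random trajectories $x_j = f_{t_j \dots t_1}(x)$, can be sketched in a few lines. The following is a minimal illustration, assuming a hypothetical contraction as a stand-in for f (it is not a Lorenz-like map) and a uniform noise density on [−ε, ε] as a hypothetical choice of $\theta_\varepsilon$; the function names are introduced here for illustration only.

```python
import random


def toy_map(x):
    """Hypothetical stand-in for f (a contraction); NOT a Lorenz-like map."""
    return 0.5 * x


def compose_perturbed(f, ts, x):
    """Apply f_{t_n} o ... o f_{t_1} to x, where f_t(x) = f(x) + t and ts = (t_1, ..., t_n)."""
    for t in ts:
        x = f(x) + t
    return x


def random_orbit(f, x0, n, eps, rng):
    """Random trajectory x_j = f_{t_j ... t_1}(x0), with the t_j drawn independently
    and uniformly from [-eps, eps] (an assumed choice of theta_eps)."""
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]) + rng.uniform(-eps, eps))
    return xs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    orbit = random_orbit(toy_map, 0.5, 10, 0.01, rng)
    print(orbit[-1])
    print(compose_perturbed(toy_map, (0.1, -0.05), 0.8))
```

With all $t_j = 0$ the composition reduces to the unperturbed iterate $f^n$, and each step of a random trajectory stays within ε of the deterministic image, which is the sense in which the chain is a small perturbation of f.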
5.1. The construction. Besides the tower extension fˆ, we construct its deterministic perturbation fˆt , for |t| < < 0 . Take the constants β1 and β2 , with β1 < β < β2 , very 1/s close to β so that it is still true that eβi /2 λρ < λc , for i = 1, 2. For k ∈ Z, (x, k) ∈ Ek and |t| < we set (f (x), k(+)) if |k| ≥ 1 and ft (x) ∈ Bk t (ft (x), −1) if k = 0 and x ∈ (−δ, 0) fˆt (x, k) = (f (x), 1) if k = 0 and x ∈ (0, δ) t (f (x), 0) otherwise. t Define also fˆtn ...t1 = fˆtn ◦ . . . ◦ fˆt1 for some (t1 , . . . , tn . . . ) ∈ JN . Observe that also ft ◦ π = π ◦ fˆt on Iˆ. We allow now H (δ) to depend on 0 , in the following way H (δ) = H (δ, 0 ) be the minimal k ≥ 1 such that there exist some x ∈ (−δ, δ) and some t = (t1 , . . . , tk , tk+1 ) ∈ such that fˆtk+1 (x, 0) ∈ E0 . We observe that, by continuity, H (δ) again can be Jk+1 0 made arbitrarily large by choosing small enough δ and 0 . We define the Markov chain xˆn by considering the transition probabilities ∞ Z X θˆ ((y, j ), fˆ(x, k))dy, Pˆ ((x, k), E) = j =−∞ π(E)
where θˆ ((y, j ), fˆ(x, k)) = 0 if fˆy−f (x) (x, k) 6 ∈ Ej and θˆ ((y, j ), fˆ(x, k)) = θ (y − f (x)) otherwise, in which case fˆy−f (x) (x, k) = (y, j ). In particular j = k + 1 and when there is no ambiguity we simply write θ (y − f (x)). We wish to consider the transfer operator L related to the unique absolutely continuous invariant measure of fˆ and Markov chain xˆn , so we first define the cocycle ω in the following way: 1 k=0 λ R θ (x − f (y))dy k =1 R(−δ,0) ω (x, k) = k = −1 λ (0,δ) θ (x − f (y))dy λ R ω (y, k(−))θ (x − f (y))dy |k| ≥ 2. ∗ B k
Define $(x_{t_l,\dots,t_1},k-l(k))=\hat f^{-l}_{t_l,\dots,t_1}(x,k)$, with $|k|\ge l\ge1$, to be the unique point such that $\hat f^{\,l}_{t_l,\dots,t_1}(x_{t_l,\dots,t_1},k-l(k))=(x,k)$, with $l(k)=l$ if $k\ge0$ and $l(k)=-l$ otherwise. With the previous definition we have
\[
\omega_\varepsilon(x,\pm1)=\lambda\int(f'(x_t))^{-1}\theta_\varepsilon(t)\,dt,
\]
and also
\[
\omega_\varepsilon(x,k)=\lambda\int\omega_\varepsilon(x_t,k(-))\,(f'(x_t))^{-1}\theta_\varepsilon(t)\,dt,\qquad |k|\ge2.
\]
Integration is over $t$ such that $x_t$ is defined, with $x_t\in(-\delta,0)$ or $(0,\delta)$ for the first integral (depending on the "sign of the level"), and $x_t\in B_{k(-)}$ for the second integral. Introducing the notation
\[
d\theta_\varepsilon(t)=\theta_\varepsilon(t_1)\cdots\theta_\varepsilon(t_{|k|-1})\,dt_1\cdots dt_{|k|-1},
\]
we also have, for $|k|\ge2$,
\[
\omega_\varepsilon(x,k)=\lambda^{|k|-1}\int\omega_\varepsilon(x_{t_{|k|-1}\dots t_1},1)\,\Bigl(\bigl(\hat f^{\,|k|-1}_{t_{|k|-1}\dots t_1}\bigr)'(x_{t_{|k|-1}\dots t_1})\Bigr)^{-1}d\theta_\varepsilon(t),
\]
where the integral is over the $t\in J_\varepsilon^{|k|-1}$ such that $x_{t_{|k|-1}\dots t_1}\in B_{\pm1}$ exists.

Our assumptions imply that $\theta_\varepsilon$ converges to the Dirac function as $\varepsilon\to0$. It follows that $\omega_\varepsilon(x,k)\to\omega_0(x,k)$ pointwise as $\varepsilon\to0$. Moreover, for $\varepsilon$ small enough, and for $|k|>0$, the support of $\omega_\varepsilon$ in $E_k$ is an interval with endpoints close to the endpoints of the support of $\omega_0$ in $E_k$. Similar to what we do for $\omega_0$, we write $m_\varepsilon=\omega_\varepsilon m$, and note that this measure is also finite.

We use the cocycles $\omega_\varepsilon$ to define nonnegative weights $g_t$ on $\hat I$ by
\[
g_t(y,k)=\frac{\omega_\varepsilon(y,k)}{\omega_\varepsilon(\hat f_t(y,k))}\,\frac{1}{f'(y)},
\]
for $|t|<\varepsilon$. We use the notations $g=g_0$ and $g^{(n)}=\prod_{j=0}^{n-1}g\circ\hat f^{\,j}$, and similarly $g_t^{(n)}=\prod_{j=0}^{n-1}g\circ\hat f^{\,j}_{t_j,\dots,t_1}$.

We will denote
\[
[a_k,b_k]\times\{k\}=E_k\cap\bigcup_{t\in J_\varepsilon^{|k|}}\operatorname{Im}\hat f_t^{\,|k|}.
\]
Note that this definition implies that for all $x$ that belong to $[a_k,b_k]\times\{k\}$ there exist a $t$ and a point $x_t$ in the ground level such that $\hat f_t^{\,|k|}(x_t,0)=(x,k)$. We denote
\[
L_t\varphi(x,k)=\frac{1}{\omega_\varepsilon(x,k)}\sum_{\hat f_t(y,j)=(x,k)}\frac{\varphi(y,j)\,\omega_\varepsilon(y,j)}{f'(y)}
=\sum_{\hat f_t(y,j)=(x,k)}\varphi(y,j)\,g_t(y,j)
\]
and
\[
L_\varepsilon\varphi(x,k)=\int L_t\varphi(x,k)\,\theta_\varepsilon(t)\,dt
=\frac{1}{\omega_\varepsilon(x,k)}\sum_j\int_{B_j}\varphi(y,j)\,\omega_\varepsilon(y,j)\,\hat\theta_\varepsilon((x,k),\hat f(y,j))\,dy,
\]
for $k=0$ or $|k|\ge1$ with $a_k<x<b_k$. For $|k|\ge1$ and $x\notin[a_k,b_k]$ we make definitions using limits as before.

An interval $\eta\subset E_k$ is called an interval of monotonicity for a map $\hat F:\hat I\to\hat I$ if the map $F=\pi\circ\hat F$ is monotone on $\eta$ and if there is a $j$ such that $\hat F(\eta)\subset E_j$. Observe that this definition coincides with that of $P^{(n)}$ given in Sect. 3 of [M]. For $t=(t_1,\dots,t_n)\in J_\varepsilon^{\mathbb N}$, let $P_t^{(n)}$ be the set of intervals of monotonicity of $\hat f_t^{\,n}$. That is,
\[
P_t^{(n)}=\bigl\{\eta_1\cap\hat f_{t_1}^{-1}(\eta_2)\cap\dots\cap(\hat f^{\,n-1}_{t_{n-1},\dots,t_1})^{-1}(\eta_n):\ \eta_i \text{ monotonicity interval of } \hat f_{t_i},\ 1\le i\le n\bigr\}.
\]
Clearly, endpoints of nontrivial intervals in $P_t^{(n)}$ vary continuously with $t$. It follows that given any $\eta_0$ in $P^{(n)}$, for each $t$ close enough to 0, there exists an interval $\eta(t,\eta_0)\in P_t^{(n)}$ with endpoints depending continuously on $t$ and such that $\eta(0,\eta_0)=\eta_0$.
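5.2. The lemmas. We are not going to prove all the equivalent statements of [BV], but give some lemmas that show how the others go through with the necessary modifications. The next lemma is related to the weight $g_t(y,k)$ (compare with the definition of $g_t$), evaluated at points in the support of $m_\varepsilon$ which "fall down" from the tower, i.e., $|k|\ge H(\delta)$ and $\hat f_t(y,k)\in E_0$.

Lemma 4 (BV 3). There is $c>0$ so that $\omega_\varepsilon(y,k)|f'(y)|^{-1}\le c\rho^{-|k|}$ for all $\varepsilon\ge0$ and $|k|\ge1$, and all $(y,k)\in E_k$ having $\hat f_t(y,k)\in E_0$ for some $|t|<\varepsilon$.

Proof. The case $\varepsilon=0$ is easy. Assuming $\varepsilon>0$, we derive a preliminary estimate for $\omega_\varepsilon$ on $E_{\pm1}$. Consider $\omega_\varepsilon$ on $E_{-1}$. (For $E_1$ the proof is similar.) If $z\ge c_{-1}+\varepsilon$ then $\omega_\varepsilon(z,-1)=0$. Otherwise
\[
\omega_\varepsilon(z,-1)=\lambda\int_{(-\delta,0)}\theta_\varepsilon(z-f(y))\,dy,
\]
\[
\omega_\varepsilon(z,-1)=\lambda\int_{z-c_{-1}}^{z-f(-\delta)}\frac{\theta_\varepsilon(t)}{f'(f_t^{-1}(z))}\,dt\le\lambda\sup\theta_\varepsilon\int\frac{1}{f'(z_t)}\,dt.
\]
Note that we write $f_t^{-1}(z)$ because it is well defined for all $z$ such that $(z,-1)\in E_{-1}$, for $\delta$ small enough. The first integral is taken over $\{t\ge z-c_{-1}:|z_t|\le\delta\}$ and the second one over $\{t\ge z-c_{-1}:|t|\le\varepsilon\}$. Hence, if $c_{-1}-\varepsilon\le z\le c_{-1}+\varepsilon$ then
\[
\omega_\varepsilon(z,-1)\le\lambda\sup\theta_\varepsilon\int_{z-c_{-1}}^{\varepsilon}\frac{dt}{f'(f_t^{-1}(z))}=\lambda\sup\theta_\varepsilon\int_{0}^{z_\varepsilon}-dx,\tag{16}
\]
\[
\omega_\varepsilon(z,-1)=\lambda(\sup\theta_\varepsilon)(-z_\varepsilon)\le\lambda M\,\frac{|z_\varepsilon|}{\varepsilon}\tag{17}
\]
\[
\le\lambda MC/\varepsilon^{1-1/s},\tag{18}
\]
because property A0 implies
\[
\frac{K_1}{s}|x|^s<|f(x)-f(0^-)|,
\]
leading to
\[
\frac{K_1}{s}|z_\varepsilon|^s<2\varepsilon.
\]
On the other hand, for $z\le c_{-1}-\varepsilon$ we have
\[
\begin{aligned}
\omega_\varepsilon(z,-1)&\le\lambda\sup\theta_\varepsilon\int_{-\varepsilon}^{\varepsilon}\frac{dt}{f'(f_t^{-1}(z))}=\lambda\sup\theta_\varepsilon\,\bigl(-(z_\varepsilon-z_{-\varepsilon})\bigr)\\
&\le\lambda\sup\theta_\varepsilon\,\bigl(|z_\varepsilon|-|z_{-\varepsilon}|\bigr)\\
&\le\lambda\sup\theta_\varepsilon\left(\frac{|z_\varepsilon|^s-|z_{-\varepsilon}|^s}{|z_\varepsilon|^{s-1}+|z_{-\varepsilon}|^{s-1}}+\frac{|z_\varepsilon||z_{-\varepsilon}|^{s-1}-|z_{-\varepsilon}||z_\varepsilon|^{s-1}}{|z_\varepsilon|^{s-1}+|z_{-\varepsilon}|^{s-1}}\right)\\
&\le C\lambda\sup\theta_\varepsilon\,\frac{|z_\varepsilon|^s-|z_{-\varepsilon}|^s}{|z_\varepsilon|^{s-1}+|z_{-\varepsilon}|^{s-1}}.
\end{aligned}
\]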
So it becomes
\[
\omega_\varepsilon(z,-1)\le\frac{C\lambda M}{|z_0|^{s-1}},\tag{19}
\]
since $(f_t^{-1})^s=(z_t)^s$ is a smooth function of $t$ and since $|z_\varepsilon|^{s-1}+|z_{-\varepsilon}|^{s-1}\ge|z_0|^{s-1}$.

Now we consider a general $|k|\ge H(\delta)$. Without loss of generality suppose here that $k>0$. From the definition we have
\[
\omega_\varepsilon(y,k)|f'(y)|^{-1}=\lambda^{k-1}\int\omega_\varepsilon(y_{t_{k-1}\dots t_1},1)\,\Bigl((f^{k}_{0,t_{k-1}\dots t_1})'(y_{t_{k-1}\dots t_1})\Bigr)^{-1}d\hat\theta_\varepsilon(t).
\]
Now, split this into a sum $W_1+W_2$, where the two terms correspond to restricting the domain of integration, respectively, to $\{|c_1-y_{t_{k-1}\dots t_1}|\ge\varepsilon\}$ and to $\{|c_1-y_{t_{k-1}\dots t_1}|<\varepsilon\}$. In order to bound $W_1$ and $W_2$, we note that
\[
e^{-\beta_2(k+1)}\le c\,|(f^k)'(c_1)|\,\bigl(|c_1-y_{t_{k-1}\dots t_1}|+\varepsilon\bigr).\tag{20}
\]
This is a translation of the relation deduced in the proof of Lemma 2 in [BV], using $\hat f^{\,j-1}_{t_{j-1}\dots t_1}(y_{t_{k-1}\dots t_1},1)\in E_j$ for $1\le j\le k$ and $\hat f^{\,k}_{t,t_{k-1}\dots t_1}(y_{t_{k-1}\dots t_1},1)\in E_0$ for some $|t|<\varepsilon$.

Let us set first $|c_1-y_{t_{k-1}\dots t_1}|\ge\varepsilon$. Then (20) gives
\[
|c_1-y_{t_{k-1}\dots t_1}|\ge\frac{e^{-\beta_2k}}{C|(f^k)'(c_1)|},
\]
and since $|z_0|\ge|c_1-z|^{1/s}/C$, (19) yields
\[
\omega_\varepsilon(y_{t_{k-1}\dots t_1},1)\le C\lambda M\left(\frac{|(f^k)'(c_1)|}{e^{-\beta_2k}}\right)^{\frac{s-1}{s}}\le Ce^{\beta_2k(s-1)/s}\bigl((f^k)'(c_1)\bigr)^{(s-1)/s}.
\]
Replacing in $W_1$ and using again the distortion inequality (3.9) of [BV] we obtain
\[
W_1\le\lambda^{k-1}\int Ce^{\beta_2k(s-1)/s}\,\frac{\bigl((f^k)'(c_1)\bigr)^{(s-1)/s}}{(f^k)'(c_1)}\,d\theta_\varepsilon(t)
\le\lambda^{k-1}C\left(\frac{e^{\beta_2(s-1)/s}}{\lambda_c^{(s-1)/s}}\right)^{k}\le C\rho^{-k}.
\]
For $W_2$, we use (20) to get that
\[
\varepsilon\ge\frac{e^{-\beta_2k}}{C(f^k)'(c_1)}\qquad\text{if } |c_1-y_{t_{k-1}\dots t_1}|\le\varepsilon,
\]
and (18) to conclude that $\omega_\varepsilon(y_{t_{k-1}\dots t_1},1)\le Ce^{\beta_2k(s-1)/s}|(f^k)'(c_1)|^{(s-1)/s}$. The same calculations as before give $W_2\le C\rho^{-k}$, ending the proof of Lemma 4. □
For $|k|\ge1$ we introduce subintervals of $E_k$:
\[
\beta_k^L=\{(y,k)\mid f(y)<a_{k+1}-\varepsilon\},\qquad
\beta_k^R=\{(y,k)\mid f(y)>b_{k+1}+\varepsilon\},
\]
where $[a_j,b_j]=B_j$, i.e., $a_j,b_j$ are the endpoints of the interval $B_j$. Note that $(y,k)\in\beta_k^R\cup\beta_k^L$ if and only if $\hat f_t(y,k)\in E_0$ for some $|t|\le\varepsilon$.

Lemma 5 (BV 4). There is a constant $c>0$ such that for all $\varepsilon\ge0$ and $|k|\ge1$ we have
\[
\operatorname{var}_{\beta^{L,R}}\,\omega_\varepsilon(y,k)(f'(y))^{-1}\le c\,(e^{\alpha}\rho^{-1})^{|k|}.
\]
Proof. For each fixed $\varepsilon\ge0$ and $|k|\ge1$, we have that $\{\omega_\varepsilon(y,k)\ne0\}$ is an interval. Denote by $\gamma_k^{L,R}$ its intersection with $\beta_k^{L,R}$. We suppose $|k|\ge H(\delta)$, otherwise $\gamma_k^{L,R}$ is empty. Now, suppose that $\varepsilon=0$. For $(y,k)\in\gamma_k^{L,R}$ we have
\[
\omega_0(y,k)|f'(y)|^{-1}=\frac{\lambda^{|k|}}{(f^{|k|+1})'(\hat f^{-|k|}(y,k))}.
\]
Note that $f^{|k|+1}$ has negative Schwarzian derivative, because $f$ does. Now, $f^{|k|+1}$ does not have critical points in $\hat f^{-|k|}(\gamma_k^{L,R})$, because this last set does not contain the critical point, and neither does $\pi(E_j\cap\operatorname{supp}(\omega_0))$ for $j\ge1$ for appropriately chosen constants, see Sect. 2 of [M]. This implies that $(f^{|k|+1})'(\hat f^{-|k|}(y,k))$ has a unique maximum and so $\omega_0(y,k)|f'(y)|^{-1}$ has a unique minimum, restricted to $\gamma_k^{L,R}$, hence
\[
\operatorname{var}_{\beta^{L,R}}\bigl(\omega_0(y,k)(f'(y))^{-1}\bigr)\le2\sup_{\gamma_k^{L,R}}\bigl(\omega_0(y,k)(f'(y))^{-1}\bigr),
\]
and the claim for $\varepsilon=0$ follows from Lemma 4.

Assume now that $\varepsilon>0$. The main step is to prove that $\omega_\varepsilon$ is at most two-to-one on each $E_k$. For this we use the assumption that $\phi=\log(\theta_\varepsilon|_{J_\varepsilon})$ is concave. Observe that a function $\Psi$ is concave if and only if
\[
\Psi(x_1)+\Psi(x_4)\le\Psi(x_2)+\Psi(x_3)
\]
for every $x_1<x_2\le x_3<x_4$ with $x_1+x_4=x_2+x_3$. Let $j\ge0$ be given (for $j<0$ we have similar relations); if $j=0$, replace $B_j$ by $[-\delta,0)$ or $(0,\delta]$. Thus
\[
\begin{aligned}
&\omega_\varepsilon(x_1,j+1)\,\omega_\varepsilon(x_4,j+1)-\omega_\varepsilon(x_2,j+1)\,\omega_\varepsilon(x_3,j+1)\\
&\quad=\lambda^2\int_{B_j}\int_{B_j}\omega_\varepsilon(y,j)\,\omega_\varepsilon(z,j)\,
\bigl[\theta_\varepsilon(x_1-fy)\theta_\varepsilon(x_4-fz)-\theta_\varepsilon(x_2-fy)\theta_\varepsilon(x_3-fz)\bigr]\,dy\,dz\le0,
\end{aligned}
\]
for all $x_1<x_2\le x_3<x_4$ with $x_1+x_4=x_2+x_3$. For the last inequality observe that the term in the integral is always non-positive, since we have $(x_1-fy)+(x_4-fz)=(x_2-fy)+(x_3-fz)$ and $\log(\theta_\varepsilon|_{J_\varepsilon})$ is concave. This proves that $\log\omega_\varepsilon$ is concave and so $\omega_\varepsilon$ is at most two-to-one on $E_{j+1}$, see Sect. 2.1 in [BV]. As a consequence
\[
\operatorname{var}_{\beta^{L,R}}\,\omega_\varepsilon(y,k)\le2\sup_{\gamma_k^{L,R}}\bigl(\omega_\varepsilon(y,k)(f'(y))^{-1}\bigr),
\]
by Lemma 4. Moreover, $|c_k|\ge e^{-\alpha|k|}$ (condition A2) and $f'$ has a unique maximum on each $B_k$ for $|k|\ge H(\delta)$; here we use once more that $f$ has negative Schwarzian derivative.
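Therefore, we have
\[
\begin{aligned}
\operatorname{var}_{\beta^{L,R}}\bigl(\omega_\varepsilon(y,k)(f'(y))^{-1}\bigr)
&\le\operatorname{var}_{\beta^{L,R}}\bigl(\omega_\varepsilon(y,k)\bigr)\sup_{\beta^{L,R}}\bigl((f'(y))^{-1}\bigr)
+\sup_{\beta^{L,R}}\bigl(\omega_\varepsilon(y,k)\bigr)\operatorname{var}_{\beta^{L,R}}\bigl((f'(y))^{-1}\bigr)\\
&\le c\rho^{-|k|}\,ce^{\alpha|k|}+c\rho^{-|k|}e^{\alpha|k|}.
\end{aligned}
\]
□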
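As a small worked illustration of the four-point criterion for concavity used in the proof of Lemma 5, consider the hypothetical example $\Psi(x)=-x^2$ (chosen only for illustration; it is not a function from the paper). Writing the four points symmetrically about their common midpoint $m$, the criterion can be checked by hand:

\[
x_1=m-a,\quad x_4=m+a,\quad x_2=m-b,\quad x_3=m+b,\qquad 0\le b<a,
\]
\[
\Psi(x_1)+\Psi(x_4)-\Psi(x_2)-\Psi(x_3)=-(2m^2+2a^2)+(2m^2+2b^2)=-2(a^2-b^2)\le0.
\]

Exponentiating, a density of the form $\theta=e^{\Psi}$ then satisfies $\theta(x_1)\theta(x_4)\le\theta(x_2)\theta(x_3)$, which is the shape of the inequality applied to the integrand in the proof.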
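We now proceed with some preliminary bounds on $L_\varepsilon$ concerning points which are "climbing" the tower, i.e., $(y,k)\in E_k$ and $\hat f_t(y,k)\in E_{k(+)}$.

Lemma 6 (BV 5). Let $\varphi\in BV(\hat I)$ and $\varepsilon\ge0$.
1) For $|k|>0$ and each $\beta\subset E_{k(+)}\cap\operatorname{supp}m_\varepsilon$, we have $\sup_\beta|L_\varepsilon\varphi|\le\frac{1}{\lambda}\sup_\gamma|\varphi|$, where $\gamma=\bigcup_{t\in J_\varepsilon}(\hat f_t|_{E_k})^{-1}(\beta)\cup\operatorname{supp}m_\varepsilon$.
2) For each $\beta\subset E_{\pm1}\cup\operatorname{supp}m_\varepsilon$ we have $\sup_\beta|L_\varepsilon\varphi|\le\frac{K}{\lambda}\sup_{\gamma^\pm}|\varphi|$, where $\gamma^\pm=\bigcup_{t\in J_\varepsilon}(\hat f_t|_{E_0})^{-1}(\beta)$.

Proof. The proof for $\varepsilon=0$ is easy. Assume $\varepsilon>0$. By definition, for $|k|\ge1$ and $x\in B_{k(+)}$ such that $\omega_\varepsilon(x,k(+))\ne0$,
\[
L_\varepsilon\varphi(x,k+1)=\frac{\int_{B_k}\omega_\varepsilon(z,k)\,\varphi(z,k)\,\theta_\varepsilon(x-f(z))\,dz}{\lambda\int_{B_k}\omega_\varepsilon(z,k)\,\theta_\varepsilon(x-f(z))\,dz},
\]
and part (1) follows. Now, note that if $\omega_\varepsilon(x,\pm1)\ne0$, then
\[
L_\varepsilon\varphi(x,1)=\frac{\int_{[-\delta,0)}\varphi(z,k)\,\theta_\varepsilon(x-f(z))\,dz}{\lambda\int_{[-\delta,0)}\theta_\varepsilon(x-f(z))\,dz},
\qquad
L_\varepsilon\varphi(x,-1)=\frac{\int_{(0,\delta]}\varphi(z,k)\,\theta_\varepsilon(x-f(z))\,dz}{\lambda\int_{(0,\delta)}\theta_\varepsilon(x-f(z))\,dz},
\]
and part (2) follows. □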
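The step "part (1) follows" in the proof of Lemma 6 can be spelled out as a weighted-average bound. The display below is only a sketch: the shorthand $w_x$ is introduced here for illustration and does not appear in the paper, and the perturbation is assumed to be additive, $f_t=f+t$, as the notation $\theta_\varepsilon(x-f(z))$ suggests.

\[
w_x(z)=\omega_\varepsilon(z,k)\,\theta_\varepsilon(x-f(z))\ge0,
\qquad
|L_\varepsilon\varphi(x,k+1)|
\le\frac{\int_{B_k}w_x(z)\,|\varphi(z,k)|\,dz}{\lambda\int_{B_k}w_x(z)\,dz}
\le\frac{1}{\lambda}\sup_{\{z:\,w_x(z)>0\}}|\varphi(z,k)|.
\]

Under the stated assumption, a point $z$ with $w_x(z)>0$ lies in the support of $\omega_\varepsilon$ and satisfies $x-f(z)\in J_\varepsilon$, that is, $x=f_t(z)$ for some $t\in J_\varepsilon$; this is how a set of preimages such as $\gamma$ enters the statement.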
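Lemma 7 (BV 6). Let $\varphi\in BV(\hat I)$ and $\varepsilon\ge0$.
1) For all $|k|\ge1$ and each interval $\beta\subset E_{k+1}$, we have $\operatorname{var}_\beta L_\varepsilon(\varphi)\le\frac{1}{\lambda}\operatorname{var}_\gamma(\varphi)$, where $\gamma=\bigcup_{t\in J_\varepsilon}(\hat f_t|_{E_k})^{-1}(\beta)\cup\operatorname{supp}(m_\varepsilon)$.
2) For each interval $\beta\subset E_{\pm1}$, we have $\operatorname{var}_\beta L_\varepsilon(\varphi)\le\frac{K}{\lambda}\operatorname{var}_{\gamma^\pm}\varphi$, where $\gamma^+=\bigcup_{t\in J_\varepsilon}(\hat f_t|_{(-\delta,0)\times\{0\}})^{-1}(\beta)$ and $\gamma^-=\bigcup_{t\in J_\varepsilon}(\hat f_t|_{(0,\delta)\times\{0\}})^{-1}(\beta)$.

Proof. The case $\varepsilon=0$ is easy. We start with $|k|\ge1$. Consider first $\varphi|_{E_k}=H_u=\chi_{[u,b_k]\times\{k\}}$ for some point $u\in B_k$. We shall prove that $L_\varepsilon\varphi$ is monotone on $E_{k(+)}$.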
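Obviously, we may disregard the points $(x,k(+))$ where $L_\varepsilon$ is defined by a limit. At all other points, we have
\[
L_\varepsilon\varphi(x,k+1)=\frac{\int_u^{b_k}\omega_\varepsilon(z,k)\,\theta_\varepsilon(x-f(z))\,dz}{\lambda\int_{a_k}^{b_k}\omega_\varepsilon(y,k)\,\theta_\varepsilon(x-f(y))\,dy}.
\]
Fix $x_1>x_2$ in $\pi(\beta)$ with $\omega_\varepsilon(x_i,k+1)\ne0$, for $i=1,2$. Up to a positive factor, the difference $L_\varepsilon\varphi(x_1,k+1)-L_\varepsilon\varphi(x_2,k+1)$ is equal to
\[
\int_{a_k}^{b_k}dy\int_u^{b_k}dz\,\omega_\varepsilon(z,k)\,\omega_\varepsilon(y,k)\,\bigl[\theta_\varepsilon(x_1-fz)\theta_\varepsilon(x_2-fy)-\theta_\varepsilon(x_2-fz)\theta_\varepsilon(x_1-fy)\bigr].\tag{21}
\]
Since $f|_{B_k\cap\operatorname{supp}m_\varepsilon}$ is increasing, $f(y)\le f(z)$ in (21). Thus $x_1-fy\ge\max\{x_1-fz,\,x_2-fy\}$ and $x_2-fz\le\min\{x_2-fy,\,x_1-fz\}$. So, using $(x_1-fy)+(x_2-fz)=(x_1-fz)+(x_2-fy)$ together with the concavity of $\log(\theta_\varepsilon|_{J_\varepsilon})$, we get
\[
\theta_\varepsilon(x_1-fz)\,\theta_\varepsilon(x_2-fy)\ge\theta_\varepsilon(x_2-fz)\,\theta_\varepsilon(x_1-fy).
\]
Hence $L_\varepsilon\varphi(x_1,k+1)\ge L_\varepsilon\varphi(x_2,k+1)$, i.e., $L_\varepsilon\varphi$ is non-decreasing on $\beta$. This proves, using Lemma 6, that
\[
\operatorname{var}_\beta L_\varepsilon\varphi=\sup_\beta L_\varepsilon\varphi-\inf_\beta L_\varepsilon\varphi\le\frac{1}{\lambda}(1-0).
\]
Consider now the case where
\[
\varphi|_{E_k}=\sum_{j=1}^{m}d_jH_{u_j},\tag{22}
\]
for some $u_j\in B_k$ and $d_j>0$. Then
\[
\operatorname{var}_\beta L_\varepsilon\varphi=\operatorname{var}_\beta L_\varepsilon\Bigl(d_0\chi_\gamma+\sum_{u_j\in\gamma}d_jH_{u_j}\Bigr),
\]
for some constant $d_0\ge0$. Observe that $L_\varepsilon(d_0\chi_\gamma)$ is constant on $\beta$. Therefore, by linearity, $\operatorname{var}_\beta L_\varepsilon\varphi\le\frac{1}{\lambda}\sum_{u_j\in\gamma}d_j=\frac{1}{\lambda}\operatorname{var}_\gamma\varphi$.

If $\varphi|_{E_k}$ is nonnegative and non-decreasing, we take a sequence $\varphi_n$ of the form (22) with $\varphi_n|_{E_k}\le\varphi|_{E_k}$ and converging uniformly to $\varphi|_{E_k}$. Since $L_\varepsilon\varphi_n$ converges pointwise to $L_\varepsilon\varphi$ on $E_{k(+)}$, we get
\[
\operatorname{var}_\beta L_\varepsilon\varphi\le\liminf_n\operatorname{var}_\beta L_\varepsilon\varphi_n\le\frac{1}{\lambda}\limsup_n\operatorname{var}_\gamma\varphi_n\le\frac{1}{\lambda}\operatorname{var}_\gamma\varphi.
\]
Finally, if $\varphi|_{E_k}$ is any function with bounded variation, we may write $\varphi|_{E_k}=\varphi_1-\varphi_2$ with $\varphi_j$ nonnegative, nondecreasing, and such that $\operatorname{var}_\gamma\varphi=\operatorname{var}_\gamma\varphi_1+\operatorname{var}_\gamma\varphi_2$; then
\[
\operatorname{var}_\beta L_\varepsilon\varphi\le\sum_j\operatorname{var}_\beta L_\varepsilon\varphi_j\le\sum_j\frac{1}{\lambda}\operatorname{var}_\gamma\varphi_j=\frac{1}{\lambda}\operatorname{var}_\gamma\varphi.
\]
For $\beta\subset E_{\pm1}$ the argument above, applied to a function $\varphi$ which vanishes on $(0,\delta]$ or $[-\delta,0)$, yields $\operatorname{var}_\beta L_\varepsilon\varphi\le\frac{1}{\lambda}\operatorname{var}_{\gamma^\pm}\varphi$. □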
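Lemma 8 (BV 7).
\[
\lim_{\varepsilon\to0}\sum_k\int_{B_k}|\omega_\varepsilon(x,k)-\omega_0(x,k)|\,dx=0.
\]
Proof. The term for $k=0$ vanishes. For $k=\pm1$ we have
\[
\omega_0(x,1)=\begin{cases}0 & x\notin f([-\delta,0)),\\[2pt] \dfrac{\lambda}{f'(x_0)} & \text{otherwise;}\end{cases}
\qquad
\omega_0(x,-1)=\begin{cases}0 & x\notin f((0,\delta]),\\[2pt] \dfrac{\lambda}{f'(x_0)} & \text{otherwise.}\end{cases}
\]
Respectively we have $x_0=\hat f_{t=0}^{-1}(x,1)$ and $x_0=\hat f_{t=0}^{-1}(x,-1)$. Similarly,
\[
\omega_\varepsilon(x,1)=\begin{cases}0 & x\notin\bigcup_{t\in J_\varepsilon}f_t([-\delta,0)),\\[2pt] \lambda\displaystyle\int_{J_\varepsilon}\frac{\theta_\varepsilon(t)\,dt}{f'(x_t)} & \text{otherwise,}\end{cases}
\qquad
\omega_\varepsilon(x,-1)=\begin{cases}0 & x\notin\bigcup_{t\in J_\varepsilon}f_t((0,\delta]),\\[2pt] \lambda\displaystyle\int_{J_\varepsilon}\frac{\theta_\varepsilon(t)\,dt}{f'(x_t)} & \text{otherwise.}\end{cases}
\]
Respectively we have $x_t=\hat f_t^{-1}(x,1)$ and $x_t=\hat f_t^{-1}(x,-1)$. In what follows we shall consider $k=1$, since the case $k=-1$ is similar. For small fixed $\zeta>0$ we have, by a computation similar to (18),
\[
\int_{|c_1-x|\le\zeta}\omega_0(x,1)\,dx=\lambda\bigl[x_0(c_1-\zeta,1)-x_0(c_1,1)\bigr]\le\lambda\zeta^{1/s}.
\]
Since $\omega_\varepsilon(x)$ converges uniformly to $\omega_0(x)$ on $|c_1-x|\ge\zeta$, the integral $\int_{|c_1-x|\ge\zeta}|\omega_\varepsilon-\omega_0|\,dx$ can be made arbitrarily small by taking $\varepsilon$ small. We split $\int_{|c_1-x|\le\zeta}\omega_\varepsilon(x,1)\,dx$ into a sum $W_1+W_2$, where $W_1$, $W_2$ correspond to restricting the domain of integration to $\zeta\ge|c_1-x|\ge2\varepsilon$, respectively $|c_1-x|\le\min(2\varepsilon,\zeta)$. The first item vanishes if $\zeta<2\varepsilon$, otherwise it satisfies
\[
W_1\le\int_{|c_1-x|\le\zeta}\frac{C\lambda}{f'(x_0)}\,dx\le\lambda C\zeta^{1/s},
\]
since $|c_1-x+t|\ge|c_1-x|/2$. For the second item, we have (recall (18))
\[
W_2\le\int_{|c_1-x|<\min(2\varepsilon,\zeta)}\frac{C\lambda M}{\varepsilon^{1-1/s}}\,dx\le\frac{C}{\varepsilon^{1-1/s}}\min(2\varepsilon,\zeta)\le C\min(\varepsilon^{1/s},\zeta^{1/s}).
\]
We have just proved:
\[
\lim_{\varepsilon\to0}\int_{E_1}|\omega_0-\omega_\varepsilon|\,dx=0,
\qquad
\int_{|c_1-x|\le\zeta}\omega_\varepsilon(x,1)\,dx\le C\zeta^{1/s}.
\]
Now, for levels with $|k|>2$ we can apply the same ideas contained in [BV], and the previous relations and lemmas. □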
Lemmas 4–8 are the ones needed for the proof of Theorem D. It suffices to follow the methods in [BV], together with the ideas developed here that solve the technical problems concerning the maps we are dealing with.

A remark has to be made: our construction allows one simplification concerning the differentiability of $f$. We do not require $f$ to be $C^4$ (or piecewise $C^4$) nor to be symmetric, as is required in [BV]. We can see this as follows. Since our tower extension is injective on $(-\delta,\delta)$, we do not have to make a choice between two points in $(-\delta,\delta)$ to define the cocycle. For each point $p$ in $E_k$, there is at most one point $q$ in $E_0$ that goes to $E_k$ in $|k|$ iterates. This makes it unnecessary to compare derivatives of points going to the same image in the tower. We already use this fact in Lemmas 4–8 and also in the previous sections. As an example we finish this section by stating two properties of $g^{(n)}$ and $g_t^{(n)}$ that simplify the proof of the Integral Lemma in [BV].

Let $\eta_0\in P^{(n+N)}$, and $\eta_1(\varepsilon,\eta_0)=\bigcap_{t\in J_\varepsilon^n}\eta(t,\eta_0)$. Define $\eta_2(\varepsilon,\eta_0)$ in a similar way, replacing intersection by union. Let $l=k(0)$, i.e., $l$ such that $\eta_0\subset E_l$. It is clear that given $(x,k)\in\eta_1(\varepsilon,\eta_0)\subset E_k$ and $t\in J_\varepsilon^n$, there exists exactly one point $y_t=y_t(\eta_0)$ such that $(y_t,l)\in\eta(t,\eta_0)$ and $\hat f_t^{\,n}(y_t,l)=(x,k)$.

Lemma 9. Given $(x,k)\in E_k$ we have
\[
g^{(n)}(y,l)=\frac{\lambda^{|l|-|k|}}{(f^{\,n+|l|-|k|})'(\hat f^{-|l|}(y,l))},
\]
where $\hat f^{\,n}(y,l)=(x,k)$, and we write $y$ for $y_0$.

Proof. This relation comes from our definition of $g^{(n)}$ and $\omega_0$. We only have to keep in mind that for $(x,k)\in E_k$ there is only one choice for a point $z\in B_0$ if we want it to satisfy $\hat f^{\,|k|}(z,0)=(x,k)$. The same statement holds for $(y,l)$: there exists only one point in $B_0$ (that we denote by $\hat f^{-|l|}(y,l)$) with the property that $\hat f^{\,|l|}((\hat f^{-|l|}(y,l),0))=(y,l)$. □

Now, given $n\ge1$ and $l,k\in\mathbb Z$, for $t\in J_\varepsilon^n$, $u\in J_\varepsilon^{|k|}$, and $v\in J_\varepsilon^{|l|}$, we denote $x_u=\hat f_u^{-|k|}(x,k)$ and $y_{t,v}=\hat f_v^{-|l|}(y_t,l)$. Note that there is no ambiguity in the choice of $x_u$ and $y_{t,v}$. With these definitions we have the following lemma.

Lemma 10. For $n\ge|l|-|k|$ we have
\[
\begin{aligned}
\int g_t^{(n)}(y_t,l)\,d\theta_\varepsilon(t)
&=\lambda^{|l|-|k|}\int\frac{\int\bigl((f_v^{|l|})'y_{t,v}\bigr)^{-1}d\hat\theta(v)}{\int\bigl((f_u^{|k|})'x_u\bigr)^{-1}d\hat\theta(u)}\cdot\bigl((f_t^{\,n})'y_t\bigr)^{-1}d\hat\theta(t)\\
&=\lambda^{|l|-|k|}\,\frac{\int\bigl((f_r^{|k|})'x_r\bigr)^{-1}d\hat\theta(r)}{\int\bigl((f_u^{|k|})'x_u\bigr)^{-1}d\hat\theta(u)}\cdot\int\bigl((f_s^{\,n+|l|-|k|})'y_{t,v}\bigr)^{-1}d\hat\theta(s)\\
&=\lambda^{|l|-|k|}\int\bigl((f_s^{\,n+|l|-|k|})'y_{t,v}\bigr)^{-1}d\hat\theta(s),
\end{aligned}
\]
where we denote $r=(t_{n-|k|},\dots,t_n)$ and $s=(v_1,\dots,v_{|l|},t_1,\dots,t_{n-|k|})$.

Proof. The first equality comes from the definition of $g_t^{(n)}$ and the second is obtained by rearranging terms in the integral. □
Acknowledgements. This paper was completed during a visit to IMPA – Rio de Janeiro. It is part of my Ph.D. Thesis. I am grateful to Prof. Jacob Palis, my advisor, who contributed in a significant way to achieve the final form of this work.
References

[BV] Baladi, V. and Viana, M.: Strong stochastic stability and rate of mixing for unimodal maps. Ann. Scient. E.N.S. 29-4, 483–517 (1996)
[BR] Bowen, R. and Ruelle, D.: The ergodic theory of Axiom A flows. Inv. Math. 29, 181–202 (1975)
[GW] Guckenheimer, J. and Williams, R.F.: Structural stability of Lorenz attractors. Publ. Math. IHES 50, 307–320 (1979)
[IW] Ikeda, N. and Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. Amsterdam: North-Holland/Kodansha, 1981
[KK] Katok, A. and Kifer, Y.: Random perturbations of transformations of an interval. J. de Analyse Math. 47, 193–237 (1986)
[Ki1] Kifer, Y.: Ergodic Theory of Random Perturbations. Boston-Basel: Birkhäuser, 1986
[Ki2] Kifer, Y.: Random Perturbations of Dynamical Systems. Boston-Basel: Birkhäuser, 1988
[Lo] Lorenz, E.N.: Deterministic non periodic flow. J. Atmosph. Sci. 20, 130–141 (1963)
[M] Metzger, R.: Sinai–Ruelle–Bowen measures for contracting Lorenz maps and flows. To appear in: Annales de L'I.H.P. Jour. d'Analyse
[Pa] Palis, J.: A global view of dynamics and a conjecture on the denseness of finitude of attractors. Asterisque (1998)
[Ro] Rovella, A.: The dynamics of perturbations of the contracting Lorenz Attractor. Bull. Braz. Math. Soc. 24, 233–259 (1993)
[Ru] Ruelle, D.: A measure associated with Axiom A attractors. Am. J. of Math. 98, 619–654 (1976)
[Si] Sinai, Ya.: Gibbs measure in ergodic theory. Russ. Math. Surv. 27, 21–79 (1972)
[Vi1] Viana, M.: Stochastic Dynamics of Deterministic Systems. 21o Colóquio Brasileiro de Matemática, IMPA 1997
[Vi2] Viana, M.: Dynamics: A probabilistic and geometric perspective. Doc. Math. J., Extra Volume ICM 1998
[Y] Yosida, K.: Functional Analysis. Berlin: Springer-Verlag, 1980

Communicated by Ya. G. Sinai
Commun. Math. Phys. 212, 297 – 321 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Factorization Dynamics and Coxeter–Toda Lattices

Tim Hoffmann (1), Johannes Kellendonk (1), Nadja Kutz (1), Nicolai Reshetikhin (1,2)

(1) Fachbereich Mathematik, Sekr. MA 8-5, Technische Universität Berlin, Strasse des 17. Juni 136, 10623 Berlin, Germany. E-mail: [email protected]; [email protected]; [email protected]
(2) Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 1 October 1999 / Accepted: 18 January 2000
Abstract: It is shown that the factorization relation on simple Lie groups with standard Poisson Lie structure restricted to Coxeter symplectic leaves gives an integrable dynamical system. This system can be regarded as a discretization of the Toda flow. In case of SLn the integrals of the factorization dynamics are integrals of the relativistic Toda system. A substantial part of the paper is devoted to the description of symplectic leaves in simple complex Lie groups, their Borel subgroups and their doubles.

Contents

Introduction
1. Basic Facts About Simple Poisson Lie Groups
   1.1 Basic facts about Poisson Lie groups
   1.2 Standard Poisson structure on a simple Lie group
2. Symplectic Leaves of G
   2.1 Bruhat decomposition of the double of G
   2.2 Left cosets D(G)/j(G−)
   2.3 Double cosets j(G−)\D(G)/j(G−)
   2.4 Symplectic leaves of G and double Bruhat cells
3. Symplectic Leaves of B
   3.1 B− double cosets in D(B)
   3.2 Factorization of left cosets and Darboux coordinates on symplectic leaves of B
   3.3 Coxeter symplectic leaves of B
   3.4 Symplectic leaves of B−
4. Symplectic Leaves of D(B)
   4.1 Symplectic leaves of D(B)
   4.2 Relation between symplectic leaves of B and D(B)
   4.3 Relation between symplectic leaves of D(B) and G
5. Factorization Dynamics on Poisson Lie Groups
   5.1 Dynamics of Poisson relations
   5.2 Factorization relations on Poisson Lie groups
6. Factorization Dynamics on Coxeter Symplectic Leaves
   6.1 Integrals
   6.2 Factorization map
   6.3 Real positive form
7. The Interpolating Flow and Continuous Time Nonlinear Toda Lattices
   7.1 Interpolating flow
   7.2 Linearization in a neighborhood of 1
8. Conclusion
Introduction

An integrable Hamiltonian system on a symplectic manifold consists of a Hamiltonian that generates the dynamics together with a Lagrangian fibration on the manifold such that the flow lines generated by the Hamiltonian are parallel to the fibers. Usually, the fibers are level surfaces of functions called higher integrals. The fibration by level surfaces is Lagrangian when the integrals Poisson commute, and the flow lines are parallel to the fibers when the integrals Poisson commute with the Hamiltonian. The level surfaces of the integrals are equipped with natural affine coordinates in which the dynamics is linear [Arn89, HZ94].

Integrable systems on Poisson Lie groups have the following characteristic features:
• The phase space of such a system is a symplectic manifold which is a symplectic leaf of a factorizable Poisson Lie group G.
• The level surfaces of integrals are G-orbits with respect to the adjoint action of the group on itself.

One should notice that for some symplectic leaves the G-invariant functions do not form a complete set of Poisson commuting integrals (their level sets are not Lagrangian submanifolds, but only co-isotropic). In such cases there is still a complete system of integrals, but the complementary integrals may have singularities. An example is the so-called full Toda system [Kos79, DLNT86]. Since symplectic leaves of the Poisson Lie group G are connected components of orbits of the dressing action of the dual Poisson Lie group G∗ on G, the invariant tori of such systems lie in the intersection of AdG- and G∗-orbits in G. Surprisingly enough, most of the known integrable systems on Poisson Lie groups are of this type. Such integrable systems have a Lax representation. A systematic treatment of such integrable systems was given by Semenov-Tian-Shanskii [STS85].
Linearization of this construction in a neighborhood of identity gives the similar construction based on Lie algebras which has been pioneered by Kostant [Kos79] on the example of Toda lattices and by Adler [Adl79] on the example of KdV equation. An integrable discrete dynamical system on a symplectic manifold is a symplectomorphism which acts parallel to fibers of a Lagrangian fibration given by level surfaces of integrals. More generally, it can be a Poisson relation preserving the fibration, for details see [Ves91]. In this paper we derive integrable systems related to Toda models [Tod88] (and references therein). We show that for simple Lie groups the factorization relation restricted to symplectic leaves that are associated with a Coxeter element in the Weyl group yields
a discrete integrable evolution. Such a dynamical system will be called a Coxeter–Toda lattice and the dynamics the factorization dynamics. It turns out that different choices of Coxeter element produce isomorphic integrable systems. The integrals for the factorization dynamics are, in the case of $G=SL_n$, the integrals of the so-called relativistic Toda lattice introduced in [Rui90]. (Since we will deal only with simple Lie groups with standard Poisson Lie structure we can avoid going into the general discussion of factorizable Poisson Lie groups.)

The phase space of the Coxeter–Toda lattice is the symplectic leaf mentioned above. On a Zariski open subset of such a leaf which is isomorphic to $\mathbb C^{2r}$ one can introduce coordinates $\chi_i^\pm$, $i=1,\dots,r=\operatorname{rank}G$, with the following Poisson brackets:
\[
\{\chi_i^\pm,\chi_j^\pm\}=0,\qquad\{\chi_i^+,\chi_j^-\}=-2d_iC_{ij}\,\chi_i^+\chi_j^-.
\]
Here $C_{ij}$ is the Cartan matrix of $G$ and the $d_i$ are co-prime positive integers symmetrizing it. The factorization relation restricted to a Coxeter symplectic leaf gives a symplectomorphism $\alpha$ which acts on the coordinates $\chi_i^\pm$ as follows:
\[
\alpha(\chi_i^+)=\chi_i^-,\qquad
\alpha(\chi_i^-)=\frac{(\chi_i^-)^2}{\chi_i^+}\prod_{j=1}^{r}(1-\chi_j^-)^{-C_{ji}}.
\]
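The map $\alpha$ and the quadratic Poisson brackets above lend themselves to a quick numerical sanity check. The sketch below is not taken from the paper: the function names are hypothetical, the formula for $\alpha$ is the one displayed above (as read from the text), and the Jacobian is approximated by central finite differences. It tests, at a sample point, whether $J\,P\,J^{T}$ equals the bracket matrix evaluated at the image point, which is what preservation of the brackets by $\alpha$ would mean in these coordinates.

```python
def discrete_toda_step(chi_plus, chi_minus, C):
    """One step of alpha: chi+_i -> chi-_i,
    chi-_i -> (chi-_i)^2 / chi+_i * prod_j (1 - chi-_j)^(-C[j][i])."""
    r = len(C)
    new_plus = list(chi_minus)
    new_minus = []
    for i in range(r):
        prod = 1.0
        for j in range(r):
            prod *= (1.0 - chi_minus[j]) ** (-C[j][i])
        new_minus.append(chi_minus[i] ** 2 / chi_plus[i] * prod)
    return new_plus, new_minus


def bracket_matrix(chi_plus, chi_minus, C, d):
    """Matrix of brackets in the ordering (chi+_1..r, chi-_1..r):
    {chi+_i, chi-_j} = -2 d_i C_ij chi+_i chi-_j, all others zero."""
    r = len(C)
    P = [[0.0] * (2 * r) for _ in range(2 * r)]
    for i in range(r):
        for j in range(r):
            v = -2.0 * d[i] * C[i][j] * chi_plus[i] * chi_minus[j]
            P[i][r + j] = v
            P[r + j][i] = -v
    return P


def jacobian(x, C, h=1e-6):
    """Central finite-difference Jacobian of alpha at x = chi+ ++ chi-."""
    r = len(C)
    n = 2 * r

    def F(z):
        p, m = discrete_toda_step(z[:r], z[r:], C)
        return p + m

    J = [[0.0] * n for _ in range(n)]
    for b in range(n):
        zp, zm = list(x), list(x)
        zp[b] += h
        zm[b] -= h
        Fp, Fm = F(zp), F(zm)
        for a in range(n):
            J[a][b] = (Fp[a] - Fm[a]) / (2 * h)
    return J


def poisson_defect(x, C, d):
    """Largest entry of |J P(x) J^T - P(alpha(x))|; near zero if the
    brackets are preserved at the point x."""
    r = len(C)
    n = 2 * r
    J = jacobian(x, C)
    P = bracket_matrix(x[:r], x[r:], C, d)
    p, m = discrete_toda_step(x[:r], x[r:], C)
    Q = bracket_matrix(p, m, C, d)
    worst = 0.0
    for a in range(n):
        for b in range(n):
            s = sum(J[a][k] * P[k][l] * J[b][l]
                    for k in range(n) for l in range(n))
            worst = max(worst, abs(s - Q[a][b]))
    return worst


if __name__ == "__main__":
    # Rank one (Cartan matrix (2), d = 1) and type B2 as sample inputs.
    print(poisson_defect([0.25, 0.5], [[2]], [1]))
    print(poisson_defect([0.3, 0.4, 0.2, 0.5], [[2, -1], [-2, 2]], [2, 1]))
```

The sample points are chosen with $0<\chi_j^-<1$ so that all powers are real; a small defect at a few points is only a consistency check of the formulas, not a proof of integrability.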
This symplectomorphism is integrable. We will call it the discrete Toda evolution. Its integrals have the following description in terms of characters of finite dimensional representations of $G$.

Let $x_i^-,h_i,x_i^+$ be Chevalley generators of the Lie algebra $\mathfrak g=\operatorname{Lie}(G)$ and $\varphi_i:SL_2^{(i)}\subset G$ be the natural embedding of the $SL_2$ subgroup generated by the elements $x_i^-,h_i,x_i^+$ corresponding to the simple root $\alpha_i$. For a Coxeter element $w$ of the Weyl group $W$ of $G$ fix a reduced decomposition $w=s_{i_1}\dots s_{i_r}$, where $r=\operatorname{rank}(\mathfrak g)$, and define the element of $G$
\[
g=\prod_{j=1}^{r}\Bigl(\frac{\chi_j^+}{\chi_j^-}\Bigr)^{h^j}\exp(-\chi_{i_1}^+x_{i_1}^+)\exp(x_{i_1}^-)\cdots\exp(-\chi_{i_r}^+x_{i_r}^+)\exp(x_{i_r}^-).
\]
Here $\{h^j\}_{j=1}^{r}$ are elements of the Cartan subalgebra of $\mathfrak g$ corresponding to fundamental weights, $h_i=\sum_{j=1}^{r}C_{ij}h^j$. The functions
\[
Ch_V(\chi^+,\chi^-)=\operatorname{Tr}_V(g),\tag{1}
\]
where $V$ is a finite dimensional representation of $G$, form a Poisson commutative subalgebra in the algebra of functions on the phase space. They are the integrals of the map $\alpha$. The characters of fundamental irreducible representations of $G$ generate the subalgebra of integrals.

Consider the function
\[
H_d(\chi^\pm)=\frac12(\xi,\xi),
\]
where g = exp(ξ ) and (., .) is the Killing form on Lie G. The Hamiltonian flow generated by this function interpolates the map α. For G = SLn the integrals (1) are the integrals of so-called relativistic Toda lattice [Rui90]. In a neighborhood of the identity these integrals turn into the integrals of the (usual) Toda lattice. In the same sense as a Lie algebra can be regarded as a linearization of a Lie group, the usual Toda lattices are linearizations of Coxeter–Toda lattices. Integrable discretizations of Toda lattices have been discovered by Hirota [Hir77] who studied solitonic aspects of them (see also [DJM82]). Later they were re-derived in [Sur90,Sur91b] from discrete time version of a Lax pair. The Hamiltonian interpretation based on classical r-matrices was derived in [Sur91a] and generalized to Toda systems related to all classical Lie groups (and their affine extensions). In [KR97] a discrete version of Toda field theory was described together with the Hamiltonian structure and its quantization. The role of matrix factorization in discrete integrable systems was noticed quite some time ago. The references include [Sym82, QNCvdL84, MV91, DLT89]. The primary goal of this article is not to produce new discrete integrable systems (although those related to exceptional Lie groups are new) but rather to demonstrate how the discrete Toda evolution together with its integrals (1) can be derived in a systematic way from the geometry of Poisson Lie groups, and from the factorization relation. A large part of this paper is devoted to the study of the phase space of these systems. This requires the careful study of symplectic leaves of B (a Borel subgroup in a simple algebraic Lie group G with the standard Poisson Lie structure) and of its double. In Sect. 1 we recall basic facts about Poisson Lie groups and describe the factorization dynamics on factorizable Poisson Lie groups. 
Section 2 contains the analysis of symplectic leaves of simple complex algebraic groups G with a standard Poisson Lie structure. In Sect. 3 we describe symplectic leaves of the Borel subgroup B of a simple Poisson Lie group G. Section 4 contains the description of symplectic leaves of the double of B and of how they are related to symplectic leaves of B and of G. The factorization dynamics on Coxeter symplectic leaves is studied in Sect. 6. The interpolating flow and the relation to the (usual) Toda lattices is described in Sect. 7. In the conclusion we point out what may be done next in this direction.
1. Basic Facts About Simple Poisson Lie Groups

1.1. Basic facts about Poisson Lie groups. A Poisson Lie group is a Lie group equipped with a Poisson structure which is compatible with the group multiplication. There is a functorial correspondence between connected, simply connected Poisson Lie groups and Lie bialgebras [Dri87]. The Lie bialgebra corresponding to a given Poisson Lie group is called the tangent Lie bialgebra.

The dual of a Lie bialgebra $\mathfrak p$ is the dual vector space $\mathfrak p^*$ equipped with the Lie bracket dual to the Lie cobracket of $\mathfrak p$ and with the Lie cobracket dual to the Lie bracket on $\mathfrak p$. The dual $P^*$ of a Poisson Lie group $P$ is, by definition, the connected, simply connected Poisson Lie group having the dual $\mathfrak p^*$ of the Lie bialgebra $\mathfrak p$ corresponding to $P$ as Lie bialgebra. Denote by $\mathfrak p^{*\,op}$ the Lie bialgebra $\mathfrak p^*$ with opposite cobracket (which is minus the original cobracket).

The double $D(\mathfrak p)$ of $\mathfrak p$ is the direct sum $\mathfrak p\oplus\mathfrak p^{*\,op}$ as a Lie coalgebra, and its Lie bracket is determined uniquely by the requirement that the natural inclusions $i:\mathfrak p\to D(\mathfrak p)$ and $j:\mathfrak p^{*\,op}\to D(\mathfrak p)$ (into the first and second summand, respectively) are Lie bialgebra homomorphisms and by the fact that the natural bilinear form
\[
\langle(x,l),(y,m)\rangle=m(x)+l(y)
\]
is $D(\mathfrak p)$-invariant. The double $D(P)$ of $P$ is the connected, simply connected Poisson Lie group having $D(\mathfrak p)$ as its Lie bialgebra. The maps $i$ and $j$ lift to injective Poisson maps $i:P\to D(P)$, $j:P^{*\,op}\to D(P)$ and consequently to a map $\mu\circ(i\times j):P\times P^{*\,op}\to D(P)$, $(x,y)\mapsto i(x)j(y)$, which is also a local Poisson isomorphism. By a local isomorphism we mean an isomorphism between neighborhoods of the identity.

A symplectic leaf of a Poisson manifold is an equivalence class of points which can be joined by piecewise Hamiltonian flow lines. When the Poisson manifold is a Poisson Lie group $P$, there is another description of these leaves which involves the dressing action of the dual Poisson Lie group on $P$. The Poisson Lie group $P^*$ acts on $D(P)$ via left multiplication, $y\cdot x:=j(y)x$. We also have a map $\varphi:P\to D(P)/j(P^{*\,op})$ which is the composition of $i$ with the natural projection. In a neighborhood of the identity this map $\varphi$ is a Poisson isomorphism and induces the dressing action of $P^*$ on $P$ [STS85]. The map $\varphi$ is a finite cover and has open dense range. The symplectic leaves of $P$ are orbits of the dressing action of $G^{*\,op}$ and are connected components of preimages of left $P^*$-orbits in $D(P)/j(P^{*\,op})$.

Among the cases which have been investigated we point out the following three: $P=G$ (a complex connected and simply connected simple Lie group with standard Poisson structure), $P=B$ a Borel (Poisson) subgroup of $G$, and $P=K$ the compact real form of $G$. For $P=K$, the double, which can be identified with $G$ as a real group, is globally isomorphic to $K\times K^{*\,op}$ as a real manifold via Iwasawa factorization. The map $\varphi$ in this case is a global Poisson isomorphism [LW90]. There is a particularly simple relation between the Bruhat decomposition of $K$ and its symplectic leaves [Soi90, LW90].
It is worth noticing that as the double of K the complex simple Lie group is equipped with real Poisson structure which is different from the standard Lie Poisson structure on G. In the first two cases, which are the ones we shall consider in detail below, the double is only locally isomorphic to P × P ∗ op . The symplectic leaves of G have been studied in [HL93] . Symplectic leaves for B were described in [DCKP95]. We reproduce the results of [HL93,DCKP95] below but will describe symplectic leaves in G and B more explicitly.
1.2. Standard Poisson structure on a simple Lie group. Let $G$ be a simple complex Lie group. Fix a labeling of the nodes of the Dynkin diagram associated with the Lie algebra $\mathrm{Lie}\,G$ by integers $i = 1,\dots,r = \mathrm{rank}(G)$. Assign the simple root $\alpha_i$ to the node labeled by $i$. Let $C$ be the Cartan matrix, that is,
\[ C_{ji} = 2\,\frac{(\alpha_i,\alpha_j)}{(\alpha_j,\alpha_j)}. \]
Denote by $d_i$ the length of the $i$-th simple root; then $d_iC_{ij} = d_jC_{ji}$. Fix a Borel subgroup $B\subset G$. This fixes the polarization of the root system and, together with the enumeration of the nodes of the Dynkin diagram, fixes the generators $\{h_i, x_i^{\pm}\}_{i=1,\dots,r}$ of the Lie algebra $\mathrm{Lie}\,G$ corresponding to the simple roots of $\mathrm{Lie}\,G$. The determining relations for these generators are:
\[ [h_i,h_j]=0,\qquad [x_i^+,x_j^-]=\delta_{ij}h_i,\qquad [h_i,x_j^{\pm}]=\pm C_{ij}x_j^{\pm},\qquad \mathrm{ad}(x_i^{\pm})^{1-C_{ij}}x_j^{\pm}=0,\quad i\neq j. \]
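The symmetrization identity $d_iC_{ij}=d_jC_{ji}$ can be checked on small examples. The following is a minimal sketch, not part of the original text: it assumes the normalization $d_i=(\alpha_i,\alpha_i)/2$ (the text only says "length"), and the explicit coordinates chosen for the simple roots of types $B_2$ and $G_2$ are illustrative choices.

```python
from fractions import Fraction

def inner(u, v):
    """Standard Euclidean inner product, computed exactly."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def cartan_data(simple_roots):
    """Return (C, d) with C[j][i] = 2(alpha_i, alpha_j)/(alpha_j, alpha_j).

    Assumption (not fixed by the text): d_i = (alpha_i, alpha_i)/2.
    """
    r = len(simple_roots)
    C = [[2 * inner(simple_roots[i], simple_roots[j])
          / inner(simple_roots[j], simple_roots[j])
          for i in range(r)] for j in range(r)]
    d = [inner(a, a) / 2 for a in simple_roots]
    return C, d

def is_symmetrized(C, d):
    """Check d_i C_ij == d_j C_ji for all i, j."""
    r = len(d)
    return all(d[i] * C[i][j] == d[j] * C[j][i]
               for i in range(r) for j in range(r))

# Illustrative realizations of simple roots (hypothetical coordinate choices).
B2_ROOTS = [(1, -1), (0, 1)]
G2_ROOTS = [(1, -1, 0), (-2, 1, 1)]

C_B2, d_B2 = cartan_data(B2_ROOTS)
C_G2, d_G2 = cartan_data(G2_ROOTS)
```

Exact rational arithmetic is used so that the identity is tested as an equality rather than up to rounding.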
The standard Lie bialgebra structure on $\mathrm{Lie}\,G$ compatible with the chosen Borel subgroup $B$ is given by the cobracket acting on generators as follows:
\[ \delta(h_i)=0,\qquad \delta(x_i^{\pm}) = d_i\,x_i^{\pm}\wedge h_i. \]
This induces the Poisson Lie structure on $G$ for which the Lie bialgebra described above is the tangent Lie bialgebra. The Borel subgroup $B$ and its opposite $B^-$ are Poisson Lie subgroups. The Lie bialgebra $\mathrm{Lie}(G)$ is isomorphic to the double of the Lie bialgebra $\mathrm{Lie}(B)$ quotiented by the diagonally embedded Cartan subalgebra [Dri87]. We denote by $N$ and $N^-$ the nilpotent subgroups of $B$ and $B^-$, respectively. Since $H = B\cap B^-$ we have two natural projections and isomorphisms $\theta: B\to B/N\cong H$ and $\theta^-: B^-\to B^-/N^-\cong H$. We shall also write $B^+$ and $N^+$ for $B$ and $N$, respectively.

2. Symplectic Leaves of G

2.1. Bruhat decomposition of the double of G. A simple Lie group $G$ with fixed Borel subgroup $B$ admits the Bruhat decomposition with respect to $B$:
\[ G=\bigsqcup_{w\in W} BwB. \]
Here $BwB \stackrel{\mathrm{def}}{=} B\dot w B$, where $\dot w$ is a representative of $w\in N_G(H)/H$ in $N_G(H)$ (clearly $B\dot wB$ depends only on the class $w\in N_G(H)/H$). There is also a Bruhat decomposition of $G$ with respect to $B^-$:
\[ G=\bigsqcup_{w} B^-wB^-. \]
Recall [KS98] that the double $D(G)$ is, as a group, isomorphic to $G\times G$. The cell decompositions of $G$ therefore give the Bruhat decomposition of $D(G)$ with respect to $D^- = B^-\times B$:
\[ D(G)=\bigsqcup_{(w_1,w_2)\in W\times W} D^-(w_1,w_2)D^-,\qquad D^-(w_1,w_2)D^- = B^-w_1B^-\times Bw_2B, \]
where $W\times W = N_{D(G)}(H\times H)/(H\times H)$ is the Weyl group of $D(G)$. We can also represent $D^-\subset D(G)$ as $D^- = (H\times H)(N^-\times N^+) = (N^-\times N^+)(H\times H)$. Then for the Bruhat cell $D^-(w_1,w_2)D^-$ we can write
\[ D^-(w_1,w_2)D^- = (N^-_{w_1}\times N^+_{w_2})(H\times H)(\dot w_1,\dot w_2)D^-, \tag{2} \]
where $N^{\pm}_w = \{n\in N^{\pm}\mid \dot w^{-1}n\dot w\in N^{\mp}\}$ (clearly this definition of $N^{\pm}_w$ does not depend on the choice of $\dot w$).
2.2. Left cosets $D(G)/j(G^-)$. Let $G^- = G^{*\mathrm{op}}$, which may be identified with $\{(b^-,b)\in B^-\times B\mid \theta^-(b^-)=\theta(b)^{-1}\}$, a subgroup of $B^-\times B$ [KS98]. We write $j: G^-\hookrightarrow B^-\times B$ for this identification. There is a natural isomorphism:
\[ D^-/j(G^-)\simeq H. \tag{3} \]
The group $H\times H$ acts on cosets $(\dot w_1,\dot w_2)D^-/j(G^-)$ by left multiplication:
\[ \begin{aligned} (h,h')(\dot w_1,\dot w_2)(b^-,b)\,j(G^-) &= (\dot w_1,\dot w_2)\big(h_{w_1}b^-,\ h'_{w_2}b\big)\,j(G^-)\\ &= (\dot w_1,\dot w_2)\big(\mathrm{Ad}_{h_{w_1}}b^-,\ h'_{w_2}h_{w_1}(\mathrm{Ad}_{h_{w_1}^{-1}}b)\big)\,j(G^-)\\ &= (\dot w_1,\dot w_2)\big(\theta^-(b^-),\ h'_{w_2}h_{w_1}\theta(b)\big)\,j(G^-). \end{aligned} \]
Here $(b^-,b)\in B^-\times B$ and we write $h_w = \dot w^{-1}h\dot w$. Using also (3) we conclude that this action has stationary subgroup
\[ H^{w_1,w_2} = \{(h,h')\in H\times H\mid h_{w_1} = {h'_{w_2}}^{-1}\}. \tag{4} \]
Thus, we have an isomorphism $D^-(w_1,w_2)D^-/j(G^-)\cong N^-_{w_1}\times N^+_{w_2}\times H$ and, in particular, $\dim\big(D^-(w_1,w_2)D^-/j(G^-)\big) = l(w_1)+l(w_2)+r$.

2.3. Double cosets $j(G^-)\backslash D(G)/j(G^-)$. For double cosets, we have
\[ j(G^-)(n^-_{w_1},n^+_{w_2})(h_1,h_2)(\dot w_1,\dot w_2)(b^-,b)\,j(G^-) = j(G^-)(\tilde h_1,\tilde h_2)(\dot w_1,\dot w_2)\,j(G^-), \]
where $\tilde h_1 = h_1\,\theta^-(b^-)_{w_1^{-1}}$, $\tilde h_2 = h_2\,\theta^-(b^-)_{w_2^{-1}}$. The set of such double cosets is, according to (4), isomorphic to
\[ j(H)\backslash H\times H/j_{w_1w_2}(H), \tag{5} \]
where $j(H)\subset H\times H$ is the subgroup that consists of elements $(h,h^{-1})$, $h\in H$, and $j_{w_1,w_2}(h) = (h_{w_1},h^{-1}_{w_2})$. The coset of $(h_1,h_2)\in H\times H$ in (5) is the set $\{(h,h^{-1})(1,\,h_1h_2\,h''\,{h''}^{-1}_{w_2^{-1}w_1})\mid h,h''\in H\}$. Thus (5) is isomorphic to $H_{w_2^{-1}w_1}$, where $H_w$ is the space of $H$-orbits on $H$ with respect to the action
\[ h: h'\to h'hh_w^{-1}. \tag{6} \]
All orbits are naturally isomorphic and we denote the one through 1 by $H^w$. Furthermore, $H_w$ is isomorphic to $\ker(w_2^{-1}w_1-\mathrm{id}) = \{h\in H\mid h_{w_2^{-1}w_1}=h\}$. Thus, we proved:

Proposition 1. We have an isomorphism
\[ j(G^-)\backslash D^-(w_1,w_2)D^-/j(G^-)\simeq H_{w_2^{-1}w_1}. \tag{7} \]
Each $j(G^-)$ orbit corresponding to an element of this set is isomorphic to
\[ N^-_{w_1}\times N^+_{w_2}\times H^{w_2^{-1}w_1}. \tag{8} \]
In particular, each such orbit has the dimension $\ell(w_1)+\ell(w_2)+\dim(\mathrm{coker}(w_2^{-1}w_1-1))$.
Notice that the isomorphism (7) and the isomorphisms between $j(G^-)$-orbits and sets (8) are not canonical but depend on the choice of representatives $\dot w_1,\dot w_2$. What we really have here is the fiber bundle
\[ D^-(w_1,w_2)D^-/j(G^-)\to j(G^-)\backslash D^-(w_1,w_2)D^-/j(G^-) \tag{9} \]
over the torus $H_{w_2^{-1}w_1}$ whose fibers are $j(G^-)$-orbits.
2.4. Symplectic leaves of G and double Bruhat cells. Double Bruhat cells are defined as intersections of $B$-Bruhat cells and $B^-$-Bruhat cells: $G^{w_1,w_2} = B^-w_1B^-\cap Bw_2B$. It is known that $\dim(G^{w_1,w_2}) = l(w_1)+l(w_2)+r$ (see, for example, [FZ99]).
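As a sanity check on the dimension formula $l(w_1)+l(w_2)+r$, one can specialize to $G = SL_n$, where the Weyl group is the symmetric group $S_n$, the length of a permutation is its number of inversions, and $r = n-1$. The sketch below only evaluates the formula; the function names are illustrative and not taken from the paper. For the longest element $w_0$ it returns $n^2-1=\dim SL_n$, as expected for the open dense cell.

```python
from itertools import permutations

def length(w):
    """Coxeter length of a permutation in one-line notation = number of inversions."""
    n = len(w)
    return sum(1 for a in range(n) for b in range(a + 1, n) if w[a] > w[b])

def double_cell_dim(w1, w2):
    """Evaluate l(w1) + l(w2) + r for SL_n, where n = len(w1) and r = n - 1."""
    r = len(w1) - 1
    return length(w1) + length(w2) + r

n = 3
identity = tuple(range(n))
w0 = tuple(range(n - 1, -1, -1))  # longest element of S_n

# Dimensions for all pairs (w1, w2) in S_3 x S_3.
all_dims = {(w1, w2): double_cell_dim(w1, w2)
            for w1 in permutations(range(n))
            for w2 in permutations(range(n))}
```

The smallest value, $r$, is attained at $w_1=w_2=e$, where the double Bruhat cell is the Cartan subgroup $H = B^-\cap B$.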
Let $\varphi: G\stackrel{i}{\hookrightarrow} D\to D/j(G^-)$ be the composition of the diagonal embedding with the natural projection. According to (2) we have $D^-(w_1,w_2)D^-/j(G^-)\cong(N^-_{w_1}\times N^+_{w_2})(\dot w_1,\dot w_2)\,i(H)$. Define
\[ \Gamma := \{\varepsilon\in H\mid \varepsilon^2=1\}. \]
Theorem 1.
1. $\varphi(G^{w_1,w_2})\subset D^-(w_1,w_2)D^-/j(G^-)$.
2. The image of $\varphi$ is Zariski open in $D^-(w_1,w_2)D^-/j(G^-)$.
3. For each $x\in\mathrm{Im}\,\varphi$ the group $\Gamma$ acts by left translations on $\varphi^{-1}(x)$.
4. The restriction of the map $\varphi$ to $G^{w_1,w_2}$ is a covering map with the group of deck transformations $\Gamma$.

Here is the outline of the proof. Let $g = n^-_{w_1}\dot w_1b^- = n^+_{w_2}\dot w_2b^+\in B^-w_1B^-\cap Bw_2B$, where $n^-_{w_1}\in N^-_{w_1}$, $n^+_{w_2}\in N^+_{w_2}$ and $b^{\pm}\in B^{\pm}$. Then we have
\[ \varphi(g) = (g,g)\,j(G^-) = (n^-_{w_1}\dot w_1b^-,\ n^+_{w_2}\dot w_2b^+)\,j(G^-). \]
Therefore $\varphi(g)$ is an element of $D^-(w_1,w_2)D^-/j(G^-)$.

Conversely, assume $x_1 = n^-_{w_1}\dot w_1b^-$ and $x_2 = n^+_{w_2}\dot w_2b^+$, and consider the class $(x_1,x_2)\,j(G^-)$. This class has a representative of the form $(g,g)\,j(G^-)$ if and only if there exists $(\eta^+,\eta^-)\in G^-$ such that $n^-_{w_1}\dot w_1\eta^- = n^+_{w_2}\dot w_2\eta^+$. According to [FZ99] such elements exist when $(n^-_{w_1},n^+_{w_2})$ belongs to an open dense subset of $N^-_{w_1}\times N^+_{w_2}$. Therefore the image of $\varphi$ is open dense in $D^-(w_1,w_2)D^-/j(G^-)$.

Furthermore $\varphi(g\varepsilon) = (g\varepsilon,g\varepsilon^{-1})\,j(G^-) = \varphi(g)$ for each $\varepsilon\in\Gamma$. This shows that $\Gamma$ acts (fixed point freely) on the preimages of points. Since $i(\Gamma) = i(H)\cap j(H)$ is the kernel of $\varphi$, $\Gamma$ is the group of deck transformations for the covering map $\varphi: G^{w_1,w_2}\to D^-(w_1,w_2)D^-/j(G^-)$.

Since the symplectic leaves in $G$ are connected components of preimages of $j(G^-)$-orbits in $D(G)/j(G^-)$ we obtain the following description of leaves.

Corollary 1. Connected components of preimages of $G^-$-orbits in $D^-(w_1,w_2)D^-/j(G^-)$ with respect to the map $\varphi$ are symplectic leaves of $G$ which belong to the double Bruhat cell $G^{w_1,w_2}$.
3. Symplectic Leaves of B

3.1. $B^-$ double cosets in $D(B)$. The double of a Borel subgroup $B$ of $G$ is isomorphic to $G\times H$ as a group (for the details see for example [KS98]). Furthermore, $B^{*\mathrm{op}}\cong B^-$, sitting inside $G\times H$ as $j: B^-\to G\times H$, $j(b^-) = (b^-,\theta^-(b^-))$. In particular, $D(B)$ has the following cell decompositions:
\[ D(B) = G\times H = \bigsqcup_{w\in W} B^-wB^-\times H = \bigsqcup_{w\in W} BwB\times H. \tag{10} \]
Denote $D(B)_w = B^-wB^-\times H$ and $D(B)^w = BwB\times H$. For the quotient $D(B)/j(B^-)$ we have
\[ D(B)/j(B^-) = \bigsqcup_{w\in W}(B^-wB^-\times H)/j(B^-)\cong\bigsqcup_{w\in W} N^-_w\times H. \]
Let us compute double cosets:
\[ \begin{aligned} j(B^-)(b^-\dot w\tilde b^-,h)\,j(B^-) &= j(B^-)(hb^-\dot w\tilde b^-,1)\,j(B^-) = j(B^-)(hh^-\dot w\tilde h^-,1)\,j(B^-)\\ &= j(B^-)(\dot wh',1)\,j(B^-)\simeq j(H)(\dot wh',1)\,j(H). \end{aligned} \]
Clearly $j(h) = (h,h^{-1})\in H\times H\subset G\times H$. Therefore we have the isomorphism $j(B^-)\backslash D(B)_w/j(B^-)\cong H_w$, where, we recall, $H_w$ is the space of $H$-orbits on $H$ for the action (6). In particular, $\dim(H_w) = \dim(\ker(w-\mathrm{id}))$.

Choose a point $(n^-_w\dot w,h)$ in $D(B)$ representing an equivalence class in $D(B)/j(B^-)$. The left $j(B^-)$ orbit passing through this point is the set of elements
\[ \begin{aligned} &\{(b^-n^-_w\dot w,\ \theta(b^-)^{-1}h)\,j(B^-),\ b^-\in B^-\}\\ &\quad=\{(\tilde n^-_w\dot w\,\theta(b^-)_w,\ \theta(b^-)^{-1}h)\,j(B^-)\mid b^-\in B^-,\ b^-n^-_w=\tilde n^-_w\theta^-(b^-)\}\\ &\quad=\{(\tilde n^-_w\dot w,\ \theta(b^-)^{-1}\theta(b^-)_wh)\,j(B^-)\mid b^-\in B^-,\ b^-n^-_w=\tilde n^-_w\theta^-(b^-)\}. \end{aligned} \]
Thus, we proved the following

Proposition 2.
1. $j(B^-)\backslash D(B)_w/j(B^-)\cong(\mathbb{C}^{\times})^{\dim(\ker(w-\mathrm{id}))}$.
2. Each orbit is isomorphic to $N^-_w\times H^w$.

Similarly to the case of $G$, the isomorphisms are not canonical but we have a fiber bundle $D(B)_w/j(B^-)\to j(B^-)\backslash D(B)/j(B^-)$ whose fibers are the $j(B^-)$ orbits. For $w\in W$ define the subset $B_w = B\cap B^-wB^-$. According to the general theory, symplectic leaves of $B$ are connected components of preimages of $j(B^-)$-orbits in $D(B)/j(B^-)$ with respect to the map $\varphi: B\subset D(B)\to D(B)/j(B^-)$.

Theorem 2.
1. The subset $\varphi(B_w)\subset BwB\times H/j(B^-)$ is Zariski open.
2. For each $x\in\mathrm{Im}\,\varphi$ the group $\Gamma$ acts freely on $\varphi^{-1}(x)$ by left translations.
3. The restriction of $\varphi$ to $B_w$ is a covering map $B_w\to D(B)_w/j(B^-)$ with the group of deck transformations isomorphic to $\Gamma$.
3.2. Factorization of left cosets and Darboux coordinates on symplectic leaves of B. Fix a reduced decomposition of $w\in W$, $w = s_{i_1}\dots s_{i_{\ell(w)}}$, where $\ell(w)$ is the length of $w$. Consider the subset $B_{i_1,\dots,i_{\ell(w)}} = B_{s_{i_1}}\dots B_{s_{i_{\ell(w)}}}\subset B$ which is the image of $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$ under the multiplication in $G$. Here $B_{s_i} = B(i)\cap B^-(i)\,s_i\,B^-(i)$ and $B(i) = B\cap SL_2(i)$ is the intersection of the Borel subgroup in $G$ and of the $SL_2$-subgroup corresponding to the $i$-th simple root.

For $w\in W$ define the numbers of "repetitions" $n_i = \#\{\text{occurrences of } i \text{ in the sequence } (i_1,\dots,i_{\ell(w)})\}$, and define the support of $w$ as $I(w) = \{i\mid 1\le i\le r,\ n_i\neq0\}$. If $n_i\ge1$ consider the following action of $(\mathbb{C}^{\times})^{n_i-1}$ on $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$:
\[ \begin{aligned} (x_1,\dots,x_{n_i-1}) : (b_{i_1},\dots,b_{i_{\ell(w)}})\mapsto\big(&\dots,b_i\phi_i(x_1),\dots,\mathrm{Ad}_{\phi_i(x_1)}(b_j),\dots,\phi_i(x_1)^{-1}b_i\phi_i(x_2),\dots,\\ &\mathrm{Ad}_{\phi_i(x_2)}(b_k),\dots,\phi_i(x_2)^{-1}b_i\phi_i(x_3),\dots\big). \end{aligned} \tag{11} \]
Here $\phi_i:\mathbb{C}^{\times}\hookrightarrow SL_2\hookrightarrow G$ is the composition of the embedding of $\mathbb{C}^{\times}$ into $SL_2$ as the (complex) Cartan subgroup and of $SL_2$ into $G$ as the $i$-th $SL_2$-triple. It is clear that for different $n_i$, $n_j$, both greater than 1, the corresponding actions commute, so that $w$ gives rise to an action of the torus $J$, the product of all $(\mathbb{C}^{\times})^{n_i-1}$ over $i$ with $n_i>1$.

Proposition 3. The multiplication map $B_{s_{i_1}}\times\dots\times B_{s_{i_{\ell(w)}}}\to B_{i_1,\dots,i_{\ell(w)}}$ commutes with the $J$-action, assuming $J$ acts trivially on $B_{i_1,\dots,i_{\ell(w)}}$, and establishes an isomorphism $B_{i_1,\dots,i_{\ell(w)}}\simeq(B_{s_{i_1}}\times\dots\times B_{s_{i_{\ell(w)}}})/J$.

Here is the outline of the proof. We can choose the elements of $(\mathbb{C}^{\times})^{n_i-1}$ in such a way that the Cartan parts of the elements $b_i$ of $(b_{i_1},\dots,b_{i_{\ell(w)}})$ will all be trivial except one. If we do this for each $i\in I(w)$ we obtain a cross-section of the action of $J$. Then it quickly follows that this cross-section is a birational isomorphism.

The support $I(w)$ of $w$ naturally defines a sub-diagram of the Dynkin diagram of $G$ (by deleting all nodes not in $I(w)$) and hence a subgroup of $G$. Let $B'_w$ be the image in $G$ of the Bruhat cell corresponding to $w$ in this subgroup. Then multiplication provides an isomorphism between $B'_w\times H(w)$ and $B_w$, where $H(w)$ is the subgroup of $H$ corresponding to the simple roots $\alpha_i$ with $i\notin I(w)$. The following is known (see for example [FZ99]).

Theorem 3.
• For each $w\in W$ with fixed reduced decomposition the set $B_{i_1,\dots,i_{\ell(w)}}$ is Zariski open in $B'_w$.
• For each two reduced decompositions $w = s_{i_1}\dots s_{i_{\ell(w)}}$ and $w = s_{j_1}\dots s_{j_{\ell(w)}}$ there is a birational isomorphism between $B_{i_1,\dots,i_{\ell(w)}}$ and $B_{j_1,\dots,j_{\ell(w)}}$.
Let us describe the symplectic leaves of $B_w$ more explicitly, using results of the previous subsection. There is a natural coordinate system in a neighborhood of the identity of the subgroup $B(i)$ in which the group elements are written as
\[ \exp(a_ih_i+b_ix_i^+) = \exp(a_ih_i)\exp(b'_ix_i^+),\qquad\text{where } b'_i = e^{-a_i}\,\frac{b_i}{a_i}\,\sinh(a_i). \]
The corresponding global coordinates on $B(i)$ are $A_i = e^{a_i}$, $B_i = b_i\,\frac{\sinh(a_i)}{a_i}$. In these coordinates the above element is represented by the $2\times2$ matrix $\begin{pmatrix}A_i&B_i\\0&A_i^{-1}\end{pmatrix}$ in the two-dimensional representation of $SL_2$. The subgroup $B(i)$ is a Poisson Lie subgroup in $SL_2(i)$ with the following Poisson brackets between coordinate functions:
\[ \{A_i,B_i\} = -d_iA_iB_i. \]
Here and below we will abuse notation and denote coordinates and coordinate functions by the same letters. The symplectic leaves of $B(i)$ are one 2-dimensional leaf $B_{s_i} = \{A_i,B_i\mid A_i\in\mathbb{C}^{\times},\ B_i\in\mathbb{C}^{\times}\}$, and a 1-dimensional family of zero-dimensional leaves $\{A_i=t,\ B_i=0\}$.

The product $B_{s_{i_1}}\times\dots\times B_{s_{i_{\ell(w)}}}$ carries the natural product symplectic structure. Since the multiplication map is Poisson, the submanifold $B_{i_1,\dots,i_{\ell(w)}}\subset B'_w$ is a Poisson submanifold. According to Theorem 3, $B_{i_1,\dots,i_{\ell(w)}}$ is Zariski open in $B'_w$, which implies that the symplectic leaves of $B_{i_1,\dots,i_{\ell(w)}}$ are Zariski open sub-varieties in the symplectic leaves of $B$. The following result, combined with the product Poisson structure on $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$, allows one to describe symplectic leaves of $B_{i_1,\dots,i_{\ell(w)}}$ explicitly via Hamiltonian reduction.

Proposition 4. The action (11) of $J = (\mathbb{C}^{\times})^{\ell(w)-|I(w)|}$ on $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$ is Hamiltonian. Here $|I(w)|$ is the cardinality of the support of $w$.

The Hamiltonians generating this action can be constructed explicitly as linear functions of $\log A$ and $\log B$. They do not commute with respect to the Poisson brackets. This means that the pull-back of the moment map maps the Poisson algebra of functions on $B_{i_1}\times\dots\times B_{i_{\ell(w)}}$ to the Poisson algebra of functions on a hyperplane in the vector space dual to a central extension of the Lie algebra $\mathfrak{j}$. The symplectic leaves of the quotient space $(B_{s_{i_1}}\times\dots\times B_{s_{i_{\ell(w)}}})/J$ are preimages of the corresponding coadjoint orbits (of the centrally extended Lie algebra $\mathfrak{j}$) with respect to the moment map. In other words the symplectic leaves of these quotient spaces can be obtained via Hamiltonian reduction. We will leave the details of this Hamiltonian reduction to a separate publication.

Below we will consider symplectic leaves corresponding to Coxeter elements. In this case all $n_i = 1$ and so $J$ is trivial. This means that for these symplectic leaves the coordinates described above are global.
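The bookkeeping behind the torus $J$ is easy to make concrete. The sketch below, with hypothetical function and variable names, computes the repetition numbers $n_i$, the support $I(w)$ and the rank $\ell(w)-|I(w)|$ of $J$ from a word in the simple reflections. It assumes the word passed in is already reduced; reducedness is not checked.

```python
from collections import Counter

def repetition_data(word, rank):
    """Return (n, support, torus_rank) for a word (i_1, ..., i_l) in {1, ..., rank}.

    n[i]       : number of occurrences of i in the word
    support    : I(w) = {i : n_i != 0}
    torus_rank : sum over the support of (n_i - 1) = len(word) - |I(w)|
    The word is assumed (not verified) to be a reduced decomposition.
    """
    counts = Counter(word)
    n = {i: counts.get(i, 0) for i in range(1, rank + 1)}
    support = sorted(i for i in n if n[i] != 0)
    torus_rank = sum(n[i] - 1 for i in support)
    return n, support, torus_rank

# Longest element of the Weyl group of type A_2: w = s_1 s_2 s_1.
n_long, support_long, rank_long = repetition_data((1, 2, 1), rank=2)

# A Coxeter element of type A_3: every simple reflection appears exactly once.
n_cox, support_cox, rank_cox = repetition_data((2, 1, 3), rank=3)
```

For the Coxeter word the torus rank is zero, which matches the statement that $J$ is trivial in that case.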
3.3. Coxeter symplectic leaves of B. An element of the Weyl group $W$ is called a Coxeter element if its reduced decomposition into the product of simple reflections $w = s_{i_1}\dots s_{i_{l(w)}}$ does not have repetitions in the sequence of sub-indices and if $l(w) = r$ (i.e. in this product each generator of $W$ appears exactly once). It is not difficult to see that if $w$ is a Coxeter element, $\dim(\mathrm{coker}(w-\mathrm{id}))$ is $r$ and therefore the subset $B_w$ is a symplectic leaf of $B$. We will call such leaves Coxeter symplectic leaves. Let $U_i: B_{s_i}\hookrightarrow B$ be the natural inclusion of $B_{s_i}\subset B(i)$ into $B$. Then any element of $B_{i_1,\dots,i_{\ell(w)}}$ can be written as $U_{i_1}(A_{i_1},B_{i_1})\cdots U_{i_r}(A_{i_r},B_{i_r})$. Thus, for Coxeter symplectic leaves $A_i$, $B_i$ (more precisely, their logarithms) are Darboux coordinates.

3.4. Symplectic leaves of $B^-$. Symplectic leaves of $B^-$ can be described similarly to how it was done for $B$. They can also be obtained from the ones for $B$, since $B$ is anti-isomorphic to $B^-$ as a Poisson manifold (there is an isomorphism of groups which maps one Poisson tensor to the negative of the other). Let $C_i$, $D_i$ be coordinates on the lower triangular part of $SL_2(i)$ in which group elements are represented by matrices $L_i(D_i,C_i) = \begin{pmatrix}D_i&0\\C_i&D_i^{-1}\end{pmatrix}$ in the two-dimensional irreducible representation of $SL_2$. These coordinate functions have the following Poisson brackets:
\[ \{D_i,C_i\} = d_iD_iC_i. \]
Denote by $B^-_{s_i}$ the sub-variety of the lower triangular part of $SL_2(i)$ where $C_i\neq0$. Fix the Coxeter element $w\in W$ and its reduced decomposition $w = s_{i_1}\dots s_{i_r}$. On a Zariski open subset of the Coxeter symplectic leaf $B^-_w$ one can introduce the natural coordinates $C_i$, $D_i$, $i = 1,\dots,r$. Every element of this subset can be written as $L_{i_r}(D_{i_r},C_{i_r})\dots L_{i_1}(D_{i_1},C_{i_1})$, where $L_i: B^-_{s_i}\subset B^-$ are the natural inclusions.

4. Symplectic Leaves of D(B)

4.1. Symplectic leaves of $D(B)$. As above, let us identify $D(B)$ with $G\times H$ as a group. The Poisson structure on $D(B) = G\times H$ is not the product structure.
Symplectic leaves of $D(B)$ can be described similarly to how it was done for $G$. Since $D(B)$ is a factorizable Poisson Lie group, $D(D(B))\simeq D(B)\times D(B)$. Fix this isomorphism together with the identification $D(B) = G\times H$. This gives the following cell decomposition for $D(D(B))$:
\[ D(B)\times D(B) = \bigsqcup_{w_1,w_2}(B^-w_1B^-\times H)\times(Bw_2B\times H). \]
The Poisson Lie group D − (B) = D(B)∗op can naturally be identified with B − × B.
Let $D(B)_w$ and $D(B)^w$ be the Bruhat cells of $D(B)$ defined in (10). The double cosets $j(D^-(B))\backslash D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B))$ can be computed similarly to Proposition 1:
\[ j(D^-(B))\backslash D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B))\simeq H_{w_1}\times H_{w_2}. \]
The $D^-(B)$-orbit passing through the coset class of $((\dot w_1,h_1),(\dot w_2,h_2))\in D(B)_{w_1}\times D(B)^{w_2}$ is isomorphic to
\[ (N^-_{w_1}\times H^{w_1})\times(N^+_{w_2}\times H^{w_2}). \tag{12} \]
Notice that $j(D^-(B))$-orbits in $D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B))$ are isomorphic to the product of the corresponding orbits for $B$ and for $B^-$. Again we have a natural fiber bundle
\[ (D(B)_{w_1}\times D(B)^{w_2})/j(D^-(B))\to j(D^-(B))\backslash D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B)) \]
and $j(D^-(B))$-orbits are fibers of this bundle.

Symplectic leaves of $D(B)$ are connected components of preimages of $j(D^-(B))$-orbits under the map $\varphi: D(B)\to(D(B)\times D(B))/j(D^-(B))$. Symplectic leaves whose image is a $j(D^-(B))$-orbit in $D(B)_{w_1}\times D(B)^{w_2}/j(D^-(B))$ will be denoted by $S_{w_1,w_2}$.

4.2. Relation between symplectic leaves of $B$ and $D(B)$. The embeddings $i: B\hookrightarrow D(B)$ and $j: B^-\hookrightarrow D(B)$ combined with the multiplication and inversion in $D(B)$ give rise to the map
\[ I: B\times B^-\to D(B) = G\times H,\qquad I(b,b^-) = \big(b(b^-)^{-1},\ \theta(b)\theta^-(b^-)\big), \tag{13} \]
which is most important to define the factorization relation, see below. The image of this map is Zariski open in $D(B)$. This map is Poisson and therefore it maps symplectic leaves of $B\times B^-$ to symplectic leaves of $D(B)$. The intersection of the image of $I$ and of any of the symplectic leaves is Zariski open in this leaf. This "explains" the formula (12).

4.3. Relation between symplectic leaves of $D(B)$ and $G$. The Cartan subgroup $H$ acts naturally on $B\times B^-$ by diagonal multiplication from the right,
\[ h(b,b^-) = (bh,b^-h). \tag{14} \]
The following is clear.

Lemma 1. The map $I$ commutes with the $H$-action, $I(bh,b^-h) = I(b,b^-)(1,h^2)$, and induces a Poisson map $\tilde I$ between the corresponding cosets:
\[ \tilde I:(B\times B^-)/H\longrightarrow G\cong D(B)/H. \]
Here the coset is taken with respect to the action of $H$ on $D(B)$ by multiplication by $(1,h^2)$ from the right.
It is also clear that the image of $\tilde I$ is Zariski open in $G$ and that $\tilde I$ is a birational isomorphism. Since the action (14) is Hamiltonian, generic symplectic leaves of $(B\times B^-)/H$ can be obtained via Hamiltonian reduction from generic symplectic leaves of $B\times B^-$. Therefore, symplectic leaves of $G$ can be obtained via Hamiltonian reduction from symplectic leaves of $D(B)$.

Symplectic leaves of $G$ can also be described via Hamiltonian reduction similarly to how it was done for symplectic leaves of $B$ in Sect. 4.1. For this consider two elements $u,v\in W$ and fix their reduced decompositions $u = s_{i_1}\dots s_{i_l}$, $v = s_{j_1}\dots s_{j_m}$. Consider the image of
\[ B_{s_{i_1}}\times\dots\times B_{s_{i_l}}\times B^-_{s_{j_m}}\times\dots\times B^-_{s_{j_1}} \]
under the multiplication and inverse map:
\[ G_{i_1,\dots,i_l,j_1,\dots,j_m} = B_{s_{i_1}}\dots B_{s_{i_l}}\,(B^-_{s_{j_1}})^{-1}\dots(B^-_{s_{j_m}})^{-1}. \]
The double Bruhat cell $G^{u,v}$ has a natural decomposition $G^{u,v} = G'^{\,u,v}\times H(u,v)$, where $H(u,v)$ is the subgroup of $H$ generated by elements corresponding to simple roots which do not belong to $I(u)\cup I(v)$. It follows from [FZ99] that the variety $G_{i_1,\dots,i_l,j_1,\dots,j_m}$ is birationally isomorphic to $G'^{\,u,v}$. On the other hand it is also isomorphic to the quotient of
\[ B_{s_{i_1}}\times\dots\times B_{s_{i_l}}\times B^-_{s_{j_m}}\times\dots\times B^-_{s_{j_1}} \]
with respect to the appropriate Hamiltonian toric action (see Sect. 4.1). This allows one to construct all symplectic leaves of $G$ via Hamiltonian reduction. We will leave the details of this construction for another publication.

5. Factorization Dynamics on Poisson Lie Groups

5.1. Dynamics of Poisson relations. Here we will recall basic facts about Poisson relations and their dynamics. Let $(M,p)$ be a Poisson manifold with the Poisson tensor $p\in\wedge^2TM$. Denote by $p^{(2)}\in\wedge^2T(M\times M)$ the Poisson tensor corresponding to the following product of Poisson manifolds: $(M,-p)\times(M,p)$.

A smooth relation of finite type on a manifold $M$ is a submanifold $R\subset M\times M$ such that the natural projections $\pi_1,\pi_2: M\times M\to M$, $\pi_1(x,y) = x$, $\pi_2(x,y) = y$, restricted to $R$, have a finite number of preimages. Denote by $T^{\perp}R$ the forms on $M\times M$ which vanish on $TR\subset T(M\times M)$. A smooth relation on a Poisson manifold $M$ is called a Poisson relation if $p^{(2)}|_{T^{\perp}R} = 0$ and $\dim(R) = \dim(M)$. If a relation $R = \{(x,\phi(x))\mid x\in M\}$ is the graph of a map $\phi: M\to M$, it is Poisson if and only if $\phi$ is a Poisson map.

An $n$-th iteration of a relation $R$ on $M$ is a submanifold $R^{(n)}\subset M^{\times(n+1)}$ such that $R^{(n)} = \{(x_1,\dots,x_{n+1})\mid x_i\in M,\ (x_i,x_{i+1})\in R\subset M\times M\}$. A function $F\in C^{\infty}(M)$ is called an integral of a smooth relation $R\subset M\times M$ if $F(x) = F(y)$ for all $(x,y)\in R$.
A smooth relation on a symplectic manifold is Poisson if and only if it is a Lagrangian submanifold in $M\times M$ (equipped with the product symplectic structure). It is called integrable if there exist $n$ independent Poisson commuting functions $I_1,\dots,I_n$ which are integrals of $R$. Similarly one can define Poisson and symplectic relations in an algebro-geometric setting. For more details about the dynamics of symplectic relations see [Ves91].
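The notions of an $n$-th iteration and of an integral of a relation can be illustrated on a deliberately simple toy example that is not taken from the paper: on $M=\mathbb{R}\setminus\{0\}$ take $R=\{(x,y)\mid y^2=x^2\}$, the disjoint union of the graphs of $x\mapsto x$ and $x\mapsto -x$. Both projections are two-to-one, and $F(x)=x^2$ is an integral. The sketch below enumerates the chains $(x_1,\dots,x_{n+1})$ of $R^{(n)}$ starting at a given point; all names are illustrative.

```python
def iterate_relation(successors, x0, n):
    """All chains (x_1, ..., x_{n+1}) with x_1 = x0 and (x_i, x_{i+1}) in R.

    `successors(x)` returns the finite list of y with (x, y) in R.
    """
    chains = [(x0,)]
    for _ in range(n):
        chains = [chain + (y,) for chain in chains for y in successors(chain[-1])]
    return chains

def is_integral_on(F, chains, tol=1e-12):
    """Check that F is constant along every chain (up to a tolerance)."""
    return all(abs(F(x) - F(chain[0])) <= tol for chain in chains for x in chain)

def toy_successors(x):
    """Toy relation on R \\ {0}: y = x or y = -x (hypothetical example)."""
    return [x, -x]

chains = iterate_relation(toy_successors, 1.5, 3)
```

Because the relation is multivalued, the number of chains grows like $2^n$; a single-valued map $\phi$ would give exactly one chain, the orbit of $x_0$.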
5.2. Factorization relations on Poisson Lie groups. We will study very specific Poisson relations on Poisson Lie groups which we will call factorization relations. Let $P$ be a Poisson Lie group and $D(P)$ be its double. A factorization relation on $P\times P^{\mathrm{op}}$ is a sub-variety $F\subset(P\times P^{\mathrm{op}})\times(P\times P^{\mathrm{op}})$, defined as
\[ F = \{(g^+,g^-),(h^+,h^-)\mid i(g^+)j(g^-)^{-1} = j(h^-)^{-1}i(h^+)\}, \]
where $i: P\hookrightarrow D(P)$ and $j: P^{\mathrm{op}}\hookrightarrow D(P)$ are the natural inclusions of Poisson Lie groups.

Proposition 5.
• Functions on $D(P)$ which are invariant with respect to the adjoint action of $D(P)$ form a Poisson commutative subalgebra in the Poisson algebra of functions on $D(P)$.
• A function on $P\times P^{\mathrm{op}}$ which is the composition of the map $M(i\times j): P\times P^{\mathrm{op}}\to D(P)$, $(g^+,g^-)\mapsto i(g^+)j(g^-)^{-1}$, and of an Ad-invariant function on $D(P)$ is an integral of the factorization map.

Part 1 of this proposition is well known [STS85]; Part 2 is obvious: $f(i(g^+)j(g^-)^{-1}) = f(j(h^-)^{-1}i(h^+)) = f(i(h^+)j(h^-)^{-1})$.

Let $\Sigma_1$ and $\bar\Sigma_2$ be symplectic leaves in $P$ and $P^{\mathrm{op}}$ respectively. Restricting the relation $F$ to the symplectic leaf $\Sigma = \Sigma_1\times\bar\Sigma_2\subset P\times P^{\mathrm{op}}$ we obtain a Poisson relation on $\Sigma$. Functions on $D(P)$ which are Ad-invariant are invariant with respect to the factorization relation and therefore produce its integrals. It may happen that one can make a Hamiltonian reduction of $\Sigma$ in such a way that on the reduced space we have enough central functions, in the sense that their level surfaces have half the dimension of the reduced symplectic manifold. In this case the factorization dynamics on $\Sigma$ or on the reduced space will be integrable.

In the next sections we will show that this is exactly what happens with symplectic leaves corresponding to the Coxeter elements. As we will see, this gives an integrable system which is a "nonlinear" version of an open Toda system corresponding to the Lie algebra $\mathfrak{g} = \mathrm{Lie}(G)$. It becomes the usual Toda system in a neighborhood of the identity.

Remark.
One can argue that factorization dynamics is integrable on all (appropriately reduced) symplectic leaves of P × P op when P is a Borel subgroup of simple Lie group G. In a neighborhood of the identity such systems become “complete” Toda systems (corresponding to parabolic subgroups in G) [Kos79, DLT89]. But this will be the subject for a separate publication.
6. Factorization Dynamics on Coxeter Symplectic Leaves

6.1. Integrals. Consider a Coxeter symplectic leaf $S_{w,w}$ of $D(B)$ corresponding to the Coxeter element $w$. Fix a reduced decomposition $w = s_{i_1}\dots s_{i_r}$. On a Zariski open subset $S^0_{w,w}$ of $S_{w,w}$ each element of $G\cap S_{w,w}$ (provided that $G$ is embedded into $D(B) = G\times H$ as $(G,e)$) can be represented by the product
\[ UL^{-1} = U_{i_1}\dots U_{i_r}L^{-1}_{i_1}\dots L^{-1}_{i_r}. \tag{15} \]
Here we abbreviated $U_i\equiv U_i(A_i,B_i)$, $L_i = L_i(D_i,C_i)$. This subset depends on the choice of reduced decomposition of $w$. We will suppress this dependence since different reduced decompositions give birationally isomorphic subsets.

For each $i = 1,\dots,r$ and given reduced decomposition $w = s_{i_1}\dots s_{i_r}$ define $\{i\}_+ = \{i_{\alpha} = 1,\dots,r\mid\alpha>\beta,\ i = i_{\beta}\}$ and $\{i\}_- = \{i_{\alpha} = 1,\dots,r\mid\alpha<\beta,\ i = i_{\beta}\}$.

Proposition 6. The following identities hold:
\[ \begin{aligned} UL^{-1} &= \prod_iA_i^{h_i}\;U_{i_1}(1,\tilde V_{i_1})L^{-1}_{i_1}(1,\tilde W_{i_1})\dots U_{i_r}(1,\tilde V_{i_r})L^{-1}_{i_r}(1,\tilde W_{i_r})\cdot\prod_iD_i^{-h_i},\\ UL^{-1} &= U_{i_1}(1,V_{i_1})\dots U_{i_r}(1,V_{i_r})\,\prod_i\Big(\frac{A_i}{D_i}\Big)^{h_i}\,L^{-1}_{i_1}(1,W_{i_1})\dots L^{-1}_{i_r}(1,W_{i_r}), \end{aligned} \tag{16} \]
where
\[ V_i = B_iA_i\prod_{j\in\{i\}_-}A_j^{C_{ji}},\qquad W_i = C_iD_i^{-1}\prod_{j\in\{i\}_+}D_j^{-C_{ji}},\qquad \tilde V_i = B_iA_i^{-1}\prod_{j\in\{i\}_+}A_j^{-C_{ji}},\qquad \tilde W_i = C_iD_i\prod_{j\in\{i\}_-}D_j^{C_{ji}}. \]
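The proof of this proposition and of the next lemma is a simple exercise.

Lemma 2. $\tilde V_i = V_i\prod_jA_j^{-C_{ji}}$, $\tilde W_i = W_i\prod_jD_j^{C_{ji}}$.

Define variables $\chi_i^{\pm}$, $G_i$, $F_i$ as
\[ \chi_i^+ = V_iW_i,\qquad \chi_i^- = \chi_i^+\prod_j\Big(\frac{A_j}{D_j}\Big)^{-C_{ji}},\qquad G_i = \frac{B_i}{C_i}\,A_iD_i\prod_{j\in\{i\}_+}A_j^{C_{ji}}\prod_{j\in\{i\}_-}D_j^{C_{ji}},\qquad F_i = A_iD_i. \]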
Proposition 7. Considered as functions on $S^0_{w,w}$, $\chi_i^{\pm}$, $F_i$, and $G_i$ have the following Poisson brackets:
\[ \{\chi_i^+,\chi_j^+\} = \{\chi_i^-,\chi_j^-\} = 0,\qquad \{\chi_i^+,\chi_j^-\} = -2d_iC_{ij}\,\chi_i^+\chi_j^-, \]
\[ \{\chi_i^{\pm},F_j\} = \{\chi_i^{\pm},G_j\} = 0,\qquad \{F_i,G_j\} = -2d_iF_iG_j\,\delta_{i,j}. \]
The proof is a straightforward computation based on the definition of $\chi_i^{\pm}$, $F_i$, and $G_i$ and on the Poisson brackets between $A_i$, $B_i$, $C_i$, $D_i$:
\[ \begin{aligned} \{A_i,B_j\} &= -d_i\delta_{ij}A_iB_j,\\ \{D_i,C_j\} &= d_i\delta_{ij}D_iC_j,\\ \{A_i,A_j\} &= \{A_i,C_j\} = \{A_i,D_j\} = \{B_i,C_j\} = \{B_i,D_j\} = \{D_i,D_j\} = 0,\\ \{B_i,B_j\} &= \{C_i,C_j\} = 0. \end{aligned} \]
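Since all the variables involved are monomials in $A_i,B_i,C_i,D_i$ and the brackets above are log-canonical, the "straightforward computation" reduces to linear algebra on exponent vectors: for monomials $X^u$, $X^v$ one has $\{X^u,X^v\} = (u^{T}\Omega v)\,X^uX^v$, where $\Omega$ is the matrix of brackets of the logarithms. The sketch below checks the relations of Proposition 7 in this way. It is an illustration under assumptions: the formulas for $\chi_i^{\pm}$, $F_i$, $G_i$ are taken as read from the definitions above, nodes are numbered from 0, `word` lists the nodes in the order of the reduced decomposition, and the integer normalization of $d$ used for type $B_2$ is a choice made here.

```python
def poisson_form(d):
    """Matrix of {log X_p, log X_q} for X = (A_i, B_i, C_i, D_i), i = 0..r-1."""
    n = 4 * len(d)
    omega = [[0] * n for _ in range(n)]
    for i, di in enumerate(d):
        a, b, c, dd = 4 * i, 4 * i + 1, 4 * i + 2, 4 * i + 3
        omega[a][b], omega[b][a] = -di, di    # {A_i, B_i} = -d_i A_i B_i
        omega[dd][c], omega[c][dd] = di, -di  # {D_i, C_i} =  d_i D_i C_i
    return omega

def log_bracket(omega, u, v):
    """Coefficient k in {X^u, X^v} = k X^u X^v for exponent vectors u, v."""
    n = len(u)
    return sum(u[p] * omega[p][q] * v[q] for p in range(n) for q in range(n))

def exponents(C, word):
    """Exponent vectors of chi_i^+, chi_i^-, F_i, G_i (C[j][i] stands for C_{ji})."""
    r = len(word)
    pos = {node: k for k, node in enumerate(word)}
    chi_p, chi_m, F, G = [], [], [], []
    for i in range(r):
        minus = [j for j in range(r) if pos[j] < pos[i]]  # {i}_-
        plus = [j for j in range(r) if pos[j] > pos[i]]   # {i}_+
        v = [0] * (4 * r)                                 # chi_i^+ = V_i W_i
        v[4 * i] += 1; v[4 * i + 1] += 1; v[4 * i + 2] += 1; v[4 * i + 3] -= 1
        for j in minus:
            v[4 * j] += C[j][i]
        for j in plus:
            v[4 * j + 3] -= C[j][i]
        m = v[:]                                          # chi_i^- = chi_i^+ prod (A_j/D_j)^(-C_ji)
        for j in range(r):
            m[4 * j] -= C[j][i]
            m[4 * j + 3] += C[j][i]
        f = [0] * (4 * r)                                 # F_i = A_i D_i
        f[4 * i] = 1; f[4 * i + 3] = 1
        g = [0] * (4 * r)                                 # G_i as defined above
        g[4 * i] += 1; g[4 * i + 1] += 1; g[4 * i + 2] -= 1; g[4 * i + 3] += 1
        for j in plus:
            g[4 * j] += C[j][i]
        for j in minus:
            g[4 * j + 3] += C[j][i]
        chi_p.append(v); chi_m.append(m); F.append(f); G.append(g)
    return chi_p, chi_m, F, G

def check_proposition_7(C, d, word):
    """Return True if all bracket relations of Proposition 7 hold."""
    r = len(d)
    omega = poisson_form(d)
    chi_p, chi_m, F, G = exponents(C, word)
    br = lambda u, v: log_bracket(omega, u, v)
    for i in range(r):
        for j in range(r):
            delta = 1 if i == j else 0
            if not (br(chi_p[i], chi_p[j]) == 0 and br(chi_m[i], chi_m[j]) == 0
                    and br(chi_p[i], chi_m[j]) == -2 * d[i] * C[i][j]
                    and br(chi_p[i], F[j]) == 0 and br(chi_m[i], F[j]) == 0
                    and br(chi_p[i], G[j]) == 0 and br(chi_m[i], G[j]) == 0
                    and br(F[i], G[j]) == -2 * d[i] * delta):
                return False
    return True

B2 = [[2, -1], [-2, 2]]                     # d = (2, 1) symmetrizes this matrix
A3 = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
```

The check fails if $d$ does not symmetrize $C$, which shows where the identity $d_iC_{ij}=d_jC_{ji}$ enters the computation.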
Using Proposition 6, the definition of $\chi_i^{\pm}$ and elementary algebra we arrive at the following

Proposition 8. Let $V$ be a finite-dimensional representation of $G$ and $\mathrm{Ch}_V$ be its character. Then
\[ \mathrm{Ch}_V(UL^{-1}) = \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j}\phi_{i_1}(g_{i_1})\dots\phi_{i_r}(g_{i_r})\Big) = \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j}\phi_{i_1}(\bar g_{i_1})\dots\phi_{i_r}(\bar g_{i_r})\Big). \]
Here $\phi_i: SL_2(i)\hookrightarrow G$ is the embedding of $SL_2$ generated by $x_i^+$, $h_i$, $x_i^-$ into $G$; $g_i$ and $\bar g_i$ are elements of $SL_2$ whose images in the 2-dimensional irreducible representation are given, in the weight basis, by
\[ g_i = \begin{pmatrix}1&\chi_i^+\\-1&1-\chi_i^+\end{pmatrix},\qquad \bar g_i = \begin{pmatrix}1-\chi_i^-&\chi_i^-\\-1&1\end{pmatrix}. \]
The elements $\{h^j\}$ form the basis in $\mathfrak{h}\subset\mathfrak{g}$ corresponding to fundamental weights: $h^j = \sum_i(C^{-1})_{ji}h_i$.

Observe that $[h^j,X_i^{\pm}] = \pm\delta_{ij}X_i^{\pm}$, hence by conjugating $UL^{-1}$ with an element $\exp(ah^i)$ of $H$ one can alter the off-diagonal entries of the $g_i$'s. This was used in the proof of Proposition 8.

Now let us interpret these two propositions from the point of view of Hamiltonian reduction.

Proposition 9. (1) The functions $\log G_i$ generate the $H$ action (14) on $S^0_{w,w}$. (2) The functions $\log F_i$ generate the adjoint action of $H$ on $S^0_{w,w}\subset D(B)$, $h:(g,h')\mapsto(hgh^{-1},h')$.

This proposition can be derived immediately from formulae (16) and from the explicit form of the Poisson brackets in terms of the coordinates $A_i$, $B_i$, $C_i$, $D_i$.

Characters as functions on the group are invariant with respect to the adjoint action. Therefore Proposition 9 implies that characters, computed on $G\cap S^0_{w,w}$, do not depend on $F_i$, $G_i$, which can be seen also by direct computation (Proposition 8). As follows from 4.2, we can naturally identify
\[ G\cap S^0_{w,w} = S^0_{w,w}/H, \]
where the $H$ action is generated by $\log F_i$. Level surfaces of the functions $G_i$ are symplectic leaves of $G\cap S^0_{w,w}$ and $\log\chi_i^{\pm}$ are Darboux coordinates on these symplectic leaves. All this is clear from the structure of the Poisson brackets in Proposition 7.
6.2. Factorization map. Consider the map $\alpha:(\mathbb{C}^{\times})^{2r}\to(\mathbb{C}^{\times})^{2r}$,
\[ \alpha(\chi_i^+) = \chi_i^-,\qquad \alpha(\chi_i^-) = \frac{(\chi_i^-)^2}{\chi_i^+}\prod_j(1-\chi_j^-)^{-C_{ji}},\qquad \alpha(F_i) = F_i,\qquad \alpha(G_i) = G_i\prod_jF_j^{C_{ij}}, \]
defined outside of the hyperplanes $(\chi_j^- = 1,\ \chi_i^+ = 0)$. Since we are interested in integrable systems whose Hamiltonians are given by functions on $G$ invariant under conjugations, and since these functions restricted to a Coxeter orbit do not depend on the $F$ and $G$ variables, we will focus on the action of the factorization dynamics on $\chi^{\pm}$. Here we will continue the practice of abusing notation and will denote coordinates and coordinate functions by the same letter. Let $\mathrm{Ch}_V(\chi^+,\chi^-)$ be the functions on $(\mathbb{C}^{\times})^{2r}$ defined in Proposition 8.

Theorem 4. $\mathrm{Ch}_V(\alpha(\chi^+),\alpha(\chi^-)) = \mathrm{Ch}_V(\chi^+,\chi^-)$.

Proof. We will use the two formulae for these functions derived in Proposition 8. From the first one we have:
\[ \begin{aligned} \mathrm{Ch}_V(\alpha(\chi^+),\alpha(\chi^-)) &= \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\alpha(\chi_j^+)}{\alpha(\chi_j^-)}\Big)^{h^j}\ \overrightarrow{\prod_i}\ \phi_i\begin{pmatrix}1&\alpha(\chi_i^+)\\-1&1-\alpha(\chi_i^+)\end{pmatrix}\Big)\\ &= \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j}\prod_{j=1}^r(1-\chi_j^-)^{h_j}\ \overrightarrow{\prod_i}\ \phi_i\begin{pmatrix}1&\chi_i^-\\-1&1-\chi_i^-\end{pmatrix}\Big)\\ &= \mathrm{Ch}_V\Big(\prod_{j=1}^r\Big(\frac{\chi_j^+}{\chi_j^-}\Big)^{h^j}\ \overrightarrow{\prod_i}\ \phi_i\begin{pmatrix}1-\chi_i^-&\chi_i^-\\-1&1\end{pmatrix}\Big). \end{aligned} \]
Here the product is taken in the order $(i_1,\dots,i_r)$, and in the last equality we used the cyclic property of the trace and the Lie brackets between $h_j$ and $e_i$ and $f_i$. The last expression is exactly the second formula for $\mathrm{Ch}_V$, which proves the theorem. On the other hand the theorem follows from the next statement and from the fact that Ad-invariant functions on a Poisson Lie group are invariant with respect to the factorization map. □

Proposition 10. Let $F\subset S^0_{w,w}\times S^0_{w,w}$ be the factorization relation restricted to Coxeter symplectic leaves of $D(B)$. The diagram
\[ \begin{array}{ccc} &F&\\ {\scriptstyle\chi_L}\swarrow&&\searrow{\scriptstyle\chi_R}\\ (\mathbb{C}^{\times})^{2r}&\stackrel{\alpha}{\longrightarrow}&(\mathbb{C}^{\times})^{2r} \end{array} \]
is commutative. Here $\chi_L$ is the composition of the projection to the first component in $S^0_{w,w}\times S^0_{w,w}$ and the map $\chi:(A_i,B_i,C_i,D_i)\mapsto(\chi_i^{\pm})$, and $\chi_R$ is the composition of the projection to the right component and $\chi$.
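Proof. On the image of the factorization map $I : B \times B \to D(B)$, elements of $S_{w,w^{-1}} \subset D(B)$ can be represented as
\[
(U L^{-1},\ \operatorname{diag}(U)\operatorname{diag}(L)),
\]
where $U$ and $L$ are as above. The factorization relation $F \subset S'_{w,w} \times S'_{w,w}$ consists of points
\[
\bigl( (U L^{-1},\ \operatorname{diag}(U)\operatorname{diag}(L)),\ (\bar U \bar L^{-1},\ \operatorname{diag}(\bar U)\operatorname{diag}(\bar L)) \bigr)
\]
satisfying the conditions
\[
U L^{-1} = \bar L^{-1} \bar U, \qquad \operatorname{diag}(U)\operatorname{diag}(L) = \operatorname{diag}(\bar U)\operatorname{diag}(\bar L).
\]
Let $U_i, U_i', U_i'', \bar U_i, L_i, L_i', L_i'', \bar L_i$ be factors of $U, L, \dots$ satisfying the relations
\[
\begin{aligned}
U L^{-1} &= U_{i_1} \cdots U_{i_r}\, L_{i_1}^{-1} \cdots L_{i_r}^{-1}
 = U'_{i_1} {L'_{i_1}}^{-1} \cdots U'_{i_r} {L'_{i_r}}^{-1} \\
 &= {L''_{i_1}}^{-1} U''_{i_1} \cdots {L''_{i_r}}^{-1} U''_{i_r}
 = \bar L_{i_1}^{-1} \cdots \bar L_{i_r}^{-1}\, \bar U_{i_1} \cdots \bar U_{i_r}.
\end{aligned}
\]
Then the coordinates $A, B, C, D$ of these elements have to satisfy the relations
\[
\begin{aligned}
& A_i' = A_i, \qquad D_i' = D_i, \qquad
B_i' = B_i \prod_{j \in \{i\}^-} D_j^{C_{ji}}, \qquad
C_i' = C_i \prod_{j \in \{i\}^+} A_j^{-C_{ji}}, \\
& A_i' {D_i'}^{-1} - B_i' C_i' = A_i'' {D_i''}^{-1}, \qquad
B_i' D_i' = B_i'' {D_i''}^{-1}, \\
& C_i' {A_i'}^{-1} = C_i'' A_i'', \qquad
{A_i'}^{-1} D_i' = D_i'' {A_i''}^{-1} - B_i'' C_i'', \qquad
A_i' D_i' = A_i'' D_i'', \\
& \bar A_i = A_i'', \qquad \bar D_i = D_i'', \qquad
\bar B_i = B_i'' \prod_{j \in \{i\}^+} {D_j''}^{C_{ji}}, \qquad
\bar C_i = C_i'' \prod_{j \in \{i\}^-} {A_j''}^{-C_{ji}}.
\end{aligned}
\]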
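Let us find $\bar\chi_i^+$ from these relations:
\[
\begin{aligned}
\bar\chi_i^+
 &= \bar B_i \bar C_i\, \frac{\bar A_i}{\bar D_i} \prod_{j \in \{i\}^-} \bar A_j^{\,C_{ji}} \prod_{j \in \{i\}^+} \bar D_j^{\,-C_{ji}} \\
 &= B_i'' C_i'' \prod_{j \in \{i\}^+} {D_j''}^{C_{ji}} \prod_{j \in \{i\}^-} {A_j''}^{-C_{ji}}
 \cdot \frac{A_i''}{D_i''} \prod_{j \in \{i\}^-} {A_j''}^{C_{ji}} \prod_{j \in \{i\}^+} {D_j''}^{-C_{ji}} \\
 &= B_i'' C_i''\, \frac{A_i''}{D_i''}
 = B_i' C_i'\, \frac{D_i'}{A_i'}
 = B_i C_i\, \frac{D_i}{A_i} \prod_{j \in \{i\}^-} D_j^{C_{ji}} \prod_{j \in \{i\}^+} A_j^{-C_{ji}} \\
 &= \chi_i^+ \, \frac{\chi_i^-}{\chi_i^+} = \chi_i^- .
\end{aligned}
\]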
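Similarly,
\[
\begin{aligned}
\bar\chi_i^-
 &= \bar\chi_i^+ \prod_j \left( \frac{\bar D_j}{\bar A_j} \right)^{C_{ji}}
 = \chi_i^- \prod_j \left( \frac{D_j''}{A_j''} \right)^{C_{ji}} \\
 &= \chi_i^- \prod_j \left( \frac{D_j'/A_j'}{1 - B_j' C_j' D_j'/A_j'} \right)^{C_{ji}}
 = \frac{(\chi_i^-)^2}{\chi_i^+} \prod_j (1 - \chi_j^-)^{-C_{ji}} .
\end{aligned}
\]
Here we used the identities $\prod_j (D_j'/A_j')^{C_{ji}} = \chi_i^-/\chi_i^+$ and $B_j' C_j' D_j'/A_j' = \chi_j^-$. This proves the proposition. □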
Corollary 2. The map $\alpha$ is Poisson.

This can also be checked by direct calculation using the Poisson brackets between the $\chi_i^\pm$. Thus, we have a Poisson map $\alpha : (\mathbb{C}^\times)^{2r} \to (\mathbb{C}^\times)^{2r}$, defined outside of the hyperplanes $\chi_i^- = 1$, $\chi_i^+ = 0$, which preserves the functions $\mathrm{Ch}_V(\chi^+, \chi^-)$.

Proposition 11. (1) $\{\mathrm{Ch}_V, \mathrm{Ch}_W\} = 0$ for every pair of finite dimensional representations $V$ and $W$. (2) $\mathrm{Ch}_V$, as a function of the $\chi_i^\pm$, is independent of the choice of the Coxeter element $w$.

Proof. The first part of this proposition is a general fact about factorizable Poisson Lie groups. For the second part we have to show that $\mathrm{Ch}_V(\chi^+, \chi^-)$ does not depend on the order $(i_1, \dots, i_r)$ of the indices. Clearly $\mathrm{Ch}_V(\chi^+, \chi^-)$ does not change if we change the order by an elementary transposition (an exchange of two consecutive indices) of two indices which are not linked in the Coxeter diagram. Let us call these transpositions free elementary transpositions. Furthermore, $\mathrm{Ch}_V(\chi^+, \chi^-)$ is also invariant under a cyclic permutation, as may be seen using the observation made after Proposition 8. Thus the proposition follows from the easily established fact that every elementary transposition can be obtained by successive applications of cyclic permutations and free elementary transpositions. □

To summarize, with each Coxeter symplectic leaf of $G$ we associated a (complex holomorphic, algebraic) integrable system on $(\mathbb{C}^\times)^{2r}$ for which the integrals are given by characters (exactly $r$ of them are independent), but all these systems are trivially isomorphic. The coordinates $\chi_i^\pm$ simply describe different points in the group if one changes the Coxeter element. The factorization relation restricted to a Coxeter symplectic leaf gives a discrete-time evolution preserving these integrals.

6.3. Real positive form. Consider the real form $G_{\mathbb R}$ of the complex algebraic group $G$. Introduce variables $\chi_i^\pm = -u_i^\pm$. We call the domain $u_i^\pm > 0$ the positive domain. The following is clear.

Proposition 12.
Functions ChV (u+ , u− ) are positive for u+ i , ui > 0 and + r + + Y u j 1 u 1 u j h i1 ir φi1 ChV (u+ , u− ) = T rV + . . . φir + − 1 1 + u 1 1 + u u i i r 1 j j =1 r − − − − Y u+ hj u 1 + u u 1 + u j i1 i1 . . . φ ir ir . φi1 = T rV ir − 1 1 1 1 u j j =1
It is also clear that the map $\alpha$ is defined globally on the positive domain:
\[
\alpha(u_i^+) = u_i^-, \qquad
\alpha(u_i^-) = \frac{(u_i^-)^2}{u_i^+} \prod_j (1 + u_j^-)^{-C_{ji}} .
\]
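The global formula for $\alpha$ on the positive domain is easy to iterate numerically. The following is only a minimal sketch: the helper `cartan_matrix_a` (a type $A_r$ Cartan matrix) and the function names are illustrative choices, not notation from the paper. As a sanity check in rank one ($C = (2)$), a direct computation shows that the quantity $(u^+ + u^- + u^+u^-)/\sqrt{u^+u^-}$ is unchanged by $\alpha$; we use it here without claiming that it matches the normalization of $\mathrm{Ch}_V$ used above.

```python
import numpy as np


def cartan_matrix_a(r):
    """Cartan matrix of type A_r (illustrative helper, not from the paper)."""
    c = 2.0 * np.eye(r)
    for i in range(r - 1):
        c[i, i + 1] = c[i + 1, i] = -1.0
    return c


def alpha(u_plus, u_minus, cartan):
    """One step of the map alpha on the positive domain.

    alpha(u_i^+) = u_i^-,
    alpha(u_i^-) = (u_i^-)^2 / u_i^+ * prod_j (1 + u_j^-)^(-C_ji).
    """
    u_plus = np.asarray(u_plus, dtype=float)
    u_minus = np.asarray(u_minus, dtype=float)
    # prod_j (1 + u_j^-)^(-C_ji) = exp(-sum_j C_ji * log(1 + u_j^-))
    factor = np.exp(-(cartan.T @ np.log1p(u_minus)))
    return u_minus.copy(), u_minus**2 / u_plus * factor


def rank_one_invariant(u_plus, u_minus):
    """(u^+ + u^- + u^+ u^-) / sqrt(u^+ u^-), preserved by alpha when C = (2)."""
    x, y = float(u_plus[0]), float(u_minus[0])
    return (x + y + x * y) / np.sqrt(x * y)


if __name__ == "__main__":
    c3 = cartan_matrix_a(3)
    up, um = np.array([1.0, 2.0, 0.5]), np.array([0.3, 1.5, 2.0])
    for n in range(5):
        up, um = alpha(up, um, c3)
        print(n + 1, up, um)
```

Since every factor in the formula is positive when $u_i^\pm > 0$, the iterates never leave the positive domain, which is the sense in which $\alpha$ is globally defined there.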
Let $G_{>0}$ be the positive part of $G_{\mathbb R}$ (see [Lus94] and [FZ98] for definitions). For $SL(n)$ the positive part consists of all real unimodular $n \times n$ matrices with positive principal minors.

Lemma 3. On $G_{>0}$ there exists a unique factorization
\[
g = g_+ (g_-)^{-1},
\]
where $g_\pm^{\pm 1} \in B^\pm_{>0} = B^\pm \cap G_{>0}$ and $\theta(g_+) = \theta^-(g_-)^{-1}$.

Let $S^+_{w,w}$ be the positive Coxeter symplectic leaf of $G_{\mathbb R}$. It is the connected component of $\varphi^{-1}$ of the corresponding orbit in $D(G_{\mathbb R})/j(G_{\mathbb R}^-)$ which lies in $G_{>0}$. The positive domain described above is essentially a positive symplectic leaf and thus, on the positive domain, the factorization map $\alpha$ is the restriction of the factorization map $g = g_+ g_-^{-1} \mapsto \bar g = g_-^{-1} g_+$.
7. The Interpolating Flow and Continuous Time Nonlinear Toda Lattices

7.1. Interpolating flow. From now on we consider the factorization dynamics in the positive real domain. As was already pointed out, the factorization dynamics on the positive real domain is the graph of a Poisson map. The trajectory of this map is defined recursively by $x(n+1) = x_-(n)^{-1} x_+(n)$ for $x(n) = x_+(n) x_-(n)^{-1}$.

Proposition 13. The trajectory of the factorization map restricted to the positive real domain which starts at $x(0)$ has the form
\[
x(n) = g_+(n)^{-1} x(0)\, g_+(n), \qquad g(n) = x(0)^n = g_+(n)\, g_-(n)^{-1}.
\]

Proof. The identity
\[
x(0)^n = x_+(0)\, x(1)^{n-1}\, x_-(0)^{-1} = x_+(0) \cdots x_+(n-1)\, x_-(n-1)^{-1} \cdots x_-(0)^{-1}
\]
shows that $g_+(n) = x_+(0) \cdots x_+(n-1)$, which quickly leads to the statement. □

This proposition is a discrete analogue of the following theorem of Semenov-Tian-Shansky [STS85] for continuous time systems, which describes the trajectories of Hamiltonian systems on Poisson Lie groups generated by Ad-invariant functions. Define the $\mathrm{Lie}\,G$-valued gradient $\nabla f$ of a function $f : G \to \mathbb{R}$ by
\[
(\nabla f(g), \eta) := \langle df(g), X_\eta \rangle,
\]
where we write $X_\eta$ for the left invariant vector field on $G$ corresponding to $\eta$, and $\langle \omega, X \rangle$ is the value of the form $\omega$ on the vector field $X$.

Theorem 5. Let $H$ be an $\mathrm{Ad}_G$-invariant function on $G$. The trajectory $x(t)$ of the Hamiltonian equations of motion generated by $H$ is given by
\[
x(t) = g_+(t)^{-1} x(0)\, g_+(t), \qquad \text{where } g(t) = \exp\bigl(t \nabla H(x(0))\bigr) = g_+(t)\, g_-(t)^{-1}.
\]
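The statement of Proposition 13 can be checked numerically in a matrix model. The sketch below is an illustration under explicit assumptions, not the construction of the paper: we take $G = GL(n, \mathbb{R})$, a symmetric positive definite starting point, and we assume that the normalization of the factorization $x = x_+ x_-^{-1}$ amounts to $x_+$ upper triangular, $x_-^{-1}$ lower triangular, with equal positive diagonals. All function names are hypothetical.

```python
import numpy as np


def lu_nopivot(a):
    """Doolittle LU without pivoting: a = l @ u, l unit lower, u upper triangular."""
    n = a.shape[0]
    l = np.eye(n)
    u = np.array(a, dtype=float)
    for k in range(n - 1):
        for i in range(k + 1, n):
            l[i, k] = u[i, k] / u[k, k]
            u[i, :] -= l[i, k] * u[k, :]
    return l, u


def factor_ul(x):
    """Return (x_plus, x_minus_inv) with x = x_plus @ x_minus_inv.

    x_plus is upper triangular, x_minus_inv is lower triangular, and both
    have the same positive diagonal (assumed normalization).
    """
    n = x.shape[0]
    j = np.eye(n)[::-1]              # reversal permutation, j @ j = identity
    l, u = lu_nopivot(j @ x @ j)     # j x j = l u
    d = np.sqrt(np.diag(u))          # split the pivots symmetrically
    x_plus = j @ (l * d) @ j         # j (l diag(d)) j is upper triangular
    x_minus_inv = j @ (u / d[:, None]) @ j   # j (diag(d)^-1 u) j is lower triangular
    return x_plus, x_minus_inv


def step(x):
    """Factorization map x = x_+ x_-^{-1}  ->  x_-^{-1} x_+."""
    x_plus, x_minus_inv = factor_ul(x)
    return x_minus_inv @ x_plus


# A symmetric positive definite tridiagonal matrix as starting point.
X0 = np.array([[4.0, 1.0, 0.0, 0.0],
               [1.0, 3.0, 1.0, 0.0],
               [0.0, 1.0, 2.0, 1.0],
               [0.0, 0.0, 1.0, 2.0]])

if __name__ == "__main__":
    n_steps = 3
    x = X0.copy()
    for _ in range(n_steps):
        x = step(x)
    g_plus, _ = factor_ul(np.linalg.matrix_power(X0, n_steps))
    conjugated = np.linalg.inv(g_plus) @ X0 @ g_plus
    print(np.allclose(x, conjugated))
```

The point of the check is that iterating the factorization map $n$ times agrees with a single conjugation by the upper triangular factor of $x(0)^n$, so the discrete trajectory can be computed without stepping through intermediate times.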
Now we are in a position to derive a Hamiltonian flow which interpolates the factorization dynamics. A Hamiltonian $H_d$ whose time-1 flow map is given by factorization as above has to solve the equation $g = \exp(\nabla H_d(g))$. Thus, for $g = e^\xi$ and $\xi \in \mathrm{Lie}\,G$ we should have $\xi = \nabla H_d(e^\xi)$.

Proposition 14. In a neighborhood of the identity all $\mathrm{Ad}_G$-invariant solutions of the equation
\[
\xi = \nabla H(e^\xi) \tag{17}
\]
have the form
\[
H_d(e^\xi) = \tfrac12 (\xi, \xi) + \mathrm{const}.
\]

Proof. Let $H$ be an $\mathrm{Ad}_G$-invariant solution of the above equation and let $\tilde H = H \circ \exp$. Then $\tilde H$ is ad-invariant and hence $d\tilde H|_\xi(\mathrm{ad}_\eta(\xi)) = 0$ for all $\xi, \eta \in \mathrm{Lie}\,G$. By (17), $(\xi, \eta) = \langle dH(e^\xi), X_\eta \rangle = \langle d\tilde H(\xi), \eta \rangle$. Here we trivialized the tangent bundle on $G$ by left translations. Thus, for $\tilde H$ we have the equation $(\xi, \eta) = d\tilde H|_\xi(\eta)$. Integration now yields the statement of the proposition. □

If $G = SL(n, \mathbb{R})$ then $H_d(g) = \tfrac12 \operatorname{tr}(\log^2(g))$ in a sufficiently small neighborhood of the identity [Sur91a]. The Hamiltonian $H_d$ is quite remarkable since it gives the so-called classical quantum R-matrix [WX92, Res92]. The function $H_d$ is the most singular part of the quantum R-matrix in the appropriate semi-classical limit [Skl82, Res95, Res96]. The map $\alpha$ generated by the time-1 flow of $H_d$ is the classical quantum R-matrix in the sense of [WX92], restricted to the product of Coxeter symplectic leaves and reduced by Hamiltonian reduction.

7.2. Linearization in a neighborhood of 1. Consider $\mathbb{R}^{2n}_+$ with coordinates $(u_i^+, u_i^-)$ and with the following Poisson brackets between coordinate functions:
\[
\{u_i^+, u_j^+\} = \{u_i^-, u_j^-\} = 0, \qquad
\{u_i^+, u_j^-\} = -2 d_i C_{ij}\, u_i^+ u_j^- .
\]
Consider the family of diffeomorphisms $\beta : \mathbb{R}_+ \times \mathbb{R}^{2n} \to \mathbb{R}^{2n}_+$ acting on coordinate functions as
\[
\beta_\varepsilon(\pi_i, \phi_i) = (\varepsilon^2 e^{\phi_i + \varepsilon \pi_i},\ \varepsilon^2 e^{\phi_i}).
\]
Here $(\pi_i, \phi_i)$ are coordinates in $\mathbb{R}^{2n}$ such that $(\beta_\varepsilon(\phi_i), \beta_\varepsilon(\pi_i))$ are the coordinates $(u_i^+, u_i^-)$ which were used above. Assume that $\mathbb{R}^{2n}$ is equipped with the following symplectic structure:
\[
\{\phi_i, \phi_j\} = \{\pi_i, \pi_j\} = 0, \qquad \{\phi_i, \pi_j\} = 2 d_i C_{ij}. \tag{18}
\]
Then the maps $\beta_\varepsilon$ are symplectomorphisms.
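For each $\varepsilon > 0$ define the map $\alpha_\varepsilon : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ as $\alpha_\varepsilon = \beta_\varepsilon^{-1} \circ \alpha \circ \beta_\varepsilon$. The map $\alpha_\varepsilon$ acts on the coordinates $(\phi_i, \pi_i)$ as
\[
\begin{aligned}
\alpha_\varepsilon(\pi_i) &= \pi_i + \sum_{j=1}^{r} C_{ji}\, \frac{1}{\varepsilon} \ln(1 + \varepsilon^2 e^{\phi_j}), \\
\alpha_\varepsilon(\phi_i) &= \phi_i + \varepsilon \pi_i + \sum_{j=1}^{r} C_{ji}\, \frac{1}{\varepsilon} \ln(1 + \varepsilon^2 e^{\phi_j}).
\end{aligned}
\tag{19}
\]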
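By construction these maps are symplectomorphisms for the bracket (18). In the limit $\varepsilon \to 0$, Eq. (19) defines a vector field on $\mathbb{R}^{2n}$ with coordinates
\[
\dot\phi_i = \lim_{\varepsilon \to 0} \frac{\alpha_\varepsilon(\phi_i) - \phi_i}{\varepsilon} = \pi_i, \qquad
\dot\pi_i = \lim_{\varepsilon \to 0} \frac{\alpha_\varepsilon(\pi_i) - \pi_i}{\varepsilon} = \sum_j C_{ji}\, e^{\phi_j}.
\]
This vector field is the Hamiltonian vector field (for the Poisson brackets (18)) generated by the (usual) Toda Hamiltonian $H_{\mathrm{Toda}} = \tfrac12 (\xi_0, \xi_0)$, where
\[
\xi_0 = \sum_{i=1}^{r} (\pi_i h_i + e^{\phi_i} x_i^+ + x_i^-).
\]
Thus the family of maps (19) "retracts" to the Toda Hamiltonian flow in the neighborhood of the identity. Equivalently, we have
\[
\lim_{n \to \infty} \alpha_\varepsilon^n(\phi, \pi) = (\phi(t), \pi(t)),
\]
where $t = n\varepsilon$ is fixed and $(\phi(t), \pi(t))$ is the Hamiltonian flow generated by $H_{\mathrm{Toda}}$ passing through $(\phi, \pi)$ at $t = 0$. It is easy to find the leading terms of the asymptotic expansion of the integrals in the limit $\varepsilon \to 0$. Indeed, composing the map $\beta_\varepsilon$ with the functions $\mathrm{Ch}_V$ and $H_d$ we have
\[
H_V(\phi, \pi) = (\mathrm{Ch}_V \circ \beta_\varepsilon)(\phi, \pi) = \operatorname{Tr}_V(\exp(\xi_\varepsilon)), \qquad
H_d(\phi, \pi) = \tfrac12 (\xi_\varepsilon, \xi_\varepsilon),
\]
where
\[
\exp(\xi_\varepsilon) = \prod_{j=1}^{r} \exp(\varepsilon \pi_j h_j) \prod_i^{\rightarrow} \exp(\varepsilon e^{\phi_i} x_i^+) \exp(\varepsilon x_i^-).
\]
As $\varepsilon \to 0$, $\xi_\varepsilon = \varepsilon \xi_0 + O(\varepsilon^2)$.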
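The expansion $\xi_\varepsilon = \varepsilon \xi_0 + O(\varepsilon^2)$ can be illustrated in the simplest case. The sketch below assumes rank one, that is $\mathfrak{sl}_2$ in its two-dimensional representation with $h = \mathrm{diag}(1, -1)$, $x^+ = E_{12}$, $x^- = E_{21}$, and it assumes that the invariant form is the trace form $(a, b) = \operatorname{tr}(ab)$; these normalizations and the function names are illustrative choices, not taken from the paper. With them, $\exp(\xi_\varepsilon) = 1 + \varepsilon \xi_0 + O(\varepsilon^2)$ and $\operatorname{Tr} \exp(\xi_\varepsilon) = 2 + \varepsilon^2 H_{\mathrm{Toda}} + O(\varepsilon^3)$.

```python
import numpy as np

# Chevalley generators of sl_2 in the two-dimensional representation (assumed).
H = np.array([[1.0, 0.0], [0.0, -1.0]])
XP = np.array([[0.0, 1.0], [0.0, 0.0]])
XM = np.array([[0.0, 0.0], [1.0, 0.0]])


def group_element(phi, pi, eps):
    """exp(eps*pi*h) exp(eps*e^phi*x^+) exp(eps*x^-) as a 2x2 matrix."""
    torus = np.diag([np.exp(eps * pi), np.exp(-eps * pi)])
    upper = np.eye(2) + eps * np.exp(phi) * XP   # exact, since x^+ is nilpotent
    lower = np.eye(2) + eps * XM                 # exact, since x^- is nilpotent
    return torus @ upper @ lower


def xi0(phi, pi):
    """xi_0 = pi*h + e^phi*x^+ + x^-."""
    return pi * H + np.exp(phi) * XP + XM


def h_toda(phi, pi):
    """H_Toda = (1/2) tr(xi_0^2), using the trace form as invariant form."""
    x = xi0(phi, pi)
    return 0.5 * np.trace(x @ x)


if __name__ == "__main__":
    phi, pi = 0.3, 0.7
    for eps in (1e-1, 1e-2, 1e-3):
        g = group_element(phi, pi, eps)
        first_order_gap = np.linalg.norm(g - np.eye(2) - eps * xi0(phi, pi))
        trace_gap = abs(np.trace(g) - 2.0 - eps**2 * h_toda(phi, pi))
        # The two ratios should stay bounded as eps decreases.
        print(eps, first_order_gap / eps**2, trace_gap / eps**3)
```

In this normalization the Casimir acts on the two-dimensional representation by $3/2$ and $\dim \mathfrak{sl}_2 = 3$, so the coefficient of $\varepsilon^2 H_{\mathrm{Toda}}$ in $\operatorname{Tr}\exp(\xi_\varepsilon)/\dim V$ is $1/2$, in agreement with the ratio $c_V/\dim(\mathfrak{g})$.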
Thus, for $H_V$ and $H_d$ we have
\[
H_V = \dim V \left( 1 + \varepsilon^2\, \frac{c_V}{\dim(\mathfrak{g})}\, H_{\mathrm{Toda}} + O(\varepsilon^3) \right), \qquad
H_d = \varepsilon^2 H_{\mathrm{Toda}} + O(\varepsilon^3).
\]
Here we assumed that $V$ is irreducible and $c_V$ is the value of the Casimir operator acting on $V$. Higher Toda Hamiltonians can be obtained from the higher order terms of the $\varepsilon$-expansion of $H_V$.

8. Conclusion

As was mentioned in the introduction, the main goal of this paper was a systematic derivation of Coxeter–Toda systems from the symplectic geometry of Poisson Lie groups. Naturally, such an analysis can be done for loop groups as well. The corresponding models will be affine versions of Coxeter–Toda systems. For the $A_n$ root system this will give the relativistic Toda chain first described by Ruijsenaars [Rui90]. In a similar way one can construct discrete versions of Toda field theories. For the $A_n$ case this has been done in [KR97]. Notice also that, somewhat unexpectedly, the same Hirota equations appear as a system of equations for transfer matrices of some solvable models in statistical mechanics [BR90, KNS94]. Although it is clear that the explanation of this coincidence lies in the theory of $q$-$W$-algebras [ER97], the complete picture is still missing. The factorization dynamics restricted to other symplectic leaves will give "nonlinear" Toda–Kostant systems which are related to general coadjoint orbits.

Acknowledgements. The authors thank Yuri Suris for helpful discussions. N.R. thanks S. Fomin and A. Zelevinsky for valuable discussions and the Technische Universität Berlin for hospitality. The research of N.R. was partially supported by the NSF grant DMS-9603239. T.H., J.K. and N.K. were supported by the grant "Discrete integrable systems" of the Deutsche Akademische Austauschdienst and by the Sonderforschungsbereich 288 supported by the Deutsche Forschungsgemeinschaft.
References

[Adl79] Adler, M.: On a trace functional for formal pseudo-differential operators and the symplectic structure of the KdV-type equations. Inv. Math. 50, 219–248 (1979)
[Arn89] Arnold, V.I.: Mathematical Methods of Classical Mechanics, Second Edition. Berlin–Heidelberg–New York: Springer, 1989
[BR90] Bazhanov, V., Reshetikhin, N.: Restricted solid-on-solid models connected with simply laced algebras and conformal field theory. J. Phys. A 23, 1477–1492 (1990)
[DCKP95] De Concini, C., Kac, V.G. and Procesi, C.: Some quantum analogues of solvable Lie groups. In Geometry and analysis. Papers presented at the Bombay colloquium, India, January 6–14, 1992, Oxford: Oxford University Press, 1995, pp. 41–65
[DJM82] Date, E., Jimbo, M., Miwa, T.: Method for generating discrete soliton equations I–IV. J. Phys. Soc. Japan 51, 4116–4131 (1982)
[DLNT86] Deift, P., Li, L.C., Nanda, T. and Tomei, C.: The Toda flow on a generic orbit is integrable. Comm. Pure Appl. Math. 39, 183–232 (1986)
[DLT89] Deift, P., Li, L.C. and Tomei, C.: Matrix factorization and integrable systems. Comm. Pure Appl. Math. 42, 443–521 (1989)
[Dri87] Drinfeld, V.G.: Quantum groups. In Proc. Intern. Congress of Math. (Berkeley 1986), pp. 798–820. Providence, RI: AMS, 1987
[ER97] Frenkel, E. and Reshetikhin, N.: Deformations of W-algebras associated to simple Lie algebras. Preprint q-alg/9707012, 1997
[FZ98] Fomin, S. and Zelevinsky, A.: Totally nonnegative and oscillatory elements in semisimple groups. Preprint, 1998
[FZ99] Fomin, S. and Zelevinsky, A.: Double Bruhat cells and total positivity. J. of the AMS 12, 335–380 (1999)
[Hir77] Hirota, R.: Nonlinear partial difference equations II. Discrete-time Toda equation. J. Phys. Soc. Japan 43 (6), 2074–2078 (1977)
[HL93] Hodges, T. and Levasseur, T.: Primitive ideals of $C_q[SL(3)]$. Commun. Math. Phys. 156, 581–605 (1993)
[HZ94] Hofer, H. and Zehnder, E.: Symplectic invariants and Hamiltonian dynamics. Basel, Boston: Birkhäuser Verlag, 1994
[KNS94] Kuniba, A., Nakanishi, T. and Suzuki, J.: Functional relations in solvable lattice models. Int. J. of Mod. Phys. 9, 5215–5311 (1994)
[Kos79] Kostant, B.: The solution to a generalized Toda lattice and representation theory. Adv. Math. 34, 195–338 (1979)
[KR97] Kashaev, R. and Reshetikhin, N.: Affine Toda systems as an integrable 3-dimensional quantum field theory. Comm. Math. Phys. 188, 251–266 (1997)
[KS98] Korogodski, L. and Soibelman, Y.: Algebras of Functions on Quantum Groups, Part I. Providence, RI: American Mathematical Society, 1998
[Lus94] Lusztig, G.: Total positivity in reductive groups. In Lie theory and geometry: In honor of Bertram Kostant. Boston: Birkhäuser, 1994
[LW90] Lu, J.-H. and Weinstein, A.: Poisson Lie groups, dressing transformations and Bruhat decompositions. J. Differ. Geom. 31, 501–526 (1990)
[MV91] Moser, J. and Veselov, A.: Discrete versions of some classical integrable systems and factorization of matrix polynomials. Commun. Math. Phys. 139, 217–243 (1991)
[QNCvdL84] Quispel, G., Nijhoff, F., Capel, H. and van der Linden, J.: Linear integral equations and nonlinear differential-difference equations. Physica A 125, 344–380 (1984)
[Res92] Reshetikhin, N.: Quasitriangularity of quantum groups and quasi-triangular Hopf-Poisson algebras. In AMS Summer Research Institute on Algebras, Groups and Their Generalization, Providence, RI: AMS, 1992, pp. 111–133
[Res95] Reshetikhin, N.: Quasitriangularity of quantum groups at roots of 1. Commun. Math. Phys. 170, 79–100 (1995)
[Res96] Reshetikhin, N.: Integrable discrete systems. In Quantum Groups and their Applications in Physics. Bologna: IOS Press, 1996, pp. 445–487
[Rui90] Ruijsenaars, S.: Relativistic Toda systems. Commun. Math. Phys. 122, 217–247 (1990)
[Skl82] Sklyanin, E.: On some algebraic structures related to the Yang–Baxter equation. Funct. Anal. and its Appl. 16, 27–34 (1982)
[Soi90] Soibelman, Y.: Algebra of functions on a compact quantum group and its representations. Algebra i Analiz 2, 193–225 (1990)
[STS85] Semenov-Tian-Shansky, M.: Dressing transformations and Poisson group actions. Pub. Res. Inst. Math. Sci. Kyoto Univ. 21, 1237–1260 (1985)
[Sur90] Suris, Y.: Discrete time generalized Toda lattices: Complete integrability and relation with relativistic Toda lattices. Phys. Lett. A 145, 113–119 (1990)
[Sur91a] Suris, Y.: Algebraic structure of discrete-time and relativistic Toda lattices. Phys. Lett. A 156, 467–474 (1991)
[Sur91b] Suris, Y.: Generalized Toda chains in discrete time. Leningrad Math. J. 2, 339–352 (1991)
[Sym82] Symes, W.: The QR-algorithm and scattering for the finite non-periodic Toda lattice. Physica D 4, 275–290 (1982)
[Tod88] Toda, M.: Theory of nonlinear lattices. Berlin–Heidelberg–New York: Springer, 1988
[Ves91] Veselov, A.P.: Integrable maps. Russ. Math. Surv. 46, 3–45 (1991)
[WX92] Weinstein, A. and Xu, P.: Classical solutions to the quantum Yang–Baxter equation. Commun. Math. Phys. 143, 309–344 (1992)

Communicated by T. Miwa
Commun. Math. Phys. 212, 323 – 336 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Resonance Wave Expansions: Two Hyperbolic Examples

T. Christiansen¹, M. Zworski²

¹ Department of Mathematics, University of Missouri, Columbia, MO 65211, USA. E-mail: [email protected]
² Department of Mathematics, University of California, Evans Hall, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 1 October 1999 / Accepted: 24 January 2000
Abstract: For scattering on the modular surface and on the hyperbolic cylinder, we show that the solutions of the wave equations can be expanded in terms of resonances, despite the presence of trapping. Expansions of this type are expected to hold in greater generality but have been understood only in non-trapping situations.
1. Introduction In this note we give two examples for which we can obtain, on compact sets, an asymptotic expansion of solutions to the wave equation with smooth, compactly supported initial data, although there is trapping. The expansions are given in terms of resonances and they generalize the standard “separation of variables” expansions in terms of eigenvalues. The examples are the modular surface where we can use detailed information about the zeta function (see Fig. 1 and Theorem 1) and the hyperbolic cylinder where the resonances are particularly simple (see Fig. 2(b) and Theorem 2). Resonances or scattering poles are defined as poles of the meromorphic continuation of the resolvent or the scattering matrix and they constitute a natural replacement of discrete spectral data for problems on exterior domains. That point of view was emphasized early by Lax–Phillips ([12]) – see [23] for a light-hearted overview of recent results. Although resonances are most frequently defined in the stationary framework of scattering theory they are a dynamical concept: the real part of a resonance describes the rest energy of a state and the imaginary part its rate of decay. Consequently they should be understood in terms of long time behaviour of solutions to evolution equations, and, in particular, to the wave equation. For the Schrödinger evolution equation we refer to the recent paper by Soffer-Weinstein [17] and references given there; in that case one considers resonances which come from perturbing embedded eigenvalues. Under a quantum dynamical condition that the obstacle O is non-trapping (that is, a condition on the behaviour of solutions of the evolution equation) Lax–Phillips [12]
and Vainberg [21,22] showed that for $n$ odd, for some $\epsilon > 0$,
\[
\left.
\begin{aligned}
& (D_t^2 - \Delta) u(t, x) = 0, \quad x \in \mathbb{R}^n \setminus \mathcal{O}, \\
& u|_{\mathbb{R} \times \partial\mathcal{O}} = 0, \\
& u|_{t=0} = f \in C_c^\infty(\mathbb{R}^n \setminus \mathcal{O}), \\
& \tfrac{1}{i} \partial_t u|_{t=0} = g \in C_c^\infty(\mathbb{R}^n \setminus \mathcal{O})
\end{aligned}
\right\}
\Longrightarrow
u(t, x) = \sum_{\operatorname{Im} \lambda_l \le C}\ \sum_{j=1}^{m_{\mathcal{O}}(\lambda_l)} w_{\lambda_l, j}(x)\, e^{i t \lambda_l}\, t^{j-1} + O(e^{-(C+\epsilon)t}), \quad x \in K,
\tag{1.1}
\]
where $K \subset \mathbb{R}^n \setminus \mathcal{O}$ is compact and $m_{\mathcal{O}}(\lambda_l)$ is the multiplicity of the resonance. Here we took the convention that $\operatorname{Im} \lambda_l \ge 0$, that is, that the resonances lie in the upper half-plane. That the quantum non-trapping condition follows from the classical non-trapping condition was shown in the works of Andersson, Melrose, Morawetz, Ralston, Strauss, Sjöstrand and Taylor; see the appendix to the second edition of [12] and references given there.

In the ultimate trapping situation of a compact manifold, the expansion (1.1) is simply the expansion in terms of eigenvalues. Hence a naïve "interpolation" argument suggests its validity for all perturbations. That, however, is far from clear and not much is known. Tang-Zworski [19] recently showed, using the methods of [18], that the expansion (1.1) is valid for general "black box" perturbations when we sum over resonances satisfying $\operatorname{Im} \lambda_j \le |\lambda_j|^{-M}$, $M$ sufficiently large, and when we replace the error by $O(t^{-N})$ for any $N$. The result is at the moment conditional and the following generically reasonable yet unverifiable assumption has to be made:
\[
|\lambda_l - \lambda_k| > \frac{1}{C} \bigl( \max\{|\lambda_l|, |\lambda_k|\} \bigr)^{-L}, \quad \text{for some fixed } L > 0. \tag{1.2}
\]
This motivates our unconditional results in two rather explicit examples.

Remark. After this paper was written we learned that in a recent paper [1], Beyer obtained results very close to Theorem 2 here. His work was motivated by gravitational scattering; see [15] for another discussion of resonances in that setting and the relation to the hyperbolic cylinder.

In the paper, $C$ will stand for a constant whose value may change from line to line. The notation $\langle s \rangle$ means $(1 + |s|^2)^{1/2}$.

2. The Quotient by the Modular Group

We begin by working on a general surface with one constant curvature cusp end, which we shall identify with $(a, \infty)_y \times S^1_\theta$ for some $a > 0$. We will use $z$ to denote a variable in the surface and, as is common for scattering on hyperbolic surfaces, we use the spectral variable $s(1-s)$. We refer to [14] for the spectral and scattering theories of such surfaces. Let $E(z, s)$ be the generalized eigenfunction of the positive Laplacian $\Delta$, so that $(\Delta - s(1-s)) E(z, s) = 0$, such that on the cusp end
\[
E(z, s) = y^s + S(s)\, y^{1-s} + O(y^{-\epsilon})
\]
for some $\epsilon > 0$. With this convention, the scattering matrix $S(s)$ is holomorphic when $\operatorname{Re} s > 1/2$, except for a finite number of values of $s$ which correspond to the eigenvalues of $\Delta$ which lie below $1/4$.

The function
\[
u(t) = \frac{\sin(t \sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}\, f
\]
satisfies the equation
\[
(D_t^2 - (\Delta - 1/4))\, u(t) = 0, \qquad u(0) = 0, \quad u_t(0) = f,
\]
when $f$ is sufficiently well-behaved. Using the spectral representation, we have
\[
\begin{aligned}
\frac{\sin(t \sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}
 &= \frac{1}{2i} \sum_{\lambda_j \in \sigma_p(\Delta)} \frac{e^{i \sqrt{\lambda_j - 1/4}\, t} - e^{-i \sqrt{\lambda_j - 1/4}\, t}}{\sqrt{\lambda_j - 1/4}}\, \phi_j(z)\, \overline{\phi_j}(z') \\
 &\quad + \frac{1}{8\pi} \int_{-\infty}^{\infty} \frac{e^{i\tau t} - e^{-i\tau t}}{i\tau}\, E(z, 1/2 + i\tau)\, E(z', 1/2 - i\tau)\, d\tau
\end{aligned}
\tag{2.1}
\]
in the sense of distributions, where $\phi_j$ is a normalized eigenfunction of $\Delta$ associated to $\lambda_j$. Here the $\lambda_j$'s can take the same value depending on the multiplicity, and the $\phi_j$'s form an orthonormal set.

Let $X_0 = \mathbb{H}^2 / PSL(2; \mathbb{Z})$ be the quotient of the hyperbolic upper half plane by $PSL(2; \mathbb{Z})$, which has scattering matrix
\[
S(s) = \sqrt{\pi}\, \frac{\Gamma(s - \tfrac12)}{\Gamma(s)}\, \frac{\zeta(2s - 1)}{\zeta(2s)}, \tag{2.2}
\]
where $\Gamma$ is the Euler $\Gamma$-function and $\zeta$ is the Riemann $\zeta$-function. One consequence of this is that the poles of the scattering matrix other than $s = 1$ correspond to the nontrivial zeros of $\zeta(2s)$. We recall that if $N(T)$ is the number of zeros of the function $\zeta(s)$ in $0 \le \operatorname{Re} s \le 1$, $0 \le \operatorname{Im} s \le T$, then
\[
N(T) = \frac{1}{2\pi}\, T \log T - \frac{1 + \log 2\pi}{2\pi}\, T + O(\log T)
\]
as $T \to \infty$ ([20, Theorem 9.4]). That is, the scattering matrix for the Laplacian on $X_0$ has far fewer than the maximum number of poles in a ball of radius $T$ centered at the origin, which is $O(T^2)$ for general surfaces with cusps.

Let $f, \chi \in C_c^\infty(X_0)$. We wish to obtain an asymptotic expansion as $t \to \infty$ of $\sin(t \sqrt{\Delta - 1/4}) / \sqrt{\Delta - 1/4}$ applied to $f$ and truncated by $\chi$.

Theorem 1. Let $f, \chi \in C_c^\infty(X_0)$. Then there exist $v_{jk} \in C_c^\infty(X_0)$ such that as $t \to \infty$,
\[
\begin{aligned}
\chi\, \frac{\sin(t \sqrt{\Delta - 1/4})}{\sqrt{\Delta - 1/4}}\, f
 &= \frac{1}{2i} \sum_{\lambda_j \in \sigma_p(\Delta)} \frac{e^{i \sqrt{\lambda_j - 1/4}\, t} - e^{-i \sqrt{\lambda_j - 1/4}\, t}}{\sqrt{\lambda_j - 1/4}}\, \chi(z)\, \phi_j(z)\, (f, \phi_j) \\
 &\quad + \sum_{s_j \text{ poles of } S(s)} e^{(s_j - 1/2) t}\, \bigl( \operatorname{sign}(1/2 - \operatorname{Re} s_j) \bigr) \sum_{k \le \operatorname{mult}(s_j) - 1} v_{jk}\, t^k + O(e^{-Nt})
\end{aligned}
\]
for any $N$.
The sum over the poles should be understood as follows: for n ∈ Z, there exist τn ∈ R, n ≤ 2τn < n + 1, such that if X X bn (z, t) = e(sj −1/2)t vj k t k , sj :τn 2. (For the bound on S, see [14, (3.26),(3.27)].) The bounds (2.4) are valid for other surfaces with one cusp end, but we shall need an improved bound in the special case of X0 . As a by-product of their work on L∞ bounds on L2 -eigenfunctions, Iwaniec and Sarnak obtain the bound |χ(z)E(z, 1/2 + it)| |t| 12 + , 5
see [11, (A.12)], where now the positive contribution of the Eisenstein series should be kept (see also the discussion around [16, (2.18)]). Here we give a direct proof of a weaker estimate, in the spirit of general scattering theory: the estimate depends only on the separation of the poles of $E$ from the continuous spectrum.

Lemma 2.1. For the generalized eigenfunctions $E(z, s)$ on $X_0$,
\[
|\chi(z) E(z, s)| \le C \langle s \rangle^{2+\epsilon} e^{\operatorname{Re} s} \quad \text{if } \operatorname{Re} s \ge 1/2,\ |s - 1| > \epsilon > 0.
\]

Proof. We shall use (2.2) to improve our bound on $E(z, s)$ when $\operatorname{Re} s \ge 1/2$. We start by recalling an exponential bound on the resolvent:
\[
\| \chi_2 (\Delta - s(1-s))^{-1} \chi_2 \|_{L^2 \to L^2} \le C \exp(C |s|^{N_0}),
\quad \text{if } s \notin \bigcup_{j=1}^{\infty} D(s_j, \langle s_j \rangle^{-3}) \ \cup \bigcup_{\sigma_j :\ \sigma_j(1 - \sigma_j) = \lambda_j \in \sigma_p(\Delta)} D(\sigma_j, \langle \sigma_j \rangle^{-3}),
\tag{2.5}
\]
for any $\chi_2 \in C_c^\infty(X_0)$, for some $N_0$ (see the representation of the resolvent in [7, Sect. 5] and [8, Lemma 3.6]; the estimate also follows from the general "black box" scattering estimate, see [18, Lemma 1]). Moreover, by Theorem 3.8 of [20] there is a constant $A > 0$ such that $\zeta(s)$ is not zero for $\operatorname{Re} s \ge 1 - A/\log|\operatorname{Im} s|$, $|\operatorname{Im} s| > t_0$. Since $(\Delta - s(1-s))^{-1} [\Delta, \chi_1]\, y^s$ is regular away from the poles of the scattering matrix, using (2.3), (2.5), and the maximum principle this leads to a bound on the generalized eigenfunctions:
\[
|\chi(z) E(z, s)| \le C \exp(C |s|^{N_0}) \tag{2.6}
\]
when | Im s| is sufficiently large and Re s ≥ 1/2 − A/(4 log | Im s|). Then, just as in [18, Lemma 2], we can use the maximum principle, the existence of a pole free region, the bound in the good half plane (2.4) and the exponential bound (2.6) to obtain, for | Im s| sufficiently large, t u (2.7) |χ(z)E(z, s)| ≤ Chsi2+ eRe s if Re s ≥ 1/2. √ √ Proof of Theorem 1. We use the representation (2.1) of sin(t 1 − 1/4)/ 1 − 1/4. For the term of (2.1) with an eitτ we deform the contour of integration into the upper half plane; the term with e−itτ is deformed into the lower half-plane. We note that this is possible despite the singularity at τ = 0 since Z lim ↓0
eiτ t E(z, 1/2 + iτ )E(z0 , 1/2 − iτ )dτ Im τ = τ −δ 0, one can show, by moving the contour of integration off the real axis, to Im τ = C/4 Re τ for sufficiently large τ , that √ √ ! √ 1 X ei λj −1/4t − e−i λj −1/4t sin(t 1 − 1/4) p f = χ √ 2i 1 − 1/4 λj − 1/4 λ ∈σ (1) p
j
$\cdot\ \chi(z)\, \phi_j(z)\, (f, \phi_j) + O(t^{-N})$ for any $N$.

3. The Hyperbolic Half-Cylinder

The second example we consider is the hyperbolic half-cylinder $Y_{0l} \simeq (\mathbb{R}_+)_r \times (\mathbb{R}/l\mathbb{Z})_\theta$ with metric $dr^2 + \cosh^2 r\, d\theta^2$. The analysis is equally applicable to the case of the full cylinder $Y_l \simeq (\mathbb{R})_r \times (\mathbb{R}/l\mathbb{Z})_\theta$ with the same metric. In both cases the trapped set consists of one closed hyperbolic orbit, which is well known to generate resonances on a lattice (as was pointed out by Guillopé [6] and Epstein [2]; see also [7, Appendix] and Fig. 2(b)).

We begin by recalling some results of [7, Sect. 3 and Appendix]. The Laplacian
\[
\Delta_{Y_{0l}} = D_r^2 - i \tanh r\, D_r + \Delta_{\mathbb{R}/l\mathbb{Z}}\, (\cosh r)^{-2}
\]
on the hyperbolic half-cylinder $Y_{0l}$ is, through conjugation by $\cosh^{1/2} r$, equivalent to the operator
\[
D_r^2 + \frac{\Delta_{\mathbb{R}/l\mathbb{Z}} + 1/4}{\cosh^2 r} + \frac14
\]
on $L^2(\mathbb{R}_+ \times \mathbb{R}/l\mathbb{Z},\ dr\, d\theta)$. We can expand this in terms of the eigenfunctions on $\mathbb{R}/l\mathbb{Z}$ to obtain
\[
\bigoplus_{m \in \mathbb{Z}} \left( D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + \frac14 \right)
\]
on $l^2(\mathbb{Z}, L^2(\mathbb{R}_+, dr))$. Modifying slightly the notation of [7], we have a generalized eigenfunction satisfying the Dirichlet boundary condition for the one-dimensional problem $D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} - k^2$,
\[
\tilde E_{\nu_m}(r, k) = \tilde a(\nu_m, k)\, \sinh r\, \cosh^{1 + \nu_m} r \cdot {}_2F_1\bigl( (\nu_m - ik + 2)/2,\ (\nu_m + ik + 2)/2,\ 3/2;\ -\sinh^2 r \bigr),
\]
where $\nu_m = -1/2 + i(2\pi m/l)$ and
\[
\tilde a(\nu_m, k) = \frac{2^{-ik}\, \Gamma((\nu_m - ik + 2)/2)\, \Gamma((-\nu_m - ik + 1)/2)}{\Gamma(-ik)\, \Gamma(3/2)}.
\]
Then, for $r, r' > 0$,
\[
\delta(r - r') = \frac{1}{4\pi} \int_{-\infty}^{\infty} \tilde E_{\nu_m}(r, k)\, \tilde E_{\nu_m}(r', -k)\, dk. \tag{3.1}
\]
Fig. 2. (a) Resonances associated to two strictly convex bodies: in every fixed strip, the resonances become closer to points on the lattice as the real part increases. (b) Resonances for a hyperbolic cylinder: all resonances lie exactly on a lattice. The underlying dynamical structure, exactly one hyperbolic closed orbit, is the same in the two examples
In order to be consistent with our first example, we shall use as the variable $s = 1/2 - ik$ and set $E_{\nu_m}(r, s) = \tilde E_{\nu_m}(r, k)$. The scattering matrix $S_{0l}(s)$ for the hyperbolic half-cylinder with Dirichlet boundary conditions is
\[
S_{0l}(s) = \bigoplus_{m \in \mathbb{Z}} s(H_{0, -1/2 + 2im\pi/l})(k), \qquad s = 1/2 - ik,
\]
where $H_{0, -1/2 + 2im\pi/l}$ is $D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r}$ with Dirichlet boundary conditions. This gives us ([7, Lemmas 3.3, 3.4]) that
\[
S_{0l}(s) = \bigoplus_{m \in \mathbb{Z}} s_{lm}(s) \tag{3.2}
\]
with
\[
s_{lm}(s) = 2^{2s-1}\, \frac{\Gamma(1/2 - s)\, \Gamma((1 + s - i 2\pi m/l)/2)\, \Gamma((1 + s + i 2\pi m/l)/2)}{\Gamma(s - 1/2)\, \Gamma((2 - s - i 2\pi m/l)/2)\, \Gamma((2 - s + i 2\pi m/l)/2)} \tag{3.3}
\]
and the resonances of the Dirichlet Laplacian associated to $s_{lm}(s)$ are $\pm i 2\pi m/l - n$, $n \in 2\mathbb{N} - 1$. Notice that this means that for any $\beta \in \mathbb{R}$, corresponding to each eigenfunction on $\mathbb{R}/l\mathbb{Z}$ there are only a finite number of resonances with real part greater than $\beta$. Note too that for $m \ne 0$, the resonance $i 2\pi m/l - n$ has multiplicity two as a resonance of $S_{0l}(s)$ but only one as a resonance of $s_{lm}(s)$ and $s_{l(-m)}(s)$. For this reason, when $m \ne 0$ we can rule out the possibility of needing terms with $t$ dependence in the expansion anywhere other than the exponential.

If $X$ is a manifold with boundary, we use the notation $\dot C_c^\infty(X)$ to denote the smooth, compactly supported functions on $X$ that vanish to infinite order at the boundary.
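The lattice structure of these resonances is simple enough to enumerate directly. The following sketch lists, for one Fourier mode $m$, the resonances $\pm i 2\pi m/l - n$ with $n$ odd and real part greater than a given $\beta$, and checks that each one makes the argument of one of the $\Gamma$ factors in the numerator of (3.3) a non-positive integer, which is where that factor has its poles. The function names are illustrative and the check only looks at the numerator factors; it is not a full analysis of $s_{lm}$.

```python
import math


def resonances(m, l, beta):
    """Resonances +-i*2*pi*m/l - n (n odd, n >= 1) with real part > beta.

    For m = 0 the two signs coincide, so each point is listed once.
    """
    signs = (1,) if m == 0 else (1, -1)
    found = []
    n = 1
    while -n > beta:
        for sign in signs:
            found.append(complex(-n, sign * 2.0 * math.pi * m / l))
        n += 2
    return found


def gamma_pole_index(s, m, l):
    """Return k >= 0 if (1 + s -+ i*2*pi*m/l)/2 equals -k for one sign, else None."""
    for sign in (1, -1):
        arg = (1 + s - sign * 1j * 2.0 * math.pi * m / l) / 2
        k = -arg.real
        if abs(arg.imag) < 1e-12 and abs(k - round(k)) < 1e-12 and round(k) >= 0:
            return int(round(k))
    return None


if __name__ == "__main__":
    for s in resonances(2, 1.0, -6.0):
        print(s, gamma_pole_index(s, 2, 1.0))
```

For a fixed $m$ the number of listed points is finite for every $\beta$, which is the finiteness property noted above.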
Theorem 2. Let f ∈ C˙ c∞ (Y0l ) and let χ ∈ Cc∞ (Y0l ). Then there exist vm,n , wn ∈ Cc∞ (Y0l ) such that as t → ∞, p X sin(t 1Y0l − 1/4) e(i2mπ/ l−n−1/2)t vm,n f = χ p 1Y0l − 1/4 0 0, β 6 ∈ (N ∪ (N − 1/2)). Then X
Z φm (θ)χ(r)
m∈Z
Re s=−β
Z
Z R+
R/ lZ
et (s−1/2) Eν (r, s)Eνm (r 0 , 1−s)f (r 0 , θ 0 )φ m (θ 0 ) s − 1/2 m cosh r 0 1/2 0 0 dy dr ds = O(e−t (β+1/2) ) · cosh r
as t → ∞. Proof. Here we need uniform polynomial bounds on Eνm (r, s) and Eνm (r, 1 − s) when r is in a compact set and as |m| → ∞, |s| → ∞ with Re s = β. Recall that ˜ m , is − i/2) sinh r(cosh r)1/2+2π im/ l Eνm (r, s) = a(ν × 2 F1 ((s + 2π im/ l + 1)/2, (2π im/ l − s + 2)/2, 3/2; − sinh2 r). It is relatively easy to bound the hypergeometric function either in s (as was done in the previous lemma) or in m, but we are unaware of a bound in both independently. Instead we will take the approach of writing 2 F1 ((s
+ 2πim/ l + 1)/2, (2π im/ l − s + 2)/2, 3/2; − sinh2 r)
(3.9)
as a sum of derivatives of 2 F1 ((s 0 +2π im/ l+1)/2, (2π im/ l−s 0 +2)/2, 3/2; − sinh2 r) with 1/2 < Re s 0 < 5/2, where we can obtain bounds on it using properties of the resolvent. We will use (c − a)2 F1 (a − 1, b, c; z) + (2a − c − az + bz)2 F1 (a, b, c; z) + a(z − 1)2 F1 (a + 1, b, c; z) = 0
(3.10)
([13, 3.4.19]). Let γ be the greatest integer strictly less than (5/2 + β)/2 and let s = −β + iτ , τ ∈ R. We can write 2 F1 ((iτ − β + 2π im/ l + 1)/2, (2π im/ l − iτ + β + 2)/2, 3/2; − sinh2 r) as a sum of 2 F1
2πim + 1 /2 + j1 , l 2πim 3 − iτ + β − 2γ + 2 /2 + j2 , ; − sinh2 r , l 2
iτ − β + 2γ +
(3.11)
where j1 , j2 ∈ {0, 1} and the coefficients are rational functions in 2π im/ l, τ = Im s, and sinh2 r, the degree of which does not exceed 2γ , and whose denominators are bounded away from 0 when β 6 ∈ 2N − 2. These coefficients can be bounded by Chmi2γ hsi2γ , where the constant depends on β. There are four functions (3.11) to bound now. We use the notation (b)j = b(b + 1) · · · (b + j − 1). If at least one of j1 , j2 6 = 0, then we use d j a+j −1 a−1 [z 2 F1 (a, b, c; z)] = (a)j z 2 F1 (a + j, b, c; z) dzj
(3.12)
for positive integers j ([13, 3.4.4]) to write (3.11) as a sum of 2π im 2πim 2 + 1)/2, ( − iτ + β −2γ + 2)/2, 3/2; − sinh r 2 F1 (iτ −β +2γ + l l (3.13) and its derivative in r (and second derivative, if j1 = j2 = 1 ) with coefficients which are rational functions of order no greater than two in sinh r, cosh r, m, and τ , with the denominator bounded away from 0. Note that 1/2 < 2γ −β < 5/2 when β 6 ∈ 2N−1/2, β > 0. The derivatives are not a difficulty, since once we have a polynomial bound on the function (3.13) we get a similar bound on the derivative using the fact that (3.13) is closely related to solutions of the equation (Dr2 +
(2πm/ l)2 + 1/4 + 1/4 − s 0 (1 − s 0 ))h = 0. cosh2 r
Let $s' = i\tau - \beta + 2\gamma$, and note that (3.13) is equal to $(\tilde a(\nu_m, is' - i/2))^{-1}(\sinh r)^{-1}(\cosh r)^{-1/2 - 2\pi i m/l}E_{\nu_m}(m, s')$ with $1/2 < \operatorname{Re} s' < 5/2$. In order to bound $E(\nu_m, s')$ when $1/2 < \operatorname{Re} s' < 5/2$, we use, in analogy with (2.3),
$$E(\nu_m, s') = 2\sinh\big((s' - 1/2)r\big) - \Big(D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + \frac{1}{4} - s'(1 - s')\Big)^{-1}\frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r}\cdot 2\sinh\big((s' - 1/2)r\big) \tag{3.14}$$
when $1/2 < \operatorname{Re} s' < 5/2$ and we recall that we have chosen the convention that $\big(D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + \frac{1}{4} - s'(1 - s')\big)^{-1}$ is bounded on $L^2_0(\mathbb{R}_+)$ when $\operatorname{Re} s' > 1/2$. We have
$$\Big\|\Big(D_r^2 + \frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r} + \frac{1}{4} - s'(1 - s')\Big)^{-1}\Big\| \le \frac{C}{|\operatorname{Im} s'|\,|\operatorname{Re} s' - 1/2|} \tag{3.15}$$
when $\operatorname{Re} s' > 1/2$. Since
$$\Big\|\frac{(2\pi m/l)^2 + 1/4}{\cosh^2 r}\big(e^{(s' - 1/2)r} - e^{-(s' - 1/2)r}\big)\Big\|_{L^2(\mathbb{R}_+)} \le C\,\frac{\langle m\rangle^2}{|\operatorname{Re} s' - 5/2|^{1/2}} \tag{3.16}$$
when $1/2 \le \operatorname{Re} s' < 5/2$, we find that for $1/2 + \epsilon < \operatorname{Re} s' < 5/2 - \epsilon$, $|E_{\nu_m}(r, s')| \le C\langle m\rangle^2\langle s'\rangle^{1+\epsilon}e^{(s' - 1/2)r}$ where the constant depends on $\epsilon$ and we have used (3.14), (3.15), (3.16) as well as the Sobolev embedding theorem.

It remains to bound $(\tilde a(\nu_m, i(s + 2\gamma) - i/2))^{-1}$ and $\tilde a(\nu_m, is - i/2)$ when $\operatorname{Re} s = -\beta$; actually, we need only bound their product. We have
$$\frac{\tilde a(\nu_m, is - i/2)}{\tilde a(\nu_m, i(s + 2\gamma) - i/2)} = 2^{-2\gamma}\,\frac{\Gamma(s + 2\gamma - 1/2)}{\Gamma(s - 1/2)}\cdot\frac{\Gamma((2\pi i m/l + s + 1)/2)\,\Gamma((-2\pi i m/l + s + 1)/2)}{\Gamma((2\pi i m/l + s + 2\gamma + 1)/2)\,\Gamma((-2\pi i m/l + s + 2\gamma + 1)/2)} = \frac{2^{-2\gamma}(s - 1/2)_{2\gamma}}{((2\pi i m/l + s + 1)/2)_\gamma\,((-2\pi i m/l + s + 1)/2)_\gamma} \le C\langle s\rangle^{2\gamma}$$
when $\operatorname{Re} s \notin -(2\mathbb{N} - 1)$. Finally, this shows that $|\chi(r)E_{\nu_m}(r, s)| \le C\langle m\rangle^{2\gamma+4}\langle s\rangle^{4\gamma+7}$ when $\operatorname{Re} s = -\beta$, $\beta > 0$, $\beta \notin \mathbb{N} \cup (\mathbb{N} - 1/2)$. A polynomial bound for $E_{\nu_m}(r, 1 - s)$ is found in a similar way. These polynomial bounds are enough then to show that the integral in the statement of the lemma is of order $O(e^{-t(\beta + 1/2)})$. □

We point out that the same method applies to the case of the Neumann Laplacian on $Y_l^0$, using the generalized eigenfunctions as in [5]. The Laplacian on the full cylinder $Y_l$ can be built from the Dirichlet and Neumann Laplacians on the half-cylinder (by decomposing $L^2(Y_l)$ into subspaces of odd and even functions) and hence we obtain the second part of Theorem 2.

As a final remark we point out that a more complex example with the same dynamical structure is given by two convex obstacles; see Fig. 2(a). The resonances are shown to be asymptotic to a lattice of points, [9, 4]. The estimates on the resolvent given in [9] seem sufficient for obtaining an analogue of Theorem 2 above ([10]).

Acknowledgements. The first author is grateful for the partial support of a University of Missouri S.R.F. The second author is grateful to the National Science and Engineering Research Council of Canada and the National Science Foundation of the U.S. for partial support. The authors would like to thank Bill Banks for helpful conversations and Laurent Guillopé for allowing them to use the figures he created for another occasion.
References

1. Beyer, H.: On the completeness of the quasinormal modes of the Pöschl-Teller potential. Commun. Math. Phys. 204, 397–423 (1999)
2. Epstein, Ch.: Unpublished
3. Erdélyi, A. et al.: Higher Transcendental Functions. Vol. I. New York: McGraw-Hill Book Company, Inc., 1953
4. Gérard, Ch.: Asymptotique des pôles de la matrice de scattering pour deux obstacles strictement convexes. Mém. Soc. Math. France (N.S.) 31, (1988)
5. Guillopé, L.: Pöschl-Teller potentials and Laplacians on hyperbolic spaces. Unpublished
6. Guillopé, L.: Sur la distribution des longueurs des géodésiques fermées d'une surface compacte à bord totalement géodésique. Duke Math. J. 53, 827–848 (1986)
7. Guillopé, L. and Zworski, M.: Upper bounds on the number of resonances for non-compact Riemann surfaces. J. Funct. Anal. 129 (2), 364–389 (1995)
8. Guillopé, L. and Zworski, M.: Scattering asymptotics for Riemann surfaces. Ann. Math. 145, 597–660 (1997)
9. Ikawa, M.: On the poles of the scattering matrix for two strictly convex obstacles. J. Math. Kyoto Univ. 23, 127–194 (1983)
10. Ikawa, M.: Private communication
11. Iwaniec, H. and Sarnak, P.: L∞ norms of eigenfunctions of arithmetic surfaces. Ann. of Math. 141, 301–320 (1995)
12. Lax, P. and Phillips, R.: Scattering Theory. New York: Academic Press, 1st edition 1969, 2nd edition 1989
13. Luke, Y.: The Special functions and their approximations. Vol. 1. New York: Academic Press, 1969
14. Müller, W.: Spectral geometry and scattering theory for certain complete surfaces of finite volume. Invent. Math. 109, 265–305 (1992)
15. Sá Barreto, A. and Zworski, M.: Distribution of resonances for spherical black holes. Math. Res. Lett. 4, 103–121 (1997)
16. Sarnak, P.: Arithmetic quantum chaos. The Schur lectures (1992, Tel Aviv). Israel Math. Conf. Proc. 8, 183–236 (1995)
17. Soffer, A. and Weinstein, M.: Resonances, radiation damping and instability in Hamiltonian nonlinear wave equations. Invent. Math. 136, 9–74 (1999)
18. Tang, S.-H. and Zworski, M.: From quasimodes to resonances. Math. Res. Lett. 5 (3), 261–272 (1998)
19. Tang, S.-H. and Zworski, M.: Resonance expansions of scattered waves. Preprint, 1999
20. Titchmarsh, E.C.: The Theory of the Riemann zeta-function. Second edition. Oxford: Clarendon Press, 1986
21. Vainberg, B.R.: Exterior elliptic problems that depend polynomially on the spectral parameter, and the asymptotic behavior for large values of the time of the solutions of nonstationary problems. (Russian) Mat. Sb. (N.S.) 92 (134), 224–241 (1973)
22. Vainberg, B.R.: Asymptotic methods in equations of mathematical physics. London: Gordon and Breach, 1989
23. Zworski, M.: Resonances in physics and geometry. Notices Amer. Math. Soc. 46, 319–328 (1999)

Communicated by P. Sarnak
Commun. Math. Phys. 212, 337 – 370 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Classical Dynamical r-Matrices and Homogeneous Poisson Structures on G/H and K/T Jiang-Hua Lu∗ Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA. E-mail:
[email protected] Received: 4 September 1999 / Accepted: 25 January 2000
Abstract: Let $G$ be a finite dimensional simple complex group equipped with the standard Poisson Lie group structure. We show that all $G$-homogeneous (holomorphic) Poisson structures on $G/H$, where $H \subset G$ is a Cartan subgroup, come from solutions to the Classical Dynamical Yang–Baxter equations which are classified by Etingof and Varchenko. A similar result holds for a maximal compact subgroup $K$, and we get a family of $K$-homogeneous Poisson structures on $K/T$, where $T = K \cap H$ is a maximal torus of $K$. This family exhausts all $K$-homogeneous Poisson structures on $K/T$ up to isomorphisms. We study some Poisson geometrical properties of members of this family such as their symplectic leaves, their modular classes, and the moment maps for the $T$-action.

Contents
1. Introduction
2. The Classical Dynamical Yang–Baxter Equation
3. r-Matrices and Homogeneous Poisson Structures on $G/H$
   3.1 The main theorem
   3.2 The Poisson structures $\pi_{r_X(\lambda)}$ on $G/H$
   3.3 Comparison with Karolinsky's classification
4. r-Matrices and Homogeneous Poisson Structures on $K/T$
5. The Poisson Structures $\pi_{X,X_1,\lambda}$ on $K/T$
   5.1 Definition
   5.2 Connections via taking limits in $\lambda$
   5.3 The Lagrangian subalgebras of $\mathfrak{g}$ corresponding to $\pi_{X,X_1,\lambda}$
   5.4 Geometrical interpretation of $\pi_{X,X_1,\lambda}$
   5.5 $\pi_{X,X_1,\lambda}$ as the result of Poisson induction
   5.6 The symplectic leaves of $\pi_{X,X_1,\lambda}$
   5.7 The modular vector fields and the leaf-wise moment maps for the $T$-actions

∗ Research partially supported by an NSF Postdoctoral Fellowship and by NSF grant DMS 9803624.
1. Introduction

This paper is motivated by the work of Etingof and Varchenko [E-V] on classical dynamical r-matrices for the pair $(\mathfrak{g}, \mathfrak{h})$, where $\mathfrak{g}$ is a complex simple Lie algebra and $\mathfrak{h} \subset \mathfrak{g}$ a Cartan subalgebra. A classical dynamical r-matrix is, by definition, a meromorphic function $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ satisfying the so-called Classical Dynamical Yang–Baxter Equation (CDYBE):
$$\mathrm{Alt}(dr) + [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0$$
(see Sect. 2 for details). One such r-matrix has the form
$$r(\lambda) = \frac{\varepsilon}{2}\Omega + \frac{\varepsilon}{2}\sum_{\alpha \in \Sigma}\coth\Big(\frac{\varepsilon}{2}\langle\alpha, \lambda\rangle\Big)E_\alpha \otimes E_{-\alpha},$$
where $\Omega \in (S^2\mathfrak{g})^{\mathfrak{g}}$ corresponds to the Killing form $\langle\ ,\ \rangle$ of $\mathfrak{g}$, $\Sigma$ is the set of roots of $\mathfrak{g}$ with respect to $\mathfrak{h}$, the $E_\alpha$ and $E_{-\alpha}$'s are root vectors, and $\coth(x) = \frac{e^x + e^{-x}}{e^x - e^{-x}}$ is the hyperbolic cotangent function. Other r-matrices can be obtained by performing certain "gauge transformations" to the one above and by taking various limits of it. See Sect. 2.

We want to understand the geometrical meaning of these r-matrices. In [E-V], Etingof and Varchenko show that every classical dynamical r-matrix defines a Poisson groupoid over an open subset of $\mathfrak{h}^*$. In this paper, we give another geometrical interpretation of the r-matrices by connecting them with Poisson structures on the spaces $G/H$ and $K/T$, where $G$ is a complex Lie group with Lie algebra $\mathfrak{g}$, $H \subset G$ its connected subgroup corresponding to $\mathfrak{h}$, $K$ a compact real form of $G$, and $T = K \cap H$. We then study some Poisson geometrical properties of these Poisson structures on $K/T$ such as their symplectic leaves, their modular classes, and the moment maps for the $T$-action. We now explain this in more detail.

A special example of a classical dynamical r-matrix is one that is not "dynamical", i.e., independent of $\lambda$. It is given by
$$r_0 = \frac{\varepsilon}{2}\Omega + c + \frac{\varepsilon}{2}\sum_{\alpha \in \Sigma_+}E_\alpha \wedge E_{-\alpha}$$
for a choice of positive roots $\Sigma_+$ and an element $c \in \mathfrak{h} \wedge \mathfrak{h}$. It defines a (holomorphic) Poisson structure $\pi_G$ on $G$ by $\pi_G(g) = R_g r_0 - L_g r_0$, where $R_g$ and $L_g$ are respectively the right and left translations on $G$ by $g \in G$, making $(G, \pi_G)$ into a Poisson Lie group. This Poisson structure is the semi-classical limit of the quantum group corresponding to $G$ [D1, D2]. A Poisson structure on $G/H$ is said to be $(G, \pi_G)$-homogeneous if the action map $G \times (G/H) \to G/H$ is a Poisson map [D3].

The first result of this paper, Theorem 3.2, is on the construction of a surjective map from the set of all classical dynamical r-matrices for the pair $(\mathfrak{g}, \mathfrak{h})$ together with their
domains to the set of all (holomorphic) $(G, \pi_G)$-homogeneous Poisson structures on $G/H$. More precisely, for any classical dynamical r-matrix $r$ and $\lambda \in \mathfrak{h}^*$ such that $r(\lambda)$ is defined, we show that the bi-vector field $\tilde\pi_{r(\lambda)}$ on $G$ defined by $\tilde\pi_{r(\lambda)} = R_g r_0 - L_g r(\lambda)$ projects to a holomorphic $(G, \pi_G)$-homogeneous Poisson structure on $G/H$ under the projection $G \to G/H$, and that all $(G, \pi_G)$-homogeneous Poisson structures on $G/H$ arise this way. See also [L-X] for another interpretation of classical dynamical r-matrices.

Let $K \subset G$ be a compact real form of $G$, and let $T = K \cap H$ be the maximal torus of $K$. Then $K$ also carries a natural Poisson structure $\pi_K$ such that $(K, \pi_K)$ is a Poisson Lie group. Theorem 3.2 is then modified to Theorem 4.1 which states that classical dynamical r-matrices give rise to $(K, \pi_K)$-homogeneous Poisson structures on $K/T$ and that all $(K, \pi_K)$-homogeneous Poisson structures on $K/T$ arise this way.

We point out that a classification of all $(G, \pi_G)$ or $(K, \pi_K)$-homogeneous Poisson structures, not necessarily on $G/H$ or on $K/T$, has already been obtained by E. Karolinsky [Ka2, Ka3]. We want to emphasize that what is brought out here is the connection of such Poisson spaces with the CDYBE.

Among all $(K, \pi_K)$-homogeneous Poisson structures on $K/T$, we single out a family denoted by $\pi_{X,X_1,\lambda}$, where $X$ is any subset of the set $S(\Sigma_+)$ of all simple roots, $X_1 \subset X$, and $\lambda \in \mathfrak{h}$ satisfies some regularity condition (Theorem 5.1). This family exhausts all $(K, \pi_K)$-homogeneous Poisson structures on $K/T$ up to $K$-equivariant isomorphisms. Moreover, these Poisson structures are related to each other by taking various limits of the parameter $\lambda$ (see Sect. 5.2).

We study several Poisson geometrical properties of this family: The Lagrangian subalgebra of $\mathfrak{g}$ corresponding to each $\pi_{X,X_1,\lambda}$ is described in Sect. 5.3. In Sect. 5.4, we recall the construction in [E-L2] of a Poisson structure $\Pi$ on the variety $\mathcal{L}$ of all Lagrangian subalgebras in $\mathfrak{g}$ and the fact that each $(K/T, \pi_{X,X_1,\lambda})$ sits inside $(\mathcal{L}, \Pi)$ as a Poisson submanifold (possibly up to a covering map). The two special cases of $\pi_{X,X_1,\lambda}$ when $X = X_1 = \emptyset$ and when $X = S(\Sigma_+)$, $X_1 = \emptyset$ are considered in more detail here. In Sect. 5.5, we show that each $\pi_{X,X_1,\lambda}$ on $K/T$ can be obtained via Poisson induction from a Poisson structure on a smaller manifold. In Sect. 5.6, we describe the symplectic leaves of $\pi_{X,X_1,\lambda}$ when $X_1$ is the empty set. We show that in this case $\pi_{X,X_1,\lambda}$ has a finite number of symplectic leaves. For an arbitrary $\pi_{X,X_1,\lambda}$, we show that it always has at least one open symplectic leaf. In Sect. 5.7, we show that with respect to a $K$-invariant volume form $\mu_0$ on $K/T$, all the Poisson structures $\pi_{X,X_1,\lambda}$ have the same modular vector field. In the case when $X_1$ is the empty set, we also describe the moment map for the $T$-action on each symplectic leaf of $\pi_{X,\emptyset,\lambda}$.

Some applications of results in this paper are given in [E-L1], where a Poisson geometrical interpretation of the Kostant harmonic forms on $K/T$ [Ko] is given using the Bruhat Poisson structure $\pi_\infty := \pi_{X,X_1,\lambda}$ for $X = X_1 = \emptyset$. Set $\pi_\lambda = \pi_{X,X_1,\lambda}$ when $X = S(\Sigma_+)$ and $X_1 = \emptyset$. The fact that $\pi_\lambda \to \pi_\infty$ as $\lambda \to \infty$ is used in [E-L1] to show that the Kostant harmonic forms are limits of the usual Hodge harmonic forms. Results in this paper also motivate our work in [E-L2], where, among other things, we show that there is a Poisson manifold $(\mathcal{L}_0, \Pi)$ such that every $(K/T, \pi_{X,X_1,\lambda})$ is a Poisson submanifold (possibly up to a covering map) of $(\mathcal{L}_0, \Pi)$. In fact, $\mathcal{L}_0$ is an irreducible component of the variety $\mathcal{L}$ of all Lagrangian subalgebras of $\mathfrak{g}$, and the
Poisson structure $\Pi$ is defined on all of $\mathcal{L}$. We show in [E-L2] that all the $K$-orbits in $\mathcal{L}$ with respect to the Adjoint action are $(K, \pi_K)$-homogeneous Poisson spaces, and that every $(K, \pi_K)$-homogeneous Poisson space maps to $(\mathcal{L}, \Pi)$ by a Poisson map. Thus, $(\mathcal{L}, \Pi)$ is a setting for studying all $(K, \pi_K)$-homogeneous Poisson spaces.

We point out that many more properties of the Poisson structures $\pi_{X,X_1,\lambda}$ can be studied, among these their Poisson cohomology, their Poisson harmonic forms [E-L1], and their symplectic groupoids. We hope to do this in the future.
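2. The Classical Dynamical Yang–Baxter Equation

Definition 2.1 ([F,E-V]). A meromorphic function $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ is called a classical (quasi-triangular) dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$ if it satisfies the following three conditions:
1. The zero weight condition: $\mathrm{ad}_x r(\lambda) = 0$ for all $x \in \mathfrak{h}$ and $\lambda \in \mathfrak{h}^*$ such that $r(\lambda)$ is defined;
2. The generalized unitarity condition: $r^{12} + r^{21} = \varepsilon\Omega$ for some complex number $\varepsilon$ and for all $\lambda \in \mathfrak{h}^*$ such that $r(\lambda)$ is defined, where $\Omega \in (S^2\mathfrak{g})^{\mathfrak{g}}$ is the element corresponding to the Killing form on $\mathfrak{g}$;
3. The Classical Dynamical Yang–Baxter Equation (CDYBE):
$$\mathrm{Alt}(dr) + [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0,$$
where, for $r = \sum_i u_i \otimes v_i$, we have $r^{12} = \sum_i u_i \otimes v_i \otimes 1$, $r^{13} = \sum_i u_i \otimes 1 \otimes v_i$, $r^{23} = \sum_i 1 \otimes u_i \otimes v_i$,
$$\mathrm{CYB}(r) := [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = \sum_{i,j}[u_i, u_j] \otimes v_i \otimes v_j + u_i \otimes [v_i, u_j] \otimes v_j + u_i \otimes u_j \otimes [v_i, v_j],$$
and $\mathrm{Alt}(dr)(\lambda) \in \wedge^3\mathfrak{g}$ is the skew-symmetrization of $dr(\lambda) \in \mathfrak{h} \otimes \mathfrak{g} \otimes \mathfrak{g} \subset \mathfrak{g} \otimes \mathfrak{g} \otimes \mathfrak{g}$.
The complex number $\varepsilon$ is called the coupling constant for $r$.

We now recall the classification of classical dynamical r-matrices for the pair $(\mathfrak{g}, \mathfrak{h})$ as given in [E-V]. Let $\Sigma$ be the set of all roots for $\mathfrak{g}$ with respect to $\mathfrak{h}$. For each $\alpha \in \Sigma$, choose root vectors $E_\alpha$ and $E_{-\alpha}$ such that $\langle E_\alpha, E_{-\alpha}\rangle = 1$, where $\langle\ ,\ \rangle$ is the Killing form on $\mathfrak{g}$.

Let $\varepsilon$ be a non-zero complex number, let $\mu \in \mathfrak{h}^*$, and let $C = \sum_{i,j} C_{ij}\,dx_i \wedge dx_j$ be a closed meromorphic 2-form on $\mathfrak{h}^*$. Let $\Sigma_+$ be a choice of positive roots, and let $X$ be a subset of the set $S(\Sigma_+)$ of simple roots in $\Sigma_+$. For each $\alpha \in \Sigma$, define a (scalar-valued) meromorphic function $\varphi_\alpha$ on $\mathfrak{h}^*$ according to the rule: If $\alpha$ is a linear combination of simple roots in $X$, then
$$\varphi_\alpha(\lambda) = \frac{\varepsilon}{2}\coth\Big(\frac{\varepsilon}{2}\langle\alpha, \lambda - \mu\rangle\Big),$$
where $\coth(x) = \frac{e^x + e^{-x}}{e^x - e^{-x}}$ is the hyperbolic cotangent function; otherwise, set $\varphi_\alpha(\lambda) = \frac{\varepsilon}{2}$ if $\alpha$ is positive and $\varphi_\alpha(\lambda) = -\frac{\varepsilon}{2}$ if $\alpha$ is negative.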
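One way to read the rule above is that the constant values $\pm\varepsilon/2$ are what the hyperbolic cotangent expression tends to when the pairing $\langle\alpha, \lambda - \mu\rangle$ is pushed to infinity, in line with the remark in the Introduction that further r-matrices arise by taking limits. The following sketch illustrates this numerically under simplifying assumptions that are ours, not the paper's: $\varepsilon$ is taken real and positive and the pairing real. The function name `phi_span` and the argument `pairing` are hypothetical.

```python
import cmath


def coth(x):
    """Hyperbolic cotangent, (e^x + e^-x) / (e^x - e^-x)."""
    return cmath.cosh(x) / cmath.sinh(x)


def phi_span(eps, pairing):
    """phi_alpha for alpha in the span of X.

    `pairing` stands for the number <alpha, lambda - mu>.
    """
    return eps / 2 * coth(eps / 2 * pairing)


if __name__ == "__main__":
    eps = 1.0  # assumed real and positive for this illustration
    for t in (1.0, 5.0, 20.0, 60.0):
        # phi(+t) approaches eps/2 and phi(-t) approaches -eps/2 as t grows
        print(t, phi_span(eps, t).real, phi_span(eps, -t).real)
```

The same function also shows that $\varphi_{-\alpha} = -\varphi_\alpha$ on the span of $X$, since coth is odd.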
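Theorem 2.2 (Etingof–Varchenko [E-V]). 1. With the above choices of $\mu$, $C$, $\Sigma_+$, $X \subset S(\Sigma_+)$ and $\varphi_\alpha$, the meromorphic function $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ defined by
$$r(\lambda) = \frac{\varepsilon}{2}\Omega + \sum_{i,j} C_{ij}(\lambda)\,x_i \otimes x_j + \sum_{\alpha \in \Sigma}\varphi_\alpha(\lambda)\,E_\alpha \otimes E_{-\alpha} \tag{1}$$
is a classical dynamical r-matrix with non-zero coupling constant $\varepsilon$;
2. Every classical dynamical r-matrix with non-zero coupling constant has this form.

3. r-Matrices and Homogeneous Poisson Structures on G/H

3.1. The main theorem. Let $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ be any classical dynamical r-matrix as in Definition 2.1. Let
$$A_r(\lambda) = r(\lambda) - \frac{\varepsilon}{2}\Omega$$
be the skew-symmetric part of $r(\lambda)$. Using the fact that $\Omega$ is symmetric and ad-invariant, one easily shows that the terms $[\Omega^{ij}, A_r^{kl}]$ in the CDYBE for $r$ all cancel. Moreover, it is well-known that
$$[\Omega^{12}, \Omega^{13}] + [\Omega^{12}, \Omega^{23}] + [\Omega^{13}, \Omega^{23}] = [\Omega^{12}, \Omega^{13}] = [\Omega^{13}, \Omega^{23}] = -[\Omega^{12}, \Omega^{23}] \in (\wedge^3\mathfrak{g})^{\mathfrak{g}}.$$
Therefore, $A_r$ satisfies the following modified CDYBE (see also [E-V]):
$$\mathrm{Alt}(dA_r) + [A_r^{12}, A_r^{13}] + [A_r^{12}, A_r^{23}] + [A_r^{13}, A_r^{23}] = \frac{\varepsilon^2}{4}[\Omega^{12}, \Omega^{23}] \in (\wedge^3\mathfrak{g})^{\mathfrak{g}}. \tag{2}$$
Recall that there is the Schouten bracket $[\ ,\ ]$ on $\wedge\mathfrak{g}$. For $x_1, x_2, \ldots, x_k \in \mathfrak{g}$, we use the convention
$$x_1 \wedge x_2 \wedge \cdots \wedge x_k = \sum_{\sigma \in S_k}\mathrm{sign}(\sigma)\,x_{\sigma(1)} \otimes x_{\sigma(2)} \otimes \cdots \otimes x_{\sigma(k)} \in \mathfrak{g}^{\otimes k}.$$
Then for $X \in \wedge^2\mathfrak{g}$, the element $\mathrm{CYB}(X)$ and the Schouten bracket $[X, X]$ are related by [D2]
$$\mathrm{CYB}(X) = [X^{12}, X^{13}] + [X^{12}, X^{23}] + [X^{13}, X^{23}] = \frac{1}{2}[X, X].$$
Thus, we can rewrite Eq. (2) as
$$[A_r(\lambda), A_r(\lambda)] = \frac{\varepsilon^2}{2}[\Omega^{12}, \Omega^{23}] - 2\,\mathrm{Alt}(dA_r)(\lambda). \tag{3}$$
It is this form of the CDYBE that we will use to define Poisson structures on $G/H$.

Recall [D2] that a classical quasi-triangular r-matrix with coupling constant $\varepsilon$ is an element $r_0 \in \mathfrak{g} \otimes \mathfrak{g}$ such that $r_0 + r_0^{21} = \varepsilon\Omega$, $\mathrm{CYB}(r_0) = 0$.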
Remark 3.1. If $r_0$ has the zero-weight property, i.e., if $r_0 \in (\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}}$, then by Theorem 2.2, it must be of the form
$$r_0 = \frac{\varepsilon}{2}\Omega + \sum_{i,j} c_{ij}\,x_i \wedge x_j + \frac{\varepsilon}{2}\sum_{\alpha \in \Sigma_+}E_\alpha \wedge E_{-\alpha} \tag{4}$$
for some choice $\Sigma_+$ of positive roots and $\sum_{i,j} c_{ij}\,x_i \wedge x_j \in \mathfrak{h} \wedge \mathfrak{h}$. But not every quasi-triangular $r_0$ has the zero-weight property. For example, for $\mathfrak{g} = \mathfrak{sl}(3, \mathbb{C})$, we can take $r_0 = \frac{\varepsilon}{2}\big(\Omega + \sum_{\alpha \in \Sigma_+}E_\alpha \wedge E_{-\alpha} + \frac{1}{6}E_{21} \wedge E_{23}\big)$, where $E_{ij}$ has 1 at the $(ij)$'s entry and 0 everywhere else. See [B-D] for more examples.

Let $r_0$ be a classical quasi-triangular r-matrix with coupling constant $\varepsilon$ (not necessarily of zero weight for $\mathfrak{h}$). Let $\Lambda = r_0 - \frac{\varepsilon}{2}\Omega \in \mathfrak{g} \wedge \mathfrak{g}$ be the skew-symmetric part of $r_0$. Then, as a special case of (3), $\Lambda$ satisfies the modified Classical Yang–Baxter Equation (CYBE)
$$[\Lambda, \Lambda] = \frac{\varepsilon^2}{2}[\Omega^{12}, \Omega^{23}]. \tag{5}$$
It is well known that the bi-vector field $\pi_G$ on the group $G$ defined by
$$\pi_G(g) = R_g\Lambda - L_g\Lambda, \tag{6}$$
where $R_g$ and $L_g$ denote respectively the right and left translations from the identity element to $g$, defines a holomorphic Poisson structure on $G$, and that $(G, \pi_G)$ is a (holomorphic) Poisson Lie group [D2, STS1]. All Poisson structures in this section are assumed to be holomorphic.

Recall that an action of the Poisson Lie group $(G, \pi_G)$ on a Poisson manifold $P$ is said to be Poisson if the action map $G \times P \to P : (g, p) \mapsto gp$ is a Poisson map, where $G \times P$ is equipped with the product Poisson structure. When the action of $G$ on $P$ is transitive, the Poisson structure on $P$ is said to be $(G, \pi_G)$-homogeneous [D3]. The following theorem makes a connection between classical dynamical r-matrices and $(G, \pi_G)$-homogeneous Poisson structures on $G/H$.

Theorem 3.2. Let $r_0 = \frac{\varepsilon}{2}\Omega + \Lambda$ be any classical quasi-triangular r-matrix (not necessarily of zero-weight) with skew-symmetric part $\Lambda$. Let $r(\lambda) = \frac{\varepsilon}{2}\Omega + A_r(\lambda)$ be any classical dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$ as in Definition 2.1. For each value $\lambda$ such that $r(\lambda)$ is defined, define a bi-vector field $\tilde\pi_{r(\lambda)}$ on $G$ by
$$\tilde\pi_{r(\lambda)}(g) = R_g\Lambda - L_g A_r(\lambda), \qquad g \in G.$$
Let $\pi_{r(\lambda)} = p_*\tilde\pi_{r(\lambda)}$ be the projection of $\tilde\pi_{r(\lambda)}$ to $G/H$ by the map $p : G \to G/H : g \mapsto gH$. Then
1) $\pi_{r(\lambda)}$ is well-defined and it defines a Poisson structure on $G/H$;
2) Equip $G$ with the Poisson structure $\pi_G$ as defined by (6). Then $\pi_{r(\lambda)}$ is a $(G, \pi_G)$-homogeneous Poisson structure on $G/H$.
3) When $r_0$ has the zero-weight property, i.e., $r_0 \in (\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}}$, every $(G, \pi_G)$-homogeneous Poisson structure on $G/H$ arises this way.
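The rest of this section is devoted to the proof of this theorem. We first prove the first two parts.

Proof of 1) and 2) in Theorem 3.2. It follows from $A_r(\lambda) \in (\wedge^2\mathfrak{g})^{\mathfrak{h}}$ that $\pi_{r(\lambda)}$ is well-defined. To show that $\pi_{r(\lambda)}$ defines a Poisson structure on $G/H$, we calculate the Schouten bracket $[\pi_{r(\lambda)}, \pi_{r(\lambda)}]$ of $\pi_{r(\lambda)}$ with itself. Set $\Lambda^R(g) = R_g\Lambda$ and $A_r(\lambda)^L(g) = L_g A_r(\lambda)$. Then $\tilde\pi_{r(\lambda)} = \Lambda^R - A_r(\lambda)^L$. Hence
$$[\tilde\pi_{r(\lambda)}, \tilde\pi_{r(\lambda)}] = [\Lambda^R, \Lambda^R] - 2[\Lambda^R, A_r(\lambda)^L] + [A_r(\lambda)^L, A_r(\lambda)^L] = -[\Lambda, \Lambda]^R + [A_r(\lambda), A_r(\lambda)]^L = -2\,\mathrm{Alt}(dA_r(\lambda))^L \in (\mathfrak{h} \wedge \mathfrak{g} \wedge \mathfrak{g})^L,$$
where in the last step, we used Eqs. (3) and (5). This shows that $\tilde\pi_{r(\lambda)}$ is in general not a Poisson bi-vector field on $G$. However, for $\pi_{r(\lambda)} = p_*\tilde\pi_{r(\lambda)}$, we have
$$[\pi_{r(\lambda)}, \pi_{r(\lambda)}] = p_*[\tilde\pi_{r(\lambda)}, \tilde\pi_{r(\lambda)}] = -2p_*\mathrm{Alt}(dA_r(\lambda))^L = 0.$$
Therefore, $\pi_{r(\lambda)}$ is a Poisson structure on $G/H$. Now for any $g_1$ and $g_2 \in G$, we have
$$\tilde\pi_{r(\lambda)}(g_1 g_2) = R_{g_1 g_2}\Lambda - L_{g_1 g_2}A_r(\lambda) = L_{g_1}\big(R_{g_2}\Lambda - L_{g_2}A_r(\lambda)\big) + R_{g_2}\big(R_{g_1}\Lambda - L_{g_1}\Lambda\big) = L_{g_1}\tilde\pi_{r(\lambda)}(g_2) + R_{g_2}\pi_G(g_1).$$
Projecting $\tilde\pi_{r(\lambda)}$ to $\pi_{r(\lambda)}$, this says that the action map of $G$ on $G/H$ by left translations is a Poisson map. Thus $\pi_{r(\lambda)}$ is a $(G, \pi_G)$-homogeneous Poisson structure on $G/H$. This finishes the proof of 1) and 2) in Theorem 3.2.

We now prove 3) of Theorem 3.2. Assume that $r_0 \in (\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}}$. Then by Theorem 2.2, it must be of the form (4) for some choice $\Sigma_+$ of positive roots and some $\sum_{i,j} u_{ij}\,x_i \wedge x_j \in \mathfrak{h} \wedge \mathfrak{h}$. Let $e = eH$ be the base point of $G/H$. Recall [D3] that a $(G, \pi_G)$-homogeneous Poisson structure $\pi$ on $G/H$ is determined by its value $\pi(e)$ at $e$ in such a way that
$$\pi(gH) = L_g\pi(e) + p_*\pi_G(g). \tag{7}$$
Moreover, since $\pi_G(g) = 0$ for $g \in H$ (this is why we need the zero weight condition on $r_0$), we see that $\pi(e)$ is $H$-invariant, i.e., $\pi(e) \in (\wedge^2 T_e(G/H))^H \cong (\wedge^2(\mathfrak{g}/\mathfrak{h}))^H$. Let $\mathfrak{n}_+$ and $\mathfrak{n}_-$ be the nilpotent Lie subalgebras of $\mathfrak{g}$ spanned by the root vectors for the roots in $\Sigma_+$ and $-\Sigma_+$ respectively. Identify $\mathfrak{g}/\mathfrak{h} \cong \mathfrak{n}_- + \mathfrak{n}_+$.

Lemma 3.3. Write
$$\pi(e) = \sum_{\alpha \in \Sigma_+}\Big(\frac{\varepsilon}{2} - \varphi_\alpha\Big)E_\alpha \wedge E_{-\alpha} \in (\wedge^2(\mathfrak{g}/\mathfrak{h}))^H \tag{8}$$
and set $\varphi_{-\alpha} = -\varphi_\alpha$. Then the bi-vector field $\pi$ on $G/H$ defined by (7) is Poisson if and only if the function $\varphi : \Sigma \to \mathbb{C}$ satisfies
$$\varphi_\alpha\varphi_\beta + \varphi_\beta\varphi_\gamma + \varphi_\gamma\varphi_\alpha = -\frac{\varepsilon^2}{4}, \quad \text{whenever } \alpha, \beta, \gamma \in \Sigma \text{ and } \alpha + \beta + \gamma = 0. \tag{9}$$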
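Proof of Lemma 3.3. For any given $\pi(e)$ in the form of (8), set
$$A = \sum_{\alpha \in \Sigma_+}\varphi_\alpha\,E_\alpha \wedge E_{-\alpha} \in \wedge^2\mathfrak{g}$$
and introduce the following bi-vector field $\hat\pi$ on $G$: $\hat\pi(g) = R_g\Lambda - L_g A$. Then $\pi = p_*\hat\pi$, and hence $[\pi, \pi] = p_*[\hat\pi, \hat\pi]$. But as in the proof of 1) of Theorem 3.2, we have
$$[\hat\pi, \hat\pi] = [\Lambda^R, \Lambda^R] - 2[\Lambda^R, A^L] + [A^L, A^L] = -[\Lambda, \Lambda]^R + [A, A]^L.$$
Since $\Lambda$ satisfies the modified CYBE (5), by writing
$$B = [A, A] - \frac{\varepsilon^2}{2}[\Omega^{12}, \Omega^{23}] \in \wedge^3\mathfrak{g},$$
we see that $[\hat\pi, \hat\pi] = B^L$, the left invariant 3-vector field on $G$ with value $B$ at $e$. Thus $[\pi, \pi] = 0$ if and only if $B \in \mathfrak{h} \wedge \mathfrak{g} \wedge \mathfrak{g}$, or, if and only if
$$[A, A] = \frac{\varepsilon^2}{2}[\Omega^{12}, \Omega^{23}] \mod \mathfrak{h} \wedge \mathfrak{g} \wedge \mathfrak{g}.$$
A direct calculation shows that
$$[A, A] = \sum_{\alpha \in \Sigma}\varphi_\alpha^2\,h_\alpha \wedge E_\alpha \wedge E_{-\alpha} - 2\sum_{[(\alpha,\beta,\gamma)] \in \tilde\Sigma^3}(\varphi_\alpha\varphi_\beta + \varphi_\beta\varphi_\gamma + \varphi_\gamma\varphi_\alpha)\,N_{\alpha,\beta}\,E_\alpha \wedge E_\beta \wedge E_\gamma$$
and
$$[\Omega^{12}, \Omega^{23}] = \frac{1}{2}\sum_{\alpha \in \Sigma}h_\alpha \wedge E_\alpha \wedge E_{-\alpha} + \sum_{[(\alpha,\beta,\gamma)] \in \tilde\Sigma^3}N_{\alpha,\beta}\,E_\alpha \wedge E_\beta \wedge E_\gamma,$$
where $h_\alpha = [E_\alpha, E_{-\alpha}] \in \mathfrak{h}$, $[E_\alpha, E_\beta] = N_{\alpha,\beta}E_{\alpha+\beta}$ when $\alpha, \beta \in \Sigma$ and $\alpha + \beta \in \Sigma$, and the summation over $[(\alpha, \beta, \gamma)] \in \tilde\Sigma^3$ means that the summation index runs over all triples $(\alpha, \beta, \gamma) \in \Sigma^3$ such that $\alpha + \beta + \gamma = 0$ but two such triples are considered the same if they only differ by a reordering of the three roots. It then follows immediately that $\pi$ is a Poisson structure on $G/H$ if and only if Condition (9) is satisfied. This finishes the proof of Lemma 3.3.

It now remains to classify all odd functions $\varphi$ on $\Sigma$ such that Condition (9) is satisfied. Note that the Weyl group $W$ for $(\mathfrak{g}, \mathfrak{h})$ acts on the set of such functions by $(w \cdot \varphi)_\alpha := \varphi_{w\alpha}$. We say that two such functions $\varphi$ and $\psi$ are $W$-related if $\psi = w \cdot \varphi$ for some $w \in W$.

Notation 3.4. Let $S(\Sigma_+)$ be the set of simple roots in $\Sigma_+$. For a subset $X$ of $S(\Sigma_+)$, we will use $[X]$ to denote the set of roots in $\Sigma$ that are in the linear span of $X$. Also set $\mathfrak{h}_X = \mathrm{span}_{\mathbb{C}}\{h_\gamma = [E_\gamma, E_{-\gamma}] : \gamma \in X\}$.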
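Lemma 3.5. For any $X \subset S(\Sigma_+)$ and $h \in \mathfrak{h}_X$ such that $\alpha(h) \notin \pi i\mathbb{Z}$ for any $\alpha \in [X]$, where $\pi = 3.14159\ldots$ (we hope that there is no confusion between this notation of $\pi = 3.14159\ldots$ and $\pi$ as a Poisson structure), and $\mathbb{Z}$ is the set of integers, define $\varphi : \Sigma \to \mathbb{C}$ by
$$\varphi_\alpha = \begin{cases}\frac{\varepsilon}{2}\coth\alpha(h), & \alpha \in [X],\\[2pt] \frac{\varepsilon}{2}, & \alpha \in \Sigma_+\backslash[X],\\[2pt] -\frac{\varepsilon}{2}, & \alpha \in -(\Sigma_+\backslash[X]).\end{cases}$$
Then (1) $\varphi$ satisfies Condition (9); (2) Any odd function $\varphi : \Sigma \to \mathbb{C}$ satisfying Condition (9) is $W$-related to one obtained this way.

Proof. (1) can be checked directly. We only show (2). Suppose that $\varphi : \Sigma \to \mathbb{C}$ satisfies Condition (9). Set $Y = \{\alpha \in \Sigma : \varphi_\alpha = \frac{\varepsilon}{2}\}$. Then because of (9), $Y$ has two properties:
(A) If $\alpha, \beta \in Y$ and $\alpha + \beta \in \Sigma$, then $\alpha + \beta \in Y$;
(B) If $\alpha \in Y$, then $-\alpha \notin Y$.
It follows [E-V] that there exists a choice of positive roots $\Sigma_+'$ such that $Y \subset \Sigma_+'$. Since there exists $w \in W$ such that $w\Sigma_+' = \Sigma_+$, by considering $w \cdot \varphi$ instead of $\varphi$, we can assume that $\Sigma_+' = \Sigma_+$. Set $X = S(\Sigma_+) \cap (\Sigma_+\backslash Y)$. Since Condition (9) implies that $Y$ has the additional property:
(C) If $\alpha \in Y$, $\beta \in \Sigma\backslash(-Y)$ are such that $\alpha + \beta \in \Sigma$, then $\alpha + \beta \in Y$,
we claim that $\Sigma_+ = ([X] \cap \Sigma_+) \cup Y$ is a disjoint union.

Indeed, suppose that $\alpha \in [X] \cap \Sigma_+$. We first use induction on the height $\mathrm{ht}(\alpha)$ of $\alpha$ with respect to $S(\Sigma_+)$ to show that $\alpha \notin Y$. If $\mathrm{ht}(\alpha) = 1$, then $\alpha$ is simple, so $\alpha \notin Y$ by definition. Suppose that $\mathrm{ht}(\alpha) = k$. We can [Se] write $\alpha$ as $\alpha = \alpha_1 + \cdots + \alpha_k$ such that each $\alpha_j$ is in $X$ and that each $\alpha_1 + \cdots + \alpha_j$ is a root, for $j = 1, \ldots, k$. Set $\alpha' = \alpha_1 + \cdots + \alpha_{k-1}$. By induction assumption, $\alpha' \notin Y$. If $\alpha \in Y$, then we know by (C) that $\alpha_k = \alpha - \alpha' \in Y$ which is a contradiction. Thus $\alpha \notin Y$. This shows that $([X] \cap \Sigma_+) \cap Y = \emptyset$.

Next, suppose that $\alpha \in \Sigma_+\backslash Y$. We use induction on $\mathrm{ht}(\alpha)$ again to show that $\alpha \in [X]$. If $\mathrm{ht}(\alpha) = 1$, then $\alpha \in X \subset [X]$ by the definition of $X$. Suppose that $\mathrm{ht}(\alpha) = k$. Write $\alpha$ as $\alpha = \alpha' + \alpha_k$, where $\alpha' \in \Sigma_+$ and $\alpha_k$ is a simple root. If $\alpha_k \in Y$, then by (C), we have $-\alpha' = \alpha_k - \alpha \in Y$ which is absurd. Thus $\alpha_k \notin Y$, so $\alpha_k \in X$. If $\alpha' \in Y$, then again by (C), we have $-\alpha_k = \alpha' - \alpha \in Y$ which is also absurd, so $\alpha' \notin Y$. By induction assumption, $\alpha' \in [X]$. Thus $\alpha \in [X]$. Hence we have shown that $\Sigma_+ = ([X] \cap \Sigma_+) \cup Y$ is a disjoint union.

For $\gamma \in X$, since $\varphi_\gamma \neq \pm\frac{\varepsilon}{2}$, there exists $\lambda_\gamma \in \mathbb{C}$, $\lambda_\gamma \notin \pi i\mathbb{Z}$, such that $\varphi_\gamma = \frac{\varepsilon}{2}\coth\lambda_\gamma$. Choose $h \in \mathfrak{h}_X$ such that $\gamma(h) = \lambda_\gamma$ for every $\gamma \in X$. We now show that $\alpha(h) \notin \pi i\mathbb{Z}$ and that $\varphi_\alpha = \frac{\varepsilon}{2}\coth\alpha(h)$ for all $\alpha \in [X] \cap \Sigma_+$ by using induction on the height $\mathrm{ht}(\alpha)$. This is true when $\mathrm{ht}(\alpha) = 1$. Suppose that $\mathrm{ht}(\alpha) = k$. As before, write $\alpha = \alpha' + \alpha_k$, where $\alpha' \in [X] \cap \Sigma_+$, $\mathrm{ht}(\alpha') = k - 1$, and $\alpha_k \in X$. Then by induction assumption, $\alpha'(h) \notin \pi i\mathbb{Z}$ and $\varphi_{\alpha'} = \frac{\varepsilon}{2}\coth\alpha'(h)$. By Condition (9),
$$-\varphi_\alpha(\varphi_{\alpha'} + \varphi_{\alpha_k}) = -\frac{\varepsilon^2}{4} - \varphi_{\alpha'}\varphi_{\alpha_k}.$$
If $\varphi_{\alpha'} + \varphi_{\alpha_k} = 0$, we would have $\varphi_{\alpha'}\varphi_{\alpha_k} = -\frac{\varepsilon^2}{4}$ and thus $\varphi_{\alpha'} = \pm\frac{\varepsilon}{2}$ and $\varphi_{\alpha_k} = \mp\frac{\varepsilon}{2}$. This is not possible since $([X] \cap \Sigma_+) \cap Y = \emptyset$. Thus $\varphi_{\alpha'} + \varphi_{\alpha_k} \neq 0$, so $\alpha(h) = \alpha'(h) + \alpha_k(h) \notin \pi i\mathbb{Z}$, and
$$\varphi_\alpha = \frac{\frac{\varepsilon^2}{4} + \varphi_{\alpha'}\varphi_{\alpha_k}}{\varphi_{\alpha'} + \varphi_{\alpha_k}} = \frac{\varepsilon}{2}\coth\alpha(h).$$
□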
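Part (1) of Lemma 3.5 is stated as something that "can be checked directly". A small numerical sketch of that check for the root system of $\mathfrak{sl}(3)$ is given below. It is an illustration under our own assumptions rather than part of the proof: roots are encoded as integer coefficient pairs in the basis of simple roots, the values of $\varepsilon$ and of $\alpha_i(h)$ are arbitrary samples with $\alpha(h) \notin \pi i\mathbb{Z}$, and all function names are hypothetical.

```python
import cmath
from itertools import product

EPS = 0.8 + 0.3j  # sample coupling constant (an assumption)

# Roots of sl(3), written as coefficients in the simple roots (alpha_1, alpha_2).
POSITIVE = [(1, 0), (0, 1), (1, 1)]
ROOTS = POSITIVE + [(-a, -b) for a, b in POSITIVE]


def coth(x):
    return cmath.cosh(x) / cmath.sinh(x)


def make_phi(X, h_values):
    """Build phi as in Lemma 3.5.

    X: set of simple-root indices (subset of {0, 1});
    h_values: dict index -> alpha_i(h), used only for indices in X.
    """
    def in_span(root):
        # root lies in [X] when its coefficients outside X vanish
        return all(coef == 0 for i, coef in enumerate(root) if i not in X)

    def phi(root):
        if in_span(root):
            alpha_h = sum(coef * h_values[i] for i, coef in enumerate(root) if i in X)
            return EPS / 2 * coth(alpha_h)
        return EPS / 2 if root in POSITIVE else -EPS / 2

    return phi


def max_residual(phi):
    """Largest |phi_a phi_b + phi_b phi_c + phi_c phi_a + eps^2/4| over a+b+c=0."""
    worst = 0.0
    for a, b in product(ROOTS, repeat=2):
        c = (-a[0] - b[0], -a[1] - b[1])
        if c in ROOTS:
            value = phi(a) * phi(b) + phi(b) * phi(c) + phi(c) * phi(a) + EPS ** 2 / 4
            worst = max(worst, abs(value))
    return worst


if __name__ == "__main__":
    h = {0: 0.4 + 0.2j, 1: -0.7 + 0.5j}  # sample values of alpha_i(h)
    for X in (set(), {0}, {1}, {0, 1}):
        print(sorted(X), max_residual(make_phi(X, h)))
```

For each choice of $X$ the residual of Condition (9) should be at rounding level. The underlying reason is the addition formula $\coth(a + b) = (\coth a\,\coth b + 1)/(\coth a + \coth b)$, which is also what drives the induction step at the end of the proof.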
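We now continue with the proof of (3) of Theorem 3.2. Let $\pi$ be a $(G, \pi_G)$-homogeneous Poisson structure on $G/H$. Then by Lemmas 3.3 and 3.5, there exist a choice $\Sigma_+'$ of positive roots, a subset $X'$ of the set of simple roots in $\Sigma_+'$, and an element $\lambda_0 \in \mathfrak{h}^*$ such that $\pi = \pi_{r_{X'}(\lambda_0)}$, where
$$r_{X'}(\lambda) = \frac{\varepsilon}{2}\Omega + \frac{\varepsilon}{2}\sum_{\alpha \in [X'] \cap \Sigma_+'}\coth\Big(\frac{\varepsilon}{2}\langle\alpha, \lambda\rangle\Big)E_\alpha \wedge E_{-\alpha} + \frac{\varepsilon}{2}\sum_{\alpha \in \Sigma_+'\backslash[X']}E_\alpha \wedge E_{-\alpha} \tag{10}$$
is a classical dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$. This proves part (3) of Theorem 3.2. □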
Remark 3.6. For any Lie subalgebra $\mathfrak{h}'$ of $\mathfrak{g}$, one can define classical dynamical r-matrices for the pair $(\mathfrak{g}, \mathfrak{h}')$. It is clear from the proof that 1) and 2) in Theorem 3.2 still hold when $H$ is replaced by any closed subgroup $H'$ of $G$ and when $r$ is a classical dynamical r-matrix for $(\mathfrak{g}, \mathfrak{h}')$ with $\mathfrak{h}'$ being the Lie algebra of $H'$. For 3), assume that $r_0 \in (\mathfrak{g} \otimes \mathfrak{g})^{\mathfrak{h}}$ and that $H'$ is a subgroup of $H$. Let $\pi$ be a $(G, \pi_G)$-homogeneous Poisson structure on $G/H'$. Consider again
$$\pi(eH') \in \big(\wedge^2 T_{eH'}(G/H')\big)^{H'} \cong \big(\wedge^2(\mathfrak{g}/\mathfrak{h}')\big)^{H'}.$$
By picking an $H'$-invariant complement $\mathfrak{h}_0$ of $\mathfrak{h}'$ in $\mathfrak{g}$, and by identifying $\mathfrak{g}/\mathfrak{h}' \cong \mathfrak{h}_0$, we can consider
$$A = \pi(eH') \in (\wedge^2\mathfrak{h}_0)^{H'} \subset \wedge^2\mathfrak{g}.$$
The discussions in the proof of 3) of Theorem 3.2 show that
$$\mathrm{CYB}\Big(\frac{\varepsilon}{2}\Omega + A\Big) \in \mathfrak{h}' \wedge \mathfrak{g} \wedge \mathfrak{g}.$$
In [Sc], Schiffmann shows that if $\mathfrak{h}'$ contains a regular semi-simple element, then under certain conditions on $A$, there is a classical dynamical r-matrix $r$ for $(\mathfrak{g}, \mathfrak{h}')$ such that $A$ is the skew-symmetric part of $r(0)$. Thus $\pi$ comes from $r$ in our sense. We thank the referee for pointing this out.
3.2. The Poisson structures $\pi_{r_X(\lambda)}$ on $G/H$. In this section, we consider in more detail the case when the Poisson structure on $G$ is defined by a classical quasi-triangular r-matrix $r_0$ with the zero weight property. In other words, we fix a choice $\Sigma_+$ of positive roots, and consider $r_0$ of the form
$$r_0 = \frac{\varepsilon}{2}\Omega + \sum_{i,j} c_{ij}\,x_i \wedge x_j + \frac{\varepsilon}{2}\sum_{\alpha \in \Sigma_+}E_\alpha \wedge E_{-\alpha}, \tag{11}$$
where $\sum_{i,j} c_{ij}\,x_i \wedge x_j \in \mathfrak{h} \wedge \mathfrak{h}$. When $\sum_{i,j} c_{ij}\,x_i \wedge x_j = 0$, the corresponding $r_0$ is often called the standard r-matrix. The corresponding Poisson structure $\pi_G$ on $G$ is the semi-classical limit of the quantum group corresponding to $G$ [D2]. For $X \subset S(\Sigma_+)$, set
$$r_X(\lambda) = \frac{\varepsilon}{2}\Omega + \frac{\varepsilon}{2}\sum_{\alpha \in [X] \cap \Sigma_+}\coth\Big(\frac{\varepsilon}{2}\langle\alpha, \lambda\rangle\Big)E_\alpha \wedge E_{-\alpha} + \frac{\varepsilon}{2}\sum_{\alpha \in \Sigma_+\backslash[X]}E_\alpha \wedge E_{-\alpha}. \tag{12}$$
Clearly, the domain $D(r_X)$ of $r_X$ consists of those $\lambda \in \mathfrak{h}^*$ such that $\langle\lambda, \alpha\rangle \notin \frac{2\pi i}{\varepsilon}\mathbb{Z}$ for all $\alpha \in [X]$. For each such $\lambda$, we have the $(G, \pi_G)$-homogeneous Poisson structure $\pi_{r_X(\lambda)}$ on $G/H$: let $p_*\pi_G$ be the projection to $G/H$ of $\pi_G$ by $p : G \to G/H : g \mapsto gH$. Then
$$\pi_{r_X(\lambda)} = p_*\pi_G + \Big(\sum_{\alpha \in [X] \cap \Sigma_+}\frac{\varepsilon}{1 - e^{\varepsilon\langle\alpha, \lambda\rangle}}\,E_\alpha \wedge E_{-\alpha}\Big)^L,$$
where the second term on the right hand side is the $G$-invariant bi-vector field on $G/H$ whose value at $e = eH$ is the expression given in the parenthesis.

Theorem 3.7. With the Poisson structure $\pi_G$ on $G$ defined by $r_0$ in (11), every holomorphic $(G, \pi_G)$-homogeneous Poisson structure on $G/H$ is isomorphic, via a $G$-equivariant diffeomorphism, to a $\pi_{r_X(\lambda)}$ for some subset $X \subset S(\Sigma_+)$ and $\lambda \in D(r_X)$, where $r_X$ is given in (12).

Proof. Let $\pi$ be a $(G, \pi_G)$-homogeneous Poisson structure on $G/H$. By Theorem 3.2, we know that there exists a choice $\Sigma_+'$ of positive roots and a subset $X'$ of the set of simple roots in $\Sigma_+'$ such that $\pi = \pi_{r_{X'}(\lambda_0)}$ for some $\lambda_0 \in \mathfrak{h}^*$, where $r_{X'}$ is the classical dynamical r-matrix given by (10). Let $\Lambda = r_0 - \frac{\varepsilon}{2}\Omega$ and let $A_{X'}(\lambda_0)$ be the skew-symmetric part of $r_{X'}(\lambda_0)$. Then recall from Sect. 3 that $\pi = p_*\hat\pi$, where $\hat\pi$ is the bi-vector field on $G$ given by
$$\hat\pi(g) = R_g\Lambda - L_g A_{X'}(\lambda_0), \qquad g \in G.$$
Pick $w \in W$ such that $w\Sigma_+' = \Sigma_+$. Set $X = wX'$. Let $\dot w$ be a representative of $w$ in $G$. We will use $R_{\dot w^{-1}}$ to denote the right translation on $G$ by $\dot w^{-1}$ as well as the induced diffeomorphism on $G/H$. Then for any $g \in G$,
$$R_{\dot w^{-1}}\hat\pi(g) = R_{\dot w^{-1}}R_g\Lambda - L_g L_{\dot w^{-1}}\mathrm{Ad}_{\dot w}A_{X'}(\lambda_0) = R_{g\dot w^{-1}}\Lambda - L_{g\dot w^{-1}}A_X(w\lambda_0),$$
where $A_X$ is the skew-symmetric part of the r-matrix $r_X$ given by (12). It follows from the definition of $\pi_{r_X(w\lambda_0)}$ that $\pi = R_{\dot w}\pi_{r_X(w\lambda_0)}$. The map $R_{\dot w} : G/H \to G/H$ is $G$-equivariant. □
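The coefficient $\varepsilon/(1 - e^{\varepsilon\langle\alpha,\lambda\rangle})$ in the formula for $\pi_{r_X(\lambda)}$ in Sect. 3.2 can be traced back to (11) and (12): on $[X] \cap \Sigma_+$ the skew parts of $r_0$ and $r_X(\lambda)$ differ by $\frac{\varepsilon}{2} - \frac{\varepsilon}{2}\coth(\frac{\varepsilon}{2}\langle\alpha,\lambda\rangle)$, while the $\mathfrak{h} \wedge \mathfrak{h}$ term projects to zero on $G/H$. The sketch below checks numerically that the two scalar expressions agree. It is a hedged illustration only; the function names and the sample values of $\varepsilon$ and of the pairing are our assumptions.

```python
import cmath


def coth(x):
    return cmath.cosh(x) / cmath.sinh(x)


def coefficient_from_difference(eps, pairing):
    """eps/2 - (eps/2) coth((eps/2) <alpha, lambda>): difference of skew parts."""
    return eps / 2 - eps / 2 * coth(eps / 2 * pairing)


def coefficient_closed_form(eps, pairing):
    """eps / (1 - exp(eps <alpha, lambda>)), as in the formula for pi_{r_X(lambda)}."""
    return eps / (1 - cmath.exp(eps * pairing))


if __name__ == "__main__":
    eps, pairing = 0.8 + 0.3j, 0.5 - 1.1j  # sample values inside the domain D(r_X)
    print(coefficient_from_difference(eps, pairing))
    print(coefficient_closed_form(eps, pairing))
```

Both lines should print the same complex number up to rounding, which is the elementary identity $1 - \coth(t/2) = 2/(1 - e^{t})$ with $t = \varepsilon\langle\alpha,\lambda\rangle$.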
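3.3. Comparison with Karolinsky's classification. When $\sum_{ij} c_{ij}\,x_i \wedge x_j = 0$ in the definition of $r_0$, all $(G, \pi_G)$-homogeneous Poisson structures on $G/H$ have been classified by Karolinsky [Ka3] by using Drinfeld's theorem on Poisson homogeneous spaces. We now look at the Poisson structures $\pi_{r_X(\lambda)}$ on $G/H$ in terms of Karolinsky's classification.

Recall that the double Lie algebra associated to the Poisson Lie group $(G, \pi_G)$ can be identified with the direct sum Lie algebra $\mathfrak{d} = \mathfrak{g} + \mathfrak{g}$ equipped with the ad-invariant non-degenerate scalar product given by
$$\langle(x_1, x_2), (y_1, y_2)\rangle = \frac{1}{\varepsilon}\big(\langle x_2, y_2\rangle - \langle x_1, y_1\rangle\big).$$
The Lie algebra $\mathfrak{g}$ is identified with the diagonal of $\mathfrak{d}$, and the Lie algebra $\mathfrak{g}^*$ is identified with the subspace $\mathfrak{g}^* \cong \{(x_-, x_+) : x_\pm \in \mathfrak{b}_\pm,\ (x_-)_{\mathfrak{h}} + (x_+)_{\mathfrak{h}} = 0\}$. Here, $\mathfrak{b}_\pm = \mathfrak{h} + \mathfrak{n}_\pm$ and $(x_\pm)_{\mathfrak{h}} \in \mathfrak{h}$ is the $\mathfrak{h}$-component of $x_\pm$. A theorem of Drinfeld [D3] says that $(G, \pi_G)$-homogeneous Poisson structures on $G/H$ correspond to Lagrangian (with respect to the scalar product $\langle\ ,\ \rangle$) subalgebras $\mathfrak{l}$ of the double $\mathfrak{d} \cong \mathfrak{g} + \mathfrak{g}$ such that $\mathfrak{l} \cap \mathfrak{g} = \mathfrak{h}$.

Theorem 3.8 (Karolinsky [Ka3]). Lagrangian subalgebras $\mathfrak{l}$ of $\mathfrak{g} + \mathfrak{g}$ such that $\mathfrak{l} \cap \mathfrak{g} = \mathfrak{h}$ are in 1-1 correspondence with triples $(\mathfrak{p}, \mathfrak{p}', \eta)$, where $\mathfrak{p}$ and $\mathfrak{p}'$ are parabolic subalgebras of $\mathfrak{g}$ such that $\mathfrak{q} = \mathfrak{p} \cap \mathfrak{p}'$ is the Levi subalgebra, $\mathfrak{h} \subset \mathfrak{q}$, and $\eta$ is an interior orthogonal automorphism of $\mathfrak{q}$ with $\mathfrak{q}^\eta = \mathfrak{h}$. If $(\mathfrak{p}, \mathfrak{p}', \eta)$ is such a triple, the corresponding subalgebra $\mathfrak{l}$ of $\mathfrak{g} + \mathfrak{g}$ is $\mathfrak{l} = \{(x', x) \in \mathfrak{p}' \times \mathfrak{p} : \eta(x'_{\mathfrak{q}}) = x_{\mathfrak{q}}\}$, where $x_{\mathfrak{q}} \in \mathfrak{q}$ (resp. $x'_{\mathfrak{q}} \in \mathfrak{q}'$) is the projection of $x$ (resp. $x'$) to $\mathfrak{q}$ with respect to the Levi decomposition of $\mathfrak{p}$ (resp. $\mathfrak{p}'$).

For a $(G, \pi_G)$-homogeneous Poisson structure $\pi$ on $G/H$, the Lagrangian subalgebra $\mathfrak{l}_{\pi(e)}$ of $\mathfrak{g} + \mathfrak{g}$ is by definition [D3]
$$\mathfrak{l}_{\pi(e)} = \{x + \xi : x \in \mathfrak{g},\ \xi \in \mathfrak{g}^*,\ \xi|_{\mathfrak{h}} = 0,\ \text{and } \xi \,\lrcorner\, \pi(e) = x + \mathfrak{h}\}.$$
For $\pi(e)$ of the form $\pi(e) = \sum_{\alpha \in \Sigma_+}(\frac{\varepsilon}{2} - \varphi_\alpha)E_\alpha \wedge E_{-\alpha}$, it is an easy calculation to see that
$$\mathfrak{l}_{\pi(e)} = \mathfrak{h} + \mathrm{span}_{\mathbb{C}}\{\xi_\alpha : \alpha \in \Sigma\},$$
where for $\alpha \in \Sigma$,
$$\xi_\alpha = \Big(\big(\varphi_\alpha - \tfrac{\varepsilon}{2}\big)E_\alpha,\ \big(\varphi_\alpha + \tfrac{\varepsilon}{2}\big)E_\alpha\Big) \in \mathfrak{g} + \mathfrak{g}.$$
Thus, for the Poisson structure $\pi_{r_X(\lambda)}$ on $G/H$, we have
$$\xi_\alpha = \begin{cases}(-\varepsilon E_\alpha,\ 0) & \text{if } \alpha \in -Y,\\[2pt] \dfrac{\varepsilon}{e^{\varepsilon\langle\alpha,\lambda\rangle} - 1}\big(E_\alpha,\ e^{\varepsilon\langle\alpha,\lambda\rangle}E_\alpha\big) & \text{if } \alpha \in [X],\\[2pt] (0,\ \varepsilon E_\alpha) & \text{if } \alpha \in Y,\end{cases}$$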
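where $Y = \Sigma_+ \setminus [X]$. Let
\[
\mathfrak{p}_X = \mathfrak{h} + \mathrm{span}_{\mathbb{C}} \{ E_\alpha : \alpha \in [X] \cup Y \}
\]
be the parabolic subalgebra of $\mathfrak{g}$ defined by $X$, and let
\[
\mathfrak{p}'_X = \mathfrak{h} + \mathrm{span}_{\mathbb{C}} \{ E_\alpha : \alpha \in [X] \cup (-Y) \}
\]
be its opposite parabolic subalgebra. Set
\[
\mathfrak{m}_X = \mathfrak{h} + \mathrm{span}_{\mathbb{C}} \{ E_\alpha : \alpha \in [X] \} \tag{13}
\]
so that $\mathfrak{m}_X = \mathfrak{p}_X \cap \mathfrak{p}'_X$. Let $\eta$ be the interior automorphism of $\mathfrak{m}_X$ given by $\mathrm{Ad}_{e^{\varepsilon h_\lambda}}$, where $h_\lambda \in \mathfrak{h}$ corresponds to $\lambda \in \mathfrak{h}^*$ under the Killing form. Then the triple $(\mathfrak{p}_X, \mathfrak{p}'_X, \eta)$ is the one corresponding to the Poisson structure $\pi_{r_X(\lambda)}$ in the Karolinsky classification.

4. r-Matrices and Homogeneous Poisson Structures on K/T

We pick a compact real form $\mathfrak{k}$ of $\mathfrak{g}$ as follows: For each $\alpha \in \Sigma_+$, set
\[
X_\alpha = E_\alpha - E_{-\alpha}, \qquad Y_\alpha = i(E_\alpha + E_{-\alpha})
\]
and $h_\alpha = [E_\alpha, E_{-\alpha}]$. Then the real subspace
\[
\mathfrak{k} = \mathrm{span}_{\mathbb{R}} \{ i h_\alpha, X_\alpha, Y_\alpha : \alpha \in \Sigma_+ \}
\]
is a compact real form of $\mathfrak{g}$. Set $\mathfrak{t} = \mathrm{span}_{\mathbb{R}} \{ i h_\alpha : \alpha \in \Sigma \} \subset \mathfrak{k}$. Let $K$ and $T \subset K$ be respectively the connected compact subgroups of $G$ with Lie algebras $\mathfrak{k}$ and $\mathfrak{t}$. It is well-known [So] that every Poisson structure $\pi_K$ on $K$ such that $(K, \pi_K)$ is a Poisson Lie group is of the form
\[
\pi_K(k) = R_k \Lambda - L_k \Lambda, \tag{14}
\]
where
\[
\Lambda = u - \frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+} \frac{X_\alpha \wedge Y_\alpha}{2} \in \mathfrak{k} \wedge \mathfrak{k} \tag{15}
\]
for some $u \in \mathfrak{t} \wedge \mathfrak{t}$, an imaginary complex number $\varepsilon$ and a choice $\Sigma_+$ of positive roots. In this section, we will show how $(K, \pi_K)$-homogeneous Poisson structures on $K/T$ are related to classical dynamical r-matrices. We remark again that one classification of all $(K, \pi_K)$-homogeneous Poisson spaces (by the corresponding Lagrangian Lie subalgebras) has been given by Karolinsky [Ka2].

If we regard $\wedge \mathfrak{g}$ as a real vector space, then
\[
\wedge \mathfrak{k} \longrightarrow \wedge \mathfrak{g} : \ \wedge^l \mathfrak{k} \ni x_1 \wedge \cdots \wedge x_l \longmapsto x_1 \wedge \cdots \wedge x_l \in \wedge^l \mathfrak{g}
\]
is an embedding of $\wedge \mathfrak{k}$ into $\wedge \mathfrak{g}$ as a real subspace. This embedding also preserves the Schouten bracket. Thus, for $A \in \wedge^2 \mathfrak{k}$ of the form
\[
A = \sum_{\alpha \in \Sigma_+} a_\alpha \frac{X_\alpha \wedge Y_\alpha}{2}, \qquad a_\alpha \in \mathbb{R} \ \text{ for } \alpha \in \Sigma_+,
\]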
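we can calculate $[A, A] \in \wedge^3 \mathfrak{k}$ by first writing $A = \sum_{\alpha \in \Sigma_+} i a_\alpha E_\alpha \wedge E_{-\alpha} \in \wedge^2 \mathfrak{g}$ and calculating $[A, A]$ inside $\wedge \mathfrak{g}$. Indeed, as in the proof of Lemma 3.3, in $\wedge^3 \mathfrak{g}$ we have
\[
[A, A] = \frac{1}{2} \sum_{\alpha \in \Sigma_+} a_\alpha^2\, (i h_\alpha \wedge X_\alpha \wedge Y_\alpha) + 2 \sum_{[(\alpha, \beta, \gamma)] \in \tilde{\Sigma}^3} (a_\alpha a_\beta + a_\beta a_\gamma + a_\gamma a_\alpha) N_{\alpha, \beta}\, E_\alpha \wedge E_\beta \wedge E_\gamma. \tag{16}
\]
Clearly, $i h_\alpha \wedge E_\alpha \wedge E_{-\alpha} \in \wedge^3 \mathfrak{k}$ for each $\alpha \in \Sigma_+$. Suppose that $(\alpha, \beta, \gamma) \in \Sigma^3$ are such that $\alpha + \beta + \gamma = 0$. Without loss of generality, we can assume that $\alpha, \beta \in \Sigma_+$ and $\gamma \in -\Sigma_+$. Then
\[
N_{\alpha, \beta} E_\alpha \wedge E_\beta \wedge E_\gamma + N_{-\alpha, -\beta} E_{-\alpha} \wedge E_{-\beta} \wedge E_{-\gamma} = N_{\alpha, \beta} (E_\alpha \wedge E_\beta \wedge E_\gamma - E_{-\alpha} \wedge E_{-\beta} \wedge E_{-\gamma}).
\]
This element is in $\wedge^3 \mathfrak{k}$ because it is fixed by $\theta \in \mathrm{End}_{\mathbb{R}}(\wedge^3 \mathfrak{g})$ defined by
\[
\theta(x_1 \wedge x_2 \wedge x_3) = \theta(x_1) \wedge \theta(x_2) \wedge \theta(x_3), \qquad x_1, x_2, x_3 \in \mathfrak{g},
\]
where $\theta \in \mathrm{End}_{\mathbb{R}}(\mathfrak{g})$ is the complex conjugation of $\mathfrak{g}$ defined by $\mathfrak{k}$. The right hand side of (16) is thus the Schouten bracket of $A$ with itself inside $\wedge \mathfrak{k}$.

Now suppose that $r$ is a classical dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$ as given in Theorem 2.2. Suppose that $\lambda \in \mathfrak{h}^*$ is in the domain of $r$ such that the skew-symmetric part $A_r(\lambda) = r(\lambda) - \frac{\varepsilon}{2}\Omega$ of $r(\lambda)$ lies in $\wedge^2 \mathfrak{k}$. Then
\[
[A_r(\lambda), A_r(\lambda)] - [\Lambda, \Lambda] \in (\wedge^3 \mathfrak{k}) \cap (\mathfrak{h} \wedge \mathfrak{k} \wedge \mathfrak{k}) = \mathfrak{t} \wedge \mathfrak{k} \wedge \mathfrak{k}.
\]
By abuse of notation, we still use $\tilde{\pi}_{r(\lambda)}$ (already used in Theorem 3.2) to denote the bi-vector field on $K$ given by
\[
\tilde{\pi}_{r(\lambda)}(k) = R_k \Lambda - L_k A_r(\lambda), \qquad k \in K,
\]
where $R_k$ and $L_k$ are respectively the right and left translations on $K$ by $k$. We use $\pi_{r(\lambda)}$ to denote the projection of $\tilde{\pi}_{r(\lambda)}$ to $K/T$ by the map $p : K \to K/T : k \mapsto kT$.

Theorem 4.1. Let $r$ be any classical dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$ given in Theorem 2.2. Suppose that $\lambda \in \mathfrak{h}^*$ is in the domain of $r$ such that $A_r(\lambda) = r(\lambda) - \frac{\varepsilon}{2}\Omega$ is in $\wedge^2 \mathfrak{k}$. Then,
1) the bi-vector field $\pi_{r(\lambda)}$ on $K/T$ defines a $(K, \pi_K)$-homogeneous Poisson structure on $K/T$;
2) with the Poisson structure $\pi_K$ on $K$ given by (14), every $(K, \pi_K)$-homogeneous Poisson structure on $K/T$ arises this way.

Proof. The proof of 1) is similar to that of Theorem 3.2. We prove 2). Assume that $\pi$ is a $(K, \pi_K)$-homogeneous Poisson structure on $K/T$. Since $\pi$ is $T$-invariant, we can write
\[
\pi(e) = \sum_{\alpha \in \Sigma_+} \Bigl( -\frac{i\varepsilon}{2} + i\phi_\alpha \Bigr) \frac{X_\alpha \wedge Y_\alpha}{2} \in \wedge^2 (\mathfrak{k}/\mathfrak{t}),
\]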
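where $e = eT \in K/T$ and $\phi_\alpha \in i\mathbb{R}$ for each $\alpha \in \Sigma_+$. (Recall that $\varepsilon \in i\mathbb{R}$ is fixed at the beginning.) Set $\phi_{-\alpha} = -\phi_\alpha$ for $\alpha \in \Sigma_+$. Using the same trick for calculating the Schouten bracket in $\wedge \mathfrak{k}$, i.e., by embedding $\wedge \mathfrak{k}$ into $\wedge \mathfrak{g}$, and by using arguments similar to those in the proof of Lemma 3.3, we know that the $\phi_\alpha$'s must satisfy Condition (9). Exactly as in the proof of the second part of Theorem 3.2, we know that there exist a choice of positive roots $\Sigma'_+$, a choice of a subset $X'$ of the set of simple roots for $\Sigma'_+$, and some (not necessarily unique) $\lambda' \in \mathfrak{h}^*$ such that
\[
\phi_\alpha =
\begin{cases}
\dfrac{\varepsilon}{2} \coth \dfrac{\varepsilon}{2} \ll \alpha, \lambda' \gg & \text{if } \alpha \in [X'], \\[4pt]
\pm \dfrac{\varepsilon}{2} & \text{if } \alpha \in \pm(\Sigma'_+ \setminus [X']).
\end{cases}
\]
Letting $r$ be the classical dynamical r-matrix for the pair $(\mathfrak{g}, \mathfrak{h})$ defined by $\Sigma'_+$ and $X'$ as in Theorem 2.2 ($\mu = 0$ and $C = 0$), we see that $\pi$ coincides with the Poisson structure $\pi_{r(\lambda')}$ on $K/T$. $\square$

5. The Poisson Structures $\pi_{X,X_1,\lambda}$ on K/T

5.1. Definition. As in the case for $G/H$, we will single out a family of $(K, \pi_K)$-homogeneous Poisson structures on $K/T$ which exhausts all such Poisson structures on $K/T$ up to $K$-equivariant isomorphisms.

For a subset $X \subset S(\Sigma_+)$, set
\[
\mathfrak{a}_X = \mathrm{span}_{\mathbb{R}} \{ h_\gamma = [E_\gamma, E_{-\gamma}] : \gamma \in X \}.
\]
Denote by $\{ \check{h}_\gamma : \gamma \in S(\Sigma_+) \}$ the set of fundamental co-weights for $S(\Sigma_+)$, i.e., $\check{h}_\gamma \in \mathfrak{a}$ for each $\gamma \in S(\Sigma_+)$ and $\gamma_1(\check{h}_\gamma) = \delta_{\gamma_1, \gamma}$ for all $\gamma_1, \gamma \in S(\Sigma_+)$. For $X_1 \subset S(\Sigma_+)$, set
\[
\check{\rho}_{X_1} = \sum_{\gamma \in X_1} \check{h}_\gamma.
\]
Define $\check{\rho}_{X_1}$ to be 0 if $X_1$ is the empty set.

Theorem 5.1. For $X \subset S(\Sigma_+)$, $X_1 \subset X$ and $\lambda = \lambda_1 + \frac{i\pi}{2} \check{\rho}_{X_1} \in \mathfrak{a}_X + \frac{i\pi}{2} \check{\rho}_{X_1}$ such that $\alpha(\lambda_1) \neq 0$ for all $\alpha \in [X]$ with $\alpha(\check{\rho}_{X_1})$ even, let $\pi_{X,X_1,\lambda}$ be the bi-vector field on $K/T$ given by
\[
\pi_{X,X_1,\lambda} = p_* \pi_K - \Bigl( \frac{i\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} \frac{1}{1 - e^{2\alpha(\lambda)}}\, X_\alpha \wedge Y_\alpha \Bigr)^L,
\]
where the second term on the right hand side is the $K$-invariant bi-vector field on $K/T$ whose value at $e = eT$ is the expression given in the parenthesis. Then
1) $\pi_{X,X_1,\lambda}$ is a $(K, \pi_K)$-homogeneous Poisson structure on $K/T$, and
2) every $(K, \pi_K)$-homogeneous Poisson structure on $K/T$ is $K$-equivariantly isomorphic to some $\pi_{X,X_1,\lambda}$.

Remark 5.2. Note that the condition on $\lambda_1 \in \mathfrak{a}_X$ is equivalent to $\alpha(\lambda) \notin \pi i \mathbb{Z}$ for all $\alpha \in [X]$, so that $e^{2\alpha(\lambda)} \neq 1$ for all $\alpha \in [X]$.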
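Proof. 1) The number $e^{2\alpha(\lambda)}$ is real for each $\alpha \in [X]$. Thus $\pi_{X,X_1,\lambda}$ is a $(K, \pi_K)$-homogeneous Poisson structure coming from a classical dynamical r-matrix.

2) Assume that $\pi$ is a $(K, \pi_K)$-homogeneous Poisson structure on $K/T$. By Theorem 4.1 and by a proof similar to that of Theorem 3.7, there exist $X \subset S(\Sigma_+)$ and some $\lambda' \in \mathfrak{h}^*$ such that $\pi$ is isomorphic, via a $K$-equivariant diffeomorphism of $K/T$, to the Poisson structure $\pi'$ given by
\[
\pi' = p_* \pi_K - \Bigl( \frac{i\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} k_\alpha\, X_\alpha \wedge Y_\alpha \Bigr)^L,
\]
where
\[
k_\alpha = \frac{1}{2} \Bigl( 1 - \coth\bigl( \tfrac{\varepsilon}{2} \ll \alpha, \lambda' \gg \bigr) \Bigr) = \frac{1}{1 - e^{\varepsilon \ll \alpha, \lambda' \gg}} \in \mathbb{R}.
\]
Let $h_{\lambda'} \in \mathfrak{h}$ be the element in $\mathfrak{h}$ corresponding to $\lambda'$ under the Killing form, so that $\ll \alpha, \lambda' \gg = \alpha(h_{\lambda'})$ for all $\alpha \in \Sigma$. It remains to show that $\frac{\varepsilon}{2} h_{\lambda'}$ can be replaced by some $\lambda \in \mathfrak{a}_X + \frac{i\pi}{2} \check{\rho}_{X_1}$. To this end, consider the function $f(z) = 1/(1 - e^z)$ for $z \in \mathbb{C}$. It takes values in all of $\mathbb{C}$ except for 0 and 1. Moreover, $f(\mathbb{R} \setminus \{0\}) = (-\infty, 0) \cup (1, \infty)$ and $f(\mathbb{R} + i\pi) = (0, 1)$. Set $X_1 = \{ \gamma \in X : k_\gamma \in (0, 1) \}$. Then for each $\gamma \in X$, there exists $\mu_\gamma \in \mathbb{R}$ such that
\[
k_\gamma = f(\mu_\gamma + i\pi) \ \text{ if } \gamma \in X_1, \qquad k_\gamma = f(\mu_\gamma) \ \text{ if } \gamma \in X \setminus X_1.
\]
Let $\lambda_1 \in \mathfrak{a}_X$ be such that $2\gamma(\lambda_1) = \mu_\gamma$ for each $\gamma \in X$ (such a $\lambda_1$ exists), and let $\lambda = \lambda_1 + \frac{\pi i}{2} \check{\rho}_{X_1}$. Then $k_\gamma = f(2\gamma(\lambda))$ for all $\gamma \in X$. Consequently, by writing $\alpha \in [X] \cap \Sigma_+$ as a linear combination of elements in $X$, we see that $k_\alpha = f(2\alpha(\lambda))$ for all $\alpha \in [X]$. $\square$

Notation 5.3. For reasons given in Sect. 5.2, we will use $\pi_\infty$ to denote the Poisson structure $p_* \pi_K$ on $K/T$. It is called the Bruhat Poisson structure [Lu-We], because its symplectic leaves are Bruhat cells in $K/T$. See Sect. 5.6 for more details.

Example 5.4. Consider
\[
K = SU(2) = \left\{ \begin{pmatrix} u & v \\ -\bar{v} & \bar{u} \end{pmatrix} : u, v \in \mathbb{C},\ |u|^2 + |v|^2 = 1 \right\},
\]
$T = \{ \mathrm{diag}(e^{ix}, e^{-ix}) : x \in \mathbb{R} \} \cong S^1$ and the root $\alpha(x, -x) = 2x$ is taken to be the positive root. Then
\[
X_\alpha = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad Y_\alpha = \frac{1}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.
\]
With
\[
\Lambda = -\frac{i\varepsilon}{2} \frac{X_\alpha \wedge Y_\alpha}{2} \in \mathfrak{su}(2) \wedge \mathfrak{su}(2)
\]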
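and the Poisson structure $\pi_K$ on $K = SU(2)$ defined by $\pi_K = \Lambda^R - \Lambda^L$, the Poisson brackets among the coordinate functions $u, v, \bar{u}$ and $\bar{v}$ on $SU(2)$ are given by
\[
\{u, \bar{u}\} = -\frac{\varepsilon}{4} |v|^2, \qquad \{u, v\} = \frac{\varepsilon}{8} uv, \qquad \{u, \bar{v}\} = \frac{\varepsilon}{8} u\bar{v}, \qquad \{v, \bar{v}\} = 0.
\]
Let $\pi_0$ be the $SU(2)$-invariant bivector field on $SU(2)/S^1$ whose value at the point $e = eS^1$ is $X_\alpha \wedge Y_\alpha$. It is symplectic.

Case 1. $X = X_1 = \emptyset$. Then $\pi_{X,X_1,\lambda} = \pi_\infty$.

Case 2. $X = \{\alpha\}$, $X_1 = \emptyset$. Then $\lambda = \begin{pmatrix} \lambda_1 & 0 \\ 0 & -\lambda_1 \end{pmatrix}$ with $\lambda_1 \neq 0$, and
\[
\pi_{X,X_1,\lambda} = \pi_\infty - \frac{i\varepsilon}{2} \frac{1}{1 - e^{4\lambda_1}}\, \pi_0.
\]
Case 3. $X = X_1 = \{\alpha\}$. Then
\[
\lambda = \begin{pmatrix} \lambda_1 + \frac{\pi i}{4} & 0 \\ 0 & -\lambda_1 - \frac{\pi i}{4} \end{pmatrix}
\]
with $\lambda_1 \in \mathbb{R}$ arbitrary, and
\[
\pi_{X,X_1,\lambda} = \pi_\infty - \frac{i\varepsilon}{2} \frac{1}{1 + e^{4\lambda_1}}\, \pi_0.
\]
Note that the range of the function $\frac{1}{1 - e^{4\lambda_1}}$ for $\lambda_1 \in \mathbb{R} \setminus \{0\}$ is $(-\infty, 0) \cup (1, +\infty)$, and the range of $\frac{1}{1 + e^{4\lambda_1}}$ for $\lambda_1 \in \mathbb{R}$ is $(0, 1)$. Thus, for all possible choices of $X$, $X_1$ and $\lambda$, we get all the Poisson structures of the form
\[
\pi^a = \pi_\infty - \frac{i\varepsilon}{2}\, a\, \pi_0
\]
for $a \in \mathbb{R}$ except for $a = 1$. But the Poisson structure $\pi^a$ when $a = 1$ is easily seen to be isomorphic to $\pi_\infty$ (corresponding to $a = 0$) by the $SU(2)$-equivariant diffeomorphism on $SU(2)/S^1$ defined by the right translation by the non-trivial Weyl group element. The fact that every $(SU(2), \pi_K)$-homogeneous Poisson structure on $S^2$ is of the form $\pi^a$ for some $a \in \mathbb{R}$ is very easy to check directly [Sh].

Identify the Lie algebra $\mathfrak{su}(2)$ with $\mathbb{R}^3$ by
\[
\begin{pmatrix} ix & y + iz \\ -y + iz & -ix \end{pmatrix} \longmapsto (x, y, z)
\]
so the Adjoint orbit through $\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$ can be identified with the sphere $S^2 = \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1 \}$. Consequently, we have the identification
\[
SU(2)/S^1 \to S^2 : \ kS^1 \longmapsto \mathrm{Ad}_k \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix},
\]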
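or
\[
\begin{pmatrix} u & v \\ -\bar{v} & \bar{u} \end{pmatrix} S^1 \longmapsto \bigl( |u|^2 - |v|^2,\ -i(uv - \bar{u}\bar{v}),\ -(uv + \bar{u}\bar{v}) \bigr).
\]
The induced Bruhat Poisson structure $\pi_\infty$ on $S^2$ is given by
\[
\{x, y\} = -\frac{\varepsilon i}{4} (x - 1) z, \qquad \{y, z\} = -\frac{\varepsilon i}{4} (x - 1) x, \qquad \{z, x\} = -\frac{\varepsilon i}{4} (x - 1) y,
\]
and the Poisson structure $\pi^a$ on $S^2$ is given by
\[
\{x, y\} = -\frac{\varepsilon i}{4} (x + 2a - 1) z, \qquad \{y, z\} = -\frac{\varepsilon i}{4} (x + 2a - 1) x, \qquad \{z, x\} = -\frac{\varepsilon i}{4} (x + 2a - 1) y.
\]
Note that $\pi^a$ is symplectic when $a < 0$ or $a > 1$. When $a = 0$, it has two symplectic leaves, the point $(1, 0, 0)$ being a one-point leaf and the rest of $S^2$ as another leaf. Similarly for $a = 1$. When $0 < a < 1$, it has infinitely many symplectic leaves: two open leaves respectively given by $x < 1 - 2a$ and $x > 1 - 2a$, and every point on the circle $x = 1 - 2a$ as a one-point leaf.

Example 5.5. Let $\mathfrak{g} = \mathfrak{sl}(3, \mathbb{C})$ and $K = SU(3)$. The three positive roots are chosen to be
\[
\alpha_1(x) = x_1 - x_2, \qquad \alpha_2(x) = x_2 - x_3, \qquad \alpha_3(x) = x_1 - x_3
\]
for a diagonal matrix $x = \mathrm{diag}(x_1, x_2, x_3)$. Take $X = S(\Sigma_+) = \{\alpha_1, \alpha_2\}$ and $X_1 = \{\alpha_1\}$. In this case
\[
\check{\rho}_{X_1} = \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ 0 & -\frac{1}{3} & 0 \\ 0 & 0 & -\frac{1}{3} \end{pmatrix},
\]
and
\[
\lambda = \begin{pmatrix} \lambda_1 + \frac{\pi i}{3} & 0 & 0 \\ 0 & \lambda_2 - \frac{\pi i}{6} & 0 \\ 0 & 0 & -(\lambda_1 + \lambda_2) - \frac{\pi i}{6} \end{pmatrix}, \qquad \lambda_1 + 2\lambda_2 \neq 0.
\]
Then
\[
\pi_{X,X_1,\lambda} = \pi_\infty + \left( \frac{2 X_{\alpha_1} \wedge Y_{\alpha_1}}{1 + e^{2(\lambda_1 - \lambda_2)}} + \frac{2 X_{\alpha_2} \wedge Y_{\alpha_2}}{1 - e^{2\lambda_1 + 4\lambda_2}} + \frac{2 X_{\alpha_3} \wedge Y_{\alpha_3}}{1 + e^{4\lambda_1 + 2\lambda_2}} \right)^L.
\]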
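As an informal sanity check of Example 5.5 (a sketch, not part of the original argument), the following Python snippet evaluates $\alpha_j(\lambda)$ for sample real values of $\lambda_1, \lambda_2$ and compares $1/(1 - e^{2\alpha_j(\lambda)})$ with the three denominators displayed above. The function name `coefficients` and the sample values are illustrative assumptions.

```python
import cmath
import math

def coefficients(l1, l2):
    """Return 1/(1 - exp(2*alpha_j(lam))) for alpha_1, alpha_2, alpha_3 of sl(3, C).

    lam = diag(l1, l2, -(l1 + l2)) + (i*pi/2) * rho, where
    rho = diag(2/3, -1/3, -1/3) is the co-weight attached to X_1 = {alpha_1}.
    """
    rho = (2 / 3, -1 / 3, -1 / 3)
    real_part = (l1, l2, -(l1 + l2))
    lam = [r + 0.5j * cmath.pi * c for r, c in zip(real_part, rho)]
    roots = [(0, 1), (1, 2), (0, 2)]  # alpha_j(x) = x_a - x_b as index pairs (a, b)
    return [1 / (1 - cmath.exp(2 * (lam[a] - lam[b]))) for a, b in roots]

def expected(l1, l2):
    """The three coefficients read off from the displayed formula in Example 5.5."""
    return [
        1 / (1 + math.exp(2 * (l1 - l2))),
        1 / (1 - math.exp(2 * l1 + 4 * l2)),
        1 / (1 + math.exp(4 * l1 + 2 * l2)),
    ]
```

For $\alpha_1$ and $\alpha_3$, where $\alpha(\check{\rho}_{X_1}) = 1$ is odd, the coefficient lies in $(0, 1)$; for $\alpha_2$, where it is even, the coefficient lies outside $[0, 1]$, in line with the ranges of $f$ used in the proof of Theorem 5.1.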
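5.2. Connections via taking limits in λ. As noted in [E-V], the dynamical r-matrices are related to each other via taking various limits in $\lambda$. Correspondingly, the Poisson structures $\pi_{X,X_1,\lambda}$ are also related this way. We study these relations in this section.

Proposition 5.6. For any $X_1 \subset X \subset Y \subset S(\Sigma_+)$ and $\lambda = \lambda_1 + \frac{i\pi}{2} \check{\rho}_{X_1} \in \mathfrak{a}_X + \frac{i\pi}{2} \check{\rho}_{X_1}$ such that $\alpha(\lambda_1) \neq 0$ for all $\alpha \in [X]$ with $\alpha(\check{\rho}_{X_1})$ even, we have
\[
\pi_{X,X_1,\lambda} = \lim_{t \to +\infty} \pi_{Y,X_1,\lambda + t\check{\rho}_{Y \setminus X}}. \tag{17}
\]
In particular,
\[
\pi_\infty = \lim_{t \to +\infty} \pi_{Y,\emptyset,t\check{\rho}_Y}.
\]
Moreover, we also have
\[
\pi_\infty = \lim_{t \to +\infty} \pi_{X,X_1,\lambda + t\check{\rho}_X}. \tag{18}
\]
Proof. Set $\mu_t = \lambda + t\check{\rho}_{Y \setminus X}$ for $t > 0$. Let $\alpha \in [Y] \cap \Sigma_+$. If $\alpha \in [X]$, then $\alpha(\check{\rho}_{Y \setminus X}) = 0$ so $\alpha(\mu_t) = \alpha(\lambda)$. If $\alpha \in [Y] \setminus [X]$, then $v := \alpha(\check{\rho}_{Y \setminus X})$ is positive, so
\[
\lim_{t \to \infty} \frac{1}{1 - e^{2\alpha(\mu_t)}} = \lim_{t \to \infty} \frac{1}{1 - e^{2\alpha(\lambda) + 2tv}} = 0.
\]
Hence (17) follows from the definition of $\pi_{X,X_1,\lambda}$. The limit in (18) follows in the same way, since $\alpha(\check{\rho}_X) > 0$ for every $\alpha \in [X] \cap \Sigma_+$. $\square$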
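The limit used in the proof of Proposition 5.6 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: `c` stands for a sample value of $\alpha(\lambda)$ (possibly complex), `v` for $\alpha(\check{\rho}_{Y \setminus X})$, and the function name is hypothetical.

```python
import cmath

def coefficient(c, v, t):
    """Coefficient 1/(1 - exp(2*alpha(mu_t))) with alpha(mu_t) = c + t*v."""
    return 1 / (1 - cmath.exp(2 * (c + t * v)))

# Roots in [Y] \ [X] have v > 0, so the coefficient decays to 0 as t grows.
c = 0.4 + 0.5j * cmath.pi
decaying = [abs(coefficient(c, 1.0, t)) for t in (1, 5, 10, 20)]

# Roots in [X] have v = 0, so the coefficient does not depend on t.
constant = [coefficient(c, 0.0, t) for t in (1, 5, 10, 20)]
```

Only the terms indexed by $[Y] \setminus [X]$ disappear in the limit, which is why the limit of $\pi_{Y,X_1,\mu_t}$ is $\pi_{X,X_1,\lambda}$ rather than $\pi_\infty$.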
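5.3. The Lagrangian subalgebras of g corresponding to $\pi_{X,X_1,\lambda}$. The Lie bialgebra of the Poisson Lie group $(K, \pi_K)$ is $(\mathfrak{k}, \mathfrak{a} + \mathfrak{n})$, where the pairing between $\mathfrak{k}$ and $\mathfrak{a} + \mathfrak{n}$ is given by $\frac{2i}{\varepsilon} \mathrm{Im} \ll\, ,\, \gg$, where $\mathrm{Im} \ll\, ,\, \gg$ stands for the imaginary part of the Killing form $\ll\, ,\, \gg$. We will call a real subalgebra $\mathfrak{l}$ of $\mathfrak{g}$ a Lagrangian algebra if 1) $\dim \mathfrak{l} = \dim \mathfrak{k}$, and 2) $\frac{2i}{\varepsilon} \mathrm{Im} \ll x, y \gg = 0$ for all $x, y \in \mathfrak{l}$. By a theorem of Drinfeld [D3], $(K, \pi_K)$-homogeneous Poisson structures on $K/T$ correspond to Lagrangian subalgebras $\mathfrak{l}$ of $\mathfrak{g}$ with $\mathfrak{l} \cap \mathfrak{k} = \mathfrak{t}$. In this section, we calculate the Lagrangian subalgebras $\mathfrak{l}_{X,X_1,\lambda}$ corresponding to the Poisson structures $\pi_{X,X_1,\lambda}$.

By definition [D3],
\[
\mathfrak{l}_{X,X_1,\lambda} = \{ x + \xi : x \in \mathfrak{k},\ \xi \in \mathfrak{a} + \mathfrak{n} : \xi|_{\mathfrak{t}} = 0,\ \xi \,\lrcorner\, \pi_{X,X_1,\lambda}(e) = x + \mathfrak{t} \}.
\]
A direct calculation gives
\[
\mathfrak{l}_{X,X_1,\lambda} = \mathfrak{t} + \mathrm{span}_{\mathbb{R}} \{ E_\beta, iE_\beta : \beta \in \Sigma_+ \setminus [X] \} + \mathrm{span}_{\mathbb{R}} \Bigl\{ \frac{1}{e^{2\alpha(\lambda)} - 1} X_\alpha + E_\alpha,\ \frac{1}{e^{2\alpha(\lambda)} - 1} Y_\alpha + iE_\alpha : \alpha \in [X] \cap \Sigma_+ \Bigr\}.
\]
On the other hand, for $\alpha \in [X]$, since $e^{2\alpha(\lambda)} \neq 1$, we have
\[
\mathrm{Ad}_{e^\lambda} X_\alpha = \mathrm{Ad}_{e^\lambda} (E_\alpha - E_{-\alpha}) = (e^{\alpha(\lambda)} - e^{-\alpha(\lambda)}) \Bigl( \frac{1}{e^{2\alpha(\lambda)} - 1} X_\alpha + E_\alpha \Bigr),
\]
\[
\mathrm{Ad}_{e^\lambda} Y_\alpha = \mathrm{Ad}_{e^\lambda} (iE_\alpha + iE_{-\alpha}) = (e^{\alpha(\lambda)} - e^{-\alpha(\lambda)}) \Bigl( \frac{1}{e^{2\alpha(\lambda)} - 1} Y_\alpha + iE_\alpha \Bigr).
\]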
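Note that $e^{\alpha(\lambda)}$ is real or imaginary depending on whether $\alpha(\check{\rho}_{X_1})$ is even or odd. Set
\[
\mathfrak{n}_X = \mathrm{span}_{\mathbb{R}} \{ E_\beta, iE_\beta : \beta \in \Sigma_+ \setminus [X] \}. \tag{19}
\]
Then we have proved the following proposition.

Proposition 5.7. Denote by $\mathfrak{l}_{X,X_1,\lambda}$ the Lagrangian subalgebra of $\mathfrak{g}$ corresponding to the Poisson structure $\pi_{X,X_1,\lambda}$ on $K/T$. It is given by
\[
\mathfrak{l}_{X,X_1,\lambda} = \mathrm{Ad}_{e^\lambda} \bigl( \mathfrak{t} + \mathfrak{n}_X + \mathrm{span}_{\mathbb{R}} \{ X_\alpha, Y_\alpha : \alpha \in [X],\ \alpha(\check{\rho}_{X_1}) \text{ is even} \} + \mathrm{span}_{\mathbb{R}} \{ iX_\alpha, iY_\alpha : \alpha \in [X],\ \alpha(\check{\rho}_{X_1}) \text{ is odd} \} \bigr).
\]
Remark 5.8. Let $\theta$ be the complex conjugation on $\mathfrak{g}$ defined by $\mathfrak{k}$. Let $\tau_{X,X_1}$ be the complex conjugation on $\mathfrak{g}$ given by
\[
\tau_{X,X_1} = \mathrm{Ad}_{\exp(\pi i \check{\rho}_{X_1})}\, \theta = \theta\, \mathrm{Ad}_{\exp(-\pi i \check{\rho}_{X_1})}.
\]
Denote by $\mathfrak{m}_X^{\tau_{X,X_1}}$ the set of fixed points of $\tau_{X,X_1}$ in $\mathfrak{m}_X$, where
\[
\mathfrak{m}_X = \mathfrak{h} + \mathrm{span}_{\mathbb{C}} \{ E_\alpha : \alpha \in [X] \}.
\]
Then
\[
\mathfrak{l}_{X,X_1,\lambda} = \mathrm{Ad}_{e^\lambda} \bigl( \mathfrak{m}_X^{\tau_{X,X_1}} + \mathfrak{n}_X \bigr).
\]
Remark 5.9. Let $n = \dim \mathfrak{k}$ and consider $\mathfrak{l}_{X,X_1,\lambda}$ as a point in $\mathrm{Gr}(n, \mathfrak{g})$, the Grassmannian of $n$-dimensional real subspaces of $\mathfrak{g}$. Then, corresponding to Proposition 5.6, we have, for $X_1 \subset X \subset Y \subset S(\Sigma_+)$ and for any $\lambda = \lambda_1 + \frac{i\pi}{2} \check{\rho}_{X_1} \in \mathfrak{a}_X + \frac{i\pi}{2} \check{\rho}_{X_1}$ such that $\alpha(\lambda_1) \neq 0$ for all $\alpha \in [X]$ with $\alpha(\check{\rho}_{X_1})$ even,
\[
\lim_{t \to +\infty} \mathfrak{l}_{Y,X_1,\lambda + t\check{\rho}_{Y \setminus X}} = \mathfrak{l}_{X,X_1,\lambda} \tag{20}
\]
in $\mathrm{Gr}(n, \mathfrak{g})$. Indeed, under the Plücker embedding of $\mathrm{Gr}(n, \mathfrak{g})$ into $\mathbb{P}^1(\wedge^n \mathfrak{g})$, the Lie subalgebra $\mathfrak{l}_{Y,X_1,\lambda}$ corresponds to the point in $\mathbb{P}^1(\wedge^n \mathfrak{g})$ defined by the vector
\[
v_{Y,X_1,\lambda} := Z_0 \wedge \prod_{\alpha \in [Y] \cap \Sigma_+} \Bigl( \frac{1}{e^{2\alpha(\lambda)} - 1} X_\alpha + E_\alpha \Bigr) \wedge \Bigl( \frac{1}{e^{2\alpha(\lambda)} - 1} Y_\alpha + iE_\alpha \Bigr) \wedge \prod_{\alpha \in \Sigma_+ \setminus [Y]} E_\alpha \wedge iE_\alpha,
\]
where $Z_0 \in \wedge^{\dim \mathfrak{t}}\, \mathfrak{t}$ and $Z_0 \neq 0$ is fixed. Since $v_{Y,X_1,\lambda + t\check{\rho}_{Y \setminus X}} \to v_{X,X_1,\lambda}$ as $t \to +\infty$, we see that (20) holds in $\mathbb{P}^1(\wedge^n \mathfrak{g})$ and thus also in $\mathrm{Gr}(n, \mathfrak{g})$.

Example 5.10. When $X = X_1$ are the empty set, we have $\mathfrak{l}_{X,X_1,\lambda} = \mathfrak{t} + \mathfrak{n}$, and when $X = S(\Sigma_+)$ and $X_1$ is the empty set, we have $\mathfrak{l}_{X,X_1,\lambda} = \mathrm{Ad}_{e^\lambda} \mathfrak{k}$. In general, when $X = S(\Sigma_+)$, the Lie subalgebra $\mathfrak{l}_{X,X_1,\lambda}$ is a real form of $\mathfrak{g}$.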
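The two $\mathrm{Ad}_{e^\lambda}$ identities used in Sect. 5.3 to pass from the direct calculation to Proposition 5.7 can be checked numerically in the simplest case. The sketch below assumes $\mathfrak{g} = \mathfrak{sl}(2, \mathbb{C})$ with $E_\alpha$, $E_{-\alpha}$ the elementary matrices and $\lambda = \mathrm{diag}(z, -z)$, so that $\alpha(\lambda) = 2z$; all helper names are illustrative.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lincomb(*terms):
    """Sum of c * m over (scalar c, 2x2 matrix m) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)] for i in range(2)]

E = [[0, 1], [0, 0]]           # E_alpha
F = [[0, 0], [1, 0]]           # E_{-alpha}
X = lincomb((1, E), (-1, F))   # X_alpha = E_alpha - E_{-alpha}
Y = lincomb((1j, E), (1j, F))  # Y_alpha = i(E_alpha + E_{-alpha})
iE = lincomb((1j, E))

def ad_exp(z, m):
    """Ad_{e^lam} m = e^lam m e^{-lam} for lam = diag(z, -z)."""
    g = [[cmath.exp(z), 0], [0, cmath.exp(-z)]]
    g_inv = [[cmath.exp(-z), 0], [0, cmath.exp(z)]]
    return matmul(matmul(g, m), g_inv)

def rhs(z, xy, e_part):
    """(e^a - e^{-a}) * (xy / (e^{2a} - 1) + e_part) with a = alpha(lam) = 2z."""
    a = 2 * z
    factor = cmath.exp(a) - cmath.exp(-a)
    return lincomb((factor / (cmath.exp(2 * a) - 1), xy), (factor, e_part))

def max_diff(a, b):
    """Largest entrywise absolute difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

Taking $z = \lambda_1 + \frac{\pi i}{4}$ corresponds to $X_1 = \{\alpha\}$ in Example 5.4, where $e^{\alpha(\lambda)}$ is purely imaginary.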
5.4. Geometrical interpretation of $\pi_{X,X_1,\lambda}$. Denote by $\mathcal{L}$ the set of all Lagrangian subalgebras of $\mathfrak{g}$ with respect to the imaginary part of the Killing form $\ll\, ,\, \gg$. (Here $\mathfrak{g}$ is regarded as a real vector space.) It is an algebraic subvariety of the Grassmannian $\mathrm{Gr}(n, \mathfrak{g})$ of $n$-dimensional subspaces of $\mathfrak{g}$, where $n = \dim \mathfrak{k}$. In [E-L2], we show that there is a smooth bivector field $\Pi$ on $\mathrm{Gr}(n, \mathfrak{g})$ such that the Schouten bracket $[\Pi, \Pi]$ vanishes at every $\mathfrak{l} \in \mathcal{L}$. More precisely, consider the $G$-action on $\mathrm{Gr}(n, \mathfrak{g})$ by the Adjoint action. It defines a Lie algebra anti-homomorphism
\[
\kappa : \mathfrak{g} \longrightarrow \chi^1(\mathrm{Gr}(n, \mathfrak{g})),
\]
where $\chi^1(\mathrm{Gr}(n, \mathfrak{g}))$ is the space of vector fields on $\mathrm{Gr}(n, \mathfrak{g})$. Denote by the same letter its multi-linear extension from $\wedge^2 \mathfrak{g}$ to the space of bi-vector fields on $\mathrm{Gr}(n, \mathfrak{g})$. Then the bivector field $\Pi$ on $\mathrm{Gr}(n, \mathfrak{g})$ is defined to be
\[
\Pi = \frac{1}{2} \kappa(R),
\]
where $R \in \wedge^2 \mathfrak{g}$ is the r-matrix for $\mathfrak{g}$ given by
\[
\langle R, (x_1 + y_1) \wedge (x_2 + y_2) \rangle_\varepsilon = \langle x_1, y_2 \rangle_\varepsilon - \langle x_2, y_1 \rangle_\varepsilon \tag{21}
\]
for $x_1, x_2 \in \mathfrak{k}$ and $y_1, y_2 \in \mathfrak{a} + \mathfrak{n}$ with $\langle\, ,\, \rangle_\varepsilon = \frac{2i}{\varepsilon} \mathrm{Im} \ll\, ,\, \gg$. Explicitly,
\[
R = -\frac{\varepsilon}{2i} \Bigl( \sum_{j=1}^{l} (i h_j) \wedge h_j + \sum_{\alpha \in \Sigma_+} \bigl( -X_\alpha \wedge (iE_\alpha) + Y_\alpha \wedge E_\alpha \bigr) \Bigr),
\]
where $\{h_1, \ldots, h_l\}$ is a basis for $\mathfrak{a}$ such that $\ll h_j, h_k \gg = \delta_{jk}$. It now follows from the definition of $\Pi$ that it defines a Poisson structure on every $G$-invariant smooth submanifold of $\mathcal{L}$.

One particular $G$-invariant smooth submanifold of $\mathcal{L}$ is the (unique) irreducible component $\mathcal{L}_0$ of $\mathcal{L}$ that contains $\mathfrak{k}$. We show in [E-L2] that each $\mathfrak{l}_{X,X_1,\lambda} \in \mathcal{L}_0$ and that its $K$-orbit in $\mathcal{L}_0$ is a Poisson submanifold of $(\mathcal{L}_0, \Pi)$. (We also show in [E-L2] that $\mathcal{L}_0$ is diffeomorphic to the set of real points in the De Concini-Procesi compactification of $G$ [D-P].) For each Poisson structure $\pi_{X,X_1,\lambda}$ on $K/T$, consider the map
\[
P : (K/T, \pi_{X,X_1,\lambda}) \longrightarrow (\mathcal{L}_0, \Pi) : \ kT \longmapsto \mathrm{Ad}_k\, \mathfrak{l}_{X,X_1,\lambda}.
\]
It is shown in [E-L2] that $P$ is a Poisson map. When the normalizer subgroup of $\mathfrak{l}_{X,X_1,\lambda}$ in $K$ is $T$, this map is an embedding of $K/T$ into $\mathcal{L}_0$ whose image is the $K$-orbit of $\mathfrak{l}_{X,X_1,\lambda}$ in $\mathcal{L}_0$. In general, $P$ is a covering map onto the $K$-orbit of $\mathfrak{l}_{X,X_1,\lambda}$ in $\mathcal{L}_0$. Thus, every $(K/T, \pi_{X,X_1,\lambda})$ is a Poisson submanifold of $(\mathcal{L}_0, \Pi)$ (possibly up to a covering map). This can be considered as one geometrical interpretation of $\pi_{X,X_1,\lambda}$.

Two special cases of $\pi_{X,X_1,\lambda}$ deserve more attention. The first is when $X = X_1 = \emptyset$ ($\lambda = 0$ in this case). Then $\pi_{X,X_1,\lambda} = \pi_\infty$ is the Bruhat Poisson structure. It has been the most interesting example in terms of connections to Lie theory. For its relations with the Kostant harmonic forms [Ko], see [Lu3] and [E-L1].

The second special case is when $X = S(\Sigma_+)$ and $X_1 = \emptyset$. The condition on $\lambda$ is that $\lambda \in \mathfrak{a}$ is regular. We will show that $\pi_{X,X_1,\lambda}$ is symplectic in this case. In fact, we will show that $\pi_{X,X_1,\lambda}$ can be identified with the symplectic structure on a dressing orbit of $K$ in its dual Poisson Lie group. We also remark that this symplectic structure has been used in [L-R] to give a symplectic proof of Kostant's nonlinear convexity theorem.
Recall that the Manin triple $(\mathfrak{g}, \mathfrak{k}, \mathfrak{a} + \mathfrak{n}, \frac{2i}{\varepsilon} \mathrm{Im} \ll\, ,\, \gg)$ gives rise to a Poisson structure $\pi_{AN}$ on the group $AN$ making $(AN, \pi_{AN})$ into the dual Poisson Lie group of $(K, \pi_K)$. The group $K$ acts on $AN$ by the (left) dressing action:
\[
K \times AN \longrightarrow AN : \ (k, b) \longmapsto k \cdot b := b_1, \quad \text{if } bk^{-1} = k_1 b_1 \text{ for } k_1 \in K \text{ and } b_1 \in AN.
\]
The $K$-orbits of this dressing action of $K$ in $AN$, called the dressing orbits, are precisely all the symplectic leaves of the Poisson structure on $AN$ and they are parametrized by a fundamental $W$-chamber in $\mathfrak{a}$. Thus each dressing orbit inherits a symplectic, and thus Poisson, structure as a symplectic leaf. Since the dressing action is Poisson [STS2, Lu-We], these dressing orbits are examples of $(K, \pi_K)$-homogeneous Poisson spaces. Let $\lambda \in \mathfrak{a}$ be regular and consider the element $e^{-\lambda} \in A$. The stabilizer subgroup of $K$ in $AN$ at $e^{-\lambda}$ is $T$. Thus, by identifying $K/T$ with the dressing orbit through $e^{-\lambda}$, we get a Poisson structure on $K/T$ which is in fact symplectic.

Notation 5.11. We will use $\pi_\lambda$ to denote the Poisson structure on $K/T$ obtained by identifying $K/T$ with the symplectic leaf in $AN$ through the point $e^{-\lambda}$, and we call it the dressing orbit Poisson structure corresponding to $e^{-\lambda} \in A$.

Proposition 5.12. When $X = S(\Sigma_+)$, $X_1 = \emptyset$, and $\lambda \in \mathfrak{a}$ is regular, the Poisson structure $\pi_{X,X_1,\lambda}$ on $K/T$ is nothing but the dressing orbit Poisson structure $\pi_\lambda$ corresponding to $e^{-\lambda}$. Explicitly, we have
\[
\pi_\lambda = -\Bigl( \frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+} \frac{1}{1 - e^{2\alpha(\lambda)}}\, X_\alpha \wedge Y_\alpha \Bigr)^L + \pi_\infty, \tag{22}
\]
where the first term is the $K$-invariant bi-vector field on $K/T$ whose value at $e = eT$ is the expression given in the parenthesis.

Proof. Since $\pi_{X,X_1,\lambda}$ is given by the right-hand side of (22), we only need to show that the dressing orbit Poisson structure $\pi_\lambda$ is also given by the same formula. Denote the Poisson structure on $AN$ by $\pi_{AN}$. Since we are identifying $\mathfrak{k}$ with $(\mathfrak{a} + \mathfrak{n})^*$ via $\frac{2i}{\varepsilon} \mathrm{Im} \ll\, ,\, \gg$, an element $x \in \mathfrak{k}$ can be regarded as a left invariant 1-form on $AN$ which we denote by $x^l$. Let $p_{\mathfrak{k}} : \mathfrak{g} \to \mathfrak{k}$ be the projection from $\mathfrak{g}$ to $\mathfrak{k}$ with respect to the Iwasawa decomposition $\mathfrak{g} = \mathfrak{k} + \mathfrak{a} + \mathfrak{n}$. We know that (see [Lu1]) for any $a \in A$,
\[
\pi_{AN}(x^l, y^l)(a) = \frac{2i}{\varepsilon} \mathrm{Im} \ll \mathrm{Ad}_a x,\ p_{\mathfrak{k}} \mathrm{Ad}_a y \gg
\]
for all $x, y \in \mathfrak{k}$. Here, $\mathrm{Ad}_a$ is the Adjoint action of $a \in A$ on $\mathfrak{g}$. Thus, when $x$ and $y$ run over the basis vectors $\{ i h_\alpha, X_\alpha, Y_\alpha : \alpha \in \Sigma_+ \}$ for $\mathfrak{k}$, we have $\pi_{AN}(x^l, y^l)(a) = 0$ except that
\[
\pi_{AN}(X_\alpha^l, Y_\alpha^l) = \frac{2i}{\varepsilon} \mathrm{Im} \ll \mathrm{Ad}_a X_\alpha,\ p_{\mathfrak{k}} \mathrm{Ad}_a Y_\alpha \gg
= \frac{2i}{\varepsilon} \mathrm{Im} \ll a^\alpha E_\alpha - a^{-\alpha} E_{-\alpha},\ a^{-\alpha} (iE_\alpha + iE_{-\alpha}) \gg
= \frac{2i}{\varepsilon} (1 - a^{-2\alpha}).
\]
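Let $\sigma_x$ be the (left) dressing vector field on $AN$ defined by $x \in \mathfrak{k}$, i.e., $\sigma_x = -x^l \,\lrcorner\, \pi_{AN}$. Then, taking $a = e^{-\lambda}$, we have
\[
\pi_{AN}(a) = \sum_{\alpha \in \Sigma_+} \frac{1}{\pi_{AN}(X_\alpha^l, Y_\alpha^l)}\, \sigma_{X_\alpha}(a) \wedge \sigma_{Y_\alpha}(a)
= -\frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+} \frac{1}{1 - e^{2\alpha(\lambda)}}\, \sigma_{X_\alpha}(a) \wedge \sigma_{Y_\alpha}(a) \in \wedge^2 T_a(K \cdot a).
\]
Identifying $K/T$ with $K \cdot a$ by $kT \mapsto k \cdot a$, we get
\[
\pi_\lambda(eT) = -\frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+} \frac{1}{1 - e^{2\alpha(\lambda)}}\, X_\alpha \wedge Y_\alpha.
\]
Thus $\pi_\lambda$ is given by (22). $\square$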
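The only nontrivial pairing in the proof of Proposition 5.12 can be verified by hand in rank one. The sketch below assumes $\mathfrak{g} = \mathfrak{sl}(2, \mathbb{C})$, replaces the Killing form by the trace form (an assumption, chosen so that $\ll E_\alpha, E_{-\alpha} \gg = 1$), and takes $a = \mathrm{diag}(s, 1/s)$ with $s > 0$, so that $a^\alpha = s^2$. The helper names are illustrative.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace_form(a, b):
    """Invariant form << a, b >> = tr(ab), normalized so that << E, F >> = 1."""
    p = matmul(a, b)
    return p[0][0] + p[1][1]

def ad_a(s, m):
    """Ad_a m = a m a^{-1} for a = diag(s, 1/s)."""
    return [[m[0][0], s ** 2 * m[0][1]], [m[1][0] / s ** 2, m[1][1]]]

def project_to_k(m):
    """Projection of m in sl(2, C) onto su(2) along a + n (Iwasawa decomposition)."""
    p, r = complex(m[0][0]), complex(m[1][0])
    return [[1j * p.imag, -r.conjugate()], [r, -1j * p.imag]]

X = [[0, 1], [-1, 0]]    # X_alpha = E_alpha - E_{-alpha}
Y = [[0, 1j], [1j, 0]]   # Y_alpha = i(E_alpha + E_{-alpha})

def pairing(s, eps):
    """(2i/eps) * Im << Ad_a X_alpha, p_k Ad_a Y_alpha >> for a = diag(s, 1/s)."""
    value = trace_form(ad_a(s, X), project_to_k(ad_a(s, Y)))
    return (2j / eps) * complex(value).imag
```

With $a = e^{-\lambda}$ and $\lambda = \mathrm{diag}(x, -x)$ one has $s = e^{-x}$ and $a^{-2\alpha} = e^{2\alpha(\lambda)}$, so the reciprocal of the pairing should reproduce the coefficient $-\frac{i\varepsilon}{2} \frac{1}{1 - e^{2\alpha(\lambda)}}$ in (22).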
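5.5. $\pi_{X,X_1,\lambda}$ as the result of Poisson induction. We now look at the general case of $\pi_{X,X_1,\lambda}$. Set
\[
\mathfrak{k}_X = \mathfrak{t} + \mathrm{span}_{\mathbb{R}} \{ X_\alpha, Y_\alpha : \alpha \in [X] \cap \Sigma_+ \},
\]
and let $K_X \subset K$ be the connected subgroup of $K$ with Lie algebra $\mathfrak{k}_X$. We will show that $\pi_{X,X_1,\lambda}$ can be obtained via Poisson induction (see Remark 5.15 below) from a Poisson structure on the smaller space $K_X/T$. To this end, consider
\[
\mathfrak{k}_X^0 = \{ \xi \in \mathfrak{k}^* : \xi(x) = 0 \ \ \forall x \in \mathfrak{k}_X \}.
\]
Since we are identifying $\mathfrak{k}^*$ with $\mathfrak{a} + \mathfrak{n}$, we have $\mathfrak{k}_X^0 \cong \mathfrak{n}_X$ as real Lie algebras, where $\mathfrak{n}_X$ is given in (19). Since $\mathfrak{n}_X \subset \mathfrak{a} + \mathfrak{n}$ is an ideal, we know that $K_X \subset K$ is a Poisson subgroup [Lu-We]. In fact, set
\[
\Lambda_1 = -\frac{i\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} \frac{X_\alpha \wedge Y_\alpha}{2}, \qquad
\Lambda_2 = -\frac{i\varepsilon}{2} \sum_{\alpha \in \Sigma_+ \setminus [X]} \frac{X_\alpha \wedge Y_\alpha}{2}.
\]
Then, we have

Proposition 5.13. 1) For any $x \in \mathfrak{k}_X$, $\mathrm{ad}_x \Lambda_2 = 0$.
2) The Poisson structure on $K_X$ (as a Poisson submanifold of $K$) is given by
\[
\pi_{K_X}(k_1) = R_{k_1} \Lambda_1 - L_{k_1} \Lambda_1,
\]
where $R_{k_1}$ and $L_{k_1}$ are respectively the right and left translations on $K_X$ by $k_1 \in K_X$.
3) The Manin triple for the Poisson Lie group $(K_X, \pi_{K_X})$ is $(\mathfrak{m}_X, \mathfrak{k}_X, \mathfrak{a} + \mathfrak{u}_X, \frac{2i}{\varepsilon} \mathrm{Im} \ll\, ,\, \gg)$, where $\mathfrak{m}_X$, given in (13), is considered as over $\mathbb{R}$, and
\[
\mathfrak{u}_X = \mathrm{span}_{\mathbb{R}} \{ E_\alpha, iE_\alpha : \alpha \in [X] \cap \Sigma_+ \}.
\]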
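Proof. 1) Using the embedding of $\wedge^\bullet \mathfrak{k}$ into $\wedge^\bullet \mathfrak{g}$ as a real subspace, it is enough to show that $\mathrm{ad}_x \Lambda_2 = 0$ for $x = E_\alpha$ with $\alpha \in [X]$. Let $\alpha \in [X] \cap \Sigma_+$. Then,
\[
\frac{2}{\varepsilon}\, \mathrm{ad}_{E_\alpha} \Lambda_2 = \sum_{\beta \in \Sigma_+ \setminus [X]} [E_\alpha, E_\beta] \wedge E_{-\beta} + E_\beta \wedge [E_\alpha, E_{-\beta}].
\]
Set
\[
Y_1 = \{ \beta \in \Sigma_+ \setminus [X] : \alpha + \beta \in \Sigma \} \quad \text{and} \quad Y_2 = \{ \beta \in \Sigma_+ \setminus [X] : \beta - \alpha \in \Sigma \}.
\]
Since $Y = \Sigma_+ \setminus [X]$ has the property that if $\alpha \in [X] \cap \Sigma_+$ and $\beta \in Y$ are such that $\alpha + \beta \in \Sigma$, then $\alpha + \beta \in Y$, the map $Y_1 \to Y_2 : \beta \mapsto \alpha + \beta$ is a bijection. Thus
\[
\frac{2}{\varepsilon}\, \mathrm{ad}_{E_\alpha} \Lambda_2 = \sum_{\beta \in Y_1} \bigl( [E_\alpha, E_\beta] \wedge E_{-\beta} + E_{\alpha+\beta} \wedge [E_\alpha, E_{-(\alpha+\beta)}] \bigr)
= \sum_{\beta \in Y_1} (N_{\alpha,\beta} + N_{\alpha,-(\alpha+\beta)})\, E_{\alpha+\beta} \wedge E_{-\beta} = 0.
\]
Similarly, $\mathrm{ad}_{E_{-\alpha}} \Lambda_2 = 0$. This proves 1).

2) By definition, the induced Poisson structure $\pi_{K_X}$ on $K_X$ is the restriction of $\pi_K$ to $K_X$. Using the definition of $\pi_K$ and 1), we know that $\pi_{K_X}$ is as given.

3) From the general theory of Poisson Lie groups [Lu-We], we know that the induced Lie algebra structure on $\mathfrak{k}_X^*$ is isomorphic to the quotient Lie algebra $\mathfrak{k}^* / \mathfrak{k}_X^0$. Through the identifications $\mathfrak{k}^* \cong \mathfrak{a} + \mathfrak{n}$ and $\mathfrak{k}_X^0 \cong \mathfrak{n}_X$ via $\frac{2i}{\varepsilon} \mathrm{Im} \ll\, ,\, \gg$, we get $\mathfrak{k}_X^* \cong \mathfrak{a} + \mathfrak{u}_X$ via $\frac{2i}{\varepsilon} \mathrm{Im} \ll\, ,\, \gg$ which is now considered as a symmetric scalar product on $\mathfrak{m}_X$ by restriction. $\square$

Notation 5.14. Let $X_1 \subset X$ and let $\lambda = \lambda_1 + \frac{\pi i}{2} \check{\rho}_{X_1} \in \mathfrak{a}_X + \frac{\pi i}{2} \check{\rho}_{X_1}$ be such that $\alpha(\lambda_1) \neq 0$ for any $\alpha \in [X]$ with $\alpha(\check{\rho}_{X_1})$ even. By replacing $K$ by $K_X$ and by regarding $X$ as the set of all simple roots for the root system for $(K_X, T)$, we know that there is a $(K_X, \pi_{K_X})$-homogeneous Poisson structure on $K_X/T$ corresponding to $X$, $X_1$ and $\lambda$. We will denote it by $\pi^X_{X_1,\lambda}$.

We now show that the Poisson structure $\pi_{X,X_1,\lambda}$ on $K/T$ can be obtained via Poisson induction from the Poisson structure $\pi^X_{X_1,\lambda}$ on $K_X/T$. To this end, consider the product space $K \times (K_X/T)$ with the product Poisson structure $\pi_K \oplus \pi^X_{X_1,\lambda}$. Even though the diagonal (right) action of $K_X$ on $K \times (K_X/T)$ given by
\[
k_1 : \ (k, k'T) \longmapsto (kk_1,\ k_1^{-1} k'T)
\]
is in general not Poisson, there is nevertheless a unique Poisson structure on the quotient space $K \times_{K_X} (K_X/T)$ such that the projection map
\[
K \times (K_X/T) \longrightarrow K \times_{K_X} (K_X/T) : \ (k, k'T) \longmapsto [(k, k'T)]
\]
is a Poisson map. We temporarily denote this Poisson structure on $K \times_{K_X} (K_X/T)$ by $\pi_0$.

Remark 5.15. In general, suppose that $K$ is a Poisson Lie group and $K_1 \subset K$ is a Poisson subgroup. Suppose that $M$ is a Poisson manifold on which there is a Poisson action by $K_1$. Then there is a unique Poisson structure on $K \times_{K_1} M$ such that the natural projection from $K \times M$ to $K \times_{K_1} M$ is a Poisson map. Moreover, the left action of $K$ on $K \times_{K_1} M$ by left translations on the first factor is a Poisson action. We call this procedure of producing the Poisson $K$-space $K \times_{K_1} M$ from the Poisson $K_1$-space $M$ Poisson induction.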
Proposition 5.16. We have $F_* \pi_0 = \pi_{X,X_1,\lambda}$, where $F$ is the identification
\[
F : K \times_{K_X} (K_X/T) \xrightarrow{\ \sim\ } K/T : \ [(k, k'T)] \longmapsto kk'T.
\]
Proof. Recall that $\pi_{X,X_1,\lambda}$ is the image of $\tilde{\pi}_{r_X(\lambda)} = \Lambda^R - A_X(\lambda)^L$ under the projection $p_1 : K \to K/T$, where $\Lambda^R$ (resp. $A_X(\lambda)^L$) is the right (resp. left) invariant bivector field on $K$ with value $\Lambda$ (resp. $A_X(\lambda)$) at $e$, and $A_X(\lambda) \in \mathfrak{k} \wedge \mathfrak{k}$ is the skew symmetric part of the r-matrix $r_X(\lambda)$ given in (12). On the other hand, $\pi_0$ is the image of $\pi_K \oplus \bar{\pi}$ under the projection
\[
p_2 : K \times K_X \longrightarrow K \times_{K_X} (K_X/T) : \ (k, k') \longmapsto [(k, k'T)],
\]
where $\bar{\pi}$ is the bi-vector field on $K_X$ defined by $\bar{\pi} = \Lambda_1^R - \Lambda_3^L$ with
\[
\Lambda_3 = -\frac{i\varepsilon}{2} \sum_{\alpha \in [X] \cap \Sigma_+} \coth \alpha(\lambda)\, \frac{X_\alpha \wedge Y_\alpha}{2}.
\]
Because of the commutative diagram:
\[
\begin{array}{ccc}
K \times K_X & \xrightarrow{\ m\ } & K \\
\downarrow p_2 & & \downarrow p_1 \\
K \times_{K_X} (K_X/T) & \xrightarrow{\ F\ } & K/T,
\end{array}
\]
where $m : K \times K_X \to K : (k, k') \mapsto kk'$, it is enough to show that $m_*(\pi_K \oplus \bar{\pi}) = \tilde{\pi}_{r_X(\lambda)}$, or
\[
\tilde{\pi}_{r_X(\lambda)}(kk_1) = L_k \bar{\pi}(k_1) + R_{k_1} \pi_K(k), \qquad \forall k \in K,\ k_1 \in K_X.
\]
But this follows easily from the definitions and the fact that $\mathrm{Ad}_{k_1} \Lambda_2 = \Lambda_2$ for all $k_1 \in K_X$. $\square$

We state some more properties of $\pi_{X,X_1,\lambda}$ which can be proved either by definitions or as corollaries of Proposition 5.16.

Proposition 5.17. 1) The embedding $(K_X/T, \pi^X_{X_1,\lambda}) \hookrightarrow (K/T, \pi_{X,X_1,\lambda})$ is a Poisson map;
2) With the Poisson structure $\pi_K$ on $K$, the Poisson structure $\pi^X_{X_1,\lambda}$ on $K_X/T$ and the Poisson structure $\pi_{X,X_1,\lambda}$ on $K/T$, the map
\[
m_1 : K \times (K_X/T) \longrightarrow K/T : \ (k, k'T) \longmapsto kk'T
\]
is a Poisson map;
3) Let $p_* \pi_K$ be the projection to $K/K_X$ of $\pi_K$ by $p : K \to K/K_X : k \mapsto kK_X$. Then the projection map $(K/T, \pi_{X,X_1,\lambda}) \to (K/K_X, p_* \pi_K)$ is a Poisson map.

Remark 5.18. The Poisson structure $p_* \pi_K$ on $K/K_X$ is known as the Bruhat-Poisson structure, because its symplectic leaves are exactly the Bruhat cells in $K/K_X$. See [Lu-We].
5.6. The symplectic leaves of $\pi_{X,X_1,\lambda}$. In this section, we first describe the symplectic leaves of $\pi_{X,X_1,\lambda}$ for any $X \subset S(\Sigma_+)$ but $X_1 = \emptyset$. The description of symplectic leaves for general $\pi_{X,X_1,\lambda}$ is somewhat complicated, and we will leave it to the future. However, we will show that each $\pi_{X,X_1,\lambda}$, for any $X$, $X_1$ and $\lambda$, has at least one open symplectic leaf.

Notation 5.19. We will use $\pi_{X,\emptyset,\lambda}$ to denote the Poisson structure $\pi_{X,X_1,\lambda}$ when $X_1$ is the empty set.

We first recall that the space $K/T$ has the well-known Bruhat decomposition: Because of the Iwasawa decomposition $G = KAN$ of $G$, the natural map $K/T \to G/B : kT \mapsto kB$ is a diffeomorphism. Its inverse map is $G/B \to K/T : gB \mapsto kT$ if $g = kan$ is the Iwasawa decomposition of $g$. Thus we have
\[
K/T \cong G/B = \bigcup_{w \in W} NwB
\]
as a disjoint union. The set $NwB$ is called the Bruhat (or Schubert) cell corresponding to $w \in W$. We denote it by $\Sigma_w$. For $w \in W$, set
\[
\Phi_w = (-w\Sigma_+) \cap \Sigma_+ = \{ \alpha \in \Sigma_+ : w^{-1}\alpha \in -\Sigma_+ \}.
\]
Set $\mathfrak{n}_w = \mathrm{span}_{\mathbb{C}} \{ E_\alpha : \alpha \in \Phi_w \}$ and $N_w = \exp \mathfrak{n}_w$. Then $\Sigma_w$ is parametrized by $N_w$ by the map
\[
j_w : N_w \longrightarrow \Sigma_w : \ n \longmapsto nwB.
\]
Define
\[
j_1 : G \longrightarrow K : \ g = kb \longmapsto k \quad \text{for } k \in K,\ b \in AN;
\]
\[
j_2 : G \longrightarrow K : \ g = bk \longmapsto k \quad \text{for } k \in K,\ b \in AN.
\]
Then we have a left action of $G$ on $K$ by
\[
G \times K \longrightarrow K : \ (g, k) \longmapsto g \circ k := j_1(gk),
\]
and a right action of $G$ on $K$:
\[
K \times G \longrightarrow K : \ (k, g) \longmapsto k^g := j_2(kg).
\]
The parameterization of $\Sigma_w$ by $N_w$ is then also given by
\[
j_w : N_w \longrightarrow \Sigma_w : \ n \longmapsto (n \circ \dot{w})T,
\]
where $\dot{w} \in K$ is any representative of $w$ in $K$.

Notation 5.20. For $k \in K$ and a subgroup $G_1 \subset G$, we set
\[
G_1 \circ k = \{ g \circ k : g \in G_1 \}, \qquad k^{G_1} = \{ k^g : g \in G_1 \}.
\]
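To make the map $j_1$ and the left action $g \circ k = j_1(gk)$ concrete, here is a minimal sketch for the assumed special case $G = SL(2, \mathbb{C})$, $K = SU(2)$, with $AN$ the upper triangular matrices with positive diagonal. In this case $j_1$ amounts to normalizing the first column of $g$ (a Gram-Schmidt step); the function names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def j1(g):
    """K-factor k of g = k b, with k in SU(2) and b upper triangular, positive diagonal."""
    g00, g10 = complex(g[0][0]), complex(g[1][0])
    n1 = math.sqrt(abs(g00) ** 2 + abs(g10) ** 2)
    u, w = g00 / n1, g10 / n1          # first column of k
    return [[u, -w.conjugate()], [w, u.conjugate()]]

def act(g, k):
    """Left action g o k := j1(g k) of G on K."""
    return j1(matmul(g, k))

def max_diff(a, b):
    """Largest entrywise absolute difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

The action property $(g_1 g_2) \circ k = g_1 \circ (g_2 \circ k)$ holds because multiplying $g$ on the right by an element of $AN$ does not change $j_1(g)$.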
It is easy to show that $(AN) \circ k = k^{AN}$ for any $k \in K$. This set is the symplectic leaf of $\pi_K$ in $K$ through the point $k$ (see [So, Lu-We]). Since $K_X \subset K$ is a Poisson submanifold, we know that $(AN) \circ k = k^{AN} \subset K_X$ for $k \in K_X$. Moreover, if $w \in W$ and if $\dot{w} \in K$ is a representative of $w$ in $K$, set $C_{\dot{w}} = (AN) \circ \dot{w} \subset K$. Then
\[
C_{\dot{w}} = (AN) \circ \dot{w} = N \circ \dot{w} = N_w \circ \dot{w} = \dot{w}^{AN} = \dot{w}^{N} = \dot{w}^{N_{w^{-1}}}. \tag{23}
\]
Its image under the projection $K \to K/T$ is the Bruhat cell $\Sigma_w$, which is also the symplectic leaf of the Bruhat Poisson structure $\pi_\infty$ in $K/T$. See [So, Lu-We].

Let $X \subset S(\Sigma_+)$. Denote by $W_X$ the subgroup of $W$ generated by the simple reflections corresponding to elements in $X$. It is the Weyl group for $(\mathfrak{m}_X, \mathfrak{h})$. Introduce the subset $W^X$ of $W$:
\[
W^X = \{ w \in W : \Phi_{w^{-1}} \subset \Sigma_+ \setminus [X] \}.
\]
It follows from the definition that $w \in W^X$ if and only if $w([X] \cap \Sigma_+) \subset \Sigma_+$. Moreover, we have $C_{\dot{w}_1} = \dot{w}_1^{N_X}$ for $w_1 \in W^X$ because $N_{w_1^{-1}} \subset N_X$, where $N_X = \exp \mathfrak{n}_X$ with $\mathfrak{n}_X$ given by (19). The following lemma says that each $w_1 \in W^X$ is the minimal length representative for the coset $w_1 W_X$, and that the set $W^X$ is a "cross section" for the canonical projection from $W$ to the coset space $W/W_X$. For a proof of the Lemma, see [Ko], Prop. 5.13.

Lemma 5.21. For any $w \in W$, there exists a unique $w_1 \in W^X$ and $w_2 \in W_X$ such that $w = w_1 w_2$. Moreover,
\[
\Phi_{w^{-1}} = \Phi_{w_2^{-1}} \cup w_2^{-1} \Phi_{w_1^{-1}}
\]
is a disjoint union, and the components on the right hand side are the respective intersections of $\Phi_{w^{-1}}$ with $[X]$ and $\Sigma_+ \setminus [X]$. Hence, $l(w) = l(w_1) + l(w_2)$.

We can now describe the symplectic leaves of $\pi_{X,\emptyset,\lambda}$ in $K/T$.

Theorem 5.22. 1) For each $w_1 \in W^X$, the union $\bigcup_{w_2 \in W_X} \Sigma_{w_1 w_2}$ is the symplectic leaf of $\pi_{X,\emptyset,\lambda}$ in $K/T$ through the point $w_1 \in K/T$.
2) These are all the symplectic leaves of $\pi_{X,\emptyset,\lambda}$ in $K/T$.

Proof. Set
\[
L_{X,\lambda} = e^\lambda K_X e^{-\lambda} N_X = N_X e^\lambda K_X e^{-\lambda}.
\]
It is the connected subgroup of $G$ with Lie algebra $\mathfrak{l}_{X,\lambda} = \mathrm{Ad}_{e^\lambda}(\mathfrak{n}_X + \mathfrak{k}_X)$. Notice that each $l \in L_{X,\lambda}$ can be written as a unique product $l = n_X e^\lambda k e^{-\lambda}$ for $n_X \in N_X$ and $k \in K_X$. Denote by $S_{w_1}$ the symplectic leaf of $\pi_{X,\emptyset,\lambda}$ through the point $w_1 \in K/T$. Pick a representative $\dot{w}_1$ of $w_1$ in $K$. By Theorem 7.2 of [Lu2] (see also [Ka1]), the symplectic leaf $S_{w_1}$ is the image of the set $\dot{w}_1^{L_{X,\lambda}}$ under the projection $K \to K/T$. We define a map
\[
M : L_{X,\lambda} \longrightarrow N_{w_1^{-1}} \times K_X
\]
as follows: For $l = n_X e^{\lambda} k e^{-\lambda} \in L_{X,\lambda}$, write $k e^{-\lambda} = b k'$, where $b \in A U_X$ with $U_X = \exp \mathfrak{u}_X$ and $k' \in K_X$, so that $l = n_X e^{\lambda} b k'$. Since the map $N_{w_1^{-1}} \to C_{\dot w_1} : n \mapsto \dot w_1^{\,n}$ is a diffeomorphism, there exists a unique $n' \in N_{w_1^{-1}}$ such that $\dot w_1^{\,n'} = \dot w_1^{\,n_X e^{\lambda} b}$. Now define $M(l) = (n', k')$. It is easy to see that the map $M$ is onto and that $\dot w_1^{\,l} = \dot w_1^{\,n'} k' \in C_{\dot w_1} K_X$. This shows that $\dot w_1^{\,L_{X,\lambda}} = C_{\dot w_1} K_X$. It is easy to show that the map $C_{\dot w_1} \times K_X \to C_{\dot w_1} K_X : (c, k) \mapsto ck$ is a diffeomorphism, and that the image of $C_{\dot w_1} K_X$ in $K/T$ under the projection $K \to K/T$ is the union $\bigcup_{w_2 \in W_X} \Sigma_{w_1 w_2}$, which is thus the symplectic leaf of the Poisson structure $\pi_{X,\emptyset,\lambda}$ through the point $w_1 \in K/T$. Now since $K/T = \bigcup_{w_1 \in W^X} S_{w_1}$ is already a disjoint union, we conclude that the collection $\{S_{w_1} : w_1 \in W^X\}$ is that of all symplectic leaves of $\pi_{X,\emptyset,\lambda}$ in $K/T$. □
Let $w_1 \in W^X$. The following proposition identifies the symplectic manifold $S_{w_1} = \bigcup_{w_2 \in W_X} \Sigma_{w_1 w_2}$, as a symplectic leaf of $\pi_{X,\emptyset,\lambda}$ in $K/T$, with the product of two symplectic manifolds. Recall that for $w \in W$ with a representative $\dot w$ in $K$, the set $C_{\dot w} \subset K$ is the symplectic leaf of $\pi_K$ through the point $\dot w$. Recall also from Notation 5.14 the definition of the Poisson structure $\pi^X_{\emptyset,\lambda}$ on $K_X/T$. Note that it is symplectic by Proposition 5.12.
Proposition 5.23. Let $w_1 \in W^X$ and let $\dot w_1$ be a representative of $w_1$ in $K$. Equip $C_{\dot w_1}$ with the symplectic structure as a symplectic leaf of $\pi_K$ in $K$; equip $K_X/T$ with the symplectic structure $\pi^X_{\emptyset,\lambda}$, and finally, equip $S_{w_1}$ with the symplectic structure as a symplectic leaf of $\pi_{X,\emptyset,\lambda}$. Then the map $m_1 : C_{\dot w_1} \times K_X/T \to S_{w_1} : (k, k'T) \mapsto kk'T$ is a diffeomorphism between symplectic manifolds.
Proof. This is a direct consequence of 2) in Proposition 5.17. □
Among all the elements in $W^X$, there is one which is the longest. We denote this element by $w^X$, so $l(w^X) \ge l(w_1)$ for all $w_1 \in W^X$.
Proposition 5.24. The symplectic leaf $S_{w^X}$ of $\pi_{X,\emptyset,\lambda}$ in $K/T$ through the point $w^X$ is open and dense.
Proof. Consider the projection $K/T \to K/K_X : kT \mapsto kK_X$. The image of $\Sigma_{w^X} \subset K/T$ under this projection is an open dense subset (in fact a cell) in $K/K_X$. Since $K/T \to K/K_X$ is a fibration, we know that $S_{w^X}$ is open and dense in $K/T$. □
Corollary 5.25. Each Poisson structure $\pi_{X,\emptyset,\lambda}$ has a finite number of symplectic leaves with at least one of them open and dense.
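The factorization of Lemma 5.21 and the longest element of $W^X$ can be checked by hand in a small case. The following Python sketch takes $W$ to be the symmetric group $S_4$ (the Weyl group of type $A_3$), which is an assumption made here purely for illustration and is not an example from the text. Under that assumption, $W^X$ consists of the permutations that are increasing across every position in $X$, and the Coxeter length is the number of inversions. All function names are hypothetical.

```python
from itertools import permutations


def length(w):
    """Coxeter length of a permutation: its number of inversions."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def compose(u, v):
    """Product uv acting as (uv)(j) = u(v(j))."""
    return tuple(u[v[j]] for j in range(len(v)))


def simple_reflection(i, n):
    """The adjacent transposition s_i exchanging positions i and i+1."""
    s = list(range(n))
    s[i], s[i + 1] = s[i + 1], s[i]
    return tuple(s)


def parabolic_subgroup(X, n):
    """W_X: the subgroup generated by the simple reflections s_i, i in X."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    generators = [simple_reflection(i, n) for i in X]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def minimal_representatives(X, n):
    """W^X: those w with l(w s_i) > l(w) for all i in X, i.e. w[i] < w[i+1]."""
    return [w for w in permutations(range(n)) if all(w[i] < w[i + 1] for i in X)]


def factorizations(X, n):
    """Map each w to the list of pairs (w1, w2) in W^X x W_X with w = w1 w2."""
    table = {}
    for w1 in minimal_representatives(X, n):
        for w2 in parabolic_subgroup(X, n):
            table.setdefault(compose(w1, w2), []).append((w1, w2))
    return table


if __name__ == "__main__":
    n, X = 4, (0, 1)  # W = S_4, W_X generated by s_0 and s_1
    table = factorizations(X, n)
    # Uniqueness of the factorization and additivity of the length.
    assert all(len(pairs) == 1 for pairs in table.values())
    assert all(length(w) == length(w1) + length(w2)
               for w, pairs in table.items() for (w1, w2) in pairs)
    longest = max(minimal_representatives(X, n), key=length)
    print("longest element of W^X:", longest, "with length", length(longest))
```

In this toy case $W_X \cong S_3$ has six elements and $W^X$ has four, so the $24$ elements of $S_4$ are covered exactly once, in line with the "cross section" description of $W^X$.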
Classical Dynamical r-Matrices and Homogeneous Poisson Structures
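Remark 5.26. Note that the statement in Corollary 5.25 may not be true if $X_1 \neq \emptyset$, as is seen from Case 3 of Example 5.4. The description of the symplectic leaves of $\pi_{X,X_1,\lambda}$ in general is somewhat complicated. However, we have
Proposition 5.27. The Poisson structure $\pi_{X,X_1,\lambda}$ for $X = S(\Sigma_+)$ (and $X_1 \subset X$ arbitrary) is non-degenerate at every element in the Weyl group $W$ of $(K, T)$ considered as a point in $K/T$. Consequently, the symplectic leaves of $\pi_{X,X_1,\lambda}$ through these points are open.
Proof. Let $w \in W$ and let $\dot w \in K$ be a representative of $w$ in $K$. Recall from the definition of $\pi_{X,X_1,\lambda}$ that $\pi_{X,X_1,\lambda} = p_* \tilde\pi_1$, where $p : K \to K/T$ is the natural projection and $\tilde\pi_1$ is the bi-vector field on $K$ defined by
$$\tilde\pi_1 = \Lambda^R - A^L,$$
with
$$\Lambda = -\frac{i\varepsilon}{4} \sum_{\alpha \in \Sigma_+} X_\alpha \wedge Y_\alpha \quad\text{and}\quad A = -\frac{i\varepsilon}{4} \sum_{\alpha \in \Sigma_+} \frac{e^{2\alpha(\lambda)} + 1}{e^{2\alpha(\lambda)} - 1}\, X_\alpha \wedge Y_\alpha .$$
Thus
$$l_{\dot w^{-1}} \tilde\pi_1(\dot w) = \mathrm{Ad}_{\dot w^{-1}} \Lambda - A = -\frac{i\varepsilon}{4} \sum_{\alpha \in \Sigma_+} \Big( (X_{w^{-1}\alpha} \wedge Y_{w^{-1}\alpha}) + \Big( \frac{e^{2\alpha(\lambda)} + 1}{e^{2\alpha(\lambda)} - 1}\, X_\alpha \wedge Y_\alpha \Big) \Big)$$
$$= -\frac{i\varepsilon}{4} \sum_{\alpha \in \Sigma_+,\, w\alpha > 0} \Big( 1 + \frac{e^{2\alpha(\lambda)} + 1}{e^{2\alpha(\lambda)} - 1} \Big) X_\alpha \wedge Y_\alpha \;-\; \frac{i\varepsilon}{4} \sum_{\alpha \in \Sigma_+,\, w\alpha < 0} \Big( -1 + \frac{e^{2\alpha(\lambda)} + 1}{e^{2\alpha(\lambda)} - 1} \Big) X_\alpha \wedge Y_\alpha .$$
Since $\frac{e^{2\alpha(\lambda)} + 1}{e^{2\alpha(\lambda)} - 1} \neq \pm 1$, $l_{\dot w^{-1}} \pi_{X,X_1,\lambda}(\dot w T) = p_* l_{\dot w^{-1}} \tilde\pi_1(\dot w) \in \wedge^2 T_e(K/T)$ is non-degenerate. Hence $\pi_{X,X_1,\lambda}$ is non-degenerate at $w = \dot w T \in K/T$. □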
Corollary 5.28. For any $X$, $X_1$ and $\lambda$, the Poisson structure $\pi_{X,X_1,\lambda}$ on $K/T$ has at least one open symplectic leaf.
Proof. We use Proposition 5.16 which says that $\pi_{X,X_1,\lambda}$ can be obtained via Poisson induction from the Poisson structure $\pi^X_{X_1,\lambda}$ on $K_X/T$. Recall the definition of $\pi^X_{X_1,\lambda}$ from Notation 5.14. Since $X$ is the set of all simple roots for the root systems for $(K_X, T)$, we know from Proposition 5.27 that $\pi^X_{X_1,\lambda}$ is non-degenerate at every Weyl group element in $W_X$, regarded as points in $K_X/T$. Let $w_2 \in W_X$. Recall that $w^X$ is the longest element in the set $W^X$. Let $\dot w^X$ be any representative of $w^X$ in $K$. Recall that $C_{\dot w^X}$ is the symplectic leaf of $\pi_K$ in $K$ through $\dot w^X$. By Proposition 5.17, the map
$$(C_{\dot w^X}, \pi_K) \times (K_X/T, \pi^X_{X_1,\lambda}) \to (K/T, \pi_{X,X_1,\lambda}) : (k, k'T) \mapsto kk'T$$
is a Poisson map. But this map is a diffeomorphism onto its image which is open because it is the inverse image under the natural projection $K/T \to K_X/T$ of the biggest cell in $K_X/T$. Thus the symplectic leaf of $\pi_{X,X_1,\lambda}$ through the point $\dot w^X w_2 \in K/T$ is open. □
Note that the proof of Corollary 5.28 shows that $\pi_{X,X_1,\lambda}$ is open at every point in the coset $w^X W_X \subset K/T$.
Example 5.29. Corollary 5.28 can be checked directly for the case of $\mathfrak{g} = sl(2, \mathbb{C})$ by looking at the explicit formulas in Example 5.4.
5.7. The modular vector fields and the leaf-wise moment maps for the $T$-actions. For an orientable Poisson manifold $(P, \pi)$ and a given volume form $\mu$ on $P$, the modular vector field of $\pi$ associated to $\mu$ is defined to be the vector field $v_\mu$ on $P$ satisfying $v_\mu \lrcorner\, \mu = d(\pi \lrcorner\, \mu)$. It measures how Hamiltonian flows on $P$ fail to preserve $\mu$. More details can be found in [W]. Coming back to $(K, \pi_K)$-homogeneous Poisson structures on $K/T$, we set $\rho = \frac{1}{2} \sum_{\alpha \in \Sigma_+} \alpha$ for the choice of $\Sigma_+$ in the definition of $\pi_K$. Then we have $iH_\rho \in \mathfrak{t}$. We use $\sigma_{iH_\rho}$ to denote the infinitesimal generator of the $T$ action on $K/T$ by left translations in the direction of $iH_\rho$.
Proposition 5.30. For the Poisson structure $\pi_K$ on $K$ defined by (14) with $\Lambda$ given in (15), all $(K, \pi_K)$-homogeneous Poisson structures on $K/T$, and in particular all the $\pi_{X,X_1,\lambda}$'s, have the same modular vector field $v$, namely $v = -i\varepsilon\sigma_{iH_\rho}$, with respect to a (and thus any) $K$-invariant volume form on $K/T$.
Remark 5.31. Proposition 5.30 is a statement about any Poisson Lie group structure on $K$ since the Poisson structure $\pi_K$ on $K$ defined by (14) with $\Lambda$ given in (15) is the most general form of such structures.
Proof of Proposition 5.30. Let $\pi$ be an arbitrary $(K, \pi_K)$-homogeneous Poisson structure.
Then we know that $\pi$ is the sum $\pi = \pi(e)^L + p_*\pi_K$, where $\pi(e)^L$ is the $K$-invariant bi-vector field on $K/T$ whose value at $e = eT$ is $\pi(e)$, and $p_*\pi_K$ is the projection of $\pi_K$ from $K$ to $K/T$ by $p : K \to K/T : k \mapsto kT$ (it is the Bruhat Poisson structure $\pi_\infty$ when $u = 0$ in the definition of $\Lambda$). Let $\mu$ be a $K$-invariant volume form on $K/T$. Let $b_\mu$ be the degree $-1$ operator on $\chi^\bullet(K/T)$ defined by $b_\mu(U) = (-1)^{|U|} d(U \lrcorner\, \mu)$, so that $v = b_\mu(\pi)$ [E-L-W]. Then $b_\mu(\pi) = b_\mu(\pi(e)^L) + b_\mu(p_*\pi_K)$. Since $\mu$ is $K$-invariant, the operator $b_\mu$ maps a $K$-invariant multi-vector field to another such. Hence $b_\mu(\pi(e)^L)$ must be a $K$-invariant (1-)vector field so it must be zero. Thus $b_\mu(\pi) = b_\mu(p_*\pi_K)$. It is proved in [E-L-W] that $b_\mu(p_*\pi_K) = -i\varepsilon\sigma_{iH_\rho}$, which is therefore the modular vector field for any $\pi$. □
The modular vector field is always a Poisson vector field [W], but it is not necessarily Hamiltonian in general. For the rest of this section, we study this problem for the modular vector field $v = -i\varepsilon\sigma_{iH_\rho}$ for the Poisson structure $\pi_{X,\emptyset,\lambda}$. We will show that although $v$ is not globally Hamiltonian unless $X = S(\Sigma_+)$, it is leaf-wise Hamiltonian, and we describe its Hamiltonian function on each leaf. In fact, since every $\pi_{X,\emptyset,\lambda}$ is $T$-invariant (for the $T$ action on $K/T$ by left translations), we will describe the moment map for the $T$-action on each symplectic leaf of $\pi_{X,\emptyset,\lambda}$. We are particularly interested in the behavior of these moment maps when $\lambda$ goes to infinity in various directions as in Sect. 5.2.
We first look at the Bruhat Poisson structure $\pi_\infty$ corresponding to $X = \emptyset$. This case (when $\varepsilon = i$) is studied in detail in [Lu3]. We recall the results there. Denote by $P_A : G = KAN \to A : g = kan \mapsto a$, where $G = KAN$ is the Iwasawa decomposition of $G$ (as a real Lie group). For each $w \in W$, choose a representative $\dot w \in K$ of $w$ in $K$, and use $j_w : N_w \to \Sigma_w : n \mapsto (n \circ \dot w)T$ to parameterize the Bruhat cell $\Sigma_w$. For $n \in N_w$, let $a_w(n) = P_A(n\dot w) \in A$. The element $a_w(n)$ is independent of the choice of $\dot w$, so we have a well-defined map $a_w : N_w \to A : n \mapsto a_w(n)$. Denote by $\Omega_w$ the symplectic structure on $\Sigma_w$ as a symplectic leaf of $\pi_\infty$. Then each $(\Sigma_w, \Omega_w)$ is a Hamiltonian $T$-space. The following fact is proved in [Lu3].
Proposition 5.32. The map
$$\phi_w : \Sigma_w \to \mathfrak{t}^* : \quad \langle \phi_w, x \rangle (kT) = \frac{2i}{\varepsilon}\, \mathrm{Im}\, \langle\!\langle \mathrm{Ad}_{\dot w} \log a_w(j_w^{-1}(kT)),\, x \rangle\!\rangle, \qquad x \in \mathfrak{t},$$
is the moment map for the $T$-action on $(\Sigma_w, \Omega_w)$ such that $\phi_w(w) = 0$.
In [Lu3], we have written down an explicit formula for $\phi_w$ in certain Bott-Samelson type coordinates $\{z_1, \bar z_1, z_2, \bar z_2, \ldots, z_{l(w)}, \bar z_{l(w)}\}$. It takes the form
$$\langle \phi_w, x \rangle = -\frac{1}{\varepsilon} \sum_{j=1}^{l(w)} \frac{2\alpha_j(x)}{\langle\!\langle \alpha_j, \alpha_j \rangle\!\rangle} \log(1 + |z_j|^2),$$
where $\{\alpha_1, \alpha_2, \ldots, \alpha_{l(w)}\} = \Sigma_+ \cap (-w\Sigma_+)$. In particular, letting $x = -i\varepsilon(iH_\rho) = \varepsilon H_\rho$, we get a Hamiltonian function for the vector field $v = -i\varepsilon\sigma_{iH_\rho}$ on $(\Sigma_w, \Omega_w)$ as
$$\langle \phi_w, \varepsilon H_\rho \rangle = -\sum_{j=1}^{l(w)} \frac{2 \langle\!\langle \rho, \alpha_j \rangle\!\rangle}{\langle\!\langle \alpha_j, \alpha_j \rangle\!\rangle} \log(1 + |z_j|^2).$$
This function goes to $-\infty$ as $|z_j| \to \infty$, which corresponds to the boundary of $\Sigma_w$. Thus, the modular vector field $v$ cannot be globally Hamiltonian on $K/T$.
Next, we look at the case when $X = S(\Sigma_+)$, so $\pi_{X,\emptyset,\lambda} = \pi_\lambda$ is the symplectic structure on $K/T$ obtained by identifying $K/T$ with the dressing orbit in the group $AN$ through the point $e^{-\lambda}$ (see Proposition 5.12). Since $K/T$ is simply connected, the $T$-action on $K/T$ is Hamiltonian. The following fact is proved in [L-R].
Proposition 5.33. The moment map for the $T$-action on $(K/T, \pi_\lambda)$ is given by
$$\Phi_\lambda : K/T \to \mathfrak{t}^* : \quad \langle \Phi_\lambda, x \rangle (kT) = \frac{2i}{\varepsilon}\, \mathrm{Im}\, \langle\!\langle \log(P_A(k e^{-\lambda} k^{-1})),\, x \rangle\!\rangle, \qquad x \in \mathfrak{t}.$$
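Remark 5.34. This fact plays the key role in the symplectic proof of Kostant’s nonlinear convexity theorem given in [L-R].
Corresponding to the fact that $\lim_{t \to +\infty} \pi_{\lambda + t\check\rho} = \pi_\infty$, where $\check\rho$ is the sum of all fundamental co-weights, the two moment maps are related as follows.
Proposition 5.35. For any $\lambda \in \mathfrak{a}$, $w \in W$ and $kT \in \Sigma_w$,
$$\lim_{t \to +\infty} \big( \Phi_{\lambda + t\check\rho}(kT) - \Phi_{\lambda + t\check\rho}(w) \big) = \phi_w(kT), \qquad \lim_{t \to +\infty} d\Phi_{\lambda + t\check\rho}(kT) = d\phi_w(kT).$$
Proof. Using the parameterization of $\Sigma_w$ by $N_w$, we regard both $\Phi_{\lambda + t\check\rho}|_{\Sigma_w}$ and $\phi_w$ as ($\mathfrak{t}^*$-valued) functions on $N_w$. Let $n \in N_w$ with $k = n \circ \dot w$. Write $n\dot w = k\, a_w(n)\, m(n)$ with $m(n) \in N_w$. Then
$$e^{-\lambda} k^{-1} = \big( e^{-\lambda} a_w(n) m(n) a_w(n)^{-1} e^{\lambda} \dot w^{-1} \big) \big( \dot w e^{-\lambda} a_w(n) \dot w^{-1} \big) n^{-1}.$$
Thus, for any $x \in \mathfrak{t}$,
$$\langle \Phi_{\lambda + t\check\rho}(n) - \Phi_{\lambda + t\check\rho}(e) - \phi_w(n),\, x \rangle = \frac{2i}{\varepsilon}\, \mathrm{Im}\, \langle\!\langle \log P_A\big( e^{-\lambda - t\check\rho} a_w(n) m(n) a_w(n)^{-1} e^{\lambda + t\check\rho} \dot w^{-1} \big),\, x \rangle\!\rangle,$$
where $e \in N_w$ is the identity element. Consider now the map $\psi_t : N_w \to N_w : m \mapsto e^{-\lambda - t\check\rho}\, m\, e^{\lambda + t\check\rho}$. Under the identification of $\mathfrak{n}_w$ with $N_w$ by the exponential map of $N_w$, this is the linear map $\mathrm{Ad}_{-\lambda - t\check\rho}$ on $\mathfrak{n}_w$, which goes to $0$ as $t \to +\infty$. Thus
$$\lim_{t \to +\infty} \psi_t(m) = 0 \qquad\text{and}\qquad \lim_{t \to +\infty} d\psi_t(m) = 0$$
for all $m \in N_w$. But we have the composition of maps
$$\langle \Phi_{\lambda + t\check\rho}(n) - \Phi_{\lambda + t\check\rho}(e) - \phi_w(n),\, x \rangle = \eta_x(\psi_t(\xi(n))),$$
where $\eta_x : N_w \to \mathbb{R} : m \mapsto \frac{2i}{\varepsilon}\, \mathrm{Im}\, \langle\!\langle \log P_A(m \dot w^{-1}),\, x \rangle\!\rangle$ and $\xi : N_w \to N_w : n \mapsto a_w(n) m(n) a_w(n)^{-1}$. Thus the two limits in Proposition 5.35 hold. □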
Now consider the general case of $\pi_{X,\emptyset,\lambda}$. Recall that the symplectic leaves of $\pi_{X,\emptyset,\lambda}$ in $K/T$ are indexed by elements in $W^X$. We keep the notation in Proposition 5.23, in which we have used the map $m_1$ to identify the symplectic leaf $S_{w_1}$ of $\pi_{X,\emptyset,\lambda}$ in $K/T$ with the product symplectic manifold $C_{\dot w_1} \times K_X/T$. We use the projection map $C_{\dot w_1} \to \Sigma_{w_1} : k \mapsto kT$ to identify $C_{\dot w_1}$ and $\Sigma_{w_1}$. This identification is $T$-equivariant if we equip $C_{\dot w_1}$ with the $T$-action
$$T \times C_{\dot w_1} \to C_{\dot w_1} : \quad t \cdot k \mapsto t k (\dot w_1^{-1} t^{-1} \dot w_1).$$
Equip $C_{\dot w_1} \times K_X/T$ with the $T$-action
$$T \times (C_{\dot w_1} \times K_X/T) \to C_{\dot w_1} \times K_X/T : \quad t \cdot (k, k'T) \mapsto \big( t k (\dot w_1^{-1} t^{-1} \dot w_1),\; \dot w_1^{-1} t \dot w_1 k'T \big).$$
Then the map $m_1$ in Proposition 5.23 is $T$-equivariant. Denote by $\Phi_{\lambda,X}$ the moment map for the $T$-action on $(K_X/T, \pi^X_{\emptyset,\lambda})$. Then the moment map for the $T$-action on $S_{w_1} \cong C_{\dot w_1} \times K_X/T$ is given by
$$\langle \phi_{\lambda,X,w_1}(k, k'T),\, x \rangle = \langle \phi_{w_1}(kT),\, x \rangle + \langle \Phi_{\lambda,X}(k'T),\, \mathrm{Ad}_{\dot w_1^{-1}} x \rangle$$
for all $x \in \mathfrak{t}$.
Remark 5.36. There remain many problems to be addressed concerning the Poisson structures $\pi_{X,X_1,\lambda}$. Other than the description of their symplectic leaves in the general case, one can try to compute their Poisson cohomology according to the theory developed in [Lu2]. One can also study the $K$-invariant Poisson harmonic forms [E-L1] of $\pi_{X,X_1,\lambda}$. Another problem is to construct the symplectic groupoids for $\pi_{X,X_1,\lambda}$. We hope to treat these problems in the future.
Acknowledgement. The author would like to thank P. Etingof for explaining to her the results in [E-V] and Professors V. Drinfeld, S. Evens, Y. Kosmann-Schwarzbach, A. Weinstein and P. Xu for helpful discussions. She would also like to thank the Mathematics Department of Hong Kong University of Science and Technology for its hospitality. Special thanks to the referee for useful comments.
References
[B-D] Belavin, A. and Drinfeld, V.: Solutions of the classical Yang–Baxter equations for simple Lie algebras. Funct. Anal. Appl. 16, 159–180 (1982)
[D1] Drinfeld, V. G.: Hamiltonian structures on Lie groups, Lie bialgebras and the geometric meaning of the classical Yang–Baxter equations. Soviet Math. Dokl. 27 (1), 68–71 (1983)
[D2] Drinfeld, V.: Quantum groups. Proc. Intern. Congr. Math., Berkeley, 1, 1986, pp. 798–820
[D3] Drinfeld, V. G.: On Poisson homogeneous spaces of Poisson-Lie groups. Theo. Math. Phys. 95 (2), 226–227 (1993)
[D-P] De Concini, C. and Procesi, C.: Complete symmetric varieties. In Invariant Theory (Montecatini, 1982), Lecture Notes in Math., Vol. 996, Berlin–New York: Springer, 1983, pp. 1–44
[E-V] Etingof, P. and Varchenko, A.: Geometry and classification of solutions of the classical dynamical Yang–Baxter equation. Commun. Math. Phys. 192, 77–120 (1998)
[E-L-W] Evens, S., Lu, J-H., and Weinstein, A.: Transverse measures, the modular class, and a cohomology pairing for Lie algebroids. Quarterly J. Math. 50, 417–436 (1999)
[E-L1] Evens, S., and Lu, J-H.: Poisson harmonic forms, the Kostant harmonic forms, and the $S^1$-equivariant cohomology of $K/T$. Adv. Math. 142, 171–220 (1999)
[E-L2] Evens, S., and Lu, J-H.: On the variety of Lagrangian subalgebras. Preprint, 1999
[F] Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Proceedings of the ICM, Zurich, 1994
[Ka1] Karolinsky, E.: The symplectic leaves on Poisson homogeneous spaces of Poisson-Lie groups. Mathematical Physics, Analysis, Geometry 2 No. 3/4 (in Russian), 306–311 (1999)
[Ka2] Karolinsky, E.: The classification of Poisson homogeneous spaces of compact Poisson Lie groups. Mathematical Physics, Analysis, and Geometry 3 No. 3/4 (in Russian), 274–289 (1996)
[Ka3] Karolinsky, E.: A classification of Poisson homogeneous spaces of complex reductive Poisson Lie groups. math.QA/9901073
[Ko] Kostant, B.: Lie algebra cohomology and generalized Schubert cells. Ann. of Math. 77 (1), 72–144 (1963)
[L-X] Liu, Z.-J., Xu, P.: Dirac structures and dynamical r-matrices. Preprint.
[Lu-We] Lu, J. H., Weinstein, A.: Poisson Lie groups, dressing transformations, and Bruhat decompositions. J. Diff. Geom. 31, 501–526 (1990)
[Lu1] Lu, J. H.: Multiplicative and affine Poisson structures on Lie groups. UC Berkeley thesis, 1990
[L-R] Lu, J. H., Ratiu, T.: On the nonlinear convexity theorem of Kostant. J. of AMS 4, No. 2, 349–363 (1991)
[Lu2] Lu, J. H.: Poisson homogeneous spaces and Lie algebroids associated to Poisson actions. Duke Math. J. 86, No. 2, 261–304 (1997)
[Lu3] Lu, J. H.: Coordinates on Schubert cells, Kostant’s harmonic forms, and the Bruhat Poisson structure on G/B. Trans. groups 4, No. 4, 355–374 (1999)
[O-S] Oshima, T., and Sekiguchi, J.: Eigenspaces of invariant differential operators on an affine symmetric space. Inventiones Math. 57, 1–81 (1980)
[Sc] Schiffmann, O.: On classification of dynamical r-matrices. Math. Res. Letters 5, 13–30 (1998)
[STS1] Semenov-Tian-Shansky, M. A.: What is a classical r-matrix? Funct. Anal. Appl. 17 (4), 259–272 (1983)
[STS2] Semenov-Tian-Shansky, M. A.: Dressing transformations and Poisson Lie group actions. Publ. RIMS, Kyoto University 21, 1237–1260 (1985)
[Se] Serre, J.-P.: Complex semisimple Lie algebras. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[Sh] Sheu, A.: Quantization of the Poisson SU(2) and its Poisson homogeneous space, the 2-sphere. Commun. Math. Phys. 135, 217–232 (1991)
[So] Soibelman, Y.: The algebra of functions on a compact quantum group, and its representations. St. Petersburg Math. J. 2 (1), 161–178 (1991)
[W] Weinstein, A.: The modular automorphism group of a Poisson manifold. J. Geom. Phys. 23, 379–394 (1997)
Communicated by T. Miwa
Commun. Math. Phys. 212, 371 – 394 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Renormalization and Periodic Orbits for Hamiltonian Flows
Juan J. Abad, Hans Koch∗
Department of Mathematics, University of Texas at Austin, Austin, TX 78712, USA
Received: 5 October 1999 / Accepted: 2 February 2000
∗ Supported in Part by the National Science Foundation under Grant No. DMS-9705095.
Abstract: We consider a renormalization group transformation $\mathcal{R}$ for analytic Hamiltonians in two or more dimensions, and use this transformation to construct invariant tori, as well as sequences of periodic orbits with rotation vectors approaching that of the invariant torus. The construction of periodic and quasiperiodic orbits is limited to near-integrable Hamiltonians. But as a first step toward a non-perturbative analysis, we extend the domain of $\mathcal{R}$ to include any Hamiltonian for which a certain non-resonance condition holds.
1. Introduction and Results
In this paper we complement and extend the results given in [15], by using a renormalization group (RG) transformation to construct sequences of periodic orbits for near-integrable Hamiltonians, and by extending the domain of this transformation to a larger set of Hamiltonians. The construction of periodic orbits that approximate quasiperiodic motion is a canonical application of RG ideas; see for example [1, 7, 9, 11, 18, 21]. It relates observed universal accumulation rates to eigenvalues of the linearized RG transformation. This part of our analysis is restricted to near-integrable Hamiltonians, since RG fixed points relevant to critical cases have not yet been obtained rigorously. But the first part, which includes the definition of the RG transformation, does not require near-integrability. The work presented here is essentially self-contained, but for a motivation of some of our choices, and other background material, the reader is referred to [15].
We start with some definitions. On $\mathbb{C}^d$, consider the two norms $|v| = \sum_j |v_j|$ and $\|v\| = \max_j |v_j|$. Let $V$ and $W$ be two fixed but arbitrary $d \times d$ matrices over $\mathbb{C}$, satisfying $V^T W = I$, where $V^T$ denotes the transpose of $V$. Define $D_{\rho,1} = \{q \in \mathbb{C}^d : |V\,\mathrm{Im}\,q| < \rho\}$ and $D_{\rho,2} = \{p \in \mathbb{C}^d : \|Wp\| < \rho\}$, for every $\rho > 0$. Unless stated
otherwise, we will identify $D_{\rho,1}$ with a complexified $d$-torus. In particular, a function on $D_\rho = D_{\rho,1} \times D_{\rho,2}$ is assumed to be $2\pi$-periodic in each component of its first argument. If analytic, such a function $H$ may be written as a Fourier–Taylor series
$$H(q, p) = \sum_{(\nu,\alpha) \in I} H_{\nu,\alpha}\, (Wp)^\alpha e^{iq\cdot\nu}, \qquad (q, p) \in D_\rho, \tag{1.1}$$
where $I = \mathbb{Z}^d \times \mathbb{N}^d$. Here, and in what follows, $x \cdot y$ denotes the standard dot product of two vectors in $\mathbb{C}^d$, and $x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_d^{\alpha_d}$.
Definition 1.1. Given any $\rho > 0$, define $\mathcal{A}_\rho$ to be the Banach space of all analytic Hamiltonians $H$ on $D_\rho$ of the form (1.1), for which the norm
$$|H|_\rho = \sum_{(\nu,\alpha) \in I} |H_{\nu,\alpha}|\, \rho^{|\alpha|} e^{\rho\|W\nu\|} \tag{1.2}$$
is finite. The identity operator on $\mathcal{A}_\rho$ will be denoted by $\mathbb{I}$. On the product space $\mathcal{A}_\rho^d$ we define the two norms $|f|_\rho = \sum_j |f_j|_\rho$ and $\|f\|_\rho = \max_j |f_j|_\rho$. Let $\mathcal{A}'_\rho$ be the space of all functions in $\mathcal{A}_\rho$ whose first partial derivatives belong to $\mathcal{A}_\rho$. On this space, we consider the following two seminorms: Denoting by $\nabla_j H$ the partial gradient of $H$ with respect to the $j$-th argument, $j = 1, 2$, we define $|H|'_\rho$ and $\|H\|'_\rho$ to be the sum and maximum, respectively, of the numbers $\|W\nabla_1 H\|_\rho$ and $|V\nabla_2 H|_\rho$.
Our first result is concerned with the possibility of finding a canonical change of variables that eliminates the component of a Hamiltonian in the direction of some predefined subspace of $\mathcal{A}_\rho$. The coordinate changes are restricted to those canonical transformations $U : (q, p) \mapsto (q + Q, p + P)$ for which the one-form $P \cdot dq + (p + P) \cdot dQ$ is not just closed, but the differential of a function $S$. The function $\phi = p \cdot Q - S$, expressed in terms of $q$ and $p + P$ (if possible), satisfies the equation
$$\big( Q(q, p),\, P(q, p) \big) = (\mathbb{J}\nabla\phi)\big( q,\, p + P(q, p) \big), \tag{1.3}$$
where $\mathbb{J}(q, p) = (p, -q)$. Conversely, given $\phi \in \mathcal{A}_\rho$ sufficiently small, we can use (1.3) to define a canonical transformation $U = U_\phi$. Denote by $\widehat{H}$ the Hamiltonian vector field associated with a Hamiltonian $H$, that is, $\widehat{H} f = (\mathbb{J}\nabla H) \cdot \nabla f$.
Theorem 1.2. Let $\rho, c > 0$, and let $\mathcal{H}$ be some non-empty open set of Hamiltonians $H \in \mathcal{A}_\rho$ satisfying $|H|_\rho < c$. Let $\mathbb{I}^-$ be a projection operator on $\mathcal{A}_{\rho'}$, where $0 < \rho' < \rho$, and assume that $\mathbb{I}^-$ satisfies the following non-resonance condition: There are constants $a, b > 0$, such that for all values of $r$ in the interval $[\rho', (\rho' + 8\rho)/9]$,
(a) if $f \in \mathcal{A}_r$, then $\mathbb{I}^- f \in \mathcal{A}_r$ and $|\mathbb{I}^- f|_r \le a|f|_r$;
(b) $|\mathbb{I}^- H|_\rho < a b^2 c (1 - \rho'/\rho)^2 / 1231$, for all $H \in \mathcal{H}$;
(c) for every $H \in \mathcal{H}$, $\mathbb{I}^- \widehat{H}$ maps $\mathbb{I}^- \mathcal{A}'_r$ onto $\mathbb{I}^- \mathcal{A}_r$, and if $\phi \in \mathbb{I}^- \mathcal{A}'_r$ is nonzero, then $|\mathbb{I}^- \widehat{H} \phi|_r > a b c \rho^{-1} \|\phi\|'_r$.
Then there exists a map $\mathcal{U}$ that assigns to each $H \in \mathcal{H}$ a canonical transformation $U_H$ from $D_{\rho'}$ to $D_{(\rho' + \rho)/2}$, such that
$$H \circ U_H \in \mathcal{A}_{\rho'}, \qquad \mathbb{I}^- (H \circ U_H) = 0. \tag{1.4}$$
The map $\mathcal{N} : H \mapsto H \circ U_H$ is analytic from $\mathcal{H}$ to $\mathcal{A}_{\rho'}$. Furthermore, if $H \in \mathcal{H}$ satisfies $\mathbb{I}^- H = 0$, then $U_H = \mathrm{I}$ and $D\mathcal{N}(H) = \mathbb{I}^+ - \mathbb{I}^+ \widehat{H} \big( \mathbb{I}^- \widehat{H}\, \mathbb{I}^- \big)^{-1} \mathbb{I}^-$, where $\mathbb{I}^+ = \mathbb{I} - \mathbb{I}^-$.
This theorem was proved in [15], in the special case where $\mathcal{H}$ is a small neighborhood of a linear Hamiltonian $(q, p) \mapsto \omega \cdot p$, and $\mathbb{I}^-$ is a specific projection adapted to the choice of the vector $\omega$. A similar projection will be considered again here. Its usefulness for renormalization derives from a combination of Theorem 1.2, and Lemma 1.3 below. Given a nonzero vector $\omega \in \mathbb{R}^d$, and two positive constants $\sigma$ and $\kappa$, let
$$I^- = \{(\nu, \alpha) \in I : |\omega \cdot \nu| > \sigma\|W\nu\| + \kappa|\alpha|\}, \qquad I^+ = I \setminus I^-, \tag{1.5}$$
and define $\mathbb{I}^\pm H$ by restricting the sum in (1.1) to the corresponding index sets $I^\pm$,
$$\mathbb{I}^\pm H(q, p) = \sum_{(\nu,\alpha) \in I^\pm} H_{\nu,\alpha}\, (Wp)^\alpha e^{iq\cdot\nu}. \tag{1.6}$$
The functions $\mathbb{I}^- H$ and $\mathbb{I}^+ H$ will be referred to as the non-resonant and resonant part, respectively, of $H$. Clearly, the projection $\mathbb{I}^-$ satisfies the condition (a) of Theorem 1.2, with $a = 1$. Notice that if $H(q, p)$ depends on $p$ only, then $\mathbb{I}^- H = 0$.
For the purpose of renormalization, we now focus on vectors $\omega = (1, \omega_2, \ldots, \omega_d)$ whose components span a real algebraic number field of degree $d$. In particular, $\omega \cdot \nu \neq 0$ for every nonzero vector $\nu$ in $\mathbb{Z}^d$. Another consequence of this assumption [15] is that there exists an integral $d \times d$ matrix $T$ such that
(T1) $T$ has a simple real eigenvalue $\vartheta > 1$, and $T\omega = \vartheta\omega$.
(T2) All other eigenvalues of $T$ are simple, and of modulus less than 1.
(T3) $\det(T) = \pm 1$.
Such a matrix $T$ provides a way of approximating $\omega$ by vectors with rational components: If $w \in \mathbb{Q}^d$ is nonzero, then the vector $T^n w$, when rescaled such that its first component is one, approaches $\omega$ as $n \to \infty$. The same approximating sequences can also be found in some Hamiltonian systems [1, 3, 9–13, 18, 21, 26], in the form of periodic orbits that accumulate at an invariant $\omega$-torus. In order to investigate this Hamiltonian “representation” of the arithmetic related to $\omega$, we “lift” the inverse of $T$, viewed as a map on frequency vectors, to a transformation acting on a space of Hamiltonians. A transformation that has some of the required properties is $H \mapsto \mu^{-1} H \circ T_\mu$, where
$$T_\mu(q, p) = \big( Tq,\; \mu (T^*)^{-1} p \big), \qquad \mu \neq 0. \tag{1.7}$$
It combines a canonical change of variables (case $\mu = 1$) with a scaling in $p$, and is part of most RG schemes for Hamiltonians [1, 4–6, 15, 17, 19, 20]. By itself, this transformation is not a dynamical system on any of the spaces $\mathcal{A}_\rho$, since the domain $D_\rho$ is not left invariant by $T_\mu$. But we can combine it with Theorem 1.2: The following lemma shows that a canonical change of variables, that eliminates the non-resonant part of a Hamiltonian, can “transfer analyticity” from the variable $p$ to the variable $q$.
After fixing an integral $d \times d$ matrix $T$ satisfying (T1–T3), we adapt the matrices $W$ and $V$ to our choice of $T$, by assuming that the row vectors $W_1, W_2, \ldots, W_d$ of $W$ are an ordered basis of eigenvectors for $T$,
$$T W_j = \vartheta_j W_j, \qquad |\vartheta_j| \le |\vartheta_i|, \qquad 1 \le i \le j \le d, \tag{1.8}$$
with $W_1 = \omega$. In addition we fix the parameters $\sigma, \kappa > 0$, with $\sigma$ restricted by the condition $|\vartheta_2| + \sigma(\vartheta - |\vartheta_2|) < 1$.
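To make conditions (T1–T3) and the approximation of $\omega$ by the rescaled vectors $T^n w$ concrete, here is a minimal Python sketch for $d = 2$. The golden-mean data used below ($T$ with rows $(1, 1)$ and $(1, 0)$, and $\omega = (1, \vartheta^{-1})$ with $\vartheta = (1 + \sqrt{5})/2$) are an assumption chosen for illustration and are not taken from the text; the function names are hypothetical as well.

```python
import math

# Hypothetical d = 2 example (not from the text): the golden-mean frequency vector.
THETA = (1 + math.sqrt(5)) / 2      # expanding eigenvalue of T, see (T1)
THETA2 = -1 / THETA                 # the other eigenvalue, of modulus < 1, see (T2)
T = ((1, 1), (1, 0))                # integral matrix
OMEGA = (1.0, 1 / THETA)            # satisfies T OMEGA = THETA * OMEGA


def apply(matrix, v):
    """Matrix-vector product for matrices stored as tuples of rows."""
    return tuple(sum(row[j] * v[j] for j in range(len(v))) for row in matrix)


def determinant(matrix):
    """Determinant of a 2 x 2 matrix, used to check (T3)."""
    (a, b), (c, d) = matrix
    return a * d - b * c


def approximants(w, n):
    """Return the list T w, T^2 w, ..., T^n w."""
    out, v = [], tuple(w)
    for _ in range(n):
        v = apply(T, v)
        out.append(v)
    return out


def rescale(v):
    """Rescale v so that its first component equals one."""
    return tuple(x / v[0] for x in v)


def max_sigma():
    """Supremum of the sigma allowed by |theta_2| + sigma (theta - |theta_2|) < 1."""
    return (1 - abs(THETA2)) / (THETA - abs(THETA2))


if __name__ == "__main__":
    for v in approximants((1, 0), 10):
        # The second component of the rescaled vector approaches OMEGA[1].
        print(v, rescale(v)[1] - OMEGA[1])
    print("sigma must be smaller than", max_sigma())
```

Starting from $w = (1, 0)$, the iterates are pairs of consecutive Fibonacci numbers, so the rescaled vectors are the familiar continued-fraction approximants of the golden mean.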
Lemma 1.3. Let $0 < \rho' < \rho$ and $\mu \in \mathbb{C}$ be given such that
$$|\vartheta_2| + \sigma\big(\vartheta - |\vartheta_2|\big) < \frac{\rho'}{\rho}, \qquad 0 < \Big|\frac{\mu}{\vartheta_d}\Big|\, e^{\rho\kappa(\vartheta - |\vartheta_2|)} < \frac{\rho'}{\rho}. \tag{1.9}$$
Then every function $H \in \mathbb{I}^+ \mathcal{A}_{\rho'}$ extends analytically to $T_\mu D_\rho$, and $H \mapsto H \circ T_\mu$ is a compact linear map from $\mathbb{I}^+ \mathcal{A}_{\rho'}$ to $\mathcal{A}_\rho$, whose operator norm is $\le 1$.
Let now $\mathcal{H}$ be some subset of $\mathcal{A}_\rho$ for which the given projection $\mathbb{I}^-$ satisfies a non-resonance condition, as described in Theorem 1.2.
Definition 1.4. Given a nonzero complex number $\mu$ of modulus less than $|\vartheta_d|$, define
$$\mathcal{R}_\mu(H) = \frac{\eta}{\mu}\, H \circ U_H \circ T_\mu, \qquad H \in \mathcal{H}, \tag{1.10}$$
where $\eta = \eta(H)$ is determined such that the coefficient $\widetilde{H}_{0,(1,0,\ldots,0)}$ of the renormalized Hamiltonian $\widetilde{H} = \mathcal{R}_\mu(H)$ is equal to one, if this is possible.
The action of this transformation is particularly simple when restricted to functions of the form
$$h_w(q, p) = w \cdot p, \qquad w \in \mathbb{C}^d. \tag{1.11}$$
In particular, it is independent of the choice of $\mu$, since $h_w$ is invariant under the scaling $S_z : H \mapsto z^{-1} H(\,\cdot\,, z\,\cdot\,)$, for any $z \neq 0$. More explicitly, if $w = \beta_1\omega + \beta_2 W_2 + \ldots + \beta_d W_d$, with $\beta_1$ nonzero, then $\mathcal{R}_\mu(h_w) = h_{w'}$, where $w' = \eta T^{-1} w$ and $\eta = \vartheta/\beta_1$. This shows e.g. that $h_\omega$ is a (trivial) fixed point of $\mathcal{R}_\mu$. For particular choices of the scaling parameter $\mu$, there are other trivial fixed points as well. Such fixed points can be found easily by restricting $\mathcal{R}_\mu$ to the space of Hamiltonians $H \in \mathcal{A}_\rho$ for which $\nabla_1 H = 0$, and thus $U_H = \mathrm{I}$. Theorem 1.2 and Lemma 1.3 can be combined as follows. (Additional symmetries of $\mathcal{R}_\mu$ are mentioned at the end of Sect. 3.)
Theorem 1.5. Let $0 < \rho < \sigma/\kappa$. If $\rho' < \rho$ is sufficiently close to $\rho$, and $\mu \in \mathbb{C}$ satisfies (1.9), then there exists an open neighborhood $\mathcal{H}$ of $\{H \in \mathbb{I}^+ \mathcal{A}'_s : H_{0,0} = 0,\ |H - h_\omega|'_s < \kappa\rho\}$ in $\mathcal{A}_\rho$, where $s = (\rho' + 8\rho)/9$, such that the transformation $\mathcal{R}_\mu$ is well defined, analytic, and compact, as a map from $\mathcal{H}$ to $\mathcal{A}_\rho$. The same holds for $\mathcal{R}^z_\mu = S_z \circ \mathcal{R}_\mu$, for all $z$ in some open neighborhood $Z \subset \mathbb{C}$ of the unit circle, and $S_z \circ \mathcal{R}_\mu = \mathcal{R}_\mu \circ S_z$, for all $z \in Z$.
This theorem is obtained by verifying conditions (b) and (c) of Theorem 1.2 for the given domain $\mathcal{H}$. Since the rest of this paper will deal with near-integrable Hamiltonians only, we have centered $\mathcal{H}$ at the trivial fixed point $h_\omega$, which simplifies the task of verifying (c). Condition (b) is satisfied by taking $b > 0$ small, i.e., we work with near-resonant Hamiltonians. This is not as restrictive or unusual as it may seem. The same can be achieved e.g. in the neighborhood of a Hamiltonian $F$ that is near-resonant modulo a canonical change of coordinates $U$, by considering the modified transformation $H \mapsto \mathcal{R}_\mu(H \circ U)$. An example of such a pair $(F, U)$ could be the approximate RG fixed point $F$ found in [1], and an approximation $U$ for the corresponding canonical transformation $U_F$. As mentioned earlier, $\mathcal{R}_\mu$ acts trivially on Hamiltonians that only depend on the action variable $p$.
This makes it possible to compute all eigenvalues and eigenvectors of
the derivative $D\mathcal{R}_\mu(h_\omega)$ of $\mathcal{R}_\mu$ at the fixed point $h_\omega$. The eigenvectors are precisely the monomials $(q, p) \mapsto (Wp)^\alpha$, and the point $0$ in the spectrum of $D\mathcal{R}_\mu(h_\omega)$ corresponds to functions with torus-average zero; see [15] for details. In what follows, $\rho$ is a fixed positive real number less than $\sigma/\kappa$, and the scaling parameter $\mu$ is assumed to be real, satisfying
$$0 < \mu < \frac{\vartheta_d^2}{\vartheta_1}. \tag{1.12}$$
In this case, $h_\omega$ is an isolated fixed point of $\mathcal{R}_\mu$, and $D\mathcal{R}_\mu(h_\omega)$ has precisely $d$ eigenvalues outside the open unit disk. One of them is $\lambda_0 = \vartheta/\mu$, associated with constant Hamiltonians, and the other $d - 1$ eigenvalue-eigenvector pairs are
$$\lambda_j = \vartheta_1/\vartheta_j, \qquad h_{W_j}(q, p) = W_j \cdot p, \qquad j = 2, \ldots, d. \tag{1.13}$$
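The eigenvalues $\lambda_j = \vartheta_1/\vartheta_j$ in (1.13) can be read off from the action $w \mapsto w' = \eta T^{-1} w$, $\eta = \vartheta/\beta_1$, of the renormalization transformation on the linear Hamiltonians $h_w$. The Python sketch below checks this numerically for $d = 2$. The golden-mean matrix $T$ with rows $(1, 1)$, $(1, 0)$ and the eigenvectors $W_1 = \omega = (1, \vartheta_1^{-1})$, $W_2 = (\vartheta_2, 1)$ are assumptions made for illustration (they are not from the text), and the function names are hypothetical.

```python
import math

# Hypothetical d = 2 golden-mean data (an assumption, not taken from the text).
THETA1 = (1 + math.sqrt(5)) / 2     # expanding eigenvalue: T W1 = THETA1 * W1
THETA2 = -1 / THETA1                # contracting eigenvalue: T W2 = THETA2 * W2
T_INV = ((0, 1), (1, -1))           # inverse of T = ((1, 1), (1, 0))
W1 = (1.0, 1 / THETA1)              # W1 = omega
W2 = (THETA2, 1.0)


def coefficients(w):
    """Solve w = beta1 * W1 + beta2 * W2 by Cramer's rule."""
    det = W1[0] * W2[1] - W2[0] * W1[1]
    beta1 = (w[0] * W2[1] - W2[0] * w[1]) / det
    beta2 = (W1[0] * w[1] - w[0] * W1[1]) / det
    return beta1, beta2


def renormalize(w):
    """Map w to w' = eta * T^{-1} w with eta = THETA1 / beta1."""
    beta1, _ = coefficients(w)
    eta = THETA1 / beta1
    return tuple(eta * sum(T_INV[i][j] * w[j] for j in range(2)) for i in range(2))


if __name__ == "__main__":
    # Start close to the fixed point omega and watch the W2-component grow.
    w = (W1[0] + 1e-6 * W2[0], W1[1] + 1e-6 * W2[1])
    for _ in range(5):
        w_next = renormalize(w)
        growth = coefficients(w_next)[1] / coefficients(w)[1]
        print(growth, "compared with theta_1 / theta_2 =", THETA1 / THETA2)
        w = w_next
```

In this example the normalization keeps $\beta_1 = 1$ after each step, so the only quantity that changes is the coefficient $\beta_2$, which is multiplied by $\vartheta_1/\vartheta_2$ per iteration.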
The local unstable manifold $W^u$ of $h_\omega$ corresponding to these $d$ eigenvalues is simply the $d$-dimensional affine subspace of $\mathcal{A}_\rho$ that is tangent to the expanding eigenspace at $h_\omega$. As usual in the theory of renormalization, the transformation $\mathcal{R}_\mu$ is merely a tool for constructing and analyzing certain objects that are of interest outside this theory. We start with a discussion of the local stable manifold $W^s$ of $\mathcal{R}_\mu$ at the fixed point $h_\omega$. By definition, if $H$ lies on $W^s$, then the sequence of Hamiltonians $H_n = \mathcal{R}^n_\mu(H)$ converges to $h_\omega$, as $n \to \infty$. This fact can be used to define a sequence of canonical transformations
$$\mathcal{V}_n(H) = V_0(H) \circ V_1(H) \circ \ldots \circ V_{n-1}(H), \qquad V_k(H) = T^k_\mu \circ U_{H_k} \circ T^{-k}_\mu. \tag{1.14}$$
Formally, $H \circ \mathcal{V}_n(H)$ approaches a constant multiple of $h_\omega$, as $n$ tends to infinity. But we cannot expect convergence on an open subset of phase space, unless $H$ is integrable. One of the things that can be extracted from the transformations $\mathcal{V}_n(H)$ is the limit of $\mathcal{V}_n(H) \circ \Upsilon$, as $n \to \infty$, where $\Upsilon(q) = (q, 0)$. This limit yields the function $\Gamma_H$ described below.
Theorem 1.6. Given two positive real numbers $r'$ and $r > r' + \rho$, there exists an open neighborhood $B$ of $h_\omega$ in $\mathcal{A}_r$, and for every $H \in B$ a complex number $c_H$ and an analytic function $\Gamma_H : D_{r',1} \to D_r$, such that the following holds. If $H \in W^s \cap B$ then
$$(\mathbb{J}\nabla H) \circ \Gamma_H = c_H\, \omega \cdot \nabla\Gamma_H \quad\text{on } D_{r',1}. \tag{1.15}$$
For $H = h_\omega$, the values of $c_H$ and $\Gamma_H$ are $1$ and $\Upsilon$, respectively. Furthermore, $H \mapsto c_H$ is an analytic function on $B$, and $H \mapsto \Gamma_H - \Upsilon$ is an analytic map from $B$ to some Banach space of analytic $2\pi$-periodic functions on $D_{r',1}$.
This theorem is essentially identical to [15, Theorem 1.7]. Thus, we will not repeat a proof here. Notice that (1.15) is the equation of an invariant $d$-torus for $H$, with rotation vector proportional to $\omega$; or more precisely (if $c_H$ is not real), the equation of an invariant $d$-torus for $c_H^{-1} H$, with rotation vector $\omega$. By construction, this torus lies on the energy surface $H^{-1}(0)$. Another property of this torus is that it is “centered at $p = 0$”, in the sense that the integral
$$K(\gamma) = \frac{1}{2\pi} \oint_\gamma p \cdot dq \tag{1.16}$$
vanishes, if $\gamma$ is any closed curve on the torus. This follows from the fact that $\Gamma_H$ is the limit of canonical transformations $\mathcal{V}_n(H)$, that leave $p \cdot dq$ invariant up to a differential of a one-form. The assumption $H \in W^s$, used to prove (1.15), replaces the non-degeneracy assumption in traditional KAM theory. To discuss the connection between these two conditions, we consider families $\beta \mapsto H_\beta$,
$$H_\beta = H \circ R_\beta, \qquad R_\beta(q, p) = (q, p + \beta), \tag{1.17}$$
generated from a fixed Hamiltonian $H$ by a translation in the action variable $p$. To be more specific, let $r > \rho$, and consider a Hamiltonian $h \in \mathcal{A}_r$ of the form
$$h(q, p) = \omega \cdot p + \tfrac{1}{2}\, p \cdot Mp + f(p), \qquad f(p) = O(|p|^3), \tag{1.18}$$
where $M$ is a real symmetric $d \times d$ matrix, such that the quadratic form $p \mapsto p \cdot Mp$ is non-degenerate, when restricted to the $d - 1$ dimensional contracting subspace of $T^*$. It is straightforward to check that if $h$ is sufficiently close to $h_\omega$, then the family $\beta \mapsto H_\beta$, for $H = h$, intersects the stable manifold $W^s$ transversally. Since this property persists under small perturbations, every Hamiltonian $H \in \mathcal{A}_r$ near $h$ has an invariant $c\omega$-torus on the energy surface $H^{-1}(0)$. The torus is given by $\Gamma' = R_{\beta'} \circ \Gamma_{H_{\beta'}}$, where $\beta'$ is the value of the parameter $\beta$ for which $H_\beta$ belongs to $W^s$.
Under suitable assumptions on $H$, the renormalization transformation $\mathcal{R}_\mu$ can also be used to construct periodic orbits for $H$, whose rotation vectors are “rational approximants” for $c\omega$. Recall that a curve $\gamma : \mathbb{R} \to D_\rho$ is a (lifted) orbit for $H$ if it satisfies the first order differential equation $\gamma' = (\mathbb{J}\nabla H) \circ \gamma$. A periodic orbit that closes (modulo $2\pi\mathbb{Z}^d$ in the variable $q$) after a time $2\pi\tau > 0$, but not earlier, can be parametrized as follows:
$$\gamma(t) = \big( tw + Q_0 + Q(t/\tau),\; P_0 + P(t/\tau) \big), \tag{1.19}$$
where $w = (\gamma(2\pi\tau) - \gamma(0))/(2\pi\tau)$, and where $Q$, $P$ are periodic functions with fundamental period $2\pi$ and average zero. The rotation vector $w$ belongs to $\mathbb{R}\mathbb{Z}^d$, that is, $w$ is a real scalar multiple of some vector in $\mathbb{Z}^d$. In what follows, given a nonzero vector $w \in \mathbb{R}\mathbb{Z}^d$, we denote by $\tau(w)$, or $\tau$ for short, the value of the smallest positive real number $t$ such that $tw$ belongs to $\mathbb{Z}^d$. In order to construct periodic orbits that depend continuously on $H$, near the Hamiltonian (1.18) that is invariant under translations $q \mapsto q + u$, we will need to limit the number of ways this symmetry can be broken by a perturbation. We shall do this by restricting to Hamiltonians $H(q, p)$ that are even functions of $q$, a property that is preserved under the transformation $\mathcal{R}_\mu$. The corresponding “even” subspace of $\mathcal{A}_r$, $r > 0$, will be denoted by $\mathcal{B}_r$.
We note that the approximate RG fixed point of [1] lies in such a space $\mathcal{B}_r$. Let $\rho'$ and $\varepsilon$ be fixed positive real numbers (to be chosen below).

Definition 1.7. For every nonzero vector $w \in \mathbb{R}\mathbb{Z}^d$, we define $\mathcal{H}(w)$ to be the set of all Hamiltonians $H \in \mathcal{B}_\rho$, such that a constant multiple of $H$ has a periodic orbit $\gamma$ with rotation vector $w$, on the surface of constant energy zero, with $K(\gamma) = 0$. The functions $Q$ and $P$ in (1.19) are assumed to be $2\pi$-periodic, and to have average zero. In addition, we require $P$ to be even, $Q$ odd, and $Q_0 = 0$. A subset $\Sigma(w)$ of $\mathcal{H}(w)$ is defined by requiring also that $P_0 = kV^*Vw$, with $|k| < \varepsilon^2/\tau$, and that $Q$, $P$ extend analytically to the strip $|\mathrm{Im}\, z| < \rho'/\tau$, satisfying the bounds $|VQ(z)| < \varepsilon$ and $\|WP(z)\| < \varepsilon$.
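[Figure 1 omitted. Labels in the figure: the family $\beta \mapsto H_\beta$ with the points $H_{\beta_0}, H_{\beta_1}, H_{\beta_2}, H_{\beta_3}$ and $H_{\beta'}$, the hypersurfaces $\Sigma_0, \dots, \Sigma_4$, the manifolds $W^u$ and $W^s$, and the Hamiltonians $h_w$ and $h_\omega$.]

Fig. 1. Accumulation of the hypersurfaces $\Sigma_n$ at the stable manifold $W^s$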
Theorem 1.8. There exist $\rho', \varepsilon > 0$ such that the following holds. If $w \in \mathbb{R}\mathbb{Z}^d$ is sufficiently close to $\omega$, and $h_w$ lies on $W^u$, then there exists an open neighborhood $B(w)$ of $h_w$ in $\mathcal{B}_\rho$, such that $\Sigma(w) \cap B(w)$ is an analytic manifold of codimension $d$ that intersects $W^u$ transversally at $h_w$.

Consider now such a rotation vector $w \in \mathbb{R}\mathbb{Z}^d$, and let $\Sigma_n(w) = \mathcal{R}_\mu^{-n}(\Sigma(w) \cap B(w))$, for all $n \ge 0$. By the $\lambda$-Lemma [23, 24], this defines a sequence of codimension $d$ manifolds that accumulate at $W^s$ as follows (see also Sect. 4). Assume that $\beta \mapsto H_\beta$ is an analytic $d$-parameter family of Hamiltonians in the domain of $\mathcal{R}_\mu$, that intersect $W^s$ transversally at $\beta = \beta'$. Then there exists an open neighborhood $B'$ of $\beta'$ in $\mathbb{C}^d$, such that for sufficiently large $n$, the set $\{\beta \in B' : H_\beta \in \Sigma_n(w)\}$ contains a single point, say $\beta_n$, and the ratio $|\beta_n - \beta'|/|\beta_{n+1} - \beta'|$ converges to $|\lambda_2|$, as $n \to \infty$. The reason for considering the manifolds $\Sigma_n(w)$ is the fact that $\Sigma_n(w) \subset \mathcal{H}(T^n w)$. Thus, by considering families of the type (1.17), we can construct infinite sequences of periodic orbits for a single Hamiltonian $H$. As part of the proof of the theorem below, and under its assumptions, we will show that
$$-\frac{1}{n}\ln|\beta_n - \beta'| = \ln|\lambda_2| + O\!\left(\frac{1}{n}\right). \tag{1.20}$$
Theorem 1.9. Let $r > \rho$ and $w \in \mathbb{R}\mathbb{Z}^d$ be given, $w \ne 0$. Let $h \in \mathcal{B}_r$ be a Hamiltonian of the form (1.18), with $M$ as described after (1.18). If $h$ is sufficiently close to $h_\omega$ in $\mathcal{B}_r$, then there exists an open neighborhood $B$ of $h$ in $\mathcal{B}_r$, and a positive integer $N$, such that for every Hamiltonian $H \in B$, and every $n \ge N$, some constant multiple of $H$ has a periodic orbit $\gamma_n$ with frequency vector $w_n = (\vartheta^{-1}T)^n w$, lying on the energy surface $H^{-1}(0)$, and satisfying
$$-\frac{1}{n}\ln|\gamma_n(0) - \Gamma'(0)| = \ln|\lambda_2| + O\!\left(\frac{1}{n}\right), \tag{1.21}$$
where $\Gamma'$ is the invariant torus described after (1.18).
Our reason for using $K(\gamma) = 0$ as one of the conditions in the definition of $\Sigma(w)$, is the resulting identity
$$\frac{1}{2\pi\tau(w_n)}\oint_{\gamma_n} p\cdot dq = w_n\cdot\beta_n, \tag{1.22}$$
valid for large $n$, if $w \in \mathbb{R}\mathbb{Z}^d$ is chosen sufficiently close to $\omega$. The normalized integral in this equation can be regarded as the coordinate of $\gamma_n$ in the direction of $w_n$, since it changes by an amount $w_n\cdot v$ under a translation $p \mapsto p + v$. It appears that in this direction, the orbits $\gamma_n$ accumulate faster than in some other directions: A straightforward calculation shows that if $\nabla_1 H = 0$, then the difference $w_n\cdot\beta_n - \omega\cdot\beta'$ is of the order $|\lambda_2|^{-2n}$. The same might be true more generally, as $K$ is the functional that appears in the variational equation for orbits on fixed energy surfaces.

2. Eliminating Non-Resonant Modes

Our goal in this section is to prove Theorem 1.2, concerning the existence of a canonical change of coordinates that eliminates the component of a Hamiltonian in the direction of a given “non-resonant” subspace of $\mathcal{A}_\rho$. We start by giving some basic estimates involving the evaluation, multiplication, differentiation, and composition of functions in the spaces $\mathcal{A}_\rho$.

Proposition 2.1. Let $\rho, \delta > 0$. Consider $f, g \in \mathcal{A}_\rho$, and $P, Q \in \mathcal{A}_\rho^d$, and $h \in \mathcal{A}_{\rho+\delta}$. Define $U(q,p) = \bigl(q + Q(q,p),\, P(q,p)\bigr)$, for all $(q,p)$ in $D_\rho$. Then
(a) $|f(q,p)| \le |f|_\rho$ for all $(q,p)$ in $D_\rho$.
(b) $fg \in \mathcal{A}_\rho$ and $|fg|_\rho \le |f|_\rho |g|_\rho$.
(c) $|h|_\rho + \delta|h|'_\rho \le |h|_{\rho+\delta}$.
(d) $h\circ U \in \mathcal{A}_\rho$ and $|h\circ U|_\rho \le |h|_{\rho+\delta}$, if $|VQ|_\rho \le \delta$ and $\|WP\|_\rho \le \rho + \delta$.

The proof of these estimates is straightforward and will be omitted. Define $\{H,\phi\} = \nabla_1 H\cdot\nabla_2\phi - \nabla_2 H\cdot\nabla_1\phi$.

Proposition 2.2. Let $r, \delta > 0$ and $0 < \varepsilon < \frac{1}{2}$. Denote by $B'$ the set of all functions $\phi \in \mathcal{A}'_r$ that satisfy $\|\phi\|'_{r+2\delta} < \varepsilon\delta$. Then for every function $\phi \in B'$, Eq. (1.3) has a unique solution $Q, P \in \mathcal{A}_r^d$ satisfying $\|WP\|_r \le \delta$. The corresponding canonical transformation $U_\phi : (q,p) \mapsto \bigl(q + Q(q,p),\, p + P(q,p)\bigr)$ is analytic from $D_r$ to $D_{r+2\delta}$. If $H$ is any function in $\mathcal{A}_{r+2\delta}$, then $H\circ U_\phi$ belongs to $\mathcal{A}_r$, and
$$\begin{aligned}
|H\circ U_\phi|_r &\le |H|_{r+2\delta},\\
|H\circ U_\phi - H|_r &\le \tfrac{2}{3}\,\varepsilon\,|H|_{r+2\delta},\\
|H\circ U_\phi - H - \{H,\phi\}|_r &\le \tfrac{1}{3}\,\varepsilon^2\,|H|_{r+2\delta}.
\end{aligned} \tag{2.1}$$
Furthermore, the maps $\phi \mapsto (Q,P)$ and $\phi \mapsto H\circ U_\phi$ are analytic on $B'$.
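Proof. Denote by $B$ the set of all $P \in \mathcal{A}_r^d$ satisfying $\|WP\|_r \le \delta$. Let $\phi \in B'$, and define a map $F : B \to \mathcal{A}_r^d$ by setting $F(P) = -(\nabla_1\phi)\circ G$, where $G(q,p) = (q, p + P(q,p))$. If $P \in B$ then, by using Proposition 2.1, we obtain
$$\begin{aligned}
\|W\,DF(P)h\|_r &= \max_i \bigl|[h\cdot\nabla_2 (W\nabla_1\phi)_i]\circ G\bigr|_r\\
&\le \max_i \|Wh\|_r\, \bigl|[V\nabla_2 (W\nabla_1\phi)_i]\circ G\bigr|_r\\
&\le \max_i \|Wh\|_r\, \bigl|V\nabla_2 (W\nabla_1\phi)_i\bigr|_{r+\delta}\\
&\le \max_i \|Wh\|_r\, \delta^{-1} \bigl|(W\nabla_1\phi)_i\bigr|_{r+2\delta}\\
&= \|Wh\|_r\, \delta^{-1}\, \|W\nabla_1\phi\|_{r+2\delta} \le \tfrac{1}{2}\|Wh\|_r,
\end{aligned} \tag{2.2}$$
for all $h \in \mathcal{A}_r^d$. This, together with the bound $\|WF(0)\|_r \le \delta/2$, shows that $F$ is a contraction on $B$, for the norm $\|W.\|_r$. Thus, Eq. (1.3) has a unique solution $(Q,P) \in \mathcal{A}_r^d \times B$. By Proposition 2.1, this solution satisfies
$$\begin{aligned}
|VQ|_r &= |(V\nabla_2\phi)\circ G|_r \le |V\nabla_2\phi|_{r+\delta} < \varepsilon\delta,\\
\|WP\|_r &= \|(W\nabla_1\phi)\circ G\|_r \le \|W\nabla_1\phi\|_{r+\delta} < \varepsilon\delta.
\end{aligned} \tag{2.3}$$
Consider now $H \in \mathcal{A}_{r+2\delta}$. In order to prove (2.1), let
$$f(z)(q,p) = H\bigl(q + z\nabla_2\phi(q, p + zP(q,p)),\; p - z\nabla_1\phi(q, p + zP(q,p))\bigr). \tag{2.4}$$
From the bounds (2.3) and Proposition 2.1, it follows that this equation defines a function $f$, from an open neighborhood of the disk $|z| \le 2/\varepsilon$ to $\mathcal{A}_r$, satisfying $|f(z)|_r \le |H|_{r+2\delta}$. By using the representation
$$f(s) = f(0) + s f'(0) + \frac{1}{2\pi i}\oint_{|z|=2/\varepsilon} \frac{f(z)}{z-s}\Bigl(\frac{s}{z}\Bigr)^{2} dz, \tag{2.5}$$
we obtain the bound
$$|H\circ U_\phi - H - \{H,\phi\}|_r = |f(1) - f(0) - f'(0)|_r \le \biggl|\frac{1}{2\pi i}\oint_{|z|=2/\varepsilon} \frac{f(z)\,dz}{(z-1)z^2}\biggr|_r \le \tfrac{1}{3}\,\varepsilon^2\,|H|_{r+2\delta}. \tag{2.6}$$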
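The representation (2.5) and the constant $\frac13\varepsilon^2$ in (2.6) can be checked numerically in the scalar case. The sketch below is an illustration only: it replaces the $\mathcal{A}_r$-valued function $f$ by an ordinary complex function, and the names `taylor_remainder_via_contour` and `remainder_factor` are hypothetical.

```python
import cmath


def taylor_remainder_via_contour(f, s, radius, n_points=256):
    """Trapezoidal approximation of
    (1/(2*pi*i)) * integral over |z| = radius of f(z)/(z - s) * (s/z)**2 dz.

    For f analytic on the closed disk and |s| < radius, the exact value of
    this integral is f(s) - f(0) - s*f'(0), as in (2.5).
    """
    total = 0j
    for k in range(n_points):
        z = radius * cmath.exp(2j * cmath.pi * k / n_points)
        # dz = i*z*dtheta, so (1/(2*pi*i)) * dz contributes z / n_points.
        total += f(z) / (z - s) * (s / z) ** 2 * z
    return total / n_points


def remainder_factor(eps):
    """Crude bound on the integral at s = 1, divided by sup|f| on |z| = 2/eps.

    Contour length/(2*pi) is the radius R, and |(z - 1) z**2| >= (R - 1) R**2.
    """
    radius = 2.0 / eps
    return radius / ((radius - 1.0) * radius ** 2)
```

With `f = cmath.exp`, `s = 1` and `radius = 4`, the first function should reproduce $e - 2$ up to rounding, and `remainder_factor(eps)` equals $\varepsilon^2/(2(2-\varepsilon))$, which stays below $\varepsilon^2/3$ for $0 < \varepsilon < \frac12$.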
The first two inequalities in (2.1) are proved similarly. The analyticity of the maps that assign to $\phi \in B'$ the functions $P, Q \in \mathcal{A}_r^d$ and $H\circ U_\phi \in \mathcal{A}_r$, follows by the implicit function theorem and the chain rule. □

Proof of Theorem 1.2. We start with an informal description of the proof. Consider $H_0 = H \in \mathcal{H}$. Our goal is to define functions $\phi_0, \phi_1, \phi_2, \dots$ such that if we set
$$G_n = U_{\phi_0}\circ U_{\phi_1}\circ\dots\circ U_{\phi_{n-1}} - I, \qquad H_n = H\circ(I + G_n), \qquad n = 1, 2, \dots, \tag{2.7}$$
then $G_n \to G_\infty$ and $H_n \to H_\infty = H\circ(I + G_\infty)$ as $n \to \infty$, with $\mathbb{I}_- H_\infty = 0$. Starting with $n = 0$, we define $\phi_n = \mathbb{I}_-\phi_n$ to be the solution of the equation
$$\mathbb{I}_-\{H_n, \phi_n\} = -\mathbb{I}_- H_n. \tag{2.8}$$
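If $\mathbb{I}_- H_n$ is small, say of the order $\varepsilon_n$, then the same should be true for $\phi_n$. By using that $H_{n+1} = H_n\circ U_{\phi_n}$, and thus
$$\mathbb{I}_- H_{n+1} = \mathbb{I}_-\bigl[H_n\circ U_{\phi_n} - H_n - \{H_n, \phi_n\}\bigr], \tag{2.9}$$
we see from Eq. (2.1) that $\mathbb{I}_- H_{n+1}$ is of the order $\varepsilon_{n+1} \approx \varepsilon_n^2$. Now the process is repeated for $n = 1, 2, \dots$.

In order to be more precise, assume now that $H$ and $\mathbb{I}_-$ satisfy the assumptions of Theorem 1.2. Let $t_0 = (1 - \rho'/\rho)/9$ and $\rho_0 = \rho$. Define
$$t_n = (2/3)^n t_0, \qquad \delta_n = t_n\rho, \qquad \rho_n = \rho' + 9\delta_n, \qquad n = 1, 2, \dots, \tag{2.10}$$
so that $\rho_{n+1} = \rho_n - 3\delta_n$, for all $n \ge 0$. Given $H \in \mathcal{H}$, we intend to verify inductively that for all $m > 0$, the function $H_m$ belongs to $\mathcal{A}_{\rho_m}$ and satisfies the bounds
$$\begin{aligned}
|H_m|_{\rho_m} &< c,\\
|H_m - H_{m-1}|_{\rho_m} &< \frac{c}{4}\, t_0 b\, (2/3)^{2(3/2)^m},\\
|\mathbb{I}_- H_m|_{\rho_m} &< \frac{ac}{3}\, (t_0 b)^2\, (2/3)^{4(3/2)^m}.
\end{aligned} \tag{2.11}$$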
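The bookkeeping in (2.10) is easy to confirm with exact arithmetic. The following sketch (the helper name `radii` and the sample values of $\rho$, $\rho'$ are illustrative assumptions) computes $\delta_n$ and $\rho_n$ and lets one check that $\rho_0 = \rho$, that $\rho_{n+1} = \rho_n - 3\delta_n$, and that the domains shrink toward $\rho'$ without reaching it.

```python
from fractions import Fraction


def radii(rho, rho_prime, n_max):
    """Return the lists (delta_n, rho_n), n = 0..n_max, defined in (2.10)."""
    rho, rho_prime = Fraction(rho), Fraction(rho_prime)
    t0 = (1 - rho_prime / rho) / 9
    # delta_n = t_n * rho with t_n = (2/3)^n * t_0
    deltas = [Fraction(2, 3) ** n * t0 * rho for n in range(n_max + 1)]
    # rho_n = rho' + 9 * delta_n
    rhos = [rho_prime + 9 * d for d in deltas]
    return deltas, rhos
```

Each step of the iteration thus gives up a strip of width $3\delta_n$, and the total loss $\sum_n 3\delta_n = 9\delta_0 = \rho - \rho'$ is exactly the gap between the initial and the final domain.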
By assumption, these bounds hold for $m = 0$, if we set $H_{-1} = 0$. Let now $n \ge 0$ be fixed, and assume that (2.11) has been verified for all $m \le n$. We start by showing that Eq. (2.8) can be solved. By using Proposition 2.1.c we obtain $|H_n - H|'_{\rho_n - \delta_n} \le$
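$\kappa$ such that $\kappa\rho^2 < \kappa'\rho\rho' < \sigma\rho'$, and such that the first inequality in (1.9) holds, if $\kappa$ is replaced by $\kappa'$. Let $s = (\rho' + 8\rho)/9$, and define $B = \{H \in \mathbb{I}_+\mathcal{A}'_s : H_{0,0} = 0,\ |H - h_\omega|'_s < \kappa\rho\}$. If $\rho' \le r \le s$, then for every function $\phi \in \mathbb{I}_-\mathcal{A}_r$ we have
$$\begin{aligned}
|\{h_\omega, \phi\}|_r = |\omega\cdot\nabla_1\phi|_r &= \sum_{(\nu,\alpha)\in I^-} |\phi_{\nu,\alpha}|\,|\omega\cdot\nu|\, r^{|\alpha|} e^{r\|W\nu\|}\\
&\ge \sigma \sum_{(\nu,\alpha)\in I^-} |\phi_{\nu,\alpha}|\, r^{|\alpha|}\, \|W\nu\|\, e^{r\|W\nu\|} + \kappa' \sum_{(\nu,\alpha)\in I^-} |\phi_{\nu,\alpha}|\,|\alpha|\, r^{|\alpha|} e^{r\|W\nu\|}\\
&\ge \sigma\|W\nabla_1\phi\|_r + \kappa' r\,|V\nabla_2\phi|_r \ge \kappa' r\,\|\phi\|'_r,
\end{aligned} \tag{3.5}$$
which yields the bound
$$\begin{aligned}
|\{H, \phi\}|_r &\ge |\{h_\omega, \phi\}|_r - |\{H - h_\omega, \phi\}|_r\\
&\ge \bigl(\kappa' r - |H - h_\omega|'_r\bigr)\|\phi\|'_r \ge (\kappa'\rho' - \kappa\rho)\|\phi\|'_r,
\end{aligned} \tag{3.6}$$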
for all H ∈ B. This shows that I− satisfies a non-resonance condition (with respect to the set B), as defined in Theorem 1.2, for some constants a, b, c > 0. The same resonance condition remains satisfied if we replace B by the set H of all Hamiltonians H ∈ Aρ
that lie within a small distance $\varepsilon > 0$ of $B$. If necessary, we decrease $\varepsilon > 0$ such that $(H\circ U_H)_{0,(1,0,\dots,0)}$ is bounded away from zero, for all $H \in \mathcal{H}$. This is possible since $|H_{0,(1,0,\dots,0)}| > 1 - \sigma > 0$, for all $H \in B$. Now Theorem 1.2 and Lemma 1.3 imply that (1.10) defines a compact analytic map $\mathcal{R}_\mu$ from $\mathcal{H}$ to $\mathcal{A}_\rho$, provided that $\mu$ satisfies the second inequality in (1.9). By using the analyticity improving property of the map described in Lemma 1.3, we obtain the same result for $\mathcal{S}_z\circ\mathcal{R}_\mu$ and $\mathcal{R}_\mu\circ\mathcal{S}_z$, uniformly in $z$, for all $z$ in some open neighborhood of the unit circle in $\mathbb{C}$. In order to prove that $\mathcal{S}_z\circ\mathcal{R}_\mu = \mathcal{R}_\mu\circ\mathcal{S}_z$, as claimed, it suffices to consider $|z| = 1$. In this case, it is straightforward to check that $\mathcal{S}_z$ “commutes through” each of the steps used in the proof of Theorem 1.2, yielding
$$U_{\mathcal{S}_z H}\circ S_z = S_z\circ U_H, \tag{3.7}$$
where $S_z(q,p) = (q, zp)$. This can be done by using the fact that $\mathcal{S}_z\mathbb{I}_- = \mathbb{I}_-\mathcal{S}_z$ and $\{\mathcal{S}_z H, \mathcal{S}_z\phi\} = \mathcal{S}_z\{H,\phi\}$, which implies that $U_{\mathcal{S}_z\phi}\circ S_z = S_z\circ U_\phi$. The details of this computation are left to the reader; see also [15]. □

Remarks.
• Due to the symmetry $\mathcal{S}_z\circ\mathcal{R}_\mu = \mathcal{R}_\mu\circ\mathcal{S}_z$, the transformation $\mathcal{R}_\mu$ maps the scaling orbit $z \mapsto \mathcal{S}_z H$ of a Hamiltonian $H$ to the corresponding scaling orbit for $\mathcal{R}_\mu(H)$. Thus, in situations where these orbits are non-degenerate, a normalization condition can be used to pick an arbitrary representative from each of them. This leads naturally to a transformation (1.10) where the scaling $\mu$ is chosen to depend on $H$, in such a way that the given normalization is preserved.
• A relation analogous to (3.7) holds if the scaling transformations $S_z$ and $\mathcal{S}_z$ are replaced by translations of the angles, given by $J_\gamma(q,p) = (q - \gamma, p)$ and $\mathcal{J}_\gamma H = H\circ J_\gamma$, respectively. The corresponding symmetry $\mathcal{R}_\mu\circ\mathcal{J}_\gamma = \mathcal{J}_{T^{-1}\gamma}\circ\mathcal{R}_\mu$ can be related to observations for certain periodic orbits [1].
• As was mentioned in the introduction, if $H$ is an even function of the angle variable $q$, then the same is true for $\mathcal{R}_\mu(H)$. This can be seen easily from our construction of $U_H$ in the proof of Theorem 1.2.
• Another invariance property of $\mathcal{R}_\mu$ is the following. Given $v \in \mathbb{C}^d$, denote by $\mathcal{A}(v)$ the set of Hamiltonians $H$ (in the appropriate domain) for which $v\cdot\nabla_2 H$ is constant. Our proof of Theorem 1.2 shows that $H\circ U_H$ belongs to $\mathcal{A}(v)$ whenever $H$ does. Thus, $\mathcal{R}_\mu$ maps $\mathcal{A}(v)$ to $\mathcal{A}((T^*)^{-1}v)$. The same holds if we define $\mathcal{A}(v)$ by the condition $v\cdot\nabla_2 H = 0$.
• The symmetry properties mentioned above, either separately or combined, can be used e.g. to restrict the search for nontrivial RG fixed points (or invariant families) to appropriate invariant subspaces. Such a fixed point (or family) may be relevant only for Hamiltonians that share all of its symmetries. But as the trivial fixed point $h_\omega$ and other examples show, the “domain of relevance” (universality class) may actually be larger; see also [1, 6].

4. Periodic Orbits

We begin by showing that under a certain condition on $w \in \mathbb{R}^d$, every Hamiltonian $H$ near $h_w$ determines a “counterterm” $\Phi(H)$, within a $d$-dimensional subspace of $\mathcal{B}_\rho$ that is roughly parallel to $W^u$, such that a constant multiple of $H + \Phi(H)$ has a periodic
orbit of the type described in Definition 1.7. The subsequent proof of Theorem 1.8 is based on identifying $\Sigma(w)$ locally with $\Phi^{-1}(0)$. A proof of Theorem 1.9 is given at the end of this section, after some results on $d$-parameter families.

Given any $r > 0$, define $\mathrm{A}_r$ to be the Banach space of all analytic functions $g$ on the strip $|\mathrm{Im}\, z| < r$, with finite norm $|g|_r = \sum_n |g_n| \exp(r|n|)$, where $g_0, g_{\pm1}, g_{\pm2}, \dots$ are the Fourier coefficients of $g$. In other words, $\mathrm{A}_r$ is the one-variable analogue of $\mathcal{A}_r$. On the product space $\mathrm{A}_r^d$ we use norms $|.|_r$ and $\|.\|_r$ analogous to those introduced in Definition 1.1.

Theorem 4.1. Let $\delta, r, \rho$ be positive real numbers satisfying $\delta \le r \le 1$ and $r + \delta \le \rho/2$. Let $w$ be a nonzero vector in $\mathbb{R}^d$ such that $\tau w \in \mathbb{Z}^d$, $\tau \ge 4$, and assume that $|V(w - \omega)| \le 2^{-4}$. Define $b = 2^{-5}\delta^2/\tau$. Then for every Hamiltonian $H$ in $B = \{H \in \mathcal{B}_\rho : |H - h_w|_\rho < b\}$, there exist two complex numbers $\xi$ and $E$, and a vector $u$ perpendicular to $v = V^*Vw$, such that the Hamiltonian $\xi(H + h_u) + E$ has a periodic orbit (1.19) at energy zero, with $K(\gamma) = 0$, $Q_0 = 0$, $P_0 = kv$ for some $k \in \mathbb{C}$, and with $Q$ (odd) and $P$ (even) belonging to $\mathrm{A}_r^d$. The quantities $\xi, u, E, k, Q, P$ satisfy the bounds
$$\begin{aligned}
|Vu| &\le \delta'/\delta, & |\xi - 1| &\le \delta'/\delta, & |E| &\le \delta',\\
|k| &\le \delta', & |VQ|_{r/\tau} &\le \tau\delta'/\delta, & \|WP\|_{r/\tau} &\le \tau\delta'/\delta,
\end{aligned} \tag{4.1}$$
with $\delta' = 4|H - h_w|_\rho$, and they are uniquely determined if we require that (4.1) holds for $\delta' = 4b$. Furthermore, the dependence of $\xi, u, E, k, Q, P$ on the Hamiltonian $H \in B$ is analytic.

Proof. Given $\xi \in \mathbb{C}$, and $u \in \mathbb{C}^d$ satisfying $u\cdot v = 0$, we define $x = (\xi - 1)w + \xi u$. If $\xi$ is bounded away from zero, as is the case for the parameter values considered here, the map $(\xi, u) \mapsto x$ is one-to-one. Thus, we shall use $x$ as a parameter and consider $\xi, u$ to be functions of $x$. Let now $H = h_w + h$ be a Hamiltonian in $\mathcal{B}_\rho$, with $|h|_\rho \le 2b$, and consider the family $(x, E) \mapsto G$ defined by
$$G = \xi(H + h_u) + E = h_w + \xi h + h_x + E. \tag{4.2}$$
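The second equality in (4.2) can be checked in one line. The computation below assumes, as the notation elsewhere in the text suggests, that $h_y$ denotes the linear Hamiltonian $h_y(q,p) = y\cdot p$, so that $y \mapsto h_y$ is linear:
$$\xi(H + h_u) = \xi h_w + \xi h + \xi h_u = h_w + \xi h + \bigl[(\xi - 1)h_w + \xi h_u\bigr] = h_w + \xi h + h_{(\xi-1)w + \xi u} = h_w + \xi h + h_x.$$
In particular, the perturbation of $h_w$ consists of the term $\xi h$ and the linear term $h_x$, whose gradient is $\nabla_2 h_x = x$.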
Let $r' = r/\tau$, and denote by $D$ the derivative operator on $\mathrm{A}_{r'}^d$. Then the equation for a periodic orbit (1.19) of $G$, with $Q_0 = 0$ and $P_0 = kv$, can be written as
$$\begin{aligned}
DQ &= \tau\,\nabla_2(\xi h + h_x)(\zeta + Q, kv + P),\\
DP &= -\tau\,\nabla_1(\xi h + h_x)(\zeta + Q, kv + P),
\end{aligned} \tag{4.3}$$
where $g(\zeta + Q, kv + P)$ stands for the composition of a given function $g$ with the map $t' \mapsto (t'\tau w + Q(t'), kv + P(t'))$. The variable $t'$ used here is the rescaled time $t' = t/\tau$. Denote by $\mathbb{E}f$ the average of a periodic function $f$. It is convenient to split the first equation in (4.3) into two equations, one for the average, and one for the remaining zero-average part. The result can be written in the form $x = \tilde{x}$ and $Q = \tilde{Q}$, where
$$\begin{aligned}
\tilde{x} &= -\xi\,\mathbb{E}(\nabla_2 h)(\zeta + Q, kv + P),\\
\tilde{Q} &= \tau\xi\, D^{-1}(I - \mathbb{E})(\nabla_2 h)(\zeta + Q, kv + P).
\end{aligned} \tag{4.4}$$
Here, $D^{-1}$ denotes the antiderivative operator on $(I - \mathbb{E})\mathrm{A}_{r'}^d$. Similarly, the second equation in (4.3) is equivalent to $P = \tilde{P}$, with
$$\tilde{P} = -\tau\xi\, D^{-1}(\nabla_1 h)(\zeta + Q, kv + P). \tag{4.5}$$
Notice that in this equation, the function to the right of $D^{-1}$ has automatically a zero average, due to the parity conditions on $Q$, $P$, and $h$. The condition that the integral (1.16) along the periodic orbit $\gamma$ be zero, will be written as $k = \tilde{k}$, where (omitting the argument $t'$)
$$\begin{aligned}
\tilde{k} &= k - \frac{1}{2\pi\tau c}\int_0^{2\pi} (kv + P)\cdot(\tau w + DQ)\, dt' = -\frac{1}{2\pi\tau c}\int_0^{2\pi} P\cdot DQ\, dt'\\
&= -\frac{\xi}{2\pi c}\int_0^{2\pi} P\cdot(\nabla_2 h)(\zeta + Q, kv + P)\, dt',
\end{aligned} \tag{4.6}$$
with $c = v\cdot w$. In addition, we impose the condition that $G = 0$ on the orbit $\gamma$, or equivalently (if $k = \tilde{k}$), that the integral of $G\,dt - p\cdot dq$ along $\gamma$ be zero. This condition can be written in the form $E = \tilde{E}$, where (omitting the argument $t'$)
$$\begin{aligned}
\tilde{E} &= E - \frac{1}{2\pi}\int_0^{2\pi} G(\zeta + Q, kv + P)\, dt' + \frac{1}{2\pi\tau}\int_0^{2\pi} (kv + P)\cdot(\tau w + DQ)\, dt'\\
&= -\frac{\xi}{2\pi}\int_0^{2\pi} \Bigl[h(\zeta + Q, kv + P) - P\cdot(\nabla_2 h)(\zeta + Q, kv + P)\Bigr] dt' - kv\cdot x.
\end{aligned} \tag{4.7}$$
The problem of finding a pair $(x, E)$, such that the Hamiltonian (4.2) has an orbit $\gamma$ with the desired properties, has now been reduced to a fixed point problem for the map $F : (x, E, k, Q, P) \mapsto (\tilde{x}, \tilde{E}, \tilde{k}, \tilde{Q}, \tilde{P})$ defined by Eqs. (4.4) ... (4.7). Denote by $\mathcal{X}$ the Banach space of all quintuples $X = (x, E, k, Q, P)$ in $\mathbb{C}^d \times \mathbb{C} \times \mathbb{C} \times \mathrm{A}_{r'}^d \times \mathrm{A}_{r'}^d$, whose components $Q$ and $P$ have average zero, equipped with the norm
$$\|X\| = \max\bigl\{\tau|Vx|,\ \delta^{-1}\tau|E|,\ \delta^{-1}\tau|k|,\ |VQ|_{r'},\ \|WP\|_{r'}\bigr\}. \tag{4.8}$$
We note that the conditions (4.1), with $\delta' = 4b$, imply that $\|X\| < \delta/2$. To see this, one first proves e.g. that $|Vx| \le \frac{17}{16}|\xi - 1| + |\xi||Vu|$. Let us now assume that $\|X\| \le \delta$. Then, by using part (b) of Proposition 2.1, we obtain
$$|g(\zeta + Q, kv + P)|_{r'} \le |g|_{\rho-\delta}, \qquad g \in \mathcal{A}_{\rho-\delta}, \tag{4.9}$$
and this bound can be used to show that
$$\|F(X)\| \le \frac{2\tau}{\delta}|h|_\rho + \frac{2}{\tau}\|X\|^2 < \frac{\delta}{4}. \tag{4.10}$$
The proof of the first inequality in (4.10) is mildly tedious, but straightforward. This estimate, together with Cauchy’s formula, can also be used to show that the derivative of $F$, at every point in the ball $\|X\| \le \delta/2$, is of norm less than $\frac{1}{2}$. Thus, by the contraction
mapping principle, $F$ has a unique fixed point $X_0$ in this ball. From the bound (4.10), it follows that
$$\|X_0\| \le \frac{16\tau}{7\delta}|h|_\rho. \tag{4.11}$$
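One way to see how (4.11) can follow from (4.10) is sketched here; it uses only $X_0 = F(X_0)$, the consequence $\|X_0\| < \delta/4$ of (4.10), and the assumptions $\tau \ge 4$ and $\delta \le 1$ of Theorem 4.1:
$$\|X_0\| \le \frac{2\tau}{\delta}|h|_\rho + \frac{2}{\tau}\|X_0\|^2 \le \frac{2\tau}{\delta}|h|_\rho + \frac{\delta}{2\tau}\|X_0\| \le \frac{2\tau}{\delta}|h|_\rho + \frac{1}{8}\|X_0\|,$$
so that $\frac{7}{8}\|X_0\| \le \frac{2\tau}{\delta}|h|_\rho$, which is the bound (4.11).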
By combining this with the two estimates $|\xi - 1| \le \frac{8}{7}|Vx|$ and $|Vu| \le \frac{4}{3}|Vx|$, we obtain the inequalities (4.1) with $\delta' = 4|h|_\rho$. The analytic dependence of $\xi, u, E, k, Q, P$ on $H \in B$ follows from the uniform convergence of $F^n(0) \to X_0$ as $n \to \infty$. □

Proof of Theorem 1.8. We will use the notation and assumptions made in Theorem 4.1. In addition, we assume that $h_w$ lies on $W^u$, and that $\delta > 0$ is sufficiently small such that $B$ is contained in the domain of $\mathcal{R}_\mu$. Denote by $X$ the subspace of $\mathcal{B}_\rho$, consisting of all Hamiltonians of the form $ch_w + f$, with $c \in \mathbb{C}$, and with $f$ a function in $\mathcal{B}_\rho$, whose Fourier–Taylor coefficients $f_{\nu,\alpha}$ are zero whenever $\nu = 0$ and $|\alpha| < 2$. In addition, let $Y$ be the space of all Hamiltonians of the form $h_y + C$, with $C$ a constant function, $y \in \mathbb{C}^d$, and $y\cdot v = 0$. Then we can identify $\mathcal{B}_\rho$ with $X \oplus Y$. Define a map $\phi$ from $X \cap B$ to $Y$, by setting $\phi(H) = h_u + E/\xi$, with $H \mapsto (\xi, u, E)$ as described in Theorem 4.1. The graph $\Sigma'$ of this map $\phi$ is clearly an analytic manifold of codimension $d$ in $\mathcal{B}_\rho$. In addition, $\Sigma'$ intersects $W^u$ transversally at $h_w$. This follows essentially from the fact that $\phi(ch_w) = 0$: Due to the identity $D\phi(h_w)h_w = 0$, it suffices to verify the transversality property in the $(d+1)$-dimensional subspace of all Hamiltonians of the form $(q,p) \mapsto y\cdot p + C$, where it is trivial.

In order to compare $\Sigma'$ with the set $\Sigma(w)$ defined in Definition 1.7, consider two different choices $(\delta_1, r_1)$ and $(\delta_2, r_2)$ for the parameters $(\delta, r)$ in Theorem 4.1, with $\delta_1 < \delta_2 < r_2 < r_1$. Denote by $B_1$ and $B_2$ the corresponding balls $B$, and by $\Sigma'_1$ and $\Sigma'_2$ the corresponding manifolds $\Sigma'$. Then, by the uniqueness part of Theorem 4.1, the intersection of $\Sigma'_2$ with $B_1$ is equal to $\Sigma'_1$. Furthermore, if we set $\rho' = r_1$ and choose $\varepsilon > 0$ sufficiently small, then $\Sigma(w) \cap B_1$ is contained in $\Sigma'_2$. And by choosing $\delta_1 > 0$ sufficiently small, we also have $\Sigma'_1 \subset \Sigma(w)$. Thus, $\Sigma(w)$ agrees with $\Sigma'_1$ in the ball $B_1$.
We note that the same choice of parameters works for all rotation vectors $w \in \mathbb{R}\mathbb{Z}^d$ that are sufficiently close to $\omega$. (This fact is not used later on.) The reason for this is that our bounds depend on $w$ only through the constant $\tau = \tau(w)$. □

Our proof of Theorem 1.9 is based on the graph transform method; see e.g. [14]. We start by introducing some notation. Denote by $\mathcal{X}$ and $\mathcal{Y}$ the stable and unstable subspaces, respectively, of the linearized RG transformation $D\mathcal{R}_\mu(h_\omega)$, restricted to $\mathcal{B}_\rho$. Denote by $\psi$ the (analytic) function, defined on an open neighborhood of $h_\omega$ in $\mathcal{X}$, with values in $\mathcal{Y}$, whose graph is the local stable manifold $W^s$ of $\mathcal{R}_\mu$ at $h_\omega$. As was mentioned in the introduction, $\mathcal{Y}$ is spanned by the $d - 1$ functions listed in (1.13), together with the constant function. The canonical projections onto $\mathcal{X}$ and $\mathcal{Y}$ will be denoted by $P_s$ and $P_u$, respectively. Due to our choice of norm on $\mathcal{B}_\rho$, both $P_s$ and $P_u$ have operator norm 1. Consider the transformation $\mathcal{N}_\mu$,
$$\mathcal{N}_\mu = \Psi^{-1}\circ\mathcal{R}_\mu\circ\Psi, \qquad \Psi = I + \psi\circ P_s, \tag{4.12}$$
defined on a neighborhood of $h_\omega$ in $\mathcal{B}_\rho$. Notice that $\mathcal{N}_\mu$ and $\mathcal{R}_\mu$ have the same derivatives at the fixed point $h_\omega$, and the same local unstable manifolds. But the local stable manifold of $\mathcal{N}_\mu$ is trivial; that is, it agrees with $\mathcal{X}$ near $h_\omega$. The largest (in modulus) contracting eigenvalue of $D\mathcal{N}_\mu(h_\omega)$ is $\lambda = \mu\vartheta_1\vartheta_d^{-2}$. Let $\theta$ be some fixed real number larger than
$\vartheta_1|\vartheta_d|^{-2}$. In the remaining part of this section, we restrict the possible choices of $\mu$ by imposing the condition $\theta\mu < |\lambda_2|^{-1}$. Since $D\mathcal{N}_\mu(h_\omega)$ is compact, we can choose a norm $\|.\|$ on $\mathcal{X}$ that is equivalent to $\|.\|_\rho$, but for which the restriction of $D\mathcal{N}_\mu(h_\omega)$ to $\mathcal{X}$ has (operator) norm less than $\theta\mu$. Assume now that such a norm has been chosen. The extension to $\mathcal{B}_\rho = \mathcal{X} \oplus \mathcal{Y}$ is defined by setting $\|x + y\| = \|x\| + \|y\|_\rho$, for every $x \in \mathcal{X}$ and $y \in \mathcal{Y}$. Given any $\delta > 0$, let $D_\delta = \{y \in \mathcal{Y} : \|y\| < \delta\}$, and define $\mathcal{F}_\delta$ to be the Banach space of all analytic $d$-parameter families $F : D_\delta \to \mathcal{B}_\rho$, that extend continuously to the boundary of $D_\delta$ and satisfy $P_u F(0) = 0$, equipped with the norm
$$\|F\|_\delta = \sup_{y \in D_\delta} \|F(y)\|. \tag{4.13}$$
Of particular interest is the family $F^*$, defined by $F^*(y) = h_\omega + y$, as it parametrizes the local unstable manifold of $\mathcal{R}_\mu$.

Proposition 4.2. If $\delta > 0$ is sufficiently small, then the equation
$$\mathcal{M}_\mu(F) = \mathcal{N}_\mu\circ F\circ Y_F, \qquad Y_F = (P_u\circ\mathcal{N}_\mu\circ F)^{-1}, \tag{4.14}$$
defines an analytic contraction mapping $\mathcal{M}_\mu$, on some open neighborhood $B$ of $F^*$ in $\mathcal{F}_\delta$, with fixed point $F^*$, and contraction rate less than $\theta\mu$.

The proof of this proposition is straightforward: $P_u\circ\mathcal{N}_\mu\circ F^*$ agrees with the restriction of $D\mathcal{R}_\mu$ to $\mathcal{Y}$, so that by the inverse function theorem, $(F, y) \mapsto Y_F(y)$ is well defined and analytic near $(F^*, 0)$ in $\mathcal{F}_r \times \mathcal{Y}$; and the asserted contraction property of $\mathcal{M}_\mu$ follows from the identity
$$D\mathcal{M}_\mu(F^*)f(y) = P_s\, D\mathcal{N}_\mu\bigl(F^*(Y_{F^*}(y))\bigr) f(Y_{F^*}(y)), \qquad f \in \mathcal{F}_r,\ y \in D_r, \tag{4.15}$$
which can be verified by an explicit computation. Notice that $Y_{F^*}$ is linear, and that $Y_F(0) = 0$ for any $F$. The following proposition will be used to estimate the composition of maps $f_n = Y_{F_n}$, associated with an orbit $(F_0, F_1, F_2, \dots)$ of $\mathcal{M}_\mu$.

Proposition 4.3. Let $U$, $V$ be normed linear spaces, and let $Z$ be an open ball in $U \oplus V$, centered at zero. Let $L$ be a bounded linear operator on $U \oplus V$ that commutes with the projection $(u, v) \mapsto (u, 0)$, and that satisfies
$$\|L(u, 0)\| = a\|u\|, \qquad \|L(0, v)\| \le b\|v\|, \qquad u \in U,\ v \in V, \tag{4.16}$$
with $0 < b < a$ fixed. Let $f_0, f_1, \dots, f_{n-1}$ be Lipschitz maps on $Z$, that leave the origin fixed. Then
$$\pm a^{-n}\|f_0(f_1(\cdots f_{n-1}(u, v)\cdots))\| \le \pm\|u\| + e^{c_0}c_0\|u\| + (b/a)^n e^{c_0 + c_1}\|v\|, \tag{4.17}$$
where
$$c_m = \frac{1}{a}\sum_{j=0}^{n-1} \Bigl(\frac{a}{b}\Bigr)^{m(j+1)} \sup_{z \ne 0} \frac{\|f_j(z) - Lz\|}{\|z\|}, \qquad m = 0, 1. \tag{4.18}$$
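Proof. Denote by $s_j$ the value of the supremum in (4.18). Let $(u_0, v_0)$ be an arbitrary point in $Z$. For $k = 1, 2, \dots, n$, define $(u_k, v_k) = L^k(u_0, v_0)$ and
$$r_k = f_{n-k}\circ f_{n-k+1}\circ\dots\circ f_{n-1}(u_0, v_0) - (u_k, v_k). \tag{4.19}$$
By (4.16) we have $\|u_k\| = a^k\|u_0\|$ and $\|v_k\| \le b^k\|v_0\|$. If we set $r_0 = 0$, then
$$\begin{aligned}
\|r_k\| &= \bigl\|f_{n-k}\bigl((u_{k-1}, v_{k-1}) + r_{k-1}\bigr) - L(u_{k-1}, v_{k-1})\bigr\|\\
&\le \|Lr_{k-1}\| + s_{n-k}\bigl\|(u_{k-1}, v_{k-1}) + r_{k-1}\bigr\|\\
&\le (a + s_{n-k})\|r_{k-1}\| + s_{n-k}a^{k-1}\|u_0\| + s_{n-k}b^{k-1}\|v_0\|,
\end{aligned} \tag{4.20}$$
for all positive $k \le n$. Define $p_k = \prod_{j=n-k}^{n-1}(1 + s_j/a)$. Then, by applying the bound (4.20) recursively, we obtain
$$\begin{aligned}
\frac{1}{a^k p_k}\|r_k\| &\le \frac{1}{a^{k-1}p_{k-1}}\|r_{k-1}\| + \frac{s_{n-k}}{a}\|u_0\| + \Bigl(\frac{b}{a}\Bigr)^{n}\Bigl(\frac{a}{b}\Bigr)^{n-k+1}\frac{s_{n-k}}{a}\|v_0\|\\
&\le \sum_{j=n-k}^{n-1}\frac{s_j}{a}\|u_0\| + \Bigl(\frac{b}{a}\Bigr)^{n}\sum_{j=n-k}^{n-1}\Bigl(\frac{a}{b}\Bigr)^{j+1}\frac{s_j}{a}\|v_0\|.
\end{aligned} \tag{4.21}$$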
In particular, $a^{-n}\|r_n\| \le p_n c_0\|u_0\| + (b/a)^n p_n c_1\|v_0\|$. The assertion follows by combining this inequality with the trivial bounds $p_n \le e^{c_0}$ and $1 + p_n c_1 \le e^{c_0 + c_1}$. □

Proposition 4.3 can be used to prove uniform upper and lower bounds on $\beta_n - \beta'$, which we will need in order to estimate the accumulation rate of periodic orbits, as described in Theorem 1.9. The following proposition is the first step toward this goal. We note that a general result of this type was proved in [23], but it is not sufficient for our purpose.

Proposition 4.4. If $\delta > 0$ is sufficiently small, and if $w \in \mathbb{R}\mathbb{Z}^d$ is sufficiently close to $\omega$ and normalized, such that $h_w - h_\omega$ belongs to $D_{\delta/2}$, then there exists an open neighborhood $B'$ of $F^*$ in $\mathcal{F}_\delta$, and constants $k_1, k_2, N > 0$, such that the following holds. For every $F \in B'$, and for every non-negative integer $n$, the condition $\Psi(F(y)) \in \Sigma_n(w)$ defines a unique parameter value $y = y_n$ in $D_\delta$, and this value satisfies the bound
$$k_1|\lambda_2|^{-n} < \|y_n\| < k_2|\lambda_2|^{-n}, \qquad n \ge N. \tag{4.22}$$
Proof. Under the given conditions on $\delta$ and $w$, Theorem 1.8 guarantees that $F^*$ intersects the manifold $\Psi^{-1}(\Sigma_0(w))$ transversally, and in a single point. Since transversality is preserved under small perturbations, the same holds for every family $F$ in some open ball $B' \subset \mathcal{F}_\delta$ centered at $F^*$, and the intersection parameter $y_0$ depends analytically on $F$. Denote by $Z$ the map $F \mapsto y_0$, and by $r'$ the radius of $B'$. In addition, let $z = Z(F^*)$ and $L = DY_{F^*}(0)$. Assume that $\delta, r' > 0$ have been chosen sufficiently small, such that the restriction of $\mathcal{M}_\mu$ to $B'$ has the properties described in Proposition 4.2, and such that the derivatives of $F \mapsto Y_F$ and $\mathcal{M}_\mu$ are uniformly bounded on $B'$. Given any $F_0 \in B'$, we set $F_n = \mathcal{M}_\mu^n(F_0)$, for $n = 1, 2, \dots$. By the definition of $\mathcal{M}_\mu$, the family $F_n$ intersects $\Psi^{-1}(\Sigma_n(w))$ at a single point $y_n$, given by the equation
$$y_n = Y_{F_0}\circ Y_{F_1}\circ\dots\circ Y_{F_{n-1}}(z_n), \qquad z_n = Z(F_n). \tag{4.23}$$
By Proposition 4.2, the norm of $F_n - F^*$ is bounded by $(\theta\mu)^n\|F_0 - F^*\|$, for all $n \ge 0$. By analyticity, analogous bounds hold for $\|z_n - z\|$ and $\|DY_{F_n}(y) - L\|$, up to constant factors that we can choose to be independent of $n$, $F \in B$, and $y \in D_{\delta/2}$. Notice that $L$ is the inverse of the restriction of $D\mathcal{R}_\mu(h_\omega)$ to $\mathcal{Y}$. All eigenvalues of $L$ are either of modulus $a$, or of modulus less than $b$, with $\theta\mu < b < a$. Given these properties of the maps $Y_{F_n}$ and $L$, the bound (4.22) now follows from Proposition 4.3, provided that $z$ has a nonzero component in a spectral subspace of $L$ corresponding to an eigenvalue of modulus $a$. But $z = h_w - h_\omega$ meets this requirement, since $w = c_1W_1 + \dots + c_dW_d$, with $c_j \ne 0$ for all $j$. This follows from the fact that the fields $\mathbb{Q}[\vartheta_j]$ are all isomorphic, and that $\mathbb{Q}[\vartheta_1]$ is spanned by the rationally independent components of $\omega$. □

Proof of Theorem 1.9. Let $w'$ be a nonzero vector in $\mathbb{R}\mathbb{Z}^d$. In order to prove the assertion for $w = w'$, it suffices to prove it for $w = c(\vartheta^{-1}T)^k w'$, where $c$ can be any nonzero real number, and $k$ any nonnegative integer. Thus, since $c(\vartheta^{-1}T)^k w' \to \omega$ for some value of $c$, we can, without loss of generality, assume that $w$ is as close to $\omega$ as required by Proposition 4.4, after choosing $\delta > 0$ sufficiently small. In what follows, $c_0, \dots, c_{18}$ denote positive constants that do not depend on the choice of the Hamiltonian $H$. But unless stated otherwise, $c_i$ may depend on $\mu$. We will also introduce constants $b_1, \dots, b_6$ that only depend on the choice of $T$. We assume that $\mu > 0$ has been chosen sufficiently small such that $b_i\mu < |\lambda_2|^{-1}$. Since $r > \rho$, there exists an open neighborhood $\Lambda$ of zero in $\mathbb{C}^d$, such that $(\beta, H) \mapsto H_\beta$ is analytic and has bounded derivatives, as a map from $\Lambda \times \mathcal{B}_r$ to $\mathcal{B}_\rho$. When considering families $\beta \mapsto H_\beta$, we will implicitly assume that $\beta \in \Lambda$.
Let now $h$ be a Hamiltonian in $\mathcal{B}_r$, of the form (1.18), with $M$ as described after (1.18), and assume that $h$ is sufficiently close to $h_\omega$, such that $\mathcal{N}_\mu$ is well defined and analytic on an open ball in $\mathcal{B}_\rho$ containing $h$ and $h_\omega$. Define $Y(\beta) = P_u\Psi^{-1}(h\circ R_\beta)$. Then $Y(0) = 0$, and our assumption on $M$ implies that $Y$ is invertible as a map from an open neighborhood of zero in $\mathbb{C}^d$, to $\mathcal{Y}$. Define $f_0(y) = \Psi^{-1}(h\circ R_\beta)$, with $\beta = Y^{-1}(y)$. An explicit computation shows that for $n = 1, 2, \dots$, Eq. (4.14) defines a family $f_n = \mathcal{M}_\mu(f_{n-1})$, which belongs to $\mathcal{F}_\delta$ for large $n$, and that $f_n \to F^*$ in $\mathcal{F}_\delta$, as $n$ tends to infinity. Thus, given any open neighborhood $B'$ of $F^*$ in $\mathcal{F}_\delta$, there exists a positive integer $\ell$, such that if $H \in \mathcal{B}_r$ is sufficiently close to $h$, then the equation
$$F(y) = \Psi^{-1}\Bigl(\mathcal{R}_\mu^\ell\bigl(H_{\beta'}\circ R_{Z_\ell(y)}\bigr)\Bigr), \qquad Z_\ell = Y^{-1}\circ Y_{f_0}\circ\dots\circ Y_{f_{\ell-1}}, \tag{4.24}$$
defines a family $F \in B'$. Here, $\beta'$ denotes the parameter value where $\beta \mapsto H_\beta$ intersects $W^s$. This value is well defined and depends analytically on $H$, since $\beta \mapsto h_\beta$ intersects $W^s$ transversally at $h$. By using Proposition 4.4, and the fact that $Z_\ell$ is invertible near the origin, we can find constants $k'_1, k'_2, N' > 0$, such that for all $n \ge N'$, and for all Hamiltonians $H$ in some open neighborhood $B$ of $h$ in $\mathcal{B}_r$, the condition $H_\beta \in \Sigma_n(w)$ defines a unique parameter value $\beta = \beta_n$, and this value satisfies the bound
$$k'_1|\lambda_2|^{-n} < |\beta_n - \beta'| < k'_2|\lambda_2|^{-n}. \tag{4.25}$$
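The two-sided bound (4.25) already contains the estimate (1.20) announced in the introduction. As a short worked step, taking logarithms in (4.25) and dividing by $-n$ gives
$$\ln|\lambda_2| - \frac{\ln k'_2}{n} < -\frac{1}{n}\ln|\beta_n - \beta'| < \ln|\lambda_2| - \frac{\ln k'_1}{n}, \qquad n \ge N',$$
so that
$$\Bigl|-\frac{1}{n}\ln|\beta_n - \beta'| - \ln|\lambda_2|\Bigr| \le \frac{\max\{|\ln k'_1|, |\ln k'_2|\}}{n},$$
which is of the form $\ln|\lambda_2| + O(1/n)$ as in (1.20).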
In what follows, we assume that $n$ is larger than $N'$. Define $H_{\beta,m} = \mathcal{R}_\mu^m(H_\beta)$, whenever $H_\beta$ belongs to the domain of $\mathcal{R}_\mu^m$.

Consider now a fixed but arbitrary Hamiltonian $H \in B$. Given that $H_{\beta_n,n}$ lies on $\Sigma(w)$, a constant multiple of this Hamiltonian has a periodic orbit with rotation vector $w$, given by an analytic curve $g_n$ in $D_\rho$ with fixed domain, as described in Definition 1.7. By the definition of $\mathcal{R}_\mu$, the Hamiltonian $H_{\beta_n,n}$ is related to $H_{\beta_n}$ by a canonical change
of coordinates homotopic to $T_1^n$, and a scaling. Formally, a constant multiple of $H_{\beta_n}$ has a periodic orbit $G_n$ with rotation vector $w_n = (\vartheta^{-1}T)^n w$, and this orbit is given by the equation
$$\begin{aligned}
G_n &= U_{H_{\beta_n,0}}\circ T_\mu\circ U_{H_{\beta_n,1}}\circ T_\mu\circ\dots\circ U_{H_{\beta_n,n-1}}\circ T_\mu\circ g_n\circ\Theta^{-n}\\
&= V_n(H_{\beta_n})\circ T_\mu^n\circ g_n\circ\Theta^{-n},
\end{aligned} \tag{4.26}$$
where $\Theta(t) = \vartheta t$ for all $t$, and where $V_n(H_{\beta_n})$ is the transformation given by (1.14). But in order to establish the existence of such an orbit $G_n$, we need to show that the maps in Eq. (4.26) can be composed as indicated. By construction, the size of the domain (width of the strip in the angle variable, and diameter of the ball in the action variable) of the transformation $V_n(H_{\beta_n})$ decreases exponentially with $n$. But the rate of decrease is independent of $\mu$: Due to the identity (3.7), the transformations $V_k(H_{\beta_n})$, defined in (1.14), are not only canonical, but also independent of the choice of $\mu$. Thus, the same is true for $V_n(H_{\beta_n})$. On the other hand, the nonlinear part of $g_n$ is bounded by a constant times $|H_{\beta_n,n} - h_w|_\rho$, as was shown in Theorem 4.1. And this norm is less than $c_0(\theta\mu)^n$. This follows from Proposition 4.2, and from the fact that $\Sigma(w)$ intersects $F^*$ transversally at $h_w$. Thus, if $\mu > 0$ is chosen sufficiently small, we find that the range of $T_\mu^n\circ g_n$ is contained in the domain of $V_n(H_{\beta_n})$, for large $n$.

A more detailed discussion of the transformations $V_n(H)$ can be found in [15]. The estimates obtained there are formulated for $H = H_{\beta'}$ only, but they are easy to adapt to the Hamiltonians considered here. To be more precise, the starting point for these estimates is a bound
$$\|\phi(H_{\beta',m})\|'_{r_0} \le c_2\|\mathbb{I}_- H_{\beta',m}\|_\rho \le c_2c_1(b_1\mu)^m\|H_{\beta'} - h_\omega\|_\rho, \tag{4.27}$$
where $\phi(H_{\beta',m})$ is the generating function of the canonical transformation $U_{H_{\beta',m}}$. The constants $c_2$ and $b_1$, and the parameter $r_0$ defining the domain of $\phi_m(H_{\beta',m})$, are independent of $\mu$. We have omitted an additional factor $(b_1\mu)^m$ that appears in Eq. (5.11) of [15], since it is not used or needed. Concerning the replacement of $H_{\beta'}$ by $H_{\beta_n}$, we note that the first inequality in (4.27) is a general bound on $\phi$ that applies directly to $H_{\beta_n,m}$. The second inequality in (4.27) carries over as well. In fact, it can be improved if $H$ is close to $h$:
$$\|\mathbb{I}_- H_{\beta,m}\|_\rho \le c_3(\theta\mu)^m\|H - h\|_r, \qquad H \in B, \tag{4.28}$$
for all $\beta$ in some open set $\Lambda_m \subset \mathbb{C}^d$ containing $\beta'$ and $\beta_n$. This inequality is trivial for $m \le \ell$, and if $m = \ell + k$ with $k$ positive, it follows from the bound
$$\begin{aligned}
\bigl\|\mathbb{I}_-\Psi\bigl(\mathcal{M}_\mu^k(F)\bigr)(y)\bigr\|_\rho &= \bigl\|\mathbb{I}_-\Psi\bigl(\mathcal{M}_\mu^k(F)\bigr)(y) - \mathbb{I}_-\Psi\bigl(\mathcal{M}_\mu^k(f_\ell)\bigr)(y)\bigr\|_\rho\\
&\le c_4\bigl\|\mathcal{M}_\mu^k(F) - \mathcal{M}_\mu^k(f_\ell)\bigr\| \le c_4(\theta\mu)^k\|F - f_\ell\| \le c_5(\theta\mu)^k\|H - h\|_r.
\end{aligned} \tag{4.29}$$
Here, $F$ is the family defined in (4.24), and $y$ is an arbitrary point in $D_\delta$. We note that for $m = \ell + k$, the abovementioned set $\Lambda_m$ can be taken to be the image of $D_\delta$ under the map $Z_\ell\circ Y_{F_0}\circ\dots\circ Y_{F_{k-1}}$, where $F_n = \mathcal{M}_\mu^n(F)$. Since $Y_{F_n} \to Y_{F^*}$ uniformly on $D_\delta$, there exists a universal constant $b > 0$, such that $\Lambda_m$ contains a union of balls of radii $c_6b^m$, whose centers trace out a path of length $\le c_7b^{-m}|\beta_n - \beta'|$, connecting $\beta_n$ and $\beta'$. Thus, by using (4.29), together with Cauchy’s formula to estimate
392
J. J. Abad, H. Koch
the derivative of β 7 → I− Hβ,m along the path from βn and β 0 , we obtain the first of the following two bounds: kI− Hβn ,m − I− Hβ 0 ,m kρ ≤ c8 (b2 µ)m |βn − β 0 | kH − hkr ,
(4.30)
kHβn ,m − Hβ 0 ,m kρ ≤ c9 b3m |βn − β 0 |.
The second bound is obtained similarly. Consider now the curve $\gamma_n = R_{\beta_n} \circ G_n$. Clearly, $\gamma_n$ is a periodic orbit for a constant multiple of $H$, with rotation vector $w_n$. Our next goal is to show that the $n$-dependence of $\gamma_n(0)$ is mostly due to the translations $R_{\beta_n}$. To this end, we split $G_n(0) - \Gamma_{H_{\beta'}}(0)$ into three pieces and use the triangle inequality:
$$|G_n(0) - \Gamma_{H_{\beta'}}(0)| \le |V_n(H_{\beta_n})(T_\mu^n(g_n(0))) - V_n(H_{\beta_n})(0)| + |V_n(H_{\beta_n})(0) - V_n(H_{\beta'})(0)| + |V_n(H_{\beta'})(0) - \Gamma_{H_{\beta'}}(0)|. \tag{4.31}$$
The first term on the right hand side of (4.31) can be estimated by using the results from [15], with $H_{\beta'}$ replaced by $H_{\beta_n}$, as mentioned above. The relevant fact is that $V_n(H_{\beta_n})$ is uniformly bounded on a domain whose size decreases exponentially in $n$, independently of $\mu$, while $|g_n(0)| \le c_{10}(\theta\mu)^n$, as mentioned earlier. Thus,
$$|V_n(H_{\beta_n})(T_\mu^n(g_n(0))) - V_n(H_{\beta_n})(0)| \le c_{11}(b_4\mu)^n. \tag{4.32}$$
In order to bound the second term on the right hand side of (4.31), we define an interpolating family of transformations $s \mapsto V_{n,s}(H)$, such that $V_{n,0}(H) = V_n(H_{\beta'})$ and $V_{n,1}(H) = V_n(H_{\beta_n})$. To obtain $V_{n,s}(H)$, each of the transformations $U_{H_{\beta',m}}$ that enter the definition (1.14) of $V_n(H_{\beta'})$ is replaced by the canonical transformation with generating function
$$\phi_{n,m,s}(H) = \phi(H_{\beta',m}) + s\,\big[\phi(H_{\beta_n,m}) - \phi(H_{\beta',m})\big]. \tag{4.33}$$
By using the analyticity of the map $H \mapsto \phi(H)$, and the bounds (4.25), (4.29), and (4.30), we find that for $|s| \le |\lambda_2|^n$,
$$\|\phi_{n,m,s}(H)\|'_{r_0} \le c_{12}\,\|\mathbb{I}^- H_{\beta',m}\|_\rho + c_{13}|s|\,\|\mathbb{I}^- H_{\beta_n,m} - \mathbb{I}^- H_{\beta',m}\|_\rho + c_{14}|s|\,\|\mathbb{I}^- H_{\beta',m}\|_\rho\,\|H_{\beta_n,m} - H_{\beta',m}\|_\rho \le c_{15}(b_5\mu)^m\,\|H - h\|_r. \tag{4.34}$$
With this bound replacing (4.27), we now obtain the analogue of Lemma 5.5 in [15], which implies that $v(s) = V_{n,s}(H)(0)$ is bounded in modulus by $c_{16}\|H - h\|_r$, if $n$ is sufficiently large and $|s| \le |\lambda_2|^n$. Thus, by writing $v(1) - v(0)$ as the integral of $v'$ over $[0, 1]$, and estimating $v'$ by using Cauchy's formula with contour $|s| = |\lambda_2|^n$, we obtain the bound
$$|V_n(H_{\beta_n})(0) - V_n(H_{\beta'})(0)| \le c_{17}|\lambda_2|^{-n}\,\|H - h\|_r. \tag{4.35}$$
The last term in (4.31) satisfies again a bound of the form (4.32), as was shown in [15]. Putting the pieces together, we now have
$$|G_n(0) - \Gamma_{H_{\beta'}}(0)| \le c_{17}|\lambda_2|^{-n}\,\|H - h\|_r + c_{18}(b_6\mu)^n, \tag{4.36}$$
provided that $H$ is sufficiently close to $h$ in $B_r$, and $n$ sufficiently large. Since
$$\pm|\gamma_n(0) - \Gamma'(0)| \le \pm|\beta_n - \beta'| + |G_n(0) - \Gamma_{H_{\beta'}}(0)|, \tag{4.37}$$
as a result of the identities $\gamma_n(0) = G_n(0) + (0, \beta_n)$ and $\Gamma'(0) = \Gamma_{H_{\beta'}}(0) + (0, \beta')$, the bound (1.21) now follows from (4.25) and (4.36), if $H$ is sufficiently close to $h$. □

We conclude with a proof of (1.22), using the same notation as above. By the definition of $\Sigma(w)$, we have $H_{\beta_n,n} \circ g_n = 0$, and thus $H \circ \gamma_n = H_{\beta_n} \circ G_n = a_n H_{\beta_n,n} \circ g_n \circ \Theta^{-n} = 0$, where $a_n$ is some nonzero constant. If $U$ is a canonical transformation with globally defined generating function, or a composition of such transformations, then $K(U \circ \gamma) = K(\gamma)$ for any closed curve $\gamma$. The corresponding identity for $T_\mu$ is $K(T_\mu \circ \gamma) = \mu K(\gamma)$. As a result, we have $K(G_n) = \mu^n K(g_n) = 0$, and thus $K(\gamma_n) = \tau(w_n)\, w_n \cdot \beta_n$, which is equivalent to Eq. (1.22).

Acknowledgements. We would like to thank R. de la Llave and P. Wittwer for helpful discussions.
References 1. Abad, J.J., Koch, H., Wittwer, P.: A Renormalization Group for Hamiltonians: Numerical Results. Nonlinearity 11, 1185–1194 (1998) 2. Arnold, V.I.: Proof of A.N. Kolmogorov’s Theorem on the Preservation of Quasi-Periodic Motions under Small Perturbations of the Hamiltonian. Usp. Mat. Nauk, 18, No. 5, 13–40 (1963); Russ. Math. Surv., 18, No. 5, 9–36 (1963) 3. Bernstein, D., Katok, A.: Birkhoff Periodic Orbits for Small Perturbations of Completely Integrable Hamiltonian Systems with Convex Hamiltonians. Invent. Math. 88, 225–241 (1987) 4. Chandre, C., Govin, M., Jauslin, H.R.: KAM-Renormalization Group Analysis of Stability in Hamiltonian Flows. Phys. Rev. Lett. 79, 3881–3884 (1997) 5. Chandre, C., Govin, M., Jauslin, H.R., Koch, H.: Universality for the Breakup of Invariant Tori in Hamiltonian Flows. Phys. Rev. E 57, 6612–6617 (1998) 6. Chandre, C., Jauslin, H.R., Benfatto, G., Celletti, A.: An Approximate Renormalization-Group Transformation for Hamiltonian Systems with Three Degrees of Freedom. Preprint U. Texas, mp_arc 99–74 (1999) 7. Collet, P., Eckmann, J.-P.: Iterated Maps on the Interval as Dynamical Systems. Basel–Boston–Berlin: Birkhäuser Verlag, 1980. 8. de la Llave, R.: Introduction to KAM Theory. Preprint U. Texas, mp_arc 93–8 (1993) 9. del Castillo-Negrete, J., Greene, J.M., Morrison, P.: Area Preserving Non-Twist Maps: Periodic Orbits and Transition to Chaos. Phys. D 91, 1–23 (1996) 10. Delshams, A., delaLlave, R.: KAM Theory and a Partial Justification of Greene’s Criterion for Non-Twist maps. Preprint U. Texas, mp_arc 98–732 (1998) 11. Escande, D.F., Doveil, F.: Renormalisation Method for Computing the Threshold of the Large Scale Stochastic Instability in Two Degree of Freedom Hamiltonian Systems. J. Stat. Phys. 26, 257–284 (1981) 12. Falcolini, C., delaLlave, R.: A Rigorous Partial Justification of Greene’s Criterion. J. Stat. Phys. 67, 609–643 (1992) 13. Greene, J.M.: A Method for Determining a Stochastic Transition. J. Math. Phys. 
20, 1183–1201 (1979) 14. Hirsch, M.W., Pugh, C.C., Shub, M.: Invariant Manifolds. Lecture Notes in Math. 583, Berlin–New York: Springer-Verlag, 1977 15. Koch, H.: A Renormalization Group for Hamiltonians, with Applications to KAM Tori. Erg. Theor. Dyn. Syst. 19, 1–47 (1999) 16. Kolmogorov, A.N.: On Conservation of Conditionally Periodic Motions Under Small Perturbations of the Hamiltonian. Dokl. Akad. Nauka SSSR, 98, 527–530 (1954) 17. Kosygin, D.: Multidimensional KAM Theory from the Renormalization Group Viewpoint. In: Dynamical Systems and Statistical Mechanics, Ya.G. Sinai (ed), AMS, Adv. Sov. Math. 3, 99–129 (1991) 18. MacKay, R.S.: Renormalisation in Area Preserving Maps. Thesis, Princeton (1982), London: World Scientific, 1993 19. MacKay, R.S.: Three Topics in Hamiltonian Dynamics. In: Dynamical Systems and Chaos, Vol.2, Y. Aizawa, S. Saito, K. Shiraiwa (eds), London: World Scientific, 1995
20. MacKay, R.S., Meiss, J.D., Stark, J.: An Approximate Renormalization for the Break-up of Invariant Tori with Three Frequencies. Phys. Lett. A 190, 417–424 (1994) 21. Mehr, A., Escande, D.F.: Destruction of KAM Tori in Hamiltonian Systems: Link with the Destabilization of nearby Cycles and Calculation of Residues. Physica 13D, 302–338 (1984) 22. Moser, J.: On Invariant Curves of Area-Preserving Mappings of an Annulus. Nachr. Akad. Wiss. Gött., II. Math. Phys. Kl 1962, 1–20 (1962) 23. Palis, J.: A Note on the Inclination Lemma (λ-Lemma) and Feigenbaum’s Rate of Approach. In: Geometric dynamics (Rio de Janeiro, 1981), J. Palis (ed), Lecture Notes in Math. 1007, Berlin–New York: SpringerVerlag, pp. 630–635, 1983. 24. Palis, J., de Melo, W.: Geometric Theory of Dynamical Systems. An Introduction. Berlin–New York: Springer-Verlag, 1982 25. Thirring, W.: A Course in Mathematical Physics I: Classical Dynamical Systems. Berlin–NewYork–Wien: Springer-Verlag, 1978 26. Tompaidis, S.: Approximation of Invariant Surfaces by Periodic Orbits in High-Dimensional Maps: Some Rigorous Results. Experimental Math. 5, 197–209 (1996) Communicated by Ya. G. Sinai
Commun. Math. Phys. 212, 395 – 413 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Noncommutative Geometry and Gauge Theory on Fuzzy Sphere Ursula Carow-Watamura, Satoshi Watamura Department of Physics, Graduate School of Science, Tohoku University, Aoba-ku, Sendai 980-8577, Japan. E-mail:
[email protected];
[email protected] Received: 21 January 1998 / Accepted: 4 February 2000
Abstract: The differential algebra on the fuzzy sphere is constructed by applying Connes' scheme. The $U(1)$ gauge theory on the fuzzy sphere based on this differential algebra is defined. The local $U(1)$ gauge transformation on the fuzzy sphere is identified with the left $U(N+1)$ transformation of the field, where a field is a bimodule over the quantized algebra $A_N$. The interaction with a complex scalar field is also given.
1. Introduction The concept of quantized spaces is discussed in a variety of fields in physics and mathematics. From the physicists’ viewpoint, the main motivation for investigating noncommutative spaces stems from the need of an appropriate framework to describe the quantum theory of gravity. Recently quantized spaces are also discussed in connection with M(atrix) theory which has been proposed as a nonperturbative formulation of string theory [1,2]. This development in string theory supports the idea that the noncommutative structure of spacetime becomes relevant when constructing the theory of gravitation at Planck scale. To describe noncommutative spaces, the noncommutative geometry is now investigated by many authors and using this framework one can even consider the differential geometry of singular spaces like, for example, a 2-point space which has been shown to provide a geometrical interpretation of the Higgs mechanism [3]. On the other hand, in order to describe gravity we have to know the theory of a wider class of noncommutative geometry. In this context, the class of noncommutative spaces which can be considered as deformations of continuous spaces is especially interesting. In general, such noncommutative spaces can be obtained by quantizing a given space with its Poisson structure. Furthermore, if the original space is compact one obtains a finite dimensional matrix algebra as a quantized algebra of functions over this space. In this case, we may consider the deformation as a kind of regularization with the special
property that we can keep track of the geometric structure, a feature which is missing in the conventional regularization schemes. In physics the algebra of the fuzzy sphere is well known and has been investigated in a variety of contexts: as an example for a general quantization procedure [4, 5] (see also for example [6–8,10,11] and references therein) and in relation with geometric quantization. It is also discussed as the algebra appearing in membranes [12, 13], in relation with coherent states [14,15], and recently in connection with noncommutative geometry [16–18]. The same structure also appears in the context of the quantum Hall effect [19,20]. In this paper, we investigate the differential geometry of the fuzzy sphere and the field theory on it. We formulate the U (1) gauge theory on the fuzzy sphere. The fuzzy sphere is one example in the above mentioned class of noncommutative geometry and thus the field theory on this space is a very instructive model to examine the ideas of noncommutative geometry. Besides that, it is a deformation of the sphere obtained by quantization based on the Poisson structure on S 2 , and the resulting algebra AN is a finite dimensional matrix algebra. Thus, what we obtain is a regularized field theory on the sphere. From this point of view, we are also interested in the gauge theory on this noncommutative space. In order to formulate the local U (1) gauge theory on the fuzzy sphere, we first have to define the differential algebra based on the above algebra AN . We apply Connes’ framework of noncommutative differential geometry [9] by using a spectral triple (AN ,HN , D) proposed recently by the authors[25], where D is the Dirac operator and HN is the corresponding Hilbert space of spinors. We analyze the space of 1-forms which corresponds to the gauge potential and give the 2-forms to define the field strength. This paper is organized as follows. In Sect. 
2, we summarize the definitions of the Dirac operator, the chirality operator and the spectral triple. We give a complete derivation of the spectrum of the Dirac operator and discuss its properties in detail. Then we define the differential algebra on the fuzzy sphere. In Sect. 3, the gauge field and the field strength are defined using this differential algebra. We examine the structure of the $U(1)$ gauge transformation of the charged scalar field. Then the corresponding invariant actions are formulated. Section 4 contains the discussion. We also discuss the commutative limit.

2. Noncommutative Differential Algebra

2.1. Algebra of fuzzy sphere. The algebra of the fuzzy sphere can be obtained by quantizing the function algebra over the sphere by using its Poisson structure. To this end we adopt the Berezin-Toeplitz quantization which gives the quantization procedure for a Kähler manifold [4,5]. Applying this method to the function algebra over the sphere we obtain the algebra $A_N$. $A_N$ can be represented by operators acting on an $(N+1)$-dimensional Hilbert space $F_N$. The algebra $A_N$ can thus be identified with the algebra of the complex $(N+1) \times (N+1)$ matrices.

The basic algebra to be quantized is the function algebra $A_\infty$ of the square integrable functions over a 2-sphere. The basis of this algebra is given by the spherical harmonics $Y_{lm}$ and the multiplication of the algebra is the usual pointwise product of functions. The fuzzy sphere may also be introduced as an approximation of the function algebra over the sphere by taking a finite number $N$ of spherical harmonics, where this number $N$ is limited by the maximal angular momentum $\{Y_{lm};\ l \le N\}$. However, with respect to the usual multiplication this set of functions does not form a closed algebra, since the product of two spherical harmonics $Y_{lm}$ and $Y_{l'm'}$ contains $Y_{l+l',m}$. It is a new multiplication rule that resolves this situation and gives a closed function algebra with a finite number of basis elements. The resulting algebra $A_N$ is noncommutative. We can identify the algebra of the fuzzy sphere with the algebra of complex matrices $M_{N+1}(\mathbb{C})$ and thus we can consider it as a special case of matrix geometry [21–24].

The operator algebra $A_N$ and the Hilbert space $F_N$ can be formulated keeping the symmetry properties under the rotation group. We introduce a pair of creation-annihilation operators $a^\dagger_b$, $a^b$ ($b = 1, 2$) which transforms as a fundamental representation under the $SU(2)$ action of rotation,
$$[a^a, a^\dagger_b] = \delta^a_b. \tag{1}$$
Define the number operator by $\mathbf{N} = a^\dagger_b a^b$; then the set of states $|v\rangle$ in the Fock space associated with the creation-annihilation operators satisfying
$$\mathbf{N}|v\rangle = N|v\rangle \tag{2}$$
provides an $(N+1)$-dimensional Hilbert space $F_N$. The orthogonal basis $|k\rangle$ of $F_N$ can be defined as
$$|k\rangle = \frac{1}{\sqrt{k!\,(N-k)!}}\,(a_1^\dagger)^k (a_2^\dagger)^{N-k}\,|0\rangle, \tag{3}$$
where $k = 0, \dots, N$ and $|0\rangle$ is the vacuum. The operator algebra $A_N$ acting on $F_N$ is unital and given by operators $\{O;\ [\mathbf{N}, O] = 0\}$. The generators of the algebra $A_N$ are defined by
$$x_i = \frac{1}{2}\,\alpha\,\sigma_i{}^{a}{}_{b}\, a^\dagger_a a^b, \tag{4}$$
where the normalization factor $\alpha$ is a central element, $[\alpha, x_i] = 0$, and is defined by the constraint
$$x_i x_i = \frac{\alpha^2}{4}\,\mathbf{N}(\mathbf{N}+2) = \ell^2. \tag{5}$$
The above equation means that $\ell > 0$ is the radius of the 2-sphere and we get for $\alpha$,
$$\alpha = \frac{2\ell}{\sqrt{\mathbf{N}(\mathbf{N}+2)}}. \tag{6}$$
The algebra of the fuzzy sphere is generated by $x_i$ and the basic relation is
$$[x_i, x_j] = i\alpha\,\epsilon_{ijk}\,x_k. \tag{7}$$
On the Hilbert space $F_N$, $\alpha$ is constant and plays the role of the "Planck constant". The commutative limit corresponds to $\alpha \to 0$, i.e., $N \to \infty$.$^1$ Now let us consider the derivations of $A_N$. Among them, the derivative operator $L_i$ is defined by the adjoint action of $x_i$ [16],
$$\frac{1}{\alpha}\,\mathrm{ad}_{x_i} a = \frac{1}{\alpha}[x_i, a] \equiv L_i a, \tag{8}$$
where $a \in A_N$. These objects are the noncommutative analogue of the Killing vector fields on the sphere, and the algebra of $L_i$ closes. We obtain thus
$$[L_i, x_j] = i\epsilon_{ijk}\,x_k, \qquad [L_i, L_j] = i\epsilon_{ijk}\,L_k. \tag{9}$$

$^1$ Another possible choice is to take $\alpha = \frac{2}{N}$ as in ref. [4]. With this choice, the radius of the fuzzy sphere depends on $N$.
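The relations (5), (7) and (9) can be checked numerically for small $N$. The following is a minimal sketch in plain Python, under the assumption (standard for Schwinger-type oscillator constructions, but not spelled out in the text) that the generators $x_i$ restricted to $F_N$ are $\alpha$ times the spin-$N/2$ angular momentum matrices. The helper names `fuzzy_sphere` and `check_relations` are hypothetical and chosen only for this illustration.

```python
from math import sqrt

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(a, A, b, B):
    """Entrywise a*A + b*B."""
    return [[a * p + b * q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def comm(A, B):
    return lincomb(1, matmul(A, B), -1, matmul(B, A))

def close(A, B, tol=1e-9):
    return all(abs(p - q) < tol for ra, rb in zip(A, B) for p, q in zip(ra, rb))

def fuzzy_sphere(N, ell=1.0):
    """Return (alpha, [x1, x2, x3]) on the (N+1)-dimensional space, N >= 1.

    Assumption: x_i = alpha * J_i with J_i the spin-N/2 matrices.
    """
    dim, spin = N + 1, N / 2.0
    alpha = 2.0 * ell / sqrt(N * (N + 2))            # Eq. (6) evaluated on F_N
    m = [spin - i for i in range(dim)]               # eigenvalues of J_3
    zero = [[0.0] * dim for _ in range(dim)]
    J3 = [[m[i] if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    Jp = [row[:] for row in zero]
    for i in range(1, dim):                          # raising operator J_+
        Jp[i - 1][i] = sqrt(spin * (spin + 1) - m[i] * (m[i] + 1))
    Jm = [[Jp[j][i] for j in range(dim)] for i in range(dim)]
    J1 = lincomb(0.5, Jp, 0.5, Jm)                   # (J_+ + J_-)/2
    J2 = lincomb(-0.5j, Jp, 0.5j, Jm)                # (J_+ - J_-)/(2i)
    return alpha, [lincomb(alpha, J, 0, zero) for J in (J1, J2, J3)]

def check_relations(N, ell=1.0):
    """Check Eqs. (5), (7) and the closure of the L_i in Eq. (9)."""
    alpha, x = fuzzy_sphere(N, ell)
    dim = N + 1
    one = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    xx = [matmul(xi, xi) for xi in x]
    casimir = lincomb(1, lincomb(1, xx[0], 1, xx[1]), 1, xx[2])
    ok = close(casimir, lincomb(ell ** 2, one, 0, one))          # Eq. (5)
    for i, j, k in ((0, 1, 2), (1, 2, 0), (2, 0, 1)):            # Eq. (7)
        ok = ok and close(comm(x[i], x[j]), lincomb(1j * alpha, x[k], 0, x[k]))
    L = lambda i, a: lincomb(1 / alpha, comm(x[i], a), 0, a)     # Eq. (8)
    a = matmul(x[0], x[2])                                       # a sample element
    lhs = lincomb(1, L(0, L(1, a)), -1, L(1, L(0, a)))
    return ok and close(lhs, lincomb(1j, L(2, a), 0, a))         # Eq. (9)

print(all(check_relations(N) for N in range(1, 6)))
```

The check uses floating-point arithmetic with a fixed tolerance, so it is a sanity test for small $N$ rather than a proof.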
Finally, the integration is given by the trace over the Hilbert space $F_N$. The integration over the fuzzy sphere which corresponds to the standard integration over the sphere in the commutative limit is defined by
$$\langle O\rangle = \frac{1}{N+1}\,\mathrm{Tr}\{O\} = \frac{1}{N+1}\sum_k \langle k|O|k\rangle, \tag{10}$$
where $O \in A_N$.

2.2. Chirality operator and Dirac operator. We introduce the spinor field $\Psi$ as an $A_N$-bimodule $\Gamma A_N \equiv \mathbb{C}^2 \otimes A_N$, which is the noncommutative analogue of the space of sections of a spin bundle. $\Psi$ is represented by 2-component spinors $\Psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}$, where each entry is an element of $A_N$, and we require that it transforms as a spinor under rotation of the sphere. Since left multiplication and right multiplication commute, the $A_N$-bimodule can be considered as a left module over the algebra $A_N \otimes A_N^o$, where $A_N^o$ denotes the opposite algebra, which is defined by
$$x_i^o x_j^o \equiv (x_j x_i)^o, \qquad x_i \in A_N. \tag{11}$$
The action of $a, b \in A_N$ onto the $A_N$-bimodule $\Psi \in \Gamma A_N$ is
$$a\, b^o\, \Psi \equiv a\,\Psi\, b. \tag{12}$$
We define the Dirac operator and the chirality operator in the algebra $A_N \otimes A_N^o$ [25], i.e. as $2 \times 2$ matrices the entries of which are elements in the algebra $A_N \otimes A_N^o$. The construction of the Dirac operator is performed by the following steps: (a) Define a chirality operator which commutes with the elements of $A_N$ and which has a standard commutative limit. (b) Define the Dirac operator by requiring that it anticommutes with the chirality operator and that, in the commutative limit, it reproduces the standard Dirac operator on the sphere. Requiring the above condition (a) we obtain for the chirality operator [25]
$$\gamma_\chi = \frac{1}{\mathcal{N}}\Big(\sigma_i x_i^o - \frac{\alpha}{2}\Big). \tag{13}$$
$\mathcal{N}$ is a normalization constant defined by the condition
$$(\gamma_\chi)^2 = 1 \tag{14}$$
as $\mathcal{N} = \frac{\alpha}{2}(N+1)$, and $\sigma_i$ ($i = 1, 2, 3$) are the Pauli matrices. In the commutative limit, the operator $x_i$ can be identified with the homogeneous coordinate $x_i$ of the sphere and the chirality operator given in Eq. (13) becomes $\frac{1}{\ell}\sigma_i x_i$, which is the standard chirality operator invariant under rotation [26]. The chirality operator (13) defines a $\mathbb{Z}_2$ grading of the differential algebra and it commutes with the algebra $A_N$.
Proposition 1. The Dirac operator $D$ satisfying the condition (b), i.e., $\{\gamma_\chi, D\} = 0$, is given by
$$D = \frac{i}{\ell\alpha}\,\gamma_\chi\,\epsilon_{ijk}\,\sigma_i\, x_j^o\, x_k. \tag{15}$$
Proof. See [25].$^2$

Note that this Dirac operator is selfadjoint, $D^\dagger = D$. Acting with this operator on a spinor $\Psi \in \Gamma A_N$, we obtain
$$D\Psi = \frac{i}{\ell}\,\gamma_\chi\,\chi_i J_i\,\Psi, \tag{16}$$
where
$$J_i = L_i + \frac{1}{2}\sigma_i, \tag{17}$$
and
$$\chi_i \equiv \epsilon_{ijk}\, x_j\, \sigma_k. \tag{18}$$
The action of the angular momentum operator on the bimodule is defined by
$$L_i \Psi \equiv \frac{1}{\alpha}[x_i, \Psi] = \frac{1}{\alpha}(x_i\Psi - \Psi x_i). \tag{19}$$
The second condition of (b) concerning the commutative limit of the Dirac operator is also satisfied. If we replace each operator $\chi_i$, $J_i$ and $\gamma_\chi$ in Eq. (16) by the corresponding quantity which is obtained in the commutative limit, we get
$$D_\infty = \frac{i}{\ell}\,\gamma_\chi\,\chi_i J_i = \frac{i}{\ell^2}\,(\sigma_l x_l)\,\epsilon_{ijk}\, x_i\,\sigma_j\Big(iK_k + \frac{1}{2}\sigma_k\Big) = -(i\sigma_i K_i + 1), \tag{20}$$
where $x_i$ is the homogeneous coordinate of $S^2$ and $K_i$ is the Killing vector. Therefore, in the commutative limit this Dirac operator is equivalent to the standard Dirac operator.
2.3. Spectral triple. In order to establish Connes' triple we have to identify the Hilbert space. The space of the fermions $\Psi \in A_N \otimes \mathbb{C}^2$ defines a Hilbert space $H_N$ with norm
$$\langle\Psi|\Psi\rangle = \mathrm{Tr}_F(\Psi^\dagger\Psi) = \sum_{\rho=1}^{2}\mathrm{Tr}_F\{(\psi^\rho)^*\psi^\rho\}, \tag{21}$$
where $\mathrm{Tr}_F$ is the trace over the $(N+1)$-dimensional Hilbert space $F_N$.

$^2$ Note that this Dirac operator is different from the one given in [17]. The difference is that the operator in [17] contains a product of the Pauli matrix and angular momentum operator, whereas the operator defined here contains a product of $\chi_i$ and angular momentum operator as in Eq. (16), i.e. it also contains $x_i$. Consequently, the spectra are not the same.
The dimension of the Hilbert space $H_N$ is $2(N+1)^2$ and the trace over $H_N$ is the trace over the spin suffices and over the $(N+1)^2$-dimensional space of the matrices. Since the Dirac operator is defined in the algebra $A_N \otimes A_N^o$, the trace must be taken for operators of the form $ab^o$, with $a, b \in A_N$, and it is given by
$$\mathrm{Tr}_H\{ab^o\} = \sum_{K=1}^{2(N+1)^2}\langle\Psi_K|ab^o\,\Psi_K\rangle = 2\,\mathrm{Tr}_F\{a\}\,\mathrm{Tr}_F\{b\}. \tag{22}$$
Here $\Psi_K$ is an appropriate basis in $H_N$ labeled by an integer $K \in \{1, \dots, 2(N+1)^2\}$. The factor 2 on the r.h.s. comes from the trace over the spin suffices. To examine the structure of the Hilbert space we compute the spectrum $\lambda_j$ of the Dirac operator:
$$D^2\,\Psi_{jm} = \lambda_j^2\,\Psi_{jm}. \tag{23}$$
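$\Psi_{jm}$ is a state with total angular momentum $j$, $\mathbf{J}^2\Psi_{jm} = j(j+1)\Psi_{jm}$ and $J_3\Psi_{jm} = m\Psi_{jm}$, where $J_3$ is the $x_3$ component of the total angular momentum operator $J_i$ in Eq. (17). $j$ and $m$ are half integers and run over $\frac{1}{2} \le j \le N + \frac{1}{2}$ and $-j \le m \le j$.

Proposition 2. The spectrum of the Dirac operator is given by
$$\lambda_j^2 = \Big(j + \frac{1}{2}\Big)^2\left[1 + \frac{1 - (j + \frac{1}{2})^2}{N(N+2)}\right]. \tag{24}$$
Proof.
$$\frac{\ell^2}{\alpha^2}\,D^2 = (\epsilon_{ijk}\,\sigma_i X_j Y_k)(\epsilon_{i'j'k'}\,\sigma_{i'} X_{j'} Y_{k'}) = \mathbf{X}^2\mathbf{Y}^2 - (\mathbf{X}\mathbf{Y})\big[(\mathbf{X}\mathbf{Y}) + 1 + (\mathbf{X}\sigma) + (\mathbf{Y}\sigma)\big], \tag{25}$$
where $X_i = \frac{1}{\alpha}x_i$, $Y_i = -\frac{1}{\alpha}x_i^o$ and $(\mathbf{X}\mathbf{Y}) = \sum_i X_i Y_i$. Using the relations
$$L_i = X_i + Y_i \quad\text{and}\quad J_i = L_i + \frac{1}{2}\sigma_i, \tag{26}$$
we obtain $(\mathbf{X}\mathbf{Y}) = \frac{1}{2}[\mathbf{L}^2 - \mathbf{X}^2 - \mathbf{Y}^2]$ and $(\sigma\mathbf{X}) + (\sigma\mathbf{Y}) = \mathbf{J}^2 - \mathbf{L}^2 - \frac{3}{4}$. In order to evaluate the spectrum we use the representation of the spinor and substitute
$$\mathbf{J}^2 = j(j+1) \quad\text{and}\quad \mathbf{L}^2 = (j+s)(j+s+1), \tag{27}$$
where $j \le N + \frac{1}{2}$ is a half integer and $s = \pm\frac{1}{2}$. With this value we get
$$(\mathbf{X}\mathbf{Y}) = \frac{1}{2}\Big[j(j+1) + s(2j+1) + \frac{1}{4} - \mathbf{X}^2 - \mathbf{Y}^2\Big], \qquad (\sigma\mathbf{X}) + (\sigma\mathbf{Y}) = -s(2j+1) - 1. \tag{28}$$
Thus, the eigenvalue is
$$\frac{\ell^2}{\alpha^2}\,\lambda_j^2 = -\frac{1}{4}\Big(j + \frac{1}{2}\Big)^2\Big[\Big(j + \frac{1}{2}\Big)^2 - 2(\mathbf{X}^2 + \mathbf{Y}^2) - 1\Big] - \frac{1}{4}(\mathbf{X}^2 - \mathbf{Y}^2)^2. \tag{29}$$
If we substitute $\mathbf{X}^2 = \mathbf{Y}^2 = \frac{N}{2}\big(\frac{N}{2} + 1\big)$ we obtain the relation (24). □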
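As a quick consistency check of Eq. (24), the spectrum can be tabulated exactly for a given $N$. The sketch below is illustrative only: the function name `dirac_spectrum` is hypothetical, and the multiplicity counting assumes that $L$ in Eq. (27) takes the values $0, \dots, N$ (the decomposition of $(N+1)\times(N+1)$ matrices under the adjoint action), so that both signs of $s$ occur except at $j = N + \frac{1}{2}$.

```python
from fractions import Fraction

def dirac_spectrum(N):
    """Return a list of (j, lambda_j^2, multiplicity) following Eq. (24).

    Assumption: L = j + s must lie in 0..N, so the branch s = +1/2 is
    absent for the top value j = N + 1/2.
    """
    spectrum = []
    for u in range(1, N + 2):                      # u = j + 1/2 = 1, ..., N + 1
        j = Fraction(2 * u - 1, 2)
        lam2 = Fraction(u * u) * (1 + Fraction(1 - u * u, N * (N + 2)))
        branches = 1 if u == N + 1 else 2          # allowed values of s
        spectrum.append((j, lam2, branches * 2 * u))   # 2j + 1 = 2u states each
    return spectrum

N = 3
for j, lam2, mult in dirac_spectrum(N):
    print(j, lam2, mult)
# Total number of states should equal dim H_N = 2 (N + 1)^2.
print(sum(mult for _, _, mult in dirac_spectrum(N)) == 2 * (N + 1) ** 2)
```

Under these assumptions the multiplicities add up to $2(N+1)^2$, the dimension of $H_N$, and the only vanishing eigenvalue sits at $j = N + \frac{1}{2}$ with multiplicity $2(N+1)$.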
This spectrum coincides with the classical spectrum of the Dirac operator in the limit $N \to \infty$. For finite $N$, it contains zeromodes. When the angular momentum takes its maximal value we see that $\lambda_{N+\frac{1}{2}} = 0$. This happens since there is no chiral pair for the spin $N + \frac{1}{2}$ state and therefore this part must be a zeromode for consistency. We can also confirm this property by computing: $\mathrm{Tr}_H(\gamma_\chi) = 2(N+1)$. Since these zeromodes have no classical analogue, one way to treat them is to project them out from the Hilbert space. On the other hand, the contribution of the zeromodes in the integration is of order $\frac{1}{N}$ and thus their contribution vanishes in the limit $N \to \infty$. Therefore, considering the differential algebra on the fuzzy sphere as a kind of regularization of the differential algebra on the sphere, it is sufficient to take the full Hilbert space $H_N$. In this way we obtain Connes' triple $(A_N, D, H_N)$. We thus can apply the construction of the differential algebra.

2.4. Differential algebra. In this section we construct the differential algebra associated with $(A_N, D, H_N)$ by using Connes' method [9]. See also [27]. We define the universal differential algebra $\Omega^*(A_N)$ over $A_N$. An element $\omega \in \Omega^*(A_N)$ is in general given by
$$\omega = \sum_{\lambda\in I} a_\lambda^{(0)}\, da_\lambda^{(1)}\, da_\lambda^{(2)} \cdots da_\lambda^{(p)}, \tag{30}$$
where $p$ is an integer, $a_\lambda^{(k)} \in A_N$ ($k = 0, \dots, p$) and $I$ is an appropriate set labeling the elements. $da$ is a symbol defined by the operation of the differential $d$ on $a \in A_N$, which satisfies the Leibniz rule $d(ab) = (da)b + a(db)$ for $a, b \in A_N$, and $d1 = 0$ for the identity $1 \in A_N$. We also require $(da)^* = -da^*$. The Leibniz rule provides a natural product among the elements in $\Omega^*(A_N)$ and the differential $d$ on $\Omega^*(A_N)$ is defined by
$$d\Big(\sum_{\lambda\in I} a_\lambda^{(0)}\, da_\lambda^{(1)}\, da_\lambda^{(2)} \cdots da_\lambda^{(p)}\Big) = \sum_{\lambda\in I} da_\lambda^{(0)}\, da_\lambda^{(1)}\, da_\lambda^{(2)} \cdots da_\lambda^{(p)}. \tag{31}$$
Then, it follows $d^2\omega = 0$ and the graded Leibniz rule. In order to define the $p$-forms as operators on $H_N$, a representation $\pi$ is defined by
$$\pi\Big(\sum_{\lambda\in I} a_\lambda^{(0)}\, da_\lambda^{(1)}\, da_\lambda^{(2)} \cdots da_\lambda^{(p)}\Big) = \sum_{\lambda\in I} a_\lambda^{(0)}\,[D, a_\lambda^{(1)}]\,[D, a_\lambda^{(2)}] \cdots [D, a_\lambda^{(p)}]. \tag{32}$$
Recall that $A_N$ is defined as an algebra of operators in $H_N$. Then the graded differential algebra is defined by
$$\Omega_D^*(A_N) = \Omega^*(A_N)/J, \tag{33}$$
where $J = \ker\pi + d\ker\pi$ is the differential ideal of $\Omega^*(A_N)$. In order to establish the differential calculus on the fuzzy sphere, we have to examine the structure of the differential kernel $J$. For this we denote the kernel of each level as
$$\ker\pi^{(p)} \equiv \Omega^p(A_N) \cap \ker\pi; \tag{34}$$
then the differential kernel $J^{(p)}$ for the $p$-form is
$$J^{(p)} = \ker\pi^{(p)} + d\ker\pi^{(p-1)}. \tag{35}$$
Since the elements of the algebra $A_N$ are defined as operators in $H_N$, $\ker\pi^{(0)} = \{0\}$, i.e., $J^{(0)} = \{0\}$. It means that $\Omega_D^0(A_N) = A_N$. The differential kernel of the 1-form is $J^{(1)} = \ker\pi^{(1)} + d\ker\pi^{(0)} = \ker\pi^{(1)}$, and thus for any element $a \in A_N$ the derivative is defined by
$$\pi(da) = [D, a]. \tag{36}$$
The space of 1-forms $\omega \in \Omega_D^1(A_N)$ can be identified with the operators $\pi(\omega)$ in $H_N$:
$$\pi(\Omega_D^1(A_N)) = \Big\{\pi(\omega)\ \Big|\ \pi(\omega) = \sum_{\lambda\in I} a_\lambda [D, b_\lambda];\ a_\lambda, b_\lambda \in A_N\Big\}. \tag{37}$$
Thus, with the above identification, the exterior derivative $d$ defines a map:
$$d:\ A_N \to M_2(\mathbb{C}) \otimes (A_N \otimes A_N^o), \tag{38}$$
where $M_2(\mathbb{C})$ is the algebra of $2 \times 2$ complex matrices. Using the definition of the Dirac operator (15), a 1-form is expressed as follows: Take a 1-form $\pi(\omega) \in \pi(\Omega_D^1(A_N))$ in Eq. (37). Using Eq. (15) we obtain
$$\pi(\omega) = \sum_\lambda \frac{i}{\ell\alpha}\,\gamma_\chi\,\epsilon_{ijk}\,\sigma_i\, x_j^o\, a_\lambda [x_k, b_\lambda] = \frac{i}{\ell}\,\gamma_\chi\,\chi_k^o\,\omega_k, \tag{39}$$
where
$$\chi_k^o \equiv -\epsilon_{ijk}\, x_i^o\, \sigma_j, \tag{40}$$
and the components $\omega_k$ of $\pi(\omega)$ can be rewritten by using the definition (8) of $L$ as:
$$\omega_k \equiv \frac{1}{\alpha}\sum_\lambda a_\lambda [x_k, b_\lambda] = \sum_\lambda a_\lambda (L_k b_\lambda). \tag{41}$$
Here, $\omega_k \in A_N$ may be considered as the component of a vector field. In order to write the gauge field action, we have to define the 2-form. A 2-form $\eta \in \Omega_D^2(A_N)$ can be given in general as
$$\pi(\eta) = \sum_\lambda a_\lambda^{(1)}\,[D, a_\lambda^{(2)}]\,[D, a_\lambda^{(3)}] = \frac{1}{\ell^2}\,\chi_i^o\chi_j^o \sum_\lambda a_\lambda^{(1)}\,(L_i a_\lambda^{(2)})(L_j a_\lambda^{(3)}), \tag{42}$$
where $a_\lambda^{(i)} \in A_N$. Since the 2-form in Eq. (42) is defined up to the differential kernel $\pi(d\ker\pi^{(1)})$, $\pi(\eta)$ contains redundant components. Note that when we perform the calculation, we do not use $\Omega_D^2(A_N)$ itself, but its representation $\pi(\Omega_D^2(A_N))$; thus it is sufficient to compute $\pi(d\ker\pi^{(1)})$, since $\pi(\Omega_D^*(A_N))$ is isomorphic to $\pi(\Omega^*(A_N))/\pi(d\ker\pi)$. The nontrivial contribution of $\pi(d\ker\pi^{(1)})$ is proportional to the traceless part of the symmetric product $\chi^o_{\{i}\chi^o_{j\}}$, as we shall see in the following.
The exterior derivative of a general 1-form $\omega$ defined in Eq. (37) is
$$\pi(d\omega) = \sum_\lambda [D, a_\lambda]\,[D, b_\lambda]. \tag{43}$$
Using the Dirac operator we obtain
$$\pi(d\omega) = \frac{1}{\ell^2}\sum_\lambda \chi_i^o\chi_{i'}^o\Big[L_i(a_\lambda L_{i'} b_\lambda) - \frac{i}{2}\,\epsilon_{ii'k}\,a_\lambda L_k b_\lambda + \frac{2}{3\alpha}\,\delta_{i,i'}\,(a_\lambda L_i b_\lambda)\,x_i - a_\lambda\Big(\frac{1}{2}\{L_i, L_{i'}\} - \frac{1}{3}\,\delta_{ii'}\,L^2\Big) b_\lambda\Big]. \tag{44}$$
The first three terms vanish for $\omega \in \ker\pi^{(1)}$. Only the last term gives a nontrivial contribution for the differential kernel and thus $\pi(d\ker\pi^{(1)})$ is proportional to the symmetric traceless product of $\chi_i^o\chi_{i'}^o$. The proof of the existence of the nontrivial 1-form kernels which contribute to $d\ker\pi^{(1)}$ is given in the appendix. Using the explicit expression $\omega_{p,q} = x_A^p\,dx_A^q$ of $\ker\pi^{(1)}$ obtained in the appendix (see Eq. (87)) we compute $\pi(d\ker\pi^{(1)})$. We find that $d\omega_{p,q}$ gives an element of $d\ker\pi^{(1)}$:

Proposition 3. $\pi(d\omega_{p,q}) \neq 0$, for $p + q = N + 2$ and $p, q > 1$.

Proof.
$$\pi(d\omega_{p,q}) = [D, x_+^p]\,[D, x_+^q] = \frac{-1}{\ell^2}\,\gamma_\chi\chi_+^o\,\gamma_\chi\chi_+^o\,\Big[-2p\,x_+^{p-1}\Big(x_3 + \frac{\alpha}{2}(p-1)\Big)\Big]\Big[-2q\,x_+^{q-1}\Big(x_3 + \frac{\alpha}{2}(q-1)\Big)\Big]$$
$$= \frac{1}{\ell^2}\,\chi_+^o\chi_+^o\,4pq\,x_+^{p+q-2}\Big[x_3^2 + x_3\Big(\frac{\alpha}{2}p + \frac{3\alpha}{2}q - 2\alpha\Big) + \frac{\alpha^2}{4}(p + 2q - 3)(q - 1)\Big]. \tag{45}$$
Using the identity $x_+x_- = \ell^2 + \alpha x_3 - x_3^2$, this expression can be simplified to
$$\pi(d\omega_{p,q}) = 4pq\,\frac{1}{\ell^2}\,\chi_+^o\chi_+^o\,x_+^{p+q-2}\,[A(q)\,x_3 + B(q)], \tag{46}$$
where
$$A(q) = \frac{\alpha}{2}p + \frac{3\alpha}{2}q - \alpha, \tag{47}$$
$$B(q) = \ell^2 + \frac{\alpha^2}{4}(p + 2q - 3)(q - 1) - x_+x_-. \tag{48}$$
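The commutator identities entering this computation can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it assumes $x_\pm = x_1 \pm i x_2$ and represents $x_\pm$, $x_3$ as $\alpha$ times the spin-$N/2$ ladder and diagonal matrices; the function names are hypothetical. It tests $x_+x_- = \ell^2 + \alpha x_3 - x_3^2$, the relation $L_- x_+^p = -2p\,x_+^{p-1}\big(x_3 + \frac{\alpha}{2}(p-1)\big)$ with $L_- a = \frac{1}{\alpha}[x_-, a]$, and the fact that $x_+^N \neq 0$ while $x_+^{N+1} = 0$.

```python
from math import sqrt

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(a, A, b, B):
    """Entrywise a*A + b*B."""
    return [[a * p + b * q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(dim):
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]

def matpow(A, p):
    R = identity(len(A))
    for _ in range(p):
        R = matmul(R, A)
    return R

def close(A, B, tol=1e-9):
    return all(abs(p - q) < tol for ra, rb in zip(A, B) for p, q in zip(ra, rb))

def ladder_generators(N, ell=1.0):
    """Return (alpha, x_plus, x_minus, x_3); assumes the spin-N/2 representation."""
    dim, spin = N + 1, N / 2.0
    alpha = 2.0 * ell / sqrt(N * (N + 2))
    m = [spin - i for i in range(dim)]
    x3 = [[alpha * m[i] if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    xp = [[0.0] * dim for _ in range(dim)]
    for i in range(1, dim):
        xp[i - 1][i] = alpha * sqrt(spin * (spin + 1) - m[i] * (m[i] + 1))
    xm = [[xp[j][i] for j in range(dim)] for i in range(dim)]
    return alpha, xp, xm, x3

def check_prop3_identities(N, ell=1.0):
    alpha, xp, xm, x3 = ladder_generators(N, ell)
    one = identity(N + 1)
    # x_+ x_- = ell^2 + alpha x_3 - x_3^2
    rhs = lincomb(1, lincomb(ell ** 2, one, alpha, x3), -1, matmul(x3, x3))
    ok = close(matmul(xp, xm), rhs)
    for p in range(1, N + 2):
        P = matpow(xp, p)
        # L_- x_+^p = (1/alpha) [x_-, x_+^p]
        lhs = lincomb(1 / alpha, matmul(xm, P), -1 / alpha, matmul(P, xm))
        shifted = lincomb(1, x3, alpha * (p - 1) / 2, one)
        expected = lincomb(-2 * p, matmul(matpow(xp, p - 1), shifted), 0, one)
        ok = ok and close(lhs, expected)
    top = matpow(xp, N)
    nonzero_top = any(abs(e) > 1e-12 for row in top for e in row)
    vanishing = close(matpow(xp, N + 1), lincomb(0, one, 0, one))
    return ok and nonzero_top and vanishing

print(all(check_prop3_identities(N) for N in range(1, 6)))
```

The last two checks illustrate why $p + q = N + 2$ is the borderline case: the factor $x_+^{p+q-2} = x_+^N$ in Eq. (46) is the highest nonvanishing power of $x_+$.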
This means that $\pi(d\omega_{p,q})$ does not vanish for $p + q = N + 2$ although $\omega_{p,q} \in \ker\pi^{(1)}$ for $p + q = N + 2$. □

The result of Proposition 3 and the corresponding contribution from the kernels of the other directions show that there exist nontrivial differential kernel elements $\pi(d\ker\pi^{(1)})$ proportional to the symmetric traceless product of $\chi_i^o\chi_j^o$. With this result we can prove the following proposition.
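Proposition 4.
$$\pi(d\ker\pi^{(1)}) = \Big\{\Lambda\ \Big|\ \Lambda = \frac{1}{\ell^2}\,\chi_i^o\chi_j^o\,a_{ij},\ \text{where } a_{ij} \in A_N,\ a_{ij} = a_{ji}\ \text{and}\ \sum_{i=1}^{3} a_{ii} = 0\Big\}. \tag{49}$$
Proof. Using Proposition 3, we obtain a nontrivial element by multiplying $a'_\lambda, b'_\lambda \in A_N$ and
$$\pi\Big(d\Big(\sum_\lambda a'_\lambda\,\omega_{p,q}\,b'_\lambda\Big)\Big) = \pi\Big(\sum_\lambda a'_\lambda\,(d\omega_{p,q})\,b'_\lambda\Big) = \frac{1}{\ell^2}\,\chi_+^o\chi_+^o\sum_\lambda a'_\lambda\,(L_- x_+^p)(L_- x_+^q)\,b'_\lambda = 4pq\,\frac{1}{\ell^2}\,\chi_+^o\chi_+^o\sum_\lambda a'_\lambda\,x_+^N\,[A(q)\,x_3 + B(q)]\,b'_\lambda, \tag{50}$$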
where we have used $p + q = N + 2$. Choosing appropriate elements $a'_\lambda, b'_\lambda \in A_N$, the factor $\sum_\lambda a'_\lambda\,x_+^N\,[A(q)\,x_3 + B(q)]\,b'_\lambda$ can become any element in $A_N$. We have six independent directions for $\omega_{p,q}$ and combining the results from them we get the traceless symmetric combinations of suffices $i, i'$ in $\chi_i^o\chi_{i'}^o$. □

Identifying $\Omega_D^2(A_N)$ with its representation $\pi(\Omega_D^2(A_N))$, a general 2-form $\tilde\eta \in \pi(\Omega_D^2(A_N))$ is given by
$$\tilde\eta = \sum_\lambda a_\lambda^{(1)}\,[D, a_\lambda^{(2)}]\,[D, a_\lambda^{(3)}], \tag{51}$$
where $a_\lambda^{(i)} \in A_N$, up to $\pi(d\ker\pi^{(1)})$. Combining Eqs. (43) and (44) we can compute the operation of the derivative $d$ on a general 1-form in Eq. (37) and we obtain
$$\pi(d\omega) = \frac{1}{2\ell^2}\,\chi_k^o\chi_{k'}^o\Big[\{L_k\omega_{k'} - L_{k'}\omega_k\} - i\epsilon_{kk'k''}\,\omega_{k''} + \frac{2}{3\alpha}\,\delta_{kk'}\,[x_i\omega_i + \omega_i x_i]\Big], \tag{52}$$
where we have used the definition of the components $\omega_k$ in Eq. (39). Since the trace part does not belong to the differential kernel, the last term in the above equation is not removed by dividing differential kernels. We continue here our construction of the gauge field action with this definition of the differential algebra and we shall obtain a kind of mass term in the gauge theory. The commutative limit $\alpha \to 0$ becomes singular, as can be seen from Eq. (39). However, as we discuss in the following, we can still interpret the resulting theory as a regularization of the corresponding commutative theory.

An alternative strategy to the one taken here would be to restrict the above defined 2-form. With the above 2-form as it stands, the naive commutative limit does not give the standard differential calculus. One possibility to handle this situation is to use the property of the trace: $\chi_i^o\chi_i^o = 2\mathcal{N}^2 - \alpha\mathcal{N}\gamma_\chi$, with $\mathcal{N}$ the normalization constant of the chirality operator. It turns out that the trace part $J_T$ is an ideal of $\pi(\Omega^2(A_N))$. Furthermore, in each $p$-form space $\pi(\Omega^p(A_N))$, the set $J_T\,\pi(\Omega^{p-2}(A_N)) \cup \pi(\Omega^{p-2}(A_N))\,J_T$ is an ideal and thus there is a possibility to divide the differential algebra so that we can take the commutative limit and obtain the standard differential calculus. This procedure will be discussed elsewhere.
3. $U(1)$ Gauge Field Theory

3.1. Vector field. Using the geometric notions defined in the previous sections, we formulate the $U(1)$ gauge theory on the fuzzy sphere. We identify the differential algebra $\Omega_D^*(A_N)$ with its representation $\pi(\Omega_D^*(A_N))$ and do not write the map $\pi$ explicitly. First, to formulate the gauge field theory we define the real vector field $A$ which is a 1-form on the fuzzy sphere. We impose the reality condition for this 1-form by
$$A^\dagger = A. \tag{53}$$
Using the general definition of a 1-form, $A$ can be written as$^3$
$$A = \sum_\lambda a_\lambda\,[D, b_\lambda], \tag{54}$$
where $a_\lambda, b_\lambda \in A_N$ are appropriate elements. According to the general discussion about 1-forms in the previous section we can write
$$A = \frac{i}{\ell}\,\gamma_\chi\,\chi_k^o\,A_k, \tag{55}$$
where $A_k$ is the component field of $A$ given by
$$A_k = \sum_\lambda a_\lambda\,(L_k b_\lambda). \tag{56}$$
For the component field the reality condition gives
$$A_k^* = A_k. \tag{57}$$
Thus each component of the gauge field is represented by an $(N+1) \times (N+1)$ hermitian matrix. Note that, in the commutative case, the 1-form satisfies the constraint $x_i A_i = 0$, which shows the reduction of the degrees of freedom. However, in the noncommutative case the 1-form defined by Eq. (56) does not in general satisfy a similar constraint on $A_k$. Further discussion on the treatment of this property is given in Sect. 4. In the remaining part, let us push forward the construction of the gauge theory on the noncommutative sphere. In the commutative case, we obtain the field strength of the $U(1)$ gauge theory by taking the exterior derivative of the 1-form. In the noncommutative case, the exterior derivative gives
$$dA = \sum_\lambda [D, a_\lambda]\,[D, b_\lambda]. \tag{58}$$
Applying the result of the previous section we obtain
$$dA = \frac{i}{2\ell^2}\,\chi_k^o\chi_{k'}^o\,F_{kk'}, \tag{59}$$

$^3$ The hermiticity condition requires the form $A = \sum_\rho a_\rho[D, b_\rho] + b_\rho^*[D, a_\rho^*] - \frac{1}{2}[D, a_\rho b_\rho + b_\rho^* a_\rho^*]$. This can be again written in the form (54).
406
U. Carow-Watamura, S. Watamura
with Fkk 0 = −i{Lk Ak 0 − Lk 0 Ak } − kk 0 k 00 Ak 00 − iδkk 0
2 [Ai xi + xi Ai ]. 3α
(60)
We can show that the above $F_{kk'}$ corresponds to the field strength for the abelian gauge field in the commutative limit. To see this, we use the following correspondence which holds in the commutative limit:

$$A_k = K_k^\mu A_\mu \quad\text{and}\quad L_k = iK_k. \tag{61}$$

Here $A_\mu$ is a gauge field and $K_k^\mu$ ($k = 1,2,3$, $\mu = 1,2$) is the Killing vector on the sphere with appropriate coordinates $\rho^\mu$, and $K_k = K_k^\mu\partial_\mu$. With the above identification we get

$$F_{kk'} = K_k^\mu K_{k'}^\nu F_{\mu\nu}, \tag{62}$$
where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Here we have used the relation $A_ix_i = 0$ which holds in the commutative case. In the noncommutative case, however, the exterior derivative of the 1-form $dA$ does not give the field strength.

3.2. U(1) Gauge transformation. For the formulation of the U(1) gauge theory on the fuzzy sphere, let us consider the U(1) gauge transformation of a charged scalar field, i.e., a complex scalar field [25]. The algebraic object corresponding to the complex scalar field on the fuzzy sphere is the $A_N$-bimodule $\Phi \in A_N$. Its action is given by

$$S = \frac{1}{2(N+1)^2}\,\mathrm{Tr}_H\{(d\Phi)^\dagger d\Phi\}. \tag{63}$$

Apparently, the above action is invariant under the global U(1) transformation of the phase

$$\Phi' = e^{i\phi}\Phi. \tag{64}$$
Following the standard approach, the local U(1) gauge transformation can be defined if we let the phase $e^{i\phi}$ be a function on the fuzzy sphere. In the present algebraic formulation this means we multiply an element $u \in A_N$ on the field $\Phi$, where unitarity is implemented by

$$u^*u = 1. \tag{65}$$

When we generalize the transformation, we may take either left or right multiplication of $u$ on the field $\Phi$ due to the ordering ambiguity. Here we take the left multiplication as the U(1) gauge transformation for $\Phi$:

$$\Phi' = u\Phi. \tag{66}$$

The transformation of the conjugate field $\bar\Phi = \Phi^*$ is given by $\bar\Phi' = \bar\Phi u^*$. Since the algebra $A_N$ is isomorphic to the algebra of $(N+1)\times(N+1)$ matrices, the condition (65) shows that, as a matrix, $u$ is an element of $U(N+1)$. In other words, the local U(1) gauge transformation on the fuzzy sphere in matrix representation is defined as the left $U(N+1)$ transformation.
Therefore, we define the covariant derivative $\nabla_A$ as

$$\nabla_A\Phi = d\Phi + A\Phi. \tag{67}$$

Then the gauge transformation of the gauge field can be defined by requiring the covariance of $\nabla_A\Phi$:

$$\nabla_{A'}(u\Phi) = u\nabla_A\Phi. \tag{68}$$

This defines the standard form of the gauge transformation

$$A' = u\,du^* + uAu^*. \tag{69}$$

In components it reads

$$A'_k = u(L_ku^*) + uA_ku^*. \tag{70}$$
The above transformation keeps the hermiticity condition (53) and may be interpreted as the transformation of the $U(N+1)$ gauge theory on a one-point space, and thus the covariant field strength is given by the standard curvature form [9]

$$\Theta = dA + AA. \tag{71}$$

In components the curvature 2-form is

$$\Theta = \frac{-i}{2\ell^2}\,\chi^o_k\chi^o_{k'}\Theta_{kk'}, \tag{72}$$

where the component of the field strength is

$$\Theta_{kk'} = i\{L_kA_{k'} - L_{k'}A_k\} + \epsilon_{kk'k''}A_{k''} + i[A_k, A_{k'}] + \frac{2i}{3\alpha}\,\delta_{kk'}[A_ix_i + x_iA_i + \alpha A_iA_i]. \tag{73}$$
3.3. The action of gauge field and matter. With the above results we define the noncommutative analogue of the gauge invariant action. The action of the charged scalar is

$$S_M = \frac{1}{2(N+1)^2}\,\mathrm{Tr}_H\{(\nabla_A\Phi)^\dagger\nabla_A\Phi\}. \tag{74}$$

The action of the gauge field is given by

$$S_G \equiv \frac{1}{2(N+1)^2}\,\mathrm{Tr}_H\{\Theta^2\}. \tag{75}$$

Both actions are invariant under the local U(1) gauge transformation. Thus, combining these two actions, we obtain the action of the U(1) gauge theory with scalar matter on the fuzzy sphere. Note that we may introduce the gauge coupling constant $g$ by rescaling the gauge field $A$ to $gA$. In order to see the detailed structure of the above actions, we take a part of the trace. We perform the trace relating to the opposite algebra and the spin suffixes. Then we obtain the action which contains only the fields $A$ and $\Phi$, and the trace of this action is taken over the Hilbert space $F_N$.
Then the matter action (74) can be reduced as

$$S_M = \frac{2}{3(N+1)}\,\mathrm{Tr}_F\{(L_i\Phi + A_i\Phi)^*(L_i\Phi + A_i\Phi)\}. \tag{76}$$

Similarly, the gauge field action (75) is reduced to

$$S_G = \frac{C_A}{(N+1)}\,\mathrm{Tr}_F\{\Theta^A_{ii'}\Theta^A_{ii'}\} + \frac{C_S}{(N+1)}\,\mathrm{Tr}_F\{\Theta^S_{ii'}\Theta^S_{ii'}\}, \tag{77}$$
where N 2 n −α 2 2o , + 2`2 3N 2 3 1 CS = 1 + N (N + 2)
CA =
(78)
and $\Theta^S_{ij}$ ($\Theta^A_{ij}$) is the (anti)symmetric part of the field strength given in Eq. (73). Since the trace over the Hilbert space $F_N$ corresponds to the volume integration in the commutative limit, the actions $S_M$ and $S_G$ given in Eqs. (76) and (77), respectively, should correspond to the standard action on the sphere in the limit $N \to \infty$. Apparently the $\Theta^S$ in the gauge action does not have a classical correspondence. Furthermore, as we see below, this term is singular in the naive $N \to \infty$ limit. This is unavoidable since our differential algebra is singular in this limit. However, under certain conditions we may consider the above action as a regularized theory of the commutative case as follows. The symmetric part of the action is

$$(\Theta^S)^2 \sim \frac{1}{\alpha^2}\big[(A_ix_i + x_iA_i) + \alpha A_iA_i\big]^2. \tag{79}$$
The above combination is gauge invariant under the gauge transformation given in Eq. (69). This term can be understood as the gauge invariant mass term of the radial component of the gauge field. Thus, physically we can understand the effect of the symmetric part as follows: when we consider the quantization of the above regularized theory using the path integral which respects the gauge symmetry, then in the $\alpha \to 0$ limit the symmetric term behaves like a (gauge invariant) delta function which drops the radial component. Furthermore, from the point of view of gauge theory it is not necessary to take $\Theta^2$ as an action. Instead, we can simply take any linear combination of the gauge invariant terms. This means that we can take $C_A$ and $C_S$ as independent parameters. Thus, we obtain in general the following action for the gauge field:

$$S = \frac{1}{(N+1)}\,\mathrm{Tr}_F\{C_1G_{kk'}G_{kk'} + C_2G_0^2\}, \tag{80}$$

where $C_1$ and $C_2$ are c-numbers and

$$G_{kk'} = iL_kA_{k'} - iL_{k'}A_k + \epsilon_{kk'k''}A_{k''} + i[A_k, A_{k'}], \qquad G_0 = x_iA_i + A_ix_i + \alpha A_iA_i. \tag{81}$$

The above action (77) is a special case of the general form given here.
4. Discussions and Conclusion

In this paper we have formulated the U(1) gauge theory on the fuzzy sphere, following Connes' framework of noncommutative differential geometry. The differential algebra on the fuzzy sphere has been constructed by applying the chirality operator and Dirac operator proposed in ref. [25]. This chirality operator anticommutes with the Dirac operator and the structure of the differential algebra becomes simple. Then we analyzed the structure of the 1-forms and 2-forms which are necessary to construct the gauge field action. In ref. [25], the action of a complex scalar field on the fuzzy sphere which is invariant under the global U(1) transformation of the phase of the complex scalar field has been formulated. Here, the local U(1) gauge transformation on the fuzzy sphere is introduced by making the global phase transformation into a local transformation, i.e. the phase becomes a function over the fuzzy sphere. By construction, a function over the fuzzy sphere is simply given by elements of the algebra $A_N$. Thus, the local U(1) gauge transformation is defined by multiplication of an element $u \in A_N$, satisfying unitarity $u^*u = 1$. Since the algebra $A_N$ is noncommutative, there is an ambiguity of operator ordering when replacing the global phase by the algebra elements $u$. We have chosen here the left multiplication. Thus, when we represent the algebra $A_N$ by matrices, the local U(1) gauge transformation is identified with the left transformation by a unitary $(N+1)\times(N+1)$ matrix. Therefore, the gauge field action is analogous to the Yang-Mills action. Once we know the Dirac operator, the construction of the differential calculus is rather straightforward. However, as we have seen when defining the 1-forms, their components $A_i$ do not satisfy $A_ix_i = 0$ in general. In the commutative case this relation holds since the Killing vector is perpendicular to the normal direction of the sphere.
However, in the noncommutative case $x_iL_i$ is not necessarily zero. Since the relation⁴

$$x_iA_i + A_ix_i = \frac{1}{\alpha}[x_i, a][x_i, b] \tag{82}$$

holds, this property is related with the trace part of the 2-form as follows. As we have seen in the construction of the 2-forms performed here, the differential kernel $\pi^{(2)}$ does not contain a trace part, i.e., the part proportional to $\chi^o_i\chi^o_i$. In the course of deriving Eq. (44) we get $[x_i, a][x_i, b]$ as a coefficient of $\chi^o_i\chi^o_i$. Up to the kernel condition this product of commutators is equivalent to $aL^2b$. The reason why the trace part drops from the differential kernel is due to the relation $aL^2b = -\frac{2}{\alpha}\,a(L_ib)x_i$. This relation is a direct consequence of the condition that $\ell^2$ is central. This type of problem relating to the reduction of degrees of freedom as well as to the structure of the differential kernel is a rather general feature when defining the differential forms by the adjoint action $L_i$.⁵ Thus, in the noncommutative case the construction gives 1-forms which have three independent components. One possibility to drop the trace part (which is proportional to the third component) in the present approach has been indicated in Sect. 2.4. On the other hand, although the 2-form is singular in the $N \to \infty$ limit, the action given in Eq. (77) still allows the interpretation as a regularized theory of the gauge theory on the sphere.

⁴ This relation follows from Definition (56).

⁵ The structure of the Dirac operator depends on the choice of the fermion, but on the other hand if the Dirac operator has the form $\theta^ix_i$, and if $\theta^i$ commutes with $x_i$, where $\theta^i \sim \gamma_\chi\chi^o_i$ in our case, then the derivative $d$ is always given by $da = \theta^i(L_ia)$ with $L_i$ being the adjoint action.
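The two algebraic relations used in this paragraph, Eq. (82) and $L^2b = -\frac{2}{\alpha}(L_ib)x_i$, can be checked numerically in a small matrix representation. The following is only a sketch under assumptions that are not spelled out in this section: the coordinates are taken as $x_i = \alpha J_i$ with $J_i$ the spin-$N/2$ matrices, so that $[x_i, x_j] = i\alpha\epsilon_{ijk}x_k$, the adjoint action is $L_ia = [x_i, a]/\alpha$, and a single pair $(a, b)$ of random matrices stands in for the sum over $\lambda$ in Eq. (56). All function names are hypothetical.

```python
import math
import random

def fuzzy_xyz(N, alpha):
    """Assumed representation: x_i = alpha * J_i, with J_i the spin-N/2 matrices."""
    d, spin = N + 1, N / 2.0
    jp = [[0j] * d for _ in range(d)]          # raising operator J_+
    x3 = [[0j] * d for _ in range(d)]
    for r in range(d):
        m = spin - r                           # J_3 eigenvalue of basis state r
        x3[r][r] = complex(alpha * m)
        if r > 0:                              # J_+ maps state r (m) to state r-1 (m+1)
            jp[r - 1][r] = complex(math.sqrt(spin * (spin + 1) - m * (m + 1)))
    # J_- is the transpose of J_+, so J_1 = (J_+ + J_-)/2 and J_2 = (J_+ - J_-)/(2i)
    x1 = [[alpha * (jp[r][c] + jp[c][r]) / 2 for c in range(d)] for r in range(d)]
    x2 = [[alpha * (jp[r][c] - jp[c][r]) / 2j for c in range(d)] for r in range(d)]
    return [x1, x2, x3]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lin(a, b, s):
    """Return a + s*b entrywise (a new matrix; inputs are not modified)."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    return lin(mul(a, b), mul(b, a), -1)

def norm(a):
    return max(abs(x) for row in a for x in row)

def residuals(N=3, alpha=0.5, seed=0):
    """Max-norm residuals of Eq. (82) and of L^2 b = -(2/alpha)(L_i b) x_i."""
    rng = random.Random(seed)
    d = N + 1
    rand = lambda: [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(d)]
                    for _ in range(d)]
    a, b = rand(), rand()
    zero = [[0j] * d for _ in range(d)]
    lhs82, rhs82, l2b, rhs_l2 = zero, zero, zero, zero
    for x in fuzzy_xyz(N, alpha):
        Lb = lin(zero, comm(x, b), 1 / alpha)              # L_i b
        A = mul(a, Lb)                                     # A_i = a (L_i b), one term of Eq. (56)
        lhs82 = lin(lhs82, lin(mul(x, A), mul(A, x), 1), 1)
        rhs82 = lin(rhs82, mul(comm(x, a), comm(x, b)), 1 / alpha)
        l2b = lin(l2b, comm(x, Lb), 1 / alpha)             # L_i L_i b
        rhs_l2 = lin(rhs_l2, mul(Lb, x), -2 / alpha)
    return norm(lin(lhs82, rhs82, -1)), norm(lin(l2b, rhs_l2, -1))
```

Both residuals should vanish up to rounding, because the only input is that $x_ix_i$ is central; the check does not depend on the particular random $a$ and $b$.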
It is easy to check that both terms in the action Eq. (77) are invariant under the gauge transformation (70). Thus, the most general gauge action can be written as in Eq. (80). The first term corresponds to the standard gauge action in the commutative limit. This term is usually taken as the action for the gauge field on the fuzzy sphere. The second term approaches simply $(2x_iA_i)^2$ in this limit. As we mentioned, the symmetric part of the action can be understood as a gauge invariant mass for the radial component of the gauge field. Furthermore, in the action (75), this mass diverges in the limit $N \to \infty$ and can be treated as a delta function constraint under the path integral. Thus by taking a limit which respects the gauge symmetry, the freedom corresponding to $x_iA_i + A_ix_i + \alpha A^2$ is frozen and thus effectively drops from the theory. Since in this limit this procedure is equivalent to the constraint $x_iA_i = 0$, it reduces the freedom of the vector potential in the commutative theory properly. From the point of view of constructing a gauge theory on the fuzzy sphere, we have an even simpler choice to treat the degrees of freedom of the theory. If we require only the gauge invariance under the gauge transformation (69), we can take the symmetric term as a constraint for the gauge field from the beginning. Then the action contains only the antisymmetric part, i.e., $C_2 = 0$ in Eq. (80), and the gauge field is constrained by

$$G_0 = (A_ix_i + x_iA_i) + \alpha A_iA_i = 0. \tag{83}$$
Then in this construction, the gauge field has the correct degrees of freedom, even in the noncommutative case. Apparently, this theory also gives the correct commutative limit. To complete our discussion, we want to mention that the use of the constraint $G_0 = 0$ to restrict the differential calculus is not straightforward, since $dG_0$ does not automatically vanish. The treatment of this constraint within the differential calculus needs more investigation. The fuzzy sphere is one of the easiest examples of a noncommutative space. We can consider the U(1) gauge theory on the fuzzy sphere formulated in this paper as a regularized version of a gauge theory on the sphere. The gauge theory on the noncommutative sphere is also investigated in ref. [28]. The differential calculus there is based on the supersymmetric fuzzy sphere and the structure of the fermion is different from the one discussed here. Thus the structure of the differential algebra is also different. However, this is not a contradiction since, in principle, there are many types of differential algebra associated with the fuzzy sphere algebra, depending on the choice of the spectral triple. In the formulation given here we can also see an interesting analogue with the M(atrix) theory. If we introduce a new field

$$\nabla_i = \frac{1}{\alpha}\,x_i + A_i, \tag{84}$$

then the field strength $\Theta^A_{ij}$ is given by

$$\Theta^A_{ij} = i[\nabla_i, \nabla_j] - \epsilon_{ijk}\nabla_k. \tag{85}$$
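Using the same replacement for the symmetric part, the action is

$$S_G = \frac{1}{(N+1)}\,\mathrm{Tr}_F\Big\{C_1\big(i[\nabla_i, \nabla_j] - \epsilon_{ijk}\nabla_k\big)^2 + C_2\Big(\nabla_i\nabla_i - \frac{\ell^2}{\alpha^2}\Big)^2\Big\}. \tag{86}$$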
After rewriting the gauge field action in the above form, we can make the following reinterpretation: There is a general theory defined by the matrix ∇i and the action (86).
The geometry of the base space is then defined by the vacuum expectation value of the field $\nabla_i$ given by $\langle\nabla_i\rangle = x_i/\alpha$. Then the original gauge field action can be obtained by expanding the field around this vacuum expectation value.

Acknowledgement. The authors would like to thank H. Ishikawa for helpful discussions. This work is supported by the Grant-in-Aid of Monbusho (the Japanese Ministry of Education, Science, Sports and Culture) #09640331.
5. Appendix

5.1. One form kernels. We show the existence of nontrivial elements of $J^{(1)}$ which contribute to $\pi(d\ker\pi^{(1)})$. Consider the 1-forms

$$(x_A)^p\,d(x_B)^q \tag{87}$$

with $A, B = +, -, 3$, where we have used the coordinates $x_\pm = x_1 \pm ix_2$. Since $A_N$ is the algebra of $(N+1)\times(N+1)$ matrices, corresponding to the $(N+1)$-dimensional representation of the algebra of the angular momentum up to the normalization, the identity $(x_\pm)^{N+1} = 0$ holds, and thus one easily finds that elements of the differential kernel appear for $A = B = \pm$. Here, we give the proof for $A = B = +$. (The proof for $A = B = -$ works correspondingly.) Let us define the 1-forms $\omega_{p,q}$ as

$$\omega_{p,q} = x_+^p\,dx_+^q. \tag{88}$$

Then the following proposition holds.

Proposition 5. $\omega_{p,q}$ is an element of $\ker\pi^{(1)}$, for integers $p, q$ satisfying $1 < p, q < N+1$ and $p + q \ge N+2$.

Proof. Using the Dirac operator given in (15) we obtain

$$\pi(\omega_{p,q}) = x_+^p[D, x_+^q] = \frac{i}{\ell}\sum_{A=+,3,-}\gamma_\chi\chi^o_A\,x_+^pL_Ax_+^q. \tag{89}$$

A straightforward calculation yields

$$L_+x_+^q = 0, \qquad L_3x_+^q = qx_+^q, \qquad L_-x_+^q = -2qx_+^{q-1}\Big(x_3 + \frac{\alpha}{2}(q-1)\Big). \tag{90}$$

Substituting the above relations, and using that $x_+^{N+1} = 0$, the r.h.s. of Eq. (89) vanishes. □
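The relations in Eq. (90), together with the nilpotency $x_+^{N+1} = 0$, are easy to confirm numerically for small $N$. The sketch below assumes a concrete realisation that the appendix only alludes to: $x_i = \alpha J_i$ with $J_i$ the spin-$N/2$ matrices, $L_Aa = [x_A, a]/\alpha$, and $L_\pm = L_1 \pm iL_2$. The function names are illustrative and not taken from the paper.

```python
import math

def fuzzy_ladder(N, alpha):
    """x_+, x_-, x_3 as real (N+1)x(N+1) nested lists, assuming x_i = alpha * J_i."""
    d, spin = N + 1, N / 2.0
    xp = [[0.0] * d for _ in range(d)]
    x3 = [[0.0] * d for _ in range(d)]
    for r in range(d):
        m = spin - r                          # J_3 eigenvalue of basis state r
        x3[r][r] = alpha * m
        if r > 0:                             # J_+ maps state r (m) to state r-1 (m+1)
            xp[r - 1][r] = alpha * math.sqrt(spin * (spin + 1) - m * (m + 1))
    xm = [[xp[c][r] for c in range(d)] for r in range(d)]   # x_- is the transpose of x_+
    return xp, xm, x3

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lin(a, b, s):
    """Return a + s*b entrywise."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def power(a, n):
    out = [[float(i == j) for j in range(len(a))] for i in range(len(a))]
    for _ in range(n):
        out = mul(out, a)
    return out

def norm(a):
    return max(abs(x) for row in a for x in row)

def check_eq_90(N, alpha, tol=1e-9):
    """True if x_+^{N+1} = 0, x_+^N != 0 and Eq. (90) holds for q = 1, ..., N."""
    xp, xm, x3 = fuzzy_ladder(N, alpha)
    ident = power(xp, 0)
    ok = norm(power(xp, N + 1)) < tol and norm(power(xp, N)) > tol
    for q in range(1, N + 1):
        xq = power(xp, q)
        l3 = lin(mul(x3, xq), mul(xq, x3), -1.0)      # [x_3, x_+^q] = alpha * L_3 x_+^q
        lm = lin(mul(xm, xq), mul(xq, xm), -1.0)      # [x_-, x_+^q] = alpha * L_- x_+^q
        rhs = mul(power(xp, q - 1), lin(x3, ident, alpha * (q - 1) / 2.0))
        ok = ok and norm(lin(l3, xq, -alpha * q)) < tol          # L_3 x_+^q = q x_+^q
        ok = ok and norm(lin(lm, rhs, 2.0 * q * alpha)) < tol    # L_- x_+^q = -2q x_+^{q-1}(...)
    return ok
```

With these relations in hand, the vanishing of Eq. (89) for $p + q \ge N+2$ follows because every surviving term carries at least $N+1$ powers of $x_+$.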
Note that there are six different elements $x_\lambda$ which correspond to the raising (lowering) operators of the three different directions $x_\lambda = x_j \pm ix_k$, where $j < k$ and $j, k \in \{1, 2, 3\}$, satisfying

$$(x_\lambda)^{N+1} = 0. \tag{91}$$

For each direction $x_\lambda$ we can obtain kernels of the type $\omega_{p,q} = x_\lambda^p\,dx_\lambda^q$.⁶ These one forms, as well as all one forms obtained by multiplying elements $a \in A_N$ onto them, belong to the kernel $J^{(1)} = \ker\pi^{(1)}$. We may still find other elements of $\ker\pi^{(1)}$. However, the above kernel $\omega_{p,q}$ is sufficient to prove that $\pi(d\ker\pi^{(1)})$ is not empty and contains the symmetric traceless part of $\chi^o_i\chi^o_j$.

⁶ In fact we have a whole "tower" of kernels

$$\prod_{k=0}^{m}\Big(x_3 - \frac{(N-2k)}{2}\,\alpha\Big)\omega_{p,q} \in \ker\pi^{(1)}, \quad\text{for } p + q \ge N+1-m,\ m = 0, \dots, N-1, \tag{92}$$

since $\prod_{k=0}^{m}\big(x_3 - \frac{(N-2k)}{2}\alpha\big)x_+^{N-m} = 0$. However the above kernel $\omega_{p,q}$ is enough for the following discussions.

References

1. Banks, T., Fischler, W., Shenker, S.H., Susskind, L.: M theory as a matrix model. Nucl. Phys. B 497, 41–55 (1997)
2. Ishibashi, N., Kawai, H., Kitazawa, Y., Tsuchiya, A.: A large-N reduced model as superstring. Nucl. Phys. B 498, 467–491 (1997)
3. Connes, A., Lott, J.: Nucl. Phys. B 18 (Proc. Suppl.) 29 (1990); see also chapter VI of ref. [9]
4. Berezin, F.A.: Quantization. Math. USSR Izvestija 8, 1109–1165 (1974)
5. Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 40, 153–174 (1975)
6. Bordemann, M., Hoppe, J., Schaller, P., Schlichenmaier, M.: gl(∞) and Geometric Quantization. Commun. Math. Phys. 138, 209–244 (1991)
7. Bordemann, M., Meinrenken, E., Schlichenmaier, M.: Toeplitz Quantization of Kähler Manifolds and gl(N), N → ∞ Limits. Commun. Math. Phys. 165, 281–296 (1994)
8. Coburn, L.A.: Deformation Estimates for the Berezin-Toeplitz Quantization. Commun. Math. Phys. 149, 415–424 (1992)
9. Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
10. Klimek, S., Lesniewski, A.: Quantum Riemann Surfaces I. The Unit Disk. Commun. Math. Phys. 146, 103 (1992)
11. Cahen, M., Gutt, S., Rawnsley, J.: Quantization of Kähler Manifolds II. Transactions of the American Math. Soc. 337, 73–98 (1993)
12. Hoppe, J.: Quantum Theory of a Massless Relativistic Surface and a Two-Dimensional Bound State Problem. PhD Thesis, MIT (1982), published in Soryushiron Kenkyu (Kyoto) Vol. 80, 145–202 (1989)
13. de Wit, B., Hoppe, J., Nicolai, H.: On the Quantum Mechanics of Supermembranes. Nucl. Phys. B 305, 545–581 (1988)
14. Bargmann, V.: On a Hilbert Space of Analytic Functions and an Associated Integral Transform, Part I. Comm. Pure Appl. Math. 14, 187–214 (1961)
15. Perelomov, A.M.: Coherent states for arbitrary Lie groups. Commun. Math. Phys. 26, 222 (1972); Generalized Coherent States and their Application. Berlin–Heidelberg–New York: Springer Verlag, 1986
16. Madore, J.: The fuzzy sphere. Class. Quant. Grav. 9, 69–87 (1992)
17. Grosse, H., Presnajder, P.: The Dirac Operator on the Fuzzy Sphere. Lett. Math. Phys. 33, 171–181 (1995)
18. Grosse, H., Klimčík, C., Presnajder, P.: Towards a finite Quantum Field Theory in Noncomm. Geometry. Int. J. Theor. Phys. 35, 231–244 (1996); Field Theory on a Supersymmetric Lattice. Commun. Math. Phys. 185, 155–175 (1997); Topological Nontrivial Field Configurations in Noncommutative Geometry. Commun. Math. Phys. 178, 507–526 (1996)
19. Haldane, F.D.M.: Fractional Quantization of the Hall Effect: A Hierarchy of Incompressible Quantum Fluid States. Phys. Rev. Lett. 51, 605 (1983)
20. Fano, G., Ortolani, F., Colombo, E.: Configuration-interaction calculations on the fractional quantum Hall effect. Phys. Rev. B 34, 2670–2680 (1986)
21. Dubois-Violette, M., Kerner, R., Madore, J.: Gauge Bosons in a Noncommutative Geometry. Phys. Lett. B 217, 485–488 (1989)
22. Dubois-Violette, M., Kerner, R., Madore, J.: Noncommutative differential geometry of matrix algebras. J. Math. Phys. 31, 316 (1990)
23. Dubois-Violette, M., Kerner, R., Madore, J.: Noncommutative differential geometry and new models of gauge theory. J. Math. Phys. 31, 323 (1990)
24. Dubois-Violette, M., Madore, J., Kerner, R.: Super Matrix Geometry. Class. Quantum Grav. 8, 1077 (1991)
25. Carow-Watamura, U., Watamura, S.: Differential Calculus on Fuzzy Sphere and Scalar Field. Int. J. of Mod. Phys. A 13, 3235–3243 (1998)
26. Jayewardena, C.: Schwinger model on S². Helvetica Physica Acta 61, 636–711 (1988)
27. Chamseddine, A.H., Fröhlich, J.: Some Elements of Connes' Noncommutative Geometry, and Spacetime Geometry. Preprint ETH-TH-93-24 (93, rec. Jul.), hep-th/9307012
28. Klimčík, C.: Gauge theories on the noncommutative sphere. IHES/P/97/77, hep-th/9710153

Communicated by H. Araki
Commun. Math. Phys. 212, 415 – 436 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
On the Large Time Asymptotics of Decaying Burgers Turbulence Roger Tribe, Oleg Zaboronski Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK. E-mail:
[email protected];
[email protected] Received: 4 October 1999 / Accepted: 4 February 2000
Abstract: The decay of Burgers turbulence with compactly supported Gaussian "white noise" initial conditions is studied in the limit of vanishing viscosity and large time. Probability distribution functions and moments for both velocities and velocity differences are computed exactly, together with the "time-like" structure functions $T_n(t, \tau) \equiv \langle(u(t+\tau) - u(t))^n\rangle$. The analysis of the answers reveals both well known features of Burgers turbulence, such as the presence of dissipative anomaly, the extreme anomalous scaling of the velocity structure functions and self similarity of the statistics of the velocity field, and new features such as the extreme anomalous scaling of the "time-like" structure functions and the non-existence of a global inertial scale due to multiscaling of the Burgers velocity field. We also observe that all the results can be recovered using the one point probability distribution function of the shock strength and discuss the implications of this fact for Burgers turbulence in general.

1. Introduction

The study of decaying Burgers turbulence (DBT) is largely motivated by the observation that this is a system which falls into the phenomenological class of turbulent systems which can be treated in principle by means of Kolmogorov theory. Yet the answers which can be derived analytically for Burgers turbulence are in sharp contradiction to the predictions of Kolmogorov theory. The understanding of the reasons for such a discrepancy and their relevance for the general theory of turbulence is one of the major aims of the study of Burgers turbulence. The history of the subject (see e.g. [11, 23, 27, 32, 16, 28, 19, 2–4, 1, 5, 6, 18, 22, 21, 33, 34, 9, 24, 31, 35, 8]; see [17] for a review) shows however that the problem is hard, so hard in fact that it has a tendency to become self justifying, getting more and more alienated from the main body of turbulence research. However, until recently there existed
no model of Burgers turbulence which can be used as a testing ground for general phenomenological theories of turbulence on one hand and admits a complete and simple analytical treatment on the other. In the present paper we introduce and analyse such a model. Namely, we study the decay of Burgers turbulence with compactly supported Gaussian "white noise" initial conditions. In physical terms the turbulence in our model is excited by an initial disturbance localized at a fixed scale much less than the size of the reservoir and which can occur with equal probability around any point of the reservoir. Note that DBT driven by "white noise" plays a special role for the theory of DBT in general. The reason is that the integral scale of turbulence in this problem is not imposed by initial conditions but rather is generated by time evolution. Thus, the answers one obtains for "white noise" DBT are in some sense universal. Consider for example DBT driven by Gaussian initial conditions characterized by the two point function $\chi(r)$ which is approximately constant for $r \ll R$ and goes to 0 exponentially fast for $r \gg R$. Then the statistics of the velocity field in this model at scales much larger than $R$ and much less than the integral scale is asymptotically equivalent, in the limit as $\nu \to 0$, $t \to \infty$, to that of "white noise" DBT. Likewise, compactly supported "white noise" DBT defines a universality class of models of DBT driven by compactly supported Gaussian initial conditions. The choice of a simple initial condition and the choice to look for answers only in the vanishing viscosity and large time limits lead to a model that is exactly solvable. Explicit asymptotics can be obtained for statistics that are hard to estimate in more general models.
The main reason for the exact solvability of our model is the fact that the statistics of the velocity field in the case of compactly supported initial conditions are dominated in the limit ν → 0, t → ∞ by two shock configurations, the statistics of which is easily computable as functionals of white noise. We would like to stress that our model is in a different universality class than the original Burgers model in which turbulence is initiated by white noise initial conditions but no restriction of compactness is imposed: a solution to Burgers equation corresponding to an initial condition supported on a whole line will generically contain infinitely many shocks at any moment of time, not just two as in our case. Accordingly, the large time statistics of the velocity field in our case is very different from that in Burgers’ model. For instance, energy density decays as t −1/2 in our case (see Sect. 3.1) and as t −2/3 in Burgers’, [11]. The paper is organized as follows. In Sect. 2 we give a precise statement of the problem, construct a large time limit of the solution to the inviscid Burgers equation corresponding to compactly supported initial conditions and formulate the main statements about the statistics of these solutions. In Sect. 3 we obtain asymptotics for a variety of statistics: the moments of velocity field, the probability distribution function of velocities, the velocity structure functions, the probability distribution function of velocity differences, time-like velocity structure functions. In Sect. 4 the analysis of these results is given. In particular, the validity of one shock approximation and multiscaling in the problem are discussed.
2. The Limiting Velocity Field

Consider the following initial value problem connected to the Burgers equation:

$$\frac{\partial u}{\partial t}(x,t) + u(x,t)\frac{\partial u}{\partial x}(x,t) = \nu\frac{\partial^2 u}{\partial x^2}(x,t), \quad x \in \mathbb{R},\ t > 0, \tag{1}$$

$$u(x,0) = u_0(x), \tag{2}$$

where $u_0(x)$ is a bounded function which is compactly supported in the interval $[x_0 - l, x_0 + l]$. Here $l$ is a fixed positive constant and $x_0$ is a random variable uniformly distributed in the interval $[-L, L]$. The fixed positive constant $L$ plays the role of a normalization length. Conditional on $x_0$ the initial velocity $u_0(x)$ will be a white noise over the interval $[x_0 - l, x_0 + l]$, so that it has a formal density

$$P(u_0|x_0) = \frac{1}{Z}\exp\Big(-\frac{1}{2J}\int_{x_0-l}^{x_0+l}u_0^2(x)\,dx\Big), \tag{3}$$

where $Z$ is a normalization constant chosen in such a way that, formally,

$$\int_{-L}^{L}\frac{dx_0}{2L}\int P(u_0|x_0)\,D(u_0) = 1.$$
$J$, the Gaussian variance, is a positive constant which plays the role of the Loitsansky integral for the problem at hand. Since we have a compact initial condition, the distribution of the velocities $u_t(x)$ is not translation invariant. The role of $x_0$ is to randomise the location of the initial disturbance uniformly over the interval $[-L, L]$. The values of $u_t(x)$ at a fixed $x$ will then typically be non-zero only with probability $O(L^{-1})$. We take the limit as $L \to \infty$ and all the answers concerning the statistics of the velocity field will be expressed in the form of the leading term in an asymptotic expansion in $L^{-1}$. This has the advantage that the answers are then translation invariant and we are free to consider statistics centered at the origin. In what follows we will compute asymptotics of the following statistics: the moments of the velocity distribution $M_n = \langle u^n(x,t)\rangle$; the velocity structure functions $S_n(y) = \langle(u(x+y,t) - u(x,t))^n\rangle$; the probability distribution function of the velocity field $P(u) = \langle\theta(u - u(x,t))\rangle$; the probability distribution function of velocity differences $P(u,y) = \langle\theta(u - u(x+y,t) + u(x,t))\rangle$; and the "time-like" velocity structure functions $T_n(\tau,t) = \langle(u(x,t+\tau) - u(x,t))^n\rangle$. Here $\theta(z) = \chi(z \ge 0)$ is the Heaviside function and $\langle\dots\rangle$ denotes the average with respect to the random initial velocity field $u_0(x)$.

The solution of the initial value problem (1), (2) for $\nu > 0$ via the Cole-Hopf transformation and the evaluation of the limit as $\nu \to 0$ for fixed $t > 0$ are well known. We refer the reader to [20] and [11] for a detailed description and give here a quick summary, sufficient for our needs. The vanishing viscosity solution can be obtained by plotting a chain of parabolic arcs such that each is touching the graph of the function $-q(x) = -\int_{-\infty}^{x}u_0(y)\,dy$ at two points exactly. The $i$th parabolic arc is given by the graph of the function $\Phi_i(x,t) = \Phi_i + \frac{(x-x_i)^2}{2t}$. As time grows the parabolic arcs flatten out and merge, and there exists a time $T^*$ such that for any $t > T^*$ there are generically only two arcs left. The velocity field associated with such a configuration is then given by

$$u^*(x,t) = U(x_0 + x^*, x, t, P, Q) \equiv \frac{(x - x_0 - x^*)}{t}\,\chi_{[x_0+x^*-\sqrt{-2Qt},\ x_0+x^*+\sqrt{2(P-Q)t}]}(x), \tag{4}$$
where $\chi_I$ is an indicator function of the interval $I$, $P = q(+\infty)$ is a momentum corresponding to a given $u_0$, $Q = \min_x q(x)$ is a global minimum of $q(x)$ and $x_0 + x^* \in [x_0 - l, x_0 + l]$ is the point where this minimum is achieved. (Such a point exists and is unique almost surely as $q(x)$ is continuous and the global minimum is almost surely unique.) The limiting solution (4) was originally constructed in [20]. The time $T^*$ at which the limiting velocity field $u^*$ is attained depends on the random initial condition $u_0$, but it will be shown that the statistics of the velocity field is well approximated at large times by the statistics of the limiting velocity field $u^*$. The latter is determined in turn by the joint distribution of the momentum $P$ and the global minimum $Q$. Indeed, although the expression for $u^*$ depends explicitly on $(P, Q, x^*)$, the dependence on $x^*$ doesn't influence the statistics of $u^*$ in the limit $L \to \infty$, where the translational invariance is restored. We delegate the detailed discussion of this point to the next section.

The choice of white noise as an initial distribution leads to the distribution of the pair $(P, Q)$ being exactly calculable. Indeed it is a well known consequence of the 'reflection principle' for Brownian paths [29]. Since it is key to all our asymptotics we include a quick derivation of the joint density function $\rho(P, Q)$. We start with a computation of the probability distribution function of momentum $\rho(P)$. Writing $\delta$ for the delta function at zero, we have by definition

$$\rho(P) = \Big\langle\delta\Big(P - \int_{-\infty}^{\infty}dx\,u_0(x)\Big)\Big\rangle = \int_{-\infty}^{\infty}\frac{d\lambda}{2\pi}\,e^{i\lambda P}\Big\langle e^{-i\lambda\int_{-\infty}^{\infty}dx\,u_0(x)}\Big\rangle.$$

Using (3) this functional integral is Gaussian and can be simply computed to give

$$\Big\langle e^{-i\lambda\int_{-\infty}^{\infty}dx\,u_0(x)}\Big\rangle = e^{-lJ\lambda^2}.$$

The integral over $\lambda$ is a Gaussian integral and we conclude that the distribution of $P$ is also Gaussian, as could have been guessed from the very beginning, and given by

$$\rho(P) = \frac{e^{-(P/P_0)^2}}{\sqrt{\pi}\,P_0}, \quad\text{where } P_0 = 2\sqrt{lJ}. \tag{5}$$
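Equation (5) says that $P$ is Gaussian with zero mean and variance $P_0^2/2 = 2lJ$. A small Monte Carlo experiment makes this concrete. The sketch below is not part of the paper's argument: it assumes a particular discretisation of the white noise (independent Gaussians on a uniform grid with variance $J/\Delta x$, mimicking $\langle u_0(x)u_0(y)\rangle = J\delta(x-y)$), and the function names, grid size and sample count are illustrative choices.

```python
import math
import random

def sample_momentum(l, J, n_cells, rng):
    """One draw of P = sum_k u_k * dx for discretised white noise on [-l, l]."""
    dx = 2.0 * l / n_cells
    sigma = math.sqrt(J / dx)      # <u_k^2> = J/dx is the grid version of J*delta(x - y)
    return sum(rng.gauss(0.0, sigma) for _ in range(n_cells)) * dx

def momentum_stats(l=1.0, J=0.5, n_cells=50, n_samples=4000, seed=1):
    """Sample mean and unbiased sample variance of P over n_samples realisations."""
    rng = random.Random(seed)
    draws = [sample_momentum(l, J, n_cells, rng) for _ in range(n_samples)]
    mean = sum(draws) / n_samples
    var = sum((p - mean) ** 2 for p in draws) / (n_samples - 1)
    return mean, var

def predicted_variance(l, J):
    """Variance implied by Eq. (5): P0^2 / 2 with P0 = 2*sqrt(l*J)."""
    P0 = 2.0 * math.sqrt(l * J)
    return P0 * P0 / 2.0
```

For $l = 1$, $J = 0.5$ the predicted variance is $2lJ = 1$; the sample variance returned by `momentum_stats()` should agree with it up to statistical fluctuations of a few percent at this sample size.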
The joint probability distribution function can now be computed as follows. Fix $q, p$ satisfying $q < 0$, $q < p$. Let $x'$ be the first value of $x$ for which $q(x) = q$. Define $q'(x)$ to equal $q(x)$ for $x \le x'$ and to equal the reflection of $q(x)$ in the horizontal line $y = q$ for $x \ge x'$. Then if $Q' = \min_x q'(x)$ and $P' = q'(\infty)$, the reflection principle (see [29]), which exploits the white noise nature of $u_0$, states that $Q', P'$ have the same distribution as $Q, P$. Then

$$\mathrm{Prob}(Q \le q, P \ge p) = \mathrm{Prob}(Q' \le q, P' \le 2q - p) = \mathrm{Prob}(P' \le 2q - p) = \int_{-\infty}^{2q-p}\frac{dz}{\sqrt{\pi}\,P_0}\,e^{-(z/P_0)^2}.$$

Differentiating in $p$ and $q$ we conclude that

$$\rho(p, q) = \frac{4(p - 2q)}{\sqrt{\pi}\,P_0^3}\,e^{-\left(\frac{p-2q}{P_0}\right)^2}, \quad\text{if } q \le \min\{0, p\}, \tag{6}$$
and is zero for all other values of $p$ and $q$. With the help of (6) we are able to average functionals $F[u^*(t)] = F[u^*(x_i, t) : i = 1, 2, \dots]$ with respect to the initial distribution. If, however, we are interested in the statistics of $u(x,t)$ at zero viscosity and large times there is still a question: is it true that in this limit $\langle F[u(t)]\rangle \sim \langle F[u^*(t)]\rangle$, or even at large times are there statistically many initial conditions such that the corresponding velocity profiles haven't converged to the limiting ones? It so happens that the first alternative prevails. The detailed proofs of this fact for relevant functionals are carried out in the next section and in the appendix and are based on the following estimate on the time $T^*$ of convergence to the limiting profile:

$$\mathrm{Prob}(T^* > t) \le C\Big(\frac{t_c}{t}\Big)^{1/2}, \tag{7}$$

where $t_c = \sqrt{l^3/J}$ and $C$ is a positive number. The proof of this estimate is fairly complicated and is allocated to the appendix. However, the result itself is so important for the validity of the conclusions of our paper that we decided to present here a convincing and very simple heuristic derivation of it. By definition, $\mathrm{Prob}(T^* < t\,|\,P, Q, x^*) = \mathrm{Prob}(q \le \Phi_t\,|\,P, Q, x^*)$, where $\Phi_t$ coincides for $x < x^*$ with the parabolic arc $\Phi_{1,t}$ passing through the point $(x^*, -Q)$ and touching the line $y = 0$, and for $x > x^*$ with the parabolic arc $\Phi_{2,t}$ passing through the point $(x^*, -Q)$ and touching the line $y = -P$ (we used the translation invariance of the random variable $T^*$ to set $x_0 = 0$; consequently, $x^* \in [-l, l]$). It is convenient to think of a Brownian walk $q(x)$ passing through $(x^*, -Q)$ as a collection of two independent walks $q^+(x)$ and $q^-(x)$ starting at this point and moving in the opposite directions in "time" $x$. Therefore,

$$\mathrm{Prob}(T^* < t\,|\,P, Q, x^*) = \mathrm{Prob}(q^- < \Phi_{1,t}\,|\,Q, x^*)\cdot\mathrm{Prob}(q^+ < \Phi_{2,t}\,|\,P, Q, x^*). \tag{8}$$

To estimate, say, $\mathrm{Prob}(q^- < \Phi_{1,t}\,|\,Q, x^*)$ from below we note that $\mathrm{Prob}(q^- < \Phi_{1,t}\,|\,Q, x^*) \ge \mathrm{Prob}(q^- < -Q + \theta\cdot(x - x^*)\,|\,Q, x^*)$, where $y = -Q + \theta\cdot(x - x^*)$ is the equation of the line tangent to the parabola $\Phi_{1,t}$ at the point $(x^*, -Q)$, with $\theta = \sqrt{-2Q/t}$. Hence,

$$\mathrm{Prob}(q^- < \Phi_{1,t}\,|\,Q, x^*) \ge \lim_{\epsilon\to+0}\frac{\displaystyle\int_{q(-l)=0}^{q(x^*)=-Q-\epsilon}Dq\;\Theta\big[q < -Q + \theta\cdot(x - x^*)\big]\,e^{-\frac{1}{2J}\int_{-l}^{x^*}\dot q^2dx}}{\displaystyle\int_{q(-l)=0}^{q(x^*)=-Q-\epsilon}Dq\;\Theta\big[q < -Q\big]\,e^{-\frac{1}{2J}\int_{-l}^{x^*}\dot q^2dx}}\cdot\Theta\big(0 < -Q - \theta\cdot(l + x^*)\big), \tag{9}$$
R. Tribe, O. Zaboronski
where $\Theta[\dots]$ is a functional step function and $\Theta(\dots)$ a usual one. The functional integral in the numerator of (9) can be transformed into an integral over all paths satisfying $q(x) \ge 0$ by the change of variables $q(x) \to q(x) - Q + \theta(x-x^*)$. (A counterpart of this transformation in quantum mechanics is a Galilean transformation.) Now the functional integrals in both the numerator and the denominator of (9) can be expressed in terms of the Green's function of the heat equation $\dot q = \frac{J}{2} q''$ on the half-line, i.e. the antisymmetrization of the Green's function of the same equation on the whole line. A simple computation then shows that
$$\mathrm{Prob}(q^- < \Phi_{1,t} \mid Q,x^*) \ge \left(1 - 2l\sqrt{\frac{2}{-Qt}}\right)\cdot\theta\!\left(1 - 2l\sqrt{\frac{2}{-Qt}}\right). \tag{10}$$
A similar estimate holds for $\mathrm{Prob}(q^+ < \Phi_{2,t} \mid P,Q,x^*)$ if one replaces $-Q$ with $P-Q$ in the r.h.s. of (10). Substituting these two estimates into (8) and integrating both sides of the resulting inequality w.r.t. $P$, $Q$ using (6) we find that $\mathrm{Prob}(T^* < t) \ge 1 - \mathrm{Const}\,\sqrt{l^2/(tP_0)}$, which is equivalent to the estimate (7) for $\mathrm{Prob}(T^* > t) = 1 - \mathrm{Prob}(T^* < t)$.
3. The Statistics of the Velocity Field in the ν → 0, t → ∞ Limit

3.1. Moments of the velocity distribution. The aim of the present section is to compute the large-$t$ limit of the moments of the velocity distribution
$$M_n(t) = \langle u^n(0,t)\rangle, \qquad n = 1,2,\dots. \tag{11}$$
Odd order moments vanish identically due to the symmetry: both the Burgers equation and the initial distribution are invariant with respect to the transformation $u \to -u$, $x \to -x$. On the other hand, $M_{2k+1} \to -M_{2k+1}$ under this transformation, which implies that $M_{2k+1}(t) \equiv 0$ for $k = 1,2,\dots$. We concentrate therefore on the computation of the moments of even order and assume everywhere below that $n$ is even. We may write, using the fact that $u(x,t) = u^*(x,t)$ for $t > T^*$,
$$M_n(t) = \langle u^{*n}(0,t)\rangle + R_n(t), \tag{12}$$
where
$$R_n(t) = \big\langle \big(u^n(0,t) - u^{*n}(0,t)\big)\,\theta(T^*-t)\big\rangle \tag{13}$$
is an error term to be estimated. The first term on the right-hand side of (12) can be written in the following form:
$$\langle u^{*n}(0,t)\rangle = \int dp\,dq\;\rho(p,q)\int_{-L}^{L}\frac{dx_0}{2L}\,U^n(x_0,0,t,p,q) + r_n(t), \tag{14}$$
where
$$r_n(t) = \int_{-l}^{l} dx^* \int dp\,dq\;\rho(p,q,x^*)\left(\int_{-L+x^*}^{-L} + \int_{L}^{L+x^*}\right)\frac{dx_0}{2L}\,U^n(x_0,0,t,p,q) \tag{15}$$
is an error term appearing due to neglecting $x^*$ in comparison to $L$, and $\rho(p,q,x^*)$ is the joint probability density of $P$, $Q$ and $x^*$. It is shown in the appendix that the error term $r_n(t)$ does not affect the asymptotics as $\nu \to 0$, $L \to \infty$, $t \to \infty$. Informally this fact can be explained by noticing that the integrand in (15) is non-zero only for velocity profiles which are “stretched” over an interval of length $L$ and thus are exponentially improbable. The remaining integral on the right-hand side of (14) can be evaluated exactly using the explicit expressions (4) and (6), leading to the following result:
$$\langle u^{*n}(0,t)\rangle \sim \frac{\Gamma((n+3)/4)}{\sqrt{\pi}\,(n+1)}\,\frac{L(t)}{L}\,U(t)^n, \tag{16}$$
where
$$L(t) = \sqrt{2P_0 t}, \qquad U(t) = \frac{L(t)}{t} \tag{17}$$
are parameters, with dimensions of length and velocity, which should be interpreted as the scale of turbulence and the turbulent velocity respectively. Here we write the symbol $\sim$ to mean asymptotic equivalence in the limit as $L \to \infty$ and then $t \to \infty$. Another computation presented in the appendix leads to the following estimate of the error term $R_n(t)$ from (12):
$$|R_n(t)| \le C_n\,\frac{L(t)}{L}\,U(t)^n\left(\frac{t_c}{t}\right)^{1/4}, \tag{18}$$
where $t_c = \sqrt{l^3/J}$ is a constant having the dimension of time and $C_n$ is a positive constant. Comparing (18) with (16) we see that for $t \gg t_c$, $|R_n(t)| \ll \langle u^{*n}(0,t)\rangle$, which permits us to conclude that
$$M_{2k}(t) \sim \frac{\Gamma(k/2+3/4)}{\sqrt{\pi}\,(2k+1)}\,\frac{L(t)}{L}\,U(t)^{2k}, \qquad k = 1,2,\dots. \tag{19}$$
It is important to stress however that the coefficient $C_n$ from (18) grows faster with $n$ than the numerical factor in the r.h.s. of (19). Thus it takes a long time for a moment of high order to converge to the limiting value (19). It follows from (19) that the energy density $E(t) \equiv \frac{1}{2}M_2(t)$ decays like $t^{-1/2}$ as $t \to \infty$. This is the result to be expected: dissipation of energy occurs in Burgers turbulence due to shock collisions and at each separate shock. The energy of a separate shock decays as $t^{-1/2}$ and, due to the absence of shock collisions in the limiting profile (4), this also gives the law of decay of the total energy density. This argument is due to J. M. Burgers, see [11]. We will also see below that the statistics of the velocity field in our model is self-similar with the scales of length and velocity given by (17). These scales depend on time exactly as their counterparts in Kida's model. The statistics of the velocity field in our case are however different from those of Kida$^1$. Thus we conclude that self-similarity alone does not determine the large time asymptotics of the statistics of the velocity field in DBT. Note also that $E(t)$ decays in time, showing the presence of a dissipation anomaly in the model: the rate of energy dissipation does not vanish but converges to a finite non-zero limit when the viscosity $\nu$ approaches zero.

$^1$ There exists no complete solution of Kida's model. Yet the answers which can be obtained within Kida's model are different from their counterparts in our model.
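The $t^{-1/2}$ decay of the energy density can be read off (19) and (17) directly. The following minimal sketch evaluates the asymptotic formula (19) and compares $E(4t)$ with $E(t)$; the function names and the numerical values of $P_0$ and $L$ are illustrative assumptions, not part of the model.

```python
import math

def turbulence_scales(P0, t):
    """Scales from (17): L(t) = sqrt(2 P0 t) and U(t) = L(t) / t."""
    Lt = math.sqrt(2.0 * P0 * t)
    return Lt, Lt / t

def even_moment(k, P0, t, L):
    """Asymptotic even moment M_{2k}(t) as given by (19)."""
    Lt, Ut = turbulence_scales(P0, t)
    coeff = math.gamma(k / 2 + 0.75) / (math.sqrt(math.pi) * (2 * k + 1))
    return coeff * (Lt / L) * Ut ** (2 * k)

def energy_density(P0, t, L):
    """E(t) = M_2(t) / 2."""
    return 0.5 * even_moment(1, P0, t, L)

# Illustrative (hypothetical) parameter values.
P0, L = 1.0, 1.0e6
ratio = energy_density(P0, 4.0, L) / energy_density(P0, 1.0, L)
print(ratio)  # quadrupling t should halve E(t) if E(t) ~ t**(-1/2)
```

Since $L(t)\,U(t)^2 \propto t^{-1/2}$, the ratio does not depend on the chosen $P_0$ and $L$.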
3.2. The probability distribution function of velocities. In this section we will concern ourselves with computing the probability distribution function (PDF) of velocities given by
$$P(u,t) \equiv \mathrm{Prob}(u(0,t) > u) = \langle\theta(u(0,t)-u)\rangle. \tag{20}$$
Reasoning exactly as in the previous section we find that
$$P(u,t) = \langle\theta(u^*(0,t)-u)\rangle + R(u,t), \tag{21}$$
where
$$R(u,t) = \big\langle\theta(T^*-t)\big(\theta(u(0,t)-u) - \theta(u^*(0,t)-u)\big)\big\rangle \tag{22}$$
is an error due to the replacement $u \to u^*$;
$$\langle\theta(u^*(0,t)-u)\rangle = \theta(-u) + \int dp\,dq\;\rho(p,q)\cdot\int_{-L}^{L}\frac{dx_0}{2L}\Big(\theta\big(U(x_0,0,t,p,q)-u\big) - \theta(-u)\Big) + r(u,t), \tag{23}$$
where $r(u,t)$ is an error due to neglecting $x^*$ in comparison with $L$:
$$r(u,t) = \int_{-l}^{l} dx^*\int dp\,dq\;\rho(p,q,x^*)\cdot\left(\int_{-L+x^*}^{-L} + \int_{L}^{L+x^*}\right)\frac{dx_0}{2L}\Big(\theta\big(U(x_0,0,t,p,q)-u\big) - \theta(-u)\Big). \tag{24}$$
The reason that the term $\theta(-u)$ is added and subtracted is that, due to the averaging of the position of the initial condition over the block $[-L,L]$, the velocity is typically zero and so the PDF is an $O(L^{-1})$ perturbation of $\theta(-u)$. An estimate of $r(u,t)$ similar to that of the term $r_n(t)$ in Sect. 3.1 shows that $r(u,t)$ does not affect the final asymptotics. An exact calculation using the known density $\rho(p,q)$ for the other terms on the right-hand side of (23) leads to
$$\langle\theta(u^*(0,t)-u)\rangle \sim \theta(-\bar u) + \frac{L(t)}{L}\int_{\bar u^2}^{\infty}\frac{d\alpha}{\sqrt{\pi}}\,e^{-\alpha^2}\big(\sqrt{\alpha}-|\bar u|\big)\,\mathrm{sgn}(\bar u), \tag{25}$$
where $\bar u = u/U(t)$. A computation performed in the appendix shows that
$$|R(u,t)| \le C\,\frac{L(t)}{L}\left(\frac{t_c}{t}\right)^{1/4}, \tag{26}$$
where $C$ is a positive constant. Comparing (26) with (25) we see that for $t \gg t_c$ we have $\langle\theta(u^*(0,t)-u)\rangle \gg |R(u,t)|$, with the last inequality being pointwise in $\bar u$ rather than uniform. We conclude that
$$P(U(t)\bar u,t) \sim \theta(-\bar u) + \frac{L(t)}{L}\int_{\bar u^2}^{\infty}\frac{d\alpha}{\sqrt{\pi}}\,e^{-\alpha^2}\big(\sqrt{\alpha}-|\bar u|\big)\,\mathrm{sgn}(\bar u). \tag{27}$$
If in particular $|\bar u| \to \infty$ this simplifies to
$$P(U(t)\bar u,t) \sim \theta(-\bar u) + \frac{L(t)}{L}\,\frac{1}{8\sqrt{\pi}}\,\frac{e^{-\bar u^4}}{|\bar u|^5}\,\mathrm{sgn}(\bar u). \tag{28}$$
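The passage from the integral in (27) to the tail (28) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a hand-written Simpson rule, takes $\bar u > 0$, and substitutes $\alpha = \bar u^2 + s$ so that the common factor $e^{-\bar u^4}$ cancels; the helper names are hypothetical.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def tail_ratio(ubar):
    """Integral in (27) divided by the leading tail term in (28), for ubar > 0.

    After alpha = ubar**2 + s the factors exp(-ubar**4) and 1/sqrt(pi)
    cancel between numerator and denominator.
    """
    a = 2.0 * ubar ** 2
    integrand = lambda s: math.exp(-a * s - s * s) * (math.sqrt(ubar ** 2 + s) - ubar)
    integral = simpson(integrand, 0.0, 40.0 / a)  # exp(-40) is negligible
    return 8.0 * ubar ** 5 * integral

for ubar in (2.0, 3.0, 4.0):
    print(ubar, tail_ratio(ubar))
```

The printed ratio should approach one as $\bar u$ grows, with a relative correction that shrinks roughly like $\bar u^{-4}$.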
Note that the answer (27) for $P(u,t)$ is self-similar with $U(t)$ playing the role of the integral velocity scale. Note also that the form of $P(u,t)$ is not Gaussian. This confirms the non-triviality of our model: the output (the strongly non-Gaussian statistics of the velocity field in the limit of small viscosity and large time) is not the same as the input (a trivial Gaussian distribution of the initial velocity field). This non-triviality will be re-emphasised in the subsequent sections, where it will be shown that the limiting statistics of the velocity field is intermittent. Finally we would like to make the following technical comment. Of course, the moments of the distribution (27) are exactly those given by (19). We could therefore try to compute the distribution (20) first and then argue that the moments of this asymptotic distribution coincide with the asymptotics of the moments of the actual distribution. Unfortunately the analysis of error terms within this approach becomes very involved. For this reason we have two separate computations: the asymptotics of the moments of the velocity distribution and the asymptotics of the velocity distribution itself.

3.3. Velocity structure functions. Now we will turn to the two-point statistics of the velocity field and compute asymptotics for the velocity structure functions given by
$$S_n(y,t) = \big\langle\big(u(y,t)-u(0,t)\big)^n\big\rangle, \qquad n = 1,2,\dots. \tag{29}$$
We find as in the previous subsections that
$$S_n(y,t) = \big\langle\big(u^*(y,t)-u^*(0,t)\big)^n\big\rangle + R_n(y,t), \tag{30}$$
where $R_n(y,t)$ accounts for the error due to the replacement of $u$ with $u^*$. As shown in the appendix this error can be estimated as follows: for $t$ such that $L(t) \ge y$,
$$|R_n(y,t)| \le C_n\,U(t)^n\,\frac{y}{L}\left(\frac{t_c}{t}\right)^{1/4}. \tag{31}$$
We express the first term in (30) as
$$\big\langle\big(u^*(y,t)-u^*(0,t)\big)^n\big\rangle = \int_{-L}^{L}\frac{dx_0}{2L}\int dp\,dq\;\rho(p,q)\,\big(U(x_0,y,t,p,q)-U(x_0,0,t,p,q)\big)^n + r_n(y,t),$$
where $r_n(y,t)$ accounts for an error arising due to neglecting $x^*$ in comparison with $L$. Again it can be shown that the term $r_n(y,t)$ does not contribute to the asymptotics. Now a direct computation using the density $\rho(p,q)$ shows, for $y \ge 0$, that
$$\big\langle\big(u^*(y,t)-u^*(0,t)\big)^n\big\rangle \sim (-1)^n\frac{1}{\sqrt{\pi}}\,\Gamma\!\left(\frac{n+2}{4}\right)\frac{L(t)}{L}\,U(t)^n\,\bar y + O(\bar y^2), \qquad n = 2,3,\dots,$$
where $\bar y = y/L(t)$. In addition $S_1(y,t) \sim 0$, which confirms the restoration of translation invariance in the large $L$ limit. Comparing this with (31) we see that the asymptotics of the velocity structure functions is given, for fixed $\bar y \le 1$, by
$$S_n(L(t)\bar y,t) \sim (-1)^n\frac{1}{\sqrt{\pi}}\,\Gamma\!\left(\frac{n+2}{4}\right)\frac{L(t)}{L}\,U(t)^n\,\bar y + O(\bar y^2), \qquad n = 2,3,\dots. \tag{32}$$
It has been assumed in our computations that $\bar y \ge 0$. Extending (32) to negative $y$ by the symmetry $y \to -y$, $u \to -u$, we see that $S_{2k}(L(t)\bar y,t)$ is proportional to $|\bar y|$ and $S_{2k+1}(L(t)\bar y,t)$ is proportional to $\bar y$ for $k \ge 1$ and $|\bar y| \ll 1$. Thus the velocity structure functions of the problem exhibit in the inertial range the extreme anomalous (non-Kolmogorov) scaling which is typical for Burgers turbulence in general and is due to the presence of shocks in the limiting velocity profile. The Burgers anomalous scaling is well known from heuristic arguments (see e.g. [15, 7, 9]). In our case however it has been derived as a part of the complete solution of the problem.

3.4. The probability distribution function of velocity differences. Here we will compute the PDF for velocity differences
$$P(u,y,t) = \mathrm{Prob}\big(u > \Delta u(y,t)\big) = \big\langle\theta\big(u-\Delta u(y,t)\big)\big\rangle, \tag{33}$$
where $\Delta u(y,t) = u(y/2,t) - u(-y/2,t)$ and $y \ge 0$. Definition (33) is tailored for the study of negative velocity differences and we consider only the case $u < 0$. Negative differences are the interesting case since they occur when the velocities are evaluated on either side of a shock. A lengthy but straightforward computation shows that for fixed $\bar u < 0$, $\bar y > 0$,
$$P(U(t)\bar u, L(t)\bar y, t) \sim 2\,\frac{L(t)}{L}\left[\int_{\bar u^2}^{(\bar y-\bar u)^2}\frac{d\alpha}{\sqrt{\pi}}\,e^{-\alpha^2}\big(\sqrt{\alpha}+\bar u\big) + \bar y\int_{(\bar y-\bar u)^2}^{\infty}\frac{d\alpha}{\sqrt{\pi}}\,e^{-\alpha^2}\right] + R(\bar u,\bar y,t), \tag{34}$$
where, as shown in the appendix,
$$|R(\bar u,\bar y,t)| \le C\,\frac{L(t)}{L}\,\frac{\bar y}{\bar y+|\bar u|}\left(\frac{t_c}{t}\right)^{1/4}. \tag{35}$$
Due to the presence of extra factor of ( ttc )1/4 decaying with time, R(u, y, t) becomes small compared to the first term in the right hand side of (34), given that u, ¯ y¯ fixed. It is easy to analyse (34) in the following limiting cases. We suppose that y¯ 1. If |u| ¯ 1, then √ √ π π 2 L(t)y¯ 2 3 1− y¯ − |u| ¯ + O(y¯ ) + O(|u| ¯ ) . (36) P (U (t)u, ¯ L(t)y, ¯ t) ∼ L 2 2 If 1 |u| ¯ y¯ −1/3 then
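$$P(U(t)\bar u, L(t)\bar y, t) \sim \frac{1}{\sqrt{\pi}}\,\frac{L(t)\bar y}{L}\,\frac{1}{|\bar u|^2}\,e^{-|\bar u|^4}\left(1 + O\!\left(\frac{1}{|\bar u|^4}\right) + O\big(\bar y|\bar u|^3\big)\right). \tag{37}$$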
If $|\bar u| \gg \bar y^{-1/3}$ then
1 L(t) 1 −|u| ¯4 e 1 + O( 8 ) . P (U (t)u, ¯ L(t)y, ¯ t) ∼ √ |u| ¯ ¯5 4 π L |u|
(38)
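The two tail regimes can be illustrated by evaluating the bracket in (34) numerically and dividing it by the leading factors $\frac{\bar y}{\sqrt{\pi}}\,|\bar u|^{-2}e^{-|\bar u|^4}$ (regime $1 \ll |\bar u| \ll \bar y^{-1/3}$) and $\frac{1}{4\sqrt{\pi}}\,|\bar u|^{-5}e^{-|\bar u|^4}$ (regime $|\bar u| \gg \bar y^{-1/3}$). This is a rough sketch, assuming a plain Simpson rule is accurate enough; the prefactor $L(t)/L$ and the error term $R$ are omitted, and the sample values of $\bar u$, $\bar y$ are arbitrary choices.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def main_term(ubar, ybar):
    """Twice the bracket in (34), for ubar < 0 < ybar (prefactor L(t)/L dropped)."""
    lo, hi = ubar ** 2, (ybar - ubar) ** 2
    first = simpson(lambda a: math.exp(-a * a) * (math.sqrt(a) + ubar), lo, hi)
    first /= math.sqrt(math.pi)
    # The second integral is int_hi^inf exp(-a^2) da / sqrt(pi) = erfc(hi) / 2.
    second = ybar * 0.5 * math.erfc(hi)
    return 2.0 * (first + second)

def leading_linear(ubar, ybar):
    """Leading term of the regime that is linear in ybar."""
    return ybar * math.exp(-ubar ** 4) / (math.sqrt(math.pi) * ubar ** 2)

def leading_saturated(ubar):
    """Leading term of the regime that is independent of ybar."""
    return math.exp(-ubar ** 4) / (4.0 * math.sqrt(math.pi) * abs(ubar) ** 5)

ubar = -3.0
ratio_linear = main_term(ubar, 1.0e-4) / leading_linear(ubar, 1.0e-4)
ratio_saturated = main_term(ubar, 0.5) / leading_saturated(ubar)
print(ratio_linear, ratio_saturated)
```

Both ratios should be close to one; for $\bar u = -3$ the crossover sits near $\bar y \approx |\bar u|^{-3} \approx 0.04$, between the two sample values of $\bar y$.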
To summarize, for negative $u$, $P(u,y,t)$ decays algebraically for $|u| \ll U(t)$ and super-exponentially for $|u| \gg U(t)$. Moreover, $P(u,y,t) \sim O(y)$ if $1 \ll |\bar u| \ll \bar y^{-1/3}$ and doesn't depend on $y$ if $1 \ll \bar y^{-1/3} \ll |\bar u|$. This information alone enables one to conclude that velocity structure functions of sufficiently high order exhibit anomalous scaling. In addition we observe a crossover between regimes (37) and (38). This crossover is actually responsible for the presence of many scales in the description of the statistics of the velocity field and the absence of a universal inertial range in Burgers turbulence. We refer the reader to Sect. 3.7 for a detailed discussion of this point.

3.5. The multi-time statistics of the velocity field. The simplicity of our model allows us to compute the correlation between values of the velocity field at different moments of time. Let
$$T_n(\tau,t) = \big\langle\big(u(0,t+\tau)-u(0,t)\big)^n\big\rangle \tag{39}$$
be the velocity structure functions corresponding to the same point in space but different moments of time. We write (39) in the already familiar form
$$T_n(\tau,t) = \big\langle\big(u^*(0,t+\tau)-u^*(0,t)\big)^n\big\rangle + R_n(\tau,t), \tag{40}$$
with $R_n(\tau,t)$ accounting for an error due to the replacement of $u$ with $u^*$. An estimate in the appendix shows that
$$|R_n(\tau,t)| \le C_n\,U(t)^n\,\frac{\tau U(t)}{L}\left(\frac{t_c}{t}\right)^{1/4}. \tag{41}$$
The computation of the first term on the right-hand side of (40) is very close to the computation performed in previous sections and leads to
$$T_n(\tau,t) \sim (-1)^n\,\frac{\Gamma\!\left(\frac{n+3}{4}\right)}{2\sqrt{\pi}}\,U^n(t)\,\frac{U(t)\tau}{L} + R_n(\tau,t) \sim (-1)^n\,\frac{\Gamma\!\left(\frac{n+3}{4}\right)}{2\sqrt{\pi}}\,U^n(t)\,\frac{U(t)\tau}{L}, \qquad n = 2,3,\dots. \tag{42}$$
We therefore conclude that the time-like structure functions exhibit in Burgers turbulence the extreme anomalous scaling in $\tau$ given by $T_n(\tau,t) \sim \tau$, $n = 2,3,\dots$. Comparing this with the expression (32) for the space-like structure functions, we see that
$$S_n(y,t) = T_n(\tau,t), \qquad n = 2,3,\dots \tag{43}$$
at $y = C(n)U(t)\tau$, given that $y \ll L(t)$ and $\tau \ll t$. The identity (43) means that the “isotropic” Taylor conjecture, stating the equivalence of the space-like and time-like statistics in isotropic turbulence at small scales, becomes a theorem for our model of
Burgers turbulence. A similar observation was also independently made in [8] in the context of Burgers turbulence generated by correlated Gaussian initial conditions. Let us finally note that if one wishes to compare $S_n(y,t)$ with $T_n(\tau,t)$ at arbitrarily high orders $n$, the condition of applicability of relation (43) has to be changed to $y \ll L_n(t)$, where $L_n(t)$ is the correlation length associated with the $n$th order structure function introduced in Sect. 3.7. For $n \gg 1$, $L_n(t) \sim L(t)/n^{3/4}$, see (46) below.

3.6. One-shock approximation. We wish to show that all of the results obtained in the previous sections can be easily obtained from heuristic arguments given the knowledge of the probability density of a velocity jump at a shock. In our case the latter is easy to compute: a simple computation which uses the knowledge of the limiting velocity profile (4) and the density $\rho(p,q)$ gives
$$\rho(\mu) \equiv \Big\langle\delta\Big(\mu - \sqrt{2(P-Q)/t}\Big)\Big\rangle = \frac{4}{\sqrt{\pi}\,U(t)}\,\bar\mu\,e^{-\bar\mu^4}, \tag{44}$$
where $\mu$ is the velocity jump at the (right) shock and $\bar\mu = \mu/U(t)$. The probability density of the velocity jump at the left shock has exactly the same form, so we will be referring to (44) as the probability density of the velocity jump at a shock.

Now let us assume, firstly, that the large-$t$ statistics of $u$ are approximated by those of $u^*$; secondly, that a one-shock approximation is valid, i.e. that one can disregard in the analysis the contributions coming from configurations with shocks separated by distances much less than the average separation $L(t)$. To derive $P(u,y,t)$ for $u < 0$, $y \ll L(t)$ using these assumptions, note that $u(y,t) - u(0,t)$ can be negative only if there is a shock at some point in $[0,y]$. If the right-hand shock lies at $x \in [0,y]$ then $u(y,t) - u(0,t) = -\mu + x/t$. A similar formula holds if the left-hand shock lies in $[0,y]$. So neglecting the contribution from the configurations with two shocks inside the interval $[0,y]$, we see that
$$\mathrm{Prob}\big(u(y,t)-u(0,t) < u\big) \approx 2\int_0^y\frac{dx}{2L}\,\mathrm{Prob}\Big(\text{Size of Jump} > \frac{x}{t} - u\Big).$$
This can be easily computed using the density of the shock jump (44), giving
$$\mathrm{Prob}\big(u(y,t)-u(0,t) < u\big) \approx \frac{2L(t)}{L}\left[\int_{\bar u^2}^{(\bar y-\bar u)^2}\frac{d\alpha}{\sqrt{\pi}}\,e^{-\alpha^2}\big(\sqrt{\alpha}+\bar u\big) + \bar y\int_{(\bar y-\bar u)^2}^{\infty}\frac{d\alpha}{\sqrt{\pi}}\,e^{-\alpha^2}\right], \tag{45}$$
which coincides with the exact answer (34). With the knowledge of the PDF of velocity differences we can compute velocity structure functions, thus moments of velocities, thus the PDF of velocities. In other words all of the results of the previous sections concerning single time statistics of the velocity field can be obtained using a one-shock approximation. Moreover, the $\tau$-dependence of the time-like structure functions (42) is also entirely due to one-shock effects: if $n = 2,3,\dots$ and $\tau \ll t$, then the main contribution to $T_n(\tau,t)$ comes from the configurations with a shock passing through $x = 0$ between the moments of time $t$ and $t+\tau$. A shock with velocity jump $\mu$ travels a distance of approximately $\mu\tau/2$ over the interval $[t,t+\tau]$. Therefore,
$$T_n(\tau) \approx \big\langle(-\mu)^n\,\chi\big(\text{Shock passed through 0 during } [t,t+\tau]\big)\big\rangle \approx \Big\langle(-\mu)^n\,\frac{\mu\tau}{2L}\Big\rangle.$$
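The averages over the shock strength reduce to moments of the scaled jump $\bar\mu$ with density $\frac{4}{\sqrt{\pi}}\bar\mu\,e^{-\bar\mu^4}$ from (44), for which $\langle\bar\mu^k\rangle = \Gamma((k+2)/4)/\sqrt{\pi}$. A small numerical sketch (with a hand-written Simpson rule and hypothetical helper names) recovers in this way the coefficients appearing in (32) and (42):

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def shock_moment(k):
    """k-th moment of the scaled jump under the density (4/sqrt(pi)) m exp(-m^4)."""
    density = lambda m: 4.0 / math.sqrt(math.pi) * m * math.exp(-m ** 4)
    # The density is negligible beyond m = 6, so the integral is truncated there.
    return simpson(lambda m: m ** k * density(m), 0.0, 6.0)

for n in (2, 3, 4):
    space_like = shock_moment(n)           # multiplies (L(t)/L) U^n ybar in (32)
    time_like = 0.5 * shock_moment(n + 1)  # multiplies U^n (U tau / L) in (42)
    print(n, space_like, math.gamma((n + 2) / 4) / math.sqrt(math.pi),
          time_like, math.gamma((n + 3) / 4) / (2.0 * math.sqrt(math.pi)))
```

The overall sign $(-1)^n$ is left out; the factor $1/2$ in the time-like case comes from the distance $\mu\tau/2$ travelled by a shock.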
Computing this average using the PDF of shock strength (44) we arrive exactly at (42), which again shows that one-shock approximation is asymptotically exact. These calculations support the following statement about decaying Burgers turbulence: all one needs to know in order to describe the statistics of the velocity field at scales much less than the average distance between shocks is the one-point PDF of shock velocity and strength (or just shock strength if the correlation functions which we’re trying to compute are Galilean-invariant). Thus the problem is much simpler than one might have thought: recall for example that exact formulae expressing velocity correlation functions in terms of the statistics of shocks are such ([11, 23]) that one seemingly needs to know the n-point joint PDF of shock strengths in order to compute the nth order correlation function. The rigorous proof of the above statement together with estimates on the errors of one-shock approximation will make DBT analytically tractable for a wide class of initial conditions as the great deal is known about the one-point function of shock strength, see e. g. [11,23,32,3,4]. Is there a universal technique for the computation of the one-point PDF of the shock strength? It has been known since Burgers [11], but never really exploited, that shocks behave (almost) as a system of sticky particles. One might try therefore to extract the information about one-point PDF of shock characteristics by studying the kinetics of this system, for example, by analyzing the Smoluchowski-Bogoluibov chain of equations for one-point, two-point, . . . PDF’s of shocks.
3.7. On multiscaling in Burgers turbulence. In statistical physics the term “multiscaling”, instead of “anomalous scaling”, is used to stress the inherently multiscale nature of a system exhibiting anomalous scaling of correlation functions. Burgers turbulence is no exception. In this section we will show that the crossover between the tails (37) and (38) of the PDF for velocity differences is actually a reflection of the presence of many correlation lengths in the problem, which in turn is a consequence of the anomalous scaling of correlation functions and, ultimately, the intermittency of the velocity field in Burgers turbulence. Let $n \gg 1$ be a large even positive integer. We know from (32) that as $\bar y$ approaches zero,
$$S_n(L(t)\bar y,t) \approx \frac{1}{\sqrt{\pi}}\,\Gamma\!\left(\frac{n+2}{4}\right)\frac{L(t)}{L}\,U(t)^n\,\bar y.$$
For large $\bar y$ however one expects the quantities $u(L(t)\bar y,t)$ and $u(0,t)$ to become independent. When this happens we have
$$S_n(L(t)\bar y,t) = \big\langle\big(u(L(t)\bar y,t)-u(0,t)\big)^n\big\rangle \sim \langle u^n(L(t)\bar y,t)\rangle + \langle u^n(0,t)\rangle \sim 2M_n = 2\,\frac{\Gamma\!\left(\frac{n+3}{4}\right)}{\sqrt{\pi}\,(n+1)}\,\frac{L(t)}{L}\,U(t)^n.$$
Here the cross terms in expanding the $n$th power are, using the independence, of order $O(L^{-2})$. The region in between these two formulae for large and small $\bar y$ marks the correlation length for the $n$th moments. If we assume there is a simple crossover then we can locate the scale at which it occurs by equating the expressions for large $\bar y$ and small $\bar y$. These become equal, i.e. $S_n(L(t)\bar y,t) \approx 2M_n$, at the value $\bar y \approx n^{-3/4}$ and so the correlation length for the $n$th structure function is
$$L_n \sim \frac{L(t)}{n^{3/4}}, \qquad n \gg 1, \tag{46}$$
and this shows the presence of many scales in our problem. To show how this multiscaling is related to the crossover between the asymptotic regimes (37) and (38) we shall use the PDF for velocity differences to compute $S_n(y,t)$ for $n$ positive and large. Writing $S_n(L(t)\bar y,t)$ as an integral against the PDF of $\Delta u(L(t)\bar y,t)$ and treating $n$ as a large parameter we see that the integral is dominated by values of $|\bar u|$ coming from the neighbourhood of the negative critical point of the function $F(\bar u) = |\bar u|^n\exp(-|\bar u|^4)$, namely near $\bar u_c = -n^{1/4}$. Note that this value is much less than $-1$ for $n \gg 1$ and so we may neglect the part of the integral that uses the PDF in the form (36) and also neglect positive values of $\Delta u(y)$. Now, if in addition $|\bar u_c| \ll \bar y^{-1/3}$, we have to use the asymptotics (37) to evaluate the contribution from the critical point, which yields $S_n(L(t)\bar y,t) \approx C\bar y$. If $|\bar u_c| \gg \bar y^{-1/3}$ we have to use the asymptotics (38) in our computations, which gives $S_n(L(t)\bar y,t) \approx \mathrm{Constant}$. The crossover between these two answers corresponds to the crossover between the asymptotics (37) and (38) and occurs when $\bar y = |\bar u_c|^{-3} = n^{-3/4}$, exactly as in our computed correlation length $L_n$ for the $n$th structure function. It remains to remark that multiscaling, and consequently a PDF for velocity differences which has a crossover between a regime scaling like $y$ and one that is independent of $y$, should be a general feature of DBT regardless of the initial distribution. All related questions concerning other statistics can be studied in more general situations, if one assumes a one-shock approximation is valid, by using the information about the tails of the one-point PDF of shock strength obtained in [32, 3, 4]. It is worth noting that the presence of the multitude of correlation lengths in Burgers turbulence was understood long ago by Robert Kraichnan [25], and rediscovered within the instanton approach to forced Burgers turbulence [12].
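The $n^{-3/4}$ law can be made concrete by equating the small-$\bar y$ form of $S_n$ with $2M_n$, which gives $\bar y = 2\,\Gamma((n+3)/4)/\big((n+1)\,\Gamma((n+2)/4)\big)$. The sketch below, with hypothetical function names, evaluates this crossover together with the exact location $(n/4)^{1/4}$ of the maximum of $|\bar u|^n e^{-|\bar u|^4}$, which agrees with $n^{1/4}$ up to a constant factor.

```python
import math

def crossover(n):
    """ybar at which the small-ybar form of S_n equals 2 M_n."""
    # lgamma avoids overflow of the Gamma function for large n.
    log_ratio = math.lgamma((n + 3) / 4) - math.lgamma((n + 2) / 4)
    return 2.0 * math.exp(log_ratio) / (n + 1)

def saddle_point(n):
    """|ubar| maximizing |ubar|**n * exp(-|ubar|**4)."""
    return (n / 4.0) ** 0.25

for n in (10, 100, 10**4, 10**6):
    print(n, crossover(n) * n ** 0.75, saddle_point(n) / n ** 0.25)
```

In this sketch the rescaled crossover $\bar y\,n^{3/4}$ tends to the constant $\sqrt{2}$ as $n$ grows; only the power of $n$, not the constant, enters (46).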
It is also worth stressing that in models of chaotic systems which do not account for the effects of intermittency, there is always a single universal correlation length. A good example is served by random matrix models, see [26] for a review. Finally, let us remark that if we define the integral scale as the scale of scaling behaviour of correlation functions, we must immediately conclude that there is no such unique scale, there is rather a family of them parameterized by the order of correlation
function. In other words, the notion of the integral scale becomes local, and the notion of the universal inertial range disappears. (See also [14] for the general discussion about the multitude of dissipative scales based on a multifractal models.) This should be a general feature of all intermittent turbulent systems, for instance, Navier–Stokes turbulence. Acknowledgements. We are grateful to E. Balkovski, D. Elworthy, G. Falkovich, U. Frisch2 , J. Gibbon, K. Khanin, S. Kuksin, S. Nazarenko, A. Newell, C. Vassilicos for most illuminating discussions. We are most grateful for the hospitality of the Department of Complex Systems of Weizmann Institute of Science, where part of this work has been carried out. The financial support through the research grant MA1117 from the University of Warwick is also greatly appreciated. Note added in proof. We are grateful to the referee of our paper who drew our attention to a recent preprint by L. Frachebourg and Ph. A. Martin, [13], in which the study of the model of decaying Burgers turbulence initiated by white noise initial conditions (without compactness assumption) has been effectively completed. This model was originally considered by Burgers himself about forty years ago but complexity of analysis prevented him from obtaining explicit answers for anything but the two- and three-point correlation functions of velocity field. Now most of the questions about the statistics of velocity field in Burgers’ model can be effectively resolved using the integral representation of the Green’s function of a diffusion equation in the (x, t)-domain with parabolic boundary derived in the above mentioned paper.
4. Appendix

In order to bound the various error terms in Sect. 3 we will need to bound the size of the true solution $u$, the asymptotic solution $u^*$ and the size of their supports (i.e. the intervals on which they are non-zero). We use details from the method of construction of the vanishing viscosity solution as described in [20] and recalled in Sect. 2. Suppose that the initial velocity profile is supported in the interval $(x_0-l, x_0+l)$. The rightmost (respectively leftmost) parabola in the chain of parabolic arcs built on the initial potential will always lie to the left (respectively right) of the parabola with the same curvature that passes through the point $(x_0+l,-Q)$ (respectively $(x_0-l,-Q)$) and assumes a minimal value equal to $-P$ (respectively $0$). This immediately implies that both $u$ and $u^*$ are supported in the interval $[y_*,y^*]$, where
$$y_* = x_0 - l - \sqrt{-2tQ}, \qquad y^* = x_0 + l + \sqrt{2t(P-Q)}. \tag{47}$$
Using the fact that both $u$ and $u^*$ vanish at the point within $[x_0-l,x_0+l]$ at which $q(x)$ achieves its global minimum, we also find that $|u|$ and $|u^*|$ are bounded by $u_{\max}$, where
$$u_{\max} = \max\big\{(y^*-(x_0-l))/t,\ ((x_0+l)-y_*)/t\big\} = \max\big\{(2l+\sqrt{2t(P-Q)})/t,\ (2l+\sqrt{-2tQ})/t\big\}. \tag{48}$$
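For concreteness, the bounds (47) and (48) translate into a few lines of code. This is merely a sketch; the function name and the sample values of $x_0$, $l$, $t$, $P$, $Q$ are hypothetical, and the constraints $Q \le 0$, $P - Q \ge 0$ are assumed so that the square roots are real.

```python
import math

def support_and_umax(x0, l, t, P, Q):
    """Support [y_lo, y_hi] from (47) and the velocity bound u_max from (48).

    Assumes Q <= 0 and P - Q >= 0.
    """
    y_lo = x0 - l - math.sqrt(-2.0 * t * Q)
    y_hi = x0 + l + math.sqrt(2.0 * t * (P - Q))
    u_max = max((y_hi - (x0 - l)) / t, ((x0 + l) - y_lo) / t)
    return y_lo, y_hi, u_max

# Hypothetical sample values.
x0, l, t, P, Q = 0.3, 1.0, 50.0, 0.7, -1.2
y_lo, y_hi, u_max = support_and_umax(x0, l, t, P, Q)
print(y_lo, y_hi, u_max)
```

Both the width of the support and $t\,u_{\max}$ grow like $\sqrt{t}$, so $u_{\max}$ itself decays like $t^{-1/2}$ at large times.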
Estimates (47), (48) and the bound (7) will be used to estimate all relevant error terms in Sect. 3. The careful analysis of these error terms leads to a better understanding for when the asymptotics for various statistics start to hold. 2 Who asked the very useful, perhaps rhetorical, question “Why study white noise Burgers turbulence at all?”
4.1. Proof of the estimate (18). Applying the above estimates to the error term $R_n(t)$ defined in (13) we obtain
$$\begin{aligned}
|R_n(t)| &= \big\langle|u^n(0,t)-u^{*n}(0,t)|\,\theta(T^*-t)\big\rangle\\
&= \big\langle|u^n(0,t)-u^{*n}(0,t)|\,\chi_{[y_*,y^*]}(0)\,\theta(T^*-t)\big\rangle\\
&\le 2\big\langle u_{\max}^n\,\chi_{[y_*,y^*]}(0)\,\theta(T^*-t)\big\rangle\\
&\le \frac{2}{L}\big\langle(y^*-y_*)\,u_{\max}^n\,\theta(T^*-t)\big\rangle \quad\text{(averaging over } x_0\text{)}\\
&\le \frac{2}{L}\big\langle(y^*-y_*)^2\,u_{\max}^{2n}\big\rangle^{1/2}\big\langle\theta(T^*-t)\big\rangle^{1/2} \quad\text{(Cauchy-Schwarz)}\\
&\le C_n\,\frac{L(t)}{L}\,U^n(t)\left(\frac{t_c}{t}\right)^{1/4},
\end{aligned}$$
where the last inequality uses the estimate (7) and an explicit calculation using $\rho(p,q)$. Comparing the first and the last entries of the presented chain of inequalities we obtain a proof of (18).

4.2. Proof of the estimate on $r_n(t)$ from Sect. 3.1. We may bound $r_n(t)$ as follows:
$$\begin{aligned}
|r_n(t)| &\le \int_{-l}^{l} dx^*\int dp\,dq\;\rho(p,q,x^*)\left(\int_{-L}^{-L+x^*} + \int_{L}^{L+x^*}\right)\frac{dx_0}{2L}\,|U(x_0,0,t,p,q)|^n\\
&\le \int dp\,dq\;\rho(p,q)\left(\int_{-L-l}^{-L+l} + \int_{L-l}^{L+l}\right)\frac{dx_0}{2L}\,|U(x_0,0,t,p,q)|^n\\
&\le \int dp\,dq\;\rho(p,q)\,\theta\big(\sqrt{-2Qt}-(L-l)\big)\,\frac{2l}{L}\left(\frac{L+l}{L}\right)^n U^n(t)\\
&\le \frac{2l}{L}\left(\frac{L+l}{L}\right)^n\left(\frac{L(t)}{L-l}\right)^2\exp\!\left(-\left(\frac{L-l}{L(t)}\right)^4\right)U^n(t),
\end{aligned}$$
where the last inequality follows by an explicit calculation using $\rho(p,q)$. This is exponentially small in $L$ and so does not affect the asymptotics, which take the limit $L \to \infty$ first and preserve only the $O(L^{-1})$ terms. A similar argument controls the error terms of this form for the other statistics considered.
4.3. Proof of the estimate (26). The proof of (26) is similar to that of (18):
$$\begin{aligned}
|R(u,t)| &\le \big\langle|\theta(u(0,t)-u)-\theta(u^*(0,t)-u)|\,\chi_{[y_*,y^*]}(0)\,\theta(T^*-t)\big\rangle\\
&\le 2\big\langle\chi_{[y_*,y^*]}(0)\,\theta(T^*-t)\big\rangle\\
&\le \frac{1}{L}\big\langle(y^*-y_*)\,\theta(T^*-t)\big\rangle\\
&\le \frac{1}{L}\big\langle(y^*-y_*)^2\big\rangle^{1/2}\big\langle\theta(T^*-t)\big\rangle^{1/2}\\
&\le C\,\frac{L(t)}{L}\left(\frac{t_c}{t}\right)^{1/4}.
\end{aligned}$$
4.4. Proof of the estimate (31). We can split this error term into two via
$$|R_n(y,t)| \le 2^n\big\langle\Delta u(y,t)^n\,\theta(T^*-t)\big\rangle + 2^n\big\langle\Delta u^*(y,t)^n\,\theta(T^*-t)\big\rangle, \tag{49}$$
where $\Delta u(y) = u(y/2,t) - u(-y/2,t)$. We show how to bound the first of these terms, the other being entirely similar. The vanishing viscosity solution $u$ takes the form, within its support, of a line with slope $1/t$ plus a series of downward jumps. So we may define $F(x,t)$ to be a non-increasing piecewise constant function so that, for $x$ in the support of $u$,
$$u(y,t) = \frac{y-x_0}{t} + F(y-x_0,t).$$
It is easy to see that $|\Delta F(y,t)| = |F(y/2,t)-F(-y/2,t)| \le 2u_{\max}$. Also $|\Delta u(y,t)| \le |y/t| + |\Delta F(y-x_0,t)|$ whenever one of the points $y/2$ or $-y/2$ is in the support of $u$. So we bound the first term on the right-hand side of (49) by
$$\big\langle(|y/t|+|\Delta F(y-x_0)|)^n\,\chi_{[y_*-(y/2),\,y^*+(y/2)]}(0)\,\theta(T^*-t)\big\rangle \le 2^n\big\langle|\Delta F(y-x_0)|^n\,\theta(T^*-t)\big\rangle + 2^n\big\langle|y/t|^n\,\chi_{[y_*-(y/2),\,y^*+(y/2)]}(0)\,\theta(T^*-t)\big\rangle. \tag{50}$$
The first term on the right-hand side of (50) can be bounded by averaging over $x_0$ first and using
$$\int_{-L}^{L}\frac{dx_0}{2L}\,|\Delta F(y-x_0,t)|^n \le (2u_{\max})^{n-1}\int_{-L}^{L}\frac{dx_0}{2L}\,|\Delta F(y-x_0,t)| \le (2u_{\max})^{n-1}\,\frac{y\,u_{\max}}{L},$$
using in the last inequality the fact that $F$ is decreasing and bounded by $2u_{\max}$. Substituting into (50) one can take the further averaging as for the previous error bounds. By taking $t$ large enough that $L(t) \ge y$ and combining the various terms one arrives at the desired error bound.

4.5. The proof of the estimate (35). The proof of this estimate is similar to that of (31). Noting that $\Delta u(y,t) = \Delta F(y-x_0,t) + (y/t)$ we may write
$$\int\frac{dx_0}{2L}\,\theta\big(u-\Delta u(y,t)\big) = \int\frac{dx_0}{2L}\,\theta\big(|\Delta F(y-x_0)|-(y/t)-|u|\big) \le \int\frac{dx_0}{2L}\,\frac{|\Delta F(y-x_0)|}{(y/t)+|u|} \le \frac{2y\,u_{\max}}{2L}\,\frac{1}{(y/t)+|u|}.$$
A similar estimate holds for $\Delta u^*(y,t)$. Hence
$$|R(u,y,t)| \le \big\langle\big(\theta(u-\Delta u(y,t)) + \theta(u-\Delta u^*(y,t))\big)\,\theta(T^*-t)\big\rangle \le \frac{2y}{L}\,\frac{1}{(y/t)+|u|}\,\big\langle u_{\max}\,\theta(T^*-t)\big\rangle \le \frac{L(t)}{L}\,\frac{\bar y}{\bar y+|\bar u|}\left(\frac{t_c}{t}\right)^{1/4}.$$
4.6. Proof of the estimate (41). The proof of this estimate is similar to that of (35). The key change is to obtain a bound for
$$\int\frac{dx_0}{2L}\,|F(y-x_0,t)-F(y-x_0,t+\tau)|. \tag{51}$$
The piecewise constant profile $F(y,t)$ consists of a series of shocks which may travel forwards or backwards but move with a maximum speed $u_{\max}$. The total height of the shocks is also bounded by $u_{\max}$. So the integral (51) can be bounded by $u_{\max}^2\tau/2L$. The possibility of infinitely many shocks, or the merging of shocks between times $t$ and $t+\tau$, does not affect this upper bound.
4.7. The proof of the estimate (7). The construction of the two shock profile uses two parabolas that pass through the graph of the Brownian motion $-q(x) = -\int_{-\infty}^{x}u_0(z)\,dz$ at its point of maximum. Below is a lemma about the behavior of a Brownian path near its maximum.

Lemma 1. Let $(B_t : 0 \le t \le 1)$ be a standard Brownian motion started at zero. Define
$$M = \sup_{t\in[0,1]}B_t, \qquad \Sigma = \inf\{t : B_t = M\}.$$
We consider the pieces of the path $(B_t)$ on either side of its maximum by defining
$$X_t = M - B_{\Sigma-t} \ \text{ for } t\in[0,\Sigma], \qquad \bar X_t = M - B_{\Sigma+t} \ \text{ for } t\in[0,1-\Sigma].$$
Define the slopes of two lines that pass through the maximum and lie above the path by
$$\Theta = \inf\{X_t/t : 0 < t \le \Sigma\}, \qquad \bar\Theta = \inf\{\bar X_t/t : 0 < t \le 1-\Sigma\}.$$
a) The triples $(M,\Sigma,(X_t : t\le\Sigma))$ and $(M-B_1,\,1-\Sigma,\,(\bar X_t : t\le 1-\Sigma))$ are identically distributed.
b) The law of $(M,\Sigma)$ is given by
$$P(M\in dm,\ \Sigma\in d\sigma) = \frac{m\,\sigma^{-1}\exp(-m^2/2\sigma)}{\pi(\sigma(1-\sigma))^{1/2}}\,dm\,d\sigma.$$
c) Conditional on $M\in dm$, $\Sigma\in d\sigma$ the path $(X_t : t\le\sigma)$ satisfies $X_0 = 0$ and solves the stochastic differential equation, driven by a Brownian motion $(W_t)$,
$$dX_t = f(t,X_t)\,dt + dW_t, \qquad\text{where}\quad f(t,x) = \frac{m-x}{\sigma-t} + \frac{2m}{\sigma-t}\left(\exp\!\Big(\frac{2mx}{\sigma-t}\Big)-1\right)^{-1}. \tag{52}$$
d) For $(X_t)$ that solves (52) we have the estimate
$$P(\Theta\le\theta) \le C\theta(m+\sigma m^{-1}) + I(m\le\theta\sigma).$$
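The drift in part c) can be sanity-checked numerically. The sketch below assumes the standard reflection-principle form of the transition density of Brownian motion killed at zero, $h(t,x) \propto e^{-(m-x)^2/2s} - e^{-(m+x)^2/2s}$ with $s = \sigma - t$, and compares $\partial_x\log h$, computed by a central difference, with $f(t,x)$ from (52); it also illustrates that $f(t,x) \approx 1/x$ for small $t$ and $x$. The function names and sample values are hypothetical.

```python
import math

def drift(t, x, m, sigma):
    """Drift f(t, x) from (52)."""
    s = sigma - t
    return (m - x) / s + (2.0 * m / s) / math.expm1(2.0 * m * x / s)

def log_h(t, x, m, sigma):
    """log of exp(-(m-x)^2/2s) - exp(-(m+x)^2/2s), with s = sigma - t.

    The difference is rewritten as exp(-(m-x)^2/2s) * (1 - exp(-2mx/s)).
    """
    s = sigma - t
    return -(m - x) ** 2 / (2.0 * s) + math.log(-math.expm1(-2.0 * m * x / s))

# Hypothetical sample point.
m, sigma, t, x = 1.0, 0.5, 0.1, 0.3
eps = 1.0e-6
numeric = (log_h(t, x + eps, m, sigma) - log_h(t, x - eps, m, sigma)) / (2.0 * eps)
print(numeric, drift(t, x, m, sigma))

# Near the origin the drift is close to the Bessel drift 1/x.
print(1.0e-3 * drift(1.0e-4, 1.0e-3, m, sigma))
```

The agreement of the two printed numbers in the first line is what the h-transform argument predicts; the last printed value should be close to one.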
Large Time Asymptotics of Decaying Burgers Turbulence
433
We delay the proof of this lemma until the end of this appendix and first use it to prove the estimate (7) on the tail P (T ∗ ≥ t) of the time T ∗ at which the two shock R x profile is obtained. The construction of the two shock profile uses the function q(x) = ∞ u0 (z)dz, its global minimum Q and the position x0 + x ∗ at which the minimum is attained. Two ¯ 2 /2t) are constructed parabolas of the form π(z) = (z − x)2 /2t (and π¯ (z) = P + (z − x) to pass through the point (x0 + x ∗ , −Q). The slopes of the parabolas at the point x0 + x ∗ are (−2Q/t)1/2 (respectively (2(P − Q)/t)1/2 ). Let T (respectively T¯ ) be the smallest time t at which the parabola π (respectively π¯ ) lies above the graph of −q(x). Then the two shock profile is attained for times t ≥ T ∗ = max{T , T¯ }. To apply the lemma we must rescale to obtain a standard Brownian path of length one. Set Bt = −(2lJ )−1/2 q(x0 − l + 2lt) for t ∈ [0, 1]. Then (Bt ) is a standard Brownian motion and its maximum M takes the value −Q/(2lJ )1/2 . The construction of the parabola π (respectively π) ¯ show that if 2 ≤ (−4lQ/tJ )1/2 then t ≤ T (respectively ¯ ≤ ((4l(P − Q)/tJ )1/2 then t ≤ T¯ ). Part a) of the lemma shows that both of these if 2 events have the same probability. So, applying part d) of the lemma, P (T ∗ ≥ t) ≤ 2P (2 ≤ (−4lQ/tJ )1/2 ) = 2P (2 ≤ 25/4 M 1/2 t −1/2 l 3/4 J −1/4 ) ≤ Ct −1/2 l 3/4 J −1/4 E(M 1/2 (M + 6M −1 )) + 2P (M ≤ 25/4 M 1/2 t −1/2 l 3/4 J −1/4 6) ≤ Ct −1/2 l 3/4 J −1/4 using Markov’s inequality in the last inequality and the exact distribution of (M, 6) in part b) of the lemma. This completes the proof of (7) and it remains to describe the proof of the lemma. Part a) of the lemma follows from the symmetry of the problem with respect to the time reversal t → 1 − t. The distribution of (M, 6) is well known and may be obtained for example by exploiting the reflection principle. 
Conditional on $M \in dm$, $\Sigma \in d\sigma$ the path $(X_t)$ becomes a Brownian bridge, taking the value zero at time zero and the value $m$ at time $\sigma$, that is conditioned to never take negative values. The equation describing the evolution can then be obtained using an h-transform as in Rogers and Williams [30] Sect. 4.23. We first sketch the idea for estimating $P(\Theta \le \theta) = P(X_s < \theta s \text{ for some } s \le \sigma)$. The drift $f(t, x)$ in Eq. (52) is approximately $1/x$ for small $t$ and $x$. If this approximation were exact the process $(X_t)$ would satisfy $dX = X^{-1}dt + dW$, which is uniquely solved by the three dimensional Bessel process (the radius of a three dimensional Brownian motion). For a Bessel process one can make use of time inversion via the identity in distribution $(X_t : t > 0) = (tX_{1/t} : t > 0)$ and potential theory for three dimensional Brownian motion, which gives $P(X_s < \theta \text{ for some } s \ge 0 \mid X_0 = x) = \min\{\theta x^{-1}, 1\}$.
Then
$$
\begin{aligned}
P(X_s < \theta s \text{ for some } s \le \sigma) &= P(X_s < \theta \text{ for some } s \ge 1/\sigma) \\
&= E\big(\min\{\theta X_{1/\sigma}^{-1}, 1\}\big) \\
&= E\big(\min\{\theta \sigma^{1/2} X_1^{-1}, 1\}\big) \\
&= \int_0^\infty (2\pi)^{-3/2} r^2 \exp(-r^2/2)\, \min\{\theta \sigma^{1/2} r^{-1}, 1\}\, dr \\
&\le C\theta\sigma^{1/2},
\end{aligned}
$$
where the penultimate equality follows from Brownian scaling and the final equality from a calculation using the density of the Gaussian variable $X_1$.

To exploit this idea we divide the interval $[0, \sigma]$ into two parts, over the first of which the approximation $f(t, x) \approx 1/x$ is sufficiently good. We first estimate $P(X_s < \theta s \text{ for some } s \le \sigma/2)$. Using the elementary inequalities $(1 - z)/2z \le (e^{2z} - 1)^{-1} \le 1/2z$ for all $z > 0$ one obtains the bounds $x^{-1} - x(\sigma - t)^{-1} \le f(t, x) \le x^{-1} + 2m\sigma^{-1}$. Hence
$$
X_t^{-1} - 2\sigma^{-1} X_t \le f(t, X_t) \le X_t^{-1} + 2m(\sigma - t)^{-1} \quad \text{for } t \le \sigma/2. \tag{53}
$$
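The linear bound $E(\min\{a X_1^{-1}, 1\}) \le C a$ used in the sketch above (with $a = \theta\sigma^{1/2}$) can be checked numerically. The sketch below assumes the normalised density $(2/\pi)^{1/2} r^2 e^{-r^2/2}$ for the radius $X_1$ of a three dimensional standard Gaussian vector, for which $E(X_1^{-1}) = (2/\pi)^{1/2}$; the function names and the quadrature parameters are illustrative choices.

```python
import math


def bessel3_density(r):
    """Density of the norm of a standard 3D Gaussian vector (normalised)."""
    return math.sqrt(2.0 / math.pi) * r * r * math.exp(-0.5 * r * r)


def expected_min(a, upper=12.0, steps=200_000):
    """Midpoint-rule approximation of E[min(a / X_1, 1)] on [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += bessel3_density(r) * min(a / r, 1.0)
    return total * h


# Since min(a / r, 1) <= a / r, the expectation is at most a * E[1 / X_1].
LINEAR_CONSTANT = math.sqrt(2.0 / math.pi)
```

For small `a` the computed value is close to `LINEAR_CONSTANT * a`, and it never exceeds 1, consistent with the bound $C\theta\sigma^{1/2}$.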
So the solution of the equation $dY_t = Y_t^{-1}dt - 2\sigma^{-1} Y_t\,dt + dW_t$, $Y_0 = 0$, satisfies $Y_t \le X_t$ for all $t \le \tau$. To remove the unwanted $-2\sigma^{-1} Y_t\,dt$ in the drift of $(Y_t)$ we use a change of measure. Define a new probability measure $Q$ by defining the Radon-Nikodym derivative $M$ by
$$
\begin{aligned}
M = \frac{dQ}{dP}\bigg|_{\mathcal{F}_{\sigma/2}} &= \exp\Big(2\sigma^{-1} \int_0^{\sigma/2} Y_s\,dW_s - 2\sigma^{-2} \int_0^{\sigma/2} Y_s^2\,ds\Big) \\
&= \exp\Big(\sigma^{-1} Y_{\sigma/2}^2 + 2\sigma^{-2} \int_0^{\sigma/2} Y_s^2\,ds - 3/2\Big) \\
&\ge \exp(-3/2).
\end{aligned}
$$
The second equality here follows from Ito's formula. By Girsanov's theorem (see [29]) the process $(Y_t)$ solves $dY = Y^{-1}dt + d\tilde W$ with respect to some Brownian motion $(\tilde W)$ under $Q$, implying that $(Y_t)$ is a three dimensional Bessel process under $Q$. Writing $E_Q$ for the expectation under $Q$ we have
$$
\begin{aligned}
P(X_s < \theta s \text{ for some } s \le \sigma/2) &\le P(Y_s < \theta s \text{ for some } s \le \sigma/2) \\
&= E_Q\big(M^{-1} I(Y_s < \theta s \text{ for some } s \le \sigma/2)\big) \\
&\le e^{3/2}\, Q(Y_s < \theta s \text{ for some } s \le \sigma/2) \\
&\le C\theta\sigma^{1/2},
\end{aligned} \tag{54}
$$
using the argument given above.
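It remains to estimate the probability $P(X_s < \theta s \text{ for some } \sigma/2 \le s \le \sigma)$. We shall further condition on the value of $X_{\sigma/2}$. If $X_{\sigma/2} \in dr$ the evolution of $(X_s : s \in [\sigma/2, \sigma])$ is that of a Brownian bridge starting at $r$, ending at $m$ and conditioned to take non-negative values. We write $Q_x$ for the law of a one-dimensional Brownian motion $(W_t)$ started at $x$ and we define $H_a = \inf\{t : W_t \le a\}$. Then, supposing $r, m \ge \theta\sigma$, we have
$$
\begin{aligned}
P(X_s \le \theta s \text{ for some } s \in [\sigma/2, \sigma] \mid X_{\sigma/2} \in dr)
&= 1 - P(X_s > \theta s \text{ for all } s \in [\sigma/2, \sigma] \mid X_{\sigma/2} \in dr) \\
&\le 1 - P(X_s > \theta\sigma \text{ for all } s \in [\sigma/2, \sigma] \mid X_{\sigma/2} \in dr) \\
&= 1 - Q_r(H_{\theta\sigma} > \sigma/2 \mid W_{\sigma/2} \in dm,\ H_0 > \sigma/2) \\
&= 1 - \frac{Q_r(H_{\theta\sigma} > \sigma/2,\ W_{\sigma/2} \in dm)}{Q_r(H_0 > \sigma/2,\ W_{\sigma/2} \in dm)}.
\end{aligned}
$$
The reflection principle can be used to show that, for $a \le r, m$,
$$
Q_r(H_a > t,\ W_t \in dm) = \big(p_t(m - r) - p_t(m + r - 2a)\big)\, dm, \tag{55}
$$
where $p_t(z) = (2\pi t)^{-1/2} \exp(-z^2/2t)$. Using this we rewrite the last expression as
$$
\begin{aligned}
&\frac{\exp((m + r)^2/\sigma) - \exp((m + r - 2\theta\sigma)^2/\sigma)}{\exp((m + r)^2/\sigma) - \exp((m - r)^2/\sigma)} \\
&\quad = \Big(1 - \exp\Big(\frac{-4mr}{\sigma}\Big)\Big)^{-1} 4\theta\,(m + r - 2\eta)\, \exp\Big(\frac{(m + r - 2\eta)^2 - (m + r)^2}{\sigma}\Big) \quad \text{for some } \eta \in [0, \theta\sigma] \text{ by the mean value theorem} \\
&\quad \le \Big(1 - \exp\Big(\frac{-4mr}{\sigma}\Big)\Big)^{-1} 4\theta\,(m + r - 2\eta) \\
&\quad \le C\theta\Big(1 + \frac{\sigma}{4mr}\Big)(m + r) \quad \text{(using } (1 - e^{-z})^{-1} \le C(1 + z^{-1})\text{)} \\
&\quad \le C\theta\,(m + r + \sigma r^{-1} + \sigma m^{-1}).
\end{aligned}
$$
Thus
$$
P(X_s \le \theta s \text{ for some } s \in [\tau, \sigma] \mid X_\tau \in dr) \le C\theta\,(m + r + \sigma r^{-1} + \sigma m^{-1}) + I(r \le \theta\sigma) + I(m \le \theta\sigma). \tag{56}
$$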
We now undo the conditioning on $X_{\sigma/2} \in dr$. Using the upper bound in (53) and Ito's formula one obtains $dX_t^2 \le (3 + 4m\sigma^{-1} X_t)\,dt + 2X_t\,dW_t$. Taking expectations one has
$$
E(X_t^2) \le 3t + 4m\sigma^{-1} \int_0^t E(X_s)\,ds \le (3 + m^2\sigma^{-1})\,t + 4\sigma^{-1} \int_0^t E(X_s^2)\,ds.
$$
Applying Gronwall's inequality shows that $E(X_{\sigma/2}) \le (E(X_{\sigma/2}^2))^{1/2} \le C(\sigma^{1/2} + m)$. By Markov's inequality $P(X_\tau \le \theta\sigma) \le \theta\sigma\, E(X_\tau^{-1})$. Using the comparison with a Bessel process as before we have $E(X_\tau^{-1}) \le e^{3/2} E_Q(Y_{\sigma/2}^{-1}) \le C\sigma^{-1/2}$. Using these bounds in (56) and combining with (54) leads to the estimate in part d) of the lemma.
References
1. Aurell, E., Frisch, U., Noullez, A., Blank, M.: Bifractality of the Devil's staircase appearing in the Burgers equation with Brownian initial velocity. chao-dyn/961101, published in J. Stat. Phys. 88, 1151–1164
2. Avellaneda, M., Ryan, R., E, Weinan: PDFs for velocity and velocity gradients in Burgers' turbulence. Commun. Math. Phys. 172, no. 1, 13–38 (1995)
3. Avellaneda, M., E, Weinan: Statistical properties of shocks in Burgers turbulence. Commun. Math. Phys. 172, no. 1, 13–38 (1995)
4. Avellaneda, M.: Statistical properties of shocks in Burgers turbulence. II. Tail probabilities for velocities, shock-strengths and rarefaction intervals. Commun. Math. Phys. 169, no. 1, 45–59 (1995)
5. Balkovsky, E., Falkovich, G., Kolokolov, I., Lebedev, V.: Intermittency of Burgers' Turbulence. Phys. Rev. Lett. 78, 1452 (1997)
6. Balkovsky, E., Falkovich, G., Kolokolov, I., Lebedev, V.: Viscous Instanton for Burgers' Turbulence. Phys. Rev. Lett. 78, 1452 (1997)
7. Balkovski, E. and Falkovich, G.: Private communication
8. Bec, J., Frisch, U.: Pdf's of Derivatives and Increments for Decaying Burgers Turbulence. cond-mat/9906047
9. Bernard, D. and Gawedzki, K.: Scaling and exotic regimes in the decaying Burgers turbulence. chao-dyn/9805002
10. Borodin, A.N., Salminen, P.: Handbook of Brownian motion – facts and formulae. Probability and its Applications. Basel: Birkhäuser Verlag, 1996
11. Burgers, J.M.: Statistical problems connected with the solution of a simple non-linear partial differential equation. I, II, III. Nederl. Akad. Wetensch. Proc. Ser. B. 57, 403–413, 414–424, 425–433 (1954)
12. Falkovich, G.: Unpublished
13. Frachebourg, L., Martin, Ph.A.: Exact statistical properties of the Burgers equation. cond-mat/9909056
14. Frisch, U., Vergassola, M.: A prediction of the multifractal model: the intermediate dissipation range. In: New approaches and concepts in turbulence (Monte Verità, 1991), Basel: Birkhäuser, 1993, pp. 29–34
15. Frisch, U.: Turbulence. The legacy of A. N. Kolmogorov. Cambridge: Cambridge University Press, 1995
16. Gotoh, T. and Kraichnan, R.: Statistics of decaying Burgers turbulence. Phys. Fluids A 5, (2), (1993)
17. Gurbatov, S.N., Malakhov, A.N., Saichev, A.I.: Nonlinear random waves and turbulence in nondispersive media: Waves, rays, particles. Translated from the Russian. Supplement 1 by Adrian L. Melott and Sergei F. Shandarin. Supplement 2 by V. I. Arnol'd, Yu. M. Baryshnikov and I. A. Bogayevsky. Translation edited and with a preface by D. G. Crighton. Nonlinear Science: Theory and Applications. Manchester: Manchester University Press, 1991
18. Gurbatov, S.N., Simdyankin, S.I., Aurell, E., Frisch, U., Tóth, G.: On the decay of Burgers turbulence. J. Fluid Mech. 344, 339–374 (1997)
19. Gurarie, V., Migdal, A.: Instantons in Burgers Equation. Phys. Rev. E 54, 4908–4914 (1996)
20. Hopf, E.: The partial differential equation $u_t + uu_x = \mu u_{xx}$. Comm. Pure Appl. Math. 3, 201–230 (1950)
21. E, W., Khanin, K., Mazel, A., and Sinai, Ya.G.: Invariant measures for the random forced Burgers equation. Submitted to Ann. Math.
22. E, W., Khanin, K., Mazel, A., and Sinai, Ya.G.: Probability distribution functions for the random forced Burgers equation. Phys. Rev. Letters 78, 1904–1907 (1997)
23. Kida, S.: Asymptotic properties of Burgers turbulence. J. Fluid Mech. 93, no. 2, 337–377 (1979)
24. Kraichnan, R.H.: Note on Forced Burgers Turbulence. chao-dyn/9901023
25. Kraichnan, R.: Unpublished
26. Mehta, M.L.: Random matrices. Second edition. Boston, MA: Academic Press, Inc., 1991
27. Parker, D.F.: The decay of sawtooth solutions to the Burgers equation. Proc. Roy. Soc. London Ser. A 369, no. 1738, 409–424 (1980)
28. Polyakov, A.M.: Turbulence without pressure. Phys. Rev. E (3) 52, no. 6, part A, 6183–6188 (1995)
29. Revuz, D. and Yor, M.: Continuous Martingales and Brownian Motion. Berlin–Heidelberg–New York: Springer Verlag, 1991
30. Rogers, L.C.G. and Williams, D.: Diffusions, Markov Processes, and Martingales. Volume 2. New York: Wiley, 1986
31. Ryan, R., Avellaneda, M.: The one-point statistics of viscous Burgers turbulence initialized with Gaussian data. Commun. Math. Phys. 200, no. 1, 1–23 (1999)
32. Sinai, Ya.G.: Statistics of shocks in solutions of inviscid Burgers equation. Commun. Math. Phys. 148, no. 3, 601–621 (1992)
33. Truman, A., Zhao, H.Z.: On stochastic diffusion equations and stochastic Burgers' equations. J. Math. Phys. 37, no. 1, 283–307 (1996)
34. Truman, A., Zhao, H.Z.: Stochastic Burgers' equations and their semi-classical expansions. Commun. Math. Phys. 194, 1, 231–248 (1998)
35. E, W., Vanden Eijnden, E.: Statistical Theory for the Stochastic Burgers Equation in the Inviscid Limit. chao-dyn/9904028

Communicated by Ya. G. Sinai
Commun. Math. Phys. 212, 437 – 467 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Slow Motion of Charges Interacting Through the Maxwell Field
Markus Kunze¹, Herbert Spohn²
¹ Mathematisches Institut der Universität Köln, Weyertal 86, 50931 Köln, Germany. E-mail: [email protected]
² Zentrum Mathematik and Physik Department, TU München, 80290 München, Germany. E-mail: [email protected]
Received: 13 January 2000 / Accepted: 4 February 2000
Abstract: We study the Abraham model for $N$ charges interacting with the Maxwell field. On the scale of the charge diameter, $R_\varphi$, the charges are a distance $\varepsilon^{-1} R_\varphi$ apart and have a velocity $\sqrt{\varepsilon}\,c$ with $\varepsilon$ a small dimensionless parameter. We follow the motion of the charges over times of the order $\varepsilon^{-3/2} R_\varphi/c$ and prove that on this time scale their motion is well approximated by the Darwin Lagrangian. The mass is renormalized. The interaction is dominated by the instantaneous Coulomb forces, which are of the order $\varepsilon^2$. The magnetic fields and first order retardation generate the Darwin correction of the order $\varepsilon^3$. Radiation damping would be of the order $\varepsilon^{7/2}$.
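1. Introduction

Classical charges interact through Coulomb forces, as one learns in every course on electromagnetism. Presumably the best realization in nature is a strongly ionized gas, for which the Darwin correction to the Coulomb forces is of importance, since under standard conditions the velocities cannot be considered small as compared to the velocity of light, cf. [7, §65]. Thus, given $N$ charges, with positions $r_\alpha$, velocities $u_\alpha$, charges $e_\alpha$, and masses $m_\alpha$, $\alpha = 1, \dots, N$, their motion is governed by the Lagrangian
$$
\begin{aligned}
L_D = {} & \sum_{\alpha=1}^{N} \Big( \frac{1}{2} m_\alpha u_\alpha^2 + \frac{1}{8c^2} m_\alpha^* u_\alpha^4 \Big) - \frac{1}{2} \sum_{\substack{\alpha,\beta=1 \\ \alpha \ne \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|} \\
& + \frac{1}{4c^2} \sum_{\substack{\alpha,\beta=1 \\ \alpha \ne \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|} \Big( u_\alpha \cdot u_\beta + |r_\alpha - r_\beta|^{-2} (u_\alpha \cdot [r_\alpha - r_\beta])(u_\beta \cdot [r_\alpha - r_\beta]) \Big),
\end{aligned} \tag{1.1}
$$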
$c$ denoting the velocity of light. The first term is the kinetic energy with a $u_\alpha^4$-correction of a strength $m_\alpha^*$ depending on the precise model ($m_\alpha^* = m_\alpha$ for a relativistic particle). The second term is the Coulomb potential, whereas the third term is the Darwin potential, which decays as the Coulomb potential and has a velocity dependent strength.

On a more fundamental level, the forces between the charges are mediated through the electromagnetic field. The instantaneous Coulomb–Darwin interaction is a derived concept only. To understand the emergence of such an interaction, in this paper we will investigate the coupled system, charges and Maxwell field, and we will prove that in a certain limit the motion of the charges is well approximated by the Lagrange equations for $L_D$.

Let us first describe how the charges are coupled to the Maxwell field. To avoid short-distance singularities, we assume that the charge is spread out over a distance $R_\varphi$, which physically is of the order of the classical electron radius. Thus charge $\alpha$ has a charge distribution $\rho_\alpha$ which for simplicity we take to be of the form
$$
\rho_\alpha(x) = e_\alpha \varphi(x), \quad x \in \mathbb{R}^3,
$$
where the form factor $\varphi$ satisfies
$$
0 \le \varphi \in C_0^\infty(\mathbb{R}^3), \quad \varphi(x) = \varphi_r(|x|), \quad \varphi(x) = 0 \ \text{ for } |x| \ge R_\varphi. \tag{C}
$$
To distinguish the true solution from the approximation (1.1), the position of a charge $\alpha$ in the coupled system is denoted by $q_\alpha$ and its velocity by $v_\alpha$, $\alpha = 1, \dots, N$. The charges then generate the charge distribution $\rho$ and the current $j$ given by
$$
\rho(x, t) = \sum_{\alpha=1}^{N} \rho_\alpha(x - q_\alpha(t)) \quad \text{and} \quad j(x, t) = \sum_{\alpha=1}^{N} \rho_\alpha(x - q_\alpha(t))\, v_\alpha(t), \tag{1.2}
$$
which satisfy charge conservation by fiat. The Maxwell field, consisting of the electric field $E$ and the magnetic field $B$, evolves according to
$$
c^{-1} \frac{\partial}{\partial t} B(x, t) = -\nabla \wedge E(x, t), \quad c^{-1} \frac{\partial}{\partial t} E(x, t) = \nabla \wedge B(x, t) - c^{-1} j(x, t) \tag{1.3}
$$
with the constraints
$$
\nabla \cdot E(x, t) = \rho(x, t), \quad \nabla \cdot B(x, t) = 0. \tag{1.4}
$$
The charges generate the electromagnetic field which in turn determines the forces on the charges through the Lorentz force equation
$$
\frac{d}{dt}\big( m_{b\alpha} \gamma_\alpha v_\alpha(t) \big) = \int d^3x\, \rho_\alpha(x - q_\alpha(t)) \big[ E(x, t) + v_\alpha(t) \wedge B(x, t) \big], \quad t \in \mathbb{R}, \tag{1.5}
$$
for $\alpha = 1, \dots, N$. Here $m_{b\alpha}$ is the bare mass of charge $\alpha$ and $\gamma_\alpha$ the relativistic factor $\gamma_\alpha = (1 - v_\alpha^2/c^2)^{-1/2}$, which ensures $|v_\alpha| < c$. Note that there are no direct forces acting between the particles. Equations (1.2)–(1.5) are known as the Abraham model for $N$ charges.
We define the energy function by
$$
\mathcal{H}(E, B, q, v) = \sum_{\alpha=1}^{N} m_{b\alpha} \gamma_\alpha + \frac{1}{2} \int d^3x\, [E^2(x) + B^2(x)], \tag{1.6}
$$
with $q = (q_1, \dots, q_N)$ and $v = (v_1, \dots, v_N)$. It then may be seen that the initial value problem corresponding to (1.2)–(1.5) has a unique weak solution of finite energy and that $\mathcal{H}$ is conserved by this solution, compare with [4] for the case of a single particle.

We assume that initially the particles are very far apart on the scale set by $R_\varphi$. Thus we require, for $\alpha \ne \beta$, that
$$
|q_\alpha(0) - q_\beta(0)| \cong \varepsilon^{-1} R_\varphi \tag{1.7}
$$
with $\varepsilon > 0$ small. If particles came together as close as $R_\varphi$, our equations of motion would not be trustworthy anyhow. In addition, we require that the initial velocities be small compared to the speed of light,
$$
|v_\alpha(0)| \cong \sqrt{\varepsilon}\, c. \tag{1.8}
$$
Subject to these restrictions, in essence, the initial electromagnetic field is chosen such as to minimize the energy function $\mathcal{H}$ from (1.6), cf. Sect. 5.1 for precise statements and estimates.

With these initial conditions, for the particles to travel a distance of order $\varepsilon^{-1} R_\varphi$ it will take a time of order $\varepsilon^{-3/2} R_\varphi/c$, which will be the time scale of interest. Thus physically we consider slow particles that are far apart, and we want to follow their motion over long times. Next note that it takes a time of order $\varepsilon^{-1} R_\varphi/c$ for a signal to travel between the particles. This means that on the time scale of interest, retardation effects are small. If particles interact through Coulomb forces, as will have to be proved, the strength of the forces is of order $\varepsilon^2$ since the distance is of order $\varepsilon^{-1} R_\varphi$. Followed over a time span $\varepsilon^{-3/2} R_\varphi/c$, this yields a change in velocity of order $\sqrt{\varepsilon}\, c$. On this basis we expect the orders of magnitude (1.7) and (1.8) to remain valid over times of order $\varepsilon^{-3/2} R_\varphi/c$. There is one subtle point here, however. The self-interaction of a charge with the fields renormalizes its mass. Thus in (1.1) the quantity $m_\alpha$ cannot be the bare mass of the charge; the electromagnetic mass has to be added.
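The order-of-magnitude argument above is pure bookkeeping of powers of $\varepsilon$. The following small sketch tracks those exponents in units where $R_\varphi = c = 1$; the variable names are illustrative, and only the scalings (1.7) and (1.8) are taken from the text.

```python
from fractions import Fraction

# Exponents of epsilon, in units R_phi = c = 1, read off from (1.7) and (1.8).
DISTANCE = Fraction(-1)      # |q_alpha - q_beta| ~ eps^(-1)
VELOCITY = Fraction(1, 2)    # |v_alpha| ~ eps^(1/2)

# Time needed to travel a distance eps^(-1) at speed eps^(1/2).
travel_time = DISTANCE - VELOCITY

# Time for a light signal (speed 1) to cross the same distance.
signal_time = DISTANCE

# Coulomb force ~ 1 / distance^2.
coulomb_force = -2 * DISTANCE

# Velocity change accumulated by that force over the travel time.
velocity_change = coulomb_force + travel_time

# Retardation is small when signal_time / travel_time -> 0, i.e. exponent > 0.
retardation_ratio = signal_time - travel_time
```

The exponents come out as $-3/2$ for the time scale, $2$ for the Coulomb force and $1/2$ for the velocity change, so the scaling (1.8) is self-consistent over the time scale of interest.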
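In theoretical physics it is common practice to count the post-Coulombian corrections in orders of $v/c$ relative to the motion through pure Coulomb forces. Thus the Darwin term is the first correction and of order $(v/c)^2$. The next correction is of order $(v/c)^3$ and accounts for damping through radiation. If we push the Taylor expansion in Sect. 3 one term further, one obtains
$$
\frac{d}{dt} \frac{\partial L_D}{\partial u_\alpha} = \frac{\partial L_D}{\partial r_\alpha} + (e_\alpha / 6\pi c^3) \sum_{\beta=1}^{N} e_\beta \ddot v_\beta, \tag{1.9}
$$
$\alpha = 1, \dots, N$. The physical solution has to be on the center manifold for (1.9). At the present level of precision it suffices to substitute the Hamiltonian dynamics to lowest order, which yields
$$
\frac{d}{dt} \frac{\partial L_D}{\partial u_\alpha} = \frac{\partial L_D}{\partial r_\alpha} + \frac{e_\alpha}{6\pi c^3}\, \frac{1}{2} \sum_{\substack{\beta,\beta'=1 \\ \beta \ne \beta'}}^{N} \Big( \frac{e_\beta}{m_\beta} - \frac{e_{\beta'}}{m_{\beta'}} \Big) \frac{e_\beta e_{\beta'}}{4\pi |r_\beta - r_{\beta'}|^3} \Big( (u_\beta - u_{\beta'}) - 3\, \frac{(r_\beta - r_{\beta'}) \cdot (u_\beta - u_{\beta'})}{|r_\beta - r_{\beta'}|^2}\, (r_\beta - r_{\beta'}) \Big).
$$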
Note that if the ratio $e_\alpha/m_\alpha$ does not depend on $\alpha$, then the radiation reaction vanishes and the system does not emit dipole radiation. The next order correction is $(v/c)^4$ and of Lagrangian form. It is discussed in [7] and [1].

While in electrodynamics corrections of order higher than the radiation reaction are of marginal interest, in general relativity there is a huge effort to obtain very precise corrections to the Newtonian orbits, a problem that is similar to the one discussed here. The most famous example is the Hulse–Taylor binary pulsar, where two highly compact neutron stars of roughly solar mass revolve around each other with a period of 7.8 h [9]. In this case $(v/c) \cong 10^{-3}$. For gravitational systems there is only quadrupole radiation, which is of order $(v/c)^5$. To this order the theory agrees with the observed radio signals within 0.3%. In newly designed experiments one expects highly improved precision which will require corrections up to order $(v/c)^{11}$.

2. Main Results

We recall the initial conditions for the Abraham model (1.2)–(1.5), where we set $c = 1$ throughout for simplicity. For the initial positions $q_\alpha^0 = q_\alpha(0)$ we require
$$
C_1 \varepsilon^{-1} \le |q_\alpha^0 - q_\beta^0| \le C_2 \varepsilon^{-1}, \quad \alpha \ne \beta, \tag{2.1}
$$
for some constants $C_1, C_2 > 0$. For the initial velocities $v_\alpha^0 = v_\alpha(0)$ we assume
$$
|v_\alpha^0| \le C_3 \sqrt{\varepsilon} \tag{2.2}
$$
with $C_3 > 0$. The initial fields are a sum over charge solitons,
$$
E(x, 0) = E^0(x) = \sum_{\alpha=1}^{N} E_{v_\alpha^0}(x - q_\alpha^0) \quad \text{and} \quad B(x, 0) = B^0(x) = \sum_{\alpha=1}^{N} B_{v_\alpha^0}(x - q_\alpha^0). \tag{2.3}
$$
Here
$$
E_v(x) = -\nabla \phi_v(x) + (v \cdot \nabla \phi_v(x))\, v \quad \text{and} \quad B_v(x) = -v \wedge \nabla \phi_v(x) \tag{2.4}
$$
and the Fourier transform of $\phi_v$ is given by
$$
\hat\phi_v(k) = e\, \hat\varphi(k) / [k^2 - (k \cdot v)^2], \tag{2.5}
$$
where it is understood that in $\phi_{v_\alpha^0}$ we have to set $e = e_\alpha$. For this choice of data, the constraints (1.4) are satisfied for $t = 0$ and therefore for all $t$. In case $N = 1$, the particle would travel freely, $q_1(t) = q_1^0 + v_1^0 t$, $t \ge 0$, and the co-moving electromagnetic fields would maintain their form (2.3).

In spirit, the bounds (2.1) and (2.2) should propagate in time, as should the form (2.3) of the electromagnetic fields, at least approximately. On the other hand, for two particles with opposite charge one particular solution is the head-on collision, which violates the lower bound in (2.1). Considerably more delicate are solutions where some particles reach infinity in finite time, [8, 10]. Thus we simply require that for given constants $C_*, C^* > 0$ the bound
$$
C_* \varepsilon^{-1} \le \sup_{t \in [0,\, T\varepsilon^{-3/2}]} |q_\alpha(t) - q_\beta(t)| \le C^* \varepsilon^{-1}, \quad \alpha \ne \beta, \tag{2.6}
$$
holds, which implicitly defines the first time, $T$, at which (2.6) is violated. In fact (2.6) looks like an uncheckable assumption. But, as will be shown, the optimal $T$ can be computed on the basis of the approximation dynamics generated by the Lagrangian (1.1).

Under the assumption (2.6) the velocity bound propagates through the conservation of energy. We define the electrostatic energy of the charge distributions as
$$
E_{\rm stat} = \sum_{\alpha=1}^{N} e_\alpha^2\, \frac{1}{2} \int d^3k\, |\hat\varphi(k)|^2 k^{-2}, \tag{2.7}
$$
and compute the energy (1.6) for the given initial data. Then
$$
\mathcal{H}(0) := \mathcal{H}(t = 0) = \sum_{\alpha=1}^{N} m_{b\alpha} \gamma(v_\alpha^0) + E_{\rm stat} + O(\varepsilon)
$$
with $\gamma(v) = (1 - v^2)^{-1/2}$. We minimize the electromagnetic field energy $\mathcal{H}_f(t) = \frac{1}{2} \int d^3x\, [E^2(x, t) + B^2(x, t)]$ at time $t$ for given $\rho$ and $j$, i.e., for given positions $q(t)$ and velocities $v(t)$. Using (2.6) it may be shown that
$$
\mathcal{H}(t) \ge \sum_{\alpha=1}^{N} m_{b\alpha} \gamma(v_\alpha(t)) + E_{\rm stat} + O(\varepsilon).
$$
Since by energy conservation $\mathcal{H}(0) = \mathcal{H}(t)$ and since the dominant contributions $E_{\rm stat}$ cancel exactly, we thus will continue to have the bound $|v_\alpha(t)| \cong C\sqrt{\varepsilon}$. (We refer to Sect. 5.1 in Appendix A for the complete argument.) Therefore
$$
\sup_{t \in [0,\, T\varepsilon^{-3/2}]} |v_\alpha(t)| \le C_v \sqrt{\varepsilon} \tag{2.8}
$$
with some constant $C_v > 0$.

As a next step we solve the inhomogeneous Maxwell equations for the fields and insert them into the Lorentz force equations. According to the retarded part of the fields, retarded positions $q_\alpha(s)$, $s \in [0, t]$, will show up. To control the Taylor expansion of $q_\alpha(t) - q_\alpha(s)$ and thus of the retarded force, including the Darwin term, we will need bounds not only on positions and velocities, but also on $\dot v_\alpha$ and $\ddot v_\alpha$. Implicitly they use that the true fields remain close to the fields of the form (2.3) evaluated at current positions and velocities.

Lemma 2.1. Let the initial data for the Abraham model satisfy (2.1), (2.2), and (2.3). Moreover, assume
$$
C_* \varepsilon^{-1} \le \sup_{t \in [0,\, T\varepsilon^{-3/2}]} |q_\alpha(t) - q_\beta(t)|, \quad \alpha \ne \beta, \tag{2.9}
$$
for some $T > 0$. Then there exist constants $C^*, C_v > 0$ such that (2.6) and (2.8) hold. In particular, $\sup_{t \in [0,\, T\varepsilon^{-3/2}]} |v_\alpha(t)| \le \bar v < 1$ for some $\bar v$. In addition, we find $C > 0$ and $\bar e > 0$ such that
$$
\sup_{t \in [0,\, T\varepsilon^{-3/2}]} |\dot v_\alpha(t)| \le C\varepsilon^2 \quad \text{and} \quad \sup_{t \in [0,\, T\varepsilon^{-3/2}]} |\ddot v_\alpha(t)| \le C\varepsilon^{7/2} \tag{2.10}
$$
in case that $|e_\alpha| \le \bar e$, $\alpha = 1, \dots, N$. In the estimates (2.6), (2.8), and (2.10), $C$ and $\bar e$ depend only on $T$ and the bounds for the initial data, but not on $\varepsilon$.
The proof of this lemma is rather technical and will be given in Appendix A. Using the bounds of Lemma 2.1, we expand the Lorentz force up to an error of order $\varepsilon^{7/2}$, cf. Lemma 3.5, which is the order of radiation damping (the Coulomb force is of order $\varepsilon^2$ and radiation damping is a relative order $\varepsilon^{3/2}$ smaller). The terms up to order $\varepsilon^3$ can then be collected in the form of the Darwin Lagrangian (1.1). We set
$$
m_\alpha = m_{b\alpha} + \frac{4}{3} e_\alpha^2 m_e \quad \text{and} \quad m_\alpha^* = m_{b\alpha} + \frac{16}{15} e_\alpha^2 m_e
$$
with the electromagnetic mass $m_e = \frac{1}{2} \int d^3k\, |\hat\varphi(k)|^2 k^{-2}$ and the Darwin Lagrangian
$$
\begin{aligned}
L_D(r, u) = {} & \sum_{\alpha=1}^{N} \Big( \frac{1}{2} m_\alpha u_\alpha^2 + \frac{\varepsilon}{8} m_\alpha^* u_\alpha^4 \Big) - \frac{1}{2} \sum_{\substack{\alpha,\beta=1 \\ \alpha \ne \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|} \\
& + \frac{\varepsilon}{4} \sum_{\substack{\alpha,\beta=1 \\ \alpha \ne \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|} \Big( u_\alpha \cdot u_\beta + |r_\alpha - r_\beta|^{-2} (u_\alpha \cdot [r_\alpha - r_\beta])(u_\beta \cdot [r_\alpha - r_\beta]) \Big)
\end{aligned}
$$
for $r = (r_1, \dots, r_N)$ and $u = (u_1, \dots, u_N)$. The comparison dynamics is then
$$
\frac{d}{dt} \frac{\partial L_D}{\partial u_\alpha} = \frac{\partial L_D}{\partial r_\alpha}, \quad \alpha = 1, \dots, N. \tag{2.11}
$$
It conserves the energy
$$
H_D(r, u) = \sum_{\alpha=1}^{N} \Big( \frac{1}{2} m_\alpha u_\alpha^2 + \frac{3}{8} \varepsilon\, m_\alpha^* u_\alpha^4 \Big) + \frac{1}{2} \sum_{\substack{\alpha,\beta=1 \\ \alpha \ne \beta}}^{N} \frac{e_\alpha e_\beta}{4\pi |r_\alpha - r_\beta|}. \tag{2.12}
$$
Because of the Coulomb singularity, in general the solutions to (2.11) will exist only locally in time, the only exception being when all charges have the same sign, in which case energy conservation yields global existence. In the corresponding gravitational problem, for a set of positive phase space measure, mass can be transported to infinity in a finite time, [10]. We do not know whether this can happen also for the Coulomb problem.

We set
$$
q_\alpha^0 = \varepsilon^{-1} r_\alpha^0 \quad \text{and} \quad v_\alpha^0 = \sqrt{\varepsilon}\, u_\alpha^0, \quad \alpha = 1, \dots, N, \tag{2.13}
$$
with $r_\alpha^0 \ne r_\beta^0$ for $\alpha \ne \beta$. Then (2.1) and (2.2) are satisfied. During the initial time slip of order $\varepsilon^{-1}$ the fields build up the forces between particles and adjust to their motion. Thus during that period the dynamics of the particles is not well approximated by the Darwin Lagrangian and we correct the initial data of the comparison dynamics to the true positions and velocities only at the end of the initial time slip. To take into account that the comparison dynamics will have no global solutions in time, in general, we define $\tau \in\, ]0, \infty]$ to be the first time when either $\lim_{t \to \tau^-} |r_\alpha(t) - r_\beta(t)| = 0$ for some $\alpha \ne \beta$ or $\lim_{t \to \tau^-} |r_\alpha(t)| = \infty$ for some $\alpha$ holds for the comparison dynamics (2.11).
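To make the structure of the rescaled Darwin Lagrangian concrete, here is a minimal sketch that evaluates $L_D(r, u)$ for given positions and velocities, with the double sums running over ordered pairs $\alpha \ne \beta$ as in the text. The function names, the argument layout (lists of 3-tuples) and the sample values are illustrative assumptions; the masses $m_\alpha$, $m_\alpha^*$ are simply passed in rather than computed from a form factor.

```python
import math


def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def darwin_lagrangian(r, u, e, m, m_star, eps):
    """Evaluate L_D(r, u): kinetic part, minus Coulomb, plus Darwin term.

    r, u: lists of 3-tuples (positions, velocities); e, m, m_star: lists of floats.
    """
    n = len(r)
    kinetic = 0.0
    for a in range(n):
        u2 = dot(u[a], u[a])
        kinetic += 0.5 * m[a] * u2 + eps / 8.0 * m_star[a] * u2 ** 2

    coulomb = 0.0
    darwin = 0.0
    for a in range(n):
        for b in range(n):
            if a == b:
                continue  # sums run over ordered pairs with a != b
            d = [x - y for x, y in zip(r[a], r[b])]
            dist = math.sqrt(dot(d, d))
            pref = e[a] * e[b] / (4.0 * math.pi * dist)
            coulomb += 0.5 * pref
            darwin += eps / 4.0 * pref * (
                dot(u[a], u[b]) + dot(u[a], d) * dot(u[b], d) / dist ** 2
            )
    return kinetic - coulomb + darwin
```

For two opposite unit charges at rest a distance 2 apart, only the Coulomb term survives and the value is $1/(8\pi)$; for a single moving charge only the kinetic part contributes.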
As our main approximation result we state

Theorem 2.2. Let $T > 0$ be fixed. Define $\tau \in\, ]0, \infty]$ as above and fix some $\delta_0 \in\, ]0, \tau[$. For the Abraham model let the initial data be given by (2.13) and (2.3). Furthermore we require $|e_\alpha| \le \bar e$, with $\bar e = \bar e(T, \text{data}) > 0$ from Lemma 2.1. Let $t_0 = 4(R_\varphi + C^* \varepsilon^{-1})$. We adjust the initial data of the comparison dynamics such that $q_\alpha(t_0) = \varepsilon^{-1} r_\alpha(\varepsilon^{3/2} t_0)$ and $v_\alpha(t_0) = \sqrt{\varepsilon}\, u_\alpha(\varepsilon^{3/2} t_0)$, $\alpha = 1, \dots, N$. Then there exists a constant $C > 0$ such that for all $t \in [t_0, \min\{\tau - \delta_0, T\}\, \varepsilon^{-3/2}]$ we have
$$
|q_\alpha(t) - \varepsilon^{-1} r_\alpha(\varepsilon^{3/2} t)| \le C\sqrt{\varepsilon}, \quad |v_\alpha(t) - \sqrt{\varepsilon}\, u_\alpha(\varepsilon^{3/2} t)| \le C\varepsilon^2, \quad \alpha = 1, \dots, N. \tag{2.14}
$$

Remarks. (i) If we are satisfied with the precision from the pure Coulomb dynamics, then in (2.14) we lose one power in $\varepsilon$. In this case, we can adjust the initial data of the comparison dynamics at time $t = 0$, and then (2.14) holds for all $t \in [0, \min\{\tau - \delta_0, T\}\, \varepsilon^{-3/2}]$.
(ii) In fact the initial data need not be adjusted exactly at $t = t_0$; a bound $|q_\alpha(t_0) - \varepsilon^{-1} r_\alpha(\varepsilon^{3/2} t_0)| \sim \sqrt{\varepsilon}$ and $|v_\alpha(t_0) - \sqrt{\varepsilon}\, u_\alpha(\varepsilon^{3/2} t_0)| \sim \varepsilon^2$ would be sufficient.

3. Self-Action and Mutual Interaction

In this section we expand the Lorentz force term
$$
F_\alpha(t) = \int d^3x\, \rho_\alpha(x - q_\alpha(t)) \big[ E(x, t) + v_\alpha(t) \wedge B(x, t) \big]. \tag{3.1}
$$
Since the fields $(E, B)$ are a solution to the inhomogeneous Maxwell's equations, we may decompose them in the initial and the retarded fields,
$$
E(x, t) = E^{(0)}(x, t) + E^{(r)}(x, t) \quad \text{and} \quad B(x, t) = B^{(0)}(x, t) + B^{(r)}(x, t),
$$
where
$$
\begin{aligned}
\hat E^{(0)}(k, t) &= \cos|k|t\ \hat E(k, 0) - i\, \frac{\sin|k|t}{|k|}\, k \wedge \hat B(k, 0), \\
\hat B^{(0)}(k, t) &= \cos|k|t\ \hat B(k, 0) + i\, \frac{\sin|k|t}{|k|}\, k \wedge \hat E(k, 0), \\
\hat E^{(r)}(k, t) &= -\int_0^t ds\, \cos|k|(t - s)\ \hat j(k, s) + i \int_0^t ds\, \frac{\sin|k|(t - s)}{|k|}\, \hat\rho(k, s)\, k, \\
\hat B^{(r)}(k, t) &= -i \int_0^t ds\, \frac{\sin|k|(t - s)}{|k|}\, k \wedge \hat j(k, s),
\end{aligned}
$$
cf. [6, Sect. 4], with $j(x, t)$ and $\rho(x, t)$ from (1.2). Accordingly we can rewrite $F_\alpha(t)$ in (3.1) as
$$
\begin{aligned}
F_\alpha(t) &= \int d^3x\, \rho_\alpha(x - q_\alpha(t)) \big[ E^{(0)}(x, t) + v_\alpha(t) \wedge B^{(0)}(x, t) \big] + \int d^3x\, \rho_\alpha(x - q_\alpha(t)) \big[ E^{(r)}(x, t) + v_\alpha(t) \wedge B^{(r)}(x, t) \big] \\
&= F_\alpha^{(0)}(t) + F_\alpha^{(r)}(t).
\end{aligned} \tag{3.2}
$$
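First we consider $F_\alpha^{(0)}(t)$.

Lemma 3.1. For $t \in [t_0, T\varepsilon^{-3/2}]$, with $t_0 = 4(R_\varphi + C^* \varepsilon^{-1})$, we have $F_\alpha^{(0)}(t) = 0$.

Proof. If $S(t)$ denotes the solution group generated by the free wave equation in $D^{1,2}(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3)$, it follows from (2.3) through Fourier transform that
$$
\begin{pmatrix} E^{(0)}(x, t) \\ \dot E^{(0)}(x, t) \end{pmatrix}
= \Big[ S(t) \begin{pmatrix} E^{(0)}(\cdot, 0) \\ \dot E^{(0)}(\cdot, 0) \end{pmatrix} \Big](x)
= -\sum_{\beta=1}^{N} e_\beta \int_{-\infty}^{0} ds\, \big[ S(t - s)\, \Phi_E^\beta(\cdot - q_\beta^0 - v_\beta^0 s) \big](x),
$$
where $\Phi_E^\beta(x) = (\varphi(x) v_\beta^0, \nabla\varphi(x))$. The analogous formula is valid for $B^{(0)}(x, t)$, with $\Phi_E^\beta$ to be replaced with $\Phi_B^\beta(x) = (0, v_\beta^0 \wedge \nabla\varphi(x))$. For fixed $1 \le \beta \le N$ and $x \in \mathbb{R}^3$ with $|x - q_\beta^0| \le t - R_\varphi$ assumption (C) yields $[S(t - s)\, \Phi_E^\beta(\cdot - q_\beta^0 - v_\beta^0 s)]_1(x) = 0$ for all $s \le 0$ by means of Kirchhoff's formula and Lemma 2.1, $[\dots]_1$ denoting the first component. As for $t \in [t_0, T\varepsilon^{-3/2}]$ and $|x - q_\beta^0| > t - R_\varphi$ we obtain
$$
|x - q_\alpha(t)| \ge |x - q_\beta^0| - |q_\alpha(t) - q_\beta(t)| - |q_\beta(t) - q_\beta^0| \ge t - R_\varphi - C^* \varepsilon^{-1} - C\sqrt{\varepsilon}\, t \ge t_0/2 - R_\varphi - C^* \varepsilon^{-1} \ge R_\varphi
$$
for $\varepsilon$ small by Lemma 2.1, the claim follows. □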
Turning then to $F_\alpha^{(r)}(t)$ in (3.2), we write this term in Fourier transformed form and use (1.2) to obtain
$$
F_\alpha^{(r)}(t) = e_\alpha^2 F_{\alpha\alpha}^{(r)}(t) + \sum_{\substack{\beta=1 \\ \beta \ne \alpha}}^{N} e_\alpha e_\beta F_{\alpha\beta}^{(r)}(t), \tag{3.3}
$$
with
$$
F_{\alpha\beta}^{(r)}(t) = \int_0^t ds \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot [q_\alpha(t) - q_\beta(s)]} \Big\{ -\cos|k|(t - s)\ v_\beta(s) + i\, \frac{\sin|k|(t - s)}{|k|}\, k - i\, \frac{\sin|k|(t - s)}{|k|}\, v_\alpha(t) \wedge (k \wedge v_\beta(s)) \Big\}, \tag{3.4}
$$
$\alpha, \beta = 1, \dots, N$. The term $F_{\alpha\alpha}^{(r)}(t)$ accounts for the self-force, whereas $F_{\alpha\beta}^{(r)}(t)$ for $\beta \ne \alpha$ represents the mutual interaction force between particle $\alpha$ and particle $\beta$. Both of these contributions are dealt with separately in the following two subsections. Before going on to this, we state an auxiliary result.
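Lemma 3.2. Let $1 \le \alpha, \beta \le N$, $\alpha \ne \beta$. For $t \in [t_0, T\varepsilon^{-3/2}]$ we have
$$
\begin{aligned}
\text{(a)} \quad & -\int_0^t ds \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot [q_\alpha(t) - q_\beta(s)]} \cos|k|(t - s)\ v_\beta(s) \\
& \quad = -\int_0^\infty d\tau \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot \xi_{\alpha\beta}} \cos|k|\tau\ \big\{ v_\beta - i\tau (k \cdot v_\beta) v_\beta - \tau \dot v_\beta \big\} + O(\varepsilon^{7/2}), \\
\text{(b)} \quad & i \int_0^t ds \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot [q_\alpha(t) - q_\beta(s)]}\, \frac{\sin|k|(t - s)}{|k|}\, k \\
& \quad = i \int_0^\infty d\tau \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot \xi_{\alpha\beta}}\, \frac{\sin|k|\tau}{|k|}\, k\, \Big[ 1 - ik \cdot \Big( \tau v_\beta - \frac{1}{2} \tau^2 \dot v_\beta \Big) - \frac{1}{2} \tau^2 (k \cdot v_\beta)^2 \Big] + O(\varepsilon^{7/2}), \\
\text{(c)} \quad & (-i) \int_0^t ds \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot [q_\alpha(t) - q_\beta(s)]}\, \frac{\sin|k|(t - s)}{|k|}\, v_\alpha(t) \wedge (k \wedge v_\beta(s)) \\
& \quad = (-i) \int_0^\infty d\tau \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot \xi_{\alpha\beta}}\, \frac{\sin|k|\tau}{|k|}\, v_\alpha \wedge (k \wedge v_\beta) + O(\varepsilon^{7/2}).
\end{aligned}
$$
Here $v_\alpha = v_\alpha(t)$, etc., and $\xi_{\alpha\beta} = q_\alpha(t) - q_\beta(t)$.

The proof is somewhat tedious and given in Appendix B.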
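3.1. Self-action. For $t \in [t_0, T\varepsilon^{-3/2}]$ we have
$$
\begin{aligned}
F_{\alpha\alpha}^{(r)}(t) = {} & \int_0^\infty d\tau \int dk\, |\hat\varphi(k)|^2 e^{-i(k \cdot v_\alpha)\tau} \Big[ 1 + \frac{i}{2} (k \cdot \dot v_\alpha) \tau^2 \Big] \\
& \cdot \Big\{ -\cos|k|\tau\ [v_\alpha - \dot v_\alpha \tau] + i\, \frac{\sin|k|\tau}{|k|}\, k - i\, \frac{\sin|k|\tau}{|k|}\, v_\alpha \wedge (k \wedge [v_\alpha - \dot v_\alpha \tau]) \Big\} + O(\varepsilon^{7/2}).
\end{aligned} \tag{3.5}
$$
The rigorous proof of this relation is omitted since it is very similar to the proof of Lemma 3.2 given in Appendix B. It once more relies on the fact that we may Taylor expand
$$
q_\alpha(s) \cong q_\alpha - v_\alpha \tau + \frac{1}{2} \dot v_\alpha \tau^2 + O(\varepsilon^{7/2}), \quad v_\alpha(s) \cong v_\alpha - \dot v_\alpha \tau + O(\varepsilon^{7/2})
$$
by Lemma 2.1, with $q_\alpha = q_\alpha(t)$, etc. and $\tau = t - s$, whence
$$
e^{-ik \cdot [q_\alpha(t) - q_\alpha(s)]} \cong e^{-i(k \cdot v_\alpha)\tau} \Big[ 1 + \frac{i}{2} (k \cdot \dot v_\alpha) \tau^2 \Big] + O(\varepsilon^{7/2}).
$$
Introducing
$$
I_p = \int_0^{\bar t} d\tau\, \frac{\sin(|k|\tau)}{|k|}\, e^{-i(k \cdot v_\alpha)\tau} \tau^p, \quad J_p = \int_0^{\bar t} d\tau\, \cos(|k|\tau)\, e^{-i(k \cdot v_\alpha)\tau} \tau^p, \quad p \in \mathbb{N}_0,
$$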
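Eq. (3.5) may be rewritten as
$$
\begin{aligned}
F_{\alpha\alpha}^{(r)}(t) = \lim_{\bar t \to \infty} \bigg( & -\int dk\, |\hat\varphi(k)|^2 \Big[ v_\alpha J_0 - \dot v_\alpha J_1 + \frac{i}{2} (k \cdot \dot v_\alpha) v_\alpha J_2 \Big] \\
& + \int dk\, |\hat\varphi(k)|^2 \Big\{ i\, [(1 - v_\alpha^2) k + (k \cdot v_\alpha) v_\alpha]\, I_0 + i\, [(v_\alpha \cdot \dot v_\alpha) k - (k \cdot v_\alpha) \dot v_\alpha]\, I_1 \\
& \qquad - \frac{1}{2} (k \cdot \dot v_\alpha) [(1 - v_\alpha^2) k + (k \cdot v_\alpha) v_\alpha]\, I_2 \Big\} \bigg) + O(\varepsilon^{7/2}),
\end{aligned} \tag{3.6}
$$
since $\dot v_\alpha^2 = O(\varepsilon^4)$. Denote the term containing the $J_p$ by $\mathcal{J}$ and the one containing the $I_p$ by $\mathcal{I}$. To evaluate the limits $\bar t \to \infty$, we can rely on the results from [6, Sect. 4]. We first recall that
$$
\int dk\, |\hat\varphi(k)|^2 J_0 \to 0, \quad \int dk\, |\hat\varphi(k)|^2 J_1 \to -2 m_e \gamma_\alpha^2 \quad \text{as } \bar t \to \infty,
$$
with $\gamma_\alpha = (1 - v_\alpha^2)^{-1/2}$ and $m_e = \frac{1}{2} \int dk\, |\hat\varphi(k)|^2 k^{-2}$. Moreover, $\nabla_v J_1 = -ik J_2$, and therefore
$$
\begin{aligned}
\mathcal{J} &\to (-2 m_e \gamma_\alpha^2)\, \dot v_\alpha + \frac{1}{2} \big[ \dot v_\alpha \cdot \nabla_v (-2 m_e \gamma_\alpha^2) \big] v_\alpha = -2 m_e \big[ \gamma_\alpha^2 \dot v_\alpha + \gamma_\alpha^4 (v_\alpha \cdot \dot v_\alpha) v_\alpha \big] \\
&= -2 m_e \big[ (1 + v_\alpha^2) \dot v_\alpha + (v_\alpha \cdot \dot v_\alpha) v_\alpha \big] + O(\varepsilon^4)
\end{aligned} \tag{3.7}
$$
as $\bar t \to \infty$, the latter equality according to the expansion $\gamma_\alpha^2 = 1 + v_\alpha^2 + O(v_\alpha^4) = 1 + v_\alpha^2 + O(\varepsilon^2)$ and $\gamma_\alpha^4 = 1 + O(\varepsilon)$.

As concerns $\mathcal{I}$, we know from [3, 6] that $\int dk\, |\hat\varphi(k)|^2 k I_0 \to 0$,
$$
\int dk\, |\hat\varphi(k)|^2 I_0 \to 2 m_e |v_\alpha|^{-1} \operatorname{arth}|v_\alpha|, \quad \int dk\, |\hat\varphi(k)|^2 (k \cdot \dot v_\alpha) k I_2 \to -2 m_e \mu(v_\alpha) \dot v_\alpha
$$
as $\bar t \to \infty$, where
$$
\mu(v) z = \Big[ \frac{\gamma^2}{v^2} - |v|^{-3} \operatorname{arth}|v| \Big] z + \Big[ \frac{\gamma^4}{v^4} (5v^2 - 3) + 3 |v|^{-5} \operatorname{arth}|v| \Big] (v \cdot z) v
$$
for $|v| < 1$ and $z \in \mathbb{R}^3$. Consequently, since $s^{-1} \operatorname{arth}(s) = 1 + s^2/3 + s^4/5 + O(s^6)$ for $s$ close to zero, it thus follows after some calculation that
$$
\begin{aligned}
\mathcal{I} \to {} & -(v_\alpha \cdot \dot v_\alpha)\, \nabla_v \big( 2 m_e |v_\alpha|^{-1} \operatorname{arth}|v_\alpha| \big) + \dot v_\alpha \big[ v_\alpha \cdot \nabla_v \big( 2 m_e |v_\alpha|^{-1} \operatorname{arth}|v_\alpha| \big) \big] \\
& - \frac{1}{2} (1 - v_\alpha^2) \big( -2 m_e \mu(v_\alpha) \dot v_\alpha \big) - \frac{1}{2} v_\alpha \big[ v_\alpha \cdot \big( -2 m_e \mu(v_\alpha) \dot v_\alpha \big) \big] \\
= {} & \Big( \frac{2}{3} + \frac{22}{15} v_\alpha^2 \Big) m_e \dot v_\alpha + \frac{14}{15} m_e (v_\alpha \cdot \dot v_\alpha) v_\alpha + O(\varepsilon^4).
\end{aligned} \tag{3.8}
$$
Summarizing (3.6), (3.7), and (3.8), we arrive at

Lemma 3.3. For $t \in [t_0, T\varepsilon^{-3/2}]$ we have
$$
F_{\alpha\alpha}^{(r)}(t) = -\Big( \frac{4}{3} + \frac{8}{15} v_\alpha^2 \Big) m_e \dot v_\alpha - \frac{16}{15} m_e (v_\alpha \cdot \dot v_\alpha) v_\alpha + O(\varepsilon^{7/2}).
$$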
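3.2. Mutual interaction. In this section we expand $F_{\alpha\beta}^{(r)}(t)$ from (3.4) with $\beta \ne \alpha$. For $p \in \mathbb{N}_0$ we have that
$$
A_p := \int_0^\infty d\tau \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot \xi_{\alpha\beta}}\, \frac{\sin|k|\tau}{|k|}\, \tau^p = (4\pi)^{-1} \int\!\!\int dx\, dy\, \varphi(x) \varphi(y)\, |\xi_{\alpha\beta} + x - y|^{p-1}
$$
and
$$
B_p := \int_0^\infty d\tau \int dk\, |\hat\varphi(k)|^2 e^{-ik \cdot \xi_{\alpha\beta}} \cos(|k|\tau)\, \tau^p = (-p)(4\pi)^{-1} \int\!\!\int dx\, dy\, \varphi(x) \varphi(y)\, |\xi_{\alpha\beta} + x - y|^{p-2} = (-p) A_{p-1},
$$
as may be seen through Fourier transform. We hence obtain from Lemma 3.2 that for $\beta \ne \alpha$ and $t \in [t_0, T\varepsilon^{-3/2}]$,
$$
\begin{aligned}
F_{\alpha\beta}^{(r)}(t) = {} & -v_\beta (v_\beta \cdot \nabla_\xi) B_1 + \dot v_\beta B_1 - \nabla_\xi A_0 + \frac{1}{2} (\dot v_\beta \cdot \nabla_\xi) \nabla_\xi A_2 - \frac{1}{2} (v_\beta \cdot \nabla_\xi)^2 \nabla_\xi A_2 \\
& + (v_\alpha \cdot v_\beta) \nabla_\xi A_0 - v_\beta (v_\alpha \cdot \nabla_\xi) A_0 + O(\varepsilon^{7/2}),
\end{aligned} \tag{3.9}
$$
taking also into account that $A_1 = (4\pi)^{-1}$, thus $\nabla_\xi A_1 = 0$. As a consequence of $|\xi_{\alpha\beta}| = O(\varepsilon^{-1})$, cf. Lemma 2.1, of assumption (C), and of Lemma 2.1, it follows that in (3.9) we have $-\nabla_\xi A_0 = O(\varepsilon^2)$, while all other terms are $O(\varepsilon^3)$. Since e.g.
$$
\Big| (v_\alpha \cdot v_\beta) \nabla_\xi A_0 - (v_\alpha \cdot v_\beta) \Big( -\frac{\xi_{\alpha\beta}}{4\pi |\xi_{\alpha\beta}|^3} \Big) \Big| \le C\varepsilon^4,
$$
with an obvious similar estimate for the other terms besides $-\nabla_\xi A_0$, we find from (3.9) and after some calculation that for $\beta \ne \alpha$ and $t \in [t_0, T\varepsilon^{-3/2}]$,
$$
\begin{aligned}
F_{\alpha\beta}^{(r)}(t) = {} & v_\beta (v_\beta \cdot \nabla_\xi) \frac{1}{4\pi |\xi_{\alpha\beta}|} - \dot v_\beta \frac{1}{4\pi |\xi_{\alpha\beta}|} - \nabla_\xi A_0 + \frac{1}{2} (\dot v_\beta \cdot \nabla_\xi) \nabla_\xi \frac{|\xi_{\alpha\beta}|}{4\pi} - \frac{1}{2} (v_\beta \cdot \nabla_\xi)^2 \nabla_\xi \frac{|\xi_{\alpha\beta}|}{4\pi} \\
& + (v_\alpha \cdot v_\beta) \nabla_\xi \frac{1}{4\pi |\xi_{\alpha\beta}|} - v_\beta (v_\alpha \cdot \nabla_\xi) \frac{1}{4\pi |\xi_{\alpha\beta}|} + O(\varepsilon^{7/2}) \\
= {} & -\nabla_\xi A_0 - \frac{1}{8\pi |\xi_{\alpha\beta}|} \dot v_\beta - \frac{(\dot v_\beta \cdot \xi_{\alpha\beta})}{8\pi |\xi_{\alpha\beta}|^3} \xi_{\alpha\beta} + \frac{v_\beta^2}{8\pi |\xi_{\alpha\beta}|^3} \xi_{\alpha\beta} - \frac{3 (v_\beta \cdot \xi_{\alpha\beta})^2}{8\pi |\xi_{\alpha\beta}|^5} \xi_{\alpha\beta} \\
& - \frac{(v_\alpha \cdot v_\beta)}{4\pi |\xi_{\alpha\beta}|^3} \xi_{\alpha\beta} + \frac{(v_\alpha \cdot \xi_{\alpha\beta})}{4\pi |\xi_{\alpha\beta}|^3} v_\beta + O(\varepsilon^{7/2}).
\end{aligned}
$$
Finally, to deal with the lowest-order term we observe that with $n = \xi_{\alpha\beta}/|\xi_{\alpha\beta}|$,
$$
\nabla_\xi A_0 + \frac{\xi_{\alpha\beta}}{4\pi |\xi_{\alpha\beta}|^3} = -\frac{1}{4\pi |\xi_{\alpha\beta}|^2} \int\!\!\int dx\, dy\, \varphi(x) \varphi(y) \Bigg( \frac{n + \frac{x - y}{|\xi_{\alpha\beta}|}}{\big| n + \frac{x - y}{|\xi_{\alpha\beta}|} \big|^3} - n \Bigg). \tag{3.10}
$$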
M. Kunze, H. Spohn
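Defining $R = (x - y)/|\xi_{\alpha\beta}| = O(\varepsilon)$ for $|x|, |y| \le R_\varphi$, we can expand $\psi(R) = (n + R)/|n + R|^3$ to obtain that $\psi(R) = n + R - 3(n\cdot R)n + O(\varepsilon^2)$. As $\iint dx\,dy\, \varphi(x)\varphi(y)(x - y) = 0$, we hence conclude that the right-hand side of (3.10) is $O(\varepsilon^4)$. Thus we can summarize our estimates on the mutual interaction force as follows.

Lemma 3.4. For $\beta \neq \alpha$ and $t \in [t_0, T\varepsilon^{-3/2}]$ we have
\[
\begin{aligned}
F^{(r)}_{\alpha\beta}(t) ={}& \frac{\xi_{\alpha\beta}}{4\pi|\xi_{\alpha\beta}|^3} - \frac{1}{8\pi|\xi_{\alpha\beta}|}\dot v_\beta - \frac{(\dot v_\beta\cdot\xi_{\alpha\beta})}{8\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{v_\beta^2}{8\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} - \frac{3(v_\beta\cdot\xi_{\alpha\beta})^2}{8\pi|\xi_{\alpha\beta}|^5}\xi_{\alpha\beta} \\
& - \frac{(v_\alpha\cdot v_\beta)}{4\pi|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{(v_\alpha\cdot\xi_{\alpha\beta})}{4\pi|\xi_{\alpha\beta}|^3} v_\beta + O(\varepsilon^{7/2}).
\end{aligned}
\]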
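3.3. Summary of the estimates. By (3.1), (3.2), and Lemma 3.1 we find $F_\alpha(t) = F^{(r)}_\alpha(t)$ for $t \in [t_0, T\varepsilon^{-3/2}]$. According to (3.3) and Lemmas 3.3 and 3.4 we hence have obtained the following expansion of the Lorentz force in (3.1). For $t \in [t_0, T\varepsilon^{-3/2}]$ we have
\[
F_\alpha(t) = -\Big(\frac43 + \frac{8}{15} v_\alpha^2\Big) m_e \dot v_\alpha - \frac{16}{15} m_e (v_\alpha\cdot\dot v_\alpha) v_\alpha + G_\alpha(q, v, \dot v) + O(\varepsilon^{7/2}),
\]
\[
\begin{aligned}
G_\alpha(q, v, \dot v) = \sum_{\substack{\beta=1\\ \beta\neq\alpha}}^{N} \frac{e_\alpha e_\beta}{4\pi} \bigg[ & \frac{\xi_{\alpha\beta}}{|\xi_{\alpha\beta}|^3} - \frac{1}{2|\xi_{\alpha\beta}|}\dot v_\beta - \frac{(\dot v_\beta\cdot\xi_{\alpha\beta})}{2|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{v_\beta^2}{2|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} \\
& - \frac{3(v_\beta\cdot\xi_{\alpha\beta})^2}{2|\xi_{\alpha\beta}|^5}\xi_{\alpha\beta} - \frac{(v_\alpha\cdot v_\beta)}{|\xi_{\alpha\beta}|^3}\xi_{\alpha\beta} + \frac{(v_\alpha\cdot\xi_{\alpha\beta})}{|\xi_{\alpha\beta}|^3} v_\beta \bigg],
\end{aligned}
\tag{3.11}
\]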
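where $t_0 = 4(R_\varphi + C^*\varepsilon^{-1})$, $\xi_{\alpha\beta} = q_\alpha(t) - q_\beta(t)$, $v_\alpha = v_\alpha(t)$, and $v_\beta = v_\beta(t)$. Due to the Lorentz equation $\frac{d}{dt}(m_{b\alpha}\gamma_\alpha v_\alpha) = F_\alpha(t)$, cf. (1.5), we finally obtain the following lemma by calculating the right-hand side and expanding $\gamma_\alpha$.

Lemma 3.5. For $t \in [t_0, T\varepsilon^{-3/2}]$ we have
\[
M_\alpha(v_\alpha)\dot v_\alpha = G_\alpha(q, v, \dot v) + O(\varepsilon^{7/2}), \qquad 1 \le \alpha \le N,
\]
with $G_\alpha$ from (3.11) and $M_\alpha(v)$ the $(3\times 3)$-matrix $M_\alpha(v)(z) = (m_\alpha + \tfrac12 m^*_\alpha v^2) z + m^*_\alpha (v\cdot z) v$ for $v, z \in \mathbb{R}^3$.

4. Proof of Theorem 2.2

We need to compare a solution $(q_\alpha(t), v_\alpha(t))$ of (1.2)–(1.5) with data (2.13) to $(\tilde r_\alpha(t), \tilde u_\alpha(t))$, where we let
\[
\tilde r_\alpha(t) = \varepsilon^{-1} r_\alpha(\varepsilon^{3/2} t), \qquad \tilde u_\alpha(t) = \sqrt{\varepsilon}\, u_\alpha(\varepsilon^{3/2} t),
\tag{4.1}
\]
and where the $(r_\alpha(t), u_\alpha(t))$ are the solution to the system induced by (2.11) with data $(r^0_\alpha, u^0_\alpha)$. A somewhat lengthy but elementary calculation shows that $(\tilde r_\alpha(t), \tilde u_\alpha(t))$ satisfy
\[
M_\alpha(\tilde u_\alpha)\dot{\tilde u}_\alpha = G_\alpha(\tilde r, \tilde u, \dot{\tilde u}), \qquad 1 \le \alpha \le N,
\tag{4.2}
\]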
cf. Lemma 3.5 for the notation. Recalling that $\tau \in\, ]0, \infty]$ was defined to be the first time when either $\lim_{t\to\tau^-} |r_\alpha(t) - r_\beta(t)| = 0$ for some $\alpha \neq \beta$ or $\lim_{t\to\tau^-} |r_\alpha(t)| = \infty$ for some $\alpha$ holds, we find that (4.2) is valid for $t \in [0, (\tau - \delta_0)\varepsilon^{-3/2}]$, for any $\delta_0 \in\, ]0, \tau[$ which we consider to be fixed throughout. This leads to some useful estimates on the effective dynamics.

Lemma 4.1. For suitable constants $C_0, C', C > 0$ (depending on $\tau$, $\delta_0$, and the data) we have
\[
C_0\varepsilon^{-1} \le \sup_{t\in[0,(\tau-\delta_0)\varepsilon^{-3/2}]} |\tilde r_\alpha(t) - \tilde r_\beta(t)| \le C'\varepsilon^{-1}, \qquad \alpha \neq \beta,
\tag{4.3}
\]
and
\[
\sup_{t\in[0,(\tau-\delta_0)\varepsilon^{-3/2}]} |\tilde u_\alpha(t)| \le C\sqrt{\varepsilon}.
\tag{4.4}
\]
Proof. The bounds in (4.3) follow from (4.1) and the fact that $|r_\alpha(t) - r_\beta(t)| \ge \delta_1$ and $|r_\alpha(t)| \le C$ on $[0, \tau - \delta_0]$ for some $\delta_1 > 0$, $C > 0$, by definition of $\tau$. Concerning (4.4), by conservation of the energy $H_D$ from (2.12) we obtain $C \ge H_D(r(0), u(0)) = H_D(r(t), u(t)) \ge \tfrac12 m_\alpha u_\alpha^2(t)$ as long as the solution exists, in particular for $t \in [0, \tau - \delta_0]$. □

To simplify the presentation, we henceforth omit the tilde and write $(r, u)$ instead of $(\tilde r, \tilde u)$ to denote the rescaled solution. Utilizing the bounds from Lemma 2.1 and from (4.3), (4.4), it may be seen after some calculation that
\[
\big| G_\alpha(q, v, \dot v)(t) - G_\alpha(r, u, \dot u)(t) \big| \le C \sum_{\beta=1}^{N} \Big( \varepsilon^3 |q_\beta(t) - r_\beta(t)| + \varepsilon^{5/2} |v_\beta(t) - u_\beta(t)| + \varepsilon |\dot v_\beta(t) - \dot u_\beta(t)| \Big)
\tag{4.5}
\]
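for $1 \le \alpha \le N$ and $t \in [0, T\varepsilon^{-3/2}] \cap [0, (\tau - \delta_0)\varepsilon^{-3/2}] = [0, \min\{\tau - \delta_0, T\}\varepsilon^{-3/2}]$. Note that the term $\varepsilon^3 |q_\beta - r_\beta|$ appears through comparison of $\xi_{\alpha\beta}/|\xi_{\alpha\beta}|^3$ to $r_{\alpha\beta}/|r_{\alpha\beta}|^3$, cf. the form of $G_\alpha$ in (3.11). Next, a general $(3\times 3)$-matrix $M(v) = a(v)\,\mathrm{id} + b\,(v \otimes v)$ with $a(v) \neq 0$ and $a(v) + b v^2 \neq 0$ has the inverse
\[
M(v)^{-1} = a(v)^{-1}\,\mathrm{id} - \frac{b}{a(v)[a(v) + b v^2]}\,(v \otimes v).
\]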
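The inverse of a matrix of the form $a\,\mathrm{id} + b\,(v\otimes v)$ can be checked numerically. The following is a minimal sketch in plain Python; the helper names and the numerical values of the masses and the velocity are hypothetical and serve only as an illustration, they are not taken from the paper. The sketch multiplies the matrix by the closed-form inverse (the rank-one term carries a minus sign) and reports the largest deviation from the identity.

```python
def rank_one_matrix(a, b, v):
    """Return M = a*id + b*(v (x) v) as a nested 3x3 list."""
    return [[a * (i == j) + b * v[i] * v[j] for j in range(3)] for i in range(3)]


def rank_one_inverse(a, b, v):
    """Closed-form inverse a^{-1} id - b / (a (a + b v^2)) (v (x) v)."""
    v2 = sum(x * x for x in v)
    c = b / (a * (a + b * v2))
    return [[(i == j) / a - c * v[i] * v[j] for j in range(3)] for i in range(3)]


def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def max_deviation_from_identity(a, b, v):
    """Largest entry of |M * M^{-1} - id|; should be at rounding level."""
    P = matmul(rank_one_matrix(a, b, v), rank_one_inverse(a, b, v))
    return max(abs(P[i][j] - (i == j)) for i in range(3) for j in range(3))


# Hypothetical illustrative values: a = m + (1/2) m* v^2 and b = m*,
# mimicking the structure of M_alpha(v) in Lemma 3.5.
m, m_star = 1.0, 1.3
v = [0.05, -0.02, 0.03]
a = m + 0.5 * m_star * sum(x * x for x in v)
print(max_deviation_from_identity(a, m_star, v))
```

For small $|v|$ the coefficient $b/(a[a + bv^2])$ stays bounded, which is the mechanism behind the $O(1)$ bound on the inverse used in the text.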
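This remark shows $|M_\alpha(v_\alpha)^{-1}| = O(1)$ and $|M_\alpha(v_\alpha)^{-1} - M_\alpha(u_\alpha)^{-1}| \le C\sqrt{\varepsilon}\,|v_\alpha - u_\alpha|$ for $t \in [0, \min\{\tau - \delta_0, T\}\varepsilon^{-3/2}]$. Since $|G_\alpha(q, v, \dot v)| = O(\varepsilon^2)$ it follows from Lemma 3.5, (4.2), and (4.5) that
\[
|\dot v_\alpha(t) - \dot u_\alpha(t)| \le C \sum_{\beta=1}^{N} \Big( \varepsilon^3 |q_\beta(t) - r_\beta(t)| + \varepsilon^{5/2} |v_\beta(t) - u_\beta(t)| + \varepsilon |\dot v_\beta(t) - \dot u_\beta(t)| \Big) + O(\varepsilon^{7/2})
\]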
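for $1 \le \alpha \le N$ and $t \in [t_0, \min\{\tau - \delta_0, T\}\varepsilon^{-3/2}]$. Summing over $\alpha$ and choosing $\varepsilon > 0$ sufficiently small (so that the terms $\varepsilon|\dot v_\beta - \dot u_\beta|$ on the right can be absorbed into the left-hand side), this results in
\[
\sum_{\alpha=1}^{N} |\dot v_\alpha(t) - \dot u_\alpha(t)| \le C \sum_{\alpha=1}^{N} \Big( \varepsilon^3 |q_\alpha(t) - r_\alpha(t)| + \varepsilon^{5/2} |v_\alpha(t) - u_\alpha(t)| \Big) + O(\varepsilon^{7/2})
\tag{4.6}
\]
for $t \in [t_0, \min\{\tau - \delta_0, T\}\varepsilon^{-3/2}]$. To use this basic estimate, we write $d_\alpha(t) = q_\alpha(t) - r_\alpha(t)$ as
\[
d_\alpha(t) = d_\alpha(t_0) + (t - t_0)\dot d_\alpha(t_0) + \int_{t_0}^{t} (t - s)\,\ddot d_\alpha(s)\, ds, \qquad
\dot d_\alpha(t) = \dot d_\alpha(t_0) + \int_{t_0}^{t} \ddot d_\alpha(s)\, ds.
\]
We then obtain for $t \in [t_0, \min\{\tau - \delta_0, T\}\varepsilon^{-3/2}]$ from (4.6) that
\[
D(t) \le D(t_0) + (t - t_0)\bar D(t_0) + C\varepsilon^3 \int_{t_0}^{t} (t - s) D(s)\, ds + C\varepsilon^{5/2} \int_{t_0}^{t} (t - s)\bar D(s)\, ds + C\sqrt{\varepsilon},
\tag{4.7}
\]
\[
\bar D(t) \le \bar D(t_0) + C\varepsilon^3 \int_{t_0}^{t} D(s)\, ds + C\varepsilon^{5/2} \int_{t_0}^{t} \bar D(s)\, ds + C\varepsilon^2,
\tag{4.8}
\]
where
\[
D(t) = \max_{1\le\alpha\le N}\ \max_{s\in[t_0,t]} |d_\alpha(s)| \qquad\text{and}\qquad \bar D(t) = \max_{1\le\alpha\le N}\ \max_{s\in[t_0,t]} |\dot d_\alpha(s)|.
\]
Application of Gronwall's lemma to (4.8) yields
\[
\bar D(t) \le C\Big( \bar D(t_0) + \varepsilon^2 + \varepsilon^3 \int_{t_0}^{t} D(s)\, ds \Big),
\tag{4.9}
\]
and utilizing this in (4.7) implies
\[
D(t) \le D(t_0) + (t - t_0)\bar D(t_0) + C\sqrt{\varepsilon} + C\varepsilon^{-1/2}\big(\bar D(t_0) + \varepsilon^2\big) + C\varepsilon^3 \int_{t_0}^{t} (t - s) D(s)\, ds.
\]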
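Both applications of Gronwall's lemma in this proof rest on the standard integral form of the lemma. The following sketch records that form together with the exponents of $\varepsilon$ that make it effective on the time scale $\varepsilon^{-3/2}$; the symbols $\phi$, $a$, and $\kappa$ are notation introduced only for this remark, and the constants are generic.

```latex
% Gronwall's lemma, integral form: let \phi \ge 0 be continuous on [t_0, t_1],
% let a be nondecreasing, and let \kappa \ge 0. Then
\[
  \phi(t) \le a(t) + \kappa \int_{t_0}^{t} \phi(s)\, ds
  \quad\Longrightarrow\quad
  \phi(t) \le a(t)\, e^{\kappa (t - t_0)}, \qquad t \in [t_0, t_1].
\]
% Use in (4.8): \phi = \bar D, \kappa = C\varepsilon^{5/2}, and
% t - t_0 \le T\varepsilon^{-3/2}, so that
\[
  e^{\kappa (t - t_0)} \le e^{C T \varepsilon} \le C .
\]
% Use after inserting (4.9) into (4.7): since (t - s) \le C\varepsilon^{-3/2},
\[
  C\varepsilon^{3} \int_{t_0}^{t} (t - s) D(s)\, ds
  \le C\varepsilon^{3/2} \int_{t_0}^{t} D(s)\, ds ,
\]
% hence \kappa = C\varepsilon^{3/2} and again
% \kappa (t - t_0) \le C T stays bounded uniformly in \varepsilon.
```

In both cases the exponential factor is bounded independently of $\varepsilon$, which is why only a constant $C$ appears in the resulting estimates.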
Finally, $(t - s) \le C\varepsilon^{-3/2}$ yields upon a further application of Gronwall's lemma that
\[
D(t) \le C\Big( D(t_0) + \varepsilon^{-3/2}\bar D(t_0) + \sqrt{\varepsilon} \Big), \qquad t \in [t_0, \min\{\tau - \delta_0, T\}\varepsilon^{-3/2}].
\tag{4.10}
\]
By assumption $D(t_0) = 0 = \bar D(t_0)$. Therefore (4.10) and (4.9) imply (2.14). This completes the proof of Theorem 2.2. □

5. Appendix A: Proof of Lemma 2.1

This appendix concerns the proof of Lemma 2.1. We split the proof into three subsections.
5.1. Bounding the particle distances and the velocities. We intend to use energy conservation to show (2.8), and for that reason we calculate with (2.3) the field energy
\[
\begin{aligned}
H_F(0) ={}& \frac12 \int d^3x\, [E^2(x, 0) + B^2(x, 0)] \\
={}& \frac12 \sum_{\alpha=1}^{N} \int d^3x\, \big[ E^2_{v^0_\alpha}(x - q^0_\alpha) + B^2_{v^0_\alpha}(x - q^0_\alpha) \big] \\
& + \frac12 \sum_{\substack{\alpha,\beta=1\\ \alpha\neq\beta}}^{N} \int d^3x\, \big[ E_{v^0_\alpha}(x - q^0_\alpha)\cdot E_{v^0_\beta}(x - q^0_\beta) + B_{v^0_\alpha}(x - q^0_\alpha)\cdot B_{v^0_\beta}(x - q^0_\beta) \big].
\end{aligned}
\]
According to (2.4) and [6, Sect. 2] the first term equals
\[
H^{(1)}_F(0) = \sum_{\alpha=1}^{N} e_\alpha^2\, \frac12 \int d^3k\, |\hat\varphi(k)|^2 k^{-2} \left[ \frac{1}{|v^0_\alpha|} \log\frac{1 + |v^0_\alpha|}{1 - |v^0_\alpha|} - 1 \right].
\]
Denoting the term in $[\ldots]$ as $\psi(|v^0_\alpha|)$, $\psi(r)$ is even, and hence Taylor expansion implies $\psi(r) = 1 + O(r^2)$ for $r$ small. Therefore (2.2) yields
\[
H^{(1)}_F(0) = E_{\rm Coul} + O(\varepsilon),
\]
with $E_{\rm Coul}$ from (2.7). To deal with the contributions for $\alpha \neq \beta$ in the second term, we obtain by passing to Fourier transformed form and observing (2.2) that e.g.
\[
\int d^3x\, E_{v^0_\alpha}(x - q^0_\alpha)\cdot E_{v^0_\beta}(x - q^0_\beta) = e_\alpha e_\beta \int d^3k\, |\hat\varphi(k)|^2 k^{-2} e^{ik\cdot(q^0_\alpha - q^0_\beta)} + O(\varepsilon) = O(\varepsilon),
\]
the latter with (2.1) and by passing to polar coordinates. Thus we have shown
\[
H_F(0) = E_{\rm Coul} + O(\varepsilon).
\tag{5.1}
\]
Next we will investigate the field energy at time $t > 0$. We claim that
\[
H_F(t) = \frac12 \int d^3x\, [E^2(x, t) + B^2(x, t)] \ge \frac12 \int d^3x\, E^2(x, t) \ge -\frac12 \big( \rho(\cdot, t), \Delta^{-1}\rho(\cdot, t) \big)_{L^2(\mathbb{R}^3)}.
\tag{5.2}
\]
The easiest way to see this is to introduce potentials $A$ and $\phi$,
\[
B(x, t) = \nabla \wedge A(x, t), \qquad E(x, t) = -\nabla\phi(x, t) - F(x, t), \qquad\text{with}\quad F(x, t) = \frac{\partial A}{\partial t}(x, t),
\]
for the electromagnetic field. Then $\rho = \nabla\cdot E = -\Delta\phi - \nabla\cdot F$, and the estimate in (5.2) follows by passing to Fourier transformed form. On the other hand, substituting $\rho$ from (1.2) into
\[
-\frac12 \big( \rho(\cdot, t), \Delta^{-1}\rho(\cdot, t) \big)_{L^2(\mathbb{R}^3)} = \frac12 \int d^3k\, |\hat\rho(k, t)|^2 k^{-2},
\]
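by assumption (2.9) we can argue exactly as before to show that the terms with $\alpha \neq \beta$ are $O(\varepsilon)$, and thus
\[
H_F(t) \ge \frac12 \sum_{\alpha=1}^{N} \int d^3k\, |\hat\rho_\alpha(k)|^2 k^{-2} + O(\varepsilon) = E_{\rm Coul} + O(\varepsilon)
\tag{5.3}
\]
for $t \in [0, T\varepsilon^{-3/2}]$. Consequently for $t \in [0, T\varepsilon^{-3/2}]$ by energy conservation, cf. (1.6), by (5.1) and (5.3),
\[
\begin{aligned}
\sum_{\alpha=1}^{N} m_{b\alpha}\gamma(v^0_\alpha) + E_{\rm Coul} + O(\varepsilon)
&= \sum_{\alpha=1}^{N} m_{b\alpha}\gamma(v^0_\alpha) + H_F(0) \\
&= \sum_{\alpha=1}^{N} m_{b\alpha}\gamma(v_\alpha(t)) + H_F(t) \\
&\ge \sum_{\alpha=1}^{N} m_{b\alpha}\gamma(v_\alpha(t)) + E_{\rm Coul} + O(\varepsilon),
\end{aligned}
\]
with $\gamma(v) = (1 - v^2)^{-1/2}$. Thus
\[
\sum_{\alpha=1}^{N} m_{b\alpha}\gamma(v^0_\alpha) + C\varepsilon \ge \sum_{\alpha=1}^{N} m_{b\alpha}\gamma(v_\alpha(t)), \qquad t \in [0, T\varepsilon^{-3/2}],
\tag{5.4}
\]
with some constant $C$ depending on $C_1$, $C_3$, $C_*$, $T$. This estimate now allows us to prove (2.8). Define
\[
I_+ = \{\alpha \in \{1, \ldots, N\} : \gamma(v_\alpha(t)) \le \gamma(v^0_\alpha)\} \qquad\text{and}\qquad I_- = \{\alpha \in \{1, \ldots, N\} : \gamma(v_\alpha(t)) > \gamma(v^0_\alpha)\}.
\]
For $\alpha \in I_+$ we have $|v_\alpha(t)| \le |v^0_\alpha| \le C_3\sqrt{\varepsilon}$ by (2.2). Thus for $\varepsilon$ so small that $C_3^2\varepsilon \le 1/2$,
\[
\gamma(v^0_\alpha) - \gamma(v_\alpha(t)) \le \sqrt{2}\, \big| (v^0_\alpha)^2 - (v_\alpha(t))^2 \big| \le C\varepsilon.
\]
Therefore by (5.4),
\[
C\varepsilon \ge \sum_{\alpha\in I_-} m_{b\alpha} \big[ \gamma(v_\alpha(t)) - \gamma(v^0_\alpha) \big].
\]
Since $m_{b\alpha} > 0$ we deduce that
\[
\gamma(v_\alpha(t)) \le \gamma(v^0_\alpha) + C\varepsilon, \qquad \alpha \in I_-,
\]
and according to $|v^0_\alpha| \le C_3\sqrt{\varepsilon}$ it then follows that $|v_\alpha(t)| \le C\sqrt{\varepsilon}$ also for $\alpha \in I_-$. This concludes the proof of (2.8). Using (2.1) and (2.8) it is finally easy to derive the upper bound in (2.6), since for $t \in [0, T\varepsilon^{-3/2}]$ we have
\[
|q_\alpha(t) - q_\beta(t)| \le |q^0_\alpha - q^0_\beta| + |q_\alpha(t) - q^0_\alpha| + |q_\beta(t) - q^0_\beta| \le C_2\varepsilon^{-1} + 2C_v T\sqrt{\varepsilon}\,\varepsilon^{-3/2} = C^*\varepsilon^{-1},
\]
with $C^* = C_2 + 2C_v T$. We remark that for the estimates in this section the smallness of the $e_\alpha$ was not needed.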
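The passage from the bound on $\gamma(v_\alpha(t))$ to the bound on $|v_\alpha(t)|$ for $\alpha \in I_-$ can be spelled out with two elementary inequalities for $x \mapsto (1 - x)^{-1/2}$. The following is a sketch of one way to do this, with generic constants; the specific inequalities chosen are an assumption about how the step may be carried out, not a quotation from the paper.

```latex
% Elementary bounds (convexity of x -> (1-x)^{-1/2} on [0,1)):
\[
  1 + \tfrac12 x \;\le\; (1 - x)^{-1/2} \quad (0 \le x < 1),
  \qquad
  (1 - x)^{-1/2} \;\le\; 1 + x \quad (0 \le x \le \tfrac12).
\]
% With x = |v_\alpha(t)|^2 in the first bound and x = |v^0_\alpha|^2 \le C_3^2 \varepsilon \le 1/2
% in the second, the estimate \gamma(v_\alpha(t)) \le \gamma(v^0_\alpha) + C\varepsilon gives
\[
  \tfrac12 |v_\alpha(t)|^2
  \;\le\; \gamma(v_\alpha(t)) - 1
  \;\le\; \gamma(v^0_\alpha) - 1 + C\varepsilon
  \;\le\; |v^0_\alpha|^2 + C\varepsilon
  \;\le\; (C_3^2 + C)\,\varepsilon ,
\]
% and therefore
\[
  |v_\alpha(t)| \;\le\; \sqrt{2 (C_3^2 + C)}\;\sqrt{\varepsilon}, \qquad \alpha \in I_- .
\]
```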
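5.2. Bounding $|\dot v_\alpha(t)|$. Since
\[
\frac{d}{dt}\big( m_{b\alpha}\gamma_\alpha v_\alpha(t) \big) = m_{0\alpha}(v_\alpha(t))\,\dot v_\alpha(t),
\]
with the $(3\times 3)$-matrices $m_{0\alpha}(v_\alpha)$ given through $m_{0\alpha}(v_\alpha)(z) = m_{b\alpha}(\gamma_\alpha z + \gamma_\alpha^3 (v_\alpha\cdot z) v_\alpha)$, $z \in \mathbb{R}^3$, we obtain from (1.5) that for $\alpha = 1, \ldots, N$,
\[
\begin{aligned}
\dot v_\alpha &= m_{0\alpha}(v_\alpha)^{-1} \int d^3x\, \rho_\alpha(x - q_\alpha) \Big( [E(x) - E_{v_\alpha}(x - q_\alpha)] + v_\alpha \wedge [B(x) - B_{v_\alpha}(x - q_\alpha)] \Big) \\
&= m_{0\alpha}(v_\alpha)^{-1} \int d^3x\, \rho_\alpha(x) \Big( Z_1(x + q_\alpha, t) + v_\alpha \wedge Z_2(x + q_\alpha, t) \Big) + R_\alpha(t),
\end{aligned}
\tag{5.5}
\]
where $m_{0\alpha}(v_\alpha)^{-1} z = m_{b\alpha}^{-1}\gamma_\alpha^{-1}(z - (v_\alpha\cdot z) v_\alpha)$, $z \in \mathbb{R}^3$, is the matrix inverse of $m_{0\alpha}(v_\alpha)$. For (5.5) it is important to note that adding the $E_{v_\alpha}(x - q_\alpha)$-term and the $v_\alpha \wedge B_{v_\alpha}(x - q_\alpha)$-term does not change the integral, as may be seen through Fourier transform using (2.4) and (2.5). Moreover, in (5.5) we have set
\[
R_\alpha(t) = m_{0\alpha}(v_\alpha)^{-1} \sum_{\substack{\beta=1\\ \beta\neq\alpha}}^{N} \int d^3x\, \rho_\alpha(x - q_\alpha) \Big[ E_{v_\beta}(x - q_\beta) + v_\alpha \wedge B_{v_\beta}(x - q_\beta) \Big]
\tag{5.6}
\]
and
\[
Z(x, t) = \begin{pmatrix} Z_1(x, t) \\ Z_2(x, t) \end{pmatrix}
= \begin{pmatrix} E(x, t) - \sum_{\beta=1}^{N} E_{v_\beta(t)}(x - q_\beta(t)) \\ B(x, t) - \sum_{\beta=1}^{N} B_{v_\beta(t)}(x - q_\beta(t)) \end{pmatrix}.
\]
Maxwell's equations and the relations $(v\cdot\nabla)E_v(x) = -\nabla \wedge B_v(x) + e\varphi(x) v$, $(v\cdot\nabla)B_v(x) = \nabla \wedge E_v(x)$, $e = e_\alpha$ for index $\alpha$, yield
\[
\dot Z(t) = A Z(t) - f(t), \qquad\text{with}\quad A = \begin{pmatrix} 0 & \nabla\wedge \\ -\nabla\wedge & 0 \end{pmatrix}
\tag{5.7}
\]
and
\[
f(x, t) = \sum_{\beta=1}^{N} \begin{pmatrix} (\dot v_\beta(t)\cdot\nabla_v) E_{v_\beta(t)}(x - q_\beta(t)) \\ (\dot v_\beta(t)\cdot\nabla_v) B_{v_\beta(t)}(x - q_\beta(t)) \end{pmatrix}.
\tag{5.8}
\]
The Maxwell operator $A$ generates a $C_0$-group $U(t)$, $t \in \mathbb{R}$, of isometries in $L^2(\mathbb{R}^3)^3 \oplus L^2(\mathbb{R}^3)^3$; see [2, p. 435; (H2)]. Therefore we have the mild solution representation
\[
Z(x, t) = [U(t)Z(\cdot, 0)](x) - \int_0^t ds\, [U(t - s) f(\cdot, s)](x).
\tag{5.9}
\]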
According to (2.3), Z(0) = 0, so the first term drops out. To estimate the remaining term, we first state and prove some auxiliary lemmas that will be used frequently.
Lemma 5.1. For given $f = (f_1, f_2)$ with $\nabla\cdot f_1 = 0$ and $\nabla\cdot f_2 = 0$ we have for $W(t, s, x) = (W_1(t, s, x), W_2(t, s, x)) = [U(t - s) f(\cdot, s)](x)$,
\[
W_1(t, s, x) = \frac{1}{4\pi(t - s)^2} \int_{|y - x| = (t - s)} d^2y\, \Big[ (t - s)\nabla \wedge f_2(y, s) + f_1(y, s) + ((y - x)\cdot\nabla) f_1(y, s) \Big],
\]
\[
W_2(t, s, x) = \frac{1}{4\pi(t - s)^2} \int_{|y - x| = (t - s)} d^2y\, \Big[ -(t - s)\nabla \wedge f_1(y, s) + f_2(y, s) + ((y - x)\cdot\nabla) f_2(y, s) \Big].
\]
Proof. See [6, Lemma 8.1]. □
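Lemma 5.2. (a) Let $\xi(s) \ge 0$ be some function. Assume that for $y \in \mathbb{R}^3$, $s \in [0, t]$, and some $f(y, s) = (f_1(y, s), f_2(y, s))$ with $\nabla\cdot f_1 = 0 = \nabla\cdot f_2$,
\[
|f_1(y, s)| + |f_2(y, s)| \le C\xi(s) \sum_{\beta=1}^{N} \frac{1}{1 + |y - q_\beta(s)|^2},
\tag{5.10}
\]
\[
|\nabla f_1(y, s)| + |\nabla f_2(y, s)| \le C\xi(s) \sum_{\beta=1}^{N} \frac{1}{1 + |y - q_\beta(s)|^3}.
\tag{5.11}
\]
Then for each $\alpha = 1, \ldots, N$, $t \in [0, T\varepsilon^{-3/2}]$, and $|x| \le R_\varphi$,
\[
\Big| \int_0^t ds\, [U(t - s) f(\cdot, s)](x + q_\alpha(t)) \Big| \le C \sup_{s\in[0,t]} \xi(s).
\]
(b) Under the hypotheses of (a), if instead of (5.10) and (5.11) it holds for fixed $1 \le \alpha \le N$ that
\[
|f_1(y, s)| + |f_2(y, s)| \le C\xi(s) \sum_{\substack{\beta=1\\ \beta\neq\alpha}}^{N} \frac{1}{1 + |y - q_\beta(s)|^3},
\tag{5.12}
\]
\[
|\nabla f_1(y, s)| + |\nabla f_2(y, s)| \le C\xi(s) \sum_{\substack{\beta=1\\ \beta\neq\alpha}}^{N} \frac{1}{1 + |y - q_\beta(s)|^4},
\tag{5.13}
\]
then for $t \in [0, T\varepsilon^{-3/2}]$ and $|x| \le R_\varphi$ we have even that
\[
\Big| \int_0^t ds\, [U(t - s) f(\cdot, s)](x + q_\alpha(t)) \Big| \le C \Big( \sup_{s\in[0,t]} \xi(s) \Big)\, \varepsilon.
\]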
(c) Let $\xi(\tau, s) \ge 0$ be some function. Assume that for $y \in \mathbb{R}^3$, $\tau \in [0, t]$, $s \in [0, \tau]$, and some $g(y, \tau, s) = (g_1(y, \tau, s), g_2(y, \tau, s))$ with $\nabla\cdot g_1 = 0 = \nabla\cdot g_2$ that
\[
|g_1(y, \tau, s)| + |g_2(y, \tau, s)| \le C\xi(\tau, s) \sum_{\alpha=1}^{N} \frac{1}{1 + |y - q_\alpha(s)|^3},
\tag{5.14}
\]
\[
|\nabla g_1(y, \tau, s)| + |\nabla g_2(y, \tau, s)| \le C\xi(\tau, s) \sum_{\alpha=1}^{N} \frac{1}{1 + |y - q_\alpha(s)|^4}.
\tag{5.15}
\]
Then for each $\alpha = 1, \ldots, N$, $t \in [0, T\varepsilon^{-3/2}]$, and $|x| \le R_\varphi$,
\[
\Big| \int_0^t d\tau \int_0^\tau ds\, [U(t - s) g(\cdot, \tau, s)](x + q_\alpha(t)) \Big| \le C \sup_{(\tau,s)\in\Delta_t} \xi(\tau, s),
\]
where $\Delta_t = \{(\tau, s) : \tau \in [0, t],\ s \in [0, \tau]\}$. In (a)–(c), all constants $C$ on the right-hand sides are independent of $\alpha$, $t$, and $x$.

Proof. (a) Define $W$ as in Lemma 5.1. We derive the estimates with $W_1$. Fix $1 \le \alpha \le N$, $t \in [0, T\varepsilon^{-3/2}]$, $s \in [0, t]$, and $|x| \le R_\varphi$. According to Lemma 5.1, (5.10), and (5.11),
\[
|W_1(t, s, x + q_\alpha(t))| \le C\, \frac{\xi(s)}{(t - s)^2} \sum_{\beta=1}^{N} I^{(2)}_{\alpha\beta}(t, s, x),
\]
with
\[
I^{(n)}_{\alpha\beta}(t, s, x) = \int_{|y - x - q_\alpha(t)| = (t - s)} d^2y\, \left[ \frac{(t - s)}{1 + |y - q_\beta(s)|^{n+1}} + \frac{1}{1 + |y - q_\beta(s)|^{n}} \right].
\tag{5.16}
\]
In the sum in (5.16), with general $n \ge 2$, we first consider the term $I^{(n)}_{\alpha\alpha}(t, s, x)$, i.e., the one with $\beta = \alpha$. In this case according to (2.8), $|y - q_\beta(s)| \ge |y - x - q_\alpha(t)| - |x| - |q_\alpha(t) - q_\alpha(s)| \ge (t - s) - R_\varphi - C\sqrt{\varepsilon}(t - s) \ge (t - s)/2 - R_\varphi$ for $\varepsilon$ small. Therefore $|y - q_\beta(s)| \ge (t - s)/4$ for $s \le t - 4R_\varphi$. We hence obtain for $\beta = \alpha$ and $s \le t - 4R_\varphi$,
\[
I^{(n)}_{\alpha\alpha}(t, s, x) \le C\, \frac{(t - s)^2}{1 + (t - s)^n}.
\tag{5.17}
\]
On the other hand, for $s \in [t - 4R_\varphi, t]$,
\[
I^{(n)}_{\alpha\alpha}(t, s, x) \le C(t - s)^2 [(t - s) + 1] \le C(t - s)^2 [4R_\varphi + 1] \le C(t - s)^2\, \frac{1}{1 + (4R_\varphi)^n} \le C\, \frac{(t - s)^2}{1 + (t - s)^n}.
\]
Hence (5.17) shows that the latter estimate holds for any $s \in [0, t]$. Since
\[
\int_0^t \frac{ds}{1 + (t - s)^2} \le C, \qquad \int_0^t d\tau \int_0^\tau \frac{ds}{1 + (t - s)^3} \le C,
\]
the term with $\beta = \alpha$ will satisfy the claimed estimates not only in (a), but also in (c).
Next we turn to deriving a bound for $I^{(2)}_{\alpha\beta}(t, s, x)$ with $\beta \neq \alpha$. First note that for some portion of the interval $[0, t]$ the preceding argument applies again. For this, define $t_0 = 4(R_\varphi + C^*\varepsilon^{-1})$. Then for $s \le t - t_0$ we find by (2.8) for $\varepsilon$ small that on the $y$-sphere,
\[
\begin{aligned}
|y - q_\beta(s)| &\ge |y - x - q_\alpha(t)| - |x| - |q_\alpha(t) - q_\beta(s)| \\
&\ge (t - s) - R_\varphi - |q_\alpha(t) - q_\beta(t)| - |q_\beta(t) - q_\beta(s)| \\
&\ge (t - s) - R_\varphi - C^*\varepsilon^{-1} - C\sqrt{\varepsilon}(t - s) \\
&\ge (t - s)/2 - R_\varphi - C^*\varepsilon^{-1} \ge (t - s)/4.
\end{aligned}
\]
Therefore as in (5.17) for general $n \ge 2$,
\[
I^{(n)}_{\alpha\beta}(t, s, x) \le C\, \frac{(t - s)^2}{1 + (t - s)^n}, \qquad s \in [0, t - t_0],
\tag{5.18}
\]
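and it remains to estimate $I^{(2)}_{\alpha\beta}(t, s, x)$ for $\beta \neq \alpha$ and $s \in [t - t_0, t]$. To do so, we note that an explicit computation shows for $z_1, z_2 \in \mathbb{R}^3$ and $\gamma \ge 0$,
\[
\begin{aligned}
\int_{|y - z_1| = \gamma} d^2y\, \frac{1}{1 + |y - z_2|^2}
&= \frac{\pi\gamma}{|z_1 - z_2|} \log\left( \frac{1 + (\gamma + |z_1 - z_2|)^2}{1 + (\gamma - |z_1 - z_2|)^2} \right) \\
&= \frac{\pi\gamma}{|z_1 - z_2|} \log\left( 1 + \frac{4\gamma|z_1 - z_2|}{1 + (\gamma - |z_1 - z_2|)^2} \right) \\
&\le \frac{4\pi\gamma^2}{1 + (\gamma - |z_1 - z_2|)^2},
\end{aligned}
\tag{5.19}
\]
as $\log(1 + A) \le A$ for $A \ge 0$. Similarly, for $n \ge 2$,
\[
\begin{aligned}
\int_{|y - z_1| = \gamma} d^2y\, \frac{1}{1 + |y - z_2|^{n+1}}
&= 2\pi\gamma^2 \int_{-1}^{1} dr\, \frac{1}{1 + \big( |z_1 - z_2|^2 + 2\gamma|z_1 - z_2| r + \gamma^2 \big)^{(n+1)/2}} \\
&\le C\gamma^2 \int_{-1}^{1} dr\, \frac{1}{\big( 1 + |z_1 - z_2|^2 + 2\gamma|z_1 - z_2| r + \gamma^2 \big)^{(n+1)/2}} \\
&= C_n\, \frac{\gamma}{|z_1 - z_2|} \left[ \frac{1}{\big( 1 + (|z_1 - z_2| - \gamma)^2 \big)^{(n-1)/2}} - \frac{1}{\big( 1 + (|z_1 - z_2| + \gamma)^2 \big)^{(n-1)/2}} \right].
\end{aligned}
\tag{5.20}
\]
So in particular
\[
\int_{|y - z_1| = \gamma} d^2y\, \frac{1}{1 + |y - z_2|^{n+1}} \le C\, \frac{\gamma}{|z_1 - z_2|}, \qquad n \ge 2.
\tag{5.21}
\]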
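The closed form in (5.19) lends itself to a quick numerical sanity check. The sketch below, in plain Python, is only an illustration: the function names, the Simpson quadrature, and the sample values of $\gamma$ and $d = |z_1 - z_2|$ are choices made here and are not part of the paper. It parametrizes the sphere $|y - z_1| = \gamma$ by $r \in [-1, 1]$, the cosine of the polar angle relative to the axis through $z_1$ and $z_2$, so that $|y - z_2|^2 = \gamma^2 + d^2 + 2\gamma d r$ and the surface element integrates to $2\pi\gamma^2\, dr$.

```python
import math


def sphere_integral_closed_form(gamma, d):
    """First right-hand side of (5.19), with d = |z1 - z2| > 0."""
    return (math.pi * gamma / d) * math.log(
        (1.0 + (gamma + d) ** 2) / (1.0 + (gamma - d) ** 2)
    )


def sphere_integral_numeric(gamma, d, panels=2000):
    """Composite Simpson rule for
    2*pi*gamma^2 * int_{-1}^{1} dr / (1 + gamma^2 + d^2 + 2*gamma*d*r).
    The number of panels must be even."""
    def integrand(r):
        return 1.0 / (1.0 + gamma ** 2 + d ** 2 + 2.0 * gamma * d * r)

    h = 2.0 / panels
    total = integrand(-1.0) + integrand(1.0)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * integrand(-1.0 + i * h)
    return 2.0 * math.pi * gamma ** 2 * total * h / 3.0


def upper_bound(gamma, d):
    """Last line of (5.19): 4*pi*gamma^2 / (1 + (gamma - d)^2)."""
    return 4.0 * math.pi * gamma ** 2 / (1.0 + (gamma - d) ** 2)


# Sample radii and distances chosen for illustration only.
for gamma, d in [(3.0, 5.0), (1.0, 0.5), (2.0, 2.0)]:
    exact = sphere_integral_closed_form(gamma, d)
    approx = sphere_integral_numeric(gamma, d)
    print(gamma, d, exact, approx, upper_bound(gamma, d))
```

The bound in the last line of (5.19) is sharpest when $\gamma$ is close to $|z_1 - z_2|$, which is the regime that produces the integral $J$ later in the proof.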
Below we will also need some more refined estimates, and for this purpose we note that according to (5.20) also
\[
\int_{|y - z_1| = \gamma} d^2y\, \frac{1}{1 + |y - z_2|^3} \le C\, \frac{\gamma^2}{1 + (|z_1 - z_2| + \gamma)^2} \le C\, \frac{\gamma^2}{|z_1 - z_2|^2}.
\tag{5.22}
\]
Analogously we obtain
\[
\int_{|y - z_1| = \gamma} d^2y\, \frac{1}{1 + |y - z_2|^4} \le C \min\Big\{ 1, \frac{1}{|z_1 - z_2|^2} \Big\}\, \frac{\gamma^2}{1 + (|z_1 - z_2| - \gamma)^2}.
\tag{5.23}
\]
To bound $I^{(2)}_{\alpha\beta}(t, s, x)$ for $\beta \neq \alpha$ and $s \in [t - t_0, t]$ we then use (5.21) and (5.19) with $z_1 = x + q_\alpha(t)$, $z_2 = q_\beta(s)$, and $\gamma = t - s$ to obtain for $s \in [t - t_0, t]$,
\[
I^{(2)}_{\alpha\beta}(t, s, x) \le C \left[ \frac{(t - s)^2}{|x + q_\alpha(t) - q_\beta(s)|} + \frac{(t - s)^2}{1 + [(t - s) - |x + q_\alpha(t) - q_\beta(s)|]^2} \right].
\tag{5.24}
\]
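Therefore by (5.18) and (5.24),
\[
\begin{aligned}
\int_0^t ds\, \frac{\xi(s)}{(t - s)^2}\, I^{(2)}_{\alpha\beta}(t, s, x)
&\le \sup_{s\in[0,t]} \xi(s) \left[ \int_0^{t - t_0} \frac{ds}{(t - s)^2}\, I^{(2)}_{\alpha\beta}(t, s, x) + \int_{t - t_0}^{t} \frac{ds}{(t - s)^2}\, I^{(2)}_{\alpha\beta}(t, s, x) \right] \\
&\le C \sup_{s\in[0,t]} \xi(s) \left[ \int_0^{t - t_0} \frac{ds}{1 + (t - s)^2} + \int_{t - t_0}^{t} \frac{ds}{|x + q_\alpha(t) - q_\beta(s)|} \right. \\
&\qquad\qquad\qquad\quad \left. + \int_{t - t_0}^{t} \frac{ds}{1 + [(t - s) - |x + q_\alpha(t) - q_\beta(s)|]^2} \right].
\end{aligned}
\tag{5.25}
\]
The first of the three integrals is bounded by a constant. Concerning the second, we have
\[
|x + q_\alpha(t) - q_\beta(s)| \ge |q_\alpha(t) - q_\beta(t)| - |x| - |q_\beta(t) - q_\beta(s)| \ge C_*\varepsilon^{-1} - R_\varphi - C\sqrt{\varepsilon}\,(t - s)
\]
by (2.6) and (2.8). In the domain of integration $[t - t_0, t]$ it holds that $t - s \le t_0 \le C\varepsilon^{-1}$, whence
\[
|x + q_\alpha(t) - q_\beta(s)| \ge C_*\varepsilon^{-1} - R_\varphi - C\varepsilon^{-1/2} \ge (C_*/2)\varepsilon^{-1}, \qquad s \in [t - t_0, t],\quad \beta \neq \alpha,\quad |x| \le R_\varphi,
\tag{5.26}
\]
for $\varepsilon$ small. Therefore the second integral can be bounded by $C\varepsilon \int_{t - t_0}^{t} ds \le C\varepsilon t_0 \le C$. To estimate the last integral $=: J$ on the right-hand side of (5.25), we substitute $\theta = t - s$ to obtain
\[
J = \int_0^{t_0} \frac{d\theta}{1 + [\theta - r(\theta)]^2}
\tag{5.27}
\]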
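with $r(\theta) = |x + q_\alpha(t) - q_\beta(t - \theta)|$. Observe that $|\dot r(\theta)| \le |\dot q_\beta(t - \theta)| \le C\sqrt{\varepsilon}$ by (2.8). Thus $\theta \mapsto \chi(\theta) = \theta - r(\theta)$ is strictly increasing, and we can substitute $\theta = \theta(\chi)$ to get
\[
J = \int_{\chi(0)}^{\chi(t_0)} \frac{d\chi}{1 - \dot r(\theta)}\, \frac{1}{1 + \chi^2} \le C \int_{\mathbb{R}} \frac{d\chi}{1 + \chi^2} \le C.
\]
Summarizing these estimates we obtain the bound claimed in part (a) of the lemma.

(b) Defining $I^{(n)}_{\alpha\beta}$ as in (5.16), we need to show
\[
\int_0^t \frac{ds}{(t - s)^2}\, I^{(3)}_{\alpha\beta}(t, s, x) \le C\varepsilon, \qquad \beta \neq \alpha.
\tag{5.28}
\]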
By (5.18), Z
t−t0 0
ds (3) I (t, s, x) ≤ C (t − s)2 αβ
Z 0
t−t0
ds (t − s) . (t − s)2 1 + (t − s)2
In the domain of integration, (t − s) ≥ t0 ≤ Cε−1 , and hence Z t Z t−t0 ds ds (3) I (t, s, x) ≤ Cε ≤ Cε. αβ 2 2 (t − s) 0 0 1 + (t − s)
(5.29)
Thus it remains to estimate the part of the integral in (5.28) for s ∈ [t − t0 , t]. Firstly, by (5.22), Z Z t 1 ds d 2y 3 2 |y−x−qα (t)|=(t−s) (1 + |y − qβ (s)| ) t−t0 (t − s) Z t Z t ds ≤ Cε2 ds = Cε2 t0 ≤ Cε. (5.30) ≤C 2 |x + q (t) − q (s)| α β t−t0 t−t0 Here we have used |x + qα (t) − qβ (s)| ≥ (C∗ /2)ε−1 for ε small, cf. (5.26). Reference to this is possible, since we again have that β 6= α. Analogously we infer from (5.23) that Z Z t (t − s) ds d 2y 4 2 |y−x−qα (t)|=(t−s) (1 + |y − qβ (s)| ) t−t0 (t − s) Z t (t − s) 1 ds ≤ |x + qα (t) − qβ (s)|2 1 + [(t − s) − |x + qα (t) − qβ (s)|]2 t−t0 Z t ds = CεJ ≤ Cε, ≤ Cε 2 t0 2 t−t0 1 + [(t − s) − |x + qα (t) − qβ (s)|] with the bounded J from (5.27). This together with (5.30) and (5.29) shows that (5.28) is satisfied. (c) Due to the remarks in (a), (5.14), and (5.15) we only have to prove Z τ Z t ds (3) dτ I (t, s, x) ≤ C, β 6 = α, t ∈ [0, T ε−3/2 ], 2 αβ 0 0 (t − s)
|x| ≤ Rϕ . (5.31)
We decompose the domain of integration 1t = {(τ, s) : τ ∈ [0, t], s ∈ [0, τ ]} in 1t,1 = 1t ∩ {(τ, s) : s ∈ [0, t − t0 ]} and 1t,2 = {(τ, s) : τ ∈ [t − t0 , t], s ∈ [t − t0 , τ ]}. On 1t,1 we can utilize (5.18) to get Z Z
1 (3) dτ ds I (t, s, x) ≤ C 2 αβ (t − s) 1t,1
Z 0
t
Z dτ 0
τ
ds
1 ≤ C. 1 + (t − s)3
(5.32)
Since again t − s ≤ t0 ≤ Cε−1 for (τ, s) ∈ 1t,2 , by (5.26) and (5.21) Z 1 d 2y dτ ds (t − s)2 |y−x−qα (t)|=(t−s) 1 + |y − qβ (s)|3 1t,2 Z Z 1 1 ≤C dτ ds (t − s) |x + qα (t) − qβ (s)| 1t,2 Z τ Z t Z t ds dτ ds = Cεt0 ≤ C. = Cε ≤ Cε t−t0 t−t0 t − s t−
Z Z
(5.33)
In addition, by (5.23) Z 1 d 2y dτ ds (t − s) |y−x−qα (t)|=(t−s) 1 + |y − qβ (s)|4 1 Z t,2Z 1 1 ≤ dτ ds (t − s) 1 + [(t − s) − |x + qα (t) − qβ (s)|]2 1t,2 Z t ds = ≤ C, 2 t−t0 1 + [(t − s) − |x + qα (t) − qβ (s)|]
Z Z
(5.34)
since the last integral is just J from (5.27) and hence bounded. By (5.32), (5.33), and (5.34) we thus have proved (5.31). u t 2 − (k · v)2 ]. Then for x ∈ R3 and ˆ Lemma 5.3. Define φv (x) through φˆ v (k) = eϕ(k)/[k |v| ≤ v¯ < 1, with ∇ = ∇x ,
|∇φv (x)| + |∇v ∇φv (x)| + |∇v2 ∇φv (x)| + |∇v3 ∇φv (x)| ≤ C|e|(1 + |x|)−2 ,
|∇ 2 φv (x)| + |∇v ∇ 2 φv (x)| + |∇v2 ∇ 2 φv (x)| + |∇v3 ∇ 2 φv (x)| ≤ C|e|(1 + |x|)−3 , |∇ 3 φv (x)| + |∇v ∇ 3 φv (x)| + |∇v2 ∇ 3 φv (x)| + |∇v3 ∇ 3 φv (x)| ≤ C|e|(1 + |x|)−4 , |∇ 4 φv (x)| + |∇v ∇ 4 φv (x)| + |∇v2 ∇ 4 φv (x)| + |∇v3 ∇ 4 φv (x)| ≤ C|e|(1 + |x|)−5 .
Proof. Tedious calculations; see also the appendices of [5, 6].
t u
Rt Now we can estimate 0 ds [U (t −s)f (·, s)](x +qα (t)), cf. (5.9), for t ∈ [0, T ε−3/2 ] and |x| ≤ Rϕ , using Lemma 5.1 and Lemma 5.2(a), with f = (f1 , f2 ) defined by (5.8). Since ∇ · Bv = 0, and ∇ · Ev = eϕ is independent of v, we have ∇ · f1 = 0 = ∇ · f2 . Concerning (5.10) and (5.11), note |∇v Ev (x)| + |∇v Bv (x)| ≤ C(|∇φv (x)| + |∇v ∇φv (x)|) ≤ C|e|(1 + |x|)−2 and |∇v ∇Ev (x)| + |∇v ∇Bv (x)| ≤ C(|∇ 2 φv (x)| + |∇v ∇ 2 φv (x)|) ≤ C|e|(1 + |x|)−3 by Lemma 5.3. Thus (5.10) and (5.11) are satisfied
with ξ(s) = max1≤β≤N |v˙β (s)| max1≤β≤N |eβ | . As Z(x, 0) = 0, hence (5.9) in conjunction with Lemma 5.2(a) yields for α = 1, . . . , N, max |eβ | , |Z(x + qα (t), t)| ≤ C sup max |v˙β (s)| 1≤β≤N s∈[0,t] 1≤β≤N (5.35) −3/2 ], |x| ≤ Rϕ . t ∈ [0, T ε We will utilize this further in (5.5), and to this end we also need to bound Rα (t) from (5.6). For fixed β 6 = α one calculates for the interaction terms Z 9αβ (t) = d 3 x ρα (x − qα (t))∇φvβ (t) (x − qβ (t)) Z = (−i) eα eβ eα eβ = 4π
d 3k k
Z Z
2 |ϕ(k)| ˆ
k 2 − (k · vβ (t))2
d 3 xd 3 y ϕ(x − qα (t))ϕ(y − qβ (t))∇ζvβ (t) (x − y), (5.36)
with ζv (x) =
eik·[qβ (t)−qα (t)]
r
1 2 1/2
[(1 − v 2 )x 2 + (x · v) ]
,
ζˆv (k) =
1 2 , 2 π k − (k · v)2
|v| < 1. (5.37)
Then supt∈[0,T ε−3/2 ] |∇ζvβ (t) (x)| ≤ C(1 + |x|)−2 due to (2.8). By (C), in (5.36) we only need to integrate over (x, y) that have |x − qα (t)| ≤ Rϕ and |y − qβ (t)| ≤ Rϕ . Then by (2.6), |x − y| ≥ |qα (t) − qβ (t)| − 2Rϕ ≥ C∗ ε−1 − 2Rϕ ≥ (C∗ /2)ε−1 for ε small. Therefore (5.36) shows |9αβ (t)| ≤ Cε2 ,
t ∈ [0, T ε−3/2 ],
α 6= β.
(5.38)
By definition of Bv (x) and Ev (x) we have Rα (t) = m0α (vα (t))−1 X · − 9αβ (t) + [vβ (t) · 9αβ (t)]vβ (t) + vα (t) ∧ [−vβ (t) ∧ 9αβ (t)] β6=α
(5.39) cf. (5.6), and therefore (5.38) together with (2.8) implies |Rα (t)| ≤ Cε2 ,
t ∈ [0, T ε−3/2 ].
(5.40)
Hence (5.5), (5.35), and (5.40) finally yield max |eβ | + Cε2 , |v˙α (t)| ≤ C sup max |v˙β (s)| s∈[0,t] 1≤β≤N
1≤β≤N
for every α = 1, . . . , N and t ∈ [0, T ε −3/2 ]. Choosing max1≤β≤N |eβ | ≤ e¯ with e¯ > 0 sufficiently small, we find that for α = 1, . . . , N, sup t∈[0, T ε−3/2 ]
|v˙α (t)| ≤ Cε2 .
(5.41)
For later reference we also note that then according to (5.35), |Z(x + qα (t), t)| ≤ Cε2 ,
α = 1, . . . , N,
t ∈ [0, T ε−3/2 ],
|x| ≤ Rϕ .
(5.42)
5.3. Bounding |v¨α (t)|. By (2.8) we have in particular that √ |vα (t) − vβ (t)| ≤ C ε, t ∈ [0, T ε−3/2 ].
(5.43)
In order to estimate the derivative of Eq. (5.5), first note that using the explicit form of m0α (vα )−1 we obtain from (5.41) that d m0α (vα (t))−1 ≤ C|v˙α (t)| ≤ Cε2 . (5.44) dt Hence by (5.5), (5.42) and (5.40), |v¨α (t)| ≤ C ε4 + |Mα (t)| + |R˙ α (t)| ,
(5.45)
with Rα defined in (5.6), and Z h i Mα (t) = d 3 x ρα (x) (Lα (t)Z1 )(x + qα (t), t) + vα (t) ∧ (Lα (t)Z2 )(x + qα (t), t) , (5.46) where Lα (t)φ = (vα (t) · ∇)φ + ∂t φ for a function φ = φ(x, t). We first estimate Mα (t). d [Lα (t)φ] = Lα (t)φ˙ + (v˙α · ∇)φ and, Let 6α (x, t) = (Lα (t)Z)(x, t). Since generally dt see (5.7), Z˙ = AZ − f with f from (5.8), we obtain ˙ α = A6α + (v˙α · ∇)Z − Lα (t)f. 6 According to (2.3) it may be shown that 6α (x, 0) = 0. We hence get Z t h i dτ U (t − τ ) (v˙α (τ ) · ∇)Z(·, τ ) − Lα (τ )f (·, τ ) 6α (x + qα (t), t) = 0
· (x + qα (t)) =: 6α,1 (x + qα (t), t) − 6α,2 (x + qα (t), t). d (∇Z) = ∇(AZ − f ) = A(∇Z) − ∇f and Z(x, 0) = 0, we As a consequence of dt obtain from the group property of U (·) that Z t h i dτ U (t − τ ) (v˙α (τ ) · ∇)Z(·, τ ) (x + qα (t)) 6α,1 (x + qα (t), t) = 0 Z τ h Z t i dτ ds U (t − s) (v˙α (τ ) · ∇)f (·, s) (x + qα (t)). =− 0
0
With g(y, τ, s) = v˙α (τ )·∇f (y, s) it follows from the definitions of f , Ev (x), and Bv (x) that ∇ · g = 0. Moreover, by (5.41) and Lemma 5.3 we find that (5.14) and (5.15) are satisfied with ξ(τ, s) = ε 4 . Therefore Lemma 5.2(c) applies to yield for α = 1, . . . , N, |6α,1 (x + qα (t), t)| ≤ Cε4 , To estimate 6α,2 (x + qα (t), t) = −
Z 0
t
t ∈ [0, T ε−3/2 ],
|x| ≤ Rϕ .
h i dτ U (t − τ ) Lα (τ )f (·, τ ) (x + qα (t)),
(5.47)
observe that [Lα (τ )f (·, τ )](x) = vα (τ ) · ∇f (x, τ ) + ∂t f (x, τ ) N n o X = (v¨β · ∇v )8vβ (x − qβ ) + (v˙β · ∇v )2 8vβ (x − qβ ) β=1
+
N X
2 ∇xv 8vβ (x − qβ )(vα − vβ , v˙β )
β=1
β6=α
=: f \ (τ, y) + f [ (τ, y), with all time arguments taken at time τ , and 8v = (Ev , Bv ). Since ∇ · Bv = 0 and ∇ · Ev = eϕ is independent of v, we have that ∇ · f \ = 0 = ∇ · f [ . In addition, f \ satisfies (5.10) and (5.11) with max |eβ | + ε4 . ξ \ (τ ) = max |v¨β (τ )| 1≤β≤N
1≤β≤N
Because f [ has an additional x-derivative, moreover (5.12) and (5.13) hold for f [ , with ξ [ (τ ) = max |vα (τ ) − vβ (τ )| ε2 , 1≤β≤N
as again follows from Lemma 5.3 and (5.41). Thus Lemma 5.2(a) and (b) imply that for all α = 1, . . . , N, t ∈ [0, T ε −3/2 ], and |x| ≤ Rϕ , |6α,2 (x + qα (t), t)| Z t Z t dτ [U (t − τ )f \ (·, τ )](x + qα (t)) + dτ [U (t − τ )f [ (·, τ )](x + qα (t)) ≤ 0 0 \ [ ≤ C sup ξ (τ ) + sup ξ (τ )ε
τ ∈[0,t]
≤ C ε4 +
+
sup
τ ∈[0,t]
sup
max |v¨β (τ )| max |eβ |
τ ∈[0,t] 1≤β≤N
1≤β≤N
max |vα (τ ) − vβ (τ )| ε3 .
τ ∈[0,t] 1≤β≤N
Hence by (5.47) and (5.43) for α = 1, . . . , N, t ∈ [0, T ε−3/2 ], and |x| ≤ Rϕ , 7/2 max |eβ | . |6α (x + qα (t), t)| ≤ C ε + sup max |v¨β (τ )| τ ∈[0,t] 1≤β≤N
1≤β≤N
According to the definition of Mα (t) in (5.46) we therefore have Z h i d 3 x ρα (x) 6α,1 (x + qα (t), t) + vα (t) ∧ 6α,2 (x + qα (t), t) |Mα (t)| = |x|≤Rϕ
≤C ε
7/2
+
sup
max |v¨β (τ )|
τ ∈[0,t] 1≤β≤N
max |eβ |
1≤β≤N
.
(5.48)
To further estimate the right-hand side of (5.45), we have to bound R˙ α (t), with Rα (t) from (5.6). Calculating R˙ α (t) explicitly we obtain XZ d m0α (vα )−1 m0α (vα )Rα (t) + m0α (vα )−1 d 3 x ρα (x − qα ) R˙ α (t) = dt β=1 β6 =α
h
i
· (v˙β · ∇v )Evβ (x − qβ ) + vα ∧ (v˙β · ∇v )Bvβ (x − qβ ) XZ h d 3 x ρα (x − qα ) ((vα − vβ ) · ∇)Evβ (x − qβ ) + m0α (vα )−1 β=1
β6=α
i
+vα ∧ ((vα − vβ ) · ∇)Bvβ (x − qβ ) XZ d 3 x ρα (x − qα ) v˙α ∧ Bvβ (x − qβ ) + m0α (vα )−1 β=1
β6=α
=: R˙ α,1 (t) + R˙ α,2 (t) + R˙ α,3 (t) + R˙ α,4 (t) with all time arguments at time t. Firstly, d −1 m0α (vα ) m0α (vα )Rα (t) ≤ Cε4 |R˙ α,1 (t)| = dt
(5.49)
for α = 1, . . . , N and t ∈ [0, T ε −3/2 ] by (5.44) and (5.40). Since Bv (x) = −v ∧ ∇φv (x), by (5.41), (2.8), and (5.38) also |R˙ α,4 (t)| ≤ Cε9/2 .
(5.50)
What concerns R˙ α,2 (t), we may repeat the calculation in (5.36) to obtain Z 2 φvβ (t) (x − qβ (t)) ∇v 9αβ (t) := d 3 x ρ(x − qα (t))∇xv Z Z 1 2 d 3 xd 3 y ρ(x − qα (t))ρ(y − qβ (t))∇xv ζvβ (t) (x − y), = 4π −2 2 ζ with ζv (x) from (5.37). Since supt∈[0, T ε−3/2 ] |∇xv vβ (t) (x)| ≤ C(1 + |x|) , we get as before that
|∇v 9αβ (t)| ≤ Cε2 ,
t ∈ [0, T ε−3/2 ],
α 6 = β,
and hence by (5.41), |R˙ α,2 (t)| ≤ Cε4 .
(5.51)
So finally we have to bound R˙ α,3 (t), and this relies on a similar argument. Here we have Z ∇9αβ (t) := d 3 x ρ(x − qα (t))∇ 2 φvβ (t) (x − qβ (t)) Z Z 1 d 3 xd 3 y ρ(x − qα (t))ρ(y − qβ (t))∇ 2 ζvβ (t) (x − y), = 4π
and supt∈[0, T ε−3/2 ] |∇ 2 ζvβ (t) (x)| ≤ C(1 + |x|)−3 . This in turn yields |∇9αβ (t)| ≤ Cε3 ,
t ∈ [0, T ε−3/2 ],
α 6= β.
Using the explicit form of Ev (x) and Bv (x), as in (5.39), we then get for t ∈ [0, T ε−3/2 ], |R˙ α,3 (t)| ≤ Cε3 |vα (t) − vβ (t)| ≤ Cε7/2 ,
(5.52)
by (5.43). Summarizing (5.49), (5.50), (5.51), and (5.52) it follows that |R˙ α (t)| ≤ Cε7/2 ,
α = 1, . . . , N,
t ∈ [0, T ε−3/2 ].
(5.53)
Consequently, by (5.45), (5.48), and (5.53) for α = 1, . . . , N and t ∈ [0, T ε−3/2 ], |v¨α (t)| ≤ C ε4 + |Mα (t)| + |R˙ α (t)| 7/2 max |eβ | . ≤ C ε + sup max |v¨β (τ )| τ ∈[0,t] 1≤β≤N
1≤β≤N
Choosing max1≤β≤N |eβ | ≤ e¯ with e¯ sufficiently small we hence obtain sup t∈[0, T ε−3/2 ]
|v¨α (t)| ≤ Cε7/2 ,
This completes the proof of Lemma 2.1.
α = 1, . . . , N.
t u
6. Appendix B: Proof of Lemma 3.2 Here we give the proof of Lemma 3.2. We verify e.g. (b). To compare the left-hand side to the right-hand side of the assertion, we will insert some additional terms and estimate the corresponding differences Dj (t), j = 1, 2, 3, for t ∈ [t0 , T ε−3/2 ], where t0 = 4(Rϕ + C ∗ ε−1 ). First we introduce Z t Z 2 −ik·ξαβ dτ d 3 k |ϕ(k)| ˆ e D1 (t) = i 0 −ik·[qβ (t)−qβ (t−τ )] −ik·[τ vβ − 21 τ 2 v˙β ] sin |k|τ k · e −e |k| Z Z t 2 −ik·ξαβ dτ d 3 k |ϕ(k)| ˆ e = − ∇ξ 0 −ik·[qβ (t)−qβ (t−τ )] −ik·[τ vβ − 21 τ 2 v˙β ] sin |k|τ · e −e |k| Z Z d 3 xd 3 y ϕ(x)ϕ(y) = − ∇ξ Z t n dτ ψτ [ξαβ + x − qβ (t − τ )] − [y − qβ (t)] × 0
o 1 − ψτ [x − τ 2 v˙β ] − [y − τ vβ ] , 2
(6.1)
as follows through application of the Fourier transform, with ξαβ = qα (t) − qβ (t), and ψτ (x) = (4π|x|)−1 for |x| = τ whereas ψτ (x) = 0 otherwise. We claim that for x, y ∈ R3 with |x|, |y| ≤ Rϕ and t ∈ [t0 , T ε−3/2 ] there exists a unique τ0 = τ0 (x, y, t, ξαβ ) ∈ [0, t0 ] ⊂ [0, t] such that (6.2) τ0 = [ξαβ + x − qβ (t − τ0 )] − [y − qβ (t)] . To see this, observe with θ(τ ) = τ − |[ξαβ + x − qβ (t − τ )] − [y − qβ (t)]| that √ 0 ≥ θ(0) ≥ −(2Rϕ + C ∗ ε−1 ) and θ 0 (τ ) ≥ 1 − Cv ε by (2.6) and (2.8). For ε so small √ that 1 − Cv ε ≥ 1/2 we hence obtain θ (t0 ) ≥ −(2Rϕ + C ∗ ε−1 ) + t0 /2 = 2C ∗ ε−1 . This shows θ has a unique zero τ0 ∈ [0, t0 ]. Moreover (6.2) together with (2.6) implies √ τ0 ≥ |ξαβ | − |x − qβ (t − τ0 )] − [y − qβ (t)]| ≥ C∗ ε−1 − 2Rϕ − Cv ετ0 , whence also τ0 ≥ Cε−1 for ε small. Similarly, we find a unique τ1 = τ1 (x, y, t, ξαβ ) satisfying 1 (6.3) τ1 = [ξαβ + x − τ12 v˙β ] − [y − τ1 vβ ] , 2 with τ1 having the same properties as τ0 . By definition of ψτ we therefore may simply write Z Z d 3 xd 3 y ϕ(x)ϕ(y) ∇ξ τ0−1 − τ1−1 . (6.4) D1 (t) = − To estimate this, we calculate from (6.2) that −1 −3 [ξαβ + x − qβ (t − τ0 )] − [y − qβ (t)] ∇ξ τ0 = −τ0
+ [ξαβ + x − qβ (t − τ0 )] − [y − qβ (t)] · vβ (t − τ0 )∇ξ τ0 , with an analogous expression for ∇ξ τ1−1 . Therefore h i −1 −1 ∇ξ τ0 − τ1 ≤ C τ0−3 |qβ (t − τ0 ) − qβ (t − τ1 )| 1 + |vβ (t − τ1 )||∇ξ τ1 | h i +|τ0−3 − τ1−3 | [ξαβ + x − qβ (t − τ1 )] − [y − qβ (t)] 1 + |vβ (t − τ1 )||∇ξ τ1 | +τ0−2 |vβ (t − τ0 ) − vβ (t − τ1 )||∇ξ τ1 | +τ0−2 |vβ (t − τ0 )||∇ξ (τ0 − τ1 )| . From (6.2), (6.3), and according to the Taylor expansion 1 qβ (t − τ ) = qβ (t) − τ vβ + τ 2 v˙β + O(ε 7/2 τ 3 ), 2 cf. Lemma 2.1, it follows that 1 1 |τ0 − τ1 | ≤ τ0 vβ − τ02 v˙β − τ1 vβ + τ12 v˙β + O(ε7/2 τ03 ) 2 2 √ ≤ C ε |τ0 − τ1 | + Cε2 (τ0 + τ1 )|τ0 − τ1 | + O(ε7/2 τ03 ),
(6.5)
whence
√ |τ0 − τ1 | = O( ε ),
|τ0−3 − τ1−3 | = O(ε9/2 ),
recall Cε −1 ≤ τ0 , τ1 ≤ t0 = O(ε−1 ). Differentiating (6.2) and (6.3) w.r. to ξ = ξαβ we moreover get |∇ξ τ0 | + |∇ξ τ1 | = O(1), and after a longer calculation which we omit √ also |∇ξ (τ0 − τ1 )| ≤ C ε3/2 + ε |∇ξ (τ0 − τ1 )| , thus |∇ξ (τ0 − τ1 )| ≤ Cε3/2 . Utilizing these estimates and Lemma 2.1 in (6.5), we consequently obtain |∇ξ (τ0−1 − τ1−1 )| ≤ Cε7/2 . Hence (6.4) yields sup t∈[t0 , T ε−3/2 ]
|D1 (t)| ≤ Cε7/2
(6.6)
as desired. Next, with Z t Z 1 2 2 −ik·ξαβ dτ d 3 k |ϕ(k)| ˆ e e−ik·[τ vβ − 2 τ v˙β ] D2 (t) = i 0 h i 1 sin |k|τ 1 k, − 1 − ik · τ vβ − τ 2 v˙β − τ 2 (k · vβ )2 2 2 |k| it may be shown in a similar way that sup t∈[t0 , T ε−3/2 ]
|D2 (t)| ≤ Cε7/2 .
(6.7)
Rt Finally we need to compare 0 dτ (. . . ) to the infinite dτ -integral and thus let Z ∞ Z 2 −ik·ξαβ dτ d 3 k |ϕ(k)| ˆ e D3 (t) = i t
h i 1 sin |k|τ 1 k. · 1 − ik · τ vβ − τ 2 v˙β − τ 2 (k · vβ )2 2 2 |k|
With the notation Kp = e−ik·ξαβ
Z t
∞
dτ
sin |k|τ p τ , |k|
p = 0, . . . , 2,
this may be rewritten as Z 2 ˆ D3 (t) = d 3 k |ϕ(k)| 1 1 2 · − ∇ξ K0 − (vβ · ∇ξ )∇ξ K1 + (v˙β · ∇ξ )∇ξ K2 − (vβ · ∇ξ ) ∇ξ K2 . 2 2 Thus we only need to estimate Z Z ∞ Z sin |k|τ p 2 2 −ik·ξαβ 3 3 τ ˆ Kp = d k |ϕ(k)| ˆ e dτ d k |ϕ(k)| |k| t Z Z Z ∞ = d 3 xd 3 y ϕ(x)ϕ(y) dτ ψτ (ξαβ + x − y) τ p , (6.8) t
the latter equality follows analogously to (6.1). However, for $|x|, |y| \le R_\varphi$ and $t \in [t_0, T\varepsilon^{-3/2}]$ we obtain in case $\tau = |\xi_{\alpha\beta} + x - y|$ from (2.6) the contradiction $4(R_\varphi + C^*\varepsilon^{-1}) = t_0 \le t \le \tau \le 2R_\varphi + |\xi_{\alpha\beta}| \le 2R_\varphi + C^*\varepsilon^{-1}$. This shows the term in (6.8) is identically zero for $t \in [t_0, T\varepsilon^{-3/2}]$, and thus $D_3(t) = 0$ for $t \in [t_0, T\varepsilon^{-3/2}]$. Together with (6.6) and (6.7) this completes the proof of Lemma 3.2(b). □

Acknowledgement. We are grateful to A. Komech for discussions. HS thanks G. Schäfer for useful hints on post-Newtonian corrections in general relativity and for insisting on (1.9).
References
1. Damour T., Schäfer G.: Redefinition of position variables and the reduction of higher-order Lagrangians. J. Math. Phys. 22, 127–134 (1991)
2. Dautray R., Lions J.-L.: Mathematical Analysis and Numerical Methods for Science and Technology. Vol. 5: Evolution Problems I. Berlin–Heidelberg–New York: Springer, 1992
3. Komech A., Kunze M., Spohn H.: Effective dynamics for a mechanical particle coupled to a wave field. Commun. Math. Phys. 203, 1–19 (1999)
4. Komech A., Spohn H.: Long-time asymptotics for the coupled Maxwell-Lorentz equations. Comm. Partial Differential Equations 25, 559–584 (2000)
5. Kunze M., Spohn H.: Radiation reaction and center manifolds. To appear in SIAM J. Math. Anal.
6. Kunze M., Spohn H.: Adiabatic limit for the Maxwell–Lorentz equations. To appear in Annales Henri Poincaré
7. Landau L.D., Lifschitz E.M.: The Theory of Classical Fields. Oxford: Pergamon Press, 1962
8. Moser J.: Dynamical systems – past and present. In: Proc. of the ICM, Vol. 1 (Berlin 1998), Doc. Math., Extra Vol. I, 381–402 (1998)
9. Taylor J.H.: Binary pulsars and relativistic gravity. Rev. Mod. Phys. 66, 711–719 (1994)
10. Xia J.: The existence of noncollision singularities in Newtonian systems. Ann. Math. 135, 411–468 (1991)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 212, 469 – 501 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Statistical Properties of Locally Free Groups with Applications to Braid Groups and Growth of Random Heaps

A. M. Vershik$^1$, S. Nechaev$^{2,3}$, R. Bikbov$^3$

1 St. Petersburg Branch of Steklov Mathematical Institute, Fontanka 27, 119011 St. Petersburg, Russia
2 UMR 8626, CNRS - Université Paris XI, LPTMS, Bat. 100, Université Paris Sud, 91405 Orsay Cedex, France
3 L.D. Landau Institute for Theoretical Physics, Kosygin str. 2, 117940 Moscow, Russia

Received: 7 June 1999 / Accepted: 21 April 2000
Abstract: The main statistical characteristics of locally free groups: the growth, the drift and the entropy are considered and relations between them are established. Our results assert that: (i) the statistical properties of random walks (Markov chains) on locally free and braid groups are not the same as the uniform statistics on these groups, and (ii) the stabilization of the statistical characteristics exists when the number of generators of the group grows.

Contents
1. Introduction
2. Main Definitions and Statement of a Problem
3. Asymptotics of the Number of Words in the Locally Free Group (Logarithmic Volume)
4. Random Walk on Locally Free Group: The Drift
   4.1 Mathematical expectation of the heap’s roof
   4.2 Drift as mathematical expectation of number of cells in the heap
5. Random Walk on Locally Free Group and Semi-Group: The Entropy
   5.1 Entropy of random walk on semi-group $LF^+_{n+1}$
   5.2 Entropy of random walk on groups $LF_{n+1}$ and $LFI^+_{n+1}$
6. Conclusion
   6.1 Bounds for logarithmic volume and drift of random walk on braid group
   6.2 Physical interpretation of results
1. Introduction

The last years have been marked by growing interest in a number of problems of physical origin, dealing with probabilistic processes on noncommutative groups. Among these are the problems of statistics and topology of chain molecules and related statistical problems of knots (see, for example, [Ne]). Along with the known problems of construction of topological invariants of knots and links, investigation of homotopic classes and fibre bundles, a set of similar, but less investigated problems dealing with statistics of knots and braids should be noted. In a series of works [GN], problems concerning the mathematical expectation of the “complexity” of randomly generated knots were formulated, where the degree of a known algebraic invariant (polynomials of Jones, Alexander, HOMFLY and others) served as the characteristic of knot complexity. As for the theory of random walks on braid groups, only very few results are known, devoted to the limiting behavior of random walks and Brownian bridges on the simplest braid group $B_3$ [NGV]. Thus, neither the Poisson boundary nor an explicit expression for harmonic functions on braid groups has yet been found. At the same time it is clear that this set of problems is by and large connected to random walks on noncommutative groups.

In the present paper we consider statistical properties of locally free and braid groups, following an idea of the first author (A.V.) that was extended in the papers [Ve1, DN1, DN2]. To study braid groups we introduce the concept of so-called locally free groups, which are a particular case of local groups in the sense of [Ve1,Ve2,Ve3]. This concept gives us a very useful tool for bilateral approximations of the number of nonequivalent words in the braid groups and semi-groups. A very important and apparently rather new aspect of this problem consists in passing to the limit $n \to \infty$ in the group $B_n$; it is this limit that is considered in our work. We have found stabilization of various statistical characteristics of the local groups in this limit of a large number of generators.
In [Ve2,Ve3] a systematic approach to the computation of various numerical characteristics of countable groups is proposed. The essence of this approach is the simultaneous consideration of three numerical constants, characterizing the logarithmic volume, the entropy and the escape (the drift) of the uniform random walk on the group. It so happens that these characteristics are related by the strict fundamental inequality (see also [Ve2,Ve3]), which means that the statistics of convolutions of measures on the generators is not the same as the uniform statistics on the set of words of a given length. In other words, generating the group step-by-step by a Monte Carlo method allows one to reach only an exponentially small fraction of the group. The same statement holds also for braid groups. The locally free groups play the role of the approximants to the braid groups.

2. Main Definitions and Statement of a Problem

We begin with definitions of a locally free group and semi-groups, which are special cases of a noncommutative local group and semi-groups [Ve1,Ve2,Ve3].

Definition 1 (Locally free group and semi-group). The locally free group (semi-group) $LF_{n+1}$ ($LF^+_{n+1}$) with $n$ generators $\{f_1, \dots, f_n\}$ is a group (semi-group), determined by the following relations:
$$f_j f_k = f_k f_j \qquad \forall\, |j - k| \ge 2, \qquad \{j, k\} = 1, \dots, n. \tag{1}$$
Each pair of neighboring generators $(f_j, f_{j\pm1})$ produces a free subgroup (sub-semigroup) of the group $LF_{n+1}$ (semi-group $LF^+_{n+1}$). In addition to the locally free group $LF_{n+1}$ we define a few similar objects:
Definition 2.
1. Locally free semi-group $LF^{+(r)}_{n+1}$ of finite order $r$. The semi-group $LF^{+(r)}_{n+1}$ is the locally free semi-group $LF^+_{n+1}$ subject to extra “finite order” relations $(f_i)^r = 1$; $i = 1, \dots, n$.
2. Locally free idempotent semi-group $LFI^+_{n+1}$. The idempotent semi-group $LFI^+_{n+1}$ is the locally free semi-group $LF^+_{n+1}$ subject to extra “idempotent” relations $(f_i)^2 = f_i$; $i = 1, \dots, n$.
A concept equivalent to that of a locally free semi-group $LF^+_{n+1}$ appeared earlier in [CF], devoted to the investigation of combinatorial properties of substitutions of sequences and so-called “partially commutative monoids” (see [Vi] and references there). Especially productive is the geometrical interpretation of monoids in the form of a “heap”, offered by G. X. Viennot and connected with various questions of statistics of directed growth and parallel computations. The case of a group (instead of a semi-group) introduces a number of additional complications to the model of a heap and apparently has not been considered in the literature. We touch on it in more detail below.

It makes sense to give a more general definition of $\Gamma$-locally free groups.

Definition 3 ($\Gamma$-locally free group). Let $\Gamma$ be a graph. Call the group $LF(\Gamma)$ $\Gamma$-locally free if the generators $g_\gamma$ of $LF(\Gamma)$ can be labeled by vertices $\gamma$ of the graph $\Gamma$ and two generators commute if the vertices are not neighbors in the graph. The semi-group can be defined in the same way. If $\Gamma$ is a $p$-cycle then we call the corresponding locally free group a cyclic locally free group and denote it by $CLP_p$. For more details see [Ve2,Ve3,Ve1].

A more general concept of a locally free group is the locally free group of depth $m$.

Definition 4 (Locally free group of the depth $m$). The group $G$ with the set of generators $f_1, f_2, \dots, f_n$, $n \ge m$, is called locally free of depth $m$ if
$$f_j f_k = f_k f_j \qquad \forall\, |j - k| \ge m + 1, \qquad \{j, k\} = 1, \dots, n. \tag{2}$$
For $m = 1$ we return to the previous notion. Let us recall finally the definition of the local group [Ve2,Ve3,Ve1].

Definition 5 (Local group). If generators $f_1, f_2, \dots, f_n$ of the group $G$ satisfy the commutation relation $f_j f_k = f_k f_j$ $\forall\, |j - k| \ge 2$ and might have additional relations $R$ between neighbors $f_j, f_{j+1}$ $\forall j = 1, \dots, n - 1$, then $G$ is a local group. If the relations $R$ are the same for all $j$ then $G$ is called a local stationary group.

Many important groups, semi-groups and algebras are of the type of local groups, for example, the Coxeter groups, Hecke algebra, etc. Obviously, a locally free group with $n$ generators ($n \le \infty$) is a universal object in the manifold of all local groups with the same number of generators. Now we give the definition of the braid group and establish the link between a braid group and a locally free group.
Fig. 1. Graphic representation of generators of braid group $B_{n+1}$ (panels: $\sigma_j$ and $\sigma_j^{-1}$)
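Definition 6 (Artin braid group). The braid group $B_{n+1}$ of $n + 1$ “strings” has $n$ generators $\{\sigma_1, \dots, \sigma_n\}$ with the following relations:
$$\begin{cases} \sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1} & (1 \le i < n) \\ \sigma_i \sigma_j = \sigma_j \sigma_i & (|i - j| \ge 2) \end{cases} \tag{3}$$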
There exists an extensive literature on general properties of braid groups, see [Bi1]; for recent work on the normal forms of words, see [Bi2]. An element of the braid group $B_n$ is given by a word in the alphabet $\{\sigma_1, \dots, \sigma_n; \sigma_1^{-1}, \dots, \sigma_n^{-1}\}$, see Fig. 1. By the length $N$ of a record of a braid we mean just the length of a word in a given record of the braid, and by the irreducible length (or simple length) the minimal length of a word in which the given braid can be written. The irreducible length can also be viewed as the distance from the unity on the Cayley graph $\Gamma$ of the group. Graphically the braid is represented by a set of strings, going from above downwards in accordance with the growth of the braid length. A closed braid is obtained by gluing the “top” and “bottom” free ends on a cylinder. A closed braid defines a link (in particular, a knot). The homotopy type of the link can be described in terms of algebraic characteristics of a braid [Jo]. A positive braid is by definition an element of the sub-semigroup generated by the generators of the braid group. The braid group $B_n$ is a local group. Moreover,
• The braid group $B_n$ is a factor-group of a locally free group $LF_n$, since $B_n$ is obtained from $LF_n$ by introducing the Yang–Baxter (braid) relations to $LF_n$;
• The locally free group $LF_n$ is simultaneously a subgroup of the braid group $B_n$, generated by the squares of the generators of $B_n$:

Lemma 1. Consider a subgroup $\bar B_n$ of the group $B_n$: $\bar B_n = \langle \bar\sigma_1, \dots, \bar\sigma_{n-1} \mid \bar\sigma_i = \sigma_i^2,\ i = 1, \dots, n \rangle$. The correspondence $\bar\sigma_i \leftrightarrow f_i$ sets the isomorphism of the groups $\bar B_n$ and $LF_n$.
Fig. 2. Graphic representation of generators of locally-free group $LF_{n+1}$ (panels: $f_j$ and $f_j^{-1}$)
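This lemma has been proved in [Hu, Co] and is a particular case of a general conjecture of J. Tits. We skip the full proof, giving only a hint of it. Consider the Burau representation
$$\sigma_1(t) = \begin{pmatrix} -t & 1 \\ 0 & 1 \end{pmatrix}; \qquad \sigma_2(t) = \begin{pmatrix} 1 & 0 \\ t & -t \end{pmatrix},$$
being the exact representation of $B_3$ over $\mathbb{C}[t]$. It is obvious that
$$\bar\sigma_1 = \sigma_1^2(t) = \begin{pmatrix} t^2 & -t + 1 \\ 0 & 1 \end{pmatrix}; \qquad \bar\sigma_2 = \sigma_2^2(t) = \begin{pmatrix} 1 & 0 \\ t - t^2 & t^2 \end{pmatrix}. \tag{4}$$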
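The matrix identities used in this hint can be checked mechanically. The following is a minimal sketch, not part of the original argument: it evaluates the two Burau matrices at a number $t$, verifies the braid relation and the squares in (4), and returns the squares so that the specialization $t = -1$ can be read off. The helper names `matmul`, `burau` and `check` are introduced here for illustration only.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def burau(t):
    """The matrices sigma_1(t), sigma_2(t) of the text, evaluated at a number t."""
    s1 = [[-t, 1], [0, 1]]
    s2 = [[1, 0], [t, -t]]
    return s1, s2


def check(t):
    """Return (braid relation holds, squares match Eq. (4), sigma_1^2, sigma_2^2)."""
    s1, s2 = burau(t)
    braid_ok = matmul(matmul(s1, s2), s1) == matmul(matmul(s2, s1), s2)
    sq1, sq2 = matmul(s1, s1), matmul(s2, s2)
    squares_ok = (sq1 == [[t * t, 1 - t], [0, 1]]
                  and sq2 == [[1, 0], [t - t * t, t * t]])
    return braid_ok, squares_ok, sq1, sq2


if __name__ == "__main__":
    print(check(-1))               # integer specialization t = -1
    print(check(Fraction(3, 7)))   # an arbitrary rational value of t
```

Exact rational arithmetic is used so that the comparisons are equalities rather than floating-point approximations. The check only confirms the matrix identities; it says nothing about freeness of the generated group.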
Putting $t = -1$, we see that (4) is reduced to
$$f_1 = \sigma_1^2(-1) = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}; \qquad f_2 = \sigma_2^2(-1) = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix},$$
which are the generators of the free group $\Gamma_2$. It should be noted that the matrices $\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}$ define the generators of a free group. This fact was proved apparently for the first time by I. Sanov [Sa].

Corollary 1. The locally free group $LF_n$ is simultaneously super- and subgroup of the braid group $B_n$.

This consequence will hereafter be used for transmitting the estimates from the locally free group to the braid group. The geometrical interpretation of the group $LF_{n+1}$ is shown in Fig. 2.

Let us formulate the main problems concerning the determination of asymptotic characteristics of locally free and similar groups. This is the realization of the general program which was discussed in [Ve2,Ve3] and concerns the asymptotic properties of the local groups and similar objects. Namely we introduce three statistical characteristics of the group: the logarithmic volume, drift and entropy, and study the fundamental inequality ([Ve2,Ve3]) which links these characteristics.

1. Asymptotics of number of words in a group (logarithmic volume). Let $G$ be the group with fixed framing $\{g_1, \dots, g_n\}$. The definition following hereafter makes sense for any group with a fixed and finite set of generators. Denote by $K(g)$ and call the length the minimal length of the word $g$, written in terms of generators $\{g_1, \dots, g_n; g_1^{-1}, \dots, g_n^{-1}\}$. The length defines a metric (the metrics of words [Gr]) on the group. Denote by $V(G, K)$ the number of elements of group $G$ of length $K$.

Definition 7. Call $v(G)$ the logarithmic volume of a given group $G$:
$$v(G) = \lim_{K\to\infty} \frac{\log V(G, K)}{K}. \tag{5}$$
The existence of the limit is discussed in [Ve2,Ve3]. We call $G$ the group of exponential growth if $v > 0$. In Sect. 3 we investigate the asymptotic behavior of logarithmic volumes $v(B_n)$, $v(LF_n)$ in the limit $n \to \infty$.

2. Random walk and average drift on a group. Consider the (right-hand side) random walk on any group $G$ with fixed framing $\{g_1, \dots, g_n; g_1^{-1}, \dots, g_n^{-1}\}$, i.e. consider the Markov chain with the following transition probabilities: the word $w$ transforms into $w\, g_i^{\pm1}$ with the probability $\frac{1}{2n}$; $i = 1, \dots, n$. Similarly one can build a left-hand Markov chain. Let $L(G, N)$ be the mathematical expectation of the length of a random word, obtained after $N$ steps of random walk on the group $G$.

Definition 8. Call $l(G)$ the drift on the group $G$ (see [Ve2,Ve3]):
$$l(G) = \limsup_{N\to\infty} \frac{L(G, N)}{N}. \tag{6}$$
Thus, the drift is the average speed of a flow to infinity in the metrics of words. In Sect. 4 we calculate the drift $l(LF_n)$ on the locally free group and its limit for $n \to \infty$.

3. Entropy of a random walk on a group. Let $\mu^N$ be the $N$-time convolution of a uniform measure $\mu$ on generators $\{f_1, \dots, f_n; f_1^{-1}, \dots, f_n^{-1}\}$.

Definition 9. The entropy (see [Av,De,KaiV,Ve2,Ve3]) of a random walk on a group with respect to $\mu$ is
$$h(G) = \lim_{N\to\infty} \frac{H(\mu^N)}{N} = \inf_N \frac{H(\mu^N)}{N}, \tag{7}$$
where $H(\nu) = -\sum_{x \in \operatorname{supp} \nu} \nu(x) \log \nu(x)$.

Section 5 is devoted to the computation of $h(LF_n)$ in the limit $n = \text{const} \gg 1$. The question of the simultaneous study of these three numerical characteristics (volume, drift and entropy) was posed by the first author (A.V.), see [Ve2,Ve3], and represents a serious and deep problem. In particular, the desire to find the above-defined characteristics for the braid group motivates our consideration of locally free and similar groups. These three quantities are connected by the fundamental inequality, which was suggested and proved for arbitrary groups in [Ve2,Ve3] (see also special earlier cases in [Av,Kai]):
$$v\, l \ge h. \tag{8}$$
For many groups (like the free group, for example) Eq. (8) is reduced to equality ([Ve2, Ve3]). In general it is an interesting problem to classify the groups in a given framing for which Eq. (8) becomes the equality. As we show below for locally free groups in standard framing the fundamental inequality is strict. We propose an explanation of this phenomenon and discuss its possible applications and physical consequences. (For more detailed consideration of the mathematical aspects see [Ve2,Ve3]).
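The free-group case mentioned above can be explored numerically. The following is a hedged sketch rather than anything taken from the paper: it computes the convolutions $\mu^N$ exactly on the free group with $n$ generators for small $N$ and reports $H(\mu^N)/N$ and $L(G, N)/N$ from Definitions 8 and 9. The closed forms $v = \log(2n - 1)$ and $l = 1 - 1/n$ for the free group are standard facts assumed here, not statements of the text; the function names are illustrative.

```python
import math
from collections import defaultdict


def step(dist, n):
    """One convolution step on the free group with n generators.

    dist maps reduced words (tuples of nonzero ints, -g meaning the inverse
    of g) to probabilities; every letter is applied with probability 1/(2n).
    """
    new = defaultdict(float)
    p = 1.0 / (2 * n)
    for word, prob in dist.items():
        for g in range(1, n + 1):
            for letter in (g, -g):
                if word and word[-1] == -letter:
                    nxt = word[:-1]            # free cancellation
                else:
                    nxt = word + (letter,)
                new[nxt] += prob * p
    return new


def free_group_stats(n, n_steps):
    """Return a list of (N, H(mu^N)/N, L(G,N)/N) for N = 1..n_steps."""
    dist = {(): 1.0}
    rows = []
    for N in range(1, n_steps + 1):
        dist = step(dist, n)
        entropy = -sum(p * math.log(p) for p in dist.values())
        mean_len = sum(p * len(w) for w, p in dist.items())
        rows.append((N, entropy / N, mean_len / N))
    return rows


if __name__ == "__main__":
    n = 2
    v = math.log(2 * n - 1)        # assumed closed form of the logarithmic volume
    l = 1.0 - 1.0 / n              # assumed closed form of the drift
    for N, h_N, l_N in free_group_stats(n, 8):
        print(N, round(h_N, 4), round(l_N, 4), "v*l =", round(v * l, 4))
```

Since $H(\mu^N)/N$ decreases only slowly, small $N$ gives upper estimates of $h$, not its value; the printed product $v\,l$ is the quantity $h$ would have to equal if (8) holds with equality.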
3. Asymptotics of the Number of Words in the Locally Free Group (Logarithmic Volume)

In this section we find the asymptotics in $n \gg 1$ of the logarithmic volume and precise expressions for numbers of words of locally free groups and semi-groups (see also [NGV, CN]). Later on, in Sect. 6 we use the results obtained here for the bilateral estimation of the logarithmic volume of the braid group.

Lemma 2. Any element of length $K$ in the group $LF_{n+1}$ can be uniquely written in the normal form
$$W = f_{\alpha_1}^{m_1} f_{\alpha_2}^{m_2} \cdots f_{\alpha_s}^{m_s}, \tag{9}$$
where $\sum_{i=1}^{s} |m_i| = K$ ($m_i \ne 0\ \forall i$; $1 \le s \le K$), and the indices $\alpha_1, \dots, \alpha_s$ satisfy the following conditions:
(i) If $\alpha_i = 1$ then $\alpha_{i+1} = 2, \dots, n$;
(ii) If $\alpha_i = k$ ($2 \le k \le n - 1$) then $\alpha_{i+1} = k - 1, k + 1, \dots, n$;
(iii) If $\alpha_i = n$ then $\alpha_{i+1} = n - 1$.

Proof. The proof directly follows from the definition of commutation relations in the group $LF_{n+1}$. $\square$

Let $\theta_n(s)$ be the number of all different sequences $\alpha_1, \dots, \alpha_s$ of $s$ indices ($1 \le s \le K$), satisfying the rules (i)–(iii). In other words, the local rules (i)–(iii) define a Markov chain of length $s$ on the set of indices $\{\alpha_1, \dots, \alpha_n\}$ with $n \times n$-dimensional transition matrix $\hat T_n$:
$$\hat T_n = \begin{pmatrix}
0 & 1 & 1 & 1 & \dots & 1 & 1 \\
1 & 0 & 1 & 1 & \dots & 1 & 1 \\
0 & 1 & 0 & 1 & \dots & 1 & 1 \\
0 & 0 & 1 & 0 & \dots & 1 & 1 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \dots & 0 & 1 \\
0 & 0 & 0 & 0 & \dots & 1 & 0
\end{pmatrix}. \tag{10}$$
Thus, $\theta_n(s)$ is a partition function, determined as follows:
$$\theta_n(s) = v_{\rm in} \big[\hat T_n\big]^{s-1} v_{\rm out}; \qquad v_{\rm in} = (\underbrace{1\ 1\ \dots\ 1}_{n}); \qquad v_{\rm out} = v_{\rm in}^{T}. \tag{11}$$
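First of all compute the spectrum of the matrix $\hat T_n$. Consider the determinant
$$D_n(\lambda) = \det\big(\hat T_n - \lambda \hat I\big) = \begin{vmatrix}
-\lambda & 1 & 1 & \dots \\
1 & -\lambda & 1 & \dots \\
0 & 1 & -\lambda & \dots \\
\vdots & \vdots & \vdots & \ddots
\end{vmatrix}. \tag{12}$$
It satisfies the recursion relation
$$D_n(\lambda) = -(\lambda + 1) D_{n-1}(\lambda) - (\lambda + 1) D_{n-2}(\lambda) \tag{13}$$
with the boundary conditions
$$\begin{cases} D_0(\lambda) = 1 \\ D_1(\lambda) = -\lambda \end{cases}. \tag{14}$$
For $\lambda > -1$ one may set
$$D_n(\lambda) = (\lambda + 1)^{\frac{n-1}{2}} (-1)^n \varphi_n(\lambda), \tag{15}$$
which gives for the function $\varphi_n(\lambda)$,
$$\varphi_n(\lambda) = \sqrt{\lambda + 1}\, \varphi_{n-1}(\lambda) - \varphi_{n-2}(\lambda). \tag{16}$$
The general solution of (16) satisfying the previously defined boundary conditions (14) is given in [CN] in terms of Chebyshev’s polynomials of the second kind
$$\varphi_n(\lambda) = U_{n+1}(\cos\vartheta), \qquad \text{where} \quad \cos\vartheta = \frac{\sqrt{\lambda + 1}}{2}. \tag{17}$$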
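The chain (12)–(17) can be cross-checked numerically. The sketch below is an illustration under the conventions stated in the text ($U_0 = 1$, $U_1(x) = 2x$ for the Chebyshev polynomials of the second kind): it compares the determinant computed directly from the matrix (10) with the recursion (13)–(14) in exact rational arithmetic, and with the Chebyshev form (15), (17) in floating point. All function names are introduced here for illustration.

```python
import math
from fractions import Fraction


def transition_matrix(n):
    """Matrix (10): entry (a, b) is 1 iff b == a - 1 or b > a (1-based)."""
    return [[1 if (b == a - 1 or b > a) else 0 for b in range(1, n + 1)]
            for a in range(1, n + 1)]


def det(m):
    """Determinant by exact Gaussian elimination over Fractions."""
    m = [row[:] for row in m]
    size = len(m)
    d = Fraction(1)
    for c in range(size):
        piv = next((r for r in range(c, size) if m[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d                              # a row swap flips the sign
        d *= m[c][c]
        for r in range(c + 1, size):
            f = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= f * m[c][k]
    return d


def D_direct(n, lam):
    """det(T_n - lam * I) computed from the matrix itself, Eq. (12)."""
    T = transition_matrix(n)
    return det([[Fraction(T[i][j]) - (lam if i == j else 0) for j in range(n)]
                for i in range(n)])


def D_recursive(n, lam):
    """Recursion (13) with the boundary conditions (14)."""
    d_prev, d = Fraction(1), -lam               # D_0, D_1
    if n == 0:
        return d_prev
    for _ in range(n - 1):
        d_prev, d = d, -(lam + 1) * d - (lam + 1) * d_prev
    return d


def chebyshev_U(k, x):
    """Chebyshev polynomial of the second kind U_k(x), via its recurrence."""
    u_prev, u = 1.0, 2.0 * x                    # U_0, U_1
    if k == 0:
        return u_prev
    for _ in range(k - 1):
        u_prev, u = u, 2.0 * x * u - u_prev
    return u


def D_chebyshev(n, lam):
    """Eqs. (15) and (17), valid for lam > -1 (floating point)."""
    x = math.sqrt(lam + 1.0) / 2.0
    return (lam + 1.0) ** ((n - 1) / 2.0) * (-1) ** n * chebyshev_U(n + 1, x)


if __name__ == "__main__":
    for n in range(1, 6):
        lam = Fraction(1, 2)
        print(n, D_direct(n, lam), D_recursive(n, lam), D_chebyshev(n, 0.5))
```

The exact comparison covers any rational $\lambda$, while the Chebyshev form is restricted to $\lambda > -1$, in line with the condition stated before (15).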
[...] Denote by $T_n$ the family of all such subsets of the set $\{1, \dots, n\}$. In case of a colored heap the basis consists of subsets of $\{1, \dots, n\}$ painted in two colors $(+, -)$. We denote these subsets by $T_n^c$.

Remark 3. Let $w$ be the element of the group $LF_{n+1}$ and $H$ be the corresponding heap. Then $T_n(H)$ is exactly the set of removable generators.

It is convenient to characterize $T_n$ by a vector $(\varepsilon_1, \dots, \varepsilon_n)$ with elements 0 and 1, where $\{\varepsilon_r = 1\} \Leftrightarrow r \in T$.

Lemma 4. The power $\#T_n$ of the set $T_n$ is equal to the Fibonacci number $F_n$ and hence it grows as $\lambda^n$, where $\lambda = \frac{\sqrt{5} + 1}{2} \approx 1.618$ is the golden mean. The power of the set $T_n^c$ is equal to $2^n$.

Proof. The power $\#T_n$ is equal to the number of sequences of elements 0 and 1 of length $n$, such that these sequences do not have the elements 1 in succession, i.e. satisfy the recursion relation
$$F_n = F_{n-1} + F_{n-2}, \qquad F_1 = 1;\ F_2 = 2, \tag{59}$$
which defines the Fibonacci sequence. Similarly, the number of the elements of the set $T_n^c$ satisfies the recursion relation
$$F_n^c = F_{n-1}^c + 2F_{n-2}^c, \qquad F_0^c = 1;\ F_1^c = 2;\ F_2^c = 4. \tag{60}$$
Actually, if a sequence from $T_n^c$ begins with 0, the part remaining after removal of 0 is any sequence from $F_{n-1}^c$. If it begins with 1, then by definition the 2nd element is 0. Deleting these two elements (1 and the 0 following it), we get a sequence from $F_{n-2}^c$. Thus, the power of the set $T_n^c$ satisfies recursion relation (60), and consequently $F_n^c = 2^n$. $\square$

Define the time-homogeneous Markov chain, the set of states of which at any moment of time are the sets $T \in T_n$, and the transition probabilities from the state $T$ to the state $T'$ are determined by the time-independent rules. Let $T = \{\varepsilon_1, \dots, \varepsilon_n\}$; $T' = \{\varepsilon_1', \dots, \varepsilon_n'\}$. Then the transition matrix is as follows. The transition probability $T \to T'$ is nonzero and is equal to $\frac{1}{n}$ only for the cases when $\varepsilon_i = \varepsilon_i'$ for all $i$ except not more than three consecutive numbers, say $(\varepsilon_{r-1}, \varepsilon_r, \varepsilon_{r+1})$ and $(\varepsilon_{r-1}', \varepsilon_r', \varepsilon_{r+1}')$, and for these triples one of the following conditions is satisfied:
$$\begin{array}{lll}
\text{If } \varepsilon_{r-1} = \varepsilon_{r+1} = 1 & \text{then} & \varepsilon_r' = 1,\ \varepsilon_{r-1}' = \varepsilon_{r+1}' = 0; \\
\text{If } \varepsilon_{r-1} = 1,\ \varepsilon_{r+1} = 0 & \text{then} & \varepsilon_r' = 1;\ \varepsilon_{r-1}' = \varepsilon_{r+1}' = 0; \\
\text{If } \varepsilon_{r+1} = 1,\ \varepsilon_{r-1} = 0 & \text{then} & \varepsilon_r' = 1;\ \varepsilon_{r-1}' = \varepsilon_{r+1}' = 0; \\
\text{If } \varepsilon_{r-1} = \varepsilon_r = \varepsilon_{r+1} = 0 & \text{then} & \varepsilon_{r-1}' = \varepsilon_{r+1}' = 0,\ \varepsilon_r' = 1.
\end{array} \tag{61}$$
Thus, the Markov chain is determined on the set of states $T_n$. Later on we will be interested in the asymptotics of the mathematical expectation of the size of a roof. This computation was first carried out in [DN2]. We repeat in Theorem 3 the main steps of the derivation of [DN2].

2 Hereafter, unless stipulated otherwise, we shall use the term “roof” to designate both the roof and the basis of a roof.
Theorem 3. The limit of the mathematical expectation of the number of removable generators for a random walk on the semi-group $LF^+_{n+1}$ for $n \gg 1$ (i.e. the limit of the mathematical expectation of the roof of a heap) is
$$\lim_{N\to\infty} E\#T(w_N) = \frac{n}{3}. \tag{62}$$
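The statement (62) can be illustrated by a direct simulation of the roof dynamics (61). The sketch below is a hedged illustration, not the authors' computation: it assumes periodic boundary conditions (sites $0, \dots, n-1$ on a cycle, $n \ge 3$), picks a generator uniformly at each step, puts it into the roof and removes its two neighbors, and averages $\#T/n$ over the second half of the run. The function name, the seed and the burn-in choice are all illustrative.

```python
import random


def simulate_roof_density(n, n_steps, seed=0):
    """Time-averaged roof density #T / n for the semi-group dynamics (61).

    Periodic boundary conditions are an assumption of this sketch; the first
    half of the run is discarded as burn-in.
    """
    rng = random.Random(seed)
    eps = [0] * n              # eps[r] = 1 iff generator r belongs to the roof
    size = 0                   # current #T
    total = 0
    burn_in = n_steps // 2
    for step in range(n_steps):
        r = rng.randrange(n)
        if eps[r] == 0:        # the new cell at position r becomes removable
            eps[r] = 1
            size += 1
        for nb in ((r - 1) % n, (r + 1) % n):
            if eps[nb]:        # covered neighbors are no longer removable
                eps[nb] = 0
                size -= 1
        if step >= burn_in:
            total += size
    return total / ((n_steps - burn_in) * n)


if __name__ == "__main__":
    print(simulate_roof_density(30, 40000, seed=1))
```

With free rather than periodic boundaries the limiting density differs by boundary terms that vanish for large $n$, in line with the remark made in the proof below.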
Proof. Compute the mathematical expectation of the number of removable elements when we do not distinguish between generators and their inverses, i.e. for the random walk on the semi-group $LF^+_{n+1}$. Let us represent the elements of the roof $T(w)$ (i.e. the removable generators) graphically by filled boxes on the diagram as shown below:

[Diagram: a row of $n = 11$ cells with filled boxes at positions 1, 4, 8 and 10]

Here $n = 11$, $\#T = 4$. Let $T$ consist of the set $\{k_1, \dots, k_s\}$. Denote by $h_j = k_j - k_{j-1} - 1$ the lengths of the intervals between neighboring boxes or between a box and the edge of the diagram. If the edge points (1 and $n$) do not belong to $T$, then $h_1 = k_1$, $h_{s+1} = n - k_s - 1$; if one or both edge points belong to $T$, then $h_1 = k_1 - 1$, $h_{s+1} = n - k_s$. For example, if $k_1 = 1$ then $h_1 = 0$, or if $k_s = n$ then $h_s = 0$. (On the above diagram $h_1 = 0$, $h_2 = 2$, $h_3 = 3$, $h_4 = 1$.) The values $h_j$ satisfy the following relation, valid when neglecting the “boundary effects” at $\#T \gg 1$, $n \gg 1$:
$$\sum_j h_j = n - \#T. \tag{63}$$
It is not hard to establish the rules according to which the diagram is changed by a multiplication of $w$ by $g_r$ (or by $g_r^{-1}$) which increases $\#T(w)$ by 1: in the $r$-th position a point appears, while in positions $(r - 1)$ and/or $(r + 1)$ the points (which were present) disappear. Having in mind this rule, let us write the explicit expressions for the 1-step increment of a length, $\Delta T(w)$, expressing it in terms of $h_j(w)$ provided that the boundary points do not belong to $T$:
$$\begin{array}{ll}
\Delta T(w) = +1 & \text{with the probability } q_+ = \frac{1}{n} \sum\limits_{j:\, h_j \ge 3} (h_j - 2), \\[2mm]
\Delta T(w) = 0 & \text{with the probability } q_0 = \frac{1}{n} \big[ \#T + 2\#\{j : h_j \ge 2\} \big], \\[2mm]
\Delta T(w) = -1 & \text{with the probability } q_- = \frac{1}{n} \#\{j : h_j = 1\}.
\end{array} \tag{64}$$
Summing (64), we obtain the conditional mathematical expectation of the local reconstruction of a roof for the fixed element $w$:
$$\begin{aligned}
E_w \Delta T &= 1 \times q_+ + 0 \times q_0 + (-1) \times q_- \\
&= 1 \times \frac{1}{n} \sum_{j:\, h_j \ge 3} (h_j - 2) + 0 \times \Big[ \frac{1}{n}\#T + \frac{2}{n} \#\{j : h_j \ge 2\} \Big] + (-1) \times \frac{1}{n} \#\{j : h_j = 1\} \\
&= \frac{1}{n} \sum_j (h_j - 2) = \frac{1}{n} \sum_j h_j - \frac{2}{n} \#T = 1 - \frac{3}{n} \#T(w).
\end{aligned} \tag{65}$$
488
A. M. Vershik, S. Nechaev, R. Bikbov
exceed n4 . Therefore in the large n limit the expression (65) is exact. In case of periodic boundary conditions Eq. (65) is exact for any finite values of n. Since our Markov chain has a finite set of states and is ergodic, it has a unique invariant measure. The Markov chain with this invariant measure is stationary. So, the mathematical expectation E[1T (w)] over all elements w with respect to the invariant measure exists and is finite, therefore E[1T (w)] = 0. Thus, from the strong law of large numbers (or, equivalently, from the individual ergodic theorem) it follows that for the random walk on the semi-group we have Eq. (62) for the mathematical expectation of the number of removable elements (i.e. the set of elements of a roof). u t The distinction between the semi-group LF + n+1 (i.e. the heap) and the group LF n+1 (i.e. colored heap) is due to the fact that for the random walk on the group there is a 1 #T c (w). To account for that, possibility of the word’s reduction with the probability 2n + and p − to increase and to reduce the size of the roof we introduce the probabilities pw w #T c (w) of a colored heap per unity under the condition of the word’s length reduction. + and p − This mathematical expectation is a difference of conditional probabilities pw w c to change the value #T (w) per unity provided that reduction of a word occurs. This difference should be added to the mathematical expectation of the change of #T (w) in the case of a semi-group (61): Ew 1T c (w) = 1 −
− 3 c p+ − pw #T (w) + w #T c (w). n 2n
(66)
For the idempotent semi-group $LFI^+_{n+1}$ there are no possibilities for the word’s reduction and the mathematical expectation of the size of a roof is the same as for the semi-group $LF^+_{n+1}$. Thus we arrive at the following corollaries of Theorem 3:

Corollary 2. The limit of the mathematical expectation of the size of a roof for the random walk on the group $LF_{n+1}$ and idempotent semi-group $LFI^+_{n+1}$ is
$$\begin{array}{ll}
E\#T^c = \dfrac{n}{3 - \alpha} & \text{for the group } LF_{n+1}, \\[3mm]
E\#T = \dfrac{n}{3} & \text{for the idempotent semi-group } LFI^+_{n+1},
\end{array} \tag{67}$$
where $\alpha = \frac{1}{2}(p^+ - p^-)$; $p^+ = E p_w^+$ and $p^- = E p_w^-$.
One can easily see that for some configurations of heaps $w$ we could have $p_w^+ - p_w^- \ne 0$, and in these cases the mathematical expectations $E_w \Delta T$ for the group (colored heap) and for the semi-group (heap) do not coincide. However, we believe that at $N \to \infty$ (i.e. in a stationary mode) $E p_w^+ = E p_w^-$ and the following hypothesis (expressed first by J. Desbois in [DN2]) is valid:

Conjecture 1. The mathematical expectation of a roof (a set of removable elements) for the heap (the locally free semi-group $LF^+_{n+1}$) and for the colored heap (the locally free group $LF_{n+1}$) coincide at $n \gg 1$. Hence,
$$\lim_{N\to\infty} E\#T(w_N) = \frac{n}{3}. \tag{68}$$
The concept of a roof is the same for the heap (the semi-group) and for the colored heap (the group), however the dynamics in these two cases is distinct. The random walk on the locally free semi-group (group) has been reduced to a Markov dynamics of heaps (colored heaps). We have defined a new dynamics, the dynamics of the roofs, Markovian in the case of the locally free semi-group, by which the general dynamics is restored and which is convenient for computations. In the case of the group this dynamics is not Markovian anymore, but nevertheless enables us to get some nontrivial estimates. Using the subadditive ergodic theorem we can prove now the following important fact:

Lemma 5. 1. For almost all sequences of heaps, i.e. for almost all trajectories $\{w_N\}$ of the random walk on the semigroup $LF^+_{n+1}$ (or on the group $LF_{n+1}$) the limit
$$\lim_{N\to\infty} \frac{\#T(w_N)}{n} = \kappa_n$$
exists and does not depend on the trajectory.
2. From Theorem 3 we know that $\kappa_n = 1/3 + o_n(1)$ for $n$ large.

This lemma is used below in the proof of Theorems 5 and 7.
4.2. Drift as mathematical expectation of number of cells in the heap. Let us compute now the change of the length of some fixed word $w$ for a random walk on the group $LF_{n+1}$. It is obvious that for one step of the random walk the length of a word can be changed by $\pm 1$. The multiplication by a given generator, or by its inverse, occurs with the probability $\frac{1}{2n}$ and thus, the conditional mathematical expectation $E_w K$ of the change of a word’s length is determined for a fixed element $w$. Below we shall compute $E_w K$ and shall see that the answer depends only on the size of a roof, i.e. on the size $\#T^c(w)$ of the set of removable generators $T^c(w)$.

Consider a fixed element $w$ of the group $LF_{n+1}$ such that the set of removable generators of $w$ is $\{1 \le k_1 < k_2 < \dots < k_s \le n\}$. Assume that with the probability $\frac{1}{2n}$ the word $w$ is multiplied by a generator $g_r$ or $g_r^{-1}$ (for definiteness let us choose $g_r$). Denote the set of removable generators of the element $w' = w\, g_r$ as $T' \equiv T(w')$. Then the dynamics of the change of the set $T(w)$ is governed by the following possibilities (compare to the above relations (61)):

I. Provided that the word’s length is increased, i.e. $K'(w g_i) = K(w) + 1$, the dynamics of the roof is described by the relations (61) valid for the semigroup $LF^+_{n+1}$;
II. Provided that the word is reduced, i.e. $K'(w g_i) = K(w) - 1$, we have:
$$T^c \to T^{c\prime} \equiv T^{c-} \qquad \text{if } \varepsilon_r = 1, \tag{69}$$
where $T^{c-}$ is the configuration of a roof obtained by the cancellation of one of the elements of the roof $T^c$ located in position $r$. (This rule cannot be described in local terms.)

The probability of a word’s length reduction is $\frac{\#T^c}{2n}$, because for each element of a roof there is a unique possibility to be reduced if and only if at the following step the element inverse to a former one has arrived. Accordingly, the probability to increase a word’s length is $1 - \frac{\#T^c}{2n}$, which follows from the possibility mentioned above to change
a word’s length for one step by $\pm 1$. As a result, the mathematical expectation of the total change of a word’s length for one step of random walk on the group $LF_{n+1}$ is
$$E_{g_r}[K(w) - K(w\, g_r)] = -\frac{E\#T^c(w)}{2n} + \Big(1 - \frac{E\#T^c(w)}{2n}\Big) = 1 - \frac{E\#T^c(w)}{n}. \tag{70}$$
The indicated computation proves the following lemma:

Lemma 6. The conditional mathematical expectation of the word’s length $K(w)$ after $N$ steps of the random walk on the group $LF_{n+1}$ for the fixed last element $w$ is
$$E_w K = N \Big(1 - \frac{E\#T^c(w)}{n}\Big),$$
hence the drift (i.e. the mathematical expectation of a normalized word’s length) is
$$l = \lim_{N\to\infty} \frac{1}{N} E_w K = 1 - \frac{E\#T^c(w)}{n}.$$

Corollary 3. The drift of the random walk on the idempotent locally free semi-group $LFI^+_{n+1}$ is
$$l = \lim_{N\to\infty} \frac{1}{N} E_w K = 1 - \frac{E\#T(w)}{n}.$$

Proof. Despite the fact that the expressions for the drifts for the group $LF_{n+1}$ and the idempotent semi-group $LFI^+_{n+1}$ coincide, their origins are different. For the idempotent semi-group there is no cancellation of the word’s length, however the relation $g_i^2 = g_i$ provides the existence of such configurations of the heap which do not change when a new letter is added. The probability of such an event is $\frac{1}{n}$. Hence the word’s length increases by $+1$ if and only if the new added letter changes the configuration of the roof. The probability of such an event is $1 - \frac{E\#T(w)}{n}$. By the corollary of Theorem 3 we know that the sizes of roofs for the random walks on the semi-group and idempotent semi-group coincide. $\square$

Thus, for calculation of the drift it is sufficient to know the mathematical expectation $E\#T(w)$ of the roof, see Eq. (67).

Theorem 4. The mathematical expectations of the drift of a random walk on a locally free group and idempotent semi-group at $n \gg 1$ are
$$\begin{array}{ll}
l = 1 - \dfrac{E\#T^c(w)}{n} = \dfrac{2 - \alpha}{3 - \alpha} & \text{for the group } LF_{n+1}, \\[3mm]
l = 1 - \dfrac{E\#T(w)}{n} = \dfrac{2}{3} & \text{for the idempotent semi-group } LFI^+_{n+1},
\end{array} \tag{71}$$
where $\alpha$ is defined in (67).

Conjecture 2. The mathematical expectation of the drift on the locally free group at $n \gg 1$ is
$$l = \frac{2}{3}. \tag{72}$$
Conjecture 2 is a direct consequence of Conjecture 1 (J. Desbois in [DN2]) but it has not yet been proved rigorously.
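The drift on the group $LF_{n+1}$ can also be estimated by simulation, which gives a rough empirical reading of Conjecture 2. The sketch below is an illustration under stated assumptions, not the method of the paper: a reduced word is stored as a list of letters $(r, \pm 1)$, and a right multiplication by $g_r^{\pm 1}$ scans the word from the right for the first letter whose index differs from $r$ by at most 1; if that letter is the inverse of the new one, the pair cancels, otherwise the new letter is appended. The names `multiply` and `estimate_drift` are illustrative.

```python
import random


def multiply(word, r, sign):
    """Right-multiply a reduced word of LF_{n+1} by g_r^sign, in place.

    word is a list of (index, sign) pairs. Letters whose indices differ from r
    by at least 2 commute with g_r and are skipped during the scan.
    Returns the change of the word length (+1 or -1).
    """
    for pos in range(len(word) - 1, -1, -1):
        idx, sg = word[pos]
        if abs(idx - r) <= 1:              # g_r itself or a non-commuting neighbor
            if idx == r and sg == -sign:
                del word[pos]              # cancellation with the inverse letter
                return -1
            break
    word.append((r, sign))
    return +1


def estimate_drift(n, n_steps, seed=0):
    """Word length after n_steps of the uniform random walk, divided by n_steps."""
    rng = random.Random(seed)
    word = []
    for _ in range(n_steps):
        multiply(word, rng.randint(1, n), rng.choice((1, -1)))
    return len(word) / n_steps


if __name__ == "__main__":
    for n in (2, 5, 20):
        print(n, estimate_drift(n, 20000, seed=1))
```

For $n = 2$ the two generators do not commute, so the walk is the one on the free group with two generators and the estimate should be close to $1/2$. For larger $n$ the output can be compared with the conjectured value $2/3$, keeping in mind that finite $n$, free boundaries and finite run length all bias the estimate.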
5. Random Walk on Locally Free Group and Semi-Group: The Entropy
The entropy $h(G)$ of the random walk on the group $G$ with the uniform measure $\mu$ on the set of generators, according to a theorem similar to the Shannon–McMillan–Breiman one and proved in [KaiV, De] (see also [Ve2,Ve3]), can be written as follows (see Definition 7):
\[ h(w_N) = \lim_{N\to\infty} \frac{1}{N} H_\mu^N(w_N) = -\lim_{N\to\infty} \frac{1}{N} \log \mu^N(w_N), \tag{73} \]
for almost all elements $w_N$ of the group; $N$ is the number of steps of the random walk; $\mu^N$ is the $N$-fold convolution of the measure $\mu$. In turn, the measure $\mu^N$ itself can be defined in the following way:
\[ \mu^N(g) = \begin{cases} \dfrac{\#L_N(g)}{(2n)^N} & \text{for the group}, \\[2mm] \dfrac{\#L_N^+(g)}{n^N} & \text{for the semi-group}. \end{cases} \tag{74} \]
By $L_N(g)$ and $L_N^+(g)$ we denote the sets of different dynamical representations of the word $g$ of record length $N$ in the alphabets $\{g_1, \dots, g_n, g_1^{-1}, \dots, g_n^{-1}\}$ (for the group) and $\{g_1, \dots, g_n\}$ (for the semi-group), respectively. Hence, $\#L_N(g)$ and $\#L_N^+(g)$ are the numbers of distinct dynamical representations of the element $g$ by words of record length $N$ in a given framing. The values $\#L_N(g)$ ($\#L_N^+(g)$) can be viewed as the number of different paths on the Cayley graph of the group (semi-group) leading from the root point of the graph. Note that in the case of the group the element $g = w_N$ can have a length $K(g)$ shorter than the record length $N$. This question has been considered in detail in the previous section.
As was found in the previous section during the study of the drift, the dynamics of the increments of words (i.e. the dynamics of the heap $H$) for random uniform addition of cells is uniquely determined by the dynamics of the roof $T$ of the heap $H$. Moreover, we have found (see Eq. (62)) that in the limit $N \to \infty$ and at $n \gg 1$ the mathematical expectation $E\#T$ of the size of a roof, normalized by $n$, is 1/3. Let us prove the following lemma:
Lemma 7. The fluctuations of the mathematical expectation of the roof for $n \gg 1$ have the asymptotic behavior
\[ \frac{E\#T^2 - (E\#T)^2}{(E\#T)^2} \le \frac{\mathrm{const}}{n}, \]
where we have denoted $E\#T = \lim_{N\to\infty} E\#T(w_N)$.
Proof. Rewrite (62) in the form
\[ (E\#T)^2 = \Bigl(\lim_{N\to\infty} E\#T(w_N)\Bigr)^2 = \frac{n^2}{9}. \tag{75} \]
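Using Eqs. (64)–(65) for the probabilities of local rearrangements of the roof we get the mathematical expectation of the fluctuations of a roof:
\[ E\Delta(\#T^2) = E\bigl[(\#T')^2 - (\#T)^2\bigr] = q_+\bigl[(\#T+1)^2 - (\#T)^2\bigr] + 0 \times q_0 + q_-\bigl[(\#T-1)^2 - (\#T)^2\bigr] = 2(q_+ - q_-)\#T + (q_+ + q_-), \tag{76} \]
where $q_+$, $q_0$ and $q_-$ are the probabilities that one step changes $\#T$ by $+1$, $0$ and $-1$, respectively:
\[ q_+ = \frac{1}{n} \sum_{j:\, h_j \ge 3} (h_j - 2); \qquad q_0 = \frac{1}{n}\bigl(\#T + 2\,\#\{j : h_j \ge 2\}\bigr); \qquad q_- = \frac{1}{n}\,\#\{j : h_j = 1\}. \]
Taking into account that $q_+ - q_- = 1 - \frac{3}{n}\#T$, we obtain from (76):
\[ E\Delta(\#T^2) = E\left[2\left(1 - \frac{3}{n}\#T\right)\#T + (q_+ + q_-)\right]. \tag{77} \]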
For the invariant initial distribution we should set $E\Delta(\#T^2) = 0$; therefore the mathematical expectation of the square of the size of a roof can be obtained from the following relation:
\[ E\Delta(\#T^2) = 2E\#T - \frac{6}{n} E\#T^2 + E(q_+ + q_-) = 0, \]
hence we get
\[ E\#T^2 = \frac{n}{3} E\#T + \frac{n}{6} E(q_+ + q_-). \]
Estimating the mathematical expectation from above as $E(q_+ + q_-) < \mathrm{const}$, we arrive at the equation
\[ E\#T^2 = \frac{n^2}{9} + \frac{\mathrm{const}\; n}{6} = \frac{n^2}{9} + o(n^2). \]
Comparing the last expression with (75), we get the statement of Lemma 7. □
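The step $q_+ - q_- = 1 - \frac{3}{n}\#T$ used in the proof can be checked by direct counting. The following minimal sketch rests on assumptions that are not spelled out in this section: the columns are arranged on a cycle (periodic boundary), a roof is a set of columns with no two adjacent, $h_j$ is the number of non-roof columns between consecutive roof cells, and a new cell dropped on a uniformly chosen column removes the roof cells on that column and its two neighbours while itself joining the roof. The function names are hypothetical.

```python
import random

def random_roof(n, rng):
    """Random non-empty set of columns on a cycle with no two adjacent roof cells."""
    roof = [0] * n
    order = rng.sample(range(n), n)          # visit columns in random order
    roof[order[0]] = 1                       # guarantee a non-empty roof
    for i in order[1:]:
        if rng.random() < 0.5 and not roof[i - 1] and not roof[(i + 1) % n]:
            roof[i] = 1
    return roof

def gap_lengths(roof):
    """Numbers h_j of non-roof columns between cyclically consecutive roof cells."""
    n = len(roof)
    pos = [i for i in range(n) if roof[i]]
    return [(pos[(j + 1) % len(pos)] - pos[j] - 1) % n for j in range(len(pos))]

def move_counts_from_gaps(roof):
    """n*q_plus, n*q_zero, n*q_minus computed from the gap lengths h_j."""
    h = gap_lengths(roof)
    plus = sum(x - 2 for x in h if x >= 3)
    zero = sum(roof) + 2 * sum(1 for x in h if x >= 2)
    minus = sum(1 for x in h if x == 1)
    return plus, zero, minus

def move_counts_direct(roof):
    """Same three counts, obtained by classifying every column separately."""
    n = len(roof)
    plus = zero = minus = 0
    for i in range(n):
        neighbours = roof[i - 1] + roof[(i + 1) % n]
        if roof[i] or neighbours == 1:
            zero += 1        # one roof cell is replaced by the new one
        elif neighbours == 2:
            minus += 1       # two roof cells are covered by a single new one
        else:
            plus += 1        # the new cell covers no roof cell
    return plus, zero, minus
```

Under these assumptions both counting schemes agree, the three counts add up to $n$, and `plus - minus` equals `n - 3 * sum(roof)`, which is the identity used above multiplied by $n$.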
It is convenient to split the problem of computation of the entropy of random walks on the locally free group and semi-group into two parts and to begin with the case of the semi-group $LF_n^+$, for which the computations seem to be more transparent.
5.1. Entropy of random walk on semi-group $LF^+_{n+1}$.
Theorem 5. The entropy of the random walk on the locally free semigroup $LF^+_{n+1}$ for $n \gg 1$ is
\[ h = \log 3 + o(1). \tag{78} \]
Proof. We have
\[ \#L_N(g) = \sum_{g_i \in T} \#L_{N-1}(g \setminus g_i), \tag{79} \]
where the sum is taken over all elements $g_i$ from the roof $T$. Let us write $g \succ g^{(1)}$ if the heap of $g^{(1)}$ is the result of the removal of one cell from the roof of the heap $g$. Using the definition (74) for the semigroup $LF^+_{n+1}$ we have
\[ \frac{\#L(g)}{n^N} = n^{-1} \sum_{g^{(1)}:\, g \succ g^{(1)}} \frac{\#L(g^{(1)})}{n^{N-1}}, \tag{80} \]
which, consequently, can be rewritten in the following way:
\[ \mu^N(g) = n^{-1} \sum_{g^{(1)}:\, g \succ g^{(1)}} \mu^{N-1}(g^{(1)}). \tag{81} \]
Both sums in (80) and (81) run over all $g^{(1)}$ for which $g \succ g^{(1)}$, and the number of cells in $g$ is indicated by the exponent in $\mu$. Let us rewrite (81) equivalently as
\[ \mu^N(g) = \frac{\#T(g)}{n} \times \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \mu^{N-1}(g^{(1)}). \tag{82} \]
The second factor on the right side of (82) is the average of the measures of heaps obtained by exclusion of one cell from the initial heap; we denote this term as
\[ A(g) = \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \mu^{N-1}(g^{(1)}). \tag{83} \]
Taking the logarithm of (82) divided by $N$ and using (83) we get
\[ N^{-1} \log \mu^N(g) = N^{-1} \log \frac{\#T(g)}{n} + N^{-1} \log A(g). \tag{84} \]
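This is true for all $N$ and all $g$ with $N$ cells. Now we iterate the second term on the right side:
\begin{align}
A(g) &= \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \mu^{N-1}(g^{(1)}) \notag \\
&= \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \frac{1}{\#T(g^{(1)})}\, \#T(g^{(1)})\, \mu^{N-1}(g^{(1)}) \notag \\
&= \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} n^{-1} \sum_{g^{(2)}:\, g^{(1)} \succ g^{(2)}} \mu^{N-2}(g^{(2)}) \tag{85} \\
&= \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \frac{\#T(g^{(1)})}{n}\, \frac{1}{\#T(g^{(1)})} \sum_{g^{(2)}:\, g^{(1)} \succ g^{(2)}} \mu^{N-2}(g^{(2)}). \notag
\end{align}
Using Lemma 5 we expect that $\frac{\#T(w_1)}{n}$ for $N$ large is close to $\kappa_n$ up to $\varepsilon$. Thus,
\[ A(g) = \kappa_n \times \frac{1}{\#T(g)} \sum_{g^{(1)}:\, g \succ g^{(1)}} \frac{1}{\#T(g^{(1)})} \sum_{g^{(2)}:\, g^{(1)} \succ g^{(2)}} \mu^{N-2}(g^{(2)}) + \varepsilon. \tag{86} \]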
We can iterate Eq. (86) assuming that $g^{(1)}, g^{(2)}, \dots, g^{(k)}$ run over all sequences of heaps $g \succ g^{(1)} \succ g^{(2)} \succ \cdots \succ g^{(k)}$ and that for all of those heaps
\[ \frac{\#T(w_i)}{n} \tag{87} \]
is $\varepsilon$-close to $\kappa_n$. After $m$ iterations we obtain
\[ A(g) = (\kappa_n)^{m-1} \times \frac{1}{\#T(g)} \times \sum_{g^{(1)}, \dots, g^{(m)}} C(g^{(m)})\, \mu^{N-m}(g^{(m)}), \tag{88} \]
where the coefficients $C(g^{(m)})$ are positive with sum equal to 1, being average values of the convolution of the measures $\mu^{N-m}(g^{(m)})$. Coming back to equality (84) and iterating it $m = N - k$ times, we get
\begin{align}
-N^{-1} \log \mu^N(g) &= N^{-1} \times \log \kappa_n + N^{-1} \log A(g) \notag \\
&\;\;\vdots \tag{89} \\
&= \frac{N-k}{N} \times \log \kappa_n + N^{-1} \log \sum_{g^{(1)}, \dots, g^{(k)}} C(g^{(N-k)})\, \mu^{k}(g^{(N-k)}) + \varepsilon. \notag
\end{align}
Let us now make a shift $g \to g^{N+k}$, $g \succ g^{(1)} \succ \cdots \succ g^{(k)}$, so that $g^{(j)}$ becomes a heap with $N + k - j$ cells; in particular $g^{(N)}$ now has $k$ cells. With this shift we have
\[ -\frac{1}{N+k} \log \mu^{N+k}(g^{N+k}) = \frac{N}{N+k} \times \log \kappa_n + \frac{1}{N+k} \log \sum_{g^{N+k} \succ g^{(1)} \succ \cdots \succ g^{(k)}} C(g^{(k)})\, \mu^{N}(g^{(k)}) + \varepsilon. \tag{90} \]
Now we fix a sufficiently large $k$ and let $N \to \infty$. The convergence of $\lim_{N\to\infty} N^{-1} \log \mu(w_N)$ for almost all sequences of heaps, i.e. for almost all trajectories $\{w_N\}$ of the random walk on the semigroup $LF^+_{n+1}$ (or the group $LF_{n+1}$), follows now from the theorem of Shannon–McMillan–Breiman type mentioned at the beginning of Sect. 5. Hence,
\[ \lim_{N\to\infty} N^{-1} \log \mu^N(w_N) = h. \tag{91} \]
The limit in (91) exists in $L^2$ in the space of trajectories. So, the left-hand side of (90) tends to the entropy $h$ for almost all sequences when $N \to \infty$. The second summand on the right-hand side tends to 0 because the logarithm
of the sum is bounded by the average of the measures $\mu^N(g^{(k)})$. Thus, for $N \to \infty$ we have
\[ \lim_{N\to\infty} \frac{1}{N+k} \log \mu(g^{(N+k)}) = \lim_{N\to\infty} \frac{N}{N+k} \log \kappa_n = \log \kappa_n = \log 3 + o_n(1) \tag{92} \]
and
\[ h = \log 3 + o_n(1). \tag{93} \]
□
Theorem 6. For the random walk on the locally free semi-group $LF^+_{n+1}$ the logarithmic volume $v$, the drift $l$ and the entropy $h$ satisfy at $n \gg 1$ the strict inequality $v\,l > h$, where $v \equiv v(LF_n^+)$; $l \equiv l(LF_n^+)$; $h \equiv h(LF_n^+)$.
Proof.
(i) By Theorem 2 we have $v(LF_n^+) \to \log 4$.
(ii) The drift of the random walk on $LF_n^+$ is strictly equal to 1, i.e. $l = 1$.
(iii) By Theorem 5 we have $h(LF_n^+) \to \log 3$.
Comparing the values of $v$, $l$ and $h$, we get the strict inequality $v\,l = v > h$ for the random walk on $LF^+_{n+1}$. □
5.2. Entropy of random walk on groups $LF_{n+1}$ and $LFI^+_{n+1}$.
Theorem 7. The entropy $h$ of the random walk on the group $LF_{n+1}$ and the idempotent semi-group $LFI^+_{n+1}$ at $n \gg 1$ is
\[ h = \begin{cases} \log(3 - \alpha) + o(1) & \text{for the group } LF_{n+1}, \\ \log \frac{3}{2} + o(1) & \text{for the idempotent semi-group } LFI^+_{n+1}, \end{cases} \tag{94} \]
where $\alpha = \frac{1}{2}(p_+ - p_-)$ (see (67)).
Proof. In the case of $LF_{n+1}$ and $LFI^+_{n+1}$ the element $g$ can be reached at the $N$th step of the random walk not only by adding a new cell to the previous roof (as it was for the semi-group) but also by cancelling some already existing cell of the roof. This behavior is manifested in the following modifications of the recursion relation (79):
\[ \#L_N(g) = \sum_{g_i \in T} \#L_{N-1}(g \setminus g_i) + \begin{cases} \displaystyle\sum_{g_i \in T} \#L_{N-1}(g \cup g_i) & \text{for } LF_{n+1}, \\[3mm] \displaystyle\sum_{g_i \in T} \#L_{N-1}(g) & \text{for } LFI^+_{n+1}. \end{cases} \tag{95} \]
Following the outline of the proof of Theorem 5 and using Lemma 7 we compute $h$ regarding the dynamics of the long ($n \gg 1$) roof in the stationary regime for $N \gg 1$. It means that we replace the time-ordered product of $j$ roofs $T_j$ by the $j$th power of an averaged roof $T$. For the group $LF_{n+1}$ the exact value of $E\#T^c$ is unknown, since $E\#T^c$ depends on the unknown parameter $\alpha = \frac{1}{2}(E p_w^+ - E p_w^-)$. Let us recall that $p_w^+$ and $p_w^-$ are the probabilities of the change of the value $\#T^c(w)$ by ±1 provided the reduction of a word (see Eq. (67)). Nevertheless, we can follow directly the outline of the proof of Theorem 5 with a single replacement $\xi_j \to \xi_j - \alpha$. In the stationary regime for $LF_{n+1}$ and $LFI^+_{n+1}$ both terms in (95) amount to the same contribution, which results in the following expression for the entropy $h$:
\[ h = \begin{cases} -\displaystyle\lim_{N\to\infty} \frac{1}{N} \sum_{j=1}^{N} \log \frac{2\#T_j}{2n} = -\log \frac{1}{3-\alpha} + o(1) & \text{for } LF_{n+1}, \\[3mm] -\displaystyle\lim_{N\to\infty} \frac{1}{N} \sum_{j=1}^{N} \log \frac{2\#T_j}{n} = -\log \frac{2}{3} + o(1) & \text{for } LFI^+_{n+1}. \end{cases} \tag{96} \]
Thus Eq. (94) is proved. □
Now we are in a position to prove the following main theorem.
Theorem 8. For the random walk on the locally free group $LF_n$, the logarithmic volume $v$, the drift $l$ and the entropy $h$ satisfy at $n \gg 1$ the strict inequality $v\,l > h$, where $v \equiv v(LF_n)$; $l \equiv l(LF_n)$; $h \equiv h(LF_n)$.
Proof. For the group $LF_n$, as well as in the case of the semi-group $LF_n^+$, the entropy $h$ and the drift $l$ of the random walk are determined by the mathematical expectation of the size of a roof $E\#T$. In the case of the group, however, the numerical value of the mathematical expectation of a colored heap’s roof depends on the value $\alpha$. Since our purpose is to prove that for a locally free group in the limit of an infinite number of generators the strict inequality $l\,v > h$ holds, it is sufficient to estimate appropriately the interval in which $\alpha$ varies. For the proof we shall use again the statements of Theorems 1, 4 and 7.
(i) By Theorem 1:
\[ v(LF_n) \to \log 7. \tag{97} \]
(ii) By Theorem 4:
\[ l(LF_n) \to \frac{2-\alpha}{3-\alpha}. \tag{98} \]
(iii) By Theorem 7:
\[ h(LF_n) \to \log(3 - \alpha). \tag{99} \]
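By definition $\alpha = \frac{1}{2}(p_+ - p_-)$. Because $p_+ + p_- = 1$, the estimate $|\alpha| < \frac{1}{2}$ is valid. Thus, by (98) and (99), the values of the drift and the entropy lie within the intervals $\frac{3}{5} < l < \frac{5}{7}$ and $\log \frac{5}{2} < h < \log \frac{7}{2}$. Let us show that $l\,v - h > 0$ for all values of $\alpha$ from the interval $|\alpha| < \frac{1}{2}$. Consider the function
\[ \varepsilon(\alpha) = \frac{2-\alpha}{3-\alpha} \log 7 - \log(3 - \alpha). \]
Computing the derivative $\frac{d\varepsilon(\alpha)}{d\alpha}$, one can easily verify that on the interval $-\frac{1}{2} < \alpha < \frac{1}{2}$ the function $\varepsilon(\alpha)$ is strictly positive, hence $l\,v - h > 0$. The theorem is proved. □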
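The positivity of $\varepsilon(\alpha)$ is easy to confirm numerically. The sketch below simply tabulates the function defined in the proof on a grid of the closed interval $[-\frac{1}{2}, \frac{1}{2}]$; natural logarithms are an assumption here, which is harmless because a change of the base of the logarithm only rescales $\varepsilon$ by a positive constant. Differentiating by hand gives $\varepsilon'(\alpha) = (3 - \alpha - \log 7)/(3-\alpha)^2$, so one expects $\varepsilon$ to be increasing with its smallest value (about 0.137) at $\alpha = -\frac{1}{2}$.

```python
import math

def epsilon(alpha):
    """epsilon(alpha) = l*v - h with l = (2-a)/(3-a), v = log 7, h = log(3-a)."""
    drift = (2 - alpha) / (3 - alpha)
    return drift * math.log(7) - math.log(3 - alpha)

# Tabulate epsilon on a uniform grid of the closed interval [-1/2, 1/2].
grid = [-0.5 + k / 1000 for k in range(1001)]
values = [epsilon(a) for a in grid]
min_eps = min(values)
is_increasing = all(x < y for x, y in zip(values, values[1:]))
```

On this grid `min_eps` is attained at the left endpoint and `is_increasing` holds, in agreement with the sign of the derivative.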
Theorem 9. For the random walk on the locally free idempotent group $LFI_n$ for $n \gg 1$ the strict inequality $v\,l > h$ holds, where $v \equiv v(LFI_n)$; $l \equiv l(LFI_n)$; $h \equiv h(LFI_n)$.
Proof. By Theorems 1, 2, 4 and 7 we have at $n \to \infty$:
\[ v(LFI_n) \to \log 3; \qquad l(LFI_n) \to \frac{2}{3}; \qquad h(LFI_n) \to \log \frac{3}{2}, \tag{100} \]
which gives $v\,l - h = \frac{2}{3} \log 3 - \log \frac{3}{2} > 0$. □
6. Conclusion
Let us extend the results obtained above for locally free groups and semi-groups to the case of braid groups and semi-groups.
6.1. Bounds for logarithmic volume and drift of random walk on braid group. As we have already pointed out in Lemma 1, the braid group $B_n$ is a factor-group of the locally free group $LF_n$ and simultaneously $LF_n$ is a subgroup of $B_n$. The same relations are valid for the semi-group of positive braids $B_n^+$ and the locally free semi-group $LF_n^+$.
Theorem 10. The logarithmic volumes $v(B_n)$ and $v(B_n^+)$ for $n \gg 1$ satisfy the bilateral estimates
\[ \tfrac{1}{2} \log 7 < v(B_n) \le \log 7, \qquad \log 2 < v(B_n^+) \le \log 4. \tag{101} \]
Proof. The proof is based on Lemma 1 and its corollary. The upper bounds in (101) are a direct consequence of the fact that $B_n$ ($B_n^+$) is a factor-group of $LF_n$ ($LF_n^+$). Thus, $v_{B_n} \le v_{LF_n} \equiv \log 7$.
In order to get the lower bounds let us point out that the embedding $\rho_n$ of $LF_n$ in $B_n$ and of $LF_n^+$ in $B_n^+$ is realized by the isomorphism $f_i \leftrightarrow \sigma_i^2$. Thus, in the case of the group we have
\[ V(\rho_n : LF_n, K) \subset V(B_n, 2K) \]
and
\[ \frac{\log V(\rho_n : LF_n, K)}{K} \le \frac{\log V(B_n, 2K)}{K}, \]
hence
\[ v_{LF_n} \equiv \log 7 \le 2 v_{B_n}, \]
therefore
\[ \tfrac{1}{2} v_{LF_n} \le v_{B_n}. \]
The case of the semi-group $LF_n^+$ can be treated along the same lines. □
Apparently, the upper estimate in Eq. (101) is closer to the true value than the lower one.
Theorem 11. The drift $l(B_n)$ on the braid group $B_n$ at $n \gg 1$ satisfies the inequality
\[ \frac{2-\alpha}{2(3-\alpha)} < l(B_n) \le \frac{2-\alpha}{3-\alpha}. \tag{102} \]
Proof. The bilateral estimate (102) is again a direct consequence of Lemma 1, which shows that the braid group is a factor-group of the locally free group and that in turn the locally free group is a subgroup of the braid group. The value $\alpha$ has been defined above and varies in the interval $-\frac{1}{2} < \alpha < \frac{1}{2}$. □
For the entropy of the random walk on the braid group the corresponding bilateral estimates have not yet been obtained, but nevertheless we can assert that the fundamental inequality is strict. We shall return to this question in a forthcoming publication. The fundamental inequality $v\,l > h$ for locally free and braid groups has a deep connection to the multifractal structure of the harmonic measure on the Poisson boundary of the corresponding groups.
6.2. Physical interpretation of results. Let us give a physical interpretation of the strict inequality $l\,v > h$ for the locally free group and for the ballistic deposition process (see Fig. 5). The relations (73)–(74) permit one to estimate the probability of various dynamical representations of typical elements $g$ by words of length $N \gg 1$ with respect to the uniform measure $\mu$:
\[ \lim_{n\to\infty} \frac{\#L_N(g)}{(2n)^N} \approx 2^{N h}, \]
which for the locally free group gives, with exponential accuracy (see Eq. (94)),
\[ \lim_{n\to\infty} \frac{\#L(g)}{(2n)^N} \approx (3 - \alpha)^N. \]
The value $\frac{\#L(g)}{(2n)^N}$ is the measure of a set of trajectories on the Cayley graph of the locally free group visited by the $N$-step random walk.
Fig. 5. Typical configuration obtained in numerical simulations of the uniform heap’s growth
On the other hand, the expression
\[ \frac{\#V(g)}{(2n)^N} \approx 2^{N l v} \]
gives the exponential estimate for the probability of finding the element $g$ during the random walk on the group without any reference to dynamics. For the locally free group the value $\#V(g)$ can be written as follows:
\[ \frac{\#V(g)}{(2n)^N} \approx 7^{\frac{2-\alpha}{3-\alpha} N}, \]
where $|\alpha| < \frac{1}{2}$. In other words, $\frac{\#V(g)}{(2n)^N}$ is the measure of the set of all different states of the Cayley graph of the locally free group located at the distance of the typical drift $L = \frac{2-\alpha}{3-\alpha} N$ of the $N$-step walk from the root point of the graph. The inequality
\[ \frac{\#L(g)}{(2n)^N} \ll \frac{\#V(g)}{(2n)^N} \tag{103} \]
means that the measure of the set of typical trajectories covered dynamically by the $N$-step random walk on $LF_n$ is an exponentially small fraction of the set of statistically available trajectories of the same length. The inequality similar to (103) in the case of the locally free semi-group reads
\[ \frac{\#L^+(g)}{n^N} \ll \frac{\#V^+(g)}{n^N}, \tag{104} \]
where $\#V^+(g) \approx 4^N$ is the volume of the locally free semi-group $LF_n^+$ for $n \gg 1$ and $\#L^+(g) \approx 3^N$ is the entropy of the random walk on $LF_n^+$ in the same limit.
The dynamically induced probabilistic measure on the group (semi-group), i.e. the representation of words by the random walks on a group (semi-group), differs essentially from the measure that is uniform on the words. This difference is manifested in the exponential divergence of the two quantities $\#V(g)$ and $\#L(g)$, see Eq. (103) (the same is valid for the semi-group and is described by Eq. (104)).
The origin of this exponential difference between the number of (i) dynamical representations of some typical group element $g$ by $N$-step random walks and (ii) all different representations of the same element $g$ by $N$-step trajectories in the locally free group lies in the locking mechanism of the dynamical construction of words. Let us explain this mechanism with a simple example. Suppose that after some steps of the (right-hand) random walk we have arrived at some word, say $w_N = g_1 g_4 g_3 g_2$ (written in the normal form). By randomly adding a letter from the right-hand side we can never create a word such as $w_{N+1} = g_1 g_2 g_4 g_3 g_2$. However, the word $w_{N+1}$ can be created if we allow insertion of a new letter anywhere in the given word $w_N$ (and not only at the right-most end of $w_N$). Hence for the random walk many configurations which are statistically available are locked dynamically by the condition that we add the new letters at the right-most end of the current word. This locking is permanent for the locally free semi-group $LF_n^+$ and temporary for the locally free group $LF_n$. In the latter case we can release the locked structure by consecutively adding opposite generators which cancel the roof step by step; however, the probability of such an event is exponentially small (a detailed explanation of this effect is given in [Ve2,Ve3]).
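The locking example can be reproduced with a few lines of code. The sketch below assumes the standard commutation rule of the locally free semi-group, $g_i g_j = g_j g_i$ for $|i - j| \ge 2$, and decides equality of two words by the projection criterion for partially commutative monoids: two words represent the same element exactly when their projections onto every pair of non-commuting letters (and onto every single letter) coincide. The function names are hypothetical and the generators are encoded by their indices.

```python
def commute(a, b):
    """Assumed relation of the locally free semi-group: g_a g_b = g_b g_a iff |a - b| >= 2."""
    return abs(a - b) >= 2

def projection(word, a, b):
    """Subword of `word` formed by the letters a and b only."""
    return [x for x in word if x == a or x == b]

def same_element(u, v, n):
    """Equality in the semi-group on g_1..g_n via projections onto non-commuting pairs."""
    for a in range(1, n + 1):
        for b in range(a, n + 1):      # b == a compares the number of letters g_a
            if not commute(a, b) and projection(u, a, b) != projection(v, a, b):
                return False
    return True

W_N = [1, 4, 3, 2]          # w_N     = g1 g4 g3 g2
TARGET = [1, 2, 4, 3, 2]    # w_{N+1} = g1 g2 g4 g3 g2

# Words obtained by appending one generator at the right-most end of w_N.
reachable_by_append = any(same_element(W_N + [k], TARGET, 4) for k in range(1, 5))

# Words obtained by inserting the generator g2 at an arbitrary position of w_N.
reachable_by_insert = any(
    same_element(W_N[:p] + [2] + W_N[p:], TARGET, 4) for p in range(len(W_N) + 1)
)
```

With these definitions `reachable_by_append` is false while `reachable_by_insert` is true: the target word is statistically available but cannot be produced by one right-hand step from $w_N$.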
The inequality (104) seems to be the origin of the fact that in numerical simulations of a random heap’s growth (Fig. 5) a strong divergence is observed between the normalized mathematical expectation (averaged density) of the roof, $\rho_{\rm roof} = \frac{E\#T}{n}$ (where $\rho_{\rm roof} = \frac{1}{3}$), and the mathematical expectation (averaged density) of the whole heap, $\rho_{\rm heap} = \frac{N}{nH}$ (where $H$ is the maximal height of the heap). The value of $H$ obtained in various computer simulations is evaluated as $H \approx 4.05\, \frac{N}{n}$ (for references see [HZh]), which corresponds to the density $\rho_{\rm heap} \approx 0.247$. The same value of the density is observed on average for any flat horizontal section of a heap. The essential numerical distinction between $\rho_{\rm roof}$ and $\rho_{\rm heap}$ means that the roof of a heap has a nontrivial fractal structure lying in a strip of nonzero width. In Fig. 5 we have shown by black points a few current configurations of the roofs in the course of the heap’s growth. As can be seen, the configurations of the roof are far from flat and apparently exhibit nontrivial fractal behavior, which would be interesting to compare with the continuous models of surface growth described by the Kardar–Parisi–Zhang (KPZ) theory (for a review see [HZh]).
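The two densities can be estimated with a small simulation. This is only a sketch under assumptions that are not fixed by the text: the $n$ columns are placed on a cycle, a cell dropped on a uniformly chosen column $i$ lands one level above the highest of the columns $i-1$, $i$, $i+1$, and the roof consists of the columns that are strictly higher than both neighbours. The function name and the parameters are hypothetical.

```python
import random

def grow_heap(n, steps, seed=0):
    """Uniform heap growth on a cycle of n columns; returns (rho_roof, rho_heap)."""
    rng = random.Random(seed)
    height = [0] * n
    roof_samples = []
    for t in range(1, steps + 1):
        i = rng.randrange(n)
        # The new cell cannot pass cells lying in columns i-1, i, i+1.
        height[i] = max(height[i - 1], height[i], height[(i + 1) % n]) + 1
        # Sample the roof density once per n depositions in the second half of the run.
        if t > steps // 2 and t % n == 0:
            roof = sum(
                1 for j in range(n)
                if height[j] > height[j - 1] and height[j] > height[(j + 1) % n]
            )
            roof_samples.append(roof / n)
    rho_roof = sum(roof_samples) / len(roof_samples)
    rho_heap = steps / (n * max(height))     # N / (n H) with H the maximal height
    return rho_roof, rho_heap
```

For example, `grow_heap(300, 60000)` deposits 200 cells per column on average; under the assumptions above the first returned value should stay close to 1/3, and the second one should be noticeably smaller, in line with the densities quoted in the text.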
Acknowledgements. We would like to thank S. Fomin for pointing out to us the connection between heaps and partially commutative monoids, and G. X. Viennot, B. Derrida, A. Comtet and J. Lebowitz for fruitful discussions and comments; S.N. highly appreciates deep suggestions made by J. Desbois (see [DN2]). The authors are grateful to the RFBR grants 99–01–17931 and RFBR 00–15–96060 for partial support.
References
[Av] Avec, A.: C.R. Acad. Sci. Paris 275 (A), 1363 (1972)
[Bi1] Birman, J.: Knots, Links and Mapping Class Groups. Ann. Math. Studies, 82, Princeton: Princeton Univ. Press, 1976
[Bi2] Birman, J., Ko, K.H., Lee, J.S.: Adv. Math. 139, 322 (1998)
[CF] Cartier, P., Foata, D.: Lect. Not. Math. 85, New York–Berlin: Springer, 1969
[CN] Comtet, A., Nechaev, S.: J. Phys. (A): Math. Gen. 31, 5609 (1998)
[Co] Collins, P.J.: Invent. Math. 117, 525 (1994)
[De] Derennic, Y.: Astérisque 74, 183 (1980)
[DN1] Desbois, J., Nechaev, S.: J. Stat. Phys. 88, 201 (1997)
[DN2] Desbois, J., Nechaev, S.: J. Phys. (A): Math. Gen. 31, 2767 (1998)
[FK+] Frank-Kamenetskii, M.D., Vologodskii, A.V.: Sov. Phys. Uspekhi 134, 641 (1981); Vologodskii, A.V., Lukashin, A.V., Frank-Kamenetskii, M.D., Anshelevich, V.V.: Zh. Exp. Teor. Fiz. (JETP) 66, 2153 (1974); Vologodskii, A.V., Lukashin, A.V., Frank-Kamenetskii, M.D.: Zh. Exp. Teor. Fiz. (JETP) 67, 1875 (1974); Frank-Kamenetskii, M.D., Vologodskii, A.V., Lukashin, A.V.: Nature (London) 258, 398 (1975)
[GN] Grosberg, A.Yu., Nechaev, S.K.: J. Phys. (A): Math. Gen. 25, 4659 (1992); Grosberg, A.Yu., Nechaev, S.K.: Europhys. Lett. 20, 613 (1992)
[Gr] Gromov, M.: Hyperbolic Groups. In: Essays in Group Theory 8, 75, MSRI Publishing: Springer, 1987
[HZh] Halpin-Healy, T., Zhang, Y.C.: Phys. Rep. 254, 215 (1995)
[Hu] Humfreyies, S.: Journ. of Algebra 169, 847 (1994)
[HNDV] Hakim, V., Nadal, J.P.: J. Phys. A 18, L-213 (1983); Nadal, J.P., Derrida, B., Vannimenus, J.: J. de Physique 43, 1561 (1982)
[Jo] Jones, V.F.R.: Bull. Am. Math. Soc. 12, 103 (1985); Jones, V.F.R.: Pacific J. Math. 137, 311 (1989)
[Kai] Kaimanovich, V.: Ergodic Theory and Dyn. Syst. 18, 631 (1998)
[KaiV] Kaimanovich, V., Vershik, A.M.: Ann. Prob. 11, 457 (1983)
[LGP] Lifshitz, I.M., Gredeskul, A., Pastur, L.A.: Introduction to the theory of disordered systems. Moscow: Nauka, 1982
[Ne] Nechaev, S.K.: Statistics of Knots and Entangled Random Walks. (WSPC: Springer, 1996); Nechaev, S.K.: Sov. Phys. Uspekhi 168, 369 (1998)
[NGV] Nechaev, S., Grosberg, A., Vershik, A.: J. Phys. (A): Math. Gen. 29, 2411 (1996)
[Sa] Sanov, I.: Dokl. Ac. Sci. USSR 57, 657 (1940)
[Ve1] Vershik, A.M.: In: Topics in Algebra 26, pt.2, 467 (1990) (Banach Center Publication, Warszawa); Vershik, A.M.: Proc. Am. Math. Soc. 148, 1 (1991)
[Ve2] Vershik, A.M.: Zapiski Sem. POMI 256 (1999)
[Ve3] Vershik, A.M.: Russ. Math. Surv. (2000), No.4 (to appear)
[Vi] Viennot, G.X.: Ann. N. Y. Ac. Sci. 576, 542 (1989)
Communicated by A. Jaffe
Commun. Math. Phys. 212, 503 – 533 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Instantons, Monopoles and Toric HyperKähler Manifolds
Thomas C. Kraan
Instituut-Lorentz for Theoretical Physics, University of Leiden, PO Box 9506, 2300 RA Leiden, The Netherlands. E-mail: [email protected]
Received: 20 November 1998 / Accepted: 11 October 1999
Abstract: In this paper, the metric on the moduli space of the $k = 1$ $SU(n)$ periodic instanton (or caloron) with arbitrary gauge holonomy at spatial infinity is explicitly constructed. The metric is toric hyperKähler and of the form conjectured by Lee and Yi. The torus coordinates describe the residual $U(1)^{n-1}$ gauge invariance and the temporal position of the caloron and can also be viewed as the phases of $n$ monopoles that constitute the caloron. The $(1, 1, \dots, 1)$ monopole is obtained as a limit of the caloron. The calculation is performed on the space of Nahm data, which is justified by proving the isometric property of the Nahm construction for the cases considered. An alternative construction using the hyperKähler quotient is also presented. The effect of massless monopoles is briefly discussed.
1. Introduction
Moduli spaces of instantons [2] and Bogomol’nyi–Prasad–Sommerfield (BPS) monopoles [3] have been subject to investigation for some time. The moduli space, the quotient of the set of self-dual gauge connections by the group of gauge transformations, is a subset of the configuration space and its geometry therefore reflects physical properties of the system. In this paper instantons on $\mathbb{R}^3 \times S^1$ [17], or calorons, are studied for gauge group $SU(n)$. Calorons are composed out of elementary BPS monopoles [29], as is seen from the action density [24]. This becomes clear for small compactification lengths when the constituents are far apart. In particular, removing one of the monopoles to spatial infinity turns the $k = 1$ caloron into a BPS $SU(n)$ monopole. In contrast, the situation of all monopoles nearly coalescing (in appropriate units corresponding to an infinite compactification length) gives back the ordinary instanton on $\mathbb{R}^4$. These various aspects are respected by the corresponding limits in the metric. The form of the metric was conjectured by Lee and Yi [29], using considerations of D-brane constructions and
asymptotic monopole interactions. This paper addresses the explicit calculation of the metric for the caloron moduli space and its limits.
Metric properties of moduli spaces of selfdual connections play an important role in the study of non-perturbative effects of gauge theories. For instantons the metric appears through the bosonic zero modes in the background of the charge one $SU(2)$ instanton in a calculation to study its physical effects [19]. The scattering of monopoles can be described as the geodesic motion on the moduli space [33], relating the metric to the Lagrangian of the interacting monopole system [34]. The metrics on these moduli spaces are hyperKähler [18]. This property derives formally from the nature of the selfduality equations themselves [1, 10]. It also appears in the Atiyah–Drinfeld–Hitchin–Manin (ADHM) construction of instantons of higher charge, as well as in the Nahm construction for monopoles as a hyperKähler structure on the space of data [8, 9].
The Nahm formalism first appeared as a generalisation of the ADHM construction to construct the BPS monopole [36]. In its extension to selfdual monopoles for arbitrary group and charge [37, 38], the Nahm data in terms of which the monopole is obtained can be constructed in terms of the Weyl zero modes in the background of the monopole. A similar scheme was set up for the caloron [12, 38], which up to very recently [22, 23, 26] had not resulted in explicit solutions. This reciprocity idea could be applied to instantons on $\mathbb{R}^4$ as well [7]. Extended to the four-torus $T^4$, the involutive property of the Nahm transformation preserves the metric and hyperKähler structure [4]. These ideas fit in a programme of studying the Nahm transformation on generalised tori $M = \mathbb{R}^4/H$, where $H$ is the isometry group of the selfdual connection. The calorons correspond to $M = \mathbb{R}^3 \times S^1$, $H = \mathbb{Z}$. This compactification provides a smooth interpolation between instantons and monopoles, adding to the understanding of both objects and the formalism to study them.
The incorporation of both instanton and monopole-like aspects by calorons is read off from the topological characteristics of selfdual gauge connections $A_\mu dx_\mu$ on $\mathbb{R}^3 \times S^1$ [16]. These are related to the properties of the vacuum which the solution necessarily approaches at spatial infinity in order for the action to be finite. The homotopy class of the gauge transformation connecting the vacuum at infinity with the connection near the origin gives the instanton number $k \in \pi_3(SU(n)) = \mathbb{Z}$. The vacuum itself can be nontrivial, due to the non-trivial topology of the asymptotic boundary of the base manifold, $S^2 \times S^1$. This leads to extra labels for the solution which are studied in terms of the gauge holonomy $\mathcal{P}(\vec{x})$ along $S^1$. In the periodic gauge ($A_\mu(\vec{x}, x_0 + T) = A_\mu(\vec{x}, x_0)$), $\mathcal{P}(\vec{x})$ is defined as
\[ \mathcal{P}(\vec{x}) = P \exp\Bigl( \int_0^T A_0(\vec{x}, x_0)\, dx_0 \Bigr), \tag{1} \]
where $P$ denotes path ordering and $T$ the circumference of $S^1$, which we set to 1. In a zero curvature background, continuous deformations of the loop do not affect $\mathcal{P}(\vec{x})$. Its eigenvalues at spatial infinity are topological invariants. Therefore, the gauge holonomy at infinity is diagonal up to an $\hat{x}$ dependent gauge transformation $V$,
\[ \lim_{|\vec{x}| \to \infty} \mathcal{P}(\vec{x}) = \mathcal{P}_\infty = V \mathcal{P}^0_\infty V^{-1}, \qquad \mathcal{P}^0_\infty = \exp[2\pi i\, \mathrm{diag}(\mu_1, \dots, \mu_n)]. \tag{2} \]
The eigenvalues can be ordered such that
\[ \mu_1 < \dots < \mu_n < \mu_{n+1} \equiv \mu_1 + 1, \qquad \sum_{m=1}^{n} \mu_m = 0, \tag{3} \]
using the gauge symmetry and assuming maximal symmetry breaking for the moment. For later use, we define $\nu_m = \mu_{m+1} - \mu_m$, related to the mass of the $m$th constituent monopole. Asymptotically,
\[ A_0 = 2\pi i\, \mathrm{diag}(\mu_1, \dots, \mu_n) - i\, \mathrm{diag}(k_1, \dots, k_n)/(2r) + O(r^{-2}), \qquad \sum_i k_i = 0, \tag{4} \]
up to the gauge transformation $V(\hat{x})$ that induces a map from $S^2$ to $SU(n)/H_\infty$, with $H_\infty$ the isotropy group of $\exp[2\pi i\, \mathrm{diag}(\mu_1, \dots, \mu_n)]$. The maps $V(\hat{x}) \to SU(n)/H_\infty$ are classified according to the fundamental group of $H_\infty$. Generically, $H_\infty$ consists of several $U(1)$ and $SU(N)$, $N > 1$, subgroups. Each $U(1)$ gives rise to a monopole winding number, related to the integers $k_i$. The enhanced residual gauge symmetry described by the $SU(N)$ subgroups arises when there is non-maximal symmetry breaking, $\nu_m = \mu_{m+1} - \mu_m = 0$ for some value(s) of $m$, giving rise to massless constituent monopoles. A non-trivial value of $\mathcal{P}_\infty$ breaks the gauge symmetry. This makes calorons very similar to BPS monopoles [20, 36, 37], which fit in the above classification as $S^1$ invariant selfdual connections, classified according to the magnetic charges $(m_1, \dots, m_{n-1})$, where $m_i = k_1 + \dots + k_i$. The $k = 1$ $SU(n)$ caloron studied in this paper has no magnetic charges, and its only nontrivial topological labels are the instanton number $k = 1$ and the eigenvalues $\mu_m$ of the holonomy.
The explicit computation of the metrics in this paper is based on the isometric property of the Nahm transformation, known to hold for instantons on $\mathbb{R}^4$ and $T^4$, as well as for certain types of BPS monopoles [39]. It is believed to hold generally. For most situations considered in this paper, an explicit proof seems not to be present in the literature, and will be given here. This allows for a determination of the metric on the moduli space of Nahm data. For monopoles, such a calculation was first done in [5], showing that the metric of the $(1, 1)$ data is a Taub-NUT space with positive mass parameter. Considerations based on asymptotic monopole interactions [14] reproduced this result [11]. For the $(1, 1, \dots, 1)$ monopole a similar equivalence was found [27, 35]. All these metrics are of so-called toric hyperKähler type [13, 42], and can be efficiently obtained as metrics on hyperKähler quotients [15].
An explicit calculation of the $k = 1$ $SU(2)$ caloron is extended here to $SU(n)$, generalising the techniques in [22, 23]. An alternative derivation using the hyperKähler quotient will also be given. There we will greatly benefit from the formalism in [35, 15], due to the similarity between the caloron and monopole Nahm data.
The outline of this paper is as follows. In Sect. 2, some aspects of hyperKähler manifolds are presented, mostly to fix notation and to give some identities used throughout. Crucial in the ability to handle the caloron is that the infinite matrices of the ADHM construction are converted by Fourier transformation to functions on $S^1$. This translates ADHM to the Nahm formulation and allows one to keep track of crucial delta-function singularities. In Sect. 3, to define notation, we summarise the ADHMN formalism for calorons as developed in refs. [22, 23, 30] based on the ADHM construction for instantons, rather than following [12, 38]. The caloron metric is calculated in Sect. 4. The instanton and monopole limits of the caloron are discussed in Sect. 5. A unified description of instantons, calorons and monopoles is thus achieved. Other aspects of the caloron, among which the effect of massless constituents, are commented on in the discussion. The appendix contains some technicalities on the $(1, 1, \dots, 1)$ monopole.
2. Preliminaries
Manifolds with metric $g$ are hyperKähler if they have three independent complex structures $I, J, K$ that satisfy the quaternion algebra, $IJ = -JI = K$ and cyclic, whose associated Kähler forms $\omega_I(\cdot,\cdot) = g(\cdot, I\cdot)$, $\omega_J(\cdot,\cdot) = g(\cdot, J\cdot)$, $\omega_K(\cdot,\cdot) = g(\cdot, K\cdot)$ are closed. As will be outlined in Sect. 4.1, the moduli spaces of selfdual connections inherit their hyperKähler property from the hyperKähler structure of the base space manifold $M = \mathbb{R}^4/H$, where $H = \emptyset, \mathbb{Z}, \mathbb{R}$ for instantons, calorons and monopoles respectively. The position coordinate on $\mathbb{R}^4$ will be denoted as a quaternion, $x = x_\mu \sigma_\mu$. Here the unit quaternions are defined as $\sigma_\mu = (1_2, -i\vec{\tau}) = (1, i, j, k)$ and $\bar{\sigma}_\mu = (1_2, i\vec{\tau})$, with $ij = -ji = k$ and $\vec{\tau}$ the Pauli matrices. As $M$ has a flat metric, there is no difference between upper and lower indices. Repeated indices imply summation. We introduce the selfdual, resp. anti-selfdual quaternionic tensors [19] $\eta_{\mu\nu} \equiv \eta^i_{\mu\nu} \sigma_i \equiv \frac{1}{2}(\sigma_\mu \bar{\sigma}_\nu - \sigma_\nu \bar{\sigma}_\mu)$ and $\bar{\eta}_{\mu\nu} \equiv \bar{\eta}^i_{\mu\nu} \sigma_i \equiv \frac{1}{2}(\bar{\sigma}_\mu \sigma_\nu - \bar{\sigma}_\nu \sigma_\mu)$, and $\varepsilon_{0123} = 1$. Identifying the tangent space to $\mathbb{H} = \mathbb{R}^4$ with the vector space itself, the complex structures act on $x$ as right multiplication with $-i, -j, -k$, such that $(I, J, K)_{\mu\nu} = \bar{\eta}^{1,2,3}_{\mu\nu}$. It is convenient to combine the metric and Kähler forms into one quaternion,
\[ (g, \vec{\omega}) = g \sigma_0 + \vec{\omega} \cdot \vec{\sigma}. \tag{5} \]
This implies for $\mathbb{R}^4$,
\[ (g, \vec{\omega}) = d\bar{x} \otimes dx, \qquad g = ds^2 = (dx_\mu)^2, \qquad \vec{\omega} \cdot \vec{\sigma} = d\bar{x} \wedge dx = \bar{\eta}_{\mu\nu}\, dx_\mu \wedge dx_\nu = (2 dx_0 \wedge d\vec{x} - d\vec{x} \wedge d\vec{x}) \cdot \vec{\sigma}. \tag{6} \]
Here, $(d\vec{a} \wedge d\vec{b})_i = \varepsilon_{ijk}\, da_j \wedge db_k$. One extends to $\mathbb{H}^N$ by replacing $d\bar{x}$ in Eq. (6) by $dx^\dagger = d\bar{x}^t$.
Many examples of hyperKähler manifolds emerge as hyperKähler quotients [18]. Consider a hyperKähler manifold $M$ acted upon freely by a group $G$ (with algebra $\mathfrak{g}$) of isometries, $L_X g = 0$, $L$ denoting the Lie derivative and $X \in \mathfrak{g}$. When $G$ preserves the complex structures, $L_X \vec{\omega} = 0$, the isometries are called triholomorphic and the moment map $\vec{\mu} : M \to \mathfrak{g}^* \otimes \mathbb{R}^3$ can be defined as $X^\mu \vec{\omega}_{\mu\nu} = \partial_\nu \vec{\mu}_X$. The manifold $\vec{\mu}^{-1}(\vec{c})/G$, with $\vec{c} \in \mathbb{R}^3 \otimes Z_{\mathfrak{g}}$ ($Z_{\mathfrak{g}}$ the center of $\mathfrak{g}^*$), obtained by taking the quotient of the level set $\vec{\mu}^{-1}(\vec{c})$ by $G$ is then hyperKähler itself. Isometries commuting with $G$ descend to the quotient. When they are also triholomorphic, this property is preserved.
The relevant example is provided by the moduli space of ADHM data in the construction of charge $k$ instantons on $\mathbb{R}^4$ for gauge group $SU(n)$ [2, 7]. The caloron will be constructed using an infinite-dimensional version of the ADHM construction, which we therefore review here to establish conventions. One considers the set $\hat{A}$ of matrices
\[ \Delta = \begin{pmatrix} \lambda \\ B \end{pmatrix}, \tag{7} \]
with $\lambda \in \mathbb{C}^{n,2k}$ and the $2k \times 2k$ dimensional matrix $B = B_\mu \otimes \sigma_\mu$, where $B_\mu$ are $k \times k$ dimensional hermitian matrices. The metric and Kähler forms on $\hat{A}$ are defined as
\[ g = \tfrac{1}{2} \mathrm{Tr}\, \mathrm{tr}_2 \bigl( dB^\dagger dB + 2\, d\lambda^\dagger d\lambda \bigr), \qquad \vec{\omega} \cdot \vec{\sigma} = \tfrac{1}{2} \sigma_i\, \mathrm{Tr}\, \mathrm{tr}_2 \bigl[ \bar{\sigma}_i \bigl( dB^\dagger \wedge dB + 2\, d\lambda^\dagger \wedge d\lambda \bigr) \bigr]. \tag{8} \]
Instantons, Monopoles and Toric HyperKähler Manifolds
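$\hat{\mathcal A}$ is hyperKähler. The $U(k)$ transformations
\[ \lambda \to \lambda T^\dagger, \qquad B_\mu \to T B_\mu T^\dagger, \qquad T \in U(k), \tag{9} \]
leave $(g, \vec\omega)$ in Eq. (8) invariant and therefore form a group of triholomorphic isometries of $\hat{\mathcal A}$. The associated moment map reads ($\mathrm{tr}_2$ denoting the trace associated with quaternions)
\[ \vec\mu = \tfrac12\,\mathrm{tr}_2\big[(B^\dagger B + \lambda^\dagger\lambda)\,\vec{\bar\sigma}\big]. \tag{10} \]
Its zero set $\vec\mu^{-1}(0)$ is formed by the solutions to the ADHM constraint
\[ \bar\eta_{\mu\nu}B_\mu B_\nu + \tfrac12\,\tau_a\,\mathrm{tr}_2(\tau_a\lambda^\dagger\lambda) = 0. \tag{11} \]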
The instanton gauge connection corresponding to a solution $\Delta \in \vec\mu^{-1}(0)$ is obtained as
\[ A_\mu(x) = v^\dagger(x)\,\partial_\mu v(x), \tag{12} \]
in terms of the $(2k + n) \times n$ dimensional complex matrix $v(x)$ containing the normalised zero modes of $\Delta^\dagger(x) = \Delta^\dagger - x^\dagger b^\dagger$, where $b^\dagger = (0, 1_k)$. For $A_\mu$ to be an $SU(n)$ gauge potential, $B^\dagger B + \lambda^\dagger\lambda$ should be invertible, implying the existence of a $k \times k$ dimensional hermitian matrix $f_x$ commuting with the quaternions,
\[ \Delta^\dagger(x)\Delta(x) = f_x^{-1} \otimes \sigma_0. \tag{13} \]
This matrix features in the expression for the curvature,
\[ F_{\mu\nu} = 2\,v^\dagger(x)\, b\,\eta_{\mu\nu} f_x\, b^\dagger v(x), \tag{14} \]
showing it to be self-dual. It also appears in the formula for the action density [40],
\[ \mathrm{Tr}\,F_{\mu\nu}^2(x) = -\partial_\mu^2\partial_\nu^2 \log\det f_x, \tag{15} \]
from which it follows that the topological charge is $k$, because of the asymptotic behaviour
\[ f_x = 1_k/x^2, \qquad x^2 \to \infty. \tag{16} \]
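As a hedged illustration of Eqs. (13) and (16), the following minimal sketch evaluates $\Delta^\dagger(x)\Delta(x)$ for the simplest case $k = 1$, $n = 2$. The parametrisation $\lambda = \rho\, q$ with $q$ a unit quaternion and $B = a_\mu\sigma_\mu$, as well as all function names, are assumptions made for this example and are not taken from the text. For such data $\Delta^\dagger(x)\Delta(x)$ is proportional to the unit matrix, with $f_x^{-1} = \rho^2 + |x - a|^2$.

```python
# Minimal sketch: k = 1, SU(2) ADHM data as 2x2 complex matrices (pure Python).
# Assumptions (not from the text): lam = rho * q with q a unit quaternion,
# B = a_mu sigma_mu, and the helper names used below.

def quaternion(x):
    """Return x_mu sigma_mu for sigma_mu = (1_2, -i tau) as a 2x2 matrix."""
    x0, x1, x2, x3 = x
    return [[x0 - 1j * x3, -1j * x1 - x2],
            [-1j * x1 + x2, x0 + 1j * x3]]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_green(rho, q, a, x):
    """Delta^dagger(x) Delta(x) = lam^dagger lam + (B - x)^dagger (B - x)."""
    lam = [[rho * e for e in row] for row in quaternion(q)]
    diff = quaternion([ai - xi for ai, xi in zip(a, x)])
    first = matmul(dagger(lam), lam)
    second = matmul(dagger(diff), diff)
    return [[first[i][j] + second[i][j] for j in range(2)] for i in range(2)]
```

The off-diagonal entries of the returned matrix vanish up to rounding, and for large $x^2$ the diagonal entry approaches $x^2$, in line with the asymptotic behaviour quoted in Eq. (16).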
Thus it is shown that an element $\Delta \in \vec\mu^{-1}(0)$ corresponds to a charge $k$ instanton solution. The gauge connection (12) is not affected by the $U(k)$ transformations (9), which therefore have to be divided out to obtain the instanton moduli space $\vec\mu^{-1}(0)/U(k)$ (its isometry with the moduli space of instantons is discussed later). This reduces the dimension of the instanton moduli space to $4kn$. As it is a hyperKähler quotient, this space is hyperKähler [8, 10]. Global gauge transformations of the instanton, which are included as moduli, are realised by the action
\[ \lambda \to g\lambda, \qquad g \in SU(n), \tag{17} \]
which is a triholomorphic isometry, as follows from Eq. (8). As $SU(n)$ acts on the left, it commutes with $U(k)$ acting on the right. Therefore, $SU(n)$ descends as a group of triholomorphic isometries to the moduli space of ADHM data, the hyperKähler quotient $\vec\mu^{-1}(0)/U(k)$, reflecting the gauge symmetry of the instanton solution.
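The selfduality of the curvature in Eq. (14) and the form of the ADHM constraint (11) rest on the duality properties of the quaternionic tensors $\eta_{\mu\nu}$ and $\bar\eta_{\mu\nu}$. The sketch below checks these properties numerically with the conventions $\sigma_\mu = (1_2, -i\vec\tau)$, $\bar\sigma_\mu = (1_2, i\vec\tau)$ and $\epsilon_{0123} = 1$; the helper names are hypothetical and chosen only for this illustration.

```python
# Minimal sketch: duality properties of eta and eta-bar (pure Python).
# Helper names are illustrative; conventions follow sigma_mu = (1_2, -i tau).

I2 = [[1, 0], [0, 1]]
TAU = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

SIGMA = [I2] + [scale(-1j, t) for t in TAU]
SIGMA_BAR = [I2] + [scale(1j, t) for t in TAU]

def eta(mu, nu):
    """(1/2)(sigma_mu sigmabar_nu - sigma_nu sigmabar_mu)."""
    a, b = matmul(SIGMA[mu], SIGMA_BAR[nu]), matmul(SIGMA[nu], SIGMA_BAR[mu])
    return [[0.5 * (a[i][j] - b[i][j]) for j in range(2)] for i in range(2)]

def eta_bar(mu, nu):
    """(1/2)(sigmabar_mu sigma_nu - sigmabar_nu sigma_mu)."""
    a, b = matmul(SIGMA_BAR[mu], SIGMA[nu]), matmul(SIGMA_BAR[nu], SIGMA[mu])
    return [[0.5 * (a[i][j] - b[i][j]) for j in range(2)] for i in range(2)]

def levi_civita(idx):
    """Totally antisymmetric symbol with epsilon_{0123} = 1."""
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                     if idx[i] > idx[j])
    return -1 if inversions % 2 else 1

def dual(tensor, mu, nu):
    """(1/2) epsilon_{mu nu alpha beta} tensor_{alpha beta}."""
    out = [[0, 0], [0, 0]]
    for al in range(4):
        for be in range(4):
            eps = levi_civita((mu, nu, al, be))
            if eps:
                t = tensor(al, be)
                for i in range(2):
                    for j in range(2):
                        out[i][j] += 0.5 * eps * t[i][j]
    return out

def matches(a, b, sign):
    """True if a equals sign * b entry by entry (up to rounding)."""
    return all(abs(a[i][j] - sign * b[i][j]) < 1e-12
               for i in range(2) for j in range(2))
```

With these definitions, `dual(eta, mu, nu)` reproduces `eta(mu, nu)` while `dual(eta_bar, mu, nu)` gives minus `eta_bar(mu, nu)`, for all index pairs.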
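At this place we recall a frequently used $U(1)$ fibration over $\mathbb{R}^3$, physically interpreted as a monopole phase and position. It is presented in terms of complex row 2-vectors that feature in the ADHM matrix $\lambda$. Specifically, for a 2-dimensional complex row vector $\varsigma = (\varsigma_1, \varsigma_2)$, describing $\mathbb{R}^4$, the metric and Kähler forms read
\[ g = \tfrac12\,\mathrm{tr}_2(d\varsigma^\dagger d\varsigma), \qquad \vec\omega\cdot\vec\sigma = \tfrac12\,\sigma_i\,\mathrm{tr}_2(\bar\sigma_i\, d\varsigma^\dagger \wedge d\varsigma). \tag{18} \]
The complex structures act on $\varsigma$ by right multiplication with $-\sigma_i$. There is a triholomorphic $U(1)$ isometry with associated moment map
\[ \varsigma \to e^{it}\varsigma, \qquad \vec\mu = \tfrac12\,\mathrm{tr}_2(-i\varsigma^\dagger\varsigma\,\vec{\bar\sigma}) = \tfrac12\,\vec r. \tag{19} \]
The level sets are $U(1)$ fibres due to the phase ambiguity in defining $\varsigma$ from $\vec r$, which becomes more manifest upon introducing new coordinates,
\[ \varsigma = \varsigma^0 e^{i\psi/2}, \qquad \psi \in \mathbb{R}/(4\pi\mathbb{Z}), \tag{20} \]
with for example $\varsigma^0_2(\vec r)$ chosen real. A useful identity is
\[ \tfrac12\,\mathrm{tr}_2\big(\delta\varsigma^{0\dagger}\varsigma^0 - \varsigma^{0\dagger}\delta\varsigma^0\big) = -i|\vec r|\,\vec w(\vec r)\cdot d\vec r, \tag{21} \]
where $\vec w(\vec r)$ is the vector potential of the abelian Dirac monopole,
\[ \vec\nabla_r \times \vec w(\vec r) = \vec\nabla_r \frac{1}{|\vec r|}. \tag{22} \]
In the present form, the Dirac string lies along the positive $z$ axis; other gauges are obtained by allowing for $\vec r$ dependent phase ambiguities. In terms of $(\vec r, \psi)$, the metric and Kähler forms on $\mathbb{R}^4$ read
\[ ds^2 = \frac14\left(\frac{1}{|\vec r|}\,d\vec r^{\,2} + |\vec r|\,\big(d\psi + \vec w(\vec r)\cdot d\vec r\big)^2\right), \qquad \vec\omega = \frac12\left(\big(d\psi + \vec w(\vec r)\cdot d\vec r\big) \wedge d\vec r - \frac{1}{2r}\,d\vec r \wedge d\vec r\right). \tag{23} \]
The $U(1)$ isometry is equivalent to a linear action
\[ \psi \to \psi + 2t, \qquad t \in \mathbb{R}/(2\pi\mathbb{Z}). \tag{24} \]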
The moduli spaces we will encounter are all so-called toric hyperKähler manifolds [42]. These manifolds have coordinates consisting of $N$ three-vectors $\vec x_a \in \mathbb{R}^3$, $a = 1, \ldots, N$, and $N$ torus variables $\phi_a$, generalising the $U(1)$ in the previous example. Metric and Kähler forms read
\[ g = d\vec x_a\,\Phi_{ab}\cdot d\vec x_b + \Big(\frac{d\phi_a}{4\pi} + \vec\Omega_{ac}\cdot d\vec x_c\Big)(\Phi^{-1})_{ab}\Big(\frac{d\phi_b}{4\pi} + \vec\Omega_{bd}\cdot d\vec x_d\Big), \qquad \vec\omega = 2\Big(\frac{d\phi_a}{4\pi} + \vec\Omega_{ab}\cdot d\vec x_b\Big) \wedge d\vec x_a - \Phi_{ab}\, d\vec x_b \wedge d\vec x_a. \tag{25} \]
Here we adopted the notation of [14]. The potentials $\Phi$ and $\vec\Omega$ are $\phi_a$ independent, giving rise to $N$ commuting triholomorphic isometries $\partial/\partial\phi_a$, corresponding to shifts on the torus. Closure of the Kähler forms is equivalent to
\[ \frac{\partial}{\partial x^i_a}\Omega^j_{bc} - \frac{\partial}{\partial x^j_c}\Omega^i_{ba} = \epsilon_{ijk}\frac{\partial}{\partial x^k_c}\Phi_{ab}, \qquad \forall\, a, b, c, i, j. \tag{26} \]
These equations are therefore called hyperKähler conditions [42, 13], and generalise Eq. (22). The metric in Eq. (25) has an $SO(3)$ isometry, acting on the vectors $\vec x_a$, that rotates the complex structures. Toric hyperKähler manifolds are torus bundles over $(\mathbb{R}^3)^N$ [14]. Physically, the $\mathbb{R}^3$ vectors $\vec x_a$ are (relative) constituent monopole positions, whereas the torus describes the phases of the monopoles. In the Lagrangian interpretation of the metric, $\Phi$ and $\vec\Omega$ denote retarded interaction potentials for the constituents [14, 34] and it was considerations of this kind that led to the conjectures for the metric in [27, 29].

3. The ADHM-Nahm Formalism

We will construct the caloron in the so-called algebraic gauge, related to the periodic gauge by the non-periodic gauge transformation $g(\vec x, x_0) = V \exp[2\pi i x_0\,\mathrm{diag}(\mu_1, \ldots, \mu_n)]V^{-1}$. In this gauge, the background field $2\pi i\,\mathrm{diag}(\mu_1, \ldots, \mu_n)$ in Eq. (4) is removed and we have the alternative boundary condition,
\[ A_\mu(\vec x, x_0 + T) = P_\infty A_\mu(\vec x, x_0) P_\infty^{-1}. \tag{27} \]
Since in the absence of magnetic windings, $P_\infty$ can always be gauged to a constant diagonal form, we assume henceforth $P_\infty = P_\infty^0$ without loss of generality. The periodic instanton of charge one is obtained in the algebraic gauge (27) by taking an infinite array of elementary instantons, relatively gauge-rotated by $P_\infty$. To implement this in the ADHM formalism we take a specific solution for the zero mode vector $v(x)$ in the ADHM construction,
\[ v(x) = \begin{pmatrix} -1_n \\ u(x) \end{pmatrix}\varphi^{-1/2}(x), \qquad u(x) = (B^\dagger - x^\dagger 1_k)^{-1}\lambda^\dagger, \qquad \varphi(x) = 1_n + u^\dagger(x)u(x), \tag{28} \]
where $\varphi$ is an $n \times n$ positive hermitian matrix. In terms of these, one obtains
\[ A_\mu(x) = \varphi^{-1/2}(x)\big(u^\dagger(x)\partial_\mu u(x)\big)\varphi^{-1/2}(x) + \varphi^{1/2}(x)\,\partial_\mu\varphi^{-1/2}(x). \tag{29} \]
For Eq. (27) to hold, it is then required that
\[ u_{p+1}(x + 1) = u_p(x)P_\infty^{-1}, \qquad p \in \mathbb{Z}. \tag{30} \]
This imposes periodicity constraints on the data
\[ \lambda_{p+1} = P_\infty\lambda_p, \qquad B_{p,p'}(x + 1) = B_{p-1,p'-1}(x), \tag{31} \]
with $B(x) = B - x 1_k$, which imply
\[ \lambda_p = P_\infty^p\,\zeta, \qquad B_{p,p'} = p\,\sigma_0\,\delta_{p,p'} + \hat A_{p-p'}, \qquad p, p' \in \mathbb{Z}. \tag{32} \]
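The off-diagonal part $\hat A$ is still to be determined. Fourier transformation translates the ADHM formalism to the Nahm language. $B$ is cast into a Weyl operator,
\[ \sum_{p,p' \in \mathbb{Z}} B_{p,p'}(x)\,e^{2\pi i(pz - p'z')} = \frac{\delta(z - z')}{2\pi i}\hat D_x(z'), \]
\[ \hat D_x(z) = \sigma_\mu\hat D^\mu_x(z) = \frac{d}{dz} + \hat A(z) - 2\pi i x, \qquad \hat A(z) = \sigma_\mu\hat A_\mu(z), \qquad \hat A_\mu(z) = 2\pi i\sum_{p \in \mathbb{Z}} e^{2\pi i pz}\hat A^\mu_p, \tag{33} \]
and $\lambda^\dagger\lambda$ into a singularity structure describing the matching conditions for $\hat A(z)$,
\[ \sum_{p \in \mathbb{Z}} e^{-2\pi i pz}\lambda_p = \sum_{p \in \mathbb{Z}}\ \sum_{m \in \mathbb{Z}/n\mathbb{Z}} e^{2\pi i p(\mu_m - z)}P_m\zeta = \hat\lambda(z), \qquad \hat\lambda(z) = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)P_m\zeta, \]
\[ \sum_{p,p' \in \mathbb{Z}} \lambda_p^\dagger\, e^{2\pi i(pz - p'z')}\lambda_{p'} = \delta(z - z')\hat C(z), \qquad \hat C(z) = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\,\zeta^\dagger P_m\zeta = \zeta^\dagger\hat\lambda(z). \tag{34} \]
Here we introduced the projection operators $P_m = e_m e_m^t$, where $e_m$ is the $m$th unit vector, in terms of which
\[ P_\infty = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \exp(2\pi i\mu_m)P_m \qquad\text{and}\qquad \lambda_p = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \exp(2\pi i p\mu_m)P_m\zeta. \]
The group index $m \in \mathbb{Z}/n\mathbb{Z}$ is a cyclic variable. We also used that for any two objects $a, b$ of type $a_p = P_\infty^p\alpha$, $p \in \mathbb{Z}$, the Fourier transforms defined as $\hat a(z) = \sum_{p \in \mathbb{Z}} \exp(-2\pi i pz)a_p$ have the property
\[ \hat a^\dagger(z)\hat b(z') = \delta(z - z')\,\hat a^\dagger(z)\langle\hat b\rangle = \delta(z - z')\,\langle\hat a^\dagger\rangle\hat b(z) = \delta(z - z')\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\,\alpha^\dagger P_m\beta, \tag{35} \]
where $\langle H\rangle \equiv \int_{S^1} H(z)\,dz$. The quadratic ADHM constraint translates into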
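\[ \tfrac12\,[\hat D_\mu(z), \hat D_\nu(z)]\,\bar\eta_{\mu\nu} = 4\pi^2\,\Im\hat C(z), \tag{36} \]
where $\Im$ is introduced to act on a $2 \times 2$ matrix as $\Im W \equiv \tfrac12[W - \tau_2 W^t\tau_2]$ ($\Re W \equiv \tfrac12\,\mathrm{tr}_2 W$). We use the $U(1)$ fibration over $\mathbb{R}^3$ (Eq. (19)) to write
\[ \zeta^\dagger P_m\zeta = \zeta_{(m)}^\dagger\zeta_{(m)} = \frac{1}{2\pi}(\rho_m + \vec\rho_m\cdot\vec\tau), \qquad \rho_m = |\vec\rho_m|. \tag{37} \]
This leads to the caloron Nahm equation
\[ \frac{d}{dz}\hat A_j(z) = 2\pi i\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\,\rho_m^j, \tag{38} \]
which is abelian in the $k = 1$ situation at hand, see [24, 38]. The phase ambiguity in defining $\zeta_{(m)}$ from $\vec\rho_m$ is resolved later. As integration of Eq. (38) over $S^1$ gives a constraint on $\zeta$,
\[ \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \vec\rho_m = \pi\,\mathrm{tr}_2(\vec\tau\,\zeta^\dagger\zeta) = 0, \tag{39} \]
we can introduce vectors $\vec y_m$, $m \in \mathbb{Z}/n\mathbb{Z}$, such that $\vec\rho_m = \vec y_m - \vec y_{m-1}$. The vectors $\vec y_m$ are to be interpreted as the constituent monopole positions. We now find for the spacelike components of $\hat A(z)$,
\[ \hat A_j(z) = 2\pi i\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\,y_m^j, \tag{40} \]
where $\chi_{[\mu_m,\mu_{m+1}]}(z) = 1$ for $z \in [\mu_m, \mu_{m+1}]$ and 0 elsewhere, extended periodically. Note that the Nahm equations determine $\vec y_m$ up to the global $\mathbb{R}^3 \times S^1$ position variable
\[ \xi = \frac{1}{2\pi i}\int_{S^1}\hat A(z)\,dz, \qquad \vec\xi = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \nu_m\vec y_m. \tag{41} \]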
Here $\nu_m = \mu_{m+1} - \mu_m$ is related to the mass of the $m$th constituent. The $T$ symmetry Eq. (9) in the ADHM construction is mapped to a $U(1)$ gauge symmetry, with gauge group $\hat{\mathcal G} = \{g(z)\,|\,g : z \to e^{-ih(z)} \in U(1)\}$, acting as
\[ \hat A(z) \to \hat A(z) + i\frac{d}{dz}h(z), \qquad \zeta_m \to \zeta_m e^{ih(\mu_m)}. \tag{42} \]
For calorons, $g(z)$ is periodic and can be used to set $\hat A_0(z)$ to a constant. A piecewise linear $U(1)$ gauge function $h(z)$ shifts the $U(1)$ phase ambiguities in $\zeta_{(m)}$ to $\hat A_0(z)$, which thus becomes piecewise constant. Therefore, all $4n$ moduli are included in the following solution to the Nahm equations:
\[ \hat A(z) = 2\pi i\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\Big(\frac{\tau_m}{4\pi\nu_m}\sigma_0 + \vec y_m\cdot\vec\sigma\Big), \tag{43} \]
where $\tau = (\tau_1, \ldots, \tau_n)^t$ takes values in $\mathbb{R}^n$. Using the gauge functions
\[ g(z) = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\exp\Big(2\pi i(z - \mu_m)\frac{k_m}{\nu_m}\Big), \qquad k_m \in \mathbb{Z}, \tag{44} \]
which leave the $U(1)$ phases of $\zeta$ unaffected, $\tau$ can be restricted to the torus $\mathbb{R}^n/(4\pi\mathbb{Z})^n$. In this gauge, the moduli describing the general caloron are the position vectors $\vec y_m$, comprised in $\vec y = (\vec y_1, \ldots, \vec y_n)$, and the torus coordinate $\tau$ describing the $U(1)^{n-1}$ residual gauge symmetry and the temporal position of the caloron. Strictly speaking, these variables are coordinates on the cover of the moduli space of framed calorons. The true moduli space is obtained by dividing out the center of the gauge group. This leads to orbifold singularities.
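The piecewise constant Nahm data of Eqs. (39) to (41) lend themselves to a small bookkeeping example. The sketch below assumes ordered holonomy eigenvalues $\mu_1 < \cdots < \mu_n < \mu_1 + 1$ (so that the $\nu_m$ sum to one) and uses exact rational arithmetic; the function names are hypothetical. It tabulates $\nu_m$, the relative vectors $\vec\rho_m = \vec y_m - \vec y_{m-1}$, the center of mass $\vec\xi = \sum_m \nu_m\vec y_m$, and the profile $\hat A_j(z)/(2\pi i)$.

```python
from fractions import Fraction as F

# Minimal sketch: bookkeeping for piecewise constant caloron Nahm data.
# Assumption: mu_1 < ... < mu_n < mu_1 + 1, so that the nu_m sum to one.

def constituent_masses(mu):
    """nu_m = mu_{m+1} - mu_m, with mu_{n+1} = mu_1 + 1."""
    n = len(mu)
    return [mu[m + 1] - mu[m] for m in range(n - 1)] + [mu[0] + 1 - mu[-1]]

def relative_vectors(y):
    """rho_m = y_m - y_{m-1}, with the index m cyclic."""
    return [tuple(y[m][j] - y[m - 1][j] for j in range(3))
            for m in range(len(y))]

def center_of_mass(mu, y):
    """xi = sum_m nu_m y_m, cf. Eq. (41)."""
    nu = constituent_masses(mu)
    return tuple(sum(nu[m] * y[m][j] for m in range(len(y))) for j in range(3))

def nahm_profile(mu, y, z):
    """A_j(z)/(2 pi i) of Eq. (40): equal to y_m on [mu_m, mu_{m+1}), periodic in z."""
    z = mu[0] + (z - mu[0]) % 1
    for m in range(len(mu) - 1, -1, -1):
        if z >= mu[m]:
            return y[m]
```

The relative vectors returned by `relative_vectors` sum to zero by construction, which is the content of the constraint (39).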
Under Fourier transformation, the Green's function $f_x$ (Eq. (13)) for calorons becomes $\hat f_x(z, z') \equiv \sum_{p,p' \in \mathbb{Z}} f_{x,p,p'}\,e^{2\pi i(pz - p'z')}$ and is a solution of the differential equation
\[ \left[\Big(\frac{1}{2\pi i}\frac{d}{dz} - x_0\Big)^2 + \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\,r_m^2 + \frac{1}{2\pi}\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\,|\vec y_m - \vec y_{m-1}|\right]\hat f_x(z, z') = \delta(z - z'), \tag{45} \]
in the gauge with $\hat A_0(z)$ constant. Here $r_m = |\vec x - \vec y_m|$ is the center of mass radius of the $m$th constituent. Expressions for $\hat f_x$ in other gauges are obtained by using that under the action of $\hat{\mathcal G}$, $\hat f_x$ transforms as
\[ \hat f_x(z, z') \to g(z)\hat f_x(z, z')g(z')^*, \qquad g(z) \in \hat{\mathcal G}. \tag{46} \]
The Nahm construction of the $(1, 1, \ldots, 1)$ monopole, later obtained as a special limit of the caloron, is discussed in the appendix.

4. The Caloron Metric

4.1. Moduli spaces of selfdual connections. The metric on the moduli space $\mathcal M$ of selfdual connections on the manifold $M = \mathbb{R}^4/H$ is computed as the $L^2$ norm of its tangent vectors. These are gauge orthogonal variations of the connections with respect to their moduli. Specifically, $Z_\mu$ is tangent to the moduli space when it is a solution of the deformation equation and the gauge orthogonality condition requiring it to be a zero mode of the covariant derivative $D_\mu^{\rm ad} = \partial_\mu + [A_\mu, \cdot\,]$,
\[ D^{\rm ad}_{[\mu}Z_{\nu]} = \tfrac12\,\epsilon_{\mu\nu\alpha\beta}D^{\rm ad}_{[\alpha}Z_{\beta]}, \qquad D^{\rm ad}_\mu(A)Z_\mu = 0. \tag{47} \]
Written in terms of quaternions, these equations are concisely expressed as $D^{{\rm ad}\dagger}Z = 0$, from which one reads off the tangent space to admit three almost complex structures $I, J, K$ acting as $-i, -j, -k$ on the right. Metric and Kähler forms read
\[ (g, \vec\omega)_{\mathcal M}(Z, Z') = \frac{1}{4\pi^2}\int_M d^4x\,\mathrm{Tr}\big(Z^\dagger(x)Z'(x)\big), \tag{48} \]
where $Z, Z'$ are any two tangent vectors. Gauge orthogonality of a general variation $\delta A_\mu$ of the selfdual connection can be achieved by applying an infinitesimal gauge transformation $\Phi$,
\[ Z_\mu = \delta A_\mu + D^{\rm ad}_\mu\Phi, \qquad (D^{\rm ad}_\nu)^2\Phi = -D^{\rm ad}_\mu\delta A_\mu, \tag{49} \]
implying for the metric
\[ g = -\frac{1}{4\pi^2}\int_M d^4x\,\mathrm{Tr}\big(\delta A_\mu - D^{\rm ad}_\mu(D^{\rm ad}_\nu)^{-2}D^{\rm ad}_\rho\delta A_\rho\big)^2. \tag{50} \]
The hyperKähler property of the moduli space follows formally from considering it as the infinite dimensional hyperKähler quotient of the space of general connections $\mathcal A$ by the triholomorphic action of the group of gauge transformations $\mathcal G$ [1, 10]. The moment map is $\vec\mu_{\mathcal G} = \vec{\bar\eta}_{\mu\nu}F_{\mu\nu}/8\pi^2$, so that the zero set is formed by the space of selfdual solutions, which quotiented by $\mathcal G$ gives the moduli space. That this quotient is well defined follows from the invariance of the Kähler forms
\[ \vec\omega_{rs}\cdot\vec\sigma = -\frac{1}{4\pi^2}\int_M d^4x\,\bar\eta_{\mu\nu}\,\mathrm{Tr}(\delta_r A_\mu\,\delta_s A_\nu), \tag{51} \]
under infinitesimal gauge transformations, which is seen by adding arbitrary $D^{\rm ad}_\mu\Phi$ to the deformations. For the caloron the boundary condition Eq. (27) is consistent with complex structures acting as $\bar\eta_{\mu\nu}$, i.e. the non-trivial holonomy is compatible with the hyperKähler structure. One therefore expects caloron moduli spaces to be hyperKähler.

For practical purposes the formal reasoning above is of little use. Computing metrics on moduli spaces with the techniques presented depends crucially on the construction of the Green's function of the covariant Laplacian and in the present situation, we do not even have an expression for $A_\mu$ readily available. We take a different route which uses multi-instanton calculus, suitably adapted to the caloron situation. This allows for calculating the metric in terms of the ADHMN data and makes it thus feasible to find a compensating gauge transformation or to perform the hyperKähler quotient.

Moduli spaces of selfdual connections can usually be written as a product of the base space $M$, describing the center of mass, and the non-trivial relative moduli space $\mathcal M_{\rm rel}$,
\[ \mathcal M = M \times \mathcal M_{\rm rel}. \tag{52} \]
In the metric this corresponds to a part describing the flat metric on the base space $M$ and one for the relative or centered metric on $\mathcal M_{\rm rel}$, containing the nontrivial part. However, in the case at hand, where we want to take particular limits, it will be preferable to work with the full metric on $\mathcal M$.

4.2. Isometric properties of the ADHM-Nahm construction. We first recall the computation of the metric on the moduli space of instantons on $\mathbb{R}^4$ which can be entirely performed using ADHM techniques. Adapted to the caloron situation, this will translate into the formalism to calculate metrics in terms of Nahm data. A tangent vector to the instanton moduli space is given by
\[ Z_\mu(C) = v^\dagger(x)\,C\,\bar\sigma_\mu f_x\,u(x)\varphi^{-1/2}(x) - \varphi^{-1/2}(x)\,u^\dagger(x)f_x\sigma_\mu C^\dagger v(x), \tag{53} \]
where $C$ is a tangent vector to the moduli space of ADHM data,
\[ C = \begin{pmatrix} c \\ Y \end{pmatrix}, \tag{54} \]
which satisfies
\[ \mathrm{tr}_2\big(\Delta^\dagger(x)C\bar\sigma_i\big) = -\mathrm{tr}_2\big(C^\dagger\Delta(x)\bar\sigma_i\big), \qquad \mathrm{tr}_2\big(\Delta^\dagger(x)C\big) = \mathrm{tr}_2\big(C^\dagger\Delta(x)\big). \tag{55} \]
Here the first equation is the deformation of the ADHM constraint and the second guarantees gauge orthogonality. Using an infinitesimal $U(k)$ transformation (9) $T = \exp(-i\delta X)$, where $\delta X = \delta X^\dagger$, the tangent vectors can be constructed as
\[ C = \delta\Delta + \delta_X\Delta = \begin{pmatrix} \delta\lambda + i\lambda\,\delta X \\ \delta B + i[B, \delta X] \end{pmatrix}, \tag{56} \]
which automatically satisfy the deformation equation. Gauge orthogonality imposes
\[ \mathrm{tr}_2\big(B^\dagger[B, i\delta X] - [B^\dagger, i\delta X]B + 2i\delta X\lambda^\dagger\lambda + \lambda^\dagger\delta\lambda - \delta\lambda^\dagger\lambda + B^\dagger\delta B - \delta B^\dagger B\big) = 0. \tag{57} \]
The complex structures acting on tangent vectors $Z$ extend to $C$ in a natural way, $Z(C)\bar\sigma_i = Z(C\bar\sigma_i)$, as is seen from Eq. (53) and $\sigma_\mu\bar\sigma_i = -\bar\eta^i_{\mu\nu}\sigma_\nu$. The metric can be evaluated using a powerful expression due to Corrigan [6],$^1$
\[ \mathrm{Tr}\big(Z_\mu^\dagger(x)Z'_\mu(x)\big) = -\tfrac12\,\partial^2\,\mathrm{tr}_2\mathrm{Tr}\big(C^\dagger(1 - \Delta(x)f_x\Delta^\dagger(x))C' f_x + C'^\dagger C f_x\big). \tag{58} \]
The integral to compute the $L^2$ norm in Eq. (48) is reduced to a boundary term corresponding with $x^2 \to \infty$, where $f_x$ is known, compare Eq. (16). Using that $Z(C)\bar\sigma_i = Z(C\bar\sigma_i)$ and identifying the tangent space to the ADHM data with the vector space itself, the well-known (see also [32]) hyperKähler isometric property of the ADHM construction is proven,
\[ g_{\mathcal M}(Z, Z') = \tfrac12\,\mathrm{Tr}\,\mathrm{tr}_2\big(Y^\dagger Y' + c^\dagger c' + c'^\dagger c\big), \qquad \vec\omega_{\mathcal M}(Z, Z')\cdot\vec\sigma = \tfrac12\,\sigma_i\,\mathrm{Tr}\,\mathrm{tr}_2\big[\bar\sigma_i\big(Y^\dagger Y' + c^\dagger c' - c'^\dagger c\big)\big]. \tag{59} \]
The right-hand side of Eq. (59) explains why Eq. (8) gives the natural metric and Kähler forms on the space $\hat{\mathcal A}$ of ADHM matrices $\Delta$. As the ADHM construction is an isometry and the moduli space of ADHM data $\vec\mu^{-1}(0)/U(k)$ is hyperKähler, the same holds for the moduli space of instantons on $\mathbb{R}^4$.

In employing the metric properties of the ADHM construction in the caloron case, one has, in addition to the deformation equation and gauge orthogonality, the algebraic gauge condition Eq. (27) to be satisfied,
\[ Z_\mu(x + 1) = P_\infty Z_\mu(x)P_\infty^{-1}. \tag{60} \]
This requires
\[ Y_{p,p'} = Y_{p-1,p'-1}, \qquad c_{p+1} = P_\infty c_p, \qquad \delta X_{p,p'} = \delta X_{p-p'}. \tag{61} \]
The compatibility of periodicity and nontrivial holonomy with the hyperKähler structure on the level of the ADHM-Nahm construction can be seen from the complex structures acting on $Y$ and $c$ as multiplication by $-i, -j, -k$ on the right. We define the Fourier transforms of the tangent vector
\[ \hat c(z) = \sum_{p \in \mathbb{Z}} \exp(-2\pi i pz)\,c_p = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\,\hat c_m, \qquad \delta(z - z')\hat Y(z) = \sum_{p,p' \in \mathbb{Z}} e^{2\pi i(pz - p'z')}\,Y_{p,p'}, \tag{62} \]
$^1$ The expression given in eq. 3.17 of [41] is incorrect for gauge group $SU(n)$ and should be replaced by the one given here in Eq. (58).
and find after Fourier transformation of Eqs. (55, 56) the analogues of Eq. (47) as the deformation of the Nahm equation and a gauge orthogonality condition
\[ \frac{d}{dz}\hat Y_i(z) = -i\pi\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \mathrm{tr}_2\,\bar\sigma_i\big(\zeta_m^\dagger\hat c_m + \hat c_m^\dagger\zeta_m\big)\delta(z - \mu_m), \qquad \frac{d}{dz}\hat Y_0(z) = -i\pi\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \mathrm{tr}_2\big(\zeta_m^\dagger\hat c_m - \hat c_m^\dagger\zeta_m\big)\delta(z - \mu_m). \tag{63} \]
To evaluate the caloron metric we use Eq. (58) and closely follow the reasoning in [23]. By Fourier transformation, Corrigan's formula is cast into
\[ \mathrm{Tr}\,Z_\mu^\dagger(x)Z'_\mu(x) = -\tfrac12\,\partial^2\,\mathrm{tr}_2\int_{S^1} dz\,\big[\hat Y^\dagger(z)\hat Y'(z) + \hat Y'^\dagger(z)\hat Y(z) + \hat c^\dagger(z)\langle\hat c'\rangle + \hat c'^\dagger(z)\langle\hat c\rangle\big]\hat f_x(z, z) \]
\[ \qquad + \tfrac12\,\partial^2\,\mathrm{tr}_2\int_{S^1} dz\,dz'\,\big[\hat{\mathcal C}(z) + \hat{\mathcal Y}_x(z)\big]\hat f_x(z, z')\big[\hat{\mathcal Y}_x'^\dagger(z') + \hat{\mathcal C}'^\dagger(z')\big]\hat f_x(z', z), \tag{64} \]
where we introduced the shorthand notation
\[ \hat{\mathcal C}(z) = \hat c^\dagger(z)\langle\hat\lambda\rangle, \qquad \hat{\mathcal Y}_x(z) = (2\pi i)^{-1}\hat Y^\dagger(z)\hat D_x(z). \tag{65} \]
In evaluating the integral over $\mathbb{R}^3 \times S^1$, the $\partial_0^2$ term gives no contribution because of periodicity. The term involving $\partial_i^2$ is evaluated by partial integration as a boundary term at spatial infinity, for which the asymptotic behaviour of the Green's function $\hat f_x(z, z')$ is needed. Since the asymptotic expression for the Green's function is independent of $n$,
\[ \hat f_x(z, z') = \frac{\pi}{|\vec x|}\,e^{-2\pi|\vec x||z - z'| + 2\pi i x_0(z - z')} + O(|\vec x|^{-2}), \tag{66} \]
we can use, slightly adapted, the analysis for $SU(2)$ in [23]. Combining the first line in Eq. (64) with the only surviving term of the second, we find the following gauge independent expression:
\[ g_{\mathcal M}(Z, Z') = \tfrac12\,\mathrm{tr}_2\big(\langle\hat Y^\dagger\hat Y'\rangle + \langle\hat c^\dagger\rangle\langle\hat c'\rangle + \langle\hat c'^\dagger\rangle\langle\hat c\rangle\big), \qquad \omega^i_{\mathcal M}(Z, Z') = \tfrac12\,\mathrm{tr}_2\,\bar\sigma_i\big(\langle\hat Y^\dagger\hat Y'\rangle + \langle\hat c^\dagger\rangle\langle\hat c'\rangle - \langle\hat c'^\dagger\rangle\langle\hat c\rangle\big). \tag{67} \]
This proves that the metric and Kähler forms on the caloron moduli space can be computed as the metric on the Nahm data. In other words, for $k = 1$ $SU(n)$ calorons, the Nahm construction is a hyperKähler isometry. A slightly modified proof shows this for monopoles of type $(1, 1, \ldots, 1)$ and can be found in the appendix.

The isometric property is essential for what follows. The metric on the caloron moduli spaces can now be calculated in terms of tangent vectors to the space of solutions to the Nahm equations, with infinitesimal gauge transformations performed where needed. This method, used in Sect. 4.3, is called direct as it concentrates on the gauge orthogonal tangent vectors to the moduli space. An alternative method, given in Sect. 4.4, uses the fact that the moduli space of data is an infinite dimensional hyperKähler quotient. It proceeds by using part of the $U(k)$ gauge symmetry to embed the moduli in a finite dimensional hyperKähler space. The metric on the moduli space is then found as the metric on a finite dimensional hyperKähler quotient, with the remaining gauge action to be divided out.
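4.3. Direct computation. In the direct approach a compensating gauge function $\delta\hat X(z) = \sum_{p \in \mathbb{Z}} X_p\exp(2\pi i pz)$ has to be found to account for the tangent vectors
\[ \hat c(z) = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\big(\delta\zeta_m + i\zeta_m\,\delta\hat X(\mu_m)\big), \qquad \hat Y(z) = \frac{1}{2\pi i}\Big(\delta\hat A(z) + i\frac{d}{dz}\delta\hat X(z)\Big), \tag{68} \]
to be gauge orthogonal, Eq. (63). The gauge orthogonality of $\hat Y(z)$ implies for the compensating gauge function $\delta\hat X(z)$
\[ -\frac{1}{2\pi}\frac{d^2\delta\hat X(z)}{dz^2} + 2\,\delta\hat X(z)\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)|\vec\rho_m| = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\Big(\frac{d\tau_m}{4\pi\nu_m} - \frac{d\tau_{m-1}}{4\pi\nu_{m-1}} - |\vec\rho_m|\,\vec w_m(\vec\rho_m)\cdot d\vec\rho_m\Big), \tag{69} \]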
where we used Eq. (21). This differential equation implies that $\delta\hat X(z)$ is continuous and piecewise linear. Therefore, $\delta\hat X(z)$ is fully determined by the values $\delta\hat X_m$ it takes at $z = \mu_m$, which are comprised in the vector $\delta\hat X = (\delta\hat X_1, \ldots, \delta\hat X_n) \in \mathbb{R}^n$. In the gauge chosen, all functions are either constants on the subintervals $(\mu_m, \mu_{m+1})$, or fixed by values at $z = \mu_m$. Therefore, the entire computation can conveniently be performed in terms of $n$ dimensional vectors and $n \times n$ matrix operators acting thereon, at the cost of introducing some extra notation. For taking derivatives, we will use the $n \times n$ matrix
\[ S = \begin{pmatrix} 1 & -1 & & & \\ & 1 & -1 & & \\ & & \ddots & \ddots & \\ & & & 1 & -1 \\ -1 & & & & 1 \end{pmatrix}, \tag{70} \]
with unspecified entries zero. In addition we introduce the vector $\vec\rho = (\vec\rho_1, \ldots, \vec\rho_n) \in \mathbb{R}^{3n}$ and diagonal matrices
\[ \vec W = \frac{1}{4\pi}\,\mathrm{diag}\big(\vec w_1(\vec\rho_1), \ldots, \vec w_n(\vec\rho_n)\big), \qquad N = \mathrm{diag}(\nu_1, \ldots, \nu_n), \qquad V^{-1} = 4\pi\,\mathrm{diag}(\rho_1, \ldots, \rho_n). \tag{71} \]
Introducing the symbol $V$ anticipates its later interpretation as potential. In the sequel, all matrix multiplications between $n$-dimensional objects are implicitly assumed. The transpose $t$ acts only on the indices running from 1 to $n$. The Nahm connection is now represented by the $n$ dimensional vector
\[ \hat A = (\hat A_1, \ldots, \hat A_n)^t = 2\pi\Big(N^{-1}\frac{\tau}{4\pi} + \vec y\cdot\vec\sigma\Big), \tag{72} \]
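where $i\hat A_m$ is the value of $\hat A(z)$ on $(\mu_m, \mu_{m+1})$. The Nahm equation reduces to $\vec\rho = S^t\vec y$. Similarly $\hat c(z) = \sum_{m \in \mathbb{Z}/n\mathbb{Z}} \delta(z - \mu_m)\hat c_m$ and $\hat Y(z) = i\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \chi_{[\mu_m,\mu_{m+1}]}(z)\hat Y_m$ are fixed by
\[ \hat c_m = \delta\zeta_m + i\zeta_m\,\delta\hat X_m, \qquad \hat Y = \frac{1}{2\pi}\delta\hat A - \frac{1}{2\pi}N^{-1}S\,\delta\hat X. \tag{73} \]
Integrating the differential equation (69) for $\delta\hat X(z)$ over small intervals $[\mu_m - \epsilon, \mu_m + \epsilon]$, $\epsilon \downarrow 0$, gives conditions on the values $\delta\hat X_m$. This yields
\[ \frac{1}{2\pi}\big(S^t N^{-1}S + V^{-1}\big)\delta\hat X = S^t N^{-1}\frac{d\tau}{4\pi} - V^{-1}\vec W S^t\cdot d\vec y, \tag{74} \]
where we used that $-d^2\delta\hat X(z)/dz^2$ contributes
\[ -\big[\delta\hat X'(\mu_m+) - \delta\hat X'(\mu_m-)\big] = -\Big[\frac{1}{\nu_m}\big(\delta\hat X_{m+1} - \delta\hat X_m\big) - \frac{1}{\nu_{m-1}}\big(\delta\hat X_m - \delta\hat X_{m-1}\big)\Big]. \tag{75} \]
Equation (74) is solved by
\[ \frac{\delta\hat X}{2\pi} = V S^t G^{-1}\frac{d\tau}{4\pi} - \big(1 - V S^t G^{-1}S\big)\vec W S^t\cdot d\vec y, \tag{76} \]
such that
\[ \hat Y = d\vec y\cdot\vec\sigma + N^{-1}\frac{d\tau}{4\pi} - \frac{1}{2\pi}N^{-1}S\,\delta\hat X = d\vec y\cdot\vec\sigma + G^{-1}\Big(\frac{d\tau}{4\pi} + S\vec W S^t\cdot d\vec y\Big), \tag{77} \]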
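where we defined $G = N + SVS^t$. The integration over $S^1$ to evaluate the metric on the Nahm data in Eq. (67) is carried out as $\langle\hat Y^\dagger\hat Y\rangle = \hat Y^\dagger N\hat Y$, using that each subinterval has length $\nu_m = \mu_{m+1} - \mu_m$. Thus we obtain
\[ \tfrac12\,\mathrm{tr}_2\langle\hat Y^\dagger\hat Y\rangle = d\vec y^{\,t}\cdot N\,d\vec y + \Big(\frac{d\tau}{4\pi} + S\vec W S^t\cdot d\vec y\Big)^t G^{-1}NG^{-1}\Big(\frac{d\tau}{4\pi} + S\vec W S^t\cdot d\vec y\Big), \]
\[ \tfrac14\,\sigma_i\,\mathrm{tr}_2\,\bar\sigma_i\langle\hat Y^\dagger \wedge \hat Y\rangle = -\tfrac12\,d\vec y^{\,t}N \wedge d\vec y\cdot\vec\sigma + \Big[NG^{-1}\Big(\frac{d\tau}{4\pi} + S\vec W S^t\cdot d\vec y\Big)\Big]^t \wedge d\vec y\cdot\vec\sigma. \]
Using the properties (21, 23) of $\zeta_m$, the contribution to the metric of $\hat c_m$ defined in Eq. (73) is found. One obtains
\[ \mathrm{tr}_2\langle\hat c^\dagger\rangle\langle\hat c\rangle = d\vec y^{\,t}\cdot SVS^t\,d\vec y + \Big(\frac{d\tau}{4\pi} + S\vec W S^t\cdot d\vec y\Big)^t G^{-1}SVS^tG^{-1}\Big(\frac{d\tau}{4\pi} + S\vec W S^t\cdot d\vec y\Big), \]
\[ \tfrac12\,\sigma_i\,\mathrm{tr}_2\,\bar\sigma_i\langle\hat c^\dagger\rangle \wedge \langle\hat c\rangle = -\tfrac12\,(SVS^t\,d\vec y)^t \wedge d\vec y\cdot\vec\sigma + \Big[SVS^tG^{-1}\Big(\frac{d\tau}{4\pi} + S\vec W S^t\cdot d\vec y\Big)\Big]^t \wedge d\vec y\cdot\vec\sigma, \]
where it is used that in the gauge chosen the phases of $\zeta$ are fixed. The metric and Kähler forms on moduli space of the uncentered caloron are now readily obtained,
\[ ds^2 = d\vec y^{\,t}G\cdot d\vec y + \Big(\frac{d\tau}{4\pi} + \vec{\mathcal W}\cdot d\vec y\Big)^t G^{-1}\Big(\frac{d\tau}{4\pi} + \vec{\mathcal W}\cdot d\vec y\Big), \tag{78} \]
\[ \vec\omega = 2\Big(\frac{d\tau}{4\pi} + \vec{\mathcal W}\cdot d\vec y\Big)^t \wedge d\vec y - (G\,d\vec y)^t \wedge d\vec y, \qquad G = N + SVS^t, \qquad \vec{\mathcal W} = S\vec W S^t. \tag{79} \]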
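The matrix $G = N + SVS^t$ can be assembled explicitly for small $n$. The sketch below uses the convention $(S^t y)_m = y_m - y_{m-1}$ with cyclic indices, which is how the Nahm equation $\vec\rho = S^t\vec y$ fixes $S$; the entries `v[m]` stand for $1/(4\pi\rho_m)$ and are replaced by arbitrary rational stand-ins so that the comparison is exact. Function names are hypothetical. The second function writes out the entries of $G$ term by term: $\nu_m + v_m + v_{m+1}$ on the diagonal, $-v_m$ at position $(m, m-1)$ and $-v_{m+1}$ at position $(m, m+1)$.

```python
from fractions import Fraction as F

# Minimal sketch: assemble G = N + S V S^t for n >= 2 constituents.
# v[m] stands for 1/(4 pi rho_m); rational stand-ins keep the check exact.

def cyclic_difference_matrix(n):
    """S with (S^t y)_m = y_m - y_{m-1}, indices in Z/nZ."""
    S = [[0] * n for _ in range(n)]
    for m in range(n):
        S[m][m] += 1
        S[m][(m + 1) % n] -= 1
    return S

def interaction_matrix(nu, v):
    """G = N + S V S^t with N = diag(nu) and V = diag(v)."""
    n = len(nu)
    S = cyclic_difference_matrix(n)
    G = [[sum(S[a][l] * v[l] * S[b][l] for l in range(n)) for b in range(n)]
         for a in range(n)]
    for m in range(n):
        G[m][m] += nu[m]
    return G

def interaction_matrix_explicit(nu, v):
    """The same matrix, written out entry by entry."""
    n = len(nu)
    G = [[0] * n for _ in range(n)]
    for m in range(n):
        G[m][m] += nu[m] + v[m] + v[(m + 1) % n]
        G[m][(m - 1) % n] -= v[m]
        G[m][(m + 1) % n] -= v[(m + 1) % n]
    return G
```

Since the rows of $S$ sum to zero, $G$ applied to the vector $(1, \ldots, 1)^t$ returns $(\nu_1, \ldots, \nu_n)^t$: the interaction part $SVS^t$ does not act on a common shift of all constituents.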
Equivalently writing
\[ G_{m,m'} = \delta_{m,m'}\Big(\nu_m + \frac{1}{4\pi\rho_m} + \frac{1}{4\pi\rho_{m+1}}\Big) - \frac{\delta_{m-1,m'}}{4\pi\rho_m} - \frac{\delta_{m+1,m'}}{4\pi\rho_{m+1}}, \qquad m, m' \in \mathbb{Z}/n\mathbb{Z}, \tag{80} \]
reveals the form of $G$ as given in [29]; thus we confirm the conjectured form for the metric in [29]. As is readily checked, from Eqs. (22, 71) it follows that $G$ and $\vec{\mathcal W} = S\vec W S^t$ satisfy the hyperKähler conditions (26),
\[ \vec\nabla_y G = \vec\nabla_y \times \vec{\mathcal W}, \qquad \partial^i_m G_{m',m''} = \epsilon_{ijk}\,\partial^j_m\mathcal W^k_{m',m''}, \tag{81} \]
($\partial^i_m = \partial/\partial y^i_m$), which implies the Kähler forms in (79) to be closed and the caloron metric to be hyperKähler. The metric has $n$ commuting triholomorphic isometries,
\[ \frac{\partial}{\partial\tau_m}, \qquad m = 1, \ldots, n, \tag{82} \]
as $G$ and $\vec{\mathcal W}$ are $\tau$ independent. The isometries correspond to shifts on the $n$-torus $\mathbb{R}^n/(4\pi\mathbb{Z})^n$ which describe the residual $U(1)^{n-1}$ gauge invariance and the temporal position
\[ \xi_0 = \frac{1}{4\pi}\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \tau_m \in S^1, \tag{83} \]
of the caloron. Therefore, the caloron moduli space is a toric hyperKähler manifold, with dimension $4n$: $3n$ coordinates describe the monopole positions and $n$ phase angles parameterise the temporal position and residual $U(1)^{n-1}$ gauge invariance in the case of maximal symmetry breaking. From the uncentered caloron metric in Eq. (78), all other metrics discussed in this paper can be obtained by taking suitable limits. In the next subsection the caloron metric will be obtained using the hyperKähler quotient.

The non-trivial part of the metric is obtained by splitting off the center of mass coordinate $\xi$ in Eq. (41). To this aim, we express the metric in terms of $\xi$ and $n - 1$
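relative monopole position vectors $\vec\rho_m$, using that $\vec\rho_n = -\sum_{m=1}^{n-1}\vec\rho_m$ because of Eq. (39). The two sets of coordinates are related by the $n \times n$ dimensional "centering matrix" $F_c$,
\[ F_c = (S_c, Ne), \qquad \begin{pmatrix} \tilde{\vec\rho} \\ \vec\xi \end{pmatrix} = F_c^t\,\vec y. \tag{84} \]
Here, the $n \times (n - 1)$ dimensional matrix $S_c$ is obtained from $S$ by omitting its last column, and we defined $e = (1, \ldots, 1)^t \in \mathbb{R}^n$. A tilde denotes from now on the restriction to the first $n - 1$ coordinates, e.g. $\tilde{\vec\rho} = (\vec\rho_1, \ldots, \vec\rho_{n-1})^t$. New torus coordinates $\tilde\upsilon = (\upsilon_1, \ldots, \upsilon_{n-1})^t$ are introduced as well,
\[ \tau = F_c\begin{pmatrix} \tilde\upsilon \\ 4\pi\xi_0 \end{pmatrix}. \tag{85} \]
The centered metric will be again hyperKähler, as splitting off the center of mass metric amounts to taking the hyperKähler quotient under the $U(1)$ action
\[ \tau_m \to \tau_m + \nu_m t_c, \qquad m = 1, \ldots, n, \qquad t_c \in \mathbb{R}. \tag{86} \]
From Eqs. (78, 79) it is seen that this action is a triholomorphic isometry whose moment map gives the center of mass of the caloron,
\[ \vec\mu = \frac{1}{4\pi}\sum_{m \in \mathbb{Z}/n\mathbb{Z}} \nu_m\vec y_m = \frac{\vec\xi}{4\pi}. \tag{87} \]
Indeed, the phase variables $\tilde\upsilon$ are invariant under the $U(1)$ action and can serve as coordinates on the quotient, whereas the fibre coordinate $\xi_0$ changes as $\xi_0 \to \xi_0 + t_c$. In the new basis the relative metric is expressed in terms of a relative mass matrix and relative interaction potentials,
\[ F_c^{-1}G(F_c^{-1})^t = \begin{pmatrix} \tilde G_{\rm rel} & \\ & 1 \end{pmatrix}, \qquad \tilde G_{\rm rel} = \tilde M + \tilde V_{\rm rel}, \qquad (\tilde V_{\rm rel})_{mm'} = \tilde V_{mm'} + \frac{1}{4\pi|\vec\rho_n|}, \qquad (\tilde{\vec W}_{\rm rel})_{mm'} = \tilde{\vec W}_{mm'} + \frac{\vec w_n(\vec\rho_n)}{4\pi}, \tag{88} \]
where $m, m' = 1, \ldots, n - 1$ and $\vec\rho_n = -\sum_{m=1}^{n-1}\vec\rho_m$. The relative mass matrix $\tilde M$ is defined as
\[ F_c^t N^{-1}F_c = \begin{pmatrix} \tilde M^{-1} & \\ & 1 \end{pmatrix}, \qquad \tilde M^{-1} = \begin{pmatrix} \frac{1}{\nu_n} + \frac{1}{\nu_1} & -\frac{1}{\nu_1} & & & \\ -\frac{1}{\nu_1} & \frac{1}{\nu_1} + \frac{1}{\nu_2} & -\frac{1}{\nu_2} & & \\ & \ddots & \ddots & \ddots & \\ & & -\frac{1}{\nu_{n-3}} & \frac{1}{\nu_{n-3}} + \frac{1}{\nu_{n-2}} & -\frac{1}{\nu_{n-2}} \\ & & & -\frac{1}{\nu_{n-2}} & \frac{1}{\nu_{n-2}} + \frac{1}{\nu_{n-1}} \end{pmatrix}, \tag{89} \]
its explicit form allowing one to take limits that correspond to massless monopoles,
\[ \tilde M = \tilde M^t, \qquad \tilde M_{mm'} = (\nu_m + \cdots + \nu_{n-1})(1 - \nu_{m'} - \cdots - \nu_{n-1}) \ \text{ for } m \geq m', \qquad m, m' = 1, \ldots, n - 1. \tag{90} \]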
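The closed form (90) can be checked against the tridiagonal matrix in Eq. (89) by exact rational arithmetic. The sketch below assumes $\sum_{m=1}^n \nu_m = 1$, as holds for $\nu_m = \mu_{m+1} - \mu_m$ on a circle of unit circumference; the function names are hypothetical.

```python
from fractions import Fraction as F

# Minimal sketch: check that the closed form (90) inverts the tridiagonal
# matrix of Eq. (89). Assumption: the nu_m are Fractions that sum to one.

def inverse_relative_mass(nu):
    """Tridiagonal (n-1) x (n-1) matrix of Eq. (89); nu = (nu_1, ..., nu_n)."""
    n = len(nu)
    A = [[F(0)] * (n - 1) for _ in range(n - 1)]
    for m in range(n - 1):
        # diagonal: 1/nu_{m-1} + 1/nu_m, where nu[-1] is nu_n for the first row
        A[m][m] = 1 / nu[m - 1] + 1 / nu[m]
        if m + 1 < n - 1:
            A[m][m + 1] = A[m + 1][m] = -1 / nu[m]
    return A

def relative_mass(nu):
    """Closed form of Eq. (90), extended symmetrically to m < m'."""
    n = len(nu)
    tail = [sum(nu[m:n - 1], F(0)) for m in range(n - 1)]  # nu_m + ... + nu_{n-1}
    return [[tail[max(a, b)] * (1 - tail[min(a, b)]) for b in range(n - 1)]
            for a in range(n - 1)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]
```

For $n = 2$ both expressions reduce to the familiar reduced mass, $\tilde M = \nu_1\nu_2$.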
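The centered metric and Kähler forms now read
\[ g = d\xi_\mu d\xi_\mu + d\tilde{\vec\rho}^{\,t}\tilde G_{\rm rel}\cdot d\tilde{\vec\rho} + \Big(\frac{d\tilde\upsilon}{4\pi} + \tilde{\vec W}_{\rm rel}\cdot d\tilde{\vec\rho}\Big)^t\tilde G_{\rm rel}^{-1}\Big(\frac{d\tilde\upsilon}{4\pi} + \tilde{\vec W}_{\rm rel}\cdot d\tilde{\vec\rho}\Big), \]
\[ \vec\omega = 2\,d\xi_0 \wedge d\vec\xi - d\vec\xi \wedge d\vec\xi + 2\Big(\frac{d\tilde\upsilon}{4\pi} + \tilde{\vec W}_{\rm rel}\cdot d\tilde{\vec\rho}\Big)^t \wedge d\tilde{\vec\rho} - (\tilde G_{\rm rel}\,d\tilde{\vec\rho})^t \wedge d\tilde{\vec\rho}. \tag{91} \]
The first terms give the center of mass metric on $\mathbb{R}^3 \times S^1$, the other terms represent the non-trivial part of the metric. Both are toric hyperKähler, and have an $SO(3)$ invariance corresponding to spatial rotations.

4.4. HyperKähler quotient construction. We follow the approach in [35] for BPS monopoles of type $(1, 1, \ldots, 1)$ and consider the natural metric and Kähler forms on the space of caloron Nahm data $\hat{\mathcal A}$, compare Eq. (67),
\[ g_{\hat{\mathcal A}} = \tfrac12\,\mathrm{tr}_2\big(\langle d\hat A^\dagger d\hat A\rangle + 2\langle d\hat\lambda^\dagger\rangle\langle d\hat\lambda\rangle\big), \qquad \omega^i_{\hat{\mathcal A}} = \tfrac12\,\mathrm{tr}_2\,\bar\sigma_i\big(\langle d\hat A^\dagger \wedge d\hat A\rangle + 2\langle d\hat\lambda^\dagger\rangle \wedge \langle d\hat\lambda\rangle\big). \tag{92} \]
One then notes that the group $\hat{\mathcal G}$ of $U(1)$ gauge transformations on $\hat S^1$ acts triholomorphically on $\hat{\mathcal A}$. The zero set of the associated moment map is formed by the set $\mathcal N$ of solutions to the Nahm equations, which after quotienting by the $U(1)$ gauge action $\hat{\mathcal G}$ on the dual $S^1$ gives the moduli space of Nahm data. By virtue of Eq. (67) this quotient is isometric to the caloron moduli space,
\[ \mathcal M = \mathcal N/\hat{\mathcal G}. \tag{93} \]
As both $\mathcal N$ and $\hat{\mathcal G}$ are infinite dimensional, it is not obvious that this procedure is well-defined. However, using the gauge action we can restrict to those solutions $\mathcal N_0$ to the Nahm equations which have constant $\hat A_0(z)$ on the subintervals $(\mu_m, \mu_{m+1})$. As the Nahm equations force $\hat A_i(z)$ to be piecewise constant, there are $n$ quaternions specifying the Nahm connection, denoted by $y \in \mathbb{H}^n$. The singularities (or matching data) are described by $n$ complex two-component vectors $\zeta_m$, denoted by $\zeta \in \mathbb{C}^{n,2}$. Hence, $\mathcal N_0$ is a subset of the space $\hat{\mathcal A}_0 = \mathbb{H}^n \times \mathbb{C}^{n,2}$ of possible piecewise constant data, which has metric and Kähler forms
\[ g = \tfrac12\,\mathrm{Tr}\,\mathrm{tr}_2\big(dy^\dagger N dy + 2\,d\zeta^\dagger d\zeta\big), \qquad \omega^i = \tfrac12\,\mathrm{Tr}\,\mathrm{tr}_2\,\bar\sigma_i\big(dy^\dagger \wedge N dy + 2\,d\zeta^\dagger \wedge d\zeta\big), \tag{94} \]
as is natural from Eq. (67). On $\hat{\mathcal A}_0$, the gauge action $\hat{\mathcal G}$ is restricted to the set $\hat{\mathcal G}_0$ of gauge functions with piecewise linear and continuous log. These are determined by the values $h$ assumes at $z = \mu_m$. Under these gauge transformations, $\hat A$ and $\zeta$ change according to
\[ \zeta_m \to e^{it_m}\zeta_m, \qquad \psi \to \psi + 2t, \qquad y \to y - \frac{1}{2\pi}N^{-1}St, \tag{95} \]
where $t = (h(\mu_1), \ldots, h(\mu_n)) \in \mathbb{R}^n/(2\pi\mathbb{Z})^n$ and $\psi = (\psi_1, \ldots, \psi_n) \in \mathbb{R}^n/(4\pi\mathbb{Z})^n$ denotes the phases of $\zeta$. The lattices correspond to gauge transformations of type (44). Therefore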
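the action of the restriction $\hat{\mathcal G}_0$ of $\hat{\mathcal G}$ on $\hat{\mathcal A}_0$ is equivalent to an $\mathbb{R}^n/(2\pi\mathbb{Z})^n$ action on $\mathbb{H}^n \times \mathbb{C}^{n,2}$. Thus we reduced the infinite dimensional hyperKähler quotient to a finite dimensional one. This technique was also used for the $(1, 1, \ldots, 1)$ monopole metric [35]. The metric on the moduli space of Nahm data can now be computed as a metric on a hyperKähler quotient of a finite dimensional euclidean space by a toric group action. To do this we follow [15]. From the metric and Kähler forms on $\hat{\mathcal A}_0$, determined by inserting Eqs. (6, 23) in Eq. (94),
\[ ds^2 = dy^\dagger N dy + d\vec\rho^{\,t}V\cdot d\vec\rho + \Big(\frac{d\psi}{4\pi} + \vec W\cdot d\vec\rho\Big)^t V^{-1}\Big(\frac{d\psi}{4\pi} + \vec W\cdot d\vec\rho\Big), \]
\[ \vec\omega = -(N d\vec y)^t \wedge d\vec y + 2(N dy_0)^t \wedge d\vec y + 2\Big(\frac{d\psi}{4\pi} + \vec W\cdot d\vec\rho\Big)^t \wedge d\vec\rho - (V d\vec\rho)^t \wedge d\vec\rho, \tag{96} \]
the action (95) is seen to be triholomorphic. The moment map for this $\mathbb{R}^n/(2\pi\mathbb{Z})^n$ action reads
\[ \vec\mu\cdot\vec\sigma = -\frac{1}{4\pi}S^t(y - \bar y) - i\zeta^\dagger P\zeta, \tag{97} \]
where $P = (P_1, \ldots, P_n)^t$, and has a zero set $\vec\mu^{-1}(0)$ given by the solutions $\hat A$ corresponding to $\vec\rho = S^t\vec y$. Therefore, the space of piecewise constant solutions to the Nahm data is $(\hat A, \zeta) \in \mathcal N_0 = \vec\mu^{-1}(0) \subset \hat{\mathcal A}_0$. The moduli space of Nahm data is this set quotiented by the reduction of the gauge action in Eq. (42), or equivalently $\mathbb{R}^n$. Hence
\[ \mathcal M = \mathcal N/\hat{\mathcal G} = \mathcal N_0/\hat{\mathcal G}_0 = \vec\mu^{-1}(0)/\big(\mathbb{R}^n/(2\pi\mathbb{Z})^n\big). \tag{98} \]
The metric on $\vec\mu^{-1}(0)$ reads
\[ ds^2 = d\vec y^{\,t}(SVS^t + N)\cdot d\vec y + \Big(\frac{d\psi}{4\pi} + \vec W\cdot S^t d\vec y\Big)^t V^{-1}\Big(\frac{d\psi}{4\pi} + \vec W\cdot S^t d\vec y\Big) + dy_0^t N dy_0, \tag{99} \]
\[ \vec\omega = 2\Big(\frac{d\tau}{4\pi} + \vec{\mathcal W}\cdot d\vec y\Big)^t \wedge d\vec y - (G\,d\vec y)^t \wedge d\vec y, \tag{100} \]
with $G = N + SVS^t$ and $\vec{\mathcal W} = S\vec W S^t$. The $n$ vector
\[ \frac{\tau}{4\pi} = S\frac{\psi}{4\pi} + Ny_0, \tag{101} \]
is invariant under the $\mathbb{R}^n/(2\pi\mathbb{Z})^n$ action (95) and can therefore be used as coordinate on the quotient $\vec\mu^{-1}(0)/\mathbb{R}^n = \mathcal M$, together with $\vec y$. Cotangent vectors involving $d\psi$ have a vertical component, i.e. lie along the $\mathbb{R}^n/(2\pi\mathbb{Z})^n$ fibre. The horizontal and vertical part of the metric are separated by inserting $y_0 = \frac{1}{4\pi}N^{-1}(\tau - S\psi)$ and completing the squares to obtain
\[ ds^2 = d\vec y^{\,t}G\cdot d\vec y + \frac{d\tau^t}{4\pi}N^{-1}\frac{d\tau}{4\pi} + d\vec y^{\,t}S\cdot\vec W V^{-1}\vec W\cdot S^t d\vec y \]
\[ \qquad - \Big(S^tN^{-1}\frac{d\tau}{4\pi} - V^{-1}\vec W S^t\cdot d\vec y\Big)^t\big(V^{-1} + S^tN^{-1}S\big)^{-1}\Big(S^tN^{-1}\frac{d\tau}{4\pi} - V^{-1}\vec W S^t\cdot d\vec y\Big) + \phi^t\big(V^{-1} + S^tN^{-1}S\big)\phi, \tag{102} \]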
where the one form $\varphi$ denotes the component along the $\mathbb R^n/(2\pi\mathbb Z)^n$ fibre
$$\varphi = \frac{d\psi}{4\pi} + (V^{-1} + S^t N^{-1} S)^{-1} V^{-1} \vec W \cdot S^t d\vec y - (V^{-1} + S^t N^{-1} S)^{-1} S^t N^{-1} \frac{d\tau}{4\pi}. \tag{103}$$
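The completion of squares leading from Eq. (99) to Eqs. (102, 103) can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the one-forms $d\psi/4\pi$, $d\tau/4\pi$ and $\vec W \cdot S^t d\vec y$ are replaced by arbitrary numerical vectors `a`, `b`, `w`; $S$ is taken to be a cyclic difference matrix for $n = 3$; and the diagonal entries chosen for $N^{-1}$ and $V^{-1}$ are hypothetical test values, not data from the paper.

```python
from fractions import Fraction as Fr

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else Fr(0) for j in range(n)] for i in range(n)]

def solve(K, c):
    """Solve K x = c exactly by Gauss-Jordan elimination (K square, invertible)."""
    n = len(K)
    M = [list(K[i]) + [c[i]] for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def square_completion_sides():
    """Return both sides of the identity behind Eq. (102) for test data."""
    n = 3
    Ninv = diag([Fr(2), Fr(3), Fr(6)])      # hypothetical: masses 1/2, 1/3, 1/6
    Vinv = diag([Fr(2), Fr(5), Fr(7)])      # hypothetical stand-ins for 4*pi*rho_m
    S = [[Fr(0)] * n for _ in range(n)]     # assumed cyclic difference matrix
    for m in range(n):
        S[m][m] += 1
        S[(m - 1) % n][m] -= 1
    St = transpose(S)
    a = [Fr(3), Fr(-1), Fr(4)]              # plays the role of dpsi/(4 pi)
    b = [Fr(2), Fr(7), Fr(-5)]              # plays the role of dtau/(4 pi)
    w = [Fr(1), Fr(-2), Fr(6)]              # plays the role of W . S^t dy
    StNinvS = matmul(matmul(St, Ninv), S)
    K = [[Vinv[i][j] + StNinvS[i][j] for j in range(n)] for i in range(n)]
    c = [p - q for p, q in zip(matvec(St, matvec(Ninv, b)), matvec(Vinv, w))]
    Kinv_c = solve(K, c)
    phi = [ai - ki for ai, ki in zip(a, Kinv_c)]           # Eq. (103)
    a_plus_w = [ai + wi for ai, wi in zip(a, w)]
    N_dy0 = [bi - si for bi, si in zip(b, matvec(S, a))]   # N dy0 = b - S a
    lhs = dot(a_plus_w, matvec(Vinv, a_plus_w)) + dot(N_dy0, matvec(Ninv, N_dy0))
    rhs = (dot(b, matvec(Ninv, b)) + dot(w, matvec(Vinv, w))
           - dot(c, Kinv_c) + dot(phi, matvec(K, phi)))
    return lhs, rhs
```

The term $d\vec y^{\,t} G \cdot d\vec y$ is common to both sides and is therefore left out; only the $\psi$- and $y_0$-dependent parts of Eq. (99) are compared with the corresponding terms of Eq. (102).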
Horizontally projecting to the metric on $\vec\mu^{-1}(0)/(\mathbb R^n/(2\pi\mathbb Z)^n)$ amounts to discarding the last term in Eq. (102) and one obtains (after reorganising) the metric on the caloron moduli space $\mathcal M$ given in Eq. (78). For the Kähler forms, this projection is generally not necessary: Eq. (100) is precisely the Kähler form in Eq. (79). This is a manifestation of the degeneracy of the Kähler forms along the gauge orbit, needed for the hyperKähler quotient to be well defined.

5. Instanton and Monopole Limits of the Caloron

From the caloron metric, other toric hyperKähler manifolds can be obtained by taking suitable limits. For large $T$ or equivalently all $\rho_m$ small, one expects the metric to approach the moduli space for $k = 1$ $SU(n)$ instantons on $\mathbb R^4$. To study this limit, we consider the centered metric Eq. (91). For small $\rho_m$, the elements of the relative mass matrix $\tilde M$ in Eq. (88) are dominated by the $\rho_m^{-1}$ terms in $\tilde V_{\rm rel}$,
$$F_c^{-1} G (F_c^{-1})^t = \begin{pmatrix} \tilde G_{\rm rel} & \\ & 1 \end{pmatrix} \to \begin{pmatrix} \tilde V_{\rm rel} & \\ & 1 \end{pmatrix}, \qquad \rho_m \to 0, \quad m = 1, \ldots, n - 1, \tag{104}$$
resulting in the asymptotic form for the non-trivial part of the metric and Kähler forms
$$g_{\rm limit} = d\tilde{\vec\rho}^{\,t}\, \tilde V_{\rm rel} \cdot d\tilde{\vec\rho} + \left(\frac{d\tilde\upsilon}{4\pi} + \tilde{\vec W}_{\rm rel} \cdot d\tilde{\vec\rho}\right)^t \tilde V_{\rm rel}^{-1} \left(\frac{d\tilde\upsilon}{4\pi} + \tilde{\vec W}_{\rm rel} \cdot d\tilde{\vec\rho}\right),$$
$$\vec\omega_{\rm limit} = 2\left(\frac{d\tilde\upsilon}{4\pi} + \tilde{\vec W}_{\rm rel} \cdot d\tilde{\vec\rho}\right)^t \wedge d\tilde{\vec\rho} - (\tilde V_{\rm rel}\, d\tilde{\vec\rho})^t \wedge d\tilde{\vec\rho}. \tag{105}$$
The caloron with trivial gauge holonomy has the same limiting metric, as follows directly from taking the limit $\nu_1, \ldots, \nu_{n-1} \to 0$, $\nu_n \to 1$ of the caloron relative mass matrix in Eq. (90). The phase variables are now given by $\upsilon_m = \tau_m + \ldots + \tau_{n-1} \in \mathbb R/(4\pi\mathbb Z)$, cf. Eq. (85). The Kähler forms $\vec\omega_{\rm limit}$ are closed, since the hyperKähler conditions (26) are satisfied
$$\vec\nabla_{\rho} \times \tilde{\vec W}_{\rm rel} = \vec\nabla_{\rho}\, \tilde G_{\rm rel}, \tag{106}$$
hence the limiting metric for large T is hyperKähler. It is known as the Calabi metric. This limit was discussed in [29] using indirect arguments. With the techniques presented in this paper, it is easy to prove explicitly that the limiting metric is indeed the metric for both the ordinary k = 1 SU (n) instantons on R4 and the calorons with trivial holonomy. It follows immediately when realising that the 4(n − 1) dimensional Calabi space can be obtained as the hyperKähler quotient of Hn by a U (1) action [15]. This quotient emerges naturally from both the construction of the charge one SU (n) instanton and the trivial holonomy caloron. First note that there is a one to one correspondence between the ADHM data of the k = 1 SU (n) instanton and the Nahm data of the trivial holonomy caloron in the Gˆ gauge with constant Aˆ 0 (z). The latter are
given in terms of $(\xi, \zeta) \in \mathbb H \times \mathbb C^{n,2}$ as $\hat A(z) = 2\pi i \xi$, $\hat\lambda(z) = \delta(z)\zeta$ and directly translate into ADHM data $\lambda = \zeta$, $B = \xi$ for the instanton. With only one subinterval, the metric on the Nahm data now reduces to the expression for the instanton (8). Having restricted to constant $\hat A_0(z)$, the remaining transformations in $\hat{\mathcal G}_0$ leave $\xi$ invariant, apart from confining $\xi_0$ to the circle through $g(z) = \exp(2\pi i p z)$, $p \in \mathbb Z$. For their action on the matching data only the $U(1)$ formed by the values $g(0)$ is relevant. Therefore, in both cases the nontrivial part of the moduli space is the quotient of $\mathbb C^{n,2}$ (with $g = \tfrac12 \mathrm{tr}_2\, d\zeta^\dagger d\zeta$, $\omega_i = \tfrac12 \mathrm{tr}_2(\bar\sigma_i\, d\zeta^\dagger \wedge d\zeta)$) by the $U(1)$ action
$$\zeta_m \to e^{it}\zeta_m, \qquad \psi_m \to \psi_m + 2t, \qquad m = 1, \ldots, n, \qquad t \in \mathbb R/(2\pi\mathbb Z). \tag{107}$$
(Identifying $\mathbb C^2$ and $\mathbb H$, this quotient is readily seen to be equivalent to that discussed in eq. (36) of [15]). The corresponding moment map, zero set and invariants are given by
$$\vec\mu = \frac{1}{2\pi} \sum_{m \in \mathbb Z/n\mathbb Z} \vec\rho_m, \qquad \sum_{m \in \mathbb Z/n\mathbb Z} \vec\rho_m = 0, \qquad \tilde\upsilon_m = \psi_m - \psi_n, \quad m = 1, \ldots, n - 1. \tag{108}$$
Expressing the metric on the zero set in terms of invariants and the terms involving $d\psi_n$ describing the fibre part, one obtains [15] the Calabi metric in Eq. (105). The Calabi metric has an $SU(n)$ triholomorphic isometry, reflecting the $SU(n)$ gauge symmetry of the $k = 1$ instanton and trivial holonomy caloron. As explained in Sect. 2 for the instanton, it emerges as the $SU(n)$ acting on $\zeta$ in Eq. (107) on the left, commuting with $U(1)$, and descending to the quotient. A direct calculation using a compensating gauge transformation gives the same result.

In [23, 25], it was explicitly shown from the action density that removing one of the constituent monopoles of the caloron to spatial infinity, $|\vec y_n| \to \infty$, turns it into a static selfdual $SU(n)$ solution, i.e. a monopole in the BPS limit. Indeed, this limit corresponds to the compactification length going to zero. The Nahm data suggest that the remnant is the (1, 1, ..., 1) monopole. We will show indeed that the metric in this limit has the required form. Removing a constituent is described by a hyperKähler quotient. Consider the $U(1)$ action that changes the phase of the $m$th monopole in the uncentered caloron
$$\tau_m \to \tau_m + t, \qquad t \in \mathbb R/(4\pi\mathbb Z). \tag{109}$$
It is a triholomorphic isometry as follows from Eqs. (78, 79). Its moment map $\vec\mu_{\rm fix}$ is exactly the position of the $m$th monopole, $\vec\mu_{\rm fix} = \vec y_m/(4\pi)$. Therefore, the metric on the quotient, the caloron moduli space with the $m$th constituent fixed, is hyperKähler irrespective of its position. For finite $|\vec y_m|$, the resulting metric on the quotient $\vec\mu_{\rm fix}^{-1}(\vec y_m)/\mathbb R$ is complicated, and no longer $SO(3)$ symmetric. Removing the constituent, $|\vec y_m| \to \infty$, i.e. fixing it at spatial infinity, gives the hyperKähler metric of the remnant BPS monopole, with a simple form and $SO(3)$ symmetry restored. The metric and Kähler forms with the $n$th monopole far away, in which case $\rho_1^{-1}, \rho_n^{-1} \to 0$, reads
$$(g, \vec\omega) = (g_n, \vec\omega_n) + (g_m, \vec\omega_m). \tag{110}$$
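Here the removed monopole is described by the metric $g_n = \nu_n\, d\vec y_n^{\,2} + \nu_n^{-1}\, d\tau_n^2/(4\pi^2)$ and Kähler forms $\vec\omega_n = d\tau/(2\pi) \wedge d\vec y_n - \nu_n\, d\vec y_n \wedge d\vec y_n$, and the remnant by
$$g_m = d\vec y_m^{\,t}\, G_m \cdot d\vec y_m + \left(\frac{d\tau_m}{4\pi} + \vec{\mathcal W}_m \cdot d\vec y_m\right)^t G_m^{-1} \left(\frac{d\tau_m}{4\pi} + \vec{\mathcal W}_m \cdot d\vec y_m\right),$$
$$\vec\omega_m = -(G_m\, d\vec y_m)^t \wedge d\vec y_m + 2\left(\frac{d\tau_m}{4\pi} + \vec{\mathcal W}_m \cdot d\vec y_m\right)^t \wedge d\vec y_m, \tag{111}$$
where
$$G_m = N_m + S_m V_m S_m^t, \qquad \vec{\mathcal W}_m = S_m \vec W_m S_m^t,$$
$$\vec W_m = \mathrm{diag}(\vec w_2(\vec\rho_2), \ldots, \vec w_{n-1}(\vec\rho_{n-1}))/(4\pi), \qquad V_m^{-1} = 4\pi\, \mathrm{diag}(\rho_2, \ldots, \rho_{n-1}),$$
$$N_m = \mathrm{diag}(\nu_1, \ldots, \nu_{n-1}), \qquad \vec\rho_m = (\vec\rho_2, \ldots, \vec\rho_{n-1})^t, \qquad \vec y_m = (\vec y_1, \ldots, \vec y_{n-1})^t, \qquad \tau_m = (\tau_1, \ldots, \tau_{n-1})^t,$$
$$S_m = \begin{pmatrix} -1 & & & \\ 1 & -1 & & \\ & \ddots & \ddots & \\ & & 1 & -1 \\ & & & 1 \end{pmatrix} \in \mathbb R^{n-1,n-2}. \tag{112}$$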
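More explicitly, the potential term in Eq. (111) reads
$$4\pi S_m V_m S_m^t = \begin{pmatrix} \frac{1}{\rho_2} & -\frac{1}{\rho_2} & & & \\ -\frac{1}{\rho_2} & \frac{1}{\rho_2} + \frac{1}{\rho_3} & -\frac{1}{\rho_3} & & \\ & \ddots & \ddots & \ddots & \\ & & -\frac{1}{\rho_{n-2}} & \frac{1}{\rho_{n-2}} + \frac{1}{\rho_{n-1}} & -\frac{1}{\rho_{n-1}} \\ & & & -\frac{1}{\rho_{n-1}} & \frac{1}{\rho_{n-1}} \end{pmatrix} \tag{113}$$
$$\in \mathbb R^{n-1,n-1}. \tag{114}$$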
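The tridiagonal structure of the potential term follows mechanically from the definitions of $S_m$ and $V_m$ in Eq. (112). A minimal sketch of that bookkeeping, using exact fractions, is given below; the function names and the sample separations $\rho_2, \rho_3, \rho_4$ are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as Fr

def build_S(n):
    """(n-1) x (n-2) matrix S_m: column k has -1 in row k and +1 in row k+1,
    so that (S_m^t y)_k = y_{k+1} - y_k (0-based indices)."""
    S = [[Fr(0)] * (n - 2) for _ in range(n - 1)]
    for k in range(n - 2):
        S[k][k] = Fr(-1)
        S[k + 1][k] = Fr(1)
    return S

def potential_matrix(rho):
    """Return 4*pi*S_m V_m S_m^t for V_m = diag(1/rho_j)/(4*pi),
    with rho = [rho_2, ..., rho_{n-1}] given as Fractions."""
    n = len(rho) + 2
    S = build_S(n)
    return [[sum(S[i][k] * S[j][k] / rho[k] for k in range(n - 2))
             for j in range(n - 1)] for i in range(n - 1)]

def expected_tridiagonal(rho):
    """Assemble the matrix of Eq. (113) entry by entry."""
    n = len(rho) + 2
    T = [[Fr(0)] * (n - 1) for _ in range(n - 1)]
    for k, r in enumerate(rho):          # each separation couples rows k and k+1
        T[k][k] += 1 / r
        T[k + 1][k + 1] += 1 / r
        T[k][k + 1] -= 1 / r
        T[k + 1][k] -= 1 / r
    return T
```

For instance, with `rho = [Fr(2), Fr(3), Fr(5)]` (so $n = 5$) the two functions return the same $4 \times 4$ matrix, whose first row is $(1/2, -1/2, 0, 0)$.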
The vector potential $\vec{\mathcal W}_m$ has a similar structure. The metric in Eq. (111) is that of the uncentered $SU(n)$ monopole of type (1, 1, ..., 1). The calculation of the metric on its space of Nahm data was performed in [15, 35]. Details on the Nahm construction of the (1, 1, ..., 1) monopole and a proof of its isometric property as well as an outline of the calculation of the metric can be found in the appendix. To connect with [27], we have to center the monopole. We introduce
$$F_m = \left(S_m,\ \tfrac{1}{\nu} N_m e_m\right) \in \mathbb R^{n-1,n-1}, \tag{115}$$
where $e_m = (1, \ldots, 1)^t \in \mathbb R^{n-1}$ and $\nu = \sum_{m=1}^{n-1} \nu_m$ denotes the mass of the monopole. The relative position variables $\vec\rho_m$ are reinstated and the center of mass $\mathbb R^3$ position is separated off using
$$\vec y_m = (F_m^t)^{-1} \begin{pmatrix} \vec\rho_m \\ \vec\xi_m \end{pmatrix}, \qquad \vec\xi_m = \frac{1}{\nu} \sum_{m=1}^{n-1} \nu_m \vec y_m. \tag{116}$$
The mass matrix in this basis is given by
$$F_m^{-1} N_m (F_m^t)^{-1} = \begin{pmatrix} M_m & \\ & \nu \end{pmatrix}, \qquad M_m^t = M_m,$$
$$M_m^{-1} = \begin{pmatrix} \frac{1}{\nu_1} + \frac{1}{\nu_2} & -\frac{1}{\nu_2} & & & \\ -\frac{1}{\nu_2} & \frac{1}{\nu_2} + \frac{1}{\nu_3} & -\frac{1}{\nu_3} & & \\ & \ddots & \ddots & \ddots & \\ & & -\frac{1}{\nu_{n-3}} & \frac{1}{\nu_{n-3}} + \frac{1}{\nu_{n-2}} & -\frac{1}{\nu_{n-2}} \\ & & & -\frac{1}{\nu_{n-2}} & \frac{1}{\nu_{n-2}} + \frac{1}{\nu_{n-1}} \end{pmatrix},$$
$$(M_m)_{m,m'} = \nu^{-1} (\nu_1 + \cdots + \nu_m)(\nu_{m'+1} + \cdots + \nu_{n-1}), \qquad \text{for } m' \ge m. \tag{117}$$
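The block structure in Eq. (117) and the closed form for $(M_m)_{m,m'}$ can be cross-checked with exact arithmetic. The sketch below works with the inverse relation $F_m^t N_m^{-1} F_m = \mathrm{diag}(M_m^{-1}, \nu^{-1})$, which is equivalent to the first line of Eq. (117) and avoids inverting $F_m$. The helper names and the sample masses are assumptions made for illustration.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def build_F(nu):
    """F_m = (S_m, N_m e_m / nu) of Eq. (115); nu = [nu_1, ..., nu_{n-1}]."""
    k = len(nu)
    total = sum(nu)
    F = [[Fr(0)] * k for _ in range(k)]
    for j in range(k - 1):               # columns of S_m
        F[j][j] = Fr(-1)
        F[j + 1][j] = Fr(1)
    for i in range(k):                   # last column: N_m e_m / nu
        F[i][k - 1] = nu[i] / total
    return F

def centred_mass_blocks(nu):
    """Return F^t N^{-1} F and its upper-left block, which should be M_m^{-1}."""
    k = len(nu)
    F = build_F(nu)
    Ninv = [[(1 / nu[i] if i == j else Fr(0)) for j in range(k)] for i in range(k)]
    G = matmul(matmul(transpose(F), Ninv), F)
    Minv = [row[:k - 1] for row in G[:k - 1]]
    return G, Minv

def closed_form_M(nu):
    """(M_m)_{m,m'} = (nu_1+...+nu_m)(nu_{m'+1}+...+nu_{n-1})/nu for m' >= m."""
    k = len(nu)
    total = sum(nu)
    M = [[Fr(0)] * (k - 1) for _ in range(k - 1)]
    for a in range(1, k):
        for b in range(a, k):
            val = sum(nu[:a]) * sum(nu[b:]) / total
            M[a - 1][b - 1] = M[b - 1][a - 1] = val
    return M
```

With the hypothetical masses `nu = [Fr(1), Fr(2), Fr(3), Fr(4)]`, the off-diagonal blocks of `G` vanish, the lower-right entry equals $1/\nu$, and multiplying `Minv` by `closed_form_M(nu)` gives the identity matrix.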
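Furthermore, alternative torus coordinates $\chi_m = (\chi_1, \ldots, \chi_{n-2})$ are introduced, as well as a global $U(1)$ phase $\xi_{0,m}$,
$$\tau_m = F_m \begin{pmatrix} \chi_m \\ \xi_{0,m} \end{pmatrix}, \qquad \xi_{0,m} = \sum_{m=1}^{n-1} \tau_m. \tag{118}$$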
In the new coordinates, the uncentered metric is the sum of the center of mass and relative metric
$$g_m = \nu\, d\vec\xi_m \cdot d\vec\xi_m + \nu^{-1}\, d\xi_{0,m}^2 + g_m^c, \tag{119}$$
where the nontrivial part
$$g_m^c = d\vec\rho_m^{\,t} (M_m + V_m) \cdot d\vec\rho_m + \left(\frac{d\chi_m}{4\pi} + \vec W_m \cdot d\vec\rho_m\right)^t (M_m + V_m)^{-1} \left(\frac{d\chi_m}{4\pi} + \vec W_m \cdot d\vec\rho_m\right) \tag{120}$$
is the Lee–Weinberg–Yi metric [27]. It is of toric hyperKähler form. Thus we proved that the (1, 1, ..., 1) monopole is a limit of the caloron, identifying the static remnant in [24, 25]. Finally, we note that the (1, 1, ..., 1) monopole has only one magnetic winding, as explained in the introduction. It is opposite to the winding of the removed monopole, and hence, we can apply the reasoning in [43] explaining how the instanton charge arises also for $SU(n)$ from braiding two monopoles [23].

6. Discussion

Since the metric describes the Lagrangian for adiabatic motion on the moduli space [33], it reflects the interactions of the monopole constituents. The constituent nature of the caloron solution, easily extracted from the action density, should therefore also be reflected in the metric. The action density of the $k = 1$ $SU(n)$ caloron [24] is derived from Eq. (15) employing Green's function techniques and reads
$$-\tfrac12 \mathrm{Tr} F_{\mu\nu}^2 = -\tfrac12 \partial_\mu^2 \partial_\nu^2 \log O. \tag{121}$$
Here the positive scalar potential $O$ is defined as
$$O(x) = \tfrac12 \mathrm{tr} \prod_{m=1}^{n} A_m - \cos(2\pi x_0), \tag{122}$$
where
$$A_m = \frac{1}{r_m} \begin{pmatrix} r_m & |\vec y_m - \vec y_{m+1}| \\ 0 & r_{m+1} \end{pmatrix} \begin{pmatrix} c_m & s_m \\ s_m & c_m \end{pmatrix}, \tag{123}$$
given in terms of the center of mass radii $r_m = |\vec x - \vec y_m|$ of the $m$th constituent monopole, $c_m = \cosh(2\pi\nu_m r_m)$, $s_m = \sinh(2\pi\nu_m r_m)$ and $\prod_{m=1}^{n} A_m = A_n \cdots A_1$. The energy density for the (1, 1, ..., 1) monopole is obtained from it by sending the $n$th constituent to infinity, which gives [25]
$$E(\vec x) = -\tfrac12 \mathrm{tr} F_{\mu\nu}^2(\vec x) = -\tfrac12 \Delta^2 \log \tilde O_m(\vec x), \tag{124}$$
$$\tilde O_m(\vec x) = \tfrac12 \mathrm{tr} \left\{ \frac{1}{r_{n-1}} \begin{pmatrix} s_{n-1} & c_{n-1} \\ 0 & 0 \end{pmatrix} \prod_{m=1}^{n-2} A_m \right\} \tag{125}$$
(see [31] for some special cases). These densities allow for an unambiguous identification of elementary BPS monopoles as constituents of calorons, and (1, 1, ..., 1) monopoles, as in the limit where $r_m \ll r_l$ for all $l \neq m$ the action density approaches that of the single BPS monopole [24]. The corresponding limit in the uncentered metrics reveals
$$ds^2|_m = \nu_m\, d\vec y_m \cdot d\vec y_m + \frac{1}{\nu_m}\, d\tau_m^2 \tag{126}$$
for the part describing the mth constituent, as all interaction potentials approach zero with the other constituents far away. Equation (126) is the flat metric on R3 × S 1 , the twofold cover of the moduli space for the elementary BPS monopole. Therefore the limit of the (cover of the) caloron moduli space – corresponding to all monopoles well separated – can be seen as a product of elementary BPS monopole moduli spaces. We obtained the metric for the k = 1 SU (n) caloron assuming symmetry breaking to the maximal torus U (1)n−1 with arbitrarily chosen holonomy eigenvalues µm . In the situation of non-maximal breaking, some of the eigenvalues of the holonomy become equal, resulting in some monopoles acquiring zero mass. The form of the relative mass matrices defined as inverses suggests that dramatic things happen when one or more of the constituents acquire zero mass. However, as is clear from the explicit forms of M, Mm in Eqs. (71, 90, 112, 117), all limits can be taken smoothly. This assertion was explicitly checked for the trivial holonomy caloron, with all but one monopole having zero mass. Therefore one can study most efficiently all symmetry breaking patterns, both for k = 1 calorons and for monopoles of type (1, 1, . . . 1), just by inserting the proper values for µm , rather than having to calculate the metric for each case separately. Consider, both for the caloron and for the (1, 1, . . . , 1) monopole, the situation of N − 1 monopoles turning massless νK , . . . , νK+N−2 = 0,
µK = . . . = µK+N−1 ,
(127)
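resulting in an enhanced residual symmetry to $SU(N) \times U(1)^{n-N}$. The corresponding center of mass radii no longer appear in the expression for the action and energy densities [24], as follows from
$$\prod_{m=1}^{n} A_m \to \prod_{m=K+N-1}^{n} A_m\; \frac{1}{r_{K-1}} \begin{pmatrix} r_{K-1} & R_c \\ 0 & r_{K+N-1} \end{pmatrix} \begin{pmatrix} c_{K-1} & s_{K-1} \\ s_{K-1} & c_{K-1} \end{pmatrix} \prod_{m=1}^{K-2} A_m. \tag{128}$$
Here
$$R_c = |\vec\rho_K| + \ldots + |\vec\rho_{K+N-1}| = \pi\, \mathrm{tr}_2 \sum_{m=K}^{K+N-1} \zeta_{(m)}^\dagger \zeta_{(m)} \tag{129}$$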
denotes what is known in the monopole literature as the "non-abelian cloud" parameter [28]. It is seen from the right-hand side of Eq. (129) that it is $SU(N)$ invariant. From the ADHM-Nahm construction (28, 29), this $SU(N)$ symmetry is seen to leave the holonomy invariant. It will descend to the quotient in the hyperKähler quotient construction of the metric, and therefore, the metric will be $SU(N)$ invariant as well, much like in the case of the trivial holonomy caloron. As the explicit form of the metric can readily be found by inserting Eq. (127) in the mass matrices (71, 112), it will not be given here. The $SU(N)$ transformation mixes the positions of the massless monopoles, which therefore do not exist as individual particles. A way of seeing this physically is that the intrinsic length scales of the monopoles, proportional to their inverse masses, become infinitely large as their masses become small, so that they overlap and lose their identities. This appearance of massless particles and infinite length scales illustrates a very general feature of systems near a transition to a more symmetric phase.

The fact that the $SU(n+1)$ (1, 1, ..., 1) monopole and the $SU(n)$ $k = 1$ caloron both consist of $n$ constituent BPS monopoles, in combination with the fact that the former can be obtained from an $SU(n+1)$ caloron, suggests a great similarity between their metrics. We consider the relevant situation for quantum chromodynamics, the $SU(3)$ caloron. Removing one monopole to infinity gives the $SU(3)$ monopole of type (1, 1). There remain two constituents, of masses proportional to $\nu_1$, $\nu_2$. The relative metric of the (1, 1) monopole is Taub-NUT with positive mass parameter,
$$g_{TN} = U(\rho)\, d\vec\rho^{\,2} + U(\rho)^{-1} \left(\frac{d\psi}{4\pi} + \frac{Q\, \vec w(\vec\rho)}{4\pi} \cdot d\vec\rho\right)^2, \qquad U(\rho) = \frac{\nu_1 \nu_2}{\nu_1 + \nu_2} + \frac{Q}{4\pi |\vec\rho|}, \tag{130}$$
ρ denoting the separation of the constituents, Q = 1. The relative metric for the SU (2) caloron is also a Taub-NUT [22, 23]. (The metric obtained there checks with Eq. (130) apart from the normalisation 4π 2 , as πρ 2 , ϒ in [23] is related to |r |, ψ in Eq. (130)). However, the interaction strength, depending on the distance between the monopoles, for the caloron is Q = 2, twice that of the SU (3) monopole. Both solitons can be considered as built out of two interacting constituent BPS monopoles, and have a four-dimensional relative moduli space. As each matching point in the Nahm construction gives rise to an interaction between monopoles of distinct type, this is to be expected. The SU (3) (1, 1) monopole has one matching point, at z = µ2 whereas the SU (2) caloron has, in the situation of two constituents, one additional, equal to the other, at z = µ1 + 1 to close the circle. In [26] this was attributed to the fact that the constituent monopoles in the
SU (3) (1,1) case are charged with respect to different U (1), whereas for the caloron, they are oppositely charged with respect to the same U (1), generated by ω · τ. In conclusion, we have presented results for the metric on moduli spaces in a unified description that incorporates instantons, calorons and monopoles. Acknowledgements. Pierre van Baal is gratefully acknowledged for discussions and critically reading earlier versions of the manuscript. Conversations and correspondence with Conor Houghton have been very stimulating. This work was financially supported by a grant from the FOM/SWON Association for Mathematical Physics.
Appendix

The (1, 1, ..., 1) monopole. The Nahm construction of the (1, 1, ..., 1) monopole is similar to that of the $k = 1$ $SU(n)$ caloron. The main difference is that the circle is replaced by the interval $[\mu_1, \mu_n]$. For the (1, 1, ..., 1) monopole, the singularities reside at $z = \mu_2, \ldots, \mu_{n-1}$ [37, 21, 44]. Like for the caloron we introduce
$$\Delta^\dagger = \left(\lambda^\dagger(z),\ \frac{1}{2\pi i} \hat D_x^\dagger(z)\right), \qquad \lambda(z) = \sum_{m=2}^{n-1} \delta(z - \mu_m)\zeta_m, \qquad \hat D_x(z) = \sigma_\mu \hat D_x^\mu(z) = \frac{d}{dz} + \hat A(z) - 2\pi i x, \tag{131}$$
where $\hat A(z)$ is now defined on $[\mu_1, \mu_n]$. The Nahm construction is performed in terms of the normalised zero modes $v(x)$ of $\Delta^\dagger(x)$,
$$v(x) = \begin{pmatrix} s_x \\ \hat\psi_x(z) \end{pmatrix}, \qquad \frac{1}{2\pi i} \hat D_x^\dagger(z) \hat\psi_x^m(z) + \sum_{m'=2}^{n-1} \delta(z - \mu_{m'}) \zeta_{m'}^\dagger s_x^{m'm} = 0, \tag{132}$$
$$v^\dagger(x) v(x) = s_x^\dagger s_x + \int_{\mu_1}^{\mu_n} dz\, \hat\psi_x^\dagger(z) \hat\psi_x(z) = 1_n, \tag{133}$$
where $\hat\psi_x(z) = (\hat\psi_x^1(z), \ldots, \hat\psi_x^n(z))$ contains the $n$ two-spinors defined on the interval $[\mu_1, \mu_n]$, and $s_x \in \mathbb C^{n-2,n}$. (The equation for $\hat\psi_x^m(z)$ is readily seen to have $n$ solutions for fixed $s_x$ [21]). Though the monopole is a static solution, it is preferable to have $x_0$ included as a dummy variable, the $x_0$ dependence trivially being implemented by $v(x) = e^{2\pi i x_0 z} v(\vec x)$, so as to write concisely
$$A_\mu(\vec x) = (\Phi(\vec x), \vec A(\vec x)) = v^\dagger(x) \partial_\mu v(x), \tag{134}$$
with the inner product defined as in Eq. (133). Performing all monopole calculations in terms of $\Delta(x)$ and $v(x)$, we can copy the caloron formalism. In particular, it follows that for Eq. (134) to be selfdual, $\Delta^\dagger(x)\Delta(x)$ should commute with the quaternions. This is equivalent to the monopole Nahm equation,
$$\frac{d}{dz} \hat A_j(z) = 2\pi i \sum_{m=2}^{n-1} \delta(z - \mu_m) \rho_m^j. \tag{135}$$
Its solution $\hat A_j(z)$ can be written in terms of $n - 1$ position vectors $\vec y_m$, $\vec\rho_m = \vec y_m - \vec y_{m-1}$, comprised in $\vec y_m = (\vec y_1, \ldots, \vec y_{n-1})^t$,
$$\vec\rho_m = S_m^t \vec y_m, \tag{136}$$
implying
$$\hat A_j = 2\pi i \sum_{m=1}^{n-1} \chi_{[\mu_m, \mu_{m+1}]}(z)\, y_m^j. \tag{137}$$
Like for the caloron, there is a gauge action on the Nahm data
$$\hat A(z) \to \hat A(z) + i \frac{d}{dz} h(z), \qquad \zeta_m \to \zeta_m e^{ih(\mu_m)}, \qquad m = 2, \ldots, n - 1, \tag{138}$$
with gauge group $\hat{\mathcal G}_m = \{g(z)\,|\,g : z \to e^{-ih(z)} \in U(1),\ g(\mu_1) = g(\mu_n) = 1\}$. The condition at the endpoints is required for $i\,d/dz$ to be hermitian on the space of gauge functions. Hence, for the monopole $\hat{\mathcal G}_m = \{g(z)\,|\,g : z \to e^{-ih(z)} \in U(1),\ g(\mu_1) = g(\mu_n) = 1\}$. The $\hat{\mathcal G}_m$ action can be used to set $\hat A_0(z)$ constant, and to undo the $U(1)$ phase ambiguities in relating $\zeta_m$ to $\vec\rho_m$, $m = 2, \ldots, n - 1$, hence $\zeta_m$ can be considered to have fixed phase. The monopole Nahm data can then be expressed in terms of $n - 1$ quaternions,
$$\hat A_m = (\hat A_1, \ldots, \hat A_{n-1})^t = 2\pi \left(N_m^{-1} \frac{\tau_m}{4\pi} + \vec y_m \cdot \vec\sigma\right), \tag{139}$$
$i\hat A_{m,m}$ denoting the value $\hat A(z)$ takes on $(\mu_m, \mu_{m+1})$. In the gauge with constant $\hat A_0(z)$, the Green's function $\hat f_x$ in the monopole Nahm construction is the solution to the differential equation
$$\left[ \left(\frac{1}{2\pi i} \frac{d}{dz} - x_0\right)^2 + \sum_{m=1}^{n-1} \chi_{[\mu_m, \mu_{m+1}]}(z)\, r_m^2 + \frac{1}{2\pi} \sum_{m=2}^{n-1} \delta(z - \mu_m) |\vec y_m - \vec y_{m-1}| \right] \hat f_x(z, z') = \delta(z - z'), \tag{140}$$
whereas transformations to other gauges are realised by
$$\hat f_x(z, z') \to g(z) \hat f_x(z, z') g(z')^*, \qquad g(z) \in \hat{\mathcal G}_m. \tag{141}$$
The boundary condition for the monopole Green's function is determined by the requirement that $i \frac{d}{dz}$ be a hermitian operator, therefore the eigenfunctions of the left-hand side of Eq. (140) vanish in the endpoints. This imposes by standard Sturm–Liouville theory
$$\hat f_x(\mu_1, z') = \hat f_x(\mu_n, z') = 0 \tag{142}$$
for the Green's function. This boundary condition is automatically satisfied when obtaining the monopole Green's function from the caloron Green's function, taking the limit $|\vec y_n| \to \infty$. The $x_0$ dependence of the monopole Green's function is trivial,
$$\hat f_x(z, z') = e^{2\pi i x_0 (z - z')} \hat f_{\vec x}(z, z'). \tag{143}$$
The metric on the monopole moduli space is determined in terms of the L2 norm of gauge orthogonal solutions Zm to the linearised Bogomol’nyi equations. With A0 identified as the Higgs field, and assuming all fields and zero modes being static, the conditions for a
tangent vector to the monopole moduli space are identical to those on the tangent vector to an instanton moduli space, hence Zm satisfies †
D ad (A)Zm = 0,
(144)
where ∂0 acts trivially, but is kept to make later derivations more transparent. Metric and Kähler forms read 1 † (g, ω)(Z d 3 xTrZm ( x )Zm ( x ). (145) m , Zm ) = 4π 2 R3 The formalism to compute the metric is copied from the caloron case. A tangent vector to the monopole moduli space is given by Zmµ ( x) =
[µ1 ,µn ]2
dzdz
n−1
m =2
sx† cˆm δ(z − µm ) + ψˆ x† (z)Yˆ (z) fˆx (z, z )σ¯ µ ψˆ x (z ) − h.c. (146)
in terms of a tangent vector to the moduli space of monopole Nahm data C=
c(z) ˆ , Yˆ (z)
n−1
c(z) ˆ =
cˆm δ(z − µm ),
(147)
m=2
satisfying the deformation and gauge orthogonality equations n−1
d ˆ † Yi (z) = −iπtr 2 σ¯ i (ζm† cˆm + cˆm ζm )δ(z − µm ), dz m=2
d ˆ Y0 (z) = −iπ dz
n−1 m=2
† tr 2 (ζm† cˆm − cˆm ζm )δ(z − µm ).
(148)
To derive the analogue for monopoles of Corrigan’s formula we trade each matrix multiplication in Eq. (58) for an integration over [µ1 , µn ] or an inner product of type (133) and use the trivial x0 dependence of v(x) and fx (z, z ) for the monopole to obtain † TrZm (x)Zm (x) = − 21 ∇ 2 tr 2 dz [Yˆ † (z)Yˆ (z) + Yˆ † (z)Yˆ (z) [µ1 ,µn ]
ˆ + cˆ (z) < cˆ > +cˆ (z) < cˆ >]fx (z, z) †
+ ∇ tr 2 1 2
[µ1 ,µn ]2
†
ˆ + Y(z)] ˆ dzdz [C(z)
2
fˆx (z, z )[Yˆ x† (z ) + Cˆ† (z )]fˆx (z , z) , (149)
n−1 † −1 ˆ † ˆ ˆ ˆ where now m=2 cˆm ζm δ(z − µm ), Yx (z) = (2π i) Y (z)Dx (z) and
≡ [µ1 ,µn ] H (z)dz. Compare Eq. (64). The monopole metric is evaluated from
Eqs. (145,149) by partial integration, along the lines of the derivation in Sect. 4.2. The monopole Green’s function fx (z, z ) behaves as in Eq. (66). Thus we arrive at the isometric property of the Nahm construction for (1, 1, . . . , 1) monopoles, gM (Zm , Zm ) = 21 tr 2 < Yˆ † Yˆ > + < cˆ† >< cˆ > + < cˆ† >< cˆ > , i ωM (Zm , Zm ) = 21 tr 2 σ¯ i < Yˆ † Yˆ > + < cˆ† >< cˆ > − < cˆ† >< cˆ > . (150) ˆ An infinitesimal gauge transformation δ X(z) is applied to obtain gauge orthogonality of the tangent vector C, c(z) ˆ =
n−1
δ(z − µm )cˆm =
m=2 n−1
Yˆ (z) = i
m=1
n−1
ˆ m) , δ(z − µm ) δζm + iζm δ X(µ
m=2
1 χ[µm ,µm+1 ] Yˆm = 2π i
(151)
d ˆ ˆ δ A(z) + i δ X(z) . dz
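It vanishes in the endpoints $z = \mu_1$, $z = \mu_n$ and satisfies
$$-\frac{1}{2\pi} \frac{d^2 \delta\hat X(z)}{dz^2} + 2\, \delta\hat X(z) \sum_{m=2}^{n-1} \delta(z - \mu_m) |\vec\rho_m| = \sum_{m=2}^{n-1} \delta(z - \mu_m) \left( \frac{d\tau_m}{4\pi\nu_m} - \frac{d\tau_{m-1}}{4\pi\nu_{m-1}} - |\vec\rho_m|\, \vec w_m(\vec\rho_m) \cdot d\vec\rho_m \right). \tag{152}$$
Therefore, it is piecewise linear and fixed by $\delta\hat X = (\delta\hat X_2, \ldots, \delta\hat X_{n-1})^t$, $\delta\hat X_m = \delta\hat X(\mu_m)$, $m = 2, \ldots, n - 1$, where
$$\frac{1}{2\pi} (S_m^t N_m^{-1} S_m + V_m^{-1})\, \delta\hat X = \left(S_m^t N_m^{-1} \frac{d\tau_m}{4\pi} - V_m^{-1} \vec W_m \cdot S_m^t d\vec y_m\right) \tag{153}$$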
(see Eqs. (112, 113) for definitions). With the compensating gauge function found, the remaining manipulations to retrieve the uncentered monopole metric in Eq. (111) from Eqs. (150, 151) differ only in the $m$ label and the dimensions of the matrices from those in Sect. 4.3 and are therefore not repeated here.

To compute the metric using the hyperKähler quotient construction we follow and summarise the reasoning in [35, 15] and Sect. 4.4. We have to find the metric on $\mathcal N_m/\hat{\mathcal G}_m$, where $\mathcal N_m$ is the subset of the space $\hat{\mathcal A}_m$ of monopole Nahm data containing the solutions to the Nahm equations. Making use of the $U(1)$ gauge symmetry for the monopole in Eq. (138), we can restrict ourselves to piecewise constant $\hat A(z)$, characterised by $n - 1$ quaternions corresponding to its values on the subintervals. Together with the $n - 2$ complex two-vectors giving the matching data, these form the space $\hat{\mathcal A}_{0m} = \mathbb H^{n-1} \times \mathbb C^{n-2,2} \ni (y_m, \zeta_m)$. This space has natural metric
$$g = \tfrac12 \mathrm{Tr}\,\mathrm{tr}_2 \left(dy_m^\dagger N_m\, dy_m + 2\, d\zeta_m^\dagger d\zeta_m\right). \tag{154}$$
The set of piecewise constant solutions to the Nahm equations form $\mathcal N_{0,m}$, which is a subset of $\hat{\mathcal A}_{0m}$. The vector part of a piecewise constant solution to the monopole
Nahm equation (i.e. $\mathcal N_{0,m}$) is fixed by Eq. (136). We introduce the phases of $\zeta_m$ as $\psi_m = (\psi_2, \ldots, \psi_{n-1})^t$. Having gauge fixed to constant $\hat A(z)$, the residual $U(1)$ gauge symmetry consists of gauge functions having piecewise linear and continuous logarithms, which vanish in the endpoints $z = \mu_1$ and $z = \mu_n$. This results in an $\mathbb R^{n-2}$ action on $\hat{\mathcal A}_{0m}$, characterised by
$$y_m \to y_m - \frac{1}{2\pi} N_m^{-1} S_m t_m, \qquad \psi_m \to \psi_m + 2 t_m, \qquad t_m \in \mathbb R^{n-2}, \tag{155}$$
with moment map, zero set and invariants given by
$$\vec\mu_m = -\frac{1}{2\pi} S_m^t \vec y_m + \frac{\vec\rho_m}{2\pi}, \qquad \vec\rho_m = S_m^t \vec y_m, \qquad \tau_m = 4\pi N_m y_{0m} + S_m \psi_m. \tag{156}$$
A suitable notation being established, the algebra to obtain the metric and Kähler forms for the uncentered monopole in Eq. (111) is now nearly identical to the hyperKähler quotient construction of the uncentered caloron metric, and one readily retrieves Eq. (111). Actually, one only has to insert the $m$ labels at appropriate places, just realising that the dimensionalities of the objects are slightly different.

References

1. Atiyah, M.F. and Hitchin, N.J.: The Geometry and Dynamics of Magnetic Monopoles. Princeton: Princeton Univ. Press, 1988
2. Atiyah, M.F., Hitchin, N.J., Drinfeld, V.G., Manin, Yu.I.: Phys. Lett. 65 A, 185 (1978); Atiyah, M.F.: Geometry of Yang–Mills fields. Fermi lectures, Scuola Normale Superiore, Pisa, 1979
3. Bogomol'nyi, E.B.: Yad. Fiz. 24, 861 (1976); Sov. J. Nucl. 24, 449 (1976); Prasad, M.K. and Sommerfield, C.M.: Phys. Rev. Lett. 35, 760 (1975)
4. Braam, P.J. and van Baal, P.: Commun. Math. Phys. 122, 267 (1989)
5. Connell, S.A.: The dynamics of the SU(3) charge (1,1) magnetic monopole. (1991), ftp://maths.adelaide.edu.au/pure/murray/oneone.tex, unpublished preprint
6. Corrigan, E.: Unpublished, quoted in [41]
7. Corrigan, E. and Goddard, P.: Ann. Phys. (N.Y.) 154, 253 (1984)
8. Donaldson, S.K.: Commun. Math. Phys. 93, 453 (1984)
9. Donaldson, S.K.: Commun. Math. Phys. 96, 387 (1984)
10. Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds, Oxford: Clarendon Press, 1990
11. Gauntlett, J.P. and Lowe, D.A.: Nucl. Phys. B472, 194 (1996) (hep-th/9601085); Lee, K., Weinberg, E.J. and Yi, P.: Phys. Lett. B376, 97 (1996) (hep-th/9601097)
12. Garland, H. and Murray, M.K.: Commun. Math. Phys. 120, 335 (1988)
13. Gauntlett, J.P., Gibbons, G.W., Papadopoulos, G. and Townsend, P.K.: Nucl. Phys. B 500, 133 (1997) (hep-th/9702202); Papadopoulos, G. and Townsend, P.K.: Nucl. Phys. B 444, 245 (1995) (hep-th/9501069)
14. Gibbons, G.W. and Manton, N.S.: Phys. Lett. B 356, 32 (1995) (hep-th/9506052)
15. Gibbons, G.W., Rychenkova, P. and Goto, R.: Commun. Math. Phys. 186, 581 (1997) (hep-th/9608085)
16. Gross, D.J., Pisarski, R.D. and Yaffe, L.G.: Rev. Mod. Phys. 53, 43 (1981)
17. Harrington, B.J. and Shepard, H.K.: Phys. Rev. D 17, 2122 (1978); ibid. D18, 2990 (1978)
18. Hitchin, N.J., Karlhede, A., Lindström, U. and Roček, M.: Commun. Math. Phys. 108, 535 (1987)
19. 't Hooft, G.: Phys. Rev. D 14, 3432 (1976)
20. Houghton, C.J. and Sutcliffe, P.M.: J. Math. Phys. 38, 5576 (1997)
21. Hurtubise, J. and Murray, M.K.: Commun. Math. Phys. 122, 35 (1989)
22. Kraan, T.C. and van Baal, P.: Phys. Lett. B428, 268 (1998) (hep-th/9802049)
23. Kraan, T.C. and van Baal, P.: Nucl. Phys. B 533, 627-659 (1998) (hep-th/9805168); Nucl. Phys. A 642, 299c (1998) (hep-th/9805201)
24. Kraan, T.C. and van Baal, P.: Phys. Lett. B 435, 389 (1998) (hep-th/9806034)
25. Kraan, T.C. and van Baal, P.: Nucl. Phys. Suppl. 73, 554 (1999) (hep-lat/9808015)
26. Lee, K.: Phys. Lett. B 426, 323 (1998) (hep-th/9802012); Lee, K. and Lu, C.: Phys. Rev. D 58, 25011 (1998) (hep-th/9802108)
27. Lee, K., Weinberg, E.J., Yi, P.: Phys. Rev. D 54, 1633 (1996)
28. Lee, K., Weinberg, E.J., Yi, P.: Phys. Rev. D 54, 6351 (1996)
29. Lee, K. and Yi, P.: Phys. Rev. D 56, 3711 (1997) (hep-th/9702107)
30. Lee, K. and Yi, P.: Phys. Rev. D 58, 066005 (1998) (hep-th/9804174)
31. Lu, C.: Phys. Rev. D 58, 125010 (1998) (hep-th/9806237)
32. Maciocia, A.: Commun. Math. Phys. 135, 467 (1991)
33. Manton, N.S.: Phys. Lett. B 110, 54 (1982)
34. Manton, N.S.: Phys. Lett. B 154, 397 (1985) [Err. 157B, (1985) 475]
35. Murray, M.K.: J. Geom. Phys. 23, 31–41 (1997) (hep-th/9605054)
36. Nahm, W.: Phys. Lett. B 90, 413 (1980)
37. Nahm, W.: All self-dual multimonopoles for arbitrary gauge groups. CERN preprint TH-3172 (1981), published in Freiburg ASI 301 (1981); The construction of all self-dual multimonopoles by the ADHM method. In: Monopoles in quantum field theory, eds. N. Craigie, e.a. Singapore: World Scientific, 1982, p. 87
38. Nahm, W.: Self-dual monopoles and calorons. In: Lect. Notes in Physics. 201, eds. G. Denardo, e.a. 1984, p. 189
39. Nakajima, H.: Monopoles and Nahm's Equations. In: Einstein metrics and Yang–Mills connections, Sanda, 1990, eds. T. Mabuchi and S. Mukai, New York: Dekker, 1993
40. Osborn, H.: Nucl. Phys. B 159, 497 (1979)
41. Osborn, H.: Ann. Phys. (N.Y.) 135, 373 (1981)
42. Pedersen, H. and Poon, Y.: Commun. Math. Phys. 117, 569 (1988)
43. Taubes, C.: Morse theory and monopoles: Topology in long range forces. In: Progress in gauge field theory, eds. G. 't Hooft et al, New York: Plenum Press, 1984, p. 563
44. Weinberg, E.J. and Yi, P.: Phys. Rev. D 58, 046001 (1998)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 212, 535 – 556 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
On Covariant Realizations of the Euclid Group R. Z. Zhdanov, V. I. Lahno, W. I. Fushchych Institute of Mathematics, 3 Tereshchenkivska Street, 01004 Kiev, Ukraine. E-mail:
[email protected];
[email protected] Received: 12 September 1997 / Accepted: 30 January 2000
Abstract: We classify realizations of the Lie algebras of the rotation O(3) and Euclid E(3) groups within the class of first-order differential operators in arbitrary finite dimensions. It is established that there are only two distinct realizations of the Lie algebra of the group O(3) which are inequivalent within the action of a diffeomorphism group. Using this result we describe a special subclass of realizations of the Euclid algebra which are called covariant ones by analogy to similar objects considered in classical representation theory. Furthermore, we give an exhaustive description of realizations of the Lie algebra of the group O(4) and construct covariant realizations of the Lie algebra of the generalized Euclid group E(4).

1. Introduction

The standard approach to constructing linear relativistic motion equations contains as a subproblem that of describing inequivalent matrix representations of the Poincaré group P(1, 3). Thus, if one succeeds in obtaining an exhaustive (in some sense) description of all inequivalent representations of the latter, then it is possible to construct all possible Poincaré-invariant linear wave equations (for more details see, e.g., [1–3]). It would be only natural to apply the same approach to describing nonlinear relativistically-invariant models with the help of Lie's infinitesimal technique. However, in the overwhelming majority of the papers devoted to symmetry classification of nonlinear differential equations admitting some Lie transformation group G the realization of the group was fixed a priori. As a result, only particular classes of partial differential equations invariant with respect to a prescribed group G were obtained.
One of the possible reasons for this is that the problem of describing inequivalent realizations of a given Lie transformation group reduces to constructing a general solution of some over-determined system of nonlinear partial differential equations (in contrast to the case of classical matrix representation theory where one has to solve nonlinear matrix equations).
We recall that given a fixed realization of a Lie transformation group G, the problem of describing partial differential equations invariant under the group G is reduced with the help of the infinitesimal Lie method to integrating some over-determined linear system of partial differential equations (called determining equations) [4–7]. However, to solve the problem of constructing all differential equations admitting the transformation group G whose realization is not fixed a priori one has
• to construct all inequivalent (in some sense) realizations of the Lie transformation group G,
• to solve the determining equations for each realization obtained.
And what is more, the first problem, in contrast to the second one, reduces to solving nonlinear systems of partial differential equations. In this respect one should mention Lie's classification of integrable ordinary differential equations based on his classification of complex Lie algebras of first-order differential operators in one and two variables [8]. However, it seems impossible to give an exhaustive description of all Lie algebras of first-order differential operators. Till now there is no complete classification of them even for the case of first-order differential operators in three variables, though a partial classification was obtained by Lie a century ago [8]. The classification problem is substantially simplified if we are looking for inequivalent realizations of a specific Lie algebra. It has been completely solved by Rideau and Winternitz [9], Zhdanov and Fushchych [10] for the generalized Galilei (Schrödinger) group G2(1, 1) acting in the space of two dependent and two independent variables. Yehorchenko [11] and Fushchych, Tsyfra and Boyko [12] have constructed new (nonlinear) realizations of the Poincaré groups P(1, 2) and P(1, 3), correspondingly (see also [13, 14]). Some new realizations of the Galilei group G(1, 3) were suggested in [15].
A complete description of covariant realizations of the conformal group C(n, m) in the space of n + m independent and one dependent variables was obtained by Fushchych, Zhdanov and Lahno [16, 17] (see also [18]). It has been established, in particular, that any covariant realization of the Poincaré group P(n, m) with max{n, m} ≥ 3 in the case of one dependent variable is equivalent to the standard realization. But given the condition max{n, m} < 3, there exist essentially new realizations of the corresponding Poincaré groups. The present paper is devoted mainly to the classification of inequivalent realizations of the Euclid group E(3), which is a semi-direct product of the three-parameter rotation group O(3) and of the three-parameter Abelian translation group T(3), acting in the space of three independent (x1, x2, x3) and n ∈ N dependent (u1, . . . , un) variables. Being a subgroup of such fundamental groups as the Poincaré and Galilei groups, the Euclid group plays an exceptional role in modern mathematical and theoretical physics, since it is admitted both by equations of relativistic and of non-relativistic theories. In particular, the group E(3) is an invariance group of the Klein–Gordon–Fock, Maxwell, heat, Schrödinger, Dirac, Weyl, Navier–Stokes, Lamé and Yang–Mills equations. The paper is organized as follows. The second section contains the necessary notations, conventions and definitions used throughout the paper. In the third section we give an exhaustive classification of inequivalent realizations of the Lie algebra of the rotation group O(3) within the class of first-order differential operators. The fourth section is devoted to the description of covariant realizations of the Euclid algebra AE(3). We give a complete classification of them and, furthermore, demonstrate how to reduce the realizations of AE(3) realized on the sets of solutions of the Navier–Stokes, Lamé, Weyl, Maxwell and Dirac equations to one of the two canonical forms. In the fifth section
On Covariant Realizations of the Euclid Group
the results obtained are applied to describe covariant realizations of the Lie algebra of the generalized Euclid group AE(4).

2. Basic Notations and Definitions

It is common knowledge that investigation of realizations of a Lie transformation group G reduces to the study of realizations of its Lie algebra AG, whose basis elements are first-order differential operators (Lie vector fields) of the form

$$Q = \xi_\alpha(x, u)\,\partial_{x_\alpha} + \eta_i(x, u)\,\partial_{u_i}, \tag{1}$$

where $\xi_\alpha, \eta_i$ are some real-valued smooth functions of $x = (x_1, x_2, \ldots, x_m) \in \mathbb{R}^m$ and $u = (u_1, u_2, \ldots, u_n) \in \mathbb{R}^n$, $\partial_{x_\alpha} = \partial/\partial x_\alpha$, $\partial_{u_i} = \partial/\partial u_i$, $\alpha = 1, 2, \ldots, m$, $i = 1, 2, \ldots, n$. Hereafter, summation over repeated indices is understood. In the above formulae we have two "sorts" of variables. The variables $x_1, \ldots, x_m$ and $u_1, \ldots, u_n$ will be referred to as independent and dependent variables, respectively. The difference between these becomes essential when we consider AG as an invariance algebra of some system of partial differential equations for $u_1(x), \ldots, u_n(x)$. Due to properties of the corresponding Lie transformation group G, the basis operators $Q_a$, $a = 1, \ldots, N$, of a Lie algebra AG satisfy the commutation relations

$$[Q_a, Q_b] = C^c_{ab}\, Q_c, \qquad a, b = 1, \ldots, N, \tag{2}$$

where $[Q_a, Q_b] \equiv Q_a Q_b - Q_b Q_a$ is the commutator. In (2), $C^c_{ab} = \text{const} \in \mathbb{R}$ are structure constants which determine uniquely the Lie algebra AG. A fixed set of Lie vector fields (LVFs) $Q_a$ satisfying (2) is called a realization of the Lie algebra AG. Thus the problem of description of all realizations of a given Lie algebra AG reduces to solving relations (2) with some fixed structure constants $C^c_{ab}$ within the class of LVFs (1). It is easy to check that relations (2) are not altered by an arbitrary invertible transformation of the variables x, u,

$$y_\alpha = f_\alpha(x, u), \quad \alpha = 1, \ldots, m, \qquad v_i = g_i(x, u), \quad i = 1, \ldots, n, \tag{3}$$
where $f_\alpha, g_i$ are smooth functions. That is why we can introduce on the set of realizations of a Lie algebra AG the following relation: two realizations $Q_1, \ldots, Q_N$ and $Q'_1, \ldots, Q'_N$ are called equivalent if they are transformed one into another by means of an invertible transformation (3). As invertible transformations of the form (3) form a group (called the diffeomorphism group), the relation above is an equivalence relation. It divides the set of all realizations of a Lie algebra AG into equivalence classes $A_1, \ldots, A_r$. Consequently, to describe all possible realizations of AG it suffices to construct one representative of each equivalence class $A_j$, $j = 1, \ldots, r$.

Definition 1. First-order linearly-independent differential operators

$$P_a = \xi^{(1)}_{ab}(x, u)\,\partial_{x_b} + \eta^{(1)}_{ai}(x, u)\,\partial_{u_i}, \qquad J_a = \xi^{(2)}_{ab}(x, u)\,\partial_{x_b} + \eta^{(2)}_{ai}(x, u)\,\partial_{u_i}, \tag{4}$$
where the indices a, b take the values 1, 2, 3 and the index i takes the values 1, 2, . . . , n, form a realization of the Euclid algebra AE(3) provided the following commutation relations are fulfilled:

$$[P_a, P_b] = 0, \tag{5}$$
$$[J_a, P_b] = \varepsilon_{abc} P_c, \tag{6}$$
$$[J_a, J_b] = \varepsilon_{abc} J_c, \tag{7}$$

where

$$\varepsilon_{abc} = \begin{cases} 1, & (abc) = \text{cycle}\,(123),\\ -1, & (abc) = \text{cycle}\,(213),\\ 0, & \text{in the remaining cases.} \end{cases}$$
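Definition 2. A realization of the Euclid algebra within the class of LVFs (4) is called covariant if the coefficients of the basis elements $P_a$ satisfy the following condition:

$$\operatorname{rank}\begin{pmatrix} \xi^{(1)}_{11} & \xi^{(1)}_{12} & \xi^{(1)}_{13} & \eta^{(1)}_{11} & \cdots & \eta^{(1)}_{1n}\\ \xi^{(1)}_{21} & \xi^{(1)}_{22} & \xi^{(1)}_{23} & \eta^{(1)}_{21} & \cdots & \eta^{(1)}_{2n}\\ \xi^{(1)}_{31} & \xi^{(1)}_{32} & \xi^{(1)}_{33} & \eta^{(1)}_{31} & \cdots & \eta^{(1)}_{3n} \end{pmatrix} = 3. \tag{8}$$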
3. Realizations of the Lie Algebra of the Rotation Group O(3)

It is well known from classical representation theory that there are infinitely many inequivalent matrix representations of the Lie algebra of the rotation group O(3) [1]. A natural equivalence relation on the set of matrix representations of AO(3) is defined as follows: $J_a \to V J_a V^{-1}$ with an arbitrary constant nonsingular matrix V. If we represent the matrices $J_a$ as the first-order differential operators (see, e.g., [7])

$$J_a = -\{J_a u\}_\alpha\, \partial_{u_\alpha}, \tag{9}$$

where u is a vector-column of the corresponding dimension, then the above equivalence relation means that the representations of the algebra AO(3) are searched for within the class of LVFs (9) up to invertible linear transformations $u \to v = V u$. It is proved below that, provided realizations of AO(3) are classified within arbitrary invertible transformations of variables

$$v_i = F_i(u), \qquad i = 1, \ldots, n, \tag{10}$$

there are only two inequivalent realizations.
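Theorem 1. Let first-order differential operators

$$J_a = \eta_{ai}(u)\,\partial_{u_i}, \qquad a = 1, 2, 3, \tag{11}$$

satisfy the commutation relations (7) of the Lie algebra of the rotation group O(3). Then either all of them are equal to zero, i.e.

$$J_a = 0, \qquad a = 1, 2, 3, \tag{12}$$

or there exists a transformation (10) reducing these operators to one of the following forms:

1.
$$\begin{aligned} J_1 &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2},\\ J_2 &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2},\\ J_3 &= \partial_{u_1}; \end{aligned} \tag{13}$$

2.
$$\begin{aligned} J_1 &= -\sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} + \sin u_1 \sec u_2\, \partial_{u_3},\\ J_2 &= -\cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} + \cos u_1 \sec u_2\, \partial_{u_3},\\ J_3 &= \partial_{u_1}. \end{aligned} \tag{14}$$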
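The commutation relations (7) for the two canonical forms can be spot-checked numerically. The following is a minimal sketch and not part of the original argument: it evaluates the commutator of two vector fields by central finite differences at a single sample point (chosen arbitrarily, away from cos u2 = 0) and compares it with the expected right-hand side. The helper names `commutator`, `form13`, `form14` and `max_residual` are illustrative assumptions, and form (13) is embedded in three variables with a zero coefficient of ∂u3.

```python
import math

def commutator(X, Y, p, h=1e-5):
    """Coefficients of the vector field [X, Y] at the point p.

    X and Y map a point (list of floats) to the list of coefficients of
    d/du_1, ..., d/du_n. Derivatives are central finite differences.
    """
    n = len(p)
    Xp, Yp = X(p), Y(p)
    result = [0.0] * n
    for j in range(n):
        hi = list(p); hi[j] += h
        lo = list(p); lo[j] -= h
        dX = [(a - b) / (2 * h) for a, b in zip(X(hi), X(lo))]
        dY = [(a - b) / (2 * h) for a, b in zip(Y(hi), Y(lo))]
        for i in range(n):
            # [X, Y]^i = X^j d_j Y^i - Y^j d_j X^i
            result[i] += Xp[j] * dY[i] - Yp[j] * dX[i]
    return result

def form13():
    """Operators (13), written in (u1, u2, u3) with no d/du_3 part."""
    J1 = lambda u: [-math.sin(u[0]) * math.tan(u[1]), -math.cos(u[0]), 0.0]
    J2 = lambda u: [-math.cos(u[0]) * math.tan(u[1]), math.sin(u[0]), 0.0]
    J3 = lambda u: [1.0, 0.0, 0.0]
    return J1, J2, J3

def form14():
    """Operators (14) in the variables (u1, u2, u3)."""
    J1 = lambda u: [-math.sin(u[0]) * math.tan(u[1]), -math.cos(u[0]),
                    math.sin(u[0]) / math.cos(u[1])]
    J2 = lambda u: [-math.cos(u[0]) * math.tan(u[1]), math.sin(u[0]),
                    math.cos(u[0]) / math.cos(u[1])]
    J3 = lambda u: [1.0, 0.0, 0.0]
    return J1, J2, J3

def max_residual(J, p):
    """Largest deviation from [J1,J2]=J3, [J2,J3]=J1, [J3,J1]=J2 at p."""
    J1, J2, J3 = J
    worst = 0.0
    for A, B, C in [(J1, J2, J3), (J2, J3, J1), (J3, J1, J2)]:
        lhs, rhs = commutator(A, B, p), C(p)
        worst = max(worst, max(abs(l - r) for l, r in zip(lhs, rhs)))
    return worst

point = [0.3, 0.4, 0.5]  # arbitrary sample point
print(max_residual(form13(), point), max_residual(form14(), point))
```

Both residuals should be at the level of the finite-difference error. A check at one point is of course only a sanity test of signs and conventions, not a proof.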
Proof. If at least one of the operators $J_a$ (say $J_3$) is equal to zero, then due to the commutation relations (7) the two other operators $J_1, J_2$ are also equal to zero and we arrive at Formulae (12). Let $J_3$ be a non-zero operator. Then, using a transformation (10), we can always reduce the operator $J_3$ to the form $J_3 = \partial_{v_1}$ (we should write $J'_3$, but to simplify the notation we omit the primes hereafter). Next, from the commutation relations $[J_3, J_1] = J_2$, $[J_3, J_2] = -J_1$ it follows that the coefficients $\eta_{1i}, \eta_{2i}$ of the operators $J_1, J_2$ satisfy the system of ordinary differential equations with respect to $v_1$,

$$\eta_{1i\,v_1} = \eta_{2i}, \qquad \eta_{2i\,v_1} = -\eta_{1i}, \qquad i = 1, \ldots, n.$$

Solving the above system yields

$$\eta_{1i} = f_i \cos v_1 + g_i \sin v_1, \qquad \eta_{2i} = g_i \cos v_1 - f_i \sin v_1, \tag{15}$$
where $f_i, g_i$ are arbitrary smooth functions of $v_2, \ldots, v_n$, $i = 1, \ldots, n$.

Case 1. $f_j = g_j = 0$, $j \ge 2$. In this case the operators $J_1, J_2$ read

$$J_1 = f \cos v_1\, \partial_{v_1}, \qquad J_2 = -f \sin v_1\, \partial_{v_1}$$

with an arbitrary smooth function $f = f(v_2, \ldots, v_n)$. Inserting the above expressions into the remaining commutation relation $[J_1, J_2] = J_3$ and computing the commutator on the left-hand side, we arrive at the equality $f^2 = -1$, which cannot be satisfied by a real-valued function f.

Case 2. Not all $f_j, g_j$, $j \ge 2$, are equal to 0. Making a change of variables

$$w_1 = v_1 + V(v_2, \ldots, v_n), \qquad w_j = v_j, \qquad j = 2, \ldots, n,$$

we transform the operators $J_a$, $a = 1, 2, 3$, with coefficients (15) as follows:

$$\begin{aligned} J_1 &= \tilde f \sin w_1\, \partial_{w_1} + \sum_{j=2}^{n} (\tilde f_j \cos w_1 + \tilde g_j \sin w_1)\, \partial_{w_j},\\ J_2 &= \tilde f \cos w_1\, \partial_{w_1} + \sum_{j=2}^{n} (\tilde g_j \cos w_1 - \tilde f_j \sin w_1)\, \partial_{w_j},\\ J_3 &= \partial_{w_1}. \end{aligned} \tag{16}$$

Here $\tilde f, \tilde f_j, \tilde g_j$ are some functions of $w_2, \ldots, w_n$.

Subcase 2.1. Not all $\tilde f_j$ are equal to 0. Making a transformation

$$z_1 = w_1, \qquad z_j = W_j(w_2, \ldots, w_n), \qquad j = 2, \ldots, n,$$

where $W_2$ is a particular solution of the partial differential equation

$$\sum_{j=2}^{n} \tilde f_j\, \partial_{w_j} W_2 = 1$$

and $W_3, \ldots, W_n$ are functionally-independent first integrals of the partial differential equation

$$\sum_{j=2}^{n} \tilde f_j\, \partial_{w_j} W = 0,$$

we reduce the operators (16) to the form

$$\begin{aligned} J_1 &= F \sin z_1\, \partial_{z_1} + \cos z_1\, \partial_{z_2} + \sum_{j=2}^{n} G_j \sin z_1\, \partial_{z_j},\\ J_2 &= F \cos z_1\, \partial_{z_1} - \sin z_1\, \partial_{z_2} + \sum_{j=2}^{n} G_j \cos z_1\, \partial_{z_j},\\ J_3 &= \partial_{z_1}. \end{aligned} \tag{17}$$

Substituting operators (17) into the commutation relation $[J_1, J_2] = J_3$ and equating coefficients of the linearly-independent operators $\partial_{z_1}, \ldots, \partial_{z_n}$, we arrive at the following system of partial differential equations for the functions $F, G_2, \ldots, G_n$:

$$F_{z_2} - F^2 = 1, \qquad G_{j\,z_2} - F G_j = 0, \qquad j = 2, \ldots, n.$$

Integrating the above equations yields

$$F = \tan(z_2 + c_1), \qquad G_j = \frac{c_j}{\cos(z_2 + c_1)},$$

where $c_1, \ldots, c_n$ are arbitrary smooth functions of $z_3, \ldots, z_n$, $j = 2, \ldots, n$. Changing, if necessary, $z_2$ to $z_2 + c_1(z_3, \ldots, z_n)$ we can put $c_1$ equal to zero. Next, making a transformation

$$y_a = z_a, \quad a = 1, 2, 3, \qquad y_k = Z_k(z_3, \ldots, z_n), \quad k = 4, \ldots, n,$$

where $Z_k$ are functionally-independent first integrals of the partial differential equation

$$\sum_{j=3}^{n} G_j\, \partial_{z_j} Z = 0,$$

we can put $G_k = 0$, $k = 4, \ldots, n$. With these remarks the operators (17) take the form

$$\begin{aligned} J_1 &= \sin y_1 \tan y_2\, \partial_{y_1} + \cos y_1\, \partial_{y_2} + \frac{\sin y_1}{\cos y_2}\,(f\, \partial_{y_2} + g\, \partial_{y_3}),\\ J_2 &= \cos y_1 \tan y_2\, \partial_{y_1} - \sin y_1\, \partial_{y_2} + \frac{\cos y_1}{\cos y_2}\,(f\, \partial_{y_2} + g\, \partial_{y_3}),\\ J_3 &= \partial_{y_1}, \end{aligned} \tag{18}$$
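where f, g are arbitrary smooth functions of $y_3, \ldots, y_n$. If $g \equiv 0$, then making a transformation

$$\tilde u_1 = y_1 - \arctan\frac{f}{\cos y_2}, \qquad \tilde u_2 = -\arctan\frac{\sin y_2}{\sqrt{\cos^2 y_2 + f^2}}, \qquad \tilde u_k = y_k,$$

where $k = 3, \ldots, n$, we reduce operators (18) to the form (13). If in (18) $g \not\equiv 0$, then changing $y_3$ to $\tilde y_3 = \int g^{-1}\, dy_3$ and $y_2$ to $\tilde y_2 = -y_2$ we transform the above operators to become

$$\begin{aligned} J_1 &= -\sin \tilde y_1 \tan \tilde y_2\, \partial_{\tilde y_1} - \Bigl(\cos \tilde y_1 - \alpha\, \frac{\sin \tilde y_1}{\cos \tilde y_2}\Bigr) \partial_{\tilde y_2} + \frac{\sin \tilde y_1}{\cos \tilde y_2}\, \partial_{\tilde y_3},\\ J_2 &= -\cos \tilde y_1 \tan \tilde y_2\, \partial_{\tilde y_1} + \Bigl(\sin \tilde y_1 + \alpha\, \frac{\cos \tilde y_1}{\cos \tilde y_2}\Bigr) \partial_{\tilde y_2} + \frac{\cos \tilde y_1}{\cos \tilde y_2}\, \partial_{\tilde y_3},\\ J_3 &= \partial_{\tilde y_1}. \end{aligned} \tag{19}$$

Here α is an arbitrary smooth function of $\tilde y_3, \ldots, \tilde y_n$. Finally, making the transformation

$$\tilde u_1 = \tilde y_1 + f, \qquad \tilde u_2 = g, \qquad \tilde u_3 = h, \qquad \tilde u_k = \tilde y_k,$$

where $k = 4, \ldots, n$ and $f = f(\tilde y_2, \ldots, \tilde y_n)$, $g = g(\tilde y_2, \ldots, \tilde y_n)$, $h = h(\tilde y_2, \ldots, \tilde y_n)$ satisfy the compatible over-determined system of nonlinear partial differential equations

$$\begin{aligned} f_{\tilde y_2} &= \sin f \tan g, & f_{\tilde y_3} &= \sin \tilde y_2 - \alpha \sin f \tan g - \cos \tilde y_2 \cos f \tan g,\\ g_{\tilde y_2} &= \cos f, & g_{\tilde y_3} &= \sin f \cos \tilde y_2 - \alpha \cos f,\\ h_{\tilde y_2} &= -\sin f \sec g, & h_{\tilde y_3} &= (\cos f \cos \tilde y_2 + \alpha \sin f) \sec g, \end{aligned}$$

reduces operators (19) to the form (14).

Subcase 2.2. $\tilde f_j = 0$, $j = 2, \ldots, n$. Substituting operators (16) under $\tilde f_j = 0$ into the commutation relation $[J_1, J_2] = J_3$ and equating coefficients of the linearly-independent operators $\partial_{w_1}, \ldots, \partial_{w_n}$ yield a system of algebraic equations

$$-\tilde f^{\,2} = 1, \qquad \tilde f\, \tilde g_j = 0, \qquad j = 2, \ldots, n.$$

As the function $\tilde f$ is a real-valued one, the system obtained is inconsistent. Thus we have proved that Formulae (12)–(14) give all possible inequivalent realizations of the Lie algebra (7) within the class of first-order differential operators (11). The theorem is proved.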
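The integration step of Subcase 2.1 in the proof above can be confirmed by direct differentiation. The following short check is a sketch added for convenience; the abbreviation s is introduced here only for this purpose and is not notation from the paper.

```latex
% Check of the integration step in Subcase 2.1, with s = z_2 + c_1 and
% c_1, c_j independent of z_2:
\begin{align*}
F = \tan s \;&\Longrightarrow\; F_{z_2} - F^2 = \sec^2 s - \tan^2 s = 1,\\
G_j = \frac{c_j}{\cos s} \;&\Longrightarrow\;
G_{j\,z_2} = \frac{c_j \sin s}{\cos^2 s} = \tan s \cdot \frac{c_j}{\cos s} = F\, G_j .
\end{align*}
```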
If we realize the rotation group as the group of transformations of the space of spherical functions, then the basis elements of its Lie algebra are exactly of the form (13) [1]. Hence it follows that the realization space V of the Lie algebra (13) is a direct sum of subspaces $V_{2l+1}$ of spherical functions of order l. Furthermore, if we consider O(3) as the group of transformations of the space of generalized spherical functions [1], then operators (14) are the basis elements of the corresponding Lie algebra.

4. Realizations of the Algebra AE(3)

First we will prove an auxiliary assertion giving inequivalent realizations of the Lie algebra of the translation group T(3) within the class of LVFs.

Lemma 1. Let mutually commuting LVFs

$$P_a = \xi^{(1)}_{ab}(x, u)\,\partial_{x_b} + \eta^{(1)}_{ai}(x, u)\,\partial_{u_i},$$

where $a, b = 1, \ldots, N$, satisfy the relation

$$\operatorname{rank}\begin{pmatrix} \xi^{(1)}_{11} & \cdots & \xi^{(1)}_{1N} & \eta^{(1)}_{11} & \cdots & \eta^{(1)}_{1n}\\ \vdots & & \vdots & \vdots & & \vdots\\ \xi^{(1)}_{N1} & \cdots & \xi^{(1)}_{NN} & \eta^{(1)}_{N1} & \cdots & \eta^{(1)}_{Nn} \end{pmatrix} = N. \tag{20}$$
Then there exists a transformation of the form (3) reducing the operators $P_a$ to become $P'_a = \partial_{y_a}$, $a = 1, \ldots, N$.

Proof. To avoid inessential technicalities we will give the detailed proof of the lemma for the case N = 3. For N = 3, relation (20) reduces to the form (8). Due to the latter, $P_a \ne 0$ for all a = 1, 2, 3. It is well known that a non-zero operator

$$P_1 = \xi^{(1)}_{1b}(x, u)\,\partial_{x_b} + \eta^{(1)}_{1i}(x, u)\,\partial_{u_i}$$

can always be reduced to the form $P'_1 = \partial_{y_1}$ by a transformation (3) with m = 3. If we denote by $P'_2, P'_3$ the operators $P_2, P_3$ written in the new variables y, v, then owing to commutation relations (5) they commute with the operator $P'_1 = \partial_{y_1}$. Hence, we conclude that their coefficients are independent of $y_1$. Furthermore, due to condition (8) at least one of the coefficients $\xi^{(1)}_{22}, \xi^{(1)}_{23}, \eta^{(1)}_{21}, \ldots, \eta^{(1)}_{2n}$ of the operator $P'_2$ is not equal to zero. Summing up, we conclude that the operator $P'_2$ is of the form

$$P'_2 = \xi^{(1)}_{2b}(y_2, y_3, v)\,\partial_{y_b} + \eta^{(1)}_{2i}(y_2, y_3, v)\,\partial_{v_i} \ne 0,$$

not all the functions $\xi^{(1)}_{22}, \xi^{(1)}_{23}, \eta^{(1)}_{21}, \ldots, \eta^{(1)}_{2n}$ being identically equal to zero. Making a transformation

$$\begin{aligned} z_1 &= y_1 + F(y_2, y_3, v),\\ z_2 &= G(y_2, y_3, v),\\ z_3 &= \omega_0(y_2, y_3, v),\\ w_i &= \omega_i(y_2, y_3, v), \quad i = 1, \ldots, n, \end{aligned} \tag{21}$$
where the functions F, G are particular solutions of the differential equations

$$\begin{aligned} \xi^{(1)}_{22}(y_2, y_3, v)\,F_{y_2} + \xi^{(1)}_{23}(y_2, y_3, v)\,F_{y_3} + \eta^{(1)}_{2i}(y_2, y_3, v)\,F_{v_i} + \xi^{(1)}_{21}(y_2, y_3, v) &= 0,\\ \xi^{(1)}_{22}(y_2, y_3, v)\,G_{y_2} + \xi^{(1)}_{23}(y_2, y_3, v)\,G_{y_3} + \eta^{(1)}_{2i}(y_2, y_3, v)\,G_{v_i} &= 1, \end{aligned}$$

and $\omega_0, \omega_1, \ldots, \omega_n$ are functionally-independent first integrals of the Euler–Lagrange system

$$\frac{dy_2}{\xi^{(1)}_{22}} = \frac{dy_3}{\xi^{(1)}_{23}} = \frac{dv_1}{\eta^{(1)}_{21}} = \cdots = \frac{dv_n}{\eta^{(1)}_{2n}},$$
which has exactly n + 1 functionally-independent integrals, we reduce the operator $P'_2$ to the form $P''_2 = \partial_{z_2}$. It is easy to check that transformation (21) does not alter the form of the operator $P'_1$. Being rewritten in the new variables z, w it reads $P''_1 = \partial_{z_1}$. As the right-hand sides of (21) are functionally independent by construction, the transformation (21) is invertible. Consequently, the operators $P_a$ are equivalent to the operators $P''_a$, where $P''_1 = \partial_{z_1}$, $P''_2 = \partial_{z_2}$ and

$$P''_3 = \xi^{(1)}_{3b}(z_3, w)\,\partial_{z_b} + \eta^{(1)}_{3i}(z_3, w)\,\partial_{w_i} \ne 0.$$

(The coefficients of the above operator are independent of $z_1, z_2$ because it commutes with the operators $P''_1, P''_2$.) Moreover, due to (8) at least one of the coefficients $\xi^{(1)}_{33}, \eta^{(1)}_{31}, \ldots, \eta^{(1)}_{3n}$ of the operator $P''_3$ is not identically equal to zero. Making a transformation

$$\begin{aligned} Z_1 &= z_1 + F(z_3, w),\\ Z_2 &= z_2 + G(z_3, w),\\ Z_3 &= H(z_3, w),\\ W_i &= \Omega_i(z_3, w), \quad i = 1, \ldots, n, \end{aligned}$$
where F, G, H are particular solutions of the partial differential equations

$$\begin{aligned} \xi^{(1)}_{33}(z_3, w)\,F_{z_3} + \eta^{(1)}_{3i}(z_3, w)\,F_{w_i} &= -\xi^{(1)}_{31}(z_3, w),\\ \xi^{(1)}_{33}(z_3, w)\,G_{z_3} + \eta^{(1)}_{3i}(z_3, w)\,G_{w_i} &= -\xi^{(1)}_{32}(z_3, w),\\ \xi^{(1)}_{33}(z_3, w)\,H_{z_3} + \eta^{(1)}_{3i}(z_3, w)\,H_{w_i} &= 1, \end{aligned}$$

and $\Omega_1, \ldots, \Omega_n$ are functionally-independent first integrals of the Euler–Lagrange system

$$\frac{dz_3}{\xi^{(1)}_{33}} = \frac{dw_1}{\eta^{(1)}_{31}} = \cdots = \frac{dw_n}{\eta^{(1)}_{3n}},$$

we reduce the operators $P_a$, $a = 1, 2, 3$, to the form $P_a = \partial_{Z_a}$, $a = 1, 2, 3$, which is the same as what was to be proved.

Note 1. In the papers [9, 17] mentioned above a classification of realizations of the groups G2(1, 1), C(n, m) was carried out under the assumption that mutually commuting LVFs

$$Q_a = \xi_{a\alpha}(x)\,\partial_{x_\alpha}, \qquad a = 1, \ldots, N,$$
can be simultaneously reduced by the map

$$y_\alpha = f_\alpha(x), \qquad \alpha = 1, \ldots, n, \tag{22}$$

to the form $Q'_a = \partial_{y_a}$. It is not difficult to become convinced of the fact that this is possible if and only if the condition

$$\operatorname{rank}\, \bigl\| \xi_{a\alpha} \bigr\|_{a=1,\ \alpha=1}^{N,\ n} = N \tag{23}$$

holds. The sufficiency of the above statement is a consequence of Lemma 1. The necessity follows from the fact that the function-rows of coefficients of the operators $Q'_1, \ldots, Q'_N$ transformed according to Formulae (22) are obtained by multiplying the function-rows of coefficients of the operators $Q_1, \ldots, Q_N$ by the Jacobi matrix of the map (22), i.e.

$$\xi'_{a\alpha} = \xi_{a\beta}\, f_{\alpha\, x_\beta}, \qquad a = 1, \ldots, N, \quad \alpha = 1, \ldots, n,$$
which leaves relation (23) invariant. Consequently, in [9, 17] only covariant realizations of the corresponding Lie algebras were considered, which, generally speaking, do not exhaust the set of all possible realizations.

Now we can prove a principal theorem giving a description of all inequivalent covariant realizations of the Euclid algebra AE(3).

Theorem 2. Any covariant realization of the algebra AE(3) within the class of first-order differential operators is equivalent to one of the following realizations:

1.
$$P_a = \partial_{x_a}, \qquad J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c}, \qquad a = 1, 2, 3; \tag{24}$$

2. $P_a = \partial_{x_a}$, $a = 1, 2, 3$,
$$\begin{aligned} J_1 &= -x_2 \partial_{x_3} + x_3 \partial_{x_2} + f\, \partial_{x_1} - f_{u_2} \sin u_1\, \partial_{x_3} - \sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2},\\ J_2 &= -x_3 \partial_{x_1} + x_1 \partial_{x_3} + f\, \partial_{x_2} - f_{u_2} \cos u_1\, \partial_{x_3} - \cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2},\\ J_3 &= -x_1 \partial_{x_2} + x_2 \partial_{x_1} + \partial_{u_1}; \end{aligned} \tag{25}$$

3. $P_a = \partial_{x_a}$, $a = 1, 2, 3$,
$$\begin{aligned} J_1 &= -x_2 \partial_{x_3} + x_3 \partial_{x_2} + g\, \partial_{x_1} - (\sin u_1\, g_{u_2} + \cos u_1 \sec u_2\, g_{u_3})\, \partial_{x_3}\\ &\quad - \sin u_1 \tan u_2\, \partial_{u_1} - \cos u_1\, \partial_{u_2} + \sin u_1 \sec u_2\, \partial_{u_3},\\ J_2 &= -x_3 \partial_{x_1} + x_1 \partial_{x_3} + g\, \partial_{x_2} - (\cos u_1\, g_{u_2} - \sin u_1 \sec u_2\, g_{u_3})\, \partial_{x_3}\\ &\quad - \cos u_1 \tan u_2\, \partial_{u_1} + \sin u_1\, \partial_{u_2} + \cos u_1 \sec u_2\, \partial_{u_3},\\ J_3 &= -x_1 \partial_{x_2} + x_2 \partial_{x_1} + \partial_{u_1}. \end{aligned} \tag{26}$$

Here $f = f(u_2, \ldots, u_n)$ is given by the formula

$$f = \alpha \sin u_2 + \beta \Bigl( \sin u_2\, \ln \frac{\sin u_2 + 1}{\cos u_2} - 1 \Bigr), \tag{27}$$

α, β are arbitrary smooth functions of $u_3, \ldots, u_n$, and $g = g(u_2, \ldots, u_n)$ is a solution of the following linear partial differential equation:

$$\cos^2 u_2\; g_{u_2 u_2} + g_{u_3 u_3} - \sin u_2 \cos u_2\; g_{u_2} + 2 \cos^2 u_2\; g = 0. \tag{28}$$
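Formula (27) can be checked directly against the ordinary differential equation $f_{u_2 u_2} - \tan u_2\, f_{u_2} + 2f = 0$, to which (28) reduces for functions independent of $u_3$ after division by $\cos^2 u_2$. The short computation below is a sketch added for convenience; the abbreviations s, c and L are introduced only for this check, and primes denote derivatives with respect to $u_2$.

```latex
% Notation for this check only: s = \sin u_2, c = \cos u_2,
% L = \ln\frac{1 + \sin u_2}{\cos u_2}, so that L' = 1/c.
\begin{align*}
f_1 = s:\quad & f_1'' - \tan u_2\, f_1' + 2 f_1 = -s - s + 2s = 0,\\
f_2 = sL - 1:\quad & f_2' = cL + \tan u_2, \qquad f_2'' = -sL + 1 + \sec^2 u_2,\\
& f_2'' - \tan u_2\, f_2' + 2 f_2
  = (-sL + 1 + \sec^2 u_2) - (sL + \tan^2 u_2) + (2sL - 2) = 0 .
\end{align*}
```

Since $f_1$ and $f_2$ are linearly independent, $\alpha f_1 + \beta f_2$ with coefficients independent of $u_2$ is the general solution of this second-order equation, in agreement with (27).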
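Proof. Due to Lemma 1, the operators $P_a$ can always be reduced to the form $P_a = \partial_{x_a}$ by means of a properly chosen transformation (3). Inserting the operators

$$P_a = \partial_{x_a}, \qquad J_a = \xi_{ab}(x, u)\,\partial_{x_b} + \eta_{ai}(x, u)\,\partial_{u_i}$$

into commutation relations (6) and equating the coefficients of the linearly-independent operators $\partial_{x_1}, \partial_{x_2}, \partial_{x_3}, \partial_{u_1}, \ldots, \partial_{u_n}$, we arrive at the system of partial differential equations for the functions $\xi_{ab}(x, u)$, $\eta_{ai}(x, u)$,

$$\xi_{ac\,x_b} = -\varepsilon_{abc}, \qquad \eta_{ai\,x_b} = 0, \qquad a, b, c = 1, 2, 3, \quad i = 1, \ldots, n.$$

Integrating the above system we conclude that the operators $J_a$ have the form

$$J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c} + j_{ab}(u)\,\partial_{x_b} + \tilde\eta_{ai}(u)\,\partial_{u_i}, \qquad a = 1, 2, 3, \tag{29}$$

where $j_{ab}, \tilde\eta_{ai}$ are arbitrary smooth functions. Inserting (29) into commutation relations (7) and equating coefficients of $\partial_{u_1}, \ldots, \partial_{u_n}$ show that the operators $\mathcal{J}_a = \tilde\eta_{ai}\,\partial_{u_i}$, $a = 1, 2, 3$, have to fulfill (7) with $J_a \to \mathcal{J}_a$. Hence, taking into account Theorem 1, we conclude that any covariant realization of the algebra AE(3) is equivalent to the following one:

$$P_a = \partial_{x_a}, \qquad J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c} + j_{ab}(u)\,\partial_{x_b} + \mathcal{J}_a, \qquad a = 1, 2, 3, \tag{30}$$

the operators $\mathcal{J}_a$ being given by one of the Formulae (12)–(14). Making a transformation

$$y_a = x_a + F_a(u), \qquad v_i = u_i, \qquad a = 1, 2, 3, \quad i = 1, \ldots, n,$$

we reduce the operators $J_a$ from (30) to be

$$\begin{aligned} J_1 &= -y_2 \partial_{y_3} + y_3 \partial_{y_2} + A\, \partial_{y_1} + B\, \partial_{y_2} + C\, \partial_{y_3} + \mathcal{J}_1,\\ J_2 &= -y_3 \partial_{y_1} + y_1 \partial_{y_3} + F\, \partial_{y_2} + G\, \partial_{y_3} + \mathcal{J}_2,\\ J_3 &= -y_1 \partial_{y_2} + y_2 \partial_{y_1} + H\, \partial_{y_3} + \mathcal{J}_3, \end{aligned} \tag{31}$$

where A, B, C, F, G, H are arbitrary smooth functions of $v_1, \ldots, v_n$. Substituting operators (31) into (7) and equating coefficients of the linearly-independent operators $\partial_{y_1}, \partial_{y_2}, \partial_{y_3}, \partial_{v_1}, \ldots, \partial_{v_n}$ result in the following system of partial differential equations:

$$\begin{aligned} &1)\ \mathcal{J}_2 A = -C, & &6)\ \mathcal{J}_3 C - \mathcal{J}_1 H = G,\\ &2)\ \mathcal{J}_3 F = -B, & &7)\ \mathcal{J}_1 G - \mathcal{J}_2 C = H - A - F,\\ &3)\ \mathcal{J}_3 A = B, & &8)\ \mathcal{J}_3 B = F - A - H,\\ &4)\ \mathcal{J}_1 F - \mathcal{J}_2 B = G, & &9)\ A - F - H = 0.\\ &5)\ \mathcal{J}_2 H - \mathcal{J}_3 G = C, \end{aligned} \tag{32}$$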
Case 1. All operators $\mathcal{J}_1, \mathcal{J}_2, \mathcal{J}_3$ are equal to zero. Then (32) reduces to the system of linear algebraic equations

$$B = C = G = 0, \qquad H - A - F = 0, \qquad F - A - H = 0, \qquad A - F - H = 0,$$

whence it immediately follows that A = F = H = 0. Substituting the above results into Formulae (31) we arrive at realization (24).

Case 2. Suppose now that not all operators $\mathcal{J}_1, \mathcal{J}_2, \mathcal{J}_3$ vanish. Then they are given either by Formulae (13) or by Formulae (14), where one should replace $u_1, \ldots, u_n$ by $v_1, \ldots, v_n$. As in both cases $\mathcal{J}_3 = \partial_{v_1}$, the subsystem of Eqs. 2, 3, 8, 9 of (32) forms a system of linear ordinary differential equations for the functions A, B, F, H with respect to $v_1$. Integrating it we have

$$\begin{aligned} A &= B_0 + B_1 \sin 2v_1 - B_2 \cos 2v_1, & B &= 2B_1 \cos 2v_1 + 2B_2 \sin 2v_1,\\ F &= B_0 + B_2 \cos 2v_1 - B_1 \sin 2v_1, & H &= 2B_1 \sin 2v_1 - 2B_2 \cos 2v_1, \end{aligned} \tag{33}$$
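where $B_0, B_1, B_2$ are arbitrary smooth functions of $v_2, \ldots, v_n$.

Subcase 2.1. Let the operators $\mathcal{J}_1, \mathcal{J}_2, \mathcal{J}_3$ be of the form (13). Then, making a transformation

$$\begin{aligned} z_1 &= y_1 + R_1 \cos v_1 + R_2 \sin v_1,\\ z_2 &= y_2 + R_2 \cos v_1 - R_1 \sin v_1,\\ z_3 &= y_3 + \tfrac12 (R_{2 v_2} + \tan v_2\, R_2) \cos 2v_1 - \tfrac12 (R_{1 v_2} + \tan v_2\, R_1) \sin 2v_1 + \tfrac12 (\tan v_2\, R_2 - R_{2 v_2}), \end{aligned}$$

where the functions $R_1, R_2$ are solutions of the system of partial differential equations

$$R_{1 v_2} + \tan v_2\, R_1 = -2B_2, \qquad R_{2 v_2} + \tan v_2\, R_2 = 2B_1,$$

we reduce operators (31) with A, B, F, H given by (33) to the form

$$\begin{aligned} J_1 &= -z_2 \partial_{z_3} + z_3 \partial_{z_2} + \tilde A\, \partial_{z_1} + \tilde C\, \partial_{z_3} + \mathcal{J}_1,\\ J_2 &= -z_3 \partial_{z_1} + z_1 \partial_{z_3} + \tilde A\, \partial_{z_2} + \tilde G\, \partial_{z_3} + \mathcal{J}_2,\\ J_3 &= -z_1 \partial_{z_2} + z_2 \partial_{z_1} + \mathcal{J}_3. \end{aligned} \tag{34}$$

Here $\tilde A, \tilde C, \tilde G$ are arbitrary smooth functions of $v_1, \ldots, v_n$ and, moreover, $\tilde A$ does not depend on $v_1$. Given such a form of the operators $J_a$, system (32) reduces to three differential equations

$$\mathcal{J}_2 \tilde A = -\tilde C, \qquad \mathcal{J}_1 \tilde A = \tilde G, \qquad \mathcal{J}_1 \tilde G - \mathcal{J}_2 \tilde C = -2\tilde A. \tag{35}$$

Inserting the expressions for the operators $\mathcal{J}_1, \mathcal{J}_2$ from (13) into the first two equations we have

$$\tilde C = -\sin v_1\, \tilde A_{v_2}, \qquad \tilde G = -\cos v_1\, \tilde A_{v_2}.$$

Substituting the above formulae into the third equation of system (35) we conclude that it is equivalent to the differential equation

$$\tilde A_{v_2 v_2} - \tan v_2\, \tilde A_{v_2} + 2\tilde A = 0,$$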
whose general solution is given by (27). Finally, inserting the results obtained into (34) we get Formulae (25).

Subcase 2.2. Let the operators $\mathcal{J}_1, \mathcal{J}_2, \mathcal{J}_3$ be of the form (14). Then, making a transformation

$$\begin{aligned} z_1 &= y_1 + R_1 \cos v_1 + R_2 \sin v_1,\\ z_2 &= y_2 + R_2 \cos v_1 - R_1 \sin v_1,\\ z_3 &= y_3 + \tfrac12 (R_{2 v_2} - \sec v_2\, R_{1 v_3} + \tan v_2\, R_2) \cos 2v_1 - \tfrac12 (R_{1 v_2} + \sec v_2\, R_{2 v_3} + \tan v_2\, R_1) \sin 2v_1\\ &\quad + \tfrac12 (\tan v_2\, R_2 - \sec v_2\, R_{1 v_3} - R_{2 v_2}), \end{aligned}$$

where the functions $R_1, R_2$ are solutions of the system of partial differential equations

$$2B_1 = R_{2 v_2} - \sec v_2\, R_{1 v_3} + \tan v_2\, R_2, \qquad 2B_2 = -R_{1 v_2} - \sec v_2\, R_{2 v_3} - \tan v_2\, R_1,$$

we reduce operators (31) with A, B, F, H given by (33) to the form (34), where $\tilde A, \tilde C, \tilde G$ are arbitrary smooth functions and, moreover, $\tilde A$ does not depend on $v_1$. Given such a form of the operators $J_a$, system (32) reduces to the three differential equations (35). Inserting the expressions for the operators $\mathcal{J}_1, \mathcal{J}_2$ from (14) into the first two equations of (35) we have

$$\begin{aligned} \tilde C &= -\sin v_1\, \tilde A_{v_2} - \cos v_1 \sec v_2\, \tilde A_{v_3},\\ \tilde G &= -\cos v_1\, \tilde A_{v_2} + \sin v_1 \sec v_2\, \tilde A_{v_3}. \end{aligned} \tag{36}$$
Substituting the above formulae into the third equation from (35), after some algebra we arrive at the conclusion that it is equivalent to Eq. (28). Inserting (36) into (34) yields Formulae (26). Thus we have proved that if LVFs $P_a, J_a$ form a covariant realization of the Euclid algebra AE(3), then they can be reduced to one of the forms (24)–(26) by means of an invertible transformation (3). The theorem is proved.

While proving Theorem 2, we have established, in particular, that any realization of the Euclid algebra satisfying condition (8) can be transformed to become

$$P_a = \partial_{x_a}, \qquad J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c} + j_{ab}(u)\,\partial_{x_b} + \tilde\eta_{ai}(u)\,\partial_{u_i}, \qquad a = 1, 2, 3.$$

If we choose in the above formulae

$$j_{ab}(u) = 0, \qquad \tilde\eta_{ai}(u) = -\Lambda_{aij}\, u_j, \qquad a, b = 1, 2, 3, \quad i = 1, \ldots, n,$$

where $\Lambda_{aij} = \text{const}$, then the following realization is obtained:

$$P_a = \partial_{x_a}, \qquad J_a = -\varepsilon_{abc}\, x_b\, \partial_{x_c} + \mathcal{J}_a, \qquad a = 1, 2, 3, \tag{37}$$

with $\mathcal{J}_a = -\Lambda_{aij}\, u_j\, \partial_{u_i}$.
A realization of the Euclid algebra with generators of the form (37) is called a covariant realization in the classical linear representation theory. That is why it is natural to preserve the same terminology for a realization of the algebra AE(3) within the class of LVFs obeying (8).

As an illustration of Theorem 2 we will demonstrate how to reduce the realizations of the Euclid algebra forming the symmetry algebras of the heat, wave, Laplace, Navier–Stokes, Lamé, Weyl, Dirac and Maxwell equations to one of the three canonical forms (24)–(26). First of all, we note that realization (24) is exactly the one realized on the sets of solutions of the linear and nonlinear heat (Schrödinger), wave and Laplace equations. The symmetry algebras of the Navier–Stokes and Lamé equations contain as a subalgebra the Euclid algebra having basis elements (37), where (see, e.g. [6])

$$\mathcal{J}_a = -\varepsilon_{abc}\, v_b\, \partial_{v_c}, \qquad a = 1, 2, 3. \tag{38}$$

The change of variables

$$v_1 = u_3 \sin u_1 \cos u_2, \qquad v_2 = u_3 \cos u_1 \cos u_2, \qquad v_3 = u_3 \sin u_2$$

reduces these LVFs to the form (25) with f = 0. Next, if we consider the Weyl equation as a system of four real equations for four real-valued functions $v_1, v_2, w_1, w_2$, then on the set of its solutions realization (37) of the algebra AE(3) is realized, where [3, 7]

$$\begin{aligned} \mathcal{J}_1 &= \tfrac12 (w_2 \partial_{v_1} - v_1 \partial_{w_2} + w_1 \partial_{v_2} - v_2 \partial_{w_1}),\\ \mathcal{J}_2 &= \tfrac12 (v_2 \partial_{v_1} - v_1 \partial_{v_2} + w_2 \partial_{w_1} - w_1 \partial_{w_2}),\\ \mathcal{J}_3 &= \tfrac12 (w_1 \partial_{v_1} - v_1 \partial_{w_1} + v_2 \partial_{w_2} - w_2 \partial_{v_2}). \end{aligned} \tag{39}$$
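Making the change of variables

$$\begin{aligned} v_1 &= u_4 \Bigl( \sin\frac{u_1}{2} \sin\frac{u_2}{2} \cos\frac{u_3}{2} + \cos\frac{u_1}{2} \cos\frac{u_2}{2} \sin\frac{u_3}{2} \Bigr),\\ v_2 &= u_4 \Bigl( \cos\frac{u_1}{2} \cos\frac{u_2}{2} \cos\frac{u_3}{2} - \sin\frac{u_1}{2} \sin\frac{u_2}{2} \sin\frac{u_3}{2} \Bigr),\\ w_1 &= u_4 \Bigl( \cos\frac{u_1}{2} \sin\frac{u_2}{2} \cos\frac{u_3}{2} - \sin\frac{u_1}{2} \cos\frac{u_2}{2} \sin\frac{u_3}{2} \Bigr),\\ w_2 &= u_4 \Bigl( \sin\frac{u_1}{2} \cos\frac{u_2}{2} \cos\frac{u_3}{2} + \cos\frac{u_1}{2} \sin\frac{u_2}{2} \sin\frac{u_3}{2} \Bigr) \end{aligned}$$

reduces the above LVFs to the form (26) with g = 0. On the solution set of the Maxwell equations the realization (37) of the Euclid algebra is realized, where [19]

$$\mathcal{J}_a = -\varepsilon_{abc} \bigl( E_b\, \partial_{E_c} + H_b\, \partial_{H_c} \bigr), \qquad a = 1, 2, 3.$$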
This realization is reduced to the form (26) with g = 0 with the help of the change of variables

$$\begin{aligned} E_1 &= u_6 \sin u_1 \cos u_2,\\ E_2 &= u_6 \cos u_1 \cos u_2,\\ E_3 &= u_6 \sin u_2,\\ H_1 &= u_4 (\cos u_1 \sin u_3 + \sin u_1 \sin u_2 \cos u_3) + u_5 \sin u_1 \cos u_2,\\ H_2 &= u_4 (\cos u_1 \sin u_2 \cos u_3 - \sin u_1 \sin u_3) + u_5 \cos u_1 \cos u_2,\\ H_3 &= -u_4 \cos u_2 \cos u_3 + u_5 \sin u_2. \end{aligned}$$
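Taking the Dirac matrices $\gamma_\mu$ in the Majorana representation we can represent the Dirac equation as a system of eight real equations for eight real-valued functions $\psi_1^0, \ldots, \psi_1^3, \psi_2^0, \ldots, \psi_2^3$ (for details, see e.g. [7]). With this choice of γ-matrices, the realization (37) of the Euclid algebra with

$$\begin{aligned} \mathcal{J}_1 &= \tfrac12 \bigl( \psi_1^3 \partial_{\psi_1^0} + \psi_1^2 \partial_{\psi_1^1} - \psi_1^1 \partial_{\psi_1^2} - \psi_1^0 \partial_{\psi_1^3} + \psi_2^3 \partial_{\psi_2^0} + \psi_2^2 \partial_{\psi_2^1} - \psi_2^1 \partial_{\psi_2^2} - \psi_2^0 \partial_{\psi_2^3} \bigr),\\ \mathcal{J}_2 &= \tfrac12 \bigl( -\psi_1^2 \partial_{\psi_1^0} + \psi_1^3 \partial_{\psi_1^1} + \psi_1^0 \partial_{\psi_1^2} - \psi_1^1 \partial_{\psi_1^3} - \psi_2^2 \partial_{\psi_2^0} + \psi_2^3 \partial_{\psi_2^1} + \psi_2^0 \partial_{\psi_2^2} - \psi_2^1 \partial_{\psi_2^3} \bigr),\\ \mathcal{J}_3 &= \tfrac12 \bigl( \psi_1^1 \partial_{\psi_1^0} - \psi_1^0 \partial_{\psi_1^1} + \psi_1^3 \partial_{\psi_1^2} - \psi_1^2 \partial_{\psi_1^3} + \psi_2^1 \partial_{\psi_2^0} - \psi_2^0 \partial_{\psi_2^1} + \psi_2^3 \partial_{\psi_2^2} - \psi_2^2 \partial_{\psi_2^3} \bigr) \end{aligned}$$

is realized on the set of solutions of the Dirac equation. Making the change of variables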
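$$\begin{aligned} \psi_1^0 &= u_4 \Bigl( \cos\frac{u_1}{2} \cos\frac{u_2}{2} \sin\frac{u_3}{2} + \sin\frac{u_1}{2} \sin\frac{u_2}{2} \cos\frac{u_3}{2} \Bigr),\\ \psi_1^1 &= u_4 \Bigl( \sin\frac{u_1}{2} \cos\frac{u_2}{2} \sin\frac{u_3}{2} - \cos\frac{u_1}{2} \sin\frac{u_2}{2} \cos\frac{u_3}{2} \Bigr),\\ \psi_1^2 &= -u_4 \Bigl( \cos\frac{u_1}{2} \cos\frac{u_2}{2} \cos\frac{u_3}{2} - \sin\frac{u_1}{2} \sin\frac{u_2}{2} \sin\frac{u_3}{2} \Bigr),\\ \psi_1^3 &= -u_4 \Bigl( \sin\frac{u_1}{2} \cos\frac{u_2}{2} \cos\frac{u_3}{2} + \cos\frac{u_1}{2} \sin\frac{u_2}{2} \sin\frac{u_3}{2} \Bigr),\\ \psi_2^0 &= u_5 \Bigl( \sin\frac{u_1}{2} \sin\frac{u_2}{2} \sin\frac{u_3 + u_6}{2} - \cos\frac{u_1}{2} \cos\frac{u_2}{2} \cos\frac{u_3 + u_6}{2} \Bigr)\\ &\quad + u_7 \Bigl( \sin\frac{u_1}{2} \cos\frac{u_2}{2} \sin\frac{u_3 + u_8}{2} - \cos\frac{u_1}{2} \sin\frac{u_2}{2} \cos\frac{u_3 + u_8}{2} \Bigr),\\ \psi_2^1 &= -u_5 \Bigl( \sin\frac{u_1}{2} \cos\frac{u_2}{2} \cos\frac{u_3 + u_6}{2} + \cos\frac{u_1}{2} \sin\frac{u_2}{2} \sin\frac{u_3 + u_6}{2} \Bigr)\\ &\quad - u_7 \Bigl( \sin\frac{u_1}{2} \sin\frac{u_2}{2} \cos\frac{u_3 + u_8}{2} - \cos\frac{u_1}{2} \cos\frac{u_2}{2} \sin\frac{u_3 + u_8}{2} \Bigr),\\ \psi_2^2 &= -u_5 \Bigl( \cos\frac{u_1}{2} \cos\frac{u_2}{2} \sin\frac{u_3 + u_6}{2} + \sin\frac{u_1}{2} \sin\frac{u_2}{2} \cos\frac{u_3 + u_6}{2} \Bigr)\\ &\quad + u_7 \Bigl( \cos\frac{u_1}{2} \sin\frac{u_2}{2} \sin\frac{u_3 + u_8}{2} + \sin\frac{u_1}{2} \cos\frac{u_2}{2} \cos\frac{u_3 + u_8}{2} \Bigr),\\ \psi_2^3 &= u_5 \Bigl( \cos\frac{u_1}{2} \sin\frac{u_2}{2} \cos\frac{u_3 + u_6}{2} - \sin\frac{u_1}{2} \cos\frac{u_2}{2} \sin\frac{u_3 + u_6}{2} \Bigr)\\ &\quad - u_7 \Bigl( \cos\frac{u_1}{2} \cos\frac{u_2}{2} \cos\frac{u_3 + u_8}{2} - \sin\frac{u_1}{2} \sin\frac{u_2}{2} \sin\frac{u_3 + u_8}{2} \Bigr) \end{aligned}$$

reduces the above realization to the form (26) with g = 0.

5. Covariant Realizations of the Lie Algebra of the Group E(4)

We recall that the basis elements of the Lie algebra of the Euclid group E(4) fulfill the following commutation relations:

$$[P_\alpha, P_\beta] = 0, \tag{40}$$
$$[J_{\mu\nu}, P_\alpha] = \delta_{\mu\alpha} P_\nu - \delta_{\nu\alpha} P_\mu, \tag{41}$$
$$[J_{\alpha\beta}, J_{\mu\nu}] = \delta_{\alpha\mu} J_{\beta\nu} + \delta_{\beta\nu} J_{\alpha\mu} - \delta_{\alpha\nu} J_{\beta\mu} - \delta_{\beta\mu} J_{\alpha\nu}, \tag{42}$$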
where α, β, µ, ν = 1, 2, 3, 4. Using the results of the previous sections and the fact that the Lie algebra of the rotation group O(4) is the direct sum of two algebras AO(3), we will obtain a description of covariant realizations of the Lie algebra (40)–(42) within the class of LVFs

$$P_\mu = \xi_{\mu\nu}(x, u)\,\partial_{x_\nu} + \eta_{\mu i}(x, u)\,\partial_{u_i}, \qquad J_{\mu\nu} = \xi_{\mu\nu\alpha}(x, u)\,\partial_{x_\alpha} + \eta_{\mu\nu i}(x, u)\,\partial_{u_i}$$

with $J_{\mu\nu} = -J_{\nu\mu}$. Here the indices µ, ν, α take the values 1, 2, 3, 4 and the index i takes the values 1, . . . , n. As we consider covariant realizations, the mutually commuting operators $P_\mu$ satisfy (20) with N = 4. Hence due to Lemma 1 it follows that they can be reduced to the form $P_\mu = \partial_{x_\mu}$, µ = 1, 2, 3, 4. Next, using the commutation relations (41) we establish that the operators $J_{\mu\nu}$ have the following structure:

$$J_{\mu\nu} = x_\nu \partial_{x_\mu} - x_\mu \partial_{x_\nu} + f_{\mu\nu\alpha}(u)\,\partial_{x_\alpha} + g_{\mu\nu i}(u)\,\partial_{u_i} \tag{43}$$

with arbitrary sufficiently smooth $f_{\mu\nu\alpha}, g_{\mu\nu i}$. In what follows we will restrict our considerations to the case when in (43) $f_{\mu\nu\alpha} \equiv 0$. This means geometrically that the transformation groups generated by the operators $J_{\mu\nu}$ in the space of independent variables are standard rotations in the planes $(x_\mu, x_\nu)$. With this restriction the LVFs $J_{\mu\nu}$ take the form

$$J_{\mu\nu} = x_\nu \partial_{x_\mu} - x_\mu \partial_{x_\nu} + \mathcal{J}_{\mu\nu}, \tag{44}$$

where

$$\mathcal{J}_{\mu\nu} = g_{\mu\nu i}(u)\,\partial_{u_i} \tag{45}$$

and, furthermore, $g_{\mu\nu i}(u) = -g_{\nu\mu i}(u)$.
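Inserting LVFs (44) into (42) we come to the conclusion that the operators $\mathcal{J}_{\mu\nu}$ satisfy the commutation relations of the Lie algebra of the rotation group O(4),

$$[\mathcal{J}_{\alpha\beta}, \mathcal{J}_{\mu\nu}] = \delta_{\alpha\mu} \mathcal{J}_{\beta\nu} + \delta_{\beta\nu} \mathcal{J}_{\alpha\mu} - \delta_{\alpha\nu} \mathcal{J}_{\beta\mu} - \delta_{\beta\mu} \mathcal{J}_{\alpha\nu}. \tag{46}$$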
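The sign conventions in (42) and (46) can be sanity-checked on the standard rotation generators $x_\nu \partial_{x_\mu} - x_\mu \partial_{x_\nu}$ appearing in (44). The sketch below is an illustration rather than part of the paper: it encodes each such linear vector field by a 4×4 integer matrix and tests the bracket identity for all index combinations. The function names (`generator`, `bracket`, `rhs_46`, `violations_of_46`) are assumptions made for this example, and indices run over 0..3 instead of 1..4.

```python
def generator(mu, nu):
    """Matrix A of the linear field x_nu d/dx_mu - x_mu d/dx_nu.

    The coefficient of d/dx_i is sum_j A[i][j] * x_j (indices 0..3).
    """
    A = [[0] * 4 for _ in range(4)]
    if mu != nu:
        A[mu][nu] = 1
        A[nu][mu] = -1
    return A

def matmul(A, B):
    """Plain 4x4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def bracket(A, B):
    """Matrix of the commutator of the linear fields with matrices A, B.

    For X = (A x).d/dx and Y = (B x).d/dx one has [X, Y] = ((BA - AB) x).d/dx.
    """
    BA, AB = matmul(B, A), matmul(A, B)
    return [[BA[i][j] - AB[i][j] for j in range(4)] for i in range(4)]

def rhs_46(a, b, m, n):
    """Right-hand side of (46) for the index quadruple (a, b, m, n)."""
    d = lambda i, j: 1 if i == j else 0
    terms = [(d(a, m), generator(b, n)), (d(b, n), generator(a, m)),
             (-d(a, n), generator(b, m)), (-d(b, m), generator(a, n))]
    return [[sum(c * G[i][j] for c, G in terms) for j in range(4)]
            for i in range(4)]

def violations_of_46():
    """Index quadruples for which the bracket differs from (46)."""
    idx = range(4)
    return [(a, b, m, n) for a in idx for b in idx for m in idx for n in idx
            if bracket(generator(a, b), generator(m, n)) != rhs_46(a, b, m, n)]

print(len(violations_of_46()))
```

Because the fields are linear, the computation is exact in integer arithmetic, so an empty list of violations confirms the identity for these particular generators.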
An exhaustive description of inequivalent realizations of the above Lie algebra within the class of LVFs (45) is given below. It is based on the results of Sect. 3 and on the well-known fact that the algebra AO(4) is decomposed into the direct sum of two algebras AO(3). This is achieved by choosing the basis of AO(4) in the following way:

$$J_a^{\pm} = \frac12 \Bigl( \frac12\, \varepsilon_{abc}\, \mathcal{J}_{bc} \pm \mathcal{J}_{a4} \Bigr), \tag{47}$$

where the indices a, b, c take the values 1, 2, 3. Due to (46) the LVFs $J_a^-, J_a^+$ fulfill the following commutation relations:

$$[J_a^+, J_b^+] = \varepsilon_{abc} J_c^+, \tag{48}$$
$$[J_a^+, J_b^-] = 0, \tag{49}$$
$$[J_a^-, J_b^-] = \varepsilon_{abc} J_c^-, \tag{50}$$
which is the same as what was required. Now we are ready to formulate an assertion giving an exhaustive description of LVFs (45) satisfying commutation relations (46) or, equivalently, (48)–(50). Theorem 3. Any realization of the Lie algebra AO(4) within the class of LVFs (45) is given by Formulae (47) and by one of the Formulae 1–6 presented below. 1. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 , J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 , J3+ = ∂u1 ,
J1− = − sin u3 tan u4 ∂u3 − cos u3 ∂u4 , J2− = − cos u3 tan u4 ∂u3 + sin u3 ∂u4 , J3− = ∂u3 ;
2. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 , J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 , J3+ = ∂u1 ,
J1− = − sin u3 tan u4 ∂u3 − cos u3 ∂u4 − sin u3 sec u4 ∂u5 ,
J2− = − cos u3 tan u4 ∂u3 + sin u3 ∂u4 − cos u3 sec u4 ∂u5 ,
J3− = ∂u3 ;
3. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 − sin u1 sec u2 ∂u3 ,
J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 − cos u1 sec u2 ∂u3 ,
J3+ = ∂u1 ,
J1− = sec u2 cos u3 ∂u1 + sin u3 ∂u2 − tan u2 cos u3 ∂u3 ,
J2− = − sec u2 sin u3 ∂u1 + cos u3 ∂u2 + tan u2 sin u3 ∂u3 ,
J3− = ∂u3 ;
4. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 − sin u1 sec u2 ∂u3 ,
J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 − cos u1 sec u2 ∂u3 , J3+ = ∂u1 ,
J1− = − sin u4 tan u5 ∂u4 − cos u4 ∂u5 − sin u4 sec u5 ∂u6 ,
J2− = − cos u4 tan u5 ∂u4 + sin u4 ∂u5 − cos u4 sec u5 ∂u6 , J3− = ∂u4 ;
5. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 − sin u1 sec u2 ∂u3 ,
J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 − cos u1 sec u2 ∂u3 , J3+ = ∂u1 ,
J1− = k sin u4 sec u5 ∂u3 − sin u4 tan u5 ∂u4 − cos u4 ∂u5 , J2− = k cos u4 sec u5 ∂u3 − cos u4 tan u5 ∂u4 + sin u4 ∂u5 , J3− = ∂u4 ;
6. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 − sin u1 sec u2 ∂u3 ,
J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 − cos u1 sec u2 ∂u3 , J3+ = ∂u1 ,
J1− = u6 sin u4 sec u5 ∂u3 − sin u4 tan u5 ∂u4 − cos u4 ∂u5 , J2− = u6 cos u4 sec u5 ∂u3 − cos u4 tan u5 ∂u4 + sin u4 ∂u5 , J3− = ∂u4 ,
where k = const, k ≠ 0.
Proof. We will give the principal steps of the proof omitting intermediate computations. According to Theorem 1, there are two inequivalent realizations of the algebra AO(3) with basis elements J1+ , J2+ , J3+ :
1. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 , J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 , J3+ = ∂u1 ;
2. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 − sin u1 sec u2 ∂u3 ,
J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 − cos u1 sec u2 ∂u3 , J3+ = ∂u1 . (51)
To complete a classification of inequivalent realization of AO(4) we have to find all the triplets of operators J1− , J2− , J3− which together with operators (51) satisfy (49), (50).
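As a numerical cross-check of the two realizations listed in (51), the following sketch evaluates Lie brackets of vector fields by central finite differences and compares them with the AO(3) relations [J1, J2] = J3, [J2, J3] = J1, [J3, J1] = J2. The sample points, step size and tolerance are arbitrary choices made here, and the function names are hypothetical; agreement at a sample point is only supporting evidence, not a proof.

```python
import math

def bracket(X, Y, u, h=1e-5):
    """Components of [X, Y] at the point u, using
    [X, Y]^i = X^j d_j Y^i - Y^j d_j X^i with central differences."""
    n = len(u)

    def along(F, V):
        # derivative of the field F in the direction of the vector V at u
        out = [0.0] * n
        for j in range(n):
            up, um = list(u), list(u)
            up[j] += h
            um[j] -= h
            Fp, Fm = F(up), F(um)
            for i in range(n):
                out[i] += V[j] * (Fp[i] - Fm[i]) / (2 * h)
        return out

    xy, yx = along(Y, X(u)), along(X, Y(u))
    return [xy[i] - yx[i] for i in range(n)]

# First realization in (51): components along (d/du1, d/du2).
def A1(u): return [-math.sin(u[0]) * math.tan(u[1]), -math.cos(u[0])]
def A2(u): return [-math.cos(u[0]) * math.tan(u[1]), math.sin(u[0])]
def A3(u): return [1.0, 0.0]

# Second realization in (51): components along (d/du1, d/du2, d/du3).
def B1(u):
    return [-math.sin(u[0]) * math.tan(u[1]), -math.cos(u[0]),
            -math.sin(u[0]) / math.cos(u[1])]
def B2(u):
    return [-math.cos(u[0]) * math.tan(u[1]), math.sin(u[0]),
            -math.cos(u[0]) / math.cos(u[1])]
def B3(u): return [1.0, 0.0, 0.0]

def is_so3(J1, J2, J3, u, tol=1e-6):
    """True if the three cyclic AO(3) relations hold at u within tol."""
    pairs = [((J1, J2), J3), ((J2, J3), J1), ((J3, J1), J2)]
    return all(
        abs(c - t) < tol
        for (X, Y), Z in pairs
        for c, t in zip(bracket(X, Y, u), Z(u))
    )
```

For example, `is_so3(A1, A2, A3, [0.3, 0.4])` and `is_so3(B1, B2, B3, [0.3, 0.4, 0.5])` test the two realizations at points where tan u2 and sec u2 are finite.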
Analyzing commutation relations (49) we arrive at the following expressions for operators J1− , J2− , J3− :
1. $J_a^- = \sum_{i=3}^{n} f_{ai}(u_3, \dots, u_n)\,\partial_{u_i}$,
2. $J_a^- = \sum_{b=1}^{3} f_{ab}(u_4, \dots, u_n)\,Q_b + \sum_{i=4}^{n} f_{ai}(u_4, \dots, u_n)\,\partial_{u_i}$,
where fij are arbitrary smooth functions and
Q1 = sec u2 cos u3 ∂u1 + sin u3 ∂u2 − tan u2 cos u3 ∂u3 ,
Q2 = − sec u2 sin u3 ∂u1 + cos u3 ∂u2 + tan u2 sin u3 ∂u3 ,
Q3 = ∂u3 .
Note that the operators Qa fulfill the commutation relations of the algebra AO(3). Hence, we conclude that for Case 1 from (51) the operators Ja− are given by the Formulae (51), where one should replace ui by ui+2 , correspondingly. Let us turn now to the second realization of the algebra AO(3) from (51).
Case 1. fai = 0, a = 1, 2, 3, i = 4, . . . , n. In this case we can reduce J1− to the form $J_1^- = \tilde r(u_4, \dots, u_n)\,Q_1$ with the help of the equivalence transformation
$$X \to \tilde X = V X V^{-1}, \qquad V = \exp\Big\{\sum_{a=1}^{3} F_a Q_a\Big\}, \qquad (52)$$
where Fa are some functions of u4 , . . . , un . Note that transformation (52) does not change the form of the operators Ja+ , since [Ja+ , Qb ] = 0, a, b = 1, 2, 3. From commutation relations (50) it follows that r˜ = 1 and furthermore J2− = Q2 , J3− = Q3 . Thus we get the following forms of the operators Ja− : J1− = sec u2 cos u3 ∂u1 + sin u3 ∂u2 − tan u2 cos u3 ∂u3 ,
J2− = − sec u2 sin u3 ∂u1 + cos u3 ∂u2 + tan u2 sin u3 ∂u3 ,
J3− = ∂u3 .
Case 2. Not all fai vanish. Then the operators J1− , J2− , J3− can be transformed to become Ja− = fa (u4 , . . . , un )Q1 + ga (u4 , . . . , un )Q2 + ha (u4 , . . . , un )Q3 + Za , where a = 1, 2, 3, and Z1 = − sin u4 tan u5 ∂u4 − cos u4 ∂u5 − ε sin u4 sec u5 ∂u6 , Z2 = − cos u4 tan u5 ∂u4 + sin u4 ∂u5 − ε cos u4 sec u5 ∂u6 , Z3 = ∂u4 , and ε = 0, 1.
Now using transformation (52) we reduce the operator J3− to the form Z3 = ∂u4 . Next, from commutation relations
[J3− , J1− ] = J2− , [J3− , J2− ] = −J1−
we get
$$J_1^- = \sum_{a=1}^{3} (G_a \cos u_4 + H_a \sin u_4)\,Q_a + Z_1, \qquad J_2^- = \sum_{a=1}^{3} (H_a \cos u_4 - G_a \sin u_4)\,Q_a + Z_2,$$
where Ga , Ha are arbitrary smooth functions of u5 , . . . , un . Making use of the equivalence transformation (52) with Fa being functions of u5 , . . . , un , we can cancel the coefficients Ga . The remaining commutation relation [J1− , J2− ] = J3− yields equations for H1 , H2 , H3 ,
$$\partial_{u_5} H_a - \tan u_5\, H_a = 0,$$
whence
$$H_a = \tilde H_a \sec u_5, \qquad a = 1, 2, 3,$$
$\tilde H_a$ being arbitrary functions of u6 , . . . , un . Consequently, the operators Ja− read
$$J_1^- = \sum_{a=1}^{3} \sin u_4 \sec u_5\, \tilde H_a Q_a + Z_1, \qquad J_2^- = \sum_{a=1}^{3} \cos u_4 \sec u_5\, \tilde H_a Q_a + Z_2, \qquad J_3^- = Z_3.$$
If ε = 1, then using the transformation (52) with Fa being functions of u6 , . . . , un we can cancel $\tilde H_a$, thus getting Ja− = Za , a = 1, 2, 3. If ε = 0, then making use of the transformation (52) with Fa being functions of u6 , . . . , un we can put $\tilde H_1 = \tilde H_2 = 0$. Provided $\tilde H_3 = 0$, we get the realization which is reduced to that given by Formulae 2 from the statement of the theorem. Provided $\tilde H_3 = \mathrm{const} \neq 0$, we get Formulae 5. At last, if $\tilde H_3 \neq \mathrm{const}$, then performing a proper change of variables we arrive at the realization given by Formulae 6 from the statement of the theorem. The theorem is proved.
It follows from the above theorem that Formulae (47) and 1–6 of the statement of Theorem 3 give six inequivalent realizations of the Lie algebra of the Euclid group E(4) having the basis elements Pµ = ∂xµ and (44), (45). To get all possible realizations of the algebra in question belonging to the above class it is necessary to add to the list of realizations of the algebra AO(4) obtained in Theorem 3 the following three realizations of the operators Ja− , Ja+ :
1. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 , J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 , J3+ = ∂u1 , Ja− = 0;
2. J1+ = − sin u1 tan u2 ∂u1 − cos u1 ∂u2 − sin u1 sec u2 ∂u3 , J2+ = − cos u1 tan u2 ∂u1 + sin u1 ∂u2 − cos u1 sec u2 ∂u3 , J3+ = ∂u1 , Ja− = 0;
3. Ja+ = 0, Ja− = 0,
where a = 1, 2, 3. This yields nine inequivalent realizations of the Lie algebra of the group E(4). In particular, the basis generators of the Euclid groups realized on the sets of solutions of the Dirac and self-dual Yang-Mills equations in the Euclidean space R4 are reduced to such a form that the generators of the rotation groups are given by (44), (45), Jµν being adduced in Formulae 4 of the statement of Theorem 3.
6. Concluding Remarks
Summarizing the results of Sects. 3 and 4 yields the following structure of realizations of the Lie algebra of the rotation group by LVFs in n variables:
• If n = 1, then there are no realizations.
• As there is no realization of AO(3) by real non-zero 2×2 matrices, the only realization for the case n = 2 is given by (13). Furthermore, this realization is essentially nonlinear (i.e., it is not equivalent to a realization of the form (9)).
• In the case n = 3 there are two more realizations, given by formula (38) (which is equivalent to (13)) and by formula (14). The latter realization is essentially nonlinear.
• Provided n > 3, there is no new realization of AO(3) and, furthermore, any realization can be reduced to a linear one (say, to (39)).
An evident (and very important) consequence of Theorem 1 is that there are only two inequivalent classes of O(3)-invariant partial differential equations of order r. They are obtained via differential invariants of order not higher than r of the Lie algebras having the basis elements (13), (14). In particular, the Weyl, Maxwell and Dirac equations are special cases of the general system of first-order partial differential equations in n ≥ 8 dependent variables invariant with respect to the algebra (14). We intend to devote one of our future publications to the description of first-order differential invariants of the Lie algebra of the Euclid group E(3) having the basis elements (13), (14) and (37). Let us note that this problem has been completely solved provided the basis elements of AE(3) are given by Formulae (12) [20].
Acknowledgements. One of the authors (R. Zh.) gratefully acknowledges financial support from the Alexander von Humboldt Foundation and from the International Renaissance Foundation.
References
1. Gel’fand, I.M., Minlos, R.A. and Shapiro, Z.Ya.: Representations of the Rotation Group and of the Lorentz Group and Their Applications. New York: Macmillan, 1963
2. Barut, A.O. and Raczka, R.: Theory of Group Representations and Their Applications. Warszawa: Polish Scientific Publ., 1984
3. Fushchych, W.I. and Nikitin, A.G.: Symmetry of Equations of Quantum Mechanics. New York: Allerton Press, 1994
4. Ovsjannikov, L.V.: Group Analysis of Differential Equations. New York: Academic Press, 1982
5. Olver, P.J.: Applications of Lie Groups to Differential Equations. New York: Springer, 1986
6. Fushchych, W.I., Shtelen, W.M. and Serov, N.I.: Symmetry Analysis and Exact Solutions of Nonlinear Equations of Mathematical Physics. Kiev: Naukova Dumka, 1989 (translated into English by Dordrecht: Kluwer Academic Publishers, 1993)
7. Fushchych, W.I. and Zhdanov, R.Z.: Nonlinear Spinor Equations: Symmetry and Exact Solutions. Kiev: Naukova Dumka, 1992
8. Lie, S.: Theorie der Transformationsgruppen. Vol. 3, Leipzig: Teubner, 1893
9. Rideau, G. and Winternitz, P.: Evolution equations invariant under two-dimensional space-time Schrödinger group. J. Math. Phys. 34, N 2, 558–570 (1993)
10. Zhdanov, R.Z. and Fushchych, W.I.: On new representations of Galilei groups. J. Non. Math. Phys. 4, N 3–4, 417–424 (1997)
11. Yehorchenko, I.A.: Nonlinear representation of the Poincaré algebra and invariant equations. In: Symmetry Analysis of Equations of Mathematical Physics, Kiev, Ukraine: Math. Acad. of Sci., 1992, pp. 62–66
12. Fushchych, W.I., Tsyfra, I.M. and Boyko, W.M.: Nonlinear representations for Poincaré and Galilei algebras and nonlinear equations for electro-magnetic field. J. Non. Math. Phys. 1, N 2, 210–221 (1994)
13. Rideau, G. and Winternitz, P.: Nonlinear equations invariant under the Poincaré, similitude and conformal groups in two-dimensional space-time. J. Math. Phys. 31, N 9, 1095–1105 (1990)
14. Lahno, V.I.: On the new representations of the Poincaré and Euclid groups. Proc. Acad. of Sci. Ukraine N 8, 14–19 (1996)
15. Fushchych, W.I. and Cherniha, R.M.: Galilei-invariant nonlinear systems of evolution equations. J. Phys. A: Math. Gen. 28, N 19, 5569–5579 (1995)
16. Fushchych, W.I., Lahno, V.I. and Zhdanov, R.Z.: On nonlinear representations of the conformal algebra AC(2, 2). Proc. Acad. of Sci. Ukraine, N 9, 44–47 (1993)
17. Fushchych, W.I., Zhdanov, R.Z. and Lahno, V.I.: On linear and nonlinear representations of the generalized Poincaré groups in the class of Lie vector fields. J. Non. Math. Phys. 1, N 3, 295–308 (1994)
18. Fushchych, W.I. and Zhdanov, R.Z.: Symmetries and Exact solutions of Nonlinear Dirac Equations. Kyiv: Mathematical Ukraina Publ., 1997
19. Fushchych, W.I. and Nikitin, A.G.: Symmetries of Maxwell’s Equations. Dordrecht: Reidel, 1987
20. Fushchych, W.I. and Yegorchenko, I.A.: Second-order differential invariants of the rotation group O(n) and of its extention E(n), P (1, n). Acta Appl. Math. 28, N 1, 69–92 (1992)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 212, 557 – 569 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Periodic Instantons and the Loop Group Paul Norbury Department of Mathematics and Statistics, University of Melbourne, Parkville 3052, Australia. E-mail:
[email protected] Received: 10 August 1998 / Accepted: 30 January 2000
Abstract: We construct a large class of periodic instantons. Conjecturally we produce all periodic instantons. This confirms a conjecture of Garland and Murray that relates periodic instantons to orbits of the loop group acting on an extension of its Lie algebra.
1. Introduction
Periodic instantons are solutions of the anti-self-dual equations $F_B = -{*}F_B$ for a connection B on a trivial vector bundle with structure group G over $S^1 \times \mathbb{R}^3$. In this paper, G is a compact Lie group with complexification $G^c$ equipped with a representation acting on $\mathbb{C}^n$ that is unitary on G. Put $B = A + \Phi\,d\theta$ so
$$*F_A = d_A\Phi - \mu\,\partial_\theta A, \qquad (1)$$
where we use the three-dimensional Hodge star operator and µ is the reciprocal of the radius of the circle. One can think of the connection and Higgs field as defined over $\mathbb{R}^3$ and dependent on the circle-valued θ. Nahm studied periodic instantons, calling them calorons [17]. Later, Garland and Murray studied periodic instantons from the twistor viewpoint [7]. To remedy the fact that there was so far no existence theorem for periodic instantons nor an understanding of the topology of the moduli space of instantons (if they were to exist), they conjectured that periodic instantons can be constructed using holomorphic spheres in a flag manifold associated to the loop group. This conjecture is confirmed by the main result of this paper, Theorem 1. Recently the study of super-symmetric Yang–Mills theory over $S^1_{1/\mu} \times \mathbb{R}^3$ has been used as further evidence for the existence of dualities in physical theories. In [21] Seiberg
and Witten obtained a result for periodic instantons analogous to their 1994 work on instantons, [20], by studying the limiting behaviour when µ → 0 and µ → ∞. This led to the Rozansky-Witten invariants [19]. We will not discuss these developments here.
2. Loop Groups
Define LG to be the group of smooth gauge transformations of the trivial G-bundle over the circle. Equivalently, LG is the space of smooth maps from $S^1$ to the compact Lie group G. Following [7], intertwine the gauge transformations with the isometries of the circle to get the twisted product $\widetilde{LG} = LG\,\tilde{\times}\,S^1$, where the action of $S^1$ is given by rotation. It has Lie algebra $\widetilde{L\mathfrak{g}} = L\mathfrak{g} \oplus \mathbb{R}d$ with Lie bracket
$$[X + xd,\ Y + yd] = [X, Y] - y\,\partial X/\partial\theta + x\,\partial Y/\partial\theta.$$
Put $\hat A = A + a\,d$, $\hat\Phi = \Phi + \phi\,d$. Then the Bogomolny equations over $\mathbb{R}^3$ for this pair are given by
$$*F_{\hat A} = d_{\hat A}\hat\Phi. \qquad (2)$$
The d-component is given by $*da = d\phi$ so a finite energy condition will force a = 0 and φ = constant = µ, say. The remaining part of (2) is then (1). Thus, one can think of periodic instantons as monopoles over $\mathbb{R}^3$ with structure group LG. Monopoles for finite-dimensional groups are well-studied [10, 16, 18]. In particular, the topology of the moduli space of monopoles is understood. The moduli space of monopoles with structure group G is diffeomorphic to the space of holomorphic maps from the two-sphere to a homogeneous space of G, or equivalently to an adjoint orbit of G [4, 12]. In analogy with the finite-dimensional case this led Garland and Murray to conjecture that periodic instantons are in one-to-one correspondence with based holomorphic maps from $S^2$ to orbits of $\widetilde{LG}$ in $\widetilde{L\mathfrak{g}}$. The following theorem addresses half of this conjecture. The action of $\widetilde{LG}$ is really an action of LG. For $(\xi, \mu) \in \widetilde{L\mathfrak{g}}$ denote its orbit by LG · (ξ, µ).
Theorem 1. There is an injective map from
(i) the space of based holomorphic maps from $S^2$ to LG · (ξ, µ), to
(ii) the moduli space of instantons over $S^1_{1/\mu} \times \mathbb{R}^3$.
The basing condition on the space of holomorphic maps distinguishes an element of the orbit of LG that is conjecturally the asymptotic value of the Higgs field. See Sect. 6. The moduli space consists of gauge equivalence classes of connections where the gauge group consists of gauge transformations independent of θ in the limit at infinity. The full conjecture, that the map is also surjective, is equivalent to a conjecture for decay properties of finite energy periodic instantons analogous to known decay properties for monopoles. We discuss this in Sect. 6. Theorem 1 can be thought of as an extension of [13] from finite dimensional Lie groups to the loop group.
2.1. Orbits of the loop group. The loop group LG acts on $\widetilde{L\mathfrak{g}}$ by
$$\gamma \cdot (\xi, \mu) = (\gamma \cdot \xi - \mu\,\gamma'\gamma^{-1},\ \mu).$$
For ξ = 0 the orbit is given by the based loop group ΩG. More generally, we get $LG \cdot (\xi, \mu) \cong LG/Z_\xi$, where the isotropy subgroup $Z_\xi$ is described explicitly in the following proposition.
Proposition 2.1 (Pressley and Segal). For $\pi_1 G = 0$ and µ ≠ 0 the orbits of LG on $\widetilde{L\mathfrak{g}}$ correspond precisely to the conjugacy classes of G under the map $(\xi, \mu) \mapsto M_\xi \in G$, where $M_\xi$ is obtained by solving the ordinary differential equation $h'h^{-1} = -\mu^{-1}\xi$ and noticing $h(\theta + 2\pi) = h(\theta)M_\xi$. The isotropy subgroup of ξ is given by
$$Z_\xi = \{\gamma \in LG \mid \gamma(0) \in C[M_\xi],\ \gamma(\theta) = h(\theta)\gamma(0)h(\theta)^{-1}\}, \qquad (3)$$
where C[Mξ ] is the centraliser of the conjugacy class of Mξ in G. Equivalently, the orbits are given by gauge equivalence classes of connections on a trivialised bundle over the circle of radius 1/µ. Each orbit is labeled by the underlying connection which is determined by its holonomy. In the next section we will equip the orbit of the loop group with a complex structure. 2.2. Loop groups and flat connections. Donaldson [5] re-interpreted elements of the loop group in terms of holomorphic bundles over the disk framed on the boundary, and the factorisation theorem in terms of flat connections on these bundles. He showed that each framed holomorphic bundle over the disk possesses a unique Hermitian-Yang–Mills (flat) connection. Theorem 2.2 (Donaldson). There is a 1 − 1 correspondence between (i) holomorphic bundles over D framed over ∂D; (ii) unitary Hermitian-Yang–Mills connections over D on a bundle with a unitary framing over ∂D. Donaldson’s argument generalises to parabolic bundles – holomorphic bundles over the disk with a flag specified over the origin [15]. In this case the flat connection must be singular at the origin. Proposition 2.3. There is a 1 − 1 correspondence between (i) parabolic bundles over D framed over ∂D; (ii) unitary Hermitian-Yang–Mills connections over D − {0} on a bundle with a unitary framing over ∂D. The singularity of the connection at 0 encodes the flag at 0. Following Donaldson, we can re-interpret this result in terms of a factorisation theorem for loop groups as follows. A parabolic bundle over the disk has an underlying trivial holomorphic bundle and a trivialisation compared to the framing over the boundary produces a loop γ ∈ LGc . Any other trivialisation that preserves the parabolic structure at 0 ∈ D changes γ by an element of L+ P – those loops that are boundary values of holomorphic maps from the disk to GL(n, C) with value at 0 lying in P . 
So (i) in the statement of Proposition 2.3 is equivalent to choosing an element of LGc /L+ P .
A unitary Hermitian-Yang–Mills (or, equivalently, flat) connection over D − {0} is determined uniquely by the parabolic structure at 0 ∈ D. (This would not be true if there was more than one puncture.) With respect to the unitary framing over the boundary, the flat connection defines an element of the orbit $LG \cdot \xi \in \widetilde{L\mathfrak{g}}$. We saw in the previous section that the orbit is isomorphic to $LG/Z_\xi$. Thus we get the following restatement of Proposition 2.3.
Corollary 2.4. For any $\xi \in L\mathfrak{g}$ we have $LG^c/L^+P \cong LG/Z_\xi$.
We could have proven the factorisation theorem in a different way. In the special case that $Z_\xi$ consists of only constant loops then Corollary 2.4 follows from the standard factorisation theorem for loop groups. In general, each orbit of LG possesses a nice representative which simplifies the isotropy subgroup to consist only of constant loops so the general case follows from the special case. The importance of the treatment here is that at the same time as establishing a complex structure on the orbit space, ξ remains the natural base-point for the holomorphic map and we get an interpretation of the orbit space in terms of flat connections over the disk on a bundle framed over the boundary. In the next section we will see how a holomorphic map from $S^2$ into a space of flat connections is related to an instanton over an associated four-manifold.
3. Instantons and Holomorphic Maps into Spaces of Flat Connections
Atiyah showed that there is a one-to-one correspondence between instantons over the four-sphere and holomorphic maps from the two-sphere to the loop group [1]. The interpretation of elements of the loop group in terms of flat connections means that Atiyah’s result can be viewed as a relationship between instantons and holomorphic maps from the two-sphere to a space of flat connections. This approach was exploited in [14]. Another result of this type was obtained by Dostoglou and Salamon [6] in their proof of the Atiyah-Floer conjecture. They showed that the instanton Floer homology associated to the three-manifold given by a mapping torus $S^1 \tilde{\times} \Sigma$ is the same as the symplectic Floer homology of the space of flat connections over Σ. The relationship between instantons and holomorphic maps into spaces of flat connections can be understood as follows. Suppose that locally a four-manifold is given by a product of two complex curves U × V equipped with the product metric. The anti-self-dual equations with respect to local coordinates {w} × {z} are given by:
$$[\partial^A_{\bar w}, \partial^A_{\bar z}] = 0, \qquad [\partial^A_{\bar z}, \partial^A_z] = \rho(w, z)\,[\partial^A_{\bar w}, \partial^A_w], \qquad (4)$$
where ρ(w, z) depends on the metrics on U and V. Let $f : U \to \mathcal{M}_V$ be a holomorphic map from U into the space of flat connections $\mathcal{M}_V$ over V. (The conformal structure on V equips the space of flat connections with a natural complex structure.) Define a connection over U × V by
A = df + f (w), (5)
where df is a Lie algebra valued 1-form over U × V and f (w) is a flat connection over {w} × V. Then A satisfies the following equations which resemble (4):
$$[\partial^A_{\bar w}, \partial^A_{\bar z}] = 0, \qquad [\partial^A_{\bar z}, \partial^A_z] = 0. \qquad (6)$$
The first equation is equivalent to the holomorphic condition on the map f and the second equation uses the fact that f maps to a space of flat connections. We can think of the second equation of each of (4) and (6) as a type of moment map. One can move from solutions of (6) to solutions of (4) using the Yang–Mills flow, as we do in this paper or, say, by using the implicit function theorem. In order to apply this to periodic instantons we exploit the conformal invariance of the anti-self-dual equations. Let Σ be the punctured disk $D^2 - \{0\}$ equipped with the complete hyperbolic metric $|dz|^2/(|z| \ln |z|)^2$. There is a conformal equivalence
$$S^1 \times (\mathbb{R}^3 - \{0\}) \simeq S^2 \times \Sigma,$$
where $S^1 \times (\mathbb{R}^3 - \{0\})$ is equipped with the flat metric and $S^2 \times \Sigma$ is equipped with the product metric
$$ds^2 = \frac{4\,d\bar w\,dw}{(1 + |w|^2)^2} + \frac{d\bar z\,dz}{|z|^2 (\ln |z|)^2}. \qquad (7)$$
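The conformal factor can be made explicit with a short computation. The following is a sketch under assumptions not spelled out in the text: the circle is taken to have radius 1 (µ = 1), r denotes the radial coordinate on $\mathbb{R}^3 - \{0\}$, w is a stereographic coordinate on the unit two-sphere, and the identification is taken to be $z = e^{-r + i\theta}$.

```latex
\begin{align*}
g_{\mathrm{flat}} &= d\theta^2 + dr^2 + r^2 g_{S^2}
   = r^2\Big(\frac{dr^2 + d\theta^2}{r^2} + g_{S^2}\Big),
   \qquad g_{S^2} = \frac{4\,|dw|^2}{(1+|w|^2)^2},\\
z &= e^{-r+i\theta}:\qquad dz = z\,(-dr + i\,d\theta),\qquad
   |dz|^2 = |z|^2\,(dr^2 + d\theta^2),\qquad \ln|z| = -r,\\
\frac{|dz|^2}{|z|^2(\ln|z|)^2} &= \frac{dr^2 + d\theta^2}{r^2}
   \quad\Longrightarrow\quad g_{\mathrm{flat}} = r^2\, ds^2 .
\end{align*}
```

Under these assumptions r ∈ (0, ∞) corresponds to 0 < |z| < 1, so the origin of $\mathbb{R}^3$ is sent to the boundary circle |z| = 1 and infinity to the puncture z = 0.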
On $S^2 \times \Sigma$ the anti-self-dual equations are given by (4) with
$$\rho(w, z) = \left(\frac{1 + |w|^2}{|z| \ln |z|}\right)^2.$$
Our course is set. We have shown that a holomorphic map from $S^2$ to LG · (ξ, µ) is the same as a holomorphic map from $S^2$ to a space of flat connections which gives an approximate instanton over $S^2 \times \Sigma$. In Sect. 4 we will use rather standard techniques to move from an approximate instanton to an exact one. Under the conformal equivalence described above, this instanton will correspond to a periodic instanton.
3.1. Approximate instantons. Beginning with a holomorphic map from the two-sphere to an orbit of LG, we will construct an approximate instanton over $S^1 \times \mathbb{R}^3$. This will be an explicit realisation of (5). The map $f : S^2 \to LG/Z_\xi$ is holomorphic when $f^{-1}\partial_{\bar w} f : S^2 \to L^+\mathfrak{p}$, where $L^+\mathfrak{p} \subset L^+\mathfrak{g}^c$ is given by those loops that extend to a holomorphic map of the disk whose value at the origin lies in $\mathfrak{p}$. Put η equal to the holomorphic extension of $f^{-1}\partial_{\bar w} f$ to the disk. Over $S^2 \times \Sigma = \{(w, z)\}$, define a connection
$$A = \eta\,d\bar w - H_\xi^{-1}\eta^* H_\xi\,dw + i\xi\,dz/z \qquad (8)$$
which is Hermitian with respect to the Hermitian metric
$$H_\xi = \exp(i\xi \ln z)^* \exp(i\xi \ln z) \qquad (9)$$
and flat on each {w} × D. Over $S^1 \times \mathbb{R}^3$ in a radially-free gauge we get:
$$(A, \Phi) = (\exp(i\xi r)\,\eta\,\exp(-i\xi r)\,d\bar w - \exp(-i\xi r)\,\eta^* \exp(i\xi r)\,dw,\ \xi).$$
Furthermore,
$$*F_A = d_A\Phi - \mu\,\partial_\theta A + (1 + |w|^2)^2 F_{w\bar w}\,dr/r^2 \qquad (10)$$
which resembles the periodic instanton equation, (1). 4. Construction In this section we will use the Yang–Mills flow to move from the “approximate” periodic instanton (8) to an exact one. Instead of working directly with the connections, we will follow Donaldson [3] and work with a Hermitian metric on a holomorphic bundle which gives a Hermitian connection. In fact, we will work with a pair (H, η) consisting of a Hermitian metric H on a holomorphic bundle and a map η : S 2 × D 2 → gc that is holomorphic in the second factor. A connection A is obtained from the pair (H, η) by: A = H −1 ∂z H dz + η(w, z)d w¯ + (H −1 ∂w H − H 1 η(w, z)H )dw.
(11)
Associate to the pair (H, η) the Hermitian-Yang–Mills tensor B(H, η) = |z|2 (ln |z|)2 ∂z¯ (H −1 ∂z H ) + (1 + |w|2 )2 {∂w¯ (H −1 ∂w H ) −∂w¯ (H −1 η∗ H ) − ∂w η + [η, H −1 ∂w H − H −1 η∗ H ]}.
When B(H, η) ≡ 0, the connection (11) is anti-self-dual. Following Donaldson [3] we study the heat flow for the Hermitian metric H in place of the Yang–Mills flow for the associated connection. Since the Hermitian metrics we deal with here are not bounded we need to extend Donaldson’s results and their generalisations due to Simpson [23]. Essentially we need to understand properties of the Laplacian of the Kähler manifold $S^2 \times \Sigma$ with metric (7) and properties of the initial Hermitian metric (9). Similar results specialised to other non-compact Kähler manifolds exist in [8, 14].
4.1. The heat flow. Associate to a holomorphic map $f : S^2 \to LG/Z_\xi$ the map $\eta : S^2 \times D^2 \to \mathfrak{g}^c$ given by the holomorphic extension of $f^{-1}\partial_{\bar w} f$ to the disks in the second factor. We would like to construct a Hermitian metric H that satisfies the equation B(H, η) = 0. This would produce an anti-self-dual connection associated to the map f. Consider the heat flow equation over $S^2 \times \Sigma$,
$$H^{-1}\partial H/\partial t = B(H, \eta), \qquad H(w, z, 0) = H_\xi, \qquad (12)$$
where $H_\xi$ is defined in (9). A solution of (12) will converge to the required solution of B(H, η) = 0 as t → ∞. Instead of solving (12) we will work with a family of boundary value problems. Put $S^2 \times \Sigma_{\epsilon,\delta} = \{(w, z) \in S^2 \times \Sigma \mid \epsilon \le |z| \le \delta\}$ so the $S^2 \times \Sigma_{\epsilon,\delta}$ exhaust $S^2 \times \Sigma$ as δ → 1 and ε → 0.
Proposition 4.1. Over each $S^2 \times \Sigma_{\epsilon,\delta}$ there is a unique solution of the boundary value problem
$$H^{-1}\partial H/\partial t = B(H, \eta), \qquad H(w, z, 0) = H_\xi, \qquad H|_{\partial(S^2 \times \Sigma_{\epsilon,\delta})} = H_\xi \qquad (13)$$
given by $H^{\epsilon,\delta}(w, z, t)$ and converging to a smooth metric $H^{\epsilon,\delta}(w, z, \infty)$ that satisfies $B(H^{\epsilon,\delta}(w, z, \infty), \eta) = 0$.
Proof. Since we have fixed $S^2 \times \Sigma_{\epsilon,\delta}$ for the moment we will omit the superscript in $H^{\epsilon,\delta}(w, z, t)$ during this proof. Short-time existence of a solution of (13) is automatic since B(H, η) is elliptic in H and we have Dirichlet boundary conditions. In order to extend this to long-time existence we will take the approach given by Donaldson [3] and extended by Simpson [23] and show that a solution on [0, T) gives a limit at T which is a good initial condition to start the flow again. The lemmas we need to prove on the way use the details of our particular case and allow us to proceed with Donaldson’s proof. A Hermitian metric H takes its values in the space $G^c/G$ which comes equipped with the complete metric d given locally by $\mathrm{tr}(H^{-1}\delta H)^2$. Following Donaldson, we will use both this metric and the convenient function
$$\sigma(H_1, H_2) = \mathrm{tr}(H_1^{-1}H_2) + \mathrm{tr}(H_1 H_2^{-1}) - 2n$$
that satisfies $c_1 d^2 \le \sigma \le c_2 d^2$ for constants $c_1$, $c_2$. (Aside: if we take the loop group perspective described in [7], then a Hermitian metric takes its values in the space $LG^c/LG$. We have not checked that this is a complete metric space.)
Lemma 4.2. If $H_1$ and $H_2$ are two solutions of the heat equation then
$$\partial_t\sigma + \Delta\sigma \le 0 \qquad (14)$$
for $\sigma = \sigma(H_1, H_2)$.
Proof. See [14].
Apply (14) to H(w, z, t) and H(w, z, t + τ), the flow at two times. Since they obey the same boundary conditions on $S^2 \times \Sigma_{\epsilon,\delta}$, σ vanishes on the boundary. By the maximum principle $\sup_{S^2 \times \Sigma_{\epsilon,\delta}} \sigma$ is a non-increasing function of t. By continuity, for any ρ > 0 there exists a τ small enough so that
$$\sup_{S^2 \times \Sigma_{\epsilon,\delta}} \sigma(H(w, z, t), H(w, z, t')) < \rho$$
for 0 < t, t' < τ. It follows from the non-increasing property of σ that
$$\sup_{S^2 \times \Sigma_{\epsilon,\delta}} \sigma(H(w, z, t), H(w, z, t')) < \rho$$
for T − τ < t, t' < T. Since ρ can be made arbitrarily small, H(w, z, t) is a Cauchy sequence in the $C^0$ norm as t → T. The metrics take their values in a complete metric space (described above) and the function σ acts like the metric so there is a continuous limit $H_T$ of the sequence. Notice also that (14) and the maximum principle show that this short-time solution to the heat flow equation is unique.
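The function σ used in the argument above behaves like a squared distance: it is symmetric in its arguments, vanishes when the two metrics coincide, and is positive otherwise. A minimal numerical sketch follows, assuming real symmetric positive-definite 2×2 matrices as a stand-in for Hermitian metrics (so n = 2); the sample matrices and helper names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(m):
    return m[0][0] + m[1][1]

def sigma(H1, H2):
    """sigma(H1, H2) = tr(H1^-1 H2) + tr(H1 H2^-1) - 2n with n = 2."""
    n = 2
    return tr(mul2(inv2(H1), H2)) + tr(mul2(H1, inv2(H2))) - 2 * n

# Two sample positive-definite matrices (illustrative values only).
H1 = [[2.0, 0.5], [0.5, 1.0]]
H2 = [[1.0, 0.2], [0.2, 3.0]]
```

Writing the eigenvalues of $H_1^{-1}H_2$ as $\lambda_i > 0$, one has $\sigma = \sum_i (\lambda_i + \lambda_i^{-1} - 2) \ge 0$, which is why `sigma(H1, H2)` is positive for distinct metrics and `sigma(H1, H1)` vanishes up to rounding.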
Using the heat equation and the metric on $G^c/G$, we have
$$d(H(w, z, t), H(w, z, 0)) \le \int_0^t |B(H(w, z, s), \eta)|\,ds,$$
where $|B(H(w, z, s), \eta)|^2 = \mathrm{tr}(B^*B)$ and the adjoint is taken with respect to the metric $H_s$. Notice that $B^* = B$ so $|B(H(w, z, s), \eta)|^2 = \mathrm{tr}(B^2)$.
Lemma 4.3. If H(w, z, t) is a solution of the heat equation then
$$(d/dt + \Delta)\,|B(H(w, z, t), \eta)| \le 0 \qquad (15)$$
whenever |B| > 0.
Proof. See [14].
The next two lemmas use the particular features of the Kähler manifold $S^2 \times \Sigma$ together with the initial Hermitian metric $H_\xi$ to get $C^0$ control on H(w, z, t) during the flow.
Lemma 4.4. When η is the holomorphic extension of $f^{-1}\partial_{\bar w} f$, for a given holomorphic map $f : S^2 \to LG/Z_\xi$, there exists a constant M such that $|B(H_\xi, \eta)| \le M(1 - |z|)$ on $S^2 \times \Sigma$.
Proof. $B(H_\xi, \eta) = -(1 + |w|^2)^2\,(\partial_w\eta + \partial_{\bar w}(H_\xi^{-1}\eta^* H_\xi) + [\eta, H_\xi^{-1}\eta^* H_\xi])$, and since [η(0), ξ] = 0, $|B(H_\xi, \eta)|$ is bounded near z = 0. Since f takes its values in the unitary loop group and $H_\xi = I$ on |z| = 1, we can identify $B(H_\xi, \eta)$ with the curvature of a flat connection which is 0. Furthermore, $B(H_\xi, \eta)$ is continuous and differentiable up to |z| = 1 so it vanishes like 1 − |z| there.
Lemma 4.5. There is a constant C independent of ε and δ such that $d(H^{\epsilon,\delta}(w, z, t), H_\xi) \le C \ln(1 - \ln |z|)$ for all $(w, z, t) \in S^2 \times \Sigma_{\epsilon,\delta} \times \mathbb{R}$.
Proof. It follows from (15) and the maximum principle that if there is a function b(w, z, t) defined on $S^2 \times \Sigma_{\epsilon,\delta} \times \mathbb{R}$ that satisfies $(\partial_t + \Delta)b = 0$ and $|B(H_\xi, \eta)| \le b(w, z, 0)$, then $|B(H(w, z, t), \eta)| \le b(w, z, t)$ for all t. Put b(w, z, 0) = M(1 − |z|). Notice that b(w, z, 0) = b(|z|), so we only need use the one-dimensional Laplacian and b(w, z, t) = b(|z|, t). From the flow equation (13) we have
$$d(H(w, z, t), H_\xi(w, z)) \le \int_0^t |B(H(w, z, \tau))|\,d\tau \le \int_0^t b(w, z, \tau)\,d\tau \le \int_0^\infty b(w, z, \tau)\,d\tau. \qquad (16)$$
Now, $b(|z|, t) = \int b(s, 0)\,k(|z|, s, t)\,ds$, where k is the one-dimensional heat kernel operator. Since $\int_0^\infty k(|z|, s, t)\,dt = G(|z|, s)$, the Green’s operator, is finite, Fubini’s theorem allows us to interchange the order of integration in (16). So
$$d(H(w, z, t), H_\xi(w, z)) \le M \int_\epsilon^\delta (1 - s)\,G(|z|, s)\,ds \le M \int_0^1 (1 - s)\,G(|z|, s)\,ds.$$
With respect to the Laplacian
$$\Delta = -(1 + |w|^2)^2\,\partial_{\bar w}\partial_w - 4|z|^2 (\ln |z|)^2\,\partial_{\bar z}\partial_z = -(\ln |z|)^2\,\partial^2_{\ln |z|}$$
reduced to one dimension, the Green’s operator is given by
$$G(|z|, s) = \min\{-\ln |z|, -\ln s\}/s(\ln s)^2.$$
Actually, this Green’s operator is only valid for the entire interval (ε = 1) and Fubini’s theorem doesn’t apply there. There is a monotone property of heat kernels which means that our choice of G is simply an overestimate when ε < 1 so the calculation is valid. Thus
$$d(H(w, z, t), H_\xi(w, z)) \le M\left(-\ln |z| \int_0^{|z|} \frac{(1 - s)\,ds}{s(\ln s)^2} - \int_{|z|}^1 \frac{(1 - s)\,ds}{s \ln s}\right) \le C \ln(1 - \ln |z|),$$
where the last inequality simply encodes the fact that the distance vanishes as |z| → 1 and grows like ln(1 − ln |z|) as |z| → 0.
The preceding lemmas have shown that there is a solution to the heat equation that satisfies H(w, z, t) → H(w, z, T) in $C^0$ and H(w, z, t) is uniformly bounded with bound independent of t (though depending on ε). These are the conditions required to use Simpson’s extension of Donaldson’s result to show that H(w, z, t) is bounded in $L^p_2$ uniformly in t. Hamilton’s methods [9] then give control of all higher Sobolev norms. Thus we get a solution, H(w, z, t), of (13) for all t that converges to a smooth limit $H^{\epsilon,\delta}(w, z, \infty)$ defined on $S^2 \times \Sigma_{\epsilon,\delta}$ and satisfying $B(H^{\epsilon,\delta}(w, z, \infty), \eta) = 0$ and $H^{\epsilon,\delta}(w, z, \infty) = H_\xi$ on $\partial(S^2 \times \Sigma_{\epsilon,\delta})$ so Proposition 4.1 is proven.
Proposition 4.6. For each holomorphic map $f : S^2 \to LG/Z_\xi$ there is a periodic instanton $A_f$ on $S^1 \times \mathbb{R}^3$.
Proof. We have proven the existence of a family of Hermitian metrics $H^{\epsilon,\delta}$ respectively defined over $S^2 \times \Sigma_{\epsilon,\delta}$ and satisfying $B(H^{\epsilon,\delta}, \eta) = 0$. Since $\sigma(H^{\epsilon,\delta}, H^{\epsilon',\delta'})$ is subharmonic its maximum occurs at the boundary of the set on which it is defined. For $\epsilon' \le \epsilon \le \delta \le \delta'$, the common set is $S^2 \times \Sigma_{\epsilon,\delta}$. If we fix $\epsilon = \epsilon'$ and let δ → 1, then σ = 0 on |z| = ε and the maximum of σ occurs on |z| = δ. Since the metrics σ and d on $G^c/G$ are equivalent, the maximum value of σ is bounded by a constant times $d(H^{\epsilon,\delta}, H_\xi) \le C \ln(1 - \ln \delta)$ using Lemma 4.5. This tends to 0 as δ → 1, thus we have a Cauchy sequence that converges uniformly to a Hermitian metric $H^\epsilon$ defined on |z| ≥ ε. The convergence can be improved to $L^p_2$ to ensure that $B(H^\epsilon, \eta) = 0$ [23].
In order to deal with ε → 0, notice that since ln |z| is harmonic on S^2 × #, σ + a ln |z| is subharmonic for any a. Put a = sup_{|z|=ε} σ/| ln ε|. Then σ + a ln |z| ≤ 0 on |z| = 1 and |z| = ε. Thus

$$\sigma \le -\ln|z|\,\sup_{|z|=\varepsilon}\frac{\sigma}{|\ln\varepsilon|}. \tag{17}$$
By Lemma 4.5, d(H^{ε,δ}, H_ξ) ≤ C ln(1 − ln ε), so σ = o(| ln ε|) as ε → 0. Thus the right-hand side of (17) tends uniformly to 0 on compact sets away from z = 0. Again we conclude that the {H^ε} form a Cauchy sequence as ε → 0, converging uniformly on the complement of any neighbourhood of S^2 × {0} to a Hermitian metric H that satisfies B(H, η) = 0 on S^2 × #. Using S^1 × (R^3 − {0}) ≅ S^2 × # we see that the limit H is smooth on S^1 × (R^3 − {0}) and continuous on all of S^1 × R^3, converging to I on S^1 × {0}. The connection A obtained from H via (11) is defined and anti-self-dual on S^1 × (R^3 − {0}). By the following lemma, A has finite charge. Since codimension three singularities of finite charge anti-self-dual connections can be removed [22], A is smooth on all of S^1 × R^3.

Lemma 4.7. The curvature of the limiting connection A has finite L^2 norm.

Proof. The Yang–Mills flow decreases the L^2 norm of a connection, and any bubbling in the limit just decreases the L^2 norm further, so it is sufficient to show that the initial connection has finite L^2 norm. For any connection A, we have
$$8\pi^2\,\|F_A\|_2^2 = 2\int |F_A^+|^2 - \int F_A\wedge F_A, \tag{18}$$

where F_A^+ is the self-dual part of the curvature. We can calculate this explicitly for the initial connection defined in (8). Notice that F_A^+ = B(H_ξ, η) and by Lemma 4.4 we have |B(H_ξ, η)| ≤ M(1 − |z|). This is square-integrable over S^2 × # since S^2 is compact and # has finite area near z = 0 and grows like 1/(1 − |z|)^2 near |z| = 1. As one might expect, the topological term in (18) will coincide with the topological degree of the map f : S^2 → LG/Z_ξ.
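$$k(E) = \frac{1}{8\pi^2}\int_{S^2\times D}\mathrm{tr}(F_A^2) = -\frac{1}{8\pi^2}\int_{S^2\times D}\mathrm{tr}(\partial_{\bar z}\eta^*\,\partial_z\eta)\,d\bar z\,dz\,d\bar w\,dw,$$

since only the F_{z\bar w} and F_{\bar z w} terms contribute. Since η is holomorphic in z, on the disk d{tr(η^* ∂_z η) dz} = tr(∂_{\bar z} η^* ∂_z η) d\bar z dz, so

$$k(E) = -\frac{1}{8\pi^2}\int_{S^2}\int_{|z|=1}\mathrm{tr}(\eta^*\,\partial_z\eta)\,dz\,d\bar w\,dw = \frac{1}{4\pi}\int_{S^2}\bigl\|f^{-1}\partial_{\bar w}f\bigr\|^2\,\frac{d\bar w\,dw}{2i},$$

where ‖f^{-1} ∂_{\bar w} f‖ uses the Kähler metric on LG/Z_ξ. This expression is the degree of f.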
Remark. In the construction of this section we started with parabolic bundles over the disk. The converse, however, fails: a periodic instanton does not give rise to a family of parabolic bundles. By this we mean that the holomorphic structure defined on each punctured disk by the restriction of the periodic instanton does not extend to the entire disk. The curvature just fails to satisfy F_A ∈ L^p for p > 1 as required in [2].
5. Injection

In this section we will show that the map produced in Sect. 4 is injective.

Proposition 5.1. Let f : S^2 → LG/Z_ξ and g : S^2 → LG/Z_ν be two based holomorphic maps. Then the instantons A_f and A_g are gauge equivalent precisely when ν − ξ is in the root lattice and g = f · exp(i(ν − ξ) ln z).

Proof. The instanton A_f is given by the expression (11), which depends on a pair (H, η) consisting of a Hermitian metric, H, and the holomorphic extension of f^{-1} ∂_{\bar w} f, denoted by η, and likewise for A_g. These expressions are independent of the unitary gauge, so A_f ∼ A_g only if A_f = A_g or possibly if we have used different holomorphic trivialisations of the holomorphically trivial bundle restricted to each {w} × # for A_f and A_g.

If A_f = A_g then f^{-1} ∂_{\bar w} f = η = g^{-1} ∂_{\bar w} g, so ∂_{\bar w}(g f^{-1}) = 0 and this is global over S^2, thus g = γ(z) f for some loop γ(z) independent of w. The requirement that f and g map ∞ ∈ S^2 to the constant loop I forces γ(z) = I.

If A_f ≠ A_g and A_f ∼ A_g then A_f uses the pair (H, η) in (11) and A_g uses the pair (p^* H p, p^{-1} η p + p^{-1} ∂_{\bar w} p) for a map p : S^2 × # → G^c which is holomorphic on each {w} × # and unitary on its boundary. Note that this implies that g = f p, though since p is not a priori in L^+P, the maps f and g can be distinct. The proof of the proposition is completed by the following two lemmas, which show that g = f p together with the known growth of the Hermitian metrics associated to f and g forces p to be constant or to be a standard holomorphic gauge change.

Lemma 5.2. If ξ = ν then A_f ∼ A_g only if f = g u for u ∈ P ∩ G ≅ Z_ξ.

Proof. We can apply Lemma 4.5 to the Hermitian-Yang–Mills metric H over all of S^2 × # even though it is only stated for 0 < ε < δ < 1. Thus d(H, H_ξ) + d(p^* H p, H_ξ) ≤ C ln(1 − ln |z|) for the initial metric H_ξ defined in (9). Using the identity d(p^* H p, H_ξ) = d((p^*)^{-1} H_ξ p^{-1}, H_ξ) and the triangle inequality we have

$$d(H, H_\xi) + d(p^* H p, H_\xi) \ge d\bigl((p^*)^{-1} H_\xi\, p^{-1}, H_\xi\bigr), \tag{19}$$
and the right-hand side is bounded by C ln(1 − ln |z|) only if p is bounded near z = 0 by C ln(1 − ln |z|). Since p then satisfies lim_{z→0} z p(z) = 0, it extends across z = 0 and is holomorphic there. Furthermore we must have p(0) ∈ P in order that the right-hand side of (19) is bounded by C ln(1 − ln |z|). Since p is holomorphic on the disk and unitary on the boundary it must be unitary on the disk (by the maximum principle applied to the subharmonic function tr(p^* p) + tr((p^* p)^{-1})), and thus constant there, and moreover lie in P ∩ G.

Lemma 5.3. If A_f ∼ A_g then ν − ξ lies in the root lattice and g = f exp(i(ν − ξ) ln z).
Proof. As described above, g = f p. Then lim_{z→0} z p^{-1} ∂_z p = ν − ξ. Since z p^{-1} ∂_z p is bounded and holomorphic on the punctured disk, it extends to a holomorphic function of z on the disk. In fact p^{-1} ∂_z p = q(z)/z, so p(z) = exp(∫ q(ζ) dζ/ζ) and ν − ξ = q(0) must lie in the integer lattice. Thus p · exp(−i(ν − ξ) ln z) is holomorphic on the disk and unitary on the boundary and hence constant, which we absorb in the unitary ambiguity of f. So g = f · exp(i(ν − ξ) ln z).

The proposition allowed for gauge transformations that have angular dependence at infinity (corresponding to z = 0). When we restrict the gauge transformations to have no angular dependence at infinity, the maps f and f · exp(i(ν − ξ) ln z) define inequivalent connections. Thus the map f → A_f is injective.

6. Boundary Conditions

There are natural boundary conditions that the periodic instantons constructed in this paper conjecturally satisfy: as r → ∞, − ξ = O(1/r), ∂ − ξ /∂ = O(1/r^2), ∇( − ξ ) = O(1/r^2), where ξ is a given constant Higgs field, r is the radial coordinate in R^3, ∂/∂ is an angular derivative, and the asymptotic constants are uniform in θ. In order to prove these conditions we would need to understand the precise elliptic constants for the Hermitian Yang–Mills Laplacian on S^2 × # near the puncture at z = 0. This would enable us to get estimates on the second derivatives of H from the estimates on H given in this paper and estimates on first derivatives of H obtained from a maximum principle argument [5]. We hope to show this in future work. Alternatively, one might prove the stronger conjecture that all finite energy periodic instantons satisfy these boundary conditions. Such a proof would again require a good understanding of the Laplacian on S^2 × # as in the special case of monopoles [11]. This stronger conjecture implies that the construction of this paper yields all periodic instantons.
This can be proven by using a scattering argument to retrieve a holomorphic map from S 2 to an orbit of the loop group from a given periodic instanton. Acknowledgements. I would like to thank Michael Murray for useful discussions and the University of Adelaide for its hospitality over a period when part of this work was carried out.
References

1. Atiyah, M.F.: Instantons in two and four dimensions. Commun. Math. Phys. 93, 437–451 (1984)
2. Biquard, O.: Fibrés paraboliques stables et connexions singulières plates. Bull. Soc. Math. France 119, 231–257 (1991)
3. Donaldson, S.K.: Anti-self-dual Yang–Mills connections over complex algebraic surfaces and stable vector bundles. Proc. London Math. Society 30, 1–26 (1985)
4. Donaldson, S.K.: Nahm’s equations and the classification of monopoles. Commun. Math. Phys. 96, 387–407 (1984)
5. Donaldson, S.K.: Boundary value problems for Yang–Mills fields. J. Geom. and Phys. 8, 89–122 (1992)
6. Dostoglou, S. and Salamon, D.: Self-dual instantons and holomorphic curves. Ann. of Math. 139, 581–640 (1994)
7. Garland, H. and Murray, M.K.: Kac-Moody monopoles and periodic instantons. Commun. Math. Phys. 120, 335–351 (1988)
8. Guo, G.-Y.: On an analytic proof of a result by Donaldson. Int. J. Math. 7, 1–17 (1996)
9. Hamilton, R.S.: Harmonic maps of manifolds with boundary. Lecture Notes in Math. 471, New York: Springer, 1975
10. Hitchin, N.J.: On the construction of monopoles. Commun. Math. Phys. 89, 145–190 (1983)
11. Jaffe, A. and Taubes, C.H.: Vortices and monopoles. Boston: Birkhäuser, 1980
12. Jarvis, S.: Euclidean monopoles and rational maps. Proc. LMS 77, 170–192 (1998)
13. Jarvis, S.: Monopoles to rational maps via radial scattering. Preprint (1996)
14. Jarvis, S. and Norbury, P.: Degenerating metrics and instantons on the four-sphere. J. Geom. Phys. 27, 79–98 (1998)
15. Mehta and Seshadri: Parabolic bundles. Math. Ann. 248, 205–239 (1980)
16. Murray, M.K.: Monopoles and spectral curves for arbitrary Lie groups. Commun. Math. Phys. 90, 263–271 (1983)
17. Nahm, W.: Self-dual monopoles and calorons. Lecture Notes in Phys. 201, Berlin: Springer, 1983, pp. 189–200
18. Nahm, W.: The construction of all self-dual multimonopoles by the ADHM method. In: Monopoles in quantum field theory (Trieste), Singapore: World Sci. Pub., 1981, pp. 87–94
19. Rozansky, L. and Witten, E.: Hyper-Kähler geometry and invariants of three-manifolds. Selecta Math. 3, 401–458 (1997)
20. Seiberg, N. and Witten, E.: Electric-magnetic duality, monopole condensation, and confinement in n = 2 supersymmetric Yang–Mills theory. Nuclear Phys. B 426, 19–52 (1994)
21. Seiberg, N. and Witten, E.: Gauge dynamics and compactifications to three dimensions. Adv. Ser. Math. Phys. 24, 333–366 (1997)
22. Sibner, L.M. and Sibner, R.J.: Classification of singular Sobolev connections by their holonomy. Commun. Math. Phys. 144, 337–350 (1992)
23. Simpson, C.T.: Constructing variations of Hodge structure using Yang–Mills theory and applications to uniformization. J. Amer. Math. Soc. 1, 867–918 (1988)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 212, 571 – 590 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Wigner Symbols and Combinatorial Invariants of Three-Manifolds with Boundary

Gaspare Carbone¹, Mauro Carfora²,³, Annalisa Marzuoli²,³

¹ S.I.S.S.A.-I.S.A.S., Via Beirut 2–4, 34013 Trieste, Italy. E-mail: [email protected]
² Dipartimento di Fisica Nucleare e Teorica, Università degli Studi di Pavia, via A. Bassi 6, 27100 Pavia, Italy
³ Istituto Nazionale di Fisica Nucleare, Sezione di Pavia, via A. Bassi 6, 27100 Pavia, Italy. E-mail: [email protected]; [email protected]

Received: 14 December 1998 / Accepted: 30 January 2000
To Giorgio Ponzano and Tullio Regge

Abstract: In this paper we generalize the partition function proposed by Ponzano and Regge in 1968 to the case of a compact 3-dimensional simplicial pair (M, ∂M). The resulting state sum Z[(M, ∂M)] contains both Wigner 6j symbols associated with tetrahedra and Wigner 3jm symbols associated with triangular faces lying in ∂M. In order to show the invariance of Z[(M, ∂M)] under PL-homeomorphisms we exploit some results due to Pachner on the equivalence of n-dimensional PL-pairs both under bistellar moves on n-simplices in the interior of M and under elementary boundary operations (shellings and inverse shellings) acting on n-simplices which have some component in ∂M. We find, in particular, the algebraic identities – involving a suitable number of Wigner symbols – which realize the complete set of Pachner’s boundary operations in n = 3. The results established for the classical SU(2)-invariant Z[(M, ∂M)] are further extended to the case of the quantum enveloping algebra U_q(sl(2, C)) (q a root of unity). The corresponding quantum invariant, Z_q[(M, ∂M)], turns out to be the counterpart of the Turaev–Viro invariant for a closed 3-dimensional PL-manifold.

1. Introduction

The search for combinatorial invariants of compact n-dimensional manifolds (n = 3, 4) plays a key role both in topological lattice field theories and in quantum gravity discretized according to Regge’s prescription [R]. From a historical point of view, the typical examples of this class of models in dimension three are provided in [P-R] and in [T-V] (further developments can be found in [M-T, O92,a, O92,b, C-F-S] and in [C-K-S] for what concerns in particular a categorical approach to the subject). As a matter of fact, all the papers quoted above deal essentially with state sum invariants for a closed 3 or 4-dimensional manifold M.
The interest in dealing with an n-dimensional compact pair (M, ∂M) (where ∂M is the (n − 1)-dimensional boundary manifold of M) relies on the fact that in typical physical situations we have to consider
probability amplitudes between different (n − 1)-dimensional Riemannian geometries which represent the boundary of an n-dimensional (pseudo)Riemannian manifold. Borrowing the language from the Euclidean functional integral approach to the quantization of gravity, we have to evaluate quantities such as ⟨(N_1, h_1) | O | (N_2, h_2)⟩ / ⟨(N_1, h_1) | (N_2, h_2)⟩, where (N_1, h_1) and (N_2, h_2) are (n − 1)-dimensional manifolds equipped with fixed Riemannian metrics h_1 and h_2 respectively, and O is some observable. The symbol ⟨ | ⟩ denotes a functional integration over (a suitable class of) n-dimensional Riemannian metrics, up to diffeomorphisms, interpolating between (N_1, h_1) and (N_2, h_2). If we are interested in studying either the topological sector of quantum gravity or just a topological model, the requirement of taking fixed geometries on the boundaries appears to be much too restrictive. Indeed, a topological n-dimensional field theory, when projected onto an (n − 1)-dimensional boundary, retains its topological character (namely, it is independent of the metric on the boundary as well), and thus topological invariants of pairs (M, ∂M) should come into play in a more natural way. Notice that combinatorial invariants for PL-pairs may appear also in the loop quantum gravity approach, in particular when introducing a spin networks basis (see e.g. [Ro-S, D-R] and references therein).

In this paper we extend both the Ponzano–Regge partition function and the Turaev–Viro invariant to compact 3-dimensional simplicial PL-manifolds with non-empty boundaries. Although this issue has already been addressed some years ago in [K-M-S], our proposal turns out to be quite natural and more closely related to the original idea of a discretized spacetime partition function arising from a recoupling scheme of angular momenta variables, much in the spirit of [P-R, Pe] and [M].
In our approach the presence of a 2-dimensional boundary will be taken into account through the introduction in the state sum of a Wigner 3jm symbol associated with each triangle lying in the boundary itself. (Conversely, in [K-M-S] the authors introduce a new kind of mixed symbol, involving both the angular momenta variables and the vertices in the boundary.) The next step will consist in summing over all angular momenta variables j and simultaneously over all momentum projections, or m-variables; in such a way we get – up to regularization – a state sum Z[(M, ∂M)] for the simplicial pair (M, ∂M) which is the natural counterpart of the Ponzano–Regge partition function in the case of manifolds with boundary (Sect. 3). In order to show the invariance of the above state sum under PL-homeomorphisms we shall go back to a basic theorem (established by Pachner in [P91]) which states that two simplicial compact PL-manifolds of dimension n with non-empty boundaries are PL-homeomorphic if, and only if, they are equivalent under elementary shellings and their inverse operations. Such elementary shellings are topological operations, or moves, which act on simplices that have some components in the boundary of the manifold by deleting – or adding – one n-simplex at a time. Since this theorem and some other related results (see also [P90]) do not seem widely known, we shall briefly discuss them in Sect. 2. In Sect. 4 we show how the three different types of elementary shellings defined in the 3-dimensional case can be associated with identities involving one 6j and four 3jm symbols. (Notice that in [K-M-S] it is necessary to assume that the new symbols satisfy some identities of Biedenharn–Elliott type in order to show that the state sum is actually invariant under a suitable class of subdivisions and isotopies of the boundary.)
As a consequence of such a natural algebrization of the elementary shellings, the state sum Z[(M, ∂M)] will turn out to be automatically invariant under such moves. Moreover,
by Pachner’s results, it will also be an invariant of the PL-structure of the simplicial pair (M, ∂M). In Sect. 5 we present the extension of the model to the case of the quantized enveloping algebra U_q(sl(2, C)) (q a root of unity). The resulting invariant, Z_q[(M, ∂M)], is thus the counterpart of the Turaev–Viro quantum invariant and reduces to it in the case ∂M = ∅ (and to Z[(M, ∂M)] for q → 1). Finally, Sect. 6 contains some remarks concerning possible developments of the methods proposed in the present paper.

2. Equivalence of PL-Manifolds with Boundary Under Pachner’s Elementary Shellings

We first list some well known, but necessary, preliminaries on Piecewise-Linear manifolds (see e.g. [G, R-S]). By a p-simplex σ^p ≡ (x_0, x_1, . . . , x_p) with vertices x_0, x_1, . . . , x_p we mean the subspace of R^d (d > p) defined by

$$\sigma^p \doteq \sum_{i=0}^{p} \lambda_i x_i,$$

where (x_0, x_1, . . . , x_p) are (p + 1) points in general position in R^d, with Σ_i λ_i = 1 and λ_i ≥ 0 for all i.

Definition 1. Let σ^p and τ^q be simplices in R^d with distinct vertices and such that the totality of these vertices is at most (d + 1) and they are in general position in R^d. Then such vertices span a simplex, σ^p τ^q, the join of σ^p and τ^q, defined as the (p + q + 1)-simplex obtained by taking the convex hull in R^d, viz.:

$$\sigma^p\,\tau^q \doteq \mathrm{conv}(\sigma^p \cup \tau^q). \tag{1}$$

A face of a p-simplex σ^p is any simplex the vertices of which are a subset of those of σ^p.

Definition 2. A finite simplicial complex T (or, more precisely, the geometrical realization of an abstract simplicial complex) is a finite collection of simplices in R^d such that: i) if σ^p ∈ T, then so are all of its faces; ii) if σ^p, τ^q ∈ T, then σ^p ∩ τ^q is either a (common) face or is empty. T has dimension n if n is the maximum dimension of its faces. The faces of maximal dimension, σ^n, are called facets of T.

Definition 3. If T_1 and T_2 are simplicial complexes, then the join of T_1 with T_2 is defined according to:

$$T_1\,T_2 \doteq \{\sigma_1\,\sigma_2 \ \text{s.t.}\ \sigma_1 \in T_1,\ \sigma_2 \in T_2\}, \tag{2}$$

where σ_1 σ_2 is given in Definition 1. In particular, the join of a complex T with the empty simplex gives T {∅} = T, while the join of T with the empty complex gives T ∅ = ∅. (Notice however that in the following the join of a complex T with a simplex τ will be denoted by T τ for short.)

A simplicial complex is pure provided that all its facets have the same dimension. The boundary complex of a pure simplicial n-complex T is denoted by ∂T and it is the subcomplex of T the facets of which are the (n − 1)-faces of T which are contained in only one facet of T. The set of the interior faces of T is denoted by int(T) ≐ T \ ∂T. If σ is a simplex, then by B(σ) we mean the complex made up of all the faces of σ, except σ itself. Moreover

$$F(\sigma) \doteq B(\sigma) \cup \{\sigma\} \tag{3}$$

is the complex made up of σ and all its proper faces.
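The combinatorial notions above (faces, facets, purity, the boundary complex ∂T, and the join) can be sketched on abstract simplicial complexes. The following is a minimal illustration, not part of the original paper: a simplex is modelled as a `frozenset` of vertex labels, the empty simplex is kept explicitly so that the two join conventions of Definition 3 can be reproduced, and all function names (`faces`, `closure`, `facets_of`, `is_pure`, `boundary`, `join`) are hypothetical.

```python
from itertools import combinations


def faces(simplex):
    """All faces of a simplex (including the empty simplex and the simplex itself)."""
    verts = sorted(simplex)
    return {frozenset(c) for k in range(len(verts) + 1)
            for c in combinations(verts, k)}


def closure(facets):
    """Simplicial complex generated by a list of facets (empty list -> empty complex)."""
    complex_ = set()
    for f in facets:
        complex_ |= faces(f)
    return complex_


def facets_of(complex_):
    """Maximal simplices, i.e. those not properly contained in another simplex."""
    return {s for s in complex_ if not any(s < t for t in complex_)}


def is_pure(complex_):
    """A complex is pure when all of its facets have the same dimension."""
    return len({len(f) for f in facets_of(complex_)}) == 1


def boundary(complex_):
    """Boundary of a pure n-complex: generated by (n-1)-faces lying in exactly one facet."""
    top = facets_of(complex_)
    size = len(next(iter(top)))          # n + 1 vertices per facet
    ridges = [s for s in complex_ if len(s) == size - 1]
    free = [r for r in ridges if sum(1 for f in top if r < f) == 1]
    return closure(free)


def join(c1, c2):
    """Join of two complexes on disjoint vertex sets, as in Eq. (2)."""
    return {s | t for s in c1 for t in c2}


# A single tetrahedron: its boundary is B(sigma), its only interior face is sigma itself.
tet = closure([{0, 1, 2, 3}])
interior = tet - boundary(tet)
# Joining an edge with a vertex yields the full triangle, cf. Definition 1.
triangle = join(closure([{0, 1}]), closure([{2}]))
```

With this convention `join(T, {frozenset()})` returns `T` (join with the empty simplex) while `join(T, set())` returns the empty set (join with the empty complex), matching the remark at the end of Definition 3.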
Given a (finite) simplicial complex T, consider the set theoretic union |T| ⊂ R^d of all simplices from T, namely

$$|T| \doteq \bigcup_{\sigma \in T} \sigma. \tag{4}$$

Introduce on the set |T| a topology that is the strongest of all topologies in which the embedding of each simplex into |T| is continuous (the set A ⊂ |T| is closed iff A ∩ σ^p is closed in σ^p for any σ^p ∈ T). The topological space |T| is the underlying polyhedron, or geometric carrier, of the simplicial complex T; the polyhedron |T| is said to be triangulated by the simplicial complex T. More generally, a triangulation of a topological space M is a simplicial complex T together with a homeomorphism |T| → M.

Definition 4. A simplicial map f : T_1 → T_2 between two simplicial complexes T_1, T_2 is a continuous map f : |T_1| → |T_2| between the corresponding underlying polyhedra which takes p-simplices to p-simplices for all p. The map f is a simplicial isomorphism if f^{-1} : T_2 → T_1 is also a simplicial map.

A subdivision T′ of T is a simplicial complex such that: i) |T′| ≅ |T|, where ≅ denotes a homeomorphism between topological spaces; ii) each p-simplex of T′ is contained in a p-simplex of T, for every p. A property of a simplicial complex T which is invariant under subdivisions is a combinatorial (or Piecewise Linear) property of T. More precisely:

Definition 5. A PL-homeomorphism

$$f : T_1 \longrightarrow T_2 \tag{5}$$

between two simplicial complexes (of the same dimension) is a map which is a simplicial isomorphism for some subdivisions T_1′ and T_2′ of T_1 and T_2, respectively.

Definition 6. A PL-manifold of dimension n is a polyhedron M ≅ |T|, each point of which has a neighborhood, in M, PL-homeomorphic to an open set in R^n.

PL-manifolds are realized by simplicial manifolds under the equivalence relation generated by PL-homeomorphisms:

Definition 7. Two PL-manifolds M_1 ≅ |T_1| and M_2 ≅ |T_2| are PL-homeomorphic, or

$$M_1 \cong_{PL} M_2, \tag{6}$$

if there exists a map g : M_1 → M_2 which is both a homeomorphism and a simplicial isomorphism, in the sense of Definition 5.

In what follows we shall use the notation

$$T \longrightarrow M \cong |T| \tag{7}$$

to denote a particular triangulation of the closed PL-manifold M and, when dealing with a PL-pair (M, ∂M), we shall write:

$$(T, \partial T) \longrightarrow (M, \partial M) \cong (|T|, |\partial T|), \tag{8}$$

where the triangulation on ∂M is the unique triangulation induced on it by the chosen triangulation T in M. The extension of Definitions 5, 6 and 7 to PL-pairs is quite straightforward and can be found e.g. in [R-S]. Recall (see e.g. [T]) also that a sufficient condition for characterizing a triangulated space as a PL-manifold follows from:
Theorem 1. A simplicial n-complex K is a (simplicial) PL-manifold of dimension n if, for all p-simplices σ^p ∈ K, the link of σ^p, link(σ^p), has the topology of the boundary of the standard (n − p)-simplex, namely if: link(σ^p) ≅ S^{n−p−1} (the (n − p − 1)-dimensional sphere).

In the above statement, link(σ^p) ⊂ K is the union of all faces τ of all simplices in the star of σ satisfying σ ∩ τ = ∅ (the star of σ in K is simply the union of all simplices of which σ is a face). Notice however that in this paper we shall deal only with triangulations underlying PL-manifolds, and thus the content of Theorem 1 will not be discussed any further.

The point that we are going to examine now concerns PL-equivalence of polyhedra. Notice that Definition 7 turns out to be quite difficult to handle in practice, since one should go over and over through subdivisions in order to find isomorphic triangulations. The issue of combinatorial equivalence was first addressed by Alexander in [A], where he proved the following theorem (which indeed holds true for more general complexes too):

Theorem 2. For any polyhedron M which is dimensionally homogeneous (viz., its underlying simplicial complex is pure) any two triangulations of M can be transformed one into the other by a finite sequence of stellar subdivisions and their inverse transformations.

The stellar subdivisions, typically known also as Alexander’s transformations (or moves), are not elementary, in the sense that each one of them involves a variable number of n-simplices of the triangulation T we are considering. Thus, being interested in transformations between different triangulations of a PL-manifold M, one should implement Alexander’s moves over a lot of local arrangements of simplices which cannot be factorized into simpler blocks.
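The link and star used in Theorem 1, and the reason why a stellar subdivision is not an elementary move, can be illustrated with a short sketch. It is not taken from the paper: it assumes the standard description of a stellar subdivision (each facet of the star of σ, written as σ ∪ λ, is replaced by the facets {new vertex} ∪ ρ ∪ λ, with ρ running over the codimension-one faces of σ), and the helper names are hypothetical.

```python
from itertools import combinations


def closure(facets):
    """All non-empty faces of the given facets."""
    return {frozenset(c) for f in facets
            for k in range(1, len(f) + 1)
            for c in combinations(sorted(f), k)}


def star_facets(facets, sigma):
    """Facets of the complex that have sigma as a face."""
    sigma = frozenset(sigma)
    return [frozenset(f) for f in facets if sigma <= frozenset(f)]


def link(facets, sigma):
    """Faces tau of simplices in the star of sigma with tau and sigma disjoint."""
    sigma = frozenset(sigma)
    return {tau for tau in closure(star_facets(facets, sigma))
            if not (tau & sigma)}


def stellar_subdivision(facets, sigma, new_vertex):
    """Stellar subdivision of sigma: only the facets in star(sigma) are modified."""
    sigma = frozenset(sigma)
    result = []
    for f in map(frozenset, facets):
        if sigma <= f:
            lam = f - sigma
            for v in sigma:              # rho = sigma minus one vertex
                result.append(frozenset({new_vertex}) | (sigma - {v}) | lam)
        else:
            result.append(f)
    return result


# A triangulated disk: six triangles around the interior vertex 0.
fan = [{0, i, i % 6 + 1} for i in range(1, 7)]
link_of_vertex = link(fan, {0})      # a hexagon, i.e. a circle S^1 (n = 2, p = 0)
link_of_edge = link(fan, {0, 1})     # two points, i.e. S^0 (n = 2, p = 1)
subdivided = stellar_subdivision(fan, {0, 1}, 7)
```

The number of facets touched by the move equals the size of the star of σ (six for the vertex 0, two for the edge {0, 1} in this example), which is what is meant above by a move involving a variable number of n-simplices.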
On the other hand, owing to Theorem 2, PL-manifolds are mapped homeomorphically into PL-manifolds, and moreover all admissible triangulations of a given M are related to each other by a suitable sequence of Alexander’s moves. The way out of this situation is to look for a different set of moves, which are both elementary (i.e. they involve just a fixed number of simplices in any dimension n) and equivalent to Alexander’s transformations, namely topology – preserving and ergodic (i.e. they must span all the possible triangulations of a given M). A set of moves that shares these requirements for the case of closed n-dimensional PL-manifolds has been found by Pachner: the bistellar elementary operations (see [P87] and also Appendix A of [A-C-M] for an account on this subject in connection with simplicial quantum gravity models in n = 3, 4). Pachner has also introduced a set of moves which are suitable in the case of compact n-dimensional PL-manifolds with a non–empty boundary, the elementary shellings (see [P90] and [P91]). As the term “elementary shelling” suggests, this kind of operation involves the cancellation of one n-simplex (facet) at a time in a given triangulation (T , ∂T ) → (M, ∂M) of a PL-pair of dimension n. In order to be deleted, the facet must have some of its faces lying in the boundary ∂T . Moreover, using Definition 1, we may decompose a facet of this kind (considered now as a complex) into the join of two suitable faces belonging to it. This decomposition is obviously not unique, although in each dimension n there are only a finite number of possibilities of carrying it out (up to relabelling the faces of a given dimension). Definition 8. Let (T , ∂T ) → (M, ∂M) be a triangulation of a PL-pair of dimension n and let σ n be a facet decomposed according to: σn = τ σr,
(9)
where τ is a face of σ n of dimension p ≥ 0 such that τ ∈ int (T ), and the second factor represents a face of σ n of dimension r ≥ 0 with the following property: B(τ ) σ r ⊆ ∂T ,
(10)
where B(τ ) is the complex made up of all the faces of τ except τ itself. Then an elementary r-shelling of (T , ∂T ) is defined according to: . '−σ n (T , ∂T ) = (T , ∂T ) \ {F(τ ) σ r } ≡ (T˜ , ∂ T˜ ), (11) where F(τ ) is given in (3). Notice that the dimension p of τ is given, in terms of n and r, by p = n − r − 1; moreover, if τ is a 0-simplex then B(τ ) = ∅ and the remark at the end of Definition 3 has to be kept in mind. The inverse operation amounts to adding a new facet to (T˜ , ∂ T˜ ) along some faces in ˜ ∂ T , and can be simply defined as . '+σ n (T˜ , ∂ T˜ ) = ('−σ n )−1 (T˜ , ∂ T˜ ). (12) If we set '+ ≡ '+σ n and '− ≡ '−σ n for some facet (or missing facet) σ n , we can establish an equivalence relation between triangulations according to: Definition 9. Two triangulations (T , ∂T ) and (T˜ , ∂ T˜ ) are said to be equivalent under elementary shellings if, and only if, they are connected by a finite number of elementary boundary operations, namely: (T , ∂T ) ≈sh (T˜ , ∂ T˜ ) ⇐⇒ (T˜ , ∂ T˜ ) = 'k± · · · '1± (T , ∂T ),
(13)
where '± are defined in (12) and (11) respectively and k is an integer. Remark 1. It may happen that there exist one face τ ∈ int (T ) and different faces, say σ1r and σ2r , with both B(τ ) σ1r and B(τ ) σ2r in ∂T and such that σ n = τ σ1r , σ n = τ σ2r for a fixed σ n . However, for each σ r belonging to ∂T , there exists at most one τ ∈ int (T ) such that: i) τ σ r is a facet; ii) B(τ ) σ r ⊆ ∂T . Hence the elementary operation '−σ n defined in (11) is uniquely determined by σ r , and thus the number of possible types of elementary shellings (performed on a single facet) is equal to the dimension n of the facet itself, since r = 0, 1, . . . , n − 1. The statement of the main theorem in [P91] can be rewritten in our notation as: Theorem 3. Let (T1 , ∂T1 ) → (M1 , ∂M1 ) and (T2 , ∂T2 ) → (M2 , ∂M2 ) be triangulations of compact n-dimensional manifolds with boundary. Then (M1 , ∂M1 ) and (M2 , ∂M2 ) are PL-homeomorphic if, and only if, (T1 , ∂T1 ) and (T2 , ∂T2 ) are equivalent under elementary shellings, namely: |(T1 , ∂T1 )| ∼ =PL |(T2 , ∂T2 )| ⇐⇒ (T1 , ∂T1 ) ≈sh (T2 , ∂T2 ),
(14)
where |(T1 , ∂T1 )| ∼ = (M1 , ∂M1 ), |(T2 , ∂T2 )| ∼ = (M2 , ∂M2 ), the equivalence ≈sh being in the sense of Definition 9. Notice that in one direction (⇐) the result is quite straightforward and moreover, as a particular application, we may consider different triangulations (T , ∂T ), (T˜ , ∂ T˜ ) of the same PL-pair (M, ∂M). Indeed, Pachner has proved a weaker version of the above result in [P90], namely
Theorem 4. Let (T1 , ∂T1 ) → (M1 , ∂M1 ) and (T2 , ∂T2 ) → (M2 , ∂M2 ) be triangulations of PL, compact n-dimensional pairs. Then |(T1 , ∂T1 )| ∼ =PL |(T2 , ∂T2 )| ⇐⇒ (T1 , ∂T1 ) ≈sh,bst (T2 , ∂T2 ),
(15)
where the equivalence ≈sh,bst is both under elementary shellings and under bistellar elementary operations on n-simplices in int(T1 ) or int(T2 ). Remark 2. The advantage of having to deal with elementary shellings is quite evident (although we shall use Theorem 4 when handling our combinatorial invariants). Moreover, there exists a correspondence between bistellar moves in dimension n and elementary shellings in dimension (n − 1), as discussed in Sect. 6. In this respect, Theorem 3 represents exactly the counterpart of Pachner’s theorem for closed (n − 1)-PL-manifolds (see [P87]). Example. Elementary shellings in n = 3. Let (T , ∂T ) → (M, ∂M) represent a triangulation of a 3-dimensional PL-pair and let σ 3 be a facet with some component in ∂T . According to Definition 8 we can write: σ 3 = τ σ r (r = 0, 1, 2),
(16)
where τ ∈ int(T ) and σ r ∈ ∂T . As we noticed in Remark 1, for every σ r ∈ ∂T there exists at most one τ ∈ int(T ) which satisfies (16). Then we can classify the possible facets and the corresponding elementary shellings according to the dimensionality of σ r and using (11) (the different configurations are also illustrated in the figures of Sect. 4). 1. TYPE I (r = 0). The facet σ 3(I) admits the decomposition σ 3(I) = τ (I) σ 0 , where τ (I) is a 2-simplex and belongs to int(T ). The vertex σ 0 and the three 2-dimensional faces of σ 3(I) which have σ 0 as a common subsimplex are in ∂T . The shelling of σ 3(I) is represented by the map: '−(I) : (T , ∂T ) −→ (T , ∂T ) \ {F(τ (I) ) σ 0 }.
(17)
2. TYPE II (r = 1). The facet σ 3(II) admits the decomposition σ 3(II) = τ (II) σ 1 , where τ (II) is a 1-simplex and belongs to int(T ). The 1-simplex σ 1 and the two 2-dimensional faces of σ 3(II) which have σ 1 as a common subsimplex are in ∂T . The shelling of σ 3(II) is represented by the map: '−(II) : (T , ∂T ) −→ (T , ∂T ) \ {F(τ (II) ) σ 1 }.
(18)
3. TYPE III (r = 2). The facet σ 3(III) admits the decomposition σ 3(III) = τ (III) σ 2 , where τ (III) is a vertex and belongs to int(T ). The 2-simplex σ 2 is in ∂T . The shelling of σ 3(III) is represented by the map: '−(III) : (T , ∂T ) −→ (T , ∂T ) \ {F(τ (III) ) σ 2 }.
(19)
The inverse elementary shellings '+(I) , '+(II) and '+(III) are nothing but maps which are the inverse operations with respect to the former ones, and represent attachments of 3-simplices of Types I, II, III respectively.
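The classification into Types I, II and III can be checked mechanically on small triangulations. The sketch below is an illustration under stated assumptions rather than the authors' construction: a triangulation is given as a list of tetrahedra (4-element vertex sets), the condition of Definition 8 is tested literally (τ not in ∂T, and every proper face of τ, the empty one included, joined with σ^r lies in ∂T), and the names `boundary_complex`, `shelling_type` and `shell` are hypothetical.

```python
from itertools import combinations


def boundary_triangles(tets):
    """Triangles contained in exactly one tetrahedron."""
    count = {}
    for t in tets:
        for tri in combinations(sorted(t), 3):
            key = frozenset(tri)
            count[key] = count.get(key, 0) + 1
    return {tri for tri, c in count.items() if c == 1}


def boundary_complex(tets):
    """All non-empty faces of the boundary triangles."""
    return {frozenset(c) for tri in boundary_triangles(tets)
            for k in (1, 2, 3) for c in combinations(sorted(tri), k)}


def shelling_type(tets, tet):
    """Return (r, sigma_r, tau) if tet = tau sigma^r can be shelled, else None."""
    tet = frozenset(tet)
    d_t = boundary_complex(tets)
    for r in (0, 1, 2):                          # Types I, II, III
        for sigma in combinations(sorted(tet), r + 1):
            sigma = frozenset(sigma)
            tau = tet - sigma
            if tau in d_t:                       # tau must be an interior face
                continue
            # B(tau) sigma^r must lie in the boundary complex, cf. Eq. (10)
            if all(frozenset(rho) | sigma in d_t
                   for k in range(len(tau))
                   for rho in combinations(sorted(tau), k)):
                return r, sigma, tau
    return None


def shell(tets, tet):
    """Elementary shelling: delete the facet from the list of tetrahedra."""
    tet = frozenset(tet)
    return [frozenset(t) for t in tets if frozenset(t) != tet]


two_tets = [{0, 1, 2, 3}, {1, 2, 3, 4}]                      # glued along a triangle
around_edge = [{0, 1, 2, 3}, {0, 1, 3, 4}, {0, 1, 2, 4}]     # interior edge {0, 1}
cone = [{0, 1, 2, 3}, {0, 1, 2, 4}, {0, 1, 3, 4}, {0, 2, 3, 4}]  # interior vertex 0

type_one = shelling_type(two_tets, {0, 1, 2, 3})       # r = 0
type_two = shelling_type(around_edge, {0, 1, 2, 3})    # r = 1
type_three = shelling_type(cone, {0, 1, 2, 3})         # r = 2
```

In the three examples the shelled tetrahedron has three, two and one triangular faces in ∂T respectively, in agreement with the descriptions of Types I, II and III; an isolated tetrahedron admits no elementary shelling because none of its proper faces is interior.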
3. Generalization of Ponzano–Regge Partition Function in Terms of Wigner 3jm and 6j Symbols

In this section we generalize the partition function proposed by Ponzano and Regge in [P-R] for closed manifolds to the case of a compact 3-dimensional simplicial PL-manifold with a non-empty boundary, (M, ∂M). The resulting state sum will contain both 6j symbols associated with 3-simplices in (M, ∂M) and 3jm symbols associated with 2-simplices in ∂M. We start with some basic definitions from the recoupling theory of angular momenta of SU(2), following the standard notation of [V-M-K]. If j_1, j_2 are two angular momenta (spins) labelling irreducible representations of SU(2) and m_1, m_2 are the corresponding projections on the quantization axis, then the Clebsch–Gordan coefficient $C^{jm}_{j_1 m_1 j_2 m_2}$ represents the probability amplitude that j_1 and j_2 are coupled to give a resultant angular momentum j with projection m = m_1 + m_2. In what follows the Wigner 3jm symbols will be used instead of the C-G coefficients owing to their symmetry properties. A 3jm symbol represents the probability amplitude that three angular momenta j_1, j_2, j_3 with projections m_1, m_2, m_3 respectively, are coupled to yield zero angular momentum; in terms of the corresponding C-G coefficient it can be expressed as:

$$\begin{pmatrix} a & b & c \\ \alpha & \beta & -\gamma \end{pmatrix} = (-1)^{a-b+\gamma}\,(2c+1)^{-1/2}\,C^{c\gamma}_{a\alpha\, b\beta}. \tag{20}$$

Here we adopt the Latin letters {a, b, c, . . . } to denote angular momenta, and the Greek letters {α, β, γ, . . . } to denote momentum projections in the arguments of the coefficients. (When convenient the notation j_1, j_2, . . . for angular momenta and m_1, m_2, . . . for the corresponding momentum projections will be restored.) In any case, a variable of type j is any integer or half-integer non-negative number and its corresponding variable of type m is such that |m| ≤ j, both in units of ℏ.
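Relation (20) and the basic selection rules of the 3jm symbol can be explored numerically. The following sketch is not from the paper: it assumes the standard Racah single-sum formula for the 3jm symbol, accepts integer or half-integer arguments (converted exactly to `Fraction`), and recovers the Clebsch–Gordan coefficient by inverting (20). The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, sqrt


def _fact(x):
    """Factorial of a Fraction that is known to hold a non-negative integer."""
    return factorial(int(x))


def _sign(x):
    """(-1)**x for an integer-valued Fraction x."""
    return -1 if int(x) % 2 else 1


def admissible(a, b, c):
    """Triangular inequalities together with an integer sum a + b + c."""
    a, b, c = (Fraction(x) for x in (a, b, c))
    return abs(a - b) <= c <= a + b and (a + b + c).denominator == 1


def wigner_3jm(j1, j2, j3, m1, m2, m3):
    """3jm symbol from Racah's formula (assumed standard form); returns a float."""
    j1, j2, j3, m1, m2, m3 = (Fraction(x) for x in (j1, j2, j3, m1, m2, m3))
    if m1 + m2 + m3 != 0 or not admissible(j1, j2, j3):
        return 0.0
    for j, m in ((j1, m1), (j2, m2), (j3, m3)):
        if abs(m) > j or (j - m).denominator != 1:
            return 0.0
    delta = Fraction(
        _fact(j1 + j2 - j3) * _fact(j1 - j2 + j3) * _fact(-j1 + j2 + j3),
        _fact(j1 + j2 + j3 + 1))
    norm = (_fact(j1 + m1) * _fact(j1 - m1) * _fact(j2 + m2)
            * _fact(j2 - m2) * _fact(j3 + m3) * _fact(j3 - m3))
    t = max(Fraction(0), j2 - j3 - m1, j1 - j3 + m2)
    t_max = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    total = Fraction(0)
    while t <= t_max:
        den = (_fact(t) * _fact(j3 - j2 + t + m1) * _fact(j3 - j1 + t - m2)
               * _fact(j1 + j2 - j3 - t) * _fact(j1 - t - m1) * _fact(j2 - t + m2))
        total += Fraction(_sign(t), den)
        t += 1
    return _sign(j1 - j2 - m3) * sqrt(delta * norm) * float(total)


def clebsch_gordan(a, alpha, b, beta, c, gamma):
    """C^{c gamma}_{a alpha b beta} obtained by inverting Eq. (20)."""
    a, b, c, gamma = (Fraction(x) for x in (a, b, c, gamma))
    if (a - b + gamma).denominator != 1:
        return 0.0
    return (_sign(a - b + gamma) * sqrt(2 * c + 1)
            * wigner_3jm(a, b, c, alpha, beta, -gamma))


# Two spins 1/2 coupled to total spin 0 (singlet): the value is 1/sqrt(2).
singlet = wigner_3jm(0.5, 0.5, 0, 0.5, -0.5, 0)
# A cyclic permutation of the columns leaves the symbol unchanged.
cyclic_pair = (wigner_3jm(1, 1, 1, 1, 0, -1), wigner_3jm(1, 1, 1, 0, -1, 1))
```

Exact rational arithmetic is used inside the sum and floating point only for the final square root, which keeps the sketch reliable for the small spins relevant to hand checks; it is not meant as an efficient implementation for large j.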
We do not need at this point either the explicit expression of the C-G coefficient or the list of the properties of the 3jm symbol; we just mention the fact that the phase factor $(-1)^{a-b+\gamma}$ in (20) is chosen in such a way that any cyclic permutation of columns leaves the symbol unchanged. Moreover, both the 3jm symbol and the C-G coefficient vanish unless the triangular inequalities $|a - b| \le c \le a + b$ (and their cyclic permutations) hold true. A triad of this kind will be called admissible, borrowing the language used in the context of the quantum invariants introduced in [T-V]. Thus, from a geometrical point of view, a 3jm symbol can be associated with a triangle lying in a Euclidean 3-space, the edge lengths of which are $(2a+1)$, $(2b+1)$, $(2c+1)$ and having projections $\alpha$, $\beta$, $-\gamma$ respectively along a fixed reference axis. (Strictly speaking, such a picture arises only in the semiclassical limit, when the vector model for the recoupling of angular momenta can be applied.)

The following combination of four 3jm symbols, summed over their magnetic quantum numbers, provides the expression of the Racah–Wigner 6j symbol:
\[ \begin{Bmatrix} a & b & c \\ d & e & f \end{Bmatrix} = \sum (-1)^{\eta} \begin{pmatrix} a & b & c \\ -\alpha & -\beta & -\gamma \end{pmatrix} \begin{pmatrix} a & e & f \\ \alpha & -\varepsilon & \varphi \end{pmatrix} \begin{pmatrix} d & b & f \\ \delta & \beta & -\varphi \end{pmatrix} \begin{pmatrix} d & e & c \\ -\delta & \varepsilon & \gamma \end{pmatrix}, \tag{21} \]
where $\eta = a + b + c + d + e + f - \alpha - \beta - \gamma - \delta - \varepsilon - \varphi$ and the sum is extended over all possible values of the m-variables (notice however that only three summation indices are independent). The 6j symbols satisfy orthogonality conditions which read:
\[ \sum_{X} (2X+1) \begin{Bmatrix} a & b & X \\ c & d & e \end{Bmatrix} \begin{Bmatrix} a & b & X \\ c & d & f \end{Bmatrix} = (2e+1)^{-1}\, \delta_{ef}\, \{ade\}\{bce\}, \tag{22} \]
where the notation $\{ade\}$ stands for the triangular delta (viz., $\{ade\}$ is equal to 1 if its three arguments satisfy the triangular inequalities, and is zero otherwise), while $\delta_{ef} \equiv \delta(e, f)$. Each triad in (21), $(abc)$, $(aef)$, $(dbf)$, $(dec)$, must be admissible, or, in other words, each 3jm symbol is different from zero. Thus the 6j symbol has the symmetries of a non-degenerate tetrahedron embedded in a Euclidean 3-space with edge lengths $(2a+1), \dots, (2f+1)$ (the non-degeneracy is given by the supplementary condition $V^2 > 0$, where $V$ is the Euclidean volume of the tetrahedron, see e.g. [P-R]). Any arrangement of six spin variables $\{j_1, j_2, \dots, j_6\}$ with $j_p = 0, 1/2, 1, 3/2, \dots$ ($p = 1, 2, \dots, 6$) satisfying all the above requirements will be called admissible.

After these preliminary remarks, the connection between a recoupling scheme of a (finite number of) angular momenta and the combinatorial structure of a compact 3-dimensional simplicial PL-manifold $M$ without boundary is given by the classical result of Ponzano and Regge. On the basis of the notation introduced in (7), let the map
\[ T(j) \longrightarrow M \tag{23} \]
represent here a particular triangulation of the 3-dimensional PL-manifold $M$ associated with an admissible assignment of spin variables to the collection of the edges in $T$. Moreover, we set $j \equiv \{j_A\}$, $A = 1, 2, \dots, N_1$, where $N_1$ is the number of edges in $T(j)$; therefore $(2j_A + 1)$ is the length of the edge labelled by $A$. The compatibility conditions on the assignment of spin variables are encoded in the requirement that each 3-simplex $\sigma^3$ in $T(j)$ is actually associated, apart from a phase factor, with a 6j symbol of $SU(2)$:
\[ \sigma^3_B \longleftrightarrow (-1)^{\sum_{p=1}^{6} j_p} \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix}_B, \tag{24} \]
where $B = 1, 2, \dots, N_3$ labels the tetrahedra of $T(j)$. Then the Ponzano–Regge partition function for the manifold $M$ is rewritten here as:
\[ Z[M] = \lim_{L \to \infty} \sum_{\{T(j),\, j \le L\}} Z[T(j) \to M; L], \tag{25} \]
where the sum is extended to all assignments of spin variables such that each of them is not greater than the cut-off $L$, and each term under the sum is given by:
\[ Z[T(j) \to M; L] = \Lambda(L)^{-N_0} \prod_{A=1}^{N_1} (2j_A + 1) \prod_{B=1}^{N_3} (-1)^{\sum_{p=1}^{6} j_p} \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix}_B. \tag{26} \]
Here $\Lambda(L) \equiv 4L^3/3a$ ($a$ is an arbitrary constant) and $N_0$ is the number of vertices in $T(j)$. As is well known (see e.g. [P-R, P87] and [C-F-S]) the state sum given in (25) and (26) is invariant under bistellar elementary operations. Recall that such bistellar moves can be expressed in terms of the Biedenharn–Elliott identity (representing the moves (2 tetrahedra) ↔ (3 tetrahedra)) and of both the B-E identity and the orthogonality conditions (22) (which represent the moves (1 tetrahedron) ↔ (4 tetrahedra)).

As an intermediate step toward the generalization to the case of a 3-manifold with boundary, we recall the extension of (25) and (26) to a manifold $M$ with a fixed triangulation on its boundary $\partial M$ (see e.g. [O92,a]). Let the map
\[ (T(j, \bar{j}), \partial\bar{T}(\bar{j})) \longrightarrow (M, \partial M \equiv \partial\bar{T}) \tag{27} \]
denote now a particular triangulation of the PL-pair $(M, \partial M)$ associated with an admissible assignment of spin variables $j \equiv \{j_A\}$, $A = 1, 2, \dots, n_1$, to the edges belonging to the interior of $T(j, \bar{j})$, and such that the assignment of variables $\bar{j} \equiv \{\bar{j}_C\}$, $C = 1, 2, \dots, \bar{n}_1$, to the edges belonging to $\partial M \equiv \partial\bar{T}$ is kept fixed. Then the following state sum can be defined:
\[ Z[(M, \partial M \equiv \partial\bar{T})] = \lim_{L \to \infty} \sum_{T(j,\bar{j}),\, j \le L;\ \bar{j}\ \mathrm{fixed}} Z[(T(j, \bar{j}), \partial\bar{T}(\bar{j})) \to (M, \partial M \equiv \partial\bar{T}); L], \tag{28} \]
where
\[ Z[(T(j, \bar{j}), \partial\bar{T}(\bar{j})) \to (M, \partial M \equiv \partial\bar{T}); L] = \Lambda(L)^{-n_0} \prod_{A=1}^{n_1} (-1)^{2j_A} (2j_A + 1) \prod_{B=1}^{N_3} (-1)^{\sum_{p=1}^{6} j_p} \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix}_B \cdot \prod_{C=1}^{\bar{n}_1} (-1)^{\bar{j}_C} (2\bar{j}_C + 1)^{1/2}. \]
In this last expression $n_0$ is the number of vertices in the interior of $T(j, \bar{j})$, $N_3$ is the total number of 3-simplices, while $N_1 \equiv n_1 + \bar{n}_1$ is the total number of edges in $(T(j, \bar{j}), \partial\bar{T}(\bar{j}))$. The above state sum is invariant under bistellar moves performed in $\mathrm{int}(T(j))$. Moreover, it behaves correctly with respect to spacetime compositions (or cobordisms): starting for instance with two PL-pairs $(M_1, \partial M_1 \equiv \partial T)$ and $(M_2, \partial M_2 \equiv \partial T)$ with fixed isomorphic triangulations on their boundaries, the composite state sum is obtained by glueing along the boundaries and is given by (25) and (26) with $M = M_1 \cup M_2$.

We turn now to the general case of a 3-dimensional compact PL-pair $(M, \partial M)$, the boundary of which will be equipped with the unique triangulation induced on it by the triangulation we choose in $T$ according to (8). In the present context, let the map
\[ (T(j), \partial T(j', m)) \longrightarrow (M, \partial M) \tag{29} \]
represent a triangulation associated with an admissible assignment of both spin variables to the collection of the edges in $(T, \partial T)$ and of momentum projections to the subset of edges lying in $\partial T$. With a slight change of notation, let $j \equiv \{j_A\}$, $A = 1, 2, \dots, N_1$, denote all the spin variables, $n'_1$ of which are associated with the edges in the boundary. This last subset is labelled both by $j' \equiv \{j'_C\}$, $C = 1, 2, \dots, n'_1$, and by $m \equiv \{m_C\}$, where $m_C$ is the projection of $j'_C$ along the fixed reference axis. The consistency in the assignment of $j$, $j'$, $m$ is ensured if we require that each 3-simplex $\sigma^3_B$ ($B = 1, 2, \dots, N_3$) in $(T, \partial T)$ must be associated with a 6j symbol as in (24), while each 2-simplex $\sigma^2_D$, $D = 1, 2, \dots, n_2$, in $\partial T$ must be associated with a 3jm symbol of $SU(2)$ according to
\[ \sigma^2_D \longleftrightarrow (-1)^{(\sum_{s=1}^{3} m_s)/2} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_D. \tag{30} \]
Then the following state sum can be defined:
\[ Z[(M, \partial M)] = \lim_{L \to \infty} \sum_{\substack{(T(j),\, \partial T(j', m)) \\ j,\, j',\, m \le L \\ -j \le m \le j}} Z[(T(j), \partial T(j', m)) \to (M, \partial M); L], \tag{31} \]
where
\[ Z[(T(j), \partial T(j', m)) \to (M, \partial M); L] = \Lambda(L)^{-N_0} \prod_{A=1}^{N_1} (-1)^{2j_A} (2j_A + 1) \prod_{B=1}^{N_3} (-1)^{\sum_{p=1}^{6} j_p} \begin{Bmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{Bmatrix}_B \cdot \prod_{D=1}^{n_2} (-1)^{(\sum_{s=1}^{3} m_s)/2} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_D. \tag{32} \]
$N_0$, $N_1$, $N_3$ denote respectively the total number of vertices, edges and tetrahedra in $(T(j), \partial T(j', m))$, while $n_2$ is the number of 2-simplices lying in $\partial T(j', m)$. Notice that there appears a factor $\Lambda(L)^{-1}$ for each vertex in $\partial T(j', m)$ too (cf. the corresponding expression in the case of a boundary with a fixed triangulation). Moreover, (32) is manifestly invariant under bistellar moves which involve 3-simplices in $\mathrm{int}(T)$, and thus (31) and (32) reduce to (25) and (26) respectively if $\partial M = \emptyset$. It is worthwhile to remark also that products of 6j and 3jm coefficients of the kind which appear in (32) are known as jm coefficients in the quantum theory of angular momentum (see e.g. [Y-L-V]). Their semiclassical limit can be defined in a consistent way by requiring that simultaneously $j, j' \to \infty$ and $m \to \infty$ with the constraint $-j \le m \le j$. The summation in (31) has precisely this meaning, apart from the introduction of the cut-off $L$.

4. Identities Representing Elementary Shellings and Invariance of the State Sum

The aim of this section is to show the invariance of the state sum proposed in (31) and (32) under the set of elementary shellings in $n = 3$ illustrated at the end of Sect. 2. Then $Z[(M, \partial M)]$ will turn out to be a PL (or combinatorial) invariant of the PL-pair $(M, \partial M)$ according to Theorem 4. It should be clear at this point that we have to find suitable identities (involving jm coefficients) which could be associated with the three types of elementary shellings and inverse shellings. Once these identities are established, the state sum $Z[(M, \partial M)]$ will comply with them in a manifest way. Notice that this kind of proof essentially mimics the procedure followed both in [P-R] and [T-V] and, more recently, in [C-K-S]: we have actually inferred the expression of the state sum $Z[(M, \partial M)]$ from the set of identities which implements its topological invariance.
In what follows we show that one of the identities collected in [V-M-K], together with the orthogonality conditions for the 6j and 3jm symbols, is all we need in order to characterize completely both the three types of elementary shellings and their inverse transformations.

Let us start with the shelling of a facet of TYPE III and with the corresponding maps $\rho^{\pm(\mathrm{III})}$, the action of which (given in (19)) is depicted schematically in Fig. 1.

Fig. 1. On the left-hand side, the 2-simplex $\sigma^2 \subset \sigma^{3(\mathrm{III})}$ lying in $\partial T$ is associated with the triad $(abc)$; its opposite vertex $\tau^{(\mathrm{III})}$, together with the other faces, is in the interior of $T$. The action of the map $\rho^{-(\mathrm{III})}$ amounts to cancelling $\sigma^2$ and the interior of the facet; the surviving triangles, labelled by $(apq)$, $(bqr)$, $(cpr)$, are in the boundary of the new complex. The action of the map $\rho^{+(\mathrm{III})}$ can be read in the opposite direction

According to the basic rules given in (24) and (30), the configuration on the left-hand side must be represented in the state sum by a product between one 6j symbol, associated with the facet, and one 3jm symbol associated with the unique face which is in $\partial T$. The three faces that survive after the shelling appear on the right-hand side, thus the corresponding side of the identity should contain a suitable sum of a product of three 3jm symbols. As a general remark, notice that variables appearing in 3jm symbols are associated with edges lying in $\partial T$ in the particular configuration we are dealing with, while variables appearing only in 6j symbols correspond to internal edges. The labelling we adopt here agrees with the notation at the beginning of Sect. 3, namely Latin letters $a, b, c, r, p, q, \dots$ denote angular momentum variables and Greek letters $\alpha, \beta, \gamma, \rho, \psi, \kappa, \dots$ are the corresponding momentum projections. From these elementary remarks it follows that the maps $\rho^{\pm(\mathrm{III})}$ are represented by the following identity:
\[ \sum_{\kappa\psi\rho} (-1)^{-\psi-\kappa-\rho} \begin{pmatrix} p & a & q \\ \psi & \alpha & -\kappa \end{pmatrix} \begin{pmatrix} q & b & r \\ \kappa & \beta & -\rho \end{pmatrix} \begin{pmatrix} r & c & p \\ \rho & \gamma & -\psi \end{pmatrix} = (-1)^{\Phi} \begin{pmatrix} a & b & c \\ \alpha & \beta & \gamma \end{pmatrix} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix}, \tag{33} \]
where $\Phi \equiv a + b + c + r + p + q$. Here we have made use, with respect to the expression given in [V-M-K], of the symmetry properties of the 3jm symbols and of the fact that $(-1)^{2(a+b+c)} = 1$. The triple sum over magnetic numbers (which appear in pairs with opposite signs) is interpreted as a glueing along the edges labelled by the corresponding j-variables.

The shelling and inverse shelling of a facet of TYPE II, given in (18), are depicted in Fig. 2. The configuration on the left-hand side is associated with a suitable sum of a product of two 3jm symbols and one 6j symbol; on the other side we have just the two faces which survive after the shelling, represented by a sum (over the magnetic number corresponding to their common edge) of the product of the remaining 3jm's.

Fig. 2. On the left-hand side, the 1-simplex $\sigma^1 \subset \sigma^{3(\mathrm{II})}$ (corresponding to the edge labelled by $c$) is in $\partial T$, together with the triangles associated with the triads $(abc)$ and $(cpr)$. The 1-simplex $\tau^{(\mathrm{II})} \leftrightarrow q$, together with the triangles associated with $(aqp)$ and $(bqr)$, is in the interior of $T$. The map $\rho^{-(\mathrm{II})}$ deletes both the interior of the facet and $(abc)$, $(cpr)$: the resulting complex has the two remaining faces in its boundary. The action of the map $\rho^{+(\mathrm{II})}$ can be read in the opposite direction

Recall that the expression of the orthogonality conditions for the 3jm symbols with respect to magnetic numbers reads:
\[ \sum_{c\gamma} (-1)^{2c-\gamma} (2c+1) \begin{pmatrix} a & b & c \\ -\alpha & -\beta & \gamma \end{pmatrix} \begin{pmatrix} b & a & c \\ -\beta' & -\alpha' & \gamma \end{pmatrix} = (-1)^{\alpha+\beta}\, \delta_{\alpha\alpha'}\, \delta_{\beta\beta'}. \tag{34} \]
Consider now (33) again, multiply each side by $(-1)^{-\gamma+2c} (2c+1) \begin{pmatrix} r & p & c \\ -\rho & \psi & -\gamma \end{pmatrix}$ and sum over the pair $c, \gamma$. Then, using (34) and the symmetry properties of the 3jm's in order to adjust phase factors, we get the identity representing the shellings of TYPE II:
\[ \sum_{c\gamma} (2c+1) (-1)^{2c-\gamma} \begin{pmatrix} a & b & c \\ \alpha & \beta & \gamma \end{pmatrix} \begin{pmatrix} c & r & p \\ -\gamma & \rho & \psi \end{pmatrix} (-1)^{\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} = (-1)^{-2\rho} \sum_{\kappa} (-1)^{-\kappa} \begin{pmatrix} p & a & q \\ \psi & \alpha & -\kappa \end{pmatrix} \begin{pmatrix} q & b & r \\ \kappa & \beta & -\rho \end{pmatrix}. \tag{35} \]
Fig. 3. On the left-hand side, the 2-simplex $\tau^{(\mathrm{I})} \subset \sigma^{3(\mathrm{I})}$ is in the interior of $T$ and corresponds to the triad $(abc)$; its opposite vertex $\sigma^0$, together with the other three faces, is in $\partial T$. The action of $\rho^{-(\mathrm{I})}$ amounts to cancelling both the edges labelled by $r, p, q$ and the faces which share one of them. In the boundary of the new complex we have just the triangle associated with $(abc)$. The action of the map $\rho^{+(\mathrm{I})}$ can be read in the opposite direction

Since in this kind of shelling an edge must disappear, it is natural to find that in (35) we have indeed a sum both over the variable $c$ and over its corresponding $\gamma$, since $c$ is shared by the two faces on the left-hand side.

The shelling and inverse shelling of a facet of TYPE I are characterized by the fact that a vertex, together with the three edges arising from it, is now involved. More precisely, as the configurations in Fig. 3 show, we have to sum over external edges a product containing three 3jm symbols and one 6j symbol, getting a single 3jm (this is the action of the map $\rho^{-(\mathrm{I})}$, recall also (17)).
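We start again from (33), multiply both sides by $(2r+1)(2p+1)(2q+1)\, (-1)^{\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} \cdot (-1)^{2(p+q+r)}$ and sum over $q, p, r$. Then we get the expression:
\[ \sum_{q\kappa,\, p\psi,\, r\rho} (-1)^{-\psi-\kappa-\rho} (-1)^{2(p+q+r)} (2p+1)(2r+1)(2q+1) \cdot \begin{pmatrix} p & a & q \\ \psi & \alpha & -\kappa \end{pmatrix} \begin{pmatrix} q & b & r \\ \kappa & \beta & -\rho \end{pmatrix} \begin{pmatrix} r & c & p \\ \rho & \gamma & -\psi \end{pmatrix} (-1)^{\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} \]
\[ = \begin{pmatrix} a & b & c \\ \alpha & \beta & \gamma \end{pmatrix} \sum_{p, r, q} (-1)^{2(p+q+r)} (2p+1)(2r+1)(2q+1) (-1)^{2\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix}^2. \]
Since $(-1)^{2(p+q+r)} (-1)^{2\Phi} = 1$, the right-hand side carries no residual phase. Using the orthogonality conditions (22) for the 6j's, the sum over $q$ on the right-hand side gives $(2c+1)^{-1}$; the two remaining summations reduce to $(2c+1) \sum_p (2p+1)^2$, which diverges in the semiclassical limit. Hence, as in the Ponzano–Regge model, we have to introduce a cut-off $L$ and denote the above weight by $\Lambda(L)$ according to the notation of Sect. 3. The following steps consist in an interchange of the first two columns of each 3jm and in a relabelling: $\rho, \kappa, \psi \to -\rho', -\kappa', -\psi'$. Thus the final expression representing the maps $\rho^{\pm(\mathrm{I})}$ reads:
\[ \Lambda(L)^{-1} \sum_{q\kappa',\, p\psi',\, r\rho'} (-1)^{-\psi'-\kappa'-\rho'} (-1)^{2(p+q+r)} (2p+1)(2r+1)(2q+1) \cdot \begin{pmatrix} a & p & q \\ \alpha & -\psi' & \kappa' \end{pmatrix} \begin{pmatrix} b & q & r \\ \beta & -\kappa' & \rho' \end{pmatrix} \begin{pmatrix} c & r & p \\ \gamma & -\rho' & \psi' \end{pmatrix} (-1)^{\Phi} \begin{Bmatrix} a & b & c \\ r & p & q \end{Bmatrix} = \begin{pmatrix} b & a & c \\ \beta & \alpha & \gamma \end{pmatrix}. \tag{36} \]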
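The two summation steps used to arrive at (36) can be checked numerically. The following sketch is an illustration under stated assumptions, not the authors' computation: it evaluates 6j symbols from Racah's standard formula, and the helper names `spins`, `admissible`, `wigner_6j`, `q_sum` and `remaining_sum` are hypothetical. It checks that $\sum_q (2q+1)\{a\,b\,c;\, r\,p\,q\}^2 = (2c+1)^{-1}$, and that the remaining sums over $p$ and $r$, with only $p$ cut off at $L$ (an assumption made here for simplicity), add up to $\sum_{p \le L} (2p+1)^2$, which grows like $L^3$.

```python
from fractions import Fraction
from math import factorial, sqrt


def spins(up_to):
    """Spin values 0, 1/2, 1, ..., up_to as exact fractions."""
    return [Fraction(n, 2) for n in range(int(2 * Fraction(up_to)) + 1)]


def admissible(a, b, c):
    """Triangular delta {abc}: |a-b| <= c <= a+b with a+b+c an integer."""
    return abs(a - b) <= c <= a + b and (a + b + c).denominator == 1


def _fact(x):
    # x holds a non-negative integer at every call site below
    return factorial(int(x))


def _triangle(a, b, c):
    return sqrt(_fact(a + b - c) * _fact(a - b + c) * _fact(-a + b + c)
                / _fact(a + b + c + 1))


def wigner_6j(a, b, c, d, e, f):
    """6j symbol {a b c; d e f} from Racah's formula (0.0 if not admissible)."""
    a, b, c, d, e, f = (Fraction(x) for x in (a, b, c, d, e, f))
    triads = [(a, b, c), (a, e, f), (d, b, f), (d, e, c)]
    if not all(admissible(*t) for t in triads):
        return 0.0
    quads = [a + b + d + e, b + c + e + f, a + c + d + f]
    total = 0.0
    for t in range(int(max(sum(tr) for tr in triads)), int(min(quads)) + 1):
        denom = 1
        for tr in triads:
            denom *= _fact(t - sum(tr))
        for s in quads:
            denom *= _fact(s - t)
        total += (-1) ** t * factorial(t + 1) / denom
    prefactor = 1.0
    for tr in triads:
        prefactor *= _triangle(*tr)
    return prefactor * total


def q_sum(a, b, c, r, p):
    """Sum over q of (2q+1) {a b c; r p q}^2; orthogonality predicts 1/(2c+1)."""
    return sum((2 * q + 1) * wigner_6j(a, b, c, r, p, q) ** 2
               for q in spins(Fraction(a) + Fraction(p)))


def remaining_sum(c, cutoff):
    """Sum over p <= cutoff and admissible r of (2p+1)(2r+1)/(2c+1), exactly."""
    c = Fraction(c)
    total = Fraction(0)
    for p in spins(cutoff):
        for r in spins(p + c):
            if admissible(r, p, c):
                total += (2 * p + 1) * (2 * r + 1) / (2 * c + 1)
    return total
```

With $a = b = c = 1$ and $r = p = 1/2$, `q_sum` returns a value close to $1/3$, while `remaining_sum(1, L)` reproduces $\sum_{n=1}^{2L+1} n^2$ independently of $c$, which makes the cubic growth of the weight explicit.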
The above analysis of the identities representing the elementary shellings and their inverse moves, together with a comparison with the expression given in (32), completes the proof of the following:

Theorem 5. The state sum $Z[(M, \partial M)]$ for the 3-dimensional PL-pair $(M, \partial M)$ is formally invariant both under bistellar moves in the interior of $M$ and under elementary boundary operations (shellings and inverse shellings). Then, by Pachner's Theorem 4, $Z[(M, \partial M)]$ is an invariant of the PL-structure.

Remark 3. As we have just seen, the complete set of elementary shellings can be derived from a single identity, namely (33), together with orthogonality conditions for 3jm and 6j symbols. However, it is quite clear that we could have started either from (35) or from (36) as well. The expression given in (35) appears preferable since its structure closely resembles the Biedenharn–Elliott identity, both in the number of symbols involved and in the presence of a single sum over a j-variable. Recall also that the complete set of bistellar moves is actually derived from the B-E identity together with the orthogonality conditions for the 6j, apart from regularization. This similarity in the algebraic structure of the two sets of moves does not happen by chance, although the topological content of the fundamental identity is different in the two cases.

Remark 4. It is worthwhile to stress that both the PL-invariants $Z[M]$ in (25) and $Z[(M, \partial M)]$ in (31), regularized in the same way with the introduction of the cut-off $L$, are notoriously difficult to handle. However, as with the state sum proposed by Turaev and Viro in [T-V], improved regularizations can be obtained by exploiting quantum groups technology.

5. Extension to the q-Deformed Case

In this section we extend our previous results in order to get a quantum invariant of a 3-dimensional PL-pair $(M, \partial M)$ which is the counterpart of the Turaev–Viro one defined in [T-V].
We limit ourselves to the analysis of the case in which representations of the quantized enveloping algebra $U_q(sl(2, \mathbb{C}))$, $q$ a root of unity, are involved. The notation, at least in the initial part, is the standard one (see e.g. [M-T, C-F-S]). Thus in particular a q-6j symbol can be associated with each 3-simplex of a given triangulation $(T(j), \partial T(j', m)) \to (M, \partial M)$ according to
\[ \sigma^3 \longleftrightarrow \begin{vmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{vmatrix}_q, \tag{37} \]
where the phase factor $(-1)^{\sum_{p=1}^{6} j_p}$ has been absorbed. Notice that in the present case the spin variables $j$ take their values in a finite set $I \equiv \{0, 1/2, 1, 3/2, \dots, (k/2) - 1\}$, where $\exp(\pi i/k) = q$. Moreover, the 6-tuple $(j_1, j_2, \dots, j_6) \in I^6$ is said to be admissible if each of its unordered triples $(j_1 j_2 j_3)$, $(j_1 j_5 j_6)$, $(j_4 j_2 j_6)$, $(j_4 j_5 j_3)$ is admissible in the sense already explained in Sect. 3. To be more precise, the q-6j symbol is associated with a map $I^6 \to K$, where $K$ is a commutative ring with unity, and the corresponding 6-tuple is admissible. The following step consists in defining, for each $j \in I$, a function $w^2(j) \equiv w_j^2 \doteq (-1)^{2j} [2j+1]_q \in K^*$, where $K^* = K \setminus \{0\}$ and $[n]_q$ denotes a q-integer, namely $[n]_q = (q^n - q^{-n})/(q - q^{-1})$. Moreover, a distinguished element $w \in K^*$ is chosen in such a way that $w^2 = -2k/(q - q^{-1})^2$. The symbol $|\cdots|_q$, the functions $w_j$ and the element $w$ are collectively referred to as initial data. According to [T-V], they have to satisfy some conditions, among which we just need here the following ones:
\[ \sum_{j} w_j^2\, w_{j_4}^2 \begin{vmatrix} j_2 & j_1 & j \\ j_3 & j_5 & j_4 \end{vmatrix}_q \begin{vmatrix} j_2 & j_1 & j \\ j_3 & j_5 & j_6 \end{vmatrix}_q = \delta_{j_4 j_6}, \tag{38} \]
representing the orthogonality relations for the q-6j symbols, and
\[ w^2 = w_j^{-2} \sum_{(j,k,l) \in \mathrm{adm}} w_k^2\, w_l^2. \tag{39} \]
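The initial data just introduced are easy to tabulate for a given $k$. The sketch below is an illustration only, with hypothetical helper names (`q_root`, `q_int`, `w_j_squared`, `w_squared`, `spin_set`) and with $K$ taken to be the complex numbers, which is an assumption made for the sake of a numerical check. For $q = \exp(\pi i/k)$ the q-integer reduces to $\sin(n\pi/k)/\sin(\pi/k)$ and $w^2$ to $k/(2\sin^2(\pi/k))$; the code also compares $w^2$ with $\sum_{j \in I} w_j^4$, which is what (39) reduces to for $j = 0$ if one assumes that a triple $(0, k, l)$ is admissible exactly when $k = l$.

```python
import cmath
import math


def q_root(k):
    """q = exp(pi*i/k), the root of unity used in the text."""
    return cmath.exp(1j * math.pi / k)


def q_int(n, q):
    """q-integer [n]_q = (q^n - q^-n) / (q - q^-1)."""
    return (q ** n - q ** (-n)) / (q - 1 / q)


def w_j_squared(j, q):
    """w_j^2 = (-1)^(2j) [2j+1]_q."""
    two_j = round(2 * j)
    return (-1) ** two_j * q_int(two_j + 1, q)


def w_squared(k):
    """Distinguished element: w^2 = -2k / (q - q^-1)^2."""
    q = q_root(k)
    return -2 * k / (q - 1 / q) ** 2


def spin_set(k):
    """The finite set I = {0, 1/2, 1, ..., k/2 - 1}."""
    return [n / 2 for n in range(k - 1)]


def sum_w_fourth(k):
    """Sum over j in I of (w_j^2)^2, to be compared with w^2."""
    q = q_root(k)
    return sum(w_j_squared(j, q) ** 2 for j in spin_set(k))
```

For any integer $k \ge 3$, `w_squared(k)` and `sum_w_fourth(k)` should agree up to rounding, with imaginary parts at the level of floating-point noise.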
The summations in (38) and (39) are carried out over those j-variables for which the symbols are defined, and adm is the set of admissible triples.

In order to deal with the generalization of the Turaev–Viro state sum to the case of a 3-dimensional PL-pair $(M, \partial M)$, and following the framework of Sect. 3, we have to introduce the q-analog of the Wigner 3jm symbol (20). This symbol turns out to be associated with a triangular face $\sigma^2$ in the boundary of a triangulation $(T, \partial T)$ according to the prescription:
\[ \sigma^2 \longleftrightarrow \chi_q \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_q, \tag{40} \]
where $\chi_q$ is a term containing both a phase factor as in (30) and a suitable normalization factor depending on $q$. The correct choice of the normalization in (40), and consequently in the definition of the q-6j symbol (37), is discussed in the Appendix. Then, following the procedure of Sect. 4 with a slight change of notation, it turns out that the maps $\rho^{\pm(\mathrm{III})}$ representing the elementary shellings and inverse shellings of a facet of TYPE III correspond to the identity:
\[ \begin{bmatrix} j_1 & j_{23} & j \\ m_1 & m_{23} & -m \end{bmatrix}_q \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q = \sum_{m_2 m_3 m_{12}} (-1)^{m_2} q^{-m_2/6} (-1)^{m_3} q^{-m_3/6} (-1)^{m_{12}} q^{-m_{12}/6} \begin{bmatrix} j_3 & j_2 & j_{23} \\ m_3 & -m_2 & m_{23} \end{bmatrix}_q \begin{bmatrix} j_{12} & j_3 & j \\ m_{12} & -m_3 & m \end{bmatrix}_q \begin{bmatrix} j_1 & j_2 & j_{12} \\ m_1 & m_2 & -m_{12} \end{bmatrix}_q. \tag{41} \]
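The relation describing the maps $\rho^{\pm(\mathrm{II})}$ is found by multiplying (41) by $(-1)^{m_{23}+m_2+m_3} q^{(m_3-m_2)/3} \begin{bmatrix} j_2 & j_3 & j_{23} \\ m_2 & m_3 & -m_{23} \end{bmatrix}_q$, summing both sides over $j_{23}$, $m_{23}$ and using the orthogonality conditions for the q-3jm's given in (49). Then, up to the substitution $m_3 \to -m_3$, the final expression reads:
\[ \sum_{j_{23} m_{23}} w_{j_{23}}^2 (-1)^{m_{23}} q^{-m_{23}/6} \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q \begin{bmatrix} j_1 & j_{23} & j \\ m_1 & m_{23} & -m \end{bmatrix}_q \begin{bmatrix} j_2 & j_3 & j_{23} \\ m_2 & -m_3 & -m_{23} \end{bmatrix}_q = (-1)^{2m_3} q^{2m_3/6} \sum_{m_{12}} (-1)^{m_{12}} q^{-m_{12}/6} \begin{bmatrix} j_{12} & j_3 & j \\ m_{12} & -m_3 & -m \end{bmatrix}_q \begin{bmatrix} j_1 & j_2 & j_{12} \\ m_1 & m_2 & -m_{12} \end{bmatrix}_q, \tag{42} \]
where $w_{j_{23}}^2 \equiv (-1)^{2j_{23}} [2j_{23}+1]_q$ as defined before.

In order to find the identity representing the last type of shelling notice that (41) can be rewritten in terms of the deformation parameter $1/q$ and that $|\cdots|_q = |\cdots|_{1/q}$ (see e.g. [K-R]). Multiplying each side of the resulting expression by $w_{j_2}^2 w_{j_3}^2 w_{j_{12}}^2 \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q$, summing over $j_3$, $j_2$, $j_{12}$ and taking into account (38) and (39), we get:
\[ \begin{bmatrix} j_1 & j_{23} & j \\ m_1 & m_{23} & m \end{bmatrix}_{1/q} = w^{-2} \sum_{j_2 m_2} w_{j_2}^2 (-1)^{m_2} q^{m_2/6} \sum_{j_3 m_3} w_{j_3}^2 (-1)^{m_3} q^{m_3/6} \sum_{j_{12} m_{12}} w_{j_{12}}^2 (-1)^{m_{12}} q^{m_{12}/6} \begin{bmatrix} j_3 & j_2 & j_{23} \\ m_3 & -m_2 & m_{23} \end{bmatrix}_{1/q} \cdot \begin{bmatrix} j_{12} & j_3 & j \\ m_{12} & -m_3 & m \end{bmatrix}_{1/q} \begin{bmatrix} j_1 & j_2 & j_{12} \\ m_1 & m_2 & -m_{12} \end{bmatrix}_{1/q} \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q. \]
Interchanging the first two j-variables of each q-3jm symbol, and up to the substitutions $m_2, m_3, m_{12} \to -m_2, -m_3, -m_{12}$, we obtain the correct identity representing the maps $\rho^{\pm(\mathrm{I})}$, namely:
\[ \begin{bmatrix} j_{23} & j_1 & j \\ m_{23} & m_1 & m \end{bmatrix}_q = w^{-2} \sum_{j_2 m_2} w_{j_2}^2 (-1)^{m_2} q^{-m_2/6} \sum_{j_3 m_3} w_{j_3}^2 (-1)^{m_3} q^{-m_3/6} \sum_{j_{12} m_{12}} w_{j_{12}}^2 (-1)^{m_{12}} q^{-m_{12}/6} \cdot \begin{bmatrix} j_3 & j_{12} & j \\ m_3 & -m_{12} & m \end{bmatrix}_q \begin{bmatrix} j_2 & j_3 & j_{23} \\ m_2 & -m_3 & m_{23} \end{bmatrix}_q \begin{bmatrix} j_2 & j_1 & j_{12} \\ -m_2 & m_1 & m_{12} \end{bmatrix}_q \begin{vmatrix} j_1 & j_2 & j_{12} \\ j_3 & j & j_{23} \end{vmatrix}_q. \tag{43} \]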
Collecting the results of this section we can now state the following:

Theorem 6. Let $(M, \partial M)$ be a 3-dimensional compact PL-pair and $(T(j), \partial T(j', m)) \to (M, \partial M)$ a triangulation associated with an admissible assignment of both j-variables ($j'$ of which in $\partial T$) and m-variables according to the rules given at the beginning of this section. Then the state sum
\[ Z[(M, \partial M)]_q = \sum_{\{(T(j),\, \partial T(j', m))\}} Z_q[(T(j), \partial T(j', m)) \to (M, \partial M)], \tag{44} \]
where
\[ Z_q[(T(j), \partial T(j', m)) \to (M, \partial M)] = w^{-2N_0} \prod_{A=1}^{N_1} w_A^2 \prod_{B=1}^{N_3} \begin{vmatrix} j_1 & j_2 & j_3 \\ j_4 & j_5 & j_6 \end{vmatrix}_q^{(B)} \cdot \prod_{D=1}^{n_2} \chi_q^{(D)} \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_q^{(D)} \tag{45} \]
is a quantum invariant of the PL-pair $(M, \partial M)$.

Proof. The state sum (44) is manifestly invariant both under bistellar moves in the interior of $T$ and under elementary shellings represented by (41), (42), (43). Then, by Pachner's Theorem 4, $Z_q[(M, \partial M)]$ is, for each $q$ = root of unity, a quantum PL-invariant. □

6. Concluding Remarks

The topological elementary moves introduced by Pachner and discussed in Sect. 2 are characterized by other remarkable properties. For instance, as pointed out recently in [C-K-S], one can recover the fundamental n-simplex by glueing together, in $\mathbb{R}^n$, the two different configurations, representing any one of the bistellar moves in dimension $(n-1)$, along their common fixed boundary. Obviously, there cannot be any straightforward relationship between bistellar moves in contiguous dimensions (recall that the number of such moves in dimension $n$ is $(n+1)$). However, if we allow the elementary boundary operations to be involved, new possibilities arise. As we have already noticed, the different types of elementary shellings acting on a simplicial n-dimensional pair $(T, \partial T)$ amount exactly to $n$. Moreover, the central projection of each elementary shelling onto $\partial T$ gives a particular bistellar move in dimension $(n-1)$ ($\partial T$ being a triangulation of a closed $(n-1)$-dimensional manifold). It is also easy to check that the same kind of projection of the complete set of boundary operations reproduces the complete set of bistellar moves in the lower dimensional case. Thus the deep link between the two basic theorems proved by Pachner and quoted in our Remark 2 becomes evident. As far as Theorem 3 is concerned, it should be feasible to build up state sum models for triangulated 3-dimensional pairs which implement the elementary shellings alone, setting aside the requirement of being generalizations of the Ponzano–Regge and Turaev–Viro ones.
Turning now to some possible developments of our approach toward models in dimension different from three, we are currently addressing a 2-dimensional closed model and a 4-dimensional model with boundary which are reminiscent of the 3-dimensional one. On both sides the starting point is one of the fundamental identities which we have looked at in Remark 3 (namely the Biedenharn–Elliott identity involving five 6j symbols and identity (35) involving one 6j and four 3jm symbols). Since the structure of a 2-dimensional local term of the state sum given e.g. in (32) is naturally encoded in (35), we can, in a definite sense, project this last expression in order to get the correct form of the corresponding bistellar move in $n = 2$. The partition function arising in such a way includes suitable sums of products of double 3jm symbols, each one of them being associated with a triangular facet of the closed 2-manifold. On the other hand, the B-E identity represents in our view the projected counterpart of the identity associated with a particular elementary shelling in $n = 4$. We are confident that, notwithstanding the complexity of the algebraic relations involved, a state sum which generalizes the known results in dimension 4 (see e.g. [O92,b, C-K-S] and references therein) could be established.

Coming back again to the 3-dimensional partition function given in (32), we can investigate its semiclassical limit much in the spirit of the original approach of Ponzano and Regge. As is well known, in the case of a closed 3-manifold, the state sum given in (25) and (26) can be related to the semiclassical Euclidean partition function containing the Regge action $S_R(M)$ of the manifold $M$ according to $Z[M] \sim \cos(S_R(M) + \pi/4)$. In order to perform a similar analysis on our state sum we have to consider also the semiclassical limit of each 3jm symbol involved in (32). Such a limit can be found in [P-R] as well, and involves both the angular momenta $j$ and the momentum projections $m$, together with suitable angular variables. Without entering into technical details, the asymptotic structure of our state sum can be summarized in the expression $Z[(M, \partial M)] \sim \cos(S_R(M) + S(\partial M) + \mathrm{const})$, where $S(\partial M)$ is an action containing both the Euler characteristic of $\partial M$ and other terms depending on the orientation of the components of $\partial M$ with respect to the quantization axis.

Appendix

In what follows we collect first some relations involving q-3jm symbols which can be found for instance in [K-R, N]. Recall that the relation between the quantum Clebsch–Gordan coefficient $(j_1 m_1 j_2 m_2 | j_3 m_3)_q$ and the q-3jm symbol is given by:
\[ (j_1 m_1 j_2 m_2 | j_3 m_3)_q = (-1)^{j_1 - j_2 + m_3} ([2j_3 + 1]_q)^{1/2} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_q, \tag{46} \]
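where, as usual, an m-variable runs in integer steps between $-j$ and $+j$, and the classical expression (20) is recovered when $q = 1$. The symmetry properties of the q-3jm symbol read:
\[ \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_q = (-1)^{j_1+j_2+j_3} \begin{pmatrix} j_2 & j_1 & j_3 \\ m_2 & m_1 & -m_3 \end{pmatrix}_{1/q}, \]
\[ \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_q = (-1)^{j_1+j_2+j_3}\, q^{-m_1/2} \begin{pmatrix} j_1 & j_3 & j_2 \\ m_1 & m_3 & -m_2 \end{pmatrix}_{1/q}, \]
\[ \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_q = (-1)^{j_1+j_2+j_3} \begin{pmatrix} j_1 & j_2 & j_3 \\ -m_1 & -m_2 & m_3 \end{pmatrix}_q. \tag{47} \]
The above relations make clear the necessity of choosing different normalization factors in order to comply with the cyclic-permutation property which ensures the correspondence (triangle) ↔ (q-3jm). Thus we define the normalized q-3jm symbols (written here with square brackets), for deformation parameters $q$ and $1/q$ respectively, according to:
\[ \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_q \doteq q^{(m_1 - m_2)/6} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_q, \]
\[ \begin{bmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{bmatrix}_{1/q} \doteq q^{(m_2 - m_1)/6} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & -m_3 \end{pmatrix}_{1/q}. \tag{48} \]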
The form of the orthogonality relations involving the normalized symbols which is used in Sect. 5 reads:
\[ \sum_{jm} w_j^2 (-1)^{\mu} q^{(m_2 - m_1)/3} \begin{bmatrix} j_1 & j_2 & j \\ m_1 & m_2 & -m \end{bmatrix}_q \begin{bmatrix} j_2 & j_1 & j \\ -m'_2 & -m'_1 & -m \end{bmatrix}_q = \delta_{m_1 m'_1}\, \delta_{m_2 m'_2}, \tag{49} \]
where $\mu = m_1 + m_2 + m_3$.

References

[A] Alexander, J.W.: The combinatorial theory of complexes. Ann. of Math. 31, 292–320 (1930)
[A-C-M] Ambjørn, J., Carfora, M., Marzuoli, A.: The Geometry of Dynamical Triangulations. Lect. Notes in Physics m50. Berlin: Springer, 1997
[C-F-S] Carter, J.S., Flath, D.E., Saito, M.: The Classical and Quantum 6j-symbols. Math. Notes 43. Princeton, NJ: Princeton University Press, 1995
[C-K-S] Carter, J.S., Kauffman, L.H., Saito, M.: Structure and diagrammatics of four dimensional topological lattice field theories. Preprint, math.GT/9806023 (1998)
[D-R] De Pietri, R., Rovelli, C.: Geometry eigenvalues and the scalar product from recoupling theory in loop quantum gravity. Phys. Rev. D54, 2664–2690 (1996)
[G] Glaser, L.C.: Geometric Combinatorial Topology. Vol. 1. New York: van Nostrand Reinhold, 1970
[K-M-S] Karowski, M., Müller, W., Schrader, R.: State sum invariants of compact 3-manifolds with boundary and 6j-symbols. J. Phys. A: Math. Gen. 25, 4847–4860 (1992)
[K-R] Kirillov, A.N., Reshetikhin, N.Y.: Representations of the algebra Uq(sl2), q-orthogonal polynomials and invariants of links. In: Kac, V.G. (ed.) Infinite dimensional Lie algebras and groups, Adv. Ser. in Math. Phys. 7. Singapore: World Scientific, 1988, pp. 285–339
[M] Moussouris, J.P.: Quantum models of space-time based on recoupling theory. Oxford: Ph.D. thesis, 1983
[M-T] Mizoguchi, S., Tada, T.: 3-dimensional gravity and the Turaev-Viro invariant. Progr. Theor. Phys. Suppl. 110, 207–227 (1992)
[N] Nomura, M.: Relations for Clebsch–Gordan and Racah coefficients in suq(2) and Yang–Baxter equation. J. Math. Phys. 30, 2397–2405 (1989)
[O92,a] Ooguri, H.: Partition functions and topology-changing amplitudes in the three-dimensional lattice gravity of Ponzano and Regge. Nucl. Phys. B 382, 276–304 (1992)
[O92,b] Ooguri, H.: Topological lattice models in four dimensions. Mod. Phys. Lett. A 7, 2799–2810 (1992)
[P87] Pachner, U.: Ein Henkeltheorem für geschlossene semilineare Mannigfaltigkeiten. Result. Math. 12, 386–394 (1987)
[P90] Pachner, U.: Shellings of simplicial balls and p.l. manifolds with boundary. Discr. Math. 81, 37–47 (1990)
[P91] Pachner, U.: P.L. homeomorphic manifolds are equivalent by elementary shellings. Europ. J. Combinatorics 12, 129–145 (1991)
[Pe] Penrose, R.: Angular momentum: an approach to combinatorial space-time. In: Bastin, T. (ed.) Quantum Theory and beyond. Cambridge: Cambridge University Press, 1971, pp. 151–180
[P-R] Ponzano, G., Regge, T.: Semiclassical limit of Racah coefficients. In: Bloch, F. et al (eds.) Spectroscopic and Group Theoretical Methods in Physics. Amsterdam: North-Holland, 1968, pp. 1–58
[R] Regge, T.: General Relativity without coordinates. Nuovo Cimento 19, 558–571 (1961)
[R-S] Rourke, C., Sanderson, B.: Introduction to Piecewise Linear Topology. New York: Springer-Verlag, 1982
[Ro-S] Rovelli, C., Smolin, L.: Spin networks and quantum gravity. Phys. Rev. D52, 5743–5759 (1995)
[T] Thurston, W.P.: Three-dimensional Geometry and Topology. Vol. 1, Levy, S. (ed.). Princeton, NJ: Princeton University Press, 1997
[T-V] Turaev, V., Viro, O.Ya.: State sum invariants of 3-manifolds and quantum 6j-symbols. Topology 31, 865–902 (1992)
[V-M-K] Varshalovich, D.A., Moskalev, A.N., Khersonskii, V.K.: Quantum Theory of Angular Momentum. Singapore: World Scientific, 1988
[Y-L-V] Yutsis, A.P., Levinson, I.B., Vanagas, V.V.: The Mathematical Apparatus of the Theory of Angular Momentum. Jerusalem: Israel Program for Sci. Transl. Ltd., 1962

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 212, 591 – 611 (2000)
A Path Integral Approach to the Kontsevich Quantization Formula
Alberto S. Cattaneo¹, Giovanni Felder²
¹ Institut für Mathematik, Universität Zürich, 8057 Zürich, Switzerland. E-mail: [email protected]
² Departement Mathematik, ETH-Zentrum, 8092 Zürich, Switzerland. E-mail: [email protected]
Received: 10 March 1999 / Accepted: 30 January 2000
Abstract: We give a quantum field theory interpretation of Kontsevich’s deformation quantization formula for Poisson manifolds. We show that it is given by the perturbative expansion of the path integral of a simple topological bosonic open string theory. Its Batalin–Vilkovisky quantization yields a superconformal field theory. The associativity of the star product, and more generally the formality conjecture can then be understood by field theory methods. As an application, we compute the center of the deformed algebra in terms of the center of the Poisson algebra.
1. Introduction
In a recent paper [K], M. Kontsevich gave a general formula for the deformation quantization [BFFLS] of the algebra of functions on a Poisson manifold. The deformed product (the “star product”) is given in terms of an expansion reminiscent of the Feynman perturbation expansion of a two dimensional field theory on a disc with boundary. We review Kontsevich’s formula in Sect. 2. The purpose of this paper is to describe this quantum field theory explicitly. It turns out that it is a simple bosonic topological quantum field theory on a disc D with a field X : D → M taking values in the Poisson manifold M and a one-form η on D taking values in the pull-back $X^*(T^*M)$ of the cotangent bundle. The formula for the star product is
$$f \star g\,(x) = \int_{X(\infty)=x} f(X(1))\,g(X(0))\,e^{\frac{i}{\hbar}S[X,\eta]}\,dX\,d\eta,$$
where 0, 1, ∞ are three distinct points on the boundary of D. The integral is normalized in such a way that in the case of the trivial Poisson structure the star product reduces to the ordinary product. The action S is described in Sect. 3 and was originally studied for manifolds without boundary in [I] and [SchStr]. In particular the canonical quantization on the cylinder was considered.
In the symplectic case the above formula essentially reduces to the original Feynman path integral formula for quantum mechanics, as pointed out to us by H. Ooguri.
The quantization of the theory is somewhat subtle, due to the presence of a gauge symmetry which only closes on shell, as already noticed in [I]. In other words, the action S is a function of the fields annihilated by a distribution of vector fields which is only integrable on the set of critical points of S. As a consequence, the BRST quantization fails and one has to resort to the Batalin–Vilkovisky method (see for example [BV, W1, S1, AKSZ]). This method yields a gauge fixed action, which turns out to have a superconformal invariance. Its perturbative expansion around constant classical solutions reproduces Kontsevich’s formula.
As an application, we show in Sect. 4 by quantum field theory methods that there exists a star product equivalent to Kontsevich’s whose center consists of the power series in ħ whose coefficients are in the center of the Poisson algebra. A rigorous proof of this statement will appear elsewhere [CFT].
More generally, we may consider a path integral associated to an arbitrary polyvector field, a formal sum of skew-symmetric contravariant tensor fields of arbitrary rank, the star product being the special case of bivector fields. Correlation functions of boundary fields yield then a map U from polyvector fields to polydifferential operators. Formal properties of this map can be deduced from BV and factorization methods of quantum field theory. This leads to identities, also found by Kontsevich, which may be thought of as the open string analog of the WDVV equations [W2, DDV]. They may be formulated by saying that U is an $L_\infty$ morphism [SchlSt, LS]. They imply the associativity of the star product and, in the general setting of arbitrary polyvector fields, the formality conjecture [K]. These constructions are explained in Sect. 5.
Although the non-rigorous quantum field theory arguments of this paper are of course no substitute for the proofs in [K], this approach offers an explanation for why Kontsevich’s construction works, and puts it in the context of Feynman’s original picture of quantization [F]. Moreover, our approach indicates the way for more general constructions. In particular, one can consider the perturbative expansion around a non-trivial classical solution, one can insert a Hamiltonian and one can consider this quantum field theory on a complex curve of higher genus. We plan to study these variants in the future.
2. The Kontsevich Formula
In [K], M. Kontsevich wrote a beautiful explicit solution to the problem of deformation quantization of the algebra of functions on a Poisson manifold M. The problem is to find a deformation of the product on the algebra of smooth functions on a Poisson manifold, which to first order in Planck’s constant is given by the Poisson bracket. If M is an open set in $\mathbb{R}^d$ with a Poisson structure
$$\{f,g\}(x) = \sum_{i,j=1}^{d}\alpha^{ij}(x)\,\partial_i f(x)\,\partial_j g(x)$$
given by a skew-symmetric bivector field α, obeying the Jacobi identity
$$\alpha^{il}\partial_l\alpha^{jk} + \alpha^{jl}\partial_l\alpha^{ki} + \alpha^{kl}\partial_l\alpha^{ij} = 0, \tag{1}$$
the problem is to find an associative product $\star$ on $C^\infty(M)[[\hbar]]$, such that for $f, g \in C^\infty(M)$,
$$f \star g\,(x) = f(x)g(x) + \frac{i\hbar}{2}\{f,g\}(x) + O(\hbar^2).$$
Kontsevich’s solution¹ to this problem may be described as follows. The coefficient of $(i\hbar/2)^n$ in $f \star g$ is given by a sum of terms labeled by diagrams of order n. A diagram of order n is a graph consisting of n vertices numbered from 1 to n and two vertices labeled by letters L and R, for Left and Right. From each of the numbered vertices there emerge two ordered oriented edges that end at numbered vertices or at vertices labeled by letters, so that no edge starts and ends at the same vertex. The two edges emerging from vertex i are called $e_i^1$, $e_i^2$. They are of the form $e_i^a = (i, v_a(i))$ for some maps $v_a : \{1,\ldots,n\} \to \{1,\ldots,n,L,R\}$. In fact, a diagram can be thought of as an ordered pair $(v_1, v_2)$ of maps $\{1,\ldots,n\} \to \{1,\ldots,n,L,R\}$, such that $v_a(i)$ is never equal to i. To each diagram Γ of order n there corresponds a bidifferential operator $D_\Gamma$ whose coefficients are differential polynomials, homogeneous of degree n in the components $\alpha^{ij}$ of the Poisson structure. The edges indicate how the partial derivatives are acting. For instance the bidifferential operator² $(f,g) \mapsto \alpha^{ij}(x)\,\partial_i f(x)\,\partial_j g(x)$ corresponds to the diagram with vertices 1, L, R and edges $e_1^1 = (1,L)$, $e_1^2 = (1,R)$. The bidifferential operator $D_\Gamma(f \otimes g) = \alpha^{ij}\,\partial_i\alpha^{kl}\,\partial_j\partial_l f\,\partial_k g$ corresponds to the diagram with vertices 1, 2, L, R and edges $e_1^1 = (1,2)$, $e_1^2 = (1,L)$, $e_2^1 = (2,R)$, $e_2^2 = (2,L)$. Kontsevich’s formula is then
$$f \star g = fg + \sum_{n=1}^{\infty}\left(\frac{i\hbar}{2}\right)^{n}\sum_{\Gamma\ \text{of order}\ n} w_\Gamma\,D_\Gamma(f \otimes g).$$
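As a small worked check of these diagram rules, the order-one operators can be written out directly. This is only a sketch: the label Γ′ for the diagram with the two edges exchanged is introduced here for illustration, and the only inputs are the skew-symmetry of α and the first-order condition on the star product stated above.

```latex
% Order-one diagrams: vertex 1 carries \alpha^{ij}; the first edge carries
% \partial_i and the second edge carries \partial_j.
\begin{align*}
\Gamma:\ e_1^1=(1,L),\ e_1^2=(1,R)
  &\quad\Rightarrow\quad D_\Gamma(f\otimes g)=\alpha^{ij}\,\partial_i f\,\partial_j g=\{f,g\},\\
\Gamma':\ e_1^1=(1,R),\ e_1^2=(1,L)
  &\quad\Rightarrow\quad D_{\Gamma'}(f\otimes g)=\alpha^{ij}\,\partial_i g\,\partial_j f=-\{f,g\},\\
\text{both edges ending at } L
  &\quad\Rightarrow\quad \big(\alpha^{ij}\,\partial_i\partial_j f\big)\,g=0
  \quad\text{(skew-symmetry of } \alpha\text{)}.
\end{align*}
% Comparing the coefficient of i\hbar/2 with
% f\star g = fg + (i\hbar/2)\{f,g\} + O(\hbar^2):
\[
w_\Gamma D_\Gamma + w_{\Gamma'} D_{\Gamma'} = (w_\Gamma-w_{\Gamma'})\,\{f,g\}
\quad\Longrightarrow\quad w_\Gamma-w_{\Gamma'}=1 .
\]
```

So the first-order requirement only constrains the difference of the two order-one weights; diagrams with both edges ending at the same lettered vertex drop out regardless of their weight.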
The weight $w_\Gamma$ is the integral of a differential form over the configuration space $C_n(H) = \{u \in H^n,\ u_i \neq u_j\ (i \neq j)\}$ of n ordered points on the upper half plane H. It is defined as follows: for any two distinct points z, w in the upper half plane with the Poincaré metric $ds^2 = (dx^2 + dy^2)/y^2$, let φ(z, w) be the angle between the (vertical) geodesic connecting z to i∞ and the geodesic connecting z to w, measured in counterclockwise direction. Let $d\phi(z,w) = dz\,\frac{\partial}{\partial z}\phi(z,w) + dw\,\frac{\partial}{\partial w}\phi(z,w)$ denote the differential of this angle. Then the weight is
$$w_\Gamma = \frac{1}{(2\pi)^{2n}\,n!}\int_{C_n(H)}\bigwedge_{i=1}^{n} d\phi(u_i, u_{v_1(i)}) \wedge d\phi(u_i, u_{v_2(i)}),$$
where we set $u_L = 0$ and $u_R = 1$. The orientation is induced from the product of the standard orientation of the upper half plane. For example, we have two graphs of order one, differing in the ordering of edges. Let us compute the weight of these diagrams. Let Γ be the diagram with $e_1^1 = (1,L)$, $e_1^2 = (1,R)$. To compute the integral over $u = u_1 + iu_2 \in H$ we introduce new variables $\phi_0 = \phi(u,0)$, $\phi_1 = \phi(u,1)$. As arg(u) varies between 0 and π, the angle $\phi_0$ varies from 0 to 2π.
¹ In [K] ħ is what is here iħ/2. We adopt the notation of the physics literature and work accordingly over the complex numbers. With Kontsevich’s conventions one may formulate the problem over the real numbers, which in terms of the physics conventions would mean to have an imaginary Planck constant.
² We use throughout the paper the Einstein summation convention, meaning that sums over repeated indices are understood.
As we vary u on the half-line of constant $\phi_0$, the angle $\phi_1$ varies between $\phi_0$ (at infinity) and 2π (at u = 0). Thus this change of variables is a diffeomorphism from the upper half plane to the domain $0 < \phi_0 < \phi_1 < 2\pi$ in $\mathbb{R}^2$. The above description also shows that this diffeomorphism is orientation preserving. Thus
$$w_\Gamma = (2\pi)^{-2}\int_{0<\phi_0<\phi_1<2\pi} d\phi_0 \wedge d\phi_1 = \frac{1}{2}.$$
[...]
$$\int_{1>t_1>\cdots>t_{m-1}>0} f_0(X(1)) \cdot \prod_{k=1}^{m-1}\partial_{i_k}f_k(X(t_k))\,\eta^{+i_k}(t_k)\; f_m(X(0))\,\delta_x(X(\infty)).$$
Expanding the path integral in powers of ħ as in the previous section, we get a map U that associates, to each polyvector field α, a formal power series whose coefficients are polydifferential operators. The perturbative expansion has the form $U(\alpha) = \sum_{n=0}^{\infty} U_n(\alpha,\ldots,\alpha;\hbar)$. Here $U_n$ is a multilinear function of n arguments in $T_{poly}(M)$ with values in $D_{poly}(M)$. The formula for $U_n$ is
$$U_n(\alpha_1,\ldots,\alpha_n;\hbar)(f_0 \otimes \cdots \otimes f_m)(x) = \int e^{\frac{i}{\hbar}S_0}\,\frac{i}{\hbar}S_{\alpha_1}\cdots\frac{i}{\hbar}S_{\alpha_n}\,O_x(f_0,\ldots,f_m).$$
Suppose now that, for i = 1, ..., n, $\alpha_i$ is homogeneous of degree $p_i$. Then $S_{\alpha_i}$ is the integral of the two-form component of $\frac{1}{(p_i+1)!}\,\alpha_i^{j_0\ldots j_{p_i}}(\tilde X)\,\tilde\eta_{j_0}\cdots\tilde\eta_{j_{p_i}}$, and has thus ghost number $p_i - 1$. This has two consequences: first, since the integral over the $t_i$ picks the $m-1$ form component of $\prod_i f_i(\tilde X(t_i))$, which has ghost number $1-m$, we have the ghost number condition $1 - m + \sum_{i=1}^{n}(p_i - 1) = 0$ or
$$m = 1 - n + \sum_{i=1}^{n} p_i, \tag{9}$$
for the path integral to be non-zero. This means that $U_n$ is a map of degree 1 − n from $T_{poly}(M)^{\otimes n}$ to $D_{poly}(M)$. Using this formula we may compute the dependence on ħ of this integral: the path integral has an overall ħ to the power $-n + \sum_i (p_i + 1) = n + m - 1$ (each vertex has 1/ħ and each propagator has an ħ), and we have
$$U_n(\alpha_1,\ldots,\alpha_n;\hbar)(\otimes_0^m f_i) = (i\hbar)^{n+m-1}\,U_n(\alpha_1,\ldots,\alpha_n)(\otimes_0^m f_i),$$
with $U_n(\alpha_1,\ldots,\alpha_n) = U_n(\alpha_1,\ldots,\alpha_n;\hbar = 1/i)$ independent of ħ. The second consequence is that
$$U_n(\ldots,\alpha_i,\ldots,\alpha_j,\ldots) = (-1)^{(p_i-1)(p_j-1)}\,U_n(\ldots,\alpha_j,\ldots,\alpha_i,\ldots), \tag{10}$$
i.e., $U_n$ is symmetric in a graded sense.
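To illustrate the counting in (9), here is the special case relevant for the star product, under the assumption that every argument is a bivector field, so that all $p_i = 1$. Nothing beyond (9), the ħ-scaling relation and (10) is used.

```latex
\begin{align*}
p_1=\dots=p_n=1
  \quad&\Longrightarrow\quad m = 1-n+\sum_{i=1}^{n}p_i = 1,\\
  &\Longrightarrow\quad
  U_n(\alpha,\dots,\alpha;\hbar)(f_0\otimes f_1)
  = (i\hbar)^{\,n+m-1}\,U_n(\alpha,\dots,\alpha)(f_0\otimes f_1)
  = (i\hbar)^{n}\,U_n(\alpha,\dots,\alpha)(f_0\otimes f_1),\\
  &\Longrightarrow\quad (-1)^{(p_i-1)(p_j-1)} = +1 .
\end{align*}
```

In this case each $U_n(\alpha,\ldots,\alpha)$ acts on two functions, enters at order $\hbar^n$, and is symmetric in its arguments in the ordinary sense.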
5.2. Special cases. Let us consider in detail some special cases. For n = 0, $U_n$ is a polydifferential operator of degree m = 1, and
$$U_0(f_0 \otimes f_1) = \int e^{\frac{i}{\hbar}S_0}\,f_0(\tilde X(1))\,f_1(\tilde X(0))\,\delta_x(X(\infty)) = f_0(x)f_1(x)$$
is the undeformed product on $C^\infty(M)$. If n = 1 and α is a polyvector field of degree p then $U_1(\alpha)$ is of degree p. Let
$$\alpha = \frac{1}{(p+1)!}\,\alpha^{j_0,\ldots,j_p}(x)\,\frac{\partial}{\partial x^{j_0}} \wedge \cdots \wedge \frac{\partial}{\partial x^{j_p}},$$
with $\alpha^{j_0,\ldots,j_p}$ antisymmetric. The Wick theorem yields in this case
$$U_1(\alpha;\hbar)(f_0 \otimes \cdots \otimes f_p)(x) = \frac{i}{\hbar}\left(\frac{i\hbar}{2\pi}\right)^{p+1} I_p\,\alpha^{j_0,\ldots,j_p}\,\partial_{j_0}f_0(x)\cdots\partial_{j_p}f_p(x).$$
Here $I_p$ is the integral
$$I_p = \int d\phi(u,1) \wedge d\phi(u,t_1) \wedge \cdots \wedge d\phi(u,0),$$
over $u = u_1 + iu_2 \in H$ and $1 > t_1 > \cdots > t_{p-1} > 0$, with orientation given by the form $du_1 \wedge du_2 \wedge dt_1 \cdots dt_{p-1}$. To compute this integral we proceed as in Sect. 2 and introduce new variables $\phi_0 = \phi(u,1)$, $\phi_j = \phi(u,t_j)$ (j = 1, ..., p − 1) and $\phi_p = \phi(u,0)$. In the new variables the integration is over the region $2\pi > \phi_0 > \cdots > \phi_p > 0$. We claim that the Jacobian of the change of variables has sign $(-1)^p$. This follows from the fact that $d\phi(u,0) \wedge d\phi(u,1) = J\,du_1 \wedge du_2$ with J > 0 and that ∂φ(u, t)/∂t > 0. Hence,
$$\int d\phi(u,1) \wedge d\phi(u,t_1) \wedge \cdots \wedge d\phi(u,0) = (-1)^p\int_{2\pi>\phi_0>\cdots>\phi_p>0} d\phi_0\cdots d\phi_p = \frac{(-1)^p(2\pi)^{p+1}}{(p+1)!},$$
and we obtain
$$U_1(\alpha)(f_0 \otimes \cdots \otimes f_p)(x) = \frac{(-1)^{p+1}}{(p+1)!}\,\alpha^{j_0,\ldots,j_p}(x)\,\partial_{j_0}f_0(x)\cdots\partial_{j_p}f_p(x).$$
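Two steps of this computation can be checked by hand. The sketch below assumes only the formulas quoted in this subsection: the value of $I_p$ and the Wick-theorem expression for $U_1(\alpha;\hbar)$; the specialization to a bivector field (p = 1) is chosen here for illustration.

```latex
% Volume of the ordered region: the (p+1)! orderings of the coordinates
% split the cube [0,2\pi]^{p+1} into regions of equal volume.
\[
\int_{2\pi>\phi_0>\cdots>\phi_p>0} d\phi_0\cdots d\phi_p=\frac{(2\pi)^{p+1}}{(p+1)!}.
\]
% Check for p = 1, where I_1 = -(2\pi)^2/2:
\[
U_1(\alpha;\hbar)(f_0\otimes f_1)
=\frac{i}{\hbar}\Big(\frac{i\hbar}{2\pi}\Big)^{2}\Big(-\frac{(2\pi)^2}{2}\Big)
  \alpha^{ij}\,\partial_i f_0\,\partial_j f_1
=\frac{i\hbar}{2}\,\alpha^{ij}\,\partial_i f_0\,\partial_j f_1 .
\]
```

This is the first-order term $\frac{i\hbar}{2}\{f_0,f_1\}$ required of the star product in Sect. 2, and it agrees with $U_1(\alpha) = \frac{(-1)^{p+1}}{(p+1)!}\alpha\,\partial f_0\,\partial f_1$ at p = 1 once the overall factor $(i\hbar)^{n+m-1} = i\hbar$ is restored.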
5.3. U is an $L_\infty$ morphism. The formal properties of the map U can be deduced using the main trick of the BV formalism, which is to use the fact that the integral of the Laplacian of anything is zero. In our situation we have, with $S_0 = \int_D d^2\theta\,\tilde\eta_i D\tilde X^i - \int_D \lambda^i\gamma_i^+$, and $\alpha_j$ (j = 1, ..., n) homogeneous polyvector fields of degree $p_j$,
$$\int \Delta\Big(e^{\frac{i}{\hbar}S_0}\prod_{i=1}^{n} S_{\alpha_i}\,O_x(f_0,\ldots,f_m)\Big) = 0.$$
To evaluate the left-hand side we use (2) and $\Delta S_0 = \Delta S_\alpha = 0$. Also, we use the fact that $(S_0, S_\alpha)$ is proportional to $\int_D d^2\theta\,D\big(\alpha^{j_0\ldots j_p}(\tilde X)\,\tilde\eta_{j_0}\cdots\tilde\eta_{j_p}\big)$ which vanishes because of the boundary conditions for $\tilde\eta_j$. Thus we get
$$0 = (-1)^{m-1}\int e^{\frac{i}{\hbar}S_0}\prod_{i=1}^{n} S_{\alpha_i}\,\frac{i}{\hbar}\,(S_0, O_x(f_0,\ldots,f_m)) + \sum_{1\le j<k\le n}\epsilon_{jk}\int e^{\frac{i}{\hbar}S_0}\,(S_{\alpha_j}, S_{\alpha_k})\prod_{i\neq j,k} S_{\alpha_i}\,O_x(f_0,\ldots,f_m).$$
[...]
The defining property of this representation is that $a(u)|z,\lambda\rangle = 0 = a^*(u')|z,\lambda\rangle$, for $u \in H_+(z,\lambda)$ and $u' \in H_-(z,\lambda)$. All creation operators $a^*(u)$ and annihilation operators $a(u)$ are anticommuting except $a^*(u)a(u') + a(u')a^*(u) = \langle u', u\rangle$, where $\langle\cdot,\cdot\rangle$ is the inner product in $H_z$. If we change the vacuum level from λ to λ′ > λ, we have an isomorphism $\mathcal{F}(z,\lambda) \to \mathcal{F}(z,\lambda')$ which is natural up to a multiplicative phase. The phase is fixed by a choice of normalized eigenvectors $u_1, u_2, \ldots, u_p$ in the energy range $\lambda < D_z < \lambda'$ and setting $|z,\lambda'\rangle = a^*(u_1)\cdots a^*(u_p)|z,\lambda\rangle$. But this choice is exactly the same as choosing a (normalized) element in $\mathrm{DET}_{\lambda\lambda'}$ over the point z ∈ B. Thus, setting $\mathcal{F}_z = \mathcal{F}(z,\lambda) \otimes \mathrm{DET}_\lambda(z)$ for any λ not in the spectrum of $D_z$ we obtain, according to Theorem 1, a family of Fock spaces parametrized by points of B but which do not depend on the choice of λ, [19]. This gives us a smooth Fock bundle $\mathcal{F}$ over B. The K action on the base lifts to a $\hat K$ action in $\mathcal{F}$, the extension part in $\hat K$ coming entirely from the action in the determinant bundles $\mathrm{DET}_\lambda$. The Schrödinger wave functions for quantized fermions in background fields (parametrized by points of B) are sections of the Fock bundle. It follows that the Schwinger terms for the infinitesimal generators of $\hat K$, acting on Schrödinger wave functions, are given by the formula for c which describes the curvature of the determinant bundle in the K directions. In the case of B = A and K = G, the elements in the Lie algebra are the Gauss law generators. This case was discussed in detail in [1]. More generally, we give explicit formulas for the Schwinger terms in Sect. 5.
Gravitational Anomalies, Gerbes, and Hamiltonian Quantization
5. Explicit Computations
The Schwinger term in (2n − 1)-dimensional space will now be computed. This will be done by using notations for Yang-Mills, but it works for diffeomorphisms as well if a different symmetric invariant polynomial is used. Equation (4) gives that
$$\omega_{2n+1}(A+v, A_0) = (n+1)\int_0^1 f'\,p_{n+1}\Big(A + v - A_0,\ f\,dA + f^2A^2 + (1-f)\,dA_0 + (1-f)^2A_0^2 - f(f-1)[A_0,A] + f(f-1)[A - A_0, v] + \tfrac{1}{2}f(f-1)[v,v]\Big)\,dt.$$
The Schwinger term can be calculated from this expression. However, since we are only interested in the Schwinger term up to a coboundary, an alternative is to use the “triangle formula” as in [20]:
$$\omega_{2n+1}(A+v, A_0) \sim \omega_{2n+1}(A_0+v, A_0) + \omega_{2n+1}(A+v, A_0+v),$$
where “∼” means equality up to a coboundary with respect to d + δ, where δ is the BRST operator. This gives a simpler expression for the non-integrated Schwinger term and also for all other ghost degrees of $\omega_{2n+1}(A+v, A_0)$. Straightforward computations give the result
$$\omega_{2n+1}(A+v, A_0)^{(2)} \sim \frac{(n+1)n}{2}\,p_{n+1}\big(v,\ dv + [A_0,v],\ dA_0 + A_0^2\big) + \frac{(n+1)n(n-1)}{2}\int_0^1 f'(1-f)^2\,p_{n+1}\big(A - A_0,\ dv + [A_0,v],\ dv + [A_0,v],\ f\,dA + f^2A^2 + (1-f)\,dA_0 + (1-f)^2A_0^2 - f(f-1)[A_0,A]\big)\,dt,$$
where the index (2) means the part of the form that is quadratic in the ghost. Inserting n = 1, 2 and 3 gives:
$$\omega_3(A+v, A_0)^{(2)} \sim p_2\big(v,\ dv + [A_0,v]\big),$$
$$\omega_5(A+v, A_0)^{(2)} \sim 3\,p_3\big(v,\ dv + [A_0,v],\ dA_0 + A_0^2\big) + p_3\big(A - A_0,\ dv + [A_0,v],\ dv + [A_0,v]\big),$$
$$\omega_7(A+v, A_0)^{(2)} \sim 6\,p_4\big(v,\ dv + [A_0,v],\ dA_0 + A_0^2,\ dA_0 + A_0^2\big) + p_4\Big(A - A_0,\ dv + [A_0,v],\ dv + [A_0,v],\ dA + \tfrac{2}{5}A^2 + 3\,dA_0 + \tfrac{12}{5}A_0^2 + \tfrac{3}{5}[A_0,A]\Big).$$
This gives expressions for the non-integrated Schwinger term in a pure Yang-Mills potential if $p_{n+1}$ is the symmetrized trace. The appropriate polynomial to use for the Levi–Civita connection is $p_{n+1} = \hat A(M)_{n+1}$, according to Eq. (2). Using
$$\hat A(M) = 1 + \left(\frac{1}{4\pi}\right)^{2}\frac{1}{12}\operatorname{tr}R^2 + \left(\frac{1}{4\pi}\right)^{4}\left[\frac{1}{288}\big(\operatorname{tr}R^2\big)^2 + \frac{1}{360}\operatorname{tr}R^4\right] + \ldots$$
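gives for n = 1 and 2:
$$\omega_3(\Gamma+v, \Gamma_0)^{(2)} \sim \left(\frac{1}{4\pi}\right)^{2}\frac{1}{12}\operatorname{tr}\big(v\,dv + 2\Gamma_0 v^2\big),$$
$$\omega_5(\Gamma+v, \Gamma_0)^{(2)} \sim 0.$$
Since the expression for $\omega_7$ is rather long we will omit to write it down. However, for the special case $\Gamma_0 = 0$ it becomes:
$$\omega_7(\Gamma+v, 0)^{(2)} \sim \left(\frac{1}{4\pi}\right)^{4}\Big[\frac{1}{288}\cdot\frac{2}{3}\operatorname{tr}(\Gamma\,dv)\operatorname{tr}\Big(dv\big(d\Gamma + \tfrac{2}{5}\Gamma^2\big)\Big) + \frac{1}{288}\cdot\frac{1}{3}\operatorname{tr}\big((dv)^2\big)\operatorname{tr}\Big(\Gamma\big(d\Gamma + \tfrac{2}{5}\Gamma^2\big)\Big) + \frac{1}{360}\cdot\frac{1}{3}\operatorname{tr}\Big(\big(R - \tfrac{3}{5}\Gamma^2\big)\big(\Gamma(dv)^2 + (dv)\Gamma\,dv + (dv)^2\Gamma\big)\Big)\Big].$$
This expression can be simplified if subtracting the coboundary
$$\left(\frac{1}{4\pi}\right)^{4}\frac{1}{288}\cdot\Big[\frac{2}{3}\,\delta\Big(\operatorname{tr}(\Gamma\,dv)\operatorname{tr}\big(\Gamma\,d\Gamma + \tfrac{4}{5}\Gamma^3\big)\Big) + d\Big(\operatorname{tr}(v\,dv)\operatorname{tr}\big(\Gamma\,d\Gamma + \tfrac{2}{3}\Gamma^3\big)\Big)\Big].$$
The result is
$$\omega_7(\Gamma+v, 0)^{(2)} \sim \left(\frac{1}{4\pi}\right)^{4}\Big[\frac{1}{288}\operatorname{tr}(v\,dv)\operatorname{tr}R^2 + \frac{1}{360}\cdot\frac{1}{3}\operatorname{tr}\Big(\big(R - \tfrac{3}{5}\Gamma^2\big)\big(\Gamma(dv)^2 + (dv)\Gamma\,dv + (dv)^2\Gamma\big)\Big)\Big].$$
The gravitational Schwinger terms are obtained by multiplying with the normalization factor $(i/2\pi)^{-1}$, inserting the integration over N and evaluating on vector fields X and Y on M generating diffeomorphisms. The Levi–Civita connection and curvature have components $(\Gamma)^i{}_j = \Gamma^i_{kj}\,dx^k$ and $(R)^i{}_j = R^i{}_{jkl}\,dx^k \wedge dx^l$. Recall that $(v(X))^i{}_j = \partial_j X^i$, see, for instance, [18]. To illustrate how the Schwinger terms can be computed, we give the result for 1 space dimension:
$$-2\pi i\int_N \omega_3(\Gamma+v, \Gamma_0)^{(2)}(X,Y) \sim -\frac{i}{48\pi}\int_N (\partial_x X)\,\partial_x^2 Y\,dx,$$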
where “∼” now means equality up to a coboundary with respect to the BRST operator. When both a Yang-Mills field and gravity are present, the relevant polynomial is a sum of polynomials of type $p_k\big(F^{\omega(t)}\big)\,\tilde p_l\big(R^{\omega(t)}\big)$,
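where the curvatures $F^{\omega(t)}$ and $R^{\omega(t)}$ are with respect to pure Yang–Mills and pure gravity, respectively. This gives
$$\int_M p_k\big(F^{\omega(t)}\big)\,\tilde p_l\big(R^{\omega(t)}\big) = \int_N\int_0^1\Big[k\,p_k\big(f'(t)(A + v_A - A_0),\ F^{\omega(f(t))}\big)\,\tilde p_l\big(R^{\omega(h(t))}\big) + l\,p_k\big(F^{\omega(f(t))}\big)\,\tilde p_l\big(h'(t)(\Gamma + v_\Gamma - \Gamma_0),\ R^{\omega(h(t))}\big)\Big]\,dt.$$
The expression is independent of f and h (see below). With a choice such that f′(t) = 0, t ∈ [1/2, 1] and h′(t) = 0, t ∈ [0, 1/2], this implies that
$$\int_M p_k\big(F^{\omega(t)}\big)\,\tilde p_l\big(R^{\omega(t)}\big) = \int_N\Big(\omega_{2k-1}(A + v_A, A_0)\,\tilde p_l(R_0) + p_k(F)\,\tilde\omega_{2l-1}(\Gamma + v_\Gamma, \Gamma_0)\Big). \tag{5}$$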
Thus, the Schwinger term in combined Yang-Mills and gravity is up to a coboundary equal to the part of the expansion of (5) that is of second ghost degree. In particular, this implies that Schwinger terms which have one Yang–Mills ghost and one diffeomorphism ghost are in cohomology equal to the Schwinger term obtained from the form in (5). Thus, truly mixed Schwinger terms do not exist. Notice that if the background fields are vanishing then the Schwinger term is gravitational (although some parts of the form degrees are taken up by the Yang–Mills polynomial). This can give anomalies of Virasoro type in higher dimensions. Observe that there is nothing special about gravity, a Yang–Mills Schwinger term is obtained by interchanging the role of f and h. This does however not mean that the gravitational Schwinger term differs from the Yang–Mills Schwinger term by a coboundary. The terms with k = 0 respectively l = 0 ruin this argument.
It is easy to see that our method of computing the Schwinger term agrees with one of the most common approaches: The polynomial $p_k(F^n) - p_k(F_0^n)$ is written as (d + δ) on a form, the (non-integrated) Chern–Simons form. The Schwinger term is given by the part of the Chern–Simons form that is quadratic in the ghost. For the case when both Yang-Mills and gravity are present, the relevant polynomial is a sum of polynomials
$$p_k(F)\,\tilde p_l(R) - p_k(F_0)\,\tilde p_l(R_0). \tag{6}$$
There is an ambiguity in the definition of the Chern–Simons form; it is for instance possible to add forms of type (d + δ)χ to it. However, an ambiguity of this type will only change the Schwinger term by a coboundary. It will now be shown that the ambiguity in the definition of the Chern–Simons form is only of this type. Thus, we must prove that closedness with respect to (d + δ) implies exactness. This can be done by introducing the degree 1 derivation $\varepsilon$ defined on the generators by: $\varepsilon(d+\delta)(A + v_A) = A + v_A$, $\varepsilon(d+\delta)(\Gamma + v_\Gamma) = \Gamma + v_\Gamma$, $\varepsilon\,dA_0 = A_0$, $\varepsilon\,d\Gamma_0 = \Gamma_0$, and otherwise zero. Then $\varepsilon(d+\delta) + (d+\delta)\varepsilon$ is a degree 0 derivation which is equal to 1 on the generators. Therefore, if χ is closed with respect to (d + δ), then χ is proportional to $(d+\delta)\varepsilon\chi$. An example of a (non-integrated) Chern–Simons form for the polynomial in (6) is $\omega_{2k-1}(A + v_A, A_0)\,\tilde p_l(R_0) + p_k(F)\,\tilde\omega_{2l-1}(\Gamma + v_\Gamma, \Gamma_0)$. This is in complete agreement with (5).
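As a cross-check of the numerical coefficients in the n = 3 Yang–Mills expression for $\omega_7$ earlier in this section, the t-integral can be done in closed form. This is a sketch under two assumptions made here: the weight in the integrand is $f'(1-f)^2$, and the interpolating function satisfies f(0) = 0, f(1) = 1, so that one may substitute s = f(t). The prefactor is $(n+1)n(n-1)/2 = 12$.

```latex
% Each coefficient is a Beta-type integral; (a,b) are the powers of f and (1-f)
% multiplying the corresponding term of the interpolated curvature.
\[
12\int_0^1 (1-s)^{2}\,s^{a}(1-s)^{b}\,ds \;=\; 12\,\frac{a!\,(b+2)!}{(a+b+3)!}
\]
\begin{align*}
dA:      &\quad (a,b)=(1,0) &&\Rightarrow\quad 12\cdot\tfrac{1}{12}=1,\\
A^{2}:   &\quad (a,b)=(2,0) &&\Rightarrow\quad 12\cdot\tfrac{1}{30}=\tfrac{2}{5},\\
dA_0:    &\quad (a,b)=(0,1) &&\Rightarrow\quad 12\cdot\tfrac{1}{4}=3,\\
A_0^{2}: &\quad (a,b)=(0,2) &&\Rightarrow\quad 12\cdot\tfrac{1}{5}=\tfrac{12}{5},\\
[A_0,A]: &\quad (a,b)=(1,1) &&\Rightarrow\quad 12\cdot\tfrac{1}{20}=\tfrac{3}{5}.
\end{align*}
```

These values reproduce the combination $dA + \tfrac{2}{5}A^2 + 3\,dA_0 + \tfrac{12}{5}A_0^2 + \tfrac{3}{5}[A_0,A]$. The same substitution gives $\int_0^1 f'(1-f)^2\,dt = \tfrac{1}{3}$, which turns the prefactor 3 at n = 2 into the coefficient 1 of the second term of $\omega_5$.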
C. Ekstrand, J. Mickelsson
References
1. Carey, A.L., Mickelsson, J. and Murray, M.K.: Index theory, gerbes and Hamiltonian quantization. Commun. Math. Phys. 183, 707 (1997), hep-th/9511151
2. Adler, S.: Axial vector vertex in spinor electrodynamics. Phys. Rev. 177, 2426 (1969); Bell, J. and Jackiw, R.: A PCAC puzzle: pi0 → gamma gamma in the sigma model. Nuovo Cimento 60A, 47 (1969)
3. Jackiw, R. and Rebbi, C.: Conformal properties of a Yang-Mills pseudoparticle. Phys. Rev. D 14, 517 (1976); Nielsen, N.K., Römer, H. and Schroer, B.: Classical anomalies and a local version of the Atiyah-Singer index theorem. Phys. Lett. B 70, 445 (1977); Hawking, S.W.: Gravitational instantons. Phys. Lett. A 60, 81 (1977)
4. Alvarez-Gaume, L. and Witten, E.: Gravitational anomalies. Nucl. Phys. B 234, 269 (1984)
5. Schwinger, J.: Field theory commutators. Phys. Rev. Lett. 3, 296 (1959)
6. Mickelsson, J.: Chiral anomalies in even and odd dimensions. Commun. Math. Phys. 97, 361 (1985); On a relation between massive Yang-Mills theories and dual string models. Lett. Math. Phys. 7, 45 (1983); Faddeev, L. and Shatashvili, S.: Algebraic and Hamiltonian methods in the theory of nonabelian anomalies. Theoret. Math. Phys. 60, 770 (1985)
7. Mickelsson, J.: Wodzicki residue and anomalies of current algebras. In: Integrable Models and Strings, ed. by A. Alekseev et al., Springer Lecture Notes in Physics 436; hep-th/9404093
8. Carey, A.L., Mickelsson, J. and Murray, M.K.: Bundle gerbes applied to field theory. hep-th/9711133, Rev. Math. Phys. 12, 65 (2000)
9. Murray, M.K.: Bundle Gerbes. J. London Math. Soc. (2) 54, 403 (1996), dg-ga/9407015
10. Atiyah, M.F., Patodi, V.K. and Singer, I.M.: Spectral asymmetry and Riemannian geometry I. Math. Proc. Camb. Phil. Soc. 77, 43 (1975)
11. Atiyah, M.F. and Singer, I.M.: The index of elliptic operators (IV). Ann. Math. 93, 119 (1971)
12. Bismut, J.M. and Cheeger, J.: Family index for manifolds with boundary, superconnections, and cones. I. J. Funct. Anal. 89, 313 (1990); Family index for manifolds with boundary, superconnections, and cones. II. J. Funct. Anal. 90, 306 (1990)
13. Bismut, J.M. and Freed, D.S.: The analysis of elliptic families: Metrics and connections on determinant line bundles. Commun. Math. Phys. 106, 159 (1986); The analysis of elliptic families: Dirac operators, eta invariants and the holonomy theorem of Witten. Commun. Math. Phys. 107, 103 (1986)
14. Piazza, P.J.: Determinant bundles, manifolds with boundary and surgery. Commun. Math. Phys. 178, 597 (1996)
15. Quillen, D.: Determinants of Cauchy–Riemann operators on Riemann surfaces. Funct. Anal. Appl. 19, 31 (1985)
16. Atiyah, M.F. and Singer, I.M.: Dirac operators coupled to vector potentials. Proc. Nat. Acad. Sci. 81, 2596 (1984)
17. Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds, Sect. 5.2.3. Clarendon Press, Oxford, 1990
18. Bardeen, W. and Zumino, B.: Consistent and covariant anomalies in gauge and gravitational theories. Nucl. Phys. B 244, 421 (1984)
19. Mickelsson, J.: On the Hamiltonian approach to commutator anomalies in (3+1) dimensions. Phys. Lett. B 241, 70 (1990)
20. Mañes, J., Stora, R., Zumino, B.: Algebraic study of chiral anomalies. Commun. Math. Phys. 102, 157 (1985)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 212, 625 – 647 (2000)
Picard–Fuchs Uniformization and Modularity of the Mirror Map
Charles F. Doran
Department of Mathematics, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]
Received: 14 August 1999 / Accepted: 30 January 2000
Abstract: Arithmetic properties of mirror symmetry (type IIA-IIB string duality) are studied. We give criteria for the mirror map q-series of certain families of Calabi–Yau manifolds to be automorphic functions. For families of elliptic curves and lattice polarized K3 surfaces with surjective period mappings, global Torelli theorems allow one to present these criteria in terms of the ramification behavior of natural algebraic invariants – the functional and generalized functional invariants respectively. In particular, when applied to one parameter families of rank 19 lattice polarized K3 surfaces, our criterion demystifies the Mirror-Moonshine phenomenon of Lian and Yau and highlights its non-monstrous nature. The lack of global Torelli theorems and presence of instanton corrections makes Calabi–Yau threefold families more complicated. Via the constraints of special geometry, the Picard–Fuchs equations for one parameter families of Calabi–Yau threefolds imply a differential equation criterion for automorphicity of the mirror map in terms of the Yukawa coupling. In the absence of instanton corrections, the projective periods map to a twisted cubic space curve. A hierarchy of “algebraic” instanton corrections correlated with the differential Galois group of the Picard–Fuchs equation is proposed.
1. Introduction
Numerous remarkable properties of the type IIA-IIB string duality better known as mirror symmetry have been revealed since its discovery a decade ago. Mathematically this symmetry entails a correspondence between complex moduli in one family of Calabi–Yau manifolds and Kähler moduli of a mirror family. In the neighborhood of a large complex structure/large radius limit point mirror symmetry is described by the mirror map q-series.
Present address: Center for Geometry and Mathematical Physics, Department of Mathematics, Pennsylvania State University, University Park, PA 16802, USA
The mirror map is a locally holomorphic function determined by the behavior of fundamental solutions to the Picard–Fuchs equation for periods of a Calabi–Yau family about a point of maximal unipotent monodromy. For a family of Calabi–Yau threefolds the mirror map q-series and the Yukawa couplings determine a generating function for the Gromov–Witten invariants. These invariants (conjecturally) count the number of rational curves on a generic member of the family. In fact, the original predictions of Candelas [2] for the one parameter family of Fermat-type quintic Calabi–Yau hypersurfaces have now been proven mathematically [22].
In a series of papers, Lian and Yau [23–26] investigate arithmetic properties of the mirror maps of several “torically defined” families of elliptic curves, K3 surfaces, and Calabi–Yau threefolds constructed in their work with Hosono, Klemm, Roan and Theisen [14, 15, 19, 16, 18]. Each of the one parameter families of elliptic curves and K3 surfaces they study has a globally defined mirror map, automorphic with respect to the global monodromy group of the family. The mirror maps of these elliptic curve families are classical modular functions for finite index subgroups of PSL(2, Z), while the mirror maps of the K3 surface families are, up to an additive integer correction, always reciprocals of some McKay–Thompson series associated to the monster in the “Monstrous Moonshine” lists of Conway and Norton [4]. In particular, the mirror maps of their examples are always automorphic functions for genus zero subgroups of PSL(2, R), a phenomenon Lian and Yau dub “Mirror-Moonshine”.
When such modularity properties are possessed by a mirror map, other properties of potential physical interest can be derived: e.g., integrality of the mirror map and prepotential, congruences satisfied by the mirror map coefficients, the effect on instanton corrections, etc. Thus a question of mathematical interest and physical relevance is:
Question 1. When is the mirror map an automorphic function?
Unlike other questions regarding the mirror map studied in the literature, this is an inherently global question.
We are asking for which families of Calabi–Yau manifolds does the mirror map admit an extension to a map from the whole period domain to the entire base of the family. Our question is related to the classical problem of characterizing modular relations between automorphic functions and the elliptic modular function. In fact, for families of elliptic curves we will see in Sect. 2.1 that this is all that is involved: we recover the classical criterion for just such modular relations from [11, 1, 41].
In [7, 8] we answer our question for families over $\mathbb{P}^1$. For elliptic curve families we use Kodaira’s functional invariant J to pull back the uniformizing differential equation for the elliptic modular function from the coarse moduli space of elliptic curves (the J-line). The existence of the functional invariant J can be interpreted as a consequence of the (trivial) classical analogue of the global Torelli theorem. In the case of lattice polarized K3 surface families, we apply the global Torelli theorem of Nikulin (see the lists of related works in Dolgachev [6]) to define a generalized functional invariant mapping again from the base of a family to the associated coarse moduli space. We use this generalized functional invariant to explain the Mirror-Moonshine phenomenon for families of K3 surfaces over $\mathbb{P}^1$ with third order Picard–Fuchs differential equations – the setting in which the Mirror-Moonshine Conjecture of Lian and Yau was originally formulated.
The basic idea behind our approach to answering the modularity question for one parameter families is quite simple: The mirror map of a family of elliptic curves (resp. rank 19 lattice polarized K3 surfaces) is classically modular (resp. automorphic) if and only if the Picard–Fuchs differential equation is a classical uniformizing differential equation (resp. the symmetric square of one). We call this Picard–Fuchs uniformization.
In this paper, instead of deriving the modularity criterion “from scratch” from the local behavior of uniformizing differential equations on $\mathbb{P}^1$, we use the theory of branched covers of orbifolds as described by Namba [31]. This approach gives us directly the modularity criterion in the neatest possible form, and applies to both
1. one parameter families of elliptic curves (Sect. 2.1) and rank 19 lattice polarized K3 surfaces (Sect. 3.1) over a base curve of arbitrary genus, and
2. multiparameter families of lattice polarized K3 surfaces with surjective period mapping (Sect. 3.2).
We replace the uniformizing differential equations for the elliptic curve and lattice polarized K3 surface families with holomorphic projective connections and holomorphic conformal connections respectively. Picard–Fuchs uniformization occurs when the Gauss–Manin connection of such a family of elliptic curves (resp. lattice polarized K3 surfaces) is a holomorphic projective (resp. holomorphic conformal) connection.
The lack of a global Torelli theorem for Calabi–Yau threefolds (in particular no presentation of the coarse moduli space as a locally symmetric space) prevents one from algebraically defining generalized functional invariants or mimicking the previous arguments for elliptic curves and K3 surfaces. Instead of an algebraic criterion for modularity of the mirror map, we must settle for a differential algebraic one in general. The Picard–Fuchs equation for a one parameter family of Calabi–Yau manifolds with $h^{2,1} = 1$ has order four. Moreover the constraints imposed by special geometry imply that about a point of maximal unipotent monodromy there is a set of fundamental solutions of the form
$$u,\quad u \cdot t,\quad u \cdot \dot F,\quad u \cdot (t\dot F - 2F),$$
where u(z) is the fundamental solution locally holomorphic at the point of maximal unipotent monodromy, t(z) is the mirror map, and F(z) is the prepotential (the derivative $\dot F$ is taken with respect to the mirror map coordinate t). Following Lian and Yau, one can derive a “quantum Schwarzian equation” relating the second order coefficient of the Picard–Fuchs equation, the mirror map, and the Yukawa couplings (Sect. 4.1).
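For orientation, here is a worked special case of this basis of solutions. It rests on an assumption made here only for illustration: that the prepotential is purely cubic, $F = \kappa t^3/6$, with κ a constant introduced for this example.

```latex
\[
F=\frac{\kappa}{6}\,t^{3}
\quad\Longrightarrow\quad
\dot F=\frac{\kappa}{2}\,t^{2},\qquad
t\dot F-2F=\frac{\kappa}{2}\,t^{3}-\frac{\kappa}{3}\,t^{3}=\frac{\kappa}{6}\,t^{3},
\]
\[
\big(u,\ u\,t,\ u\,\dot F,\ u\,(t\dot F-2F)\big)
= u\cdot\Big(1,\ t,\ \frac{\kappa}{2}\,t^{2},\ \frac{\kappa}{6}\,t^{3}\Big).
\]
% With y_1 = u^{1/3} and y_2 = u^{1/3} t the four solutions are, up to
% constants, the cubic monomials y_1^3, y_1^2 y_2, y_1 y_2^2, y_2^3.
```

Under this assumption the four solutions are, up to constant rescalings, $u \cdot (1, t, t^2, t^3)$, so projectively they trace out a twisted cubic parametrized by t.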
In the absence of instanton corrections, this quantum Schwarzian reduces to a classical one, and the Picard–Fuchs equation takes the special form of a symmetric cube of a second order equation. We give first a criterion for modularity of such mirror maps in the beginning of Sect. 4.2. Suppose on the other hand that there are instanton corrections, so the quantum Schwarzian is not classical. If we assume that the mirror map is an automorphic function, it will satisfy another classical Schwarzian equation. By subtracting the two to eliminate the Schwarzian derivative terms, and applying a reduction of order argument to the original Picard–Fuchs equation, we obtain a nonlinear differential equation in the Yukawa coupling and coefficients of the Picard–Fuchs and classical Schwarzian equations (the “modularity equation” in Theorem 9). The mirror map does not appear in this expression, yet the equation will hold if and only if the mirror map is automorphic. This is our general criterion for modularity of the mirror map for Calabi–Yau threefolds. The absence of instanton corrections in a one parameter family of Calabi–Yau threefolds corresponds to the existence of a homogeneous third order relation among the four periods, i.e., the image of the period mapping lies on a twisted cubic space curve. It is natural to ask what other homogeneous algebraic relations can occur between periods of one parameter families of Calabi–Yau threefolds. We call these “algebraic” instanton corrections. In Sect. 4.3 we apply a century old theorem of Fano to give a rough classification, paralleling the structure of the differential Galois group of the Picard–Fuchs equation. Most of the results on the mirror map for Calabi–Yau manifolds which appear in the literature depend on the hypothesis that the families of Calabi–Yau threefolds arise
C. F. Doran
“torically”, i.e., as particular parametrized families of hypersurfaces or complete intersections in Fano toric varieties. By working in the setting of transcendental algebraic geometry, we obtain general results about whole classes of families of Calabi–Yau manifolds. There has been a major effort in the literature to produce examples, first of mirror maps in general [19, 14–16] and then to test the (generalized) Mirror-Moonshine phenomenon in particular [23–26, 42]. Since the point of this paper is to explain general tools and results, we refer the reader interested in examples to the papers cited above. 2. Elliptic Curve Families In this section we derive the modularity criterion for mirror maps of one parameter families of elliptic curves with section (Theorem 2), and make some comments on the case of multiparameter families of elliptic curves (Sect. 2.2). It does not really make sense to ask our question in this latter case, but we will use it to motivate some aspects of the problem for multiparameter families of K3 surfaces. 2.1. One parameter families of elliptic curves with section. In the early 1960’s Kodaira developed a general theory of elliptic surfaces, i.e., compact complex surfaces fibered over curves, with generic fiber an elliptic curve. In particular he showed that every elliptic surface with section is determined by a pair of natural invariants. The first of these, the functional invariant, is a meromorphic function on the base of the family which keeps track of the J -value of each elliptic curve fiber. The second, the homological invariant, is nothing more than the monodromy representation associated with the second order Fuchsian ordinary differential equation satisfied by the periods, i.e., the monodromy of the Picard–Fuchs equation. The elliptic surfaces with a section, the basic elliptic surfaces, play a distinguished role in Kodaira’s theory. 
There is a canonical form for such a family of elliptic curves $\pi : X \to C$ with section, exhibiting $X$ as a divisor in a $\mathbb{P}^2$-bundle over the base curve $C$:

Theorem 1 ([29], Theorem (2.1)). Let $\Sigma$ denote the given section of $\pi$, i.e., $\Sigma = s(C)$, a divisor on $X$ which is taken isomorphically onto $C$ by $\pi$. Let $\mathcal{L} = \pi_*[\mathcal{O}_X(\Sigma)/\mathcal{O}_X]$. Suppose that the general fiber of $\pi$ is smooth. Then $\mathcal{L}$ is invertible and $X$ is isomorphic to the closed subscheme of $P = \mathbb{P}(\mathcal{L}^{\otimes 2} \oplus \mathcal{L}^{\otimes 3} \oplus \mathcal{O}_C)$ defined by
\[ y^2 z = 4x^3 - g_2 x z^2 - g_3 z^3, \]
where
\[ g_2 \in \Gamma(C, \mathcal{L}^{\otimes -4}), \qquad g_3 \in \Gamma(C, \mathcal{L}^{\otimes -6}), \]
and $[x, y, z]$ is the global coordinate system of $P$ relative to $(\mathcal{L}^{\otimes 2}, \mathcal{L}^{\otimes 3}, \mathcal{O}_C)$. Moreover the pair $(g_2, g_3)$ is unique up to isomorphism, and the discriminant $\Delta = g_2^3 - 27 g_3^2 \in \Gamma(C, \mathcal{L}^{\otimes -12})$ vanishes at a point $s \in C$ precisely when the fiber $X_s$ is singular.

For a family of elliptic curves in Weierstrass form, the functional invariant takes the form
\[ \mathcal{J} = g_2^3/\Delta : C \to \mathbb{P}^1_J. \tag{1} \]
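As a small illustration of Eq. (1), the following sketch evaluates the discriminant and the functional invariant at a single point of the base, using exact rational arithmetic. The function names `discriminant` and `functional_invariant` are hypothetical, and treating $g_2$, $g_3$ as plain numbers (their values in a local trivialization at one point) is an assumption made only for this example.

```python
from fractions import Fraction


def discriminant(g2, g3):
    """Weierstrass discriminant Delta = g2^3 - 27*g3^2 at one point of the base."""
    return g2 ** 3 - 27 * g3 ** 2


def functional_invariant(g2, g3):
    """Value of g2^3 / Delta as in Eq. (1).

    Returns None when Delta vanishes, i.e. over a singular fiber,
    where the quotient is not a finite number.
    """
    delta = discriminant(g2, g3)
    if delta == 0:
        return None
    return Fraction(g2 ** 3) / Fraction(delta)


if __name__ == "__main__":
    print(functional_invariant(0, 1))    # g2 = 0 gives the value 0
    print(functional_invariant(1, 0))    # g3 = 0 gives the value 1
    print(functional_invariant(3, 1))    # Delta = 27 - 27 = 0, singular fiber
```

With this normalization the value 0 corresponds to $g_2 = 0$ and the value 1 to $g_3 = 0$, which is the normalization used in the ramification statements over $\{0, 1, \infty\}$.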
The fact that the functional invariant takes the special form of Eq. (1) is evidence of the coarseness of the “$J$-line” moduli space of elliptic curves. By contrast, if we were to mark the two torsion on each elliptic curve, i.e., use the Legendre family $y^2 = x(x-1)(x-\lambda(s))$ “$\lambda$-line” moduli space, then any rational function on the base curve $C$ would be a “$\lambda$-functional invariant” for a family of elliptic curves with level two structure over $C$.

Kodaira has classified the singular fiber types which can arise in Weierstrass fibered elliptic surfaces. The singular fibers which appear in a smooth minimal elliptic surface fall into “types”: $I_n$ ($n \ge 0$), II, III, IV, $I_n^*$ ($n \ge 0$), IV$^*$, III$^*$, and II$^*$. Denote a smooth elliptic fiber by $I_0$. The fiber of type $I_1$ is a rational curve with a single node. More generally, fibers of type $I_n$ consist of an $n$-cycle of intersecting rational curves for $n \ge 1$. A fiber of type II is just a rational curve with a single cusp. Type III fibers consist of two rational curves with a single point of tangency. Fibers of type IV consist of three rational components intersecting at a single point. There are also fibers of types $I_n^*$, $n \ge 0$, IV$^*$, III$^*$, and II$^*$, whose dual intersection graphs, minus in each case a multiplicity one component, correspond to those graphs of Dynkin types $D_{n+4}$, $E_6$, $E_7$, and $E_8$ respectively. We now recall how the Kodaira fiber types correlate with the ramification behavior of the $\mathcal{J}$-map.

Lemma 1 ([30], Lemma IV.4.1). Let $F = X_s$ be the fiber of $\pi$ over $s \in C$, and let $\nu_s(\mathcal{J})$ be the multiplicity of the functional invariant at $s$.
1. If $F$ has type II, IV, IV$^*$, or II$^*$, then $\mathcal{J}(s) = 0$. Conversely, suppose that $\mathcal{J}(s) = 0$. Then
 – $F$ has type $I_0$ or $I_0^*$ if and only if $\nu_s(\mathcal{J}) \equiv 0 \bmod 3$,
 – $F$ has type II or IV$^*$ if and only if $\nu_s(\mathcal{J}) \equiv 1 \bmod 3$,
 – $F$ has type IV or II$^*$ if and only if $\nu_s(\mathcal{J}) \equiv 2 \bmod 3$.
2. If $F$ has type III or III$^*$, then $\mathcal{J}(s) = 1$. Conversely, suppose that $\mathcal{J}(s) = 1$. Then
 – $F$ has type $I_0$ or $I_0^*$ if and only if $\nu_s(\mathcal{J}) \equiv 0 \bmod 2$,
 – $F$ has type III or III$^*$ if and only if $\nu_s(\mathcal{J}) \equiv 1 \bmod 2$.
3. $F$ has type $I_n$ or $I_n^*$ with $n \ge 1$ if and only if $\mathcal{J}$ has a pole at $s$ of order $n$.

Following [35, p. 304], one can apply the Griffiths–Dwork approach to computing the Picard–Fuchs equation of a Weierstrass elliptic surface as a Fuchsian system:
\[ \frac{d}{dz}\begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix} = \begin{pmatrix} -\frac{1}{12}\frac{d \log \Delta}{dz} & \frac{3\delta}{2\Delta} \\[4pt] -\frac{g_2 \delta}{8\Delta} & \frac{1}{12}\frac{d \log \Delta}{dz} \end{pmatrix} \begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix}, \]
where
\[ \Delta = g_2^3 - 27 g_3^2, \qquad \delta = 3 g_3 g_2' - 2 g_2 g_3', \]
and
\[ \eta_1 = \int_\gamma \frac{dx}{y}, \qquad \eta_2 = \int_\gamma \frac{x\,dx}{y}. \]
From this, for a one parameter family of elliptic curves in Weierstrass form it is not difficult to write down the Picard–Fuchs second order ordinary differential equation satisfied by the periods of the holomorphic one form ω = dx/y over the one cycles on
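the fibers. Picking a basis of one cycles $\gamma_i$, $i = 0, 1$, we denote by $f_i = \int_{\gamma_i} \omega$ the basis of solutions to the Picard–Fuchs equation. We can now reinterpret Kodaira’s functional invariant $\mathcal{J}$ as the composition of the projective period morphism $\tau := f_1/f_0 : C \to \mathbb{H} \subset \mathbb{P}^1$ and the morphism $J : \mathbb{H} \to \mathbb{P}^1$ extending the classical modular function, i.e., $\mathcal{J} = J \circ (f_1/f_0)$:
\[ \begin{array}{ccc} & & \mathbb{H} \\ & \nearrow_{\tau(z)} & \downarrow_{J(\tau)} \\ C & \xrightarrow{\ \mathcal{J}(z)\ } & PSL(2,\mathbb{Z})\backslash\mathbb{H}^* \cong \mathbb{P}^1_J \end{array} \]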
Recall that a regular singular point of a Fuchsian ordinary differential equation of order $k$
\[ \frac{d^k f}{ds^k} + P_1(s)\frac{d^{k-1} f}{ds^{k-1}} + \ldots + P_k(s) f = 0, \qquad P_i(s) \in \mathbb{C}(s), \tag{2} \]
is called a point of maximal unipotent monodromy if the local monodromy matrix $G$ is such that $G - I_k$ is nilpotent with exact order $k$. In a neighborhood of a point of maximal unipotent monodromy, Frobenius’ method tells us that there is a basis of solutions such that the first is holomorphic at the point, the second has logarithmic behavior, the next behaves like $\log^2$, ..., up to $\log^{k-1}$. An easy consequence of Lemma 1 is

Corollary 1. The points of maximal unipotent monodromy in the base curve $C$ of an elliptic surface $E$ are the points $s \in C$ over which there is a singular fiber of type $I_n$ or $I_n^*$, $n \ge 1$ (i.e., the support of the semistable elliptic fibers).

Moreover, the presence of a point of maximal unipotent monodromy has global effects:

Corollary 2. The Picard–Fuchs differential equation of an elliptic surface has a point of maximal unipotent monodromy if and only if the global monodromy group has infinite order if and only if the family of elliptic curves is not isotrivial.

Consider more generally a one parameter family of Calabi–Yau manifolds $\pi : X \to C$, whose Picard–Fuchs equation has a point of maximal unipotent monodromy. In a neighborhood of such a point consider the multivalued truncated period vector consisting only of the holomorphic solution and the logarithmic solution
\[ [p_{\mathrm{hol}}(s) : p_{\log}(s)] : C \dashrightarrow \mathbb{P}^1. \]
If the image lies in the upper half plane $\mathbb{H} \subset \mathbb{P}^1$, then, possibly after composition with projective linear transformations so that the singular point lies at $0 \in \mathbb{P}^1$ and maps to $i\infty \in \mathbb{H}^* \subset \mathbb{P}^1$, we can consider the q-series for the local inverse mapping
\[ z(q(\tau)) : \mathbb{H} \to C, \qquad q(\tau) = e^{2\pi i \tau}. \]
This q-series $z(q)$ is called the mirror map of the family $\pi : X \to \mathbb{P}^1$ about the point of maximal unipotent monodromy:
\[ \mathbb{P}^1 \ \underset{z(q)}{\overset{\tau(z)}{\rightleftarrows}} \ \mathbb{H} \]
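The nilpotency condition in the definition of a point of maximal unipotent monodromy can be tested mechanically for a given local monodromy matrix. The sketch below is only an illustration under the assumption that the matrix has integer (or exact rational) entries; the helper names are hypothetical. It checks that $N = G - I_k$ satisfies $N^k = 0$ and $N^{k-1} \neq 0$.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_zero(m):
    """True if every entry of the matrix vanishes."""
    return all(x == 0 for row in m for x in row)


def is_maximally_unipotent(g):
    """True if N = G - I is nilpotent of exact order k, where G is k x k."""
    k = len(g)
    n = [[g[i][j] - (1 if i == j else 0) for j in range(k)] for i in range(k)]
    power = [[1 if i == j else 0 for j in range(k)] for i in range(k)]  # N^0
    for _ in range(k - 1):
        power = mat_mul(power, n)
    # At this point power == N^(k-1); it must be nonzero while N^k vanishes.
    return (not is_zero(power)) and is_zero(mat_mul(power, n))


if __name__ == "__main__":
    print(is_maximally_unipotent([[1, 1], [0, 1]]))    # single Jordan block
    print(is_maximally_unipotent([[0, -1], [1, 0]]))   # finite order element
```

A single Jordan block with eigenvalue 1 passes the test, while the identity and finite order matrices do not.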
Example 1. Consider the family $E_J$ of elliptic curves over $\mathbb{P}^1$ defined by the equation
\[ E_J : y^2 = 4x^3 - \frac{27s}{s-1}\,x - \frac{27s}{s-1}. \]
The periods of the form $dx/y$ may be given in terms of the hypergeometric function ${}_2F_1$ (see [39, pp. 232–233] for explicit expressions). The Picard–Fuchs equation is
\[ \frac{d^2 f}{ds^2} + \frac{1}{s}\frac{df}{ds} + \frac{(31/144)s - 1/36}{s^2 (s-1)^2}\, f = 0. \]
There is a basis of solutions with local monodromies $G_0$, $G_1$, $G_\infty$ about the regular singular points $\{0, 1, \infty\}$ respectively, where
\[ G_0 = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}, \qquad G_1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad G_\infty = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \]
The unique point of maximal unipotent monodromy lies at $s = \infty \in \mathbb{P}^1$. The mirror map about this point is quite familiar. Since the maximal unipotent monodromy point is at $\infty$, we change variables first to $z = 1/s$ so the mirror map q-series will be locally holomorphic. The single-valued local inverse to the period mapping is then the reciprocal of the q-series for the elliptic modular function $J(q)$,
\[ J(q) = \frac{1}{q} + 744 + 196884\,q + 21493760\,q^2 + O(q^3), \]
\[ z(q) = \frac{1}{J(q)} = q - 744\,q^2 + 356652\,q^3 - 140361152\,q^4 + O(q^5). \]
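The coefficients of the mirror map in Example 1 can be recovered from those of $J(q)$ by term-by-term inversion of a power series. The following is a minimal sketch of that computation; the function name `reciprocal_of_j` is hypothetical, and only the three coefficients of $J(q)$ quoted in the example are used as input.

```python
def reciprocal_of_j(j_coeffs, terms):
    """Coefficients of 1/J for J(q) = 1/q + c0 + c1*q + c2*q^2 + ...

    Write J = (1/q) * A(q) with A = 1 + c0*q + c1*q^2 + ...; then
    1/J = q * B(q) with A * B = 1, solved order by order.
    Returns [b0, b1, ...] where b_i is the coefficient of q^(i+1) in 1/J.
    """
    a = [1] + list(j_coeffs)
    if terms > len(a):
        raise ValueError("not enough coefficients of J(q) for the requested order")
    b = [1]
    for n in range(1, terms):
        # Coefficient of q^n in A*B must vanish for n >= 1.
        b.append(-sum(a[k] * b[n - k] for k in range(1, n + 1)))
    return b


if __name__ == "__main__":
    print(reciprocal_of_j([744, 196884, 21493760], 4))
```

Because the leading coefficient of $A(q)$ is 1, the inversion stays within the integers, so integrality of the mirror map coefficients here follows from integrality of the coefficients of $J(q)$.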
The period mapping is defined as a map to projective space. If one is interested in the mirror map it is often preferable to consider the Picard–Fuchs differential equation only up to “projective equivalence”. The projective normal form of a Fuchsian ordinary differential equation (e.g., that in Eq. (2) above) is the unique Fuchsian ordinary differential equation without a $(k-1)$st order derivative
\[ \frac{d^k g}{ds^k} + R_2(s)\frac{d^{k-2} g}{ds^{k-2}} + \ldots + R_k(s) g = 0, \qquad R_i(s) \in \mathbb{C}(s), \]
whose fundamental solutions define the same projective period map as that of Eq. (2). It is always possible to pass to the projective normal form differential equation by rescaling each fundamental solution of the original equation by the $k$th root of the Wronskian.

Example 2. Suppose now that $k = 2$, i.e., the initial differential equation is
\[ \frac{d^2 f}{ds^2} + P_1(s)\frac{df}{ds} + P_2(s) f = 0, \]
then the projective normal form of this differential equation takes the particularly simple form
\[ \frac{d^2 g}{ds^2} + \left( P_2(s) - \frac{1}{2}P_1'(s) - \frac{1}{4}P_1(s)^2 \right) g = 0. \]
Let $\Lambda_J$ denote the projective normal form of the Picard–Fuchs equation of the family $E_J$ from Example 1,
\[ \Lambda_J : \frac{d^2}{ds^2} + \frac{36s^2 - 41s + 32}{144\, s^2 (s-1)^2}. \]
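The passage from the Picard–Fuchs equation of Example 1 to its projective normal form can be spot-checked with exact rational arithmetic. The sketch below evaluates $P_2 - \tfrac12 P_1' - \tfrac14 P_1^2$ for $P_1 = 1/s$ and $P_2 = \big((31/144)s - 1/36\big)/\big(s^2(s-1)^2\big)$ at a few rational points and compares with $(36s^2 - 41s + 32)/(144 s^2 (s-1)^2)$. The function names are hypothetical, the derivative $P_1' = -1/s^2$ is entered by hand, and agreement at sample points is a sanity check rather than a proof.

```python
from fractions import Fraction


def p1(s):
    """First order coefficient of the Picard-Fuchs equation in Example 1."""
    return 1 / s


def p1_prime(s):
    """Derivative of p1, entered by hand: d/ds (1/s) = -1/s^2."""
    return -1 / s ** 2


def p2(s):
    """Zeroth order coefficient of the Picard-Fuchs equation in Example 1."""
    return (Fraction(31, 144) * s - Fraction(1, 36)) / (s ** 2 * (s - 1) ** 2)


def normal_form_coefficient(s):
    """P2 - P1'/2 - P1^2/4, the coefficient in the k = 2 projective normal form."""
    return p2(s) - p1_prime(s) / 2 - p1(s) ** 2 / 4


def expected_coefficient(s):
    """The rational function appearing in the projective normal form of Example 1."""
    return (36 * s ** 2 - 41 * s + 32) / (144 * s ** 2 * (s - 1) ** 2)


if __name__ == "__main__":
    for s in (Fraction(2), Fraction(-3), Fraction(5, 7), Fraction(11, 2)):
        print(s, normal_form_coefficient(s) == expected_coefficient(s))
```

The sample points avoid the singular points $s = 0$ and $s = 1$, where both expressions have poles.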
As the process of taking the projective normal form does not alter the position or type of a maximal unipotent monodromy point, and as the projective solution determines the mirror map there, we see that the mirror map $z(t)$ of a family of Calabi–Yau manifolds about a point of maximal unipotent monodromy of the Picard–Fuchs equation is determined by the projective normal form of this differential equation. Since the projective normalized Picard–Fuchs equation determines the mirror map, it is natural to ask if there is a simpler expression for this differential equation. In fact, by direct computation one can check that

Proposition 1. The projective normalized Picard–Fuchs equation of a one parameter family of elliptic curves with section equals the projective normal form of the pullback $\mathcal{J}^*(\Lambda_J)$ of $\Lambda_J$ from $\mathbb{P}^1_J$ to $C$ by the functional invariant.

Thus the mirror map of a one parameter family of elliptic curves is determined by the functional invariant $\mathcal{J}$. This suggests that the answer to our modularity question should be expressed purely in terms of properties of the functional invariant itself. We now discuss three approaches to characterize modular mirror maps, each yielding the same criterion stated in terms of properties of the functional invariant. The three methods amount to the characterization of modular functions on the upper half plane $\mathbb{H}$ in terms of 1. modular relations between modular hauptmoduls and the elliptic modular function $J$, 2. uniformizing differential equations (genus $g = 0$) and holomorphic projective connections ($g \ge 1$) on modular curves, and 3. branched covers of the $J$-line elliptic modular orbifold, respectively. The first of these is the most classical, implicit in fact in the early works of Fricke and Klein [11]. They introduce the notion of a single valued local uniformizer, or hauptmodul, $H(\tau)$ for a genus zero modular curve.
They compute several classical examples of modular relations between hauptmoduls H (τ ) and the elliptic modular function J (τ ), i.e., rational functions R(z) ∈ C(z) with the property that R(H (τ )) = J (τ ). In [1] Atkin and Swinnerton-Dyer state the following characterization of modular relations: Proposition 2. A function f (τ ) is a hauptmodul for a finite index subgroup of the classical elliptic modular group PSL(2, Z) if and only if there is a rational function R(z) ∈ C(z) such that 1. R(f (τ )) = J (τ ), 2. R(z) ramifies only over {0, 1, ∞} ⊂ P1J , and 3. the orders of ramification are = 1 or 3 over 0, and = 1 or 2 over 1. They comment further that this divisibility criterion extends to automorphic functions for subgroups of PSL(2, Z) of arbitrary genus. Their proof was extended by Venkov [41] to genus zero Fuchsian groups of the first kind more general than the classical elliptic modular group. The mirror map of a one parameter family of elliptic curves is modular when the functional invariant satisfies the three conditions of the proposition. The second approach, the one used to characterize modular mirror maps for families of elliptic curves over P1 in [8], focuses on the local properties of Fuchsian second order ordinary differential equations in projective normal form which characterize uniformizing differential equations. The uniformization theory for Riemann surfaces can be reformulated after Gunning [13] in terms of holomorphic projective connections on the
Riemann surface. On a local chart, or over a genus zero Riemann surface, this projective connection takes the form of a second order Fuchsian ordinary differential equation in projective normal form, i.e.,
\[ \frac{d^2 f}{dz^2} + Q(z) f = 0. \tag{3} \]
A fundamental set of solutions $\{f_1, f_2\}$ to a uniformizing differential equation (3) has the property that $Q(z)$ is the Schwarzian derivative of the projective solution $\tau(z) = f_1(z)/f_2(z)$ with respect to $z$, i.e., $Q(z) = \{\tau(z); z\}$. The local behavior of the Schwarzian at poles then characterizes the class of $Q(z)$ corresponding to uniformizing differential equations. Our criterion for modularity of the mirror map becomes a constraint on the functional invariant $\mathcal{J}$ so that the projective normalization of the pullback of the projective normalized $\Lambda_J$ (uniformizing differential equation for the $J$-line) is again a uniformizing differential equation. The “no excess ramification” condition (i.e., no ramification except over $\{0, 1, \infty\} \subset \mathbb{P}^1_J$) means that the projective normal form of the Picard–Fuchs equation must be free of apparent singularities. For a detailed discussion see [8, §3,4].

The third method, characterizing branched covers of orbifolds, is the most easily generalized of these three, and hence is our method of choice. We sketch here the theory of branched covers of orbifolds due to Kato, following Yoshida [44, §5.1]. Let $X$ be a compact Riemann surface of genus $g$, equipped with $m \ge 1$ marked “orbifold points” $a_j \in X$ and associated “orbifold weights” $b_j \in \mathbb{Z}$ ($2 \le b_j \le \infty$). Suppose that $g = 0$ and $m \ge 3$. Fix the following data: $X_0 := X \setminus \{a_1, \ldots, a_m\}$; $\tilde{X}_0$ the universal covering of $X_0$; $H$ the fundamental group of $X_0$, which we also view as the transformation group of $\tilde{X}_0$; $\mu_j$ the element of $H$ represented by a simple loop about $a_j$; $H[\mu^b]$ the smallest normal subgroup of $H$ containing $\mu_j^{b_j}$ ($j = 1, \ldots, m$) (determined uniquely independent of choice of $\mu_j$ or basepoint for $H$). Let $K$ be an arbitrary subgroup of $H$ containing $H[\mu^b]$, $X_0'$ the covering of $X_0$ corresponding to $K$, and $X'$ the completion of $X_0'$, i.e., the space obtained by adding to $X_0'$ all points over the $a_j$ with finite $b_j$. Then we have a sort of “galois correspondence” of branched covers: The space $X'$ is a branched cover of $X$ branching at $a_j$ with a ramification index dividing $b_j$; we say that $X'$ is branched at most over the divisor $D = \sum_{j=1}^{m} b_j \cdot (a_j) \in \mathrm{Pic}(X)$. Conversely, to such a branched covering of $X$ there corresponds a subgroup $K$, $H[\mu^b] \subset K \subset H$. The covering $M$ corresponding to $K = H[\mu^b]$ is called the universal branched covering of $X$. In other words we have the following diagram of correspondences:
\[ \begin{array}{ccccccc}
 & & & & \tilde{X}_0 & \leftrightarrow & 1 \\
 & & & & \downarrow & & | \\
1 & \leftrightarrow & M & \supset & M_0 & \leftrightarrow & H[\mu^b] \\
| & & \downarrow & & \downarrow & & | \\
K/H[\mu^b] & \leftrightarrow & X' & \supset & X_0' & \leftrightarrow & K \\
| & & \downarrow & & \downarrow & & | \\
H/H[\mu^b] & \leftrightarrow & X & \supset & X_0 & \leftrightarrow & H
\end{array} \]
In this language we can most cleanly state our modularity criterion for the mirror map:

Theorem 2. The mirror map of a one parameter family of elliptic curves with section $\pi : E \to C$ is an automorphic function for a finite index subgroup of $PSL(2, \mathbb{Z})$ if and only if the functional invariant $\mathcal{J}(z)$ is branched at most over $3 \cdot (0) + 2 \cdot (1) \in \mathrm{Pic}(\mathbb{P}^1_J)$.
Proof. Apply the galois correspondence above to the $J$-line orbifold. Here the Riemann surface is $X \cong \mathbb{P}^1_J$ (genus $g = 0$), with $m = 3$, $a_1 = 0$, $a_2 = 1$, $a_3 = \infty$, $b_1 = 3$, $b_2 = 2$, $b_3 = \infty$, and $D = 3 \cdot (0) + 2 \cdot (1) \in \mathrm{Pic}(\mathbb{P}^1_J)$. There is a correspondence between Riemann surfaces uniformized by subgroups of $H/H[\mu^b] \cong PSL(2, \mathbb{Z})$ and covers of the $J$-line branched at most over $D = 3 \cdot (0) + 2 \cdot (1)$. The mirror map is an automorphic function for a subgroup of $PSL(2, \mathbb{Z})$ if and only if the base $C$ of the family is so uniformized. But the branched covering $C \to \mathbb{P}^1_J$ is given by $\mathcal{J}$. Hence the modularity criterion is just that the natural cover of the $J$-line defined by the functional invariant branch at most over $D$.

This is not the end of the story in the elliptic curve case. By Lemma 1 we know the correspondence between local ramification behavior of the functional invariant and the type of Kodaira singular fiber to appear in the elliptic surface. In particular, if the mirror map of a basic elliptic surface is modular, then there are no singular fibers of types IV or II$^*$. Moreover, if one restricts to the case of rational elliptic surfaces where all combinations of singular fiber types are known, one can list all rational elliptic modular surfaces with section. See [8, Theorem 4.11].

2.2. Multi-parameter families of elliptic curves. The definition of Weierstrass fibrations in the one parameter case extends naturally to multiparameter families of elliptic curves with section. It is natural to ask if the modularity characterization extends in any way to families $\pi : E \to S$ of elliptic curves with section where $\dim(S) \ge 2$. This is not possible, but the obstruction is of interest in itself, and suggests an important hypothesis to make in the case of multiparameter families of K3 surfaces (Sect. 3.2). To begin with, the Gauss–Manin system for a multiparameter family of elliptic curves consists of a rank two system of linear partial differential equations.
With a slight modification, we can construct a family of varieties for which the Gauss–Manin system takes a recognizable projective normal form. Replace an $n$ parameter family of elliptic curves fiberwise with their $n$th power. The resulting Gauss–Manin system (essentially the $n$th symmetric power of the original) is a rank $n+1$ system of linear partial differential equations in $n$ independent variables. A (projective) normal form exists for such differential equations [28]:
\[ \frac{\partial^2 w}{\partial z^i \partial z^j} = \sum_{k=1}^{n} P_{ij}^{k} \frac{\partial w}{\partial z^k} + P_{ij}^{0}\, w \qquad (i, j = 1, \ldots, n). \]
In the one parameter setting ($n = 1$) these equations reduce to projective normalized second order Fuchsian ordinary differential equations. Local conditions coming from the Schwarzian derivative define a natural subclass consisting of uniformizing differential equations for Riemann surfaces with respect to subgroups of $PSL(2, \mathbb{R})$ (projective connections if $g \ge 1$). In the multivariable case, the analogous subclass consists of the multiparameter holomorphic projective connections (connections modelled after projective space) much studied by Kobayashi [20] in a program established by Cartan. Holomorphic projective connections generalize the Schwarzian derivative, and uniformize quotients of the $n$-ball
\[ B_n := \{ [z_0 : \ldots : z_n] \in \mathbb{P}^n \ \big|\ |z_0|^2 - |z_1|^2 - \ldots - |z_n|^2 > 0 \} \]
by a discrete subgroup of the analytic automorphisms.
The difficulty we encounter in generalizing our modularity criterion for the mirror map to multiparameter families of elliptic curves is fundamental. The image of the projective period morphism, even considering the symmetric power family, only lies on a one dimensional submanifold of the period domain $B_n$! A necessary condition for the Picard–Fuchs equation to uniformize the base $S$ of our family $E$ is for the period mapping $S \dashrightarrow B_n$ to be surjective. In fact, as the local inverse to the period mapping, the mirror map itself cannot be defined unless the period mapping is surjective and the dimension of $S$ equals that of the period domain. This suggests two ingredients which will be needed for the multiparameter K3 surface generalization in Sect. 3.2: 1. a notion of uniformizing differential equation well adapted for Picard–Fuchs equations of K3 surface families, and 2. consideration only of families with surjective period mappings.

3. K3 Surface Families

The results of Sect. 2 are extended here to families of lattice polarized K3 surfaces with surjective period mappings, first in the one parameter case (Sect. 3.1) and then for multiparameter families (Sect. 3.2). By applying the resulting criterion for automorphic mirror maps to one parameter families of rank 19 lattice polarized K3 surfaces, we explain the Mirror-Moonshine phenomenon of Lian and Yau.
3.1. One parameter families of K3 surfaces. In their first systematic investigations of mirror symmetry for one parameter families of Calabi–Yau manifolds constructed via the “orbifold construction” [24], Lian and Yau discovered that the reciprocal of the mirror maps for the K3 surfaces they were studying agreed, up to an additive constant, with some of the McKay–Thompson normalized q-series in the lists of Conway–Norton [4]. The evidence was sufficiently strong that they formulated

Conjecture 1 (Mirror-Moonshine, [24,23]). If $z(q)$ is the mirror map for a one parameter family of algebraic K3 surfaces from an orbifold construction which has a third order Picard–Fuchs equation, then, for some $c \in \mathbb{Z}$, the q-series
\[ \frac{1}{z(q)} + c \]
is a McKay–Thompson series $T_g(q)$ for some element $g$ in the Monster.

In [25, 26], Lian and Yau compute many more toric examples (including over a dozen complete intersection examples), and note that the correspondence to monstrous groups persists. This suggested that the hypothesis regarding the “orbifold construction” should perhaps be weakened to the hypothesis “torically constructed”. As noted in the proof of Theorem 5, for a family of lattice polarized K3 surfaces the condition of having a third order Picard–Fuchs equation is equivalent to the generic member possessing a polarization by a lattice of rank 19. Furthermore, a McKay–Thompson series is in particular a hauptmodul for some “monstrous” genus zero arithmetic group $\Gamma$, and the various equivalent hauptmoduls are well-defined as generators of the function field of the rational curve $\Gamma \backslash \mathbb{H}^*$ only up to
action of . We see that in Conjecture 1 an equivalent conclusion is that the mirror map is itself a hauptmodul (unnormalized!) for some monstrous $\Gamma$. Before Conjecture 1 was even formulated, Beukers, Peters, and Stienstra had computed the Picard–Fuchs equation of a particular family of rank 19 lattice polarized K3 surfaces [33]. The mirror map was determined by Verrill and Yui [42]. Although it is a hauptmodul, this q-series does not satisfy the conclusion of the Mirror-Moonshine Conjecture. Thus it provides a counterexample to a “monstrous” generalization of the Mirror-Moonshine Conjecture for torically constructed families. This suggests that we characterize the families of rank 19 lattice polarized K3 surfaces whose mirror maps are hauptmoduls for genus zero groups, a special case of our question from the introduction.

The condition that a one parameter family of K3 surfaces have a third order Picard–Fuchs equation is actually quite natural. The periods obtained by integration of the holomorphic two form $\omega = \omega^{(2,0)}$ over algebraic two cycles all vanish. For a K3 surface $X$, the intersection form defines on $H_2(X, \mathbb{Z})$ the structure of a lattice, isomorphic to the even unimodular lattice $L = U \perp U \perp U \perp -E_8 \perp -E_8$, where $U$ is the standard hyperbolic plane. The sublattice of algebraic cycles in $H_2(X, \mathbb{Z})$ is naturally identified with the Picard group $\mathrm{Pic}(X)$ of divisor classes of $X$. Thus the rank $\rho$ of the Picard group determines the order of the Picard–Fuchs equation:
\[ \text{order of Picard–Fuchs} = 22 - \rho. \]
In particular, the families considered by Lian and Yau all have Picard rank 19. Let $M$ be a lattice. An $M$-polarized K3 surface is a pair $(X, j)$ of a K3 surface $X$ and a primitive lattice embedding $j : M \hookrightarrow \mathrm{Pic}(X)$. The examples studied by Lian and Yau relating to Mirror-Moonshine are families of rank 19 lattice polarized K3 surfaces. A moduli space for lattice polarized K3 surfaces is constructed in [6, §3].
Each isomorphism class of $(X, j)$ is represented by a point of this coarse moduli space $K_M$. Moreover the global Torelli theorem for lattice polarized K3 surfaces implies, as with the $J$-line in the case of elliptic curve moduli, that $K_M$ has the structure of an arithmetic quotient of a symmetric homogeneous space $D_M$ (a bounded symmetric domain of type IV) by an arithmetic group $\Gamma_M$. Here
\[ D_M \cong O(2, 20 - \rho)/(SO(2) \times O(20 - \rho)) \]
and
\[ \Gamma_M = \ker\left( O(N) \to \mathrm{Aut}(N^*/N) \right), \]
where $N := M^{\perp}_{L}$. In particular, if the rank of $M$ is 19 then $D_M \cong \mathbb{H}$. The generalized functional invariant $H_M : S \to K_M$ of a family $\pi : X \to S$ of $M$-polarized K3 surfaces may now be defined, by analogy with the elliptic curve case, as the composition of the multivalued period morphism $S \dashrightarrow D_M$ and the arithmetic quotient $D_M \to K_M$.

Since we are particularly interested in the case $\rho = 19$, the Picard–Fuchs equations of such one parameter families must be studied. We begin by examining some preliminary generalities on symmetric powers of second order Fuchsian ordinary differential equations. Assume that we have a second order Fuchsian ordinary differential equation $L_2 f = 0$, where
\[ L_2 = \frac{d^2}{ds^2} + P_1(s)\frac{d}{ds} + P_2(s). \]
The second order equation $L_2 f = 0$ is equivalent to the system of first order differential equations
\[ f' = g, \qquad g' = -P_2 f - P_1 g \]
with $\{f, g\}$ as fundamental solutions. Observe that $\{f^n, f^{n-1}g, \ldots, f g^{n-1}, g^n\}$ forms a set of fundamental solutions for the $n$th symmetric power $L = L_2^{\circledS n}$. The following result describes a system of first order differential equations for $L$ with these fundamental solutions.

Theorem 3 ([21], Theorem 2). If $\{f, g\}$ satisfy a first order $2 \times 2$ differential system
\[ \frac{d}{dt}\begin{pmatrix} f \\ g \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -P_2 & -P_1 \end{pmatrix} \begin{pmatrix} f \\ g \end{pmatrix}, \]
then $\{f^n, f^{n-1}g, \ldots, f g^{n-1}, g^n\}$ satisfy the $(n+1) \times (n+1)$ system
\[ \frac{d}{dt}\begin{pmatrix} f^n \\ f^{n-1}g \\ \vdots \\ f g^{n-1} \\ g^n \end{pmatrix} = A \begin{pmatrix} f^n \\ f^{n-1}g \\ \vdots \\ f g^{n-1} \\ g^n \end{pmatrix}, \]
where $A = (a_{ij})$ is an $(n+1) \times (n+1)$ matrix such that
\[ \begin{array}{rcll}
a_{k,k} &=& (1-k)P_1, & 1 \le k \le n+1, \\
a_{k,k+1} &=& n+1-k, & 1 \le k \le n, \\
a_{k+1,k} &=& -kP_2, & 1 \le k \le n, \\
a_{i,j} &=& 0, & i > j+1 \text{ or } j > i+1.
\end{array} \]

Example 3. In particular, when $n = 2$, the case for a symmetric square, one may rewrite the system in terms of a single third order operator
\[ \mathrm{Sym}^2(L_2) = \frac{d^3}{ds^3} + 3P_1\frac{d^2}{ds^2} + (2P_1^2 + 4P_2 + P_1')\frac{d}{ds} + (4P_1P_2 + 2P_2'). \]
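The tridiagonal matrix of Theorem 3 is easy to build explicitly. The following sketch constructs it for given values of $P_1$ and $P_2$ and, as a sanity check under the simplifying assumption of constant coefficients, verifies for $n = 2$ that its eigenvalues are $2r_1$, $r_1 + r_2$, $2r_2$, where $r_1, r_2$ are the exponents of the second order equation. The function names are hypothetical, and the sample values $P_1 = -3$, $P_2 = 2$ (so $r_1 = 1$, $r_2 = 2$) are chosen only for illustration.

```python
def symmetric_power_matrix(n, p1, p2):
    """Matrix A of Theorem 3 for the n-th symmetric power, at fixed values p1, p2."""
    size = n + 1
    a = [[0] * size for _ in range(size)]
    for k in range(1, size + 1):           # 1-based index k as in the theorem
        a[k - 1][k - 1] = (1 - k) * p1      # a_{k,k} = (1 - k) P1
        if k <= n:
            a[k - 1][k] = n + 1 - k         # a_{k,k+1} = n + 1 - k
            a[k][k - 1] = -k * p2           # a_{k+1,k} = -k P2
    return a


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def char_poly_at(m, lam):
    """det(M - lam * I) for a 3 x 3 matrix M."""
    shifted = [[m[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]
    return det3(shifted)


if __name__ == "__main__":
    # f'' - 3 f' + 2 f = 0 has solutions e^t and e^{2t}, i.e. r1 = 1, r2 = 2.
    a = symmetric_power_matrix(2, -3, 2)
    print(a)
    print([char_poly_at(a, lam) for lam in (2, 3, 4)])
```

For $n = 1$ the construction returns the companion matrix of the original system, which is a quick way to confirm the indexing conventions.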
Our next task is to show that the Picard–Fuchs equation of a one parameter family of rank 19 lattice polarized K3 surfaces is a symmetric square of a second order equation, and to reduce the modularity question for the mirror map to the second order setting. Theorem 4 ([38], Lemma 3.1.(b)). Let L1 (y) and L2 (y) be homogeneous linear differential polynomials with coefficients in C(t). Then there exists a homogeneous linear differential equation L3 (y) = 0 with coefficients in C(t) and solution space the C-span of {ν1 ν2 | L1 (ν1 ) = 0 and L2 (ν2 ) = 0}.
We call the operator $L_3(y)$ constructed above the symmetric product of $L_1$ and $L_2$, and denote it by $L_1 \circledS L_2$. In fact, the operation $\circledS$ is associative, and we may further define $L^{\circledS n}$ for $n \ge 1$ by $L^{\circledS 1} = L$ and $L^{\circledS n} = L^{\circledS (n-1)} \circledS L$. We call $\mathrm{Sym}^n(L) = L^{\circledS n}$ the $n$th symmetric power of $L$; conversely, $L$ is the $n$th root of $L^{\circledS n}$.
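As a worked illustration of the symmetric product (a sketch for the simplest cases only, not taken from [38]), consider first order factors and then a constant coefficient second order operator. The functions $a(t)$, $b(t)$ and the constants $p$, $q$ are placeholders introduced for this example, and the second computation assumes the roots $r_1 \neq r_2$ of $x^2 + px + q$ are distinct.

```latex
% First order factors: L_1 = d/dt - a(t), L_2 = d/dt - b(t).
% If \nu_1' = a\,\nu_1 and \nu_2' = b\,\nu_2, then
\[
  (\nu_1 \nu_2)' = \nu_1' \nu_2 + \nu_1 \nu_2' = (a + b)\,\nu_1 \nu_2 ,
  \qquad\text{so}\qquad
  L_1 \circledS L_2 = \frac{d}{dt} - (a + b).
\]
% Constant coefficients: L = d^2/dt^2 + p\, d/dt + q, with solutions e^{r_1 t}, e^{r_2 t}.
% Products of solutions span \{ e^{2 r_1 t},\ e^{(r_1 + r_2) t},\ e^{2 r_2 t} \}, hence
\[
  L \circledS L
  = \Bigl(\tfrac{d}{dt} - 2r_1\Bigr)\Bigl(\tfrac{d}{dt} - (r_1 + r_2)\Bigr)\Bigl(\tfrac{d}{dt} - 2r_2\Bigr)
  = \frac{d^3}{dt^3} + 3p\,\frac{d^2}{dt^2} + (2p^2 + 4q)\,\frac{d}{dt} + 4pq ,
\]
% using r_1 + r_2 = -p and r_1 r_2 = q.
```

The last expression agrees with the third order operator of Example 3 when $P_1 = p$ and $P_2 = q$ are constant, since the derivative terms $P_1'$ and $P_2'$ then vanish.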
Lemma 2 ([38], Lemma 4, p. 129). Let $L(y)$ be a homogeneous linear differential polynomial with coefficients in $\mathbb{C}(t)$. Then $L(y) = L_2^{\circledS n}(y)$ for some second order homogeneous linear differential polynomial $L_2(y)$ with coefficients in $\mathbb{C}(t)$ if and only if there exists a fundamental set of solutions $\{y_1, \ldots, y_{n+1}\}$ of $L(y) = 0$ such that
\[ y_i\, y_{i+2} - y_{i+1}^2 = 0, \qquad i = 1, \ldots, n-1. \]

Corollary 3. Let $L(y) = 0$ be a third order homogeneous linear equation with coefficients in $\mathbb{C}(t)$. If there exists a nondegenerate homogeneous polynomial $P$ of degree 2 with constant coefficients and a fundamental set of solutions $\{y_1, y_2, y_3\}$ of $L(y) = 0$ such that $P(y_1, y_2, y_3) = 0$, then $L(y)$ is the second symmetric power of a second order homogeneous linear differential equation with coefficients in $\mathbb{C}(t)$.

Proof. This follows easily from Lemma 2. By assumption, the fundamental set of solutions satisfies a nondegenerate quadratic relation. Since all such quadrics in $\mathbb{P}^2(\mathbb{C})$ are projectively equivalent to $y_1 y_3 - y_2^2 = 0$, the criterion of the lemma applies and $L(y)$ is a symmetric square.
In this form, using the expression for the projective normal form of a second order Fuchsian differential equation given in Example 2, it is easy to check that:

Proposition 3. Let $L_2$ be as above a second order Fuchsian ordinary differential operator, and let $L = \mathrm{Sym}^2(L_2)$ be its symmetric square. Then the projective normal form of $L$ is the symmetric square of the projective normal form of $L_2$.

In fact, it is possible to provide an explicit description of the relationship between the monodromy matrices of the second order “square root” equation and those of the third order symmetric square equation. This is provided by the faithful representation of $SL(2, \mathbb{C})$ in $SL(3, \mathbb{C})$ via the symmetric square representation [38]. Finally, we see the relevance of all of this for Picard–Fuchs equations of our rank 19 lattice polarized K3 surface families:

Theorem 5. The Picard–Fuchs equation of a family of rank 19 lattice polarized K3 surfaces is the symmetric square of a second order homogeneous linear Fuchsian ordinary differential equation.

Proof. To begin with, the order of the Picard–Fuchs equation is equal to the rank of the transcendental lattice, i.e., $22 - 19 = 3$. By Nikulin’s Torelli theorem for lattice polarized K3 surfaces the period domain lies on a nondegenerate quadric in $\mathbb{P}^2$ [6]. Thus, Corollary 3 implies that the third order Picard–Fuchs differential equation is in fact a symmetric square.

There is another approach to proving this result in the special case of K3 surfaces polarized by a lattice of the form $M_n := U \perp -E_8 \perp -E_8 \perp \langle -2n \rangle$ (of rank $2 + 8 + 8 + 1 = 19$), which takes advantage of their presentation as Shioda–Inose surfaces coming from a product of two elliptic curves linked by an $n$-isogeny. See [32] for more details. Such a simple geometric description is lacking in case of a general rank 19 lattice polarization. Nevertheless, our approach via symmetric square(root) Picard–Fuchs equations still
applies! This is what allows our transcendental methods to extend beyond the $M_n$ polarized case to general rank 19 lattice polarized K3 surface families. We have effectively reduced the question of automorphicity of the mirror map to the case of uniformization of orbifold Riemann surfaces by second order Fuchsian equations already addressed in Sect. 2.1. Our result is

Theorem 6. The mirror map of a one parameter family of rank 19 lattice polarized K3 surfaces π : X → C is an automorphic function for a finite index subgroup of M if and only if the generalized functional invariant $H_M(z)$ is branched at most over the orbifold divisor $D \in \mathrm{Pic}(K_M)$.

Proof. By Theorem 5 the Picard–Fuchs equation of such a family of K3 surfaces is a symmetric square. The mirror map of a one parameter family of rank 19 lattice polarized K3 surfaces about a point of maximal unipotent monodromy is identical to that of the projective normalized square root of its Picard–Fuchs equation about the corresponding point: If {f, g} is a fundamental set of solutions to the square root equation, say f the locally holomorphic solution, then $\{f^2, fg, g^2\}$ is a fundamental set of solutions to the symmetric square, with $f^2$ locally holomorphic. The (truncated) projective period mapping for the K3 surface family is given by $fg/f^2 = g/f$, which is exactly the projective period ratio of the square root equation. Thus the mirror map for the K3 surface family is modular if and only if the projective normalized square root of Picard–Fuchs is a uniformizing differential equation for C. We can now apply the same Galois correspondence for branched covers of orbifolds we used in Theorem 2. Now $X = K_M$, the $a_j$ and $b_j$ are determined by the positions and orders of the fixed points of the action of M on $D_M \cong \mathbb{H}$, and the total orbifold divisor of $K_M$ is again $D = \sum_{j=1}^{m} b_j \cdot (a_j) \in \mathrm{Pic}(K_M)$.

Using the theorem of Fano reproduced in Sect.
4.3, we can even characterize near modularity properties of one parameter families of rank 18 lattice polarized K3 surfaces. By the nondegenerate quadric structure of the period domain and case 3 of Theorem 9 we know that the fourth order projective normalized Picard–Fuchs equation is a tensor product of two second order Fuchsian equations in projective normal form. If the fundamental solutions, in {hol., log.} pairs, for these factor equations are {a, b} and {c, d}, then the fundamental solutions to the product equation take the form {ac, bc, ad, bd}, so the truncated projective period mapping consists of the pair {b/a, d/c}, i.e., the pair of projective solutions to the factor equations. Although it is not natural to describe the mirror map when the dimension of the family is unequal to that of the associated period domain, there is a good notion of “bimodularity”, i.e., when each factor equation is a uniformizing differential equation (necessarily distinct, else the lattice polarization rank jumps to 19 and the equation is a symmetric square).

3.2. Multi-parameter families of K3 surfaces. For the multiparameter definition of points of maximal unipotent monodromy and the mirror map we refer the reader to the unified presentation in [5] (§5.2.2 and §6.3.1 respectively). The details of the local description of the mirror map are in fact irrelevant for what follows as we address the
global question of modularity. In any case, the existence of a point of maximal unipotent monodromy is again guaranteed by the (related) hypotheses: 1. the family is not isotrivial, and 2. the period mapping is surjective.

We define the generalized functional invariant $H_M : S \to K_M$ for a family $\pi : X \to S$ of M-polarized K3 surfaces as in Sect. 3.1 as the composition of the multivalued period morphism $S \dashrightarrow D_M$ and the quotient map to the coarse moduli space $D_M \to K_M$ coming from the global Torelli theorem. Under our hypotheses, the Gauss–Manin system for an n parameter family of rank 20 − n lattice polarized K3 surfaces is a system of linear partial differential equations of rank n + 2 in n independent variables. Any such system has a (projective) normal form [37]

$$\frac{\partial^2 u}{\partial z_i \partial z_j} = g_{ij}\,\frac{\partial^2 u}{\partial z_1 \partial z_n} + \sum_{k=1}^{n} A^k_{ij}\,\frac{\partial u}{\partial z_k} + A^0_{ij}\,u \qquad (1 \le i, j \le n),$$

where $g_{ij} = g_{ji}$, $A^k_{ij} = A^k_{ji}$, $A^0_{ij} = A^0_{ji}$, $g_{1n} = 1$, $A^k_{1n} = A^0_{1n} = 0$ (for n ≥ 3), or [36]

$$\frac{\partial^2 u}{\partial x^2} = l\,\frac{\partial^2 u}{\partial x \partial y} + a\,\frac{\partial u}{\partial x} + b\,\frac{\partial u}{\partial y} + p\,u, \qquad \frac{\partial^2 u}{\partial y^2} = m\,\frac{\partial^2 u}{\partial x \partial y} + c\,\frac{\partial u}{\partial x} + d\,\frac{\partial u}{\partial y} + q\,u$$

(for n = 2). The global Torelli theorem of Nikulin again implies that the periods map to a quadric projective hypersurface. The natural subclass of uniformizing differential equations adapted to the Picard–Fuchs equations of lattice polarized K3 surfaces with surjective period mappings are the holomorphic conformal connections (connections modelled after hyperquadrics) introduced by Kobayashi [20]. Once again the question of automorphicity of the inverse to the projective period mapping reduces to the uniformizability of the base S of our family as a branched cover of the modular orbifold $K_M$. Fortunately, the Galois correspondence for branched covers of orbifold Riemann surfaces has been generalized by Namba [31] to the case of orbifold complex manifolds of higher dimension. We refer to [31, Theorem 1.2.7] for the details, but the only essential difference is that we must add a higher dimensional analogue of the topological condition excluding “g = 0, m = 1 or 2” in the Riemann surface case. This topological condition, [31, Condition 1.2.4], says: if $\mu_j^{d} \in H[\mu^{b}]$, then $b_j \mid d$ (for all j, 1 ≤ j ≤ m). By applying the Galois correspondence as before to our families we find

Theorem 7. The mirror map of an n parameter family of rank 20 − n lattice polarized K3 surfaces π : X → S is an automorphic function for a finite index subgroup of M (M := the polarizing lattice) if and only if the generalized functional invariant $H_M$ is branched at most over the orbifold divisor of $K_M$.
4. Calabi–Yau Threefold Families

We have seen that the presence of a global Torelli theorem is a great help in establishing modularity criteria for the mirror map, expressed in terms of natural algebraic invariants of our families of Calabi–Yau manifolds. It is known that in general moduli spaces of polarized Calabi–Yau threefolds lack the structure of a locally symmetric space. Nevertheless, it is possible that a differential algebraic criterion for automorphicity of the mirror map may be obtainable for Calabi–Yau threefold families by making use of the “special geometry” of Calabi–Yau threefold moduli. In fact, one can use the constraints imposed by special geometry on the Picard–Fuchs equation of a one parameter family of Calabi–Yau threefolds with $h^{2,1} = 1$ to derive an auxiliary differential equation (involving the Yukawa couplings, the coefficients of Picard–Fuchs, and the rational function defining the putative uniformizing differential equation) which holds if and only if the mirror map is an automorphic function (Theorem 8 in Sect. 4.2).

4.1. Picard–Fuchs equations of Calabi–Yau threefolds and special geometry. Special geometry arises in global N = 2 supersymmetry in four dimensions as a structure on the manifold spanned by the scalars in the vectormultiplets. The moduli space of (2,2) superconformal field theories, and thus the moduli space of Calabi–Yau threefolds, satisfies the same constraint equation for the natural Kähler metric on moduli space. In the case of one parameter families of Calabi–Yau threefolds with $h^{2,1} = 1$ much is known about the implications of special geometry. In particular, the effect of special geometry on the fourth order Picard–Fuchs ordinary differential equations is well known [19, 3]. We review these results in this section, using notation largely compatible with that in [19, 27].
We will always use primes (e.g., $f'(z)$) to denote derivatives with respect to the base parameter z, and dots (e.g., $\dot F(t)$) to denote derivatives with respect to the truncated period mapping parameter t. Suppose given the Picard–Fuchs equation for a family of $h^{2,1} = 1$ Calabi–Yau threefolds

$$Lf(z) = 0 : \quad f''''(z) + b_3(z) f'''(z) + b_2(z) f''(z) + b_1(z) f'(z) + b_0(z) f(z) = 0$$

with fundamental solutions $(\xi_0,\ \xi_1,\ \xi_0 \dot F(\xi_1/\xi_0),\ \xi_0((\xi_1/\xi_0)\dot F(\xi_1/\xi_0) - 2F(\xi_1/\xi_0)))$. Then $t(z) := \xi_1/\xi_0$ is the truncated period mapping. By rescaling the solutions $g(z) := f(z)/A(z)$, where

$$A(z) = \exp\left(-\frac{1}{4}\int b_3(z)\,dz\right),$$

we obtain the projective normalized Picard–Fuchs equation

$$Lg(z) = 0 : \quad g''''(z) + a_2(z) g''(z) + a_1(z) g'(z) + a_0(z) g(z) = 0$$

with fundamental solutions 1/A(z) times the previous ones. In fact, $a_1(z) = a_2'(z)$ (see [19]). Let $u(z) = \xi_0/A$. The quantum Yukawa coupling is related to the holomorphic solution u(z) about the point of maximal unipotent monodromy:

$$K = \dddot{F} = \mathrm{const.}\,A^2/\xi_0^2 = \mathrm{const.}/u^2.$$
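The normalizing factor A(z) can be recovered by a short computation. The following sketch is added for illustration and uses only the Leibniz rule and the definitions above; it shows why the substitution f = A g removes the third-derivative term:

```latex
\begin{aligned}
f = A g \;\Longrightarrow\;
f'''' + b_3 f''' &= A\,g'''' + \left(4A' + b_3 A\right) g''' + (\text{terms in } g'', g', g),\\
4A' + b_3 A = 0 \;&\Longleftrightarrow\; \frac{A'}{A} = -\frac{b_3}{4}
\;\Longleftrightarrow\; A(z) = \exp\!\left(-\frac14\int b_3(z)\,dz\right).
\end{aligned}
```

Dividing the transformed equation by A then gives an equation for g with leading coefficient 1 and no $g'''$ term.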
By reduction of order applied to the z ↔ t variables exchanged equation $\tilde L g = 0$, we derive a third order variant of the Picard–Fuchs equation in t, satisfied by u(t),

$$PF_t^{(3)} u(t) = 0 : \quad \dddot u(t) + \tfrac12 c_2(t)\,\dot u(t) + \tfrac14 \dot c_2(t)\,u(t) = 0, \qquad (3)$$

i.e.,

$$PF_t^{(3)} u = \tfrac14\left[\tilde L(u \cdot t) - t \cdot (\tilde L u)\right].$$

We recognize $PF_t^{(3)}$ as the symmetric square of the second order “square root” equation

$$PF_t^{(2)} v(t) = 0 : \quad \ddot v(t) + \tfrac18 c_2(t)\,v(t) = 0$$

satisfied by $v(t) = \sqrt{u(t)}$. By plugging in the two remaining fundamental solutions, one finds that the resulting system of equations reads $c_2(t) = r_2(t)$, $c_0(t) = r_0(t)$, where

$$c_2(t) = a_2(z)(\dot z)^2 - \frac{15}{2}\left(\frac{\ddot z}{\dot z}\right)^2 + 5\,\frac{\dddot z}{\dot z} = a_2(z)(\dot z)^2 + 5\{z(t), t\},$$

$$r_2(t) = 2\,\frac{\ddot K}{K} - \frac52\left(\frac{\dot K}{K}\right)^2,$$

and the lengthy expressions for $c_0(t)$ and $r_0(t)$ are found in [19], where they are used to derive nonlinear ordinary differential equations of high order for the mirror map and Yukawa coupling. The $c_0 = r_0$ equation provides no simplification of our approach to modularity in Sect. 4.2, so $c_0$ and $r_0$ may be safely ignored.

By reduction of order applied to Lg = 0, we find a third order Picard–Fuchs type equation in z for $T'(z) = t'(z)$,

$$PF_z^{(3)} T'(z) = 0 : \quad T''''(z) + 4\,\frac{u'(z)}{u(z)}\,T'''(z) + \left(6\,\frac{u''(z)}{u(z)} + a_2(z)\right) T''(z) + \left(4\,\frac{u'''(z)}{u(z)} + 2a_2(z)\,\frac{u'(z)}{u(z)} + a_2'(z)\right) T'(z) = 0.$$
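The third order equation in z can be rederived in a few lines. As a sketch, assume only that u and uT both solve the projective normalized equation $g'''' + a_2 g'' + a_2' g' + a_0 g = 0$ (using $a_1 = a_2'$ as stated earlier):

```latex
\begin{aligned}
0 &= L(uT) - T\,L(u)\\
  &= u\,T'''' + 4u'\,T''' + 6u''\,T'' + 4u'''\,T' + a_2\left(u\,T'' + 2u'\,T'\right) + a_2'\,u\,T',\\
\text{so}\quad
0 &= T'''' + 4\,\frac{u'}{u}\,T''' + \left(6\,\frac{u''}{u} + a_2\right)T''
   + \left(4\,\frac{u'''}{u} + 2a_2\,\frac{u'}{u} + a_2'\right)T'.
\end{aligned}
```

The $a_0$ term cancels and T enters only through its derivatives, so this is a third order equation for $T'$.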
It is important that the dependence of the coefficients on u is only through the ratios $u'(z)/u(z)$, $u''(z)/u(z)$, and $u'''(z)/u(z)$; this is why the constant relating K and u never enters into the equation even if we rewrite it in terms of K. With this in mind, let r := d log u (that is, $r = u'/u$), and Lu(z) = 0 becomes

$$\left(r''' + 4rr'' + 3(r')^2 + 6r^2r' + r^4\right) + a_2\left(r' + r^2\right) + a_2'\,r + a_0 = 0. \qquad (4)$$
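The combinations of r appearing in Eq. (4) are the ratios $u^{(k)}/u$ rewritten via $r = u'/u$. The sketch below is an added check, not taken from the paper: it builds these ratios from the recursion $D_1 = r$, $D_{k+1} = D_k' + r D_k$ for a polynomial r and compares the fourth one with the closed form $r''' + 4rr'' + 3(r')^2 + 6r^2r' + r^4$. Polynomials are lists of integer coefficients, lowest degree first; all helper names are hypothetical.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(*ps):
    """Sum of any number of polynomials."""
    n = max(len(p) for p in ps)
    return trim([sum(p[k] if k < len(p) else 0 for p in ps) for k in range(n)])


def mul(a, b):
    """Product of two polynomials."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return trim(out)


def scale(c, p):
    """Multiply a polynomial by the integer c."""
    return trim([c * x for x in p])


def der(p):
    """Derivative of a polynomial."""
    return trim([k * p[k] for k in range(1, len(p))] or [0])


def ratios(r, order=4):
    """[u'/u, u''/u, ..., u^(order)/u] for u with logarithmic derivative r."""
    out = [trim(r)]
    for _ in range(order - 1):
        d = out[-1]
        out.append(add(der(d), mul(r, d)))  # (D u)'/u = D' + r D
    return out


def closed_form_fourth(r):
    """r''' + 4 r r'' + 3 (r')^2 + 6 r^2 r' + r^4, the bracket in Eq. (4)."""
    r1, r2, r3 = der(r), der(der(r)), der(der(der(r)))
    rr = mul(r, r)
    return add(r3, scale(4, mul(r, r2)), scale(3, mul(r1, r1)),
               scale(6, mul(rr, r1)), mul(rr, rr))


r = [1, 2, 3]  # r(z) = 1 + 2z + 3z^2, an arbitrary test polynomial
print(ratios(r)[3] == closed_form_fourth(r))
```

For $r = z$, that is $u = e^{z^2/2}$, both sides reduce to $z^4 + 6z^2 + 3$.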
For Calabi–Yau threefold families (assuming special coordinates) Lian and Yau show that the mirror map satisfies a “quantum corrected” version of Schwarz’s equation (the $c_2 = r_2$ equation above):

$$2Q(z)(\dot z)^2 + \{z, t\} = \tfrac25\,\ddot y - \tfrac1{10}(\dot y)^2,$$

where $y(t) = \log(K(t))$, $Q(z) = a_2(z)/10$. For reference note as well that

$$c_2(t) = 2\ddot y - \tfrac12(\dot y)^2.$$
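For the reader's convenience, here is a short check (an added sketch using only the formulas quoted above) that the $c_2 = r_2$ equation is the quantum corrected Schwarz equation. With $y = \log K$:

```latex
\begin{aligned}
\dot y = \frac{\dot K}{K},\qquad \ddot y = \frac{\ddot K}{K} - \left(\frac{\dot K}{K}\right)^2
&\;\Longrightarrow\; r_2 = 2\,\frac{\ddot K}{K} - \frac52\left(\frac{\dot K}{K}\right)^2 = 2\ddot y - \frac12\,\dot y^2,\\
a_2(z)\,\dot z^2 + 5\{z,t\} = 2\ddot y - \frac12\,\dot y^2
&\;\Longrightarrow\; 2Q(z)\,\dot z^2 + \{z,t\} = \frac25\,\ddot y - \frac1{10}\,\dot y^2,
\qquad Q = \frac{a_2}{10}.
\end{aligned}
```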
4.2. Characterization of modular mirror maps. We will start with the case with no instanton corrections: The quantum corrections vanish if and only if $c_2(t) = 0$, i.e., when $\ddot x = (\dot x)^2$, where x = y/4. Letting $X = \dot x$, this becomes $\dot X = X^2$, with solutions $X(t) = -(c - t)^{-1}$ for constant c. So $K(t) = \exp(y(t)) = \exp(4x(t)) = \exp(4(\log(c - t) + d)) = \mathrm{const.}\,(c - t)^4$. Whenever K(t) does not take this particular form, we know that the mirror map z(t) is not an automorphic function for the projective monodromy group of the second order ordinary differential equation in projective normal form with coefficient Q(z). Conversely, if K(t) satisfies Eq. (4.2) and Q(z) satisfies the local conditions coming from the Schwarzian derivative for orbifold uniformization (i.e., characteristic exponent differences are proper unit fractions or zero), then the mirror map z(t) will be an automorphic function. A two parameter family of Calabi–Yau threefolds (a subfamily of the 101 parameter family of Calabi–Yau quintic hypersurfaces in $\mathbb{P}^4$) without instanton corrections is described in [3].

Assume for the remainder of Sect. 4.2 that there are instanton corrections present. Suppose that there is a rational function R(z) (necessarily unequal to Q(z)) with respect to which the mirror map z(t) satisfies the classical Schwarz equation

$$2R(z)(\dot z)^2 + \{z, t\} = 0, \qquad (5)$$

i.e., with respect to which the mirror map is an automorphic function. There is only one such candidate rational function R(z). This is the rational function which defines the uniformizing differential equation with regular singular points with compatible characteristic exponent differences exactly at those of the projective normal form Picard–Fuchs equation. The only subtlety that arises is one of computational effectivity: If there are more than three regular singular points, then the coefficients in the numerator of R(z) are difficult to determine in general from the denominator data; this is the famous “accessory parameter problem” in Riemann–Hilbert theory. By subtracting the two expressions (4.1) and (5) we have the equation

$$2(Q(z) - R(z))(\dot z)^2 = \tfrac25\,\ddot y - \tfrac1{10}(\dot y)^2. \qquad (6)$$
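The right-hand side of (6) simplifies under the substitution x = y/4, $X = \dot x$ already used in the case without instanton corrections. A sketch of the arithmetic, added here as a worked step:

```latex
\frac25\,\ddot y - \frac1{10}\,\dot y^2
= \frac85\,\ddot x - \frac{16}{10}\,\dot x^2
= \frac85\left(\dot X - X^2\right),
\qquad\text{hence}\qquad
\frac54\left(Q(z) - R(z)\right)\dot z^2 = \dot X - X^2 .
```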
Set $P(z) = 5(Q(z) - R(z))$ and $S(z) = (1/4)P(z)$. Then Eq. (6) can be rewritten as $S(z)(\dot z)^2 = \dot X - X^2$, where $X = \dot x$ and x = y/4 as before. Now apply a Riccati transformation

$$X(t) = \frac{\dot w(t)}{w(t)} = \frac{d}{dt}\log w(t)$$

yielding the linear ordinary differential equation in t

$$\ddot w(t) + S(z(t))(\dot z)^2\,w(t) = 0. \qquad (7)$$
Now change the independent variable from t to z and we get a second order linear equation in z

$$w''(z) - \frac{T''(z)}{T'(z)}\,w'(z) + S(z)\,w(z) = 0, \qquad (8)$$
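The passage from (7) to (8) is the chain rule. As a sketch, write t = T(z), so that $\dot z = 1/T'$ and $d/dt = (1/T')\,d/dz$:

```latex
\begin{aligned}
\dot w &= \frac{w'}{T'},\qquad
\ddot w = \frac{1}{T'}\left(\frac{w'}{T'}\right)' = \frac{w''}{(T')^2} - \frac{T''\,w'}{(T')^3},\\
\ddot w + S\,\dot z^2\,w &= \frac{1}{(T')^2}\left(w'' - \frac{T''}{T'}\,w' + S\,w\right).
\end{aligned}
```

Multiplying by $(T')^2$, which is nonzero wherever the mirror map is locally invertible, gives (8).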
where T(z) = t(z). This implies an equation for $T''(z)/T'(z)$ in terms of {w, S}, or {K, P}, or $\{u, a_2, P\}$, or $\{r, a_2, P\}$ (recall r = d log u). The equation in terms of $r, a_2, P$ reads

$$\frac{T''}{T'} = \frac{r'}{r} - \frac12\,r - \frac12\,\frac{P}{r}.$$

One can of course substitute $((a_2(z)/2) - 5R(z))$ for P(z) and obtain the expression in terms of $\{r, a_2, R\}$ as well.

Now apply this to reduce $PF_z^{(3)}$ to an expression of the form $T'(z)\varphi(z) = 0$. Since $T'(z)$ is not identically zero by assumption (the mirror map is locally invertible), $\varphi(z) = 0$. This is the modularity condition. If we use the expression for $d \log T'$ in terms of u, we can arrange to never have more than a $u'''$ appear (use $PF_z^{(4)} u = 0$). Similarly we can arrange to never have more than a $w'''$ or a $K'''$ or an $r''$ appear. In the r variant in Theorem 8 below we use Eq. (4). Of course this results in an additional term involving $a_0$ (there was only $a_2$ dependence in the higher order equation).

Theorem 8. Here is the equation characterizing modularity of the mirror map in terms of $r, a_0, a_2, R$:

$$\begin{aligned}
0 = {} & -a_2^3 - 16a_0r^2 - 6a_2^2r^2 - 12a_2r^4 + 8r^6 + 30a_2^2R + 80a_2r^2R + 200r^4R - 300a_2R^2 - 200r^2R^2 + 1000R^3 \\
& + 5a_2ra_2' - 6r^3a_2' - 50rRa_2' + 7a_2^2r' + 8a_2r^2r' - 64r^3r' + 116r^4r' - 140a_2Rr' - 80r^2Rr' + 700R^2r' - 12ra_2'r' \\
& - 16a_2(r')^2 - 48r^2(r')^2 + 160R(r')^2 + 16(r')^3 - 50a_2rR' + 60r^3R' + 500rRR' + 120rr'R' \\
& - 4r^2a_2'' - 16a_2rr'' + 80r^3r'' + 160rRr'' + 32rr'r'' + 40r^2R''.
\end{aligned}$$

In particular, this modularity equation is a second order nonlinear ordinary differential equation with rational function coefficients which the logarithmic derivative of the holomorphic solution to the Picard–Fuchs equation (4.1) satisfies if and only if the mirror map is an automorphic function.

Special geometry is a phenomenon present in multidimensional families of Calabi–Yau manifolds as well [40]. A multiparameter criterion for automorphicity of the mirror map would of course be desirable.
Perhaps the recent mathematical reformulation of special geometry by Freed [10] is a natural starting point.
4.3. Algebraic instanton corrections. In case of families of lattice polarized K3 surfaces, by global Torelli there is a homogeneous quadratic relation among the periods, and no instanton corrections. For Calabi–Yau threefold families we can also interpret the absence of instanton corrections as imposing a particular homogeneous algebraic relation among the periods. In Sect. 4.2 we saw that a condition for vanishing of instanton corrections was that $c_2(t) = 0$. Equivalently, as described in [3, §2.1] for example, this can be interpreted as the vanishing of the fourth W-algebra generator $w_4$ in the presence of the vanishing of the third ($w_3 = 0$ being a consequence of special geometry). This implies in particular that the projective normalized Picard–Fuchs equation has a set of fundamental solutions $\{u_1^3, u_1^2u_2, u_1u_2^2, u_2^3\}$, where $u_1$ and $u_2$ are the fundamental solutions to the cube root equation $u''(z) + Q(z)u(z) = 0$. In particular the projective periods map to a twisted cubic space curve.

What about other homogeneous algebraic relations among the periods? We call instanton corrections for which the Picard–Fuchs equation still admits homogeneous algebraic relations among the periods algebraic instanton corrections. A century ago Fano classified fourth order Fuchsian ordinary differential equations whose fundamental solutions satisfy homogeneous algebraic relations [9, pp. 496–497]. To paraphrase in more modern language

Theorem 9. The projective solution to a fourth order Fuchsian ordinary differential equation falls into one of the following classes:
1. The projective solution lies on an algebraic (twisted cubic) curve in $\mathbb{P}^3$. These equations are symmetric cubes of second order Fuchsian ordinary differential equations.
2. There is a homogeneous quartic relation among the fundamental solutions. Such equations can be transformed by a differential algebraic change of variables $f = \alpha h'' + \beta h' + \gamma h$ to a member of the previous class.
3. A quadratic relation with nonvanishing discriminant exists among the fundamental solutions. These equations are the tensor product of two distinct second order Fuchsian ordinary differential equations $L_2 \otimes L_2'$.
4. A quadratic relation with vanishing discriminant exists. These equations are formed by operator composition of a first order and a third order equation $L_1 \cdot L_3$.
5. No homogeneous algebraic relations exist among the fundamental solutions. This is the generic case.

We can of course reinterpret Fano’s result as providing a rough classification of algebraic instanton corrections. In the first and last cases at least, we know Fano’s classification parallels the classification by differential Galois group of the Picard–Fuchs equation. Since the Picard–Fuchs differential equation is a Fuchsian ordinary differential equation, the differential Galois group equals the Zariski closure of the global monodromy group. In the first case this corresponds to the symmetric cube monodromy representation of SL(2, C) in Sp(4). In the last case, the monodromy representation is irreducible and the differential Galois group is all of Sp(4). It should be possible to fill in the other three entries as well. In fact we can say more about the absence of algebraic relations among the periods in the last case. By special geometry there are no homogeneous algebraic relations among $\{u,\ u \cdot t,\ u \cdot \dot F,\ u \cdot (t\dot F - 2F)\}$
which implies there are no algebraic relations whatsoever among $\{t, \dot F, (t\dot F - 2F)\}$. Hence there are no algebraic relations among $\{t, F, \dot F\}$, and thus no algebraic relations between {t, F}. Moreover the modularity equation from Theorem 8 takes a particularly simple form in each of the nongeneric cases (e.g., it characterizes “bimodularity” in class 3. above).

References 1. Atkin, A.O.L. and Swinnerton-Dyer, H.P.F.: Modular forms on noncongruence subgroups. In: Combinatorics, Proc. Sympos. Pure Math., Vol. XIX, Univ. California, Los Angeles, Calif., 1968, Providence: Am. Math. Soc. 1971, pp. 1–25 2. Candelas, P.: A pair of Calabi–Yau manifolds as an exactly soluble superconformal theory. Reprinted in [43, pp. 31–95] 3. Ceresole, A., D’Auria, R., Ferrara, S., Lerche, W., Louis, J., Regge, T.: Picard–Fuchs equations, special geometry, and target space duality. In: [12, pp. 281–354] 4. Conway, J.H. and Norton, S.P.: Monstrous moonshine. Bull. London Math. Soc. 11, 308–339 (1979) 5. Cox, D.A. and Katz, S.: Mirror Symmetry and Algebraic Geometry. Providence: Am. Math. Soc., 1999 6. Dolgachev, I.V.: Mirror symmetry for lattice polarized K3 surfaces. Algebraic geometry, 4. J. Math. Sci. 81, 2599–2630 (1996) 7. Doran, C.F.: Picard–Fuchs Uniformization and Geometric Isomonodromic Deformations: Modularity and Variation of the Mirror Map. Ph.D. thesis, Harvard University, April 1999 8. Doran, C.F.: Picard–Fuchs uniformization: Modularity of the mirror map and Mirror-Moonshine. In B. Gordon, et al., Eds., The Arithmetic and Geometry of Algebraic Cycles: Proceedings of the CRM Summer School, June 7–19, 1998, Banff, Alberta, Canada. Centre de Recherches Mathématiques, CRM Proceedings and Lecture Notes. 24, 2000, pp. 257–281 9. Fano, G.: Über lineare homogene Differentialgleichungen mit algebraische Relationen zwischen den Fundamentallösungen. Math. Ann. 53, 493–590 (1900) 10. Freed, D.: Special Kähler manifolds. Commun. Math. Phys. 203, 31–52 (1999) 11. Fricke, R.
and Klein, F.: Vorlesungen über die Theorie der elliptischen Modulfunktionen. I, II. Leipzig: Teubner, 1890, 1892 12. Greene, B. and Yau, S.-T. (eds.): Mirror Symmetry II. In: AMS/IP Studies in Advanced Mathematics. 1. Providence and Cambridge: Amer. Math. Soc. and International Press, 1997 13. Gunning, R.C.: Lectures on Riemann surfaces. Princeton Mathematical Notes. Princeton: Princeton University Press, 1966 14. Hosono, S., Klemm, A., Theisen, S., Yau, S.-T.: Mirror symmetry, mirror map and applications to Calabi– Yau hypersurfaces. Commun. Math. Phys. 167, 301–350 (1995) 15. Hosono, S., Klemm, A., Theisen, S.,Yau, S.-T.: Mirror symmetry, mirror map and applications to complete intersection Calabi–Yau spaces. Nucl. Phys. B. 433, 501–552 (1995). Reprinted in [12, pp. 545–606] 16. Hosono, S., Lian, B.H., Yau, S.-T.: GKZ-generalized hypergeometric systems in mirror symmetry of Calabi–Yau hypersurfaces. Commun. Math. Phys. 182, 535–578 (1996) 17. Hosono, S., Lian, B.H., Yau, S.-T.: Maximal degeneracy points of GKZ systems. J. Am. Math. Soc. 10, 427–443 (1997) 18. Hosono, S., Lian, B.H., Yau, S.-T.: Calabi–Yau varieties and pencils of K3 surfaces. LANL archive preprint, alg-geom/9603020 19. Klemm, A., Lian, B.H., Roan, S.-S., Yau, S.-T.: A note on ODEs from mirror symmetry. In: Functional Analysis on the Eve of the 21st Century, Vol. II (New Brunswick, NJ, 1993). Progr. Math. 132, Boston: Birkhäuser, 1996, pp. 301–323. 20. Kobayashi, S. and Ochiai, T.: Holomorphic structures modeled after hyperquadrics. Tôhoku Math. J. 34, 587–629 (1982) 21. Lee, M.-H.: Picard–Fuchs equations for elliptic modular varieties. Appl. Math. Lett. 4 91–95 (1991) 22. Lian, B.H., Liu, K., Yau, S.-T.: The Candelas-de la Ossa-Green- Parkes formula. String Theory, Gauge Theory and Quantum Gravity (Trieste, 1997). Nuclear Phys. B Proc. Suppl. 67, 106–114 (1998) 23. Lian, B.H. and Yau, S.-T.: Mirror symmetry, rational curves on algebraic manifolds and hypergeometric series. 
In: XIth International Congress of Mathematical Physics (Paris, 1994). Cambridge: International Press, 1995, pp. 163–184 24. Lian, B.H. and Yau, S.-T.: Arithmetic properties of mirror map and quantum coupling. Commun. Math. Phys. 176, 163–191 (1996)
25. Lian, B.H. and Yau, S.-T.: Mirror maps, modular relations and hypergeometric series. I. LANL archive preprint, hep-th/9507151 26. Lian, B.H. and Yau, S.-T.: Mirror maps, modular relations and hypergeometric series. II. S-duality and Mirror Symmetry (Trieste, 1995). Nuclear Phys. B Proc. Suppl. 46, 248–262 (1996) 27. Lian, B.H. and Yau, S.-T.: A note on ODEs from mirror symetry II. In preparation. 28. Matsumoto, K., Sasaki, T., Yoshida, M.: Recent progress of Gauss- Schwarz theory and related geometric structures. Memoirs of the Faculty of Science, Kyushu University. Ser. A. 47, 283–381 (1993) 29. Miranda, R.: The moduli of Weierstrass fibrations over P1 . Math. Ann. 255, 379–394 (1981) 30. Miranda, R.: The Basic Theory of Elliptic Surfaces. Dottorato di Ricerca in Matematica. Pisa: ETS Editrice (1989) 31. Namba, M.: Branched Coverings and Algebraic Functions. Pitman Res. Notes Math. Ser. 161. Harlow: Longman Scientific & Technical, 1987 32. Peters, C.: Monodromy and Picard–Fuchs equations for families of K3-surfaces and elliptic curves. Ann. Sci. École Norm. Sup. (4) 19, 583–607 (1986) 33. Peters, C. and Stienstra, J.: A pencil of K3-surfaces related to Apéry’s recurrence for ζ (3) and fermi surfaces for potential zero. In: Arithmetic of Complex Manifolds (Erlangen, 1988), Lecture Notes in Math. 1399. Berlin: Springer, 1989, pp. 110–127 34. Phong, D.H., Vinet, L.,Yau, S.-T. (eds.): Mirror Symmetry III. AMS/IP Studies in Advanced Mathematics, 10. Providence, Cambridge, and Montreal: Am. Math. Soc., International Press, and Centre de Recherches Mathématiques, 1999 35. Sasai, T.: Monodromy representations of homology of certain elliptic surfaces. J. Math. Soc. Japan 26, 296–305 (1974) 36. Sasaki, T. and Yoshida, M.: Linear differential equations in two variables of rank four. I. Math. Ann. 282, 69–93 (1988) 37. Sasaki, T. and Yoshida, M.: Linear differential equations modeled after hyperquadrics. Tôhoku Math. J. 41, 321–348 (1989) 38. 
Singer, M.: Algebraic relations among solutions of linear differential equations: Fano’s theorem. Am. J. Math. 110, 115–143 (1988) 39. Stiller, P.: On the uniformization of certain curves. Pacific J. Math. 107, 229–244 (1983) 40. Strominger, A.: Special geometry. Commun. Math. Phys. 133, 163–180 (1990) 41. Venkov, A.B.: Examples of the effective solution of the Riemann–Hilbert problem on the reconstruction of a differential equation from a monodromy group in the framework of the theory of automorphic functions. (Russian) Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 162, Avtomorfn. Funkts. i Teor. Chisel. III, 5–42, 189 (1987) 42. Verrill, H. and Yui, N.: Thompson series, and the mirror maps of pencils of K3 surfaces. In B. Gordon, et al., Eds., The Arithmetic and Geometry of Algebraic Cycles: Proceedings of the CRM Summer School, June 7–19, 1998, Banff, Alberta, Canada. Centre de Recherches Mathématiques, CRM Proceedings and Lecture Notes. 24, 2000, pp. 399–432 43. Yau, S.-T. (ed.): Mirror Symmetry I. AMS/IP Studies in Advanced Mathematics, 9. Cambridge: International Press, 1998 44. Yoshida, M.: Fuchsian differential equations. With special emphasis on the Gauss-Schwarz theory.Aspects of Mathematics, E11. Braunschweig: Friedr. Vieweg & Sohn, 1987 Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 212, 649 – 652 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
A Geometric Interpretation of the χy Genus on Hyper-Kähler Manifolds George Thompson ICTP, P.O. Box 586, 34100 Trieste, Italy. E-mail:
[email protected] Received: 3 December 1999 / Accepted: 30 January 2000
Abstract: The group SL(2) acts on the space of cohomology groups of any hyper-Kähler manifold X. The $\chi_y$ genus of a hyper-Kähler X is shown to have a geometric interpretation as the super trace of an element of SL(2). As a by-product one learns that the generalized Casson invariant for a mapping torus is essentially the $\chi_y$ genus.

1. Introduction

The $\chi_y$ genus of Hirzebruch is a very interesting and rather powerful invariant. There are three significant values for y. At y = −1 the $\chi_y$ genus is the Euler characteristic, at y = 0 it is the Todd genus, while at y = 1 it is the signature. There seems to be, however, no geometric understanding of the genus away from these preferred values of y. In this short note, I prove that for (compact) hyper-Kähler manifolds, there is, in fact, quite a clear geometric meaning to the genus.

For hyper-Kähler manifolds there is a natural SL(2) action, associated with the holomorphic 2-form, on the cohomology groups $H^q(X, \Omega_X^p)$ which preserves q and shifts p by even integers. This means that $(-1)^{q+p}$ is preserved. One can, therefore, take the graded trace of an SL(2) element, with the grading given by $(-1)^{p+q}$. Denote the graded trace of U ∈ SL(2) by STr U. The geometric meaning of the $\chi_y$ genus for hyper-Kähler X is the content of the following

Theorem 1.1. Let X be an irreducible compact hyper-Kähler manifold of real dimension 4n. Let U ∈ SL(2) and let y be an eigenvalue of U in the two dimensional representation. Then

$$\mathrm{STr}\,U = \frac{\chi_{-y}}{y^n}. \qquad (1.1)$$
Remarks. 1) Note that, since $h^{(p,q)} = h^{(2n-p,q)}$, the right-hand side is invariant under y → 1/y, so that it does not depend on which eigenvalue one picks. 2) Once one expects that a result of this kind is true the proof turns out to be embarrassingly easy.

The motivation for this result comes from the study of 3-manifold invariants. Rozansky and Witten [RW] indicated how, given a hyper-Kähler manifold X, one could associate to the Mapping Torus $T_U$, the invariant STr U. In [T], I showed that one could perform the associated path integral. The solution found there is, in fact, the Riemann–Roch formula for the $\chi_y$ genus divided by $y^n$. This motivated the above theorem, which can be proven without recourse to physics. However, one can now read the derivation in [T] as a path integral proof of the Riemann–Roch formula for the $\chi_y$ genus.

That path integral calculation of STr U gave

$$\int_X \mathrm{Todd}(TX_{\mathbb{C}})\ \mathrm{Det}^{1/2}\left(U \otimes I - I \otimes e^{R}\right), \qquad (1.2)$$

which can be re-written as

$$\int_X \mathrm{Todd}(TX_{\mathbb{C}}) \prod_{i=1}^{n} \left(t - 2\cosh x_i\right), \qquad (1.3)$$

where t is the character of U in the 2-dimensional representation. The $\chi_y$ genus is given by Riemann–Roch as [NR]

$$\chi_{-y}(X) = \int_X \mathrm{Todd}(TX_{\mathbb{C}}) \prod_{i=1}^{2n} \left(1 - y e^{-x_i}\right), \qquad (1.4)$$

but since X is hyper-Kähler one has that $x_{i+n} = -x_i$ for i ≤ n. This means that

$$\chi_{-y}(X) = \int_X \mathrm{Todd}(TX_{\mathbb{C}}) \prod_{i=1}^{n} \left[(1 + y^2) - 2y\cosh(x_i)\right], \qquad (1.5)$$

so that this suggests (1.1) on setting $ty = 1 + y^2$. Consequently we have, in the notation of [T],

Corollary 1.2. The Rozansky–Witten invariant $Z_X^{RW}[T_U] = \chi_{-y}/y^n$, for U ∈ SL(2, Z).
Further Remarks. 1) The essential feature used here is the SL(2) action that is made available by the holomorphic 2-form. Hence this is not the same as thinking of X as a Kähler manifold and making use of the usual SL(2) action that comes from the symplectic 2-form (Lefschetz decomposition). 2) There is a rather more general formula that was suggested by the work of [RW]. If one considers a “mapping Riemann surface”, for a Riemann surface, $\Sigma$, of genus g, then the Rozansky–Witten invariant $Z_X^{RW}[\Sigma_U] = \mathrm{STr}\,U$, where U ∈ Sp(g) and this group acts on $H^q(X, \Omega_X^{*})^{\otimes g}$. In [T] a Riemann–Roch formula for this super trace was given which looks like a Riemann–Roch formula for a generalized $\chi_y$ genus. That suggests that the corresponding generalized $\chi_y$ can be rigorously shown to be the super trace. This has important implications for 3-manifold invariants. 3) Similar, though not identical, path integral formulae are available for general holomorphic symplectic manifolds. 4) Justin Sawon [S] has made use of the weight system in [RW] in an ingenious way to get constraints on the Chern numbers of X.
2. The SL(2) Action on X

The SL(2, C) action on the cohomology groups of X, that we are interested in, is perhaps best explained at the level of the Lie algebra, Lie SL(2) := sl(2). Let $L_\varepsilon : H^q(X, \Omega_X^p) \to H^q(X, \Omega_X^{p+2})$ be the map given by the cup-product with the holomorphic 2-form $\varepsilon$. Let $\imath_\varepsilon : H^q(X, \Omega_X^p) \to H^q(X, \Omega_X^{p-2})$ be contraction with respect to $\varepsilon$. To fix conventions we note that in local holomorphic coordinates if $\omega \in \Omega^{(p,q)}(X)$, then, suppressing the anti-holomorphic factors (the Einstein summation convention is in force),

$$\omega = \omega_{I_1, \dots, I_p}\, dz^{I_p} \wedge \cdots \wedge dz^{I_1}, \qquad (2.1)$$

and

$$\imath_\varepsilon\, \omega = \frac{p(p-1)}{2}\, \omega_{I_1, I_2, I_3, \dots, I_p}\, \varepsilon^{I_1 I_2}\, dz^{I_3} \wedge \cdots \wedge dz^{I_p}. \qquad (2.2)$$

The algebra satisfied by these operators is, by a straightforward computation,

$$[\imath_\varepsilon, L_\varepsilon] = (n - p) \qquad (2.3)$$

understood as a map $H^q(X, \Omega_X^p) \to H^q(X, \Omega_X^p)$. The generators of sl(2) are then realized as

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \sim L_\varepsilon, \qquad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \sim \imath_\varepsilon, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \sim (n - p). \qquad (2.4)$$

The following is taken from the survey by Huybrechts [H] (but see also the original work by Fujiki [F]). Let

$$H^q(X, \Omega_X^p)_0 := \ker L_\varepsilon^{\,n-p+1}, \qquad (2.5)$$

then the Lefschetz decomposition theorem tells us that

$$H^q(X, \Omega_X^p) = \bigoplus_{(p-l) \ge \max(p-n,\,0)} L_\varepsilon^{\,p-l}\, H^q(X, \Omega_X^{2l-p})_0. \qquad (2.6)$$

One thinks of $L_\varepsilon$ as a raising operator, and the $H^q(X, \Omega_X^p)_0$, for 0 ≤ p ≤ n, are the highest weight vectors of the n − p + 1 dimensional irreducible representations of SL(2, C). One also has, by a straightforward count, that

$$\dim_{\mathbb{R}} H^q(X, \Omega_X^p)_0 := h_0^{(p,q)} = h^{(p,q)} - h^{(p-2,q)}. \qquad (2.7)$$
3. Proof of Theorem 1.1

The proof is by direct computation. Let $t_r$ be the character of $U$ in the $r$ dimensional irreducible representation of SL(2, C) and set $t_2 = t$. Note that $t_1 = 1$, and I use the convention that $t_r = 0$ for $r \le 0$, as well as $h^{(p,q)} = 0$ if $p < 0$. Then

$$\mathrm{STr}\, U = \sum_{q=0}^{2n} \sum_{p=0}^{n} (-1)^{p+q}\, t_{n-p+1}\, h_0^{(p,q)}, \tag{3.1}$$

where $h_0^{(p,q)}$ is the dimension appearing in (2.7). One can re-write this expression as

$$\mathrm{STr}\, U = \sum_{q=0}^{2n} \sum_{p=0}^{n} (-1)^{p+q}\, h^{(p,q)} \left( t_{n-p+1} - t_{n-p-1} \right). \tag{3.2}$$

Now notice, on making use of Serre duality, which implies that $h^{(p,q)} = h^{(2n-p,q)}$, that the $\chi_y$ genus satisfies

$$\frac{\chi_{-y}}{y^n} = \sum_{q=0}^{2n} \sum_{p=0}^{n-1} (-1)^{p+q}\, h^{(p,q)} \left( y^{n-p} + y^{p-n} \right) + \sum_{q=0}^{2n} (-1)^{n+q}\, h^{(n,q)}. \tag{3.3}$$

A comparison of (3.2) and (3.3) shows us that they agree if we can set

$$t_{r+1} - t_{r-1} = y^r + y^{-r}, \qquad r > 0. \tag{3.4}$$

For $r = 1$ this reads as

$$t\,y = y^2 + 1, \tag{3.5}$$

which is simply the characteristic polynomial for the two-dimensional representation of $U$, where $y$ is an eigenvalue and $t$ is the trace. We make this identification, then (3.4) is a standard relationship between characters and eigenvalues for SL(2).

Acknowledgements. I would like to thank M. Blau, L. Göttsche and A. King for discussions. Special thanks are due to M. S. Narasimhan who made the right observations and the right remarks at the right time.
References
[F] Fujiki, A.: On the de Rham Cohomology Group of a compact Kähler Symplectic Manifold. In: Algebraic Geometry, Sendai, 1985. Advanced Studies in Pure Mathematics 10, T. Oda ed., Amsterdam: North Holland, 1987
[H] Huybrechts, D.: Compact Hyper-Kähler Manifolds: Basic Results. alg-geom/9705025
[NR] Narasimhan, M.S., Ramanan, S.: Generalized Prym Varieties as Fixed Points. J. Indian Math. Soc. 39, 1–19 (1975)
[RW] Rozansky, L., Witten, E.: Hyper-Kähler Geometry and Invariants of Three Manifolds. hep-th/9612216
[S] Sawon, J.: The Rozansky-Witten Invariants of Hyper-Kähler Manifolds. Preprint of a talk presented at the Brno conference.
[T] Thompson, G.: On the Generalized Casson Invariant. To appear in Adv. in Theor. Math. Physics 3, hep-th/9811199

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 212, 653 – 686 (2000)
Group Invariant Solutions Without Transversality
Ian M. Anderson¹, Mark E. Fels¹, Charles G. Torre²
¹ Department of Mathematics and Statistics, Utah State University, Logan, Utah 84322, USA
² Department of Physics, Utah State University, Logan, Utah 84322, USA
Received: 16 September 1999 / Accepted: 4 February 2000
Abstract: We present a generalization of Lie’s method for finding the group invariant solutions to a system of partial differential equations. Our generalization relaxes the standard transversality assumption and encompasses the common situation where the reduced differential equations for the group invariant solutions involve both fewer dependent and independent variables. The theoretical basis for our method is provided by a general existence theorem for the invariant sections, both local and global, of a bundle on which a finite dimensional Lie group acts. A simple and natural extension of our characterization of invariant sections leads to an intrinsic characterization of the reduced equations for the group invariant solutions for a system of differential equations. The characterizations of both the invariant sections and the reduced equations are summarized schematically by the kinematic and dynamic reduction diagrams and are illustrated by a number of examples from fluid mechanics, harmonic maps, and general relativity. This work also provides the theoretical foundations for a further detailed study of the reduced equations for group invariant solutions.

1. Introduction

Lie’s method of symmetry reduction for finding the group invariant solutions to partial differential equations is widely recognized as one of the most general and effective methods for obtaining exact solutions of non-linear partial differential equations. In recent years Lie’s method has been described in a number of excellent texts and survey articles (see, for example, Bluman and Kumei [10], Olver [29], Stephani [36], Vorob’ev [40], Winternitz [41]) and has been systematically applied to differential equations arising in a broad spectrum of disciplines (see, for example, Ibragimov [23] or Rogers and Shadwick [34]).
(Research supported by NSF grants DMS–9804833 and PHY–9732636.)

It came, therefore, as quite a surprise to the present authors that Lie’s method, as it is conventionally described, does not provide an appropriate theoretical framework
for the derivation of such celebrated invariant solutions as the Schwarzschild solution of the vacuum Einstein equations, the instanton and monopole solutions in Yang–Mills theory or the Veronese map for the harmonic map equations. The primary objectives of this paper are to focus attention on this deficiency in the literature on Lie’s method, to describe the elementary steps needed to correct this problem, and to give a precise formulation of the reduced differential equations for the group invariant solutions which arise from this generalization of Lie’s method. A second impetus for the present article is to provide the foundations for a systematic study of the interplay between the formal geometric properties of a system of differential equations, such as the conservation laws, symmetries, Hamiltonian structures, variational principles, local solvability, formal integrability and so on, and those same properties of the reduced equations for the group invariant solutions. Two problems merit special attention. First, one can interpret the principle of symmetric criticality [32, 33] as the problem of determining those group actions for which the reduced equations of a system of Euler-Lagrange equations are derivable from a canonically defined Lagrangian. Our previous work [2] on this problem, and the closely related problem of reduction of conservation laws, was cast entirely within the context of transverse group actions. Therefore, in order to extend our results to include the reductions that one encounters in field theory and differential geometry, one needs the more general description of Lie symmetry reduction obtained here. 
Secondly, there do not appear to be any general theorems in the literature which ensure the local existence of group invariant solutions to differential equations; however, as one step in this direction the results presented here can be used to determine when a system of differential equations of Cauchy–Kovalevskaya type remains of Cauchy–Kovalevskaya type under reduction [4].

We begin by quickly reviewing the salient steps of Lie’s method and then comparing Lie’s method with the standard derivation of the Schwarzschild solution of the vacuum Einstein equations. This will clearly demonstrate the difficulties with the classical Lie approach. In Sect. 3 we describe, in detail, a general method for characterizing the group invariant sections of a given bundle. In Sect. 4 the reduced equations for the group invariant solutions are constructed in the case where reduction in both the number of independent and dependent variables can occur. We define the residual symmetry group of the reduced equations in Sect. 5. In Sect. 6 we illustrate, at some length, these results with a variety of examples. In the appendix we briefly outline some of the technical issues underlying the general theory of Lie symmetry reduction for the group invariant solutions of differential equations.

2. Lie’s Method for Group Invariant Solutions

Consider a system of second-order partial differential equations

$$\Delta_\beta(x^i, u^\alpha, u^\alpha_i, u^\alpha_{ij}) = 0 \tag{2.1}$$
for the $m$ unknown functions $u^\alpha$, $\alpha = 1, \dots, m$, as functions of the $n$ independent variables $x^i$, $i = 1, \dots, n$. As usual, $u^\alpha_i$ and $u^\alpha_{ij}$ denote the first and second order partial derivatives of the functions $u^\alpha$. We have assumed that Eqs. (2.1) are second-order and that the number of equations coincides with the number of unknown functions strictly for the sake of simplicity.

A fundamental feature of Lie’s entire approach to symmetry reduction of differential equations, and one that contributes greatly to its broad applicability, is that the Lie algebra of infinitesimal symmetries of a system of differential equations can be systematically and readily determined. We are not so much concerned with this aspect of Lie’s work and accordingly assume that the symmetry algebra of (2.1) is given. Now let $\Gamma$ be a finite dimensional Lie subalgebra of the symmetry algebra of (2.1), generated by vector fields

$$V_a = \xi_a^i(x^j)\, \frac{\partial}{\partial x^i} + \eta_a^\alpha(x^j, u^\beta)\, \frac{\partial}{\partial u^\alpha}, \tag{2.2}$$

where $a = 1, 2, \dots, p$. A map $s : \mathbf{R}^n \to \mathbf{R}^m$ given by $u^\alpha = s^\alpha(x^i)$ is said to be invariant under the Lie algebra $\Gamma$ if the graph is invariant under the local flows of the vector fields (2.2). One finds this to be the case if and only if the functions $s^\alpha(x^i)$ satisfy the infinitesimal invariance equations

$$\xi_a^i(x^j)\, \frac{\partial s^\alpha}{\partial x^i} = \eta_a^\alpha(x^j, s^\beta(x^j)) \tag{2.3}$$
for all $a = 1, 2, \dots, p$. The method of Lie symmetry reduction consists of explicitly solving the infinitesimal invariance equations (2.3) and substituting the solutions of (2.3) into (2.1) to derive the reduced equations for the invariant solutions. In order to solve (2.3) it is customarily assumed (see, for example, Olver [29], Ovsiannikov [30], or Winternitz [41]) that the rank of the matrix $\xi_a^i(x^j)$ is constant, say $q$, and that the Lie algebra of vector fields satisfies the local transversality condition

$$\operatorname{rank}[\xi_a^i(x^j)] = \operatorname{rank}[\xi_a^i(x^j),\ \eta_a^\alpha(x^j, u^\beta)]. \tag{2.4}$$

Granted (2.4), it then follows that there exist local coordinates

$$\tilde x^r = \tilde x^r(x^j), \qquad \hat x^k = \hat x^k(x^j) \qquad \text{and} \qquad v^\alpha = v^\alpha(x^j, u^\beta), \tag{2.5}$$

on the space of independent and dependent variables, where $r = 1, \dots, n - q$, $k = 1, \dots, q$, and $\alpha = 1, \dots, m$, such that, in these new coordinates, the vector fields $V_a$ take the form

$$V_a = \sum_{l=1}^{q} \hat\xi_a^l(\tilde x^r, \hat x^k)\, \frac{\partial}{\partial \hat x^l}. \tag{2.6}$$

The coordinate functions $\tilde x^r$ and $v^\alpha$ are the infinitesimal invariants for the Lie algebra of vector fields $\Gamma$. In these coordinates the infinitesimal invariance equations (2.3) for $v^\alpha = v^\alpha(\tilde x^r, \hat x^k)$ can be explicitly integrated to give $v^\alpha = v^\alpha(\tilde x^r)$, where the $v^\alpha(\tilde x^r)$ are arbitrary smooth functions. One now inverts the relations (2.5) to find that the explicit solutions to (2.3) are given by

$$s^\alpha(\tilde x^r, \hat x^k) = u^\alpha(\tilde x^r, \hat x^k, v^\beta(\tilde x^r)). \tag{2.7}$$

Finally one substitutes (2.7) into the differential equations (2.1) to arrive at the reduced system of differential equations

$$\tilde\Delta_\beta(\tilde x^r, v^\alpha, v^\alpha_r, v^\alpha_{rs}) = 0. \tag{2.8}$$

Every solution of (2.8) therefore determines, by (2.7), a solution of (2.1) which also satisfies the invariance condition (2.3). In many applications of Lie reduction one picks the Lie algebra of vector fields (2.2) so that $q = n - 1$, in which case there is only one independent invariant $\tilde x$ on $M$ and (2.8) is a system of ordinary differential equations.
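The steps (2.2)–(2.8) can be traced in a small worked case. The following is a hedged illustration and not an example taken from the text: the choice of the Laplace equation in two variables, the single rotation generator, the names $r$, $\theta$, $v$ and the constants $a$, $b$ are all assumptions made for this sketch.

```latex
% Assumed example (not from the paper): n = 2, m = 1, \Delta = u_{xx} + u_{yy}.
\begin{align*}
V &= x\,\partial_y - y\,\partial_x
  && \text{(2.2) with } \xi = (-y,\, x),\ \eta = 0,\ p = 1,\\
\operatorname{rank}[\xi] &= \operatorname{rank}[\xi,\ \eta] = 1
  && \text{(2.4) holds on } \mathbf{R}^2 \setminus \{0\},\ q = 1,\\
\tilde x &= r = \sqrt{x^2 + y^2},\quad \hat x = \theta,\quad v = u
  && \text{(2.5), so that } V = \partial_\theta \text{ as in (2.6)},\\
u(x, y) &= v(r)
  && \text{(2.7)},\\
0 &= u_{xx} + u_{yy} = v'' + \tfrac{1}{r}\, v'
  && \text{(2.8), an ODE since } q = n - 1,\\
v(r) &= a \log r + b.
\end{align*}
```

Here transversality holds trivially because $\eta = 0$, which is why the classical prescription works without modification.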
For the vacuum Einstein equations the independent variables $x^i$, $i = 0, \dots, 3$, are the local coordinates on a 4-dimensional spacetime, the dependent variables are the 10 components $g_{ij}$ of the spacetime metric and the differential equations (2.1) are given by the vanishing of the Einstein tensor $G_{ij} = 0$. In the case of the spherically symmetric, stationary solutions to the vacuum Einstein equations the relevant infinitesimal symmetry generators on spacetime are

$$V_0 = \frac{\partial}{\partial x^0}, \quad V_1 = x^3 \frac{\partial}{\partial x^2} - x^2 \frac{\partial}{\partial x^3}, \quad V_2 = -x^3 \frac{\partial}{\partial x^1} + x^1 \frac{\partial}{\partial x^3}, \quad V_3 = x^2 \frac{\partial}{\partial x^1} - x^1 \frac{\partial}{\partial x^2},$$

and the symmetry conditions, as represented by the Killing equations $\mathcal{L}_{V_a} g_{ij} = 0$, lead to the familiar ansatz (in spherical coordinates)

$$ds^2 = A(r)\,dt^2 + B(r)\,dt\,dr + C(r)\,dr^2 + D(r)\,(d\phi^2 + \sin(\phi)^2\, d\theta^2). \tag{2.9}$$

The substitution of (2.9) into the field equations leads to a system of ODEs whose general solution leads to the Schwarzschild solution to the vacuum Einstein field equations.

What happens if we attempt to derive the Schwarzschild solution using the classical Lie ansatz (2.7)? To begin, it is necessary to lift the vector fields $V_a$ to the space of independent and dependent variables in order to account for the induced action of the infinitesimal spacetime transformations on the components of the metric. These lifted vector fields are $\hat V_0 = V_0$ and

$$\hat V_k = V_k - 2\, \frac{\partial V_k^l}{\partial x^i}\, g_{lj}\, \frac{\partial}{\partial g_{ij}}. \tag{2.10}$$

In terms of these lifted vector fields, the infinitesimal invariance equations (2.3) then coincide exactly with the Killing equations. However, (2.7) cannot possibly coincide with (2.9) since the latter contains only 4 arbitrary functions $A(r)$, $B(r)$, $C(r)$, $D(r)$ whereas (2.7) would imply that the general stationary, rotationally invariant metric depends upon 10 arbitrary functions of $r$. This discrepancy is easily accounted for: in this example

$$\operatorname{rank}[V_0, V_1, V_2, V_3] = 3 \qquad \text{while} \qquad \operatorname{rank}[\hat V_0, \hat V_1, \hat V_2, \hat V_3] = 4,$$

and hence the local transversality condition (2.4) does not hold. Indeed, whenever the local transversality condition fails, the general solution to the infinitesimal invariance equation will depend upon fewer arbitrary functions than the original number of dependent variables. The reduced differential equations will be a system of equations with both fewer independent and dependent variables.

We remark that in many of the exhaustive classifications of invariant solutions using Lie reduction either the number of independent variables is 2 and hence, typically, the number of vector fields $V_a$ is one, or there is just a single dependent variable and (2.1) is a scalar partial differential equation. In either circumstance the local transversality condition is normally satisfied and the ansatz (2.7) gives the correct solution to the infinitesimal invariance equation (2.3). However, once the numbers of independent and dependent variables exceed these minimal thresholds, as is the case in most physical field theories, the local transversality condition is likely to fail.
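The rank statement above can be checked numerically at a sample point. The sketch below is an illustration under stated assumptions, not the authors' computation: the helper names (`rank`, `linear_part`, `xi`, `eta`, `ranks`, `diag`) are hypothetical, the sample point and metric values are arbitrary choices, and the fiber components are taken in the Lie-derivative form $\eta_{ij} = -(\partial_i V^l g_{lj} + \partial_j V^l g_{li})$ for $i \le j$, which is one way of reading (2.10).

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def linear_part(k):
    """Matrix A with V_k^l = A[l][m] x^m; zero for the time translation k = 0."""
    A = [[0] * 4 for _ in range(4)]
    if k == 1:      # V_1 = x^3 d/dx^2 - x^2 d/dx^3
        A[2][3], A[3][2] = 1, -1
    elif k == 2:    # V_2 = -x^3 d/dx^1 + x^1 d/dx^3
        A[1][3], A[3][1] = -1, 1
    elif k == 3:    # V_3 = x^2 d/dx^1 - x^1 d/dx^2
        A[1][2], A[2][1] = 1, -1
    return A


def xi(k, x):
    """Base components of V_k at the point x = (x^0, x^1, x^2, x^3)."""
    if k == 0:
        return [1, 0, 0, 0]
    A = linear_part(k)
    return [sum(A[l][m] * x[m] for m in range(4)) for l in range(4)]


def eta(k, g):
    """Fiber components (i <= j) of the lifted field, assumed Lie-derivative form."""
    A = linear_part(k)
    return [-sum(A[l][i] * g[l][j] + A[l][j] * g[l][i] for l in range(4))
            for i in range(4) for j in range(i, 4)]


def ranks(x, g):
    """Return (rank of base fields, rank of lifted fields) at the point (x, g)."""
    base = [xi(k, x) for k in range(4)]
    lifted = [xi(k, x) + eta(k, g) for k in range(4)]
    return rank(base), rank(lifted)


def diag(*entries):
    return [[entries[i] if i == j else 0 for j in range(4)] for i in range(4)]


x = [0, 1, 2, 3]
print(ranks(x, diag(-1, 1, 2, 3)))   # generic metric value: transversality fails
print(ranks(x, diag(-1, 1, 1, 1)))   # rotationally invariant metric value
```

At the generic metric value the lifted rank exceeds the base rank, while at a metric value that is itself rotationally invariant the two ranks agree.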
3. An Existence Theorem for Invariant Sections

Let $M$ be an $n$-dimensional manifold and $\pi : E \to M$ a bundle over $M$. In our applications to Lie symmetry reduction the manifold $M$ serves as the space of independent variables and the bundle $E$ plays the role of the total space of independent and dependent variables. We refer to points of $M$ with local coordinates $(x^i)$ and to points of $E$ with local coordinates $(x^i, u^\alpha)$, for which the projection map $\pi$ is given by $\pi(x^i, u^\alpha) = (x^i)$. In many applications $E$ either is a trivial bundle $E = M \times N$, a vector bundle over $M$, or a fiber bundle over $M$ with finite dimensional structure group. However, for the purposes of this paper one need only suppose that $\pi$ is a smooth submersion. We let $E_x = \pi^{-1}(x)$ denote the fiber of $E$ over the point $x \in M$.

Now let $G$ be a finite dimensional Lie group which acts smoothly on $E$. We assume that $G$ acts projectably on $E$ in the sense that the action of each element of $G$ is a fiber preserving transformation on $E$: if $p$, $q$ lie in a common fiber, then so do $g \cdot p$ and $g \cdot q$. Consequently, there is a smooth induced action of $G$ on $M$. The action of $G$ on the space of sections of $E$ is then given by

$$(g \cdot s)(x) = g \cdot [s(g^{-1} \cdot x)] \tag{3.1}$$

for each smooth section $s : M \to E$. A section $s$ is invariant if $g \cdot s = s$ for all $g \in G$. More generally, we have the following definition.

Definition 3.1. Let $G$ be a smooth projectable group action on the bundle $\pi : E \to M$ and let $U \subset M$ be open. Then a smooth section $s : U \to E$ is $G$ invariant, if for all $x \in U$ and $g \in G$ such that $g \cdot x \in U$,

$$s(g \cdot x) = g \cdot s(x). \tag{3.2}$$

Let $\Gamma$ be the Lie algebra of vector fields on $E$ which are the infinitesimal generators for the action of $G$ on $E$. Since the action of $G$ is assumed projectable, any basis $V_a$, $a = 1, \dots, p$, of $\Gamma$ assumes the local coordinate form (2.2). If $g_t$ is a one-parameter subgroup of $G$ with associated infinitesimal generator $V_a$ on $E$, then by differentiating the invariance condition $s(g_t \cdot x) = g_t \cdot s(x)$ one finds that the component functions $s^\alpha(x^i)$ satisfy the infinitesimal invariance condition (2.3). If $s$ is globally defined on all of $M$ and if $G$ is connected, then the infinitesimal invariance criterion (2.3) implies (3.2). This may not be true if $G$ is not connected or if $s$ is only defined on a proper open subset of $M$.

For the purposes of finding group invariant solutions of differential equations, we shall take the group $G$ to be a symmetry group of the given system of differential equations. The task at hand is to explicitly identify the space of $G$ invariant sections of $E$ with sections of an auxiliary bundle $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \widetilde M$ and to construct the differential equations for the $G$ invariant sections as a reduced system of differential equations on the sections of $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \widetilde M$.

Our characterization of the $G$ invariant sections of $E$ is based upon the following key observation. Suppose that $p \in E$ and that there is a $G$ invariant section $s : U \to E$ with $s(x) = p$, where $x \in U$. Let $G_x = \{\, g \in G \mid g \cdot x = x \,\}$ be the isotropy subgroup of $G$ at $x$. Then, for every $g \in G_x$, we compute

$$g \cdot p = g \cdot s(x) = s(g \cdot x) = s(x) = p. \tag{3.3}$$
This equation shows that the isotropy subgroup $G_x$ constrains the admissible values that an invariant section can assume at the point $x$. Accordingly, we define the kinematic bundle $\kappa_G(E)$ for the action of $G$ on $E$ by

$$\kappa_G(E) = \bigcup_{x \in M} \kappa_{G,x}(E), \quad \text{where} \quad \kappa_{G,x}(E) = \{\, p \in E_x \mid g \cdot p = p \ \text{for all } g \in G_x \,\}. \tag{3.4}$$
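The definitions (3.1)–(3.4) can be exercised on a toy case. The sketch below assumes, purely for illustration, the two-element group $G = \mathbf{Z}_2$ acting on $E = \mathbf{R} \times \mathbf{R}$ by $(x, u) \mapsto (-x, -u)$; this action and the function names are hypothetical choices, not taken from the paper. The invariant sections are then the odd functions, and the isotropy at $x = 0$ forces the fiber value $u = 0$.

```python
import math

# Z_2 = {+1, -1}; the element -1 acts by (x, u) -> (-x, -u). Assumed toy action.
GROUP = (1, -1)


def act_base(g, x):
    """Induced action on the base M = R."""
    return g * x


def act_total(g, p):
    """Projectable action on the total space E = R x R."""
    x, u = p
    return (g * x, g * u)


def transform(g, s):
    """(g . s)(x) = g . s(g^{-1} . x), as in (3.1); in Z_2 each g is its own inverse."""
    def gs(x):
        x0 = act_base(g, x)
        _, u = act_total(g, (x0, s(x0)))
        return u
    return gs


def is_invariant(s, samples, tol=1e-12):
    """Check g . s = s on a finite set of sample points."""
    return all(abs(transform(g, s)(x) - s(x)) <= tol
               for g in GROUP for x in samples)


def kinematic_fiber_contains(x, u):
    """Membership test for the fiber of the kinematic bundle at x, as in (3.4)."""
    isotropy = [g for g in GROUP if act_base(g, x) == x]
    return all(act_total(g, (x, u)) == (x, u) for g in isotropy)


samples = [0.5, 1.0, 2.0]
print(is_invariant(math.sin, samples))      # odd function
print(is_invariant(math.cos, samples))      # even function
print(kinematic_fiber_contains(0.0, 1.0))   # value excluded by the isotropy at x = 0
```

Away from $x = 0$ the isotropy group is trivial and the kinematic fiber is the whole fiber, so the constraint is felt only at the fixed point.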
It is easy to check that $\kappa_G(E)$ is a $G$ invariant subset of $E$ and therefore the action of $G$ restricts to an action on $\kappa_G(E)$. Let $\widetilde M = M/G$ and $\tilde\kappa_G(E) = \kappa_G(E)/G$ be the quotient spaces for the actions of $G$ on $M$ and $\kappa_G(E)$. We define the kinematic reduction diagram for the action of $G$ on $E$ to be the commutative diagram

$$\begin{array}{ccccc}
\tilde\kappa_G(E) & \xleftarrow{\ q_{\kappa_G}\ } & \kappa_G(E) & \xrightarrow{\ \iota\ } & E \\
\big\downarrow{\scriptstyle \pi_{\tilde\kappa_G}} & & \big\downarrow{\scriptstyle \pi} & & \big\downarrow{\scriptstyle \pi} \\
\widetilde M & \xleftarrow{\ q_M\ } & M & \xrightarrow{\ \mathrm{id}\ } & M.
\end{array} \tag{3.5}$$

In this diagram $\iota$ is the inclusion map of the kinematic bundle $\kappa_G(E)$ into $E$, $\mathrm{id} : M \to M$ is the identity map, the maps $q_M$ and $q_{\kappa_G}$ are the projection maps to the quotient spaces and $\pi_{\tilde\kappa_G}$ is the surjective map induced by $\pi$. The next lemma summarizes two of the key properties of the kinematic reduction diagram.

Lemma 3.2. Let $G$ act projectably on $E$.
(i) Let $p \in \kappa_G(E)$ and $g \in G$. If $\pi(g \cdot p) = \pi(p)$, then $g \cdot p = p$.
(ii) If $\tilde p \in \tilde\kappa_G(E)$ and $x \in M$ satisfy $\pi_{\tilde\kappa_G}(\tilde p) = q_M(x)$, then there is a unique point $p \in \kappa_G(E)$ such that $q_{\kappa_G}(p) = \tilde p$ and $\pi(p) = x$.

Proof. (i) Let $x = \pi(p)$. If $\pi(g \cdot p) = \pi(p)$, then $g \cdot x = x$ and therefore, since $p \in \kappa_{G,x}(E)$, we conclude that $g \cdot p = p$.
(ii) Since $q_{\kappa_G} : \kappa_G(E) \to \tilde\kappa_G(E)$ is surjective, there is a point $p_0 \in \kappa_G(E)$ which projects to $\tilde p$. Let $x_0 = \pi(p_0)$. Then $q_M(x_0) = q_M(x)$ and hence, by definition of the quotient map $q_M$, there is a $g \in G$ such that $g \cdot x_0 = x$. The point $p = g \cdot p_0$ projects under $q_{\kappa_G}$ to $\tilde p$ and to $x$ under $\pi$ so that the existence of the point $p$ is established. Suppose $p_1$ and $p_2$ are two points in $\kappa_G(E)$ which project to $\tilde p$ and $x$ under $q_{\kappa_G}$ and $\pi$ respectively. Then $p_1$ and $p_2$ belong to the same fiber $\kappa_{G,x}(E)$ and are related by a group element $g \in G$, that is, $g \cdot p_1 = p_2$. Since $\pi(p_1) = \pi(p_2)$, it follows that $\pi(g \cdot p_1) = \pi(p_1)$. Since $p_1 \in \kappa_{G,x}(E)$, we infer from (i) that $g \cdot p_1 = p_1$ and therefore $p_1 = p_2$.

This simple lemma immediately implies that every local section $\tilde s : \widetilde U \to \tilde\kappa_G(E)$, where $\widetilde U$ is an open subset of $\widetilde M$, uniquely determines a $G$-invariant section $s : U \to \kappa_G(E)$, where $U = q_M^{-1}(\widetilde U)$, such that

$$q_{\kappa_G}(s(x)) = \tilde s(q_M(x)). \tag{3.6}$$

To ensure that this correspondence between the $G$ invariant sections of $E$ and the sections of $\tilde\kappa_G(E)$ extends to a correspondence between smooth sections it suffices to ensure that $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \widetilde M$ is a smooth bundle.
Theorem 3.3 (Existence Theorem for G Invariant Sections). Suppose that $E$ admits a kinematic reduction diagram (3.5) such that $\kappa_G(E)$ is an imbedded subbundle of $E$, the quotient spaces $\widetilde M$ and $\tilde\kappa_G(E)$ are smooth manifolds, and $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \widetilde M$ is a bundle. Let $\widetilde U$ be any open set in $\widetilde M$ and let $U = q_M^{-1}(\widetilde U)$. Then (3.6) defines a one-to-one correspondence between the $G$ invariant smooth sections $s : U \to E$ and the smooth sections $\tilde s : \widetilde U \to \tilde\kappa_G(E)$.

We can describe the kinematic reduction diagram in local coordinates as follows. Since $\pi_{\tilde\kappa_G} : \tilde\kappa_G(E) \to \widetilde M$ is a bundle we begin with local coordinates $\pi_{\tilde\kappa_G} : (\tilde x^r, v^a) \to (\tilde x^r)$ for $\tilde\kappa_G(E)$, where $r = 1, \dots, \dim \widetilde M$ and $a$ ranges from 1 to the fiber dimension of $\tilde\kappa_G(E)$. Since $q_M : M \to \widetilde M$ is a submersion, we can use the coordinates $\tilde x^r$ as part of a local coordinate system $(\tilde x^r, \hat x^k)$ on $M$. Here $k = 1, \dots, \dim M - \dim \widetilde M$ and, for fixed values of $\tilde x^r$, the points $(\tilde x^r, \hat x^k)$ all lie on a common $G$ orbit. As a consequence of Lemma 3.2(ii) one can prove that $q_{\kappa_G}$ restricts to a diffeomorphism between the fibers of $\kappa_G(E)$ and $\tilde\kappa_G(E)$ and hence one can use $(\tilde x^r, \hat x^k, v^a)$ as a system of local coordinates on $\kappa_G(E)$. Finally, let $(\tilde x^r, \hat x^k, u^\alpha) \to (\tilde x^r, \hat x^k)$ be a system of local coordinates on $E$. Since $\kappa_G(E)$ is an imbedded sub-bundle of $E$, the inclusion map $\iota : \kappa_G(E) \to E$ assumes the form

$$\iota(\tilde x^r, \hat x^k, v^a) = (\tilde x^r, \hat x^k, \iota^\alpha(\tilde x^r, \hat x^k, v^a)), \tag{3.7}$$

where the rank of the Jacobian matrix $\left[ \partial \iota^\alpha / \partial v^a \right]$ is maximal. In these coordinates the kinematic reduction diagram (3.5) becomes

$$\begin{array}{ccccc}
(\tilde x^r, v^a) & \xleftarrow{\ q_{\kappa_G}\ } & (\tilde x^r, \hat x^k, v^a) & \xrightarrow{\ \iota\ } & (\tilde x^r, \hat x^k, \iota^\alpha(\tilde x^r, \hat x^k, v^a)) \\
\big\downarrow{\scriptstyle \pi_{\tilde\kappa_G}} & & \big\downarrow{\scriptstyle \pi} & & \big\downarrow{\scriptstyle \pi} \\
(\tilde x^r) & \xleftarrow{\ q_M\ } & (\tilde x^r, \hat x^k) & \xrightarrow{\ \mathrm{id}\ } & (\tilde x^r, \hat x^k).
\end{array} \tag{3.8}$$

These coordinates are readily constructed in most applications. If $v^a = \tilde s^a(\tilde x^r)$ is a local section of $\tilde\kappa_G(E)$, then the corresponding $G$ invariant section of $E$ is given by

$$s^\alpha(\tilde x^r, \hat x^k) = \iota^\alpha(\tilde x^r, \hat x^k, \tilde s^a(\tilde x^r)). \tag{3.9}$$
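Formula (3.9) can be tried out on a simple case. The sketch below assumes, for illustration only, that $G = SO(2)$ rotates both the base $M = \mathbf{R}^2 \setminus \{0\}$ and the fiber $\mathbf{R}^2$ of $E = M \times \mathbf{R}^2$, with adapted coordinates $\tilde x = r$, $\hat x = \theta$ and $\iota(r, \theta, v) = R(\theta)\, v$. The example, the function names and the particular reduced section are hypothetical and not taken from the paper.

```python
import math


def rotate(angle, v):
    """Rotate a 2-vector counterclockwise by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def iota(r, theta, v):
    """Assumed inclusion map in adapted coordinates: u = R(theta) v."""
    return rotate(theta, v)


def reduced_section(r):
    """A hypothetical section v = s~(r) of the reduced bundle over r > 0."""
    return (r ** 2, 1.0 / r)


def invariant_section(x):
    """The section of E built from the reduced section via (3.9)."""
    r = math.hypot(x[0], x[1])
    theta = math.atan2(x[1], x[0])
    return iota(r, theta, reduced_section(r))


def equivariance_defect(angle, x):
    """Size of s(g . x) - g . s(x); it should vanish up to rounding, as in (3.2)."""
    lhs = invariant_section(rotate(angle, x))
    rhs = rotate(angle, invariant_section(x))
    return max(abs(a - b) for a, b in zip(lhs, rhs))


print(equivariance_defect(0.7, (1.0, 2.0)) < 1e-9)
```

In this assumed example the action on the base is free, so the kinematic bundle is all of $E$ and only base reduction takes place; the map $\iota$ is nevertheless not the identity in the Cartesian fiber coordinates.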
Notice that when $\iota$ is the identity map, (3.9) reduces to (2.7). The formula (3.9) is the full and proper generalization of the classical Lie prescription (2.7) for infinitesimally invariant sections of transverse actions. In general the fiber dimension of $\kappa_G(E)$ will be less than that of $E$, while the fiber dimension of $\tilde\kappa_G(E)$ is always the same as that of $\kappa_G(E)$. Thus, in our description of the $G$ invariant sections of $E$, fiber reduction, or reduction in the number of dependent variables, occurs in the right square of the diagram (3.5) while base reduction, or reduction in the number of independent variables, occurs in the left square of (3.5).

We now consider the case of an infinitesimal group action on $E$, defined directly by a $p$-dimensional Lie algebra $\Gamma$ of vector fields (2.2). These vector fields need not be the infinitesimal generators of a global action of a Lie group $G$ on $E$. If the rank of the coefficient matrix $[\xi_a^i(x^j)]$ is $q$, then there are locally defined functions $\phi_\epsilon^a(x^j)$, where $\epsilon = 1, \dots, p - q$, such that

$$\sum_{a=1}^{p} \phi_\epsilon^a(x^j)\, \xi_a^i(x^j) = 0.$$

Consequently, if we multiply the infinitesimal invariance equation (2.3) by the functions $\phi_\epsilon^a(x^j)$ and sum on $a = 1, \dots, p$, we find that the invariant sections $s^\alpha(x^j)$ are constrained by the algebraic equations

$$\sum_{a=1}^{p} \phi_\epsilon^a(x^j)\, \eta_a^\alpha(x^j, s^\beta(x^j)) = 0. \tag{3.10}$$

These conditions are the infinitesimal counterparts to equations (3.3) and accordingly we define the infinitesimal kinematic bundle $\kappa_\Gamma(E) = \bigcup_{x \in M} \kappa_{\Gamma,x}(E)$, where

$$\begin{aligned}
\kappa_{\Gamma,x}(E) &= \Big\{\, (x^j, u^\beta) \in E_x \ \Big|\ \sum_{a=1}^{p} \phi_\epsilon^a(x^j)\, \eta_a^\alpha(x^j, u^\beta) = 0 \,\Big\} \\
&= \{\, p \in E_x \mid Z(p) = 0 \ \text{for all } Z \in \Gamma \text{ such that } \pi_*(Z(p)) = 0 \,\}.
\end{aligned} \tag{3.11}$$
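For the spherically symmetric example of Sect. 2, the algebraic conditions in (3.11) can be solved by linear algebra. The sketch below is an illustration under stated assumptions rather than the authors' computation: it takes the combination $Z = x^1 \hat V_1 + x^2 \hat V_2 + x^3 \hat V_3$, whose base component vanishes, reads the condition $Z(p) = 0$ as $A^T g + g A = 0$ for the matrix $A$ of the corresponding rotation, and counts the symmetric solutions $g$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def isotropy_generator(x):
    """Matrix A of x^1 V_1 + x^2 V_2 + x^3 V_3 for spatial x = (x^1, x^2, x^3) != 0."""
    x1, x2, x3 = x
    A = [[0] * 4 for _ in range(4)]
    A[2][3] += x1; A[3][2] -= x1    # x^1 V_1, V_1 = x^3 d/dx^2 - x^2 d/dx^3
    A[1][3] -= x2; A[3][1] += x2    # x^2 V_2, V_2 = -x^3 d/dx^1 + x^1 d/dx^3
    A[1][2] += x3; A[2][1] -= x3    # x^3 V_3, V_3 = x^2 d/dx^1 - x^1 d/dx^2
    return A


def infinitesimal_action(A, g):
    """The symmetric matrix A^T g + g A."""
    return [[sum(A[l][i] * g[l][j] + g[i][l] * A[l][j] for l in range(4))
             for j in range(4)] for i in range(4)]


def kinematic_fiber_dimension(x):
    """Number of independent metric components g_ij allowed by the isotropy at x."""
    A = isotropy_generator(x)
    pairs = list(combinations_with_replacement(range(4), 2))   # the 10 pairs i <= j
    images = []
    for i, j in pairs:
        g = [[0] * 4 for _ in range(4)]
        g[i][j] = g[j][i] = 1
        image = infinitesimal_action(A, g)
        images.append([image[a][b] for a, b in pairs])
    return len(pairs) - rank(images)


print(kinematic_fiber_dimension((1, 2, 3)))
```

Under these assumptions the count agrees with the four arbitrary functions $A$, $B$, $C$, $D$ of the ansatz (2.9), rather than the ten that (2.7) would suggest.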
In most applications the algebraic conditions defining $\kappa_\Gamma(E)$ are easily solved. The Lie algebra of vector fields $\Gamma$ restricts to a Lie algebra of vector fields on $\kappa_\Gamma(E)$ which now satisfies the infinitesimal transversality condition (2.4). One then arrives at (3.8) as a local coordinate description of the infinitesimal kinematic diagram for $\Gamma$, where the coordinates $(\tilde x^r, v^a)$ are now the infinitesimal invariants for the action of $\Gamma$ on $\kappa_\Gamma(E)$. It is not difficult to show that $\kappa_{G,x}(E) \subset \kappa_{\Gamma,x}(E)$, with equality holding whenever the isotropy group $G_x$ is connected. In the case where $E$ is a vector bundle, the infinitesimal kinematic bundle appears in Fels and Olver [16]. For applications of the kinematic bundle to the classification of invariant tensors and spinors see [6] and [7].

4. Reduced Differential Equations for Group Invariant Solutions

Let $G$ be a Lie group acting projectably on the bundle $\pi : E \to M$ and let $\Delta = 0$ be a system of $G$ invariant differential equations for the sections of $E$. In order to describe geometrically the reduced equations $\tilde\Delta = 0$ for the $G$ invariant solutions to $\Delta = 0$ we first formalize the definition of a system of differential equations. To this end, let $\pi^k : J^k(E) \to M$ be the $k$th order jet bundle of $\pi : E \to M$. A point $\sigma = j^k(s)(x)$ in $J^k(E)$ represents the values of a local section $s$ and all its derivatives to order $k$ at the point $x \in M$. Since $G$ acts on the space of sections of $E$ by (3.1), the action of $G$ on $E$ can be lifted (or prolonged) to an action on $J^k(E)$ by setting

$$g \cdot \sigma = j^k(g \cdot s)(g \cdot x), \quad \text{where } \sigma = j^k(s)(x).$$

Now let $\pi : D \to J^k(E)$ be a vector bundle over $J^k(E)$ and suppose that the Lie group $G$ acts projectably on $D$ in a manner which covers the action of $G$ on $J^k(E)$. A differential operator is a section $\Delta : J^k(E) \to D$. The differential operator $\Delta$ is $G$ invariant if it is invariant in the sense of Definition 3.1, that is,

$$g \cdot \Delta(\sigma) = \Delta(g \cdot \sigma)$$
for all $g \in G$ and all points $\sigma \in J^k(E)$. A section $s$ of $E$ defined on an open set $U \subset M$ is a solution to the differential equations $\Delta = 0$ if $\Delta(j^k(s)(x)) = 0$ for all $x \in U$. Typically, the bundle $D \to J^k(E)$ is defined as the pullback bundle of a vector bundle $V$ (on which $G$ acts) over $E$ or $M$ by the projections $\pi^k : J^k(E) \to E$ or $\pi^k_M : J^k(E) \to M$ and the action of $G$ on $D$ is the action jointly induced from $J^k(E)$ and $V$.

Our goal now is to construct a bundle $\tilde D \to J^k(\tilde\kappa_G(E))$ and a differential operator $\tilde\Delta : J^k(\tilde\kappa_G(E)) \to \tilde D$ such that the correspondence (3.6) restricts to a 1-1 correspondence between the $G$ invariant solutions of $\Delta = 0$ and the solutions of $\tilde\Delta = 0$. One might anticipate that the required bundle $\tilde D \to J^k(\tilde\kappa_G(E))$ can be constructed by a direct application of kinematic reduction to $D \to J^k(E)$. However, one can readily check that the quotient space of $J^k(E)$ by the prolonged action of $G$ does not in general coincide with the jet space $J^k(\tilde\kappa_G(E))$ so that the kinematic reduction diagram for the action of $G$ on $D$ will not lead to a bundle over $J^k(\tilde\kappa_G(E))$. For example, if $G$ is the group acting on $E = M \times \mathbf{R} \to M$ by rotations in the base $M = \mathbf{R}^2 - \{(0, 0)\}$, then $J^2(E)/G$ is a 7-dimensional manifold whereas $J^2(\tilde\kappa_G(E))$ is 4-dimensional.

This difficulty is easily circumvented by introducing the bundle of invariant $k$-jets

$$\mathrm{Inv}^k(E) = \{\, \sigma \in J^k(E) \mid \sigma = j^k(s)(x_0), \text{ where } s \text{ is a } G \text{ invariant section defined in a neighborhood of } x_0 \,\}. \tag{4.1}$$

This bundle is studied in Olver [29] although the importance of these invariant jet spaces to the general theory of symmetry reduction of differential equations is not as widely acknowledged in the literature as it should be. The quotient space $\mathrm{Inv}^k(E)/G$ coincides with the jet space $J^k(\tilde\kappa_G(E))$. We let $D_{\mathrm{Inv}} \to \mathrm{Inv}^k(E)$ be the restriction of $D$ to the bundle of invariant $k$-jets and to this we now apply our reduction procedure to arrive at the dynamic reduction diagram

$$\begin{array}{ccccccc}
\tilde\kappa_G(D_{\mathrm{Inv}}) & \xleftarrow{\ q\ } & \kappa_G(D_{\mathrm{Inv}}) & \xrightarrow{\ \iota\ } & D_{\mathrm{Inv}} & \xrightarrow{\ \iota_{\mathrm{Inv}}\ } & D \\
\big\downarrow{\scriptstyle \tilde\pi} & & \big\downarrow{\scriptstyle \pi} & & \big\downarrow{\scriptstyle \pi} & & \big\downarrow{\scriptstyle \pi} \\
J^k(\tilde\kappa_G(E)) & \xleftarrow{\ q_{\mathrm{Inv}}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \mathrm{id}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \iota^k\ } & J^k(E).
\end{array} \tag{4.2}$$

Theorem 3.3 ensures that there is a one-to-one correspondence between the $G$ invariant sections of $D_{\mathrm{Inv}} \to \mathrm{Inv}^k(E)$ and the sections of $\tilde\kappa_G(D_{\mathrm{Inv}}) \to J^k(\tilde\kappa_G(E))$. Any $G$ invariant differential operator $\Delta : J^k(E) \to D$ restricts to a $G$ invariant differential operator $\Delta_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to D_{\mathrm{Inv}}$ and thus determines a differential operator $\tilde\Delta : J^k(\tilde\kappa_G(E)) \to \tilde\kappa_G(D_{\mathrm{Inv}})$. This is the reduced differential operator; the solutions to $\tilde\Delta = 0$ describe the $G$ invariant solutions to $\Delta = 0$.

To describe diagram (4.2) in local coordinates, we begin with the coordinate description (3.8) of the kinematic reduction diagram and we let

$$(\tilde x^r, \hat x^k, u^\alpha, u^\alpha_r, u^\alpha_k, u^\alpha_{rs}, u^\alpha_{rk}, u^\alpha_{kl}, \dots)$$

denote the standard jet coordinates on $J^k(E)$. Since the invariant sections are parameterized by functions $v^a = v^a(\tilde x^r)$, coordinates for $\mathrm{Inv}^k(E)$ are

$$(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots).$$
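The dimension count quoted above for rotations in the base of $M \times \mathbf{R}$ can be spelled out as follows. This is a sketch under the assumption that $G = SO(2)$ acts trivially on the fiber $\mathbf{R}$, with polar coordinates $(r, \theta)$ on $M$ and $v = u$ as the fiber coordinate; these coordinate names are choices made here, not taken from the text.

```latex
% Assumed setting: E = M \times \mathbf{R}, M = \mathbf{R}^2 \setminus \{(0,0)\}, G = SO(2).
\begin{align*}
J^2(E) &: (x, y, u, u_x, u_y, u_{xx}, u_{xy}, u_{yy}), & \dim J^2(E) &= 8,\\
J^2(E)/G &: \text{free one-dimensional action}, & \dim J^2(E)/G &= 8 - 1 = 7,\\
\mathrm{Inv}^2(E) &: (r, \theta, v, v_r, v_{rr}), & \dim \mathrm{Inv}^2(E) &= 5,\\
J^2(\tilde\kappa_G(E)) &: (r, v, v_r, v_{rr}), & \dim J^2(\tilde\kappa_G(E)) &= 5 - 1 = 4.
\end{align*}
```

The last two lines illustrate why the quotient is taken of $\mathrm{Inv}^2(E)$ rather than of $J^2(E)$.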
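In accordance with (3.9), the inclusion map $\iota^k : \mathrm{Inv}^k(E) \to J^k(E)$ is given by

$$\iota^k(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots) = (\tilde x^r, \hat x^k, u^\alpha, u^\alpha_r, u^\alpha_k, u^\alpha_{rs}, u^\alpha_{rk}, u^\alpha_{kl}, \dots), \tag{4.3}$$

where, by a formal application of the chain rule,

$$u^\alpha = \iota^\alpha(\tilde x^r, \hat x^k, v^a), \qquad u^\alpha_r = \frac{\partial \iota^\alpha}{\partial \tilde x^r} + \frac{\partial \iota^\alpha}{\partial v^a}\, v^a_r, \qquad u^\alpha_k = \frac{\partial \iota^\alpha}{\partial \hat x^k},$$

$$u^\alpha_{rs} = \frac{\partial^2 \iota^\alpha}{\partial \tilde x^r \partial \tilde x^s} + \frac{\partial^2 \iota^\alpha}{\partial v^a \partial \tilde x^s}\, v^a_r + \frac{\partial^2 \iota^\alpha}{\partial v^a \partial \tilde x^r}\, v^a_s + \frac{\partial^2 \iota^\alpha}{\partial v^a \partial v^b}\, v^a_r v^b_s + \frac{\partial \iota^\alpha}{\partial v^a}\, v^a_{rs},$$

and so on. The quotient map $q_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to J^k(\tilde\kappa_G(E))$ is given simply by

$$q_{\mathrm{Inv}}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots) = (\tilde x^r, v^a, v^a_r, v^a_{rs}, \dots).$$

Next let $f^A$ be a local frame field for the vector bundle $D$. The differential operator $\Delta : J^k(E) \to D$ can be written in terms of the standard coordinates on $J^k(E)$ and in this local frame as

$$\Delta = \Delta_A(\tilde x^r, \hat x^k, u^\alpha, u^\alpha_r, u^\alpha_k, u^\alpha_{rs}, u^\alpha_{rk}, u^\alpha_{kl}, \dots)\, f^A. \tag{4.4}$$

The restriction of $\Delta$ to $\mathrm{Inv}^k(E)$ defines the section $\Delta_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to D_{\mathrm{Inv}}$ by

$$\Delta_{\mathrm{Inv}} = \Delta_{\mathrm{Inv},A}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots)\, f^A, \tag{4.5}$$

where the component functions $\Delta_{\mathrm{Inv},A}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots)$ are defined as the composition of the maps (4.3) and the component maps $\Delta_A$. Since $\Delta$ is a $G$ invariant differential operator, $\Delta_{\mathrm{Inv}}$ is a $G$ invariant differential operator and hence $\Delta_{\mathrm{Inv}}$ necessarily factors through the kinematic bundle $\kappa_G(D_{\mathrm{Inv}})$,

$$\Delta_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to \kappa_G(D_{\mathrm{Inv}}).$$

Our general existence theory for invariant sections implies that we can also find a locally defined, $G$ invariant frame $f^Q_{\mathrm{Inv}}$ for $\kappa_G(D_{\mathrm{Inv}})$. The inclusion map $\kappa_G(D_{\mathrm{Inv}}) \to D_{\mathrm{Inv}}$ is expressed by writing each vector $f^Q_{\mathrm{Inv}}$ as a linear combination of the vectors $f^A$,

$$f^Q_{\mathrm{Inv}} = M^Q_A\, f^A,$$

where the coefficients $M^Q_A$ are functions on $\mathrm{Inv}^k(E)$. The invariant operator $\Delta_{\mathrm{Inv}}$ can be expressed as

$$\Delta_{\mathrm{Inv}} = \Delta_{\mathrm{Inv},Q}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots)\, f^Q_{\mathrm{Inv}}.$$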
Finally, the $G$ invariant frame $f^Q_{\mathrm{Inv}}$ determines a frame $\tilde f^Q$ on $\tilde\kappa_G(D_{\mathrm{Inv}})$, and the invariance of $\Delta$ implies that the component functions $\Delta_{\mathrm{Inv},Q}$ are necessarily independent of the parametric variables $\hat x^k$, that is,

$$\tilde\Delta_Q(\tilde x^r, v^a, v^a_r, v^a_{rs}, \dots) = \Delta_{\mathrm{Inv},Q}(\tilde x^r, \hat x^k, v^a, v^a_r, v^a_{rs}, \dots)$$

and the reduced differential operator is

$$\tilde\Delta = \tilde\Delta_Q(\tilde x^r, v^a, v^a_r, v^a_{rs}, \dots)\, \tilde f^Q.$$

At first sight, this general framework may appear to be rather cumbersome and overly complicated. However, as we shall see in examples, every square in the dynamic reduction diagram (4.2) actually corresponds to the individual steps that one performs in practice.

5. The Automorphism Group of the Kinematic Bundle

Let $\mathcal{G}$ be the full group of projectable symmetries on $E$ for a given system of differential equations on $J^k(E)$ and let $G \subset \mathcal{G}$ be a fixed subgroup for which the group invariant solutions are sought. It is commonly noted (again, within the context of reduction with transversality) that $\mathrm{Nor}(G, \mathcal{G})$, the normalizer of $G$ in $\mathcal{G}$, preserves the space of invariant sections and that $\mathrm{Nor}(G, \mathcal{G})/G$ is a symmetry group of the reduced equations. However, because this is a purely algebraic construction which does not take into account the action of $\mathcal{G}$ on $E$, this construction may not yield the largest possible residual symmetry group or may result in a residual group which does not act effectively on $\tilde\kappa_G(E)$. These difficulties are easily resolved. We let $O_p(G)$ denote the orbit of $G$ through a point $p \in E$.

Definition 5.1. Let $\mathcal{G}$ be a group of fiber-preserving transformations acting on $\pi : E \to M$ and let $G$ be a subgroup of $\mathcal{G}$. Assume that $E$ admits a kinematic reduction diagram (3.5) for the action of $G$ on $E$.
(i) The automorphism group $\tilde G$ for the kinematic bundle $\pi : \kappa_G(E) \to M$ is the subgroup of $\mathcal{G}$ which stabilizes the set of all the $G$ orbits in $\kappa_G(E)$, that is,

$$\tilde G = \{\, a \in \mathcal{G} \mid a \cdot O_p(G) = O_{a \cdot p}(G) \ \text{and} \ a^{-1} \cdot O_p(G) = O_{a^{-1} \cdot p}(G) \ \text{for all } p \in \kappa_G(E) \,\}. \tag{5.1}$$

(ii) The global isotropy subgroup of $\mathcal{G}$, as it acts on the space of $G$ orbits of $\kappa_G(E)$, is

$$\tilde G^{*} = \{\, a \in \mathcal{G} \mid a \cdot O_p(G) = O_p(G) \ \text{for all } p \in \kappa_G(E) \,\}. \tag{5.2}$$

(iii) The residual symmetry group is $\tilde G_{\mathrm{eff}} = \tilde G / \tilde G^{*}$.

The key property of $\tilde G^{*}$ is that it is the largest subgroup of $\mathcal{G}$ with exactly the same reduction diagram and invariant sections as $G$. This is an important interpretation of the group $\tilde G^{*}$: from the viewpoint of kinematic reduction, one should generally replace the group $G$ by the group $\tilde G^{*}$. For computational purposes, it is often advantageous to use the fact that $\tilde G^{*}$ fixes every $G$ invariant section of $E$. It is not difficult to check that
I. M. Anderson, M. E. Fels, C. G. Torre
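$\mathrm{Nor}(\tilde G^*, \mathcal G) = \tilde G$, that the quotient group $\tilde G_{\mathrm{eff}} = \tilde G/\tilde G^*$ acts effectively and projectably on the reduced bundle $\tilde\kappa_G(E) \to \tilde M$ and that, if $\mathcal G$ is a symmetry group of a differential operator $\Delta$, then $\tilde G_{\mathrm{eff}}$ is always a symmetry group of the reduced differential operator $\tilde\Delta$.

Similarly, if $\mathfrak G$ is a Lie algebra of projectable vector fields on $E$ and $\Gamma \subset \mathfrak G$, we define the infinitesimal automorphism algebra of $\kappa_\Gamma(E)$ as the Lie subalgebra of vector fields given by
\[ \tilde\Gamma = \{\, Y \in \mathfrak G \mid [Z, Y]_p \in \operatorname{span}(\Gamma)(p) \ \text{for all } p \in \kappa_\Gamma(E) \text{ and all } Z \in \Gamma \,\}, \tag{5.3} \]
and the associated isotropy subalgebra for $\kappa_\Gamma(E)$ as
\[ \tilde\Gamma^* = \{\, Y \in \mathfrak G \mid Y_p \in \operatorname{span}(\Gamma)(p) \ \text{for all } p \in \kappa_\Gamma(E) \,\}. \tag{5.4} \]
When $\mathcal G$ is a finite dimensional Lie group and $\mathfrak G = \Gamma(\mathcal G)$, then it is readily checked that $\tilde\Gamma = \Gamma(\tilde G)$ and $\tilde\Gamma^* = \Gamma(\tilde G^*)$.

Since the automorphism group $\tilde G$ acts on the $k$-jets of invariant sections $\mathrm{Inv}^k(E)$, this group also plays an important role in dynamic reduction. Specifically, let us suppose that $\mathcal G$ acts on the vector bundle $D \to J^k(E)$ and that $\Delta : J^k(E) \to D$ is a $\mathcal G$ invariant section. Then $\Delta_{\mathrm{Inv}} : \mathrm{Inv}^k(E) \to D_{\mathrm{Inv}}$ is always invariant under the action of $\tilde G$ and accordingly the operator $\Delta_{\mathrm{Inv}}$ always factors through the kinematic bundle for the action of $\tilde G$ on $D_{\mathrm{Inv}}$, where for $\sigma \in \mathrm{Inv}^k(E)$,
\[ \kappa_{\tilde G,\sigma}(D_{\mathrm{Inv}}) = \{\, \Delta \in D_{\mathrm{Inv},\sigma} \mid g \cdot \Delta = \Delta \ \text{for all } g \in \tilde G_\sigma \,\}. \]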
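We note that $\kappa_{\tilde G}(D_{\mathrm{Inv}}) \subset \kappa_G(D_{\mathrm{Inv}})$ and consequently one can refine the dynamic reduction diagram from (4.2) to
\[
\begin{array}{ccccccc}
\tilde\kappa_{\tilde G}(D_{\mathrm{Inv}}) & \xleftarrow{\ q\ } & \kappa_{\tilde G}(D_{\mathrm{Inv}}) & \xrightarrow{\ \iota\ } & D_{\mathrm{Inv}} & \xrightarrow{\ \iota_{\mathrm{Inv}}\ } & D \\
\downarrow \pi & & \downarrow \pi & & \downarrow \pi & & \downarrow \pi \\
J^k(\tilde\kappa_G(E)) & \xleftarrow{\ q_{\mathrm{Inv}}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \mathrm{id}\ } & \mathrm{Inv}^k(E) & \xrightarrow{\ \iota^k\ } & J^k(E),
\end{array}
\]
where the quotient maps to the left are still by the action of $G$. Given the actions of $\mathcal G$ on $\pi : E \to M$ and also of $\mathcal G$ on $D \to J^k(E)$, it sometimes happens that
\[ \kappa_{\tilde G,\sigma}(D_{\mathrm{Inv}}) = 0. \tag{5.5} \]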
In this case every $G$ invariant section of $E$ is automatically a solution to $\Delta = 0$ for every $\mathcal G$ invariant operator $\Delta : J^k(E) \to D$; such sections are called universal solutions. Previous work on this subject (see Bleecker [8], [9], Gaeta and Morando [19]) has emphasized a variational approach which, from the viewpoint of the dynamic reduction diagram and the automorphism group of the kinematic reduction diagram, may not always be necessary.
6. Examples

In this section we find the kinematic and dynamic reduction diagrams for the group invariant solutions of some well-known systems of differential equations in applied mathematics, differential geometry, and mathematical physics.

We begin by deriving the rotationally invariant solutions of the Euler equations for incompressible fluid flow. As noted by Olver [29] (p. 199), these solutions cannot be obtained by the classical Lie ansatz. The general theory of symmetry reduction without transversality leads to some interesting new classification problems for group invariant solutions, which we briefly illustrate by presenting a new reduction of the Euler equations.

In our second set of examples we consider reductions of the harmonic map equations. We show that the classic Veronese map $S^2 \to S^4$ is an example of a universal solution. In Example 6.4 we consider another symmetry reduction of the harmonic map equation which nicely illustrates the construction of the reduced kinematic space for quotient manifolds $\tilde M$ with boundary.

In our third set of examples, the Schwarzschild and plane wave solutions of the vacuum Einstein equations are re-examined in the context of symmetry reduction without transversality. We demonstrate the importance of the automorphism group in understanding the geometric properties of the kinematic bundle and, as well, qualitative features of the reduced equations.

Finally, some elementary examples from mechanics are used to demonstrate the basic differences between symmetry reduction for group invariant solutions and symplectic reduction of Hamiltonian systems.

Although space does not permit us to do so, the kinematic and dynamic reduction diagrams are also nicely illustrated by symmetry reduction of the Yang–Mills equations as found, for example, in [22, 25, 27].
In particular, it is interesting to note that the invariance properties of the classical instanton solution to the Yang–Mills equations (Jackiw and Rebbi [24]) imply that it is a universal solution in the sense of Eq. (5.5).

Euler Equations for Incompressible Fluid Flow. The Euler equations are a system of 4 first order equations in 4 independent and 4 dependent variables. The underlying bundle $E$ for these equations is the trivial bundle $\mathbf R^4 \times \mathbf R^4 \to \mathbf R^4$ with coordinates $(t, \mathbf x, \mathbf u, p) \to (t, \mathbf x)$, where $\mathbf x = (x^1, x^2, x^3)$ and $\mathbf u = (u^1, u^2, u^3)$, and the equations are
\[ \mathbf u_t + \mathbf u \cdot \nabla \mathbf u = -\nabla p \quad\text{and}\quad \nabla \cdot \mathbf u = 0. \tag{6.1} \]
The full symmetry group $\mathcal G$ of the Euler equations is well known (see, for example, [23, 29, 34]).

Example 6.1 (Rotationally Invariant Solutions of the Euler Equations). The symmetry group of the Euler equations contains the group $G = SO(3)$ acting on $E$ by
\[ R \cdot (t, \mathbf x, \mathbf u, p) = (t, R \cdot \mathbf x, R \cdot \mathbf u, p) = (t, R^i_j x^j, R^i_j u^j, p), \tag{6.2} \]
for $R = (R^i_j) \in SO(3)$. To ensure that the action of $G$ on the base $\mathbf R^4$ is regular we restrict to the open set $M \subset \mathbf R^4$ where $\|\mathbf x\| \neq 0$. The infinitesimal generators for this action are
\[ V_k = \varepsilon_{kij}\, x^i \frac{\partial}{\partial x^j} + \varepsilon_{kij}\, u^i \frac{\partial}{\partial u^j}. \tag{6.3} \]
We first construct the kinematic reduction diagram for this action. For a given point $x_0 = (t_0, \mathbf x_0) \in M$, the isotropy subgroup $G_{x_0}$ for the action of $G$ on $M$ is the subgroup $SO(2)_{\mathbf x_0} \subset SO(3)$ which fixes the vector $\mathbf x_0$ in $\mathbf R^3$. Since the only vectors invariant under all rotations about a given axis of rotation are vectors along the axis of rotation, we deduce that for $x_0 \in M$,
\[ \kappa_{G,x_0}(E) = \{\, (t_0, \mathbf x_0, \mathbf u, p) \mid R \cdot \mathbf u = \mathbf u \ \text{for all } R \in SO(2)_{\mathbf x_0} \,\} = \{\, (t_0, \mathbf x_0, \mathbf u, p) \mid \mathbf u = A\,\mathbf x_0 \ \text{for some } A \in \mathbf R \,\}. \]
The same conclusion can be obtained by infinitesimal considerations. Indeed, the infinitesimal isotropy vector field at $x_0$ for the action on $M$ is
\[ Z = x_0^k\, \varepsilon_{kij}\, x^i \frac{\partial}{\partial x^j} \]
and therefore, if $(t, \mathbf x, \mathbf u, p) \in \kappa_{\Gamma,x}(E)$, we must have by (3.11)
\[ x^k\, \varepsilon_{kij}\, u^i \frac{\partial}{\partial u^j} = 0. \]
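This implies that $\mathbf x \times \mathbf u = 0$ and so $\mathbf u$ is parallel to $\mathbf x$. Either way, we conclude that $\kappa_G(E)$ is a two dimensional trivial bundle $(t, \mathbf x, A, B) \to (t, \mathbf x)$, where the inclusion map $\iota : \kappa_G(E) \to E$ is
\[ \iota(t, \mathbf x, A, B) = (t, \mathbf x, \mathbf u, p), \quad\text{where}\quad \mathbf u = A\,\mathbf x \quad\text{and}\quad p = B. \]
The invariants for the action of $G$ on $M$ are $t$ and $r = \sqrt{x^2 + y^2 + z^2}$, so that the kinematic reduction diagram for the action of $SO(3)$ on $E$ is
\[
\begin{array}{ccccc}
(t, r, A, B) & \xleftarrow{\ q_{\kappa_G}\ } & (t, \mathbf x, A, B) & \xrightarrow{\ \iota\ } & (t, \mathbf x, \mathbf u, p) \\
\downarrow \tilde\pi & & \downarrow \pi & & \downarrow \pi \\
(t, r) & \xleftarrow{\ q_M\ } & (t, \mathbf x) & \xrightarrow{\ \mathrm{id}\ } & (t, \mathbf x).
\end{array}
\tag{6.4}
\]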
In accordance with Eq. (3.9), each section $A = A(t, r)$ and $B = B(t, r)$ of $\tilde\kappa_G(E)$ determines the rotationally invariant section
\[ \mathbf u = A(t, r)\,\mathbf x \quad\text{and}\quad p = B(t, r) \tag{6.5} \]
of $E$. The computation of the reduced equations for the rotationally invariant solutions to the Euler equations now proceeds as follows. From (6.5) we compute
\[ u^i_t = A_t\, x^i, \qquad u^i_j = A\,\delta^i_j + A_r \frac{x^i x_j}{r} \qquad\text{and}\qquad p_i = B_r \frac{x_i}{r}, \tag{6.6} \]
so that the Euler equations (6.1) become
\[ A_t\, x^i + A\, x^j \Big( A\,\delta^i_j + A_r \frac{x^i x_j}{r} \Big) = -B_r \frac{x^i}{r} \qquad\text{and}\qquad 3A + rA_r = 0, \tag{6.7} \]
which simplify to the differential equations
\[ A_t + A(A + rA_r) = -\frac{B_r}{r} \qquad\text{and}\qquad 3A + rA_r = 0 \tag{6.8} \]
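on $J^1(\tilde\kappa_G(E))$. These equations are readily integrated to give
\[ A = \frac{a}{r^3} \qquad\text{and}\qquad B = \frac{\dot a}{r} - \frac{a^2}{2r^4} + b \]
for arbitrary functions $a(t)$ and $b(t)$, and the rotationally invariant solutions to the Euler equations are
\[ \mathbf u = \frac{a}{r^3}\,\mathbf x \qquad\text{and}\qquad p = \frac{\dot a}{r} - \frac{a^2}{2r^4} + b. \tag{6.9} \]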
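As a sanity check on (6.9), the following Python sketch evaluates the residuals of the Euler equations (6.1) by central finite differences at a single sample point. The choices $a(t) = \sin t$ and $b(t) = 0.3\,t$, the sample point, the step size `h` and all helper names are illustrative assumptions and are not taken from the paper.

```python
import math

# Hypothetical choices of the arbitrary functions a(t), b(t) in (6.9).
def a(t):
    return math.sin(t)

def a_dot(t):
    return math.cos(t)

def b(t):
    return 0.3 * t

def fields(t, x, y, z):
    """Return (u1, u2, u3, p) for the rotationally invariant section (6.9)."""
    r = math.sqrt(x * x + y * y + z * z)
    A = a(t) / r**3
    p = a_dot(t) / r - a(t) ** 2 / (2.0 * r**4) + b(t)
    return (A * x, A * y, A * z, p)

def partial(f, point, k, comp, h=1e-5):
    """Central difference of component `comp` of f along coordinate k of (t, x, y, z)."""
    lo, hi = list(point), list(point)
    lo[k] -= h
    hi[k] += h
    return (f(*hi)[comp] - f(*lo)[comp]) / (2.0 * h)

def euler_residuals(f, point):
    """Residuals of u_t + u.grad(u) + grad(p) = 0 and div(u) = 0 at `point`."""
    u = f(*point)[:3]
    res = []
    for i in range(3):
        convective = sum(u[j] * partial(f, point, j + 1, i) for j in range(3))
        res.append(partial(f, point, 0, i) + convective + partial(f, point, i + 1, 3))
    res.append(sum(partial(f, point, i + 1, i) for i in range(3)))
    return res

sample = (0.7, 0.8, -0.5, 1.1)  # (t, x, y, z), chosen away from r = 0
residuals = euler_residuals(fields, sample)
print(max(abs(v) for v in residuals))  # small, limited only by finite-difference error
```

The residuals vanish only up to discretization and rounding error, so the check compares against a tolerance rather than zero.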
We note that for the Lie algebra of vector fields (6.3), the matrix on the right side of (2.4), namely
\[
\begin{pmatrix}
0 & -x^3 & x^2 & 0 & -u^3 & u^2 \\
x^3 & 0 & -x^1 & u^3 & 0 & -u^1 \\
-x^2 & x^1 & 0 & -u^2 & u^1 & 0
\end{pmatrix},
\]
has full rank 3, whereas the matrix on the left side of (2.4), consisting of the first three columns of the above matrix, has rank 2. The local transversality condition (2.4) fails and the solution (6.9) to the Euler equations cannot be obtained using the classical Lie prescription.

To describe the derivation of the reduced equations in the context of invariant differential operators and the dynamic reduction diagram we introduce the bundle $D = J^1(E) \times \mathbf R^3 \times \mathbf R$ with sections $\frac{\partial}{\partial u^i} \otimes dt$ and $dt$, and define the differential operator $\Delta$ on $D$ by
\[ \Delta = \big[ u^i_t + u^k u^i_k + \delta^{ij} p_j \big] \frac{\partial}{\partial u^i} \otimes dt + \big[ u^i_i \big]\, dt. \tag{6.10} \]
This operator is invariant under the full symmetry group of the Euler equations. The induced action of $G = SO(3)$ on $J^1(E)$ is given by
\[ R \cdot (t, x^i, u^i, p, u^i_j, p_j) = (t, R^i_r x^r, R^i_r u^r, p, R^i_r R^s_j u^r_s, R^r_s p_r), \quad\text{where } R \in SO(3). \]
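The rank statement made above for the generators (6.3) can be checked with exact arithmetic. The sketch below builds the $3 \times 6$ coefficient matrix at a sample point and computes ranks by Gaussian elimination over the rationals. The sample points and the helper names `rank` and `generator_matrix` are illustrative assumptions. The third rank shows that at a point with $\mathbf u$ parallel to $\mathbf x$, that is, on the kinematic bundle, the rank of the full matrix drops to that of the base block.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [p - factor * q for p, q in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

def generator_matrix(x, u):
    """Coefficients of V_1, V_2, V_3 from (6.3) in the basis (d/dx^j, d/du^j)."""
    x1, x2, x3 = x
    u1, u2, u3 = u
    return [
        [0, -x3, x2, 0, -u3, u2],
        [x3, 0, -x1, u3, 0, -u1],
        [-x2, x1, 0, -u2, u1, 0],
    ]

generic = generator_matrix((1, 2, 3), (4, 5, 7))    # u not parallel to x
base_rank = rank([row[:3] for row in generic])      # horizontal components only
full_rank = rank(generic)
on_kappa = generator_matrix((1, 2, 3), (2, 4, 6))   # u = 2x, a point of kappa_G(E)
kappa_rank = rank(on_kappa)
print(base_rank, full_rank, kappa_rank)  # prints: 2 3 2
```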
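Coordinates for the bundle of invariant jets $\mathrm{Inv}^1(E)$ are $(t, x^i, A, A_t, A_r, B, B_t, B_r)$ and (6.6) defines the inclusion map $\iota^1 : \mathrm{Inv}^1(E) \to J^1(E)$. A basis for the $G$ invariant sections of $D_{\mathrm{Inv}} \to \mathrm{Inv}^1(E)$ is given by
\[ f^1 = x^i \frac{\partial}{\partial u^i} \otimes dt \qquad\text{and}\qquad f^2 = dt. \]
Let $\tilde f^1$ and $\tilde f^2$ be the corresponding sections of $\tilde\kappa_G(D_{\mathrm{Inv}})$. We are now ready to work through the dynamic reduction diagram (4.2), starting with the Euler operator as a section $\Delta : J^1(E) \to D$. Restricted to the invariant jet bundle $\mathrm{Inv}^1(E)$, $\Delta$ becomes
\[ \Delta_{\mathrm{Inv}} = \Big[ A_t\, x^i + A\, x^j \Big( A\,\delta^i_j + A_r \frac{x^i x_j}{r} \Big) + B_r \frac{x^i}{r} \Big] \frac{\partial}{\partial u^i} \otimes dt + \Big[ \delta^j_i \Big( A\,\delta^i_j + A_r \frac{x^i x_j}{r} \Big) \Big]\, dt. \]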
Restricting to $\mathrm{Inv}^1(E)$ is precisely the first step one takes in practice in computing the reduced equations and corresponds to the rightmost square in the dynamic reduction diagram. Next, because $\Delta_{\mathrm{Inv}}$ is $G$ invariant it is necessarily a linear combination of the two invariant sections $f^1$ and $f^2$ and therefore factors through the kinematic bundle $\kappa_G(D_{\mathrm{Inv}})$. This means we can write $\Delta_{\mathrm{Inv}}$ as a section of $\kappa_G(D_{\mathrm{Inv}})$, namely,
\[ \Delta_{\mathrm{Inv}} = \Big[ A_t + A \Big( A + A_r \frac{x^j x_j}{r} \Big) + \frac{1}{r} B_r \Big] f^1 + \Big[ 3A + A_r \Big( \delta^j_i \frac{x^i x_j}{r} \Big) \Big] f^2. \]
This corresponds to the center commutative square in the dynamic reduction diagram (4.2) and coincides with the fact that Eq. (6.7) contained a common factor $x^i$, a common factor which ensures that the time evolution equation for $\mathbf u$ reduces to a single time evolution equation for $A$.

Finally, as a $G$ invariant section of $\kappa_G(D_{\mathrm{Inv}})$, a bundle on which $G$ always acts transversally, we are assured that $\Delta_{\mathrm{Inv}}$ descends to a differential operator $\tilde\Delta$ on the bundle $J^1(\tilde\kappa_G(E))$. In this example this implies that the independent variables $(t, x^i)$ appear only through the invariants for the action of $G$ on $M$, in this case $t$ and $r$, and so
\[ \tilde\Delta = \Big[ A_t + A(A + rA_r) + \frac{B_r}{r} \Big] \tilde f^1 + \big[ 3A + rA_r \big] \tilde f^2. \]
Example 6.2 (A New Reduction of the Euler Equations). It is possible to give a complete classification of all possible symmetry reductions of the Euler equations (6.1) to a system of ordinary differential equations in three or fewer dependent variables [17]. A number of authors have obtained complete lists of reductions of various differential equations (see, for example, [14, 18, 21, 42]) but this particular classification of reductions of the Euler equations may be the first such classification of group invariant solutions which explicitly requires non-trivial isotropy in the group action on the space of independent variables. There are too many cases to list the results of this classification here, but we do present one more reduction of the Euler equations, one which does not seem to appear elsewhere in the literature.

For this example it will be convenient to write $\mathbf x = (x, y, z)$ and $\mathbf u = (u, v, w)$. The infinitesimal generators for the group action are
\[ \Gamma = \{\, V_0,\ V_1,\ V_2 = V_{x,\alpha} + V_{y,\beta},\ V_3 = V_{y,\alpha} - V_{x,\beta} \,\}, \]
where
\[ V_0 = x\partial_x + y\partial_y + z\partial_z + u\partial_u + v\partial_v + w\partial_w + 2p\,\partial_p, \qquad V_1 = y\partial_x - x\partial_y + v\partial_u - u\partial_v, \]
\[ V_{x,\alpha} = \alpha\,\partial_x + \dot\alpha\,\partial_u - x\ddot\alpha\,\partial_p, \qquad\text{and}\qquad V_{y,\beta} = \beta\,\partial_y + \dot\beta\,\partial_v - y\ddot\beta\,\partial_p. \]
Here $\alpha = \alpha(t)$ and $\beta = \beta(t)$ are such that
\[ \alpha\ddot\beta - \ddot\alpha\beta = 0, \quad\text{or equivalently,}\quad \alpha\dot\beta - \beta\dot\alpha = c = \text{constant}. \tag{6.11} \]
This condition ensures that $[V_2, V_3] = 0$ so that $\Gamma$ is indeed a finite dimensional Lie algebra of vector fields. In order that $\Gamma$ have constant rank on the base space, we assume that $xy\alpha \neq 0$ or $yz\beta \neq 0$. The horizontal components of $V_2$ and $V_3$ are given by
\[ \begin{pmatrix} V_2^M \\ V_3^M \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix} \begin{pmatrix} \partial_x \\ \partial_y \end{pmatrix}, \quad\text{so that}\quad \begin{pmatrix} \partial_x \\ \partial_y \end{pmatrix} = \frac{1}{\delta} \begin{pmatrix} \alpha & -\beta \\ \beta & \alpha \end{pmatrix} \begin{pmatrix} V_2^M \\ V_3^M \end{pmatrix}, \]
where $\delta = \alpha^2 + \beta^2$, and therefore at the point $(t_0, \mathbf x_0)$, the horizontal components of the vector field
\[ Z = V_1 - y_0\, \frac{\alpha(t_0) V_2 - \beta(t_0) V_3}{\delta(t_0)} + x_0\, \frac{\beta(t_0) V_2 + \alpha(t_0) V_3}{\delta(t_0)} \tag{6.12} \]
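vanish. The isotropy condition (3.11) defining the fiber of the kinematic bundle $\kappa_{\Gamma,x}(E)$ leads, from the coefficients of $\partial_u$, $\partial_v$ and $\partial_p$, to the relations
\[ v = \frac{y\alpha - x\beta}{\delta}\,\dot\alpha + \frac{x\alpha + y\beta}{\delta}\,\dot\beta \qquad\text{and}\qquad u = \frac{x\alpha + y\beta}{\delta}\,\dot\alpha + \frac{x\beta - y\alpha}{\delta}\,\dot\beta. \tag{6.13} \]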
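We therefore conclude that the kinematic bundle has fiber dimension 2 with fiber coordinates $w$ and $p$. However, these coordinates are not invariant under the action of $\Gamma$ on $\kappa_\Gamma(E)$ and cannot be used in the local coordinate description (3.8) of the kinematic reduction diagram. Restricted to $\kappa_\Gamma(E)$, the vector fields $V_i$ become
\[ V_0 = x\partial_x + y\partial_y + z\partial_z + w\partial_w + 2p\,\partial_p, \qquad V_1 = y\partial_x - x\partial_y, \]
\[ V_2 = \alpha\,\partial_x + \beta\,\partial_y - (\ddot\alpha x + \ddot\beta y)\,\partial_p, \qquad\text{and}\qquad V_3 = -\beta\,\partial_x + \alpha\,\partial_y - (-\ddot\beta x + \ddot\alpha y)\,\partial_p. \]
Note that these restricted vector fields now satisfy the infinitesimal transversality condition (2.4). Invariants for this action are $t$,
\[ A = \frac{w}{z} \qquad\text{and}\qquad B = \Big[ 2p + \frac{\alpha\ddot\alpha + \beta\ddot\beta}{\delta}\,(x^2 + y^2) \Big] \Big/ z^2. \tag{6.14} \]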
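The role of the constraint (6.11) in the invariance of $B$ can be made explicit by a short worked check. Write $N = 2p + \frac{\alpha\ddot\alpha + \beta\ddot\beta}{\delta}(x^2 + y^2)$ for the numerator of $B$ in (6.14); the symbol $N$ is introduced here only for this computation. Applying the restricted field $V_2 = \alpha\,\partial_x + \beta\,\partial_y - (\ddot\alpha x + \ddot\beta y)\,\partial_p$ gives
\[ V_2(N) = -2(\ddot\alpha x + \ddot\beta y) + \frac{2(\alpha\ddot\alpha + \beta\ddot\beta)}{\delta}\,(\alpha x + \beta y) = \frac{2(\alpha\ddot\beta - \ddot\alpha\beta)}{\delta}\,(\beta x - \alpha y), \]
where the second equality uses only $\delta = \alpha^2 + \beta^2$. Since $V_2$ has no $\partial_z$ component, $V_2(B) = V_2(N)/z^2$, which vanishes for all $x$ and $y$ exactly when $\alpha\ddot\beta = \ddot\alpha\beta$. The computation for $V_3$ is analogous.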
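To verify that $B$ satisfies $V_2(B) = V_3(B) = 0$ one must use $\alpha\ddot\beta = \ddot\alpha\beta$. The kinematic reduction diagram for the action of $\Gamma$ on $E$ is therefore
\[
\begin{array}{ccccc}
(t, A, B) & \xleftarrow{\ q_{\kappa_\Gamma}\ } & (t, \mathbf x, A, B) & \xrightarrow{\ \iota\ } & (t, \mathbf x, \mathbf u, p) \\
\downarrow \tilde\pi & & \downarrow \pi & & \downarrow \pi \\
(t) & \xleftarrow{\ q_M\ } & (t, \mathbf x) & \xrightarrow{\ \mathrm{id}\ } & (t, \mathbf x),
\end{array}
\]
where the inclusion map $\iota$ is defined by (6.13) and the solutions to (6.14) for $w$ and $p$. The general invariant section is then, on putting $\sigma = \ln \delta$,
\[ u = x\,\frac{\dot\sigma}{2} - y\,\frac{c}{\delta}, \qquad v = x\,\frac{c}{\delta} + y\,\frac{\dot\sigma}{2}, \qquad w = z\,A(t), \qquad p = -\frac{\alpha\ddot\alpha + \beta\ddot\beta}{2\delta}\,(x^2 + y^2) + \frac{z^2}{2}\,B(t). \]
Note that the $u$ and $v$ components are uniquely determined from the isotropy conditions (6.12) and (6.13) and that the arbitrary functions $A(t)$ and $B(t)$ defining these invariant sections appear only in the $w$ and $p$ components.

We now turn to the dynamic reduction diagram. Since we are treating the Euler equations as the section (6.10) of the tensor bundle $D$ we can anticipate the form of $\Delta_{\mathrm{Inv}}$ by computing the invariant tensors of the form
\[ T = P\,\partial_u \otimes dt + Q\,\partial_v \otimes dt + R\,\partial_w \otimes dt + S\,dt. \tag{6.15} \]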
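The isotropy condition $L_Z T = 0$ at $x_0$, where $Z$ is defined by (6.12), shows immediately that $P = Q = 0$, from which it follows that
\[ f^1 = z\,\frac{\partial}{\partial w} \otimes dt \qquad\text{and}\qquad f^2 = dt \]
are a basis for the invariant fields of the type (6.15). This calculation shows that the $\partial_u \otimes dt$ and $\partial_v \otimes dt$ components of the reduced Euler equations must vanish identically and, consistent with this conclusion, one readily computes
\[
\begin{aligned}
\Delta_{\mathrm{Inv}} &= \Big[ \frac{\ddot\sigma}{2} + \Big( \frac{\dot\sigma}{2} \Big)^2 - \Big( \frac{c}{\delta} \Big)^2 - \frac{\alpha\ddot\alpha + \beta\ddot\beta}{\delta} \Big] \Big[ x\,\frac{\partial}{\partial u} \otimes dt + y\,\frac{\partial}{\partial v} \otimes dt \Big] + \big[ \dot A + A^2 + B \big] f^1 + \big[ \dot\sigma + A \big] f^2 \\
&= \big[ \dot A + A^2 + B \big] f^1 + \big[ \dot\sigma + A \big] f^2.
\end{aligned}
\]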
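Thus, the reduced differential equations are
\[ \dot A + A^2 + B = 0 \qquad\text{and}\qquad \dot\sigma + A = 0, \]
which determine $A$ and $B$ algebraically. In conclusion, for each choice of $\alpha$ and $\beta$ there is precisely one invariant solution to the Euler equations, given by
\[ u = x\,\frac{\alpha\dot\alpha + \beta\dot\beta}{\alpha^2 + \beta^2} - y\,\frac{\alpha\dot\beta - \dot\alpha\beta}{\alpha^2 + \beta^2}, \qquad v = x\,\frac{\alpha\dot\beta - \dot\alpha\beta}{\alpha^2 + \beta^2} + y\,\frac{\alpha\dot\alpha + \beta\dot\beta}{\alpha^2 + \beta^2}, \qquad w = -2z\,\frac{\alpha\dot\alpha + \beta\dot\beta}{\alpha^2 + \beta^2}, \]
\[ p = -\frac{1}{2}(x^2 + y^2)\,\frac{\alpha\ddot\alpha + \beta\ddot\beta}{\alpha^2 + \beta^2} + z^2 \Big[ \frac{\alpha\ddot\alpha + \beta\ddot\beta}{\alpha^2 + \beta^2} + \Big( \frac{\alpha\dot\beta - \beta\dot\alpha}{\alpha^2 + \beta^2} \Big)^2 - 3 \Big( \frac{\alpha\dot\alpha + \beta\dot\beta}{\alpha^2 + \beta^2} \Big)^2 \Big]. \]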
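The invariant solution of Example 6.2 can be spot-checked numerically in the same way as (6.9). The sketch below assumes the particular choice $\alpha = \cosh t$, $\beta = \sinh t$, which satisfies the constraint (6.11) because $\ddot\alpha = \alpha$ and $\ddot\beta = \beta$; this choice, the sample point, the step size and the helper names are illustrative assumptions and not part of the paper.

```python
import math

def fields(t, x, y, z):
    """(u, v, w, p) of Example 6.2 for the hypothetical choice alpha = cosh t, beta = sinh t."""
    al, be = math.cosh(t), math.sinh(t)
    al1, be1 = math.sinh(t), math.cosh(t)   # first derivatives of alpha, beta
    al2, be2 = math.cosh(t), math.sinh(t)   # second derivatives of alpha, beta
    delta = al * al + be * be
    m = (al * al1 + be * be1) / delta       # equals sigma_dot / 2
    n = (al * be1 - al1 * be) / delta       # equals c / delta
    k = (al * al2 + be * be2) / delta
    u = x * m - y * n
    v = x * n + y * m
    w = -2.0 * z * m
    p = -0.5 * (x * x + y * y) * k + z * z * (k + n * n - 3.0 * m * m)
    return (u, v, w, p)

def partial(f, point, k, comp, h=1e-5):
    """Central difference of component `comp` of f along coordinate k of (t, x, y, z)."""
    lo, hi = list(point), list(point)
    lo[k] -= h
    hi[k] += h
    return (f(*hi)[comp] - f(*lo)[comp]) / (2.0 * h)

def euler_residuals(f, point):
    """Residuals of u_t + u.grad(u) + grad(p) = 0 and div(u) = 0 at `point`."""
    vel = f(*point)[:3]
    res = []
    for i in range(3):
        convective = sum(vel[j] * partial(f, point, j + 1, i) for j in range(3))
        res.append(partial(f, point, 0, i) + convective + partial(f, point, i + 1, 3))
    res.append(sum(partial(f, point, i + 1, i) for i in range(3)))
    return res

sample = (0.3, 0.9, -0.4, 0.6)  # (t, x, y, z)
residuals = euler_residuals(fields, sample)
print(max(abs(r) for r in residuals))  # small, limited only by finite-difference error
```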
Harmonic Maps. For our next examples we look at two well-known reductions of the harmonic map equation for maps between spheres. For these examples the bundle $E$ is $S^n \times S^m \to S^n$, which we realize as a subset of $\mathbf R^{n+1} \times \mathbf R^{m+1}$ by
\[ E = \{\, (x, u) \in \mathbf R^{n+1} \times \mathbf R^{m+1} \mid x \cdot x = u \cdot u = 1 \,\}. \]
Let $G$ be a Lie subgroup of $SO(n+1)$, let $\rho : G \to SO(m+1)$ be a Lie group homomorphism and define the action of $G$ on $E$ by
\[ R \cdot (x, u) = (R \cdot x, \rho(R) \cdot u) \quad\text{for } R \in G. \]
The kinematic bundle for the $G$ invariant sections of $E$ has fiber
\[ \kappa_{G,x}(E) = \{\, (x, u) \in E \mid \rho(R) \cdot u = u \ \text{for all } R \in G \text{ such that } R \cdot x = x \,\}. \]
We identify the jet space $J^2(E)$ with a submanifold of $J^2(\mathbf R^{n+1}, \mathbf R^{m+1})$ by
\[ J^2(E) = \{\, (x, u, \partial_i u, \partial_{ij} u) \in J^2(\mathbf R^{n+1}, \mathbf R^{m+1}) \mid x \cdot x = 1,\ u \cdot u = 1,\ u \cdot \partial_i u = 0,\ u \cdot \partial_{ij} u + \partial_i u \cdot \partial_j u = 0 \,\}. \]
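Since the harmonic map operator (or tension field) is a tangent vector to the target sphere $S^m$ at each point $\sigma \in J^2(E)$, we let
\[ D = \{\, (\sigma, \Delta) \in J^2(E) \times \mathbf R^{m+1} \mid u \cdot \Delta = 0 \,\}. \tag{6.16} \]
By combining Proposition I.1.17 (p. 19) and Lemma VII.1.2 (p. 129) in Eells and Ratto [15], it follows that one can write the harmonic map operator $\Delta : J^2(E) \to D$ as the map
\[ \Delta(\sigma) = \big[ \Delta_{\mathbf R^{n+1}} u^\alpha + x^i x^j u^\alpha_{ij} + n\,x^i u^\alpha_i - \lambda u^\alpha \big] \frac{\partial}{\partial u^\alpha}, \tag{6.17} \]
where
\[ \lambda = \delta_{\alpha\beta} \big[ \delta^{ij} u^\alpha_i u^\beta_j - x^i x^j u^\alpha_i u^\beta_j \big] \qquad\text{and}\qquad \Delta_{\mathbf R^{n+1}} u^\alpha = -\delta^{ij} u^\alpha_{ij}. \]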
This operator is invariant under the induced action of $\mathcal G = SO(n+1) \times SO(m+1)$ on $D$.

Example 6.3 (Harmonic Maps from $S^2$ to $S^4$). For our first example we take $E = S^2 \times S^4 \to S^2$ and we look for harmonic maps which are invariant under the standard action of $SO(3)$ acting on $S^2$. It can be proved that, up to conjugation, there are three distinct group homomorphisms $\rho : SO(3) \to SO(5)$, which lead to the following three possibilities for the infinitesimal generators of $SO(3)$ acting on $E$.

Case I:
\[ V_1 = z\partial_y - y\partial_z, \qquad V_2 = x\partial_z - z\partial_x, \qquad V_3 = y\partial_x - x\partial_y. \]
Case II:
\[ V_1 = z\partial_y - y\partial_z - u^2\partial_{u^3} + u^3\partial_{u^2}, \qquad V_2 = x\partial_z - z\partial_x - u^3\partial_{u^1} + u^1\partial_{u^3}, \qquad V_3 = y\partial_x - x\partial_y - u^1\partial_{u^2} + u^2\partial_{u^1}. \]
Case III:
\[
\begin{aligned}
V_1 &= z\partial_y - y\partial_z + u^2\partial_{u^1} - u^1\partial_{u^2} + (u^4 - \sqrt3\,u^5)\partial_{u^3} - u^3\partial_{u^4} + \sqrt3\,u^3\partial_{u^5}, \\
V_2 &= x\partial_z - z\partial_x - u^3\partial_{u^1} + (u^4 + \sqrt3\,u^5)\partial_{u^2} + u^1\partial_{u^3} - u^2\partial_{u^4} - \sqrt3\,u^2\partial_{u^5}, \\
V_3 &= y\partial_x - x\partial_y - 2u^4\partial_{u^1} + u^3\partial_{u^2} - u^2\partial_{u^3} + 2u^1\partial_{u^4}.
\end{aligned}
\]
In Case I the map $\rho$ is the constant map and, in Case II, $\rho$ is the standard inclusion of $SO(3)$ into $SO(5)$. The origin of the map $\rho$ in Case III will be discussed shortly. Since $SO(3)$ acts transitively on $S^2$, the orbit manifold $\tilde M$ consists of a single point, the space of invariant sections is a finite dimensional manifold, and the reduced differential equations are algebraic equations. The kinematic bundles $\kappa_G(E)$ are determined in each case from the isotropy constraint $xV_1 + yV_2 + zV_3 = 0$.

In Case I the action is transverse, the isotropy constraint is vacuous and the kinematic bundle is $\kappa_G(E) = S^2 \times S^4$. The invariant sections are given by
\[ A_I(x, y, z) = (A, B, C, D, E), \]
where $A, \dots, E$ are constants and $A^2 + B^2 + C^2 + D^2 + E^2 = 1$. In Case II the kinematic bundle is $S^2 \times S^2$ and the invariant sections are
\[ A_{II}(x, y, z) = (Ax, Ay, Az, B, C), \]
where $A, B, C$ are constants such that $A^2 + B^2 + C^2 = 1$. We take $A \neq 0$, since otherwise $A_{II}$ becomes a special case of $A_I$. In Case III, $\kappa_G(E) = S^2 \times \{\pm 1\}$ and the invariant sections are
\[ A_{III}(x, y, z) = A\sqrt3\, \Big( xy,\ xz,\ yz,\ \tfrac12 (x^2 - y^2),\ \tfrac{\sqrt3}{6} (x^2 + y^2 - 2z^2) \Big), \]
where $A = \pm 1$. Direct substitution into (6.17) easily shows that the maps $A_I$ and $A_{III}$ automatically satisfy the harmonic map equation. The map $A_{II}$ is harmonic if and only if $B = C = 0$, in which case $A_{II}$ is either the identity map or the antipodal map on $S^2$ followed by the standard inclusion into $S^4$.

Despite the simplicity of these conclusions, it is nevertheless instructive to look at the corresponding dynamic reduction diagrams. In Case I, the invariant sections are constant and so
\[ \mathrm{Inv}^2(E) = \{\, (\mathbf x, \mathbf A) \in \mathbf R^3 \times \mathbf R^5 \mid \mathbf x \cdot \mathbf x = \mathbf A \cdot \mathbf A = 1 \,\} \quad\text{and}\quad D_{\mathrm{Inv}} = \{\, (\sigma, \Delta) \in \mathrm{Inv}^2(E) \times \mathbf R^5 \mid \mathbf A \cdot \Delta = 0 \,\}. \]
The automorphism group for the kinematic bundle in this case is $\tilde G = SO(3) \times SO(5)$, which acts on $D_{\mathrm{Inv}}$ by
\[ (R, S) \cdot (\mathbf x, \mathbf A, \Delta) = (R \cdot \mathbf x, S \cdot \mathbf A, S \cdot \Delta) \quad\text{for } R \in SO(3) \text{ and } S \in SO(5). \]
The isotropy constraint for $\kappa_{\tilde G}(D_{\mathrm{Inv}})$ forces $\Delta$ to be a multiple of $\mathbf A$. Hence, by the tangency condition $\mathbf A \cdot \Delta = 0$, we have $\Delta = 0$ and $\kappa_{\tilde G,\sigma}(D_{\mathrm{Inv}}) = 0$. This shows that the map $A_I$ is harmonic by symmetry considerations alone and moreover that it is a universal solution for any operator $\Delta : J^k(S^2 \times S^4) \to D$ with $SO(3) \times SO(5)$ symmetry.

In Case II, the harmonic map equations force $B = C = 0$ so that the maps $A_{II}$ are not universal. Interestingly however, the standard and antipodal inclusions $S^2 \to S^4$ have a larger symmetry group, namely $SO(3) \times SO(2) \subset \mathcal G$, and it is easily seen, using these larger symmetry groups, that the standard and antipodal inclusions are universal. It is a common phenomenon that the group invariant solutions to a system of differential equations possess a larger symmetry group than the original group used in their construction.

In Case III one finds immediately that $\kappa_{G,\sigma}(D_{\mathrm{Inv}}) = 0$ and $A_{III}$ is universal, again for any operator $\Delta : J^k(S^2 \times S^4) \to D$ with $SO(3) \times SO(5)$ symmetry. The map $A_{III}$ is the classic Veronese map. The symmetry group defining it is based on a standard irreducible representation of $SO(3)$ which readily generalizes to give harmonic maps between various spheres of higher dimension. Specifically, starting with the standard action of $SO(n)$ on $V = \mathbf R^n$, consider the induced action on $\mathrm{Sym}^k_{\mathrm{tr}}(V)$, the space of rank $k$ symmetric, trace-free tensors or, equivalently, on the space $W = H_k(V)$ of harmonic polynomials of degree $k$ on $V$. The standard metric on $W$ is invariant under this action of $SO(n)$ and in this way one obtains a Lie group monomorphism $\rho : SO(n) \to SO(N)$, where $N = \dim(W) = \binom{n+k-1}{k} - \binom{n+k-3}{k-2}$, which for $k = 2$ equals $\binom{n+1}{2} - 1$. For example, the polynomials
\[ u^1 = xy, \quad u^2 = xz, \quad u^3 = yz, \quad u^4 = \tfrac12 (x^2 - y^2), \quad u^5 = \tfrac{\sqrt3}{6} (x^2 + y^2 - 2z^2) \]
form an orthogonal basis for $H_2(\mathbf R^3)$ and the action of $SO(3)$ on this space determines the action of $SO(3)$ on $\mathbf R^3 \times \mathbf R^5$ in Case III. For further examples see Eells and Ratto [15] and Toth [38].
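Two elementary properties of the Veronese map $A_{III}$ lend themselves to a quick numerical check: it sends the unit sphere $S^2$ into the unit sphere $S^4$, and its components are harmonic polynomials on $\mathbf R^3$. The sketch below verifies both at a few sample points and also evaluates the standard dimension formula for harmonic polynomials. The function names, the sample angles and the finite-difference step are illustrative assumptions; the code does not test the full harmonic map equation (6.17).

```python
import math

def harmonic_dim(n, k):
    """Dimension of the degree-k harmonic polynomials on R^n (standard formula)."""
    lower = math.comb(n + k - 3, k - 2) if k >= 2 else 0
    return math.comb(n + k - 1, k) - lower

def veronese(x, y, z, sign=1.0):
    """The map A_III of Example 6.3, with A = sign = +1 or -1."""
    s3 = math.sqrt(3.0)
    return tuple(sign * s3 * c for c in (
        x * y,
        x * z,
        y * z,
        0.5 * (x * x - y * y),
        (s3 / 6.0) * (x * x + y * y - 2.0 * z * z),
    ))

def sphere_point(theta, phi):
    """A point of S^2 in spherical angles."""
    return (math.cos(theta) * math.sin(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(phi))

def euclidean_laplacian(f, pt, comp, h=1e-3):
    """Second-difference approximation of the R^3 Laplacian of component `comp`."""
    total = 0.0
    for k in range(3):
        lo, hi = list(pt), list(pt)
        lo[k] -= h
        hi[k] += h
        total += (f(*hi)[comp] - 2.0 * f(*pt)[comp] + f(*lo)[comp]) / (h * h)
    return total

points = [sphere_point(0.3, 1.1), sphere_point(2.0, 0.4), sphere_point(4.5, 2.6)]
norms = [math.sqrt(sum(c * c for c in veronese(*pt))) for pt in points]
max_lap = max(abs(euclidean_laplacian(veronese, pt, comp))
              for pt in points for comp in range(5))
print(harmonic_dim(3, 2), norms, max_lap)
```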
Example 6.4 (Harmonic Maps from $S^n$ to $S^n$). A basic result of Smith [35] states that each element of $\pi_n(S^n) = \mathbf Z$ can be represented by a harmonic map (with respect to the standard metric) provided $n \le 7$ or $n = 9$. This result, which can be established by symmetry reduction of the harmonic map equation (see Eells and Ratto [15] and Urakawa [39]), illustrates a number of interesting features. First, we see that much of the general theory which we have outlined could be extended to the case where $\tilde M$ is a manifold with boundary and where the fibers of $\kappa_G(E)$ change topological type on the boundary. Secondly, we find that the invariant sections for the standard action of $G = SO(n-1) \times SO(2) \subset SO(n+1)$ on $S^n$ are slightly more general than those considered in [15] and [39]. However, a simple analysis of the reduced equations, based upon Noether's theorem, shows that the only solutions to the reduced equations are essentially those provided by the ansatz used by Eells and Ratto and Urakawa.

If $(R, S) \in G = SO(n-1) \times SO(2) \subset SO(n+1)$ and $(x, y, u, v) \in E \subset (\mathbf R^{n-1} \times \mathbf R^2) \times (\mathbf R^{n-1} \times \mathbf R^2)$, where $\|x\|^2 + \|y\|^2 = 1$ and $\|u\|^2 + \|v\|^2 = 1$, then the action of $G$ on $E = S^n \times S^n$ is given by
\[ (R, S)(x, y, u, v) = \left( \begin{pmatrix} R & 0 \\ 0 & S \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix},\ \begin{pmatrix} R & 0 \\ 0 & S \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} \right). \]
The invariants for the action of $G$ on the base $\mathbf R^{n+1}$ are $r = \|x\|$ and $s = \|y\|$ which, for points $(x, y) \in S^n$, are related by $r^2 + s^2 = 1$, where $r \ge 0$ and $s \ge 0$. The quotient manifold $\tilde M = S^n/G$ is therefore diffeomorphic to the closed interval $[0, \pi/2]$.

To describe the kinematic bundle $\kappa_G(E)$ we must consider separately those points in $M$ for which (i) $s = 0$, (ii) $s \neq 0$ and $r \neq 0$, and (iii) $r = 0$, corresponding to the left-hand boundary point, the interior points and the right-hand boundary point of $\tilde M$. For $(x, 0) \in S^n$, the isotropy subgroup is $SO(n-1)_x \times SO(2)$ and the fiber of the kinematic bundle consists of a pair of points,
\[ \kappa_{G,(x,0)}(E) = \{\, (x, 0, u, v) \mid u = \pm x \ \text{and}\ v = 0 \,\}. \]
For points $(x, y) \in S^n$ with $r \neq 0$ and $s \neq 0$ the isotropy group is $SO(n-1)_x \times \{ I \}$ and the fiber of the kinematic bundle is the ellipsoid of revolution
\[ \kappa_{G,(x,y)}(E) = \{\, (x, y, u, v) \mid u = Ax, \ \text{where}\ r^2 A^2 + \|v\|^2 = 1 \,\}. \]
Invariant coordinates on $\kappa_{G,(x,y)}(E)$ are
\[ A = \frac{x \cdot u}{r^2}, \qquad B = \frac{y \cdot v}{s^2} \qquad\text{and}\qquad C = \frac{y^\perp \cdot v}{s^2}, \]
where $y^\perp = (0, -y^2, y^1)$, subject to
\[ r^2 A^2 + s^2 (B^2 + C^2) = 1. \tag{6.18} \]
The inclusion map from $\kappa_{G,(x,y)}(E)$ to $E_{(x,y)}$ is
\[ u = Ax \qquad\text{and}\qquad v = By + Cy^\perp. \]
At the points $(0, y)$, the isotropy subgroup is $SO(n-1) \times \{ I \}$ and the fiber of the kinematic bundle is the circle
\[ \kappa_{G,(0,y)}(E) = \{\, (0, y, u, v) \mid u = 0 \ \text{and}\ \|v\| = 1 \,\}. \]
Fig. 1. The reduced kinematic bundle for $SO(n-1) \times SO(2)$ invariant maps $s : S^n \to S^n$ (axis ticks: $0$, $\pi/4$, $\pi/2$)
The quotient space $\tilde\kappa_G(E)$ is shown in Fig. 1. The $G$ invariant sections are therefore described, as maps $\mathbf A : \mathbf R^{n+1} \to \mathbf R^{n+1}$, by
\[ \mathbf A(x, y) = A(t)\,x + B(t)\,y + C(t)\,y^\perp, \tag{6.19} \]
where $t$ is the smooth function of $(x, y)$ defined by
\[ \cos(t) = \frac{r}{\sqrt{r^2 + s^2}} \qquad\text{and}\qquad \sin(t) = \frac{s}{\sqrt{r^2 + s^2}}, \]
and where $\cos^2(t) A^2(t) + \sin^2(t) \big( B^2(t) + C^2(t) \big) = 1$. The isotropy conditions at the boundary of $\tilde M$ imply that the functions $A$, $B$ and $C$ are subject to the boundary conditions
\[ A(0) = \pm 1, \quad B(0) = 0, \quad C(0) = 0, \qquad\text{and}\qquad A(\tfrac{\pi}{2}) = 0, \quad B(\tfrac{\pi}{2})^2 + C(\tfrac{\pi}{2})^2 = 1. \tag{6.20} \]
The invariant sections considered in [15] and [39] correspond to $C(t) = 0$. Note that the space of invariant sections (6.19) is preserved by rotations in the $v$ plane, that is, rotations in the $BC$ plane, and therefore $SO(2) \subset \tilde G_{\mathrm{eff}}$. By computing $\kappa_G(D_{\mathrm{Inv}})$ we deduce that the restricted harmonic operator $\Delta_{\mathrm{Inv}}$ is of the form
\[ \Delta_{\mathrm{Inv}} = \Delta_A\, x \cdot \frac{\partial}{\partial u} + \Delta_B\, y \cdot \frac{\partial}{\partial v} + \Delta_C\, y^\perp \cdot \frac{\partial}{\partial v}, \]
where the tangency condition (6.16) reduces to $r^2 A \Delta_A + s^2 B \Delta_B + s^2 C \Delta_C = 0$. A series of straightforward calculations, using (6.17), now shows that the coefficients of the reduced operator $\tilde\Delta$ are
\[
\begin{aligned}
\tilde\Delta_A &= -\ddot A + \Big[ n\,\frac{\sin(t)}{\cos(t)} - \frac{\cos(t)}{\sin(t)} \Big] \dot A + nA - \lambda A, \\
\tilde\Delta_B &= -\ddot B + \Big[ (n-2)\,\frac{\sin(t)}{\cos(t)} - 3\,\frac{\cos(t)}{\sin(t)} \Big] \dot B + nB - \lambda B, \\
\tilde\Delta_C &= -\ddot C + \Big[ (n-2)\,\frac{\sin(t)}{\cos(t)} - 3\,\frac{\cos(t)}{\sin(t)} \Big] \dot C + nC - \lambda C,
\end{aligned}
\tag{6.21}
\]
where
\[ \lambda = \cos^2(t)\,\dot A^2 + \sin^2(t) \big( \dot B^2 + \dot C^2 \big) + 2\cos(t)\sin(t) \big( -A\dot A + B\dot B + C\dot C \big) + (n-1) A^2 + 2(B^2 + C^2) - \cos^2(t)\,A^2 - \sin^2(t)\,(B^2 + C^2). \]
To analyze these equations, we first invoke the principle of symmetric criticality and the formulas in [2] for the reduced Lagrangian to conclude that these equations are the Euler–Lagrange equations for the reduced Lagrangian
\[ \tilde L = \tfrac12 \cos(t)^{n-2} \sin(t)\, \lambda\, dt \]
subject, of course, to the constraint (6.18). From knowledge of the automorphism group of the kinematic bundle we know that this Lagrangian is invariant under rotations in the $BC$ plane and this leads to the first integral
\[ J = \cos(t)^{n-2} \sin(t)^3 \big( B\dot C - C\dot B \big) \]
for (6.21). By the boundary conditions (6.20), $J$ must vanish identically. Thus $C(t) = \mu B(t)$ for some constant $\mu$, and therefore a rotation in the $v, v^\perp$ plane will rotate the general invariant section (6.19) into the section with $C(t) = 0$. We then have $r^2 A^2 + s^2 B^2 = 1$ and the change of variables
\[ A(t) = \frac{\cos(\phi(t))}{\cos(t)} \qquad\text{and}\qquad B(t) = \frac{\sin(\phi(t))}{\sin(t)} \]
converts the reduced operator (6.21) into the form found in [15] or [39].

General Relativity. We now turn to some examples of Lie symmetry reduction in general relativity which we again examine from the viewpoint of the kinematic and dynamic reduction diagrams. To study reductions of the Einstein field equations, we take the bundle $E$ to be the bundle $Q(M)$ of quadratic forms, with Lorentz signature, on a 4-dimensional manifold $M$. A section of $E$ then corresponds to a choice of Lorentz metric on $M$. We view the Einstein tensor
\[ \Delta = G^{ij}(g_{hk}, g_{hk,l}, g_{hk,lm})\, \frac{\partial}{\partial x^i} \otimes \frac{\partial}{\partial x^j} \]
formally as a section of $D \to J^2(E)$, where $D$ is the pullback of $V = \mathrm{Sym}^2(T(M))$ to the bundle of 2-jets $J^2(E)$. The operator $\Delta$ is invariant under the Lie pseudo-group $\mathcal G$ of all local diffeomorphisms of $M$. Let $\mathrm{Div}_g$ be the covariant divergence operator (defined by the metric connection for $g$) acting on (1,1) tensors,
\[ \mathrm{Div}_g(S) = \nabla_i S^i_j\, dx^j. \]
The contracted Bianchi identity is $\mathrm{Div}_g G = 0$, where $G$ is the operator obtained from $\Delta$ by lowering an index with the metric.

The first point we wish to underscore with the following examples is that the kinematic reduction diagram gives a remarkably efficient means of solving the Killing equations for the determination of the invariant metrics. Secondly, we show that discrete symmetries, which will not change the dimension of the reduced spacetime $\tilde M$, can lead to isotropy constraints which reduce the fiber dimension of the kinematic bundle. Thirdly, for $G$ invariant metrics, the divergence operator $\mathrm{Div}_g$ is a $G$ invariant operator to which the dynamical reduction procedure can be applied to obtain the reduction of the contracted Bianchi identities for the reduced equations. Throughout, we emphasize the importance of the residual symmetry group in analyzing the reductions of the field equations. Finally, we remark that our conclusions in these examples are not restricted to the Einstein equations but in fact hold for any generally covariant metric field theory derivable from a variational principle.

Example 6.5 (Spherically Symmetric and Stationary, Spherically Symmetric Reductions). We begin by looking at spherically symmetric solutions on the four dimensional manifold $M = \mathbf R \times (\mathbf R^3 - \{ 0 \})$, with coordinates $(x^i) = (t, x, y, z)$ for $i = 0, 1, 2, 3$. Although this is a very well-understood example, it is nevertheless instructive to consider it within the general theory of Lie symmetry reduction of differential equations. The infinitesimal generators for $G = SO(3)$ are given by (2.10) and, just as in Example 6.1, we find that the infinitesimal isotropy constraint defining $\kappa_{G,x}(E) = \kappa_{\Gamma,x}(E)$ is
\[ \varepsilon_{0kij}\, x^k g_{li}\, \frac{\partial}{\partial g_{lj}} = 0, \]
or, in terms of matrices, $ga + a^t g = 0$, where
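\[ a = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & z & -y \\ 0 & -z & 0 & x \\ 0 & y & -x & 0 \end{pmatrix} \tag{6.22} \]
and $g = [\, g_{ij} \,]$. These linear equations are easily solved to give
\[ g = A \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} + B \begin{pmatrix} 0 & x & y & z \\ x & 0 & 0 & 0 \\ y & 0 & 0 & 0 \\ z & 0 & 0 & 0 \end{pmatrix} + C \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & x^2 & xy & xz \\ 0 & xy & y^2 & yz \\ 0 & xz & yz & z^2 \end{pmatrix} + D \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{6.23} \]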
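The solution (6.23) of the linear system $ga + a^t g = 0$ can be confirmed with exact arithmetic. The sketch below checks that each of the four matrices in (6.23) satisfies the constraint at a sample point and that the solution space within the symmetric $4 \times 4$ matrices has dimension exactly 4. The sample point $(x, y, z) = (1, 2, 3)$ and all helper names are illustrative assumptions.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [p - factor * q for p, q in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(p):
    return [[p[j][i] for j in range(4)] for i in range(4)]

def constraint(g, a):
    """The matrix g a + a^t g appearing in the isotropy condition."""
    ga, atg = matmul(g, a), matmul(transpose(a), g)
    return [[ga[i][j] + atg[i][j] for j in range(4)] for i in range(4)]

x, y, z = 1, 2, 3
X = (x, y, z)
a = [[0, 0, 0, 0], [0, 0, z, -y], [0, -z, 0, x], [0, y, -x, 0]]  # matrix (6.22)

g_A = [[1 if (i, j) == (0, 0) else 0 for j in range(4)] for i in range(4)]
g_B = [[X[j - 1] if i == 0 and j > 0 else X[i - 1] if j == 0 and i > 0 else 0
        for j in range(4)] for i in range(4)]
g_C = [[X[i - 1] * X[j - 1] if i > 0 and j > 0 else 0 for j in range(4)] for i in range(4)]
g_D = [[1 if i == j and i > 0 else 0 for j in range(4)] for i in range(4)]

all_zero = all(entry == 0
               for g in (g_A, g_B, g_C, g_D)
               for row in constraint(g, a) for entry in row)

# Images of the 10 elementary symmetric matrices under g -> g a + a^t g.
images = []
for i in range(4):
    for j in range(i, 4):
        e = [[1 if (r, c) in ((i, j), (j, i)) else 0 for c in range(4)] for r in range(4)]
        images.append([entry for row in constraint(e, a) for entry in row])
nullity = 10 - rank(images)
print(all_zero, nullity)  # prints: True 4
```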
The fiber of the kinematic bundle $\kappa_{G,x}(E)$ is therefore parameterized by four variables $A, B, C, D$. Since these variables are invariants for the action of $G$ restricted to $\kappa_G(E)$ and since the invariants for the action of $SO(3)$ on $M$ are $t$ and $r$, the kinematic reduction diagram for the action of $SO(3)$ on the bundle of Lorentz metrics is
\[
\begin{array}{ccccc}
(t, r, A, B, C, D) & \xleftarrow{\ q_{\kappa_G}\ } & (x^i, A, B, C, D) & \xrightarrow{\ \iota\ } & (x^i, g_{ij}) \\
\downarrow & & \downarrow & & \downarrow \\
(t, r) & \xleftarrow{\ q_M\ } & (x^i) & \xrightarrow{\ \mathrm{id}\ } & (x^i),
\end{array}
\]
where the inclusion map $\iota$ is given by (6.23). Consequently, the most general rotationally invariant metric on $M$ is
\[ ds^2 = A(t, r)\,dt^2 + 2B(t, r)\,dt\,(x\,dx + y\,dy + z\,dz) + C(t, r)\,(x\,dx + y\,dy + z\,dz)^2 + D(t, r)\,(dx^2 + dy^2 + dz^2). \tag{6.24} \]
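In standard spherical coordinates $x = r\cos\theta \sin\phi$, $y = r\sin\theta \sin\phi$, $z = r\cos\phi$ this takes the familiar form (on re-defining the coefficients $B$, $C$ and $D$)
\[ ds^2 = A(t, r)\,dt^2 + B(t, r)\,dt\,dr + C(t, r)\,dr^2 + D(t, r)\,d\Omega^2, \tag{6.25} \]
where $d\Omega^2 = d\phi^2 + \sin^2\phi\, d\theta^2$.

If we enlarge the symmetry group to include time translations $V_0 = \frac{\partial}{\partial t}$, then the kinematic reduction diagram becomes
\[
\begin{array}{ccccc}
(r, A, B, C, D) & \xleftarrow{\ q_{\kappa_G}\ } & (x^i, A, B, C, D) & \xrightarrow{\ \iota\ } & (x^i, g_{ij}) \\
\downarrow & & \downarrow & & \downarrow \\
(r) & \xleftarrow{\ q_M\ } & (x^i) & \xrightarrow{\ \mathrm{id}\ } & (x^i).
\end{array}
\tag{6.26}
\]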
At first glance there appears to be little difference between the two diagrams (6.24) and (6.26), but a computation of the automorphism groups reveals a dramatic difference in the geometry of the reduced bundles κ˜ G (E) in (6.24) and (6.26). This difference is best explained in terms of general results on Kaluza-Klein reductions of metric theories as in, for example, Coquereaux and Jadczyk [13]. From our perspective, these authors show that when the action of G on M is simple in the sense that the isotropy groups Gx can all be conjugated in G to a fixed isotropy group Gx0 , then the reduced bundle κ˜ G (E) is a product of three bundles over M, ⊕ A(M) ⊕ QInv (K). κ˜ G (E) = Q(M)
(6.27)
Here is the bundle of metrics on M. (i) Q(M) 1 (ii) A(M) = J (M) ⊗ (P ×H h) , where P is the principal H bundle defined as the set of points in M with isotropy group Gx0 and H = Nor(Gx0 , G)/Gx0 . (iii) QInv (K) is the trivial bundle whose fiber consist of the G invariant metrics on the homogeneous space K = G/Gx0 . ˜ eff to be the diffeomorphism For (6.24) one computes the residual symmetry group G = R × R+ and one finds that the coefficients A, B, C transform as the group of M and that D is a scalar field (which one identifies as a map components of a metric on M into the space of SO(3) invariant metrics on S 2 ). Thus, for (6.24), we find that ⊕ !, κ˜ G (E) = Q(M) By contrast, for the diagram (6.26) the automorwhere ! is a trivial line bundle over M. ˜ phism group Geff acts on M by r → f (r) Diff(R+ ),
C ∞ (R)
and t → .t + g(r),
(6.28)
R∗ . Without
where f ∈ g∈ and . ∈ going further into the details of the decomposition (6.27), we simply note that the variable t is now the fiber coordinate on the principle bundle P and that under the transformations (6.28) the coefficients of the metric (6.25), which are now functions of r alone, transform according to A(r) → . 2 A(f (r)),
B(r) → .[f B(f (r)) + 2g A(f (r))],
C(r) → (f )2 C(f (r)) + f g B(f (r)) + (g )2 A(f (r)),
D(r) → D(f (r)).
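The transformation rules for $A$, $B$, $C$ can be checked by substituting $dt \to \lambda\,dt + g'\,dr$ and $dr \to f'\,dr$ into the $(t,r)$ part of (6.25). The following is only a sketch of that check in exact rational arithmetic; the function names and the sample numbers are hypothetical and are not taken from the text.

```python
from fractions import Fraction as F

def line_element(A, B, C, dt, dr):
    """(t, r) part of the metric: A dt^2 + B dt dr + C dr^2."""
    return A * dt * dt + B * dt * dr + C * dr * dr

def transformed_coefficients(A, B, C, lam, fp, gp):
    """Coefficients after r -> f(r), t -> lam*t + g(r).

    fp and gp stand for f'(r) and g'(r); A, B, C are the original
    coefficients already evaluated at f(r).
    """
    A_new = lam**2 * A
    B_new = lam * (fp * B + 2 * gp * A)
    C_new = fp**2 * C + fp * gp * B + gp**2 * A
    return A_new, B_new, C_new

# Arbitrary sample values, chosen only for the check.
A, B, C = F(-3), F(5, 2), F(7)
lam, fp, gp = F(2), F(3, 4), F(-1, 3)
dt, dr = F(11, 5), F(-2, 7)

# Pull back the line element: dt -> lam*dt + g' dr, dr -> f' dr.
pulled_back = line_element(A, B, C, lam * dt + gp * dr, fp * dr)
# Evaluate the line element with the transformed coefficients instead.
direct = line_element(*transformed_coefficients(A, B, C, lam, fp, gp), dt, dr)
print(pulled_back == direct)
```

Because the arithmetic is exact, the two values agree identically rather than up to rounding.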
678
I. M. Anderson, M. E. Fels, C. G. Torre
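Consequently, the sections of $\tilde\kappa_G(E)$ can be written as $\tilde s(r) = [\tilde g(r), \tilde\omega(r), \tilde h(r)]$, where
$$\tilde g(r) = \Big[C(r) - \frac{B(r)^2}{4A(r)}\Big]\,dr^2, \qquad \tilde\omega(r) = \frac{B(r)}{2A(r)}\,dr \otimes \frac{\partial}{\partial t}, \qquad\text{and}\qquad \tilde h(r) = A(r)\,dt^2 + D(r)\,d\Omega^2.$$
Here $\tilde g(r)$ is a metric on $\tilde M$, $\tilde\omega(r)$ is a connection on $P$ pulled back to $\tilde M$, and $\tilde h(r)$ is a map from $\tilde M$ into the $G$ invariant metrics on $\mathbf{R} \times S^2$.
The detailed expression for the reduced operator $\tilde\Delta$ for the stationary, rotationally invariant metrics can be found in any introductory text on general relativity. Here we simply point out that by computing the action of $G$ on $\mathrm{Sym}^2(TM)$, we can deduce that the reduced operator will have the form
$$\tilde\Delta = \tilde\Delta_{tt}\,\frac{\partial}{\partial t}\otimes\frac{\partial}{\partial t} + \tilde\Delta_{rt}\Big(\frac{\partial}{\partial r}\otimes\frac{\partial}{\partial t} + \frac{\partial}{\partial t}\otimes\frac{\partial}{\partial r}\Big) + \tilde\Delta_{rr}\,\frac{\partial}{\partial r}\otimes\frac{\partial}{\partial r} + \tilde\Delta_{\Omega}\Big(\frac{\partial}{\partial\phi}\otimes\frac{\partial}{\partial\phi} + \frac{1}{\sin^2\phi}\,\frac{\partial}{\partial\theta}\otimes\frac{\partial}{\partial\theta}\Big),$$
where $\tilde\Delta_{tt}$, $\tilde\Delta_{rt}$, $\tilde\Delta_{rr}$ and $\tilde\Delta_{\Omega}$ are smooth functions on the 2-jets of the bundle $(r, A, B, C, D) \to (r)$. In other words, of the ten components in the field equations, the dynamic reduction diagram automatically implies that 6 of these components vanish. Moreover, the reduced operator $\tilde\Delta$ is constrained by the reduced Bianchi identities. Since $dt$ and $dr$ provide a basis for the invariant one forms on $M$, we know that the reduction of $\mathrm{Div}_g\,S$ is a linear combination of $dt$ and $dr$,
$$\mathrm{Div}_g\,S = \tilde S_t\,dt + \tilde S_r\,dr.$$
By direct computation, one finds that the $dt$ and $dr$ components of the reduced Bianchi identities are
$$\frac{1}{2\gamma}\frac{d}{dr}\big[\gamma\,(2A\,\tilde\Delta_{rt} + B\,\tilde\Delta_{rr})\big] = 0, \quad\text{and}$$
$$\frac{1}{2}\Big[\frac{1}{\gamma}\frac{d}{dr}\big[\gamma\,(2C\,\tilde\Delta_{rr} + B\,\tilde\Delta_{rt})\big] - \dot A\,\tilde\Delta_{tt} - \dot C\,\tilde\Delta_{rr} - \dot B\,\tilde\Delta_{rt} - 2\dot D\,\tilde\Delta_{\Omega}\Big] = 0,$$
where $\gamma = D\sqrt{\tfrac14 B^2 - AC}$. It follows from the first of these identities and the transformation properties of $A$, $B$, $\tilde\Delta_{rt}$ and $\tilde\Delta_{rr}$ under the residual scaling $t \to \lambda t$ that
$$2A\,\tilde\Delta_{rt} + B\,\tilde\Delta_{rr} = 0.$$
This same identity can be derived by first observing that the principle of symmetric criticality holds for the action $G$ and then by applying Noether's second theorem to the reduced Lagrangian with symmetry $\tilde G_{\mathrm{eff}}$.
Consequently, of the four ODEs arising in the stationary, spherically symmetric reduction of the field equations one need only solve the two equations
$$\tilde\Delta_{tt} = 0 \quad\text{and}\quad \tilde\Delta_{rr} = 0.$$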
Group Invariant Solutions Without Transversality
679
The remaining two equations
$$\tilde\Delta_{rt} = 0 \quad\text{and}\quad \tilde\Delta_{\Omega} = 0$$
will automatically be satisfied (assuming $\dot D \neq 0$, $A \neq 0$). We stress that these conclusions actually hold true for the stationary, rotationally invariant reductions of any generally covariant metric field equations derivable from a variational principle.
Example 6.6 (Static, Spherically Symmetric Reductions). A metric is static and spherically symmetric if, in addition to being invariant under time translations and rotations, it is invariant under time reflection. The symmetry group $G$ now includes the transformations $t \to t + c$ and $t \to -t$ and therefore the isotropy subgroup $G_{x_0}$ of the point $x_0 = (t_0, \mathbf{x}_0)$ now includes the reflection $t \to 2t_0 - t$. The fibers of the kinematic bundle are now constrained by (6.22) along with
$$b\,g\,b^t = g, \quad\text{where}\quad b = \mathrm{diag}[-1, 1, 1, 1].$$
This forces $B = 0$ in (6.23) so that the fibers of the kinematic bundle are now 3-dimensional and the general invariant section is
$$ds^2 = A(r)\,dt^2 + C(r)\,dr^2 + D(r)\,d\Omega^2.$$
The automorphism group for this bundle is now $r \to f(r)$ and $t \to \lambda t$ and the $A(\tilde M)$ summand in (6.27) does not appear. This example shows that while discrete symmetries will never result in a reduction of the dimension of the orbit space $\tilde M$, that is, the number of independent variables, discrete symmetries can reduce the fiber dimension of the kinematic bundle, that is, the number of dependent variables.
Example 6.7 (Plane Waves). As our next example from general relativity, we consider a class of plane wave metrics [12]. We take $M = \mathbf{R}^4$ with coordinates $(u, v, x, y)$ and let $P(u)$ and $Q(u)$ be arbitrary smooth functions satisfying $P'(u) > 0$ and $Q'(u) > 0$. The symmetry group on $M$ is the five-parameter transformation group
$$\begin{aligned}
u' &= u, \\
v' &= v + \varepsilon_1 + \varepsilon_4 x + \varepsilon_5 y + \tfrac12\big(\varepsilon_2\varepsilon_4 + \varepsilon_3\varepsilon_5 + \varepsilon_4^2 P(u) + \varepsilon_5^2 Q(u)\big), \\
x' &= x + \varepsilon_2 + \varepsilon_4 P(u), \\
y' &= y + \varepsilon_3 + \varepsilon_5 Q(u),
\end{aligned} \tag{6.29}$$
with infinitesimal generators
$$V_1 = \frac{\partial}{\partial v}, \quad V_2 = \frac{\partial}{\partial x}, \quad V_3 = \frac{\partial}{\partial y}, \quad V_4 = x\frac{\partial}{\partial v} + P(u)\frac{\partial}{\partial x} \quad\text{and}\quad V_5 = y\frac{\partial}{\partial v} + Q(u)\frac{\partial}{\partial y}.$$
The only non-vanishing brackets are
$$[V_2, V_4] = V_1 \quad\text{and}\quad [V_3, V_5] = V_1$$
so that, regardless of the choice of functions $P$ and $Q$, the abstract Lie algebras or groups are the same although the actions are generically different for different choices of $P$ and $Q$. The coordinate function $u$ is the only invariant and the orbits of this action are 3-dimensional. Therefore, at each point the isotropy subgroup is two dimensional and it is easily seen that, at $x_0 = (u_0, v_0, x_0, y_0)$, the infinitesimal isotropy $\Gamma_{x_0}$ is generated by
$$Z_1 = V_4 - x_0 V_1 - P(u_0) V_2 \quad\text{and}\quad Z_2 = V_5 - y_0 V_1 - Q(u_0) V_3.$$
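The bracket relations among $V_1, \dots, V_5$ can be confirmed mechanically. None of the generators has a $\partial/\partial u$ component and $u$ enters only through $P(u)$ and $Q(u)$, so at a fixed value of $u$ each generator is an affine vector field in $z = (v, x, y)$. The brackets then reduce to matrix algebra. The sketch below relies on that observation; the helper names and the sample values of $P(u)$, $Q(u)$ are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

def field(M, b):
    """Affine vector field (M z + b) . d/dz in z = (v, x, y), at fixed u."""
    return ([[F(e) for e in row] for row in M], [F(e) for e in b])

def bracket(V, W):
    """[V, W] for affine fields: matrix M_W M_V - M_V M_W, constant part M_W b_V - M_V b_W."""
    (MV, bV), (MW, bW) = V, W
    n = len(bV)

    def mul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

    def act(X, c):
        return [sum(X[i][k] * c[k] for k in range(n)) for i in range(n)]

    WV, VW = mul(MW, MV), mul(MV, MW)
    M = [[WV[i][j] - VW[i][j] for j in range(n)] for i in range(n)]
    b = [p - q for p, q in zip(act(MW, bV), act(MV, bW))]
    return (M, b)

ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
P, Q = F(3, 2), F(5, 7)  # hypothetical sample values of P(u), Q(u) at one fixed u

V = {
    1: field(ZERO, [1, 0, 0]),                                # d/dv
    2: field(ZERO, [0, 1, 0]),                                # d/dx
    3: field(ZERO, [0, 0, 1]),                                # d/dy
    4: field([[0, 1, 0], [0, 0, 0], [0, 0, 0]], [0, P, 0]),   # x d/dv + P d/dx
    5: field([[0, 0, 1], [0, 0, 0], [0, 0, 0]], [0, 0, Q]),   # y d/dv + Q d/dy
}
ZERO_FIELD = field(ZERO, [0, 0, 0])

# Collect every bracket [V_i, V_j], i < j, that does not vanish.
nonzero = {}
for i, j in combinations(sorted(V), 2):
    result = bracket(V[i], V[j])
    if result != ZERO_FIELD:
        nonzero[(i, j)] = result

print(sorted(nonzero))
```

Only the pairs $(2,4)$ and $(3,5)$ survive, and in both cases the bracket equals $V_1$, independently of the values chosen for $P$ and $Q$.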
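At $x \in M$, the metric components $g = [g_{ij}]$ of a $G$ invariant metric satisfy the isotropy conditions
$$g\,a_1 + a_1^t\,g = 0 \quad\text{and}\quad g\,a_2 + a_2^t\,g = 0, \tag{6.30}$$
where
$$a_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ P'(u) & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \quad\text{and}\quad a_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ Q'(u) & 0 & 0 & 0 \end{pmatrix}.$$
We find that the solutions to (6.30) are
$$g_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \quad\text{and}\quad g_2 = \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & \dfrac{1}{P'(u)} & 0 \\ 0 & 0 & 0 & \dfrac{1}{Q'(u)} \end{pmatrix}.$$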
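The isotropy conditions (6.30) are linear in $g$ and easy to test numerically. The sketch below assumes that the non-constant entries of $a_1$, $a_2$ and $g_2$ are $P'(u)$, $Q'(u)$ and their reciprocals, with coordinates ordered $(u, v, x, y)$. The sample values and helper names are hypothetical. It also shows that $dx^2$ on its own fails the test, so the conditions are restrictive.

```python
from fractions import Fraction as F

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def isotropy_defect(g, a):
    """Return g a + a^t g; it vanishes exactly when g satisfies the isotropy condition for a."""
    ga, atg = matmul(g, a), matmul(transpose(a), g)
    return [[ga[i][j] + atg[i][j] for j in range(4)] for i in range(4)]

def is_zero(X):
    return all(e == 0 for row in X for e in row)

Pp, Qp = F(3), F(5, 2)  # hypothetical sample values of P'(u) > 0 and Q'(u) > 0

a1 = [[0, 0, 0, 0], [0, 0, 1, 0], [Pp, 0, 0, 0], [0, 0, 0, 0]]
a2 = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [Qp, 0, 0, 0]]

g1 = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]                 # du^2
g2 = [[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 1 / Pp, 0], [0, 0, 0, 1 / Qp]]     # gamma
dx2 = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]                # dx^2 alone

solutions_ok = all(is_zero(isotropy_defect(g, a)) for g in (g1, g2) for a in (a1, a2))
dx2_ok = is_zero(isotropy_defect(dx2, a1))
print(solutions_ok, dx2_ok)
```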
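Thus the kinematic reduction diagram is
$$\begin{array}{ccccc}
(u, A, B) & \xleftarrow{\;q_{\kappa_G}\;} & (x^i, A, B) & \xrightarrow{\;\iota\;} & (x^i, g_{ij}) \\
\downarrow & & \downarrow & & \downarrow \\
(u) & \xleftarrow{\;q_M\;} & (x^i) & \xrightarrow{\;\mathrm{id}\;} & (x^i),
\end{array}$$
and the inclusion map $\iota$ sends $(A, B)$ to $ds^2 = A\,du^2 + B\,\gamma$, where
$$\gamma = -2\,du\,dv + \frac{dx^2}{P'(u)} + \frac{dy^2}{Q'(u)}.$$
The most general $G$ invariant metric is
$$ds^2 = A(u)\,du^2 + B(u)\,\gamma. \tag{6.31}$$
From the form of the most general $G$ invariant symmetric type $\binom{2}{0}$ tensor, we are assured that the reduced field equations take the form
$$\tilde\Delta = \tilde\Delta_{vv}\,\frac{\partial}{\partial v}\otimes\frac{\partial}{\partial v} + \tilde\Delta_{\gamma}\Big(-\frac{\partial}{\partial u}\otimes\frac{\partial}{\partial v} - \frac{\partial}{\partial v}\otimes\frac{\partial}{\partial u} + P'(u)\,\frac{\partial}{\partial x}\otimes\frac{\partial}{\partial x} + Q'(u)\,\frac{\partial}{\partial y}\otimes\frac{\partial}{\partial y}\Big).$$
Every $G$ invariant one-form is a multiple of $du$ so that there is only one non-trivial component to the contracted Bianchi identities and, indeed, by direct computation we find that
$$\mathrm{Div}_{\tilde g}\,\tilde\Delta = \frac{d}{du}\big(B\,\tilde\Delta_{\gamma}\big)\,du.$$
Since this must vanish identically, we conclude that the $\tilde\Delta_{\gamma}$ component of the reduced field equations is of the form
$$\tilde\Delta_{\gamma} = \frac{c}{B},$$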
where $c$ is a constant. Either the constant $c$ is non-zero, in which case the reduced equations are inconsistent and there are no $G$ invariant solutions, or else $c = 0$ and the reduced equations consist of just the single equation $\tilde\Delta_{vv} = 0$. For generally covariant metric theories the case $c \neq 0$ can only arise when the field equations contain a cosmological term [37].
It is easy to check that while the isotropy algebras $\Gamma_{x_0}$ are all two-dimensional abelian subalgebras, on disjoint orbits none are conjugate under the adjoint action of $G$. Hence the group action (6.29) is not simple and consequently the kinematic bundle for this action need not decompose according to (6.27). Indeed, the tensor $\gamma$ cannot be identified with any $G$ invariant quadratic form on the orbits $G/G_{x_0}$.
Example 6.8 (Symplectic Reduction and Group Invariant Solutions). It is important to recognize the fundamental differences between symplectic reduction and Lie symmetry reduction for group invariant solutions of a Hamiltonian system with symmetry. Let $M$ be an even dimensional manifold with symplectic form $\omega$ and let $H : M \to \mathbf{R}$ be the Hamiltonian for a dynamical system on $M$. For the purposes of this example, it suffices to consider reduction by a one dimensional group of Hamiltonian symmetries generated by a vector field $V$ with associated momentum map $J$,
$$V \,\lrcorner\, \omega = dJ. \tag{6.32}$$
In symplectic reduction the reduced space $\hat M$ is obtained by (i) restricting to the submanifold of $M$ defined by $J = \mu \equiv \text{constant}$, and then (ii) quotienting this submanifold by the action of the transformation group generated by $V$. Both $\omega$ and $H$ descend to $\hat M$ and the reduced equations are the associated Hamiltonian system on $\hat M$. Since $\dim \hat M = \dim M - 2$, the reduction in the number of dependent variables is 2. The solutions to the original Hamiltonian equations are obtained from those of the reduced Hamiltonian equations by quadratures.
To compare with symmetry reduction for group invariant solutions, we transcribe Hamilton's equations into the operator-theoretic setting used to construct the kinematic and dynamic reduction diagrams. Let $E = M \times \mathbf{R} \to \mathbf{R}$ be extended phase space so that the differential operator characterizing the canonical equations is the one-form valued operator on $J^1(E)$ defined by
$$\Delta = X \,\lrcorner\, \omega - dH.$$
Here $X$ is the total derivative operator given, in standard canonical coordinates $(u^i, p_i)$ on $M$, by
$$X = \frac{d}{dt} = \frac{\partial}{\partial t} + \dot u^i\,\frac{\partial}{\partial u^i} + \dot p_i\,\frac{\partial}{\partial p_i}.$$
It is not difficult to show that if $V$ is any vector field on $M$, then the prolongation of $V$ to $J^1(E)$ satisfies $[X, \mathrm{pr}^1 V] = 0$ and therefore $V$ is a symmetry of the operator $\Delta$ whenever $V$ is a symmetry of $\omega$ and $H$. Since $V$ is a vertical vector field on $E$ it is "all isotropy" and the kinematic bundle is the fixed point set for the flow of $V$,
$$\kappa_\Gamma(E) = \{\,(t, u^i, p_i) \mid V(u^i, p_i) = 0\,\}.$$
The dimension of $\kappa_\Gamma(E)$ therefore depends upon the choice of $V$ and is generally less than the dimension of $E$ by more than 2 (the decrease in the dimension in the case of symplectic reduction). In short, it is not possible to identify the fibers of the kinematic bundle with the reduced phase space $\hat M$. Moreover, from (6.32), it follows that points in $\kappa_\Gamma(E)$ always correspond to points on the singular level sets of the momentum map and, typically, to points where the level sets fail to be a manifold. Thus the invariant solutions are problematic from the viewpoint of symplectic reduction and are subject to special treatment. See, for example, [5] and [20]. Finally, there is no guarantee that the reduced equations for the group invariant solutions possess any natural inherited Hamiltonian formulation.
We illustrate these general observations with some specific examples. First, if $V$ is a translation symmetry of a mechanical system, then $J$ is a linear function and symplectic reduction yields all the solutions to Hamilton's equations with a given fixed value for the first integral $J$. Since the vector field $V$ never vanishes, the kinematic bundle is empty and there are no group invariant solutions.
Second, for the classical 3-dimensional central force problem
$$\ddot u = -f(\rho)\,u, \qquad \ddot v = -f(\rho)\,v, \qquad \ddot w = -f(\rho)\,w,$$
where $\rho = \sqrt{u^2 + v^2 + w^2}$, the extended phase space $E$ is $\mathbf{R} \times \mathbf{R}^6 \to \mathbf{R}$ with coordinates $(t, u, v, w, p_u, p_v, p_w) \to (t)$, the symplectic structure on phase space is $\omega = du \wedge dp_u + dv \wedge dp_v + dw \wedge dp_w$ and the Hamiltonian is $H = \tfrac12(p_u^2 + p_v^2 + p_w^2) + \phi(\rho)$, where $\phi'(\rho) = \rho f(\rho)$. The vector field
$$V = -u\,\frac{\partial}{\partial v} + v\,\frac{\partial}{\partial u} - p_u\,\frac{\partial}{\partial p_v} + p_v\,\frac{\partial}{\partial p_u}$$
is a Hamiltonian symmetry. The kinematic bundle for the $V$ invariant sections of $E$ is
$$\begin{array}{ccccc}
(t, w, p_w) & \xleftarrow{\;\mathrm{id}\;} & (t, w, p_w) & \xrightarrow{\;\iota\;} & (t, u, v, w, p_u, p_v, p_w) \\
\pi\downarrow & & \pi\downarrow & & \pi\downarrow \\
(t) & \xleftarrow{\;\mathrm{id}\;} & (t) & \xrightarrow{\;\mathrm{id}\;} & (t),
\end{array} \tag{6.33}$$
where $\iota(t, w, p_w) = (t, 0, 0, w, 0, 0, p_w)$, the invariant sections are of the form $t \to (0, 0, w(t), 0, 0, p_w(t))$, and the reduced differential operator for the $V$ invariant solutions is
$$\tilde\Delta = (\dot w - p_w)\,dp_w - (\dot p_w + w f(|w|))\,dw.$$
Let us compare this state of affairs with that obtained by symplectic reduction based upon the Hamiltonian vector field $V$. The momentum map associated to this symmetry is the angular momentum
$$J = -u\,p_v + v\,p_u.$$
The level sets $J = \mu$ are manifolds except for $\mu = 0$. The level set $J = 0$ is the product of a plane and a cone whose vertex is precisely the fiber of the kinematic bundle.
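Both facts used here can be seen in a small numerical experiment: $J$ is a first integral of the full system, and initial data in the image of $\iota$ stay there. The sketch below integrates Hamilton's equations with a classical fourth-order Runge-Kutta step. The force law $f(\rho) = 1/(1+\rho)$, the step size and the initial data are arbitrary choices made for illustration only.

```python
import math

def rhs(state, f):
    """Hamilton's equations for H = (p_u^2 + p_v^2 + p_w^2)/2 + phi(rho), with phi' = rho f."""
    u, v, w, pu, pv, pw = state
    k = f(math.sqrt(u * u + v * v + w * w))
    return (pu, pv, pw, -k * u, -k * v, -k * w)

def rk4_step(state, h, f):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, d, c):
        return tuple(a + c * b for a, b in zip(s, d))
    k1 = rhs(state, f)
    k2 = rhs(shifted(state, k1, h / 2), f)
    k3 = rhs(shifted(state, k2, h / 2), f)
    k4 = rhs(shifted(state, k3, h), f)
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, f, h=0.001, steps=2000):
    for _ in range(steps):
        state = rk4_step(state, h, f)
    return state

def angular_momentum(state):
    """Momentum map J = -u p_v + v p_u."""
    u, v, w, pu, pv, pw = state
    return -u * pv + v * pu

f = lambda rho: 1.0 / (1.0 + rho)             # hypothetical force law
generic = (1.0, 0.0, 0.5, 0.0, 0.8, 0.3)       # J = -0.8, away from the kinematic bundle
invariant = (0.0, 0.0, 0.5, 0.0, 0.0, 0.3)     # a point in the image of iota

final_generic = integrate(generic, f)
final_invariant = integrate(invariant, f)

drift = abs(angular_momentum(final_generic) - angular_momentum(generic))
stays_invariant = all(final_invariant[k] == 0.0 for k in (0, 1, 3, 4))
print(drift < 1e-8, stays_invariant)
```

The invariant trajectory keeps $u = v = p_u = p_v = 0$ exactly, since every stage of the integrator multiplies these components by zero.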
To implement the symplectic reduction, we introduce canonical cylindrical coordinates $(r, \theta, w, p_r, p_\theta, p_w)$, where
$$u = r\cos\theta, \quad v = r\sin\theta, \quad p_u = p_r\cos\theta - \frac{p_\theta}{r}\sin\theta, \quad p_v = p_r\sin\theta + \frac{p_\theta}{r}\cos\theta.$$
Note that this change of coordinates fails precisely at points of the kinematic bundle. In terms of these phase space coordinates, the symplectic structure is still in canonical form $\omega = dr \wedge dp_r + d\theta \wedge dp_\theta + dw \wedge dp_w$, the Hamiltonian is
$$H = \frac12\Big(p_r^2 + \frac{1}{r^2}\,p_\theta^2 + p_w^2\Big) + \phi\big(\sqrt{r^2 + w^2}\big),$$
and the momentum map is $J = -p_\theta$. We can therefore describe the symplectic reduction of $E$ by the diagram
$$\begin{array}{ccccc}
(t, r, w, p_r, p_w) & \longleftarrow & (t, r, \theta, w, p_r, p_w) & \xrightarrow{\;\iota\;} & (t, r, \theta, w, p_r, p_\theta, p_w) \\
\downarrow & & \downarrow & & \downarrow \\
(t) & \longleftarrow & (t) & \longrightarrow & (t).
\end{array}$$
The reduced symplectic structure is then $\hat\omega = dr \wedge dp_r + dw \wedge dp_w$, the reduced Hamiltonian is
$$\hat H = \frac12\Big(p_r^2 + \frac{\mu^2}{r^2} + p_w^2\Big) + \phi\big(\sqrt{r^2 + w^2}\big),$$
and the reduced equations of motion are
$$\dot r = p_r, \quad \dot p_r = -r f\big(\sqrt{r^2 + w^2}\big) + \frac{\mu^2}{r^3}, \quad \dot w = p_w, \quad \dot p_w = -w f\big(\sqrt{r^2 + w^2}\big).$$
Given a choice of $\mu$ and solutions to these reduced equations, we get a solution to the full equations by the quadrature $\theta(t) = -\mu \int r(t)^{-2}\,dt + \text{const}$, which follows from $\dot\theta = \partial H/\partial p_\theta = p_\theta/r^2$ with $p_\theta = -\mu$.
7. Appendix
We summarize a few technical points concerning group actions on fiber bundles and the construction of the kinematic and dynamic reduction diagrams. For details, see [3].
A. Transversality and Regularity. Let $G$ be a finite dimensional Lie group acting projectably on a bundle $\pi : E \to M$. We say that $G$ acts transversally on $E$ if, for each fixed $p \in E$ and each fixed $g \in G$, the equation
$$\pi(g \cdot p) = \pi(p) \quad\text{implies that}\quad g \cdot p = p. \tag{7.1}$$
Thus each orbit of $G$ intersects each fiber of $E$ exactly once. For transverse group actions the orbits of $G$ in $E$ are diffeomorphic to the orbits of $G$ in $M$ under the projection map $\pi : E \to M$. Projectable, transverse actions always satisfy the infinitesimal transversality condition (2.4) but the converse is easily seen to be false.
Let us say that the action of $G$ on $M$ is regular if the quotient space $\tilde M = M/G$ is a smooth manifold and the quotient map $q_M : M \to \tilde M$ defines $M$ as a bundle over $\tilde M$. The construction of the orbit manifold $\tilde M$ is discussed in various texts, for example, [1], [4], [29], [31]. The assumption that the action of $G$ on $M$ is regular is a standard assumption in Lie symmetry reduction. For simplicity we suppose that $\tilde M$ is a manifold without boundary but, as Example 6.4 shows, this assumption can be relaxed in applications.
The fundamental properties of transverse group actions are described in the following theorem which is proved in [3].
Theorem 7.1 (The Regularity Theorem for Transverse Group Actions). Let $G$ be a Lie group which acts projectably and transversally on the bundle $\pi : E \to M$. Suppose that $G$ acts regularly on $M$.
(i) Then $G$ acts regularly on $E$ and $\tilde E = E/G$ is a bundle over $\tilde M$.
(ii) If the orbit manifold $\tilde M$ is Hausdorff, then the orbit manifold $\tilde E$ is also Hausdorff.
(iii) The bundle $E$ can be identified with the pullback of the bundle $\tilde\pi : \tilde E \to \tilde M$ via the quotient map $q_M : M \to \tilde M$.
(iv) Let $\tilde U$ be an open set in $\tilde M$ and let $U = q_M^{-1}(\tilde U)$. There is a one-to-one correspondence between the smooth $G$ invariant sections of $E$ over $U$ and the sections of $\tilde E$ over $\tilde U$.
B. Transversality and the Kinematic Bundle. Lemma 3.2 implies that the action of $G$ on $E$ always restricts to a transverse action on the set $\kappa_G(E)$. In fact, it is not difficult to characterize $\kappa_G(E)$ as the largest subset of $E$ on which $G$ acts transversally or, alternatively, as the smallest set through which all locally defined invariant sections factor. For Lie symmetry reduction without transversality the assumption that $\kappa_G(E)$ is an imbedded subbundle of $E$ now replaces the infinitesimal transversality condition (2.4) as the underlying hypothesis for the action of $G$ on $E$ (together, of course, with the regularity of the action of $G$ on $M$). In particular, the assumption that the dimension of $\kappa_{G,x}(E)$ is constant as $x$ varies over $M$ is clearly a necessary condition if one hopes to parameterize the space of $G$ invariant local sections of $E$ in terms of a fixed number of arbitrary functions.
There are a variety of general results which one can apply to check whether $\kappa_G(E)$ is a subbundle of $E$. To begin with, if $x, y \in M$ lie on the same $G$ orbit, that is, if $y = g \cdot x$ for some $g \in G$, then it is not difficult to prove that
$$\kappa_{G,y}(E) = g \cdot \kappa_{G,x}(E).$$
By virtue of this observation it suffices to check that the restrictions of $\kappa_G(E)$ to the cross-sections of the action of $G$ on $M$ are subbundles. For Lie group actions $G$ which admit slices on $M$, it is not difficult to establish (see [4]) that the kinematic bundles for the induced actions on tensor bundles of $M$ always exist. For compact groups acting by isometries on hermitian vector bundles the existence of the kinematic bundle is established in [11].
Granted that $\kappa_G(E) \to M$ is a bundle, Theorem 3.3 now follows from Theorem 7.1. Theorem 7.1 also shows that there is considerable redundancy in the hypothesis of Theorem 3.3. We emphasize that the action of $G$ on $E$ itself need not be regular in order to construct a smooth kinematic reduction diagram. This is well illustrated by Example 19 in Lawson [26, p. 23].
C. The Bundle of Invariant Jets. The following theorem summarizes the key properties of the bundle $\mathrm{Inv}^k(E) \to M$.
Theorem 7.2. Let $G$ be a projectable group action on $\pi : E \to M$ and suppose that $E$ admits a smooth kinematic reduction diagram (3.5).
(i) Then $\mathrm{Inv}^k(E)$ is a $G$ invariant embedded submanifold of $J^k(E)$.
(ii) The action of $G$ on $\mathrm{Inv}^k(E)$ is transverse and regular.
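(iii) The quotient manifold $\mathrm{Inv}^k(E)/G$ is diffeomorphic to $J^k(\tilde\kappa_G(E))$ and the diagram
$$\begin{array}{ccccc}
J^k(\tilde\kappa_G(E)) & \xleftarrow{\;q_{\mathrm{Inv}}\;} & \mathrm{Inv}^k(E) & \xrightarrow{\;\iota^k\;} & J^k(E) \\
\downarrow & & \downarrow & & \downarrow \\
\tilde M & \xleftarrow{\;q_M\;} & M & \xrightarrow{\;\mathrm{id}\;} & M
\end{array}$$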
commutes.
This theorem implies that the same hypothesis on the action of $G$ on the bundle $\pi : E \to M$ needed to ensure that the kinematic reduction diagram is a diagram of smooth manifolds and maps also ensures that the bottom row of the dynamical reduction diagram (4.2) exists. Therefore to guarantee the smoothness of the entire dynamic reduction diagram one need only assume, in addition, that $D_{\mathrm{Inv}}$ is a subbundle of $D$.
D. The Automorphism Group of the Kinematic Bundle. For computations of the automorphism group of the kinematic bundle it is often advantageous to use the fact that $\tilde G^*$ fixes every $G$ invariant section of $E$, that $\tilde G$ preserves the space of $G$ invariant sections and that, conversely, under very mild assumptions, these properties characterize these groups.
Theorem 7.3. Assume that there is a $G$ invariant section through each point of $\kappa_G(E)$. Then the group $\tilde G^*$ coincides with the subgroup of $G$ which fixes every invariant section of $E$,
$$\tilde G^* = \{\, a \in G \mid a \cdot s = s \text{ for all } G \text{ invariant sections } s : M \to E \,\}$$
and the group $\tilde G$ coincides with the subgroup of $G$ which preserves the set of $G$ invariant sections of $E$,
$$\tilde G = \{\, a \in G \mid a \cdot s \text{ is } G \text{ invariant for all } G \text{ invariant sections } s : M \to E \,\}.$$
References
1. Abraham, R. and Marsden, J.: Foundations of Mechanics. 2nd ed., Reading, MA: Benjamin-Cummings, 1978
2. Anderson, I.M. and Fels, M.E.: Symmetry Reduction of variational bicomplexes and the principle of symmetry criticality. Am. J. Math. 112, 609–670 (1997)
3. Anderson, I.M. and Fels, M.E.: Transverse group actions on bundles. In preparation
4. Anderson, I.M., Fels, M.E., Torre, C.G.: Symmetry Reduction of Differential Equations. In preparation
5. Arms, J.A., Gotay, M.J., Jennings, G.: Geometric and algebraic reduction for singular momentum maps. Adv. in Math. 79, 43–103 (1990)
6. Beckers, J., Harnad, J., Perrod, M. and Winternitz, P.: Tensor fields invariant under subgroups of the conformal group of space-time. J. Math. Phys. 19 (10), 2126–2153 (1978)
7. Beckers, J., Harnad, J. and Jasselette, P.: Spinor fields invariant under space-time transformations. J. Math. Phys. 21 (10), 2491–2499 (1979)
8. Bleecker, D.D.: Critical mappings of Riemannian manifolds. Trans. Am. Math. Soc. 254, 319–338 (1979)
9. Bleecker, D.D.: Critical Riemannian manifolds. J. Diff. Geom. 14, 599–608 (1979)
10. Bluman, G.W. and Kumei, S.: Symmetries and Differential Equations. Applied Mathematical Sciences, 81, New York–Berlin: Springer-Verlag, 1989
11. Brüning, J. and Heintze, E.: Representations of compact Lie groups and elliptic operators. Invent. Math. 50, 169–203 (1979)
12. Bondi, H., Pirani, F., Robinson, I.: Gravitational waves in general relativity III. Exact plane waves. Proc. Roy. Soc. London A 251, 519–533 (1959)
13. Coquereaux, R. and Jadczyk, A.: Riemannian Geometry, Fiber Bundles, Kaluza–Klein Theories and all that. Lecture Notes in Physics 16, Singapore: World Scientific, 1988
14. David, D., Kamran, N., Levi, D. and Winternitz, P.: Symmetry reduction for the Kadomtsev–Petviashvili equation using a loop algebra. J. Math. Phys. 27, 1225–1237 (1986)
15. Eells, J. and Ratto, A.: Harmonic Maps and Minimal Immersions with Symmetries. Annals of Mathematical Studies 130, Princeton: Princeton Univ. Press, 1993
16. Fels, M.E. and Olver, P.J.: On relative invariants. Math. Ann. 308, 609–670 (1997)
17. Fels, M.E.: Symmetry reductions of the Euler equations. In preparation
18. Fushchich, W.I., Shtelen, W.M., Slavutsky, S.L.: Reduction and exact solutions of the Navier–Stokes equations. Topology 15, 165–188 (1976)
19. Gaeta, G. and Morando, P.: Michel theory of symmetry breaking and gauge theories. Ann. Phys. 260, 149–170 (1997)
20. Gotay, M.J. and Bos, L.: Singular angular momentum mappings. J. Diff. Geom. 24, 181–203 (1986)
21. Grundland, A.M., Winternitz, P., Zakrzewski, W.J.: On the solutions of the $CP^1$ model in (2+1) dimensions. J. Math. Phys. 37 (3), 1501–1520 (1996)
22. Harnad, J., Schnider, S. and Vinet, L.: Solution to Yang–Mills equations on $M^4$ under subgroups of O(4, 2). In: Complex Manifold Techniques in the Theoretical Physics. (Proc. Workshop, Lawrence, Kan. 1978). Research Notes in Math. 32, Boston: Pitman, 1979, pp. 219–230
23. Ibragimov, N.H.: CRC Handbook of Lie Group Analysis of Differential Equations. Volume 1, Symmetries, Exact Solutions and Conservation Laws. Boca Raton, FL: CRC Press, 1995
24. Jackiw, R. and Rebbi, C.: Conformal properties of a Yang–Mills pseudoparticle. Phys. Rev. D 14, 517–523 (1976)
25. Kovalyov, M., Légaré, M. and Gagnon, L.: Reductions by isometries of the self-dual Yang–Mills equations in four-dimensional Euclidean space. J. Math. Phys. 34 (7), 3245–3267 (1993)
26. Lawson, H.B.: Lectures on Minimal Submanifolds. Mathematics Lecture Series 9, Berkeley: Publish or Perish, 1980
27. Légaré, M. and Harnad, J.: SO(4) reduction of the Yang–Mills equations for the classical gauge group. J. Math. Phys. 25 (5), 1542–1547 (1984)
28. Légaré, M.: Invariant spinors and reduced Dirac equations under subgroups of the Euclidean group in four-dimensional Euclidean space. J. Math. Phys. 36 (6), 2777–1791 (1995)
29. Olver, P.J.: Applications of Lie Groups to Differential Equations. (Second Ed.) New York: Springer, 1986
30. Ovsiannikov, L.V.: Group Analysis of Differential Equations. New York: Academic Press, 1982
31. Palais, R.S.: A Global Formulation of the Lie theory of Transformation Groups. Memoirs of the Am. Math. Soc. 22, Providence, RI: Am. Math. Soc., 1957
32. Palais, R.S.: The principle of symmetric criticality. Commun. Math. Phys. 69, 19–30 (1979)
33. Palais, R.S.: Applications of the symmetric criticality principle in mathematical physics and differential geometry. In: Proc. U.S.–China Symp. on Differential Geometry and Differential Equations II, 1985
34. Rogers, C. and Shadwick, W.: Nonlinear boundary value problems in science and engineering. Mathematics in Science and Engineering 183, Boston: Academic Press, 1989
35. Smith, R.T.: Harmonic mappings of spheres. Am. J. of Math. 97, 364–385 (1975)
36. Stephani, H.: Differential Equations and their Solutions using Symmetries. Cambridge: Cambridge University Press, 1989
37. Torre, C.G.: Gravitational waves: Just plane symmetry. Preprint gr-qc/9907089
38. Tóth, G.: Harmonic and Minimal Immersions through Representation Theory. Perspectives in Math. Boston: Academic Press, 1990
39. Urakawa, H.: Equivariant harmonic maps between compact Riemannian manifolds of cohomogeneity 1. Michigan Math. J. 40, 27–50 (1993)
40. Vorob'ev, E.M.: Reduction of quotient equations for differential equations with symmetries. Acta Appl. Math. 23, 1999 (1991)
41. Winternitz, P.: Group theory and exact solutions of partially integrable equations. In: Partially Integrable Evolution Equations, R. Conte and N. Boccara, eds. Dordrecht: Kluwer Academic Publishers, 1990, pp. 515–567
42. Winternitz, P., Grundland, A.M., Tuszyński, J.A.: Exact solutions of the multidimensional classical $\phi^6$-field equations obtained by symmetry reduction. J. Math. Phys. 28, 2194–2212 (1987)
Communicated by H. Araki
Commun. Math. Phys. 212, 687 – 701 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
On Baxter's Q-Operator for the XXX Spin Chain
G. P. Pronko
Institute for High Energy Physics, Protvino, Moscow reg. 142284, Russia and International Solvay Institute, Brussels, Belgium
Received: 5 September 1999 / Accepted: 10 February 2000
Abstract: We discuss the construction of Baxter's Q-operator. The suggested approach leads to a one-parametric family of Q-operators satisfying wronskian-type relations. We have also found a generalization of the Baxter operators which defines the nondiagonal part of the monodromy.
1. Introduction
Long ago, while considering the XYZ spin chain, Baxter [1] introduced the so-called Q-operator, which permitted him to find the eigenvalues of the transfer matrix $t(x)$ in spite of the absence of the Bethe Ansatz for this spin chain. This object may be defined by the following operator equation:
$$t(x)\,Q(x) = a(x)\,Q(x + i) + b(x)\,Q(x - i), \tag{1}$$
together with the requirements
$$[t(x), Q(y)] = [Q(x), Q(y)] = 0. \tag{2}$$
Recently, in a series of papers, Bazhanov, Lukyanov and Zamolodchikov [2] have given an explicit construction of such operators for the case of a certain integrable field model. Moreover, their construction definitely gives the pair of operators $Q_\pm(x)$, satisfying apart from (1) and (2) also
$$[Q_+(x), Q_-(y)] = 0 \tag{3}$$
and a certain "wronskian" relation, which becomes the origin of the various fusion relations. However, the extension of their results for the six-vertex spin chain requires an external magnetic field which cannot be eliminated by the limiting procedure. Therefore in the simplest case of the XXX spin chain we do not know the $Q_\pm(x)$-operators, though from
the point of view of the quantum inverse scattering method (QISM) [3] their construction should be universal for any integrable system. In [4] we investigated Eq. (1) for the eigenvalues of the transfer matrix in the cases of XXX and XXZ and proved that there exists a pair of solutions (we called them $Q(x)$ and $P(x)$) which are polynomials (or trigonometric polynomials for the XXZ case) in the spectral parameter and which also satisfy "wronskian" relations.
In the present paper we give the explicit operator construction of the one-parametric solution of Eq. (1) and also the solutions of the generalized equation, where instead of the transfer matrix, the whole monodromy matrix enters. To simplify the discussion we shall consider the case of quantum spin 1/2. The generalization for arbitrary quantum spin as well as for the inhomogeneous chain are more or less straightforward.
In the framework of QISM [3], the monodromy matrix $T^l(x)$ is defined as the ordered product of L-operators, acting in the $2 \times (2l + 1)$-dimensional space:
$$L^l_n(x) = x + 2i\,s^a_n L^a, \tag{4}$$
where $s^a_n$ are the operators of quantum spin 1/2, while the $L^a$ operators act in the auxiliary $(2l + 1)$-dimensional space. The monodromy matrix is then given by
$$T^l(x) = \prod_{n=1}^{N} L^l_n(x), \tag{5}$$
where $N$ is the length of the chain and the transfer matrix $t^l(x)$ is the trace of $T^l(x)$ over the auxiliary space:
$$t^l(x) = \mathrm{Tr}\,T^l(x). \tag{6}$$
Note that for the case of the isotropic XXX spin chain, the monodromy matrix $T^l(x)$ is a scalar with respect to simultaneous rotation in the quantum and auxiliary spaces
$$[\mathbf{S} + \mathbf{L}, T^l(x)] = 0, \tag{7}$$
where
$$\mathbf{S} = \sum_{n=1}^{N} \mathbf{s}_n. \tag{8}$$
Therefore the transfer matrix $t^l(x)$ is a scalar with respect to quantum spin:
$$[\mathbf{S}, t^l(x)] = 0. \tag{9}$$
The full set of commutation relations between matrix elements of the monodromy matrix with different spectral parameters is contained in the following equation [3]:
$$R^{l,l'}(x - y)\,T^l(x)\,T^{l'}(y) = T^{l'}(y)\,T^l(x)\,R^{l,l'}(x - y), \tag{10}$$
where the monodromies $T^l(x)$ and $T^{l'}(y)$ have a common quantum space and different auxiliary spaces. The R-matrix, which acts in the tensor product of auxiliary spaces with dimension $(2l + 1) \times (2l' + 1)$, is a function of the total auxiliary spin $\mathbf{J} = \mathbf{L} + \mathbf{L}'$ [3]:
$$R^{l,l'}(x) = e^{i\pi J}\,\frac{\Gamma(J + 1 - ix)}{\Gamma(J + 1 + ix)}, \tag{11}$$
where the operator $J$ is given by:
$$J = \big(\mathbf{J}^2 + 1/4\big)^{1/2} - 1/2. \tag{12}$$
The same commutation relations as (10) are valid also for L-operators:
$$R^{l,l'}(x - y)\,L^l(x)\,L^{l'}(y) = L^{l'}(y)\,L^l(x)\,R^{l,l'}(x - y). \tag{13}$$
Equation (10) has many important corollaries, among which there are so-called fusion relations. The latter play the key role in what follows. One of these fusion relations for the transfer matrix has the following form:
$$t^{1/2}(x)\,t^l\big(x + i(l + 1/2)\big) = \Big(x + \frac{i}{2}\Big)^N t^{l+1/2}(x + il) + \Big(x - \frac{i}{2}\Big)^N t^{l-1/2}\big(x + i(l + 1)\big). \tag{14}$$
If we denote as $A(x, l)$ the transfer matrix with shifted spectral parameter
$$A(x, l) = t^l\big(x + i(l + 1/2)\big), \tag{15}$$
relation (14) will take the form
$$t^{1/2}(x)\,A(x, l) = \Big(x + \frac{i}{2}\Big)^N A(x - i, l + 1/2) + \Big(x - \frac{i}{2}\Big)^N A(x + i, l - 1/2), \tag{16}$$
which is very similar to the defining relation for the Baxter Q-operator in the case of the XXX spin chain [4]:
$$t^{1/2}(x)\,Q(x) = \Big(x + \frac{i}{2}\Big)^N Q(x - i) + \Big(x - \frac{i}{2}\Big)^N Q(x + i). \tag{17}$$
The difference between (16) and (17) is due to the shift of the second argument of $A(x, l)$ in the r.h.s. of (16). To eliminate this difference we shall use the following trick. Let us forget for a moment that $l$ denotes the representation of auxiliary spin and takes only integer or half-integer values, and consider $A(x, l)$ as a function of two complex arguments. Then the new function, which is defined as
$$Q(x, l) = A(x, l + ix/2), \tag{18}$$
apparently satisfies the relation
$$t^{1/2}(x)\,Q(x, l) = \Big(x + \frac{i}{2}\Big)^N Q(x - i, l) + \Big(x - \frac{i}{2}\Big)^N Q(x + i, l). \tag{19}$$
In such a way we obtain a one-parametric family of operators satisfying the Baxter equation (17). Of course, this construction cannot be considered as rigorous, because the analytic continuation from a countable set of points into the complex plane is not unique and we have used this trick only to illustrate the idea. In what follows we shall give a more careful construction of the Q-operator, based on this idea. However, if we impose the condition that after the continuation into the complex $l$-plane the operator $A(x, l)$ remains a polynomial in $l$, then this continuation becomes unique and this trick gives an effective way to calculate the eigenvalues of the Q-operators via the eigenvalues of the transfer matrices $t^l(x)$.
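The trick can be tried out on the smallest possible chain. For $N = 1$ the trace in (6) of the L-operator (4) over the spin-$l$ space is $t^l(x) = (2l + 1)\,x$ times the identity, because the spin matrices $L^a$ are traceless. This closed form is our own illustration and does not appear in the text. It is a polynomial in $l$, so it continues uniquely to complex $l$, and both the fusion relation (14) and the Baxter-type relation (19) can be checked numerically. A minimal sketch, with hypothetical function names:

```python
N = 1  # chain length assumed in this illustration

def t(l, x):
    """Transfer matrix eigenvalue for N = 1: Tr over spin l of (x + 2i s.L) = (2l + 1) x."""
    return (2 * l + 1) * x

def A(x, l):
    """Shifted transfer matrix, Eq. (15)."""
    return t(l, x + 1j * (l + 0.5))

def Q(x, l):
    """Candidate Q-function, Eq. (18), with l continued to complex values."""
    return A(x, l + 0.5j * x)

def fusion_residual(x, l):
    """Left side minus right side of the fusion relation (14)."""
    lhs = t(0.5, x) * t(l, x + 1j * (l + 0.5))
    rhs = ((x + 0.5j) ** N * t(l + 0.5, x + 1j * l)
           + (x - 0.5j) ** N * t(l - 0.5, x + 1j * (l + 1)))
    return lhs - rhs

def baxter_residual(x, l):
    """Left side minus right side of the Baxter-type relation (19)."""
    lhs = t(0.5, x) * Q(x, l)
    rhs = (x + 0.5j) ** N * Q(x - 1j, l) + (x - 0.5j) ** N * Q(x + 1j, l)
    return lhs - rhs

samples = [(0.7 + 0.2j, 0.5), (1.3 - 0.4j, 1.0), (-2.1 + 0.0j, 1.5), (0.3 + 1.1j, 0.25 - 0.6j)]
max_fusion = max(abs(fusion_residual(x, l)) for x, l in samples)
max_baxter = max(abs(baxter_residual(x, l)) for x, l in samples)
print(max_fusion < 1e-9, max_baxter < 1e-9)
```

The last sample uses a non-physical complex $l$, which is exactly the regime the continuation (18) requires.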
2. The L-Operators

The discussion of the previous section made it clear that for the construction of the Q-operator we need a complex spin in the auxiliary space. Also, we shall look for the Q-operator in the form of the trace of the appropriate monodromy:
\[ Q(x) = \mathrm{Tr}\,\hat{Q}(x), \tag{20} \]
where the operator $\hat{Q}(x)$ acts in the tensor product of the quantum space of $s = 1/2$ and an infinite-dimensional auxiliary space. As we shall see, for our purpose it is sufficient that this space is a representation of the algebra
\[ [\rho_i, \rho_j^{+}] = \delta_{ij}, \qquad i, j = 1, 2. \tag{21} \]
The operator $\hat{Q}(x)$ will be given by the ordered product
\[ \hat{Q}(x) = \prod_{n=1}^{N} L_n(x), \tag{22} \]
where $L_n(x)$ are operators depending on $\rho$ and $\rho^{+}$ and acting in the space of the $n$th quantum spin. Further we shall need to consider the product $\bigl(L^{1/2}_n(x)\bigr)_{ij} L_n(x)$, which acts in the product of the infinite-dimensional auxiliary space (for $L_n(x)$) and a two-dimensional auxiliary space (for $L^{1/2}_n(x)$). In this space it is convenient to consider a pair of projectors $\Pi^{\pm}_{ij}$:
\[ \begin{aligned}
\Pi^{+}_{ij} &= (\rho^{+}\rho + 1)^{-1}\rho_i\rho_j^{+} = \rho_i\rho_j^{+}(\rho^{+}\rho + 1)^{-1}, \\
\Pi^{-}_{ij} &= (\rho^{+}\rho + 1)^{-1}\epsilon_{il}\rho_l^{+}\epsilon_{jm}\rho_m = \epsilon_{il}\rho_l^{+}\epsilon_{jm}\rho_m(\rho^{+}\rho + 1)^{-1},
\end{aligned} \tag{23} \]
where
\[ \rho^{+}\rho = \rho_i^{+}\rho_i, \qquad \epsilon_{ij} = -\epsilon_{ji}, \qquad \epsilon_{12} = 1. \tag{24} \]
These projectors formally satisfy the following relations:
\[ \Pi^{\pm}_{ik}\Pi^{\pm}_{kj} = \Pi^{\pm}_{ij}, \qquad \Pi^{+}_{ik}\Pi^{-}_{kj} = 0, \qquad \Pi^{+}_{ij} + \Pi^{-}_{ij} = \delta_{ij}. \tag{25} \]
Rigorously speaking, the r.h.s. of the first equation in (25) has, in the Fock representation, an extra term proportional to the projector on the vacuum state, but, as we shall see below, this term is irrelevant in the present discussion. In order to define the Q-operator which satisfies the Baxter equation (17) we shall exploit Baxter’s idea [1], which we reformulate as follows: the operator $L_n(x)$ should satisfy the relation
\[ \Pi^{-}_{ij}\,\bigl(L^{1/2}_n(x)\bigr)_{jl}\,L_n(x)\,\Pi^{+}_{lk} = 0. \tag{26} \]
If this condition is fulfilled, then
\[ \bigl(L^{1/2}_n(x)\bigr)_{ij} L_n(x) = \Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{+}_{lj} + \Pi^{-}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{-}_{lj} + \Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{-}_{lj}. \tag{27} \]
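In other words, condition (26) guarantees that the r.h.s. of (27) has a triangular form in the sense of the projectors $\Pi^{\pm}$, and this form is conserved for products over $n$ due to the orthogonality of the projectors. From (26) we obtain
\[ \epsilon_{jm}\rho_m\bigl(x\delta_{jk} + i s_n^{a}\sigma^{a}_{jk}\bigr) L_n(x)\rho_k = 0, \tag{28} \]
or
\[ L_n(x)\rho_k = \bigl(i s_n^{a}\sigma^{a}_{kl} - (x - i)\delta_{kl}\bigr)\rho_l A_n(x) \tag{29} \]
and
\[ \epsilon_{jm}\rho_m\bigl(x\delta_{jk} + i s_n^{a}\sigma^{a}_{jk}\bigr) L_n(x) = B_n(x)\,\epsilon_{kl}\rho_l, \tag{30} \]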
where $A_n(x)$ and $B_n(x)$ are some operators which we shall find now. Making use of (29), let us rewrite the first term on the r.h.s. of (27) in the following form:
\[ \Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{+}_{lj} = -(x + i/2)(x - 3i/2)\,\rho_i A_n(x)\rho_j^{+}(\rho^{+}\rho + 1)^{-1}. \tag{31} \]
This equation may also be written as
\[ \Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{+}_{lj} = (x + i/2)\,\rho_i L_n(x - i)\rho_j^{+}(\rho^{+}\rho + 1)^{-1}, \tag{32} \]
provided the operator $A_n(x)$ is given by
\[ A_n(x) = -(x - 3i/2)^{-1} L_n(x - i). \tag{33} \]
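The scalar factor $-(x + i/2)(x - 3i/2)$ in (31) can be checked numerically. The sketch below assumes that $L^{1/2}_n(x) = x + i s_n^{a}\sigma^{a}$ with $s_n^{a} = \sigma_n^{a}/2$, and that the matrix multiplying $\rho_l A_n(x)$ in (29) is $i s_n^{a}\sigma^{a} - (x - i)$; the helper names are illustrative only. It verifies that $(x + i s\sigma)(x - i - i s\sigma)$ is the scalar $(x + i/2)(x - 3i/2)$ on the four-dimensional product of the quantum spin and the two-dimensional auxiliary space.

```python
# Pauli matrices as nested lists (pure Python, no external packages).
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)]
            for i in range(4)]


def spin_dot_sigma():
    """Sum over a of s^a (x) sigma^a, with s^a = sigma^a / 2 for the quantum spin."""
    total = [[0j] * 4 for _ in range(4)]
    for sigma in PAULI:
        term = kron(sigma, sigma)
        for i in range(4):
            for j in range(4):
                total[i][j] += 0.5 * term[i][j]
    return total


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def max_deviation(x):
    """Largest entry of (x + i s.sigma)(x - i - i s.sigma) - (x + i/2)(x - 3i/2) * 1."""
    ss = spin_dot_sigma()
    left = [[(x if i == j else 0) + 1j * ss[i][j] for j in range(4)]
            for i in range(4)]
    right = [[((x - 1j) if i == j else 0) - 1j * ss[i][j] for j in range(4)]
             for i in range(4)]
    product = matmul(left, right)
    scalar = (x + 0.5j) * (x - 1.5j)
    return max(abs(product[i][j] - (scalar if i == j else 0))
               for i in range(4) for j in range(4))


# The deviation should vanish up to rounding for any complex spectral parameter.
print(max_deviation(0.37 + 1.2j))
```

The check relies only on $(s\sigma)^2 = 3/4 - s\sigma$, which holds because $s\sigma$ has eigenvalues $1/2$ and $-3/2$.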
Substituting (33) into (29) we obtain the desired equation for the L-operator:
\[ \bigl(s_n^{a}\sigma^{a}_{ij} + ix\delta_{ij}\bigr)\rho_j L_n(x) = (1/2 + ix)\,L_n(x + i)\rho_i. \tag{34} \]
If this equation is satisfied, we immediately find the operator $B_n(x)$ in (30):
\[ B_n(x) = (x - i/2)\,L_n(x + i). \tag{35} \]
Having (35) we can also rewrite the second term on the r.h.s. of (27), as we did in Eq. (32), and finally arrive at
\[ \bigl(L^{1/2}_n(x)\bigr)_{ij} L_n(x) = (x + i/2)\,\rho_i L_n(x - i)\rho_j^{+}(\rho^{+}\rho + 1)^{-1} + (x - i/2)(\rho^{+}\rho + 1)^{-1}\epsilon_{il}\rho_l^{+} L_n(x + i)\epsilon_{jm}\rho_m + \Pi^{+}_{ik}\bigl(L^{1/2}_n(x)\bigr)_{kl} L_n(x)\Pi^{-}_{lj}. \tag{36} \]
We do not care to rewrite the last term on the r.h.s. of (36) because this term does not contribute to the final expression for the Q-operator.
Until now our discussion has been quite formal because we did not specify the representation of the algebra (21). A detailed investigation of Eq. (34) shows that the usual Fock representation of (21) does not fit our purpose; therefore we shall use a less restrictive holomorphic representation. Let the operator $\rho_i^{+}$ be the operator of multiplication by $\alpha_i$, and the operator $\rho_i$ the operator of differentiation with respect to $\alpha_i$:
\[ \rho_i^{+}\psi(\alpha) = \alpha_i\psi(\alpha), \qquad \rho_i\psi(\alpha) = \frac{\partial}{\partial\alpha_i}\psi(\alpha). \tag{37} \]
The operators $\rho_i^{+}$, $\rho_i$ are canonically conjugated for the scalar product
\[ (\psi, \phi) = \int \frac{\prod_{i=1,2} d\alpha_i\, d\bar{\alpha}_i}{(2\pi i)^2}\, e^{-\alpha\bar{\alpha}}\,\overline{\psi(\alpha)}\,\phi(\alpha). \tag{38} \]
The action of an operator in the holomorphic representation is defined by its kernel:
\[ (K\psi)(\alpha) = \int d\mu(\beta, \bar{\beta})\,K(\alpha, \bar{\beta})\,\psi(\beta), \tag{39} \]
where we have denoted
\[ d\mu(\beta, \bar{\beta}) = \frac{\prod_{i=1,2} d\beta_i\, d\bar{\beta}_i}{(2\pi i)^2}\, e^{-\beta\bar{\beta}}. \tag{40} \]
In this framework we can formulate the following:

Theorem. The kernel of the operator $L_n(x)$ satisfying Eq. (34) has, in the holomorphic representation, the following form:
\[ L_n(x, l, \alpha, \bar{\beta}) = \Bigl[x + \frac{i}{2}(\rho^{+}\rho + 1) + i s_n^{a}\rho^{+}\sigma^{a}\rho\Bigr]\frac{(\alpha\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}, \tag{41} \]
where $l$ is an arbitrary number.

The proof is trivial by direct substitution of (41) into (34), using Definition (39). In such a way, the operator $L_n(x)$ given by (41) solves Eq. (36) for left multiplication by $L^{1/2}(x)$. Changing the order of multiplication in (36), we can prove that
\[ L_n(x)\bigl(L^{1/2}_n(x)\bigr)_{ij} = (x + i/2)(\rho^{+}\rho + 1)^{-1}\rho_i L_n(x - i)\rho_j^{+} + (x - i/2)\,\epsilon_{il}\rho_l^{+} L_n(x + i)\epsilon_{jm}\rho_m(\rho^{+}\rho + 1)^{-1} + \Pi^{-}_{ik} L_n(x)\bigl(L^{1/2}_n(x)\bigr)_{kl}\Pi^{+}_{lj}, \tag{42} \]
provided $L_n(x)$ satisfies the equation
\[ L_n(x)\rho_i^{+}\bigl(s_n^{a}\sigma^{a}_{ij} + ix\delta_{ij}\bigr) = (1/2 + ix)\,\rho_j^{+} L_n(x + i). \tag{43} \]
Direct substitution of (41) into (43) shows that (41) is also a solution of this equation. The solution (41) possesses invariance with respect to simultaneous rotations in the quantum and auxiliary spaces, as an L-operator of the XXX chain:
\[ \Bigl[s_n + \rho^{+}\frac{\sigma}{2}\rho,\; L_n(x)\Bigr] = 0. \tag{44} \]
3. The Q-Operators

To proceed further we need to recall the definition of the trace of an operator in the holomorphic representation. If the operator is given by its kernel $F(\alpha, \bar{\beta})$ then (see e.g. [5])
\[ \mathrm{Tr}\,F = \int d\mu(\alpha, \bar{\alpha})\,F(\alpha, \bar{\alpha}), \tag{45} \]
where the measure was defined in (40). Let us now consider the ordered product of the $L_n(x)$-operators introduced in the previous section,
\[ \hat{Q}(x, l, \alpha, \bar{\beta}) = \int \prod_{i=1}^{N-1} d\mu(\gamma_i, \bar{\gamma}_i)\, L_N(x, l, \alpha, \bar{\gamma}_{N-1})\,L_{N-1}(x, l, \gamma_{N-1}, \bar{\gamma}_{N-2}) \times \cdots \times L_2(x, l, \gamma_2, \bar{\gamma}_1)\,L_1(x, l, \gamma_1, \bar{\beta}). \tag{46} \]
Due to the triangular (in the sense of the projectors $\Pi^{\pm}$) structure of the r.h.s. of (36) we obtain the following rule for multiplying the operator $\hat{Q}$ by the monodromy matrix $T^{1/2}(x)$:
\[ \bigl(T^{1/2}(x)\bigr)_{ij}\hat{Q}(x, l, \alpha, \bar{\beta}) = \Bigl(x + \frac{i}{2}\Bigr)^{N}\rho_i\,\hat{Q}(x - i, l, \alpha, \bar{\beta})\,\rho_j^{+}(\rho^{+}\rho + 1)^{-1} + \Bigl(x - \frac{i}{2}\Bigr)^{N}(\rho^{+}\rho + 1)^{-1}\epsilon_{im}\rho_m^{+}\,\hat{Q}(x + i, l, \alpha, \bar{\beta})\,\epsilon_{jk}\rho_k + \Pi^{+}_{im}(\cdots)_{mk}\Pi^{-}_{kj}, \tag{47} \]
where we omitted the explicit expression of the last term for obvious reasons. In the derivation of (47) we have used the remnants of the projectors $\Pi^{\pm}$, which govern the proper multiplication of each term in (36) separately. Now we can perform the trace operation over the holomorphic variables and over the indices $i, j$ corresponding to the auxiliary two-dimensional space of $T^{1/2}(x)$. The result is the desired Baxter equation:
\[ t^{1/2}(x)\,Q(x, l) = \Bigl(x + \frac{i}{2}\Bigr)^{N} Q(x - i, l) + \Bigl(x - \frac{i}{2}\Bigr)^{N} Q(x + i, l), \tag{48} \]
where, according to (45) and (46),
\[ Q(x, l) = \int d\mu(\alpha, \bar{\alpha})\,\hat{Q}(x, l, \alpha, \bar{\alpha}). \tag{49} \]
Note that the trace of $\hat{Q}$ exists due to the exponential factor in the holomorphic measure (40) and has the cyclic property; therefore $Q(x, l)$ is invariant under cyclic permutations of the quantum spins. Further, due to property (44) we easily obtain that $Q(x, l)$ is invariant with respect to rotations of the total quantum spin:
\[ [S, Q(x, l)] = 0, \tag{50} \]
where $S$ is given in (8). Recall that the $L_n(x)$-operators also satisfy relation (42) for right multiplication by $L^{1/2}(x)$; therefore
\[ [t(x), Q(x, l)] = 0. \tag{51} \]
Expression (46) for the $\hat{Q}$-operator can be simplified substantially. For that, let us rewrite Eq. (41) in the following form:
\[ L_n(x, l, \alpha, \bar{\beta}) = K_n(x)\,\frac{(\alpha\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}, \tag{52} \]
where we have denoted by $K_n(x)$ the following operator:
\[ K_n(x) = x + \frac{i}{2}(\rho^{+}\rho + 1) + i s_n^{a}\rho^{+}\sigma^{a}\rho. \tag{53} \]
Here $\rho$, $\rho^{+}$ are the operators acting in (52) on the variable $\alpha$. The action of the operator $L_n(x)$ in the form (52) on a function $\psi$ has, according to (39), the following form:
\[ (L_n(x)\psi)(\alpha) = \int d\mu(\beta, \bar{\beta})\,K_n(x)\,\frac{(\alpha\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}\,\psi(\beta). \tag{54} \]
Using a property of the measure $d\mu(\beta, \bar{\beta})$ we can transfer the action of the operator $K_n(x)$ from the variable $\alpha_i$ onto the variable $\beta_i$ and rewrite (54) in the form
\[ (L_n(x)\psi)(\alpha) = \int d\mu(\beta, \bar{\beta})\,\frac{(\alpha\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}\,K_n(x)\,\psi(\beta). \tag{55} \]
Proceeding this way in the representation (46) we can collect all operators $K_n(x)$ in one place:
\[ \hat{Q}(x, l, \alpha, \bar{\beta}) = \int \prod_{i=1}^{N-1} d\mu(\gamma_i, \bar{\gamma}_i)\,\frac{(\alpha\bar{\gamma}_{N-1})^{2l+ix}(\gamma_{N-1}\bar{\gamma}_{N-2})^{2l+ix}\cdots(\gamma_2\bar{\gamma}_1)^{2l+ix}}{[\Gamma(2l + ix + 1)]^{N-1}} \times \prod_{m=1}^{N} K_m(x)\,\frac{(\gamma_1\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}. \tag{56} \]
Now all operators $K_m(x)$ act on the variable $\gamma_1$, and we can perform the integration over $\gamma_k, \bar{\gamma}_k$ with $k = 2, \ldots, N - 1$. This integration can be done with the help of the following formula:
\[ \int d\mu(\gamma, \bar{\gamma})\,\frac{(\alpha\bar{\gamma})^{2l+ix}(\gamma\bar{\beta})^{2l+ix}}{[\Gamma(2l + ix + 1)]^{2}} = \frac{(\alpha\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}. \tag{57} \]
From (57) it follows that the factor $(\alpha\bar{\beta})^{u}/\Gamma(u + 1)$ is actually the kernel of some projector. Using this property, we arrive at the following representation for $\hat{Q}$:
\[ \hat{Q}(x, l, \alpha, \bar{\beta}) = \int d\mu(\gamma, \bar{\gamma})\,\frac{(\alpha\bar{\gamma})^{2l+ix}}{\Gamma(2l + ix + 1)}\prod_{m=1}^{N} K_m(x)\,\frac{(\gamma\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}. \tag{58} \]
Due to the fact that in (58) the ordered product of the $K_n(x)$ acts only on the variable $\gamma$, we can derive the expression for the trace of $\hat{Q}(x, l, \alpha, \bar{\beta})$ by performing one more integration:
\[ Q(x, l) = \int d\mu(\gamma, \bar{\gamma})\prod_{m=1}^{N} K_m(x)\,\frac{(\gamma\bar{\gamma})^{2l+ix}}{\Gamma(2l + ix + 1)}. \tag{59} \]
Needless to say, for an integer or half-integer $l$ and $x = 0$ the expression $(\alpha\bar{\beta})^{2l}/\Gamma(2l + 1)$ coincides with the kernel of the projector on the representation $l$ of su(2), so that Eq. (59) is actually the desired prescription for analytic continuation into complex momentum, naively suggested in the Introduction.
4. The Intertwining Relations

In this section we shall derive several intertwining relations for the operators $L_n(x)$ and $L^{1/2}(x)$ which permit us to prove the commutativity of $Q(x, l)$ and some other important corollaries. First of all let us consider the representation (52) for the $L_n(x)$-operator. The formal operator $K_n(x)$ which enters into this representation is nothing else but the usual $L_n(x)$-operator of the XXX spin chain with an infinite-dimensional auxiliary space and a shifted spectral parameter. The shift commutes with $K_n(x)$, so we can prove purely algebraically the R-matrix form of the commutation relation for $K_n(x)$:
\[ R^{\rho,\tau}\Bigl(x - y + \frac{i}{2}(\rho^{+}\rho - \tau^{+}\tau)\Bigr) K_n^{\rho}(x) K_n^{\tau}(y) = K_n^{\tau}(y) K_n^{\rho}(x)\,R^{\rho,\tau}\Bigl(x - y + \frac{i}{2}(\rho^{+}\rho - \tau^{+}\tau)\Bigr), \tag{60} \]
where $R^{\rho,\tau}(x)$ is given by Eq. (11) with
\[ J = \rho^{+}\frac{\sigma}{2}\rho + \tau^{+}\frac{\sigma}{2}\tau. \tag{61} \]
The indices $\rho$ and $\tau$ on the operators $K_n$ and $R$ indicate different operators, acting in their own auxiliary spaces. For the products of L-operators Eq. (60) implies
\[ \begin{aligned}
& R^{\rho,\tau}\Bigl(x - y + \frac{i}{2}(\rho^{+}\rho - \tau^{+}\tau)\Bigr) K_n^{\rho}(x) K_n^{\tau}(y)\,\frac{(\alpha\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}\,\frac{(\gamma\bar{\delta})^{2m+iy}}{\Gamma(2m + iy + 1)} \\
&\quad = K_n^{\tau}(y) K_n^{\rho}(x)\,R^{\rho,\tau}\Bigl(x - y + \frac{i}{2}(\rho^{+}\rho - \tau^{+}\tau)\Bigr)\frac{(\alpha\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}\,\frac{(\gamma\bar{\delta})^{2m+iy}}{\Gamma(2m + iy + 1)} \\
&\quad = K_n^{\tau}(y) K_n^{\rho}(x)\,\frac{(\alpha\bar{\beta})^{2l+ix}}{\Gamma(2l + ix + 1)}\,\frac{(\gamma\bar{\delta})^{2m+iy}}{\Gamma(2m + iy + 1)}\,R^{\rho,\tau}\Bigl(x - y + \frac{i}{2}(\rho^{+}\rho - \tau^{+}\tau)\Bigr).
\end{aligned} \tag{62} \]
A few comments need to be made about these equations. The holomorphic variables corresponding to the pair of operators $(\rho^{+}, \rho)$ are $(\alpha, \bar{\beta})$, and those corresponding to the pair $(\tau^{+}, \tau)$ are $(\gamma, \bar{\delta})$. Equations (62) should be understood as the short version of a long story with integrals over the holomorphic variables and with functions depending upon $\beta$, $\delta$. The last step of the chain of equations is due to the same property of the measure that permitted the transition from (54) to (55). Coming back to the L-operators we can write the following intertwining relation:
\[ R^{\rho,\tau}\Bigl(x - y + \frac{i}{2}(\rho^{+}\rho - \tau^{+}\tau)\Bigr) L_n(x, l, \alpha, \bar{\beta})\,L_n(y, m, \gamma, \bar{\delta}) = L_n(y, m, \gamma, \bar{\delta})\,L_n(x, l, \alpha, \bar{\beta})\,R^{\rho,\tau}\Bigl(x - y + \frac{i}{2}(\rho^{+}\rho - \tau^{+}\tau)\Bigr). \tag{63} \]
Apparently, the same relation also holds for the $\hat{Q}$-operators. From that we immediately obtain the commutativity of their traces:
\[ [Q(x, l), Q(y, m)] = 0. \tag{64} \]
Further, let us consider another commutation relation [6]:
\[ \Bigl(x + \frac{i}{2}\sigma_1\sigma_2\Bigr)(y + i\sigma_1 M)(y - x + i\sigma_2 M) = (y - x + i\sigma_2 M)(y + i\sigma_1 M)\Bigl(x + \frac{i}{2}\sigma_1\sigma_2\Bigr), \tag{65} \]
where the Pauli matrices $\sigma_1^{a}$, $\sigma_2^{a}$ act in their own spaces and $M^{a}$ are some operators satisfying
\[ [M^{a}, M^{b}] = i\epsilon^{abc}M^{c}. \tag{66} \]
In particular, we can set
\[ M^{a} = \rho^{+}\frac{\sigma^{a}}{2}\rho, \tag{67} \]
where $\rho^{+}$, $\rho$ are the Heisenberg variables (21). Now let us shift the spectral parameter $y$ in (65) by $\frac{i}{2}(\rho^{+}\rho + 1)$ and rewrite (65) in the following form:
\[ \begin{aligned}
& (x + i\sigma s_n)\Bigl(y + \frac{i}{2}(\rho^{+}\rho + 1) + i s_n\rho^{+}\sigma\rho\Bigr)\Bigl(y - x + \frac{i}{2}(\rho^{+}\rho + 1) + i\sigma\rho^{+}\frac{\sigma}{2}\rho\Bigr) \\
&\quad = \Bigl(y - x + \frac{i}{2}(\rho^{+}\rho + 1) + i\sigma\rho^{+}\frac{\sigma}{2}\rho\Bigr)\Bigl(y + \frac{i}{2}(\rho^{+}\rho + 1) + i s_n\rho^{+}\sigma\rho\Bigr)(x + i\sigma s_n),
\end{aligned} \tag{68} \]
where we interpret $\sigma_1$ as the quantum spin $s_n$ while $\sigma_2$ serves as the auxiliary spin. Equation (68) can also be written as
\[ \bigl(L^{1/2}_n(x)\bigr)_{ik} K_n(y)\,K_{kj}\Bigl(y - x + \frac{i}{2}\Bigr) = K_{ik}\Bigl(y - x + \frac{i}{2}\Bigr) K_n(y)\bigl(L^{1/2}_n(x)\bigr)_{kj}, \tag{69} \]
where we explicitly wrote the indices corresponding to the auxiliary space, the index $n$ indicates the corresponding quantum space, and the operator $K_n(x)$ was introduced in (53). Equation (69) can be used to derive the intertwining relation for the $L^{1/2}$ and $L$ operators. Indeed, let us consider the following product of operators acting on a function in the holomorphic representation:
\[ \Bigl[\bigl(L^{1/2}_n(x)\bigr)_{ik} L_n(y, l)\,K_{kj}\Bigl(y - x + \frac{i}{2}\Bigr)\psi\Bigr](\alpha) = \bigl(L^{1/2}_n(x)\bigr)_{ik}\int d\mu(\beta, \bar{\beta})\,K_n(y)\,\frac{(\alpha\bar{\beta})^{2l+iy}}{\Gamma(2l + iy + 1)}\,K_{kj}\Bigl(y - x + \frac{i}{2}\Bigr)\psi(\beta). \tag{70} \]
Moving the operator $K_{kj}(y - x + i/2)$ to the left, through the projector $(\alpha\bar{\beta})^{2l+iy}/\Gamma(2l + iy + 1)$, and using (69) we obtain the following relation:
\[ \bigl(L^{1/2}_n(x)\bigr)_{ik} L_n(y, l)\,K_{kj}\Bigl(y - x + \frac{i}{2}\Bigr) = K_{ik}\Bigl(y - x + \frac{i}{2}\Bigr) L_n(y, l)\bigl(L^{1/2}_n(x)\bigr)_{kj}, \tag{71} \]
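where the operator
\[ K_{ij}(x) = \Bigl(x + \frac{i}{2}(\rho^{+}\rho + 1)\Bigr)\delta_{ij} + i\sigma_{ij}\,\rho^{+}\frac{\sigma}{2}\rho \tag{72} \]
plays the role of the R-matrix. The relation (71) gives rise to the analogous relation for the monodromies:
\[ \bigl(T^{1/2}(x)\bigr)_{ik}\hat{Q}(y, l, \alpha, \bar{\beta})\,K_{kj}\Bigl(y - x + \frac{i}{2}\Bigr) = K_{ik}\Bigl(y - x + \frac{i}{2}\Bigr)\hat{Q}(y, l, \alpha, \bar{\beta})\bigl(T^{1/2}(x)\bigr)_{kj}, \tag{73} \]
from which we obtain the commutativity of the transfer matrix and our Q-operator:
\[ [t(x), Q(y, l)] = 0. \tag{74} \]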
5. Again Q-Operators

Now we are ready to discuss some important properties of Baxter’s Q-operator and its generalizations. Let us start from the Baxter equation (48) for the Q-operator. Due to the mutual commutativity of the $Q(x, l)$ with different spectral parameters and second arguments, we easily derive that
\[ \Bigl[\Bigl(x + \frac{i}{2}\Bigr)^{N} Q(x - i, l) + \Bigl(x - \frac{i}{2}\Bigr)^{N} Q(x + i, l)\Bigr] Q(x, m) = \Bigl[\Bigl(x + \frac{i}{2}\Bigr)^{N} Q(x - i, m) + \Bigl(x - \frac{i}{2}\Bigr)^{N} Q(x + i, m)\Bigr] Q(x, l), \tag{75} \]
or
\[ Q\Bigl(x + \frac{i}{2}, l\Bigr) Q\Bigl(x - \frac{i}{2}, m\Bigr) - Q\Bigl(x - \frac{i}{2}, l\Bigr) Q\Bigl(x + \frac{i}{2}, m\Bigr) = C(l, m)\,x^{N}, \tag{76} \]
where $C(l, m)$ is some unknown operator commuting with $Q$. To find $C(l, m)$ we must calculate the l.h.s. of (76) for some convenient values of the arguments. From (59) it follows that the Q-operator is proportional to the trace of the projector whose kernel in the holomorphic representation is $(\gamma\bar{\gamma})^{2l+ix}/\Gamma(2l + ix + 1)$. This trace is given by
\[ \int d\mu(\gamma, \bar{\gamma})\,\frac{(\gamma\bar{\gamma})^{2l+ix}}{\Gamma(2l + ix + 1)} = 2(2l + ix + 1). \tag{77} \]
Note that for $x = 0$ the trace is twice the dimension of the representation of spin $l$. From (77) we conclude that
\[ Q(x, l)\big|_{ix = -(2l+1)} = 0. \tag{78} \]
Now let us set $x = i(2l + 1/2)$ in Eq. (76) (for $m$, $l$ integer or half-integer and $m \ge l + 1/2$). Then, due to (78), the first term on the l.h.s. of (76) disappears and we obtain
\[ -Q(2il, l)\,Q(i(2l + 1), m) = C(l, m)\,[i(2l + 1)]^{N}. \tag{79} \]
Further, from (59) we derive that
\[ Q(2il, l) = t^{0}(i(2l + 1)) = [i(2l + 1)]^{N} \tag{80} \]
and
\[ Q(i(2l + 1), m) = t^{m-l-1/2}(i(l + m)). \tag{81} \]
Hence, the unknown coefficient $C(l, m)$ is given by
\[ C(l, m) = -t^{m-l-1/2}(i(l + m)). \tag{82} \]
For the case $l \ge m + 1/2$ we should put $x = i(2m + 1/2)$, and $l$ and $m$ will change places in the final answer. So, finally, we obtain the quantum Wronskian in the following form:
\[ Q\Bigl(x + \frac{i}{2}, l\Bigr) Q\Bigl(x - \frac{i}{2}, m\Bigr) - Q\Bigl(x - \frac{i}{2}, l\Bigr) Q\Bigl(x + \frac{i}{2}, m\Bigr) = -x^{N}\,t^{m-l-1/2}(i(l + m)). \tag{83} \]
Note that for $l = m$ the r.h.s. of (83) vanishes, as it should for the Wronskian of linearly dependent solutions. Proceeding along this way we can obtain the general relation involving the transfer matrix with arbitrary spin in the auxiliary space on the r.h.s. of (83) (the factor $x^{N}$ is just $t^{0}(x)$) (see [2, 4]). We postpone the discussion of these relations to a future publication, where we intend to give another derivation. Further we want to consider the generalization of the Baxter equation which follows from the fundamental relation (47). To obtain these new relations, let us multiply both sides of (47) by the total spin in the auxiliary space:
\[ \begin{aligned}
\Bigl(\frac{1}{2}\sigma^{a}_{ik} + J^{a}\delta_{ik}\Bigr)\bigl(T^{1/2}(x)\bigr)_{kj}\hat{Q}(x, l, \alpha, \bar{\beta}) = \Bigl(\frac{1}{2}\sigma^{a}_{ik} + J^{a}\delta_{ik}\Bigr)\Bigl[ & \Bigl(x + \frac{i}{2}\Bigr)^{N}\rho_k\,\hat{Q}(x - i, l, \alpha, \bar{\beta})\,\rho_j^{+}(\rho^{+}\rho + 1)^{-1} \\
& + \Bigl(x - \frac{i}{2}\Bigr)^{N}(\rho^{+}\rho + 1)^{-1}\epsilon_{km}\rho_m^{+}\,\hat{Q}(x + i, l, \alpha, \bar{\beta})\,\epsilon_{jn}\rho_n \\
& + \Pi^{+}_{km}(\cdots)_{mn}\Pi^{-}_{nj}\Bigr],
\end{aligned} \tag{84} \]
where
\[ J^{a} = \rho^{+}\frac{\sigma^{a}}{2}\rho. \tag{85} \]
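Due to the obvious relations
\[ \begin{aligned}
\frac{1}{2}\sigma^{a}_{ik}\rho_k + J^{a}\rho_i &= \rho_i J^{a}, \\
\frac{1}{2}\sigma^{a}_{ik}\epsilon_{km}\rho_m^{+} + J^{a}\epsilon_{ik}\rho_k^{+} &= \epsilon_{ik}\rho_k^{+} J^{a}, \\
\Bigl(\frac{1}{2}\sigma^{a}_{ik} + J^{a}\delta_{ik}\Bigr)\Pi^{\pm}_{kj} &= \Pi^{\pm}_{ik}\Bigl(\frac{1}{2}\sigma^{a}_{kj} + J^{a}\delta_{kj}\Bigr),
\end{aligned} \tag{86} \]
Eq. (84) can be rewritten in the following form:
\[ \begin{aligned}
\Bigl(\frac{1}{2}\sigma^{a}_{ik} + J^{a}\delta_{ik}\Bigr)\bigl(T^{1/2}(x)\bigr)_{kj}\hat{Q}(x, l, \alpha, \bar{\beta}) = {} & \Bigl(x + \frac{i}{2}\Bigr)^{N}\rho_i J^{a}\,\hat{Q}(x - i, l, \alpha, \bar{\beta})\,\rho_j^{+}(\rho^{+}\rho + 1)^{-1} \\
& + \Bigl(x - \frac{i}{2}\Bigr)^{N}(\rho^{+}\rho + 1)^{-1}\epsilon_{im}\rho_m^{+} J^{a}\,\hat{Q}(x + i, l, \alpha, \bar{\beta})\,\epsilon_{jk}\rho_k \\
& + \Bigl(\frac{1}{2}\sigma^{a}_{ik} + J^{a}\delta_{ik}\Bigr)\Pi^{+}_{km}(\cdots)_{ml}\Pi^{-}_{lj}.
\end{aligned} \tag{87} \]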
If we now calculate the trace over the whole auxiliary space, the last term on the r.h.s. of (87) again will not contribute, as in the case of the usual Baxter equation, and we obtain the following relation:
\[ \mathbf{t}^{1/2}(x)\,Q(x, l) + t^{1/2}(x)\,\mathbf{Q}(x, l) = \Bigl(x + \frac{i}{2}\Bigr)^{N}\mathbf{Q}(x - i, l) + \Bigl(x - \frac{i}{2}\Bigr)^{N}\mathbf{Q}(x + i, l), \tag{88} \]
where we have introduced the notations
\[ \mathbf{t}^{1/2}(x) = \mathrm{Tr}\,\frac{\sigma}{2}\,T^{1/2}(x), \qquad \mathbf{Q}(x, l) = \int d\mu(\alpha, \bar{\alpha})\,J\,\hat{Q}(x, l, \alpha, \bar{\alpha}). \tag{89} \]
This equation may be considered as an inhomogeneous Baxter equation, where the first term on the l.h.s. plays the role of the inhomogeneity. Remarkably, the r.h.s. of (88) will not change if we simultaneously change the order of multiplication in both terms on the l.h.s.:
\[ Q(x, l)\,\mathbf{t}^{1/2}(x) + \mathbf{Q}(x, l)\,t^{1/2}(x) = \Bigl(x + \frac{i}{2}\Bigr)^{N}\mathbf{Q}(x - i, l) + \Bigl(x - \frac{i}{2}\Bigr)^{N}\mathbf{Q}(x + i, l). \tag{90} \]
This property can be derived either from Eq. (47), repeating all the steps which led us to (88), or directly from the intertwining relation (73). These new vector Q-operators inherit many properties of the original Baxter operator. In particular, they also satisfy Wronskian-type relations similar to (83). We intend to present a detailed discussion of these operators in a separate publication. It is worth mentioning that we can go further, multiplying Eq. (47) by products of the generators of the total auxiliary spin. The relations (86) and the triangular structure of the r.h.s. of (47) guarantee that the last term will not contribute, and we shall obtain relations similar to (90) for new tensorial generalizations of the Q-operator. Also, multiplying both sides of (47) by the operator
\[ U(H) = \exp\Bigl\{ iH\Bigl(\frac{\sigma}{2} + J\Bigr)\Bigr\}, \tag{91} \]
where $H$ is a c-number, we shall obtain Baxter’s operator for the XXX spin chain in an external magnetic field.
6. Concluding Remarks

In our previous publication [4] we considered Baxter’s equation for the eigenvalues of the Q-operator and proved that the existence of one operator implies the existence of a second. As Baxter’s equation is linear, this apparently means that its solutions are linear combinations of these two basic Q-operators, or in other words that the general solution of this equation forms a one-parameter family. In [4] we denoted these basic operators by $Q(x)$ and $P(x)$. We have shown in particular that the eigenvalues of $Q(x)$ and $P(x)$ are polynomials and that the transfer matrix $t^{l}(x)$ can be expressed in terms of the eigenvalues of $Q(x)$ and $P(x)$ as follows:
\[ t^{l}(x) = P(x + i(l + 1/2))\,Q(x - i(l + 1/2)) - P(x - i(l + 1/2))\,Q(x + i(l + 1/2)). \tag{92} \]
Making an analytic continuation as in (18) (remember that the condition of polynomiality makes this continuation unique) we can now obtain the expression for the eigenvalue of our operator $Q(x, l)$ in terms of the eigenvalues of $Q(x)$ and $P(x)$:
\[ Q(x, l) = P(i(2l + 1))\,Q(x) - P(x)\,Q(i(2l + 1)). \tag{93} \]
From this equation it is evident that the operators $Q(x, l)$ constructed in the present paper are linear combinations of the two basic operators with operator coefficients which do not depend on the spectral parameter $x$ but depend on the parameter $l$. In the simpler case of, e.g., the Toda chain we can give a separate construction of the basic operators (G. Pronko, “On Baxter Q-operator for Toda Chain”, e-preprint nlin/0003002), but for the case of the XXX spin chain we still do not know these operators separately. The construction of Baxter’s Q-operator considered in the present paper for the case of the XXX spin chain seems to be rather universal and could be extended to the case of the anisotropic XXZ spin chain. The key to this generalization is again the “naive” analytic continuation suggested in the Introduction. Indeed, in Baxter’s parametrization [1], the fusion relation for the XXZ spin chain has the following form:
\[ t^{1/2}(\phi)\,t^{l}(\phi + (2l + 1)\eta) = \sin^{N}(\phi + \eta)\,t^{l+1/2}(\phi + 2l\eta) + \sin^{N}(\phi - \eta)\,t^{l-1/2}(\phi + 2(l + 1)\eta), \tag{94} \]
where $\eta$ is the crossing parameter. From (94) it is clear that the function $Q(\phi, l)$ defined by
\[ Q(\phi, l) = t^{\,l - \phi/4\eta}(\phi/2 + (2l + 1)\eta) \tag{95} \]
satisfies the relation
\[ t^{1/2}(\phi)\,Q(\phi, l) = \sin^{N}(\phi + \eta)\,Q(\phi - 2\eta, l) + \sin^{N}(\phi - \eta)\,Q(\phi + 2\eta, l). \tag{96} \]
Again, this trick cannot be considered as a construction of the Q-operator, but it gives strong evidence that the procedure described in the present paper may be extended to the 6-vertex model. A very interesting question is the further generalization of this approach to the case of the 8-vertex spin chain, for which there also exists a Baxter construction [1], and to the case of the field model considered in [2].

Acknowledgements. The author is grateful to V. Bazhanov, L. Faddeev, E. Skyanin, S. Sergeev, Yu. Stroganov, A. Volkov for their interest, discussions, criticism and encouragement. This work was supported in part by ESPIRIT project NTCONGS and RFFI Grant 98-01-0070.
References

1. Baxter, R.J.: Stud. Appl. Math. L 51–69 (1971); Ann. Phys. N.Y. 70, 193–228 (1972); Ann. Phys. N.Y. 76, 1–71 (1973)
2. Bazhanov, V.V., Lukyanov, S.L., Zamolodchikov, A.B.: Commun. Math. Phys. 190, 247–78 (1997); 200, 297–324 (1998)
3. Faddeev, L.D., Takhtajan, L.A.: Zap. Nauch. Semin. LOMI, 109, Leningrad: Nauka, 1981, pp. 134–178
4. Pronko, G.P., Stroganov, Yu.G.: J. Phys. A: Math. Gen. 32, 2333–2340 (1999)
5. Berezin, F.A.: The Method of Second Quantization. Academic Press, New York, 1966
6. Faddeev, L.D.: UMANA 40, 214 (1995) (hep-th/9605187)

Communicated by T. Miwa
Commun. Math. Phys. 212, 703 – 724 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Atoms in Strong Magnetic Fields: The High Field Limit at Fixed Nuclear Charge

Bernhard Baumgartner (1), Jan Philip Solovej (2), Jakob Yngvason (1)

(1) Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, 1090 Vienna, Austria
(2) Department of Mathematics, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen Ø, Denmark

Received: 9 December 1999 / Accepted: 15 February 2000
Abstract: Let $E(B, Z, N)$ denote the ground state energy of an atom with $N$ electrons and nuclear charge $Z$ in a homogeneous magnetic field $B$. We study the asymptotics of $E(B, Z, N)$ as $B \to \infty$ with $N$ and $Z$ fixed but arbitrary. It is shown that the leading term has the form $(\ln B)^2 e(Z, N)$, where $e(Z, N)$ is the ground state energy of a system of $N$ bosons with delta interactions in one dimension. This extends and refines previously known results for $N = 1$ on the one hand, and $N, Z \to \infty$ with $B/Z^3 \to \infty$ on the other hand.

1. Introduction

The effects of extremely strong magnetic fields (of order $10^9$ Gauss and higher) on atoms and molecules are of considerable astrophysical as well as mathematical interest and are far from being completely understood in spite of many theoretical studies since the early seventies. We refer to [LSYa] and [RWHG] for a general discussion of this subject and extensive lists of references. An atom (ion) with $N$ electrons and nuclear charge $Z$ in a homogeneous magnetic field $\mathbf{B} = (0, 0, B)$ is (in appropriate units) usually modeled by the nonrelativistic many-body Hamiltonian
\[ H_{B,Z,N} = \sum_{i=1}^{N}\Bigl(H_A^{(i)} - \frac{Z}{|x_i|}\Bigr) + \sum_{i<j}\frac{1}{|x_i - x_j|}. \tag{1} \]
Here $x_i \in \mathbb{R}^3$ are the positions of the electrons, $i = 1, \ldots, N$, $A(x) = \frac{1}{2}\mathbf{B} \times x$ is the vector potential, and
\[ H_A = [(i\nabla + A(x)) \cdot \sigma]^2 \tag{2} \]
with $\sigma$ the vector of Pauli spin matrices. The Hamiltonian $H_{B,Z,N}$ operates on the Hilbert space $\mathcal{H}_N = \bigwedge^N L^2(\mathbb{R}^3; \mathbb{C}^2)$ appropriate for fermions of spin 1/2. In this paper we are concerned with the ground state energy
\[ E(B, Z, N) = \inf \operatorname{spec} H_{B,Z,N} = \inf\{\langle\Psi, H_{B,Z,N}\Psi\rangle : \Psi \in C_0^\infty(\mathbb{R}^{3N}; \mathbb{C}^{2^N}) \cap \mathcal{H}_N,\ \|\Psi\|_2 = 1\}, \tag{3} \]
more precisely the $B \to \infty$ asymptotics of this quantity. Such an asymptotic study is relevant at the field strengths prevailing on white dwarfs and neutron stars. Previous investigations of the asymptotics of $E(B, Z, N)$ have either dealt with the case $N = 1$, i.e., hydrogen-like atoms [AHS, FWa], or the case when $Z$ and $N$ tend to $\infty$ together with $B$ [LSYa, LSYb, I]. The most complete rigorous treatment of the ground state in the $N = 1$ case so far is [AHS], where the following $B \to \infty$ asymptotics was derived:
\[ E(B, Z, 1)/Z^2 = -\tfrac{1}{4}[\ln(B/2)]^2 + [\ln(B/2)\ln\ln(B/2)] - [(C + \ln 2)\ln(B/2)] - [\ln\ln(B/2)]^2 + 2(C - 1 + \ln 2)\ln\ln(B/2) + O(1), \tag{4} \]
with a constant $C$ (Euler’s constant/2). Asymptotics for other eigenvalues and resonances are obtained in [FWa] and [FWb]. The basic results on the $N, Z \to \infty$ case were obtained in [LSYa] and [LSYb]. In particular, in [LSYa] it was shown that if $N, Z \to \infty$ with $\lambda = N/Z$ fixed, and $B/Z^3 \to \infty$, then
\[ E(B, Z, N)/(Z^3[\ln(B/Z^3)]^2) \to \begin{cases} -\frac{1}{4}\lambda + \frac{1}{8}\lambda^2 - \frac{1}{48}\lambda^3 & \text{if } \lambda < 2, \\ -\frac{1}{6} & \text{if } \lambda \ge 2. \end{cases} \tag{5} \]
The fact that the right side of (5) decreases with increasing $N/Z$ as long as $N/Z < 2$ shows that in the limit $Z \to \infty$, $B/Z^3 \to \infty$ an atom can bind at least $2Z$ electrons. In [I] some higher order corrections to the leading asymptotics for the energy are discussed. The main result of the present paper is a derivation of the leading term in the $B \to \infty$ asymptotics of $E(B, Z, N)$, where $Z$ and $N$ are fixed, but arbitrary. The precise statement is as follows:

Theorem 1.1 (High field limit of the energy). For each fixed $Z$ and $N$
\[ \lim_{B \to \infty}\frac{E(B, Z, N)}{(\ln B)^2} = e(Z, N), \tag{6} \]
where $e(Z, N)$ is the ground state energy of the Hamiltonian
\[ h_{Z,N} = \sum_{i=1}^{N}\bigl(-\partial^2/\partial z_i^2 - Z\delta(z_i)\bigr) + \sum_{i<j}\delta(z_i - z_j) \tag{7} \]
of $N$ bosons with $\delta$-interaction in one dimension, defined in the sense of quadratic forms as
\[ e(Z, N) = \inf\{\langle\Psi, h_{Z,N}\Psi\rangle : \Psi \in C_0^\infty(\mathbb{R}^N),\ \|\Psi\|_2 = 1\}. \tag{8} \]
It is trivial to compute $e(Z, 1) = -Z^2/4$. Thus (6) generalizes the first term in the expansion (4) to the case $N > 1$. The relevance of the $\delta$-function model for the ground state of hydrogen in strong magnetic fields was noted already in [S]. We also verify that the mean field limit of $e(Z, N)$ agrees with (5):
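Theorem 1.2 (Mean field limit). If $Z, N \to \infty$ with $\lambda = N/Z$ fixed, then
\[ e(Z, N)/Z^3 \to \begin{cases} -\frac{1}{4}\lambda + \frac{1}{8}\lambda^2 - \frac{1}{48}\lambda^3 & \text{if } \lambda < 2, \\ -\frac{1}{6} & \text{if } \lambda \ge 2. \end{cases} \tag{9} \]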
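The limiting function in (5) and (9) is easy to explore numerically. The sketch below is illustrative only (the function name `mean_field_energy` is not from the paper); it evaluates the right-hand side and checks the two properties used in the text: the energy strictly decreases for $\lambda < 2$, since its derivative is $-(\lambda - 2)^2/16$, and it joins the constant $-1/6$ continuously at $\lambda = 2$.

```python
def mean_field_energy(lam):
    """Right-hand side of (5) and (9) as a function of lambda = N/Z."""
    if lam < 2:
        return -lam / 4 + lam ** 2 / 8 - lam ** 3 / 48
    return -1 / 6


# Sample the branch lambda < 2 and confirm that the energy keeps decreasing.
grid = [0.1 * k for k in range(20)]  # 0.0, 0.1, ..., 1.9
values = [mean_field_energy(lam) for lam in grid]
is_decreasing = all(a > b for a, b in zip(values, values[1:]))

# The two branches should agree at lambda = 2.
jump_at_two = abs(mean_field_energy(2 - 1e-9) - mean_field_energy(2))

print(is_decreasing, jump_at_two)
```

For small $\lambda$ the function behaves like $-\lambda/4$, i.e. $e(Z, N) \approx -N Z^2/4$, which is $N$ times the one-particle value $e(Z, 1) = -Z^2/4$ quoted after Theorem 1.1.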
Taken together, Theorems 1.1 and 1.2 lead to the same high $B$, high $Z$ limit as Theorem 1.4 in [LSYa], where $Z \to \infty$ and $B/Z^3 \to \infty$ simultaneously (the “hyperstrong” limit.) We now describe briefly the strategy for the proof of these results and introduce some notation that will be used throughout. The first step in the proof of Theorem 1.1 is a reduction to the subspace $\mathcal{H}_N^0 \subset \mathcal{H}_N$ generated by wave functions in the lowest Landau band. Let $\Pi_N^0$ denote the projector on $\mathcal{H}_N^0$. (Its integral kernel is given by Eqs. (52)–(53) in Sect. 5. Note that $\Pi_N^0$ and $\mathcal{H}_N^0$ depend on $B$.) Let $E^{\mathrm{conf}}(B, Z, N)$ denote the ground state energy of $\Pi_N^0 H_{B,Z,N}\Pi_N^0$. It is clear that
\[ E(B, Z, N) \le E^{\mathrm{conf}}(B, Z, N), \tag{10} \]
and by Theorem 1.2 in [LSYa],
\[ E^{\mathrm{conf}}(B, Z, N) \le E(B, Z, N)(1 - \delta(B, Z, N)), \tag{11} \]
where $\delta(B, Z, N) \to 0$ for $B \to \infty$ with $Z, N$ fixed. Hence it suffices to prove (6) with $E(B, Z, N)$ replaced by $E^{\mathrm{conf}}(B, Z, N)$. We note in passing that (11) also holds for bosons. In fact, it will become evident in the sequel that Theorem 1.1 is independent of the statistics of the particles. To study $E^{\mathrm{conf}}(B, Z, N)$ the next step is to introduce a Hamiltonian for the motion parallel to the magnetic field with the coordinates perpendicular to the magnetic field as parameters. We write the variables $x_i \in \mathbb{R}^3$ as $x_i = (x_i^\perp, z_i)$, where $x_i^\perp \in \mathbb{R}^2$ and $z_i \in \mathbb{R}$ are respectively the components perpendicular and parallel to the field. Moreover, we write $(x_1, \ldots, x_N) = (x^\perp, z)$ with $x^\perp = (x_1^\perp, \ldots, x_N^\perp) \in \mathbb{R}^{2N}$ and $z = (z_1, \ldots, z_N) \in \mathbb{R}^N$. In the lowest Landau band the part of (2) associated with the motion perpendicular to the field is exactly canceled by the spin contribution and only the part corresponding to the motion along the field remains. Hence
\[ \Pi_N^0 H_{B,Z,N}\Pi_N^0 = \Pi_N^0 H_{Z,N}\Pi_N^0 \tag{12} \]
with
\[ H_{Z,N} = \sum_{i=1}^{N}\Bigl(-\partial^2/\partial z_i^2 - \frac{Z}{|x_i|}\Bigr) + \sum_{i<j}\frac{1}{|x_i - x_j|}. \tag{13} \]
The operator (13) contains no derivatives perpendicular to the field and hence the variables $x^\perp$ can be regarded as parameters for a differential operator in the variables parallel to the field. For each $x^\perp$ such that $x_1^\perp, \ldots, x_N^\perp$ are all different from zero, we consider the one-dimensional Hamiltonian
\[ H_{Z,N}(x^\perp) = \sum_{i=1}^{N}\Bigl(-\partial_{z_i}^2 - \frac{Z}{\sqrt{z_i^2 + (x_i^\perp)^2}}\Bigr) + \sum_{i<j}\frac{1}{\sqrt{(z_i - z_j)^2 + (x_i^\perp - x_j^\perp)^2}} \tag{14} \]
acting on $\bigotimes^N L^2(\mathbb{R}) = L^2(\mathbb{R}^N)$. The expectation values of $H_{Z,N}$ can be written as
\[ \langle\Psi, H_{Z,N}\Psi\rangle = \int\langle\Psi(x^\perp, \cdot), H_{Z,N}(x^\perp)\Psi(x^\perp, \cdot)\rangle_{L^2(\mathbb{R}^N)}\,dx^\perp. \tag{15} \]
The next step is a scaling of the variables. In the lowest Landau level the characteristic length in the directions perpendicular to the field is $B^{-1/2}$. One can therefore expect that for the computation of $E^{\mathrm{conf}}(B, Z, N)$, i.e., the infimum of (15) over (normalized) $\Psi \in \mathcal{H}_N^0$, the properties of $H_{Z,N}(x^\perp)$ for $|x_i^\perp| \sim B^{-1/2}$ are decisive. Anticipating this, it is natural to make a transformation of variables, $(x^\perp, z) \to (B^{1/2}x^\perp, L(B)z)$, where the scale factor $L(B)$ in the direction of the field has still to be specified. The corresponding unitary operator on $L^2(\mathbb{R}^N)$ is
\[ (U\Psi)(z) = L(B)^{N/2}\,\Psi(L(B)z), \tag{16} \]
and the Hamiltonian transforms in the following way:
\[ U^{-1}H_{Z,N}(x^\perp)U = L(B)^2\,h^B_{Z,N}(B^{1/2}x^\perp), \tag{17} \]
where
\[ h^B_{Z,N}(y^\perp) = \sum_{i=1}^{N}\bigl(-\partial_{z_i}^2 - Z\,V_{B,|y_i^\perp|}(z_i)\bigr) + \sum_{i<j}V_{B,|y_i^\perp - y_j^\perp|}(z_i - z_j) \tag{18} \]
and the potential $V_{B,r}(z)$ is (for $r > 0$) defined as
\[ V_{B,r}(z) = L(B)^{-1}\bigl(B^{-1}L(B)^2 r^2 + z^2\bigr)^{-1/2}. \tag{19} \]
Let $E_{Z,N}(x^\perp)$ and $e^B_{Z,N}(y^\perp)$ denote the ground state energies of $H_{Z,N}(x^\perp)$ and $h^B_{Z,N}(y^\perp)$ respectively. In order to avoid discussions about the domains of the Hamiltonians, which in fact depend on whether some of the parameters $x_i^\perp$ (resp. $y_i^\perp$) coincide, we define the ground state energies in terms of quadratic forms in the same way as (8):
\[ E_{Z,N}(x^\perp) = \inf\{\langle\Psi, H_{Z,N}(x^\perp)\Psi\rangle : \Psi \in C_0^\infty(\mathbb{R}^N),\ \|\Psi\|_2 = 1\}, \tag{20} \]
\[ e^B_{Z,N}(y^\perp) = \inf\{\langle\Psi, h^B_{Z,N}(y^\perp)\Psi\rangle : \Psi \in C_0^\infty(\mathbb{R}^N),\ \|\Psi\|_2 = 1\}. \tag{21} \]
These energies are connected by the scaling relation
\[ E_{Z,N}(B^{-1/2}y^\perp)/L(B)^2 = e^B_{Z,N}(y^\perp). \tag{22} \]
In the next section we show that with the choice $L(B) \sim \ln B$ the potential $V_{B,r}(z)$ converges for each $r > 0$ in the sense of distributions to the delta function as $B \to \infty$. This is the heuristic basis of Theorem 1.1. Since the convergence is not uniform in $r$, however, more is needed for a rigorous proof. In particular, one needs estimates on the $r$-dependence of the convergence $V_{B,r}(z) \to \delta(z)$. These estimates, stated in Lemmas 2.1 and 2.2 in the next section, can be regarded as variants of Propositions 3.3 and 3.4 in [LSYa] and the Appendix in [JY], adapted to the problem at hand. They are included here for completeness. The upper bound on the energy, given in Sect. 3, is a straightforward variational calculation. The lower bound is more subtle. An important ingredient needed is the superharmonicity of the energy $E_{Z,N}(x^\perp)$ in the variables $x_i^\perp$. This result, established
in Theorem 4.3, generalizes a corresponding result (Proposition 2.3) in [LSYa]. Superharmonicity implies that the lowest value of $E_{Z,N}(B^{-1/2} y^\perp)$ for $|y_i^\perp| \ge \varepsilon$ with $\varepsilon > 0$ is obtained at the boundary of the variable range, i.e., when either $|y_i^\perp| = \varepsilon$ or $|y_i^\perp| \to \infty$. Variables tending to infinity can be ignored, since $V_{B,r}(z) \to 0$ for $r \to \infty$, so by this result one may in (15) restrict the attention to wave functions localized where $|x_i^\perp| \le (\mathrm{const.})B^{-1/2}$. On the other hand, the requirement that only wave functions in the lowest Landau band are taken into account in (15) plays the role of a “hard core condition” that prevents collapse, since such wave functions cannot be concentrated on shorter scales than $O(B^{-1/2})$. This statement is made precise in Lemma 5.3. The lower bound is obtained in Sect. 5 by combining Theorem 4.3, Lemma 5.3 and the convergence of the potentials $V_{B,r}$ discussed in Sect. 2. It is noteworthy that this lower bound holds also for bosonic statistics while the upper bound holds for fermionic statistics, so that altogether the convergence of $E(Z, N, B)/(\ln B)^2$ to $e(Z, N)$ is independent of the statistics. In Sect. 6 we discuss the delta-function model (7) and in particular prove Theorem 1.2. In the course of the proof we compare (7) with another model, whose ground state energy can be explicitly calculated. This model provides an upper bound for the ground state energy of (7) and has the same mean field limit. The Hamiltonian for this model is
\[
h_{Z,N} = \sum_{i=1}^{N} \left( p_i^2 - \delta(z_i) \right) + \frac{1}{2Z} \sum_{i<j} \delta(|z_i| - |z_j|). \tag{23}
\]
An interesting feature of this model is the fact that the maximal number $N_c$ of electrons that a nucleus of charge $Z$ can bind is exactly the largest integer satisfying
\[
N_c < 2Z + 1. \tag{24}
\]
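As a small illustration of condition (24), the following sketch computes the largest integer strictly below $2Z + 1$. The function name `max_bound_electrons` is hypothetical and not taken from the paper; exact rational arithmetic is used here only to avoid rounding problems when $2Z + 1$ happens to be an integer.

```python
import math
from fractions import Fraction

def max_bound_electrons(Z):
    """Largest integer N_c with N_c < 2Z + 1, cf. (24).

    Z may be an int, a Fraction, or a decimal string such as "0.7".
    The name of this helper is illustrative only.
    """
    bound = 2 * Fraction(Z) + 1
    # ceil(bound) - 1 is the largest integer strictly below `bound`,
    # also when `bound` itself is an integer.
    return math.ceil(bound) - 1

for Z in (1, 2, Fraction(3, 2), "0.7"):
    print(Z, max_bound_electrons(Z))
```

For integer $Z$ this returns $2Z$, in line with the strict inequality in (24).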
(This fact is unrelated to Lieb’s upper bound [L] for the maximal negative ionization of atoms that does not apply to the Pauli Hamiltonian with a homogeneous magnetic field.) A corresponding statement for the Hamiltonian (7) is not known, except in the mean field limit, cf. Theorem 1.2. In this connection it should be mentioned that an estimate of the form $N_c < 2Z + 1 + (\mathrm{const.})\, B^{1/2}$ has been derived in [BR] for a Hamiltonian of a similar type as (18).

2. The High B Limit of the Coulomb Interaction

We define the scaling factor $L(B)$ in the potential (19) as the solution of the equation
\[
B^{1/2} = L(B) \sinh[L(B)/2]. \tag{25}
\]
Since $\int_0^1 (a^2 + z^2)^{-1/2}\, dz = \sinh^{-1}(1/a)$, we have with this choice
\[
\int_{|z| \le r} V_{B,r}(z)\, dz = 1 \tag{26}
\]
for all $B$. Note also that
\[
L(B) = \ln B + O(\ln \ln B) \tag{27}
\]
as $B \to \infty$. Let $\psi \in H^1(\mathbb{R}) = \{\psi : \int |\psi|^2 + |d\psi/dz|^2 < \infty\}$. Every such $\psi$ is a continuous function on $\mathbb{R}$.
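The defining equation (25) and the normalization (26) can be checked numerically. The sketch below solves (25) by bisection and integrates the potential over $|z| \le r$ with a midpoint rule. It assumes the form $V_{B,r}(z) = L(B)^{-1}(z^2 + L(B)^2 B^{-1} r^2)^{-1/2}$, which is inferred here from (25) and (26) because definition (19) is not restated in this section; the function names are illustrative.

```python
import math

def scaling_factor(B):
    """Solve sqrt(B) = L * sinh(L / 2) for L > 0 by bisection, cf. (25)."""
    target = math.sqrt(B)
    lo, hi = 0.0, 1.0
    while hi * math.sinh(hi / 2.0) < target:   # bracket the root
        hi *= 2.0
    for _ in range(200):                       # L * sinh(L / 2) is increasing in L
        mid = 0.5 * (lo + hi)
        if mid * math.sinh(mid / 2.0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def potential(B, r, z, L):
    """ASSUMED form of V_{B,r}(z); see the remark above the code."""
    return 1.0 / (L * math.sqrt(z * z + (L * L / B) * r * r))

def mass_inside(B, r, n=200_000):
    """Midpoint-rule approximation of the integral of V_{B,r} over |z| <= r."""
    L = scaling_factor(B)
    h = 2.0 * r / n
    return h * sum(potential(B, r, -r + (k + 0.5) * h, L) for k in range(n))

# Moderate fields only: for very large B the peak of V becomes narrower
# than the grid spacing and the simple quadrature is no longer reliable.
rows = [(B, scaling_factor(B), math.log(B), mass_inside(B, 1.0)) for B in (1e2, 1e4, 1e8)]
for B, L, lnB, mass in rows:
    print(f"B={B:.0e}  L(B)={L:.4f}  ln B={lnB:.4f}  integral={mass:.6f}")
```

Under the stated assumption the last column should stay close to 1 for every $B$, while the difference between $L(B)$ and $\ln B$ grows only slowly, as (27) suggests.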
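Lemma 2.1 (Delta approximation, part 1).
\[
\left| |\psi(0)|^2 - \int V_{B,r}(z) |\psi(z)|^2\, dz \right| \le L(B)^{-1} \left( \lambda r^{-1} + 8 \lambda^{1/4} T^{3/4} r^{1/2} \right) \tag{28}
\]
with $\lambda = \int |\psi|^2$, $T = \int |d\psi/dz|^2$.

Proof. It suffices to take $r = 1$, for the general case follows by scaling $z \to rz$. Write the difference on the left side of (28) as $A_1 + A_2$ with
\[
A_1 = -\int_{|z| \ge 1} V_{B,1}(z) |\psi(z)|^2\, dz, \tag{29}
\]
\[
A_2 = \int_{|z| \le 1} V_{B,1}(z) \left( |\psi(0)|^2 - |\psi(z)|^2 \right) dz. \tag{30}
\]
The missing term
\[
A_3 = \left( 1 - \int_{|z| \le 1} V_{B,1}(z)\, dz \right) |\psi(0)|^2 \tag{31}
\]
is zero because of (26). Since $|V_{B,1}(z)| \le L(B)^{-1}$ for $|z| \ge 1$, we have
\[
|A_1| \le \lambda L(B)^{-1}. \tag{32}
\]
For $|z| \le 1$ we have in any case
\[
|V_{B,1}(z)| \le L(B)^{-1} |z|^{-1}. \tag{33}
\]
Moreover,
\begin{align}
\left| |\psi(z)|^2 - |\psi(0)|^2 \right| &\le |\psi(z) - \psi(0)|\, [|\psi(z)| + |\psi(0)|] \notag \\
&\le \int_0^z \left| \frac{d\psi}{dz'} \right| dz' \cdot 2 \left( \int_{-\infty}^{\infty} \left| \frac{d|\psi(z')|^2}{dz'} \right| dz' \right)^{1/2} \notag \\
&\le |z|^{1/2} T^{1/2}\, 2 \lambda^{1/4} T^{1/4} = 2 \lambda^{1/4} T^{3/4} |z|^{1/2}. \tag{34}
\end{align}
Hence
\[
|A_2| \le 2 L(B)^{-1} \int_{|z| \le 1} |z|^{-1/2}\, dz\; \lambda^{1/4} T^{3/4} = 8 L(B)^{-1} \lambda^{1/4} T^{3/4}. \tag{35}
\]
Combining the estimates for $A_1$ and $A_2$ gives (28).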
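The bound of Lemma 2.1 can be tested on a concrete function. The sketch below takes the Gaussian $\psi(z) = e^{-z^2/2}$, for which $|\psi(0)|^2 = 1$, $\lambda = \sqrt{\pi}$ and $T = \sqrt{\pi}/2$, and compares the two sides of (28) by numerical quadrature. As before, the explicit form of $V_{B,r}$ is an assumption inferred from (25) and (26), and the function names, grid size and cutoff are illustrative choices.

```python
import math

def scaling_factor(B):
    """Solve sqrt(B) = L * sinh(L / 2) for L > 0 by bisection, cf. (25)."""
    target = math.sqrt(B)
    lo, hi = 0.0, 1.0
    while hi * math.sinh(hi / 2.0) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.sinh(mid / 2.0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_error_and_bound(B, r, n=200_000, cutoff=10.0):
    """Return (left side, right side) of (28) for psi(z) = exp(-z**2 / 2)."""
    L = scaling_factor(B)
    width2 = (L * L / B) * r * r            # ASSUMED regularization in V_{B,r}
    h = 2.0 * cutoff / n
    integral = 0.0
    for k in range(n):                      # midpoint rule on [-cutoff, cutoff]
        z = -cutoff + (k + 0.5) * h
        integral += math.exp(-z * z) / (L * math.sqrt(z * z + width2))
    integral *= h
    lam = math.sqrt(math.pi)                # integral of |psi|^2
    T = math.sqrt(math.pi) / 2.0            # integral of |psi'|^2
    lhs = abs(1.0 - integral)               # |psi(0)|^2 = 1
    rhs = (lam / r + 8.0 * lam**0.25 * T**0.75 * math.sqrt(r)) / L
    return lhs, rhs

checks = [(B, r) + delta_error_and_bound(B, r)
          for B, r in ((1e6, 1.0), (1e6, 0.1), (1e4, 10.0))]
for B, r, lhs, rhs in checks:
    print(f"B={B:.0e} r={r}: |error|={lhs:.4f} <= bound={rhs:.4f}")
```

Both columns decay like $L(B)^{-1}$, which is the logarithmically slow convergence to the delta function that the lemma quantifies.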
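Lemma 2.2 (Delta approximation, part 2). Let $\Psi \in H^1(\mathbb{R}^2)$ and put
\[
\lambda = \int |\Psi(z, z')|^2\, dz\, dz', \qquad T = \int |\partial_z \Psi(z, z')|^2\, dz\, dz'. \tag{36}
\]
Then
\[
\left| \int |\Psi(z, z)|^2\, dz - \int V_{B,r}(z - z') |\Psi(z, z')|^2\, dz\, dz' \right| \le L(B)^{-1} [\lambda r^{-1} + 8 \lambda^{1/4} T^{3/4} r^{1/2}]. \tag{37}
\]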
Proof. Put $\lambda(z) = \int |\Psi(z, z')|^2\, dz'$, $T(z) = \int |\partial_z \Psi(z, z')|^2\, dz'$. By (28) we have
\[
\left| |\Psi(z, z)|^2 - \int V_{B,r}(z - z') |\Psi(z, z')|^2\, dz' \right| \le L(B)^{-1} [\lambda(z) r^{-1} + 8 \lambda(z)^{1/4} T(z)^{3/4} r^{1/2}]. \tag{38}
\]
Integration over $z$, using the Hölder inequality to estimate $\int \lambda(z)^{1/4} T(z)^{3/4}\, dz$, gives (37).

3. Upper Bound

Let $\psi \in \mathcal{S}(\mathbb{R}^{2N})$ be a smooth and rapidly decreasing wave function in the lowest Landau level at field strength 1, and let $\phi \in C_0^\infty(\mathbb{R}^N)$. If $\psi$ and $\phi$ are normalized, i.e., $\int_{\mathbb{R}^{2N}} |\psi|^2 = \int_{\mathbb{R}^N} |\phi|^2 = 1$, then
\[
\Psi_B(x^\perp, z) = (B L(B))^{N/2}\, \psi(B^{1/2} x^\perp)\, \phi(L(B) z) \tag{39}
\]
is a normalized wave function in the lowest Landau band at field strength $B$. Moreover, using (15) and (17) we have
\[
E(B, Z, N) \le \langle \Psi_B, H_{Z,N} \Psi_B \rangle = L(B)^2 \int |\psi(y^\perp)|^2 \langle \phi, h^B_{Z,N}(y^\perp) \phi \rangle\, d^{2N} y^\perp,
\]
where $h^B_{Z,N}(y^\perp)$ is given by (18). Since $L(B)^2/(\ln B)^2 \to 1$ as $B \to \infty$ and $\psi$ is normalized, one has for the upper bound in Theorem 1.1 only to check that
\[
\int |\psi(y^\perp)|^2\, V_{B,|y_i^\perp|}(z_i)\, |\phi(z)|^2\, d^{2N} y^\perp\, d^N z \to \int \delta(z_i) |\phi(z)|^2\, d^N z
\]
and
\[
\int |\psi(y^\perp)|^2\, V_{B,|y_i^\perp - y_j^\perp|}(z_i - z_j)\, |\phi(z)|^2\, d^{2N} y^\perp\, d^N z \to \int \delta(z_i - z_j) |\phi(z)|^2\, d^N z
\]
as $B \to \infty$. But this is taken care of by Lemmas 2.1 and 2.2. (That $V_{B,r}(z)$ is not defined for $r = 0$ is of no consequence here, because the error terms in (28) and (37) are integrable all the way to $r = 0$.) We therefore have

Proposition 3.1 (Upper bound).
\[
\limsup_{B \to \infty} \frac{E(B, Z, N)}{(\ln B)^2} \le e(Z, N). \tag{40}
\]
Remark. It is clear that our upper bound holds for fermions, although $e(Z, N)$ is the bosonic ground state energy of (7). In fact, in the ansatz (39) above we may choose $\psi$ to be antisymmetric and $\phi$ to be symmetric; then $\Psi_B$ is antisymmetric. Note also that for the Hamiltonian (7) the bosonic ground state energy is the same as its ground state energy without symmetry restriction.
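The normalization of the ansatz (39) follows from a change of variables, which may be written out as follows. The substitution variables $y^\perp = B^{1/2} x^\perp$ and $u = L(B) z$ are notation introduced here for this check only; the Jacobians are $B^{-N}$ (from $2N$ transverse coordinates, each scaled by $B^{-1/2}$) and $L(B)^{-N}$.

```latex
\begin{align*}
\int |\Psi_B(x^\perp, z)|^2 \, d^{2N}x^\perp \, d^N z
  &= (B L(B))^N \int |\psi(B^{1/2} x^\perp)|^2 \, d^{2N}x^\perp
     \int |\phi(L(B) z)|^2 \, d^N z \\
  &= (B L(B))^N \, B^{-N} L(B)^{-N}
     \int |\psi(y^\perp)|^2 \, d^{2N}y^\perp
     \int |\phi(u)|^2 \, d^N u \\
  &= 1 .
\end{align*}
```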
4. Superharmonicity

In this section we take a closer look at the dependence of the ground state energy $E_{Z,N}(x^\perp)$ of the Hamiltonian (14) on the parameter $x^\perp$. We start with a simple estimate:

Lemma 4.1 (Simple bounds). The function $x^\perp \mapsto E_{Z,N}(x^\perp)$ satisfies the bounds
\[
-\sum_{i=1}^{N} Z^2 \left( 1 + \left[ \sinh^{-1}\left( (Z |x_i^\perp|)^{-1} \right) \right]^2 \right) \le E_{Z,N}(x^\perp) \le 0 \tag{41}
\]
on the set
\[
A = \{ x^\perp \in \mathbb{R}^{2N} : x_i^\perp \ne 0, \text{ for all } i = 1, \dots, N \}. \tag{42}
\]

Proof. The non-positivity of $E$ is straightforward from the definition by an appropriate choice of $\Psi$. Note that this also holds in the case where some of the $x_i^\perp$ variables coincide. The lower bound on $E_{Z,N}(x^\perp)$ follows from Lemma 2.1 in [LSYa] together with the operator inequality
\[
H_{Z,N}(x^\perp) \ge \sum_{i=1}^{N} \left( -\partial_{z_i}^2 - \frac{Z}{\sqrt{z_i^2 + (x_i^\perp)^2}} \right),
\]
which is obtained by ignoring the positive two-body interactions.
Next we turn to the superharmonicity properties of $E_{Z,N}(x^\perp)$. We shall need the following general result.

Lemma 4.2 (Inherited superharmonicity). Let $U$ be an open set in $\mathbb{R}^d$ and assume that $f : U \times \mathbb{R} \to (-\infty, \infty]$ is a superharmonic function with the property that
\[
b = \min\{ \liminf_{t \to \infty} f(x, t),\ \liminf_{t \to -\infty} f(x, t) \}
\]
is independent of $x$ for all $x \in U$. Then $g(x) = \inf_t f(x, t)$ is a superharmonic function on $U$.

Proof. We shall prove this by showing that $\Delta g \le 0$ as a distribution. We shall use that $f$ is a lower semicontinuous function satisfying the mean value inequality
\[
\int_{|(x,t) - (y,s)| \le r} f(y, s)\, dy\, ds \le f(x, t)\, c_{d+1} r^{d+1},
\]
for all $(x, t) \in U \times \mathbb{R}$ if $r > 0$ is small enough, where $c_{d+1}$ is the volume of the unit ball in $\mathbb{R}^{d+1}$. For $x \in U$ it follows from the lower semicontinuity of $f$ that we have either $g(x) = b$ or there exists $t \in \mathbb{R}$ such that $g(x) = f(x, t)$. In the first case we obviously have
\[
c_{d+1} r^{d+1} g(x) \ge 2 \int_{|x - y| \le r} g(y) \sqrt{r^2 - (x - y)^2}\, dy \tag{43}
\]
since $g(y) \le b$ for all $y$. If $g(x) < b$ we also conclude the above inequality since
\begin{align*}
g(x)\, c_{d+1} r^{d+1} = f(x, t)\, c_{d+1} r^{d+1} &\ge \int_{|(x,t) - (y,s)| \le r} f(y, s)\, dy\, ds \\
&\ge \int_{|(x,t) - (y,s)| \le r} g(y)\, dy\, ds = 2 \int_{|x - y| \le r} g(y) \sqrt{r^2 - (x - y)^2}\, dy.
\end{align*}
Note now that for any $\phi \in C_0^\infty(U)$ we have for any $x \in U$ that
\[
\lim_{r \to 0} r^{-(d+3)} \int_{|x - y| \le r} [\phi(y) - \phi(x)] \sqrt{r^2 - (x - y)^2}\, dy = C \Delta \phi(x)
\]
for some constant $C > 0$ and in fact this limit holds in the topology of $C_0^\infty(U)$. Thus if $\phi \ge 0$ we have
\[
\int_U g(x) \Delta \phi(x)\, dx = C^{-1} \lim_{r \to 0} r^{-(d+3)} \int_U \int_{|x - y| \le r} g(x) (\phi(y) - \phi(x)) \sqrt{r^2 - (x - y)^2}\, dy\, dx \le 0
\]
by the inequality (43). Hence $\Delta g \le 0$.
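The limit formula used in the proof of Lemma 4.2 can be illustrated in the simplest case $d = 1$, where a Taylor expansion of $\phi$ gives $C = \pi/16$ (this value is worked out here for the sketch and is not quoted from the paper). The code below evaluates the weighted average for $\phi = \sin$ and compares it with $\phi''(x)$; the test function, the evaluation point and the grid size are arbitrary choices.

```python
import math

def scaled_average(phi, x, r, n=20_000):
    """r**-(d+3) * integral of [phi(y) - phi(x)] * sqrt(r**2 - (x - y)**2)
    over |x - y| <= r, for d = 1, using the midpoint rule."""
    h = 2.0 * r / n
    total = 0.0
    for k in range(n):
        u = -r + (k + 0.5) * h              # u = y - x
        total += (phi(x + u) - phi(x)) * math.sqrt(r * r - u * u)
    return total * h / r**4                 # d + 3 = 4 when d = 1

x0 = 0.3
second_derivative = -math.sin(x0)           # (sin)'' = -sin
ratios = {r: scaled_average(math.sin, x0, r) / second_derivative
          for r in (0.4, 0.2, 0.1, 0.05)}
for r, q in ratios.items():
    print(f"r={r}: ratio={q:.6f}  (pi/16={math.pi / 16:.6f})")
```

The ratio approaches $\pi/16$ as $r \to 0$, with a correction of relative order $r^2$.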
Theorem 4.3 (Superharmonicity of the energy). On the set $A$ defined in (42) the function $x^\perp \mapsto E_{Z,N}(x^\perp)$ is superharmonic in each of the variables $x_i^\perp$, $i = 1, \dots, N$ independently.

Proof. We follow closely the proof of Prop. 2.3 in [LSYa], which stated the superharmonicity of the ground state energy of a one-body operator which can be considered as a mean field approximation of $H_{Z,N}(x^\perp)$. It is clearly enough to prove that $E_{Z,N}(x^\perp)$ is superharmonic in $x_1^\perp$ (on the region $x_1^\perp \ne 0$) for $x_2^\perp, \dots, x_N^\perp$ fixed. We shall prove this by showing that $x_1^\perp \mapsto E_{Z,N}(x^\perp)$ satisfies the mean value inequality around any given point $x_{1,0}^\perp$. Let $x_0^\perp = (x_{1,0}^\perp, x_2^\perp, \dots, x_N^\perp)$. Choose a sequence of $L^2$ normalized functions $\Psi_n \in C_0^\infty(\mathbb{R}^N)$ such that $\langle \Psi_n, H_{Z,N}(x_0^\perp) \Psi_n \rangle \to E_{Z,N}(x_0^\perp)$ as $n \to \infty$. For $w \in \mathbb{R}$ denote by $\Psi_n^{(w)}$ the function $\Psi_n^{(w)}(z_1, \dots, z_N) = \Psi_n(z_1 - w, z_2, \dots, z_N)$. We clearly have
\[
\inf_{w \in \mathbb{R}} \langle \Psi_n^{(w)}, H_{Z,N}(x_0^\perp) \Psi_n^{(w)} \rangle \to E_{Z,N}(x_0^\perp) \quad \text{as } n \to \infty.
\]
If $x_1^\perp$ is close to $x_{1,0}^\perp$ we shall use $\Psi_n^{(w)}$ as a trial function for $H(x^\perp)$. We then obtain
\[
E_{Z,N}(x^\perp) \le \liminf_n \inf_{w \in \mathbb{R}} \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp) \Psi_n^{(w)} \rangle.
\]
Hence
\[
E_{Z,N}(x^\perp) - E_{Z,N}(x_0^\perp) \le \liminf_n \left( \inf_{w \in \mathbb{R}} \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp) \Psi_n^{(w)} \rangle - \inf_{v \in \mathbb{R}} \langle \Psi_n^{(v)}, H_{Z,N}(x_0^\perp) \Psi_n^{(v)} \rangle \right). \tag{44}
\]
The potential appearing in $H_{Z,N}(x^\perp)$, i.e.,
\[
W_{Z,N,x^\perp}(z_1, \dots, z_N) = -\sum_{i=1}^{N} \frac{Z}{\sqrt{z_i^2 + (x_i^\perp)^2}} + \sum_{i<j}^{N} \frac{1}{\sqrt{(z_i - z_j)^2 + (x_i^\perp - x_j^\perp)^2}}
\]
is a superharmonic function of $(z_1, x_1^\perp) \in \mathbb{R}^3 \setminus \{0\}$. Writing
\[
\langle \Psi_n^{(w)}, W_{Z,N,x^\perp} \Psi_n^{(w)} \rangle = \int W_{Z,N,x^\perp}(z_1 + w, z_2, \dots, z_N)\, |\Psi_n(z_1, \dots, z_N)|^2\, dz_1 \cdots dz_N
\]
we see that $\langle \Psi_n^{(w)}, W_{Z,N,x^\perp} \Psi_n^{(w)} \rangle$ is superharmonic in $(w, x_1^\perp)$ away from the line $x_1^\perp = 0$. Since $\langle \Psi_n^{(w)}, \partial_{z_i}^2 \Psi_n^{(w)} \rangle$ is independent of $w$ and $x_1^\perp$ for all $i = 1, \dots, N$ we have that $\langle \Psi_n^{(w)}, H_{Z,N}(x^\perp) \Psi_n^{(w)} \rangle$ is superharmonic in $(w, x_1^\perp)$ away from the line $x_1^\perp = 0$. Moreover, we also have that the two limits
\[
\liminf_{w \to \pm\infty} \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp) \Psi_n^{(w)} \rangle
\]
are independent of $x_1^\perp$. This is true simply because the contribution from the terms in the Hamiltonian depending on $x_1^\perp$ tends to zero as $w \to \pm\infty$. We may therefore apply the above lemma to the function $f(w, x_1^\perp) = \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp) \Psi_n^{(w)} \rangle$. We conclude that the function
\[
x_1^\perp \mapsto \inf_{w \in \mathbb{R}} \langle \Psi_n^{(w)}, H_{Z,N}(x^\perp) \Psi_n^{(w)} \rangle
\]
is superharmonic for $x_1^\perp \ne 0$. Moreover by the inequality (41) this function is bounded below if $|x_1^\perp|$ is bounded away from 0. Now using Fatou’s Lemma we see from (44) that the average of $E_{Z,N}(x^\perp) - E_{Z,N}(x_0^\perp)$ over the set $\{x_1^\perp : |x_1^\perp - x_{1,0}^\perp| < r\}$ is non-positive for all $r > 0$ small enough.

5. Lower Bound

The first lemma in this section concerns the ground state energy $e^B_{Z,N}(y^\perp)$ of $h^B_{Z,N}(y^\perp)$ and does not use superharmonicity.

Lemma 5.1 (Lower bound on $e^B_{Z,N}(y^\perp)$). Let $K$ be a compact subset of the set $A$ given in (42). Then
\[
\liminf_{B \to \infty} \inf_{y^\perp \in K} e^B_{Z,N}(y^\perp) \ge e(Z, N). \tag{45}
\]
Proof. To avoid problems at points $y^\perp$ with $y_i^\perp - y_j^\perp = 0$ for some $i, j$, we replace the repulsive potential $V_{B,|y_i^\perp - y_j^\perp|}(z_i - z_j)$ by the smaller potential $V_{B,|y_i^\perp - y_j^\perp| + 1}(z_i - z_j)$. We denote the corresponding Hamiltonian by $\tilde{h}^B_{Z,N}(y^\perp)$ and its ground state energy by $\tilde{e}^B_{Z,N}(y^\perp)$. It is obvious that $e^B_{Z,N}(y^\perp) \ge \tilde{e}^B_{Z,N}(y^\perp)$, so a lower bound on $\tilde{e}^B_{Z,N}(y^\perp)$ gives a lower bound on $e^B_{Z,N}(y^\perp)$.

Let $\Psi$ be a normalized, symmetric wavefunction in $C_0^\infty(\mathbb{R}^N)$. Since $\langle \Psi, h_{Z,N} \Psi \rangle \ge e(Z, N)$ we have to estimate the matrix elements of the difference $\tilde{h}^B_{Z,N}(y^\perp) - h_{Z,N}$. Using Lemmas 2.1 and 2.2, together with the Hölder inequality for the integration over $z_2, \dots, z_N$ and $z_3, \dots, z_N$ respectively, we obtain
\[
\left| \langle \Psi, \tilde{h}^B_{Z,N} \Psi \rangle - \langle \Psi, h_{Z,N} \Psi \rangle \right| \le L(B)^{-1} (ZN + N(N - 1)) \times \left[ r_{\min}^{-1} + 8 T^{3/4} (2 r_{\max} + 1)^{1/2} \right], \tag{46}
\]
where $r_{\min}$ and $r_{\max}$ are respectively the minimum and the maximum value of $|y_i^\perp|$, $i = 1, \dots, N$, with $y^\perp \in K$, and
\[
T = N \int |\partial_z \Psi(z, z_2, \dots, z_N)|^2\, dz\, dz_2 \cdots dz_N \tag{47}
\]
is the kinetic energy of $\Psi$. Now if $\Psi^B_{y^\perp,n}$, $n = 1, 2, \dots$ is a minimizing sequence of normalized wave functions for $\tilde{h}^B_{Z,N}(y^\perp)$, then we may assume that the corresponding kinetic energy is uniformly bounded in $n$, $B$ and $y^\perp \in K$. In fact, we may assume that $\langle \Psi^B_{y^\perp,n}, \tilde{h}^B_{Z,N}(y^\perp) \Psi^B_{y^\perp,n} \rangle$ is a bounded sequence. If we use the bound from Lemma 2.1 in [LSYa], we obtain
\[
\langle \Psi^B_{y^\perp,n}, \tilde{h}^B_{Z,N}(y^\perp) \Psi^B_{y^\perp,n} \rangle \ge \frac{T_n}{2} - \frac{1}{2} \left( \frac{2Z}{L(B)} \right)^2 \left( 1 + \left[ \sinh^{-1}\{ (2Z)^{-1} B^{1/2} \} \right]^2 \right), \tag{48}
\]
where we have saved half of the kinetic energy $T_n$ of $\Psi^B_{y^\perp,n}$. For large $B$, $L(B)^{-1} \sinh^{-1}\{(2Z)^{-1} B^{1/2}\}$ is bounded and hence we see that $T_n$ is bounded. The error term (46) with $\Psi = \Psi^B_{y^\perp,n}$ thus tends to zero as $B \to \infty$, uniformly in $n$, and the lemma is established.
Lemma 5.2 (Uniform bounds on $E_{Z,N}(x^\perp)$). Let $\varepsilon > 0$. Consider the set
\[
C^{B,\varepsilon} = \{ x^\perp : \varepsilon B^{-1/2} \le |x_i^\perp|, \text{ for all } i = 1, \dots, N \}. \tag{49}
\]
Then
\[
\liminf_{B \to \infty}\, (\ln B)^{-2} \inf\{ E_{Z,N}(x^\perp) : x^\perp \in C^{B,\varepsilon} \} \ge e(Z, N), \tag{50}
\]
where $e(Z, N)$ as before denotes the 1-dimensional delta function atom energy.

Proof. Define the sets $C_n^{B,\varepsilon} = \{ x^\perp : \varepsilon B^{-1/2} \le |x_i^\perp| \le n, \text{ for all } i = 1, \dots, N \}$. Since $C_n^{B,\varepsilon}$ is compact and $E_{Z,N}$ is lower semicontinuous (being superharmonic, in fact, superharmonic in each variable) we may find $x_n^\perp \in C_n^{B,\varepsilon}$ such that
\[
E_{Z,N}(x_n^\perp) = \min\{ E_{Z,N}(x^\perp) : x^\perp \in C_n^{B,\varepsilon} \}.
\]
Clearly,
\[
\lim_{n \to \infty} E_{Z,N}(x_n^\perp) = \inf\{ E_{Z,N}(x^\perp) : x^\perp \in C^{B,\varepsilon} \}.
\]
By the superharmonicity of $E_{Z,N}(x^\perp)$ in each variable $x_i^\perp$ we know that each coordinate $x_{i,n}^\perp$ of the point $x_n^\perp$ satisfies either $|x_{i,n}^\perp| = \varepsilon B^{-1/2}$ or $|x_{i,n}^\perp| = n$. Moreover, since $E_{Z,N}(x^\perp)$ is invariant under permutations of the coordinates of $x^\perp$ we may assume that $|x_{1,n}^\perp| \le |x_{2,n}^\perp| \le \cdots \le |x_{N,n}^\perp|$ for all $n$. By possibly going to a subsequence we may assume that there exists an integer $0 \le K \le N$ such that for $n$ large enough
\[
|x_{i,n}^\perp| = \begin{cases} \varepsilon B^{-1/2}, & \text{for } i = 1, \dots, K \\ n, & \text{for } i > K. \end{cases}
\]
Moreover, we may assume that $x_{i,n}^\perp$ converges as $n \to \infty$ for $i = 1, \dots, K$. Since we may ignore the variables $x_{i,n}^\perp$, $i = K + 1, \dots, N$, which tend to infinity we have
\[
\lim_{n \to \infty} E_{Z,N}(x_n^\perp) / E_{Z,K}(x_{1,n}^\perp, \dots, x_{K,n}^\perp) = 1.
\]
Since $E_{Z,K}(x^\perp)$ is lower semicontinuous we conclude that there exists a point $(x_{1,\infty}^\perp, \dots, x_{K,\infty}^\perp) \in \mathbb{R}^{2K}$ with $|x_{i,\infty}^\perp| = \varepsilon B^{-1/2}$ for all $i = 1, \dots, K$ such that
\[
\inf\{ E_{Z,N}(x^\perp) : x^\perp \in C^{B,\varepsilon} \} = E_{Z,K}(x_{1,\infty}^\perp, \dots, x_{K,\infty}^\perp).
\]
By Lemma 5.1 we have that
\[
\liminf_{B \to \infty} \inf\{ L(B)^{-2} E_{Z,K}(B^{-1/2} y_1^\perp, \dots, B^{-1/2} y_K^\perp) : |y_i^\perp| = \varepsilon, \text{ for all } i \} \ge e(Z, K).
\]
Since $K \le N$ and hence $e(Z, K) \ge e(Z, N)$ we have proved the lemma.
Lemma 5.3 (Wave functions in the lowest Landau band). If $\Psi \in H_N^0$ belongs to the lowest Landau band at field strength $B$, then $\int |\Psi(x^\perp, z)|^2\, dz$ is a bounded function of $x^\perp$ (possibly after a modification on a null set) and for all $1 \le n \le N$,
\[
\sup_{x_1^\perp, \dots, x_n^\perp} \int |\Psi(x^\perp, z)|^2\, dz\, dx_{n+1}^\perp \cdots dx_N^\perp \le \frac{B^n}{(2\pi)^n} \|\Psi\|^2. \tag{51}
\]

Proof. The projector $\Pi_N^0$ on the lowest Landau band is the $N$th tensorial power of the projector $\Pi^0$ that operates on $L^2(\mathbb{R}^3; \mathbb{C}^2)$ and is given by the integral kernel
\[
\Pi^0(x, x') = \Pi_\perp^0(x^\perp, x'^\perp)\, \delta(z - z')\, P^\downarrow, \tag{52}
\]
where
\[
\Pi_\perp^0(x^\perp, x'^\perp) = \frac{B}{2\pi} \exp\left\{ \frac{i}{2} (x^\perp \times x'^\perp) \cdot B - \frac{1}{4} (x^\perp - x'^\perp)^2 B \right\} \tag{53}
\]
and $P^\downarrow$ is the projector on vectors in $\mathbb{C}^2$ with spin component $-1/2$. The kernel $\Pi_\perp^0(x^\perp, x'^\perp)$ is a continuous function with
\[
\int \Pi_\perp^0(x^\perp, u^\perp)\, \Pi_\perp^0(u^\perp, y^\perp)\, du^\perp = \Pi_\perp^0(x^\perp, y^\perp) \tag{54}
\]
and
\[
\Pi_\perp^0(x^\perp, x^\perp) = \frac{B}{2\pi} \tag{55}
\]
for all $x^\perp$. A wave function $\Psi$ in the lowest Landau band has the representation $\Psi = \Pi_N^0 \Psi$. After writing $\Pi_N^0$ as an integral operator (51) follows from the Cauchy–Schwarz inequality, using (54) and (55).

Proposition 5.4 (Lower bound).
\[
\liminf_{B \to \infty} \frac{E(B, Z, N)}{(\ln B)^2} \ge e(Z, N). \tag{56}
\]
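Proof. For fixed $B$ let $\Psi$ be a normalized wave function in the lowest Landau band. By (15) we have
\[
\langle \Psi, H_{Z,N} \Psi \rangle \ge \int E_{Z,N}(x^\perp) \left( \int |\Psi(x^\perp, z)|^2\, dz \right) dx^\perp. \tag{57}
\]
We split the integral over $x^\perp$ into an integral over $C^{B,\varepsilon}$ (defined in (49)) and its complement in $\mathbb{R}^{2N}$. By Lemma 5.2 we have only to consider the latter. Using the estimate (41) the task is to bound terms of the form
\[
\int_{|x_i^\perp| \le \varepsilon B^{-1/2}} \left( 1 + \left[ \sinh^{-1}\left( (Z |x_j^\perp|)^{-1} \right) \right]^2 \right) \left( \int |\Psi(x^\perp, z)|^2\, dz \right) dx^\perp \tag{58}
\]
from above. If $i = j$ we carry out the integration over all $x_k^\perp$ with $k \ne i$ and use Lemma 5.3 for the remaining variable $x_i^\perp$. For small $r$, $|\sinh^{-1} r^{-1}| \le (\mathrm{const.}) |\ln r|$ and the term can be estimated by
\[
(\mathrm{const.}) \int_{|x^\perp| \le \varepsilon B^{-1/2}} (\ln |x^\perp|)^2\, B\, dx^\perp \le (\mathrm{const.})\, \varepsilon^2 (\ln B)^2. \tag{59}
\]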
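The size of the integral in (59) can be made explicit. Substituting $x^\perp = B^{-1/2} y^\perp$ and integrating in polar coordinates gives the closed form $\pi \varepsilon^2 (u^2 - u + \tfrac12)$ with $u = \ln \varepsilon - \tfrac12 \ln B$; this formula is derived here for the sketch and is not quoted from the paper. The code below cross-checks it against a direct quadrature and prints the ratio to $\varepsilon^2 (\ln B)^2$, which plays the role of the unspecified constant.

```python
import math

def log_integral_exact(B, eps):
    """Closed form of B * integral of (ln|x|)^2 over the disc |x| <= eps / sqrt(B)."""
    u = math.log(eps) - 0.5 * math.log(B)
    return math.pi * eps**2 * (u * u - u + 0.5)

def log_integral_numeric(B, eps, n=200_000):
    """Same quantity by the midpoint rule in the radial variable."""
    R = eps / math.sqrt(B)
    h = R / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * h
        total += math.log(rho) ** 2 * rho    # integrand times the polar weight rho
    return B * 2.0 * math.pi * total * h

eps = 0.1
fields = (1e4, 1e8, 1e16, 1e32)
ratios = [log_integral_exact(B, eps) / (eps**2 * math.log(B) ** 2) for B in fields]
for B, q in zip(fields, ratios):
    print(f"B={B:.0e}: integral / (eps^2 (ln B)^2) = {q:.4f}")
```

For fixed $\varepsilon < 1$ the ratio decreases towards $\pi/4$ as $B$ grows, so it stays bounded for large $B$, consistent with the right side of (59).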
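For $i \ne j$ we split the integration over $x_j^\perp$ into two parts, namely $|x_j^\perp| \le B^{-1/2}$ and $|x_j^\perp| \ge B^{-1/2}$. For the first part we obtain the following bound, after transforming variables and using Lemma 5.3, this time for $n = 2$,
\[
(\mathrm{const.})\, \varepsilon^2 \int_{|y_i^\perp| \le 1,\, |y_j^\perp| \le 1} \left( \ln\left( B^{-1/2} |y_j^\perp| \right) \right)^2 dy_i^\perp\, dy_j^\perp \le (\mathrm{const.})\, \varepsilon^2 (\ln B)^2. \tag{60}
\]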
For the integral over |xj⊥ | ≥ B −1/2 we estimate | sinh−1 (Z|xj⊥ |−1 )|2 by its maximum value, ≤ (const.)(ln B)2 and obtain for this part of the integral the upper bound 2 ⊥ 2 (1 + (const.)(ln B) ) |(x , z)| dz dx ⊥ ≤
|xi⊥ | f (z)/f (z). This inequality can be integrated to give g(z)/g(0) > f (z)/f (0), a contradiction to the assumption of the square-integrability of g(z). Therefore we know that the Hamiltonian (82) has no negative eigenvalue. And so the operator inequality holds. The {Wb (z)} and hence {ZwZ,a,b (z)} are δ-sequences in the limit b → ∞. All these functions are positive definite, and finite at the origin: b . (84) 2Z 2 a With this tool we can now deduce the lower bound for the many body Hamiltonian: wZ,a,b (0)