Commun. Math. Phys. 203, 1 – 19 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Effective Dynamics for a Mechanical Particle Coupled to a Wave Field

Alexander Komech^{1,*}, Markus Kunze^{2,**}, Herbert Spohn^{3}

1 Department of Mechanics and Mathematics of Moscow State University, Moscow 119899, Russia. E-mail: [email protected]
2 Mathematisches Institut der Universität Köln, Weyertal 86, D-50931 Köln, Germany. E-mail: [email protected]
3 Zentrum Mathematik, TU München, D-80290 München, Germany. E-mail: [email protected]

Received: 2 September 1998 / Accepted: 13 November 1998
Abstract: We consider a particle coupled to a scalar wave field and subject to the slowly varying potential $V(\varepsilon q)$ with small $\varepsilon$. We prove that if the initial state is close, of order $\varepsilon^2$, to a soliton (= dressed particle), then the solution stays forever close to the soliton manifold. This estimate implies that over a time span of order $\varepsilon^{-1}$ the radiation losses are negligible and that the motion of the particle is governed by the effective Hamiltonian $H_{\mathrm{eff}}(q,P) = E(P) + V(\varepsilon q)$ with energy-momentum relation $E(P)$.
1. Introduction

When a particle interacts with a field its mechanical properties are renormalized, e.g. the particle acquires an effective mass. In the context of charges interacting with the Maxwell field such an effective energy-momentum relation is discussed at length already in the classical work of Abraham [1] and Lorentz [16] with the implicit understanding that this relation determines how the particle responds to external forces. Kramers [14] emphasizes the distinction between bare (appearing in the equation of motion) and physical (observable by outside means) parameters of a charge. His vision has been implemented through the renormalization of quantum electrodynamics. To our knowledge, even on the classical level, it has never been properly settled in which sense and on what scale the dynamics governed by the effective energy-momentum relation is an approximation to the true solution of the coupled equations of motion. To gain some understanding we study here the arguably simplest model, namely a single particle interacting with a scalar wave field.

* Supported partly by French–Russian A. M. Liapunov Center of Moscow State University, by Max-Planck Institute for Mathematics in the Sciences (Leipzig), and by research grants of INTAS (IR-97-113) and of Volkswagen-Stiftung.
** Supported by DAAD and NSF (through a grant of Ch. Jones) during a stay at Brown University.
Our second source of interest lies in the, by now, long list of examples we have for the emergence of an effective dynamics, to mention only the Boltzmann and Vlasov equation, hydrodynamics [21], homogenization in periodic and random environments [3, 8], interface and vortex dynamics in Ginzburg–Landau theories [11], quantum systems weakly coupled to a heat bath [6], and a quantum particle in the semiclassical limit [10, 18, 22]. Their common thread is a separation of space-time scales together with some sort of local stationarity in such a way that the slowly varying dynamical variables are governed by an effective dynamics. However, the detailed mechanisms differ notably from case to case. Here we add a novel item to the list. It is not covered by the mathematical techniques developed so far.

We consider a scalar wave field $\phi(x)$, in three-dimensional space, coupled to a particle with position $q$, momentum $p$, governed by
$$
\begin{aligned}
\dot\phi(x,t) &= \pi(x,t), & \dot\pi(x,t) &= \Delta\phi(x,t) - \rho(x-q(t)),\\
\dot q(t) &= p(t)/(1+p^2(t))^{1/2}, & \dot p(t) &= \int d^3x\, \phi(x,t)\,\nabla\rho(x-q(t)).
\end{aligned}
\tag{1.1}
$$
This is a Hamiltonian system with the Hamiltonian functional
$$
H_0(\phi,\pi,q,p) = (1+p^2)^{1/2} + \frac12\int d^3x\,\big(|\pi(x)|^2 + |\nabla\phi(x)|^2\big) + \int d^3x\,\phi(x)\rho(x-q).
\tag{1.2}
$$
We have set the mechanical mass of the particle and the speed of wave propagation equal to one. In spirit the interaction term is simply $\phi(q)$. This would result, however, in an energy that is not bounded from below. Therefore we smooth out the coupling by the function $\rho(x)$. In analogy to the Maxwell–Lorentz equations we call $\rho(x)$ the “charge distribution”. We assume $\rho(x)$ to belong to the Sobolev space $H^1$, radial, and compactly supported, i.e.,
$$
\rho, \nabla\rho \in L^2(\mathbb{R}^3), \qquad \rho(x) = \rho_r(|x|), \qquad \rho(x) = 0 \ \text{ for } |x| \ge R_\rho.
\tag{C}
$$
The system (1.1) has solutions traveling with constant velocity $v$, $|v| < 1$. They are given by
$$
S_v(t) = (\phi_v(x-q-vt),\ \pi_v(x-q-vt),\ q+vt,\ p_v), \qquad p_v = v/\sqrt{1-v^2},
\tag{1.3}
$$
with
$$
\phi_v(x) = -\int \frac{\rho(y)\,d^3y}{4\pi\big((1-v^2)(y-x)^2 + (v\cdot(y-x))^2\big)^{1/2}}, \qquad \pi_v(x) = -v\cdot\nabla\phi_v(x).
\tag{1.4}
$$
To have a short name we call $S_v(t)$ the soliton with velocity $v$ centered at $q(t) = q + vt$. We define the normalized energy of a soliton as
$$
E_s(v) = H_0(S_v) - H_0(S_0), \qquad S_v = S_v(0),
\tag{1.5}
$$
which, using the rotational invariance of $\rho$, is given by
$$
E_s(v) = (1-v^2)^{-1/2} - 1 + 3m_e\left[\frac{2-v^2}{2(1-v^2)} - \frac{1}{2|v|}\log\frac{1+|v|}{1-|v|}\right].
\tag{1.6}
$$
Here $3m_e = -\langle\rho, \Delta^{-1}\rho\rangle$, with $\langle\cdot,\cdot\rangle$ the scalar product in $L^2(\mathbb{R}^3)$; we have $m_e < \infty$ by assumption (C). Since the system (1.1) is invariant under spatial translations, the total momentum,
$$
\mathcal{P}(\phi,\pi,q,p) = p - \int d^3x\, \pi(x)\,\nabla\phi(x),
\tag{1.7}
$$
is conserved. Inserting $S_v$, the total momentum of a soliton is given by
$$
P_s(v) = \mathcal{P}(S_v) = v(1-v^2)^{-1/2} + 3m_e\, v\left[\frac{1}{2v^2(1-v^2)} - \frac{1}{4|v|^3}\log\frac{1+|v|}{1-|v|}\right].
\tag{1.8}
$$
The map $v \mapsto P_s(v)$ is invertible from $\mathcal{V} = \{v \in \mathbb{R}^3 : |v| < 1\}$ onto $\mathbb{R}^3$ with the inverse $v_s(P)$; see [12]. Therefore we obtain the effective energy-momentum relation
$$
E(P) = E_s(v_s(P)).
\tag{1.9}
$$
Then $E(P)$ is radial. In the nonrelativistic limit ($v$ small) we have
$$
E_s(v) \cong \tfrac12(1+m_e)v^2 \quad\text{and}\quad P_s(v) \cong (1+m_e)v \quad\text{for } |v| \ll 1.
\tag{1.10}
$$
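As an illustrative numerical check of the limit (1.10), and not part of the original argument, the closed forms (1.6) and (1.8) can be evaluated for small speeds. The sketch below assumes a hypothetical value `M_E = 0.3` for the field mass $m_e$ and works with the scalar speed $|v|$; the helper name `limit_ratios` is ours.

```python
import math

def soliton_energy(v, m_e):
    """Normalized soliton energy E_s(v) of (1.6) for a speed 0 < v < 1."""
    log_term = math.log((1 + v) / (1 - v))
    bracket = (2 - v**2) / (2 * (1 - v**2)) - log_term / (2 * v)
    return (1 - v**2) ** -0.5 - 1 + 3 * m_e * bracket

def soliton_momentum(v, m_e):
    """Magnitude of the soliton momentum P_s(v) of (1.8) for 0 < v < 1."""
    log_term = math.log((1 + v) / (1 - v))
    bracket = 1 / (2 * v**2 * (1 - v**2)) - log_term / (4 * v**3)
    return v * (1 - v**2) ** -0.5 + 3 * m_e * v * bracket

M_E = 0.3  # illustrative field mass, not a value from the paper

def limit_ratios(v, m_e=M_E):
    """Ratios of E_s and P_s to their nonrelativistic forms in (1.10)."""
    energy_ratio = soliton_energy(v, m_e) / (0.5 * (1 + m_e) * v**2)
    momentum_ratio = soliton_momentum(v, m_e) / ((1 + m_e) * v)
    return energy_ratio, momentum_ratio

# Both ratios should approach 1 as the speed decreases.
for speed in (0.2, 0.1, 0.05):
    print(speed, limit_ratios(speed))
```

Both ratios differ from one by a correction of order $v^2$, in line with (1.10).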
Thus $m_e$ is the additional mass acquired by the particle through the coupling to the field. For large $|P|$ we have the relativistic dependence $E(P) \cong |P|$.

Now let us assume that, at some time $t$, we have the soliton $S_v(t)$ centered at $q(t)$, $v = \dot q(t)$, and that an external force is acting on the particle. This force changes the velocity to $v' \ne v$ and $S_v(t)$ is no longer a solution to the system (1.1). However, if the force is small, so is the difference $v' - v$ and, if the force is slowly varying, the wave field has enough time to reestablish a soliton with new velocity $v'$. In fact this happens essentially with the speed of wave propagation (one in our case). Geometrically in phase space, we have the 6-dimensional manifold $\mathcal{S}$ of solitons labeled by their center $q$ and velocity $v$. For zero external force each point in this manifold moves on an orbit $t \mapsto (q+vt, v)$. Under a weak, slowly varying force, the true solution should remain close to the soliton manifold thereby inducing on it an effectively 6-dimensional motion.

With this picture in mind, we add to $H_0$ in (1.2) the slowly varying potential $V(\varepsilon q)$, $\varepsilon \ll 1$,
$$
H_\varepsilon(\phi,\pi,q,p) = (1+p^2)^{1/2} + V(\varepsilon q) + \frac12\int d^3x\,\big(|\pi(x)|^2 + |\nabla\phi(x)|^2\big) + \int d^3x\,\phi(x)\rho(x-q).
\tag{1.11}
$$
For the potential $V$ we require
$$
V \in C^2(\mathbb{R}^3), \qquad \inf_{q\in\mathbb{R}^3} V(q) > -\infty,
\tag{P}
$$
and
$$
\sup_{q\in\mathbb{R}^3}\big(|\nabla V(q)| + |\nabla\nabla V(q)|\big) < \infty.
\tag{U}
$$
We remark that, using the conservation of energy, condition (U) can be replaced by
$$
V(q) \to \infty \quad\text{as } |q| \to \infty,
\tag{U'}
$$
i.e., by the assumption that $V$ be confining. In the sequel we study the Hamiltonian dynamics generated by (1.11),
$$
\begin{aligned}
\dot\phi(x,t) &= \pi(x,t), & \dot\pi(x,t) &= \Delta\phi(x,t) - \rho(x-q(t)),\\
\dot q(t) &= p(t)/(1+p^2(t))^{1/2}, & \dot p(t) &= -\varepsilon\nabla V(\varepsilon q(t)) + \int d^3x\,\phi(x,t)\,\nabla\rho(x-q(t)).
\end{aligned}
\tag{1.12}
$$
The derivatives in (1.12) and below are understood in the sense of distributions. We consider the Cauchy problem for the system (1.12) with initial conditions
$$
(\phi(x,0), \pi(x,0), q(0), p(0)) = (\phi^0(x), \pi^0(x), q^0, p^0).
\tag{1.13}
$$
Under our assumptions, the global solution to the Cauchy problem (1.12), (1.13) exists and is unique for initial data with finite energy. The solution depends on $\varepsilon$ through the potential and possibly also through the initial conditions. In order not to overburden our notation, we will mostly suppress this dependence.

We assume the initial state to be close to a soliton. Since the force is slowly varying, near the particle such a wave field should persist. Indeed, we prove that
$$
\|(\phi(q(t)+x,t), \pi(q(t)+x,t)) - (\phi_{v(t)}(x), \pi_{v(t)}(x))\|_R \le C_R\,\varepsilon, \qquad \forall R > 0,
\tag{1.14}
$$
uniformly in $t \in \mathbb{R}$ (with the norm $\|\cdot\|_R$ being defined by the field energy in a ball of radius $R$), provided a smallness condition on $\rho$ is satisfied. Presumably, this condition is an artifact of our method.

In (1.12) the external force is $O(\varepsilon)$. So is the self-force, since according to (1.14) the field $\phi$ deviates from the soliton only by $O(\varepsilon)$. Then $\ddot q$ is of order $\varepsilon$, whereas $\dot q$ is of order 1. The effective energy-momentum relation should be visible on a time scale $O(1)$. Therefore we define the comparison dynamics through the effective Hamiltonian $H_{\mathrm{eff}}(Q,P) = E(P) + V(\varepsilon Q)$ with the corresponding equations of motion,
$$
\dot Q(t) = \nabla E(P(t)), \qquad \dot P(t) = -\varepsilon\nabla V(\varepsilon Q(t)),
\tag{1.15}
$$
suppressing again the $\varepsilon$-dependence of $(Q(t), P(t))$. Since the energy-momentum relation $E(P)$ depends on the charge distribution only through $m_e$, the effective dynamics is a structure independent property of the coupled system particle+field in the sense of Kramers [14].

The particle loses energy through radiation, which is proportional to $\ddot q^2$ and thus $O(\varepsilon^2)$. Therefore the comparison dynamics should be a valid approximation over a time scale $\varepsilon^{-1}$, i.e., over any time interval of duration $\varepsilon^{-1}\tau$. At time $t_0$ the comparison dynamics is adjusted to the true solution through the initial conditions
$$
Q(t_0) = q(t_0), \qquad P(t_0) = P_s(\dot q(t_0)).
\tag{1.16}
$$
Let $(Q(t), P(t))$ be the solution to (1.15) with these initial values. We then establish that, for $|t - t_0| = O(\varepsilon^{-1})$,
$$
|q(t) - Q(t)| = O(1), \qquad |\dot q(t) - \dot Q(t)| = O(\varepsilon), \qquad |\ddot q(t) - \ddot Q(t)| = O(\varepsilon^2)
\tag{1.17}
$$
uniformly in $t_0$. This is our main result.
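To make the comparison dynamics (1.15) concrete, the following one-dimensional sketch integrates it with a classical Runge–Kutta step and monitors $H_{\mathrm{eff}}$. It is only an illustration under stated assumptions: the values `M_E` and `EPS`, the potential $V(x) = -\cos x$ (which satisfies (P) and (U)), and all helper names are ours, and $\nabla E(P)$ is evaluated as $v_s(P)$, which is the content of Lemma 6.2 below.

```python
import math

M_E = 0.3   # illustrative field mass (assumption)
EPS = 0.05  # illustrative small parameter epsilon (assumption)

def soliton_momentum(v):
    """P_s(v) of (1.8) in 1D; a cubic series near v = 0 avoids cancellation."""
    a = abs(v)
    if a < 1e-2:
        return (1 + M_E) * v + (0.5 + 1.2 * M_E) * v**3
    log_term = math.log((1 + a) / (1 - a))
    bracket = 1 / (2 * a**2 * (1 - a**2)) - log_term / (4 * a**3)
    return v / math.sqrt(1 - a**2) + 3 * M_E * v * bracket

def soliton_energy(v):
    """E_s(v) of (1.6) in 1D; a quartic series is used near v = 0."""
    a = abs(v)
    if a < 1e-2:
        return 0.5 * (1 + M_E) * a**2 + (0.375 + 0.9 * M_E) * a**4
    log_term = math.log((1 + a) / (1 - a))
    bracket = (2 - a**2) / (2 * (1 - a**2)) - log_term / (2 * a)
    return 1 / math.sqrt(1 - a**2) - 1 + 3 * M_E * bracket

def velocity_from_momentum(P):
    """v_s(P): invert the increasing map v -> P_s(v) by bisection."""
    lo, hi = -1 + 1e-12, 1 - 1e-12
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if soliton_momentum(mid) < P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def effective_energy(Q, P):
    """H_eff(Q, P) = E(P) + V(eps Q) with the hypothetical V(x) = -cos(x)."""
    return soliton_energy(velocity_from_momentum(P)) - math.cos(EPS * Q)

def rhs(Q, P):
    """Right-hand side of (1.15), using grad E(P) = v_s(P)."""
    return velocity_from_momentum(P), -EPS * math.sin(EPS * Q)

def integrate(Q, P, t_end, dt=0.05):
    """Classical fourth-order Runge-Kutta integration of (1.15)."""
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(Q, P)
        k2 = rhs(Q + 0.5 * dt * k1[0], P + 0.5 * dt * k1[1])
        k3 = rhs(Q + 0.5 * dt * k2[0], P + 0.5 * dt * k2[1])
        k4 = rhs(Q + dt * k3[0], P + dt * k3[1])
        Q += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        P += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return Q, P

Q0, P0 = 0.0, 1.0
Q1, P1 = integrate(Q0, P0, t_end=1.0 / EPS)  # time span of order 1/eps
energy_drift = abs(effective_energy(Q1, P1) - effective_energy(Q0, P0))
print(Q1, P1, energy_drift)
```

Over the time span $\varepsilon^{-1}$ the position changes by $O(\varepsilon^{-1})$ while the momentum changes by $O(1)$, and $H_{\mathrm{eff}}$ is conserved up to the integration error.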
In the proof, we stick for a while to the traditional route. One solves the inhomogeneous wave equation and inserts the solution into the self-force. Thereby the force on the particle depends on its past history, but not on the field. If one expands this force at $q(t)$ up to second order, one recovers the term missing in the full energy-momentum relation. To justify such a procedure mathematically we have to know a priori that
$$
|\ddot q(t)| \sim \varepsilon \quad\text{and}\quad |\dddot q(t)| \sim \varepsilon^2
\tag{1.18}
$$
uniformly in $t$, which requires an estimate of the field difference (1.14) and a similar one to handle $\dddot q(t)$. Our experience from the past is confirmed, namely a direct analysis of the exact delay equation for $q(t)$ is hopeless. To make progress one has to switch back and forth between particle and field.
2. Main Results

To formulate our results precisely, we need some definitions. We introduce the phase space suitable for the Cauchy problem corresponding to (1.12) and (1.13). Let $L^2$ be the real Hilbert space $L^2(\mathbb{R}^3)$ with norm $||\cdot||$, and let $\dot H^1$ be the completion of $C_0^\infty(\mathbb{R}^3)$ with norm $\|\phi(x)\| = ||\nabla\phi(x)||$. Equivalently, using Sobolev's embedding theorem, $\dot H^1 = \{\phi(x) \in L^6(\mathbb{R}^3) : |\nabla\phi(x)| \in L^2\}$; see [15]. Let $||\phi||_R$ denote the norm in $L^2(B_R)$ for $R > 0$, where $B_R = \{x \in \mathbb{R}^3 : |x| \le R\}$. Then the seminorms $\|\phi\|_R = ||\nabla\phi||_R$ are continuous on $\dot H^1$.

Definition 2.1.
i) The phase space $\mathcal{E}$ is the Hilbert space $\dot H^1 \oplus L^2 \oplus \mathbb{R}^3 \oplus \mathbb{R}^3$ of states $Y = (\phi, \pi, q, p)$ with finite norm $\|Y\|_{\mathcal{E}} = \|\phi\| + ||\pi|| + |q| + |p|$.
ii) $\mathcal{E}_F$ is the space $\mathcal{E}$ endowed with the Fréchet topology defined by the local energy seminorms $\|Y\|_R = \|\phi\|_R + ||\pi||_R + |q| + |p|$, $\forall R > 0$.
iii) $\mathcal{F}$ is the Hilbert space $\dot H^1 \oplus L^2$ of the fields $\Phi = (\phi, \pi)$ with finite norm $\|\Phi\|_{\mathcal{F}} = \|\phi\| + ||\pi||$.
iv) $\mathcal{F}_F$ is the space $\mathcal{F}$ endowed with the Fréchet topology defined by the local energy seminorms $\|\Phi\|_R = \|\phi\|_R + ||\pi||_R$, $\forall R > 0$.

A point in phase space is referred to as a state. We write the Cauchy problem (1.12), (1.13) in the form
$$
\dot Y(t) = F(Y(t)), \quad t \in \mathbb{R}, \qquad Y(0) = Y^0,
\tag{2.1}
$$
where $Y(t) = (\phi(t), \pi(t), q(t), p(t))$ and $Y^0 = (\phi^0, \pi^0, q^0, p^0)$. As already mentioned, we mostly suppress the $\varepsilon$-dependence of the solutions, of the vector field $F$, and of the initial conditions. The following lemma is proved analogously to the corresponding result in [13].
Lemma 2.2. Let (C), (P), and (U), resp. (U'), hold. Then for every $Y^0 \in \mathcal{E}$, $|\varepsilon| \le 1$, the Cauchy problem (2.1) has a unique solution $Y \in C(\mathbb{R}, \mathcal{E})$ with speed bounded as
$$
\sup_{t\in\mathbb{R}} |\dot q(t)| \le \bar v < 1.
\tag{2.2}
$$
The bound $\bar v = \bar v(Y^0)$ is uniform in $|\varepsilon| \le 1$ and for initial values $Y^0$ in bounded subsets of $\mathcal{E}$.

If the effective dynamics is approximately valid, then the field should be close to the soliton centered at $q(t)$ with velocity $v(t) = \dot q(t)$. We therefore consider the difference
$$
Z(x,t) = \Phi(x,t) - \Phi_*(x,t),
\tag{2.3}
$$
where
$$
\Phi(x,t) = (\phi(x,t), \pi(x,t)), \qquad \Phi_*(x,t) = \Phi_{v(t)}(x - q(t)),
$$
and $\Phi_v(x) = (\phi_v(x), \pi_v(x))$ is the field part of the soliton. Defining $\boldsymbol\rho(x) = (0, \rho(x))$ and $A(\phi,\pi) = (\pi, \Delta\phi)$, it follows that $\Phi$ and $Z$ satisfy the equations of motion
$$
\dot\Phi(x,t) = A\Phi(x,t) - \boldsymbol\rho(x - q(t)),
\tag{2.4}
$$
$$
\dot Z(x,t) = AZ(x,t) - B(x,t), \qquad B(x,t) = \dot p(t)\cdot\nabla_p\Phi_{v(t)}(x - q(t)).
\tag{2.5}
$$
Here, according to the chain rule,
$$
\nabla_p\Phi_v = \nabla_v\Phi_v\, dv(p),
\tag{2.6}
$$
where $dv(p)$ is the differential of the map $p \mapsto v(p) = p/\sqrt{1+p^2}$. In Cartesian coordinates, $dv(p)$ is just the Jacobi matrix $\partial v_i/\partial p_j$.

Theorem 2.3. Let the conditions of Lemma 2.2 hold and let $||\rho||$ be sufficiently small, $||\rho|| \le \delta(\bar v, R_\rho)$. Then for every $R > 0$ there exists $C_R$ such that
$$
\sup_{t\in\mathbb{R}} \|Z(\cdot + q(t), t)\|_R \le C_R\big(\|Z(0)\|_{\mathcal{F}} + \varepsilon\big).
\tag{2.7}
$$
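The Jacobi matrix $dv(p)$ entering (2.6) has the explicit entries $\partial v_i/\partial p_j = \delta_{ij}(1+p^2)^{-1/2} - p_i p_j (1+p^2)^{-3/2}$, which follows by differentiating $v(p) = p/\sqrt{1+p^2}$. As a small sanity check, not taken from the paper, the sketch below compares this formula with central finite differences; the function names and the sample momentum are ours.

```python
import math

def velocity(p):
    """v(p) = p / sqrt(1 + |p|^2) for a 3-vector p given as a list."""
    norm = math.sqrt(1 + sum(c * c for c in p))
    return [c / norm for c in p]

def velocity_jacobian(p):
    """Jacobi matrix dv_i/dp_j = delta_ij (1+p^2)^(-1/2) - p_i p_j (1+p^2)^(-3/2)."""
    s = 1 + sum(c * c for c in p)
    return [[(1.0 if i == j else 0.0) / math.sqrt(s) - p[i] * p[j] / s**1.5
             for j in range(3)] for i in range(3)]

def finite_difference_jacobian(p, h=1e-6):
    """Central-difference approximation of the same matrix."""
    columns = []
    for j in range(3):
        forward, backward = list(p), list(p)
        forward[j] += h
        backward[j] -= h
        v_plus, v_minus = velocity(forward), velocity(backward)
        columns.append([(v_plus[i] - v_minus[i]) / (2 * h) for i in range(3)])
    # columns[j][i] holds dv_i/dp_j; transpose so that rows are indexed by i
    return [[columns[j][i] for j in range(3)] for i in range(3)]

sample_p = [0.3, -1.2, 0.5]  # arbitrary test momentum
exact = velocity_jacobian(sample_p)
approx = finite_difference_jacobian(sample_p)
max_error = max(abs(exact[i][j] - approx[i][j]) for i in range(3) for j in range(3))
print(max_error)
```

The matrix is symmetric and positive definite with eigenvalues at most one, so $|\dot v| \le |\dot p|$.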
For the unperturbed, $\varepsilon = 0$, system our theorem states that the distance between the true solution and the soliton manifold
$$
\mathcal{S} = \{(\phi_v(x-q), \pi_v(x-q), q, p_v) : q \in \mathbb{R}^3,\ v \in \mathcal{V}\}
\tag{2.8}
$$
remains bounded in time. This property is called orbital stability, which has been established for the system (1.1) in [12] and for related equations in [7, 2] using the Liapunov method in combination with energy and momentum conservation. For $\varepsilon > 0$ such an argument breaks down, since the Hamiltonian vector field is no longer parallel to $\mathcal{S}$. To have a stability result such as (2.7) we therefore need to exploit that through radiation damping the solution is “pushed” towards $\mathcal{S}$. In other words, through the free wave equation a small deviation from the soliton is transported to infinity, which also shows that we are not allowed to replace the local energy norm in (2.7) by the global one. An adequate mathematical argument is provided by the nonautonomous integral equation method [4, 5, 19, 20], which has been used to prove the convergence to the soliton manifold in the context of the nonlinear Schrödinger equation.

If we assume that initially $\|Z(0)\|_{\mathcal{F}} \le C\varepsilon$, then according to (2.7) the solution remains $O(\varepsilon)$ close to $\mathcal{S}$ for all times. Thus it remains to characterize the motion along $\mathcal{S}$ as given by the particle trajectory $q(t)$. To obtain its approximate equation of motion we
have to estimate the self-force. By Theorem 2.3 it is of $O(\varepsilon)$. To control the error, $O(\varepsilon^2)$, the solution has to be slowly varying in time with outgoing fields, which we formalize through the notion of an adiabatic family of solutions $Y_\varepsilon(t) = (\phi_\varepsilon(t), \pi_\varepsilon(t), q_\varepsilon(t), p_\varepsilon(t))$. We denote by $U(t)$ the dynamical group on $\mathcal{F}$ generated by the free wave equation and set
$$
\Phi^0 = (\phi_\varepsilon(0), \pi_\varepsilon(0)), \qquad (\phi^0_\varepsilon(\cdot,t), \pi^0_\varepsilon(\cdot,t)) = U(t)\Phi^0.
\tag{2.9}
$$
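Definition 2.4. A family of solutions $Y_\varepsilon(t) \in C(\mathbb{R}, \mathcal{E})$, $0 < \varepsilon \le 1$, to the system (1.12) is called adiabatic, if there exist constants $a, T_0 > 0$, and $\bar v < 1$, such that the following bounds hold:
$$
\sup_{t\in\mathbb{R}} |\dot q_\varepsilon(t)| \le \bar v,
\tag{2.10}
$$
$$
\sup_{t\in\mathbb{R}} |\ddot q_\varepsilon(t)| \le a\varepsilon,
\tag{2.11}
$$
$$
\sup_{t\in\mathbb{R}} |\dddot q_\varepsilon(t)| \le a\varepsilon^2,
\tag{2.12}
$$
$$
|\langle\phi^0_\varepsilon(x,t), \nabla\rho(x-q)\rangle| \le a\varepsilon^2 \quad\text{for } |q| < |t| - T_0.
\tag{2.13}
$$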
This definition is time-invariant, i.e., a family of solutions $Y_\varepsilon(t+\theta)$ is adiabatic for any $\theta \in \mathbb{R}$ if it is for some $\theta$. Our main result is the following

Theorem 2.5. Let the assumptions of Theorem 2.3 hold and let $Y_\varepsilon(t) \in C(\mathbb{R}, \mathcal{E})$ be an adiabatic family of solutions to (1.12). Let $(Q(t), P(t))$ be the comparison dynamics (1.15) with initial values (1.16). Then for any $\tau > 0$ there exists $C = C(\tau)$ such that for $|t - t_0| \le \varepsilon^{-1}\tau$,
$$
|q(t) - Q(t)| \le C, \qquad |\dot q(t) - \dot Q(t)| \le C\varepsilon, \qquad |\ddot q(t) - \ddot Q(t)| \le C\varepsilon^2.
\tag{2.14}
$$
The constant $C(\tau)$ can be chosen independently of $t_0$.

Of course, we still need a criterion for initial states that ensures the corresponding family of solution trajectories is adiabatic. The following theorem provides sufficient conditions, which in particular show that any initial soliton $(\phi_v(x-q^0), \pi_v(x-q^0), q^0, p_v)$ defines an adiabatic family of solutions and that the set of adiabatic families of solutions is nonempty and open in an appropriate topology. We set $(\varphi^0(x), \psi^0(x)) = Z^0(x) = Z(x,0)$ with corresponding Fourier transforms $(\hat\varphi^0(k), \hat\psi^0(k))$, and we let
$$
\dot p(0) = -\varepsilon\nabla V(\varepsilon q(0)) + \int d^3x\,\phi(x,0)\,\nabla\rho(x - q(0)).
$$

Theorem 2.6. Let there exist $a^0 > 0$ such that for the initial states $Y^0_\varepsilon = Y^0 = (\phi^0, \pi^0, q^0, p^0) \in \mathcal{E}$, $0 < \varepsilon \le 1$, the following bounds hold:
$$
\|Y^0(x)\|_{\mathcal{E}} \le a^0,
\tag{2.15}
$$
$$
\|Z^0(x)\|_{\mathcal{F}} \le a^0\varepsilon,
\tag{2.16}
$$
$$
\|\nabla Z^0(x)\|_{\mathcal{F}} + |\dot p(0)| \le a^0\varepsilon^2,
\tag{2.17}
$$
$$
\int d^3k\,\big(|k||\hat\varphi^0(k)| + |\hat\psi^0(k)|\big)\,|\hat\rho(k)| \le a^0\varepsilon^2,
\tag{2.18}
$$
$$
\int d^3k\,|k|\,\big(|k||\nabla[\hat\varphi^0(k)\hat\rho(k)]| + |\nabla[\hat\psi^0(k)\hat\rho(k)]|\big) \le a^0\varepsilon,
\tag{2.19}
$$
and let $||\rho||$ be sufficiently small, $||\rho|| \le \delta(a^0, R_\rho)$. Then the family of solutions $Y_\varepsilon(t) \in C(\mathbb{R}, \mathcal{E})$ to the Cauchy problem (2.1) is adiabatic.

Thus Theorem 2.6, in essence, requires that the deviation from the soliton has sufficient smoothness and decay.

Our paper is organized as follows. Theorem 2.3 is proved in Sect. 3, and Theorem 2.6 is established in Sect. 4. In Sect. 5 we compute the self-force, and in Sect. 6 we complete the proof of Theorem 2.5. Section 7 concerns the translation invariant system (1.1). In Appendix A we collect Fourier space computations. Finally, in Appendix B, we list some remarks on the Hamiltonian structure.

3. Stability of the Soliton Manifold

We prove Theorem 2.3 and establish first the required bound for $R = R_\rho$ from (C).

Lemma 3.1. Under the assumptions of Theorem 2.3, the bound (2.7) holds for $R = R_\rho$,
$$
\|Z(\cdot + q(t), t)\|_{R_\rho} \le C\big(\|Z(0)\|_{\mathcal{F}} + \varepsilon\big).
\tag{3.1}
$$
Proof. Solving Eq. (2.5) by Fourier transform we get the mild solution representation
$$
Z(t) = U(t)Z(0) - \int_0^t U(t-s)\big[\dot p(s)\cdot\nabla_p\Phi_{v(s)}(\cdot - q(s))\big]\,ds,
\tag{3.2}
$$
with $U(t)$ being the group generated by the free wave equation in $\dot H^1 \oplus L^2$. By conservation of energy for the wave equation,
$$
\|[U(t)Z(0)](\cdot + q(t))\|_{R_\rho} \le \|[U(t)Z(0)](\cdot + q(t))\|_{\mathcal{F}} = \|Z(0)\|_{\mathcal{F}}.
\tag{3.3}
$$
We denote by $\varphi(x,t) = \phi(x,t) - \phi_{v(t)}(x - q(t))$ the first component of $Z(x,t)$ and observe that $\langle\phi_v(x), \nabla\rho(x)\rangle = 0$ for $|v| < 1$ because the soliton (1.3) is a solution to (1.1). Then (1.12) implies
$$
\dot p(t) = -\varepsilon\nabla V(\varepsilon q(t)) + \langle\varphi(x + q(t), t), \nabla\rho(x)\rangle.
\tag{3.4}
$$
Thus with assumption (U) we obtain
$$
|\dot p(t)| \le C\big(\varepsilon + \|Z(\cdot + q(t), t)\|_{R_\rho}\,||\rho||\big).
\tag{3.5}
$$
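We further introduce $\boldsymbol\pi_v = \nabla_p\pi_v$, $\boldsymbol\phi_v = \nabla_p\phi_v$, $S_{t-s}(x) = \{y : |y - x| = t - s\}$, and
$$
(\boldsymbol\phi(\cdot,t,s), \boldsymbol\pi(\cdot,t,s)) = U(t-s)\big[\nabla_p\Phi_{v(s)}(\cdot - q(s))\big].
\tag{3.6}
$$
Then Kirchhoff's formula for $U(t-s)$ implies the representation
$$
\begin{aligned}
\nabla\boldsymbol\phi(x,t,s) ={}& \sum_{|\alpha|\le 1} (t-s)^{|\alpha|-2} \int_{S_{t-s}(x)} d^2y\; a_\alpha(x-y)\,\partial_y^\alpha\boldsymbol\pi_{v(s)}(y - q(s))\\
&+ \sum_{|\alpha|\le 2} (t-s)^{|\alpha|-3} \int_{S_{t-s}(x)} d^2y\; b_\alpha(x-y)\,\partial_y^\alpha\boldsymbol\phi_{v(s)}(y - q(s)),
\end{aligned}
\tag{3.7}
$$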
and a similar representation for $\boldsymbol\pi(x,t,s)$. The coefficients $a_\alpha(\cdot)$, $b_\alpha(\cdot)$ are bounded and sums are taken over multiindices $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ with integers $\alpha_j \ge 0$. Therefore $\nabla\boldsymbol\phi(x + q(t), t, s)$ and $\boldsymbol\pi(x + q(t), t, s)$ can be represented as integrals of type (3.7) over the shifted sphere $S_{t-s}(x + q(t))$. If $|x| \le R_\rho$, we have on this sphere
$$
|y - q(s)| = |(y - x - q(t)) + (x + q(t) - q(s))| \ge (t-s) - |x| - \bar v(t-s) \ge (1-\bar v)(t-s) - R_\rho
\tag{3.8}
$$
by the bound (2.2) on $\dot q(t)$. On the other hand, the integral representation (1.4) yields by Cauchy–Schwarz
$$
\sup_{|v|\le\bar v}\ \sup_{|x|\ge 2R_\rho} \Big[\, |x||\boldsymbol\phi_v(x)| + |x|^2\big(|\nabla\boldsymbol\phi_v(x)| + |\boldsymbol\pi_v(x)|\big) + |x|^3\big(|\nabla\nabla\boldsymbol\phi_v(x)| + |\nabla\boldsymbol\pi_v(x)|\big) \Big] \le C(\bar v, R_\rho)\,||\rho|| < \infty.
\tag{3.9}
$$
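Inserting (3.9) and (3.8) in Kirchhoff's formula for $\nabla\boldsymbol\phi(x + q(t), t, s)$, we obtain the pointwise bound
$$
\begin{aligned}
|\nabla\boldsymbol\phi(x + q(t), t, s)| \le{}& \sum_{|\alpha|\le 1} (t-s)^{|\alpha|-2}\, \frac{C_1(\bar v, R_\rho)\,||\rho||\,(t-s)^2}{(1 + |t-s|)^{|\alpha|+2}}\\
&+ \sum_{|\alpha|\le 2} (t-s)^{|\alpha|-3}\, \frac{C_1(\bar v, R_\rho)\,||\rho||\,(t-s)^2}{(1 + |t-s|)^{|\alpha|+1}}
\;\le\; \frac{C_2(\bar v, R_\rho)\,||\rho||}{1 + (t-s)^2}
\end{aligned}
\tag{3.10}
$$
for $|x| \le R_\rho$ and provided $t - s \ge 3R_\rho/(1-\bar v)$. Therefore (3.10) implies for large $t - s$, together with a similar bound for $\boldsymbol\pi(x + q(t), t, s)$, the integral estimate
$$
\|(\boldsymbol\phi(x + q(t), t, s), \boldsymbol\pi(x + q(t), t, s))\|_{R_\rho} \le \frac{C_3(\bar v, R_\rho)\,||\rho||}{1 + (t-s)^2}.
\tag{3.11}
$$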
On the other hand, for bounded $t - s$ this integral estimate follows directly from (3.6) by energy conservation for the map $U(t-s)$, since $\|\nabla_p\Phi_v\|_{\mathcal{F}} \le C(\bar v, R_\rho)\,||\rho||$ by (C). Finally, (3.5) and (3.11) imply
$$
\|\dot p(s)\cdot(\boldsymbol\phi(x + q(t), t, s), \boldsymbol\pi(x + q(t), t, s))\|_{R_\rho} \le C_4(\bar v, R_\rho)\,||\rho||\; \frac{\varepsilon + \|Z(\cdot + q(s), s)\|_{R_\rho}\,||\rho||}{1 + (t-s)^2},
\tag{3.12}
$$
and combining (3.2) and (3.3) we arrive at
$$
\|Z(\cdot + q(t), t)\|_{R_\rho} \le \|Z(0)\|_{\mathcal{F}} + C_4(\bar v, R_\rho)\,||\rho|| \int_0^t \frac{\varepsilon + \|Z(\cdot + q(s), s)\|_{R_\rho}\,||\rho||}{1 + (t-s)^2}\,ds, \qquad t \ge 0.
\tag{3.13}
$$
Thus, denoting $M(t) = \max_{0\le s\le t} \|Z(q(s) + x, s)\|_{R_\rho}$, we have
$$
M(t) \le \|Z(0)\|_{\mathcal{F}} + C_5(\bar v, R_\rho)\,||\rho||\,\big(\varepsilon + ||\rho||\,M(t)\big).
$$
We choose now $||\rho||$ so small that $C_5(\bar v, R_\rho)\,||\rho||^2 < 1$. Then (3.1) follows for $t \ge 0$.
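The closing step above relies on $\int_0^t ds/(1+(t-s)^2) \le \pi/2$, so that one may take $C_5 = \pi C_4/2$. The following sketch illustrates the mechanism on a discretized version of (3.13) with equality. It is a toy model under stated assumptions: the constants `kappa` (playing the role of $C_4\,||\rho||$), `rho_norm`, `z0`, and `eps` are hypothetical, and the explicit scheme is our own choice.

```python
import math

def solve_volterra(z0, kappa, rho_norm, eps, t_max=100.0, h=0.1):
    """Explicit scheme for
    m(t) = z0 + kappa * int_0^t (eps + rho_norm * m(s)) / (1 + (t - s)^2) ds,
    a model of (3.13) with equality in place of the inequality."""
    n = int(round(t_max / h))
    kernel = [1.0 / (1.0 + (i * h) ** 2) for i in range(n + 1)]
    m = [z0]
    for step in range(1, n + 1):
        acc = sum((eps + rho_norm * m[j]) * kernel[step - j] for j in range(step))
        m.append(z0 + kappa * h * acc)
    return m

def bootstrap_bound(z0, kappa, rho_norm, eps):
    """Fixed point of M = z0 + (pi/2) * kappa * (eps + rho_norm * M)."""
    c5 = 0.5 * math.pi * kappa
    assert c5 * rho_norm < 1, "smallness condition on rho is violated"
    return (z0 + c5 * eps) / (1 - c5 * rho_norm)

# Hypothetical constants chosen only for illustration.
params = dict(z0=0.01, kappa=0.4, rho_norm=0.5, eps=0.01)
trajectory = solve_volterra(**params)
bound = bootstrap_bound(**params)
print(max(trajectory), bound)
```

For the discrete scheme the bound holds exactly, since the right Riemann sums of the decreasing kernel never exceed $\pi/2$; the numerical maximum stays below the bootstrap bound uniformly in time, which mirrors the argument for (3.1).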
We claim that the bound (3.1) implies (2.7) for any $R > 0$. Indeed, (3.11)–(3.13) hold with the norm $\|\cdot\|_R$ instead of $\|\cdot\|_{R_\rho}$ on the left-hand sides and with $C_i(\bar v, \rho, R)$ instead of $C_i(\bar v, \rho)$ on the right-hand sides. Then (3.13) with this generalization and (3.1) imply (2.7).
4. Adiabatic Solutions

We prove Theorem 2.6. The bound (2.10) is the assertion of Lemma 2.2. Concerning (2.13), we have
$$
U(t)\Phi^0 = U(t)\Phi_{v(0)}(\cdot - q^0) + U(t)Z^0.
\tag{4.1}
$$
Moreover, $U(t)\Phi_{v(0)}(x - q^0) = 0$ for $|x - q^0| < |t| - R_\rho$ by Kirchhoff's formula, since we have the representation
$$
\Phi_{v(0)}(x) = -\int_{-\infty}^0 \big[U(-s)\boldsymbol\rho(\cdot - q^0 - v(0)s)\big](x)\,ds.
\tag{4.2}
$$
Therefore with the choice $T_0 = 2R_\rho + |q^0|$, (2.13) holds for the first component of $[U(t)\Phi_{v(0)}](x)$. With the choice $T_0 = 0$, (2.18) implies (2.13) for the first component of $U(t)Z^0$, as can be seen in Fourier space representation. Thus it remains to prove (2.11) and (2.12).

Proposition 4.1. For small $||\rho||$, the following bounds hold:
$$
\sup_{t\in\mathbb{R}} |\dot v(t)| \le C(a^0, R_\rho)\,\varepsilon,
\tag{4.3}
$$
$$
\sup_{t\in\mathbb{R}} |\ddot v(t)| \le C(a^0, R_\rho)\,\varepsilon^2.
\tag{4.4}
$$

Proof. The estimate (4.3) follows from (3.5), (3.1), and (2.16). To obtain (4.4), we differentiate (3.4) using (C),
$$
\ddot p(t) = -\varepsilon^2\, v(t)\cdot\nabla\,\nabla V(\varepsilon q(t)) + M(t),
\tag{4.5}
$$
where $M(t) = \langle L(t)\varphi(x + q(t), t), \nabla\rho(x)\rangle$ and $L(t) = \partial_t + v(t)\cdot\nabla$. Then (U) implies
$$
|\ddot p(t)| \le C\big(\varepsilon^2 + |M(t)|\big).
\tag{4.6}
$$
Therefore (4.4) will be a consequence of

Lemma 4.2. We have
$$
\sup_{t\in\mathbb{R}} |M(t)| \le C(a^0, R_\rho)\,\varepsilon^2 \quad\text{for small } ||\rho||.
\tag{4.7}
$$
Proof. We extend the method of the previous section. Denoting $\Xi(x,t) = L(t)Z(x,t)$, we have $M(t) = \langle\Xi(x,t), \nabla\rho_*(x - q(t))\rangle$, where $\rho_*(x) = (\rho(x), 0)$. To obtain an equation for $\Xi(t)$ we apply the differential operator $L(t)$ to (2.5) in the sense of distributions to find
$$
\dot\Xi(x,t) = A\,\Xi(x,t) - L(t)B(x,t) + \dot v(t)\cdot\nabla Z(x,t).
\tag{4.8}
$$
Here $\dot v(t)\cdot\nabla Z(\cdot,t) \in C(\mathbb{R}, \mathcal{F})$ due to (3.2), (2.17), and (C). Also $L(t)B(\cdot,t) \in C(\mathbb{R}, \mathcal{F})$ because
$$
L(t)B(x,t) = \ddot p(t)\cdot\nabla_p\Phi_{v(t)}(x - q(t)) + (\dot p(t)\cdot\nabla_p)^2\,\Phi_{v(t)}(x - q(t)).
\tag{4.9}
$$
Moreover, assumptions (2.17) and (C) imply $\Xi(\cdot, 0) \in \mathcal{F}$, since
$$
\Xi(x,t) = A\Phi(x,t) - \boldsymbol\rho(x - q(t)) + v(t)\cdot\nabla\Phi(x,t) - \dot p(t)\cdot\nabla_p\Phi_{v(t)}(x - q(t))
\tag{4.10}
$$
by definition of $Z$ in (2.3) and by (2.5). Therefore, using the Fourier transform to solve the linear nonhomogeneous equation (4.8), we get the following integral representation, similar to (3.2),
$$
\Xi(x,t) = U(t)\,\Xi(\cdot,0) - \int_0^t U(t-s)L(s)B(s)\,ds + \int_0^t \dot v(s)\cdot U(t-s)\nabla Z(s)\,ds,
\tag{4.11}
$$
where both integrals converge in $\mathcal{F}$. Hence (C) implies
$$
\begin{aligned}
M(t) ={}& \langle U(t)\,\Xi(\cdot,0), \nabla\rho_*(\cdot - q(t))\rangle - \int_0^t \langle U(t-s)L(s)B(s), \nabla\rho_*(\cdot - q(t))\rangle\,ds\\
&+ \int_0^t \dot v(s)\cdot\langle U(t-s)\nabla Z(s), \nabla\rho_*(\cdot - q(t))\rangle\,ds.
\end{aligned}
\tag{4.12}
$$
We analyze the three summands separately.

(i) For the first summand we prove the bound
$$
\sup_{t\ge 0} |\langle U(t)\,\Xi(\cdot,0), \nabla\rho_*(\cdot - q(t))\rangle| \le C_1(a^0)\,||\rho||\,\varepsilon^2.
\tag{4.13}
$$
Equation (4.10) implies $\|\Xi(\cdot,0)\|_{\mathcal{F}} \le C(a^0)\varepsilon^2$ by assumptions (2.17) and (C). Energy conservation then yields the uniform bound (4.13).

(ii) For the second summand in (4.12) we will obtain
$$
\left|\int_0^t \langle U(t-s)L(s)B(s), \nabla\rho_*(\cdot - q(t))\rangle\,ds\right| \le C_2(a^0, R_\rho)\,||\rho||^2 \int_0^t \frac{\varepsilon^2 + |M(s)|}{1 + (t-s)^2}\,ds, \qquad t \ge 0.
\tag{4.14}
$$
Equations (4.9), (4.6), and (4.3) result in $L(t)B(x,t) = e(x,t) + m(x,t)$, where again by (C),
$$
\sup_{t\ge 0} \|e(x,t)\|_{\mathcal{F}} \le C(a^0, R_\rho)\,||\rho||\,\varepsilon^2, \qquad \|m(x,t)\|_{\mathcal{F}} \le C(a^0, R_\rho)\,||\rho||\,|M(t)|.
$$
Therefore (4.14) follows by repeating the arguments from (3.6)–(3.12).

(iii) For the third summand in (4.12) we will prove
$$
\sup_{t\ge 0}\left|\int_0^t \dot v(s)\cdot\langle U(t-s)\nabla Z(s), \nabla\rho_*(\cdot - q(t))\rangle\,ds\right| \le C_3(a^0, \rho)\,\varepsilon^2.
\tag{4.15}
$$
Taking the gradient of (3.2) yields
$$
U(t-s)\nabla Z(s) = U(t)\nabla Z(0) - \int_0^s \dot p(\tau)\cdot U(t-\tau)\nabla\big[\nabla_p\Phi_{v(\tau)}(\cdot - q(\tau))\big]\,d\tau.
\tag{4.16}
$$
For the first term, by partial integration in polar coordinates of the Fourier representation, (2.19) and (2.18) imply that $|\langle U(t)\nabla Z(0), \nabla\rho_*(\cdot - q(t))\rangle| \le C(a^0)\,t^{-1}\varepsilon$. The integral is oscillatory due to the bound (2.2). The justification for this partial integration comes from an appropriate averaging process. To bound the second term we note, similarly to (3.11),
$$
\|U(t-\tau)\nabla[\nabla_p\Phi_{v(\tau)}(\cdot - q(\tau))]\|_{R_\rho} \le \frac{C(a^0, R_\rho)\,||\rho||}{1 + (t-\tau)^3},
\tag{4.17}
$$
since the bounds of type (3.9) hold for $\nabla\nabla_p\Phi_v(x)$ with an additional power of $|x|$ on the left-hand side. Then (4.16)–(4.17) and (4.3) imply (4.15).

Finally we substitute (4.13), (4.14), and (4.15) into (4.12) to obtain the integral inequality
$$
|M(t)| \le C(a^0, \rho)\,\varepsilon^2 + C(a^0, R_\rho)\,||\rho||^2 \int_0^t \frac{\varepsilon^2 + |M(s)|}{1 + (t-s)^2}\,ds, \qquad t \ge 0.
$$
Therefore (4.7) for $t \ge 0$ follows, provided that $||\rho|| \le \delta(a^0, R_\rho)$.
5. Inertial Representation of the Self-Force

We study the self-action term
$$
F_s(t) = \int d^3x\,\phi(x,t)\,\nabla\rho(x - q(t)).
$$
Denote $T_1 = 2R_\rho(1-\bar v)^{-1}$, where $\bar v < 1$ is the bound from (2.10), and $T = \max(T_0, T_1)$ with $T_0$ from (2.13). We also introduce the field part of the total momentum,
$$
P_f(v) = P_s(v) - p_v,
\tag{5.1}
$$
cf. (1.8), (1.3). The corresponding “effective mass”, $m_f(v)$, is given by the differential $dP_f(v) =: m_f(v)$.

Lemma 5.1. Let the assumptions of Theorem 2.5 hold. Then
$$
F_s(t) = -m_f(\dot q(t))\,\ddot q(t) + f_s(t), \qquad |f_s(t)| \le C\varepsilon^2, \quad\text{for } |t| \ge T.
\tag{5.2}
$$
Proof. We note that by (1.12) and (2.9), $\phi(x,t) = \phi^0(x,t) + \phi^r(x,t)$, where $\phi^0(x,t)$ is a solution to the free wave equation defined in (2.9), while $\phi^r$ is the retarded potential
$$
\phi^r(x,t) = -\frac{1}{4\pi}\int_0^t \frac{ds}{t-s}\int_{|x-y|=t-s} d^2y\,\rho(y - q(s)).
\tag{5.3}
$$
We decompose accordingly $F_s(t) = F^0(t) + F^r(t)$, with
$$
F^0(t) = \langle\phi^0(\cdot,t), \nabla\rho(\cdot - q(t))\rangle, \qquad F^r(t) = \langle\phi^r(\cdot,t), \nabla\rho(\cdot - q(t))\rangle.
\tag{5.4}
$$
From (2.10) we conclude that $|q(t) - q^0| \le \bar v t$, and therefore
$$
F^0(t) = O(\varepsilon^2) \quad\text{for } t \ge T_0
\tag{5.5}
$$
by (2.13), since the solution is adiabatic. Hence
$$
F_s(t) = F^r(t) + O(\varepsilon^2) \quad\text{for } t \ge T_0.
\tag{5.6}
$$
Equations (5.3) and (5.4) imply
$$
F^r(t) = -\frac{1}{4\pi}\int_0^t \frac{ds}{t-s}\int d^3x \int_{|x-y|=t-s} d^2y\,\rho(y - q(s))\,\nabla\rho(x - q(t)).
\tag{5.7}
$$
Now observe that for all $t, T \ge T_1$ the $\int_0^t ds\,(\dots)$-integral in (5.7) may be changed to a $\int_{t-T}^t ds\,(\dots)$-integral, since
$$
\rho(y - q(s))\,\nabla\rho(x - q(t)) = 0 \quad\text{if } |x - y| = t - s \ge T_1.
\tag{5.8}
$$
Indeed, $\rho(y - q(s))\,\nabla\rho(x - q(t)) \ne 0$ implies $|y - q(s)| < R_\rho$ and $|x - q(t)| < R_\rho$. Therefore $|x - y| < 2R_\rho + \bar v(t-s)$, since $|q(t) - q(s)| \le \bar v(t-s)$ by (2.2). Substituting $|x - y|$ by $t - s$ we obtain $t - s < 2R_\rho/(1-\bar v) = T_1$. Next we fix $t, T \ge T_1$ and substitute in (5.7) the Taylor expansion
$$
q(s) = q(t) - \dot q(t)(t-s) + \tfrac12\ddot q(t)(t-s)^2 + O(\varepsilon^2)
$$
according to (2.11)–(2.12). Then
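$$
F^r(t) = -\frac{1}{4\pi}\int_{t-T}^t \frac{ds}{t-s}\int d^3x \int_{|x-y|=t-s} d^2y\;\rho\Big(y - q(t) + \dot q(t)(t-s) - \tfrac12\ddot q(t)(t-s)^2 + O(\varepsilon^2)\Big)\,\nabla\rho(x - q(t)).
$$
Combining with (5.6) we finally obtain
$$
\begin{aligned}
F_s(t) = -\frac{1}{4\pi}\int_{t-T}^t \frac{ds}{t-s}\int d^3x \int_{|x-y|=t-s} d^2y\;\Big[&\rho(y - q(t) + \dot q(t)(t-s))\\
&- \tfrac12 (t-s)^2\,\ddot q(t)\cdot\nabla\rho(y - q(t) + \dot q(t)(t-s))\Big]\,\nabla\rho(x - q(t)) + f_s(t)
\end{aligned}
\tag{5.9}
$$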
with $f_s(t)$ satisfying (5.2). The integral does not depend on $T$ provided $T, t > T_1$, which reflects the strong Huygens' principle. We will show in Appendix A by taking the limit $T \to \infty$ that the integral in (5.9) in fact equals $-m_f(\dot q)\ddot q$. Then (5.2) follows for $t \ge T$.

6. The Adiabatic Limit

We complete the proof of Theorem 2.5. We first ensure the existence of the effective dynamics.

Lemma 6.1. Define $E(P)$ through (1.9), and let the potential $V$ satisfy (U). Then for every initial state $(Q(0), P(0)) \in \mathbb{R}^3 \times \mathbb{R}^3$ the Hamiltonian system
$$
\dot Q(t) = \nabla E(P(t)), \qquad \dot P(t) = -\varepsilon\nabla V(\varepsilon Q(t))
\tag{6.1}
$$
has a unique solution $(Q(t), P(t)) \in C(\mathbb{R}, \mathbb{R}^3 \times \mathbb{R}^3)$. Moreover, $|\ddot Q(t)|$ and $|\dddot Q(t)|$ are bounded uniformly in $t$.

Proof. Both $\nabla\nabla E(P)$ and $\nabla\nabla V(Q)$ are bounded and $H_{\mathrm{eff}}(Q,P)$ is bounded from below.

Let $m(v) = dP_s(v)$. From Lemma 5.1, together with definitions (1.8), (5.1) and the equations of motion (1.12), we conclude that
$$
m(\dot q(t))\,\ddot q(t) = -\varepsilon\nabla V(\varepsilon q(t)) + f_s(t).
\tag{6.2}
$$
We want to rewrite (6.2) in a Hamiltonian form. For this purpose we introduce $\Pi(t) = P_s(\dot q(t))$, which yields $m(\dot q(t))\,\ddot q(t) = \dot\Pi(t)$. To obtain $\dot q$ as a function of $\Pi$ we have to invert the map $v \mapsto P_s(v)$.

Lemma 6.2. The inverse function to $P_s(v)$ is given by
$$
v_s(P) = \nabla E(P).
\tag{6.3}
$$

Proof. Using the chain rule, Eq. (9.1) states $v = \nabla E_s(v)\,(dP_s(v))^{-1} = \nabla E(P_s(v))$.
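The chain-rule identity behind Lemma 6.2 can be checked numerically in the radial variable: along the soliton branch, $(dE_s/d|v|)/(dP_s/d|v|)$ should equal $|v|$. The sketch below is a hedged sanity check, not the proof; it assumes a hypothetical field mass `M_E = 0.3`, uses the closed forms (1.6) and (1.8), and approximates derivatives by central differences.

```python
import math

M_E = 0.3  # illustrative field mass (assumption)

def soliton_energy(v):
    """E_s(v) of (1.6) as a function of the speed 0 < v < 1."""
    log_term = math.log((1 + v) / (1 - v))
    bracket = (2 - v**2) / (2 * (1 - v**2)) - log_term / (2 * v)
    return 1 / math.sqrt(1 - v**2) - 1 + 3 * M_E * bracket

def soliton_momentum(v):
    """|P_s(v)| of (1.8) as a function of the speed 0 < v < 1."""
    log_term = math.log((1 + v) / (1 - v))
    bracket = 1 / (2 * v**2 * (1 - v**2)) - log_term / (4 * v**3)
    return v / math.sqrt(1 - v**2) + 3 * M_E * v * bracket

def energy_momentum_slope(v, h=1e-5):
    """dE/dP along the soliton branch, (dE_s/dv) / (dP_s/dv), by central differences."""
    d_energy = (soliton_energy(v + h) - soliton_energy(v - h)) / (2 * h)
    d_momentum = (soliton_momentum(v + h) - soliton_momentum(v - h)) / (2 * h)
    return d_energy / d_momentum

# The slope dE/dP should reproduce the speed itself.
for speed in (0.1, 0.3, 0.6, 0.9):
    print(speed, energy_momentum_slope(speed))
```

The denominator $dP_s/d|v|$ is the longitudinal effective mass and is at least one, which also explains why $\nabla\nabla E$ is bounded.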
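With these definitions, (6.2) becomes
$$
\dot q(t) = \nabla E(\Pi(t)), \qquad \dot\Pi(t) = -\varepsilon\nabla V(\varepsilon q(t)) + f_s(t).
\tag{6.4}
$$
Let $q^\varepsilon(t) = \varepsilon q(\varepsilon^{-1}t)$, $Q^\varepsilon(t) = \varepsilon Q(\varepsilon^{-1}t)$ and $\Pi^\varepsilon(t) = \Pi(\varepsilon^{-1}t)$, $P^\varepsilon(t) = P(\varepsilon^{-1}t)$. Since $\varepsilon q(\varepsilon^{-1}t) = q^\varepsilon(t)$, the systems (6.4) and (6.1) read
$$
\begin{aligned}
\dot q^\varepsilon(t) &= \nabla E(\Pi^\varepsilon(t)), & \dot\Pi^\varepsilon(t) &= -\nabla V(q^\varepsilon(t)) + \varepsilon^{-1} f_s(\varepsilon^{-1}t),\\
\dot Q^\varepsilon(t) &= \nabla E(P^\varepsilon(t)), & \dot P^\varepsilon(t) &= -\nabla V(Q^\varepsilon(t)).
\end{aligned}
$$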
Since $\nabla\nabla E$ and $\nabla\nabla V$ are bounded, and $|f_s(\varepsilon^{-1}t)| \le C\varepsilon^2$ for $|t| \ge \varepsilon T$, from a Gronwall argument for $r(t) = |q^\varepsilon(t) - Q^\varepsilon(t)| + |\Pi^\varepsilon(t) - P^\varepsilon(t)|$, we conclude that
$$
r(t) \le C(r_0 + \varepsilon)\,e^{C|t - t_0|}.
\tag{6.5}
$$
Here $r_0 := r(t_0) = 0$ due to (1.16), if $|t_0| > T$; otherwise $r_0 := r(\pm\varepsilon T) = O(\varepsilon)$, since $q^\varepsilon(t)$, $Q^\varepsilon(t)$, $\Pi^\varepsilon(t)$, $P^\varepsilon(t)$ change by $O(\varepsilon)$ over the time interval $|t| \le \varepsilon T$. Therefore, (6.5) implies the first two bounds of (2.14). The third bound follows from the second order equation (6.2) for $\ddot q$ and a similar equation for $\ddot Q$.
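The Gronwall step can be illustrated on a one-dimensional toy version of (6.4) and (6.1). The sketch below is not the system of the paper: as labeled assumptions it uses the bare relation $E(P) = \sqrt{1+P^2}$ as a stand-in for the effective $E(P)$, the potential $V(x) = -\cos x$, and a constant remainder $f_s = \varepsilon^2$. Since both Lipschitz constants are at most one in this toy model, the Gronwall bound gives $|\Pi - P| \le \varepsilon(e^\tau - 1)$ and $|q - Q| \le e^\tau - 1$ on $[0, \tau/\varepsilon]$.

```python
import math

def energy_gradient(P):
    """Stand-in for grad E(P): bare relation E(P) = sqrt(1 + P^2) (assumption)."""
    return P / math.sqrt(1 + P * P)

def potential_gradient(x):
    """Gradient of the hypothetical potential V(x) = -cos(x)."""
    return math.sin(x)

def rhs(state, eps, remainder):
    """Right-hand side of (6.4); remainder = 0 gives the comparison system (6.1)."""
    q, momentum = state
    return (energy_gradient(momentum),
            -eps * potential_gradient(eps * q) + remainder)

def rk4_step(state, eps, remainder, dt):
    """One classical Runge-Kutta step."""
    def shifted(k, scale):
        return (state[0] + scale * k[0], state[1] + scale * k[1])
    k1 = rhs(state, eps, remainder)
    k2 = rhs(shifted(k1, 0.5 * dt), eps, remainder)
    k3 = rhs(shifted(k2, 0.5 * dt), eps, remainder)
    k4 = rhs(shifted(k3, dt), eps, remainder)
    return tuple(state[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                 for i in range(2))

def max_gaps(eps, tau=1.0, dt=0.02):
    """Largest |q - Q| and |Pi - P| on [0, tau/eps] for the remainder f_s = eps^2."""
    perturbed = comparison = (0.0, 0.5)   # common initial data, as in (1.16)
    gap_q = gap_p = 0.0
    for _ in range(int(round(tau / (eps * dt)))):
        perturbed = rk4_step(perturbed, eps, eps**2, dt)
        comparison = rk4_step(comparison, eps, 0.0, dt)
        gap_q = max(gap_q, abs(perturbed[0] - comparison[0]))
        gap_p = max(gap_p, abs(perturbed[1] - comparison[1]))
    return gap_q, gap_p

for eps in (0.04, 0.02, 0.01):
    print(eps, max_gaps(eps))
```

The position gap stays $O(1)$ while the momentum gap scales like $\varepsilon$, which is the pattern of the first two bounds in (2.14).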
7. The Translation Invariant Case

For $V = 0$ the velocity $\dot q(t)$ of the particle should, after a transient period, stabilize at some definite $v$ dressed by the corresponding soliton field. Such a result was established in [12], where we only had to assume the Wiener condition $\hat\rho(k) \ne 0$. The technique developed here avoids this condition at the price of $||\rho|| \ll 1$ and obtains even a bound on the rate of convergence. We denote $Z(0) = (\varphi^0(x), \psi^0(x))$.

Proposition 7.1. Let $||\rho||$ be sufficiently small, $||\rho|| \le \delta(\bar v, R_\rho)$, and assume for some $\sigma \in (0, 1]$,
$$
|\varphi^0(x)| + |x|\big(|\nabla\varphi^0(x)| + |\psi^0(x)|\big) + |x|^2\big(|\nabla\nabla\varphi^0(x)| + |\nabla\psi^0(x)|\big) = O(|x|^{-\sigma}) \quad\text{as } |x| \to \infty.
\tag{7.1}
$$
Then the solution to (1.1) satisfies
$$
\|Z(\cdot + q(t), t)\|_R \le C_R\,(1 + |t|)^{-1-\sigma}, \qquad \forall R > 0.
\tag{7.2}
$$

Corollary 7.2. Under the same assumptions the acceleration is bounded as
$$
|\ddot q(t)| \le C(1 + |t|)^{-1-\sigma}.
\tag{7.3}
$$
Therefore, the limits $\lim_{t\to\pm\infty}\dot q(t) = v_\pm \in \mathcal{V}$ exist, and
$$
|\dot q(t) - v_\pm| \le C(1 + |t|)^{-\sigma}.
\tag{7.4}
$$
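Proof. Equations (7.1) and (3.2)–(3.11) with $\varepsilon = 0$ imply, similarly to (3.13),
$$
\|Z(\cdot + q(t), t)\|_{R_\rho} \le C(1 + |t|)^{-1-\sigma} + C(\bar v, \rho)\,||\rho||^2 \int_0^t \frac{\|Z(q(s) + x, s)\|_{R_\rho}}{(1 + |t-s|)^2}\,ds
$$
for $t \ge 0$. Therefore, setting $M(t) = \max_{0\le s\le t} (1 + |s|)^{1+\sigma}\,\|Z(q(s) + x, s)\|_{R_\rho}$, we find
$$
M(t) \le C + C(\bar v, \rho)\,||\rho||^2\, I_\sigma\, M(t),
$$
where
$$
I_\sigma = \sup_{t\ge 0}\,(1 + |t|)^{1+\sigma} \int_0^t \frac{(1 + |s|)^{-1-\sigma}}{(1 + |t-s|)^2}\,ds < \infty \quad\text{for } \sigma \in (0, 1].
$$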
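The finiteness of $I_\sigma$ can be made plausible numerically. The sketch below evaluates the weighted convolution on a grid of times by the midpoint rule; the grid, the step count, and the function name are our choices, and the computation is only an illustration rather than a proof. For $\sigma = 1$ a partial-fraction computation gives the closed form $\big(\tfrac{1+t}{2+t}\big)^2\big[\tfrac{2t}{1+t} + \tfrac{4\log(1+t)}{2+t}\big]$, which can serve as a cross-check.

```python
import math

def weighted_convolution(t, sigma, n=4000):
    """(1+t)^(1+sigma) * int_0^t (1+s)^(-1-sigma) / (1+t-s)^2 ds, midpoint rule."""
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (1 + s) ** (-1 - sigma) / (1 + t - s) ** 2
    return (1 + t) ** (1 + sigma) * h * total

def closed_form_sigma_one(t):
    """Exact value of the same expression for sigma = 1 (partial fractions)."""
    return ((1 + t) / (2 + t)) ** 2 * (2 * t / (1 + t) + 4 * math.log(1 + t) / (2 + t))

time_grid = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0, 200.0]
suprema = {sigma: max(weighted_convolution(t, sigma) for t in time_grid)
           for sigma in (0.25, 0.5, 1.0)}
print(suprema)
```

On this grid the values remain bounded as $t$ grows, consistent with the decay $t^{-1-\sigma}$ of the convolution for $\sigma \le 1$.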
It remains to choose $C(\bar v, \rho)\,||\rho||^2\, I_\sigma < 1$; then (7.2) with $R = R_\rho$ follows for $t \ge 0$. The corollary is a consequence of (3.4) with $\varepsilon = 0$.

Remark. Soliton-like asymptotics are established in [17] for some translation invariant 1D completely integrable equations, in [4, 5] for small perturbations of soliton solutions to 1D translation invariant nonlinear Schrödinger equations, and in [19, 20] for $U(1)$-invariant 2D and 3D nonlinear Schrödinger equations with a potential term decaying like a power at infinity; [9] studies soliton-like asymptotics for 1D translation invariant nonlinear reaction systems.
8. Appendix A. Fourier Integrals

As usual, we denote by
$$\hat f(k) = (2\pi)^{-3/2} \int d^3x\, e^{ikx} f(x)$$
the Fourier transform of $f(x)$.

Solitons: The soliton (1.4) has the Fourier transform
$$\hat\phi_v(k) = -\frac{\hat\rho(k)}{k^2 - (k\cdot v)^2}, \qquad \hat\pi_v(k) = -\frac{i\,k\cdot v\,\hat\rho(k)}{k^2 - (k\cdot v)^2}. \tag{8.1}$$

Energy-momentum relation: Inserting (8.1) in (1.2) and (1.7), the energy and the total momentum of a soliton with velocity $v$ are, respectively,
$$H_0(S_v) = (1 - v^2)^{-1/2} + \frac12 \int d^3k\, |\hat\rho(k)|^2\, \frac{3(k\cdot v)^2 - k^2}{[k^2 - (k\cdot v)^2]^2}, \tag{8.2}$$
$$P_s(v) = v(1 - v^2)^{-1/2} + \int d^3k\, |\hat\rho(k)|^2\, \frac{k\cdot v}{[k^2 - (k\cdot v)^2]^2}\, k. \tag{8.3}$$
After some calculations, this yields (1.6) and (1.8).

Field mass: Equation (8.3) implies that the effective mass due to the coupling to the field is given by
$$m_f(v) = dP_f(v) = \int d^3k\, |\hat\rho(k)|^2\, \frac{k^2 + 3(k\cdot v)^2}{[k^2 - (k\cdot v)^2]^3}\, k\otimes k, \qquad |v| < 1. \tag{8.4}$$
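The step from (8.3) to (8.4) is a differentiation under the integral sign, so it suffices to compare integrands for a single wave vector. The sketch below does this with central finite differences; it is a hedged illustration in which the weight $|\hat\rho(k)|^2$ is set to 1 and the sample vectors `k`, `v` as well as all function names are hypothetical choices made here.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def momentum_integrand(k, v):
    """Field part of (8.3) for one k with unit weight: (k.v) k / (k^2 - (k.v)^2)^2."""
    kv = dot(k, v)
    d = dot(k, k) - kv * kv
    return [kv * ki / d ** 2 for ki in k]


def mass_integrand(k, v):
    """Integrand of (8.4) with unit weight: (k^2 + 3 (k.v)^2) / (k^2 - (k.v)^2)^3 times k (x) k."""
    kv = dot(k, v)
    k2 = dot(k, k)
    c = (k2 + 3.0 * kv * kv) / (k2 - kv * kv) ** 3
    return [[c * ki * kj for kj in k] for ki in k]


def finite_difference_jacobian(k, v, h=1e-6):
    """Central differences: jac[i][j] approximates d P_i / d v_j."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        vp, vm = list(v), list(v)
        vp[j] += h
        vm[j] -= h
        pp, pm = momentum_integrand(k, vp), momentum_integrand(k, vm)
        for i in range(3):
            jac[i][j] = (pp[i] - pm[i]) / (2.0 * h)
    return jac


k = (0.7, -1.2, 0.4)   # sample wave vector (assumption)
v = (0.3, 0.1, -0.5)   # sample velocity with |v| < 1 (assumption)
J = finite_difference_jacobian(k, v)
M = mass_integrand(k, v)
max_err = max(abs(J[i][j] - M[i][j]) for i in range(3) for j in range(3))
```

Because the integrand of (8.4) is a multiple of $k \otimes k$, the resulting matrix is symmetric, which the finite-difference Jacobian reproduces.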
Self-force: We compute the integral (5.9) by switching to Fourier space. The wave propagator in Fourier space is multiplication by $|k|^{-1} \sin |k|t$. Hence
$$F_s(t) = \int d^3k\, |\hat\rho(k)|^2\, ik \int_{t-T}^{t} ds\, e^{-ik\cdot\dot q(t)(t-s)} \Big[1 - \tfrac12\, \ddot q(t)\cdot(-ik)(t - s)^2\Big]\, |k|^{-1} \sin |k|(t - s) + f_s(t). \tag{8.5}$$
We evaluate this integral by taking the limit as $T \to \infty$, recalling that the integral does not depend on $T$ provided $T \ge T_1$. We set $F_s(t) = I_1(T) + I_2(T) + f_s(t)$. In (8.5) we integrate over $s$. Setting $v = \dot q(t)$ and $k_\pm = -k\cdot v \pm |k|$, the first integral reads
$$\begin{aligned}
I_1(T) &= \int d^3k\, |\hat\rho(k)|^2\, ik \int_{t-T}^{t} ds\, e^{-ik\cdot\dot q(t)(t-s)}\, \frac{\sin |k|(t - s)}{|k|} \\
&= i \int d^3k\, |\hat\rho(k)|^2 \left[ \frac{k}{k^2 - (k\cdot v)^2} - \frac{k}{2|k|} \left( \frac{e^{ik_+T}}{k_+} - \frac{e^{ik_-T}}{k_-} \right) \right] \\
&= -i \int d^3k\, |\hat\rho(k)|^2\, \frac{k}{2|k|} \left( \frac{e^{ik_+T}}{k_+} - \frac{e^{ik_-T}}{k_-} \right) =: I_1^+(T) + I_1^-(T).
\end{aligned}$$
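Introducing polar coordinates $\nu = |k|$, $\theta = k/|k|$, so that $k_+ = \nu(1 - \theta\cdot v)$, we have
$$I_1^+(T) = -i \int d^3k\, |\hat\rho(k)|^2\, \frac{k}{2|k|}\, \frac{e^{ik_+T}}{k_+} = -\frac{i}{2} \int_{|\theta|=1} d^2\theta\, \frac{\theta}{1 - \theta\cdot v} \int_0^\infty d\nu\, \nu\, |\hat\rho(\nu\theta)|^2\, e^{i(1 - \theta\cdot v)\nu T}. \tag{8.6}$$
The integral converges absolutely, since $\hat\rho(k)$ is smooth with all derivatives in $L^2(\mathbb{R}^3)$ by assumption (C). Therefore, integrating by parts twice in the $\nu$-integral yields $|I_1^+(T)| \le CT^{-2}$ because $|v| = |\dot q(t)| < 1$. The same argument applies to $I_1^-(T)$ and it follows that
$$|I_1(T)| \le CT^{-2} \to 0 \quad \text{as } T \to \infty. \tag{8.7}$$
The second integral reads
$$\begin{aligned}
I_2(T) &= -\frac12 \int d^3k\, |\hat\rho(k)|^2\, k\, \big(\ddot q(t)\cdot k\big) \int_{t-T}^{t} ds\, e^{-ik\cdot\dot q(t)(t-s)} (t - s)^2\, \frac{\sin |k|(t - s)}{|k|} \\
&= -\int d^3k\, |\hat\rho(k)|^2\, k\, \big(\ddot q(t)\cdot k\big) \left[ \frac{k^2 + 3(k\cdot v)^2}{(k^2 - (k\cdot v)^2)^3} + \frac{1}{2|k|} \left( \frac{e^{ik_+T}}{k_+^3} - \frac{e^{ik_-T}}{k_-^3} \right) \right. \\
&\qquad \left. - \frac{iT}{2|k|} \left( \frac{e^{ik_+T}}{k_+^2} - \frac{e^{ik_-T}}{k_-^2} \right) - \frac{T^2}{4|k|} \left( \frac{e^{ik_+T}}{k_+} - \frac{e^{ik_-T}}{k_-} \right) \right].
\end{aligned}$$
The integrals containing $T$ are again oscillatory and vanish as $T \to \infty$. Therefore, comparing with (8.4), we conclude
$$I_2(T) \to -m_f(\dot q(t))\, \ddot q(t) \quad \text{as } T \to \infty. \tag{8.8}$$
Hence (5.2) follows from (8.7) and (8.8).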
9. Appendix B. The Hamiltonian Structure

Energy-momentum relation: In Sect. 6 we used the identity
$$v\, dP_s(v) = \nabla E_s(v), \qquad |v| < 1. \tag{9.1}$$
While obtained from the explicit expressions (8.2), (8.3), resp. (1.6), (1.8), this identity should be understood as a direct consequence of the conservation of total momentum, i.e., of the translation invariance of (1.1). Our argument uses the canonical transformation [12]
$$T : (\phi, \pi, q, p) \mapsto (\Phi(x), \Pi(x), Q, P) = \big(\phi(q + x),\ \pi(q + x),\ q,\ p - \langle \pi(x), \nabla\phi(x) \rangle\big).$$
In new variables the Hamiltonian (1.2) reads
$$\begin{aligned}
H_P(\Phi, \Pi) &= H_0\big(\Phi(x - Q),\ \Pi(x - Q),\ Q,\ P + \langle \Pi(x), \nabla\Phi(x) \rangle\big) \\
&= \int d^3x \Big( \tfrac12 |\Pi(x)|^2 + \tfrac12 |\nabla\Phi(x)|^2 + \Phi(x)\rho(x) \Big) + \Big( 1 + \big( P + \langle \Pi(x), \nabla\Phi(x) \rangle \big)^2 \Big)^{1/2}.
\end{aligned}$$
$H_P$ is bounded from below and has its unique minimum at the point $(\phi_v, \pi_v)$, the soliton at velocity $v = v_s(P)$, with minimal value $H_P(\phi_v, \pi_v) = E_s(v) + H_0(S_0)$; see [12]. Differentiating in $v$ we obtain
$$\nabla E_s(v) = \Big\langle \frac{\delta H_P}{\delta\Phi}(\phi_v, \pi_v), \nabla_v \phi_v \Big\rangle + \Big\langle \frac{\delta H_P}{\delta\Pi}(\phi_v, \pi_v), \nabla_v \pi_v \Big\rangle + \nabla_P H_P(\phi_v, \pi_v)\, dP_s(v) = v\, dP_s(v),$$
since $(\phi_v, \pi_v)$ is a critical point of $H_P$ and the first two terms vanish, while $v = \dot Q = \nabla_P H_P(\phi_v, \pi_v)$ because $T$ is a canonical transformation.

Correspondence of the Hamiltonian structures: Definitions (1.5), (1.8), and (1.9) imply that the Hamiltonian functional $H_\varepsilon$ of (1.11) restricted to the soliton $S_v = (\phi_v(x - q), \pi_v(x - q), q, p_v)$ becomes
$$H_\varepsilon(S_v) = E(P) + V(\varepsilon q) + H_0(S_0) = H_{\mathrm{eff}}(q, P) + H_0(S_0) \tag{9.2}$$
with $P = P_s(v)$. Thus the effective Hamiltonian can be understood as the restriction of $H_\varepsilon$ to the soliton manifold. We need in addition the appropriate choice of the canonical variables to write Hamilton's equations in standard form (1.15). For general reasons one expects the conserved quantities to play a distinguished role. In our case this suggests $P$ and $q$ as canonical variables. The next lemma gives an inherent geometrical meaning to this choice, which might be valuable in a more general context.

Lemma 9.1. The canonical structure $P\, dq$ on the soliton manifold $S$ is the restriction of the full canonical form $p\, dq + \langle \phi, d\pi \rangle$, i.e.,
$$P\, dq = \big( p\, dq + \langle \phi, d\pi \rangle \big)\big|_S.$$

Proof. We have $p\, dq + \langle \phi, d\pi \rangle = P\, dQ + \langle \Phi, d\Pi \rangle$, since $T$ is a canonical transformation, and
$$\langle \Phi, d\Pi \rangle\big|_S = \langle \phi_v, d\pi_v \rangle = \langle \phi_v, \nabla_v \pi_v \rangle\, dv = 0$$
by antisymmetry in Fourier space and since $|\hat\rho(-k)| = |\hat\rho(k)|$.
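The identity (9.1) can also be checked against the explicit expressions (8.2) and (8.3). The sketch below is a numerical illustration under simplifying assumptions made here: the field integrals are replaced by a single mode $k$ with unit weight (by linearity the identity for every mode implies it for the integrals), derivatives are taken by central differences, and all names and sample values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def energy(k, v):
    """Particle energy plus one field mode of (8.2) with unit weight."""
    kv, k2 = dot(k, v), dot(k, k)
    d = k2 - kv * kv
    return (1.0 - dot(v, v)) ** -0.5 + 0.5 * (3.0 * kv * kv - k2) / d ** 2


def momentum(k, v):
    """Particle momentum plus one field mode of (8.3) with unit weight."""
    kv = dot(k, v)
    d = dot(k, k) - kv * kv
    g = (1.0 - dot(v, v)) ** -0.5
    return [g * vi + kv * ki / d ** 2 for vi, ki in zip(v, k)]


def shifted(v, j, h):
    w = list(v)
    w[j] += h
    return w


def grad_energy(k, v, h=1e-6):
    """Central-difference gradient of the energy with respect to v."""
    return [(energy(k, shifted(v, j, h)) - energy(k, shifted(v, j, -h))) / (2.0 * h)
            for j in range(3)]


def v_times_dP(k, v, h=1e-6):
    """Components sum_i v_i dP_i/dv_j, i.e. v dP_s(v), by central differences."""
    out = []
    for j in range(3):
        pp, pm = momentum(k, shifted(v, j, h)), momentum(k, shifted(v, j, -h))
        out.append(sum(v[i] * (pp[i] - pm[i]) / (2.0 * h) for i in range(3)))
    return out


k = (0.7, -1.2, 0.4)   # sample wave vector (assumption)
v = (0.3, 0.1, -0.5)   # sample velocity with |v| < 1 (assumption)
lhs = v_times_dP(k, v)
rhs = grad_energy(k, v)
mismatch = max(abs(a - b) for a, b in zip(lhs, rhs))
```

The two sides agree up to finite-difference error, consistent with the structural argument via the canonical transformation $T$.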
References

1. Abraham, M.: Theorie der Elektrizität, Band 2: Elektromagnetische Theorie der Strahlung. Leipzig: Teubner, 1905
2. Bambusi, D., Galgani, L.: Some rigorous results on the Pauli-Fierz model of classical electrodynamics. Ann. Inst. H. Poincaré, Phys. Theor. 58, 155–171 (1993)
3. Bensoussan, A., Lions, J.L., Papanicolaou, G.: Asymptotic Analysis for Periodic Structures. Studies in Mathematics and its Applications, Vol. 5, Amsterdam: North-Holland, 1978
4. Buslaev, V.S., Perelman, G.S.: On nonlinear scattering of states which are close to a soliton. In: Méthodes Semi-Classiques, Vol. 2, Colloque International (Nantes, juin 1991), Astérisque 208, 1992, pp. 49–63
5. Buslaev, V.S., Perelman, G.S.: On the stability of solitary waves for nonlinear Schrödinger equations. Trans. Amer. Math. Soc. 164, 75–98 (1995)
6. Davies, E.B.: Quantum Theory of Open Systems. London: Academic Press, 1976
7. Grillakis, M., Shatah, J., Strauss, W.A.: Stability theory of solitary waves in the presence of symmetry I and II. J. Func. Anal. 74, 160–197 (1987); 94, 308–348 (1990)
8. De Masi, A., Ferrari, P.A., Goldstein, S., Wick, W.D.: An invariance principle for reversible Markov processes. Application to random environments. J. Stat. Phys. 55, 787–855 (1989)
9. Fleckinger, J., Komech, A.: On soliton-like asymptotics for 1D nonlinear reaction systems. Russian J. Math. Phys. 5, 295–307 (1997)
10. Hagedorn, G.A.: A time dependent Born–Oppenheimer approximation. Commun. Math. Phys. 77, 1–19 (1980)
11. Jerrard, R.L., Soner, H.M.: Dynamics of Ginzburg–Landau Vortices. Preprint, 1995
12. Komech, A., Spohn, H.: Soliton-like asymptotics for a classical particle interacting with a scalar wave field. Nonlinear Anal. 33, 13–24 (1998)
13. Komech, A., Spohn, H., Kunze, M.: Long-time asymptotics for a classical particle interacting with a scalar wave field. Comm. Partial Differential Equations 22, 307–335 (1997)
14. Kramers, H.A.: Non-relativistic Quantum-Electrodynamics and correspondence Principle. In: Solvay Conference 1948, Rapport et Discussions, Bruxelles, 1950, pp. 241–265; in: Kramers, H.A.: Collected Scientific Papers. Amsterdam: North-Holland, 1956, pp. 845–869
15. Lions, J.L.: Problèmes aux Limites dans les Equations aux Dérivées Partielles. Montréal: Presses de l'Univ. Montréal, 1962
16. Lorentz, H.A.: Theory of Electrons, 2nd edition 1915. Reprinted by New York: Dover, 1952
17. Novikov, S.P., Manakov, S.V., Pitaevskii, L.P., Zakharov, V.E.: Theory of Solitons: The Inverse Scattering Method. Consultants Bureau, 1984
18. Robert, D.: Autour de l'Approximation Semi-Classique. Progress in Mathematics, Vol. 68, Basel: Birkhäuser, 1987
19. Soffer, A., Weinstein, M.I.: Multichannel nonlinear scattering for nonintegrable equations. Commun. Math. Phys. 133, 119–146 (1990)
20. Soffer, A., Weinstein, M.I.: Multichannel nonlinear scattering for nonintegrable equations II. The case of anisotropic potentials and data. J. Differ. Eqs. 98, 376–390 (1992)
21. Spohn, H.: Large Scale Dynamics of Interacting Particles. Berlin: Springer, 1991
22. Spohn, H.: Long time asymptotics for quantum particles in a periodic potential. Phys. Rev. Lett. 77, 1198–1201 (1996)

Communicated by A. Kupiainen
Commun. Math. Phys. 203, 21 – 30 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Monopole Equations on 8-Manifolds with Spin(7) Holonomy

Ayşe Hümeyra Bilge$^1$, Tekin Dereli$^2$, Şahin Koçak$^3$

$^1$ Department of Mathematics, İstanbul Technical University, İstanbul, Turkey. E-mail: [email protected]
$^2$ Department of Physics, Middle East Technical University, Ankara, Turkey. E-mail: [email protected]
$^3$ Department of Mathematics, Anadolu University, Eskişehir, Turkey. E-mail: [email protected]

Received: 17 October 1997 / Accepted: 16 November 1998
Abstract: We construct a consistent set of monopole equations on eight-manifolds with Spin(7) holonomy. These equations are elliptic and admit non-trivial solutions including all the 4-dimensional Seiberg–Witten solutions as a special case.

1. Introduction

In a remarkable paper [1] Seiberg and Witten have shown that diffeomorphism invariants of 4-manifolds can be found essentially by counting the number of solutions of a set of massless, Abelian monopole equations [2, 3]. It was later noted that topological quantum field theories, which are extensively studied in this context in 2, 3 and 4 dimensions, also exist in higher dimensions [4, 5, 6, 7]. Therefore it is of interest to consider monopole equations in higher dimensions and thus to generalize the 4-dimensional Seiberg–Witten theory. In fact Seiberg–Witten equations can be constructed on any even dimensional manifold ($D = 2n$) with a spin$^c$-structure [8]. But there are problems. The self-duality of 2-forms plays an eminent role in the 4-dimensional theory and we encounter projection maps $\rho^+(F_A) = \rho^+(F_A^+) = \rho(F_A^+)$ (see the next section). The first projection $\rho^+(F_A)$ is meaningful in any dimension $2n \ge 4$. However, a straightforward generalization of the Seiberg–Witten equations using this projection yields an overdetermined set of equations having no non-trivial solutions even locally [9]. To use the other projections, one needs an appropriately generalized notion of self-dual 2-forms. On the other hand there is no unique definition of self-duality in more than four dimensions. In a previous paper [10] we reviewed the existing definitions of self-duality and gave an eigenvalue criterion for specifying self-dual 2-forms on any even dimensional manifold. In particular, in $D = 8$ dimensions, there is a linear notion of self-duality defined on 8-manifolds with Spin(7) holonomy [11, 12]. This corresponds to a specific choice of a maximal linear subspace in the set of (non-linear) self-dual 2-forms as defined by our eigenvalue criterion [13]. Eight dimensions is special because in this particular case the set of linear
Spin(7) self-duality equations can be solved by making use of octonions [14]. The existence of octonionic instantons which realise the last Hopf fibration $S^{15} \to S^8$ is closely related to the properties of the octonion algebra [15, 16, 17]. Here we use this linear notion of self-duality to construct a consistent set of Abelian monopole equations on 8-manifolds with Spin(7) holonomy. These equations turn out to be elliptic and locally they admit non-trivial solutions which include all 4-dimensional Seiberg–Witten solutions as a special case. But before giving our 8-dimensional monopole equations, we first wish in the next section to give the setup and generalizations of the 4-dimensional Seiberg–Witten equations to arbitrary even dimensional manifolds with spin$^c$-structure as proposed by Salamon [8]. This is going to help us put our monopole equations into their proper context. We also wish to note that any 8-manifold with Spin(7) holonomy is automatically a spin manifold [18, 19] and thus carries a spin$^c$-structure, making the application of the general approach possible. In fact our monopole equations can always be expressed purely in the real realm, but in order to relate them to the 4-dimensional Seiberg–Witten equations, it is preferable to use the spin$^c$-structure and complex spinors.

2. Definitions and Notation

A spin$^c$-structure on a $2n$-dimensional real inner-product space $V$ is a pair $(W, \Gamma)$, where $W$ is a $2^n$-dimensional complex Hermitian space and $\Gamma : V \to \mathrm{End}(W)$ is a linear map satisfying
$$\Gamma(v)^* = -\Gamma(v), \qquad \Gamma(v)^2 = -\|v\|^2$$
for $v \in V$. Globalizing this defines the notion of a spin$^c$-structure $\Gamma : TX \to \mathrm{End}(W)$ on a $2n$-dimensional (oriented) manifold $X$, $W$ being a $2^n$-dimensional complex Hermitian vector bundle on $X$. Such a structure exists if and only if $w_2(X)$ has an integral lift. $\Gamma$ extends to an isomorphism between the complex Clifford algebra bundle $C^c(TX)$ and $\mathrm{End}(W)$. There is a natural splitting $W = W^+ \oplus W^-$ into the $\pm i^n$ eigenspaces of $\Gamma(e_{2n} e_{2n-1} \cdots e_1)$, where $e_1, e_2, \cdots, e_{2n}$ is any positively oriented local orthonormal frame of $TX$. The extension of $\Gamma$ to $C_2(X)$ gives, via the identification of $\Lambda^2(T^*X)$ with $C_2(X)$, a map $\rho : \Lambda^2(T^*X) \to \mathrm{End}(W)$ given by
$$\rho\Big( \sum_{i<j} \eta_{ij}\, e_i^* \wedge e_j^* \Big) = \sum_{i<j} \eta_{ij}\, \Gamma(e_i)\Gamma(e_j).$$
The bundles $W^\pm$ are invariant under $\rho(\eta)$ for $\eta \in \Lambda^2(T^*X)$. Denote $\rho^\pm(\eta) = \rho(\eta)|_{W^\pm}$. The map $\rho$ (and $\rho^\pm$) extends to $\rho : \Lambda^2(T^*X) \otimes \mathbb{C} \to \mathrm{End}(W)$. (If $\eta \in \Lambda^2(T^*X) \otimes \mathbb{C}$ is real-valued then $\rho(\eta)$ is skew-Hermitian and if $\eta$ is imaginary-valued then $\rho(\eta)$ is Hermitian.) A Hermitian connection $\nabla$ on $W$ is called a spin$^c$ connection (compatible with the Levi–Civita connection) if
$$\nabla_v(\Gamma(w)\Phi) = \Gamma(w)\nabla_v\Phi + \Gamma(\nabla_v w)\Phi,$$
where $\Phi$ is a spinor (section of $W$), $v$ and $w$ are vector fields on $X$ and $\nabla_v w$ is the Levi–Civita connection on $X$. $\nabla$ preserves the subbundles $W^\pm$. There is a principal
$\mathrm{Spin}^c(2n) = \{ e^{i\theta} x \mid \theta \in \mathbb{R},\ x \in \mathrm{Spin}(2n) \} \subset C^c(\mathbb{R}^{2n})$ bundle $P$ on $X$ such that $W$ and $TX$ can be recovered as the associated bundles
$$W = P \times_{\mathrm{Spin}^c(2n)} \mathbb{C}^{2^n}, \qquad TX = P \times_{\mathrm{Ad}} \mathbb{R}^{2n},$$
Ad being the adjoint action of $\mathrm{Spin}^c(2n)$ on $\mathbb{R}^{2n}$. We then get a complex line bundle $L_\Gamma = P \times_\delta \mathbb{C}$ using the map $\delta : \mathrm{Spin}^c(2n) \to S^1$ given by $\delta(e^{i\theta} x) = e^{2i\theta}$. There is a one-to-one correspondence between spin$^c$ connections on $W$ and $\mathrm{spin}^c(2n) = \mathrm{Lie}(\mathrm{Spin}^c(2n)) = \mathrm{spin}(2n) \oplus i\mathbb{R}$-valued connection 1-forms $\hat A \in \mathcal{A}(P) \subset \Omega^1(P, \mathrm{spin}^c(2n))$ on $P$. Now consider the trace part $A$ of $\hat A$: $A = \frac{1}{2^n} \mathrm{trace}(\hat A)$. This is an imaginary valued 1-form $A \in \Omega^1(P, i\mathbb{R})$ which is equivariant and satisfies
$$A_p(p \cdot \xi) = \frac{1}{2^n}\, \mathrm{trace}(\xi)$$
for $v \in T_pP$, $g \in \mathrm{Spin}^c(2n)$, $\xi \in \mathrm{spin}^c(2n)$ (where $p \cdot \xi$ is the infinitesimal action). Denote the set of imaginary valued 1-forms on $P$ satisfying these two properties by $\mathcal{A}(\Gamma)$. There is a one-to-one correspondence between these 1-forms and spin$^c$ connections on $W$. Denote the connection corresponding to $A$ by $\nabla_A$. $\mathcal{A}(\Gamma)$ is an affine space with parallel vector space $\Omega^1(X, i\mathbb{R})$. For $A \in \mathcal{A}(\Gamma)$, the 1-form $2A \in \Omega^1(P, i\mathbb{R})$ represents a connection on the line bundle $L_\Gamma$. For this reason $A$ is called a virtual connection on the virtual line bundle $L_\Gamma^{1/2}$. Let $F_A \in \Omega^2(X, i\mathbb{R})$ denote the curvature of the 1-form $A$. Finally, let $D_A$ denote the Dirac operator corresponding to $A \in \mathcal{A}(\Gamma)$,
$$D_A : C^\infty(X, W^+) \to C^\infty(X, W^-),$$
defined by
$$D_A(\Phi) = \sum_{i=1}^{2n} \Gamma(e_i)\, \nabla_{A, e_i}(\Phi),$$
where $\Phi \in C^\infty(X, W^+)$ and $e_1, e_2, \cdots, e_{2n}$ is any local orthonormal frame. The Seiberg–Witten equations can now be expressed as follows. Fix a spin$^c$-structure $\Gamma : TX \to \mathrm{End}(W)$ on $X$ and consider the pair $(A, \Phi) \in \mathcal{A}(\Gamma) \times C^\infty(X, W^+)$. The Seiberg–Witten equations read
$$D_A(\Phi) = 0, \qquad \rho^+(F_A) = (\Phi\Phi^*)_0,$$
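where $(\Phi\Phi^*)_0 \in C^\infty(X, \mathrm{End}(W^+))$ is defined by $(\Phi\Phi^*)(\tau) = \langle \Phi, \tau \rangle \Phi$ for $\tau \in C^\infty(X, W^+)$ and $(\Phi\Phi^*)_0$ is the traceless part of $(\Phi\Phi^*)$.

3. Seiberg–Witten Equations on 4-Manifolds

Before going over to 8-manifolds, we first show that the Seiberg–Witten equations on 4-manifolds ([8, p. 232]) can be rewritten in a different form. The Dirac equation
$$D_A(\Phi) = 0 \tag{1}$$
can be explicitly written as
$$\nabla_1\Phi = I\nabla_2\Phi + J\nabla_3\Phi + K\nabla_4\Phi, \tag{2}$$
and
$$\rho^+(F_A) = (\Phi\Phi^*)_0 \tag{3}$$
is equivalent to the set
$$F_{12} + F_{34} = -\tfrac12 \Phi^* I \Phi, \qquad F_{13} - F_{24} = -\tfrac12 \Phi^* J \Phi, \qquad F_{14} + F_{23} = -\tfrac12 \Phi^* K \Phi, \tag{4}$$
where $\Phi : \mathbb{R}^4 \to \mathbb{C}^2$, $\nabla_i\Phi = \frac{\partial\Phi}{\partial x_i} + A_i\Phi$, $A = \sum_{i=1}^4 A_i\, dx_i \in \Omega^1(\mathbb{R}^4, i\mathbb{R})$, $F_A = \sum_{i<j} F_{ij}\, dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^4, i\mathbb{R})$, and
$$I = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.$$
In the most explicit form, these equations can be written as
$$\begin{aligned}
\frac{\partial\phi_1}{\partial x_1} + A_1\phi_1 &= i\Big( \frac{\partial\phi_1}{\partial x_2} + A_2\phi_1 \Big) + \frac{\partial\phi_2}{\partial x_3} + A_3\phi_2 + i\Big( \frac{\partial\phi_2}{\partial x_4} + A_4\phi_2 \Big), \\
\frac{\partial\phi_2}{\partial x_1} + A_1\phi_2 &= -i\Big( \frac{\partial\phi_2}{\partial x_2} + A_2\phi_2 \Big) - \Big( \frac{\partial\phi_1}{\partial x_3} + A_3\phi_1 \Big) + i\Big( \frac{\partial\phi_1}{\partial x_4} + A_4\phi_1 \Big)
\end{aligned} \tag{5}$$
(for $D_A(\Phi) = 0$) and
$$\begin{aligned}
F_{12} + F_{34} &= -\tfrac{i}{2}(\phi_1\bar\phi_1 - \phi_2\bar\phi_2), \\
F_{13} - F_{24} &= \tfrac12(\phi_1\bar\phi_2 - \phi_2\bar\phi_1), \\
F_{14} + F_{23} &= -\tfrac{i}{2}(\phi_1\bar\phi_2 + \phi_2\bar\phi_1)
\end{aligned} \tag{6}$$
(for $\rho^+(F_A) = (\Phi\Phi^*)_0$). We will reinterpret the second part of these equations in the following way: The 6-dimensional bundle of real-valued 2-forms on $\mathbb{R}^4$ has a 3-dimensional subbundle of self-dual forms with orthogonal basis
$$f_1 = dx_1 \wedge dx_2 + dx_3 \wedge dx_4, \qquad f_2 = dx_1 \wedge dx_3 - dx_2 \wedge dx_4, \qquad f_3 = dx_1 \wedge dx_4 + dx_2 \wedge dx_3, \tag{7}$$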
in each fiber with respect to the usual metric. These forms span a 3-dimensional complex subbundle of the bundle of complex-valued 2-forms. The projection of a (global) 2-form $F = \sum_{i<j} F_{ij}\, dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^4, i\mathbb{R})$ onto this complex subbundle is given by
$$F^+ = \tfrac12 (F_{12} + F_{34}) f_1 + \tfrac12 (F_{13} - F_{24}) f_2 + \tfrac12 (F_{14} + F_{23}) f_3. \tag{8}$$
We have $\rho^+(f_1) = 2I$, $\rho^+(f_2) = 2J$, $\rho^+(f_3) = 2K$, so that
$$\rho^+(F^+) = (F_{12} + F_{34}) I + (F_{13} - F_{24}) J + (F_{14} + F_{23}) K. \tag{9}$$
On the other hand, the orthogonal projection $(\Phi\Phi^*)^+$ of $\Phi\Phi^*$ onto the subbundle of the positive spinor bundle generated by the (Hermitian-) orthogonal basis $(\rho^+(f_1), \rho^+(f_2), \rho^+(f_3))$ is given by
$$\langle 2I, \Phi\Phi^* \rangle\, \frac{2I}{|2I|^2} + \langle 2J, \Phi\Phi^* \rangle\, \frac{2J}{|2J|^2} + \langle 2K, \Phi\Phi^* \rangle\, \frac{2K}{|2K|^2} = \tfrac12 \langle I, \Phi\Phi^* \rangle I + \tfrac12 \langle J, \Phi\Phi^* \rangle J + \tfrac12 \langle K, \Phi\Phi^* \rangle K. \tag{10}$$
Since
$$\langle I, \Phi\Phi^* \rangle = -\Phi^* I \Phi, \qquad \langle J, \Phi\Phi^* \rangle = -\Phi^* J \Phi, \qquad \langle K, \Phi\Phi^* \rangle = -\Phi^* K \Phi, \tag{11}$$
this shows that the second part of the Seiberg–Witten equations can be expressed as follows: Given any (global, imaginary-valued) 2-form $F$, the image under the map $\rho^+$ of its self-dual part $F^+$ coincides with the orthogonal projection of $\Phi\Phi^*$ onto the subbundle of the positive spinor bundle which is the image bundle of the complexified subbundle of self-dual 2-forms under the map $\rho^+$, that is,
$$\rho^+(F^+) = (\Phi\Phi^*)^+. \tag{12}$$
Indeed, in the present case $(\Phi\Phi^*)^+$ is nothing else than $(\Phi\Phi^*)_0$. In this modified form the Seiberg–Witten equations allow a tempting generalisation. Suppose we are given a subbundle $S \subset \Lambda^2(T^*X)$. Denote the complexification of $S$ by $S^*$, the projection of an imaginary valued 2-form field $F$ onto $S^*$ by $F^+$ and the projection of $\phi\phi^*$ onto $\rho^+(S^*)$ by $(\phi\phi^*)^+$. Then the equation $\rho^+(F^+) = (\phi\phi^*)^+$ can be taken as a substitute for the 4-dimensional equation (3) in $2n$ dimensions. An arbitrary choice of $S$ would probably not give anything interesting, but stable subbundles with respect to certain structures on $X$ are likely to give useful equations.
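The passage from (4) to (6) is a short matrix computation with the three matrices $I$, $J$, $K$. As a hedged illustration (the sample spinor and the helper names are assumptions made here, not taken from the text), the following sketch evaluates $-\tfrac12\Phi^* I \Phi$, $-\tfrac12\Phi^* J \Phi$, $-\tfrac12\Phi^* K \Phi$ for one spinor, compares them with the right-hand sides of (6), and confirms the quaternion relations.

```python
def matvec(m, x):
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sandwich(m, phi):
    """Phi^* M Phi for a 2-component spinor Phi."""
    mp = matvec(m, phi)
    return phi[0].conjugate() * mp[0] + phi[1].conjugate() * mp[1]


I = [[1j, 0], [0, -1j]]
J = [[0, 1], [-1, 0]]
K = [[0, 1j], [1j, 0]]

phi1, phi2 = 0.3 + 1.1j, -0.8 + 0.25j   # sample spinor components (assumption)
Phi = [phi1, phi2]

# Right-hand sides of (4): -1/2 Phi^* M Phi for M = I, J, K.
rhs_4 = [-0.5 * sandwich(M, Phi) for M in (I, J, K)]

# Right-hand sides of (6), written out in components.
rhs_6 = [
    -0.5j * (phi1 * phi1.conjugate() - phi2 * phi2.conjugate()),
    0.5 * (phi1 * phi2.conjugate() - phi2 * phi1.conjugate()),
    -0.5j * (phi1 * phi2.conjugate() + phi2 * phi1.conjugate()),
]

max_diff = max(abs(a - b) for a, b in zip(rhs_4, rhs_6))
```

All three values come out purely imaginary, as they must for the components of an $i\mathbb{R}$-valued curvature form.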
4. Monopole Equations on 8-Manifolds

We now consider 8-manifolds with Spin(7) holonomy. In this case there are two natural choices of $S$ which have already found applications in the existing literature. In the 28-dimensional space of 2-forms $\Omega^2(\mathbb{R}^8, \mathbb{R})$, there are two orthogonal subspaces $S_1$ and $S_2$ (7 and 21 dimensional, respectively) which are $\mathrm{spin}(7) \subset \mathrm{so}(8)$ invariant [11, 12]. On an 8-manifold $X$ with Spin(7) holonomy (so that the structure group is reducible to Spin(7)) they give rise to global subbundles (denoted by the same letters) $S_1, S_2 \subset \Lambda^2(T^*X)$ which can play the above mentioned role. We will concentrate on the 7-dimensional subbundle $S_1$ and show that the resulting equations are elliptic, exemplify the local existence of non-trivial solutions and show that they are related to solutions of the 4-dimensional Seiberg–Witten equations. We would like to point out that instead of the widely known CDFN 7-plane, we are working with another 7-plane in $\Omega^2(\mathbb{R}^8, \mathbb{R})$, which is conjugate to the CDFN 7-plane and thus invariant under a conjugate spin(7) embedding in so(8). This has the advantage that the 2-forms in this 7-plane can be expressed in an elegant way in terms of 4-dimensional self-dual and anti-self-dual 2-forms. (For a general account we refer to our previous work, [10].) We will define this 7-plane below, but before that, for the sake of clarity, we first wish to present the global monopole equations. Let $X$ be an 8-manifold with Spin(7) holonomy and $S$ be any stable subbundle of $\Lambda^2(T^*X)$ and $S^*$ its complexification. Given an imaginary valued global 2-form $F$, let us denote its projection onto $S^*$ by $F^+$ and the projection of any global spinor $\phi$ onto the subbundle $\rho^+(S^*) \subset \mathrm{End}(W^+)$ by $\phi^+$. Then the monopole equations read
$$D_A(\phi) = 0, \tag{13}$$
$$\rho^+(F_A^+) = (\phi\phi^*)^+. \tag{14}$$
Now, we define $S_1 \subset \Omega^2(\mathbb{R}^8, \mathbb{R})$ to be the linear space of 2-forms
$$\omega = \sum_{i<j} \omega_{ij}\, dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^8, \mathbb{R}),$$
which can be expressed in matrix form as
$$\omega = \omega_{12}\, f + \begin{pmatrix} \omega' & \omega'' \\ \omega'' & -\omega' \end{pmatrix}, \tag{15}$$
where $\omega_{12}$ is a real function, $\omega'$ is the matrix of a 4-dimensional self-dual 2-form, $\omega''$ is the matrix of a 4-dimensional anti-self-dual 2-form and we let $f = i\sigma_2 \otimes I_4$. These 2-forms span a 7-dimensional linear subspace $S_1$ in the 28-dimensional space of 2-forms and the square of any element in this subspace is a scalar matrix. $S_1$ is maximal with respect to this property. We choose the following orthogonal basis for this maximal linear subspace of self-dual 2-forms:
$$\begin{aligned}
f_1 &= dx_1 \wedge dx_5 + dx_2 \wedge dx_6 + dx_3 \wedge dx_7 + dx_4 \wedge dx_8, \\
f_2 &= dx_1 \wedge dx_2 + dx_3 \wedge dx_4 - dx_5 \wedge dx_6 - dx_7 \wedge dx_8, \\
f_3 &= dx_1 \wedge dx_6 - dx_2 \wedge dx_5 - dx_3 \wedge dx_8 + dx_4 \wedge dx_7, \\
f_4 &= dx_1 \wedge dx_3 - dx_2 \wedge dx_4 - dx_5 \wedge dx_7 + dx_6 \wedge dx_8, \\
f_5 &= dx_1 \wedge dx_7 + dx_2 \wedge dx_8 - dx_3 \wedge dx_5 - dx_4 \wedge dx_6, \\
f_6 &= dx_1 \wedge dx_4 + dx_2 \wedge dx_3 - dx_5 \wedge dx_8 - dx_6 \wedge dx_7, \\
f_7 &= dx_1 \wedge dx_8 - dx_2 \wedge dx_7 + dx_3 \wedge dx_6 - dx_4 \wedge dx_5.
\end{aligned} \tag{16}$$
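In matrix notation we set $f_1 = f$, and take $f_2 = \sigma_3 \otimes a_1$, $f_3 = \sigma_1 \otimes b_1$, $f_4 = \sigma_3 \otimes a_2$, $f_5 = \sigma_1 \otimes b_2$, $f_6 = \sigma_3 \otimes a_3$, $f_7 = \sigma_1 \otimes b_3$, where $(\sigma_1, \sigma_2, \sigma_3)$ are the usual Pauli matrices and we have
$$a_1 = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad a_2 = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \quad a_3 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}$$
and
$$b_1 = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \quad b_2 = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad b_3 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}. \tag{17}$$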
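The claim that every element of $S_1$ squares to a scalar matrix can be checked directly on the matrix representation. The sketch below is an illustration only: the entries of `A1`, ..., `B3` are this edition's reading of the matrices in (17), and the helper names and the test coefficients are assumptions. It builds the seven $8\times8$ matrices as Kronecker products and verifies $f_i f_j + f_j f_i = -2\delta_{ij}$, which implies $(\sum_i c_i f_i)^2 = -(\sum_i c_i^2)\cdot 1$.

```python
def kron(a, b):
    """Kronecker product of a 2x2 matrix a with a 4x4 matrix b (an 8x8 matrix)."""
    return [[a[i // 4][j // 4] * b[i % 4][j % 4] for j in range(8)] for i in range(8)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def scalar_matrix(c, n=8):
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(8)] for i in range(8)]


I4 = scalar_matrix(1, 4)
I_SIGMA2 = [[0, 1], [-1, 0]]    # i * sigma_2
SIGMA1 = [[0, 1], [1, 0]]
SIGMA3 = [[1, 0], [0, -1]]

A1 = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]
A2 = [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]]
A3 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
B1 = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]
B2 = [[0, 0, -1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]]
B3 = [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

# f_1, ..., f_7 in the order used in the text.
F = [kron(I_SIGMA2, I4), kron(SIGMA3, A1), kron(SIGMA1, B1), kron(SIGMA3, A2),
     kron(SIGMA1, B2), kron(SIGMA3, A3), kron(SIGMA1, B3)]

clifford_ok = all(
    anticommutator(F[i], F[j]) == scalar_matrix(-2 if i == j else 0)
    for i in range(7) for j in range(7)
)

coeffs = (2, -1, 3, 0, 5, -4, 1)   # arbitrary integer test coefficients (assumption)
omega = [[sum(c * f[i][j] for c, f in zip(coeffs, F)) for j in range(8)] for i in range(8)]
omega_squared = matmul(omega, omega)
```

The check is insensitive to the overall sign of each individual $f_i$, so it tests the Clifford structure rather than the sign conventions of the basis.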
At this point it will be instructive to show that the above basis corresponds to a representation of the Clifford algebra $Cl_7$ induced by right multiplications in the algebra of octonions. We adopt the Cayley–Dickson approach and describe a quaternion by a pair of complex numbers so that $a = (x + iy) + j(u + iv)$, where $(i, j, ij = k)$ are the imaginary unit quaternions. In a similar way an octonion is described by a pair of quaternions $(a, b)$. Then the octonionic multiplication rule is
$$(a, b) \cdot (c, d) = (ac - \bar d b,\ da + b\bar c). \tag{18}$$
If we now represent an octonion $(a, b)$ by a vector in $\mathbb{R}^8$, its right multiplication by an imaginary unit octonion corresponds to a linear transformation on $\mathbb{R}^8$. We thus obtain the following correspondences:
$$(0, 1) \to f_1, \quad (i, 0) \to f_2, \quad (j, 0) \to f_3, \quad (k, 0) \to f_4, \quad (0, i) \to f_5, \quad (0, j) \to f_6, \quad (0, k) \to f_7. \tag{19}$$
The projection $F^+$ of a 2-form $F = \sum_{i<j} F_{ij}\, dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^8, i\mathbb{R})$ onto the complexification of the above self-dual subspace is given by
$$\begin{aligned}
F^+ ={}& \tfrac14 (F_{15} + F_{26} + F_{37} + F_{48}) f_1 + \tfrac14 (F_{12} + F_{34} - F_{56} - F_{78}) f_2 \\
&+ \tfrac14 (F_{16} - F_{25} - F_{38} + F_{47}) f_3 + \tfrac14 (F_{13} - F_{24} - F_{57} + F_{68}) f_4 \\
&+ \tfrac14 (F_{17} + F_{28} - F_{35} - F_{46}) f_5 + \tfrac14 (F_{14} + F_{23} - F_{58} - F_{67}) f_6 \\
&+ \tfrac14 (F_{18} - F_{27} + F_{36} - F_{45}) f_7.
\end{aligned}$$
We now fix the constant spin$^c$-structure $\Gamma : \mathbb{R}^8 \to \mathbb{C}^{16\times16}$ given by
$$\Gamma(e_i) = \begin{pmatrix} 0 & \gamma(e_i) \\ -\gamma(e_i)^* & 0 \end{pmatrix}, \tag{20}$$
where $e_i$, $i = 1, 2, \ldots, 8$ is the standard basis for $\mathbb{R}^8$ and $\gamma(e_1) = \mathrm{Id}$, $\gamma(e_i) = f_{i-1}$ for $i = 2, 3, \ldots, 8$. We note that this choice is specific to 8 dimensions, because $2n = 2^{n-1}$ only for $n = 4$. We have $X = \mathbb{R}^8$, $W = \mathbb{R}^8 \times \mathbb{C}^{16}$, $W^\pm = \mathbb{R}^8 \times \mathbb{C}^8$ and $L_\Gamma = L_\Gamma^{1/2} = \mathbb{R}^8 \times \mathbb{C}$. Consider the connection 1-form
$$A = \sum_{i=1}^{8} A_i\, dx_i \in \Omega^1(\mathbb{R}^8, i\mathbb{R}) \tag{21}$$
on the line bundle $\mathbb{R}^8 \times \mathbb{C}$. Its curvature is given by
$$F_A = \sum_{i<j} F_{ij}\, dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^8, i\mathbb{R}), \tag{22}$$
where $F_{ij} = \frac{\partial A_j}{\partial x_i} - \frac{\partial A_i}{\partial x_j}$. The spin$^c$ connection $\nabla = \nabla_A$ on $W^+$ is given by
$$\nabla_i \Phi = \frac{\partial\Phi}{\partial x_i} + A_i \Phi \tag{23}$$
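($i = 1, \ldots, 8$), where $\Phi : \mathbb{R}^8 \to \mathbb{C}^8$. Therefore the map $\rho^+ : \Lambda^2(T^*X) \otimes \mathbb{C} \to \mathrm{End}(W^+)$ can be computed for our generators $f_i$ to give
$$\begin{aligned}
\rho^+(f_1) &= \gamma(e_1)\gamma(e_5) + \gamma(e_2)\gamma(e_6) + \gamma(e_3)\gamma(e_7) + \gamma(e_4)\gamma(e_8), \\
\rho^+(f_2) &= \gamma(e_1)\gamma(e_2) + \gamma(e_3)\gamma(e_4) - \gamma(e_5)\gamma(e_6) - \gamma(e_7)\gamma(e_8), \\
\rho^+(f_3) &= \gamma(e_1)\gamma(e_6) - \gamma(e_2)\gamma(e_5) - \gamma(e_3)\gamma(e_8) + \gamma(e_4)\gamma(e_7), \\
\rho^+(f_4) &= \gamma(e_1)\gamma(e_3) - \gamma(e_2)\gamma(e_4) - \gamma(e_5)\gamma(e_7) + \gamma(e_6)\gamma(e_8), \\
\rho^+(f_5) &= \gamma(e_1)\gamma(e_7) + \gamma(e_2)\gamma(e_8) - \gamma(e_3)\gamma(e_5) - \gamma(e_4)\gamma(e_6), \\
\rho^+(f_6) &= \gamma(e_1)\gamma(e_4) + \gamma(e_2)\gamma(e_3) - \gamma(e_5)\gamma(e_8) - \gamma(e_6)\gamma(e_7), \\
\rho^+(f_7) &= \gamma(e_1)\gamma(e_8) - \gamma(e_2)\gamma(e_7) + \gamma(e_3)\gamma(e_6) - \gamma(e_4)\gamma(e_5).
\end{aligned}$$
Then for a connection $A = \sum_{i=1}^8 A_i\, dx_i \in \Omega^1(\mathbb{R}^8, i\mathbb{R})$ and a given complex 8-spinor $\Psi = (\psi_1, \psi_2, \ldots, \psi_8) \in C^\infty(X, W^+) = C^\infty(\mathbb{R}^8, \mathbb{R}^8 \times \mathbb{C}^8)$ we state our 8-dimensional monopole equations as follows:
$$D_A(\Psi) = 0, \qquad \rho^+(F_A^+) = (\Psi\Psi^*)^+. \tag{24}$$
Here $(\Psi\Psi^*)^+$ is the orthogonal projection of $\Psi\Psi^*$ onto the spinor subbundle spanned by $\rho^+(f_i)$, $i = 1, 2, \ldots, 7$. More explicitly, $D_A(\Psi) = 0$ can be expressed as
$$\nabla_1 \Psi = \gamma(e_2)\nabla_2\Psi + \gamma(e_3)\nabla_3\Psi + \ldots + \gamma(e_8)\nabla_8\Psi \tag{25}$$
and $\rho^+(F_A^+) = (\Psi\Psi^*)^+$ is equivalent to the equation
$$\rho^+(F_A^+) = \sum_{i=1}^{7} \langle \rho^+(f_i), \Psi\Psi^* \rangle\, \rho^+(f_i) / |\rho^+(f_i)|^2. \tag{26}$$
Equation (26) is equivalent to the set of equations
$$\begin{aligned}
F_{15} + F_{26} + F_{37} + F_{48} &= \tfrac18 \langle \rho^+(f_1), \Psi\Psi^* \rangle, \\
F_{12} + F_{34} - F_{56} - F_{78} &= \tfrac18 \langle \rho^+(f_2), \Psi\Psi^* \rangle, \\
F_{16} - F_{25} - F_{38} + F_{47} &= \tfrac18 \langle \rho^+(f_3), \Psi\Psi^* \rangle, \\
F_{13} - F_{24} - F_{57} + F_{68} &= \tfrac18 \langle \rho^+(f_4), \Psi\Psi^* \rangle, \\
F_{17} + F_{28} - F_{35} - F_{46} &= \tfrac18 \langle \rho^+(f_5), \Psi\Psi^* \rangle, \\
F_{14} + F_{23} - F_{58} - F_{67} &= \tfrac18 \langle \rho^+(f_6), \Psi\Psi^* \rangle, \\
F_{18} - F_{27} + F_{36} - F_{45} &= \tfrac18 \langle \rho^+(f_7), \Psi\Psi^* \rangle,
\end{aligned}$$
or still more explicitly to the equations
$$\begin{aligned}
F_{15} + F_{26} + F_{37} + F_{48} &= \tfrac14 (\psi_1\bar\psi_3 - \psi_3\bar\psi_1 - \psi_2\bar\psi_4 + \psi_4\bar\psi_2 - \psi_5\bar\psi_7 + \psi_7\bar\psi_5 - \psi_6\bar\psi_8 + \psi_8\bar\psi_6), \\
F_{12} + F_{34} - F_{56} - F_{78} &= \tfrac14 (\psi_1\bar\psi_5 - \psi_5\bar\psi_1 - \psi_2\bar\psi_6 + \psi_6\bar\psi_2 + \psi_3\bar\psi_7 - \psi_7\bar\psi_3 + \psi_4\bar\psi_8 - \psi_8\bar\psi_4),
\end{aligned}$$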
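$$\begin{aligned}
F_{16} - F_{25} - F_{38} + F_{47} &= \tfrac14 (\psi_1\bar\psi_7 - \psi_7\bar\psi_1 + \psi_2\bar\psi_8 - \psi_8\bar\psi_2 - \psi_3\bar\psi_5 + \psi_5\bar\psi_3 + \psi_4\bar\psi_6 - \psi_6\bar\psi_4), \\
F_{13} - F_{24} - F_{57} + F_{68} &= \tfrac14 (\psi_1\bar\psi_2 - \psi_2\bar\psi_1 + \psi_3\bar\psi_4 - \psi_4\bar\psi_3 + \psi_5\bar\psi_6 - \psi_6\bar\psi_5 - \psi_7\bar\psi_8 + \psi_8\bar\psi_7), \\
F_{17} + F_{28} - F_{35} - F_{46} &= \tfrac14 (\psi_1\bar\psi_4 - \psi_4\bar\psi_1 + \psi_2\bar\psi_3 - \psi_3\bar\psi_2 - \psi_5\bar\psi_8 + \psi_8\bar\psi_5 + \psi_6\bar\psi_7 - \psi_7\bar\psi_6), \\
F_{14} + F_{23} - F_{58} - F_{67} &= \tfrac14 (-\psi_1\bar\psi_6 + \psi_6\bar\psi_1 - \psi_2\bar\psi_5 + \psi_5\bar\psi_2 - \psi_3\bar\psi_8 + \psi_8\bar\psi_3 + \psi_4\bar\psi_7 - \psi_7\bar\psi_4), \\
F_{18} - F_{27} + F_{36} - F_{45} &= \tfrac14 (\psi_1\bar\psi_8 - \psi_8\bar\psi_1 - \psi_2\bar\psi_7 + \psi_7\bar\psi_2 - \psi_3\bar\psi_6 + \psi_6\bar\psi_3 - \psi_4\bar\psi_5 + \psi_5\bar\psi_4).
\end{aligned}$$

5. Conclusion

We will now show that the system of monopole equations (25)–(26) forms an elliptic system. These equations can be written compactly in the form
$$\langle F, f_i \rangle = \tfrac18 \langle \rho^+(f_i), \Psi\Psi^* \rangle, \quad i = 1, \ldots, 7, \qquad D_A(\Psi) = 0.$$
If in addition we impose the Coulomb gauge condition
$$\sum_{i=1}^{8} \partial_i A_i = 0,$$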
we obtain a system of first order partial differential equations consisting of eight equations for the components of the spinor $\Psi$ and eight equations for the components of the connection 1-form $A$. The characteristic determinant of this system [20] is the product of the characteristic determinants of the equations for $\Psi$ and $A$. As the Dirac operator is elliptic [19], the ellipticity of the present system depends on the characteristic determinant of the system consisting of $\langle F, f_i \rangle = \tfrac18 \langle \rho^+(f_i), \Psi\Psi^* \rangle$, $i = 1, \ldots, 7$, and the Coulomb gauge condition. In the computation of the characteristic determinant, the fifth row, for instance, is obtained from
$$F_{15} + F_{26} + F_{37} + F_{48} = \partial_1 A_5 - \partial_5 A_1 + \partial_2 A_6 - \partial_6 A_2 + \partial_3 A_7 - \partial_7 A_3 + \partial_4 A_8 - \partial_8 A_4$$
by replacing $\partial_i$ by $\xi_i$. Thus after a rearrangement of the order of the equations, the characteristic determinant can be obtained as
$$\det \begin{pmatrix}
\xi_1 & \xi_2 & \xi_3 & \xi_4 & \xi_5 & \xi_6 & \xi_7 & \xi_8 \\
-\xi_2 & \xi_1 & -\xi_4 & \xi_3 & \xi_6 & -\xi_5 & \xi_8 & -\xi_7 \\
-\xi_3 & \xi_4 & \xi_1 & -\xi_2 & \xi_7 & -\xi_8 & -\xi_5 & \xi_6 \\
-\xi_4 & -\xi_3 & \xi_2 & \xi_1 & \xi_8 & \xi_7 & -\xi_6 & -\xi_5 \\
-\xi_5 & -\xi_6 & -\xi_7 & -\xi_8 & \xi_1 & \xi_2 & \xi_3 & \xi_4 \\
-\xi_6 & \xi_5 & \xi_8 & -\xi_7 & -\xi_2 & \xi_1 & \xi_4 & -\xi_3 \\
-\xi_7 & -\xi_8 & \xi_5 & \xi_6 & -\xi_3 & -\xi_4 & \xi_1 & \xi_2 \\
-\xi_8 & \xi_7 & -\xi_6 & \xi_5 & -\xi_4 & \xi_3 & -\xi_2 & \xi_1
\end{pmatrix}.$$
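The characteristic matrix displayed above has mutually orthogonal rows, each of squared length $|\xi|^2 = \xi_1^2 + \cdots + \xi_8^2$, which already fixes its determinant up to sign. The sketch below checks this with exact rational arithmetic for integer test vectors; the function names and the test vectors are illustrative assumptions, and the elimination routine is a generic one rather than anything taken from [20].

```python
from fractions import Fraction


def characteristic_matrix(xi):
    """The 8x8 matrix of the characteristic determinant, rows as displayed in the text."""
    x1, x2, x3, x4, x5, x6, x7, x8 = xi
    return [
        [x1, x2, x3, x4, x5, x6, x7, x8],
        [-x2, x1, -x4, x3, x6, -x5, x8, -x7],
        [-x3, x4, x1, -x2, x7, -x8, -x5, x6],
        [-x4, -x3, x2, x1, x8, x7, -x6, -x5],
        [-x5, -x6, -x7, -x8, x1, x2, x3, x4],
        [-x6, x5, x8, -x7, -x2, x1, x4, -x3],
        [-x7, -x8, x5, x6, -x3, -x4, x1, x2],
        [-x8, x7, -x6, x5, -x4, x3, -x2, x1],
    ]


def gram(m):
    """M M^T, the matrix of inner products of the rows."""
    n = len(m)
    return [[sum(m[i][k] * m[j][k] for k in range(n)) for j in range(n)] for i in range(n)]


def determinant(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return det


xi = (1, 2, 3, 4, 5, 6, 7, 8)          # integer test vector (assumption)
norm_sq = sum(x * x for x in xi)       # |xi|^2
M = characteristic_matrix(xi)
rows_orthogonal = gram(M) == [[norm_sq if i == j else 0 for j in range(8)] for i in range(8)]
det_value = determinant(M)
```

For $\xi = e_1$ the matrix is the identity, and since the determinant cannot vanish for $\xi \neq 0$ by the orthogonality of the rows, its sign is positive throughout. Each test vector gives an exact integer value for the characteristic determinant.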
This determinant is equal to $(\xi_1^2 + \xi_2^2 + \xi_3^2 + \xi_4^2 + \xi_5^2 + \xi_6^2 + \xi_7^2 + \xi_8^2)^4$, and this proves ellipticity. Finally we point out that the monopole equations (25)–(26) admit non-trivial solutions. For example, if the pair $(A, \Phi)$ with
$$A = \sum_{i=1}^{4} A_i(x_1, x_2, x_3, x_4)\, dx_i$$
and
$$\Phi = \big(\phi_1(x_1, x_2, x_3, x_4),\ \phi_2(x_1, x_2, x_3, x_4)\big)$$
is a solution of the 4-dimensional Seiberg–Witten equations, then the pair $(B, \Psi)$ with
$$B = \sum_{i=1}^{4} A_i(x_1, x_2, x_3, x_4)\, dx_i$$
(i.e. the first four components $B_i$ of $B$ coincide with $A_i$, thus not depending on $x_5, x_6, x_7, x_8$, and the last four components of $B$ vanish) and
$$\Psi = (0, 0, \phi_1, \phi_2, 0, 0, i\phi_1, -i\phi_2),$$
where $\phi_1$ and $\phi_2$ depend only on $x_1, x_2, x_3, x_4$, is a solution of these new 8-dimensional monopole equations. It can be verified directly that $\Psi$ is harmonic with respect to $B$ and that the second part of the equations is also satisfied.

References

1. Seiberg, N., Witten, E.: Nucl. Phys. B426, 19 (1994)
2. Witten, E.: Math. Res. Lett. 1, 764 (1994)
3. Flume, R., O'Raifeartaigh, L., Sachs, I.: Brief resume of the Seiberg–Witten theory. hep-th/9611118
4. Donaldson, S.K., Thomas, R.P.: Gauge theory in higher dimensions. Oxford University preprint, 1996
5. Baulieu, L., Kanno, H., Singer, I.M.: Special quantum field theories in eight and other dimensions. hep-th/9704167
6. Acharya, B.S., O'Loughlin, M., Spence, B.: Higher dimensional analogues of Donaldson-Witten Theory. hep-th/9705138
7. Hull, C.M.: Higher dimensional Yang–Mills theories and topological terms. hep-th/9710165
8. Salamon, D.: Spin Geometry and Seiberg–Witten Invariants. April 1996 version, Book to appear
9. Bilge, A.H., Dereli, T., Koçak, Ş.: Seiberg–Witten equations on $\mathbb{R}^8$. In: The Proceedings of 5th Gökova Geometry-Topology Conference, Edited by S. Akbulut, T. Önder, R. Stern, TUBITAK, Ankara, 1997, p. 87
10. Bilge, A.H., Dereli, T., Koçak, Ş.: J. Math. Phys. 38, 4804 (1997)
11. Corrigan, E., Devchand, C., Fairlie, D., Nuyts, J.: Nucl. Phys. B214, 452 (1983)
12. Ward, R.S.: Nucl. Phys. B236, 381 (1984)
13. Bilge, A.H., Dereli, T., Koçak, Ş.: Lett. Math. Phys. 36, 301 (1996)
14. Gürsey, F., Tze, C.-H.: On the Role of Division, Jordan and Related Algebras in Particle Physics. Singapore: World Scientific, 1996
15. Fairlie, D., Nuyts, J.: J. Phys. A17, 2867 (1984)
16. Fubini, S., Nicolai, H.: Phys. Lett. B155, 369 (1985)
17. Grossman, B., Kephart, T.W., Stasheff, J.D.: Commun. Math. Phys. 96, 431 (1984) (Erratum: ibid. 100, 311 (1985))
18. Joyce, D.D.: Invent. Math. 123, 507 (1996)
19. Lawson, H.B., Michelsohn, M-L.: Spin Geometry. Princeton, NJ: Princeton U.P., 1989
20. John, F.: Partial Differential Equations. Berlin–Heidelberg–New York: Springer-Verlag, 1982
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 203, 31 – 52 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Special Kähler Manifolds

Daniel S. Freed*

Schools of Mathematics and Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA. E-mail:
[email protected] Received: 5 December 1997 / Accepted: 16 November 1998
Abstract: We give an intrinsic definition of the special geometry which arises in global N = 2 supersymmetry in four dimensions. The base of an algebraic integrable system exhibits this geometry, and with an integrality hypothesis any special Kähler manifold is so related to an integrable system. The cotangent bundle of a special Kähler manifold carries a hyperkähler metric. We also define special geometry in supergravity in terms of the special geometry in global supersymmetry.
Constraints on Riemannian metrics occur in many places in supersymmetry. For example, the requirement of extended supersymmetry in a two dimensional σ-model constrains the target manifold to be Kähler or hyperkähler depending on the amount of supersymmetry. The scalars in supergravity theories are often constrained to live on a particular homogeneous Riemannian manifold. These sorts of special metrics (metrics with restricted holonomy group, such as Kähler and hyperkähler metrics, and homogeneous metrics) are much studied by Riemannian geometers, but there are situations in which we meet something new. One important example occurs in four dimensional gauge theories with N = 2 supersymmetry: the scalars in the vector multiplet lie in a special Kähler manifold. This is the case pertaining to global supersymmetry; when coupled to N = 2 supergravity in four dimensions the scalars lie in a projective special Kähler manifold.$^1$ Notice that N = 1 supersymmetry already constrains the scalars to lie in a Kähler manifold, which must be Hodge in the supergravity case. Special geometry is the additional constraint imposed by N = 2 supersymmetry.

* The author is on leave from the Department of Mathematics at the University of Texas at Austin, Austin, TX 78712, USA, where he receives support from NSF grant DMS-962698. At the Institute for Advanced Study the author receives support from NSF grants DMS-9304580 and DMS-9627351, the Harmon Duncombe Foundation, and from the J. Seward Johnson Sr. Charitable Trust.

$^1$ Physicists use the term "special Kähler manifold" for both cases, and use words like "rigid" and "local" to distinguish them. Since these words have other connotations in geometry, we adopt a different terminology.

Special geometry appeared in the physics literature in 1984 in both global supersymmetry [ST,G] and supergravity [WP]. Strominger [St] gave a coordinate-free definition in the supergravity case. Projective special Kähler manifolds are important in mirror symmetry, as explained by Candelas and de la Ossa [CO]. Special Kähler manifolds in global supersymmetry have received more attention recently due to their prominent role in the seminal work of Seiberg and Witten on N = 2 supersymmetric Yang–Mills theories [SW1,SW2]. See [F,CRTP] for recent discussions of special geometry and for extensive references. In this paper we introduce an intrinsic$^2$ definition of special geometry: A special Kähler structure is a flat connection on the tangent bundle of a Kähler manifold. The crucial condition is expressed in (1.2). From it follow the usual equations for special coordinates, the holomorphic prepotential, the Kähler potential, etc. We recount this in Sect. 1, where we also define this geometry in terms of a holomorphic cubic form. In Sect. 2 we construct a hyperkähler metric on the cotangent bundle of a special Kähler manifold. A local version of this result appears in the physics literature [CFG]. It seems likely that there is actually a one parameter family of hyperkähler metrics of which the one we construct is a limiting case (see [SW3]), but we have not pursued that here. In Sect. 3 we prove the assertion made by Donagi and Witten [DW] that with a suitable integrality hypothesis a special Kähler manifold parametrizes an algebraic completely integrable system. As a consequence the total space of an algebraic integrable system carries a hyperkähler metric.
The usual definition of a projective special Kähler manifold is based on a particular type of variation of Hodge structure, which was first studied by Bryant and Griffiths [BG]. Our main observation here is that a projective special Kähler structure on a Hodge manifold $M$ of dimension $n$ induces a special pseudo-Kähler structure of Lorentz type on a closely related manifold $\tilde M$ of dimension $n + 1$. ($\tilde M$ is the total space of the Hodge line bundle with the zero section omitted.) With a suitable integrality hypothesis the associated intermediate Jacobians are an integrable system and carry a hyperkähler metric, results obtained previously [DM2,C]. Finally, in Sect. 5 we make some brief comments on the physics (in the case of global supersymmetry). We explain that supersymmetry combined with the quantization of electric and magnetic charges leads to the conclusion that integrable systems must enter into the low energy description of N = 2 supersymmetric gauge theories.

As mentioned above, the base of an algebraic integrable system is a special Kähler manifold. This is, I believe, the proper context for special Kähler geometry. There are many examples of algebraic integrable systems, and hopefully this excuses the paucity of examples presented here.

As mentioned in footnote 1, our terminology differs from that in the physics literature. We include the following table to aid in translation:

Our Terminology              Physics Literature
Special Kähler               Rigid Special Kähler (vector multiplets in global N = 2 supersymmetry)
Projective Special Kähler    (Local) Special Kähler (vector multiplets in N = 2 supergravity)
2 Intrinsic geometry concerns the tangent bundle and associated bundles, whereas extrinsic geometry involves bundles not constructed directly from the coordinate charts of a manifold. Definitions of special geometry in the physics literature are not intrinsic in this sense.
Special Kähler Manifolds
This paper grew out of a seminar talk explaining [DW], and it had a long gestation period since. During that time I benefited from conversations and lectures by many colleagues, including Jacques Distler, Ron Donagi, Nigel Hitchin, Graeme Segal, Nathan Seiberg, Karen Uhlenbeck, and Edward Witten. From the first version of the paper I received helpful remarks from Vicente Cortés, James Gates, Zhiqin Lu, Simon Salamon, and the referees. I thank them all.
1. Definition and Basic Properties

We introduce the following definition.

Definition 1.1. Let $M$ be a Kähler manifold with Kähler form $\omega$. A special Kähler structure on $M$ is a real flat torsionfree symplectic connection $\nabla$ satisfying

$$ d_\nabla I = 0, \tag{1.2} $$

where $I$ is the complex structure on $M$.

First we examine the consequences of the connection on the underlying real symplectic structure on $M$. The connection $\nabla$ determines an extension of the de Rham complex

$$ 0 \longrightarrow \Omega^0(TM) \xrightarrow{\ d_\nabla = \nabla\ } \Omega^1(TM) \xrightarrow{\ d_\nabla\ } \Omega^2(TM) \xrightarrow{\ d_\nabla\ } \cdots . \tag{1.3} $$

The flatness is the condition $d_\nabla^2 = 0$. Note that the Poincaré lemma holds for (1.3): a closed $TM$-valued form is locally exact. The torsionfree condition may be expressed by

$$ d_\nabla(\mathrm{id}) = 0, \tag{1.4} $$

where $\mathrm{id} \in \Omega^1(TM)$ is the identity endomorphism of $TM$. Now if $\{\xi_\alpha\}$ is a flat local framing of $M$ with dual coframing $\{\theta^\alpha\}$, then (1.4) implies $d\theta^\alpha = 0$, whence $\theta^\alpha = dt^\alpha$ for some local coordinate functions $t^\alpha$.³ Since $\nabla\omega = 0$ we can choose these coordinates to be Darboux; that is, the coordinate functions are $x^i, y_j$ ($i, j = 1, \dots, n = \dim_{\mathbb C} M$) with

$$ \omega = dx^i \wedge dy_i . \tag{1.5} $$

Summarizing, a flat torsionfree symplectic connection $\nabla$ is equivalent to a flat symplectic structure on $M$. This is a covering by flat Darboux coordinate systems $\{x^i, y_j\}$ whose transition functions are of the form

$$ \begin{pmatrix} x \\ y \end{pmatrix} = P \begin{pmatrix} \tilde x \\ \tilde y \end{pmatrix} + \begin{pmatrix} a \\ b \end{pmatrix}, \qquad P \in Sp(2n; \mathbb R), \quad a, b \in \mathbb R^n . \tag{1.6} $$

(The coordinates are “flat” since $\nabla dx^i = \nabla dy_j = 0$.) Equation (1.5) is valid in any flat Darboux coordinate system.

3 For simplicity we always choose our coordinate systems to be defined on connected open sets, and we allow the domains of the coordinate systems to shrink when necessary.
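The requirement that the linear part of (1.6) be symplectic can be checked in one line. The sketch below is only an illustration: the combined coordinate vector $u = (x, y)$, the constant vector $c = (a, b)$ and the matrix $J$ are notation assumed here, not taken from the text.

```latex
% Assumed notation for this sketch:
%   u = (x^1,...,x^n, y_1,...,y_n),  c = (a, b),
%   J = [[0, 1_n], [-1_n, 0]]  (the standard symplectic matrix).
\begin{align*}
\omega &= dx^i \wedge dy_i
        = \tfrac12\, J_{\alpha\beta}\, du^\alpha \wedge du^\beta ,\\
u &= P\tilde u + c \;\Longrightarrow\; du = P\, d\tilde u ,\\
\omega &= \tfrac12\, (P^{\mathsf T} J P)_{\alpha\beta}\,
          d\tilde u^\alpha \wedge d\tilde u^\beta .
\end{align*}
% So \omega = d\tilde x^i \wedge d\tilde y_i in the new chart exactly when
% P^T J P = J, i.e. P \in Sp(2n; R); the translation c plays no role.
```

This is why (1.5) holds in every flat Darboux chart: the constant translation drops out under $d$, and the symplectic condition on $P$ is precisely the invariance of the constant coefficient matrix of $\omega$.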
D. S. Freed
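The compatibility with the complex structure is expressed⁴ by (1.2), or equivalently by

$$ d_\nabla \pi^{(1,0)} = 0, \tag{1.7} $$

where $\pi^{(1,0)} \in \Omega^{1,0}(T_{\mathbb C} M)$ is projection onto the (1, 0) part of the complexified tangent bundle. The Poincaré lemma ensures that locally we can find a complex vector field $\zeta$ with

$$ \nabla \zeta = \pi^{(1,0)} . \tag{1.8} $$

Note that $\zeta$ is unique up to a flat complex vector field. Also, $\zeta$ is not necessarily holomorphic. Let $\{x^i, y_j\}$ be a flat Darboux coordinate system and write

$$ \zeta = \frac12 \left( z^i \frac{\partial}{\partial x^i} - w_j \frac{\partial}{\partial y_j} \right) \tag{1.9} $$

for some complex functions $z^i, w_j$. (The choice of sign and the factor ‘1/2’ yield standard formulas for $M = \mathbb C^n$.) Since $\pi^{(1,0)}$ has type (1, 0), Eq. (1.9) implies that $z^i, w_j$ are holomorphic functions and

$$ \pi^{(1,0)} = \frac12 \left( dz^i \otimes \frac{\partial}{\partial x^i} - dw_j \otimes \frac{\partial}{\partial y_j} \right) . \tag{1.10} $$

It follows that

$$ \operatorname{Re}(dz^i) = dx^i, \qquad \operatorname{Re}(dw_j) = -dy_j . \tag{1.11} $$

In particular, $\{z^i\}$ is a local holomorphic coordinate system on $M$.⁵ We easily compute

$$ \frac{\partial}{\partial z^i} = \frac12 \left( \frac{\partial}{\partial x^i} - \tau_{ij} \frac{\partial}{\partial y_j} \right), \tag{1.12} $$

where

$$ \tau_{ij} = \frac{\partial w_j}{\partial z^i} . \tag{1.13} $$

Now the fact that $\omega$ has type (1, 1) implies that $\tau_{ij} = \tau_{ji}$, and so there is a (local) holomorphic function $F$, determined up to a constant, so that

$$ w_j = \frac{\partial F}{\partial z^j}, \qquad \tau_{ij} = \frac{\partial^2 F}{\partial z^i \partial z^j} . \tag{1.14} $$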
$F$ is called the holomorphic prepotential. It determines a Kähler potential

$$ K = \frac12 \operatorname{Im}\left( \frac{\partial F}{\partial z^i}\, \bar z^i \right) = \frac12 \operatorname{Im}(w_i \bar z^i), \tag{1.15} $$

and in terms of this data the Kähler form is

$$ \omega = \sqrt{-1}\, \partial\bar\partial K = \frac{\sqrt{-1}}{2} \operatorname{Im}\left( \frac{\partial^2 F}{\partial z^i \partial z^j} \right) dz^i \wedge d\bar z^j = \frac{\sqrt{-1}}{2} \operatorname{Im}(\tau_{ij})\, dz^i \wedge d\bar z^j . \tag{1.16} $$

Formulas (1.14)–(1.16) are standard in the literature on special Kähler geometry; they show that our global Definition 1.1 reproduces the usual local characterization. We term $\{z^i\}$ a special coordinate system. We characterize special coordinate systems below in Definition 1.37.

Remark 1.17. Condition (1.2) does not mean that the complex structure $I$ is flat. Indeed, if $\nabla I = 0$, then $M$ is a flat Kähler manifold, locally isometric to $\mathbb C^n$. Such a manifold is special Kähler, but of a very special type. Note that the existence of a flat symplectic structure has nontrivial global topological consequences but gives no local restriction. Equation (1.2), on the other hand, is a stringent local condition.

Remark 1.18. Based on an earlier version of this paper, Zhiqin Lu [L] proved that there are no nonflat complete special Kähler manifolds.

Remark 1.19. The special Kähler condition (1.7) automatically implies that $\nabla$ is torsionfree, since (1.4) is twice the real part of (1.7).

Remark 1.20. Locally, we may specify a Kähler geometry by giving a holomorphic function $F(z^1, \dots, z^n)$ such that

$$ \operatorname{Im} \frac{\partial^2 F}{\partial z^i \partial z^j} > 0 $$

is positive definite. The function

$$ F(z^1, \dots, z^n) = \frac{\sqrt{-1}}{2} \left[ (z^1)^2 + \cdots + (z^n)^2 \right] $$

leads to the flat metric on $\mathbb C^n$. A nontrivial example in one dimension is provided by the holomorphic function

$$ F(\tau) = \frac{\tau^3}{6}, $$

defined on the upper half plane $H = \{\tau : \operatorname{Im}\tau > 0\}$. The corresponding Kähler form

$$ \omega = \frac{\sqrt{-1}}{2} \operatorname{Im}(\tau)\, d\tau \wedge d\bar\tau $$

has Gauss curvature $1/\bigl(2 (\operatorname{Im}\tau)^3\bigr)$. Note that the coordinate conjugate to $\tau$ is $w = \partial F/\partial\tau = \tau^2/2$. An adapted⁶ flat Darboux coordinate system $\{x, y\}$ is $x = \operatorname{Re}\tau$, $y = -\operatorname{Re}\tau^2/2$. In these coordinates the Riemannian metric is

$$ g = \frac{2(x^2 + y)\, dx^2 + 2x\, dx\, dy + dy^2}{\sqrt{x^2 + 2y}} . $$

It is the Hessian of the function $\phi = \frac13 (x^2 + 2y)^{3/2}$; see Proposition 1.24 below. This metric is incomplete; see Remark 1.18.

4 We give a characterization in terms of coordinates in Proposition 1.25 below.
5 Of course, so is $\{w_j\}$. We call $\{z^i\}$ and $\{w_j\}$ conjugate coordinate systems (Definition 1.37).
6 See Definition 1.37.
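The one-dimensional example of Remark 1.20 can be verified by hand. The following sketch spells out the steps; the abbreviation $s = \operatorname{Im}\tau$ and the conformal exponent $u$ are notation introduced only for this check, and the Gauss curvature is computed with the standard formula for a conformally flat metric.

```latex
% Assumed notation for this sketch: \tau = x + i s with s = Im(\tau) > 0,
% and g = e^{2u}(dx^2 + ds^2) with u = (1/2) log s.
\begin{align*}
w &= \tfrac12 \tau^2, \qquad
y = -\operatorname{Re} w = \tfrac12 (s^2 - x^2)
  \;\Longrightarrow\; s = \sqrt{x^2 + 2y}, \qquad
ds = \frac{x\,dx + dy}{s},\\
g &= s\,(dx^2 + ds^2)
   = \frac{(s^2 + x^2)\,dx^2 + 2x\,dx\,dy + dy^2}{s}
   = \frac{2(x^2 + y)\,dx^2 + 2x\,dx\,dy + dy^2}{\sqrt{x^2 + 2y}},\\
\phi &= \tfrac13 s^3: \qquad
\phi_{xx} = \frac{s^2 + x^2}{s}, \quad
\phi_{xy} = \frac{x}{s}, \quad
\phi_{yy} = \frac{1}{s},\\
\text{Gauss curvature} &= -e^{-2u}\,(u_{xx} + u_{ss})
   = -\frac{1}{s}\cdot\Bigl(-\frac{1}{2s^2}\Bigr) = \frac{1}{2 s^3},\\
K &= \tfrac12 \operatorname{Im}(w \bar\tau) = \tfrac14 (x^2 + s^2)\, s, \qquad
K - \phi = \tfrac14 x^2 s - \tfrac1{12} s^3 .
\end{align*}
% K - \phi is harmonic in (x, s):  (1/2) s - (1/2) s = 0,
% so K from (1.15) and the Hessian potential \phi give the same Kähler form.
```

The Hessian entries in the third line reproduce the coefficients of $g$ in the second line, which is the content of Proposition 1.24 in this example.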
Remark 1.21. Nowhere do we use the positive definiteness of $\omega$. Hence our discussion applies also to pseudo-Kähler manifolds. (A pseudo-Kähler metric $\omega$ is nondegenerate and $d\omega = 0$, but it is not assumed positive definite.)

We have the following easy result.

Proposition 1.22. (a) Let $(M, \omega, \nabla)$ be a special Kähler manifold. The connection $\nabla$ determines a horizontal distribution $H$ in the real cotangent bundle $T^*M$. Then $H$ is invariant under the complex structure of $T^*M$.
(b) The (0, 1) part of the connection $\nabla$ on the complex tangent bundle $TM$ equals the $\bar\partial$ operator.

Proof. (a) Choose a flat Darboux coordinate system $\{x^i, y_j\}$. Then the local 1-forms $dx^i, dy_j$ define sections of $T^*M \to M$ whose image is an integral manifold of $H$. Since $dx^i$ and $dy_j$ are the real parts of holomorphic differentials (see (1.11)) their graphs are complex submanifolds.
(b) From (1.12) we compute that $\nabla\, \partial/\partial z^j$ is a form of type (1, 0):

$$ \nabla \frac{\partial}{\partial z^j} = -\frac12\, \frac{\partial \tau_{j\ell}}{\partial z^k}\, dz^k \otimes \frac{\partial}{\partial y_\ell} . \tag{1.23} $$

Since $\{\partial/\partial z^i\}$ is a local basis of holomorphic sections, the desired assertion follows. □

The Riemannian metric has a very special form in flat real coordinates: it is the Hessian of a function. This observation is due to Nigel Hitchin.

Proposition 1.24. Let $(M, \omega, \nabla)$ be a special Kähler manifold. Suppose $\{u^\alpha\}$ is a $\nabla$-flat coordinate system. (For example, it may be a flat Darboux coordinate system.) Then the Riemannian metric $g$ is

$$ g = \frac{\partial^2 \phi}{\partial u^\alpha \partial u^\beta}\, du^\alpha \otimes du^\beta $$

for some real function $\phi$. In fact, $\phi$ is a Kähler potential.

Proof. In these coordinates the symplectic form $\omega = \frac12 \omega_{\alpha\beta}\, du^\alpha \wedge du^\beta$ has constant coefficients. Now $g_{\alpha\beta} = \omega_{\alpha\gamma} I^\gamma_\beta$ and the special Kähler condition (1.2) implies

$$ \frac{\partial g_{\alpha\beta}}{\partial u^\gamma} = \frac{\partial g_{\alpha\gamma}}{\partial u^\beta} . $$

Hence $g_{\alpha\beta} = \partial\phi_\alpha/\partial u^\beta$ for some function $\phi_\alpha$. The symmetry of $g_{\alpha\beta}$ now implies that $\phi_\alpha = \partial\phi/\partial u^\alpha$ for some $\phi$, as desired. To see that $\phi$ is a Kähler potential, we compute

$$ \begin{aligned} \sqrt{-1}\, \partial\bar\partial\phi &= -\frac12\, d I\, d\phi \\ &= -\frac12\, d\left( \frac{\partial\phi}{\partial u^\alpha}\, I^\alpha_\gamma\, du^\gamma \right) \\ &= -\frac12 \left( \frac{\partial^2\phi}{\partial u^\alpha \partial u^\beta}\, I^\alpha_\gamma + \frac{\partial\phi}{\partial u^\alpha}\, \frac{\partial I^\alpha_\gamma}{\partial u^\beta} \right) du^\beta \wedge du^\gamma \\ &= -\frac12\, g_{\alpha\beta}\, I^\alpha_\gamma\, du^\beta \wedge du^\gamma \\ &= \omega . \end{aligned} $$
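We use the special Kähler condition to pass from the third line to the fourth. □

We next express the special Kähler condition (1.2) in terms of coordinates.

Proposition 1.25. Let $(M, \omega)$ be a Kähler manifold of dimension $n$ and $\nabla$ a flat torsionfree symplectic connection. Suppose $\{z^i\}$ is any local holomorphic coordinate system on $M$ and $\{x^i, y_j\}$ a flat Darboux coordinate system. Write

$$ \frac{\partial}{\partial z^i} = \frac12 \left( \sigma_i^j \frac{\partial}{\partial x^j} - \tau_{ij} \frac{\partial}{\partial y_j} \right) $$

for functions $\sigma_i^j, \tau_{ij}$. Then $d_\nabla I = 0$ if and only if $\sigma_i^j, \tau_{ij}$ are holomorphic functions of $z^1, \dots, z^n$ and

$$ \frac{\partial \sigma_i^j}{\partial z^k} = \frac{\partial \sigma_k^j}{\partial z^i}, \qquad \frac{\partial \tau_{ij}}{\partial z^k} = \frac{\partial \tau_{kj}}{\partial z^i} . $$

The proof is straightforward: Compute $d_\nabla(\pi^{(1,0)}) = d_\nabla\bigl( dz^i \otimes \frac{\partial}{\partial z^i} \bigr)$. Notice that $\tau_{ij}$ is not necessarily symmetric, but rather $\tau_{ik}\sigma_j^k$ is symmetric in $i, j$.

There is a holomorphic cubic form $\Xi$ on a special Kähler manifold which encodes the extent to which $\nabla$ fails to preserve the complex structure. Namely, set

$$ \Xi = -\omega\bigl( \pi^{(1,0)}, \nabla\pi^{(1,0)} \bigr) \in H^0(M, \operatorname{Sym}^3 T^*M) . \tag{1.26} $$

That $\Xi$ is symmetric follows from the fact that $\omega$ is skew-symmetric, $\nabla\omega = 0$, and the special Kähler condition (1.7) (which says that $\nabla\pi^{(1,0)}$ is symmetric). The holomorphicity follows from the computation (1.28) below. Note the alternative local expression

$$ \Xi = -\omega(\nabla\zeta, \nabla^2\zeta), \tag{1.27} $$

where $\zeta$ is a local complex vector field satisfying (1.8).

We compute (1.26) in special coordinates $\{z^i\}$ introduced above. From (1.23) and the fact that $\omega$ has type (1, 1), we have

$$ \begin{aligned} \Xi &= -\omega\left( dz^i \otimes \frac{\partial}{\partial z^i},\ \nabla\Bigl( dz^j \otimes \frac{\partial}{\partial z^j} \Bigr) \right) \\ &= -dz^i \otimes dz^j\ \omega\left( \frac12 \Bigl( \frac{\partial}{\partial x^i} - \tau_{im} \frac{\partial}{\partial y_m} \Bigr),\ -\frac12\, \frac{\partial\tau_{j\ell}}{\partial z^k}\, dz^k \otimes \frac{\partial}{\partial y_\ell} \right) \\ &= \frac14\, \frac{\partial\tau_{ji}}{\partial z^k}\, dz^i \otimes dz^j \otimes dz^k \\ &= \frac14\, \frac{\partial^3 F}{\partial z^i \partial z^j \partial z^k}\, dz^i \otimes dz^j \otimes dz^k . \end{aligned} \tag{1.28} $$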
Here we use (1.14) as well. The cubic form $\Xi$ can also be used⁷ to relate the special Kähler connection $\nabla$ to the Levi–Civita connection $D$. Write

$$ \nabla = D + A_{\mathbb R}, \tag{1.29} $$

where $A_{\mathbb R} \in \Omega^1(M, \operatorname{End}_{\mathbb R} TM)$. Then since $D\pi^{(1,0)} = 0$, we have

$$ \Xi = -\omega\bigl( \pi^{(1,0)}, [A_{\mathbb R}, \pi^{(1,0)}] \bigr) . \tag{1.30} $$

Moreover, there is a complex tensor

$$ A \in \Omega^{1,0}\bigl( \operatorname{Hom}(TM, \overline{TM}) \bigr) \tag{1.31} $$

with

$$ A_{\mathbb R} = A + \bar A . $$

To see this, note from (1.29) and Proposition 1.22(b) that $A_\xi$ vanishes on vectors of type (0, 1) if $\xi$ is of type (1, 0). Then, since $A_\xi$ is infinitesimal symplectic, for $\zeta$ of type (1, 0) and $\bar\eta$ of type (0, 1), we have

$$ \omega(A_\xi \zeta, \bar\eta) = -\omega(\zeta, A_\xi \bar\eta) = 0 . $$

Since $\omega$ has type (1, 1), this implies that $A_\xi\zeta$ is of type (0, 1). Therefore, $A$ is as claimed in (1.31). Furthermore, $A$ and $\Xi$ determine each other. In particular, we recover the special Kähler structure from $\Xi$.

Conversely, we can start with a smooth cubic form $\Xi \in C^\infty(M, \operatorname{Sym}^3 T^*M)$ and ask for the conditions on $\Xi$ which ensure that $\nabla$ as defined by (1.29) and (1.30) is a special Kähler structure. Note ‘$T^*M$’ denotes the complex cotangent bundle; we assume $\Xi$ to be complex multilinear. The symmetry of $\Xi$ implies that $\nabla$ is symplectic, torsionfree, and satisfies (1.2). Setting the curvature of $\nabla$ to zero from (1.29) yields the equation

$$ 0 = R + d_D A + A \wedge \bar A + \bar A \wedge A, $$

where $R$ is the curvature of the Kähler metric on $M$. Here ‘$d_D$’ is the alternation of the Levi–Civita covariant derivative. Notice that as endomorphisms of the tangent bundle $R + A \wedge \bar A + \bar A \wedge A$ is complex linear, whereas $d_D A$ is complex antilinear (1.31); whence these separately vanish. The (1,1) piece of $d_D A$ is $\bar\partial A$, from which it follows that $\Xi$ is holomorphic. The remaining equations are

$$ \partial_D A = 0, \qquad R = -(A \wedge \bar A + \bar A \wedge A) . \tag{1.32} $$

In any local coordinate system $\{z^i\}$ we write

$$ \begin{aligned} \omega &= \sqrt{-1}\, h_{i\bar j}\, dz^i \wedge d\bar z^j, \\ R^j{}_i &= R^j{}_{ik\bar\ell}\, dz^k \wedge d\bar z^\ell, \\ \Xi &= \Xi_{ijk}\, dz^i \otimes dz^j \otimes dz^k, \\ (A_i)^{\bar k}_j &= \sqrt{-1}\, \Xi_{ij\ell}\, h^{\ell\bar k} . \end{aligned} $$

As usual, set $R_{i\bar j k\bar\ell} = h_{m\bar j} R^m{}_{ik\bar\ell}$. Then (1.32) is

$$ D_i A_j = D_j A_i, \qquad R_{i\bar j k\bar\ell} = -h^{\alpha\bar\beta}\, \Xi_{ik\alpha}\, \overline{\Xi_{j\ell\beta}} . \tag{1.33} $$

We summarize this discussion as follows.

7 I learned this from the account in [BCOV], though it also appears in many other works.
Proposition 1.34. (a) If $(M, \omega, \nabla)$ is a special Kähler manifold, then there is an associated holomorphic cubic form $\Xi \in H^0(M, \operatorname{Sym}^3 T^*M)$, defined in (1.26), which satisfies (1.32).
(b) If $(M, \omega)$ is a Kähler manifold and $\Xi \in H^0(M, \operatorname{Sym}^3 T^*M)$ a holomorphic cubic form which satisfies (1.32), then $\nabla = D + A$ is a special Kähler structure, where $D$ is the Levi–Civita connection and $A$ is defined from $\Xi$ by (1.30).

Remark 1.35. Lu [L] noticed that as a consequence of (1.33) any special Kähler manifold $M$ has nonnegative scalar curvature $\rho$:

$$ \begin{aligned} \rho &= -4\, h^{i\bar j} h^{k\bar\ell} R_{i\bar j k\bar\ell} \\ &= 4\, h^{i\bar j} h^{k\bar\ell} h^{\alpha\bar\beta}\, \Xi_{ik\alpha}\, \overline{\Xi_{j\ell\beta}} \\ &= 4 |\Xi|^2 . \end{aligned} \tag{1.36} $$
Then he computes $\Delta\rho$ and uses a maximum principle to argue that if $M$ is complete, then $\rho = 0$, from which $\Xi$ and then $R$ vanish.

Next, we discuss special coordinates.

Definition 1.37. Let $(M, \omega, \nabla)$ be a special Kähler manifold.
(a) A holomorphic coordinate system $\{z^i\}$ is special if $\nabla \operatorname{Re}(dz^i) = 0$.
(b) We say that special coordinates $\{z^i\}$ and flat Darboux coordinates $\{x^i, y_j\}$ are adapted if $\operatorname{Re}(z^i) = x^i$.
(c) Special coordinate systems $\{z^i\}, \{w_j\}$ are said to be conjugate if there exists a flat Darboux coordinate system $\{x^i, y_j\}$ such that $\operatorname{Re}(z^i) = x^i$ and $\operatorname{Re}(w_j) = -y_j$.

Given adapted special coordinates $\{z^i\}$ and flat Darboux coordinates $\{x^i, y_j\}$, conjugate special coordinates $\{w_j\}$ are determined up to translation by a purely imaginary constant. For adapted coordinate systems we have Eqs. (1.9)–(1.16), but note that $\{w_j\}$, $\tau_{ij}$, $F$, and $K$ are not completely determined by $\{z^i\}$ and $\{x^i, y_j\}$. The following proposition clarifies the choices involved.

Proposition 1.38. Let $(M, \omega, \nabla)$ be a special Kähler manifold.
(a) Given a flat Darboux coordinate system $\{x^i, y_j\}$ there exists an adapted special coordinate system $\{z^i\}$. Any two choices $\{z^i\}, \{\tilde z^i\}$ satisfy $z^i = \tilde z^i + c^i$ for some purely imaginary constants $c^i$.
(b) Given a special coordinate system $\{z^i\}$ there exists an adapted flat Darboux coordinate system $\{x^i, y_j\}$. Any two choices differ by a change of variables

$$ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ A & 1 \end{pmatrix} \begin{pmatrix} \tilde x \\ \tilde y \end{pmatrix} + \begin{pmatrix} 0 \\ b \end{pmatrix}, $$

where $A$ is a (real) symmetric matrix and $b \in \mathbb R^n$.
(c) Given a special coordinate system $\{z^i\}$ the holomorphic prepotential $F$ is determined up to a change

$$ F \longrightarrow F + \frac12 A_{ij} z^i z^j + B_i z^i + C, $$

where $A = (A_{ij})$ is a real symmetric matrix, and $B_i, C \in \mathbb C$. So the conjugate coordinate system $\{w_j\}$ is determined up to a change

$$ w_j \longrightarrow w_j + A_{jk} z^k + B_j $$

and the Kähler potential (1.15) is determined up to a change

$$ K \longrightarrow K + \operatorname{Im}(B_i \bar z^i) . $$

(d) If $\{z^i\}, \{w_j\}$ are conjugate special coordinate systems, then any other pair $\{\tilde z^i\}, \{\tilde w_j\}$ of conjugate special coordinate systems are related by

$$ \begin{pmatrix} z \\ w \end{pmatrix} = P \begin{pmatrix} \tilde z \\ \tilde w \end{pmatrix} + \begin{pmatrix} a \\ b \end{pmatrix}, \qquad P \in Sp(2n; \mathbb R), \quad a, b \in \mathbb C^n . \tag{1.39} $$

The corresponding matrices $\tau, \tilde\tau$ are related by

$$ \tau = (D\tilde\tau + C)(B\tilde\tau + A)^{-1}, \tag{1.40} $$

where

$$ P = \begin{pmatrix} A & B \\ C & D \end{pmatrix} . $$
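The transformation law (1.40) follows from (1.39) by differentiating. The sketch below treats $dz$, $dw$, $d\tilde z$, $d\tilde w$ as column vectors of holomorphic 1-forms and uses the symmetry of $\tau$ and $\tilde\tau$ from (1.14); the column-vector convention is an assumption of this illustration.

```latex
% A, B, C, D are the n x n blocks of P in (1.40); a, b are the constants in (1.39).
\begin{gather*}
z = A\tilde z + B\tilde w + a, \qquad w = C\tilde z + D\tilde w + b,\\
d\tilde w = \tilde\tau\, d\tilde z, \qquad dw = \tau\, dz
   \qquad (\tau,\ \tilde\tau \text{ symmetric, by (1.13) and (1.14)}),\\
dz = (A + B\tilde\tau)\, d\tilde z, \qquad dw = (C + D\tilde\tau)\, d\tilde z,\\
\tau\,(A + B\tilde\tau) = C + D\tilde\tau
   \;\Longrightarrow\;
\tau = (D\tilde\tau + C)(B\tilde\tau + A)^{-1}.
\end{gather*}
% B\tilde\tau + A is invertible because both {z^i} and {\tilde z^i}
% are holomorphic coordinate systems.
```

For $n = 1$ this is the familiar fractional linear action on the upper half plane.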
2. The Associated Hyperkähler Manifold

In this section we prove the following theorem, which (in local form) is due to Cecotti, Ferrara, and Girardello [CFG].⁸

Theorem 2.1. The cotangent bundle $T^*M$ of a special Kähler manifold $(M, \omega, \nabla)$ carries a canonical hyperkähler structure.

Recall that a Riemannian manifold $(Y, g)$ is hyperkähler if it carries a triple of integrable almost complex structures $I, J, K$ which satisfy the quaternion algebra and such that the associated 2-forms

$$ \omega_T(\xi_1, \xi_2) = g(\xi_1, T\xi_2), \qquad T = I, J, K, \tag{2.2} $$

are closed. A useful lemma of Hitchin [H, p. 64] asserts that if $\omega_I, \omega_J, \omega_K$ are closed, then $I, J, K$ are integrable. If we consider $(Y, \omega_I)$ as a Kähler manifold with complex structure $I$, then

$$ \eta = \omega_J + i\omega_K \tag{2.3} $$

is a holomorphic symplectic form.

8 Equation (B.7) in [CFG] corresponds to our description of the metric in (2.4), where their $Z^I$ are special coordinates on $M$ and $\{Z^I, W_J\}$ the induced coordinate system on $T^*M$. Then (B.8b) describes the flat connection $\nabla$.

Proof. Consider first a hermitian vector space $V$ with complex structure $I$. The hermitian metric $\langle\cdot,\cdot\rangle$ determines a metric and symplectic form on the underlying real vector space $V_{\mathbb R}$:

$$ \langle \xi_1, \xi_2 \rangle = g(\xi_1, \xi_2) + i\omega(\xi_1, \xi_2), \qquad \xi_1, \xi_2 \in V_{\mathbb R} . $$

Then $W = V \oplus V^* \cong V \oplus \bar V$ has a constant hyperkähler structure. The complex structure $J$ is the antilinear map

$$ J : V \oplus \bar V \longrightarrow V \oplus \bar V, \qquad v_1 \oplus \bar v_2 \longmapsto -v_2 \oplus \bar v_1 . $$

Now define $K = IJ$. Then $I, J, K$ satisfy the quaternion algebra. The metric on $W_{\mathbb R}$ is

$$ g_W(\xi_1 \oplus \alpha_1, \xi_2 \oplus \alpha_2) = g(\xi_1, \xi_2) + g^{-1}(\alpha_1, \alpha_2), \qquad \xi_1, \xi_2 \in V_{\mathbb R}, \quad \alpha_1, \alpha_2 \in V_{\mathbb R}^* . \tag{2.4} $$

The forms $\omega_I, \omega_J, \omega_K$ are now determined by (2.2). It is straightforward to check that the holomorphic symplectic form $\eta$ defined in (2.3) is the canonical form on $W = V \oplus V^* \cong T^*V$:

$$ \eta(v_1 \oplus \ell_1, v_2 \oplus \ell_2) = \ell_1(v_2) - \ell_2(v_1), \qquad v_1, v_2 \in V, \quad \ell_1, \ell_2 \in V^* . $$

Now let $(M, \omega, \nabla)$ be special Kähler and let $Y = T^*M$. Consider the distribution of horizontal spaces on $Y$ given by the connection $\nabla$. Here ‘horizontal’ means relative to the projection map $\pi : Y \to M$. The horizontal space $H_y$ at $y \in Y$ is a complex subspace of $T_yY$ by Proposition 1.22. The projection $\pi$ identifies $H_y \cong T_mM$, where $m = \pi(y)$, and so the splitting into horizontal and vertical is a splitting

$$ T_yY \cong T_mM \oplus T_m^*M . \tag{2.5} $$
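The quaternion relations for the linear model $W = V \oplus \bar V$ at the start of the proof can be checked directly. This sketch assumes the convention that $\bar v$ denotes the vector $v$ regarded as an element of the conjugate space $\bar V$, so that scalar multiplication there is $c \cdot \bar v = \overline{\bar c\, v}$; that convention is an assumption made for the illustration.

```latex
% Assumed convention: \bar v is v viewed in \bar V, with c . \bar v = \overline{\bar c v}.
% Hence I(v_1 + \bar v_2) = i v_1 + \overline{-i v_2}, and J is as in the proof.
\begin{align*}
J^2 (v_1 \oplus \bar v_2) &= J(-v_2 \oplus \bar v_1)
   = -v_1 \oplus \overline{-v_2} = -(v_1 \oplus \bar v_2),\\
IJ (v_1 \oplus \bar v_2) &= I(-v_2 \oplus \bar v_1)
   = -i v_2 \oplus \overline{-i v_1},\\
JI (v_1 \oplus \bar v_2) &= J(i v_1 \oplus \overline{-i v_2})
   = i v_2 \oplus \overline{i v_1} = -IJ (v_1 \oplus \bar v_2),\\
K^2 &= IJIJ = -I^2 J^2 = -1 .
\end{align*}
```

So $I^2 = J^2 = K^2 = -1$ and $IJ = -JI = K$, which is the quaternion algebra used in the remainder of the argument.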
The linear algebra at the beginning of the proof gives global endomorphisms $I, J, K$ which satisfy the quaternion algebra. According to Hitchin’s lemma, to check that this determines a hyperkähler structure we must only verify that $\omega_I, \omega_J, \omega_K$ are closed. First, since the canonical holomorphic symplectic form $\eta$ on $Y = T^*M$ is closed, Eq. (2.3) implies that $\omega_J$ and $\omega_K$ are also closed. To see that $\omega_I$ is closed we choose a flat Darboux coordinate system $\{x^i, y_j\}$ on an open set $U \subset M$. This induces a local coordinate system $\{x^i, y_j; q_i, p^j\}$ on $\pi^{-1}U \subset Y$. Since the splitting (2.5) is induced by $\nabla$, and $dx^i, dy_j$ are $\nabla$-flat by definition, it follows that

$$ \omega_I = dx^i \wedge dy_i + dq_i \wedge dp^i . $$

This form is closed. □

3. Integrable Systems

In the mathematical description of a (finite dimensional) classical mechanical system one meets a symplectic manifold $X$ and a Hamiltonian function. It is an integrable system if there is a maximal set of Poisson commuting conserved momenta which includes the Hamiltonian. Under suitable hypotheses this leads to a foliation of $X$ by lagrangian tori [GS, Sect. 44]. The complex analogue leads to the following definition [DM1], which we explain in the succeeding paragraphs.
Definition 3.1. An algebraic integrable system is a holomorphic map $\pi : X \to M$ where
(a) $X$ is a complex symplectic manifold with holomorphic symplectic form $\eta \in \Omega^{2,0}(X)$;
(b) The fibers of $\pi$ are compact lagrangian submanifolds, hence affine tori;
(c) There is a family of smoothly varying cohomology classes $[\rho_m] \in H^{1,1}(X_m) \cap H^2(X_m; \mathbb Z)$, $m \in M$, such that $[\rho_m]$ is a positive polarization of the fiber $X_m$. Hence $X_m$ is an abelian torsor.

The hypothesis that the fibers are compact lagrangian leads to the conclusion that they are affine tori. The fact that they are abelian torsors is an extra hypothesis. We assume that $X$ and $M$ are smooth.⁹

We now explain this definition and some consequences. Recall that a single¹⁰ abelian variety is a quotient $A = V/\Lambda$ of a complex vector space $V$ by a full real lattice $\Lambda$ such that $H^{1,1}(A) \cap H^2(A; \mathbb Z) \neq 0$ and there is a positive class $[\rho]$ in this intersection. Such a class is called a polarization and is represented by a unique invariant positive closed (1, 1)-form $\rho$ on $A$. The polarization is principal if $\int_A \rho^n/n! = 1$. Note that $\rho$ is a real symplectic form on $A$, and since it is invariant it is a symplectic form on $V_{\mathbb R}$ as well. Also, since $\rho$ is an integral class, it induces a symplectic form on $\Lambda \cong H_1(A)$. Let $\{\gamma^i, \delta_j\} \subset \Lambda$ be a symplectic basis. Then there is a unique basis $\{\omega_i\}$ of holomorphic differentials on $A$ with

$$ \int_{\gamma^i} \omega_j = \delta^i_j, \tag{3.2} $$

where $\delta^i_j$ is the Kronecker symbol. In fact, we can identify $\{\omega_i\}$ as the complex basis of $V^*$ dual to $\{\gamma^i\}$. Now

$$ \int_{\delta_j} \omega_i = \tau_{ij} \tag{3.3} $$

defines the period matrix $\tau$ of $A$. The Riemann bilinear relations state that the matrix $\tau = (\tau_{ij})$ belongs to the Siegel upper half space

$$ \mathbb H_n = \{\tau \text{ an } n \times n \text{ complex matrix} : \tau \text{ is symmetric and } \operatorname{Im}\tau \text{ is positive definite}\} . $$

The group $Sp(2n; \mathbb R)$ acts transitively on $\mathbb H_n$. A change of symplectic basis $\{\gamma^i, \delta_j\}$ transforms $\tau$ by an element of a discrete subgroup $\Gamma \subset Sp(2n; \mathbb R)$ which depends on the polarization. (For a principal polarization $\Gamma = Sp(2n; \mathbb Z)$.)

An abelian torsor $X$ is a principal homogeneous space for an abelian variety $A = V/\Lambda$ with a polarization $[\rho]$. Here $V$ is the space of invariant vector fields on $X$ and $\Lambda \subset V$ the lattice of such vector fields which exponentiate to the identity map. We can identify $A$ as the Albanese variety of $X$. Any point $x \in X$ determines an isomorphism $A \to X$, and the pullback of $[\rho]$ is a polarization $[\hat\rho]$ of $A$ which is independent of the choice of $x$. The period matrix of $X$ is equal to the period matrix of $A$.

An algebraic integrable system $\pi : X \to M$ leads to a parametrized version of the preceding discussion. First, the holomorphic symplectic form $\eta$ gives an isomorphism

$$ i : T^*M \xrightarrow{\ \cong\ } V, $$

where $V \to M$ is the bundle of invariant vector fields along the fibers of $\pi$. For a complex function $f : M \to \mathbb C$ and complex vector field $\xi$ on $X$ we have

$$ \eta\bigl( i(df), \xi \bigr) = \pi^* df(\xi) . $$

This leads to a fiberwise action of $T^*M \cong V$ by exponentiation. Let $\Lambda$ be the kernel of the action. A basic fact is that $\Lambda$ is a complex lagrangian submanifold of $T^*M$, where $T^*M$ has the canonical holomorphic symplectic structure. (See [GS, Sect. 44] for proofs of the assertions made here.) Furthermore, $\Lambda$ intersects each fiber of $T^*M$ in a full lattice. The quotient $A = T^*M/\Lambda$ is a family of abelian varieties parametrized by $M$; it is the bundle of Albanese varieties of $X \to M$. Since $\Lambda$ is complex lagrangian, the canonical holomorphic symplectic form on $T^*M$ passes to a holomorphic symplectic form $\hat\eta$ on the quotient $A$. Now a local lagrangian section of $\pi : X|_U \to U$ over an open set $U \subset M$ induces a local isomorphism $X|_U \cong A|_U$, and this isomorphism maps $\hat\eta$ to $\eta$. Such sections may not exist globally. Since any two choices of local section lead to isomorphisms which differ by a translation on each fiber, the family of polarizations $[\rho_m]$ on $X \to M$ define a family of polarizations $[\hat\rho_m]$ on $A \to M$.

To summarize: Every algebraic integrable system $X \to M$ has a canonically associated algebraic integrable system $A \to M$ whose fibers are abelian varieties. (An analogous assertion holds for real integrable systems.) Either system determines a well-defined period map

$$ \tau : M \longrightarrow \mathcal A_n = \mathbb H_n/\Gamma $$

into the moduli space $\mathcal A_n$ of suitably polarized abelian varieties.

Now the bundle of lattices $\Lambda$ determines a flat connection $\nabla$ on $T^*M$, hence also on $TM$. Since $\Lambda$ is lagrangian, $\nabla$ is torsionfree. Also, the polarization $[\hat\rho_m]$ on $A_m = T_m^*M/\Lambda_m$ determines a real symplectic form on $T_m^*M$ which restricts to an integral symplectic form on the lattice $\Lambda_m$. The dual 2-form $\omega$ on $M$ is flat, that is $\nabla\omega = 0$, and since $\nabla$ is torsionfree it follows that $\omega$ is closed. Thus $\omega$ is a real symplectic form on $M$.

9 The singularities contain crucial physics, but for the geometry in this section we restrict to smooth points.
10 For convenience we use the same notation for the single abelian varieties in this explanatory paragraph as we do in the rest of the text for families of abelian varieties.
The holonomy group of the flat connection $\nabla$ is contained in the integral symplectic group $Sp(\Lambda_m^*)$ at each $m \in M$, where $\Lambda_m^*$ is the dual lattice to $\Lambda_m$. Furthermore, by the definition of a polarization $\omega$ is a (positive definite) Kähler form on $M$. If $\{\gamma^i, \delta_j\}$ is a local symplectic basis of sections of $\Lambda \subset T^*M$, then we can write $\omega = \gamma^i \wedge \delta_i$. There is also a global formula for $\omega$. First, each polarization $[\hat\rho_m]$ is represented by a unique invariant closed form $\hat\rho_m \in \Omega^{1,1}(A_m)$. The family of forms $\{\hat\rho_m\}$ is flat with respect to $\nabla$. Now the connection $\nabla$ on $T^*M$ induces an integrable distribution of horizontal planes on $A$, and we extend $\{\hat\rho_m\}$ to a form $\hat\rho \in \Omega^{1,1}(A)$ by requiring that $\hat\rho$ vanish on those horizontal planes. Then $d\hat\rho = 0$. The global formula for $\omega$ is expressed in terms of $\hat\rho$ and the holomorphic symplectic form $\hat\eta \in \Omega^{2,0}(A)$:

$$ \omega = \frac14 \int_{A/M} \hat\eta \wedge \bar{\hat\eta} \wedge \frac{\hat\rho^{\,n-1}}{(n-1)!} . $$
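For orientation, the period conventions (3.2) and (3.3) can be made explicit in the simplest case $n = 1$. The elliptic curve below, the real coordinates $z = x + iy$ on $\mathbb C$, and the explicit form $\rho$ are choices made for this illustration; they are not taken from the text.

```latex
% Illustrative case n = 1:  A = C / (Z + \tau Z) with Im(\tau) > 0,
% real coordinates z = x + i y on C (local to this sketch).
\begin{gather*}
\gamma = 1, \qquad \delta = \tau, \qquad \omega_1 = dz,\\
\int_{\gamma} \omega_1 = 1, \qquad \int_{\delta} \omega_1 = \tau,\\
\rho = \frac{i}{2\operatorname{Im}\tau}\, dz \wedge d\bar z
     = \frac{dx \wedge dy}{\operatorname{Im}\tau}, \qquad
\rho(\gamma, \delta) = 1, \qquad \int_A \rho = 1 .
\end{gather*}
% So {\gamma, \delta} is a symplectic basis for \rho, the polarization is
% principal, and the 1 x 1 period matrix is \tau, a point of the upper half plane.
```

In this case the Siegel upper half space is the ordinary upper half plane, and changes of symplectic basis act by $Sp(2; \mathbb Z) = SL(2; \mathbb Z)$.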
The conclusion of this discussion is a result stated by Donagi and Witten [DW].

Theorem 3.4. (a) Let $(X \to M, \eta, [\rho_m])$ be an algebraic integrable system. Then the Kähler form $\omega$ and the connection $\nabla$ constructed above comprise a special Kähler structure on $M$. Furthermore, there is a lattice $\Lambda^* \subset TM$ whose dual $\Lambda \subset T^*M$ is a complex lagrangian submanifold, and the holonomy of $\nabla$ is contained in the integral symplectic group defined by $\Lambda^*$.
(b) Conversely, suppose $(M, \omega, \nabla)$ is a special Kähler manifold. Suppose further that there is a lattice $\Lambda^* \subset TM$, flat with respect to $\nabla$, whose dual $\Lambda \subset T^*M$ is a complex lagrangian submanifold. Then $A = T^*M/\Lambda \to M$ admits a canonical holomorphic symplectic form $\eta$ and a family of polarizations $[\rho_m]$ so that $(A \to M, \eta, [\rho_m])$ is an algebraic integrable system whose fibers are abelian varieties.

Remark 3.5. The lattice $\Lambda$ in (b) may be specified by a covering of distinguished flat Darboux coordinate systems $\{x^i, y_j\}$ whose transition functions satisfy (1.6) with $P \in Sp(2n; \mathbb Z)$. In this case we also restrict the allowable special coordinate systems $\{z^i\}$ by requiring that $\{\operatorname{Re}(z^i)\}$ be part of a distinguished flat Darboux coordinate system.

Proof. For part (a) it remains to verify the special Kähler condition (1.2), or equivalently (1.7). We work locally. Let $\{\gamma^i, \delta_j\}$ be a local symplectic basis of sections of $\Lambda$. Since $\gamma^i, \delta_j$ are closed 1-forms we can find flat Darboux coordinates $\{x^i, y_j\}$ so that $\gamma^i = dx^i$ and $\delta_j = dy_j$. Now $\gamma^i, \delta_j$ also determine families of cycles on $A$ and we can find holomorphic functions $z^i, w_j$ such that

$$ dz^i = \int_{\gamma^i} \hat\eta, \qquad dw_j = -\int_{\delta_j} \hat\eta . $$

Here the integrals are over the families of cycles in the fibration $A \to M$, and Stokes’ theorem shows that the integrals are holomorphic (1, 0)-forms. It is easy to check that $\operatorname{Re}(dz^i) = dx^i$ and $\operatorname{Re}(dw_j) = -dy_j$, so we can arrange that $\operatorname{Re}(z^i) = x^i$ and $\operatorname{Re}(w_j) = -y_j$. Then

$$ \zeta = \frac12 \left( z^i \frac{\partial}{\partial x^i} - w_j \frac{\partial}{\partial y_j} \right) $$

is a local complex vector field which satisfies (1.8). This implies (1.7). Notice that the vector fields $\omega_i = \frac{\partial}{\partial z^i}$ define local holomorphic differentials on the fibers of $A \to M$, and they satisfy (3.2). Thus Eq. (3.3) defines the period matrix $(\tau_{ij})$ relative to $\{\gamma^i, \delta_j\}$. Equations (3.2) and (3.3) are equivalent to Eq. (1.12):

$$ \frac{\partial}{\partial z^i} = \frac12 \left( \frac{\partial}{\partial x^i} - \tau_{ij} \frac{\partial}{\partial y_j} \right) . $$

By now the proof of (b) should be clear. Given $(M, \omega, \nabla, \Lambda)$, the family of polarizations on $A = T^*M/\Lambda \to M$ is represented by the dual of the Kähler form $\omega$. Hence $A \to M$ is a family of abelian varieties. The symplectic form is induced from the canonical symplectic form on $T^*M$. The hypothesis that $\Lambda$ is complex lagrangian makes the quotient $T^*M/\Lambda$ complex symplectic. □

Remark 3.6. An arbitrary family of abelian varieties $A \to M$ does not admit a symplectic form. For that the differential of the period map must come from a cubic form $c \in H^0(M, \operatorname{Sym}^3 T^*M)$. (See [DM1, Sect. 7].) Here we assume a given identification of the bundle $V$ with $T^*M$. (Recall that $V$ is the bundle of constant vector fields along the fibers of $A \to M$.) The cubic condition on the period matrix is essentially the special Kähler condition (1.2), as is clear from Proposition 1.25. Of course, the cubic form is (1.26).
Special Kähler Manifolds
45
Remark 3.7. The preceding discussion applies to the pseudo-Kähler case with one modification: the polarization classes $[\rho_m]$ are no longer positive definite. So $X_m$ is an affine torus with an indefinite polarization. We term this an indefinite algebraic integrable system.

The discussion in Sect. 2 applies directly to the quotient $T^*M/\Lambda$, and so Theorem 2.1 yields the following.

Theorem 3.8. Let $(X \to M, \eta, [\rho_m])$ be an algebraic integrable system. Then $X$ carries a canonical hyperkähler structure.

4. Projective Special Kähler Manifolds

We term the triple $(M, L, \omega)$ a Hodge manifold if $(M, \omega)$ is Kähler and $L \to M$ is a holomorphic hermitian line bundle with curvature$^{11}$ $-2\pi i\,\omega$. This implies $[\omega] \in H^2(M; \mathbb{R})$ is an integral class. We begin with a geometric lemma about the principal $\mathbb{C}^\times$ bundle $\pi : \tilde M \to M$ obtained by deleting the zero section from $L \to M$. First, the hermitian connection on $L$ is also a connection on $\pi : \tilde M \to M$, that is, a $\mathbb{C}^\times$-invariant distribution of horizontal subspaces. Also, the bundle $\pi^*L \to \tilde M$ has a canonical nonzero holomorphic section $s$.

Lemma 4.1. Let $\tilde\omega \in \Omega^{1,1}(\tilde M)$ denote the form which equals $|s|^2\,\pi^*\omega$ on pairs of horizontal vectors, vanishes on a horizontal vector paired with a vertical vector, and is $-1/\pi$ times the canonical Kähler form on pairs of vertical vectors. Then
$$\tilde\omega = \frac{i}{2\pi}\,\bar\partial\partial\,|s|^2. \qquad (4.2)$$
Thus $d\tilde\omega = 0$, which implies that $\tilde\omega$ is a pseudo-Kähler metric on $\tilde M$ of Lorentz type. Finally,
$$\pi^*\omega = \frac{i}{2\pi}\,\bar\partial\partial \log |s|^2. \qquad (4.3)$$

The canonical Kähler form on a hermitian line $L$ is $\frac{i}{2}\,\partial\bar\partial\,|s|^2$, $s \in L$. The metric $\tilde\omega$ is negative definite on fibers and positive definite on horizontal subspaces. It has signature $(n, 1)$, where $n = \dim M$.

Proof. Let $t$ be a nonzero holomorphic section of $L|_U \to U$ for an open set $U \subset M$, and set $h(z) = |t(z)|^2$, $z \in U$. We use local coordinates $\langle z, \lambda\rangle \mapsto \lambda\,t(z) \in \pi^{-1}U \subset \tilde M$, where $\lambda \in \mathbb{C}^\times$. Now $s(z, \lambda) = \lambda\,t(z)$, and so $|s(z, \lambda)|^2 = |\lambda|^2 h(z)$. Compute the right-hand side of (4.2). To verify the description of $\tilde\omega$ given before (4.2), note that $\xi - \lambda\,h^{-1}\,\partial h(\xi)\,\frac{\partial}{\partial\lambda}$ is the horizontal lift of a tangent vector $\xi$ in $U$. Formula (4.3) is the standard curvature formula for the hermitian connection. $\square$

The usual definition for what we call a projective special Kähler structure is a particular type of variation of Hodge structure, which was considered specifically in a paper of Bryant and Griffiths [BG]. We discuss this first and defer our description to Proposition 4.6(b). Our version of the usual definition emphasizes the fact that the parameter space is a Hodge manifold, but it is equivalent to the definition in [BG] (cf. [C] for the relationship to [St]).

$^{11}$ Since we do not use so many indices in this section, we revert to the standard notation $i = \sqrt{-1}$.
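The computation indicated in the proof of Lemma 4.1 can be written out as follows. This is only a sketch in the local coordinates $\langle z, \lambda\rangle$ of the proof; the symbol $\theta$ for the connection form is an auxiliary name introduced here and does not appear in the text.

```latex
% Sketch of the computation in the proof of Lemma 4.1.
% Assumption: \theta is an auxiliary symbol; it vanishes on the horizontal
% lift \xi - \lambda h^{-1}\partial h(\xi)\,\partial/\partial\lambda.
\begin{align*}
|s|^2 &= |\lambda|^2\,h(z), \qquad
\theta := d\lambda + \lambda\,h^{-1}\,\partial h ,\\
\bar\partial\partial\,|s|^2
  &= -\,h\,\theta\wedge\bar\theta \;+\; |\lambda|^2 h\,\bar\partial\partial\log h ,\\
\frac{i}{2\pi}\,\bar\partial\partial\,|s|^2
  &= -\frac{1}{\pi}\cdot\frac{i}{2}\,h\,\theta\wedge\bar\theta
     \;+\; |s|^2\,\pi^*\omega .
\end{align*}
```

The first term vanishes whenever one argument is horizontal and restricts on a fiber to $-1/\pi$ times $\frac{i}{2}\,h\,d\lambda\wedge d\bar\lambda$; the second term uses (4.3) together with $\bar\partial\partial\log|\lambda|^2 = 0$ on $\mathbb{C}^\times$.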
Definition 4.4. (i) A projective special Kähler structure on an $n$ dimensional Hodge manifold $(M, L, \omega)$ is a triple $(V, \nabla, Q)$ where
(a) $V \to M$ is a holomorphic vector bundle of rank $n + 1$ with a given holomorphic inclusion $L \hookrightarrow V$;
(b) $\nabla$ is a flat connection on the underlying real bundle $V_{\mathbb{R}} \to M$ such that $\nabla(L) \subset V$ and the section
$$M \longrightarrow \mathbb{P}\bigl((V_{\mathbb{R}})_{\mathbb{C}}\bigr), \qquad m \longmapsto L_m \qquad (4.5)$$
is an immersion with respect to $\nabla$;
(c) $Q$ is a nondegenerate skew form on $V_{\mathbb{R}}$ which has type $(1,1)$ with respect to the complex structure and satisfies $\nabla Q = 0$. Furthermore, we assume that $Q|_{L\times L}$ is $i/2\pi$ times the hermitian metric on $L$.
(ii) An integral projective special Kähler structure is a quadruple $(\Lambda, V, \nabla, Q)$ with $(V, \nabla, Q)$ as in (i) and $\Lambda \subset V_{\mathbb{R}}$ a flat submanifold which intersects each fiber in a full lattice such that $Q|_{\Lambda\times\Lambda}$ has integral values.

In this definition $\nabla$ and $Q$ are extended to the complexification $(V_{\mathbb{R}})_{\mathbb{C}}$ of $V_{\mathbb{R}}$. The flat connection gives a local identification of $V_{\mathbb{R}}$, hence also of its complexification $(V_{\mathbb{R}})_{\mathbb{C}}$ and the projectivization $\mathbb{P}\bigl((V_{\mathbb{R}})_{\mathbb{C}}\bigr)$, with any fiber. The immersion condition in (b) states that $m \mapsto L_m$ is an immersion into the $2n + 1$ dimensional projective space of a local trivialization of $\mathbb{P}\bigl((V_{\mathbb{R}})_{\mathbb{C}}\bigr)$.

The data in (ii) define a variation of polarized Hodge structures of weight 3 with Hodge numbers $h^{3,0} = 1$, $h^{2,1} = n$ with an extra immersion condition. This is the form of the definition in [BG]. (See [CGGH] for the basic definitions related to variations of Hodge structures.) We recover the Hodge filtration $\{F^p\}$ by setting $F^3 = L$, $F^2 = V$, $F^1 = (F^3)^\perp$, and $F^0 = (V_{\mathbb{R}})_{\mathbb{C}}$. (Here "$\perp$" is with respect to $Q$.) The Griffiths transversality condition $\nabla(F^3) \subset F^2$ is given in (b) above; the condition $\nabla(F^2) \subset F^1$ follows from this and the immersion condition [BG, pp. 82–83]. Proposition 4.6 below implies that $iQ|_{H^{2,1}\times H^{2,1}}$ is positive definite, where $H^{2,1} = F^2 \cap \overline{F^1}$. Variations of Hodge structure without the lattice, as in (i), were considered in [S].
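As a consistency check on the Hodge numbers quoted above, the ranks of the filtration can be counted directly. The following is a sketch that uses only the ranks stated in Definition 4.4 and the nondegeneracy of $Q$; the values $h^{1,2}$ and $h^{0,3}$ are obtained here by taking successive differences and are not stated explicitly in the text.

```latex
% Rank count for the filtration F^3 \subset F^2 \subset F^1 \subset F^0.
\begin{align*}
\operatorname{rk} F^3 &= \operatorname{rk} L = 1, &
\operatorname{rk} F^2 &= \operatorname{rk} V = n+1,\\
\operatorname{rk} F^1 &= \operatorname{rk}\,(F^3)^{\perp} = (2n+2)-1 = 2n+1, &
\operatorname{rk} F^0 &= \operatorname{rk}\,(V_{\mathbb R})_{\mathbb C} = 2n+2,\\
h^{3,0} &= 1, \qquad h^{2,1} = (n+1)-1 = n, &
h^{1,2} &= (2n+1)-(n+1) = n, \qquad h^{0,3} = 1 .
\end{align*}
```

The total $1 + n + n + 1 = 2n + 2$ agrees with the rank of $(V_{\mathbb{R}})_{\mathbb{C}}$, and the fiber of $\mathbb{P}\bigl((V_{\mathbb{R}})_{\mathbb{C}}\bigr)$ has dimension $2n + 1$, as used in the immersion condition.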
Our main observation in this section is the following. We prefer to take the structure in (b) as the definition of projective special Kähler.

Proposition 4.6. Let $(M, L, \omega)$ be a Hodge manifold with associated pseudo-Kähler manifold $(\tilde M, \tilde\omega)$ and canonical section $s$.
(a) A projective special Kähler structure on $(M, L, \omega)$ induces a $\mathbb{C}^\times$-invariant special pseudo-Kähler structure $\tilde\nabla$ on $(\tilde M, \tilde\omega)$ with $\tilde\nabla s = \pi^{(1,0)}$.
(b) Conversely, a $\mathbb{C}^\times$-invariant special pseudo-Kähler structure $\tilde\nabla$ on $(\tilde M, \tilde\omega)$ which satisfies $\tilde\nabla s = \pi^{(1,0)}$ induces a projective special Kähler structure on $(M, L, \omega)$.

Recall that $\tilde\omega$ is defined in Lemma 4.1. The canonical section $s$ defined there can be viewed as the holomorphic vertical vector field on $\tilde M$ induced by the $\mathbb{C}^\times$ action.

Proof. (a) Let $\tilde\nabla = \pi^*\nabla$ be the lifted flat connection on $\pi^*V$. Using the inclusion $L \hookrightarrow V$ we view $s$ as a section of $\pi^*V$. The immersion condition (4.5) implies that
$$\tilde\nabla s : T\tilde M \longrightarrow \pi^*V \qquad (4.7)$$
is an isomorphism. (Note that $\tilde\nabla s \subset \pi^*V$ by the Griffiths transversality in (b).) Using the real isomorphism underlying (4.7) we obtain a real flat connection on $\tilde M$; we also denote it by '$\tilde\nabla$'. Furthermore, under (4.7) the form $\tilde Q = \pi^*Q$ pulls back to $-\tilde\omega$. This follows by differentiating the equation
$$\frac{i}{2\pi}\,|s|^2 = \tilde Q(s, \bar s),$$
assumed in (c), to obtain
$$\tilde\omega = \frac{i}{2\pi}\,\bar\partial\partial\,|s|^2 = -\tilde Q(\tilde\nabla s, \tilde\nabla s).$$
Thus $\tilde\nabla\tilde\omega = 0$. Now under (4.7) the section $s$ corresponds to a holomorphic vector field $\zeta$ which satisfies (1.8). This proves that $\tilde\nabla$ satisfies the special Kähler condition (1.7), and by Remark 1.19 $\tilde\nabla$ is also torsionfree.

(b) We simply indicate the construction of $(V, \nabla, Q)$. First, let $V$ be the quotient of $T\tilde M$ by the $\mathbb{C}^\times$ action. Then $V$ is a holomorphic bundle over $M$, and the inclusion of vertical vectors in $T\tilde M$ induces an inclusion $L \hookrightarrow V$. The connection $\tilde\nabla$ on $(T\tilde M)_{\mathbb{R}}$ induces a connection $\nabla$ on $V_{\mathbb{R}}$; the immersion condition in Definition 4.4(i)(b) follows from the hypothesis $\tilde\nabla s = \pi^{(1,0)}$. The form $-\tilde\omega$ on $\tilde M$ induces a skew form $Q$ on $V$. $\square$

Notice as a consequence of (c) and the description of $\tilde\omega$ in Lemma 4.1 that $iQ|_{H^{2,1}\times H^{2,1}}$ is positive definite.

Now the discussion of special coordinates, holomorphic prepotential, etc. from Sect. 1 applies to $(\tilde M, \tilde\omega, \tilde\nabla)$. We make $\mathbb{C}^\times$-equivariant choices on $\tilde M$ and consider the induced tensors on $M$. We work on $\pi^{-1}(U)$ for $U \subset M$ a sufficiently small open set. We do not choose Darboux coordinates, but only a flat local symplectic framing$^{12}$ of $T\tilde M$, which we require to be $\mathbb{C}^\times$-invariant. We say that a complex tensor field on $\tilde M$ has degree $n$ if it transforms under $\lambda \in \mathbb{C}^\times$ by multiplication by $\lambda^n$. The vector field $\zeta$ (which corresponds to $s$ under (4.7)) has degree 1. So from (1.9) we see that a special coordinate function $z^i$ also has degree 1. In other words, $z^i$ is a local holomorphic section of $L \to M$. Thus a special coordinate system $\{z^i\}$ on $\tilde M$ gives rise to local projective coordinates on $M$ (which transform as sections of $L$). From (1.13) we see that the period matrix $(\tau_{ij})$ is a scalar, and from (1.14) that the holomorphic prepotential $\mathfrak{F}$ has degree 2, i.e., $\mathfrak{F}$ is a local holomorphic section of $L^{\otimes 2}$. Because of the $\mathbb{C}^\times$-invariance there is less flexibility in choosing $\{z^i\}$ and $\mathfrak{F}$ than in the nonprojective case: different choices differ by a homogeneous function.

$^{12}$ It is denoted $\{\frac{\partial}{\partial x^i}, \frac{\partial}{\partial y_j}\}$ in Sect. 1, but here we do not consider coordinate functions $x^i$ and $y_j$.

Proposition 4.8. Let $(M, L, \omega, \tilde\nabla)$ be a projective special Kähler manifold.
(a) Given a special projective coordinate system $\{z^i\}$ the holomorphic prepotential $\mathfrak{F}$ is determined up to a change
$$\mathfrak{F} \longrightarrow \mathfrak{F} + \frac{1}{2}\,A_{ij}\,z^i z^j,$$
where $A = (A_{ij})$ is a real symmetric matrix. Hence the conjugate special projective coordinate system $\{w_j\}$ is determined up to a change
$$w_j \longrightarrow w_j + A_{jk}\,z^k.$$
(b) If $\{z^i\}$, $\{w_j\}$ are conjugate special projective coordinate systems, then any other pair $\{\tilde z^i\}$, $\{\tilde w_j\}$ of conjugate special projective coordinate systems are related by
$$\begin{pmatrix} \tilde z \\ \tilde w \end{pmatrix} = P \begin{pmatrix} z \\ w \end{pmatrix}, \qquad P \in Sp(2n; \mathbb{R}).$$

From (4.3) we see that the lift of the metric $\omega$ to $\tilde M$ has a global "Kähler potential" $\tilde K$, which we write in special coordinates as
$$\tilde K \doteq -\frac{1}{\pi}\log|s|^2 \doteq -\frac{1}{\pi}\log \tilde Q(s, \bar s) \doteq -\frac{1}{\pi}\log\bigl[-\tilde\omega(\zeta, \bar\zeta)\bigr] \doteq -\frac{1}{\pi}\log \operatorname{Im}\bigl(z^i\,\bar w_i\bigr) \doteq -\frac{1}{\pi}\log \operatorname{Im}\Bigl(z^i\,\overline{\frac{\partial\mathfrak{F}}{\partial z^i}}\Bigr).$$
Here "$\doteq$" means "equals up to an additive constant". $\tilde K$ pulls down to a local Kähler potential on $M$ via a local holomorphic section of $\pi : \tilde M \to M$.

The cubic form $\tilde\Xi$ of the special Kähler structure on $\tilde M$ (see (1.26)) is a holomorphic section
$$\Xi \in H^0(M, \operatorname{Sym}^3 T^*M \otimes L^{\otimes 2}),$$
as follows easily from (1.28). It is a basic ingredient in the analysis of [BG], where it is derived from an infinitesimal variation of Hodge structure. Since $\zeta$ is holomorphic of type $(1, 0)$, we have $\omega(\zeta, \tilde\nabla\zeta) = 0$, and by differentiating $\omega(\zeta, \tilde\nabla^2\zeta) = 0$. Differentiating once more we conclude from (1.27) that the cubic form in this case is
$$\Xi = \omega(\zeta, \tilde\nabla^3\zeta) = \tilde Q(\nabla^3 s, s).$$

We can use $\tilde\Xi \in H^0(\tilde M, \operatorname{Sym}^3 T^*\tilde M)$ and the associated $\tilde A \in \Omega^{1,0}\bigl(\operatorname{Hom}(T\tilde M, \overline{T\tilde M})\bigr)$ to introduce an algebra structure on $T\tilde M \otimes_{\mathbb{R}} \mathbb{C}$. Fix $\tilde m \in \tilde M$ and denote $V = T_{\tilde m}\tilde M$. It is easy to see that $\tilde A$ vanishes on $\zeta$, and it is a well-defined map $W \otimes W \to \overline{W}$, where $W$ is the orthogonal complement to $\zeta$. (Under the projection $\pi$ we can identify $W \cong T_m M$, where $m = \pi(\tilde m)$.) We now obtain a graded algebra $C$: Set $C_0 = \mathbb{C}\cdot\zeta$, $C_1 = W$, $C_2 = \overline{W}$, and $C_3 = \mathbb{C}\cdot\bar\zeta$; then $\zeta$ acts as the identity, the multiplication $C_1 \otimes C_1 \to C_2$ is given by $\tilde A$, and the multiplication $C_1 \otimes C_2 \to C_3$ is $\alpha \otimes \bar\beta \mapsto \omega(\alpha, \bar\beta)\,\bar\zeta$. Associativity is trivial to verify.

Now we consider the implications of the lattice $\Lambda \subset V_{\mathbb{R}}$ in an integral projective special Kähler structure on $M$. Under the isomorphism (4.7) the lift $\pi^*\Lambda \subset \pi^*V_{\mathbb{R}}$ induces a lattice $\tilde\Lambda \subset T\tilde M$. Now $\tilde\Lambda$ is $\tilde\nabla$-flat by hypothesis, so by Proposition 1.22 it is a complex submanifold. Locally $\tilde\Lambda$ is the graph of a $\tilde\nabla$-flat vector field on $\tilde M$, so the dual $\tilde\Lambda^*$ is locally the graph of a $\tilde\nabla$-flat 1-form. Since $\tilde\nabla$ is torsionfree, this 1-form is holomorphic and so $\tilde\Lambda^* \subset T^*\tilde M$ is complex lagrangian. Thus Theorem 3.4(b) and also Theorem 3.8 apply to give the following conclusion.
Proposition 4.9. Suppose $(M, L, \omega, \tilde\nabla, \Lambda)$ is an integral projective special Kähler manifold of dimension $n$. Then there is an associated indefinite algebraic integrable system $X \to \tilde M$, where $\tilde M$ is $L$ with the zero section removed. The total space $X$ carries a "pseudo-hyperkähler" structure of real signature $(4n, 4)$.

The fibers of this integrable system are the intermediate Jacobians associated to the underlying variation of Hodge structure. The symplectic form on this family of intermediate Jacobians was constructed by Donagi and Markman [DM2] (for the case of a family of Calabi-Yau manifolds). The pseudo-hyperkähler structure was also given by Cortés [C]. As in the nonprojective case we restrict our local Darboux framings to lie in the lattice, and so the matrices $A$, $P$ in Proposition 4.8 must be integral.

5. Remarks on N = 2 Gauge Theories in Four Dimensions

We make some brief remarks on the role of special Kähler manifolds in global supersymmetric theories. We do not comment on their role in supergravity. References for the quantum physics are [SW1] and [SW2]. For a mathematical development of the relevant classical supersymmetry, see [DF]. The quantum aspects of our discussion have no pretension to rigor.

We first recall the origin of the local formula (1.15) for the Kähler potential. It arises from the lagrangian for the complex scalars in the four dimensional $N = 2$ vector multiplets. There is a superspace description in terms of the superspace $N^{4|8}$, which is an extension of ordinary four dimensional Minkowski space with eight odd dimensions. The complexification of the odd distribution splits into two pieces, and there is a corresponding notion of a chiral map $\Sigma : N^{4|8} \to \mathbb{C}$. Such a map describes an (abelian) $N = 2$ vector multiplet. (More precisely, it is a component of the curvature of a constrained connection on superspace.) The most general supersymmetric lagrangian for $n$ such multiplets is specified by a holomorphic function $\mathfrak{F} : \mathbb{C}^n \to \mathbb{C}$.
The theory is free if $\mathfrak{F}$ is quadratic. Upon reduction to $N = 1$ superspace $N^{4|4}$ each multiplet $\Sigma$ decomposes into an $N = 1$ chiral multiplet $\Phi$ and an $N = 1$ vector multiplet $A$. The lagrangian for the chiral multiplets is determined from the Kähler potential $K$, and a computation gives the formula (1.15) for $K$ in terms of $\mathfrak{F}$.

Next, we emphasize that a special Kähler manifold does not define a classical field theory for $N = 2$ vector multiplets. We do obtain a classical lagrangian from a special coordinate system, as explained in the previous paragraph. Furthermore, any Kähler manifold $M$ does determine a well-defined $N = 1$ supersymmetric field theory for a chiral field $\Phi : N^{4|4} \to M$. However, the change of special coordinates (1.39) must be accompanied by a duality$^{13}$ transformation on the gauge field in the vector multiplet $A$, and this only makes sense in the quantum theory. Moreover, this duality transformation only makes sense when the holonomy of $\nabla$ is contained in the integral symplectic group. Thus a special Kähler manifold $M$ with a lattice as in Theorem 3.4 determines$^{14}$ a quantum field theory which locally has a semiclassical description in terms of $N = 2$ vector multiplets. The manifold $M$ is the moduli space of quantum vacua. According to Theorem 3.4 such a theory is always specified by an algebraic integrable system.

$^{13}$ That is, electromagnetic duality.
$^{14}$ Since typically $M$ is incomplete this is not yet a full description of a theory. Also, an abelian gauge theory, which has a positive $\beta$-function, only makes sense as an effective field theory, not as a fundamental theory.

These abelian theories describe the low energy behavior of the Coulomb branch of nonabelian $N = 2$ supersymmetric gauge theories, with or without matter. The
simplest example [SW1] has gauge group $SU(2)$ and no matter. Then $M$ is the universal curve $M(2)$ for the modular group $\Gamma(2) \subset SL(2; \mathbb{Z})$, which we can identify as $\mathbb{CP}^1$ with 3 points omitted, say $M(2) = \mathbb{CP}^1 - \{-1, 1, \infty\}$. The universal curve $X(2) \to M(2)$ is the algebraic integrable system which defines the model. Many more examples have been found, all of course involving integrable systems. (See [D] for a review.)

So far we have taken $M$ to be smooth. As stated above, a nonflat $M$ is not complete and an honest physical theory is formulated on some completion of $M$. For example, for the pure $SU(2)$ gauge theory the special Kähler metric on the moduli space $\mathbb{CP}^1 - \{-1, 1, \infty\}$ is complete near $\infty$, but the singular points $-1$, $1$ are at finite distance. At these points other fields are massless and must be added to the low energy description.

We now remark further on the physical origin of the lattice $\Lambda$. It is a feature of four dimensional abelian gauge theories; supersymmetry is irrelevant. (See [AgZ, Sect. 3] for a recent discussion.) Consider a four dimensional gauge theory with gauge group $G = \mathbb{T}^n$, where $\mathbb{T} \cong U(1)$ is the circle group. The theory is specified by a complex bilinear form $\tau$ on the Lie algebra $\mathfrak{g}$ whose imaginary part $\operatorname{Im}\tau$ is an inner product.$^{15}$ The lagrangian density in Minkowski space is
$$L = \Bigl\{ -\frac{1}{8\pi}\,\operatorname{Im}\tau(F_A, *F_A) + \frac{1}{8\pi}\,\operatorname{Re}\tau(F_A, F_A) \Bigr\}\,|d^4x|, \qquad (5.1)$$
where $A$ is a connection and $|d^4x|$ the standard density. There is a lattice $\mathfrak{g}_{\mathbb{Z}} \subset \mathfrak{g}$ whose elements exponentiate to the identity in $G$, and each basis of this lattice produces a matrix $(\tau_{ij}) \in \mathbb{H}_n$ which represents the form $\tau$. The group $GL(n; \mathbb{Z})$ permutes these bases. The larger duality group $Sp(2n; \mathbb{Z})$ is generated by this group together with the electromagnetic duality transformation. The latter expresses the theory in terms of a "dual" connection $\tilde A$ and the bilinear form $-\tau^{-1}$. The lagrangian has the same form as (5.1), and the operator $F_A$ in the original theory corresponds to $*F_{\tilde A}$ in the dual theory. The action of $Sp(2n; \mathbb{Z})$ which is generated acts on $\tau$ by (1.40).

$^{15}$ For $n = 1$ in the standard basis the form $\tau$ is usually written $\tau = \frac{\theta}{\pi} + \frac{8\pi\sqrt{-1}}{e^2}$, where $e$ is the coupling constant.

Fix a basis of $\mathfrak{g}_{\mathbb{Z}}$ and so write the curvature as $F_A = (F_A^i)_{i=1,\dots,n}$. There are $n$ electric charges $q^i$ and $n$ magnetic charges $g^i$ for charged matter we might put into the theory. Classically, the electric charge in a spatial region bounded by a surface $\Sigma$ is defined to be
$$q^i = \int_\Sigma \frac{\sqrt{-1}}{2\pi}\,{*F_A^i},$$
and the enclosed magnetic charge is
$$g^i = (n_m)^i = \int_\Sigma \frac{\sqrt{-1}}{2\pi}\,F_A^i.$$
The electric and magnetic charges of a quantum state are computed from the corresponding operators in the quantum theory. Now $(n_m)^i$ is an integer by Chern-Weil theory for the compact gauge group $\mathbb{T}^n$. In the classical theory $(n_m)^i$ is an integer-valued function on the space of classical solutions; in the quantum theory it assigns an integer to each quantum state. There are other integers $(n_e)_i$ attached to quantum states from the Noether charges associated to global infinitesimal gauge transformations. Here the integrality is from the fact that certain exponentials of these infinitesimal transformations
are the identity operator. These integers are related to the electric charge of states via the formula
$$q^i = \bigl[(\operatorname{Im}\tau)^{-1}\bigr]^{ij}\bigl((\operatorname{Re}\tau)_{jk}\,(n_m)^k + (n_e)_j\bigr).$$
It is convenient to consider the complex quantity
$$q^i + \sqrt{-1}\,g^i = \bigl[(\operatorname{Im}\tau)^{-1}\bigr]^{ij}\bigl(\tau_{jk}\,(n_m)^k + (n_e)_j\bigr).$$
As the $n_m$, $n_e$ run over all integers, this runs over the points of the (electromagnetic) charge lattice $\Lambda^*$ in $\mathbb{C}^n$. There is an integral symplectic form $\omega$ on $\Lambda^*$ defined by
$$\omega\left(\begin{pmatrix} g \\ q \end{pmatrix}, \begin{pmatrix} \tilde g \\ \tilde q \end{pmatrix}\right) = g^i\,\Bigl(\operatorname{Im}\frac{\tau}{2}\Bigr)_{ij}\,\tilde q^j - \tilde g^i\,\Bigl(\operatorname{Im}\frac{\tau}{2}\Bigr)_{ij}\,q^j = (n_m)^i(\tilde n_e)_i - (\tilde n_m)^i(n_e)_i. \qquad (5.2)$$
It is preserved by the duality group. Equation (5.2) is the form of charge quantization usually referred to as the "Dirac-Schwinger-Zwanziger condition".

Returning to an $N = 2$ supersymmetric abelian gauge theory, we have the moduli space $M$ of the complex scalars which carries the special geometry we have been discussing. There is a distinguished set of conjugate special coordinate systems related by integral coordinate transformations. In each such coordinate system we have a lagrangian description as a gauge theory with gauge group $G = \mathbb{T}^n$ (with distinguished basis for the Lie algebra). We should regard the $\mathbb{C}^n$ where the coordinates live as the complexified Lie algebra with its distinguished basis. The electromagnetic charge lattice $\Lambda^*$ discussed in the previous paragraph defines a global lattice in the complex conjugate cotangent bundle $\overline{T^*M} \cong TM$. Note in the notation of Sect. 1 that $\begin{pmatrix} (n_m)^i \\ (n_e)_j \end{pmatrix}$ transforms analogously to $\begin{pmatrix} dx^i \\ dy_j \end{pmatrix}$, and $q^i + \sqrt{-1}\,g^i$ transforms analogously to $dz^i$. (See formulas (1.6) and (1.12).)

There is a further geometric input from the BPS mass formula. Namely, in the classical theory the central charge $Z$ in the $N = 2$ supersymmetry algebra is a complex-valued locally constant function on the space of solutions to the classical field equations. In the quantum theory (at a point $m \in M$) it is an operator whose eigenvalues are complex numbers. Let $\{z^i\}$, $\{w_j\}$ be distinguished conjugate special coordinate systems. This means that there is a lagrangian description for the $N = 2$ theory in terms of a prepotential $\mathfrak{F}(z^1, \dots, z^n)$ with $w_j = \partial\mathfrak{F}/\partial z^j$. The BPS formula involves possible additional $\mathbb{T}$ charges $S^\alpha$ which may appear in the theory. These have integer eigenvalues. The BPS formula asserts that the eigenvalue of $Z$ is
$$z^i\,(n_e)_i + w_i\,(n_m)^i + s^\alpha\,\frac{m_\alpha}{\sqrt{2}},$$
where $s^\alpha$ is the eigenvalue of $S^\alpha$ and $(n_e)_i$, $(n_m)^i$ are the integers defined above. Let $\Gamma \subset \mathbb{C}$ denote the set of points so described. If there are no $S^\alpha$, then the fact that $\Gamma$ is intrinsic and the transformation law for $\begin{pmatrix} (n_m)^i \\ (n_e)_j \end{pmatrix}$ implies that there is no translation in the coordinate change (1.39) between different sets of distinguished conjugate coordinate systems. However, in the presence of charges $S^\alpha$ there may be a nonzero translational component.
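The integrality and duality invariance of the pairing on the right-hand side of (5.2) can be checked on small examples. The following is a minimal sketch, not taken from the paper: the function names, the flat ordering of the charge vector as $(n_m, n_e)$, and the linear action of a $2n \times 2n$ integer matrix on that vector are all assumptions made for illustration (the action (1.40) on $\tau$ is not reproduced here). The helper `complex_charge` only treats $n = 1$.

```python
# Sketch under stated assumptions: a charge vector is stored as the flat list
# [n_m^1, ..., n_m^n, n_e_1, ..., n_e_n]. This ordering is a convention chosen
# here for illustration, not one fixed by the text.

def dsz_pairing(v, w):
    """Integer pairing n_m . n~_e - n~_m . n_e, the right-hand side of (5.2)."""
    n = len(v) // 2
    return sum(v[i] * w[n + i] - w[i] * v[n + i] for i in range(n))

def apply(P, v):
    """Matrix-vector product for a square integer matrix P given as a list of rows."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def preserves_pairing(P):
    """Return True if P preserves dsz_pairing on all pairs of basis vectors."""
    m = len(P)
    basis = [[int(i == j) for j in range(m)] for i in range(m)]
    return all(
        dsz_pairing(apply(P, e), apply(P, f)) == dsz_pairing(e, f)
        for e in basis
        for f in basis
    )

def complex_charge(tau, n_m, n_e):
    """n = 1 case of q + sqrt(-1) g = (Im tau)^{-1} (tau n_m + n_e)."""
    return (tau * n_m + n_e) / tau.imag

# Hypothetical n = 1 test matrices (names chosen here).
S = [[0, 1], [-1, 0]]  # exchanges magnetic and electric integers up to sign
T = [[1, 0], [1, 1]]   # shifts n_e by n_m
D = [[2, 0], [0, 1]]   # rescales n_m only, so it is not symplectic

if __name__ == "__main__":
    print(preserves_pairing(S), preserves_pairing(T), preserves_pairing(D))
    charge = complex_charge(0.5 + 2.0j, 3, 1)
    # The imaginary part reproduces the magnetic integer n_m.
    print(charge.real, charge.imag)
```

The check over basis vectors suffices because the pairing is bilinear; for $n = 1$ the imaginary part of the complex charge returns $g = n_m$, in line with the two displayed formulas for $q^i$ and $q^i + \sqrt{-1}\,g^i$.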
References

[AgZ] Álvarez-Gaumé, L., Zamora, F.: Duality in quantum field theory (and string theory). hep-th/9709180
[BCOV] Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes. Commun. Math. Phys. 165, 311–428 (1994), hep-th/9309140
[BG] Bryant, R.L., Griffiths, P.A.: Some observations on the infinitesimal period relations for regular threefolds with trivial canonical bundle. In: Arithmetic and Geometry, Vol. II, Progr. Math. 36, Boston: Birkhäuser, 1983, pp. 77–102
[CO] Candelas, P., de la Ossa, X.C.: Moduli space of Calabi-Yau manifolds. Nucl. Phys. B 355, 455–481 (1991)
[CGGH] Carlson, J., Green, M., Griffiths, P., Harris, J.: Infinitesimal variations of Hodge structure. I. Compositio Math. 50, 109–205 (1983)
[CFG] Cecotti, S., Ferrara, S., Girardello, L.: Geometry of type II superstrings and the moduli of superconformal field theories. Int. J. Mod. Phys. A 4, 2475–2529 (1989)
[C] Cortés, V.: On hyper-Kähler manifolds associated to Lagrangian Kähler submanifolds of $T^*\mathbb{C}^n$. Trans. Am. Math. Soc. 350, 3193–3205 (1998)
[CRTP] Craps, B., Roose, F., Troost, W., Van Proeyen, A.: What is special Kähler geometry? Nucl. Phys. B 503, 565–613 (1997), hep-th/9703082
[DF] Deligne, P., Freed, D.S.: Supersolutions. In: Quantum Fields and Strings: A Course for Mathematicians. Vol. 1, Providence, RI: American Mathematical Society, 1999
[D] Donagi, R.Y.: Seiberg-Witten integrable systems. In: Algebraic geometry – Santa Cruz 1995, Proc. Sympos. Pure Math., Vol. 62, Providence, RI: Am. Math. Soc., 1997, pp. 3–43, alg-geom/9705010
[DM1] Donagi, R.Y., Markman, E.: Spectral covers, algebraically completely integrable, Hamiltonian systems, and moduli of bundles. In: Integrable systems and quantum groups (Montecatini Terme, 1993), Lecture Notes in Math. 1620, Berlin: Springer, 1996, pp. 1–119
[DM2] Donagi, R.Y., Markman, E.: Cubics, integrable systems, and Calabi-Yau threefolds. In: Proceedings of the Hirzebruch 65 Conference on Algebraic Geometry (Ramat Gan, 1993), Israel Math. Conf. Proc. 9, 199–221 (1996)
[DW] Donagi, R.Y., Witten, E.: Supersymmetric Yang–Mills theory and integrable systems. Nucl. Phys. B 460, 299–334 (1996)
[F] Fré, P.: Lectures on special Kähler geometry and electric-magnetic duality rotations. Nucl. Phys. Proc. Suppl. 45BC, 59–114 (1996), hep-th/9512043
[G] Gates, S.J., Jr.: Superspace formulation of new non-linear sigma models. Nucl. Phys. B 238, 349–366 (1984)
[GS] Guillemin, V., Sternberg, S.: Symplectic techniques in physics. Cambridge: Cambridge University Press, 1990
[H] Hitchin, N.: Monopoles, Minimal Surfaces and Algebraic Curves. Séminaire de Mathématiques Supérieures, Vol. 105, Montréal, Quebec: Les Presses de l'Université de Montréal, 1987
[L] Lu, Z.: A note on the special Kähler manifolds. Preprint
[ST] Sierra, G., Townsend, P.K.: An introduction to N = 2 rigid supersymmetry. In: Supersymmetry and Supergravity 1983, B. Milewski, ed., Singapore: World Scientific, 1983, p. 396
[SW1] Seiberg, N., Witten, E.: Electric-magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. Nucl. Phys. B 426, 19–52 (1994); Erratum Nucl. Phys. B 430, 485–486 (1994), hep-th/9407087
[SW2] Seiberg, N., Witten, E.: Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD. Nucl. Phys. B 431, 484–550 (1994), hep-th/9408099
[SW3] Seiberg, N., Witten, E.: Gauge dynamics and compactification to three dimensions. In: The Mathematical Beauty of Physics: A Memorial Volume for Claude Itzykson, J. M. Drouffe, J. B. Zuber, eds., Singapore: World Scientific, 1997, pp. 333–366, hep-th/9607163
[S] Simpson, C.T.: Higgs bundles and local systems. Inst. Hautes Études Sci. Publ. Math. 75, 5–95 (1992)
[St] Strominger, A.: Special geometry. Commun. Math. Phys. 133, 163–180 (1990)
[WP] de Wit, B., Van Proeyen, A.: Potentials and symmetries of general gauged N = 2 supergravity-Yang–Mills models. Nucl. Phys. B 245, 89–117 (1984)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 203, 53 – 69 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Chiral BRST Cohomology of N = 2 Strings at Arbitrary Ghost and Picture Number$^\star$
Klaus Jünemann, Olaf Lechtenfeld
Institut für Theoretische Physik, Universität Hannover, Appelstraße 2, D-30167 Hannover, Germany. E-mail:
[email protected];
[email protected] Received: 5 January 1998 / Accepted: 16 November 1998
Abstract: We compute the BRST cohomology of the holomorphic part of the N = 2 string at arbitrary ghost and picture number. We confirm the expectation that the relative cohomology at non-zero momentum consists of a single massless state in each picture. The absolute cohomology is obtained by an independent method based on homological algebra. For vanishing momentum, the relative and absolute cohomologies both display a picture dependence – a phenomenon discovered recently also in the relative Ramond sector of N = 1 strings by Berkovits and Zwiebach [1].

1. Introduction

The standard approach to describe quantum string theories is the BRST procedure which consists of introducing unphysical ghost fields associated with the symmetries of the theory. Physical states are then characterised as elements of the cohomology of the nilpotent BRST charge $Q$. For the open bosonic string this so-called absolute cohomology is well known to contain twice as many states as one would expect from light-cone quantisation [2]. Each state appears in two copies, either with or without the zero mode of the reparametrisation ghost $c_0$. The true physical spectrum therefore is determined by the BRST cohomology supplemented by the condition that a representative should be annihilated by the zero mode $b_0$ of the reparametrisation anti-ghost. This space defines the relative cohomology.$^1$

$^\star$ Supported in part by the "Deutsche Forschungsgemeinschaft"; grant LE-838/5-1
$^1$ For closed strings this kind of condition gets more complicated and leads to the concept of semi-relative cohomology. In this paper we consider for simplicity the chiral cohomology (describing open strings or the holomorphic part of closed strings) only.

In the case of world-sheet supersymmetry, an additional subtlety arises due to the existence of an infinite number of inequivalent Fock space representations of the spinor ghosts – the so-called picture degeneracy labelled by $\pi \in \frac{1}{2}\mathbb{Z}$ [3]. In the N = 1 string theory this problem is partly solved by bosonising the ghost fields, which allows one to
construct a picture-raising operator $X$ that maps physical states from the picture $\pi$ to $\pi + 1$. Moreover, there exists a picture-lowering operator $Y$ that inverts $X$ on the absolute cohomology spaces, implying that the picture-raising operation is an isomorphism of the cohomologies at different pictures [4]. Unfortunately, $Y$ does not commute with $b_0$. Thus this argument does not guarantee that picture-raising is an isomorphism also of the relative cohomologies at different pictures. This problem has been addressed very recently by Berkovits and Zwiebach [1], who used the momentum operator in the $-1$ picture to invert the zero mode of the picture-raising operator on states with non-vanishing momentum. This new picture-lowering operator commutes with $b_0$ and can therefore be used to prove the picture independence of the relative cohomology for non-vanishing momentum. However, these arguments do not rule out a picture dependence of the relative cohomology at zero momentum – a phenomenon which indeed occurs in the R sector of the relative cohomology of N = 1 strings [1].

In N = 2 string theory there exist two independent spinor ghost systems leading to two different picture numbers $(\pi^+, \pi^-)$. After bosonisation, one can construct picture-raising operators $X^\pm$ in complete analogy to the N = 1 case. These operators, however, cannot be inverted with local conformal fields [5,6]. There is thus the immediate question whether or not the absolute or relative BRST cohomologies are identical at different pictures.

We address this question by two independent methods. The first method consists of applying the ideas of ref. [1] to the N = 2 string. In contrast to conventional picture-lowering, this new kind of picture-lowering also works for the N = 2 string, but only for non-vanishing momentum. Since it commutes with $b_0$, we confirm the picture independence of both the absolute and the relative cohomology at non-zero momentum.
To describe the second method, let us recall that for the N = 1 theory there exists an alternative argument, due to Narganes-Quijano [7], that picture raising is an isomorphism. It makes use of the fact that bosonisation extends the Fock space by an additional oscillator and that in this extended space the absolute BRST cohomology is trivial. Some standard constructions from homological algebra then suffice to prove the isomorphy of the absolute cohomologies at different pictures. This work does not require the existence of an explicit picture-lowering operator and will be reviewed in more detail later on.

For a specific choice of bosonisation, the absolute cohomology in the extended Fock space of the N = 2 string again turns out to be trivial. However, the method of Narganes-Quijano cannot be carried over in a straightforward way, since the structure of the extended Fock space is more complicated for the N = 2 string. We therefore need to slightly modify his method and invoke the spectral flow automorphism of the N = 2 super Virasoro algebra [8]. For the massless level,$^2$ this will allow us to give an alternative proof of the picture independence of the absolute BRST cohomology at non-vanishing momentum. Unfortunately, we cannot treat the relative cohomology within this approach. It is, however, possible to extract some information about the exceptional case of zero momentum.

$^2$ In principle, the proof works at any mass level. Its induction assumes the equality of the absolute cohomology for some pair of neighbouring pictures, which we only proved explicitly for the massless level.

Nevertheless, most of our arguments fail for vanishing momentum, and we will demonstrate picture dependence of the exceptional cohomology by explicit computations. For example, we shall see that the relative zero-momentum cohomology in the $(-1, -1)$ picture consists of a single state of ghost number one, whereas in the $(-1, 0)$ picture there exist nontrivial states with any positive ghost number. In contrast to the N = 1 string, this phenomenon occurs in the absolute cohomology as well, but it is possible to show that the picture dependence of the absolute zero-momentum cohomology is restricted to ghost numbers 0, 1, 2 and 3. We have checked that this peculiar situation does not improve much when including the center-of-mass coordinate of the string [9, 1].

The plan of the paper is as follows: In the next section we present a few basic facts about cohomology, clarify the relation between absolute and relative cohomology, and perform some explicit calculations in simple cases. Moreover, a complete computation of the cohomology in the $(-1, -1)$ picture along the lines of refs. [2,10] and the role of spectral flow for the BRST cohomology are described. In the third section we apply the ideas of Berkovits and Zwiebach [1] to the N = 2 string and show that picture raising is a bijective map on both the absolute and relative cohomology classes for non-vanishing momentum. In the fourth section we review part of the work of Narganes-Quijano and extend his method to the N = 2 string to give an alternative treatment of picture raising. In the final section the results are summarised.

2. Preliminary Investigations

BRST quantisation and picture raising of the N = 2 string has been reviewed recently in ref. [11] whose notation and conventions we adopt throughout this paper. To keep things simple we concentrate on the NS sector. Other boundary conditions can be obtained by spectral flow [12,13].
2.1. Relative and absolute cohomology. BRST-closed states with non-vanishing eigenvalues of the zero modes of the bosonic N = 2 super Virasoro generators L0 or J0 are always exact. For cohomology computations it is therefore sufficient to restrict oneself to the space of states that are annihilated by L0 and J0 . Due to the relations {Q, b0 } = L0 , {Q, b˜0 } = J0 ,
(1)
it is possible to impose the further constraints that also the fermionic anti-ghost zero modes $b_0$ and $\tilde b_0\,{}^{3}$ annihilate the states under consideration. This leads to the concept of relative cohomology which appears to have a more direct physical meaning than the cohomology of the full Fock space. Throughout this paper we assume that all states are annihilated by $\tilde b_0$ and thus work with the Fock space
\[ F := \{\, |\psi\rangle \;;\; L_0|\psi\rangle = J_0|\psi\rangle = \tilde b_0|\psi\rangle = 0 \,\}. \tag{2} \]
The relative Fock space consists of states that are also annihilated by $b_0$:
\[ F_{\rm rel} := \{\, |\psi\rangle \in F \;;\; b_0|\psi\rangle = 0 \,\}. \tag{3} \]
We treat the two types of fermionic anti-ghosts differently because it seems to be necessary to impose the conditions $J_0|\psi\rangle = \tilde b_0|\psi\rangle = 0$ as subsidiary conditions on an open N = 2 string field in order to write down a free field action. The situation is quite similar to the field theory of closed bosonic strings where the conditions $(L_0 - \bar L_0)|\psi\rangle = (b_0 - \bar b_0)|\psi\rangle = 0$ have to be imposed [14]. In contrast, $b_0|\psi\rangle = 0$ can be considered as a gauge-fixing condition (Siegel gauge), and $L_0|\psi\rangle = 0$ simply is the equation of motion. Both the spaces $F$ and $F_{\rm rel}$ possess a grading with respect to picture and ghost number:
\[ F = \sum_{g,\pi^+,\pi^-} F^{g,\pi^+,\pi^-}, \qquad F_{\rm rel} = \sum_{g,\pi^+,\pi^-} F_{\rm rel}^{g,\pi^+,\pi^-}. \tag{4} \]

3 As usual, $c$ and $b$ denote the reparametrisation ghosts. We write $\tilde c$ and $\tilde b$ for the U(1) ghosts which have conformal weights 0 and 1, respectively.
We often suppress the obvious grading with respect to the center-of-mass momentum $k \in \mathbb{R}^{2,2}$. Following ref. [5] we bosonise the (commuting) spinor ghosts,
\[ \gamma^\pm \to \eta^\pm e^{\varphi^\pm}, \qquad \beta^\mp \to e^{-\varphi^\pm}\partial\xi^\mp, \tag{5} \]
and define the (total) ghost number current in a slightly unusual way [15]:
\[ j_{\rm gh} = -bc - \tilde b\tilde c + \eta^+\xi^- + \eta^-\xi^+. \tag{6} \]
This has the advantage of commuting with picture raising while still assigning the correct ghost number to all ghost fields and giving $\xi^\pm$ the ghost number minus one. Moreover, we define the ghost number of the ground state in the (0, 0) picture (and therefore in all pictures) to be zero. The BRST cohomology spaces inherit the various gradings and are denoted by
\[ H(F) = \sum_{g,\pi^+,\pi^-} H^{g,\pi^+,\pi^-}(F), \qquad H(F_{\rm rel}) = \sum_{g,\pi^+,\pi^-} H^{g,\pi^+,\pi^-}(F_{\rm rel}). \tag{7} \]
$H(F)$ is called the absolute${}^{4}$ and $H(F_{\rm rel})$ the relative cohomology. These two types of cohomology are related by a well known exact sequence [2]. Although this has been described in detail in refs. [10,16], we briefly repeat this analysis here. $F$ and $F_{\rm rel}$ differ just by the possibility to apply the oscillator $c_0$, which implies the decomposition $F = F_{\rm rel} \oplus c_0 F_{\rm rel}$. The inclusion $i : F_{\rm rel} \to F$ and the projection ${\rm pr} : F \to F_{\rm rel}$, defined as
\[ i(\psi) := \psi + c_0\,0, \qquad {\rm pr}(\psi + c_0\chi) := \chi, \qquad \psi, \chi \in F_{\rm rel}, \tag{8} \]
can be combined to the following exact sequence:
\[ 0 \longrightarrow F_{\rm rel} \xrightarrow{\ i\ } F \xrightarrow{\ {\rm pr}\ } F_{\rm rel} \longrightarrow 0. \tag{9} \]
Since the inclusion and the projection both commute with the BRST operator $Q$, this exact sequence induces an exact cohomology triangle:${}^{5}$
\[
\begin{array}{ccc}
 & H(F) & \\
 {\scriptstyle i}\nearrow & & \searrow{\scriptstyle {\rm pr}} \\
 H(F_{\rm rel}) & \xleftarrow{\ \{Q,c_0\}\ } & H(F_{\rm rel})
\end{array}
\]

4 Obviously, this name is not entirely logical since our absolute cohomology is still relative with respect to $\tilde b_0$. The relation between $H(F)$ and the cohomology of the full Fock space (where also the $\tilde b_0$ condition is relaxed) can be analysed straightforwardly by the methods described in this section and is not relevant to the picture degeneracy which is the subject of this paper.

5 This is a standard mathematical construction; see for example chapter 0 of ref. [17] for a review.
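The connecting homomorphism carries ghost number 2 and thus allows us to unwind the above triangle into the long exact cohomology sequence
\[ \longrightarrow H^{g+1}(F) \xrightarrow{\ {\rm pr}\ } H^{g}(F_{\rm rel}) \xrightarrow{\ \{Q,c_0\}\ } H^{g+2}(F_{\rm rel}) \xrightarrow{\ i\ } H^{g+2}(F) \xrightarrow{\ {\rm pr}\ } H^{g+1}(F_{\rm rel}) \longrightarrow . \tag{10} \]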
This sequence will turn out to be useful for explicit calculations. It is interesting that picture raising can be treated similarly [7] as we will show in Sect. 4.

2.2. Explicit computations in the massless sector. The simplest possible case for explicit computations is the massless sector in the (−1, −1) picture where all positively moded spinor ghost oscillators are annihilation operators. The relative Fock space $F_{\rm rel}^{-1,-1}(k\cdot k = 0)$ consists of a single state with ghost number $g = 1$, namely
\[ c_1|-1,-1,k\rangle, \qquad k\cdot k = 0, \tag{11} \]
where $|\pi^+,\pi^-,k\rangle$ denotes the ground state with momentum $k$ in the $(\pi^+,\pi^-)$ picture. The state (11) is BRST invariant but not exact and thus constitutes the complete relative cohomology in the (−1, −1) picture. This is the analogue of the vanishing theorems in the massless sector for the bosonic [2] and the N = 1 string [10]. The sequence (10) implies that the absolute cohomology contains two states: the state given in (11) and
\[ c_0c_1|-1,-1,k\rangle, \qquad k\cdot k = 0. \tag{12} \]
The corresponding vertex operators creating these states from the (0, 0) picture vacuum are
\[ V^{(1)}_{(-1,-1)}(z) = c\,e^{-\varphi^+-\varphi^-}e^{ik\cdot Z}(z), \qquad V^{(2)}_{(-1,-1)}(z) = c\,\partial c\,e^{-\varphi^+-\varphi^-}e^{ik\cdot Z}(z). \tag{13} \]
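As a cross-check of this counting, the following sketch reads off the absolute cohomology in the (−1, −1) picture from the sequence (10). It assumes only what is stated above, namely that the relative cohomology is one-dimensional at $g = 1$ and vanishes otherwise; the shorthand $h^g$ and $h^g_{\rm rel}$ for the dimensions is introduced here for illustration and is not the authors' notation.

```latex
% Assumed shorthand for this sketch (not in the source):
%   h^g := \dim H^g(F),  h^g_rel := \dim H^g(F_rel),
% both in the (-1,-1) picture at k.k = 0, with h^g_rel = \delta_{g,1}.
\begin{align*}
&\{Q,c_0\} : H^{g}(F_{\rm rel}) \to H^{g+2}(F_{\rm rel})
  \ \text{vanishes for every } g,
  \ \text{since source and target are never both non-zero;} \\
&\text{hence (10) breaks into}\quad
  0 \longrightarrow H^{g}(F_{\rm rel}) \xrightarrow{\ i\ } H^{g}(F)
    \xrightarrow{\ {\rm pr}\ } H^{g-1}(F_{\rm rel}) \longrightarrow 0 , \\
&h^{g} = h^{g}_{\rm rel} + h^{g-1}_{\rm rel} = \delta_{g,1} + \delta_{g,2} .
\end{align*}
```

The two non-zero entries correspond to the states (11) and (12).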
We will see shortly that the connection between the relative and the absolute cohomology is more complicated in other pictures, since multiplying a state from the relative cohomology by $c_0$ does not in general produce a BRST-closed state. In the (−1, −1) picture everything carries over unchanged to the exceptional case $k = 0$, i.e.
\[ H^{-1,-1}(k = 0) = H^{-1,-1}(k\cdot k = 0) \qquad \text{for } F \text{ and } F_{\rm rel}. \tag{14} \]
We now turn to the massless sector of the (−1, 0) picture where $\gamma^+_{1/2}$ becomes a creation operator. The relative Fock space $F_{\rm rel}^{-1,0}(k\cdot k = 0)$ is spanned by the following states with ghost number $g$:
\[
\begin{aligned}
A^{\mu}_N &:= c_1(\gamma^+_{1/2})^N(\gamma^-_{-1/2})^N\,d^{-\mu}_{-1/2}\,|-1,0,k\rangle, & g &= 2N+1, \\
B_N &:= c_1(\gamma^+_{1/2})^N(\gamma^-_{-1/2})^{N+1}\,|-1,0,k\rangle, & g &= 2N+2, \\
C^{\mu\nu}_N &:= c_1(\gamma^+_{1/2})^{N+1}(\gamma^-_{-1/2})^N\,d^{-\mu}_{-1/2}d^{-\nu}_{-1/2}\,|-1,0,k\rangle, & g &= 2N+2,
\end{aligned}
\tag{15}
\]
where $N$ is a non-negative integer, $\mu = 0, 1$, and
\[ \gamma^\pm_r = \oint\frac{dz}{2\pi i}\,z^{r-3/2}\,\gamma^\pm(z), \qquad d^{\pm\mu}_r = \oint\frac{dz}{2\pi i}\,z^{r-1/2}\,i\psi^{\pm\mu}(z) \tag{16} \]
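are the Fourier modes of the spinor ghosts and matter fermions. The BRST operator acts as
\[
\begin{aligned}
QA^{\mu}_N &= 2k^{-\mu}B_N + k^+_\nu C^{\nu\mu}_N, \\
QB_N &= k^+_\mu A^{\mu}_{N+1}, \\
QC^{\mu\nu}_N &= 2k^{-\mu}A^{\nu}_{N+1} - 2k^{-\nu}A^{\mu}_{N+1}
\end{aligned}
\tag{17}
\]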
($Q^2 = 0$ can be checked explicitly). By inspection one learns that the cohomology $H^{-1,0}(F_{\rm rel}|k\cdot k = 0)$ resides at $g = 1$ only and is represented by
\[ k^+_\mu A^{\mu}_0 = c_1\,k^+\!\cdot d^-_{-1/2}\,|-1,0,k\rangle, \qquad k\cdot k = 0 \tag{18} \]
for any non-vanishing value of the momentum.${}^{6}$ The corresponding vertex operator creating this state from the (0, 0) picture vacuum is
\[ V^{(1)}_{(-1,0)}(z) = c\,k^+\!\cdot\psi^-\,e^{-\varphi^-}e^{ik\cdot Z}(z) \tag{19} \]
which is the picture-raised version of $V^{(1)}_{(-1,-1)}$ in (13) (see the appendix of ref. [5] for a detailed list of vertex operators). This proves that in this simple case the picture-raising operation $X^-$ (and similarly $X^+$) is an isomorphism between the relative cohomologies at $k \neq 0$.

What about the absolute cohomology $H^{-1,0}(F|k\cdot k = 0)$? The sequence (10) implies that it is non-vanishing only at ghost number one and two. Obviously, the ghost number one part is simply represented by $k^+_\mu A^{\mu}_0$. Applying $Q$ to $c_0k^+_\mu A^{\mu}_0$ yields
\[ Qc_0k^+_\mu A^{\mu}_0 = -4k^+_\mu A^{\mu}_1 = -4QB_0, \tag{20} \]
showing that the cohomology class at ghost number two is represented by $c_0k^+_\mu A^{\mu}_0 + 4B_0$. The two corresponding vertex operators are $V^{(1)}_{(-1,0)}$ and
\[ V^{(2)}_{(-1,0)}(z) = \bigl[\,c\,\partial c\,k^+\!\cdot i\psi^-\,e^{-\varphi^-} + 4c\,\eta^-\,\bigr]e^{ik\cdot Z}(z) \tag{21} \]
which are both obtained by picture raising the vertex operators in (13). For non-zero momentum we thus see that picture raising is an isomorphism in the absolute cohomology, too. Together, we have
\[ X^- :\; H^{-1,-1}(k\cdot k = 0) \xrightarrow{\ \cong\ } H^{-1,0}(k\cdot k = 0) \qquad \text{at } k \neq 0 \tag{22} \]
for $F$ as well as for $F_{\rm rel}$.

In the exceptional case, $k = 0$, things are strikingly different. $Q$ vanishes identically on the relative Fock space $F_{\rm rel}^{-1,0}(k = 0)$, and any of the states in (15) represents its own nontrivial cohomology class even though the picture-raising operation annihilates the (−1, −1) vertex operator. Moreover, explicit calculations at higher pictures seem to indicate a proliferation of physical states. Therefore, the exceptional relative cohomologies $H^{\pi^+,\pi^-}(F_{\rm rel}|k = 0)$ look entirely different in various pictures.

6 Note that in our conventions $k^+$ and $k^-$ are related by complex conjugation and thus cannot vanish individually. This is different in a real SL(2, R) notation [18,19], where $k^+ = 0$ is possible with non-zero $k^-$. In such a case the representative (18) can be replaced by $\epsilon_{\mu\nu}k^{-\nu}A^{\mu}_0$, but the cohomology is unchanged.
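The remark after (17) that $Q^2 = 0$ can be checked explicitly may be spelled out as follows. This is a sketch using only the action (17) and the antisymmetry $C^{\mu\nu}_N = -C^{\nu\mu}_N$; it additionally assumes that the contraction $k^+_\nu k^{-\nu}$ is proportional to $k\cdot k$, which is an assumption about the conventions of ref. [11] and should be checked there.

```latex
% Assumption of this sketch: k^+_\nu k^{-\nu} \propto k\cdot k (conventions of ref. [11]).
\begin{align*}
Q^2 A^\mu_N
 &= 2k^{-\mu}\,QB_N + k^+_\nu\,QC^{\nu\mu}_N \\
 &= 2k^{-\mu}k^+_\nu A^\nu_{N+1}
    + k^+_\nu\bigl(2k^{-\nu}A^\mu_{N+1} - 2k^{-\mu}A^\nu_{N+1}\bigr) \\
 &= 2\,(k^+_\nu k^{-\nu})\,A^\mu_{N+1} , \\[4pt]
Q^2 B_N
 &= k^+_\mu\,QA^\mu_{N+1}
  = 2\,(k^+_\mu k^{-\mu})\,B_{N+1} + k^+_\mu k^+_\nu\,C^{\nu\mu}_{N+1}
  = 2\,(k^+_\mu k^{-\mu})\,B_{N+1} .
\end{align*}
% The C-term drops out because a symmetric product k^+_\mu k^+_\nu is contracted
% with the antisymmetric C^{\nu\mu}. Both results vanish for k\cdot k = 0.
```

The same pattern applies to $Q^2C^{\mu\nu}_N$, where the $B$-terms cancel pairwise and the remaining $C$-terms again combine into the contraction $k^+_\rho k^{-\rho}$.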
To work out the exceptional absolute cohomology, we additionally have to consider the states in (15) multiplied by $c_0$. For $k = 0$ one finds that $Q$ acts on these states as
\[ Qc_0A^{\mu}_N = -4A^{\mu}_{N+1}, \qquad Qc_0B_N = -4B_{N+1}, \qquad Qc_0C^{\mu\nu}_N = -4C^{\mu\nu}_{N+1}. \tag{23} \]
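The action (23) fixes the zero-momentum absolute cohomology in the (−1, 0) picture. A sketch of the bookkeeping, assuming as stated above that $Q$ vanishes on $F_{\rm rel}^{-1,0}(k = 0)$; the symbol $X_N$ is shorthand introduced here for any of the states $A^{\mu}_N$, $B_N$, $C^{\mu\nu}_N$ in (15).

```latex
% X_N: assumed shorthand for any of A^\mu_N, B_N, C^{\mu\nu}_N; k = 0 throughout.
\begin{align*}
&Q\,X_N = 0, \qquad Q\,c_0X_N = -4\,X_{N+1} \quad (N \ge 0), \\
&\ker Q = {\rm span}\{X_N\}_{N\ge 0}, \qquad
 {\rm im}\,Q = {\rm span}\{X_N\}_{N\ge 1}, \\
&H^{-1,0}(F|k=0) = \ker Q/{\rm im}\,Q \cong {\rm span}\{X_0\}
 = {\rm span}\{A^\mu_0\}_{g=1} \oplus {\rm span}\{B_0,\,C^{01}_0\}_{g=2} .
\end{align*}
% No combination of the c_0 X_N is closed, because the X_{N+1} are linearly independent.
```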
Obviously, the absolute zero-momentum cohomology $H^{-1,0}(F|k = 0)$ is spanned by the two states $A^{\mu}_0$ at ghost number one, by two more, $B_0$ and $C^{\mu\nu}_0 = -C^{\nu\mu}_0$, at ghost number two, and vanishes at any other ghost number.

Are these results for $H^{-1,0}(k = 0)$ consistent with the sequence (10)? At odd positive ghost number $g = 2N + 1$, the relative cohomology is spanned by $A^{\mu}_N$ which contains $N$ powers of $\gamma^+_{1/2}\gamma^-_{-1/2}$. The connecting homomorphism $\{Q, c_0\}$ acts (up to a numerical factor) by multiplication of just such a factor. We thus see that it is an isomorphism between the relative cohomologies with odd ghost number. The same is true for positive even ghost number. But this is precisely what we learn from the sequence (10) if we insert the result that the absolute cohomology vanishes at ghost number greater than 2.

Let us briefly summarise the result of the above calculations for $k\cdot k = 0$: At nonzero momentum, picture raising establishes an isomorphism between the (−1, −1) and the (−1, 0) pictures for both the relative and the absolute cohomology. For vanishing momentum, however, the cohomologies look very different. In the (−1, −1) picture both the absolute and the relative cohomology are obtained by the zero-momentum limit of the cohomology at non-vanishing momentum. In the (−1, 0) picture the BRST operator vanishes in the relative Fock space. The relative cohomology is two-dimensional at any positive ghost number. In contrast, the absolute cohomology is two-dimensional only at ghost numbers one and two and vanishes elsewhere.

In other pictures one finds non-trivial absolute cohomology classes also at ghost numbers zero and three. For example, the states
\[ |0,0,k = 0\rangle \qquad \text{and} \qquad c_{-1}c_0c_1|-2,-2,k = 0\rangle \tag{24} \]
are both BRST invariant but not exact. This is in contrast to the N = 1 string where picture-lowering guarantees the picture independence of the absolute cohomology even in the exceptional case. In Sect. 4, however, we will prove that the absolute exceptional cohomology vanishes for ghost number $g \neq 0, 1, 2, 3$ at any picture. The picture dependence can thus only occur for these ghost numbers.

2.3. Complete calculation in the (−1, −1) picture. The above calculations were all done for $k\cdot k = 0$. But what about massive states? Surely such states would carry additional Lorentz indices and therefore describe higher spin fields. Due to the absence of transverse dimensions in the (2,2) space-time, these states should not contribute any physical degrees of freedom, leaving the ground state as the only physical state. Although this sounds very plausible it is not what one would call a rigorous computation of the relative BRST cohomology. The most powerful approach to this kind of problem has been invented by Frenkel, Garland and Zuckerman [2] and extended to the N = 1 string in the −1 picture by Lian and Zuckerman [10]. Their method consists of introducing a new kind of grading, the filtration degree, to reduce the computation of the BRST cohomology to a standard problem of Lie algebra cohomology and can be applied to the N = 2 string, as well. Its essential new feature, namely the existence of the additional bosonic current $J$, can be incorporated in a straightforward way by simply extending the definition of the relative Fock space as indicated in Sect. 2.1. Another important ingredient in this analysis is that the Fock space of the matter sector must be a free module of the algebra of the negatively moded N = 2 super Virasoro generators. This property is also satisfied for critical N = 2 strings. For non-vanishing momentum it has in fact been shown in ref. [20] that the Fock space is a direct sum of universal enveloping algebras of the negative N = 2 super Virasoro algebra.${}^{7}$ The rest of the argument works in complete analogy to the N = 1 string, and it does not seem necessary to repeat it here since it has been described in great detail in ref. [10]. One finally arrives at the expected result that the state (11) is the only physical degree of freedom in the (−1, −1) picture and that there is no room for discrete states or other surprises. For $k\cdot k > 0$ we thus have
\[ H^{g,-1,-1}(F) = H^{g,-1,-1}(F_{\rm rel}) = 0 \qquad \text{for any } g. \tag{25} \]
Unfortunately, this kind of analysis applies only to the (−1, −1) picture. The latter is singled out as the only picture where the creation (annihilation) operators are precisely the negatively (positively) moded oscillators and which has a nondegenerate scalar product with itself. Perhaps it is possible to find a clever redefinition of the filtration degree to apply this method also to other pictures, but it is not obvious to the authors how this could be done.

2.4. Spectral flow. We finally discuss one further aspect of the N = 2 string, namely spectral flow [8,12]. However, this will only be needed for the discussion in Sect. 4. Spectral flow is an automorphism of the N = 2 superconformal algebra associated to the U(1) subalgebra. An explicit construction is presented in the appendix of ref. [11]. If the spectral flow parameter $\Theta$ is chosen from the interval (0, 1), the spectral flow operator $S(\Theta)$ relates sectors with different boundary conditions (see however ref. [6] for a different point of view). For $\Theta = 1$ it is a map within each sector and has a number of useful properties [12]: it has zero ghost number, commutes with $Q$, changes $\pi^+$ by +1, $\pi^-$ by −1 and is invertible (choose $\Theta = -1$). It is therefore an isomorphism of the cohomologies,
\[ S(1) :\; H^{\pi^+,\pi^-}(F) \xrightarrow{\ \cong\ } H^{\pi^++1,\pi^--1}(F), \tag{26} \]
and it follows by induction that
\[ H^{\pi^+,\pi^-}(F) \cong H^{\pi^++n,\pi^--n}(F) \tag{27} \]
for arbitrary $\pi^+$, $\pi^-$, $k$ and any integer $n$. Moreover, $S(1)$ commutes with the picture-raising operators $X^\pm$ up to BRST trivial terms [12], i.e. it commutes with them on the cohomology spaces. Since we have seen above that, for non-vanishing momentum, $X^-$ is an isomorphism between $H^{-1,-1}(F|k\cdot k = 0)$ and $H^{-1,0}(F|k\cdot k = 0)$, the commutative diagram
\[
\begin{array}{ccc}
H^{-1,-1}(F|k\cdot k = 0) & \xrightarrow[\ \cong\ ]{\ X^-\ } & H^{-1,0}(F|k\cdot k = 0) \\[2pt]
{\scriptstyle S(1)^n}\big\downarrow{\scriptstyle\cong} & & {\scriptstyle\cong}\big\downarrow{\scriptstyle S(1)^n} \\[2pt]
H^{-1+n,-1-n}(F|k\cdot k = 0) & \xrightarrow{\ X^-\ } & H^{-1+n,-n}(F|k\cdot k = 0)
\end{array}
\]

7 For $k = 0$ this is not true since the ground state is then annihilated by $L_{-1}$. As in other string theories, for this reason this kind of analysis does not apply in the exceptional case.
implies that $X^-$ is also an isomorphism in the bottom row. Thus, the spaces
\[ H^{\pi^+,\pi^-}(F|k\cdot k = 0) \qquad \text{for } \pi^+ + \pi^- \in \{-2, -1\} \tag{28} \]
are all isomorphic for non-zero momentum. Finally, let us remark that the above argument is not true for the relative cohomology since $S(1)$ does not commute with $b_0$.

3. Picture-Lowering

In this section we apply the method of Sect. 2 of ref. [1] to the open N = 2 string. We will, however, refrain from presenting the details since the calculations carry over in a straightforward way. To begin with, let us recall the bosonisation of the spinor ghosts of the N = 2 string [5]:
\[ \gamma^\pm(z) \to \eta^\pm e^{\varphi^\pm}(z), \qquad \beta^\mp(z) \to e^{-\varphi^\pm}\partial\xi^\mp. \tag{29} \]
The zero modes $\xi^\pm_0$ of the weight-zero fields $\xi^\pm(z)$ do not take part in this process, and thus the Fock space $F$ is extended to the bigger space $\bar F$. The picture-raising operators acting on $F$ are defined as
\[ X^\pm_0 := \{Q, \xi^\pm_0\} = \oint\frac{dz}{2\pi i z}\,X^\pm(z), \qquad X^\pm(z) := \{Q, \xi^\pm(z)\} \tag{30} \]
and map a BRST-closed state $|\psi\rangle \in F$ to $Q\xi^\pm_0|\psi\rangle$ which is trivial in $\bar F$ but not in $F$. Note that neither of the $X^\pm_0$ contains any $\xi^\pm_0$ and therefore both are maps within the small space $F$. Following ref. [1] we consider the momentum operators in the (−1, 0) and (0, −1) picture:
\[ \tilde p^{\pm\mu} = \oint\frac{dz}{2\pi i}\,e^{-\varphi^\pm}\,i\psi^{\pm\mu}. \tag{31} \]
Because of
\[ [Q, e^{-\varphi^\pm}i\psi^{\pm\mu}] = \partial\bigl(c\,e^{-\varphi^\pm}i\psi^{\pm\mu}\bigr), \tag{32} \]
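$\tilde p^{\pm\mu}$ is BRST invariant and satisfies the key relations
\[ X^\pm_0\,\tilde p^{\mp\mu} = 2p^{\mp\mu} + \{Q, m^{\pm\mu}\}, \qquad \tilde p^{\mp\mu}X^\pm_0 = 2p^{\mp\mu} + \{Q, n^{\pm\mu}\}, \]
where $p^{\pm\mu}$ is the center-of-mass momentum,
\[ p^{\pm\mu} = \oint\frac{dz}{2\pi i}\,i\partial Z^{\pm\mu}, \]
and $m^{\pm\mu}$ and $n^{\pm\mu}$ are given by Z I I dz1 dz2 z1 ∓ dw ∂ξ ± (w) e−ϕ iψ ∓µ (z2 ), m±µ = 2πiz1 |z2 |,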
∀F ∈ F,
and $s$ is implemented by an operator $Q$ on $\mathcal K$, i.e.
\[ s(F) = QF - (-1)^{\delta(F)}FQ, \tag{4.6} \]
such that
\[ \langle Q\phi, \psi\rangle = \langle\phi, Q\psi\rangle \qquad \text{and} \qquad Q^2 = 0. \tag{4.7} \]
The assumptions (4.7) are made in order to fulfill $s(F^*) = -(-1)^{\delta(F)}s(F)^*$ and $s^2 = 0$. Note that if the inner product on $\mathcal K$ is positive definite, we find $\langle Q\phi, Q\phi\rangle = \langle\phi, Q^2\phi\rangle = 0$, hence $Q = 0$ and thus also $s = 0$. Hence for nontrivial $s$ the inner product must necessarily be indefinite.
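The two consequences of (4.7) quoted above follow from a short computation with (4.6). The sketch assumes that $\delta(F^*)$ and $\delta(F)$ have the same parity and that $s$ shifts the degree by one unit, so that $(-1)^{\delta(s(F))} = -(-1)^{\delta(F)}$; both are usual for a graded derivation but are assumptions of this sketch, not statements taken from the lines above.

```latex
% Assumptions of this sketch: (-1)^{\delta(F^*)} = (-1)^{\delta(F)},
% (-1)^{\delta(s(F))} = -(-1)^{\delta(F)}, and Q^* = Q with respect to <.,.>.
\begin{align*}
s(F)^* &= \bigl(QF - (-1)^{\delta(F)}FQ\bigr)^*
        = F^*Q - (-1)^{\delta(F)}QF^* \\
       &= -(-1)^{\delta(F)}\bigl(QF^* - (-1)^{\delta(F)}F^*Q\bigr)
        = -(-1)^{\delta(F)}\,s(F^*) , \\[4pt]
s^2(F) &= Q\,s(F) + (-1)^{\delta(F)}\,s(F)\,Q \\
       &= Q^2F - (-1)^{\delta(F)}QFQ + (-1)^{\delta(F)}QFQ - FQ^2 = 0 .
\end{align*}
```

The first line uses only the hermiticity of $Q$, the second only $Q^2 = 0$, which is why both conditions appear in (4.7).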
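Since the physical states should be $s$-invariant, we consider the kernel of $Q$: $\mathcal K_0 := {\rm Ke}\,Q$. Let $\mathcal K_{00}$ be the range of $Q$. Because of $Q^2 = 0$ we have $\mathcal K_{00} \subset \mathcal K_0$. We assume:
\[ \text{(Positivity)} \qquad \text{(i) } \langle\phi, \phi\rangle \ge 0 \ \ \forall\phi \in \mathcal K_0, \qquad \text{and} \qquad \text{(ii) } \phi \in \mathcal K_0 \wedge \langle\phi, \phi\rangle = 0 \ \Longrightarrow\ \phi \in \mathcal K_{00}. \tag{4.8} \]
Then
\[ \mathcal H := \frac{\mathcal K_0}{\mathcal K_{00}}, \qquad \langle[\phi_1], [\phi_2]\rangle_{\mathcal H} := \langle\psi_1, \psi_2\rangle_{\mathcal K}, \qquad \psi_j \in [\phi_j] := \phi_j + \mathcal K_{00} \tag{4.9} \]
is a pre Hilbert space. (Due to (4.7) the definition of $\langle[\phi_1], [\phi_2]\rangle_{\mathcal H}$ is independent of the choice of the representatives $\psi_j \in [\phi_j]$, $j = 1, 2$.) Now we verify that
\[ \pi([A])[\phi] := [A\phi] \tag{4.10} \]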
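is a well defined representation on $\mathcal H$ (where $A \in \mathcal A_0$, $\phi \in \mathcal K_0$, $[A] := A + \mathcal A_{00}$). Namely, let $A + s(B)$, $A \in \mathcal A_0$, $B \in \mathcal F$, be a representative of $[A] \in \mathcal A$ in $\mathcal F$, and let $\phi + Q\psi$, $\phi \in \mathcal K_0$, $\psi \in \mathcal K$ be a representative of $[\phi] \in \mathcal H$ in $\mathcal K$. We have to show that $A\phi \in \mathcal K_0$ and $(A + s(B))(\phi + Q\psi) - A\phi \in \mathcal K_{00} = Q\mathcal K$. But $QA\phi = s(A)\phi + (-1)^{\delta(A)}AQ\phi = 0$, and
\[
\begin{aligned}
s(B)\phi + (A + s(B))Q\psi &= \bigl(QB - (-1)^{\delta(B)}BQ\bigr)(\phi + Q\psi) - (-1)^{\delta(A)}\bigl(s(A)\psi - QA\psi\bigr) \\
&= QB(\phi + Q\psi) + (-1)^{\delta(A)}QA\psi \in Q\mathcal K.
\end{aligned}
\]

4.3. Stability under deformations. It is gratifying that the described structure is stable under deformations, e.g. by turning on the interaction. Let $\mathcal K$ be fixed and replace $F \in \mathcal F$ by a formal power series $\tilde F = \sum_n g^nF_n$ with $F_0 = F$ and $F_n \in \mathcal F$, $\delta(F_n) = {\rm const}$. In the same way replace $s$ and $Q$ by the formal power series $\tilde s = \sum_n g^ns_n$ (each $s_n$ is a graded derivation), $\tilde Q = \sum_n g^nQ_n$, $Q_n \in L(\mathcal K)$, with $s_0 = s$, $Q_0 = Q$ and
\[ \tilde s^2 = 0, \qquad \tilde Q^2 = 0, \qquad \langle\tilde Q\phi, \psi\rangle = \langle\phi, \tilde Q\psi\rangle \qquad \text{and} \qquad \tilde s(\tilde F) = \tilde Q\tilde F - (-1)^{\delta(\tilde F)}\tilde F\tilde Q. \tag{4.11} \]
We can then define $\tilde{\mathcal A} := \frac{{\rm Ke}\,\tilde s}{{\rm Ra}\,\tilde s}$. $\mathcal K_0$ and $\mathcal K_{00}$ have to be replaced by the formal power series $\tilde{\mathcal K}_0 := {\rm Ke}\,\tilde Q$ and $\tilde{\mathcal K}_{00} := {\rm Ra}\,\tilde Q$ with coefficients in $\mathcal K$. Due to the above result, the algebra $\tilde{\mathcal A}$ has a natural representation on $\tilde{\mathcal H} := \frac{\tilde{\mathcal K}_0}{\tilde{\mathcal K}_{00}}$. The inner product on $\mathcal K$ induces an inner product on $\tilde{\mathcal H}$ which assumes values in the formal power series over $\mathbb C$. We adopt the point of view that a formal power series $\tilde b = \sum_n g^nb_n$, $b_n \in \mathbb C$ is positive if there is another formal power series $\tilde c = \sum_n g^nc_n$, $c_n \in \mathbb C$ with $\tilde c^*\tilde c = \tilde b$, i.e. $b_n = \sum_{k=0}^{n}\bar c_kc_{n-k}$. This is equivalent to the condition${}^{4}$
\[ b_n \in \mathbb R, \qquad \forall n \in \mathbb N_0, \tag{4.12} \]
and
\[ \exists k \in \mathbb N_0 \cup \{\infty\} \ \text{ such that } \ b_l = 0 \ \ \forall l < 2k, \ \text{ and } \ b_{2k} > 0 \ \text{ in the case } k < \infty. \tag{4.13} \]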
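One direction of the equivalence with (4.12) and (4.13), and a recursive construction for the converse, can be sketched as follows. The index $m$ for the lowest non-vanishing coefficient of $\tilde c$ and the choice of a real $\tilde c$ in the converse are conventions of this sketch, not taken from the text.

```latex
% Direction "=>": let c_m be the lowest non-vanishing coefficient of \tilde c
% (m is notation assumed for this sketch).
\begin{align*}
\bar b_n &= \sum_{k=0}^{n} c_k\,\bar c_{n-k}
          = \sum_{k'=0}^{n} \bar c_{k'}\,c_{n-k'} = b_n
          \quad\Rightarrow\quad b_n \in \mathbb{R} , \\
b_l &= 0 \quad (l < 2m), \qquad b_{2m} = \bar c_m c_m = |c_m|^2 > 0 .
\end{align*}
% Direction "<=": given b_n real, b_l = 0 for l < 2k and b_{2k} > 0, solve
% \tilde c^2 = \tilde b order by order for real coefficients c_{k+j}
% (choosing \tilde c real is an assumption of this sketch).
\begin{align*}
c_k &= \sqrt{b_{2k}} , \qquad
c_{k+j} = \frac{1}{2c_k}\Bigl(b_{2k+j} - \sum_{i=1}^{j-1} c_{k+i}\,c_{k+j-i}\Bigr)
          \quad (j \ge 1) .
\end{align*}
```

The recursion only ever divides by $c_k = \sqrt{b_{2k}}$, so the requirement $b_{2k} > 0$ at an even order is exactly what makes it solvable.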
We now show that the assumptions concerning the positivity of the inner product are automatically fulfilled for the deformed theory, if they hold true in the undeformed model.

Theorem 4. Let the positivity assumption (4.8) be fulfilled in zeroth order. Then
(i) $\langle\tilde\phi, \tilde\phi\rangle \ge 0$ $\forall\tilde\phi \in \tilde{\mathcal K}_0$,
(ii) $\tilde\phi \in \tilde{\mathcal K}_0 \wedge \langle\tilde\phi, \tilde\phi\rangle = 0 \Longrightarrow \tilde\phi \in \tilde{\mathcal K}_{00}$.
(iii) For every $\phi \in \mathcal K_0$ there exists a power series $\tilde\phi \in \tilde{\mathcal K}_0$ with $(\tilde\phi)_0 = \phi$.
(iv) Let $\pi$ and $\tilde\pi$ be the representations (4.10) of $\mathcal A$, $\tilde{\mathcal A}$ on $\mathcal H$, $\tilde{\mathcal H}$ respectively. Then $\tilde\pi(\tilde A) \neq 0$ if $\pi(A_0) \neq 0$.

4 Bordemann and Waldmann [BW] work with a weaker definition of positivity in the case of a formal Laurent series with real coefficients: they only require that the smallest non-vanishing coefficient is positive; it does not need to be an even coefficient.
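Proof of Theorem 4, (i) and (ii). Let $\tilde\phi \in \tilde{\mathcal K}_0$ and $b_n = \sum_{k=0}^{n}\langle\phi_k, \phi_{n-k}\rangle$. $b_n$ clearly is real. $\tilde Q\tilde\phi = 0$ implies $Q_0\phi_0 = 0$, hence $\phi_0 \in \mathcal K_0$ and $b_0 \ge 0$. If $b_0 > 0$ (i) follows. If $b_0 = 0$ we know that there is some $\psi_0 \in \mathcal K$ with $\phi_0 = Q_0\psi_0$. Let $\psi^{(0)}_k := \psi_0\delta_{k,0}$ and $\tilde\psi^{(0)} := \sum_k g^k\psi^{(0)}_k = \psi_0$. Then
\[ \tilde\eta^{(0)} := \tilde\phi - \tilde Q\tilde\psi^{(0)} \tag{4.14} \]
is a formal power series with vanishing term of zeroth order. We now proceed by induction and assume that $b_0 = b_1 = \cdots = b_{2n} = 0$ and there is some formal power series $\tilde\psi^{(n)} = \sum_k g^k\psi^{(n)}_k$ with coefficients in $\mathcal K$ such that
\[ \tilde\eta^{(n)} := \sum_k g^k\eta^{(n)}_k = \tilde\phi - \tilde Q\tilde\psi^{(n)} \tag{4.15} \]
vanishes up to order $n$. Then
\[ b_{2n+1} = \langle\tilde\eta^{(n)} + \tilde Q\tilde\psi^{(n)}, \tilde\eta^{(n)} + \tilde Q\tilde\psi^{(n)}\rangle_{2n+1} = \langle\tilde\eta^{(n)}, \tilde\eta^{(n)}\rangle_{2n+1} = 0 \tag{4.16} \]
and $b_{2n+2} = \langle\eta^{(n)}_{n+1}, \eta^{(n)}_{n+1}\rangle$. Since $\tilde Q\tilde\eta^{(n)} = 0$ we get $Q_0\eta^{(n)}_{n+1} = 0$, i.e. $\eta^{(n)}_{n+1} \in \mathcal K_0$ and $b_{2n+2} \ge 0$. If $b_{2n+2} > 0$ we obtain (i), otherwise $\exists\psi_{n+1} \in \mathcal K$ with $\eta^{(n)}_{n+1} = Q_0\psi_{n+1}$, and we can define
\[ \psi^{(n+1)}_k := \psi^{(n)}_k + \delta_{n+1,k}\psi_{n+1}. \tag{4.17} \]
One easily verifies
\[ (\tilde\phi - \tilde Q\tilde\psi^{(n+1)})_k = 0, \qquad \forall k = 0, 1, \ldots, n+1. \tag{4.18} \]
Either the induction stops at some $n$ or we find a $\tilde\psi$ with $\tilde\phi = \tilde Q\tilde\psi$, i.e. $\tilde\phi \in \tilde{\mathcal K}_{00}$.

To prove (iii) we again proceed by induction and assume that there exists a power series $\tilde\phi^{(n)}$ such that $\tilde Q\tilde\phi^{(n)}$ vanishes up to order $n$. This is certainly true for $n = 0$. Then $0 = (\tilde Q^2\tilde\phi^{(n)})_{n+1} = Q_0(\tilde Q\tilde\phi^{(n)})_{n+1}$, hence $(\tilde Q\tilde\phi^{(n)})_{n+1} \in \mathcal K_0$. Moreover $0 = \langle\tilde Q\tilde\phi^{(n)}, \tilde Q\tilde\phi^{(n)}\rangle_{2n+2} = \langle(\tilde Q\tilde\phi^{(n)})_{n+1}, (\tilde Q\tilde\phi^{(n)})_{n+1}\rangle$, thus $(\tilde Q\tilde\phi^{(n)})_{n+1} \in \mathcal K_{00}$ and there exists a $\phi_{n+1} \in \mathcal K$ with $(\tilde Q\tilde\phi^{(n)})_{n+1} + Q_0\phi_{n+1} = 0$. We then set $(\tilde\phi^{(n+1)})_k := (\tilde\phi^{(n)})_k + \delta_{n+1,k}\phi_{n+1}$ and find that $\tilde Q\tilde\phi^{(n+1)}$ vanishes up to order $n + 1$;
\[ \tilde\phi := \lim_{n\to\infty}\tilde\phi^{(n)} \in \tilde{\mathcal K}_0 \tag{4.19} \]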
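is then the desired formal power series.

It remains to prove (iv). $\tilde\pi(\tilde A) = 0$ means $\tilde A = \sum_k g^kA_k \in {\rm Ke}\,\tilde s$ with $\tilde A\tilde\phi \in \tilde{\mathcal K}_{00}$, $\forall\tilde\phi \in \tilde{\mathcal K}_0$. By means of (iii) this implies in zeroth order $A_0\phi_0 \in \mathcal K_{00}$, $\forall\phi_0 \in \mathcal K_0$, i.e. $\pi(A_0) = 0$.

Note that $\phi \to \tilde\phi$ is non-unique and this holds true also for the induced relation between $\mathcal H$ and $\tilde{\mathcal H}$.

The unit $\tilde 1$ in an algebra of formal power series is $\tilde 1 = (1, 0, 0, \ldots) = 1g^0$, and $\tilde a = \sum_{k=0}^{\infty}a_kg^k$ is invertible iff $a_0$ is invertible. We denote by $\tilde{\mathbb C}$ the formal power series over $\mathbb C$ and consider $\tilde{\mathcal K} := \{\tilde\phi = \sum_n\phi_ng^n \,|\, \phi_n \in \mathcal K\}$ and $\tilde{\mathcal F} := \{\tilde F = \sum_nF_ng^n \,|\, F_n \in \mathcal F\}$ as $\tilde{\mathbb C}$-modules. This is possible because the usual multiplication of power series yields maps
\[ \tilde{\mathbb C} \times \tilde{\mathcal F} \to \tilde{\mathcal F} : (\tilde a, \tilde A) \to \tilde a\tilde A = \tilde A\tilde a, \qquad \tilde{\mathbb C} \times \tilde{\mathcal K} \to \tilde{\mathcal K} : (\tilde a, \tilde\phi) \to \tilde a\tilde\phi = \tilde\phi\tilde a \]
which fulfill the relations
\[ \tilde a(\tilde A\tilde\phi) = (\tilde a\tilde A)\tilde\phi = \tilde A(\tilde a\tilde\phi), \qquad \langle\tilde a\tilde\phi, \tilde b\tilde\psi\rangle = \tilde a^*\tilde b\,\langle\tilde\phi, \tilde\psi\rangle \tag{4.20} \]
and
\[ (\tilde a\tilde A)^* = \tilde a^*\tilde A^*, \qquad \tilde s(\tilde a\tilde A) = \tilde a\,\tilde s(\tilde A). \tag{4.21} \]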
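The invertibility criterion for $\tilde a$ quoted above can be made explicit by solving $\tilde a\,\tilde a^{-1} = \tilde 1$ order by order in $g$. The coefficients $d_n$ of the inverse are notation introduced for this sketch.

```latex
% \tilde a^{-1} =: \sum_n d_n g^n   (d_n is notation assumed here, not in the source)
\begin{align*}
\sum_{k=0}^{n} a_k\,d_{n-k} = \delta_{n,0}
\quad\Longrightarrow\quad
d_0 = a_0^{-1} , \qquad
d_n = -\,a_0^{-1}\sum_{k=1}^{n} a_k\,d_{n-k} \quad (n \ge 1) .
\end{align*}
```

Only $a_0^{-1}$ is ever needed, which is why invertibility of $a_0$ suffices; conversely, the zeroth-order equation $a_0d_0 = 1$ shows that it is necessary.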
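Also the physical pre-Hilbert space $\tilde{\mathcal H}$ and the algebra of observables $\tilde{\mathcal A}(\mathcal O)$ are $\tilde{\mathbb C}$-modules, and the multiplications by a "scalar"
\[ \tilde{\mathbb C} \times \tilde{\mathcal A}(\mathcal O) \to \tilde{\mathcal A}(\mathcal O) : (\tilde a, [\tilde A]) \to \tilde a[\tilde A] = [\tilde a\tilde A] = [\tilde A]\tilde a, \tag{4.22} \]
\[ \tilde{\mathbb C} \times \tilde{\mathcal H} \to \tilde{\mathcal H} : (\tilde a, [\tilde\phi]) \to \tilde a[\tilde\phi] = [\tilde a\tilde\phi] = [\tilde\phi]\tilde a \tag{4.23} \]
satisfy (4.20). We are now going to prove that every $[\tilde\phi] \in \tilde{\mathcal H}$ can be normalized:

Corollary 5. For every $[\tilde\phi] \in \tilde{\mathcal H}$, $[\tilde\phi] \neq 0$, there exist $[\tilde\psi] \in \tilde{\mathcal H}$ and $\tilde a \in \tilde{\mathbb C}$ such that
\[ [\tilde\phi] = \tilde a[\tilde\psi] \qquad \text{and} \qquad \langle[\tilde\psi], [\tilde\psi]\rangle = 1. \tag{4.24} \]

Proof. We set $\tilde b := \langle[\tilde\phi], [\tilde\phi]\rangle \in \tilde{\mathbb C}$. From Theorem 4 we know $\tilde b = \sum_{n=2k}^{\infty}b_ng^n$, $b_n \in \mathbb R$, $b_{2k} > 0$.

Case (1), $k = 0$. There exists an invertible $\tilde a \in \tilde{\mathbb C}$ with $\tilde a^*\tilde a = \tilde b$. Then $[\tilde\psi] := \tilde a^{-1}[\tilde\phi]$ satisfies the assertion (4.24).

Case (2), $k > 0$. We consider a representative $\tilde\phi = \sum_n\phi_ng^n$ of $[\tilde\phi]$. Due to $\langle\phi_0, \phi_0\rangle = b_0 = 0$ and $Q_0\phi_0 = 0$, there exists $\eta_0 \in \mathcal K$ with $Q_0\eta_0 = \phi_0$. Then we can define $\tilde\tau_1$ by $g\tilde\tau_1 := \tilde\phi - \tilde Q\eta_0$ which fulfills $\tilde\tau_1 \in \tilde{\mathcal K}_0$ and $[\tilde\phi] = g[\tilde\tau_1]$. Hence $\langle[\tilde\tau_1], [\tilde\tau_1]\rangle = g^{-2}\tilde b$. If $k > 1$ we repeat this procedure (starting with $[\tilde\tau_1]$ instead of $[\tilde\phi]$) until we obtain a $\tilde\tau_k \in \tilde{\mathcal K}_0$ with $[\tilde\phi] = g^k[\tilde\tau_k]$ and hence $\langle[\tilde\tau_k], [\tilde\tau_k]\rangle = g^{-2k}\tilde b$. Similarly to case (1) we conclude that there exists an invertible $\tilde c \in \tilde{\mathbb C}$ with $\tilde c^*\tilde c = g^{-2k}\tilde b$. Then $[\tilde\psi] := \tilde c^{-1}[\tilde\tau_k]$ satisfies $\langle[\tilde\psi], [\tilde\psi]\rangle = 1$ and $[\tilde\phi] = g^k\tilde c[\tilde\psi]$, i.e. (4.24) is satisfied for $\tilde a := g^k\tilde c$. $\Box$

A state $\omega$ on the algebra of observables $\tilde{\mathcal A}(\mathcal O)$ is defined by
(i) $\omega : \tilde{\mathcal A}(\mathcal O) \to \tilde{\mathbb C}$ is linear, i.e. $\omega(\tilde a[\tilde A] + [\tilde B]) = \tilde a\,\omega([\tilde A]) + \omega([\tilde B])$,
(ii) $\omega([\tilde A]^*) = \omega([\tilde A])^*$ $\forall[\tilde A] \in \tilde{\mathcal A}(\mathcal O)$,
(iii) $\omega([\tilde A]^*[\tilde A]) \ge 0$ $\forall[\tilde A] \in \tilde{\mathcal A}(\mathcal O)$ and
(iv) $\omega(\tilde 1) = \tilde 1$.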
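The constructed physical states, i.e. the vector states
\[ \omega_{\tilde\phi}([\tilde A]) = \langle[\tilde\phi], [\tilde A][\tilde\phi]\rangle, \qquad [\tilde\phi] \in \tilde{\mathcal H}, \tag{4.25} \]
satisfy obviously (i), (ii) and, if $\langle[\tilde\phi], [\tilde\phi]\rangle = 1$, also (iv). The positivity (iii) of the states $\omega_{\tilde\phi}$ is ensured by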
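Corollary 6 (Positivity of the Wightman distributions of gauge invariant fields). Let the algebra $\tilde{\mathcal A}$ be generated by the $\tilde s$-invariant fields $\tilde\phi_1, \ldots, \tilde\phi_l$ and let
\[ \tilde A := \sum_{n=0}^{k}\int\sum_{j_1,\ldots,j_n}f_{j_1\ldots j_n}(x_1, \ldots, x_n)\,\tilde\phi_{j_1}(x_1)\ldots\tilde\phi_{j_n}(x_n)\,dx_1\ldots dx_n, \tag{4.26} \]
$f_{j_1\ldots j_n} \in \mathcal D(\mathbb R^{4n})$, and $\tilde\phi \in \tilde{\mathcal K}_0$. Then
\[ \langle\tilde\phi, \tilde A^*\tilde A\tilde\phi\rangle \ge 0. \tag{4.27} \]

Proof. Note $\tilde A\tilde\phi \in \tilde{\mathcal K}_0$ and apply part (i) of Theorem 4. $\Box$

Remark. Let us assume $\tilde Q = Q_0$. This situation occurs if the adiabatic limit exists ([KO], see also Sect. 5.2), e.g. in massive gauge theories. Then $\tilde Q\tilde\phi = 0$ means $Q_0\phi_k = 0$, $\forall k$. Therefore, in this case the physical pre Hilbert space $\tilde{\mathcal H}$ is the space of formal power series with coefficients in $\mathcal H$,
\[ \tilde{\mathcal H} = \tilde{\mathbb C}\mathcal H \qquad (\text{if } \tilde Q = Q_0). \tag{4.28} \]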
But the states $\omega_\phi$ on $\tilde{\mathcal A}(\mathcal O)$ induced by vectors $\phi \in \mathcal H$ remain $\tilde{\mathbb C}$-valued functionals.

5. Verification of the Assumptions in the Example of QED

The construction in the previous section relies on some assumptions, which we are now going to verify in QED. The deformation is given by going over from the free theory to the interacting fields discussed in Sects. 2 and 3. For the free and the interacting theory we will first define the BRST-transformation $s$ ($\tilde s$ resp.) and then we will construct a nilpotent and hermitian operator $Q$ ($\tilde Q$) which implements $s$ ($\tilde s$) in a representation space with indefinite inner product. Then the local observables (defined by (4.4)) are naturally represented on $\mathcal H = \frac{{\rm Ke}\,Q}{{\rm Ra}\,Q}$ ($\tilde{\mathcal H} = \frac{{\rm Ke}\,\tilde Q}{{\rm Ra}\,\tilde Q}$) by (4.10). It remains to prove the positivity of the inner product induced in $\mathcal H$ ($\tilde{\mathcal H}$). For the free theory we will do this by giving explicitly (distinguished) representatives of the equivalence classes in $\mathcal H$. Then we conclude from Theorem 4 that positivity holds true also for $\tilde{\mathcal H}$.

5.1. The free theory. We consider the field algebra $\mathcal F$ which is generated by the free fields $A^\mu$, $\psi$, $\bar\psi$, $u$, $\tilde u$, the Wick monomials $j^\mu = {:}\bar\psi\gamma^\mu\psi{:}$, $\gamma_\mu A^\mu\psi$, $\bar\psi\gamma_\mu A^\mu$, $\mathcal L_0 = j_\mu A^\mu$ and the derivatives of free fields $\partial_\mu A^\mu$, $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$. This algebra has a faithful representation on the Fock space $\mathcal K = \mathcal K_A \otimes \mathcal K_\psi \otimes \mathcal K_g$ of free fields (Appendix A). The $\mathbb Z_2$-gradiation is $(-1)^{\delta(F)}$, where $F$ is a monomial in $\mathcal F$ and $\delta(F)$ is the ghost number,
\[ [Q_u, F] = \delta(F)F, \qquad Q_u := i\int_{x_0={\rm const.}}d^3x\;{:}\tilde u(x)\overleftrightarrow{\partial}{}^{0}u(x){:}\,. \tag{5.1} \]
Note $\delta(u) = -1$, $\delta(\tilde u) = 1$. The graded derivation $s$ is the BRST-transformation of free fields
\[ s(A^\mu) = i\partial^\mu u, \qquad s(\psi) = 0, \qquad s(\bar\psi) = 0, \qquad s(u) = 0, \qquad s(\tilde u) = -i\partial_\mu A^\mu. \tag{5.2} \]
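As a consistency check, nilpotency of $s$ on the basic fields follows directly from (5.2). The sketch assumes that $s$ commutes with derivatives (translation invariance, (5.4)) and that the free ghost field obeys the wave equation $\Box u = 0$; the latter is an assumption of this sketch, taken from the usual free-field setup rather than from the lines above.

```latex
% Assumptions of this sketch: s(\partial_\mu \phi) = \partial_\mu s(\phi) and \Box u = 0.
\begin{align*}
s^2(A^\mu) &= i\,\partial^\mu s(u) = 0 , \\
s^2(\tilde u) &= -i\,\partial_\mu s(A^\mu)
              = -i\cdot i\,\partial_\mu\partial^\mu u = \Box u = 0 , \\
s^2(\psi) &= s^2(\bar\psi) = s^2(u) = 0 .
\end{align*}
```

Only $s^2(\tilde u)$ needs the field equation; the other fields are annihilated already at the algebraic level.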
The transformation of Wick monomials and differentiated free fields is given by
\[ s({:}\phi_1(x)\phi_2(x)\cdots{:}) = {:}s(\phi_1)(x)\phi_2(x)\cdots{:} + (-1)^{\delta(\phi_1)}{:}\phi_1(x)s(\phi_2)(x)\cdots{:} + \ldots \tag{5.3} \]
and by translation invariance of $s$
\[ s(\partial_\mu\phi(x)) = \partial^x_\mu s(\phi)(x). \tag{5.4} \]
This transformation is implemented by the free Kugo–Ojima charge [DHKS1]
\[ Q := \int_{x_0={\rm const.}}d^3x\,(\partial_\nu A^\nu(x))\overleftrightarrow{\partial}{}^{0}u(x), \tag{5.5} \]
which fulfills $Q^* = Q$,${}^{5}$ $[Q_u, Q] = -Q$ and $Q^2 = 0$. This is verified in Appendix A. Moreover, it is shown there that the inner product $\langle\cdot,\cdot\rangle$ is positive semidefinite on ${\rm Ke}\,Q$ and that the space of null vectors in ${\rm Ke}\,Q$ is precisely ${\rm Ra}\,Q$ [DHS1].

The existence of the integral in (5.5) can be proven by the following method due to Requardt [R]. We smear out $J^0 = \partial_\nu A^\nu\overleftrightarrow{\partial}{}^{0}u$ with $k(x_0)h(\mathbf x) \in \mathcal D(\mathbb R^4)$, where $\int dx_0\,k(x_0) = 1$ and $h$ is a smeared characteristic function of $\{\mathbf x \in \mathbb R^3, |\mathbf x| \le R\}$ for some $R > 0$. By scaling
\[ k_\lambda(x_0) := \lambda k(\lambda x_0), \qquad h_\lambda(\mathbf x) := h(\lambda\mathbf x), \qquad Q_\lambda := \int d^4x\,k_\lambda(x_0)h_\lambda(\mathbf x)J^0(x) \]
one easily finds $\lim_{\lambda\to 0}\|Q_\lambda\| = 0$ w.r.t. a suitable Krein space norm (cf. Appendix A). In addition, due to current conservation, $\lim_{\lambda\to 0}[Q_\lambda, \phi(y)]_\mp = \int d^3x\,[J^0(x), \phi(y)]_\mp$ (note that the latter integral exists since the region of integration is bounded) for $R$ big enough compared to the support of $k$. Therefore, the strong limit $\lim_{\lambda\to 0}Q_\lambda$ exists on a dense subspace and agrees with (5.5).

Unfortunately, the representation (4.10) of the observables $\mathcal A = \frac{{\rm Ke}\,s}{{\rm Ra}\,s}$ on the physical pre Hilbert space $\mathcal H = \frac{{\rm Ke}\,Q}{{\rm Ra}\,Q}$ is not faithful. The counterexample is $[u(f)]$, $f \in \mathcal D(\mathbb R^4)$ real-valued, which induces a non-trivial element of $\mathcal A$ if $\int f\,d^4x \neq 0$. Namely, due to $u(f)^*u(f) = u(f)^2 = 0$, it is represented by zero on $\mathcal H$. (This holds true for each representation in which $\langle\cdot,\cdot\rangle$ is positive definite.) Since $u(\partial_\mu h) = i\,s(A_\mu(h))$, $h \in \mathcal D(\mathbb R^4)$, $\mathcal A$ has the following structure:
\[ \mathcal A = \mathcal A^{(0)} \oplus u_0\mathcal A^{(0)}, \tag{5.6} \]
where $u_0$ is the residue class of $u(f)$, $\int f\,d^4x = 1$, and where $\mathcal A^{(0)}$ is the subalgebra with ghost number zero. The representation (4.10) of $\mathcal A^{(0)}$ on $\mathcal H$ is faithful. To make this plausible we mention that $\mathcal A^{(0)}$ is generated by $[F^{\mu\nu}]$, $[\psi]$, $[\bar\psi]$ and Wick monomials thereof, whereas the "canonical" representatives of $\mathcal H$ are the states containing transversal photons, electrons and positrons only (A.39).
The interaction Lagrangian of QED is $s$-invariant up to a divergence of a local field,
\[ [Q, \mathcal L_0(x)] = i\partial_\mu\mathcal L_1^\mu(x), \qquad \mathcal L_1^\mu := {:}\bar\psi\gamma^\mu\psi{:}\,u. \tag{5.7} \]

5 We restrict all operators (resp. formal power series of operators) to the dense invariant domain $\mathcal D$ and, therefore, there is no difference between symmetric and self-adjoint operators.
Thus in the formal adiabatic limit the integral of the Lagrangian becomes invariant. In [DHKS1,DHKS2] the following Ward identities were postulated:
\[ [Q, T(\mathcal L_0(x_1)\ldots\mathcal L_0(x_n))] = i\sum_{l=1}^{n}\partial^{x_l}_\mu T(\mathcal L_0(x_1)\ldots\mathcal L_1^\mu(x_l)\ldots\mathcal L_0(x_n)) \tag{5.8} \]
("free (perturbative) operator gauge invariance", compare (3.12)). Provided the adiabatic limit exists this condition implies the $s$-invariance of the S-matrix; hence the S-matrix induces a unitary operator on the physical Hilbert space $\mathcal H = \frac{{\rm Ke}\,Q}{{\rm Ra}\,Q}$ [DHS1,K]. The nice feature of the condition (5.8) is that its formulation makes sense independent of the adiabatic limit. So, if the normalizations are suitably chosen, the free (perturbative) operator gauge invariance (5.8) (more precisely the corresponding C-number identities which imply (5.8)) could be proven to hold to all orders in QED [DHS2,S] and also in SU(N) Yang–Mills theories [DHS1,D1] and to imply (in the latter case) the usual Slavnov–Taylor identities [D2]. In addition, it determines to a large extent the possible structure of the model. Stora [St1] found that, making a general ansatz for the interaction Lagrangian of self-interacting gauge fields, the Ward identities (5.8) require the coupling parameters to be totally antisymmetric structure constants of a Lie group. Moreover, (5.8) was used for a derivation of all the couplings of the standard model of electroweak interactions (especially the Higgs potential) [DS]. We emphasize that (5.8) is a pure quantum formulation of gauge invariance, without reference to classical physics.

5.2. The interacting theory: construction of the interacting Kugo–Ojima charge. We now replace the free fields (including Wick monomials and derivatives) considered in the previous subsection by the corresponding interacting fields, which are formal power series of unbounded operators in the Fock space $\mathcal K$ of free fields. Due to $[Q_u, \mathcal L(x)] = 0$ (3.1), we can normalize the time ordered products such that
\[ [Q_u, T(\mathcal L(x_1)\ldots\mathcal L(x_n))] = 0 \quad \text{and} \quad [Q_u, T(\mathcal L(x_1)\ldots\mathcal L(x_n)F(x))] = \delta(F)\,T(\mathcal L(x_1)\ldots\mathcal L(x_n)F(x)). \tag{N6} \]
Hence $[Q_u, F_{\mathrm{int}\,L}] = \delta(F)F_{\mathrm{int}\,L}$ by (2.8-9). The fundamental normalization condition concerning the ghost number is (B.6) in combination with (N3); they imply (N6). Again we fix the region O to be the double cone $O \overset{\mathrm{def}}{=} O_{(-r,0),(r,0)}$, $(r > 0)$ (3.18) and assume the switching function $g \in D(R^4)$ to be constant on O (3.19). We study the algebra $\tilde F(O)$ (3.17) of interacting fields localized in O. The ghost fields do not couple in QED, hence
$$u_{\mathrm{int}\,L}(x) = u(x), \qquad \tilde u_{\mathrm{int}\,L}(x) = \tilde u(x). \qquad (5.9)$$
The abelian BRST-transformation $\tilde s = s_0 + g s_1$ [BRS] should be a graded derivation with zero square and compatible with the ∗-operation. In addition it shall induce the following transformations on the basic fields,6
$$\tilde s(A^\mu_{\mathrm{int}\,L}(x)) = i\partial^\mu u(x), \qquad \tilde s(u(x)) = 0, \qquad \tilde s(\tilde u(x)) = -i\partial_\mu A^\mu_{\mathrm{int}\,L}(x),$$
6 In contrast to the free case, $\psi_{\mathrm{int}\,L}$ and $\bar\psi_{\mathrm{int}\,L}$ are not observables in the sense of Sect. 4.1. This different behaviour can be understood physically by the accompanying soft photon cloud and mathematically by Gauss’ law.
M. Dütsch, K. Fredenhagen
$$\tilde s(\psi_{\mathrm{int}\,L}(x)) = -g(x)\psi_{\mathrm{int}\,L}(x)u(x), \qquad \tilde s(\bar\psi_{\mathrm{int}\,L}(x)) = g(x)\bar\psi_{\mathrm{int}\,L}(x)u(x). \qquad (5.10)$$
(The pointwise products are well defined.) Let us assume that we have constructed the interacting Kugo–Ojima charge $\tilde Q = Q_{\mathrm{int}}(g)$. Then we shall define $\tilde s$ in terms of the corresponding current such that $Q_{\mathrm{int}}(g)$ implements $\tilde s$,
$$\tilde s(F) = Q_{\mathrm{int}}(g)F - (-1)^{\delta(F)}F\,Q_{\mathrm{int}}(g), \qquad F \in \tilde F(O). \qquad (5.11)$$
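If $Q_{\mathrm{int}}(g)$ is hermitian, $\tilde s$ is compatible with the ∗-operation, and $\tilde s^2 = 0$ is implied by $Q_{\mathrm{int}}(g)^2 = 0$. To get $Q_{\mathrm{int}}(g)$ we follow Kugo and Ojima [KO] and replace the current in the free charge Q (5.5) by the corresponding interacting field $\partial_\mu A^\mu_{\mathrm{int}\,L}(x)\,\overleftrightarrow{\partial}^x_\nu\,u(x)$. By means of the field equation (3.8) and the Ward identity (3.13) we find
$$\partial_x^\nu\big[\partial_\mu A^\mu_{\mathrm{int}\,L}(x)\,\overleftrightarrow{\partial}^x_\nu\,u(x)\big] = -\big(\Box\,\partial_\mu A^\mu_{\mathrm{int}\,L}(x)\big)u(x) = (\partial_\mu g)(x)\,j^\mu_{\mathrm{int}\,L}(x)\,u(x). \qquad (5.12)$$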
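The first equality in (5.12) only uses the product rule and the free wave equation for u. A minimal numerical sketch of its one-dimensional analogue, $\frac{d}{dx}(f u' - f' u) = -f'' u$ whenever $u'' = 0$, is given below. The functions `f` and `u` are arbitrary illustrative choices (assumptions, not taken from the paper); `f` stands in for the divergence of the interacting photon field and `u` for the free ghost field.

```python
import math

def two_sided(f, df, u, du):
    """Return x -> f(x) u'(x) - f'(x) u(x), the 1D analogue of f <-> u."""
    return lambda x: f(x) * du(x) - df(x) * u(x)

def derivative(func, x, h=1e-5):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# f: an arbitrary smooth function (illustrative stand-in, an assumption).
f = math.sin
df = math.cos
d2f = lambda x: -math.sin(x)

# u: in one dimension the wave equation reduces to u'' = 0, so u is linear.
u = lambda x: 2.0 * x + 3.0
du = lambda x: 2.0

current = two_sided(f, df, u, du)

def divergence_mismatch(x):
    """Difference between d/dx [f <-> u] and -(f'') u at the point x."""
    return derivative(current, x) - (-d2f(x) * u(x))

# Largest deviation on a small grid; it should be at the level of the
# finite-difference error only.
max_mismatch = max(abs(divergence_mismatch(0.1 * k)) for k in range(-20, 21))
print(max_mismatch)
```

The term proportional to $u''$ drops out because u solves the free equation, which is the same mechanism that leaves only the $\Box\,\partial_\mu A^\mu_{\mathrm{int}\,L}$ term in (5.12).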
Hence the current is conserved in the region where g is constant. We may therefore define $\tilde s$ on an algebra $\tilde F(O)$ in the following way: we choose g(x) = e = const on a neighbourhood U of $\bar O$ and set
$$\tilde s(F) \overset{\mathrm{def}}{=} \int_{x_0=0} d^3x\,\big[\partial_\mu A^\mu_{\mathrm{int}\,L}(x)\,\overleftrightarrow{\partial}_0\,u(x),\,F\big]_\mp, \qquad F \in \tilde F(O). \qquad (5.13)$$
Because of current conservation, $\tilde s$ is implemented by the operators
$$Q_{\mathrm{int}}(g,k) = \int d^4x\,k^\mu(x)\,(\partial_\nu A^\nu_{\mathrm{int}\,L}(x))\,\overleftrightarrow{\partial}^x_\mu\,u(x) \qquad (5.14)$$
with $k^\mu \in D(U)$, where $k^\mu - \delta^\mu_0 h = \partial^\mu f$ for some $f \in D(U)$, and where $h \in D(U)$ is a suitably chosen smeared characteristic function of the surface $\{(0, \mathbf{x}),\ |\mathbf{x}| \le r\}$. Now we are well prepared to prove that the definition (5.13) of $\tilde s$ agrees with the usual expressions (5.10) on the basic fields, and to compute all further commutators of $Q_{\mathrm{int}}(g)$ with the interacting sub Wick monomials of $L_0$ (3.15).

Theorem 7. We assume that the interacting fields are normalized as described in Sects. 2 and 3, especially that they fulfill the field equations (3.8–9) and the Ward identities (N5). Furthermore we assume g = e = const on the double cone $O = O_{(-r,0),(r,0)}$. Let $k^\mu$ be as before. Then we find the commutation rules
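$$[Q_{\mathrm{int}}(g,k), A^\mu_{\mathrm{int}\,L}(y)] = i\partial^\mu u(y), \qquad [Q_{\mathrm{int}}(g,k), \partial_\mu A^\mu_{\mathrm{int}\,L}(y)] = 0, \qquad (5.15a, b)$$
$$[Q_{\mathrm{int}}(g,k), \psi_{\mathrm{int}\,L}(y)] = -e\,\psi_{\mathrm{int}\,L}(y)u(y), \qquad (5.16a)$$
$$[Q_{\mathrm{int}}(g,k), \bar\psi_{\mathrm{int}\,L}(y)] = e\,\bar\psi_{\mathrm{int}\,L}(y)u(y), \qquad (5.16b)$$
$$\{Q_{\mathrm{int}}(g,k), u(y)\} = 0, \qquad \{Q_{\mathrm{int}}(g,k), \tilde u(y)\} = -i\partial_\mu A^\mu_{\mathrm{int}\,L}(y), \qquad (5.17a, b)$$
$$[Q_{\mathrm{int}}(g,k), F^{\mu\nu}_{\mathrm{int}\,L}(y)] = 0, \qquad [Q_{\mathrm{int}}(g,k), j^\mu_{\mathrm{int}\,L}(y)] = 0, \qquad (5.18a, b)$$
$$[Q_{\mathrm{int}}(g,k), (\gamma_\mu A^\mu\psi)_{\mathrm{int}\,L}(y)] = -e(\gamma_\mu A^\mu\psi)_{\mathrm{int}\,L}(y)u(y) + i\gamma_\mu\psi_{\mathrm{int}\,L}(y)\partial^\mu u(y), \qquad (5.19)$$
$$[Q_{\mathrm{int}}(g,k), (\bar\psi\gamma_\mu A^\mu)_{\mathrm{int}\,L}(y)] = e(\bar\psi\gamma_\mu A^\mu)_{\mathrm{int}\,L}(y)u(y) + i\bar\psi_{\mathrm{int}\,L}(y)\gamma_\mu\partial^\mu u(y), \qquad (5.20)$$
$$[Q_{\mathrm{int}}(g,k), (:\bar\psi\gamma_\mu\psi: A^\mu)_{\mathrm{int}\,L}(y)] = (:\bar\psi\gamma_\mu\psi:)_{\mathrm{int}\,L}(y)\,i\partial^\mu u(y), \qquad (5.21)$$
where always $y \in O$.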
Local (Perturbative) Construction of Observables in Gauge Theories
Proof. Since the ghost fields are not influenced by the interaction, we know that the ghost and antighost fields commute with all other interacting fields. Moreover, the pointwise products of these fields with a ghost or antighost field are well defined and behave in commutators as ordinary products of operators in spite of their character as operator valued distributions. (This may be verified by using techniques from microlocal analysis as explained in [BF].) Thus the above relations follow from the commutation rules with $\partial_\nu A^\nu_{\mathrm{int}\,L}(x)$ in Proposition 3 and the ghost antighost anticommutation relations in Eq. (3.2). With these preparations the commutators (5.15-16) can easily be computed, for example
$$[Q_{\mathrm{int}}(g), \psi_{\mathrm{int}\,L}(y)] = \int_{x_0=0} d^3x\,[\partial_\mu A^\mu_{\mathrm{int}\,L}(x), \psi_{\mathrm{int}\,L}(y)]\,\overleftrightarrow{\partial}^x_0\,u(x) = e\,\psi_{\mathrm{int}\,L}(y)\int_{x_0=0} d^3x\,D(x-y)\,\overleftrightarrow{\partial}^x_0\,u(x) = -e\,\psi_{\mathrm{int}\,L}(y)u(y), \qquad (5.22)$$
where we have inserted Proposition 3, (3) and (A.31). By using other commutators of $\partial_\mu A^\mu_{\mathrm{int}\,L}$ (Proposition 3) we analogously prove (5.15a,b), (5.16b) and (5.21). Alternatively (5.15b) can be obtained by taking the divergence $\partial^y_\mu$ of (5.15a). Part (a) of (5.17) is obvious due to $\{u(x), u(y)\} = 0$; let us compute part (b),
$$\{Q_{\mathrm{int}}(g), \tilde u(y)\} = -i\int_{x_0=0} d^3x\,\partial_\mu A^\mu_{\mathrm{int}\,L}(x)\,\overleftrightarrow{\partial}^x_0\,D(x-y). \qquad (5.23)$$
From (5.12) we know $\Box\,\partial_\mu A^\mu_{\mathrm{int}\,L}(x) = 0$, $\forall x \in O$. Therefore, we may apply (A.31) to (5.23), which yields (5.17b). By applying $\partial^\nu_y$ to (5.15a) and using $F^{\mu\nu}_{\mathrm{int}\,L} = \partial^\mu A^\nu_{\mathrm{int}\,L} - \partial^\nu A^\mu_{\mathrm{int}\,L}$ we get (5.18a). Analogously, by working with the field equations for $A^\mu_{\mathrm{int}\,L}$ (3.8) and $\psi_{\mathrm{int}\,L}$ (3.9), we obtain (5.18b) and (5.19-20) from (5.15a) and (5.16a,b).

In the formal adiabatic limit, $\partial_\nu A^\nu_{\mathrm{int}\,L}$ converges to $\partial_\nu A^\nu$ (3.14) and therefore one expects that $Q_{\mathrm{int}}(g)$ will converge to the free Kugo–Ojima charge Q. Whereas in QED this reasoning seems to be correct, the corresponding argument does not work in nonabelian gauge theories (as can be seen by an explicit calculation of the first order of $Q_{\mathrm{int}}(g)$) [BDF]. We therefore prefer not to work in the adiabatic limit. The price to pay is that $Q_{\mathrm{int}}(g)$ does not agree with Q, so for the construction of the physical Hilbert space we have to check the conditions of Sect. 4. We easily find that $Q_{\mathrm{int}}(g,k)$ is hermitian for real valued k and we even get the nilpotency of $Q_{\mathrm{int}}(g,k)$,
$$Q_{\mathrm{int}}(g,k)^2 = \tfrac{1}{2}\{Q_{\mathrm{int}}(g,k), Q_{\mathrm{int}}(g,k)\} = \tfrac{1}{2}\int d^4x\,h(x)\int d^4y\,h(y)\,[\partial_\mu A^\mu_{\mathrm{int}\,L}(x), \partial_\nu A^\nu_{\mathrm{int}\,L}(y)]\,\overleftrightarrow{\partial}^x_0\,\overleftrightarrow{\partial}^y_0\,u(x)u(y) = 0, \qquad (5.24)$$
by means of $k^\mu = \delta^\mu_0 h + \partial^\mu f$ and Proposition 3, (2).7 But we need in addition that the zeroth order term $Q_0(k)$ of $Q_{\mathrm{int}}(g,k)$ (5.14) satisfies the positivity assumption (4.8). There seems to be no reason why this should hold for a generic choice of k. One might try to control the limit when k tends to a smeared characteristic function of the t = 0 hyperplane (in order that $Q_0(k)$ becomes equal to the free charge Q (5.5)), but without a priori information on the existence of an $\tilde s$-invariant state this appears to be a hard problem.

7 We recall that the commutator $[\partial A_{\mathrm{int}\,L}(x), \partial A_{\mathrm{int}\,L}(y)]$ vanishes for all x and y for which $\mathrm{supp}\,\partial_\mu g$ does not intersect $O_{x,y} \cup O_{y,x}$.

There is a more elegant way to get rid of these problems which relies on the local character of our construction. We may embed our double cone O isometrically into the cylinder $R \times C_L$, where $C_L$ is a cube of length L, $L \gg r$, with suitable boundary conditions (see Appendix A), and where the first factor denotes the time axis. If we choose the compactification length L big enough, the physical properties of the local algebra $\tilde F(O)$ are not changed. The quantization of the free fields on this cube is worked out in Appendix A. The inductive construction of the perturbation series for the S-matrix or the interacting fields is not changed by the compactification; Sects. 2 and 3 can be adopted without any modification [BF]. We assume the switching function g to fulfill
$$g(x) = e = \text{constant} \qquad \forall x \in O \cup \{(x_0, \mathbf{x})\,|\ |x_0| < \varepsilon\} \qquad (r \gg \varepsilon > 0) \qquad (5.25)$$
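on $R \times C_L$ and to have compact support in timelike directions. Now we may insert
$$k^\mu(x) := \delta^\mu_0\,h(x_0), \qquad h \in D([-\varepsilon, \varepsilon]), \qquad \int dx_0\,h(x_0) = 1$$
into the expression (5.14) for $Q_{\mathrm{int}}$, because $(x_0, \mathbf{x}) \to h(x_0)$ is an admissible test function on $R \times C_L$. We define $Q_{\mathrm{int}}(g): D \longrightarrow D$,
$$Q_{\mathrm{int}}(g) \overset{\mathrm{def}}{=} \int dx_0\,h(x_0)\int_{C_L} d^3x\,\partial_\mu A^\mu_{\mathrm{int}\,L}(x)\,\overleftrightarrow{\partial}^x_0\,u(x). \qquad (5.26)$$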
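By means of (5.12) and the fact that g is constant on the region of integration (the time slice $[-\varepsilon, \varepsilon] \times C_L$), we conclude that the result of the integration over $C_L$ is independent of $x_0$ and hence the arbitrariness in the choice of $h(x_0)$ drops out,
$$Q_{\mathrm{int}}(g) = \int_{C_L,\ x_0=\mathrm{const.},\ |x_0|<\varepsilon} d^3x\,\partial_\mu A^\mu_{\mathrm{int}\,L}(x)\,\overleftrightarrow{\partial}^x_0\,u(x).$$

[...]

$$\langle a, b\rangle = (a, J_L b), \qquad a, b \in K_L, \qquad (A.17)$$
where (., .) denotes the (positive definite) scalar product in $K_L$, and the *-operation with respect to (A.17) is
$$O^* \overset{\mathrm{def}}{=} J_L O^+ J_L, \qquad \langle Oa, b\rangle = \langle a, O^* b\rangle. \qquad (A.18)$$
Let $a^\mu_n$, $a^{\mu+}_n$, $c_{jn}$, $c^+_{jn}$ (j = 1, 2) be the usual annihilation and creation operators of the Fock spaces $K_A$ and $K_g$ which fulfill the (anti-)commutation relations
$$[a^\mu_n, a^{\nu+}_m] = \delta_{n,m}\,\delta^{\mu\nu}\,L^3\,2\omega_n \qquad (A.19)$$
and
$$\{c_{jn}, c^+_{lm}\} = \delta_{n,m}\,\delta_{jl}\,L^3\,2\omega_n. \qquad (A.20)$$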
The ghost fields $u_L$ and $\tilde u_L$ and the zeroth component of the photon field $A^0_L$ are scalar fields with Dirichlet boundary conditions and some unusual sign conventions,
$$\tilde u_L(x) = \sum_{n_1,n_2,n_3=1}^{\infty} \frac{1}{(2\omega_n L^3)^{1/2}}\,\big(-c_{1n}v_n(x) + c^+_{2n}v_n(x)^*\big), \qquad (A.21)$$
$$u_L(x) = \sum_{n_1,n_2,n_3=1}^{\infty} \frac{1}{(2\omega_n L^3)^{1/2}}\,\big(c_{2n}v_n(x) + c^+_{1n}v_n(x)^*\big), \qquad (A.22)$$
$$A^0_L(x) = \sum_{n_1,n_2,n_3=1}^{\infty} \frac{1}{(2\omega_n L^3)^{1/2}}\,\big(a^0_n v_n(x) - a^{0+}_n v_n(x)^*\big). \qquad (A.23)$$
The normalizations are such that they go over into the usual Lorentz covariant conventions of the non-compactified space by replacing $\big(\frac{2\pi}{L}\big)^3\sum_{n\in Z^3,\,n\neq 0}$ by $\int d^3k$. For the spatial components of the photon field $A^\mu_L$ we have a mixture of Dirichlet and Neumann boundary conditions. For example for μ = 2 we define
v2n (x) =
ηn 2
sin k1 x1 cos k2 x2 sin k3 x3 e−iωn x0 , 1 (ωn L3 ) 2 n1 , n3 = 1, 2, . . . , n2 = 0, 1, 2, . . . ,
(A.24)
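where $\eta_n = 1$ for $n_2 \ge 1$ and $\eta_n = 2^{-1/2}$ for $n_2 = 0$, and similarly for μ = 1, 3. Then we set
$$A^l_L(x) = \sum_n \frac{1}{(2\omega_n L^3)^{1/2}}\,\big(a^l_n v_{ln}(x) + a^{l+}_n v_{ln}(x)^*\big), \qquad l = 1, 2, 3. \qquad (A.25)$$
$J_g$ (A.15) is defined implicitly by (A.11), (A.16) and (A.21-22), i.e. we have
$$c^*_{1n} = c^+_{2n}, \qquad c^*_{2n} = c^+_{1n}.$$
For the photon two-point function we obtain
$$(\Omega_A, A^{\mu+}_L(x)A^\nu_L(y)\Omega_A) = \langle\Omega_A, A^\mu_L(x)J_L A^\nu_L(y)\Omega_A\rangle = \delta^{\mu\nu}\sum_n v_{\mu n}(x)v_{\mu n}(y)^*, \qquad (v_{0n} \overset{\mathrm{def}}{=} v_n) \qquad (A.26)$$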
which is obviously positive. We now transfer the construction of the interacting fields (Sects. 2 and 3) from Minkowski space to $R \times C_L$. Due to [BF] there is no obstacle in principle and there are only a few changes in the formulas. Since spatial translation invariance is lost, the commutator functions and propagators do not only depend on the relative coordinates; they must be replaced by the above given expressions (A.5), (A.7), (A.8), etc. Some care is required in the proof of Proposition 3. To get (3.22) we use the identity
$$\partial^x_\mu D^{\mathrm{av}\,\mu\rho}_L(x, z_m) = -\partial^\rho_{z_m} D^{\mathrm{av}}_L(x, z_m) \qquad (A.27)$$
(which is an immediate consequence of (A.5), (A.8)) and the fact that the boundary terms of the “partial integration” vanish because $D^{\mathrm{av}}_L$ (A.5) fulfills Dirichlet boundary conditions. By Lemma 1 (A) we obtain
$$A^\mu_{\mathrm{int}\,L}(x) = A^\mu(x) + \sum_{n=1}^{\infty}\frac{i^{n+1}}{n!}\int d^4x_1\dots d^4x_n\,g(x_1)\dots g(x_n)\cdot\sum_{l=1}^{n} D^{\mathrm{ret}\,\mu\nu}_L(x, x_l)\,R\big(L_0(x_1)\dots\hat{l}\dots L_0(x_n);\,j_\nu(x_l)\big). \qquad (A.28)$$
Since $D^{\mathrm{ret}\,\mu\nu}_L(x, x_l)$ fulfills the boundary conditions of $A^\mu(x)$ we conclude that $A^\mu_{\mathrm{int}\,L}$ obeys the same boundary conditions as the corresponding free field, and similarly for $\psi_{\mathrm{int}\,L}$, $u_{\mathrm{int}\,L}$, etc.

Let us turn to the implementation of the free BRST-transformation. In the following and in the main text we omit the lower index “L”. Due to $\partial^\mu[(\partial_\nu A^\nu)\,\overleftrightarrow{\partial}_\mu\,u] = 0$ the definition
$$Q \overset{\mathrm{def}}{=} \int_{C_L,\ x_0=\mathrm{const.}} d^3x\,(\partial_\nu A^\nu(x))\,\overleftrightarrow{\partial}_0\,u(x) \qquad (A.29)$$
of the free Kugo–Ojima charge (5.5) is independent of $x_0$. Because of $A^{\mu*} = A^\mu$, $u^* = u$ (3.5) we immediately see $Q^* = Q$. By means of
$$\int_{C_L,\ x_0=\mathrm{const.}} d^3x\,D(y-x)\,\overleftrightarrow{\partial}^x_0\,\phi(x) = \phi(y), \qquad \forall\phi \text{ fulfilling } \Box\phi = 0, \qquad (A.30)$$
one proves that the charge Q implements the BRST-transformation (5.2) of the free fields, e.g.
$$[Q, A^\mu(y)] = \int_{C_L,\ x_0=\mathrm{const.}} d^3x\,[\partial_\nu A^\nu(x), A^\mu(y)]\,\overleftrightarrow{\partial}^x_0\,u(x) = -i\partial^\mu_y\int_{C_L,\ x_0=\mathrm{const.}} d^3x\,D(x-y)\,\overleftrightarrow{\partial}^x_0\,u(x) = i\partial^\mu u(y). \qquad (A.31)$$
The transformation (5.3–4) of Wick monomials and derivatives of fields is also implemented by Q, because of
$$[Q, :\phi_1(x)\phi_2(x)\cdots:]_\mp = :[Q, \phi_1(x)]_\mp\,\phi_2(x)\cdots: + (-1)^{\delta(\phi_1)} :\phi_1(x)[Q, \phi_2(x)]_\mp\cdots: + \dots \qquad (A.32)$$
and
$$[Q, \partial_\mu\phi(x)]_\mp = \partial^x_\mu[Q, \phi(x)]_\mp.$$
We easily find that Q is nilpotent:
$$Q^2 = \tfrac{1}{2}\{Q, Q\} = \tfrac{1}{2}\int_{C_L,\ x_0=\mathrm{const.}} d^3x\,\{Q, (\partial_\nu A^\nu)\,\overleftrightarrow{\partial}_0\,u\} = 0. \qquad (A.33)$$
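Inserting (A.22–23) and (A.25) into (A.29) we obtain
$$Q = \frac{1}{2^{1/2}L^3}\sum_{n_1,n_2,n_3=1}^{\infty}\big[c^+_{1n}b_{1n} + b^+_{2n}c_{2n}\big] \qquad (A.34)$$
in a straightforward way (the sum in Q converges in the topology of the Krein space on the dense invariant domain D), where
$$b_{1n} \overset{\mathrm{def}}{=} \frac{1}{2^{1/2}}\Big(a^0_n + i\,\frac{k^j_n a^j_n}{\omega_n}\Big), \qquad b_{2n} \overset{\mathrm{def}}{=} \frac{-1}{2^{1/2}}\Big(a^0_n - i\,\frac{k^j_n a^j_n}{\omega_n}\Big), \qquad (A.35)$$
which implies
$$[b_{jn}, b^+_{lm}] = \delta_{n,m}\,\delta_{jl}\,L^3\,2\omega_n \qquad (A.36)$$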
(cf. Sect. 5 of [DHS1]). By means of Q2 = 0 one finds similarly to [K] that the dense invariant domain D has the decomposition D = Ra Q ⊕ (Ke Q ∩ Ke Q+ ) ⊕ Ra Q+
(A.37)
and these three subspaces are pairwise orthogonal with respect to the scalar product (.,.) (cf. (A.17-18)). Additionally one easily verifies Ke Q ∩ Ke Q+ = Ke {Q, Q+ }.
(A.38)
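The structure of (A.37) and (A.38) can be illustrated in finite dimensions. The sketch below is a toy model under stated assumptions, not the Fock-space construction of the text: `Q` is a hypothetical real 4×4 matrix with $Q^2 = 0$, the adjoint $Q^+$ is taken to be the transpose (i.e. the positive definite product is the Euclidean one), and ranks are computed exactly with rational arithmetic.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def rank(a):
    """Rank of a matrix, by exact Gauss-Jordan elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols, r = len(m), len(m[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(rows):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r

# Hypothetical toy charge: Q = v w^T with w.v = 0, hence Q^2 = 0.
v = [1, 2, 0, 1]
w = [2, -1, 3, 0]
Q = [[vi * wj for wj in w] for vi in v]
Qt = transpose(Q)          # plays the role of Q^+
dim = len(Q)

Q_squared = matmul(Q, Q)
QQt, QtQ = matmul(Q, Qt), matmul(Qt, Q)
anti = [[QQt[i][j] + QtQ[i][j] for j in range(dim)] for i in range(dim)]

dim_ra_q = rank(Q)                      # dim Ra Q
dim_ra_qt = rank(Qt)                    # dim Ra Q^+
dim_harmonic = dim - rank(Q + Qt)       # dim (Ke Q ∩ Ke Q^+): kernel of the stacked rows
dim_ker_anti = dim - rank(anti)         # dim Ke {Q, Q^+}
# The two kernels coincide iff stacking all rows changes neither rank.
same_kernel = rank(Q + Qt + anti) == rank(anti) == rank(Q + Qt)

print(dim_ra_q, dim_harmonic, dim_ra_qt, dim_ker_anti, same_kernel)
```

In this toy model the three dimensions add up to the dimension of the whole space, mirroring the decomposition (A.37), and the kernel of the anticommutator coincides with $\mathrm{Ke}\,Q \cap \mathrm{Ke}\,Q^+$ as in (A.38).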
Inserting (A.34) we find
$$\{Q, Q^+\} = \frac{1}{L^3}\sum_{n_1,n_2,n_3=1}^{\infty}\omega_n\big(b^+_{1n}b_{1n} + b^+_{2n}b_{2n} + c^+_{1n}c_{1n} + c^+_{2n}c_{2n}\big), \qquad (A.39)$$
which agrees up to factors $2\omega_n^2$ with the particle number operator of the ghosts and the longitudinal and scalar photons; however, the kernels completely agree. Obviously the Krein operator J (A.15) is the identity on $\mathrm{Ke}\,\{Q, Q^+\}$. Additionally J maps $\mathrm{Ra}\,Q$ onto $\mathrm{Ra}\,Q^+$, due to $JQ = Q^+J$. From the decomposition (A.37) we conclude that our positivity assumption (4.8) is in fact satisfied, i.e. the indefinite product $\langle ., .\rangle$ is positive semidefinite on $\mathrm{Ke}\,Q$ and the null vectors in $\mathrm{Ke}\,Q$ are precisely $\mathrm{Ra}\,Q$. The vectors in $\mathrm{Ke}\,\{Q, Q^+\}$ are distinguished representatives of the equivalence classes in the physical space $H = \mathrm{Ke}\,Q/\mathrm{Ra}\,Q$ (4.9). They provide the usual physical picture, namely that the states in H are built up from electrons, positrons and transversal photons only.

Appendix B: Proof of the Ward Identities

We recall the Ward identities (3.16),
$$\partial^y_\mu T\big(j^\mu(y)A_1(x_1)\dots A_n(x_n)\big) = i\sum_{j=1}^{n}\delta(y - x_j)\,T\big(A_1(x_1)\dots(\theta A_j)(x_j)\dots A_n(x_n)\big), \qquad (B.1)$$
where
$$(\theta A_j) \overset{\mathrm{def}}{=} \frac{d}{d\alpha}\Big|_{\alpha=0}A_{j\alpha} = i(r_j - s_j)A_j \qquad \text{for} \qquad A_j = :\psi^{r_j}\bar\psi^{s_j}B_1\dots B_l: \qquad (B.2)$$
($B_1, \dots, B_l$ are non-spinorial free fields, i.e. photon or ghost operators), and $A_{j\alpha}$ is given by the global U(1)-transformation (3.17). Note
$$(\theta A_j) = -i[Q_\psi, A_j] \qquad \text{with} \qquad Q_\psi \overset{\mathrm{def}}{=} \int d^3x\,:\bar\psi(x)\gamma^0\psi(x):, \qquad (B.3)$$
i.e. $Q_\psi$ is the infinitesimal generator of the transformation (3.17). There exist several proofs of the Ward identities in QED, e.g. [FHRW,S]. Here we want to show that they can be fulfilled in our framework; in particular we have to check that they are compatible with our normalization conditions. We follow ideas from [St2].

First we point out a consequence of the Ward identities (B.1). For a given $(x_1, \dots, x_n)$ let $O \subset R^4$ be a double cone which contains the points $x_1, \dots, x_n$ and let g be a test function which is equal to 1 on a neighbourhood of O. We decompose $\partial^\mu g = a^\mu - b^\mu$ such that $\mathrm{supp}\,a^\mu \cap (\bar V^- + O) = \emptyset$ and $\mathrm{supp}\,b^\mu \cap (\bar V^+ + O) = \emptyset$. We smear out (B.1) with this g in y. Then, by causal factorization, the left-hand side of (B.1) becomes
$$-j^\mu(a_\mu)\,T\big(A_1(x_1)\dots A_n(x_n)\big) + T\big(A_1(x_1)\dots A_n(x_n)\big)\,j^\mu(b_\mu) = -\big[j^\mu(a_\mu), T\big(A_1(x_1)\dots A_n(x_n)\big)\big] - T\big(A_1(x_1)\dots A_n(x_n)\big)\,j^\mu(\partial_\mu g). \qquad (B.4)$$
The second term vanishes because $j^\mu$ is a conserved current. Since $T(A_1(x_1)\dots A_n(x_n))$ is localized in O, the term $-j^\mu(a_\mu)$ in the commutator can be replaced by $Q_\psi$.10 Hence the validity of the following lemma is necessary for the Ward identities:

Lemma 8. In agreement with (N1-4) and (N6) the normalizations can be chosen such that the vacuum expectation values of the time ordered products vanish, if the sum of the charges of the factors is different from zero:
$$\langle\Omega|T(A_1\dots A_n)|\Omega\rangle = 0 \qquad \text{for} \qquad \sum_j(r_j - s_j) \neq 0. \qquad (B.6)$$
Under this condition the following identity becomes true:
$$\big[Q_\psi, T\big(A_1(x_1)\dots A_n(x_n)\big)\big] = \sum_{j=1}^{n}T\big(A_1(x_1)\dots[Q_\psi, A_j(x_j)]\dots A_n(x_n)\big) \equiv i\sum_{j=1}^{n}T\big(A_1(x_1)\dots(\theta A_j)(x_j)\dots A_n(x_n)\big). \qquad (B.7)$$

Proof of Lemma 8. The lemma is certainly fulfilled for n = 1, and we proceed inductively with respect to the order n. For each fixed n we consider a second induction in the sum of the degrees of the Wick monomials $A_j$, j = 1, ..., n. We commute the assertion (B.7) with the free fields. After inserting (N3) we can use the inductive assumption and find that these commutators vanish. Therefore, the identity (B.7) can only be violated by a C-number. (An analogous computation is given below in Step 1 of the proof of the Ward identities.) To determine this C-number we consider the vacuum expectation value of (B.7). Since $Q_\psi = Q^*_\psi$ annihilates the vacuum we find $\langle\Omega|[Q_\psi, T(A_1\dots A_n)]|\Omega\rangle = 0$. Moreover note
$$\sum_j\langle\Omega|T(A_1\dots(\theta A_j)\dots A_n)|\Omega\rangle = i\sum_j(r_j - s_j)\,\langle\Omega|T(A_1\dots A_n)|\Omega\rangle. \qquad (B.8)$$
Due to the causal factorization and the validity of (B.7) in lower orders, the expression (B.8) must be local. Hence we can require (B.6) as a normalization condition, i.e. we extend zero by zero to the total diagonal. Obviously this prescription is compatible with (N1–4) and (N6). This completes the proof of the lemma. □

Proof of the Ward identities. We show that all Ward identities can be satisfied by choosing a suitable normalization of the vacuum expectation values of the time ordered products which contain no free field factor and with vanishing total charge (B.6). We work with the same double inductive procedure as in the previous proof.

10 This may be seen as follows. Different choices of $a_\mu$ differ only in the spacelike complement of O and therefore do not affect the commutator. We may choose
$$a_\mu(x) = \partial_\mu g(x)\int_{-\infty}^{ex}h(t)\,dt, \qquad (B.5)$$
where e is a suitable timelike unit vector and h is a test function of one variable with sufficiently small support and total integral 1, i.e. the integral in (B.5) is a $C^\infty$-approximation to $\Theta(ex - c)$ ($c \in R$ is a suitable constant). Then by current conservation and partial integration we obtain
$$-j^\mu(a_\mu) = j^\mu(e_\mu\,g\,h(e\,\cdot)) = \int dt\,h(t)\int_{ex=t}g(x)\,j^\mu(x)\,\varepsilon_{\mu\nu\rho\sigma}\,dx^\nu dx^\rho dx^\sigma,$$
hence the statement follows from $g \equiv 1$ on O.
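Step 1. Again we commute the assertion with the free fields. By means of (N3) we obtain
$$\Big[\Big\{\partial^y_\mu T\big(j^\mu(y)A_1(x_1)\dots A_n(x_n)\big) - i\sum_m\delta(y - x_m)\,T\big(A_1(x_1)\dots(\theta A_m)(x_m)\dots A_n(x_n)\big)\Big\},\ \phi_j(z)\Big] =$$
$$= i\sum_k\Big\{\partial^y_\mu T\Big(j^\mu(y)A_1\dots\frac{\partial A_k}{\partial\phi_l}\dots A_n\Big) - i\sum_{m\,(m\neq k)}\delta(y - x_m)\,T\Big(A_1\dots\frac{\partial A_k}{\partial\phi_l}\dots(\theta A_m)\dots A_n\Big) - i\delta(y - x_k)\,T\Big(A_1\dots\frac{\partial(\theta A_k)}{\partial\phi_l}\dots A_n\Big)\Big\}\,\Delta_{lj}(x_k - z) + i\partial^y_\mu T\Big(\frac{\partial j^\mu}{\partial\phi_l}(y)A_1\dots A_n\Big)\,\Delta_{lj}(y - z). \qquad (B.9)$$
For $\phi_j \neq \psi, \bar\psi$ the last term vanishes and we obtain zero, due to
$$\frac{\partial(\theta A_k)}{\partial\phi_l}\,\Delta_{lj} = \theta\Big(\frac{\partial A_k}{\partial\phi_l}\Big)\,\Delta_{lj} \qquad (\text{for } \phi_j \neq \psi, \bar\psi) \qquad (B.10)$$
and the inductive assumption. If $\phi_j = \bar\psi$ ($\phi_j = \psi$ is analogous) the last term is equal to
$$i\partial^y_\mu\Big\{T\big(\bar\psi(y)A_1\dots A_n\big)\gamma^\mu S(y - z)\Big\} = -i\sum_k T\Big(A_1\dots\frac{\partial A_k}{\partial\psi}\dots A_n\Big)\,\delta(x_k - y)\,S(x_k - z) \qquad (B.11)$$
according to (N4). Because of
$$\frac{\partial(\theta A_k)}{\partial\psi} = i(r_k - s_k)\frac{\partial A_k}{\partial\psi}, \qquad \theta\Big(\frac{\partial A_k}{\partial\psi}\Big) = i(r_k - 1 - s_k)\frac{\partial A_k}{\partial\psi} \qquad (B.12)$$
and the inductive assumption, the commutator (B.9) vanishes also in this case. Again we conclude that a possible violation of a Ward identity (we call it an anomaly) can only appear in the vacuum sector, i.e. in the vacuum expectation values:
$$a(y, x_1, \dots, x_n) \overset{\mathrm{def}}{=} \partial^y_\mu T\big(j^\mu(y)A_1(x_1)\dots A_n(x_n)\big) - i\sum_{j=1}^{n}\delta(y - x_j)\,T\big(A_1(x_1)\dots(\theta A_j)(x_j)\dots A_n(x_n)\big) =$$
$$= \partial^y_\mu\langle\Omega|T\big(j^\mu(y)A_1\dots A_n\big)|\Omega\rangle - i\sum_{j=1}^{n}\delta(y - x_j)\,\langle\Omega|T\big(A_1\dots(\theta A_j)\dots A_n\big)|\Omega\rangle. \qquad (B.13)$$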
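Moreover the anomalies are local, i.e.
$$a(y, x_1, \dots, x_n) = P(\partial)\,\delta(x_1 - y)\dots\delta(x_n - y), \qquad (B.14)$$
where $P(\partial)$ is a polynomial in $\partial \equiv (\partial_{x_1}, \dots, \partial_{x_n})$. The latter is a consequence of the induction with respect to the order n and the causal factorization (2.2) of the time ordered products.

Step 2. Next we prove the Ward identities with a free field factor. We only need to consider their vacuum expectation values. The normalization condition (N4) implies the well known identity
$$\langle\Omega|T\big(A_1(x_1)\dots A_n(x_n)\phi_i(x)\big)|\Omega\rangle = i\sum_{k=1}^{n}\sum_l\Delta^F_{il}(x - x_k)\,\langle\Omega|T\Big(A_1(x_1)\dots\frac{\partial A_k}{\partial\phi_l}(x_k)\dots A_n(x_n)\Big)|\Omega\rangle, \qquad (B.15)$$
where $\Delta^F$ is the Feynman propagator. By inserting this formula we obtain
$$\partial^y_\mu\langle\Omega|T\big(j^\mu(y)A_1(x_1)\dots A_m(x_m)\phi_i(x)\big)|\Omega\rangle - i\sum_{j=1}^{n}\delta(y - x_j)\,\langle\Omega|T\big(A_1(x_1)\dots(\theta A_j)(x_j)\dots A_m(x_m)\phi_i(x)\big)|\Omega\rangle - i\delta(y - x)\,\langle\Omega|T\big(A_1(x_1)\dots A_m(x_m)(\theta\phi_i)(x)\big)|\Omega\rangle =$$
$$= i\sum_k\Delta^F_{il}(x - x_k)\,\partial^y_\mu\langle\Omega|T\Big(j^\mu(y)A_1\dots\frac{\partial A_k}{\partial\phi_l}\dots A_m\Big)|\Omega\rangle + i\partial^y_\mu\Big\{\Delta^F_{il}(x - y)\,\langle\Omega|T\Big(\frac{\partial j^\mu}{\partial\phi_l}(y)A_1\dots A_m\Big)|\Omega\rangle\Big\} +$$
$$+ \sum_j\delta(y - x_j)\sum_{k\,(k\neq j)}\Delta^F_{il}(x - x_k)\,\langle\Omega|T\Big(A_1\dots\frac{\partial A_k}{\partial\phi_l}\dots(\theta A_j)\dots A_m\Big)|\Omega\rangle + \sum_k\delta(y - x_k)\,\Delta^F_{il}(x - x_k)\,\langle\Omega|T\Big(A_1\dots\frac{\partial(\theta A_k)}{\partial\phi_l}\dots A_m\Big)|\Omega\rangle +$$
$$+ \delta(y - x)\sum_k\Delta^F_{(\theta i)l}(x - x_k)\,\langle\Omega|T\Big(A_1\dots\frac{\partial A_k}{\partial\phi_l}\dots A_m\Big)|\Omega\rangle, \qquad (B.16)$$
where m = n − 1 and $\Delta^F_{(\theta i)l}(x - y) \overset{\mathrm{def}}{=} i\langle\Omega|T\big((\theta\phi_i)(x)\phi_l(y)\big)|\Omega\rangle$. If $\phi_i \neq \psi, \bar\psi$ the second and the last term vanish ($\alpha_i = 0$). Due to (B.10) and the Ward identities in lower order we get zero. If $\phi_i = \bar\psi$ ($\phi_i = \psi$ is analogous) the second term is equal to
$$i\sum_k\langle\Omega|T\Big(A_1\dots\frac{\partial A_k}{\partial\psi}\dots A_m\Big)|\Omega\rangle\,\big[S^F(x_k - y)\delta(y - x) - \delta(x_k - y)S^F(y - x)\big] \qquad (B.17)$$
by means of (N4). Because of (B.12) and the Ward identities in lower order all terms cancel in this case, too.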
Step 3. By choosing g as in (B.4) we conclude from Lemma 8,
$$0 = \int d^4y\,g(y)\,a(y, x_1, \dots, x_n) = \int d^4y\,a(y, x_1, \dots, x_n) \qquad (B.18)$$
in $D'(R^{4n})$. This restricts the remaining anomalies. We want to remove them by finite renormalizations of $\langle\Omega|T\big(j^\mu(y)A_1(x_1)\dots A_n(x_n)\big)|\Omega\rangle$. This can only be done if the polynomials $P(\partial)$ (B.14) have the form
$$P(\partial) = \Big(\sum_{i=1}^{n}\partial^{x_i}_\mu\Big)P^\mu_1(\partial), \qquad P^\mu_1(\partial) \text{ polynomial in } \partial \equiv (\partial_{x_1}, \dots, \partial_{x_n}). \qquad (B.19)$$
To prove this we consider the Fourier transformation of the anomaly (B.14),
$$\hat a(y, p_1, \dots, p_n) = (2\pi)^{-2n}\int dx_1\dots dx_n\,a(y, x_1, \dots, x_n)\,e^{i(p_1x_1+\dots+p_nx_n)} = (2\pi)^{-2n}P(-ip_1, \dots, -ip_n)\,e^{i(p_1+\dots+p_n)y}. \qquad (B.20)$$
From (B.18) we know that $P(-ip_1, \dots, -ip_n)$ vanishes on the submanifold $\sum_{i=1}^{n}p_i = 0$,
$$P(-ip_1, \dots, -ip_n)\,\delta\Big(\sum_{i=1}^{n}p_i\Big) = 0. \qquad (B.21)$$
Let $\tilde P(q, p_1, \dots, p_{n-1}) \overset{\mathrm{def}}{=} P(-ip_1, \dots, -ip_n)$, where $q \overset{\mathrm{def}}{=} \sum_{i=1}^{n}p_i$. We consider the Taylor series of $\tilde P$,
$$\tilde P(q, p_1, \dots, p_{n-1}) = \sum_{k=1}^{\mathrm{degree}\,\tilde P}\ \sum_{|\alpha|+|\beta|=k}\frac{q^\alpha p^\beta}{\alpha!\,\beta!}\,\frac{\partial^{|\alpha|}\partial^{|\beta|}}{\partial q^\alpha\,\partial p^\beta}\tilde P(0), \qquad (B.22)$$
$p \equiv (p_1, \dots, p_{n-1})$. The terms $|\alpha| = 0$ vanish because $\frac{\partial^{|\beta|}}{\partial p^\beta}\tilde P(0)$ is obtained by varying $\tilde P$ on the submanifold q = 0. There remain only terms with a factor $q^\alpha$, $|\alpha| \ge 1$. This proves (B.19).

Step 4. But there is still a problem. The renormalization
$$\langle\Omega|T\big(j^\mu(y)A_1(x_1)\dots A_n(x_n)\big)|\Omega\rangle \to \langle\Omega|T\big(j^\mu A_1\dots A_n\big)|\Omega\rangle + P^\mu_1(\partial)\,\delta(x_1 - y)\dots\delta(x_n - y) \qquad (B.23)$$
(which removes the anomaly) is only admissible if $P^\mu_1(\partial)\,\delta(x_1 - y)\dots\delta(x_n - y)$ has the same symmetries as required for $\langle\Omega|T\big(j^\mu A_1\dots A_n\big)|\Omega\rangle$. Especially if there are factors $j^{\mu_l}(x_l)$ among $A_1(x_1)\dots A_n(x_n)$ the permutation symmetry with respect to $(y, \mu) \leftrightarrow (x_l, \mu_l)$ must be maintained (for all l). There is a prominent counterexample where this is impossible: the axial anomaly, i.e. $\langle\Omega|T\big(j^\mu_A(y)\,j^{\mu_1}_A(x_1)\,j^{\mu_2}_A(x_2)\big)|\Omega\rangle$, where $j^\mu_A \overset{\mathrm{def}}{=} :\bar\psi\gamma^\mu\gamma^5\psi:$.
We have not found a general argument (for non-axial QED) that all possible anomalies factorize $P(\partial) = (\sum_i\partial^{x_i}_\mu)P^\mu_1(\partial)$ such that $P^\mu_1(\partial)$ has the wanted symmetries. However, taking Step 2 into account, and also the fact that the scaling degree at $x_j - y = 0$ ($\forall j = 1, \dots, n$) [BF] of the anomaly cannot be higher than the scaling degree of the terms in the corresponding Ward identity, the number of Ward identities which can be violated is strongly reduced. In addition, due to (B.18), terms of singular order ω = 0 (i.e. scaling degree δ = 4n) are excluded in the anomalies. The famous Ward identity which connects the vertex function with the electron self-energy has only one factor j and, hence, the renormalization (B.23) maintains the symmetries in that case. There remain the following anomalies:
$$\partial^y_\mu\langle\Omega|T\big(j^\mu(y)L(x_{11})\dots L(x_{1m})\,j^{\mu_1}(x_{21})\big)|\Omega\rangle = \sum_{1\le|a|\le 3}C^{\mu_1}_{1a}\,\partial^a\prod_{h,j}\delta(x_{hj} - y), \qquad (B.24)$$
$$\partial^y_\mu\langle\Omega|T\big(j^\mu(y)L(x_{11})\dots L(x_{1m})\,j^{\mu_1}(x_{21})\,j^{\mu_2}(x_{22})\,j^{\mu_3}(x_{23})\big)|\Omega\rangle = \sum_{|a|=1}C^{\mu_1\mu_2\mu_3}_{2a}\,\partial^a\prod_{h,j}\delta(x_{hj} - y). \qquad (B.25)$$
The unknown constants $C^{\dots}_{l\dots}$, l = 1, 2, 3, 4 are restricted by Lorentz covariance and the permutation symmetry in $x_{11}, \dots, x_{1m}$. The analogous Ward identity with three factors j is trivially fulfilled, due to Furry’s theorem, by imposing C-invariance as a further normalization condition.11 The anomalies in (B.24) and (B.25) can be further restricted by the symmetry in the factors j of the terms on the l.h.s., e.g. $\partial^{x_{21}}_{\mu_1}\partial^y_\mu$ applied to the vacuum expectation value in (B.24) is symmetrical in $y, x_{21}$. By working this out one finds that the factorization (B.19) of the anomalies can be done in such a way that the symmetries are preserved in the renormalizations (B.23) [DHS2]. □
Acknowledgements. We profited from discussions with Franz-Marc Boas, Izumi Ojima, Klaus Sibold and Raymond Stora which are gratefully acknowledged. Part of this work was done at the Max-Planck-Institute for Mathematics in the Sciences in Leipzig and at the University of Leipzig. The authors wish to thank Bodo Geyer, Gert Rudolph, Klaus Sibold and Eberhard Zeidler for warm hospitality.
11 In the inductive construction of the time ordered products C-invariance can only get lost in the extension to the total diagonal, because of the causal factorization (2.2). Starting with an extension which fulfills all other normalization conditions (N1-4), (N6), (B.6) and symmetrizing it with respect to C-invariance, we obtain an extension which satisfies all requirements.

References
[BF] Brunetti, R. and Fredenhagen, K.: Interacting quantum fields in curved space: Renormalization of φ^4. gr-qc/9701048, Proceedings of the Conference “Operator Algebras and Quantum Field Theory”, held at Accademia Nazionale dei Lincei, Rome, July 1996; Brunetti, R. and Fredenhagen, K.: Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds. In preparation
[BDF] Boas, F.-M., Dütsch, M. and Fredenhagen, K.: A local (perturbative) construction of observables in gauge theories: Nonabelian gauge theories. Work in progress
[BFK] Brunetti, R., Fredenhagen, K. and Köhler, M.: The microlocal spectrum condition and Wick polynomials of free fields on curved spacetimes. Commun. Math. Phys. 180, 312 (1996)
[BRS] Becchi, C., Rouet, A. and Stora, R.: Renormalization of the abelian Higgs–Kibble model. Commun. Math. Phys. 42, 127 (1975); Becchi, C., Rouet, A. and Stora, R.: Renormalization of gauge theories. Annals of Physics (N.Y.) 98, 287 (1976)
[BlSe] Blanchard, P. and Seneor, R.: Green’s functions for theories with massless particles (in perturbation theory). Ann. Inst. H. Poincaré A 23, 147 (1975)
[BS] Bogoliubov, N.N. and Shirkov, D.V.: Introduction to the Theory of Quantized Fields. New York: (1959)
[Bu1] Buchholz, D., Porrmann, M. and Stein, U.: Dirac versus Wigner: Towards a universal particle concept in local quantum field theory. Phys. Lett. B 267, 377 (1991)
[Bu2] Buchholz, D.: Gauss’ law and the infraparticle problem. Phys. Lett. B 174, 331 (1986)
[BW] Bordemann, M. and Waldmann, S.: Formal GNS construction and states in deformation quantization. Commun. Math. Phys. 195, 549 (1998)
[D1] Dütsch, M.: On gauge invariance of Yang–Mills theories with matter fields. N. Cimento A 109, 1145 (1996)
[D2] Dütsch, M.: Slavnov–Taylor identities from the causal point of view. Int. J. Mod. Phys. A 12, 3205 (1997)
[DF] Dütsch, M. and Fredenhagen, K.: Deformation stability of BRST-quantization. hep-th/9807215, to appear in the proceedings of the conference “Particles, Fields and Gravitation”, Lodz, Poland (1998)
[DHS1] Dütsch, M., Hurth, T. and Scharf, G.: Causal construction of Yang–Mills theories. IV. Unitarity. N. Cimento A 108, 737 (1995)
[DHS2] Dütsch, M., Hurth, T. and Scharf, G.: Gauge invariance of massless QED. Phys. Lett. B 327, 166 (1994)
[DHKS1] Dütsch, M., Hurth, T., Krahe, F. and Scharf, G.: Causal construction of Yang–Mills theories. I. N. Cimento A 106, 1029 (1993)
[DHKS2] Dütsch, M., Hurth, T., Krahe, F. and Scharf, G.: Causal construction of Yang–Mills theories. II. N. Cimento A 107, 375 (1994)
[DKS] Dütsch, M., Krahe, F. and Scharf, G.: Interacting fields in finite QED. N. Cimento A 103, 871 (1990)
[DS] Dütsch, M. and Scharf, G.: Perturbative gauge invariance: The electroweak theory. hep-th/9612091, to appear in Annalen der Physik (Leipzig); Aste, A., Dütsch, M. and Scharf, G.: Perturbative gauge invariance: The electroweak theory II. hep-th/9702053, to appear in Annalen der Physik (Leipzig)
[EG] Epstein, H. and Glaser, V.: The role of locality in perturbation theory. Ann. Inst. H. Poincaré A 19, 211 (1973)
[F] Feynman, R.P.: Acta Phys. Polonica 24, 697 (1963)
[FHRW] Feldman, J.S., Hurd, T.R., Rosen, L. and Wright, J.D.: QED: A Proof of Renormalizability. Berlin–Heidelberg–New York: Springer-Verlag, 1988
[FP] Faddeev, L.D. and Popov, V.N.: Feynman diagrams for the Yang–Mills field. Phys. Lett. B 25, 29 (1967)
[K] Krahe, F.: A causal approach to massive Yang–Mills theories. Acta Phys. Polonica B 27, 2453 (1996)
[KO] Kugo, T. and Ojima, I.: Local covariant operator formalism of non-abelian gauge theories and quark confinement problem. Suppl. Progr. Theor. Phys. 66, 1 (1979)
[R] Requardt, M.: Symmetry conservation and integrals over local charge densities in quantum field theory. Commun. Math. Phys. 50, 259 (1976)
[S] Scharf, G.: Finite Quantum Electrodynamics. 2nd ed., Berlin–Heidelberg–New York: Springer-Verlag, 1995
[Sch] Schroer, B.: Infrateilchen in der Quantenfeldtheorie. Fortschr. Phys. 173, 1527 (1963)
[St1] Stora, R.: Local gauge groups in quantum field theory: Perturbative gauge theories. Talk given at the workshop “Local Quantum Physics” at the Erwin-Schroedinger-institute, Vienna (1997)
[St2] Stora, R.: Lagrangian field theory. Summer School of Theoretical Physics about “Particle Physics”, Les Houches, 1971
[St3] Stora, R.: Differential algebras in Lagrangian field theory. ETH-Zürich Lectures, January–February 1993; Popineau, G. and Stora, R.: A pedagogical remark on the main theorem of perturbative renormalization theory. Unpublished preprint
[Ste] Steinmann, O.: Perturbation expansions in axiomatic field theory. Lecture Notes in Physics 11, Berlin–Heidelberg–New York: Springer-Verlag, 1971
[V] Verch, R.: Local Definiteness, Primarity and Quasiequivalence of Quasifree Hadamard Quantum States in Curved Spacetime. Commun. Math. Phys. 160, 507 (1994)

Communicated by D. C. Brydges
Commun. Math. Phys. 203, 107 – 118 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Global Existence for the Einstein–Boltzmann Equation in the Flat Robertson–Walker Spacetime Piotr Bogusław Mucha Institute of Applied Mathematics and Mechanics, Warsaw University, ul. Banacha 2, 02-097 Warsaw, Poland. E-mail:
[email protected] Received: 18 May 1998 / Accepted: 23 November 1998
Abstract: The initial value problem for the Einstein–Boltzmann equation in the spatially homogeneous and isotropic case is considered. The global in time mild solution is obtained.

In the paper we consider the Einstein–Boltzmann system which describes the evolution of a collisional gas in general relativity [1, 2]:
$$p^\alpha f_{,\alpha} - \Gamma^i_{\mu\nu}\,p^\mu p^\nu f_{,p^i} = Q(f, f), \qquad (1.1)$$
$$G_{\mu\nu} = T_{\mu\nu}, \qquad (1.2)$$
$$T^{\mu\nu} = \int p^\mu p^\nu f(t, x, p)\,\frac{|g|^{1/2}}{p^0}\,d\bar p, \qquad (1.3)$$
where Q(f, f) is the collision operator and $T^{\mu\nu}$ is the energy-momentum tensor. The first equation (1.1), called the Boltzmann equation, determines the distribution function f(t, x, p) of gas particles. To describe f we need a submanifold P(M) of the tangent bundle TM of the pseudo-Riemannian manifold M, which is defined by the constraint:
$$P_x(p):\quad g_x(p, p) = g_{\alpha\beta}\,p^\alpha p^\beta = 1 \qquad (\alpha, \beta = 0, 1, 2, 3), \qquad (2)$$
where $g_{\alpha\beta}$ is a metric of M given by the Einstein equations (1.2). Then $f: P(M) \to R$. The assumption (2) states that all particles have the same mass, equal to 1. We assume that the spacetime is spatially homogeneous and isotropic. The symmetry implies that the metric simplifies to the form:
$$ds^2 = dt^2 - R^2(t)\big((dx^1)^2 + (dx^2)^2 + (dx^3)^2\big), \qquad (3)$$
where R(t) > 0, and the distribution function $f(t, \bar x, \bar p)$ does not depend on $\bar x$ and depends only on $p = |\bar p|$ (i.e. $f(t, \bar x, \bar p) = f(t, p)$). This metric describes the cosmological model known as the flat Robertson–Walker spacetime.
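For the metric (3) the mass-shell constraint (2) reads $(p^0)^2 - R^2|\bar p|^2 = 1$, which fixes $p^0$ in terms of $\bar p$. The following minimal sketch solves the constraint for $p^0 > 0$ and checks it numerically; the value of the scale factor `R` and the momentum `p_bar` are arbitrary illustrative numbers (assumptions, not data from the paper).

```python
import math

def p0_from_mass_shell(R, p_bar):
    """Positive solution p^0 of (p^0)^2 - R^2 |p_bar|^2 = 1."""
    p_sq = sum(c * c for c in p_bar)
    return math.sqrt(1.0 + R * R * p_sq)

def metric_norm(R, p0, p_bar):
    """g(p, p) for the flat Robertson-Walker metric ds^2 = dt^2 - R^2 dx^2."""
    return p0 * p0 - R * R * sum(c * c for c in p_bar)

R = 2.5                     # hypothetical value of the scale factor R(t)
p_bar = (0.3, -1.2, 0.7)    # hypothetical spatial momentum components

p0 = p0_from_mass_shell(R, p_bar)
norm = metric_norm(R, p0, p_bar)
print(p0, norm)
```

Because the constraint involves $\bar p$ only through $|\bar p|$, the same $p^0$ is obtained for every direction of $\bar p$, which is consistent with a distribution function depending only on $p = |\bar p|$.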
We consider the initial value problem in this case. The aim of this paper is to show the existence of a global in time mild solution. To prove this we use methods similar to those applied to the classical spatially homogeneous Boltzmann equation [4]. The initial value problem for the Einstein–Boltzmann system in the general case has been considered in [1, 3, 6]. The results contained in these papers are local in time. With the above assumptions the Einstein–Boltzmann system reads (see [1, 2]):
\[ f_{,t} - \frac{\dot R}{R}\, p f_{,p} = \frac{1}{p^0}\, Q(f, f), \tag{4.1} \]
\[ \left( \frac{\dot R}{R} \right)^2 = T_{00}, \tag{4.2} \]
where
\[ T_{00} = R^3(t) \int f(t, p)\, p^0\, d\bar p, \tag{4.3} \]
\[ Q(f, g) = Q^+(f, g) - Q^-(f, g), \tag{4.4} \]
\[ Q^+(f, g) = \int_{\mathbb{R}^3_q} \frac{R^3\, d\bar q}{q^0} \int_{S^2} d\omega\, f(p')\, g(q')\, S(\bar p, \bar q, \bar p', \bar q'), \tag{4.5} \]
\[ Q^-(f, g) = \int_{\mathbb{R}^3_q} \frac{R^3\, d\bar q}{q^0} \int_{S^2} d\omega\, f(p)\, g(q)\, S(\bar p, \bar q, \bar p', \bar q'), \tag{4.6} \]
\[ \bar p' = \bar p - (\omega, \bar p - \bar q)\,\omega, \tag{4.7} \]
\[ \bar q' = \bar q + (\omega, \bar p - \bar q)\,\omega, \tag{4.8} \]
\[ p^0 = \sqrt{1 + R^2 p^2}, \tag{4.9} \]
\[ 0 \le S(\,\cdot\,, \cdot\,, \cdot\,, \cdot\,) \le C_1, \tag{4.10} \]
where $\omega \in S^2$, $C_1$ is a constant, $p'$ and $q'$ are defined in the same way as for the classical Boltzmann equation (this can be done because for a fixed $t$ the Riemannian submanifold labelled by $t$ is flat ($E^3$)), and $S(p, q, p', q')$, a continuous function, is the cross section for the collisions. Equation (4.2) is equivalent to the system $G_{\mu\nu} = T_{\mu\nu}$ (1.2), because the integrability conditions $\nabla_\mu T^{\mu\nu} = 0$ hold for $T^{\mu\nu}$ defined by the Boltzmann equation [2, 5]. For the system (4) we consider the initial value problem with data:
\[ R(0) = R_0 > 0, \qquad 0 \le f(0, p) = f_0(p) = f_0(\bar p) \in L^1(\mathbb{R}^3). \tag{5} \]
Because of (4.2), the above initial data do not ensure uniqueness. We have to add an extra condition, $\dot R(0) < 0$ or $\dot R(0) > 0$. Then we can even compute $R(t)$. But first we have to define the mild solution to system (4). To reach our aim we have to reformulate the problem. To the Boltzmann equation (4.1),
\[ f_{,t} - \frac{\dot R}{R}\, p f_{,p} = \frac{1}{p^0}\, Q(f, f), \]
we apply the method of characteristics. Thus we consider the system:
\[ \frac{dp(t, y)}{dt} = -\frac{\dot R}{R}\, p(t, y), \tag{6.1} \]
\[ \frac{df(t, y)}{dt} = \frac{1}{p^0}\, Q(f, f)(t, y), \tag{6.2} \]
where $p^0 = \sqrt{1 + R^2(t)\, p^2(t, y)}$. Equation (6.1) gives the characteristic:
\[ p(t, y) = \frac{y R(0)}{R(t)}. \tag{7} \]
It is easily seen that the Jacobian of the transformation $p \to y$ is equal to:
\[ \det\left( \frac{\partial \bar p}{\partial \bar y} \right) = \left( \frac{R(0)}{R(t)} \right)^3 > 0. \tag{8} \]
The second equation (6.2) will be solved in the form:
\[ f(t, y) = f_0(y) + \int_0^t \frac{1}{p^0}\, Q(f, f)(s, y)\, ds. \tag{9} \]
The solution of (9) together with (7) we call the mild solution of (4). From (7) we see that $p^0 = \sqrt{1 + R^2(0)\, y^2}$. Multiplying (9) by $p^0$ and integrating over $p$ we get
\[ \int p^0 f(t, y)\, d\bar y = \int p^0 f(0, y)\, d\bar y = \text{const}. \]
The integral of the collision operator vanishes ($\int Q(f, f)\, d\bar p = 0$, which follows from the properties of $S$ [3]). Together with (4.3) and (8) this implies that $T_{00}(t) = T_{00}(0)$. (We have to assume that $T_{00}(0)$ is finite.) From (4.2) we have two cases. The first case, $\dot R(0) < 0$, gives
\[ \frac{\dot R}{R} = -\sqrt{T_{00}(0)}, \quad \text{then} \quad R(t) = R(0) \exp\left( -\sqrt{T_{00}(0)}\, t \right), \tag{10} \]
and then $R(t)$ is decreasing on $[0, \infty)$. The density $\|f\|_{L^1(\mathbb{R}^3_p)}$ of the gas will grow. The second case, $\dot R(0) > 0$, gives
\[ \frac{\dot R}{R} = \sqrt{T_{00}(0)}, \quad \text{then} \quad R(t) = R(0) \exp\left( \sqrt{T_{00}(0)}\, t \right), \tag{11} \]
$R(t)$ is increasing and the density will diminish. From the above considerations we conclude that the problem concentrates on the Boltzmann equation, because $R(t)$ has already been given by (10) or (11).
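The characteristic (7) and the explicit scale factors (10), (11) can be checked numerically. The following is a minimal sketch, assuming illustrative values for $R(0)$, $T_{00}(0)$ and $y$; the function names are hypothetical and not part of the paper. It verifies by a finite difference that $p(t, y) = yR(0)/R(t)$ satisfies (6.1), and that $p^0$ stays constant along a characteristic.

```python
import math

def scale_factor(t, R0, T00, expanding):
    """R(t) from (10)/(11): R(0) * exp(+- sqrt(T00(0)) * t)."""
    sign = 1.0 if expanding else -1.0
    return R0 * math.exp(sign * math.sqrt(T00) * t)

def characteristic(t, y, R0, T00, expanding):
    """p(t, y) = y * R(0) / R(t), formula (7)."""
    return y * R0 / scale_factor(t, R0, T00, expanding)

def p0(t, y, R0, T00, expanding):
    """p^0 = sqrt(1 + R^2(t) p^2(t, y)) evaluated along a characteristic."""
    R = scale_factor(t, R0, T00, expanding)
    p = characteristic(t, y, R0, T00, expanding)
    return math.sqrt(1.0 + (R * p) ** 2)

def ode_residual(t, y, R0, T00, expanding, h=1e-6):
    """Central-difference residual of (6.1): dp/dt + (R'/R) * p."""
    dp = (characteristic(t + h, y, R0, T00, expanding)
          - characteristic(t - h, y, R0, T00, expanding)) / (2.0 * h)
    sign = 1.0 if expanding else -1.0
    hubble = sign * math.sqrt(T00)  # R'/R is constant by (10)/(11)
    return dp + hubble * characteristic(t, y, R0, T00, expanding)
```

Calling `ode_residual(0.7, 1.3, 2.0, 0.25, True)` with these assumed values should return a number close to zero, and `p0` should give the same value at different times for a fixed `y`.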
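Notation.
\[ \bar x = (x_1, x_2, x_3), \qquad x = |\bar x| = \sqrt{\sum_{k=1}^3 x_k^2}, \]
\[ \frac{\partial}{\partial t} u = u_{,t} = \dot u, \qquad \frac{\partial}{\partial x} u = u_{,x}, \]
\[ X_r = \left\{ f \ge 0 : \int_{\mathbb{R}^3} f(x)\, dx \le r \right\}, \]
\[ \|u\|_y = \int |u|\, d\bar y, \qquad \|u\|_p = \int |u|\, d\bar p, \]
\[ d\bar y = \frac{R^3(t)}{R^3(0)}\, d\bar p, \qquad \|u\|_y = \frac{R^3(t)}{R^3(0)}\, \|u\|_p. \]
The main result of the paper is the following theorem:

Theorem. If $f_0(p) \in X_r$ for $r \ge 0$, $R_0 > 0$ and $\int p^0 f_0(p)\, R_0^3\, d\bar p < \infty$, then:
(i) the Cauchy problem for the system (4) with $\dot R(0) < 0$ has a unique global in time nonnegative mild solution such that $f \in C(0, \infty; L^1(\mathbb{R}^3_y))$, with
\[ R(t) = R(0) \exp\left( -\sqrt{T_{00}(0)}\, t \right) \quad \text{and} \quad \|f(t)\|_p = \|f_0\|_p \exp\left( 3\sqrt{T_{00}(0)}\, t \right); \]
(ii) if $\dot R(0) > 0$, then the system (4) has a global in time nonnegative mild solution such that $f \in C(0, \infty; L^1(\mathbb{R}^3_y))$, with
\[ R(t) = R(0) \exp\left( \sqrt{T_{00}(0)}\, t \right) \quad \text{and} \quad \|f(t)\|_p = \|f_0\|_p \exp\left( -3\sqrt{T_{00}(0)}\, t \right). \]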
To prove the theorem we need some lemmas.

Lemma 1. If $f, g \in X_r$ then:
\[ \left\| \frac{1}{p^0} Q^+(f, f) - \frac{1}{p^0} Q^+(g, g) \right\|_y \le N(r)\, \|f - g\|_y, \tag{12} \]
\[ \left\| \frac{1}{p^0} Q^-(f, f) - \frac{1}{p^0} Q^-(g, g) \right\|_y \le N(r)\, \|f - g\|_y, \tag{13} \]
and $N(r) = C_1 r$.
Proof. We note that
\[ \frac{1}{p^0} Q(f, f) - \frac{1}{p^0} Q(g, g) = \frac{1}{p^0} \left[ Q(f, f - g) + Q(f - g, g) \right]. \]
One can estimate the second term like the first. $Q$ (see (4.4)) is estimated in two parts: we consider $Q^+$ and $Q^-$ separately ((4.5) and (4.6)). We have
\[ \frac{1}{p^0} Q^+(f, f - g) = \frac{1}{p^0} \int R^3(t)\, d\bar q\, \frac{1}{q^0} \int_{S^2} d\omega\, f(p')\, (f - g)(q')\, S. \]
Taking the norm in $L^1(\mathbb{R}^3_y)$ and changing variables $y \to p$ (see (8)) we obtain
\[ \int d\bar y\, R^3 \int d\bar q \int_{S^2} d\omega\, f(p')\, |(f - g)(q')|\, \frac{S}{p^0 q^0} = \gamma \int d\bar p \int d\bar q \int_{S^2} d\omega\, f(p')\, |(f - g)(q')|\, \frac{S}{p^0 q^0}. \]
By the properties of $p, q, p', q'$ (namely $p^2 + q^2 = p'^2 + q'^2$) we have:
\[ p^0 q^0 = \sqrt{(1 + R^2 p^2)(1 + R^2 q^2)} \ge \sqrt{1 + R^2 (p'^2 + q'^2)}. \]
By the transformation $(p, q) \to (p', q')$ (the Jacobian is equal to 1, see (4.7), (4.8)) we obtain:
\[ \left\| \frac{1}{p^0} Q^+(f, f - g) \right\|_y \le C_1 R^6(t)\, \|f\|_p\, \|f - g\|_p \]
(see (4.10)). Going back to $L^1(\mathbb{R}^3_y)$ we get from (8):
\[ \left\| \frac{1}{p^0} Q^+(f, f - g) \right\|_y \le C_1 \|f\|_y\, \|f - g\|_y. \tag{14} \]
For $Q^-(f, f - g)$, using the same methods as for $Q^+$, one gets
\[ \int d\bar y\, \frac{1}{p^0} \int \frac{d\bar q\, R^3}{q^0} \int d\omega\, f(y)\, |(f - g)(q)|\, S \le C_1 \|f\|_y\, \|f - g\|_y. \tag{15} \]
In that way from (14) and (15) we conclude (12) and (13) with:
\[ N(r) = C_1 r. \tag{16} \]
□
Lemma 2. For any $r > 0$ there exists $n(r) > 0$ such that the equation
\[ n u - \frac{1}{p^0} Q(u, u) = v \tag{17} \]
with $v \in X_r$ has a unique nonnegative solution $u$ which belongs to $L^1(\mathbb{R}^3)$ for any $n \ge n(r)$.
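Proof. We construct the sequence of approximations:
\[ u_1 = \frac{v}{n}, \quad \dots, \quad u_{k+1} = \frac{v + \frac{1}{p^0} Q^+(u_k, u_k)}{n + \frac{1}{p^0} Q^-(1, u_k)}. \]
It is obvious that $u_k \ge 0$. We have to show that $\{u_k\}_{k=1}^\infty$ converges in $L^1(\mathbb{R}^3_y)$ (from now on $\|\cdot\| = \|\cdot\|_y$). Applying Lemma 1 we get:
\[ \|u_{k+1} - u_k\| \le \frac{1}{n^2} \left\| \left( v + \frac{Q^+(u_k, u_k)}{p^0} \right) \left( n + \frac{Q^-(1, u_{k-1})}{p^0} \right) - \left( v + \frac{Q^+(u_{k-1}, u_{k-1})}{p^0} \right) \left( n + \frac{Q^-(1, u_k)}{p^0} \right) \right\| \]
\[ \le \frac{1}{n^2} \left\| \left[ \frac{n Q^+(u_k, u_k)}{p^0} + \frac{Q^-(v, u_{k-1})}{p^0} + \frac{Q^+(u_k, u_k)\, Q^-(1, u_{k-1})}{(p^0)^2} \right] - \left[ \frac{n Q^+(u_{k-1}, u_{k-1})}{p^0} + \frac{Q^-(v, u_k)}{p^0} + \frac{Q^+(u_{k-1}, u_{k-1})\, Q^-(1, u_k)}{(p^0)^2} \right] \right\|. \]
But from Lemma 1 we have:
\[ \left\| \frac{Q^+(u_k, u_k) - Q^+(u_{k-1}, u_{k-1})}{n\, p^0} \right\| \le \frac{N(r)}{n}\, \|u_k - u_{k-1}\|, \]
\[ \left\| \frac{Q^-(v, u_{k-1}) - Q^-(v, u_k)}{n^2\, p^0} \right\| \le \frac{N(r)}{n^2}\, \|u_k - u_{k-1}\|, \]
\[ \left\| \frac{Q^-(1, u_k) \left[ Q^+(u_k, u_k) - Q^+(u_{k-1}, u_{k-1}) \right]}{n^2\, (p^0)^2} \right\| \le \frac{N^2(r)}{n^2}\, \|u_k - u_{k-1}\|, \]
\[ \left\| \frac{Q^+(u_k, u_k) \left[ Q^-(1, u_k) - Q^-(1, u_{k-1}) \right]}{n^2\, (p^0)^2} \right\| \le \frac{N^2(r)}{n^2}\, \|u_k - u_{k-1}\|. \]
So we obtain
\[ \|u_{k+1} - u_k\| \le \left( \frac{N(r)}{n} + \frac{N(r)}{n^2} + \frac{2N^2(r)}{n^2} \right) \|u_k - u_{k-1}\|. \]
Hence if
\[ \frac{N(r)}{n} + \frac{N(r)}{n^2} + \frac{2N^2(r)}{n^2} < 1, \tag{18} \]
then by the Banach fixed point theorem the sequence $\{u_k\}_{k=1}^\infty$ has a unique limit. Thus there exists $n(r)$ such that for any $n \ge n(r)$ Eq. (17) has a solution (by (18) and Lemma 1 the uniqueness is obvious). □

Definition. Let us define the operator
\[ R(n, Q) = \left( n - \frac{1}{p^0} Q \right)^{-1} : X_r \to X_r \quad \text{for } n \ge n(r). \]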
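To make the threshold $n(r)$ in the proof of Lemma 2 concrete, one can evaluate the left-hand side of (18) and search for the first integer $n$ at which it drops below 1. This is only an illustrative sketch: the helper names and the sample value of $N(r) = C_1 r$ are assumptions, and the paper itself does not prescribe how $n(r)$ is chosen beyond condition (18).

```python
def contraction_factor(n, N):
    """Left-hand side of (18): N/n + N/n^2 + 2 N^2 / n^2, with N = N(r)."""
    return N / n + N / n**2 + 2.0 * N**2 / n**2

def smallest_admissible_n(N):
    """Smallest integer n >= 1 with contraction_factor(n, N) < 1.

    The factor is strictly decreasing in n for N > 0, so every larger n
    satisfies (18) as well.
    """
    n = 1
    while contraction_factor(n, N) >= 1.0:
        n += 1
    return n
```

For instance, with the assumed value $N(r) = 2$ the factor equals $2/n + 10/n^2$, which is $1.125$ at $n = 4$ and $0.8$ at $n = 5$.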
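For $R(n, Q)$ we show the following estimates:

Lemma 3. If $g, h \in X_r$ and $n \ge \max\{8N(r), 1\}$ then
\[ \|R(n, Q) g\| \le \frac{1 + \varepsilon}{n}\, \|g\|, \tag{19} \]
where $\varepsilon = \frac{4N(r)}{n^2}$, and
\[ \|R(n, Q) n u - R(n, Q) n v\| \le N_1(r)\, \|u - v\|, \tag{20} \]
where $N_1(r) < 2$.

Proof. Let $R(n, Q) g = h$; then from Lemma 1 and from the properties of the sequence of approximations from Lemma 2 with $n \ge \max\{8N(r), 1\}$, one easily obtains:
\[ \|h\| = \left\| \frac{g + \frac{1}{p^0} Q^+(h, h)}{n + \frac{1}{p^0} Q^-(1, h)} \right\| \le \frac{2}{n}\, \|g\|. \]
Using the above estimate and again Lemma 1 and Lemma 2 we get (19):
\[ \|h\| = \left\| \frac{g + \frac{1}{p^0} Q^+(h, h)}{n + \frac{1}{p^0} Q^-(1, h)} \right\| \le \frac{1 + \frac{4N(r)}{n^2}}{n}\, \|g\| \]
with
\[ \varepsilon = \frac{4N(r)}{n^2}. \tag{21} \]
We denote $R(n, Q) n u = U$ and $R(n, Q) n v = V$; then
\[ \|R(n, Q) n u - R(n, Q) n v\| \le \left\| \frac{n u + \frac{1}{p^0} Q^+(U, U)}{n + \frac{1}{p^0} Q^-(1, U)} - \frac{n v + \frac{1}{p^0} Q^+(V, V)}{n + \frac{1}{p^0} Q^-(1, V)} \right\| \]
\[ \le \frac{1}{n^2} \left\| \left[ n^2 u + \frac{n Q^+(U, U)}{p^0} + \frac{n Q^-(u, V)}{p^0} + \frac{Q^+(U, U)\, Q^-(1, V)}{(p^0)^2} \right] - \left[ n^2 v + \frac{n Q^+(V, V)}{p^0} + \frac{n Q^-(v, U)}{p^0} + \frac{Q^+(V, V)\, Q^-(1, U)}{(p^0)^2} \right] \right\|. \]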
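We have
\[ \left\| \frac{\frac{1}{p^0} Q^+(U, U) - \frac{1}{p^0} Q^+(V, V)}{n} \right\| \le \frac{N(2r)}{n}\, \|U - V\|, \]
\[ \left\| \frac{Q^-(u, V) - Q^-(v, U)}{n\, p^0} \right\| = \left\| \frac{Q^-(u, V - U) - Q^-(v - u, U)}{n\, p^0} \right\| \le \frac{N(r)}{n}\, \|U - V\| + \frac{2N(r)}{n}\, \|u - v\|, \]
\[ \left\| \frac{Q^+(U, U)\, Q^-(1, V) - Q^+(V, V)\, Q^-(1, U)}{n^2\, (p^0)^2} \right\| \le \left\| \frac{Q^+(U, U)\, Q^-(1, V - U) - \left[ Q^+(V, V) - Q^+(U, U) \right] Q^-(1, U)}{n^2\, (p^0)^2} \right\| \le \frac{2N^2(2r)}{n^2}\, \|U - V\|, \]
hence we get
\[ \|U - V\| \le \left( 1 + \frac{2N(r)}{n} \right) \|u - v\| + \left( \frac{N(2r)}{n} + \frac{N(r)}{n} + \frac{2N^2(2r)}{n^2} \right) \|U - V\|. \]
From the above inequality we conclude (20) and
\[ N_1(r) \le \frac{1 + \frac{2N(r)}{n}}{1 - \frac{4N(r)}{n}} \]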
[…] 0 and small $\delta > 0$. We take $n_1 = [4N(r)] + l = k_0 + 1$, where $l$ is taken such that:
\[ \exp \frac{4N(2r)}{k_0} < \frac{1}{1 - \delta}. \tag{26} \]
Then from Lemma 5 we get a unique solution of (25) on the interval $[0, T_1]$ for $T_1 = \frac{1}{3n_1} = \frac{1}{3(k_0 + 1)}$; we denote this solution by $f_{n_1}$. By Lemma 5 we get for $0 \le t \le T_1$:
\[ \|f_{n_1}\|(T_1) \le \frac{1}{1 - \frac{4N(r)}{n_1^2}}\, \|f_0\| \le \frac{1}{1 - \frac{4N(2r)}{(k_0 + 1)^2}}\, \|f_0\|. \]
Solving (25) with $t_0 = T_1$ and a greater $n$ we obtain $T_2$, etc. Precisely, we construct $F_{k_0}$, the approximation of the solution on $[0, T]$:

1. $F_{k_0}|_{[0, T_1]} = f_{n_1}$, where $n_1 = k_0 + 1$, $T_1 = \frac{1}{3(k_0 + 1)}$, $f_{n_1}(0, y) = f_0(y)$, and we have:
\[ \sup_{0 \le t \le T_1} \|f_{n_1}(t)\| \le \frac{1}{1 - \frac{4N(2r)}{(k_0 + 1)^2}}\, \|f_0\|. \]
2. $F_{k_0}|_{[T_1, T_2]} = f_{n_2}$, where $n_2 = k_0 + 2$, $T_2 = T_1 + \frac{1}{3(k_0 + 2)}$, $f_{n_2}(T_1, y) = f_{n_1}(T_1, y)$, and we have:
\[ \sup_{T_1 \le t \le T_2} \|f_{n_2}(t)\| \le \prod_{j=1}^{2} \frac{1}{1 - \frac{4N(2r)}{(k_0 + j)^2}}\, \|f_0\|. \]
...
i. $F_{k_0}|_{[T_{i-1}, T_i]} = f_{n_i}$, where $n_i = k_0 + i$, $T_i = T_{i-1} + \frac{1}{3(k_0 + i)}$, $f_{n_i}(T_{i-1}, y) = f_{n_{i-1}}(T_{i-1}, y)$, and we have:
\[ \sup_{T_{i-1} \le t \le T_i} \|f_{n_i}(t)\| \le \prod_{j=1}^{i} \frac{1}{1 - \frac{4N(2r)}{(k_0 + j)^2}}\, \|f_0\|. \]
...
K. $F_{k_0}|_{[T_{K-1}, T_K]} = f_{n_K}$, where $n_K = k_0 + K$, $T_K = T_{K-1} + \frac{1}{3(k_0 + K)}$, $f_{n_K}(T_{K-1}, y) = f_{n_{K-1}}(T_{K-1}, y)$, and we have:
\[ \sup_{T_{K-1} \le t \le T_K} \|f_{n_K}(t)\| \le \prod_{j=1}^{K} \frac{1}{1 - \frac{4N(2r)}{(k_0 + j)^2}}\, \|f_0\|, \]
where $K$ is so large that $T_K \ge T$, or $\sum_{j=1}^{K} \frac{1}{3(k_0 + j)} > T$ (this is always possible), and from (26)
\[ \prod_{j=1}^{\infty} \frac{1}{1 - \frac{4N(2r)}{(k_0 + j)^2}} < \frac{1}{1 - \delta} \]
(recall that $\sum \frac{1}{n} = \infty$ and $\sum \frac{1}{n^2} < \infty$). This implies that we have:
\[ \sup_{t \in [0, T]} \|F_{k_0}\| \le \frac{1}{1 - \delta}\, \|f_0\|. \tag{27} \]
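The two facts used in the construction above (the step lengths $\frac{1}{3(k_0 + j)}$ sum to more than any given $T$, while the product of the growth factors stays bounded) can be explored numerically. The sketch below is only an illustration under assumed values: `c` stands in for $4N(2r)$, and the names `steps_to_cover` and `growth_bound` are hypothetical.

```python
def steps_to_cover(T, k0):
    """Smallest K with sum_{j=1}^K 1 / (3 (k0 + j)) >= T.

    Terminates for every finite T because the harmonic series diverges.
    """
    total, K = 0.0, 0
    while total < T:
        K += 1
        total += 1.0 / (3.0 * (k0 + K))
    return K

def growth_bound(K, k0, c):
    """prod_{j=1}^K 1 / (1 - c / (k0 + j)^2); c plays the role of 4 N(2r)."""
    prod = 1.0
    for j in range(1, K + 1):
        factor = 1.0 - c / (k0 + j) ** 2
        if factor <= 0.0:
            raise ValueError("need c < (k0 + j)^2 for every step")
        prod /= factor
    return prod
```

With the assumed values `k0 = 10` and `c = 4.0`, covering `T = 1.0` takes on the order of a couple of hundred steps, while `growth_bound` remains below 1.5 no matter how many further steps are added.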
Thus we have constructed $F_{k_0}$. We show the convergence of this sequence. By the above construction we have
\[ F_M - F_N = \int_0^t \left[ Q_{m(s)}(f_{m(s)}) - Q_{n(s)}(f_{n(s)}) \right] ds, \]
where $m(s)$ and $n(s)$ are defined by the construction of $F_M$ and $F_N$. We consider
\[ Q_m(f_m) - Q_n(f_n) = Q_m(f_m) - Q_m(f_n) + Q_m(f_n) - Q_n(f_n). \]
By (23), Lemma 1 and (20) we get
\[ \left\| \frac{1}{p^0} Q\big( R(m, Q) m f_m, R(m, Q) m f_m \big) - \frac{1}{p^0} Q\big( R(n, Q) n f_n, R(n, Q) n f_n \big) \right\| \le \gamma\, \|f_m - f_n\|, \]
hence we get
\[ \sup_{t \in [0, T]} \|F_M - F_N\| \le T \gamma \sup_{t \in [0, T]} \|F_M - F_N\| + T \sup \|Q_m(f_n) - Q_n(f_n)\|. \]
Taking $T$ so small that $T \gamma$ […] 0 we get:
\[ \sup_{t \in [0, T]} \|f(t)\|_y \le \|f(0)\|_y. \tag{28} \]
By (28) we can continue the solution on the intervals $[T, 2T]$, $[2T, 3T]$, etc. Thus we have constructed the solution for any $T$. Now, to obtain the norm $\|f\|_p$, it is enough to integrate (9) (using $\int Q(f, f)\, \frac{d\bar p}{p^0} = 0$, see [3]), and we get:
\[ \|f(t)\|_y = \|f_0\|_p. \tag{29} \]
From (29) and (8), for $\dot R(0) < 0$ we have
\[ \|f(t)\|_p = \|f_0\|_p \exp\left( 3\sqrt{T_{00}(0)}\, t \right), \tag{30} \]
and for $\dot R(0) > 0$ we have
\[ \|f(t)\|_p = \|f_0\|_p \exp\left( -3\sqrt{T_{00}(0)}\, t \right). \tag{31} \]
Summing up, $p(t, y)$ defined by (7) together with the solution $f$ of (9) is the solution of (6). From (10), (11), (30) and (31) we conclude the thesis of the theorem. The uniqueness is obvious. □

Acknowledgements. The author wishes to express his gratitude to Professor Wojciech Zajączkowski for useful discussions and for help during the preparation of the paper.
References
1. Bancel, D., Choquet-Bruhat, Y.: Existence, Uniqueness, and Local Stability for the Einstein–Maxwell–Boltzman System. Commun. Math. Phys. 33, 84–96 (1973)
2. Bernstein, J.: Kinetic Theory in the Expanding Universe. Cambridge: Cambridge University Press, 1988
3. Bichteler, K.: On the Cauchy problem for relativistic Boltzmann equation. Commun. Math. Phys. 4, 352–364 (1967)
4. Di Blasio, G.: Differentiability of Spatially Homogeneous Solution of the Boltzmann Equation in the Non Maxwellian Case. Commun. Math. Phys. 38, 331–340 (1974)
5. Ehlers, J.: Survey of General Relativity Theory. In: Israel, W. (ed.), Relativity, Astrophysics and Cosmology. Dordrecht: Reidel, 1973
6. Mucha, P.B.: The Cauchy Problem for the Einstein–Boltzmann System. J. of Appl. Anal. 4, No. 1, 129–141 (1998)

Communicated by H. Araki
Commun. Math. Phys. 203, 119 – 184 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Supersymmetric Quantum Theory and Non-Commutative Geometry
J. Fröhlich¹, O. Grandjean², A. Recknagel³
¹ Institut für Theoretische Physik, ETH-Hönggerberg, CH-8093 Zürich, Switzerland. E-mail: [email protected]
² Department of Mathematics, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]
³ Institut des Hautes Études Scientifiques, 35, route de Chartres, F-91440 Bures-sur-Yvette, France. E-mail: [email protected]
Received: 1 April 1997 / Accepted: 24 November 1998
Abstract: Classical differential geometry can be encoded in spectral data, such as Connes' spectral triples, involving supersymmetry algebras. In this paper, we formulate non-commutative geometry in terms of supersymmetric spectral data. This leads to generalizations of Connes' non-commutative spin geometry encompassing non-commutative Riemannian, symplectic, complex-Hermitian and (Hyper-)Kähler geometry. A general framework for non-commutative geometry is developed from the point of view of supersymmetry and illustrated in terms of examples. In particular, the non-commutative torus and the non-commutative 3-sphere are studied in some detail.

Contents
1 Introduction
2 Spectral Data of Non-Commutative Geometry
2.1 The N = 1 formulation of non-commutative geometry
2.2 The N = (1, 1) formulation of non-commutative geometry
2.3 Hermitian and Kähler non-commutative geometry
2.4 The N = (4, 4) spectral data
2.5 Symplectic non-commutative geometry
3 The Non-Commutative 3-Sphere
3.1 The N = 1 data associated to the 3-sphere
3.2 The topology of the non-commutative 3-sphere
3.3 The geometry of the non-commutative 3-sphere
3.4 Remarks on N = (1, 1)
4 The Non-Commutative Torus
4.1 The classical torus
4.2 Spin geometry (N = 1)
4.3 Riemannian geometry (N = (1, 1))
4.4 Kähler geometry (N = (2, 2))
5 Directions for Future Work
1. Introduction

The study of highly singular geometrical spaces, such as the space of leaves of certain foliations, of discrete spaces, and the study of quantum theory have led A. Connes to develop a general theory of non-commutative geometry, involving non-commutative measure theory, cyclic cohomology, non-commutative differential topology and spectral calculus, [Co1–5]. A broad exposition of his theory and a rich variety of interesting examples can be found in his book [Co1].

Historically, the first examples of non-commutative spaces carrying geometrical structure emerged from non-relativistic quantum mechanics, as discovered by Heisenberg, Born, Jordan, Schrödinger and Dirac. Mathematically speaking, non-relativistic quantum mechanics is the theory of quantum phase spaces, which are non-commutative deformations of certain classical phase spaces (i.e., of certain symplectic manifolds), and it is the theory of dynamics on quantum phase spaces. Geometrical aspects of quantum phase spaces and supersymmetry entered the scene implicitly in Pauli's theory of the non-relativistic, spinning electron and in the theory of non-relativistic positronium. Later on, the mathematicians discovered Pauli's and Dirac's theories of the electron as a powerful tool in algebraic topology and differential geometry.

In a companion paper [FGR1], hereafter referred to as I, we have described a formulation of classical differential geometry in terms of the spectral data of non-relativistic, supersymmetric quantum theory, in particular in terms of the quantum theory of the non-relativistic electron and of positronium propagating on a general (spin$^c$) Riemannian manifold. The work in I is inspired by Connes' fundamental work [Co1–5], and by Witten's work on supersymmetric quantum theory and its applications in algebraic topology [Wi1, 2]; it attempts to merge these two threads of thought.
Additional inspiration has come from the work in [AG, FW, AGF, HKLR] on the relation between index theory and supersymmetric quantum theory and on supersymmetric non-linear σ-models, as well as from the work by Jaffe and co-workers on connections between supersymmetry and cyclic cohomology [Ja1–3]. To elucidate the roots of some of these ideas in Pauli’s non-relativistic quantum theory of the electron and of positronium has proven useful and suggestive of various generalizations. The work described in the present paper has its origins in an attempt to apply the methods of non-commutative geometry to exploring the geometry of string theory, in particular of superstring vacua; see [CF, FG]. In trying to combine quantum theory with the theory of gravitation, one observes that it is impossible to localize events in spacetime arbitrarily precisely, and that, in a compact region of space-time, one can only resolve a finite number of distinct events [DFR]. One may then argue, heuristically, that space-time itself must have quantum-mechanical features at distance scales of the order of the Planck length, and that space-time and matter should be merged into a fundamental quantum theory of space-time-matter. Superstring theory [GSW] is a theoretical framework incorporating some of the features necessary for a unification of quantum theory and the theory of gravitation. Superstring vacua are described by certain superconformal field theories, see e.g. [GSW]. The intention of the program formulated in [CF, FG] is to reconstruct space-time geometry from algebraic data of superconformal field theory. In the study of concrete examples, one observes that, in general, the target spaces (spacetimes) of superconformal field theories are non-commutative geometrical spaces, and the tools of Connes’ non-commutative geometry become essential in describing their geometry. This observation has been confirmed more recently in the theory of D-branes [Pol, Wi4].
The purpose of this paper is to cast some of the tools of non-commutative (differential) geometry into a form that makes connections to supersymmetric quantum theory manifest and that is particularly useful for applications to superconformal field theory. The methods and results of this paper are mathematically precise. Applications to physics are not treated here; but see e.g. [FGR2]. Instead, the general formalism developed in this paper is illustrated by an analysis of the geometry of the non-commutative torus and of the fuzzy 3-sphere; more details can be found in [Gr].

Next, we sketch some of the key ideas underlying our approach to non-commutative geometry; for further background see also part I and [FGR2]. Connes has shown how to formulate classical geometry in terms of algebraic data, so-called spectral triples, involving a commutative algebra $A = C^\infty(M)$ of (smooth) functions on the smooth manifold $M$ under consideration, a Hilbert space $H$ of spinors over $M$ on which the algebra $A$ acts by bounded operators, and a self-adjoint Dirac operator $D$ on $H$ satisfying certain properties with respect to $A$. As explained in [Co1], it is possible to extract complete geometrical information about $M$ from the spectral triple $(A, H, D)$.

The definition of spectral triples involves, in the classical case, a Clifford action on certain vector bundles over $M$, e.g. the spinor bundle or the bundle of differential forms. As was recalled in ref. I, the latter bundle actually carries two anti-commuting Clifford actions – which can be used to define two Dirac-Kähler operators, $D$ and $\bar D$. It turns out that the algebraic relations between these operators are precisely those of the two supercharges of N = (1, 1) supersymmetric quantum mechanics (see part I, especially Sect. 3, for the precise meaning of the terminology): These relations are $\{ D, \bar D \} = 0$ and $D^2 = \bar D^2$. The commutators $[D, a]$ and $[\bar D, a]$, for arbitrary $a \in A$, extend to bounded operators (anti-commuting sections of two Clifford bundles) acting on the Hilbert space $H$ of square-integrable differential forms. Furthermore, if the underlying manifold $M$ is compact, the operator $\exp(-\varepsilon D^2)$ is trace-class for any $\varepsilon > 0$. One may then introduce a nilpotent operator $d := D - i\bar D$, which turns out to correspond to exterior differentiation of differential forms. From the N = (1, 1) supersymmetric spectral data $(A, H, D, \bar D)$ just described, one can reconstruct the de Rham-Hodge theory and the Riemannian geometry of smooth (compact) Riemannian manifolds.

N = (1, 1) supersymmetric spectral data are a variant of Connes' approach involving spectral triples. They are very natural from the point of view of supersymmetric quantum theory and encode the differential geometry of Riemannian manifolds (not required to be spin$^c$ manifolds). In a formulation of differential geometry in terms of spectral data $(A, H, D, \bar D, \dots)$ with supersymmetry, additional geometrical structures, e.g. a symplectic or complex structure, appear in the form of global gauge symmetries commuting with the elements of $A$ but acting non-trivially on the Dirac-Kähler operators $D$ and $\bar D$; see part I. For example, a global gauge symmetry group containing U(1) × U(1) generates four Dirac-Kähler operators – the "supercharges" of N = (2, 2) supersymmetry – from $D$ and $\bar D$ and identifies the underlying manifold $M$ as a Kähler manifold. A global gauge symmetry group containing SU(2) × SU(2) leads to eight supercharges generating an N = (4, 4) supersymmetry algebra and is characteristic of Hyperkähler geometry; see also [AGF, HKLR]. Complex-Hermitian and symplectic geometry are encoded in N = (2, 2) supersymmetric spectral data with partially broken supersymmetry. A systematic classification of different types of differential geometry in terms of supersymmetric
spectral data extending the N = (1, 1) data of Riemannian geometry has been described in I (see Sect. I 3 for an overview, and [FGR2]).

In this paper, we generalize these results from classical to non-commutative geometry, starting from the simple prescription to replace the commutative algebra of functions $C^\infty(M)$ over a classical manifold by a general, possibly non-commutative $*$-algebra $A$ satisfying certain properties. Section 2 contains general definitions and introduces various kinds of spectral data: We start with an exposition of Connes' non-commutative spin geometry; most of the material can be found in [Co1], but we add some details on metric aspects ranging from connections over curvature and torsion to non-commutative Cartan structure equations. In Subsect. 2.2, we introduce spectral data with N = (1, 1) supersymmetry that naturally lead to a non-commutative analogue of the de Rham complex of differential forms. Moreover, this "Riemannian" formulation of non-commutative geometry allows for immediate specializations to spectral data with extended supersymmetry – which, in the classical case, correspond to manifolds carrying complex, Kähler, Hyperkähler or symplectic structures. Spectral data with higher supersymmetry are treated in Subsects. 2.3–2.5. In Subsect. 2.2.5, we discuss the relationship between spectral triples, as defined by Connes, and spectral data with N = (1, 1) supersymmetry: Whereas in the classical case, one can always pass from one description of a smooth manifold to the other, the situation is not quite as clear in the non-commutative framework. We propose a procedure for constructing N = (1, 1) data from a spectral triple – heavily relying on Connes' notion of a real structure [Co4] – but the construction is not complete for general spectral triples. Furthermore, Subsect.
2.2.6 contains proposals for definitions of non-commutative manifolds and non-commutative phase spaces, as suggested by the study of N = (1, 1) spectral data and by notions from quantum physics.

In Sects. 3 and 4 we discuss two examples of non-commutative spaces, namely the "fuzzy 3-sphere" and the non-commutative torus. The choice of the latter example does not require further explanation since it is one of the classic examples of a non-commutative space; see e.g. [Co1, Co5, Ri]. Here we add a description of the non-commutative 2-torus in terms of spectral data with N = (1, 1) and N = (2, 2) supersymmetry, thus showing that this space can be endowed with a non-commutative Riemannian and a non-commutative Kähler structure. This is not too surprising, since the non-commutative torus can be regarded as a deformation of the classical flat torus. The calculations in Sect. 4 also provide an example where the general ideas of Subsect. 2.2.5 on how to construct N = (1, 1) from N = 1 spectral data can be carried out completely.

The other example, the non-commutative 3-sphere discussed in Sect. 3 (see also [Gr]), represents a generalization of another prototype non-commutative geometrical space, namely the fuzzy 2-sphere [Ber, Ho, Ma, GKP]. We choose to study the 3-sphere for the following reasons: First, in contrast to the fuzzy 2-sphere and the non-commutative torus, it cannot be viewed as a quantization of a classical phase space. Second, it is the simplest example of a series of quantized spaces arising from so-called Wess–Zumino–Witten models – conformal field theories associated to non-linear σ-models with compact simple Lie groups as target manifolds, see [Wi3]. There is reason to expect that the spectral data arising from other WZW-models – see [FG, FGR2] for a discussion – can be treated essentially by the same methods as the fuzzy 3-sphere associated to the group SU(2).
In view of the conformal field theory origin, one is led to conjecture that, as a non-commutative space, the non-commutative 3-sphere describes the non-commutative geometry of the quantum group $U_q(sl_2)$, for $q = \exp(2\pi i/(k + 2))$, where $k \in \mathbb{Z}_+$ is the level of the WZW-model. The parameter $k$ appears in the spectral data of the non-commutative 3-sphere in a natural way. One may expect that the fuzzy 3-sphere can actually be defined for arbitrary values of this parameter, since the same is true for the quantum group. As in the example of the non-commutative torus with rational deformation parameter, a truncation of the algebra of "functions" occurs for the special values $k \in \mathbb{Z}_+$, leading to the finite-dimensional matrix algebras used in Sect. 3.

In Sect. 5, we conclude with a list of open problems arising naturally from our discussion. In particular, we briefly comment on other, string theory motivated applications of non-commutative geometry; see also [FG, FGR2].

The present text is meant as a companion paper to I: Now and then, we will permit ourselves to refer to [FGR1] for technical details of proofs which proceed analogously to the classical case. More importantly, the study of classical geometry in part I provides the best justification – besides the one of naturality – of the expectation that our classification of (non-commutative) geometries according to the supersymmetry content of the spectral data leads to useful and fruitful definitions of non-commutative geometrical structure.

2. Spectral Data of Non-Commutative Geometry

In the following, we generalize the notions of part I from classical differential geometry to the non-commutative setting. The classification of geometrical structure according to the "supersymmetry content" of the relevant spectral data, which was uncovered in [FGR1], will be our guiding principle. In the first part, we review Connes' formulation of non-commutative geometry using a single generalized Dirac operator, whereas, in the following subsections, spectral data with realizations of some genuine supersymmetry algebras will be introduced, allowing us to define non-commutative generalizations of Riemannian, complex, Kähler and Hyperkähler, as well as of symplectic geometry.

2.1. The N = 1 formulation of non-commutative geometry.
This section is devoted to the non-commutative generalization of an algebraic description of spin geometry – and, according to the results of Sect. I 2, of general Riemannian geometry – following the ideas of Connes [Co1]. The first two subsections contain the definition of abstract N = 1 spectral data and of differential forms. In Subsect. 2.1.3, we describe a notion of integration which leads us to a definition of square integrable differential forms. After having introduced vector bundles and Hermitian structures in Subsect. 2.1.4, we show in Subsect. 2.1.5 that the module of square integrable forms always carries a generalized Hermitian structure. We then define connections, torsion, and Riemannian, Ricci and scalar curvature in the next two subsections. Finally, in 2.1.8, we derive non-commutative Cartan structure equations. Although much of the material in Sect. 2.1 is contained (partly in much greater detail) in Connes' book [Co1], it is reproduced here because it is basic for our analysis in later sections and because we wish to make this paper accessible to non-experts.

2.1.1. The N = 1 spectral data.

Definition 2.1. A quadruple $(A, H, D, \gamma)$ will be called a set of N = 1 (even) spectral data if
1) $H$ is a separable Hilbert space;
2) $A$ is a unital $*$-algebra acting faithfully on $H$ by bounded operators;
3) $D$ is a self-adjoint operator on $H$ such that
   i) for each $a \in A$, the commutator $[D, a]$ defines a bounded operator on $H$,
   ii) the operator $\exp(-\varepsilon D^2)$ is trace class for all $\varepsilon > 0$;
4) $\gamma$ is a $\mathbb{Z}_2$-grading on $H$, i.e., $\gamma = \gamma^* = \gamma^{-1}$, such that $\{\gamma, D\} = 0$ and $[\gamma, a] = 0$ for all $a \in A$.

As mentioned before, in non-commutative geometry $A$ plays the role of the "algebra of functions over a non-commutative space". The existence of a unit in $A$, together with property 3 ii) above, reflects the fact that we are dealing with "compact" non-commutative spaces. Note that if the Hilbert space $H$ is infinite-dimensional, condition 3 ii) implies that the operator $D$ is unbounded. By analogy with classical differential geometry, $D$ is interpreted as a (generalized) Dirac operator.

Also note that the fourth condition in Definition 2.1 does not impose any restriction on N = 1 spectral data: In fact, given a triple $(\tilde A, \tilde H, \tilde D)$ satisfying Properties 1–3 from above, we can define a set of N = 1 even spectral data $(A, H, D, \gamma)$ by setting
\[ A = \tilde A \otimes 1_2, \qquad H = \tilde H \otimes \mathbb{C}^2, \qquad D = \tilde D \otimes \tau_1, \qquad \gamma = 1_{\tilde H} \otimes \tau_3, \]
where $\tau_i$ are the Pauli matrices acting on $\mathbb{C}^2$.

2.1.2. Differential forms. The construction of differential forms follows the same lines as in classical differential geometry: We define the unital, graded, differential $*$-algebra of universal forms, $\Omega^\bullet(A)$, as in [Co1, CoK]:
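\[ \Omega^\bullet(A) = \bigoplus_{k=0}^{\infty} \Omega^k(A), \qquad \Omega^k(A) := \Big\{ \sum_{i=1}^{N} a_0^i\, \delta a_1^i \cdots \delta a_k^i \ \Big|\ N \in \mathbb{N},\ a_j^i \in A \Big\}, \tag{2.1a} \]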
where δ is an abstract linear operator satisfying $\delta^2 = 0$ and the Leibniz rule. Note that, even in the classical case where $\mathcal{A} = C^\infty(M)$ for some smooth manifold M, no relations ensuring (graded) commutativity of $\Omega^\bullet(\mathcal{A})$ are imposed. The complex conjugation of functions over M is now to be replaced by the ∗-operation of A. We define
\[ (\delta a)^* = -\delta(a^*) \tag{2.1b} \]
for all a ∈ A. With the help of the (self-adjoint) generalized Dirac operator D, we introduce a ∗-representation π of $\Omega^\bullet(\mathcal{A})$ on H,
\[ \pi(a) = a, \qquad \pi(\delta a) = [\, D, a \,], \]
cf. [Co1] or Eq. (I 2.12). A graded ∗-ideal J of $\Omega^\bullet(\mathcal{A})$ is defined by
\[ J := \bigoplus_{k=0}^{\infty} J^k, \qquad J^k := \ker \pi\,|_{\Omega^k(\mathcal{A})}. \tag{2.2} \]
Since J is not a differential ideal, the graded quotient $\Omega^\bullet(\mathcal{A})/J$ does not define a differential algebra and thus does not yield a satisfactory definition of the algebra of differential forms. This problem is solved as in the classical case.
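The doubling construction after Definition 2.1 and the sign convention (2.1b) can be checked in a finite-dimensional toy model. The sketch below is only an illustration: the matrices `D_TILDE` and `A_TILDE` are arbitrary assumptions chosen for this example, and the helper functions are ad hoc. It verifies that γ anticommutes with D and commutes with a, and that $[D, a]^* = -[D, a^*]$, which is what makes π compatible with (2.1b) when D is self-adjoint.

```python
# Toy check of Definition 2.1 (grading conditions) and of Eq. (2.1b).
# All concrete matrices are illustrative assumptions, not data from the text.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def dagger(X):
    # Conjugate transpose.
    return [[complex(X[j][i]).conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def kron(X, Y):
    # Kronecker product X (x) Y.
    return [[X[i][j] * Y[k][l]
             for j in range(len(X[0])) for l in range(len(Y[0]))]
            for i in range(len(X)) for k in range(len(Y))]

def is_zero(X, tol=1e-12):
    return all(abs(x) <= tol for row in X for x in row)

def comm(X, Y):
    return sub(matmul(X, Y), matmul(Y, X))

def anticomm(X, Y):
    return add(matmul(X, Y), matmul(Y, X))

TAU1 = [[0, 1], [1, 0]]
TAU3 = [[1, 0], [0, -1]]
ID2 = [[1, 0], [0, 1]]

D_TILDE = [[0, 2], [2, 0]]              # self-adjoint operator on C^2 (assumed)
A_TILDE = [[1 + 2j, 0], [0, 3 - 1j]]    # an element of a diagonal algebra (assumed)

a = kron(A_TILDE, ID2)      # A = A~ (x) 1_2
D = kron(D_TILDE, TAU1)     # D = D~ (x) tau_1
gamma = kron(ID2, TAU3)     # gamma = 1 (x) tau_3

grading_ok = is_zero(anticomm(gamma, D)) and is_zero(comm(gamma, a))
# pi((delta a)^*) = -[D, a^*] must equal pi(delta a)^* = [D, a]^*.
star_ok = is_zero(add(dagger(comm(D, a)), comm(D, dagger(a))))
```

In this toy model all boundedness and trace-class conditions hold trivially, so only the algebraic relations are being tested.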
Proposition 2.2. ([Co1]) The graded sub-complex
\[ J + \delta J = \bigoplus_{k=0}^{\infty} \big( J^k + \delta J^{k-1} \big), \]
where $J^{-1} := 0$ and δ is the universal differential in $\Omega^\bullet(\mathcal{A})$, is a two-sided graded differential ∗-ideal of $\Omega^\bullet(\mathcal{A})$.

We define the unital graded differential ∗-algebra of differential forms, $\Omega^\bullet_D(\mathcal{A})$, as the graded quotient $\Omega^\bullet(\mathcal{A})/(J + \delta J)$, i.e.,
\[ \Omega^\bullet_D(\mathcal{A}) := \bigoplus_{k=0}^{\infty} \Omega^k_D(\mathcal{A}), \qquad \Omega^k_D(\mathcal{A}) := \Omega^k(\mathcal{A})/(J^k + \delta J^{k-1}). \tag{2.3} \]
Since $\Omega^\bullet_D(\mathcal{A})$ is a graded algebra, each $\Omega^k_D(\mathcal{A})$ is, in particular, a bi-module over $\mathcal{A} = \Omega^0_D(\mathcal{A})$. Note that π does not determine a representation of the algebra (or, for that matter, of the space) of differential forms $\Omega^\bullet_D(\mathcal{A})$ on the Hilbert space H: A differential k-form is an equivalence class $[\omega] \in \Omega^k_D(\mathcal{A})$ with some representative $\omega \in \Omega^k(\mathcal{A})$, and π maps this class to a set of bounded operators on H, namely
\[ \pi([\omega]) = \pi(\omega) + \pi(\delta J^{k-1}). \]
In general, the only subspaces where we do not meet this complication are $\pi(\Omega^0_D(\mathcal{A})) = \mathcal{A}$ and $\pi(\Omega^1_D(\mathcal{A})) \cong \pi(\Omega^1(\mathcal{A}))$. However, the image of $\Omega^\bullet_D(\mathcal{A})$ under π is $\mathbb{Z}_2$-graded,
\[ \pi\big(\Omega^\bullet_D(\mathcal{A})\big) = \pi\Big( \bigoplus_{k=0}^{\infty} \Omega^{2k}_D(\mathcal{A}) \Big) \oplus \pi\Big( \bigoplus_{k=0}^{\infty} \Omega^{2k+1}_D(\mathcal{A}) \Big), \]
because of the (anti-)commutation properties of the $\mathbb{Z}_2$-grading γ on H, see Definition 2.1.

2.1.3. Integration.

Property 3 ii) of the Dirac operator in Definition 2.1 allows us to define the notion of integration over a non-commutative space in the same way as in the classical case, see part I. Note that, for certain sets of N = 1 spectral data, we could use the Dixmier trace, as Connes originally proposed; but the definition given below, first introduced in [CFF], works in greater generality (cf. the remarks in Sect. I 2.1.3). Moreover, it is closer to notions coming up naturally in quantum field theory.

Definition 2.3. The integral over the non-commutative space described by the N = 1 spectral data (A, H, D, γ) is a state $\fint$ on $\pi(\Omega^\bullet(\mathcal{A}))$ defined by
\[ \fint : \pi(\Omega^\bullet(\mathcal{A})) \longrightarrow \mathbb{C}, \qquad \omega \longmapsto \fint \omega := \operatorname{Lim}_{\varepsilon \to 0^+} \frac{\operatorname{Tr}_{\mathcal{H}}\big( \omega\, e^{-\varepsilon D^2} \big)}{\operatorname{Tr}_{\mathcal{H}}\big( e^{-\varepsilon D^2} \big)}, \]
where $\operatorname{Lim}_{\varepsilon \to 0^+}$ denotes some limiting procedure making the functional $\fint$ linear and positive semi-definite; the existence of such a procedure can be shown analogously to [Co1, 3], where the Dixmier trace is discussed.
For this integral $\fint$ to be a useful tool, we need an additional property that must be checked in each example:

Assumption 2.4. The state $\fint$ on $\pi(\Omega^\bullet(\mathcal{A}))$ is cyclic, i.e.,
\[ \fint \omega\, \eta^* = \fint \eta^* \omega \]
for all $\omega, \eta \in \pi(\Omega^\bullet(\mathcal{A}))$.

The state $\fint$ determines a positive semi-definite sesqui-linear form on $\Omega^\bullet(\mathcal{A})$ by setting
\[ (\omega, \eta) := \fint \pi(\omega)\, \pi(\eta)^* \tag{2.4} \]
for all $\omega, \eta \in \Omega^\bullet(\mathcal{A})$. In the formulas below, we will often drop the representation symbol π under the integral, as there is no danger of confusion. Note that the commutation relations of the grading γ with the Dirac operator imply that forms of odd degree are orthogonal to those of even degree with respect to (·, ·). By $K^k$ we denote the kernel of this sesqui-linear form restricted to $\Omega^k(\mathcal{A})$. More precisely we set
\[ K := \bigoplus_{k=0}^{\infty} K^k, \qquad K^k := \{\, \omega \in \Omega^k(\mathcal{A}) \mid (\omega, \omega) = 0 \,\}. \tag{2.5} \]
Obviously, $K^k$ contains the ideal $J^k$ defined in Eq. (2.2); in the classical case they coincide. Assumption 2.4 is needed to show that K is a two-sided ideal of the algebra of universal forms, so that we can pass to the quotient algebra.

Proposition 2.5. The set K is a two-sided graded ∗-ideal of $\Omega^\bullet(\mathcal{A})$.

Proof. The Cauchy–Schwarz inequality for states implies that K is a vector space. If $\omega \in K^k$, then Assumption 2.4 gives
\[ (\omega^*, \omega^*) = \fint \pi(\omega)^* \pi(\omega) = \fint \pi(\omega)\, \pi(\omega)^* = 0, \]
i.e. that K is closed under the involution ∗. With ω as above and $\eta \in \Omega^p(\mathcal{A})$, we have that
\[ (\eta\omega, \eta\omega) = \fint \pi(\eta)\pi(\omega)\pi(\omega)^*\pi(\eta)^* = \fint \pi(\omega)^*\pi(\eta)^*\pi(\eta)\pi(\omega) \le \|\pi(\eta)\|_{\mathcal{H}}^2 \fint \pi(\omega)^*\pi(\omega) = 0, \]
where $\|\cdot\|_{\mathcal{H}}$ is the operator norm on B(H). On the other hand, we have that
\[ (\omega\eta, \omega\eta) = \fint \pi(\omega)\pi(\eta)\pi(\eta)^*\pi(\omega)^* \le \|\pi(\eta)\|_{\mathcal{H}}^2 \fint \pi(\omega)\pi(\omega)^* = 0, \]
and it follows that both ωη and ηω are elements of K, i.e., K is a two-sided ideal.
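The role of the limit ε → 0+ in Definition 2.3 and of the cyclicity required in Assumption 2.4 can be seen in a deliberately simple finite-dimensional sketch. It assumes D is diagonal with the hypothetical eigenvalues 1 and 3, so that the heat-kernel weights are explicit; in finite dimension the limit exists and is just the normalized trace. At fixed ε > 0 the regularized functional is not cyclic, while the defect vanishes as ε → 0+.

```python
import math

# Toy model (assumption): D = diag(EIGENVALUES) on C^2; operators are 2x2 lists.
EIGENVALUES = [1.0, 3.0]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def dagger(X):
    return [[complex(X[j][i]).conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def heat_state(omega, eps):
    """Tr(omega exp(-eps D^2)) / Tr(exp(-eps D^2)) for diagonal D."""
    weights = [math.exp(-eps * lam * lam) for lam in EIGENVALUES]
    numerator = sum(omega[i][i] * w for i, w in enumerate(weights))
    return numerator / sum(weights)

def cyclicity_defect(omega, eta, eps):
    """Regularized value of (omega eta^*) minus that of (eta^* omega)."""
    eta_star = dagger(eta)
    return (heat_state(matmul(omega, eta_star), eps)
            - heat_state(matmul(eta_star, omega), eps))

# A nilpotent "off-diagonal" operator, chosen only for illustration.
OMEGA = [[0, 1], [0, 0]]

defect_large_eps = abs(cyclicity_defect(OMEGA, OMEGA, 1.0))
defect_small_eps = abs(cyclicity_defect(OMEGA, OMEGA, 1e-9))
```

For this choice the defect equals tanh(4ε) in closed form, so it is of order one at ε = 1 and negligible at ε = 10⁻⁹. In genuinely infinite-dimensional examples cyclicity of the limit is not automatic, which is why it is imposed as an assumption.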
We now define
\[ \tilde\Omega^\bullet(\mathcal{A}) := \bigoplus_{k=0}^{\infty} \tilde\Omega^k(\mathcal{A}), \qquad \tilde\Omega^k(\mathcal{A}) := \Omega^k(\mathcal{A})/K^k. \tag{2.6} \]
The sesqui-linear form (·, ·) descends to a positive definite scalar product on $\tilde\Omega^k(\mathcal{A})$, and we denote by $\tilde{\mathcal{H}}^k$ the Hilbert space completion of this space with respect to the scalar product,
\[ \tilde{\mathcal{H}}^\bullet := \bigoplus_{k=0}^{\infty} \tilde{\mathcal{H}}^k, \qquad \tilde{\mathcal{H}}^k := \overline{\tilde\Omega^k(\mathcal{A})}^{\,(\cdot,\cdot)}. \tag{2.7} \]
$\tilde{\mathcal{H}}^k$ is to be interpreted as the space of square-integrable k-forms. Note that $\tilde{\mathcal{H}}^\bullet$ does not in general coincide with the Hilbert space that would arise from a GNS construction using the state $\fint$ on $\tilde\Omega^\bullet(\mathcal{A})$: Whereas in $\tilde{\mathcal{H}}^\bullet$, orthogonality of forms of different degree is installed by definition, there may exist forms of even degree (or odd forms) in the GNS Hilbert space that have different degrees but are not orthogonal.

Corollary 2.6. The space $\tilde\Omega^\bullet(\mathcal{A})$ is a unital graded ∗-algebra. For any $\omega \in \tilde\Omega^k(\mathcal{A})$, the left and right actions of ω on $\tilde\Omega^p(\mathcal{A})$ with values in $\tilde\Omega^{p+k}(\mathcal{A})$,
\[ m_L(\omega)\eta := \omega\eta, \qquad m_R(\omega)\eta := \eta\omega, \]
are continuous in the norm given by (·, ·). In particular, the Hilbert space $\tilde{\mathcal{H}}^\bullet$ is a bi-module over $\tilde\Omega^\bullet(\mathcal{A})$ with continuous actions.

Proof. The claim follows immediately from the two estimates given in the proof of the previous proposition, applied to $\omega \in \tilde\Omega^k(\mathcal{A})$ and $\eta \in \tilde\Omega^p(\mathcal{A})$.

This remark shows that $\tilde\Omega^\bullet(\mathcal{A})$ and $\tilde{\mathcal{H}}^\bullet$ are “well-behaved” with respect to the $\tilde\Omega^\bullet(\mathcal{A})$-action. Furthermore, Corollary 2.6 will be useful for our discussion of curvature and torsion in Subsects. 2.1.7 and 2.1.8.

Since the algebra $\tilde\Omega^\bullet(\mathcal{A})$ may fail to be differential, we introduce the unital graded differential ∗-algebra of square-integrable differential forms $\tilde\Omega^\bullet_D(\mathcal{A})$ as the graded quotient of $\Omega^\bullet(\mathcal{A})$ by K + δK,
\[ \tilde\Omega^\bullet_D(\mathcal{A}) := \bigoplus_{k=0}^{\infty} \tilde\Omega^k_D(\mathcal{A}), \qquad \tilde\Omega^k_D(\mathcal{A}) := \Omega^k(\mathcal{A})/(K^k + \delta K^{k-1}) \cong \tilde\Omega^k(\mathcal{A})/\delta K^{k-1}. \tag{2.8} \]
In order to show that $\tilde\Omega^\bullet_D(\mathcal{A})$ has the stated properties, one repeats the proof of Proposition 2.2. Note that we can regard the A-bi-module $\tilde\Omega^\bullet_D(\mathcal{A})$ as a “smaller version” of $\Omega^\bullet_D(\mathcal{A})$ in the sense that there exists a projection from the latter onto the former; whenever one deals with a concrete set of N = 1 spectral data that satisfy Assumption 2.4, it will be advantageous to work with the “smaller” algebra of square-integrable differential forms. The algebra $\Omega^\bullet_D(\mathcal{A})$, on the other hand, can be defined for arbitrary data.

In the classical case, differential forms are identified with the orthogonal complement of $Cl^{(k-2)}$ within $Cl^{(k)}$, see [Co1] and the remarks in part I, after Eq. (I 2.15). Now, we use the scalar product (·, ·) on $\tilde{\mathcal{H}}^k$ to introduce, for each k ≥ 1, the orthogonal projection
\[ P_{\delta K^{k-1}} : \tilde{\mathcal{H}}^k \longrightarrow \tilde{\mathcal{H}}^k \tag{2.9} \]
onto the image of $\delta K^{k-1}$ in $\tilde{\mathcal{H}}^k$, and we set
\[ \omega^\perp := (1 - P_{\delta K^{k-1}})\, \omega \in \tilde{\mathcal{H}}^k \tag{2.10} \]
for each element $[\omega] \in \tilde\Omega^k_D(\mathcal{A})$. This allows us to define a positive definite scalar product on $\tilde\Omega^k_D(\mathcal{A})$ via the representative $\omega^\perp$:
\[ (\, [\omega], [\eta] \,) := (\, \omega^\perp, \eta^\perp \,) \tag{2.11} \]
for all $[\omega], [\eta] \in \tilde\Omega^k_D(\mathcal{A})$. In the classical case, this is just the usual inner product on the space of square-integrable k-forms.

2.1.4. Vector bundles and Hermitian structures.

Again, we simply follow the algebraic formulation of classical differential geometry in order to generalize the notion of a vector bundle to the non-commutative case:

Definition 2.7 ([Co1]). A vector bundle E over the non-commutative space described by the N = 1 spectral data (A, H, D, γ) is a finitely generated projective left A-module.

Recall that a module E is projective if there exists another module F such that the direct sum E ⊕ F is free, i.e., $E \oplus F \cong \mathcal{A}^n$ as left A-modules, for some n ∈ N. Since A is an algebra, every A-module is a vector space; therefore, left A-modules are representations of the algebra A, and E is projective iff there exists a module F such that E ⊕ F is isomorphic to a multiple of the left-regular representation. By Swan’s Lemma [Sw], a finitely generated projective left module corresponds, in the commutative case, to the space of sections of a vector bundle. With this in mind, it is straightforward to define the notion of a Hermitian structure over a vector bundle:

Definition 2.8 ([Co1]). A Hermitian structure over a vector bundle E is a sesqui-linear map (linear in the first argument)
\[ \langle \cdot, \cdot \rangle : E \times E \longrightarrow \mathcal{A} \]
such that for all a, b ∈ A and all s, t ∈ E,
1) $\langle as, bt \rangle = a \langle s, t \rangle b^*$;
2) $\langle s, s \rangle \ge 0$;
3) the A-linear map
\[ g : E \longrightarrow E_R^*, \qquad s \longmapsto \langle s, \cdot \rangle, \]
where $E_R^* := \{\, \phi \in \mathrm{Hom}(E, \mathcal{A}) \mid \phi(as) = \phi(s) a^* \,\}$, is an isomorphism of left A-modules, i.e., g can be regarded as a metric on E.

2.1.5. Generalized Hermitian structure on $\tilde\Omega^k(\mathcal{A})$.

In this section we show that the A-bi-modules $\tilde\Omega^k(\mathcal{A})$ carry Hermitian structures in a slightly generalized sense. Let $\bar{\mathcal{A}}$ be the weak closure of the algebra A acting on $\tilde{\mathcal{H}}^0$, i.e., $\bar{\mathcal{A}}$ is the von Neumann algebra generated by $\tilde\Omega^0(\mathcal{A})$ acting on the Hilbert space $\tilde{\mathcal{H}}^0$.
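Theorem 2.9. There is a canonically defined sesqui-linear map
\[ \langle \cdot, \cdot \rangle_D : \tilde\Omega^k(\mathcal{A}) \times \tilde\Omega^k(\mathcal{A}) \longrightarrow \bar{\mathcal{A}} \]
such that for all a, b ∈ A and all $\omega, \eta \in \tilde\Omega^k(\mathcal{A})$,
1) $\langle a\omega, b\eta \rangle_D = a \langle \omega, \eta \rangle_D\, b^*$;
2) $\langle \omega, \omega \rangle_D \ge 0$;
3) $\langle \omega a, \eta \rangle_D = \langle \omega, \eta a^* \rangle_D$.

We call $\langle \cdot, \cdot \rangle_D$ a generalized Hermitian structure on $\tilde\Omega^k(\mathcal{A})$. It is the non-commutative analogue of the Riemannian metric on the bundle of differential forms. Note that $\langle \cdot, \cdot \rangle_D$ takes values in $\bar{\mathcal{A}}$ and thus Property 3) of Definition 2.8 is not directly applicable.

Proof. Let $\omega, \eta \in \tilde\Omega^k(\mathcal{A})$ and define the C-linear map
\[ \varphi_{\omega,\eta}(a) = \fint a\, \eta\, \omega^* \]
for all $a \in \tilde\Omega^0(\mathcal{A})$. Note that a on the rhs actually is a representative in A of the class $a \in \tilde\Omega^0(\mathcal{A})$, and analogously for ω and η (and we have omitted the representation symbol π). The value of the integral is, however, independent of the choice of these representatives, which is why we used the same letters. The map φ satisfies
\[ |\varphi_{\omega,\eta}(a)| \le \Big( \fint a a^* \Big)^{1/2} \Big( \fint \omega \eta^* \eta\, \omega^* \Big)^{1/2} \le (a, a)^{1/2} \Big( \fint \omega \eta^* \eta\, \omega^* \Big)^{1/2}. \]
Therefore, $\varphi_{\omega,\eta}$ extends to a bounded linear functional on $\tilde{\mathcal{H}}^0$, and there exists an element $\langle \omega, \eta \rangle_D \in \tilde{\mathcal{H}}^0$ such that
\[ \varphi_{\omega,\eta}(x) = (\, x, \langle \omega, \eta \rangle_D \,) \]
for all $x \in \tilde{\mathcal{H}}^0$; since (·, ·) is non-degenerate, $\langle \omega, \eta \rangle_D$ is a well-defined element; but it remains to show that it also acts as a bounded operator on this Hilbert space. To this end, choose a net $\{a_\iota\} \subset \tilde\Omega^0(\mathcal{A})$ which converges to $\langle \omega, \eta \rangle_D$. Then, for all $b, c \in \tilde\Omega^0(\mathcal{A})$,
\[ (\, \langle \omega, \eta \rangle_D\, b, c \,) = \lim_{\iota\to\infty} (\, a_\iota b, c \,) = \lim_{\iota\to\infty} \fint a_\iota b c^* = \lim_{\iota\to\infty} \fint a_\iota (c b^*)^* = \lim_{\iota\to\infty} (\, a_\iota, c b^* \,) = (\, \langle \omega, \eta \rangle_D, c b^* \,), \]
and it follows that
\[ \begin{aligned} |(\, \langle \omega, \eta \rangle_D\, b, c \,)| &= |(\, \langle \omega, \eta \rangle_D, c b^* \,)| = |(\, c b^*, \langle \omega, \eta \rangle_D \,)| \\ &= \Big| \fint c b^* \eta\, \omega^* \Big| = \Big| \fint \omega^* c b^* \eta \Big| = \Big| \fint b^* \eta\, \omega^* c \Big| \\ &\le \Big( \fint b^* b \Big)^{1/2} \Big( \fint c^* \omega \eta^* \eta\, \omega^* c \Big)^{1/2} \le \| \omega \eta^* \|_{\mathcal{H}} \Big( \fint b^* b \Big)^{1/2} \Big( \fint c^* c \Big)^{1/2} \le \| \omega \eta^* \|_{\mathcal{H}}\, (b, b)^{1/2} (c, c)^{1/2}. \end{aligned} \]
In the third line, we first use the Cauchy–Schwarz inequality for the positive state $\fint$, and then an estimate which is true for all positive operators on a Hilbert space; the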
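upper bound $\| \omega \eta^* \|_{\mathcal{H}}$ again involves representatives $\omega, \eta \in \pi(\Omega^k(\mathcal{A}))$, which was not explicitly indicated above, since any two will do. As $\tilde\Omega^0(\mathcal{A})$ is dense in $\tilde{\mathcal{H}}^0$, we see that $\langle \omega, \eta \rangle_D$ indeed defines a bounded operator in $\tilde{\mathcal{H}}^0$, which, by definition, is the weak limit of elements in $\tilde\Omega^0(\mathcal{A})$, i.e., it belongs to $\bar{\mathcal{A}}$. Properties 1–3 of $\langle \cdot, \cdot \rangle_D$ are easy to verify.

Note that the definition of the metric $\langle \cdot, \cdot \rangle_D$ given here differs slightly from the one of refs. [CFF, CFG]. One can, however, show that in the N = 1 case both definitions agree; moreover, the present one is better suited for the N = (1, 1) formulation to be introduced later.

2.1.6. Connections.

Definition 2.10. A connection ∇ on a vector bundle E over a non-commutative space is a C-linear map
\[ \nabla : E \longrightarrow \tilde\Omega^1_D(\mathcal{A}) \otimes_{\mathcal{A}} E \]
such that
\[ \nabla(as) = \delta a \otimes s + a \nabla s \]
for all a ∈ A and all s ∈ E.

Given a vector bundle E, we define a space of E-valued differential forms by
\[ \tilde\Omega^\bullet_D(E) := \tilde\Omega^\bullet_D(\mathcal{A}) \otimes_{\mathcal{A}} E ; \]
if ∇ is a connection on E, then it extends uniquely to a C-linear map, again denoted ∇,
\[ \nabla : \tilde\Omega^\bullet_D(E) \longrightarrow \tilde\Omega^{\bullet+1}_D(E) \tag{2.12} \]
such that
\[ \nabla(\omega s) = \delta\omega\, s + (-1)^k \omega\, \nabla s \tag{2.13} \]
for all $\omega \in \tilde\Omega^k_D(\mathcal{A})$ and all $s \in \tilde\Omega^\bullet_D(E)$.

Definition 2.11. The curvature of a connection ∇ on a vector bundle E is given by
\[ R(\nabla) = -\nabla^2 : E \longrightarrow \tilde\Omega^2_D(\mathcal{A}) \otimes_{\mathcal{A}} E. \]
Note that the curvature extends to a map
\[ R(\nabla) : \tilde\Omega^\bullet_D(E) \longrightarrow \tilde\Omega^{\bullet+2}_D(E), \]
which is left A-linear, as follows easily from Eq. (2.12) and Definition 2.10.

Definition 2.12. A connection ∇ on a Hermitian vector bundle $(E, \langle \cdot, \cdot \rangle)$ is called unitary if
\[ \delta \langle s, t \rangle = \langle \nabla s, t \rangle - \langle s, \nabla t \rangle \]
for all s, t ∈ E, where the rhs of this equation is defined by
\[ \langle \omega \otimes s, t \rangle = \omega \langle s, t \rangle, \qquad \langle s, \eta \otimes t \rangle = \langle s, t \rangle\, \eta^* \tag{2.14} \]
for all $\omega, \eta \in \tilde\Omega^1_D(\mathcal{A})$ and all s, t ∈ E.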
2.1.7. Riemannian curvature and torsion.

Throughout this section, we make three additional assumptions which limit the generality of our results, but turn out to be fulfilled in interesting examples.

Assumption 2.13. We assume that the N = 1 spectral data under consideration have the following additional properties:
1) $K^0 = 0$. (This implies that $\tilde\Omega^0_D(\mathcal{A}) = \mathcal{A}$ and $\tilde\Omega^1_D(\mathcal{A}) = \tilde\Omega^1(\mathcal{A})$, thus $\tilde\Omega^1_D(\mathcal{A})$ carries a generalized Hermitian structure.)
2) $\tilde\Omega^1_D(\mathcal{A})$ is a vector bundle, called the cotangent bundle over A. ($\tilde\Omega^1_D(\mathcal{A})$ is always a left A-module. Here, we assume, in addition, that it is finitely generated and projective.)
3) The generalized metric $\langle \cdot, \cdot \rangle_D$ on $\tilde\Omega^1_D(\mathcal{A})$ defines an isomorphism of left A-modules between $\tilde\Omega^1_D(\mathcal{A})$ and the space of A-anti-linear maps from $\tilde\Omega^1_D(\mathcal{A})$ to A, i.e., for each A-anti-linear map
\[ \phi : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \mathcal{A}, \]
satisfying $\phi(a\omega) = \phi(\omega) a^*$ for all $\omega \in \tilde\Omega^1_D(\mathcal{A})$ and all a ∈ A, there is a unique $\eta_\phi \in \tilde\Omega^1_D(\mathcal{A})$ with
\[ \phi(\omega) = \langle \eta_\phi, \omega \rangle_D . \]

If N = 1 spectral data (A, H, D, γ) satisfy these assumptions, we are able to define non-commutative generalizations of classical notions like torsion and curvature. Whereas torsion and Riemann curvature can be introduced whenever $\tilde\Omega^1_D(\mathcal{A})$ is a vector bundle, the last assumption in 2.13 will provide a substitute for the procedure of “contracting indices” leading to Ricci and scalar curvature.

Definition 2.14. Let ∇ be a connection on the cotangent bundle $\tilde\Omega^1_D(\mathcal{A})$ over a non-commutative space (A, H, D, γ) satisfying Assumption 2.13. The torsion of ∇ is the A-linear map
\[ T(\nabla) := \delta - m \circ \nabla : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \tilde\Omega^2_D(\mathcal{A}), \]
where $m : \tilde\Omega^1_D(\mathcal{A}) \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \tilde\Omega^2_D(\mathcal{A})$ denotes the product of 1-forms in $\tilde\Omega^\bullet_D(\mathcal{A})$.

Using the definition of a connection, A-linearity of torsion is easy to verify. In analogy to the classical case, a unitary connection ∇ with T(∇) = 0 is called a Levi–Civita connection. In the classical case, there is exactly one Levi–Civita connection that, in addition, is a real operator on the complexified bundle of differential forms. In contrast, for a given set of non-commutative spectral data, there may be several (real) Levi–Civita connections – or none at all.

Since we assume that $\tilde\Omega^1_D(\mathcal{A})$ is a vector bundle, we can define the Riemannian curvature of a connection ∇ on the cotangent bundle as a specialization of Definition 2.11. To proceed further, we make use of part 2) of Assumption 2.13, which implies that there exists a finite set of generators $\{ E^A \}$ of $\tilde\Omega^1_D(\mathcal{A})$ and an associated “dual basis” $\{ \varepsilon_A \} \subset \tilde\Omega^1_D(\mathcal{A})^*$,
\[ \tilde\Omega^1_D(\mathcal{A})^* := \{\, \phi : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \mathcal{A} \mid \phi(a\omega) = a\phi(\omega) \text{ for all } a \in \mathcal{A},\ \omega \in \tilde\Omega^1_D(\mathcal{A}) \,\}, \]
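such that each $\omega \in \tilde\Omega^1_D(\mathcal{A})$ can be written as $\omega = \varepsilon_A(\omega) E^A$, see e.g. [Jac]. Because the curvature is A-linear, there is a family of elements $\{ R^A{}_B \} \subset \tilde\Omega^2_D(\mathcal{A})$ with
\[ R(\nabla) = \varepsilon_A \otimes R^A{}_B \otimes E^B ; \tag{2.15} \]
here and in the following the summation convention is used. Put differently, we have applied the canonical isomorphism of vector spaces
\[ \mathrm{Hom}_{\mathcal{A}}\big( \tilde\Omega^1_D(\mathcal{A}),\ \tilde\Omega^2_D(\mathcal{A}) \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}) \big) \cong \tilde\Omega^1_D(\mathcal{A})^* \otimes_{\mathcal{A}} \tilde\Omega^2_D(\mathcal{A}) \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}) \]
– which is valid because $\tilde\Omega^1_D(\mathcal{A})$ is projective – and chosen explicit generators $E^A$, $\varepsilon_A$. Then we have that $R(\nabla)\,\omega = \varepsilon_A(\omega)\, R^A{}_B \otimes E^B$ for any 1-form $\omega \in \tilde\Omega^1_D(\mathcal{A})$.

Note that although the components $R^A{}_B$ need not be unique, the element on the rhs of Eq. (2.15) is well-defined. Likewise, the Ricci and scalar curvature, to be introduced below, will be invariant combinations of those components, as long as we make sure that all maps we use have the correct “tensorial properties” with respect to the A-action.

The last part of Assumption 2.13 guarantees, furthermore, that to each $\varepsilon_A$ there exists a unique 1-form $e_A \in \tilde\Omega^1_D(\mathcal{A})$ such that
\[ \varepsilon_A(\omega) = \langle \omega, e_A \rangle_D \]
for all $\omega \in \tilde\Omega^1_D(\mathcal{A})$. By Corollary 2.6, every such $e_A$ determines a bounded operator $m_L(e_A) : \tilde{\mathcal{H}}^1 \longrightarrow \tilde{\mathcal{H}}^2$ acting on $\tilde{\mathcal{H}}^1$ by left multiplication with $e_A$. The adjoint of this operator with respect to the scalar product (·, ·) on $\tilde{\mathcal{H}}^\bullet$ is denoted by
\[ e_A^{\mathrm{ad}} : \tilde{\mathcal{H}}^2 \longrightarrow \tilde{\mathcal{H}}^1 . \tag{2.16} \]
$e_A^{\mathrm{ad}}$ is a map of right A-modules, and it is easy to see that also the correspondence $\varepsilon_A \mapsto e_A^{\mathrm{ad}}$ is right A-linear: For all b ∈ A, $\omega \in \tilde\Omega^1_D(\mathcal{A})$, we have
\[ (\varepsilon_A \cdot b)(\omega) = \varepsilon_A(\omega) \cdot b = \langle \omega, e_A \rangle\, b = \langle \omega, b^* e_A \rangle, \]
and, furthermore, for all $\xi_1 \in \tilde{\mathcal{H}}^1$, $\xi_2 \in \tilde{\mathcal{H}}^2$,
\[ (\, b^* e_A(\xi_1), \xi_2 \,) = (\, e_A(\xi_1), b\,\xi_2 \,) = (\, \xi_1, e_A^{\mathrm{ad}}(b\,\xi_2) \,), \]
where scalar products have to be taken in the appropriate spaces $\tilde{\mathcal{H}}^k$. Altogether, the asserted right A-linearity follows. Therefore, the map
\[ \varepsilon_A \otimes R^A{}_B \otimes E^B \longmapsto e_A^{\mathrm{ad}} \otimes R^A{}_B \otimes E^B \]
is well-defined and has the desired tensorial properties. The definition of Ricci curvature involves another operation which we require to be similarly well-behaved:

Lemma 2.15. The orthogonal projections $P_{\delta K^{k-1}}$ on $\tilde{\mathcal{H}}^k$, see Eq. (2.9), satisfy
\[ P_{\delta K^{k-1}}(a x b) = a\, P_{\delta K^{k-1}}(x)\, b \]
for all a, b ∈ A and all $x \in \tilde{\mathcal{H}}^k$.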
Proof. Set $P := P_{\delta K^{k-1}}$, and let $y \in P \tilde{\mathcal{H}}^k$. Then
\[ (\, P(axb), y \,) = (\, axb, P(y) \,) = (\, axb, y \,) = (\, x, a^* y b^* \,) = (\, x, P(a^* y b^*) \,) = (\, a P(x) b, y \,), \]
where we have used that P is self-adjoint with respect to (·, ·), that Py = y, and that the image of P is an A-bi-module.

This lemma shows that projecting onto the “2-form part” of $R^A{}_B$ is an A-bi-module map, i.e., we may apply
\[ e_A^{\mathrm{ad}} \otimes R^A{}_B \otimes E^B \longmapsto e_A^{\mathrm{ad}} \otimes \big( R^A{}_B \big)^\perp \otimes E^B \]
with $\big( R^A{}_B \big)^\perp = (1 - P_{\delta K^1})\, R^A{}_B$ as in Eq. (2.10). Altogether, we arrive at the following definition of the Ricci curvature,
\[ \mathrm{Ric}(\nabla) = e_A^{\mathrm{ad}} \big( ( R^A{}_B )^\perp \big) \otimes E^B \in \tilde{\mathcal{H}}^1 \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}), \]
which is in fact independent of any choices. In the following, we will also use the abbreviation
\[ \mathrm{Ric}_B := e_A^{\mathrm{ad}} \big( ( R^A{}_B )^\perp \big) \]
for the components (which, again, are not uniquely defined).

From the components $\mathrm{Ric}_B$ we can pass to scalar curvature. Again, we have to make sure that all maps occurring in this process are A-covariant so as to obtain an invariant definition. For any 1-form $\omega \in \tilde\Omega^1_D(\mathcal{A})$, right multiplication on $\tilde{\mathcal{H}}^0$ with ω defines a bounded operator $m_R(\omega) : \tilde{\mathcal{H}}^0 \longrightarrow \tilde{\mathcal{H}}^1$, and we denote by
\[ \omega_R^{\mathrm{ad}} : \tilde{\mathcal{H}}^1 \longrightarrow \tilde{\mathcal{H}}^0 \tag{2.17} \]
the adjoint of this operator. In a similar fashion as above, one establishes that
\[ (\omega a)_R^{\mathrm{ad}}(x) = \omega_R^{\mathrm{ad}}(x a^*) \]
for all $x \in \tilde{\mathcal{H}}^1$ and a ∈ A. This makes it possible to define the scalar curvature r(∇) of a connection ∇ as
\[ r(\nabla) = \big( E^{B*} \big)_R^{\mathrm{ad}} (\mathrm{Ric}_B) \in \tilde{\mathcal{H}}^0 . \]
As was the case for the Ricci tensor, acting with the adjoint of $m_R(E^{B*})$ serves as an analogue for “contraction of indices”. We summarize our results in the following

Definition 2.16. Let ∇ be a connection on the cotangent bundle $\tilde\Omega^1_D(\mathcal{A})$ over a non-commutative space (A, H, D, γ) satisfying Assumption 2.13. The Riemannian curvature R(∇) is the left A-linear map
\[ R(\nabla) = -\nabla^2 : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \tilde\Omega^2_D(\mathcal{A}) \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}). \]
Choosing a set of generators $E^A$ of $\tilde\Omega^1_D(\mathcal{A})$ and dual generators $\varepsilon_A$ of $\tilde\Omega^1_D(\mathcal{A})^*$, and writing $R(\nabla) = \varepsilon_A \otimes R^A{}_B \otimes E^B$ as above, the Ricci tensor Ric(∇) is given by
\[ \mathrm{Ric}(\nabla) = \mathrm{Ric}_B \otimes E^B \in \tilde{\mathcal{H}}^1 \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}), \]
where $\mathrm{Ric}_B := e_A^{\mathrm{ad}} \big( ( R^A{}_B )^\perp \big)$, see Eqs. (2.10) and (2.16). Finally, the scalar curvature r(∇) of the connection ∇ is defined as
\[ r(\nabla) = \big( E^{B*} \big)_R^{\mathrm{ad}} (\mathrm{Ric}_B) \in \tilde{\mathcal{H}}^0 , \]
with the notation of Eq. (2.17). (Note that, in the classical case, our definition of the scalar curvature differs from the usual one by a sign.) Both Ric(∇) and r(∇) do not depend on the choice of generators.

2.1.8. Non-commutative Cartan structure equations.

The classical Cartan structure equations are an important tool for explicit calculations in differential geometry. Non-commutative analogues of those equations were obtained in [CFF, CFG]. Since proofs were only sketched in these references, we will give a rather detailed account of their results in the following. Throughout this section, we assume that the space $\tilde\Omega^1_D(\mathcal{A})$ is a vector bundle over A. In fact, no other properties of this space are used. Therefore all the statements on the non-commutative Cartan structure equations for the curvature will hold for any finitely generated projective module E over A; the torsion tensor, on the other hand, is defined only on the cotangent bundle over a non-commutative space.

Let ∇ be a connection on the vector bundle $\tilde\Omega^1_D(\mathcal{A})$, then the curvature and the torsion of ∇ are the left A-linear maps given in Definitions 2.16 and 2.14,
\[ R(\nabla) : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \tilde\Omega^2_D(\mathcal{A}) \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}), \qquad T(\nabla) : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \tilde\Omega^2_D(\mathcal{A}). \]
Since the left A-module $\tilde\Omega^1_D(\mathcal{A})$ is finitely generated, we can choose a finite set of generators $\{ E^A \}_{A=1,\ldots,N} \subset \tilde\Omega^1_D(\mathcal{A})$, and define the components $\Omega^A{}_B \in \tilde\Omega^1_D(\mathcal{A})$, $R^A{}_B \in \tilde\Omega^2_D(\mathcal{A})$ and $T^A \in \tilde\Omega^2_D(\mathcal{A})$ of connection, curvature and torsion, resp., by setting
\[ \nabla E^A = -\Omega^A{}_B \otimes E^B , \tag{2.18} \]
\[ R(\nabla) E^A = R^A{}_B \otimes E^B , \tag{2.19} \]
\[ T(\nabla) E^A = T^A . \tag{2.20} \]
Note that the components $\Omega^A{}_B$ and $R^A{}_B$ are not uniquely defined if $\tilde\Omega^1_D(\mathcal{A})$ is not a free module. Using Definitions 2.16 and 2.14, the components of the curvature and torsion tensors can be expressed in terms of the connection components:
\[ R^A{}_B = \delta \Omega^A{}_B + \Omega^A{}_C\, \Omega^C{}_B , \tag{2.21} \]
\[ T^A = \delta E^A + \Omega^A{}_B\, E^B . \tag{2.22} \]
As they stand, Eqs. (2.21) and (2.22) cannot be applied for solving typical problems like finding a connection without torsion, because the connection components $\Omega^A{}_B$ cannot be chosen at will unless $\tilde\Omega^1_D(\mathcal{A})$ is free. We obtain more useful Cartan structure equations if we can relate the components $\Omega^A{}_B$ to those of a connection $\widetilde\nabla$ on a free module $\mathcal{A}^N$. To this end, we employ some general constructions valid for any finitely generated projective left A-module E.
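Let $\{ \tilde{E}^A \}_{A=1,\ldots,N}$ be the canonical basis of the standard module $\mathcal{A}^N$, and define a left A-module homomorphism
\[ p : \mathcal{A}^N \longrightarrow \tilde\Omega^1_D(\mathcal{A}), \qquad a_A \tilde{E}^A \longmapsto a_A E^A \tag{2.23} \]
for all $a_A \in \mathcal{A}$. Since $\tilde\Omega^1_D(\mathcal{A})$ is projective there exists a left A-module F such that
\[ \tilde\Omega^1_D(\mathcal{A}) \oplus F \cong \mathcal{A}^N . \tag{2.24} \]
Denote by $i : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \mathcal{A}^N$ the inclusion map determined by the isomorphism (2.24), which satisfies $p \circ i = \mathrm{id}$ on $\tilde\Omega^1_D(\mathcal{A})$. For each A = 1, . . . , N, we define a left A-linear map
\[ \tilde\varepsilon_A : \mathcal{A}^N \longrightarrow \mathcal{A}, \qquad a_B \tilde{E}^B \longmapsto a_A . \tag{2.25} \]
It is clear that $\tilde\varepsilon_A(\omega) \tilde{E}^A = \omega$ for all $\omega \in \mathcal{A}^N$. With the help of the inclusion i, we can introduce the left A-linear maps
\[ \varepsilon_A : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \mathcal{A}, \qquad \omega \longmapsto \tilde\varepsilon_A(i(\omega)) \tag{2.26} \]
for all A = 1, . . . , N. With these, $\omega \in \tilde\Omega^1_D(\mathcal{A})$ can be written as
\[ \omega = p(i(\omega)) = p\big( \tilde\varepsilon_A(i(\omega)) \tilde{E}^A \big) = \varepsilon_A(\omega) E^A , \tag{2.27} \]
and we see that $\{ \varepsilon_A \}$ is the dual basis already used in Sect. 2.1.7.

The first step towards the non-commutative Cartan structure equations is the following result; see also [Kar].

Proposition 2.17. Every connection $\widetilde\nabla$ on $\mathcal{A}^N$,
\[ \widetilde\nabla : \mathcal{A}^N \longrightarrow \tilde\Omega^1_D(\mathcal{A}) \otimes_{\mathcal{A}} \mathcal{A}^N , \]
determines a connection ∇ on $\tilde\Omega^1_D(\mathcal{A})$ by
\[ \nabla = (\mathrm{id} \otimes p) \circ \widetilde\nabla \circ i , \tag{2.28} \]
and every connection on $\tilde\Omega^1_D(\mathcal{A})$ is of this form.

Proof. Let $\widetilde\nabla$ be a connection on $\mathcal{A}^N$ – which always exists (see the remarks after the proof). Clearly, $\nabla = (\mathrm{id} \otimes p) \circ \widetilde\nabla \circ i$ is a well-defined map, and it satisfies
\[ \nabla(a\,\omega) = (\mathrm{id} \otimes p)\big( \widetilde\nabla(a\, i(\omega)) \big) = (\mathrm{id} \otimes p)\big( \delta a \otimes i(\omega) + a \widetilde\nabla i(\omega) \big) = \delta a \otimes \omega + a \nabla \omega \]
for all a ∈ A and all $\omega \in \tilde\Omega^1_D(\mathcal{A})$. This proves that ∇ is a connection on $\tilde\Omega^1_D(\mathcal{A})$. If ∇′ is any other connection on $\tilde\Omega^1_D(\mathcal{A})$, then
\[ \nabla' - \nabla \in \mathrm{Hom}_{\mathcal{A}}\big( \tilde\Omega^1_D(\mathcal{A}),\ \tilde\Omega^1_D(\mathcal{A}) \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}) \big), \]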
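where $\mathrm{Hom}_{\mathcal{A}}$ denotes the space of homomorphisms of left A-modules. Since
\[ \mathrm{id} \otimes p : \tilde\Omega^1_D(\mathcal{A}) \otimes_{\mathcal{A}} \mathcal{A}^N \longrightarrow \tilde\Omega^1_D(\mathcal{A}) \otimes_{\mathcal{A}} \tilde\Omega^1_D(\mathcal{A}) \]
is surjective and $\tilde\Omega^1_D(\mathcal{A})$ is a projective module, there exists a module map
\[ \varphi : \tilde\Omega^1_D(\mathcal{A}) \longrightarrow \tilde\Omega^1_D(\mathcal{A}) \otimes_{\mathcal{A}} \mathcal{A}^N \]
with $\nabla' - \nabla = (\mathrm{id} \otimes p) \circ \varphi$. Then $\tilde\varphi := \varphi \circ p \in \mathrm{Hom}_{\mathcal{A}}\big( \mathcal{A}^N, \tilde\Omega^1_D(\mathcal{A}) \otimes_{\mathcal{A}} \mathcal{A}^N \big)$, and $\widetilde\nabla + \tilde\varphi$ is a connection on $\mathcal{A}^N$ whose associated connection on $\tilde\Omega^1_D(\mathcal{A})$ is given by ∇′:
\[ (\mathrm{id} \otimes p) \circ (\widetilde\nabla + \tilde\varphi) \circ i = \nabla + (\mathrm{id} \otimes p) \circ \varphi = \nabla' . \]
This proves that every connection on $\tilde\Omega^1_D(\mathcal{A})$ comes from a connection on $\mathcal{A}^N$.

The importance of this proposition lies in the fact that an arbitrary collection of $N^2$ 1-forms $\{ \widetilde\Omega^A{}_B \}_{A,B=1,\ldots,N} \subset \tilde\Omega^1_D(\mathcal{A})$ defines a connection $\widetilde\nabla$ on $\mathcal{A}^N$ by the formula
\[ \widetilde\nabla\big( a_A \tilde{E}^A \big) = \delta a_A \otimes \tilde{E}^A - a_A\, \widetilde\Omega^A{}_B \otimes \tilde{E}^B , \]
and conversely. Thus, not only the existence of connections on $\mathcal{A}^N$ and $\tilde\Omega^1_D(\mathcal{A})$ is guaranteed, but Eq. (2.28) allows us to compute the components $\Omega^A{}_B$ of the induced connection ∇ on $\tilde\Omega^1_D(\mathcal{A})$. The action of ∇ on the generators is
\[ \begin{aligned} \nabla E^A &= (\mathrm{id} \otimes p)\big( \widetilde\nabla\, i(E^A) \big) = (\mathrm{id} \otimes p)\big( \widetilde\nabla ( \tilde\varepsilon_B(i(E^A)) \tilde{E}^B ) \big) = (\mathrm{id} \otimes p)\big( \widetilde\nabla ( \varepsilon_B(E^A) \tilde{E}^B ) \big) \\ &= (\mathrm{id} \otimes p)\big( \delta\varepsilon_B(E^A) \otimes \tilde{E}^B - \varepsilon_B(E^A)\, \widetilde\Omega^B{}_C \otimes \tilde{E}^C \big) \\ &= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_C(E^A)\, \widetilde\Omega^C{}_B \otimes E^B , \end{aligned} \]
where we have used some of the general properties listed before. In short, we get the relation
\[ \Omega^A{}_B = \varepsilon_C(E^A)\, \widetilde\Omega^C{}_B - \delta\varepsilon_B(E^A) \tag{2.29} \]
expressing the components of the connection ∇ on $\tilde\Omega^1_D(\mathcal{A})$ in terms of the components of the connection $\widetilde\nabla$ on $\mathcal{A}^N$.

Upon inserting (2.29) into (2.21, 22), one arrives at Cartan structure equations which express torsion and curvature in terms of these unrestricted components. We can, however, obtain equations of a simpler form if we exploit the fact that the map $\widetilde\nabla \mapsto \nabla$ is many-to-one; this allows us to impose some extra symmetry relations on the components of the connection $\widetilde\nabla$.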
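Proposition 2.18. Let $\widetilde\Omega^A{}_B$ be the coefficients of a connection $\widetilde\nabla$ on $\mathcal{A}^N$, and denote by $\widehat\nabla$ the connection on $\mathcal{A}^N$ whose components are given by $\widehat\Omega^A{}_B := \varepsilon_C(E^A)\, \widetilde\Omega^C{}_D\, \varepsilon_B(E^D)$. Then, these components enjoy the symmetry relations
\[ \varepsilon_C(E^A)\, \widehat\Omega^C{}_B = \widehat\Omega^A{}_B , \qquad \widehat\Omega^A{}_C\, \varepsilon_B(E^C) = \widehat\Omega^A{}_B , \tag{2.30} \]
and $\widetilde\nabla$ and $\widehat\nabla$ induce the same connection on $\tilde\Omega^1_D(\mathcal{A})$. In particular, every connection on $\tilde\Omega^1_D(\mathcal{A})$ is induced by a connection on $\mathcal{A}^N$ that satisfies (2.30).

Proof. We explicitly compute the action of the connection ∇ induced by $\widehat\nabla$ on a generator, using Eqs. (2.27, 28) and the fact that all maps and the tensor product are A-linear:
\[ \begin{aligned} \nabla E^A = -\Omega^A{}_B \otimes E^B &= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_C(E^A)\, \widehat\Omega^C{}_B \otimes E^B \\ &= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_C(E^A)\, \varepsilon_D(E^C)\, \widetilde\Omega^D{}_F\, \varepsilon_B(E^F) \otimes E^B \\ &= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_D\big( \varepsilon_C(E^A) E^C \big)\, \widetilde\Omega^D{}_F \otimes \varepsilon_B(E^F) E^B \\ &= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_D(E^A)\, \widetilde\Omega^D{}_F \otimes E^F . \end{aligned} \]
This shows that ∇ is identical to the connection induced by $\widetilde\nabla$. The symmetry relations (2.30) follow directly from A-linearity and (2.27).

We are now in a position to state the Cartan structure equations in a simple form.

Theorem 2.19. Let $\widetilde\Omega^A{}_B$ and $\widehat\Omega^A{}_B$ be as in Proposition 2.18. Then the curvature and torsion components of the induced connection on $\tilde\Omega^1_D(\mathcal{A})$ are given by
\[ R^A{}_B = \varepsilon_C(E^A)\, \delta\widehat\Omega^C{}_B + \widehat\Omega^A{}_C\, \widehat\Omega^C{}_B + \delta\varepsilon_C(E^A)\, \delta\varepsilon_B(E^C), \]
\[ T^A = \varepsilon_B(E^A)\, \delta E^B + \widehat\Omega^A{}_B\, E^B . \]

Proof. With Eqs. (2.21, 29, 30) and the Leibniz rule, we get
\[ \begin{aligned} R^A{}_B &= \delta\widehat\Omega^A{}_B + \big( \widehat\Omega^A{}_C - \delta\varepsilon_C(E^A) \big) \big( \widehat\Omega^C{}_B - \delta\varepsilon_B(E^C) \big) \\ &= \delta\big( \varepsilon_C(E^A)\, \widehat\Omega^C{}_B \big) + \big( \widehat\Omega^A{}_C - \delta\varepsilon_C(E^A) \big) \big( \widehat\Omega^C{}_B - \delta\varepsilon_B(E^C) \big) \\ &= \varepsilon_C(E^A)\, \delta\widehat\Omega^C{}_B + \widehat\Omega^A{}_C\, \widehat\Omega^C{}_B + \delta\varepsilon_C(E^A)\, \delta\varepsilon_B(E^C) - \widehat\Omega^A{}_C\, \delta\varepsilon_B(E^C). \end{aligned} \]
The last term does in fact not contribute to the curvature, as can be seen after tensoring with $E^B$:
\[ \widehat\Omega^A{}_C\, \delta\varepsilon_B(E^C) \otimes E^B = -\delta\widehat\Omega^A{}_B \otimes E^B + \delta\widehat\Omega^A{}_C \otimes \varepsilon_B(E^C) E^B = 0, \]
where we have used the Leibniz rule, the relations (2.30) and A-linearity of the tensor product. To compute the components of the torsion, we use Eqs. (2.22, 29) analogously,
\[ T^A = \delta E^A + \widehat\Omega^A{}_B E^B - \delta\varepsilon_B(E^A) E^B = \delta E^A + \widehat\Omega^A{}_B E^B - \delta E^A + \varepsilon_B(E^A)\, \delta E^B , \]
which gives the result.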
The Cartan structure equations of Theorem 2.19 are considerably simpler than those one would get directly from (2.29) and (2.21, 22). The price to be paid is that the components $\widehat\Omega^A{}_B$ are not quite independent from each other, but of course they can easily be expressed in terms of the arbitrary components $\widetilde\Omega^A{}_B$ according to Proposition 2.18. Therefore, the equations of Theorem 2.19 are useful e.g. for determining connections on $\tilde\Omega^1_D(\mathcal{A})$ with special properties. We refer the reader to [CFG] for an explicit application of the Cartan structure equations.

2.2. The N = (1, 1) formulation of non-commutative geometry.

In this section, we introduce the non-commutative generalization of the description of Riemannian geometry by a set of N = (1, 1) spectral data, which was presented, for the classical case, in Sect. 2.2 of part I. The advantage over the N = 1 formulation is that now the algebra of differential forms is naturally represented on the Hilbert space H. Therefore, calculations in concrete examples and also the study of cohomology rings will become much easier. There is the drawback that the algebra of differential forms is no longer closed under the ∗-operation on H, but we will introduce an alternative involution below and add further remarks in Sect. 5. The N = (1, 1) framework explained in the following will also provide the basis for the definition of various types of complex non-commutative geometries in Sects. 2.3 and 2.4.

2.2.1. The N = (1, 1) spectral data.

Definition 2.20. A quintuple (A, H, d, γ, ∗) is called a set of N = (1, 1) spectral data if
1) H is a separable Hilbert space;
2) A is a unital ∗-algebra acting faithfully on H by bounded operators;
3) d is a densely defined closed operator on H such that
i) $d^2 = 0$,
ii) for each a ∈ A, the commutator [ d, a ] extends uniquely to a bounded operator on H,
iii) the operator $\exp(-\varepsilon \triangle)$ with $\triangle = d d^* + d^* d$ is trace class for all ε > 0;
4) γ is a $\mathbb{Z}_2$-grading on H, i.e., $\gamma = \gamma^* = \gamma^{-1}$, such that
i) [ γ, a ] = 0 for all a ∈ A,
ii) { γ, d } = 0;
5) ∗ is a unitary operator on H such that
i) $* \, d = \zeta\, d^* *$ for some ζ ∈ C with |ζ| = 1,
ii) [ ∗, a ] = 0 for all a ∈ A.

Several remarks are in order. First of all, note that we can introduce the two operators
\[ D = d + d^* , \qquad \overline{D} = i\, (d - d^*) \]
on H which satisfy the relations
\[ D^2 = \overline{D}^{\,2} , \qquad \{\, D, \overline{D} \,\} = 0, \]
cf. Definition I 2.6. Thus, our notion of N = (1, 1) spectral data is an immediate generalization of a classical N = (1, 1) Dirac bundle – except for the boundedness conditions to be required on infinite-dimensional Hilbert spaces, and the existence of the additional operator ∗ (see the comments below).
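The relations between d and the pair D, D̄ use only $d^2 = 0$, and they can be confirmed on a small matrix. The following sketch assumes a hypothetical nilpotent 4×4 matrix `d` (chosen only for illustration) and checks that $D^2 = \overline{D}^{\,2} = d d^* + d^* d$ and that D and D̄ anticommute.

```python
# Finite-dimensional check of D = d + d^*, Dbar = i (d - d^*) for a nilpotent d.
# The matrix d is an illustrative assumption.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def dagger(X):
    return [[complex(X[j][i]).conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def is_zero(X, tol=1e-12):
    return all(abs(x) <= tol for row in X for x in row)

d = [[0, 1, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 2j],
     [0, 0, 0, 0]]          # satisfies d d = 0

d_star = dagger(d)
D = add(d, d_star)
D_bar = scale(1j, sub(d, d_star))
laplacian = add(matmul(d, d_star), matmul(d_star, d))

squares_agree = is_zero(sub(matmul(D, D), matmul(D_bar, D_bar)))
anticommute = is_zero(add(matmul(D, D_bar), matmul(D_bar, D)))
```

Both D and D̄ come out self-adjoint here, and their common square is the Laplacian appearing in condition 3 iii) of Definition 2.20.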
As in the N = 1 case, the $\mathbb{Z}_2$-grading $\gamma$ may always be introduced if not given from the start, simply by "doubling" the Hilbert space; see the remarks following Definition 2.1. Moreover, if $(\tilde{\mathcal{A}}, \tilde{\mathcal{H}}, \tilde{d}, \tilde{\gamma})$ is a quadruple satisfying Conditions 1–4 of Definition 2.20, we obtain a full set of N = (1, 1) spectral data by setting
$$ \mathcal{H} = \tilde{\mathcal{H}} \otimes \mathbb{C}^2, \qquad \mathcal{A} = \tilde{\mathcal{A}} \otimes 1_2, $$
$$ d = \tilde{d} \otimes \tfrac{1}{2}(1_2 + \tau_3) - \tilde{d}^* \otimes \tfrac{1}{2}(1_2 - \tau_3), $$
$$ \gamma = \tilde{\gamma} \otimes 1_2, \qquad * = 1_{\tilde{\mathcal{H}}} \otimes \tau_1 $$
with the Pauli matrices $\tau_i$ as usual. Note that, in this example, $\zeta = -1$, and the $*$-operator additionally satisfies $*^2 = 1$ as well as $[\,\gamma, *\,] = 0$.

The unitary operator $*$ was not present in our algebraic formulation of classical Riemannian geometry. But for a compact oriented manifold, the usual Hodge $*$-operator acting on differential forms satisfies all the properties listed above, after appropriate rescaling in each degree. (Moreover, one can always achieve $*^2 = 1$ or $\zeta = -1$.) For a non-orientable manifold, we can apply the construction of the previous paragraph to obtain a description of the differential forms in terms of N = (1, 1) spectral data including a Hodge operator. In our approach to the non-commutative case, we will make essential use of the existence of $*$, which we will also call Hodge operator, in analogy to the classical case.

2.2.2. Differential forms. We first introduce an involution, $\natural$, called complex conjugation, on the algebra of universal forms: $\natural : \Omega^\bullet(\mathcal{A}) \longrightarrow \Omega^\bullet(\mathcal{A})$ is the unique $\mathbb{C}$-anti-linear anti-automorphism such that
$$ \natural(a) \equiv a^\natural := a^*, \qquad \natural(\delta a) \equiv (\delta a)^\natural := \delta(a^*) \tag{2.31} $$
for all $a \in \mathcal{A}$. Here we choose a sign convention that differs from the N = 1 case, Eq. (2.1). If we write $\hat{\gamma}$ for the mod 2 reduction of the canonical $\mathbb{Z}$-grading on $\Omega^\bullet(\mathcal{A})$, we have
$$ \delta\,\natural\,\hat{\gamma} = \natural\,\delta. \tag{2.32} $$
We define a representation of $\Omega^\bullet(\mathcal{A})$ on $\mathcal{H}$, again denoted by $\pi$, by
$$ \pi(a) := a, \qquad \pi(\delta a) := [\,d, a\,] \tag{2.33} $$
for all $a \in \mathcal{A}$. The map $\pi$ is a $\mathbb{Z}_2$-graded representation in the sense that
$$ \pi(\hat{\gamma}\,\omega\,\hat{\gamma}) = \gamma\,\pi(\omega)\,\gamma \tag{2.34} $$
for all $\omega \in \Omega^\bullet(\mathcal{A})$. Although the abstract algebra of universal forms is the same as in the N = 1 setting, the interpretation of the universal differential $\delta$ has changed: In the N = (1, 1) framework, it is represented on $\mathcal{H}$ by the nilpotent operator $d$, instead of the self-adjoint Dirac operator $D$, as before. In particular, we now have
$$ \pi(\delta\omega) = [\,d, \pi(\omega)\,]_g \tag{2.35} $$
for all $\omega \in \Omega^\bullet(\mathcal{A})$, where $[\cdot,\cdot]_g$ denotes the graded commutator (defined with the canonical $\mathbb{Z}_2$-grading on $\pi(\Omega^\bullet(\mathcal{A}))$, see (2.34)). The validity of Eq. (2.35) is the main difference between the N = (1, 1) and the N = 1 formalism. It ensures that there do not exist any forms $\omega \in \Omega^p(\mathcal{A})$ with $\pi(\omega) = 0$ but $\pi(\delta\omega) \neq 0$; in other words:

Proposition 2.21. The graded vector space
$$ J = \bigoplus_{k=0}^{\infty} J^k, \qquad J^k := \ker \pi\,|_{\Omega^k(\mathcal{A})} $$
with $\pi$ defined in (2.33) is a two-sided graded differential $\natural$-ideal of $\Omega^\bullet(\mathcal{A})$.

Proof. The first two properties are obvious, the third one is the content of Eq. (2.35). Using (2.31) and the relations satisfied by the Hodge $*$-operator according to part 5) of Definition 2.20, we find that
$$ \pi\bigl((\delta a)^\natural\bigr) = \pi(\delta(a^*)) = [\,d, a^*\,] = [\,a, d^*\,]^* = \zeta\,[\,a, * d *^{-1}\,]^* = \zeta * [\,a, d\,]^* *^{-1} = -\zeta * \pi(\delta a)^* *^{-1}, $$
which implies
$$ \pi\bigl(\omega^\natural\bigr) = (-\zeta)^k * \pi(\omega)^* *^{-1} \tag{2.36} $$
for all $\omega \in \Omega^k(\mathcal{A})$. In particular, $J = \ker \pi$ is a $\natural$-ideal.
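As a sketch of why Eq. (2.35) holds, consider a monomial $\omega = a_0\,\delta a_1 \cdots \delta a_k$ (every universal form is a sum of such terms). The only input is $d^2 = 0$; the abbreviation $\omega_j := [\,d, a_j\,]$ is introduced here for brevity and is not notation from the text, and domain questions are ignored:

```latex
\begin{align*}
\{ d, [\,d, a\,] \} &= d\,(da - ad) + (da - ad)\,d = d^2 a - a\,d^2 = 0, \\
[\,d,\; a_0\,\omega_1 \cdots \omega_k\,]_g
  &= [\,d, a_0\,]\,\omega_1 \cdots \omega_k
   + a_0 \sum_{j=1}^{k} (-1)^{j-1}\,\omega_1 \cdots \{ d, \omega_j \} \cdots \omega_k \\
  &= [\,d, a_0\,]\,[\,d, a_1\,] \cdots [\,d, a_k\,]
   = \pi(\delta a_0\,\delta a_1 \cdots \delta a_k) = \pi(\delta\omega).
\end{align*}
```

In the N = 1 setting the analogous anticommutator is $\{ D, [\,D, a\,] \} = [\,D^2, a\,]$, which need not vanish; this is the point at which the two formalisms differ.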
As a consequence of this proposition, the algebra of differential forms
$$ \Omega_d^\bullet(\mathcal{A}) := \bigoplus_{k=0}^{\infty} \Omega_d^k(\mathcal{A}), \qquad \Omega_d^k(\mathcal{A}) := \Omega^k(\mathcal{A})/J^k, \tag{2.37} $$
is represented on the Hilbert space $\mathcal{H}$ via $\pi$. For later purposes, we will also need an involution on $\Omega_d^\bullet(\mathcal{A})$, and according to Proposition 2.21, this is given by the anti-linear map $\natural$ of (2.31). Note that the "natural" involution $\omega \mapsto \omega^*$, see Eq. (2.1), which is inherited from $\mathcal{H}$ and was used in the N = 1 case, is no longer available here: The space $\pi(\Omega^k(\mathcal{A}))$ is not closed under taking adjoints, because $d$ is not self-adjoint. In summary, the space $\Omega_d^\bullet(\mathcal{A})$ is a unital graded differential $\natural$-algebra and the representation $\pi$ of $\Omega^\bullet(\mathcal{A})$ determines a representation of $\Omega_d^\bullet(\mathcal{A})$ on $\mathcal{H}$ as a unital differential algebra.

2.2.3. Integration. The integration theory follows the same lines as in the N = 1 case: The state $\int\!\!\!\!\!-$ is given as in Definition 2.3 with $D^2$ written as $\triangle = dd^* + d^*d$. Again, we make Assumption 2.4 about the cyclicity of the integral. This yields a sesqui-linear form on $\Omega_d^\bullet(\mathcal{A})$ as before:
$$ (\omega, \eta) = \int\!\!\!\!\!-\; \omega\,\eta^* \tag{2.38} $$
for all $\omega, \eta \in \Omega_d^\bullet(\mathcal{A})$, where we have dropped the representation symbols $\pi$ under the integral. Because of the presence of the Hodge $*$-operator, the form $(\cdot,\cdot)$ has an additional feature in the N = (1, 1) setting:
Proposition 2.22. If the phase in part 5) of Definition 2.20 is $\zeta = \pm 1$, then the inner product defined in Eq. (2.38) behaves like a real functional with respect to the involution $\natural$, i.e., for $\omega, \eta \in \Omega_d^\bullet(\mathcal{A})$ we have
$$ (\omega^\natural, \eta^\natural) = \overline{(\omega, \eta)}, $$
where the bar denotes ordinary complex conjugation.

Proof. First, observe that the Hodge operator commutes with the Laplacian, which is verified e.g. by taking the adjoint of the relation $*\,d = \zeta\, d^* *$. Then the claim follows immediately using Eq. (2.36), unitarity of the Hodge operator, and cyclicity of the trace on $\mathcal{H}$: Let $\omega \in \Omega_d^p(\mathcal{A})$, $\eta \in \Omega_d^q(\mathcal{A})$, then
$$ (\omega^\natural, \eta^\natural) = \int\!\!\!\!\!-\; \omega^\natural\,(\eta^\natural)^* = (-\zeta)^p\,(-\bar{\zeta})^q \int\!\!\!\!\!-\; *\,\omega^* *^{-1} *\,\eta\, *^{-1} = (-\zeta)^{p-q} \int\!\!\!\!\!-\; \omega^*\,\eta = (-\zeta)^{p-q} \int\!\!\!\!\!-\; \eta\,\omega^* = (-\zeta)^{p-q}\;\overline{(\omega, \eta)}; $$
again, we have suppressed the representation symbol $\pi$. The claim follows since the $\mathbb{Z}_2$-grading implies $(\omega, \eta) = 0$ unless $p - q \equiv 0 \pmod 2$.

Note that in examples, p- and q-forms for $p \neq q$ are often orthogonal with respect to the inner product $(\cdot,\cdot)$; then Proposition 2.22 holds independently of the value of $\zeta$. Since $\Omega_d^\bullet(\mathcal{A})$ is a $\natural$- and not a $*$-algebra, Proposition 2.5 is to be replaced by

Proposition 2.23. The graded kernel $K$, see Eq. (2.5), of the sesqui-linear form $(\cdot,\cdot)$ is a two-sided graded $\natural$-ideal of $\Omega_d^\bullet(\mathcal{A})$.

Proof. The proof that $K$ is a two-sided graded ideal is identical to the one of Proposition 2.5. That $K$ is closed under $\natural$ follows immediately from the proof of Proposition 2.22.

The remainder of Sect. 2.1.3 carries over to the N = (1, 1) case, with the only differences that $\tilde{\Omega}^\bullet(\mathcal{A})$ is a $\natural$-algebra and that the quotients $\Omega^k(\mathcal{A})/\bigl(K^k + \delta K^{k-1}\bigr) \cong \tilde{\Omega}^k(\mathcal{A})/\delta K^{k-1}$ are denoted by $\tilde{\Omega}_d^k(\mathcal{A})$. While $\Omega_d^\bullet(\mathcal{A})$ is a differential algebra (by construction), $\tilde{\Omega}^\bullet(\mathcal{A})$ is not, in general, a differential algebra, because the ideal $K$ may not be a differential ideal (i.e. there may exist $\omega \in K^{k-1}$ with $\delta\omega \notin K^k$). However, $K$ is trivial in many interesting examples. If $K$ is trivial then the algebra $\tilde{\Omega}^\bullet(\mathcal{A})$ of square-integrable forms is a differential algebra which is faithfully represented on $\tilde{\mathcal{H}}^\bullet$.

2.2.4. Unitary connections and scalar curvature. Except for the notions of unitary connections and scalar curvature, all definitions and results of Sects. 2.1.4–8 literally apply to the N = (1, 1) case as well. The two exceptions explicitly involve the $*$-involution on the algebra of differential forms, which is no longer available now. Therefore, we have to modify the definitions for N = (1, 1) non-commutative geometry as follows:

Definition 2.24. A connection $\nabla$ on a Hermitian vector bundle $(\mathcal{E}, \langle\cdot,\cdot\rangle)$ over an N = (1, 1) non-commutative space is called unitary if
$$ d\,\langle s, t \rangle = \langle \nabla s, t \rangle + \langle s, \nabla t \rangle $$
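for all $s, t \in \mathcal{E}$; the Hermitian structure on the rhs is extended to $\mathcal{E}$-valued differential forms by
$$ \langle\, \omega \otimes s, t \,\rangle = \omega\,\langle s, t \rangle, \qquad \langle\, s, \eta \otimes t \,\rangle = \langle s, t \rangle\,\eta^\natural $$
for all $\omega, \eta \in \tilde{\Omega}_d^\bullet(\mathcal{A})$ and $s, t \in \mathcal{E}$.

Definition 2.25. The scalar curvature of a connection $\nabla$ on $\tilde{\Omega}_d^1(\mathcal{A})$ is defined by
$$ r(\nabla) = \bigl(E^{B\,\natural}\bigr)^{\mathrm{ad}}_{R}\,(\mathrm{Ric}_B) \in \tilde{\mathcal{H}}^0 . $$
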
2.2.5. Remarks on the relation of N = 1 and N = (1, 1) spectral data. The definitions of N = 1 and N = (1, 1) non-commutative spectral data provide two different generalizations of classical Riemannian differential geometry. In the latter context, one can always find an N = (1, 1) description of a manifold originally given by an N = 1 set of data (see part I), whereas a non-commutative set of N = (1, 1) spectral data seems to require a different mathematical structure than a spectral triple, because of the additional generalized Dirac operator which must be given on the Hilbert space. Thus, it is a natural and important question under which conditions on an N = 1 spectral triple $(\mathcal{A}, \mathcal{H}, D)$ there exists an associated N = (1, 1) set of data $(\mathcal{A}, \tilde{\mathcal{H}}, d, *)$ over the same non-commutative space $\mathcal{A}$.

We have not been able yet to answer the question of how to pass from N = 1 to N = (1, 1) data in a general way. But in the following we present a procedure that might lead to a solution. Our guideline is the classical case, where the main step in passing from N = 1 to N = (1, 1) data is to replace the Hilbert space $\mathcal{H} = L^2(S)$ by $\tilde{\mathcal{H}} = L^2(S \otimes S)$ carrying two actions of the Clifford algebra and therefore two anti-commuting Dirac operators $D$ and $\bar{D}$, which yield a description equivalent to the one involving the nilpotent differential $d$, see the remark after Definition 2.20. It is plausible that there are other approaches to this question, in particular approaches of a more operator algebraic nature, e.g. using a "Kasparov product of spectral triples", but we will not enter these matters here.

The first problem one meets when trying to copy the classical step from N = 1 to N = (1, 1) is that $\mathcal{H}$ should be an $\mathcal{A}$-bi-module. To ensure this, we require that the set of N = 1 (even) spectral data $(\mathcal{A}, \mathcal{H}, D, \gamma)$ is endowed with a real structure [Co4], i.e. that there exists an anti-unitary operator $J$ on $\mathcal{H}$ such that
$$ J^2 = \epsilon\,1, \qquad J\gamma = \epsilon'\,\gamma J, \qquad JD = DJ $$
for some (independent) signs $\epsilon, \epsilon' = \pm 1$, and such that, in addition, $JaJ^*$ commutes with $b$ and $[\,D, b\,]$ for all $a, b \in \mathcal{A}$. This definition of a real structure was introduced by Connes in [Co4]; $J$ is of course a variant of Tomita's modular conjugation (cf. the next subsection). In the present context, $J$ provides a canonical right $\mathcal{A}$-module structure on $\mathcal{H}$ by defining
$$ \xi \cdot a := J a^* J^* \xi $$
for all $a \in \mathcal{A}$, $\xi \in \mathcal{H}$, see [Co4]. We can extend this to a right action of $\Omega_D^1(\mathcal{A})$ on $\mathcal{H}$ if we set
$$ \xi \cdot \omega := J \omega^* J^* \xi $$
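for all $\omega \in \Omega_D^1(\mathcal{A})$ and $\xi \in \mathcal{H}$; for simplicity, the representation symbol $\pi$ has been omitted. Note that by the assumptions on $J$, the right action commutes with the left action of $\mathcal{A}$. Thus $\mathcal{H}$ is an $\mathcal{A}$-bi-module, and we can form tensor products of bi-modules over the algebra $\mathcal{A}$ just as in the classical case.

From now on, we assume that $\mathcal{H}$ contains a dense projective left $\mathcal{A}$-module $\mathring{\mathcal{H}}$ which is stable under $J$ and $\gamma$. In particular, $\mathring{\mathcal{H}}$ is itself an $\mathcal{A}$-bi-module. Since $\mathring{\mathcal{H}}$ is projective, it carries a Hermitian structure, see Definition 2.8, that induces a scalar product on $\mathring{\mathcal{H}} \otimes_{\mathcal{A}} \mathring{\mathcal{H}}$ (see also Sect. 4.3). We shall denote by $\tilde{\mathcal{H}}$ the Hilbert space completion of $\mathring{\mathcal{H}} \otimes_{\mathcal{A}} \mathring{\mathcal{H}}$ with respect to this scalar product.

The real structure $J$ allows us to define the anti-linear "flip" operator
$$ \Psi : \Omega_D^1(\mathcal{A}) \otimes_{\mathcal{A}} \mathring{\mathcal{H}} \longrightarrow \mathring{\mathcal{H}} \otimes_{\mathcal{A}} \Omega_D^1(\mathcal{A}), \qquad \omega \otimes \xi \longmapsto J\xi \otimes \omega^* . $$
It is straightforward to verify that $\Psi$ is well-defined and that it satisfies $\Psi(a\,s) = \Psi(s)\,a^*$ for all $a \in \mathcal{A}$, $s \in \Omega_D^1(\mathcal{A}) \otimes_{\mathcal{A}} \mathring{\mathcal{H}}$.

Since $\mathring{\mathcal{H}}$ is projective, it admits connections
$$ \nabla : \mathring{\mathcal{H}} \longrightarrow \Omega_D^1(\mathcal{A}) \otimes_{\mathcal{A}} \mathring{\mathcal{H}}, $$
i.e. $\mathbb{C}$-linear maps such that
$$ \nabla(a\xi) = \delta a \otimes \xi + a\,\nabla\xi $$
for all $a \in \mathcal{A}$ and $\xi \in \mathring{\mathcal{H}}$. We assume that $\nabla$ commutes with the grading $\gamma$ on $\mathring{\mathcal{H}}$, i.e. $\nabla \gamma\,\xi = (1 \otimes \gamma)\,\nabla\xi$ for all $\xi \in \mathring{\mathcal{H}}$. For each connection $\nabla$ on $\mathring{\mathcal{H}}$, there is an "associated right-connection" $\bar{\nabla}$ defined with the help of the flip $\Psi$:
$$ \bar{\nabla} : \mathring{\mathcal{H}} \longrightarrow \mathring{\mathcal{H}} \otimes_{\mathcal{A}} \Omega_D^1(\mathcal{A}), \qquad \xi \longmapsto -\Psi(\nabla J^* \xi) . $$
$\bar{\nabla}$ is again $\mathbb{C}$-linear and satisfies
$$ \bar{\nabla}(\xi a) = \xi \otimes \delta a + (\bar{\nabla}\xi)\,a . $$
A connection $\nabla$ on $\mathring{\mathcal{H}}$, together with its associated right connection $\bar{\nabla}$, induces a $\mathbb{C}$-linear "tensor product connection" $\tilde{\nabla}$ on $\mathring{\mathcal{H}} \otimes_{\mathcal{A}} \mathring{\mathcal{H}}$ of the form
$$ \tilde{\nabla} : \mathring{\mathcal{H}} \otimes_{\mathcal{A}} \mathring{\mathcal{H}} \longrightarrow \mathring{\mathcal{H}} \otimes_{\mathcal{A}} \Omega_D^1(\mathcal{A}) \otimes_{\mathcal{A}} \mathring{\mathcal{H}}, \qquad \xi_1 \otimes \xi_2 \longmapsto \bar{\nabla}\xi_1 \otimes \xi_2 + \xi_1 \otimes \nabla\xi_2 . $$
Because of the position of the factor $\Omega_D^1(\mathcal{A})$, $\tilde{\nabla}$ is not quite a connection in the usual sense.

In the classical case, the last ingredient needed for the definition of the two Dirac operators of an N = (1, 1) Dirac bundle are the two anti-commuting Clifford actions on $\tilde{\mathcal{H}}$. Their obvious generalizations to the non-commutative case are the $\mathbb{C}$-linear maps
$$ c : \mathring{\mathcal{H}} \otimes_{\mathcal{A}} \Omega_D^1(\mathcal{A}) \otimes_{\mathcal{A}} \mathring{\mathcal{H}} \longrightarrow \mathring{\mathcal{H}} \otimes_{\mathcal{A}} \mathring{\mathcal{H}}, \qquad \xi_1 \otimes \omega \otimes \xi_2 \longmapsto \xi_1 \otimes \omega\,\xi_2 $$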
and
$$ \bar{c} : \mathring{\mathcal{H}} \otimes_{\mathcal{A}} \Omega_D^1(\mathcal{A}) \otimes_{\mathcal{A}} \mathring{\mathcal{H}} \longrightarrow \mathring{\mathcal{H}} \otimes_{\mathcal{A}} \mathring{\mathcal{H}}, \qquad \xi_1 \otimes \omega \otimes \xi_2 \longmapsto \xi_1\,\omega \otimes \gamma\,\xi_2 . $$
With these, we may introduce two operators $D$ and $\bar{D}$ on $\mathring{\mathcal{H}} \otimes_{\mathcal{A}} \mathring{\mathcal{H}}$ in analogy to the classical case:
$$ D := c \circ \tilde{\nabla}, \qquad \bar{D} := \bar{c} \circ \tilde{\nabla} . $$
In order to obtain a set of N = (1, 1) spectral data, one has to find a connection $\nabla$ on $\mathring{\mathcal{H}}$ which makes the operators $D$ and $\bar{D}$ essentially self-adjoint and ensures that the relations $D^2 = \bar{D}^2$ and $\{ D, \bar{D} \} = 0$ of Definition 2.20 are satisfied. The $\mathbb{Z}_2$-grading on $\tilde{\mathcal{H}}$ is simply the tensor product grading, and the Hodge operator can be taken to be $* = \gamma \otimes 1$.

In Sect. 4 below, we will verify these conditions in the example of the non-commutative torus. In the general case, we have, up to now, not been able to prove the existence of a connection $\nabla$ on $\mathring{\mathcal{H}}$ which supplies $D$ and $\bar{D}$ with the correct algebraic properties, but the naturality of the construction presented above as well as the similarity with the procedure of Sect. I 2.2.2 lead us to expect that this problem can be solved in many cases of interest. More precisely, we expect that the relation $\{ D, \bar{D} \} = 0$ can be satisfied under rather general assumptions, whereas it may often be appropriate to deal with a non-vanishing operator $D^2 - \bar{D}^2$ that generates an $S^1$-action.

2.2.6. Riemannian and Spin$^c$ "manifolds" in non-commutative geometry. In this section, we address the following question: What is the additional structure that makes an N = (1, 1) non-commutative space into a non-commutative "manifold", into a Spin$^c$ "manifold", or into a quantized phase space? There is a definition of non-commutative manifolds in terms of K-homology, see e.g. [Co1]. In our search for the characteristic features of non-commutative manifolds we will, as before, be guided by the classical case and by the principle that they should be natural from the point of view of quantum physics.

Extrapolating from classical geometry, we are e.g. led to the following requirement an N = (1, 1) space $(\mathcal{A}, \mathcal{H}, d, \gamma, *)$ should satisfy in order to describe a "manifold": The data must extend to a set of N = 2 spectral data $(\mathcal{A}, \mathcal{H}, d, T, *)$ where $T$ is a self-adjoint operator on $\mathcal{H}$ such that
i) $[\,T, a\,] = 0$ for all $a \in \mathcal{A}$;
ii) $[\,T, d\,] = d$;
iii) $T$ has integral spectrum, and $\gamma$ is the mod 2 reduction of $T$, i.e. $\gamma = \pm 1$ on $\mathcal{H}_\pm$, where $\mathcal{H}_\pm = \mathrm{span}\,\{\, \xi \in \mathcal{H} \mid T\xi = n\,\xi \text{ for some } n \in \mathbb{Z},\ (-1)^n = \pm 1 \,\}$.
Such N = 2 spectral data have been used in Sect. I 1.2 already, and have also been briefly discussed in Sect. I 3.

Before we can formulate further properties that we suppose to characterize non-commutative manifolds, we recall some basic facts about Tomita-Takesaki theory. Let $\mathcal{M}$ be a von Neumann algebra acting on a separable Hilbert space $\mathcal{H}$, and assume that $\xi_0 \in \mathcal{H}$ is a cyclic and separating vector for $\mathcal{M}$, i.e.
$$ \overline{\mathcal{M}\,\xi_0} = \mathcal{H} $$
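and
$$ a\,\xi_0 = 0 \quad \Longrightarrow \quad a = 0 $$
for any $a \in \mathcal{M}$, respectively. Then we may define an anti-linear operator $S_0$ on $\mathcal{H}$ by setting
$$ S_0\, a\,\xi_0 = a^* \xi_0 $$
for all $a \in \mathcal{M}$. One can show that $S_0$ is closable, and we denote its closure by $S$. The polar decomposition of $S$ is written as
$$ S = J \Delta^{1/2}, $$
where $J$ is an anti-unitary involutive operator, referred to as modular conjugation, and the so-called modular operator $\Delta$ is a positive self-adjoint operator on $\mathcal{H}$. The fundamental result of Tomita-Takesaki theory is the following theorem:
$$ J \mathcal{M} J = \mathcal{M}', \qquad \Delta^{it} \mathcal{M} \Delta^{-it} = \mathcal{M} $$
for all $t \in \mathbb{R}$; here, $\mathcal{M}'$ denotes the commutant of $\mathcal{M}$ on $\mathcal{H}$. Furthermore, the vector state $\omega_0(\cdot) := (\xi_0, \cdot\;\xi_0)$ is a KMS-state for the automorphism $\sigma_t := \mathrm{Ad}_{\Delta^{it}}$ of $\mathcal{M}$, i.e.
$$ \omega_0(\sigma_t(a)\, b) = \omega_0(b\, \sigma_{t-i}(a)) $$
for all $a, b \in \mathcal{M}$ and all real $t$.

Let $(\mathcal{A}, \mathcal{H}, d, T, *)$ be a set of N = 2 spectral data coming from an N = (1, 1) space as above. We define the analogue $\mathrm{Cl}_D(\mathcal{A})$ of the space of sections of the Clifford bundle,
$$ \mathrm{Cl}_D(\mathcal{A}) = \{\, a_0\,[\,D, a_1\,] \cdots [\,D, a_k\,] \mid k \in \mathbb{Z}_+,\ a_i \in \mathcal{A} \,\}, $$
where $D = d + d^*$, and, corresponding to the second generalized Dirac operator $\bar{D} = i(d - d^*)$,
$$ \mathrm{Cl}_{\bar{D}}(\mathcal{A}) = \{\, a_0\,[\,\bar{D}, a_1\,] \cdots [\,\bar{D}, a_k\,] \mid k \in \mathbb{Z}_+,\ a_i \in \mathcal{A} \,\}. $$
In the classical setting, the sections $\mathrm{Cl}_D(\mathcal{A})$ and $\mathrm{Cl}_{\bar{D}}(\mathcal{A})$ operate on $\mathcal{H}$ by the two actions $c$ and $\bar{c}$, respectively, see Definition I 2.6. In the general case, we notice that, in contrast to the algebra $\Omega_d^\bullet(\mathcal{A})$ introduced before, $\mathrm{Cl}_D(\mathcal{A})$ and $\mathrm{Cl}_{\bar{D}}(\mathcal{A})$ form $*$-algebras of operators on $\mathcal{H}$, but are neither $\mathbb{Z}$-graded nor differential.

We want to apply Tomita-Takesaki theory to the von Neumann algebra $\mathcal{M} := \mathrm{Cl}_D(\mathcal{A})''$. Suppose there exists a vector $\xi_0 \in \mathcal{H}$ which is cyclic and separating for $\mathcal{M}$, and let $J$ be the anti-unitary conjugation associated to $\mathcal{M}$ and $\xi_0$. Suppose, moreover, that for all $a \in {}^J\!\mathcal{A} := J\mathcal{A}J$ the operator $[\,\bar{D}, a\,]$ uniquely extends to a bounded operator on $\mathcal{H}$. Then we can form the algebra of bounded operators $\mathrm{Cl}_{\bar{D}}({}^J\!\mathcal{A})$ on $\mathcal{H}$ as above. The properties $J\mathcal{A}J \subset \mathcal{A}'$ and $\{ D, \bar{D} \} = 0$ imply that $\mathrm{Cl}_D(\mathcal{A})$ and $\mathrm{Cl}_{\bar{D}}({}^J\!\mathcal{A})$ commute in the graded sense; to arrive at truly commuting algebras, we first decompose $\mathrm{Cl}_{\bar{D}}({}^J\!\mathcal{A})$ into a direct sum
$$ \mathrm{Cl}_{\bar{D}}({}^J\!\mathcal{A}) = \mathrm{Cl}^+_{\bar{D}}({}^J\!\mathcal{A}) \oplus \mathrm{Cl}^-_{\bar{D}}({}^J\!\mathcal{A}) $$
with
$$ \mathrm{Cl}^\pm_{\bar{D}}({}^J\!\mathcal{A}) = \{\, \omega \in \mathrm{Cl}_{\bar{D}}({}^J\!\mathcal{A}) \mid \gamma\,\omega = \pm\,\omega\,\gamma \,\}. $$
Then we define the "twisted algebra" $\widetilde{\mathrm{Cl}}_{\bar{D}}({}^J\!\mathcal{A}) := \mathrm{Cl}^+_{\bar{D}}({}^J\!\mathcal{A}) \oplus \gamma\,\mathrm{Cl}^-_{\bar{D}}({}^J\!\mathcal{A})$. This algebra commutes with $\mathrm{Cl}_D(\mathcal{A})$.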
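The effect of the twist by $\gamma$ can be made explicit. In the sketch below, $a$ is an element of the first Clifford algebra and $b$ an element of the second one (built from $J\mathcal{A}J$); both are assumed homogeneous with respect to $\gamma$, with parities $|a|, |b| \in \{0, 1\}$, and graded commutativity is read as $ab = (-1)^{|a||b|}\,ba$:

```latex
\begin{align*}
|b| = 0: &\quad a\,b = b\,a, \\
|b| = 1: &\quad a\,(\gamma b) = (-1)^{|a|}\,\gamma\,a\,b
          = (-1)^{|a|}\,(-1)^{|a|}\,\gamma\,b\,a = (\gamma b)\,a .
\end{align*}
```

Even elements are therefore left unchanged by the twist, while each odd element $b$ is replaced by $\gamma b$, which commutes with $a$ in the ordinary sense.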
We propose the following definitions: The N = 2 spectral data $(\mathcal{A}, \mathcal{H}, d, T, *)$ describe a non-commutative manifold if
$$ \widetilde{\mathrm{Cl}}_{\bar{D}}({}^J\!\mathcal{A}) = J\,\mathrm{Cl}_D(\mathcal{A})\,J . $$
Furthermore, inspired by classical geometry, we say that a non-commutative manifold $(\mathcal{A}, \mathcal{H}, d, T, *, \xi_0)$ is spin$^c$ if the Hilbert space factorizes as a $\mathrm{Cl}_D(\mathcal{A}) \otimes \widetilde{\mathrm{Cl}}_{\bar{D}}({}^J\!\mathcal{A})$ module in the form
$$ \mathcal{H} = \mathcal{H}_D \otimes_{\mathcal{Z}} \mathcal{H}_{\bar{D}}, $$
where $\mathcal{Z}$ denotes the center of $\mathcal{M}$.

Next, we introduce a notion of "quantized phase space". We consider a set of N = (1, 1) spectral data $(\mathcal{A}, \mathcal{H}, d, \gamma, *)$, where we now think of $\mathcal{A}$ as the algebra of phase space "functions" (i.e. of pseudo-differential operators, in the Schrödinger picture of quantum mechanics) rather than functions over configuration space. We are, therefore, not postulating the existence of a cyclic and separating vector for the algebra $\mathrm{Cl}_D(\mathcal{A})$. Instead, we define for each $\beta > 0$ the temperature or KMS state
$$ \int\!\!\!\!\!-_{\beta} : \mathrm{Cl}_D(\mathcal{A}) \longrightarrow \mathbb{C}, \qquad \omega \longmapsto \int\!\!\!\!\!-_{\beta}\; \omega := \frac{\mathrm{Tr}_{\mathcal{H}}\bigl(\omega\, e^{-\beta D^2}\bigr)}{\mathrm{Tr}_{\mathcal{H}}\bigl(e^{-\beta D^2}\bigr)}, $$
with no limit $\beta \to 0$ taken, in contrast to Definition 2.3. The $\beta$-integral $\int\!\!\!\!\!-_{\beta}$ clearly is a faithful state, and through the GNS-construction we obtain a faithful representation of $\mathrm{Cl}_D(\mathcal{A})$ on a Hilbert space $\mathcal{H}_\beta$ with a cyclic and separating vector $\xi_\beta \in \mathcal{H}_\beta$ for $\mathcal{M}$. Each bounded operator $A \in B(\mathcal{H})$ on $\mathcal{H}$ induces a bounded operator $A_\beta$ on $\mathcal{H}_\beta$; this is easily seen by computing matrix elements of $A_\beta$,
$$ \langle A_\beta\, x, y \rangle = \int\!\!\!\!\!-_{\beta}\; A\, x\, y^* $$
for all $x, y \in \mathcal{M} \subset \mathcal{H}_\beta$, and using the explicit form of the $\beta$-integral. We denote the modular conjugation and the modular operator on $\mathcal{H}_\beta$ by $J_\beta$ and $\Delta_\beta$, respectively, and we assume that for each $a \in \mathcal{M}$ the commutator
$$ [\,\bar{D}, J_\beta\, a\, J_\beta\,] = \frac{1}{i}\,\frac{d}{dt}\,\bigl(e^{it\bar{D}}\bigr)_\beta\, J_\beta\, a\, J_\beta\, \bigl(e^{-it\bar{D}}\bigr)_\beta \Big|_{t=0} $$
defines a bounded operator on $\mathcal{H}_\beta$. Then we can define an algebra of bounded operators $\widetilde{\mathrm{Cl}}_{\bar{D}}({}^{J_\beta}\!\mathcal{A})$ on $\mathcal{H}_\beta$, which is contained in the commutant of $\mathrm{Cl}_D(\mathcal{A})$, and we say that the N = (1, 1) spectral data $(\mathcal{A}, \mathcal{H}, d, \gamma, *)$ describe a quantized phase space if the following equation holds:
$$ J_\beta\, \mathrm{Cl}_D(\mathcal{A})\, J_\beta = \widetilde{\mathrm{Cl}}_{\bar{D}}({}^{J_\beta}\!\mathcal{A}) . $$

2.3. Hermitian and Kähler non-commutative geometry. In this section, we introduce the spectral data describing complex non-commutative spaces, more specifically spaces that carry a Hermitian or a Kähler structure; the terminology is of course carried over from the classical case, see part I. Since these structures are more restrictive than the data of Riemannian non-commutative geometry, we will be able to derive some appealing properties of the space of differential forms. We also find a necessary condition for a set
of N = (1, 1) spectral data to extend to Hermitian data. A different approach to complex non-commutative geometry has been proposed in [BC].

2.3.1. Hermitian and N = (2,2) spectral data.

Definition 2.26. A set of data $(\mathcal{A}, \mathcal{H}, \partial, \bar\partial, T, \bar{T}, \gamma, *)$ is called a set of Hermitian spectral data if
1) the quintuple $(\mathcal{A}, \mathcal{H}, \partial + \bar\partial, \gamma, *)$ forms a set of N = (1, 1) spectral data;
2) $T$ and $\bar{T}$ are self-adjoint bounded operators on $\mathcal{H}$, $\partial$ and $\bar\partial$ are densely defined, closed operators on $\mathcal{H}$ such that the following (anti-)commutation relations hold:
$$ \partial^2 = \bar\partial^2 = 0, \qquad \{ \partial, \bar\partial \} = 0, $$
$$ [\,T, \partial\,] = \partial, \qquad [\,T, \bar\partial\,] = 0, $$
$$ [\,\bar{T}, \partial\,] = 0, \qquad [\,\bar{T}, \bar\partial\,] = \bar\partial, $$
$$ [\,T, \bar{T}\,] = 0; $$
3) for any $a \in \mathcal{A}$, $[\,T, a\,] = [\,\bar{T}, a\,] = 0$ and each of the operators $[\,\partial, a\,]$, $[\,\bar\partial, a\,]$ and $\{ \partial, [\,\bar\partial, a\,] \}$ extends uniquely to a bounded operator on $\mathcal{H}$;
4) the $\mathbb{Z}_2$-grading $\gamma$ satisfies $\{ \gamma, \partial \} = \{ \gamma, \bar\partial \} = 0$, $[\,\gamma, T\,] = [\,\gamma, \bar{T}\,] = 0$;
5) the Hodge $*$-operator satisfies
$$ *\,\partial = \zeta\, \bar\partial^* *, \qquad *\,\bar\partial = \zeta\, \partial^* * $$
for some phase $\zeta \in \mathbb{C}$.

Some remarks on this definition may be useful: The Jacobi identity and the equation $\{ \partial, \bar\partial \} = 0$ show that Condition 3 above is in fact symmetric in $\partial$ and $\bar\partial$. As in Sect. 2.2.1, a set $(\mathcal{A}, \mathcal{H}, \partial, \bar\partial, T, \bar{T})$ that satisfies the first three conditions but does not involve $\gamma$ or $*$, can be made into a complete set of Hermitian spectral data. In classical Hermitian geometry, the $*$-operator can always be taken to be the usual Hodge $*$-operator, up to a multiplicative redefinition in each degree, since complex manifolds are orientable.

Next, we describe conditions sufficient to equip a set of N = (1, 1) spectral data with a Hermitian structure. In Subsect. 2.3.2, Corollary 2.34, a necessary criterion is given as well.

Proposition 2.27. Let $(\mathcal{A}, \mathcal{H}, d, \gamma, *)$ be a set of N = (1, 1) spectral data with $[\,\gamma, *\,] = 0$, and let $T$ be a self-adjoint bounded operator on $\mathcal{H}$ such that
a) the operator $\partial := [\,T, d\,]$ is nilpotent: $\partial^2 = 0$;
b) $[\,T, \partial\,] = \partial$;
c) $[\,T, a\,] = 0$ for all $a \in \mathcal{A}$;
d) $[\,T, \omega\,] \in \pi(\Omega^1(\mathcal{A}))$ for all $\omega \in \pi(\Omega^1(\mathcal{A}))$;
e) the operator $\bar\partial := d - \partial$ satisfies $*\,\partial = \zeta\, \bar\partial^* *$, where $\zeta$ is the phase appearing in the relations of $*$ in the N = (1, 1) data;
f) $[\,T, \gamma\,] = 0$ and $[\,T, \bar{T}\,] = 0$, where $\bar{T} := - * T *^{-1}$.
Then $(\mathcal{A}, \mathcal{H}, \partial, \bar\partial, T, \bar{T}, \gamma, *)$ forms a set of Hermitian spectral data.
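A formal sketch of how the basic relations among $\partial$ and $\bar\partial$ follow from assumptions a) and b) together with $d^2 = 0$ is given below. Here $\bar\partial$ denotes the operator $d - \partial$ of assumption e), and domain questions are ignored; this is one way to see the relations and is not claimed to reproduce the argument of part I:

```latex
\begin{align*}
[\,T, \bar\partial\,] &= [\,T, d\,] - [\,T, \partial\,] = \partial - \partial = 0, \\
0 = [\,T, d^2\,] &= [\,T, d\,]\,d + d\,[\,T, d\,] = \{ \partial, d \}
   = 2\,\partial^2 + \{ \partial, \bar\partial \} = \{ \partial, \bar\partial \}, \\
\bar\partial^{\,2} &= (d - \partial)^2 = d^2 - \{ d, \partial \} + \partial^2 = 0 .
\end{align*}
```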
Notice that Conditions a)–d) are identical to those in Definition I 2.20 of Sect. I 2.4.1. Requirement e) will turn out to correspond to part e) of that definition. The relations in f) ensure compatibility of the operators $T$, $\gamma$ and $*$ and were not needed in the classical setting.

Proof. We check each of the conditions in Definition 2.26: The first one is satisfied by assumption, since $d = \partial + \bar\partial$ is the differential of N = (1, 1) spectral data. The equalities $\partial^2 = \bar\partial^2 = \{ \partial, \bar\partial \} = [\,T, \bar\partial\,] = 0$ follow from a) and b), as in the proof of Lemma I 2.21. With this, we compute
$$ [\,\bar{T}, \bar\partial\,] = -[\, * T *^{-1}, \bar\partial\,] = -\zeta * [\,T, \partial^*\,] *^{-1} = \bar\partial, $$
and since
$$ [\,\bar{T}, d\,] = [\, * T *^{-1}, d^*\,]^* = \zeta * [\,T, d\,]^* *^{-1} = \bar\partial, $$
we obtain $[\,\bar{T}, \partial\,] = 0$. The relation $[\,T, \bar{T}\,] = 0$ and self-adjointness of $T$ were part of the assumptions, and $\bar{T}^* = \bar{T}$ is clear from the unitarity of the Hodge $*$-operator. That $[\,\partial, a\,]$ and $[\,\bar\partial, a\,]$ are bounded for all $a \in \mathcal{A}$ follows from the corresponding property of $d$ and from the assumption that $T$ is bounded. As in the proof of Proposition I 2.22, one shows that $\{ \partial, [\,\bar\partial, a\,] \} \in \pi(\Omega_d^2(\mathcal{A}))$, and therefore $\{ \partial, [\,\bar\partial, a\,] \}$ is a bounded operator. $T$ and $*$ commute with all $a \in \mathcal{A}$ by assumption, and thus the same is true for $\bar{T}$. Using f) and the Jacobi identity, we get
$$ \{ \gamma, \partial \} = \{ \gamma, [\,T, d\,] \} = [\,T, \{ d, \gamma \}\,] + \{ d, [\,\gamma, T\,] \} = 0 $$
and $\{ \gamma, \bar\partial \} = \{ \gamma, d - \partial \} = 0$. By assumption, $\gamma$ commutes with $T$ and $*$, therefore also with $\bar{T}$. Finally, the relations of Condition 5 in Definition 2.26 between the $*$-operator and $\partial$, $\bar\partial$ follow directly from e) and $*\,d = \zeta\, d^* *$.

As in classical differential geometry, Kähler spaces arise as a special case of Hermitian geometry. In particular, Kähler spectral data provide a realization of the N = (2, 2) supersymmetry algebra:

Definition 2.28. Hermitian spectral data $(\mathcal{A}, \mathcal{H}, \partial, \bar\partial, T, \bar{T}, \gamma, *)$ are called N = (2, 2) or Kähler spectral data if
$$ \{ \partial, \bar\partial^* \} = \{ \bar\partial, \partial^* \} = 0, $$
$$ \{ \partial, \partial^* \} = \{ \bar\partial, \bar\partial^* \}. $$
Note that the first line is a consequence of the second one in classical complex geometry, but has to be imposed as a separate condition in the non-commutative setting.

One can also define Kähler spectral data, as in Sect. I 1.2, as containing a nilpotent differential $d$, together with its adjoint $d^*$, and two commuting U(1) generators $L^3$ and $J_0$, say, which satisfy the relations (I 1.49-51). This approach has the virtue that the complex structure familiar from classical differential geometry is already present in the algebraic formulation; see Eq. (I 1.54) for the precise relationship with $J_0$. Moreover, this way of introducing non-commutative complex geometry makes the role of Lie group symmetries of the spectral data explicit, which is somewhat hidden in the formulation of Definitions 2.26 and 2.28 and in Proposition 2.27: The presence of the U(1) × U(1)
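symmetry, acting in an appropriate way, ensures that a set of N = (1, 1) spectral data acquires an N = (2, 2) structure. Because of the advantages in the treatment of differential forms, we will stick to the setting using $\partial$ and $\bar\partial$ for the time being, but the data with generators $L^3$ and $J_0$ will appear naturally in the context of symplectic geometry in Sect. 2.5.

2.3.2. Differential forms. In the context of Hermitian non-commutative geometry, we have two differential operators $\partial$ and $\bar\partial$ at our disposal. We begin this section with the definition of an abstract algebra of universal forms which is appropriate for this situation.

Definition 2.29. A bi-differential algebra $\mathcal{B}$ is a unital algebra together with two anti-commuting nilpotent derivations $\delta, \bar\delta : \mathcal{B} \longrightarrow \mathcal{B}$. A homomorphism of bi-differential algebras $\varphi : \mathcal{B} \longrightarrow \mathcal{B}'$ is a unital algebra homomorphism which intertwines the derivations.

Definition 2.30. The algebra of complex universal forms $\Omega^{\bullet,\bullet}(\mathcal{A})$ over a unital algebra $\mathcal{A}$ is the (up to isomorphism) unique pair $(\iota, \Omega^{\bullet,\bullet}(\mathcal{A}))$ consisting of a unital bi-differential algebra $\Omega^{\bullet,\bullet}(\mathcal{A})$ and an injective unital algebra homomorphism $\iota : \mathcal{A} \longrightarrow \Omega^{\bullet,\bullet}(\mathcal{A})$ such that the following universal property holds: For any bi-differential algebra $\mathcal{B}$ and any unital algebra homomorphism $\varphi : \mathcal{A} \longrightarrow \mathcal{B}$, there is a unique homomorphism $\tilde\varphi : \Omega^{\bullet,\bullet}(\mathcal{A}) \longrightarrow \mathcal{B}$ of bi-differential algebras such that $\varphi = \tilde\varphi \circ \iota$.

The description of $\Omega^{\bullet,\bullet}(\mathcal{A})$ in terms of generators and relations is analogous to the case of $\Omega^\bullet(\mathcal{A})$, and it shows that $\Omega^{\bullet,\bullet}(\mathcal{A})$ is a bi-graded bi-differential algebra
$$ \Omega^{\bullet,\bullet}(\mathcal{A}) = \bigoplus_{r,s=0}^{\infty} \Omega^{r,s}(\mathcal{A}) \tag{2.39} $$
by declaring the generators $a$, $\delta a$, $\bar\delta a$ and $\delta\bar\delta a$, $a \in \mathcal{A}$, to have bi-degrees (0,0), (1,0), (0,1) and (1,1), respectively.

As in the N = (1, 1) framework, we introduce an involution $\natural$, called complex conjugation, on the algebra of complex universal forms, provided $\mathcal{A}$ is a $*$-algebra: $\natural : \Omega^{\bullet,\bullet}(\mathcal{A}) \longrightarrow \Omega^{\bullet,\bullet}(\mathcal{A})$ is the unique anti-linear anti-automorphism acting on generators by
$$ \natural(a) \equiv a^\natural := a^*, \qquad \natural(\delta a) \equiv (\delta a)^\natural := \bar\delta(a^*), $$
$$ \natural(\bar\delta a) \equiv (\bar\delta a)^\natural := \delta(a^*), \qquad \natural(\delta\bar\delta a) \equiv (\delta\bar\delta a)^\natural := \delta\bar\delta(a^*). \tag{2.40} $$
Let $\tilde\gamma$ be the $\mathbb{Z}_2$-reduction of the total grading on $\Omega^{\bullet,\bullet}(\mathcal{A})$, i.e., $\tilde\gamma = (-1)^{r+s}$ on $\Omega^{r,s}(\mathcal{A})$. Then it is easy to verify that
$$ \delta\,\natural\,\tilde\gamma = \natural\,\bar\delta. \tag{2.41} $$
This makes $\Omega^{\bullet,\bullet}(\mathcal{A})$ into a unital bi-graded bi-differential $\natural$-algebra.

Let $(\mathcal{A}, \mathcal{H}, \partial, \bar\partial, T, \bar{T}, \gamma, *)$ be a set of Hermitian spectral data. Then we define a $\mathbb{Z}_2$-graded representation $\pi$ of $\Omega^{\bullet,\bullet}(\mathcal{A})$ as a unital bi-differential algebra on $\mathcal{H}$ by setting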
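$$ \pi(a) = a, \qquad \pi(\delta a) = [\,\partial, a\,], \qquad \pi(\bar\delta a) = [\,\bar\partial, a\,], \qquad \pi(\delta\bar\delta a) = \{ \partial, [\,\bar\partial, a\,] \}. \tag{2.42} $$
Note that, by the Jacobi identity, the last equation is compatible with the anti-commutativity of $\delta$ and $\bar\delta$. As in the case of N = (1, 1) geometry, we have that
$$ \pi(\delta\omega) = [\,\partial, \pi(\omega)\,]_g, \qquad \pi(\bar\delta\omega) = [\,\bar\partial, \pi(\omega)\,]_g, \tag{2.43} $$
for any $\omega \in \Omega^{\bullet,\bullet}(\mathcal{A})$, and therefore the graded kernel of the representation $\pi$ has good properties: We define
$$ J^{\bullet,\bullet} := \bigoplus_{r,s=0}^{\infty} J^{r,s}, \qquad J^{r,s} := \{\, \omega \in \Omega^{r,s}(\mathcal{A}) \mid \pi(\omega) = 0 \,\}, \tag{2.44} $$
and we prove the following statement in the same way as Proposition 2.21:

Proposition 2.31. The set $J$ is a two-sided, bi-graded, bi-differential $\natural$-ideal of $\Omega^{\bullet,\bullet}(\mathcal{A})$.

We introduce the space of complex differential forms as
$$ \Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A}) := \bigoplus_{r,s=0}^{\infty} \Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A}), \qquad \Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A}) := \Omega^{r,s}(\mathcal{A})/J^{r,s}. \tag{2.45} $$
The algebra $\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})$ is a unital bi-graded bi-differential $\natural$-algebra, too, and the representation $\pi$ determines a representation, still denoted $\pi$, of this algebra on $\mathcal{H}$.

Due to the presence of the operators $T$ and $\bar{T}$ among the Hermitian spectral data, the image of $\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})$ under $\pi$ enjoys a property not present in the N = (1, 1) case:

Proposition 2.32. The representation of the algebra of complex differential forms satisfies
$$ \pi\bigl(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})\bigr) = \bigoplus_{r,s=0}^{\infty} \pi\bigl(\Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A})\bigr). \tag{2.46} $$
In particular, $\pi$ is a representation of $\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})$ as a unital, bi-graded, bi-differential $\natural$-algebra. The $\natural$-operation is implemented on $\pi\bigl(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})\bigr)$ with the help of the Hodge $*$-operator and the $*$-operation on $B(\mathcal{H})$:
$$ \natural : \pi\bigl(\Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A})\bigr) \longrightarrow \pi\bigl(\Omega^{s,r}_{\partial,\bar\partial}(\mathcal{A})\bigr), \qquad \omega \longmapsto \omega^\natural := (-\zeta)^{r+s} * \omega^* *^{-1} . $$

Proof. Let $\omega \in \pi\bigl(\Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A})\bigr)$. Then part 2) of Definition 2.26 implies that
$$ [\,T, \omega\,] = r\,\omega, \qquad [\,\bar{T}, \omega\,] = s\,\omega, $$
which gives the direct sum decomposition (2.46). It remains to show that the $\natural$-operation is implemented on the space $\pi\bigl(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})\bigr)$: For $a \in \mathcal{A}$, we have that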
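$$ \pi\bigl((\delta a)^\natural\bigr) = \pi\bigl(\bar\delta(a^*)\bigr) = [\,\bar\partial, a^*\,] = -[\,\bar\partial^*, a\,]^* = -[\,\bar\zeta * \partial *^{-1}, a\,]^* = -\zeta * [\,\partial, a\,]^* *^{-1} = -\zeta * \pi(\delta a)^* *^{-1}, $$
and, similarly, using (2.40) and the properties of the Hodge $*$-operator,
$$ \pi\bigl((\bar\delta a)^\natural\bigr) = -\zeta * \pi(\bar\delta a)^* *^{-1}, \qquad \pi\bigl((\delta\bar\delta a)^\natural\bigr) = \zeta^2 * \pi(\delta\bar\delta a)^* *^{-1} . $$
This proves that $\pi(\omega^\natural) = \pi(\omega)^\natural$.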
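The degree relations $[\,T, \omega\,] = r\,\omega$ and $[\,\bar{T}, \omega\,] = s\,\omega$ used in this proof can be seen on generators as follows. This is a sketch that uses only the relations of Definition 2.26 and the Jacobi identity, with $\bar{T}$ and $\bar\partial$ denoting the barred operators and $\omega$, $\eta$ arbitrary operators in the image of $\pi$:

```latex
\begin{align*}
[\,T, [\,\partial, a\,]\,] &= [\,[\,T, \partial\,], a\,] + [\,\partial, [\,T, a\,]\,] = [\,\partial, a\,], &
[\,\bar{T}, [\,\partial, a\,]\,] &= 0, \\
[\,T, [\,\bar\partial, a\,]\,] &= 0, &
[\,\bar{T}, [\,\bar\partial, a\,]\,] &= [\,\bar\partial, a\,], \\
[\,T, \omega\,\eta\,] &= [\,T, \omega\,]\,\eta + \omega\,[\,T, \eta\,]. &&
\end{align*}
```

By the last line, degrees add under products, so a product containing $r$ factors of the form $[\,\partial, a_i\,]$ and $s$ factors of the form $[\,\bar\partial, b_j\,]$ has $T$-degree $r$ and $\bar{T}$-degree $s$.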
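As an aside, we mention that the implementation of $\natural$ on $\pi\bigl(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})\bigr)$ via the Hodge $*$-operator shows that Conditions e) of the "classical" Definition I 2.20 and of Proposition 2.27 are related; more precisely, the former is a consequence of the latter.

Hermitian spectral data carry, in particular, an N = (1, 1) structure, and thus we have two notions of differential forms available. Their relation is described in our next proposition.

Proposition 2.33. The space of N = (1, 1) differential forms is included in the space of Hermitian forms, i.e.,
$$ \pi\bigl(\Omega_d^p(\mathcal{A})\bigr) \subset \bigoplus_{r+s=p} \pi\bigl(\Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A})\bigr), \tag{2.47} $$
and the spaces coincide if and only if
$$ [\,T, \omega\,] \in \pi\bigl(\Omega_d^1(\mathcal{A})\bigr) \tag{2.48} $$
for all $\omega \in \Omega_d^1(\mathcal{A})$.

Proof. The inclusion (2.47) follows simply from $d = \partial + \bar\partial$. If the spaces are equal then the equation $[\,T, \omega\,] = r\,\omega$, for all $\omega \in \pi\bigl(\Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A})\bigr)$, implies (2.48). The converse is shown as in the proof of Proposition I 2.22 in Sect. 2.4.1 of part I, concerning classical Hermitian geometry.

Note that even if the spaces of differential forms do not coincide, the algebra of complex forms contains a graded differential algebra $\bigl(\Omega^\bullet_{\partial,\bar\partial}(\mathcal{A}), d\bigr)$ with $d = \partial + \bar\partial$ and
$$ \Omega^\bullet_{\partial,\bar\partial}(\mathcal{A}) := \bigoplus_{p} \Omega^p_{\partial,\bar\partial}(\mathcal{A}), \qquad \Omega^p_{\partial,\bar\partial}(\mathcal{A}) := \bigoplus_{r+s=p} \Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A}). \tag{2.49} $$
By Proposition 2.32, we know that
$$ \pi\bigl(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})\bigr) = \bigoplus_{p} \pi\bigl(\Omega^p_{\partial,\bar\partial}(\mathcal{A})\bigr), $$
and hence we obtain a necessary condition for N = (1, 1) spectral data to extend to Hermitian spectral data:

Corollary 2.34. If a set of N = (1, 1) spectral data extends to a set of Hermitian spectral data then
$$ \pi\bigl(\Omega_d^\bullet(\mathcal{A})\bigr) = \bigoplus_{p} \pi\bigl(\Omega_d^p(\mathcal{A})\bigr) . $$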
This condition is clearly not sufficient since it is always satisfied in classical differential geometry.

Beyond the complexes (2.45) and (2.49), one can of course also consider the analogue of the Dolbeault complex using only the differential $\bar\partial$ acting on $\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})$. The details are straightforward.

We conclude this subsection with some remarks concerning possible variations of our Definition 2.26 of Hermitian spectral data. For example, one may wish to drop the boundedness condition on the operators $T$ and $\bar{T}$, in order to include infinite-dimensional spaces into the theory. This is possible, but then one has to make some stronger assumptions in Proposition 2.27.

Another relaxation of the requirements in Hermitian spectral data is to avoid introducing $T$ and $\bar{T}$ altogether, and to replace them by a decomposition of the $\mathbb{Z}_2$-grading
$$ \gamma = \gamma_\partial + \gamma_{\bar\partial} $$
such that
$$ \{ \gamma_\partial, \partial \} = 0, \qquad [\,\gamma_\partial, \bar\partial\,] = 0, \qquad \{ \gamma_{\bar\partial}, \bar\partial \} = 0, \qquad [\,\gamma_{\bar\partial}, \partial\,] = 0 . $$
Then the space of differential forms may be defined as above, but Propositions 2.32 and 2.33, as well as the good properties of the integral established in the next subsection, will not hold in general.

2.3.3. Integration in complex non-commutative geometry. The definition of the integral is completely analogous to the N = (1, 1) setting: Again we use the operator $\triangle = d\,d^* + d^* d$, where now $d = \partial + \bar\partial$. Due to the larger set of data, the space of square-integrable, complex differential forms, now obtained after quotienting by the two-sided bi-graded $\natural$-ideal $K$, has better properties than the corresponding space of forms in Riemannian non-commutative geometry. There, two elements $\omega \in \Omega_d^p(\mathcal{A})$ and $\eta \in \Omega_d^q(\mathcal{A})$ with $p \neq q$ were not necessarily orthogonal with respect to the sesqui-linear form $(\cdot,\cdot)$ induced by the integral. For Hermitian and Kähler non-commutative geometry, however, we can prove the following orthogonality statements:

Proposition 2.35. Let $\omega_i \in \pi\bigl(\Omega^{r_i,s_i}_{\partial,\bar\partial}(\mathcal{A})\bigr)$, $i = 1, 2$. Then
$$ (\omega_1, \omega_2) = 0 \tag{2.50} $$
if $r_1 + s_1 \neq r_2 + s_2$ in the Hermitian case; if the spectral data also carry an N = (2, 2) structure, then Eq. (2.50) holds as soon as $r_1 \neq r_2$ or $s_1 \neq s_2$.

Proof. In the case of Hermitian spectral data, the assertion follows immediately from cyclicity of the trace, from the commutation relations
$$ [\,T, \omega_i\,] = r_i\,\omega_i, \qquad [\,\bar{T}, \omega_i\,] = s_i\,\omega_i, $$
which means that $T + \bar{T}$ counts the total degree of a differential form, and from the equation $[\,T + \bar{T}, \triangle\,] = 0$. In the Kähler case, Definition 2.28 implies the stronger relations $[\,T, \triangle\,] = [\,\bar{T}, \triangle\,] = 0$.
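The cyclicity argument can be spelled out as follows. This sketch assumes that the integral is built, as the reference to Definition 2.3 suggests, from traces of the form $\mathrm{Tr}\bigl(X e^{-\varepsilon\triangle}\bigr)$ normalized by $\mathrm{Tr}\bigl(e^{-\varepsilon\triangle}\bigr)$; the abbreviations $N := T + \bar{T}$ and $p_i := r_i + s_i$ are introduced here and are not notation from the text:

```latex
\begin{align*}
[\,N, \omega_1\omega_2^*\,]
  &= [\,N, \omega_1\,]\,\omega_2^* - \omega_1\,[\,N, \omega_2\,]^*
   = (p_1 - p_2)\,\omega_1\omega_2^*, \\
\mathrm{Tr}\bigl([\,N, X\,]\,e^{-\varepsilon\triangle}\bigr)
  &= \mathrm{Tr}\bigl(N X e^{-\varepsilon\triangle}\bigr)
   - \mathrm{Tr}\bigl(X e^{-\varepsilon\triangle} N\bigr) = 0
   \quad\text{since } [\,N, \triangle\,] = 0, \\
\Longrightarrow\quad (p_1 - p_2)\,(\omega_1, \omega_2) &= 0 .
\end{align*}
```

The first line uses that $N$ is self-adjoint, so that $[\,N, \omega_2^*\,] = -[\,N, \omega_2\,]^*$. In the Kähler case the same computation can be repeated with $N$ replaced by $T$ or by $\bar{T}$ separately, which gives the refined statement.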
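2.3.4. Generalized metric on $\tilde\Omega^1_{\partial,\bar\partial}(\mathcal{A})$. The notions of vector bundles, Hermitian structure, torsion, etc. are defined just as for N = (1, 1) spectral data in Subsect. 2.2. The definitions of holomorphic vector bundles and connections can be carried over from the classical case; see Sect. I 2.4.4. Again, we pass from $\Omega^1_{\partial,\bar\partial}$, see Eq. (2.49), to the space $\tilde\Omega^1_{\partial,\bar\partial}$ of all square-integrable 1-forms, which is equipped with a generalized Hermitian structure $\langle\cdot,\cdot\rangle_{\partial,\bar\partial}$ according to the construction in Theorem 2.9. Starting from here, we can define an analogue
$\langle\langle\cdot,\cdot\rangle\rangle : \tilde\Omega^1_{\partial,\bar\partial}(\mathcal{A}) \times \tilde\Omega^1_{\partial,\bar\partial}(\mathcal{A}) \longrightarrow C$
of the C-bi-linear metric in classical complex geometry by
$\langle\langle \omega, \eta \rangle\rangle := \langle \omega, \eta^{\natural} \rangle_{\partial,\bar\partial}.$
Proposition 2.36. The generalized metric $\langle\langle\cdot,\cdot\rangle\rangle$ on $\tilde\Omega^1_{\partial,\bar\partial}(\mathcal{A})$ has the following properties:
1) $\langle\langle a\omega, \eta b \rangle\rangle = a\, \langle\langle \omega, \eta \rangle\rangle\, b$;
2) $\langle\langle \omega a, \eta \rangle\rangle = \langle\langle \omega, a \eta \rangle\rangle$;
3) $\langle\langle \omega, \omega^{\natural} \rangle\rangle \geq 0$;
here $\omega, \eta \in \tilde\Omega^1_{\partial,\bar\partial}(\mathcal{A})$ and $a, b \in \mathcal{A}$. If the underlying spectral data are Kählerian, one has that
$\langle\langle \omega, \eta \rangle\rangle = 0$ if $\omega, \eta \in \tilde\Omega^{1,0}_{\partial,\bar\partial}(\mathcal{A})$ or $\omega, \eta \in \tilde\Omega^{0,1}_{\partial,\bar\partial}(\mathcal{A})$.
Proof. The first three statements follow directly from the definition of $\langle\langle\cdot,\cdot\rangle\rangle$ and the corresponding properties of $\langle\cdot,\cdot\rangle_{\partial,\bar\partial}$ listed in Theorem 2.9. The last assertion is a consequence of Proposition 2.35, using the fact that the spaces $\tilde\Omega^{r,s}_{\partial,\bar\partial}(\mathcal{A})$ are $\mathcal{A}$-bi-modules.
Note that this property of the metric $\langle\langle\cdot,\cdot\rangle\rangle$ corresponds to the property $g_{\mu\nu} = g_{\bar\mu\bar\nu} = 0$ (in complex coordinates) in the classical case.
2.4. The N = (4, 4) spectral data. We just present the definition of spectral data describing non-commutative Hyperkähler spaces. Obviously, it is chosen in analogy to the discussion of the classical case in Sect. 2.5 of part I.
Definition 2.37. A set of data $(\mathcal{A}, \mathcal{H}, G^{a\pm}, \bar G^{a\pm}, T^i, \bar T^i, \gamma, *)$ with $a = 1, 2$, $i = 1, 2, 3$, is called a set of N = (4, 4) or Hyperkähler spectral data if
1) the subset $(\mathcal{A}, \mathcal{H}, G^{1+}, \bar G^{1+}, T^3, \bar T^3, \gamma, *)$ forms a set of N = (2, 2) spectral data;
2) $G^{a\pm}$, $a = 1, 2$, are closed, densely defined operators on $\mathcal{H}$, and $T^i$, $i = 1, 2, 3$, are bounded operators on $\mathcal{H}$ which satisfy $(G^{a\pm})^* = G^{a\mp}$, $(T^i)^* = T^i$ and the following (anti-)commutation relations ($a, b = 1, 2$, $i, j = 1, 2, 3$, and $\tau^i$ are the Pauli matrices):
$\{ G^{a+} , G^{b+} \} = 0, \quad \{ G^{a-} , G^{b+} \} = \delta^{ab}\, \Box, \quad [ \Box , G^{a+} ] = 0, \quad [ \Box , T^i ] = 0,$
$[ T^i , T^j ] = i \epsilon^{ijk}\, T^k, \quad [ T^i , G^{a+} ] = \tfrac{1}{2}\, \tau^i_{ab}\, G^{b+},$
for some self-adjoint operator $\Box$ on $\mathcal{H}$, which, in the classical case, is the holomorphic part of the Laplace operator;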
3) the operators $\bar G^{a\pm}$, $a = 1, 2$, and $\bar T^i$, $i = 1, 2, 3$, also satisfy the conditions in 2) and (anti-)commute with $G^{a\pm}$ and $T^i$.
The construction of non-commutative differential forms and the integration theory is precisely the same as for N = (2, 2) spectral data. We therefore refrain from giving more details. It might, however, be interesting to see whether the additional information encoded in N = (4, 4) spectral data gives rise to special properties, beyond the ones found for Kähler data in Subsect. 2.3.3.
2.5. Symplectic non-commutative geometry. Once more, our description in the non-commutative context follows the algebraic characterization of classical symplectic manifolds given in Sect. 2.6 of part I. The difference between our approaches to the classical and to the non-commutative case is that, in the former, we could derive most of the algebraic relations – including the SU(2) structure showing up on symplectic manifolds – from the specific properties of the symplectic 2-form, whereas now we will instead include those relations into the defining data, as a “substitute” for the symplectic form.
Definition 2.38. The set of data $(\mathcal{A}, \mathcal{H}, d, L^3, L^+, L^-, \gamma, *)$ is called a set of symplectic spectral data if
1) $(\mathcal{A}, \mathcal{H}, d, \gamma, *)$ is a set of N = (1, 1) spectral data;
2) $L^3$, $L^+$ and $L^-$ are bounded operators on $\mathcal{H}$ which commute with all $a \in \mathcal{A}$ and satisfy the $sl_2$ commutation relations
$[ L^3 , L^\pm ] = \pm 2 L^\pm, \quad [ L^+ , L^- ] = L^3$
as well as the Hermiticity properties $(L^3)^* = L^3$, $(L^\pm)^* = L^\mp$; furthermore, they commute with the grading $\gamma$ on $\mathcal{H}$;
3) the operator $\tilde d^* := [ L^- , d ]$ is densely defined and closed, and together with $d$ it forms an SU(2) doublet, i.e., the following commutation relations hold:
$[ L^3 , d ] = d, \quad [ L^+ , d ] = 0, \quad [ L^- , d ] = \tilde d^*,$
$[ L^3 , \tilde d^* ] = -\tilde d^*, \quad [ L^+ , \tilde d^* ] = d, \quad [ L^- , \tilde d^* ] = 0.$
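The $sl_2$ relations of item 2) can be checked on a small toy model. The sketch below is an illustration under stated assumptions, not part of the construction above: it uses the four-dimensional space of constant-coefficient forms on the symplectic plane with basis (1, dx, dy, dx∧dy), takes L^+ to be wedging with dx∧dy, L^- to be its transpose, and L^3 = (degree − 1). The names `matmul`, `commutator`, `L3`, `L_PLUS`, `L_MINUS` are hypothetical helpers.

```python
# Toy realization of the sl_2 relations [L3, L+-] = +-2 L+-, [L+, L-] = L3.
# Assumption: forms on the symplectic plane, basis (1, dx, dy, dx^dy).
# Convention: M[i][j] is the coefficient of basis vector i in M applied to basis vector j.

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(x, y):
    """Matrix commutator [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]

def scale(c, x):
    """Multiply every entry of the matrix x by the number c."""
    return [[c * entry for entry in row] for row in x]

# L+ wedges with dx^dy: it sends 1 to dx^dy and kills everything else.
L_PLUS = [[0, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0],
          [1, 0, 0, 0]]
# L- is the adjoint (here: transpose) of L+.
L_MINUS = [list(row) for row in zip(*L_PLUS)]
# L3 counts (degree - 1) on forms of degree 0, 1, 1, 2.
L3 = [[-1, 0, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 1]]
```

In this toy model L^3 is symmetric and L^- is the transpose of L^+, matching the Hermiticity properties required in item 2); the doublet relations of item 3) are not modelled here.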
As in the classical case, there is a second SU(2) doublet spanned by the adjoints $d^*$ and $\tilde d$. The Jacobi identity shows that $\tilde d^*$ is nilpotent and that it anti-commutes with $d$.
Differential forms and integration theory are formulated just as for N = (1, 1) spectral data, but the presence of SU(2) generators among the symplectic spectral data leads to additional interesting features, such as the following: Let $\omega \in \Omega^k_d(\mathcal{A})$ and $\eta \in \Omega^l_d(\mathcal{A})$ be two differential forms. Then their scalar product, see Eq. (2.38), vanishes unless $k = l$:
$(\omega, \eta) = 0$ if $k \neq l$. (2.51)
This is true because, by the SU(2) commutation relations listed above, the operator $L^3$ induces a $\mathbb{Z}$-grading on differential forms, and because $L^3$ commutes with the Laplacian $\triangle = d^* d + d d^*$. One consequence of (2.51) is that the reality property of $(\cdot,\cdot)$ stated in Proposition 2.22 is valid independently of the phase occurring in the Hodge relations.
The following proposition shows that we can introduce an N = (2, 2) structure on a set of symplectic spectral data if certain additional properties are satisfied. As was the case for Definition 2.38, the extra requirements are slightly stronger than in the
classical situation, where some structural elements like the almost-complex structure are given automatically. In the Kähler case, the latter allows for a separate counting of holomorphic resp. anti-holomorphic degrees of differential forms, which in turn ensures that the symmetry group of the symplectic data associated to a classical Kähler manifold is in fact SU(2) × U(1) – see also Sect. 3 of part I. Without this enlarged symmetry group, it is impossible to re-interpret the N = 4 as an N = (2, 2) supersymmetry algebra. Therefore, we explicitly postulate the existence of an additional U(1) generator in the non-commutative context – which coincides with the U(1) generator $J_0$ in Eq. (I 1.49) of Sect. I 1.2 and is intimately related to the complex structure.
Proposition 2.39. Suppose that the SU(2) generators of a set of symplectic spectral data satisfy the following relations with the Hodge operator:
$* L^3 = -L^3 *, \quad * L^+ = -\zeta^2 L^- *,$
where $\zeta$ is the phase appearing in the Hodge relations of the N = (1, 1) subset of the symplectic data. Assume, furthermore, that there exists a bounded self-adjoint operator $J_0$ on $\mathcal{H}$ which commutes with all $a \in \mathcal{A}$, with the grading $\gamma$, and with $L^3$, whereas it acts like
$[ J_0 , d ] = -i\, \tilde d, \quad [ J_0 , \tilde d ] = i\, d$
between the SU(2) doublets. Then the set of symplectic data carries an N = (2, 2) Kähler structure with
$\partial = \tfrac{1}{2} (d - i\, \tilde d), \quad \bar\partial = \tfrac{1}{2} (d + i\, \tilde d),$
$T = \tfrac{1}{2} (L^3 + J_0), \quad \bar T = \tfrac{1}{2} (L^3 - J_0).$
Proof. All the conditions listed in Definition 2.26 of Hermitian spectral data can be verified easily: Nilpotency of $\partial$ and $\bar\partial$ follows from $d^2 = \tilde d^2 = 0$ and
$\{ d, \tilde d \} = 0,$ (2.52)
and the action of the Hodge operator on the SU(2) generators ensures that $*$ intertwines $\partial$ and $\bar\partial$ in the right way. As for the extra conditions in Definition 2.28 of Kähler spectral data, one sees that the first one is always true for symplectic spectral data, whereas the second one, namely the equality of the “holomorphic” and “anti-holomorphic” Laplacians, is again a consequence of relation (2.52).
3. The Non-Commutative 3-Sphere
Here and in the next section, we present two examples of non-commutative spaces and show how the general methods developed above can be applied. We first discuss the “quantized” or “fuzzy” 3-sphere. We draw some inspiration from the conformal field theory associated to a non-linear σ-model with target being a 3-sphere, the so-called SU(2)-WZW model, see [Wi3] and also [FGK, PS]. But while the ideas on a non-commutative interpretation of conformal field theory models proposed in [FG] are essential for placing non-commutative geometry into a string theory context, the following calculations are self-contained; the results of subsections 3.2 and 3.3 are taken from [Gr]. Although there is no doubt that the methods used in [Gr] and below can be extended to arbitrary compact, connected and simply connected Lie groups, we will, for simplicity, restrict ourselves to the case of SU(2).
We first introduce a set of N = 1 spectral data describing the non-commutative 3-sphere, then discuss the de Rham complex and its cohomology, and finally turn towards geometrical aspects of this non-commutative space. Subsect. 3.4 briefly describes the N = (1, 1) formalism.
3.1. The N = 1 data associated to the 3-sphere. In this subsection, we introduce N = 1 data describing the non-commutative 3-sphere. Since the 3-sphere is diffeomorphic to the Lie group G = SU(2), we are looking for data describing a Lie group $G$. Let $\{T_A\}$ be a basis of $g = T_e G$, the Lie algebra of $G$. By $\vartheta_A$ and $\bar\vartheta_A$ we denote the left- and right-invariant vector fields associated to the basis elements $T_A$, and by $\theta^A$ and $\bar\theta^A$ the corresponding dual basis of 1-forms. The structure constants $f_{AB}^{C}$ are defined, as usual, by
$[\vartheta_A , \vartheta_B] = f_{AB}^{C}\, \vartheta_C.$ (3.1)
The Killing form on $g$ induces a canonical Riemannian metric on $TG$ given by
$g_{AB} \equiv g(\vartheta_A , \vartheta_B) = -\mathrm{Tr}\,(\mathrm{ad}\,T_A \circ \mathrm{ad}\,T_B) = -f_{AC}^{D} f_{BD}^{C},$ (3.2)
and the Levi–Civita connection reads
$\nabla_A \vartheta_B \equiv \nabla_{\vartheta_A} \vartheta_B = \tfrac{1}{2}\, f_{AB}^{C}\, \vartheta_C.$ (3.3)
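Eq. (3.2) can be evaluated explicitly for su(2). The following sketch assumes the normalization adopted later in Subsect. 3.2, where the structure constants are given by the Levi–Civita symbol, and confirms that the Killing metric is then twice the identity. The function names `levi_civita` and `killing_metric` are hypothetical.

```python
# Evaluate g_AB = - f_AC^D f_BD^C of Eq. (3.2) for su(2).
# Assumption: f_AB^C = epsilon_ABC, the normalization used in Subsect. 3.2.

def levi_civita(a, b, c):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def killing_metric():
    """Return the 3x3 matrix g_AB = - sum_{C,D} f_AC^D f_BD^C."""
    idx = range(3)
    return [[-sum(levi_civita(a, c, d) * levi_civita(b, d, c)
                  for c in idx for d in idx)
             for b in idx]
            for a in idx]
```

With this normalization the result is diagonal with entries 2, which is the choice g_AB = 2δ_AB made in Subsect. 3.2.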
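The left-invariant vector fields $\vartheta_A$ define a trivialization of the (co-)tangent bundle. We denote by $\nabla^L$ the flat connection associated to that trivialization, $\nabla^L \theta^A = 0$ for all $A$. We introduce the operators $a^{A*} = \theta^A \wedge$, $a^A = g^{AB}\, \vartheta_B \lrcorner$ on the space of differential forms, as well as the usual gamma matrices
$\gamma^A = a^{A*} - a^A, \quad \bar\gamma^A = i\, ( a^{A*} + a^A ).$ (3.4)
It is easy to verify that $\gamma^A$ and $\bar\gamma^A$ generate two anti-commuting copies of the Clifford algebra,
$\{ \gamma^A , \gamma^B \} = \{ \bar\gamma^A , \bar\gamma^B \} = -2 g^{AB}, \quad \{ \gamma^A , \bar\gamma^B \} = 0.$ (3.5)
Following the notations of Sect. I 2.2, we shall denote by $S$ the bundle of differential forms endowed with the above structures. We define two connections $\nabla^S$ and $\bar\nabla^S$ on $S$ by setting
$\nabla^S = \theta^A \otimes \bigl( \nabla^L_{\vartheta_A} + \tfrac{1}{12}\, f_{ABC}\, \gamma^B \gamma^C \bigr),$
$\bar\nabla^S = \bar\theta^A \otimes \bigl( \nabla^L_{\bar\vartheta_A} - \tfrac{1}{12}\, f_{ABC}\, \bar\gamma^B \bar\gamma^C \bigr),$ (3.6)
where $f_{ABC} = f_{AB}^{D}\, g_{DC}$, and we put
$J_A := i\, \nabla^L_{\vartheta_A}, \quad \psi^A := -i\, \gamma^A, \quad \bar J_A := -i\, \nabla^L_{\bar\vartheta_A}, \quad \bar\psi^A := i\, \bar\gamma^A.$ (3.7)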
These objects satisfy the commutation relations
$[ J_A , J_B ] = i f_{AB}^{C}\, J_C, \quad \{ \psi^A , \psi^B \} = 2 g^{AB},$ (3.8)
with analogous relations for $\bar J_A$ and $\bar\psi^A$; barred and unbarred operators (anti-)commute. The two anti-commuting Dirac operators $D$ and $\bar D$ on $S$ read [FG]
$D = \psi^A J_A - \tfrac{i}{12}\, f_{ABC}\, \psi^A \psi^B \psi^C,$
$\bar D = \bar\psi^A \bar J_A - \tfrac{i}{12}\, f_{ABC}\, \bar\psi^A \bar\psi^B \bar\psi^C.$ (3.9)
The $\mathbb{Z}_2$-grading operator $\gamma$ on $S$, anti-commuting with $D$ and $\bar D$, is given by
$\gamma = \frac{1}{i\,(3!)^2}\; g\; \varepsilon_{ABC}\, \varepsilon_{DEF}\, \psi^A \psi^B \psi^C\, \bar\psi^D \bar\psi^E \bar\psi^F,$ (3.10)
where $g = \det g_{AB}$. By $L^2(S) \simeq L^2(G) \otimes W$, where $W$ is the irreducible representation of the Clifford algebra of Eqs. (3.4,5), we denote the Hilbert space of square integrable sections of the bundle $S$, with respect to the normalized Haar measure on $G$. In the language of Connes’ spectral triples, the classical 3-sphere is described by the N = 1 data $(L^2(S), C^\infty(G), D, \gamma)$, with D ≡ D. The Hilbert space $L^2(S)$ carries a unitary representation $\pi$ of $G \times G$ given by
$\pi(g_1 , g_2) f(h) = f(g_1^{-1} h g_2),$ (3.11)
for all $g_i , h \in G$ and $f \in L^2(G)$. For each $j \in \tfrac{1}{2}\mathbb{Z}_+$ we denote by $(\pi^j , V_j)$ the irreducible unitary $(2j + 1)$-dimensional (spin $j$) representation of $G$, and to each vector $\xi^* \otimes \eta \in V_j^* \otimes V_j$ we associate a smooth function $f_{\xi^* \otimes \eta} \in C^\infty(G)$ by setting
$f_{\xi^* \otimes \eta}(g) = \frac{1}{\sqrt{2j + 1}}\, \langle\, \xi^* , \pi^j(g)\, \eta\, \rangle.$ (3.12)
This defines a linear isometry
$\varphi : \bigoplus_{j \in \frac{1}{2}\mathbb{Z}_+} V_j^* \otimes V_j \longrightarrow L^2(G),$ (3.13)
and the Peter-Weyl theorem states that the image of $\varphi$ is dense in $L^2(G)$ and also in $C(G)$ in the supremum norm topology. It is easy to verify that the operators $J_A$ and $\bar J_A$ act on $\bigoplus_{j \in \frac{1}{2}\mathbb{Z}_+} V_j^* \otimes V_j$ as $d\pi(T_A , 1)$ and $d\pi(1, T_A)$, respectively. For each positive integer $k$, we denote by $P_{(k)}$ the orthogonal projection
$P_{(k)} : L^2(G) \longrightarrow H_0 := \bigoplus_{j = 0, \frac{1}{2}, \ldots}^{k/2} V_j^* \otimes V_j.$ (3.14)
The Dirac operator D and the Z2 -grading γ clearly leave the finite-dimensional Hilbert space H0 ⊗ W invariant. We define A0 to be the unital subalgebra of End(H0 ) generated by operators of the form P(k) fξ∗ ⊗η , where ξ ∗ ⊗ η ∈ H0 . The following theorem is proven in [Gr]:
Theorem 3.1. The algebra $A_0$ coincides with the algebra of endomorphisms of $H_0$, i.e., $A_0 = \mathrm{End}\,(H_0)$.
The proof in [Gr] shows that $A_0$ is a full matrix algebra for any compact, connected and simply connected group. That $A_0$ equals the endomorphism ring of $H_0$ was only proved for SU(2), but a slight generalization of the proof for SU(2) should yield the result for all groups of the above type. We define the non-commutative 3-sphere by the N = 1 data $(A_0 , H_0 \otimes W, D, \gamma)$. Notice that this definition of the non-commutative 3-sphere is very close to that of the non-commutative 2-sphere [Ber, Ho, Ma, GKP]. For an alternative derivation of this definition, the reader is referred to [FG] where it is shown how this space arises as the quantum target of the WZW model based on SU(2). We note that $1/k$ plays the role of Planck’s constant $\hbar$ in the quantization of symplectic manifolds, i.e., it is a deformation parameter. Formally, the classical 3-sphere emerges as the limit of non-commutative 3-spheres as the deformation parameter $1/k$ tends to zero.
3.2. The topology of the non-commutative 3-sphere. In this subsection, we shall apply the tools of Subsect. 2.1 to the non-commutative space $(A_0 , H_0 \otimes W, D, \gamma)$ describing the non-commutative 3-sphere; we follow the presentation in [Gr]. For convenience, we shall choose the basis $\{T_A\}$ of $T_e G$ in such a way that $g_{AB} = 2\delta_{AB}$. The structure constants are then given by the Levi–Civita tensor, $f_{AB}^{C} = \varepsilon_{ABC}$.
3.2.1. The de Rham complex. First, we determine the structure of the spaces of differential forms $\Omega^n_D(A_0)$ and the action of the exterior differentiation $\delta : \Omega^\bullet_D(A_0) \longrightarrow \Omega^\bullet_D(A_0)$. We use the same notations as in Subsect. 2.1.2. The space of 1-forms is
$\Omega^1_D(A_0) \simeq \pi(\Omega^1(A_0)) = \Bigl\{ \sum_i a^i_0\, [ J_A , a^i_1 ] \otimes \psi^A \;\Big|\; a^i_j \in A_0 \Bigr\}.$ (3.15)
Since $A_0$ is a full matrix algebra, see Theorem 3.1, it follows that
$\Omega^1_D(A_0) \simeq \{ a_A \otimes \psi^A \mid a_A \in A_0 \}.$ (3.16)
Using the fact that any element of $\pi(\Omega^2(A_0))$ can be written as a linear combination of products of pairs of elements in $\pi(\Omega^1(A_0))$, we get
$\pi(\Omega^2(A_0)) = \{ a_{AB} \otimes \psi^A \psi^B \mid a_{AB} \in A_0 \}.$ (3.17)
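Our next task is to determine the space $\pi(\delta J^1)$ of so-called “auxiliary 2-forms”, see Eq. (2.2). To this end, let $\omega = \sum_i a_i\, \delta b_i \in \Omega^1(A_0)$ be such that
$\pi(\omega) = \sum_i a_i\, [ D , b_i ] = 0.$ (3.18)
Using Eqs. (3.8) and (3.18), we see that the coefficient of $[ \psi^A , \psi^B ]$ in $\pi(\delta\omega)$ is proportional to
$\varepsilon_{AB} \sum_i [ J_A , a_i ]\,[ J_B , b_i ] = -\varepsilon_{AB} \sum_i a_i\, \bigl[ J_A , [ J_B , b_i ] \bigr]$
$\qquad = -\tfrac{1}{2}\, \varepsilon_{AB} \sum_i a_i\, \bigl[ [ J_A , J_B ] , b_i \bigr] = -\tfrac{i}{2}\, \varepsilon_{AB}\, \varepsilon_{ABC} \sum_i a_i\, [ J_C , b_i ] = 0,$
where $\varepsilon_{AB}$ denotes the Levi–Civita antisymmetric tensor. This shows that $\pi(\delta J^1)$ is included in $A_0$, and since $A_0$ is a full matrix algebra, this implies that $\pi(\delta J^1)$ is either 0 or equal to $A_0$. We construct a non-vanishing element of $\pi(\delta J^1)$ explicitly. Let $P_j$ be the orthogonal projection onto $V_j^* \otimes V_j$. We define $a, b \in A_0$ by $a = P_0\, a\, P_{1/2}$, $b = P_{1/2}\, b\, P_0$ and
$a : V_{1/2}^* \otimes V_{1/2} \ni \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \otimes \begin{pmatrix} \gamma \\ \delta \end{pmatrix} \longmapsto \alpha - 2\beta + 2\gamma + \delta \in V_0^* \otimes V_0 ,$
$b : V_0^* \otimes V_0 \ni \alpha \longmapsto \alpha \cdot \begin{pmatrix} 1 \\ -2 \end{pmatrix} \otimes \begin{pmatrix} 2 \\ 1 \end{pmatrix} \in V_{1/2}^* \otimes V_{1/2} .$
It is straightforward to verify that $\omega := a\, \delta b$ satisfies $\pi(\omega) = 0$ and $\pi(\delta\omega) \neq 0$. This proves that $\pi(\delta J^1) = A_0$, and we get
$\Omega^2_D(A_0) \simeq \{ a_{AB} \otimes \psi^A \psi^B \mid a_{AB} = -a_{BA} \in A_0 \}.$ (3.19)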
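In order to determine the space of 3-forms, we first notice that
$\pi(\Omega^3(A_0)) = \{ a_{ABC} \otimes \psi^A \psi^B \psi^C \},$ (3.20)
and we compute the space $\pi(\delta J^2)$. Let $a_i , b_i , c_i \in A_0$ be such that $\omega = \sum_i a_i\, \delta b_i\, \delta c_i$ satisfies
$\pi(\omega) = \sum_i a_i\, [ D, b_i ]\,[ D, c_i ] = 0.$ (3.21)
The coefficient of $\psi^1 \psi^2 \psi^3$ in $\pi(\delta\omega)$ is proportional to
$\varepsilon_{ABC} \sum_i [ J_A , a_i ]\,[ J_B , b_i ]\,[ J_C , c_i ] = -\varepsilon_{ABC} \sum_i a_i\, \bigl[ J_A , [ J_B , b_i ]\,[ J_C , c_i ] \bigr]$
$\qquad = -\varepsilon_{ABC} \sum_i a_i\, \Bigl( \bigl[ J_A , [ J_B , b_i ] \bigr]\, [ J_C , c_i ] + [ J_B , b_i ]\, \bigl[ J_A , [ J_C , c_i ] \bigr] \Bigr)$
$\qquad = -\tfrac{i}{2}\, \varepsilon_{ABC} \sum_i a_i\, \Bigl( \varepsilon_{ABD}\, [ J_D , b_i ]\, [ J_C , c_i ] + \varepsilon_{ACD}\, [ J_B , b_i ]\, [ J_D , c_i ] \Bigr) = 0,$
where we have used Eq. (3.21) and the Jacobi identity. Thus, $\pi(\delta J^2)$ is included in $\pi(\Omega^1(A_0))$, and since $A_0$ is a full matrix algebra, it is either 0 or equal to $\pi(\Omega^1(A_0))$. Let $\omega, \eta \in \Omega^1(A_0)$ be such that $\pi(\omega) = -1 \otimes \psi^A$, $\pi(\eta) = 0$ and $\pi(\delta\eta) = 1 \otimes 1$. The existence of $\omega$ and $\eta$ is ensured by Eq. (3.16) and the fact that $\pi(\delta J^1) = A_0$. We have $\omega\eta \in \Omega^2(A_0)$, $\pi(\omega\eta) = 0$ and $\pi(\delta(\omega\eta)) = 1 \otimes \psi^A$ as $\pi(\eta) = 0$. This proves that $\pi(\delta J^2) = \pi(\Omega^1(A_0))$, and we get
$\Omega^3_D(A_0) \simeq \{ a \otimes \psi^1 \psi^2 \psi^3 \mid a \in A_0 \}.$ (3.22)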
We proceed with the space of 4-forms. First, we notice that due to the Clifford algebra relations, Eqs. (3.4,5,8), we have
$\pi(\Omega^4(A_0)) = \{ a_{AB} \otimes \psi^A \psi^B \mid a_{AB} \in A_0 \}.$ (3.23)
Let $\omega \in \Omega^1(A_0)$ and $\eta \in \Omega^2(A_0)$ be such that $\pi(\omega) = 0$, $\pi(\delta\omega) = 1 \otimes 1$, and $\pi(\eta) = 1 \otimes \psi^A \psi^B$. The existence of $\omega$ and $\eta$ is ensured by the fact that $\pi(\delta J^1) = A_0$ and by Eq. (3.17). We have $\omega\eta \in \Omega^3(A_0)$, $\pi(\omega\eta) = 0$ and $\pi(\delta(\omega\eta)) = 1 \otimes \psi^A \psi^B$ as $\pi(\omega) = 0$. Since $A_0$ is a full matrix algebra, this proves that $\pi(\Omega^4(A_0)) = \pi(\delta J^3)$, and we get $\Omega^4_D(A_0) = 0$. Using the fact that the product of differential forms induces a surjective map $\Omega^n_D(A_0) \otimes \Omega^m_D(A_0) \longrightarrow \Omega^{n+m}_D(A_0)$ we obtain
$\Omega^n_D(A_0) = 0 \quad \forall\, n > 3.$ (3.24)
Collecting Eqs. (3.16), (3.22) and (3.24), we arrive at the following theorem on the structure of differential forms over the non-commutative space $(A_0 , H_0 \otimes W, D, \gamma)$:
Theorem 3.2. The left $A_0$-modules $\Omega^n_D(A_0)$ are all free and given as follows:
0) $\Omega^0_D(A_0) = A_0$ is one-dimensional with basis $\{1\}$;
1) $\Omega^1_D(A_0)$ is three-dimensional with basis $\{1 \otimes \psi^A\}$;
2) $\Omega^2_D(A_0)$ is three-dimensional with basis $\{1 \otimes \psi^A \psi^{A+1}\}$ (where addition is taken modulo 3);
3) $\Omega^3_D(A_0)$ is one-dimensional with basis $\{1 \otimes \psi^1 \psi^2 \psi^3\}$;
4) $\Omega^n_D(A_0) = 0$ for all $n > 3$.
Notice that the structure of the modules $\Omega^n_D(A_0)$ is the same as that of the spaces of differential forms on SU(2) $\simeq S^3$.
In the following, we compute the action of the exterior differential $\delta : \Omega^n_D(A_0) \longrightarrow \Omega^{n+1}_D(A_0)$. We introduce the following bases of $\Omega^1_D(A_0)$ and $\Omega^2_D(A_0)$:
$e^A = 1 \otimes \psi^A \in \Omega^1_D(A_0),$ (3.25)
$f^A = \varepsilon^{ABC} \otimes \psi^B \psi^C \in \Omega^2_D(A_0),$ (3.26)
which allows us to identify $\Omega^1_D(A_0)$ and $\Omega^2_D(A_0)$ with the standard free module $A_0^3$, and we decompose their elements with respect to these bases,
$\omega = \omega_A\, e^A$ for $\omega \in \Omega^1_D(A_0),$ (3.27)
$\omega = \omega_A\, f^A$ for $\omega \in \Omega^2_D(A_0).$ (3.28)
It is easily verified that the product of 1-forms $\omega, \eta \in \Omega^1_D(A_0)$ is given by
$\omega \cdot \eta = \varepsilon^{ABC}\, \omega_B\, \eta_C\, f^A.$ (3.29)
By the Leibniz rule for the exterior differential $\delta$, knowledge of the action of $\delta$ on the elements $a \in A_0$, $e^A$ and $f^A$ fully determines the action of the differential on $\Omega^\bullet_D(A_0)$. By definition, we have
$\delta a = [ J_A , a ]\, e^A.$ (3.30)
Using Eq. (3.30) and the nilpotency of $\delta$ we get
$0 = \delta^2 J_A = i \varepsilon^{BAC}\, \delta( J_C\, e^B ) = -\varepsilon^{BAC} \varepsilon^{DCF}\, J_F\, e^D e^B + i \varepsilon^{BAC}\, J_C\, \delta e^B$
from which we can successively conclude that
$\varepsilon^{BAE} \varepsilon^{DEC}\, e^D e^B = i \varepsilon^{BAC}\, \delta e^B, \quad e^A e^C = i \varepsilon^{BAC}\, \delta e^B.$
With Eq. (3.29), we finally get
$\delta e^A = -i f^A.$ (3.31)
This equation, together with the nilpotency of $\delta$, furthermore implies that
$\delta f^A = 0.$ (3.32)
We summarize these results in the following
Theorem 3.3. Let $g = \tfrac{1}{3!}\, \varepsilon_{ABC}\, \psi^A \psi^B \psi^C$ be the basis element of $\Omega^3_D(A_0)$, and $e^A$ and $f^A$ as in Eqs. (3.25,26). Then the algebra structure of $\Omega^\bullet_D(A_0)$ is given as follows:
a1) $[ a, e^A ] = [ a, f^A ] = [ a, g ] = 0$ for all $a \in A_0$, (3.33)
a2) $e^A e^B = \varepsilon^{ABC} f^C, \quad e^A f^B = \delta^{AB} g,$ (3.34)
$\quad\;\; e^A e^B e^C = \varepsilon^{ABC} g.$ (3.35)
The differential structure on $\Omega^\bullet_D(A_0)$ is given by
b1) $\delta a = [ J_A , a ]\, e^A,$ (3.36)
b2) $\delta e^A = -i f^A, \quad \delta f^A = 0.$ (3.37)
3.2.2. Cohomology of the de Rham complex. Let us now compute the cohomology groups of the de Rham complex $(\Omega^\bullet_D(A_0), \delta)$ of Theorems 3.2 and 3.3. The zeroth cohomology group $H^0$ consists of those elements $a \in A_0$ that are closed, i.e., satisfy $\delta a = 0$. We have
$a \in H^0 \iff \delta a = [ J_A , a ]\, e^A = 0 \iff [ J_A , a ] = 0$ for all $A$,
and it follows that
$H^0 = A^R \equiv \bigoplus_{j=0}^{k/2} 1_{V_j^*} \otimes \mathrm{End}\,(V_j),$ (3.38)
and
$\dim_{\mathbb{C}} H^0 = \sum_{j=0}^{k/2} (2j + 1)^2 = \tfrac{1}{6}\, (2k + 3)(k + 2)(k + 1).$ (3.39)
In order to compute the first cohomology group, we first determine the closed 1-forms. For any 1-form $\omega = \omega_A e^A \in \Omega^1_D(A_0)$, relation (3.37) implies that
$\delta\omega = \bigl( [ J_A , \omega_B ]\, \varepsilon^{ABC} - i \omega_C \bigr)\, f^C,$
and thus $\delta\omega = 0$ is equivalent to
$[ J_A , \omega_B ]\, \varepsilon^{ABC} = i \omega_C.$ (3.40)
We show that all closed 1-forms are exact. First, notice that if we view $A_0$ as a representation space of su(2), then, for a closed 1-form, Eq. (3.40) must hold in all isotypic components. Therefore, there is no loss of generality in assuming that all coefficients $\omega_A$ transform under the spin $j$ representation, i.e.,
$[ J_A , [ J_A , \omega_B ]] = j(j + 1)\, \omega_B.$ (3.41)
Furthermore, we can assume that $j \neq 0$ since otherwise $\omega = 0$, as follows from Eq. (3.40). We define $a(\omega) \in A_0$ by
$a(\omega) = \frac{1}{j(j + 1)}\, [ J_A , \omega_A ]$
and we compute $\delta a$. Using Eqs. (3.40,41) and the Jacobi identity, we get
$\delta a(\omega) = \frac{1}{j(j + 1)}\, [ J_A , [ J_B , \omega_B ]]\, e^A$
$\qquad = \frac{1}{j(j + 1)}\, \bigl( i \varepsilon_{ABC}\, [ J_C , \omega_B ] + [ J_B , [ J_A , \omega_B ]] \bigr)\, e^A$
$\qquad = \frac{1}{j(j + 1)}\, [ J_B , [ J_B , \omega_A ]]\, e^A = \omega_A\, e^A.$
This proves that
$H^1 = 0.$ (3.42)
We proceed towards the second cohomology group. The condition for a 2-form $\omega = \omega_A f^A$ to be closed reads
$\delta\omega = 0 \iff [ J_A , \omega_A ] = 0.$ (3.43)
Again, we assume that the components $\omega_A$ belong to a spin $j$ representation of su(2). If $j = 0$, then setting $\eta_A = i \omega_A$ we get $\delta(\eta_A e^A) = \omega_A f^A$, proving that $\omega$ is exact. If $j \neq 0$, we set
$\eta_A = -\frac{1}{j(j + 1)}\, \varepsilon_{ABC}\, [ J_B , \omega_C ],$
and one easily verifies that $\delta(\eta_A e^A) = \omega_A f^A$. This proves that
$H^2 = 0.$ (3.44)
Finally, we compute the third cohomology group. Since all 3-forms are closed, we just have to compute the image of the exterior differential in $\Omega^3_D(A_0)$. For any 2-form $\omega$ we have $\delta\omega = [ J_A , \omega_A ]\, g$ with $g$ being the basis element of $\Omega^3_D(A_0)$ as in Theorem 3.3. This means that the image of $\delta$ in $\Omega^3_D(A_0)$ is given by
$\mathrm{im}\, \delta\big|_{\Omega^2_D(A_0)} = \mathrm{span} \Bigl( \bigcup_{A=1}^{3} \mathrm{im}\,(\mathrm{ad}\, J_A) \Bigr) \cdot g,$
and this space consists of linear combinations of elements of $A_0$ transforming under a spin $j$ representation for $j \neq 0$, multiplied by $g$. Thus, the quotient $\Omega^3_D(A_0) / \mathrm{im}\, \delta$ is given by
$H^3 \simeq A^R \equiv \bigoplus_{j=0}^{k/2} 1_{V_j^*} \otimes \mathrm{End}\,(V_j).$ (3.45)
Collecting our results of Eqs. (3.38,39,42,44) and (3.45), we get the following
Theorem 3.4. The cohomology groups of the de Rham complex of Theorem 3.3 are
$H^0 \simeq H^3 \simeq A^R, \quad H^1 = H^2 = 0$
with dimensions
$\dim_{\mathbb{C}} H^0 = \dim_{\mathbb{C}} H^3 = \tfrac{1}{6}\, (2k + 3)(k + 2)(k + 1).$
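The closed form for the dimension in Eq. (3.39) and Theorem 3.4 is easy to confirm numerically. The short sketch below sums (2j + 1)^2 over j = 0, 1/2, ..., k/2 by writing n = 2j, and compares with the stated polynomial; the function names `dim_h0_by_sum` and `dim_h0_closed_form` are hypothetical.

```python
# Check dim_C H^0 = sum_{j=0,1/2,...,k/2} (2j+1)^2 = (2k+3)(k+2)(k+1)/6.

def dim_h0_by_sum(k):
    """Sum (2j+1)^2 over j = 0, 1/2, ..., k/2, using the integer n = 2j."""
    return sum((n + 1) ** 2 for n in range(k + 1))

def dim_h0_closed_form(k):
    """Closed form quoted in Eq. (3.39); the product is always divisible by 6."""
    return (2 * k + 3) * (k + 2) * (k + 1) // 6
```

For example, k = 1 gives 1 + 4 = 5 from the sum and 5·3·2/6 = 5 from the closed form.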
This theorem shows that the cohomology groups of the fuzzy 3-sphere – which is the quantum target of the WZW model based on SU(2) [FG, Gr] – look very much like those of the classical SU(2) group manifold, except for the unexpected dimensions of the spaces $H^0$ and $H^3$. We observe that in the classical setting, the cohomology groups are modules over the ring $H^0$ and that, for a connected space, the Betti numbers coincide with the dimensions of these modules. We are thus led to the idea that the dimensions of the cohomology groups over $\mathbb{C}$ may be less relevant than their dimensions as modules over $H^0$. Of course, it may happen in general that some $H^0$-module is not free, and we would, in that case, lose the notion of dimension. For the cohomology groups of the de Rham complex $(\Omega^\bullet_D(A_0), \delta)$ we get
$\dim_{H^0} H^0 = \dim_{H^0} H^3 = 1, \quad \dim_{H^0} H^1 = \dim_{H^0} H^2 = 0,$
which fits perfectly with the classical result. The above proposal is obviously tailored to make sense of the cohomology groups of Theorem 3.4 and its general relevance remains to be decided by the study of other examples of non-commutative spaces.
3.3. The geometry of the non-commutative 3-sphere. The N = 1 spectral data $(A_0 , H_0 \otimes W, D, \gamma)$ permit us to investigate not only topological but also geometrical aspects of the quantized 3-sphere, namely integration of differential forms and Hermitian structures, as well as connections and the associated Riemann, Ricci and scalar curvatures. For a more detailed account of the results of this section, the reader is referred to [Gr].
3.3.1. Integration and Hermitian structures. We start with the canonical scalar product and the Hermitian structures on the spaces of differential forms. We use the same notations as in Subsects. 2.1.3–2.1.5. Any element $\omega \in \pi(\Omega^\bullet(A_0))$ can be written uniquely as
$\omega = \omega^0 + \omega^1_A\, e^A + \omega^2_A\, f^A + \omega^3\, g,$ (3.46)
where $\omega^i , \omega^i_A \in A_0$. The integral $-\!\!\!\!\!\int$, as given in Definition 2.3, is just the normalized trace on $H_0 \otimes W$, denoted by Tr. Thus, for any element $\omega$ as above, we have
$-\!\!\!\!\!\int \omega = \mathrm{Tr}\, \omega^0.$ (3.47)
It is easy to show that the sesqui-linear form $(\cdot,\cdot)$ associated to the integral is given by
$(\omega, \eta) = \mathrm{Tr}\, \bigl( \omega^0 (\eta^0)^* + \omega^1_A (\eta^1_A)^* + \omega^2_A (\eta^2_A)^* + \omega^3 (\eta^3)^* \bigr).$ (3.48)
This proves that the kernels $K^i$ of the sesqui-linear form $(\cdot,\cdot)$ equal the kernels $J^i$ of the representation $\pi$. Thus, in this example we also have the equality
$\tilde\Omega^p_D(A_0) = \Omega^p_D(A_0).$
Furthermore, since $\pi(\delta J^1) = A_0$ and $\pi(\delta J^2) = \pi(\Omega^1_D(A_0))$, we see that the decomposition (3.46) gives the canonical representative $\omega^\perp$ of an arbitrary differential form $\omega \in \Omega^\bullet_D(A_0)$. The Hermitian structure on $\Omega^p_D(A_0)$ is readily seen to be
$\langle \omega, \eta \rangle = \omega_A (\eta_A)^*, \quad \omega, \eta \in \Omega^p_D(A_0).$ (3.49)
Notice that, in this example, we get a true Hermitian structure on $\Omega^p_D(A_0)$ and not only a generalized Hermitian structure on $\tilde\Omega^p(A_0)$, cf. Subsect. 2.1.5.
3.3.2. Connections on $\Omega^1_D(A_0)$. This last property makes it possible to regard $\Omega^1_D(A_0)$ as the cotangent bundle of the non-commutative 3-sphere and to study connections on $\Omega^1_D(A_0)$. Since the space of 1-forms $\Omega^1_D(A_0)$ is a trivial left $A_0$-module, a connection $\nabla$ on $\Omega^1_D(A_0)$ is uniquely determined by the images of the basis elements, i.e.,
$\nabla e^A = -\omega^A_{BC}\, e^B \otimes e^C,$ (3.50)
where $\omega^A_{BC}$ are arbitrary elements of $A_0$.
Proposition 3.5. A connection $\nabla$ is unitary if and only if its coefficients satisfy the Hermiticity condition
$(\omega^A_{BC})^* = \omega^C_{BA}.$ (3.51)
Proof. It follows from (3.49) that $\langle e^A , e^B \rangle = \delta^{AB}$. Then we have for a unitary connection (see Definition 2.12)
$0 = \delta \langle e^A , e^B \rangle = -\omega^A_{CB}\, e^C + e^C\, (\omega^B_{CA})^*.$
Proposition 3.6. The torsion of a connection is given by
$T^A = ( -i \delta^{AD} + \omega^A_{BC}\, \varepsilon^{BCD} )\, f^D.$
Proof. Using Definition 2.14 and Eqs. (2.20), (3.34,37), we get
$T(\nabla)\, e^A = -i f^A + \omega^A_{BC}\, \varepsilon^{BCD}\, f^D.$
Proposition 3.7. A connection is torsionless and unitary if and only if its coefficients satisfy the following conditions:
i) $\omega^A_{BC} - (\omega^A_{BC})^* = i \varepsilon^{ABC},$ (3.52)
ii) $\omega^A_{BC} = (\omega^A_{CB})^*,$ (3.53)
iii) $\omega^A_{BC} = \omega^C_{AB}.$ (3.54)
In particular, such a connection is uniquely determined by the nine self-adjoint elements $\omega^A_{AB} \in A_0$ and the self-adjoint part of $\omega^1_{23}$.
Proof. The condition of vanishing torsion,
$\omega^A_{BC}\, \varepsilon^{BCD} = i \delta^{AD},$
can equivalently be written as
$\omega^A_{BC} - \omega^A_{CB} = i \varepsilon^{ABC}.$ (3.55)
Using alternatively Eq. (3.55) and the unitarity condition Eq. (3.51) we get
$\omega^A_{BC} = i \varepsilon^{ABC} + \omega^A_{CB} = i \varepsilon^{ABC} + (\omega^B_{CA})^* = (\omega^B_{AC})^* = \omega^C_{AB},$
which proves the result.
This proposition shows that, as in the classical case, there are many unitary and torsionless connections. There are two possibilities to reduce the space of “natural” connections further. First, we can consider real connections, i.e., connections whose associated parallel transport maps real forms to real forms. In the classical setting, a 1-form $\omega$ is real if $\omega^* = -\omega$ (the sign comes from the fact that the Clifford matrices are anti-Hermitian). Thus, we see that our basis of 1-forms consists of imaginary 1-forms, i.e., $(e^A)^* = e^A$. If the covariant derivative of an imaginary 1-form is to be imaginary, then the connection coefficients $\omega^A_{BC}$ must be anti-Hermitian. We call such a connection a real connection.
Corollary 3.8. There is a unique real unitary and torsionless connection on the cotangent bundle $\Omega^1_D(A_0)$, and its coefficients are given by
$\omega^A_{BC} = \tfrac{i}{2}\, \varepsilon^{ABC}.$
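The coefficients of Corollary 3.8 can be tested directly against conditions (3.52)–(3.54) and the torsion condition. The sketch below is an illustration under one simplifying assumption: the coefficients are treated as complex numbers (scalar multiples of the identity of the algebra), so that the adjoint becomes complex conjugation. The names `eps`, `OMEGA`, `satisfies_prop_3_7`, `is_anti_hermitian` and `is_torsionless` are hypothetical.

```python
# Check that omega^A_{BC} = (i/2) eps^{ABC} is real, unitary and torsionless.
# Assumption: coefficients are scalars, so the adjoint is complex conjugation.

def eps(a, b, c):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

R3 = range(3)
# OMEGA[a][b][c] stands for the coefficient omega^A_{BC}.
OMEGA = [[[0.5j * eps(a, b, c) for c in R3] for b in R3] for a in R3]

def satisfies_prop_3_7(w):
    """Conditions i), ii), iii) of Proposition 3.7, Eqs. (3.52)-(3.54)."""
    for a in R3:
        for b in R3:
            for c in R3:
                if w[a][b][c] - w[a][b][c].conjugate() != 1j * eps(a, b, c):
                    return False  # i) fails
                if w[a][b][c] != w[a][c][b].conjugate():
                    return False  # ii) fails
                if w[a][b][c] != w[c][a][b]:
                    return False  # iii) fails
    return True

def is_anti_hermitian(w):
    """Reality condition: every coefficient equals minus its conjugate."""
    return all(w[a][b][c].conjugate() == -w[a][b][c]
               for a in R3 for b in R3 for c in R3)

def is_torsionless(w):
    """Vanishing torsion: omega^A_{BC} eps^{BCD} = i delta^{AD}."""
    return all(sum(w[a][b][c] * eps(b, c, d) for b in R3 for c in R3)
               == 1j * (a == d)
               for a in R3 for d in R3)
```

All values involved are exact binary fractions, so the equality comparisons are not affected by rounding.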
There is another way of reducing the number of “natural” connections. If we look at a general unitary and torsionless connection, we see that it does not have any isotropy property. For example, the coefficients $\omega^A_{AA}$ are all independent of one another. We hope that if we require the connection to be invariant under all 1-parameter groups of isometries (see [CFG, Gr]) we shall get relations among these coefficients. We shall not pursue this route here, but we refer the reader to [Gr] for a detailed analysis.
We proceed with the computation of the scalar curvature of the real connection $\tilde\nabla$ of Corollary 3.8. For any connection $\nabla$ with coefficients $\omega^A_{BC}$ as defined in Eq. (3.50), the curvature tensor is given by (see Definition 2.11)
$R(\nabla)\, e^A = -\nabla^2 e^A = \bigl( [ J_D , \omega^A_{EC} ]\, \varepsilon^{DEB} - i \omega^A_{BC} + \omega^A_{DE}\, \omega^E_{FC}\, \varepsilon^{DFB} \bigr)\, f^B \otimes e^C.$ (3.56)
In particular, for the real connection $\tilde\nabla$ of Corollary 3.8, the curvature tensor reads
$R(\tilde\nabla)\, e^A = \tfrac{1}{4}\, \varepsilon^{ABC}\, f^B \otimes e^C.$ (3.57)
In order to compute the Ricci curvature, we use a dual basis to the generators $e^A$, as in Subsect. 2.1.7, before Eq. (2.15). It is clear that the elements $\varepsilon^A \in \Omega^1_D(A_0)^*$ defined by
$\varepsilon^A(\omega) = \varepsilon^A(\omega_B e^B) := \omega_A$ (3.58)
for all $\omega \in \Omega^1_D(A_0)$, form a dual basis to $e^A$. Using Eq. (3.49) it is then easy to verify that the dual 1-forms $e_A$ and their dual maps $e_A^{\mathrm{ad}}$, Eq. (2.16), are given by
$e_A = e^A, \quad e_A^{\mathrm{ad}}(f^B) = -\varepsilon^{ABC}\, e^C.$ (3.59)
For the real connection $\tilde\nabla$, we get from Eq. (3.57),
$\mathrm{Ric}(\tilde\nabla) = -\tfrac{1}{2}\, e^A \otimes e^A.$ (3.60)
We proceed with the computation of the scalar curvature. The right dual maps $(e^A_R)^{\mathrm{ad}}$ to the basis 1-forms $e^A$, Eq. (2.17), act as
$(e^A_R)^{\mathrm{ad}}(e^B) = \delta^{AB}.$ (3.61)
The scalar curvature of the real connection $\tilde\nabla$ follows from Eq. (3.60) and is given by
$r(\tilde\nabla) = -\tfrac{3}{2}.$ (3.62)
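The chain from Eq. (3.56) to Eq. (3.62) can be retraced numerically for the real connection. The sketch below rests on two assumptions that are not spelled out in this form in the text: the coefficients (i/2) ε^{ABC} are treated as scalars, so that the commutator term in Eq. (3.56) drops out, and the Ricci and scalar contractions are implemented as one plausible reading of Eqs. (3.59)–(3.61), namely applying the dual map to the 2-form factor and then tracing. All function names are hypothetical.

```python
# Retrace Eqs. (3.56)-(3.62) for omega^A_{BC} = (i/2) eps^{ABC}.
# Assumptions: scalar coefficients (commutator term vanishes) and the
# contraction pattern described in the lead-in paragraph.

def eps(a, b, c):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

R3 = range(3)
OMEGA = [[[0.5j * eps(a, b, c) for c in R3] for b in R3] for a in R3]

def curvature(w):
    """R[a][b][c]: coefficient of f^B (x) e^C in R(nabla) e^A, Eq. (3.56)."""
    return [[[-1j * w[a][b][c]
              + sum(w[a][d][e] * w[e][f][c] * eps(d, f, b)
                    for d in R3 for e in R3 for f in R3)
              for c in R3] for b in R3] for a in R3]

def ricci(curv):
    """ric[d][c]: coefficient of e^D (x) e^C, using e_A^ad(f^B) = -eps^{ABD} e^D."""
    return [[sum(-eps(a, b, d) * curv[a][b][c] for a in R3 for b in R3)
             for c in R3] for d in R3]

def scalar_curvature(ric):
    """Trace of the Ricci coefficients, using (e^A_R)^ad(e^B) = delta^{AB}."""
    return sum(ric[a][a] for a in R3)
```

Under these assumptions the curvature coefficients come out as (1/4) ε^{ABC}, the Ricci coefficients as −(1/2) δ^{DC}, and the trace as −3/2, in agreement with Eqs. (3.57), (3.60) and (3.62).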
It is the same as the scalar curvature of the unique real unitary and torsionless connection for the classical SU(2) – recall that the definition of the scalar curvature in the noncommutative setting differs from the classical one by a sign, see the remark in Definition 2.16. This completes our study of the non-commutative 3-sphere in terms of N = 1 spectral data. Our results show that the non-commutative 3-sphere has striking similarities with its classical counterpart. As we saw in Subsects. 3.2.1 and 3.2.2, the spaces of differential forms have the same structure as left-modules over the algebra of functions, and the cohomology groups have the same dimensions as modules over the zeroth cohomology group, H 0 . Furthermore, geometric invariants like the scalar curvature, too, coincide for the classical and the quantized 3-sphere. 3.4. Remarks on N = (1, 1). In the following, we consider N = (1, 1) data for the algebra A0 . The construction of the first subsection starts from the BRST operator of the group G and leads to a deformation of the de Rham complex for the classical 3-sphere in the form of N = (1, 1) data for the non-commutative 3-sphere. In the second subsection, we return to the two generalized Dirac operators provided by superconformal field theory [FG], which lead to a different formulation of N = (1, 1) data, displaying “spontaneously broken supersymmetry”.
3.4.1. N = (1, 1) data from BRST. One way to arrive at N = (1, 1) data for the algebra $\mathcal{A}_0$ and at the associated (non-commutative) de Rham complex for the quantized 3-sphere is to use the action of the group G on the Hilbert space $\mathcal{H}_0$ for introducing a BRST operator (see also Sect. I 2.3). Let $\{J_A\}$ be the basis of the complexified Lie algebra $\mathfrak{g}^{\mathbb{C}}$ of G introduced in Eq. (3.7). The BRST operator Q for the group G is defined as usual: We introduce ghosts $c^A$ and anti-ghosts $b_A$ satisfying the ghost algebra
\[ \{c^A, c^B\} = \{b_A, b_B\} = 0, \qquad \{c^A, b_B\} = \delta^A_B. \tag{3.63} \]
Then the BRST operator is given by
\[ Q = c^A J_A - \frac{i}{2}\, f^{C}_{AB}\, c^A c^B b_C, \tag{3.64} \]
and the ghost number operator is
\[ T = c^A b_A. \tag{3.65} \]
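The nilpotency of Q and its ghost number can be checked numerically in a small example. The following is only a sketch under stated assumptions, not part of the original construction: it takes G = SU(2) in the spin-1/2 representation with $[J_A, J_B] = i\,\varepsilon_{ABC} J_C$ (the normalization of Eq. (3.7) is assumed, not quoted), a flat metric $g^{AB} = \delta^{AB}$, and realizes the ghosts by Jordan-Wigner matrices with $c^A = b_A^\dagger$. All function names are hypothetical.

```python
import numpy as np
from functools import reduce


def fermion_annihilators(n):
    """Jordan-Wigner matrices b_0..b_{n-1} with {b_j, b_k^dagger} = delta_jk."""
    a = np.array([[0, 1], [0, 0]], dtype=complex)   # single-mode annihilator
    z = np.diag([1.0, -1.0]).astype(complex)        # parity string factor
    one = np.eye(2, dtype=complex)
    return [reduce(np.kron, [z] * k + [a] + [one] * (n - k - 1)) for k in range(n)]


def su2_data():
    """Spin-1/2 generators J_A = sigma_A / 2 and structure constants eps_ABC."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k], eps[j, i, k] = 1.0, -1.0
    return [sx / 2, sy / 2, sz / 2], eps


def brst_and_ghost_number():
    """Build Q as in Eq. (3.64) and T as in Eq. (3.65) on C^2 (x) C^8."""
    J, eps = su2_data()
    b = fermion_annihilators(3)          # anti-ghosts b_A
    c = [m.conj().T for m in b]          # ghosts c^A = b_A^dagger (assumption: g = delta)
    id_spin = np.eye(2, dtype=complex)
    # First term: c^A J_A (J acts on the spin factor, c on the ghost factor).
    Q = sum(np.kron(J[A], c[A]) for A in range(3))
    # Second term: -(i/2) f^C_{AB} c^A c^B b_C with f^C_{AB} = eps_{ABC}.
    for A in range(3):
        for B in range(3):
            for C in range(3):
                Q = Q - 0.5j * eps[A, B, C] * np.kron(id_spin, c[A] @ c[B] @ b[C])
    T = np.kron(id_spin, sum(c[A] @ b[A] for A in range(3)))
    return Q, T
```

Calling `Q, T = brst_and_ghost_number()` and evaluating `np.allclose(Q @ Q, 0)` and `np.allclose(T @ Q - Q @ T, Q)` tests $Q^2 = 0$ and $[T, Q] = Q$ in this toy representation.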
The Hilbert space of the N = (1, 1) data will be of the form $\mathcal{H}_0 \otimes W$ where W is a representation space for the ghost algebra. In order to obtain N = (1, 1) data, we require that the ghost algebra acts unitarily on W with respect to the natural ∗-operation, namely
\[ c^{A*} = g^{AB}\, b_B. \tag{3.66} \]
This choice is compatible with positive definiteness of the scalar product on W, and it renders the ghost number operator T self-adjoint.¹ Furthermore, this choice of ∗-operation leads to identifying the ghost algebra with the CAR
\[ \{c^A, c^B\} = 0, \qquad \{c^A, c^{B*}\} = g^{AB}, \tag{3.67} \]
and the BRST operator can be written
\[ Q = c^A J_A - \frac{i}{2}\, f^{C}_{AB}\, c^A c^B c_C^{*}, \tag{3.68} \]
where indices are raised and lowered with the metric $g_{AB}$ as usual. Under the identifications $c^A \sim a^{A*} := -i\,\theta^A\wedge$, where $\{\theta^A\}$ is a basis of 1-forms dual to $\{\vartheta_A\}$, Eq. (3.68) for the BRST operator formally coincides with the exterior derivative on G. This fact was already mentioned in Sect. I 2.3. In order to complete our construction of N = (1, 1) data, we introduce the Hodge ∗-operator
\[ * = \frac{1}{n!}\,\sqrt{g}\;\varepsilon_{A_1 \ldots A_n}\,\bigl(c^{A_1} + c^{A_1 *}\bigr) \cdots \bigl(c^{A_n} + c^{A_n *}\bigr), \tag{3.69} \]
where n = dim G. This operator clearly commutes with the algebra $\mathcal{A}_0$ of Theorem 3.1. Moreover, it is easy to verify that ∗ is unitary and satisfies
\[ *^2 = (-1)^{n(n-1)/2} \tag{3.70} \]
as well as
\[ *\,Q = (-1)^{n-1}\, Q\, *. \tag{3.71} \]
¹ In the context of gauge theories, one considers representations such that $c^{A*} = c^A$, $b_A^* = b_A$. These Hermiticity conditions together with the defining relations (3.63) imply that the inner product of the representation space is not positive definite, which is why $c^A$ and $b_A$ are called ghosts.
It follows that $(\mathcal{A}, \mathcal{H}, d, \gamma, *)$ with $\mathcal{A} = \mathcal{A}_0$, $\mathcal{H} = \mathcal{H}_0 \otimes W$, d = Q and where γ is the modulo 2 reduction of the $\mathbb{Z}$-grading T, form a set of N = (1, 1) data in the sense of Definition 2.20. We refrain from presenting the details of the construction of differential forms and of the other geometrical quantities, since the computations are fairly straightforward. For example, the space of k-forms is given by
\[ \Omega^k_d(\mathcal{A}_0) = \bigl\{\, a_{A_1 \ldots A_k}\, c^{A_1} \cdots c^{A_k} \;\big|\; a_{A_1 \ldots A_k} \in \mathcal{A}_0 \,\bigr\}. \tag{3.72} \]
For G = SU(2), we see that these spaces are isomorphic to $\Omega^k_D(\mathcal{A}_0)$ as left $\mathcal{A}_0$-modules. Furthermore, it is easy to see that $\Omega^\bullet_d(\mathcal{A}_0)$ and $\Omega^\bullet_D(\mathcal{A}_0)$ are isomorphic as complexes, which proves that, in particular, their cohomologies coincide. Of course, the same constructions and results apply to the BRST operator associated with the right-action of G on $\mathcal{H}_0$ given by the generators $\bar J_A$ of Eq. (3.7). The Hilbert space $\mathcal{H} = \mathcal{H}_0 \otimes W$ can be decomposed into a direct sum of eigenspaces of the $\mathbb{Z}$-grading operator T,
\[ \mathcal{H} = \bigoplus_{k=0}^{n} \mathcal{H}^{(k)}, \]
where $\mathcal{H}^{(0)} = \mathcal{H}_0$, n = dim G (= 3 for G = SU(2)). The subspaces $\mathcal{H}^{(k)}$ are left-modules for $\mathcal{A}_0$. Furthermore, it follows from Eqs. (3.65) and (3.68) that d := Q maps $\mathcal{H}^{(k)}$ into $\mathcal{H}^{(k+1)}$ for k = 0, . . . , n (with $\mathcal{H}^{(n+1)} := \{0\}$). Since $d^2 = 0$, $\mathcal{H}$ is a complex. Viewed as linear spaces, the cohomology groups of $\Omega^\bullet_d(\mathcal{A}_0)$ and $(\mathcal{H}, Q)$ are isomorphic, although the latter do not carry a ring structure. As a side remark, consider an odd operator H on $\mathcal{H}$. Then $\tilde d := d + H$ is nilpotent if and only if $\{d, H\} + H^2 = 0$. If H commutes with $\mathcal{A}_0$, then $\Omega^\bullet_{\tilde d}(\mathcal{A}_0)$ and $\Omega^\bullet_d(\mathcal{A}_0)$ are identical complexes. In the next subsection, we will meet a conformal field theory motivated example for $\tilde d = d + H$ which is nilpotent but for which H does not commute with $\mathcal{A}_0$.
3.4.2. Spontaneously broken supersymmetry. In Sect. 3.1 we introduced two connections $\nabla^S$ and $\bar\nabla^S$ and their associated Dirac operators $D$ and $\bar D$, see Eqs. (3.6-9). Since these two Dirac operators correspond to different connections, they are not Dirac operators on an N = (1, 1) Dirac bundle in the sense of Definition I 2.6. It is interesting to notice that $D$ and $\bar D$ nevertheless satisfy the N = (1, 1) algebra [FG],
\[ D^2 = \bar D^2, \qquad \{D, \bar D\} = 0. \tag{3.73} \]
The easiest way to prove (3.73) is to verify that the generalized exterior derivative
\[ \tilde d := \frac{1}{2}\,(D + i \bar D) \tag{3.74} \]
is nilpotent. Let $\{\vartheta_A\}$ and $\{\theta^A\}$ denote a basis of the Lie algebra and the dual basis of 1-forms, respectively, as before. We define the operators $a^{A*} = \theta^A\wedge$, $a_A = \vartheta_A \lrcorner$ as usual, and we can express the fermionic operators $\psi^A$ and $\bar\psi^A$ as
\[ \psi^A = -i\,(a^{A*} - a^A), \qquad \bar\psi^A = -(a^{A*} + a^A), \tag{3.75} \]
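where indices are raised and lowered with the metric $g_{AB}$. Using Eqs. (3.9) and (3.74), we can rewrite the operator $\tilde d$ as a sum of terms of degree 1, −1 and −3,
\[ \tilde d = \tilde d_1 + \tilde d_{-1} + \tilde d_{-3}, \tag{3.76} \]
where
\[ \begin{aligned} \tilde d_1 &= a^{A*} J^+_A - \frac{1}{4}\, f_{ABC}\, a^{A*} a^{B*} a^{C}, \\ \tilde d_{-1} &= -a^{A} J^-_A, \\ \tilde d_{-3} &= -\frac{1}{12}\, f_{ABC}\, a^{A} a^{B} a^{C}, \end{aligned} \tag{3.77} \]
with
\[ J^\pm_A = -\frac{i}{2}\,(J_A \pm \bar J_A). \tag{3.78} \]
It is then straightforward to show that $\tilde d$ given by Eqs. (3.76,77) satisfies $\tilde d^2 = 0$ and that the associated Laplacian $\Delta = \{\tilde d, \tilde d^*\}$ is given by
\[ \Delta = g^{AB} J_A J_B + \frac{\dim G}{24} = g^{AB} \bar J_A \bar J_B + \frac{\dim G}{24}. \tag{3.79} \]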
Thus, $\Delta$ is a strictly positive operator, corresponding to what one calls spontaneously broken supersymmetry in the context of field theory. This implies that the cohomology of the complex $(\mathcal{H}, \tilde d)$ is trivial. However, the cohomology of the complex $\Omega^\bullet_{\tilde d}(\mathcal{A}_0)$, as introduced in Sect. 2.2, is not trivial. Notice that $\tilde d_1$ is the BRST operator associated to the generators $J^+_A$ and hence nilpotent. This implies that the BRST cohomology of the fuzzy 3-sphere can be extracted from $\Omega^\bullet_{\tilde d}(\mathcal{A}_0)$.

4. The Non-Commutative Torus

As a second, “classic” example of non-commutative spaces, we discuss the geometry of the non-commutative 2-torus [Ri, Co1, Co5]. After a short review of the classical torus in Subsect. 4.1, we analyze the spin geometry (N = 1) of the non-commutative torus in Subsect. 4.2 along the lines of [FGR2, Gr]. In Subsects. 4.3 and 4.4, we successively extend the N = 1 data to N = (1, 1) and N = (2, 2) data, according to the general procedure proposed in Subsect. 2.2.5 above. In these two last subsections, we do not give detailed proofs, but merely state the results since the computations, although straightforward, are tedious and not very illuminating.

4.1. The classical torus. To begin with, we describe the N = 1 data associated to the classical 2-torus $\mathbb{T}^2_0$. By Fourier transformation, the algebra of smooth functions over $\mathbb{T}^2_0$ is isomorphic to the Schwartz space $\mathcal{A}_0 := \mathcal{S}(\mathbb{Z}^2)$ over $\mathbb{Z}^2$, endowed with the (commutative) convolution product:
\[ (a \bullet b)(p) = \sum_{q \in \mathbb{Z}^2} a(q)\, b(p - q), \tag{4.1} \]
where $a, b \in \mathcal{A}_0$ and $p \in \mathbb{Z}^2$. Complex conjugation of functions translates into a ∗-operation:
\[ a^*(p) = \overline{a(-p)}, \qquad a \in \mathcal{A}_0. \tag{4.2} \]
If we choose a spin structure over $\mathbb{T}^2_0$ in such a way that the spinors are periodic along the elements of a homology basis, then the associated spinor bundle is a trivial rank 2 vector bundle. With this choice, the space of square integrable spinors is given by the direct sum
\[ \mathcal{H} = l^2(\mathbb{Z}^2) \oplus l^2(\mathbb{Z}^2), \tag{4.3} \]
where $l^2(\mathbb{Z}^2)$ denotes the space of square summable functions over $\mathbb{Z}^2$. The algebra $\mathcal{A}_0$ acts diagonally on $\mathcal{H}$ by the convolution product. We choose a flat metric $(g_{\mu\nu})$ on $\mathbb{T}^2_0$ and we introduce the corresponding 2-dimensional gamma matrices
\[ \{\gamma^\mu, \gamma^\nu\} = -2\, g^{\mu\nu}, \qquad \gamma^{\mu *} = -\gamma^\mu. \tag{4.4} \]
Then, the Dirac operator D on $\mathcal{H}$ is given by
\[ (D\,\xi)(p) = i\, p_\mu \gamma^\mu\, \xi(p), \qquad \xi \in \mathcal{H}. \tag{4.5} \]
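Relations (4.4) and (4.5) can be made concrete in a specific representation. The sketch below assumes the flat metric $g^{\mu\nu} = \delta^{\mu\nu}$ and the choice $\gamma^1 = i\sigma_x$, $\gamma^2 = i\sigma_y$; this representation is an illustrative assumption (the text does not fix one), and the function names are hypothetical. It checks the Clifford relations and that the matrix $i p_\mu \gamma^\mu$ is self-adjoint with square $p^2$.

```python
import numpy as np

# Assumption: g^{mu nu} = delta^{mu nu}; gamma^1 = i*sigma_x, gamma^2 = i*sigma_y
# is one of many equivalent representations of Eq. (4.4).
SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
GAMMA = [1j * SIGMA_X, 1j * SIGMA_Y]


def dirac_symbol(p):
    """Matrix i p_mu gamma^mu multiplying the spinor component xi(p), Eq. (4.5)."""
    return 1j * (p[0] * GAMMA[0] + p[1] * GAMMA[1])


def clifford_relations_hold():
    """Check {gamma^mu, gamma^nu} = -2 delta^{mu nu} and gamma^{mu *} = -gamma^mu."""
    for m in range(2):
        if not np.allclose(GAMMA[m].conj().T, -GAMMA[m]):
            return False
        for n in range(2):
            anti = GAMMA[m] @ GAMMA[n] + GAMMA[n] @ GAMMA[m]
            if not np.allclose(anti, -2.0 * (m == n) * np.eye(2)):
                return False
    return True
```

For instance, `dirac_symbol((3, -4))` is a Hermitian 2×2 matrix whose square is 25 times the identity, reflecting $D^2 = p^2$ on each Fourier mode.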
Finally, the $\mathbb{Z}_2$-grading on $\mathcal{H}$, denoted by σ, can be written as
\[ \sigma = \frac{i}{2}\,\sqrt{g}\;\varepsilon_{\mu\nu}\, \gamma^\mu \gamma^\nu, \tag{4.6} \]
where $\varepsilon_{\mu\nu}$ is the Levi-Civita tensor. The data $(\mathcal{A}_0, \mathcal{H}, D, \sigma)$ are the canonical N = 1 data associated to the compact spin manifold $\mathbb{T}^2_0$, and it is thus clear that they satisfy all the properties of Definition 2.1.

4.2. Spin geometry (N = 1). The non-commutative torus is obtained by deforming the product of the algebra $\mathcal{A}_0$. For each $\alpha \in \mathbb{R}$, we define the algebra $\mathcal{A}_\alpha := \mathcal{S}(\mathbb{Z}^2)$ with the product
\[ (a \bullet_\alpha b)(p) = \sum_{q \in \mathbb{Z}^2} a(q)\, b(p - q)\, e^{i\pi\alpha\,\omega(p,q)}, \tag{4.7} \]
where ω is the integer-valued anti-symmetric bilinear form on $\mathbb{Z}^2 \times \mathbb{Z}^2$,
\[ \omega(p, q) = p_1 q_2 - p_2 q_1, \qquad p, q \in \mathbb{Z}^2. \tag{4.8} \]
The ∗-operation is defined as before. Alternatively, we could introduce the algebra $\mathcal{A}_\alpha$ as the unital ∗-algebra generated by the elements U and V subject to the relations
\[ U U^* = U^* U = V V^* = V^* V = 1, \qquad U V = e^{-2\pi i \alpha}\, V U. \tag{4.9} \]
Having chosen an appropriate closure, the equivalence of the two descriptions is easily seen if one makes the following identifications:
\[ U(p) = \delta_{p_1,1}\, \delta_{p_2,0}, \qquad V(p) = \delta_{p_1,0}\, \delta_{p_2,1}. \tag{4.10} \]
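The equivalence of the two descriptions can be tried out on finitely supported functions. The sketch below implements the deformed product (4.7) for functions on $\mathbb{Z}^2$ stored as dictionaries mapping a pair $(p_1, p_2)$ to a complex value; this storage format and the names `star`, `adjoint`, `U`, `V` are illustrative assumptions. With the identifications (4.10) it reproduces the commutation relation in (4.9).

```python
import cmath
from collections import defaultdict


def omega(p, q):
    """Anti-symmetric form of Eq. (4.8)."""
    return p[0] * q[1] - p[1] * q[0]


def star(a, b, alpha):
    """Deformed product (4.7) for finitely supported a, b given as dicts on Z^2."""
    out = defaultdict(complex)
    for q, a_q in a.items():
        for r, b_r in b.items():                 # r plays the role of p - q
            p = (q[0] + r[0], q[1] + r[1])
            out[p] += a_q * b_r * cmath.exp(1j * cmath.pi * alpha * omega(p, q))
    return dict(out)


def adjoint(a):
    """*-operation a*(p) = conj(a(-p))."""
    return {(-p[0], -p[1]): complex(v).conjugate() for p, v in a.items()}


# Generators according to Eq. (4.10).
U = {(1, 0): 1.0}
V = {(0, 1): 1.0}
```

For any real `alpha`, `star(U, V, alpha)` and `star(V, U, alpha)` are both supported at the single point (1, 1), and their values there differ by the factor $e^{-2\pi i\alpha}$, as in (4.9); `star(adjoint(U), U, alpha)` is the unit, supported at (0, 0).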
If α is a rational number, $\alpha = \frac{M}{N}$, where M and N are co-prime integers, then the centre $Z(\mathcal{A}_\alpha)$ of $\mathcal{A}_\alpha$ is infinite-dimensional:
\[ Z(\mathcal{A}_\alpha) = \mathrm{span}\,\bigl\{\, U^{mN} V^{nN} \;\big|\; m, n \in \mathbb{Z} \,\bigr\}. \tag{4.11} \]
Let $I_\alpha$ denote the ideal of $\mathcal{A}_\alpha$ generated by $Z(\mathcal{A}_\alpha) - 1$. Then it is easy to see that the quotient $\mathcal{A}_\alpha / I_\alpha$ is isomorphic, as a unital ∗-algebra, to the full matrix algebra $M_N(\mathbb{C})$.
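A familiar way to see the matrix algebra appear at rational α is through clock and shift matrices. The sketch below is an illustration under that assumption (the text does not specify a particular realization, and the function names are hypothetical): it builds N×N unitaries with $UV = e^{-2\pi i M/N} VU$ and $U^N = V^N = 1$, and tests whether the words $U^m V^n$ span all of $M_N(\mathbb{C})$.

```python
import numpy as np


def clock_shift(M, N):
    """Return (U, V) in M_N(C) with U V = exp(-2*pi*i*M/N) V U and U^N = V^N = 1."""
    w = np.exp(2j * np.pi * M / N)
    U = np.roll(np.eye(N, dtype=complex), 1, axis=0)   # shift: e_k -> e_{k+1 mod N}
    V = np.diag(w ** np.arange(N))                      # clock: e_k -> w^k e_k
    return U, V


def spans_full_matrix_algebra(U, V):
    """True if the N^2 words U^m V^n (0 <= m, n < N) are linearly independent."""
    N = U.shape[0]
    words = [np.linalg.matrix_power(U, m) @ np.linalg.matrix_power(V, n)
             for m in range(N) for n in range(N)]
    rank = np.linalg.matrix_rank(np.array([x.ravel() for x in words]))
    return bool(rank == N * N)
```

For co-prime M and N, for example `clock_shift(1, 3)`, the spanning test succeeds; for M = 2, N = 4 it fails, which illustrates why the co-primality assumption matters.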
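If α is irrational, then the centre of $\mathcal{A}_\alpha$ is trivial and $\mathcal{A}_\alpha$ is of type II$_1$, the trace being given by the evaluation at p = 0. Unless stated differently, we shall only study the case of irrational α. We define the non-commutative 2-torus $\mathbb{T}^2_\alpha$ by its N = 1 data $(\mathcal{A}_\alpha, \mathcal{H}, D, \sigma)$ where $\mathcal{H}$, D and σ are as in Eqs. (4.3), (4.5) and (4.6), and $\mathcal{A}_\alpha$ acts diagonally on $\mathcal{H}$ by the deformed product, Eq. (4.7). When $\alpha = \frac{M}{N}$ is rational, one may work with the data $(\mathcal{A}_\alpha / I_\alpha,\; M_N(\mathbb{C}) \otimes \mathbb{C}^2,\; D_\alpha,\; \sigma)$, where the Dirac operator $D_\alpha$ is given by
\[ D_\alpha = i\, \gamma^\mu\, \frac{N}{\pi}\, \sin\Bigl(\frac{\pi}{N}\, p_\mu\Bigr). \tag{4.12} \]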
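4.2.1. Differential forms. Recall that there is a representation π of the algebra of universal forms $\Omega^\bullet(\mathcal{A}_\alpha)$ on $\mathcal{H}$ (see Subsect. 2.1.2). The images of the homogeneous subspaces of $\Omega^\bullet(\mathcal{A}_\alpha)$ under π are given by
\[ \pi\bigl(\Omega^0(\mathcal{A}_\alpha)\bigr) = \mathcal{A}_\alpha \quad \text{(by definition)}, \tag{4.13} \]
\[ \pi\bigl(\Omega^{2k-1}(\mathcal{A}_\alpha)\bigr) = \{\, a_\mu \gamma^\mu \mid a_\mu \in \mathcal{A}_\alpha \,\}, \tag{4.14} \]
\[ \pi\bigl(\Omega^{2k}(\mathcal{A}_\alpha)\bigr) = \{\, a + b\,\sigma \mid a, b \in \mathcal{A}_\alpha \,\} \tag{4.15} \]
for all $k \in \mathbb{Z}_+$. In principle, one should then compute the kernels $J^n$ of π (see Eq. (2.2)), but these are generally huge and difficult to describe explicitly. To determine the space of n-forms, it is simpler to use the isomorphism
\[ \Omega^n_D(\mathcal{A}_\alpha) = \Omega^n(\mathcal{A}_\alpha) \big/ \bigl(J^n + \delta J^{n-1}\bigr) \simeq \pi\bigl(\Omega^n(\mathcal{A}_\alpha)\bigr) \big/ \pi\bigl(\delta J^{n-1}\bigr). \tag{4.16} \]
First, we have to compute the spaces of “auxiliary forms” $\pi(\delta J^{n-1})$.

Lemma 4.1. The spaces $\pi(\delta J^{n-1})$ of auxiliary forms are given by
\[ \pi\bigl(\delta J^{1}\bigr) = \mathcal{A}_\alpha, \tag{4.17} \]
\[ \pi\bigl(\delta J^{2k}\bigr) = \pi\bigl(\Omega^{2k+1}(\mathcal{A}_\alpha)\bigr), \tag{4.18} \]
\[ \pi\bigl(\delta J^{2k+1}\bigr) = \pi\bigl(\Omega^{2k+2}(\mathcal{A}_\alpha)\bigr), \tag{4.19} \]
for all k ≥ 1.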
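Proof. Let $a_i, b_i \in \mathcal{A}_\alpha$ be such that the universal 1-form $\eta = \sum_i a_i\, \delta b_i \in \Omega^1(\mathcal{A}_\alpha)$ satisfies π(η) = 0. This means that
\[ i \sum_{j,q} (p - q)_\mu \gamma^\mu\, a_j(q)\, b_j(p - q)\, e^{i\pi\alpha\,\omega(p,q)} = 0 \tag{4.20} \]
for all $p \in \mathbb{Z}^2$. Using Eq. (4.20), we have
\[ \begin{aligned} \pi(\delta\eta) &= -\sum_{j,q} q_\mu (p - q)_\nu\, \gamma^\mu \gamma^\nu\, a_j(q)\, b_j(p - q)\, e^{i\pi\alpha\,\omega(p,q)} \\ &= -\sum_{j,q} (q^2 - p^2)\, a_j(q)\, b_j(p - q)\, e^{i\pi\alpha\,\omega(p,q)} \in \mathcal{A}_\alpha. \end{aligned} \tag{4.21} \]
This proves that $\pi(\delta J^1) \subset \mathcal{A}_\alpha$. Then, we construct an explicit non-vanishing element of $\pi(\delta J^1)$. We set
\[ a_1(p) = b_2(p) = \delta_{p_1,-1}\, \delta_{p_2,0}, \qquad a_2(p) = b_1(p) = \delta_{p_1,1}\, \delta_{p_2,0}, \]
and an easy computation shows that the element $\eta = \sum_{i=1}^{2} a_i\, \delta b_i$ satisfies
\[ \pi(\eta) = 0, \qquad \pi(\delta\eta) = -g^{11}. \]
Since $\pi(\delta J^1)$ is an $\mathcal{A}_\alpha$-bimodule, Eq. (4.17) follows. Let k ≥ 3 and $\eta \in \Omega^k(\mathcal{A}_\alpha)$. Then, using Eqs. (4.14) and (4.15), we see that there exists an element $\psi \in \Omega^{k-2}(\mathcal{A}_\alpha)$ with π(η) = π(ψ). The first part of the proof ensures the existence of an element $\phi \in \Omega^1(\mathcal{A}_\alpha)$ with π(φ) = 0 and π(δφ) = 1. Then we have $\phi\psi \in J^{k-1}$, and π(δ(φψ)) = π(ψ) = π(η), proving that $\eta \in \delta J^{k-1}$, and therefore Eqs. (4.18) and (4.19).

As a corollary to this lemma, we obtain the following

Proposition 4.2. Up to isomorphism, the spaces of differential forms are given by
\[ \Omega^0_D(\mathcal{A}_\alpha) = \mathcal{A}_\alpha, \tag{4.22} \]
\[ \Omega^1_D(\mathcal{A}_\alpha) \cong \{\, a_\mu \gamma^\mu \mid a_\mu \in \mathcal{A}_\alpha \,\}, \tag{4.23} \]
\[ \Omega^2_D(\mathcal{A}_\alpha) \cong \{\, a\,\sigma \mid a \in \mathcal{A}_\alpha \,\}, \tag{4.24} \]
\[ \Omega^n_D(\mathcal{A}_\alpha) = 0 \quad \text{for } n \geq 3, \tag{4.25} \]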
where we have chosen special representatives on the right-hand side. Notice that $\Omega^1_D(\mathcal{A}_\alpha)$ and $\Omega^2_D(\mathcal{A}_\alpha)$ are free left $\mathcal{A}_\alpha$-modules of rank 2 and 1, respectively. This reflects the fact that the bundles of 1- and 2-forms over the 2-torus are trivial and of rank 2 and 1, respectively.

4.2.2. Integration and Hermitian structure over $\Omega^1_D(\mathcal{A}_\alpha)$. It follows from Eqs. (4.13-15) that there is an isomorphism $\pi(\Omega^\bullet(\mathcal{A}_\alpha)) \simeq \mathcal{A}_\alpha \otimes M_2(\mathbb{C})$. Applying the general definition of the integral (see Subsect. 2.1.3) to the non-commutative torus, one finds for an arbitrary element $\omega \in \pi(\Omega^\bullet(\mathcal{A}_\alpha))$,
\[ {-\!\!\!\!\!\!\int} \omega = \mathrm{Tr}_{\mathbb{C}^2}\bigl(\omega(0)\bigr), \tag{4.26} \]
where $\mathrm{Tr}_{\mathbb{C}^2}$ denotes the normalized trace on $\mathbb{C}^2$. The cyclicity property, Assumption 2.4 in Subsect. 2.1.3, follows directly from the definition of the product in $\mathcal{A}_\alpha$ and the cyclicity of the trace on $M_2(\mathbb{C})$. The kernels $K^n$ of the canonical sesqui-linear form on $\pi(\Omega^\bullet(\mathcal{A}_\alpha))$ (see Eq. (2.5)) coincide with the kernels $J^n$ of π, and we get for all $n \in \mathbb{Z}_+$:
\[ \tilde\Omega^n(\mathcal{A}_\alpha) = \Omega^n(\mathcal{A}_\alpha), \qquad \tilde\Omega^n_D(\mathcal{A}_\alpha) = \Omega^n_D(\mathcal{A}_\alpha). \tag{4.27} \]
Note that the equality $K^n = J^n$ holds in all explicit examples of non-commutative N = 1 spaces studied so far. It is easy to see that the canonical representatives $\omega^\perp$ on $\mathcal{H}$ of differential forms $[\omega] \in \Omega^n_D(\mathcal{A}_\alpha)$, see Eq. (2.10), coincide with the choices already made in Eqs. (4.22-25). The canonical Hermitian structure on $\Omega^1_D(\mathcal{A}_\alpha)$ is simply given by
\[ \langle \omega, \eta \rangle_D = \omega_\mu\, g^{\mu\nu}\, \eta_\nu^* \in \mathcal{A}_\alpha \tag{4.28} \]
for all $\omega, \eta \in \Omega^1_D(\mathcal{A}_\alpha)$. Note that this is a true Hermitian metric, i.e., it takes values in $\mathcal{A}_\alpha$ and not in the weak closure $\mathcal{A}''_\alpha$. Again, this is also the typical situation in other examples.

4.2.3. Connections on $\Omega^1_D(\mathcal{A}_\alpha)$, and cohomology. Since $\Omega^1_D(\mathcal{A}_\alpha)$ is a free left $\mathcal{A}_\alpha$-module, it admits a basis which we can choose to be $E^\mu := \gamma^\mu$. A connection ∇ on $\Omega^1_D(\mathcal{A}_\alpha)$ is uniquely specified by its coefficients $\Gamma^\lambda_{\mu\nu} \in \mathcal{A}_\alpha$,
\[ \nabla E^\mu = -\,\Gamma^\mu_{\nu\lambda}\, E^\nu \otimes E^\lambda \in \Omega^1_D(\mathcal{A}_\alpha) \otimes_{\mathcal{A}_\alpha} \Omega^1_D(\mathcal{A}_\alpha), \tag{4.29} \]
and these coefficients can be chosen arbitrarily. Note that in the classical case (α = 0) the basis $E^\mu$ consists of real 1-forms. Accordingly, we say that the connection ∇ is real if its coefficients in the basis $E^\mu$ are self-adjoint elements of $\mathcal{A}_\alpha$. A simple computation shows that there is a unique real, unitary, torsionless connection $\nabla^{L.C.}$ on $\Omega^1_D(\mathcal{A}_\alpha)$ given by
\[ \nabla^{L.C.} E^\mu = 0. \tag{4.30} \]
In the remainder of this subsection, we determine the de Rham complex and its cohomology. Let U and V be the elements of $\mathcal{A}_\alpha$ defined in Eq. (4.10), then it is easy to verify that the elements $E^\mu$ of $\Omega^1_D(\mathcal{A}_\alpha)$ given by
\[ E^1 = U^* \delta U, \qquad E^2 = V^* \delta V, \tag{4.31} \]
form a basis of $\Omega^1_D(\mathcal{A}_\alpha)$ and that they are closed,
\[ \delta E^1 = \delta E^2 = 0. \tag{4.32} \]
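A word of caution is in order here: Eq. (4.32) does not mean that $\delta E^\mu$ is zero as an element of $\Omega^2(\mathcal{A}_\alpha)$, but that $\delta E^\mu \in \delta J^1$ which is zero in the quotient space $\Omega^2_D(\mathcal{A}_\alpha)$. As a basis of $\Omega^2_D(\mathcal{A}_\alpha)$ we choose
\[ F = \frac{1}{2}\, \varepsilon_{\mu\nu}\, \gamma^\mu \gamma^\nu \tag{4.33} \]
and we get for the product of basis 1-forms,
\[ E^\mu E^\nu = \varepsilon^{\mu\nu} F. \tag{4.34} \]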
This completely specifies the de Rham complex, and we can now compute the cohomology groups $H^p$. For $a \in \mathcal{A}_\alpha$, we have the equivalences
\[ [D, a] = 0 \iff i\, p_\mu \gamma^\mu\, a(p) = 0 \;\; \forall p \iff a(p) = \delta_{p,0}\, \tilde a \tag{4.35} \]
for some $\tilde a \in \mathbb{C}$. This shows that $H^0 \simeq \mathbb{C}$. Let $a_\mu E^\mu$ be a 1-form, then we obtain with Eqs. (4.32) and (4.34) that
\[ \delta(a_\mu E^\mu) = 0 \iff i\, \hat p_\mu a_\nu\, \varepsilon^{\mu\nu} F = 0, \tag{4.36} \]
where $\hat p_\mu$ denotes the multiplication operator by $p_\mu$, i.e., $(\hat p_\mu a)(q) = q_\mu a(q)$. Suppose that the 1-form $a_\mu E^\mu$ is closed and satisfies $a_\mu(0) = 0$, and define the algebra element b by $b = (2 i\, \hat p_\mu)^{-1} a_\mu$. Using Eq. (4.36), we see that
\[ \delta b = a_\mu E^\mu. \tag{4.37} \]
This proves that any closed 1-form is cohomologous to a “constant” 1-form $c_\mu E^\mu$ with $c_\mu \in \mathbb{C}$. On the other hand, a non-vanishing constant 1-form $c_\mu E^\mu$ cannot be exact since the equation
\[ (\delta a)(p) = i\, p_\mu E^\mu\, a(p) = \delta_{p,0}\, c_\mu E^\mu \tag{4.38} \]
has no solution. Thus, we have $H^1 \simeq \mathbb{C}^2$. The same argument shows that a constant 2-form cF, with $c \in \mathbb{C}$, is not exact. If a 2-form aF satisfies a(0) = 0, then it is the coboundary of the 1-form $i\, \varepsilon_{\mu\nu}\, \hat p_\nu^{-1}\, a\, E^\mu$, and this proves the following

Proposition 4.3. In the basis $\{\, 1,\; E^\mu = \gamma^\mu,\; F = \frac{1}{2} \varepsilon_{\mu\nu} \gamma^\mu \gamma^\nu \,\}$ of $\Omega^\bullet_D(\mathcal{A}_\alpha)$, the de Rham differential algebra is specified by the following relations:
\[ E^\mu E^\nu = \varepsilon^{\mu\nu} F, \qquad \delta E^\mu = \delta F = 0, \qquad \delta a = i\,(\hat p_\mu a)\, E^\mu \quad \forall a \in \mathcal{A}_\alpha. \]
Furthermore, the cohomology of the de Rham complex is given by
\[ H^0 \simeq H^2 \simeq \mathbb{C}, \qquad H^1 \simeq \mathbb{C}^2. \tag{4.39} \]
This completes our study of the N = 1 data describing the non-commutative 2-torus at irrational deformation parameter.

4.3. Riemannian geometry (N = (1, 1)). At the end of our discussion of the non-commutative 3-sphere, in Subsect. 3.4, we have briefly outlined a description in terms of “Riemannian” N = (1, 1) data, with the two generalized Dirac operators borrowed from conformal field theory, see [FG]. In the following, we will treat the non-commutative torus (at irrational deformation parameter) as a Riemannian space. Here we can, moreover, construct a set of N = (1, 1) data from the Connes spectral triple along the general lines of Subsect. 2.2.5. Our first task is to find a real structure J on the N = 1 data $(\mathcal{A}_\alpha, \mathcal{H}, D, \sigma)$. To this end, we introduce the complex conjugation $\kappa : \mathcal{H} \to \mathcal{H}$, $(\kappa x)(p) := \bar x(p) := \overline{x(p)}$, as well as the charge conjugation matrix $C : \mathcal{H} \to \mathcal{H}$ as the unique (up to a sign) constant matrix such that
\[ C\, \gamma^\mu = -\,\bar\gamma^\mu\, C, \tag{4.40} \]
\[ C = C^* = C^{-1}. \tag{4.41} \]
Then it is easy to verify that J = Cκ is a real structure. The right actions of $\mathcal{A}_\alpha$ and $\Omega^1_D(\mathcal{A}_\alpha)$ on $\mathcal{H}$ (see Subsect. 2.2.5) are given as follows
\[ \xi \bullet a \equiv J a^* J^*\, \xi = \xi \bullet_\alpha a^\vee, \tag{4.42} \]
\[ \xi \bullet \omega \equiv J \omega^* J^*\, \xi = \gamma^\mu\, \xi \bullet_\alpha \omega_\mu^\vee, \tag{4.43} \]
where $\xi \in \mathcal{H}$, $a \in \mathcal{A}_\alpha$, $\omega \in \Omega^1_D(\mathcal{A}_\alpha)$, $\xi \bullet_\alpha a$ denotes the diagonal right action of a on ξ by the deformed product, and $a^\vee(p) := a(-p)$.
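Notice that $(a \bullet_\alpha b)^\vee = a^\vee \bullet_\alpha b^\vee$. We denote by $\mathring{\mathcal{H}}$ the dense subspace $\mathcal{S}(\mathbb{Z}^2) \oplus \mathcal{S}(\mathbb{Z}^2) \subset \mathcal{H}$ of smooth spinors. The space $\mathring{\mathcal{H}}$ is a two-dimensional free left $\mathcal{A}_\alpha$-module with canonical basis $\{e_1, e_2\}$. Then, any connection ∇ on $\mathring{\mathcal{H}}$ is uniquely determined by its coefficients $\omega_j{}^i \in \Omega^1_D(\mathcal{A}_\alpha)$:
\[ \nabla e_i = \omega_i{}^j \otimes e_j = \omega_{\mu\, i}{}^j\, \gamma^\mu \otimes e_j \in \Omega^1_D(\mathcal{A}_\alpha) \otimes_{\mathcal{A}_\alpha} \mathring{\mathcal{H}}. \tag{4.44} \]
The “associated right connection” $\bar\nabla$ is then given by
\[ \bar\nabla e_i = e_j \otimes \bar\omega_i{}^j \in \mathring{\mathcal{H}} \otimes_{\mathcal{A}_\alpha} \Omega^1_D(\mathcal{A}_\alpha), \tag{4.45} \]
where
\[ \bar\omega_j{}^i = -\,C_{ki}\, (\omega_l{}^k)^*\, C_{jl} = C_{ki}\, (\omega_{\mu\, l}{}^k)^*\, C_{jl}\, \gamma^\mu. \tag{4.46} \]
An arbitrary element in $\mathring{\mathcal{H}} \otimes_{\mathcal{A}_\alpha} \mathring{\mathcal{H}}$ can be written as $e_i \otimes a^{ij} e_j$, where $a^{ij} \in \mathcal{A}_\alpha$. As in Subsect. 2.2.5, the “Dirac operators” $D$ and $\bar D$ on $\mathring{\mathcal{H}} \otimes_{\mathcal{A}_\alpha} \mathring{\mathcal{H}}$ associated to the connection ∇ are given by
\[ D\, \bigl(e_i \otimes a^{ij} e_j\bigr) = e_i \otimes \bigl(\delta a^{ij} + \bar\omega_k{}^i\, a^{kj} + a^{ik}\, \omega_k{}^j\bigr) \bullet e_j, \tag{4.47} \]
\[ \bar D\, \bigl(e_i \otimes a^{ij} e_j\bigr) = e_i \bullet \bigl(\delta a^{ij} + \bar\omega_k{}^i\, a^{kj} + a^{ik}\, \omega_k{}^j\bigr) \otimes \sigma e_j. \tag{4.48} \]
In order to be able to define a scalar product on $\mathring{\mathcal{H}} \otimes_{\mathcal{A}_\alpha} \mathring{\mathcal{H}}$, we need a Hermitian structure on the right module $\mathring{\mathcal{H}}$, denoted by $\langle \cdot, \cdot \rangle$, with values in $\mathcal{A}_\alpha$. It is defined by
\[ {-\!\!\!\!\!\!\int} \langle \xi, \zeta \rangle\, a = (\xi, \zeta\, a) \qquad \forall\, \xi, \zeta \in \mathring{\mathcal{H}}, \;\; \forall\, a \in \mathcal{A}_\alpha. \tag{4.49} \]
This Hermitian structure can be written explicitly as
\[ \langle \xi, \zeta \rangle = \sum_{i=1}^{2} \bar\xi^{\,i} \bullet_\alpha \zeta^{\,i\,\vee}, \tag{4.50} \]
and it satisfies
\[ \langle \xi\, a, \zeta\, b \rangle = a^* \langle \xi, \zeta \rangle\, b \tag{4.51} \]
for all $\xi, \zeta \in \mathring{\mathcal{H}}$ and $a, b \in \mathcal{A}_\alpha$. Then we define the scalar product on $\mathring{\mathcal{H}} \otimes_{\mathcal{A}_\alpha} \mathring{\mathcal{H}}$ as (see [Co1])
\[ (\xi_1 \otimes \xi_2,\; \zeta_1 \otimes \zeta_2) = \bigl(\xi_2,\; \langle \xi_1, \zeta_1 \rangle\, \zeta_2\bigr). \tag{4.52} \]
This expression can be written in a more suggestive way if one introduces a Hermitian structure, denoted $\langle \cdot, \cdot \rangle_L$, on the left module $\mathring{\mathcal{H}}$:
\[ \langle \xi, \zeta \rangle_L := \langle J \xi, J \zeta \rangle. \]
This Hermitian structure satisfies
\[ \langle a\, \xi, b\, \zeta \rangle_L = a\, \langle \xi, \zeta \rangle_L\, b^* \]
for all $a, b \in \mathcal{A}_\alpha$ and $\xi, \zeta \in \mathring{\mathcal{H}}$, and the scalar product on $\mathring{\mathcal{H}} \otimes_{\mathcal{A}_\alpha} \mathring{\mathcal{H}}$ can be written as follows
\[ (\xi_1 \otimes \xi_2,\; \zeta_1 \otimes \zeta_2) = {-\!\!\!\!\!\!\int} \langle \xi_1, \zeta_1 \rangle\, \langle \zeta_2, \xi_2 \rangle_L. \]
A tedious computation shows that the relations
\[ D^* = D, \qquad \bar D^* = \bar D, \qquad \{D, \bar D\} = 0, \qquad D^2 = \bar D^2 \tag{4.53} \]
are equivalent to
\[ \tilde\nabla\, e_i \otimes e_j = 0 \qquad \forall\, i, j. \tag{4.54} \]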
In particular, we see that the original N = 1 data uniquely determine the operators $D$ and $\bar D$ satisfying the N = (1, 1) algebra (cf. Definition 2.20)
\[ D^2 = \bar D^2, \qquad \{D, \bar D\} = 0. \]
One can prove that there are unique $\mathbb{Z}_2$-grading operators
\[ \gamma = 1 \otimes \sigma, \qquad \bar\gamma = \sigma \otimes 1 \tag{4.55} \]
commuting with $\mathcal{A}_\alpha$ and such that
\[ \{D, \gamma\} = \{\bar D, \bar\gamma\} = 0, \qquad [D, \bar\gamma] = [\bar D, \gamma] = 0. \]
The combined $\mathbb{Z}_2$-grading
\[ \Gamma = \gamma \bar\gamma \]
together with the Hodge operator
\[ * = \gamma \]
complete our data to a set of N = (1, 1) data $(\mathcal{A}_\alpha,\; \mathcal{H} \otimes_{\mathcal{A}_\alpha} \mathcal{H},\; D,\; \bar D,\; \Gamma,\; *)$. Furthermore, these data admit a unique $\mathbb{Z}$-grading
\[ T = \frac{1}{2i}\, g_{\mu\nu}\, \gamma^\mu \otimes \gamma^\nu \sigma \]
commuting with $\mathcal{A}_\alpha$, whose mod 2 reduction equals Γ, and such that [T, d] = d.

4.4. Kähler geometry (N = (2, 2)). The classical torus can be regarded as a complex Kähler manifold, and thus it is natural to ask whether we can extend the N = (1, 1) spectral data to N = (2, 2) data in the non-commutative case, too. The simplest way to determine such an extension is to look for an anti-selfadjoint operator I commuting with $\mathcal{A}_\alpha$, Γ, ∗, and T, and then to define a new differential by
\[ d_I = [\, I, d\, ]. \tag{4.56} \]
The nilpotency of $d_I$ implies further constraints on the operator I. The idea behind this construction is to identify I with $i(T - \bar T)$, where $T$ and $\bar T$ are as in Definition 2.26. In the classical setting, this operator has clearly the above properties. The most general operator I on $\mathcal{H} \otimes_{\mathcal{A}_\alpha} \mathcal{H}$ that commutes with all elements of $\mathcal{A}_\alpha$ is of the form
\[ I = \sum_{\mu,\nu=0}^{3} \gamma^\mu \otimes \gamma^\nu\, I^R_{\mu\nu}, \tag{4.57} \]
where $I^R_{\mu\nu}$ are elements of $\mathcal{A}_\alpha$ acting on $\mathcal{H} \otimes_{\mathcal{A}_\alpha} \mathcal{H}$ from the right, and where we have set
\[ \gamma^0 = 1, \qquad \gamma^3 = \sigma. \tag{4.58} \]
The vanishing of the commutators of I with Γ and ∗ implies that $I^R_{\mu\nu} = 0$ unless $\mu, \nu \in \{0, 3\}$. The equation [I, T] = 0 requires $I^R_{03} = I^R_{30}$ and leaves the coefficients $I^R_{00}$ and $I^R_{33}$ undetermined. Since the operator I appears only through commutators, its trace part is irrelevant and we can set $I^R_{00} = 0$. All constraints together give
\[ I = (\sigma \otimes 1 + 1 \otimes \sigma)\, I^R_{03} + (\sigma \otimes \sigma)\, I^R_{33}, \tag{4.59} \]
where $I^R_{03}$ and $I^R_{33}$ are anti-selfadjoint elements of $\mathcal{A}_\alpha$. We decompose I into two parts
\[ I_1 = (\sigma \otimes 1 + 1 \otimes \sigma)\, I^R_{03}, \tag{4.60} \]
\[ I_2 = (\sigma \otimes \sigma)\, I^R_{33}, \tag{4.61} \]
and we introduce the new differentials according to Eq. (4.56),
\[ d_1 = d = \frac{1}{2}\,(D - i \bar D), \tag{4.62} \]
\[ d_2 = [\, I_1, d\, ], \tag{4.63} \]
\[ d_3 = [\, I_2, d\, ]. \tag{4.64} \]
The nilpotency of $d_2$ and $d_3$ implies that $I_{03}$ and $I_{33}$ are multiples of the identity, and we normalize them as follows:
\[ I_1 = \frac{i}{2}\,(\sigma \otimes 1 + 1 \otimes \sigma), \tag{4.65} \]
\[ I_2 = i\,(\sigma \otimes \sigma). \tag{4.66} \]
Comparing Eqs. (4.66) and (4.55), we see that
\[ I_2 = i\, \gamma \bar\gamma \tag{4.67} \]
and it follows, using Eqs. (4.62) and (4.64), that
\[ d_3 = [\, I_2, d\, ] = 2 i\, d\, \gamma \bar\gamma. \tag{4.68} \]
Thus, the differential $d_3$ is a trivial modification of d, and we discard it. It is then easy to verify that $(\mathcal{A}_\alpha,\; \mathcal{H} \otimes_{\mathcal{A}_\alpha} \mathcal{H},\; d_1,\; d_2,\; \Gamma,\; *,\; T)$ form a set of N = (2, 2) spectral data together with a $\mathbb{Z}$-grading. Furthermore, they are, as we have shown, canonically determined by the original N = (1, 1) data. Therefore, a Riemannian non-commutative torus (at irrational deformation parameter α) admits a canonical Kähler structure. Notice that if we choose the metric $g^{\mu\nu} = \delta^{\mu\nu}$ in Eq. (4.4), then $\partial = -\frac{1}{2}(d_1 + i d_2)$ coincides with the holomorphic differential obtained in [Co1] from cyclic cohomology and using the equivalence of conformal and complex structures in two dimensions. We have only given the definitions of the spectral data in the N = (1, 1) and the N = (2, 2) setting. As a straightforward application of the general methods described in Sect. 2, we could compute the associated de Rham resp. Dolbeault complexes, or geometrical quantities like curvature. We do not carry out these calculations. Instead, let us emphasize the following feature: In Sect. 3, we already saw that the topology of “the” non-commutative 3-sphere depends on the spectral data other than the algebra. Now, we learn once again that, for rational deformation parameter $\alpha = \frac{M}{N}$, the algebra $\mathcal{A}_\alpha$ does not specify the geometry of the underlying non-commutative space. It is only the selection of a specific K-cycle $(\mathcal{H}, D)$ that allows us to identify this space as a deformed torus. By choosing different K-cycles $(\mathcal{H}, D)$ for the same algebra $\mathcal{A} = M_N(\mathbb{C})$ (with $N = \sum_{j=1}^{l} j^2$) we are able to describe either a fuzzy three-sphere, as discussed in Sect. 3, or a non-commutative torus. In other words, choosing different spectral data, but keeping the algebra $\mathcal{A}$ fixed, may lead to different non-commutative geometries. Yet, it is plausible that the sequence $\mathcal{A}_N := M_N(\mathbb{C})$, N = 1, 2, 3, . . . , of algebras may be associated uniquely with non-commutative tori, while the sequence $\mathcal{A}_N := M_N(\mathbb{C})$, $N = \sum_{j=1}^{l} j^2$, l = 1, 2, 3, . . . , may be associated uniquely with fuzzy three-spheres.
5. Directions for Future Work

In this work and in part I, we have presented an approach to (non-commutative) geometry rooted in supersymmetric quantum theory. We have classified the various types of classical and of non-commutative geometries according to the symmetries, or to the “supersymmetry content”, of their associated spectral data. Obviously, many natural and important questions remain to be studied. In this concluding section, we describe a few of these open problems and sketch, once more, some of the physical motivations underlying our work.

(1) An obvious question is whether one can give a complete classification of the possible types of spectral data in terms of graded Lie algebras (and, perhaps, q-deformed graded Lie algebras). As an example, we recall the structure of $N = 4^+$ spectral data, describing an extension of Kähler geometry (see Sects. 1.2 and 3 of part I). The spectral data involve the operators $d$, $d^*$, $\tilde d$, $\tilde d^*$, $L^3$, $L^+$, $L^-$, $J_0$ and $\Delta$, which close under taking (anti-)commutators: They generate a graded Lie algebra defined by
\[ \begin{aligned} &[L^3, L^\pm] = \pm 2 L^\pm, \quad [L^+, L^-] = L^3, \quad [J_0, L^3] = [J_0, L^+] = 0, \\ &[L^3, d] = d, \quad [L^+, d] = 0, \quad [L^-, d] = \tilde d^*, \quad [J_0, d] = -i\, \tilde d, \\ &[L^3, \tilde d] = \tilde d, \quad [L^+, \tilde d] = 0, \quad [L^-, \tilde d] = -d^*, \quad [J_0, \tilde d] = i\, d, \\ &\{d, d\} = \{\tilde d, \tilde d\} = \{d, \tilde d\} = \{d, \tilde d^*\} = 0, \quad \{d, d^*\} = \{\tilde d, \tilde d^*\} = \Delta, \end{aligned} \]
where $\Delta$, the Laplacian, is a central element. The remaining (anti-)commutation relations follow by taking adjoints, with the rules that $\Delta$, $J_0$ and $L^3$ are self-adjoint, and $(L^-)^* = L^+$. It would be interesting to determine all graded Lie algebras (and their representations) occurring in spectral data of a (non-commutative) space. In the case of classical geometry, we have given a classification up to N = (4, 4) spectral data, and there appears to be enough information in the literature to settle the problem completely; see [Bes, HKLR, Joy].
In the non-commutative setting, however, further algebraic structures might occur, including q-deformations of graded Lie algebras. To give a list of all graded Lie algebras that are, in principle, admissible, appears possible; see [FGR2] for additional discussion. However, in view of the classical case, where we only found the groups U(1), SU(2), Sp(4) and direct products thereof (see part I, Sect. 3) we expect that not all Lie group symmetries that may arise in principle are actually realized in (non-commutative) geometry. Determining the graded Lie algebras that actually occur in the spectral data of geometric spaces is clearly just the first step towards a classification of non-commutative spaces. A more difficult problem will be to characterize the class of all ∗-algebras $\mathcal{A}$ that admit a given type of spectral data, i.e. the class of algebras that possess a K-cycle $(\mathcal{H}, d_i)$ with a collection of differentials $d_i$ generating a certain graded Lie algebra such that the ordinary Lie group generators $X_j$ contained in the graded Lie algebra commute with the elements of $\mathcal{A}$.

(2) Given some non-commutative geometry defined in terms of spectral data, it is natural to investigate its symmetries, i.e. to introduce a notion of diffeomorphisms. For definiteness, we start from a set of data $(\mathcal{A}, \mathcal{H}, d, d^*, T, *)$ with an N = 2 structure, cf. Sect. 2.2.6. To study notions of diffeomorphisms, it is useful to introduce an algebra $\Phi^\bullet_d(\mathcal{A})$ defined as the smallest ∗-algebra of (unbounded) operators containing $B := \pi(\Omega^\bullet(\mathcal{A})) \vee \pi(\Omega^\bullet(\mathcal{A}))^*$ and arbitrary graded commutators of $d$ and $d^*$ with elements of B. Due to the existence of the $\mathbb{Z}$-grading T, $\Phi^\bullet_d(\mathcal{A})$ decomposes into a direct sum
\[ \Phi^\bullet_d(\mathcal{A}) := \bigoplus_{n \in \mathbb{Z}} \Phi^n_d(\mathcal{A}), \qquad \Phi^n_d(\mathcal{A}) := \bigl\{\, \phi \in \Phi^\bullet_d(\mathcal{A}) \;\big|\; [\, T, \phi\, ]_g = n\, \phi \,\bigr\}. \]
Note that both positive and negative degrees occur. Thus, $\Phi^\bullet_d(\mathcal{A})$ is a graded ∗-algebra. This algebra is quite a natural object to introduce when dealing with N = 2 spectral data, as the algebra $\Omega^\bullet_d(\mathcal{A})$ of differential forms does not have a ∗-representation on $\mathcal{H}$, because d is not self-adjoint. Ignoring operator domain problems arising because the (anti-)commutator of d with the adjoint of a differential form is unbounded, in general, we observe that $\Phi^\bullet_d(\mathcal{A})$ has the interesting property that it forms a complex with respect to the action of d by graded commutation, and, in view of examples from quantum field theory, we call it the field complex in the following. For N = (2, 2) non-commutative Kähler data with holomorphic and anti-holomorphic gradings $T$ and $\bar T$, see Definition 2.26, one may introduce a bi-graded complex $\Phi^{\bullet,\bullet}_{\partial,\bar\partial}(\mathcal{A})$ in a similar way. A slight generalization of such bi-graded field complexes containing operators φ of degree (n, m) with n and m real, but $n + m \in \mathbb{Z}$, naturally occurs in N = (2, 2) superconformal field theory, see e.g. [FG, FGR2] and references given there. Next, we show how the field complex appears when we attempt to introduce a notion of diffeomorphisms of a (non-commutative) geometric space described in terms of N = 2 spectral data: One possible generalization of the notion of diffeomorphisms to non-commutative geometry is to identify them with ∗-automorphisms of the algebra $\mathcal{A}$ of “smooth functions”. It may be advantageous, though, to follow concepts from classical geometry more closely: An infinitesimal diffeomorphism is then given by a derivation $\delta(\cdot) := [\, L, \cdot\, ]$ of $\mathcal{A}$, where L is an element of $\Phi^0_d$ such that δ commutes with d, i.e. [d, L] = 0. The derivation δ can then be extended to all of $\pi(\Omega^\bullet_d(\mathcal{A}))$, and δ preserves the degree of differential forms iff L commutes with T, i.e. iff $L \in \Phi^0_d$.
For a classical manifold M, it turns out that each L with the above properties can be written as $L = \{d, X\}$ for some vector field $X \in \Phi^{-1}_d$, i.e. L is the Lie derivative in the direction of this vector field. In the non-commutative situation, however, it might happen that the cohomology of the field complex at the zeroth position is non-trivial. In this case, the study of diffeomorphisms of the non-commutative space necessitates studying the cohomology of the field complex $\Phi^\bullet_d(\mathcal{A})$ in degree zero. As in classical differential geometry, it is interesting to investigate special diffeomorphisms, i.e. ones that preserve additional structure in the spectral data. As an example, consider derivations $\delta(\cdot) = [\, L, \cdot\, ]$ such that L commutes with $d$ and $d^*$: They generate isometries of the non-commutative space. For complex spectral data, we may consider derivations commuting not only with d but also with ∂: They generate one-parameter groups of holomorphic diffeomorphisms. In the example of symplectic spectral data, we are interested in diffeomorphisms preserving the symplectic forms, i.e., in symplectomorphisms. One-parameter groups of symplectomorphisms are generated by derivations commuting with $d$ and $\tilde d^*$.
(3) Another important topic in non-commutative geometry is deformation theory. Given spectral data specified in terms of generators $\{X_j, d, d_\alpha, \Delta\}$ of a graded Lie algebra as in remark (1), we may study one-parameter families $\{X_j^{(t)}, d, d_\alpha^{(t)}, \Delta^{(t)}\}_{t \in \mathbb{R}}$ of deformations. Here, we choose to keep one generator, d, fixed, and we require that the graded Lie algebras are isomorphic to one another for all t. This means that we study deformations of the (non-commutative) complex or symplectic structure of a given space $\mathcal{A}$ while preserving the differential and the de Rham complex. Only those deformations of spectral data are of interest which cannot be obtained from the original ones by ∗-automorphisms of the algebra $\mathcal{A}$ commuting with d (i.e. by “diffeomorphisms”). In classical geometry, the deformation theory of complex structures is well-developed (Kodaira-Spencer theory), and there are non-trivial results in the deformation theory of symplectic structures (e.g. Moser’s theorem); but this last topic is still a subject of active research. Next, we consider deformations $d'$ of the differential d of a given set of N = 2 spectral data $(\mathcal{A}, \mathcal{H}, d, d^*, T, *)$ which are of the form $d' := d + \omega$, for some operator $\omega \in \Phi^\bullet_d(\mathcal{A})$ of odd degree. We require that $d'$ again squares to zero, which implies that ω has to satisfy a zero curvature condition
\[ \omega^2 + \{d, \omega\} = 0. \tag{5.1} \]
We distinguish between several possibilities: First, we require that the deformed data still carry an N = 2 structure with the same $\mathbb{Z}$-grading T as before. Then ω must be an element of $\Phi_d^{1}(A)$ satisfying (5.1), and we can identify it with the connection 1-form of a flat connection on some vector bundle; for an example, see the discussion of the structure of classical N = (1, 1) Dirac bundles in Sect. 2.2.3 of part I. More generally, we only require the deformed data to be of N = (1, 1) type, with a $\mathbb{Z}_2$-grading γ given by the mod 2 reduction of T. As a simple example, consider an operator ω in $\Phi_d^{\bullet}(A)$ of degree 2n + 1, with $n \neq 0$. Then condition (5.1) implies that $\omega^2 = 0$ and $\{ d, \omega \} = 0$. If $\omega = [\, d, \beta \,]$ and $[\, \beta, \omega \,] = 0$ then $d' = e^{-\beta}\, d\, e^{\beta}$. We then say that d and $d'$ are equivalent. If ω represents a non-trivial cohomology class of the field complex $\Phi_d^{\bullet}(A)$ then d and $d'$ are inequivalent.

(4) In the introduction to paper I and in [FGR2] we have remarked that, from the point of view of physics, it is quite unnatural to attribute special importance to the algebra of functions over configuration space. The natural algebra in Hamiltonian mechanics is the algebra of functions over phase space, and, in quantum mechanics, it is a non-commutative deformation thereof, denoted $F_\hbar$ (where $\hbar$ is Planck’s constant), which is the natural algebra to study. In examples where phase space is given as the cotangent bundle $T^*M$ of a smooth manifold M, the configuration space, one may ask whether there are natural mathematical relations between spectral data involving the algebra $A = C^\infty(M)$ and ones involving the algebra $F_\hbar$. For example, it may be possible to represent A and $F_\hbar$ on the same Hilbert space H and consider spectral data (A, H, d, T, ∗) and ($F_\hbar$, H, d, T, ∗) with the same choice of operators d, T and ∗ on H. It is well known that from (A, H, d, T, ∗) configuration space M can be reconstructed (Gelfand’s theorem
and extensions thereof). This leads to the natural question whether M can also be reconstructed from ($F_\hbar$, H, d, T, ∗), or whether at least some of the topological properties of M, e.g. its Betti numbers, can be determined from these data. It is known that, in string theory, spectral data generalizing ($F_\hbar$, H, d, T, ∗) do not determine configuration space uniquely; this is related to the subject of stringy dualities and symmetries, more precisely to T-dualities, see e.g. [GPR] and also [KS, FG]. The distinction between “algebras of functions on configuration space” A and “algebras of functions on phase space” F remains meaningful in many examples of non-commutative spaces. Typically, F arises as a crossed product of A by some group G of “diffeomorphisms”. Under what conditions properties of the algebra A can be inferred from spectral data ($F_\hbar$, H, d, T, ∗) without knowing explicitly how the group G acts on F represents a problem of considerable interest in quantum theory. For another perspective concerning the distinction between “algebras of functions on configuration space” and “algebras of functions on phase space” see Sect. 2.2.6. It is worth emphasizing that in quantum field theory and string theory, where M is an infinite-dimensional space, the analogue of the “algebra of functions on M”, i.e. of A, does not exist, while the analogue of the “algebra of functions on phase space $T^*M$”, i.e. of F, still makes sense. For additional discussion of these matters see also [FGR2].

(5) A topic in the theory of complex manifolds that has attracted a lot of interest recently is mirror symmetry. For a definition of mirrors of classical Calabi-Yau manifolds, see e.g. [Y] and references given there, and cf. the remarks at the end of Sect. I 2.4.3. It is natural to ask whether one can define mirrors of non-commutative spaces, and whether some classical manifolds may have non-commutative mirrors.
Superconformal field theory with N = (2, 2) supersymmetry suggests how one might define a mirror map in the context of non-commutative geometry (see [FG, FGR2]): Assume that two sets of N = (2, 2) spectral data $(A_i, H, \partial_i, \bar\partial_i, T_i, \bar T_i, *_i)$, i = 1, 2, are given, where the algebras $A_i$ act on the Hilbert spaces $H_i$ which are subspaces of a single Hilbert space H on which the operators $\partial_i$, $\bar\partial_i$, $T_i$, $\bar T_i$ and $*_i$ are defined. We say that the space $A_2$ is the mirror of $A_1$ if $\partial_2 = \partial_1$, $\bar\partial_2 = \bar\partial_1^{\,*}$, $T_2 = T_1$, $\bar T_2 = -\bar T_1$, and if the dimensions $b_i^{p,q}$ of the cohomology of the Dolbeault complexes (2.45) satisfy $b_2^{p,q} = b_1^{n-p,q}$, where n is the top dimension of differential forms (recall that in Definition 2.26 we required T and $\bar T$ to be bounded operators).

Let A be a non-commutative Kähler space with mirror $\tilde A$. Within superconformal field theory, there is the following additional relation between the two algebras: Viewing A as the algebra of functions over a (non-commutative) target M, and analogously for $\tilde A$ and $\tilde M$, the phase spaces over the loop spaces over M and $\tilde M$ coincide.

(6) The success of the theory presented in this paper will ultimately be measured in terms of the applications it has to concrete problems of geometry and physics. In particular, one should try to apply the notions developed here to further examples of truly non-commutative spaces such as quantum groups, or the non-commutative complex projective spaces (see e.g. [Ber, Ho, Ma, GKP]), non-commutative Riemann surfaces [KL], and non-commutative symmetric spaces [BLU, BLR, GP, BBEW]. In most of these cases, it is natural to ask whether the “deformed” spaces carry a complex or Kähler structure in the sense of Sect. 2.3 above.
From our point of view, however, the most interesting examples for the general theory and the strongest motivation to study spectral data with supersymmetry come from string theory: The “ground states” of string theory are described by certain N = (2, 2) superconformal field theories. They provide the
spectral data of the loop space over a target which is a “quantization” of classical space – or rather of an internal compact manifold. It may happen that the conformal field theory is the quantization of a σ-model of maps from a parameter space into a classical target manifold. In general, the target space reconstructed from the spectral data of the conformal field theory then turns out to be a (non-commutative) deformation of the target space of the classical σ-model. The example of the superconformal SU(2) Wess–Zumino–Witten model, which is the quantization of a σ-model with target SU(2), has been studied in some detail in [FG, Gr, FGR2] and has motivated the results presented in Sect. 3. A more interesting class of examples would consist of N = (2, 2) superconformal field theories which are quantizations of σ-models whose target spaces are given by three-dimensional Calabi-Yau manifolds. But one may also apply the methods developed in this paper to superconformal field theories which, at the outset, are not quantizations of some classical σ-models. They may enable us to reconstruct (typically non-commutative) geometric spaces from the supersymmetric spectral data of such conformal field theories. This leads to the idea that, quite generally, superconformal field theories are (quantum) σ-models, but with target spaces that tend to be non-commutative spaces. An interesting family of examples of this kind consists of the Gepner models, which are expected to give rise to non-commutative deformations of certain Calabi-Yau three-folds. For further discussion of these ideas see also [FG, FGR2].

Acknowledgement. We thank A. Connes for useful criticism and suggestions that led to improvements of this paper. We thank O. Augenstein, A.H. Chamseddine, G. Felder and K. Gawędzki for previous collaborations that generated many of the ideas underlying our work. Useful discussions with A. Alekseev are gratefully acknowledged.
We thank the referee for pointing out some imprecisions in Sect. 2.2.5. The work of O.G. is supported in part by the Department of Energy under Grant DE-FG02-94ER-25228 and by the National Science Foundation under Grant DMS-94-24334, and by the Clay Mathematics Institute. The work of A.R. has been supported in part by the Swiss National Foundation.
References

[AG] Alvarez-Gaumé, L.: Supersymmetry and the Atiyah–Singer index theorem. Commun. Math. Phys. 90, 161–173 (1983)
[AGF] Alvarez-Gaumé, L., Freedman, D.Z.: Geometrical structure and ultraviolet finiteness in the supersymmetric σ-model. Commun. Math. Phys. 80, 443–451 (1981)
[BC] Beazley Cohen, P.: Structure complexe non commutative et superconnexions. Preprint MPI für Mathematik, Bonn, MPI/92-19
[Ber] Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 40, 153–174 (1975)
[Bes] Besse, A.L.: Einstein Manifolds. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[BLR] Borthwick, D., Lesniewski, A., Rinaldi, M.: Hermitian symmetric superspaces of type IV. J. Math. Phys. 34, 4817–4833 (1993)
[BLU] Borthwick, D., Lesniewski, A., Upmeier, H.: Non-perturbative quantization of Cartan domains. J. Funct. Anal. 113, 153–176 (1993)
[BBEW] Bordemann, M., Brischle, M., Emmrich, C., Waldmann, S.: Phase space reduction for star-products: An explicit construction for CP^n. Lett. Math. Phys. 36, 357–371 (1996)
[CF] Chamseddine, A.H., Fröhlich, J.: Some elements of Connes’ non-commutative geometry, and space-time geometry. In: Chen Ning Yang, a Great Physicist of the Twentieth Century, C.S. Liu and S.-T. Yau (eds.), Cambridge, MA: International Press 1995, pp. 10–34
[CFF] Chamseddine, A.H., Felder, G., Fröhlich, J.: Gravity in non-commutative geometry. Commun. Math. Phys. 155, 205–217 (1993)
[CFG] Chamseddine, A.H., Fröhlich, J., Grandjean, O.: The gravitational sector in the Connes-Lott formulation of the standard model. J. Math. Phys. 36, 6255–6275 (1995)
[Co1] Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
[Co2] Connes, A.: Noncommutative differential geometry. Inst. Hautes Études Sci. Publ. Math. 62, 257–360 (1985)
[Co3] Connes, A.: The action functional in noncommutative geometry. Commun. Math. Phys. 117, 673–683 (1988)
[Co4] Connes, A.: Reality and noncommutative geometry. J. Math. Phys. 36, 6194–6231 (1995)
[Co5] Connes, A.: C∗-algèbres et géométrie différentielle. C.R. Acad. Sci. Paris Sér. A-B 290, 599–604 (1980)
[CoK] Connes, A., Karoubi, M.: Caractère multiplicatif d’un module de Fredholm. K-Theory 2, 431–463 (1988)
[DFR] Doplicher, S., Fredenhagen, K., Roberts, J.E.: The quantum structure of space-time at the Planck scale and quantum fields. Commun. Math. Phys. 172, 187–220 (1995)
[FG] Fröhlich, J., Gawędzki, K.: Conformal field theory and the geometry of strings. CRM Proceedings and Lecture Notes, Vol. 7, 57–97 (1994)
[FGR1] Fröhlich, J., Grandjean, O., Recknagel, A.: Supersymmetric quantum theory and differential geometry. Commun. Math. Phys. 193, 527–594 (1998)
[FGR2] Fröhlich, J., Grandjean, O., Recknagel, A.: Supersymmetric quantum theory, non-commutative geometry, and gravitation. In: Quantum Symmetries, A. Connes, K. Gawędzki and J. Zinn-Justin (eds.), Les Houches, Session LXIV, 1995. Amsterdam, New York: Elsevier Science, 1998
[FGK] Felder, G., Gawędzki, K., Kupiainen, A.: Spectra of Wess–Zumino–Witten models with arbitrary simple groups. Commun. Math. Phys. 117, 127–158 (1988)
[FW] Friedan, D., Windey, P.: Supersymmetric derivation of the Atiyah–Singer index theorem and the chiral anomaly. Nucl. Phys. B235, 395–416 (1984)
[GKP] Grosse, H., Klimčík, C., Prešnajder, P.: Towards finite quantum field theory in non-commutative geometry. Int. J. Theor. Phys. 35, 231–244 (1996)
[GP] Grosse, H., Prešnajder, P.: The construction of non-commutative manifolds using coherent states. Lett. Math. Phys. 28, 239–250 (1993)
[GPR] Giveon, A., Porrati, M., Rabinovici, E.: Target space duality in string theory. Phys. Rep. 244, 77–202 (1994)
[Gr] Grandjean, O.: Non-commutative differential geometry. Ph.D. Thesis, ETH Zürich, July 1997
[GSW] Green, M.B., Schwarz, J.H., Witten, E.: Superstring Theory I, II. Cambridge: Cambridge University Press, 1987
[HKLR] Hitchin, N.J., Karlhede, A., Lindstrom, U., Rocek, M.: Hyperkähler metrics and supersymmetry. Commun. Math. Phys. 108, 535–589 (1987)
[Ho] Hoppe, J.: Quantum theory of a massless relativistic surface and a two-dimensional bound state problem. Ph.D. Thesis, MIT 1982; Quantum theory of a relativistic surface. In: Constraint’s theory and relativistic dynamics, G. Longhi, L. Lusanna (eds.), Proceedings Florence 1986, Singapore: World Scientific
[Ja1] Jaffe, A., Lesniewski, A., Osterwalder, K.: Quantum K-theory I: The Chern character. Commun. Math. Phys. 118, 1–14 (1988)
[Ja2] Jaffe, A., Lesniewski, A., Osterwalder, K.: On super-KMS functionals and entire cyclic cohomology. K-theory 2, 675–682 (1989)
[Ja3] Jaffe, A., Osterwalder, K.: Ward identities for non-commutative geometry. Commun. Math. Phys. 132, 119–130 (1990)
[Jac] Jacobson, N.: Basic Algebra II. San Francisco, CA: W.H. Freeman and Company, 1985
[Joy] Joyce, D.D.: Compact hypercomplex and quaternionic manifolds. J. Differ. Geom. 35, 743–762 (1992); Manifolds with many complex structures. Q. J. Math. Oxf. II. Ser. 46, 169–184 (1995)
[Kar] Karoubi, M.: Homologie cyclique et K-théorie. Société Mathématique de France, Astérisque 149 (1987)
[KL] Klimek, S., Lesniewski, A.: Quantum Riemann surfaces I. The unit disc. Commun. Math. Phys. 146, 103–122 (1992); Quantum Riemann surfaces II. The discrete series. Lett. Math. Phys. 24, 125–139 (1992)
[KS] Klimčík, C., Ševera, P.: Dual non-abelian duality and the Drinfeld double. Phys. Lett. B 351, 455–462 (1995)
[Ma] Madore, J.: The commutative limit of a matrix geometry. J. Math. Phys. 32, 332–335 (1991)
[Pol] Polchinski, J.: Dirichlet branes and Ramond–Ramond charges. Phys. Rev. Lett. 75, 4724–4727 (1995); TASI lectures on D-branes, hep-th/9611050
[PS] Pressley, A., Segal, G.: Loop groups. Oxford: Clarendon Press, 1986
[Ri] Rieffel, M.: Non-commutative tori – a case study of non-commutative differentiable manifolds. Contemp. Math. 105, 191–211 (1990)
[Sw] Swan, R.G.: Vector bundles and projective modules. Trans. Amer. Math. Soc. 105, 264–277 (1962)
[Wi1] Witten, E.: Constraints on supersymmetry breaking. Nucl. Phys. B202, 253–316 (1982)
[Wi2] Witten, E.: Supersymmetry and Morse theory. J. Diff. Geom. 17, 661–692 (1982)
[Wi3] Witten, E.: Non-abelian bosonization in two dimensions. Commun. Math. Phys. 92, 455–472 (1984)
[Wi4] Witten, E.: Bound states of strings and D-branes. Nucl. Phys. B460, 335–350 (1996)
[Y] Yau, S.T. (ed.): Essays on mirror symmetry. Cambridge, MA: International Press, 1992
Communicated by A. Connes
Commun. Math. Phys. 203, 185 – 210 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Vertex Operator Solutions to the Discrete KP-Hierarchy

M. Adler1,∗, P. van Moerbeke1,2,∗∗

1 Department of Mathematics, Brandeis University, Waltham, Mass 02454, USA. E-mail: [email protected]
2 Department of Mathematics, Université de Louvain, 1348 Louvain-la-Neuve, Belgium. E-mail: [email protected]; [email protected]

Received: 27 August 1998 / Accepted: 24 November 1998

Abstract: Vertex operators, which are disguised Darboux maps, transform solutions of the KP equation into new ones. In this paper, we show that the bi-infinite sequence obtained by Darboux transforming an arbitrary KP solution recursively forward and backwards, yields a solution to the discrete KP-hierarchy. The latter is a KP hierarchy where the continuous space x-variable gets replaced by a discrete n-variable. The fact that these sequences satisfy the discrete KP hierarchy is tantamount to certain bilinear relations connecting the consecutive KP solutions in the sequence. At the Grassmannian level, these relations are equivalent to a very simple fact, which is the nesting of the associated infinite-dimensional planes (flag). The discrete KP hierarchy can thus be viewed as a container for an entire ensemble of vertex or Darboux generated KP solutions. It turns out that many new and old systems lead to such discrete (semi-infinite) solutions, like sequences of soliton solutions, with more and more solitons, sequences of Calogero–Moser systems, having more and more particles, just to mention a few examples; this is developed in [3]. In this paper, as another example, we show that the q-KP hierarchy maps, via a kind of Fourier transform, into the discrete KP hierarchy, enabling us to write down a very large class of solutions to the q-KP hierarchy. This was also reported in a brief note [4].

Contents
0 Introduction
1 The KP τ-Functions, Grassmannians and a Residue Formula
2 The Existence of a τ-Vector and the Discrete KP Bilinear Identity
3 Sequences of τ-Functions, Flags and the Discrete KP Equation
4 Discrete KP-Solutions Generated by Vertex Operators
5 Example of Vertex Generated Solutions: The q-KP Equation

∗ The support of a National Science Foundation grant # DMS-9503246 is gratefully acknowledged.
∗∗ The support of a National Science Foundation grant # DMS-9503246, a Nato, a FNRS and a Francqui Foundation grant is gratefully acknowledged.
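0. Introduction

Given the shift operator $\Lambda = (\delta_{i,j-1})_{i,j\in\mathbb{Z}}$, consider the Lie algebra

$$\mathcal{D} = \Big\{ \sum_{-\infty < i\, \ldots} a_i \Lambda^i,\ a_i \text{ diagonal operators} \Big\} = \mathcal{D}_- + \mathcal{D}_+ \tag{0.1}$$

[…]

$$\ldots\ \Psi^*(t,z) = z\,\Psi^*(t,z), \qquad \frac{\partial \Psi}{\partial t_n} = (L^n)_+ \Psi, \qquad \frac{\partial \Psi^*}{\partial t_n} = -\big((L^n)_+\big)^{\top} \Psi^*. \tag{0.9}$$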
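Theorem 0.1. If L satisfies the discrete KP-hierarchy (0.3), then the wave vectors $\Psi(t,z)$ and $\Psi^*(t,z)$ can be expressed in terms of one sequence of τ-functions $\tau(n,t) := \tau_n(t_1, t_2, \ldots)$, $n \in \mathbb{Z}$, to wit:

$$\begin{aligned}
\Psi(t,z) &= e^{\sum_1^\infty t_i z^i}\, \psi(t,z) = \left( \frac{\tau_n(t-[z^{-1}])}{\tau_n(t)}\, e^{\sum_1^\infty t_i z^i} z^{n} \right)_{n\in\mathbb{Z}}, \\
\Psi^*(t,z) &= e^{-\sum_1^\infty t_i z^i}\, \psi^*(t,z) = \left( \frac{\tau_{n+1}(t+[z^{-1}])}{\tau_{n+1}(t)}\, e^{-\sum_1^\infty t_i z^i} z^{-n} \right)_{n\in\mathbb{Z}},
\end{aligned} \tag{0.10}$$

satisfying the bilinear identity

$$\oint_{z=\infty} \Psi_n(t,z)\, \Psi^*_m(t',z)\, \frac{dz}{2\pi i z} = 0 \tag{0.11}$$

for all n > m. It follows that

$$\Psi = W\chi(z) = e^{\sum_1^\infty t_i z^i} S\chi(z), \qquad \Psi^* = (W^{\top})^{-1}\chi^*(z) = e^{-\sum_1^\infty t_i z^i} (S^{-1})^{\top}\chi^*(z),$$

with^1

$$S = \sum_{n=0}^{\infty} \frac{p_n(-\tilde\partial)\tau(t)}{\tau(t)}\, \Lambda^{-n} \quad\text{and}\quad S^{-1} = \sum_{n=0}^{\infty} \Lambda^{-n}\, \tilde\Lambda\, \frac{p_n(\tilde\partial)\tau(t)}{\tau(t)}. \tag{0.12}$$

Then $L^k$ has the following expression in terms of τ-functions^2,

$$L^k = \sum_{\ell=0}^{\infty} \operatorname{diag}\left( \frac{p_\ell(\tilde\partial)\,\tau_{n+k-\ell+1} \circ \tau_n}{\tau_{n+k-\ell+1}\,\tau_n} \right)_{n\in\mathbb{Z}} \Lambda^{k-\ell} \tag{0.13}$$

with the $\tau_n$’s satisfying

$$\left( \frac{\partial}{\partial t_k} - \sum_{r=0}^{\ell-1} (\ell-r)\, p_r(-\tilde\partial)\, p_{k-r}(\tilde\partial) \right) \tau_n \circ \tau_{n-\ell} = 0, \quad \text{for } \ell, k = 1, 2, 3, \ldots$$

and

$$\left( \frac{1}{2}\frac{\partial^2}{\partial t_1 \partial t_k} - p_{k+1}(\tilde\partial) \right) \tau_n \circ \tau_n = 0, \quad \text{for } k = 1, 2, 3, \ldots. \tag{0.14}$$

1 In an expression like $S = \sum a^{(n)} \Lambda^{n}$, we have $a^{(n)} = \operatorname{diag}(a^{(n)}_k)_{k\in\mathbb{Z}}$; also introduce the notation $(\tilde\Lambda a)_k = a_{k+1}$, so that $\tilde\Lambda a$ is again a diagonal operator and $\tilde\Lambda a \neq \Lambda a$.

2 where the $p_\ell$ are elementary Schur polynomials and where $p_\ell(\tilde\partial) f \circ g$ refers to the usual Hirota operation, to be defined in Sect. 1.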
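The elementary Schur polynomials mentioned in footnote 2 are not defined in this passage. The following is a minimal sketch, assuming the standard convention $e^{\sum_{i\ge 1} t_i z^i} = \sum_{k\ge 0} p_k(t) z^k$ and the usual bracket notation $[\alpha] = (\alpha, \alpha^2/2, \alpha^3/3, \ldots)$; the function names `elementary_schur` and `bracket` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def elementary_schur(t, kmax):
    """Return [p_0, ..., p_kmax], where exp(sum_{i>=1} t_i z^i) = sum_k p_k(t) z^k.

    `t` is the list [t_1, t_2, ...]; entries beyond its length count as 0.
    Differentiating the generating function in z gives the recurrence
    k * p_k = sum_{i=1..k} i * t_i * p_{k-i}, which is what the loop applies.
    (Assumed standard convention; the definition is not restated in the text.)
    """
    ts = [Fraction(x) for x in t]
    p = [Fraction(1)]
    for k in range(1, kmax + 1):
        acc = Fraction(0)
        for i in range(1, k + 1):
            t_i = ts[i - 1] if i <= len(ts) else Fraction(0)
            acc += i * t_i * p[k - i]
        p.append(acc / k)
    return p


def bracket(alpha, length):
    """The shift vector [alpha] = (alpha, alpha^2/2, alpha^3/3, ...), truncated."""
    a = Fraction(alpha)
    return [a ** i / i for i in range(1, length + 1)]
```

With this convention $p_1 = t_1$, $p_2 = t_2 + t_1^2/2$ and $p_3 = t_3 + t_1 t_2 + t_1^3/6$, and $p_k([\alpha]) = \alpha^k$ because $e^{\sum (\alpha z)^i/i} = (1-\alpha z)^{-1}$; that is the expansion behind shifts of the form $\tau(t \mp [z^{-1}])$.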
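Remark. Equation (0.13) reads, upon using (0.14),

$$L^k = \Lambda^k + \left( \frac{\partial}{\partial t_1} \log \frac{\tau_{n+k}}{\tau_n} \right)_{n\in\mathbb{Z}} \Lambda^{k-1} + \ldots + \left( \frac{\partial}{\partial t_k} \log \frac{\tau_{n+1}}{\tau_n} \right)_{n\in\mathbb{Z}} \Lambda^{0} + \left( \frac{\partial^2}{\partial t_1 \partial t_k} \log \tau_n \right)_{n\in\mathbb{Z}} \Lambda^{-1} + \cdots. \tag{0.15}$$

With each component of the wave vector $\Psi$, or, what is the same, with each component of the τ-vector, we associate a sequence of infinite-dimensional planes in the Grassmannian $Gr^{(n)}$,

$$\begin{aligned}
W_n &= \operatorname{span}_{\mathbb{C}} \left\{ \left( \frac{\partial}{\partial t_1} \right)^{k} \Psi_n(t,z),\ k = 0, 1, 2, \ldots \right\} \\
&= e^{\sum_1^\infty t_i z^i} \operatorname{span}_{\mathbb{C}} \left\{ \left( \frac{\partial}{\partial t_1} + z \right)^{k} \psi_n(t,z),\ k = 0, 1, 2, \ldots \right\} \\
&=: e^{\sum_1^\infty t_i z^i}\, W_n^{t}.
\end{aligned} \tag{0.16}$$

Note that the plane $z^{-n} W_n \in Gr^{(0)}$ has so-called virtual genus zero, in the terminology of [13]; in particular, this plane contains an element of order $1 + O(z^{-1})$. Setting $\{f, g\} = f'g - fg'$ for $' = \partial/\partial t_1$, we have the following statement:

Theorem 0.2. The following six statements are equivalent

(i) The discrete KP-equations (0.3);

(ii) $\Psi$ and $\Psi^*$, with the proper asymptotic behaviour, provided by (0.8), satisfy the bilinear identities for all $t, t' \in \mathbb{C}^\infty$,

$$\oint_{z=\infty} \Psi_n(t,z)\, \Psi^*_m(t',z)\, \frac{dz}{2\pi i z} = 0, \quad \text{for all } n > m; \tag{0.17}$$

(iii) the τ-vector satisfies the following bilinear identities for all n > m and $t, t' \in \mathbb{C}^\infty$:

$$\oint_{z=\infty} \tau_n(t - [z^{-1}])\, \tau_{m+1}(t' + [z^{-1}])\, e^{\sum_1^\infty (t_i - t_i') z^i}\, z^{n-m-1}\, dz = 0; \tag{0.18}$$

(iv) The components $\tau_n$ of a τ-vector correspond to a flag of planes in Gr,

$$\cdots \supset W_{n-1} \supset W_n \supset W_{n+1} \supset \ldots; \tag{0.19}$$

(v) A sequence of KP-τ-functions $\tau_n$ satisfying the equations

$$\{\tau_n(t - [z^{-1}]), \tau_{n+1}(t)\} + z\big( \tau_n(t - [z^{-1}])\, \tau_{n+1}(t) - \tau_{n+1}(t - [z^{-1}])\, \tau_n(t) \big) = 0; \tag{0.20}$$

(vi) A sequence of KP-τ-functions $\tau_n$ satisfying the first set of equations (0.14) for ℓ = 1, i.e.,

$$\left( \frac{\partial}{\partial t_k} - p_k(\tilde\partial) \right) \tau_{n+1} \circ \tau_n = 0 \quad \text{for } k = 2, 3, \ldots \text{ and } n \in \mathbb{Z}. \tag{0.21}$$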
Remark. The 2-Toda lattice, studied in [15], amounts to two coupled discrete KP-hierarchies, thus introducing two sets of times $t_n$’s and $s_n$’s. Actually, every discrete KP-hierarchy can naturally be extended to a 2-Toda lattice; this is the content of Theorem 3.4.

How to construct discrete KP-solutions. A wide class of examples of discrete KP-solutions is given in Sect. 4 by the following construction, involving the simple vertex operators,

$$X(t,z) := e^{\sum_1^\infty t_i z^i}\, e^{-\sum_1^\infty \frac{z^{-i}}{i} \frac{\partial}{\partial t_i}}, \tag{0.22}$$

which are disguised Darboux transformations acting on KP τ-functions. We now state:

Theorem 0.3. Consider an arbitrary τ-function for the KP equation and a family of weights $\ldots, \nu_{-1}(z)dz, \nu_0(z)dz, \nu_1(z)dz, \ldots$ on $\mathbb{R}$. The infinite sequence of τ-functions: $\tau_0 = \tau$ and, for n > 0,

$$\begin{aligned}
\tau_n &:= \left( \int X(t,\lambda)\, \nu_{n-1}(\lambda)\, d\lambda \cdots \int X(t,\lambda)\, \nu_0(\lambda)\, d\lambda \right) \tau(t), \\
\tau_{-n} &:= \left( \int X(-t,\lambda)\, \nu_{-n}(\lambda)\, d\lambda \cdots \int X(-t,\lambda)\, \nu_{-1}(\lambda)\, d\lambda \right) \tau(t),
\end{aligned}$$

form a discrete KP-τ-vector, i.e., the bi-infinite matrix

$$L = \sum_{\ell=0}^{\infty} \operatorname{diag}\left( \frac{p_\ell(\tilde\partial)\,\tau_{n+2-\ell} \circ \tau_n}{\tau_{n+2-\ell}\,\tau_n} \right)_{n\in\mathbb{Z}} \Lambda^{1-\ell} \tag{0.23}$$

satisfies the discrete KP-hierarchy (0.3).

As an interesting special case of this situation, we study in Sect. 6 the q-KP equation. A wide variety of examples are captured by this construction, like q-approximations to KP, discussed in Sect. 5, but also soliton formulas, matrix integrals, certain integrals leading to band matrices, the Calogero–Moser system and others, discussed in [3]. The technology in Theorem 0.3 might also be applicable to other approximations to integrable systems, like the K & M-lattice [10]. For a detailed discussion of discrete approximations to integrable systems, see [7, 12].

Remark. A semi-infinite discrete KP-hierarchy with $\tau_0(t) = 1$ is equivalent to a bi-infinite discrete KP-hierarchy with $\tau_{-n}(t) = \tau_n(-t)$ and $\tau_0(t) = 1$; this also amounts to $W_{-n} = W_n^*$, with $W_0 = H_+$. In such cases, one only keeps the lower right-hand corner of L, while the lower left-hand corner completely vanishes.

1. The KP τ-Functions, Grassmannians and a Residue Formula

As is well known [5], the bilinear identity

$$\oint_{z=\infty} \Psi(t,z)\, \Psi^*(t',z)\, dz = 0, \tag{1.1}$$

together with the asymptotics
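$$\Psi(t,z) = e^{\sum_1^\infty t_i z^i} \left( 1 + O\!\left(\frac{1}{z}\right) \right), \qquad \Psi^*(t,z) = e^{-\sum_1^\infty t_i z^i} \left( 1 + O\!\left(\frac{1}{z}\right) \right), \tag{1.2}$$

force $\Psi$, $\Psi^*$ to be expressible in terms of τ-functions

$$\Psi(t,z) = e^{\sum_1^\infty t_i z^i}\, \frac{\tau(t - [z^{-1}])}{\tau(t)}, \qquad \Psi^*(t,z) = e^{-\sum_1^\infty t_i z^i}\, \frac{\tau(t + [z^{-1}])}{\tau(t)};$$

moreover the KP τ-functions satisfy the differential Fay identity^3, for all $y, z \in \mathbb{C}$, as shown in [1, 16]:

$$\{\tau(t - [y^{-1}]), \tau(t - [z^{-1}])\} + (y - z)\big( \tau(t - [y^{-1}])\, \tau(t - [z^{-1}]) - \tau(t)\, \tau(t - [y^{-1}] - [z^{-1}]) \big) = 0. \tag{1.3}$$

In fact this identity characterizes the τ-function, as shown in [14]. From (1.1), it follows that

$$\begin{aligned}
0 &= \oint \tau(t - a - [z^{-1}])\, \tau(t + a + [z^{-1}])\, e^{-2\sum_1^\infty a_i z^i}\, \frac{dz}{2\pi i} \\
&= \sum_{k=1}^{\infty} a_k \left( \frac{\partial^2}{\partial t_1 \partial t_k} - 2 p_{k+1}(\tilde\partial_t) \right) \tau \circ \tau + O(a^2).
\end{aligned} \tag{1.4}$$

The Hirota notation used here is the following: Given a polynomial $p\left( \frac{\partial}{\partial t_1}, \frac{\partial}{\partial t_2}, \ldots \right)$ in $\frac{\partial}{\partial t_i}$, define the symbol

$$p\left( \frac{\partial}{\partial t_1}, \frac{\partial}{\partial t_2}, \ldots \right) (f \circ g)(t) := p\left( \frac{\partial}{\partial u_1}, \frac{\partial}{\partial u_2}, \ldots \right) f(t+u)\, g(t-u) \Big|_{u=0}, \tag{1.5}$$

and

$$\tilde\partial_t := \left( \frac{\partial}{\partial t_1}, \frac{1}{2}\frac{\partial}{\partial t_2}, \frac{1}{3}\frac{\partial}{\partial t_3}, \ldots \right).$$

For future use, we state the following proposition shown in [1]:

Proposition 1.1. Consider τ-functions $\tau_1$ and $\tau_2$, the corresponding wave functions

$$\Psi_j = e^{\sum_{i\ge 1} t_i z^i}\, \frac{\tau_j(t - [z^{-1}])}{\tau_j(t)} = e^{\sum_{i\ge 1} t_i z^i} \left( 1 + O(z^{-1}) \right) \tag{1.6}$$

and the associated infinite-dimensional planes, as points in the Grassmannian Gr,

$$\tilde W_i = \operatorname{span} \left\{ \left( \frac{\partial}{\partial t_1} \right)^{k} \Psi_i(t,z),\ \text{for } k = 0, 1, 2, \ldots \right\} \quad\text{with}\quad \tilde W_i^{t} = \tilde W_i\, e^{-\sum_1^\infty t_k z^k};$$

then the following statements are equivalent:

(i) $z \tilde W_2 \subset \tilde W_1$;

(ii) $z \Psi_2(t,z) = \frac{\partial}{\partial t_1} \Psi_1(t,z) - \alpha \Psi_1(t,z)$, for some function $\alpha = \alpha(t)$;

3 $\{f, g\} := \frac{\partial f}{\partial t_1}\, g - f\, \frac{\partial g}{\partial t_1}$.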
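(iii)

$$\{\tau_1(t - [z^{-1}]), \tau_2(t)\} + z\big( \tau_1(t - [z^{-1}])\, \tau_2(t) - \tau_2(t - [z^{-1}])\, \tau_1(t) \big) = 0. \tag{1.7}$$

When (i), (ii) or (iii) holds, α(t) is given by

$$\alpha(t) = \frac{\partial}{\partial t_1} \log \frac{\tau_2}{\tau_1}. \tag{1.8}$$

Proof. To prove that (i) ⇒ (ii), the inclusion $z \tilde W_2 \subset \tilde W_1$, hence $z \tilde W_2^{t} \subset \tilde W_1^{t}$, implies by (0.16) that

$$z \psi_2(t,z) = z\left( 1 + O(z^{-1}) \right) \in \tilde W_1^{t}$$

must be a linear combination^4

$$z \psi_2 = \frac{\partial \psi_1}{\partial t_1} + z \psi_1 - \alpha(t) \psi_1, \quad\text{and thus}\quad z \Psi_2 = \frac{\partial}{\partial t_1} \Psi_1 - \alpha(t) \Psi_1. \tag{1.9}$$

The expression (1.8) for α(t) follows from equating the $z^0$-coefficient in (ii), upon using the τ-function representation (1.6). To show that (ii) ⇒ (i), note that

$$z \Psi_2 = \frac{\partial}{\partial t_1} \Psi_1 - \alpha \Psi_1 \in \tilde W_1,$$

and taking $t_1$-derivatives, we have

$$z \left( \frac{\partial}{\partial t_1} \right)^{j} \Psi_2 = \left( \frac{\partial}{\partial t_1} \right)^{j+1} \Psi_1 + \beta_1 \left( \frac{\partial}{\partial t_1} \right)^{j} \Psi_1 + \cdots + \beta_{j+1} \Psi_1,$$

for some $\beta_1, \cdots, \beta_{j+1}$ depending on t only; this implies the inclusion (i). The equivalence (ii) ⟺ (iii) follows from a straightforward computation using the τ-function representation (1.6) of (ii) and the expression for α(t).

Lemma 1.2. The following integral along a clockwise circle in the complex plane encompassing z = ∞ and $z = \alpha^{-1}$, can be evaluated as follows:

$$\begin{aligned}
&\oint_{z=\infty} f(t + [\alpha] - [z^{-1}])\, g(t - [\alpha] + [z^{-1}])\, \frac{z^{m+1}}{(z - \alpha^{-1})^2}\, \frac{dz}{2\pi i z} \\
&\qquad = \alpha^{1-m} \sum_{k=1}^{\infty} \alpha^{k} \left( -\frac{\partial}{\partial t_k} + \sum_{r=0}^{m-1} (m-r)\, p_r(-\tilde\partial)\, p_{k-r}(+\tilde\partial) \right) f \circ g.
\end{aligned}$$

Proof. By the residue theorem, the integral above is the sum of the residues at z = ∞ and at $z = \alpha^{-1}$:

$$\begin{aligned}
&\oint_{z=\infty} f(t + [\alpha] - [z^{-1}])\, g(t - [\alpha] + [z^{-1}])\, \frac{z^{m+1}}{(z - \alpha^{-1})^2}\, \frac{dz}{2\pi i z} \\
&\qquad = \frac{1}{(m-1)!} \left( \frac{d}{du} \right)^{m-1} \left[ f(t + [\alpha] - [u])\, g(t - [\alpha] + [u])\, \frac{1}{(1 - u\alpha^{-1})^2} \right]_{u=0} \\
&\qquad\quad - \frac{d}{dz} \Big[ z^{m} f(t + [\alpha] - [z^{-1}])\, g(t - [\alpha] + [z^{-1}]) \Big]_{z=\alpha^{-1}}.
\end{aligned} \tag{1.10}$$

4 remember $\psi_i$ is the same as $\Psi_i$, but without the exponential.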
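Evaluating each of the pieces requires a few steps.

Step 1.

$$\frac{1}{k!} \left( \frac{d}{du} \right)^{k} f(t + [\alpha] - [u])\, g(t - [\alpha] + [u]) \Big|_{u=0} = \sum_{\ell=0}^{\infty} \alpha^{\ell}\, p_k(-\tilde\partial)\, p_\ell(\tilde\partial)\, f \circ g.$$

At first note

$$\left( \frac{d}{du} \right)^{k} F([u]) \Big|_{u=0} = k!\, p_k(\tilde\partial_s) F(s) \Big|_{s=0} \tag{1.11}$$

and, by (1.5) and (1.11),

$$\begin{aligned}
\frac{1}{k!} \left( \frac{d}{du} \right)^{k} f(t + [u])\, g(t - [u]) \Big|_{u=0} &= p_k(\tilde\partial) f \circ g \\
&= p_k(-\tilde\partial) g \circ f \\
&= \sum_{i+j=k} p_i(-\tilde\partial) g \cdot p_j(\tilde\partial) f.
\end{aligned} \tag{1.12}$$

Indeed

$$\begin{aligned}
&\frac{1}{k!} \left( \frac{d}{du} \right)^{k} f(t + [\alpha] - [u])\, g(t - [\alpha] + [u]) \Big|_{u=0} \\
&\quad = p_k(\tilde\partial_s)\, g(t - [\alpha] + s)\, f(t + [\alpha] - s) \Big|_{s=0}, \quad \text{using (1.11)} \\
&\quad = p_k(\tilde\partial_s) \sum_{\ell=0}^{\infty} \alpha^{\ell}\, p_\ell(\tilde\partial_t)\, f(t - s) \circ g(t + s) \Big|_{s=0}, \quad \text{using (1.12)} \\
&\quad = \sum_{\ell=0}^{\infty} \alpha^{\ell}\, p_k(\tilde\partial_s)\, p_\ell(\tilde\partial_w)\, f(t + w - s)\, g(t - w + s) \Big|_{s=w=0}, \quad \text{expressing Hirota,} \\
&\quad = \sum_{\ell=0}^{\infty} \alpha^{\ell}\, p_k(\tilde\partial_s)\, p_\ell(-\tilde\partial_w)\, f(t - w - s)\, g(t + w + s) \Big|_{s=w=0}, \quad \text{flipping signs,} \\
&\quad = \sum_{\ell=0}^{\infty} \alpha^{\ell}\, p_k(\tilde\partial_v)\, p_\ell(-\tilde\partial_v)\, f(t - v)\, g(t + v) \Big|_{v=0} \\
&\quad = \sum_{\ell=0}^{\infty} \alpha^{\ell}\, p_k(-\tilde\partial)\, p_\ell(\tilde\partial)\, f \circ g, \quad \text{using (1.5).}
\end{aligned}$$

Step 2. Residue at ∞: Note

$$\left( \frac{d}{du} \right)^{\ell} \left( \frac{1}{1 - u\alpha^{-1}} \right)^{2} \Big|_{u=0} = \left( \frac{d}{du} \right)^{\ell} \sum_{i=1}^{\infty} i\, (u\alpha^{-1})^{i-1} \Big|_{u=0} = (\ell+1)!\, \alpha^{-\ell}; \tag{1.13}$$

then we find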
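$$\begin{aligned}
&\frac{1}{(m-1)!} \left( \frac{d}{du} \right)^{m-1} \left[ f(t + [\alpha] - [u])\, g(t - [\alpha] + [u])\, \frac{1}{(1 - u\alpha^{-1})^2} \right]_{u=0} \\
&\quad = \frac{1}{(m-1)!} \sum_{r=0}^{m-1} \binom{m-1}{r} \left( \frac{d}{du} \right)^{r} f(t + [\alpha] - [u])\, g(t - [\alpha] + [u]) \cdot \left( \frac{d}{du} \right)^{m-1-r} \frac{1}{(1 - u\alpha^{-1})^2} \Big|_{u=0} \\
&\quad = \sum_{r=0}^{m-1} \sum_{\ell=0}^{\infty} (m-r)\, \alpha^{\ell-m+r+1}\, p_r(-\tilde\partial)\, p_\ell(\tilde\partial)\, f \circ g, \quad \text{using step 1 and (1.13)} \\
&\quad = m \alpha^{1-m} f(t) g(t) + \alpha^{1-m} \sum_{k=1}^{\infty} \alpha^{k} \sum_{r=0}^{m} (m-r)\, p_r(-\tilde\partial)\, p_{k-r}(\tilde\partial)\, f \circ g, \quad \text{using } p_0 = 1.
\end{aligned} \tag{1.14}$$

Step 3. Residue at $z = \alpha^{-1}$:

$$\begin{aligned}
&\frac{d}{dz} \Big[ z^{m} f(t + [\alpha] - [z^{-1}])\, g(t - [\alpha] + [z^{-1}]) \Big]_{z=\alpha^{-1}} \\
&\quad = -u^{2} \frac{d}{du} \Big[ u^{-m} f(t + [\alpha] - [u])\, g(t - [\alpha] + [u]) \Big]_{u=\alpha} \\
&\quad = m \alpha^{-m+1} f(t) g(t) - \alpha^{2-m} \frac{d}{du} f(t + [\alpha] - [u])\, g(t - [\alpha] + [u]) \Big|_{u=\alpha} \\
&\quad = m \alpha^{1-m} f(t) g(t) + \sum_{k=1}^{\infty} \alpha^{1-m+k} \frac{\partial}{\partial t_k} f \circ g, \quad \text{by elementary differentiation.}
\end{aligned} \tag{1.15}$$

Finally, putting Step 2 and Step 3 in (1.10) yields Lemma 1.2.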
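The only non-Hirota ingredient of Step 2 is the elementary identity (1.13). As a quick sanity check, it can be verified in exact arithmetic by squaring a truncated geometric series; the helper names below are illustrative and not part of the paper.

```python
from fractions import Fraction
from math import factorial


def mul_series(a, b, n):
    """Product of two power series (coefficient lists), truncated at order n."""
    out = [Fraction(0)] * (n + 1)
    for i, a_i in enumerate(a[: n + 1]):
        for j, b_j in enumerate(b[: n + 1 - i]):
            out[i + j] += a_i * b_j
    return out


def derivatives_at_zero(alpha, n):
    """l-th derivatives at u = 0 of (1 - u/alpha)^(-2), for l = 0..n."""
    c = Fraction(1) / Fraction(alpha)
    geometric = [c ** i for i in range(n + 1)]      # series of 1/(1 - u/alpha)
    square = mul_series(geometric, geometric, n)    # series of 1/(1 - u/alpha)^2
    # The l-th derivative at 0 is l! times the l-th Taylor coefficient.
    return [factorial(l) * square[l] for l in range(n + 1)]
```

Each entry of `derivatives_at_zero(alpha, n)` should agree with $(\ell+1)!\,\alpha^{-\ell}$, which is the factor that combines with the binomial coefficient in (1.14) to produce the weight $(m-r)$.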
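Lemma 1.3. The Hirota symbol acts on functions $f(t_1, t_2, \ldots)$ and $g(t_1, t_2, \ldots)$ as follows:

$$\frac{1}{fg} \frac{\partial^{n}}{\partial t_1 \ldots \partial t_n} f \circ g = \text{a polynomial } P_n \text{ in } \begin{cases} \dfrac{\partial^{k}}{\partial t_{i_1} \ldots \partial t_{i_k}} \log \dfrac{f}{g} & \text{for } k \text{ odd} \\[2ex] \dfrac{\partial^{k}}{\partial t_{i_1} \ldots \partial t_{i_k}} \log fg & \text{for } k \text{ even} \end{cases} \tag{1.16}$$

over all subsets $\{i_1, \ldots, i_k\} \subset \{1, \ldots, n\}$. Upon granting degree 1 to each partial in $t_i$, the polynomial $P_n$ is homogeneous of degree n.

Proof. By induction, we assume the statement to be valid for a Hirota symbol involving ℓ partials, and we prove the statement for a symbol involving ℓ + 1 partials:

$$\begin{aligned}
&\frac{1}{fg} \frac{\partial}{\partial t_{\ell+1}} \frac{\partial^{\ell}}{\partial t_1 \ldots \partial t_\ell} f(t) \circ g(t) \\
&\quad = \frac{1}{fg} \frac{\partial}{\partial u_{\ell+1}} \left[ f(t+u)\, g(t-u)\, \frac{\frac{\partial^{\ell}}{\partial t_1 \ldots \partial t_\ell} f(t+u) \circ g(t-u)}{f(t+u)\, g(t-u)} \right]_{u=0}
\end{aligned}$$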
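$$\begin{aligned}
&\quad = \left( \frac{\partial}{\partial t_{\ell+1}} \log \frac{f}{g} \right) \frac{1}{fg} \frac{\partial^{\ell}}{\partial t_1 \ldots \partial t_\ell} f(t+u) \circ g(t-u) \\
&\qquad + \frac{\partial}{\partial u_{\ell+1}} P\left( \ldots, \frac{\partial^{m}}{\partial t_{i_1} \ldots \partial t_{i_m}} \log \frac{f(t+u)}{g(t-u)}, \ldots, \frac{\partial^{n}}{\partial t_{j_1} \ldots \partial t_{j_n}} \log f(t+u)\, g(t-u), \ldots \right) \Bigg|_{u=0}
\end{aligned} \tag{1.17}$$

where m is odd and n even. The result follows from the simple computation:

$$\begin{aligned}
\frac{\partial}{\partial u_{\ell+1}} \frac{\partial^{m}}{\partial t_{i_1} \ldots \partial t_{i_m}} \log \frac{f(t+u)}{g(t-u)} \Big|_{u=0} &= \frac{\partial^{m+1}}{\partial t_{i_1} \ldots \partial t_{i_m} \partial t_{\ell+1}} \log f(t)\, g(t), \\
\frac{\partial}{\partial u_{\ell+1}} \frac{\partial^{n}}{\partial t_{i_1} \ldots \partial t_{i_n}} \log f(t+u)\, g(t-u) \Big|_{u=0} &= \frac{\partial^{n+1}}{\partial t_{i_1} \ldots \partial t_{i_n} \partial t_{\ell+1}} \log \frac{f(t)}{g(t)}.
\end{aligned} \tag{1.18}$$

Remark. The induction formula (1.17) can be made into an explicit formula for $P_n$, involving partitions of the set $\{1, 2, \ldots, n\}$.

2. The Existence of a τ-Vector and the Discrete KP Bilinear Identity

Before proving Theorem 0.1, we shall need two lemmas, which are analogues of basic lemmas in the theory of differential operators. So the main purpose of this section is threefold, namely, to prove the bilinear identities for the wave and adjoint wave vectors, to prove the existence of a τ-vector and finally to give a closed form for $L^k$.

Lemma 2.1. For z-independent $U, V \in \mathcal{D}$, the following matrix identities hold^5

$$UV = \oint_{z=\infty} U\chi(z) \otimes V^{\top}\chi^*(z)\, \frac{dz}{2\pi i z}. \tag{2.1}$$

Proof. Set

$$U = \sum_{\alpha} u_\alpha \Lambda^{\alpha} \quad\text{and}\quad V = \sum_{\beta} \Lambda^{\beta} v_\beta,$$

where $u_\alpha$ and $v_\beta$ are diagonal matrices. To prove (2.1), it suffices to compare the (i, j)-entries on each side. On the left side of (2.1), we have

$$\begin{aligned}
(UV)_{ij} &= \Big( \sum_{\alpha,\beta} u_\alpha \Lambda^{\alpha+\beta} v_\beta \Big)_{ij} \\
&= \sum_{\alpha,\beta} u_\alpha(i)\, (\Lambda^{\alpha+\beta})_{ij}\, v_\beta(j) \\
&= \sum_{\substack{\alpha,\beta \\ \alpha+\beta=j-i}} u_\alpha(i)\, v_\beta(j).
\end{aligned}$$

5 $(A \otimes B)_{ij} = A_i B_j$ and remember $\chi^*(z) = \chi(z^{-1})$. The contour in the integration below runs clockwise about ∞; i.e., opposite to the usual orientation.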
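On the right side of (2.1), we have

$$\begin{aligned}
&\oint_{z=\infty} \big( U\chi(z) \big)_i \big( V^{\top}\chi(z^{-1}) \big)_j\, \frac{dz}{2\pi i z} \\
&\quad = \oint_{z=\infty} \Big( \sum_{\alpha} u_\alpha z^{\alpha} \chi(z) \Big)_i \Big( \sum_{\beta} v_\beta z^{\beta} \chi(z^{-1}) \Big)_j\, \frac{dz}{2\pi i z} \\
&\quad = \oint_{z=\infty} \sum_{\alpha,\beta} u_\alpha(i)\, v_\beta(j)\, z^{\alpha+\beta+i-j}\, \frac{dz}{2\pi i z} \\
&\quad = \sum_{\substack{\alpha,\beta \\ \alpha+\beta=j-i}} u_\alpha(i)\, v_\beta(j),
\end{aligned}$$

establishing (2.1).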
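The entrywise comparison in this proof can be illustrated on finite truncations. The sketch below is not taken from the paper: it assumes $U$ and $V$ have finitely many bands, reads $(\Lambda^{\alpha})_{ij} = \delta_{j-i,\alpha}$ off the definition of the shift, and treats the contour integral against $dz/(2\pi i z)$ as extraction of the $z^0$ coefficient, as the proof does. The names `truncate`, `residue_side`, `check_lemma_2_1`, `u_example` and `v_example` are hypothetical.

```python
def truncate(coeffs, side, half):
    """Finite window (indices -half..half) of a banded bi-infinite matrix.

    side == "left":  sum_a u_a Lambda^a, entry (i, i + a) equals u_a(i).
    side == "right": sum_b Lambda^b v_b, entry (j - b, j) equals v_b(j).
    `coeffs` maps each shift to a function giving the diagonal entries.
    """
    size = 2 * half + 1
    mat = [[0] * size for _ in range(size)]
    for shift, c in coeffs.items():
        for i in range(-half, half + 1):
            j = i + shift
            if -half <= j <= half:
                mat[i + half][j + half] += c(i) if side == "left" else c(j)
    return mat


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def residue_side(u, v, i, j):
    """z^0 coefficient of (U chi(z))_i (V^T chi(1/z))_j: sum over a + b = j - i."""
    return sum(u[a](i) * v[b](j) for a in u for b in v if a + b == j - i)


def check_lemma_2_1(u, v, half):
    """Compare (UV)_{ij} with the residue side on rows unaffected by truncation."""
    UV = matmul(truncate(u, "left", half), truncate(v, "right", half))
    reach = max(abs(a) for a in u)
    for i in range(-half + reach, half - reach + 1):
        for j in range(-half, half + 1):
            if UV[i + half][j + half] != residue_side(u, v, i, j):
                return False
    return True


# Arbitrary illustrative diagonals (assumptions, chosen only to exercise the check).
u_example = {1: lambda i: 1, 0: lambda i: i + 2, -1: lambda i: i * i}
v_example = {0: lambda j: 2 * j + 1, -2: lambda j: j - 3, 1: lambda j: 5}
```

Rows within `reach` of the window edge are skipped because the truncated product drops intermediate indices there; on the remaining rows both sides reduce to the same sum over $\alpha + \beta = j - i$.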
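Lemma 2.2. For W(t) a wave operator of the discrete KP-hierarchy,

$$W(t)\, W^{-1}(t') \in \mathcal{D}_+, \quad \forall t, t'. \tag{2.2}$$

Proof. Setting $h(t, t') = W(t) W^{-1}(t')$, compute from (0.6),

$$\frac{\partial h}{\partial t_n} = (L^n(t))_+\, h, \qquad \frac{\partial h}{\partial t_n'} = -h\, (L^n(t'))_+. \tag{2.3}$$

Since $h(t, t) = I \in \mathcal{D}_+$, it follows that $h(t, t')$ evolves in $\mathcal{D}_+$.

Consider the wave function, already defined in the introduction, and the adjoint wave function:

$$\begin{aligned}
\Psi(t,z) &= W\chi(z) = e^{\sum_1^\infty t_i z^i} S\chi(z) = e^{\sum t_i z^i} \Big( z^{n} + \sum_{i<n} s_i(n) z^{i} \Big), \\
\Psi^*(t,z) &= (W^{\top})^{-1}\chi^*(z) = e^{-\sum_1^\infty t_i z^i} (S^{-1})^{\top}\chi^*(z)\ \ldots
\end{aligned}$$

[…] $:= (W^{-1}(t'))^{\top}$ in formula (2.1) of Lemma 2.1, and using formula (0.8) for $\Psi$ and $\Psi^*$ in terms of W, one finds for all $t, t' \in \mathbb{C}^\infty$,

$$W(t)\, W(t')^{-1} = \oint_{z=\infty} \Psi(t,z) \otimes \Psi^*(t',z)\, \frac{dz}{2\pi i z}. \tag{2.5}$$

But, according to Lemma 2.2, $W(t) W^{-1}(t') \in \mathcal{D}_+$ and thus (2.5) is upper-triangular, yielding

$$\oint_{z=\infty} \Psi_n(t,z)\, \Psi^*_m(t',z)\, \frac{dz}{2\pi i z} = 0 \quad \text{for all } n > m. \tag{2.6}$$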
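Defining
\[
\Phi_n(t,z):=z^{-n}\Psi_n(t,z)=e^{\sum t_iz^i}(1+O(z^{-1})),\qquad
\Phi^*_n(t,z):=z^{n-1}\Psi^*_{n-1}(t,z)=e^{-\sum t_iz^i}(1+O(z^{-1})),
\]
upon using the asymptotics (2.4), we have, by setting $m=n-1$ in (2.6),
\[
\oint_{z=\infty}\Phi_n(t,z)\Phi^*_n(t',z)\,dz=\oint_{z=\infty}\Psi_n(t,z)\Psi^*_{n-1}(t',z)\,\frac{dz}{z}=0.
\]
From the KP-theory, there exists a $\tau$-function $\tau_n(t)$ for each $n$, such that
\[
\Phi_n(t,z)=e^{\sum t_iz^i}\,\frac{\tau_n(t-[z^{-1}])}{\tau_n(t)},\qquad
\Phi^*_n(t,z)=e^{-\sum t_iz^i}\,\frac{\tau_n(t+[z^{-1}])}{\tau_n(t)},
\]
yielding the $\tau$-function representation (0.10) for $\Psi_n$ and $\Psi^*_n$.

Step 2. The following holds for $n\in\mathbb Z$:
\[
\Big(\frac12\frac{\partial^2}{\partial t_1\partial t_k}-p_{k+1}(\tilde\partial)\Big)\tau_n\circ\tau_n=0,\quad\text{for } k=1,2,3,\dots, \tag{2.7}
\]
\[
\Big(\frac{\partial}{\partial t_k}-\sum_{r=0}^{\ell-1}(\ell-r)\,p_r(-\tilde\partial)\,p_{k-r}(\tilde\partial)\Big)\tau_n\circ\tau_{n-\ell}=0,\quad\text{for } \ell,k=1,2,3,\dots. \tag{2.8}
\]
Indeed the bilinear identity (2.6), upon setting $m=n-\ell-1$, shifting $t\mapsto t+[\alpha]$, $t'\mapsto t-[\alpha]$, using the $\tau$-function representation (0.10) of $\Psi$ and $\Psi^*$, and Lemma 1.2 with $m=\ell$, yield^6
\[
\begin{aligned}
0&=-\alpha^2\oint_{z=\infty}\Psi_n(t+[\alpha],z)\,\Psi^*_{n-\ell-1}(t-[\alpha],z)\,\tau_n(t+[\alpha])\,\tau_{n-\ell}(t-[\alpha])\,\frac{dz}{2\pi i z}\\
&=-\alpha^2\oint_{z=\infty}\tau_n(t+[\alpha]-[z^{-1}])\,\tau_{n-\ell}(t-[\alpha]+[z^{-1}])\,e^{2\sum_1^\infty(\alpha z)^i/i}\,z^{\ell+1}\,\frac{dz}{2\pi i z}\\
&=\alpha^{1-\ell}\sum_{k=1}^{\infty}\alpha^k\Big(\frac{\partial}{\partial t_k}-\sum_{r=0}^{\ell-1}(\ell-r)\,p_r(-\tilde\partial)\,p_{k-r}(\tilde\partial)\Big)\tau_n\circ\tau_{n-\ell},
\end{aligned}
\]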
establishing (2.8). As for (2.7), set $m=n-1$, $t\mapsto t-a$ and $t'\mapsto t+a$ in the bilinear identity, and use (1.4). This establishes the two Eqs. (0.14).

Step 3. To check the formulas (0.12) for $S$, compute
\[
\begin{aligned}
e^{\sum_1^\infty t_iz^i}S\chi(z)&=:\Psi(t,z)\\
&=e^{\sum_1^\infty t_iz^i}\,\frac{\tau(t-[z^{-1}])}{\tau(t)}\,\chi(z)\qquad\text{(by (0.10))}\\
&=e^{\sum_1^\infty t_iz^i}\sum_{n=0}^{\infty}\frac{p_n(-\tilde\partial)\tau(t)}{\tau(t)}\,z^{-n}\chi(z)\\
&=e^{\sum_1^\infty t_iz^i}\sum_{n=0}^{\infty}\frac{p_n(-\tilde\partial)\tau(t)}{\tau(t)}\,\Lambda^{-n}\chi(z).
\end{aligned}
\]

6 $e^{m\sum_1^\infty(\alpha z)^i/i}=(1-\alpha z)^{-m}$.
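Similarly one checks the formula for $S^{-1}$ using the formulas for $\Psi^*(t,z)$ in terms of $S^{-1}$ and $\tau(t)$. Finally to check the formula (0.13) for $L^k$, use the formulas (0.12) for $S$ and $S^{-1}$ (for $\tilde\Lambda$, see footnote 1):
\[
\begin{aligned}
L^k=S\Lambda^kS^{-1}
&=\sum_{i,j\ge0}\frac{p_i(-\tilde\partial)\tau}{\tau}\,\Lambda^{-i-j+k}\,\tilde\Lambda\frac{p_j(\tilde\partial)\tau}{\tau}\\
&=\sum_{i,j\ge0}\frac{p_i(-\tilde\partial)\tau}{\tau}\Big(\tilde\Lambda^{-i-j+k+1}\frac{p_j(\tilde\partial)\tau}{\tau}\Big)\Lambda^{-i-j+k}\\
&=\sum_{\ell\ge0}\Big(\sum_{\substack{i,j\ge0\\ i+j=\ell}}\frac{p_i(-\tilde\partial)\tau_n}{\tau_n}\,\frac{p_j(\tilde\partial)\tau_{n+k-\ell+1}}{\tau_{n+k-\ell+1}}\Big)_{n\in\mathbb Z}\Lambda^{k-\ell}\\
&=\sum_{\ell\ge0}\Big(\frac{p_\ell(\tilde\partial)\,\tau_{n+k-\ell+1}\circ\tau_n}{\tau_{n+k-\ell+1}\,\tau_n}\Big)_{n\in\mathbb Z}\Lambda^{k-\ell}\qquad\text{(using (1.12))}
\end{aligned}
\]
yielding (0.13) and (0.15), upon noting,
\[
\begin{aligned}
\operatorname{coef}_{\Lambda^{k-1}}L^k&=\Big(\frac{p_1(\tilde\partial)\,\tau_{n+k}\circ\tau_n}{\tau_{n+k}\tau_n}\Big)_{n\in\mathbb Z}=\Big(\frac{\partial}{\partial t_1}\log\frac{\tau_{n+k}}{\tau_n}\Big)_{n\in\mathbb Z},\\
\operatorname{coef}_{\Lambda^{0}}L^k&=\Big(\frac{p_k(\tilde\partial)\,\tau_{n+1}\circ\tau_n}{\tau_{n+1}\tau_n}\Big)_{n\in\mathbb Z}=\Big(\frac{\partial}{\partial t_k}\log\frac{\tau_{n+1}}{\tau_n}\Big)_{n\in\mathbb Z},\quad\text{by (2.8)},\\
\operatorname{coef}_{\Lambda^{-1}}L^k&=\Big(\frac{p_{k+1}(\tilde\partial)\,\tau_n\circ\tau_n}{\tau_n^2}\Big)_{n\in\mathbb Z}=\Big(\frac{\partial^2}{\partial t_1\partial t_k}\log\tau_n\Big)_{n\in\mathbb Z},\quad\text{by (2.7)},
\end{aligned}
\]
concluding the proof of Theorem 0.1.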
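The elementary Schur polynomials $p_k$ used throughout Step 3 can be computed mechanically. The sketch below is an illustration only: it assumes the standard generating function $e^{\sum_{i\ge1}t_iz^i}=\sum_{k\ge0}p_k(t)z^k$, from which differentiation in $z$ gives the recursion $k\,p_k=\sum_{i=1}^k i\,t_i\,p_{k-i}$. The helper names `elementary_schur` and `bracket` are hypothetical. Evaluating at $t=m[\alpha]$ reproduces the expansion of $(1-\alpha z)^{-m}$ from footnote 6.

```python
from fractions import Fraction

def elementary_schur(t, kmax):
    """Return [p_0, ..., p_kmax], where exp(sum_{i>=1} t_i z^i) = sum_k p_k z^k.

    t[i-1] holds t_i; times beyond len(t) are treated as zero.
    Uses the recursion k * p_k = sum_{i=1}^{k} i * t_i * p_{k-i}.
    """
    p = [Fraction(1)]
    for k in range(1, kmax + 1):
        s = sum((Fraction(i) * t[i - 1] * p[k - i]
                 for i in range(1, k + 1) if i <= len(t)), Fraction(0))
        p.append(s / k)
    return p

def bracket(alpha, n):
    """First n entries of [alpha] = (alpha, alpha^2/2, alpha^3/3, ...)."""
    return [Fraction(alpha) ** i / i for i in range(1, n + 1)]
```

For instance, `elementary_schur(bracket(a, 6), 6)` returns the powers $a^0,\dots,a^6$, which is the case $m=1$ of footnote 6; exact rational arithmetic avoids any rounding in the comparison.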
Corollary 2.3. Setting $\gamma(t):=(\tilde\Lambda\tau(t)/\tau(t))$, the wave operator $W(t)$ for the discrete KP-hierarchy has the following property:
\[
(W(t)W^{-1}(t'))_-=0,\qquad (W(t)W^{-1}(t'))_0=\frac{\gamma(t)}{\gamma(t')}.
\]
Proof. That $h(t,t')=W(t)W^{-1}(t')\in\mathcal D_+$ was shown in Lemma 2.2. Concerning its diagonal $h_0$, we deduce from (2.3) that^7
\[
\frac{\partial}{\partial t_k}\log h_0=(L^k(t))_0,\qquad
\frac{\partial}{\partial t'_k}\log h_0=-(L^k(t'))_0,\qquad\text{with } h_0(t,t)=I.
\]
Note that $\gamma(t)/\gamma(t')$ satisfies the same differential equations as $h_0(t)$ with the same initial condition, upon using (0.15):
\[
\begin{aligned}
\Big(\frac{\partial}{\partial t_k}\log\frac{\gamma(t)}{\gamma(t')}\Big)_n&=\frac{\partial}{\partial t_k}\log\frac{\tau_{n+1}(t)}{\tau_n(t)}=L^k(t)_{nn},\\
\Big(\frac{\partial}{\partial t'_k}\log\frac{\gamma(t)}{\gamma(t')}\Big)_n&=-\frac{\partial}{\partial t'_k}\log\frac{\tau_{n+1}(t')}{\tau_n(t')}=-L^k(t')_{nn},
\end{aligned}
\]
with $\gamma(t)/\gamma(t')\big|_{t=t'}=I$.

7 $M_0:=$ diagonal part of $M$.
3. Sequences of τ-Functions, Flags and the Discrete KP Equation

In this section, we prove Theorem 0.2; it will be broken up into three propositions: the first one is very similar to the analogous statement for the KP theory (see [5, 16]). One could make an argument unifying both cases, in the context of Lie theory. The second statement uses Grassmannian technology.

Proposition 3.1. The following equivalences (i) ⟺ (ii) ⟺ (iii) stated in Theorem 0.2 hold.

Proof. (i) ⇒ (ii) was already shown in Theorem 0.1. Regarding the converse (ii) ⇒ (i), we show that vectors $\Psi(t,z)$ and $\Psi^*(t,z)$ having the asymptotics (0.8) and satisfying the bilinear identity (ii) are discrete KP-hierarchy vectors. The point of the proof is to show that the matrices $S$ and $T^\top\in I+\mathcal D_-$ defined through
\[
\Psi(t,z)=:e^{\sum_1^\infty t_iz^i}S\chi(z),\qquad \Psi^*(t,z)=:e^{-\sum_1^\infty t_iz^i}T\chi^*(z)
\]
satisfy the vector fields (0.6) with $T^\top=S^{-1}$.

Step 1. $T^\top=S^{-1}$. Assuming the bilinear identities (assumption (ii) of Theorem 0.2),
\[
\begin{aligned}
0&=\Big(\oint_{z=\infty}\Psi(t,z)\otimes\Psi^*(t,z)\,\frac{dz}{2\pi i z}\Big)_-\\
&=\Big(\oint_{z=\infty}e^{\sum_1^\infty t_iz^i}S\chi(z)\otimes e^{-\sum_1^\infty t_iz^i}T\chi(z^{-1})\,\frac{dz}{2\pi i z}\Big)_-\\
&=(ST^\top)_-,\quad\text{by (2.1)},
\end{aligned}
\]
but since $S,T^\top\in I+\mathcal D_-$, $ST^\top=I$, yielding $T^\top=S^{-1}$.

Step 2. $W(t)W^{-1}(t')\in\mathcal D_+$, upon defining $W(t):=S(t)e^{\sum_1^\infty t_i\Lambda^i}$. According to the bilinear identity, the left-hand side of
\[
\begin{aligned}
\oint_{z=\infty}\Psi(t,z)\otimes\Psi^*(t',z)\,\frac{dz}{2\pi i z}
&=\oint_{z=\infty}e^{\sum t_iz^i}S\chi(z)\otimes e^{-\sum_1^\infty t'_iz^i}(S^{-1})^\top\chi(z^{-1})\,\frac{dz}{2\pi i z}\\
&=\oint_{z=\infty}S(t)e^{\sum t_i\Lambda^i}\chi(z)\otimes(S^{-1}(t'))^\top e^{-\sum t'_i\Lambda^{\top i}}\chi(z^{-1})\,\frac{dz}{2\pi i z}\\
&=S(t)\,e^{\sum t_i\Lambda^i}e^{-\sum t'_i\Lambda^i}\,S^{-1}(t'),\quad\text{using Lemma 2.1}\\
&=W(t)W^{-1}(t');
\end{aligned}
\]
belongs to $\mathcal D_+$, and hence so does the right-hand side.

Step 3.
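\[
\begin{aligned}
\Big(\frac{\partial}{\partial t_n}-(L^n)_+\Big)\Psi(t,z)
&=\Big(\frac{\partial}{\partial t_n}-(L^n)_+\Big)S\chi(z)\,e^{\sum_1^\infty t_iz^i}\\
&=\Big(\frac{\partial S}{\partial t_n}-(L^n)_+S+Sz^n\Big)\chi(z)\,e^{\sum_1^\infty t_iz^i}\\
&=\Big(\frac{\partial S}{\partial t_n}-(L^n)_+S+S\Lambda^n(S^{-1}S)\Big)\chi(z)\,e^{\sum_1^\infty t_iz^i}\\
&=\Big(\frac{\partial S}{\partial t_n}-(L^n)_+S+L^nS\Big)\chi(z)\,e^{\sum_1^\infty t_iz^i}\\
&=\Big(\frac{\partial S}{\partial t_n}+(L^n)_-S\Big)\chi(z)\,e^{\sum_1^\infty t_iz^i}.
\end{aligned}
\]

Step 4. From $W(t)W^{-1}(t')\in\mathcal D_+$, since $\mathcal D_+$ is an algebra, deduce
\[
\begin{aligned}
\mathcal D_+\ni\Big(\frac{\partial}{\partial t_n}-(L^n)_+\Big)W(t)\,W^{-1}(t')\Big|_{t'=t}
&=\oint_{z=\infty}\Big(\frac{\partial}{\partial t_n}-(L^n)_+\Big)\Psi(t,z)\otimes\Psi^*(t,z)\,\frac{dz}{2\pi i z},\quad\text{by Lemma 2.1}\\
&=\oint_{z=\infty}\Big(\frac{\partial S(t)}{\partial t_n}+(L^n)_-S(t)\Big)\chi(z)e^{\sum_1^\infty t_iz^i}\otimes(S^\top(t))^{-1}\chi(z^{-1})e^{-\sum_1^\infty t_iz^i}\,\frac{dz}{2\pi i z},\quad\text{by Step 3}\\
&=\Big(\frac{\partial S(t)}{\partial t_n}+(L^n)_-S(t)\Big)S(t)^{-1},\quad\text{by Lemma 2.1},
\end{aligned}
\]
and thus, since $S\in I+\mathcal D_-$ and $\mathcal D_-$ is an algebra,
\[
\Big(\frac{\partial S(t)}{\partial t_n}+(L^n)_-S(t)\Big)S(t)^{-1}\in\mathcal D_+\cap\mathcal D_-=0;
\]
therefore, we have the discrete KP-hierarchy equations on $S$,
\[
\frac{\partial S(t)}{\partial t_n}+(L^n)_-S=0,\qquad n=1,2,\dots,
\]
and on $L=S\Lambda S^{-1}$,
\[
\frac{\partial L}{\partial t_n}=[-(L^n)_-,L],
\]
ending the proof that (ii) ⇒ (i). Finally (ii) ⟺ (iii) upon using the equivalence (i) ⟺ (ii) and the τ-function representation (0.10) of $\Psi$ and $\Psi^*$, shown in Theorem 0.1; this establishes Proposition 3.1.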
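With each component of the wave vector $\Psi$, or, what is the same, with each component of the τ-vector, we associate a sequence of infinite-dimensional planes in the Grassmannian $Gr^{(n)}$,
\[
\begin{aligned}
W_n&=\operatorname{span}_{\mathbb C}\Big\{\Big(\frac{\partial}{\partial t_1}\Big)^k\Psi_n(t,z),\ k=0,1,2,\dots\Big\}\\
&=e^{\sum_1^\infty t_iz^i}\operatorname{span}_{\mathbb C}\Big\{\Big(\frac{\partial}{\partial t_1}+z\Big)^k\psi_n(t,z),\ k=0,1,2,\dots\Big\}
=:e^{\sum_1^\infty t_iz^i}W_n^t,
\end{aligned}\tag{3.1}
\]
and planes
\[
W_n^*=\operatorname{span}_{\mathbb C}\Big\{\frac1z\Big(\frac{\partial}{\partial t_1}\Big)^k\Psi^*_{n-1}(t,z),\ k=0,1,2,\dots\Big\}, \tag{3.2}
\]
which are the orthogonal complements of $W_n$ in $Gr^{(n)}$, by the residue pairing
\[
\langle f,g\rangle_\infty:=\oint_{z=\infty}f(z)g(z)\,\frac{dz}{2\pi i}. \tag{3.3}
\]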
Proposition 3.2. The equivalences (ii) ⟺ (iv) ⟺ (v) of Theorem 0.2 hold.

Proof. The inclusion $\cdots\supset W_{n-1}\supset W_n\supset W_{n+1}\supset\cdots$ in (iv) implies that $W_n$, given by (3.1), is also given by
\[
W_n=\operatorname{span}_{\mathbb C}\{\Psi_n(t,z),\Psi_{n+1}(t,z),\dots\}.
\]
Moreover the inclusions $\cdots\supset W_n\supset W_{n+1}\supset\cdots$ imply, by orthogonality, the inclusions $\cdots\subset W_n^*\subset W_{n+1}^*\subset\cdots$, and thus $W_n^*$, given by (3.2) and thus specified by $\Psi^*_{n-1}$ and $\tau_n$, is also given by
\[
W_n^*=\Big\{\frac{\Psi^*_{n-1}(t,z)}{z},\frac{\Psi^*_{n-2}(t,z)}{z},\dots\Big\}.
\]
The bilinear identities (1.1) yield $W_n^*=W_n^\perp$, with respect to the residue pairing. Indeed, since each $\tau_n$ is a τ-function, we have that
\[
\begin{aligned}
\Big\langle\Psi_n(t,z),\frac{\Psi^*_{n-1}(t',z)}{z}\Big\rangle_\infty
&=\oint_{z=\infty}\Psi_n(t,z)\Psi^*_{n-1}(t',z)\,\frac{dz}{2\pi i z}\\
&=\frac{1}{\tau_n(t)\tau_n(t')}\oint_{z=\infty}\tau_n(t-[z^{-1}])\,\tau_n(t'+[z^{-1}])\,e^{\sum_1^\infty(t_i-t'_i)z^i}\,\frac{dz}{2\pi i}=0.
\end{aligned}
\]
Since
\[
W_n\subset W_{m+1}=(W^*_{m+1})^*,\quad\text{all } n>m,
\]
we have the orthogonality $W_n\perp W^*_{m+1}$ for all $n>m$, with respect to the residue pairing; since $\Psi_n(t,z)\in W_n$, $\Psi^*_m(t',z)/z\in W^*_{m+1}$, we have
\[
\Big\langle\Psi_n(t,z),\frac{\Psi^*_m(t',z)}{z}\Big\rangle_\infty=0=\oint_{z=\infty}\Psi_n(t,z)\Psi^*_m(t',z)\,\frac{dz}{2\pi i z},\quad\text{all } n>m, \tag{3.4}
\]
which is (ii). Now assume (ii); then, for fixed $n>m$, we have
\[
0=\oint_{z=\infty}\Big(\frac{\partial}{\partial t_1}\Big)^k\Psi_n(t,z)\,\Big(\frac{\partial}{\partial t'_1}\Big)^\ell\Psi^*_m(t',z)\,\frac{dz}{2\pi i z},\quad n>m,
\]
and thus by (3.1) and (3.2),
\[
W_n\subseteq(W^*_{m+1})^*=W_{m+1},\quad\text{for } n>m,
\]
which implies the flag condition $\cdots\supset W_{n-1}\supset W_n\supset W_{n+1}\supset\cdots$, stated in (iv). (iv) ⟺ (v) follows from the equivalence of (i) and (iii) in Proposition 1.1, by setting $\tau_1:=\tau_{n-1}$, $\tau_2=\tau_n$, $\tilde W_1=z^{-n+1}W_{n-1}$ and $\tilde W_2=z^{-n}W_n$ and noting
\[
z(z^{-n}W_n)\subset(z^{-n+1}W_{n-1}),\quad\text{i.e. } W_n\subset W_{n-1},
\]
concluding the proof of the proposition.
Proposition 3.3. (v) ⟺ (vi), as in Theorem 0.2, holds.

Proof. Step 1. For a given $n\in\mathbb Z$, statement (v), namely
\[
R_k^{(n)}:=\{p_{k-1}(-\tilde\partial)\tau_n,\tau_{n+1}\}+\tau_{n+1}\,p_k(-\tilde\partial)\tau_n-\tau_n\,p_k(-\tilde\partial)\tau_{n+1}=0,\quad k\ge2,
\]
implies
\[
R_k^{\prime(n)}=\Big(\frac{\partial}{\partial t_k}-p_k(\tilde\partial)\Big)\tau_{n+1}\circ\tau_n=0,\quad k\ge2.
\]
Since $R_k^{(n)}$ are the Taylor coefficients of relation (v) in Theorem 0.2, statement (v)$_n$ is equivalent to (iv)$_n$ (i.e. $W_n\supset W_{n+1}$). The latter is equivalent to the bilinear identity (iii)$_n$ (i.e., (0.18) with $n\to n+1$ and $m\to n-1$). According to the arguments used in the proof of Theorem 0.1, (iii)$_n$ implies $R_k^{\prime(n)}=0$.

Step 2. The converse holds, because, upon using an inductive argument,
\[
R_k^{(n)}=\alpha R_k^{\prime(n)}+\text{partials of }(R_1^{\prime(n)},\dots,R_{k-1}^{\prime(n)});
\]
thus the vanishing of the $R_1^{\prime(n)},\dots,R_k^{\prime(n)}$ implies the vanishing of $R_k^{(n)}$.
Theorem 3.4. Every discrete KP-hierarchy is equivalent to a 2-Toda lattice.

Proof. The 1-Toda theory implies for $S_1:=S\in I+\mathcal D_-$, $L_1:=L$,
\[
\frac{\partial S_1}{\partial t_n}=-(L_1^n)_-S_1(t),\quad\text{where } L_1=S_1\Lambda S_1^{-1}.
\]
Then, in view of the 2-Toda theory, define $S_2(t)\in\mathcal D_+$ by means of the differential equations
\[
\frac{\partial S_2(t)}{\partial t_n}=(L_1^n)_+S_2(t),\qquad n=1,2,\dots,
\]
with initial condition $S_2(0)=$ (an invertible element $d_+\in\mathcal D_+$). Then define^8 $S_{1,2}(t,s)$ and $L_{1,2}=S_{1,2}\Lambda^{\pm1}S_{1,2}^{-1}$, flowing according to the commuting differential equations
\[
\frac{\partial S_{1,2}(t,s)}{\partial s_n}=\pm(L_2^n(t,s))_\mp S_{1,2}(t,s)\quad\text{with } S_{1,2}(t,0)=S_{1,2}(t). \tag{3.5}
\]
$S_{1,2}(t,s)$ satisfies the $t$-equations of 2-Toda for $s=0$, by construction; now we must check that this holds for $s\neq0$; therefore, set
\[
F_{1,2}^{(n)}(t,s)=\frac{\partial S_{1,2}(t,s)}{\partial t_n}\pm(L_1^n(t,s))_\mp S_{1,2}(t,s),\quad\text{for } n=1,2,\dots. \tag{3.6}
\]
Compute, using (3.5) and $[\partial/\partial t_n,\partial/\partial s_n]=0$, the system of two differential equations,
\[
\frac{\partial F_{1,2}^{(n)}}{\partial s_k}(t,s)=\pm\big[F_{2,1}^{(n)}S_2^{-1},L_2^k\big]_\mp S_{1,2}\pm(L_2^k)_\mp F_{1,2}^{(n)},\qquad k,n=1,2,\dots;
\]
since $F_{1,2}^{(n)}(t,0)=0$, we have $F_{1,2}^{(n)}(t,s)=0$ for all $s$. Thus, by (3.5) and (3.6), $S_{1,2}(t,s)$ flow according to 2-Toda.
4. Discrete KP-Solutions Generated by Vertex Operators

An important construction leading to Toda solutions is contained in Theorem 0.3, which is based on the following lemma:

Lemma 4.1. Particular solutions to equation
\[
\{\tau_1(t-[z^{-1}]),\tau_2(t)\}+z\big(\tau_1(t-[z^{-1}])\tau_2(t)-\tau_2(t-[z^{-1}])\tau_1(t)\big)=0 \tag{4.1}
\]
are given, for arbitrary measures $\nu(\lambda)d\lambda$, $\nu'(\lambda)d\lambda$, by pairs $(\tau_1,\tau_2)$, defined by:
\[
\tau_2(t)=\int X(t,\lambda)\nu(\lambda)d\lambda\ \tau_1(t)=\int e^{\sum t_i\lambda^i}\,\tau_1(t-[\lambda^{-1}])\,\nu(\lambda)d\lambda, \tag{4.2}
\]
or
\[
\tau_1(t)=\int X(-t,\lambda)\nu'(\lambda)d\lambda\ \tau_2(t)=\int e^{-\sum t_i\lambda^i}\,\tau_2(t+[\lambda^{-1}])\,\nu'(\lambda)d\lambda. \tag{4.3}
\]
Proof. Using
\[
e^{-\sum_1^\infty\frac1i(\frac\lambda z)^i}=1-\frac\lambda z,
\]
it suffices to check, before even integrating, that $\tau_2(t)=X(t,\lambda)\tau_1(t)$ satisfies the above Eq. (4.1):
\[
\begin{aligned}
&e^{-\sum t_i\lambda^i}\Big(\{\tau_1(t-[z^{-1}]),\tau_2(t)\}+z\big(\tau_1(t-[z^{-1}])\tau_2(t)-\tau_2(t-[z^{-1}])\tau_1(t)\big)\Big)\\
&\quad=e^{-\sum t_i\lambda^i}\{\tau_1(t-[z^{-1}]),e^{\sum t_i\lambda^i}\tau_1(t-[\lambda^{-1}])\}\\
&\qquad+z\Big(\tau_1(t-[z^{-1}])\tau_1(t-[\lambda^{-1}])-\Big(1-\frac\lambda z\Big)\tau_1(t)\tau_1(t-[z^{-1}]-[\lambda^{-1}])\Big)\\
&\quad=\{\tau_1(t-[z^{-1}]),\tau_1(t-[\lambda^{-1}])\}\\
&\qquad+(z-\lambda)\big(\tau_1(t-[z^{-1}])\tau_1(t-[\lambda^{-1}])-\tau_1(t)\tau_1(t-[z^{-1}]-[\lambda^{-1}])\big)=0,
\end{aligned}
\]
using the differential Fay identity (1.3) for the τ-function $\tau_1$; a similar proof works for the second solution, given by $\tau_1(t)=X(-t,\lambda)\tau_2(t)$. Since Eq. (4.1) is linear in $\tau_1(t)$, and also in $\tau_2(t)$, the equation remains valid after integrating with regard to $\lambda$.

8 The first index in $L_{1,2}$ and $S_{1,2}$ corresponds to the upper sign.

Proof of Theorem 0.3. Note, from the definition of $\tau_{\pm n}$ in Theorem 0.3, that each $\tau_n$ is defined inductively by
\[
\tau_{n+1}=\int X(t,\lambda)\nu_n(\lambda)d\lambda\ \tau_n\quad\text{and}\quad\tau_{-n-1}=\int X(-t,\lambda)\nu_{-n-1}(\lambda)d\lambda\ \tau_{-n};
\]
thus by Lemma 4.1, the functions $\tau_{n+1}$ and $\tau_n$ are a solution of Eq. (v) of Theorem 0.2. Therefore, Theorem 0.2 implies that the $\tau_n$'s form a τ-vector of the discrete KP hierarchy.
5. Example of Vertex Generated Solutions: The q-KP Equation

Consider the class of q-pseudo-difference operators, with $y$-dependent coefficients, acting on functions $f(y)$,
\[
\mathcal D_q=\Big\{\sum a_i(y)D^i\Big\},\quad\text{with } Df(y):=f(qy),
\]
and the q-derivative $D_q$, defined by
\[
D_qf(y):=\frac{f(qy)-f(y)}{(q-1)y}=-\lambda(y)(D-1)f(y),\quad\text{with }\lambda(y):=-\frac{1}{(q-1)y}.
\]
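The two operators just defined are easy to experiment with. The sketch below is only an illustration under stated assumptions: functions are ordinary Python callables, the names `D`, `q_derivative` and `q_number` are hypothetical, and the notation $[n]_q=(q^n-1)/(q-1)$ is the usual q-number, which the text does not introduce. It shows that $D_q y^n=[n]_q\,y^{n-1}$ and that $D_q=-\lambda(y)(D-1)$.

```python
from fractions import Fraction

def D(f, q):
    """Dilation operator: (D f)(y) = f(q y)."""
    return lambda y: f(q * y)

def q_derivative(f, q):
    """q-derivative: (D_q f)(y) = (f(q y) - f(y)) / ((q - 1) y)."""
    return lambda y: (f(q * y) - f(y)) / ((q - 1) * y)

def q_number(n, q):
    """[n]_q = (q^n - 1) / (q - 1); tends to n as q -> 1 (assumed notation)."""
    return (q ** n - 1) / (q - 1)

def lam(y, q):
    """lambda(y) = -1 / ((q - 1) y), as in the definition of D_q."""
    return -1 / ((q - 1) * y)
```

With exact rationals, for example `q = Fraction(1, 2)` and `y = Fraction(3)`, the value `q_derivative(lambda y: y**4, q)(y)` coincides with `q_number(4, q) * y**3`.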
Consider the following q-pseudo-difference operators:
\[
Q=D+u_0(x)D^0+u_{-1}D^{-1}+\cdots\quad\text{and}\quad Q_q=D_q+v_0(x)D_q^0+v_{-1}(x)D_q^{-1}+\cdots
\]
and the following q-deformations, which were proposed respectively by E. Frenkel [6] and Khesin, Lyubashenko and Roger [11], for $n=1,2,\dots$:
\[
\frac{\partial Q}{\partial t_n}=\big[(Q^n)_+,Q\big]\quad\text{(Frenkel system)}, \tag{5.1}
\]
\[
\frac{\partial Q_q}{\partial t_n}=\big[(Q_q^n)_+,Q_q\big]\quad\text{(KLR system)}, \tag{5.2}
\]
where $(\ )_+$ and $(\ )_-$ refer to the q-difference and strictly q-pseudo-differential part of $(\ )$. Haine and Iliev [8] have constructed q-Schur polynomials, solutions to Eqs. (5.2), by inserting the vector $c(x)$ below in the usual Schur polynomials. In an elegant paper, Iliev [9] has obtained q-bilinear identities and q-tau functions, as well, purely within the KP theory. Defining
\[
c(x)=\Big(\frac{(1-q)x}{1-q},\frac{(1-q)^2x^2}{2(1-q^2)},\frac{(1-q)^3x^3}{3(1-q^3)},\dots\Big)\in\mathbb C^\infty\quad\text{and}\quad\lambda_n^{-1}(x)=(1-q)xq^{n-1}, \tag{5.3}
\]
one checks for $n\ge1$, $D^n\lambda_0(x)=\lambda_n(x)$, and
\[
D^nc(x)=c(x)-\sum_{i=1}^n[\lambda_i^{-1}(x)],\qquad
D^{-n}c(x)=c(x)+\sum_{i=1}^n[\lambda_{-i+1}^{-1}(x)]. \tag{5.4}
\]
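The identities (5.4) can be confirmed componentwise with exact arithmetic. This is a hedged sketch rather than part of the paper: it assumes that $D$ acts on $c(x)$ by $x\mapsto qx$, that $[a]=(a,a^2/2,a^3/3,\dots)$ as elsewhere in the text, and it truncates all vectors to their first $K$ entries. All function names are hypothetical.

```python
from fractions import Fraction

def c(x, q, K):
    """First K entries of c(x): c_k = ((1 - q) x)^k / (k (1 - q^k))."""
    return [((1 - q) * x) ** k / (k * (1 - q ** k)) for k in range(1, K + 1)]

def lam_inv(n, x, q):
    """lambda_n^{-1}(x) = (1 - q) x q^(n - 1); n may be zero or negative."""
    return (1 - q) * x * q ** (n - 1)

def bracket(a, K):
    """First K entries of [a] = (a, a^2/2, a^3/3, ...)."""
    return [a ** k / k for k in range(1, K + 1)]

def shifted_c(n, x, q, K):
    """D^n c(x), assuming D acts by x -> q x, so D^n c(x) = c(q^n x)."""
    return c(q ** n * x, q, K)

def rhs_forward(n, x, q, K):
    """c(x) - sum_{i=1}^{n} [lambda_i^{-1}(x)], the right side of the first identity."""
    out = c(x, q, K)
    for i in range(1, n + 1):
        out = [u - v for u, v in zip(out, bracket(lam_inv(i, x, q), K))]
    return out

def rhs_backward(n, x, q, K):
    """c(x) + sum_{i=1}^{n} [lambda_{-i+1}^{-1}(x)], the right side of the second identity."""
    out = c(x, q, K)
    for i in range(1, n + 1):
        out = [u + v for u, v in zip(out, bracket(lam_inv(-i + 1, x, q), K))]
    return out
```

The check reduces to the elementary identities $\frac{1}{1-q^k}-\sum_{i=0}^{n-1}q^{ik}=\frac{q^{nk}}{1-q^k}$ and $\frac{1}{1-q^k}+\sum_{i=1}^{n}q^{-ik}=\frac{q^{-nk}}{1-q^k}$.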
Theorem 5.1. There is an algebra isomorphism $\hat{\ }:\mathcal D_q\longrightarrow\mathcal D$, which maps the Frenkel and KLR system into the discrete KP-hierarchy
\[
\frac{\partial L}{\partial t_n}=\big[(L^n)_+,L\big],\qquad n=1,2,\dots.
\]
Theorem 5.2. Consider the matrices $L=\Lambda+$
X
diag
−∞ 0 card(V)
(4.23)
fu
for every $f\neq0$. Thus, the category $\mathrm{Com}_{\mathfrak H}$ is a unitary ribbon category (cf. [40]).

Example 4.1. Let $L=1$ and $S(A_{N-1};t)_\epsilon$ as in Example 2.1. Then $S(A_{N-1};t)_\epsilon$ is a compact CQT Hopf $\mathcal V$-face algebra of unitary type with costar structure $e^i_j(m)^\times=e^j_i(m)$. Its Woronowicz functional agrees with the counit $\varepsilon$.

5. Flat Face Models
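Let $\mathcal G$ be a finite oriented graph with set of vertices $\mathcal V$. We say that a quadruple $\big(\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big)$ or a diagram
\[
\begin{array}{ccc}
\lambda & \xrightarrow{\ \mathbf p\ } & \mu\\
{\scriptstyle\mathbf r}\big\downarrow & & \big\downarrow{\scriptstyle\mathbf s}\\
\nu & \xrightarrow{\ \mathbf q\ } & \xi
\end{array}\tag{5.1}
\]
is a face if $\mathbf p,\mathbf q,\mathbf r,\mathbf s\in\mathcal G^1$ and
\[
\mathfrak s(\mathbf p)=\lambda=\mathfrak s(\mathbf r),\quad \mathfrak r(\mathbf p)=\mu=\mathfrak s(\mathbf s),\quad \mathfrak r(\mathbf r)=\nu=\mathfrak s(\mathbf q),\quad \mathfrak r(\mathbf q)=\xi=\mathfrak r(\mathbf s). \tag{5.2}
\]
When $\mathcal G$ has no multiple edge, we also write $\big(\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big)=\big({\lambda\ \mu\atop\nu\ \xi}\big)$. We say that $(\mathcal G,w)$ is a ($\mathcal V$-)face model over a field $\mathbb K$ if $w$ is a map which assigns a number $w\big[\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big]\in\mathbb K$ to each face $\big(\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big)$ of $\mathcal G$. We call $w$ the Boltzmann weight of $(\mathcal G,w)$. For convenience, we set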
Face Algebras and Unitarity of SU(N)L -TQFT
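$w\big[\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big]=0$ unless $\big(\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big)$ is a face. For a face model $(\mathcal G,w)$, we identify $w$ with the linear operator on $\mathbb K\mathcal G^2=\bigoplus_{\mathbf p\in\mathcal G^2}\mathbb K\mathbf p$ given by
\[
w(\mathbf p\cdot\mathbf q)=\sum_{\mathbf r\cdot\mathbf s\in\mathcal G^2}w\Big[\mathbf r\,{\mathbf p\atop\mathbf s}\,\mathbf q\Big]\,\mathbf r\cdot\mathbf s\qquad(\mathbf p\cdot\mathbf q\in\mathcal G^2). \tag{5.3}
\]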
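For $m\ge2$ and $1\le i<m$, we define an operator $w_i=w_{i/m}$ on $\mathbb K\mathcal G^m$ by $w_{i/m}(\mathbf p\cdot\mathbf q\cdot\mathbf r)=\mathbf p\otimes w(\mathbf q)\otimes\mathbf r$ ($\mathbf p\in\mathcal G^{i-1}$, $\mathbf q\in\mathcal G^2$, $\mathbf r\in\mathcal G^{m-i-1}$), where we identify $(\mathbf p_1,\dots,\mathbf p_m)\in\mathcal G^m$ with $\mathbf p_1\otimes\dots\otimes\mathbf p_m\in(\mathbb K\mathcal G^1)^{\otimes m}$. A face model is called invertible if $w$ is invertible as an operator on $\mathbb K\mathcal G^2$. An invertible face model is called star-triangular (or Yang–Baxter) if $w$ satisfies the braid relation $w_1w_2w_1=w_2w_1w_2$ in $\mathrm{End}(\mathbb K\mathcal G^3)$. For a star-triangular face model $(\mathcal G,w)$, the operators $w_{i/m}$ ($1\le i<m$) define an action of the $m$-string braid group $\mathfrak B_m$ on $\mathbb K\mathcal G^m_{\lambda\mu}$ for each $m\ge2$ and $\lambda,\mu\in\mathcal V$. The following proposition gives a face version of the FRT construction.

Proposition 5.1 ([35,30,13,19]). Let $(\mathcal G,w)$ be a $\mathcal V$-face model and $\mathfrak H(\mathcal G)$ as in §1. Let $\mathfrak I$ be an ideal of $\mathfrak H(\mathcal G)$ generated by the following elements:
\[
\sum_{\mathbf r\cdot\mathbf s\in\mathcal G^2}w\Big[\mathbf r\,{\mathbf p\atop\mathbf s}\,\mathbf q\Big]\,e\binom{\mathbf a\cdot\mathbf b}{\mathbf r\cdot\mathbf s}-\sum_{\mathbf c\cdot\mathbf d\in\mathcal G^2}w\Big[\mathbf a\,{\mathbf c\atop\mathbf b}\,\mathbf d\Big]\,e\binom{\mathbf c\cdot\mathbf d}{\mathbf p\cdot\mathbf q}\qquad(\mathbf p\cdot\mathbf q,\ \mathbf a\cdot\mathbf b\in\mathcal G^2). \tag{5.4}
\]
Then $\mathfrak I$ is a coideal of $\mathfrak H(\mathcal G)$ and the quotient $\mathfrak A(w):=\mathfrak H(\mathcal G)/\mathfrak I$ becomes a $\mathcal V$-face algebra. If $(\mathcal G,w)$ is star-triangular, then there exist unique bilinear pairings $\mathcal R^\pm$ on $\mathfrak A(w)$ such that $(\mathfrak A(w),\mathcal R^\pm)$ is a CQT $\mathcal V$-face algebra and that
\[
\mathcal R^+\Big(e\binom{\mathbf p}{\mathbf q},e\binom{\mathbf r}{\mathbf s}\Big)=w\Big[\mathbf r\,{\mathbf q\atop\mathbf p}\,\mathbf s\Big] \tag{5.5}
\]
for each $\mathbf p,\mathbf q,\mathbf r,\mathbf s\in\mathcal G^1$.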
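We denote the image of $e\binom{\mathbf p}{\mathbf q}$ by the projection $\mathfrak H(\mathcal G)\to\mathfrak A(w)$ again by $e\binom{\mathbf p}{\mathbf q}$. Then $\mathfrak A^m(w):=\sum_{\mathbf p,\mathbf q\in\mathcal G^m}\mathbb Ke\binom{\mathbf p}{\mathbf q}$ becomes a subcoalgebra of $\mathfrak A(w)$ for each $m\ge0$. As the usual FRT construction (cf. [12, Prop. 2.1]), we have the following.

Proposition 5.2. For each star-triangular face model $(\mathcal G,w)$, we have $\mathfrak A^m(w)^*\cong\mathrm{Hom}_{\mathfrak B_m}(\mathbb K\mathcal G^m)$ ($m\ge2$) as $\mathbb K$-algebras.

We say that $\big(\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big)$ or (5.1) is a boundary condition of size $m\times n$ if $\mathbf p,\mathbf q\in\mathcal G^n$, $\mathbf r,\mathbf s\in\mathcal G^m$ and the relation (5.2) is satisfied for some $\lambda,\mu,\nu,\xi$. For a face model $(\mathcal G,w)$, we define its partition function to be an extension $w:\{\text{boundary conditions of size } m\times n;\ m,n\ge1\}\to\mathbb K$ of the map $w$ which is determined by the following two recursion relations:
\[
w\Big[\mathbf r\,{\mathbf p\cdot\mathbf p'\atop\mathbf q\cdot\mathbf q'}\,\mathbf s\Big]=\sum_{\mathbf a\in\mathcal G^m}w\Big[\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf a\Big]\,w\Big[\mathbf a\,{\mathbf p'\atop\mathbf q'}\,\mathbf s\Big], \tag{5.6}
\]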
T. Hayashi
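\[
w\Big[\mathbf r\cdot\mathbf r'\,{\mathbf p\atop\mathbf q}\,\mathbf s\cdot\mathbf s'\Big]=\sum_{\mathbf a\in\mathcal G^n}w\Big[\mathbf r\,{\mathbf p\atop\mathbf a}\,\mathbf s\Big]\,w\Big[\mathbf r'\,{\mathbf a\atop\mathbf q}\,\mathbf s'\Big] \tag{5.7}
\]
($\mathbf p,\mathbf q\in\mathcal G^n$, $\mathbf p',\mathbf q'\in\mathcal G^{n'}$, $\mathbf r,\mathbf s\in\mathcal G^m$, $\mathbf r',\mathbf s'\in\mathcal G^{m'}$). Also, we set $w\big[\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big]=\delta_{\mathbf p\mathbf q}$ (respectively $w\big[\mathbf r\,{\mathbf p\atop\mathbf q}\,\mathbf s\big]=\delta_{\mathbf r\mathbf s}$) if $\mathbf r,\mathbf s\in\mathcal G^0$ (respectively $\mathbf p,\mathbf q\in\mathcal G^0$). With this notation, the relation (5.5) holds for every star-triangular face model $(\mathcal G,w)$ and $\mathbf p,\mathbf q\in\mathcal G^n$, $\mathbf r,\mathbf s\in\mathcal G^m$ ($m,n\ge0$).

Next, we recall the notion of flat face model [18], which is a variant of A. Ocneanu's notion of flat biunitary connection (cf. [33]). Let $(\mathcal G,w)$ be an invertible face model with a fixed vertex $*\in\mathcal V=\mathcal G^0$. We assume that $\Lambda^m_{\mathcal G}\neq\emptyset$ for each $m\ge0$ and that $\mathcal V=\bigcup_{m\ge0}\mathcal V(m)$, where $\Lambda_{\mathcal G}=\Lambda_{\mathcal G*}$, $\Lambda^m_{\mathcal G}=\Lambda^m_{\mathcal G*}$ and $\mathcal V(m)=\mathcal V(m)_{\mathcal G*}$ are defined by
\[
\Lambda_{\mathcal G}=\{(\lambda,m)\in\mathcal V\times\mathbb Z_{\ge0}\mid\mathcal G^m_{*\lambda}\neq\emptyset\}, \tag{5.8}
\]
\[
\Lambda^m_{\mathcal G}=\Lambda_{\mathcal G}\cap(\mathcal V\times\{m\}), \tag{5.9}
\]
\[
\mathcal V(m)=\{\lambda\in\mathcal V\mid(\lambda,m)\in\Lambda_{\mathcal G}\}. \tag{5.10}
\]
For each $m\ge0$, we define the algebra $\mathrm{Str}^m(\mathcal G,*)$ by
\[
\mathrm{Str}^m(\mathcal G,*)=\prod_{\lambda\in\mathcal V(m)}\mathrm{End}(\mathbb K\mathcal G^m_{*\lambda}) \tag{5.11}
\]
and call it a string algebra of $(\mathcal G,w,*)$. For each $m,n\ge0$, we define the algebra map $\iota_{mn}:\mathrm{Str}^m(\mathcal G,*)\to\mathrm{Str}^{m+n}(\mathcal G,*)$ by $\iota_{mn}(x)(\mathbf p\cdot\mathbf q)=x\mathbf p\otimes\mathbf q$ ($\mathbf p\in\mathcal G^m_{*\lambda}$, $\mathbf q\in\mathcal G^n_{\lambda\mu}$). For each $1\le i<m$, we define the element $w^*_i=w^*_{i/m}$ of $\mathrm{Str}^m(\mathcal G,*)$ to be the restriction of $w_{i/m}$ on $\mathbb K\mathcal G^m_{*-}$, where $\mathcal G^m_{*-}=\coprod_{\lambda\in\mathcal V}\mathcal G^m_{*\lambda}$. We say that $(\mathcal G,w,*)$ is a flat face model if
\[
\iota_{mn}(x)\,w^*_{nm}\,\iota_{nm}(y)\,w^{*\,-1}_{nm}=w^*_{nm}\,\iota_{nm}(y)\,w^{*\,-1}_{nm}\,\iota_{mn}(x) \tag{5.12}
\]
for each $x\in\mathrm{Str}^m(\mathcal G,*)$ and $y\in\mathrm{Str}^n(\mathcal G,*)$ ($m,n\ge0$), where $w^*_{mn}\in\mathrm{Str}^{m+n}(\mathcal G,*)$ is defined by
\[
w^*_{mn}=(w^*_nw^*_{n+1}\cdots w^*_{m+n-1})(w^*_{n-1}w^*_n\cdots w^*_{m+n-2})\cdots(w^*_1w^*_2\cdots w^*_m). \tag{5.13}
\]
For each flat $\mathcal V$-face model $(\mathcal G,w,*)$, $n\ge0$ and $\lambda,\mu\in\mathcal V$, there exists a unique left action $\Gamma$ of $\mathrm{Str}^n(\mathcal G,*)$ on $\mathbb K\mathcal G^n_{\lambda\mu}$ such that
\[
\mathbf p\otimes(\Gamma(x)\mathbf q)=w^*_{nm}\,\iota_{nm}(x)\,w^{*\,-1}_{nm}(\mathbf p\cdot\mathbf q) \tag{5.14}
\]
for each $m\ge0$, $\mathbf p\in\mathcal G^m_{*\lambda}$, $\mathbf q\in\mathcal G^n_{\lambda\mu}$ and $x\in\mathrm{Str}^n(\mathcal G,*)$. Using this action, we define the costring algebra
\[
\mathrm{Cost}(w,*)=\bigoplus_{m\ge0}\mathrm{Cost}^m(w,*) \tag{5.15}
\]
to be the quotient $\mathcal V$-face algebra of $\bigoplus_{m\ge0}\mathrm{End}_{\mathbb K}(\mathbb K\mathcal G^m)^*\cong\mathfrak H(\mathcal G)$ given by
\[
\mathrm{Cost}^m(w,*)=\mathrm{End}_{\mathrm{Str}^m(\mathcal G,*)}(\mathbb K\mathcal G^m)^*. \tag{5.16}
\]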
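For each $\lambda,\mu\in\mathcal V$ and $(\nu,m)\in\Lambda_{\mathcal G}$, we define the non-negative integer $N^\mu_{\lambda\nu}(m)$ by the irreducible decomposition of $\mathbb K\mathcal G^m_{\lambda\mu}$:
\[
[\mathbb K\mathcal G^m_{\lambda\mu}]=\sum_{\nu\in\mathcal V(m)}N^\mu_{\lambda\nu}(m)\,[\mathbb K\mathcal G^m_{*\nu}], \tag{5.17}
\]
where, for each $\mathrm{Str}^m(\mathcal G,*)$-module $V$, $[V]$ denotes the element of the Grothendieck group $K_0(\mathrm{Str}^m(\mathcal G,*))$ corresponding to $V$ (see e.g. [4, §5.1]). We call $N^\mu_{\lambda\nu}(m)$ fusion rules of $(\mathcal G,w,*)$.

Theorem 5.3 ([18]). Let $(\mathcal G,w,*)$ be a flat $\mathcal V$-face model with fusion rule $N^\mu_{\lambda\nu}(m)$.
(1) For each $(\lambda,m)\in\Lambda_{\mathcal G}$, up to isomorphism there exists a unique right $\mathrm{Cost}^m(w,*)$-comodule $L_{(\lambda,m)}$ such that
\[
\dim L_{(\lambda,m)}(*,\mu)=\delta_{\lambda\mu} \tag{5.18}
\]
for each $\mu\in\mathcal V$. As coalgebras, we have
\[
\mathrm{Cost}^m(w,*)\cong\bigoplus_{\lambda\in\mathcal V(m)}\mathrm{End}(L_{(\lambda,m)})^*. \tag{5.19}
\]
(2) In the corepresentation ring $K_0(\mathrm{Com}^f_{\mathrm{Cost}(w,*)})$, we have
\[
[L_{(*,0)}]=1, \tag{5.20}
\]
\[
[L_{(\lambda,m)}][L_{(\mu,n)}]=\sum_{\nu\in\mathcal V(m+n)}N^\nu_{\lambda\mu}(n)\,[L_{(\nu,m+n)}]. \tag{5.21}
\]
Moreover, for each $\mathrm{Cost}^m(w,*)$-comodule $M$, we have
\[
[M]=\sum_{\lambda\in\mathcal V(m)}\dim(M(*,\lambda))\,[L_{(\lambda,m)}]. \tag{5.22}
\]
(3) We have
\[
\dim L_{(\nu,m)}(\lambda,\mu)=N^\mu_{\lambda\nu}(m). \tag{5.23}
\]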
Lemma 5.4. For each flat star-triangular face model, we have
\[
\Gamma(w^*_{i/n})=w_{i/n}\qquad(n\ge2,\ 1\le i<n). \tag{5.24}
\]
Proof. By the braid relation, we have
\[
w^*_{nm}\,w^*_{i/n+m}\,w^{*\,-1}_{nm}=w^*_{i+m/n+m} \tag{5.25}
\]
for each $1\le i<n$. Using this together with $\iota_{nm}(w^*_{i/n})=w^*_{i/n+m}$, we obtain
\[
\mathbf p\otimes(\Gamma(w^*_{i/n})\mathbf q)=\mathbf p\otimes w_{i/n}\mathbf q \tag{5.26}
\]
for each $m\ge0$, $\mathbf p\in\mathcal G^m_{*\lambda}$ and $\mathbf q\in\mathcal G^n_{\lambda\mu}$ as required. □

Proposition 5.5. Let $(\mathcal G,w)$ be a star-triangular $\mathcal V$-face model with a fixed vertex $*\in\mathcal V$. Then $(\mathcal G,w,*)$ is flat if $\mathbb K\mathcal G^m_{*\lambda}$ is an absolutely irreducible $\mathfrak B_m$-module for each $(\lambda,m)\in\Lambda_{\mathcal G}$. In this case, we have $\mathrm{Cost}(w,*)=\mathfrak A(w)$ as quotients of $\mathfrak H(\mathcal G)$.

Proof. Using (5.25), we see that (5.12) holds for every $x\in\mathrm{Str}^m(\mathcal G,*)$ and $y=w^*_i$ ($1\le i<n$). Hence $(\mathcal G,w,*)$ is flat if $\mathrm{Str}^m(\mathcal G,*)=\langle w^*_1,\dots,w^*_{m-1}\rangle$ for each $m>1$. The second assertion follows from Proposition 5.2 and the lemma above.
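6. $SU(N)_L$-SOS Models

In order to construct the algebras $S(A_{N-1};t)_\epsilon$, we first recall $SU(N)_L$-SOS models (without spectral parameter) [23], which are equivalent to H. Wenzl's representations of Iwahori–Hecke algebras (cf. [45]) and also, the monodromy representations of the braid group arising from conformal field theory (cf. A. Tsuchiya and Y. Kanie [39]). Let $N\ge2$ and $L\ge2$ be integers. For each $1\le i\le N$, we define the vector $\hat i\in\mathbb R^N$ by $\hat1=(1-1/N,-1/N,\dots,-1/N),\dots,\hat N=(-1/N,\dots,-1/N,1-1/N)$. Let $\mathcal V=\mathcal V_{NL}$ be the subset of $\mathbb R^N$ given by
\[
\mathcal V_{NL}=\{\lambda_1\hat1+\cdots+\lambda_N\hat N\mid\lambda_1,\dots,\lambda_N\in\mathbb Z,\ L\ge\lambda_1\ge\cdots\ge\lambda_N=0\}. \tag{6.1}
\]
For $\lambda\in\mathcal V$, we define integers $\lambda_1,\dots,\lambda_N$ and $|\lambda|$ by $\lambda=\sum_i\lambda_i\hat i$, $\lambda_N=0$ and $|\lambda|=\sum_i\lambda_i$. For $m\ge0$, we define a subset $\mathcal G^m$ of $\mathcal V^{m+1}$ by
\[
\mathcal G^m=\mathcal V^{m+1}\cap\{\mathbf p=(\lambda\mid i_1,\dots,i_m)\mid\lambda\in\mathcal V,\ 1\le i_1,\dots,i_m\le N\}, \tag{6.2}
\]
where for $\lambda\in\mathbb R^N$ and $1\le i_1,\dots,i_m\le N$, we set
\[
(\lambda\mid i_1,\dots,i_m)=(\lambda,\lambda+\hat i_1,\dots,\lambda+\hat i_1+\cdots+\hat i_m). \tag{6.3}
\]
Then $(\mathcal V,\mathcal G^1)$ defines an oriented graph $\mathcal G=\mathcal G_{N,L}$ and $\mathcal G^m$ is identified with the set of paths of $\mathcal G$ of length $m$. For $\mathbf p=(\lambda\mid i,j)$, we set
\[
\mathbf p^\dagger=(\lambda\mid j,i)\quad\text{and}\quad d(\mathbf p)=\lambda_i-\lambda_j+j-i. \tag{6.4}
\]
We define subsets $\mathcal G^2[\to]$, $\mathcal G^2[\downarrow]$ and $\mathcal G^2[\searrow]$ of $\mathcal G^2$ by
\[
\mathcal G^2[\to]=\{\mathbf p\in\mathcal G^2\mid\mathbf p^\dagger=\mathbf p\}, \tag{6.5}
\]
\[
\mathcal G^2[\downarrow]=\{\mathbf p\in\mathcal G^2\mid\mathbf p^\dagger\notin\mathcal G^2\}, \tag{6.6}
\]
\[
\mathcal G^2[\searrow]=\{\mathbf p\in\mathcal G^2\mid\mathbf p\neq\mathbf p^\dagger\in\mathcal G^2\}. \tag{6.7}
\]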
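To make the graph $\mathcal G_{N,L}$ of (6.1)–(6.3) concrete, the following sketch enumerates its vertices and edges. It is an illustration with assumptions: a vertex is stored as the integer tuple $(\lambda_1,\dots,\lambda_N)$ with $\lambda_N=0$, and adding $\hat N$ is handled through the relation $\hat1+\cdots+\hat N=0$, which follows from the definition of the $\hat i$. The function names are hypothetical.

```python
from itertools import product
from math import comb

def vertices(N, L):
    """All (l_1, ..., l_N) with L >= l_1 >= ... >= l_N = 0, as in (6.1)."""
    return [lam for lam in product(range(L + 1), repeat=N)
            if lam[-1] == 0 and all(lam[k] >= lam[k + 1] for k in range(N - 1))]

def add_hat(lam, i):
    """Coordinates of lambda + i-hat (i is 1-based), renormalised so the last entry is 0.

    Subtracting a common constant from all coordinates is allowed because
    1-hat + ... + N-hat = 0.
    """
    mu = list(lam)
    mu[i - 1] += 1
    shift = mu[-1]
    return tuple(m - shift for m in mu)

def edges(N, L):
    """Edges (lambda | i) of G^1: pairs with both lambda and lambda + i-hat in V."""
    V = set(vertices(N, L))
    return [(lam, i) for lam in sorted(V) for i in range(1, N + 1)
            if add_hat(lam, i) in V]

def d(lam, i, j):
    """d(lambda | i, j) = lambda_i - lambda_j + j - i, as in (6.4)."""
    return lam[i - 1] - lam[j - 1] + j - i
```

For $N=2$, $L=2$ this gives three vertices and four edges, a path graph with each edge taken in both directions; in general the vertex count is the number of partitions with at most $N-1$ parts of size at most $L$, namely $\binom{N+L-1}{N-1}$.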
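Let $t\in\mathbb C$ be a primitive $2(N+L)$th root of 1. Let $\epsilon$ be either 1 or $-1$ and $\zeta$ a nonzero parameter. We define a face model $(\mathcal G,w_{N,t,\epsilon})=(\mathcal G_{N,L},w_{N,t,\epsilon,\zeta})$ by setting
\[
w_{N,t,\epsilon}\begin{bmatrix}\lambda&\lambda+\hat i\\ \lambda+\hat i&\lambda+\hat i+\hat j\end{bmatrix}=-\zeta^{-1}t^{-d(\mathbf p)}\frac{1}{[d(\mathbf p)]}, \tag{6.8}
\]
\[
w_{N,t,\epsilon}\begin{bmatrix}\lambda&\lambda+\hat i\\ \lambda+\hat j&\lambda+\hat i+\hat j\end{bmatrix}=\zeta^{-1}\epsilon\,\frac{[d(\mathbf p)-1]}{[d(\mathbf p)]}, \tag{6.9}
\]
\[
w_{N,t,\epsilon}\begin{bmatrix}\lambda&\lambda+\hat k\\ \lambda+\hat k&\lambda+2\hat k\end{bmatrix}=\zeta^{-1}t \tag{6.10}
\]
for each $\mathbf p=(\lambda\mid i,j)\in\mathcal G^2[\searrow]\amalg\mathcal G^2[\downarrow]$ and $(\lambda\mid k,k)\in\mathcal G^2[\to]$, where $[n]=(t^n-t^{-n})/(t-t^{-1})$ for each $n\in\mathbb Z$. We call $(\mathcal G,w_{N,t,\epsilon})$ an $SU(N)_L$-SOS model (without spectral parameter) [23]. It is known that $(\mathcal G,w_{N,t,\epsilon})$ is star-triangular. Moreover, H. Wenzl [45] showed that $\mathbb C\mathcal G^m_{0\lambda}$ is an irreducible $\mathfrak B_m$-module for each $m\ge0$ and $\lambda\in\Lambda^m_{\mathcal G0}$. Therefore $(\mathcal G,w_{N,t,\epsilon},0)$ is flat by Proposition 5.5. In [10], F. Goodman and H. Wenzl showed that the fusion rule of $(\mathcal G,w_{N,t,\epsilon},0)$ agrees with that of $SU(N)_L$-WZW model. We give another proof of their result in the next section.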
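The bracket $[n]=(t^n-t^{-n})/(t-t^{-1})$ at a root of unity can be explored numerically. The sketch below is an illustration under an assumption: it fixes the particular primitive root $t=e^{\pi i/(N+L)}$, whereas the text allows any primitive $2(N+L)$th root. It checks that $[N+L]=0$ and that the brackets satisfy $[a+b+1]+[a][b]=[a+1][b+1]$, an identity invoked later in Section 8. The function names are hypothetical.

```python
import cmath
import math

def primitive_root(N, L):
    """One choice of primitive 2(N+L)-th root of unity: t = exp(pi i / (N + L))."""
    return cmath.exp(1j * cmath.pi / (N + L))

def qint(n, t):
    """[n] = (t^n - t^(-n)) / (t - t^(-1)) for an integer n."""
    return (t ** n - t ** (-n)) / (t - 1 / t)

def bracket_identity_gap(a, b, t):
    """Absolute value of [a+b+1] + [a][b] - [a+1][b+1]; zero up to rounding."""
    return abs(qint(a + b + 1, t) + qint(a, t) * qint(b, t)
               - qint(a + 1, t) * qint(b + 1, t))
```

For this choice of $t$ one has $[n]=\sin(n\theta)/\sin\theta$ with $\theta=\pi/(N+L)$, so the brackets are real and vanish exactly when $n$ is a multiple of $N+L$; this is why weights with denominator $[d(\mathbf p)]$ require $d(\mathbf p)$ to avoid such multiples.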
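Remark 6.1. (1) Strictly speaking, Wenzl deals with $w_{N,t,\epsilon}$ only when $\epsilon=-1$. However, it is clear that his arguments are applicable to the case $\epsilon=1$. The results for $\mathfrak A(w_{N,t,1})$ also follow from those of $\mathfrak A(w_{N,t,-1})$, since the former is a 2-cocycle deformation of the latter (cf. [5]). Hence, $\epsilon$ may be viewed as a gauge parameter.
(2) In order to avoid using square roots of complex numbers, we use a different normalization of $w_{N,t,-1}$ from Wenzl [45]. For each $\mathbf p\in\mathcal G^m$ ($m\ge1$), we define $\kappa(\mathbf p)\in\mathbb C$ by
\[
\kappa(\mathbf p\cdot\mathbf q)=\kappa(\mathbf p)\kappa(\mathbf q)\quad(\mathbf p\in\mathcal G^m,\ \mathbf q\in\mathcal G^n,\ m,n>0),\qquad
\kappa(\lambda\mid i)=\prod_{k=i+1}^{N}A_{d(0\mid i,k)+1}A_{d(0\mid i,k)+2}\cdots A_{d(\lambda\mid i,k)}\quad((\lambda\mid i)\in\mathcal G^1), \tag{6.11}
\]
where $A_d=a_d/\sqrt{a_da_{-d}}$ and $a_d=[d+1]/[2][d]$. Note that $\kappa(\mathbf p)$ satisfies
\[
\kappa(\mathbf p)=A_{d(\mathbf p)}\,\kappa(\mathbf p^\dagger) \tag{6.12}
\]
for each $\mathbf p\in\mathcal G^2[\searrow]$. By replacing the basis $\{\mathbf p\}$ of $\mathbb C\mathcal G^m$ with $\{\kappa(\mathbf p)\mathbf p\}$, we obtain Wenzl's original expression of the Hecke algebra representation. It is also useful to use $\{\kappa(\mathbf p)^2\mathbf p\}$ instead of $\{\mathbf p\}$ (see Sect. 12). The corresponding Boltzmann weight $w^\Sigma_{N,t,\epsilon}$ satisfies
\[
w^\Sigma_{N,t,\epsilon}\Big[\mathbf r\,{\mathbf p\atop\mathbf s}\,\mathbf q\Big]:=\Big(\frac{\kappa(\mathbf p\cdot\mathbf q)}{\kappa(\mathbf r\cdot\mathbf s)}\Big)^2w_{N,t,\epsilon}\Big[\mathbf r\,{\mathbf p\atop\mathbf s}\,\mathbf q\Big]=w_{N,t,\epsilon}\Big[\mathbf p\,{\mathbf r\atop\mathbf q}\,\mathbf s\Big]. \tag{6.13}
\]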
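We call $\{\mathbf p\}$ and $\{\kappa(\mathbf p)^2\mathbf p\}$ a rational basis of type $\Omega$ and type $\Sigma$ respectively.

7. The Algebra $S(A_{N-1};t)_\epsilon$

Applying Proposition 5.1 to $(\mathcal G,w_{N,t,\epsilon,\zeta})$, we obtain a CQT $\mathcal V$-face algebra $\mathfrak A(w_{N,t,\epsilon,\zeta})=\mathrm{Cost}(w_{N,t,\epsilon},0)$. In order to define the "(quantum) determinant" of $\mathfrak A(w_{N,t,\epsilon})$, we introduce an algebra $\Omega=\Omega_{N,L,\epsilon}$, which is a face-analogue of the exterior algebra. It is defined by generators $\omega(\mathbf p)$ ($\mathbf p\in\mathcal G^m$; $m\ge0$) with defining relations:
\[
\sum_{k\in\mathcal V}\omega(k)=1, \tag{7.1}
\]
\[
\omega(\mathbf p)\omega(\mathbf q)=\delta_{\mathfrak r(\mathbf p)\mathfrak s(\mathbf q)}\,\omega(\mathbf p\cdot\mathbf q), \tag{7.2}
\]
\[
\omega(\mathbf p)=-\epsilon\,\omega(\mathbf p^\dagger)\quad(\mathbf p\in\mathcal G^2[\searrow]), \tag{7.3}
\]
\[
\omega(\mathbf p)=0\quad(\mathbf p\in\mathcal G^2[\to]). \tag{7.4}
\]
It is easy to verify that $\Omega^m:=\sum_{\mathbf p\in\mathcal G^m}\mathbb C\omega(\mathbf p)$ becomes an $\mathfrak A(w_{N,t,\epsilon})$-comodule via
\[
\omega(\mathbf q)\mapsto\sum_{\mathbf p\in\mathcal G^m}\omega(\mathbf p)\otimes e\binom{\mathbf p}{\mathbf q}\quad(\mathbf q\in\mathcal G^m) \tag{7.5}
\]
for each $m\ge0$. For each $m\ge0$, we set
\[
\mathcal B_m=\Big\{\Big(\lambda,\lambda+\sum_{k\in I}\hat k\Big)\in\mathcal V^2\ \Big|\ I\subset\{1,\dots,N\},\ \mathrm{card}(I)=m\Big\}. \tag{7.6}
\]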
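Also we define $\mathcal L:\mathcal G^m\to\mathbb Z_{\ge0}$ by
\[
\mathcal L(\lambda\mid i_1,\dots,i_m)=\mathrm{Card}\{(k,l)\mid1\le k<l\le N,\ i_k<i_l\}. \tag{7.7}
\]
Proposition 7.1. For each $(\lambda,\mu)\in\mathcal B_m$, $\mathcal G^m_{\lambda\mu}\neq\emptyset$ and $\omega_m(\lambda,\mu):=(-\epsilon)^{\mathcal L(\mathbf p)}\omega(\mathbf p)$ does not depend on the choice of $\mathbf p\in\mathcal G^m_{\lambda\mu}$. Moreover $\{\omega_m(\lambda,\mu)\mid(\lambda,\mu)\in\mathcal B_m\}$ is a basis of $\Omega^m$. In particular, $\Omega^m=0$ if $m>N$.

Proof. We will prove this proposition by means of Bergman's diamond lemma [2], or rather its obvious generalization to the quotient algebras of $\mathbb C\langle\mathcal G\rangle$, where $\langle\mathcal G\rangle=\coprod_m\mathcal G^m$. We define a "reduction system" $S=S_1\amalg S_2\subset\langle\mathcal G\rangle\times\mathbb C\langle\mathcal G\rangle$ by setting
\[
S_1=\{(\mathbf p,-\epsilon\,\mathbf p^\dagger)\mid\mathbf p=(\lambda\mid i,j)\in\mathcal G^2[\searrow],\ i<j\},\qquad
S_2=\{(\mathbf p,0)\mid\mathbf p=(\lambda\mid i_1,\dots,i_m)\in\mathcal G^m,\ m\ge2,\ \mathrm{card}\{i_1,\dots,i_m\}<m\}.
\]
It is straightforward to verify that the quotient $\mathbb C\langle\mathcal G\rangle/\langle W-f\mid(W,f)\in S\rangle$ is isomorphic to $\Omega$ and that all ambiguities of $S$ are resolvable. Next, we introduce a semigroup partial order $\le$ on $\langle\mathcal G\rangle$ by setting $(\lambda\mid i_1,\dots,i_m)<(\lambda\mid j_1,\dots,j_n)$ if either $m<n$, or $m=n$ and $i_1=j_1,\dots,i_{k-1}=j_{k-1}$, $i_k>j_k$ for some $1\le k\le m$. Then $\le$ is compatible with $S$ and satisfies the descending chain condition. This completes the proof of the proposition. □

For each $0\le m\le N$, we set $\Lambda_m=\hat1+\cdots+\hat m$. As an immediate consequence of (5.22) and the proposition above, we obtain the following result.

Proposition 7.2. For each $0\le m\le N$, we have $\Omega^m\cong L_{(\Lambda_m,m)}$ as $\mathfrak A(w_{N,t,\epsilon})$-comodules.

Now we define the "determinant" $\det=\sum_{\lambda,\mu\in\mathcal V}\det\binom{\lambda}{\mu}$ of $\mathfrak A(w_{N,t,\epsilon})$ to be the element which corresponds to the group-like comodule $\Omega^N$ and its group-like basis $\{\bar\omega(\lambda)\}$ via Lemma 3.2, where $\bar\omega(\lambda)=D(\lambda)\,\omega_N(\lambda,\lambda)$ and
\[
D(\lambda)=\prod_{1\le i<j\le N}\frac{[d(\lambda\mid i,j)]}{[d(0\mid i,j)]}\quad(\lambda\in\mathcal V). \tag{7.8}
\]
Explicitly, we have
\[
\det\binom{\lambda}{\mu}=\frac{D(\mu)}{D(\lambda)}\sum_{\mathbf p\in\mathcal G^N_{\lambda\lambda}}(-\epsilon)^{\mathcal L(\mathbf p)+\mathcal L(\mathbf q)}\,e\binom{\mathbf p}{\mathbf q}, \tag{7.9}
\]
where $\mathbf q$ denotes an arbitrary element of $\mathcal G^N_{\mu\mu}$. By (2.38) and (2.39) for $g=\det$, the quotient
\[
S(A_{N-1};t)_\epsilon:=\mathfrak A(w_{N,t,\epsilon})/(\det-1) \tag{7.10}
\]
naturally becomes a $\mathcal V$-face algebra, which we call an $SU(N)_L$-SOS algebra. The proof of the following lemma will be given in Sects. 8 and 13.

Lemma 7.3. For each $\mathbf p\in\mathcal G^1$, we have
\[
c_{\Omega^N\Omega^1}(\bar\omega(\mathfrak s(\mathbf p))\otimes\omega(\mathbf p))=\epsilon^{N-1}\zeta^{-N}t\ \omega(\mathbf p)\otimes\bar\omega(\mathfrak r(\mathbf p)), \tag{7.11}
\]
\[
c_{\Omega^1\Omega^N}(\omega(\mathbf p)\otimes\bar\omega(\mathfrak r(\mathbf p)))=\epsilon^{N-1}\zeta^{-N}t\ \bar\omega(\mathfrak s(\mathbf p))\otimes\omega(\mathbf p). \tag{7.12}
\]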
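Proposition 7.4. The element $\det$ belongs to the center of $\mathfrak A(w_{N,t,\epsilon})$. Moreover, if $\zeta$ satisfies
\[
\zeta^N=\epsilon^{N-1}t, \tag{7.13}
\]
then
\[
\mathcal R^\pm(\det-1,a)=0=\mathcal R^\pm(a,\det-1)\quad(a\in\mathfrak A(w_{N,t,\epsilon,\zeta})). \tag{7.14}
\]
Hence, $S(A_{N-1};t)_\epsilon$ naturally becomes a quotient CQT $\mathcal V$-face algebra of $\mathfrak A(w_{N,t,\epsilon,\zeta})$.

Proof (cf. [13]). By computing the coaction of $\mathfrak A(w_{N,t,\epsilon})$ on $\omega(\mathbf p)\otimes\bar\omega(\mathfrak r(\mathbf p))$ in two ways via (7.11), we obtain the first assertion. We show the first equality of (7.14) for $\pm=+$ and $a=e\binom{\mathbf p}{\mathbf q}$ ($\mathbf p,\mathbf q\in\mathcal G^m$, $m\ge0$). By (2.25) and (2.27), it suffices to show
\[
\mathcal R^+\Big(\det\binom{\mathfrak r(\mathbf p)}{\mathfrak s(\mathbf p)},e\binom{\mathbf p}{\mathbf q}\Big)=\delta_{\mathbf p\mathbf q}\quad(\mathbf p,\mathbf q\in\mathcal G^m). \tag{7.15}
\]
For $m=0$, this follows from (2.27) and (2.25). By computing the left-hand side of (7.11) via (3.5), we obtain (7.15) for $m=1$. For $m\ge2$, (7.15) follows from (2.23) and (2.38) for $g=\det$ by induction on $m$. □

Since the braiding of $S(A_{N-1};t)_\epsilon$ depends on the choice of the discrete parameter $\zeta$ satisfying (7.13), we sometimes write $S(A_{N-1};t)_{\epsilon,\zeta}$ instead of $S(A_{N-1};t)_\epsilon$.

To state our first main result, we recall the fusion rule of the $SU(N)_L$-WZW model in conformal field theory. By [9], it is characterized as the structure constant $N^\nu_{\lambda\mu}$ of a commutative $\mathbb Z$-algebra $\mathcal F$ (called the fusion algebra of the $SU(N)_L$-WZW model) with free basis $\{\chi_\lambda\}_{\lambda\in\mathcal V}$ (i.e., $\chi_\lambda\chi_\mu=\sum_{\nu\in\mathcal V}N^\nu_{\lambda\mu}\chi_\nu$) such that
\[
N^\mu_{\lambda\Lambda_m}=\begin{cases}1&(\lambda,\mu)\in\mathcal B_m\\ 0&\text{otherwise}\end{cases} \tag{7.16}
\]
for each $\lambda,\mu\in\mathcal V$ and $0\le m<N$. See Kac [25] or Walton [44] for an explicit formula of $N^\nu_{\lambda\mu}$.

Theorem 7.5 ([11]). (1) For each $\lambda\in\mathcal V$, up to isomorphism there exists a unique right $S(A_{N-1};t)_\epsilon$-comodule $L_\lambda$ such that
\[
\dim L_\lambda(0,\mu)=\delta_{\lambda\mu}\quad(\mu\in\mathcal V). \tag{7.17}
\]
Moreover, we have
\[
S(A_{N-1};t)_\epsilon\cong\bigoplus_{\lambda\in\mathcal V}\mathrm{End}(L_\lambda)^* \tag{7.18}
\]
as coalgebras. In particular, $L_\lambda$ is irreducible for each $\lambda\in\mathcal V$.
(2) The corepresentation ring $K_0(\mathrm{Com}_{S(A_{N-1};t)_\epsilon})$ is identified with the fusion algebra $\mathcal F$ of $SU(N)_L$-WZW model via $\chi_\lambda=[L_\lambda]$. That is, we have
\[
[L_0]=1, \tag{7.19}
\]
\[
[L_\lambda][L_\mu]=\sum_{\nu\in\mathcal V}N^\nu_{\lambda\mu}[L_\nu]. \tag{7.20}
\]
(3) We have
\[
\dim(L_\nu(\lambda,\mu))=N^\mu_{\lambda\nu}. \tag{7.21}
\]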
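Proof. It is easy to verify that $(\lambda,m)\in\Lambda_{\mathcal G_{N,L}}$ if and only if $m\in|\lambda|+N\mathbb Z_{\ge0}$. Since $\mathbb C\mathcal V\det^n\bar\otimes L_{(\lambda,m)}\cong L_{(\lambda,m+Nn)}$ by (3.2) and (5.18), we see that $\det$ satisfies all conditions of Lemma 3.3, where $\bar\Lambda=\mathcal V$ and $\varphi(\lambda,n)=(\lambda,|\lambda|+Nn)$. Therefore, we have Part (1), (7.19) and
\[
[L_\lambda][L_\mu]=\sum_{\nu\in\mathcal V}N^\nu_{\lambda\mu}(|\mu|)[L_\nu],\qquad\dim(L_\nu(\lambda,\mu))=N^\mu_{\lambda\nu}(m)\quad((\nu,m)\in\Lambda_{\mathcal G_{N,L}}). \tag{7.22}
\]
In particular, $\tilde N^\nu_{\lambda\mu}:=N^\nu_{\lambda\mu}(m)$ does not depend on the choice of $m$. Using Proposition 7.2, (5.22) and (7.17), we obtain
\[
[L_\lambda][L_{\Lambda_m}]=\sum_{\mu\in\mathcal V}\dim\big((L_\lambda\bar\otimes\Omega^m)(0,\mu)\big)[L_\mu]=\sum_{\mu\in\mathcal V}\dim\big(\Omega^m(\lambda,\mu)\big)[L_\mu]. \tag{7.23}
\]
Thus the numbers $\tilde N^\nu_{\lambda\mu}$ satisfy the condition of $N^\nu_{\lambda\mu}$ stated above. □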
Proposition 7.6. The element $\det$ is not a zero-divisor of $\mathfrak A(w_{N,t,\epsilon})$. In particular, we have $\det\binom{\lambda}{\mu}\neq0$ for each $\lambda,\mu\in\mathcal V$. Moreover, we have
\[
\mathrm{GLE}\big(\mathfrak A(w_{N,t,\epsilon})\big)=\Big\{\sum_{\lambda,\mu\in\mathcal V}\frac{c(\lambda)}{c(\mu)}\,\mathring e_\lambda e_\mu\det{}^m\ \Big|\ m\in\mathbb Z_{\ge0},\ c(\lambda)\in\mathbb C^\times\ (\lambda\in\mathcal V)\Big\}. \tag{7.24}
\]
Proof. The first assertion follows from the fact that $\det$ is simply reducible (see the proof of the theorem above). By Theorem 5.3 (1), every group-like comodule of $\mathfrak A(w_{N,t,\epsilon})$ is isomorphic to $(\Omega^N)^{\bar\otimes m}$ for some $m\ge0$. Hence the second assertion follows from Lemma 3.2. □

8. Module Structure of $\Omega$

Let $\mathfrak H$ be a CQT $\mathcal V$-face algebra over $\mathbb K$. As in case $\mathfrak H$ is a CQT bialgebra, the correspondence $a\mapsto\mathcal R^+(\ ,a)$ defines an antialgebra-coalgebra map from $\mathfrak H$ into the dual face algebra $\mathfrak H^\circ$ (cf. [16]). Let $W$ be a right $\mathfrak H$-comodule. Combining the above map with the left action (3.11) of $\mathfrak H^*$ on $W$, we obtain a right action of $\mathfrak H$ on $W$ given by
\[
wa=\sum_{(w)}w_{(0)}\,\mathcal R^+(w_{(1)},a)\quad(w\in W,\ a\in\mathfrak H). \tag{8.1}
\]
Let $V$ be another $\mathfrak H$-comodule. Then we have
\[
(v\otimes w)\,a=\sum_{(a)}va_{(1)}\otimes wa_{(2)}\quad(v\in V(\lambda,\nu),\ w\in W(\nu,\mu),\ a\in\mathfrak H). \tag{8.2}
\]
If $\mathfrak H$ has an antipode and $W$ is finite-dimensional, then we have
\[
\langle va,w\rangle=\langle v,wS^{-1}(a)\rangle\quad(v\in W^\vee,\ w\in W), \tag{8.3}
\]
by (3.6) and (2.29). In this section, we give an explicit description of the right $\mathfrak A(w_{N,t,\epsilon})$-module structure of $\Omega$. By (7.5) and (5.5), we obtain the following.
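Lemma 8.1. For each $\mathbf s\in\mathcal G^m$ and $\mathbf p,\mathbf q\in\mathcal G^n$ ($n\ge0$), we have
\[
\omega(\mathbf s)\,e\binom{\mathbf p}{\mathbf q}=\sum_{\mathbf r\in\mathcal G^m}w_{N,t,\epsilon}\Big[\mathbf p\,{\mathbf s\atop\mathbf r}\,\mathbf q\Big]\,\omega(\mathbf r)\quad(\mathbf s\in\mathcal G^m,\ \mathbf p,\mathbf q\in\mathcal G^n,\ n\ge0). \tag{8.4}
\]
In particular, we have
\[
\Omega(\lambda,\mu)\,e\binom{\mathbf p}{\mathbf q}\subset\delta_{\lambda,\mathfrak s(\mathbf p)}\,\delta_{\mu,\mathfrak s(\mathbf q)}\,\Omega(\mathfrak r(\mathbf p),\mathfrak r(\mathbf q)). \tag{8.5}
\]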
Lemma 8.2. Let (λ, µ) be an element of Bm (m > 0) and p = (λ | i1 , . . . , im ) an element of Gm λµ . Define the set I and C(λ | k, l) ∈ C (k 6 = l) by I = {i1 , . . . , im } and C(λ | k, l) =
[d(λ | k, l) + 1] [d(λ | k, l)]
(8.6)
respectively. Then for each (λ | i), (µ | j ) ∈ G1 , we have: Y 1 λ |i ˆ 2 , . . . , im , j ) C(λ|i, k) ω(λ + i|i ω(p)e = (−ζ )−m t −d(λ|i,j ) [d(λ|i, j )] µ|j k∈I \{i}
(i = i1 , j 6∈ I ), ω(p)e
λ|i µ|j
= −δij (−ζ )−m t
Y
C(λ | i, k) ωm (λ + iˆ | i2 , . . . , im , i)
k∈I \{i}
(i = i1 , j ∈ I ),
λ|i ωm (p)e µ|j
(8.7)
= δij (ζ −1 )m
Y
(8.8)
C(λ | i, k) ωm (λ + iˆ | i1 , . . . , im )
k∈I
(i, j 6 ∈ I ),
λ|i ωm (p)e µ|j
(8.9)
=0
(i 6 ∈ I, j ∈ I ).
(8.10)
Proof. These formulas are proved by induction on m in a similar manner. Here we give m m−1 ⊗CG 1 , the left-hand side of ¯ the proof of (8.7). Since P is a quotient module of CG (8.7) is rewritten as q Aq Bq with λ|i ν |q , Bq = ω1 (ν, µ)e (8.11) Aq = ωm−1 (λ, ν)e ν |q µ|j by (8.2), where ν = λ+ iˆ1 +· · ·+ iˆm−1 and the summation is taken over for all 1 ≤ q ≤ N ˆ µ + jˆ) ∈ G1 only if q = im or j , we have such that (ν | q) ∈ G1 . Since (ν + q, ( λ|i Aim Bim + Aj Bj (ν | j ) ∈ G1 = ωm (λ, µ)e (8.12) Aim Bim otherwise. µ|j
Using the inductive assumption, we see that the right-hand side of (8.12) equals −m −d(i,j )
(−ζ )
t
[d(im , j ) − 1] 1 + [d(i, im )][d(im , j )] [d(i, j )][d(im , j )] m−1 Y
ˆ µ + jˆ) C(λ | i, in ) ωm (λ + i,
(8.13)
n=2
if $(\nu \mid j) \in G^1$, where $d(k,l) = d(\lambda \mid k,l)$. Applying $[a+b+1]+[a][b] = [a+1][b+1]$ to $a = d(i,i_m)$ and $b = d(i_m,j)-1$, we see that (8.13) equals the right-hand side of (8.7) (even if $(\nu \mid j) \notin G^1$). Next suppose that $(\nu \mid j) \notin G^1$. It suffices to verify that the second term in the parentheses of (8.13) is zero. In case $j = 1$, we obtain $\nu_1 = L$. Using this together with $1 \notin I$, we see that $\lambda_1 = L$. On the other hand, since $(\mu \mid 1) \in G^1$, we have $L-1 \geq \mu_1 = L - \delta_{i_m N}$. Hence, $i_m = N$ and $d(i_m,j)-1 = -L-N$. In case $j > 1$, we obtain $d(i_m,j)-1 = 0$ in a similar manner. Thus we complete the proof of (8.7). □

The following lemma is frequently used in the sequel.

Lemma 8.3. As a right $A(w_{N,t,\epsilon})$-module, $\mathbb{C}G^1 \cong \Omega^1$ is irreducible. Hence $\mathbb{C}G^1$ is also irreducible as a left $A(w_{N,t,\epsilon})^*$-module.

Proof. Let $W$ be a non-zero submodule of $\mathbb{C}G^1$. Since $W = \bigoplus_{\lambda\mu} W \mathring{e}_\lambda e_\mu$ and $W \mathring{e}_\lambda e_\mu \subset \mathbb{C}G^1_{\lambda\mu}$, we have $s_0 \in W$ for some $s_0 = (\lambda \mid i) \in G^1$. To show $\mathbb{C}G^1 = s_0 A(w_{N,t,\epsilon})$, we introduce the oriented graph $H$ determined by $H^0 = G^1$ and
$$\operatorname{card} H^1_{sr} = \begin{cases} 1 & \text{if } s\, e\binom{p}{q} \in \mathbb{C}^\times r \ \ (\exists\, p, q \in G^1), \\ 0 & \text{otherwise.} \end{cases} \tag{8.14}$$
It suffices to show that $\langle H\rangle_{s_0 r} \neq \emptyset$ for every $r \in G^1$, where $\langle H\rangle_{sr} = \cup_m H^m_{sr}$. We note that
$$H^1_{(\mu \mid j)\,(\mu+\hat{j} \mid k)} \neq \emptyset \quad \text{if } (\mu \mid j),\ (\mu+\hat{j} \mid k) \in G^1 \text{ and } j \neq k, \tag{8.15}$$
$$H^1_{(\mu \mid j)\,(\mu+\hat{k} \mid j)} \neq \emptyset \quad \text{if } (\mu \mid j),\ (\mu+\hat{k} \mid j) \in G^1 \text{ and } j \neq k, \tag{8.16}$$
$$H^1_{(\mu \mid j)\,(\mu+\hat{j} \mid j)} \neq \emptyset \quad \text{if } (\mu \mid j, j) \in G^2. \tag{8.17}$$
Using (8.16), we obtain $\langle H\rangle_{s_0 s_1} \neq \emptyset$, where $s_1 = (L\Lambda_{i-1} + \sum_{k=i}^{N-1} \lambda_k \hat{k} \mid i)$. Suppose $i \neq N$. Using (8.17), and then using (8.16), we obtain $\langle H\rangle_{s_1 s_2} \neq \emptyset$, where $s_2 = ((L-1)\Lambda_{N-1} + \Lambda_{i-1} \mid i)$. Using (8.15) and (8.16) respectively, we obtain $\langle H\rangle_{s_2 s_3}, \langle H\rangle_{s_4 (0|1)} \neq \emptyset$ and $\langle H\rangle_{s_3 s_4} \neq \emptyset$ respectively, where $s_3 = ((L-1)\Lambda_{N-1} + \Lambda_{N-2} \mid N-1)$ and $s_4 = (\Lambda_{N-2} \mid N-1)$. Therefore we obtain $\langle H\rangle_{s_0 (0|1)} \neq \emptyset$ if $i \neq N$. By similar consideration, we also obtain $\langle H\rangle_{s_0 (0|1)} \neq \emptyset$ in case $i = N$, and also, $\langle H\rangle_{(0|1) r} \neq \emptyset$ for every $r \in G^1$. Thus, we have verified the first assertion. The second assertion is obvious since the image of $a \mapsto R^+(\ , a)$ is a subalgebra of $A(w_{N,t,\epsilon})^*$. □
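The identity $[a+b+1]+[a][b] = [a+1][b+1]$ invoked in the proof of (8.7) can be spot-checked numerically. The sketch below assumes the usual convention $[n] = (t^n - t^{-n})/(t - t^{-1})$, which is not restated in this section; the names `qint` and `identity_gap` and the sample value $N + L = 7$ are illustrative only. It also confirms that $[-L-N]$ vanishes when $t = \exp(\pi i/(N+L))$, which is the mechanism that removes the second term of (8.13) in the case $j = 1$.

```python
import cmath

def qint(n, t):
    """q-integer [n] = (t^n - t^-n) / (t - t^-1) (assumed convention)."""
    return (t ** n - t ** (-n)) / (t - 1 / t)

def identity_gap(a, b, t):
    """Absolute value of [a+b+1] + [a][b] - [a+1][b+1]."""
    lhs = qint(a + b + 1, t) + qint(a, t) * qint(b, t)
    rhs = qint(a + 1, t) * qint(b + 1, t)
    return abs(lhs - rhs)

# Illustrative value: N + L = 7, so t_root is a primitive 14th root of unity.
N_plus_L = 7
t_root = cmath.exp(1j * cmath.pi / N_plus_L)

# Check the identity for a generic real t and for the root of unity.
worst = max(identity_gap(a, b, t)
            for t in (1.7, t_root)
            for a in range(-6, 7)
            for b in range(-6, 7))
print("largest deviation from the identity:", worst)
print("|[-(N+L)]| at the root of unity:", abs(qint(-N_plus_L, t_root)))
```

Both printed numbers should be zero up to floating-point rounding.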
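Now we begin to prove Lemma 7.3. By (8.5) and Lemma 8.2, we have
$$\bar\omega(\nu)\, e\binom{\lambda \mid i}{\mu \mid j} = \epsilon^{N-1} \zeta^{-N} t\, \delta_{ij}\, \delta_{\nu\lambda}\, \delta_{\nu\mu}\; \bar\omega(\nu+\hat{i}). \tag{8.18}$$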
1 → CG1 ; ω(s(p)) ¯ ¯ ⊗ p 7→ Using (8.2) and this equality, we see that both N ⊗CG ¯ N → CG1 ; p ⊗ ω(s(p)) ¯ 7 → p ⊗ ω(r(p)) ¯ are isomorphisms of p ⊗ ω(r(p)) ¯ and CG1 ⊗ right A(wN,t, )-modules. Hence, by Lemma 8.3 and Schur’s Lemma, we have
$$c_{\Omega^N \Omega^1}\bigl(\bar\omega(s(p)) \otimes p\bigr) = \vartheta\, p \otimes \bar\omega(r(p)) \quad (p \in G^1) \tag{8.19}$$
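for some constant $\vartheta$. We will prove $\vartheta = \epsilon^{N-1}\zeta^{-N} t$ in Sect. 12.

9. Transposes and Complex Conjugates

The following proposition is an immediate consequence of the following reflection symmetry:
$$w_{N,t,\epsilon}\!\left[\, r\ \begin{smallmatrix} p \\ s \end{smallmatrix}\ q \,\right] = \left(\frac{\kappa(r \cdot s)}{\kappa(p \cdot q)}\right)^{2} w_{N,t,\epsilon}\!\left[\, p\ \begin{smallmatrix} r \\ q \end{smallmatrix}\ s \,\right], \tag{9.1}$$
where $\kappa$ is as in (6.11).

Proposition 9.1. There exists an algebra-anticoalgebra map $A(w_{N,t,\epsilon}) \to A(w_{N,t,\epsilon})$; $a \mapsto a^T$ given by
$$e\binom{p}{q}^{T} = \left(\frac{\kappa(p)}{\kappa(q)}\right)^{2} e\binom{q}{p} \quad (p, q \in G^m,\ m \geq 0). \tag{9.2}$$
Moreover it satisfies $(a^T)^T = a$ and
$$R^{\pm}\bigl(a^T, b^T\bigr) = R^{\pm}(b, a) \tag{9.3}$$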
for each $a, b \in A(w_{N,t,\epsilon})$ and $\zeta \in \mathbb{C}^\times$.

The following proposition is needed to construct the “cofactor matrix”.

Proposition 9.2. The element $\det$ satisfies $\det^T = \det$. Hence, $T$ induces an algebra-anticoalgebra involution of $S(A_{N-1}; t)_\epsilon$, which satisfies (9.3).

Proof. Since $\det^T$ is a group-like element of $A(w_{N,t,\epsilon})$, we have
$$\det{}^T = g \det; \qquad g = \sum_{\lambda,\mu \in V} \frac{c(\lambda)}{c(\mu)}\, \mathring{e}_\lambda e_\mu \tag{9.4}$$
by (7.24), where $c(\lambda)$ $(\lambda \in V)$ denotes some nonzero constant. Since both $\det$ and $\det^T$ are central and $\det$ is not a zero divisor, $g$ is central. Hence, by Lemma 8.3 and Schur’s lemma, we have $p\, g = c\, p$ $(p \in G^1)$ for some $c \in \mathbb{C}$. Hence we have $c(\lambda) = c^{|\lambda|} c(0)$ for each $\lambda \in V$. In order to prove $c = 1$, we compute $\det\binom{\hat{1}}{0}^{T}$ in two ways. Using
$$[d(p) - 1]\; e\binom{p}{q} = -[d(p) + 1]\; e\binom{p^\dagger}{q} \quad (p \in G^2[\searrow],\ q \in G^2[\downarrow]), \tag{9.5}$$
we obtain
$$\sum_{i=1}^{k} (-)^{L(p_i)}\, e\binom{p_i}{0 \mid 1, \dots, N} = (-)^{L(p_k)}\, [k]^2\; e\binom{p_k}{0 \mid 1, \dots, N} \tag{9.6}$$
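by induction on $k$, where $p_i = (\hat{1} \mid 2, \dots, i, 1, i+1, \dots, N)$. Substituting $k = N$ in this equality, we get
$$\det\binom{\hat{1}}{0}^{T} = c^{|\hat{1}| - |0|} \det\binom{0}{\hat{1}} = (-)^{N(N-1)/2 + L(p_N)}\, c\, [N]\; e\binom{\hat{1} \mid 2, 3, \dots, N, 1}{0 \mid 1, 2, \dots, N-1, N}.$$
On the other hand, using (7.9) and (9.2), we see that the right-hand side of the above equality agrees with $c \det\binom{\hat{1}}{0}^{T}$. This completes the proof of the proposition. □

Next suppose $t = \exp(\pm\frac{\pi i}{N+L})$, or $-\exp(\pm\frac{\pi i}{N+L})$ with $N + L \in 2\mathbb{Z}$. Then, we have $\kappa(p) > 0$ for each $p \in G^m$ $(m > 0)$. Moreover, for each $\zeta$ with $|\zeta| = 1$, the Boltzmann weight $w_{N,t,\epsilon}$ satisfies
$$\overline{w_{N,t,\epsilon}\!\left[\, r\ \begin{smallmatrix} p \\ s \end{smallmatrix}\ q \,\right]} = \left(\frac{\kappa(r \cdot s)}{\kappa(p \cdot q)}\right)^{2} w^{-1}_{N,t,\epsilon}\!\left[\, p\ \begin{smallmatrix} r \\ q \end{smallmatrix}\ s \,\right]. \tag{9.7}$$

Similarly to Proposition 9.1 and Proposition 9.2, we obtain the following.

Proposition 9.3. Set $t = \pm\exp(\pm\frac{\pi i}{N+L})$ if $N + L \in 2\mathbb{Z}$, and $t = \exp(\pm\frac{\pi i}{N+L})$ if $N + L \in 1 + 2\mathbb{Z}$. Then for each solution $\zeta$ of (7.13), both $S(A_{N-1}; t)_{\epsilon,\zeta}$ and $A(w_{N,t,\epsilon,\zeta})$ are compact CQT $V$-face algebras of unitary type with costar structure
$$e\binom{p}{q}^{\times} = \left(\frac{\kappa(p)}{\kappa(q)}\right)^{2} e\binom{q}{p} \quad (p, q \in G^m,\ m \geq 0). \tag{9.8}$$

In fact, both $S(A_{N-1}; t)_\epsilon$ and $A(w_{N,t,\epsilon})$ are spanned by unitary matrix corepresentations $[e^u\binom{p}{q}]_{p,q \in G^m}$ $(m \geq 0)$ given by
$$e^u\binom{p}{q} = \frac{\kappa(q)}{\kappa(p)}\, e\binom{p}{q} \quad (p, q \in G^m,\ m \geq 0). \tag{9.9}$$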
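10. Antipodes and Ribbon Functionals

Lemma 10.1. The $V$-face algebra $S(A_{N-1}; t)_\epsilon$ has an antipode given by
$$S\left(e\binom{\lambda \mid i}{\mu \mid j}\right) = (-)^{i+j}\, \frac{D(\lambda+\hat{i})}{D(\mu+\hat{j})} \sum_a (-)^{L(a)+L(b)}\, e\binom{a}{b}, \tag{10.1}$$
where $b$ denotes an arbitrary element of $G^{N-1}_{\lambda+\hat{i}\ \lambda}$ and the summation is taken over all $a \in G^{N-1}_{\mu+\hat{j}\ \mu}$. Moreover, we have
$$S^2\left(e\binom{p}{q}\right) = \frac{D(r(p))\, D(s(q))}{D(s(p))\, D(r(q))}\; e\binom{p}{q} \quad (p, q \in G^m,\ m \geq 0). \tag{10.2}$$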
Proof (cf. [38], [12]). Let $Y_{(\lambda \mid i)(\mu \mid j)}$ denote the right-hand side of (10.1) viewed as an element of $A(w_{N,t,\epsilon})$. By [17, §7], it suffices to verify that
$$\sum_{r \in G^1} Y_{pr} X_{rq} = \delta_{pq}\, e_{r(p)} \det, \qquad \sum_{r \in G^1} X_{pr} Y_{rq} = \delta_{pq}\, \mathring{e}_{s(p)} \det, \tag{10.3}$$
¯ p) ˜ p ∈ G1 of N −1 by where Xpq = e qp (p, q ∈ G1 ). We define a basis ω( ˆ N−1 (λ + i, ˆ λ), where p˜ = (µ, λ) ∈ BN −1 for ˆ λ) = (−)i−1 D(λ + i)ω ω(λ ¯ + i, 1 N −1 is given by ω( ˜ 7→ ¯ q) p P= (λ, µ) ∈ G . Then, the coaction of A(wN,t, ) on ˜ ⊗ Yqp and the multiplication of gives maps ¯ p) p∈G1 ω( ¯ 1 → N ; ω( ¯ q) ˜ ⊗ ω(p) 7→ δpq ω(r(p)), ¯ N−1 ⊗ D(r(p)) ¯ N−1 → N ; ω(p) ⊗ ω( ˜ 7 → δpq (−)N −1 ω(s(p)). ¯ ¯ q) 1 ⊗ D(s(p)) Since these maps are compatible with the coaction of A(wN,t, ), we have the first formula of (10.3) and X Wrp Yqr = δpq es(p) det, (10.4) r
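where $W_{pq} \in A(w_{N,t,\epsilon})$ denotes the right-hand side of (10.2). Applying $T$ to this equality, we obtain
$$\sum_r X_{pr} Z_{rq} = \delta_{pq}\, \mathring{e}_{s(p)} \det; \qquad Z_{pq} = \frac{D(r(p))\, D(s(q))\, \kappa(p)^2}{D(s(p))\, D(r(q))\, \kappa(q)^2}\; Y^T_{qp}. \tag{10.5}$$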
Computing $\sum_{rs} Y_{pr} X_{rs} Z_{sq}$ in two ways, we obtain $Y_{pq} = Z_{pq}$. This proves the second equality of (10.3). Finally, computing $\sum_{rs} W_{rq}\, S(X_{sr})\, S^2(X_{ps})$ in the algebra $S(A_{N-1}; t)_\epsilon$ in two ways, we obtain (10.2). □

Proposition 10.2. For each $t$ and $\zeta$, $S(A_{N-1}; t)_{\epsilon,\zeta}$ becomes a coribbon Hopf $V$-face algebra, whose braiding, antipode $S$ and modified ribbon functional $M = M_1$ are given by (5.5), (10.1) and the following formula:
$$M\left(e\binom{p}{q}\right) = \delta_{pq}\, \frac{D(r(p))}{D(s(p))} \quad (p, q \in G^m,\ m \geq 0). \tag{10.6}$$
Moreover, we have
$$\dim_q(L_\lambda) = D(\lambda) \tag{10.7}$$
for each $\lambda \in V$. When $N$ is even, there exists another ribbon functional $M_{-1}$ given by
$$M_{-1}\left(e\binom{p}{q}\right) = \delta_{pq}\, (-1)^{|r(p)| - |s(p)|}\, \frac{D(r(p))}{D(s(p))} \quad (p, q \in G^m,\ m \geq 0). \tag{10.8}$$
The quantum dimension of the corresponding ribbon category is given by
$$\dim^{-1}_q(L_\lambda) = (-1)^{|\lambda|} D(\lambda). \tag{10.9}$$
Proof. Let $M \in H^*$ be as in (10.6). Using (10.2), we see that $M$ satisfies (2.43). Hence, it suffices to verify that
$$(U_1 U_2)\left(e\binom{p}{q}\right) = M^2\left(e\binom{p}{q}\right) \quad (p, q \in G^m) \tag{10.10}$$
for each $m \geq 0$. By (2.34) and (2.43), $U_1 M^{-1}$ is a central element of $A(w_{N,t,\epsilon})^*$. Hence by Lemma 8.3 and Schur’s lemma, we have $U_1 p = \vartheta M p$ $(p \in G^1)$ for some constant $\vartheta$. Using (2.31), (2.29) and (10.2), we compute
$$U_1^{-1}\left(e\binom{0 \mid 1}{0 \mid 1}\right) = \sum_{r \in G^1} R^+\left(S^2\left(e\binom{r}{0 \mid 1}\right),\ e\binom{0 \mid 1}{r}\right) \tag{10.11}$$
$$= \sum_{\nu = 2\hat{1},\ \hat{1}+\hat{2}} \frac{D(0)\, D(\nu)}{D(\hat{1})^2}\; w_{N,t,\epsilon}\!\begin{bmatrix} 0 & \hat{1} \\ \hat{1} & \nu \end{bmatrix} \tag{10.12}$$
$$= \zeta^{-1} t^{N}\, \frac{1}{[N]}. \tag{10.13}$$
This shows that $\vartheta = \zeta t^{-N}$, and similarly, we obtain $U_2 p = \vartheta^{-1} M p$ $(p \in G^1)$. This proves (10.10) for $m = 1$. For $m \geq 2$, (10.10) follows from (10.10) for $m = 1$ by induction on $m$, using the fact that both $M^2$ and $U_1 U_2$ are group-like (cf. (2.32), (2.33)). The second assertion follows from (10.6) and Lemma 3.1. □

We denote by $S(A_{N-1}; t)^\iota_{\epsilon,\zeta}$ the coribbon Hopf $V$-face algebra $(S(A_{N-1}; t)_{\epsilon,\zeta}, M_\iota)$, where $\iota = \pm 1$ if $N$ is even and $\iota = 1$ if $N$ is odd.

Lemma 10.3. If $t = \exp(\pm\frac{\pi i}{N+L})$, or $t = -\exp(\pm\frac{\pi i}{N+L})$ and $N$ is odd, then we have $D(\lambda) > 0$ for every $\lambda \in V$. If $t = -\exp(\pm\frac{\pi i}{N+L})$ and $N$ is even, then we have $(-1)^{|\lambda|} D(\lambda) > 0$.
Proof. Straightforward. □

Proposition 10.4. When $t = \exp(\pm\frac{\pi i}{N+L})$, or $t = -\exp(\pm\frac{\pi i}{N+L})$ and $N, L \in 1 + 2\mathbb{Z}$, the Woronowicz functional $Q$ of $S(A_{N-1}; t)_\epsilon$ is given by (10.6). While when $t = -\exp(\pm\frac{\pi i}{N+L})$ and $N, L \in 2\mathbb{Z}$, $Q$ is given by (10.8).

Proof. We will prove the first assertion. We set
$$\mathbf{Q} = \left[ Q\left(e^u\binom{p}{q}\right) \right]_{p,q \in G^1}, \qquad \mathbf{M} = \left[ M\left(e^u\binom{p}{q}\right) \right]_{p,q \in G^1}, \tag{10.14}$$
where $Q$ denotes the Woronowicz functional of $S(A_{N-1}; t)_\epsilon$ and $e^u\binom{p}{q}$ is as in (9.9). By (2.43), (4.6) and Lemma 8.3, we have $\mathbf{M} = \vartheta \mathbf{Q}$ for some $\vartheta$. Since $\mathbf{M}$ is positive by (10.6) and the lemma above, we have $\vartheta > 0$. Since the quantum dimension satisfies $\dim_q L = \dim_q L^\vee$ for every $L$, we have $\operatorname{Tr}(\mathbf{M}) = \operatorname{Tr}(\mathbf{M}^{-1})$. By (4.7), this proves $\mathbf{M} = \mathbf{Q}$. Now the assertion $M(e^u\binom{p}{q}) = Q(e^u\binom{p}{q})$ $(p, q \in G^m)$ easily follows from the fact that both $M$ and $Q$ are group-like, by induction on $m$. □
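11. The Modular Tensor Category

Let $C$ be a ribbon category, which is additive over a field $K$. We say that $C$ is semisimple if there exist a set $V_C$, an involution ${}^\vee : V_C \to V_C$, an element $0 \in V_C$ and simple objects $L^C_\lambda$ $(\lambda \in V_C)$ such that every object of $C$ is isomorphic to a finite direct sum of $L^C_\lambda$’s, and that
$$L^C_0 \cong \mathbf{1}, \qquad (L^C_\lambda)^\vee \cong L^C_{\lambda^\vee}, \tag{11.1}$$
$$C(L^C_\lambda, L^C_\mu) = \begin{cases} K & (\lambda = \mu) \\ 0 & (\lambda \neq \mu) \end{cases} \tag{11.2}$$
for each $\lambda, \mu \in V_C$, where $\mathbf{1}$ denotes the unit object of $C$. For a semisimple ribbon category $C$, we define its fusion rule $N^\nu_{\lambda\mu}$ $(\lambda, \mu, \nu \in V_C)$ and $S$-matrix $S^C = [S^C_{\lambda\mu}]_{\lambda,\mu \in V_C}$ by
$$[L^C_\lambda][L^C_\mu] = \sum_{\nu \in V_C} N^\nu_{\lambda\mu} [L^C_\nu], \tag{11.3}$$
$$S^C_{\lambda\mu} = \operatorname{Tr}_q\bigl(c_{L^C_\mu L^C_\lambda} \circ c_{L^C_\lambda L^C_\mu}\bigr). \tag{11.4}$$
By definition, we have
$$S^C_{\lambda 0} = \dim_q L^C_\lambda. \tag{11.5}$$
Since the twist $\theta$ satisfies
$$c_{WV} \circ c_{VW} = \theta_{V \otimes W} \circ (\theta_V \otimes \theta_W)^{-1}, \tag{11.6}$$
$S^C$ satisfies
$$S^C_{\lambda\mu} = \sum_{\nu \in V_C} \frac{\theta_\nu}{\theta_\lambda \theta_\mu}\, N^\nu_{\lambda\mu}\, \dim_q(L^C_\nu), \tag{11.7}$$
where $\theta_\lambda \in K$ is defined by $\theta_{L^C_\lambda} = \theta_\lambda\, \mathrm{id}_{L^C_\lambda}$. Moreover, $S = S^C$ satisfies the following Verlinde’s formula (cf. [43,32,40]):
$$\sum_{\xi \in V} N^\xi_{\lambda\mu}\, S_{\xi\nu} = \frac{S_{\lambda\nu}\, S_{\mu\nu}}{S_{\nu,0}}, \tag{11.8}$$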
where $\lambda$, $\mu$ and $\nu$ denote arbitrary elements of $V = V_C$.

Let $C$ be a semisimple ribbon category. We say that $C$ is a modular tensor category (or MTC) if $V_C$ is finite and the matrix $S^C$ is invertible. If, in addition, $C$ is unitary as a ribbon category, then it is called a unitary MTC. It is known that each (unitary) MTC gives rise to a (unitary) 3-dimensional topological quantum field theory (TQFT), hence, in particular, an invariant of 3-manifolds of Witten–Reshetikhin–Turaev type (cf. V. Turaev [40]).

The most well-known example of MTC is obtained as a certain semisimple quotient $C(\mathfrak{g}, \kappa)$ of a category of representations of the quantized enveloping algebra $U_q(\mathfrak{g})$ of finite type in the case when $q$ is a root of unity [1,8,28,41]. When $\mathfrak{g} = \mathfrak{sl}_N$, the simple objects $L^U_\lambda$ of $C(\mathfrak{sl}_N, N+L)$ $(L \geq 1)$ are also indexed by the set $V = V_{NL}$ given by (6.1) and the fusion rules agree with those of the $SU(N)_L$-WZW model. The quantum dimension and the constant $\theta_\lambda$ for $C(\mathfrak{sl}_N, N+L)$ are given by
$$\dim_q(L^U_\lambda) = D(\lambda)_{t_0}, \qquad \theta_\lambda = \zeta_0^{(\lambda \mid \lambda+2\rho)^\sim} \tag{11.9}$$
respectively, where $\zeta_0 = \exp(\frac{\pi i}{N(N+L)})$, $t_0 = \zeta_0^N$, $\rho = \Lambda_1 + \cdots + \Lambda_{N-1}$, $(\ \mid\ )^\sim = N(\ \mid\ )$ and $(\ \mid\ )$ denotes the usual inner product of $\mathbb{R}^N$. Moreover, the $S$-matrix of $C(\mathfrak{sl}_N, N+L)$ is given by $S^U = S^1(\zeta_0)$. Here, for each primitive $2N(N+L)$th root $\zeta$ of unity, we define the matrix $S^\iota(\zeta)$ by the following Kac–Peterson formula (cf. [28]):
$$S^\iota(\zeta)_{\lambda\mu} = \iota^{|\lambda|+|\mu|}\, \frac{\sum_{w \in S_N} (-1)^{l(w)}\, \zeta^{-2(w(\lambda+\rho) \mid \mu+\rho)^\sim}}{\sum_{w \in S_N} (-1)^{l(w)}\, \zeta^{-2(w(\rho) \mid \rho)^\sim}}, \tag{11.10}$$
where $\iota = \pm 1$ if $N \in 2\mathbb{Z}$, $\iota = 1$ if $N \in 1 + 2\mathbb{Z}$ and the action of the symmetric group $S_N$ on $\bigoplus_i \mathbb{C}\hat{i}$ is given by $w\,\hat{i} = \widehat{w(i)}$. Note that $(\lambda \mid \mu)^\sim \in \mathbb{Z}$ for every $\lambda, \mu \in \bigoplus_i \mathbb{Z}\Lambda_i$.

Lemma 11.1. Let $\zeta$ be a primitive $2N(N+L)$th root of unity and $t = \zeta^N$. Then the matrix $S = S^\iota(\zeta)$ is both symmetric and invertible, and satisfies Verlinde’s formula (11.8). Moreover, we have:
$$S^\iota(\zeta)_{\lambda 0} = \iota^{|\lambda|} D(\lambda)_t, \tag{11.11}$$
$$S^\iota(\zeta)_{\Lambda_q \Lambda_r} = \iota^{q+r} \sum_s (\zeta t)^{-2qr}\, t^{2s(q+r-s+1)}\, D(\Lambda_{q+r-s} + \Lambda_s)_t \tag{11.12}$$
for each $\lambda \in V_{NL}$ and $0 < r \leq q < N$, where the summation in (11.12) is taken over $\max\{0, q+r-N\} \leq s \leq r$.

Proof. Suppose $\zeta = \zeta_0$. Then the formula (11.12) follows from (11.7) for $C(\mathfrak{sl}_N, N+L)$, (7.16) and (11.9). In this case, the other assertions also follow from the results for $C(\mathfrak{sl}_N, N+L)$. For other $\zeta$, the assertions follow from Galois theory for $\mathbb{Q}(\zeta_0)/\mathbb{Q}$. □

Lemma 11.2. Let $N^\nu_{\lambda\mu}$ be the fusion rules of $SU(N)_L$-WZW models and let $S$ and $S'$ be symmetric matrices whose entries are indexed by $V = V_{NL}$. If these satisfy Verlinde’s formula (11.8), $S_{\lambda 0} = S'_{\lambda 0} \neq 0$ $(\lambda \in V)$ and
$$S_{\Lambda_q \Lambda_r} = S'_{\Lambda_q \Lambda_r} \quad (0 < r \leq q < N), \tag{11.13}$$
then we have $S = S'$.

Proof. We recall that there exists an algebra surjection from $\mathbb{Z}[x_1, \dots, x_N]^{S_N}$ onto $F$ (cf. Theorem 7.5 (2)), which sends the Schur function $s_{(\lambda_1, \dots, \lambda_N)}$ (see e.g. [31]) to $[L_\lambda]$ for each $\lambda \in V$ (see e.g. [9]). For each $\xi = (\xi_1, \dots, \xi_m)$ such that $N > \xi_1 \geq \dots \geq \xi_m > 0$, we define $E_\xi \in F$ to be the image of the elementary symmetric function $e_\xi$ via this map, that is $E_\xi = [\Omega^{\xi_1}] \cdots [\Omega^{\xi_m}]$. Since $\{e_\xi\}$ is a basis of $\mathbb{Z}[x_1, \dots, x_N]^{S_N}$, $\{E_\xi\}$ spans $F$. We define the symmetric bilinear forms $S$ and $S'$ on $F$ by setting
$$S\bigl([L_\lambda], [L_\mu]\bigr) = S_{\lambda\mu}, \qquad S'\bigl([L_\lambda], [L_\mu]\bigr) = S'_{\lambda\mu}. \tag{11.14}$$
Then, Verlinde’s formula for $S$ is rewritten as
$$S\left(ab, \frac{[L_\nu]}{S_{\nu 0}}\right) = S\left(a, \frac{[L_\nu]}{S_{\nu 0}}\right) S\left(b, \frac{[L_\nu]}{S_{\nu 0}}\right), \tag{11.15}$$
where $\nu \in V$ and $a, b \in F$. By (11.13) and this formula, we obtain
$$S(E_\xi, [\Omega^r]) = S'(E_\xi, [\Omega^r]) \tag{11.16}$$
for each $\xi$ and $0 \leq r < N$, or equivalently, we obtain
$$S([L_\lambda], [\Omega^r]) = S'([L_\lambda], [\Omega^r]) \tag{11.17}$$
for each $\lambda \in V$ and $0 \leq r < N$. Repeating a similar consideration, we conclude that $S_{\lambda\mu} = S'_{\lambda\mu}$ holds for every $\lambda, \mu \in V$. □

For $S = S(A_{N-1}; t)^\iota_{\epsilon,\zeta}$, we denote the semisimple ribbon category $\mathrm{Com}^f_S$ (resp. unitary ribbon category $\mathrm{Com}^{fu}_S$) by $C_S(A_{N-1}, t)^\iota_{\epsilon,\zeta}$ (resp. $C^u_S(A_{N-1}, t)_{\epsilon,\zeta}$).
Theorem 11.3. Let $N \geq 2$ and $L \geq 1$ be integers, $\iota = \pm 1$ if $N \in 2\mathbb{Z}$, $\iota = 1$ if $N \in 1 + 2\mathbb{Z}$ and $\epsilon = \pm 1$. Let $\zeta$ be a primitive $2N(N+L)$th root of unity.
(1) Suppose $N$ is odd or $\epsilon = 1$ and set $t = \zeta^N$. Then the category $C_S(A_{N-1}, t)^\iota_{\epsilon,\zeta}$ is a modular tensor category with $S$-matrix $S^\iota(\zeta)$.
(2) Suppose $N$ is even, $\epsilon = -1$ and $t := -\zeta^N$ is a primitive $2(N+L)$th root of unity. (Note that this implies $L \in 2\mathbb{Z}$.) Then the category $C_S(A_{N-1}, t)^\iota_{-1,\zeta}$ is a modular tensor category with $S$-matrix $S^{-\iota}(\zeta)$.

Proof. We will prove Part (2). By (11.5), (10.7), (10.9), (11.11) and $D(\lambda)_{-t} = (-1)^{|\lambda|} D(\lambda)_t$, we have $S^C_{\lambda 0} = S^{-\iota}(\zeta)_{\lambda 0}$. Hence by the lemma above, it suffices to show that $S^C_{\Lambda_q \Lambda_r} = S^{-\iota}(\zeta)_{\Lambda_q \Lambda_r}$ for each $0 < r \leq q < N$. As we will see in the next section, the action of $c_{\Omega^r \Omega^q} \circ c_{\Omega^q \Omega^r}$ on the one-dimensional space $(\Omega^q \bar\otimes \Omega^r)(0, \Lambda_{q+r-s} + \Lambda_s) = \mathbb{C}\,\omega_q(0, \Lambda_q) \otimes \omega_r(\Lambda_q, \Lambda_{q+r-s} + \Lambda_s)$ is given by the scalar $(\zeta t)^{-2qr}\, t^{2s(q+r-s+1)}$. Hence, by (3.14), we obtain
$$S^C_{\Lambda_q \Lambda_r} = \sum_s \operatorname{Tr}\Bigl( M_\iota \circ c_{\Omega^r \Omega^q} \circ c_{\Omega^q \Omega^r} \Big|_{(\Omega^q \bar\otimes \Omega^r)(0,\, \Lambda_{q+r-s} + \Lambda_s)} \Bigr) = (-\iota)^{q+r} \sum_s (\zeta t')^{-2qr}\, t'^{\,2s(q+r-s+1)}\, D(\Lambda_{q+r-s} + \Lambda_s)_{t'}, \tag{11.18}$$
where $t' = \zeta^N$ and the summation is taken over $\max\{0, q+r-N\} \leq s \leq r$. Since the right-hand side of (11.18) equals $S^{-\iota}(\zeta)_{\Lambda_q \Lambda_r}$ by (11.12), this completes the proof of Part (2). □

Corollary 11.4. Let $N \geq 2$ and $L \geq 1$ be integers, $\epsilon = \pm 1$ and $t = \exp(\pm\frac{\pi i}{N+L})$. If $N + L \in 1 + 2\mathbb{Z}$, $C^u_S(A_{N-1}, t)_{\epsilon,\zeta}$ is a unitary MTC provided that $\epsilon = 1$ or $N$ is odd, where $\zeta$ denotes an arbitrary primitive $2N(N+L)$th root of unity such that $\zeta^N = t$. If $N + L \in 2\mathbb{Z}$, $C^u_S(A_{N-1}, \pm t)_{\epsilon,\zeta}$ is a unitary MTC for each primitive $2N(N+L)$th root $\zeta$ of unity such that $\zeta^N = \pm\epsilon^{N-1} t$.
Remark 11.1. (1) When $N \in 1 + 2\mathbb{Z}$, $S(A_{N-1}, t)^\iota_{-1,\zeta}$ is isomorphic to a 2-cocycle deformation of $S(A_{N-1}, t)^\iota_{1,\zeta}$. Hence $C_S(A_{N-1}, t)^\iota_{1,\zeta}$ and $C_S(A_{N-1}, t)^\iota_{-1,\zeta}$ are equivalent.
(2) For $\mathfrak{g} = \mathfrak{so}_N$ and $\mathfrak{sp}_N$, a category-theoretic construction of unitary MTC’s related to $C(\mathfrak{g}, \kappa)$ is given by Turaev and Wenzl [42].

12. Braidings on $\Omega$

In this section, we give some explicit calculation of the braiding $c_{q,r} := c_{\Omega^q \Omega^r}$ in order to complete the proof of Lemma 7.3 and Theorem 11.3. Since the braiding is a natural transformation and the multiplication of $\Omega$ gives a $S(A_{N-1}; t)_\epsilon$-comodule map $m_{q,r} : \Omega^q \bar\otimes \Omega^r \to \Omega^{q+r}$, $c_{q,r}$ satisfies
$$c_{q+q',r} \circ (m_{q,q'} \bar\otimes \mathrm{id}_r) = (\mathrm{id}_r \bar\otimes m_{q,q'}) \circ (c_{q,r} \bar\otimes \mathrm{id}_{q'}) \circ (\mathrm{id}_q \bar\otimes c_{q',r}), \tag{12.1}$$
$$c_{q,r+r'} \circ (\mathrm{id}_q \bar\otimes m_{r,r'}) = (m_{r,r'} \bar\otimes \mathrm{id}_q) \circ (\mathrm{id}_r \bar\otimes c_{q,r'}) \circ (c_{q,r} \bar\otimes \mathrm{id}_{r'}). \tag{12.2}$$
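Lemma 12.1. For each $1 \leq p < N$ and $1 \leq q \leq N - p$, we have
$$\begin{aligned} c_{q,1}\bigl(\omega(\Lambda_p \mid p+1, \dots, p+q) \otimes \omega(\Lambda_{p+q} \mid 1)\bigr) = {} & -(-\zeta)^{-q}\, t^{p+1}\, \frac{[q]}{[p+1]}\; \omega(\Lambda_p \mid p+1) \otimes \omega(\Lambda_{p+1} \mid p+2, \dots, p+q, 1) \\ & + (-\zeta)^{-q} (-)^q\, \frac{[p+q+1]}{[p+1]}\; \omega(\Lambda_p \mid 1) \otimes \omega(\Lambda_p + \hat{1} \mid p+1, \dots, p+q). \end{aligned} \tag{12.3}$$

Proof. Suppose (12.3) is valid for each $p$ and for some $q < N - p$. Then, by (12.1), we obtain
$$\begin{aligned} & c_{1+q,1}\bigl(\omega(\Lambda_p \mid p+1, \dots, p+q+1) \otimes \omega(\Lambda_{p+q+1} \mid 1)\bigr) \\ & \quad = (\mathrm{id}_1 \bar\otimes m_{1,q}) \circ (c_{1,1} \bar\otimes \mathrm{id}_q) \Bigl[ \omega(\Lambda_p \mid p+1) \otimes \Bigl( -(-\zeta)^{-q}\, t^{p+2}\, \frac{[q]}{[p+2]}\; \omega(\Lambda_{p+1} \mid p+2) \otimes \omega(\Lambda_{p+2} \mid p+3, \dots, p+q+1, 1) \\ & \qquad\qquad + (-\zeta)^{-q} (-)^q\, \frac{[p+q+2]}{[p+2]}\; \omega(\Lambda_{p+1} \mid 1) \otimes \omega(\Lambda_{p+1} + \hat{1} \mid p+2, \dots, p+q+1) \Bigr) \Bigr] \\ & \quad = (-\zeta)^{-q} \left( -t^{p+2}\, \frac{[q]}{[p+2]}\; w_{N,t}\!\begin{bmatrix} \Lambda_p & \Lambda_{p+1} \\ \Lambda_{p+1} & \Lambda_{p+2} \end{bmatrix} + \frac{[p+q+2]}{[p+2]}\; w_{N,t}\!\begin{bmatrix} \Lambda_p & \Lambda_{p+1} \\ \Lambda_{p+1} & \Lambda_{p+1} + \hat{1} \end{bmatrix} \right) \omega(\Lambda_p \mid p+1) \otimes \omega(\Lambda_{p+1} \mid p+2, \dots, p+q+1, 1) \\ & \qquad + (-\zeta)^{-q} (-)^q\, \frac{[p+q+2]}{[p+2]}\; w_{N,t}\!\begin{bmatrix} \Lambda_p & \Lambda_{p+1} \\ \Lambda_p + \hat{1} & \Lambda_{p+1} + \hat{1} \end{bmatrix} \omega(\Lambda_p \mid 1) \otimes \omega(\Lambda_p + \hat{1} \mid p+1, \dots, p+q+1). \end{aligned} \tag{12.4}$$
Computing the right-hand side of the above equality, we obtain (12.3) for $q + 1$. □

Using (12.3) for $p = 1$, $q = N - 1$ together with (12.1), we obtain
$$c_{N,1}\bigl(\omega(0 \mid 1, \dots, N) \otimes \omega(0 \mid 1)\bigr) = -(-\zeta)^{-N}\, t\, [N]\; \omega(0 \mid 1) \otimes \omega(\hat{1} \mid 2, \dots, N, 1). \tag{12.5}$$
This shows that the constant $\vartheta$ in (8.19) equals $\epsilon^{N-1}\zeta^{-N} t$ and completes the proof of Lemma 7.3.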
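Lemma 12.2. We have the following relations:
$$\begin{aligned} & c_{q,1}\bigl(\omega(\Lambda_p \mid p+1, \dots, p+s, 1, \dots, q-s) \otimes \omega(\Lambda_{p+s} + \Lambda_{q-s} \mid q-s+1)\bigr) \\ & \quad \in -(-\zeta)^{-q}\, t^{p-q+s+1}\, \frac{[s]}{[p+1]}\; \omega(\Lambda_p \mid p+1) \otimes \omega(\Lambda_{p+1} \mid p+2, \dots, p+s, 1, \dots, q-s+1) \\ & \qquad + \Omega^1(\Lambda_p \mid 1) \otimes \Omega^q(\Lambda_p + \hat{1},\ \Lambda_{p+s} + \Lambda_{q-s+1}) \\ & \qquad\qquad (1 \leq p < N,\ 1 \leq s \leq N-p,\ s < q \leq p+2s-1), \end{aligned} \tag{12.6}$$
$$\begin{aligned} & c_{q,r}\bigl(\omega(\Lambda_p \mid p+1, \dots, p+q) \otimes \omega(\Lambda_{p+q} \mid 1, \dots, r)\bigr) \\ & \quad \in (-1)^r (-\zeta)^{-qr}\, t^{pr+r}\, \frac{[q]!\, [p]!}{[p+r]!\, [q-r]!}\; \omega(\Lambda_p \mid p+1, \dots, p+r) \otimes \omega(\Lambda_{p+r} \mid p+r+1, \dots, p+q, 1, \dots, r) \\ & \qquad + \sum_{\lambda \neq \Lambda_{p+r}} \Omega^r(\Lambda_p, \lambda) \otimes \Omega^q(\lambda,\ \Lambda_{p+q} + \Lambda_r) \\ & \qquad\qquad (0 \leq p \leq N-1,\ 0 \leq q \leq N-p,\ 0 \leq r \leq q), \end{aligned} \tag{12.7}$$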
where $[n]! = [n] \cdots [2][1]$ and $[0]! = 1$.

Proof. The relation (12.6) follows from (12.1), (12.3) and
$$c_{q-s,1}\bigl(\omega(\Lambda_{p+s} \mid 1, \dots, q-s) \otimes \omega(\Lambda_{p+s} + \Lambda_{q-s} \mid q-s+1)\bigr) = (-\zeta t)^{s-q}\; \omega(\Lambda_{p+s} \mid 1) \otimes \omega(\Lambda_{p+s} + \hat{1} \mid 2, \dots, q-s+1). \tag{12.8}$$
The relation (12.7) is easily proved by induction on $r$, using (12.2) and (12.6). □

Using (12.7), (12.2) and
$$c_{q,r-s}\bigl(\omega(0 \mid 1, \dots, q) \otimes \omega(\Lambda_q \mid q+1, \dots, q+r-s)\bigr) = (-\zeta t)^{qs-qr}\; \omega(0 \mid 1, \dots, r-s) \otimes \omega(\Lambda_{r-s} \mid r-s+1, \dots, q+r-s), \tag{12.9}$$
we obtain the following.

Lemma 12.3. For each $0 \leq q, r < N$ and $\max\{0, q+r-N\} \leq s \leq \min\{q, r\}$, we have
$$c_{q,r}\bigl(\omega(0 \mid 1, \dots, q) \otimes \omega(\Lambda_q \mid q+1, \dots, q+r-s, 1, \dots, s)\bigr) = (-1)^s (-\zeta t)^{-qr}\, t^{s(q+r-s+1)}\, \frac{[q]!\, [r-s]!}{[r]!\, [q-s]!}\; \omega(0 \mid 1, \dots, r) \otimes \omega(\Lambda_r \mid r+1, \dots, q+r-s, 1, \dots, s). \tag{12.10}$$

As an immediate consequence of the lemma above, we see that $c_{r,q} \circ c_{q,r}$ acts on $(\Omega^q \bar\otimes \Omega^r)(0, \Lambda_{q+r-s} + \Lambda_s)$ as the scalar $(\zeta t)^{-2qr}\, t^{2s(q+r-s+1)}$. Thus we complete the proof of Theorem 11.3.
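The last step can be spelled out as follows. Write $\gamma_{q,r}(s)$ for the coefficient on the right-hand side of (12.10); this symbol is introduced here only for the purpose of the remark and does not appear elsewhere. Applying (12.10) a second time with the roles of $q$ and $r$ exchanged, the factorial ratios cancel and the signs square to one:

```latex
\gamma_{q,r}(s) = (-1)^{s}\,(-\zeta t)^{-qr}\,t^{s(q+r-s+1)}\,
                  \frac{[q]!\,[r-s]!}{[r]!\,[q-s]!},
\qquad
\gamma_{r,q}(s)\,\gamma_{q,r}(s)
  = (-1)^{2s}\,(-\zeta t)^{-2qr}\,t^{2s(q+r-s+1)}\,
    \frac{[r]!\,[q-s]!}{[q]!\,[r-s]!}\cdot\frac{[q]!\,[r-s]!}{[r]!\,[q-s]!}
  = (\zeta t)^{-2qr}\,t^{2s(q+r-s+1)} .
```

The exponent $s(q+r-s+1)$ is symmetric in $q$ and $r$, which is why the two powers of $t$ simply add.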
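13. ABF Models and $SU(2)_L$-SOS Algebras

In this section, we give an explicit description of the representation theory of $S(A_1; t)_\epsilon$. We identify $G = G_{2,L}$ with the Dynkin diagram of type $A_{L+1}$:
$$\overset{0}{\circ} \rightleftarrows \overset{1}{\circ} \rightleftarrows \cdots \rightleftarrows \overset{L-1}{\circ} \rightleftarrows \overset{L}{\circ}. \tag{13.1}$$
Also, we identify $V$ and $G^k_{ij}$ with $\{0, 1, \cdots, L\}$ and
$$\{ (i_0, i_1, \cdots, i_k) \mid 0 \leq i_0, \cdots, i_k \leq L,\ |i_\nu - i_{\nu-1}| = 1\ (1 \leq \nu \leq k) \} \tag{13.2}$$
respectively. We define the set $B$ by
$$B = \left\{ \binom{k}{ij} \;\middle|\; i, j, k \in V,\ |i-j| \leq k \leq i+j,\ i+j+k \in 2\mathbb{Z},\ i+j+k \leq 2L \right\}. \tag{13.3}$$
Then, we have
$$N^k_{ij} = \begin{cases} 1 & \left( \binom{k}{ij} \in B \right) \\ 0 & (\text{otherwise}). \end{cases} \tag{13.4}$$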
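The admissibility set (13.3) and the fusion rule (13.4) are easy to tabulate. The following sketch builds $N^k_{ij}$ for a given level $L$ and checks that the resulting product on the span of $[L_0], \dots, [L_L]$ is associative, as a fusion rule must be. The function names `in_B`, `fusion` and `is_associative` are hypothetical and chosen only for illustration.

```python
from itertools import product

def in_B(i, j, k, L):
    """Admissibility condition (13.3) for the triple (k; i, j)."""
    return (abs(i - j) <= k <= i + j
            and (i + j + k) % 2 == 0
            and i + j + k <= 2 * L)

def fusion(L):
    """N[i][j][k] = 1 if (k; i, j) lies in B and 0 otherwise, as in (13.4)."""
    V = range(L + 1)
    return [[[1 if in_B(i, j, k, L) else 0 for k in V] for j in V] for i in V]

def is_associative(N, L):
    """Compare ([i][j])[k] with [i]([j][k]) coefficient by coefficient."""
    V = range(L + 1)
    return all(
        sum(N[i][j][m] * N[m][k][l] for m in V)
        == sum(N[j][k][m] * N[i][m][l] for m in V)
        for i, j, k, l in product(V, repeat=4)
    )

L = 3
N = fusion(L)
channels = [k for k in range(L + 1) if N[2][2][k]]
print(channels)               # fusion channels of [L_2][L_2] at level 3
print(is_associative(N, L))
```

At level $L = 3$ the truncation $i + j + k \leq 2L$ removes the channel $k = 4$ from $[L_2][L_2]$, so only $k = 0, 2$ remain.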
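In order to simplify the formula for quantum invariants stated in the introduction, we use the rational basis of type $\Sigma$ instead of type $\Omega$ (cf. Remark 6.1). The corresponding Boltzmann weight $w = w^{\Sigma}_{2,t,\epsilon}$ is given by
$$w\!\begin{bmatrix} i & i \pm 1 \\ i \pm 1 & i \end{bmatrix} = -\zeta^{-1}\, \frac{\pm t^{\mp(i+1)}}{[i+1]}, \qquad w\!\begin{bmatrix} i & i \pm 1 \\ i \mp 1 & i \end{bmatrix} = \zeta^{-1}\, \frac{[i+1 \pm 1]}{[i+1]}, \tag{13.5}$$
$$w\!\begin{bmatrix} i & i \pm 1 \\ i \pm 1 & i \pm 2 \end{bmatrix} = \zeta^{-1} t, \qquad w\,[\text{otherwise}] = 0. \tag{13.6}$$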
Next, we recall a realization of $L_k$ introduced in [14]. Let $\Sigma$ be an algebra generated by the symbols $\sigma(p)$ $(p \in G^k,\ k \geq 0)$ with defining relations:
$$\sum_{k \in V} \sigma(k) = 1, \tag{13.7}$$
$$\sigma(p)\sigma(q) = \delta_{r(p)s(q)}\, \sigma(p \cdot q), \tag{13.8}$$
$$\sigma(i, i+1, i) = \epsilon\, \sigma(i, i-1, i) \quad (0 < i < L), \tag{13.9}$$
$$\sigma(0, 1, 0) = \sigma(L, L-1, L) = 0. \tag{13.10}$$
We define the grading $\Sigma = \bigoplus_{k \geq 0} \Sigma^k$ via $\Sigma^k = \mathrm{span}\{\sigma(p) \mid p \in G^k\}$. Then each component $\Sigma^k$ becomes a right $S(A_1; t)_\epsilon$-comodule via
$$\sigma(q) \mapsto \sum_{p \in G^k} \sigma(p) \otimes e\binom{p}{q} \quad (q \in G^k). \tag{13.11}$$
For each $\binom{k}{ij} \in B$, the element $\sigma_k(i,j) := \epsilon^{L(q)} \sigma(q)$ does not depend on the choice of $q \in G^k_{ij}$, where $L$ is as in Sect. 7, that is,
$$L(i, i-1, \cdots, (i+j-k)/2, \cdots, j-1, j) = 0, \tag{13.12}$$
$$L((\cdots, n, n+1, n, \cdots)) = L((\cdots, n, n-1, n, \cdots)) + 1. \tag{13.13}$$
It is easy to see that $\Sigma^k(i,j) = \mathbb{C}\sigma_k(i,j)$ for each $\binom{k}{ij} \in B$ and that $\{\sigma_k(i,j) \mid \binom{k}{ij} \in B\}$ is a linear basis of $\Sigma$. Since $\dim \Sigma^k(0,l) = \dim \Sigma^k(l,0) = \delta_{kl}$, we have $\Sigma^k \cong L_k \cong (\Sigma^k)^\vee$ by Theorem 7.5 and (3.7). More explicitly, we have the following.

Proposition 13.1. The map $\Sigma^k \to (\Sigma^k)^\vee$; $\sigma_k(i,j) \mapsto c\binom{k}{ij}\, \sigma_k^\vee(i,j)$ gives an identification of $S(A_1; t)_\epsilon$-comodules, where $\{\sigma_k^\vee(j,i)\}$ denotes the dual basis of $\{\sigma_k(i,j)\}$ and the constant $c\binom{k}{ij}$ is given by
$$c\binom{k}{ij} = (-)^{(i-j)/2}\, \frac{[(i+j+k)/2+1]!\; [(i-j+k)/2]!\; [(-i+j+k)/2]!}{[i+1]\; [(i+j-k)/2]!}. \tag{13.14}$$
Under this identification, the maps $d_{\Sigma^k}$ and $b_{\Sigma^k}$ in (3.8)-(3.9) are given by
$$d_{\Sigma^k}(i) = \sum_j c\binom{k}{ji}^{-1} \sigma_k(i,j) \otimes \sigma_k(j,i), \tag{13.15}$$
$$b_{\Sigma^k}\bigl(\sigma_k(i,j) \otimes \sigma_k(j,l)\bigr) = \delta_{il}\; c\binom{k}{ij}\; i \tag{13.16}$$
respectively, where the summation in (13.15) is taken over all $j \in V$ such that $\binom{k}{ji} \in B$.
Proof. It suffices to show the first assertion. Since $\Sigma^k(i,j) = \mathbb{C}\sigma_k(i,j)$ and $(\Sigma^k)^\vee(i,j) = \mathbb{C}\sigma_k^\vee(i,j)$, there exists an isomorphism $\Sigma^k \cong (\Sigma^k)^\vee$ of the form $\sigma_k(i,j) \mapsto c\binom{k}{ij}\, \sigma_k^\vee(i,j)$ for some nonzero constant $c\binom{k}{ij}$ $\bigl(\binom{k}{ij} \in B\bigr)$. To compute $c\binom{k}{ij}$, we consider the $S(A_1; t)_\epsilon$-right module structure on $\Sigma$ given by (8.1). Similarly to Lemma 8.2, we obtain
$$\sigma_k(i,j)\; e\binom{i,\ i \pm 1}{j,\ j \pm 1} = \zeta^{-k}\, \epsilon^{(\pm i \mp j + k)/2}\, t^{(\mp i \pm j + k)/2}\, \frac{[\frac{i+j \mp k}{2} + 1]}{[i+1]}\; \sigma_k(i \pm 1, j \pm 1). \tag{13.17}$$
On the other hand, by (10.1), we obtain
$$S\left(e\binom{i,\ i \pm 1}{j,\ j \pm 1}\right) = \frac{[j+1]}{[i+1]}\; e\binom{j \pm 1,\ j}{i \pm 1,\ i}, \qquad S\left(e\binom{i,\ i \pm 1}{j,\ j \mp 1}\right) = -\frac{[j+1]}{[i+1]}\; e\binom{j \mp 1,\ j}{i \pm 1,\ i}. \tag{13.18}$$
Using this together with (8.3) and (13.17)$_-$, we obtain
$$\sigma_k^\vee(i,j)\; e\binom{i,\ i+1}{j,\ j+1} = \zeta^{-k}\, \epsilon^{(i-j+k)/2}\, t^{(-i+j+k)/2}\, \frac{[\frac{i+j+k}{2} + 2]}{[i+2]}\; \sigma_k^\vee(i+1, j+1). \tag{13.19}$$
By (13.17)$_+$ and (13.19), we obtain
$$\frac{c\binom{k}{i+1,\, j+1}}{c\binom{k}{i,\, j}} = \frac{[i+1]\; [(i+j+k)/2 + 2]}{[i+2]\; [(i+j-k)/2 + 1]}. \tag{13.20}$$
Similarly, by computing $\sigma_k(i,j)\, e\binom{i,\ i \pm 1}{j,\ j \mp 1}$, we obtain
$$\frac{c\binom{k}{i \pm 1,\, j \mp 1}}{c\binom{k}{i,\, j}} = -\frac{[i+1]\; [(\pm i \mp j + k)/2 + 1]}{[i+1 \pm 1]\; [(\mp i \pm j + k)/2]}. \tag{13.21}$$
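That the closed form (13.14) is compatible with the recursions (13.20) and (13.21) can be checked numerically. The sketch below makes two assumptions that are not taken from the text: the convention $[n] = (t^n - t^{-n})/(t - t^{-1})$, and a generic real $t$ so that no $q$-integer vanishes. It compares only the sign-free part of (13.14) with the absolute value of the right-hand sides; the sign in (13.21) is accounted for by the prefactor $(-)^{(i-j)/2}$, since $(i-j)/2$ changes by $\pm 1$. All function names are illustrative.

```python
from math import prod, isclose

def qint(n, t):
    """q-integer [n] = (t^n - t^-n) / (t - t^-1) (assumed convention)."""
    return (t ** n - t ** (-n)) / (t - 1 / t)

def qfact(n, t):
    """[n]! = [n] ... [2][1], with [0]! = 1."""
    return prod(qint(m, t) for m in range(1, n + 1))

def c_abs(k, i, j, t):
    """Right-hand side of (13.14) without the sign prefactor."""
    return (qfact((i + j + k) // 2 + 1, t)
            * qfact((i - j + k) // 2, t)
            * qfact((-i + j + k) // 2, t)
            / (qint(i + 1, t) * qfact((i + j - k) // 2, t)))

def admissible(k, i, j):
    """Triangle and parity conditions of (13.3); the level bound is dropped for generic t."""
    return abs(i - j) <= k <= i + j and (i + j + k) % 2 == 0

def check_recursions(t=1.3, top=8):
    for i in range(top):
        for j in range(top):
            for k in range(top):
                if not admissible(k, i, j):
                    continue
                base = c_abs(k, i, j, t)
                # (13.20): step (i, j) -> (i+1, j+1)
                rhs = (qint(i + 1, t) * qint((i + j + k) // 2 + 2, t)
                       / (qint(i + 2, t) * qint((i + j - k) // 2 + 1, t)))
                if not isclose(c_abs(k, i + 1, j + 1, t) / base, rhs, rel_tol=1e-9):
                    return False
                # (13.21), upper sign: step (i, j) -> (i+1, j-1), absolute values
                if j >= 1 and admissible(k, i + 1, j - 1):
                    rhs = (qint(i + 1, t) * qint((i - j + k) // 2 + 1, t)
                           / (qint(i + 2, t) * qint((-i + j + k) // 2, t)))
                    if not isclose(c_abs(k, i + 1, j - 1, t) / base, rhs, rel_tol=1e-9):
                        return False
                # (13.21), lower sign: step (i, j) -> (i-1, j+1), absolute values
                if i >= 1 and admissible(k, i - 1, j + 1):
                    rhs = (qint(i + 1, t) * qint((-i + j + k) // 2 + 1, t)
                           / (qint(i, t) * qint((i - j + k) // 2, t)))
                    if not isclose(c_abs(k, i - 1, j + 1, t) / base, rhs, rel_tol=1e-9):
                        return False
    return True

print(check_recursions())
```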
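By solving these recursion relations under some initial condition, we get (13.14). □

Proposition 13.2. (1) The braiding $c : \Sigma^m \bar\otimes \Sigma^n \xrightarrow{\ \sim\ } \Sigma^n \bar\otimes \Sigma^m$ and its inverse are given by
$$c^{\pm 1}\bigl(\sigma_m(h,i) \otimes \sigma_n(i,k)\bigr) = \sum_j w^{\pm}_{mn}\!\begin{bmatrix} h & i \\ j & k \end{bmatrix} \sigma_n(h,j) \otimes \sigma_m(j,k), \tag{13.22}$$
$$w^{\pm}_{mn}\!\begin{bmatrix} h & i \\ j & k \end{bmatrix} := \sum_{u \in G^n_{hj}} \sum_{v \in G^m_{jk}} \epsilon^{L(u)+L(v)+L(q)+L(r)}\; w^{\pm}\!\left[\, u\ \begin{smallmatrix} q \\ v \end{smallmatrix}\ r \,\right], \tag{13.23}$$
respectively, where $q$ and $r$ denote arbitrary elements of $G^m_{hi}$ and $G^n_{ik}$ respectively, and the summation in (13.22) is taken over all $j \in V$ such that $\binom{n}{hj}, \binom{m}{jk} \in B$.

(2) The ribbon functional of $S(A_1, t)^\iota_{\epsilon,\zeta}$ acts on $L_i$ as the scalar $\theta_i^{-1}$ given by
$$\theta_i = \iota^i\, \zeta^{i(i+2)}. \tag{13.24}$$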
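As a consistency check in the case $N = 2$, formula (13.24) can be combined with (11.7) and the fusion rule (13.4). The sketch below rests on several assumptions that are not taken from the text: $\epsilon = \iota = 1$, $t = \exp(\pi i/(L+2))$, $\zeta = \exp(\pi i/(2(L+2)))$ (so that $\zeta^2 = t$), $\dim_q(L_i) = [i+1]$ and $[n] = (t^n - t^{-n})/(t - t^{-1})$. Under these assumptions the right-hand side of (11.7) should reproduce the values $[(i+1)(j+1)]$ and satisfy Verlinde's formula (11.8). The function names are illustrative.

```python
import cmath

def qint(n, t):
    """q-integer [n] = (t^n - t^-n) / (t - t^-1) (assumed convention)."""
    return (t ** n - t ** (-n)) / (t - 1 / t)

def in_B(i, j, k, L):
    """Admissibility condition (13.3)."""
    return (abs(i - j) <= k <= i + j
            and (i + j + k) % 2 == 0
            and i + j + k <= 2 * L)

def s_matrix_from_twists(L):
    """Evaluate the right-hand side of (11.7) with theta_i as in (13.24), iota = 1."""
    t = cmath.exp(1j * cmath.pi / (L + 2))           # assumed choice of t
    zeta = cmath.exp(1j * cmath.pi / (2 * (L + 2)))  # assumed choice, zeta^2 = t
    V = range(L + 1)
    theta = [zeta ** (i * (i + 2)) for i in V]
    dim = [qint(i + 1, t) for i in V]
    S = [[sum(theta[k] / (theta[i] * theta[j]) * dim[k]
              for k in V if in_B(i, j, k, L))
          for j in V] for i in V]
    return S, t

def max_deviation(L):
    """Largest deviation from [(i+1)(j+1)] and from Verlinde's formula (11.8)."""
    S, t = s_matrix_from_twists(L)
    V = range(L + 1)
    closed_form = max(abs(S[i][j] - qint((i + 1) * (j + 1), t))
                      for i in V for j in V)
    verlinde = max(abs(sum(S[x][n] for x in V if in_B(l, m, x, L))
                       - S[l][n] * S[m][n] / S[n][0])
                   for l in V for m in V for n in V)
    return closed_form, verlinde

for L in range(1, 5):
    print(L, max_deviation(L))
```

Both deviations should vanish up to rounding for every level tried.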
Proof. Since the braiding is a natural transformation and the map $\mathbb{C}G^m \to \Sigma^m$; $p \mapsto \sigma(p)$ is a $S(A_1; t)_\epsilon$-comodule map, we have
$$c^{\pm 1}\bigl(\sigma_m(h,i) \otimes \sigma_n(i,k)\bigr) = \sum_j \sum_{u \in G^n_{hj}} \sum_{v \in G^m_{jk}} \epsilon^{L(q)+L(r)}\; w^{\pm}\!\left[\, u\ \begin{smallmatrix} q \\ v \end{smallmatrix}\ r \,\right] \sigma(u) \otimes \sigma(v). \tag{13.25}$$
Since $\sigma(p) = \epsilon^{L(p)} \sigma_m(i,j)$ for each $p \in G^m_{ij}$, this proves Part (1). When $i = 1$, (13.24) follows from the proof of Proposition 10.2. For $i > 1$, (13.24) follows from (11.7) by induction on $i$. □

14. The State Sum Invariants

Let $L$ be a positive integer and let $\epsilon$, $\iota$ be elements of $\{\pm 1\}$ such that $\epsilon = 1$ if $L$ is an odd integer. Let $t$ be a primitive $2(L+2)$th root of unity and $\zeta$ a solution of $\zeta^2 = \epsilon t$. Applying the general theory of TQFT to the MTC $C_S(A_1, t)^\iota_{\epsilon,\zeta}$, we obtain an invariant $\tau^\iota_{\epsilon,\zeta}$ of oriented 3-manifolds. To give an explicit description of $\tau^\iota_{\epsilon,\zeta}$, we prepare some terminologies on link diagrams.

Let $D$ be a generic link diagram in $\mathbb{R} \times (0,1)$ (viewed as a union of line segments $AB$), which presents a framed link $\mathcal{L}$ with components $K_1, \dots, K_p$ (see e.g. [26]). A point of $D$ is called extremal if the height function on $D$ attains its local maximum or local minimum in this point, where the height function ht is the restriction of the projection $\mathbb{R} \times (0,1) \to (0,1)$ on $D$. A point of $D$ is called singular if it is either an extremal point or a crossing point. We denote by $\sharp D$ the set of all singular points of $D$. Let $\mathcal{E}$ be the set of all connected components of $D \setminus \sharp D$. We say that $E \in \mathcal{E}$ belongs to $K_q$ $(1 \leq q \leq p)$ if $E$ is a subset of the image of $K_q$ via the projection $\mathcal{L} \to D$. Let $c$ be a map from $\{K_1, \dots, K_p\}$ to $V$. We say that a map $\lambda : \mathcal{E} \to B$; $E \mapsto \binom{\lambda_1(E)}{\lambda_2(E)\,\lambda_3(E)}$ is a state on $D$ of color $c$ if $\lambda_1(E) = c(K_q)$ for each component $K_q$ and $E \in \mathcal{E}$ belonging to $K_q$. We denote by $c_\lambda$ the color of a state $\lambda$, and by $S(D)$ the set of all states on $D$. Figure (A) shows a state $\lambda$ on a diagram of the Hopf link with 6 singular points, such that $c_\lambda(K_1) = 1$, $c_\lambda(K_2) = 2$.
[Fig. (A): a state $\lambda$ on a diagram of the Hopf link with components $K_1$ and $K_2$.]
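Next, we assign a complex number $\langle\lambda|A\rangle$ for each state $\lambda \in S(D)$ and singular point $A \in \sharp D$ as follows: When $(\lambda, A)$ is as in Fig. (B) or Fig. (C), then we set
$$\langle\lambda|A\rangle = c\binom{l}{hi}\, \delta_{hk}\delta_{ij} \quad\text{or}\quad \langle\lambda|A\rangle = c\binom{l}{ih}^{-1} \delta_{hk}\delta_{ij} \tag{14.1}$$
respectively, where $c\binom{l}{hi}$ is as in (13.14).

[Fig. (B), Fig. (C): an extremal point $A$ whose two adjacent edges carry the labels $\binom{l}{hi}$ and $\binom{l}{jk}$.]

When $(\lambda, A)$ is as in Fig. (D$\pm$), we set
$$\langle\lambda|A\rangle = \delta_{ij}\, \delta_{hc}\, \delta_{de}\, \delta_{fk}\; w^{\pm}_{mn}\!\begin{bmatrix} h & i \\ e & f \end{bmatrix}, \tag{14.2}$$
where $w^{\pm}_{mn}$ is as in (13.23).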
[Fig. (D+), Fig. (D−): a crossing point $A$; the adjacent edges carry labels with upper entries $m$, $n$ and lower entries $hi$, $jk$, $cd$, $ef$.]
The following result follows from Sect. 13 by a method quite similar to “vertex models on link invariants” (see e.g. [40] Appendix II), hence we omit the proof.

Theorem 14.1. Let $\tau^\iota_{\epsilon,\zeta}$ be the invariant of closed oriented 3-manifolds associated with the modular tensor category $C_S(A_1, t)^\iota_{\epsilon,\zeta}$ (cf. [40]). Let $M$ be a 3-manifold obtained by surgery on $S^3$ along a framed link $\mathcal{L}$ with $p$ components $K_1, \dots, K_p$. Let $D$ be a generic diagram in $\mathbb{R} \times (0,1)$ which presents $\mathcal{L}$. Then $\tau^\iota_{\epsilon,\zeta}$ is given by
$$\tau^\iota_{\epsilon,\zeta}(M) = \Delta^{-\sigma(\mathcal{L})-p-1}\, \mathcal{D}^{\sigma(\mathcal{L})} \sum_{\lambda \in S(D)} \prod_{q=1}^{p} \iota^{c_\lambda(K_q)}\, [c_\lambda(K_q) + 1] \prod_{A \in \sharp D} \langle\lambda|A\rangle. \tag{14.3}$$
Here $\Delta$ denotes a fixed square root of $\sum_{i \in V} [i+1]^2$, $\mathcal{D} = \sum_{i \in V} \iota^i \zeta^{-i(i+2)} [i+1]^2$ and $\sigma(\mathcal{L})$ denotes the signature of the linking matrix of $\mathcal{L}$.

References

1. Andersen, H.: Tensor products of quantized tilting modules. Commun. Math. Phys. 149, 149–159 (1992)
2. Bergman, G.: The diamond lemma for ring theory. Adv. Math. 29, 178–218 (1978)
3. Böhm, G. and Szlachányi, K.: A coassociative C∗-quantum group with non-integral dimensions. Lett. Math. Phys. 35, 437–456 (1996)
4. Chari, V. and Pressley, A.: A guide to quantum groups. Cambridge: Cambridge University Press, 1994
5. Doi, Y. and Takeuchi, M.: Multiplication alternation by two-cocycles – The quantum version. Commun. Alg. 22, 5715–5732 (1994)
6. Drinfeld, V.G.: On quasitriangular Quasi-Hopf algebras and a group closely connected with $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$. Leningrad Math. J. 2, 829–860 (1991)
7. Finkelberg, M.: An equivalence of fusion categories. Geom. Funct. Analysis 6, 249–267 (1996)
8. Gelfand, S. and Kazhdan, D.: Examples of tensor categories. Invent. Math. 109, 595–617 (1992)
9. Goodman, F. and Nakanishi, T.: Fusion algebras in integrable systems in two dimensions. Phys. Lett. B 262, 259–264 (1991)
10. Goodman, F. and Wenzl, H.: Littlewood-Richardson coefficients for Hecke algebras at roots of unity. Adv. Math. 82, 244–265 (1990)
11. Hayashi, T.: An algebra related to the fusion rules of Wess–Zumino–Witten models. Lett. Math. Phys. 22, 291–296 (1991)
12. Hayashi, T.: Quantum deformation of classical groups. Publ. RIMS, Kyoto Univ. 28, 57–81 (1992)
13. Hayashi, T.: Quantum groups and quantum determinants. J. Algebra 152, 146–165 (1992)
14. Hayashi, T.: Quantum group symmetry of partition functions of IRF models and its application to Jones’ index theory. Commun. Math. Phys. 157, 331–345 (1993)
15. Hayashi, T.: Face algebras and their Drinfeld doubles. In: Proceedings of Symposia in Pure Mathematics, Vol 56, Part 2, Providence, RI: American Mathematical Society, 1994
16. Hayashi, T.: Face algebras I – A generalization of quantum group theory. To appear in J. Math. Soc. Japan
17. Hayashi, T.: Compact quantum groups of face type. Publ. RIMS, Kyoto Univ. 32, 351–369 (1996)
18. Hayashi, T.: Galois quantum groups of II1-subfactors. Preprint
19. Hayashi, T.: Face algebras II – Standard generator theorems. In preparation
20. Hayashi, T.: Quantum groups and quantum semigroups. To appear in J. Algebra
21. Hayashi, T.: In preparation
22. Hayashi, T.: In preparation
23. Jimbo, M., Miwa, T. and Okado, M.: Solvable lattice models related to the vector representation of classical simple Lie algebras. Commun. Math. Phys. 116, 507–525 (1988)
24. Jurčo, B. and Schupp, P.: AKS scheme for face and Calogero–Moser–Sutherland type models. Preprint
25. Kac, V.: Infinite dimensional Lie algebras, 3rd ed. Cambridge: Cambridge Univ. Press, 1990
26. Kassel, C.: Quantum groups. New York: Springer-Verlag, 1995
27. Kazhdan, D. and Wenzl, H.: Reconstructing monoidal categories. Adv. in Soviet Math. 16, 111–136 (1993)
28. Kirillov, A., Jr.: On an inner product in modular tensor categories. J. of AMS 9, 1135–1169 (1996)
29. Koornwinder, T.: Compact quantum groups and q-special functions. Preprint
30. Larson, R. and Towber, J.: Two dual classes of bialgebras related to the concepts of “quantum group” and “quantum Lie algebra”. Commun. Alg. 19, 3295–3345 (1991)
31. Macdonald, I.: Symmetric functions and Hall polynomials, 2nd ed. Oxford: Oxford Univ. Press, 1995
32. Moore, G. and Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–254 (1989)
33. Ocneanu, A.: Quantized group, string algebras and Galois theory for algebras. In: Operator algebras and applications, Vol. 2. London Math. Soc. Lecture note series 136, 119–172 (1989)
34. Reshetikhin, N. and Turaev, V.: Invariants of 3-manifolds via link polynomials and quantum groups. Invent. Math. 103, 547–598 (1991)
35. Reshetikhin, N., Takhtadzhyan, L. and Faddeev, L.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193–225 (1990)
36. Schauenburg, P.: Face algebras are ×R-bialgebras. Preprint
37. Sweedler, M.: Hopf algebras. New York: Benjamin Inc., 1969
38. Takeuchi, M.: Matric bialgebras and quantum groups. Israel J. Math. 72, 232–251 (1990)
39. Tsuchiya, A. and Kanie, Y.: Vertex operators in conformal field theory on P1 and monodromy representations of braid groups. In Adv. Stud. Pure Math. Vol. 16.
40. Turaev, V.: Quantum invariants of knots and 3-manifolds. Berlin, New York: Walter de Gruyter, 1994
41. Turaev, V. and Wenzl, H.: Quantum invariants of 3-manifolds associated with classical simple Lie algebras. Int. J. of Modern Math. 4, 323–358 (1993)
42. Turaev, V. and Wenzl, H.: Semisimple and modular categories from link invariants. Math. Ann. 309, 411–461 (1997)
43. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B300, 360–376 (1988)
44. Walton, M.: Algorithm for WZW fusion rules: A proof. Phys. Lett. B 241, 365–368 (1990)
45. Wenzl, H.: Hecke algebras of type An and subfactors. Invent. Math. 92, 349–383 (1988)
46. Wenzl, H.: C∗ tensor categories from quantum groups. J. AMS 11, 261–282 (1998)
47. Witten, E.: Quantum field theory and the Jones polynomial. Comm. Math. Phys. 121, 351–399 (1989)
Communicated by T. Miwa
Commun. Math. Phys. 203, 249 – 267 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Donaldson Invariants for Non-Simply Connected Manifolds

Marcos Mariño, Gregory Moore

Department of Physics, Yale University, New Haven, CT 06520, USA. E-mail: [email protected], [email protected]

Received: 2 May 1998 / Accepted: 14 September 1998

Abstract: We study Coulomb branch (“u-plane”) integrals for $N = 2$ supersymmetric $SU(2)$, $SO(3)$ Yang–Mills theory on 4-manifolds $X$ of $b_1(X) > 0$, $b_2^+(X) = 1$. Using wall-crossing arguments we derive expressions for the Donaldson invariants for manifolds with $b_1(X) > 0$, $b_2^+(X) > 0$. Explicit expressions for $X = \mathbb{CP}^1 \times F_g$, where $F_g$ is a Riemann surface of genus $g$, are obtained using Kronecker’s double series identity. The result might be useful in future studies of quantum cohomology.

1. Introduction

The Donaldson invariants of four-manifolds have been a source of fascination both in mathematics and in physics. While there has been much progress in understanding these invariants, there is more to learn, particularly in terms of the relation of the invariants to Floer homology [1] and to Gromov–Witten invariants [2, 3]. Understanding the invariants in these contexts leads to the need to understand the Donaldson invariants for nonsimply connected four-manifolds $X$. Most investigations of Donaldson invariants have focussed on the case $\pi_1(X) = 0$. There exists a mathematical definition in the nonsimply connected case [4] but comparatively little is known about this case. This paper derives some new results on Donaldson invariants for 4-manifolds with first Betti number $b_1(X) > 0$. We do not consider the effects of torsion in $H_*(X; \mathbb{Z})$, nor the effects of a nonabelian fundamental group.

Recently, using Witten’s physical approach to Donaldson theory [5–8], a fairly systematic (physical) procedure has been developed for deriving various properties of the Donaldson invariants, including wall-crossing and blowup formulae, and the relation to Seiberg–Witten invariants [9–13]. The systematic procedure, which begins with certain integrals over the Coulomb branch of vacua of an $N = 2$ SYM theory, can be extended to higher rank gauge groups and to nonsimply connected manifolds. In [9, 10] partial results were obtained for nonsimply connected manifolds. In this paper a more complete treatment is given for the rank one groups $SU(2)$, $SO(3)$.
M. Mariño, G. Moore
Our main results are:

1. A wall-crossing formula for the Donaldson invariants in Eqs. (3.4) and (3.16) below.
2. An expression for the Donaldson invariants in terms of SW invariants. For manifolds of simple type this is given in Eq. (4.17) below. It is a natural generalization of Witten’s formula [7], obtained in the simply connected case.
3. Explicit expressions for the Donaldson invariants for $X = \mathbb{CP}^1 \times F_g$, where $F_g$ is a Riemann surface of genus $g$. We give answers valid in both the chambers $\mathrm{vol}(\mathbb{CP}^1) \to 0$ and $\mathrm{vol}(F_g) \to 0$. The expressions in Eqs. (5.18), (5.20) below might prove useful in future studies of the Gromov–Witten invariants of the moduli space of flat connections on $F_g$.

2. The u-Plane Integral for $b_1 > 0$

In this section we extend and elaborate on the results of Sect. 10 of [9]. We consider an arbitrary insertion of observables, using the proposal for the contact terms in [10].

Consider an $N = 2$ $SU(2)$ or $SO(3)$ supersymmetric Yang–Mills theory$^1$ on a compact oriented 4-manifold $X$ of $b_1(X) > 0$. As explained in [9] the Donaldson–Witten generating function can be written as $Z_{DW} = Z_u + Z_{SW}$, where $Z_u$ is the Coulomb branch integral and $Z_{SW}$ is the contribution of the Seiberg–Witten invariants. The Coulomb branch integral is only nonvanishing for $b_2^+(X) = 1$, but, by a procedure explained in [9, 12], can be taken as the starting point for a systematic derivation of $Z_{DW}$.

In the case of $b_1(X) > 0$ the Coulomb integral can be obtained by a simple generalization of the arguments in section three of [9]. First of all, the photon partition function includes [8] an integration over $b_1$ zero modes of the gauge field corresponding to flat connections. These zero modes span the tangent space to a torus of dimension $b_1$, $T^{b_1} = H^1(X, \mathbb{R})/H^1(X, \mathbb{Z})$. The zero modes of the one-forms $\psi$ live in this tangent space.
As a consequence of having these extra zero modes, the photon partition function is
$$
(\mathrm{Im}\,\tau)^{\frac12(b_1-1)}\,\theta_0(\tau,\bar\tau), \tag{2.1}
$$
where $\theta_0(\tau,\bar\tau)$ is the Siegel–Narain theta function introduced in [8, 14, 9]. The next ingredient comes from the measure for the $\psi$-fields. The expansion in zero modes reads $\psi = \sum_{i=1}^{b_1} c_i\beta_i$, where $\beta_i$, $i = 1, \dots, b_1$, is an integral basis of harmonic one-forms, and we identify $H_1(X) \simeq H^1(X, \mathbb{Z})$. The $c_i$ are Grassmann variables. The measure for the $\psi$ fields is then:
$$
\prod_{i=1}^{b_1}\frac{dc_i}{(\mathrm{Im}\,\tau)^{1/2}} = (\mathrm{Im}\,\tau)^{-\frac{b_1}{2}}\prod_{i=1}^{b_1} dc_i. \tag{2.2}
$$
We can consider the $c_i$ as a basis of one-forms $\beta_i^\sharp \in H^1(T^{b_1}, \mathbb{Z})$, dual to $\beta_i \in H^1(X, \mathbb{Z})$. In this way we can identify
$$
\psi = \sum_{i=1}^{b_1}\beta_i^\sharp\otimes\beta_i = c_1(L), \tag{2.3}
$$
where $L$ is the universal flat line bundle over $T^{b_1}\times X$.

1 For simplicity we do not include hypermultiplets.
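As a quick consistency check (a sketch, not part of the original derivation), the powers of $\mathrm{Im}\,\tau$ in (2.1) and (2.2) can be multiplied together. Writing $y \equiv \mathrm{Im}\,\tau$ (the notation used in (2.4)), they combine into the single factor $y^{-1/2}$ that appears in the integrand of (2.4), independently of $b_1$:

```latex
\[
\underbrace{y^{\frac12(b_1-1)}}_{\text{photon partition function, (2.1)}}
\;\times\;
\underbrace{y^{-\frac12 b_1}}_{\psi\text{ measure, (2.2)}}
\;=\; y^{-1/2},
\qquad y \equiv \mathrm{Im}\,\tau .
\]
```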
Donaldson Invariants for Non-Simply Connected Manifolds
Taking into account (2.1) and (2.2), we see that the Coulomb integral (without any insertion of observables) can be written as [9]:
$$
\begin{aligned}
Z_u = {}& 2\int [da\, d\bar a\, d\eta\, d\chi]\int_{\mathrm{Pic}(X)} d\psi\int dD\; A^\chi B^\sigma y^{-1/2}
\exp\Big[\frac{1}{8\pi}(\mathrm{Im}\,\tau)\, D\wedge *D\Big]\\
&\cdot\exp\Big[-i\pi\bar\tau\lambda_+^2 - i\pi\tau\lambda_-^2 + \pi i(\lambda, w_2(X))\Big]\\
&\cdot\exp\Big[-\frac{i\sqrt2}{16\pi}\int\frac{d\bar\tau}{d\bar a}\,\eta\chi\wedge(D_+ + 4\pi\lambda_+)
+ \frac{i\sqrt2}{2^7\pi}\int\frac{d\tau}{da}(\psi\wedge\psi)\wedge(4\pi\lambda_- + D_+)\\
&\qquad\quad + \frac{1}{3\cdot 2^{11}\pi i}\int\frac{d^2\tau}{da^2}\,\psi\wedge\psi\wedge\psi\wedge\psi\Big],
\end{aligned}
\tag{2.4}
$$
where $\int_{\mathrm{Pic}(X)}$ denotes a sum over line bundles and an integration over $T^{b_1}$. The integration over $\psi$ is understood as integration of differential forms on $T^{b_1}$. The orientation of the measure of the finite-dimensional integral (2.4) corresponds to a choice of Donaldson orientation of the moduli space of instantons [15].

Consider now the generating function for an arbitrary insertion of zero, one, two and three observables. We introduce the formal sums of cycles
$$
\gamma = \sum_{i=1}^{b_1}\zeta_i\delta_i,\qquad S = \sum_{i=1}^{b_2}\lambda_i S_i,\qquad \Sigma_3 = \sum_{i=1}^{b_3}\theta_i\Sigma_3^i, \tag{2.5}
$$
where $\delta_i$, $i = 1, \dots, b_1$, $S_i$, $i = 1, \dots, b_2$, and $\Sigma_3^i$, $i = 1, \dots, b_3 = b_1$, are a basis of one, two, and three cycles, respectively. The basis of one-cycles $\delta_i$ is dual to $\beta_i \in H^1(X, \mathbb{Z})$. The $\lambda_i$ are complex numbers, and $\zeta_i$, $\theta_i$ are Grassmann variables. The insertion of observables corresponding to these cycles is
$$
a_1\int_\gamma Ku + a_2\int_S K^2u + a_3\int_{\Sigma_3} K^3u, \tag{2.6}
$$
where $a_1$, $a_2$ and $a_3$ are constants that should be fixed by comparison to known mathematical results. The constant for the two-observable has been already fixed in [9], $a_2 = i/(\sqrt2\pi)$. $K$ is the canonical descent operator in the normalization of [9], and we have, explicitly:
$$
\begin{aligned}
Ku &= \frac{1}{4\sqrt2}\frac{du}{da}\,\psi,\\
K^2u &= \frac{1}{32}\frac{d^2u}{da^2}\,\psi\wedge\psi - \frac{\sqrt2}{4}\frac{du}{da}(F_+ + D),\\
K^3u &= \frac{1}{2^7\sqrt2}\frac{d^3u}{da^3}\,\psi\wedge\psi\wedge\psi - \frac{3}{16}\frac{d^2u}{da^2}\,\psi\wedge(F_+ + D) - \frac{3\sqrt2\, i}{8}\frac{du}{da}(2d\chi - *d\eta).
\end{aligned}
\tag{2.7}
$$
In addition, we have to take into account various contact terms associated to the intersecting cycles [9, 10]. These come from intersections of two, three and four cycles on
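the manifold $X$. For the intersection of two cycles we have
$$
S\cap S\in H_0(X,\mathbb{Z}),\quad \Sigma_3\cap\gamma\in H_0(X,\mathbb{Z}),\quad \Sigma_3\cap S\in H_1(X,\mathbb{Z}),\quad \Sigma_3\cap\Sigma_3\in H_2(X,\mathbb{Z}). \tag{2.8}
$$
For intersection of three cycles, we have the possibilities
$$
S\cap\Sigma_3\cap\Sigma_3\in H_0(X,\mathbb{Z}),\qquad \Sigma_3\cap\Sigma_3\cap\Sigma_3\in H_1(X,\mathbb{Z}), \tag{2.9}
$$
and for intersection of four cycles we only have the possibility
$$
\Sigma_3\cap\Sigma_3\cap\Sigma_3\cap\Sigma_3\in H_0(X,\mathbb{Z}). \tag{2.10}
$$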
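The homological degrees in (2.8)–(2.10) can be checked with the usual dimension count for intersections in a 4-manifold. The sketch below assumes the cycles are in general position, which the text does not state explicitly: $k$ cycles of dimensions $d_1,\dots,d_k$ then meet in dimension $\sum_i d_i - 4(k-1)$.

```latex
\begin{align*}
\dim(S\cap S) &= 2+2-4 = 0, &
\dim(\Sigma_3\cap\gamma) &= 3+1-4 = 0,\\
\dim(\Sigma_3\cap S) &= 3+2-4 = 1, &
\dim(\Sigma_3\cap\Sigma_3) &= 3+3-4 = 2,\\
\dim(S\cap\Sigma_3\cap\Sigma_3) &= 2+3+3-8 = 0, &
\dim(\Sigma_3\cap\Sigma_3\cap\Sigma_3) &= 9-8 = 1,\\
\dim(\Sigma_3\cap\Sigma_3\cap\Sigma_3\cap\Sigma_3) &= 12-12 = 0. &&
\end{align*}
```

Any other combination of the cycles $\gamma$, $S$, $\Sigma_3$ gives a negative count, which is why no further contact terms are listed.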
The contact term corresponding to $S\cap S$ was obtained in [9]. A proposal for the structure of the contact terms corresponding to more general intersecting cycles was made in [10]. According to this proposal, the contact term associated to the intersection of $p$ cycles is given by the appropriate descendant of the $p$th derivative of the Seiberg–Witten prepotential with respect to a deformation parameter $\tau_0$. Moreover [11, 12], this deformation parameter $\tau_0$ can be related to the dynamically generated scale of the theory $\Lambda$ (in the case of the asymptotically free theories) or to the microscopic gauge coupling, in the case of self-dual gauge theories. For $SU(2)$ $N = 2$ supersymmetric Yang–Mills theory, the relation between $\tau_0$ and $\Lambda$ is given by $\Lambda^4 = e^{i\pi\tau_0}$. This is related to the fact that $\Lambda$ can be identified with the first slow time of the Toda–Whitham hierarchy [16]. Following the proposal of [10] the contact terms for the various intersections in (2.8), (2.9) and (2.10) are of the form
$$
\begin{aligned}
& S^2T(u) + a_{13}\,T(u)(\Sigma_3\cap\gamma) + a_{32}\int_{\Sigma_3\cap S} KT(u) + a_{33}\int_{\Sigma_3\cap\Sigma_3} K^2T(u)\\
&+ a_{332}\,F^{(3)}_{\tau_0}(S\cap\Sigma_3\cap\Sigma_3) + a_{333}\int_{\Sigma_3\cap\Sigma_3\cap\Sigma_3} KF^{(3)}_{\tau_0} + a_{3333}\,F^{(4)}_{\tau_0}(\Sigma_3\cap\Sigma_3\cap\Sigma_3\cap\Sigma_3),
\end{aligned}
\tag{2.11}
$$
where the $a$’s are constants which will be determined below. In this equation, $F^{(p)}_{\tau_0}$ denotes the $p$th derivative of the prepotential with respect to $\tau_0$, and $T(u) = (4/\pi i)F^{(2)}_{\tau_0}$. The contact term for $S\cap S$ can be written as [10, 11]:
$$
T(u) = \frac14\Big[2u - a\frac{du}{da}\Big]. \tag{2.12}
$$
The constants $a_{ij}$, $a_{ijk}$ and $a_{ijkl}$ will be obtained in terms of $a_i$, $i = 1, 2, 3$, using single-valuedness of the integrand on the u-plane. Notice that one expects, on physical grounds, that $a_{ij}$ is proportional to $a_ia_j$, and so on.

We can already plug the observables (2.6) and the corresponding contact terms (2.11) into the generating function and write an explicit expression for the u-plane integral. It is important, however, to check that the resulting expression has good properties under duality transformations, that is, that the integrand is single-valued in the u-plane. This is not obvious due to the holomorphic “functions” that appear in (2.7) and (2.11). Following the strategy of [9], we will first of all integrate the auxiliary field $D$. Comparing to the expressions in the simply-connected case, we see that the new terms coupling to $D$ appear in the combination
$$
-\frac{i}{4\pi}\frac{du}{da}\Big[(4\pi\lambda_- + D)\wedge\tilde S\Big], \tag{2.13}
$$
where
$$
\tilde S = S - \frac{\sqrt2}{32}\frac{d\tau}{du}\,\psi\wedge\psi + \frac{3\pi}{4i}a_3\frac{d^2u}{da^2}\frac{da}{du}\,\psi\wedge\Sigma_3 - \sqrt2\pi i\,a_{33}\frac{dT}{du}\,\Sigma_3\wedge\Sigma_3, \tag{2.14}
$$
and we interpret $\Sigma_3$ as a harmonic one-form using Poincaré duality and the Hodge theorem. To guarantee that the resulting lattice sum over first Chern classes is well-behaved under duality transformations, the two-form $\tilde S$ should be invariant under duality. To achieve this, we redefine the $\psi$ field as
$$
\tilde\psi = \psi - \frac{12\pi}{\sqrt2\, i}a_3\frac{d^2u}{da^2}\frac{da}{d\tau}\,\Sigma_3. \tag{2.15}
$$
Then, if we choose
$$
a_{33} = -9\pi^2a_3^2, \tag{2.16}
$$
$\tilde S$ becomes
$$
\tilde S = S - \frac{\sqrt2}{32}\frac{d\tau}{du}\,\tilde\psi\wedge\tilde\psi + \frac{9\pi^3\sqrt2\, i}{4}a_3^2\,\Sigma_3\wedge\Sigma_3. \tag{2.17}
$$
We then see that, if $\tilde\psi$ is a modular form of weight $(1, 0)$, then $\tilde S$ is a modular form of weight zero. Notice that the redefinition of $\psi$ in (2.15) does not change the $\psi$ measure. To obtain the above expression for $\tilde S$, we have taken into account that
$$
4\pi i\frac{dT}{da} - \Big(\frac{d^2u}{da^2}\Big)^2\frac{da}{d\tau} = \pi i\frac{du}{da}. \tag{2.18}
$$
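The step from (2.14) to (2.17) can be made explicit. The following is a sketch under two assumptions that are not spelled out in the text: that $\tilde\psi\wedge\Sigma_3 = \Sigma_3\wedge\tilde\psi$ (both factors are one-forms with Grassmann-odd coefficients), and the shorthand $u' = du/da$, $u'' = d^2u/da^2$, $\tau' = d\tau/da$, $T' = dT/da$, so that $d\tau/du = \tau'/u'$ and $dT/du = T'/u'$.

```latex
% Redefinition (2.15): psi = \tilde\psi + k\,\Sigma_3 with
%   k = \frac{12\pi}{\sqrt2\,i}\, a_3\, \frac{u''}{\tau'} .
\begin{align*}
-\frac{\sqrt2}{32}\frac{\tau'}{u'}\,\psi\wedge\psi
 &= -\frac{\sqrt2}{32}\frac{\tau'}{u'}\,\tilde\psi\wedge\tilde\psi
    -\frac{3\pi}{4i}\,a_3\frac{u''}{u'}\,\tilde\psi\wedge\Sigma_3
    +\frac{9\sqrt2\,\pi^2}{4}\,a_3^2\,\frac{(u'')^2}{u'\tau'}\,\Sigma_3\wedge\Sigma_3,\\
\frac{3\pi}{4i}\,a_3\frac{u''}{u'}\,\psi\wedge\Sigma_3
 &= \frac{3\pi}{4i}\,a_3\frac{u''}{u'}\,\tilde\psi\wedge\Sigma_3
    -\frac{9\sqrt2\,\pi^2}{2}\,a_3^2\,\frac{(u'')^2}{u'\tau'}\,\Sigma_3\wedge\Sigma_3,\\
-\sqrt2\,\pi i\,a_{33}\frac{T'}{u'}\,\Sigma_3\wedge\Sigma_3
 &= 9\sqrt2\,\pi^3 i\,a_3^2\,\frac{T'}{u'}\,\Sigma_3\wedge\Sigma_3
 \qquad\text{using (2.16).}
\end{align*}
% The mixed terms \tilde\psi\wedge\Sigma_3 cancel. Collecting \Sigma_3\wedge\Sigma_3:
\[
\frac{9\sqrt2\,\pi^2}{4}\,\frac{a_3^2}{u'}
\Big[\,4\pi i\,T' - \frac{(u'')^2}{\tau'}\Big]
 \;=\; \frac{9\sqrt2\,\pi^2}{4}\,\frac{a_3^2}{u'}\cdot \pi i\,u'
 \;=\; \frac{9\pi^3\sqrt2\,i}{4}\,a_3^2 ,
\]
% where the bracket was evaluated with (2.18). This is the coefficient in (2.17).
```

The cancellation of the mixed term is what fixes the coefficient in (2.15), and the constancy of the last coefficient is what fixes $a_{33}$ in (2.16).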
Once this redefinition has been made, the u-plane integral involves a lattice sum identical to the one in the simply-connected case but with the substitution $S\to\tilde S$, and additional holomorphic insertions (coming from the observables and contact terms) that, once they are reexpressed in terms of $\tilde\psi$ and $\tilde S$, should be modular forms of weight zero. This is in fact the case for appropriate choices of the constants in (2.11). The computation is lengthy but straightforward. Duality invariance fixes all the constants in (2.11) in terms of $a_1$, $a_2$ and $a_3$. The final result is:
$$
\begin{aligned}
a_{32} &= -6\sqrt2\pi i\,a_3, & a_{13} &= -6\pi^2a_1a_3,\\
a_{332} &= -72\sqrt2\pi i\,a_3^2, & a_{333} &= 36\pi^2 i\,a_3^3, & a_{3333} &= -216\pi^3 i\,a_3^4.
\end{aligned}
\tag{2.19}
$$
The u-plane integral is therefore given by: √ Z Z dxdy 2 dτ ˜ 2 2ˆ ˜ u (S, ψ˜ ) µ(τ ) dψ exp 2pu + S T (u) + Zu = 1/2 0 y 32 du P ic(X) 0¯ (4)\H Z 3π i du ˜ a1 du a3 (S, ψ˜ ∧ 6 3 ) ψ˜ ∧ [γ ] − 3π 2 a1 a3 u(6 3 ∩ γ ) − + √ da 8 da 4 2 X √ 3 √ 7 dτ 2 4 2π i dτ 3 3 2π i 2 ˜ 3 a3 u(S, 6 ∧ 6 3 ) + a3 (ψ˜ ∧ 6 3 ) ψ˜ − u + 10 2 3·2 du 64 da √ 4 Z 9 2π du 9π 3 i 2 dτ 2 3 a u (ψ˜ , 6 ∧ 6 3 ) − a3 ψ˜ ∧ 6 3 ∧ 6 3 ∧ 6 3 + 64 3 du 256 3 da X 135π 6 4 ˜ + a3 u(6 3 ∩ 6 3 ∩ 6 3 ∩ 6 3 ) 9(S), 16 (2.19) where
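$$
\begin{aligned}
\mu(\tau) &= -\frac{\sqrt2}{2}\frac{da}{d\tau}A^\chi B^\sigma,\\
\Psi(\tilde S) &= \exp(2i\pi\lambda_0^2)\exp\Big[-\frac{1}{8\pi y}\Big(\frac{du}{da}\Big)^2\tilde S_-^2\Big]
\sum_{\lambda\in H^2+\frac12w_2(E)}\exp\Big[-i\pi\bar\tau(\lambda_+)^2 - i\pi\tau(\lambda_-)^2 + \pi i(\lambda-\lambda_0, w_2(X))\Big]\\
&\qquad\cdot\exp\Big[-i\frac{du}{da}(\tilde S_-,\lambda_-)\Big]\Big[(\lambda_+,\omega) + \frac{i}{4\pi y}\frac{du}{da}(\tilde S_+,\omega)\Big],\\
\hat T(u) &= T(u) + \frac{1}{8\pi\,\mathrm{Im}\,\tau}\Big(\frac{du}{da}\Big)^2.
\end{aligned}
\tag{2.20}
$$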
Notice that, in the above exponential, all the terms are modular forms of weight zero if $\tilde\psi$ is a modular form of weight $(1, 0)$. There is another check one can do of the above functions: one can formally assign an R-charge to the cycles in such a way that the observables have R-charge zero, namely: $R(\gamma) = -3$, $R(S) = -2$, $R(\Sigma_3) = -1$. Taking into account that $R(\psi) = 1$, $R(a) = 2$, we see that the definitions of $\tilde\psi$, $\tilde S$ are consistent with this R-charge assignment and that, moreover, all the terms in the exponential of (2.20) have zero R-charge. This is in fact an example of Seiberg’s trick since the insertions of (2.6), (2.11) may be regarded as operators in some UV theory (e.g. in a brane configuration).

The remaining constants $a_1$, $a_3$ will be fixed below by comparing to certain topological results [17]. In principle, the coefficients in (2.20) are fixed by the proposal of Eq. (2.14) of the first paper in [10]. Although we find the proposal of Losev, Nekrasov, and Shatashvili natural, because of the many conventions involved we have not checked that the coefficients derived in (2.20) are consistent with their Eq. (2.14).

3. Donaldson Wall-Crossing

In this section we want to compute the wall-crossing formulae associated to the region at infinity of the u-plane. To compare to mathematical results, it is better to write the expression in terms of $\psi$, $S$. There are two standard mathematical facts for manifolds of $b_2^+ = 1$ that will be very useful in writing the resulting wall-crossing formula [17]. First, for any $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ in $H^1(X, \mathbb{Z})$, one has $\beta_1\wedge\beta_2\wedge\beta_3\wedge\beta_4 = 0$. Second, the image of the map
$$
\wedge : H^1(X,\mathbb{Z})\otimes H^1(X,\mathbb{Z})\longrightarrow H^2(X,\mathbb{Z}) \tag{3.1}
$$
is generated by a single rational cohomology class $\Sigma$ (not to be confused with the three-cycle $\Sigma_3$). We introduce now the antisymmetric matrix $a_{ij}$ associated to the basis $\beta_i$ of $H^1(X, \mathbb{Z})$, $i = 1, \dots, b_1$, as $\beta_i\wedge\beta_j = a_{ij}\Sigma$. Finally, we introduce the two-form on $T^{b_1}$ as
$$
\Omega = \sum_{i<j}a_{ij}\,\beta_i^\sharp\wedge\beta_j^\sharp, \tag{3.2}
$$
which does not depend on the choice of basis. This is a volume element for the torus, hence Z b1 /2 b1 . (3.3) vol(T ) = Tb1 (b1 /2)! (For the manifolds under discussion b1 is even.) As a simple example, if X = CP 1 × Fg , where Fg is a Riemann surface of genus g, then 6 = [CP 1 ] (the Poincaré dual to the two-homology class of CP 1 ) and vol(Tb1 ) = 1. With all these ingredients, we can already write the wall-crossing formulae. Notice that, because of the first fact, many terms on the u-plane integral (2.21) vanish. We will write the formula first for an insertion of two and four observables. In this case, a straightforward generalization of the arguments in [9] gives: W C(λ) = −32i(−1)(λ−λ0 ,w2 (X)) e2π iλ0 (8i)−b1 /2 b du 2 σ/8 dτ 1− 21 ( 2i da 2 2 π dτ ) · q −λ /2 (u − 1) dτ du u2 − 1 · exp 2pu + S 2 T (u) − i(λ, S)/ h √ 2 Z dτ i 2 d u S 2 + λ, ψ dψ exp , · 32 da 2 2π da Tb1 q0 2
(3.4)
where we use the value obtained in [9] for the universal constants $\alpha$, $\beta$, and
$$
h(\tau) = \frac{da}{du} = \frac12\vartheta_2\vartheta_3. \tag{3.5}
$$
We can actually compute the integral over $T^{b_1}$ and give an expression in terms of modular forms which generalizes the expressions given in [9, 18, 19]. Using the explicit expressions given in [9] for the Seiberg–Witten solution in terms of modular forms, we find that
$$
\frac{d^2u}{da^2} = 4f_1(q),\qquad \frac{d\tau}{da} = \frac{16i}{\pi}f_2(q), \tag{3.6}
$$
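where
$$
\begin{aligned}
f_1(q) &= \frac{2E_2 + \vartheta_2^4 + \vartheta_3^4}{3\vartheta_4^8} = 1 + 24q^{1/2} + \cdots,\\
f_2(q) &= \frac{\vartheta_2\vartheta_3}{2\vartheta_4^8} = q^{1/8} + 18q^{5/8} + \cdots.
\end{aligned}
\tag{3.7}
$$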
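The leading coefficients in (3.7) can be reproduced from the first terms of the standard $q$-expansions. The sketch below assumes the conventions $q = e^{2\pi i\tau}$, $\vartheta_2 = 2q^{1/8}(1+q+\cdots)$, $\vartheta_3 = 1+2q^{1/2}+\cdots$, $\vartheta_4 = 1-2q^{1/2}+\cdots$, $E_2 = 1-24q-\cdots$; these are not restated in this section and are taken here as an assumption about the normalization of [9].

```latex
\begin{align*}
\vartheta_4^{-8} &= 1 + 16\,q^{1/2} + O(q),\qquad
\vartheta_2^4 = 16\,q^{1/2} + O(q^{3/2}),\qquad
\vartheta_3^4 = 1 + 8\,q^{1/2} + O(q),\\[2pt]
f_1(q) &= \tfrac13\big(3 + 24\,q^{1/2}\big)\big(1 + 16\,q^{1/2}\big) + O(q)
        = 1 + 24\,q^{1/2} + O(q),\\[2pt]
f_2(q) &= \tfrac12\cdot 2q^{1/8}\big(1 + 2\,q^{1/2}\big)\big(1 + 16\,q^{1/2}\big) + O(q^{9/8})
        = q^{1/8} + 18\,q^{5/8} + O(q^{9/8}).
\end{align*}
```

The same expansions give $h = \tfrac12\vartheta_2\vartheta_3 = q^{1/8}+\cdots$, so that $du/da = 1/h$ grows like $q^{-1/8}$ in the region at infinity of the u-plane.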
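Using (2.3) and (3.2) we find
$$
\psi^2 = 2(\Sigma\otimes\Omega), \tag{3.8}
$$
hence we can perform the integral over $T^{b_1}$ to write a very explicit expression for (3.4):
$$
\begin{aligned}
WC(\lambda) = {}& -\frac{i}{2}(-1)^{(\lambda-\lambda_0,w_2(X))}e^{2\pi i\lambda_0^2}\,2^{-b_1/4}\,\mathrm{vol}(T^{b_1})\\
&\cdot\Big[q^{-\lambda^2/2}h(\tau)^{b_1-2}\vartheta_4^\sigma\exp\big\{2pu + S^2T(u) - i(\lambda,S)/h\big\}\\
&\qquad\cdot\sum_{b=0}^{b_1/2}\binom{b_1/2}{b}\frac{1}{(8i)^b}f_1(q)^bf_2(q)^{b_1/2-b-1}(S,\Sigma)^b(\lambda,\Sigma)^{b_1/2-b}\Big]_{q^0}.
\end{aligned}
\tag{3.9}
$$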
Notice that this result confirms the conjecture on p. 18 of [17]. As a check of this expression, and also to fix an overall coefficient depending on b1 , we will compute the wall-crossing for the correlator p r S d−2r on a ruled surface, where d is half the dimension of the moduli space, and compare it to the expressions in [17]. Introduce the integer cohomology class ζ = 2λ. In [17], Muñoz computes the wall-crossing for walls satisfying ζ 2 = p1 and ζ 2 = p1 + 4, where p1 is the Pontryagin number of the instanton bundle. For ζ 2 = p1 , it is easy to see that only the first coefficients in the expansion in q contribute to (3.9), and there is no contribution from the contact term T (u). To compare the formulae, notice that we have to multiply our wall-crossing expression by r!(d −2r)!, as we are considering the Donaldson–Witten generating function. We finally obtain, δζ,λ0 (pr S d−2r ) =
1 (−1)(λ−λ0 ,w2 (X)) (−i)b1 /2 2−3b1 /4−b−d (−1)r+d pr vol(Tb1 ) 2 bX 1 /2 d − 2r (b1 /2)! (S, ζ )d−2r−b (S, 6)b (ζ, 6)b1 /2−b . (b1 /2 − b)! b b=0
(3.10)
If we compare with [17], we find perfect agreement except for an overall factor $1/2$ (a standard discrepancy between topological and quantum field theory normalizations), and a factor $(-i)^{b_1/2}2^{-3b_1/4}$. The latter factor is due to the normalization of the fermions in the physical theory. In order to make the identification in (2.3) and to use the normalization of topologists, we have to correct the $\psi$ measure with an overall factor $i^{b_1/2}2^{9b_1/4}$. As we will see in the next section, with this normalization the above formula gives the right answer for the generalized Seiberg–Witten wall-crossing formula of [20, 21]. The case $\zeta^2 = p_1 + 4$ also agrees with [17]. In this case the computation is more involved, as one has to take into account the first two coefficients in the $q$-expansion of the various functions in (3.9).
We consider now an arbitrary insertion of observables associated to one, two, and three cycles. We have new contact terms as well as new terms in the integration over the torus. To write these in a convenient way, notice that we can use Poincaré duality and the isomorphism $H_1(X, \mathbb{Z})\simeq H^1(T^{b_1}, \mathbb{Z})$ to obtain a one-form $\Sigma_3^\sharp$ in $H^1(T^{b_1}, \mathbb{Z})$. Define
$$
\iota_{\beta_k^\sharp}\Omega = \sum_p a_{kp}\,\beta_p^\sharp. \tag{3.11}
$$
Using (2.3), we find
$$
\psi\wedge\Sigma_3 = (\iota_{\Sigma_3^\sharp}\Omega)\,\Sigma. \tag{3.12}
$$
In the same way, using the isomorphism $H_1(X, \mathbb{Z})\simeq H^1(X, \mathbb{Z})$ given by $\delta_i\to\beta_i$, we can define $\gamma^\sharp = \sum_{i=1}^{b_1}\zeta_i\beta_i^\sharp$, where the $\zeta_i$ were defined in (2.5). We then obtain, using (2.3) again,
$$
\int_\gamma\psi = \gamma^\sharp. \tag{3.13}
$$
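The functions appearing in the u-plane integral can be written in terms of Jacobi theta functions and Eisenstein series as follows,
$$
\frac{dT}{da} = -8f_3(q),\qquad F^{(3)}_{\tau_0} = -\frac{\pi^2}{4}f_4(q), \tag{3.14}
$$
where
$$
\begin{aligned}
f_3(q) &= \frac{1}{16\vartheta_2\vartheta_3}\Big[\frac{1}{9\vartheta_4^8}(2E_2 + \vartheta_2^4 + \vartheta_3^4)^2 - 1\Big] = q^{3/8} + 12q^{7/8} + \cdots,\\
f_4(q) &= \frac{1}{8(\vartheta_2\vartheta_3)^2}\Big[\frac12(\vartheta_2^4 + \vartheta_3^4) - E_2 + \frac{1}{54\vartheta_4^8}\cdot(2E_2 + \vartheta_2^4 + \vartheta_3^4)^3\Big] = q^{1/4} + 8q^{3/4} + \cdots.
\end{aligned}
\tag{3.15}
$$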
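The wall-crossing formula now reads,
$$
\begin{aligned}
WC(\lambda) = {}& -\frac{i}{2}(-1)^{(\lambda-\lambda_0,w_2(X))}e^{2\pi i\lambda_0^2}\,2^{7b_1/4}(-i\pi)^{b_1/2}\\
&\cdot\Big[q^{-\lambda^2/2}h(\tau)^{b_1-2}f_2(q)^{-1}\vartheta_4^\sigma\exp\Big\{2pu + \big(S^2 - 6\pi^2a_1a_3(\Sigma_3\cap\gamma)\big)T(u) - i(\lambda,S)/h\\
&\qquad\quad - 72\sqrt2\pi^3a_3^2f_3(q)(\Sigma_3\wedge\Sigma_3,\lambda) + 18\sqrt2\pi^3 i\,a_3^2(\Sigma_3\cap\Sigma_3\cap S)f_4(q)\Big\}\\
&\qquad\cdot\int_{T^{b_1}}\exp\Big[2(P(q),\Sigma)\Omega + (R(q),\Sigma)\,\iota_{\Sigma_3^\sharp}\Omega + Q(q)\gamma^\sharp\Big]\Big]_{q^0},
\end{aligned}
\tag{3.16}
$$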
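where
$$
\begin{aligned}
P(q) &= \frac{i\sqrt2}{16\pi}\big(f_1(q)S + 8if_2(q)\lambda\big),\\
R(q) &= 3\pi a_3\big(4if_3(q)S - f_1(q)\lambda\big),\\
Q(q) &= \frac{\sqrt2\,a_1}{8}\,\frac1h.
\end{aligned}
\tag{3.17}
$$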
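The coefficients in (3.17) can be traced back to the couplings already written down. The following is a sketch that only matches numerical coefficients; it sets $F = 4\pi\lambda$ as in Sect. 4, does not track the distinction between $\lambda$ and $\lambda_-$, and ignores signs coming from reordering wedge products, all of which are simplifying assumptions made here.

```latex
\begin{align*}
% P: the (psi^2, lambda) coupling in (2.4) plus the psi^2 part of a_2 K^2 u in (2.7),
% with a_2 = i/(\sqrt2\pi) and (3.6):
\frac{i\sqrt2}{2^7\pi}\cdot 4\pi\,\frac{d\tau}{da}
  &= \frac{i\sqrt2}{32}\cdot\frac{16i}{\pi}f_2
   = \frac{i\sqrt2}{16\pi}\cdot 8i\,f_2,
&
\frac{i}{\sqrt2\,\pi}\cdot\frac{1}{32}\,\frac{d^2u}{da^2}
  &= \frac{i\sqrt2}{64\pi}\cdot 4f_1
   = \frac{i\sqrt2}{16\pi}\,f_1,\\[4pt]
% R: the psi ^ F part of a_3 K^3 u, and a_{32} K T(u) with (2.19), (3.14):
-\frac{3}{16}\,a_3\cdot 4f_1\cdot 4\pi
  &= -3\pi a_3\,f_1,
&
a_{32}\cdot\frac{1}{4\sqrt2}\,\frac{dT}{da}
  &= \big(-6\sqrt2\,\pi i\,a_3\big)\frac{1}{4\sqrt2}\big(-8f_3\big)
   = 3\pi a_3\cdot 4i\,f_3,\\[4pt]
% Q: a_1 K u integrated over gamma, with du/da = 1/h and (3.13):
a_1\cdot\frac{1}{4\sqrt2}\,\frac{du}{da}
  &= \frac{\sqrt2\,a_1}{8}\,\frac1h . &&
\end{align*}
```

With $\psi^2 = 2(\Sigma\otimes\Omega)$ from (3.8), the first line accounts for the term $2(P(q),\Sigma)\Omega$ in the torus integral of (3.16).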
In the case of ruled surfaces and for $\zeta^2 = p_1$, we find again agreement with the expressions obtained by Muñoz. In fact, comparing to the formula on p. 13 of [17], we can find the values of $a_1$, $a_3$:
$$
a_1 = \pi^{-1/2}\,2^{3/4}\,e^{-\frac{\pi i}{4}},\qquad a_3 = \frac{\pi^{-3/2}}{6}\,2^{1/4}\,e^{\frac{\pi i}{4}}. \tag{3.18}
$$
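With the values (3.18) the constants fixed by duality in (2.16), (2.17) and (2.19) become simple rational numbers. The short computation below is a sketch of that arithmetic; the comparison with (4.16) in the comments is a plausibility check rather than a derivation.

```latex
\begin{align*}
a_1a_3 &= \frac{2^{3/4}\,2^{1/4}}{6\,\pi^{2}} = \frac{1}{3\pi^2}
 &&\Longrightarrow\quad a_{13} = -6\pi^2a_1a_3 = -2,\\
a_3^2 &= \frac{2^{1/2}\,e^{\pi i/2}}{36\,\pi^{3}} = \frac{\sqrt2\,i}{36\,\pi^3}
 &&\Longrightarrow\quad \frac{9\pi^3\sqrt2\,i}{4}\,a_3^2 = \frac{18\,i^2}{144} = -\frac18 .
\end{align*}
% Hence, dropping the \tilde\psi terms, (2.17) reads
%   \tilde S = S - (1/8)\,\Sigma_3\wedge\Sigma_3 ,
% so 2(\tilde S,\lambda) = 2(S,\lambda) - (1/4)(\Sigma_3\wedge\Sigma_3,\lambda),
% the combination that appears in (4.16). Likewise a_{13} = -2 multiplied by the
% value 1/2 of the contact term at u = 1 (the coefficient of S^2 in (4.16))
% gives the term -(\Sigma_3\cap\gamma) in (4.16).
```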
4. The Seiberg–Witten Contribution

We can now follow the strategy in [9] and find the form of the Seiberg–Witten contribution at the monopole and dyon cusps. We focus on the monopole cusp, as the contribution at the dyon cusp can be obtained using the $\mathbb{Z}_2$ symmetry on the u-plane. In fact, as the functions involved in the monopole contribution are universal and they have been obtained in [9] in the simply-connected case, we will be able to derive the general wall-crossing formula of [20, 21].

A crucial ingredient in the discussion of the Seiberg–Witten contribution for nonsimply connected manifolds is that we have to consider generalized Seiberg–Witten invariants, i.e., we have to consider correlation functions of one-observables. Recall that the basic observable in Seiberg–Witten theory is the two-form $a_D$ on $M_\lambda$. The first descendant of $a_D$ (in the topological field theory associated to the Seiberg–Witten monopole equations) is $\psi$, which is a one-form on $X$ and also a one-form on $M_\lambda$. It can be written as:
$$
\psi = \sum_{i=1}^{b_1}\nu_i\beta_i, \tag{4.1}
$$
where $\beta_i\in H^1(X, \mathbb{Z})$, $i = 1, \dots, b_1$, is the basis of one-forms considered before, and
$$
\nu_i = \int_{\delta_i}\psi \tag{4.2}
$$
are the one-observables of Seiberg–Witten theory. The generalized Seiberg–Witten invariant associated to the one-forms $\beta_1, \dots, \beta_r$ is defined by intersection theory on $M_\lambda$:
$$
SW(\lambda, \beta_1\wedge\cdots\wedge\beta_r) = \int_{M_\lambda}\nu_1\wedge\cdots\wedge\nu_r\wedge a_D^{\frac{d_\lambda-r}{2}}, \tag{4.3}
$$
where $d_\lambda = \lambda^2 - (2\chi + 3\sigma)/4$ is the virtual dimension of $M_\lambda$. These generalized invariants (and their wall-crossing) have been considered in [21].
The Seiberg–Witten twisted Lagrangian near $u = 1$ (with the monopole fields included) can be written as [9, 11]
$$
\begin{aligned}
\{Q, W\} &+ \frac{i}{16\pi}\tilde\tau_MF\wedge F + p(u)\,\mathrm{Tr}R\wedge R + \ell(u)\,\mathrm{Tr}R\wedge\tilde R\\
&- \frac{i\sqrt2}{2^7\cdot\pi}\frac{d\tilde\tau_M}{da_D}(\psi\wedge\psi)\wedge F + \frac{i}{3\cdot2^{11}\pi}\frac{d^2\tilde\tau_M}{da_D^2}\,\psi\wedge\psi\wedge\psi\wedge\psi,
\end{aligned}
\tag{4.4}
$$
and, as we see, the terms which are not $Q$-exact do not depend on the metric. The exponentiation of the terms involving the densities $\mathrm{Tr}R\wedge R$, $\mathrm{Tr}R\wedge\tilde R$ gives, after integration on $X$, the gravitational factors $P(u)^{\sigma/8}$, $L(u)^{\chi/4}$ considered in [9]. The term involving $F\wedge F$ gives a factor $C(u)^{\lambda^2/2}$, where $F = 4\pi\lambda$ and $C(u) = e^{-2\pi i\tilde\tau_M}$. The terms $C(u)$, $L(u)$ and $P(u)$ are universal (they do not depend on the manifold $X$) and they were found in [9] using matching of wall-crossing in the simply-connected case. They can be written as
$$
\begin{aligned}
L(u) &= \pi i\alpha^4(u^2-1)\frac{d\tau_D}{du},\\
P(u) &= -\pi^2\beta^8a_D^{-1}(u^2-1),\\
C(u) &= \frac{a_D}{q_D},
\end{aligned}
\tag{4.5}
$$
where $q_D = e^{2\pi i\tau_D}$. The last relation tells us that the gauge coupling $\tilde\tau_M$ appearing in (4.4) is given by
$$
\tilde\tau_M = \tau_D - \frac{1}{2\pi i}\log a_D, \tag{4.6}
$$
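and therefore it is smooth at the monopole cusp. This defines the prepotential $\tilde F_M(a_D)$ through the equation $\tilde F_M''(a_D) = \tilde\tau_M$.

First we analyze the Seiberg–Witten contribution when only two and four observables are introduced. It can be written as
$$
\begin{aligned}
\langle e^{pO+I_2(S)}\rangle_{\lambda,u=1} = \int_{M_\lambda}& 2e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}C(u)^{\lambda^2/2}P(u)^{\sigma/8}L(u)^{\chi/4}\\
&\cdot\exp\Big[2pu + i\frac{du}{da_D}(S,\lambda) + S^2T_M(u)\Big]\exp\Big[(P_M(u),\Sigma)\sum_{i,j=1}^{b_1}a_{ij}\nu_i\nu_j\Big],
\end{aligned}
\tag{4.7}
$$
where
$$
P_M(u) = \frac{i\sqrt2}{2^6\pi}\frac{d^2u}{da_D^2}\,S + \frac{i\sqrt2}{32}\frac{d\tilde\tau_M}{da_D}\,\lambda, \tag{4.8}
$$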
which is the magnetic version of the function in the $T^{b_1}$ integral of (3.4), but with the smooth coupling constant (4.6). When we expand the exponential involving the one-observables, we obtain the generalized Seiberg–Witten invariants with $2b$ insertions, $b = 0, \dots, b_1$. The final expression is:
$$
\begin{aligned}
\langle e^{pO+I_2(S)}\rangle_{\lambda,u=1} = \sum_{b=0}^{b_1/2}\frac{1}{b!}\,&\mathrm{Res}_{a_D=0}\Big[2e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}C(u)^{\lambda^2/2}P(u)^{\sigma/8}L(u)^{\chi/4}\\
&\cdot\exp\Big\{2pu + i\frac{du}{da_D}(S,\lambda) + S^2T_M(u)\Big\}a_D^{-d_\lambda/2+b-1}(P_M(u),\Sigma)^b\Big]\\
&\cdot\sum_{i_p,j_p=1}^{b_1}a_{i_1j_1}\cdots a_{i_bj_b}\,SW(\lambda,\beta_{i_1}\wedge\beta_{j_1}\wedge\cdots\wedge\beta_{i_b}\wedge\beta_{j_b}).
\end{aligned}
\tag{4.9}
$$
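On the other hand, the wall-crossing formula for the u-plane integral near $u = 1$ can be written as
$$
\begin{aligned}
WC(\lambda) = {}& 2\pi i\,2^{11b_1/4}\,i^{b_1/2}\,\alpha^\chi\beta^\sigma e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}\,\mathrm{vol}(T^{b_1})\\
&\cdot\mathrm{Res}_{a_D=0}\Big[q_D^{-\lambda^2/2}\Big((u^2-1)\frac{d\tau}{du}\Big)^{\chi/4}(u^2-1)^{\sigma/8}\\
&\qquad\cdot\exp\Big\{2pu + i\frac{du}{da_D}(S,\lambda) + S^2T_M(u)\Big\}(P(q_D),\Sigma)^{b_1/2}\Big],
\end{aligned}
\tag{4.10}
$$
where we have included the extra factors depending on $b_1$ that we obtained in the previous section by comparing to the expressions in [17]. Notice that, from (4.6), one has
$$
P(q_D) = P_M(u) + \frac{\sqrt2}{64\pi}\frac{1}{a_D}\,\lambda, \tag{4.11}
$$
and (4.10) can then be expanded as:
$$
\begin{aligned}
WC(\lambda) = {}& 2\pi i\Big(\frac{i}{\pi}\Big)^{b_1/2}\alpha^\chi\beta^\sigma e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}\,\mathrm{vol}(T^{b_1})\\
&\cdot\sum_{b=0}^{b_1/2}\binom{b_1/2}{b}\mathrm{Res}_{a_D=0}\Big[q_D^{-\lambda^2/2}\Big((u^2-1)\frac{d\tau}{du}\Big)^{\chi/4}(u^2-1)^{\sigma/8}\\
&\qquad\cdot\exp\Big\{2pu + i\frac{du}{da_D}(S,\lambda) + S^2T_M(u)\Big\}\\
&\qquad\cdot2^{11b/2}\pi^b(P_M(u),\Sigma)^b(\lambda,\Sigma)^{b_1/2-b}\,a_D^{b-b_1/2}\Big].
\end{aligned}
\tag{4.12}
$$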
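The passage from (4.10) to (4.12) is a binomial expansion combined with (4.6). The sketch below spells out the bookkeeping of the numerical factors; it assumes that the $\lambda$-coefficient of $P$ is $\tfrac{i\sqrt2}{32}\,d\tau/da$, as in (4.8) with $\tilde\tau_M$ replaced by $\tau_D$.

```latex
\begin{align*}
% (4.11) from (4.6):
\frac{d\tau_D}{da_D} &= \frac{d\tilde\tau_M}{da_D} + \frac{1}{2\pi i\,a_D}
 \quad\Longrightarrow\quad
 \frac{i\sqrt2}{32}\cdot\frac{1}{2\pi i\,a_D} = \frac{\sqrt2}{64\pi}\,\frac{1}{a_D},\\[4pt]
% binomial expansion, using \sqrt2/64 = 2^{-11/2}:
(P(q_D),\Sigma)^{b_1/2}
 &= \sum_{b=0}^{b_1/2}\binom{b_1/2}{b}\,(P_M(u),\Sigma)^b
    \Big(\frac{2^{-11/2}}{\pi\,a_D}\Big)^{b_1/2-b}(\lambda,\Sigma)^{b_1/2-b},\\[4pt]
% combining with the prefactor of (4.10):
2^{11b_1/4}\,i^{b_1/2}\cdot\big(2^{-11/2}\pi^{-1}\big)^{b_1/2-b}
 &= \Big(\frac{i}{\pi}\Big)^{b_1/2}\,2^{11b/2}\,\pi^{b}.
\end{align*}
```

The last line reproduces the prefactor $(i/\pi)^{b_1/2}$ and the factor $2^{11b/2}\pi^b$ inside the sum in (4.12), together with the power $a_D^{\,b-b_1/2}$.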
Now we can compare the expressions for wall-crossing obtained from (4.9) and (4.12). Notice again that, to identify the fields $\psi$ with the one-observables in (4.2) we have to be careful with possible normalization factors needed to agree with the normalization used by topologists. But for the terms with no insertions, i.e. $b = 0$ in (4.9), we should be able to obtain the wall-crossing formula for this Seiberg–Witten invariant. In fact, we find:
$$
WC(SW(\lambda)) = (-1)^{b_1/2}(\lambda,\Sigma)^{b_1/2}\,\mathrm{vol}(T^{b_1}). \tag{4.13}
$$
As $\lambda\in H^2(X, \mathbb{Z}) + \frac12w_2(X)$, and $2\lambda$ is the determinant line bundle of the corresponding $\mathrm{Spin}^c$ structure, we find perfect agreement with [20, 21] (notice that the wall-crossing formula in Theorem 1.2 of [20] should have an extra factor of $2^{-b_1/2}$). For the wall-crossing of Seiberg–Witten invariants with one-observable insertions, the general formula of [21] is (in our notation):
$$
WC(SW(\lambda,\beta_1\wedge\cdots\wedge\beta_r)) = \frac{(-1)^{\frac{b_1-r}{2}}}{\big(\frac{b_1-r}{2}\big)!}\,(\lambda,\Sigma)^{\frac{b_1-r}{2}}\int_{T^{b_1}}\beta_1^\sharp\wedge\cdots\wedge\beta_r^\sharp\wedge\Omega^{\frac{b_1-r}{2}},
$$
(4.14) and we see that, if we introduce a normalization factor 2−9/4 π −1/2 i for the fields ψ, we obtain from the matching of wall-crossing the expression b X
ai1 j1 · · · aib jb W C(SW (λ, βi1 ∧ βj1 ∧ · · · ∧ βib ∧ βjb )) =
ip ,jp =1
(4.15)
(−1)b1 /2−b (λ, 6)b1 /2−b vol(Tb1 ), 2b (b1 /2 − b)!

again in agreement with (4.14). The expression (4.14) can be actually derived by considering general insertions of one and three-observables. We then see that, in general, the Donaldson invariants for a non-simply connected manifold should be written in terms of the generalized Seiberg–Witten invariants, and only in this case we have consistent formulae for matching of wall-crossing.

When we include arbitrary observables associated to one- and three-cycles, the above expressions are more complicated and we have to take into account the terms involving $\gamma$ and $\Sigma_3$, as well as the new contact terms. For manifolds of $b_2^+ > 1$, all the contact terms appear in the Seiberg–Witten contribution. They are given by the expression in (2.11), where the prepotential is now the monopole prepotential $\tilde F_M(a_D)$ (notice that, as all the contact terms are regular at the monopole cusp, we can take as well the dual prepotential $F_D(a_D)$). In general we have a complicated expression which can be written explicitly using the previous results. In the simple type case, $d_\lambda = 0$, the Seiberg–Witten moduli space consists of a finite set of points and counting them with appropriate signs we obtain $SW(\lambda)$. We only have to compute the different functions at $u = 1$ (the first term in the expansion in $a_D$). Using also (3.18), we can already write an explicit expression for the SW contribution of $\lambda$ to the Donaldson invariants at $u = 1$, with an arbitrary insertion of observables:
$$
\begin{aligned}
\langle e^{pO+I_2(S)+I_1(\gamma)+I_3(\Sigma_3)}\rangle_{\lambda,u=1} = {}& 2^{1+\frac{7\chi}{4}+\frac{11\sigma}{4}}e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}\\
&\cdot\exp\Big\{2p + \frac12S^2 - (\Sigma_3\cap\gamma) - \frac14(\Sigma_3\wedge\Sigma_3,S) + \frac{1}{96}(\Sigma_3\cap\Sigma_3\cap\Sigma_3\cap\Sigma_3)\Big\}\\
&\cdot\exp\Big\{2(S,\lambda) - \frac14(\Sigma_3\wedge\Sigma_3,\lambda)\Big\}SW(\lambda).
\end{aligned}
\tag{4.16}
$$
We see that the contact terms give, on the one hand, new contributions which depend on the intersection theory on $X$. On the other hand, there is a term coming from the intersection of the two-cycle $\Sigma_3\cap\Sigma_3$ with the Poincaré dual of the basic class $\lambda$, as was suspected in [6] using the cosmic string approach. Notice that all the coefficients in (4.16) are real, as needed for consistency.

We can now write an expression for the Donaldson invariants of non-simply connected manifolds of simple type, summing over the basic classes and considering both the monopole and the dyon contributions. Using the $\mathbb{Z}_2$ symmetry on the u-plane, and taking into account the R-charges of the different terms in (4.16), we obtain:$^2$
$$
\begin{aligned}
\langle e^{pO+I_2(S)+I_1(\gamma)+I_3(\Sigma_3)}\rangle = 2^{1+\frac{7\chi}{4}+\frac{11\sigma}{4}}\Big(&\exp\Big\{2p + \frac12S^2 - (\Sigma_3\cap\gamma) - \frac14(\Sigma_3\wedge\Sigma_3,S) + \frac{1}{96}(\Sigma_3\cap\Sigma_3\cap\Sigma_3\cap\Sigma_3)\Big\}\\
&\cdot\sum_\lambda e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}SW(\lambda)\exp\Big\{2(S,\lambda) - \frac14(\Sigma_3\wedge\Sigma_3,\lambda)\Big\}\\
+\; i^{\frac{\chi+\sigma}{4}-w_2^2(E)}&\exp\Big\{-2p - \frac12S^2 + (\Sigma_3\cap\gamma) + \frac14(\Sigma_3\wedge\Sigma_3,S) - \frac{1}{96}(\Sigma_3\cap\Sigma_3\cap\Sigma_3\cap\Sigma_3)\Big\}\\
&\cdot\sum_\lambda e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}SW(\lambda)\exp\Big\{-2i(S,\lambda) + \frac{i}{4}(\Sigma_3\wedge\Sigma_3,\lambda)\Big\}\Big).
\end{aligned}
\tag{4.17}
$$

5. Donaldson Invariants for Product Ruled Surfaces

In this section we will use the above results and the formulae in Sect. 8 of [9] to write general expressions for the Donaldson invariants of product ruled surfaces $X = \mathbb{CP}^1\times F_g$, where $F_g$ is a Riemann surface of genus $g$.$^3$ For these manifolds, $b_1 = 2g$, $b_2 = 2$, $b_2^+ = 1$, so $\sigma = 0$ and $\chi = 4 - 2b_1$. $H^2(X, \mathbb{Z})$ is generated by the cohomology classes $[\mathbb{CP}^1]$, $[F_g]$, with intersection form $II^{1,1}$. Consider a general period point $\omega$ given by
$$
\omega(\theta) = \frac{1}{\sqrt2}\big(e^\theta[\mathbb{CP}^1] + e^{-\theta}[F_g]\big). \tag{5.1}
$$
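A few elementary consequences of (5.1) are used repeatedly in this section and can be checked directly. The sketch below assumes only that the intersection form $II^{1,1}$ means $[\mathbb{CP}^1]^2 = [F_g]^2 = 0$ and $[\mathbb{CP}^1]\cdot[F_g] = 1$; the real numbers $\alpha$, $\beta$ are illustrative names for the coefficients of a general class and do not appear in the text.

```latex
\begin{align*}
\omega(\theta)^2 &= \tfrac12\cdot 2\,e^{\theta}e^{-\theta}\,[\mathbb{CP}^1]\cdot[F_g] = 1,\\
\big([\mathbb{CP}^1],\omega\big)^2 &= \tfrac12\,e^{-2\theta}
   \;\longrightarrow\; 0 \quad (\theta\to+\infty),
\qquad
\big([F_g],\omega\big)^2 = \tfrac12\,e^{2\theta}
   \;\longrightarrow\; 0 \quad (\theta\to-\infty),\\
\lambda = \alpha[\mathbb{CP}^1]+\beta[F_g]:\quad
\lambda^2 &= 2\alpha\beta,\qquad
(\omega,\lambda) = \tfrac{1}{\sqrt2}\big(e^{\theta}\beta + e^{-\theta}\alpha\big).
\end{align*}
```

In particular $(\omega,\lambda)$ can vanish for some real $\theta$ only if $\alpha\beta < 0$, that is, only if $\lambda^2 < 0$, and then it vanishes at $e^{2\theta} = -\alpha/\beta$. This is the condition defining the walls later in this section.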
The standard constant curvature metrics for $\mathbb{CP}^1$, $F_g$ give natural representatives for $[\mathbb{CP}^1]$, $[F_g]$. Thus, choosing coordinates $z\in\mathbb{C}$ for $\mathbb{CP}^1$ and representing $F_g$ (for $g > 1$) as a quotient by a Fuchsian group of the Poincaré disk $D = \{w : |w| < 1\}$, we have
$$
\begin{aligned}
[\mathbb{CP}^1] &= \frac{i}{2\pi(g-1)}\frac{dw\wedge d\bar w}{(1-|w|^2)^2},\\
[F_g] &= \frac{i}{2\pi}\frac{dz\wedge d\bar z}{(1+|z|^2)^2}.
\end{aligned}
\tag{5.2}
$$
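The normalizations in (5.2) are such that each form integrates to one over the factor on which it lives, as required for $[\mathbb{CP}^1]\cdot[F_g] = 1$. The check below is a sketch: it uses $dz\wedge d\bar z = -2i\,dx\wedge dy$ and, for the disk, the assumption that the Poincaré metric $4|dw|^2/(1-|w|^2)^2$ has curvature $-1$, so that a fundamental domain $\mathcal F$ of $F_g$ has area $4\pi(g-1)$ by Gauss–Bonnet.

```latex
\begin{align*}
\int_{\mathbb{C}}\frac{i}{2\pi}\,\frac{dz\wedge d\bar z}{(1+|z|^2)^2}
 &= \frac{1}{\pi}\int_0^\infty\frac{2\pi r\,dr}{(1+r^2)^2}
  = \frac{1}{\pi}\cdot\pi = 1,\\[4pt]
\int_{\mathcal F}\frac{i}{2\pi(g-1)}\,\frac{dw\wedge d\bar w}{(1-|w|^2)^2}
 &= \frac{1}{\pi(g-1)}\int_{\mathcal F}\frac{dx\,dy}{(1-|w|^2)^2}
  = \frac{1}{\pi(g-1)}\cdot\frac{4\pi(g-1)}{4} = 1 .
\end{align*}
```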
A metric with period point (5.1) then has scalar curvature $8\pi(e^\theta - e^{-\theta}(g-1))$ and will hence be positive for $e^{2\theta} > g - 1$. By [7] there are no SW contributions to the Donaldson invariants for this metric. Thus, to compute the invariants in the chamber corresponding to the limit of a small volume for $\mathbb{CP}^1$ (with $\theta$ positive and large), we need only evaluate the u-plane integral.

2 Curiously, the coefficient of $(\Sigma_3)^4$ is $1/96 = 2^{-5}\cdot3^{-1}$. The factor of 3 would seem to imply divisibility properties by 3 for certain intersection numbers. These are easily confirmed in simple examples, but we did not find a general proof.

3 This case has also been discussed in Eq. (2.15) of [10].
The value of the integral depends on the non-abelian magnetic flux w2 (E). The vanishing argument of Sect. 5 of [9] applies when w2 (E) · [CP 1 ] 6= 0, (i.e. there is nontrivial flux through the small rational fiber). In this case we have simply that Zu = 0. Now consider w2 (E) = 0. Here we can use the computation of Sect. 8 of [9]. To do this, we choose the primitive null vector z = [CP 1 ] = 6. The expression derived in [9] 2 = (z, ω)2 is small, which is the case in the chamber under will be valid as long as z+ consideration (corresponding to θ large and positive). If we consider the expression (2.20) on a product ruled surface, we see that it involves a lattice sum 9(e S) and a term involving the observables and the measure. From this last term we can derive an expression generalizing the holomorphic function f appearing in Sect. 8 of [9]: √ dτ 1−g du 2 (8i)1−g (u2 − 1) f = 8πi du dτ √ 1 2 dτ 2 3 3 a (S, ψ˜ 2 ) · exp 2pu + S − (S, 6 ∧ 6 ) T (u) + 8 64 da (5.3) Z 3π i du a1 du 3 3 ˜ ˜ a3 (S, ψ ∧ 6 ) ψ ∧ [γ ] − u(6 ∩ γ ) − + √ 8 da 4 2 da X 1 − u(S, 6 3 ∧ 6 3 ) , 12 where we have taken into account the relation (2.17), and a3 is given by (3.18). We also ˜ z) = (S, 6). Using the computation leading to Eq. (8.15) of [9], we see have that (S, that the Donaldson invariants of product ruled surfaces in the limit of a small volume for CP 1 and for w2 (E) = 0 are given by Z (S, [CP 1 ]) √ ˜ d ψ f h coth i . (5.4) −8 2πi 2h Tb1 q0 When we consider only insertions of zero and two-observables, the integration over the torus is easily done and we obtain the following expression for the Donaldson invariants of CP 1 × Fg , which generalizes the expression obtained in [19, 9] for CP 1 × CP 1 : g (S, [CP 1 ]) 2 . (5.5) −16i h(u2 − 1)e2pu+S T (u) 2h2 f1 (q)(S, [CP 1 ]) coth i 2h q0 If w2 (E) = [CP 1 ], one can analyze the lattice reduction as in [9] including this nonzero flux, which gives extra phases function. 
The effect of these phases in the lattice theta 1 is simply to change the coth i(S, [CP ])/2h in (5.4) by −i csc (S, [CP 1 ])/2h , and we then obtain the generating function for CP 1 × Fg and with w2 (E) = [CP 1 ]: g (S, [CP 1 ]) 2 −16 h(u2 − 1)e2pu+S T (u) 2h2 f1 (q)(S, [CP 1 ]) csc , (5.6) 2h q0 where we have included only zero and two-observables. For applications to Floer theory and Gromov–Witten theory the most interesting chamber is on the other side of the Kähler cone, namely when θ → −∞ giving a very small volume to Fg . We can obtain the Donaldson invariants for this chamber by summing over all the wall-crossing discontinuities. In general, denoting the lift 2λ0 of
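$w_2(E)$ by $\lambda_0 = \frac12\epsilon_1[\mathbb{CP}^1] + \frac12\epsilon_2[F_g]$ with $\epsilon_{1,2} = 0, 1$ we have walls at $\omega\cdot\lambda_{n,m} = 0$ where:
$$
\lambda_{n,m} = \big(n + \tfrac12\epsilon_1\big)[\mathbb{CP}^1] + \big(m + \tfrac12\epsilon_2\big)[F_g] \tag{5.7}
$$
for $(n + \frac12\epsilon_1)(m + \frac12\epsilon_2) < 0$, $n, m\in\mathbb{Z}$.$^4$ The infinite sums over wall-crossings can be written explicitly in terms of modular forms using a result of Kronecker [22]:$^5$
$$
\begin{aligned}
\sum_{m=1}^\infty\sum_{n=1}^\infty q^{mn}\Big(e^{2\pi i(n\theta_1+m\theta_2)} - e^{-2\pi i(n\theta_1+m\theta_2)}\Big) = {}& -1 + \frac{1}{1-e^{-2\pi i\theta_1}} + \frac{1}{1-e^{-2\pi i\theta_2}}\\
& - i\eta^3(\tau)\frac{\vartheta_1(\theta_1+\theta_2|\tau)}{\vartheta_1(\theta_1|\tau)\vartheta_1(\theta_2|\tau)}.
\end{aligned}
\tag{5.8}
$$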
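A simple sanity check of (5.8) is the limit $q\to0$, where the left-hand side vanishes. The sketch below assumes the leading behaviour $\eta^3(\tau) = q^{1/8}+\cdots$ and $\vartheta_1(\theta|\tau) = -2q^{1/8}\sin(\pi\theta)+\cdots$; the overall sign of $\vartheta_1$ is a convention that this section does not state, and the check shows which sign makes (5.8) consistent as written.

```latex
\begin{align*}
\frac{1}{1-e^{-2\pi i\theta}} &= \frac{e^{i\pi\theta}}{2i\sin\pi\theta}
   = \frac12 - \frac{i}{2}\cot\pi\theta
 \;\;\Longrightarrow\;\;
 -1 + \frac{1}{1-e^{-2\pi i\theta_1}} + \frac{1}{1-e^{-2\pi i\theta_2}}
   = -\frac{i}{2}\big(\cot\pi\theta_1+\cot\pi\theta_2\big),\\[4pt]
-\,i\,\eta^3\,\frac{\vartheta_1(\theta_1+\theta_2|\tau)}{\vartheta_1(\theta_1|\tau)\,\vartheta_1(\theta_2|\tau)}
 &\;\xrightarrow{\;q\to0\;}\;
 -\,i\,\frac{q^{1/8}\,\big(-2q^{1/8}\big)\sin\pi(\theta_1+\theta_2)}{4\,q^{1/4}\,\sin\pi\theta_1\,\sin\pi\theta_2}
   = +\frac{i}{2}\big(\cot\pi\theta_1+\cot\pi\theta_2\big).
\end{align*}
```

The two contributions cancel, in agreement with the vanishing of the double sum at $q = 0$. With the opposite sign convention for $\vartheta_1$, the last term of (5.8) would have to appear with a plus sign.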
We will consider in detail the cases $2\lambda_0 = [\mathbb{CP}^1]$, $2\lambda_0 = [\mathbb{CP}^1] + [F_g]$, which are relevant for the Floer cohomology of $F_g\times S^1$ [2, 24] as well as to the quantum cohomology of the moduli space of stable bundles over $F_g$ [3, 25]. We consider only insertions of zero, one, and two observables (i.e. we put $\Sigma_3 = 0$). As $w_2(X) = 0$ for product ruled surfaces, the only $\lambda$-dependent factors in the wall-crossing expression (3.16) are
$$
q^{-\lambda^2/2}\exp\Big[-\frac{i}{h}(\tilde S,\lambda)\Big], \tag{5.9}
$$
where $\tilde S$ has been defined in (2.14). Define now the formal variables
$$
\begin{aligned}
2\pi\theta_1 &= \frac1h([\mathbb{CP}^1],\tilde S) = \frac1h([\mathbb{CP}^1],S),\\
2\pi\theta_2 &= -\frac1h([F_g],\tilde S) = -\frac1h([F_g],S) + \frac{\sqrt2}{16}\frac{d\tau}{da}\,\Omega.
\end{aligned}
\tag{5.10}
$$
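The second equalities in (5.10) follow from the definition (2.14) of $\tilde S$ once $\Sigma_3 = 0$. The following sketch uses $\psi^2 = 2(\Sigma\otimes\Omega)$ from (3.8) with $\Sigma = [\mathbb{CP}^1]$, the intersection numbers $[\mathbb{CP}^1]^2 = 0$, $[\mathbb{CP}^1]\cdot[F_g] = 1$, and $1/h = du/da$ from (3.5).

```latex
\begin{align*}
\tilde S\big|_{\Sigma_3=0}
 &= S - \frac{\sqrt2}{32}\frac{d\tau}{du}\,\psi\wedge\psi
  = S - \frac{\sqrt2}{16}\frac{d\tau}{du}\,\Omega\,[\mathbb{CP}^1],\\[4pt]
\big([\mathbb{CP}^1],\tilde S\big) &= \big([\mathbb{CP}^1],S\big),
\qquad
\big([F_g],\tilde S\big) = \big([F_g],S\big) - \frac{\sqrt2}{16}\frac{d\tau}{du}\,\Omega,\\[4pt]
-\frac1h\big([F_g],\tilde S\big)
 &= -\frac1h\big([F_g],S\big) + \frac{\sqrt2}{16}\,\frac{du}{da}\frac{d\tau}{du}\,\Omega
  = -\frac1h\big([F_g],S\big) + \frac{\sqrt2}{16}\,\frac{d\tau}{da}\,\Omega .
\end{align*}
```

Thus $\theta_2$ is a formal variable taking values in the even forms on $T^{b_1}$, and functions of $\theta_2$ are understood through their expansion in $\Omega$ before integrating over the torus.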
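The sum of wall-crossings can then be written, for $2\lambda_0 = [\mathbb{CP}^1]$, as
$$
\begin{aligned}
-\sum_{m=1}^\infty\sum_{n=1}^\infty q^{m(n-1/2)}&\Big(e^{2\pi i((n-1/2)\theta_1+m\theta_2)} - e^{-2\pi i((n-1/2)\theta_1+m\theta_2)}\Big)\\
&= \frac{i}{2\sin(\pi\theta_1)} + i\eta^3(\tau)\frac{\vartheta_4(\theta_1+\theta_2|\tau)}{\vartheta_1(\theta_1|\tau)\vartheta_4(\theta_2|\tau)},
\end{aligned}
\tag{5.11}
$$
and for $2\lambda_0 = [\mathbb{CP}^1] + [F_g]$ as
$$
\begin{aligned}
-\sum_{m=1}^\infty\sum_{n=1}^\infty q^{(m-1/2)(n-1/2)}&\Big(e^{2\pi i((n-1/2)\theta_1+(m-1/2)\theta_2)} - e^{-2\pi i((n-1/2)\theta_1+(m-1/2)\theta_2)}\Big)\\
&= i\eta^3(\tau)\frac{\vartheta_1(\theta_1+\theta_2|\tau)}{\vartheta_4(\theta_1|\tau)\vartheta_4(\theta_2|\tau)}.
\end{aligned}
\tag{5.12}
$$

4 We require $\lambda^2 < 0$ rather than $\lambda^2\le0$ because the “walls” at $\lambda^2 = 0$ are on the light-cone, and are never crossed when moving from one chamber to another.

5 Curiously, this identity showed up in recent studies of elliptic quantum cohomology [23].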
These explicit expressions are obtained from (5.8) by shifting $\theta_1$, $\theta_2$ appropriately. The quotients of theta functions can be written in terms of Weierstrass $\sigma$ functions, as in the blow-up formulae of [26]:

$$\frac{\vartheta_1(\theta|\tau)}{\vartheta_1'(0|\tau)} = \frac{1}{\omega_2}\,e^{-\eta_2\omega_2\theta^2}\sigma(t), \qquad \frac{\vartheta_4(\theta|\tau)}{\vartheta_4(0|\tau)} = e^{-\eta_2\omega_2\theta^2}\sigma_3(t), \tag{5.13}$$

where $t = \omega_2\theta$ and $\omega_2$ corresponds to the $a$-period of the Seiberg–Witten curve, $\omega_2 = \frac{8\pi}{\sqrt{2}}h$. The value of the Weierstrass zeta function at $\omega_2$ can be written in terms of $E_2(\tau)$ as

$$\eta_2 = \frac{\pi^2}{6\omega_2}E_2(\tau). \tag{5.14}$$
In fact, the terms in (5.13) involving $\eta_2$ cancel the $E_2$ factors in the wall-crossing formula, and the resulting expressions are modular forms of weight zero (after integrating on $T^{b_1}$). The coefficients in the expansion of the $\sigma$ functions only depend on the zero-observable $u$. After writing the Seiberg–Witten elliptic curve in Weierstrass form, one has that

$$g_2 = \frac{1}{4}\left(\frac{u^2}{3} - \frac{1}{4}\right), \qquad g_3 = \frac{1}{48}\left(\frac{2u^3}{9} - \frac{u}{4}\right), \tag{5.15}$$
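and the root relevant for $\sigma_3(t)$ is $e_3 = -u/12$. We then have the expansions

$$\sigma(t) = t - \frac{g_2\,t^5}{2^4\cdot 3\cdot 5} + \cdots, \qquad \sigma_3(t) = 1 + \frac{u}{24}\,t^2 + \cdots. \tag{5.16}$$

Using (5.13) we find

$$\eta^3(\tau)\,\frac{\vartheta_4(\theta_1+\theta_2|\tau)}{\vartheta_1(\theta_1|\tau)\vartheta_4(\theta_2|\tau)} = -\frac{4}{\sqrt{2}}\,h\,e^{-2\eta_2\omega_2\theta_1\theta_2}\,\frac{\sigma_3(t_1+t_2)}{\sigma(t_1)\sigma_3(t_2)}, \qquad \eta^3(\tau)\,\frac{\vartheta_1(\theta_1+\theta_2|\tau)}{\vartheta_4(\theta_1|\tau)\vartheta_4(\theta_2|\tau)} = -\frac{\sqrt{2}}{4}\,h\,e^{-2\eta_2\omega_2\theta_1\theta_2}\,\frac{\sigma(t_1+t_2)}{\sigma_3(t_1)\sigma_3(t_2)}. \tag{5.17}$$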
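As a small consistency check on (5.15) and (5.16), the following sketch verifies with exact rational arithmetic that $e_3 = -u/12$ is a root of the cubic $4x^3 - g_2x - g_3$, and that the quadratic coefficient of $\sigma_3$ is $-e_3/2 = u/24$. It assumes the standard Weierstrass normalization $y^2 = 4x^3 - g_2x - g_3$ and the standard expansion $\sigma_3(t) = 1 - \tfrac{1}{2}e_3t^2 + \cdots$; the function names are hypothetical.

```python
from fractions import Fraction

def g2(u):
    # Invariant g_2 from (5.15).
    return Fraction(1, 4) * (u * u / 3 - Fraction(1, 4))

def g3(u):
    # Invariant g_3 from (5.15).
    return Fraction(1, 48) * (2 * u ** 3 / 9 - u / 4)

def weierstrass_cubic(x, u):
    # Right-hand side of y^2 = 4x^3 - g2*x - g3 (ASSUMED standard normalization).
    return 4 * x ** 3 - g2(u) * x - g3(u)

# A handful of exact rational sample values of the zero-observable u.
samples = [Fraction(n, 7) for n in range(-10, 11)]

# e_3 = -u/12 should be a root of the cubic for every u.
residuals = [weierstrass_cubic(-u / 12, u) for u in samples]

# sigma_3(t) = 1 - (e_3/2) t^2 + ... gives the coefficient u/24 quoted in (5.16).
sigma3_coefficients = [-(-u / 12) / 2 for u in samples]

# The denominator 2^4 * 3 * 5 in the sigma expansion is the familiar 240.
sigma_denominator = 2 ** 4 * 3 * 5
```

Because `Fraction` arithmetic is exact, the residuals vanish identically rather than only up to rounding.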
Using these expressions, we can obtain an explicit answer for the Donaldson–Witten function of product ruled surfaces in the chamber where the volume of $F_g$ is small. For $2\lambda_0 = [\mathbb{CP}^1]$, we have to add the expression for the chamber where the volume of $[\mathbb{CP}^1]$ is small, and the infinite sum of wall-crossing terms. When the volume of $[\mathbb{CP}^1]$ is small, we have to use (5.4) but with the $-i\csc\left((S,[\mathbb{CP}^1])/2h\right)$ function as in (5.6). We then obtain

$$Z_{DW}^{[\mathbb{CP}^1]} = -2^{(7g+1)/2}(-i\pi)^g\left[h^{2g-1}f_2(q)^{-1}e^{\left(2p+\frac{S^2}{3}\right)u}\int_{T^{b_1}}\exp\left(\frac{i\sqrt{2}}{24\pi}M(q)(S,[\mathbb{CP}^1]) + Q(q)\gamma\right)\cdot\frac{\sigma_3(t_1+t_2)}{\sigma(t_1)\sigma_3(t_2)}\right]_{q^0}, \tag{5.18}$$
where

$$M(q) = \frac{\vartheta_2^4 + \vartheta_3^4}{\vartheta_4^8}. \tag{5.19}$$
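For $2\lambda_0 = [\mathbb{CP}^1] + [F_g]$ we obtain:

$$Z_{DW}^{[\mathbb{CP}^1]+[F_g]} = \frac{\sqrt{2}}{8}\,2^{7g/2}(-i\pi)^g\left[h^{2g-1}f_2(q)^{-1}e^{\left(2p+\frac{S^2}{3}\right)u}\int_{T^{b_1}}\exp\left(\frac{i\sqrt{2}}{24\pi}M(q)(S,[\mathbb{CP}^1]) + Q(q)\gamma\right)\cdot\frac{\sigma(t_1+t_2)}{\sigma_3(t_1)\sigma_3(t_2)}\right]_{q^0}. \tag{5.20}$$

As a check of (5.20), one can easily derive, using the expansion of the $\sigma$ functions, the result

$$D^{[\mathbb{CP}^1]+[F_1]}\left((I(\mathbb{CP}^1))^2\right) = -2, \tag{5.21}$$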
where $D^{[\mathbb{CP}^1]+[F_1]}$ denotes the Donaldson invariant in the chamber $\mathrm{vol}(F_1) \to 0$. This agrees with the computation in [2]. Recently the proof of the Atiyah conjecture on the relation of symplectic and instanton Floer homology has been completed [24, 25]. It is possible that the above expressions can be used to give another proof of this conjecture. In fact,

$$Z_{DW}^{([\mathbb{CP}^1],\,[\mathbb{CP}^1]+[F_g])} = Z_{DW}^{[\mathbb{CP}^1]} + Z_{DW}^{[\mathbb{CP}^1]+[F_g]} \tag{5.22}$$

is a generating function for the Gromov–Witten invariants of the moduli space of flat connections [2, 3, 24, 25]. In the simple case $g = 1$, one finds that

$$D^{([\mathbb{CP}^1],\,[\mathbb{CP}^1]+[F_1])}(1) = -1, \qquad D^{([\mathbb{CP}^1],\,[\mathbb{CP}^1]+[F_1])}(O) = 2, \tag{5.23}$$
where $O$ is the zero-observable. Taking into account that the generator of the quantum cohomology of the moduli space of flat connections on $F_1$ (or of the Floer cohomology of $F_1 \times S^1$) is given by $\beta = -4O$, we obtain the relation $\beta = 8$, which is the first quantum correction to the classical cohomology ring in genus one.

Acknowledgements. We would like to thank Y. Ruan and E. Witten for some discussions. We would also like to thank V. Muñoz and Tianjun Li for very useful and clarifying correspondence. This work is supported by DOE grant DE-FG02-92ER40704.
References

1. The Floer Memorial Volume. H. Hofer et al. eds., Boston–Basel: Birkhäuser, 1995
2. Donaldson, S.: Floer homology and algebraic geometry. In: Vector bundles in algebraic geometry, N.J. Hitchin et al. eds. Cambridge: Cambridge University Press, 1995
3. Bershadsky, M., Johansen, A., Sadov, V. and Vafa, C.: Topological Reduction of 4D SYM to 2D σ-Models. hep-th/9501096; Nucl. Phys. B448, 166 (1995)
4. Morgan, J. and Mrowka, T.: A note on Donaldson's polynomial invariants. Int. Math. Res. Not. 10, 223 (1992)
5. Witten, E.: Topological Quantum Field Theory. Commun. Math. Phys. 117, 353 (1988)
6. Witten, E.: Supersymmetric Yang–Mills theory on a four-manifold. hep-th/9403193; J. Math. Phys. 35, 5101 (1994)
7. Witten, E.: Monopoles and four-manifolds. hep-th/9411102; Math. Res. Letters 1, 769 (1994)
8. Witten, E.: On S-duality in abelian gauge theory. hep-th/9505186; Selecta Mathematica 1, 383 (1995)
9. Moore, G. and Witten, E.: Integration over the u-plane in Donaldson theory. hep-th/9709193
10. Losev, A., Nekrasov, N. and Shatashvili, S.: Issues in topological gauge theory. hep-th/9711108; Testing Seiberg–Witten solution. hep-th/9801061
11. Mariño, M. and Moore, G.: Integrating over the Coulomb branch in N = 2 gauge theory. hep-th/9712062
12. Mariño, M. and Moore, G.: The Donaldson–Witten function for gauge groups of rank larger than one. hep-th/9802185
13. Takasaki, K.: Integrable Hierarchies and Contact Terms in u-plane Integrals of Topologically Twisted Supersymmetric Gauge Theories. hep-th/9803217
14. Verlinde, E.: Global aspects of electric-magnetic duality. hep-th/9506011; Nucl. Phys. B455, 211 (1995)
15. Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds. Oxford: Clarendon Press, 1990
16. Gorsky, A., Marshakov, A., Mironov, A. and Morozov, A.: RG equations from Whitham hierarchy. hep-th/9802007
17. Muñoz, V.: Wall-crossing formulae for algebraic surfaces with q > 0. alg-geom/9709002
18. Göttsche, L.: Modular forms and Donaldson invariants for 4-manifolds with $b_+ = 1$. alg-geom/9506018; J. Am. Math. Soc. 9, 827 (1996)
19. Göttsche, L. and Zagier, D.: Jacobi forms and the structure of Donaldson invariants for 4-manifolds with $b_+ = 1$. alg-geom/9612020
20. Li, T.J. and Liu, A.: General wall-crossing formula. Math. Res. Lett. 2, 797 (1995)
21. Okonek, C. and Teleman, A.: Seiberg–Witten invariants for manifolds with $b_2^+ = 1$, and the universal wall-crossing formula. alg-geom/9603003; Int. J. Math. 7, 811 (1996)
22. Weil, A.: Elliptic Functions according to Eisenstein and Kronecker. Berlin–Heidelberg–New York: Springer-Verlag, 1976
23. Polishchuk, A.: Massey and Fukaya products on elliptic curves. alg-geom/9803017
24. Muñoz, V.: Ring structure of the Floer cohomology of $\Sigma \times S^1$. dg-ga/9710029
25. Muñoz, V.: Quantum cohomology of the moduli space of stable bundles over a Riemann surface. alg-geom/9711030
26. Fintushel, R. and Stern, R.J.: The blowup formula for Donaldson invariants. alg-geom/9405002; Ann. Math. 143, 529 (1996)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 203, 269 – 295 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Gleason's Theorem for Rectangular JBW∗-Triples⋆
C. Martin Edwards¹, Gottfried T. Rüttimann²
¹ The Queen's College, Oxford, United Kingdom
² University of Berne, Berne, Switzerland
Received: 3 August 1998 / Accepted: 20 October 1998
Abstract: A JBW∗ -triple B is said to be rectangular if there exists a W∗ -algebra A and a pair (p, q) of centrally equivalent elements of the complete orthomodular lattice P(A) of projections in A such that B is isomorphic to the JBW∗ -triple pAq. Any weak∗ -closed injective operator space provides an example of a rectangular JBW∗ triple. The principal order ideal CP(A)(p,q) of the complete ∗ -lattice CP(A) of centrally equivalent pairs of projections in a W∗ -algebra A, generated by (p, q), forms a complete lattice that is order isomorphic to the complete lattice I(B) of weak∗ -closed inner ideals in B and to the complete lattice S(B) of structural projections on B. Although not itself, in general, orthomodular, CP(A)(p,q) possesses a complementation that allows for definitions of orthogonality, centre, and central orthogonality to be given. A less familiar notion in lattice theory, that is well-known in the theory of Jordan algebras and Jordan triple systems, is that of rigid collinearity of a pair (e1 , f1 ) and (e2 , f2 ) of elements of CP(A)(p,q) . This is defined and characterized in terms of properties of P(A). A W∗ -algebra A is sometimes thought of as providing a model for a statistical physical system. In this case B, or, equivalently, pAq, may be thought of as providing a model for a fixed sub-system of that represented by A. Therefore, CP(A)(p,q) may be considered to represent the set consisting of a particular kind of sub-system of that represented by pAq. Central orthogonality and rigid collinearity of pairs of elements of CP(A)(p,q) may be regarded as representing two different types of disjointness, the former, classical disjointness, and the latter, decoherence, of the two sub-systems. It is therefore natural to consider bounded measures m on CP(A)(p,q) that are additive on centrally orthogonal and rigidly collinear pairs of elements. Using results of J.D.M. 
Wright, it is shown that, provided that neither of the two hereditary sub-W∗-algebras $pAp$ and $qAq$ of $A$ has a weak∗-closed ideal of Type I$_2$, such measures are precisely those that are the restrictions of bounded sesquilinear functionals $\phi_m$ on $pAp \times qAq$ with the property that the action of the centroid $Z(B)$ of $B$ commutes with the adjoint operation. When $B$ is a complex Hilbert space of dimension greater than two, this result reduces to Gleason's Theorem.

⋆ Research partially supported by the Royal Society and the Swiss National Science Foundation.

1. Introduction

This paper is concerned with the structure of rectangular JBW∗-triples. A JBW∗-triple $B$ is said to be rectangular if there exists a W∗-algebra $A$ and a pair $(p, q)$ of projections in $A$ such that $B$ is isomorphic to the JBW∗-triple $pAq$. The family of rectangular JBW∗-triples includes, for example, all weak∗-closed injective operator spaces [31]. Since all the properties that will be discussed in this paper are invariant under isomorphisms, without loss of generality, $B$ will always be identified with $pAq$. For the general theory of JBW∗-triples the reader is referred to [3,14,21,23,27] and [33].

In the study of the structure of the rectangular JBW∗-triple $B$, the weak∗-closed subspaces $J$ that arise naturally are the inner ideals, which are defined by the property that, for each element $a$ in $J$ and each element $b$ in $B$, the element $ab^*a$ lies in $J$. A pair $(e, f)$ of projections in the W∗-algebra $A$ is said to be centrally equivalent if $e$ and $f$ have the same central support projection. The authors showed in [15] that every weak∗-closed inner ideal in a W∗-algebra $A$ is of the form $eAf$, for a unique centrally equivalent pair $(e, f)$ of projections in $A$. Since, for arbitrary projections $p$ and $q$ in $A$, $pAq$ is a weak∗-closed inner ideal in $A$, there is no loss of generality in assuming throughout that the pair $(p, q)$ is centrally equivalent. The results of [15] show that every weak∗-closed inner ideal in $B$ is of the form $eAf$, where $(e, f)$ is a centrally equivalent pair of projections in $A$, with $e$ and $f$ dominated by $p$ and $q$, respectively. A linear projection $P$ on $B$ is said to be structural if, for each element $a$ in $B$, the elements $(Pa)a^*(Pa)$ and $P(a(Pa)^*a)$ of $B$ coincide.
It follows from the results of [13,15,17] and [18] that every such projection is weak∗ -continuous and contractive and is of the form a 7 → eaf for a centrally equivalent pair (e, f ) of projections in A, with e and f dominated by p and q, respectively. It is a consequence of the results referred to above that the sets CP(A)(p,q) of centrally equivalent pairs of projections in A such that e and f are dominated by p and q, respectively, and S(B) of structural projections on B, with appropriate partial orderings, form complete lattices which are order isomorphic to the complete lattice I(B) of weak∗ -closed inner ideals in B. A W∗ -algebra A is often thought of as a model for a statistical quantum-mechanical system, the bounded observables of which are represented by self-adjoint elements of A, and the propositions of which are represented by elements of the complete orthomodular lattice P(A) of projections in A. Weak∗ -continuous contractive projections on A can be thought of as representing filtering processes on the physical system, and their ranges as representing sub-systems. Consequently, the sub-systems corresponding to structural projections are represented by weak∗ -closed inner ideals in A. It follows that the rectangular JBW∗ -triple B may be considered as representing a physical system, which, though not itself classical or quantum, may possess sub-systems that are. Its structural sub-systems are represented by weak∗ -closed inner ideals or, equivalently, by elements of CP(A)(p,q) . In this paper, sometimes using the corresponding results for CP(A) discussed in [20], the properties of the complete lattice CP(A)(p,q) of centrally equivalent pairs of projections in the W∗ -algebra A, dominated by (p, q), are examined in some detail. It is shown that CP(A)(p,q) possesses a rich structure involving notions of compatibility, orthogonality, central orthogonality and rigid collinearity, all of which have physical interpretations.
For a physical system represented by a W∗ -algebra A, a discussion of its statistics may be approached in two ways. In the traditional approach, states of the system are represented by bounded ortho-additive measures on P(A), whilst a second approach, sometimes referred to as that involving quantum histories ([24–26,34]), states are represented by measures on CP(A). The generalized Gleason Theorem of Bunce and Wright ([6–8]) shows that, provided that A has no weak∗ -closed ideal of Type I2 , states of the first kind are the restrictions of bounded linear functionals on A, and the results of [20] show that, under the same conditions, the relevant measures on CP(A) are the restrictions of bounded sesquilinear functionals, or decoherence functionals, on A × A. When A is a Type I factor, the second approach subsumes the first. For a system represented by a rectangular JBW∗ -triple B, where B is isomorphic to pAq and p and q are not equal, the first approach is not available. However, it is possible to appeal to the second approach. The central orthogonality and rigid collinearity of a pair (e1 , f1 ), (e2 , f2 ) of elements of CP(A)(p,q) correspond to two different kinds of disjointness of the corresponding sub-systems, the first classical disjointness, and the second, decoherence. Consequently, it is natural to study those bounded measures m on CP(A)(p,q) which have the property that, for each pair (e1 , f1 ), (e2 , f2 ) of either centrally orthogonal or rigidly collinear elements of CP(A)(p,q) , m((e1 , f1 ) ∨ (e2 , f2 )) = m((e1 , f1 )) + m((e2 , f2 )). Using results of [20] and [34], it is shown that, provided that neither of the W∗ -algebras pAp or qAq has a direct summand of Type I2 , such measures are the restrictions of a particular class of bounded sesquilinear functionals on pAp × qAq. In the case in which p and q coincide, these are the decoherence functionals, mentioned above, and discussed in [24–26] and [34–36]. 
Furthermore, a measure is normal if and only if the corresponding sesquilinear functional is separately weak∗ -continuous. The results obtained can be couched equivalently as properties of S(B) or I(B). The paper is organized as follows. In Sect. 2 definitions are given, notation is established and certain preliminary results are described. In Sect. 3 rectangular JBW∗ -triples are defined and the notion of compatibility in the complete lattice CP(A)(p,q) , which is order isomorphic to the complete lattice I(B) of weak∗ -closed inner ideals in the rectangular JBW∗ -triple B, is introduced. A more detailed study of the structure of CP(A)(p,q) is carried out in Sect. 4. In particular the notions of orthogonality, centre, central orthogonality and rigid collinearity are introduced and related to the centroid of B. Whilst the structure of CP(A)(p,q) and its physical interpretation are of independent interest, the main results of the paper are the generalization of Gleason’s Theorem [22], and the identification of normal measures on CP(A)(p,q) as the restrictions of separately weak∗ -continuous sesquilinear functionals, which are proved in Sect. 5. The last section is devoted to a discussion of examples, including that of the rectangular JBW∗ -triple Mr,s (C) of r × s complex matrices. In many ways, the most illuminating example is provided by the rectangular JBW∗ triple B which is itself a complex Hilbert space. In this case, the complete lattice I(B) of weak∗ -closed inner ideals is the complete lattice of closed subspaces of B and the complete lattice S(B) of structural projections is the complete lattice P(B(B)) of projections in the W∗ -algebra B(B) of bounded linear operators on the Hilbert space B. Since the centre ZS(B) of S(B) is trivial, there are no non-trivial centrally orthogonal pairs of elements of S(B). More surprisingly, there are no non-trivial orthogonal pairs of elements of S(B). 
However, two elements Q1 and Q2 of S(B) are rigidly collinear if and only if
they are orthogonal in the complete orthomodular lattice P(B(B)). Consequently, the bounded measures on S(B), discussed in Sect. 5, reduce to ortho-additive measures on P(B(B)) and the main results, Theorems 5.2 and 5.3, reduce to Gleason's Theorem [22]. The conclusion that can be deduced from this is that, in generalizing Gleason's Theorem to rectangular JBW∗-triples, it is rigid collinearity, not orthogonality, that is the relevant binary relation for the additivity of measures.

2. Preliminaries

Recall that a partially ordered set P is said to be a lattice if, for $e$ and $f$ in P, the supremum $e \vee f$ and the infimum $e \wedge f$ exist. The partially ordered set P is said to be a complete lattice if, for any family $(e_j)_{j\in\Lambda}$ of P, the supremum $\bigvee_{j\in\Lambda} e_j$ and the infimum $\bigwedge_{j\in\Lambda} e_j$ exist. A complete lattice has a greatest element, denoted by 1, and a least element, denoted by 0. For each element $p$ in the complete lattice P, the subset $P_p$ consisting of elements $e$ in P majorized by $p$ forms a complete lattice with greatest element $p$ and least element 0, and both the infimum and supremum of a family of elements of $P_p$ are the same whether calculated in P or in $P_p$. The complete lattice $P_p$ is said to be the principal order ideal in P generated by $p$. A complete lattice P together with an anti-order automorphism $e \mapsto e'$ of P such that, for all elements $e$ in P, $e \vee e' = 1$, $e'' = e$, and, for all elements $e$ and $f$ in P with $e \le f$, $f = e \vee (f \wedge e')$, is said to be orthomodular and the mapping $e \mapsto e'$ is said to be an orthocomplementation of P. Elements $e$ and $f$ in the complete orthomodular lattice P are said to be orthogonal, denoted $e \perp f$, if $e \le f'$. An element $z$ in P is said to be central if, for all elements $e$ in P, $e = (z \wedge e) \vee (z' \wedge e)$. The set ZP of central elements of the complete orthomodular lattice P contains 0 and 1, and if $z$ is contained in ZP then so also is $z'$.
The centre ZP of P forms a sub-complete orthomodular lattice of P which, with the restricted order and orthocomplementation, is Boolean. The central support $c(e)$ of an element $e$ in P is the infimum of the set of elements in ZP which dominate $e$. Observe that, for elements $e$ and $f$ in P,

$$c(e \wedge c(f)) = c(e) \wedge c(f), \tag{2.1}$$

and, for every family $(e_j)_{j\in\Lambda}$ of elements of P,

$$c\Big(\bigvee_{j\in\Lambda} e_j\Big) = \bigvee_{j\in\Lambda} c(e_j). \tag{2.2}$$
When endowed with the product ordering the Cartesian product P ×P of the complete orthomodular lattice P with itself forms a complete lattice. An element (e, f ) in P × P is said to be centrally equivalent if the central supports c(e) of e and c(f ) of f coincide and, in this case, the common central support is denoted by c(e, f ). Let CP be the collection of centrally equivalent elements of P × P. It follows from (2.2) that, with the ordering inherited from P × P, CP is a complete lattice with least element (0, 0) and greatest element (1, 1), and the supremum of a subset of CP coincides with its supremum when regarded as a subset of P × P. In general, this is not the case for the infimum. However, for any element (e, f ) in CP, (e, f ) = (c(f ), f ) ∧ (e, c(e)) = (c(f ), f ) ∧P ×P (e, c(e)). For details the reader is referred to [32].
Let $A$ be a W∗-algebra and let P(A) be the family of self-adjoint idempotents, or projections, in $A$. For $e$ and $f$ in P(A), write $e \le f$ if $ef = e$ and let $e' = 1 - e$, where 1 is the unit in $A$. These define a partial ordering on P(A), with respect to which it forms a complete lattice, and an orthocomplementation on P(A), which is therefore a complete orthomodular lattice. Observe that, for orthogonal elements $e$ and $f$ in P(A), $e \vee f = e + f$ and, by (2.2), $c(e + f) = c(e) \vee c(f)$. Furthermore, for each increasing net $(e_j)_{j\in\Lambda}$ in P(A), the supremum $\bigvee_{j\in\Lambda} e_j$ coincides with the limit of the net $(e_j)_{j\in\Lambda}$ in the weak∗ topology. The centre ZP(A) of P(A) coincides with the complete Boolean lattice of projections in the algebraic centre $Z(A)$ of $A$. Observe that, for $z$ in ZP(A) and $e$ in P(A),

$$z \wedge e = ze, \qquad z \vee e = e + z - ez = z + z'e. \tag{2.3}$$
A subspace $J$ of the W∗-algebra $A$ is said to be a left ideal if $AJ \subseteq J$, is said to be a right ideal if $JA \subseteq J$, and is said to be an ideal if it is both a left and a right ideal. For each weak∗-closed left ideal $J$ in $A$ there exists a unique projection $e$ such that $J$ coincides with $Ae$. The left ideal $Ae$ is an ideal if and only if $e$ is central. For each element $a$ in $A$, the unique projection $e(a)$ such that the left ideal $\{b \in A : ba = 0\}$ coincides with $Ae(a)'$ is said to be the left support of $a$. It is the least element of P(A) for which $e(a)a = a$. The right support $f(a)$ is similarly defined. Clearly, $e(a) = f(a^*)$ and, therefore, the left and right supports of a self-adjoint element $a$ coincide. This element, denoted by $s(a)$, is the unit in the sub-W∗-algebra of $A$ generated by $a$ and is said to be the support projection of $a$. An element $u$ in $A$ is said to be a partial isometry if $uu^*u = u$ or, equivalently, if either $uu^*$ or $u^*u$ is a projection. For any subset $B$ of $A$, the set of partial isometries in $B$ is denoted by U(B). For each partial isometry $u$ in $A$, $e(u) = uu^*$, $f(u) = u^*u$ and the central supports $c(e(u))$ and $c(f(u))$ coincide. For each element $a$ in $A$, there exists a unique partial isometry $r(a)$ in $A$ such that $a = r(a)|a|$ and $f(r(a)) = s(|a|)$, where $|a| = (a^*a)^{1/2}$. Moreover,

$$r(a)^* = r(a^*), \qquad f(a) = f(r(a)), \qquad e(a) = e(r(a)), \qquad a = r(a)a^*r(a). \tag{2.4}$$

The partial isometry $r(a)$ is said to be the support of $a$. For details of these and other results on W∗-algebras the reader is referred to [29] and [30].

The Jordan triple product $\{a\,b\,c\}$ of three elements $a$, $b$ and $c$ in $A$ is defined by

$$\{a\,b\,c\} = \tfrac{1}{2}(ab^*c + cb^*a).$$

A subspace $J$ of $A$ is said to be a subtriple of $A$ if the subset $\{J\,J\,J\}$ is contained in $J$, is said to be an inner ideal if the subset $\{J\,A\,J\}$ is contained in $J$ and is said to be an ideal if the subsets $\{A\,J\,A\}$ and $\{J\,A\,A\}$ are contained in $J$. Observe that, for each pair $a$, $b$ of elements of $A$, the subspace $aAb$ is an inner ideal in $A$. Since the intersection of a family of weak∗-closed inner ideals in $A$ is a weak∗-closed inner ideal in $A$, the set I(A) of weak∗-closed inner ideals in $A$, when ordered by set inclusion, forms a complete lattice. The following important result is proved in [15].
Lemma 2.1. Let $A$ be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in $A$, and let CP(A) be the complete lattice of centrally equivalent pairs of elements of P(A). Then, the mapping $(e, f) \mapsto eAf$ is an order isomorphism from CP(A) onto the complete lattice I(A) of weak∗-closed inner ideals in $A$, with inverse $J \mapsto (e_J, f_J)$, where

$$e_J = \bigvee\{e(u) : u \in U(J)\}, \qquad f_J = \bigvee\{f(u) : u \in U(J)\}.$$
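The inner-ideal property of $eAf$ that underlies Lemma 2.1 can be illustrated in the simplest case $A = M_3(\mathbb{C})$. The sketch below is only an illustration with hypothetical helper names and arbitrarily chosen sample matrices: it forms the Jordan triple product $\{x\,b\,z\}$ for $x, z$ in $eAf$ and an arbitrary $b$ in $A$, and checks that the result is again in $eAf$.

```python
def matmul(x, y):
    # Product of two square matrices stored as nested lists.
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(x):
    # Conjugate transpose.
    n = len(x)
    return [[x[j][i].conjugate() for j in range(n)] for i in range(n)]

def triple(a, b, c):
    # Jordan triple product {a b c} = (a b* c + c b* a) / 2.
    b_star = adjoint(b)
    left = matmul(matmul(a, b_star), c)
    right = matmul(matmul(c, b_star), a)
    n = len(a)
    return [[(left[i][j] + right[i][j]) / 2 for j in range(n)] for i in range(n)]

def corner(e, a, f):
    # The element e a f of the inner ideal eAf.
    return matmul(matmul(e, a), f)

# Two projections in M_3(C); in a factor every pair of nonzero projections
# is centrally equivalent, so (e, f) is an element of CP(A).
e = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
f = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]

# Arbitrary sample matrices with Gaussian-integer entries.
a = [[1 + 2j, 3, 0], [2, 1j, 5], [4, 1, 1]]
b = [[2, 1j, 1], [0, 3, 2 - 1j], [1, 1, 1]]
c = [[0, 1, 2], [3 - 1j, 4, 5], [6, 7, 8j]]

x = corner(e, a, f)
z = corner(e, c, f)
t = triple(x, b, z)

# t lies in eAf exactly when compressing by (e, f) leaves it unchanged.
stays_in_ideal = corner(e, t, f) == t
# The triple product is symmetric in its outer variables.
symmetric = triple(x, b, z) == triple(z, b, x)
```

The entries are small Gaussian integers, so halving is exact in floating point and the equalities can be tested without a tolerance.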
Since, for arbitrary elements $e$ and $f$ in P(A), the set $eAf$ is a weak∗-closed inner ideal in $A$, the corresponding element of CP(A) is $(c(f)e, c(e)f)$, the common central support $c(c(f)e, c(e)f)$ of $c(f)e$ and $c(e)f$ being $c(e)c(f)$. The complete lattice CP(A) has a complicated structure, some of which is described briefly below. For details, the reader is referred to [15] and [20]. For each element $(e, f)$ in CP(A) and each element $a$ in $A$, let

$$P_2^A(e, f)a = eaf, \qquad P_1^A(e, f)a = eaf' + e'af, \qquad P_0^A(e, f)a = e'af'. \tag{2.5}$$

Then, $P_0^A(e, f)$, $P_1^A(e, f)$ and $P_2^A(e, f)$ are weak∗-continuous norm non-increasing pairwise orthogonal linear projections on $A$ with sum $I_A$, the identity operator on $A$. The ranges $A_0(e, f)$ and $A_2(e, f)$ of $P_0^A(e, f)$ and $P_2^A(e, f)$ are weak∗-closed inner ideals in $A$ and the range $A_1(e, f)$ of $P_1^A(e, f)$ is a weak∗-closed subtriple of $A$. The decomposition

$$A = A_0(e, f) \oplus A_1(e, f) \oplus A_2(e, f)$$

of $A$ is said to be the generalized Peirce decomposition of $A$ relative to $(e, f)$. Let $D^A(e, f)$ be the weak∗-continuous bounded linear operator on $A$ defined, for each element $a$ in $A$, by

$$D^A(e, f)a = \left(\tfrac{1}{2}P_1^A(e, f) + P_2^A(e, f)\right)a = \tfrac{1}{2}(ea + af). \tag{2.6}$$
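The generalized Peirce decomposition (2.5) and the operator identity (2.6) can be checked by hand on a small example. The following sketch, with hypothetical helper names, takes $A = M_3(\mathbb{C})$ with two diagonal projections and verifies that the three Peirce projections sum to the identity, annihilate one another, and reproduce $\tfrac{1}{2}(ea + af)$.

```python
from fractions import Fraction

def matmul(x, y):
    # Product of two square matrices stored as nested lists.
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(len(x))] for i in range(len(x))]

def scale(c, x):
    return [[c * entry for entry in row] for row in x]

def complement(e):
    # Orthocomplement e' = 1 - e of a projection.
    n = len(e)
    return [[(1 if i == j else 0) - e[i][j] for j in range(n)] for i in range(n)]

def peirce(e, f, a):
    # Returns (P_0 a, P_1 a, P_2 a) as in (2.5).
    ec, fc = complement(e), complement(f)
    p2 = matmul(matmul(e, a), f)
    p1 = add(matmul(matmul(e, a), fc), matmul(matmul(ec, a), f))
    p0 = matmul(matmul(ec, a), fc)
    return p0, p1, p2

def d_operator(e, f, a):
    # D(e, f) a = (e a + a f) / 2 as in (2.6).
    return scale(Fraction(1, 2), add(matmul(e, a), matmul(a, f)))

e = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
f = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

p0, p1, p2 = peirce(e, f, a)
total = add(add(p0, p1), p2)                  # should reproduce a
d_via_peirce = add(scale(Fraction(1, 2), p1), p2)
d_direct = d_operator(e, f, a)                # should agree with d_via_peirce
p2_of_p1 = peirce(e, f, p1)[2]                # P_2 P_1 a, should vanish
p0_of_p2 = peirce(e, f, p2)[0]                # P_0 P_2 a, should vanish
```

For this choice of $a$ the component $P_2^A(e, f)a$ keeps the entries in rows 1 and 2 of column 1, $P_0^A(e, f)a$ keeps row 3 of columns 2 and 3, and $P_1^A(e, f)a$ keeps everything else.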
Recall that a bounded linear operator $T$ on a complex Banach space $X$ is said to be hermitian if the numerical range of $T$ in the Banach algebra $B(X)$ of bounded linear operators on $X$ is contained in $\mathbb{R}$, or, equivalently, if, for all real numbers $t$, the bounded linear operator $e^{itT}$ is an isometry (see [5], Lemma 5.2). It can easily be seen that the weak∗-continuous linear operator $D^A(e, f)$ is hermitian.

A pair $(e_1, f_1)$, $(e_2, f_2)$ of elements of CP(A) is said to be compatible, written $(e_1, f_1)\,C^A\,(e_2, f_2)$, if, for $j$ and $k$ equal to 0, 1 or 2, the commutator $[P_j^A(e_1, f_1), P_k^A(e_2, f_2)]$ is equal to zero. It can be seen that the compatibility of $(e_1, f_1)$ and $(e_2, f_2)$ is equivalent to either of the conditions: $[D(e_1, f_1), D(e_2, f_2)] = 0$; $e_1e_2 = e_2e_1$ and $f_1f_2 = f_2f_1$. For an element $(e, f)$ in CP(A) define

$$(e, f)' = (c(f')e', c(e')f'). \tag{2.7}$$

Then

$$(e, f)'' = ((c(f')e')', (c(e')f')'), \tag{2.8}$$

and $c((e, f)'')$ is equal to $c(e, f)$. Furthermore, the mapping $(e, f) \mapsto (e, f)'$ is order reversing, and

$$(e, f) \le (e, f)'' = (e, f)'''' = \dots, \qquad (e, f)' = (e, f)''' = (e, f)''''' = \dots. \tag{2.9}$$
An element $(e_1, f_1)$ in CP(A) is said to be orthogonal to an element $(e_2, f_2)$ in CP(A), written $(e_1, f_1) \perp (e_2, f_2)$, if $(e_2, f_2) \le (e_1, f_1)'$. This relation is symmetric and holds if and only if $e_1 \perp e_2$ and $f_1 \perp f_2$. It follows that orthogonal elements are compatible. An element $(g, h)$ in CP(A) is said to be central if, for each element $(e, f)$ of CP(A),

$$(e, f) = ((g, h) \wedge (e, f)) \vee ((g, h)' \wedge (e, f)). \tag{2.10}$$
An element $(g, h)$ in CP(A) is central if and only if $g$ is equal to $h$ and lies in the centre ZP(A) of P(A). Two elements $(e_1, f_1)$ and $(e_2, f_2)$ of CP(A) are said to be centrally orthogonal if there exists an element $z$ in the centre ZP(A) of P(A) such that $(e_1, f_1) \le (z, z)$ and $(e_2, f_2) \le (z, z)'$. Observe that centrally orthogonal elements are orthogonal and therefore compatible. Furthermore, the elements $(e_1, f_1)$ and $(e_2, f_2)$ are centrally orthogonal if and only if the central supports $c(e_1, f_1)$ and $c(e_2, f_2)$ are orthogonal, and, in this case,

$$(e_1, f_1) \vee (e_2, f_2) = (e_1 + e_2, f_1 + f_2). \tag{2.11}$$

A pair $(e_1, f_1)$ and $(e_2, f_2)$ of elements of CP(A) is said to be rigidly collinear, written $(e_1, f_1) \top (e_2, f_2)$, if

$$A_2(e_1, f_1) \subseteq A_1(e_2, f_2), \qquad A_2(e_2, f_2) \subseteq A_1(e_1, f_1).$$

This occurs if and only if there exist pairwise orthogonal elements $w_1$, $w_2$ and $w_3$ of ZP(A) of sum 1 such that

$$w_1e_1 = w_1e_2, \quad w_1f_1 \perp w_1f_2, \quad w_2e_1 \perp w_2e_2, \quad w_2f_1 = w_2f_2, \quad w_3e_1 = w_3e_2 = w_3f_1 = w_3f_2 = 0,$$

and in this case

$$c(e_1, f_1) = c(e_2, f_2) \le w_1 + w_2.$$
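To make the definition of rigid collinearity concrete, the sketch below tests the two range inclusions directly in $A = M_2(\mathbb{C})$, where the centre is trivial and so $w_1 = 1$ or $w_2 = 1$. The helper names are hypothetical; the inclusion $A_2(e_1, f_1) \subseteq A_1(e_2, f_2)$ is tested by checking that $P_1^A(e_2, f_2)$ fixes the image under $P_2^A(e_1, f_1)$ of every matrix unit.

```python
def matmul(x, y):
    # Product of two square matrices stored as nested lists.
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(len(x))] for i in range(len(x))]

def complement(e):
    n = len(e)
    return [[(1 if i == j else 0) - e[i][j] for j in range(n)] for i in range(n)]

def p2(pair, a):
    # Peirce-2 projection: a -> e a f.
    e, f = pair
    return matmul(matmul(e, a), f)

def p1(pair, a):
    # Peirce-1 projection: a -> e a f' + e' a f.
    e, f = pair
    return add(matmul(matmul(e, a), complement(f)),
               matmul(matmul(complement(e), a), f))

def matrix_units(n):
    # Standard basis of M_n as nested lists.
    return [[[1 if (r, s) == (i, j) else 0 for s in range(n)] for r in range(n)]
            for i in range(n) for j in range(n)]

def peirce2_inside_peirce1(pair_a, pair_b, n):
    # A_2(pair_a) is spanned by the images of the matrix units under P_2, and
    # an element lies in the range of the projection P_1 iff P_1 fixes it.
    return all(p1(pair_b, p2(pair_a, unit)) == p2(pair_a, unit)
               for unit in matrix_units(n))

def rigidly_collinear(pair_a, pair_b, n):
    return (peirce2_inside_peirce1(pair_a, pair_b, n)
            and peirce2_inside_peirce1(pair_b, pair_a, n))

E11 = [[1, 0], [0, 0]]
E22 = [[0, 0], [0, 1]]

# Same left projection, orthogonal right projections: the case w_1 = 1.
collinear_pair = rigidly_collinear((E11, E11), (E11, E22), 2)
# Orthogonal in both components: orthogonal, but not rigidly collinear.
orthogonal_pair = rigidly_collinear((E11, E11), (E22, E22), 2)
```

In the first case the two inner ideals are spanned by the matrix units $E_{11}$ and $E_{12}$, two entries of the same row, which matches the picture of collinear elements sharing one support and having orthogonal supports on the other side.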
Furthermore, a unique such triple $w_1$, $w_2$ and $w_3$ of elements of ZP(A), satisfying the additional conditions $w_1 \le c(e_1, f_1)$ and $w_2 \le c(e_1, f_1)$, exists. In this case $c(e_1, f_1) = c(e_2, f_2) = w_1 + w_2$ and, writing

$$w_1e_1 = w_1e_2 = e_0, \qquad w_2f_1 = w_2f_2 = f_0, \qquad w_2e_1 + w_2e_2 = e, \qquad w_1f_1 + w_1f_2 = f,$$

$c(e_0) = w_1$ and $c(f_0) = w_2$, and $(e_0, f)$ and $(e, f_0)$ are centrally orthogonal elements of CP(A) such that

$$(e_1, f_1) \vee (e_2, f_2) = (e_0, f) \vee (e, f_0) = (e_0 + e, f + f_0).$$

A family $((e_j, f_j))_{j\in\Lambda}$ of elements of CP(A) is said to be rigidly collinear if every pair of distinct elements of the family is rigidly collinear. For $j$ and $k$ in $\Lambda$ and $l$ equal to 1, 2 or 3, let $w(j, k)_l$ be the unique element of ZP(A) such that

$$w(j, k)_1f_j \perp w(j, k)_1f_k, \qquad w(j, k)_1e_j = w(j, k)_1e_k,$$
$$w(j, k)_2f_j = w(j, k)_2f_k, \qquad w(j, k)_2e_j \perp w(j, k)_2e_k,$$
$$w(j, k)_3e_j = w(j, k)_3f_j = w(j, k)_3e_k = w(j, k)_3f_k = 0,$$
$$w(j, k)_1 \le c(e_j, f_j), \qquad w(j, k)_2 \le c(e_j, f_j), \qquad \sum_{l=1}^{3} w(j, k)_l = 1.$$

Then, there exist unique pairwise orthogonal elements $w_1$, $w_2$ and $w_3$ of ZP(A) of sum 1 and elements $e_0$ and $f_0$ in P(A) such that, for all distinct $j$ and $k$ in $\Lambda$, and $l$ equal to 1, 2 or 3, $w(j, k)_l = w_l$, and, for all $j$ in $\Lambda$,

$$w_1e_j = e_0, \quad c(e_0) = w_1, \quad w_2f_j = f_0, \quad c(f_0) = w_2, \quad w_3e_j = w_3f_j = 0, \quad c(e_j, f_j) = w_1 + w_2, \tag{2.12}$$
and $(w_1f_j)_{j\in\Lambda}$ and $(w_2e_j)_{j\in\Lambda}$ are families of pairwise orthogonal elements of P(A). Writing

$$e = \bigvee_{j\in\Lambda} w_2e_j, \qquad f = \bigvee_{j\in\Lambda} w_1f_j,$$

$(e_0, f)$ and $(e, f_0)$ are centrally orthogonal elements of CP(A) such that

$$\bigvee_{j\in\Lambda}(e_j, f_j) = (e_0, f) \vee (e, f_0) = (e_0 + e, f_0 + f).$$
3. Rectangular JBW∗-Triples

Recall that a complex Banach space $B$ equipped with a triple product $(a, b, c) \mapsto \{a\,b\,c\}$ from $B \times B \times B$ to $B$ which is symmetric and linear in the first and third variables, conjugate linear in the second variable, and, for elements $a$, $b$, $c$ and $d$ in $B$, satisfies the identity

$$[D(a, b), D(c, d)] = D(\{a\,b\,c\}, d) - D(c, \{d\,a\,b\}), \tag{3.1}$$

where $D$ is the mapping from $B \times B$ to $B$ defined by $D(a, b)c = \{a\,b\,c\}$, is said to be a Jordan ∗-triple. When $D$ is continuous from $B \times B$ to the Banach space of bounded linear operators on $B$ and, for each element $a$ in $B$, $D(a, a)$ is hermitian with non-negative spectrum and satisfies $\|D(a, a)\| = \|a\|^2$, the Jordan ∗-triple $B$ is said to be a JB∗-triple. If $B$ is the dual of a Banach space $B_*$ then $B$ is called a JBW∗-triple. A subspace $J$ of $B$ is said to be a subtriple of $B$ if $\{J\,J\,J\}$ is contained in $J$, is said to be an inner ideal in $B$ if $\{J\,B\,J\}$ is contained in $J$, and is said to be an ideal in $B$ if $\{B\,J\,B\}$ and $\{J\,B\,B\}$ are contained in $J$.

Observe that a W∗-algebra $A$ endowed with the Jordan triple product forms a JBW∗-triple. Let $(p, q)$ be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in $A$. Then, the weak∗-closed inner ideal $pAq$ in $A$ is a JBW∗-triple. A JBW∗-triple $B$ is said to be rectangular if there exists a W∗-algebra $A$ and an element $(p, q)$ in CP(A) such that $B$ is isomorphic as a Jordan ∗-triple to $pAq$. The remainder of this section will be concerned with a fixed rectangular JBW∗-triple $B$, which will be identified with the rectangular JBW∗-triple $pAq$.

Lemma 3.1. Let $A$ be a W∗-algebra, let $(p, q)$ be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in $A$, and let $B$ be the rectangular JBW∗-triple $pAq$. Then, there exists an order isomorphism $(e, f) \mapsto eBf$ from the principal order ideal $CP(A)_{(p,q)}$ in CP(A) onto the complete lattice I(B) of weak∗-closed inner ideals in $B$.

Proof. Let $J$ be a weak∗-closed inner ideal in $B$. Then, for each partial isometry $u$ in $J$, the weak∗-closed inner ideal $e(u)Bf(u)$ lies in $J$. However,

$$e(u)Bf(u) = e(u)pAqf(u) = e(u)Af(u),$$

since, by Lemma 2.1, $e(u) \le p$ and $f(u) \le q$. It follows from [17], Lemma 2.3 that $J$ is an inner ideal in $A$. The result now follows from Lemma 2.1. □
Recall that the principal order ideal P(A)p in the complete orthomodular lattice P(A) coincides with the complete orthomodular lattice P(pAp) of projections in the hereditary sub-W∗-algebra pAp of A. In order to simplify notation at a later stage, for each element e in P(A)p, let $e'^{p} = p - e$. Let (e, f) lie in CP(A)(p,q) and observe that, since (e, f) ≤ (p, q), (e, f) and (p, q) are compatible in A. It follows that, for j equal to 0, 1 or 2,
$$P_j^A(e,f)\,P_2^A(p,q) = P_2^A(p,q)\,P_j^A(e,f), \tag{3.2}$$
and the restriction $P_j^B(e,f)$ of $P_j^A(e,f)$ to B is a weak∗-continuous norm-non-increasing linear projection onto a weak∗-closed subspace $B_j(e,f)$ of B. Furthermore, for each element a in B,
$$P_2^B(e,f)a = eaf, \qquad P_1^B(e,f)a = eaf'^{q} + e'^{p}af, \qquad P_0^B(e,f)a = e'^{p}af'^{q}, \tag{3.3}$$
and $P_0^B(e,f)$, $P_1^B(e,f)$ and $P_2^B(e,f)$ are pairwise orthogonal with sum $I_B$, the identity operator on B. Clearly $B_0(e,f)$ and $B_2(e,f)$ are inner ideals in B and $B_1(e,f)$ is a subtriple of B. The decomposition $B = B_0(e,f) \oplus B_1(e,f) \oplus B_2(e,f)$ of B is said to be the generalized Peirce decomposition of B relative to (e, f). From (2.6) it is clear that $D^A(e,f)P_2^A(p,q) = P_2^A(p,q)D^A(e,f)$, and, therefore, the restriction $D^B(e,f)$ of $D^A(e,f)$ to B is a weak∗-continuous linear operator on B defined, for each element a in B, by
$$D^B(e,f)a = \bigl(\tfrac{1}{2}P_1^B(e,f) + P_2^B(e,f)\bigr)a = \tfrac{1}{2}(ea + af). \tag{3.4}$$
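The formulas (3.3) and (3.4) can be checked by hand in the simplest matrix model. The following sketch is only an illustration under stated assumptions: B is taken to be the 2 × 3 matrices, p and q are the identity matrices of sizes 2 and 3, and e and f are diagonal projections. The helper names (`matmul`, `add`, `scale`, `diag`, `peirce`) are introduced here and are not part of the text.

```python
from fractions import Fraction

def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def scale(c, x):
    return [[c * v for v in row] for row in x]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def peirce(e, f, a):
    """Return (P0 a, P1 a, P2 a) as in (3.3), assuming p and q are identity matrices."""
    e_c = add(diag([1] * len(e)), scale(-1, e))   # p - e
    f_c = add(diag([1] * len(f)), scale(-1, f))   # q - f
    p2 = matmul(matmul(e, a), f)
    p1 = add(matmul(matmul(e, a), f_c), matmul(matmul(e_c, a), f))
    p0 = matmul(matmul(e_c, a), f_c)
    return p0, p1, p2

# Illustrative data (assumed, not taken from the text).
e = diag([1, 0])
f = diag([1, 1, 0])
a = [[Fraction(1), Fraction(2), Fraction(3)],
     [Fraction(4), Fraction(5), Fraction(6)]]

p0, p1, p2 = peirce(e, f, a)
total = add(add(p0, p1), p2)                       # should reproduce a
d_via_peirce = add(scale(Fraction(1, 2), p1), p2)  # (1/2) P1 a + P2 a
d_direct = scale(Fraction(1, 2), add(matmul(e, a), matmul(a, f)))  # (ea + af)/2
```

In this model the three Peirce components simply pick out complementary blocks of entries of a, which is why they sum to a and why both expressions for D^B(e, f)a agree.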
Lemma 3.2. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq and let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q). Let (e, f) be an element of CP(A)(p,q) and, for j equal to 0, 1, or 2, let the operators $P_j^B(e,f)$ and $D^B(e,f)$ be defined by (3.3) and (3.4). Then:
(i) for each complex number λ of unit modulus, the weak∗-continuous linear operator $S^B(e,f)(\lambda)$ on B, defined by
$$S^B(e,f)(\lambda) = P_0^B(e,f) + \lambda P_1^B(e,f) + \lambda^2 P_2^B(e,f),$$
is a Jordan triple automorphism of B and an isometry from B onto B such that, for each real number t, $S^B(e,f)(e^{it}) = \exp(2itD^B(e,f))$;
(ii) the weak∗-continuous linear operator $D^B(e,f)$ is hermitian.
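The automorphism property in Lemma 3.2(i) can be tested numerically in a matrix model. The sketch below is an illustration only, assuming that B is the 2 × 3 complex matrices with p and q the identity matrices, that e and f are diagonal projections, and that the triple product is {a b c} = (ab∗c + cb∗a)/2. All function names and the sample data are hypothetical.

```python
import cmath

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def scale(c, x):
    return [[c * v for v in row] for row in x]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def adjoint(x):
    """Conjugate transpose."""
    return [[x[i][j].conjugate() for i in range(len(x))] for j in range(len(x[0]))]

def triple(a, b, c):
    """Jordan triple product {a b c} = (a b* c + c b* a) / 2."""
    bs = adjoint(b)
    return scale(0.5, add(matmul(matmul(a, bs), c), matmul(matmul(c, bs), a)))

def s_op(lam, e, f, a):
    """S(lam) a = P0 a + lam P1 a + lam^2 P2 a, with p and q identity matrices."""
    e_c = add(diag([1] * len(e)), scale(-1, e))
    f_c = add(diag([1] * len(f)), scale(-1, f))
    p2 = matmul(matmul(e, a), f)
    p1 = add(matmul(matmul(e, a), f_c), matmul(matmul(e_c, a), f))
    p0 = matmul(matmul(e_c, a), f_c)
    return add(p0, add(scale(lam, p1), scale(lam * lam, p2)))

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol
               for i in range(len(x)) for j in range(len(x[0])))

# Illustrative data (assumed).
e, f = diag([1, 0]), diag([1, 1, 0])
a = [[1 + 2j, 0.5j, -1 + 0j], [2 + 0j, 1 - 1j, 3j]]
b = [[0 + 1j, 2 + 0j, 1 + 1j], [-1 + 0j, 0.5 + 0j, 2 - 1j]]
c = [[3 + 0j, -1j, 1 + 0j], [1 + 1j, 0j, -2 + 0j]]
lam = cmath.exp(0.7j)  # a complex number of unit modulus

lhs = s_op(lam, e, f, triple(a, b, c))
rhs = triple(s_op(lam, e, f, a), s_op(lam, e, f, b), s_op(lam, e, f, c))
preserves_triple_product = close(lhs, rhs)
```

In this model the check succeeds for a transparent reason: S(λ)a = (e′ + λe) a (f′ + λf), where e′ = p − e and f′ = q − f, so S(λ) is left and right multiplication by unitaries.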
C. M. Edwards, G. T. Rüttimann
Proof. This follows from the corresponding result for A, Lemma 3.1 of [20], and (3.2) and (3.3). □

Following the definition in [28], a pair (e1, f1), (e2, f2) of elements of CP(A)(p,q) is said to be compatible, written $(e_1,f_1)\,C_B\,(e_2,f_2)$, if, for j and k equal to 0, 1 or 2, the commutator $[P_j^B(e_1,f_1), P_k^B(e_2,f_2)]$ is equal to zero. The next lemma describes other conditions equivalent to that of compatibility.

Lemma 3.3. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q) and, for an element (e, f) in CP(A)(p,q), let the operator $D^B(e,f)$ be defined by (3.4). Then, for elements (e1, f1) and (e2, f2) of CP(A)(p,q), the following conditions are equivalent:
(i) $(e_1,f_1)\,C_A\,(e_2,f_2)$;
(ii) $(e_1,f_1)\,C_B\,(e_2,f_2)$;
(iii) $[D^B(e_1,f_1), D^B(e_2,f_2)] = 0$;
(iv) $D^B(e_1,f_1)(e_2Af_2) \subseteq e_2Af_2$ and $D^B(e_1,f_1)(e_2'^{p}Af_2'^{q}) \subseteq e_2'^{p}Af_2'^{q}$;
(v) $D^B(e_2,f_2)(e_1Af_1) \subseteq e_1Af_1$ and $D^B(e_2,f_2)(e_1'^{p}Af_1'^{q}) \subseteq e_1'^{p}Af_1'^{q}$;
(vi) $e_1e_2 = e_2e_1$ and $f_1f_2 = f_2f_1$.
Proof. That (i) implies (ii) follows from (3.2). That (ii) and (iii) are equivalent and imply (iv) and (v) is proved, using (3.2) and (3.3) and Lemma 3.2, in exactly the same way as the corresponding results for A proved in [20], Lemma 3.2. Since, by the same result, (vi) implies (i), it remains to show that (iv) implies (vi). Using (3.4), for all elements a in A,
$$e_2\bigl(e_1(e_2af_2) + (e_2af_2)f_1\bigr)f_2 = e_1(e_2af_2) + (e_2af_2)f_1, \tag{3.5}$$
$$e_2'^{p}\bigl(e_1(e_2'^{p}af_2'^{q}) + (e_2'^{p}af_2'^{q})f_1\bigr)f_2'^{q} = e_1(e_2'^{p}af_2'^{q}) + (e_2'^{p}af_2'^{q})f_1. \tag{3.6}$$
Multiplying (3.5) on the left by $e_2'^{p}$ and (3.6) on the left by $e_2$ gives
$$e_2'^{p}e_1e_2af_2 = 0, \tag{3.7}$$
$$e_2e_1e_2'^{p}af_2'^{q} = 0. \tag{3.8}$$
Hence, from (3.7), for all b in A,
$$b^*e_2'^{p}e_1e_2af_2 = 0, \tag{3.9}$$
and, choosing a equal to $e_2e_1e_2'^{p}b$,
$$(e_2e_1e_2'^{p}b)^*(e_2e_1e_2'^{p}b)f_2 = 0. \tag{3.10}$$
Replacing a by b in (3.8) yields
$$(e_2e_1e_2'^{p}b)f_2 = (e_2e_1e_2'^{p}b)q, \tag{3.11}$$
and, substituting from (3.11) in (3.10), gives
$$q(e_2e_1e_2'^{p}b)^*(e_2e_1e_2'^{p}b)q = 0. \tag{3.12}$$
Therefore, by (3.11), for all b in A,
$$e_2e_1e_2'^{p}bq = 0.$$
Hence, the weak∗-closed inner ideal $e(e_2e_1e_2'^{p})Aq$ is equal to zero and, from [30], 1.10.7,
$$c(e(e_2e_1e_2'^{p})) \perp c(q). \tag{3.13}$$
But $p(e_2e_1e_2'^{p}) = e_2e_1e_2'^{p}$, and, therefore, $e(e_2e_1e_2'^{p}) \le p$. Hence,
$$c(e(e_2e_1e_2'^{p})) \le c(p) = c(q), \tag{3.14}$$
and (3.13) and (3.14) imply that $c(e(e_2e_1e_2'^{p}))$ is equal to zero. It follows that $e(e_2e_1e_2'^{p})$ and, hence, $e_2e_1e_2'^{p}$, is equal to zero. Therefore,
$$e_2e_1 = e_2e_1p = e_2e_1e_2 = (e_2e_1e_2)^* = pe_1e_2 = e_1e_2,$$
as required. Similarly $f_1f_2$ and $f_2f_1$ coincide. This completes the proof of the lemma. □

For a more detailed investigation into the concept of compatibility of subtriples of a Jordan ∗-triple, the reader is referred to [12].

4. The Centroid of a Rectangular JBW∗-Triple

Let B be an arbitrary JBW∗-triple. Recall that the centroid Z(B) of B is the set of bounded linear operators T on B such that, for all elements a in B,
$$T D(a,a) = D(a,a)T. \tag{4.1}$$
For each element T in Z(B) there exists a unique element $T^{\dagger}$ in Z(B) such that, for all elements a and b in B,
$$T\{a\ b\ a\} = \{Ta\ b\ a\} = \{a\ T^{\dagger}b\ a\}. \tag{4.2}$$
A bounded linear operator T on B is said to be a multiplier if, for each element x in the set $\partial_e B_1^*$ of extreme points of the unit ball $B_1^*$ in the dual space $B^*$ of B, there exists a complex number $\lambda_T(x)$ such that
$$T^*x = \lambda_T(x)x. \tag{4.3}$$
In [11] it is shown that the centroid Z(B) of B coincides with the centralizer of B, namely, the set of multipliers T on B for which there exists a multiplier $T^{\dagger}$ on B such that, for all elements x in $\partial_e B_1^*$,
$$T^{\dagger *}x = \overline{\lambda_T(x)}\,x. \tag{4.4}$$
Recall that a linear projection P on B is said to be an M-projection if, for each element a in B, ‖a‖ = max{‖Pa‖, ‖a − Pa‖}. A subspace of B is said to be an M-summand if it is the range of a necessarily unique M-projection. The results of [3] and [23] show that the M-summands of B coincide with its weak∗-closed ideals. The following result is an immediate consequence of those of [1,2,4,9] and [10].

Lemma 4.1. Let B be a JBW∗-triple, with centroid Z(B), and let $\partial_e B_1^*$ be the set of extreme points of the unit ball $B_1^*$ in the dual space $B^*$ of B. Then:
(i) with respect to the operator norm and product, and the involution $T \mapsto T^{\dagger}$, defined by (4.2), Z(B) forms a commutative W∗-algebra;
(ii) the mapping $T \mapsto \lambda_T$ defined by (4.3) is an isometric ∗-isomorphism from Z(B) onto a sub-W∗-algebra of the space of bounded complex-valued functions on $\partial_e B_1^*$;
(iii) the set of M-projections on B, when ordered by the set inclusion of the corresponding M-summands, with complementation $P \mapsto I_B - P$, is identical to the complete Boolean lattice P(Z(B)) of self-adjoint idempotents in the commutative W∗-algebra Z(B).
For the remainder of this section B will denote the rectangular JBW∗-triple pAq discussed in Sect. 3. Before embarking upon a further discussion of the complete lattice CP(A)(p,q), some preliminary results are needed. Recall that the mapping z ↦ pz is a ∗-isomorphism from the commutative W∗-algebra c(p, q)Z(A) onto the centre Z(pAp) of the hereditary sub-W∗-algebra pAp of A. It follows that the same mapping determines an order isomorphism from the complete Boolean lattice ZP(A)c(p,q) onto ZP(pAp) or, equivalently, Z(P(A)p). In order to simplify notation, for e in P(A)p, let
$$c_p(e) = \bigwedge\{zp : z \in ZP(A)_{c(p,q)},\ e \le z\}.$$
It is clear that $c_p(e)$ coincides with c(e)p.

Lemma 4.2. Let A be a W∗-algebra, with centre Z(A), let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, and let B be the rectangular JBW∗-triple pAq, with centroid Z(B). Then, the mapping µ, defined, for each element z of the commutative W∗-algebra c(p, q)Z(A), and each element a in B, by µ(z)(a) = za, is an isometric ∗-isomorphism from c(p, q)Z(A) onto Z(B).

Proof. It is clear that for each element z in c(p, q)Z(A), µ(z) lies in Z(B) and that µ is a ∗-homomorphism. Suppose that z is an element of c(p, q)Z(A) such that zB is equal to zero. Since the support r(z) of z in A is the weak∗-limit of a sequence of finite linear combinations of elements consisting of products of z and z∗, it follows that r(z) lies in c(p, q)Z(A). Similarly, e(z) lies in c(p, q)Z(A) and, by commutativity, f(z) and e(z) coincide. Therefore, it follows that pe(z)Ae(z)q is equal to zero. By [30], 1.10.7, it can be seen that e(z)c(p, q) = c(pe(z)) ⊥ c(e(z)q) = e(z)c(p, q), and it follows that e(z)c(p, q) is equal to zero. But e(z) ≤ c(p, q), and, therefore, e(z) is equal to zero, which implies that z is equal to zero. It follows that µ is a ∗-isomorphism
into Z(B). Let P be an M-projection on B. Then, by [23], the range PB of P is a weak∗-closed ideal in B such that $B = PB \oplus_M (PB)^{\perp}$, where $(PB)^{\perp}$ is the set of elements b in B such that D(PB, b) is equal to zero. It follows from Lemma 3.1 and the results of [19] that there exists an element (e, f) in CP(A)(p,q) such that $B = eAf \oplus_M e'^{p}Af'^{q}$. Therefore, by Lemma 3.2,
$$eAf'^{q} = e'^{p}Af = \{0\}.$$
Hence, by [30], 1.10.7,
$$f'^{q} \le c_q(f'^{q}) \le c_q(e)'^{q} = c(e,f)'q \le f'^{q}.$$
It follows that
$$f'^{q} = c(e,f)'q, \qquad e'^{p} = c(e,f)'p,$$
and, hence, that
$$e = c(e,f)p, \qquad f = c(e,f)q.$$
Therefore, P = µ(c(e, f)). Lemma 4.1 shows that µ maps onto Z(B), as required. □

This result has the following immediate corollary.

Corollary 4.3. The mapping µ defined in Lemma 4.2, when restricted to the complete Boolean lattice ZP(A)c(p,q), is an order isomorphism onto the complete Boolean lattice of M-projections on B.

It is now possible to examine the complete lattice CP(A)(p,q) in more detail.

Lemma 4.4. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q) and, for an element (e, f) in CP(A)(p,q), let
$$(e,f)^{\prime(p,q)} = \bigl(c(f'^{q})e'^{p},\ c(e'^{p})f'^{q}\bigr).$$
Then:
(i) for each element (e, f) in CP(A)(p,q),
$$(e,f)^{\prime(p,q)\prime(p,q)} = \bigl((c(f'^{q})e'^{p})'^{p},\ (c(e'^{p})f'^{q})'^{q}\bigr),$$
and
$$c\bigl((e,f)^{\prime(p,q)\prime(p,q)}\bigr) = c(e,f); \tag{4.5}$$
(ii) the mapping $(e,f) \mapsto (e,f)^{\prime(p,q)}$ is order reversing;
(iii) for each element (e, f) in CP(A)(p,q),
$$(e,f) \le (e,f)^{\prime(p,q)\prime(p,q)} = (e,f)^{\prime(p,q)\prime(p,q)\prime(p,q)\prime(p,q)} = \dots,$$
$$(e,f)^{\prime(p,q)} = (e,f)^{\prime(p,q)\prime(p,q)\prime(p,q)} = (e,f)^{\prime(p,q)\prime(p,q)\prime(p,q)\prime(p,q)\prime(p,q)} = \dots.$$

Proof. Observe that
$$(c(e'^{p})f'^{q})'^{q} = (c(e'^{p})q \wedge f'^{q})'^{q} = (c(e'^{p})q)'^{q} \vee (f'^{q})'^{q} = (c(e'^{p})q)'^{q} \vee f. \tag{4.6}$$
Therefore,
$$c\bigl((c(e'^{p})f'^{q})'^{q}\bigr) = c\bigl((c(e'^{p})q)'^{q}\bigr) \vee c(e,f). \tag{4.7}$$
Observe that, since e ≤ c(e),
$$c(e)'p \le e'^{p} \le c(e'^{p}),$$
which implies that $c(e)'c(p,q) \le c(e'^{p})$. Therefore, $c(e)'q \le c(e'^{p})q$, which shows that
$$(c(e'^{p})q)'^{q} \le (c(e)'q)'^{q} = c(e)q.$$
Hence
$$c\bigl((c(e'^{p})q)'^{q}\bigr) \le c(e,f)c(p,q) = c(e,f),$$
and it follows from (4.7) that $c((c(e'^{p})f'^{q})'^{q}) = c(e,f)$. Similarly, $c((c(f'^{q})e'^{p})'^{p})$ is also equal to c(e, f) and (i) follows. Let (e1, f1) and (e2, f2) be elements of CP(A)(p,q) such that (e1, f1) ≤ (e2, f2). Then, $e_1 \le e_2$ and $f_1 \le f_2$, and $e_2'^{p} \le e_1'^{p}$ and $f_2'^{q} \le f_1'^{q}$. Hence,
$$c(f_2'^{q})e_2'^{p} \le c(f_2'^{q})e_1'^{p} \le c(f_1'^{q})e_1'^{p},$$
and, similarly, $c(e_2'^{p})f_2'^{q} \le c(e_1'^{p})f_1'^{q}$. Therefore, $(e_2,f_2)^{\prime(p,q)} \le (e_1,f_1)^{\prime(p,q)}$ and the proof of (ii) is complete. From (i) it can be seen that $(e,f) \le (e,f)^{\prime(p,q)\prime(p,q)}$, and combining this result with (ii) completes the proof of (iii). □

In general, the complete lattice CP(A)(p,q) is not orthomodular. However, it is still possible to have a notion of orthogonality in CP(A)(p,q). An element (e1, f1) in CP(A)(p,q) is said to be orthogonal to an element (e2, f2) in CP(A)(p,q), written $(e_1,f_1) \perp_{(p,q)} (e_2,f_2)$, if $(e_2,f_2) \le (e_1,f_1)^{\prime(p,q)}$. The next lemma shows that this is a reasonable definition.

Lemma 4.5. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q). Then, for elements (e1, f1) and (e2, f2) in CP(A)(p,q), the following conditions are equivalent:
(i) $(e_1,f_1) \perp_{(p,q)} (e_2,f_2)$;
(ii) $(e_2,f_2) \perp_{(p,q)} (e_1,f_1)$;
(iii) $e_1 + e_2 \le p$ and $f_1 + f_2 \le q$.

Proof. The equivalence of (i) and (ii) is immediate from Lemma 4.4. Furthermore, if (i) holds, then
$$e_2 \le c(f_1'^{q})e_1'^{p} \le e_1'^{p},$$
and $e_1 + e_2 \le p$ as required. Similarly, $f_1 + f_2 \le q$ and (iii) holds. Conversely, if these conditions hold then $e_2 \le e_1'^{p}$, and, since $f_2 \le f_1'^{q}$,
$$e_2 \le c(e_2,f_2) \le c(f_1'^{q}).$$
Therefore, $e_2 \le c(f_1'^{q})e_1'^{p}$ and, similarly, $f_2 \le c(e_1'^{p})f_1'^{q}$, which together imply that (i) holds. □

Lemma 4.2 and Lemma 4.5 lead to the following result.

Corollary 4.6. Let (e1, f1), (e2, f2) be a pair of orthogonal elements in CP(A)(p,q). Then (e1, f1) and (e2, f2) are compatible.

An element (g, h) in CP(A)(p,q) is said to be central if, for each element (e, f) of CP(A)(p,q),
$$(e,f) = \bigl((g,h) \wedge (e,f)\bigr) \vee \bigl((g,h)^{\prime(p,q)} \wedge (e,f)\bigr). \tag{4.8}$$
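The complementation of Lemma 4.4 and the orthogonality criterion of Lemma 4.5 can be explored by brute force in a small model. The sketch below makes simplifying assumptions that are not in the text: A is a matrix factor, so the central support of a projection is 1 when the projection is non-zero and 0 otherwise, and only diagonal projections are considered, each encoded as the set of indices it selects. The names `subsets`, `complement`, `leq` and `orthogonal` are hypothetical.

```python
from itertools import combinations

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

def complement(x, p, q):
    """(e, f)' = (c(f') e', c(e') f'), with e' = p - e and f' = q - f.

    Factor assumption: c(g) is 1 if g is non-zero and 0 otherwise.
    """
    e, f = x
    e_c, f_c = p - e, q - f
    empty = frozenset()
    return (e_c if f_c else empty, f_c if e_c else empty)

def leq(x, y):
    return x[0] <= y[0] and x[1] <= y[1]

def orthogonal(x, y, p, q):
    """x is orthogonal to y when y lies below the complement of x."""
    return leq(y, complement(x, p, q))

p = frozenset({0, 1})       # diagonal projection p (assumed)
q = frozenset({2, 3, 4})    # diagonal projection q (assumed)

# Centrally equivalent pairs below (p, q): both components zero or both non-zero.
lattice = [(e, f) for e in subsets(p) for f in subsets(q) if bool(e) == bool(f)]

# Lemma 4.5: orthogonality is equivalent to e1 + e2 <= p and f1 + f2 <= q,
# which for diagonal projections means disjoint index sets.
lemma_4_5 = all(
    orthogonal(x, y, p, q) == (not (x[0] & y[0]) and not (x[1] & y[1]))
    for x in lattice for y in lattice)

# Lemma 4.4(iii): x <= x'' and x' = x'''.
def cc(x):
    return complement(complement(x, p, q), p, q)

lemma_4_4 = all(leq(x, cc(x)) and complement(x, p, q) == complement(cc(x), p, q)
                for x in lattice)

# A pair whose double complement is strictly larger than the pair itself.
x = (p, frozenset({2}))
double = cc(x)
```

In this model the pair (p, f) with f strictly below q has complement (0, 0), so its double complement is (p, q), which gives a concrete instance of the strict inequality allowed by Lemma 4.4(iii).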
The next result describes the central elements of CP(A)(p,q).

Lemma 4.7. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q). Then, an element (g, h) in CP(A)(p,q) is central if and only if there exists an element z in the complete Boolean lattice ZP(A)c(p,q) such that
$$g = zp, \qquad h = zq,$$
and, in this case, z is uniquely defined.

Proof. Suppose that z is an element of ZP(A)c(p,q). Observe that
$$(zp,zq)^{\prime(p,q)} = \bigl(c((zq)'^{q})(zp)'^{p},\ c((zp)'^{p})(zq)'^{q}\bigr) = \bigl(c(p,q)z'p,\ c(p,q)z'q\bigr) = (z'p, z'q).$$
It follows that, for an arbitrary element (e, f) of CP(A),
$$\bigl((zp,zq) \wedge (e,f)\bigr) \vee \bigl((z'p,z'q) \wedge (e,f)\bigr) = (e,f),$$
as required. Conversely, let (g, h) be a central element of CP(A)(p,q). Then, for each element (e, f) in CP(A)(p,q),
$$(e,f) = \bigl((g,h) \wedge (e,f)\bigr) \vee \bigl((g,h)^{\prime(p,q)} \wedge (e,f)\bigr) = \bigl((g,h) \wedge (e,f)\bigr) \vee \bigl((c(h'^{q})g'^{p},\ c(g'^{p})h'^{q}) \wedge (e,f)\bigr).$$
Therefore,
$$\begin{aligned} e &= c(h \wedge f)(g \wedge e) \vee c\bigl(c(g'^{p})h'^{q} \wedge f\bigr)\bigl(c(h'^{q})g'^{p} \wedge e\bigr) \\ &= c(h \wedge f)(g \wedge e) \vee c(g'^{p})c(h'^{q})c(h'^{q} \wedge f)(g'^{p} \wedge e) \\ &\le (g \wedge e) \vee (g'^{p} \wedge e) \le e. \end{aligned}$$
It follows that
$$e = (g \wedge e) \vee (g'^{p} \wedge e)$$
and, hence, that g is an element of Z(P(A)p). Therefore, by the remarks prior to Lemma 4.2, there exists a unique element z in ZP(A)c(p,q) such that g is equal to zp. Similarly, there exists a unique element w in ZP(A)c(p,q) such that h is equal to wq. However,
$$z = zc(p,q) = c(zp) = c(g) = c(g,h) = c(h) = c(wq) = wc(p,q) = w,$$
and the proof of the lemma is complete. □

Combining this result with Lemma 3.1 and Corollary 4.3 yields the following result.

Corollary 4.8. The centre ZCP(A)(p,q) of the complete lattice CP(A)(p,q) is a complete Boolean lattice that is order isomorphic to the complete Boolean lattice of M-projections on B.

Observe that it is a consequence of this result that the centre ZI(B) of the complete lattice of weak∗-closed inner ideals in B is the complete Boolean lattice of weak∗-closed ideals in B and the centre ZS(B) of the complete lattice of structural projections on B is the complete Boolean lattice of M-projections on B. Two elements (e1, f1) and (e2, f2) of CP(A)(p,q) are said to be centrally orthogonal if there exists an element z in ZP(A)c(p,q) such that $(e_1,f_1) \le (zp,zq)$ and $(e_2,f_2) \le (zp,zq)^{\prime(p,q)}$. Observe that centrally orthogonal elements are orthogonal and therefore compatible. The proof of the following result, which follows closely that of [20], Theorem 4.6, is straightforward.

Theorem 4.9. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q). Then:
(i) the elements (e1, f1) and (e2, f2) of CP(A)(p,q) are centrally orthogonal if and only if $c(e_1,f_1) + c(e_2,f_2) \le c(p,q)$ and, in this case, $(e_1,f_1) \vee (e_2,f_2) = (e_1 + e_2, f_1 + f_2)$;
(ii) for each element (e, f) in CP(A)(p,q) and each element z in ZP(A)c(p,q), the pairs (ze, zf) and (z′e, z′f) are centrally orthogonal elements of CP(A)(p,q) such that $(ze,zf) \vee (z'e,z'f) = (e,f)$.
The notion of rigid collinearity, discussed earlier for a W∗-algebra, can be extended to the rectangular JBW∗-triple B. A pair (e1, f1), (e2, f2) of elements in CP(A)(p,q) is said to be rigidly collinear, denoted by $(e_1,f_1) \top_{(p,q)} (e_2,f_2)$, if
$$B_2(e_1,f_1) \subseteq B_1(e_2,f_2), \qquad B_2(e_2,f_2) \subseteq B_1(e_1,f_1).$$
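The two inclusions defining rigid collinearity can be tested directly in a finite model. The sketch below is an illustration under assumptions that go beyond the text: A is a matrix factor, all projections are diagonal and encoded as index sets, and a Peirce space is represented by the set of matrix-unit positions that span it. The comparison criterion `matrix_criterion` is the description of rigid collinearity for matrix triples given in Sect. 6; all function names are hypothetical.

```python
from itertools import combinations, product

def subsets(s):
    items = sorted(s)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

def peirce2(e, f):
    """Matrix-unit positions spanning B_2(e, f) = e B f."""
    return set(product(e, f))

def peirce1(e, f, p, q):
    """Matrix-unit positions spanning B_1(e, f) = e B (q - f) + (p - e) B f."""
    return set(product(e, q - f)) | set(product(p - e, f))

def rigidly_collinear(x, y, p, q):
    """Both inclusions B_2(x) in B_1(y) and B_2(y) in B_1(x)."""
    return (peirce2(*x) <= peirce1(*y, p, q)
            and peirce2(*y) <= peirce1(*x, p, q))

def matrix_criterion(x, y):
    """Either e1 = e2 with f1, f2 orthogonal, or f1 = f2 with e1, e2 orthogonal."""
    (e1, f1), (e2, f2) = x, y
    return (e1 == e2 and not (f1 & f2)) or (f1 == f2 and not (e1 & e2))

p = frozenset({0, 1, 2})    # rows (assumed sizes r = s = 3)
q = frozenset({3, 4, 5})    # columns

nonzero = [(e, f) for e in subsets(p) for f in subsets(q) if e and f]
agree = all(rigidly_collinear(x, y, p, q) == matrix_criterion(x, y)
            for x in nonzero for y in nonzero)
```

The check only covers diagonal projections, so it illustrates the definition rather than proving anything about general pairs.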
Observe that, because (p, q) and (e1, f1), and (p, q) and (e2, f2), are compatible, it follows that $(e_1,f_1) \top_{(p,q)} (e_2,f_2)$ if and only if $(e_1,f_1) \top (e_2,f_2)$, and (e1, f1) ≤ (p, q) and (e2, f2) ≤ (p, q). Therefore, Theorem 5.3 of [20] and the previous results of this section lead immediately to the following characterization of rigid collinearity.

Theorem 4.10. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let (e1, f1) and (e2, f2) be elements of the principal order ideal CP(A)(p,q) in CP(A) generated by (p, q). Then $(e_1,f_1) \top_{(p,q)} (e_2,f_2)$ if and only if there exist elements w1, w2 and w3 of ZP(A)c(p,q) such that
$$\begin{gathered} w_1 + w_2 + w_3 = c(p,q), \\ w_1e_1 = w_1e_2, \qquad w_1f_1 + w_1f_2 \le w_1q, \\ w_2e_1 + w_2e_2 \le w_2p, \qquad w_2f_1 = w_2f_2, \\ w_3e_1 = w_3e_2 = w_3f_1 = w_3f_2 = 0, \end{gathered}$$
and, in this case,
$$c(e_1,f_1) = c(e_2,f_2) \le w_1 + w_2.$$
Furthermore, a unique such triple w1, w2 and w3 of elements of ZP(A)c(p,q) satisfying the additional conditions $w_1 \le c(e_1,f_1)$ and $w_2 \le c(e_1,f_1)$ exists. In this case:
(i) $c(e_1,f_1) = c(e_2,f_2) = w_1 + w_2$;
(ii) writing
$$w_1e_1 = w_1e_2 = e_0, \quad w_2f_1 = w_2f_2 = f_0, \quad w_2e_1 + w_2e_2 = e, \quad w_1f_1 + w_1f_2 = f,$$
e and e0 are elements of P(A)p and f and f0 are elements of P(A)q, such that $c(e_0) = w_1$ and $c(f_0) = w_2$, and (e0, f) and (e, f0) are centrally orthogonal elements of CP(A)(p,q) such that
$$(e_1,f_1) \vee (e_2,f_2) = (e_0,f) \vee (e,f_0) = (e_0 + e, f + f_0).$$

The notion of rigid collinearity can be extended to any family $((e_j,f_j))_{j\in\Lambda}$ of elements of CP(A)(p,q). Such a family is said to be rigidly collinear if every pair of distinct elements of the family is rigidly collinear. The next result, the proof of which follows closely that of Theorem 5.5 of [20], describes such families.

Theorem 4.11. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let $((e_j,f_j))_{j\in\Lambda}$ be a rigidly collinear family of elements of the principal order ideal CP(A)(p,q) in CP(A) generated by (p, q). For j and k in Λ and l equal to 1, 2 or 3, let $w(j,k)_l$ be the unique element of ZP(A)c(p,q) such that
$$\begin{gathered} w(j,k)_1e_j = w(j,k)_1e_k, \qquad w(j,k)_1f_j + w(j,k)_1f_k \le w(j,k)_1q, \\ w(j,k)_2e_j + w(j,k)_2e_k \le w(j,k)_2p, \qquad w(j,k)_2f_j = w(j,k)_2f_k, \\ w(j,k)_3e_j = w(j,k)_3f_j = w(j,k)_3e_k = w(j,k)_3f_k = 0, \\ w(j,k)_1 \le c(e_j,f_j), \qquad w(j,k)_2 \le c(e_j,f_j), \\ \sum_{l=1}^{3} w(j,k)_l = c(p,q). \end{gathered}$$
Then, there exist unique pairwise orthogonal elements w1, w2 and w3 of ZP(A)c(p,q) of sum c(p, q) and elements e0 in P(A)p and f0 in P(A)q such that:
(i) for all distinct j and k in Λ, and l equal to 1, 2 or 3, $w(j,k)_l = w_l$;
(ii) for all j in Λ,
$$\begin{gathered} w_1e_j = e_0, \qquad w_2f_j = f_0, \qquad w_3e_j = w_3f_j = 0, \\ c(e_0) = w_1, \qquad c(f_0) = w_2, \qquad c(e_j,f_j) = w_1 + w_2, \end{gathered}$$
where $(w_2e_j)_{j\in\Lambda}$ and $(w_1f_j)_{j\in\Lambda}$ are families of pairwise orthogonal elements of P(A)p and P(A)q, respectively;
(iii) writing
$$e = \bigvee_{j\in\Lambda} w_2e_j, \qquad f = \bigvee_{j\in\Lambda} w_1f_j,$$
(e0, f) and (e, f0) are centrally orthogonal elements of CP(A)(p,q) such that
$$\bigvee_{j\in\Lambda}(e_j,f_j) = (e_0,f) \vee (e,f_0) = (e_0 + e, f_0 + f).$$

Recall that a W∗-algebra is said to be a factor if its centre consists of complex multiples of its unit. In the case in which c(p, q)A is a factor, Theorem 4.11 has a particularly simple form.

Corollary 4.12. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A such that c(p, q)A is a factor, let B be the rectangular JBW∗-triple pAq, and let $((e_j,f_j))_{j\in\Lambda}$ be a rigidly collinear family of elements of the principal order ideal CP(A)(p,q) in CP(A) generated by (p, q). Then, either there exists an element e0 in P(A)p such that, for all j in Λ, $e_j$ is equal to e0 and $(f_j)_{j\in\Lambda}$ is a family of pairwise orthogonal elements of P(A)q, or there exists an element f0 in P(A)q such that, for all j in Λ, $f_j$ is equal to f0 and $(e_j)_{j\in\Lambda}$ is a family of pairwise orthogonal elements of P(A)p.

Proof. Since ZP(A)c(p,q) is the set {0, c(p, q)} it follows that, in the theorem, one of w1, w2 and w3 is equal to c(p, q) and the other two are equal to zero. If w3 is equal to c(p, q), then all elements of the rigidly collinear family are zero, giving a contradiction. If w1 is equal to c(p, q) and w2 is equal to zero, then, by Theorem 4.11, for all j in Λ, $e_j$ is equal to e0 and $(f_j)_{j\in\Lambda}$ is a family of pairwise orthogonal elements of P(A)q. The other possibility occurs if w1 is equal to zero and w2 is equal to c(p, q). □

5. Measures on the Complete Lattice CP(A)(p,q)

In this section certain measures on the complete lattice CP(A)(p,q) are analyzed. A measure m on CP(A)(p,q) is a mapping from CP(A)(p,q) to C such that, for each pair (e1, f1) and (e2, f2) of elements of CP(A)(p,q) that are either centrally orthogonal or rigidly collinear,
$$m\bigl((e_1,f_1) \vee (e_2,f_2)\bigr) = m((e_1,f_1)) + m((e_2,f_2)).$$
The measure m is said to be bounded if {m((e, f)) : (e, f) ∈ CP(A)(p,q)} is a bounded subset of C. Recall that, according to [34], a mapping ν from P(A)p × P(A)q to C is said to be a quantum bimeasure if, for e1, e2 in P(A)p such that e1 + e2 ≤ p and f in P(A)q,
$$\nu(e_1 + e_2, f) = \nu(e_1,f) + \nu(e_2,f),$$
and, for e in P(A)p and f1 and f2 in P(A)q with f1 + f2 ≤ q,
$$\nu(e, f_1 + f_2) = \nu(e,f_1) + \nu(e,f_2).$$
The quantum bimeasure ν is said to be bounded if the set {ν(e, f) : e ∈ P(A)p, f ∈ P(A)q} is a bounded subset of C. The first lemma describes the relationship that exists between measures on CP(A)(p,q) and quantum bimeasures on P(A)p × P(A)q. Its proof is very similar to that of Lemma 6.1 of [20].

Lemma 5.1. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q). Then, there exists a bijection $m \mapsto \nu_m$ from the set of measures m on CP(A)(p,q) onto the set of quantum bimeasures ν on P(A)p × P(A)q with the property that, for all elements z in ZP(A)c(p,q), and all elements e in P(A)p and f in P(A)q, ν(ze, f) = ν(e, zf), defined, for e in P(A)p and f in P(A)q, by
$$\nu_m(e,f) = m\bigl((c(f)e, c(e)f)\bigr).$$
The mapping sends the set of bounded measures onto the set of bounded quantum bimeasures.

Using the results of the previous section, this result can now be combined with those of Wright [34] to give a precise description of the bounded measures on CP(A)(p,q), at least in the situation in which Gleason’s Theorem holds.

Theorem 5.2. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A such that neither of the hereditary sub-W∗-algebras pAp and qAq of A contains a weak∗-closed ideal of Type I2, let B be the rectangular JBW∗-triple pAq, and let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q). Let $S_b(pAp \times qAq)$ be the space of bounded sesquilinear functionals φ from pAp × qAq to C such that, for all elements a in pAp, b in qAq and c in c(p, q)Z(A),
$$\phi(ca, b) = \phi(a, c^*b).$$
Then, the mapping $\phi \mapsto m_\phi$ defined, for each element φ in $S_b(pAp \times qAq)$ and each element (e, f) of CP(A)(p,q), by
$$m_\phi((e,f)) = \phi(e,f)$$
is a bijection from $S_b(pAp \times qAq)$ onto the space $M_b(CP(A)_{(p,q)})$ of bounded measures on CP(A)(p,q).
Proof. Let φ be an element of $S_b(pAp \times qAq)$ and denote by $\nu_\phi$ its restriction to P(A)p × P(A)q. Then, $\nu_\phi$ is clearly a quantum bimeasure and, for all elements e in P(A)p, f in P(A)q and z in ZP(A)c(p,q),
$$\nu_\phi(ze,f) = \phi(ze,f) = \phi(e,z^*f) = \phi(e,zf) = \nu_\phi(e,zf).$$
It follows from Lemma 5.1 that the function $m_\phi$ from CP(A)(p,q) to C, defined above, is a bounded measure. Let m be a bounded measure on CP(A)(p,q) and let $\nu_m$ be the bounded quantum bimeasure defined in Lemma 5.1. Then, by [34], Theorem 1, there exists a unique bounded bilinear functional $\psi_m$ on pAp × qAq extending $\nu_m$. For a in pAp and b in qAq, define a mapping $\phi_m$ from pAp × qAq to C by
$$\phi_m(a,b) = \psi_m(a,b^*).$$
Then, $\phi_m$ is a bounded sesquilinear functional from pAp × qAq to C extending $\nu_m$. Moreover, for e in P(A)p, f in P(A)q, and z in ZP(A)c(p,q),
$$\phi_m(ze,f) = \nu_m(ze,f) = \nu_m(e,zf) = \phi_m(e,zf).$$
Since the set of finite linear combinations of elements of P(A)p is dense in pAp for the norm topology, it follows that, for all elements a in pAp, b in qAq, and z in ZP(A)c(p,q),
$$\phi_m(za,b) = \phi_m(a,zb).$$
The space of finite linear combinations of elements of ZP(A)c(p,q) is dense in c(p, q)Z(A) in the norm topology. Recalling that $\phi_m$ is conjugate linear in the second variable, it follows that, for all elements a in pAp, b in qAq and c in c(p, q)Z(A),
$$\phi_m(ca,b) = \phi_m(a,c^*b).$$
Hence, $\phi_m$ lies in $S_b(pAp \times qAq)$ and, for (e, f) in CP(A)(p,q),
$$m_{\phi_m}((e,f)) = \phi_m(e,f) = \psi_m(e,f) = \nu_m(e,f) = m((e,f)).$$
This completes the proof of the theorem. □

A measure m on CP(A)(p,q) is said to be normal if, for every centrally orthogonal or rigidly collinear family $((e_j,f_j))_{j\in\Lambda}$ of elements of CP(A)(p,q),
$$m\Bigl(\bigvee_{j\in\Lambda}(e_j,f_j)\Bigr) = \sum_{j\in\Lambda} m((e_j,f_j)),$$
where the sum is defined to be the limit of the net formed by taking sums over finite subsets of Λ. The final result describes the normal bounded measures on CP(A)(p,q).

Theorem 5.3. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A such that neither of the hereditary sub-W∗-algebras pAp and qAq of A contains a weak∗-closed ideal of Type I2, let B be the rectangular JBW∗-triple pAq, let CP(A)(p,q) be the principal order ideal in CP(A) generated by (p, q) and let $\phi \mapsto m_\phi$ be the bijection, defined in Theorem 5.2, from the space $S_b(pAp \times qAq)$ of bounded sesquilinear functionals φ from pAp × qAq to C such that, for all elements a in pAp, b in qAq, and c in c(p, q)Z(A), φ(ca, b) = φ(a, c∗b), onto the space $M_b(CP(A)_{(p,q)})$ of bounded measures on CP(A)(p,q). Then φ is separately weak∗-continuous if and only if $m_\phi$ is normal.
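Proof. Suppose that φ is separately weak∗-continuous and let $((e_j,f_j))_{j\in\Lambda}$ be a centrally orthogonal family in CP(A)(p,q). Let Γ be the directed set of finite subsets of Λ. Then, using [20], Lemma 2.3, and the fact that, for j and k distinct elements of Λ, $\phi(e_j,f_k) = 0$,
$$\begin{aligned} m_\phi\Bigl(\bigvee_{j\in\Lambda}(e_j,f_j)\Bigr) &= m_\phi\Bigl(\Bigl(\bigvee_{j\in\Lambda}e_j,\ \bigvee_{j\in\Lambda}f_j\Bigr)\Bigr) = \phi\Bigl(\bigvee_{j\in\Lambda}e_j,\ \bigvee_{k\in\Lambda}f_k\Bigr) \\ &= \lim_{\alpha\in\Gamma}\phi\Bigl(\sum_{j\in\alpha}e_j,\ \bigvee_{k\in\Lambda}f_k\Bigr) = \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}\phi\Bigl(e_j,\ \bigvee_{k\in\Lambda}f_k\Bigr) \\ &= \lim_{\alpha\in\Gamma}\lim_{\beta\in\Gamma}\sum_{j\in\alpha}\sum_{k\in\beta}\phi(e_j,f_k) = \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}\phi(e_j,f_j) \\ &= \sum_{j\in\Lambda}m_\phi((e_j,f_j)), \end{aligned}$$
as required. Now let $((e_j,f_j))_{j\in\Lambda}$ be a rigidly collinear family in CP(A)(p,q). Let w1, w2 and w3 be the elements of ZP(A)c(p,q) and (e0, f) and (e, f0) the centrally orthogonal elements of CP(A)(p,q) defined in Theorem 4.11. Then, using the separate weak∗-continuity of φ and [20], Lemma 2.3,
$$\begin{aligned} m_\phi\Bigl(\bigvee_{j\in\Lambda}(e_j,f_j)\Bigr) &= m_\phi\bigl((e_0,f) \vee (e,f_0)\bigr) = m_\phi((e_0,f)) + m_\phi((e,f_0)) \\ &= \phi(e_0,f) + \phi(e,f_0) = \phi\Bigl(e_0,\ \bigvee_{j\in\Lambda}w_1f_j\Bigr) + \phi\Bigl(\bigvee_{j\in\Lambda}w_2e_j,\ f_0\Bigr) \\ &= \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}\phi(e_0,w_1f_j) + \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}\phi(w_2e_j,f_0) \\ &= \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}\bigl(\phi(w_1e_j,w_1f_j) + \phi(w_2e_j,w_2f_j)\bigr) \\ &= \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}\bigl(m_\phi((w_1e_j,w_1f_j)) + m_\phi((w_2e_j,w_2f_j))\bigr) \\ &= \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}m_\phi\bigl((w_1e_j,w_1f_j) \vee (w_2e_j,w_2f_j)\bigr), \end{aligned}$$
using the fact that $(w_1e_j,w_1f_j)$ and $(w_2e_j,w_2f_j)$ are centrally orthogonal. Hence,
$$\begin{aligned} m_\phi\Bigl(\bigvee_{j\in\Lambda}(e_j,f_j)\Bigr) &= \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}m_\phi\bigl((w_1e_j + w_2e_j,\ w_1f_j + w_2f_j)\bigr) \\ &= \lim_{\alpha\in\Gamma}\sum_{j\in\alpha}m_\phi((e_j,f_j)) = \sum_{j\in\Lambda}m_\phi((e_j,f_j)), \end{aligned}$$
since $w_3e_j = w_3f_j = 0$. It follows that the measure $m_\phi$ is normal. Suppose now that the measure $m_\phi$ is normal and that f is an element of P(A)q. Let x be the bounded linear functional on pAp defined, for each element a in pAp, by x(a) = φ(a, f).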
Let $(e_j)_{j\in\Lambda}$ be a centrally orthogonal family of elements of P(A)p. Using the properties of φ, and the fact that $((c(f)e_j, c(e_j)f))_{j\in\Lambda}$ is a centrally orthogonal family in CP(A)(p,q), observe that
$$\begin{aligned} x\Bigl(\bigvee_{j\in\Lambda}e_j\Bigr) &= \phi\Bigl(\bigvee_{j\in\Lambda}e_j,\ f\Bigr) = \phi\Bigl(c(f)\bigvee_{j\in\Lambda}e_j,\ c\Bigl(\bigvee_{j\in\Lambda}e_j\Bigr)f\Bigr) \\ &= m_\phi\Bigl(\Bigl(c(f)\bigvee_{j\in\Lambda}e_j,\ c\Bigl(\bigvee_{j\in\Lambda}e_j\Bigr)f\Bigr)\Bigr) = m_\phi\Bigl(\Bigl(\bigvee_{j\in\Lambda}c(f)e_j,\ \bigvee_{j\in\Lambda}c(e_j)f\Bigr)\Bigr) \\ &= m_\phi\Bigl(\bigvee_{j\in\Lambda}(c(f)e_j, c(e_j)f)\Bigr) = \sum_{j\in\Lambda}m_\phi((c(f)e_j, c(e_j)f)) \\ &= \sum_{j\in\Lambda}\phi(e_j,f) = \sum_{j\in\Lambda}x(e_j). \end{aligned}$$
Now, let $(e_j)_{j\in\Lambda}$ be an orthogonal family of elements of P(A)p each having central support equal to z. Notice that $((c(f)e_j, zf))_{j\in\Lambda}$ is a rigidly collinear family of elements of CP(A)(p,q). Therefore,
$$\begin{aligned} x\Bigl(\bigvee_{j\in\Lambda}e_j\Bigr) &= \phi\Bigl(\bigvee_{j\in\Lambda}e_j,\ f\Bigr) = \phi\Bigl(c(f)\bigvee_{j\in\Lambda}e_j,\ c\Bigl(\bigvee_{j\in\Lambda}e_j\Bigr)f\Bigr) \\ &= m_\phi\Bigl(\Bigl(c(f)\bigvee_{j\in\Lambda}e_j,\ c\Bigl(\bigvee_{j\in\Lambda}e_j\Bigr)f\Bigr)\Bigr) = m_\phi\Bigl(\Bigl(\bigvee_{j\in\Lambda}c(f)e_j,\ \bigvee_{j\in\Lambda}c(e_j)f\Bigr)\Bigr) \\ &= m_\phi\Bigl(\bigvee_{j\in\Lambda}(c(f)e_j, zf)\Bigr) = \sum_{j\in\Lambda}m_\phi((c(f)e_j, zf)) \\ &= \sum_{j\in\Lambda}\phi(e_j,f) = \sum_{j\in\Lambda}x(e_j). \end{aligned}$$
Lemma 6.5 of [20] shows that the bounded linear functional x is weak∗-continuous on pAp. Since the set of finite linear combinations of elements of P(A)q is dense in qAq in the norm topology, by approximating a fixed element b in qAq by a finite linear combination of elements of P(A)q, it can be seen that the bounded linear functional mapping a ↦ φ(a, b) is also weak∗-continuous on pAp. Similarly, for each fixed element a in pAp, the mapping b ↦ φ(a, b) is weak∗-continuous on qAq. This completes the proof of the theorem. □

6. Examples

In this section three examples will be considered. The first one to be considered is that of a commutative W∗-algebra A. In this case there exists a hyperstonian space Ω such that A is isomorphic to the commutative W∗-algebra C(Ω) of continuous complex-valued functions on Ω, and the complete orthomodular lattice P(A) is Boolean and corresponds to the family of characteristic functions of clopen subsets of Ω. It follows that a pair of elements e and f in P(A) are centrally equivalent if and only if they coincide. For a fixed element p in P(A), the rectangular JBW∗-triple pAp is the weak∗-closed ideal pA in A, and it is a commutative W∗-algebra that is isomorphic to the commutative
W∗ -algebra C(p ), where p is the clopen subset of corresponding to p. There is clearly no loss of generality in assuming that p is the identity in A and that A and B coincide. Observe that the centre Z(A) of A coincides with A, and that the mapping T 7 → T 1 is an isomorphism from the centroid of A onto A. In this example the structure of CP(A) is quite simple, because the mapping e 7→ (e, e) is an order isomorphism from the complete Boolean lattice P(A) onto CP(A). Notice that a pair (e1 , e1 ) and (e2 , e2 ) of elements of CP(A) is always compatible, is orthogonal or centrally orthogonal if and only if e1 ⊥ e2 and is rigidly collinear if and only if e1 and e2 are both zero. Let φ be a bounded sesquilinear functional on A × A such that, for elements a, b and c in A, φ(ca, b) = φ(a, c∗ b).
(6.1)
Notice that, from (6.1), for e and f in P(A) with e ≤ f , φ(e, f ) = φ(ef, f ) = φ(f, ef ) = φ(f, e).
(6.2)
Hence, for arbitrary e and f , using (6.2), φ(e, f ) + φ(f, f ) − φ(ef, f ) = φ(e + f − ef, f ) = φ(e ∨ f, f ) = φ(f, e ∨ f ) = φ(f, e + f − ef ) = φ(f, e) + φ(f, f ) − φ(f, ef ), from which it follows that (6.2) holds for all e and f in P(A). Let mφ be the bounded measure on CP(A), defined, according to Theorem 5.2, by mφ ((e, e)) = φ(e, e). Recall that every bounded measure on CP(A) arises in this way. Furthermore, the mapping πφ from P(A) to C defined, for e in P(A), by πφ (e) = mφ ((e, e)) is clearly a bounded complex measure on the complete Boolean lattice P(A). Therefore, there exists a unique bounded linear functional xφ on A, the corresponding integral, extending πφ . From the definition of a measure, it can be seen that, for e1 and e2 orthogonal, mφ ((e1 , e1 )) + mφ ((e2 , e2 )) = mφ ((e1 + e2 , e1 + e2 )) = φ(e1 + e2 , e1 + e2 ) = φ(e1 , e1 ) + φ(e2 , e2 ) + φ(e1 , e2 ) + φ(e2 , e1 ). Therefore, from (6.2), 2φ(e1 , e2 ) = φ(e1 , e2 ) + φ(e2 , e1 ) = 0.
(6.3)
Let a and b be elements of A that are finite linear combinations of elements of P(A). Then, without loss of generality, there exists a family e1, e2, ..., er of pairwise orthogonal elements of P(A) and complex numbers α1, α2, ..., αr and β1, β2, ..., βr such that
$$a = \sum_{j=1}^{r} \alpha_j e_j, \qquad b = \sum_{k=1}^{r} \beta_k e_k.$$
It follows that
$$\phi(a,b) = \sum_{j,k=1}^{r} \alpha_j \bar\beta_k\, \phi(e_j,e_k) = \sum_{j=1}^{r} \alpha_j \bar\beta_j\, m_\phi((e_j,e_j)) = \sum_{j=1}^{r} \alpha_j \bar\beta_j\, x_\phi(e_j) = x_\phi(ab^*). \qquad (6.4)$$
Since the set of finite linear combinations of elements of P(A) is dense in A, (6.4) holds for arbitrary elements a and b in A. Identifying A and C(Ω), it follows that the bounded sesquilinear functional φ is given, for a and b in A, by
$$\phi(a,b) = \int_\Omega a(\omega)\,\overline{b(\omega)}\,\pi_\phi(d\omega).$$
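The integral representation of φ can be checked by hand in the simplest commutative case. The following is a minimal sketch, assuming that Ω is a finite set, so that A = C(Ω) is just C^n and projections are 0/1-valued vectors. The sample weights `pi` and the helper names `mult`, `star` and `make_phi` are illustrative assumptions and do not come from the text.

```python
# Toy model: Omega = {0, ..., n-1}, A = C(Omega) = C^n with pointwise operations.
# pi plays the role of the complex measure pi_phi (hypothetical sample values).

def mult(a, b):
    """Pointwise product ab in C(Omega)."""
    return [x * y for x, y in zip(a, b)]

def star(a):
    """Involution a -> a* (pointwise complex conjugation)."""
    return [x.conjugate() for x in a]

def make_phi(pi):
    """Return x_phi (integration against pi) and phi(a, b) = x_phi(a b*)."""
    def x_phi(a):
        return sum(w * x for w, x in zip(pi, a))

    def phi(a, b):
        return x_phi(mult(a, star(b)))

    return x_phi, phi

pi = [0.5 + 1j, -2.0, 0.25j]
x_phi, phi = make_phi(pi)

a = [1 + 2j, 3j, -1.0]
b = [2.0, 1 - 1j, 4j]
c = [1j, 2.0, 1 + 1j]
e1 = [1.0, 0.0, 0.0]  # characteristic function of {0}
e2 = [0.0, 1.0, 1.0]  # characteristic function of {1, 2}, orthogonal to e1
```

In this toy case one can confirm numerically that φ(ca, b) = φ(a, c∗b), as in (6.1), and that φ(e1, e2) vanishes for orthogonal projections, as in (6.3).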
It is clear from the definition and Theorem 5.3 that normal measures on CP(A) give rise to normal linear functionals on C(Ω). Commutative W∗-algebras are used as models for classical physical systems. It can be observed that the results of Sect. 5, as described above, show that ortho-additive bounded measures on P(A) and bounded measures on CP(A) are essentially the same. Integration with respect to a measure on P(A) creates a bounded linear functional on A and a bounded sesquilinear functional on A × A.

Let r and s be positive integers greater than two, and let B be the rectangular JBW∗-triple Mr,s(C) of r × s complex matrices. There exist projections p and q in the complete orthomodular lattice P(A) of projections in the Type I factor A, which is equal to the set Mr+s(C) of (r + s) × (r + s) matrices over C, such that B is isomorphic to pAq. Furthermore the central supports of p and q are equal to 1. In this example, since the centroid of B is trivial, the structure of CP(A)(p,q) is also much simplified. Notice that a pair (e1, f1) and (e2, f2) of non-zero elements of CP(A)(p,q) is compatible if and only if e1 commutes with e2 and f1 commutes with f2, is orthogonal if and only if e1 ⊥ e2 and f1 ⊥ f2, is never centrally orthogonal and is rigidly collinear if and only if either e1 is equal to e2 and f1 + f2 ≤ q, or f1 is equal to f2 and e1 + e2 ≤ p. In this example the bounded measures on P(A)(p,q) are precisely the quantum bi-measures on Mr(C) × Ms(C) discussed in [34]. In this finite-dimensional case all measures are normal. The analysis above can easily be extended to the rectangular JBW∗-triple B(H, K) of bounded linear operators from the complex Hilbert space H into the complex Hilbert space K, provided that neither H nor K is of dimension two. In the special case in which H and K coincide, the rectangular JBW∗-triple B(H, H) is the Type I factor B(H), originally taken to represent a quantum-mechanical system.
The final example shows that the results of Sect. 5 are genuine generalizations of Gleason’s original theorem. Let B be a complex Hilbert space. Define the triple product of elements a, b and c in B by
$$\{a\, b\, c\} = \tfrac{1}{2}\bigl(\langle a,b\rangle c + \langle c,b\rangle a\bigr).$$
Then, it can be easily seen that B is a Jordan∗-triple. Let B∗ be the Banach dual space of B and let $a \mapsto \hat a$ be the conjugate linear mapping from B onto B∗, defined, for b in B, by
$$\hat a(b) = \langle b, a\rangle.$$
Then, B∗ is a complex Hilbert space with respect to the inner product, defined for a and b in B, by
$$\langle \hat a, \hat b\rangle = \langle b, a\rangle,$$
and the mapping $a \mapsto \hat a$ is isometric. For T in B(B), define $\hat T$ on B∗, for a in B, by
$$\hat T \hat a = \widehat{Ta}.$$
Then $\hat T$ lies in B(B∗) and the map $T \mapsto \hat T$ is a conjugate linear ∗-isomorphism from B(B) onto B(B∗) and, in particular, is an ortho-order isomorphism from the complete orthomodular lattice P(B(B)) onto P(B(B∗)).

Consider the complex Hilbert space B∗ ⊕ C and let A be the Type I factor B(B∗ ⊕ C). Define elements p and q in P(B(B∗ ⊕ C)), for $(\hat a, \alpha)$ in B∗ ⊕ C, by
$$p(\hat a, \alpha) = (0, \alpha), \qquad q(\hat a, \alpha) = (\hat a, 0).$$
For each element b in B and $(\hat a, \alpha)$ in B∗ ⊕ C, let
$$\eta(b)(\hat a, \alpha) = (0, \langle b, a\rangle).$$
Then, η is a linear mapping from B into pAq, and a simple calculation shows that, regarded as a bounded linear operator on B∗ ⊕ C,
$$\eta(b)^*(\hat a, \alpha) = (\alpha \hat b, 0).$$
Let b1 and b2 lie in B and let $(\hat a, \alpha)$ lie in B∗ ⊕ C. Then,
$$\eta(b_1)\eta(b_2)^*\eta(b_1)(\hat a, \alpha) = \eta(b_1)\eta(b_2)^*(0, \langle b_1, a\rangle) = \eta(b_1)(\langle b_1, a\rangle \hat b_2, 0) = (0, \langle b_1, \overline{\langle b_1, a\rangle}\, b_2\rangle) = \eta(\{b_1\, b_2\, b_1\})(\hat a, \alpha),$$
and η is a Jordan ∗-triple isomorphism from B into pAq. Let c be an arbitrary element of A. Then pcq lies in B(B∗, C) and, therefore, there exists an element b in B such that, for all $\hat a$ in B∗,
$$(pcq)(\hat a) = \langle b, a\rangle,$$
and it can be seen that η(b) and pcq coincide. Hence η is a Jordan ∗-triple isomorphism from B onto pAq and B is a rectangular JBW∗-triple.

Since A is a factor, (p, q) lies in CP(A), and a pair (e, f) in CP(A) lies in the order ideal CP(A)(p,q) if and only if e ≤ p and f ≤ q. Since p is minimal in P(A), e is equal either to 0 or to p. Recall that the order interval [0, q] can be identified with the order ideal P(A)q or, equivalently, with P(qAq) and P(B(B∗)). The remarks above show that the mapping $Q \mapsto (p, \hat Q)$ is an order isomorphism from P(B(B)) onto CP(A)(p,q). It follows that the complete lattice S(B) of structural projections on B coincides with the complete lattice P(B(B)) of orthogonal projections on B, and the complete lattice I(B) of weak∗-closed inner ideals in B coincides with the complete lattice of closed subspaces of B. Furthermore, for Q1 and Q2 in S(B), Q1 and Q2 are compatible if and only if $(p, \hat Q_1)$ and $(p, \hat Q_2)$ are compatible, which occurs when $\hat Q_1$ and $\hat Q_2$ or, equivalently, Q1 and Q2, commute. Moreover, Q1 and Q2 are orthogonal if and only if $(p, \hat Q_1) \perp (p, \hat Q_2)$, which occurs if and only if at least one of Q1 and Q2 is zero. Finally, Q1 and Q2 are rigidly collinear if and only if $(p, \hat Q_1) \top_{(p,q)} (p, \hat Q_2)$, which occurs if and only if Q1 and Q2 are orthogonal.
Suppose now that the dimension of B is not two and let m be a bounded measure on S(B). Then, it follows that the mapping $\hat m$ defined, for each element $(p, \hat Q)$ in CP(A)(p,q), by
$$\hat m(p, \hat Q) = m(Q),$$
is a bounded measure. Since the centroid of pAq is trivial, by Theorem 5.2, there exists a unique bounded sesquilinear functional $\xi_{\hat m}$ on ℂp × B(B∗) such that, for all Q in S(B),
$$\xi_{\hat m}(p, \hat Q) = \hat m(p, \hat Q) = m(Q).$$
For each element T in B(B), let
$$x_m(T) = \xi_{\hat m}(p, \hat T).$$
Then $x_m$ is a bounded linear functional on B(B) extending m and, by Theorem 5.2, $x_m$ is the unique such extension. Furthermore, by Theorem 5.3, $x_m$ is weak∗-continuous if and only if m is normal. In this case there exists a unique trace class operator $\rho_m$ on B such that, for all T in B(B),
$$x_m(T) = \mathrm{Tr}(\rho_m T).$$
In other words, Gleason’s Theorem [22] holds. It should, however, be observed that, since the results depend upon the generalized Gleason Theorem for W∗-algebras proved by Bunce and Wright [6,7], this is not an alternative proof of Gleason’s Theorem.

References
1. Alfsen, E.M., Effros, E.G.: Structure in real Banach spaces I. Ann. of Math. 96, 98–128 (1972)
2. Alfsen, E.M., Effros, E.G.: Structure in real Banach spaces II. Ann. of Math. 96, 129–174 (1972)
3. Barton, T.J., Timoney, R.M.: Weak∗-continuity of Jordan triple products and its applications. Math. Scand. 59, 177–191 (1986)
4. Behrends, E.: M-structure and the Banach-Stone Theorem. Lecture Notes in Mathematics 736, Berlin–Heidelberg–New York: Springer, 1979
5. Bonsall, F.F., Duncan, J.: Numerical Ranges of Operators on Normed Spaces and of Elements of Normed Algebras. London Mathematical Society Lecture Note Series 2, Cambridge: Cambridge University Press, 1971
6. Bunce, L.J., Wright, J.D.M.: The Mackey-Gleason problem. Bull. Am. Math. Soc. 26, 288–293 (1992)
7. Bunce, L.J., Wright, J.D.M.: Complex measures on projections in von Neumann algebras. J. London Math. Soc. 46, 269–279 (1992)
8.
Bunce, L.J., Wright, J.D.M.: The Mackey-Gleason problem for vector measures on projections in von Neumann algebras. J. London Math. Soc. 49, 133–149 (1994) 9. Cunningham, F.: M-structure in Banach spaces. Math. Proc. Cambridge Philos. Soc. 63, 613–629 (1967) 10. Cunningham, F., Effros, E.G., Roy, N.M.: M-structure in dual Banach spaces. Israel J. Math. 14, 304–309 (1973) 11. Dineen, S., Timoney, R.M.: The centroid of a JB∗ -triple system. Math. Scand. 62, 327–342 (1988) 12. Edwards, C.M., Lörch, D., Rüttimann, G.T.: Compatible subtriples of Jordan ∗ -triples. J. Algebra (to appear) 13. Edwards, C.M., McCrimmon, K., Rüttimann, G.T.: The range of a structural projection. J. Funct. Anal. 139, 196–224 (1996) 14. Edwards, C.M., Rüttimann, G.T.: On the facial structure of the unit balls in a JBW∗ -triple and its predual. J. London Math. Soc. 38, 317–322 (1988) 15. Edwards, C.M., Rüttimann, G.T.: Inner ideals in W∗ -algebras. Michigan Math. J. 36, 147–159 (1989) 16. Edwards, C.M., Rüttimann, G.T.: On inner ideals in ternary algebras. Math. Z. 204, 309–318 (1990) 17. Edwards, C.M., Rüttimann, G.T.: A characterization of inner ideals in JB∗ -triples. Proc. Am. Math. Soc. 116, 1049–1057 (1992) 18. Edwards, C.M., Rüttimann, G.T.: Structural projections on JBW∗ -triples. J. London Math. Soc. 53, 354– 368 (1996) 19. Edwards, C.M., Rüttimann, G.T.: Peirce inner ideals in Jordan ∗ -triples. J. Algebra 180, 41–66 (1996)
20. Edwards, C.M., Rüttimann, G.T.: The lattice of weak∗ -closed inner ideals in a W∗ -algebra. Commun. Math. Phys. 197, 131–166 (1998) 21. Friedman, Y., Russo, B.: Structure of the predual of a JBW∗ -triple. J. Reine Angew. Math. 356, 67–89 (1985) 22. Gleason, A.M.: Measures on the closed subspaces of a Hilbert space. J. Maths. and Mechanics. 6, 885–894 (1957) 23. Horn, G.: Characterization of the predual and the ideal structure of a JBW∗ -triple. Math. Scand. 61, 117–133 (1987) 24. Isham, C.J.: Quantum logic and histories approach to quantum theory. J. Math. Phys. 35, 2157–2185 (1994) 25. Isham, C.J., Linden, N.: Quantum temporal logic and decoherence functionals in the histories approach to generalised quantum theory. J. Math. Phys. 35, 5452–5476 (1994) 26. Isham, C.J., Linden, N., Schreckenberg, S.: The classification of decoherence functionals: An analogue of Gleason’s theorem. J. Math. Phys. 35, 6360–6370 (1994) 27. Kaup, W.: Riemann mapping theorem for bounded symmetric domains in complex Banach spaces. Math. Z. 183, 503–529 (1983) 28. McCrimmon, K.: Compatible Peirce decomposition of Jordan triple systems. Pac. J. Math. 83, 415–439 (1979) 29. Pedersen, G.K.: C∗ -algebras and their automorphism groups. (London Mathematical Society Monographs 14). London: Academic Press, 1979 30. Sakai, S.: C∗ -algebras and W∗ -algebras. Berlin–Heidelberg–New York: Springer, 1971 31. Ruan, Z.-J.: Injectivity of operator spaces. Trans. Am. Math. Soc. 315, 89–104 (1989) 32. Rüttimann, G.T.: Non-commutative measure theory. Habilitationsschrift, Universität Bern 1980 33. Upmeier, H.: Symmetric Banach manifolds and Jordan C∗ -algebras. Amsterdam: North Holland, 1985 34. Wright, J.D.M.: The structure of decoherence functionals for von Neumann quantum histories. J. Math. Phys. 36, 5409–5413 (1995) 35. Wright, J.D.M.: Linear representations of bilinear forms on operator algebras. Expositiones Mathematicae (to appear) 36. 
Wright, J.D.M.: Decoherence functionals for von Neumann quantum histories: boundedness and countable additivity. Commun. Math. Phys. 191, 493–500 (1998) Communicated by H. Araki
Commun. Math. Phys. 203, 297 – 324 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Quantized Flag Manifolds and Irreducible ∗-Representations
Jasper V. Stokman1,⋆,⋆⋆, Mathijs S. Dijkhuizen2,⋆⋆⋆
1 KdV Institute for Mathematics, University of Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands
2 Department of Mathematics, Faculty of Science, Kobe University, Rokko, Kobe 657, Japan
Received: 24 February 1998 / Accepted: 22 October 1998
⋆ Current address: Université Louis Pasteur, Institut de Recherche Mathématique Avancée, 7, rue René Descartes, F-67084 Strasbourg, France
⋆⋆ The first author was supported by a NISSAN-fellowship of the Netherlands Organization of Scientific Research (NWO)
⋆⋆⋆ Current address: Schweizer Strasse 21, D-60594 Frankfurt/Main, Germany. E-mail: [email protected]

Abstract: We study irreducible ∗-representations of a certain quantization of the algebra of polynomial functions on a generalized flag manifold regarded as a real manifold. All irreducible ∗-representations are classified for a subclass of flag manifolds containing in particular the irreducible compact Hermitian symmetric spaces. For this subclass it is shown that the irreducible ∗-representations are parametrized by the symplectic leaves of the underlying Poisson bracket. We also discuss the relation between the quantized flag manifolds studied in this paper and the quantum flag manifolds studied by Soibel’man, Lakshmibai & Reshetikhin, Jurčo & Šťovíček and Korogodsky.

1. Introduction

The irreducible ∗-representations of the “standard” quantization Cq[U] of the algebra of functions on a compact connected simple Lie group U were classified by Soibel’man [40]. He showed that there is a 1–1 correspondence between the equivalence classes of irreducible ∗-representations of Cq[U] and the symplectic leaves of the underlying Poisson bracket on U (cf. [39,40]). This Poisson bracket is sometimes called Bruhat–Poisson, because its symplectic foliation is a refinement of the Bruhat decomposition of U (cf. Soibel’man [39,40]). The symplectic leaves are naturally parametrized by W × T, where T ⊂ U is a maximal torus and W is the Weyl group associated with (U, T). The 1–1 correspondence between equivalence classes of irreducible ∗-representations of Cq[U] and symplectic leaves of U can be formally explained by the observation that in the semi-classical limit the kernel of an irreducible ∗-representation should tend
to a maximal Poisson ideal. The quotient of the Poisson algebra of polynomial functions on U by this ideal is isomorphic to the Poisson algebra of functions on the symplectic leaf.

In recent years many people have studied quantum homogeneous spaces (see for example [45,31,36,13,30,35,4]). The results referred to above raise the obvious question whether the irreducible ∗-representations of quantized function algebras on U-homogeneous spaces can be classified and related to the symplectic foliation of the underlying Poisson bracket. This question was already raised in a paper by Lu & Weinstein [25, Question 4.8], where they studied certain Poisson brackets on U-homogeneous spaces that arise as a quotient of the Bruhat–Poisson bracket on U.

To our knowledge, affirmative answers to the above mentioned question have been given so far for only three different types of U-homogeneous spaces, namely Podleś’s family of quantum 2-spheres [36] (the relation with the symplectic foliation of certain covariant Poisson brackets on the 2-sphere seems to have been observed for the first time by Lu & Weinstein [26]), odd-dimensional complex quantum spheres SU(n+1)/SU(n) (cf. Vaksman & Soibel’man [45]), and Stiefel manifolds U(n)/U(n − l) (cf. Podkolzin & Vainerman [35]).

In this paper we study the irreducible ∗-representations of a certain quantized ∗-algebra of functions on a generalized flag manifold. To be more specific, let G denote the complexification of U, and let P ⊂ G be a parabolic subgroup containing the standard Borel subgroup B with respect to a fixed choice of Cartan subalgebra and system of positive roots (compatible with the choice of Bruhat–Poisson bracket on U, see [25]). The generalized flag manifold U/K with K := U ∩ P naturally becomes a Poisson U-homogeneous space (cf. Lu & Weinstein [25]).
The quotient Poisson bracket on U/K is also called Bruhat–Poisson in [25], and its symplectic leaves coincide with the Schubert cells of the flag manifold G/P ≃ U/K. It is straightforward to realize a quantum analogue Cq[K] of the algebra of polynomial functions on K as a quantum subgroup of Cq[U]. The corresponding ∗-subalgebra Cq[U/K] of Cq[K]-invariant functions in Cq[U] may be regarded as a quantization of the Poisson algebra of functions on U/K endowed with the Bruhat–Poisson bracket.

The main result in this paper is a classification of all the irreducible ∗-representations of Cq[U/K] for an important subclass of flag manifolds containing in particular the irreducible Hermitian symmetric spaces of compact type. For this subclass we show that the equivalence classes of irreducible ∗-representations are parametrized by the Schubert cells of U/K. Let us emphasize that we regard here the flag manifold U/K as a real manifold. This means that the algebra of functions on U/K has a natural ∗-structure, which survives quantization and allows us to study ∗-representations in a way analogous to Soibel’man’s approach [40].

For an arbitrary generalized flag manifold U/K we describe in detail how irreducible ∗-representations of Cq[U] decompose under restriction to Cq[U/K]. This decomposition corresponds precisely to the way symplectic leaves in U project to Schubert cells in the flag manifold U/K. It leads immediately to a classification of the irreducible ∗-representations of the C∗-algebra Cq(U/K), where Cq(U/K) is obtained by taking the closure of Cq[U/K] with respect to the universal C∗-norm on Cq[U]. The equivalence classes of irreducible ∗-representations of Cq(U/K) are naturally parametrized by the symplectic leaves of U/K endowed with the Bruhat–Poisson bracket.
For the classification of the irreducible ∗-representations of the quantized function algebra Cq[U/K] itself it is important to have a kind of Poincaré–Birkhoff–Witt (PBW) factorization of Cq[U/K] (which in turn is closely related to the irreducible decomposition of tensor products of certain finite-dimensional irreducible U-modules). Such a factorization is needed in order to develop a kind of highest weight representation theory for Cq[U/K]. In Soibel’man’s paper [40], a crucial role is played by a similar factorization of Cq[U]. From Soibel’man’s results one easily derives a factorization of the algebra Cq[U/T] (corresponding to P minimal parabolic in G).

In this paper we derive a PBW type factorization for a different subclass of flag manifolds using the so-called Parthasarathy–Ranga Rao–Varadarajan (PRV) conjecture. This conjecture was formulated as a follow-up to certain results in the paper [34] and was independently proved by Kumar [18] and Mathieu [29] (see also Littelmann [22]). The subclass of flag manifolds U/K we consider here can be characterized by the two conditions that (U, K) is a Gel’fand pair and that the Dynkin diagram of K can be obtained from the Dynkin diagram of U by deleting one node (cf. Koornwinder [14]). Note that the corresponding P ⊂ G is always maximal parabolic. These two conditions are satisfied for the irreducible compact Hermitian symmetric pairs (U, K).

Roughly speaking, the PBW factorization in the above mentioned cases states that the quantized function algebra Cq[U/K] coincides with the quantized algebra of zero-weighted complex valued polynomials on U/K. The quantized algebra of zero-weighted complex valued polynomials can be naturally defined for an arbitrary generalized flag manifold U/K. It is always a ∗-subalgebra of Cq[U/K] and invariant under the Cq[U]-coaction (we shall call it the factorized ∗-algebra associated with U/K). The factorized ∗-algebra is closely related to the quantized algebra of holomorphic polynomials on generalized flag manifolds studied by Soibel’man [41], Lakshmibai & Reshetikhin [19, 20], and Jurčo & Šťovíček [9] (for the classical groups) as well as to the function spaces considered recently by Korogodsky [15].
In this paper we classify the irreducible ∗-representations of the factorized ∗-algebra associated with an arbitrary flag manifold U/K and we show that the equivalence classes of irreducible ∗-representations are naturally parametrized by the symplectic leaves of U/K endowed with the Bruhat–Poisson bracket. In particular, we obtain a complete classification of the irreducible ∗-representations of Cq [U/K] whenever a PBW type factorization holds for Cq [U/K] (i.e., Cq [U/K] is equal to its factorized ∗-algebra). The paper is organized as follows. In Sect. 2 we review the results by Lu & Weinstein [25] and Soibel’man [40] concerning the Bruhat–Poisson bracket on U and the quotient Poisson bracket on a flag manifold. In Sect. 3 we recall some well-known results on the “standard” quantization of the universal enveloping algebra of a simple complex Lie algebra and its finite-dimensional representations. We also recall the construction of the corresponding quantized function algebra Cq [U ] and give some commutation relations between certain matrix coefficients of irreducible corepresentations of Cq [U ]. They will play a crucial role in the classification of the irreducible ∗-representations of the factorized ∗-algebra. In Sect. 4 we define the quantized algebra Cq [U/K] of functions on a flag manifold U/K and its associated factorized ∗-subalgebra. We prove that the factorized ∗-algebra is equal to Cq [U/K] for the subclass of flag manifolds referred to above. In Sect. 5 we study the restriction of an arbitrary irreducible ∗-representation of Cq [U ] to Cq [U/K]. We use here Soibel’man’s explicit realization of the irreducible ∗-representations of Cq [U ] as tensor products of irreducible ∗-representations of Cq [SU (2)] (cf. [40], see also [12,45] for SU (n)). As a corollary we obtain a complete classification of the irreducible ∗-representations of the C ∗ -algebra Cq (U/K). 
Section 6 is devoted to the classification of the irreducible ∗-representations of the factorized ∗-algebra associated with an arbitrary flag manifold. The techniques in Sect. 6
are similar to those used by Soibel’man [40] for the classification of the irreducible ∗-representations of Cq[U], and to those used by Joseph [8] to handle the more general problem of determining the primitive ideals of Cq[U].

2. Bruhat–Poisson Brackets on Flag Manifolds

In this section we review some results by Soibel’man [40] and Lu & Weinstein [25] concerning the Bruhat–Poisson bracket on a compact connected simple Lie group U and its flag manifolds. For unexplained terminology in this section we refer the reader to [2] and [25].

Let g be a complex simple Lie algebra with a fixed Cartan subalgebra h ⊂ g. Let G be the connected simply connected Lie group with Lie algebra g (regarded here as a real analytic Lie group). Let R ⊂ h∗ be the root system associated with (g, h) and write $\mathfrak{g}_\alpha$ for the root space associated with α ∈ R. Let Δ = {α1, ..., αr} be a basis of simple roots for R, and let R+ (resp. R−) be the set of positive (resp. negative) roots relative to Δ. We identify h with its dual by the Killing form κ. The non-degenerate symmetric bilinear form on h∗ induced by κ is denoted by (·, ·). Let W ⊂ GL(h∗) be the Weyl group of the root system R and write $s_i = s_{\alpha_i}$ for the simple reflection associated with αi ∈ Δ. For α ∈ R write $d_\alpha := (\alpha, \alpha)/2$. Let $H_\alpha \in \mathfrak{h}$ be the element associated with the coroot $\alpha^\vee := d_\alpha^{-1}\alpha \in \mathfrak{h}^*$ under the identification h ≃ h∗. Let us choose nonzero $X_\alpha \in \mathfrak{g}_\alpha$ (α ∈ R) such that for all α, β ∈ R one has $[X_\alpha, X_{-\alpha}] = H_\alpha$, $\kappa(X_\alpha, X_{-\alpha}) = d_\alpha^{-1}$ and $[X_\alpha, X_\beta] = c_{\alpha,\beta} X_{\alpha+\beta}$ with $c_{\alpha,\beta} = -c_{-\alpha,-\beta} \in \mathbb{R}$ whenever α + β ∈ R. Let h0 be the real form of h defined as the real span of the $H_\alpha$’s (α ∈ R). Then
$$\mathfrak{u} := \sum_{\alpha\in R^+} \mathbb{R}(X_\alpha - X_{-\alpha}) \oplus \sum_{\alpha\in R^+} \mathbb{R}\, i(X_\alpha + X_{-\alpha}) \oplus i\mathfrak{h}_0 \qquad (2.1)$$
is a compact real form of g. Set $\mathfrak{b}_+ := \mathfrak{h}_0 \oplus \mathfrak{n}_+$ with $\mathfrak{n}_+ := \bigoplus_{\alpha\in R^+} \mathfrak{g}_\alpha$. Then, by the Iwasawa decomposition for g, the triple $(\mathfrak{g}, \mathfrak{u}, \mathfrak{b}_+)$ is a Manin triple with respect to the imaginary part of the Killing form κ (cf. [25, §4]). The corresponding coboundary cocommutator on u can be integrated to a Sklyanin bracket on the connected Lie subgroup U ⊂ G with Lie algebra u. The corresponding Poisson tensor is explicitly given by
$$\Omega_g = l_g^{\otimes 2} r - r_g^{\otimes 2} r, \qquad (g \in U),$$
where $l_g$ resp. $r_g$ denotes infinitesimal left resp. right translation and with the classical r-matrix r ∈ g ∧ g given by the following well-known skew solution of the Modified Classical Yang–Baxter Equation:
$$r = i \sum_{\alpha\in R^+} d_\alpha \bigl(X_{-\alpha} \otimes X_\alpha - X_\alpha \otimes X_{-\alpha}\bigr) \in \mathfrak{u} \wedge \mathfrak{u}. \qquad (2.2)$$
This particular Sklyanin bracket is often called Bruhat–Poisson, since its symplectic foliation is closely related to the Bruhat decomposition of G. Let us explain this in more detail. Let B+ be the connected subgroup of G with Lie algebra b+ , let T ⊂ U be the maximal torus in U with Lie algebra ih0 , and set B := T B+ . The Weyl group NU (T )/T , where NU (T ) is the normalizer of T in U , is isomorphic to W . More explicitly, the
isomorphism sends the simple reflection $s_i$ to $\exp\bigl(\tfrac{\pi}{2}(X_{\alpha_i} - X_{-\alpha_i})\bigr)/T$. The disjoint union of G in double B+-cosets (cf. [46, Prop. 1.2.3.6]),
$$G = \coprod_{m \in N_U(T)} B_+ m B_+ \qquad (2.3)$$
is a refinement of the Bruhat decomposition $G = \coprod_{w\in W} BwB$. For $m \in N_U(T)$ we set $\Sigma_m := U \cap B_+ m B_+$. Then $\Sigma_m \neq \emptyset$ for all $m \in N_U(T)$, and we have the disjoint union
$$U = \coprod_{m\in N_U(T)} \Sigma_m. \qquad (2.4)$$
By the Iwasawa decomposition for G there exists for any $b \in B_+$ and $u \in U$ a unique $u^b \in U$ such that $bu \in u^b B_+$. The map
$$U \times B_+ \to U, \qquad (u, b) \mapsto u^{b^{-1}} \qquad (2.5)$$
is a right action of B+ on U, and the corresponding decomposition of U into B+-orbits coincides with the decomposition (2.4). On the other hand, if we regard B+ as the Poisson–Lie group dual to U, the action (2.5) becomes the right dressing action of the dual group on U (cf. [25, Thm. 3.14]). The orbits in U under the right dressing action are exactly the symplectic leaves of the Poisson bracket on U (cf. [38, Thm. 13]; [25, Thm. 3.15]), hence (2.4) coincides with the decomposition of U into symplectic leaves (cf. [40, Thm. 2.2]).

Next, we recall some results by Lu & Weinstein [25] concerning certain quotient Poisson brackets on generalized flag manifolds. Let S ⊂ Δ be a set of simple roots, and let $P_S$ be the corresponding standard parabolic subgroup of G. The Lie algebra $\mathfrak{p}_S$ of $P_S$ is given by
$$\mathfrak{p}_S := \mathfrak{h} \oplus \bigoplus_{\alpha\in\Gamma_S} \mathfrak{g}_\alpha \qquad (2.6)$$
with $\Gamma_S := R^+ \cup \{\alpha \in R \mid \alpha \in \mathrm{span}(S)\}$. Let $\mathfrak{l}_S$ be the Levi factor of $\mathfrak{p}_S$,
$$\mathfrak{l}_S := \mathfrak{h} \oplus \bigoplus_{\alpha\in\Gamma_S\cap(-\Gamma_S)} \mathfrak{g}_\alpha, \qquad (2.7)$$
and set $\mathfrak{k}_S := \mathfrak{p}_S \cap \mathfrak{u} = \mathfrak{l}_S \cap \mathfrak{u}$. Then $\mathfrak{k}_S$ is a compact real form of $\mathfrak{l}_S$. Set $K_S := U \cap P_S \subset U$, then $K_S \subset U$ is a Poisson–Lie subgroup of U with Lie algebra $\mathfrak{k}_S$ (cf. [25, Thm. 4.7]). Hence there is a unique Poisson bracket on $U/K_S$ such that the natural projection $\pi : U \to U/K_S$ is a Poisson map. This bracket is also called Bruhat–Poisson. It is covariant in the sense that the natural left action $U \times U/K_S \to U/K_S$ is a Poisson map.

Let $W_S$ be the subgroup of W generated by the simple reflections in S. The decomposition $P_S = BW_SB$ (cf. [46, Thm. 1.2.1.1]) implies the Schubert cell decomposition of $U/K_S \simeq G/P_S$:
$$U/K_S = \coprod_{\bar w\in W/W_S} X_{\bar w}, \qquad X_{\bar w} := (U\cap BwP_S)/K_S \simeq Bw/P_S, \qquad (2.8)$$
where $\bar w \in W/W_S$ is the right $W_S$-coset in W which contains w. Since $K_S \subset U$ is a Poisson–Lie subgroup, we have that the right dressing-action (2.5) of B+ on U induces a right Poisson action of B+ on $U/K_S$ (cf. [25, Thm. 4.6]). The
corresponding B+-orbits in $U/K_S$ coincide exactly with the Schubert cells. On the other hand, the symplectic leaves of the Poisson manifold $U/K_S$ are exactly the orbits under the B+-action (see [25, Thm. 4.6]). We conclude (cf. [25, Thm. 4.7]):

Theorem 2.1. The decomposition into symplectic leaves of the flag manifold $U/K_S$ endowed with the Bruhat–Poisson bracket coincides with its decomposition into Schubert cells.

Consider now the set of minimal coset representatives
$$W^S := \{w \in W \mid l(ws_\alpha) > l(w) \ \ \forall \alpha \in S\}.$$
(2.9)
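The definition (2.9) can be explored by brute force in type A. The following is a minimal sketch, assuming W is the symmetric group on n letters realized as permutations in one-line notation, with the length taken to be the number of inversions. The function names and the 0-based labelling of simple reflections are illustrative assumptions, not notation from the text.

```python
from itertools import permutations

def length(w):
    """Length of a permutation in one-line notation: its number of inversions."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

def compose(u, v):
    """Product uv, acting as (uv)(i) = u(v(i))."""
    return tuple(u[v[i]] for i in range(len(v)))

def simple_reflection(i, n):
    """Adjacent transposition swapping i and i + 1 (0-based)."""
    s = list(range(n))
    s[i], s[i + 1] = s[i + 1], s[i]
    return tuple(s)

def parabolic_subgroup(S, n):
    """Subgroup W_S generated by the simple reflections indexed by S."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        w = frontier.pop()
        for i in S:
            ws = compose(w, simple_reflection(i, n))
            if ws not in group:
                group.add(ws)
                frontier.append(ws)
    return group

def minimal_coset_representatives(S, n):
    """All w with l(w s) > l(w) for every simple reflection s indexed by S."""
    return [w for w in permutations(range(n))
            if all(length(compose(w, simple_reflection(i, n))) > length(w)
                   for i in S)]

# Symmetric group on 4 letters with S = {s_0, s_2}.
reps = minimal_coset_representatives([0, 2], 4)
sub = parabolic_subgroup([0, 2], 4)
```

In this small case one can enumerate all products of an element of `reps` with an element of `sub` to see that every group element arises exactly once and that lengths add.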
$W^S$ is a complete set of coset representatives for $W/W_S$. Any element w ∈ W can be uniquely written as a product $w = w_1 w_2$ with $w_1 \in W^S$, $w_2 \in W_S$. The elements of $W^S$ are minimal in the sense that
$$l(w_1 w_2) = l(w_1) + l(w_2), \qquad (w_1 \in W^S,\ w_2 \in W_S), \qquad (2.10)$$
where $l(w) := \#(R^+ \cap wR^-)$ is the length function on W. Observe that π maps the symplectic leaf $\Sigma_m \subset U$ onto the symplectic leaf $X_{w(m)} \subset U/K_S$, where $w(m) := m/T \in W$. We write $\pi_m : \Sigma_m \to X_{w(m)}$ for the surjective Poisson map obtained by restricting π to the symplectic leaf $\Sigma_m$. The minimality condition (2.9) translates to the following property of the map $\pi_m$.

Proposition 2.2. Let $m \in N_U(T)$. Then $\pi_m : \Sigma_m \to X_{w(m)}$ is a symplectic automorphism if and only if $w(m) \in W^S$.

Proof. For w ∈ W set
$$\mathfrak{n}_w := \bigoplus_{\alpha\in R^+\cap wR^-} \mathfrak{g}_\alpha, \qquad N_w := \exp(\mathfrak{n}_w).$$
Observe that the complex dimension of $N_w$ is equal to l(w). Write $\mathrm{pr}_U : G \simeq U \times B_+ \to U$ for the canonical projection. It is well known that for $m \in N_U(T)$ and for $w \in W^S$ with representative $m_w \in N_U(T)$, the maps
$$\phi_m : N_{w(m)} \to \Sigma_m, \quad n \mapsto \mathrm{pr}_U(nm), \qquad \psi_w : N_w \to X_w, \quad n \mapsto \pi\bigl(\mathrm{pr}_U(nm_w)\bigr)$$
are surjective diffeomorphisms (see for example [1, Prop. 1.1 & 5.1]). The map $\psi_w$ is independent of the choice of representative $m_w$ for w. It follows now from (2.10) by a dimension count that $\pi_m$ can only be a diffeomorphism if $w(m) \in W^S$. On the other hand, if $m \in N_U(T)$ is such that $w(m) \in W^S$, then $\pi_m = \psi_{w(m)} \circ \phi_m^{-1}$ and hence $\pi_m$ is a diffeomorphism. □

Soibel’man [40] gave a description of the symplectic leaves $\Sigma_m$ ($m \in N_U(T)$) as a product of two-dimensional leaves which turns out to have a nice generalization to the quantized setting (cf. Sect. 5). For i ∈ [1, r], let $\gamma_i : SU(2) \hookrightarrow U$ be the embedding corresponding to the i-th node of the Dynkin diagram of U. After a possible renormalization of the Bruhat–Poisson structure on SU(2), $\gamma_i$ becomes an embedding of Poisson–Lie groups. Recall that the two-dimensional leaves of SU(2) are given by
$$S_t := \left\{ \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} \in SU(2) \;\middle|\; \arg(\beta) = \arg(t) \right\} \qquad (t \in T),$$
where T ⊂ C is the unit circle in the complex plane. The restriction of the embedding $\gamma_i$ to $S_1 \subset SU(2)$ is a symplectic automorphism from $S_1$ onto the symplectic leaf $\Sigma_{m_i} \subset U$, where $m_i = \exp\bigl(\tfrac{\pi}{2}(X_{\alpha_i} - X_{-\alpha_i})\bigr)$. Recall that $m_i \in N_U(T)$ is a representative of the simple reflection $s_i \in W$. For arbitrary $m \in N_U(T)$ let $w(m) = s_{i_1} s_{i_2} \cdots s_{i_l}$ be a reduced expression for $w(m) := m/T \in W$, and let $t_m \in T$ be the unique element such that $m = m_{i_1} m_{i_2} \cdots m_{i_l} t_m$. Note that $t_m$ depends on the choice of reduced expression for w(m). The map
$$(g_1, \ldots, g_l) \mapsto \gamma_{i_1}(g_1)\gamma_{i_2}(g_2) \cdots \gamma_{i_l}(g_l)\, t_m$$
defines a symplectic automorphism from $S_1^{\times l}$ onto the symplectic leaf $\Sigma_m \subset U$ (cf. [40, §2]; [42]). Note that the image of the map is independent of the choice of reduced expression for w(m), although the map itself is not. Combined with Proposition 2.2 we now obtain the following description of the symplectic leaves of the generalized flag manifold $U/K_S$.

Proposition 2.3. Let $m \in N_U(T)$ and set $w := m/T \in W$. Let $w_1 \in W^S$, $w_2 \in W_S$ be such that $w = w_1 w_2$ and choose reduced expressions $w_1 = s_{i_1} \cdots s_{i_p}$ and $w_2 = s_{i_{p+1}} \cdots s_{i_l}$. Then the map
$$(g_1, g_2, \ldots, g_l) \mapsto \gamma_{i_1}(g_1)\gamma_{i_2}(g_2) \cdots \gamma_{i_l}(g_l)/K_S$$
is a surjective Poisson map from $S_1^{\times l}$ onto the Schubert cell $X_w$. It factorizes through the projection $\mathrm{pr} : S_1^{\times l} = S_1^{\times p} \times S_1^{\times (l-p)} \to S_1^{\times p}$. The quotient map from $S_1^{\times p}$ onto $X_w$ is a symplectic automorphism. In particular, we have
$$X_w = \Sigma_{m_{i_1}} \Sigma_{m_{i_2}} \cdots \Sigma_{m_{i_p}} / K_S.$$
See Lu [24] for more details in the case of the full flag manifold ($K_S = T$).

3. Preliminaries on the Quantized Function Algebra Cq[U]

In this section we introduce some notations which we will need throughout the remainder of this paper. First, we recall the definition of the quantized universal enveloping algebra associated with the simple complex Lie algebra g. We use the notations introduced in the previous section. Set $d_i := d_{\alpha_i}$ and $H_i := H_{\alpha_i}$ for i ∈ [1, r]. Let $A = (a_{ij})$ be the Cartan matrix, i.e. $a_{ij} := d_i^{-1}(\alpha_i, \alpha_j)$. Note that $H_i \in \mathfrak{h}$ is the unique element such that $\alpha_j(H_i) = a_{ij}$ for all j. The weight lattice is given by
$$P = \{\lambda \in \mathfrak{h}^* \mid \lambda(H_i) = (\lambda, \alpha_i^\vee) \in \mathbb{Z} \ \ \forall i\}.$$
(3.1)
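As a small illustration of the formula $a_{ij} = d_i^{-1}(\alpha_i, \alpha_j)$, the following sketch computes a Cartan matrix from an explicit realization of the simple roots. The normalization $d_i = (\alpha_i, \alpha_i)/2$ (which is what the identity $\alpha_j(H_i) = (\alpha_j, \alpha_i^\vee)$ suggests), the choice of types $A_2$ and $B_2$, and all function names are assumptions made for the example only.

```python
from fractions import Fraction


def inner(u, v):
    """Standard Euclidean inner product, computed exactly."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))


def cartan_matrix(simple_roots):
    """Return (a_ij) with a_ij = (alpha_i, alpha_j) / d_i.

    Assumption: d_i = (alpha_i, alpha_i) / 2, so that
    a_ij = 2 (alpha_i, alpha_j) / (alpha_i, alpha_i).
    """
    n = len(simple_roots)
    d = [inner(a, a) / 2 for a in simple_roots]
    return [
        [inner(simple_roots[i], simple_roots[j]) / d[i] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical test data: simple roots of type A_2 and B_2 in the epsilon basis.
A2_ROOTS = [(1, -1, 0), (0, 1, -1)]
B2_ROOTS = [(1, -1), (0, 1)]

if __name__ == "__main__":
    print(cartan_matrix(A2_ROOTS))
    print(cartan_matrix(B2_ROOTS))
```

In the non-simply-laced case the matrix is not symmetric, which is why the factor $d_i^{-1}$ is attached to the row index.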
The fundamental weights $\varpi_{\alpha_i} = \varpi_i$ ($i \in [1, r]$) are characterized by $\varpi_i(H_j) = (\varpi_i, \alpha_j^\vee) = \delta_{ij}$ for all $j$. The set of dominant weights $P_+$ resp. regular dominant weights $P_{++}$ is equal to $\mathbb{K}$-span$\{\varpi_\alpha\}_{\alpha \in \Delta}$ with $\mathbb{K} = \mathbb{Z}_+$ resp. $\mathbb{N}$. We fix $q \in (0, 1)$. The quantized universal enveloping algebra $U_q(\mathfrak{g})$ associated with the simple Lie algebra $\mathfrak{g}$ is the unital associative algebra over $\mathbb{C}$ with generators $K_i^{\pm 1}$,
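$X_i^{\pm}$ ($i \in [1, r]$) and relations
\[
\begin{aligned}
& K_i K_j = K_j K_i, \qquad K_i K_i^{-1} = K_i^{-1} K_i = 1, \\
& K_i X_j^{\pm} K_i^{-1} = q_i^{\pm \alpha_j(H_i)} X_j^{\pm}, \\
& X_i^+ X_j^- - X_j^- X_i^+ = \delta_{ij}\, \frac{K_i - K_i^{-1}}{q_i - q_i^{-1}}, \\
& \sum_{s=0}^{1-a_{ij}} (-1)^s \begin{bmatrix} 1 - a_{ij} \\ s \end{bmatrix}_{q_i} (X_i^{\pm})^{1-a_{ij}-s} X_j^{\pm} (X_i^{\pm})^s = 0 \quad (i \neq j),
\end{aligned}
\tag{3.2}
\]
where $q_i := q^{d_i}$,
\[ [a]_q := \frac{q^a - q^{-a}}{q - q^{-1}} \ (a \in \mathbb{N}), \qquad [0]_q := 1, \qquad [a]_q! := [a]_q [a-1]_q \cdots [1]_q, \]
and
\[ \begin{bmatrix} a \\ n \end{bmatrix}_q := \frac{[a]_q!}{[a-n]_q!\,[n]_q!}. \]
A Hopf algebra structure on $U_q(\mathfrak{g})$ is uniquely determined by the formulas
\[
\begin{aligned}
& \Delta(X_i^+) = X_i^+ \otimes 1 + K_i \otimes X_i^+, \qquad \Delta(X_i^-) = X_i^- \otimes K_i^{-1} + 1 \otimes X_i^-, \\
& \Delta(K_i^{\pm 1}) = K_i^{\pm 1} \otimes K_i^{\pm 1}, \\
& S(K_i^{\pm 1}) = K_i^{\mp 1}, \qquad S(X_i^+) = -K_i^{-1} X_i^+, \qquad S(X_i^-) = -X_i^- K_i, \\
& \varepsilon(K_i^{\pm 1}) = 1, \qquad \varepsilon(X_i^{\pm}) = 0.
\end{aligned}
\tag{3.3}
\]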
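The $q$-numbers entering the quantum Serre relations in (3.2) are easy to evaluate numerically. The sketch below follows the conventions stated with (3.2), including $[0]_q := 1$; the function names and the sample value $q = 0.5$ are illustrative assumptions, not part of the text.

```python
def q_int(a, q):
    """[a]_q = (q^a - q^-a) / (q - q^-1) for a >= 1, with [0]_q := 1 as in the text."""
    if a == 0:
        return 1.0
    return (q ** a - q ** (-a)) / (q - 1.0 / q)


def q_factorial(a, q):
    """[a]_q! = [a]_q [a-1]_q ... [1]_q (empty product for a = 0)."""
    result = 1.0
    for k in range(1, a + 1):
        result *= q_int(k, q)
    return result


def q_binomial(a, n, q):
    """Gaussian binomial [a choose n]_q = [a]_q! / ([a-n]_q! [n]_q!)."""
    return q_factorial(a, q) / (q_factorial(a - n, q) * q_factorial(n, q))


def serre_coefficients(a_ij, q_i):
    """Signed coefficients (-1)^s [1 - a_ij choose s]_{q_i}, s = 0, ..., 1 - a_ij."""
    top = 1 - a_ij
    return [(-1) ** s * q_binomial(top, s, q_i) for s in range(top + 1)]


if __name__ == "__main__":
    # For a_ij = -1 the Serre relation has coefficients 1, -[2]_q, 1.
    print(serre_coefficients(-1, 0.5))
```

For $q$ close to 1 the Gaussian binomials approach the ordinary binomial coefficients, which recovers the classical Serre relations.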
In fact, $U_q(\mathfrak{g})$ may be regarded as a quantization of the co-Poisson-Hopf algebra structure (cf. [2, Ch. 6]) on $U(\mathfrak{g})$ induced by the Lie bialgebra $(\mathfrak{g}, -i\delta)$, $\delta$ being the cocommutator of $\mathfrak{g}$ associated with the $r$-matrix (2.2). $U_q(\mathfrak{g})$ becomes a Hopf $*$-algebra with $*$-structure on the generators given by
\[ (K_i^{\pm 1})^* = K_i^{\pm 1}, \qquad (X_i^+)^* = q_i^{-1} X_i^- K_i, \qquad (X_i^-)^* = q_i K_i^{-1} X_i^+. \tag{3.4} \]
In the classical limit $q \to 1$, the $*$-structure becomes an involutive, conjugate-linear anti-automorphism of $\mathfrak{g}$ with $-1$ eigenspace equal to the compact real form $\mathfrak{u}$ defined in (2.1). Let $U^{\pm} = U_q(\mathfrak{n}^{\pm})$ be the subalgebra of $U_q(\mathfrak{g})$ generated by $X_i^{\pm}$ ($i \in [1, r]$) and write $U^0 := U_q(\mathfrak{h})$ for the commutative subalgebra generated by $K_i^{\pm 1}$ ($i \in [1, r]$). Let $Q$ (resp. $Q^+$) be the integral (resp. positive integral) span of the positive roots. We have the direct sum decomposition
\[ U^{\pm} = \bigoplus_{\alpha \in Q^+} U^{\pm}_{\pm\alpha}, \]
where $U^{\pm}_{\pm\alpha} := \{\phi \in U^{\pm} \mid K_i \phi K_i^{-1} = q_i^{\pm\alpha(H_i)} \phi\}$. The Poincaré–Birkhoff–Witt Theorem for $U_q(\mathfrak{g})$ states that multiplication defines an isomorphism of vector spaces
\[ U^- \otimes U^0 \otimes U^+ \to U_q(\mathfrak{g}). \]
In particular, $U_q(\mathfrak{g})$ is spanned by elements of the form $b_{-\eta} K^{\alpha} a_{\zeta}$, where $b_{-\eta} \in U^-_{-\eta}$, $a_{\zeta} \in U^+_{\zeta}$ ($\eta, \zeta \in Q^+$) and $\alpha \in Q$. Here we used the notation $K^{\alpha} = K_1^{k_1} \cdots K_r^{k_r}$ if $\alpha = \sum_i k_i \alpha_i$.

For a left $U_q(\mathfrak{g})$-module $V$, we say that $0 \neq v \in V$ has weight $\mu \in \mathfrak{h}^*$ if $K_i \cdot v = q_i^{\mu(H_i)} v = q^{(\mu, \alpha_i)} v$ for all $i$. We write $V_\mu$ for the corresponding weight space. Recall that a $P$-weighted finite-dimensional irreducible representation of $U_q(\mathfrak{g})$ is a highest weight module $V = V(\lambda)$ with highest weight $\lambda \in P_+$. If $v_\lambda \in V(\lambda)$ is a highest weight vector, we have $V(\lambda) = \bigoplus_{\alpha \in Q^+} U^-_{-\alpha} v_\lambda$ by the PBW Theorem, hence the set of weights $P(\lambda)$ of $V(\lambda)$ is a subset of the weight lattice $P$ satisfying $\mu \leq \lambda$ for all $\mu \in P(\lambda)$. Here $\leq$ is the dominance order on $P$ (i.e. $\mu \leq \nu$ if $\nu - \mu \in Q^+$, and $\mu < \nu$ if $\mu \leq \nu$ and $\mu \neq \nu$).

We define irreducible finite-dimensional $P$-weighted right $U_q(\mathfrak{g})$-modules with respect to the opposite Borel subgroup. So the irreducible finite-dimensional right $U_q(\mathfrak{g})$-module $V(\lambda)$ with highest weight $\lambda \in P_+$ has the weight space decomposition $V(\lambda) = \bigoplus_{\alpha \in Q^+} v_\lambda U^+_{\alpha}$, where $v_\lambda \in V(\lambda)$ is the highest weight vector of $V(\lambda)$. The weights of the right $U_q(\mathfrak{g})$-module $V(\lambda)$ coincide with the weights of the left $U_q(\mathfrak{g})$-module $V(\lambda)$, and the dimensions of the corresponding weight spaces are the same.

The quantized algebra $\mathbb{C}_q[G]$ of functions on the connected simply connected complex Lie group $G$ with Lie algebra $\mathfrak{g}$ is the subspace in the linear dual $U_q(\mathfrak{g})^*$ spanned by the matrix coefficients of the finite-dimensional irreducible representations $V(\lambda)$ ($\lambda \in P_+$). The Hopf $*$-algebra structure on $U_q(\mathfrak{g})$ induces a Hopf $*$-algebra structure on $\mathbb{C}_q[G] \subset U_q(\mathfrak{g})^*$ by the formulas
\[
\begin{aligned}
& (\phi\psi)(X) = (\phi \otimes \psi)\Delta(X), \qquad 1(X) = \varepsilon(X), \\
& \Delta(\phi)(X \otimes Y) = \phi(XY), \qquad \varepsilon(\phi) = \phi(1), \\
& S(\phi)(X) = \phi(S(X)), \qquad (\phi^*)(X) = \overline{\phi(S(X)^*)},
\end{aligned}
\tag{3.5}
\]
where $\phi, \psi \in \mathbb{C}_q[G] \subset U_q(\mathfrak{g})^*$ and $X, Y \in U_q(\mathfrak{g})$. The algebra $\mathbb{C}_q[G]$ can be regarded as a quantization of the Poisson algebra of polynomial functions on the algebraic Poisson–Lie group $G$, where the Poisson structure on $G$ is given by the Sklyanin bracket associated with the classical $r$-matrix $-ir$ (cf. (2.2)). Since the $*$-structure (3.5) on $\mathbb{C}_q[G]$ is associated with the compact real form $U$ of $G$ in the classical limit, we will write $\mathbb{C}_q[U]$ for $\mathbb{C}_q[G]$ with this particular choice of $*$-structure. Note that $\mathbb{C}_q[U]$ is a $U_q(\mathfrak{g})$-bimodule with the left respectively right action given by
\[ (X.\phi)(Y) := \phi(YX), \qquad (\phi.X)(Y) := \phi(XY), \tag{3.6} \]
where $\phi \in \mathbb{C}_q[U]$ and $X, Y \in U_q(\mathfrak{g})$. The finite-dimensional irreducible $U_q(\mathfrak{g})$-module $V(\lambda)$ of highest weight $\lambda \in P_+$ is known to be unitarizable (say with inner product $(\cdot, \cdot)$). So we can choose an orthonormal basis consisting of weight vectors
\[ \{v_\mu^{(i)} \mid \mu \in P(\lambda),\ i \in [1, \dim(V(\lambda)_\mu)]\}, \tag{3.7} \]
where $v_\mu^{(i)} \in V(\lambda)_\mu$ (we omit the index $i$ if $\dim(V(\lambda)_\mu) = 1$). Set
\[ C^{\lambda}_{\mu,i;\nu,j}(X) := (X.v_\nu^{(j)}, v_\mu^{(i)}), \qquad X \in U_q(\mathfrak{g}), \tag{3.8} \]
for $\mu, \nu \in P(\lambda)$ and $1 \leq i \leq \dim(V(\lambda)_\mu)$, $1 \leq j \leq \dim(V(\lambda)_\nu)$. If $\dim(V(\lambda)_\mu) = 1$ respectively $\dim(V(\lambda)_\nu) = 1$ we omit the dependence on $i$ respectively $j$ in (3.8). It is sometimes also convenient to use the notation
\[ C^{\lambda}_{v;w}(X) := (X.w, v), \qquad v, w \in V(\lambda),\ X \in U_q(\mathfrak{g}). \]
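Note that when $\lambda$ runs through $P_+$ and $\mu$, $i$, $\nu$ and $j$ run through the above-mentioned sets, the matrix elements (3.8) form a linear basis of $\mathbb{C}_q[G]$. Furthermore, we have the formulas
\[
\begin{aligned}
& \Delta(C^{\lambda}_{\mu,i;\nu,j}) = \sum_{\sigma,s} C^{\lambda}_{\mu,i;\sigma,s} \otimes C^{\lambda}_{\sigma,s;\nu,j}, \\
& \varepsilon(C^{\lambda}_{\mu,i;\nu,j}) = \delta_{\mu,\nu}\delta_{i,j}, \qquad (C^{\lambda}_{\mu,i;\nu,j})^* = S(C^{\lambda}_{\nu,j;\mu,i}).
\end{aligned}
\tag{3.9}
\]
(Sums for which the summation sets are not specified are taken over the “obvious” choice of summation sets.) Using the relations (3.9) and the Hopf algebra axiom for the antipode $S$ we obtain
\[ \sum_{\sigma,s} (C^{\lambda}_{\sigma,s;\mu,i})^* C^{\lambda}_{\sigma,s;\nu,j} = \delta_{\mu,\nu}\delta_{i,j}. \tag{3.10} \]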
The elements $(C^{\lambda}_{\mu,i;\nu,j})^*$ are matrix coefficients of the contragredient representation $V(\lambda)^* \simeq V(-\sigma_0\lambda)$ (here $\sigma_0$ is the longest element in $W$). To be precise, let $\pi : U_q(\mathfrak{g}) \to \mathrm{End}(V(\lambda))$ be the representation of highest weight $\lambda$, and let $(\cdot, \cdot)$ be an inner product with respect to which $\pi$ is unitarizable. Fix an orthonormal basis of weight vectors $\{v_\mu^{(r)}\}$. Let $(\pi^*, V(\lambda)^*)$ be the contragredient representation, i.e. $\pi^*(X)\phi = \phi \circ \pi(S(X))$ for $X \in U_q(\mathfrak{g})$ and $\phi \in V(\lambda)^*$. For $u \in V(\lambda)$ set $u^* := (\cdot, u) \in V(\lambda)^*$. We define an inner product on $V(\lambda)^*$ by
\[ (u^*, v^*) := \bigl(\pi(K^{-2\rho})v, u\bigr), \qquad u, v \in V(\lambda), \]
where $\rho = \tfrac{1}{2}\sum_{\alpha \in R^+} \alpha \in \mathfrak{h}^*$. By using the fact that $S^2(u) = K^{-2\rho} u K^{2\rho}$ ($u \in U_q(\mathfrak{g})$) one easily deduces that $\pi^*$ is unitarizable with respect to the inner product $(\cdot, \cdot)$ on $V(\lambda)^*$ and that $\{\phi_{-\mu}^{(i)} := q^{(\mu,\rho)} (v_\mu^{(i)})^*\}$ is an orthonormal basis of $V(\lambda)^*$ consisting of weight vectors (here $\phi_{-\mu}^{(i)}$ has weight $-\mu$). Defining the matrix coefficients $C^{-\sigma_0\lambda}_{-\mu,i;-\nu,j}$ of $(\pi^*, V(\lambda)^*)$ with respect to the orthonormal basis $\{\phi_{-\mu}^{(i)}\}$, we then have
\[ (C^{\lambda}_{\mu,i;\nu,j})^* = q^{(\mu-\nu,\rho)}\, C^{-\sigma_0\lambda}_{-\mu,i;-\nu,j} \tag{3.11} \]
(cf. [40, Prop. 3.3]).

A fundamental role in Soibel’man’s theory of irreducible $*$-representations of $\mathbb{C}_q[U]$ is played by a Poincaré–Birkhoff–Witt (PBW) type factorization of $\mathbb{C}_q[U]$. For $\lambda \in P_+$, set
\[ B_\lambda := \mathrm{span}\{C^{\lambda}_{v;v_\lambda} \mid v \in V(\lambda)\}. \tag{3.12} \]
Note that $B_\lambda$ is a right $U_q(\mathfrak{g})$-submodule of $\mathbb{C}_q[U]$ isomorphic to $V(\lambda)$. Set
\[ A^+ := \bigoplus_{\lambda \in P_+} B_\lambda, \qquad A^{++} := \bigoplus_{\lambda \in P_{++}} B_\lambda. \tag{3.13} \]
The subalgebra and right $U_q(\mathfrak{g})$-module $A^+$ is equal to the subalgebra of left $U^+$-invariant elements in $\mathbb{C}_q[U]$ (cf. [8]). The existence of a PBW type factorization of $\mathbb{C}_q[U]$ now amounts to the following statement.

Theorem 3.1 ([40, Thm. 3.1]). The multiplication map $m : (A^{++})^* \otimes A^{++} \to \mathbb{C}_q[U]$ is surjective.
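A detailed proof can be found in [8, Prop. 9.2.2]. The proof is based on certain results concerning decompositions of tensor products of irreducible finite-dimensional $U_q(\mathfrak{g})$-modules which can be traced back to Kostant in the classical case [16, Thm. 5.1]. The close connection between Theorem 3.1 and the decomposition of tensor products of irreducible $U_q(\mathfrak{g})$-modules becomes clear by observing that
\[ (B_\lambda)^* B_\mu \simeq V(\lambda)^* \otimes V(\mu) \tag{3.14} \]
as right $U_q(\mathfrak{g})$-modules.

Important for the study of $*$-representations of $\mathbb{C}_q[U]$ is some detailed information about the commutation relations between matrix elements in $\mathbb{C}_q[U]$. In view of Theorem 3.1, we are especially interested in commutation relations between the $C^{\lambda}_{\mu,i;\lambda}$ and $C^{\Lambda}_{\nu,j;\Lambda}$, resp. between the $C^{\lambda}_{\mu,i;\lambda}$ and $(C^{\Lambda}_{\nu,j;\Lambda})^*$, where $\lambda, \Lambda \in P_+$. To state these commutation relations we need to introduce certain vector subspaces of $\mathbb{C}_q[U]$. Let $\lambda, \Lambda \in P_+$ and $\mu \in P(\lambda)$, $\nu \in P(\Lambda)$, then we set
\[
\begin{aligned}
N(\mu, \lambda; \nu, \Lambda) &:= \mathrm{span}\{C^{\lambda}_{v;v_\lambda} C^{\Lambda}_{w;v_\Lambda} \mid (v, w) \in sN\}, \\
N^{\mathrm{opp}}(\mu, \lambda; \nu, \Lambda) &:= \mathrm{span}\{C^{\Lambda}_{w;v_\Lambda} C^{\lambda}_{v;v_\lambda} \mid (v, w) \in sN\},
\end{aligned}
\tag{3.15}
\]
where $sN := sN(\mu, \lambda; \nu, \Lambda)$ is the set of pairs $(v, w) \in V(\lambda)_{\mu'} \times V(\Lambda)_{\nu'}$ with $\mu' > \mu$, $\nu' < \nu$ and $\mu' + \nu' = \mu + \nu$. Furthermore set
\[
\begin{aligned}
O(\mu, \lambda; \nu, \Lambda) &:= \mathrm{span}\{(C^{\lambda}_{v;v_\lambda})^* C^{\Lambda}_{w;v_\Lambda} \mid (v, w) \in sO\}, \\
O^{\mathrm{opp}}(\mu, \lambda; \nu, \Lambda) &:= \mathrm{span}\{C^{\Lambda}_{w;v_\Lambda} (C^{\lambda}_{v;v_\lambda})^* \mid (v, w) \in sO\},
\end{aligned}
\tag{3.16}
\]
where $sO := sO(\mu, \lambda; \nu, \Lambda)$ is the set of pairs $(v, w) \in V(\lambda)_{\mu'} \times V(\Lambda)_{\nu'}$ with $\mu' < \mu$, $\nu' < \nu$ and $\mu - \mu' = \nu - \nu'$. If $sN$ (resp. $sO$) is empty, then let $N = N^{\mathrm{opp}} = \{0\}$ (resp. $O = O^{\mathrm{opp}} = \{0\}$). We now have the following proposition.

Proposition 3.2. Let $\lambda, \Lambda \in P_+$ and $v \in V(\lambda)_\mu$, $w \in V(\Lambda)_\nu$.
(i) The matrix elements $C^{\lambda}_{v;v_\lambda}$ and $C^{\Lambda}_{w;v_\Lambda}$ satisfy the commutation relation
\[ C^{\lambda}_{v;v_\lambda} C^{\Lambda}_{w;v_\Lambda} = q^{(\lambda,\Lambda)-(\mu,\nu)}\, C^{\Lambda}_{w;v_\Lambda} C^{\lambda}_{v;v_\lambda} \mod N(\mu, \lambda; \nu, \Lambda). \]
Moreover, we have $N = N^{\mathrm{opp}}$.
(ii) The matrix elements $(C^{\lambda}_{v;v_\lambda})^*$ and $C^{\Lambda}_{w;v_\Lambda}$ satisfy the commutation relation
\[ (C^{\lambda}_{v;v_\lambda})^* C^{\Lambda}_{w;v_\Lambda} = q^{(\mu,\nu)-(\lambda,\Lambda)}\, C^{\Lambda}_{w;v_\Lambda} (C^{\lambda}_{v;v_\lambda})^* \mod O(\mu, \lambda; \nu, \Lambda). \]
Moreover, we have $O = O^{\mathrm{opp}}$.

Soibel’man [40] derived commutation relations using the universal $R$-matrix, whereas Joseph [8, §9.1] used the Poincaré–Birkhoff–Witt Theorem for $U_q(\mathfrak{g})$ and the left, respectively right, action (3.6) of $U_q(\mathfrak{g})$ on $\mathbb{C}_q[U]$. Although the commutation relations formulated here are slightly sharper, the proof can be derived in a similar manner and will therefore be omitted. As a corollary of Proposition 3.2 (i) we have

Corollary 3.3. Let $\lambda, \Lambda \in P_+$ and $v \in V(\lambda)_\mu$, $w \in V(\Lambda)_\nu$. Then
\[ C^{\lambda}_{v;v_\lambda} C^{\Lambda}_{w;v_\Lambda} = q^{(\mu,\nu)-(\lambda,\Lambda)}\, C^{\Lambda}_{w;v_\Lambda} C^{\lambda}_{v;v_\lambda} \mod N(\nu, \Lambda; \mu, \lambda). \tag{3.17} \]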
Note that Proposition 3.2 (i) and Corollary 3.3 give two different ways to rewrite $C^{\lambda}_{v;v_\lambda} C^{\Lambda}_{w;v_\Lambda}$ as elements of the vector space
\[ W_{\lambda,\Lambda} := \mathrm{span}\{C^{\Lambda}_{w';v_\Lambda} C^{\lambda}_{v';v_\lambda} \mid v' \in V(\lambda),\ w' \in V(\Lambda)\}. \]
We will need both “inequivalent” commutation relations (Proposition 3.2 (i) and Corollary 3.3) in later sections. It follows in particular that, when $v' \in V(\lambda)$ and $w' \in V(\Lambda)$ run through a basis, the elements $C^{\Lambda}_{w';v_\Lambda} C^{\lambda}_{v';v_\lambda}$ are (in general) linearly dependent. This also follows from the following two observations. On the one hand, $W_{\lambda,\Lambda} \simeq V(\lambda + \Lambda)$ as right $U_q(\mathfrak{g})$-modules. On the other hand, $V(\lambda + \Lambda)$ occurs with multiplicity one in $V(\lambda) \otimes V(\Lambda)$, whereas in general $V(\lambda) \otimes V(\Lambda)$ has other irreducible components too. By contrast, the commutation relation given in Proposition 3.2 (ii) is unique in the sense that, when $v \in V(\lambda)$ and $w \in V(\Lambda)$ run through a basis, the $C^{\Lambda}_{w;v_\Lambda} (C^{\lambda}_{v;v_\lambda})^*$ are linearly independent (cf. (3.14)).

We end this section by recalling the special case $\mathfrak{g} = \mathfrak{sl}(2, \mathbb{C})$. Set
\[ t_{11} := C^{\varpi_1}_{\varpi_1;\varpi_1}, \qquad t_{12} := C^{\varpi_1}_{\varpi_1;-\varpi_1}, \qquad t_{21} := C^{\varpi_1}_{-\varpi_1;\varpi_1}, \qquad t_{22} := C^{\varpi_1}_{-\varpi_1;-\varpi_1}. \tag{3.18} \]
Then it is well known that the $t_{ij}$’s generate the algebra $\mathbb{C}_q[SU(2)]$. The commutation relations
\[
\begin{aligned}
& t_{k1} t_{k2} = q\, t_{k2} t_{k1}, \qquad t_{1k} t_{2k} = q\, t_{2k} t_{1k} \quad (k = 1, 2), \\
& t_{12} t_{21} = t_{21} t_{12}, \qquad t_{11} t_{22} - t_{22} t_{11} = (q - q^{-1})\, t_{12} t_{21}, \\
& t_{11} t_{22} - q\, t_{12} t_{21} = 1
\end{aligned}
\tag{3.19}
\]
characterize the algebra structure of $\mathbb{C}_q[SU(2)]$ in terms of the generators $t_{ij}$. The $*$-structure is uniquely determined by the formulas $t_{11}^* = t_{22}$, $t_{12}^* = -q\, t_{21}$.

4. Quantized Function Algebras on Generalized Flag Manifolds

Let $S$ be any subset of the simple roots $\Delta$. We will sometimes identify $S$ with the index set $\{i \mid \alpha_i \in S\}$. Let $\mathfrak{p}_S \subset \mathfrak{g}$ be the corresponding standard parabolic subalgebra, given explicitly by (2.6). We define the quantized universal enveloping algebra $U_q(\mathfrak{l}_S)$ associated with the Levi factor $\mathfrak{l}_S$ of $\mathfrak{p}_S$ as the subalgebra of $U_q(\mathfrak{g})$ generated by $K_i^{\pm 1}$ ($i \in [1, r]$) and $X_i^{\pm}$ ($i \in S$). Note that $U_q(\mathfrak{l}_S)$ is a Hopf $*$-subalgebra of $U_q(\mathfrak{g})$.

For later use in this section we briefly discuss the finite-dimensional representation theory of $U_q(\mathfrak{l}_S)$. Recall that $\mathfrak{l}_S$ is a reductive Lie algebra with centre
\[ Z(\mathfrak{l}_S) = \bigcap_{i \in S} \mathrm{Ker}(\alpha_i) \subset \mathfrak{h}. \tag{4.1} \]
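Moreover, we have direct sum decompositions
\[ \mathfrak{h} = Z(\mathfrak{l}_S) \oplus \mathfrak{h}_S, \qquad \mathfrak{l}_S = Z(\mathfrak{l}_S) \oplus \mathfrak{l}_S^{ss}, \tag{4.2} \]
where $\mathfrak{h}_S = \mathrm{span}\{H_i\}_{i \in S}$ and $\mathfrak{l}_S^{ss}$ is the semisimple part of $\mathfrak{l}_S$. The semisimple part $\mathfrak{l}_S^{ss}$ is explicitly given by
\[ \mathfrak{l}_S^{ss} := \mathfrak{h}_S \oplus \bigoplus_{\alpha \in \Gamma_S \cap (-\Gamma_S)} \mathfrak{g}_\alpha. \tag{4.3} \]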
We define the quantized universal enveloping algebra $U_q(\mathfrak{l}_S^{ss})$ associated with the semisimple part $\mathfrak{l}_S^{ss}$ of $\mathfrak{l}_S$ as the subalgebra of $U_q(\mathfrak{g})$ generated by $K_i^{\pm 1}$ and $X_i^{\pm}$ for all $i \in S$. Observe that $U_q(\mathfrak{l}_S^{ss})$ is a Hopf $*$-subalgebra of $U_q(\mathfrak{g})$.

There are obvious notions of weight vectors and weights for $U_q(\mathfrak{l}_S)$-modules. With a suitably extended interpretation of the notion of highest weight, the irreducible finite-dimensional $U_q(\mathfrak{l}_S)$-modules may be characterized in terms of highest weights. By relating the finite-dimensional representation theory of $U_q(\mathfrak{l}_S)$ to the representation theory of $U_q(\mathfrak{l}_S^{ss})$, which is well known since $\mathfrak{l}_S^{ss}$ is semisimple, one easily derives the following result.

Proposition 4.1. (i) Any finite-dimensional $U_q(\mathfrak{l}_S)$-module $V$ which is completely reducible as $U_q(\mathfrak{h})$-module, is completely reducible as $U_q(\mathfrak{l}_S)$-module.
(ii) The multiplicity of any irreducible $U_q(\mathfrak{l}_S)$-module in the irreducible decomposition of the restriction of the $U_q(\mathfrak{g})$-module $V(\lambda)$ to $U_q(\mathfrak{l}_S)$ is the same as in the classical case.

Next, we define the quantized algebra of functions on $U/K_S$. The mapping $\iota_S^* : U_q(\mathfrak{g})^* \twoheadrightarrow U_q(\mathfrak{l}_S)^*$ dual to the Hopf $*$-embedding $\iota_S : U_q(\mathfrak{l}_S) \hookrightarrow U_q(\mathfrak{g})$ is surjective, and we set $\mathbb{C}_q[L_S] := \iota_S^*(\mathbb{C}_q[G]) = \{\phi \circ \iota_S \mid \phi \in \mathbb{C}_q[G]\}$. The formulas (3.5) uniquely determine a Hopf $*$-algebra structure on $\mathbb{C}_q[L_S]$, and $\iota_S^*$ then becomes a Hopf $*$-algebra morphism. We write $\mathbb{C}_q[K_S]$ for $\mathbb{C}_q[L_S]$ with this particular choice of $*$-structure. Define a $*$-subalgebra $\mathbb{C}_q[U/K_S] \subset \mathbb{C}_q[U]$ by
\[
\begin{aligned}
\mathbb{C}_q[U/K_S] &:= \{\phi \in \mathbb{C}_q[U] \mid (\mathrm{id} \otimes \iota_S^*)\Delta(\phi) = \phi \otimes 1\} \\
&\phantom{:}= \{\phi \in \mathbb{C}_q[U] \mid X.\phi = \varepsilon(X)\phi \ \ \forall\, X \in U_q(\mathfrak{l}_S)\}.
\end{aligned}
\tag{4.4}
\]
The algebra $\mathbb{C}_q[U/K_S]$ is a left $\mathbb{C}_q[U]$-subcomodule of $\mathbb{C}_q[U]$. We call it the quantized algebra of functions on the generalized flag manifold $U/K_S$.

In a similar way, one can define the quantized function algebra $\mathbb{C}_q[K_S^{ss}]$ corresponding to the semisimple part $K_S^{ss}$ of $K_S$ as the image of the dual of the natural embedding $U_q(\mathfrak{l}_S^{ss}) \hookrightarrow U_q(\mathfrak{g})$. Its Hopf $*$-algebra structure is again given by the formulas (3.5). The subalgebra $\mathbb{C}_q[U/K_S^{ss}]$ then consists by definition of all right $\mathbb{C}_q[K_S^{ss}]$-invariant elements in $\mathbb{C}_q[U]$. Note that $\mathbb{C}_q[U/K_S^{ss}] \subset \mathbb{C}_q[U]$ is a left $U_q(\mathfrak{h})$-submodule and that $\mathbb{C}_q[U/K_S]$ coincides with the subalgebra of $U_q(\mathfrak{h})$-invariant elements in $\mathbb{C}_q[U/K_S^{ss}]$.

We now turn to PBW type factorizations of the algebra $\mathbb{C}_q[U/K_S]$. Write $P(S)$, $P_+(S)$, resp. $P_{++}(S)$ for $\mathbb{K}$-span$\{\varpi_\alpha\}_{\alpha \in S}$ with $\mathbb{K} = \mathbb{Z}$, $\mathbb{Z}_+$ resp. $\mathbb{N}$. Set $S^c := \Delta \setminus S$. The quantized algebra $A_S^{hol}$ of holomorphic polynomials on $U/K_S$ is defined by
\[ A_S^{hol} := \bigoplus_{\lambda \in P_+(S^c)} B_\lambda \subset \mathbb{C}_q[U], \tag{4.5} \]
where $B_\lambda$ is given by (3.12) (cf. [19,20,41,9] and [15]). Note that $A_S^{hol}$ is a right $U_q(\mathfrak{g})$-comodule subalgebra of $\mathbb{C}_q[U]$, (4.5) being the (multiplicity free) decomposition of $A_S^{hol}$ into irreducible $U_q(\mathfrak{g})$-modules. The right $U_q(\mathfrak{g})$-module algebra $(A_S^{hol})^* \subset \mathbb{C}_q[U]$ is called the quantized algebra of antiholomorphic polynomials on $U/K_S$.
Lemma 4.2. The linear subspace
\[ A_S^{ss} := m\bigl((A_S^{hol})^* \otimes A_S^{hol}\bigr) \subset \mathbb{C}_q[U], \]
where $m$ is the multiplication map of $\mathbb{C}_q[U]$, is a right $U_q(\mathfrak{g})$-submodule $*$-subalgebra of $\mathbb{C}_q[U]$.

Proof. Proposition 3.2 (ii) implies that $A_S^{ss}$ is a subalgebra of $\mathbb{C}_q[U]$. The other assertions are immediate. □

The subalgebra $A_S^{ss}$ may be considered as a quantum analogue of the algebra of complex-valued polynomial functions on the real manifold $U/K_S^{ss}$.

Remark 4.3. In the classical setting ($q = 1$), the algebra $A_S^{ss}$ ($\#S^c = 1$) can be interpreted as an algebra of functions on the product of an affine spherical $G$-variety with its dual. The $G$-module structure on $A_S^{ss}$ is then related to the doubled $G$-action (see [32,33] for the terminology). These (and related) $G$-varieties have been studied in several papers, see for example [33,32] and [23].
The algebra $A_S^{ss} \subset \mathbb{C}_q[U]$ is stable under the left $U_q(\mathfrak{h})$-action, so we can speak of $U_q(\mathfrak{h})$-weighted elements in $A_S^{ss}$. Let $A_S$ be the left $U_q(\mathfrak{h})$-invariant elements of $A_S^{ss}$. Then $A_S \subset \mathbb{C}_q[U]$ is a right $U_q(\mathfrak{g})$-module $*$-subalgebra of $\mathbb{C}_q[U]$. We now have the following lemma.

Lemma 4.4. We have $A_S^{ss} \subset \mathbb{C}_q[U/K_S^{ss}]$, so in particular $A_S \subset \mathbb{C}_q[U/K_S]$. Furthermore,
\[ A_S = \mathrm{span}\{(C^{\lambda}_{v;v_\lambda})^* C^{\lambda}_{w;v_\lambda} \mid \lambda \in P_+(S^c),\ v, w \in V(\lambda)\}. \tag{4.6} \]

Proof. Choose $\lambda \in P_+(S^c)$ and $i \in S$. Then we have $X_i^+ \cdot v_\lambda = 0$ and $K_i \cdot v_\lambda = v_\lambda$. It follows that $\mathbb{C}v_\lambda \subset V(\lambda)$ is a one-dimensional $U_{q_i}(\mathfrak{sl}(2; \mathbb{C}))$-submodule, where we consider the $U_{q_i}(\mathfrak{sl}(2; \mathbb{C}))$ action on $V(\lambda)$ via the embedding $\phi_i : U_{q_i}(\mathfrak{sl}(2; \mathbb{C})) \hookrightarrow U_q(\mathfrak{g})$. It follows that $X_i^- \cdot v_\lambda = 0$. This readily implies that $A_S^{ss} \subset \mathbb{C}_q[U/K_S^{ss}]$. The remaining assertions are immediate. □

Definition 4.5. We call $A_S \subset \mathbb{C}_q[U/K_S]$ the factorized $*$-subalgebra associated with $U/K_S$.

In Theorem 4.10(i) below we show that Theorem 3.1 directly implies that $A_\emptyset = \mathbb{C}_q[U/K_\emptyset]$. In fact, we conjecture the following.

Conjecture 4.6. $A_S = \mathbb{C}_q[U/K_S]$ for all subsets $S$ of the simple roots $\Delta$.

In Theorem 4.10(ii) below we will prove the conjecture for a certain subclass of generalized flag manifolds that we shall define and classify in the following proposition. For the proof in these cases we use the so-called Parthasarathy–Ranga Rao–Varadarajan (PRV) conjecture, which was proved independently by Kumar [18] and Mathieu [29]. The PRV conjecture gives information about which irreducible constituents occur in tensor products of irreducible finite-dimensional $U$-modules.

Recall the notation introduced in Sect. 2. A pair $(U, K)$ with $K$ a subgroup of $U$ is called a Gel’fand pair if for every irreducible representation of $U$, the subspace of $K$-fixed vectors is at most one dimensional. The following proposition was observed by Koornwinder [14].
Proposition 4.7 ([14]). Let $U$ be a connected, simply connected compact Lie group with Lie algebra $\mathfrak{u}$, and let $\mathfrak{p} \subset \mathfrak{g}$ be a standard maximal parabolic subalgebra. Let $K \subset U$ be the connected subgroup with Lie algebra $\mathfrak{k} := \mathfrak{p} \cap \mathfrak{u}$. Then $(U, K)$ is a Gel’fand pair if and only if one of the following three conditions is satisfied:
(i) $(U, K)$ is an irreducible compact Hermitian symmetric pair;
(ii) $(U, K) \simeq (SO(2l + 1), U(l))$, ($l \geq 2$);
(iii) $(U, K) \simeq (Sp(l), U(1) \times Sp(l - 1))$, ($l \geq 2$).

Proof. For a list of the irreducible compact Hermitian symmetric pairs see [6, Ch. X, Table V]. The proposition follows from this and the classification of compact Gel’fand pairs $(U, K)$ with $U$ simple (cf. [17, Tab. 1]). □

Let $(U, K)$ be a pair from the list (i)–(iii) in Proposition 4.7, and let $(\mathfrak{u}, \mathfrak{k})$ be the associated pair of Lie algebras. Then $\mathfrak{k} = \mathfrak{k}_S$ for some subset $S \subset \Delta$ with $\#S^c = 1$. We call the simple root $\alpha \in S^c$ the Gel’fand node associated with $(U, K)$. A dominant weight $\lambda \in P_+$ is called spherical if the subspace of $K$-fixed vectors in $V(\lambda)$ is one dimensional. The corresponding representation $V(\lambda)$ is then also called spherical. We write $P_+^K \subset P_+$ for the subset of dominant spherical weights.

Proposition 4.8. Let $(U, K)$ be a pair from the list (i)–(iii) in Proposition 4.7, and let $\alpha \in \Delta$ be the associated Gel’fand node with corresponding fundamental weight $\varpi := \varpi_\alpha$. Then we have a multiplicity free irreducible decomposition of $U_q(\mathfrak{g})$-modules of the form
\[ V(\varpi)^* \otimes V(\varpi) \simeq \bigoplus_{i=0}^{l} V(\mu_i) \]
for certain $l \in \mathbb{N}$, where $\mu_0 := 0 \in P_+$ and $\{\mu_i\}_{i=1}^{l}$ is a subset of the dominant spherical weights $P_+^K$. Furthermore, every $\lambda \in P_+^K$ can be uniquely written as a $\mathbb{Z}_+$-linear combination of the $\mu_i$’s ($i \in [1, l]$).

Definition 4.9. The spherical weights $\mu_i$ ($i \in [1, l]$) are called the fundamental spherical weights associated with $(U, K)$.

Proof. It is well known that the trivial representation $V(0)$ occurs with multiplicity one in the tensor product decomposition of $V(\varpi)^* \otimes V(\varpi)$. Furthermore, observe that
\[ V(\varpi)^* \otimes V(\varpi) \simeq B_\varpi^* B_\varpi \subset A_{\{\alpha\}^c} \subset \mathbb{C}_q[U/K] \tag{4.7} \]
as right $U_q(\mathfrak{g})$-modules. By Proposition 4.1 we have the multiplicity free decomposition as right $U_q(\mathfrak{g})$-modules
\[ \mathbb{C}_q[U/K] \simeq \bigoplus_{\lambda \in P_+^K} V(\lambda), \tag{4.8} \]
from which it follows that the decomposition of $V(\varpi)^* \otimes V(\varpi)$ is multiplicity free, and that its irreducible constituents are all spherical. Krämer [17, Tab. 1] presented for each pair $(U, K)$ from the list (i)–(iii) in Proposition 4.7 a set of dominant spherical weights $\{\mu_i\}_{i=1}^{l}$ satisfying the property that every $\lambda \in P_+^K$ can be uniquely written as a $\mathbb{Z}_+$-linear combination of the $\mu_i$’s ($i \in [1, l]$).
The $\mu_i$’s are explicitly given as a $\mathbb{Z}_+$-linear combination of the fundamental dominant weights $\varpi_j$ ($j \in [1, r]$). In case of the Hermitian symmetric spaces $U/K$, there is an elegant procedure to recover the $\mu_i$’s as linear combinations of the fundamental dominant weights from the corresponding Satake diagrams [44].

We show now that all spherical representations $V(\mu_i)$ ($i \in [1, l]$) are constituents of $V(\varpi)^* \otimes V(\varpi)$ by using the PRV conjecture, which states the following. Let $\lambda, \mu \in P_+$ and $w \in W$. Let $[\lambda + w\mu]$ be the unique element in $P_+$ which lies in the $W$-orbit of $\lambda + w\mu$. Then $V([\lambda + w\mu])$ occurs with multiplicity at least one in $V(\lambda) \otimes V(\mu)$ (for a proof, see [18,29] or [22]). For each pair $(U, K)$ from the list (i)–(iii) of Proposition 4.7, it is now possible to find explicit Weyl group elements $w_i \in W$ such that $[\varpi - w_i\varpi] = \mu_i$ ($i \in [1, l]$). Combined with the PRV conjecture and the fact that $V(\varpi)^* \simeq V(-\sigma_0\varpi)$, this implies that $V(\mu_i)$ is a constituent of $V(\varpi)^* \otimes V(\varpi)$ for all $i \in [1, l]$.

As an example, let us follow the procedure for the compact Hermitian symmetric pair $(U, K) = (SO(2l), U(l))$ ($l \geq 2$). We use the standard realization of the root system $R$ of type $D_l$ in the $l$-dimensional vector space $V = \sum_{i=1}^{l} \mathbb{R}\varepsilon_i$, with basis given by $\alpha_i = \varepsilon_i - \varepsilon_{i+1}$ ($i \in [1, l - 1]$) and $\alpha_l = \varepsilon_{l-1} + \varepsilon_l$. The fundamental weights are given by
\[
\begin{aligned}
\varpi_i &= \varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_i \quad (i < l - 1), \\
\varpi_{l-1} &= (\varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_{l-1} - \varepsilon_l)/2, \\
\varpi_l &= (\varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_{l-1} + \varepsilon_l)/2.
\end{aligned}
\]
We set $\varpi = \varpi_l$ (i.e. $S^c = \{\alpha_l\}$). Let $\sigma_i$ be the linear map defined by $\varepsilon_j \mapsto -\varepsilon_j$ ($j = i, i + 1$) and $\varepsilon_j \mapsto \varepsilon_j$ otherwise. Then $\sigma_i \in W$ ($i \in [1, l - 1]$). If $l = 2l' + 1$, then
\[
\begin{aligned}
\varpi - \sigma_1\sigma_3 \cdots \sigma_{2i-1}\varpi &= \varpi_{2i} \quad (i \in [1, l' - 1]), \\
\varpi - \sigma_1\sigma_3 \cdots \sigma_{2l'-1}\varpi &= \varpi_{l-1} + \varpi_l.
\end{aligned}
\tag{4.9}
\]
If $l = 2l'$ then we have
\[
\begin{aligned}
\varpi - \sigma_1\sigma_3 \cdots \sigma_{2i-1}\varpi &= \varpi_{2i} \quad (i \in [1, l' - 1]), \\
\varpi - \sigma_1\sigma_3 \cdots \sigma_{2l'-1}\varpi &= 2\varpi_l.
\end{aligned}
\tag{4.10}
\]
By comparison with [17, Tab. 1] we see that (4.9) (resp. (4.10)) are exactly the spherical weights $\{\mu_i\}_{i=1}^{l}$ for the pair $(U, K) = (SO(2l), U(l))$. The other cases are checked in a similar manner.

To complete the proof, we have to show that the $V(\mu_i)$ ($i \in [0, l]$) are the only irreducible constituents which can occur in the tensor product decomposition of $V(\varpi)^* \otimes V(\varpi)$. This is also proved case by case. The cases corresponding to the exceptional groups can be directly verified using for instance the Maple package “qtensor” of Stembridge [43] (footnote 1). The special case $(U, K) = (SU(p + l), S(U(p) \times U(l)))$ (i.e. for which $U/K$ is a complex Grassmannian) follows easily from the Pieri formula for Schur functions [28, Ch. I, (5.17)] (see [4] for more details). The remaining cases can be checked by showing that for $\lambda \in P_+^K \setminus \{\mu_i\}_{i=0}^{l}$, we have $\lambda \not\leq \varpi - \sigma_0\varpi$, which implies that $V(\lambda)$ cannot occur as constituent of $V(\varpi)^* \otimes V(\varpi)$. □

Footnote 1: http://www.math.lsa.umich.edu/~jrs/maple.html
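The identities (4.9) and (4.10) for the pair $(SO(2l), U(l))$ can be checked mechanically in the $\varepsilon$-basis. The following sketch does this with exact rational arithmetic; it only uses the formulas for $\varpi_i$ and $\sigma_i$ quoted in the example above, while the function names and the sample ranks $l = 5$ and $l = 6$ are assumptions chosen for illustration.

```python
from fractions import Fraction


def fundamental_weights(l):
    """Fundamental weights of type D_l in the epsilon basis, indexed 1..l."""
    half = Fraction(1, 2)
    weights = {}
    for i in range(1, l - 1):  # the case i < l - 1
        weights[i] = tuple([Fraction(1)] * i + [Fraction(0)] * (l - i))
    weights[l - 1] = tuple([half] * (l - 1) + [-half])
    weights[l] = tuple([half] * l)
    return weights


def sigma(i, v):
    """sigma_i: epsilon_j -> -epsilon_j for j = i, i + 1 (1-based), identity otherwise."""
    return tuple(-x if j in (i - 1, i) else x for j, x in enumerate(v))


def shifted_weight(l, m):
    """Return varpi - sigma_1 sigma_3 ... sigma_{2m-1} varpi for varpi = varpi_l."""
    varpi = fundamental_weights(l)[l]
    v = varpi
    for i in range(1, 2 * m, 2):  # i = 1, 3, ..., 2m - 1; these maps commute
        v = sigma(i, v)
    return tuple(a - b for a, b in zip(varpi, v))


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


if __name__ == "__main__":
    w5 = fundamental_weights(5)
    print(shifted_weight(5, 2) == add(w5[4], w5[5]))  # last line of (4.9), l = 5
```

In this example the differences are already dominant, so no further Weyl group element is needed to reach the representative $[\varpi - w_i\varpi]$.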
The following main theorem of this section gives a positive answer to Conjecture 4.6 for $S = \emptyset$ and for the pairs classified in Proposition 4.7.

Theorem 4.10. The factorized $*$-subalgebra $A_S$ is equal to $\mathbb{C}_q[U/K_S]$ if
(i) $S = \emptyset$, i.e. $U/K_S = U/T$ is the full flag manifold;
(ii) $\#S^c = 1$ and the simple root $\alpha \in S^c$ is a Gel’fand node.

Proof. To prove (i) we look at the simultaneous eigenspace decomposition of $\mathbb{C}_q[U]$ with respect to the left $U_q(\mathfrak{h})$-action on $\mathbb{C}_q[U]$. The simultaneous eigenspace corresponding to the character $\varepsilon$ of $U_q(\mathfrak{h})$ is exactly $\mathbb{C}_q[U/T]$. Using Soibel’man’s factorization of $\mathbb{C}_q[U]$ (cf. Theorem 3.1) and Lemma 4.4, it is then easily checked that $\mathbb{C}_q[U/T] = A_\emptyset$.

To prove (ii) we note that
\[ \bigoplus_{i=0}^{l} V(\mu_i) \simeq (B_\varpi)^* B_\varpi \subset A_{\{\alpha\}^c} \]
as right $U_q(\mathfrak{g})$-modules by Proposition 4.8 and (4.7) (here we use the notation introduced in Proposition 4.8). Now $\mathbb{C}_q[U]$ is an integral domain (cf. [8, Lemma 9.1.9 (i)]), hence $v_\lambda v_\mu \in A_{\{\alpha\}^c}$ is a highest weight vector of highest weight $\lambda + \mu$ if $v_\lambda, v_\mu \in A_{\{\alpha\}^c}$ are highest weight vectors of highest weight $\lambda$ respectively $\mu$. It follows that
\[ \bigoplus_{\lambda \in P_+^K} V(\lambda) \hookrightarrow A_{\{\alpha\}^c} \]
as right $U_q(\mathfrak{g})$-modules. Combining with (4.8), it follows that $A_{\{\alpha\}^c} = \mathbb{C}_q[U/K_{\{\alpha\}^c}]$, as required. □

In the remainder of the paper we study the irreducible $*$-representations of the $*$-algebras $A_S$ and $\mathbb{C}_q[U/K_S]$. In the next section we first consider the restriction of the irreducible $*$-representations of $\mathbb{C}_q[U]$ to the $*$-algebras $A_S$ and $\mathbb{C}_q[U/K_S]$.

5. Restriction of Irreducible $*$-Representations to $\mathbb{C}_q[U/K]$

Let us first recall some results from Soibel’man [40] concerning the irreducible $*$-representations of $\mathbb{C}_q[U]$. Let $\{e_i\}_{i \in \mathbb{Z}_+}$ be the standard orthonormal basis of $l_2(\mathbb{Z}_+)$. Write $B(l_2(\mathbb{Z}_+))$ for the algebra of bounded linear operators on $l_2(\mathbb{Z}_+)$. Then the formulas
\[
\begin{aligned}
\pi_q(t_{11}) e_j &= \sqrt{1 - q^{2j}}\; e_{j-1}, & \pi_q(t_{12}) e_j &= -q^{j+1} e_j, \\
\pi_q(t_{21}) e_j &= q^{j} e_j, & \pi_q(t_{22}) e_j &= \sqrt{1 - q^{2(j+1)}}\; e_{j+1}
\end{aligned}
\tag{5.1}
\]
(here $\pi_q(t_{11}) e_0 = 0$) uniquely determine an irreducible $*$-representation $\pi_q : \mathbb{C}_q[SU(2)] \to B(l_2(\mathbb{Z}_+))$. Now the dual of the injective Hopf $*$-algebra morphism $\phi_i : U_{q_i}(\mathfrak{sl}(2; \mathbb{C})) \hookrightarrow U_q(\mathfrak{g})$ corresponding to the $i$th node of the Dynkin diagram ($i \in [1, r]$) is a surjective Hopf $*$-algebra morphism $\phi_i^* : \mathbb{C}_q[U] \twoheadrightarrow \mathbb{C}_{q_i}[SU(2)]$. Hence we obtain irreducible $*$-representations $\pi_i := \pi_{q_i} \circ \phi_i^* : \mathbb{C}_q[U] \to B(l_2(\mathbb{Z}_+))$.
On the other hand, there is a family of one-dimensional $*$-representations $\tau_t$ of $\mathbb{C}_q[U]$ parametrized by the maximal torus $t \in T \simeq \mathbb{T}^r$ ($\mathbb{T} \subset \mathbb{C}$ denoting the unit circle in the complex plane). More explicitly, let $\iota_T : U_q(\mathfrak{h}) \hookrightarrow U_q(\mathfrak{g})$ be the natural Hopf $*$-algebra embedding, and set $\mathbb{C}_q[T] := \mathrm{span}\{\phi_\mu\}_{\mu \in P} \subset U_q(\mathfrak{h})^*$, where $\phi_\mu(K^\sigma) := q^{(\mu,\sigma)}$ for $\sigma \in Q$. As in (3.5) we get a Hopf $*$-algebra structure on $\mathbb{C}_q[T]$. Then $\iota_T^* : \mathbb{C}_q[U] \to \mathbb{C}_q[T]$, $\iota_T^*(\phi) := \phi \circ \iota_T$, is a surjective Hopf $*$-algebra morphism. Any irreducible $*$-representation of $\mathbb{C}_q[T]$ is one-dimensional and can be written as $\tilde{\tau}_t(\phi_\mu) := t^\mu$ for a unique $t \in T \simeq \mathbb{T}^r$. Here $t^\mu := t_1^{m_1} \cdots t_r^{m_r}$ for $\mu = \sum_{i=1}^{r} m_i \varpi_i$. So we obtain a one-dimensional $*$-representation $\tau_t := \tilde{\tau}_t \circ \iota_T^*$ of $\mathbb{C}_q[U]$, which is given explicitly on matrix elements $C^{\lambda}_{\mu,i;\nu,j}$ by the formula
\[ \tau_t(C^{\lambda}_{\mu,i;\nu,j}) = \delta_{\mu,\nu}\delta_{i,j}\, t^\mu. \tag{5.2} \]
The following theorem completely describes the irreducible $*$-representations of $\mathbb{C}_q[U]$.

Theorem 5.1 (Soibel’man [40]). Let $\sigma \in W$, and fix a reduced expression $\sigma = s_{i_1} s_{i_2} \cdots s_{i_l}$. The $*$-representation
\[ \pi_\sigma := \pi_{i_1} \otimes \pi_{i_2} \otimes \cdots \otimes \pi_{i_l} \tag{5.3} \]
does not depend on the choice of reduced expression (up to equivalence). The set $\{\pi_\sigma \otimes \tau_t \mid t \in T,\ \sigma \in W\}$ is a complete set of mutually inequivalent irreducible $*$-representations of $\mathbb{C}_q[U]$.

Here tensor products of $*$-representations are defined in the usual way by means of the coalgebra structure on $\mathbb{C}_q[U]$. The irreducible representation $\pi_e$ with respect to the unit element $e \in W$ is by definition the one-dimensional $*$-representation associated with the counit on $\mathbb{C}_q[U]$. In Soibel’man’s terminology, the representations $\pi_\sigma \otimes \tau_t$ are said to be associated with the Schubert cell $X_\sigma$ of $U/T$ (cf. Sect. 2).

We also mention here an important property of the kernel of $\pi_\sigma$, which we will repeatedly need later on. Let $U_q(\mathfrak{b})$ be the subalgebra of $U_q(\mathfrak{g})$ generated by the $K_i^{\pm 1}$ and the $X_i^+$ ($i \in [1, r]$). For any $\lambda \in P_+$, the $*$-representation $\pi_\sigma$ satisfies
\[ \pi_\sigma(C^{\lambda}_{v;v_\lambda}) = 0 \quad (v \notin U_q(\mathfrak{b}) v_{\sigma\lambda}), \qquad \pi_\sigma(C^{\lambda}_{v_{\sigma\lambda};v_\lambda}) \neq 0 \tag{5.4} \]
(cf. [40, Thm. 5.7]). Formula (5.4) combined with [1, Lemma 2.12] shows that the classical limit of the kernel of $\pi_\sigma$ formally tends to the ideal of functions vanishing on $X_\sigma$.

Fix now a subset $S \subset \Delta$. We freely use the notation introduced earlier. Our next goal is to describe how the $*$-representations $\pi_\sigma$ decompose under restriction to the subalgebra $\mathbb{C}_q[U/K_S]$. Consider the selfadjoint operators
\[ L_{\sigma\lambda;\lambda} := \pi_\sigma\bigl((C^{\lambda}_{\sigma\lambda;\lambda})^* C^{\lambda}_{\sigma\lambda;\lambda}\bigr) \tag{5.5} \]
for $\lambda \in P_+(S^c)$. Let $\sigma = s_{i_1} \cdots s_{i_l}$ be a reduced expression for $\sigma$, and set $\pi_\sigma = \pi_{i_1} \otimes \pi_{i_2} \otimes \cdots \otimes \pi_{i_l}$. Then it follows from [40, Proof of Prop. 5.2] (see also [40, Proof of Prop. 5.8]) that
\[ \pi_\sigma(C^{\lambda}_{\sigma\lambda;\lambda}) = c\, \pi_{q_{i_1}}(t_{21})^{(\lambda,\gamma_1^\vee)} \otimes \pi_{q_{i_2}}(t_{21})^{(\lambda,\gamma_2^\vee)} \otimes \cdots \otimes \pi_{q_{i_l}}(t_{21})^{(\lambda,\gamma_l^\vee)}, \tag{5.6} \]
where the scalar $c \in \mathbb{T}$ depends on the particular choices of bases for the irreducible representations $V(\mu)$ ($\mu \in P_+$), and with
\[ \gamma_k := s_{i_l} s_{i_{l-1}} \cdots s_{i_{k+1}}(\alpha_{i_k}) \quad (1 \leq k \leq l - 1), \qquad \gamma_l := \alpha_{i_l}. \tag{5.7} \]
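The roots $\gamma_k$ of (5.7) are straightforward to compute for a given reduced word. The sketch below does so in type $A_{n-1}$, realized in the $\varepsilon$-basis where $s_i$ swaps the coordinates $i$ and $i+1$, and compares the result with the set of positive roots that $\sigma$ sends to negative roots. The restriction to type $A$, the encoding of a reduced expression as a list of indices, and all function names are assumptions of this example.

```python
def simple_root(i, n):
    """alpha_i = epsilon_i - epsilon_{i+1} in R^n (i is 1-based)."""
    v = [0] * n
    v[i - 1], v[i] = 1, -1
    return tuple(v)


def reflect(i, v):
    """Simple reflection s_i in type A: swap coordinates i and i + 1."""
    w = list(v)
    w[i - 1], w[i] = w[i], w[i - 1]
    return tuple(w)


def gammas(word, n):
    """gamma_k = s_{i_l} ... s_{i_{k+1}}(alpha_{i_k}) for the word (i_1, ..., i_l)."""
    out = []
    for k in range(len(word)):
        v = simple_root(word[k], n)
        for i in word[k + 1:]:  # s_{i_{k+1}} acts first, s_{i_l} last
            v = reflect(i, v)
        out.append(v)
    return out


def act(word, v):
    """Apply sigma = s_{i_1} ... s_{i_l} to v (the rightmost reflection acts first)."""
    for i in reversed(word):
        v = reflect(i, v)
    return v


def positive_roots(n):
    roots = []
    for a in range(n):
        for b in range(a + 1, n):
            v = [0] * n
            v[a], v[b] = 1, -1
            roots.append(tuple(v))
    return roots


def is_negative(v):
    """A root epsilon_a - epsilon_b is negative when its first non-zero entry is -1."""
    return next(x for x in v if x != 0) < 0


def inversion_set(word, n):
    """Positive roots mapped to negative roots by sigma."""
    return {b for b in positive_roots(n) if is_negative(act(word, b))}


if __name__ == "__main__":
    word = [1, 2]  # sigma = s_1 s_2 in type A_2
    print(gammas(word, 3))
    print(set(gammas(word, 3)) == inversion_set(word, 3))
```

For $\sigma = s_1 s_2$ in type $A_2$ this gives $\gamma_1 = \alpha_1 + \alpha_2$ and $\gamma_2 = \alpha_2$, and the two sets coincide.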
The proof of (5.6), which was given in [40] under the assumption that $\lambda \in P_{++}$, is in fact valid for all dominant weights $\lambda \in P_+$. It follows from (5.1), (5.5) and (5.6) that $l_2(\mathbb{Z}_+)^{\otimes l(\sigma)}$ decomposes as an orthogonal direct sum of eigenspaces for $L_{\sigma\lambda;\lambda}$,
\[ l_2(\mathbb{Z}_+)^{\otimes l(\sigma)} = \bigoplus_{\gamma \in I(\lambda)} H_\gamma(\lambda), \tag{5.8} \]
where $I(\lambda) \subset (0, 1]$ denotes the set of eigenvalues of $L_{\sigma\lambda;\lambda}$, and $H_\gamma(\lambda)$ denotes the eigenspace of $L_{\sigma\lambda;\lambda}$ corresponding to the eigenvalue $\gamma \in I(\lambda)$ (we suppress the dependence on $\sigma$ if there is no confusion possible). Observe that $1 \in I(\lambda)$ and that $L_{\sigma\lambda;\lambda}$ is injective.

Recall the definition of the set $W^S$ of minimal coset representatives (cf. (2.9)). An alternative characterization of $W^S$ is given by
\[ W^S = \{\sigma \in W \mid \sigma(R_S^+) \subset R^+\}, \tag{5.9} \]
where $R_S^+ := R^+ \cap \mathrm{span}\{S\}$ (cf. [1, Prop. 5.1 (iii)]). Using this alternative description of $W^S$ we obtain the following properties of $L_{\sigma\lambda;\lambda}$ for $\lambda \in P_{++}(S^c)$.

Proposition 5.2. Suppose that $\sigma \in W^S$ and $\lambda \in P_{++}(S^c)$. Then
(i) $L_{\sigma\lambda;\lambda}$ is a compact operator;
(ii) The eigenspace $H_1(\lambda)$ of $L_{\sigma\lambda;\lambda}$ corresponding to the eigenvalue 1 is spanned by the vector $e_0^{\otimes l(\sigma)}$.

Proof. Fix a $\lambda \in P_{++}(S^c)$, and let $\sigma = s_{i_1} s_{i_2} \cdots s_{i_l}$ be a reduced expression of a minimal coset representative $\sigma \in W^S$. It is well known that
\[ R^+ \cap \sigma^{-1}(R^-) = \{\gamma_k\}_{k=1}^{l}, \tag{5.10} \]
where the $\gamma_k$ are defined by (5.7). We have $\gamma_k \in R^+ \setminus R_S^+$ by (5.9). It follows that $(\lambda, \gamma_k^\vee) > 0$ for all $k$, since $\lambda \in P_{++}(S^c)$. By (5.1) and (5.6) it follows that $H_1(\lambda) = \mathrm{span}\{e_0^{\otimes l(\sigma)}\}$ and that $H_\gamma(\lambda)$ is finite-dimensional for all $\gamma \in I(\lambda)$. Since the spectrum of $L_{\sigma\lambda;\lambda}$ (which is equal to $I(\lambda) \cup \{0\}$) does not have a limit point except 0, we conclude that $L_{\sigma\lambda;\lambda}$ is a compact operator (cf. [37, Thm. 12.30]). □

Let us recall the following well known inequalities for weights of finite-dimensional irreducible representations of $\mathfrak{g}$ (or, equivalently, $U_q(\mathfrak{g})$).

Proposition 5.3. Let $\lambda \in P_+$ and $\mu, \nu \in P(\lambda)$. Then $(\lambda, \lambda) \geq (\mu, \nu)$, and equality holds if and only if $\mu = \nu \in W\lambda$.

For a proof of the proposition, see for instance [10, Prop. 11.4]. The proof is based on the following lemma, which we will also need later on. The lemma is a slightly weaker version of [10, Lemma 11.2].

Lemma 5.4. Let $\lambda \in P_+$ and $\mu \in P(\lambda) \setminus \{\lambda\}$, and let $m_i \in \mathbb{Z}_+$ ($i \in [1, r]$) be the expansion coefficients defined by $\lambda - \mu = \sum_i m_i \alpha_i$. Then there is an $i \in [1, r]$ with $m_i > 0$ and $\lambda(H_i) \neq 0$.
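The spectral statements in the proof of Proposition 5.2 can be made concrete. By (5.1) and (5.6), and since $|c| = 1$, the operator $L_{\sigma\lambda;\lambda}$ acts on $e_{j_1} \otimes \cdots \otimes e_{j_l}$ by the scalar $\prod_k q_{i_k}^{2 j_k n_k}$ with $n_k := (\lambda, \gamma_k^\vee)$. The following sketch enumerates these eigenvalues; the sample values of $q_{i_k}$ and $n_k$, the cut-off `jmax` and the function names are hypothetical and serve only as an illustration.

```python
from itertools import product


def eigenvalue(js, qs, ns):
    """Eigenvalue of L on e_{j_1} x ... x e_{j_l}: product of q_k^(2 j_k n_k)."""
    val = 1.0
    for j, q, n in zip(js, qs, ns):
        val *= q ** (2 * j * n)
    return val


def eigenvectors_above(threshold, qs, ns, jmax):
    """Multi-indices in [0, jmax]^l whose eigenvalue is at least `threshold`."""
    return [
        js
        for js in product(range(jmax + 1), repeat=len(qs))
        if eigenvalue(js, qs, ns) >= threshold
    ]


if __name__ == "__main__":
    qs = (0.5, 0.5)
    # All exponents positive: eigenvalue 1 occurs only at j = (0, 0).
    print(eigenvectors_above(1.0, qs, (1, 2), jmax=10))
    # One vanishing exponent: eigenvalue 1 occurs for every j_2 up to the cut-off.
    print(len(eigenvectors_above(1.0, qs, (1, 0), jmax=10)))
```

When every $n_k$ is positive, only finitely many basis vectors have eigenvalue above any fixed positive threshold, which is the mechanism behind compactness. If some $n_k$ vanishes, the eigenvalue does not depend on $j_k$ and every eigenspace is infinite-dimensional, which indicates why the hypotheses $\sigma \in W^S$ and $\lambda \in P_{++}(S^c)$ are used.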
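We now have the following proposition, which can be regarded as a quantum analogue of the “if” part of Proposition 2.2.

Proposition 5.5. Let $\sigma \in W^S$. Then $\pi_\sigma$ restricts to an irreducible ∗-representation of the factorized ∗-algebra $A_S$. In particular, $\pi_\sigma$ restricts to an irreducible ∗-representation of $C_q[U/K_S]$.

Proof. Let $\lambda \in P_{++}(S^c)$ and $\sigma \in W^S$. Suppose $H \subset l^2(\mathbb{Z}_+)^{\otimes l(\sigma)}$ is a non-zero closed subspace invariant under $\pi_\sigma|_{A_S}$. Set $\gamma := \|L_{\sigma\lambda;\lambda}|_H\|$. Then $\gamma > 0$, since $L_{\sigma\lambda;\lambda}$ is injective, and $\gamma$ is an eigenvalue of $L_{\sigma\lambda;\lambda}|_H$ by Proposition 5.2(i). Let $H_\gamma$ be the corresponding eigenspace. We claim that
$$\pi_\sigma\big((C^\lambda_{\mu,i;\lambda})^* C^\lambda_{\mu,i;\lambda}\big) H_\gamma = 0, \qquad \mu \neq \sigma\lambda. \tag{5.11}$$
Suppose for the moment that the claim is correct. Then (3.10) and (5.11) imply $\gamma = 1$, hence $H_\gamma = \mathrm{span}\{e_0^{\otimes l(\sigma)}\}$ by Proposition 5.2(ii). So every non-zero closed invariant subspace contains the vector $e_0^{\otimes l(\sigma)}$. Since $H^\perp$ is also a closed invariant subspace, we must have $H^\perp = \{0\}$, i.e. $H = l^2(\mathbb{Z}_+)^{\otimes l(\sigma)}$. It remains therefore to prove the claim (5.11).

By (5.4) we have $\pi_\sigma(C^\lambda_{\mu,i;\lambda}) = 0$ if $\mu < \sigma\lambda$. Hence
$$L_{\sigma\lambda;\lambda}\, \pi_\sigma\big((C^\lambda_{\mu,i;\lambda})^* C^\lambda_{\mu,i;\lambda}\big) = q^{(\lambda,\lambda)-(\mu,\sigma\lambda)}\, \pi_\sigma\big((C^\lambda_{\mu,i;\lambda} C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\sigma\lambda;\lambda} C^\lambda_{\mu,i;\lambda}\big)$$
$$= q^{2(\lambda,\lambda)-2(\mu,\sigma\lambda)}\, \pi_\sigma\big((C^\lambda_{\mu,i;\lambda})^* (C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\sigma\lambda;\lambda} C^\lambda_{\mu,i;\lambda}\big)$$
$$= q^{2(\lambda,\lambda)-2(\mu,\sigma\lambda)}\, \pi_\sigma\big((C^\lambda_{\mu,i;\lambda})^* C^\lambda_{\sigma\lambda;\lambda}\big)\, \pi_\sigma\big((C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\mu,i;\lambda}\big),$$
where we used Proposition 3.2(i) in the second equality and Proposition 3.2(ii) in the first and third equality. So (5.11) will then follow from
$$\pi_\sigma\big((C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\mu,i;\lambda}\big) H_\gamma = 0, \qquad \mu \neq \sigma\lambda, \tag{5.12}$$
in view of the injectivity of $L_{\sigma\lambda;\lambda}$. Fix $h \in H_\gamma$ and $\mu \in P(\lambda)$ with $\mu \neq \sigma\lambda$. By Lemma 4.4 we have $(C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\mu,i;\lambda} \in A_S \subset C_q[U/K_S]$, hence the vector
$$\tilde h := \pi_\sigma\big((C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\mu,i;\lambda}\big) h \tag{5.13}$$
lies in the invariant subspace $H$. Again using the commutation relations given in Proposition 3.2 and Corollary 3.3, we see that $\tilde h$ is an eigenvector of $L_{\sigma\lambda;\lambda}$ with eigenvalue $\tilde\gamma := q^{2(\lambda,\sigma^{-1}(\mu)-\lambda)}\gamma$. We have $\tilde\gamma > \gamma$ by Proposition 5.3. By the maximality of $\gamma$, we conclude that $\tilde h = 0$. This proves (5.12), hence also the claim (5.11). □

Definition 5.6. We say that the irreducible ∗-representation $\pi_\sigma$ ($\sigma \in W^S$) of $C_q[U/K_S]$ is associated with the Schubert cell $X_\sigma \subset U/K_S$.

The following proposition can be regarded as a quantum analogue of Proposition 2.3 as well as of the “only if” part of Proposition 2.2.

Proposition 5.7. Let $\sigma \in W$, and let $\sigma = uv$ be the unique decomposition of $\sigma$ with $u \in W^S$ and $v \in W_S$. For $\pi_\sigma = \pi_u \otimes \pi_v$ (cf. (2.10)) and $t \in T$, we have
$$(\pi_\sigma \otimes \tau_t)(a) = \pi_u(a) \otimes \mathrm{id}^{\otimes l(v)}, \qquad a \in C_q[U/K_S].$$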
Proof. Recall that the one-dimensional ∗-representation $\tau_t$ factorizes through $\iota_T^* : C_q[U] \to C_q[T]$ and that $\pi_i$ factorizes through $\phi_i^* : C_q[U] \to C_{q_i}[SU(2)]$. The maps $\iota_T^*$ and $\phi_i^*$ ($i \in S$) factorize through $\iota_S^* : C_q[U] \to C_q[K_S]$ since the ranges of $\iota_T$ and $\phi_i$ ($i \in S$) lie in the Hopf-subalgebra $U_q(\mathfrak{l}_S)$. Hence $\pi_v \otimes \tau_t$ ($v \in W_S$, $t \in T$) factorizes through $\iota_S^*$, say $\pi_v \otimes \tau_t = \pi_{v,t} \circ \iota_S^*$. Then we have for $a \in C_q[U/K_S]$,
$$(\pi_\sigma \otimes \tau_t)(a) = (\pi_u \otimes \pi_v \otimes \tau_t) \circ \Delta(a) = (\pi_u \otimes \pi_{v,t}) \circ (\mathrm{id} \otimes \iota_S^*)\Delta(a) = \pi_u(a) \otimes \pi_{v,t}(1) = \pi_u(a) \otimes \mathrm{id}^{\otimes l(v)},$$
which completes the proof of the proposition. □

Lemma 5.8. The ∗-representations $\{\pi_\sigma\}_{\sigma \in W^S}$, considered as ∗-representations of $A_S$ respectively $C_q[U/K_S]$, are mutually inequivalent.

Proof. Let $\sigma, \sigma' \in W^S$ with $\sigma \neq \sigma'$ and $\lambda \in P_{++}(S^c)$. Then $\sigma\lambda \neq \sigma'\lambda$, since the isotropy subgroup $\{\sigma \in W \mid \sigma\lambda = \lambda\}$ is equal to $W_S$ by Chevalley’s Lemma (cf. [11, Prop. 2.72]). Without loss of generality we may assume that $\sigma\lambda \not\geq \sigma'\lambda$. Then we have $\pi_{\sigma'}\big((C^\lambda_{\sigma\lambda,\lambda})^* C^\lambda_{\sigma\lambda,\lambda}\big) = 0$ by (5.4). On the other hand, $L_{\sigma\lambda;\lambda}$ is injective. It follows that $\pi_\sigma \not\simeq \pi_{\sigma'}$ as ∗-representations of $A_S$. □

Let now $\|\cdot\|_u$ be the universal $C^*$-norm on $C_q[U]$ (cf. [3, §4]), so
$$\|a\|_u := \sup_{\sigma \in W,\, t \in T} \|(\pi_\sigma \otimes \tau_t)(a)\|, \qquad a \in C_q[U]. \tag{5.14}$$
Let $C_q(U)$ (resp. $C_q(U/K_S)$) be the completion of $C_q[U]$ (resp. $C_q[U/K_S]$) with respect to $\|\cdot\|_u$. All ∗-representations $\pi_\sigma \otimes \tau_t$ of $C_q[U]$ extend to ∗-representations of the $C^*$-algebra $C_q(U)$ by continuity. The results of this section can now be summarized as follows.

Theorem 5.9. Let $S \subset \Delta$. Then $\{\pi_\sigma\}_{\sigma \in W^S}$ is a complete set of mutually inequivalent irreducible ∗-representations of $C_q(U/K_S)$.

Proof. This follows from the previous results, since every irreducible ∗-representation of $C_q(U/K_S)$ appears as an irreducible component of $\sigma|_{C_q(U/K_S)}$ for some irreducible ∗-representation $\sigma$ of $C_q(U)$ (cf. [5, Prop. 2.10.2]). □

Theorem 5.9 does not imply that $\{\pi_\sigma\}_{\sigma \in W^S}$ is a complete set of irreducible ∗-representations of the ∗-algebra $C_q[U/K_S]$ itself. Indeed, it is not clear that any irreducible ∗-representation of $C_q[U/K_S]$ can be continuously extended to a ∗-representation of $C_q(U/K_S)$. In the remainder of this paper we will deal with the classification of the irreducible ∗-representations of $A_S$. In particular, this will yield a complete classification of the irreducible ∗-representations of $C_q[U/K_S]$ for the generalized flag manifolds $U/K_S$ for which the PBW factorization is valid (cf. Theorem 4.10).

6. Irreducible ∗-Representations of $A_S$

Let $S \subset \Delta$ be any subset. In this section we show that $\{\pi_\sigma\}_{\sigma \in W^S}$ exhausts the set of irreducible ∗-representations of $A_S$ (up to equivalence). We fix therefore an arbitrary irreducible ∗-representation $\tau : A_S \to B(H)$ and we will show that $\tau \simeq \pi_\sigma$ for a
(unique) $\sigma \in W^S$. In order to associate the proper minimal coset representative $\sigma \in W^S$ with $\tau$, we need to study the range $\tau(A_S) \subset B(H)$ of $\tau$ in more detail. For $\lambda \in P_+(S^c)$ and $\mu, \nu \in P(\lambda)$, let $\tau^\lambda(\mu;\nu), \tau^\lambda(\nu) \subset B(H)$ be the linear subspaces
$$\tau^\lambda(\mu;\nu) := \{\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \mid v \in V(\lambda)_\mu,\ w \in V(\lambda)_\nu\},$$
$$\tau^\lambda(\nu) := \{\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \mid v \in V(\lambda),\ w \in V(\lambda)_\nu\}. \tag{6.1}$$
For $\lambda \in P_+(S^c)$ set
$$D(\lambda) := \{\nu \in P(\lambda) \mid \tau^\lambda(\nu) \neq \{0\}\} \tag{6.2}$$
and let $D_m(\lambda)$ be the set of weights $\nu \in D(\lambda)$ such that $\nu' \notin D(\lambda)$ for all $\nu' < \nu$. By (3.10), we have $D(\lambda) \neq \emptyset$, hence also $D_m(\lambda) \neq \emptyset$. We start with a lemma which is useful for the computation of commutation relations in $\tau(A_S) \subset B(H)$.

Lemma 6.1. Let $\lambda, \Lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. Let $v \in V(\lambda)$, $v' \in V(\lambda)_{\nu'}$ with $\nu' < \nu$ and $w, w' \in V(\Lambda)$. Then the product of the four matrix elements $(C^\lambda_{v;v_\lambda})^*$, $C^\lambda_{v';v_\lambda}$, $(C^\Lambda_{w;v_\Lambda})^*$, and $C^\Lambda_{w';v_\Lambda}$, taken in an arbitrary order, is contained in $\mathrm{Ker}(\tau)$.

Proof. Since $\mathrm{Ker}(\tau)$ is a two-sided ∗-ideal in $A_S$, it follows from the definitions that
$$(C^\Lambda_{w;v_\Lambda})^* C^\Lambda_{w';v_\Lambda} (C^\lambda_{v;v_\lambda})^* C^\lambda_{v';v_\lambda} \in \mathrm{Ker}(\tau).$$
If the product of the four matrix coefficients is taken in a different order, then we can rewrite it by Proposition 3.2 and by Corollary 3.3 as a linear combination of products of matrix elements
$$(C^\Lambda_{u;v_\Lambda})^* C^\Lambda_{u';v_\Lambda} (C^\lambda_{x;v_\lambda})^* C^\lambda_{x';v_\lambda}$$
with $x' \in V(\lambda)_{\nu''}$ and $\nu'' \le \nu' < \nu$. These are all contained in $\mathrm{Ker}(\tau)$, since $\nu \in D_m(\lambda)$. □

Lemma 6.2. Let $\lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. Then
(i) $\tau^\lambda(\nu;\nu) \neq \{0\}$;
(ii) $\nu = \sigma\lambda$ for some $\sigma \in W^S$.

Proof. Let $\lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. Fix weight vectors $v \in V(\lambda)_\mu$, $w \in V(\lambda)_\nu$ such that $T_{v;w} := \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \neq 0$. By Lemma 6.1, we compute
$$(T_{v;w})^* T_{v;w} = q^{(\mu,\nu)-(\lambda,\lambda)}\, \tau\big(C^\lambda_{v;v_\lambda} (C^\lambda_{v;v_\lambda} C^\lambda_{w;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) = \tau\big(C^\lambda_{v;v_\lambda} (C^\lambda_{v;v_\lambda})^*\big)\, T_{w;w}, \tag{6.3}$$
where we used Proposition 3.2(ii) in the first equality and Proposition 3.2(i) in the second equality. On the other hand, $(T_{v;w})^* T_{v;w} \neq 0$ since $B(H)$ is a $C^*$-algebra, so we conclude that $T_{w;w} \neq 0$. In particular, $\tau^\lambda(\nu;\nu) \neq \{0\}$. Formula (6.3) for $v = w$ gives
$$0 \neq (T_{w;w})^* T_{w;w} = \tau\big(C^\lambda_{w;v_\lambda} (C^\lambda_{w;v_\lambda})^*\big)\, T_{w;w} = q^{(\lambda,\lambda)-(\nu,\nu)}\, T_{w;w} T_{w;w},$$
where we have used Proposition 3.2(ii) in the last equality. It follows that $(\lambda,\lambda) = (\nu,\nu)$, since $T_{w;w}$ is selfadjoint. By Proposition 5.3 we obtain $\nu = \sigma\lambda$ for some $\sigma \in W^S$. □
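For $\lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$ we set
$$L_{\nu;\lambda} := \tau\big((C^\lambda_{\nu;\lambda})^* C^\lambda_{\nu;\lambda}\big). \tag{6.4}$$
This definition makes sense since $\dim(V(\lambda)_\nu) = 1$ by Lemma 6.2(ii). Furthermore, $L_{\nu;\lambda}$ is a non-zero selfadjoint operator which commutes with the elements of $\tau(A_S)$ in the following way.

Lemma 6.3. Let $\lambda, \Lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. For $v \in V(\Lambda)_\mu$, $w \in V(\Lambda)_{\mu'}$ we have
$$L_{\nu;\lambda}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* C^\Lambda_{w;v_\Lambda}\big) = q^{2(\nu,\mu'-\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* C^\Lambda_{w;v_\Lambda}\big)\, L_{\nu;\lambda}.$$

Proof. By Lemma 6.1 and the commutation relations in Sect. 3 we compute
$$L_{\nu;\lambda}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* C^\Lambda_{w;v_\Lambda}\big) = q^{(\lambda,\Lambda)-(\nu,\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda} C^\lambda_{v_\nu;v_\lambda})^* C^\lambda_{v_\nu;v_\lambda} C^\Lambda_{w;v_\Lambda}\big)$$
$$= q^{2(\lambda,\Lambda)-2(\nu,\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* (C^\lambda_{v_\nu;v_\lambda})^* C^\lambda_{v_\nu;v_\lambda} C^\Lambda_{w;v_\Lambda}\big)$$
$$= q^{(\nu,\mu')+(\lambda,\Lambda)-2(\nu,\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* (C^\lambda_{v_\nu;v_\lambda})^* C^\Lambda_{w;v_\Lambda} C^\lambda_{v_\nu;v_\lambda}\big)$$
$$= q^{2(\nu,\mu'-\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* C^\Lambda_{w;v_\Lambda}\big)\, L_{\nu;\lambda},$$
where we used Proposition 3.2(ii) for the first and fourth equality, Proposition 3.2(i) for the second equality, and Corollary 3.3 for the third equality. □

It follows from Lemma 6.3 that $\mathrm{Ker}(L_{\nu;\lambda}) \subsetneq H$ is a closed invariant subspace. By the irreducibility of $\tau$, we thus obtain the following corollary.

Corollary 6.4. Let $\lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. Then $L_{\nu;\lambda}$ is injective.

The minimal coset representative $\sigma$ of Lemma 6.2(ii) is unique and independent of $\lambda \in P_+(S^c)$ in the following sense.

Lemma 6.5. There exists a unique $\sigma \in W^S$ such that $D_m(\lambda) = \{\sigma\lambda\}$ for all $\lambda \in P_+(S^c)$.

Proof. Let $\Lambda \in P_{++}(S^c)$ and $\nu \in D_m(\Lambda)$. Then there exists a unique $\sigma \in W^S$ such that $\nu = \sigma\Lambda$ by Lemma 6.2(ii) and by Chevalley’s Lemma (cf. [11, Prop. 2.27]). Fix furthermore arbitrary $\lambda \in P_+(S^c)$ and $\nu' \in D_m(\lambda)$. Choose a $\sigma' \in W$ such that $\nu' = \sigma'\lambda$. By Lemma 6.1 and the commutation relations of Sect. 3, we compute
$$L_{\nu;\Lambda} L_{\nu';\lambda} = q^{(\Lambda,\lambda)-(\nu,\nu')}\, \tau\big((C^\lambda_{\nu';\lambda} C^\Lambda_{\nu;\Lambda})^* C^\Lambda_{\nu;\Lambda} C^\lambda_{\nu';\lambda}\big)$$
$$= q^{3(\Lambda,\lambda)-3(\nu,\nu')}\, \tau\big((C^\lambda_{\nu';\lambda})^* (C^\Lambda_{\nu;\Lambda})^* C^\lambda_{\nu';\lambda} C^\Lambda_{\nu;\Lambda}\big)$$
$$= q^{2(\Lambda,\lambda)-2(\nu,\nu')}\, L_{\nu';\lambda} L_{\nu;\Lambda},$$
where we used Proposition 3.2(ii) in the first and third equality and Proposition 3.2(i) twice in the second equality. If we repeat the same computation, but now using Corollary 3.3 twice in the second equality, then we obtain
$$L_{\nu;\Lambda} L_{\nu';\lambda} = q^{2(\nu,\nu')-2(\Lambda,\lambda)}\, L_{\nu';\lambda} L_{\nu;\Lambda},$$
hence
$$\big(q^{2(\Lambda,\lambda)-2(\nu,\nu')} - q^{2(\nu,\nu')-2(\Lambda,\lambda)}\big)\, L_{\nu';\lambda} L_{\nu;\Lambda} = 0.$$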
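By Corollary 6.4 we have $L_{\nu';\lambda} L_{\nu;\Lambda} \neq 0$, so we conclude that
$$(\Lambda,\lambda) - (\nu,\nu') = (\Lambda, \lambda - \sigma^{-1}\sigma'\lambda) = 0.$$
Since $\Lambda \in P_{++}(S^c)$ and $\lambda \in P_+(S^c)$, it follows from Lemma 5.4 that $\lambda = \sigma^{-1}\sigma'\lambda$, i.e. $\nu' = \sigma\lambda$. Hence, $D_m(\lambda) = \{\sigma\lambda\}$ for all $\lambda \in P_+(S^c)$. □

In the remainder of this section we write $\sigma$ for the unique minimal coset representative such that $D_m(\lambda) = \{\sigma\lambda\}$ for all $\lambda \in P_+(S^c)$. We are going to prove that $\tau \simeq \pi_\sigma$. First we look for the analogue of the distinguished vector $e_0^{\otimes l(\sigma)}$ (cf. Proposition 5.2(ii)) in the representation space $H$ of $\tau$. The spectrum $I(\lambda)$ of $L_{\sigma\lambda;\lambda}$ is contained in $[0,\infty)$, since $L_{\sigma\lambda;\lambda}$ is a positive operator. By considering the spectral decomposition of $L_{\sigma\lambda;\lambda}$, one obtains the following corollary of Lemma 6.3 and [12, Lemma 4.3].

Corollary 6.6. Let $\lambda \in P_+(S^c)$. Then $I(\lambda) \subset [0,\infty)$ is a countable set with no limit points, except possibly 0.

The proof of Corollary 6.6 is similar to the proof of [40, Prop. 3.9] and of [12, Prop. 4.2]. By Corollary 6.6 we have an orthogonal direct sum decomposition
$$H = \bigoplus_{\gamma \in I(\lambda) \cap \mathbb{R}_{>0}} H_\gamma(\lambda) \tag{6.5}$$
into eigenspaces of $L_{\sigma\lambda;\lambda}$, where $H_\gamma(\lambda)$ is the eigenspace of $L_{\sigma\lambda;\lambda}$ corresponding to the eigenvalue $\gamma$. Let $\gamma_0(\lambda) > 0$ be the largest eigenvalue of $L_{\sigma\lambda;\lambda}$.

Lemma 6.7. Let $\lambda \in P_+(S^c)$, $v \in V(\lambda)$, $w \in V(\lambda)_\nu$ and assume that $\nu \neq \sigma\lambda$. Then
$$\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big)\big(H_{\gamma_0(\lambda)}(\lambda)\big) = \{0\}.$$

Proof. Let $\lambda \in P_+(S^c)$, $v \in V(\lambda)_\mu$ and $w \in V(\lambda)_\nu$. By Lemma 6.1 and the commutation relations in Sect. 3, we compute
$$L_{\sigma\lambda;\lambda}\, \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) = \tau\big(C^\lambda_{v_{\sigma\lambda};v_\lambda} (C^\lambda_{v;v_\lambda} C^\lambda_{v_{\sigma\lambda};v_\lambda})^* C^\lambda_{w;v_\lambda}\big)$$
$$= q^{(\lambda,\lambda)-(\mu,\sigma\lambda)}\, \tau\big(C^\lambda_{v_{\sigma\lambda};v_\lambda} (C^\lambda_{v;v_\lambda})^* (C^\lambda_{v_{\sigma\lambda};v_\lambda})^* C^\lambda_{w;v_\lambda}\big)$$
$$= q^{2(\lambda,\lambda)-2(\mu,\sigma\lambda)}\, \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)\, \tau\big((C^\lambda_{v_{\sigma\lambda};v_\lambda})^* C^\lambda_{w;v_\lambda}\big),$$
where we used Proposition 3.2(i) in the second equality and Proposition 3.2(ii) in the first and third equality. This computation, together with the injectivity of $L_{\sigma\lambda;\lambda}$, shows that it suffices to give a proof of the lemma for the special case that $v = v_{\sigma\lambda}$. So we fix $h \in H_{\gamma_0(\lambda)}(\lambda)$ and $w \in V(\lambda)_\nu$ with $\nu \in P(\lambda)$ and $\nu \neq \sigma\lambda$. It follows from Lemma 6.3 that $\tilde h := \tau\big((C^\lambda_{v_{\sigma\lambda};v_\lambda})^* C^\lambda_{w;v_\lambda}\big)h$ is an eigenvector of $L_{\sigma\lambda;\lambda}$ with eigenvalue $\tilde\gamma_0(\lambda) = q^{2(\lambda,\sigma^{-1}(\nu)-\lambda)}\gamma_0(\lambda)$. By Proposition 5.3 we have $\tilde\gamma_0(\lambda) > \gamma_0(\lambda)$, hence $\tilde h = 0$ by the maximality of the eigenvalue $\gamma_0(\lambda)$. □

Corollary 6.8. $\gamma_0(\lambda) = 1$ for all $\lambda \in P_+(S^c)$.

Proof. Follows from (3.10) and Lemma 6.7. □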
The linear subspace of $C_q[U]$ spanned by the matrix elements $\{C^\mu_{\sigma\mu;\mu}\}_{\mu \in P_+}$ is a subalgebra of $C_q[U]$ with algebraic generators $C^{\varpi_i}_{\sigma\varpi_i;\varpi_i}$ ($i \in [1,r]$), since $C^\mu_{\sigma\mu;\mu} C^\nu_{\sigma\nu;\nu} = \lambda_{\mu,\nu}\, C^{\mu+\nu}_{\sigma(\mu+\nu);\mu+\nu}$, where the scalar $\lambda_{\mu,\nu} \in T$ depends on the particular choices of orthonormal bases for the finite-dimensional irreducible representations $V(\mu)$ and $V(\nu)$ (cf. [40, Proof of Prop. 3.12]). Then it follows from Proposition 3.2 and Lemma 6.1 that
$$L_{\sigma(\mu+\nu);\mu+\nu} = L_{\sigma\mu;\mu} L_{\sigma\nu;\nu} \tag{6.6}$$
for all $\mu, \nu \in P_+(S^c)$, hence $\mathrm{span}\{L_{\sigma\lambda;\lambda}\}_{\lambda \in P_+(S^c)}$ is a commutative subalgebra of $B(H)$. Set
$$H_1 := \bigcap_{i \in S^c} H_1(\varpi_i), \tag{6.7}$$
then $H_1 \subset H_1(\lambda)$ for all $\lambda \in P_+(S^c)$ by (6.6).

Lemma 6.9. $H_1 = H_1(\lambda)$ for all $\lambda \in P_{++}(S^c)$. In particular, $H_1 \neq \{0\}$.

Proof. For $\mu \in P_+(S^c)$ we have $\|L_{\sigma\mu;\mu}\| = 1$. Moreover, for any $h \in H$,
$$h \in H_1(\mu) \iff \|L_{\sigma\mu;\mu} h\| = \|h\|. \tag{6.8}$$
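This follows from the eigenspace decomposition (6.5) for $L_{\sigma\mu;\mu}$ and the fact that 1 is the largest eigenvalue of $L_{\sigma\mu;\mu}$. Let $\lambda \in P_{++}(S^c)$ and choose arbitrary $i \in S^c$. Then $\lambda = \mu + \varpi_i$ for certain $\mu \in P_+(S^c)$. By (6.6), we obtain for $h \in H_1(\lambda)$,
$$\|h\| = \|L_{\sigma\lambda;\lambda} h\| = \|L_{\sigma\mu;\mu} L_{\sigma\varpi_i;\varpi_i} h\| \le \|L_{\sigma\varpi_i;\varpi_i} h\| \le \|h\|,$$
hence we have equality everywhere. By (6.8), it follows that $h \in H_1(\varpi_i)$. Since $i \in S^c$ was arbitrary, we conclude that $h \in H_1$. □

Lemma 6.10. Let $\lambda \in P_+(S^c)$. For all $v \in V(\lambda)_\mu$ with $\mu \neq \sigma\lambda$ we have
$$\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)(H_1) \subset H_1^\perp.$$

Proof. Let $\Lambda \in P_{++}(S^c)$, $\lambda \in P_+(S^c)$, and $v \in V(\lambda)_\mu$ with $\mu \neq \sigma\lambda$ and $\mu \in P(\lambda)$. Then
$$L_{\sigma\Lambda;\Lambda}\, \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big) = q^{2(\Lambda,\lambda-\sigma^{-1}(\mu))}\, \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)\, L_{\sigma\Lambda;\Lambda}$$
by Lemma 6.3. By Lemma 5.4 we have $(\Lambda, \lambda - \sigma^{-1}(\mu)) > 0$. Hence,
$$\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)(H_1) = \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)(H_1(\Lambda)) \subset \bigoplus_{\gamma < 1} H_\gamma(\Lambda) = H_1(\Lambda)^\perp = H_1^\perp,$$

[…]

$> 0$ such that for any $h < h_0$, there is an injective map $\tilde f(h)$ sending elements of $\Sigma(h)$ exactly into $h\mathbb{Z}^n$ and such that $\tilde f(h) - f(h) = O(h^\infty)$. Because $f(h)$ is of order zero, there is a fixed open ball $\tilde B' \subset f(h; U)$ such that $\tilde B' \cap (h\mathbb{Z}^n)$ is contained in $\tilde f(h; \Sigma(h))$. Then, one can find a smaller ball $\tilde B \subset \tilde B'$ such that for any two points $\tilde P, \tilde Q$ in $\tilde B \cap (h\mathbb{Z}^n)$, the translation by the vector $\overrightarrow{\tilde P \tilde Q}$ takes any point of $\tilde B \cap (h\mathbb{Z}^n)$ into $\tilde B' \cap (h\mathbb{Z}^n)$ (Fig. 4). Let us denote by $B$ an open ball in $\mathbb{R}^n$ such that $f(h; B) \subset \tilde B$.

Fig. 4. Parallel translation

Pulling back by $\tilde f(h)$, one thus defines the “parallel transport” $\tau_{\overrightarrow{PQ}}(A)$ of a point $A \in \Sigma(h) \cap B$ along the direction given by two points $P$ and $Q$ in $\Sigma(h) \cap B$. When the composition is defined, we have
$$\tau_{\overrightarrow{QR}} \circ \tau_{\overrightarrow{PQ}} = \tau_{\overrightarrow{PR}}. \tag{4}$$
Moreover, because translation in $\mathbb{Z}^n$ is an isometry, there exists a constant $C > 0$, independent of $h$, such that for any $A \in \Sigma(h) \cap B$
$$\big\|\overrightarrow{Q\,\tau_{\overrightarrow{PQ}}(A)}\big\| < C\, \big\|\overrightarrow{PA}\big\|. \tag{5}$$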
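The composition rule (4) can be checked on the model lattice itself. The following is a minimal sketch, assuming we work directly in integer coordinates on $\mathbb{Z}^n$ (that is, after applying the chart and dividing by $h$); the function name `transport` is hypothetical and not taken from the text.

```python
def transport(p, q):
    """Return the lattice translation A -> A + (Q - P) in integer coordinates."""
    shift = tuple(qi - pi for pi, qi in zip(p, q))

    def tau(a):
        # Translate the point a by the vector from P to Q.
        return tuple(ai + si for ai, si in zip(a, shift))

    return tau


# Three base points and one transported point (arbitrary illustrative values).
P, Q, R, A = (0, 0), (2, 1), (5, -3), (1, 1)

# Composition rule (4): transporting along PQ and then QR equals transporting along PR.
via_q = transport(Q, R)(transport(P, Q)(A))
direct = transport(P, R)(A)
```

In this exact model the vector from $Q$ to the transported point equals the vector from $P$ to $A$, which is the isometry property behind estimate (5).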
Because of Proposition 1, any other choice of affine chart $f(h)$ gives the same parallel transport.

2. Now, let $(\Sigma(h), U)$ be a general asymptotic affine lattice. If $\gamma$ is any path in $U$, one can cover its image by open balls $B_i$ on which parallel transport is well defined for $h$ less than some $h_i > 0$. If $U$ is compact, as we shall always assume, this can be done with a finite number of such balls $B_1, \dots, B_\ell$, ordered in a way that for each $1 \le i < \ell$, $B_i \cap B_{i+1} \neq \emptyset$. In the following, take $h$ to be less than $\min_i h_i$. Let $P \in \Sigma(h) \cap B_0$ and $Q \in \Sigma(h) \cap B_\ell$. For each $i = 1, \dots, \ell - 1$, pick up a point $P_i \in \Sigma(h) \cap (B_i \cap B_{i+1})$. For $h$ small enough, this set is not empty. Because of the estimate (5), the mapping
$$\tau_{\gamma,P,Q} \overset{\mathrm{def}}{=} \tau_{\overrightarrow{P_{\ell-1}Q}} \circ \cdots \circ \tau_{\overrightarrow{P_1P_2}} \circ \tau_{\overrightarrow{PP_1}}$$
is well-defined when restricted to a sufficiently small ball $B_0$ around $P$ (here again, $\Sigma(h) \cap B_0$ won’t be empty if $h$ is small enough). Equation (4) shows that this map does not depend on the choice of the intermediate points $P_i$. Therefore it depends only on $P$, $Q$, and on the homotopy class of $\gamma$ (as a path from a point in $B_1$ to a point in $B_\ell$). If $Q = P$, and $\gamma$ is a loop ($B_\ell \cap B_1 \neq \emptyset$ and $B_0 \subset B_1$) then $\tau_{\gamma,P,P}$ is a map from $\Sigma(h) \cap B_0$ to $\Sigma(h) \cap B_1$ leaving $P$ invariant. If $f(h)$ is an affine chart for $\Sigma(h)$ on $B_1$, then $\tilde f(h) \circ \tau_{\gamma,P,P} \circ \tilde f(h)^{-1}$ is a locally defined map $\tilde\tau_{\gamma,f(h),P}$ from $h\mathbb{Z}^n$ to itself leaving $\tilde f(h; P)$ invariant. We know from Sect. 2 (formula (1)) that the choice of such an affine chart allows the quantum monodromy map $\mu_f$ to take its values in $GA(n,\mathbb{Z})$. Remember that $L$ denotes the natural homomorphism from $GA(n,\mathbb{R})$ to $GL(n,\mathbb{R})$.

Proposition 4. The map $\tilde\tau_{\gamma,f(h),P}$ is equal to the linearisation at $\tilde P = \tilde f(h; P)$ of the quantum monodromy along $\gamma$:
$$\overrightarrow{\tilde P\, \tilde\tau_{\gamma,f(h),P}(\tilde R)} = L(\mu_f(\gamma))\, \overrightarrow{\tilde P \tilde R}, \qquad \forall \tilde R \in h\mathbb{Z}^n,$$
whenever the left-hand side of the above is defined.

Proof. If we choose affine charts $f_i(h)$ for $\Sigma(h)$ on each of the $B_i$’s with $f_1 = f$, and let $A_{i,i+1}$ be the transition elements of the monodromy cocycle $f_i(h)/h = A_{i,i+1}(f_{i+1}(h)/h) + O(h^\infty)$ (convention $\ell + 1 \equiv 1$), then it is easy to check that
$$\overrightarrow{\tilde P\, \tilde\tau_{\gamma,f(h),P}(\tilde R)} = L(A_{1,\ell}) \cdots L(A_{3,2}) L(A_{2,1}) \cdot \overrightarrow{\tilde P \tilde R},$$
whenever the composition is defined. Using (2) finishes the proof.
As an application, one can easily “read off” from the spectrum of the quantum Champagne bottle (Fig. 5) that the linear part of the quantum monodromy is conjugate to the matrix $\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}$.
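To make the action of this linear part concrete, here is a small sketch that applies the representative matrix itself to integer lattice vectors. It assumes we take the matrix above as given (rather than a conjugate of it); the helper names `mat_mul`, `mat_pow` and `apply` are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square integer matrix to a non-negative integer power."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result


def apply(m, v):
    """Apply a matrix to a lattice vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m)))


M = [[1, -1], [0, 1]]       # linear part read off from the spectrum
M_INV = [[1, 1], [0, 1]]    # its inverse, also an integer matrix

horizontal = apply(M, (1, 0))   # horizontal lattice vectors are left fixed
vertical = apply(M, (0, 1))     # vertical vectors are sheared by one step
three_turns = mat_pow(M, 3)     # going around the loop three times
```

The shear never returns to the identity under iteration, which is one way to see that no single global lattice chart can exist around the critical value.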
Fig. 5. Spectrum of the Champagne bottle (axes $E_1$ and $E_2 = hn$). The gray disc encloses the focus-focus critical value. $R' = \tau_{\gamma,P,P}(R)$
5.3. Unwinding the spectrum. We keep here the notation of the previous paragraph. In particular, $\Sigma(h)$ is any asymptotic affine lattice on $U$, $\gamma$ is a path in $U$ whose image is covered by balls $B_i$ on which local parallel translation is defined. We choose points $P \in B_1 \cap \Sigma(h)$, $Q \in B_\ell \cap \Sigma(h)$ and $P_1, P_2, \dots, P_{\ell-1}, P_\ell = Q$ such that for $i = 1, \dots, \ell - 1$, $P_i \in B_i \cap B_{i+1} \cap \Sigma(h)$.

Given an affine chart $f(h)$ on $B_1$, for $h$ small there is a unique $k_1 \in \mathbb{Z}^n$ such that the map $\tilde f(h) \circ \tau_{\overrightarrow{PP_1}} \circ \tilde f(h)^{-1}$ is just translation by $hk_1$. If $B_1, \dots, B_\ell$ are endowed with affine charts $f_1(h) = f(h), f_2(h), \dots, f_\ell(h)$, in the same way we define $k_i \in \mathbb{Z}^n$ such that
$$\tilde f_i(h) \circ \tau_{\overrightarrow{P_{i-1}P_i}} \circ \tilde f_i(h)^{-1}$$
is translation by the vector $hk_i$. We unwind the points $P, P_1, \dots, P_\ell$ onto $h\mathbb{Z}^n$ using the following procedure (see Fig. 6):
– $\tilde P = \tilde f(h; P)$;
– $\tilde P_1 = \tilde P + hk_1 = \tilde f(h; P_1)$;
– $\tilde P_2 = \tilde P_1 + hL(A_{2,1}) \cdot k_2$;
– ...
– $\tilde Q = \tilde P_\ell = \tilde P_{\ell-1} + hL(A_{\ell,\ell-1}) \cdots L(A_{2,1}) \cdot k_\ell$.
Then one easily checks that $\tilde P_i = hA_{1,2} \circ A_{2,3} \circ \cdots \circ A_{i-1,i}(\tilde f_i(h; P_i)/h)$. In particular, applying this procedure to a loop $\gamma$ ($P = Q$) proves the following:

Proposition 5. For $h$ small enough, the quantum monodromy $\mu_f$ gives the end point $\tilde Q$ of the unwinding of any loop $\gamma$ on $U$ through a point $P \in \Sigma(h)$ around which we are given an affine chart $f(h)$ by the following formula:
$$\tilde Q = h(\mu_f(\gamma))^{-1}(\tilde f(h; P)/h).$$
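The unwinding procedure above can be sketched as a short loop over the charts. This is only an illustration under stated assumptions: points are given in integer lattice coordinates (units of $h$), `shifts[i]` plays the role of $k_{i+1}$, and `transitions[i]` plays the role of the linear part $L(A_{i+2,i+1})$; the function name `unwind` and the sample data are hypothetical and not taken from the text.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(m, v):
    """Apply a matrix to a lattice vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m)))


def unwind(p_tilde, shifts, transitions):
    """Return the unwound points, starting with p_tilde, in units of h.

    Step i adds L(A_{i,i-1}) ... L(A_{2,1}) k_i to the previous point,
    with an empty product (the identity) for the first step.
    """
    n = len(p_tilde)
    cumulative = [[int(i == j) for j in range(n)] for i in range(n)]
    points = [tuple(p_tilde)]
    for step, k in enumerate(shifts):
        if step > 0:
            # Prepend the next transition: L(A_{i,i-1}) * (previous product).
            cumulative = mat_mul(transitions[step - 1], cumulative)
        moved = mat_vec(cumulative, k)
        points.append(tuple(p + d for p, d in zip(points[-1], moved)))
    return points


# Illustrative data: three steps, with one non-trivial transition.
path = unwind(
    (0, 4),
    shifts=[(1, 0), (0, 1), (1, 1)],
    transitions=[[[1, -1], [0, 1]], [[1, 0], [0, 1]]],
)
```

The last entry of `path` is the unwound end point; for a closed loop it would be compared with the starting point to read off the monodromy.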
Fig. 6. Unwinding of the points $P_i$ (axes $E_1$ and $E_2 = hn$; chart $f(h)$). We deduce that $y_{\tilde P} = 4$, which allows us to locate the horizontal line through the origin $0 \in h\mathbb{Z}^2$ (the dotted one)
Remark 4. There is a unique symbol $g(h)$ defined on the universal cover $\tilde U$ of $U$ that is an affine chart for $\Sigma(h)$ and that coincides with $f(h)$ above $B_0$. Then $Q$ can be seen as the lift $\gamma.P \in \tilde U$. The point is now that
$$g(h; Q) = \tilde Q + O(h^\infty).$$
For any $P \in \tilde U$, and for any $\gamma \in \pi_1(U)$, there is a unique $\nu_P(\gamma) \in GA(n,\mathbb{Z})$ such that
$$g(h; \gamma.P)/h = \nu_P(\gamma)(g(h; P)/h) + O(h^\infty).$$
By definition, we have $\nu_P(\gamma\gamma') = \nu_{\gamma.P}(\gamma')\nu_P(\gamma)$. But one can show that for any loop $\gamma$ such that $\gamma.P = Q$, then $\nu_Q(\gamma') = \nu_P(\gamma)\nu_P(\gamma')\nu_P(\gamma)^{-1}$. Therefore, $\nu_P$ is actually a homomorphism. Proposition 5 just says that $\nu_P = \mu_f^{-1}$.

Applying this proposition together with Theorem 2 to a focus-focus singularity, we see that if the principal part of $f(h)$ is given by the action integrals $\frac{1}{2\pi}\int_{\gamma_1}\alpha$ and $\frac{1}{2\pi}\int_{\gamma_2}\alpha$ then, for a small loop $\delta$ enclosing the critical value $o$,
$$\nu(\delta) = \iota\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$
In particular, the whole horizontal line through the origin consists of fixed points. Of course, locating the origin on a diagram like Fig. 6 may require the computation of the action at one point. However, given $\tilde P$ and its image $\tilde Q$, it is easy to find the horizontal line through the origin, for $y_{\tilde P} = x_{\tilde Q} - x_{\tilde P}$.
Acknowledgements. One of the reasons for having written this article is the enthusiasm of R. Cushman for the subject; I would like to thank him for this. I would also like to thank my adviser Y. Colin de Verdière, and J. J. Duistermaat, for stimulating discussions. My research is supported by a Marie Curie Fellowship Nr. ERBFMBICT961572.
References
1. Bates, L.M.: Monodromy in the Champagne bottle. Z. Angew. Math. Phys. 6, 837–847 (1991)
2. Berger, M.: Géométrie. Vol. 1. Paris: Cedic/Nathan, 1977
3. Charbonnel, A.-M.: Comportement semi-classique du spectre conjoint d’opérateurs pseudo-différentiels qui commutent. Asymptotic Analysis 1, 227–261 (1988)
4. Child, M.S.: Quantum states in a Champagne bottle. J. Phys. A. 31, 657–670 (1998)
5. Child, M.S., Weston, T., and Tennyson, J.: Quantum monodromy in the spectrum of H2O and other systems: New insight into the level structure of quasi-linear molecules. To appear
6. Colin de Verdière, Y.: Spectre conjoint d’opérateurs pseudo-différentiels qui commutent II. Math. Z. 171, 51–73 (1980)
7. Cushman, R. and Duistermaat, J.J.: The quantum spherical pendulum. Bull. Am. Math. Soc. (N.S.) 19, 475–479 (1988)
8. Cushman, R. and Duistermaat, J.J.: Non-hamiltonian monodromy. Preprint University of Utrecht, 1997
9. Duistermaat, J.J.: On global action-angle variables. Comm. Pure Appl. Math. 33, 687–706 (1980)
10. Eliasson, L.H.: Hamiltonian systems with Poisson commuting integrals. Ph.D. thesis, University of Stockholm, 1984
11. Guillemin, V. and Uribe, A.: Monodromy in the quantum spherical pendulum. Commun. Math. Phys. 122, 563–574 (1989)
12. Hirzebruch, F.: Topological methods in algebraic geometry. Grundlehren der math. W., Vol. 131. New York: Springer, 1966
13. Nguyên Tiên, Z.: A topological classification of integrable hamiltonian systems. Séminaire Gaston Darboux de géometrie et topologie différentielle (Brouzet, R., ed.) Université Montpellier II, 1994–1995, pp. 43–54
14. Vũ Ngọc, S.: Bohr-Sommerfeld conditions for integrable systems with critical manifolds of focus-focus type. Preprint Institut Fourier 433, 1998
15. Zou, M.: Monodromy in two degrees of freedom integrable systems. J. Geom. Phys. 10, 37–45 (1992)

Communicated by H. Araki
Commun. Math. Phys. 203, 481 – 498 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Ergodic Actions of Universal Quantum Groups on Operator Algebras
Shuzhou Wang
Department of Mathematics, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]
Received: 21 April 1998 / Accepted: 14 December 1998
Abstract: We construct ergodic actions of compact quantum groups on $C^*$-algebras and von Neumann algebras, and exhibit phenomena of such actions that are of different nature from ergodic actions of compact groups. In particular, we construct: (1) an ergodic action of the compact quantum group $A_u(Q)$ on the type $\mathrm{III}_\lambda$ Powers factor $R_\lambda$ for an appropriate positive $Q \in GL(2,\mathbb{R})$; (2) an ergodic action of the compact quantum group $A_u(n)$ on the hyperfinite $\mathrm{II}_1$ factor $R$; (3) an ergodic action of the compact quantum group $A_u(Q)$ on the Cuntz algebra $O_n$ for each positive matrix $Q \in GL(n,\mathbb{C})$; (4) ergodic actions of compact quantum groups on their homogeneous spaces, as well as an example of a non-homogeneous classical space that admits an ergodic action of a compact quantum group.

1. Introduction

It is well known that compact groups admit no ergodic actions on operator algebras other than the finite ones (i.e. those with finite traces) [15]. Therefore, there arose the following basic problem (cf. p. 76 of [15]): Construct an ergodic action of a semisimple compact Lie group on the Murray–von Neumann $\mathrm{II}_1$ factor $R$. Later, Wassermann developed some general theory of ergodic actions of compact groups on operator algebras and showed that $SU(2)$ cannot act ergodically on $R$ [33,34], leaving experts in doubt whether semisimple compact Lie groups admit ergodic actions on $R$ at all. In [5], Boca studied the general theory of ergodic actions of compact quantum groups [37] on $C^*$-algebras and generalized some basic results on ergodic actions of compact groups to compact quantum groups. But so far there is still a lack of non-trivial examples of ergodic actions of compact quantum groups on operator algebras.

The purpose of the present paper is two-fold, which is in some sense opposite to that of Boca [5]. First, we show that some new phenomena can occur for ergodic actions of quantum groups.
Second, we supply some general methods to construct ergodic actions of compact quantum groups on operator algebras and give several non-trivial examples of
such actions. We show that the universal compact matrix quantum groups $A_u(Q)$ of [27,28] admit ergodic actions on both the (infinite) injective factors of type III (for $Q \neq cI_n$, $c \in \mathbb{C}^*$) and the (infinite) Cuntz algebras (for $Q > 0$). We construct an ergodic action of the universal compact matrix quantum group of Kac type $A_u(n)$ on the hyperfinite factor $R$, which may not admit ergodic actions of any semisimple compact Lie group [34]. We also study ergodic actions of compact quantum groups on their homogeneous spaces and show that there are non-homogeneous classical spaces that admit ergodic actions of quantum groups. These results show that compact quantum groups have a much richer theory of ergodic actions on operator algebras than compact (Lie) groups.

Unlike Boca [5], we study actions of compact quantum groups on both $C^*$-algebras and von Neumann algebras, not just $C^*$-algebras. Our construction of ergodic actions of compact quantum groups on von Neumann algebras comes from their “measure preserving” actions on $C^*$-algebras, just as in the classical situation (see Theorem 2.5). One of our constructions of ergodic actions (see Sect. 3) uses tensor products of irreducible representations of compact quantum groups. This method was first used by Wassermann [35] in the setting of Lie groups (instead of quantum groups) to construct subfactors from their “product type actions”. At the other extreme, actions of quantum groups with large fixed point algebras (i.e. prime actions) have been studied by many authors, see, e.g. [9,7]. Generalizing the canonical action of compact Lie groups on the Cuntz algebras [8] introduced by Doplicher-Roberts [12,10], Konishi et al. [19] study (the non-ergodic) action of $SU_q(2)$ on the Cuntz algebra $O_2$ and its CAR subalgebra and show that their fixed point algebras coincide (see also [20]). This result is extended to $SU_q(n)$ by Paolucci [22].
In [21], this action of the quantum group $SU_q(n)$ is induced to a (non-ergodic) action on the Powers factor $R_\lambda$ by a rather complicated method; the induced action follows from our result Theorem 2.5 in a much simpler and more conceptual manner.

The contents of this paper are as follows. In Sect. 2, we give a general method of construction of quantum group actions on von Neumann algebras from their “measure preserving” actions on $C^*$-algebras. Using this and a result of Banica [3] on the tensor products of the fundamental representation of $A_u(Q)$, we construct in Sect. 3 an ergodic action of the universal quantum group $A_u(Q)$ on the Powers factor $R_\lambda$ of type $\mathrm{III}_\lambda$ and an ergodic action of $A_u(n)$ on the hyperfinite $\mathrm{II}_1$ factor $R$. In Sect. 4, using results of Banica [2], we show that the fixed point subalgebra of $R$ under the quantum subgroup $A_o(n)$ of $A_u(n)$ is also a factor and that the action of $A_o(n)$ on $R$ is prime. In Sect. 5, we construct ergodic actions of $A_u(Q)$ on the Cuntz algebras and on the injective factor $R_\infty$ of type $\mathrm{III}_1$ as well as the other factors of type III. It is also shown that the (unimodular) compact quantum group $A_u(n)$ of Kac type acts ergodically on the injective factor of type $\mathrm{III}_{1/n}$, a fact rather surprising to us. In the last section, Sect. 6, we study ergodic actions of compact quantum groups on their “quotient spaces”, and show that the quantum automorphism group $A_{aut}(X_4)$ acts ergodically on the classical space $X_4$ with 4 points, but $X_4$ is not isomorphic to a quotient space.

We point out that instead of using the fundamental representation of $A_u(Q)$, we can also use representations of free products of compact quantum groups [28] in the examples in Sect. 3 and Sect. 5 for the constructions of ergodic actions.

2. Lifting Actions on $C^*$-Algebras to von Neumann Algebras

In this section, we describe (Theorem 2.5) how to construct ergodic actions of compact quantum groups on von Neumann algebras from “measure preserving” actions on noncommutative topological spaces (i.e. $C^*$-algebras).
To fix notation, we first recall some basic notions concerning actions of quantum groups on operator algebras ([1,5,23,32]). For convenience in this paper, we will use the definition given in [32] for the notion of actions of compact quantum groups on $C^*$-algebras. As in [32], Woronowicz Hopf $C^*$-algebras are assumed to be full in order to define morphisms. We adopt the following convention (see [28,27,32]): when $A = C(G)$ is a Woronowicz Hopf $C^*$-algebra, we also say that $A$ is a compact quantum group, referring to the dual object $G$.

Definition 2.1 (cf. [32]). A (left) action of a compact quantum group $A$ on a $C^*$-algebra $B$ is a unital *-homomorphism $\alpha$ from $B$ to $B \otimes A$ such that
(1) $(\mathrm{id}_B \otimes \Phi)\alpha = (\alpha \otimes \mathrm{id}_A)\alpha$, where $\Phi$ is the coproduct on $A$,
(2) $(\mathrm{id}_B \otimes \varepsilon)\alpha = \mathrm{id}_B$, where $\varepsilon$ is the counit on $A$,
(3) there is a dense *-subalgebra $\mathcal{B}$ of $B$, such that $\alpha(\mathcal{B}) \subseteq \mathcal{B} \otimes \mathcal{A}$, where $\mathcal{A}$ is the canonical dense *-subalgebra of $A$.

Remarks. (1) The definition above is equivalent to the one in Podles [23]. As in [23], we do not impose the condition that $\alpha$ is injective, which is required in [1,5], though the examples constructed in this paper satisfy this condition. We conjecture that this condition is a consequence of the other conditions in the definition. A special case of this conjecture says that the coproduct of a Woronowicz Hopf $C^*$-algebra is injective, which is true for both the full Woronowicz Hopf $C^*$-algebras (because of the counital property) and the reduced ones (because of Baaj-Skandalis [1]). Even if this conjecture is false, one can still obtain an injective $\tilde\alpha$ from $\alpha$ by passing to the quotient of $B$ by the kernel of $\alpha$. We leave the verification of the latter as an exercise for the reader.
(2) The above notion of left action of quantum group $G$ would be called right coaction of the Woronowicz Hopf $C^*$-algebra $C(G)$ by some other authors. But we prefer the more geometric term “action of quantum group”.
We can similarly define a right action of a quantum group $G$, which would be called a left coaction of the Woronowicz Hopf $C^*$-algebra $C(G)$ by some other specialists.

Definition 2.2. Let $\alpha$ be an action of a compact quantum group $A$ on $B$. An element $b$ of $B$ is said to be fixed under $\alpha$ (or invariant under $\alpha$) if
$$\alpha(b) = b \otimes 1_A. \tag{2.1}$$
The fixed point algebra $B^\alpha$ (or $B^A$ if no confusion arises) of the action $\alpha$ is
$$B^\alpha = \{b \in B \mid \alpha(b) = b \otimes 1_A\}. \tag{2.2}$$
The action of $A$ is said to be ergodic if $B^\alpha = \mathbb{C}I$. A continuous functional $\phi$ on $B$ is said to be invariant under $\alpha$ if
$$(\phi \otimes \mathrm{id}_A)\alpha(b) = \phi(b)1_A. \tag{2.3}$$
Fix an action $\alpha$ of a compact quantum group $A$ on $B$. Let $h$ be the Haar state on $A$ [37,28,26]. Then we have

Proposition 2.3. (1) The map $E = (1 \otimes h)\alpha$ is a projection of norm one from $B$ onto $B^\alpha$.
(2) Let
$$\mathcal{B}^\alpha = \{b \in \mathcal{B} \mid \alpha(b) = b \otimes 1_A\}. \tag{2.4}$$
Then $\mathcal{B}^\alpha$ is norm dense in $B^\alpha$. Hence the action $\alpha$ is ergodic if and only if it is so when restricted to the dense *-subalgebra $\mathcal{B}$ of $B$.

Proof. (1) This is an easy consequence of the following form of the invariance of the Haar state (cf. [37]): $(\mathrm{id}_A \otimes h)\Phi(a) = h(a)1_A$, $a \in A$.
(2) If $b \in B^\alpha$, then $b$ can be approximated in norm by a sequence of elements $b_l \in \mathcal{B}$. Let $\bar b_l$ be the average of $b_l$: $\bar b_l = (1_B \otimes h)\alpha(b_l)$. Then from part (1) of the proposition, $\bar b_l \in B^\alpha$. From condition (3) of Definition 2.1, we see that $\bar b_l \in \mathcal{B}^\alpha$. Moreover,
$$\|\bar b_l - b\| = \|(1 \otimes h)\alpha(b_l - b)\| \le \|(1 \otimes h)\alpha\| \, \|b_l - b\| \to 0.$$
The rest is clear. □

Preserve the notation above. Let $\mathfrak{A}$ be the von Neumann algebra generated by the GNS representation $\pi_h$ of $A$ for the state $h$. Then $\mathfrak{A}$ is a Hopf von Neumann algebra. For later use, we need to adapt the definitions above to the situation of von Neumann algebras.

Definition 2.4. A right coaction of a Hopf von Neumann algebra $\mathfrak{A}$ on a von Neumann algebra $\mathfrak{B}$ is a normal homomorphism $\alpha$ from $\mathfrak{B}$ to $\mathfrak{B} \otimes \mathfrak{A}$ such that
(1) $(\mathrm{id}_{\mathfrak{B}} \otimes \Phi)\alpha = (\alpha \otimes \mathrm{id}_{\mathfrak{A}})\alpha$, where $\Phi$ is the coproduct on $\mathfrak{A}$;
(2) $\alpha(\mathfrak{B})(1 \otimes \mathfrak{A})$ generates the von Neumann algebra $\mathfrak{B} \otimes \mathfrak{A}$.

The main reason why we use the term “coaction of Hopf von Neumann algebra” is that von Neumann algebras are measure-theoretic objects instead of geometric-topological objects (cf. Remark (2) after Definition 2.1). Condition (2) in the above definition is an analog of the density condition as used in the Hopf $C^*$-algebra setting [1,23]. It is well known that there is no analog of the counit in the Hopf von Neumann algebra situation, simply because a von Neumann algebra corresponds to a measure space in the commutative case (the simplest case), and functions are defined only up to sets of measure zero. Hence we do not have an analog of condition (2) of Definition 2.1 for Hopf von Neumann algebra coactions.
If $\mathfrak{A}$ comes from the GNS-representation of the Haar state on a compact quantum group $A$ and $\mathfrak{A}$ coacts on the right on some von Neumann algebra $\mathfrak{B}$, we will abuse the terminology by saying that the quantum group $A$ acts on $\mathfrak{B}$. Other notions such as invariant elements (or functionals), fixed point algebra and ergodic actions in the $C^*$-case above can also be carried over to the von Neumann algebra situation. The main result of this section is the following

Theorem 2.5. Let $B$ be a $C^*$-algebra endowed with an action $\alpha$ of a compact quantum group $A$. Let $\tau$ be an $\alpha$-invariant state on $B$. Then
(1) $\alpha$ lifts to a coaction $\tilde\alpha$ of the Hopf von Neumann algebra $\mathfrak{A} = \pi_h(A)''$ on the von Neumann algebra $\mathfrak{B} = \pi_\tau(B)''$ defined by
$$\tilde\alpha(\pi_\tau(b)) = (\pi_\tau \otimes \pi_h)\alpha(b), \quad b \in B, \tag{2.5}$$
where $\pi_h$ and $\pi_\tau$ are respectively the GNS representations associated with the Haar state $h$ on $A$ and the state $\tau$ on $B$.
(2) If $\alpha$ is ergodic, then so is $\tilde\alpha$.

Proof. (1) We will only show that the natural map $\tilde\alpha$ given on the dense subalgebra $\pi_\tau(B)$ by
$$\tilde\alpha(\pi_\tau(b)) = (\pi_\tau \otimes \pi_h)\alpha(b), \quad b \in B,$$
is well defined and extends to a normal morphism from $\mathfrak{B}$ to $\mathfrak{B} \otimes \mathfrak{A}$. Let $b \in B$ and $a \in A$. Denote by $\tilde b$ and $\tilde a$ respectively the corresponding elements of the Hilbert spaces $H = L^2(B, \tau)$ and $K = L^2(A, h)$. Define an operator $U$ on $H \otimes K$ by
$$U(\tilde b \otimes \tilde a) = (\pi_\tau \otimes \pi_h)\alpha(b)(\tilde 1_B \otimes \tilde a). \tag{2.6}$$
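Then since $\tau$ is $\alpha$-invariant, we have
$$\langle U(\tilde b \otimes \tilde a), U(\tilde b \otimes \tilde a) \rangle = (\tau \otimes h)\bigl((1_B \otimes a^*)\alpha(b^*b)(1_B \otimes a)\bigr) = aha^*\bigl((\tau \otimes \mathrm{id}_A)\alpha(b^*b)\bigr) = aha^*(\tau(b^*b)1_A) = \langle \tilde b \otimes \tilde a, \tilde b \otimes \tilde a \rangle,$$
where $aha^*$ is the functional on $A$ defined by
$$aha^*(x) = h(a^*xa), \quad x \in A.$$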
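Hence $U$ is an isometry. Since $\alpha(B)(1 \otimes A)$ is dense in $B \otimes A$, $U$ is a unitary operator. We also have
$$(\pi_\tau \otimes \pi_h)\alpha(b)U(\tilde b' \otimes \tilde a') = (\pi_\tau \otimes \pi_h)\alpha(b)(\pi_\tau \otimes \pi_h)\alpha(b')(\tilde 1_B \otimes \tilde a') = (\pi_\tau \otimes \pi_h)\alpha(bb')(\tilde 1_B \otimes \tilde a') = U(\pi_\tau(b)\tilde b' \otimes \tilde a') = U(\pi_\tau(b) \otimes 1)(\tilde b' \otimes \tilde a').$$
That is
$$(\pi_\tau \otimes \pi_h)\alpha(b) = U(\pi_\tau(b) \otimes 1)U^*. \tag{2.7}$$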
Condition (1) of Definition 2.4 follows immediately. Since $\alpha(B)(1 \otimes A)$ is dense in $B \otimes A$ (cf. Remark (1) after Definition 2.1 and Podles [23]), Condition (2) of Definition 2.4 follows.
(2) Assume $\alpha$ is ergodic. Let $z \in \mathfrak{B}$ be a fixed element under $\tilde\alpha$: $\tilde\alpha(z) = z \otimes 1_{\mathfrak{A}}$. Let $b_n \in B$ be a net of elements such that $\pi_\tau(b_n) \to z$ in the weak operator topology. Consider the average of $\pi_\tau(b_n)$ integrated over the quantum group $A$,
$$z_n = (\mathrm{id}_{\mathfrak{B}} \otimes h)\tilde\alpha(\pi_\tau(b_n)),$$
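where we use the same letter $h$ to denote the faithful normal state on $\mathfrak{A}$ determined by the Haar state $h$ on $A$. Then one can verify that $z_n \to z$ in the weak operator topology. Moreover, using
$$(\mathrm{id}_{\mathfrak{A}} \otimes h)\Phi(a) = h(a)1_{\mathfrak{A}}, \quad a \in \mathfrak{A},$$
where we denote the coproduct on $\mathfrak{A}$ by the same symbol as the coproduct $\Phi$ on $A$, we have
$$\tilde\alpha(z_n) = (\mathrm{id}_{\mathfrak{B}} \otimes \mathrm{id}_{\mathfrak{A}} \otimes h)(\tilde\alpha \otimes \mathrm{id}_{\mathfrak{A}})\tilde\alpha(\pi_\tau(b_n)) = (\mathrm{id}_{\mathfrak{B}} \otimes \mathrm{id}_{\mathfrak{A}} \otimes h)(\mathrm{id}_{\mathfrak{B}} \otimes \Phi)\tilde\alpha(\pi_\tau(b_n)) = (\mathrm{id}_{\mathfrak{B}} \otimes (\mathrm{id}_{\mathfrak{A}} \otimes h)\Phi)\tilde\alpha(\pi_\tau(b_n)) = (\mathrm{id}_{\mathfrak{B}} \otimes h)\tilde\alpha(\pi_\tau(b_n)) \otimes 1_{\mathfrak{A}} = z_n \otimes 1_{\mathfrak{A}}.$$
That is, each $z_n$ is fixed under $\tilde\alpha$. From part (1) of the theorem, we see
$$z_n = (\pi_\tau \otimes h)\alpha(b_n) = \pi_\tau(\bar b_n),$$
where $\bar b_n = (1 \otimes h)\alpha(b_n) \in B^\alpha$ is the average of $b_n$. Since $\alpha$ is ergodic, $\bar b_n$ is a scalar. This implies that each $z_n$ is also a scalar. Consequently, the operator $z$, as a limit of the $z_n$'s in the weak operator topology, is a scalar. □

Remarks. Define on the Hilbert $A$-module $H \otimes A$ (conjugate linear in the second variable) an operator $u$ by
$$u(\tilde b \otimes a) = (\pi_\tau \otimes 1)\alpha(b)(\tilde 1_B \otimes a). \tag{2.8}$$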
Then one verifies that $u$ is a unitary representation of the quantum group $A$ (cf. [37,1,28]) and $(\pi_\tau, u)$ satisfies the following covariance condition in the sense of 0.3 of [1]:
$$(\pi_\tau \otimes 1)\alpha(b) = u(\pi_\tau(b) \otimes 1)u^*. \tag{2.9}$$
The operator $U$ defined above is given by $U = (1 \otimes \pi_h)u$. The pair $(\pi_\tau, U)$ along with the relation (2.7) can be called a covariant system in the framework of Hopf von Neumann algebras. Note that part (1) of the above theorem also gives a conceptual proof of Proposition 4.2.(i) of [21], where a rather complicated (and nonconceptual) proof is given.

Notation. Let $v$ be a unitary representation of a quantum group $A$ on some finite dimensional Hilbert space $H_v$ [37,1,28]. Define
$$\mathrm{Ad}_v(b) = v(b \otimes 1)v^*, \quad b \in B(H_v). \tag{2.10}$$
Then using Proposition 3.2 of [37], we see that $\mathrm{Ad}_v$ is an action of $A$ on $B(H_v)$ (see also the remark after the proof of Theorem 4.1 in [32]). It will be called the adjoint action of the quantum group $A$ for the representation $v$. Note that unlike in the case of locally compact groups, for quantum groups we have in general
$$\mathrm{Ad}_{v \otimes_{in} w} \ne \mathrm{Ad}_v \otimes_{in} \mathrm{Ad}_w, \tag{2.11}$$
where $\otimes_{in}$ denotes the interior tensor product of representations [37,29]. For other basic notions on compact quantum groups, we refer the reader to [37,28,29].
3. Ergodic Actions of $A_u(Q)$ on the Powers Factor $R_\lambda$ and the Murray–von Neumann II$_1$ Factor $R$

We construct in this section an ergodic action of the universal quantum group $A_u(Q)$ on the type III$_\lambda$ Powers factor $R_\lambda$ for a proper choice of $Q$ and an ergodic action of $A_u(n)$ on the type II$_1$ Murray–von Neumann factor $R$. These are obtained as consequences of Theorem 3.6 below. Recall [28,27,30] that for every non-singular $n \times n$ complex matrix $Q$ ($n > 1$ in the rest of this paper), the universal compact quantum group $(A_u(Q), u)$ is generated by $u_{ij}$ ($i, j = 1, \cdots, n$) with defining relations (with $u = (u_{ij})$):
$$A_u(Q): \quad u^*u = I_n = uu^*, \qquad u^tQ\bar uQ^{-1} = I_n = Q\bar uQ^{-1}u^t.$$
There is also another related family of quantum groups $A_o(Q)$ [28,27,30,2]:
$$A_o(Q): \quad u^tQuQ^{-1} = I_n = QuQ^{-1}u^t, \qquad \bar u = u; \quad (\text{here } Q > 0).$$
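The defining relations of $A_u(Q)$ can be tested on ordinary scalar matrices, which is the commutative (classical) situation. The following is a minimal sketch under that assumption, not a construction from the text: the helper names, the angles and the value $a = 0.25$ are hypothetical choices. It suggests that diagonal unitaries satisfy the relations for a diagonal $Q$, whereas a rotation matrix does so only when $Q$ is a scalar matrix.

```python
import cmath
import math


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def conjugate(a):
    """Entrywise complex conjugate (the matrix written u-bar in the text)."""
    return [[complex(x).conjugate() for x in row] for row in a]


def is_identity(m, tol=1e-12):
    n = len(m)
    return all(abs(m[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))


def satisfies_au_relations(u, q, q_inv):
    """Check u*u = I = uu* and u^t Q u-bar Q^{-1} = I = Q u-bar Q^{-1} u^t."""
    u_bar = conjugate(u)
    u_t = transpose(u)
    u_star = transpose(u_bar)
    products = [
        matmul(u_star, u),
        matmul(u, u_star),
        matmul(matmul(matmul(u_t, q), u_bar), q_inv),
        matmul(matmul(matmul(q, u_bar), q_inv), u_t),
    ]
    return all(is_identity(p) for p in products)


# Hypothetical test data: Q = diag(a, 1 - a) with a = 0.25.
a = 0.25
Q = [[a, 0.0], [0.0, 1.0 - a]]
Q_inv = [[1.0 / a, 0.0], [0.0, 1.0 / (1.0 - a)]]
I2 = [[1.0, 0.0], [0.0, 1.0]]

torus = [[cmath.exp(0.7j), 0.0], [0.0, cmath.exp(-1.1j)]]
theta = 0.3
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

torus_ok = satisfies_au_relations(torus, Q, Q_inv)
rotation_ok_for_Q = satisfies_au_relations(rotation, Q, Q_inv)
rotation_ok_for_identity = satisfies_au_relations(rotation, I2, I2)
```

For scalar unitaries the second pair of relations reduces to the requirement that $\bar u$ commutes with $Q$, which is why the rotation fails for a non-scalar $Q$.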
Part (1) of the next proposition gives a characterization of $A_u(Q)$ in terms of the functional $\phi_Q$ defined below.

Proposition 3.1. Consider the adjoint action $\mathrm{Ad}_u$ corresponding to the fundamental representation $u$ of the quantum group $(A_u(Q), u)$.
(1) The quantum group $(A_u(Q), u)$ is the largest compact matrix quantum group such that its action $\mathrm{Ad}_u$ on $M_n(\mathbb{C})$ leaves invariant the functional $\phi_Q$ defined by
$$\phi_Q(b) = \mathrm{Tr}(Q^tb), \quad b \in M_n(\mathbb{C}).$$
(2) $\mathrm{Ad}_u$ is an ergodic action if and only if $Q = \lambda E$, where $\lambda$ is a nonzero scalar, $E$ is the positive matrix $(1 \otimes h)u^t\bar u$ (cf. [27]), and $h$ is the Haar measure of $A_u(Q)$.

Proof. (1) It is a straightforward calculation to verify that the action $\mathrm{Ad}_u$ of $A_u(Q)$ leaves the functional $\phi_Q$ invariant. Assume that $(A, v)$ is a compact quantum group such that $\mathrm{Ad}_v$ leaves $\phi_Q$ invariant ($v = (v_{ij})_{i,j=1}^n$). Then the $v_{ij}$'s satisfy the defining relations for $A_u(Q)$. Hence $(A, v)$ is a quantum subgroup of $A_u(Q)$.
(2) A matrix $S$ is fixed by $\mathrm{Ad}_u$ if and only if $S$ intertwines the fundamental representation $u$ with itself. Hence the action $\mathrm{Ad}_u$ is ergodic if and only if the fundamental representation $u$ is irreducible. When $Q = \lambda E$, then $A_u(Q) = A_u(E)$. Since $E$ is positive, $u$ is irreducible (cf. [3]). On the other hand we have (see [27])
$$(u^t)^{-1} = E\bar uE^{-1},$$
where $E$ is defined as in the proposition. We also have
$$(u^t)^{-1} = Q\bar uQ^{-1}.$$
Hence
$$E\bar uE^{-1} = Q\bar uQ^{-1} \quad \text{and} \quad Q^{-1}E\bar u = \bar uQ^{-1}E.$$
If $u$ is irreducible, then so is $\bar u$ and therefore $Q^{-1}E$ is a scalar. □
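The functional $\phi_Q(b) = \mathrm{Tr}(Q^tb)$ of Proposition 3.1 is easy to evaluate on matrix units. The sketch below uses exact rational arithmetic for $n = 2$ with $Q = \mathrm{diag}(a, 1-a)$; the value $a = 1/3$ and the helper names are hypothetical choices made only for illustration. It indicates that $\phi_Q$ fails to be a trace unless $Q$ is a scalar matrix.

```python
from fractions import Fraction


def matrix_unit(n, i, j):
    """Matrix unit e_ij of M_n(C), with 1-based indices as in the text."""
    return [[Fraction(int(r == i - 1 and c == j - 1)) for c in range(n)]
            for r in range(n)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def phi(Q, b):
    """phi_Q(b) = Tr(Q^t b), which equals the sum over i, j of Q[i][j] * b[i][j]."""
    n = len(Q)
    return sum(Q[i][j] * b[i][j] for i in range(n) for j in range(n))


# Hypothetical choice a = 1/3, so that Q = diag(1/3, 2/3) has trace 1.
a = Fraction(1, 3)
Q = [[a, Fraction(0)], [Fraction(0), 1 - a]]
e12, e21 = matrix_unit(2, 1, 2), matrix_unit(2, 2, 1)

left = phi(Q, matmul(e12, e21))    # phi_Q(e_11) = a
right = phi(Q, matmul(e21, e12))   # phi_Q(e_22) = 1 - a
ratio = left / right               # a / (1 - a)

# For a scalar Q the two orders agree, as they must for a trace.
Q_scalar = [[Fraction(1, 2), Fraction(0)], [Fraction(0), Fraction(1, 2)]]
tracial = phi(Q_scalar, matmul(e12, e21)) == phi(Q_scalar, matmul(e21, e12))
```

The ratio $a/(1-a)$ computed here is the same quantity that appears as the parameter $\lambda$ of the Powers factor later in this section.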
Note. The proof of the necessary condition in (2) above was pointed out to us by Woronowicz. Our original proof contained an error.

In general the invariant functional $\phi_Q$ defined above is not a trace, even if the action $\mathrm{Ad}_u$ is ergodic. However, for ergodic actions of compact groups on operator algebras, one has the following finiteness theorem of Høegh-Krohn–Landstad–Størmer [15]:

Theorem 3.2. If a von Neumann algebra admits an ergodic action of a compact group $G$, then (a) this von Neumann algebra is finite; (b) the unique $G$-invariant state is a trace on the von Neumann algebra.

The proposition above shows that part (b) of this finiteness theorem is no longer true for compact quantum groups in general. We now show that part (a) of the above finiteness theorem is also false for compact quantum groups: not only can compact quantum groups act on infinite algebras, they can act on purely infinite factors (type III factors).

Definition 3.3. Let $(B_i, \pi_{ji})$ be an inductive system of $C^*$-algebras ($i, j \in I$). For each $i \in I$, let $\alpha_i$ be an action of a compact quantum group $A$ on $B_i$. We say that the actions $\alpha_i$ are a compatible system of actions for $(B_i, \pi_{ji})$ if for each pair $i \le j$, the following holds:
$$(\pi_{ji} \otimes 1)\alpha_i = \alpha_j\pi_{ji}.$$
The following lemmas will be used in the next theorem. Preserve the notation in Definition 3.3. Let $\pi_i$ be the natural embedding of $B_i$ into the inductive limit $B$ of the $B_i$'s.

Lemma 3.4. Put for each $i \in I$,
$$\alpha\pi_i(b_i) = (\pi_i \otimes 1)\alpha_i(b_i), \quad b_i \in B_i.$$
Then $\alpha$ induces a well defined action of the quantum group $A$ on $B$. The action $\alpha$ is ergodic if and only if each $\alpha_i$ is. Assume further that $\phi_i$ is an inductive system of states on $B_i$ and that each $\phi_i$ is invariant under $\alpha_i$. Then $\alpha$ leaves invariant the inductive limit state $\tau = \lim \phi_i$.

Proof. Let $j > i$, so $\pi_{ji}(b_i) \in B_j$. Then by the formula of $\alpha$ given in the lemma, we have
$$\alpha\pi_j(\pi_{ji}(b_i)) = (\pi_j \otimes 1)\alpha_j(\pi_{ji}(b_i)). \tag{*}$$
Since $\pi_j\pi_{ji} = \pi_i$, the left-hand side of the above is equal to
$$\alpha\pi_i(b_i) = (\pi_i \otimes 1)\alpha_i(b_i).$$
From the compatibility condition we see that the right-hand side of (∗) is equal to
$$(\pi_j \otimes 1)(\pi_{ji} \otimes 1)\alpha_i(b_i) = (\pi_i \otimes 1)\alpha_i(b_i).$$
This shows that $\alpha$ is well defined on the dense subalgebra $\mathcal{B} = \bigcup \pi_i(\mathcal{B}_i)$ of $B$, where $\mathcal{B}_i$ is the dense *-subalgebra of $B_i$ according to Definition 2.1. It is also clear that $\alpha$ is bounded
and satisfies the conditions of Definition 2.1. Hence $\alpha$ induces a well defined action of the quantum group on $B$. Assume that each $\alpha_i$ is ergodic. It is clear that the action $\alpha$ is ergodic on the dense *-subalgebra $\mathcal{B}$. Hence $\alpha$ is ergodic on $B$ by Proposition 2.3. Conversely, if $\alpha$ is ergodic, then the restriction $\alpha_i$ of $\alpha$ to $B_i$ is clearly ergodic. We now show that $\tau$ is invariant under $\alpha$. Note that $\tau(\pi_i(b_i)) = \phi_i(b_i)$. From this we have
$$(\tau \otimes 1)\alpha(\pi_i(b_i)) = (\tau \otimes 1)(\pi_i \otimes 1)\alpha_i(b_i) = (\phi_i \otimes 1)\alpha_i(b_i) = \phi_i(b_i) = \tau(\pi_i(b_i)).$$
By density of $\mathcal{B}$ in $B$, we have
$$(\tau \otimes 1)\alpha(b) = \tau(b), \quad b \in B.$$
This completes the proof of the lemma. □

Note. Not every action of a compact quantum group on an inductive limit of $C^*$-algebras arises from a compatible system of actions of $A$.

Lemma 3.5. Let $u_k$ be a unitary representation of a compact quantum group $A$ on $V_k$ for each natural number $k$. Assume that $\mathrm{Ad}_{u_k}$ leaves invariant a functional $\psi_k$ on $B(V_k)$. Then the action $\mathrm{Ad}_{u_1 \otimes_{in} \cdots \otimes_{in} u_k}$ leaves the functional $\phi^k = \psi_1 \otimes \cdots \otimes \psi_k$ invariant.

Proof. Straightforward calculation. □

Let $Q \in GL(n, \mathbb{C})$ be a positive matrix with trace 1. We now construct a sequence of actions $\alpha_k$ of the compact quantum group $(A_u(Q), u)$ on $M_n(\mathbb{C})^{\otimes k}$. Denote by $u^k$ the $k$-fold interior tensor product of the representation $u$, i.e., $u^k = u \otimes_{in} \cdots \otimes_{in} u$, see [29] for the definition of the interior tensor product $\otimes_{in}$. Put
$$\alpha_k = \mathrm{Ad}_{u^k}, \qquad \phi_Q^k = \phi_Q^{\otimes k} = \phi_Q \otimes \cdots \otimes \phi_Q.$$
Let
$$B = \lim_{k \to \infty} M_n(\mathbb{C})^{\otimes k}, \quad \mathfrak{B} = \pi_Q(B)'', \quad \tau_Q = \lim_{k \to \infty} \phi_Q^k, \quad \mathfrak{A} = \pi_h(A_u(Q))'',$$
where $\pi_Q$ and $\pi_h$ are respectively the GNS-representations for the positive functional $\tau_Q$ and the Haar state $h$ on $A_u(Q)$.

Theorem 3.6. The actions $\alpha_k$ ($k = 1, 2, \cdots$) of $A_u(Q)$ form a compatible system of ergodic actions leaving the functionals $\phi_Q^k$ invariant. These actions give rise to a natural ergodic action on the UHF algebra $B$ leaving invariant the positive functional $\tau_Q$, which in turn lifts to an ergodic action on the von Neumann algebra $\mathfrak{B}$.
Proof. It is straightforward to verify that the actions $\alpha_k$ are a compatible system of actions. Since each $u^k$ is irreducible (cf. [3]), we see that the actions $\alpha_k$ are ergodic. By 3.1.(1), $\phi_Q$ is invariant under the action $\mathrm{Ad}_u$. Hence applying the lemmas above, we see that the functionals $\phi_Q^k$ are invariant under the actions $\alpha_k$, and these actions give rise to an ergodic action of the quantum group $A$ on $B$ leaving $\tau_Q$ invariant. Now apply Theorem 2.5: the action $\alpha$ on $B$ induces an ergodic action $\tilde\alpha : \mathfrak{B} \to \mathfrak{B} \otimes \mathfrak{A}$ at the von Neumann algebra level defined by $\tilde\alpha(\pi_Q(b)) = (\pi_Q \otimes \pi_h)\alpha(b)$, where $b \in B$. □

Corollary 3.7. Take
$$Q = \begin{pmatrix} a & 0 \\ 0 & 1-a \end{pmatrix}, \quad a \in (0, 1/2).$$
Then $\tau_Q$ is the Powers state, so the quantum group $A_u(Q)$ acts ergodically on the Powers factor $R_\lambda$ of type III$_\lambda$, where $\lambda = a/(1-a)$.

Corollary 3.8 (compare with [34]). Take $Q = I_n$. Then $\tau_Q$ is the unique trace on the UHF algebra $B$ of type $n^\infty$, so the quantum group $A_u(n) = A_u(I_n)$ acts ergodically on the hyperfinite II$_1$ factor $R$.

We will see in Sect. 5 that for an appropriate choice of $Q$, the quantum groups $A_u(Q)$ act on the injective factor $R_\infty$ of type III$_1$ also. It would be interesting to know whether compact quantum groups admit ergodic actions on factors of type III$_0$ too.

4. Fixed Point Subalgebras of Quantum Subgroups

In this section, we show that although the actions of the universal quantum groups $A_u(Q)$ constructed in the last section are ergodic, when restricted to some of their non-trivial quantum subgroups, we obtain interesting large fixed point algebras. Let
$$Q = \begin{pmatrix} a & 0 \\ 0 & 1-a \end{pmatrix},$$
as in 3.7. Put $q = \lambda^{1/2} = (a/(1-a))^{1/2}$. Then from the definitions of $SU_q(2)$ and $A_u(Q)$, we see that $SU_q(2)$ is a quantum subgroup of $A_u(Q)$. By restriction, we obtain from the action of $A_u(Q)$ an action of $SU_q(2)$ on $R_\lambda$. The fixed point subalgebra of $R_\lambda$ under the action of $SU_q(2)$ is generated by the Jones projections $\{1, e_1, e_2, \cdots\}$. The restriction of the Powers state $\tau_Q$ to this fixed point algebra is a trace and its values on the Jones projections give the Jones polynomial. See the book of Jones [18].

Now take $Q = \frac{1}{n}I_n$. We have Corollary 3.8. For simplicity of notation, let $\tau$ denote the trace $\tau_Q$ on the UHF algebra $B$. There are two special quantum subgroups of $A_u(n)$: $SU(n)$ and $A_o(Q) = A_o(n)$. By 4.7.d. of [14], for any closed subgroup $G$ of $SU(n)$, the fixed point algebra $R^G$ is a II$_1$ subfactor of $R$. We now show that the same result holds for quantum subgroups of $A_o(n)$. For this, it suffices to prove the following

Proposition 4.1. The fixed point subalgebra $R^{A_o(n)}$ of $R$ for the quantum subgroup $A_o(n)$ of $A_u(n)$ is a II$_1$ factor and the action of $A_o(n)$ on $R$ is prime.
Proof. Put $\beta = n^2$. By [2], the fixed point subalgebra of $M_n(\mathbb{C})^{\otimes k}$ for the action $\alpha_k = \mathrm{Ad}_{u^k}$ is generated by $1, e_1, \cdots, e_{k-1}$, where $u$ is the fundamental representation of the quantum group $A_o(n)$,
$$e_s = I_{H^{\otimes(s-1)}} \otimes \frac{1}{n}\sum_{i,j} e_{ij} \otimes e_{ij} \otimes I_{H^{\otimes(k-s-1)}},$$
and $H = \mathbb{C}^n$. The $e$'s satisfy the relations:
(i) $e_s^2 = e_s = e_s^*$;
(ii) $e_se_t = e_te_s$, $1 \le s, t \le k-1$, $|s-t| \ge 2$;
(iii) $\beta e_se_te_s = e_s$, $1 \le s, t \le k-1$, $|s-t| = 1$.
We now show that the restriction of $\tau$ on the fixed point subalgebra of $M_n(\mathbb{C})^{\otimes k}$ satisfies the Markov trace condition of modulus $\beta$, where $\tau$ is the trace on $R$. Namely, we will verify the identity
$$\tau(we_{k-1}) = \frac{1}{\beta}\tau(w)$$
for $w$ in the subalgebra of $M_n(\mathbb{C})^{\otimes k}$ generated by $1, e_1, \cdots, e_{k-2}$. By Theorem 4.1.1 and Corollary 2.2.4 of Jones [17], this will complete the proof of the proposition. It will also follow that the action of $A_o(n)$ on $R$ is prime. To verify this, it suffices by Proposition 2.8.1 of [14] to check the Markov trace condition for $w$ of the form
$$w = (e_{i_1}e_{i_1-1} \cdots e_{j_1})(e_{i_2}e_{i_2-1} \cdots e_{j_2}) \cdots (e_{i_p}e_{i_p-1} \cdots e_{j_p}),$$
where
$$1 \le i_1 < i_2 < \cdots < i_p \le k-2, \quad 1 \le j_1 < j_2 < \cdots < j_p \le k-2, \quad i_1 \ge j_1,\ i_2 \ge j_2,\ \cdots,\ i_p \ge j_p, \quad 0 \le p \le k-2.$$
If $i_p < k-2$ then it is easy to see that
$$\tau(we_{k-1}) = \tau(w)\tau(e_{k-1}) = \frac{1}{\beta}\tau(w),$$
noting that $\tau(e_{k-1}) = \frac{1}{\beta}$. Hence we can assume $i_p = k-2$. Let $l_w$ be the length of the word $w$. Then $w$ takes the form
$$w = \sum \Bigl(\frac{1}{n}\Bigr)^{l_w} (\cdots) \otimes e_{ab} \otimes 1,$$
where the summation is over the indices $a, b$ and some other indices that need not be specified, and the terms in $(\cdots)$ are certain elements of $M_n(\mathbb{C})^{\otimes(k-2)}$ that need not be
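specified either (the components in the tensor product of the terms in $(\cdots)$ are products of $e_{ij}$'s). We have then
$$\tau(we_{k-1}) = \sum\sum_{x,y} \Bigl(\frac{1}{n}\Bigr)^{l_w+1} \tau\bigl((\cdots) \otimes e_{ab}e_{xy} \otimes e_{xy}\bigr)$$
$$= \sum\sum_{x,y} \Bigl(\frac{1}{n}\Bigr)^{l_w+1} \tau\bigl((\cdots) \otimes e_{ab}e_{xy} \otimes 1\bigr)\, \tau\bigl(I_{H^{\otimes(k-1)}} \otimes e_{xy}\bigr)$$
$$= \sum\sum_{x} \Bigl(\frac{1}{n}\Bigr)^{l_w+1} \tau\bigl((\cdots) \otimes e_{ab}e_{xx} \otimes 1\bigr)\frac{1}{n}$$
$$= \frac{1}{\beta}\tau(w).$$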
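The relations (i)–(iii) and the Markov trace condition used in this proof can be checked by hand for small parameters. The following sketch does so in exact rational arithmetic for $n = 2$ and $k = 4$; these parameter values and all helper names are hypothetical choices for illustration, and the normalized trace on $M_n(\mathbb{C})^{\otimes k}$ stands in for the restriction of $\tau$.

```python
from fractions import Fraction


def identity(d):
    return [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def kron(a, b):
    """Kronecker product: entry ((i,k),(j,l)) equals a[i][j] * b[k][l]."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def ntrace(a):
    """Normalized trace, so that the identity has trace 1."""
    d = len(a)
    return sum(a[i][i] for i in range(d)) / d


def jones_block(n):
    """The matrix (1/n) * sum_{i,j} e_ij (x) e_ij acting on C^n (x) C^n."""
    d = n * n
    return [[Fraction(1, n) if (r // n == r % n and c // n == c % n) else Fraction(0)
             for c in range(d)] for r in range(d)]


def jones_projection(n, k, s):
    """e_s = I^{(s-1)} (x) block (x) I^{(k-s-1)} inside M_n(C)^{(x) k}."""
    left = identity(n ** (s - 1))
    right = identity(n ** (k - s - 1))
    return kron(kron(left, jones_block(n)), right)


n, k = 2, 4          # hypothetical small parameters
beta = n * n
e1, e2, e3 = (jones_projection(n, k, s) for s in (1, 2, 3))

relation_i = matmul(e1, e1) == e1 and transpose(e1) == e1
relation_ii = matmul(e1, e3) == matmul(e3, e1)
relation_iii = scale(beta, matmul(matmul(e1, e2), e1)) == e1

# Markov condition tau(w e_{k-1}) = tau(w) / beta for some words w in 1, e_1, e_2.
words = [identity(n ** k), e1, e2, matmul(e1, e2), matmul(e2, e1)]
markov = all(ntrace(matmul(w, e3)) == ntrace(w) / beta for w in words)
```

Only a handful of words $w$ are tested here, so this is a consistency check on the formulas rather than a substitute for the argument above.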
The proof is complete. □

Remarks. (1) In view of the above result, fixed point algebras of quantum subgroups of $A_o(n)$ give examples of subfactors. Therefore, it would be interesting to classify finite quantum subgroups of $A_o(n)$ and study them in the light of Jones' theory, see [14] for this in the case of the Lie group $SU(2)$. Note that the quantum group $A_o(n)$ contains the quantum permutation group $A_{aut}(X_n)$ of the $n$ point space $X_n$ (see [32]) and many other interesting quantum subgroups (see [28]). It would also be interesting to determine the fixed point subalgebras of the quantum subgroups of $SU_{-1}(n)$ ($SU_{-1}(n)$ is a quantum subgroup of $A_u(n)$ because its antipode has period 2 [28,27]). We refer the reader to Banica [4] for some interesting related results.
(2) Note that since $\mathrm{Ad}_{v \otimes_{in} w} \ne \mathrm{Ad}_v \otimes_{in} \mathrm{Ad}_w$ for unitary representations $v$ and $w$ of $A_o(n)$, we do not have a commuting square like the one on p. 222 of [14] for a given quantum subgroup $G$ of $A_o(n)$.

5. Ergodic Actions of $A_u(Q)$ on the Cuntz Algebra and the Injective Factor of Type III$_1$

The Cuntz algebra $\mathcal{O}_n$ is an infinite simple $C^*$-algebra without trace, hence by [15] it does not admit an ergodic action of a compact group. Recall that the Cuntz algebra $\mathcal{O}_n$ is the simple $C^*$-algebra generated by $n$ isometries $S_k$ ($k = 1, \cdots, n$) such that
$$\sum_{k=1}^n S_kS_k^* = 1. \tag{5.1}$$
Just as $U(n)$, the compact matrix quantum group $A_u(Q)$ acts on $\mathcal{O}_n$ in a natural manner [12,10,19], where $Q$ is a positive matrix of trace 1 in $GL(n, \mathbb{C})$:
$$\alpha(S_j) = \sum_{i=1}^n S_i \otimes u_{ij}, \tag{5.2}$$
the dense *-algebra $\mathcal{B}$ of Definition 2.1 being the *-subalgebra ${}^0\mathcal{O}_n$ of $\mathcal{O}_n$ generated by the $S_i$'s, see Doplicher-Roberts [11]. However, unlike the actions of compact groups on $\mathcal{O}_n$, we have
Theorem 5.1. The above action $\alpha$ of the quantum group $A_u(Q)$ on $\mathcal{O}_n$ is ergodic, and the unique $\alpha$-invariant state on $\mathcal{O}_n$ is the quasi-free state $\omega_Q$ associated with $Q$ [13].

Proof. Let $H$ be the Hilbert subspace of $\mathcal{O}_n$ linearly spanned by the $S_k$'s. Let $(H^s, H^r)$ be the linear span of elements of the form $S_{i_1}S_{i_2} \cdots S_{i_r}S_{j_s}^* \cdots S_{j_2}^*S_{j_1}^*$. Then ${}^0\mathcal{O}_n$ is the linear span of all the spaces $(H^s, H^r)$, $r, s \ge 0$ (see [11]). Observe that each of the spaces $(H^s, H^r)$ is invariant under the action $\alpha$:
$$\alpha(S_{i_1}S_{i_2} \cdots S_{i_r}S_{j_s}^* \cdots S_{j_2}^*S_{j_1}^*) = \sum_{k_1, \cdots, k_r, l_1, \cdots, l_s = 1}^n S_{k_1}S_{k_2} \cdots S_{k_r}S_{l_s}^* \cdots S_{l_2}^*S_{l_1}^* \otimes u_{k_1i_1}u_{k_2i_2} \cdots u_{k_ri_r}u_{l_sj_s}^* \cdots u_{l_2j_2}^*u_{l_1j_1}^*.$$
Hence $(\mathrm{id} \otimes h)\alpha((H^s, H^r))$ is the space of the fixed elements of $(H^s, H^r)$ under $\alpha$, where $h$ is the Haar state on $A_u(Q)$. For $r \ne s$, the tensor product representations $u^{\otimes r}$ and $u^{\otimes s}$ of the fundamental representation $u$ of the quantum group $A_u(Q)$ are inequivalent and irreducible [3]. Hence by Theorem 5.7 of Woronowicz [37], for $r \ne s$,
$$h(u_{k_1i_1}u_{k_2i_2} \cdots u_{k_ri_r}u_{l_sj_s}^* \cdots u_{l_2j_2}^*u_{l_1j_1}^*) = 0, \tag{5.3}$$
and therefore $(H^s, H^r)$ has no fixed point other than 0. For $r = s$, identifying the elements $S_{i_1}S_{i_2} \cdots S_{i_r}S_{j_r}^* \cdots S_{j_2}^*S_{j_1}^*$ of $(H^r, H^r)$ with the matrix units $e_{i_1j_1} \otimes e_{i_2j_2} \otimes \cdots \otimes e_{i_rj_r}$ of $M_n(\mathbb{C})^{\otimes r}$, the action $\alpha$ on $(H^r, H^r)$ is identified with the action $\alpha_r$ on $M_n(\mathbb{C})^{\otimes r}$ of Theorem 3.6. Hence the fixed elements of $(H^r, H^r)$ under $\alpha$ are the scalars. Consequently, the fixed elements of ${}^0\mathcal{O}_n$ under $\alpha$ are the scalars. By Proposition 2.3, $\alpha$ is ergodic on $\mathcal{O}_n$.

Let $\phi$ be the (unique) $\alpha$-invariant state on $\mathcal{O}_n$. Then for $x \in (H^r, H^s)$ with $r \ne s$, $r, s \ge 0$, we have
$$\phi(x) = h((\phi \otimes 1)\alpha(x)) = \phi((1 \otimes h)\alpha(x)).$$
But $(1 \otimes h)\alpha(x) = 0$ according to the computation above. Hence $\phi(x) = 0$. From the consideration of the last paragraph, $\alpha$ restricts to an ergodic action on the subalgebra $(H^k, H^k)$ of $\mathcal{O}_n$. Identifying $(H^k, H^k)$ with $M_n(\mathbb{C})^{\otimes k}$ as above, we see that
$$\phi(x) = \phi_Q^k(x), \quad x \in (H^k, H^k),$$
where $\phi_Q^k$ is the functional in Theorem 3.6. This shows that $\phi$ is the quasi-free state $\omega_Q$ associated with $Q$ (cf. [13]). □
We can assume that $Q = \mathrm{diag}(q_1, q_2, \cdots, q_n)$ is a diagonal positive matrix with trace 1, since $A_u(Q)$ and $A_u(VQV^{-1})$ are similar to each other [27]. Let $\beta$ be a positive number. Define numbers $\omega_1, \omega_2, \cdots, \omega_n$ by
$$\mathrm{diag}(e^{-\beta\omega_1}, e^{-\beta\omega_2}, \cdots, e^{-\beta\omega_n}) = \mathrm{diag}(q_1, q_2, \cdots, q_n). \tag{5.4}$$
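Equation (5.4) simply says $\omega_k = -\ln(q_k)/\beta$. A minimal numerical sketch follows; the diagonal entries and the values of $\beta$ are hypothetical, and floating-point arithmetic cannot of course decide whether a ratio $\omega_1/\omega_k$ is rational. The sketch only illustrates that these ratios equal $\ln q_1 / \ln q_k$ and therefore do not depend on the choice of $\beta$.

```python
import math


def frequencies(q, beta):
    """Solve exp(-beta * omega_k) = q_k for omega_k, as in (5.4)."""
    return [-math.log(qk) / beta for qk in q]


q = [0.2, 0.3, 0.5]    # hypothetical diagonal of Q: positive entries summing to 1

omega = frequencies(q, beta=2.0)
recovered = [math.exp(-2.0 * w) for w in omega]   # should reproduce q

# Ratios omega_1 / omega_k for two different values of beta.
ratios = [omega[0] / w for w in omega]
omega_alt = frequencies(q, beta=7.5)
ratios_alt = [omega_alt[0] / w for w in omega_alt]

# The same ratios written directly in terms of the q_k.
log_ratios = [math.log(q[0]) / math.log(qk) for qk in q]
```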
Let $\pi_Q$ be the GNS representation of the $\alpha$-invariant state $\omega_Q$ of $\mathcal{O}_n$. Then by Theorem 4.7 of Izumi [16] and Theorem 2.5, we have
Corollary 5.2. If $\omega_1/\omega_k$ is irrational for some $k$, then the compact quantum group $A_u(Q)$ acts ergodically on the injective factor $\pi_Q(\mathcal{O}_n)''$ of type III$_1$.

Remarks. (1) The big quantum semi-group $U_{nc}(n)$ of Brown also acts on $\mathcal{O}_n$ in the same way as $A_u(Q)$ on $\mathcal{O}_n$ above. See Brown [6] and 4.1 of Wang [28] for the quantum semi-group structure on $U_{nc}(n)$.
(2) If the $\omega_1/\omega_k$'s are rational for all $k$, then $\pi_Q(\mathcal{O}_n)''$ is an injective factor of type III$_\lambda$, on which $A_u(Q)$ acts ergodically, where $\lambda$ is determined from an equation involving $q_1, \cdots, q_n$ (see [16]). In particular, taking $A_u(Q) = A_u(n)$, we see that even the compact matrix quantum group $A_u(n)$ of Kac type admits ergodic actions on both the infinite $C^*$-algebra $\mathcal{O}_n$ and the injective factor $\pi_Q(\mathcal{O}_n)''$ of type III$_{1/n}$. In view of Corollary 3.8, it would be interesting to solve the following problem:

Problem. Does a compact matrix quantum group of non-Kac type admit an ergodic action on the hyperfinite II$_1$ factor $R$?

6. Ergodic Actions on Quotient Spaces

In this section, we study ergodic actions of compact quantum groups on their quantum quotient spaces. We also give an example to show that, contrary to the classical situation, not all ergodic actions arise in this way. Fix a quantum subgroup $H$ of a compact quantum group $G$, which is given by a surjective morphism $\theta$ of Woronowicz Hopf $C^*$-algebras from $C(G)$ to $C(H)$. Let $h_H$ and $h_G$ be respectively the Haar states on $C(H)$ and $C(G)$. Then there is a natural action $\beta$ of the quantum group $H$ on $G$ given by
$$\beta : C(G) \to C(H) \otimes C(G), \quad \beta = (\theta \otimes 1)\Phi_G, \tag{6.1}$$
where $\Phi_G$ is the coproduct on $C(G)$. The quotient space $H \backslash G$ is defined by the fixed point algebra of $\beta$ (cf. [23]):
$$C(H \backslash G) = C(G)^\beta = \{a \in C(G) : (\theta \otimes 1)\Phi_G(a) = 1 \otimes a\}. \tag{6.2}$$
The restriction of $\Phi_G$ to $C(H \backslash G)$ defines a natural action $\alpha$ of $G$ on $C(H \backslash G)$:
$$\alpha = \Phi_G|_{C(H \backslash G)} : C(H \backslash G) \to C(H \backslash G) \otimes C(G). \tag{6.3}$$
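For orientation, formulas (6.1)–(6.3) can be unwound when $G$ is an ordinary compact group and $H$ a closed subgroup. The sketch below assumes this commutative setting, with $\theta$ the restriction map and $\Phi_G(a)(g_1, g_2) = a(g_1g_2)$; these identifications are standard but are stated here as assumptions rather than taken from the text.

```latex
% Assumptions: G a compact group, H a closed subgroup,
% \theta(a) = a|_H and \Phi_G(a)(g_1,g_2) = a(g_1 g_2).
\[
  \beta(a)(h,g) = (\theta \otimes 1)\Phi_G(a)(h,g) = a(hg),
  \qquad h \in H,\ g \in G.
\]
% Since (1 \otimes a)(h,g) = a(g), the fixed point condition in (6.2) becomes
\[
  C(H \backslash G) = \{\, a \in C(G) : a(hg) = a(g) \ \text{for all } h \in H,\ g \in G \,\},
\]
% i.e. the continuous functions on the space of cosets Hg.
% The action (6.3) is right translation:
\[
  \alpha(a)(g_1,g_2) = a(g_1 g_2),
\]
% and \alpha(a) is again H-invariant in its first variable, because
\[
  \alpha(a)(hg_1,g_2) = a(hg_1g_2) = a(g_1g_2) = \alpha(a)(g_1,g_2).
\]
```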
The dense *-subalgebras of Definition 2.1 for the actions $\beta$ and $\alpha$ are the natural ones. Note that
$$E = (h_H \otimes 1)\beta = (h_H\theta \otimes 1)\Phi_G$$
is a projection of norm one from $C(G)$ to $C(H \backslash G)$ (cf. Proposition 2.3 and [23]).

Proposition 6.1. In the situation as above, we have
(1) the action $\alpha$ of $G$ on $C(H \backslash G)$ is ergodic;
(2) $C(H \backslash G)$ has a unique $\alpha$-invariant state $\omega$ satisfying
$$h_G(a) = \omega((h_H\theta \otimes 1)\Phi_G(a)), \quad a \in C(G). \tag{6.4}$$
Namely, $\omega$ is the restriction of $h_G$ on $C(H \backslash G)$.
Note. Part (2) of the proposition above is the analogue of the following well known integration formula in the classical situation:
$$\int_G a(g)\,dg = \int_{H \backslash G}\int_H a(hg)\,dh\,d\omega(g), \quad a \in C(G).$$

Proof. (1) Let $a \in C(H \backslash G)$ be fixed under $\alpha$, i.e.,
$$\alpha(a) = a \otimes 1. \tag{**}$$
Since $\alpha(a) = \Phi_G(a)$ and since (by the definition of $C(H \backslash G)$) $(\theta \otimes 1)\Phi_G(a) = 1 \otimes a$, it follows that
$$(\theta \otimes 1)\alpha(a) = 1 \otimes a.$$
Using (∗∗) for the left hand side of the above, we get
$$\theta(a) \otimes 1 = 1 \otimes a.$$
This is possible only for $a = \lambda \cdot 1$ for some scalar $\lambda$.
(2) The general result of the existence and uniqueness of the invariant state for an ergodic action is proven in [5]. For the special situation we consider here, we now not only prove the existence and uniqueness of the invariant state, but also give the precise formula of the invariant state. Let $\omega$ be the restriction of $h_G$ on the subalgebra $C(H \backslash G)$ of $C(G)$. Since $(h_H\theta \otimes 1)\Phi_G$ is a projection from $C(G)$ onto $C(H \backslash G)$ and $\alpha$ is the restriction of $\Phi_G$ on $C(H \backslash G)$, the invariance of $\omega$ for the action $\alpha$ follows from the invariance of the Haar state $h_G$. Conversely, let $\mu$ be any invariant state on $C(H \backslash G)$. Using again the fact that $(h_H\theta \otimes 1)\Phi_G$ is a projection from $C(G)$ onto $C(H \backslash G)$, a standard calculation shows that the functional $\phi(a) = \mu((h_H\theta \otimes 1)\Phi_G(a))$, $a \in C(G)$, is a right invariant state, i.e.
$$\phi * \psi(a) = \phi(a), \quad a \in C(G),$$
where $\psi$ is a state on $A$ and $\phi * \psi = (\phi \otimes \psi)\Phi_G$ is the convolution operation (cf. [37]). From the uniqueness of the Haar state, it follows from this that
$$\phi = h_G, \quad \mu = \omega = h_G|_{C(H \backslash G)}.$$
□

Remarks. (1) Note that the quantum groups $A_u(Q)$, $A_o(Q)$ and $B_u(Q)$ have many quantum subgroups. In the light of Proposition 6.1 and Theorem 2.5, it would be interesting to study the corresponding operator algebras and the actions on them. We leave this to a separate work.
(2) More generally than the considerations in Proposition 6.1, if two quantum groups admit commuting actions on a noncommutative space, then they act on each other's orbit spaces (not necessarily in an ergodic manner), just as in the classical situation. Note that the notion of orbit space corresponds to fixed point algebra in the noncommutative situation.
An Example. Every transitive action of a compact group $G$ on a topological space $X$ is isomorphic to the natural action of $G$ on $H \backslash G$, where $H$ is the closed subgroup of $G$ that fixes some point of $X$. However, this is no longer true for quantum groups, even if the space on which the quantum group acts is a classical one. To see this, let $X_n = \{x_1, \cdots, x_n\}$ be the space with $n$ points. By Theorem 3.1 of [32], the quantum automorphism group $A_{aut}(X_4)$ of $X_4$ contains the ordinary permutation group $S_4$, hence it acts ergodically on $X_4$. The quantum subgroup of $A_{aut}(X_4)$ that fixes a point, say $x_1$, is isomorphic to $A_{aut}(X_3)$, which is the same as $C(S_3)$, a (commutative) algebra of dimension 6. From [32], we know that as a $C^*$-algebra, $A_{aut}(X_n)$ is the same as $C(S_n)$ for $n \le 3$ and it has $C^*(\mathbb{Z}/2\mathbb{Z} * \mathbb{Z}/2\mathbb{Z})$ as a quotient for $n \ge 4$, where $\mathbb{Z}/2\mathbb{Z} * \mathbb{Z}/2\mathbb{Z}$ is the free product of the two-element group $\mathbb{Z}/2\mathbb{Z}$ with itself, because the entries of the matrix
$$\begin{pmatrix} p & 1-p & 0 & 0 \\ 1-p & p & 0 & 0 \\ 0 & 0 & q & 1-q \\ 0 & 0 & 1-q & q \end{pmatrix}$$
satisfy the commutation relations of the algebra $A_{aut}(X_4)$, where $p, q$ are the projections generating the $C^*$-algebra $C^*(\mathbb{Z}/2\mathbb{Z} * \mathbb{Z}/2\mathbb{Z})$: $p = (1-u)/2$ and $q = (1-v)/2$, $u$ and $v$ being the unitary generators of the first and second copies of $\mathbb{Z}/2\mathbb{Z}$ in the free product (cf. [24]).

For simplicity of notation, let $C(G) = A_{aut}(X_4)$, and let $\mathcal{M}$ be the canonical dense subalgebra of $C(G)$ generated by the coefficients of the fundamental representation of $G$ (see [37]). Let $H = S_3$, the subgroup of $G$ that fixes $x_1$, and let $\mathcal{H} = C(H)$. Let $\theta$ be the surjection from $C(G)$ to $C(H)$ that embeds $H$ as a subgroup of $G$ (cf. [32]). Let $\beta$ be the action defined in the beginning of this section. We claim that the coset space $H \backslash G$ is not isomorphic to $X_4$ as a $G$-space (see Sect. 2 of [32] for the notion of morphism). Namely, we have

Theorem 6.2. The $G$-algebras $C(H \backslash G)$ (which is defined to be $C(G)^\beta$) and $C(X_4)$ are not isomorphic to each other.

Proof. Since $C(X_4)$ has dimension 4, it suffices to show that $C(H \backslash G)$ is infinite dimensional. We make $\mathcal{M}$ into a Hopf $\mathcal{H}$-module (i.e. a compatible system of a left $\mathcal{H}$ comodule and a left $\mathcal{H}$ module) as follows. The restriction of $\beta$ to $\mathcal{M}$ clearly defines a left $\mathcal{H}$ comodule structure:
$$\beta : \mathcal{M} \to \mathcal{H} \otimes \mathcal{M}. \tag{6.5}$$
The left $\mathcal{H}$ module structure on $\mathcal{M}$ is the trivial one defined by
$$\mathcal{H} \otimes \mathcal{M} \to \mathcal{M}, \tag{6.6}$$
$$h \cdot m = \epsilon(h)m, \quad h \in \mathcal{H},\ m \in \mathcal{M}. \tag{6.7}$$
By Theorem 4.1.1 of Sweedler [25], we have an isomorphism of left $\mathcal{H}$ modules
$$\mathcal{H} \otimes \mathcal{M}^\beta \cong \mathcal{M}. \tag{6.8}$$
That is
$$\mathcal{H} \otimes \mathcal{A}(H \backslash G) \cong \mathcal{M}, \tag{6.9}$$
$$h \otimes m' \mapsto h \cdot m', \quad h \in \mathcal{H},\ m' \in \mathcal{A}(H \backslash G), \tag{6.10}$$
where $\mathcal{A}(H \backslash G) = \mathcal{M}^\beta$ is the canonical dense subalgebra of $C(H \backslash G) = C(G)^\beta$. Since $\mathcal{M}$ is infinite dimensional and $\mathcal{H}$ is finite dimensional, $\mathcal{A}(H \backslash G)$ and therefore $C(H \backslash G)$ are also infinite dimensional. □

Acknowledgement. The author is indebted to Marc A. Rieffel for continual support. Part of this paper was written while the author was a member at the IHES during the year July, 1995-Aug, 1996. He thanks the IHES for its financial support and hospitality during this period. The author also wishes to thank the Department of Mathematics at UC-Berkeley for its support and hospitality while the author held an NSF Postdoctoral Fellowship there during the final stage of this paper.
References

1. Baaj, S. and Skandalis, G.: Unitaires multiplicatifs et dualité pour les produits croisés de C*-algèbres. Ann. Sci. Ec. Norm. Sup. 26, 425–488 (1993)
2. Banica, T.: Théorie des représentations du groupe quantique compact libre O(n). C. R. Acad. Sci. Paris 322, Serie I, 241–244 (1996)
3. Banica, T.: Le groupe quantique compact libre U(n). Commun. Math. Phys. 190, 143–172 (1997)
4. Banica, T.: Quantum groups acting on n points, complex Hadamard matrix and a construction of subfactors, math/9806054
5. Boca, F.: Ergodic actions of compact matrix pseudogroups on C*-algebras. In: Recent Advances in Operator Algebras. Astérisque 232, 93–109 (1995)
6. Brown, L.: Ext of certain free product of C*-algebras. J. Operator Theory 6, 135–141 (1981)
7. Ceccherini, T., Doplicher, S., Pinzari, C. and Roberts, J.E.: A generalization of the Cuntz algebras and model actions. J. Funct. Anal. 125, 416–437 (1994)
8. Cuntz, Joachim: Simple C*-algebras generated by isometries. Commun. Math. Phys. 57, no. 2, 173–185 (1977)
9. Cuntz, Joachim: Regular actions of Hopf algebras on the C*-algebra generated by a Hilbert space. In: Operator algebras, mathematical physics, and low-dimensional topology (Istanbul, 1991), Wellesley, MA: A K Peters, 1993, pp. 87–100; MR 94m:461
10. Doplicher, S.: Abstract compact group duals, operator algebras and quantum field theory. Proc. ICM1990, Kyoto: Springer, 1991
11. Doplicher, S. and Roberts, J.E.: Duals of compact Lie groups realized in the Cuntz algebras and their actions on C*-algebras. J. Funct. Anal. 74, 96–120 (1987)
12. Doplicher, S. and Roberts, J.E.: Compact group actions on C*-algebras. J. Operator Theory 19, 283–305 (1988)
13. Evans, David E.: On O_n. Publ. RIMS. Kyoto Univ. 16, 915–927 (1980)
14. Goodman, F.M. and de la Harpe, P. and Jones, V.F.R.: Coxeter Graphs and Towers of Algebras. MSRI Publ. 14, Berlin–Heidelberg–New York: Springer-Verlag, 1989
15. Høegh-Krohn, R. and Landstad, M.B. and Størmer, E.: Compact ergodic groups of automorphisms. Ann. of Math. 114, 75–86 (1981)
16. Izumi, Masaki: Subalgebras of infinite C*-algebras with finite Watatani indices. I. Cuntz algebras. Commun. Math. Phys. 155, no. 1, 157–182 (1993); MR 94e:46104
17. Jones, V. F. R.: Index for Subfactors. Invent. Math. 72, 1–25 (1983)
18. Jones, V. F. R.: Subfactors and Knots. Regional Conference Series 80, Providence, RI: Am. Math. Soc., 1991
19. Konishi, Y., Nagisa, M. and Watatani, Y.: Some remarks on actions of compact matrix quantum groups on C*-algebras. Pacific J. Math. 153, 119–127 (1992)
20. Marciniak, M.: Actions of compact quantum groups on C*-algebras. Proc. AMS 126, 607–616 (1998)
21. Nakagami, Y.: Takesaki duality for the crossed product by quantum groups. In: Quantum and Non-Commutative Analysis. H. Araki, ed., Dordrecht: Kluwer Academic Publishers, 1993, pp. 263–281
22. Paolucci, A.: Coactions of Hopf algebras on Cuntz algebras and their fixed point algebras. Proc. AMS 125, 1033–1042 (1997)
23. Podles, P.: Symmetries of quantum spaces. Subgroups and quotient spaces of quantum SU(2) and SO(3) groups. Commun. Math. Phys. 170, 1–20 (1995)
24. Raeburn, I. and Sinclair, A.M.: The C*-algebra generated by two projections. Math. Scand. 65, 278–290 (1989)
25. Sweedler, M.E.: Hopf Algebras. New York: Benjamin, 1969
26. Van Daele, A.: The Haar measure on a compact quantum group. Proc. Am. Math. Soc. 123, 3125–3128 (1995)
498
S. Wang
27. Van Daele, A. and Wang, S. Z.: Universal quantum groups. International J. Math 7:2, 255–264 (1996) 28. Wang, S. Z.: Free products of compact quantum groups. Commun. Math. Phys. 167, 671–692 (1995) 29. Wang, S. Z.: Tensor products and crossed products of compact quantum groups. Proc. London Math. Soc. 71, 695–720 (1995) 30. Wang, S. Z.: New classes of compact quantum groups. Lecture notes for talks at the University of Amsterdam and the University of Warsaw, January and March, 1995 31. Wang, S. Z.: Problems in the theory of quantum groups. In: Quantum Groups and Quantum Spaces. Banach Center Publication 40 (1997), Inst. of Math., Polish Acad. Sci., Editors: R. Budzynski, W. Pusz, and S. Zakrzewski, pp. 67–78 32. Wang, S. Z.: Quantum symmetry groups of finite spaces. Commun. Math. Phys. 195:1, 195–211 (1998) 33. Wassermann, A.: Ergodic actions of compact groups on operator algebras I: General theory. Ann. of Math. 130, 273–319 (1989) 34. Wassermann, A.: Ergodic actions of compact groups on operator algebras III: Classification for SU (2). Invent. Math. 93, 309–355 (1988) 35. Wassermann, A.: Coactions and Yang-Baxter equations for ergodic actions and subfactors. In: Operator Algebras and Applications, no 2, ed. by D. Evans and M. Takesaki, London Math. Soc. Lecture Notes 136, 1988, pp. 203–236 36. Woronowicz, S. L.: Twisted SU (2) group. An example of noncommutative differential calculus, Publ. RIMS, Kyoto Univ. 23, 117–181 (1987) 37. Woronowicz, S. L.: Compact matrix pseudogroups. Commun. Math. Phys. 111, 613–665 (1987) 38. Woronowicz, S. L.: Tannaka–Krein duality for compact matrix pseudogroups. Twisted SU (N ) groups. Invent. Math. 93, 35–76 (1988) Communicated by A. Connes
Commun. Math. Phys. 203, 499 – 530 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Solutions of the Dirac–Fock Equations for Atoms and Molecules
Maria J. Esteban1,*, Eric Séré2,*
1 Ceremade (URA CNRS 749), CNRS et Université Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, F-75775 Paris Cedex 16, France
2 Département de Mathématiques, Université de Cergy-Pontoise, 2 Av. Adolphe Chauvin, F-95302 Cergy-Pontoise Cedex, France
Received: 15 December 1997 / Accepted: 29 June 1998
Abstract: The Dirac–Fock equations are the relativistic analogue of the well-known Hartree–Fock equations. They are used in computational chemistry, and yield results on the inner-shell electrons of heavy atoms that are in very good agreement with experimental data. By a variational method, we prove the existence of infinitely many solutions of the Dirac–Fock equations “without projector”, for Coulomb systems of electrons in atoms, ions or molecules, with Z ≤ 124, N ≤ 41, N ≤ Z. Here, Z is the sum of the nuclear charges in the molecule, N is the number of electrons.
1. Introduction

In relativistic quantum mechanics [5], the state of a free electron is represented by a wave function $\Psi(t, x)$ with $\Psi(t, \cdot) \in L^2(\mathbb{R}^3, \mathbb{C}^4)$ for any $t$. This wave satisfies the free Dirac equation:
\[ i\,\partial_t \Psi = H_0 \Psi, \quad \text{with} \quad H_0 = -i \sum_{k=1}^{3} \alpha_k \partial_k + \beta. \tag{1.1} \]
Here, we have chosen a system of units such that $\hbar = c = 1$, the mass $m_e$ of the electron has also been normalized to 1. Before going further, let us fix some notations. In the whole paper, the conjugate of $z \in \mathbb{C}$ will be denoted by $z^*$. For $X = (z_1, \dots, z_4)^T$ a column vector in $\mathbb{C}^4$, we denote by $X^*$ the row covector $(z_1^*, \dots, z_4^*)$. Similarly, if $A = (a_{ij})$ is a $4 \times 4$ complex matrix, we denote by $A^*$ its adjoint, $(A^*)_{ij} = a_{ji}^*$.

* Present address: Ceremade (UMR CNRS 7534), Université Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, F-75775 Paris Cedex 16, France
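We denote by $(X, X')$ the Hermitian product of two vectors $X$, $X'$ in $\mathbb{C}^4$, and by $|X|$ the norm of $X$ in $\mathbb{C}^4$, i.e. $|X|^2 = \sum_{i=1}^{4} X_i X_i^*$. The usual Hermitian product in $L^2(\mathbb{R}^3, \mathbb{C}^4)$ is denoted
\[ (\varphi, \psi)_{L^2} = \int_{\mathbb{R}^3} \big(\varphi(x), \psi(x)\big)\, d^3x. \tag{1.2} \]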
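In the Dirac equation, $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\beta$ are $4 \times 4$ complex matrices, whose standard form (in $2 \times 2$ blocks) is
\[ \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \qquad \alpha_k = \begin{pmatrix} 0 & \sigma_k \\ \sigma_k & 0 \end{pmatrix} \quad (k = 1, 2, 3), \]
with
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]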
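One can easily check the following relations:
\[ \alpha_k = \alpha_k^*, \quad \beta = \beta^*, \quad \alpha_k \alpha_\ell + \alpha_\ell \alpha_k = 2\delta_{k\ell}, \quad \alpha_k \beta + \beta \alpha_k = 0. \tag{1.3} \]
These algebraic conditions are here to ensure that $H_0$ is a symmetric operator, such that
\[ H_0^2 = -\Delta + 1. \tag{1.4} \]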
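The relations (1.3), and the matrix identity behind (1.4), can be checked mechanically. The following is a minimal pure-Python sketch; the helper names (`matmul`, `block`, `symbol`, and so on) are illustrative choices and do not come from the paper. It builds the standard $\alpha_k$, $\beta$ from the Pauli matrices, tests (1.3), and forms the matrix $\sum_k \alpha_k \xi_k + \beta$, whose square should be $(1 + |\xi|^2)$ times the identity.

```python
# Pure-Python check of the relations (1.3) for the standard Dirac matrices.
# All helper names are illustrative, not taken from the paper.

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
BETA = block(I2, Z2, Z2, scale(-1, I2))
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
Z4 = [[0] * 4 for _ in range(4)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def check_relations():
    """True when every identity listed in (1.3) holds."""
    ok = close(dagger(BETA), BETA)
    for k in range(3):
        ok = ok and close(dagger(ALPHA[k]), ALPHA[k])
        ok = ok and close(anticommutator(ALPHA[k], BETA), Z4)
        for l in range(3):
            expected = scale(2 if k == l else 0, I4)
            ok = ok and close(anticommutator(ALPHA[k], ALPHA[l]), expected)
    return ok

def symbol(xi):
    """The matrix sum_k alpha_k xi_k + beta (H_0 in Fourier variables)."""
    h = BETA
    for k in range(3):
        h = add(h, scale(xi[k], ALPHA[k]))
    return h
```

Squaring `symbol(xi)` uses both the anticommutation relations and $\beta^2 = 1$, which holds for the standard form of $\beta$ given above.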
Let us now consider an electron near a nucleus of atomic number $Z$. We assume that the nucleus is point-like and is situated at the origin of coordinates, and we take the system of units of Eq. (1.1). The Hamiltonian of the electron, in the coulombic field created by the nucleus, is then
\[ H_Z = H_0 - \alpha Z V(x), \quad \text{with} \quad V(x) = \frac{1}{|x|}. \tag{1.5} \]
Here, $\alpha$ is a positive dimensionless constant. Its physical value is $\alpha \approx \frac{1}{137}$. Lemma 1.1 lists some properties of $H_0$ and $V(x)$, that will be useful in this paper.
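Lemma 1.1. (P1) $H_0$ is a self-adjoint operator on $L^2(\mathbb{R}^3, \mathbb{C}^4)$, with domain $D(H_0) = H^1(\mathbb{R}^3, \mathbb{C}^4)$. Its spectrum is $(-\infty, -1] \cup [1, +\infty)$. There are two orthogonal projectors on $L^2(\mathbb{R}^3, \mathbb{C}^4)$, $\Lambda^+$ and $\Lambda^- = \mathbb{1}_{L^2} - \Lambda^+$, both with infinite rank, and such that
\[ \begin{cases} H_0 \Lambda^+ = \Lambda^+ H_0 = \sqrt{1 - \Delta}\, \Lambda^+ = \Lambda^+ \sqrt{1 - \Delta}, \\ H_0 \Lambda^- = \Lambda^- H_0 = -\sqrt{1 - \Delta}\, \Lambda^- = -\Lambda^- \sqrt{1 - \Delta}. \end{cases} \tag{1.6} \]
(P2) The coulombic potential $V(x) = \frac{1}{|x|}$ satisfies the following Hardy-type inequalities:
\[ \big(\varphi, (\mu * V)\varphi\big)_{L^2} \le \frac{1}{2}\Big(\frac{\pi}{2} + \frac{2}{\pi}\Big) \big(\varphi, |H_0| \varphi\big)_{L^2}, \tag{1.7} \]
for all $\varphi \in \Lambda^+(H^{1/2}) \cup \Lambda^-(H^{1/2})$ and for all probability measures $\mu$ on $\mathbb{R}^3$. Moreover,
\[ \big(\varphi, (\mu * V)\varphi\big)_{L^2} \le \frac{\pi}{2} \big(\varphi, |H_0| \varphi\big)_{L^2}, \quad \forall \varphi \in H^{1/2}, \tag{1.8} \]
\[ \| (\mu * V)\varphi \|_{L^2} \le 2 \| \nabla \varphi \|_{L^2}, \quad \forall \varphi \in H^1. \tag{1.9} \]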
In the particular case where $\mu$ is equal to the Dirac mass at the origin $\delta_0$, an inequality more precise than (1.7) was proved in [8, 47, 48]. This inequality reads as follows:
\[ \Big(\varphi, \Big(H_0 - \frac{\alpha Z}{|x|}\Big)\varphi\Big) \ge \big((1 - \alpha Z)\varphi, \varphi\big), \]
for all $Z \le Z_c := \frac{2}{(\pi/2 + 2/\pi)\alpha}$, for all $\varphi \in \Lambda^+(H^{1/2}(\mathbb{R}^3, \mathbb{C}^4))$. The technique used in [8, 47, 48] is based on ideas introduced by Evans, Perry and Siedentop in [18]. We refer to [27, 30] for inequality (1.8) in the case $\mu = \delta_0$. Thaller's book [46] gathers many results on the Dirac operator, including (P1) and the standard Hardy inequality (1.9) for $\mu = \delta_0$, with references. The extension of (1.7), (1.8) and (1.9) from $\mu = \delta_0$ to a general probability measure $\mu$ is immediate, since the projectors $\Lambda^\pm$, the gradient $\nabla$ and the free Dirac operator $H_0$ commute with translations. For completeness, we shall give the explicit form of the projectors $\Lambda^+$, $\Lambda^-$ in Sect. 3.

For $\varphi \in L^2(\mathbb{R}^3, \mathbb{C}^4)$, let us denote $\varphi^+ = \Lambda^+ \varphi$, $\varphi^- = \Lambda^- \varphi$. Let $E = H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$, $E^+ = \Lambda^+ E$, $E^- = \Lambda^- E$. $E$ is a Hilbert space with Hermitian product
\[ (\varphi, \psi)_E = \big(\varphi, \sqrt{1 - \Delta}\, \psi\big)_{L^2} = (\varphi^+, \psi^+)_E + (\varphi^-, \psi^-)_E. \tag{1.10} \]
Since $H_0$ is unbounded from below, it is difficult to define a ground state for relativistic atoms and molecules. In order to study the stability of relativistic molecules from a mathematical viewpoint, various simplified models have been introduced. In the simplest one, $H_0$ is replaced by the positive definite Hamiltonian $\sqrt{1 - \Delta}$. See for instance [27, 14, 37, 34], and the Selecta of E.H. Lieb [33] for a more detailed list of references on this topic. A more realistic model due to Brown and Ravenhall [6] uses projection operators: $\Lambda^+(H_0 + V)\Lambda^+$ replaces $H_0 + V$, i.e., the one-particle Hilbert space is $\Lambda^+ L^2$ instead of $L^2$. The above projected operator and its multi-particle counterpart were widely discussed by J. Sucher in [43, 44]. In [26], Hardekopf and Sucher investigated numerically the operator $B := \Lambda^+(H_0 - \alpha Z |x|^{-1})\Lambda^+$, and they claimed that its ground state energy vanishes when $Z = Z_c := \frac{2}{(\pi/2 + 2/\pi)\alpha}$. The first mathematical study on the semiboundedness of $B$ appeared in [18]. In [18], Evans, Perry and Siedentop proved that on the space of rapidly decaying smooth spinors, $B$ is bounded from below by $\alpha Z(1/\pi - \pi/4)$ if the charge $Z$ does not exceed $Z_c$ and unbounded from below if $Z$ is larger than $Z_c$. As already mentioned, several authors [8, 47, 48] improved this result later by showing that $B$ is positive and bounded from below by $(1 - \alpha Z)$ whenever $Z \le Z_c$. For results concerning multi-particle versions of $B$, see for instance [35].

The Dirac–Fock (DF) functional was first introduced by Swirles [45] as an approximation for the energy of a system of $N$ electrons in an atom of large nuclear charge $Z$. In such atoms, the inner-shell electrons have relativistic energies, and the standard Hartree–Fock (HF) approximation, based on the nonrelativistic Schrödinger equation, is no longer valid. The Euler–Lagrange equations of the DF energy functional can be solved numerically. The solutions represent stationary states of the electrons in the atom.
The numerical results are in very good agreement with experimental data (see e.g. [32, 23, 15, 38, 31, 22]). In [43, 44, 41, 24, 10], the relationship between Dirac–Fock and quantum electrodynamics is studied.
In the Dirac–Fock model, the $N$ electrons are represented by a Slater determinant of $N$ functions $\varphi_k \in E$, subjected to the normalization constraints
\[ (\varphi_\ell, \varphi_k)_{L^2} = \delta_{k\ell}. \tag{1.11} \]
We shall denote $\Phi = (\varphi_1, \dots, \varphi_N)$, and the constraints above will be written in the shorter form $\operatorname{Gram}\Phi = \mathbb{1}$, with
\[ \big[\operatorname{Gram}\Phi\big]_{k\ell} := (\varphi_\ell, \varphi_k)_{L^2}. \tag{1.12} \]
We consider a molecule, with:
• nuclear charge density $Z\mu$, where $Z > 0$ is the total nuclear charge and $\mu$ is a probability measure defined on $\mathbb{R}^3$. In the particular case of $m$ point-like nuclei, each one having atomic number $Z_i$ at a fixed location $x_i$, $Z\mu = \sum_{i=1}^{m} Z_i \delta_{x_i}$ and $Z = \sum_{i=1}^{m} Z_i$.
• $N$ relativistic electrons.
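We assume that the interaction between these particles is purely electrostatic. The DF energy of the $N$ electrons in the molecule is
\[ \mathcal{E}(\Phi) = \sum_{\ell=1}^{N} (\varphi_\ell, H_0 \varphi_\ell)_{L^2} - \alpha Z \sum_{\ell=1}^{N} \big(\varphi_\ell, (\mu * V)\varphi_\ell\big)_{L^2} + \frac{\alpha}{2} \iint_{\mathbb{R}^3 \times \mathbb{R}^3} V(x - y) \Big[\rho(x)\rho(y) - \operatorname{tr}\big(R(x, y) R(y, x)\big)\Big]\, d^3x\, d^3y. \tag{1.13} \]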
Here, $\rho$ is a scalar and $R$ is a $4 \times 4$ complex matrix, given by
\[ \rho(x) = \sum_{\ell=1}^{N} \big(\varphi_\ell(x), \varphi_\ell(x)\big), \qquad R(x, y) = \sum_{\ell=1}^{N} \varphi_\ell(x) \otimes \varphi_\ell^*(y). \tag{1.14} \]
$\rho$ is the electronic density, $R$ is the exchange matrix which comes from the antisymmetry of the Slater determinant. Note that $R(y, x) = R(x, y)^*$, so that $\operatorname{tr} R(x, y) R(y, x) = \sum_{i,j} |R(x, y)_{ij}|^2$.
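The identity $\operatorname{tr} R(x, y) R(y, x) = \sum_{i,j} |R(x, y)_{ij}|^2$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the "orbital values" at the two points $x$ and $y$ are random complex 4-vectors, and every function name is hypothetical. It also compares the exchange term with the direct term $\rho(x)\rho(y)$; by the Cauchy–Schwarz inequality the former never exceeds the latter, a side remark that is not made in the text above.

```python
import random

def density(phis):
    """rho at one point: sum over orbitals of the squared norm in C^4."""
    return sum(abs(c) ** 2 for phi in phis for c in phi)

def exchange(phis_x, phis_y):
    """R(x, y) = sum_l phi_l(x) (tensor) phi_l(y)^*, as a 4x4 matrix."""
    return [[sum(px[i] * py[j].conjugate() for px, py in zip(phis_x, phis_y))
             for j in range(4)] for i in range(4)]

def trace_of_product(a, b):
    """tr(a b) for 4x4 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(4) for j in range(4))

def frobenius_sq(a):
    """Sum of squared moduli of all entries."""
    return sum(abs(c) ** 2 for row in a for c in row)

def random_spinors(n, rng):
    """n random vectors of C^4, standing in for orbital values at a point."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
            for _ in range(n)]

rng = random.Random(0)
N = 3  # number of orbitals in this toy example
phis_x = random_spinors(N, rng)
phis_y = random_spinors(N, rng)

r_xy = exchange(phis_x, phis_y)
r_yx = exchange(phis_y, phis_x)
exchange_term = trace_of_product(r_xy, r_yx)          # tr R(x,y) R(y,x)
direct_term = density(phis_x) * density(phis_y)       # rho(x) rho(y)
```

In this toy setting `exchange_term` is real up to rounding, agrees with `frobenius_sq(r_xy)`, and is bounded by `direct_term`.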
The main difference with the more standard HF functional, is that the kinetic energy term $(\varphi_k, -\Delta \varphi_k)_{L^2}$ in HF is replaced by $(\varphi_k, H_0 \varphi_k)_{L^2}$ in DF. This changes completely the nature of the functional, which becomes strongly indefinite: it is not bounded below, and any of its critical points has an infinite Morse index. The DF functional is invariant under the action of the group $U(N)$:
\[ u \cdot \Phi = \Big(\sum_{\ell} u_{1\ell}\, \varphi_\ell, \dots, \sum_{\ell} u_{N\ell}\, \varphi_\ell\Big), \quad u \in U(N),\ \Phi \in E^N. \tag{1.15} \]
We denote
\[ \Sigma = \big\{ \Phi \in E^N \;/\; \operatorname{Gram}\Phi = \mathbb{1} \big\}. \tag{1.16} \]
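Using inequality (1.8), one can easily prove that the DF functional $\mathcal{E}$ is smooth on $E^N$. A critical point of $\mathcal{E}|_\Sigma$ is a weak solution of the following Euler–Lagrange equations:
\[ \overline{H}_\Phi \varphi_k = \sum_{\ell=1}^{N} \lambda_{k\ell}\, \varphi_\ell, \quad k = 1, \dots, N. \tag{1.17} \]
Here,
\[ \overline{H}_\Phi \psi = H_0 \psi - \alpha Z (\mu * V)\psi + \alpha (\rho * V)\psi - \alpha \int_{\mathbb{R}^3} R(x, y)\psi(y) V(x - y)\, dy. \tag{1.18} \]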
Since $\overline{H}_\Phi$ is self-adjoint from $H^{1/2}$ to its dual $H^{-1/2}$, $\Lambda = (\lambda_{k\ell})$ is a self-adjoint $(N \times N)$ complex matrix. It is the matrix of Lagrange multipliers associated to the constraints $(\varphi_\ell, \varphi_k)_{L^2} = \delta_{k\ell}$. For $\Phi \in \Sigma$ a critical point whose matrix of multipliers is $\Lambda$, and $u \in U(N)$, the matrix of multipliers of the critical point $\tilde{\Phi} = u \cdot \Phi$ is $\tilde{\Lambda} = u \Lambda u^*$. So any $U(N)$-orbit of critical points of $\mathcal{E}|_\Sigma$ contains a weak solution of the following system of nonlinear eigenvalue problems, called the Dirac–Fock equations:
\[ \overline{H}_\Phi \varphi_k = \varepsilon_k \varphi_k, \quad k = 1, \dots, N. \tag{1.19} \]
Physically, $\overline{H}_\Phi$ represents the Hamiltonian of an electron in the mean field due to the nuclei and the electrons. The eigenvalues $\varepsilon_1, \dots, \varepsilon_N$ are the energies of each electron in this mean field. In the HF model, the Euler–Lagrange equations have a form similar to (1.19), with $-\Delta$ instead of $H_0$ in the expression of $\overline{H}_\Phi$. The physically interesting states correspond to $\varepsilon_1 \le \dots \le \varepsilon_N < 0$, and the ground state minimizes $\mathcal{E}_{HF}$ on $\Sigma$, which implies that $\varepsilon_1, \dots, \varepsilon_N$ are the $N$ first eigenvalues of $\overline{H}_\Phi$ (see [36]). In the DF model, the physically interesting states correspond to $0 < \varepsilon_k < 1$: a positive energy smaller than the rest mass of the electron. The definition of a ground state is less clear: the DF functional has no minimum on $\Sigma$. This fact is at the origin of serious difficulties in the numerical implementation, as well as the interpretation, of the DF equations (see [10] and references therein). One way to deal with this problem, is to restrict the energy functional to the space $(\Lambda^+ E)^N$, $\Lambda^+$ being, as defined above, the projector on the space of positive states of the free Dirac operator [43, 44]: this corresponds to a Hartree–Fock reduction of the already mentioned Brown–Ravenhall model. The associated Euler–Lagrange equations are the “projected” Dirac–Fock equations
\[ \Lambda^+ \overline{H}_\Phi \Lambda^+ \varphi_k = \varepsilon_k \varphi_k. \tag{1.20} \]
Note that, in the case $\varepsilon_k > 0$, (1.19) can be written formally as
\[ \Lambda^+_\Phi \overline{H}_\Phi \Lambda^+_\Phi \varphi_k = \varepsilon_k \varphi_k. \tag{1.21} \]
Here, $\Lambda^+_\Phi$ is the projector on the positive space associated to $\overline{H}_\Phi$. Numerical computations using (1.19) rather than (1.20), give results that are in very good agreement with experimental data (see e.g. [23, 38]). This is not very surprising: in the presence of strong electric fields, the projector $\Lambda^+_\Phi$ seems physically more adequate than the free-energy projector $\Lambda^+$ (see [28]). In [41] Mittleman derived the DF equations with “self-consistent projector” (1.21), from a variational procedure applied to a QED Hamiltonian in Fock space, followed by the standard Hartree–Fock approximation.
Important existence results are known on the HF equations. Lieb and Simon [36] proved the existence of a ground state of $\mathcal{E}_{HF}$ on $\Sigma$, provided $N < Z + 1$, where $Z$ is the total nuclear charge. P.-L. Lions [39] proved the existence of infinitely many excited states if $N \le Z$. Using inequality (1.7), one can easily extend the results of [36, 39] to the projected equations (1.20), assuming that α max(Z, N )
0.
The third difficulty with DF, is that all critical points have an infinite Morse index. This kind of problem is often encountered in the theory of Hamiltonian systems and in certain elliptic PDEs. One way of dealing with it is to use a concavity property of the functional to get rid of the “negative directions”: see e.g. [1, 7, 9]. We shall use this method. We get a reduced functional $I_{\nu,p}$. A min-max argument gives us Palais–Smale sequences $\Phi_{n,\nu,p}$ for $I_{\nu,p}$ with finite “Morse index”, thanks to [19]. Adapting the arguments of [39], we prove that the $\varepsilon_k$'s of such sequences are smaller than 1. Then we pass to the limit $(\nu, n, p) \to (0, \infty, \infty)$, and get the desired solutions of DF, with $0 < \varepsilon_k < 1$.
Our concavity argument works only if $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. In the last 20 years, very powerful methods have been developed to deal with strongly indefinite functionals, that do not present any concavity property [42, 13, 20, 29]. This suggests that it might be possible to weaken the assumptions on $N$ in Theorem 1.2.
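The numerical restrictions $Z \le 124$ and $N \le 41$ quoted in the abstract can be recovered from the two smallness conditions $\alpha \max(Z, N) < \frac{2}{\pi/2 + 2/\pi}$ and $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. The short sketch below does the arithmetic, assuming the approximate value $\alpha = 1/137$ given in the Introduction; the function names are illustrative only.

```python
import math

ALPHA = 1 / 137  # approximate physical value quoted in the Introduction

def hardy_constant():
    """The constant (1/2)(pi/2 + 2/pi) appearing in inequality (1.7)."""
    return 0.5 * (math.pi / 2 + 2 / math.pi)

def critical_charge(alpha=ALPHA):
    """Z_c = 2 / ((pi/2 + 2/pi) * alpha)."""
    return 1 / (hardy_constant() * alpha)

def largest_integer_below(x):
    """Largest integer n with n < x (strict inequality)."""
    return math.ceil(x) - 1

def max_total_charge(alpha=ALPHA):
    """Largest integer Z with alpha * Z < 2 / (pi/2 + 2/pi)."""
    return largest_integer_below(critical_charge(alpha))

def max_electron_number(alpha=ALPHA):
    """Largest integer N with alpha * (3N - 1) < 2 / (pi/2 + 2/pi)."""
    return largest_integer_below((critical_charge(alpha) + 1) / 3)
```

With these definitions `critical_charge()` is about 124.1, so the first condition allows $Z \le 124$ and the concavity condition allows $N \le 41$.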
2. Sketch of the Proof of Theorem 1.2

As announced in the Introduction, we replace $V(x) = \frac{1}{|x|}$, in the expression of $\mathcal{E}$ (1.13), by the regularized potential
\[ V_\nu(x) = \frac{1}{(2\pi\nu)^{3/2}}\, e^{-|x|^2/2\nu} * V(x), \quad \nu > 0. \tag{2.1} \]
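For intuition about the regularization (2.1): the convolution of a normalized Gaussian with $1/|x|$ is the electrostatic potential of a Gaussian charge cloud, for which the standard closed form is $\operatorname{erf}\big(|x|/\sqrt{2\nu}\big)/|x|$. This closed form is not stated in the paper and is used here as an outside fact. The sketch below (hypothetical function names) evaluates it and cross-checks it against a direct radial integration based on the shell theorem.

```python
import math

def v_nu(r, nu):
    """Closed form of the regularized potential at radius r = |x|, for nu > 0."""
    if r == 0:
        return math.sqrt(2 / (math.pi * nu))  # limit of erf(r / sqrt(2 nu)) / r
    return math.erf(r / math.sqrt(2 * nu)) / r

def gaussian_density(s, nu):
    """The normalized Gaussian of (2.1), evaluated at radius s."""
    return math.exp(-s * s / (2 * nu)) / (2 * math.pi * nu) ** 1.5

def v_nu_by_shells(r, nu, steps=20000):
    """Midpoint-rule evaluation of the same potential.

    A thin shell of radius s carrying charge q contributes q / max(r, s)
    to the potential at radius r (shell theorem).
    """
    cutoff = r + 12 * math.sqrt(nu)  # the Gaussian tail beyond this is negligible
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        charge = 4 * math.pi * s * s * gaussian_density(s, nu) * h
        total += charge / max(r, s)
    return total
```

In particular $V_\nu$ is bounded, with $V_\nu(0) = \sqrt{2/(\pi\nu)}$, and $V_\nu \le V$ pointwise, while $V_\nu$ approaches $V$ away from the origin as $\nu \to 0$.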
This replacement is made for the attractive potential of the nucleus, as well as for the electronic repulsion and exchange terms. The regularized DF functional is denoted $\mathcal{E}_\nu$, and the associated one-particle Hamiltonian (1.18) is denoted $\overline{H}^\nu_\Phi$. The Gaussian $\frac{1}{(2\pi\nu)^{3/2}} e^{-|x|^2/2\nu}$ is normalized in $L^1$, so that $V_\nu$ satisfies the same inequalities (1.7), (1.8), (1.9) as $V$. We also replace the constraint “$\Phi \in \Sigma$” by a penalization term $\pi_p$. The penalization parameter $p$ is a positive integer. The penalized functional
\[ F_{\nu,p} = \mathcal{E}_\nu - \pi_p \tag{2.2} \]
is defined in the domain
\[ A = \big\{ \Phi \in E^N \;/\; 0 < \operatorname{Gram}\Phi < \mathbb{1} \big\}, \tag{2.3} \]
where $\operatorname{Gram}\Phi$ is the $N \times N$ matrix $\big((\varphi_i, \varphi_j)_{L^2}\big)_{1 \le i,j \le N}$. The penalization term has the form
\[ \pi_p(\Phi) = \operatorname{tr}\Big[ (\operatorname{Gram}\Phi)^p \big(\mathbb{1} - \operatorname{Gram}\Phi\big)^{-1} \Big]. \tag{2.4} \]
Note that $F_{\nu,p}$ is invariant under the $U(N)$ action (1.15). It is easy to see that $F_{\nu,p}$ is well-defined and smooth on $A$. We are going to construct approximate critical points of $F_{\nu,p}$. As $\nu \to 0$ and $p \to \infty$, these points will converge to critical points of $\mathcal{E}|_\Sigma$. Any $U(N)$ orbit in $A$ contains a point $\Phi$ such that $\operatorname{Gram}\Phi$ is diagonal, with eigenvalues in nondecreasing order:
\[ \operatorname{Gram}\Phi = \operatorname{Diag}(\sigma_1, \dots, \sigma_N), \quad 0 < \sigma_1 \le \dots \le \sigma_N < 1. \tag{2.5} \]
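We call $O$ the set of points $\Phi \in A$, satisfying (2.5). If $\Phi \in O$, then
\[ \frac{\partial F_{\nu,p}}{\partial \varphi_k}(\Phi) = \overline{H}^\nu_\Phi \varphi_k - \varepsilon_k \varphi_k, \tag{2.6} \]
with
\[ \varepsilon_k = e_p(\sigma_k), \qquad e_p(x) = \frac{d}{dx}\Big(\frac{x^p}{1 - x}\Big) = \frac{p x^{p-1} - (p - 1) x^p}{(1 - x)^2}. \tag{2.7} \]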
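A small numerical illustration of (2.4) and (2.7) may help. Writing $\theta_p(x) = x^p/(1 - x)$, the penalization is $\pi_p(\Phi) = \sum_k \theta_p(\sigma_k)$ in terms of the eigenvalues $\sigma_k$ of $\operatorname{Gram}\Phi$, and $e_p = \theta_p'$. The sketch below checks both facts on a $2 \times 2$ real symmetric example; the sample matrix `GRAM`, the exponent `P`, and all function names are assumptions made for illustration.

```python
import math

def theta(x, p):
    """theta_p(x) = x^p / (1 - x)."""
    return x ** p / (1 - x)

def e_p(x, p):
    """Closed form of d/dx theta_p(x), as in (2.7)."""
    return (p * x ** (p - 1) - (p - 1) * x ** p) / (1 - x) ** 2

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def penalization(gram, p):
    """pi_p = tr[G^p (1 - G)^{-1}] for a real symmetric 2x2 matrix, 0 < G < 1."""
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(p):
        power = matmul2(power, gram)
    m = [[1 - gram[0][0], -gram[0][1]], [-gram[1][0], 1 - gram[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inverse = [[m[1][1] / det, -m[0][1] / det],
               [-m[1][0] / det, m[0][0] / det]]
    product = matmul2(power, inverse)
    return product[0][0] + product[1][1]

def eigenvalues2(gram):
    """Eigenvalues (ascending) of a real symmetric 2x2 matrix."""
    mean = (gram[0][0] + gram[1][1]) / 2
    radius = math.hypot((gram[0][0] - gram[1][1]) / 2, gram[0][1])
    return mean - radius, mean + radius

def central_difference(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2 * h)

GRAM = [[0.5, 0.2], [0.2, 0.4]]   # sample Gram matrix with spectrum in (0, 1)
P = 3                             # sample penalization exponent
SIGMAS = eigenvalues2(GRAM)
EPSILONS = [e_p(s, P) for s in SIGMAS]  # the multipliers epsilon_k = e_p(sigma_k)
```

Because $\pi_p$ depends only on the spectrum of the Gram matrix, its invariance under the $U(N)$ action (1.15) is visible directly from this representation.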
The function $e_p$ is positive and increasing on $(0, 1)$, so that $0 < \varepsilon_1 \le \dots \le \varepsilon_N$. This is one of the advantages of the penalized functional $F_{\nu,p}$: its critical points in $O$ are solutions of a nonlinear eigenvalue problem, with positive eigenvalues.

In the proof of Theorem 1.2, we need to control not only the critical points of $F_{\nu,p}$, but also its Palais–Smale sequences. Of course, we just need to study Palais–Smale sequences in $O$, thanks to the $U(N)$ invariance. Unfortunately, the Palais–Smale condition does not hold for $F_{\nu,p}$, exactly as in the case of the HF functional. But it can be replaced by the following lemma, which is related to the spectral properties of the Dirac operator with a potential. Its proof is based on inequality (1.7).

Lemma 2.1 (Convergence of approximate solutions). Assume that $\alpha \max(Z, N) < \frac{2}{\pi/2 + 2/\pi}$.
(a) Let $(\nu_n)$ be a sequence of real numbers in $(0, 1)$, $(p_n)$ a sequence of positive integers, and $(\Phi_n)$ a sequence in $O$, i.e. such that $\operatorname{Gram}\Phi_n = \operatorname{Diag}(\sigma_{1,n}, \dots, \sigma_{N,n})$, $0 < \sigma_{1,n} \le \dots \le \sigma_{N,n} < 1$. We denote $\varepsilon_{k,n} = e_{p_n}(\sigma_{k,n})$, with $e_{p_n}(x) = \frac{d}{dx}\frac{x^{p_n}}{1 - x}$. We assume that
\[ F'_{\nu_n,p_n}(\Phi_n) \underset{n \to \infty}{\longrightarrow} 0 \tag{2.8} \]
for the strong topology of $\big[H^{-1/2}(\mathbb{R}^3, \mathbb{C}^4)\big]^N = (E^N)^*$. We also assume that
\[ \liminf_{n \to \infty} \sigma_{1,n} > 0. \tag{2.9} \]
Then,
\[ \liminf_{n \to \infty} \varepsilon_{1,n} \ge h_0, \tag{2.10} \]
where $h_0 \in (0, 1)$ is a constant which depends only on $\alpha Z$, $\alpha N$.
(b) If, moreover,
\[ \limsup_{n \to \infty} \varepsilon_{N,n} < 1, \tag{2.11} \]
then, after extraction of a subsequence, the functions $\varphi_{k,n}$ converge to $N$ functions ϕk ∈ E ∩ T 1≤q 0, independent of ν, p, such that, for any $\Phi \in A$ and $\Psi^- \in (E^-)^N$,
\[ F''_{\nu,p}(\Phi) \cdot \big[\Psi^-, \Psi^-\big] \le -s \sum_{k=1}^{N} \|\psi_k^-\|_E^2. \tag{2.15} \]
Lemma 2.2 will be proved in Sect. 4, where an explicit formula for $F''_{\nu,p}$ will be given. Now, let
\[ A^+ = A \cap (E^+)^N = \big\{ \Phi^+ \in (E^+)^N \;/\; 0 < \operatorname{Gram}(\Phi^+) < \mathbb{1} \big\}. \tag{2.16} \]
For $\Phi^+ \in A^+$, let
\[ \Gamma(\Phi^+) = \big\{ \chi^- \in (E^-)^N \;/\; \operatorname{Gram}(\Phi^+) + \operatorname{Gram}(\chi^-) < \mathbb{1} \big\} = \big\{ \chi^- \in (E^-)^N \;/\; \Phi^+ + \chi^- \in A \big\}. \tag{2.17} \]
One can easily see that $\Gamma(\Phi^+)$ is an open convex subset of $(E^-)^N$, and that $F_{\nu,p}(\Phi^+ + \chi^-)$ converges to $-\infty$ as $\chi^-$ approaches the boundary of $\Gamma(\Phi^+)$, for $\Phi^+$ fixed. So, Lemma 2.2 has the following consequence:

Corollary 2.3. Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. Then, for any $\Phi^+ \in A^+$, the functional
\[ \chi^- \in \Gamma(\Phi^+) \mapsto F_{\nu,p}(\Phi^+ + \chi^-) \]
has a unique maximizer $h_{\nu,p}(\Phi^+) \in \Gamma(\Phi^+)$. The mapping $h_{\nu,p} : A^+ \to (E^-)^N$ is smooth for the $(H^{1/2})^N$ norm, and equivariant under the $U(N)$ action (1.15).
We denote
\[ I_{\nu,p}(\Phi^+) = F_{\nu,p}\big(\Phi^+ + h_{\nu,p}(\Phi^+)\big), \quad \Phi^+ \in A^+. \tag{2.18} \]
$I_{\nu,p}$ is well-defined and smooth on $A^+$. Since $h_{\nu,p}$ is $U(N)$ equivariant, $I_{\nu,p}$ is invariant, and any $U(N)$ orbit in $A^+$ contains a point $\Phi^+$ such that $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$ satisfies (2.5). By definition of $h_{\nu,p}$, for all $\Psi^- \in (E^-)^N$, $F'_{\nu,p}\big(\Phi^+ + h_{\nu,p}(\Phi^+)\big) \cdot \Psi^- = 0$. As a consequence, if $\Phi^+$ is a critical point of $I_{\nu,p}$, then $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$ is a critical point of $F_{\nu,p}$. So we just have to look for critical points of $I_{\nu,p}$. This is much more comfortable, because this reduced functional is not strongly indefinite.

We now give a relationship between Morse-type information on a Palais–Smale sequence $\Phi_n^+$ for $I_{\nu,p}$, and the estimate (2.11) on the $\varepsilon_k$'s. Unfortunately, we do not have a precise inequality like (2.14).

Lemma 2.4 (The Morse index controls the $\varepsilon_k$'s). Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$, $N < Z + 1$. Let $\nu \in (0, 1)$, $p \ge 2$, $M > 0$, and $(\Phi_n^+)$ a sequence in $A^+$. Denoting $\Phi_n = \Phi_n^+ + h_{\nu,p}(\Phi_n^+)$, we assume that $\Phi_n \in O$, i.e. $\operatorname{Gram}\Phi_n = \operatorname{Diag}(\sigma_{1,n}, \dots, \sigma_{N,n})$, with $0 < \sigma_{1,n} \le \dots \le \sigma_{N,n} < 1$. Suppose that
\[ I_{\nu,p}(\Phi_n^+) \le M, \qquad I'_{\nu,p}(\Phi_n^+) \to 0, \qquad \liminf \sigma_{1,n} > 0, \tag{2.19} \]
and that the quadratic form on $(E^+)^N$:
\[ Q_n(\Psi^+) = I''_{\nu,p}(\Phi_n^+)\big[\Psi^+, \Psi^+\big] + \delta_n \sum_{k=1}^{N} \|\psi_k^+\|_E^2 \tag{2.20} \]
has a negative space of dimension at most $m$, for a sequence $\delta_n \to 0$. Then, there is a constant $b_m \in (0, 1)$, independent of $\nu$, $p$, $M$, $\Phi_n^+$, $\delta_n$, such that
\[ \limsup_{n \to \infty} \varepsilon_{N,n} \le b_m, \quad \text{with } \varepsilon_{N,n} = e_p(\sigma_{N,n}). \tag{2.21} \]
The last step in the proof of Theorem 1.2 is to find Palais–Smale sequences for $I_{\nu,p}$, with Morse-type information. For this purpose, we look for positive min-max levels of $I_{\nu,p}$ in $A^+$. Note that $A^+$ is an open subset of $E^N$, whose boundary is $\partial A^+ = G_1 \cup G_2$, with
\[ \begin{aligned} G_1 &= \big\{ \Phi^+ \in (E^+)^N \;/\; \operatorname{Gram}\Phi^+ \le \mathbb{1},\ \det \operatorname{Gram}\Phi^+ = 0 \big\}, \\ G_2 &= \big\{ \Phi^+ \in (E^+)^N \;/\; \operatorname{Gram}\Phi^+ \le \mathbb{1},\ \det(\mathbb{1} - \operatorname{Gram}\Phi^+) = 0 \big\}. \end{aligned} \]
If $I_{\nu,p}$ were negative for $\Phi^+$ close to $\partial A^+$, the existence of positive min-max levels for $I_{\nu,p}$ would be a direct consequence of the topology of $(A^+, \partial A^+)$. We have $I_{\nu,p}(\Phi^+) \to -\infty$ as $\operatorname{dist}_{L^2}(\Phi^+, G_2) \to 0$, with $\|\Phi^+\|_{E^N}$ bounded. But $I_{\nu,p}(\Phi^+)$ may remain positive when $\Phi^+$ is close to $G_1$. Following [12, 13, 40], we solve this difficulty by studying the gradient vector field of $I_{\nu,p}$ near $G_1$. We prove that this field “points inward”, in the following sense:
Lemma 2.5 (A pseudo-gradient pointing inward near G1 ). Assume that α max(Z, 3N − 1)
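0. Then there are $d(\nu), e(\nu) > 0$ such that, if $\Phi^+ \in A^+$ satisfies
\[ \det(\operatorname{Gram}\Phi^+) \in [d(\nu), 2d(\nu)], \]
then one can find a vector $X \in (E^+)^N$, with
\[ \begin{cases} I'_{\nu,p}(\Phi^+) \cdot X \ge e(\nu) \|X\|_{(E)^N} \quad (\forall p \ge 2), \\ \Delta'(\Phi^+) \cdot X > 0, \end{cases} \tag{2.22} \]
where $\Delta(\Phi^+) = \det(\operatorname{Gram}\Phi^+)$.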
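Note that in Lemma 2.5 there is a constant, $d(\nu)$, which depends on $\nu$. We have been unable to make $d$ independent of $\nu$. Now, let $\alpha_\nu \in C^\infty\big((d(\nu), 1), \mathbb{R}\big)$ be such that
\[ \begin{cases} \alpha_\nu(x) = 0, & \forall x \ge 2d(\nu), \\ \alpha_\nu'(x) > 0, & \forall x < 2d(\nu), \\ \alpha_\nu(x) \to -\infty & \text{as } x \to d(\nu). \end{cases} \tag{2.23} \]
Let $\beta \in C^\infty(\mathbb{R}, \mathbb{R})$ be such that
\[ \begin{cases} \beta \equiv -1 & \text{on } (-\infty, -1), \\ \beta(t) = t, & \forall t \ge 0, \\ \beta(t) \le 0, & \forall t \le 0. \end{cases} \tag{2.24} \]
We define a new functional $J_{\nu,p}$, by
\[ J_{\nu,p}(\Phi^+) = \begin{cases} \beta\big( I_{\nu,p}(\Phi^+) + \alpha_\nu \circ \Delta(\Phi^+) \big) & \text{if } \Phi^+ \in A^+ \text{ and } \Delta(\Phi^+) > d(\nu), \\ -1 & \text{otherwise.} \end{cases} \tag{2.25} \]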
It is easy to see that $J_{\nu,p}$ is smooth on $(E^+)^N$. If $\Phi^+$ is a critical point of $J_{\nu,p}$ with $J_{\nu,p}(\Phi^+) \ge 0$, then $\Phi^+$ is also a critical point of $I_{\nu,p} + \alpha_\nu \circ \Delta$, at the same level. From (2.22)–(2.23), this is only possible if $\Delta(\Phi^+) > 2d(\nu)$, hence $\Phi^+$ is a critical point of $I_{\nu,p}$ and $I_{\nu,p}$ coincides with $J_{\nu,p}$ in a neighborhood of $\Phi^+$. The same holds for Palais–Smale sequences. So we can look for positive min-max levels of $J_{\nu,p}$ instead of $I_{\nu,p}$. This is much more convenient, because $J_{\nu,p}$ is defined on $(E^+)^N$, with $J_{\nu,p} = -1$ on $\partial A^+$. $J_{\nu,p}$ is invariant under the $U(N)$ action (1.15). For $F$ a finite dimensional complex subspace of $E^+$, let
\[ D(F) = \big\{ \Phi^+ \in F^N \;/\; \operatorname{Gram}\Phi^+ \le \mathbb{1} \big\}. \tag{2.26} \]
We say that a homotopy $h \in C\big([0, 1] \times (E^+)^N, (E^+)^N\big)$ is “admissible” if
\[ \begin{cases} h(\lambda, u \cdot \Phi^+) = u \cdot h(\lambda, \Phi^+), & \forall u \in U(N),\ \forall (\lambda, \Phi^+) \in [0, 1] \times (E^+)^N, \\ h(\lambda, \Phi^+) = \Phi^+, & \forall \lambda \in [0, 1],\ \forall \Phi^+ \in \partial A^+. \end{cases} \tag{2.27} \]
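We define the class of sets
\[ \mathcal{Q}(F) = \big\{ Q \subset (E^+)^N \;/\; \text{there is } h, \text{ admissible, such that } h(0, \cdot) = \mathrm{Id}_{(E^+)^N},\ h(1, D(F)) = Q \big\}. \tag{2.28} \]
Finally, let
\[ c_{\nu,p}(F) = \inf_{Q \in \mathcal{Q}(F)} \max_{\Phi^+ \in Q} J_{\nu,p}(\Phi^+). \tag{2.29} \]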
We have Lemma 2.6 (The min–max levels). Assume that α max(Z, 3N − 1)
0 .
Moreover, $D(F_j) / U(N)$ has dimension $m_j = 2Nj + N^2$. It then follows from arguments by Fang and Ghoussoub [19, 21], that there is a Palais–Smale sequence at the level $c_{\nu,p}(F_j)$, with Morse-type information: Lemma 2.7 (Palais–Smale sequences with bounded Morse index). Assume that α max(Z, 3N − 1)
d(ν),
(2.31)
and a sequence $\delta_n > 0$, $\delta_n \to 0$, such that the quadratic form
\[ Q_n(\Psi^+) = I''_{\nu,p}(\Phi_n^+)\big[\Psi^+, \Psi^+\big] + \delta_n \sum_{k=1}^{N} \|\psi_k^+\|_E^2, \quad \Psi^+ \in (E^+)^N \tag{2.32} \]
has a negative space of dimension at most $m_j = 2Nj + N^2$.
Proof of Theorem 1.2. We now prove Theorem 1.2 as a direct consequence of Lemmas 2.1, 2.4, 2.6 and 2.7. Let $j \ge 0$, $p \ge \max(3, p(j))$ be two integers. Take $\nu = \frac{1}{p} \in (0, 1)$. There is a sequence $\Phi_n^+$ satisfying (2.31), (2.32) of Lemma 2.7 and such that $\Phi_n = \Phi_n^+ + h_{1/p,p}(\Phi_n^+)$ satisfies (2.5). Then (2.21) of Lemma 2.4 holds, with $m = m_j$. So, from (b.1) of Lemma 2.1, $\Phi_n$ converges, after extraction of a subsequence, to a critical point $\Phi^{j,p}$ of $F_{1/p,p}$, with
\[ F_{1/p,p}(\Phi^{j,p}) = c_{1/p,p}(F_j), \quad \operatorname{Gram}\Phi^{j,p} = \operatorname{Diag}(\sigma_1^p, \dots, \sigma_N^p), \quad 0 < \sigma_1^p \le \dots \le \sigma_N^p < 1, \quad h_0 \le e_p(\sigma_k^p) \le b_{m_j} < 1. \]
Since $e_p$ converges uniformly to 0 on any interval $[0, s]$, $s < 1$, we have $\lim_{p \to \infty} \sigma_1^p = 1$.
Applying (b.2) of Lemma 2.1 to the sequence $\Phi^{j,p}$, for $j$ fixed, we find, after extraction of a subsequence, a limit $\Phi^j$ which satisfies the requirements (1.22)–(1.25) of Theorem 1.2.

In Sect. 3, we study the properties of the first derivative of $F_{\nu,p}$, and we prove Lemma 2.1. In Sect. 4, we compute the Hessian of $F_{\nu,p}$, and we prove Lemmas 2.2 and 2.4. In Sect. 5, we study the min-max argument, and prove Lemmas 2.5 and 2.6.

3. The First Derivative of $F_{\nu,p}$

Our first task is to prove property (P1) of Lemma 1.1. For this purpose, we write $H_0$ in Fourier space:
\[ \widehat{H_0 \psi}(\xi) = \Big( \sum_{k=1}^{3} \alpha_k \xi_k + \beta \Big) \hat{\psi}(\xi) = \begin{pmatrix} 1 & \xi \cdot \sigma \\ \xi \cdot \sigma & -1 \end{pmatrix} \hat{\psi}(\xi). \tag{3.1} \]
We denote by $\widehat{H}_0(\xi)$ the matrix $\begin{pmatrix} 1 & \xi \cdot \sigma \\ \xi \cdot \sigma & -1 \end{pmatrix}$, with the standard notation $\xi \cdot \sigma = \sum_{k=1}^{3} \xi_k \sigma_k$.
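$\widehat{H}_0(\xi)$ is a self-adjoint $4 \times 4$ matrix, and we have: $\widehat{H}_0(\xi)^2 = (1 + |\xi|^2)\, \mathbb{1}_{\mathbb{C}^4}$. Taking
\[ \widehat{\Lambda^+}(\xi) = \frac{\widehat{H}_0(\xi) + \sqrt{1 + |\xi|^2}\, \mathbb{1}}{2\sqrt{1 + |\xi|^2}} = \frac{1}{2} \begin{pmatrix} \dfrac{1}{\sqrt{1 + |\xi|^2}} + 1 & \dfrac{\xi \cdot \sigma}{\sqrt{1 + |\xi|^2}} \\[2ex] \dfrac{\xi \cdot \sigma}{\sqrt{1 + |\xi|^2}} & -\dfrac{1}{\sqrt{1 + |\xi|^2}} + 1 \end{pmatrix} \tag{3.2} \]
and
\[ \widehat{\Lambda^-}(\xi) = \frac{-\widehat{H}_0(\xi) + \sqrt{1 + |\xi|^2}\, \mathbb{1}}{2\sqrt{1 + |\xi|^2}} = \frac{1}{2} \begin{pmatrix} -\dfrac{1}{\sqrt{1 + |\xi|^2}} + 1 & -\dfrac{\xi \cdot \sigma}{\sqrt{1 + |\xi|^2}} \\[2ex] -\dfrac{\xi \cdot \sigma}{\sqrt{1 + |\xi|^2}} & \dfrac{1}{\sqrt{1 + |\xi|^2}} + 1 \end{pmatrix}, \tag{3.3} \]
we find that $\widehat{\Lambda^+}(\xi)$, $\widehat{\Lambda^-}(\xi)$ are two orthogonal projectors of rank 2, with
\[ \begin{cases} \widehat{H}_0(\xi)\, \widehat{\Lambda^+}(\xi) = \widehat{\Lambda^+}(\xi)\, \widehat{H}_0(\xi) = \sqrt{1 + |\xi|^2}\; \widehat{\Lambda^+}(\xi), \\ \widehat{H}_0(\xi)\, \widehat{\Lambda^-}(\xi) = \widehat{\Lambda^-}(\xi)\, \widehat{H}_0(\xi) = -\sqrt{1 + |\xi|^2}\; \widehat{\Lambda^-}(\xi), \\ \widehat{\Lambda^+}(\xi)\, \widehat{\Lambda^-}(\xi) = \widehat{\Lambda^-}(\xi)\, \widehat{\Lambda^+}(\xi) = 0, \\ \widehat{\Lambda^+}(\xi) + \widehat{\Lambda^-}(\xi) = \mathbb{1}_{\mathbb{C}^4}. \end{cases} \tag{3.4} \]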
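The algebraic identities (3.4) can be verified numerically at any fixed frequency $\xi$. The following pure-Python sketch builds the $4 \times 4$ matrix of (3.1) and the matrices (3.2), (3.3) at one sample point; the sample value `XI` and all function names are assumptions made for illustration, not part of the paper.

```python
import math

def matmul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def symbol(xi):
    """The matrix of (3.1): [[1, xi.sigma], [xi.sigma, -1]] in 2x2 blocks."""
    x1, x2, x3 = xi
    s = [[x3, x1 - 1j * x2], [x1 + 1j * x2, -x3]]  # xi . sigma
    return [[1, 0, s[0][0], s[0][1]],
            [0, 1, s[1][0], s[1][1]],
            [s[0][0], s[0][1], -1, 0],
            [s[1][0], s[1][1], 0, -1]]

def energy(xi):
    """sqrt(1 + |xi|^2)."""
    return math.sqrt(1 + sum(t * t for t in xi))

def projector(xi, sign):
    """Formula (3.2) for sign = +1 and formula (3.3) for sign = -1."""
    w = energy(xi)
    h = symbol(xi)
    return [[(sign * h[i][j] + (w if i == j else 0)) / (2 * w)
             for j in range(4)] for i in range(4)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(4))

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

XI = (0.3, -1.2, 2.0)  # arbitrary sample frequency
W = energy(XI)
H = symbol(XI)
P_PLUS = projector(XI, +1)
P_MINUS = projector(XI, -1)
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]
```

Since each matrix is a projector, its trace equals its rank, so `trace(P_PLUS)` and `trace(P_MINUS)` should both equal 2.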
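Finally, if we define $\Lambda^+$, $\Lambda^-$ on $L^2(\mathbb{R}^3, \mathbb{C}^4)$ by
\[ \begin{cases} \widehat{\Lambda^+ \psi}(\xi) = \widehat{\Lambda^+}(\xi)\, \hat{\psi}(\xi), \\ \widehat{\Lambda^- \psi}(\xi) = \widehat{\Lambda^-}(\xi)\, \hat{\psi}(\xi), \end{cases} \tag{3.5} \]
we easily obtain (P1) of Lemma 1.1, as a consequence of (3.4). We now give a first consequence of inequality (1.7).

Lemma 3.1. Assume that $\alpha \max(Z, N) < \frac{2}{\pi/2 + 2/\pi}$.
(i) There is a constant $h_0 > 0$, such that for any $\nu \in [0, 1]$, $\Phi \in E^N$ such that $\operatorname{Gram}(\Phi) \le \mathbb{1}$, and $\psi \in E$,
\[ h_0 \|\psi\|_{H^{1/2}} \le \|\overline{H}^\nu_\Phi \psi\|_{H^{-1/2}}. \tag{3.6} \]
In other words, $\overline{H}^\nu_\Phi$ is a self-adjoint isomorphism between $H^{1/2}$ and its dual $H^{-1/2}$, whose inverse is bounded independently of $\Phi$, $\nu$.
(ii) Take $\nu \in [0, 1]$, $\Phi \in E^N$ with $\operatorname{Gram}(\Phi) \le \mathbb{1}$, and $\psi \in E$, such that $\overline{H}^\nu_\Phi \psi \in L^2(\mathbb{R}^3, \mathbb{C}^4)$. Then ψ ∈ T 1≤q 1,q (R3 , C4 ). k,n ≥ cpn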
−→ 1,
n→∞
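so $\Phi \in \Sigma$. Obviously, $\Phi$ satisfies (2.12), so it is a critical point of $\mathcal{E}|_\Sigma$. Moreover, $\mathcal{E}(\Phi) = \lim_{n \to \infty} \mathcal{E}(\Phi_n)$. Now, $\pi_{p_n}(\Phi_n) = \sum_k \theta_{p_n}(\sigma_{k,n})$, with $\theta_p(x) = \frac{x^p}{1 - x}$. We recall that $\theta'_{p_n}(\sigma_{k,n}) = \varepsilon_{k,n} < 1$. But $\frac{\theta_p'(x)}{\theta_p(x)} = \frac{p}{x} + \frac{1}{1 - x} \ge p$, $\forall x \in (0, 1)$. So, $\theta_{p_n}(\sigma_{k,n})$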
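0 is a constant independent of $\Phi$, $\Psi$, $\nu$.

Proof of Lemma 4.1. We obviously have $(\psi_\ell, (\mu * V_\nu)\psi_\ell)_{L^2} > 0$. The Fourier transform of $V_\nu$ is a positive measure, so
\[ \iint V_\nu(x - y) f(x) f(y)^* \ge 0, \quad \forall f \in L^1 \cap L^{3/2}(\mathbb{R}^3, \mathbb{C}). \tag{4.8} \]
As a consequence, $K_2 \ge 0$. Now, $K(y, x) = K(x, y)^*$, so that $\operatorname{tr} K(x, y) K(y, x) \ge 0$ $\forall x, y$, hence $K_4 \ge 0$. We thus have
\[ \frac{1}{2} \mathcal{E}_\nu''(\Phi)\big[\Psi, \Psi\big] \le \sum_{\ell} \Big( \|\psi_\ell^+\|_E^2 - \|\psi_\ell^-\|_E^2 \Big) + K_1 + K_3 + K_5. \tag{4.9} \]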
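Now, take $\Phi \in A$ and $\Psi^- \in (E^-)^N$. For $m = 1, \dots, N$, we have $\|\varphi_m\|_{L^2} \le 1$. So, using inequality (1.7), we easily get
\[ K_1 \le \frac{(\pi/2 + 2/\pi)\, \alpha N}{2} \sum_{\ell} \|\psi_\ell^-\|_E^2, \tag{4.10} \]
and
\[ K_5 \le \frac{(\pi/2 + 2/\pi)\, \alpha (N - 1)}{2} \sum_{\ell} \|\psi_\ell^-\|_E^2. \tag{4.11} \]
By the Cauchy–Schwarz inequality,
\[ K_3 \le K_1. \tag{4.12} \]
Finally, for any $\Phi \in A$ and $\Psi^- \in (E^-)^N$,
\[ \frac{1}{2} \mathcal{E}_\nu''(\Phi)\big[\Psi^-, \Psi^-\big] \le -\sum_{\ell} \|\psi_\ell^-\|_E^2 + \frac{(\pi/2 + 2/\pi)\, \alpha}{2} (3N - 1) \sum_{\ell} \|\psi_\ell^-\|_E^2 \le -s \sum_{\ell} \|\psi_\ell^-\|_E^2, \tag{4.13} \]
with $s = 1 - \frac{(\pi/2 + 2/\pi)\, \alpha}{2} (3N - 1)$. Note that s > 0 provided α(3N − 1)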
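0, we associate
\[ \psi(x) = \begin{pmatrix} f(|x|/\lambda) \\ 0 \\ 0 \\ 0 \end{pmatrix}. \tag{4.40} \]
Obviously, $\psi \in H^1(\mathbb{R}^3, \mathbb{C}^4)$. We call $W_{d,\lambda}$ the $d$-dimensional real vector space of functions $\psi$ of the form (4.40), with $\lambda$ fixed and $f \in V_d$ arbitrary.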
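It is easy to see that there are two constants $0 < c_*(d) < c^*(d) < \infty$ such that, for any $\psi \in W_{d,\lambda}$ and $\lambda$ large,
\[ (H_0 \psi, \psi) = \|\psi\|_{L^2}^2, \tag{4.41} \]
\[ \|\nabla \psi\|_{L^2}^2 \le \frac{c^*}{\lambda^2} \|\psi\|_{L^2}^2, \tag{4.42} \]
\[ (\psi, V_\nu \psi)_{L^2} \ge \frac{c_*}{\lambda} \|\psi\|_{L^2}^2, \quad \forall \nu \in [0, 1], \tag{4.43} \]
\[ \|\Lambda^- \psi\|_{L^2}^2 \le \frac{c^*}{\lambda^2} \|\psi\|_{L^2}^2, \tag{4.44} \]
\[ \big((\mu * V_\nu)\psi, \psi\big)_{L^2} \ge (V_\nu \psi, \psi)_{L^2} - o\Big(\frac{1}{\lambda}\Big) \|\psi\|_{L^2}^2, \quad \forall \nu \in [0, 1]. \tag{4.45} \]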
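One way to see how (4.44) can follow from formula (3.3) is the following sketch, given under the assumption of a unitary Fourier transform, for which $\|\nabla \psi\|_{L^2}^2 = \int |\xi|^2 |\hat{\psi}(\xi)|^2\, d\xi$. For $\psi = (f, 0, 0, 0)^T$, only the upper-left entry of (3.3) acts, so $|\widehat{\Lambda^-}(\xi)\hat{\psi}(\xi)|^2 = \frac{1}{2}\big(1 - (1 + |\xi|^2)^{-1/2}\big) |\hat{f}(\xi)|^2$. This weight is bounded by $|\xi|^2/4$, which yields $\|\Lambda^- \psi\|_{L^2}^2 \le \frac{1}{4}\|\nabla \psi\|_{L^2}^2$, and (4.42) then gives a bound of the form (4.44). The code only checks the scalar inequality; the function name is illustrative.

```python
import math

def lower_weight(r):
    """Upper-left entry of (3.3) at |xi| = r: (1/2) * (1 - 1/sqrt(1 + r^2))."""
    return 0.5 * (1 - 1 / math.sqrt(1 + r * r))

# Sample radii |xi| on a grid from 0 to 10.
RADII = [i / 10 for i in range(0, 101)]

# The scalar bound behind the sketch: lower_weight(r) <= r^2 / 4 for all r >= 0.
BOUND_HOLDS = all(lower_weight(r) <= r * r / 4 + 1e-15 for r in RADII)
```

The constant $1/4$ is sharp as $|\xi| \to 0$, since `lower_weight(r) / r**2` tends to $1/4$ there.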
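Inequalities (4.42), (4.43) and (4.45) follow from scaling arguments, and (4.44) is a consequence of formula (3.3). Now, suppose that $\psi \in W_{d,\lambda}$ satisfies
\[ (\varphi_k^+, \psi)_{L^2} = 0, \quad \forall k, \tag{4.46} \]
for some $\Phi^+ = (\varphi_1^+, \dots, \varphi_N^+) \in A^+$, such that $\operatorname{Gram}\Phi = \operatorname{Diag}(\sigma_1, \dots, \sigma_N)$, $0 < \sigma_1 \le \dots \le \sigma_N < 1$, with $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$. Let $\Psi^+ = (0, \dots, \Lambda^+ \psi)$. From Lemma 4.3, we have, for any $\nu \in (0, 1)$, $p \ge 1$,
\[ \frac{1}{2} I''_{\nu,p}(\Phi^+)\big[\Psi^+, \Psi^+\big] \le \frac{1}{2} \mathcal{E}_\nu''(\Phi)\big[\Psi^+, \Psi^+\big] - e_p(\sigma_N) \|\Lambda^+ \psi\|_{L^2}^2 + \bar{c}\, \|\nabla \psi\|_{L^2}^2. \tag{4.47} \]
From Lemma 2.2,
\[ \frac{1}{2} \mathcal{E}_\nu''(\Phi)\big[\Psi^+, \Psi^+\big] \le \frac{1}{2} \mathcal{E}_\nu''(\Phi)\big[\Psi, \Psi\big] - \mathcal{E}_\nu''(\Phi)\big[\Psi, \Psi^-\big], \tag{4.48} \]
where $\Psi = (0, \dots, 0, \psi)$, $\Psi^- = (0, \dots, 0, \Lambda^- \psi)$. But from Hardy's inequality (1.9),
\[ \mathcal{E}_\nu''(\Phi)\big[\Psi, \Psi^-\big] \le c\, \|\nabla \psi\|_{L^2} \|\Lambda^- \psi\|_{L^2}, \tag{4.49} \]
for some $c > 0$ which depends only on $N$, $Z$. Moreover, using Lemma 4.4, we get
\[ \frac{1}{2} \mathcal{E}_\nu''(\Phi)\big[\Psi, \Psi\big] \le (\psi, H_0 \psi) + \alpha (N - 1) (\psi, V_\nu \psi)_{L^2} - \alpha Z \big((\mu * V_\nu)\psi, \psi\big)_{L^2}. \tag{4.50} \]
Finally, combining (4.41), (4.42), ..., (4.50), we get
\[ \begin{aligned} \frac{1}{2} I''_{\nu,p}(\Phi^+)\big[\Psi^+, \Psi^+\big] &\le \Big(1 - (Z - N + 1) \frac{\alpha c_* + o(1)}{\lambda}\Big) \|\psi\|_{L^2}^2 - e_p(\sigma_N) \|\Lambda^+ \psi\|_{L^2}^2 + \frac{(c + \bar{c})\, c^*}{\lambda^2} \|\psi\|_{L^2}^2 \\ &\le \Big(1 - \frac{\alpha c_* (Z - N + 1)}{2\lambda} - e_p(\sigma_N)\Big) \|\Lambda^+ \psi\|_E^2 \end{aligned} \tag{4.51} \]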
for $\lambda = \lambda(d)$ large enough. Now, take $m \ge 0$. Choose $X_m$ as an $(m + 1)$-dimensional subspace of $\Lambda^+ W_{d,\lambda(d)} \cap \{\varphi_1^+, \dots, \varphi_N^+\}^\perp$, where $d = m + 2N + 1$ (such a space always exists). Take $b_m = 1 - \frac{\alpha c_* (Z - N + 1)}{2\lambda(d)}$. Then it is easy to check that $X_m$ satisfies (4.39), and Lemma 4.5 is proved. Lemma 2.4 is now an immediate consequence of Lemma 4.5.

5. The Min–Max Argument

We start with a proof of Lemma 2.5. We need the following result:

Lemma 5.1. Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. Take $\nu \in (0, 1)$. There is a constant $C(\nu) > 0$ such that, for any $p \ge 1$ and $\Phi^+ \in A^+$,
\[ \sigma_1(\Phi^+) \le C(\nu)\, \sigma_1^+(\Phi^+). \tag{5.1} \]
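Here, $\sigma_1^+(\Phi^+)$ is the smallest eigenvalue of $\operatorname{Gram}\Phi^+$, and $\sigma_1(\Phi^+)$ is the smallest eigenvalue of $\operatorname{Gram}\Phi$, where $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$.

Remark. The constant $C$ depends on $\nu$. We have been unable to prove that $C$ remains bounded as $\nu$ tends to 0.

Proof of Lemma 5.1. Take $\Phi^+ \in A^+$, i.e. $\Phi^+ \in (E^+)^N$ with $0 < \operatorname{Gram}(\Phi^+) < \mathbb{1}$. Using the $U(N)$ invariance, we just have to prove the lemma when
\[ \operatorname{Gram}(\Phi^+) = \operatorname{Diag}(\sigma_1^+, \dots, \sigma_N^+), \quad 0 < \sigma_1^+ \le \dots \le \sigma_N^+ < 1. \tag{5.2} \]
We denote
\[ h_{\nu,p}(\Phi^+) = \Phi^- = (\varphi_1^-, \dots, \varphi_N^-), \qquad \Phi = \Phi^+ + \Phi^- = (\varphi_1, \dots, \varphi_N). \]
We introduce the following functional on $E^-$:
\[ F(\psi^-) = \big(\varphi_1^+ + \psi^-, \overline{H}^\nu_{\Phi_1}(\varphi_1^+ + \psi^-)\big)_{L^2} - \pi_p(\varphi_1^+ + \psi^-, \varphi_2, \dots, \varphi_N). \tag{5.3} \]
Here, $\Phi_1 = (\varphi_2, \dots, \varphi_N) \in E^{N-1}$, and
\[ \overline{H}^\nu_{\Phi_1} \psi = \big(H_0 - \alpha Z (\mu * V_\nu)\big)\psi + \alpha \sum_{k=2}^{N} \int_{\mathbb{R}^3} V_\nu(x - y) \Big[ |\varphi_k(y)|^2 \psi(x) - \big(\varphi_k(y), \psi(y)\big) \varphi_k(x) \Big]\, dy. \tag{5.4} \]
We have extended $\pi_p$ to $E^N$, with values in $\mathbb{R}$, by defining $\pi_p(\Phi) = +\infty$ when $\mathbb{1} - \operatorname{Gram}\Phi$ is not positive definite. $F$ is thus well-defined on $E^-$ with values in $\mathbb{R}$, and $F''(\psi^-)$ exists when $F(\psi^-) > -\infty$. From Lemma 2.2, $F$ is strictly concave, and
\[ F''(\psi^-)\big[\chi^-, \chi^-\big] \le -s \|\chi^-\|_E^2, \tag{5.5} \]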
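for any $\chi^- \in E^-$, and $\psi^- \in E^-$ such that $F(\psi^-) > -\infty$. We have
\[ F_{\nu,p}\big(\varphi_1^+ + \psi^-, \varphi_2, \dots, \varphi_N\big) = F(\psi^-) + \mathcal{E}_\nu(\Phi_1), \tag{5.6} \]
so $\varphi_1^-$ is the unique maximizer of $F$ on $E^-$. From (5.2), $(\varphi_1^+, \varphi_k)_{L^2} = 0$, $\forall k \ge 2$. Therefore, for any $\chi^- \in E^-$,
\[ \pi_p'\big(\varphi_1^+, \varphi_2, \dots, \varphi_N\big) \cdot \chi^- = \sum_{n \ge p} 2n\, \operatorname{tr}\left[ \begin{pmatrix} (\sigma_1^+)^{n-1} & 0 \\ 0 & (\operatorname{Gram}\Phi_1)^{n-1} \end{pmatrix} \begin{pmatrix} 0 & \operatorname{Re}(\chi^-, \varphi_2^-) & \cdots & \operatorname{Re}(\chi^-, \varphi_N^-) \\ \operatorname{Re}(\varphi_2^-, \chi^-) & & & \\ \vdots & & 0 & \\ \operatorname{Re}(\varphi_N^-, \chi^-) & & & \end{pmatrix} \right] = 0. \]
As a consequence,
\[ F'(0)\chi^- = 2 \operatorname{Re}\big(\chi^-, \overline{H}^\nu_{\Phi_1} \varphi_1^+\big). \tag{5.7} \]
So there is a constant $K_1(\nu) > 0$ such that
\[ |F'(0)\chi^-| \le K_1(\nu) \|\varphi_1^+\|_{L^2} \|\chi^-\|_E, \quad \forall \chi^- \in E^-. \tag{5.8} \]
But (5.5) implies that
\[ F(\varphi_1^-) \le F(0) + F'(0)\varphi_1^- - s \|\varphi_1^-\|_E^2. \tag{5.9} \]
Since $F(\varphi_1^-) \ge F(0)$, (5.8) and (5.9) give
\[ \|\varphi_1^-\|_E \le K_2(\nu) \|\varphi_1^+\|_{L^2}. \tag{5.10} \]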
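Finally, (5.10) gives
$$\sigma_1(\Phi^+) = \inf_{\xi \in \mathbb{C}^N,\ \|\xi\| = 1} \Big\|\sum_{k=1}^{N} \xi_k \varphi_k\Big\|_{L^2}^2 \le \|\varphi_1\|_{L^2}^2 \le C(\nu)\,\|\varphi_1^+\|_{L^2}^2 = C(\nu)\,\sigma_1^+(\Phi^+). \tag{5.11}$$
Lemma 5.1 is proved.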
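The first inequality in (5.11) uses only that the smallest eigenvalue of a Gram matrix is an infimum over unit vectors $\xi$, so the choice $\xi = (1, 0, \dots, 0)$ bounds it by $\|\varphi_1\|^2$. The following is a minimal numerical sketch of that step for two real vectors; the sample vectors and the helper names `gram` and `smallest_eigenvalue_2x2` are illustrative assumptions, not objects from the paper.

```python
import math

def gram(vectors):
    """Gram matrix G[k][l] = <v_k, v_l> for real vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

def smallest_eigenvalue_2x2(g):
    """Smallest eigenvalue of a real symmetric 2x2 matrix (closed form)."""
    a, b, d = g[0][0], g[0][1], g[1][1]
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius

# Hypothetical sample data standing in for (phi_1, phi_2).
phi = [(0.6, 0.2, 0.0), (0.1, 0.5, 0.3)]
G = gram(phi)
sigma_1 = smallest_eigenvalue_2x2(G)   # inf over unit xi of |xi_1 phi_1 + xi_2 phi_2|^2
norm_sq_phi_1 = G[0][0]                # value of the same quadratic form at xi = (1, 0)
```

Here `sigma_1` never exceeds `norm_sq_phi_1`, which is the finite-dimensional analogue of the bound $\sigma_1 \le \|\varphi_1\|_{L^2}^2$.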
We are now ready to prove Lemma 2.5. Using once again the $U(N)$ invariance, we just have to consider $\Phi^+ \in A^+$ such that, denoting $\Phi = \Phi^+ + \Phi^-$, $\Phi^- = h_{\nu,p}(\Phi^+)$, the following holds:
$$\mathrm{Gram}(\Phi) = \mathrm{Diag}(\sigma_1, \dots, \sigma_N), \qquad 0 < \sigma_1 \le \dots \le \sigma_N < 1. \tag{5.12}$$
We want to find $X \in (E^+)^N$ satisfying (2.22), assuming that $\Delta(\Phi^+) = \det \mathrm{Gram}\,\Phi^+$ is in $[d(\nu), 2d(\nu)]$. We choose
$$X = (\varphi_1^+, 0, \dots, 0). \tag{5.13}$$
Obviously,
$$\Delta'(\Phi^+)\cdot X = 2\Delta(\Phi^+) > 0. \tag{5.14}$$
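Since $\mathcal{F}_{\nu,p}'(\Phi)\cdot(\chi^-, 0, \dots, 0) = 0$, $\forall \chi^- \in E^-$, we may write
$$\begin{aligned} I_{\nu,p}'(\Phi^+)\cdot X &= \mathcal{F}_{\nu,p}'(\Phi)\cdot(\varphi_1^+ - \varphi_1^-, 0, \dots, 0) \\ &= 2\big(\varphi_1^+, H^\nu_{\Phi_1}\varphi_1^+\big)_{L^2} - 2\big(\varphi_1^-, H^\nu_{\Phi_1}\varphi_1^-\big)_{L^2} - 2e_p(\sigma_1)\big(\|\varphi_1^+\|_{L^2}^2 - \|\varphi_1^-\|_{L^2}^2\big). \end{aligned} \tag{5.15}$$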
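From inequality (1.7), we have
$$\begin{cases} (\varphi^+, H^\nu_{\Phi_1}\varphi^+)_{L^2} \ge \Big(1 - \dfrac{(\pi/2 + 2/\pi)\,\alpha Z}{2}\Big)\|\varphi^+\|_E^2, & \forall \varphi^+ \in E^+, \\[2mm] -(\varphi^-, H^\nu_{\Phi_1}\varphi^-)_{L^2} \ge \Big(1 - \dfrac{(\pi/2 + 2/\pi)\,\alpha (N-1)}{2}\Big)\|\varphi^-\|_E^2, & \forall \varphi^- \in E^-. \end{cases} \tag{5.16}$$
As a consequence,
$$I_{\nu,p}'(\Phi^+)\cdot X \ge 2\Big[1 - \frac{(\pi/2 + 2/\pi)\,\alpha \max(Z, N-1)}{2} - e_p(\sigma_1)\Big]\|\varphi_1^+\|_E^2. \tag{5.17}$$
But $e_p(x) = \frac{x^p}{1-x} \le \frac{x}{1-x} = e_1(x)$ is small when $x > 0$ is small. Moreover, by assumption, $\frac{(\pi/2 + 2/\pi)\,\alpha \max(Z, N-1)}{2} < 1$. From Lemma 5.1,
$$\Delta(\Phi^+) \le \sigma_1(\Phi^+) \le C(\nu)\,\sigma_1^+(\Phi^+) \le C(\nu)\big[\Delta(\Phi^+)\big]^{1/N}. \tag{5.18}$$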
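The smallness of $e_p(\sigma_1)$ used here rests on two elementary facts about $e_p(x) = x^p/(1-x)$ on $[0, 1)$: it is nonincreasing in $p$, and it equals the geometric tail $\sum_{n \ge p} x^n$. The sketch below checks both numerically. It assumes the closed form for $e_p$ read off from the inequality above; the function names and the truncation length are illustrative choices.

```python
def e_p(x, p):
    """Closed form e_p(x) = x**p / (1 - x), for 0 <= x < 1 and integer p >= 1."""
    return x ** p / (1.0 - x)

def e_p_series(x, p, terms=200):
    """Truncated geometric tail sum_{n >= p} x**n (first `terms` terms only)."""
    return sum(x ** n for n in range(p, p + terms))

x = 0.1
values = [e_p(x, p) for p in (1, 2, 5)]   # nonincreasing in p for fixed x in [0, 1)
small = e_p(0.01, 1)                      # e_1(x) = x / (1 - x) is close to x for small x
```

For fixed $x \in [0, 1)$ the list `values` is nonincreasing, so a bound on $e_1(\sigma_1)$ controls $e_p(\sigma_1)$ for every $p \ge 1$.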
Lemma 2.5 is now an immediate consequence of (5.14), (5.17) and (5.18).

Our goal now is to prove Lemma 2.6. We start with a "linear" result that will give us the lower bound $a(j)$ in (2.30).

Lemma 5.2. Assume that $\alpha Z < \frac{2}{\pi/2 + 2/\pi}$. Then there is a nondecreasing sequence $\{\lambda_j,\ j \ge 0\}$ in $(0, 1)$, with $\lim_{j \to \infty} \lambda_j = 1$, and a sequence $\{G_j,\ j \ge 0\}$ of complex vector subspaces of $E^+$, with $\dim_{\mathbb{C}}(E^+/G_j) = j$, and
$$\big(\varphi^+, (H_0 - \alpha Z(\mu * V))\varphi^+\big)_{L^2} \ge \lambda_j\,\|\varphi^+\|_{L^2}^2, \qquad \forall \varphi^+ \in G_j. \tag{5.19}$$

Proof. The arguments below are classical (see [46], 112-117 for a similar situation). The operator $T = \Lambda^+(H_0 - \alpha Z(\mu * V))\Lambda^+$, defined as a Friedrichs extension, is selfadjoint on $\Lambda^+(L^2)$ and has essential spectrum $\sigma_{ess}(T) = [1, +\infty)$. Indeed, the arguments used in [18] to prove the result when $\mu$ is a Dirac mass extend to the more general case. From (1.7), $\sigma(T) \subset (0, \infty)$. As a consequence, $\sigma(T) \cap (-\infty, 1)$ consists only of positive eigenvalues with finite multiplicity. One can easily prove, using the Rayleigh quotients, that $\sigma(T) \cap (-\infty, 1) = \{\lambda_j,\ j \ge 0\}$, with $0 < \lambda_0 \le \dots \le \lambda_j \le \dots$, $\lim_{j \to \infty} \lambda_j = 1$. Let $G_j$ be the orthogonal space, for the $L^2$-hermitian product, of
$$K_j = \bigoplus_{k \le j-1} \mathrm{Ker}(T - \lambda_k I_{E^+}). \tag{5.20}$$
Obviously, $E^+/G_j \approx K_j$ has complex dimension $j$, and (5.19) holds.

We now construct the space $F_j$, and we find the upper bound $\bar{a}(j)$.
Lemma 5.3. Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$, $N < 2Z + 1$. There is a sequence $\{\bar{a}(j),\ j \ge 0\}$ in $(0, N)$ and a sequence $\{F_j,\ j \ge 0\}$ of complex vector subspaces of $E^+$, with $\dim_{\mathbb{C}} F_j = j + N$, and
$$I_{\nu,p}(\Phi^+) \le \bar{a}(j), \qquad \forall \Phi^+ \in (F_j)^N \cap A^+. \tag{5.21}$$

Proof. Our arguments will be similar to those in the proof of Lemma 2.4, but simpler. We consider the space $W_{d,\lambda}$ of functions $\psi$ of the form (4.40), with $\lambda$ fixed and $f \in V_d$ arbitrary. We denote $W_{d,\lambda}^+ = \Lambda^+(W_{d,\lambda})$. From (4.44), for $\lambda$ large enough,
$$\dim_{\mathbb{C}} W_{d,\lambda}^+ = \dim_{\mathbb{C}} W_{d,\lambda} = d. \tag{5.22}$$
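From (4.37), for any $\Phi \in (W_{d,\lambda})^N$ such that $\mathrm{Gram}\,\Phi \le 2(\delta_{k\ell})$,
$$\begin{aligned} \mathcal{E}_\nu(\Phi) &= \sum_k \Big[(\varphi_k, H_0\varphi_k)_{L^2} - \alpha Z\,(\varphi_k, (\mu * V_\nu)\varphi_k)_{L^2}\Big] \\ &\quad + \frac{\alpha}{2}\sum_{k \ne \ell} \iint V_\nu(x - y)\Big\{|\varphi_k(x)|^2|\varphi_\ell(y)|^2 - \big(\varphi_k(x), \varphi_\ell(x)\big)\big(\varphi_\ell(y), \varphi_k(y)\big)\Big\}\,dx\,dy \\ &\le \sum_k \Big(\varphi_k, \Big[H_0 + \frac{\alpha}{2}(N-1)V_\nu\Big]\varphi_k\Big)_{L^2} - \alpha Z \sum_k (\varphi_k, (\mu * V_\nu)\varphi_k)_{L^2}. \end{aligned} \tag{5.23}$$
Moreover, using inequalities (1.7) and (1.9), one can find two constants $a, b > 0$ such that
$$|\mathcal{E}_\nu'(\Phi)\cdot\Psi^-| \le a\Big(\sum_k \|\nabla\varphi_k\|_{L^2}^2\Big)^{1/2}\Big(\sum_k \|\psi_k^-\|_{L^2}^2\Big)^{1/2} + b\Big(\sum_k \|\varphi_k^-\|_E^2\Big)^{1/2}\Big(\sum_k \|\psi_k^-\|_E^2\Big)^{1/2}, \tag{5.24}$$
where $\varphi_k^+ = \Lambda^+\varphi_k$, $\varphi_k^- = \Lambda^-\varphi_k$, and $\Psi^- \in (E^-)^N$ is arbitrary. Now, we take $\Phi^+ = (\varphi_1^+, \dots, \varphi_N^+) \in (W_{d,\lambda}^+)^N \cap A^+$. We recall that $A^+ = \{\Phi^+ \in (E^+)^N \,/\, 0 < \mathrm{Gram}\,\Phi^+ < \mathbf{1}\}$. From (4.44), for $\lambda$ large enough, there is $\Phi \in (W_{d,\lambda})^N$ such that $\Lambda^+\varphi_k = \varphi_k^+$ ($\forall k$) and $\mathrm{Gram}(\Phi) \le 2(\delta_{k,\ell})$. Since $\pi_p \ge 0$, we may write
$$I_{\nu,p}(\Phi^+) \le \mathcal{E}_\nu\big(\Phi^+ + h_{\nu,p}(\Phi^+)\big) \le \sup_{\Psi^- \in (E^-)^N} \mathcal{E}_\nu(\Phi + \Psi^-). \tag{5.25}$$
Combining (5.24), (5.25) and Lemma 4.1, we get, for some $a' > 0$,
$$I_{\nu,p}(\Phi^+) \le \mathcal{E}_\nu(\Phi) + a' \sum_k \Big(\|\nabla\varphi_k\|_{L^2}^2 + \|\Lambda^-\varphi_k\|_E^2\Big). \tag{5.26}$$
Finally, combining (5.23), (5.26) and the estimates (4.41), ..., (4.45), we find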
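$$I_{\nu,p}(\Phi^+) \le N\Big[1 - \frac{\alpha c_* (2Z - N + 1)}{2\lambda}\Big] + o\Big(\frac{1}{\lambda}\Big). \tag{5.27}$$
We take $\bar{\lambda}(d)$ large enough, and $F_j = W^+_{j+N,\,\bar{\lambda}(j+N)}$. Then (5.27) gives
$$I_{\nu,p}(\Phi^+) \le \bar{a}(j) < N, \qquad \forall \Phi^+ \in (F_j)^N \cap A^+. \tag{5.28}$$
From (5.22), $\dim_{\mathbb{C}} F_j = j + N$, so Lemma 5.3 is proved.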
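We are now ready to prove Lemma 2.6. We take $F_j$ as in Lemma 5.3. Obviously,
$$c_{\nu,p}(F_j) = \inf_{Q \in \mathcal{Q}(F_j)} \max_{\Phi^+ \in Q} J_{\nu,p}(\Phi^+) \le \max_{\Phi^+ \in (F_j)^N \cap A^+} J_{\nu,p}(\Phi^+) \le \bar{a}(j), \tag{5.29}$$
where for any $F$, the class of sets $\mathcal{Q}(F)$ is defined in Section 2, formula (2.28). To find a lower estimate on $c_{\nu,p}(F_j)$, we define
$$S_j = \Big\{\Phi^+ \in (G_j)^N \,/\, \mathrm{Gram}\,\Phi^+ = \frac{j+1}{j+2}\,\mathbf{1}\Big\}. \tag{5.30}$$
Take $\Phi^+ \in S_j$. From Lemma 5.2, we have
$$\mathcal{E}_\nu(\Phi^+) \ge \sum_k \big(\varphi_k^+, (H_0 - \alpha Z(\mu * V_\nu))\varphi_k^+\big) \ge N\,\frac{j+1}{j+2}\,\lambda_j. \tag{5.31}$$
So there is $p(j)$ such that, if $p \ge p(j)$, then
$$\pi_p\Big(\frac{j+1}{j+2}\,\mathbf{1}\Big) = N(j+2)\Big(\frac{j+1}{j+2}\Big)^p \le \frac{1}{j+2}\,\mathcal{E}_\nu(\Phi^+).$$
Together with (5.31), this gives
$$I_{\nu,p}(\Phi^+) \ge \mathcal{E}_\nu(\Phi^+) - \pi_p(\Phi^+) \ge N\Big(\frac{j+1}{j+2}\Big)^2\lambda_j. \tag{5.32}$$
We choose $a(j) = N\big(\frac{j+1}{j+2}\big)^2\lambda_j$. Obviously, $\lim_{j \to \infty} a(j) = N$, and Lemma 2.6 is an immediate consequence of the following intersection result:

Lemma 5.4. For any $Q \in \mathcal{Q}(F_j)$, the intersection $Q \cap S_j$ is non-empty.

Proof of Lemma 5.4 (hence of Lemma 2.6). The quotient set $S_j/U(N)$ is a submanifold of the Hilbert manifold $A^+/U(N)$, and
$$\mathrm{codim}_{\mathbb{R}}\big(S_j/U(N), A^+/U(N)\big) = \mathrm{codim}_{\mathbb{R}}\big(S_j, (E^+)^N\big) = N^2 + \mathrm{codim}_{\mathbb{R}}\big((G_j)^N, (E^+)^N\big) = N(2j + N). \tag{5.33}$$
Take $\varepsilon > 0$ small, and define
$$M_j(\varepsilon) = \big\{\Phi^+ \in (F_j)^N \cap A^+ \,/\, \det(\mathrm{Gram}\,\Phi^+)\,\det(\mathbf{1} - \mathrm{Gram}\,\Phi^+) \ge \varepsilon\big\}. \tag{5.34}$$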
$M_j$ is a manifold with boundary, and $\dim_{\mathbb{R}} M_j = \dim_{\mathbb{R}} (F_j)^N = 2N(j + N)$. If $h$ is "admissible", then, from (2.27) and by continuity of $h$, there is $\varepsilon_h > 0$ such that
$$h\big([0, 1] \times \partial M_j(\varepsilon_h)\big) \cap S_j = \emptyset. \tag{5.35}$$
Now, $M_j/U(N)$ is a submanifold (with boundary) of $A^+/U(N)$, and
$$\dim_{\mathbb{R}} M_j/U(N) = \dim_{\mathbb{R}} M_j - \dim_{\mathbb{R}} U(N) = N(2j + N) = \mathrm{codim}_{\mathbb{R}}\big(S_j/U(N), A^+/U(N)\big). \tag{5.36}$$
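The matching of dimension and codimension in (5.33) and (5.36) is what makes a mod 2 intersection index meaningful, and it is a short piece of arithmetic. The sketch below recomputes both sides from the counts stated in the text ($\dim_{\mathbb{C}} F_j = j + N$, $\dim_{\mathbb{C}}(E^+/G_j) = j$, $\dim_{\mathbb{R}} U(N) = N^2$); the function names are illustrative only.

```python
def dim_Mj_mod_UN(N, j):
    """Real dimension of M_j / U(N)."""
    dim_Mj = 2 * N * (j + N)   # dim_R (F_j)^N, since dim_C F_j = j + N
    dim_UN = N * N             # real dimension of the unitary group U(N)
    return dim_Mj - dim_UN

def codim_Sj_mod_UN(N, j):
    """Real codimension of S_j / U(N) in A^+ / U(N)."""
    gram_constraints = N * N   # Gram = const * identity fixes a Hermitian N x N matrix
    codim_Gj_N = 2 * j * N     # (G_j)^N has complex codimension j * N in (E^+)^N
    return gram_constraints + codim_Gj_N

# Both counts reduce to N * (2 * j + N) for every N >= 1 and j >= 0.
pairs = [(N, j) for N in range(1, 6) for j in range(0, 6)]
```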
Perturbing slightly $F_j$ if necessary, we may impose that $F_j$ and $G_j$ intersect transversally. Their intersection is then a complex subspace $H_j$ of $E^+$, of dimension $N$, and $S_j/U(N) \cap M_j/U(N)$ is a transverse intersection of cardinal 1. Its unique element is the $U(N)$ class of bases $(\varphi_1^+, \dots, \varphi_N^+)$ of $H_j$ such that $\mathrm{Gram}(\varphi_1^+, \dots, \varphi_N^+) = \frac{j+1}{j+2}\,\mathbf{1}$. So the intersection index of $S_j/U(N)$ and $M_j/U(N)$ (mod 2) is 1. From (5.35), we also have
$$I_{\mathbb{Z}_2}\big(S_j/U(N),\, h(1, M_j)/U(N)\big) = 1. \tag{5.37}$$
So $S_j$ intersects $Q = h(1, D(F_j))$, and Lemma 5.4 (hence Lemma 2.6) is proved. This ends the proof of Theorem 1.2.

Acknowledgement. The authors wish to thank B. Buffoni, P. Chaix and P. Indelicato for stimulating conversations. They are also indebted to the referee for valuable remarks.
References

1. Amann, H.: Saddle points and multiple solutions of differential equations. Math. Z. 169, 127–166 (1979)
2. Bahri, A.: Une méthode perturbative en théorie de Morse. Thèse d'Etat, Université P. et M. Curie, Paris, 1981
3. Bahri, A., Berestycki, H.: A perturbation method in critical point theory and applications. Trans. Am. Math. Soc. 267 (1), 1–32 (1981)
4. Bahri, A., Berestycki, H.: Points critiques de perturbations de fonctionnelles paires et applications. Comptes rendus Acad. Sci. Paris, Série A-B 291 (3), 189–192 (1980)
5. Bjorken, J.D., Drell, S.D.: Relativistic quantum mechanics. New York: McGraw-Hill, 1964
6. Brown, G.E., Ravenhall, D.G.: On the interaction of two electrons. Proc. Roy. Soc. London A208, 552–559 (1951)
7. Buffoni, B., Jeanjean, L.: Minimax characterization of solutions for a semi-linear elliptic equation with lack of compactness. Ann. Inst. H. Poincaré 10 (4), 377–404 (1993)
8. Burenkov, V.I., Evans, W.D.: On the evaluation of the norm of an integral operator associated with the stability of one-electron atoms. Preprint Mp-arc archive list number 97–247
9. Castro, A., Lazer, A.C.: Applications of a min-max principle. Rev. Colomb. Mat. 10, 141–149 (1976)
10. Chaix, P., Iracane, D.: The Bogoliubov-Dirac–Fock formalism. J. Phys. At. Mol. Opt. Phys. 22, 3791–3814 (1989)
11. Coffman, C.V.: Ljusternik–Schnirelman theory: Complementary principles and the Morse index. Nonlinear Analysis, Theory and Applications 12 (5), 507–529 (1988)
12. Conley, C.: Isolated invariant sets and the Morse index. C.B.M.S. 38, Providence, RI: A.M.S. 1978
13. Conley, C., Zehnder, E.: The Birkhoff–Lewis fixed point theorem and a conjecture of V.I. Arnold. Invent. Math. 73, 33–49 (1983)
14. Daubechies, I., Lieb, E.H.: One-electron relativistic molecules with Coulomb interaction. Commun. Math. Phys. 90, 497–510 (1983)
15. Desclaux, J.: Relativistic Dirac–Fock expectation values for atoms with Z = 1 to Z = 120. Atomic Data and Nuclear Data Tables 12, 311–406 (1973)
16. Dolbeault, J., Esteban, M.J., Séré, E.: Variational characterization for eigenvalues of Dirac operators. Preprint mparc 98–177
17. Esteban, M.J., Séré, E.: Existence and multiplicity of solutions for linear and nonlinear Dirac problems. In: Partial Differential Equations and Their Applications. CRM Proceedings and Lecture Notes, volume 12. Eds. P.C. Greiner, V. Ivrii, L.A. Seco and C. Sulem. Providence, RI: AMS, 1997
18. Evans, W.D., Perry, P., Siedentop, H.: The spectrum of relativistic one-electron atoms according to Bethe and Salpeter. Commun. Math. Phys. 178, 733–746 (1996)
19. Fang, G., Ghoussoub, N.: Morse-type information on Palais–Smale sequences obtained by min-max principles. Manuscripta Math. 75, 81–95 (1992)
20. Floer, A.: A relative Morse index for the symplectic action. CPAM 41, 393–407 (1988)
21. Ghoussoub, N.: Duality and perturbation methods in critical point theory. Cambridge: Cambridge Univ. Press, 1993
22. Gorceix, O., Indelicato, P., Desclaux, J.P.: Multiconfiguration Dirac–Fock studies of two-electron ions: I. Electron-electron interaction. J. Phys. B: At. Mol. Phys. 20, 639–649 (1987)
23. Grant, I.P.: Relativistic Calculation of Atomic Structures. Adv. Phys. 19, 747–811 (1970)
24. Grant, I.P., Quiney, H.M.: Foundations of the relativistic theory of atomic and molecular structure. Adv. Atom. Mol. Phys. 23, 37–86 (1988)
25. Griesemer, M., Siedentop, H.: A minimax principle for the eigenvalues in spectral gaps. Preprint mparc 97–492
26. Hardekopf, G., Sucher, J.: Relativistic wave equations in momentum space. Phys. Rev. A 30, 703–711 (1984)
27. Herbst, I.W.: Spectral theory of the operator $(p^2 + m^2)^{1/2} - Ze^2/r$. Commun. Math. Phys. 53, 285–294 (1977)
28. Heully, J.L., Lindgren, I., Lindroth, E., Martensson-Pendrill, A.M.: Comment on relativistic wave equations and negative-energy states. Phys. Rev. A 33, 4426–4429 (1986)
29. Hofer, H., Wysocki, K.: First order elliptic systems and the existence of homoclinic orbits in Hamiltonian systems. Math. Ann. 288, 483–503 (1990)
30. Kato, T.: Perturbation theory for linear operators. Berlin–Heidelberg–New York: Springer, 1966
31. Quiney, H.M., Grant, I.P., Wilson, S.: The Dirac equation in the algebraic approximation: V. Self-consistent field studies including the Breit interaction. J. Phys. B: At. Mol. Phys. 20, 1413–1422 (1987)
32. Kim, Y.K.: Relativistic self-consistent Field theory for closed-shell atoms. Phys. Rev. 154, 17–39 (1967)
33. Selecta of E.H. Lieb. The stability of matter: From atoms to stars. Edited by W. Thirring, 2nd edition, Berlin–Heidelberg–New York: Springer, 1997
34. Lieb, E.H., Loss, M., Siedentop, H.: Stability of relativistic matter via Thomas-Fermi theory. Helvetica Physica Acta 69 (5–6), 974–984 (1996)
35. Lieb, E.H., Siedentop, H., Solovej, J.P.: Stability and instability of relativistic electrons in classical electromagnetic fields. J. Stat. Phys. 89 (1-2), 37–59 (1997)
36. Lieb, E.H., Simon, B.: The Hartree–Fock theory for Coulomb systems. Commun. Math. Phys. 53, 185–194 (1977)
37. Lieb, E.H., Yau, H.-T.: The stability and instability of relativistic matter. Commun. Math. Phys. 118, 177–213 (1988)
38. Lindgren, I., Rosen, A.: Relativistic self-consistent field calculations. Case Stud. At. Phys. 4, 93–149 (1974)
39. Lions, P.-L.: Solutions of Hartree–Fock equations for Coulomb systems. Commun. Math. Phys. 109, 33–97 (1987)
40. Majer, P., Terracini, S.: Periodic solutions to some problems of n-body type. Arch. Rat. Mech. Anal. 124, 381–404 (1993)
41. Mittleman, M.H.: Theory of relativistic effects on atoms: Configuration-space Hamiltonian. Phys. Rev. A 24 (3), 1167–1175 (1981)
42. Rabinowitz, P.H.: Periodic solutions of Hamiltonian systems. CPAM 31, 157–184 (1978)
43. Sucher, J.: Foundations of the relativistic theory of many-particle atoms. Phys. Rev. A 22 (2), 348–362 (1980)
44. Sucher, J.: Relativistic many-electron Hamiltonians. Phys. Scripta 36, 271–281 (1987)
45. Swirles, B.: The relativistic self-consistent field. Proc. Roy. Soc. A 152, 625–649 (1935)
46. Thaller, B.: The Dirac equation. Berlin–Heidelberg–New York: Springer-Verlag, 1992
47. Tix, C.: Strict positivity of a relativistic Hamiltonian due to Brown and Ravenhall. Bull. London Math. Soc. 30 (3), 283–290 (1998)
48. Tix, C.: Lower bound for the ground state energy of the no-pair Hamiltonian. Phys. Lett. B 405, 293–296 (1997)
49. Viterbo, C.: Indice de Morse des points critiques obtenus par minimax. A.I.H.P. Analyse non linéaire 5 (3), 221–225 (1988)

Communicated by B. Simon
Commun. Math. Phys. 203, 531 – 549 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
The Moduli of Flat PGL(2, R) Connections on Riemann Surfaces
Eugene Z. Xia
Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA. E-mail:
[email protected] Received: 2 January 1997 / Accepted: 28 November 1998
Abstract: Suppose $X$ is a compact Riemann surface with genus $g > 1$. Each class $[\sigma] \in \mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))/\mathrm{PGL}(2, \mathbb{R})$ is associated with the first and second Stiefel–Whitney classes $w_1([\sigma])$ and $w_2([\sigma])$. The set of representation classes with a fixed $w_1 \ne 0$ has two connected components. These two connected components are characterized by $w_2$ being 0 or 1. For each fixed $w_1 \ne 0$, we prove that the component characterized by $w_2 = 0$ contains an open dense set diffeomorphic to the total space of a vector bundle of rank $2g - 2$ over a once punctured algebraic torus of dimension $g - 1$. The other component, characterized by $w_2 = 1$, contains an open dense set diffeomorphic to the total space of a vector bundle of rank $2g - 2$ over an algebraic torus of dimension $g - 1$.

1. Introduction

Let $X$ be a compact Riemann surface of genus $g > 1$, and $\mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))$ the space of homomorphisms from $\pi_1(X)$ to $\mathrm{PGL}(2, \mathbb{R})$. The group $\mathrm{PGL}(2, \mathbb{R})$ has two connected components and is isomorphic to $\mathrm{SO}(2, 1)$. The space $\mathrm{Hom}(\pi_1(X), \mathrm{PSL}(2, \mathbb{R}))$ has $4g - 3$ connected components and these components are distinguished by the Euler class $e$ [6,9,10,18]. To obtain more detailed information on these representation spaces, Hitchin made use of the complex structure on $X$. By studying the space of rank-2 Higgs bundles over $X$, he showed that the $2g - 2$ connected components (corresponding to non-zero Euler classes) of $\mathrm{Hom}(\pi_1(X), \mathrm{PSL}(2, \mathbb{R}))/\mathrm{PSL}(2, \mathbb{R})$ are complex vector bundles over symmetric products of $X$ [9].
Let $\alpha \in H^1(X, \mathbb{Z}_2)$ and define
$$PW_\alpha = \{\sigma \in \mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R})) : w_1(\sigma) = \alpha\}.$$
For any two non-zero classes $\alpha, \beta \in H^1(X, \mathbb{Z}_2)$, $PW_\alpha$ is homeomorphic to $PW_\beta$ [19]. Fix a non-zero class $\alpha$ and define $PW$ to be $PW_\alpha$. Then $PW$ has two connected components distinguished by the two Stiefel–Whitney classes in $H^2(X, \mathbb{Z}_2)$ [19].

This paper is a study of the topology of the space $\mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))/\mathrm{PGL}(2, \mathbb{R})$, in particular the component $PW/\mathrm{PGL}(2, \mathbb{R})$. Each representation $\sigma \in PW$ may be lifted to an element $\tilde{\sigma} = \pi^*(\sigma) \in \mathrm{Hom}(\pi_1(\tilde{X}), \mathrm{SL}(2, \mathbb{R}))$, where
$$\pi : \tilde{X} \longrightarrow X$$
is a chosen unramified double cover of $X$. Let $PW'$ be the subset of $PW$ such that $\sigma \in PW'$ implies that $\sigma$ is irreducible and $\tilde{\sigma}$ is a semi-simple but non-central representation. In particular, $PW'$ is open and dense in $PW$. For a precise description of $PW'$, see Sect. 2.

Theorem 1.1. The space $PW'/\mathrm{PGL}(2, \mathbb{R})$ has two connected components $PQ_0$ and $PQ_1$. The component $PQ_0$ is the total space of a vector bundle of rank $2g - 2$ over a once punctured compact algebraic torus of dimension $g - 1$. The component $PQ_1$ is the total space of a vector bundle of rank $2g - 2$ over an algebraic torus of dimension $g - 1$.

The precise description of these two components is given in Sects. 3 and 6.

Corollary 1.2. The space $\mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))$ has $2^{2g+1} + 4g - 5$ connected components.

A representation is called parabolic if it is reducible but not semi-simple. The set $PW \setminus PW'$ consists of representations of three types:
1. The $\mathbb{R}^*$ representations.
2. The parabolic representations.
3. The representations that lift to parabolic representations by $\pi^*$.
Together these points form a subvariety of $PW$.

2. The Pull-Back Representations of $\pi_1(\tilde{X})$

Let $X$ be a compact Riemann surface of genus $g > 1$ and $G$ an algebraic group. A representation $\sigma \in \mathrm{Hom}(\pi_1(X, x), G)$ defines a flat $G$-bundle $P$ over $X$. Let
$$\begin{aligned} \mathrm{SL}^-(2, \mathbb{R}) &= \{g \in \mathrm{GL}(2, \mathbb{R}) : \det(g) = -1\}, \\ \mathrm{SL}_i(2, \mathbb{R}) &= \mathrm{SL}(2, \mathbb{R}) \cup i\,\mathrm{SL}^-(2, \mathbb{R}), \\ \mathrm{SL}_{\pm i}(2, \mathbb{R}) &= \mathrm{SL}_i(2, \mathbb{R}) \cup i\,\mathrm{SL}_i(2, \mathbb{R}). \end{aligned}$$
Then $\mathrm{SL}_i(2, \mathbb{R})$ is a subgroup of $\mathrm{SL}(2, \mathbb{C})$ and has two connected components. The projectivization of both $\mathrm{SL}_i(2, \mathbb{R})$ and $\mathrm{SL}_{\pm i}(2, \mathbb{R})$ is $\mathrm{PGL}(2, \mathbb{R})$.
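The count in Corollary 1.2 can be cross-checked against the facts quoted in the introduction. The sketch below is one possible reading of that count, not the paper's proof: it assumes that representations with $w_1 = 0$ are exactly those in $\mathrm{Hom}(\pi_1(X), \mathrm{PSL}(2, \mathbb{R}))$, with $4g - 3$ components, that each of the $2^{2g} - 1$ non-zero classes in $H^1(X, \mathbb{Z}_2)$ contributes two components, and that these pieces are unions of components. The function name is illustrative.

```python
def components_pgl2r(g):
    """Connected-component count of Hom(pi_1(X), PGL(2, R)) for genus g > 1."""
    if g <= 1:
        raise ValueError("the statement is for genus g > 1")
    w1_zero = 4 * g - 3                  # components of Hom(pi_1(X), PSL(2, R))
    nonzero_classes = 2 ** (2 * g) - 1   # non-zero elements of H^1(X, Z_2) = Z_2^(2g)
    w1_nonzero = 2 * nonzero_classes     # two components (w_2 = 0 or 1) per class
    return w1_zero + w1_nonzero

count_genus_2 = components_pgl2r(2)   # 5 + 2 * 15
```

Summing the two contributions gives $(4g - 3) + 2(2^{2g} - 1) = 2^{2g+1} + 4g - 5$, in agreement with the corollary.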
The obstruction classes of $P$ give rise to the obstruction class maps
$$o_n : \mathrm{Hom}(\pi_1(X, x), G) \longrightarrow H^n(X, \pi_{n-1}(G)).$$
In particular, if $G$ is $\mathrm{GL}(2, \mathbb{R})$, $\mathrm{PGL}(2, \mathbb{R})$ or $\mathrm{SL}_i(2, \mathbb{R})$, then $o_1$ is the first Stiefel–Whitney class $w_1$ [17,19]. The class $o_2$ is the second Stiefel–Whitney class $w_2$ when $G$ is $\mathrm{PSL}(2, \mathbb{C})$ and the Euler class $e$ when $G$ is $\mathrm{PSL}(2, \mathbb{R})$ [6,17].

Fix a point $x \in X$. Choose a set of generators $S = \{a_i, b_i\}_{i=0}^{g-1}$ for the fundamental group $\pi_1(X, x)$ and define $R$ to be the formal expression
$$\prod_{i=0}^{g-1} a_i b_i a_i^{-1} b_i^{-1}.$$
Then $\pi_1(X, x)$ is generated by $S$ with the relation $R = 1$. Let $\Gamma$ be the central extension of $\pi_1(X, x)$ by $c$ with $R = c$ and $c^2 = 1$. This gives the exact sequence
$$0 \longrightarrow \mathbb{Z}_2 \longrightarrow \Gamma \longrightarrow \pi_1(X, x) \longrightarrow 1.$$
Let $M$ be the space $\mathrm{Hom}(\Gamma, \mathrm{SL}(2, \mathbb{C}))$, which has two connected components depending on whether $c$ goes to $I$ or $-I$ [2,6,9]. Denote the two components by $M_0$ and $M_1$, respectively. Note $M_0$ is the space $\mathrm{Hom}(\pi_1(X, x), \mathrm{SL}(2, \mathbb{C}))$. The space $N = \mathrm{Hom}(\Gamma, \mathrm{SL}(2, \mathbb{R}))$ has $4g - 3$ connected components consisting of $\{N_j\}_{j=2-2g}^{2g-2}$ [6,9] and is a subset of $\mathrm{Hom}(\Gamma, \mathrm{SL}(2, \mathbb{C}))$. Each $\sigma \in \mathrm{Hom}(\Gamma, \mathrm{SL}(2, \mathbb{C}))$ acts on $\mathbb{C}^2$ via the standard representation of $\mathrm{SL}(2, \mathbb{C})$.

Definition 2.1. A representation $\sigma$ is irreducible if its action on $\mathbb{C}^2$ is irreducible, and is semi-simple if it is a direct sum of irreducible representations.

Identify $H^2(X, \mathbb{Z}_2)$ with $\mathbb{Z}_2$ and $H^2(X, \mathbb{Z})$ with $\mathbb{Z}$. Let $J_2(X)$ be the space of central representations:
$$J_2(X) = \mathrm{Hom}(\pi_1(X, x), \{\pm I\}) \cong \mathbb{Z}_2^{2g}.$$
Define
$$\begin{aligned} PM &= \mathrm{Hom}(\pi_1(X, x), \mathrm{PSL}(2, \mathbb{C})), \\ PM_i &= w_2^{-1}(i) \subset PM, \\ PN &= \mathrm{Hom}(\pi_1(X, x), \mathrm{PSL}(2, \mathbb{R})), \\ PN_j &= e^{-1}(j) \subset PN. \end{aligned}$$
The space $J_2(X)$ is a group and acts on the $M_i$'s and $N_i$'s. The quotients are precisely the $PM_i$'s and $PN_i$'s (the projective representations).

Remark 2.2. We shall use the superscripts $s$ and $ss$ to denote the irreducible and semi-simple subspaces. For example, $M^s$ and $M^{ss}$ denote the subspaces of irreducible and semi-simple representations in $M$, respectively.

Definition 2.3.
$$\begin{aligned} W &= \{\sigma \in \mathrm{Hom}(\Gamma, \mathrm{SL}_i(2, \mathbb{R})) : \sigma(a_0) \in i\,\mathrm{SL}^-(2, \mathbb{R}) \text{ and } \sigma(S \setminus \{a_0\}) \subset \mathrm{SL}(2, \mathbb{R})\}, \\ W_0 &= \{\sigma \in W : \sigma(c) = I\}, \\ W_1 &= \{\sigma \in W : \sigma(c) = -I\}. \end{aligned}$$
The subspaces $W_0$ and $W_1$ are the ones associated with the second Stiefel–Whitney class $w_2$ being 0 and 1, respectively. The group $J_2(X) \cap W$ acts on $W$. Denote by $PW$, $PW_0$, $PW_1$ the respective quotient spaces. The sets $PW_0$ and $PW_1$ are connected [19].

There exists a double cover $\tilde{X}$ of $X$ with covering map [1]
$$\pi : \tilde{X} \longrightarrow X$$
such that $\pi_1(\tilde{X}, \tilde{x})$ is generated by $\tilde{S} = \{\tilde{a}_i, \tilde{b}_i\}_{i=1-g}^{g-1}$ with
$$\pi_*(\tilde{a}_0) = a_0^2, \qquad \pi_*(\tilde{b}_0) = b_0,$$
and for $i > 0$,
$$\pi_*(\tilde{a}_i) = a_0^{-1}\pi_*(\tilde{a}_{-i})a_0 = a_i, \qquad \pi_*(\tilde{b}_i) = a_0^{-1}\pi_*(\tilde{b}_{-i})a_0 = b_i.$$
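The index range $1 - g \le i \le g - 1$ for the generators of $\pi_1(\tilde{X}, \tilde{x})$ is consistent with the genus of an unramified double cover. The sketch below checks this, using the standard fact that the Euler characteristic doubles under such a cover, so that $2 - 2\tilde{g} = 2(2 - 2g)$; the function names are illustrative and not notation from the paper.

```python
def double_cover_genus(g):
    """Genus of an unramified double cover of a genus-g surface."""
    # Euler characteristic doubles: 2 - 2*g_cover = 2 * (2 - 2*g), so g_cover = 2*g - 1.
    return 2 * g - 1

def generator_indices(g):
    """Indices i with 1 - g <= i <= g - 1 labelling the pairs (a~_i, b~_i)."""
    return list(range(1 - g, g))

def generator_count(g):
    """Number of generators a~_i, b~_i listed for pi_1 of the cover."""
    return 2 * len(generator_indices(g))
```

A closed surface of genus $\tilde{g}$ has $2\tilde{g}$ standard generators, and with $\tilde{g} = 2g - 1$ this matches the $2(2g - 1)$ generators in $\tilde{S}$.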
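The double cover admits a fixed point free involution $\tau$ that is $\pi$-invariant, i.e., the diagram
$$\begin{array}{ccc} \tilde{X} & \xrightarrow{\ \tau\ } & \tilde{X} \\ \downarrow \pi & & \downarrow \pi \\ X & \xrightarrow{\ \mathrm{id}\ } & X \end{array}$$
commutes. Composition of representations $\sigma \in PW$ with the induced map
$$\pi_* : \pi_1(\tilde{X}, \tilde{x}) \longrightarrow \pi_1(X, x)$$
defines a map
$$\pi^* : PW \longrightarrow \mathrm{Hom}(\pi_1(\tilde{X}, \tilde{x}), \mathrm{PSL}(2, \mathbb{R})).$$

Proposition 2.4. The image of $\pi^*$ consists of representations $\tilde{\sigma}$ satisfying $e(\tilde{\sigma}) = 0$.

Proof. Let $\tilde{P} = \pi^*(P)$. Then $\tilde{P}$ admits an involution:
$$\begin{array}{ccc} \tilde{P} & \xrightarrow{\ \tau^*\ } & \tilde{P} \\ \downarrow & & \downarrow \\ \tilde{X} & \xrightarrow{\ \tau\ } & \tilde{X}. \end{array}$$
Since $\sigma(a_0) \in i\,\mathrm{SL}^-(2, \mathbb{R})$, the associated flat principal bundle $P$ is not orientable. Hence $\tau^*$ must reverse orientations on $\tilde{P}$. Since $\tau$ preserves orientations on $\tilde{X}$, but $\tau^*$ reverses orientations on $\tilde{P}$,
$$e(\tilde{\sigma}) = \tau^*(e(\tilde{\sigma})) = e(\tau^*(\tilde{\sigma})) = -e(\tilde{\sigma}).$$
This implies $e(\tilde{\sigma}) = 0$. □

Corollary 2.5. The representation $\tilde{\sigma}$ may be further lifted to a representation in $\mathrm{Hom}(\pi_1(\tilde{X}, \tilde{x}), \mathrm{SL}(2, \mathbb{R}))$.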
Proof. The obstruction to lifting is the mod-2 reduction of the Euler class $e(\tilde{\sigma})$, which is zero by Proposition 2.4. □

Let $F_0$ be the group $\pi_*(\pi_1(\tilde{X}, \tilde{x}))$, which has index 2 in $\pi_1(X, x)$, and let $F_1 = \pi_1(X, x) \setminus F_0$. Consider the following homomorphisms of groups
$$\pi_1(\tilde{X}, \tilde{x}) \xrightarrow{\ \pi_*\ } F_0 \xrightarrow{\ i\ } \pi_1(X, x).$$
These homomorphisms are injective. Therefore, a representation $\sigma \in W$ induces a representation
$$\tilde{\sigma} : \pi_1(\tilde{X}, \tilde{x}) \longrightarrow \mathrm{SL}(2, \mathbb{R}).$$
This defines a map
$$\pi^* : W \longrightarrow \mathrm{Hom}(\pi_1(\tilde{X}, \tilde{x}), \mathrm{SL}(2, \mathbb{R})).$$
The map $\pi^*$ is equivariant with respect to the action of $\mathrm{PGL}(2, \mathbb{R})$ and thus descends to a map
$$\pi^* : W/\mathrm{PGL}(2, \mathbb{R}) \longrightarrow \mathrm{Hom}(\pi_1(\tilde{X}, \tilde{x}), \mathrm{SL}(2, \mathbb{R}))/\mathrm{PGL}(2, \mathbb{R}).$$

Definition 2.6.
$$\begin{aligned} W^s &= \{\sigma \in W : \sigma \text{ is irreducible}\}, \\ W'' &= \{\sigma \in W^s : \pi^*(\sigma) \text{ is irreducible}\}, \\ W' &= \{\sigma \in W^s : \pi^*(\sigma) \text{ is semi-simple, and } \sigma(a_0^2) \ne \pm I\} \cup W'', \\ PW^s &= (J_2(X) \cap W)\backslash W^s, \\ PW' &= (J_2(X) \cap W)\backslash W'. \end{aligned}$$
The subspaces associated with $w_2$ being 0 and 1 are denoted by the subscripts 0 and 1.

Proposition 2.7. The subspace $W'$ is open and dense in $W$.

Proof. The space $W^s$ is smooth and open and dense in $W$. The subvariety $W^s \setminus W'$ has real codimension at least 1 in $W^s$. Hence $W'$ is open and dense in $W^s$ and is, therefore, open and dense in $W$. □

Corollary 2.8. The subspace $PW'$ is open and dense in $PW$.

Proposition 2.9. The projection $\pi^*$ is a 2-to-1 map on $W'$ and the two points in each fibre differ by a central representation.

Proof. Let $\sigma_1, \sigma_2 \in W'$ such that $\tilde{\sigma} = \pi^*(\sigma_1) = \pi^*(\sigma_2)$. Then
$$\tilde{\sigma} \circ \pi_*(\tilde{a}_0) = \sigma_1(a_0^2) = \sigma_2(a_0^2).$$
Case (1). Suppose $\sigma_1(a_0^2) = \sigma_2(a_0^2) \ne \pm I$. Then there are exactly two elements $\pm A \in \mathrm{SL}(2, \mathbb{R})$ such that
$$(\pm A)^2 = \sigma_1(a_0)^2 = \sigma_1(a_0^2).$$
Hence $\sigma_2(a_0) = \pm A = \pm\sigma_1(a_0)$. Hence the inverse image of $\tilde{\sigma}$ by $\pi^*$ has two points and these two points differ by a central representation.

Case (2). Let $\sigma_1, \sigma_2 \in W''$. Then $\tilde{\sigma}$ is irreducible. Hence $\sigma_1|_{F_0} = \sigma_2|_{F_0}$ is irreducible. Let $d \in F_0$. Then $a_0^{-1} d a_0 \in F_0$. This implies
$$\sigma_1(a_0^{-1} d a_0) = \sigma_2(a_0^{-1} d a_0).$$
Hence
$$\sigma_2(a_0)\sigma_1(a_0^{-1})\sigma_1(d) = \sigma_2(d)\sigma_2(a_0)\sigma_1(a_0^{-1}).$$
That is, $\sigma_2(a_0)\sigma_1(a_0^{-1})$ intertwines $\sigma_1|_{F_0}$. Since $\sigma_1|_{F_0}$ is irreducible, by Schur's lemma, $\sigma_2(a_0)\sigma_1(a_0^{-1})$ is in the center of $\mathrm{SL}(2, \mathbb{R})$. Thus, $\sigma_2(a_0) = \pm\sigma_1(a_0)$. □

Corollary 2.10. The map $\pi^*$ is 1-to-1 on $PW'$.

Corollary 2.11.
1. The map $\pi^*$ is 2-to-1 on $W'/\mathrm{PGL}(2, \mathbb{R})$ and the two points in each fibre differ by a central representation.
2. The map $\pi^*$ is 1-to-1 on $PW'/\mathrm{PGL}(2, \mathbb{R})$.

3. The Prym Variety over $\tilde{X}$

Consider the given complex structure on $X$ and denote by $K$ its canonical bundle. The projection $\pi$ induces a complex structure on $\tilde{X}$ and the free involution $\tau$ preserves this structure. Any holomorphic bundle $E$ over $X$ pulls back to a holomorphic bundle $\tilde{E}$ over $\tilde{X}$ such that $\tau^*(\tilde{E}) = \tilde{E}$. In particular, $\tau^*\tilde{K} = \tilde{K}$, where $\tilde{K}$ is the canonical bundle on $\tilde{X}$. Let $\mathrm{Div}^0(X)$ denote the group of all degree zero divisors on $X$. The Jacobi variety $J(X)$ is the space of holomorphic line bundles over $X$ with degree zero [1].

For any holomorphic line bundle $L$ over $X$, $\pi^* L$ is a holomorphic line bundle over $\tilde{X}$. Hence $\pi$ induces a homomorphism
$$\pi^* : J(X) \longrightarrow J(\tilde{X}).$$
If $D \in \mathrm{Div}^0(X)$, then $\pi^{-1}(D) \in \mathrm{Div}^0(\tilde{X})$. The resulting homomorphism
$$\pi^* : \mathrm{Div}^0(X) \longrightarrow \mathrm{Div}^0(\tilde{X})$$
together with the basic epimorphism $u$ satisfy the commutative diagram [1]:
$$\begin{array}{ccc} \mathrm{Div}^0(X) & \xrightarrow{\ \pi^*\ } & \mathrm{Div}^0(\tilde{X}) \\ \downarrow u & & \downarrow u \\ J(X) & \xrightarrow{\ \pi^*\ } & J(\tilde{X}) \end{array}$$
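On the other hand, if $\tilde{D} \in \mathrm{Div}^0(\tilde{X})$, then $\pi(\tilde{D}) \in \mathrm{Div}^0(X)$. Hence $\pi$ also induces a homomorphism (the norm map)
$$\mathrm{Nm} : \mathrm{Div}^0(\tilde{X}) \longrightarrow \mathrm{Div}^0(X).$$
The map $\mathrm{Nm}$ descends to a homomorphism
$$\mathrm{Nm} : J(\tilde{X}) \longrightarrow J(X)$$
and the diagram
$$\begin{array}{ccc} \mathrm{Div}^0(\tilde{X}) & \xrightarrow{\ \mathrm{Nm}\ } & \mathrm{Div}^0(X) \\ \downarrow u & & \downarrow u \\ J(\tilde{X}) & \xrightarrow{\ \mathrm{Nm}\ } & J(X) \end{array}$$
commutes [1].

For $\tilde{D} \in \mathrm{Div}^0(\tilde{X})$, $\tau^{-1}(\tilde{D})$ is in $\mathrm{Div}^0(\tilde{X})$. Hence $\tau$ induces automorphisms $\tau^*$ on the groups $\mathrm{Div}^0(\tilde{X})$ and $J(\tilde{X})$ such that the diagram
$$\begin{array}{ccc} \mathrm{Div}^0(\tilde{X}) & \xrightarrow{\ \tau^*\ } & \mathrm{Div}^0(\tilde{X}) \\ \downarrow u & & \downarrow u \\ J(\tilde{X}) & \xrightarrow{\ \tau^*\ } & J(\tilde{X}) \end{array}$$
commutes. Let
$$\begin{aligned} \mathcal{P} &= \{\tilde{L} \in J(\tilde{X}) : \tau^*(\tilde{L}) = -\tilde{L}\}, \\ \mathcal{S} &= \{\tilde{L} \in J(\tilde{X}) : \tau^*(\tilde{L}) = \tilde{L}\}. \end{aligned}$$

Remark 3.1.
1. The space $\mathcal{S}$ is an abelian variety of dimension $g$.
2. The identity component $\mathcal{P}_0$ of $\mathcal{P}$ is, by definition, the Prym variety $\mathrm{Prym}(\tilde{X}, \tau)$.
3. The subgroups of 2-torsions of $J(X)$ and $J(\tilde{X})$ are precisely $J_2(X)$ and $J_2(\tilde{X})$.

The group $\mathcal{P}$ contains the subgroup $J_2(\tilde{X}) \cap \mathcal{P}$ and the quotient is denoted by
$$P\mathcal{P} = (J_2(\tilde{X}) \cap \mathcal{P})\backslash\mathcal{P}.$$

Proposition 3.2.
1. The kernel of the map $\pi^* : J(X) \longrightarrow J(\tilde{X})$ has two points, namely the trivial bundle 1 and a two torsion $T_\eta$.
2. $|\mathcal{P} \cap J_2(\tilde{X})| = 2^{2g}$.
3. $\mathcal{P}$ has four connected components and $P\mathcal{P}$ is connected.
4. $\mathcal{P}$ contains $\mathrm{Ker}(\mathrm{Nm})$ as a subgroup of index 2.
5. If $\deg(\tilde{L}_0) = 2$ such that $\tau^*\tilde{L}_0 = \tilde{L}_0$, then there exists $\tilde{L}_1$ such that $\tilde{L}_1^2 = \tilde{L}_0$ and $\tau^*(\tilde{L}_1) = \tilde{L}_1 \otimes \tilde{T}$, where $\tilde{T} \in \mathrm{Ker}(\mathrm{Nm}) \setminus \mathcal{P}_0$.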
Proof. For 1, 2, 3 and 4, see [1,11]. Suppose $\deg(\tilde{L}_0) = 2$. Then it is immediate that there exists $\tilde{L}_1$ such that $\tilde{L}_1^2 = \tilde{L}_0$. Since $\tau^*(\tilde{L}_0) = \tilde{L}_0$,
$$(\tau^*(\tilde{L}_1))^2 = \tau^*(\tilde{L}_1^2) = \tau^*(\tilde{L}_0) = \tilde{L}_0 = \tilde{L}_1^2.$$
Hence,
$$\tau^*(\tilde{L}_1) = \tilde{L}_1 \otimes \tilde{T}$$
for some $\tilde{T} \in J_2(\tilde{X})$. In addition, since $\tilde{T} = \tilde{L}_1 \otimes (\tau^*(\tilde{L}_1))^{-1}$ and $\deg(\tilde{L}_1) = 1$, $\tilde{T} \in \mathrm{Ker}(\mathrm{Nm}) \setminus \mathcal{P}_0$ [1]. □

4. Stable Holomorphic Pairs and the Self-Dual Equation

This section briefly reviews the rank-2 gauge theory over Riemann surfaces. The main results are due to Corlette, Hitchin and Donaldson. See [2–5,9] for details. For general smooth projective varieties, see [13–16].

4.1. The complex case. The maximum compact subgroups of $\mathrm{GL}(2, \mathbb{C})$ and $\mathrm{SL}(2, \mathbb{C})$ are $U(2)$ and $SU(2)$ with fundamental groups isomorphic to $\mathbb{Z}_2$. Let $P^c$ be a principal $\mathrm{GL}(2, \mathbb{C})$ bundle over a compact Riemann surface with first Chern class $c_1(P^c)$ being either 0 or 1. Let $V$ be the associated vector bundle. Fix a Hermitian metric $h$ on $V$. This corresponds to a reduction of $P^c$ to a $U(2)$ principal bundle $P$ over $X$. Choose a $U(2)$ (i.e. compatible with $h$) connection $D_0$ on $V$ such that the curvature $F(D_0)$ is central [2,9]. In addition, in the case of $c_1(P^c) = 0$, we choose $h$ to be the constant metric 1 and $D_0 = d$. Denote by $\mathcal{G}^c$ the $\mathrm{SL}(2, \mathbb{C})$ gauge group on $P^c$ and $\mathcal{G}$ the $SU(2)$ gauge group on $P$. The gauge group $\mathcal{G}$ preserves $h$. Let
$$\mathrm{ad}(P) = P \times_{\mathrm{Ad}} \mathfrak{su}(2), \qquad \mathrm{ad}(P^c) = P^c \times_{\mathrm{Ad}} \mathfrak{sl}(2, \mathbb{C}),$$
where $\mathrm{Ad}$ is the adjoint representation. The difference of any two connections on $P^c$ or $P$ is a 1-form. Hence, with the choice of $D_0$, one may identify $\Omega^1(X, \mathrm{ad}(P^c))$ and $\Omega^1(X, \mathrm{ad}(P))$ with the space of connections of the fixed determinant $\det(D_0)$ on $P^c$ and $P$, respectively. An element $\Phi$ of $\Omega^{1,0}(X, \mathrm{ad}(P^c))$ is called a Higgs field. Given $\Phi \in \Omega^{1,0}(X, \mathrm{ad}(P^c))$ and $A \in \Omega^1(X, \mathrm{ad}(P))$, one may construct connections $D_A$ and $D$:
$$D_A = D_0 + A, \qquad D = D_0 + A + (\Phi + \Phi^*),$$
where $\Phi^*$ denotes the adjoint of $\Phi$ with respect to the metric $h$.

The $(0, 1)$ part of $D_0$ determines a holomorphic structure $\bar{\partial}_0$ on $V$ [7]. Again, let $A \in \Omega^1(X, \mathrm{ad}(P))$, i.e. $D_A$ is compatible with $h$. Then $\bar{\partial}_A$, the $(0, 1)$ part of $D_A$, defines a holomorphic structure on $V$. Similarly, given a holomorphic structure $\bar{\partial}$ on $V$ with
$$\det(\bar{\partial}) = \det(\bar{\partial}_0),$$
there exists a unique $A \in \Omega^1(X, \mathrm{ad}(P))$ such that $D_A$ is compatible with $h$ and
$$\bar{\partial}_A = \bar{\partial}.$$
Hence the metric $h$ determines a one-to-one correspondence between the space $\Omega^1(X, \mathrm{ad}(P))$ and the space of holomorphic structures on $V$ with determinant equal to $\det(\bar{\partial}_0)$.

Higgs fields are sections of the bundle $\mathrm{End}_0 V \otimes K$, where $\mathrm{End}_0 V$ is the bundle of trace free complex linear transformations of $V$, and $K$ is the canonical bundle on $X$. A holomorphic structure $\bar{\partial}$ on $V$ induces on $\mathrm{End}_0 V$ a holomorphic structure which, when combined with the inherent holomorphic structure on $K$, gives a holomorphic structure (which we shall also call $\bar{\partial}$) on $\mathrm{End}_0 V \otimes K$. A Higgs field $\Phi$ is holomorphic if
$$\bar{\partial}\Phi = 0.$$
When $\Phi$ is holomorphic, we say $(\bar{\partial}, \Phi)$ is a Higgs bundle. A pair $(D_A, \Phi)$ is holomorphic if
$$\bar{\partial}_A\Phi = 0.$$
Therefore, the set $HC$ of holomorphic $(D_A, \Phi)$ pairs corresponds bijectively to the set $Hig$ of Higgs bundles $(\bar{\partial}, \Phi)$.

The complex gauge group $\mathcal{G}^c$ acts on $Hig$ naturally. Since $\mathcal{G} \subset \mathcal{G}^c$, $\mathcal{G}$ acts on the Higgs fields. Hence $\mathcal{G}$ acts on $HC$. The group $\mathcal{G}^c$ is much larger than $\mathcal{G}$; hence, one can expect the space $HC/\mathcal{G}$ to be much larger than the space $Hig/\mathcal{G}^c$. The key issue of this analysis is to establish an equivalence between the stable Higgs bundles in $Hig/\mathcal{G}^c$ and the irreducible pairs in $HC/\mathcal{G}$ satisfying Hitchin's self-duality equation.

A holomorphic subbundle $L$ of $(V, \bar{\partial})$ is $\Phi$-invariant if
$$\Phi(L) \subseteq L \otimes K.$$
A Higgs bundle $(\bar{\partial}, \Phi)$ is stable (semi-stable) if $L$ being $\Phi$-invariant implies
$$\deg(L) < (\le)\ \frac{1}{2}\deg(V).$$
A Higgs bundle is poly-stable if it is a direct sum of stable Higgs bundles of the same degree. Denote by $Hig^s$ and $Hig^{ss}$ the space of stable and poly-stable Higgs bundles on $X$. The action of $\mathcal{G}^c$ preserves $Hig^s$ and $Hig^{ss}$; hence, one may define the moduli spaces
$$\mathcal{H}^s = Hig^s/\mathcal{G}^c, \qquad \mathcal{H}^{ss} = Hig^{ss}/\mathcal{G}^c.$$
The space $\mathcal{H}^{ss}$ is a coarse moduli space parameterizing $\mathcal{G}^c$-equivalence classes of poly-stable Higgs bundles while $\mathcal{H}^s$ is a fine moduli space of stable Higgs bundles [9,12]. Denote by $\mathcal{H}_0^s$, $\mathcal{H}_1^s$, $\mathcal{H}_0^{ss}$, $\mathcal{H}_1^{ss}$ the components of stable and poly-stable Higgs bundles associated with $c_1(P^c)$ being 0 and 1, respectively.

A pair $(D_A, \Phi)$ is irreducible if the connection $D_A + \Phi + \Phi^*$ is irreducible. A pair $(D_A, \Phi)$ is semi-simple if $D_A + \Phi + \Phi^*$ is a direct sum of irreducible connections of the same degree. A holomorphic pair $(D_A, \Phi)$ is called self-dual if it satisfies Hitchin's self-duality equation [9]:
$$F(D_A) + [\Phi, \Phi^*] = \frac{1}{2}\,\mathrm{tr}(F(D_0))\,I.$$
Let $YM^s$ and $YM^{ss}$ denote the spaces of irreducible and semi-simple self-dual pairs. The action of $\mathcal{G}$ preserves both the properties of irreducibility and self-duality; hence, one may define the moduli spaces
$$\mathcal{YM}^s = YM^s/\mathcal{G}, \qquad \mathcal{YM}^{ss} = YM^{ss}/\mathcal{G}.$$
¯ 8) such that its Hitchin showed that each G c orbit in Hs contains a Higgs bundle (∂, corresponding pair (DA , 8) is a self-dual pair with ¯ ∂¯A = ∂. ¯ 8) is unique up to G-equivalence. In other words, the Moreover the Higgs bundle (∂, s two moduli spaces H and YMs are diffeomorphic. Furthermore, given any self-dual pair in Y M s , the connection D = DA + 8 + 8∗ is flat and irreducible for c1 = 0 and descends to a flat PSL(2, C) connection for c1 = 1. From now on, we shall always assume V to have a holomorphic structure and write V0 for ¯ 8). We call the connection ∂¯0 and (V , 8) for a poly-stable Higgs bundle instead of (∂, D, so constructed from a Higgs bundle (V , 8), the connection associated with (V , 8). The 2-torsion subgroup J2 (X) acts on Hss by L.(V , 8) = (L ⊗ V , 8). Theorem 4.1 (Hitchin [9,12]). Hcs Hcss J2 (X)\Hcs J2 (X)\Hcss
∼ = ∼ = ∼ = ∼ =
Mcs / PSL(2, C), Mcss / PSL(2, C), P Mcs / PSL(2, C), P Mcss / PSL(2, C).
Denote by $g$ the identification maps of these spaces. It is straightforward to generalize the notion of stability, semi-stability and poly-stability to Higgs bundles $(V, \Phi)$ with $c_1(V)$ equal to any integer. Define $\mathcal{H}^{ss}_c$ to be the moduli space of $\mathcal{G}^c$-equivalence classes of poly-stable Higgs bundles $(V, \Phi)$ with $c_1(V) = c$. Let $(L_d, D_L)$ be a holomorphic line bundle of degree $d$ with a connection $D_L$. The line bundle $L_d$ defines an isomorphism between the space of holomorphic bundles of a fixed first Chern class $c$ and the space of holomorphic bundles with first Chern class $c + 2d$:
$$L_d \otimes V_c \longmapsto V_{c+2d}.$$
Moreover if $V$ has a connection $D$, then the projective bundles $(P(V), D)$ and $(P(L_d \otimes V), D_L \otimes D)$ are isomorphic. Define
$$U = L_d \otimes V,$$
where $c_1(V)$ is either 0 or 1. Then
$$U \otimes U^* = (L_d \otimes V) \otimes (L_d \otimes V)^* = V \otimes V^*,$$
and
$$\mathrm{End}_0 U \otimes K = \mathrm{End}_0 V \otimes K.$$
Hence $L_d$ defines an isomorphism
$$(V, \Phi) \overset{L_d}{\longmapsto} (L_d \otimes V, \Phi)$$
which is $\mathcal{G}^c$-equivariant, hence, defines an isomorphism from $\mathcal{H}^{ss}_c$ to $\mathcal{H}^{ss}_{c+2d}$.
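The shift of the first Chern class by $2d$ can be checked directly from the behaviour of determinants under tensor products. The following is a short worked computation, assuming only that $V$ has rank 2 and that $\deg(L_d) = d$ as above.

```latex
\begin{align*}
\det(L_d \otimes V) &= L_d^{2} \otimes \det(V), \\
c_1(L_d \otimes V) &= \deg\bigl(L_d^{2} \otimes \det(V)\bigr) = 2d + c_1(V), \\
\operatorname{End}(L_d \otimes V) &= (L_d \otimes V) \otimes (L_d \otimes V)^{*}
  = (L_d \otimes L_d^{-1}) \otimes V \otimes V^{*} = \operatorname{End}(V).
\end{align*}
```

In particular the parity of $c_1$ is unchanged by tensoring with a line bundle, so only the class of $c$ modulo 2 distinguishes the resulting moduli spaces.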
Corollary 4.2. The components $\mathcal{H}^{ss}_c$ and $J_2(X)\backslash\mathcal{H}^{ss}_c$ are homeomorphic to $\mathcal{M}^{ss}_{c_0}/\mathrm{PSL}(2, \mathbb{C})$ and $P\mathcal{M}^{ss}_{c_0}/\mathrm{PSL}(2, \mathbb{C})$, respectively, if $c \equiv c_0 \bmod 2$.

Fix the Hermitian metric $\tilde h = \pi^*(h)$ on $\tilde X$.

Definition 4.3. Construct the above moduli spaces on the double cover $\tilde X$ and denote these objects by a tilde. For example, $\tilde h = \pi^*(h)$ is the pull-back Hermitian metric on $\tilde V$ and $\tilde{\mathcal{H}}^{ss}$ is the coarse moduli space of poly-stable Higgs bundles on $\tilde X$. The involution $\tau$ induces a pull-back action $\tau^*$ on $\tilde{\mathcal{H}}^{ss}$.

Proposition 4.4. The involution $\tau^*$ commutes with $\tilde g$.

Proof. Since $\tau$ preserves $\tilde h$ and the complex structure on $\tilde X$, all the operations involved in the identification map $\tilde g$ commute with $\tau^*$. One can see this locally by choosing an acyclic cover $\{\tilde U_i, \tilde V_i\}$ on $\tilde X$ symmetric with respect to $\tau$ in the sense that
$$\tau(\tilde U_i) = \tilde V_i, \qquad \tau(\tilde V_i) = \tilde U_i, \qquad \tilde U_i \cap \tilde V_i = \emptyset.$$
Such a cover is possible because $\tau$ does not fix any point and preserves the complex structure on $\tilde X$. □

4.2. The real case. Now we turn our attention to the subsets of $\mathcal{H}^s$ and $\mathcal{H}^{ss}$ that correspond to $\mathcal{N}^s/\mathrm{PGL}(2, \mathbb{R})$ and $\mathcal{N}^{ss}/\mathrm{PGL}(2, \mathbb{R})$. We say a Higgs bundle $(V, \Phi)$ satisfies the reality condition or is a real Higgs bundle [9] if

1. There is a holomorphic line bundle $L$ such that $V = L \oplus (L^{-1} \otimes \det(V))$,
2. $\Phi = \Phi_1 \oplus \Phi_2$, where
$$\Phi_1 : L \longrightarrow L^{-1} \otimes \det(V) \otimes K, \qquad \Phi_2 : L^{-1} \otimes \det(V) \longrightarrow L \otimes K,$$
i.e. $\Phi$ is of the form
$$\begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix},$$
where $b$ and $c$ are holomorphic sections of the bundles $L^2 \otimes K \otimes \det(V)^{-1}$ and $L^{-2} \otimes K \otimes \det(V)$, respectively.
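The bundles in which $b$ and $c$ live follow from the identity $\operatorname{Hom}(A, B) = A^{-1} \otimes B$ for line bundles $A$, $B$. The sketch below spells this out; the degree formulas additionally assume that $X$ has genus $g$, so that $\deg K = 2g - 2$.

```latex
\begin{align*}
b = \Phi_2 &\in H^0\bigl(X, \operatorname{Hom}(L^{-1}\otimes\det V,\; L\otimes K)\bigr)
       = H^0\bigl(X, L^{2}\otimes K\otimes \det(V)^{-1}\bigr), \\
c = \Phi_1 &\in H^0\bigl(X, \operatorname{Hom}(L,\; L^{-1}\otimes\det V\otimes K)\bigr)
       = H^0\bigl(X, L^{-2}\otimes K\otimes \det(V)\bigr), \\
\deg\bigl(L^{2}\otimes K\otimes \det(V)^{-1}\bigr) &= \bigl(2\deg L - \deg V\bigr) + 2g - 2, \\
\deg\bigl(L^{-2}\otimes K\otimes \det(V)\bigr) &= -\bigl(2\deg L - \deg V\bigr) + 2g - 2 .
\end{align*}
```

Since the matrix is off-diagonal, $\operatorname{tr}\Phi = 0$ automatically, so $\Phi$ is indeed a section of $\mathrm{End}_0 V \otimes K$.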
For such a Higgs bundle $(V, \Phi)$, the line bundle $L$ inherits a Hermitian metric $h_1$ from $h$. Let $D$ be the connection associated to $(V, \Phi)$. The metric $h_1$ defines a bundle isomorphism between $\bar L$ and $L^{-1}$. This induces an anti-holomorphic linear transformation
$$f : V \longrightarrow V \otimes \det(V)^{-1}, \qquad f(u_1, u_2) = (\bar u_2, \bar u_1).$$
In addition, $D$ commutes with $f$. Thus, $D$ is a flat connection on the projective subbundle $PE \subset P(V)$ fixed by $f$. Moreover, $(PE, D)$ is a flat $\mathrm{PSL}(2, \mathbb{R})$-bundle and the Euler class of $PE$ equals $2\deg(L) - \deg(V)$. Let $R\mathcal{H}^{ss}_e$ be the subset of $\mathcal{H}^{ss}$ of poly-stable real Higgs bundles with Euler class $e$ and $R\mathcal{H}^s_e$ the subset of $R\mathcal{H}^{ss}_e$ of stable real Higgs bundles.

Theorem 4.5 (Hitchin [9]). The moduli space $R\mathcal{H}^{ss}_e$ is homeomorphic to $\mathcal{N}^{ss}_e/\mathrm{PSL}(2, \mathbb{R})$, and
$$J_2(X)\backslash R\mathcal{H}^{ss}_e \cong P\mathcal{N}^{ss}_e/\mathrm{PSL}(2, \mathbb{R}).$$
The subspaces $R\mathcal{H}^s_e$ and $J_2(X)\backslash R\mathcal{H}^s_e$ are diffeomorphic to $\mathcal{N}^s_e/\mathrm{PSL}(2, \mathbb{R})$ and $P\mathcal{N}^s_e/\mathrm{PSL}(2, \mathbb{R})$, respectively.

Corollary 4.6. Tensoring with a line bundle $L_d$ of degree $d$ gives a one-to-one correspondence between the real Higgs bundles in $\mathcal{H}^{ss}_c$ with Euler class $e$ and the real Higgs bundles in $\mathcal{H}^{ss}_{c+2d}$ with Euler class $e$.

5. Stability

This section is a study of stability criteria of real Higgs bundles.

Proposition 5.1. Suppose $V = L_1 \oplus L_2$, with
$$d = \deg(L_1) = \deg(L_2).$$
Then the bundles $L_1$ and $L_2$ are the only two holomorphic subbundles of $V$ of degree $d$ if and only if $L_1 \neq L_2$.

Proof. If $L_1 = L_2$, then $ts \oplus (1 - t)s$ generates a line subbundle of $V$ for all $t \in [0, 1]$, where $s$ is a meromorphic section of $L_1$ [8]. Suppose $L_1 \neq L_2$. Let $H \subset V$ be a holomorphic line bundle of degree $d$. Then $H$ corresponds to a holomorphic section $\varphi$ of the bundle
$$H^{-1} \otimes V = H^{-1} \otimes L_1 \oplus H^{-1} \otimes L_2$$
such that $\varphi$ has no zero. Thus
$$\varphi = \varphi_1 \oplus \varphi_2,$$
where $\varphi_1$ is a section of $H^{-1} \otimes L_1$ and $\varphi_2$ of $H^{-1} \otimes L_2$. Since $\varphi$ has no poles, neither do $\varphi_1$ and $\varphi_2$. However
$$\deg(H^{-1} \otimes L_1) = \deg(H^{-1} \otimes L_2) = 0,$$
so $\varphi_1$ is either identically zero or has no zero. The same is true with $\varphi_2$. If $\varphi_1 \equiv 0$, then $\varphi_2$ has no zero and $H = L_2$. If $\varphi_2 \equiv 0$, then $\varphi_1$ has no zero and $H = L_1$. If neither $\varphi_1$, $\varphi_2$ has any zero, then $H = L_1$ and $H = L_2$. This is the case of $L_1 = L_2$. □
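The step "either identically zero or has no zero" in the proof above rests on a standard fact about line bundles on a compact Riemann surface. As a brief reminder (with $M$ standing for either $H^{-1} \otimes L_1$ or $H^{-1} \otimes L_2$, a name introduced here only for illustration):

```latex
\begin{align*}
\varphi_i \not\equiv 0 \ \text{holomorphic section of } M
  \quad&\Longrightarrow\quad
  \deg(M) = \sum_{x \in X} \operatorname{ord}_x(\varphi_i) \;\ge\; 0, \\
\deg(M) = 0
  \quad&\Longrightarrow\quad
  \operatorname{ord}_x(\varphi_i) = 0 \ \text{for all } x \in X,
\end{align*}
```

so a nonzero section of a degree-zero line bundle vanishes nowhere and trivializes $M$, which gives $H = L_i$.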
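Proposition 5.2. Suppose $(V, \Phi)$ is a Higgs bundle on $X$ and $\tilde V = \pi^*(V)$. In addition, suppose
$$\tilde V = \tilde L_1 \oplus \tilde L_2$$
with
$$\deg(\tilde L_1) = \deg(\tilde L_2) = d, \qquad \tau^*(\tilde L_1) = \tilde L_2, \qquad \tau^*(\tilde L_2) = \tilde L_1, \qquad \tilde L_1 \neq \tilde L_2.$$
Then $(V, \Phi)$ is stable.

Proof. Suppose $H \subset V$. Then $\tilde H = \pi^*(H) \subset \tilde V$. Hence, by Proposition 5.1, $\tilde H = \tilde L_1$ or $\tilde H = \tilde L_2$ or $\deg(\tilde H) < d$. On the other hand, since $\tau^*(\tilde H) = \tilde H$, it must be the case that $\deg(\tilde H) < d$. This implies $V$ is a stable holomorphic bundle. Hence $(V, \Phi)$ is stable for any $\Phi$. □

6. The Flat PGL(2, R) Structures

Suppose $(V, \Phi)$ is a Higgs bundle on $X$. Then $(V, \Phi)$ pulls back to
$$(\tilde V, \tilde\Phi) = \pi^*(V, \Phi).$$
Proposition 4.4 indicates that one needs to determine the set of stable Higgs bundles of the form $(V, \Phi)$ on $X$ such that $\pi^*(V, \Phi) \in R\tilde{\mathcal{H}}^{ss}_0$ and $\det(V)$ is $\det(V_0)$.

The pull-back $\tilde V_0 = \pi^*(V_0)$ is a holomorphic bundle on $\tilde X$ and $\det(\tilde V_0)$ is a line bundle on $\tilde X$. Suppose $\deg(V_0) = 1$. Then $\deg(\tilde V_0) = 2$. By Proposition 3.2, there exists a line bundle $\tilde L_1$ and a 2-torsion line bundle $\tilde T$ such that
$$\tilde L_1^2 = \det(\tilde V_0), \qquad \tau^*(\tilde L_1) = \tilde L_1 \otimes \tilde T.$$

Definition 6.1.
$$\begin{aligned}
\mathcal{Q}_0 &= \{(\tilde L, \tilde b) : \tilde L \in \mathcal{P},\ \tilde L^2 \neq 1,\ \tilde b \in H^0(\tilde X, \tilde L^2 \tilde K)\}, \\
\mathcal{Q}_1 &= \{(\tilde L \otimes \tilde L_1, \tilde b) : \tilde L \in \mathcal{P},\ \tilde b \in H^0(\tilde X, \tilde L^2 \otimes \tilde T \otimes \tilde K)\}, \\
\mathcal{Q} &= \mathcal{Q}_0 \cup \mathcal{Q}_1.
\end{aligned}$$
The group $J_2(\tilde X) \cap \mathcal{P}$ acts on $\mathcal{Q}$:
$$(J_2(\tilde X) \cap \mathcal{P}) \times \mathcal{Q} \longrightarrow \mathcal{Q}, \qquad (\tilde L_0, (\tilde L, \tilde b)) \longmapsto (\tilde L_0 \otimes \tilde L, \tilde b).$$
The quotients are denoted by $P\mathcal{Q}_0$, $P\mathcal{Q}_1$, $P\mathcal{Q}$, respectively.

Proposition 6.2. The spaces $P\mathcal{W}^0_0/\mathrm{PGL}(2, \mathbb{R})$ and $P\mathcal{W}^0_1/\mathrm{PGL}(2, \mathbb{R})$ are diffeomorphic to $P\mathcal{Q}_0$ and $P\mathcal{Q}_1$, respectively.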
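Proof. Case 1: $w_2 = 0$. Let $\sigma \in \mathcal{W}^0_0/\mathrm{PGL}(2, \mathbb{R})$. Then $\sigma$ corresponds to an element in $\mathcal{H}^s_0$. Let $\tilde\sigma = \pi^*(\sigma)$. By Proposition 2.4 and Theorem 4.5, $\tilde\sigma$ is an $\mathrm{SL}(2, \mathbb{R})$ representation and corresponds to an element in $R\tilde{\mathcal{H}}^{ss}_0$, hence, to a poly-stable Higgs bundle $(\tilde V, \tilde\Phi)$ such that:
$$\pi^*(V) = \tilde V = \tilde L \oplus \tilde L^{-1}, \qquad \pi^*(\Phi) = \tilde\Phi = \tilde\Phi_1 \oplus \tilde\Phi_2 = \begin{pmatrix} 0 & \tilde b \\ \tilde c & 0 \end{pmatrix}$$
with
$$\deg(\tilde L) = 0.$$
Since $\tau$ preserves the degree of any divisor, $\tau^*$ preserves the degree of $\tilde L$. Suppose $\tilde L^2 \neq 1$. By Proposition 5.1, either
$$\tau^*(\tilde L) = \tilde L, \quad \tau^*(\tilde L^{-1}) = \tilde L^{-1} \qquad \text{or} \qquad \tau^*(\tilde L) = \tilde L^{-1}, \quad \tau^*(\tilde L^{-1}) = \tilde L.$$
This implies, after normalizing, the following dichotomy:

1. $\tau^*(\tilde L) = \tilde L$, $\tau^*(\tilde L^{-1}) = \tilde L^{-1}$, $\tau^*(\tilde\Phi_1) = \tilde\Phi_1$, $\tau^*(\tilde\Phi_2) = \tilde\Phi_2$, $\tau^*(\tilde b) = \tilde b$, $\tau^*(\tilde c) = \tilde c$;
2. $\tau^*(\tilde L) = \tilde L^{-1}$, $\tau^*(\tilde L^{-1}) = \tilde L$, $\tau^*(\tilde\Phi_1) = \tilde\Phi_2$, $\tau^*(\tilde\Phi_2) = \tilde\Phi_1$, $\tau^*(\tilde b) = \tilde c$, $\tau^*(\tilde c) = \tilde b$.

Let $\tilde E \subset \tilde V$ be the flat $\mathrm{SL}(2, \mathbb{R})$-bundle fixed by the anti-holomorphic map $\tilde f$. Let $\tilde D$ be the connection associated with $(\tilde V, \tilde\Phi)$.

With the solutions to Eq. 1, $\tau^*$ preserves orientations on $\tilde E$; therefore, the pair $(\tilde E, \tilde D)$ descends to a flat $\mathrm{SL}(2, \mathbb{R})$-bundle $(E, D)$ ($w_1(E) = 0$) of $X$. Note $\tilde b \in H^0(\tilde X, \tilde L^2 \otimes \tilde K)$ and $\tilde c \in H^0(\tilde X, \tilde L^{-2} \otimes \tilde K)$ with $\tilde L \in \mathcal{S}$. Alternatively, $(\tilde V, \tilde\Phi)$ is a lift of a pair $(V, \Phi)$ that satisfies the reality condition. Note
$$\pi^*(V, \Phi) = \pi^*(V \otimes T_\eta, \Phi).$$
Hence the descent from $(\tilde V, \tilde\Phi)$ is not unique. However these two Higgs bundles differ by a 2-torsion; hence, the projectivized bundles are the same.

With the solutions to Eq. 2, $\tau^*$ reverses orientations on $\tilde E$. Hence the pair $(\tilde E, \tilde D)$ descends to a flat $\mathrm{SL}_i(2, \mathbb{R})$-bundle $(E, D)$ on $X$. The two equations on $\tilde L$, $\tilde L^{-1}$ are precisely the condition for $\tilde L$ to be in $\mathcal{P}$.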
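The group $\mathcal{P}$ has four connected components. Let $L_\eta \in J(X)$ such that $L_\eta^2 = T_\eta$. Let $\tilde L_\eta = \pi^*(L_\eta)$. Then
$$\tilde L_\eta^2 = \pi^*(L_\eta)^2 = \pi^*(L_\eta^2) = \pi^*(T_\eta) = 1.$$
In other words, $\tilde L_\eta$ is a 2-torsion in $\mathcal{S}$, hence, it is also in $\mathcal{P}$. Suppose $V$ is a rank-2 holomorphic bundle on $X$ with $\det(V) = 1$ such that $\tilde V = \pi^*(V) = \tilde L \oplus \tilde L^{-1}$, where $\tilde L \in \mathcal{P}$. Then $\tilde L \otimes \tilde L_\eta \in \mathcal{P}$ and
$$\pi^*(V \otimes L_\eta) = \tilde V \otimes \tilde L_\eta = (\tilde L \otimes \tilde L_\eta) \oplus (\tilde L \otimes \tilde L_\eta)^{-1}.$$
However,
$$\det(V \otimes L_\eta) = T_\eta \neq 1.$$
Therefore if $\tilde L \in \mathcal{P}$ and
$$\tilde V = \pi^*(V) = \tilde L \oplus \tilde L^{-1},$$
then either $\det(V) = 1$ or $\det(V) = T_\eta$. Since $\det(V)$ cannot jump on connected components of $\mathcal{P}$, it must be the case that only two components of $\mathcal{P}$ induce vector bundles $V$ on $X$ with $\det(V) = 1$. Denote these two components $\mathcal{P}^0_0$. The components $\mathcal{P} \setminus \mathcal{P}^0_0$ will induce bundles $V$ with determinant $T_\eta$. Hence only the Higgs bundles induced by $\mathcal{P}^0_0$ correspond to $\mathrm{SL}_i(2, \mathbb{R})$ representations.

Remark 6.3. The points in $\mathcal{Q}_0$ correspond to points in the space of $\mathrm{SL}_{\pm i}(2, \mathbb{R})$ representation classes.

Let
$$\mathcal{Q}^0_0 = \{(\tilde L, \tilde b) : \tilde L \in \mathcal{P}^0_0,\ \tilde L^2 \neq 1,\ \tilde b \in H^0(\tilde X, \tilde L^2 \tilde K)\}.$$
By Corollary 2.11, this construction describes a 2-to-1 map
$$\Pi : \mathcal{W}^0_0/\mathrm{PGL}(2, \mathbb{R}) \longrightarrow \mathcal{Q}^0_0.$$
Note the 2-torsions in $\mathcal{P}^0_0$ are excluded because they correspond to the reducible representation classes $[\sigma]$ with $\sigma(a_0^2) = \pm I$, hence, are not in $\mathcal{W}^0_0$ by definition.

To show $\Pi$ is onto, let $(\tilde L, \tilde b) \in \mathcal{Q}^0_0$ and
$$\tilde V = \tilde L \oplus \tilde L^{-1}, \qquad \tilde\Phi = \begin{pmatrix} 0 & \tilde b \\ \tau^*(\tilde b) & 0 \end{pmatrix}.$$
The involution $\tau^*$ on $\tilde V$ preserves the subspace $\tilde E \subset \tilde V$ but reverses orientations on $\tilde E$. Let $\langle\tau^*\rangle$ be the order two group generated by $\tau^*$ and define the quotients
$$V = \tilde V/\langle\tau^*\rangle, \qquad \Phi = \tilde\Phi/\langle\tau^*\rangle, \qquad E = \tilde E/\langle\tau^*\rangle.$$
Since $\tilde L \neq \tilde L^{-1}$, by Proposition 5.2, $(V, \Phi)$ is a stable Higgs bundle over $X$. Moreover $(\tilde V, \tilde\Phi)$ and $\tilde E$ are pull-backs of $(V, \Phi)$ and $E$ by $\pi^*$, respectively. Let $\tilde D$ be the connection associated with $(\tilde V, \tilde\Phi)$. Then $(\tilde E, \tilde D)$ is a flat $\mathrm{SL}(2, \mathbb{R})$-bundle and $\tau^*$ reverses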
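orientations on $\tilde E$. Hence $(E, D)$ is a flat $\mathrm{SL}_i(2, \mathbb{R})$-bundle on $X$. Hence the map $\Pi$ is onto. Note the fibre of $\Pi$ at the point $(\tilde V, \tilde\Phi)$ consists of the two points $(V, \Phi)$ and $(V \otimes T_\eta, \Phi)$.

Case 2: $w_2 = 1$. The proof is similar to the proof of Case 1. A representation class $\sigma \in \mathcal{W}^0_1/\mathrm{PGL}(2, \mathbb{R})$ corresponds to an element in $\mathcal{H}^s_1$. Hence $\tilde\sigma = \pi^*(\sigma)$ is an $\mathrm{SL}(2, \mathbb{R})$ representation. By Theorem 4.5 and Corollary 4.6, $\tilde\sigma$ corresponds to a poly-stable Higgs bundle $(\tilde V, \tilde\Phi)$ such that:
$$\pi^*(V) = \tilde V = \tilde L_1 \otimes \tilde V_1 = \tilde L_1 \otimes \tilde L \oplus \tilde L_1 \otimes \tilde L^{-1}, \qquad \pi^*(\Phi) = \tilde\Phi = \tilde\Phi_1 \oplus \tilde\Phi_2 = \begin{pmatrix} 0 & \tilde b \\ \tilde c & 0 \end{pmatrix}$$
with $\deg(\tilde L) = 0$. Suppose
$$\tilde L_1 \otimes \tilde L \neq \tilde L_1 \otimes \tilde L^{-1}.$$
By Proposition 5.1, either
$$\tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L, \quad \tau^*(\tilde L_1 \otimes \tilde L^{-1}) = \tilde L_1 \otimes \tilde L^{-1}$$
or
$$\tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L^{-1}, \quad \tau^*(\tilde L_1 \otimes \tilde L^{-1}) = \tilde L_1 \otimes \tilde L.$$
This implies, after normalizing, the following dichotomy:

1. $\tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L$, $\tau^*(\tilde L_1 \otimes \tilde L^{-1}) = \tilde L_1 \otimes \tilde L^{-1}$, $\tau^*(\tilde\Phi_1) = \tilde\Phi_1$, $\tau^*(\tilde\Phi_2) = \tilde\Phi_2$, $\tau^*(\tilde b) = \tilde b$, $\tau^*(\tilde c) = \tilde c$;
2. $\tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L^{-1}$, $\tau^*(\tilde L_1 \otimes \tilde L^{-1}) = \tilde L_1 \otimes \tilde L$, $\tau^*(\tilde\Phi_1) = \tilde\Phi_2$, $\tau^*(\tilde\Phi_2) = \tilde\Phi_1$, $\tau^*(\tilde b) = \tilde c$, $\tau^*(\tilde c) = \tilde b$.

Equation 1 has no solution. Since $\tilde V = \pi^*(V)$, the equality
$$\tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L$$
would imply the existence of $L_0 \in V$ with
$$\tilde L_1 \otimes \tilde L = \pi^*(L_0).$$
Since $\deg(\tilde L_1 \otimes \tilde L) = 1$, the degree of $L_0$ would have been $\tfrac{1}{2}$. This is not possible.

With the solutions to Eq. 2, $\tau^*$ reverses orientations on the $\mathrm{SL}(2, \mathbb{R})$-bundle $\tilde E \subset \tilde V$. Let $\tilde D$ be the connection associated with $(\tilde V, \tilde\Phi)$. Then the pair $(\tilde E, \tilde D)$ descends to an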
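$\mathrm{SL}_i(2, \mathbb{R})$-bundle $(E, D)$ on $X$. The bundle further descends to a flat projective bundle $(P(E), D)$ on $X$.

Since $\tilde T$ is a 2-torsion in $\mathcal{P}$, $\tau^*(\tilde T) = \tilde T^{-1} = \tilde T$. Hence $\tilde T \in \mathcal{S}$. Since $\mathcal{S}$ is an abelian variety, there exists $\tilde L_2 \in \mathcal{S}$ such that $\tilde T = \tilde L_2^2$. This implies
$$\tau^*(\tilde L_2) = \tilde L_2 = \tilde T \otimes \tilde L_2^{-1}, \qquad \tau^*(\tilde L_2^{-1}) = \tilde L_2^{-1} = \tilde T \otimes \tilde L_2.$$
Hence for each $\tilde L \in \mathcal{P}$,
$$\tau^*((\tilde L \otimes \tilde L_2) \otimes \tilde L_1) = (\tilde L \otimes \tilde L_2)^{-1} \otimes \tilde L_1, \qquad \tau^*((\tilde L \otimes \tilde L_2)^{-1} \otimes \tilde L_1) = (\tilde L \otimes \tilde L_2) \otimes \tilde L_1.$$
Hence the solutions to Eq. 2 give points in $\mathcal{Q}_1$. Note, by Proposition 3.2, $\tilde T \notin \mathcal{P}_0$. Thus there does not exist $\tilde L \in \mathcal{P}$ such that $\tilde L^2 \otimes \tilde T = 1$. This implies that there does not exist $\tilde L \in \mathcal{P}$ such that
$$\tilde L \otimes \tilde L_2 \otimes \tilde L_1 = (\tilde L \otimes \tilde L_2)^{-1} \otimes \tilde L_1.$$
This gives a 2-to-1 map
$$\Pi : \mathcal{W}^0_1/\mathrm{PGL}(2, \mathbb{R}) \longrightarrow \mathcal{Q}_1.$$
Similar to Case 1, there are only two components of $\mathcal{P}$ that induce vector bundles $V$ such that
$$\pi^*(V) = \tilde V = \tilde L_1 \otimes (\tilde L_2 \otimes \tilde L) \oplus \tilde L_1 \otimes (\tilde L_2 \otimes \tilde L)^{-1},$$
with $\det(V) = \det(V_0)$. Denote these two components $\mathcal{P}^0_1$ and define
$$\mathcal{Q}^0_1 = \{(\tilde L \otimes \tilde L_1, \tilde b) : \tilde L \in \mathcal{P}^0_1,\ \tilde b \in H^0(\tilde X, \tilde L^2 \otimes \tilde T \otimes \tilde K)\}.$$
Let $(\tilde L, \tilde b) \in \mathcal{Q}^0_1$ and
$$\tilde V = \tilde L_1 \otimes (\tilde L_2 \otimes \tilde L) \oplus \tilde L_1 \otimes (\tilde L_2 \otimes \tilde L)^{-1}, \qquad \tilde\Phi = \begin{pmatrix} 0 & \tilde b \\ \tau^*(\tilde b) & 0 \end{pmatrix}.$$
The involution $\tau^*$ on $\tilde V$ preserves the subspace $\tilde E \subset \tilde V$ but reverses orientations on $\tilde E$. Let $\langle\tau^*\rangle$ be the order two group generated by $\tau^*$ and define quotient sets
$$V = \tilde V/\langle\tau^*\rangle, \qquad \Phi = \tilde\Phi/\langle\tau^*\rangle, \qquad E = \tilde E/\langle\tau^*\rangle.$$
By Proposition 5.2, $(V, \Phi)$ is a stable Higgs bundle over $X$. Moreover, $(\tilde V, \tilde\Phi)$ and $\tilde E$ are pull-backs of $(V, \Phi)$ and $E$ by $\pi^*$, respectively. Since $(\tilde E, \tilde D)$ is an $\mathrm{SL}(2, \mathbb{R})$-bundle and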
$\tau^*$ reverses orientations on $\tilde E$, $(E, D)$ is an $\mathrm{SL}_i(2, \mathbb{R})$-bundle on $X$. The bundle $(E, D)$ descends to a flat projective bundle $(P(E), D)$ ($\mathrm{PGL}(2, \mathbb{R})$-bundle) on $X$. Hence $\Pi$ is onto.

Finally, by Corollary 2.11,
$$P\mathcal{Q} = (J_2(\tilde X) \cap \mathcal{P})\backslash\mathcal{Q} = (J_2(X) \cap \mathcal{W})\backslash\mathcal{W}^0/\mathrm{PGL}(2, \mathbb{R}) = P\mathcal{W}^0/\mathrm{PGL}(2, \mathbb{R}). \qquad \Box$$

Let $\mathcal{Q}^0 = \mathcal{Q}^0_0 \cup \mathcal{Q}^0_1$. Proposition 6.2 actually provides an explicit identification of $\mathcal{W}^0/\mathrm{PGL}(2, \mathbb{R})$ with $\mathcal{Q}^0$. This is stronger than needed to obtain Theorem 1.1. Since
$$P(\tilde V \otimes \tilde L_0) = P(\tilde V)$$
for any line bundle $\tilde L_0$, an alternative approach is to look at the equation
$$\tau^*(\tilde V, \tilde\Phi) = (\tilde V \otimes \tilde L_0, \tilde\Phi).$$
For the component $P\mathcal{W}^0/\mathrm{PGL}(2, \mathbb{R})$, this leads to the system of equations:
$$\begin{aligned}
\tau^*(\tilde L) &= \tilde L^{-1} \otimes \tilde L_0, & \tau^*(\tilde L^{-1}) &= \tilde L \otimes \tilde L_0, \\
\tau^*(\tilde\Phi_1) &= \tilde\Phi_2, & \tau^*(\tilde\Phi_2) &= \tilde\Phi_1, \\
\tau^*(\tilde b) &= \tilde c, & \tau^*(\tilde c) &= \tilde b.
\end{aligned}$$
The solutions to this system of equations correspond to the $\mathrm{GL}(2, \mathbb{C})$ connections that project down to flat $\mathrm{PGL}(2, \mathbb{R})$ connections.

The quotient $P\mathcal{P}$ is homeomorphic to a compact complex torus with complex dimension $g - 1$. If $w_2 = 0$, then above each $[\tilde L] \in P\mathcal{P} \setminus \{[1]\}$ is the vector space $H^0(\tilde X, \tilde L^2 \tilde K)$. By the Riemann-Roch formula,
$$h^0(\tilde L^2 \tilde K) - h^0(\tilde L^{-2}) = 1 - (2g - 1) + \deg(\tilde L^2 \tilde K).$$
This implies
$$h^0(\tilde L^2 \tilde K) = 1 - (2g - 1) + [2(2g - 1) - 2] = 2g - 2.$$
Hence the total dimension is
$$h^0(\tilde L^2 \tilde K) + \dim(P\mathcal{P}) = 3g - 3.$$
Note if $\tilde L$ is a 2-torsion, then $h^0(\tilde L^2 \tilde K) = 2g - 1$.

Suppose $w_2 = 1$. Again there is no $\tilde L \in \mathcal{P}$ such that $\tilde L^2 \otimes \tilde T = 1$. This implies that above any $[\tilde L] \in P\mathcal{P}$, the vector space $H^0(\tilde X, \tilde L^2 \otimes \tilde T \otimes \tilde K)$ is of dimension
$$h^0(\tilde L^2 \otimes \tilde T \otimes \tilde K) = 1 - (2g - 1) + [2(2g - 1) - 2] = 2g - 2.$$
Again, the total dimension is
$$h^0(\tilde L^2 \otimes \tilde T \otimes \tilde K) + \dim(P\mathcal{P}) = 3g - 3.$$
Finally, by Corollary 2.8, $P\mathcal{W}^0$ is open and dense in $P\mathcal{W}$. Therefore $P\mathcal{W}^0/\mathrm{PGL}(2, \mathbb{R})$ is open and dense in $P\mathcal{W}/\mathrm{PGL}(2, \mathbb{R})$. This proves Theorem 1.1.

Acknowledgement. Most of this research was carried out at the University of Maryland at College Park. I thank Professor William Goldman for his encouragement and for insightful discussions over the course of this research. I thank Professors Kevin Corlette, Ron Donagi, Jonathan Poritz, Richard Schwartz and Scott Wolpert for insightful discussions. I also thank Goldman and Poritz for proof-reading previous versions. I thank the referee for detailed and helpful suggestions for improvement. Finally, I thank IHES for hospitality and for providing an excellent research environment during the final revision of this paper.
References

1. Arbarello, E., Cornalba, M., Griffiths, P., Harris, J.: Geometry of Algebraic Curves Vol. I. Berlin–Heidelberg–New York: Springer-Verlag, 1984
2. Atiyah, M., Bott, R.: The Yang–Mills Equations Over Riemann Surfaces. Philos. Trans. Roy. Soc. London, Ser. A 308, 523–615 (1982)
3. Corlette, K.: Flat G-bundles With Canonical Metrics. J. Diff. Geom. 28, 361–382 (1988)
4. Donaldson, S.: Twisted Harmonic Maps and the Self-Duality Equations. Proc. London Math. Soc. 55, 127–131 (1987)
5. Donaldson, S., Kronheimer, S.: The Geometry of Four-Manifolds. Oxford Mathematical Monographs, 1990
6. Goldman, W.: Topological Components of Spaces of Representations. Invent. Math. 93, 557–607 (1988)
7. Griffiths, P., Harris, J.: Principles of Algebraic Geometry. New York: Wiley Interscience, 1978
8. Gunning, R.: Lectures on Vector Bundles Over Riemann Surfaces. Princeton, NJ: Princeton University Press, 1967
9. Hitchin, N.: The Self-Duality Equations on a Riemann Surface. Proc. London Math. Soc. 55, 59–126 (1987)
10. Milnor, J.: On the Existence of a Connection with Curvature Zero. Comment. Math. Helv. 32, 215–223 (1958)
11. Mumford, D.: Prym Varieties I. Contributions to Analysis, London–New York: Academic Press, 1974, pp. 325–350
12. Nitsure, N.: Moduli Space of Semistable Pairs on a Curve. Proc. London Math. Soc. 62, 275–300 (1991)
13. Simpson, C.: Constructing Variations of Hodge Structures Using Yang–Mills Theory and Applications to Uniformization. J. of the AMS 1, 867–918 (1988)
14. Simpson, C.: Hodge Bundles and Local Systems. Publ. Math. I.H.E.S. 75, 6–95 (1992)
15. Simpson, C.: Moduli of Representations of the Fundamental Group of a Smooth Projective Variety. I. Publ. Math. I.H.E.S. 79, 47–129 (1994)
16. Simpson, C.: Moduli of Representations of the Fundamental Group of a Smooth Projective Variety. II. Publ. Math. I.H.E.S. 80, 5–79 (1994)
17. Steenrod, N.: The Topology of Fiber Bundles. Princeton, NJ: Princeton University Press, 1951
18. Wood, J.: Bundles with Totally Disconnected Structure Group. Comment. Math. Helv. 51, 183–199 (1971)
19. Xia, E.: Components of Hom(π1, PGL(2, R)). Topology 36 No. 2, 481–499 (1997)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 203, 551 – 572 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Yangian Realisations from Finite W-Algebras

E. Ragoucy1,*, P. Sorba2

1 Theory Division, CERN, CH-1211 Genève 23, Switzerland. E-mail: [email protected]
2 Laboratoire de Physique Théorique LAPTH**, LAPP, BP 110, F-74941 Annecy-le-Vieux Cedex, France. E-mail: [email protected]

Received: 1 April 1998 / Accepted: 28 November 1998
Abstract: We construct an algebra homomorphism between the Yangian Y(sl(n)) and the finite W-algebras W(sl(np), n.sl(p)) for any p. We show how this result can be applied to determine properties of the finite dimensional representations of such W-algebras.

Contents

1. Introduction
2. Finite W Algebras: Notation and Classification
2.1 Classical W(G, S) algebras
2.2 Quantum W(G, S) algebras
2.3 Miura representations
2.4 Example: W(sl(np), n.sl(p)) algebras
3. Yangians Y(G)
3.1 Definition
3.2 Evaluation representations of Y(sl(n))
4. Yangians and Classical W-Algebra
5. Yangians and Quantum W(sl(np), n.sl(p)) Algebras
6. Representations of W(sl(2n), n.sl(2)) Algebras
7. Conclusion
A. The Soldering Procedure
B. Classical W(sl(np), n.sl(p)) Algebras
B.1 Generalities
B.2 The generic case n ≠ 2
B.3 The particular case of Y(sl(2))
C. Quantum W(sl(np), n.sl(p)) Algebras
D. Tensor Products of Some Finite Dimensional Representations of sl(n)

* On leave of absence from LAPTH.
** URA 14-36 du CNRS, associée à l'Université de Savoie.
1. Introduction

In the year 1985, the mathematical physics literature was enriched with two new types of symmetries: W algebras [1] and Yangians [2]. W algebras showed up in the context of two dimensional conformal field theories. Their development owes much to the fact that they are algebras of constants of motion for Toda field theories, themselves defined as constrained WZNW models [3]. Yangians were first considered and defined in connection with some rational solutions of the quantum Yang–Baxter equation. Later, their relevance in integrable models with non-Abelian symmetry was noted [4]. Yangian symmetry has been proved for the Haldane-Shastry SU(n) quantum spin chains with inverse square exchange, as well as for the embedding of this model in the $\widehat{SU}(2)_1$ WZNW one; this last approach leads to a new classification of the states of a conformal field theory in which the fundamental quasi-particles are the spinons [5] (see also [6]). Let us also emphasize the Yangian symmetry determined in the Calogero-Sutherland-Moser models [5,7].

Coming back to W algebras, it can be shown that their zero modes provide algebras with a finite number of generators and which close polynomially. Such algebras can also be constructed by symplectic reduction of finite dimensional Lie algebras, in the same way as usual (or affine) W algebras arise as reductions of affine Lie algebras: they are called finite W algebras [8] (FWA). This definition extends to any algebra which satisfies the above properties of finiteness and polynomiality [9]. Some properties of such FWA's have been developed [9–12] and in particular a large class of them can be seen as the commutant, in a generalization of the enveloping algebra U(G), of a subalgebra $\hat{G}$ of a simple Lie algebra G [11]. This feature of FWA's has been exploited in order to get new realizations of a simple Lie algebra G once a differential operator realization of G is known.
In such a framework, representations of a FWA are used for the determination of G representations. This method has been applied to reformulate the construction of the unitary, irreducible representations of the conformal algebra so(4, 2) and of its Poincaré subalgebra, and to compare it with the usual induced representation techniques [10]. It has also been used for building representations of observable algebras for systems of two identical particles in d = 1 and d = 2 dimensions, the algebras G under consideration then being symplectic ones; in each case, it has then been possible to relate the anyonic parameter to the eigenvalues of a W generator [12].

In this paper, we show that the defining relations of a Yangian are satisfied for a family of FWA's. In other words, such W algebras provide Yangian realizations. This remarkable connection between two a priori different types of symmetry deserves in our opinion to be considered more closely. Meanwhile, we will use results on the representation theory of Yangians and start to adapt them to this class of FWA's. In particular, we will show on special examples, the algebras W(sl(2n), n.sl(2)), how to get the classification of all their irreducible finite dimensional representations.

It has seemed to us necessary to introduce in some detail the two main and a priori different algebraic objects needed for the purpose of this work. Hence, we propose in Sect. 2 a brief reminder on W-algebras with definitions and properties which will become useful to establish our main result. In particular, a short paragraph presents the Miura transformation. The structure of W(sl(np), n.sl(p)) algebras is also analysed.
Then, the notion of Yangian Y(G) is introduced in Sect. 3, with some basic properties of its representation theory. Such preliminaries allow us to arrive well-equipped for showing the main result of our paper, namely that there is an algebra homomorphism between the Yangian Y(sl(n)) and the finite W(sl(np), n.sl(p)) algebra (for any p). This property is proven in Sect. 4 for the classical case (i.e. W-algebras with Poisson brackets) and generalized to the quantum case (i.e. W-algebras with usual commutators) in Sect. 5. The proof requires explicit knowledge of the commutation relations among W generators. Such a result is obtained in the classical case via the soldering procedure [13]. Its extension to the quantum case leads us to determine sl(n) invariant tensors with well determined symmetries. In order not to overload the paper, all these necessary intermediate results are gathered in the appendices. Finally, as an application, the representation theory of W(sl(2n), n.sl(2)) algebras is considered in Sect. 6. General remarks and a discussion about some further possible developments conclude our study.

2. Finite W Algebras: Notation and Classification

As mentioned above, the W algebras that we will be interested in can be systematically obtained by the Hamiltonian reduction technique in a way analogous to the one used for the construction and classification of affine W algebras [3,14,15]. Actually, given a simple Lie algebra $\mathcal{G}$, there is a one-to-one correspondence between the finite W algebras one can construct in $U(\mathcal{G})$ and the sl(2) subalgebras in $\mathcal{G}$. We note that any sl(2) $\mathcal{G}$-subalgebra is principal in a subalgebra $\mathcal{S}$ of $\mathcal{G}$. The step generator $E_+$ in the sl(2) subalgebra which is principal in $\mathcal{S}$ is then written as a linear combination of the simple root generators of $\mathcal{S}$: $E_+ = \sum_{i=1}^{s} E_{\beta_i}$, where $\beta_i$, $i = 1, \dots, s = \operatorname{rank} \mathcal{S}$, are the simple roots of $\mathcal{S}$. It can be shown that one can complete uniquely $E_+$ with two generators $E_-$ and $H$ such that $(E_\pm, H)$ is an sl(2) algebra.

It is rather usual to denote the corresponding W algebra as $W(\mathcal{G}, \mathcal{S})$. It is an algebra freely generated by a finite number of generators and which has a second antisymmetric product. Depending on the nature of this second product, we will speak of a classical (the product is a Poisson bracket) or a quantum (the product is a commutator) W-algebra.

2.1. Classical W(G, S) algebras. To specify the Poisson structure of the W-algebra, we start with the Poisson–Kirillov structure on $\mathcal{G}^*$. It mimics the Lie algebra structure on $\mathcal{G}$, and we will still denote by $H_i$, $E_{\pm\alpha}$ the generators in $\mathcal{G}^*$. $U(\mathcal{G}^*)$ is then a Poisson-Lie enveloping algebra. We construct the classical $W(\mathcal{G}, \mathcal{S})$ algebra from a Hamiltonian reduction on $U(\mathcal{G}^*)$, the constraints being given by the sl(2) embedding as follows. The Cartan generator $H$ of the sl(2) subalgebra under consideration provides a gradation of $\mathcal{G}$:
$$\mathcal{G} = \bigoplus_{i=-N}^{N} \mathcal{G}_i \quad \text{with} \quad [H, X] = i\, X, \ \forall\, X \in \mathcal{G}_i. \qquad (2.1)$$
The root system of $\mathcal{G}$ is also graded: $\Delta = \bigoplus_i \Delta_i$. We have
$$H \in \mathcal{G}_0, \quad E_\pm \in \mathcal{G}_{\pm 1} \quad \text{and} \quad E_- = \sum_{\alpha \in \Delta_{-1}} \chi_\alpha E_\alpha, \quad \chi_\alpha \in \mathbb{C}. \qquad (2.2)$$
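Then, the first class constraints are $\Phi_\alpha = \Phi(E_\alpha) = E_\alpha = 0$ if $E_\alpha \in \mathcal{G}$ […] 1 are defined recursively through
$$f^a{}_{bc}\, [Q_1^b, Q_{n-1}^c] = c_v\, Q_n^a \quad \text{with} \quad c_v\, \eta^{ab} = f^a{}_{cd}\, f^{bcd}. \qquad (3.5)$$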
It can be shown that for $\mathcal{G} = sl(2)$, (3.3) is a consequence of the other relations, while for $\mathcal{G} \neq sl(2)$, (3.4) follows from (3.1–3.3). The coproduct on $Y(\mathcal{G})$ is given by
$$\Delta(Q_0^a) = 1 \otimes Q_0^a + Q_0^a \otimes 1, \qquad \Delta(Q_1^a) = 1 \otimes Q_1^a + Q_1^a \otimes 1 + \tfrac{1}{2} f^a{}_{bc}\, Q_0^b \otimes Q_0^c. \qquad (3.6)$$
In the following, we will focus on the Yangians Y(sl(n)).

3.2. Evaluation representations of Y(sl(n)). When $\mathcal{G} = sl(n)$, there is a special class of finite dimensional irreducible representations called evaluation representations. They are defined from the algebra homomorphisms
$$ev_A^\pm : \ Y(sl(n)) \to U(sl(n)), \qquad Q_0^a \mapsto t^a, \qquad Q_1^a \mapsto A\, t^a \pm d^a{}_{bc}\, t^b t^c, \qquad \text{with } A \in \mathbb{C}, \qquad (3.7)$$
where the $t^a$'s form a sl(n) basis, and $d^a{}_{bc}$ is the totally symmetric invariant tensor of sl(n) (we set $d^a{}_{bc} = 0$ when $n = 2$). It can be shown that $ev_A^+$ and $ev_A^-$ are isomorphic (and indeed $ev_A^+ = ev_A^- = ev_A$ when $\mathcal{G} = sl(2)$).

An evaluation representation of $Y(\mathcal{G})$ is defined by the pull-back of a $\mathcal{G}$-representation (with the help of the evaluation homomorphism $ev_A^\pm$). The corresponding representation space will be denoted generically by $V_A^\pm(\pi)$, where $\pi$ is a representation of sl(n). We select hereafter two properties [19] which will be used in Sect. 6.

Theorem 1. Any finite-dimensional irreducible Y(sl(n)) module is isomorphic to a subquotient of a tensor product of evaluation representations.

Theorem 2. When $\mathcal{G} = sl(2)$, let $V_A(j)$ be the $(2j + 1)$-dimensional irreducible representation space of $ev_A$ ($j \in \tfrac{1}{2}\mathbb{Z}$). Then, $V_A(j) \otimes V_B(k)$ is reducible if and only if
$$A - B = \pm(j + k - m + 1) \quad \text{for some } 0 < m \le \min(2j, 2k).$$
In that case, $V_A(j) \otimes V_B(k)$ is not completely reducible, and not isomorphic to $V_B(k) \otimes V_A(j)$; otherwise, $V_A(j) \otimes V_B(k)$ is irreducible and isomorphic to $V_B(k) \otimes V_A(j)$.
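To make the reducibility criterion of Theorem 2 concrete, here are a few low-spin instances, obtained simply by substituting values of $j$, $k$ and the allowed integers $m$ into the stated condition. This is an illustrative tabulation only, not part of the original argument.

```latex
\begin{align*}
j=k=\tfrac12:&\quad m\in\{1\}, & A-B &= \pm\bigl(\tfrac12+\tfrac12-1+1\bigr)=\pm 1,\\
j=1,\ k=\tfrac12:&\quad m\in\{1\}, & A-B &= \pm\bigl(1+\tfrac12-1+1\bigr)=\pm\tfrac32,\\
j=k=1:&\quad m\in\{1,2\}, & A-B &\in\{\pm 2,\ \pm 1\}.
\end{align*}
```

For all other values of $A - B$ the corresponding tensor products are irreducible and the two orderings of the factors are isomorphic.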
4. Yangians and Classical W-Algebra

In this section, we want to show that there is an algebra morphism between the Yangian Y(sl(n)) and the classical W(sl(np), n.sl(p)) algebras ($\forall\, p$). For such a purpose, we need to compute some of the PB of the W-algebra. It is done using the soldering procedure [13], the calculation being quite tricky (see the appendices). For the generic case of W(sl(np), n.sl(p)) algebras, the result is
$$\{W_1^a, W_1^b\} = \frac{1}{5}\, f^{ab}{}_c\, W_2^c - \frac{p^3}{16} \left( d^{au}{}_v\, f^b{}_{cu} - d^{bu}{}_v\, f^a{}_{cu} \right) d^v{}_{de}\, W_0^c W_0^d W_0^e, \qquad (4.1)$$
where the indices run from 0 to $n^2 - 1$, with the notation $W_0^0 = 0$ and $W_1^0 = C_2$ and the normalisation:
$$\{W_0^a, W_k^b\} = \frac{1}{p}\, f^{ab}{}_c\, W_k^c, \qquad k = 0, 1, 2, \dots \qquad (4.2)$$
When $p = 2$, we have the constraint $W_2^a = 5\, d^a{}_{bc}\, W_0^b W_1^c$. Let us stress that the tensors $f^{ab}{}_c$ and $d^{ab}{}_c$ are gl(n) tensors, not sl(n) ones: see Appendix B for clarification.

For the case of W(sl(2p), 2.sl(p)), the relations simplify to:
$$\{W_1^a, W_1^b\} = f^{ab}{}_c \left( \frac{1}{5}\, W_2^c - \frac{p^3}{2}\, W_0^c\, \vec{W}_0^{\,2} \right), \qquad (4.3)$$
$$\{W_1^a, W_2^b\} = f^{ab}{}_c \left[ \frac{3}{14}\, W_3^c + \frac{6(3p^2 - 7)}{p(p^2 - 1)}\, W_1^c W_1^0 + \frac{(p^2 - 9)(p^2 - 4)}{2(p^2 - 1)}\, W_1^c\, \vec{W}_0^{\,2} + 3\, W_2^0 W_0^c - 30\, W_0^c\, (\vec{W}_1 \cdot \vec{W}_0) \right], \qquad (4.4)$$
together with the constraints $W_2^a = 10\, [\, W_0^a W_1^0 + \delta_0^a\, (\vec{W}_1 \cdot \vec{W}_0)\, ]$ for $p = 2$, and $W_3^a = 0$ for $p = 2$ or 3.

In this basis, the map is
$$\rho_p : \ Y(sl(n)) \to W(sl(np), n.sl(p)), \qquad
\begin{cases}
Q_k^a \mapsto \beta_k\, W_k^a & \text{for } k = 0, 1, \dots, p, \\
Q_{p+l}^a \mapsto P_{p+l}^a(W_0, W_1, \dots, W_p) & \text{for } l > 0,
\end{cases} \qquad (4.5)$$
where $P_l^a$ are some homogeneous polynomials which preserve the "conformal spin" of $W_k^a$. A careful computation shows (see Appendix B), using the PBs (4.1–4.4), that the generators $W_k^a$ obey the relations (3.1–3.4), the commutators being replaced by PBs. As Y(sl(n)) is topologically generated by $Q_0^a$ and $Q_1^a$, it is sufficient to give $\beta_0$ and $\beta_1$. Indeed, once (4.5) is satisfied for $k = 0$ and 1, the relation (3.5) together with the PB of the W-algebra ensure that (4.5) can be iteratively constructed for all $k$. We show in Appendix B that in our basis this relation is indeed satisfied for
$$\beta_0 = p \quad \text{and} \quad \beta_1 = 2. \qquad (4.6)$$
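One way to see why the value $\beta_0 = p$ is natural is to combine the map $Q_k^a \mapsto \beta_k W_k^a$ with the normalisation (4.2). The sketch below assumes that the level-zero Yangian relations take the usual adjoint form $\{Q_0^a, Q_k^b\} = f^{ab}{}_c\, Q_k^c$ for $k = 0, 1$; those relations, (3.1) and (3.2), are not reproduced in this excerpt, so their exact form here is an assumption.

```latex
\begin{align*}
\{Q_0^a, Q_k^b\}
  &= \beta_0\,\beta_k\,\{W_0^a, W_k^b\}
   = \frac{\beta_0}{p}\, f^{ab}{}_{c}\,\bigl(\beta_k W_k^c\bigr)
   = \frac{\beta_0}{p}\, f^{ab}{}_{c}\, Q_k^c .
\end{align*}
```

Under that assumption the right-hand side equals $f^{ab}{}_c\, Q_k^c$ exactly when $\beta_0 = p$, independently of the value of $\beta_k$.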
We can thus conclude: Proposition 1. The classical algebra W(sl(np), n.sl(p)) provides a representation of the Yangian Y (sl(n)), the map being given by ρp defined in (4.5) and (4.6).
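5. Yangians and Quantum W(sl(np), n.sl(p)) Algebras

We can use the above study to deduce the same result for quantum W(sl(np), n.sl(p)) algebras. In fact, as these algebras are a quantisation of the classical ones, we can deduce that the most general form of the commutator is
$$[W_0^a, W_k^b] = \frac{1}{p}\, f^{ab}{}_c\, W_k^c, \qquad (5.1)$$
$$[W_1^a, W_1^b] = \frac{1}{5}\, f^{ab}{}_c\, W_2^c - \frac{p^3}{16} \left( d^{au}{}_v\, f^b{}_{cu} - d^{bu}{}_v\, f^a{}_{cu} \right) d^v{}_{de}\, s_3(W_0^c, W_0^d, W_0^e) + t^{ab}_c\, W_1^c + t^{ab}_{cd}\, s_2(W_0^c, W_0^d) + \tilde{t}^{ab}_c\, W_0^c \qquad (5.2)$$
for sl(n), and in the special case of sl(2),
$$\begin{aligned}
[W_1^a, W_2^b] = {}& f^{ab}{}_c \left[ \frac{3}{14}\, W_3^c + \frac{6(3p^2 - 7)}{p(p^2 - 1)}\, s_2(W_1^c, W_1^0) + 3\, s_2(W_2^0, W_0^c) + \frac{(p^2 - 9)(p^2 - 4)}{2(p^2 - 1)} \left( \eta_{dg}\, \eta_e^c - 30\, \eta_{de}\, \eta_g^c \right) s_3(W_0^g, W_0^d, W_1^e) \right] \\
& + g^{ab}_c\, W_2^c + \hat{g}^{ab}_{cd}\, W_1^c W_0^d + g^{ab}_{cde}\, s_3(W_0^c, W_0^d, W_0^e) + \hat{g}^{ab}_c\, W_1^c + g^{ab}_{cd}\, s_2(W_0^c, W_0^d) + \tilde{g}^{ab}_c\, W_0^c
\end{aligned} \qquad (5.3)$$
for some tensors $t^{ab}_{a_1 a_2 \cdots a_k}$ and $g^{ab}_{a_1 a_2 \cdots a_k}$. By construction, the t-tensors are symmetric in the lower indices and antisymmetric in the upper ones. The g-tensors are only symmetric in the lower indices, except for $\hat{g}^{ab}_{cd}$ which has no symmetry property. Moreover, the Jacobi identities with $W_0^a$ show that they are invariant tensors. Hence, we are looking for objects which belong to the trivial representation in $\Lambda_2(\mathcal{G}) \otimes S_k(\mathcal{G})$, $S_2(\mathcal{G}) \otimes S_k(\mathcal{G})$, or $\Lambda_2(\mathcal{G}) \otimes \Lambda_2(\mathcal{G})$, where $S_k(\mathcal{G})$ is the totally symmetric product $\mathcal{G} \otimes \mathcal{G} \otimes \cdots \otimes \mathcal{G}$ ($k$ times), while $\Lambda_2(\mathcal{G})$ is the antisymmetric product $\mathcal{G} \otimes \mathcal{G}$. Computing the decomposition of these tensor products shows that the multiplicity $M_0$ of the trivial representation in these products is (for $\mathcal{G} = sl(n)$):
$$\begin{aligned}
M_0[S_2(\mathcal{G}) \otimes \mathcal{G}] &= \begin{cases} 1 & \text{if } n \neq 2 \\ 0 & \text{for } sl(2) \end{cases}, &
M_0[\Lambda_2(\mathcal{G}) \otimes \mathcal{G}] &= 1, \\
M_0[\Lambda_2(\mathcal{G}) \otimes S_2(\mathcal{G})] &= \begin{cases} 1 & \text{if } n \neq 2 \\ 0 & \text{for } sl(2) \end{cases}, &
M_0[\Lambda_2(\mathcal{G}) \otimes \Lambda_2(\mathcal{G})] &= \begin{cases} 3 & \text{if } n \neq 2 \\ 1 & \text{for } sl(2) \end{cases}, \\
M_0[\Lambda_2(\mathcal{G}) \otimes S_3(\mathcal{G})] &= \begin{cases} 4 & \text{if } n \neq 2, 3 \\ 3 & \text{for } sl(3) \\ 1 & \text{for } sl(2) \end{cases}.
\end{aligned} \qquad (5.4)$$
Now, it is easy to show that the following tensors indeed belong to these spaces²:
$$\Lambda_2(\mathcal{G}) \otimes \mathcal{G} : \ t^{ab}_c = f^{ab}{}_c, \qquad S_2(\mathcal{G}) \otimes \mathcal{G} : \ t^{ab}_c = d^{ab}{}_c, \qquad \Lambda_2(\mathcal{G}) \otimes S_2(\mathcal{G}) : \ t^{ab}_{cd} = f^{ab}{}_e\, d^e{}_{cd}. \qquad (5.5)$$

² More general formulae are given in Appendix D.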
E. Ragoucy, P. Sorba
As they are evidently independent and give the correct multiplicities (with the convention that the d-tensor is null for sl(2)), we deduce that the most general form one gets is:

\[ [W_1^a, W_1^b] = -\frac{p^3}{16} \left( d^{au}{}_v f^b{}_{cu} - d^{bu}{}_v f^a{}_{cu} \right) d^v{}_{de}\, s_3(W_0^c, W_0^d, W_0^e) + f^{ab}{}_c \left( \frac{1}{5}\, W_2^c + \mu_1 W_1^c + \mu_2\, d^c{}_{de}\, s_2(W_0^d, W_0^e) + \mu_3 W_0^c \right) \tag{5.6} \]
for the algebra W(sl(np), n.sl(p)). This commutator is the only one needed to prove that the algebra satisfies the defining relations of the Yangian when $n \neq 2$. For the algebra W(sl(2p), 2.sl(p)), we also need the relation:

\[ \begin{aligned}
[W_1^a, W_2^b] = f^{ab}{}_c \Bigg[ & \frac{3}{14}\, W_3^c + \frac{6(3p^2-7)}{p(p^2-1)}\, s_2(W_1^c, W_1^0) + 3\, s_2(W_2^0, W_0^c) \\
& + \left( \frac{(p^2-9)(p^2-4)}{2(p^2-1)}\, \eta_{dg}\, \eta_e^c - 30\, \eta_{de}\, \eta_g^c \right) s_3(W_0^g, W_0^d, W_1^e) \\
& + \nu_1 W_2^c + \nu_1' W_1^c + \nu_1'' W_0^c + \nu_2\, f^c{}_{de}\, W_1^d W_0^e + \nu_3\, \eta_{de}\, s_3(W_0^c, W_0^d, W_0^e) \Bigg].
\end{aligned} \tag{5.7} \]
Then, one can show that the commutators (5.6–5.7) obey the defining relations of the Yangians Y(sl(n)) for the same normalisations as in the classical case. This is done in Appendices C and D.

Proposition 2. The quantum algebra W(sl(np), n.sl(p)) provides a representation of the Yangian Y(sl(n)), the map being given by $\rho_p$ defined in (4.5) and (4.6).

At this point, let us note that the Yangian structure of the algebra W(sl(4), 2.sl(2)) has already been noticed [9,20] and used for quantum mechanics applications [20].

6. Representations of W(sl(2n), n.sl(2)) Algebras

Owing to the above identification, it is possible to adapt some known properties of Yangian representation theory to finite W representations. We first illustrate this assertion in the case of W(sl(2n), n.sl(2)).

Proposition 3. Any finite dimensional irreducible representation of the algebra W(sl(4), 2.sl(2)) is either an evaluation module $V^A(j)$ or the tensor product of two evaluation modules $V^A(j) \otimes V^{(-A)}(k)$. Conversely, $V^A(j)$ for any $A$, and $V^A(j) \otimes V^{(-A)}(k)$ ($A \neq 0$) with $2A \neq \pm(j + k - m + 1)$ for any $m$ such that $0 < m \le \min(2j, 2k)$, are finite dimensional irreducible representations of the algebra W(sl(4), 2.sl(2)). The tensor product is calculated via the Yangian coproduct defined in (3.6).

The proof is done by direct calculation, using Theorem 2 of Sect. 3.2. As an (irreducible) representation of the W(sl(4), 2.sl(2)) algebra must be an (irreducible) representation of the Yangian Y(sl(2)), we deduce that the (finite dimensional) irreducible representations of W(sl(4), 2.sl(2)) are in the set of evaluation modules $V^A(j)$ or $V^A(j) \otimes V^B(k) \otimes \cdots \otimes V^C(\ell)$.
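The arithmetic condition in Proposition 3 is easy to tabulate. The following minimal sketch lists, for given spins $j$ and $k$, the values of $A$ excluded by $2A \neq \pm(j + k - m + 1)$ with $0 < m \le \min(2j, 2k)$. The function names `excluded_values` and `passes_condition` are hypothetical helpers introduced only for this illustration, and the code encodes nothing beyond the stated inequality together with $A \neq 0$.

```python
from fractions import Fraction


def excluded_values(j, k):
    """Return the set of A with 2A = +/-(j + k - m + 1) for some integer m,
    0 < m <= min(2j, 2k), i.e. the values ruled out in Proposition 3."""
    j, k = Fraction(j), Fraction(k)
    if j < 0 or k < 0 or (2 * j).denominator != 1 or (2 * k).denominator != 1:
        raise ValueError("j and k must be non-negative integers or half-integers")
    excluded = set()
    for m in range(1, int(min(2 * j, 2 * k)) + 1):
        a = (j + k - m + 1) / 2
        excluded.update((a, -a))
    return excluded


def passes_condition(A, j, k):
    """True when A != 0 and A avoids every excluded value for the spins (j, k)."""
    A = Fraction(A)
    return A != 0 and A not in excluded_values(j, k)


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(sorted(excluded_values(half, half)))
    print(passes_condition(1, half, half))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that half-integer spins are compared without rounding.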
For $V^A(j)$, it is obvious that we have an irreducible representation, where the value of the W Casimir operator $C_2$ is related to $A$:

\[ C_2(A, j) = \left[ 2j(j+1) + A^2 \right] I. \tag{6.1} \]

For a product $V^A(j) \otimes V^B(k)$, calculations show that we must have $A + B = 0$ and

\[ C_2(A, j; B, k) = \left[ 2j(j+1) + 2k(k+1) + \frac{1}{2}(A^2 + B^2) \right] I \otimes I \tag{6.2} \]

in order to get a representation of the W-algebra. The irreducibility is fixed by the first part of Theorem 2, Sect. 3.2, i.e. when there is no $m \in\, ]0, \min(2j, 2k)]$ such that $2A = \pm(j + k - m + 1)$. It is the second part of the theorem which ensures that the above construction exhausts the set of irreducible finite dimensional representations of W(sl(4), 2.sl(2)). Indeed, in the product $V^A(j) \otimes V^B(k) \otimes V^C(\ell)$, we already know that we must have $B = -A$ and $C = -B$ for $V^A(j) \otimes V^B(k)$ and $V^B(k) \otimes V^C(\ell)$ to be representations of the W-algebra. Then, the irreducibility of $V^{(-A)}(k) \otimes V^A(\ell)$ implies that this last representation is isomorphic to $V^A(\ell) \otimes V^{(-A)}(k)$. Therefore, $V^A(j) \otimes V^{(-A)}(k) \otimes V^A(\ell)$ is isomorphic to $V^A(j) \otimes V^A(\ell) \otimes V^{(-A)}(k)$. But the product $V^A(j) \otimes V^A(\ell)$ is not a representation of the W-algebra³, so that the triple product is not one either.

Note that we get the surprising result that the tensor product of two representations of the W(sl(4), 2.sl(2)) algebra ($V^A(j)$ and $V^B(k)$) is not always a representation of this algebra. In some sense, this result can be interpreted as a no-go theorem for the existence of a coproduct for W-algebras.

Let us also remark that the above representations are those obtained through the Miura map (see Sects. 2.3 and 2.4), so that we have proved that the Miura map gives all the irreducible finite dimensional representations of this W-algebra. Moreover, as the $G_0$ algebra we have to consider is just sl(2) ⊕ sl(2) ⊕ gl(1) = s(2.gl(2)), the condition $A + B = 0$ in the tensor product $V^A(j) \otimes V^B(k)$ can just be interpreted as the traceless condition on s(2.gl(2)). Indeed, a representation of 2.gl(2) is given by a representation space $D_j \otimes D_k$ of 2.sl(2), together with the values $A$ and $B$ of the two gl(1) generators, while for s(2.gl(2)), one has to impose $A + B = 0$. In fact, we are able to prove a more general result:

Proposition 4. Any finite dimensional irreducible representation of the algebra W(sl(2n), n.sl(2)) must be either an evaluation module $V^{\pm}_{A}(\pi)$ or the tensor product of two evaluation modules $V^{\pm}_{A}(\pi) \otimes V^{\mp}_{(-A)}(\pi')$, where $\pi$ and $\pi'$ are irreducible finite dimensional representations of sl(n), the tensor product being calculated via the Yangian coproduct defined in (3.6). All these representations can be obtained from the Miura map: s(2.gl(n)) ≡ 2.sl(n) ⊕ gl(1) → W(sl(2n), n.sl(2)).

Note that these algebras are just the ones used in [11] to construct the finite W-algebras as commutants in U(G). It seems rather natural to conjecture that this situation will remain valid in the general case of W(sl(np), n.sl(p)) algebras [21].

³ In fact, for $A = 0$, the tensor product indeed provides a representation of the W-algebra (it is just a representation of sl(2)). However, in that case, the tensor product is not irreducible.
7. Conclusion

A rather surprising connection between Yangians and finite W-algebras has been developed in this paper. We have proved directly that finite W-algebras of the type W(sl(np), n.sl(p)) satisfy the defining relations of the Yangian Y(sl(n)). In particular, we have been led to compute explicitly rather non-trivial commutators of W generators, namely the spin 2 – spin 2 and spin 2 – spin 3 ones, a result which is interesting in itself.

The question is now to understand more deeply this relationship between Yangians and finite W-algebras. Of course, the structure of the W(sl(np), n.sl(p)) algebra (see Sect. 2.4) reveals the special role played by its (spin one) Lie subalgebra sl(n). The W generators of equal spin gather into adjoint representations of this sl(n) algebra, inducing some resemblance with the Y(sl(n)) Yangian structure. At this point, let us remark on another common point between Y(sl(n)) and W(sl(np), n.sl(p)), namely the construction of their finite dimensional representations with the help of sl(n) ones. Indeed, the evaluation homomorphism (in the case of Yangians) and the Miura map (for W-algebras) play identical roles in such a construction: the former allows one to represent Y(sl(n)) on the tensor product of sl(n) representations (with the use of additional constant numbers), while the latter uses a representation of the $G_0$ algebra p.sl(n) ⊕ (p − 1).gl(1). This clearly shows a one-to-one correspondence.

Let us also stress another feature of the W(sl(np), n.sl(p)) algebras: for p = 2, they are the commutant in (a localisation of) U(sl(n)) of an Abelian subalgebra $\tilde G$ of sl(n) [11,18], the case p > 2 being no doubt generalisable.

Finally, in seeking to understand our results, one could think of an R-matrix approach. This point of view looks natural, since an R-matrix definition of the Yangians is available, while our W-algebras are symmetry algebras of (integrable) non-Abelian lattice Toda models.
Due to the wide class of W(G, S)-algebras, it seems natural to think of generalisations of our work. First of all, one could imagine studying Yangians Y(G) (with G ≠ sl(n)) from the W-algebra point of view. However, a rapid survey of W(G, S)-algebras shows that W(sl(np), n.sl(p)) algebras are the only W(G, S)-algebras where the generators are all gathered in adjoint representations of the Lie W-subalgebra. Conversely, W(G, S)-algebras might be a way to generalize the notion of Yangians Y(G) to cases where the generators are in any representation of G. In that case, the Hopf structure remains to be determined. Finally, it would be of some interest to look for an extension to affine W-algebras.

Let us end with two comments concerning applications. The first one concerns the representation theory of finite W-algebras. Preliminary results have been given in Sect. 6 and deal with the classification of finite dimensional representations of W(sl(2n), n.sl(2)). More complete results will be available soon [21]. Secondly, the possibility of carrying out the tensor product of W representations, although only in special cases, allows one to imagine the construction of spin chain models based on a finite W(sl(np), n.sl(p)) algebra.

Acknowledgements. We have benefited from valuable discussions with M. L. Ge, Ph. Roche and particularly Ph. Zaugg.
Appendices

A. The Soldering Procedure

The soldering procedure [13] allows one to compute the Poisson brackets of the W-algebras. The basic idea is to implement the W(G, S)-transformations from G ones. Indeed, as the W(G, S)-algebra can be realised from a Hamiltonian reduction on G, one can see the W transformations as a particular class of (field dependent) G conjugations that preserve the constraints we have imposed. Thus, the soldering procedure just says that the PBs of the W(G, S) algebra can be deduced from the commutators in G. It applies to any W(G, S) algebra, but we will focus on the W(sl(np), n.sl(p)) ones. For this purpose, we define

\[ J = \sum_{a=1}^{(np)^2-1} J^a\, t_a \quad \text{where } t_a \text{ are } (np) \times (np) \text{ matrices and } J^a \in G^*. \tag{A.1} \]

Then, we introduce the highest weight basis for the sl(2) under consideration $(E_\pm, H)$:

\[ J = E_- + \sum_{i=1}^{p n^2 - 1} W^i M_i \quad \text{with } [E_+, M_i] = 0, \tag{A.2} \]

where $M_i$ are $(np) \times (np)$ matrices and $E_+$ is considered here in the fundamental representation $E_+ = \sum_{i=1}^{np-1} E_{i,i+1}$, with $E_{ij}$ the matrix whose elements are $(E_{ij})_{kl} = \delta_{ik}\, \delta_{jl}$. To compute the PB of the generator $W^i$ of the W-algebra, one writes the variation of $J$ under the infinitesimal action of one of the W-generators in two ways, namely:

\[ \delta_\varepsilon J = \{ \mathrm{tr}(\varepsilon J), J \}_{PB} = [\, \varepsilon J, J \,], \tag{A.3} \]

where $\{ \mathrm{tr}(\varepsilon J), J \}_{PB}$ is the matrix of PB:

\[ \{ \mathrm{tr}(\varepsilon J), J \}_{PB} = \{ \mathrm{tr}(\varepsilon J), W^i \}_{PB}\, M_i, \tag{A.4} \]

and $[\, \varepsilon J, J \,]$ is a commutator of $(np) \times (np)$ matrices:

\[ [\, \varepsilon J, J \,] = f^a{}_{bc}\, \varepsilon^b J^c\, t_a. \tag{A.5} \]

$\varepsilon$ is an $np \times np$ matrix such that $\delta_\varepsilon J = [\, \varepsilon J, J \,]$ keeps the form (A.2) with of course $\delta_\varepsilon E_- = 0$. This matrix $\varepsilon$ has $p n^2 - 1$ free entries, which is the right number of parameters needed to describe a gauge transformation by a general element in the W-algebra. Identifying the matrix of PB with the commutator of matrices leads to the PB of the W-algebra.

We now use the property gl(np) ∼ gl(n) ⊗ gl(p) to explicitly compute some of the PBs. In gl(n) ⊗ gl(p), a general element can be written as

\[ J = \sum_{\alpha=0}^{n^2-1}\; \sum_{s=0}^{p^2-1} J^{\alpha s}\, t_\alpha \otimes \tau_s \tag{A.6} \]
with $t_\alpha$, $n \times n$ matrices and $\tau_s$, $p \times p$ matrices. The principal sl(2) in n.sl(p) takes the form

\[ H = I_n \otimes h \quad \text{and} \quad E_\pm = I_n \otimes e_\pm, \tag{A.7} \]

where $(h, e_\pm)$ form the principal sl(2) in sl(p), and $I_n$ is the $n \times n$ identity matrix. Then,

\[ J = I_n \otimes e_- - \sum_{k=0}^{p-1} W_k \otimes m_k, \tag{A.8} \]

where $e_-$ is viewed as a $p \times p$ matrix, $e_- = \sum_{i=1}^{p-1} E_{i,i+1}$, and $m_k$ are $p \times p$ matrices representing the highest weights of the principal sl(2) in sl(p). They have been computed in [22]:

\[ m_k = \sum_{i=1}^{p-k} a_i^k\, E_{i,i+k} \quad \text{with} \quad a_i^k = \frac{(i+k-1)!\; (p-i)!}{(i-1)!\; (p-k-i)!}. \tag{A.9} \]
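The closed form (A.9) can be cross-checked numerically. The sketch below (helper names such as `m_matrix` are hypothetical) builds the matrices $m_k$ from the coefficients $a_i^k$ and verifies that $m_k$ equals the $k$-th matrix power of $m_1$, whose entries are $a_i^1 = i(p-i)$. This identity follows from the factorial formula alone; it is an observation added here for illustration, not a statement taken from [22].

```python
from math import factorial


def a_coeff(p, k, i):
    """Coefficient a_i^k of (A.9); the division is exact."""
    num = factorial(i + k - 1) * factorial(p - i)
    den = factorial(i - 1) * factorial(p - k - i)
    return num // den


def m_matrix(p, k):
    """p x p matrix m_k = sum_{i=1}^{p-k} a_i^k E_{i,i+k} (1-based indices in the text)."""
    mat = [[0] * p for _ in range(p)]
    for i in range(1, p - k + 1):
        mat[i - 1][i + k - 1] = a_coeff(p, k, i)
    return mat


def matmul(x, y):
    size = len(x)
    return [[sum(x[r][s] * y[s][c] for s in range(size)) for c in range(size)]
            for r in range(size)]


def matpow(x, k):
    """k-th power of a square matrix, starting from the identity."""
    result = [[int(r == c) for c in range(len(x))] for r in range(len(x))]
    for _ in range(k):
        result = matmul(result, x)
    return result


def check(p):
    """True when m_k == (m_1)^k for every k = 0, ..., p-1."""
    m1 = m_matrix(p, 1)
    return all(m_matrix(p, k) == matpow(m1, k) for k in range(p))


if __name__ == "__main__":
    print(all(check(p) for p in range(2, 8)))
```

Integer arithmetic keeps the comparison exact, so no tolerance is needed.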
$W_k$ are $n \times n$ matrices whose entries $W_k^a$ ($a = 0, \dots, n^2 - 1$) are the W-generators (with $W_k^0$ related to $C_{k+1}$ for $k > 0$ and $W_0^0 = 0$ by the traceless condition on $J$). Note that the indices run from 0 to $n^2 - 1$ because we are using gl(n) indices instead of sl(n) ones (see Appendix B for details). Using this notation and demanding that $\delta_\varepsilon J$ keeps the form (A.2), one can compute the commutator $[\, \varepsilon J, J \,]$ to get the relations defining the matrix $\varepsilon$. These relations are quite awful, but, for our purpose, we just need to compute the matrix elements $[\, \varepsilon J, J \,]_{1,2}$ and $[\, \varepsilon J, J \,]_{1,3}$. A rather long calculation leads to:

\[ \{ \mathrm{tr}(\mu W_0), W_k \}_{PB} = \frac{1}{p}\, [W_k, \mu], \qquad k = 0, 1, 2, \dots, \tag{A.10} \]

\[ \{ \mathrm{tr}(\lambda W_1), W_1 \}_{PB} = \frac{6}{p(p^2-1)} \left( \frac{p^2-4}{5}\, [W_2, \lambda] + \frac{1}{2} \Big( [W_0, \{W_1, \lambda\}] + \{W_1, [W_0, \lambda]\} \Big) - \frac{1}{2}\, [W_0, [W_0, [W_0, \lambda]]] \right), \tag{A.11} \]

\[ \begin{aligned}
\{ \mathrm{tr}(\lambda W_1), W_2 \}_{PB} = \frac{6}{p(p^2-1)} \Bigg( & \frac{3(p^2-9)}{14}\, [W_3, \lambda] + \{W_2, [W_0, \lambda]\} + \frac{1}{2}\, [W_0, \{W_2, \lambda\}] + \frac{1}{3}\, [\{W_1, W_1\}, \lambda] \\
& - \frac{1}{2}\, [W_1, [W_0, [W_0, \lambda]]] - \frac{1}{4}\, [W_0, [W_1, [W_0, \lambda]]] - \frac{1}{12}\, [W_0, [W_0, [W_1, \lambda]]] \Bigg),
\end{aligned} \tag{A.12} \]

where $\mu$ (resp. $\lambda$) is an $n \times n$ matrix whose entries $\mu^a$ (resp. $\lambda^a$) are the parameters of the infinitesimal transformations associated to $W_0^a$ (resp. $W_1^a$):

\[ \mu = \mu^a t_a; \quad \lambda = \lambda^a t_a; \quad W_k = W_k^a\, t_a, \;\; k = 0, 1, 2 \quad \text{with } t_0 = I_n. \tag{A.13} \]
B. Classical W(sl(np), n.sl(p)) Algebras

B.1. Generalities. As we are using heavily the isomorphism gl(np) ∼ gl(n) ⊗ gl(p) for our calculations, we are forced to make use of gl(n) indices instead of sl(n) ones. We denote the last index by $a = 0$. It corresponds to the gl(1) generator that commutes with sl(n) in gl(n). We can consistently extend the definition of the totally (anti-)symmetric tensors $f$ and $d$ from sl(n) to gl(n) by

\[ d^{ab}{}_0 = 2\, \eta^{ab} \quad \text{and} \quad f^{ab}{}_0 = 0 \qquad \forall\; a, b = 0, 1, \dots, n^2 - 1. \tag{B.1} \]

In the fundamental representation of gl(n), we then have the decomposition:

\[ t^a t^b = \frac{1}{2} \left( f^{ab}{}_c + d^{ab}{}_c \right) t^c \quad \text{with } t^0 = I_n. \tag{B.2} \]

Then, it is easy to show that the Jacobi identities

\[ f^{ab}{}_c\, f^{cd}{}_e + f^{bd}{}_c\, f^{ca}{}_e + f^{da}{}_c\, f^{cb}{}_e = 0, \tag{B.3} \]

\[ d^{ab}{}_c\, f^{cd}{}_e + d^{bd}{}_c\, f^{ca}{}_e + d^{da}{}_c\, f^{cb}{}_e = 0 \tag{B.4} \]

are still valid for any values of $a, b, d, e = 0, 1, \dots, n^2 - 1$. If we compute $\{\{t^a, t^b\}, t^c\} - \{\{t^c, t^b\}, t^a\} = [[t^a, t^c], t^b]$, we also get the relation between the $f$ and $d$ tensors:

\[ d^{ab}{}_d\, d^{dc}{}_e - d^{bc}{}_d\, d^{da}{}_e = f^{ac}{}_d\, f^{db}{}_e. \tag{B.5} \]

These identities will be the only ones needed for our purpose. Note that the identity

\[ f^{ab}{}_c\, f_{abd} = c_v\, \eta_{cd} \tag{B.6} \]

is not valid in gl(n) since the left hand side is 0 for $c = d = 0$.

As an aside, let us remark that the isomorphism gl(np) ∼ gl(n) ⊗ gl(p) together with the above conventions allows us to construct the structure constants of gl(np) from those of gl(n) and gl(p). Indeed, let $t^a$ (resp. $\bar t^q$ and $T^{(a,q)} = t^a \otimes \bar t^q$) be the generators in the fundamental representation of gl(n) (resp. gl(p) and gl(np)); let $f^{ab}{}_c$ (resp. $\bar f^{qr}{}_s$ and $F^{(a,q)(b,r)}{}_{(c,s)}$) be their structure constants; and let $d^{ab}{}_c$ (resp. $\bar d^{qr}{}_s$ and $D^{(a,q)(b,r)}{}_{(c,s)}$) be their totally symmetric invariant tensor. The calculation of $[T^{(a,q)}, T^{(b,r)}]$ and $\{T^{(a,q)}, T^{(b,r)}\}$ shows that

\[ F^{(a,q)(b,r)}{}_{(c,s)} = \frac{1}{2} \left( f^{ab}{}_c\, \bar d^{qr}{}_s + d^{ab}{}_c\, \bar f^{qr}{}_s \right), \tag{B.7} \]

\[ D^{(a,q)(b,r)}{}_{(c,s)} = \frac{1}{2} \left( f^{ab}{}_c\, \bar f^{qr}{}_s + d^{ab}{}_c\, \bar d^{qr}{}_s \right), \tag{B.8} \]

which shows that e.g.

\[ D^{(a,q)(b,r)}{}_{(0,0)} = 2\, \eta^{(a,q)(b,r)} = 2\, \eta^{ab}\, \eta^{qr}, \tag{B.9} \]

in agreement with our conventions.
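For completeness, the step from (B.2) to (B.7)–(B.9) can be written out. The short derivation below is a sketch that uses only the decomposition (B.2) in each tensor factor, the antisymmetry of $f$ and the symmetry of $d$ in their upper indices, and the extension (B.1).

```latex
\begin{align*}
T^{(a,q)}\, T^{(b,r)} &= (t^a t^b) \otimes (\bar t^q\, \bar t^r)
  = \tfrac{1}{4} \big( f^{ab}{}_c + d^{ab}{}_c \big) \big( \bar f^{qr}{}_s + \bar d^{qr}{}_s \big)\, T^{(c,s)}, \\
T^{(b,r)}\, T^{(a,q)} &= \tfrac{1}{4} \big( -f^{ab}{}_c + d^{ab}{}_c \big) \big( -\bar f^{qr}{}_s + \bar d^{qr}{}_s \big)\, T^{(c,s)}, \\[4pt]
[\, T^{(a,q)}, T^{(b,r)} \,] &= \tfrac{1}{2} \big( f^{ab}{}_c\, \bar d^{qr}{}_s + d^{ab}{}_c\, \bar f^{qr}{}_s \big)\, T^{(c,s)}, \\
\{\, T^{(a,q)}, T^{(b,r)} \,\} &= \tfrac{1}{2} \big( f^{ab}{}_c\, \bar f^{qr}{}_s + d^{ab}{}_c\, \bar d^{qr}{}_s \big)\, T^{(c,s)}, \\[4pt]
D^{(a,q)(b,r)}{}_{(0,0)} &= \tfrac{1}{2} \big( 0 + 2\eta^{ab} \cdot 2\eta^{qr} \big) = 2\, \eta^{ab}\, \eta^{qr}.
\end{align*}
```

The mixed terms $f \bar d + d \bar f$ are antisymmetric under the simultaneous exchange $(a,q) \leftrightarrow (b,r)$ and therefore survive in the commutator, while $f \bar f + d \bar d$ is symmetric and survives in the anticommutator.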
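With these conventions and properties, we deduce from the soldering procedure result (A.10–A.12) the PBs

\[ \{W_0^a, W_k^b\} = \frac{1}{p}\, f^{ab}{}_c\, W_k^c, \qquad k = 0, 1, 2, \dots, \tag{B.10} \]

\[ \{W_1^a, W_1^b\} = \frac{6}{p(p^2-1)} \left[ \frac{p^2-4}{5}\, f^{ab}{}_c\, W_2^c + \frac{1}{2} \left( d^a{}_{cu}\, f^{ub}{}_d - d^b{}_{cu}\, f^{ua}{}_d \right) W_1^c W_0^d + \frac{1}{2}\, f^a{}_{cu}\, f^b{}_{dv}\, f^{uv}{}_e\, W_0^c W_0^d W_0^e \right], \tag{B.11} \]

\[ \begin{aligned}
\{W_1^a, W_2^b\} = \frac{6}{p(p^2-1)} \Bigg[ & \frac{3(p^2-9)}{14}\, f^{ab}{}_c\, W_3^c + \left( d^a{}_{cu}\, f^{ub}{}_d - \frac{1}{2}\, d^b{}_{cu}\, f^{ua}{}_d \right) W_0^c W_2^d + \frac{1}{6}\, f^{ab}{}_u\, d^u{}_{cd}\, W_1^c W_1^d \\
& + \left( \frac{1}{2}\, f^a{}_{cu}\, f^b{}_{ev}\, f^{uv}{}_d + \frac{1}{4}\, f^a{}_{du}\, f^b{}_{cv}\, f^{uv}{}_e + \frac{1}{12}\, f^a{}_{eu}\, f^b{}_{cv}\, f^{uv}{}_d \right) W_0^c W_0^d W_1^e \Bigg].
\end{aligned} \tag{B.12} \]

We repeat that the indices run from 0 to $n^2 - 1$. Noting the identity (proved using (B.5) and the commutativity of the product)

\[ f^a{}_{cu}\, f^b{}_{dv}\, f^{uv}{}_e\, W_0^c W_0^d W_0^e = \left[ \frac{1}{2}\, f^{ab}{}_u\, d^u{}_{cv}\, d^v{}_{de} - \frac{3}{4} \left( d^a{}_{uv}\, f^{ub}{}_c - d^b{}_{uv}\, f^{ua}{}_c \right) d^v{}_{de} \right] W_0^c W_0^d W_0^e, \tag{B.13} \]

and performing a change of basis

\[ \widetilde W_1^a = \frac{p(p^2-1)}{6}\, W_1^a + \frac{p}{4}\, d^a{}_{bc}\, W_0^b W_0^c, \tag{B.14} \]

\[ \widetilde W_2^a = \frac{p(p^2-1)(p^2-4)}{6}\, W_2^a + 5\, d^a{}_{bc}\, W_0^b\, \widetilde W_1^c + \frac{5p(p^2-4)}{24}\, d^a{}_{bu}\, d^u{}_{cd}\, W_0^b W_0^c W_0^d, \tag{B.15} \]

\[ \widetilde W_3^a = \frac{p(p^2-1)(p^2-4)(p^2-9)}{6}\, W_3^a, \tag{B.16} \]

we obtain the PB⁴:

\[ \{W_1^a, W_1^b\} = \frac{1}{5}\, f^{ab}{}_c\, W_2^c - \frac{p^3}{16} \left( d^{au}{}_v\, f^b{}_{cu} - d^{bu}{}_v\, f^a{}_{cu} \right) d^v{}_{de}\, W_0^c W_0^d W_0^e. \tag{B.17} \]

Note that in this basis, we have $W_1^0 = C_2$ (i.e. $W_1^0$ is central). Now, as the relations that we have to verify are different if $n$ is 2 or not, we specify both cases. We begin with the general case.

⁴ We keep the notation $W_j^a$ for $\widetilde W_j^a$: throughout the text it is $\widetilde W_j^a$ which is used, except in Eqs. (B.10–B.12) and the convention (A.13).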
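B.2. The generic case $n \neq 2$. One has to verify that the PB (B.17) obeys the defining relations of the Yangian. We rewrite (3.3) as

\[ f^{bc}{}_d\, \{Q_1^a, Q_1^d\} + \text{circ. perm. } (a, b, c) = f^a{}_{qd}\, f^b{}_{rx}\, f^c{}_{sy}\, f^{xyd}\, Q_0^q Q_0^r Q_0^s. \tag{B.18} \]

Plugging the PB into the left hand side of (B.18) leads to

\[ \text{lhs} = -\beta_0\, \beta_1^2\, \frac{p^3}{16}\, \frac{1}{p} \left[ f^{bc}{}_d \left( d^{a\mu}{}_\nu\, f^d{}_{\pi\mu} - d^{d\mu}{}_\nu\, f^a{}_{\pi\mu} \right) + \text{circ. perm. } (a, b, c) \right] \times d^\nu{}_{\gamma\rho}\, W_0^\pi W_0^\gamma W_0^\rho. \]

This has to be compared with

\[ \text{rhs} = \beta_0^3\, f^a{}_{qx}\, f^b{}_{ry}\, f^c{}_{sz}\, f^{xyz}\, W_0^q W_0^r W_0^s, \]

where we have used latin (resp. greek) letters for sl(n) (resp. gl(n)) indices. To prove the equality between lhs and rhs, we first remark, using the Jacobi identity for $f$, that the index 0 can be dropped from lhs, or equivalently added to rhs. We choose to use gl(n) indices, and come back to latin letters to denote them.

\[ \begin{aligned}
\text{lhs} &= -\beta_0\, \beta_1^2\, \frac{p^2}{16}\, f^{ab}{}_d \left( d^{cy}{}_x\, f^d{}_{qy} - d^{dy}{}_x\, f^c{}_{qy} \right) d^x{}_{rs}\, W_0^q W_0^r W_0^s + \text{circ. perm. } (a, b, c) \\
&= -\beta_0\, \beta_1^2\, \frac{p^2}{8}\, f^{ab}{}_d\, d^{dy}{}_x\, f^c{}_{yq}\, d^x{}_{rs}\, W_0^q W_0^r W_0^s + \text{circ. perm. } (a, b, c),
\end{aligned} \tag{B.19} \]

where we have used the Jacobi identities (B.3–B.4). With (B.5) and the symmetry in $(q, r, s)$, one can rewrite rhs as:

\[ \begin{aligned}
\frac{1}{\beta_0^3}\, \text{rhs} &= f^a{}_{qd}\, f^b{}_{rx} \left( d^{cx}{}_y\, d^{yd}{}_s - d^{cd}{}_y\, d^{yx}{}_s \right) W_0^q W_0^r W_0^s \\
&= \left[ \frac{1}{2}\, d^{cx}{}_y\, f^b{}_{rx} \left( d^{yd}{}_s\, f^a{}_{qd} + d^{yd}{}_q\, f^a{}_{sd} \right) - \frac{1}{2}\, d^{cd}{}_y\, f^a{}_{qd} \left( d^{yx}{}_s\, f^b{}_{rx} + d^{yx}{}_r\, f^b{}_{sx} \right) \right] W_0^q W_0^r W_0^s \\
&= \frac{1}{2} \left( f^a{}_{qd}\, f^{by}{}_x - f^{ay}{}_x\, f^b{}_{qd} \right) d^{cd}{}_y\, d^x{}_{rs}\, W_0^q W_0^r W_0^s \\
&= -\frac{1}{2}\, R^{abc}{}_{qx}\, d^x{}_{rs}\, W_0^q W_0^r W_0^s.
\end{aligned} \tag{B.20} \]

Using the Jacobi identity (B.4), we have

\[ \begin{aligned}
R^{abc}{}_{qx} &= f^{by}{}_x \left( f^{ac}{}_d\, d^d{}_{qy} + f^a{}_{yd}\, d^{cd}{}_q \right) - (a \leftrightarrow b) \\
&= \left( f^{ac}{}_d\, f^{by}{}_x - f^{bc}{}_d\, f^{ay}{}_x \right) d_{yq}{}^d + \left( f^{by}{}_x\, f^a{}_{yd} - f^{ay}{}_x\, f^b{}_{yd} \right) d^{cd}{}_q \\
&= \left( f^{ac}{}_d\, f^{by}{}_x - f^{bc}{}_d\, f^{ay}{}_x \right) d_{yq}{}^d + f^{ab}{}_y\, f_{dx}{}^y\, d^{cd}{}_q.
\end{aligned} \tag{B.21} \]

Now, since rhs is invariant under cyclic permutations of $(a, b, c)$, we can write

\[ \begin{aligned}
6\, \text{rhs} &= \beta_0^3 \left[ 2 f^{ab}{}_d\, f^{cy}{}_x\, d_{yq}{}^d - f^{ab}{}_y\, f_{dx}{}^y\, d^{cd}{}_q + \text{circ. perm. } (a, b, c) \right] d^x{}_{rs}\, W_0^q W_0^r W_0^s \\
&= \beta_0^3 \left[ f^{ab}{}_d \left( -2 f^{cy}{}_q\, d_{yx}{}^d - f_{yx}{}^d\, d^{cy}{}_q \right) + \text{circ. perm. } (a, b, c) \right] d^x{}_{rs}\, W_0^q W_0^r W_0^s \\
&= -3 \beta_0^3\, f^{ab}{}_d\, f^{cy}{}_q\, d_{yx}{}^d\, d^x{}_{rs}\, W_0^q W_0^r W_0^s + \text{circ. perm. } (a, b, c).
\end{aligned} \tag{B.22} \]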
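From the normalisation $\beta_0 = p$, we deduce that lhs and rhs are equal when $\beta_1^2 = 4$, i.e. for

\[ Q_0^a = p\, W_0^a \quad \text{and} \quad Q_1^a = 2\, W_1^a, \tag{B.23} \]

which ends the proof for the generic case.

B.3. The particular case of Y(sl(2)). As a normalisation, we take for the fundamental representation of gl(2) the matrices:

\[ t^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad
t^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \quad
t^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \quad
t^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{B.24} \]

We have in that case

\[ d^{abc} = 2\, \delta_0^a\, \eta^{bc} + \text{circ. perm. } (a, b, c) \quad \text{and} \quad c_v = -8. \tag{B.25} \]

Then, using the special property $f^{ij}{}_m\, f^m{}_{kl} = -4\, (\eta_k^i\, \eta_l^j - \eta_l^i\, \eta_k^j)$ valid in sl(2) (i.e. when none of the indices is 0), we get the PB

\[ \{W_1^i, W_1^j\} = f^{ij}{}_k \left( \frac{1}{5}\, W_2^k - \frac{p^3}{2}\, (\vec W_0 \cdot \vec W_0)\, W_0^k \right), \tag{B.26} \]

where $\vec x \cdot \vec y = x^1 y_1 + x^2 y_2 + x^3 y_3$ and the indices $i, j, k$ now run from 1 to 3. Note that for $p = 2$, once the constraint $W_2^a = 10\, [W_0^a W_1^0 + \delta_0^a\, (\vec W_1 \cdot \vec W_0)]$ is applied, we recover the algebra presented in Sect. 2.4, up to the normalisation $J^i = W_0^i$, $S^i = 2\, W_1^i$, $C_2 = W_1^0$ and $f^{ij}{}_k = 2i\, \varepsilon^{ij}{}_k$.

After multiplication by $f_{ij}{}^k\, f_{mn}{}^l$, the relation (3.4) can be rewritten as:

\[ f_{ij}{}^k\, \{\{W_1^i, W_1^j\}, W_1^l\} + f_{ij}{}^l\, \{\{W_1^i, W_1^j\}, W_1^k\} = 32 \left( f_{ij}{}^k\, \eta_m^l + f_{ij}{}^l\, \eta_m^k \right) W_0^m W_0^j W_1^i. \tag{B.27} \]

Using the above PB and the normalisation $\beta_0 = p$, we get:

\[ \begin{aligned}
\text{lhs} &= c_v\, \beta_1^3 \left[ \frac{1}{5} \left( \{W_2^k, W_1^l\} + \{W_2^l, W_1^k\} \right) - p^2 \left( f_{ij}{}^k\, \eta_m^l + f_{ij}{}^l\, \eta_m^k \right) W_0^m W_0^j W_1^i \right], \\
\text{rhs} &= 32\, p^2\, \beta_1 \left( f_{ij}{}^k\, \eta_m^l + f_{ij}{}^l\, \eta_m^k \right) W_0^m W_0^j W_1^i.
\end{aligned} \tag{B.28} \]

Thus, we need to simplify the PBs (B.12) in the new basis (B.14–B.15). For sl(2) it takes the form:

\[ \{W_1^i, W_2^j\} = f^{ij}{}_k \left[ \frac{3}{14}\, W_3^k + 3\, W_2^0 W_0^k + \frac{6(3p^2-7)}{p(p^2-1)}\, W_1^k W_1^0 + \frac{(p^2-9)(p^2-4)}{2(p^2-1)}\, W_1^k\, \vec W_0^{\,2} - 30\, W_0^k\, (\vec W_1 \cdot \vec W_0) \right] \tag{B.29} \]

so that it does not contribute to lhs. Hence, we have

\[ \text{lhs} = -c_v\, \beta_1^3\, p^2 \left( f_{ij}{}^k\, \eta_m^l + f_{ij}{}^l\, \eta_m^k \right) W_0^m W_0^j W_1^i. \tag{B.30} \]

The relation (B.27) is then satisfied for

\[ Q_0^a = p\, W_0^a \quad \text{and} \quad Q_1^a = 2\, W_1^a, \tag{B.31} \]

which is the same normalisation as for the generic case.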
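The gl(2) conventions of (B.24)–(B.25) can be checked numerically. The sketch below extracts $f^{ab}{}_c$ and $d^{ab}{}_c$ from commutators and anticommutators of the matrices (B.24), taking $\eta^{ab} = \delta^{ab}$, which is what (B.1) and (B.2) give for this basis; with that assumption upper and lower indices need not be distinguished in the code. It then checks $f^{12}{}_3 = 2i$, the extension (B.1), $c_v = -8$, the sl(2) property used before (B.26), and the Jacobi identities (B.3)–(B.4). Finally it reads off $\beta_1^2 = -32 / c_v$ from equating (B.30) with the rhs of (B.28). All helper names are hypothetical.

```python
from itertools import product

# Basis (B.24): T[0] is the identity, T[1..3] are the Pauli matrices.
T = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDX = range(4)
SL2 = range(1, 4)


def mul(x, y):
    return [[sum(x[r][s] * y[s][c] for s in range(2)) for c in range(2)] for r in range(2)]


def component(x, c):
    """Coefficient of T[c] in x, using tr(T[a] T[b]) = 2 delta_ab."""
    y = mul(x, T[c])
    return (y[0][0] + y[1][1]) / 2


def bracket(a, b, sign):
    ab, ba = mul(T[a], T[b]), mul(T[b], T[a])
    return [[ab[r][s] + sign * ba[r][s] for s in range(2)] for r in range(2)]


# F[a][b][c] = f^{ab}_c from the commutator, D[a][b][c] = d^{ab}_c from the anticommutator.
F = [[[component(bracket(a, b, -1), c) for c in IDX] for b in IDX] for a in IDX]
D = [[[component(bracket(a, b, +1), c) for c in IDX] for b in IDX] for a in IDX]


def close(x, y):
    return abs(x - y) < 1e-12


def delta(a, b):
    return 1 if a == b else 0


def run_checks():
    checks = {}
    checks["f_12_3"] = close(F[1][2][3], 2j)
    checks["B1"] = all(close(D[a][b][0], 2 * delta(a, b)) and close(F[a][b][0], 0)
                       for a, b in product(IDX, repeat=2))
    c_v = sum(F[a][b][1] * F[a][b][1] for a, b in product(IDX, repeat=2))
    checks["c_v"] = all(
        close(sum(F[a][b][c] * F[a][b][e] for a, b in product(IDX, repeat=2)), c_v * delta(c, e))
        for c, e in product(SL2, repeat=2))
    checks["ff"] = all(
        close(sum(F[i][j][m] * F[m][k][l] for m in IDX),
              -4 * (delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)))
        for i, j, k, l in product(SL2, repeat=4))
    checks["B3"] = all(
        close(sum(F[a][b][c] * F[c][d][e] + F[b][d][c] * F[c][a][e] + F[d][a][c] * F[c][b][e]
                  for c in IDX), 0)
        for a, b, d, e in product(IDX, repeat=4))
    checks["B4"] = all(
        close(sum(D[a][b][c] * F[c][d][e] + D[b][d][c] * F[c][a][e] + D[d][a][c] * F[c][b][e]
                  for c in IDX), 0)
        for a, b, d, e in product(IDX, repeat=4))
    beta1_squared = (-32 / c_v).real  # from -c_v beta_1^3 p^2 = 32 p^2 beta_1
    return checks, c_v, beta1_squared


if __name__ == "__main__":
    print(run_checks())
```

All entries are small Gaussian integers, so the floating-point tolerance in `close` is only a safeguard.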
C. Quantum W(sl(np), n.sl(p)) Algebras

In the quantum case, one has to check that the corrections to the leading terms in the W-algebras do not perturb the defining relations of the Yangian⁵.

– In the general case $n \neq 2$, the calculation is quite easy. Indeed, the commutator takes the form

\[ [W_1^a, W_1^b] = -\frac{p^3}{16} \left( d^{au}{}_v\, f^b{}_{cu} - d^{bu}{}_v\, f^a{}_{cu} \right) d^v{}_{de}\, s_3(W_0^c, W_0^d, W_0^e) + f^{ab}{}_c \left( \frac{1}{5}\, W_2^c + \mu_1 W_1^c + \mu_2\, d^c{}_{de}\, s_2(W_0^d, W_0^e) + \mu_3 W_0^c \right), \tag{C.1} \]

where $\mu_i$ ($i = 1, 2, 3$) are undetermined constants. However, one remarks that the terms containing $f^{ab}{}_c$ in $[W_1^a, W_1^b]$ do not contribute to (3.3). Since these are the only type of terms we add, the calculation is identical to the classical one (up to symmetrization of the products).

– In the case of W(sl(2p), 2.sl(p)), we need a little more. Due to the calculations done in the classical case, we already know that proving (3.4) amounts to showing that

\[ \mu_1 \left( [W_1^i, W_1^j] + [W_1^j, W_1^i] \right) + \mu_3 \left( [W_0^i, W_1^j] + [W_0^j, W_1^i] \right) + \frac{1}{5} \left( [W_2^i, W_1^j] + [W_2^j, W_1^i] \right) = 0, \tag{C.2} \]

where the indices run from 1 to 3. The terms corresponding to $\mu_1$ and $\mu_3$ disappear because of the antisymmetry in $(i, j)$. Thus, we just need to compute the corrections to the commutator $[W_2^a, W_1^b]$. Using the results of Appendix D, we compute the most general form of this commutator:

\[ \begin{aligned}
[W_1^a, W_2^b] = {} & f^{ab}{}_c \left[ \frac{3}{14}\, W_3^c + \frac{6(3p^2-7)}{p(p^2-1)}\, s_2(W_1^c, W_1^0) + 3\, s_2(W_2^0, W_0^c) + \left( \frac{(p^2-9)(p^2-4)}{2(p^2-1)}\, \eta_{dg}\, \eta_e^c - 30\, \eta_{de}\, \eta_g^c \right) s_3(W_0^g, W_0^d, W_1^e) \right] \\
& + f^{ab}{}_c \left( \nu_1 W_2^c + \nu_1' W_1^c + \nu_1'' W_0^c \right) + \nu_2\, f^{ab}{}_c\, \eta_{de}\, s_3(W_0^c, W_0^d, W_0^e) \\
& + \left[ \nu_3\, f^{ab}{}_u\, f^u{}_{cd} + \nu_3'\, \eta^{ab}\, \eta_{cd} + \nu_3'' \left( \eta_c^a\, \eta_d^b + \eta_d^a\, \eta_c^b \right) \right] W_1^c W_0^d \\
& + \left[ \nu_4\, \eta^{ab}\, \eta_{cd} + \nu_4' \left( \eta_c^a\, \eta_d^b + \eta_d^a\, \eta_c^b \right) \right] s_2(W_0^c, W_0^d)
\end{aligned} \tag{C.3} \]

with indices running from 0 to 3. Looking at (C.2), one sees that some of the new terms that may appear in the right-hand side of the commutator do contribute to (C.2). Thus, one has to check that they are not in the true commutator. This is done thanks to the Jacobi identity based on $(W_1^a, W_1^b, W_1^c)$ which shows (for $a, b, c$ all different) that⁶

⁵ More exactly, that the modification is the same as the one introduced in replacing the commutative product in U(G*) by the (symmetrised) non-Abelian product of U(G).

⁶ Let us note en passant that the Jacobi identity has just removed in the new terms those which are symmetric in $a, b$.
$\nu_3' = \nu_3'' = \nu_4 = \nu_4' = 0$. We deduce that the commutator takes the form:

\[ \begin{aligned}
[W_1^a, W_2^b] = f^{ab}{}_c \Bigg[ & \frac{3}{14}\, W_3^c + \frac{6(3p^2-7)}{p(p^2-1)}\, s_2(W_1^c, W_1^0) + 3\, s_2(W_2^0, W_0^c) \\
& + \left( \frac{(p^2-9)(p^2-4)}{2(p^2-1)}\, \eta_{dg}\, \eta_e^c - 30\, \eta_{de}\, \eta_g^c \right) s_3(W_0^g, W_0^d, W_1^e) \\
& + \nu_1 W_2^c + \nu_1' W_1^c + \nu_1'' W_0^c + \nu_2\, \eta_{de}\, s_3(W_0^c, W_0^d, W_0^e) + \nu_3\, f^c{}_{de}\, W_1^d W_0^e \Bigg]
\end{aligned} \]

so that (C.2) and hence (3.4) are satisfied.

D. Tensor Products of Some Finite Dimensional Representations of sl(n)

We want here to compute the tensor product of the G-adjoint representation by itself several times, for G = sl(n). We will also need to select the totally symmetric part of these products. For this purpose, we use Young diagrams, which allow us to determine the decompositions:

\[ G \otimes G = (2, 0, \dots, 0, 2) \oplus 2\, (1, 0, \dots, 0, 1) \oplus (2, 0, \dots, 0, 1, 0) \oplus (0, 1, 0, \dots, 0, 2) \oplus (0, 1, 0, \dots, 0, 1, 0) \oplus (0, \dots, 0), \]

where we have denoted by G = (1, 0, …, 0, 1) the adjoint representation. It remains to select the (anti-)symmetric part of these products. For G ⊗ G, the calculation has already been done (see e.g. [23]) and reads:

\[ S_2(G) = (G \otimes G)_{sym} = (2, 0, \dots, 0, 2) \oplus (1, 0, \dots, 0, 1) \oplus (0, 1, 0, \dots, 0, 1, 0) \oplus (0, \dots, 0), \tag{D.1} \]

\[ \Lambda^2(G) = (G \otimes G)_{skew} = (1, 0, \dots, 0, 1) \oplus (2, 0, \dots, 0, 1, 0) \oplus (0, 1, 0, \dots, 0, 2). \tag{D.2} \]

As far as $S_3(G)$ is concerned, we already know that this sum of representations belongs to $(S_2(G) \otimes G)_{sym}$, which decomposes as

\[ \begin{aligned}
(S_2(G) \otimes G)_{sym} = {} & (3, 0, \dots, 0, 3) \oplus 3\, (2, 0, \dots, 0, 2) \oplus 3\, (1, 0, \dots, 0, 1) \oplus 3\, (0, 1, 0, \dots, 0, 1, 0) \\
& \oplus 2\, (1, 1, 0, \dots, 0, 1, 1) \oplus 2\, [(2, 0, \dots, 0, 1, 0) \oplus (0, 1, 0, \dots, 0, 2)] \\
& \oplus (0, 0, 1, 0, \dots, 0, 1, 0, 0) \oplus [(3, 0, \dots, 0, 1, 1) \oplus (1, 1, 0, \dots, 0, 3)] \\
& \oplus [(1, 1, 0, \dots, 0, 1, 0, 0) \oplus (0, 0, 1, 0, \dots, 0, 1, 1)] \oplus (0, \dots, 0).
\end{aligned} \tag{D.3} \]

This implies that we must have

\[ \begin{aligned}
S_3(G) = {} & a\, (3, 0, \dots, 0, 3) \oplus b\, (2, 0, \dots, 0, 2) \oplus c\, (1, 0, \dots, 0, 1) \oplus d\, (0, 1, 0, \dots, 0, 1, 0) \\
& \oplus e\, (1, 1, 0, \dots, 0, 1, 1) \oplus f\, (0, 0, 1, 0, \dots, 0, 1, 0, 0) \oplus m\, (0, \dots, 0) \\
& \oplus g\, [(2, 0, \dots, 0, 1, 0) \oplus (0, 1, 0, \dots, 0, 2)] \oplus h\, [(3, 0, \dots, 0, 1, 1) \oplus (1, 1, 0, \dots, 0, 3)] \\
& \oplus i\, [(1, 1, 0, \dots, 0, 1, 0, 0) \oplus (0, 0, 1, 0, \dots, 0, 1, 1)]
\end{aligned} \tag{D.4} \]
with each multiplicity in (D.4) lower than or equal to the corresponding multiplicity in (D.3). But we know the dimension of $S_3(G)$: it is the dimension of a totally symmetric tensor with 3 indices in a space of dimension $\dim G = n^2 - 1$, i.e. $\frac{n^2 (n^4 - 1)}{6}$. Computing this dimension with (D.4) leads to only two possible solutions for the parameters: $a = e = f = 1$, $h = i = 0$, $b = d = m$, $c = 3 - m$ and $g = 2 - m$ with $m = 0$ or $1$. As $m$ is the multiplicity of the trivial representation in $S_3(G)$, we deduce that, for⁷ G = sl(n), $n \neq 2$, we have $m = 1$ (since $d_{abc}$ belongs to this space). Thus

\[ \begin{aligned}
S_3(G) = {} & (3, 0, \dots, 0, 3) \oplus (2, 0, \dots, 0, 2) \oplus 2\, (1, 0, \dots, 0, 1) \oplus (0, 1, 0, \dots, 0, 1, 0) \\
& \oplus (1, 1, 0, \dots, 0, 1, 1) \oplus (0, 0, 1, 0, \dots, 0, 1, 0, 0) \\
& \oplus [(2, 0, \dots, 0, 1, 0) \oplus (0, 1, 0, \dots, 0, 2)] \oplus (0, \dots, 0).
\end{aligned} \tag{D.5} \]
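The dimension count behind (D.1), (D.2) and (D.5) can be reproduced with the Weyl dimension formula for sl(n), $\dim(a_1, \dots, a_{n-1}) = \prod_{1 \le i < j \le n} \frac{\sum_{k=i}^{j-1} (a_k + 1)}{j - i}$. The sketch below is only an illustration: the helper names `weyl_dim`, `labels` and `dims` are hypothetical, and $n \ge 7$ is assumed so that all Dynkin labels appearing in (D.5) have their generic form.

```python
def weyl_dim(dynkin):
    """Dimension of the sl(n) irreducible representation with the given Dynkin labels."""
    n = len(dynkin) + 1
    num, den = 1, 1
    for i in range(n):
        for j in range(i + 1, n):
            num *= sum(dynkin[s] + 1 for s in range(i, j))
            den *= j - i
    return num // den


def labels(n, head, tail=None):
    """Dynkin labels (head, 0, ..., 0, tail) of sl(n); tail defaults to the reversed head."""
    head = tuple(head)
    tail = head[::-1] if tail is None else tuple(tail)
    zeros = n - 1 - len(head) - len(tail)
    if zeros < 0:
        raise ValueError("n is too small for these labels")
    return head + (0,) * zeros + tail


def dims(n):
    """Total dimensions of the right-hand sides of (D.1), (D.2) and (D.5)."""
    adj = weyl_dim(labels(n, (1,)))
    r_2_2 = weyl_dim(labels(n, (2,)))
    r_01_10 = weyl_dim(labels(n, (0, 1)))
    r_2_10 = weyl_dim(labels(n, (2,), (1, 0)))
    r_01_2 = weyl_dim(labels(n, (0, 1), (2,)))
    r_3_3 = weyl_dim(labels(n, (3,)))
    r_11_11 = weyl_dim(labels(n, (1, 1)))
    r_001_100 = weyl_dim(labels(n, (0, 0, 1)))
    return {
        "S2": r_2_2 + adj + r_01_10 + 1,
        "L2": adj + r_2_10 + r_01_2,
        "S3": r_3_3 + r_2_2 + 2 * adj + r_01_10 + r_11_11 + r_001_100 + r_2_10 + r_01_2 + 1,
    }


if __name__ == "__main__":
    for n in (7, 8):
        big_n = n * n - 1
        print(n, dims(n), big_n * (big_n + 1) * (big_n + 2) // 6)
```

For each $n$, the three totals can be compared with $N(N+1)/2$, $N(N-1)/2$ and $N(N+1)(N+2)/6 = n^2(n^4-1)/6$, where $N = n^2 - 1$.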
Finally, the multiplicity of the trivial representation in the tensor products occurring in Sect. 5 is computed through the remark that, in sl(n), the tensor product of two finite dimensional irreducible representations $R$ and $R'$ contains the trivial representation if and only if $R$ and $R'$ are conjugate. In that case, the multiplicity is 1. This leads to the multiplicities given in (5.4). We also give a basis for the corresponding spaces. To be complete, let us mention the bases:

\[ \begin{aligned}
\Lambda^2(G) \otimes \Lambda^2(G) : \quad
& t_{cd}^{ab} = f^{ab}{}_e\, f^e{}_{cd}, \\
& t_{cd}^{ab} = d^a{}_{ce}\, d^{eb}{}_d - d^b{}_{ce}\, d^{ea}{}_d, \\
& t_{cd}^{ab} = f^a{}_{ce}\, d^{eb}{}_d - f^b{}_{ce}\, d^{ea}{}_d, \\[4pt]
\Lambda^2(G) \otimes S_3(G) : \quad
& t_{cde}^{ab} = f^{ab}{}_c\, \eta_{de} + \text{circ. perm. } (c, d, e), \\
& t_{cde}^{ab} = f^{ab}{}_g\, d^g{}_{cm}\, d^m{}_{de} + \text{circ. perm. } (c, d, e), \\
& t_{cde}^{ab} = \left( \eta_c^a\, d^b{}_{de} - \eta_c^b\, d^a{}_{de} \right) + \text{circ. perm. } (c, d, e), \\
& t_{cde}^{ab} = \left( f^a{}_{cg}\, d^{gb}{}_m - f^b{}_{cg}\, d^{ga}{}_m \right) d^m{}_{de} + \text{circ. perm. } (c, d, e).
\end{aligned} \tag{D.6} \]

In the case of sl(2), we need more information. Fortunately, the calculation is easier in that case, and we can go further. Indeed, we have (with $D_j$ the $(2j+1)$-dimensional representation of sl(2)):

\[ (D_1 \times D_1)_{sym} = D_0 \oplus D_2; \qquad (D_1 \times D_1)_{skew} = D_1; \qquad S_3(D_1) = D_1 \oplus D_3, \tag{D.7} \]

which leads to the multiplicities and tensors:

\[ \begin{aligned}
M_0[(D_1 \times D_1)_{skew} \times (D_1 \times D_1)_{skew}] &= 1 : \quad f^{ab}{}_u\, f^u{}_{cd} \sim \eta_c^a\, \eta_d^b - \eta_d^a\, \eta_c^b, \\
M_0[(D_1 \times D_1)_{sym} \times (D_1 \times D_1)_{sym}] &= 2 : \quad \eta^{ab}\, \eta_{cd}, \;\; \eta_c^a\, \eta_d^b + \eta_d^a\, \eta_c^b, \\
M_0[(D_1 \times D_1)_{skew} \times (D_1 \times D_1)_{sym}] &= 0 : \quad -, \\
M_0[(D_1 \times D_1)_{skew} \times S_3(D_1)] &= 1 : \quad f^{ab}{}_c\, \eta_{de} + \text{circ. perm. } (c, d, e), \\
M_0[(D_1 \times D_1)_{sym} \times S_3(D_1)] &= 0 : \quad -.
\end{aligned} \tag{D.8} \]

⁷ The case G = sl(2) is treated below.
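The sl(2) decompositions (D.7) and the multiplicities in (D.8) can be reproduced by counting weights. In the sketch below (all names hypothetical), the weights of a symmetric or antisymmetric power of $D_1$ are obtained as sums over multisets or subsets of the weights $\{-1, 0, 1\}$, the resulting weight multiset is peeled into spins starting from the highest weight, and $M_0$ is computed from the fact that $D_j \otimes D_{j'}$ contains $D_0$ exactly once when $j = j'$ and not otherwise.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement

ADJOINT_WEIGHTS = (-1, 0, 1)  # weights of D_1, the adjoint representation of sl(2)


def decompose(weights):
    """Split a multiset of integer sl(2) weights into spins: Counter {j: multiplicity}."""
    remaining = Counter(weights)
    spins = Counter()
    while any(count > 0 for count in remaining.values()):
        j = max(w for w, count in remaining.items() if count > 0)
        spins[j] += 1
        for w in range(-j, j + 1):  # remove the weights -j, ..., j of D_j
            remaining[w] -= 1
    return spins


def trivial_multiplicity(r1, r2):
    """Multiplicity of D_0 in r1 x r2 for representations given as spin Counters."""
    return sum(r1[j] * r2[j] for j in r1)


SYM2 = decompose(sum(c) for c in combinations_with_replacement(ADJOINT_WEIGHTS, 2))
SKEW2 = decompose(sum(c) for c in combinations(ADJOINT_WEIGHTS, 2))
SYM3 = decompose(sum(c) for c in combinations_with_replacement(ADJOINT_WEIGHTS, 3))

if __name__ == "__main__":
    print(dict(SYM2), dict(SKEW2), dict(SYM3))
    print(trivial_multiplicity(SKEW2, SKEW2), trivial_multiplicity(SYM2, SYM2),
          trivial_multiplicity(SKEW2, SYM2), trivial_multiplicity(SKEW2, SYM3),
          trivial_multiplicity(SYM2, SYM3))
```

The same weight-counting idea extends to higher symmetric powers, but only the cases needed in (D.8) are computed here.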
References

1. Zamolodchikov, A.B.: Infinite additional symmetries in two-dimensional conformal quantum field theory. Theor. Math. Phys. 63, 347 (1985)
2. Drinfel’d, V.G.: Hopf algebras and the quantum Yang–Baxter equation. Sov. Math. Dokl. 32, 254 (1985)
3. Feher, L., O’Raifeartaigh, L., Ruelle, P., Tsutsui, I. and Wipf, A.: On the general structure of Hamiltonian reductions of the WZWN theory. Phys. Rep. 222, 1 (1992) and ref. therein
4. Bernard, D.: Hidden Yangians in 2d massive current algebras. Commun. Math. Phys. 137, 191 (1991)
5. Haldane, F.D.M., Na, Z.N.C., Falstra, J.C., Bernard, D. and Pasquier, V.: Yangian symmetry of integrable quantum chains with long-range interactions and a new description of states in conformal field theory. Phys. Rev. Lett. 69, 2021 (1992)
6. Schoutens, K.: Yangian symmetry in conformal field theory. Phys. Lett. B331, 335 (1994); Bouwknegt, P., Ludwig, A. and Schoutens, K.: Spinon bases, Yangian symmetry and fermionic representations of Virasoro characters in conformal field theory. Phys. Lett. B338, 448 (1994)
7. Avan, J., Babelon, O. and Billey, E.: Exact Yangian symmetry in the classical Euler–Calogero–Moser model. Phys. Lett. A188, 263 (1994)
8. De Boer, J., Harmsze, F. and Tjin, T.: Non-linear finite W-symmetries and applications in elementary systems. Phys. Rep. 272, 139 (1996)
9. Barbarin, F., Ragoucy, E. and Sorba, P.: Remarks on finite W-algebras. hep-th/9612070, Proceedings of Vth International Colloquium on Quantum Groups and Integrable Systems, Prague (Czech Republic), June 1996; Extended and Quantum Algebras and their Applications to Physics, Tianjin (China), August 1996; Selected Topics of Theoretical and Modern Mathematical Physics, Tbilisi (Georgia), September 1996
10. Barbarin, F., Ragoucy, E. and Sorba, P.: W-realization of Lie algebras: Application to so(4, 2) and Poincaré algebras. Commun. Math. Phys. 186, 393 (1997)
11. Barbarin, F., Ragoucy, E. and Sorba, P.: Non-polynomial realizations of W-algebras. Int. J. Math. Phys. A11, 2835 (1996)
12. Barbarin, F., Ragoucy, E. and Sorba, P.: Finite W algebras and intermediate statistics. Nucl. Phys. B442, 425 (1995)
13. Balog, J., Feher, L., O’Raifeartaigh, L., Forgacs, P. and Wipf, A.: Toda theory and W-algebra from a gauged WZWN point of view. Ann. of Phys. 203, 76 (1990)
14. Bais, F.A., Tjin, T. and van Driel, P.: Covariantly coupled chiral algebras. Nucl. Phys. B357, 632 (1991)
15. Frappat, L., Ragoucy, E. and Sorba, P.: W-algebras and superalgebras from constrained WZW models: A group theoretical classification. Commun. Math. Phys. 157, 499 (1993)
16. de Boer, J. and Tjin, T.: The relation between quantum W algebras and Lie algebras. Commun. Math. Phys. 160, 317 (1994); Representation theory of finite W algebras. Commun. Math. Phys. 158, 485 (1993)
17. Madsen, J.O. and Ragoucy, E.: Quantum Hamiltonian reduction in superspace formalism. Nucl. Phys. B429, 277 (1994); Secondary quantum Hamiltonian reduction. Commun. Math. Phys. 185, 509 (1997)
18. Barbarin, F.: Algèbres W et applications. (in French), PhD thesis, p. 82–85, Preprint LAPTH
19. Chari, V. and Pressley, A.: A guide to quantum group. chap. 12, Cambridge: Cambridge University Press, 1994
20. Ge, M.L., Xue, K. and Cho, Y.M.: Realizations of Yangians in Quantum Mechanics and Applications. preprint NIM-TP-97-12
21. Ragoucy, E., Sorba, P. and Zaugg, Ph.: Work in progress
22. Frappat, L., Ragoucy, E. and Sorba, P.: Folding the W-algebras. Nucl. Phys. B404, 805 (1993)
23. Gourdin, M.: Basic of Lie groups. Moriond series no. 37, Ed. Frontières, 1982

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 203, 573 – 592 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Classical r-Matrices and Compatible Poisson Structures for Lax Equations on Poisson Algebras
Luen-Chau Li
Department of Mathematics, Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]
Received: 10 February 1998 / Accepted: 9 December 1998
Abstract: Given a classical r-matrix on a Poisson algebra, we show how to construct a natural family of compatible Poisson structures for the Hamiltonian formulation of Lax equations. Examples for which our formalism applies include the Benny hierarchy, the dispersionless Toda lattice hierarchy, the dispersionless KP and modified KP hierarchies, the dispersionless Dym hierarchy, etc.
1. Introduction

Two Poisson brackets on the same manifold are said to be compatible if their sum is also a Poisson bracket [GDO,M]. There are many examples of integrable systems which are Hamiltonian with respect to two compatible Poisson structures (see, e.g. [DO]). Indeed, when one of the structures happens to be nondegenerate, there is a simple way which allows one to produce a whole family of compatible Poisson structures [KR,RSTS1]. However, the existence of further structures is not a necessity when the two compatible structures are both degenerate. In the late seventies, we saw the beginning of the Lie algebraic approach to integrable systems [K,A]. The Korteweg–de Vries (KdV) equation, for example, was shown to be a Hamiltonian system on coadjoint orbits [A]. Furthermore, the second Poisson structure for KdV type equations was constructed on subspaces of the algebra of formal pseudo-differential operators [A,GD]. We now refer to this second structure as the Adler–Gelfand–Dickey structure. Recently, it was found to be of independent interest in conformal field theory [DFIZ]. In the meantime, the Lie algebraic approach to integrable systems was extensively developed, particularly by the Russian school in St. Petersburg (see, e.g., the survey in [RSTS2]). In the so-called r-matrix framework, the simplest Poisson structures for the Hamiltonian formulation of Lax equations on Lie algebras are the linear Poisson structures associated with the R-brackets. In the case where g is the Lie algebra of a noncommutative, associative algebra, a construction of quadratic brackets
which give Lax equations was first available for the skew-symmetric r-matrices satisfying the modified Yang–Baxter equation [STS1]. Subsequently, this was superseded by a more general construction valid for a wider class of r-matrices [LP1,LP2]. Indeed, in [LP2], even a third order structure was found. At this juncture, the reader should note that on the abstract level of associative algebras, neither the linear structure nor the quadratic structure is nondegenerate. Therefore, the recipe for producing a whole family of structures is not applicable in this context. As a matter of fact, no Poisson structures with order > 3 were ever found. In this connection, we would like to mention the thesis of Strack [ST], which showed (by using computer algebra) that beyond order 3, no Poisson structures of a certain form can exist for the Hamiltonian formulation of Lax equations. So this is the state of affairs for noncommutative, associative algebras. In this paper, we address the Hamiltonian formulation of Lax equations, as before, but in the context of Poisson algebras. Here, we show how to construct a natural family of compatible Poisson structures on the full algebra. On the group of invertible elements (if it is non-empty and forms an open subset), similar considerations show that we can even define structures of negative order. Thus the situation for Poisson algebras, in which multiplication is commutative, is entirely different. Recall that a Poisson algebra is by definition a commutative, associative algebra with unit 1 equipped with a Lie bracket such that the Leibniz rule holds [W1]. The most familiar examples of Poisson algebras are given by the collection of smooth functions on Poisson manifolds. For us, the particular examples which have partly motivated this work are the algebras associated with the truncated Benny equation [G-KR], and the various dispersionless equations [DM,K,TT] which are currently of interest in topological field theory [D,K].
As the reader will see, a family of vector fields $V_n$, $n \ge -1$, plays the key role in this investigation. These vector fields $V_n$ are invariants of degree 1 of the vector fields associated with the Lax equations, and satisfy the Virasoro relations $[V_m, V_n] = (n-m)V_{m+n}$. For a given classical r-matrix on the Poisson algebra, we can construct the associated linear bracket. If we denote by $\pi_{-1}$ the bivector field corresponding to this basic linear structure, we shall show that the Lie derivatives $L_{V_m}\pi_{-1}$ essentially generate all higher order structures. Thus our construction works for an arbitrary classical r-matrix! This is in marked contrast to previous results on quadratic Poisson structures on noncommutative, associative algebras [STS1,LP1,LP2], where one has to make rather stringent assumptions on the r-matrix. In this connection, we would like to remind the reader of the important difference between the notions of double Lie algebras and Lie bialgebras. Recall that the former was motivated by the study of integrable systems [STS1] and is associated with classical r-matrices. On the other hand, the notion of Lie bialgebras had its origin in the geometry of Poisson Lie groups [DR]. The two do intersect, for example, in the class of double Lie algebras called Baxter Lie algebras [STS2] (where the r-matrix satisfies additional properties). In our case, as the r-matrix is assumed to be completely arbitrary, we are working within the framework of double Lie algebras here. The paper is organized as follows. In Sect. 2, we assemble a number of basic facts and definitions which will be used in the paper. In Sect. 3, we formulate the main result and display the explicit formulas for the linear, quadratic, and higher order structures. Then we study a number of basic properties. In order to prepare for the proof of the main result, we introduce the vector fields $V_n$ in Sect. 4 and discuss their relation with the Lax equations. Then, in Sect.
5, we give a proof of the main result. In order to illustrate the use of our construction in Sect. 3, we describe the multi-Hamiltonian formalism of some concrete partial differential equations in Sect. 6. Our examples include the hierarchy of truncated Benny equations [B,G-KR] in nonlinear waves, the dispersionless Toda lattice
hierarchy [DM], the dispersionless KP [K,TT] and modified KP hierarchies, and the dispersionless Dym hierarchy. Note that in each example, the set of Lax operators under consideration is a submanifold of the full Poisson algebra. However, this submanifold is not necessarily a Poisson submanifold of the full algebra equipped with a bracket which comes from Sect. 3. For this reason, the passage from the bracket on the algebra to the Hamiltonian structure on the submanifold of Lax operators might involve the process of reduction [MR]. Thus in our examples, we find Dirac reduction [D,MR] (i.e. reduction with constraints) comes in naturally. For the Benny hierarchy and the dispersionless Toda lattice hierarchy, we shall compute the first few Poisson structures explicitly, and illustrate the use of Dirac reduction. Our explicit expressions for the structures not only allow us to find the Casimir functions, they also show that the structures which come from our Poisson algebras are of hydrodynamic type or its generalizations [DN,F]. Indeed, as it turns out, all the higher structures of the dispersionless Toda lattice hierarchy are nonlocal generalizations of brackets of hydrodynamic type. This shows how our construction in Sect. 3 can get complicated upon reduction to a specific submanifold of Lax operators. To close we stress again that our main result is formulated along the lines of the r-matrix approach (where Poisson structures are defined either on Lie algebras or their duals, or on Lie groups) and applies to all Poisson algebras satisfying the assumptions of Theorem 3.2. In any concrete application, the use of reduction techniques (where necessary) is perfectly natural and the reader should not feel uncomfortable under such circumstances.

2. Preliminaries

We collect in this section a number of basic facts, and introduce some terminology which will be used in the sequel. Let P be a smooth manifold.
A Poisson bracket $\{\cdot,\cdot\}$ on $P$ is a Lie bracket on $C^\infty(P)$ which satisfies the derivation property in each argument. If $\pi$ is the bivector field corresponding to the bracket operation, i.e.
$$\{F, H\} = \pi(dF, dH), \tag{2.1}$$
then it is well-known that the Jacobi identity for $\{\cdot,\cdot\}$ is equivalent to $[\pi,\pi]_S = 0$ [W2], where $[\cdot,\cdot]_S$ is the Schouten bracket [S]. Recall that if $\Gamma(\wedge^k TM)$ is the space of sections of the vector bundle $\wedge^k TM$, and $\wedge^*(M) = \bigoplus_{k\ge 0}\Gamma(\wedge^k TM)$, the Schouten bracket $[\cdot,\cdot]_S$ is the bilinear map
$$[\cdot,\cdot]_S : \wedge^*(M)\times\wedge^*(M)\to\wedge^*(M) \tag{2.2}$$
which extends the usual Lie bracket operation on $\Gamma(TM)$ and makes $\wedge^*(M)$ into a Lie superalgebra. In particular, the following graded Jacobi identity holds:
$$(-1)^{pr}[u,[v,w]_S]_S + (-1)^{qp}[v,[w,u]_S]_S + (-1)^{rq}[w,[u,v]_S]_S = 0, \tag{2.3}$$
where $u\in\Gamma(\wedge^p TM)$, $v\in\Gamma(\wedge^q TM)$ and $w\in\Gamma(\wedge^r TM)$. As we mentioned in the introduction, two Poisson brackets on $P$ are said to be compatible if their sum is also a Poisson bracket, i.e. satisfies the Jacobi identity [GDO,M]. In terms of the corresponding bivector fields $\pi_1$ and $\pi_2$, this is equivalent to $[\pi_1,\pi_2]_S = 0$, as $[\pi_i,\pi_i]_S = 0$, $i = 1, 2$. In this paper, we shall construct compatible Poisson structures for the Hamiltonian formulation of Lax equations (associated with r-matrices) when the underlying manifold $P$ is a Poisson algebra.
Definition 2.4. Let $A$ be a commutative, associative algebra with unit 1. If there is a Lie bracket on $A$ such that for each element $a\in A$, the operator $\mathrm{ad}_a : b\mapsto [a,b]$ is a derivation of the multiplication, then $(A,[\cdot,\cdot])$ is called a Poisson algebra.

Thus the Poisson algebras are Lie algebras with an additional associative algebra structure (with commutative multiplication and unit 1) related by the derivation property to the Lie bracket. Note that some authors call the Lie bracket on $A$ the Poisson structure on $A$ (see, for example, [W1]), but we shall refrain from such usage in order to avoid confusion. We now recall the notion of a classical r-matrix [STS1]. Let $\mathfrak{g}$ be a Lie algebra. A linear operator $R$ in the space $\mathfrak{g}$ is called a classical r-matrix if the R-bracket given by
$$[X,Y]_R = \tfrac12\bigl([RX,Y] + [X,RY]\bigr), \qquad X, Y\in\mathfrak{g} \tag{2.5}$$
is a Lie bracket, i.e. satisfies the Jacobi identity. Some well-known sufficient conditions for $R\in\mathrm{End}(\mathfrak{g})$ to be a classical r-matrix are the Yang–Baxter equation and the modified Yang–Baxter equation. But in this paper, we can establish our results without assuming these conditions. To close this section, we define what we mean by Lax equations.

Definition 2.6. Let $A$ be a Poisson algebra, and suppose $R\in\mathrm{End}(A)$ is a classical r-matrix. Equations of the form
$$\dot L = [R(X(L)), L], \qquad L\in A, \tag{2.7}$$
where $X : A\to A$ is a smooth map satisfying
$$[X(L),L] = 0, \qquad dX(L)\cdot[L',L] = [L',X(L)], \qquad L, L'\in A, \tag{2.8}$$
are called Lax equations. The basic Lax equations on $A$ are given by
$$\dot L = Z_m(L) = [R(L^m), L], \qquad m\ge 1. \tag{2.9}$$
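Definition 2.4 and the R-bracket (2.5) can be tried out on a small concrete model. The sketch below is an illustration only, not part of the paper's argument: it assumes the algebra of Laurent polynomials in $\lambda$ with the bracket (6.3) of Sect. 6.1, takes the coefficients to be polynomials in $x$ (rather than periodic functions) so that exact arithmetic is possible, and assumes the r-matrix normalization $R = \Pi_{\ge 1} - \Pi_{\le 0}$. All function names and the three test elements are hypothetical.

```python
from fractions import Fraction
from itertools import product

# An element is a dict {(i, j): c} standing for the finite sum of
# c * lam**i * x**j, with i any integer (Laurent in lam) and j >= 0.

def clean(u):
    """Drop zero coefficients so that equal elements compare equal."""
    return {k: c for k, c in u.items() if c != 0}

def add(u, v, s=1):
    """Return u + s * v."""
    w = dict(u)
    for k, c in v.items():
        w[k] = w.get(k, 0) + s * c
    return clean(w)

def mul(u, v):
    """Commutative, associative product."""
    w = {}
    for ((i, j), p), ((k, m), q) in product(u.items(), v.items()):
        w[(i + k, j + m)] = w.get((i + k, j + m), 0) + p * q
    return clean(w)

def d_lam(u):
    return clean({(i - 1, j): i * c for (i, j), c in u.items()})

def d_x(u):
    return clean({(i, j - 1): j * c for (i, j), c in u.items() if j > 0})

def bracket(u, v):
    """[u, v]_{-1} = u_lam * v_x - u_x * v_lam, cf. (6.3)."""
    return add(mul(d_lam(u), d_x(v)), mul(d_x(u), d_lam(v)), -1)

def R(u):
    """Assumed r-matrix: +1 on powers lam**i with i >= 1, -1 on i <= 0."""
    return {k: (c if k[0] >= 1 else -c) for k, c in u.items()}

def r_bracket(u, v):
    """R-bracket (2.5): ([Ru, v] + [u, Rv]) / 2."""
    s = add(bracket(R(u), v), bracket(u, R(v)))
    return {k: c / 2 for k, c in s.items()}

def jacobi_defect(br, a, b, c):
    """Cyclic sum br(a, br(b, c)) + ...; an empty dict means Jacobi holds."""
    return add(add(br(a, br(b, c)), br(b, br(c, a))), br(c, br(a, b)))

# Three arbitrary (hypothetical) test elements with rational coefficients.
F = Fraction
a = {(2, 1): F(1), (-1, 0): F(3)}
b = {(1, 2): F(2), (0, 1): F(-1), (-2, 3): F(5)}
c = {(-1, 1): F(1), (3, 0): F(4), (0, 2): F(7)}

# Derivation property of Definition 2.4: [a, bc] = [a, b] c + b [a, c].
leibniz_defect = add(bracket(a, mul(b, c)),
                     add(mul(bracket(a, b), c), mul(b, bracket(a, c))), -1)

print(leibniz_defect == {})                     # derivation property
print(jacobi_defect(bracket, a, b, c) == {})    # Jacobi for [.,.]_{-1}
print(jacobi_defect(r_bracket, a, b, c) == {})  # Jacobi for the R-bracket
```

A check on three sample elements is of course no proof; it only illustrates what the two conditions (derivation property, Jacobi identity for $[\cdot,\cdot]_R$) ask for.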
More generally, if $H$ is a smooth ad-invariant function (in the sense defined in (3.1)), then $\dot L = [R(L^m\,dH(L)), L]$ is also a Lax equation.

3. A Family of Compatible Poisson Structures on Poisson Algebras

In what follows, we shall assume the Poisson algebra $A$ is equipped with a non-degenerate ad-invariant pairing $(\cdot,\cdot)$. A function $F$ defined on $A$ is said to be smooth if there exists a map $dF : A\to A$ such that
$$\frac{d}{dt}\Big|_{t=0} F(L + tL') = (dF(L), L'), \qquad L, L'\in A. \tag{3.1}$$
Theorem 3.2. Let $A$ be a Poisson algebra with Lie bracket $[\cdot,\cdot]$ and non-degenerate ad-invariant pairing $(\cdot,\cdot)$ with respect to which the operation of multiplication is symmetric, i.e. $(XY, Z) = (X, YZ)$, $\forall\, X, Y, Z\in A$. Assume $R\in\mathrm{End}(A)$ is a classical r-matrix. Then
(a) for each integer $n\ge -1$, the formula
$$\{F,H\}_{(n)}(L) = \bigl(L,\ [R(L^{n+1}dF(L)), dH(L)] + [dF(L), R(L^{n+1}dH(L))]\bigr) \tag{3.3}$$
(where $F$ and $H$ are smooth) defines a Poisson structure on $A$,
(b) the structures $\{\cdot,\cdot\}_{(n)}$ are compatible with each other,
(c) if $\pi_n$ is the bivector field corresponding to $\{\cdot,\cdot\}_{(n)}$ and $D_{\pi_n} : \wedge^*(A)\to\wedge^*(A)$ is the associated coboundary operator, i.e. $D_{\pi_n}X = [\pi_n, X]_S$, $X\in\wedge^*(A)$, then there exist vector fields $V_m$ on $A$, $m\ge -1$, satisfying the Virasoro relations $[V_m,V_n] = (n-m)V_{m+n}$ such that $D_{\pi_n}V_m = (n-m)\pi_{m+n}$, $m, n\ge -1$.

We shall prove this result in Sect. 5, after we introduce the vector fields $V_m$ in Sect. 4 and explain what they are in relation to the Lax equations. As the reader will see, the relations $[\pi_n, V_m]_S = (n-m)\pi_{m+n}$ between the bivector fields which we establish at the beginning of Sect. 5 play the key role in proving parts (a) and (b) of the above theorem. They are also responsible for the following.

Corollary 3.4 (Involution of Casimir Functions). $\{H^0_{\pi_n}(A), H^0_{\pi_n}(A)\}_{(m+n)} = 0$, $m, n\ge -1$, $m\ne n$.

Proof. This follows from the formula $[\pi_n,V_m]_S(dF,dH) = L_{V_m}\pi_n(dF,dH) = V_m\{F,H\}_{(n)} - \{V_mF,H\}_{(n)} - \{F,V_mH\}_{(n)}$. □

Remark 3.5. Note that from the compatibility of the structures, it follows that
$$\{H^0_{\pi_m}(A), H^0_{\pi_m}(A)\}_{(n)}\subset H^0_{\pi_m}(A). \tag{3.6}$$
We now give a number of basic properties of the Poisson structures $\{\cdot,\cdot\}_{(n)}$, $n\ge -1$.

Theorem 3.7. (a) Smooth functions in $A$ which are ad-invariant Poisson commute in $\{\cdot,\cdot\}_{(n)}$. (b) The Hamiltonian system generated by a smooth ad-invariant function $H$ in the Poisson structure $\{\cdot,\cdot\}_{(n)}$ is given by the Lax equation $\dot L = [R(L^{n+1}dH(L)), L]$.

Proof. (a) If $F$ and $H$ are smooth functions in $A$ which are ad-invariant, we have $[dF(L),L] = [dH(L),L] = 0$. Therefore,
$$\{F,H\}_{(n)}(L) = \bigl([dH(L),L],\ R(L^{n+1}dF(L))\bigr) + \bigl([L,dF(L)],\ R(L^{n+1}dH(L))\bigr) = 0.$$
(b) If $H$ is ad-invariant, for any smooth $F$, we have
$$\{F,H\}_{(n)}(L) = \bigl(L,\ [dF(L), R(L^{n+1}dH(L))]\bigr) = \bigl(dF(L),\ [R(L^{n+1}dH(L)), L]\bigr).$$ □

From formula (3.3), it is clear that the bracket $\{\cdot,\cdot\}_{(n)}$ vanishes at the unit 1. Therefore, the linearization of $\{\cdot,\cdot\}_{(n)}$ defines a Lie bracket on $A$, and an easy calculation shows it coincides with the R-bracket $[\cdot,\cdot]_R$. The following result is reminiscent of the multiplicative property of Poisson Lie groups [DR]. However, it is in the context of a Poisson algebra and the reason for its validity is entirely different.

Theorem 3.8. Equip $A$ with the structure $\{\cdot,\cdot\}_{(0)}$ and $A\times A$ with the product structure. Then the multiplication map $m : A\times A\to A$ is a Poisson map.
Proof. Let $F$ and $H$ be smooth functions on $A$. For $L_1, L_2\in A$, let $L = m(L_1,L_2)$. Clearly, $F\circ m$ depends on two variables and by taking its derivative with respect to the $i$-th variable, $i = 1, 2$, we obtain $d_1(F\circ m)(L_1,L_2) = L_2\,dF(L)$, $d_2(F\circ m)(L_1,L_2) = L_1\,dF(L)$. To simplify notation, let $X_1 = dF(L)$, $X_2 = dH(L)$ and denote the product structure on $A\times A$ also by $\{\cdot,\cdot\}_{(0)}$, then we have
$$\{F\circ m, H\circ m\}_{(0)}(L_1,L_2) = \bigl(L_1,\ [R(LX_1), L_2X_2] + [L_2X_1, R(LX_2)]\bigr) + \bigl(L_2,\ [R(LX_1), L_1X_2] + [L_1X_1, R(LX_2)]\bigr). \tag{*}$$
By the derivation property of $[\cdot,\cdot]$, the commutativity of multiplication and its symmetry with respect to the ad-invariant pairing $(\cdot,\cdot)$, we have $(L_1, [R(LX_1), L_2X_2]) = (L, [R(LX_1), X_2]) - (L_2, [R(LX_1), L_1X_2])$. Likewise, $(L_2, [L_1X_1, R(LX_2)]) = (L, [X_1, R(LX_2)]) - (L_1, [L_2X_1, R(LX_2)])$. When we insert these relations in (*), the result follows. □

Consider now $A_{\mathrm{inv}}$, the group of invertible elements of $A$. We assume that $A_{\mathrm{inv}}\ne\emptyset$ and that it forms an open subset of $A$. Then we can define vector fields $Z_{-m}$, $V_{-n}$ for $m\ge 1$, $n\ge 2$, on $A_{\mathrm{inv}}$ as in formulas (4.2) and (4.5). If we define
$$\{F,H\}_{(-n)}(L) = \bigl(L,\ [R(L^{-n+1}dF(L)), dH(L)] + [dF(L), R(L^{-n+1}dH(L))]\bigr), \qquad n\ge 2 \tag{3.9}$$
for smooth functions $F$ and $H$ on $A_{\mathrm{inv}}$, it is easy to check that the analysis in Sect. 5 also holds for these objects. In particular, this means $\{\cdot,\cdot\}_{(-n)}$ are Poisson structures on $A_{\mathrm{inv}}$.

Theorem 3.10. Let $\iota : A_{\mathrm{inv}}\to A_{\mathrm{inv}}$ be the inversion map, i.e. $\iota(L) = L^{-1}$. Then $\{F\circ\iota, H\circ\iota\}_{(n)}(L) = -\{F,H\}_{(-n)}\circ\iota(L)$, $n\ge 0$, for all smooth functions $F$ and $H$ on $A_{\mathrm{inv}}$.

Proof. We have $d(F\circ\iota)(L) = -L^{-2}dF(L^{-1})$ and so
$$\{F\circ\iota, H\circ\iota\}_{(n)}(L) = \bigl(L,\ [R(L^{n-1}dF(L^{-1})), L^{-2}dH(L^{-1})] - (F\leftrightarrow H)\bigr).$$
Now,
$$\begin{aligned}
\bigl(L, [R(L^{n-1}dF(L^{-1})), L^{-2}dH(L^{-1})]\bigr)
&= \bigl(L, L^{-2}[R(L^{n-1}dF(L^{-1})), dH(L^{-1})]\bigr) + \bigl(L\,dH(L^{-1}), [R(L^{n-1}dF(L^{-1})), L^{-2}]\bigr) \\
&= \bigl(L^{-1}, [R(L^{n-1}dF(L^{-1})), dH(L^{-1})]\bigr) + 2\bigl(dH(L^{-1}), [R(L^{n-1}dF(L^{-1})), L^{-1}]\bigr) \\
&= -\bigl(L^{-1}, [R(L^{n-1}dF(L^{-1})), dH(L^{-1})]\bigr).
\end{aligned}$$
Hence the assertion follows. □
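The formula for $d(F\circ\iota)$ used at the start of the proof of Theorem 3.10 is stated without derivation. A possible way to recover it is sketched below; it assumes the first-order expansion $(L+tL')^{-1} = L^{-1} - tL^{-2}L' + O(t^2)$ in the commutative algebra $A$, together with the symmetry $(XY,Z) = (X,YZ)$ from Theorem 3.2 and the definition (3.1).

```latex
\begin{align*}
\frac{d}{dt}\Big|_{t=0} (F\circ\iota)(L+tL')
  &= \frac{d}{dt}\Big|_{t=0} F\bigl(L^{-1} - tL^{-2}L' + O(t^2)\bigr) \\
  &= \bigl(dF(L^{-1}),\, -L^{-2}L'\bigr) \\
  &= \bigl(-L^{-2}\,dF(L^{-1}),\, L'\bigr),
\end{align*}
% so that d(F\circ\iota)(L) = -L^{-2}\,dF(L^{-1}) by (3.1).
% Inserting this into (3.3), the factor L^{n+1} L^{-2} = L^{n-1} produces the
% argument R(L^{n-1} dF(L^{-1})) appearing in the proof, and the two minus
% signs from d(F\circ\iota) and d(H\circ\iota) cancel.
```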
4. Lax Equations on Poisson Algebras and Virasoro Invariants

According to Definition 2.6, corresponding to each smooth map $X : A\to A$ satisfying (2.8) is a Lax equation
$$\dot L = \tilde X(L) = [R(X(L)), L]. \tag{4.1}$$
To prepare for the proof of Theorem 3.2, we shall introduce vector fields $V_n$, $n\ge -1$, on $A$ which are related to the Lax equations. Before we do so, we first prove

Theorem 4.2. Let $X, Y : A\to A$ be smooth maps satisfying (2.8). Then $[\tilde X,\tilde Y] = 0$.

Proof. We have
$$d\tilde Y(L)\cdot\tilde X(L) = [R(dY(L)\cdot\tilde X(L)), L] + [R(Y(L)), \tilde X(L)] = [R([R(X(L)), Y(L)]), L] + [R(Y(L)), [R(X(L)), L]].$$
Therefore,
$$\begin{aligned}
{[\tilde X,\tilde Y]}(L) &= 2[R([X(L),Y(L)]_R), L] + [R(Y(L)), [R(X(L)), L]] - [R(X(L)), [R(Y(L)), L]] \\
&= [2R([X(L),Y(L)]_R), L] - [[R(X(L)), R(Y(L))], L], \quad\text{by Jacobi identity} \\
&= -\bigl[[R(X(L)), R(Y(L))] - 2R([X(L),Y(L)]_R),\ L\bigr].
\end{aligned}$$
Let $B_R(X,Y) = [RX, RY] - 2R([X,Y]_R)$. Then $R$ is a classical r-matrix iff $[B_R(X,Y),Z] + [B_R(Y,Z),X] + [B_R(Z,X),Y] = 0$, $\forall\, X, Y, Z\in A$. Using the ad-invariant pairing, this is equivalent to
$$[B_R(X,Y),Z] = R^*[RX,[Y,Z]] - R^*[X,R^*[Y,Z]] - [RX,R^*[Y,Z]] + R^*[RY,[Z,X]] - R^*[Y,R^*[Z,X]] - [RY,R^*[Z,X]].$$
If we now put $X = X(L)$, $Y = Y(L)$ and $Z = L$ in the above relation, we obtain $[\tilde X,\tilde Y](L) = 0$, as asserted. □

The vector fields $V_n$, $n\ge -1$, are defined as follows:
$$V_n(L) = L^{n+1}, \qquad n\ge -1. \tag{4.3}$$

Theorem 4.4. The vector fields $V_n$ satisfy the Virasoro relations $[V_m,V_n] = (n-m)V_{m+n}$, $m, n\ge -1$.

Proof. Clear. □

Given a smooth manifold $M$ and a vector field $V$ on $M$, recall that a tensor field $T$ is an invariant tensor field of $V$ iff $L_VT = 0$. Generalizing one step further, we shall say that $T$ is an invariant tensor field of degree 1 iff $L_V^2T = 0$. The vector fields $V_n$ introduced in (4.3) above are invariants of degree 1 of the vector fields $\tilde X$ corresponding to the Lax equations. Indeed, we have

Theorem 4.5. If $X : A\to A$ is a smooth map satisfying (2.8), we have $L_{V_m}\tilde X = \tilde Y$, where $Y(L) = dX(L)\cdot V_m(L)$.
Proof.
$$\begin{aligned}
{[V_m,\tilde X]}(L) &= d\tilde X(L)\cdot V_m(L) - dV_m(L)\cdot\tilde X(L) \\
&= [R(dX(L)\cdot V_m(L)), L] + [R(X(L)), V_m(L)] - (m+1)L^m[R(X(L)), L] \\
&= [R(dX(L)\cdot V_m(L)), L].
\end{aligned}$$
Thus, it remains to show $Y(L) = dX(L)\cdot V_m(L)$ satisfies (2.8). To do this, first note that from the condition $[X(L),L] = 0$, $L\in A$, we have $[dX(L)\cdot L', L] + [X(L), L'] = 0$, $L, L'\in A$. Therefore, $[Y(L),L] = [dX(L)\cdot V_m(L), L] = -[X(L), V_m(L)] = -(m+1)L^m[X(L),L] = 0$, for all $L\in A$. On the other hand, it follows from $dX(L)\cdot[L',L] = [L',X(L)]$, $L, L'\in A$, that
$$(d^2X(L)\cdot L')([L'',L]) + dX(L)\cdot[L'',L'] = [L'', dX(L)\cdot L'], \qquad L, L', L''\in A. \tag{*}$$
Consequently, for all $L, L'\in A$, we have
$$\begin{aligned}
dY(L)\cdot[L',L] &= (d^2X(L)\cdot[L',L])(V_m(L)) + dX(L)\cdot\bigl((m+1)L^m[L',L]\bigr) \\
&= (d^2X(L)\cdot V_m(L))([L',L]) + dX(L)\cdot[L',V_m(L)] \\
&= [L', Y(L)], \quad\text{by (*).}
\end{aligned}$$ □
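Theorem 4.4 is stated with the proof left to the reader. A short verification is sketched here, assuming the commutator convention $[V,W](L) = dW(L)\cdot V(L) - dV(L)\cdot W(L)$, which is the one used in the first line of the proof of Theorem 4.5, and the commutativity of multiplication in $A$.

```latex
\begin{align*}
dV_n(L)\cdot L' &= (n+1)\,L^{n}L' \qquad (V_n(L) = L^{n+1}), \\
[V_m, V_n](L) &= dV_n(L)\cdot V_m(L) - dV_m(L)\cdot V_n(L) \\
  &= (n+1)\,L^{n}L^{m+1} - (m+1)\,L^{m}L^{n+1} \\
  &= (n-m)\,L^{m+n+1} = (n-m)\,V_{m+n}(L).
\end{align*}
% For n = -1 the field V_{-1}(L) = 1 is constant, consistent with the
% factor (n+1) = 0 in the first line.
```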
Remark 4.6. For the vector fields $Z_n$ in (2.9), we have in particular the relations $L_{V_m}Z_n = nZ_{m+n}$, $m\ge -1$, $n\ge 1$.

If we now combine Theorem 4.5 and Theorem 4.2, the nature of the vector fields $V_m$ is now revealed.

Corollary 4.7. $L^2_{\tilde X}V_n = 0$, $n\ge 0$, $L_{\tilde X}V_{-1} = 0$.

5. Virasoro Action on the Bivector Fields and Compatibility of the Structures

The goal of this section is to prove Theorem 3.2. To do this, we consider the action of the vector fields $V_m$ on the bivector fields $\pi_n$ corresponding to $\{\cdot,\cdot\}_{(n)}$, $n\ge -1$.

Theorem 5.1. $L_{V_m}\pi_n = (n-m)\pi_{m+n}$, $m, n\ge -1$.

As indicated in Sect. 3, this result is the key in proving Theorem 3.2. The demonstration of Theorem 5.1 is quite tedious, so we break it up into several steps. First, note that from the property of the Lie derivative, we have
$$L_{V_m}\pi_n(dF,dH) = V_m\{F,H\}_{(n)} - \{V_mF,H\}_{(n)} - \{F,V_mH\}_{(n)}. \tag{5.2}$$
Using the expressions for $\{\cdot,\cdot\}_{(n)}$ and $V_m$, we obtain the identities in the next two lemmas. We shall omit the rather lengthy computations.

Lemma 5.3.
$$\begin{aligned}
V_m\{F,H\}_{(n)}(L) ={}& \bigl(V_m(L), [R(L^{n+1}dF(L)), dH(L)]\bigr) + \bigl(L, [R(L^{n+1}dF(L)), d^2H(L)\cdot V_m(L)]\bigr) \\
&+ (n+1)\bigl(L, [R(L^{m+n+1}dF(L)), dH(L)]\bigr) + \bigl(L, [R(L^{n+1}d^2F(L)\cdot V_m(L)), dH(L)]\bigr) - (F\leftrightarrow H),
\end{aligned}$$
where $(F\leftrightarrow H)$ denotes the terms obtained from the previous ones by switching $F$ and $H$.
Lemma 5.4.
$$\begin{aligned}
\{V_mF,H\}_{(n)}(L) + \{F,V_mH\}_{(n)}(L) ={}& \bigl(L, [R(L^{n+1}d^2F(L)\cdot V_m(L)), dH(L)] + [d^2F(L)\cdot V_m(L), R(L^{n+1}dH(L))]\bigr) \\
&+ (m+1)\bigl(L, [R(L^{m+n+1}dF(L)), dH(L)] + [L^m dF(L), R(L^{n+1}dH(L))]\bigr) - (F\leftrightarrow H).
\end{aligned}$$

Proof of Theorem 5.1. By combining the expressions in Lemma 5.3 and Lemma 5.4 according to (5.2), it is clear that terms involving second derivatives cancel out, and we obtain
$$(L_{V_m}\pi_n)(L)(X_1,X_2) = \bigl(V_m(L), [R(L^{n+1}X_1), X_2]\bigr) + (n-m)\bigl(L, [R(L^{m+n+1}X_1), X_2]\bigr) - (m+1)\bigl(L, [L^mX_1, R(L^{n+1}X_2)]\bigr) - (1\leftrightarrow 2), \tag{*}$$
where $X_1 = dF(L)$, $X_2 = dH(L)$. Now, by repeated application of the derivation property, the commutativity of multiplication and its symmetry with respect to $(\cdot,\cdot)$, we have
$$\begin{aligned}
\bigl(V_m(L), [R(L^{n+1}X_1), X_2]\bigr) - (1\leftrightarrow 2)
&= \bigl(L, [R(L^{n+1}X_1), L^mX_2]\bigr) - \bigl(LX_2, [R(L^{n+1}X_1), L^m]\bigr) - (1\leftrightarrow 2) \\
&= \bigl(L, [R(L^{n+1}X_1), L^mX_2]\bigr) - m\bigl(L^mX_2, [R(L^{n+1}X_1), L]\bigr) - (1\leftrightarrow 2) \\
&= (m+1)\bigl(L, [R(L^{n+1}X_1), L^mX_2]\bigr) - (1\leftrightarrow 2) \\
&= (m+1)\bigl(L, [L^mX_1, R(L^{n+1}X_2)]\bigr) - (1\leftrightarrow 2).
\end{aligned}$$
If we substitute this in (*), the result follows. □

Remark 5.5. In the case of noncommutative, associative algebras, relations similar to the ones in Theorem 5.1 were obtained in [LP2] for the three structures there.

Corollary 5.6.
$$[\pi_m,\pi_n]_S = -\frac{1}{n+2}[V_{n+1},[\pi_{-1},\pi_m]_S]_S + \frac{m-n-1}{n+2}[\pi_{-1},\pi_{m+n+1}]_S \quad\text{for } m, n\ge -1.$$

Proof. From Theorem 5.1 and the graded Jacobi identity for the Schouten bracket, it follows that
$$\begin{aligned}
{[\pi_m,\pi_n]_S} &= -\frac{1}{n+2}[\pi_m,[V_{n+1},\pi_{-1}]_S]_S \\
&= -\frac{1}{n+2}[V_{n+1},[\pi_{-1},\pi_m]_S]_S + \frac{1}{n+2}[\pi_{-1},[V_{n+1},\pi_m]_S]_S \\
&= -\frac{1}{n+2}[V_{n+1},[\pi_{-1},\pi_m]_S]_S + \frac{m-n-1}{n+2}[\pi_{-1},\pi_{m+n+1}]_S.
\end{aligned}$$ □

Remark 5.7. The formulation of Corollary 5.6 is motivated by similar considerations in [AvM].
Proof of Theorem 3.2. If we set $m = -1$ in the identity in Corollary 5.6, we find
$$[\pi_{-1},\pi_n]_S = -\frac{1}{2(n+2)}[V_{n+1},[\pi_{-1},\pi_{-1}]_S]_S = 0, \qquad \forall\, n\ge -1,$$
as $\pi_{-1}$ is the bivector field for the Lie–Poisson structure $\{\cdot,\cdot\}_{(-1)}$. From the same identity, it now follows that $[\pi_m,\pi_n]_S = 0$, $\forall\, m, n\ge -1$. Hence the brackets $\{\cdot,\cdot\}_{(n)}$ define compatible Poisson structures on $A$. Finally, the assertion in part (c) follows from Theorem 5.1. □
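The two steps of this proof can be written out in full. The following is a sketch that uses only the identity proved in Corollary 5.6 (in the form obtained at the end of its proof) and the fact that $[\pi_{-1},\pi_{-1}]_S = 0$.

```latex
% Step 1: put m = -1, so that (m-n-1)/(n+2) = -1 and \pi_{m+n+1} = \pi_n.
\begin{align*}
[\pi_{-1},\pi_n]_S
  &= -\frac{1}{n+2}\,[V_{n+1},[\pi_{-1},\pi_{-1}]_S]_S - [\pi_{-1},\pi_{n}]_S, \\
\text{hence}\quad 2\,[\pi_{-1},\pi_n]_S
  &= -\frac{1}{n+2}\,[V_{n+1},[\pi_{-1},\pi_{-1}]_S]_S = 0 .
\end{align*}
% Step 2: for general m, n >= -1 we have m+n+1 >= -1, so both inner
% brackets on the right-hand side vanish by Step 1.
\begin{align*}
[\pi_m,\pi_n]_S
  = -\frac{1}{n+2}\,[V_{n+1},\underbrace{[\pi_{-1},\pi_m]_S}_{=0}]_S
    + \frac{m-n-1}{n+2}\,\underbrace{[\pi_{-1},\pi_{m+n+1}]_S}_{=0} = 0 .
\end{align*}
```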
6. Some Examples

In this section, we look at some concrete examples of partial differential equations which can be realized as Lax equations on Poisson algebras. In each case, we describe the multi-Hamiltonian formalism which follows from our universal construction in Sect. 3. The reader should note that in these applications, we are dealing with Lax operators which form submanifolds of the full Poisson subalgebras under consideration. Although these submanifolds of Lax operators are invariant under the dynamics of the associated Lax equations, they are not automatically Poisson submanifolds of the brackets which arise from the general construction in Sect. 3. For this reason, there are two kinds of situations in the examples which follow. In the happy case where the submanifold $M$ of Lax operators does form a Poisson submanifold of $(A,\{\cdot,\cdot\}_{(n)})$, there is of course an induced structure on $M$ which can be obtained by simple restriction of $\{\cdot,\cdot\}_{(n)}$ to $M$. On the other hand, when $M$ is not a Poisson submanifold of $(A,\{\cdot,\cdot\}_{(n)})$, the reader will see that the geometry in each case warrants the application of Dirac reduction, i.e. reduction with constraints [D,KO,MR]. Thus in this latter case, the brackets which arise from the construction in Sect. 3 serve as the starting point of a reduction process from which the constrained brackets on $M$ are computed. In the following, we shall rescale the expression for $\{\cdot,\cdot\}_{(n)}$ by the factor $\tfrac12$.

6.1. The Benny hierarchy. The Benny equations in nonlinear waves [B] (we shall consider the simplest case here) are given by the quasi-linear system
$$\begin{pmatrix} u_0 \\ u_{-1}\end{pmatrix}_t = \begin{pmatrix} u_0 & 1 \\ u_{-1} & u_0\end{pmatrix}\begin{pmatrix} u_0 \\ u_{-1}\end{pmatrix}_x. \tag{6.1}$$
We shall deal with the case where $u_0$, $u_{-1}$ are smooth functions on the circle $S^1 = \mathbb{R}/\mathbb{Z}$. Following [G-KR], introduce the algebra $A$ of Laurent polynomials in $\lambda$, having the form
$$u(x,\lambda) = \sum_i u_i(x)\lambda^i, \tag{6.2}$$
where the coefficients $u_i$ are smooth functions on the circle $S^1$. With the well-known Lie bracket defined by
$$[u,v]_{-1} = \frac{\partial u}{\partial\lambda}\frac{\partial v}{\partial x} - \frac{\partial u}{\partial x}\frac{\partial v}{\partial\lambda}, \qquad u, v\in A, \tag{6.3}$$
it is clear that $(A,[\cdot,\cdot]_{-1})$ is a Poisson algebra. In [G-KR], the Benny equations are rewritten as a Lax equation in this Poisson algebra. Indeed, (6.1) is equivalent to
$$\frac{dL}{dt} = \Bigl[R\Bigl(\tfrac14L^2\Bigr), L\Bigr]_{-1}, \tag{6.4}$$
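where the Lax operator $L$ is an element of the Benny manifold
$$M_{\mathrm{Benny}} = \bigl\{L\in A \mid L(x,\lambda) = \lambda + u_0(x) + u_{-1}(x)\lambda^{-1}\bigr\} \tag{6.5}$$
and the r-matrix $R$ is the one associated with the direct sum decomposition
$$A = A_{\ge 1}\oplus A_{\le 0} \tag{6.6}$$
into subalgebras
$$A_{\ge 1} = \Bigl\{u\in A \Bigm| u(x,\lambda) = \sum_{i\ge 1}u_i(x)\lambda^i\Bigr\}, \tag{6.7a}$$
$$A_{\le 0} = \Bigl\{u\in A \Bigm| u(x,\lambda) = \sum_{i\le 0}u_i(x)\lambda^i\Bigr\}. \tag{6.7b}$$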
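The equivalence of (6.1) and (6.4) can be tested on explicit data. The sketch below is only an illustration under stated assumptions: the r-matrix is taken to be $R = \Pi_{\ge 1} - \Pi_{\le 0}$ for the splitting (6.6), and the coefficients $u_0$, $u_{-1}$ are hypothetical polynomials in $x$ (not periodic functions), which suffices for a purely algebraic identity. The helper names are ours.

```python
from fractions import Fraction
from itertools import product

# Elements are dicts {(i, j): c} meaning the sum of c * lam**i * x**j.

def clean(u):
    return {k: c for k, c in u.items() if c != 0}

def add(u, v, s=1):
    """Return u + s * v."""
    w = dict(u)
    for k, c in v.items():
        w[k] = w.get(k, 0) + s * c
    return clean(w)

def mul(u, v):
    w = {}
    for ((i, j), p), ((k, m), q) in product(u.items(), v.items()):
        w[(i + k, j + m)] = w.get((i + k, j + m), 0) + p * q
    return clean(w)

def d_lam(u):
    return clean({(i - 1, j): i * c for (i, j), c in u.items()})

def d_x(u):
    return clean({(i, j - 1): j * c for (i, j), c in u.items() if j > 0})

def bracket(u, v):
    """[u, v]_{-1} as in (6.3)."""
    return add(mul(d_lam(u), d_x(v)), mul(d_x(u), d_lam(v)), -1)

def R(u):
    """Assumed normalization: R = P_{>=1} - P_{<=0}."""
    return {k: (c if k[0] >= 1 else -c) for k, c in u.items()}

def coeff(u, i):
    """Coefficient of lam**i, as a polynomial in x."""
    return clean({(0, j): c for (k, j), c in u.items() if k == i})

# Hypothetical sample data: u_0 = x + 2, u_{-1} = 3 x**2 + 1.
u0 = {(0, 1): Fraction(1), (0, 0): Fraction(2)}
um1 = {(0, 2): Fraction(3), (0, 0): Fraction(1)}
lam = {(1, 0): Fraction(1)}
lam_inv = {(-1, 0): Fraction(1)}

L = add(add(lam, u0), mul(um1, lam_inv))       # element of M_Benny, cf. (6.5)
quarter_L2 = {k: c / 4 for k, c in mul(L, L).items()}
rhs = bracket(R(quarter_L2), L)                # right-hand side of (6.4)

# Right-hand side of (6.1), written out componentwise.
benney_u0 = add(mul(u0, d_x(u0)), d_x(um1))              # u0 u0_x + u_{-1,x}
benney_um1 = add(mul(um1, d_x(u0)), mul(u0, d_x(um1)))   # u_{-1} u0_x + u0 u_{-1,x}
expected = add(benney_u0, mul(benney_um1, lam_inv))

print(rhs == expected)
```

With this normalization the terms in $\lambda^{1}$ cancel, so the right-hand side of (6.4) is tangent to $M_{\mathrm{Benny}}$ and its two coefficients reproduce the two rows of (6.1).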
In view of the representation in (6.4), the quasi-linear system (6.1) is only a member of a hierarchy of Lax equations on $M_{\mathrm{Benny}}$, and this is what we call the Benny hierarchy. Note that the Poisson algebra introduced above admits the trace functional
$$\mathrm{tr}_{-1}\,u = \int u_{-1}(x)\,dx, \qquad u\in A \tag{6.8}$$
(here and below we integrate over $S^1$) which satisfies the important property
$$\mathrm{tr}_{-1}[u,v] = 0, \qquad u, v\in A. \tag{6.9}$$
Therefore, we can equip $A$ with a non-degenerate ad-invariant pairing $(\cdot,\cdot)_{-1}$:
$$(u,v)_{-1} = \mathrm{tr}_{-1}(uv), \qquad u, v\in A. \tag{6.10}$$
Thus we have all the ingredients which are required for the application of Theorem 3.2. Consequently, we have a family of Poisson structures $\{\cdot,\cdot\}_{(n)}$, $n\ge -1$, on $A$. It is easy to check that $M_{\mathrm{Benny}}$ is a Poisson submanifold of $(A,\{\cdot,\cdot\}_{(-1)})$. Therefore, the induced structure on $M_{\mathrm{Benny}}$ provides the first Poisson structure for the equations in the Benny hierarchy [G-KR]. Using $u = (u_0,u_{-1})$ as coordinates on $M_{\mathrm{Benny}}$, the associated Hamiltonian operator is given explicitly by
$$B_{(-1)}(u) = \begin{pmatrix} 0 & D \\ D & 0\end{pmatrix}, \qquad D = \frac{d}{dx}, \tag{6.11}$$
which is apparently well-known to people working in other frameworks (see, for example, [DN]). Clearly, this first structure is degenerate, with Casimirs given by $C_1(u) = \int u_0(x)\,dx$ and $C_2(u) = \int u_{-1}(x)\,dx$.

Remark 6.12. One of the advantages in formulating the Benny equations as a Lax equation on $A$ is that it automatically suggests a method of solution, namely, via a factorization problem on a symplectic diffeomorphism group. The analytic details, however, are nontrivial.
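We now turn to the higher structures. Here, it is easy to see that $M_{\mathrm{Benny}}$ is not a Poisson submanifold of any of the brackets $\{\cdot,\cdot\}_{(n)}$, $n\ge 0$. However, we shall see that we can apply Dirac reduction to $\{\cdot,\cdot\}_{(n)}$ with appropriate constraints to obtain the higher structures on $M_{\mathrm{Benny}}$. We shall illustrate the procedure for $n = 0$ and $n = 1$, thereby obtaining the second and third Poisson structures on $M_{\mathrm{Benny}}$. For $n = 0$, the Hamiltonian vector field generated by $H$ is of the form
$$X_H^{(0)}(u) = u\,\Pi_{\le -2}\bigl([dH(u),u]_{-1}\bigr) - \bigl[\Pi_{\le 0}(u\,dH(u)), u\bigr]_{-1} = \bigl[\Pi_{\ge 1}(u\,dH(u)), u\bigr]_{-1} - u\,\Pi_{\ge -1}\bigl([dH(u),u]_{-1}\bigr). \tag{6.13}$$
If $L\in M_{\mathrm{Benny}}$, it follows from this formula that the highest order term of $X_H^{(0)}(L)$ in $\lambda$ is $\lambda^0$, while the lowest order is in $\lambda^{-2}$. Using $u = (u_0,u_{-1},u_{-2})$ as coordinates on the submanifold $\{\lambda + u_0(x) + u_{-1}(x)\lambda^{-1} + u_{-2}(x)\lambda^{-2}\in A\}$, the operator which gives $X_H^{(0)}(L)$ can be computed explicitly:
$$\begin{pmatrix}
D & u_0D + u_{0x} & u_{-1}D + u_{-1x} \\
u_0D & 2u_{-1}D + u_{-1x} & 0 \\
u_{-1}D & 0 & -u_{-1}^2D - u_{-1}u_{-1x}
\end{pmatrix}. \tag{6.14}$$
Therefore, we can apply Dirac reduction with constraint $u_{-2}\equiv 0$ to obtain the second structure on $M_{\mathrm{Benny}}$:
$$\begin{aligned}
B_0(u) &= \begin{pmatrix} D & u_0D + u_{0x} \\ u_0D & 2u_{-1}D + u_{-1x}\end{pmatrix}
 - \begin{pmatrix} u_{-1}D + u_{-1x} \\ 0\end{pmatrix}\bigl(-u_{-1}^2D - u_{-1}u_{-1x}\bigr)^{-1}\begin{pmatrix} u_{-1}D & 0\end{pmatrix} \\
&= \begin{pmatrix} 2 & u_0 \\ u_0 & 2u_{-1}\end{pmatrix}D + \begin{pmatrix} 0 & 1 \\ 0 & 0\end{pmatrix}u_{0x} + \begin{pmatrix} 0 & 0 \\ 0 & 1\end{pmatrix}u_{-1x}.
\end{aligned} \tag{6.15}$$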
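The passage from (6.14) to (6.15) can be made explicit. The sketch below assumes the usual block form of Dirac reduction for the single constraint $u_{-2}\equiv 0$, with $D^{-1}$ understood as a formal inverse of $D$; the block names $A_0$, $b$, $c$, $e$ are ours and do not appear in the paper.

```latex
% Write (6.14) in block form as (A_0  b ; c  e), with A_0 the upper-left 2x2 block.
\begin{align*}
B_0 &= A_0 - b\,e^{-1}c, \qquad
b = \begin{pmatrix} u_{-1}D + u_{-1x} \\ 0 \end{pmatrix}
  = \begin{pmatrix} D\circ u_{-1} \\ 0 \end{pmatrix}, \qquad
c = \begin{pmatrix} u_{-1}D & 0 \end{pmatrix}, \\
e &= -u_{-1}^2D - u_{-1}u_{-1x} = -u_{-1}\,(D\circ u_{-1}), \qquad
e^{-1} = -\,u_{-1}^{-1}D^{-1}u_{-1}^{-1}, \\
b\,e^{-1}c
  &= \begin{pmatrix} -(D\circ u_{-1})\,u_{-1}^{-1}D^{-1}u_{-1}^{-1}\,u_{-1}D & 0 \\ 0 & 0 \end{pmatrix}
   = \begin{pmatrix} -D & 0 \\ 0 & 0 \end{pmatrix}.
\end{align*}
% Only the (1,1) entry changes: D - (-D) = 2D, which gives the matrix
% (2, u_0; u_0, 2u_{-1}) D + ... in the second line of (6.15).
```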
Note that this second structure is of hydrodynamic type [DN] because the associated Hamiltonian operator is of the form
$$B_0^{ij}(u) = g^{ij}(u)D + b^{ij}_k(u)\,u^k_x. \tag{6.16}$$
In this case, the metric which defines the structure (6.15) is non-degenerate where
$$\Delta = u_0^2 - 4u_{-1}\ne 0. \tag{6.17}$$
For $n = 1$, i.e. for the bracket $\{\cdot,\cdot\}_{(1)}$, we have a similar formula for the Hamiltonian vector field
$$X_H^{(1)}(u) = u^2\,\Pi_{\le -2}\bigl([dH(u),u]_{-1}\bigr) - \bigl[\Pi_{\le 0}(u^2dH(u)), u\bigr]_{-1} = \bigl[\Pi_{\ge 1}(u^2dH(u)), u\bigr]_{-1} - u^2\,\Pi_{\ge -1}\bigl([dH(u),u]_{-1}\bigr). \tag{6.18}$$
This time, the highest order term of $X_H^{(1)}(L)$ ($L\in M_{\mathrm{Benny}}$) in $\lambda$ is still $\lambda^0$, but the lowest order term is in $\lambda^{-3}$. Therefore, in the coordinates $u = (u_0,u_{-1},u_{-2},u_{-3})$, the operator which gives $X_H^{(1)}(L)$ is given by
$$\begin{pmatrix}
2u_0D + u_{0x} & (u_0^2 + 3u_{-1})D + 2u_0u_{0x} + 2u_{-1x} & 2u_0u_{-1}D + 2u_0u_{-1x} + 2u_{-1}u_{0x} & u_{-1}^2D + 2u_{-1}u_{-1x} \\
(u_0^2 + 3u_{-1})D + u_{-1x} & 4u_0u_{-1}D + 2u_{-1}u_{0x} + 2u_0u_{-1x} & u_{-1}^2D + 2u_{-1}u_{-1x} & 0 \\
2u_0u_{-1}D & u_{-1}^2D & -2u_0u_{-1}^2D - 2u_0u_{-1}u_{-1x} - u_{-1}^2u_{0x} & -u_{-1}^3D - 2u_{-1}^2u_{-1x} \\
u_{-1}^2D & 0 & -u_{-1}^3D - u_{-1}^2u_{-1x} & 0
\end{pmatrix}. \tag{6.19}$$
To obtain the structure on $M_{\mathrm{Benny}}$, we have to use Dirac reduction with the constraints $u_{-2}\equiv 0$, $u_{-3}\equiv 0$. Accordingly, we have to invert the lower $2\times 2$ block of (6.19):
$$\begin{pmatrix}
-2u_0u_{-1}^2D - 2u_0u_{-1}u_{-1x} - u_{-1}^2u_{0x} & -u_{-1}^3D - 2u_{-1}^2u_{-1x} \\
-u_{-1}^3D - u_{-1}^2u_{-1x} & 0
\end{pmatrix}^{-1}
= -\begin{pmatrix}
0 & \dfrac{1}{u_{-1}}D^{-1}\dfrac{1}{u_{-1}^2} \\[2ex]
\dfrac{1}{u_{-1}^2}D^{-1}\dfrac{1}{u_{-1}} & -\dfrac{1}{u_{-1}^2}D^{-1}\dfrac{u_0}{u_{-1}^2} - \dfrac{u_0}{u_{-1}^2}D^{-1}\dfrac{1}{u_{-1}^2}
\end{pmatrix}. \tag{6.20}$$
Hence the Hamiltonian operator of the third structure is given by
$$\begin{aligned}
B_1(u) ={}& \begin{pmatrix}
2u_0D + u_{0x} & (u_0^2 + 3u_{-1})D + 2u_0u_{0x} + 2u_{-1x} \\
(u_0^2 + 3u_{-1})D + u_{-1x} & 4u_0u_{-1}D + 2u_{-1}u_{0x} + 2u_0u_{-1x}
\end{pmatrix} \\
&+ \begin{pmatrix}
2u_0u_{-1}D + 2u_{-1}u_{0x} + 2u_0u_{-1x} & u_{-1}^2D + 2u_{-1}u_{-1x} \\
u_{-1}^2D + 2u_{-1}u_{-1x} & 0
\end{pmatrix}
\begin{pmatrix}
0 & \dfrac{1}{u_{-1}}D^{-1}\dfrac{1}{u_{-1}^2} \\[2ex]
\dfrac{1}{u_{-1}^2}D^{-1}\dfrac{1}{u_{-1}} & -\dfrac{1}{u_{-1}^2}D^{-1}\dfrac{u_0}{u_{-1}^2} - \dfrac{u_0}{u_{-1}^2}D^{-1}\dfrac{1}{u_{-1}^2}
\end{pmatrix}
\begin{pmatrix}
2u_0u_{-1}D & u_{-1}^2D \\
u_{-1}^2D & 0
\end{pmatrix} \\
={}& \begin{pmatrix} 4u_0 & u_0^2 + 4u_{-1} \\ u_0^2 + 4u_{-1} & 4u_0u_{-1}\end{pmatrix}D
+ \begin{pmatrix} 2 & 2u_0 \\ 0 & 2u_{-1}\end{pmatrix}u_{0x}
+ \begin{pmatrix} 0 & 2 \\ 2 & 2u_0\end{pmatrix}u_{-1x}.
\end{aligned} \tag{6.21}$$
Again, this corresponds to a bracket of hydrodynamic type and the non-degeneracy of the metric is characterized by the same condition in (6.17).

Remark 6.22. (a) Alternatively, on the symplectic leaves of the first structure defined by the conditions $\int u_0(x)\,dx = \text{const}$, $\int u_{-1}(x)\,dx = \text{const}$, $B_{-1}$ is invertible and therefore we can compute the recursion operator
$$R = B_0B_{-1}^{-1} = \begin{pmatrix} u_0 + u_{0x}D^{-1} & 2 \\ 2u_{-1} + u_{-1x}D^{-1} & u_0\end{pmatrix}. \tag{6.23}$$
From this, we can check that $B_1 = RB_0$. (b) In principle, one can compute all higher structures explicitly by applying Dirac reduction to $\{\cdot,\cdot\}_{(n)}$ or by using the recursion operator $R$, but the calculations are quite formidable and we do not know if there exists an efficient way to do this.
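Part (a) of Remark 6.22 can be checked entry by entry. The computation below is a sketch, assuming that $D^{-1}$ denotes a formal inverse of $D$ on the leaves in question and that $R$ here is the recursion operator of (6.23), not the r-matrix. It uses $u_0D + u_{0x} = D\circ u_0$ and the expression for $B_0$ in the second line of (6.15).

```latex
\begin{align*}
B_{-1}^{-1} &= \begin{pmatrix} 0 & D^{-1} \\ D^{-1} & 0 \end{pmatrix}, \\
R = B_0B_{-1}^{-1}
  &= \begin{pmatrix} 2D & u_0D + u_{0x} \\ u_0D & 2u_{-1}D + u_{-1x} \end{pmatrix}
     \begin{pmatrix} 0 & D^{-1} \\ D^{-1} & 0 \end{pmatrix}
   = \begin{pmatrix} u_0 + u_{0x}D^{-1} & 2 \\ 2u_{-1} + u_{-1x}D^{-1} & u_0 \end{pmatrix}, \\
(RB_0)_{11} &= (u_0 + u_{0x}D^{-1})\,2D + 2\,u_0D = 4u_0D + 2u_{0x}, \\
(RB_0)_{12} &= (u_0 + u_{0x}D^{-1})(D\circ u_0) + 2\,(2u_{-1}D + u_{-1x})
             = (u_0^2 + 4u_{-1})D + 2u_0u_{0x} + 2u_{-1x}, \\
(RB_0)_{21} &= (2u_{-1} + u_{-1x}D^{-1})\,2D + u_0\,u_0D = (u_0^2 + 4u_{-1})D + 2u_{-1x}.
\end{align*}
% These agree with the corresponding entries of the last line of (6.21).
% For the metric g = (4u_0, u_0^2+4u_{-1}; u_0^2+4u_{-1}, 4u_0u_{-1}) one finds
% det g = 16 u_0^2 u_{-1} - (u_0^2 + 4u_{-1})^2 = -(u_0^2 - 4u_{-1})^2 = -\Delta^2,
% which is the non-degeneracy condition (6.17).
```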
6.2. The dispersionless Toda lattice hierarchy. Let A be the algebra introduced in Example 6.1, but now we equip it with the following Lie bracket: ∂u ∂v ∂u ∂v − , u, v ∈ A. (6.24) [u, v]0 = λ ∂λ ∂x ∂x ∂λ Then (A, [·, ·]0 ) is also a Poisson algebra. The dispersionless Toda lattice hierarchy is defined by the Lax equations dL = 5k (Ln ), L 0 = − 5l (Ln ), L 0 , n = 1, 2, . . . , dt
(6.25)
where the Lax operator L is an element of the manifold MdToda = {L ∈ A L(x, λ) = u1 (x)λ + u0 (x) + u1 (x)λ−1 }
(6.26)
and 5k , 5l are the projection operators relative to the direct sum decomposition A=k⊕l
(6.27)
) X i −i ui (x)(λ − λ ) , k = u ∈ A u(x, λ) =
(6.28a)
into subalgebras (
i>0
X ui (x)λi . l = u ∈ A u(x, λ) =
(6.28b)
i≤0
When $n = 1$, the corresponding Lax equation is
\[
\frac{dL}{dt} = \left[ \Pi_{\mathfrak{k}}(L), L \right]_0 \iff
\begin{cases}
u_{0t} = 4u_1 u_{1x}, \\
u_{1t} = u_1 u_{0x}.
\end{cases} \tag{6.29}
\]
These are the dispersionless Toda lattice equations and can be obtained from the periodic Toda lattice ODE system
\[
\frac{da_k}{dt} = 2(b_k^2 - b_{k-1}^2), \qquad \frac{db_k}{dt} = b_k(a_{k+1} - a_k) \tag{6.30}
\]
by taking a continuum (or long wave) limit. The Poisson algebra $(A, [\cdot,\cdot]_0)$ also has all the ingredients needed for the construction in Theorem 3.2. In this case, the invariant trace is of the form
\[
\operatorname{tr}_0 u = \int u_0(x)\,dx, \quad u \in A, \tag{6.31}
\]
which gives rise to the non-degenerate ad-invariant pairing $(\cdot,\cdot)_0$:
\[
(u, v)_0 = \operatorname{tr}_0(uv), \quad u, v \in A. \tag{6.32}
\]
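As a numerical illustration of the lattice system (6.30), the following Python sketch integrates it with periodic indices by a classical Runge-Kutta scheme and monitors two quantities that a direct computation shows to be conserved: $\sum_k a_k$ and $\sum_k (\tfrac12 a_k^2 + b_k^2)$. The lattice size, initial data, step size and function names are arbitrary choices made for this example only.

```python
import math


def toda_rhs(a, b):
    """Right-hand side of the periodic Toda lattice system (6.30)."""
    n = len(a)
    da = [2.0 * (b[k] ** 2 - b[k - 1] ** 2) for k in range(n)]  # b[-1] wraps to b[n-1]
    db = [b[k] * (a[(k + 1) % n] - a[k]) for k in range(n)]
    return da, db


def rk4_step(a, b, dt):
    """Advance (a, b) by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return [x + h * s for x, s in zip(base, slope)]

    k1a, k1b = toda_rhs(a, b)
    k2a, k2b = toda_rhs(shifted(a, k1a, dt / 2), shifted(b, k1b, dt / 2))
    k3a, k3b = toda_rhs(shifted(a, k2a, dt / 2), shifted(b, k2b, dt / 2))
    k4a, k4b = toda_rhs(shifted(a, k3a, dt), shifted(b, k3b, dt))
    new_a = [x + dt / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(a, k1a, k2a, k3a, k4a)]
    new_b = [x + dt / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(b, k1b, k2b, k3b, k4b)]
    return new_a, new_b


def invariants(a, b):
    """Return (sum a_k, sum (a_k^2 / 2 + b_k^2))."""
    return sum(a), sum(0.5 * x * x for x in a) + sum(y * y for y in b)


n = 8
a = [0.1 * math.sin(2 * math.pi * k / n) for k in range(n)]
b = [1.0 + 0.1 * math.cos(2 * math.pi * k / n) for k in range(n)]
start = invariants(a, b)
for _ in range(1000):
    a, b = rk4_step(a, b, 1e-3)
end = invariants(a, b)
```

Both quantities should agree at the start and the end of the run up to the discretization error of the integrator.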
As the r-matrix for the equations in (6.25) is given by
\[
R = \Pi_{\mathfrak{k}} - \Pi_{\mathfrak{l}}, \tag{6.33}
\]
it follows from (3.3) that the Hamiltonian vector field generated by $H$ in the structure $\{\cdot,\cdot\}_{(n)}$ is of the form
\[
\begin{aligned}
X_H^{(n)}(u) &= \left[ \Pi_{\mathfrak{k}}(u^{n+1} dH(u)), u \right]_0 - u^{n+1}\, \Pi_{\mathfrak{l}}^*([dH(u), u]_0) \\
&= u^{n+1}\, \Pi_{\mathfrak{k}}^*([dH(u), u]_0) - \left[ \Pi_{\mathfrak{l}}(u^{n+1} dH(u)), u \right]_0.
\end{aligned} \tag{6.34}
\]
Using this formula, we can now check that $M_{\mathrm{dToda}}$ is a Poisson submanifold of $(A, \{\cdot,\cdot\}_{(n)})$ only for $n = -1$, $0$, and $1$. Accordingly, the induced structures on $M_{\mathrm{dToda}}$ provide the first, second and third Poisson structures for the equations in the dispersionless Toda lattice hierarchy. Using $u = (u_0, u_1)$ as coordinates on $M_{\mathrm{dToda}}$, the Hamiltonian operator of the first structure is given explicitly by
\[
B_{-1}(u) = \begin{pmatrix} 0 & u_1 \\ u_1 & 0 \end{pmatrix} D + \begin{pmatrix} 0 & u_{1x} \\ 0 & 0 \end{pmatrix}. \tag{6.35}
\]
Clearly, the associated Hamiltonian vector fields preserve the sign of $u_1$. Therefore, $B_{-1}(u)$ restricts to a structure on
\[
M^+_{\mathrm{dToda}} = \{ L \in A \mid L(x, \lambda) = u_1(x)\lambda + u_0(x) + u_1(x)\lambda^{-1},\ u_1(x) > 0 \} \tag{6.36}
\]
whose symplectic leaves are the level sets of the Casimirs $\int u_0(x)\,dx$, $\int \ln u_1(x)\,dx$. Finally, we note that $B_{-1}(u)$ is obviously of hydrodynamic type and the corresponding metric is non-degenerate on $M^+_{\mathrm{dToda}}$ as $\det(g^{ij}) = -u_1^2$.

Remark 6.37. Note that the equations in the dispersionless Toda lattice hierarchy are (strictly) hyperbolic in $M^+_{\mathrm{dToda}}$ and we can take
\[
w_1(u) = u_0 - 2u_1, \qquad w_2(u) = u_0 + 2u_1 \tag{6.38}
\]
as the Riemann invariants. We shall not give the proof as the reader can easily supply the details.

As for the second and third Poisson structures on $M_{\mathrm{dToda}}$, direct calculation shows the corresponding Hamiltonian operators have the form
\[
B_0(u) = \begin{pmatrix} 4u_1^2 & u_0 u_1 \\ u_0 u_1 & u_1^2 \end{pmatrix} D
+ \begin{pmatrix} 0 & 0 \\ u_1 & 0 \end{pmatrix} u_{0x}
+ \begin{pmatrix} 4u_1 & u_0 \\ 0 & u_1 \end{pmatrix} u_{1x}, \tag{6.39}
\]
\[
B_1(u) = \begin{pmatrix} 8u_0 u_1^2 & 4u_1^3 + u_0^2 u_1 \\ 4u_1^3 + u_0^2 u_1 & 2u_0 u_1^2 \end{pmatrix} D
+ \begin{pmatrix} 4u_1^2 & 0 \\ 2u_0 u_1 & u_1^2 \end{pmatrix} u_{0x}
+ \begin{pmatrix} 8u_0 u_1 & u_0^2 + 8u_1^2 \\ 4u_1^2 & 2u_0 u_1 \end{pmatrix} u_{1x}. \tag{6.40}
\]
These structures also restrict to $M^+_{\mathrm{dToda}}$, and are obviously of hydrodynamic type. But in contrast to the first structure, the metrics associated with $B_0(u)$ and $B_1(u)$ are non-degenerate only on a subset of $M^+_{\mathrm{dToda}}$, characterized by the condition
\[
w_1(u)\, w_2(u) \neq 0, \tag{6.41}
\]
where $w_1(u)$, $w_2(u)$ are the Riemann invariants in (6.38).
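The statements about the Riemann invariants (6.38) and the non-degeneracy condition (6.41) can be spot-checked numerically. The sketch below writes the $n = 1$ equations $u_{0t} = 4u_1 u_{1x}$, $u_{1t} = u_1 u_{0x}$ as $u_t = A(u)u_x$, verifies that the gradients $(1, \mp 2)$ of $w_1$, $w_2$ are left eigenvectors of $A$ with eigenvalues $\mp 2u_1$, and compares the determinants of the metrics with $-u_1^2$, $-u_1^2 w_1 w_2$ and $-u_1^2 (w_1 w_2)^2$. The metric coefficients are the $D$-coefficients of (6.35), (6.39) and (6.40) as read here from the text; the function names and the sample point are assumptions of this example.

```python
def riemann_invariants(u0, u1):
    """w1 and w2 from (6.38)."""
    return u0 - 2.0 * u1, u0 + 2.0 * u1


def flux_matrix(u0, u1):
    """Matrix A(u) with (u0, u1)_t = A(u) (u0, u1)_x for the n = 1 flow."""
    return [[0.0, 4.0 * u1], [u1, 0.0]]


def row_times_matrix(row, m):
    """Left multiplication of a 2x2 matrix by a row vector."""
    return [row[0] * m[0][0] + row[1] * m[1][0],
            row[0] * m[0][1] + row[1] * m[1][1]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def metrics(u0, u1):
    """D-coefficients of the first three Hamiltonian operators."""
    g_first = [[0.0, u1], [u1, 0.0]]
    g_second = [[4 * u1 ** 2, u0 * u1], [u0 * u1, u1 ** 2]]
    off = 4 * u1 ** 3 + u0 ** 2 * u1
    g_third = [[8 * u0 * u1 ** 2, off], [off, 2 * u0 * u1 ** 2]]
    return g_first, g_second, g_third


u0, u1 = 0.7, 1.3
w1, w2 = riemann_invariants(u0, u1)
A = flux_matrix(u0, u1)
left_w1 = row_times_matrix([1.0, -2.0], A)  # expected: -2*u1 * (1, -2)
left_w2 = row_times_matrix([1.0, 2.0], A)   # expected: +2*u1 * (1, 2)
dets = [det2(g) for g in metrics(u0, u1)]
```

In this reading the second and third metrics degenerate exactly where $w_1 w_2 = u_0^2 - 4u_1^2$ vanishes, in agreement with (6.41).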
In order to compute the higher structures, we have to invoke Dirac reduction, as in the last example. Here, we shall do this for the fourth structure as it presents new features which are also shared by all higher structures. First of all, we check that for $L \in M^+_{\mathrm{dToda}}$, we have $X_H^{(2)}(L) \in \operatorname{Im} \Pi_{\mathfrak{l}}^*$, and the highest order term in $\lambda$ is $\lambda^2$. Then we write down the operator which gives $X_H^{(2)}(L)$ using the coordinates $u = (u_0, u_1, u_2)$ on the submanifold where $X_H^{(2)}(L)$ lies:
\[
\begin{pmatrix}
(12u_1^4 + 12u_0^2 u_1^2)D + 6(u_1^4 + u_0^2 u_1^2)_x & (u_0^3 u_1 + 12u_0 u_1^3)D + 6u_1^3 u_{0x} + 24u_0 u_1^2 u_{1x} + u_0^3 u_{1x} & 2u_1^4 D + 8u_1^3 u_{1x} \\
(u_0^3 u_1 + 12u_0 u_1^3)D + u_1 (u_0^3 + 6u_0 u_1^2)_x & (4u_1^4 + 3u_0^2 u_1^2)D + 3u_0 u_1^2 u_{0x} + (8u_1^3 + 3u_0^2 u_1)u_{1x} & u_1^3 u_{0x} \\
2u_1^4 D & -u_1^3 u_{0x} & -u_1^4 D - 2u_1^3 u_{1x}
\end{pmatrix}. \tag{6.42}
\]
Finally, we invoke Dirac reduction with constraint $u_2 \equiv 0$ to compute the Hamiltonian operator of the fourth structure, and the result is
\[
\begin{aligned}
B_2(u) &= \begin{pmatrix} 16u_1^4 + 12u_0^2 u_1^2 & 12u_0 u_1^3 + u_0^3 u_1 \\ 12u_0 u_1^3 + u_0^3 u_1 & 4u_1^4 + 3u_0^2 u_1^2 \end{pmatrix} D
+ \begin{pmatrix} 12u_0 u_1^2 & 4u_1^3 \\ 3u_0^2 u_1 + 8u_1^3 & 3u_0 u_1^2 \end{pmatrix} u_{0x} \\
&\quad + \begin{pmatrix} 32u_1^3 + 12u_0^2 u_1 & 24u_0 u_1^2 + u_0^3 \\ 12u_0 u_1^2 & 8u_1^3 + 3u_0^2 u_1 \end{pmatrix} u_{1x}
- \begin{pmatrix} 16u_1 u_{1x} D^{-1} u_1 u_{1x} & 4u_1 u_{1x} D^{-1} u_1 u_{0x} \\ 4u_1 u_{0x} D^{-1} u_1 u_{1x} & u_1 u_{0x} D^{-1} u_1 u_{0x} \end{pmatrix}.
\end{aligned} \tag{6.43}
\]
Thus, $B_2(u)$ has a nonlocal tail, and provides an example of a class of nonlocal Hamiltonian operators of the form
\[
B^{ij}(u) = g^{ij} D + b^{ij}_k u^k_x + \sum_{\alpha=1}^{N} (w_\alpha)^i_k u^k_x D^{-1} (w^\alpha)^j_\ell u^\ell_x. \tag{6.44}
\]
In the case where $\det(g^{ij}) \neq 0$, we note that the geometric root of such structures was discussed in [F] and applied to the chromatography equations. At this point, the reader can check that the subset of $M^+_{\mathrm{dToda}}$ where the metric associated with $B_2(u)$ is non-degenerate is likewise defined by (6.41). Also, on the symplectic leaves of the first structure where $\int u_0(x)\,dx = \text{const}$ and $\int \ln u_1(x)\,dx = \text{const}$, the recursion operator $R = B_0 B_{-1}^{-1}$ exists and it is not hard to show that $B_1 = RB_0$ and $B_2 = R^2 B_0$.

Remark 6.45. In [DM], the authors considered the dispersionless Toda lattice equations with boundary conditions $u_1(0) = 0$, $u_1(1) = 0$. We remark that the multi-Hamiltonian formalism of this problem can also be obtained in a similar fashion. Indeed, the only major change one has to make here is to replace the algebra above by the algebra of Laurent polynomials in $\lambda$, having the form $u(x, \lambda) = \sum_i u_i(x)\lambda^i$, where the coefficients $u_i$ are smooth functions on $I = [0, 1]$ satisfying the additional conditions $u_j(0) = u_j(1) = 0$, $j \neq 0$. Otherwise, everything goes through just the same as before. In particular, the formulas for the Hamiltonian operators of the first four structures are still those given in (6.35), (6.39), (6.40) and (6.43).
In the next two examples, we shall consider equations with infinitely many field variables. For simplicity of exposition, we shall not get into reduction calculations here, only remark that the number of constraints is still finite in each case.

6.3. The dispersionless KP hierarchy.
Let $A$ be the algebra of formal Laurent series in $\lambda$, having the form
\[
u(x, \lambda) = \sum_{i=-\infty}^{N(u)} u_i(x)\lambda^i, \tag{6.46}
\]
where the coefficients $u_i$ are smooth functions on $S^1 = \mathbb{R}/\mathbb{Z}$. Define
\[
[u, v]_{-1} = \frac{\partial u}{\partial \lambda} \frac{\partial v}{\partial x} - \frac{\partial u}{\partial x} \frac{\partial v}{\partial \lambda}, \quad u, v \in A, \tag{6.47}
\]
then $(A, [\cdot,\cdot]_{-1})$ is a Poisson algebra. The (extended) dispersionless KP (dKP) hierarchy is defined by the equations
\[
\frac{dL}{dt} = \left[ \Pi_{\ge 0}(L^n), L \right]_{-1} = -\left[ \Pi_{\le -1}(L^n), L \right]_{-1}, \quad n = 1, 2, \ldots, \tag{6.48}
\]
where the Lax operator is an element of the (extended) dKP manifold
\[
M_{\mathrm{dKP}} = \Big\{ L \in A \;\Big|\; L(x, \lambda) = \lambda + \sum_{i=-\infty}^{0} u_i(x)\lambda^i \Big\}, \tag{6.49}
\]
and $\Pi_{\ge 0}$, $\Pi_{\le -1}$ are projection operators relative to the decomposition
\[
A = A_{\ge 0} \oplus A_{\le -1} \tag{6.50}
\]
into subalgebras
\[
A_{\ge 0} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i \ge 0} u_i(x)\lambda^i \Big\}, \tag{6.51a}
\]
\[
A_{\le -1} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i=-\infty}^{-1} u_i(x)\lambda^i \Big\}. \tag{6.51b}
\]
In the standard form of the dKP equations [TT], the coefficient $u_0 \equiv 0$, but we shall not get into reduction calculations here. For the Poisson algebra $(A, [\cdot,\cdot]_{-1})$, the invariant trace is defined by
\[
\operatorname{tr}_{-1} u = \int u_{-1}(x)\,dx, \quad u \in A, \tag{6.52}
\]
and we have the non-degenerate ad-invariant pairing $(\cdot,\cdot)_{-1}$:
\[
(u, v)_{-1} = \operatorname{tr}_{-1}(uv), \quad u, v \in A. \tag{6.53}
\]
So again we can invoke Theorem 3.2, using the r-matrix
\[
R = \Pi_{\ge 0} - \Pi_{\le -1} \tag{6.54}
\]
in this case to obtain the corresponding brackets $\{\cdot,\cdot\}_{(n)}$, $n \ge -1$. Here, it is easy to check that $M_{\mathrm{dKP}}$ is a Poisson submanifold of $(A, \{\cdot,\cdot\}_{(n)})$ only for $n = -1, 0$. Therefore, the induced structures on $M_{\mathrm{dKP}}$ provide the first and second Hamiltonian structures for the equations in the hierarchy. For the bracket $\{\cdot,\cdot\}_{(1)}$, the slightly larger manifold $\big\{ u \in A \mid u(x, \lambda) = \sum_{i=-\infty}^{1} u_i(x)\lambda^i \big\}$ is a Poisson submanifold. Hence the third structure on $M_{\mathrm{dKP}}$ can be computed using Dirac reduction with constraint $u_1 \equiv 1$. We shall leave the details to the interested reader.

6.4. The dispersionless modified KP and the dispersionless Dym hierarchy.
Let $(A, [\cdot,\cdot]_{-1})$ be the Poisson algebra in Example 6.3, with the same invariant pairing $(\cdot,\cdot)_{-1}$. Consider the decomposition
\[
A = A_{\ge k} \oplus A_{\le k-1}, \quad k \ge 0 \tag{6.55}
\]
with associated projection operators $\Pi_{\ge k}$ and $\Pi_{\le k-1}$, where
\[
A_{\ge k} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i \ge k} u_i(x)\lambda^i \Big\}, \tag{6.56a}
\]
\[
A_{\le k-1} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i=-\infty}^{k-1} u_i(x)\lambda^i \Big\}. \tag{6.56b}
\]
Clearly, $A_{\ge k}$ is a subalgebra of $(A, [\cdot,\cdot]_{-1})$ for all $k$. On the other hand, simple verification shows that $A_{\le k-1}$ is a subalgebra of $(A, [\cdot,\cdot]_{-1})$ only for $k = 0, 1, 2$. Therefore, among the direct sum decompositions in (6.55), only the three cases $k = 0$, $1$, and $2$ lead to r-matrices, and the case $k = 0$ has already appeared in Example 6.3. We now consider the other two cases, with Lax equations
\[
\frac{dL}{dt} = \left[ \Pi_{\ge k}(L^n), L \right]_{-1} = -\left[ \Pi_{\le k-1}(L^n), L \right]_{-1}, \quad n = 1, 2, \ldots;\ k = 1, 2. \tag{6.57}
\]
For $k = 1$ and $L \in M_{\mathrm{dKP}}$, the equations in (6.57) constitute the dispersionless modified KP (dmKP) hierarchy. For $k = 2$, we obtain the dispersionless Dym hierarchy when the Lax operator $L$ is from the submanifold
\[
M_{\mathrm{dDym}} = \Big\{ L \in A \;\Big|\; L(x, \lambda) = \sum_{i=-\infty}^{1} u_i(x)\lambda^i \Big\}. \tag{6.58}
\]
These hierarchies are the semi-classical limit of the modified KP and the Dym hierarchies in [ANPV,KO]. For the dmKP hierarchy, with r-matrix given by $R = \Pi_{\ge 1} - \Pi_{\le 0}$, the manifold of Lax operators is a Poisson submanifold of the associated brackets $\{\cdot,\cdot\}_{(n)}$ for $n = -1, 0, 1$. Hence the induced structures on $M_{\mathrm{dKP}}$ provide the first three Poisson structures for the Hamiltonian description of dmKP. The higher structures, on the other hand, have to be computed using Dirac reduction. For the dispersionless Dym hierarchy, the situation is even better, for in this case the first five Poisson structures on $M_{\mathrm{dDym}}$ are obtained from the brackets $\{\cdot,\cdot\}_{(n)}$ ($-1 \le n \le 3$) associated with $R = \Pi_{\ge 2} - \Pi_{\le 1}$ by simple restriction. Again, the passage from $\{\cdot,\cdot\}_{(n)}$ ($n \ge 4$) to the higher structures requires the application of Dirac reduction.
References
[A] Adler, M.: On a trace functional for formal pseudo-differential operators and the symplectic structure of the Korteweg de-Vries (KdV) type equations. Invent. Math. 50, 219–248 (1979)
[AvM] Adler, M., van Moerbeke, P.: Compatible Poisson structures and the Virasoro algebra. Comm. Pure Appl. Math. 47, 5–37 (1994)
[ANPV] Aratyn, H., Nissimov, E., Pacheva, S., Vaysburd, I.: R-matrix formulation of the KP hierarchies and their gauge equivalence. Phys. Lett. B 294, 167–176 (1992)
[B] Benny, D. J.: Some properties of long nonlinear waves. Stud. Appl. Math. 52, 45–50 (1973)
[D] Dubrovin, B.: Geometry of 2D topological field theories. In: Lecture Notes in Math., Vol. 1620, Berlin–Heidelberg–New York: Springer-Verlag, 1996
[DFIZ] DiFrancesco, P., Itzykson, C., Zuber, J.-B.: Classical W-algebras. Commun. Math. Phys. 140, 543–567 (1991)
[DM] Deift, P., McLaughlin, K. T-R.: A continuum limit of the Toda lattice. Memoirs of Am. Math. Soc. 131, no. 624 (1998)
[DN] Dubrovin, B., Novikov, S. P.: Hydrodynamics of weakly deformed soliton lattices, differential geometry and Hamiltonian theory. Russian Math. Surveys 44, 35–124 (1989)
[DO] Dorfman, I.: Dirac structures and integrability of nonlinear evolution equations. Chichester, England: J. Wiley, 1993
[DR] Drinfeld, V. G.: Hamiltonian structure on Lie groups, Lie bialgebras and the geometrical meaning of the Yang–Baxter equations. Sov. Math. Doklady 27, 69–71 (1983)
[F] Ferapontov, E. V.: Differential geometry of nonlocal Hamiltonian operators of hydrodynamic type. Funct. Anal. Appl. 25, 195–204 (1991)
[GD] Gelfand, I., Dickey, L.: A family of Hamiltonian structures related to nonlinear integrable differential equations. Preprint no. 136, Inst. Appl. Math. USSR Acad. Sci. 1978 (in Russian). English transl. In: Collected papers of I. M. Gelfand, Vol. 1. Berlin–Heidelberg–New York: Springer 1987, pp. 625–646
[GDO] Gelfand, I., Dorfman, I.: Hamiltonian operators and algebraic structures related to them. Funct. Anal. Appl. 13, 248–262 (1979)
[G-KR] Golenischeva-Kutuzova, M., Reiman, A. G.: Integrable equations, related with the Poisson algebra. J. Soviet Math. 169, 890–894 (1988)
[K] Kostant, B.: The solution to a generalized Toda lattice and representation theory. Adv. Math. 34, 195–338 (1979)
[Kri] Krichever, I. M.: The dispersionless Lax equations and topological minimal models. Commun. Math. Phys. 143, 415–426 (1991)
[KO] Konopelchenko, B., Oevel, W.: An r-matrix approach to nonstandard classes of integrable equations. Publ. RIMS, Kyoto Univ. 29, 581–666 (1993)
[KR] Kulish, P. P., Reiman, A. G.: Hierarchy of symplectic forms for the Schrödinger and the Dirac equations on the line. Zap. Nauchn. Sem. L. O. M. I. 77, 134–147 (1978) (in Russian), English transl. In: J. Soviet Math. 22, 1627–1637 (1983)
[LP1] Li, L. C., Parmentier, S.: A new class of quadratic Poisson structures and the Yang–Baxter equation. C. R. Acad. Sci., Paris Ser. I 307, 279–281 (1988)
[LP2] Li, L. C., Parmentier, S.: Nonlinear Poisson structures and r-matrices. Commun. Math. Phys. 125, 545–563 (1989)
[M] Magri, F.: A simple model of the integrable Hamiltonian equation. J. Math. Phys. 19, 1156–1162 (1978)
[MR] Marsden, J., Ratiu, T.: Reduction of Poisson manifolds. Lett. in Math. Phys. 11, 161–169 (1986)
[RSTS1] Reiman, A. G., Semenov-Tian-Shansky, M. A.: A family of Hamiltonian structures, hierarchy of Hamiltonians, and reduction for first-order matrix differential operators. Funct. Anal. Appl. 14, 146–148 (1980)
[RSTS2] Reiman, A. G., Semenov-Tian-Shansky, M. A.: Group-theoretical methods in the theory of finite dimensional integrable systems. In: Dynamical Systems VII, ed. by V. I. Arnold, S. P. Novikov, Encyclopaedia of Mathematical Sciences, Vol. 16, Berlin–Heidelberg–New York: Springer-Verlag, 1994
[S] Schouten, J. A.: On the differential operators of first order in tensor calculus. Conv. di Geom. Differen. 1953, Roma: Ed. Cremonese, 1954
[ST] Strack, K.: r-Matrizen and assoziativen Algebren: eine systematische Suche nach Poisson-Klammern. Thesis (1990)
[STS1] Semenov-Tian-Shansky, M. A.: What is a classical r-matrix? Funct. Anal. Appl. 17, 259–272 (1983)
[STS2] Semenov-Tian-Shansky, M. A.: Dressing transformations and Poisson Lie group actions. Publ. RIMS, Kyoto University 21, 1237–1260 (1985)
[TT] Takasaki, K., Takebe, T.: Integrable hierarchies and dispersionless limit. Rev. Math. Phys. 7, 743–808 (1995)
[W1] Weinstein, A.: Coisotropic calculus and Poisson groupoids. J. Math. Soc. Japan 40, 705–727 (1988)
[W2] Weinstein, A.: The local structure of Poisson manifolds. J. Diff. Geom. 18, 523–557 (1983)
Communicated by B. Simon
Commun. Math. Phys. 203, 593 – 612 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Multifractal Analysis of Local Entropies for Expansive Homeomorphisms with Specification Floris Takens, Evgeny Verbitski Department of Mathematics, University of Groningen, P.O.Box 800, 9700 AV, Groningen, The Netherlands. E-mail:
[email protected];
[email protected] Received: 22 September 1998 / Accepted: 11 December 1998
Abstract: In the present paper we study the multifractal spectrum of local entropies. We obtain results, similar to those of the multifractal analysis of pointwise dimensions, but under much weaker assumptions on the dynamical systems. We assume our dynamical system to be defined by an expansive homeomorphism with the specification property. We establish the variational relation between the multifractal spectrum and other thermodynamical characteristics of the dynamical system, including the spectrum of correlation entropies.
1. Introduction
Recently in the series of papers [10,11,2] L. Barreira, Ya. B. Pesin, J. Schmeling, and H. Weiss performed a complete multifractal analysis of local dimensions, entropies and Lyapunov exponents for conformal expanding maps and surface Axiom A diffeomorphisms with Gibbs measures. The main goal of these papers was primarily the analysis of the local (pointwise) dimensions. This is an extremely difficult problem and, for example, similar results for hyperbolic systems in dimensions 3 and higher have not been obtained.
In the present work we concentrate our attention on the multifractal analysis of the local (pointwise) entropies. We are able to obtain results, which are similar to those mentioned above, for Gibbs measures of the expansive homeomorphisms with specification property. Note that such dynamical systems may not have Markov partitions, which is a crucial condition in the previous works. However, due to the fact that less is known about thermodynamical properties of these dynamical systems we were able to obtain only the continuous differentiability of the multifractal spectrum of local entropies (compare: the same spectra for the dynamical systems with Markov partitions are analytic). We believe that the smoothness of the multifractal spectrum in our case can be improved.
We have related the multifractal spectrum of the local entropies to the spectrum of correlation entropies. These correlation entropies serve as entropy-like analogues of the Hentschel–Procaccia and Renyi spectra of generalized dimensions. This allows us to complete the duality between the multifractal analyses of local dimensions and entropies.

2. Expansiveness and Specification
The following definitions and fundamental results are taken from [6,8,17], for a compact presentation see [9, Chap.20]. Throughout this paper we assume $(X, d)$ to be a compact metric space.

Definition 2.1. A homeomorphism $f : X \to X$ is called expansive if there exists a constant $\gamma > 0$ such that
\[
\text{if } d(f^n(x), f^n(y)) < \gamma \text{ for all } n \in \mathbb{Z} \text{ then } x = y. \tag{2.1}
\]
The maximal $\gamma$ with such a property is called the expansivity constant.

Another important property is the following.

Definition 2.2 (Bowen [6]). We say that $f : X \to X$ is a homeomorphism with the specification property (abbreviated to "a homeomorphism with specification") if for each $\delta > 0$ there exists an integer $p = p(\delta)$ such that the following holds: if
a) $I_1, \ldots, I_n$ are intervals of integers, $I_j \subseteq [a, b]$ for some $a, b \in \mathbb{Z}$ and all $j$,
b) $\operatorname{dist}(I_i, I_j) \ge p(\delta)$ for $i \neq j$,
then for arbitrary $x_1, \ldots, x_n \in X$ there exists a point $x \in X$ such that
1) $f^{b-a+p(\delta)}(x) = x$,
2) $d(f^k(x), f^k(x_i)) < \delta$ for $k \in I_i$.

The specification property guarantees good mixing properties of $f$ and a sufficient number of periodic orbits. Homeomorphisms that are expansive and have the specification property form a general class of "strongly chaotic" dynamical systems. For example, the following holds:

Theorem 2.3 ([9, Theorem 18.3.9]). Let $\Lambda$ be a topologically mixing compact locally maximal hyperbolic set for a diffeomorphism $f$. Then $f|_\Lambda$ has the specification property.

Remark. A generalization of the notion of a space with a hyperbolic diffeomorphism is the so-called Smale space [16]. For Smale spaces, mixing also implies specification.

3. Equilibrium States
For the multifractal analysis one needs an invariant probability measure. On an attractor there is usually one physically relevant measure (density of a generic orbit) called the SRB (Sinai-Ruelle-Bowen) measure, which often belongs to the class of equilibrium states or Gibbs measures. We introduce the last notion now. Again, let $(X, d)$ be a compact space, $f : X \to X$ a continuous map and $\varphi : X \to \mathbb{R}$ a continuous function. We shall use the following notation.
Definition 3.1. For every $n \in \mathbb{N}$ and any $x, y \in X$ define a new metric
\[
d_n(x, y) = \max_{i=0,\ldots,n-1} d(f^i(x), f^i(y)),
\]
and let $B_n(x, \varepsilon) = \{ y \in X : d_n(x, y) < \varepsilon \}$ for $\varepsilon > 0$. The set $E \subset X$ is said to be $(n, \varepsilon)$-separated if for every $x, y \in E$ such that $x \neq y$ we have $d_n(x, y) > \varepsilon$. We say that a set $F \subset X$ is $(n, \varepsilon)$-spanning if for every $y \in X$ there exists $x \in F$ such that $d_n(x, y) < \varepsilon$. For any function $\varphi : X \to \mathbb{R}$ and $x \in X$ put
\[
(S_n\varphi)(x) = \sum_{k=0}^{n-1} \varphi(f^k(x)).
\]
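The objects of Definition 3.1 are easy to compute for a concrete map. The sketch below uses the doubling map $x \mapsto 2x \bmod 1$ on the circle $\mathbb{R}/\mathbb{Z}$ as a test case; this map is continuous but not a homeomorphism, so it illustrates the definition only, not the standing assumptions of the later theorems. The greedy routine selects a separated subset of a finite grid, which is maximal only among the grid points. All names and parameter values are choices made for this example.

```python
def circle_dist(x, y):
    """Distance on the circle R/Z."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)


def doubling(x):
    """The doubling map x -> 2x mod 1 (illustrative choice of f)."""
    return (2.0 * x) % 1.0


def bowen_dist(f, dist, x, y, n):
    """d_n(x, y) = max over i = 0..n-1 of dist(f^i x, f^i y)."""
    best = 0.0
    for _ in range(n):
        best = max(best, dist(x, y))
        x, y = f(x), f(y)
    return best


def birkhoff_sum(f, phi, x, n):
    """(S_n phi)(x) = sum over k = 0..n-1 of phi(f^k x)."""
    total = 0.0
    for _ in range(n):
        total += phi(x)
        x = f(x)
    return total


def greedy_separated(f, dist, candidates, n, eps):
    """Greedily pick an (n, eps)-separated subset of the candidate points."""
    chosen = []
    for c in candidates:
        if all(bowen_dist(f, dist, c, e, n) > eps for e in chosen):
            chosen.append(c)
    return chosen


grid = [k / 256 for k in range(256)]  # dyadic points, so doubling is exact
sep1 = greedy_separated(doubling, circle_dist, grid, 1, 0.2)
sep5 = greedy_separated(doubling, circle_dist, grid, 5, 0.2)
```

Since $d_n$ increases with $n$, more grid points become separated as $n$ grows, so `sep5` contains more points than `sep1`.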
Now we introduce the topological pressure which will be defined on the space $C(X)$ of all continuous functions on $(X, d)$.

Definition 3.2. For $n \in \mathbb{N}$ and $\varepsilon > 0$ define
\[
Z_n(\varphi, \varepsilon) = \sup_E \Big\{ \sum_{x \in E} \exp\big( (S_n\varphi)(x) \big) \Big\}, \tag{3.1}
\]
where the supremum is taken over all $(n, \varepsilon)$-separated sets $E$. The pressure is then defined as
\[
P(\varphi) = \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{1}{n} \log Z_n(\varphi, \varepsilon). \tag{3.2}
\]
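For a simple symbolic system the quantities in (3.1) and (3.2) can be evaluated by brute force. The sketch below assumes the one-sided full shift on two symbols with the metric $d(x, y) = 2^{-\min\{i : x_i \neq y_i\}}$ and a potential depending only on the zeroth symbol. Under these assumptions, for $\varepsilon = 1/2$ an $(n, \varepsilon)$-separated set contains at most one point per word of length $n$, and $(S_n\varphi)(x)$ depends only on the first $n$ symbols, so $Z_n$ is a sum over words and $\frac1n \log Z_n = \log(e^{\varphi(0)} + e^{\varphi(1)})$. The example system and all names are assumptions of this illustration.

```python
import itertools
import math


def partition_sum(phi, n):
    """Z_n as a sum of exp(S_n phi) over all words of length n in {0, 1}."""
    total = 0.0
    for word in itertools.product((0, 1), repeat=n):
        total += math.exp(sum(phi[s] for s in word))
    return total


phi = {0: 0.3, 1: -1.2}  # potential values on the two symbols (arbitrary)
n = 12
pressure_estimate = math.log(partition_sum(phi, n)) / n
pressure_exact = math.log(math.exp(phi[0]) + math.exp(phi[1]))

# phi = 0 gives the topological entropy of the full 2-shift, log 2
entropy_estimate = math.log(partition_sum({0: 0.0, 1: 0.0}, n)) / n
```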
The topological entropy of $f$, denoted by $h_{\mathrm{top}}(f)$, is by definition the topological pressure of $\varphi \equiv 0$. The topological pressure admits other equivalent definitions; see [21]. In particular, the following statement is known as the Variational Principle.

Theorem 3.3. Denote by $M_f(X)$ the set of all $f$-invariant Borel probability measures on $X$. Let $\varphi \in C(X)$. Then
\[
P(\varphi) = \sup_{\mu \in M_f(X)} \Big( h_\mu(f) + \int \varphi\, d\mu \Big).
\]
This result inspires the following definition.

Definition 3.4. An element $\mu$ of $M_f(X)$ is called an equilibrium state for the potential $\varphi$ if
\[
P(\varphi) = h_\mu(f) + \int \varphi\, d\mu.
\]
The equilibrium state for $\varphi \equiv 0$ (if it exists) is called a measure of maximal entropy.

We recall some other basic properties of the topological pressure:
1. $P : C(X) \to \mathbb{R}$ is continuous and monotonically increasing, i.e., $\varphi \le \psi \Rightarrow P(\varphi) \le P(\psi)$.
2. One of the following holds: $P(\varphi) = +\infty$ for all $\varphi \in C(X)$, or $P(\varphi) < +\infty$ for all $\varphi \in C(X)$. Expansive homeomorphisms, which we will consider in the next sections, always have finite topological entropy and hence the pressure of every continuous function is finite.
3. $P : C(X) \to \mathbb{R}$ is convex, i.e., for all $\lambda \in [0, 1]$, $P(\lambda\varphi + (1 - \lambda)\psi) \le \lambda P(\varphi) + (1 - \lambda)P(\psi)$.
4. For any $\varphi \in C(X)$ and $c \in \mathbb{R}$ one has $P(\varphi + c) = P(\varphi) + c$.

We impose additional conditions on the class of potentials under consideration. We say that $\varphi \in V_f(X)$ if it is continuous and there exist $\varepsilon > 0$ and $K > 0$ such that for all $n \in \mathbb{N}$,
\[
d(f^k(x), f^k(y)) < \varepsilon \text{ for } k = 0, \ldots, n - 1 \ \Rightarrow\ |(S_n\varphi)(x) - (S_n\varphi)(y)| < K.
\]
For example, for a hyperbolic diffeomorphism $f$, any Hölder continuous function $\varphi$ is in $V_f(X)$ [9, Prop.20.2.6].

Theorem 3.5 ([6,16,9]). If $f$ is an expansive homeomorphism with specification and $\varphi \in V_f(X)$ then there exists a unique measure $\mu_\varphi$ such that
\[
P(\varphi) = h_{\mu_\varphi}(f) + \int \varphi\, d\mu_\varphi.
\]
Moreover, $\mu_\varphi$ is ergodic, positive on open sets and mixing.

The equilibrium state $\mu_\varphi$ can be constructed from the measures concentrated on periodic points in the following way. For every $n \ge 1$ define a probability measure $\mu_{\varphi,n}$ supported on the set of periodic points $\operatorname{Fix}(f^n) = \{ x \in X : f^n(x) = x \}$ as follows:
\[
\mu_{\varphi,n} = \frac{1}{P(f, \varphi, n)} \sum_{x \in \operatorname{Fix}(f^n)} e^{(S_n\varphi)(x)}\, \delta_x, \tag{3.3}
\]
where $\delta_x$ is a unit measure at $x$ and $P(f, \varphi, n) = \sum_{x \in \operatorname{Fix}(f^n)} e^{(S_n\varphi)(x)}$ is a normalizing constant.

Theorem 3.6 ([6,9]). An equilibrium state $\mu_\varphi$ is a weak$^*$ limit of the sequence $\{\mu_{\varphi,n}\}$, i.e., for every $h \in C(X)$,
\[
\int h(x)\, d\mu_{\varphi,n} \to \int h(x)\, d\mu_\varphi \quad \text{as } n \to \infty.
\]
For our purposes of analysis of local entropies the following result will play a key role.
Theorem 3.7 ([8, Proposition 2.1], [9, Theorem 20.3.4]). Let $f$ be an expansive homeomorphism with the specification property. Let $\varphi \in V_f(X)$ and denote its equilibrium state by $\mu_\varphi$. Then for a sufficiently small $\varepsilon > 0$ there exist constants $A_\varepsilon, B_\varepsilon > 0$ such that for all $x \in X$ and $n \ge 0$,
\[
A_\varepsilon \le \frac{\mu_\varphi\big( \{ y \in X : d(f^k(x), f^k(y)) < \varepsilon \text{ for } k = 0, \ldots, n - 1 \} \big)}{\exp\big( -nP(\varphi) + (S_n\varphi)(x) \big)} \le B_\varepsilon. \tag{3.4}
\]
Remark. Actually, the result above states that for expansive homeomorphisms with specification the equilibrium states are the so-called Gibbs measures (states) as well. See [8] for detailed discussion.

We have seen that for every $\varphi \in V_f(X)$ there exists a unique equilibrium state. Using (3.3) and (3.4) we are able to give necessary and sufficient conditions for potentials $\varphi, \psi \in V_f(X)$ to have the same equilibrium states $\mu_\varphi = \mu_\psi$.

Theorem 3.8. Let $f$ be an expansive homeomorphism with specification. The equilibrium states $\mu_\varphi$ and $\mu_\psi$ corresponding to the potentials $\varphi, \psi \in V_f(X)$ coincide if and only if there exists a constant $c \in \mathbb{R}$ such that
\[
(S_n\varphi)(x) = (S_n\psi)(x) + nc \tag{3.5}
\]
for all $x \in \operatorname{Fix}(f^n)$ and all $n$.

Proof. If (3.5) holds for all $x \in \operatorname{Fix}(f^n)$ and $n$, then by (3.3) one has $\mu_{\varphi,n} = \mu_{\psi,n}$ for all $n$. Thus $\mu_\varphi = \mu_\psi$.
Suppose that $\mu_\varphi = \mu_\psi =: \mu$. Consider "adjusted" potentials $\tilde\varphi = \varphi - P(\varphi)$ and $\tilde\psi = \psi - P(\psi)$. Let $x \in \operatorname{Fix}(f^n)$ for some $n \in \mathbb{N}$. Applying (3.4) for sufficiently small $\varepsilon > 0$, we conclude that
\[
A^\varphi_\varepsilon \exp\big( (S_n\tilde\varphi)(x) \big) \le \mu(B_n(x, \varepsilon)) \le B^\psi_\varepsilon \exp\big( (S_n\tilde\psi)(x) \big).
\]
This implies that $(S_n\tilde\varphi)(x) \le (S_n\tilde\psi)(x) + C'$ for some constant $C'$ independent of $x$ and $n$. Since $x \in \operatorname{Fix}(f^{kn})$ for all $k \in \mathbb{N}$ we have that
\[
(S_n\tilde\varphi)(x) = \lim_{k \to \infty} \frac{(S_{kn}\tilde\varphi)(x)}{k} \le \lim_{k \to \infty} \frac{(S_{kn}\tilde\psi)(x)}{k} = (S_n\tilde\psi)(x).
\]
By symmetry we obtain the opposite inequality. Hence
\[
(S_n\tilde\varphi)(x) = (S_n\tilde\psi)(x)
\]
for all $x \in \operatorname{Fix}(f^n)$ and $n \in \mathbb{N}$. This implies (3.5) with $c = P(\varphi) - P(\psi)$. $\square$

4. Thermodynamical Formalism for Expansive Homeomorphisms with Specification
In this section we establish some technical results on the properties of the pressure for expansive homeomorphisms which will be exploited later in the proof of the main result.
Lemma 4.1. Suppose $f : X \to X$ is an expansive homeomorphism with specification. Let $\varphi \in V_f(X)$. Then the function $P(q\varphi)$, $q \in \mathbb{R}$, is continuously differentiable with respect to $q$ and its derivative is given by
\[
\frac{dP(q\varphi)}{dq} = \int \varphi\, d\mu_q,
\]
where $\mu_q$ is the equilibrium state corresponding to the potential $q\varphi$. Moreover, $P(q\varphi)$ is a strictly convex function of $q$ provided the equilibrium state $\mu_\varphi$ for $\varphi$ is not a measure of maximal entropy. If $\mu_\varphi$ is the measure of maximal entropy then $P(q\varphi) - qP(\varphi) = (1 - q)h_{\mathrm{top}}(f)$ for all $q \in \mathbb{R}$.

Proof. We shall use several results from [21] to show that $P(q\varphi)$ is a differentiable function of $q$. For a moment we are going to use the fact that $f : X \to X$ is a continuous map on a compact metric space $(X, d)$ with finite topological entropy. Since the topological pressure is a continuous and convex function on $C(X)$, for every $\varphi, \psi \in C(X)$, the function
\[
t \mapsto \frac{P(\varphi + t\psi) - P(\varphi)}{t}
\]
is non-increasing as $t \downarrow 0$. Hence there exist right and left derivatives of $P(\varphi)$ in the direction of $\psi$, i.e.,
\[
d^+P(\varphi)(\psi) = \lim_{t \to 0+} \frac{P(\varphi + t\psi) - P(\varphi)}{t}, \qquad
d^-P(\varphi)(\psi) = \lim_{t \to 0-} \frac{P(\varphi + t\psi) - P(\varphi)}{t}.
\]
We say that the pressure $P$ is Gâteaux differentiable at $\varphi$ if for every $\psi$ the following holds:
\[
d^+P(\varphi)(\psi) = d^-P(\varphi)(\psi).
\]
This turns out to be equivalent to the condition that the map $\psi \mapsto d^+P(\varphi)(\psi)$ is linear. A linear functional $\alpha$ on $C(X)$ is called a tangent functional (subdifferential) to $P(\cdot)$ at $\varphi$ if
\[
P(\varphi + \psi) - P(\varphi) \ge \alpha(\psi)
\]
for all $\psi \in C(X)$. Applying the Riesz representation theorem we conclude that there exists a finite signed measure $\nu = \nu(\alpha)$ on $X$ such that
\[
\alpha(\psi) = \int \psi\, d\nu
\]
for all $\psi \in C(X)$. From now on we identify the tangent functional $\alpha$ with the corresponding measure $\nu$ from the Riesz representation. Denote by $t_\varphi(P)$ the set of all tangent functionals to $P$ at $\varphi$ and by $M_\varphi(X)$ the set of all equilibrium states corresponding to the potential $\varphi$. Applying the Variational Principle one concludes
\[
M_\varphi(X) \subset t_\varphi(P).
\]
One can easily check that the pressure $P$ is Gâteaux differentiable at $\varphi$ if and only if there is a unique tangent functional $\nu$ to $P$ at $\varphi$ [21, Corollary 2] and that
\[
dP(\varphi)(\psi) = \int \psi\, d\nu.
\]
Combining the results of Theorems 8.2 and 9.15 from [21] one has that for an expansive homeomorphism $f : X \to X$,
\[
M_\varphi(X) = t_\varphi(P)
\]
for every $\varphi \in C(X)$. Since for every $\varphi \in V_f(X)$ the set $M_\varphi(X)$ consists of a single element (uniqueness of equilibrium states), we have that the pressure $P$ is Gâteaux differentiable at any $\varphi \in V_f(X)$ and
\[
\frac{d}{dt} P(\varphi + t\psi) \Big|_{t=0} = \int \psi\, d\mu_\varphi \tag{4.1}
\]
for all $\psi \in C(X)$. This proves the differentiability of the pressure function $P(q\varphi)$ at $q = 1$. The result for all other $q$ follows in the same manner since $q\varphi \in V_f(X)$ for every $q \in \mathbb{R}$ if $\varphi \in V_f(X)$.
If a convex function is differentiable, then its derivative is continuous. Since we have already established the differentiability of $P(q\varphi)$ (and it is convex) we obtain the desired result.
Now we are going to establish the strict convexity of $P(q\varphi)$. Suppose $\mu_\varphi$ is not a measure of maximal entropy. Then applying the result of Theorem 3.8 we conclude that the equilibrium states $\mu_{q_1}$ and $\mu_{q_2}$, corresponding to potentials $q_1\varphi$ and $q_2\varphi$ respectively, are not equal if $q_1 \neq q_2$. Indeed, assume $\mu_{q_1} = \mu_{q_2}$ for some $q_1 \neq q_2$. Then by Theorem 3.8 we conclude that for some constant $c$,
\[
(S_n q_1\varphi)(x) = (S_n q_2\varphi)(x) + nc
\]
for all $n$ and $x \in \operatorname{Fix}(f^n)$. This implies that $(S_n\varphi)(x) = n\tilde{c}$ with $\tilde{c} = c/(q_1 - q_2)$. Applying again Theorem 3.8 one has that the equilibrium state $\mu_\varphi$ and the equilibrium state $\mu_0$, corresponding to the potential $\psi \equiv 0$, are equal. It means that $\mu_\varphi$ is the measure of maximal entropy. Hence we have arrived at a contradiction with the assumption. Therefore $\mu_{q_1} \neq \mu_{q_2}$ if $q_1 \neq q_2$.
The function $h : \mathbb{R} \to \mathbb{R}$ is called strictly convex if for every $q_0 \in \mathbb{R}$ there exists $\lambda(q_0) \in \mathbb{R}$ such that
\[
h(q) > h(q_0) + \lambda(q_0)(q - q_0) \quad \text{for all } q \neq q_0.
\]
Put $\lambda(q_0) = \int \varphi\, d\mu_{q_0}$ for any $q_0 \in \mathbb{R}$. Since $\mu_q \neq \mu_{q_0}$ for $q \neq q_0$ and $\mu_q$ is the unique equilibrium state for $q\varphi$, one has
\[
\begin{aligned}
P(q\varphi) &= h_{\mu_q}(f) + \int q\varphi\, d\mu_q
= \sup_{\mu \in M_f(X)} \Big( h_\mu(f) + \int q\varphi\, d\mu \Big) \\
&> h_{\mu_{q_0}}(f) + \int q\varphi\, d\mu_{q_0}
= h_{\mu_{q_0}}(f) + \int q_0\varphi\, d\mu_{q_0} + (q - q_0) \int \varphi\, d\mu_{q_0} \\
&= P(q_0\varphi) + \lambda(q_0)(q - q_0).
\end{aligned}
\]
This means that $P(q\varphi)$ is a strictly convex function.
If the equilibrium state $\mu_\varphi$ is indeed a measure of maximal entropy, then $\mu_\varphi = \mu_{q\varphi} =: \mu$ for all $q \in \mathbb{R}$. This is a consequence of Theorems 3.5 and 3.8. Then applying the Variational Principle to $\mu_\varphi$ and $\mu_{q\varphi}$ we conclude that
\[
P(q\varphi) = h_\mu(f) + q \int \varphi\, d\mu, \qquad
P(\varphi) = h_\mu(f) + \int \varphi\, d\mu,
\]
where $h_\mu(f) = h_{\mathrm{top}}(f)$ since $\mu$ is the measure of maximal entropy. The result follows immediately. $\square$

Remark. Much stronger results on smoothness of the pressure are known. For example, the analyticity of pressure has been established for Smale spaces [16], i.e., generalizations of Axiom A diffeomorphisms. The key property which these systems inherit from hyperbolic dynamical systems is the so-called local product structure, which in turn guarantees the existence of Markov partitions. The known methods of establishing the analyticity of pressure strongly rely on this Markov structure. Expansive homeomorphisms with specification do not necessarily have Markov partitions. For expansive homeomorphisms with specification we were able to prove only the continuous differentiability of the pressure. However, we believe that this result can be improved.

Definition 4.2. We say that $E$ is a maximal $(n, \varepsilon)$-separated set if it cannot be enlarged by adding new points while preserving the separation condition.

It is easy to see that every maximal $(n, \varepsilon)$-separated set $E$ is an $(n, \varepsilon)$-spanning set as well. The following estimates from [8] will be used later.

Lemma 4.3. Let $f$ be an expansive homeomorphism and $\gamma > 0$ be its expansivity constant. Let $\varphi \in V_f(X)$. For every finite set $E$ put
\[
Z_n(\varphi, E) = \sum_{x \in E} \exp\big( (S_n\varphi)(x) \big).
\]
1. If $\varepsilon, \varepsilon' < \gamma/2$ and $E$, $E'$ are maximal $(n, \varepsilon)$- and $(n, \varepsilon')$-separated sets respectively, then one has
\[
Z_n(\varphi, E) \le C\, Z_n(\varphi, E'),
\]
where the constant $C = C(\varepsilon, \varepsilon')$ is independent of $n$. In particular,
\[
P(\varphi) = \lim_{n \to \infty} \frac{1}{n} \log Z_n(\varphi, E_n), \tag{4.2}
\]
where $E_n$ are arbitrary maximal $(n, \varepsilon)$-separated sets.
2. If furthermore $f$ satisfies the specification property and $\varepsilon < \gamma/2$, then there exists a constant $D = D(\varphi, \varepsilon) > 0$ such that
\[
| \log Z_n(\varphi, E_n) - nP(\varphi) | < D \tag{4.3}
\]
for all $n$ and all maximal $(n, \varepsilon)$-separated sets.
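The derivative formula and the convexity dichotomy of Lemma 4.1 can be illustrated on a toy example. The sketch below assumes the full shift on two symbols with a potential taking the values $a$ and $b$ on the two one-symbol cylinders. For this example the pressure is $P(q\varphi) = \log(e^{qa} + e^{qb})$ and the equilibrium state of $q\varphi$ is the Bernoulli measure with weights proportional to $e^{qa}$ and $e^{qb}$; both are standard facts about this example and are not taken from the paper. The function names and numerical values are choices made here.

```python
import math


def pressure(q, a, b):
    """P(q*phi) for the full 2-shift with phi = a or b on the zeroth symbol."""
    return math.log(math.exp(q * a) + math.exp(q * b))


def mean_phi(q, a, b):
    """Integral of phi against the Bernoulli equilibrium state of q*phi."""
    wa, wb = math.exp(q * a), math.exp(q * b)
    return (a * wa + b * wb) / (wa + wb)


def central_difference(func, q, h=1e-5):
    """Symmetric finite-difference approximation of func'(q)."""
    return (func(q + h) - func(q - h)) / (2.0 * h)


a, b = 0.3, -1.2
derivative_errors = [
    abs(central_difference(lambda t: pressure(t, a, b), q) - mean_phi(q, a, b))
    for q in (-1.0, 0.0, 0.5, 2.0)
]

# strict convexity when a != b: midpoint value lies below the chord
convexity_gap = 0.5 * (pressure(-1.0, a, b) + pressure(2.0, a, b)) - pressure(0.5, a, b)

# degenerate case a == b: the equilibrium state has maximal entropy log 2
q = 0.7
degenerate_defect = pressure(q, a, a) - q * pressure(1.0, a, a) - (1.0 - q) * math.log(2.0)
```

When $a = b$ the potential is constant, the equilibrium state is the measure of maximal entropy, and $P(q\varphi) - qP(\varphi) = (1 - q)\log 2$, matching the last statement of the lemma with $h_{\mathrm{top}} = \log 2$.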
5. Topological Entropy for Non-Compact Sets

The generalization of the topological entropy to non-compact or non-invariant sets goes back to Bowen [5]. Later Pesin and Pitskel [13] generalized the notion of pressure to the case of non-compact sets. Note that by definition topological entropy is the topological pressure for $\varphi \equiv 0$. Now we give the formal definition of the topological entropy of a non-compact or non-invariant set.

Suppose $f : X \to X$ is a continuous map on a compact metric space $(X, d)$. Let $\mathcal{U} = \{U_1, \dots, U_M\}$ be a finite open cover of $X$. By definition, a string $\mathbf{U}$ is a sequence $U_{i_1} \dots U_{i_n}$ with $i_k \in \{1, \dots, M\}$; its length $n$ is denoted by $n(\mathbf{U})$. The collection of all strings of length $n$ is denoted by $W_n(\mathcal{U})$. For each $\mathbf{U} \in W_n(\mathcal{U})$ define the open set
$$X(\mathbf{U}) = U_{i_1} \cap f^{-1}U_{i_2} \cap \dots \cap f^{-n+1}U_{i_n} = \{x \in X : f^{k-1}x \in U_{i_k},\ k = 1, \dots, n\}.$$
We say that a collection of strings $\Gamma$ covers a set $Z \subset X$ if
$$\bigcup_{\mathbf{U}\in\Gamma} X(\mathbf{U}) \supset Z.$$
For every real number $s$ introduce
$$M(Z, s, \mathcal{U}) = \lim_{N\to\infty} \inf_{\Gamma} \sum_{\mathbf{U}\in\Gamma} \exp(-n(\mathbf{U})s),$$
where the infimum is taken over all collections $\Gamma \subseteq \bigcup_{n\ge N} W_n(\mathcal{U})$ covering $Z$. There exists a unique value of $s$ at which $M(Z, \cdot, \mathcal{U})$ jumps from $+\infty$ to $0$:
$$h(Z, \mathcal{U}) := \sup\{s : M(Z, s, \mathcal{U}) = +\infty\} = \inf\{s : M(Z, s, \mathcal{U}) = 0\}.$$
Finally, one can show that the following limit exists:
$$h_{\mathrm{top}}(f|_Z) := \lim_{\mathrm{diam}(\mathcal{U})\to 0} h(Z, \mathcal{U}).$$
Definition 5.1. The number $h_{\mathrm{top}}(f|_Z)$ is called the topological entropy of $f$ restricted to the set $Z$, or, simply, the topological entropy of $Z$.

This definition of the topological entropy is similar to the definition of the Hausdorff dimension (the diameters of the covering open sets are substituted by $\exp(-n(\mathbf{U}))$, which can be treated as a “dynamical diameter” of $X(\mathbf{U})$). Indeed, these definitions are particular cases of the so-called Carathéodory dimension characteristics [14].

Theorem 5.2 ([12]). The topological entropy as defined above has the following properties:
1. $h_{\mathrm{top}}(f|_{Z_1}) \le h_{\mathrm{top}}(f|_{Z_2})$ for any $Z_1 \subset Z_2 \subset X$;
2. $h_{\mathrm{top}}(f|_Z) = \sup_i h_{\mathrm{top}}(f|_{Z_i})$, where $Z = \bigcup_{i=1}^{\infty} Z_i \subset X$;
3. if $\mu$ is an invariant measure such that $\mu(Z) = 1$, then $h_{\mathrm{top}}(f|_Z) \ge h_\mu(f)$.
F. Takens, E. Verbitski
6. Local Entropy

In this section we give the definition of local entropy. The fundamental result on its existence and properties is the Brin–Katok formula below. Using the notation from Sect. 3 we introduce the lower and upper local entropies at $x \in X$ as follows:
$$\underline{h}_\mu(f, x) := \lim_{\varepsilon\to 0} \liminf_{n\to\infty} -\frac{1}{n} \log \mu(B_n(x, \varepsilon)), \tag{6.1}$$
$$\overline{h}_\mu(f, x) := \lim_{\varepsilon\to 0} \limsup_{n\to\infty} -\frac{1}{n} \log \mu(B_n(x, \varepsilon)). \tag{6.2}$$
Note that the limits in $\varepsilon$ exist due to the monotonicity. We say that the local entropy exists at $x$ if
$$\underline{h}_\mu(f, x) = \overline{h}_\mu(f, x). \tag{6.3}$$
In this case the common value will be denoted by $h_\mu(f, x)$.

Theorem 6.1 (Brin–Katok formula, [7]). Let $f : X \to X$ be a continuous map on a compact metric space $(X, d)$ preserving a non-atomic Borel measure $\mu$. Then
1. for $\mu$-a.e. $x \in X$ the local entropy exists, i.e., $\underline{h}_\mu(f, x) = \overline{h}_\mu(f, x) = h_\mu(f, x)$;
2. $h_\mu(f, x)$ is an $f$-invariant function of $x$, and
$$\int h_\mu(f, x)\,d\mu = h_\mu(f),$$
where $h_\mu(f)$ is the measure-theoretic entropy of $f$.

Remark. If $\mu$ is ergodic then $h_\mu(f, x) = h_\mu(f)$ for $\mu$-a.e. $x \in X$.

Lemma 6.2. Let $f$ be an expansive homeomorphism with specification. Consider an equilibrium state $\mu = \mu_\varphi$ for the potential $\varphi \in V_f(X)$. For every $x \in X$ put
$$\varphi_*(x) = \liminf_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} \varphi(f^i(x)), \qquad \varphi^*(x) = \limsup_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} \varphi(f^i(x)).$$
Then
$$\underline{h}_\mu(f, x) = P(\varphi) - \varphi^*(x), \qquad \overline{h}_\mu(f, x) = P(\varphi) - \varphi_*(x),$$
for all $x \in X$. Therefore $\underline{h}_\mu(f, x) = \overline{h}_\mu(f, x)$ if and only if $\varphi_*(x) = \varphi^*(x)$.
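As a hedged illustration of Lemma 6.2, the following Python sketch uses the full shift on two symbols with a Bernoulli measure of weights (p, 1 − p), which is the equilibrium state of the potential ϕ(x) = log p_{x_0} with P(ϕ) = 0. The choice of system, the value of p, the periodic point and all function names are assumptions made for illustration and are not taken from the text. Bowen balls are replaced by cylinders, which for this system changes the measure only by a factor bounded independently of n.

```python
import math

def symbol_weight(symbol, p):
    """Bernoulli weight of a symbol: p for 0, 1 - p for 1."""
    return p if symbol == 0 else 1.0 - p

def local_entropy_estimate(word, p):
    """-(1/n) log mu([x_0 ... x_{n-1}]) for the Bernoulli(p, 1 - p) measure."""
    log_mu = sum(math.log(symbol_weight(s, p)) for s in word)
    return -log_mu / len(word)

def birkhoff_average(word, p):
    """(1/n) * sum_{i<n} phi(f^i x) for phi(x) = log p_{x_0}."""
    return sum(math.log(symbol_weight(s, p)) for s in word) / len(word)

p = 0.3                                  # hypothetical weight
pressure = math.log(p + (1.0 - p))       # P(phi) = log(e^{phi(0)} + e^{phi(1)}), zero up to rounding
word = [0, 1] * 500                      # first 1000 symbols of the periodic point (01)^infinity

h_local = local_entropy_estimate(word, p)
phi_avg = birkhoff_average(word, p)

# Lemma 6.2 predicts h_mu(f, x) = P(phi) - (limit of the ergodic averages).
print(h_local, pressure - phi_avg)
```

For this periodic point both printed numbers agree with −(log p + log(1 − p))/2, which differs from the entropy of the measure unless p = 1/2; the point is not typical for µ.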
Proof. Using the estimate from Theorem 3.7 we conclude that for every sufficiently small $\varepsilon > 0$ and some constants $C_1$, $C_2$ one has
$$\frac{C_1}{n} + P(\varphi) - \frac{1}{n}\sum_{i=0}^{n-1}\varphi(f^i(x)) \le -\frac{1}{n}\log \mu(B_n(x, \varepsilon)) \le \frac{C_2}{n} + P(\varphi) - \frac{1}{n}\sum_{i=0}^{n-1}\varphi(f^i(x))$$
for all $n \ge 1$ and every $x \in X$. The statement follows easily.

7. Multifractal Spectrum for Local Entropies

Following [2] we introduce a multifractal spectrum for (local) entropies. For every $\alpha$ consider a level set of local entropy
$$K_\alpha = \{x \in X : h_\mu(f, x) = \alpha\}, \tag{7.1}$$
and the corresponding multifractal decomposition on level sets
$$X = \bigcup_{\alpha} K_\alpha \cup \{x \in X : h_\mu(f, x) \text{ does not exist}\}. \tag{7.2}$$
We use the topological entropy, defined in Sect. 5, to measure the “size” of the sets $\{K_\alpha\}$. Namely, define a multifractal spectrum for local entropies as follows:
$$E_E(\alpha) = h_{\mathrm{top}}(f|_{K_\alpha}). \tag{7.3}$$
This notation needs a brief explanation: the two E’s stand for the topological Entropy of level sets of local Entropy. For other multifractal spectra $D_E$, $E_D$, $D_D$, see [2]. From a general multifractal formalism one expects $E_E(\alpha)$ to be smooth and concave on a certain interval of $\alpha$’s. We are able to establish this in the case of equilibrium states for expansive homeomorphisms with specification.

The crucial observation which we exploit in the proof is the following. Let $\mu = \mu_\varphi$ be an equilibrium state for a potential $\varphi$. Then applying the result of the previous section one gets that
$$x \in K_\alpha \iff h_\mu(f, x) = \alpha \iff \lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1}\varphi(f^k(x)) = P(\varphi) - \alpha. \tag{7.4}$$
Therefore, the level sets of local entropies are exactly the level sets of limits of ergodic averages of $\varphi$. From the Ergodic Theorem one concludes that only one of these level sets has full measure, while the others are of measure 0. We adopt a technique of estimation of the topological entropy of these level sets from [2]. The main idea is the following: we introduce a 1-parameter family of measures such that for each $\alpha$ with $K_\alpha \ne \emptyset$ there is exactly one measure in the family for which $K_\alpha$ has full measure. These measures $\mu_q$ are the equilibrium states for the potentials $\varphi_q = q\varphi - P(q\varphi)$. However, for the correspondence between levels $\{K_\alpha\}$ and measures $\{\mu_q\}$ we need a parameterization $\alpha(q)$ such that
$$\mu_q(K_{\tilde\alpha}) = \begin{cases} 1, & \text{if } \tilde\alpha = \alpha(q), \\ 0, & \text{if } \tilde\alpha \ne \alpha(q). \end{cases}$$
The parameterization can be given as follows: first define $T(q) = P(q\varphi) - qP(\varphi)$, and $\alpha(q) = -T'(q)$ (note that $T$ is $C^1$ by Lemma 4.1). Below we will establish that $h_{\mathrm{top}}(f|_{K_{\alpha(q)}}) = h_{\mu_q}(f)$, i.e., $\mu_q$ is the measure with maximal metric entropy among all invariant measures $\nu$ such that $\nu(K_{\alpha(q)}) = 1$. In order to complete the analysis we have to show that $K_\alpha = \emptyset$ for every $\alpha \notin [\inf_q \alpha(q), \sup_q \alpha(q)]$.

8. Main Result

In this section we state our main result. It is exactly in the form of the corresponding results from [2,10] for the multifractal analysis of local (pointwise) dimensions. We follow the same notation and order of statements. The last statement of our theorem is analogous to Remark 5 in [10]. It relates the multifractal spectra of the local entropies to the spectra $h_\mu(f, q)$ of the correlation entropies (analogue of the Hentschel–Procaccia spectra for dimensions $HP(q)$) and $R_\mu(f, q)$ (analogue of the Rényi spectra of dimensions $R(q)$). Although it would be natural to call $R_\mu(f, q)$ the Rényi spectra of entropies, it might cause some confusion, since there exists a different notion called the Rényi entropy of order $q$ [4,20].

Theorem 8.1. Let $f$ be an expansive homeomorphism with the specification property of a compact metric space $(X, d)$. Let $\varphi \in V_f(X)$ and $\mu = \mu_\varphi$ be the corresponding equilibrium state. Then
1. For $\mu$-a.e. $x \in X$ the local entropy at $x$ exists and
$$h_\mu(f, x) = h_\mu(f) = P(\varphi) - \int \varphi\,d\mu.$$
2. For any $q \in \mathbb{R}$ define the function $T(q) = P(q\varphi) - qP(\varphi)$. Then $T(q)$ is a convex $C^1$ function of $q$. Moreover, $T(0) = h_{\mathrm{top}}(f)$, $T(1) = 0$; for every $q \in \mathbb{R}$ one has
$$T'(q) = \int \varphi\,d\mu_q - P(\varphi) \le 0,$$
where $\mu_q$ is the equilibrium state for $\varphi_q = q\varphi - P(q\varphi)$.
3. Put $\alpha(q) = -T'(q)$. Then
$$E_E(\alpha(q)) := h_{\mathrm{top}}(f|_{K_{\alpha(q)}}) = T(q) + q\alpha(q).$$
Define
$$\underline{\alpha} = \inf_q \alpha(q) = \lim_{q\to+\infty} \alpha(q), \qquad \overline{\alpha} = \sup_q \alpha(q) = \lim_{q\to-\infty} \alpha(q).$$
Then $K_\alpha = \emptyset$ if $\alpha \notin [\underline{\alpha}, \overline{\alpha}]$. It means that the domain of the multifractal spectrum for local entropies $\alpha \mapsto E_E(\alpha)$ is the range of the function $q \mapsto -T'(q)$.
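4. If the equilibrium state $\mu$ for the potential $\varphi$ is not a measure of maximal entropy, then the relation between $E_E$ and $T(q)$ can be written in the following variational form:
$$E_E(\alpha) = \inf_{q\in\mathbb{R}} \big(T(q) + q\alpha\big) \quad \text{for } \alpha \in (\underline{\alpha}, \overline{\alpha}),$$
$$T(q) = \sup_{\alpha\in(\underline{\alpha}, \overline{\alpha})} \big(E_E(\alpha) - q\alpha\big) \quad \text{for } q \in \mathbb{R}.$$
This implies that $E_E$ is strictly concave and continuously differentiable on $(\underline{\alpha}, \overline{\alpha})$ with the derivative given by $E_E'(\alpha) = q$, where $q \in \mathbb{R}$ is such that $\alpha = -T'(q)$.
5. For every $q \in \mathbb{R}$, $q \ne 1$, the following limits exist:
$$h_\mu(f, q) = \lim_{\varepsilon\to 0}\lim_{n\to\infty} -\frac{1}{n(q-1)} \log \int \mu(B_n(x, \varepsilon))^{q-1}\,d\mu,$$
$$R_\mu(f, q) = \lim_{\varepsilon\to 0}\lim_{n\to\infty} -\frac{1}{n(q-1)} \log \sup_E \sum_{x\in E} \mu(B_n(x, \varepsilon))^{q},$$
where the supremum is taken over all $(n, \varepsilon)$-separated sets $E$. For $q \ne 1$ one has
$$h_\mu(f, q) = R_\mu(f, q) = -\frac{T(q)}{q-1}.$$
The family of correlation entropies $h_\mu(f, q)$ depends continuously on $q$ and
$$h_\mu(f, 0) = h_{\mathrm{top}}(f), \qquad h_\mu(f, 1) := \lim_{q\to 1} h_\mu(f, q) = h_\mu(f).$$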
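The quantities in Theorem 8.1 can be written in closed form in a simple special case. The following Python sketch assumes the full shift on two symbols with the Bernoulli measure of weights p = (0.3, 0.7), i.e. the potential ϕ(x) = log p_{x_0} with P(ϕ) = 0, so that T(q) = log Σ p_i^q. The example system, the weights and all function names are illustrative assumptions and do not appear in the text.

```python
import math

def T(q, p):
    """T(q) = P(q*phi) - q*P(phi) for phi = log p_{x_0}, where P(phi) = 0."""
    return math.log(sum(pi ** q for pi in p))

def alpha(q, p):
    """alpha(q) = -T'(q), differentiated by hand."""
    weights = [pi ** q for pi in p]
    return -sum(w * math.log(pi) for w, pi in zip(weights, p)) / sum(weights)

def spectrum(q, p):
    """E_E(alpha(q)) = T(q) + q*alpha(q), as in statement 3."""
    return T(q, p) + q * alpha(q, p)

def tilted(q, p):
    """Weights of mu_q, the equilibrium state of q*phi - P(q*phi)."""
    weights = [pi ** q for pi in p]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(p):
    """Measure-theoretic entropy of the Bernoulli measure with weights p."""
    return -sum(pi * math.log(pi) for pi in p)

def correlation_entropy(q, p):
    """h_mu(f, q) = -T(q)/(q - 1) for q != 1, as in statement 5."""
    return -T(q, p) / (q - 1.0)

p = [0.3, 0.7]  # hypothetical weights
for q in (-2.0, 0.0, 0.5, 1.0, 3.0):
    # statement 3 together with (8.2): the spectrum equals the entropy of mu_q
    print(q, alpha(q, p), spectrum(q, p), entropy(tilted(q, p)))
```

In this example T(0) = log 2 is the topological entropy of the shift, T(1) = 0, and α(1) equals the entropy of µ, in line with statements 1 and 2.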
Proof. (1) The first statement is a consequence of the Brin–Katok formula for ergodic dynamical systems (Theorem 6.1).

(2) The smoothness and convexity properties of $T$ follow directly from Lemma 4.1. We calculate the derivative of $T$ with respect to $q$. Using the formula from Lemma 4.1 one gets
$$T'(q) = \int \varphi\,d\mu_q - P(\varphi), \tag{8.1}$$
where $\mu_q$ is the equilibrium state for the potential $\varphi_q = q\varphi - P(q\varphi)$. The inequality $T'(q) \le 0$ follows from the Variational Principle applied to $\varphi$.

(3) This statement is taken from [2] where it has not been proved. For the sake of completeness we give the proof here. Let us first calculate the measure-theoretic entropy of the equilibrium state $\mu_q$. From the Variational Principle for $\mu_q$ we have
$$h_{\mu_q}(f) = P(\varphi_q) - \int \varphi_q\,d\mu_q = 0 + T(q) + qP(\varphi) - q\int \varphi\,d\mu_q = T(q) + q\Big(P(\varphi) - \int \varphi\,d\mu_q\Big) = T(q) + q\alpha(q), \tag{8.2}$$
where $\alpha(q) = -T'(q)$ and we use formula (8.1) for the derivative of $T(q)$. As we have seen in Lemma 6.2, for any $\alpha$ one has
$$h_\mu(f, x) = \alpha \quad\text{if and only if}\quad \lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1}\varphi(f^i(x)) = P(\varphi) - \alpha.$$
Let us apply now Lemma 6.2 to the equilibrium state $\mu_q$ corresponding to the potential $q\varphi$. Similarly one gets that for every $\beta$,
$$h_{\mu_q}(f, x) = \beta \quad\text{if and only if}\quad q\lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1}\varphi(f^i(x)) = P(q\varphi) - \beta.$$
Hence one concludes that
$$h_\mu(f, x) = \alpha \quad\text{if and only if}\quad h_{\mu_q}(f, x) = P(q\varphi) - qP(\varphi) + q\alpha.$$
For $\alpha = \alpha(q)$ we get
$$x \in K_{\alpha(q)} \quad\text{if and only if}\quad h_{\mu_q}(f, x) = T(q) + q\alpha(q). \tag{8.3}$$
Combining the results of (8.2) and (8.3) one gets
$$h_{\mu_q}(f) = T(q) + q\alpha(q), \qquad h_{\mu_q}(f, x) = T(q) + q\alpha(q) \ \text{ if and only if } x \in K_{\alpha(q)}.$$
This means that $h_{\mu_q}(f, x) = h_{\mu_q}(f)$ if and only if $x \in K_{\alpha(q)}$. Since $\mu_q$ is ergodic, we know from the Brin–Katok formula that $h_{\mu_q}(f, x) = h_{\mu_q}(f)$ for $\mu_q$-a.e. $x \in X$. Hence we conclude that
$$\mu_q(K_{\alpha(q)}) = \mu_q(\{x : h_{\mu_q}(f, x) = h_{\mu_q}(f)\}) = 1.$$
Therefore we have obtained the desired parametrization of the level sets. We have to compute the topological entropy of $f$ restricted to $K_{\alpha(q)}$,
$$E_E(\alpha(q)) := h_{\mathrm{top}}(f|_{K_{\alpha(q)}}).$$
Using the properties of the topological entropy from Theorem 5.2 we conclude that
$$E_E(\alpha(q)) = h_{\mathrm{top}}(f|_{K_{\alpha(q)}}) \ge h_{\mu_q}(f) = T(q) + q\alpha(q),$$
since $\mu_q(K_{\alpha(q)}) = 1$. We have to prove the opposite inequality. For this it would be sufficient to show that $h_{\mathrm{top}}(f|_{K_{\alpha(q)}}) \le \lambda$ for any $\lambda > T(q) + q\alpha(q)$. Choose such $\lambda$ and let $\delta = \lambda - T(q) - q\alpha(q) > 0$. Rewriting the definition of $K_{\alpha(q)}$ in terms of $\mu_q$ and $\varphi_q$ one has
$$K_{\alpha(q)} = \big\{x \in X : h_{\mu_q}(f, x) = h_{\mu_q}(f) = T(q) + q\alpha(q)\big\} = \Big\{x \in X : \lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1}\varphi_q(f^i x) = -T(q) - q\alpha(q)\Big\}.$$
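For every $x \in K_{\alpha(q)}$ there exists an integer $n(x)$ such that
$$\Big|\frac{1}{n}\sum_{i=0}^{n-1}\varphi_q(f^i x) + T(q) + q\alpha(q)\Big| \le \frac{\delta}{2} \tag{8.4}$$
for all $n \ge n(x)$. For every integer $N$ consider the set $K_{\alpha(q),N} = \{x \in K_{\alpha(q)} : n(x) \le N\}$. Obviously we have
$$K_{\alpha(q)} = \bigcup_{N\ge 1} K_{\alpha(q),N}, \qquad K_{\alpha(q),N} \subset K_{\alpha(q),N+1}.$$
Using the properties of the topological entropy from Theorem 5.2 we conclude that
$$h_{\mathrm{top}}(f|_{K_{\alpha(q)}}) = \lim_{N\to\infty} h_{\mathrm{top}}(f|_{K_{\alpha(q),N}}).$$
We are going to show that for any $N \in \mathbb{N}$ one has $h_{\mathrm{top}}(f|_{K_{\alpha(q),N}}) \le \lambda$; this in turn will imply $h_{\mathrm{top}}(f|_{K_{\alpha(q)}}) \le \lambda$.

Consider an arbitrary finite cover $\mathcal{U} = \{B(x_i, \varepsilon/2)\}_{i=1}^{M}$ of $X$ by open balls of radius $\varepsilon/2$, with $\varepsilon < \gamma/2$, where $\gamma$ is the expansivity constant for $f$. Together with $\mathcal{U}$ we consider $\tilde{\mathcal{U}}$, an open cover by balls with centers at $x_i$ and radii $\varepsilon$. Let $E = \{y_j\}$ be a maximal $(n, \varepsilon/2)$-separated set in $X$. Define a subset $E'$ of $E$ by choosing those $y_j$ which have a point from $K_{\alpha(q),N}$ close to them, namely
$$E' = \{y_j \in E : K_{\alpha(q),N} \cap B_n(y_j, \varepsilon/2) \ne \emptyset\}.$$
This implies that
$$K_{\alpha(q),N} \subset \bigcup_{y_j\in E'} B_n(y_j, \varepsilon/2).$$
For every $y_j \in E'$ there exists at least one string $U_{i_0,\dots,i_{n-1}}$ from $W_n(\mathcal{U})$ such that $y_j \in X(U_{i_0,\dots,i_{n-1}})$. It is easy to see that if
$$y_j \in X(U_{i_0,\dots,i_{n-1}}) = U_{i_0} \cap f^{-1}U_{i_1} \cap \dots \cap f^{-n+1}U_{i_{n-1}},$$
then
$$B_n(y_j, \varepsilon/2) \subset X(\tilde U_{i_0,\dots,i_{n-1}}) = \tilde U_{i_0} \cap f^{-1}\tilde U_{i_1} \cap \dots \cap f^{-n+1}\tilde U_{i_{n-1}}.$$
In other words the collection of strings $\tilde\Gamma = \{\tilde U_{i_0,\dots,i_{n-1}}\}$ covers $K_{\alpha(q),N}$. Therefore
$$\begin{aligned}
m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}, n) &= \inf_{\substack{\Gamma \subset \cup_{k\ge n} W_k(\tilde{\mathcal{U}}) \\ \Gamma \text{ covers } K_{\alpha(q),N}}} \ \sum_{\mathbf{U}\in\Gamma} \exp(-m(\mathbf{U})\lambda) \\
&\le \sum_{\tilde{\mathbf{U}}\in\tilde\Gamma} \exp(-m(\tilde{\mathbf{U}})\lambda) \\
&= e^{-n\delta} \sum_{\tilde{\mathbf{U}}\in\tilde\Gamma} \exp\big(-n(T(q) + q\alpha(q))\big) \\
&= e^{-n\delta} \sum_{y_j\in E'} \exp\big(-n(T(q) + q\alpha(q))\big).
\end{aligned} \tag{8.5}$$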
Since the potential $\varphi \in V_f(X)$, so is $\varphi_q$, and
$$\big|(S_n\varphi_q)(x) - (S_n\varphi_q)(y)\big| = \Big|\sum_{k=0}^{n-1}\varphi_q(f^k(x)) - \sum_{k=0}^{n-1}\varphi_q(f^k(y))\Big| \le |q|K$$
for all $x, y \in X$ with $d_n(x, y) < \varepsilon/2$. For any $y_j \in E'$ let $x_j$ be an arbitrary point from $K_{\alpha(q),N} \cap B_n(y_j, \varepsilon/2)$. Since $x_j \in K_{\alpha(q),N}$ and $n \ge N$, from (8.4) we have
$$-n(T(q) + q\alpha(q)) \le -n(T(q) + q\alpha(q)) - (S_n\varphi_q)(x_j) + (S_n\varphi_q)(y_j) + |q|K \le \frac{n\delta}{2} + (S_n\varphi_q)(y_j) + |q|K.$$
Thus we can continue the estimate (8.5) as follows:
$$m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}, n) \le e^{-n\delta/2 + |q|K} \sum_{y_j\in E'} \exp\big((S_n\varphi_q)(y_j)\big) \le C' e^{-n\delta/2} Z_n(\varphi_q, E).$$
Using the estimates from Lemma 4.3 and the fact that $P(\varphi_q) = 0$ we conclude that
$$m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}, n) \le C'' e^{-n\delta/2}.$$
Hence
$$m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}) = \lim_{n\to\infty} m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}, n) = 0,$$
and since $\mathcal{U}$ was an open cover by balls of radius $\varepsilon/2$ we get
$$m(K_{\alpha(q),N}, \lambda) = \lim_{\mathrm{diam}(\mathcal{U})\to 0} m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}) = 0.$$
Then by definition of the topological entropy we have $h_{\mathrm{top}}(f|_{K_{\alpha(q),N}}) \le \lambda$ for all $N$. Hence $h_{\mathrm{top}}(f|_{K_{\alpha(q)}}) \le \lambda$ for all $\lambda > T(q) + q\alpha(q)$. This completes the proof that $h_{\mathrm{top}}(f|_{K_{\alpha(q)}}) \le T(q) + q\alpha(q)$. The rest of the statement is taken from [18]. It states that we have a complete description of the spectra for local entropies.

(4) If the equilibrium state for the potential $\varphi$ is not a measure of maximal entropy then it was shown in Lemma 4.1 that $T(q)$ is strictly convex, i.e., the following holds for every $q, q_0 \in \mathbb{R}$, $q \ne q_0$:
$$T(q) > T(q_0) + T'(q_0)(q - q_0).$$
Therefore, if $\alpha \in (\underline{\alpha}, \overline{\alpha})$ then there exists $q_0 \in \mathbb{R}$ such that $\alpha = -T'(q_0)$. We have seen that in this case $E_E(\alpha) = T(q_0) + \alpha q_0$. Using the strict convexity of $T(q)$ we obtain that for $q \in \mathbb{R}$, $q \ne q_0$, the following holds:
$$E_E(\alpha) = T(q_0) + \alpha q_0 < T(q) + \alpha q.$$
Hence,
$$E_E(\alpha) = \inf_{q\in\mathbb{R}} \big(T(q) + \alpha q\big) \quad \text{for } \alpha \in (\underline{\alpha}, \overline{\alpha}).$$
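The variational formula just obtained can be checked numerically in a toy case. The sketch below assumes the Bernoulli example T(q) = log(0.3^q + 0.7^q), which is an illustrative choice and not part of the text, and replaces the infimum over all real q by a minimum over a finite grid; the grid bounds, the step and all names are hypothetical.

```python
import math

WEIGHTS = (0.3, 0.7)  # hypothetical Bernoulli weights

def T(q):
    """T(q) = log(sum_i p_i^q) for the Bernoulli example (P(phi) = 0)."""
    return math.log(sum(w ** q for w in WEIGHTS))

def alpha_of(q):
    """alpha(q) = -T'(q), differentiated by hand."""
    powers = [w ** q for w in WEIGHTS]
    return -sum(pw * math.log(w) for pw, w in zip(powers, WEIGHTS)) / sum(powers)

def spectrum_by_infimum(a, q_min=-20.0, q_max=20.0, steps=40000):
    """Approximate inf_q (T(q) + q*a) by a minimum over an evenly spaced grid."""
    step = (q_max - q_min) / steps
    return min(T(q_min + i * step) + (q_min + i * step) * a
               for i in range(steps + 1))

q0 = 0.5
a = alpha_of(q0)
# Strict convexity of T puts the infimum at q0, so both numbers should agree.
print(spectrum_by_infimum(a), T(q0) + q0 * a)
```

A grid search is used only because it needs no external library; it is accurate here because T is smooth and the minimizer lies well inside the grid.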
In a similar manner one obtains the second relation $T(q) = \sup_{\alpha\in(\underline{\alpha}, \overline{\alpha})} (E_E(\alpha) - q\alpha)$. Using the notion of the Legendre transform [15] we can say that the functions $T(q)$ and $F(\alpha) := -E_E(-\alpha)$ form a Legendre pair, i.e., each is the Legendre transform of the other. Therefore the concavity and differentiability of $E_E$ follow from the properties of the Legendre transform. In particular, for $\alpha \in (\underline{\alpha}, \overline{\alpha})$ one has $E_E'(\alpha) = q$, where $q \in \mathbb{R}$ is such that $\alpha = -T'(q)$.

In the case when $\mu$ is the measure of maximal entropy one has $h_\mu(f, x) = h_\mu(f) = h_{\mathrm{top}}(f)$ for all $x \in X$. It means that $E_E$ is a delta-like function
$$E_E(\alpha) = \begin{cases} h_{\mathrm{top}}(f), & \text{if } \alpha = h_{\mathrm{top}}(f), \\ 0, & \text{otherwise}. \end{cases}$$
This “degenerate” behaviour of the multifractal spectrum for the measure of maximal entropy can be successfully exploited. For this see [2], where it has been used for the calculation of the multifractal spectra for Lyapunov exponents.

(5) This is an essentially new result. We prove it by means of standard thermodynamical techniques. Let $q > 1$ and $E$ be an arbitrary $(n, \varepsilon)$-separated set. One has
$$\int \mu(B_n(x, \varepsilon))^{q-1}\,d\mu \ge \sum_{x_i\in E}\ \int_{B_n(x_i, \varepsilon/2)} \mu(B_n(x, \varepsilon))^{q-1}\,d\mu \ge \sum_{x_i\in E} \mu(B_n(x_i, \varepsilon/2))^{q},$$
since $x \in B_n(x_i, \varepsilon/2)$ implies $B_n(x_i, \varepsilon/2) \subset B_n(x, \varepsilon)$. Applying inequality (3.4), and using the fact that $E$ is an $(n, \varepsilon)$-separated set, we get
$$\int \mu(B_n(x, \varepsilon))^{q-1}\,d\mu \ge \sup_E \sum_{x_i\in E} A_{\varepsilon/2}^{q} \exp\Big(-qPn + \sum_{j=0}^{n-1} q\varphi(f^j x_i)\Big),$$
where the supremum is taken over all $(n, \varepsilon)$-separated sets. Taking logarithms and applying estimates from Lemma 4.3 we conclude that in the limit
$$h_\mu(f, q) \le R_\mu(f, q) \le \frac{P(q\varphi) - qP(\varphi)}{1-q}.$$
To finish the proof we have to show the opposite inequality. We do it in a similar manner. Let now $E$ be a maximal $(n, \varepsilon/2)$-separated set, then
$$\int \mu(B_n(x, \varepsilon/2))^{q-1}\,d\mu \le \sum_{x_i\in E}\ \int_{B_n(x_i, \varepsilon/2)} \mu(B_n(x, \varepsilon/2))^{q-1}\,d\mu \le \sum_{x_i\in E} \mu(B_n(x_i, \varepsilon))^{q},$$
since $x \in B_n(x_i, \varepsilon/2)$ implies that $B_n(x, \varepsilon/2) \subset B_n(x_i, \varepsilon)$.
Again, since $E$ is an arbitrary maximal $(n, \varepsilon/2)$-separated set, applying the inequality (3.4) we obtain
$$\int \mu(B_n(x, \varepsilon/2))^{q-1}\,d\mu \le \sup_E \sum_{x_i\in E} B_{\varepsilon}^{q} \exp\Big(-qPn + \sum_{j=0}^{n-1} q\varphi(f^j x_i)\Big).$$
Taking logarithms and using estimates from Lemma 4.3 in the limit $n \to \infty$ we get
$$h_\mu(f, q) \ge R_\mu(f, q) \ge \frac{P(q\varphi) - qP(\varphi)}{1-q}.$$
Combining all together we get the statement in the case $q > 1$. The case $q < 1$ is completely analogous. The continuity and other properties of $h_\mu(f, q)$ follow from the corresponding properties of $T(q)$. □

9. Final Remarks

A. Consider the irregular set
$$B = \{x \in X : h_\mu(f, x) \text{ does not exist}\} = \Big\{x \in X : \lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\varphi(f^k(x)) \text{ does not exist}\Big\}.$$
We have seen that for the measure of maximal entropy $m_E$ this is an empty set. It was shown in [3] that in a number of cases, the set $B$ is either empty or has full topological entropy and Hausdorff dimension.

B. There exists another way of defining local (pointwise) entropies. Namely, consider an arbitrary finite measurable partition $\xi$ of $X$. We can define a local entropy at $x$ with respect to $\xi$ as follows (if the limit exists):
$$h_\mu(f, x, \xi) = \lim_{n\to\infty} -\frac{1}{n} \log \mu(\xi^{(n)}(x)),$$
where $\xi^{(n)} = \xi \vee f^{-1}\xi \vee \dots \vee f^{-n+1}\xi$ and $\xi^{(n)}(x)$ is the element of $\xi^{(n)}$ containing $x$. We can define a spectrum of local entropies with respect to $\xi$ as follows:
$$E_E(\alpha) = h_{\mathrm{top}}\big(f|_{\{x :\, h_\mu(f, x, \xi) = \alpha\}}\big).$$
The situation when $\xi$ is a finite Markov partition for an expanding dynamical system has been studied in [2,1]. One can easily check that in this case the two spectra coincide.

C. The results of this paper can be extended to the case of expansive endomorphisms (i.e., non-invertible maps) with the specification property. They are defined in exactly the same way as the expansive homeomorphisms with specification except that the set $\mathbb{Z}$ in (2.1) is substituted by $\mathbb{N}$ (positive expansiveness). The characteristic property of the equilibrium states (Theorem 3.7) remains valid [17]. Therefore our analysis works without any modifications. In the case of expansive homeomorphisms we can give another definition of local entropies. Namely, for any $n \ge 1$ define
$$B_n^{\pm}(x, \varepsilon) = \{y \in X : d(f^i(x), f^i(y)) < \varepsilon \text{ for all } i = -n+1, \dots, n-1\},$$
and
$$\underline{h}^{\pm}_\mu(f, x) = \lim_{\varepsilon\to 0}\liminf_{n\to\infty} -\frac{1}{2n-1} \log \mu(B_n^{\pm}(x, \varepsilon)),$$
$$\overline{h}^{\pm}_\mu(f, x) = \lim_{\varepsilon\to 0}\limsup_{n\to\infty} -\frac{1}{2n-1} \log \mu(B_n^{\pm}(x, \varepsilon)).$$
Then the level sets of these local entropies will be in one-to-one correspondence with the level sets of two-sided ergodic averages of $\varphi$,
$$\lim_{n\to\infty} \frac{1}{2n-1} \sum_{k=-n+1}^{n-1} \varphi(f^k(x)).$$
The level sets of two-sided and one-sided ergodic averages of ϕ can be different. However, they have the same topological entropy with respect to f . Therefore the multifractal spectrum based on h± µ (f, x) will be the same. D. A requirement of the existence of a Markov partition is stronger than a specification property, provided the dynamical system is mixing. Consider the family of one-dimensional interval maps Tβ , defined by Tβ (x) = βx (mod 1). For β > 1 these maps are expanding and therefore expansive. The ergodic properties of Tβ depend on the number-theoretic properties of β. For these systems it turns out [19] that: i) the set of β’s for which Tβ has a finite Markov partition is at most countable; ii) the set of β’s for which Tβ has the specification property is uncountable and has Hausdorff dimension 1, but still has Lebesgue measure 0. Therefore, we can see that in the family {Tβ }β>1 , specification is a much more general property than the property of having a finite Markov partition. Acknowledgements. The work of the second author was supported by the Netherlands Organization for Scientific Research (NWO), grant 613-06-551. We would like to thank Luis Barreira, Yakov Pesin and Jörg Schmeling for their valuable advice and comments.
References 1. Barreira, L., Pesin, Ya. and Schmeling, J.: Multifractal spectra and multifractal rigidity for horseshoes. J. Dynam. Control Systems 3(1), 33–49 (1997) 2. Barreira, L., Pesin, Ya. and Schmeling, J.: On a general concept of multifractality: Multifractal spectra for dimensions, entropies, and Lyapunov exponents. Multifractal rigidity. Chaos 7 (1), 27–38 (1997) 3. Barreira, L and Schmeling, J.: Sets of “non-typical” points have full topological entropy and full hausdorff dimension. Preprint Instituto Superior Techno, 1997 4. Beck, Ch. and Schlögl, F.: Thermodynamics of chaotic systems. Vol. 4 of Cambridge Nonlinear Science Series, Cambridge, Cambridge University Press, 1993, An introduction 5. Bowen, R.: Topological entropy for noncompact sets. Trans. Am. Math. Soc. 184, 125–136 (1973) 6. Bowen, R.: Some systems with unique equilibrium states. Math. Systems Theory 8 (3), 193–202 (1974/75) 7. Brin, M. and Katok, A.: On local entropy. In: Geometric dynamics (Rio de Janeiro, 1981), Vol. 1007 of Lecture Notes in Math., Berlin: Springer, 1983, pp. 30–38 8. Haydn, N.T.A. and Ruelle, D.: Equivalence of Gibbs and equilibrium states for homeomorphisms satisfying expansiveness and specification. Commun. Math. Phys. 148 (1), 155–167 (1992) 9. Katok, A. and Hasselblatt, B.: Introduction to the modern theory of dynamical systems. Vol. 54 of Encyclopedia of Mathematics and its Applications, Cambridge: Cambridge University Press, 1995 10. Pesin, Ya. and Weiss, H.: A multifractal analysis of equilibrium measures for conformal expanding maps and Moran-like geometric constructions. J. Stat. Phys. 86 (1–2), 233–275 (1997)
11. Pesin, Ya. and Weiss, H.: The multifractal analysis of Gibbs measures: Motivation, mathematical foundation, and examples. Chaos 7 (1), 89–106 (1997) 12. Pesin, Ya.B.: Dimension-like characteristics for invariant sets of dynamical systems. Uspekhi Mat. Nauk 43 (4(262)), 95–128, 255 (1988) 13. Pesin, Ya.B. and Pitskel, B.S.: Topological pressure and the variational principle for noncompact sets. Funktsional. Anal. i Prilozhen. 18 (4), 50–63, 96 (1984) 14. Pesin, Ya.B.: Dimension theory in dynamical systems. Chicago Lectures in Mathematics. Chicago, IL: University of Chicago Press, 1997, Contemporary views and applications 15. Roberts, A.W. and Varberg, D.E.: Convex functions. New York–London: Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], 1973, Pure and Applied Mathematics, Vol. 57 16. Ruelle, D.: Thermodynamic formalism, Vol. 5 of Encyclopedia of Mathematics and its Applications, Reading, MA: Addison-Wesley, 1978 17. Ruelle, D.: Thermodynamic formalism for maps satisfying positive expansiveness and specification. Nonlinearity 5 (6), 1223–1236 (1992) 18. Schmeling, J.: On the completeness of multifractal spectra. Preprint WIAS, Berlin, 1996 19. Schmeling, J.: Symbolic dynamics for β-shifts and self-normal numbers. Ergodic Theory Dynam. Systems 17 (3), 675–694 (1997) 20. Takens, F. and Verbitski, E.: Generalized entropies: Rényi and correlation integral approach. Nonlinearity 11, 771–782 (1998) 21. Walters, P.: An introduction to ergodic theory. Vol. 79 of Graduate Texts in Mathematics. NewYork-Berlin: Springer-Verlag, 1982 22. Walters, P. Differentiability properties of the pressure of a continuous transformation on a compact metric space. J. London Math. Soc. (2) 46 (3), 471–481 (1992) Communicated by Ya. G. Sinai
Commun. Math. Phys. 203, 613 – 633 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
On the Algebro-Geometric Integration of the Schlesinger Equations
P. Deift1, A. Its2, A. Kapaev3, X. Zhou4
1 Courant Institute of Mathematical Sciences, New York, NY 10003, USA
2 Department of Mathematical Sciences, Indiana University – Purdue University Indianapolis, Indianapolis, IN 46202-3216, USA. E-mail: [email protected]
3 St. Petersburg Branch of Steklov Mathematical Institute, Russian Academy of Sciences, St. Petersburg, 191011, Russia
4 Department of Mathematics, Duke University, Durham, NC 27708-0320, USA.
Received: 3 March 1998 / Accepted: 16 December 1998
Abstract: A new approach to the construction of isomonodromy deformations of 2 × 2 Fuchsian systems is presented. The method is based on a combination of the algebro-geometric scheme and Riemann–Hilbert approach of the theory of integrable systems. For a given number 2g + 1, g ≥ 1, of finite (regular) singularities, the method produces a 2g-parameter submanifold of the Fuchsian monodromy data for which the relevant Riemann–Hilbert problem can be solved in closed form via the Baker–Akhiezer function technique. This in turn leads to a 2g-parameter family of solutions of the corresponding Schlesinger equations, explicitly described in terms of Riemann theta functions of genus g. In the case g = 1 the solution found coincides with the general elliptic solution of the particular case of the Painlevé VI equation first obtained by N. J. Hitchin [H1].

Introduction

Let
$$\Gamma = \Big\{P = (\lambda, w) : w^2 = \prod_{j=1}^{2g+1} (\lambda - a_j)\Big\}, \qquad a_j \ne a_k \ \text{for } j \ne k,$$
be a hyperelliptic curve of genus g ≥ 1. We shall use the standard representation of 0 as a two-sheeted covering of CP1 with the cuts along the intervals [a2k+1 , a2k+2 ], k = 0, . . . , g, a2g+2 = ∞, assuming that 3) state is shown to be completely magnetized for long chains. The number of states of energy E = log(n) summed over chain length is expressed in terms of a restricted divisor problem. We conjecture that its asymptotic form is (n log n), consistent with the phase transition at β = 2, and suggesting a possible connection with the Riemann ζ -function. The spin interaction coefficients include all even many-body terms and are translation invariant. Computer results indicate that all the interaction coefficients, except the constant term, are ferromagnetic. 1. Introduction There has been considerable recent interest in the area of statistical mechanical models inspired by or closely connected with number theory. An overall goal of this work is to illuminate connections between the two disciplines, so that each might provide new insights and techniques useful for the other. There is, more specifically, the hope that a direct link between the Riemann hypothesis and the Lee–Yang theory of phase transitions, based on zeros of the partition function, will eventually emerge. In this paper, we introduce and examine a new equilibrium statistical mechanics spin chain model based on the number-theoretic Farey fractions. Our model is closely related to the number theoretic spin chain studied by Knauf (see references given below) and others (see [Cv]) and in fact has the same free energy. However it differs from that previous work in several respects. Perhaps most important, our model is translation invariant by construction. In addition, our matrix formulation clarifies some of the results
P. Kleban, A. E. Özlük
on the number theoretic spin chain. However, we have yet to conclusively demonstrate a connection with the Riemann ζ-function, as is the case (by construction) for the previous model at low temperatures.

In Sect. 2 we define the new model, and point out its connection to the number theoretic spin chain studied by Knauf and others, which we refer to as KSC below. We also exhibit exact results for the partition function at certain temperatures. Section 3 contains a first proof that the free energy (of the infinite system) exists and has a unique phase transition at β = 2. In fact, it is equal to the KSC free energy, establishing a direct connection between the two models. In Sect. 4 we examine the number of states, and show it is related to a certain number-theoretic restricted divisor problem. Using a conjectured asymptotic form for the summed (over all chain lengths) number of states we then suggest a connection with the Riemann ζ-function. Sect. 5 uses the results of the previous section to prove the existence of a phase transition in a different way, and also shows that the infinite system is in a completely magnetized state at low temperatures. In Sect. 6 we examine the spin interaction coefficients, which include all even many-body terms and are translation invariant. Our numerical results indicate that all the interaction coefficients, except the constant term, are ferromagnetic. We also discuss the question of whether the Farey and KSC interactions are the same in the limit of long chains.

2. Definition of the Farey Spin Chain

We begin with a preliminary definition and then proceed to construct the Farey fractions. Definition. The mediant of the rational numbers Note that, for
a b
1.
In what follows, we omit the $j = 2^{m-1}$ case. Thus, the fraction $\frac{1}{1}$ is not included and the denominator 1 only appears once in $F_n$. Furthermore, $M^m(j) = A\,M^{m-1}(l)$, with $0 \le l \le 2^{m-2}$, $l = (a_{m-2} \dots a_1 a_0)_2$. Therefore, for $0 \le j \le 2^{m-1} - 1$, with $T^m(j)$ defined as $T^m(j) = \mathrm{Tr}(M^m(j))$, we have
$$T^m(j) = b + c = \mathrm{Den}(x_j^m) + \mathrm{Num}(x_{j+1}^m). \tag{4}$$
We are now in a position to define the partition function of the spin chain. First we extend the definition of $M^m(j)$ to include all possible products of $m$ factors $A$ or $B$, so there are now $2^m$ matrix products. Now $A = I + \sigma_-$, and similarly $B = I + \sigma_+$, where $I$ is the (2 × 2) unit matrix and $\sigma_+$, $\sigma_-$ are Pauli matrices. These satisfy $\sigma_+^2 = 0 = \sigma_-^2$, $\sigma_+\sigma_- + \sigma_-\sigma_+ = I$. It follows that $M^m(j)$ is a linear combination of $I$, $\sigma_+$, $\sigma_-$, $\sigma_+\sigma_-$ and $\sigma_-\sigma_+$. If one exchanges $A$ and $B$ in the product, $\sigma_+$ and $\sigma_-$ are exchanged, but the trace remains invariant. Thus each new $T^m(j)$ is exactly the same as one from the original set. We then interpret each $T^m(j)$ as specifying the energy of a given spin state $j$ of a periodic chain of length $m$ via
$$E_m(j) = \log T^m(j). \tag{5}$$
The $k$th spin may be regarded as down or up (or, equivalently, the $k$th site as empty or full) according to whether $a_k = 0$ or 1, respectively.

Definition. The partition function is given by
$$Z_m(\beta) = \sum_j T^m(j)^{-\beta}, \tag{6}$$
where the sum extends over all $2^m$ matrix products, i.e. the first factor in $M^m(j)$ may be either $A$ or $B$. It follows that
$$Z_1 = 2^{1-\beta}, \quad Z_2 = 2^{1-\beta} + 2\cdot 3^{-\beta}, \quad Z_3 = 2^{1-\beta} + 6\cdot 2^{-2\beta}, \quad Z_4 = 2^{1-\beta} + 4\cdot 6^{-\beta} + 8\cdot 5^{-\beta} + 2\cdot 7^{-\beta}, \ \dots. \tag{7}$$
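The values in (7) can be reproduced by brute-force enumeration. The Python sketch below assumes the explicit matrices A = ((1, 0), (1, 1)) and B = ((1, 1), (0, 1)), which is one reading of A = I + σ− and B = I + σ+; the helper names are hypothetical. It counts how often each trace occurs among the 2^m products.

```python
from collections import Counter
from itertools import product

A = ((1, 0), (1, 1))   # assumed explicit form of A = I + sigma_minus
B = ((1, 1), (0, 1))   # assumed explicit form of B = I + sigma_plus
IDENTITY = ((1, 0), (0, 1))

def matmul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def trace_counts(m):
    """Multiplicity of each trace over all 2^m products of m factors A or B."""
    counts = Counter()
    for word in product((A, B), repeat=m):
        acc = IDENTITY
        for factor in word:
            acc = matmul(acc, factor)
        counts[acc[0][0] + acc[1][1]] += 1
    return counts

def partition_function(m, beta):
    """Z_m(beta) = sum over all products of (trace)^(-beta)."""
    return sum(mult * t ** (-beta) for t, mult in trace_counts(m).items())

for m in range(1, 5):
    print(m, sorted(trace_counts(m).items()))
```

Each printed pair (trace, multiplicity) corresponds to one term multiplicity · trace^(−β) in (7); the term 6 · 2^(−2β) in Z_3 appears as trace 4 with multiplicity 6.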
Remark. The A–B properties of $T^m(j)$ discussed above imply that the Farey chain exhibits up-down (equivalently, particle-hole) symmetry. It also follows immediately from the properties of the trace that the energy is invariant under cyclic translation of the spin matrices. Therefore the spin chain has translation invariant spin interactions (see Sect. 6).

Remark. The partition function may be evaluated exactly for certain values of β. When β = 0, $Z_m = 2^m$, the number of states. For β = −1, $Z_m$ is the sum of the traces of all possible matrix products. Since the two operations commute,
$$Z_m(-1) = \mathrm{Tr}\,(A + B)^m \tag{8}$$
[P]. Now $A + B$ has eigenvalues 1 and 3, so
$$Z_m(-1) = 1^m + 3^m = 1 + 3^m. \tag{9}$$
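Formula (9) can be confirmed for short chains by summing the traces directly and comparing with the trace of the matrix power. This is a small sketch under the same assumption as before, namely that A = ((1, 0), (1, 1)) and B = ((1, 1), (0, 1)); the function names are hypothetical.

```python
from itertools import product

A = ((1, 0), (1, 1))   # assumed explicit form of A = I + sigma_minus
B = ((1, 1), (0, 1))   # assumed explicit form of B = I + sigma_plus
IDENTITY = ((1, 0), (0, 1))

def matmul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def trace(x):
    return x[0][0] + x[1][1]

def sum_of_traces(m):
    """Z_m(-1): sum of Tr(M) over all 2^m products M of m factors A or B."""
    total = 0
    for word in product((A, B), repeat=m):
        acc = IDENTITY
        for factor in word:
            acc = matmul(acc, factor)
        total += trace(acc)
    return total

def trace_of_power(m):
    """Tr((A + B)^m), computed by repeated multiplication."""
    a_plus_b = tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))
    acc = IDENTITY
    for _ in range(m):
        acc = matmul(acc, a_plus_b)
    return trace(acc)

for m in range(1, 9):
    print(m, sum_of_traces(m), trace_of_power(m), 1 + 3 ** m)
```

The three columns agree because the trace is linear and expanding (A + B)^m produces every ordered product of m factors exactly once.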
One can also calculate correlation functions of the $A$ and $B$ matrices for these β values in a straightforward way. These simplifications may also be applied to the KSC. The β = −1 method gives an easy way to derive the results of [G-K].

Remark. The KSC may also be expressed using the $A$ and $B$ matrices. To do this, one must replace $T^m(j)$ in the formula for the partition function with
$$D^m(j) = b = \mathrm{Den}(x_j^m) = \big(M^m(j)\big)_{2,2} \tag{10}$$
and sum only over the restricted set of matrices beginning with $A$. We denote the resulting partition function $Z^K_{m-1}(\beta)$. In the language of [Ka], this is the canonical partition function for a chain of length $m - 1$ (the leading matrix $A$ is not counted). Since all matrix elements of $M^m(j)$ are positive,
$$0 < D^m(j) < T^m(j). \tag{11}$$
Thus the energy of each (restricted) state of the Farey chain is bounded below by the energy of the KSC. Since $Z_m$ may also be computed using the restricted sum, for β > 0 one has
$$Z_m(\beta) < 2Z^K_{m-1}(\beta). \tag{12}$$
For β < 0, the inequality is reversed.

3. Existence of the Free Energy and Phase Transition

Definition. The Farey free energy (per spin) $F(\beta)$ is defined as
$$\beta F(\beta) = \lim_{m\to\infty} \frac{-\ln[Z_m(\beta)]}{m}, \tag{13}$$
and the KSC free energy $F_K(\beta)$ is defined similarly, with $Z$ replaced by $Z^K$.

Theorem 2. The Farey free energy satisfies $F(\beta) = F_K(\beta)$. Thus $\beta F(\beta)$ exists for all β ≥ 0, for β any negative integer, and exhibits a unique phase transition at β = 2.
Farey Fraction Spin Chain
Proof. For β = 0,
$$\beta F(\beta) = \beta F_K(\beta) = -\ln 2 \tag{14}$$
follows immediately for either spin chain by the remark above. For the other β values, we make use of Theorem 3 below, which is independent of the present argument. With $\frac{c}{d}$ the immediate neighbor of $\frac{a}{b}$, one sees that
$$c = \bar b + at; \quad t = 0, 1, 2, 3, \dots, \tag{14a}$$
where the index $t$ classifies the appearances of $\frac{c}{d}$ with increasing chain length $m$ and satisfies $t < m$. Here $b\bar b \equiv 1 \pmod{a}$ and $1 \le \bar b \le a - 1$ so that $\bar b < a < b$. Hence
$$c < (m+1)b, \tag{15}$$
which we make use of as follows. First note that $D^m(j) = b$ and $T^m(j) = b + c$ satisfy
$$D^m(j) < T^m(j) < (m+2)D^m(j). \tag{16}$$
For β > 0, it follows that K > Zm > Zm−1
1 ZK . (m + 2)β m−1
Taking logarithms and dividing by −βm gives ln(m + 2) 1 1 K K < Fm < . Fm−1 + 1− Fm−1 1− m m m
(17)
(18)
Our results for positive β are then established by taking the limit m → ∞, and using the rigorous results that $\beta F_K(\beta)$ exists for all β > 0 [Kd] and exhibits a unique phase transition at β = 2 [C-K]. For β < 0 the same inequality holds for $F_m$; thus, since $\beta F_K(\beta)$ is also known to exist for β any negative integer [C-K], the rest of the theorem is proved. □

Remark. Another way of establishing the phase transition and gaining information on the behavior of the system at large β is given in Theorem 5.

Remark. Note that F(−1) = ln 3 by the calculation above. Making use of [C-K] gives the same result for $F_K(-1)$.

Figures 1 and 2 illustrate the free energy and energy fluctuation per spin ΔE (which is proportional to the specific heat) obtained by exact enumeration for chains up to length m = 16. It is clear that the convergence of the free energy with length is much slower at large β, where the system is known to be completely magnetized in the infinite m limit (see Sect. 5), than for small β values. In addition, Fig. 1 shows that while the approach to the limit is non-monotonic for small chain lengths for β = 1, this is not the case at larger β values. Finally, Fig. 2 shows a peak in ΔE near β = 2 consistent with the phase transition there. ΔE increases much more rapidly with length at the peak (β = 2) than at nearby β values.
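The bounds (11), (12) and (16) can be checked by brute-force enumeration for short chains. The following is only a minimal sketch: it assumes the generators A = ((1,0),(1,1)) and B = ((1,1),(0,1)) and the definitions $T_m(j) = \mathrm{Tr}\, M_m(j)$, $Z_m(\beta) = \sum_j T_m(j)^{-\beta}$, which are not restated in this section, and the function names are illustrative.

```python
from itertools import product

# Assumed generators (not restated in this section): A lower, B upper triangular.
A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))


def matmul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def chain_matrix(word):
    """Ordered product M_m(j) of the matrices named in `word` ('A' or 'B')."""
    m = ((1, 0), (0, 1))
    for g in word:
        m = matmul(m, A if g == "A" else B)
    return m


def farey_Z(m, beta):
    """Z_m(beta): sum of T^(-beta) over all 2^m words, with T the trace."""
    total = 0.0
    for word in product("AB", repeat=m):
        M = chain_matrix(word)
        total += (M[0][0] + M[1][1]) ** (-beta)
    return total


def ksc_Z(m, beta):
    """Z^K_{m-1}(beta): sum of D^(-beta) over words of length m beginning
    with A, with D the (2,2) matrix element."""
    total = 0.0
    for tail in product("AB", repeat=m - 1):
        M = chain_matrix(("A",) + tail)
        total += M[1][1] ** (-beta)
    return total


def check_bounds(m):
    """True if D < T < (m + 2) D holds for every restricted word, Eq. (16)."""
    for tail in product("AB", repeat=m - 1):
        M = chain_matrix(("A",) + tail)
        D, T = M[1][1], M[0][0] + M[1][1]
        if not (D < T < (m + 2) * D):
            return False
    return True
```

Under these assumptions `farey_Z(m, -1)` equals Tr (A + B)^m = 3^m + 1, which reproduces F(−1) = ln 3 in the limit m → ∞.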
Fig. 1. Free energy $F_m \equiv -\ln Z_m/(m\beta)$ vs. m for Farey spin chain at β = 1 (stars) and β = 4 (diamonds)

Fig. 2. Fluctuation of the energy per spin $\Delta E \equiv \frac{1}{m}\left(\langle E_m(j)^2\rangle - \langle E_m(j)\rangle^2\right)$ vs. β for the Farey spin chain with m = 16
4. Number of States
In this section we consider the summed number of states of the spin chain, that is, the number of states of a given energy regardless of chain length. To begin, we derive an expression for the number of immediate right neighbors of a given Farey fraction.

Definition. If m is the smallest positive integer such that $a/b \in F_m$, then we call m the conductor of a/b. We write m = cond(a/b).
0 Let m = cond ab = m, and let Yx 0 < neighbors of ab in Fm . Then we have
a b
2 and X is any finite product of A's and B's beginning with A as described above. We count the contribution of a/b to the left side of (29) by keeping track of the immediate right neighbors of a/b. For this, we consider Theorem 3 and
$$b + \bar b + at = n;\quad t = 0, 1, 2, \ldots, \tag{30}$$
which is equivalent to
$$b^2 + b\bar b + abt = bn \tag{31}$$
or
$$b^2 - bn + 1 \equiv 0 \pmod a. \tag{32}$$
Therefore we look for the divisors a of $bn - b^2 - 1$ for 1 ≤ a < b. There are $d_b(bn - b^2 - 1)$ of these, where $d_b(m)$ is the number of positive divisors of m that are less than b. It immediately follows that

Theorem 4. The number of solutions of (29) is
$$\Phi(n) = \sum_{b=1}^{n-1} d_b(bn - b^2 - 1).$$
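The sum in Theorem 4 is easy to evaluate directly for small n. The sketch below is illustrative only (the names `divisors_below` and `phi` are not from the paper); it implements $d_b(m)$ literally as the count of positive divisors of m that are less than b.

```python
def divisors_below(m, b):
    """d_b(m): number of positive divisors of m that are strictly less than b."""
    if m <= 0:
        return 0
    # A divisor of m cannot exceed m, so the search stops at min(b - 1, m).
    return sum(1 for a in range(1, min(b, m + 1)) if m % a == 0)


def phi(n):
    """Phi(n) = sum over b = 1 .. n-1 of d_b(b*n - b^2 - 1), as in Theorem 4."""
    return sum(divisors_below(b * n - b * b - 1, b) for b in range(1, n))


if __name__ == "__main__":
    for n in (3, 4, 5, 10, 100):
        print(n, phi(n))
```

For n = 4, for instance, the terms b = 1, 2, 3 contribute 0, 1 and 2 respectively.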
Similar quantities are considered in [H-T].

Remark. We can put an upper bound on Φ(n) by considering (30). If we ignore the restrictions, the number of solutions to it is just the number of unordered partitions of n into three positive integers. Therefore
$$\Phi(n) \le \binom{n+2}{2} = \frac{1}{2}(n + 2)(n + 1) \sim \frac{1}{2} n^2, \tag{33}$$
where the asymptotic form applies as n → ∞.

Conjecture. As n → ∞, $\Phi(n) \sim \frac{1}{2} n \log n$.

Consider the fact that
$$\sum_{m=1}^{n} d(m) \sim n \log n,$$
which implies that for large n the average value of d is log n, and the DDT theorem
$$\sum_{m=1}^{n} \frac{d_{m^u}(m)}{d(m)} \sim \frac{2}{\pi} \sin^{-1}(\sqrt{u}) \cdot n$$
[D-D-T]. Now replace the quantities d(m) and $d_b(m)$ by the averaged functions d(x) and d(x; y), where the latter is the averaged number of divisors of y ≤ x. Taken together, the quoted results suggest that
$$\int_1^n d(x^u; x)\,dx \sim \frac{2}{\pi} \sin^{-1}(\sqrt{u}) \cdot n \log n, \tag{34}$$
and more generally the averaged quantity satisfies
$$d(g; x) \sim \frac{2}{\pi} \sin^{-1}\!\left(\sqrt{\frac{\log g}{\log x}}\right) \cdot \log n \tag{35}$$
for large x ≤ n. The argument $\log g/\log x$ is consistent with the fact that divisors are uniformly distributed on a log scale ([H-T], p. 62). The asymptotic form of the number of states then follows
$$\Phi(n) \sim \int_1^{n-1} d(b; bn - b^2 - 1)\,db \sim \frac{2}{\pi} \int_1^{n-1} \sin^{-1}\!\left(\sqrt{\frac{\log b}{\log b(n-b)}}\right) db \cdot \log n. \tag{36}$$
Evaluating the integral for large n, we find $\sin^{-1}\!\left(\frac{1}{\sqrt 2}\right) n = \frac{\pi}{4} n$, which leads immediately to the conjectured formula.
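The value of the integral in (36) can be confirmed numerically. The sketch below (function names are illustrative) evaluates it with a midpoint rule. It relies on an elementary observation that is not stated in the text: the arguments of the square root at b and at n − b add up to 1, so the two integrand values add up to π/2 and the integral over [1, n − 1] equals (π/4)(n − 2), which agrees with (π/4) n to leading order.

```python
import math


def integrand(b, n):
    """arcsin(sqrt(log b / log(b (n - b)))), the integrand of Eq. (36)."""
    x = math.log(b) / math.log(b * (n - b))
    # Clamp against rounding so that sqrt and asin stay inside their domains.
    return math.asin(math.sqrt(min(1.0, max(0.0, x))))


def integral_36(n, steps=20000):
    """Midpoint-rule estimate of the integral over b from 1 to n - 1."""
    h = (n - 2) / steps
    return h * sum(integrand(1 + (k + 0.5) * h, n) for k in range(steps))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, integral_36(n), math.pi / 4 * (n - 2))
```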
5. Thermodynamic Consequences

The results for the (summed) number of states in Sect. 4 may be used to derive some consequences for the thermodynamics. We first express the partition function as
$$Z_m(\beta) = 2^{1-\beta} + 2 \sum_{n=3}^{\infty} \frac{\Phi_m(n)}{n^\beta}, \tag{37}$$
where $\Phi_m(n)$ is the number of solutions of (29) for fixed chain length m, i.e. the number of states. The factor 2 appears in front of the summation of the preceding equation since (29) refers to the restricted set of matrices beginning with A. The summed number of states is then $\Phi(n) = \sum_{m=1}^{\infty} \Phi_m(n)$. If we define $Z'_m(\beta) = Z_m(\beta) - 2^{1-\beta}$ and use the conjectured asymptotic form of Φ(n), then
$$\sum_{m=1}^{\infty} Z'_m(\beta) = 2 \sum_{n=3}^{\infty} \frac{\Phi(n)}{n^\beta} = \sum_{n=1}^{\infty} \frac{n \log n}{n^\beta} + 2 \sum_{n=1}^{\infty} \frac{\varepsilon(n)}{n^\beta} - \frac{2^{2-\beta}}{3} \log 2 = -\zeta'(\beta - 1) + 2\tilde\varepsilon(\beta) - \frac{2^{2-\beta}}{3} \log 2, \tag{38}$$
where ε(n) corrects the asymptotic form, $\tilde\varepsilon(\beta)$ is its Dirichlet transform, and ζ′ is the derivative of the Riemann ζ-function. If we assume that $\tilde\varepsilon(\beta)$ is regular, then a singularity in the thermodynamics for β = 2 follows from the pole in the Riemann function at β − 1 = 1. We have not been able to prove this assumption, however.

Theorem 5. $Z'_m(\beta) \to 0$ as m → ∞ so that $Z_m(\beta) \to 2^{1-\beta}$, and the free energy F(β) = 0 for β > 3.

Proof. We use the upper bound on Φ(n) derived above. Now
$$\sum_{m=1}^{\infty} Z'_m(\beta) = 2 \sum_{n=3}^{\infty} \frac{\Phi(n)}{n^\beta} \le \sum_{n=3}^{\infty} \frac{(n + 2)(n + 1)}{n^\beta}. \tag{39}$$
The summation in this equation is finite for β > 3. □

Remark. Since βF(β) is finite at β = 0, Theorem 5 establishes the existence of a phase transition (singularity in βF(β)) in a different way than used in Sect. 3. In addition, since the partition function for long chains is given by the sum over the two lowest energy ($j = 0$ or $2^m - 1$) states only, it shows that the limiting chain is in a completely magnetized state (all spins up or all spins down) at low temperatures. Similar behavior holds, by construction, for the KSC, with the partition function approaching a ratio of Riemann ζ-functions for β > 2 as m → ∞. For the Farey chain, replacing the upper bound by our conjectured asymptotic form for Φ(n) leads to F(β) = 0 and $Z_m(\beta) \to 2^{1-\beta}$ for β > 2.

6. Spin Interaction Coefficients

We define the spin interaction coefficients $J_m(t)$ as follows [Kb]. Let t, $0 \le t \le 2^m - 1$, specify the coefficient, and suppose that the binary expansion of t is $t = (b_{m-1} \ldots b_1 b_0)_2$, $b_k \in \{0, 1\}$. Let $j \cdot t = \sum_{i=0}^{m-1} a_i b_i$. Then
$$J_m(t) = -\frac{1}{2^m} \sum_j (-1)^{j \cdot t} E_m(j), \tag{40}$$
where $E_m(j)$ is the energy of configuration j for a chain of length m, defined above. In this notation, the energy of any configuration is given via a sum over spin clusters
$$E_m(j) = -\sum_t (-1)^{j \cdot t} J_m(t). \tag{41}$$
Note that each factor $s_i \equiv (-1)^{a_i} \in \{-1, +1\}$ in each term may be interpreted as a spin at site i on the chain. The spin $s_i$ is present or not in a given term according to whether $b_i = 1$ or $b_i = 0$, respectively. More explicitly,
$$E_m(j) = -\sum_{\{b_i\}} \left[\prod_{i=0}^{m-1} (s_i)^{b_i}\right] J(\{b_i\}). \tag{42}$$
Each term thus defines a spin cluster, i.e. a set of sites for which bi = 1. The sites in a given cluster may be adjacent to one another or separated by bi = 0 sites with no spins present.
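The transform (40) and its inverse (41) can be sketched for short chains as follows. This is a minimal illustration, assuming $E_m(j) = \ln \mathrm{Tr}\, M_m(j)$ with the generators A = ((1,0),(1,1)) and B = ((1,1),(0,1)), bit $a_i$ of j selecting A (0) or B (1) at site i; these definitions come from earlier in the paper and are not restated here, and the function names are illustrative.

```python
import math

# Assumed generators (not restated in this section).
A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))


def matmul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def energy(j, m):
    """E_m(j) = ln Tr M_m(j); bit a_i of j selects A (0) or B (1) at site i."""
    M = ((1, 0), (0, 1))
    for i in range(m):
        M = matmul(M, B if (j >> i) & 1 else A)
    return math.log(M[0][0] + M[1][1])


def sign(j, t):
    """(-1)^(j.t), with j.t the number of sites where both bits are set."""
    return -1 if bin(j & t).count("1") % 2 else 1


def couplings(m):
    """Return the lists E_m(j) and J_m(t) of Eq. (40) for all j and t."""
    size = 2 ** m
    E = [energy(j, m) for j in range(size)]
    J = [-sum(sign(j, t) * E[j] for j in range(size)) / size
         for t in range(size)]
    return E, J


def energy_from_couplings(J, j):
    """Reconstruct E_m(j) from the couplings via Eq. (41)."""
    return -sum(sign(j, t) * J[t] for t in range(len(J)))
```

For m = 2 this gives $J_2(11) = \frac{1}{2}\ln\frac{3}{2} > 0$, a ferromagnetic pair coupling.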
Lemma. The interaction coefficient $J_m(0)$ satisfies
$$J_m(0) = -\frac{1}{2^m} \sum_j \ln T_m(j) \ge -m \ln\!\left(\frac{\sqrt 5 + 1}{2}\right) = -(0.48121\ldots)\,m$$
for large m.
Proof. By considering the generation of the Farey fractions $F_m$, it is clear that each numerator and denominator is bounded above by the Fibonacci number $F_{m+1}$. Therefore
$$T_m(j) \le 2F_{m+1} \sim \frac{2}{\sqrt 5} \left(\frac{1 + \sqrt 5}{2}\right)^{m+1}, \tag{43}$$
as m → ∞. □

We have calculated $J_m(0)$ by computer for chains up to m = 16 as illustrated in Fig. 3. The results indicate that it approaches −(am + b), with a = 0.3962, for both the Farey spin chain and KSC. Since $J_m(0)$ is also (minus) the average energy for a chain of length m, we considered the fluctuation of the energy, i.e.
$$\sigma_m^2 = \frac{1}{2^m} \sum_j \left(\ln T_m(j)\right)^2 - \left(\frac{1}{2^m} \sum_j \ln T_m(j)\right)^2. \tag{44}$$
Numerically, from calculations on chains up to length 16, $\sigma_m$ appears to approach 0.019 m as m → ∞. The corresponding quantity for the KSC apparently approaches 0.014 m. Given the numerical uncertainties, it is possible that the asymptotic value is the same for both chains.

Note that, since $T_m \ge 2$, $J_m(0) < 0$. By the up-down symmetry discussed above, it follows that $J_m(t) = 0$ whenever the cluster defined by t contains an odd number of spins, i.e. when $\sum_i b_i$ is odd, so only even interaction coefficients are non-zero. Furthermore, $J_m(t)$ exhibits cyclic symmetry in t, i.e. it is invariant under translation of the spin cluster, due to the invariance of the trace under cyclic translation of the spin matrices mentioned above.

Our computer results for short chains verify all the exact behavior mentioned in the preceding paragraph. In addition, we find several interesting features. For all non-zero clusters (with an even number of spins) $J_m(t) > 0$, so all interaction coefficients (except t = 0) are ferromagnetic. Similar behavior occurs for the KSC in what is referred to as the grand canonical ensemble [Kb]. We find that as m increases, each such $J_m(t)$ apparently approaches a finite limit. An example is shown in Fig. 3. Denoting by J(t) the limit of $J_m(t)$ as m → ∞, we find the values listed in Table 1. Note that the (approximate) numbers for the pair interaction coefficients (t = (10...01000...)) are consistent with a decrease of J by a factor of 1/2 for each increase of separation of the two spins in the cluster by one site. Generally, J appears to decrease with the number of spins in the cluster and the distance between the spins.

Theorem 2 establishes that the Farey spin chain and KSC have the same free energy. This raises the question as to whether J(t) is in some sense the same in both cases. This is a more complicated issue, in part because the KSC is not translation invariant.
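The average energy $-J_m(0)$ and the fluctuation $\sigma_m^2$ of (44) can be reproduced for short chains by direct enumeration. The sketch below again assumes the generators A = ((1,0),(1,1)), B = ((1,1),(0,1)) and $T_m(j) = \mathrm{Tr}\, M_m(j)$, which are not restated in this section. It makes no attempt to reach m = 16 efficiently, and the function names are illustrative.

```python
import math
from itertools import product

# Assumed generators (not restated in this section).
A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))


def matmul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def log_traces(m):
    """ln T_m(j) for all 2^m ordered products of m generators."""
    out = []
    for word in product((A, B), repeat=m):
        M = ((1, 0), (0, 1))
        for g in word:
            M = matmul(M, g)
        out.append(math.log(M[0][0] + M[1][1]))
    return out


def mean_and_variance(m):
    """Return (J_m(0), sigma_m^2) as defined in the Lemma and in Eq. (44)."""
    logs = log_traces(m)
    mean = sum(logs) / len(logs)
    var = sum(v * v for v in logs) / len(logs) - mean * mean
    return -mean, var


if __name__ == "__main__":
    for m in range(1, 13):
        j0, var = mean_and_variance(m)
        print(m, round(j0, 5), round(var, 5))
```

Successive differences of the printed $J_m(0)$ values are the quantity plotted in Fig. 3.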
Our numerical results are consistent with the J (t) being the same, at least for KSC interactions with spin clusters far from the edges, i.e. with t of the form 0 . . . 0p0 . . . 0,
Fig. 3. $J_{m+1}(0) - J_m(0)$ vs. m for the Farey chain (stars) and KSC (diamonds) with values on the right scale; pair interaction $J_m(11000\ldots)$ vs. m for the Farey chain (boxes) with values on the left scale
Table 1. Interaction coefficients

t                   J(t)
11000...            0.131
101000...           0.0612
1001000...          0.0291
10001000...         0.0141
100001000...        0.0068
1000001000...       0.0033
1111000...          0.0081
11011000...         0.0028
110011000...        0.0011
11101000...         0.004
with p = 1...1 fixed and each string of 0s of length proportional to m. For t of the form p0...0, so the cluster remains at one edge, the J(t) values are certainly different. Further, $J_m(t)$ for the KSC is rigorously known to exhibit the following behavior for m → ∞: $J_m(t) \to 0$ when t has an odd number of spins, it is small unless the length of p is small, and it has translation invariance [Ka]. All of these are consistent with what we find for the Farey chain, as described above. Unfortunately, the bounds relating $D_m(j)$ and $T_m(j)$ used in the proof of Theorem 2 are not strong enough to establish equality, since each term in $J_m(t)$ will be bounded above and below by the corresponding term in $J_m^K(t)$ plus (depending on sign) a possible term of magnitude $\ln(m + 2)\,2^{-m}$, and on summation the latter gives rise to a divergent contribution as m → ∞. However, one can draw some conclusion about $J_m(0)$ and $\sigma_m$ in this way. We assume, consistent with the lemma and numerical results above, that both
quantities are proportional to m in this limit. It is then easy to show that the respective coefficients must be the same for either spin chain, as suggested by the numerics. Note added. Recent work [C-K-K] has extended the analysis of the Farey spin chain, addressing in particular the magnetization in the vicinity of the phase transition. Acknowledgements. We acknowledge stimulating and useful interactions with P. Contucci, A. Knauf and I. Peschel, thank A. Knauf for a computer program, and thank an anonymous referee for important comments.
References

[C-K] Contucci, P. and Knauf, A.: The Phase Transition of the Number-Theoretical Spin Chain. Forum Mathematicum 9, 547–567 (1997)
[C-K-K] Contucci, P., Kleban, P., and Knauf, A.: A Fully Magnetizing Phase Transition. Preprint (math-ph/9811020)
[Cv] Cvitanovic, P.: Circle Maps: Irrationally Winding. In: From Number Theory to Theoretical Physics, Berlin–Heidelberg–New York: Springer, 1992
[D-D-T] Deshouillers, J.-M., Dress, F. and Tenenbaum, G.: Lois de répartition des diviseurs, I. Acta Arithmetica XXXIV (1979)
[G-K] Guerra, F. and Knauf, A.: Free Energy and Correlations of the Number-Theoretical Spin Chain. J. Math. Phys. 39, 3188–3202 (1998) (http://wwwsfb288.math.tu_berlin.de/bulletinboard.html)
[H-T] Hall, R. and Tenenbaum, G.: Divisors. Cambridge: Cambridge University Press, 1988
[Ka] Knauf, A.: On a Ferromagnetic Spin Chain. Commun. Math. Phys. 153, 77–115 (1993)
[Kb] Knauf, A.: Phases of the Number-Theoretical Spin Chain. J. Stat. Phys. 73, 423–431 (1993)
[Kd] Knauf, A.: On a Ferromagnetic Spin Chain, Part II: Thermodynamic Limit. J. Math. Phys. 35, 228–236 (1994)
[P] Peschel, I.: Private communication

Communicated by M. E. Fisher
Commun. Math. Phys. 203, 649 – 666 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Riccati-Type Equations, Generalised WZNW Equations, and Multidimensional Toda Systems L. A. Ferreira1 , J. F. Gomes1 , A. V. Razumov2 , M. V. Saveliev2 , A. H. Zimerman1 1 Instituto de Física Teórica - IFT/UNESP, Rua Pamplona 145, 01405-900, São Paulo - SP, Brazil.
E-mail:
[email protected];
[email protected];
[email protected] 2 Institute for High Energy Physics, 142284 Protvino, Moscow Region, Russia.
E-mail:
[email protected];
[email protected] Received: 3 August 1998 / Accepted: 21 December 1998
Abstract: We associate to an arbitrary Z-gradation of the Lie algebra of a Lie group a system of Riccati-type first order differential equations. The particular cases under consideration are the ordinary Riccati and the matrix Riccati equations. The multidimensional extension of these equations is given. The generalisation of the associated Redheffer–Reid differential systems appears in a natural way. The connection between the Toda systems and the Riccati-type equations in lower and higher dimensions is established. Within this context the integrability problem for those equations is studied. As an illustration, some examples of the integrable multidimensional Riccati-type equations related to the maximally nonabelian Toda systems are given. 1. Introduction At the present time there is a great number of papers in mathematics and physics devoted to various aspects of the matrix differential Riccati equation proposed in the ’20s by Radon in the context of the Lagrange variational problem. In particular, this equation has been discussed in connection with the oscillation of the solutions to systems of linear differential equations, Lie group and differential geometric aspects of the theory of analytic functions of several complex variables in classical domains, probability theory, computation schemes. For a systematic account of the development in the theory of the matrix differential Riccati equation up to the ’70s see, for example, the survey [33]. More recently there appeared papers where this equation was considered as a Bäcklund-type transformation for some integrable systems of differential geometry. In particular, for the Lamé and the Bourlet equations. A relevant superposition principle for the equation has been studied on the basis of the theory of Lie algebras, see, for example, [1,2,20, 30–32] and references therein. 
The matrix Riccati equation also arises as an equation of motion on Grassmann manifolds and on homogeneous spaces attached to the Hartree– Fock–Bogoliubov problem, see, for example, [4,11] and references therein; and in some other subjects of applied mathematics and physics such as optimal control theory, plasma,
etc., see, for example, [12,29,28]. Continued–fraction solutions to the matrix differential Riccati equation were constructed in [7,8], based on a sequence of substitutions with the coefficients satisfying a matrix generalisation of the Volterra-type equations which in turn provide a Bäcklund transformation for the corresponding matrix version of the Toda lattice. In papers [5,6] the matrix differential Riccati equation occurs in the steepest descent solution to the total least squares problem as a flow on Grassmannians via the Brockett double bracket commutator equation; in the special case of projective space this is the Toda lattice flow in Moser’s variables. In the present paper we investigate the equations associated with an arbitrary Zgradation of the Lie algebra g of a Lie group G. For the case G = GL(2, C) and the principal gradation of gl(2, C) this is the ordinary Riccati equation. For the case G = GL(n, C) and some special Z-gradation of gl(n, C) we get the matrix Riccati equation. The underlying group-algebraic structure allows us to give a unifying approach to the investigation of the integrability problem for the equations under consideration which we call the Riccati-type equations. We also give a multidimensional generalisation of the Riccati-type equations and discuss their integrability. It has been very useful for the study of ordinary matrix Riccati equations to associate with them the so-called Redheffer–Reid differential system [25–27]. In our approach the corresponding generalisation of such systems appears in a natural way. The associated Redheffer–Reid system can be considered as the constraints providing some reduction of the Wess–Zumino–Novikov–Witten (WZNW) equations. On the other hand, it is well known that the Toda-type systems can be also obtained by the appropriate reduction of the WZNW equations, see, for example, [14]. This implies a deep connection of the Toda-type systems and the Riccati-type equations. 
In particular, under the relevant constraints the Riccati-type equations play the role of a Bäcklund map for the Toda systems, and, in a sense, are a generalisation of the Volterra equations. Some years ago there appeared a remarkable generalisation [16] of the WZNW equations. The associated Redheffer–Reid system in the multidimensional case can be considered again as the constraints imposed on the solutions of those equations. We show that in the same way as in the two dimensional case, the appropriate reduction of the multidimensional WZNW equations leads to the multidimensional Toda systems [22], in particular to the equations [9,10,13] describing topological and antitopological fusion.1 The multidimensional Toda systems are integrable for the relevant integration data with the general solution being determined by the corresponding arbitrary mappings in accordance with the integration scheme developed in [22]. Therefore the integrability problem for the multidimensional Riccati-type equations can be studied, in particular, on the basis of that fact. As an illustration of the general construction we discuss in detail some examples related to the maximally nonabelian Toda systems [23]. Analogously to the Toda systems one can construct higher grading generalisations in the sense of [18,24] for the multidimensional Riccati-type equations. 2. One Dimensional Riccati-Type Equations We shall introduce Riccati-type equations within the usual language of integrable models, through an associate linear problem involving Lie algebra and Lie group valued objects. 1 It is rather clear that the multidimensional systems suggested in [22] become two dimensional equations only under a relevant reduction. Moreover, arbitrary mappings determining the general solution to these equations are not necessarily factorised to the products of mappings each depending on one coordinate only. One can easily be convinced by the examples considered there in detail.
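Let G be a connected Lie group and g be its Lie algebra. Without any loss of generality we assume that G is a matrix Lie group, otherwise we replace G by its image under some faithful representation of G. For any fixed mapping λ : R → g consider the equation
$$\psi^{-1} \frac{d\psi}{dx} = \lambda \tag{2.1}$$
for the mapping ψ : R → G. Certainly one can use the complex plane C instead of the real line R. Suppose that the Lie algebra g is endowed with a Z-gradation,
$$\mathfrak g = \bigoplus_{m \in \mathbb Z} \mathfrak g_m.$$
Define the following nilpotent subalgebras of g:
$$\mathfrak g_{<0} = \bigoplus_{m<0} \mathfrak g_m, \qquad \mathfrak g_{>0} = \bigoplus_{m>0} \mathfrak g_m.$$
Let $G_{<0}$, $G_0$ and $G_{>0}$ be the Lie subgroups of G corresponding to $\mathfrak g_{<0}$, $\mathfrak g_0$ and $\mathfrak g_{>0}$. We use the Gauss decomposition
$$a = a_{<0}\, a_0\, a_{>0} \tag{2.2}$$
of an element a of G and the corresponding decomposition
$$\psi = \psi_{<0}\, \psi_0\, \psi_{>0} \tag{2.3}$$
of the mapping ψ. In terms of (2.3), Eq. (2.1) takes the form
$$\psi_{>0}^{-1} \left(\psi_{\le 0}^{-1} \frac{d\psi_{\le 0}}{dx}\right) \psi_{>0} + \psi_{>0}^{-1} \frac{d\psi_{>0}}{dx} = \lambda, \tag{2.4}$$
where $\psi_{\le 0} = \psi_{<0} \psi_0$. It follows that
$$\psi_{\le 0}^{-1} \frac{d\psi_{\le 0}}{dx} + \frac{d\psi_{>0}}{dx} \psi_{>0}^{-1} = \psi_{>0}\, \lambda\, \psi_{>0}^{-1},$$
and hence
$$\psi_{\le 0}^{-1} \frac{d\psi_{\le 0}}{dx} = (\psi_{>0}\, \lambda\, \psi_{>0}^{-1})_{\le 0}, \tag{2.5}$$
where the subscript ≤ 0 denotes the corresponding component with respect to the decomposition $\mathfrak g = \mathfrak g_{\le 0} \oplus \mathfrak g_{>0} = (\mathfrak g_{<0} \oplus \mathfrak g_0) \oplus \mathfrak g_{>0}$.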
Substituting (2.5) into (2.4) one gets
$$\psi_{>0}^{-1} \frac{d\psi_{>0}}{dx} = \lambda - \psi_{>0}^{-1} (\psi_{>0}\, \lambda\, \psi_{>0}^{-1})_{\le 0}\, \psi_{>0}$$
that can be rewritten as
$$\frac{d\psi_{>0}}{dx} \psi_{>0}^{-1} = (\psi_{>0}\, \lambda\, \psi_{>0}^{-1})_{>0}. \tag{2.6}$$
For reasons which are clear from what follows we call this equation for the mapping $\psi_{>0}$ a Riccati-type equation. The formal integration of Eq. (2.6) can be performed in the following way. Consider the linear differential equation
$$\frac{d\psi}{dx} = \psi \lambda, \tag{2.7}$$
for the mapping ψ : R → G. Find the solution of this equation with the initial condition ψ(0) = a, where a is a constant element of the Lie group G. Using now the Gauss decomposition (2.3) of the mapping ψ we find the solution of Eq. (2.6) with the initial condition $\psi_{>0}(0) = a_{>0}$, where $a_{>0}$ is the positive grade component of a arising from the Gauss decomposition (2.2). It is clear that in order to obtain the general solution of Eq. (2.6) it suffices to consider elements a belonging to the Lie subgroup $G_{>0}$. Then the solution of (2.6) is expressed in terms of the solution of (2.7). Note that the solution of Eq. (2.7) with the initial condition ψ(0) = a can be obtained from the solution with the initial condition ψ(0) = e, where e is the unit element of G, by left multiplication by a. Thus we have shown that one can associate a Riccati-type equation to any Z-gradation of a Lie group. The integration of these equations is reduced to integration of some matrix system of first order linear differential equations.

Let now χ be some mapping from R to G. It is clear that if the mapping ψ satisfies Eq. (2.7), then the mapping $\psi' = \psi \chi^{-1}$ satisfies the equation
$$\frac{d\psi'}{dx} = \psi' \lambda',$$
where
$$\lambda' = \chi\, \lambda\, \chi^{-1} - \frac{d\chi}{dx} \chi^{-1}. \tag{2.8}$$
If χ is a mapping from R to $G_0$, then the corresponding component
$$\psi'_{>0} = \chi\, \psi_{>0}\, \chi^{-1}$$
of the mapping ψ′ satisfies the Riccati-type equation (2.6) with λ replaced by λ′. In this,
$$\lambda'_0 = \chi\, \lambda_0\, \chi^{-1} - \frac{d\chi}{dx} \chi^{-1},$$
and it is clear that we can choose the mapping χ so that $\lambda'_0$ vanishes. Another interesting possibility arises when χ is a mapping from R to $G_{>0}$. Let us choose a mapping χ such that $\lambda'_{>0} = 0$. From (2.8) it follows that this case is realised if and only if
$$\frac{d\chi}{dx} \chi^{-1} = (\chi \lambda \chi^{-1})_{>0},$$
i.e., χ should satisfy the Riccati-type equation (2.6). Thus, having a particular solution of the Riccati-type equation, its general solution can be constructed from the general solution of the equation with $\lambda_{>0} = 0$. As will be shown below, for this case the Riccati-type equation can be solved in a quite simple way.

3. Simplest Example

Consider first the case of the Lie group GL(n, C), n ≥ 2 and represent n as the sum of two positive integers $n_1$ and $n_2$. For the Lie algebra gl(n, C) there is a Z-gradation where arbitrary elements $x_{<0}$, $x_0$ and $x_{>0}$ of the subalgebras $\mathfrak g_{<0}$, $\mathfrak g_0$ and $\mathfrak g_{>0}$ have the form
$$x_{<0} = \begin{pmatrix} 0 & 0 \\ (x_{<0})_{21} & 0 \end{pmatrix}, \quad x_0 = \begin{pmatrix} (x_0)_{11} & 0 \\ 0 & (x_0)_{22} \end{pmatrix}, \quad x_{>0} = \begin{pmatrix} 0 & (x_{>0})_{12} \\ 0 & 0 \end{pmatrix}.$$
Parametrise the mapping λ as
$$\lambda = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$
and the mapping $\psi_{>0}$ as
$$\psi_{>0} = \begin{pmatrix} I_{n_1} & U \\ 0 & I_{n_2} \end{pmatrix}. \tag{3.3}$$
One easily sees that Eq. (2.6) takes in the case under consideration the form
$$\frac{dU}{dx} = B - AU + UD - UCU. \tag{3.4}$$
In the case n = 2, $n_1 = n_2 = 1$, we have the usual Riccati equation. For n = 2m, $n_1 = n_2 = m$, we come to the so-called matrix Riccati equation. This justifies our choice for the name of Eq. (2.6) in the general case. Note that Eq. (2.6) has arisen in the context of the factorization method, see, for example, [19]. However, the explicit connection of this equation with the Riccati equation and the matrix Riccati equation has not been traced yet.
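The reduction of the Riccati equation (3.4) to the linear system (2.7) can be illustrated numerically in the scalar case $n_1 = n_2 = 1$. The sketch below is a hedged illustration rather than part of the paper: the constant coefficients, the initial value and the use of a classical Runge–Kutta integrator are arbitrary choices, and the function names are illustrative. It integrates (3.4) directly, and separately integrates dψ/dx = ψλ from ψ(0) = ((1, U(0)), (0, 1)) and reads off $U = \psi_{11}^{-1} \psi_{12}$ from the Gauss decomposition.

```python
def rk4_step(f, x, y, h):
    """One classical Runge-Kutta step for y' = f(x, y), y a list of floats."""
    k1 = f(x, y)
    k2 = f(x + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
    k3 = f(x + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
    k4 = f(x + h, [a + h * b for a, b in zip(y, k3)])
    return [a + h / 6 * (p + 2 * q + 2 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]


def integrate(f, y0, x_end, steps):
    """Integrate y' = f(x, y) from x = 0 to x_end with a fixed step."""
    h = x_end / steps
    x, y = 0.0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, x, y, h)
        x += h
    return y


def riccati_direct(A, B, C, D, u0, x_end, steps=1000):
    """Integrate dU/dx = B - A U + U D - U C U directly (scalar case)."""
    def f(x, y):
        u = y[0]
        return [B - A * u + u * D - u * C * u]
    return integrate(f, [u0], x_end, steps)[0]


def riccati_via_linear(A, B, C, D, u0, x_end, steps=1000):
    """Integrate dpsi/dx = psi * lam with lam = [[A, B], [C, D]] and
    psi(0) = [[1, u0], [0, 1]], then return U = psi12 / psi11."""
    def f(x, p):
        p11, p12, p21, p22 = p
        return [p11 * A + p12 * C, p11 * B + p12 * D,
                p21 * A + p22 * C, p21 * B + p22 * D]
    p11, p12, _, _ = integrate(f, [1.0, u0, 0.0, 1.0], x_end, steps)
    return p12 / p11


if __name__ == "__main__":
    args = (0.3, 1.0, 0.5, -0.2, 0.2, 1.0)  # A, B, C, D, U(0), x_end
    print(riccati_direct(*args), riccati_via_linear(*args))
```

The two printed values agree to within the integration error, as expected from the construction of Sect. 2.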
3.1. Case B = 0. If C = 0 then Eq. (3.4) is linear. In the case B = 0, under the conditions $n_1 = n_2$ and det U(x) ≠ 0 for any x, the substitution $V = U^{-1}$ leads to the linear equation
$$\frac{dV}{dx} = VA - DV + C.$$
Nevertheless, it is instructive to consider the procedure of obtaining the general solution to equation (3.4) for B = 0. Recall that having a particular solution to the Riccati-type equation, we can reduce the consideration to the case where $\lambda_{>0} = 0$. For the equation in question this is equivalent to the requirement B = 0. First, find the mapping χ : R → $G_0$ such that transformation (2.8) would give $\lambda'_0 = 0$. Parametrising χ as
$$\chi = \begin{pmatrix} Q & 0 \\ 0 & R \end{pmatrix},$$
one comes to the following equations for R and Q:
$$\frac{dQ}{dx} = QA, \qquad \frac{dR}{dx} = RD.$$
Therefore we can choose
$$Q(x) = P \exp\left(\int_0^x A(x')\,dx'\right), \qquad R(x) = P \exp\left(\int_0^x D(x')\,dx'\right), \tag{3.5}$$
where the symbol P exp(·) denotes the path ordered exponential (multiplicative integral). Now solve the equation
$$\frac{d\psi'}{dx} = \psi' \lambda',$$
where
$$\lambda' = \begin{pmatrix} 0 & 0 \\ C' & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ RCQ^{-1} & 0 \end{pmatrix}.$$
The solution of this equation with the initial condition $\psi'(0) = I_n$ is
$$\psi'(x) = \begin{pmatrix} I_{n_1} & 0 \\ S(x) & I_{n_2} \end{pmatrix}$$
with
$$S(x) = \int_0^x R(x')\, C(x')\, Q^{-1}(x')\,dx'. \tag{3.6}$$
Hence, the solution of Eq. (2.7) with the initial condition $\psi(0) = I_n$ is given by
$$\psi = \begin{pmatrix} Q & 0 \\ SQ & R \end{pmatrix}.$$
To obtain the general solution of the equation under consideration we should have the solution of Eq. (2.7) with the initial condition
$$\psi(0) = \begin{pmatrix} I_{n_1} & m \\ 0 & I_{n_2} \end{pmatrix}, \tag{3.7}$$
where m is an arbitrary $n_1 \times n_2$ matrix. Such a solution is represented as
$$\psi = \begin{pmatrix} (I_{n_1} + mS)Q & mR \\ SQ & R \end{pmatrix}.$$
Now, using (3.1) we conclude that the general solution to Eq. (3.4) in the case B = 0 is
$$U = Q^{-1} (I_{n_1} + mS)^{-1} m R,$$
where Q, R and S are given by relations (3.5) and (3.6). Thus we see that in the case when λ is a block upper or lower triangular matrix the Riccati-type equation (3.4) can be explicitly integrated. Actually if λ is a constant mapping we can reduce it by a similarity transformation to the block upper or lower triangular form and solve the corresponding Riccati-type equation. The solution of the initial equation is obtained then by some algebraic calculations.

3.2. The case A = 0 and D = 0. Representing the mapping ψ in the form
$$\psi = \begin{pmatrix} \psi_{11} & \psi_{12} \\ \psi_{21} & \psi_{22} \end{pmatrix}$$
one easily sees that Eq. (2.7) is equivalent to the system
$$\frac{d\psi_{11}}{dx} = \psi_{12} C, \qquad \frac{d\psi_{12}}{dx} = \psi_{11} B, \tag{3.8}$$
$$\frac{d\psi_{21}}{dx} = \psi_{22} C, \qquad \frac{d\psi_{22}}{dx} = \psi_{21} B. \tag{3.9}$$

3.2.1. The case C = B. Consider the case C = B; that is certainly possible only if $n_1 = n_2$. In this case we can rewrite Eqs. (3.8) and (3.9) as
$$\frac{d(\psi_{11} + \psi_{12})}{dx} = (\psi_{11} + \psi_{12})B, \qquad \frac{d(\psi_{11} - \psi_{12})}{dx} = -(\psi_{11} - \psi_{12})B,$$
$$\frac{d(\psi_{22} + \psi_{21})}{dx} = (\psi_{22} + \psi_{21})B, \qquad \frac{d(\psi_{22} - \psi_{21})}{dx} = -(\psi_{22} - \psi_{21})B.$$
Hence, the solution of Eq. (2.7) with the initial condition $\psi(0) = I_n$ is given by
$$\psi = \frac{1}{2} \begin{pmatrix} F + H & F - H \\ F - H & F + H \end{pmatrix},$$
where
$$F(x) = P \exp\left(\int_0^x B(x')\,dx'\right), \qquad H(x) = P \exp\left(-\int_0^x B(x')\,dx'\right).$$
The solution of Eq. (2.7) with the initial condition of form (3.7) is
$$\psi = \frac{1}{2} \begin{pmatrix} F + H + m(F - H) & F - H + m(F + H) \\ F - H & F + H \end{pmatrix};$$
therefore, the general solution to the Riccati-type equation under consideration can be written as
$$U = (F + H + m(F - H))^{-1} (F - H + m(F + H)).$$

3.2.2. The case of constant B and C. As we noted above, the general solution to the Riccati-type equations (3.4) for the case of constant mapping λ can be obtained by a reduction of λ to the block upper or lower triangular form. Nevertheless, it is interesting to consider the particular case of the constant λ when the general solution has the most simple form. Suppose that $n_1 = n_2$ and that B and C are constant nondegenerate matrices. In this case the solution of Eq. (2.7) with the initial condition $\psi(0) = I_n$ is
$$\psi(x) = \begin{pmatrix} \cosh(\sqrt{BC}\,x) & \sinh(\sqrt{BC}\,x) \sqrt{BC}\, C^{-1} \\ \sinh(\sqrt{CB}\,x) \sqrt{CB}\, B^{-1} & \cosh(\sqrt{CB}\,x) \end{pmatrix},$$
and for the general solution one has
$$U(x) = \left(\cosh(\sqrt{BC}\,x) + m \sinh(\sqrt{CB}\,x) \sqrt{CB}\, B^{-1}\right)^{-1} \left(\sinh(\sqrt{BC}\,x) \sqrt{BC}\, C^{-1} + m \cosh(\sqrt{CB}\,x)\right).$$
It should be noted here that the expression for U(x) does not actually contain square roots of matrices, as can be easily seen from the corresponding expansions into the power series.
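The closed-form solutions of this section can be checked against Eq. (3.4) in the scalar case $n_1 = n_2 = 1$ with constant coefficients. The sketch below is an illustration under those assumptions only (the function names and the finite-difference check are not from the paper). For B = 0 it uses $Q = e^{Ax}$, $R = e^{Dx}$ and $S = C(e^{(D-A)x} - 1)/(D - A)$ with D ≠ A; for A = D = 0 it uses the hyperbolic formula with $k = \sqrt{BC}$ and B, C > 0.

```python
import math


def u_case_b_zero(x, A, C, D, m):
    """U = Q^{-1} (1 + m S)^{-1} m R for B = 0, scalar constants, D != A."""
    Q = math.exp(A * x)
    R = math.exp(D * x)
    S = C * (math.exp((D - A) * x) - 1.0) / (D - A)
    return m * R / (Q * (1.0 + m * S))


def u_case_const_bc(x, B, C, m):
    """General solution for A = D = 0 with constant scalars B, C > 0."""
    k = math.sqrt(B * C)
    num = math.sinh(k * x) * k / C + m * math.cosh(k * x)
    den = math.cosh(k * x) + m * math.sinh(k * x) * k / B
    return num / den


def residual(u, rhs, x, h=1e-5):
    """Central-difference estimate of dU/dx minus the right-hand side of (3.4)."""
    return (u(x + h) - u(x - h)) / (2 * h) - rhs(u(x))


if __name__ == "__main__":
    A, C, D, m = 0.3, 0.5, -0.2, 0.7
    u = lambda x: u_case_b_zero(x, A, C, D, m)
    print(residual(u, lambda v: -A * v + v * D - v * C * v, 0.5))
    B = 1.0
    w = lambda x: u_case_const_bc(x, B, C, m)
    print(residual(w, lambda v: B - v * C * v, 0.5))
```

Both residuals vanish up to the finite-difference error, and both solutions reduce to U(0) = m.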
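4. A Further Example

The next example is based on another Z-gradation of the Lie algebra gl(n, C). Here one represents n as the sum of three positive integers $n_1$, $n_2$ and $n_3$ and considers an element x of gl(n, C) as a 3×3 block matrix $(x_{rs})$ with $x_{rs}$ being an $n_r \times n_s$ matrix. The subspace $\mathfrak g_m$ is formed by the block matrices $x = (x_{rs})$ where only the blocks $x_{rs}$ with s − r = m are different from zero. Arbitrary elements $x_{<0}$, $x_0$ and $x_{>0}$ of the subalgebras $\mathfrak g_{<0}$, $\mathfrak g_0$ and $\mathfrak g_{>0}$ have the form
$$x_{<0} = \begin{pmatrix} 0 & 0 & 0 \\ (x_{<0})_{21} & 0 & 0 \\ (x_{<0})_{31} & (x_{<0})_{32} & 0 \end{pmatrix}, \quad x_0 = \begin{pmatrix} (x_0)_{11} & 0 & 0 \\ 0 & (x_0)_{22} & 0 \\ 0 & 0 & (x_0)_{33} \end{pmatrix}, \quad x_{>0} = \begin{pmatrix} 0 & (x_{>0})_{12} & (x_{>0})_{13} \\ 0 & 0 & (x_{>0})_{23} \\ 0 & 0 & 0 \end{pmatrix}.$$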
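The subgroups $G_{<0}$, $G_0$ and $G_{>0}$ are formed by the nondegenerate matrices
$$a_{<0} = \begin{pmatrix} I_{n_1} & 0 & 0 \\ (a_{<0})_{21} & I_{n_2} & 0 \\ (a_{<0})_{31} & (a_{<0})_{32} & I_{n_3} \end{pmatrix}, \quad a_0 = \begin{pmatrix} (a_0)_{11} & 0 & 0 \\ 0 & (a_0)_{22} & 0 \\ 0 & 0 & (a_0)_{33} \end{pmatrix}, \quad a_{>0} = \begin{pmatrix} I_{n_1} & (a_{>0})_{12} & (a_{>0})_{13} \\ 0 & I_{n_2} & (a_{>0})_{23} \\ 0 & 0 & I_{n_3} \end{pmatrix}.$$
[…] entering the Gauss decomposition of type (2.3):
$$\partial_i \psi_{>0}\, \psi_{>0}^{-1} = (\psi_{>0}\, \lambda_i\, \psi_{>0}^{-1})_{>0}. \tag{5.3}$$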
We call these equations multidimensional Riccati-type equations. The integration of Eqs. (5.3) is again reduced to the integration of linear system (5.1). The transformation (2.8), where χ is a mapping from $R^d$ to $G_0$, cannot be used now to get the Riccati-type equations with $\lambda_0 = 0$. Indeed, to this end we should solve the equations
$$\chi^{-1} \partial_i \chi = (\lambda_i)_0. \tag{5.4}$$
The integrability conditions for these equations do not in general follow from (5.2). However, for the case $(\lambda_i)_{>0} = 0$ relations (5.4) are a consequence of relations (5.2) and we can, with the help of transformation (2.8), reduce these equations to the case where $(\lambda_i)_0 = 0$. Note that in the multidimensional case it is again possible to use transformation (2.8), where χ is some solution of the Riccati-type equations, to reduce the equations to the case where $(\lambda_i)_{>0} = 0$. When $\lambda_i$ are constant mappings, conditions (5.2) imply that the matrices $\lambda_i$ commute. Here, by a similarity transformation, we can reduce $\lambda_i$ to a triangular form. In such a case, and not only for constant $\lambda_i$, the multidimensional Riccati-type equations can be integrated by a procedure similar to the one used in the one dimensional case.

As a concrete example consider the Lie group GL(n, C) with the gradation of its Lie algebra described in Sect. 3. Parametrising the mappings $\lambda_i$ as
$$\lambda_i = \begin{pmatrix} A_i & B_i \\ C_i & D_i \end{pmatrix}$$
and using for the mapping $\psi_{>0}$ parametrisation (3.3) we come to the following multidimensional Riccati-type equations:
$$\partial_i U = B_i - A_i U + U D_i - U C_i U. \tag{5.5}$$
When $A_i = 0$ and $B_i = 0$ conditions (5.2) become $\partial_i C_j - \partial_j C_i = 0$; hence, there exists a mapping S such that $C_i = \partial_i S$. Then, the general solution of Eqs. (5.5) has the form $U = (I_{n_1} + mS)^{-1} m$, where m is an arbitrary $n_1 \times n_2$ matrix.

6. Generalised WZNW Equations and Multidimensional Toda Equations

Consider the space $R^{2d}$ as a differential manifold and denote the standard coordinates on $R^{2d}$ by $z^{-i}$, $z^{+i}$, i = 1, ..., d. Let ψ be a mapping from $R^{2d}$ to the Lie group G, which satisfies the equations $\partial_{+j}(\psi^{-1} \partial_{-i} \psi) = 0$, that can be equivalently rewritten as
$$\partial_{-i}(\partial_{+j} \psi\, \psi^{-1}) = 0. \tag{6.1}$$
Here and in what follows we use the notations $\partial_{-i} = \partial/\partial z^{-i}$ and $\partial_{+j} = \partial/\partial z^{+j}$. In accordance with [16] we call Eqs. (6.1) the generalised WZNW equations. It is well known that the two dimensional Toda equations can be considered as reductions of the WZNW equations; for a review we refer the reader to the remarkable paper [14], and for the affine case to [3,15]. Let us show that in the multidimensional situation the appropriate reductions of the generalised WZNW equations give the multidimensional Toda equations recently proposed and investigated in [22]. It is clear that the g-valued mappings
$$\iota_{-i} = \psi^{-1}\partial_{-i}\psi, \qquad \iota_{+j} = -\partial_{+j}\psi\,\psi^{-1} \tag{6.2}$$
satisfy the relations ∂+j ι−i = 0,
∂−i ι+j = 0.
(6.3)
Moreover, the mappings ι−i and ι+i satisfy, by construction, the following zero curvature conditions: ∂−i ι−j − ∂−j ι−i + [ι−i , ι−j ] = 0,
∂+i ι+j − ∂+j ι+i + [ι+i , ι+j ] = 0.
(6.4)
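The reduction in question is realised by imposing on the mapping ψ the constraints
$$(\psi^{-1}\partial_{-i}\psi)_{<0} = c_{-i}, \qquad (\partial_{+i}\psi\,\psi^{-1})_{>0} = -c_{+i}, \tag{6.5}$$
where $c_{-i}$ and $c_{+i}$ are some fixed mappings taking values in the subspaces $g_{-1}$ and $g_{+1}$ respectively. In other words, one imposes the restrictions $(\iota_{-i})_{<0} = c_{-i}$ and $(\iota_{+i})_{>0} = c_{+i}$.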
From (6.3) and (6.4) it follows that we should consider only the mappings c−i and c+i which satisfy the conditions ∂+j c−i = 0, [c−i , c−j ] = 0,
∂−i c+j = 0, [c+i , c+j ] = 0.
(6.6) (6.7)
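Using the Gauss decomposition (2.3) we have
$$\psi^{-1}\partial_{-i}\psi = \psi_{>0}^{-1}\psi_0^{-1}(\psi_{<0}^{-1}\partial_{-i}\psi_{<0})\psi_0\psi_{>0} + \psi_{>0}^{-1}(\psi_0^{-1}\partial_{-i}\psi_0)\psi_{>0} + \psi_{>0}^{-1}\partial_{-i}\psi_{>0}.$$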
Taking into account the first equality of (6.5), one sees that −1 ∂−i ψ 0 one can compute $\eta > 0$ such that $\Phi(\eta) \ge \varepsilon$. The result holds for kernels that are bounded below, in the sense $B(z, \omega) \ge \nu > 0$. In a second paper [10], Carlen and Carvalho adapted their analysis to the case of the hard spheres potential, under some $L^\infty$-type assumptions on $f$. As an immediate consequence of (13) and (21), their result implies
$$-\frac{d}{dt} H(f|M^f) \ge \Phi\big(H(f|M^f)\big) \tag{22}$$
and this is enough to conclude that
$$H(f) \xrightarrow[t\to\infty]{} H(M^f). \tag{23}$$
But, by the Csiszár-Kullback inequality [18,29],
$$\|f - M^f\|_{L^1} \le \sqrt{2H(f|M^f)}, \tag{24}$$
and therefore (23) implies at once that $f(t)$ goes towards $M^f$ in $L^1$ norm. Moreover, given any number $\varepsilon > 0$ and an initial datum $f_0$, one can compute explicitly a time $T_\varepsilon(f_0)$ such that $\|f(t) - M^f\|_{L^1} \le \varepsilon$ for $t \ge T_\varepsilon(f_0)$. Several applications of this result have been given: in particular a rigorous hydrodynamic limit in the large for a model equation in plasma physics [11] and a proof of trend to equilibrium in the weak sense for the Boltzmann equation for hard spheres, in the case when $H(f_0) = \infty$ [1]. Unfortunately, the function $\Phi$ given by Carlen and Carvalho is very intricate (see [10], p. 754), which makes explicit computations rather difficult, even when moments of high order are finite. Moreover, it is not a priori clear that $\Phi$ possesses positive derivatives (even of high order) near the origin, and hence the rate of return to equilibrium predicted by (22) is very slow. It is therefore natural to ask whether the inequality (22) can be found to hold for a simple function $\Phi$ that does not grow too slowly (ideally, linearly) near the origin. In fact, in an older paper, Cercignani [15] conjectured that
$$D(f) \ge \lambda(f_0)\, H(f|M^f) \tag{25}$$
for some λ(f0 ) > 0 depending on the initial datum. Inequality (25) would imply an exponential decay towards equilibrium. Bobylev [5] proved that this conjecture cannot
G. Toscani, C. Villani
hold for Maxwellian molecules if $\lambda(f_0)$ depends only on the mass, momentum and energy of $f_0$. Indeed, he exhibited a family of initial data with the same moments up to order 2, for which the trend to equilibrium can be as slow as desired. Wennberg [51] arrived at the same conclusion in the case of hard potentials (6) with $0 < \gamma \le 1$, by a direct study of $D(f)$. Finally, in a very recent work [6], Bobylev and Cercignani proved the inequality (25) to be false, for all realistic potentials, even for functions that have an arbitrarily high number of moments close to the equilibrium value, and are very smooth and bounded below by a given Maxwellian. They conjecture that the only reasonable spaces of functions in which (25) may hold would be $L^p$ spaces with an inverse Maxwellian weight. A good theory of existence in such spaces is very far out of reach at the moment. In this paper, we shall derive a new bound of the form (20), with a function $\Phi$ which is at the same time much simpler and increases faster near the origin. Namely
$$D(f) \ge C_\varepsilon(f)\, H(f|M^f)^{1+\varepsilon}, \tag{26}$$
where $C_\varepsilon$ depends only on the cross-section, the quantities (12), and
$$\|f\|_{L^1_r \log L} \equiv \int_{\mathbb{R}^N} dv\, f(v)\,|\log f(v)|\,(1+|v|^2)^{r/2}, \tag{27}$$
$$\|f\|_{L^1_s} \equiv \int_{\mathbb{R}^N} f(v)\,(1+|v|^2)^{s/2} \tag{28}$$
for some $r(\varepsilon) > 2$, $s(\varepsilon) > 4$ that we shall compute, and $K$, $A$ such that
$$\forall v \in \mathbb{R}^N, \qquad f(v) \ge K e^{-A|v|^2}. \tag{29}$$
We note that the result by Carlen and Carvalho is slightly more general than ours, since in its most general version it does not require bounds in $L^1_r \log L \cap L^1_s$ but only in $L \log L \cap L^1_s$. Our conclusion holds for all kernels $B$ such that
$$B(z, \omega) \ge \psi(|z|) \left(\frac{|z\cdot\omega|}{|z|}\right)^{N-2} \tag{30}$$
for some smooth function $\psi > 0$ decaying at most algebraically at infinity. Consider for instance the case when
$$\psi(|z|) = (1+|z|)^\gamma, \qquad 0 < \gamma \le 1.$$
In this framework, Gustafsson [28] proved uniform (in time) boundedness of the solution to (1) in weighted $L^p$ spaces ($p > 1$), from which uniform bounds in $L^1_s \log L$ are easily extracted. In addition, it is known that one can choose fixed $K$ and $A$, depending only on the mass, energy and entropy of $f_0$, such that (29) holds for $f(t, \cdot)$ as soon as $t \ge t_0 > 0$ (see [37]). Consequently, our result implies that for this family of kernels, solutions to the Boltzmann equation decay towards the equilibrium in $L^1$ norm like $t^{-1/\varepsilon}$, for all $\varepsilon > 0$, as soon as $f_0 \in \cap_{s>0} L^p_s$. In fact, it is very likely that the norms of $f$ in $L^1_s \log L$ become finite for all positive time as soon as the initial entropy is finite, as happens for the moments [51].
Sharp Entropy Dissipation Bounds
We emphasize that the bound (26) is optimal, in the sense that, under our assumptions, the result simply does not hold for $\varepsilon = 0$, as shown by the previous discussion of Cercignani's conjecture. In the case where condition (30) fails to be true, we do not recover such a strong result as (26). Yet if the set of $(z, \omega)$ such that (30) is violated is of small measure, it is easy to adapt our method and get an explicit algebraic bound of the form $D(f) \ge C H(f|M)^\alpha$. In particular, this is true for all hard potentials (see Sect. 6). To our knowledge, this is the first result of algebraic decay with explicit bounds available for the Boltzmann equation. But we wish to point out that the inequality (26) is more interesting than just a statement concerning solutions of the spatially homogeneous Boltzmann equation: it is a general functional inequality, that could be applied in any context, in particular the spatially inhomogeneous Boltzmann equation, if suitable estimates on the solutions were known. In addition, our work suggests simple obstructions to the validity of (25), linked to the tails of the distribution $f$, that are generally found to be the most severe obstacle in rigorous proofs of decay to equilibrium (cf. [5] for instance). Also, the comparison of our results to the ones obtained by Desvillettes and the second author in [20,21] is a clear illustration of the physical fact that the tails of the distribution may be an obstacle to the trend to equilibrium in the case of the Boltzmann equation (which is a kind of jump process), but not in the case of related diffusion-type equations. Our approach is based upon several tools that have been known for a more or less long time in the context of the Boltzmann equation. The first one is the regularization by the so-called adjoint Ornstein-Uhlenbeck semigroup, i.e. the semigroup $(S_t)$ generated by the linear Fokker–Planck equation,
$$\partial_t f = \nabla\cdot\big(T\nabla f + f(v-u)\big), \qquad u \in \mathbb{R}^N,\ T > 0. \tag{31}$$
The class of distribution functions satisfying (12) is invariant under this semigroup. Moreover, solutions of (31) are smooth for all positive time, and converge towards $M^f$. Therefore, this semigroup gives a very convenient interpolation between $f$ and $M^f$. It plays a central role in the proof of Carlen and Carvalho, and also in our analysis. Our second tool is the use of the tensorial structure of the Boltzmann equation, and in particular the fact that the entropy dissipation (14) is a convex function of the tensor product $ff_*$. In fact, most of our study will take place in $\mathbb{R}^{2N}$, and it is only in the end that we shall go back to the one-variable $N$-dimensional space. Our third tool, going back at least to Boltzmann, and systematically used by Desvillettes [19], consists in the introduction of linear operators that "kill" functions with some symmetries. Typical examples are
$$\big[(v-v_*)\wedge(\nabla-\nabla_*)\big]\, U\!\left(v+v_*,\ \frac{|v|^2+|v_*|^2}{2}\right) = 0, \tag{32}$$
used by Boltzmann to prove that smooth solutions of (15) are Maxwellian (cf. [19]), or
$$\left(\frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2}\right)\left[\varphi\!\left(\frac{x-y}{\sqrt{2}}\right) + \varphi\!\left(\frac{x+y}{\sqrt{2}}\right)\right] = 0, \tag{33}$$
arising in the context of rescaled convolution (see for instance the work by Carlen [8] about the cases of equality in the logarithmic Sobolev inequality).
The last main ingredient of our proof is the so-called Landau collision operator,
$$Q_L(f,f) = \frac{\partial}{\partial v_i}\int dv_*\, a_{ij}(v-v_*)\left[f_*\frac{\partial f}{\partial v_j} - f\frac{\partial f_*}{\partial v_{*j}}\right], \tag{34}$$
where the symmetric matrix function $a_{ij}$ is defined by
$$a_{ij}(z) = \left[\delta_{ij} - \frac{z_i z_j}{|z|^2}\right]\Psi(|z|) \tag{35}$$
for some nonnegative function $\Psi$. Here we have adopted the convention of Einstein for implicit summation over repeated indices. The collision operator (34) bears much resemblance to (2), and is in fact obtained from it by a suitable asymptotic regime (see [43] and the references therein). None of these tools is new; but the main feature of our study is the way they are combined all together. For the sake of completeness, we shall recall in the next section all the material concerning Eqs. (31) and (34) that will be needed in the sequel. Let us end this introduction with some comments on entropy dissipation methods. These have proved to be very robust and apply to various contexts, in particular diffusion-type equations ([39,2]), where sharp rates of exponential decay towards equilibrium in relative entropy have been derived for linear and weakly nonlinear models. In the context of the Boltzmann equation, one of their essential features is monotonicity: namely, if $Q_1$ and $Q_2$ are two Boltzmann operators of the form (2), with cross sections $B_1 \ge B_2$, then, in view of (14), $D_1(f) \ge D_2(f)$. The advantage of this property was noted by Carlen and Carvalho: in view of it, all the results that are obtained in an algebraically simplified framework for a particular cross section $B_2$ automatically extend to all cross sections $B_1 \ge B_2$. Carlen, Gabetta and Toscani [12] recently proved exponential convergence to equilibrium with an explicit rate for the Boltzmann equation with Maxwellian molecules. The rate depends on both the tails and the smoothness of the solution, and is essentially optimal. But their method relies on Fourier analysis, and hence seems very difficult to extend to other kernels. There, the proof makes use of a Lyapunov functional (distance) introduced in [26], defined in terms of the Fourier transform, which plays the role of the classical entropy.
We note that the monotonicity properties of this entropy functional have been recently used to give a proof of uniqueness of the solution to (1) for true Maxwell molecules [41]. The organization of the paper is as follows. Section 2 is devoted to some preliminary material concerning the linear Fokker–Planck and the Landau equation. In Sect. 3, we study some symmetry properties enjoyed by the Boltzmann and the Fokker–Planck equation. In Sect. 4, we establish an integral representation of a lower bound for $D(f)$, based on regularization by $(S_t)$. In Sect. 5 we state and prove our main result, namely the lower bound (26). In Sect. 6, we show how to extend our results to various models, including the hard potentials with cut-off. In Sect. 7, we treat Kac's model as a variant. In Sect. 8, we make some remarks about the procedure of regularization of $D$ by the semigroup $(S_t)$, and show how it can be linked to the decay of the Fisher information along the solutions to the Boltzmann equation.
2. Preliminaries: Fokker–Planck and Landau Equations

From now on, unless otherwise stated, we shall consider only nonnegative distribution functions $f$ satisfying the normalization
$$\rho = 1, \qquad u = 0, \qquad T = 1, \tag{36}$$
where $\rho$, $u$ and $T$ are defined by (12). This class of functions is invariant under all the equations that we shall consider: Boltzmann, Fokker–Planck, Landau and Kac. Moreover, we shall denote by $M$ the corresponding Maxwellian distribution,
$$M(v) = \frac{1}{(2\pi)^{N/2}}\, e^{-\frac{|v|^2}{2}}. \tag{37}$$
Accordingly,
$$H(f|M) = \int_{\mathbb{R}^N} \frac{f}{M}\log\frac{f}{M}\, M = H(f) - H(M), \tag{38}$$
$$I(f|M) = 4\int_{\mathbb{R}^N} \left|\left(\nabla + \frac{v}{2}\right)\sqrt{f}\right|^2 = I(f) - I(M), \tag{39}$$
all these expressions being well-defined in $[0, \infty]$. Another form of (39) is $\int |\nabla\log(f/M)|^2 f$, at least when some smoothness is available for $f$. Let us recall the basic properties of $H(\cdot|M)$ and $I(\cdot|M)$. We refer to [13,14] and the references therein for complete proofs of the assertions below, and more material about the Fokker–Planck equation. Both the relative entropy and the relative Fisher information are strictly convex, weakly lower semicontinuous (for the $L^1$ topology for instance) functionals. For the relative entropy, this can be seen directly by the Legendre-type representation [22,13]
$$H(f|M) = \sup_{\varphi}\left\{\int f\varphi - \log\int e^{\varphi} M\right\}, \tag{40}$$
where the supremum is taken for instance over all continuous bounded functions $\varphi$. The basic link between $H(\cdot|M)$ and $I(\cdot|M)$ is given by the action of the adjoint Ornstein-Uhlenbeck semigroup, $(S_t)_{t\ge 0}$, which can be defined as the semigroup associated to the Fokker–Planck equation. With the conventions (36), the Fokker–Planck equation simply reads
$$\partial_t f = \nabla\cdot(\nabla f + fv) \equiv Lf. \tag{41}$$
The explicit solution of this equation is well-known, and thus
$$S_t f = f_{e^{-2t}} * M_{1-e^{-2t}}, \tag{42}$$
where we use the notation
$$g_\lambda(v) = \frac{1}{\lambda^{N/2}}\, g\!\left(\frac{v}{\sqrt{\lambda}}\right). \tag{43}$$
Note that $M$ is invariant under the action of this semigroup.
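As an informal sanity check of the explicit formula (42), the following sketch evaluates $f_{e^{-2t}} * M_{1-e^{-2t}}$ in one velocity dimension by quadrature and tests, with finite differences, that it satisfies (41). The two-bump density `f0`, the quadrature grid and all function names are illustrative assumptions, not taken from the paper.

```python
import math

def f0(x):
    # Hypothetical initial density: symmetric mixture of two Gaussians with
    # mass 1, mean 0 and variance 0.36 + 0.8**2 = 1 (normalization (36), N = 1).
    a, s2 = 0.8, 0.36
    norm = 1.0 / math.sqrt(2.0 * math.pi * s2)
    return 0.5 * norm * (math.exp(-(x - a) ** 2 / (2 * s2))
                         + math.exp(-(x + a) ** 2 / (2 * s2)))

def maxwellian(z, mu):
    # M_mu(z) = mu^{-1/2} M(z / sqrt(mu)): centred Gaussian of variance mu.
    return math.exp(-z * z / (2.0 * mu)) / math.sqrt(2.0 * math.pi * mu)

def semigroup(t, v, n=4000, cutoff=8.0):
    # (S_t f0)(v) = (f0_lam * M_{1-lam})(v) with lam = exp(-2t), for t > 0.
    # The substitution w = sqrt(lam) * x turns the convolution into
    #   int f0(x) M_{1-lam}(v - sqrt(lam) x) dx,
    # approximated by the trapezoidal rule on [-cutoff, cutoff].
    lam = math.exp(-2.0 * t)
    h = 2.0 * cutoff / n
    total = 0.0
    for j in range(n + 1):
        x = -cutoff + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * f0(x) * maxwellian(v - math.sqrt(lam) * x, 1.0 - lam)
    return total * h

def pde_residual(t, v, eps=1e-3):
    # Residual of d_t u = u_vv + (v u)_v = u_vv + v u_v + u, central differences.
    u = semigroup(t, v)
    up, um = semigroup(t, v + eps), semigroup(t, v - eps)
    u_t = (semigroup(t + eps, v) - semigroup(t - eps, v)) / (2 * eps)
    u_v = (up - um) / (2 * eps)
    u_vv = (up - 2 * u + um) / eps ** 2
    return u_t - (u_vv + v * u_v + u)
```

The residual should vanish up to discretization error, which is consistent with the probabilistic reading of (42) as the law of $e^{-t}X + (1-e^{-2t})^{1/2}W$.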
It is well known that $t \mapsto S_t f$ is continuous for the strong $L^1$ topology. Moreover, assuming that $H(f) < \infty$, the function $t \mapsto H(S_t f|M)$ belongs to $C([0, \infty)) \cap C^1(0, \infty)$, and for all $t > 0$,
$$\frac{d}{dt} H(S_t f|M) = -I(S_t f|M). \tag{44}$$
In particular, $H(S_t f|M)$ is decreasing with $t$. What is less known is the fact that $I(S_t f|M)$ is also decreasing. The rate of decay has been found in [39]:
$$I(S_t f|M) \le I(f|M)\, e^{-2t}. \tag{45}$$
This inequality can also be considered as a direct consequence of the so-called Blachman–Stam inequality (cf. [13]), and follows as well from the important work of Bakry and Emery, based on the so-called $\Gamma_2$ calculus [4]. We shall use this decay property in the study of the entropy production for the Kac equation. Moreover, $S_t f \to M$ in relative entropy as $t \to \infty$. More precisely [39],
$$H(S_t f|M) \le H(f|M)\, e^{-2t}. \tag{46}$$
As a consequence,
$$H(f|M) = \int_0^\infty dt\, I(S_t f|M). \tag{47}$$
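The relations (44)–(47) can be illustrated in closed form on a one-dimensional centred Gaussian. The sketch below assumes $f = N(0, \sigma_0^2)$ with $\sigma_0^2 \ne 1$; this datum does not satisfy $T = 1$ in (36) and is used only because, by (42), $S_t f$ stays Gaussian with variance $e^{-2t}\sigma_0^2 + 1 - e^{-2t}$. The function names and the truncation of the time integral at `t_max` are assumptions of the example.

```python
import math

def variance(t, var0):
    # Variance of S_t f for f = N(0, var0), read off from (42).
    return math.exp(-2.0 * t) * var0 + 1.0 - math.exp(-2.0 * t)

def rel_entropy(var):
    # H(f|M) for f = N(0, var) and M = N(0, 1).
    return 0.5 * (var - 1.0 - math.log(var))

def rel_fisher(var):
    # I(f|M) = int |d/dv log(f/M)|^2 f = (var - 1)^2 / var.
    return (var - 1.0) ** 2 / var

def integral_of_fisher(var0, t_max=20.0, n=20000):
    # Trapezoidal approximation of int_0^{t_max} I(S_t f|M) dt, cf. (47).
    h = t_max / n
    vals = [rel_fisher(variance(j * h, var0)) for j in range(n + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def entropy_derivative(t, var0, eps=1e-5):
    # Central finite difference of t -> H(S_t f|M), to be compared with (44).
    return (rel_entropy(variance(t + eps, var0))
            - rel_entropy(variance(t - eps, var0))) / (2 * eps)
```

For instance, with `var0 = 3.0` one can compare `entropy_derivative(t, var0)` with `-rel_fisher(variance(t, var0))`, and `integral_of_fisher(var0)` with `rel_entropy(var0)`.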
We shall also use the fact that the Fokker–Planck equation propagates moments. Namely, for any $s > 0$ and any datum $f \in L^1_s$, for all $t > 0$,
$$\|S_t f\|_{L^1_s} \le \max(1, 2^{s-1})\left[e^{-st}\|f\|_{L^1_s} + (1-e^{-2t})^{s/2}\|M\|_{L^1_s}\right]. \tag{48}$$
This can be easily seen by remarking that (42) is nothing more than the probability density of the random variable
$$X_t = e^{-t}X + (1-e^{-2t})^{1/2}\,W,$$
where $X$ has density $f$, and $W$ is a normalized Gaussian variable independent of $X$. Consequently,
$$\|S_t f\|_{L^1_s} = E\left[(1+X_t^2)^{s/2}\right],$$
$E$ denoting mathematical expectation. On the other hand,
$$\left[1 + \left(e^{-t}X + (1-e^{-2t})^{1/2}W\right)^2\right]^{s/2} \le \left[e^{-t}(1+X^2)^{1/2} + (1-e^{-2t})^{1/2}(1+W^2)^{1/2}\right]^s$$
and (48) follows from the inequality $(x^2+y^2)^s \le \max(1, 2^{s-1})(x^{2s} + y^{2s})$. For the moments of order 2, more can be said: all the quantities of the form
$$P_{ij}(S_t f) = \int_{\mathbb{R}^N} dv\, (S_t f)(v)\, v_i v_j$$
behave monotonically and converge exponentially fast to their equilibrium value $\delta_{ij}$ as $t \to \infty$. Finally, the semigroup $(S_t)$ has smoothing effects: for all $t > 0$, $S_t f$ is $C^\infty$ and such that
$$|\log S_t f(v)| \le C_t (1+|v|^2) \tag{49}$$
for some constant $C_t$ depending on $t$. A proof of (49) can be found in [9]. Let us now prove a less known propagation property.

Proposition 1. Let $s > 0$, $\varepsilon > 0$, and let $f \ge 0$ be such that $\|f\|_{L^1_{s+\varepsilon}} < \infty$ and $\|f\|_{L^1_s \log L} < \infty$. Then there exists $C_s$ depending only on $s$, $\varepsilon$, $\|f\|_{L^1_s \log L}$ and $\|f\|_{L^1_{s+\varepsilon}}$, such that for all $t > 0$, $\|S_t f\|_{L^1_s \log L} \le C_s$.

Proof. Here as in the sequel, we will denote by $C$, $C_s$ various constants. We note first that it suffices to obtain a uniform bound on
$$L_s = \int dv\, f(v)\log f(v)\,(1+|v|^2)^{s/2}.$$
Indeed,
$$\int f|\log f|(1+|v|^2)^{s/2} \le \int f\log f\,(1+|v|^2)^{s/2} + 2\int_{f\le 1} f|\log f|(1+|v|^2)^{s/2}. \tag{50}$$
But, owing to the inequality $x|\log x| \le y - x\log y$ that holds for all $0 \le x \le 1$, $0 < y \le 1$, we see that if $f \le 1$, for any $\varepsilon > 0$,
$$f|\log f| \le e^{-(1+|v|^2)^{\varepsilon/2}} + (1+|v|^2)^{\varepsilon/2} f,$$
and this implies
$$\int_{f\le 1} f|\log f|(1+|v|^2)^{s/2} \le d_{s,\varepsilon} + \|f\|_{L^1_{s+\varepsilon}}, \tag{51}$$
where we put
$$d_{s,\varepsilon} = \int_{\mathbb{R}^N} (1+|v|^2)^{s/2}\, e^{-(1+|v|^2)^{\varepsilon/2}}\, dv.$$
Since the moments of order $s + \varepsilon$ are uniformly propagated by $(S_t)$, it is sufficient to consider the first integral in the right hand side of (50), whence our claim. For all $t > 0$, $S_t f$ is smooth and $|\log f|$ is quadratically bounded. Moreover, the mapping $t \mapsto L_s(S_t f)$ is continuous (see related arguments in Sect. 5) and continuously differentiable for $t > 0$. Let us compute its derivative. In the sequel, we use the abridged
notation $f = f(t)$ for $S_t f$. Integrating systematically by parts in all the integrals where only one derivative of $f$ enters, we easily obtain
$$\int \nabla\cdot(\nabla f + fv)\,\log f\,(1+|v|^2)^{s/2} = -\int \frac{|\nabla f|^2}{f}(1+|v|^2)^{s/2} + \int f\log f\left[\Delta(1+|v|^2)^{s/2} - v\cdot\nabla(1+|v|^2)^{s/2}\right] + \int f\left\{\nabla\cdot\left[v(1+|v|^2)^{s/2}\right] - \Delta(1+|v|^2)^{s/2}\right\}$$
and
$$\int \nabla\cdot(\nabla f + fv)\,(1+|v|^2)^{s/2} = -\int f\, v\cdot\nabla(1+|v|^2)^{s/2} + \int f\,\Delta(1+|v|^2)^{s/2}.$$
Hence
$$\frac{d}{dt}\int f\log f\,(1+|v|^2)^{s/2} = -\int \frac{|\nabla f|^2}{f}(1+|v|^2)^{s/2} + \int f\log f\,(1+|v|^2)^{s/2}\left[\frac{Ns}{1+|v|^2} + \frac{s(s-2)|v|^2}{(1+|v|^2)^2} - \frac{s|v|^2}{1+|v|^2}\right] + N\|f\|_{L^1_s}.$$
Now, by (51),
$$\int f\log f\,(1+|v|^2)^{s/2}\left[\frac{Ns}{1+|v|^2} + \frac{s(s-2)|v|^2}{(1+|v|^2)^2} - \frac{s|v|^2}{1+|v|^2}\right] \le (Ns + s(s-2))\int_{f\ge 1} f\log f\,(1+|v|^2)^{s/2} - s\int_{f\le 1} f\log f\,(1+|v|^2)^{s/2}$$
$$\le (Ns + s(s-2))\int f\log f\,(1+|v|^2)^{s/2} + (Ns + s(s-1))\left[d_{s,\varepsilon} + \|f\|_{L^1_{s+\varepsilon}}\right],$$
and we obtain
$$\frac{d}{dt}\int f\log f\,(1+|v|^2)^{s/2} \le -4\int |\nabla\sqrt{f}|^2(1+|v|^2)^{s/2} + (Ns + s(s-2))\int f\log f\,(1+|v|^2)^{s/2} + (Ns + s(s-1))\left[d_{s,\varepsilon} + \|f\|_{L^1_{s+\varepsilon}}\right] + N\|f\|_{L^1_s}.$$
Next, since
$$\nabla\sqrt{f}\,(1+|v|^2)^{s/4} = \nabla\left[\sqrt{f}\,(1+|v|^2)^{s/4}\right] - \frac{s}{2}\sqrt{f}\, v\,(1+|v|^2)^{s/4-1},$$
we write
$$4\int |\nabla\sqrt{f}|^2(1+|v|^2)^{s/2} = 4\int\left|\nabla\left[\sqrt{f}\,(1+|v|^2)^{s/4}\right]\right|^2 + s^2\int f|v|^2(1+|v|^2)^{s/2-2} - 4s\int \nabla\left[\sqrt{f}\,(1+|v|^2)^{s/4}\right]\cdot\sqrt{f}\,(1+|v|^2)^{s/4}\,\frac{v}{1+|v|^2}$$
$$= I\left(f(1+|v|^2)^{s/2}\right) + s^2\int f|v|^2(1+|v|^2)^{s/2-2} + 2s\int f(1+|v|^2)^{s/2}\,\nabla\cdot\left(\frac{v}{1+|v|^2}\right),$$
where we have used the identity $2(\nabla\sqrt{g})\sqrt{g} = \nabla g$ and integrated by parts in the last integral. By Gross's logarithmic Sobolev inequality [27], written with respect to the Lebesgue measure, for all functions $g \in H^1(\mathbb{R}^N)$, and for any $a > 0$,
$$\int_{\mathbb{R}^N} dv\, |g|^2\log\left(|g|^2/\|g\|^2_{L^2(\mathbb{R}^N)}\right) + \left(N + \frac{N}{2}\log 2\pi a\right)\int_{\mathbb{R}^N} dv\, |g|^2 \le 2a\int_{\mathbb{R}^N} dv\, |\nabla g|^2.$$
Hence, choosing $a = [4Ns + 4s(s-2)]^{-1}$ we obtain
$$I\left(f(1+|v|^2)^{s/2}\right) \ge 8(Ns + s(s-2))\left\{\int f\log f\,(1+|v|^2)^{s/2} + \int f(1+|v|^2)^{s/2}\log(1+|v|^2)^{s/2} - \|f\|_{L^1_s}\log\|f\|_{L^1_s} + \left(N + \frac{N}{2}\log\left[\pi(2Ns + 2s(s-2))^{-1}\right]\right)\|f\|_{L^1_s}\right\}.$$
Grouping together all the previous inequalities, we conclude that
$$\frac{d}{dt}\int f\log f\,(1+|v|^2)^{s/2} \le -8(Ns + s(s-2))\int f\log f\,(1+|v|^2)^{s/2} + C,$$
where $C$ depends on $N$, $s$, $\|f\|_{L^1_{s+\varepsilon}}$ in an explicitly computable way. By (48), $\|f\|_{L^1_{s+\varepsilon}}$ is bounded uniformly in $t$, and this implies a uniform bound for $L_s(f(t))$. □

All these properties would suffice to ensure that $(S_t)_{t\ge 0}$ gives a very convenient way of smoothing densities in the frame of the Boltzmann equation. This becomes still clearer in view of the work by Morgenstern [35] upon the case of Maxwellian potentials with cut-off. For these potentials, under a suitable normalization of $B$, the Boltzmann equation simply reads
$$\partial_t f = Q^+(f,f) - f = \int dv_*\, d\sigma\, b(k\cdot\sigma)\, f'f'_* - f, \tag{52}$$
where k = (v − v∗ )/|v − v∗ |. First Morgenstern in dimension two, and subsequently Bobylev in any dimension of the velocity variable [5] proved that if Q+ is given by (52), then for all δ > 0, Q+ (f ∗ Mδ , f ∗ Mδ ) = Q+ (f, f ) ∗ Mδ .
(53)
Since $Q^+$ also commutes with the rescaling (43), it follows that $Q^+(S_t f, S_t f) = S_t Q^+(f, f)$, and as a consequence that the semigroup induced by the Boltzmann equation with Maxwellian molecules commutes with the adjoint Ornstein-Uhlenbeck semigroup. This property was crucial in the analysis of [9] (see also [47]). Several other symmetry properties connecting the Boltzmann equation and the Fokker–Planck equation will be studied in the next section (one can safely assume that we overlooked many others). The Boltzmann equation with Maxwellian molecules enjoys many remarkable properties, reminiscent of the Fokker–Planck equation. In particular, $I(f|M)$ is decreasing along its solutions. This was proven in [38] for $N = 2$, in [9] in the case when $b$ is constant, and in [47] in the general case. As we shall see in Sect. 8, this decreasing property can be related to the problem of finding a lower bound for $D(f)$. This point had already been noted (by a different argument) by Carlen and Carvalho. These peculiar properties of Maxwellian molecules may be somewhat illuminated (or obscured!) by the study of the so-called asymptotics of grazing collisions. This is a limiting process under which the Boltzmann equation transforms into a nonlinear diffusion-type equation, called the Landau (or sometimes Fokker–Planck!) equation. This limit was discovered from the formal point of view by Landau [30] in the study of plasmas, in the frame of the Coulomb potential. From the mathematical point of view, a very wide class of potentials can be considered. We refer to [43] for a detailed study, and further references on the subject. See also [45] for a rigorous variant of Landau's original argument. It turns out that in the case of Maxwellian molecules, the corresponding Landau equation resembles very much the Fokker–Planck equation. In [44] the following representation was established:
$$\partial_t f = \sum_i (N - T_i)\,\partial_{ii} f + (N-1)\nabla\cdot(fv) + \Delta_S f, \tag{54}$$
where an orthonormal basis $(e_i)$ has been chosen such that
$$\int_{\mathbb{R}^N} f(v)\, v_i v_j\, dv = \delta_{ij} T_i, \tag{55}$$
which is always possible by diagonalization of the quadratic form
$$q : e \longmapsto \int f(v)\,(v, e)^2\, dv \qquad (v \in \mathbb{R}^N).$$
The condition (55) is preserved by Eq. (54). $T_i$ will be called the directional temperature of $f$ along the direction $e_i$. Here $\Delta_S$ denotes the Laplace-Beltrami operator,
$$\Delta_S f(v) = \sum_{ij}\left(|v|^2\delta_{ij} - v_i v_j\right)\partial_{ij} f(v) - (N-1)\, v\cdot\nabla f(v).$$
In particular, in the case of radial distributions, the Landau equation (54) coincides (up to a multiplicative factor $N - 1$) with the Fokker–Planck equation. The questions of existence, uniqueness, asymptotic behaviour and some qualitative properties of solutions to Eq. (54) have been addressed in [44]. The Landau equation with Maxwellian molecules shares many properties with both the Boltzmann equation and the Fokker–Planck equation (see [46] for instance). We shall be essentially interested in the associated entropy dissipation,
$$D_L(f) = 2\int dv\, dv_*\, |v-v_*|^2\, \Pi(v-v_*)\,(\nabla-\nabla_*)\sqrt{ff_*}\,(\nabla-\nabla_*)\sqrt{ff_*}, \tag{56}$$
where $\nabla_*$ denotes the gradient with respect to the variable $v_*$, and $\Pi(z)$ is the orthogonal projection upon the space orthogonal to $z$,
$$\Pi_{ij}(z) = \delta_{ij} - \frac{z_i z_j}{|z|^2} \tag{57}$$
(we use the standard notation $Axx = A_{ij}x_i x_j$). We refer to [21] for a study of the functional $D_L$. All that we shall use here is the inequality
$$D_L(f) \ge \min_{1\le i\le N}(N - T_i)\, I(f|M) = (N - T_f)\, I(f|M), \tag{58}$$
where
$$T_f = \max_{e\in S^{N-1}}\int_{\mathbb{R}^N} dv\, f(v)\,(v, e)^2. \tag{59}$$
The inequality (58), established by Desvillettes and the second author, is clearly reminiscent of formula (44). See [21] for complete proofs, and many applications to the trend towards equilibrium in the general case of the Landau equation with hard potentials. Our study will reveal an unexpected connection between the functionals $D$ and $D_L$, and the semigroup $(S_t)$, which will allow us to derive a lower bound for $D$, starting from the bound (58). To establish this connection, we have to study more precisely the symmetries of the equations.

3. Symmetries for Boltzmann and Fokker–Planck Equations

In this section, we give the symmetry properties that will make it possible to regularize $D$ by the Ornstein-Uhlenbeck semigroup. We begin with an equivalent representation of (2), obtained by the classical change of variables $(v, v_*, \omega) \to (v, v_*, \sigma)$, such that
$$\begin{cases} v' = \dfrac{v+v_*}{2} + \dfrac{|v-v_*|}{2}\,\sigma, \\[2mm] v'_* = \dfrac{v+v_*}{2} - \dfrac{|v-v_*|}{2}\,\sigma. \end{cases} \tag{60}$$
In these variables,
$$Q(f,f) = \int_{\mathbb{R}^N} dv_* \int_{S^{N-1}} d\sigma\, \widetilde{B}(v-v_*, \sigma)\left(f'f'_* - ff_*\right), \tag{61}$$
where $\widetilde{B}(v-v_*, \sigma) = (2|k\cdot\omega|)^{N-2} B(v-v_*, \omega)$, and the notation
$$k = \frac{v-v_*}{|v-v_*|}$$
will be systematically used in the sequel. We recall that $d\omega$ and $d\sigma$ are normalized in such a way that $\int d\omega = \int d\sigma = 1$. We also set
$$X' = (v', v'_*) = T_\omega X = U_\sigma X, \qquad X = (v, v_*), \tag{62}$$
where $U_\sigma$ is associated to the transformation (60). Note that for fixed $\sigma$, $U_\sigma : \mathbb{R}^{2N} \to \mathbb{R}^{2N}$ is not bijective. For any function $G(X)$, we write
$${}^tT_\omega G = G\circ T_\omega, \qquad {}^tU_\sigma G = G\circ U_\sigma.$$
We now state a lemma which is due to Boltzmann himself.

Lemma 1. Let $f \in L^1(\mathbb{R}^N)$. Then the average
$$G(v, v_*) = \int_{S^{N-1}} d\sigma\, f'f'_* \tag{63}$$
depends only on $m = v + v_*$ and $e = (|v|^2 + |v_*|^2)/2$. More generally, this result holds for any average of the form
$$G(X) = \int_{S^{N-1}} d\sigma\, {}^tU_\sigma F(X), \tag{64}$$
where $F \in L^1(\mathbb{R}^{2N})$.

The proof is immediate, since, in view of (60), the average (63) depends only on the sphere with center $q = (v + v_*)/2$ and radius $r = |v - v_*|/2$. The set of $(N + 1)$ scalar variables $(q, r)$ is clearly equivalent to $(m, e)$. We note that $G \in L^1(\mathbb{R}^{2N})$ since for any $\varphi \in L^\infty$,
$$\int_{\mathbb{R}^{2N}} G\varphi = \int_{\mathbb{R}^{2N}\times S^{N-1}} d\sigma\, dX\, {}^tU_\sigma F(X)\,\varphi(X) = \int_{\mathbb{R}^{2N}} dX\, F(X)\int_{S^{N-1}} d\sigma\, {}^tU_\sigma\varphi(X) \le \|F\|_{L^1}\|\varphi\|_{L^\infty}.$$
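Lemma 1 is easy to test numerically in dimension $N = 2$. The sketch below approximates the average (63) by the trapezoidal rule on the circle and compares its value at two velocity pairs sharing the same $m$ and $e$. The non-radial test function `f`, the helper `rotated_pair` and the number of quadrature nodes are assumptions made for illustration only.

```python
import math

def f(v):
    # Hypothetical non-radial, unnormalised test density on R^2.
    x, y = v
    return math.exp(-((x - 0.3) ** 2 + 2.0 * y * y)) * (1.0 + 0.5 * x * x)

def sigma_average(v, vs, n=720):
    # Average of f(v') f(v'_*) over sigma in S^1, with (v', v'_*) given by (60).
    cx, cy = (v[0] + vs[0]) / 2, (v[1] + vs[1]) / 2
    rho = math.hypot(v[0] - vs[0], v[1] - vs[1]) / 2
    total = 0.0
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        sx, sy = math.cos(theta), math.sin(theta)
        v_prime = (cx + rho * sx, cy + rho * sy)
        vs_prime = (cx - rho * sx, cy - rho * sy)
        total += f(v_prime) * f(vs_prime)
    return total / n

def rotated_pair(v, vs, angle):
    # Another pair with the same m = v + v_* and e = (|v|^2 + |v_*|^2)/2:
    # rotate the relative velocity v - v_* and keep the centre of mass fixed.
    cx, cy = (v[0] + vs[0]) / 2, (v[1] + vs[1]) / 2
    dx, dy = (v[0] - vs[0]) / 2, (v[1] - vs[1]) / 2
    c, s = math.cos(angle), math.sin(angle)
    rx, ry = c * dx - s * dy, s * dx + c * dy
    return (cx + rx, cy + ry), (cx - rx, cy - ry)
```

Calling `sigma_average` on a pair and on its image under `rotated_pair` should return the same value up to quadrature error, although the products $f(v)f(v_*)$ themselves differ in general.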
Now, the heart of the matter lies in the following property. We denote by $T$ the operation of tensor product, and by $A$ the average operation (64). Moreover, we use the same symbol $S_t$ for the action of the adjoint Ornstein-Uhlenbeck semigroup in $\mathbb{R}^N$ and in $\mathbb{R}^{2N}$. When no confusion is possible, we also use the symbol $M$ for the Maxwellian in $2N$ variables: $M(X) = M(v)M(v_*)$.

Proposition 2. The diagram
$$\begin{array}{ccccc} f & \xrightarrow{\;T\;} & F = ff_* & \xrightarrow{\;A\;} & G = \int d\sigma\, f'f'_* \\ \downarrow{\scriptstyle S_t} & & \downarrow{\scriptstyle S_t} & & \downarrow{\scriptstyle S_t} \\ S_t f & \xrightarrow{\;T\;} & S_t F & \xrightarrow{\;A\;} & S_t G \end{array} \tag{65}$$
is commutative.
Remark. Let $D_N$ be the set of functions in $L^1(\mathbb{R}^N)$ satisfying conditions (36). Then $T$ maps $D_N$ into $D_{2N}$, and $A$ maps $D_{2N}$ into itself.

Proof. We first prove that the action of $S_t$ commutes with the tensorization $T$. Since $S_t$ is the composition of a convolution by a Maxwellian distribution $M_{\lambda(t)}$ and a rescaling of the velocity space, it is sufficient to check the property for these two operations. Since for all $\mu > 0$, $\mu X = (\mu v, \mu v_*)$, obviously $(ff_*)_\lambda = (f_\lambda)(f_\lambda)_*$, which proves the second part of the proposition, and shows at the same time that we only need to consider the convolution by $M$ instead of $M_\lambda$. On the other hand,
$$M(X) = M(v)M(v_*), \tag{66}$$
and this directly implies that for every function $g$, $(M * g)(M * g)_* = M * (gg_*)$. We now prove that in $\mathbb{R}^{2N}$, $S_t$ commutes with $A$. Since $X \mapsto U_\sigma(X)$ is homogeneous of degree one, the rescaling (43) commutes with $A$. Therefore, we just have to check that $A$ also commutes with the convolution by $M$. Let us set
$$q = \frac{w + w_*}{2}, \qquad r = |w - w_*|, \qquad \ell = \frac{w - w_*}{|w - w_*|}.$$
Thus,
$$w = q + \frac{r}{2}\ell, \qquad w_* = q - \frac{r}{2}\ell, \qquad w' = q + \frac{r}{2}\sigma, \qquad w'_* = q - \frac{r}{2}\sigma.$$
Then, for any function $F(v, v_*)$,
$$M * \int d\sigma\, {}^tU_\sigma F = J\int dq\, r^{N-1}dr\, d\ell\, d\sigma\, F\!\left(q + \frac{r}{2}\sigma,\ q - \frac{r}{2}\sigma\right) M\!\left(v - q - \frac{r}{2}\ell,\ v_* - q + \frac{r}{2}\ell\right), \tag{67}$$
where $J$ denotes some Jacobian (remember that $d\sigma$ is the normalized measure on $S^{N-1}$). On the other hand,
$$\int d\sigma\, {}^tU_\sigma(M * F) = J\int dq\, r^{N-1}dr\, d\ell\, d\sigma\, F\!\left(q + \frac{r}{2}\ell,\ q - \frac{r}{2}\ell\right) M\!\left(v' - q - \frac{r}{2}\ell,\ v'_* - q + \frac{r}{2}\ell\right). \tag{68}$$
Exchanging $\ell$ and $\sigma$ in (67), we see that we only have to prove that for all $q, r, \ell, v, v_*$,
$$\int d\sigma\, M\!\left(v - q - \frac{r}{2}\sigma,\ v_* - q + \frac{r}{2}\sigma\right) = \int d\sigma\, M\!\left(\frac{v+v_*}{2} + \frac{|v-v_*|}{2}\sigma - q - \frac{r}{2}\ell,\ \frac{v+v_*}{2} - \frac{|v-v_*|}{2}\sigma - q + \frac{r}{2}\ell\right). \tag{69}$$
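Changing $v$ into $v - q$ and $v_*$ into $v_* - q$, we reduce to the case when $q = 0$. Using now the property
$$M(z, z_*) = M\!\left(\frac{z + z_*}{\sqrt{2}}\right) M\!\left(\frac{z - z_*}{\sqrt{2}}\right), \tag{70}$$
we let $M((v + v_*)/\sqrt{2})$ appear as a multiplicative factor of both sides, and we just have to prove that
$$\int d\sigma\, M\!\left(\frac{v - v_* - r\sigma}{\sqrt{2}}\right) = \int d\sigma\, M\!\left(\frac{|v - v_*|\sigma - r\ell}{\sqrt{2}}\right).$$
Up to a constant, the left-hand side is
$$e^{-\frac{|v-v_*|^2}{4}}\, e^{-\frac{r^2}{4}}\int d\sigma\, e^{-\frac{r}{2}(v-v_*)\cdot\sigma},$$
while the right hand side is
$$e^{-\frac{|v-v_*|^2}{4}}\, e^{-\frac{r^2}{4}}\int d\sigma\, e^{-\frac{r}{2}|v-v_*|\,\ell\cdot\sigma}.$$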
Since, by rotational invariance, $\int d\sigma\, e^{-r\ell\cdot\sigma}$ does not depend on $\ell \in S^{N-1}$, the conclusion follows. □

Corollary 2.1. Let $G(X)$ depend only on $m = v + v_*$ and $e = (|v|^2 + |v_*|^2)/2$. Then, for all $t > 0$, $S_t G$ depends only on $m$ and $e$.

In fact, a direct proof of this corollary is immediate: by density and linearity, it is sufficient to consider only the case when $G(X) = G_1((v + v_*)/2)\,G_2(|v - v_*|/2)$. In view of (70), $S_t G = (S_t G_1)(S_t G_2)$. Since $G_2$ is radial by assumption, and since $M$ is radial, so is $S_t G_2$, which completes the proof. In the case $N = 2$, more can be said. In the representation (61), one can take as a new variable the angle $\theta$ between $\sigma$ and $k$. Thus the transformation $X \mapsto X'$ can be seen as a rotation in $\mathbb{R}^2$, or, more precisely
$$\left(\frac{v + v_*}{2},\ v - v_*\right) \longmapsto \left(\frac{v + v_*}{2},\ R_\theta(v - v_*)\right), \tag{71}$$
where $R_\theta$ denotes the standard rotation by angle $\theta$ in oriented $\mathbb{R}^2$. By extension, we shall denote by $R_\theta$ the application given by (71).

Proposition 3. Assume $N = 2$. Then for each $\theta \in \mathbb{R}/(2\pi\mathbb{Z})$, the diagram
$$\begin{array}{ccccc} f & \xrightarrow{\;T\;} & F = ff_* & \xrightarrow{\;{}^tR_\theta\;} & {}^tR_\theta F = f'f'_* = F' \\ \downarrow{\scriptstyle S_t} & & \downarrow{\scriptstyle S_t} & & \downarrow{\scriptstyle S_t} \\ S_t f & \xrightarrow{\;T\;} & S_t F & \xrightarrow{\;{}^tR_\theta\;} & (S_t F)' \end{array} \tag{72}$$
is commutative. In short, $S_t(f'f'_*) = (S_t f)'(S_t f)'_*$.
Proof. It suffices to note that
$$M * ({}^tR_\theta F)(X) = \int_{\mathbb{R}^{2N}} dY\, {}^tR_\theta F(Y)\, M(X - Y) = \int_{\mathbb{R}^{2N}} dY\, F(R_\theta Y)\, M(X - Y) = \int_{\mathbb{R}^{2N}} dY\, F(Y)\, M(X - R_\theta^{-1}Y) = \int_{\mathbb{R}^{2N}} dY\, F(Y)\, M(R_\theta X - Y) = (M * F)(R_\theta X),$$
where we have used the fact that a rotation has unit Jacobian, and that $M$ is invariant under ${}^tR_\theta$. □

The particular character of the dimension 2 was already noticed in related problems [38,47]. Unfortunately, it is difficult to see how this property could generalize to higher dimensions. For example, in the case $N = 3$, even if one chooses a system of spherical coordinates $(r, \theta, \phi)$ with axis $k$, there is no canonical way to choose the coordinate $\phi$, and it is not clear whether this can be done in such a manner that $\phi$ be well defined independently of $k$. The analog of Proposition 3 is however valid if one replaces $R_\theta$ by
$$J_\theta : F \longmapsto \int_0^{2\pi} d\phi\, {}^tU_\sigma F,$$
where in the right-hand side the coordinates of $\sigma$ in the local spherical system of axis $v - v_*$ are $(\theta, \phi)$. We conclude this section by noting that a similar lemma holds with the transformations $T_\omega$.

Proposition 4. For each $\omega \in S^{N-1}$, the diagram
$$\begin{array}{ccccc} f & \xrightarrow{\;T\;} & F = ff_* & \xrightarrow{\;{}^tT_\omega\;} & {}^tT_\omega F = f'f'_* = F' \\ \downarrow{\scriptstyle S_t} & & \downarrow{\scriptstyle S_t} & & \downarrow{\scriptstyle S_t} \\ S_t f & \xrightarrow{\;T\;} & S_t F & \xrightarrow{\;{}^tT_\omega\;} & (S_t F)' \end{array} \tag{73}$$
is commutative.

The proof is the same as for Proposition 3.

Remarks. 1. The properties of invariance under ${}^tR_\theta$ or ${}^tT_\omega$ characterize the Maxwellian distribution. Hence no other convolution regularization than Maxwellian could yield the same conclusion. 2. As pointed out to us by Desvillettes, these propositions give an immediate proof that Maxwellian distributions are the only solutions in $L^1_2$ of Eq. (15). Indeed, let $f$ be such a solution. Then, in view of Proposition 4, so is $S_t f$ for any $t > 0$. Since $S_t f$ is smooth, classical proofs relying for instance on the "killing operator" (32) prove that $S_t f$ is the Maxwellian distribution $M^f$. By the continuity of $S_t$ at time 0, this is also true for $f$.
4. Integral Representation of a Lower Bound for D

In this section, we fix a cross section
$$B(z, \omega) = \psi(|z|)\left(2\,\frac{|z\cdot\omega|}{|z|}\right)^{N-2}, \qquad \text{i.e.}\quad \widetilde{B}(z, \sigma) = \psi(|z|), \tag{74}$$
with the variables (60). The entropy dissipation for the kernel $B$ reads
$$D(f) = \frac{1}{4}\int dv\, dv_*\, \psi(|v - v_*|)\int d\sigma\, \left(f'f'_* - ff_*\right)\log\frac{f'f'_*}{ff_*}. \tag{75}$$
The assumptions on the nonnegative function $\psi$ shall be made precise later on. By the joint convexity of the function $(x, y) \mapsto (x - y)\log(x/y)$, and since $\int d\sigma = 1$, we get
$$D(f) \ge \frac{1}{4}\int dX\, \psi(|v - v_*|)\,(F - G)\log\frac{F}{G} \equiv \overline{D}(f), \tag{76}$$
where we use the notations (62), and
$$F(v, v_*) = ff_*, \qquad G = \int d\sigma\, {}^tU_\sigma F = AF. \tag{77}$$
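Our aim here is to establish an integral representation for $\overline{D}(f)$. To this end, we shall regularize $\overline{D}$ by $(S_t)$ and compute $(d/dt)\overline{D}(S_t f)$. At first sight this seems a formidable job to do, since $f$ appears no less than eight times in (75). But applying Proposition 2, we see that it is equivalent to apply $S_t$ to the functions $F$ and $G$ appearing in the right hand side of (76). For all positive time $t > 0$, $S_t F$ and $S_t G$ are smooth and their logarithm is bounded by a quadratic expression. This is enough to justify all the manipulations below.

The following lemma will enable us to compute very simply the time-derivative of $\overline{D}(S_t f)$. It yields actually the commutator between derivation along $S_t$ and the function $(x, y) \longmapsto (x - y) \log(x/y)$, which is homogeneous of degree 1.

Proposition 5. Let $F$ and $G$ be smooth functions with logarithms quadratically bounded. Then
\[
\frac{d}{dt}\bigg|_{t=0}\left[(S_tF - S_tG)\log\frac{S_tF}{S_tG}\right]
= -(F+G)\left|\frac{\nabla F}{F} - \frac{\nabla G}{G}\right|^2
+ \frac{d}{dt}\bigg|_{t=0} S_t\left[(F-G)\log\frac{F}{G}\right]. \tag{78}
\]

Proof. In the sequel, $\nabla$ stands for $\nabla_X$. Let $L$ denote the Fokker–Planck operator in $\mathbb{R}^{2N}$, and let us compute
\[
L(F-G)\,\log\frac{F}{G} + (F-G)\left(\frac{LF}{F} - \frac{LG}{G}\right) - L\left[(F-G)\log\frac{F}{G}\right]. \tag{79}
\]
We shall show that this expression is equal to the first term in the right hand side of (78). Indeed, expanding (79), we find
\[
\begin{aligned}
&\nabla\cdot\bigl[\nabla(F-G) + (F-G)X\bigr]\log\frac{F}{G}
+ (F-G)\left[\frac{\nabla\cdot(\nabla F + FX)}{F} - \frac{\nabla\cdot(\nabla G + GX)}{G}\right] \\
&\qquad - \nabla\cdot\left[\nabla(F-G)\log\frac{F}{G} + (F-G)\left(\frac{\nabla F}{F} + X - \frac{\nabla G}{G} - X\right) + (F-G)X\log\frac{F}{G}\right] \\
&= \nabla\cdot\bigl[\nabla(F-G)\bigr]\log\frac{F}{G} - \nabla\cdot\left[\nabla(F-G)\log\frac{F}{G}\right] \\
&\qquad + \nabla\cdot\bigl[(F-G)X\bigr]\log\frac{F}{G} - \nabla\cdot\left[(F-G)X\log\frac{F}{G}\right] \\
&\qquad + \frac{F-G}{F}\,\nabla\cdot(\nabla F + FX) - \nabla\cdot\left[\frac{F-G}{F}(\nabla F + FX)\right] \\
&\qquad + \frac{G-F}{G}\,\nabla\cdot(\nabla G + GX) - \nabla\cdot\left[\frac{G-F}{G}(\nabla G + GX)\right] \\
&= -\nabla(F-G)\cdot\left(\frac{\nabla F}{F} - \frac{\nabla G}{G}\right)
- (F-G)\,X\cdot\left(\frac{\nabla F}{F} - \frac{\nabla G}{G}\right) \\
&\qquad - (\nabla F + FX)\cdot\left(-\frac{\nabla G}{F} + \frac{G\,\nabla F}{F^2}\right)
- (\nabla G + GX)\cdot\left(-\frac{\nabla F}{G} + \frac{F\,\nabla G}{G^2}\right).
\end{aligned}
\]
Expanding the last expression, we see that all the terms containing $X$ cancel out. As for the other ones, we obtain in the end
\[
-\frac{|\nabla F|^2}{F} - \frac{|\nabla G|^2}{G} + 2\,\frac{\nabla F\cdot\nabla G}{F} + 2\,\frac{\nabla G\cdot\nabla F}{G} - G\,\frac{|\nabla F|^2}{F^2} - F\,\frac{|\nabla G|^2}{G^2}
= -(F+G)\left|\frac{\nabla F}{F} - \frac{\nabla G}{G}\right|^2. \qquad\square
\]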
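The cancellation at the end of this proof is purely algebraic and pointwise, so it can be checked numerically. The following is a minimal sketch, not part of the original argument: it treats the values $F$, $G$, the gradients $\nabla F$, $\nabla G$ and the point $X$ as independent random inputs, and compares the sum of the four terms closing the expansion of (79) with $-(F+G)|\nabla F/F - \nabla G/G|^2$. All function names are illustrative.

```python
import random


def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def expanded_sum(F, G, dF, dG, X):
    """Sum of the four terms closing the expansion of (79) at one point.

    F, G  : positive values of the two functions at the point
    dF, dG: their gradients at the point (lists of floats)
    X     : the point itself
    """
    diff = [a / F - b / G for a, b in zip(dF, dG)]            # grad F/F - grad G/G
    grad_1 = [-b / F + G * a / F**2 for a, b in zip(dF, dG)]   # grad of (F - G)/F
    grad_2 = [-a / G + F * b / G**2 for a, b in zip(dF, dG)]   # grad of (G - F)/G
    t1 = -dot([a - b for a, b in zip(dF, dG)], diff)
    t2 = -(F - G) * dot(X, diff)
    t3 = -dot([a + F * x for a, x in zip(dF, X)], grad_1)
    t4 = -dot([b + G * x for b, x in zip(dG, X)], grad_2)
    return t1 + t2 + t3 + t4


def closed_form(F, G, dF, dG):
    """-(F + G) |grad F/F - grad G/G|^2, the first term on the right of (78)."""
    diff = [a / F - b / G for a, b in zip(dF, dG)]
    return -(F + G) * dot(diff, diff)


def max_discrepancy(trials=1000, dim=6, seed=0):
    """Largest relative gap between the two expressions over random inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        F, G = rng.uniform(0.2, 3.0), rng.uniform(0.2, 3.0)
        dF = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
        dG = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
        X = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
        lhs = expanded_sum(F, G, dF, dG, X)
        rhs = closed_form(F, G, dF, dG)
        worst = max(worst, abs(lhs - rhs) / (1.0 + abs(rhs)))
    return worst


if __name__ == "__main__":
    print(max_discrepancy())
```

The discrepancy stays at the level of rounding error, and the value of `X` has no influence on the result, which reflects the cancellation of all terms containing $X$.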
This relation may seem somewhat miraculous. In Sect. 8, we shall try to connect it with other known properties in kinetic theory.

With Proposition 5 at hand, it is immediate to compute the time derivative of $\overline{D}(S_t f)$. Let us write
\[
L^* : g \longmapsto \Delta_X g - X\cdot\nabla_X g, \qquad g \in \mathcal{D}'(\mathbb{R}^{2N}),
\]
for the adjoint of $L$. We note that
\[
\Delta_X\psi(|v-v_*|) = 2\Delta_v\psi(|v-v_*|) = 2\left[\psi''(|v-v_*|) + (N-1)\frac{\psi'(|v-v_*|)}{|v-v_*|}\right],
\]
\[
X\cdot\nabla_X\psi(|v-v_*|) = v\cdot\nabla_v\psi(|v-v_*|) - v_*\cdot\nabla_v\psi(|v-v_*|) = |v-v_*|\,\psi'(|v-v_*|).
\]
Hence we can safely use the notation
\[
(L^*\psi)(|z|) = 2\psi''(|z|) + \left[\frac{2(N-1)}{|z|} - |z|\right]\psi'(|z|), \tag{80}
\]
and by (76) and (78) we get for all $t > 0$,
\[
\begin{aligned}
\frac{d}{dt}\overline{D}(S_tf) = {}& -\frac14\int dX\,\psi(|v-v_*|)\,(S_tF + S_tG)\left|\frac{\nabla S_tF}{S_tF} - \frac{\nabla S_tG}{S_tG}\right|^2 \\
& + \frac14\int dX\,(L^*\psi)(|v-v_*|)\,(S_tF - S_tG)\log\frac{S_tF}{S_tG},
\end{aligned}
\tag{81}
\]
provided that
\[
\psi(|v-v_*|)\,(S_tF - S_tG)\log\frac{S_tF}{S_tG} \in L^1(\mathbb{R}^{2N}), \tag{82}
\]
\[
(L^*\psi)(|v-v_*|)\,(S_tF - S_tG)\log\frac{S_tF}{S_tG} \in L^1(\mathbb{R}^{2N}). \tag{83}
\]
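As a sanity check on the radial formula (80), one can compare it with a direct finite-difference evaluation of $\Delta_X g - X\cdot\nabla_X g$ for $g(X) = \psi(|v - v_*|)$. The sketch below is only an illustration: the profile $\psi(r) = r^3$, the test point and the step size are arbitrary choices of ours, and the function names are hypothetical. It also recovers the identity $L^*(|z|^2) = 4N - 2|z|^2$ used in the proof of Theorem 7.

```python
import math


def psi(r):
    """Illustrative radial profile (an arbitrary choice for this check)."""
    return r ** 3


def dpsi(r):
    return 3.0 * r ** 2


def d2psi(r):
    return 6.0 * r


def radius(x, n):
    """|v - v_*| for X = (v, v_*) stored as a flat list of length 2n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x[:n], x[n:])))


def l_star_radial(first, second, r, n):
    """Formula (80): 2 psi'' + (2 (n - 1) / r - r) psi', given psi'(r), psi''(r)."""
    return 2.0 * second + (2.0 * (n - 1) / r - r) * first


def l_star_finite_difference(x, n, h=1e-3):
    """Laplacian_X g - X . grad_X g for g(X) = psi(|v - v_*|), central differences."""
    g0 = psi(radius(x, n))
    total = 0.0
    for i in range(2 * n):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        gp, gm = psi(radius(xp, n)), psi(radius(xm, n))
        total += (gp - 2.0 * g0 + gm) / h ** 2     # second derivative in x_i
        total -= x[i] * (gp - gm) / (2.0 * h)      # drift term x_i * d_i g
    return total


if __name__ == "__main__":
    point = [1.0, -0.5, 0.3, -0.2, 0.8, 1.1]       # (v, v_*) with N = 3
    r = radius(point, 3)
    print(l_star_finite_difference(point, 3), l_star_radial(dpsi(r), d2psi(r), r, 3))
```

The two printed values agree up to the discretization error of the finite differences.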
We remark that, thanks to condition (49) and the result of Proposition 1, conditions (82) and (83) are propagated by $S_t$, under weak assumptions on the growth of $\psi$ and $L^*\psi$. In more detail, we assume for simplicity that $\psi$ is bounded below by a fixed number $\nu > 0$, $F\psi \in L^1(\mathbb{R}^{2N})$, and
\[
L^*\psi(|v-v_*|) \le C\psi(|v-v_*|) \le C_1(1+|X|^2)^{\alpha/2}, \tag{84}
\]
for some $\alpha > 0$. Note that the condition (84) is always satisfied if $\psi(|z|)$ behaves like a power of $|z|$ at infinity, since then the dominant term as $|z| \to \infty$ of $L^*\psi(|z|)$ is $-|z|\psi'(|z|)$. Then, if
\[
\|f\|_{L^1_{2+\alpha}\log L} < \infty, \tag{85}
\]
both (82) and (83) hold uniformly in time for all $t \ge t_0 > 0$ (because the Ornstein-Uhlenbeck adjoint semigroup generates pointwise Maxwellian lower bounds).

Finally, we prove that continuity at time 0 of the mapping $t \mapsto \overline{D}(S_t f)$ follows under the same conditions (84) and (85). First, by the convexity of $\overline{D}$ and the strong continuity of $(S_t)$ at time 0, $\overline{D}(f) \le \liminf_{t\to 0} \overline{D}(S_t f)$. Therefore it is sufficient to check that $\overline{D}(f) \ge \limsup_{t\to 0} \overline{D}(S_t f)$. Let us denote by $M_\varepsilon$ the centered Maxwellian with temperature $\varepsilon$, and define
\[
F_\varepsilon = \frac{(F\psi) * M_\varepsilon}{\psi}, \qquad G_\varepsilon = \frac{(G\psi) * M_\varepsilon}{\psi}.
\]
Since $\psi$ is bounded below, it is clear that $(F_\varepsilon, G_\varepsilon) \to (F, G)$ strongly in $L^1 \times L^1$ as $\varepsilon \to 0$, and $(F_\varepsilon\psi, G_\varepsilon\psi) \to (F\psi, G\psi)$ as well. We note that $\overline{D}(f) = D(F\psi, G\psi)$, where
\[
D(F, G) = \frac14\int dX\,(F-G)\log\frac{F}{G}.
\]
Let us use for a while the notation $\overline{D}(f) = \overline{D}(F)$. In view of the choice of $F_\varepsilon$, $G_\varepsilon$,
\[
\overline{D}(F) \le \liminf_{\varepsilon\to 0} D(F_\varepsilon\psi, G_\varepsilon\psi), \qquad D(F_\varepsilon\psi, G_\varepsilon\psi) = D\bigl((F\psi)*M_\varepsilon, (G\psi)*M_\varepsilon\bigr) \equiv \overline{D}(F_\varepsilon).
\]
But $D$ is a translation-invariant functional, and $(F\psi) * M_\varepsilon$ is an average of translates of $F\psi$. By convexity of $D$, $D\bigl((F\psi)*M_\varepsilon, (G\psi)*M_\varepsilon\bigr) \le D(F\psi, G\psi)$, and hence
\[
\overline{D}(F_\varepsilon) \to \overline{D}(F) \quad\text{as } \varepsilon \to 0. \tag{86}
\]
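By the preceding proof, since $F_\varepsilon$, $G_\varepsilon$ are smooth (hence $t \longmapsto \overline{D}(S_t F_\varepsilon)$ is continuous at $t = 0$), we see that for all $t > 0$,
\[
\overline{D}(F_\varepsilon) \ge \overline{D}(S_tF_\varepsilon) - \frac14\int_0^t d\tau\int dX\,(L^*\psi)\,(S_\tau F_\varepsilon - S_\tau G_\varepsilon)\log\frac{S_\tau F_\varepsilon}{S_\tau G_\varepsilon}.
\]
Hence, since by assumption $L^*\psi \le C\psi$ (and since $S_\tau G_\varepsilon$ is an average of $S_\tau F_\varepsilon$),
\[
\overline{D}(F_\varepsilon) \ge \overline{D}(S_tF_\varepsilon) - \frac{C}{4}\int_0^t d\tau\,\overline{D}(S_\tau F_\varepsilon).
\]
By Gronwall's lemma, $\overline{D}(S_t F_\varepsilon) \le \overline{D}(F_\varepsilon)e^{Ct/4}$. Letting $\varepsilon$ go to 0, by the convexity of $\overline{D}$ and (86), we get $\overline{D}(S_t F) \le \overline{D}(F)e^{Ct/4}$. This is sufficient to conclude that $\limsup_{t\to 0}\overline{D}(S_t F) \le \overline{D}(F)$.

On the other hand, as $t$ goes to infinity, $\overline{D}(S_t F)$ goes to 0 (because $S_t F \to M$), at least when
\[
F\log F\;\psi\,(1+|X|^2) \in L^1(\mathbb{R}^{2N}). \tag{87}
\]
But this follows by condition (85). Thus, we conclude with the following

Theorem 6. Representation formula for $\overline{D}$. Assume that $\psi$ is bounded below, and conditions (84), (85) are satisfied at $t = 0$. Then
\[
\begin{aligned}
\overline{D}(f) = {}& \frac14\int_0^\infty dt\int dX\,\psi(|v-v_*|)\,(S_tF + S_tG)\left|\frac{\nabla S_tF}{S_tF} - \frac{\nabla S_tG}{S_tG}\right|^2 \\
& - \frac14\int_0^\infty dt\int dX\,(L^*\psi)(|v-v_*|)\,(S_tF - S_tG)\log\frac{S_tF}{S_tG}.
\end{aligned}
\tag{88}
\]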
Remark. We are confident that this formula holds under more general assumptions, but this will be more than enough for our purposes.
5. Main Result

We are now ready to prove our main result.

Theorem 7. Let $f$ satisfy the normalization (36). Let $\psi(|z|) \ge (1 + |z|)^\gamma$ for some real number $\gamma \le 0$. Assume that $f(v) \ge Ke^{-A|v|^2}$, and that $\|f\|_{L^1_{4+s+\varepsilon}} < \infty$, $\|f\|_{L^1_{2+s+\varepsilon}\log L} < \infty$ for some $s > 0$, $\varepsilon > 0$. Then there exists a constant $C_s(f)$ depending only on $s$, $\varepsilon$, $\gamma$, $K$, $A$, $\|f\|_{L^1_{2+s+\varepsilon}\log L}$ and $\|f\|_{L^1_{4+s+\varepsilon}}$, such that
\[
\overline{D}(f) \ge C_s(f)\,H(f|M)^{1+\frac{2-\gamma}{s}}, \tag{89}
\]
where $\overline{D}(f)$ is the lower bound for the entropy dissipation given by (76).

Remark. To treat potentials that are “essentially” bounded below, in the sense of Sect. 6, like hard potentials, it will be sufficient to apply this theorem with $\gamma = 0$. However, we choose a function $\psi$ which may be decaying to show that the theorem also holds for soft potentials.

Proof. Let $\psi_1$ be a smooth convex function with $\psi_1(|z|) = 1$ for $|z| \le 1$, $\psi_1(|z|) = |z|^2$ for $|z| \ge 2$, $|z|^2/2 \le \psi_1(|z|) \le 1 + |z|^2$. Since $L^*(|z|^2) = 4N - 2|z|^2$, we can impose that $|L^*\psi_1(|z|)| \le C(1 + |z|^2)$. Let us set $\psi_R(|z|) = \psi_1(|z|/R)$. Hence,
\[
|L^*\psi_R(|z|)| \le \frac{C}{R^2}\bigl(1 + |z|^2\bigr)1_{|z|\ge R}.
\]
Let $\overline{D}^R(f)$ be the functional $\overline{D}$ associated to $\psi_R$. Since $\psi_R(|z|) \ge |z|^2/(2R^2)$, and since
\[
|v - v_*|^2 \ge R^2 \;\Longrightarrow\; |v|^2 + |v_*|^2 \ge \frac{R^2}{2},
\]
we obtain by Theorem 6,
\[
\begin{aligned}
\overline{D}^R(f) \ge {}& \frac{1}{8R^2}\int_0^\infty dt\int dX\,|v-v_*|^2\,(S_tF + S_tG)\left|\frac{\nabla_X S_tF}{S_tF} - \frac{\nabla_X S_tG}{S_tG}\right|^2 \\
& - \frac{C}{R^2}\int_0^\infty dt\int dX\,\bigl(1+|X|^2\bigr)1_{|X|\ge R/\sqrt2}\,(S_tF - S_tG)\log\frac{S_tF}{S_tG}.
\end{aligned}
\tag{90}
\]
Next, since $\psi_R(|z|) = 1$ for $|z| \le R$, we can write
\[
(1 + |z|)^\gamma \ge (1+R)^\gamma\,\psi_R(|z|) - (1+R)^\gamma\,\frac{|z|^2}{R^2}\,1_{|z|\ge R}.
\]
Accordingly,
\[
\overline{D}(f) \ge (1+R)^\gamma\left[\overline{D}^R(f) - \frac{1}{4R^2}\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,(F-G)\log\frac{F}{G}\right]. \tag{91}
\]
Summing up, by (90) and (91) we have decomposed $\overline{D}(f)$ into three parts, two of which involve only large values of $X$. Now we shall estimate the principal part of $\overline{D}^R(f)$. The heart of the whole argument lies in the following:
Proposition 8. Let $F = ff_*$ and let $G$ be a function depending only on $m = v + v_*$ and $e = (|v|^2 + |v_*|^2)/2$. Assume that $f$ and $G$ are smooth, with logarithms quadratically bounded, and that $f$ satisfies the normalization (36). Then
\[
\int dX\,|v-v_*|^2\,(F+G)\left|\frac{\nabla_XF}{F} - \frac{\nabla_XG}{G}\right|^2 \ge \frac12\,(N - T_f)\,I(f|M), \tag{92}
\]
where
\[
T_f = \sup_{e\in S^{N-1}}\int dv\,f(v)\,(v,e)^2.
\]

Proof. Let us write $\nabla_X = [\nabla, \nabla_*]$. Then,
\[
\frac{\nabla_XF}{F} = \left[\frac{\nabla f}{f}, \left(\frac{\nabla f}{f}\right)_*\right]. \tag{93}
\]
On the other hand,
\[
\frac{\nabla_XG}{G} = \left[\frac{\nabla_mG}{G} + v\,\frac{\partial_eG}{G},\; \frac{\nabla_mG}{G} + v_*\,\frac{\partial_eG}{G}\right](m,e). \tag{94}
\]
Let us consider the “killing operator” $P(v, v_*)$ defined by
\[
P : [A, B] \longmapsto \Pi(v - v_*)\,(A - B), \tag{95}
\]
where $\Pi$ is given by (57). In view of (94), clearly,
\[
P\,\frac{\nabla_XG}{G} = 0. \tag{96}
\]
For all $(v, v_*)$, $\|P\| \le 2$, where $\|\cdot\|$ denotes the norm in the sense of matrices. Here we see precisely the advantage of the Ornstein-Uhlenbeck regularization in our approach: it enables us to use the operator (95), which is pointwise bounded, instead of (32), which is definitely not. Let us set
\[
K(X) = \frac{\nabla_XF}{F} - \frac{\nabla_XG}{G}.
\]
In view of (93) and (96),
\[
PK(X) = P\,\frac{\nabla_XF}{F} = \Pi(v-v_*)\left[\frac{\nabla f}{f} - \left(\frac{\nabla f}{f}\right)_*\right].
\]
Since $|PK|^2 \le 4|K|^2$, we find
\[
\begin{aligned}
\int dX\,|v-v_*|^2\,(F+G)\left|\frac{\nabla_XF}{F} - \frac{\nabla_XG}{G}\right|^2
&\ge \int dX\,|v-v_*|^2\,F\,|K|^2 \ge \frac14\int dX\,|v-v_*|^2\,F\,|PK|^2 \\
&= \frac14\int dv\,dv_*\,|v-v_*|^2\,ff_*\left|\Pi(v-v_*)\left[\frac{\nabla f}{f} - \left(\frac{\nabla f}{f}\right)_*\right]\right|^2.
\end{aligned}
\]
Apart from the factor 1/2, this is the entropy dissipation for the Landau equation with Maxwellian molecules. It suffices to apply the inequality (58) to conclude. $\square$
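The two properties of $P$ used in this proof, namely (96) and the pointwise bound on $|PK|^2$, are elementary and can be illustrated numerically. The sketch below is ours and rests on one assumption that is not visible in this section: $\Pi(z)$ from (57) is taken to be the orthogonal projection onto the hyperplane $z^\perp$. The vector `a` and the scalar `b` play the roles of $\nabla_m G/G$ and $\partial_e G/G$ in (94); all names are illustrative.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_orthogonal(z, w):
    """Pi(z) w: the component of w orthogonal to z (assumed form of (57))."""
    c = dot(w, z) / dot(z, z)
    return [wi - c * zi for wi, zi in zip(w, z)]


def killing_operator(v, vs, A, B):
    """P(v, v_*): [A, B] -> Pi(v - v_*) (A - B), as in (95)."""
    z = [a - b for a, b in zip(v, vs)]
    return project_orthogonal(z, [a - b for a, b in zip(A, B)])


def check_killing_operator(trials=500, dim=3, seed=1):
    """Return (largest |P(grad_X G / G)|, largest |P[A,B]|^2 / (|A|^2 + |B|^2))."""
    rng = random.Random(seed)
    worst_kill, worst_ratio = 0.0, 0.0
    for _ in range(trials):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vs = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        a = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # role of grad_m G / G
        b = rng.gauss(0.0, 1.0)                         # role of d_e G / G
        A = [ai + b * vi for ai, vi in zip(a, v)]
        B = [ai + b * wi for ai, wi in zip(a, vs)]
        pk = killing_operator(v, vs, A, B)
        worst_kill = max(worst_kill, math.sqrt(dot(pk, pk)))
        A2 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        B2 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        p = killing_operator(v, vs, A2, B2)
        worst_ratio = max(worst_ratio, dot(p, p) / (dot(A2, A2) + dot(B2, B2)))
    return worst_kill, worst_ratio


if __name__ == "__main__":
    print(check_killing_operator())
```

Under this assumption the first number is zero up to rounding, and the second never exceeds 2, which is compatible with the cruder bound $|PK|^2 \le 4|K|^2$ used above.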
Now, by the properties recalled in Sect. 2, for all $t > 0$, $T_{(S_t f)} \le T_f$, with equality only when all the directional temperatures of $f$ are equal to 1. Hence, setting $\lambda = N - T_f$, we can apply the preceding proposition and recover
\[
\frac{1}{R^2}\int_0^\infty dt\int dX\,|v-v_*|^2\,(S_tF + S_tG)\left|\frac{\nabla_XS_tF}{S_tF} - \frac{\nabla_XS_tG}{S_tG}\right|^2
\ge \frac{\lambda}{2R^2}\int_0^\infty dt\,I(S_tf|M) = \frac{\lambda}{2R^2}\,H(f|M),
\]
where we have used the relation (47). It is shown in [21] that $\lambda \ge \lambda_0(f)$, where $\lambda_0$ depends only on $H(f)$ and the normalization (36). Indeed, if $(e_i)$ is any orthonormal basis, then $\sum_i T_i = N$. Therefore, to control $\lambda$ from below it suffices to control all the directional temperatures $T_i$ from below. The finiteness of $H(f)$ and of some moments of $f$ suffices to prevent $f$ from concentrating on a hyperplane $(v, e) = 0$.

Proposition 9. Estimate of the tails. Let $R \ge \sqrt2$, and
\[
e(R) = \int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,(F-G)\log\frac{F}{G}, \tag{97}
\]
\[
E(R) = \int_0^\infty dt\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,(S_tF - S_tG)\log\frac{S_tF}{S_tG}. \tag{98}
\]
Then for all $s > 0$, $\varepsilon > 0$,
\[
e(R) + E(R) \le \frac{C_{s+\varepsilon}}{R^s},
\]
where $C_{s+\varepsilon}$ depends only on the normalization (36), $s$, $\varepsilon$, $\|f\|_{L^1_{2+s+\varepsilon}\log L}$, $\|f\|_{L^1_{4+s+\varepsilon}}$, $K$ and $A$ such that $f \ge Ke^{-A|v|^2}$.

Remark. We have used the inequality $(1 + |X|^2)1_{|X|>R/\sqrt2} \le 2|X|^2 1_{|X|>R/\sqrt2}$ to get $E$.

Proof. We begin with $e(R)$. Throughout the proof, $C$ will denote various constants depending only on the aforementioned quantities. Since $f \ge Ke^{-A|v|^2}$, by tensorization $F \ge Ke^{-A|X|^2}$, and also $G \ge Ke^{-A|X|^2}$, since Maxwellian distributions satisfy Eq. (15). Since $(x - y)\log(x/y) \le x\log x$ if $x \ge y \ge 1$, and $|\log F|, |\log G| \le C(1 + |X|^2)$ if $F, G \le 1$, we can write
\[
\begin{aligned}
e(R) \le {}& \int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,F\log F + \int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,G\log G \\
& + C\int dX\,|X|^2\,(F+G)\,(1+|X|^2)\,1_{|X|\ge R/\sqrt2}.
\end{aligned}
\]
Since $G = \int d\sigma\,F'$ and $|X'|^2 = |X|^2$, we write, using the convexity of $x \longmapsto x\log x$ and the change of variables $d\sigma\,dX = d\sigma\,dX'$,
\[
\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,G\log G \le \int d\sigma\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,F'\log F' = \int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,F\log F.
\]
In the same manner,
\[
\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,(1+|X|^2)\,G = \int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,(1+|X|^2)\,F.
\]
Hence,
\[
e(R) \le C\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,F\log F + C\int dX\,|X|^4\,1_{|X|\ge R/\sqrt2}\,F.
\]
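Writing $1_{|X|\ge R/\sqrt2} \le 1_{|v|\ge R/2} + 1_{|v_*|\ge R/2}$, $|X|^2 = |v|^2 + |v_*|^2$, $|X|^4 \le C(|v|^4 + |v_*|^4)$, $\log F = \log f + \log f_*$, we obtain
\[
\begin{aligned}
e(R) \le {}& C\int dv_*\,f_*\,(1+|v_*|^2)\int dv\,|v|^2\,1_{|v|\ge R/2}\,f\,|\log f| \\
& + C\int dv_*\,f_*\,(1+|v_*|^2)\,1_{|v_*|\ge R/2}\int dv\,|v|^2\,f\,|\log f| \\
& + C\int dv_*\,f_*\int dv\,|v|^4\,1_{|v|\ge R/2}\,f \\
& + C\int dv_*\,f_*\,1_{|v_*|\ge R/2}\int dv\,|v|^4\,f \\
\le {}& \frac{C}{R^r}\Bigl[\|f\|_{L^1_2}\,\|f\|_{L^1_{2+r}\log L} + \|f\|_{L^1_{2+r}}\,\|f\|_{L^1_2\log L} + \|f\|_{L^1}\,\|f\|_{L^1_{4+r}} + \|f\|_{L^1_r}\,\|f\|_{L^1_4}\Bigr].
\end{aligned}
\]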
Let us now turn to $E(R)$. First,
\[
S_t\bigl(e^{-A|v|^2}\bigr) = M_{e^{-2t}/(2A)} * M_{1-e^{-2t}} = M_{1+[(2A)^{-1}-1]e^{-2t}},
\]
and since $S_t$ is a linear positive transformation,
\[
S_tf \ge Ce^{-A_0|v|^2},
\]
where
\[
A_0 = \sup\left\{\frac{1}{1+[(2A)^{-1}-1]e^{-2t}},\; 0 \le t < \infty\right\} = \max(1, 2A).
\]
As a consequence, we also have a fixed Maxwellian lower bound for $S_t F$ and $S_t G$. We estimate $E(R)$, taking into account the fact that $S_t F \to M$, $S_t G \to M$. By the elementary inequality
\[
(f-g)\log\frac{f}{g} \le |f-M||\log f| + |g-M||\log g| + C|f-M|(1+|X|^2) + C|g-M|(1+|X|^2)
\]
(which is easy to obtain distinguishing between the cases $f \ge g \ge M$, $M \ge f \ge g$, $2 \ge f \ge M \ge g$, $f \ge 2 \ge M \ge g$ and so on), we have
\[
\begin{aligned}
E(R) \le {}& \int_0^\infty dt\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,|S_tF - M||\log S_tF| && (99) \\
& + \int_0^\infty dt\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,|S_tG - M||\log S_tG| && (100) \\
& + C\int_0^\infty dt\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,|S_tF - M|(1+|X|^2) && (101) \\
& + C\int_0^\infty dt\int dX\,|X|^2\,1_{|X|\ge R/\sqrt2}\,|S_tG - M|(1+|X|^2). && (102)
\end{aligned}
\]
By convexity and changes of variables in the same spirit as before, we reduce to the problem of estimating only (99) and (101). We begin with (101). By the results recalled in Sect. 2, for all $t > 0$,
\[
\|S_tF - M\|_{L^1} \le \sqrt{2H(F|M)}\,e^{-t}
\]
(of course $H(F) = 2H(f)$ is finite), and for all $r > 0$, $\|S_t F - M\|_{L^1_r} \le C_r(\|f\|_{L^1_r}) + \|M\|_{L^1_r}$. Hence, for any $\varepsilon > 0$, $K > 0$, separating between small and large $|X|$,
\[
\int dX\,(1+|X|)^r\,|S_tF - M| \le CK^r\,\|S_tF - M\|_{L^1} + \frac{C}{K^\varepsilon}\,\|S_tF - M\|_{L^1_{r+\varepsilon}} \le CK^re^{-t} + \frac{C_{r+\varepsilon}}{K^\varepsilon},
\]
where $C_{r+\varepsilon}$ depends only on $\|F\|_{L^1_{r+\varepsilon}}$ (i.e. on $\|f\|_{L^1_{r+\varepsilon}}$ and the normalization (36)). Choosing $K = C_{r+\varepsilon}^{1/(r+\varepsilon)}e^{t/(r+\varepsilon)}$, we get
\[
\|S_tF - M\|_{L^1_r} \le Ce^{-\frac{\varepsilon}{r+\varepsilon}t},
\]
and therefore
\[
\int_0^\infty dt\int dX\,|X|^4\,1_{|X|\ge R/\sqrt2}\,|S_tF - M| \le \frac{C}{R^s}\int_0^\infty dt\,\|S_tF - M\|_{L^1_{4+s}} \le \frac{C}{R^s}\,\frac{4+s+\varepsilon}{\varepsilon}.
\]
Finally, we handle the integral (99). Applying the same strategy as above and using Proposition 1, it is sufficient to prove that
\[
\int dX\,|S_tF - M||\log S_tF||X|^2 \xrightarrow[t\to\infty]{} 0
\]
with an exponential rate. To that purpose, we use the elementary inequality
\[
|x-y||\log x| \le |x-y|\left|\log\frac{x}{y}\right|1_{x\le y} + |x-y||\log y| + C\left(x\log\frac{x}{y} + y - x\right). \tag{103}
\]
To prove (103), it suffices to note that
\[
|x-y||\log x| \le |x-y||\log y| + |x-y|\left|\log\frac{x}{y}\right|,
\]
and to bound the second term. By homogeneity, we just have to check that if $z = (x/y) \ge 1$, then $(z - 1)\log z \le C(z\log z + 1 - z)$. This last inequality is easily obtained (note that both functions have vanishing derivatives of first order for $z = 1$). By Hölder's inequality, then (103) applied with $x = S_t F$ and $y = M$,
\[
\begin{aligned}
\int |S_tF - M||\log S_tF||X|^2
&\le C\left(\int |S_tF - M||\log S_tF|\right)^{\varepsilon/(1+\varepsilon)}\Bigl(\|f\|_{L^1_{2+2\varepsilon}\log L}\,\|f\|_{L^1_{4+2\varepsilon}}\Bigr)^{1/(1+\varepsilon)} \\
&\le C_\varepsilon\left[\int dX\,|S_tF - M|(1+|X|^2) + H(S_tF|M)\right]^{\frac{\varepsilon}{1+\varepsilon}}.
\end{aligned}
\]
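The right-hand side converges exponentially fast to 0, and the desired conclusion follows by the same arguments as before. $\square$

End of the proof of Theorem 7. By Propositions 8 and 9,
\[
D(f) \ge \overline{D}(f) \ge C(1+R)^\gamma\left[\frac{\lambda}{R^2}\,H(f|M) - \frac{E(R)}{R^2} - \frac{e(R)}{R^2}\right]
\ge C(1+R)^{\gamma-2}\left[H(f|M) - \frac{C_{s+\varepsilon}}{R^s}\right].
\]
Choosing $R = \max\bigl((2C_{s+\varepsilon})^{1/s}H(f|M)^{-1/s}, \sqrt2\bigr)$, so that the bracket is at least $H(f|M)/2$, we get the desired result. $\square$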
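The way the exponent $1 + (2-\gamma)/s$ in (89) comes out of this optimization in $R$ can be illustrated numerically. The sketch below is an illustration under stated assumptions: it evaluates $c\,(1+R)^{\gamma-2}\,[H - C/R^s]$ at $R^s = 2C/H$ (so that the bracket equals $H/2$), with made-up values of the constants, and measures the slope of the resulting bound in log-log scale for small $H$. The names `lower_bound`, `observed_exponent` and `C_tail` are ours.

```python
import math


def lower_bound(H, C_tail, s, gamma, c=1.0):
    """c (1 + R)^(gamma - 2) [H - C_tail / R^s] at R = max((2 C_tail / H)^(1/s), sqrt 2)."""
    R = max((2.0 * C_tail / H) ** (1.0 / s), math.sqrt(2.0))
    return c * (1.0 + R) ** (gamma - 2.0) * (H - C_tail / R ** s)


def observed_exponent(s, gamma, C_tail=1.0, H1=1e-8, H2=1e-10):
    """Slope of log(lower bound) against log(H) between two small entropies."""
    b1 = lower_bound(H1, C_tail, s, gamma)
    b2 = lower_bound(H2, C_tail, s, gamma)
    return math.log(b1 / b2) / math.log(H1 / H2)


if __name__ == "__main__":
    for s, gamma in [(2.0, -1.0), (1.0, 0.0), (4.0, -2.0)]:
        print(s, gamma, observed_exponent(s, gamma), 1.0 + (2.0 - gamma) / s)
```

For each pair $(s, \gamma)$ the measured slope is close to $1 + (2-\gamma)/s$; the small deviation comes from replacing $(1+R)$ by $R$ at large $R$.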
6. Extension to Other Kernels

Theorem 7 covers essentially all kernels $B(z, \omega)$ that are locally bounded below by $|z\cdot\omega|/|z|$ (in dimension 3; $(|z\cdot\omega|/|z|)^{N-2}$ in the general case). The cross-section $|z\cdot\omega|$, for instance, corresponding to the hard-spheres potential, does not satisfy this assumption for $|z|$ near 0. In order to obtain a result for such potentials, we just have to follow the strategy applied by Carlen and Carvalho [10], namely cut out (small) portions of the velocity space where $B$ is small. In this example, we write
\[
|z\cdot\omega| \ge \epsilon\,\frac{|z\cdot\omega|}{|z|} - \epsilon\,\frac{|z\cdot\omega|}{|z|}\,1_{|z|\le\epsilon},
\]
and we estimate from above the entropy dissipation associated to $\psi(|z|) = 1_{|z|\le\epsilon}$, i.e.
\[
\chi(\epsilon) = \int dv\,dv_*\,1_{|v-v_*|\le\epsilon}\int d\sigma\,\bigl(f'f'_* - ff_*\bigr)\log\frac{f'f'_*}{ff_*}.
\]
As $\epsilon$ goes to 0, $\chi(\epsilon) \to 0$, and we have to estimate explicitly the corresponding rate of convergence. Here, it is not clear whether the assumptions $f \in L^1_s\log L \cap L^1_r$ and $f(v) \ge Ke^{-A|v|^2}$ suffice to provide such an estimate. But as soon as, say, $L^2$ bounds are available for $f$, this can be done easily. Indeed, let
\[
D_\epsilon(R) = \bigl\{(v, v_*) \in \mathbb{R}^{2N};\; |v-v_*| \le \epsilon,\; |v|^2 + |v_*|^2 \le R\bigr\}.
\]
Then, for all $R > 0$, $s > 0$,
\[
\begin{aligned}
\chi(\epsilon) \le {}& \frac{C}{R^s}\int \bigl(|v|^2 + |v_*|^2\bigr)^{(s+1)/2}\,1_{|v|^2+|v_*|^2\ge R^2}\,\bigl(f'f'_* - ff_*\bigr)\log\frac{f'f'_*}{ff_*} \\
& + \int_{D_\epsilon(R)\times S^{N-1}} \bigl(f'f'_* - ff_*\bigr)\log\frac{f'f'_*}{ff_*}.
\end{aligned}
\tag{104}
\]
Using the Maxwellian lower bound, we write systematically
\[
\bigl(f'f'_* - ff_*\bigr)\log\frac{f'f'_*}{ff_*} \le f'f'_*\log(f'f'_*) + ff_*\log ff_* + C\,(ff_* + f'f'_*)\,(|v|^2 + |v_*|^2),
\]
and changing primed to unprimed variables we reduce to estimating
\[
\int_{\mathbb{R}^{2N}} ff_*\log(ff_*)\,\bigl(|v|^2 + |v_*|^2\bigr)^{s/2} \tag{105}
\]
and
\[
\int_{D_\epsilon(R)} ff_*\log(ff_*). \tag{106}
\]
Writing $\log ff_* = \log f + (\log f)_*$, we easily see that (105) is bounded by a constant depending on the norm of $f$ in $L^1_s\log L$. On the other hand, if $f \in L^2$, then $ff_* \in L^2(\mathbb{R}^{2N})$, and this is enough to provide an explicit estimate of (106) in terms of the Lebesgue measure of $D_\epsilon(R)$. In the end, it suffices to choose a convenient $R$, depending on $\epsilon$.
We do not enter into the details of this computation, since several estimates of this type can be found in [10] for the Boltzmann equation, and in [21] for the Landau equation with hard potentials. The same technique also covers all kernels $B$ for which
\[
\left|\left\{(z,\omega) :\; B(z,\omega) \le \epsilon\,\psi(|z|)\left(\frac{|z\cdot\omega|}{|z|}\right)^{N-2},\; |z| \le R\right\}\right| = O(R^\beta\epsilon^\delta) \tag{107}
\]
for some positive numbers $\beta < N$, $\delta > 0$, as $R$ goes to infinity and $\epsilon$ goes to 0; that is, precisely those kernels such that the set of points for which they are not bounded below is of very small measure. As a conclusion, for all kernels $B$ satisfying (107), our method yields an algebraic estimate of the form
\[
D(f) \ge C_\alpha(f)\,H(f|M)^\alpha, \tag{108}
\]
with $\alpha$ and $C_\alpha(f)$ explicitly computable, and depending on $\beta$, $\delta$ in (107), $\gamma$ in (7), and various norms of $f$, say in weighted $L^2$.

Let us briefly comment on the conditions of Theorem 7. As proven recently by Pulvirenti and Wennberg [37], a Maxwellian lower bound for the solution to (1) in the case of hard potentials is available at any positive time, provided the initial datum has bounded energy and entropy. On the other hand, the Ornstein-Uhlenbeck semigroup produces such bounds. But, as the numerical applications in [10] show, these are terribly small, and it is not clear whether they would be useful. Essentially, for hard potentials, solutions automatically have a good decay at infinity, and with bounds that are uniform in time. In particular, weighted $L^p$-bounds on the solution propagate uniformly in time if sufficiently many moments exist at time $t = 0$. A fairly complete study of these uniform boundedness properties in $L^p_r$ was done by Gustafsson in [28]. This result was improved by Wennberg in [50]. In any case, provided the conditions of Gustafsson's theorems are satisfied, both weighted $L^2$-norms and the Maxwellian lower bound are available, uniformly in time. This allows us to transform the inequality (108) into a theorem of decay to equilibrium with explicit rate. On the contrary, for soft potentials it is an open problem whether the bounds can be found to hold uniformly in time. Yet a study of trend to equilibrium can be performed if these bounds are growing slowly: this will be the object of another work.

Let us also comment on the lower estimate for $\lambda = N - T_f$ in Theorem 7. Even though $\lambda$ is estimated below in terms only of $H(f)$ and the normalization (36), this gives very poor estimates, as the numerical applications in [21] show. Indeed, the entropy is very bad at controlling the concentration on sets of small measure. Here again, working in an $L^2$ framework enables far better estimates.
An interesting feedback effect was studied in [21]: under suitable assumptions, as time goes by, solutions of the Boltzmann equation converge towards the Maxwellian distribution, say in $L^1_2$, and hence all the directional temperatures of $f(t)$ converge towards the equilibrium value 1. Therefore, the constant $\lambda$ essentially becomes better with time, and is equal to $N - 1$ asymptotically. This effect is a direct consequence of the nonlinearity of the Boltzmann equation. By the way, note that in the particular case of radial solutions, one always has $T_f = 1$, hence $\lambda = N - 1$.
7. The Kac Model

In this section, we show how the previous analysis can be extended to the Kac model. We recall that Kac's collision operator reads, for a distribution $f(v)$, $v \in \mathbb{R}$,
\[
Q_K(f, f) = \int_0^{2\pi} d\theta\int_{\mathbb{R}} dv_*\,\bigl[f'f'_* - ff_*\bigr], \tag{109}
\]
where we denote by $d\theta$ the normalized measure on $(0, 2\pi)$, and
\[
\begin{cases}
v' = v\cos\theta - v_*\sin\theta \\
v'_* = v\sin\theta + v_*\cos\theta.
\end{cases}
\tag{110}
\]
Hence the postcollisional velocities are simply obtained by a rotation of angle $\theta$ in the space $(v, v_*)$.

There is only one collisional invariant in the Kac model, namely $e = (|v|^2 + |v_*|^2)/2$. It is clear that for any function $f$, $\int d\theta\,f'f'_*$ depends only on $e$. Proposition 3 extends trivially to the Kac model, and all the subsequent analysis can be done. But the peculiarity of the dimension 1 is that there is no corresponding Landau equation, since the orthogonal projection $\Pi$ defined by (57) is meaningless. Therefore, we have to change the proof of Proposition 8 because we cannot rely on the use of the linear operator (95). This can be linked to the following elementary observation. For any vector function $g$ on $\mathbb{R}^N$ (think of $g$ as $\nabla\log f$), the property
\[
\forall (v, v_*),\quad g(v) - g(v_*) = \lambda_{v,v_*}(v - v_*) \;\Longrightarrow\; \forall v,\quad g(v) = \lambda v + \mu \qquad (\lambda \in \mathbb{R},\ \mu \in \mathbb{R}^N),
\]
holds in any dimension $N \ge 2$, but is obviously false if $N = 1$. The following proposition is thus a replacement for Proposition 3.

Proposition 10. Assume $N = 1$. Let $f$ be a smooth function with logarithm quadratically bounded, with unit mass and temperature, and let $G$ be a function of $e$. Then
\[
\int dX\,|X|^2\,F\left|\frac{\nabla F}{F} - \frac{\nabla G}{G}\right|^2 \ge I(f|M). \tag{111}
\]

Proof. This time,
\[
\frac{\nabla G}{G} = \left[v\,\frac{G'}{G}(e),\; v_*\,\frac{G'}{G}(e)\right].
\]
Hence we can apply the “killing operator”
\[
P(v, v_*) : [A, B] \longmapsto v_*A - vB.
\]
For all $(v, v_*)$, the square norm of $P$ is bounded by $2(|v|^2 + |v_*|^2) = 2|X|^2$. Defining
\[
K(v, v_*) = \frac{\nabla F}{F} - \frac{\nabla G}{G},
\]
we see that
\[
PK(v, v_*) = v_*\,\frac{f'(v)}{f(v)} - v\,\frac{f'(v_*)}{f(v_*)}.
\]
Hence,
\[
\begin{aligned}
\int dX\,F\,|X|^2\left|\frac{\nabla F}{F} - \frac{\nabla G}{G}\right|^2
&\ge \frac12\int dv\,dv_*\,ff_*\left[v_*\,\frac{f'(v)}{f(v)} - v\,\frac{f'(v_*)}{f(v_*)}\right]^2 \\
&= \int dv_*\,f(v_*)|v_*|^2\int dv\,\frac{f'(v)^2}{f(v)} - \int dv\,vf'(v)\int dv_*\,v_*f'(v_*).
\end{aligned}
\]
Integrating by parts the second term, we obtain simply $I(f) - 1 = I(f|M)$. $\square$
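Two structural facts are used in this section: the rotation (110) preserves $|v|^2 + |v_*|^2$, and the one-dimensional operator $P : [A, B] \mapsto v_*A - vB$ annihilates any pair of the form $[v\,g, v_*\,g]$, which is the form of $\nabla G/G$ when $G$ depends only on $e$. The following sketch checks both on random inputs, together with the bound on the square norm of $P$. It is only an illustration; the scalar `g` stands for $G'(e)/G$ and the function names are ours.

```python
import math
import random


def kac_collision(v, vs, theta):
    """Rotation (110) of the pair (v, v_*) by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return v * c - vs * s, v * s + vs * c


def kac_killing(v, vs, A, B):
    """P(v, v_*): [A, B] -> v_* A - v B."""
    return vs * A - v * B


def check_kac(trials=1000, seed=3):
    """Return (worst energy defect, worst |P(grad G / G)|, worst norm ratio)."""
    rng = random.Random(seed)
    worst_energy, worst_kill, worst_ratio = 0.0, 0.0, 0.0
    for _ in range(trials):
        v, vs = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        vp, vsp = kac_collision(v, vs, theta)
        worst_energy = max(worst_energy, abs(vp * vp + vsp * vsp - v * v - vs * vs))
        g = rng.gauss(0.0, 1.0)                      # role of G'(e) / G
        worst_kill = max(worst_kill, abs(kac_killing(v, vs, v * g, vs * g)))
        A, B = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        ratio = kac_killing(v, vs, A, B) ** 2 / ((v * v + vs * vs) * (A * A + B * B))
        worst_ratio = max(worst_ratio, ratio)
    return worst_energy, worst_kill, worst_ratio


if __name__ == "__main__":
    print(check_kac())
```

The first two numbers vanish up to rounding. The third stays below 1 by the Cauchy–Schwarz inequality, so the bound $2|X|^2$ quoted in the proof holds with room to spare.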
At this point we can apply the ideas of the previous section, concluding in particular with the algebraic decay to equilibrium in relative entropy of the solution to the Kac equation. On the other hand, due to the particular symmetries of this one-dimensional model, Theorem 7 can be improved in a number of ways.

Theorem 11. Assume $N = 1$. Let $f \in L^1_2(\mathbb{R})$ satisfy the normalization (36). If in addition, for some $s > 0$, $\|f\|_{L^1_{2+s}} < \infty$, and
\[
I_2(f) = \int dv\,\frac{f'(v)^4}{f(v)^3} < \infty, \tag{112}
\]
then for all $\varepsilon > 0$ there exists a constant $C_{s,\varepsilon}(f)$ depending only on $s$, $\varepsilon$, $\|f\|_{L^1_{2+s}}$ and $I_2(f)$, such that
\[
D(f) \ge C_{s,\varepsilon}(f)\,H(f|M)^{1+(2+\varepsilon)/s}. \tag{113}
\]
Remark. It is known that if $I_2(f_0)$ is finite, then $I_2(f_t)$ remains bounded, uniformly in time, if $f_t$ is the solution to the Kac equation with initial datum $f_0$: more precisely [25, 40],
\[
I_2(S_tf) \le \max\bigl\{I_2(f_0),\; C\,I(f_0)^2\bigr\},
\]
where $C$ is numerical.

Proof. Let us repeat the argument of Proposition 10. We consider a function $f$ which is smooth and whose logarithm is quadratically bounded. Let $G$ be a function of $e$. Then, for any $R > 0$,
\[
\begin{aligned}
\int dX\,F\left|\frac{\nabla F}{F} - \frac{\nabla G}{G}\right|^2
&\ge \frac{1}{2R^2}\int_{|v|\le R,\,|v_*|\le R} dX\,F\,|X|^2\left|\frac{\nabla F}{F} - \frac{\nabla G}{G}\right|^2 \\
&\ge \frac{1}{4R^2}\int_{|v|\le R,\,|v_*|\le R} dv\,dv_*\,ff_*\left[v_*\,\frac{f'(v)}{f(v)} - v\,\frac{f'(v_*)}{f(v_*)}\right]^2 \\
&= \frac{1}{2R^2}\int_{|v_*|\le R} dv_*\,f(v_*)|v_*|^2\int_{|v|\le R} dv\,\frac{f'(v)^2}{f(v)} \\
&\qquad - \frac{1}{2R^2}\int_{|v|\le R} dv\,vf'(v)\int_{|v_*|\le R} dv_*\,v_*f'(v_*)
\end{aligned}
\tag{114}
\]
\[
\equiv \frac{1}{2R^2}\,\Delta(f).
\]
Writing
\[
\int_{|v|\le R} dv\,\frac{f'(v)^2}{f(v)} = I(f) - \int_{|v|>R} dv\,\frac{f'(v)^2}{f(v)}, \qquad
\int_{|v|\le R} dv\,vf'(v) = -1 - \int_{|v|>R} dv\,vf'(v),
\]
we see that
\[
\begin{aligned}
\Delta(f) = {}& I(f|M) + \int_{|v|>R} dv\,|v|^2f(v)\int_{|v|>R} dv\,\frac{f'(v)^2}{f(v)} - \int_{|v|>R} dv\,\frac{f'(v)^2}{f(v)} \\
& - I(f)\int_{|v|>R} dv\,|v|^2f(v) - 2\int_{|v|>R} dv\,vf'(v) - \left(\int_{|v|>R} dv\,vf'(v)\right)^2.
\end{aligned}
\tag{115}
\]
By Cauchy–Schwarz inequality,
\[
\left(\int_{|v|>R} dv\,vf'(v)\right)^2 \le \int_{|v|>R} dv\,|v|^2f(v)\int_{|v|>R} dv\,\frac{f'(v)^2}{f(v)}. \tag{116}
\]
Thus, $\Delta(f) \ge I(f|M) - G_R(f)$, where we define
\[
\begin{aligned}
G_R(f) &= I(f)\int_{|v|>R} dv\,|v|^2f(v) + \int_{|v|>R} dv\,\frac{f'(v)^2}{f(v)} + 2\int_{|v|>R} dv\,vf'(v) \qquad (117) \\
&= I(f|M)\int_{|v|>R} dv\,|v|^2f(v) + \int_{|v|>R} dv\left[|v|^2f(v) + 2vf'(v) + \frac{f'(v)^2}{f(v)}\right] \\
&= I(f|M)\int_{|v|>R} dv\,|v|^2f(v) + \int_{|v|>R} dv\left[\frac{f'(v)}{f(v)} + v\right]^2f(v).
\end{aligned}
\]
Now, we estimate $G_R(S_t f)$ for $t > 0$. First, clearly,
\[
I(S_tf|M)\int_{|v|>R} dv\,|v|^2(S_tf)(v) \le I(S_tf|M)\,\frac{C_s}{R^s},
\]
where $C_s$ depends only on $\|f\|_{L^1_{2+s}}$.
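Next, for all $\varepsilon > 0$,
\[
\begin{aligned}
\int_{|v|>R} dv\left[\frac{(S_tf)'(v)}{(S_tf)(v)} + v\right]^2(S_tf)(v)
&\le I(S_tf|M)^{\frac{\varepsilon}{1+\varepsilon}}\left[2\int_{|v|>R} dv\,\frac{(S_tf)'(v)^2}{S_tf(v)} + 2\int_{|v|>R} dv\,|v|^2(S_tf)(v)\right]^{\frac{1}{1+\varepsilon}} \\
&\le I(f|M)^{\frac{\varepsilon}{1+\varepsilon}}\,e^{-\frac{2\varepsilon t}{1+\varepsilon}}\left[2\int_{|v|>R} dv\,\frac{(S_tf)'(v)^2}{S_tf(v)} + 2\,\frac{C_s}{R^s}\right]^{\frac{1}{1+\varepsilon}},
\end{aligned}
\]
where we have used (45). By Cauchy-Schwarz inequality,
\[
\int_{|v|>R} dv\,\frac{(S_tf)'(v)^2}{S_tf(v)} \le I_2(S_tf)^{1/2}\left(\int_{|v|>R} dv\,S_tf(v)\right)^{1/2}.
\]
Next, we use the fact that $I_2$ is bounded uniformly in time for solutions of the one-dimensional Fokker–Planck equation [40]:
\[
I_2(S_tf) \le \max\bigl\{e^4I_2(f),\; (1 - e^{-2})^{-2}I_2(M)\bigr\}.
\]
The proof of this inequality relies on the method developed by Lions and the first author [32] for proving refined estimates of the central limit theorem. Combining this with the boundedness of $\|f\|_{L^1_{2+s}}$, we easily obtain
\[
G_R(S_tf) \le \frac{C_s}{R^s}\,I(S_tf|M)^{\frac{\varepsilon}{1+\varepsilon}} + \frac{C_{s,\varepsilon}}{R^{s/(1+\varepsilon)}}\,e^{-\frac{2\varepsilon t}{1+\varepsilon}}. \tag{118}
\]
Since (by Theorem 6 with $\psi \equiv 1$)
\[
D(f) \ge \frac{1}{2R^2}\int_0^\infty dt\,\bigl[I(S_tf|M) - G_R(S_tf)\bigr],
\]
by (118) we see that if $R \ge R_s$ (depending only on $s$ and on $\|f\|_{L^1_{2+s}}$, with $R_s \ge 1$),
\[
D(f) \ge \frac{1}{2R^2}\left[\frac12\,H(f|M) - \frac{C_{s,\varepsilon}}{R^{s/(1+\varepsilon)}}\right], \tag{119}
\]
where $C_{s,\varepsilon}$ depends only on $I_2(f)$, $\|f\|_{L^1_{2+s}}$, $s$ and $\varepsilon$. Note indeed that if, for some $a > 0$, we denote by $t_0$ the first time $t$ such that $I(S_t f|M) \le a$, then (this is a rough estimate!)
\[
\begin{aligned}
\int_0^{+\infty} I(S_tf|M)^{\frac{\varepsilon}{1+\varepsilon}}\,dt
&= \int_0^{t_0} I(S_tf|M)^{\frac{\varepsilon}{1+\varepsilon}}\,dt + \int_{t_0}^{+\infty} I(S_tf|M)^{\frac{\varepsilon}{1+\varepsilon}}\,dt \\
&\le a^{-\frac{1}{1+\varepsilon}}\int_0^{+\infty} I(S_tf|M)\,dt + a^{\frac{\varepsilon}{1+\varepsilon}}\int_0^{+\infty} e^{-\frac{2\varepsilon}{1+\varepsilon}t}\,dt,
\end{aligned}
\]
and we just have to choose $C_s a^{-1/(1+\varepsilon)} = 1/2$ to make sure that the estimate (119) holds. Choosing in (119) $R^{-s/(1+\varepsilon)} = \min\bigl\{(4C_{s,\varepsilon})^{-1}H(f|M),\; R_s^{-s/(1+\varepsilon)}\bigr\}$, we get the result (with $2\varepsilon$ in place of $\varepsilon$). $\square$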
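The step from the time integral of $I(S_t f|M)$ to $H(f|M)$ can be made concrete on Gaussian data. The sketch below makes several assumptions of ours: $N = 1$; $f$ is a centered Gaussian of variance $T_0$ (which violates the unit-temperature normalization (36) unless $T_0 = 1$, so this is only an illustration); $S_t$ acts on Gaussians through the temperature law $T(t) = 1 + (T_0 - 1)e^{-2t}$, as in the computation of $S_t(e^{-A|v|^2})$ in the proof of Proposition 9; and the closed forms $I = (T-1)^2/T$, $H = (T - 1 - \log T)/2$ are the standard relative Fisher information and relative entropy of a centered Gaussian of variance $T$ with respect to $M$.

```python
import math


def temperature(t, T0):
    """Variance of S_t f when f is a centered Gaussian of variance T0 (assumed law)."""
    return 1.0 + (T0 - 1.0) * math.exp(-2.0 * t)


def relative_fisher(T):
    """I(g|M) for a centered Gaussian g of variance T, in dimension 1."""
    return (T - 1.0) ** 2 / T


def relative_entropy(T):
    """H(g|M) for a centered Gaussian g of variance T, in dimension 1."""
    return 0.5 * (T - 1.0 - math.log(T))


def integrated_fisher(T0, t_max=40.0, steps=40000):
    """Composite trapezoidal rule for the integral of I(S_t f | M) over [0, t_max]."""
    h = t_max / steps
    total = 0.5 * (relative_fisher(temperature(0.0, T0))
                   + relative_fisher(temperature(t_max, T0)))
    for k in range(1, steps):
        total += relative_fisher(temperature(k * h, T0))
    return total * h


if __name__ == "__main__":
    for T0 in (0.5, 3.0):
        print(T0, integrated_fisher(T0), relative_entropy(T0))
```

For each $T_0$ the two printed values coincide up to quadrature error, in line with the identity $\int_0^\infty I(S_t f|M)\,dt = H(f|M)$ used to pass to (119).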
Theorem 11 gives a lower bound on the entropy production for the Kac equation with a constant depending on the functional $I_2$. The boundedness of this functional replaces the bounds on $L^1_s\log L$, and the moment condition is also better. On the other hand, this functional has been shown to be uniformly bounded in time (cf. [23]) along the solution to the Kac equation. Hence, if the initial datum $f_0$ for the Kac equation is such that $I_2(f_0) < \infty$, the decay to equilibrium follows.

McKean [34] has studied the rate of convergence to equilibrium in the Kac model, conjecturing the existence of a lower bound of type (113), without specifying the nature of the positive term on the right side. McKean's paper contains a number of general ideas, including the validity of formula (53), as well as the introduction of Fisher information (that he called Linnik functional), and its connection with the trend to equilibrium. For his study of the decay, McKean used the regularization of the solution $f(t)$, taking the convolution with a Maxwellian of small energy: $f^\delta = f * M_\delta$. Then $I_2(f^\delta)$ is bounded, and the conditions of the previous theorem are automatically satisfied if some moment of order higher than 2 is finite.

8. Remarks About Fisher Information and Entropy Dissipation

We begin with a trivial assertion. Let $(B^t)_{t\ge 0}$ be a semigroup commuting with the adjoint Ornstein-Uhlenbeck semigroup $(S_t)_{t\ge 0}$, and let $D$ be the associated entropy dissipation functional. Then
\[
\frac{d}{dt}\bigg|_{t=0} D(S_tf) = \frac{d}{dt}\bigg|_{t=0} I(B^tf). \tag{120}
\]
Indeed, in view of the commuting property, both terms are equal to
\[
\frac{d}{dt}\bigg|_{t=0}\frac{d}{ds}\bigg|_{s=0} H(B^sS_tf).
\]
In other words, the evolution of the entropy dissipation along the adjoint Ornstein-Uhlenbeck semigroup is given by the evolution of the Fisher information along the semigroup $(B^t)$. As a first application, we can recover in a straightforward way the first term in the right-hand side of (88), without the use of formula (78).
More precisely, we shall show that for any two smooth functions $F$ and $G$,
\[
\frac{d}{dt}\bigg|_{t=0}\int dX\,(S_tF - S_tG)\log\frac{S_tF}{S_tG} = -\int dX\,(F+G)\left|\frac{\nabla F}{F} - \frac{\nabla G}{G}\right|^2. \tag{121}
\]
To this purpose, we introduce the linear system
\[
\begin{cases}
\partial_tF = (G - F) \\
\partial_tG = (F - G).
\end{cases}
\tag{122}
\]
It is clear that, if we set $H = H(F) + H(G)$, then the time-dissipation of $H$ associated to the system (122) is the functional
\[
D(F, G) = \int (F-G)\log\frac{F}{G}.
\]
But the semigroup associated to the system (122) obviously commutes with the system
\[
\begin{cases}
\partial_tF = LF \\
\partial_tG = LG,
\end{cases}
\tag{123}
\]
where $L$ stands for the linear Fokker–Planck operator, since $LF - LG = L(F - G)$. Hence, to prove (121), it is sufficient to compute the time-derivative of $I = I(F) + I(G)$ along solutions of the system (122). This computation is immediate and yields the desired result.

Let us now choose for $(B^t)$ the semigroup associated to the Boltzmann equation with Maxwellian molecules, and think the other way. In view of Bobylev's lemma [5], $(B^t)$ commutes with $(S_t)$. Hence, in all the cases when the entropy dissipation $D$ is directly seen to be decreasing under evolution by the adjoint Ornstein-Uhlenbeck semigroup, we recover by the remark above an alternative proof that $I$ is decreasing along solutions of the Boltzmann equation (note that it suffices to deal with smooth functions, because of the regularizing properties of the Fokker–Planck equation). That $D$ is decreasing along $(S_t)$ can be seen directly at least in three different cases, with the help of the results of Sect. 3.

The case $N = 2$. For Maxwellian molecules in two dimensions, we can write, following the notations of Proposition 3,
\[
D(f) = \int_0^\pi d\theta\,\zeta(\theta)\,D(F, {}^tR_\theta F)
\]
for some nonnegative function $\zeta$. For each $\theta$, $D(F, {}^tR_\theta F)$ is decreasing along the adjoint Ornstein-Uhlenbeck semigroup, and hence $D$ also. As a consequence, we have a new proof of the result by Toscani [38] that $I$ is decreasing along solutions of the Boltzmann equation with Maxwellian molecules in two dimensions.

The case $B(z, \omega)$ constant. If $B$ is a constant, then we can write, following the notations of Proposition 4,
\[
D(f) = \int_{S^{N-1}} d\omega\,D(F, {}^tT_\omega F),
\]
and conclude as before. This gives a new proof of the result by Carlen and Carvalho [9] that $I$ is decreasing along solutions of the Boltzmann equation with constant kernel.

In fact, in both of the previous cases, one can also adapt and simplify the argument given by McKean for Kac's model: to prove that $D(f * M_\delta) \le D(f)$ (which implies the decreasing property of $I$ by differentiation in $\delta$), write $j(x, y) = (x - y)\log(x/y)$ and note that, by Jensen's inequality, for given $\theta$ (or $\omega$, with $T_\omega$ in place of $R_\theta$),
\[
j\bigl(F * M_\delta,\; ({}^tR_\theta F) * M_\delta\bigr) \le j(F, {}^tR_\theta F) * M_\delta.
\]
Integration with respect to $dv\,dv_*$, and use of the translational invariance of $D$ yield then
\[
D\bigl(F * M_\delta,\; ({}^tR_\theta F) * M_\delta\bigr) \le D(F, {}^tR_\theta F),
\]
and the conclusion follows.
The case $\widetilde{B}(z, \sigma)$ constant. For simplicity we treat the case $N = 3$. If $B = |z\cdot\omega|/|z|$, then $\widetilde{B}$ is constant, and
\[
D(f) = \int_{S^{N-1}} d\omega\int dX\,|k\cdot\omega|\,(F - G_\omega)\log\frac{F}{G_\omega},
\]
with $G_\omega = {}^tT_\omega F$. Applying Proposition 5, we see that it suffices to prove that for each $\omega$,
\[
\int dX\,L^*(|k\cdot\omega|)\,(F - G_\omega)\log\frac{F}{G_\omega} \le 0. \tag{124}
\]
Let us compute $L^*(|k\cdot\omega|)$. First, using the general formula
\[
\nabla_v[b(k\cdot\omega)] = \frac{1}{|v-v_*|}\,b'(k\cdot\omega)\,\Pi_{k^\perp}\omega, \tag{125}
\]
we find that
\[
\nabla_X(|k\cdot\omega|) = \frac{\operatorname{sgn}(k\cdot\omega)}{|v-v_*|}\,\bigl[\Pi_{k^\perp}\omega,\; -\Pi_{k^\perp}\omega\bigr].
\]
This term is well-defined only for $k\cdot\omega \ne 0$, but since we can restrict to functions $F$ and $G$ that are smooth, this does not matter here. As a consequence, $X\cdot\nabla_X(|k\cdot\omega|)$ is a multiple of $v\cdot\Pi_{k^\perp}\omega - v_*\cdot\Pi_{k^\perp}\omega = (v - v_*)\cdot\Pi_{k^\perp}\omega = 0$. A similar computation shows that
\[
\begin{aligned}
\Delta_X(|k\cdot\omega|) = {}& -2\,\operatorname{sgn}(k\cdot\omega)\,\frac{v-v_*}{|v-v_*|^3}\cdot\Pi_{k^\perp}\omega && (126) \\
& + \frac{4}{|v-v_*|}\,\delta_{(k\cdot\omega)=0}\,\frac{1}{|v-v_*|}\,\Pi_{k^\perp}\omega\cdot\Pi_{k^\perp}\omega && (127) \\
& - \frac{4}{|v-v_*|^2}\,|k\cdot\omega|, && (128)
\end{aligned}
\]
where to compute the last term we have used formula (125) and the relation $\operatorname{sgn}(u)u = |u|$. The expression (126) is 0 because $v - v_*$ and $\Pi_{k^\perp}\omega$ are orthogonal. The contribution of (127) to the integral (124) is also 0 because when $(k\cdot\omega) = 0$, then $F = G_\omega$, and $F$, $G_\omega$ are smooth. Finally, the expression (128) is nonpositive. Gathering all of this, we obtain that the inequality (124) actually holds.

We do not know if by this method one can recover the general theorem that Fisher's information is decreasing along solutions of the Boltzmann equation with Maxwellian molecules in any dimension [47]. But we found this connection with the problem of finding a lower bound for the entropy dissipation rather striking.

Acknowledgement. The main part of this work was done while the second author was visiting the Mathematics Department of the University of Pavia. It is a pleasure for him to thank the whole Department for their kind hospitality. The first author acknowledges the partial support of the National Council for Researches of Italy, Gruppo Nazionale per la Fisica Matematica. Both authors acknowledge the support of the European TMR Project Kinetic Applications, contract ERB FMBX-CT97-0157.
References
1. Abrahamsson, F.: Strong L1 convergence to equilibrium without entropy conditions for the spatially homogeneous Boltzmann equation. Preprint NO 1997-43, Dept. of Math., Chalmers Univ. of Tech. Göteborg, 1997
2. Arnold, A., Markowich, P., Toscani, G. and Unterreiter, A.: On logarithmic Sobolev inequalities, Csiszar–Kullback inequalities, and the rate of convergence to equilibrium for Fokker–Planck type equations. Preprint, 1997
3. Arkeryd, L.: Stability in L1 for the spatially homogeneous Boltzmann equation. Arch. Rat. Mech. Anal. 103 (2), 151–167 (1988)
4. Bakry, D. and Emery, M.: Diffusions hypercontractives. (in French) Lect. Notes Math. 1123, Sém. XIX: 177–206, 1985
5. Bobylev, A.V.: The theory of the nonlinear, spatially uniform Boltzmann equation for Maxwellian molecules. Sov. Sci. Rev. C. Math. Phys. 7, 111–233 (1988)
6. Bobylev, A.V. and Cercignani, C.: On the rate of entropy production for the Boltzmann equation. J. Stat. Phys. 94, 603–618 (1999)
7. Boltzmann, L.: Lectures on Gas Theory. Reprinted by Dover Publications, 1995
8. Carlen, E.: Superadditivity of Fisher's information and logarithmic Sobolev inequalities. J. Funct. Anal. 101 (1), 194–211 (1991)
9. Carlen, E. and Carvalho, M.: Strict entropy production bounds and stability of the rate of convergence to equilibrium for the Boltzmann equation. J. Stat. Phys. 67 (3–4), 575–608 (1992)
10. Carlen, E. and Carvalho, M.: Entropy production estimates for Boltzmann equations with physically realistic collision kernels. J. Stat. Phys. 74 (3–4), 743–782 (1994)
11. Carlen, E., Esposito, R., Lebowitz, J.L., Marra, R. and Rokhlenko, A.: Kinetics of a model weakly ionized plasma in the presence of multiple equilibria. Preprint, 1996
12. Carlen, E., Gabetta, E. and Toscani, G.: Propagation of smoothness in velocities and strong exponential convergence for Maxwellian molecules. Commun. Math. Phys. 199, 521–546 (1999)
13. Carlen, E. and Soffer, A.: Entropy production by block variable summation and central limit theorems. Commun. Math. Phys. 140, 339–371 (1991)
14. Carrillo, J.A. and Toscani, G.: Exponential convergence toward equilibrium for homogeneous Fokker–Planck-type equations. Math. Mod. Meth. Appl. Sci. 21, 1269–1286 (1998)
15. Cercignani, C.: H-theorem and trend to equilibrium in the kinetic theory of gases. Arch. Mech. 34, 231–241 (1982)
16. Cercignani, C.: The Boltzmann equation and its applications. New York: Springer, 1988
17. Cercignani, C., Illner, R. and Pulvirenti, M.: The mathematical theory of dilute gases. New York: Springer-Verlag, 1994
18. Csiszar, I.: Information-type measures of difference of probability distributions and indirect observations. Stud. Sci. Math. Hung. 2, 299–318 (1967)
19. Desvillettes, L.: Entropy dissipation rate and convergence in kinetic equations. Commun. Math. Phys. 123 (4), 687–702 (1989)
20. Desvillettes, L. and Villani, C.: On the spatially homogeneous Landau equation for hard potentials. Part I: Existence, uniqueness and smoothness. To appear in Comm. P.D.E.
21. Desvillettes, L. and Villani, C.: On the spatially homogeneous Landau equation for hard potentials. Part II: H-theorem and applications. To appear in Comm. P.D.E.
22. Donsker, M.D. and Varadhan, S.R.S.: Asymptotic evaluation of certain Markov process expectations for large time, I. Comm. Pure Appl. Math. 28, 1–47 (1975)
23. Gabetta, E.: On a conjecture of McKean with application to Kac's model. Transp. Theo. Statis. Phys. 24, 305–318 (1995)
24. Gabetta, E. and Toscani, G.: On entropy production rates for some kinetic equations. Bull. Tech. Univ. Istanbul 47, 219–230 (1994)
25. Gabetta, E. and Pareschi, L.: About the non cut-off Kac equation: uniqueness and asymptotic behaviour. Comm. Appl. Nonlinear Anal. 4, 1–20 (1997)
26. Gabetta, E., Toscani, G. and Wennberg, B.: Metrics for probability distributions and the trend to equilibrium for solutions of the Boltzmann equation. J. Stat. Phys. 81, 901–934 (1995)
27. Gross, L.: Logarithmic Sobolev inequalities. Am. J. Math. 97, 1061–1083 (1975)
28. Gustafsson, T.: Global Lp-properties for the spatially homogeneous Boltzmann equation. Arch. Rat. Mech. Anal. 103, 1–39 (1988)
29. Kullback, S.: A lower bound for discrimination information in terms of variation. IEEE Trans. Inf. The. 4, 126–127 (1967)
30. Lifchitz, E.M. and Pitaevskii, L.P.: Physical Kinetics – Course in theoretical physics. Vol. 10, Oxford: Pergamon, 1981
31. Lions, P.L.: Compactness in Boltzmann's equation via Fourier integral operators and applications. J. Math. Kyoto Univ. 34 (2), 391–427 (1994)
32. Lions, P.L. and Toscani, G.: A strengthened central limit theorem for smooth densities. J. Funct. Anal. 128, 148–167 (1995)
33. Maxwell, J.C.: On the dynamical theory of gases. Phil. Trans. R. Soc. Lond. 157, 49–88 (1867)
34. McKean, H.P.: Speed of approach to equilibrium for Kac's caricature of a Maxwellian gas. Arch. Rat. Mech. Anal. 21, 343–367 (1966)
35. Morgenstern, D.: Analytical studies related to the Maxwell-Boltzmann equation. J. Rat. Mech. Anal. 4, 533–555 (1955)
36. Perthame, B.: Introduction to the collision models in Boltzmann's theory. Preprint, Univ. Pierre et Marie Curie, 1995
37. Pulvirenti, A. and Wennberg, B.: A Maxwellian lower bound for solutions to the Boltzmann equation. Commun. Math. Phys. 183, 145–160 (1997)
38. Toscani, G.: New a priori estimates for the spatially homogeneous Boltzmann equation. Cont. Mech. Thermodyn. 4, 81–93 (1992)
39. Toscani, G.: Entropy production and the rate of convergence to equilibrium for the Fokker-Planck equation. To appear in Quarterly of Appl. Math.
40. Toscani, G.: The grazing collisions asymptotics of the non cut-off Kac equation. Math. Mod. Num. An. 32, 763–772 (1998)
41. Toscani, G. and Villani, C.: Probability metrics and uniqueness of the solution to the Boltzmann equation for a Maxwell gas. J. Stat. Phys. 94, 619–637 (1999)
42. Truesdell, C. and Muncaster, R.G.: Fundamentals of Maxwell's kinetic theory of a simple monatomic gas. New York: Academic Press, 1980
43. Villani, C.: On a new class of weak solutions to the spatially homogeneous Boltzmann and Landau equations. Arch. Rat. Mech. Anal. 143 (3), 241–271 (1998)
44. Villani, C.: On the spatially homogeneous Landau equation for Maxwellian molecules. Math. Mod. Meth. Appl. Sci. 8 (6), 957–983 (1998)
45. Villani, C.: Conservative forms of Boltzmann's collision operator: Landau revisited. To appear in Math. Mod. Num. An.
46. Villani, C.: Decrease of the Fisher information for the Landau equation with Maxwellian molecules. To appear in Math. Mod. Meth. Appl. Sci.
47. Villani, C.: Fisher information bounds for Boltzmann's collision operator. J. Math. Pures Appl. 77, 821–837 (1998)
48. Wennberg, B.: On an entropy dissipation inequality for the Boltzmann equation. C. R. Acad. Sci. Paris, t. 315, Série I, 1441–1446 (1992)
49. Wennberg, B.: Stability and exponential convergence for the Boltzmann equation. PhD thesis, Chalmers Univ. Tech., 1993
50. Wennberg, B.: Regularity in the Boltzmann equation and the Radon transform. Comm. P.D.E. 11 & 12 (19), 2057–2074 (1994)
51. Wennberg, B.: Entropy dissipation and moment production for the Boltzmann equation. J. Stat. Phys. 86 (5/6), 1053–1066 (1997)

Communicated by J. L. Lebowitz
Commun. Math. Phys. 203, 707 – 712 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Geometry of the Constraint Sets for Yang–Mills–Dirac Equations with Inhomogeneous Boundary Conditions

Jędrzej Śniatycki

Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada

Received: 29 June 1998 / Accepted: 28 December 1998
Abstract: The constraint equation for minimally coupled Yang–Mills and Dirac fields in bounded domains is studied under the inhomogeneous boundary conditions which admit unique solutions of the evolution equations. For each value of the boundary data, the constraint set is shown to be a submanifold of the extended phase space. It is a principal fibre bundle over the reduced phase space with structure group consisting of the gauge symmetries which coincide on the boundary with the identity transformation up to the first order of contact.

1. Introduction

Let $M$ be a compact domain in $\mathbb{R}^3$ with smooth boundary $\partial M$, representing the physical space accessible to the fields, and $\mathfrak{g}$ the Lie algebra of a compact Lie group $G$ presented as a subgroup of the automorphism group of a finite dimensional vector space $V$. The Cauchy data $(A, E, \Psi)$ for the equations under investigation form the extended phase space
\[ P = \{(A, E, \Psi) \in H^2(M, \mathfrak{g}\otimes\mathbb{R}^3) \times H^1(M, \mathfrak{g}\otimes\mathbb{R}^3) \times H^2(M, \mathbb{C}^4\otimes V)\}. \tag{1} \]
Here the Cauchy data $(A, E)$ for the Yang–Mills fields are considered to be $\mathfrak{g}$-valued vector fields on $M$. The matter fields are Dirac spinor fields $\Psi$ on $M$ with values in the vector space $V$. In the preceding paper [1], we established local-in-time existence and uniqueness of solutions of the Yang–Mills–Dirac evolution equations subject to the gauge condition
\[ \Delta A_0 = -\operatorname{div} E, \qquad n(\operatorname{grad} A_0) = -nE \quad\text{and}\quad \int_M A_0\, d^3x = 0, \]
and the boundary conditions
\[ t(\operatorname{curl} A(t)) = \lambda(t) \in H^{1/2}(\Gamma(T\partial M\otimes\mathfrak{g})), \tag{2} \]
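\[ \tfrac{1}{2}\,(\mathrm{Id} - i\gamma^k n_k)\,\Psi(t)\,|_{\partial M} = \mu(t) \in H^{3/2}(\partial M, \mathbb{C}^4\otimes V), \tag{3} \]
\[ -\tfrac{1}{2}\,(\mathrm{Id} - i\gamma^k n_k)\,\gamma^0(\gamma^k\partial_k + im)\,\Psi(t)\,|_{\partial M} = \nu(t) \in H^{1/2}(\partial M, \mathbb{C}^4\otimes V), \tag{4} \]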
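where $A_0$ is the scalar potential, $t$ and $n$ denote the components tangential and normal to $\partial M$, respectively, and $\gamma^0$ and $\gamma^k$ are the Dirac matrices acting in $\mathbb{C}^4$. The boundary data $(\lambda(t), \mu(t), \nu(t))$ are not arbitrary; they have to satisfy the following consistency conditions:
\[ \lambda = \lambda_1 + \operatorname{grad}\chi, \quad\text{where } \lambda_1 \in H^{3/2}(\Gamma(T\partial M\otimes\mathfrak{g})) \text{ and } \chi \in H^{3/2}(\partial M, \mathfrak{g}); \tag{5} \]
\[ (\mathrm{Id} + i\gamma^k n_k)\,\mu = 0; \tag{6} \]
\[ (\mathrm{Id} + i\gamma^k n_k)\,\nu = 0. \tag{7} \]
We denote by $B$ the space of the values of the boundary data, that is, $\beta = (\lambda, \mu, \nu) \in B$ if and only if $\lambda \in H^{1/2}(\Gamma(T\partial M\otimes\mathfrak{g}))$ satisfies Eq. (5), $\mu \in H^{3/2}(\partial M, \mathbb{C}^4\otimes V)$ satisfies Eq. (6), and $\nu \in H^{1/2}(\partial M, \mathbb{C}^4\otimes V)$ satisfies Eq. (7).

The extended phase space $P$ is weakly symplectic, with the weak symplectic form $\omega = d\theta$, where
\[ \langle\theta(A, E, \Psi) \mid (a, e, \psi)\rangle = \int_M (E\cdot a + \Psi^\dagger\psi)\, d^3x. \tag{8} \]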
For each $\beta = (\lambda, \mu, \nu) \in B$ we denote by $P_\beta \subset P$ the space of the Cauchy data satisfying the boundary conditions given by Eqs. (2), (3), and (4). It is a closed subspace of the Hilbert space $P$, and the pull-back of $\omega$ to $P_\beta$ is weakly symplectic. However, $P_\beta$ has no symplectic complement in $P$. The constraint equation of the Yang–Mills–Dirac theory is
\[ \operatorname{div} E + [A; E] = \Psi^\dagger(I\otimes T^a)\Psi\, T_a, \tag{9} \]
where $T^a$ is a basis in $\mathfrak{g}$. The space of solutions in $P$ of the constraint equation is called the constraint set and is denoted by $C$. The aim of this paper is to describe the structure of $C_\beta = C\cap P_\beta$, that is, the space of solutions in $P$ of the constraint equation with the boundary data $\beta$.

Theorem 1. For each $\beta \in B$, the constraint set $C_\beta$ is a smooth submanifold of $P_\beta$.

We denote by $GS(P)_1$ the connected group of time independent gauge transformations, represented by maps $\varphi \in H^3(M, G)$ such that $\varphi|_{\partial M} = \text{identity}$ and $n\operatorname{grad}\varphi = 0$. Its action on the Cauchy data $(A, E, \Psi) \in P$ is given by
\[ A \mapsto \varphi A\varphi^{-1} + \varphi\operatorname{grad}\varphi^{-1}, \qquad E \mapsto \varphi E\varphi^{-1}, \qquad \Psi \mapsto \varphi\Psi. \tag{10} \]
Theorem 2. The action of GS(P )1 in P is continuous, smooth, free and proper. For every β ∈ B, the action of GS(P )1 in P preserves Pβ .
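As a quick consistency check on Eq. (10), one can verify that the transformation of the potential composes correctly, i.e. that acting with $\varphi_1$ and then with $\varphi_2$ agrees with acting with $\varphi_2\varphi_1$. The sketch below is not taken from the paper; it assumes only the product rule for grad, and the labels $\varphi_1$, $\varphi_2$ and the notation $\varphi\cdot A$ for the transformed potential are ours.

```latex
\begin{align*}
\varphi_2\cdot(\varphi_1\cdot A)
 &= \varphi_2\bigl(\varphi_1 A\varphi_1^{-1}
      +\varphi_1\operatorname{grad}\varphi_1^{-1}\bigr)\varphi_2^{-1}
    +\varphi_2\operatorname{grad}\varphi_2^{-1}\\
 &= (\varphi_2\varphi_1)A(\varphi_2\varphi_1)^{-1}
    +\varphi_2\varphi_1\bigl(\operatorname{grad}\varphi_1^{-1}\bigr)\varphi_2^{-1}
    +\varphi_2\operatorname{grad}\varphi_2^{-1},\\[4pt]
(\varphi_2\varphi_1)\operatorname{grad}(\varphi_2\varphi_1)^{-1}
 &= \varphi_2\varphi_1\operatorname{grad}\bigl(\varphi_1^{-1}\varphi_2^{-1}\bigr)\\
 &= \varphi_2\varphi_1\bigl(\operatorname{grad}\varphi_1^{-1}\bigr)\varphi_2^{-1}
    +\varphi_2\operatorname{grad}\varphi_2^{-1}.
\end{align*}
```

Comparing the two displays gives $\varphi_2\cdot(\varphi_1\cdot A) = (\varphi_2\varphi_1)\cdot A$; the corresponding statements for $E$ and $\Psi$ are immediate from Eq. (10).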
Corollary 3. The space $C_\beta/GS(P)_1$ of $GS(P)_1$-orbits in $C_\beta$ is a quotient manifold of $C_\beta$, and $C_\beta$ has the structure of a principal fibre bundle over $C_\beta/GS(P)_1$ with structure group $GS(P)_1$.

Elements of the Lie algebra $gs(P)_1$ of $GS(P)_1$ are given by maps $\xi \in H^3(M, \mathfrak{g})$ such that $\xi|_{\partial M} = 0$ and $n\operatorname{grad}\xi = 0$. Their action in $P$ is given by the vector field
\[ \xi_P(A, E, \Psi) = (-D_A\xi,\ -[E, \xi],\ \xi\Psi), \]
where $D_A\xi = d\xi + [A, \xi]$ is the covariant differential of $\xi$ with respect to the connection $A$. The action of $GS(P)_1$ in $P$ preserves the 1-form $\theta$ given by Eq. (8). Hence it preserves $\omega = d\theta$, and it is Hamiltonian with an equivariant momentum map $J_1$ such that, for every $\xi \in gs(P)_1$,
\[ \langle J_1(A, E, \Psi) \mid \xi\rangle = \langle\theta \mid \xi_P(A, E, \Psi)\rangle = \int_M (-E\cdot D_A\xi + \Psi^\dagger\xi\Psi)\, d^3x. \tag{11} \]
Using Stokes' Theorem and the vanishing of $\xi$ on $\partial M$, we obtain
\[ \langle J_1(A, E, \Psi) \mid \xi\rangle = \int_M \{(\operatorname{div} E + [A; E])\,\xi + \Psi^\dagger\xi\Psi\}\, d^3x. \]
Since the above equality is satisfied for every smooth $\xi$ which vanishes on the boundary together with its gradient, the Fundamental Theorem in the Calculus of Variations and the constraint equation (9) imply that
\[ (A, E, \Psi) \in C \iff \langle J_1(A, E, \Psi) \mid \xi\rangle = 0. \tag{12} \]
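The integration by parts behind the step from Eq. (11) to the display that follows it can be sketched as below. This is our reconstruction rather than the author's computation. It assumes that the pairing between $\mathfrak{g}$-valued fields is taken with an ad-invariant inner product on $\mathfrak{g}$, so that $\langle E_i, [A_i, \xi]\rangle = -\langle [A_i, E_i], \xi\rangle$, and that $[A; E]$ stands for $\sum_i [A_i, E_i]$; $dS$ denotes the surface element of $\partial M$.

```latex
\begin{align*}
-\int_M E\cdot D_A\xi\,d^3x
 &= -\int_M E\cdot\operatorname{grad}\xi\,d^3x
    -\int_M E\cdot[A,\xi]\,d^3x\\
 &= \int_M(\operatorname{div}E)\,\xi\,d^3x
    -\int_{\partial M}(nE)\,\xi\,dS
    +\int_M[A;E]\,\xi\,d^3x\\
 &= \int_M\bigl(\operatorname{div}E+[A;E]\bigr)\,\xi\,d^3x .
\end{align*}
```

The boundary integral drops out because $\xi|_{\partial M} = 0$, so no condition on the normal component $nE$ is needed at this step.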
The pull-back $\omega_{C_\beta}$ of $\omega$ to $C_\beta$ has involutive kernel $\ker\omega_{C_\beta}$. The reduced phase space $\check{P}_\beta$ is defined as the set of equivalence classes of points in $C_\beta$ under the equivalence relation $p \simeq p'$ if and only if there is a piecewise smooth curve in $C_\beta$ joining $p$ and $p'$ with the tangent vector contained in $\ker\omega_{C_\beta}$. If $\ker\omega_{C_\beta}$ is a distribution, it is clearly involutive and the equivalence classes coincide with integral manifolds of $\ker\omega_{C_\beta}$. We denote by $\rho_\beta : C_\beta \to \check{P}_\beta$ the canonical projection associating to each $p \in C_\beta$ the equivalence class containing $p$.

Theorem 4. For each $\beta \in B$,
1. $\check{P}_\beta = C_\beta/GS(P)_1$;
2. $\check{P}_\beta$ is endowed with a weak Riemannian metric induced by the $L^2$ scalar product in $P$, and with a 1-form $\theta_{\check{P}_\beta}$ such that $\rho_\beta^*\theta_{\check{P}_\beta} = \theta_{C_\beta}$. Moreover, $\omega_{\check{P}_\beta} = d\theta_{\check{P}_\beta}$ is weakly symplectic and $\rho_\beta^*\omega_{\check{P}_\beta} = \omega_{C_\beta}$.

The regularity results obtained here are analogous to the results valid for the Yang–Mills and Dirac fields in the Minkowski space-time [2], and much stronger than in the bag model [3]. The reason for this is that the boundary conditions used here are weaker than the bag boundary condition. In particular, we need not specify the normal component of the electric Yang–Mills field $E$ on the boundary.
2. Proofs

2.1. Proof of Theorem 1. The space $P_0$, corresponding to $\beta = 0 \in B$, is a closed subspace of $P$. For each $\beta \in B$, $P_\beta$ is a closed affine subspace of $P$ with the tangent space $P_0$. Let $f_\beta : P_\beta \to L^2(M, \mathfrak{g})$ be given by
\[ f_\beta(A, E, \Psi) = \operatorname{div} E + [A; E] - \Psi^\dagger(I\otimes T^a)\Psi\, T_a. \]
The constraint set $C_\beta$ is the zero level of $f_\beta$, $C_\beta = f_\beta^{-1}(0)$. The derivative of $f_\beta$ at the point $(A, E, \Psi) \in P_\beta$ in the direction $(a, e, \psi) \in P_0$ is given by
\[ Df_{\beta(A,E,\Psi)}(a, e, \psi) = \operatorname{div} e + [A; e] + [a; E] - \psi^\dagger(I\otimes T^a)\Psi\, T_a - \Psi^\dagger(I\otimes T^a)\psi\, T_a. \tag{13} \]
The differential $Df_{\beta(A,E,\Psi)}$ maps $P_0$ to $L^2(M, \mathfrak{g})$. Since our boundary conditions impose no restrictions on $E$, the operator $P_0 \to L^2(M, \mathfrak{g}) : (a, e, \psi) \mapsto \operatorname{div} e$ is submersive. Moreover, by the Rellich–Kondrachov Theorem [4], the operator
\[ P_0 \to L^2(M, \mathfrak{g}) : (a, e, \psi) \mapsto [A; e] + [a; E] - \psi^\dagger(I\otimes T^a)\Psi\, T_a - \Psi^\dagger(I\otimes T^a)\psi\, T_a \]
is compact. Hence, $Df_{\beta(A,E,\Psi)}$ is semi-Fredholm and its range is closed, see [5]. Since $Df_{\beta(A,E,\Psi)}$ maps to a Hilbert space, its range has an orthogonal complement. Similarly, the kernel of $Df_{\beta(A,E,\Psi)}$ is closed and has an orthogonal complement in $P_0$. Hence, by the Implicit Function Theorem, $f_\beta^{-1}(0)$ is a submanifold of $P_\beta$, see [6]. This completes the proof of Theorem 1.

2.2. Proof of Theorem 2. The action of $H^3(M, G)$ is given by
\[ A \mapsto \varphi A\varphi^{-1} + \varphi\operatorname{grad}\varphi^{-1}, \qquad E \mapsto \varphi E\varphi^{-1} \quad\text{and}\quad \Psi \mapsto \varphi\Psi. \tag{14} \]
If $\varphi \in H^3(M, G)$, then $\varphi\operatorname{grad}\varphi^{-1} \in H^2(M, \mathfrak{g}\otimes\mathbb{R}^3)$. Moreover, for $k = 1, 2$, pointwise products of elements of $H^3(M, \mathbb{R})$ and $H^k(M, \mathbb{R})$ are in $H^k(M, \mathbb{R})$. Hence, $(A, E, \Psi) \in P$ implies that $(\varphi A\varphi^{-1} + \varphi\operatorname{grad}\varphi^{-1},\ \varphi E\varphi^{-1},\ \varphi\Psi) \in P$. This proves that $H^3(M, G)$ acts in $P$. Continuity and smoothness of the action of $H^3(M, G)$ in $P$ follow from the continuity in $H^k(M, \mathbb{R})$ of pointwise products of elements of $H^3(M, \mathbb{R})$ and $H^k(M, \mathbb{R})$, $k = 1, 2$. Properness of the action (14) of $H^3(M, G)$ in $H^2(M, \mathbb{R}^3\otimes\mathfrak{g}) \times H^1(M, \mathbb{R}^3\otimes\mathfrak{g}) \times H^2(M, \mathbb{C}^4\otimes V)$ was proved in [3]. The boundary conditions assumed there did not affect the proof. Since $M$ is contractible it follows that
\[ GS(P)_0 = \{\varphi \in H^3(M, G) \mid \varphi|_{\partial M} = \text{identity}\} \]
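is connected. Since $GS(P)_0$ is a Banach–Lie subgroup of $H^3(M, G)$, continuity, smoothness and properness of its action in $P$ are consequences of the same properties of the action of $H^3(M, G)$. To show that the action of $GS(P)_0$ is free, it suffices to consider its action on the Yang–Mills potentials. If $\varphi \in GS(P)_0$ preserves $A$, then $A = \varphi A\varphi^{-1} + \varphi\operatorname{grad}\varphi^{-1}$, which implies that
\[ \operatorname{grad}\varphi + [A, \varphi] = 0, \tag{15} \]
that is, $\varphi$ is covariantly constant with respect to the connection $A$ as a section of the group bundle over $M$. For any $x \in M$, let $x(s)$ be a smooth path in $M$ such that $x(1) = x$ and $x(0) \in \partial M$. Restricting Eq. (15) to the path $x(s)$ and setting $\varphi(s) = \varphi(x(s))$, we get an initial value problem
\[ \frac{d}{ds}\varphi(s) = -[A(x(s))\cdot\dot{x}(s),\ \varphi(s)] \quad\text{and}\quad \varphi(0) = \text{identity}, \]
where $\dot{x}(s)$ denotes the derivative of $x(s)$ with respect to $s$. Clearly, $\varphi(s) = \text{identity}$ is a solution. Moreover, since $A(x(s))\cdot\dot{x}(s)$ is continuous in $s$, the solution is unique, which implies that $\varphi(x) = \varphi(x(1)) = \varphi(1) = \text{identity}$. Since $GS(P)_0$ is connected, it follows that the isotropy group in $GS(P)_0$ of any Yang–Mills potential $A$ is trivial. This ensures that the action of $GS(P)_0$ in $P$ is free.

Since $GS(P)_1$ is a Banach–Lie subgroup of $GS(P)_0$, its action in $P$ is also continuous, smooth, proper and free. Moreover, each $\varphi \in GS(P)_1$ satisfies the conditions $\varphi|_{\partial M} = \text{identity}$ and $\operatorname{grad}\varphi|_{\partial M} = 0$. Hence,
\begin{align*}
t\operatorname{curl}(\varphi A\varphi^{-1} + \varphi\operatorname{grad}\varphi^{-1})
 &= t\bigl(\operatorname{grad}\varphi\times A\varphi^{-1} + \varphi\operatorname{curl}A\,\varphi^{-1} + \varphi A\times\operatorname{grad}\varphi^{-1} + \operatorname{grad}\varphi\times\operatorname{grad}\varphi^{-1}\bigr)\\
 &= t\operatorname{grad}\varphi\times nA\varphi^{-1} + n\operatorname{grad}\varphi\times tA\varphi^{-1} + \varphi\, t\operatorname{curl}A\,\varphi^{-1}\\
 &\quad + \varphi\, tA\times n\operatorname{grad}\varphi^{-1} + \varphi\, nA\times t\operatorname{grad}\varphi^{-1}\\
 &\quad + t\operatorname{grad}\varphi\times n\operatorname{grad}\varphi^{-1} + n\operatorname{grad}\varphi\times t\operatorname{grad}\varphi^{-1}\\
 &= n\operatorname{grad}\varphi\times tA + t\operatorname{curl}A + tA\times n\operatorname{grad}\varphi^{-1}\\
 &= t\operatorname{curl}A.
\end{align*}
Similarly,
\[ \tfrac{1}{2}(\mathrm{Id} - i\gamma^k n_k)\,\varphi\Psi\,|_{\partial M} = \tfrac{1}{2}(\mathrm{Id} - i\gamma^k n_k)\,\Psi\,|_{\partial M}, \]
and
\[ -\tfrac{1}{2}(\mathrm{Id} - i\gamma^k n_k)\gamma^0(\gamma^k\partial_k + im)\,\varphi\Psi\,|_{\partial M} = -\tfrac{1}{2}(\mathrm{Id} - i\gamma^k n_k)\gamma^0(\gamma^k\partial_k + im)\,\Psi\,|_{\partial M}. \]
Hence, the action of $\varphi$ in $P$ preserves $P_\beta$. Finally, the constraint equation is gauge invariant. Hence, the constraint set $C_\beta$ is preserved by the action of $GS(P)_1$. Moreover, $GS(P)_1$ acts in $C_\beta$ continuously, smoothly, properly and freely. This completes the proof of Theorem 2.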
2.3. Proof of Corollary 3. Since $GS(P)_1$ acts in $C_\beta$ continuously, properly and freely, the space $C_\beta/GS(P)_1$ of the $GS(P)_1$-orbits in $C_\beta$ is a quotient manifold of $C_\beta$. Moreover, since the action of $GS(P)_1$ in $C_\beta$ is continuous, proper and free, $C_\beta$ has the structure of a principal fibre bundle with base space $C_\beta/GS(P)_1$ and structure group $GS(P)_1$. A proof that a continuous, smooth, proper and free action of a Lie group in a finite dimensional manifold gives rise to the structure of a principal fibre bundle is given in Ref. [7]. This proof extends without change to continuous, smooth, proper and free actions of Banach–Lie groups on Banach manifolds. This completes the proof of Corollary 3.

2.4. Theorem 4. Theorem 4 is a consequence of smoothness of the constraint set and several results which can be found in the literature, see Refs. [8,9,3,10]. Its proof is essentially identical to the proof of the corresponding result for Yang–Mills and Dirac fields in the Minkowski space-time [2].

References
1. Schwarz, G., Śniatycki, J. and Tafel, J.: Yang–Mills and Dirac fields with inhomogeneous boundary conditions. Commun. Math. Phys. 188, 439–448 (1997)
2. Śniatycki, J.: Regularity of constraints and reduction in the Minkowski space Yang–Mills–Dirac theory. Ann. Inst. Henri Poincaré 70, 277–293 (1999)
3. Śniatycki, J., Schwarz, G. and Bates, L.: Yang–Mills and Dirac fields in a bag, constraints and reduction. Commun. Math. Phys. 176, 95–115 (1996)
4. Adams, R.A.: Sobolev Spaces. New York: Academic Press, 1975
5. Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New York: Springer Verlag, 1984
6. Lang, S.: Differential and Riemannian Manifolds. New York: Springer, 1995
7. Cushman, R. and Bates, L.: Global Aspects of Classical Integrable Systems. Basel–Boston–Berlin: Birkhäuser, 1997
8. Arms, J., Marsden, J.E. and Moncrief, V.: Symmetry and bifurcation of momentum maps. Commun. Math. Phys. 90, 361–372 (1981)
9. Palais, R.: On the existence of slices for actions of non-compact Lie groups. Ann. Math. 73, 295–323 (1961)
10. Mitter, P. and Viallet, C.: On the bundle of connections and the gauge orbit manifold in Yang–Mills theory. Commun. Math. Phys. 79, 457–472 (1981)

Communicated by G. Felder
Commun. Math. Phys. 203, 713 – 728 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Singular Monopoles and Gravitational Instantons

Sergey A. Cherkis 1,*, Anton Kapustin 2,**

1 California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]
2 School of Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA. E-mail: [email protected]

Received: 29 May 1998 / Accepted: 12 January 1999
Abstract: We model $A_k$ and $D_k$ asymptotically locally flat gravitational instantons on the moduli spaces of solutions of $U(2)$ Bogomolny equations with prescribed singularities. We study these moduli spaces using Ward correspondence and find their twistor description. This enables us to write down the Kähler potential for $A_k$ and $D_k$ gravitational instantons in a relatively explicit form.

1. Introduction

A gravitational instanton is a smooth four-dimensional manifold with a Riemannian metric satisfying Einstein equations. A particularly interesting class of gravitational instantons is that of four-dimensional hyperkähler manifolds, i.e. manifolds with holonomy group contained in $SU(2)$. A hyperkähler manifold can be alternatively characterized as a Riemannian manifold admitting three covariantly constant complex structures $I, J, K$ satisfying the quaternion relations
\[ IJ = -JI = K, \ \text{etc.} \tag{1} \]
such that the metric is Hermitian with respect to $I, J, K$. Covariant constancy of $I, J, K$ implies that the three 2-forms $\omega_1 = g(I\cdot,\cdot)$, $\omega_2 = g(J\cdot,\cdot)$, $\omega_3 = g(K\cdot,\cdot)$ are closed. If we pick one of the complex structures, say $I$, we may regard a hyperkähler manifold as a complex manifold equipped with a Kähler metric (with Kähler form $\omega_1$) and a complex symplectic form $\omega = \omega_2 + i\omega_3$.

Hyperkähler four-manifolds arise in several physical problems. For example, compactification of string and M-theory on hyperkähler four-manifolds preserves one half of supersymmetries and provides exact solutions of stringy equations of motion.

* Research supported in part by DOE grant DE-FG03-92-ER40701.
** Research supported in part by DOE grant DE-FG02-90-ER40542.
The only compact hyperkähler four-manifolds are $T^4$ and K3, but the K3 metric is not known explicitly. In the noncompact case there are several possibilities to consider. There are no nontrivial hyperkähler metrics asymptotically approaching that of $\mathbb{R}^4$, but the situation becomes more interesting if one requires the metric to be only Asymptotically Locally Euclidean (ALE), i.e. that the metric look asymptotically like the quotient of $\mathbb{R}^4$ by a finite group of isometries. All such metrics fit into the ADE classification of Kronheimer, which we now briefly explain. Let $\Gamma$ be a finite subgroup of $SU(2)$. There is a natural correspondence between such $\Gamma$'s and ADE Dynkin diagrams: the $A_k$ diagram corresponds to the cyclic group $\mathbb{Z}_{k+1}$, the $D_k$ diagram corresponds to the binary dihedral group $D_{k-2}$ of order $4(k-2)$, and $E_k$ diagrams correspond to symmetry groups of tetrahedron, cube, and icosahedron. Since $SU(2)$ acts on $\mathbb{C}^2$ by the fundamental representation, we may consider quotients $\mathbb{C}^2/\Gamma$ (known as Kleinian singularities). Kronheimer showed that resolutions of Kleinian singularities admit ALE hyperkähler metrics, and that all such metrics arise in this way [1,2]. In the $A_k$ case the metric has been known explicitly for some time: it is the Gibbons-Hawking metric with $k+1$ centers [3]. Kronheimer provided an implicit construction of $D_k$ and $E_k$ ALE gravitational instantons as hyperkähler quotients [1].

Another interesting class of noncompact gravitational instantons is that of Asymptotically Locally Flat (ALF) manifolds. By definition, the metric of an ALF manifold has the asymptotic form
\[ ds^2 = dr^2 + \sigma_1^2 + r^2(\sigma_2^2 + \sigma_3^2), \tag{2} \]
where $\sigma_j$ are left-invariant one-forms on $S^3/\Gamma$ for some finite subgroup $\Gamma$ acting on $S^3 = SU(2)$ from the right. The only known hyperkähler metric of this sort is the multi-Taub-NUT metric. As a complex manifold the $(k+1)$-center multi-Taub-NUT space is isomorphic to the resolution of $\mathbb{C}^2/\mathbb{Z}_{k+1}$, so we will call it the $A_k$-type ALF gravitational instanton. Compactification of M-theory on this manifold is equivalent to a configuration of $k+1$ parallel D6 branes in IIA string theory. Furthermore, it is expected that a configuration of an O6$^+$ orientifold and $k$ D6 branes in IIA string theory corresponds to the compactification of M-theory on a $D_k$ ALF space [4], i.e. an ALF gravitational instanton isomorphic to the resolution of $\mathbb{C}^2/D_{k-2}$. More generally, any compactification of M-theory on an ALF hyperkähler manifold should correspond to a IIA brane configuration preserving half of supersymmetries. Thus it is of interest to find all four-dimensional ALF hyperkähler metrics in as explicit a form as possible.

In our previous paper [5] we constructed $D_k$ ALF metrics from moduli spaces of certain ordinary differential equations (Nahm equations). In this paper we construct both $A_k$ and $D_k$ ALF hyperkähler four-manifolds from moduli spaces of solutions of $U(2)$ Bogomolny equations on $\mathbb{R}^3$ with prescribed singularities. Solutions of $SU(2)$ Bogomolny equations with singularities were previously considered by Kronheimer [6], and much of our discussion closely follows that in Ref. [6]. The idea is that, on one hand, the moduli space of Bogomolny equations carries a natural hyperkähler structure, while on the other hand the solutions can be found by means of Ward correspondence. This approach yields directly the twistor space (in the sense of Penrose) of the moduli space of solutions. To get the metric itself one needs to find an appropriate family of sections of the twistor space. In both $A_k$ and $D_k$ cases we were able to identify the correct family of sections only modulo some finite choices. The $A_k$ metrics are simple enough so that one can explicitly see that only one choice gives nonsingular metrics. At the end of Subsect. 5.1 we argue that the $D_k$ twistor spaces we have constructed correspond to
everywhere smooth hyperkähler metrics as well. We also show that the $D_k$ ALF metrics we obtain are identical to those found in Ref. [5].

Let us explain what we mean by solutions of Bogomolny equations with prescribed singularities. Recall that Bogomolny equations on $\mathbb{R}^3$ are equations for a connection $A$ in a vector bundle $B$ over $\mathbb{R}^3$ and a section $\Phi$ of $\operatorname{End} B$:
\[ *F_A = D_A\Phi. \]
Let $\|a\|$ be the Ad-invariant norm on $u(2)$, $\|a\|^2 = -\tfrac{1}{2}\operatorname{Tr} a^2$. Fix $k$ distinct points $p_1, \ldots, p_k \in \mathbb{R}^3$. A singular $U(2)$ monopole is a solution of $U(2)$ Bogomolny equations on $\mathbb{R}^3\setminus\{p_1, \ldots, p_k\}$ satisfying the following conditions.

(i) As $r \to p_\alpha$, $2r_\alpha\Phi \to i\operatorname{diag}(0, -\ell_\alpha)$ up to gauge transformations, and $d(r_\alpha\|\Phi\|)$ is bounded. Here $r_\alpha = |r - p_\alpha|$, $\alpha = 1, \ldots, k$.

(ii) As $r \to \infty$ one has asymptotic expansions, up to gauge transformations,
\[ \Phi = i\operatorname{diag}\Bigl(\mu_1 - \frac{n}{2r},\ \mu_2 + \frac{n - \sum_\alpha\ell_\alpha}{2r}\Bigr) + O(1/r^2), \qquad \frac{\partial\|\Phi\|}{\partial\Omega} = O(1/r^2), \qquad \|D\Phi\| = O(1/r^2). \]

We will refer to $n$ as the nonabelian charge of the monopole, and to $\{\ell_\alpha\}$ as its abelian charges. We will assume that $\mu_1 > \mu_2$. We also set $n' = n - \sum_\alpha\ell_\alpha$, $\mu = \mu_1 - \mu_2$, for short.

Every fiber of the complex rank two bundle $B$ splits into the eigenspaces of $\Phi$, $B = M_1\oplus M_2$, near $r = p_\alpha$ or when $r \to \infty$. Let $M_1$ correspond to the eigenvalue of $\Phi$ diverging as $r \to p_\alpha$. It is a simple consequence of Bogomolny equations that $-\ell_\alpha$ is the degree of $M_1$ restricted to a small 2-sphere around $r = p_\alpha$. Similarly, $-n$ and $n'$ are the degrees of eigensubbundles of $B$ restricted to a large 2-sphere. Therefore $n$ and $\{\ell_\alpha\}$ are integers.

String theory considerations imply that the moduli space of the $n = 1$ monopole with $\ell_\alpha = 1$, $\alpha = 1, \ldots, k$ is an $A_{k-1}$ ALF gravitational instanton, and the centered moduli space of the $n = 2$ monopole with $\ell_\alpha = 1$, $\alpha = 1, \ldots, k$ is a $D_k$ ALF gravitational instanton [7]. (The centered $n = 2$ monopole moduli space is a $U(1)$ hyperkähler quotient of the $n = 2$ monopole moduli space; see Sect. 5 below.) In this paper we show that this is indeed the case.
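A small consistency check between conditions (i) and (ii) may be helpful here. The sketch below is ours, not the authors'; it uses only the two displayed asymptotic forms together with the fact that the trace is gauge invariant. Taking the trace of the Bogomolny equation gives $d\operatorname{Tr}\Phi = *\operatorname{Tr}F_A$ with $\operatorname{Tr}F_A$ closed, so $\operatorname{Tr}\Phi$ is harmonic away from the points $p_\alpha$.

```latex
\begin{align*}
\operatorname{Tr}\Phi
 &= i\Bigl(\mu_1-\frac{n}{2r}+\mu_2+\frac{n-\sum_\alpha\ell_\alpha}{2r}\Bigr)+O(1/r^2)
  = i\Bigl(\mu_1+\mu_2-\frac{\sum_\alpha\ell_\alpha}{2r}\Bigr)+O(1/r^2)
  && (r\to\infty),\\
2r_\alpha\operatorname{Tr}\Phi
 &\to -\,i\,\ell_\alpha
  && (r\to p_\alpha).
\end{align*}
```

The nonabelian charge $n$ cancels in the trace, and both limits are compatible with a harmonic function of the form $i\bigl(\mu_1+\mu_2-\sum_\alpha \ell_\alpha/(2r_\alpha)\bigr)$, since $r_\alpha/r\to 1$ as $r\to\infty$. We offer this only as a plausibility check on the signs and factors in (i) and (ii), not as a statement proved in the text.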
The main tool is the Ward correspondence described in Sect. 2. In Sect. 3 we use it to derive the twistor space for arbitrary $n$. In Sects. 4 and 5 we deal with the $n = 1$ and $n = 2$ cases, respectively. We show that the moduli spaces are resolutions of $A_{k-1}$ and $D_k$ singularities, as expected, find the real holomorphic sections of the twistor spaces, and derive the Kähler potentials for the metrics using the generalized Legendre transform method of Refs. [8,9].

2. Ward Correspondence

From now on we restrict ourselves to the case $\ell_\alpha = 1$, $\alpha = 1, \ldots, k$. We make some comments on the more general case of positive $\ell_\alpha$ at the end of Subsect. 5.1. To construct the moduli space of singular $U(2)$ monopoles, we will use a version of Ward correspondence due to Hitchin [10]. The set of all oriented straight lines $T$ in $\mathbb{R}^3$ has a natural complex structure, as it is the tangent bundle of the projective line. $T$ can be covered by two patches $V_0$ ($\zeta \neq \infty$) and $V_1$ ($\zeta \neq 0$) with coordinates $(\eta, \zeta)$ and $(\eta', \zeta') = (\eta/\zeta^2, 1/\zeta)$.
For any point $x \in \mathbb{R}^3$ the set of all oriented straight lines through $x$ sweeps out a projective line $P_x \subset T$; thus there is a holomorphic map $P_x : \mathbb{P}^1 \to T$. The reversal of the orientation of lines in $\mathbb{R}^3$ is an antiholomorphic map $\tau : T \to T$ satisfying $\tau^2 = \mathrm{id}$. It is called the real structure of $T$. For any $x$ it acts on $P_x$ as the antipodal map. Thus $P_x$ is a real holomorphic section of $T$. For any straight line in $\mathbb{R}^3$, $\gamma = \{x \mid x = ut + v,\ u\cdot u = 1,\ u\cdot v = 0\}$, let
\[ \gamma_+ = \{x \mid x = ut + v,\ t > R\}, \qquad \gamma_- = \{x \mid x = ut + v,\ t < -R\}, \tag{3} \]
where $R$ is a positive number greater than any $|p_\alpha|$. Now we define two complex rank 2 vector bundles $E^+$ and $E^-$ over $T$:
\[ E^+ = \{s \in \Gamma(\gamma_+, E) \mid D_\gamma s = i\Phi s\}, \qquad E^- = \{s \in \Gamma(\gamma_-, E) \mid D_\gamma s = i\Phi s\}. \tag{4} \]
From Bogomolny equations it follows, as in Ref. [10], that these bundles are holomorphic. The real structure $\tau$ on $T$ can be lifted to an antilinear antiholomorphic map $\sigma : E^+ \to (E^-)^*$. Thus every solution of $U(2)$ Bogomolny equations maps to a pair of holomorphic rank two bundles on $T$ interchanged by the real structure. Let $P_x$ denote the real section corresponding to $x$, and $P_\alpha$ the real section corresponding to $p_\alpha$. Let $P$ be the union of all $P_\alpha$. If $\gamma$ does not pass through any of the $p_\alpha$, any solution $s$ can be continued from $\gamma_+$ to $\gamma_-$. This defines a natural identification of the fibers $E^+_\gamma$ and $E^-_\gamma$. Therefore we have an isomorphism
\[ h : E^+|_{T\setminus P} \to E^-|_{T\setminus P}. \tag{5} \]
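To make the real sections $P_x$ used above more concrete, here is a sketch in explicit coordinates. The parametrization and the formula for $\tau$ are assumptions on our part: we take a section of the form $\eta(\zeta) = a + b\zeta - \bar a\zeta^2$ with $a\in\mathbb{C}$, $b\in\mathbb{R}$ (one common convention sets $a = x_1 + i x_2$, $b = -2x_3$), and we take the real structure to act as $\tau(\eta,\zeta) = (-\bar\eta/\bar\zeta^2,\ -1/\bar\zeta)$. Neither formula is stated in this part of the text.

```latex
\begin{align*}
\eta(\zeta) &= a+b\,\zeta-\bar a\,\zeta^{2},
  \qquad a\in\mathbb{C},\ b\in\mathbb{R},\\
\eta'(\zeta') &= \frac{\eta}{\zeta^{2}}
  = a\,\zeta'^{2}+b\,\zeta'-\bar a
  \qquad (\zeta'=1/\zeta),\\
\eta\bigl(-1/\bar\zeta\bigr)
 &= a-\frac{b}{\bar\zeta}-\frac{\bar a}{\bar\zeta^{2}}
  = -\frac{\overline{\eta(\zeta)}}{\bar\zeta^{2}} .
\end{align*}
```

The second line shows that the section is holomorphic in both patches $V_0$ and $V_1$, and the third that it is mapped to itself by the assumed $\tau$, which covers the antipodal map $\zeta\mapsto -1/\bar\zeta$ on the base. The three real parameters $(\operatorname{Re}a, \operatorname{Im}a, b)$ match the three coordinates of $x\in\mathbb{R}^3$.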
For nonsingular monopoles $h$ extends to an isomorphism over the whole $T$, therefore the Ward correspondence maps a nonsingular monopole to a holomorphic bundle over $T$. In the present case $h$ or $h^{-1}$ may have singularities at $P$, and the Ward correspondence maps a singular monopole into a triplet $(E^+, E^-, h)$. This triplet satisfies a certain triviality constraint which we now proceed to formulate.

For any $x$ distinct from all $p_\alpha$ the intersection $P_x\cap P$ consists of an even number of points. For a generic $x$ the cardinality of $Q_x = P_x\cap P$ is $2k$. For any $x$ we can arbitrarily split $Q_x$ into two sets of equal cardinality $Q_x^+$ and $Q_x^-$ and construct a vector bundle $E_x$ over $P_x$ by gluing together $E^+$ restricted to $P_x\setminus Q_x^+$ and $E^-$ restricted to $P_x\setminus Q_x^-$, with the transition function $h$. (Of course, $E_x$ depends on the splitting.) The triviality constraint is that for any $x$ there is a splitting $Q_x = Q_x^+\cup Q_x^-$ such that $E_x$ is trivial.

Now we state the Ward correspondence between singular $U(2)$ monopoles and twistor data. There is a bijection between singular monopoles modulo gauge transformations and pairs $(E^+, E^-)$ of holomorphic rank 2 bundles over $T$ equipped with an isomorphism Eq. (5) satisfying the following conditions:

(a) For any $x \neq p_\alpha$ there is a splitting $Q_x = Q_x^+\cup Q_x^-$ such that $E_x$ is trivial.
(b) In the vicinity of each point of $P$ there exist trivializations of $E^+$ and $E^-$ such that $h$ takes the form

$$h = \begin{pmatrix} 1 & 0 \\ 0 & \prod_\alpha (\eta - P_\alpha(\zeta)) \end{pmatrix}, \tag{6}$$

so that $h$ extends across $P$ to a morphism $E^+ \to E^-$.

(c) The real structure $\tau$ on $T$ lifts to an antilinear antiholomorphic map $\sigma : E^+ \to (E^-)^*$.

The injectivity of the Ward correspondence can be shown by a straightforward modification of the argument in Ref. [10]. We conjecture the surjectivity by analogy with the nonsingular case. Let us explain where (a) and (b) come from. The condition (b) arises from studying the behavior of the solutions of the equation $D_\gamma s = i\Phi s$ as $\gamma$ approaches $p_\alpha$. Details can be found in Ref. [6]. (There the SU(2) case was analyzed, but the extension to U(2) is straightforward.) To demonstrate (a) it is sufficient to exhibit a holomorphic trivialization of $E_x$. Take any $x \neq p_\alpha$, $\alpha = 1, \ldots, k$, and recall that $P_x$ consists of all straight lines $\gamma$ passing through $x$. To obtain a holomorphic section of $E_x$ pick a vector $v_1$ in the fiber of B over $x$ and take it as an initial condition for the equation $D_\gamma s = i\Phi s$ at $t = 0$. Integrating it forward and backward in $t$ and varying $\gamma$ yields sections of $E^+$ and $E^-$ related by $h$. It is easy to check that they are holomorphic and thereby combine into a holomorphic section $s_1$ of $E_x$. To get a section $s_2$ of $E_x$ linearly independent from $s_1$ just pick a vector $v_2$ linearly independent from $v_1$ and repeat the procedure. (This argument has to be modified if there is a straight line $\gamma$ passing through $x$ and $\alpha, \beta \in \{1, \ldots, k\}$ such that $p_\alpha$ and $p_\beta$ lie on $\gamma$ and $x$ separates them. In this case one of the vectors $v_1, v_2$ has to be varied, $v_i \sim \zeta^{-1}$, as one varies $\gamma$.)

We now want to encode the twistor data in an algebraic curve $S \subset T$, in the spirit of Ref. [10]. We denote by $O(m)$ the pullback to $T$ of the unique degree $m$ line bundle on $\mathbb{P}^1$, and by $L^x(m)$ a line bundle over $T$ with the transition function $\zeta^{-m} e^{-x\eta/\zeta}$ from $V_0$ to $V_1$.
Let $L_1^+$ be a line subbundle of $E^+$ which consists of solutions of $D_\gamma s = i\Phi s$ bounded by $\mathrm{const} \cdot \exp(-\mu_1 t)\, t^{n}$ as $t \to +\infty$. Similarly, a line bundle $L_1^- \subset E^-$ consists of solutions bounded by $\mathrm{const} \cdot \exp(-\mu_2 t)\, t^{-n'}$ as $t \to -\infty$. The line bundles $L_2^+$ and $L_2^-$ are defined by

$$L_2^+ = E^+/L_1^+, \qquad L_2^- = E^-/L_1^-.$$

As in Ref. [10] the asymptotic conditions on the Higgs field can be used to show that $L^+_{1,2}$ and $L^-_{1,2}$ are holomorphic line bundles, and that the following isomorphisms hold:

$$L_1^+ \simeq L^{\mu_1}(-n), \quad L_2^+ \simeq L^{\mu_2}(n'), \quad L_1^- \simeq L^{\mu_2}(-n'), \quad L_2^- \simeq L^{\mu_1}(n).$$

Consider a composite map

$$\psi : L_1^+ \to E^+ \to E^- \to L_2^-,$$
where the first arrow is an inclusion, the second arrow is h, and the third arrow is a natural projection. We may regard ψ as an element of H 0 (T, O(2n)). Let us define the spectral curve S to be the zero level of ψ. S is in the linear system O(2n). Arguments identical to those in Ref. [10] can be used to prove that S is compact and real (i.e., τ (S) = S). Consider now a map φ : ∧2 E + → ∧2 E − induced by h. By virtue of Eq. (6) the zero level of φ is precisely P . We will assume in what follows that S does not contain
any of the $P_\alpha$ as components. Physically this corresponds to the requirement that none of the nonabelian monopoles is located at $x = p_\alpha$. For simplicity we will also assume that $S \cap P$ consists of $2nk$ points (this is a generic situation). The construction here bears a close resemblance to that in Ref. [11], where nonsingular monopoles for all classical groups were constructed. According to Ref. [11], the spectral data for a nonsingular SU(3) monopole with magnetic charge $(k, n)$ include a pair of spectral curves $S_1, S_2$ in the linear systems $O(2n)$, $O(2k)$. Our $S$ and $P$ are analogs of $S_1$ and $S_2$. The condition that $S \cap P$ consists of $2nk$ points is analogous to the requirement in Ref. [11] that the monopoles are generic. (This resemblance is not a coincidence: if we consider an SU(3) gauge theory broken down to SU(2) × U(1) by a large vev of an adjoint Higgs field, the $(k, n)$ monopoles of SU(3) reduce to singular monopoles of SU(2) × U(1) with nonabelian charge $n$ and total abelian charge $k$. In this limit the spectral data of Ref. [11] must reduce to ours.)

Since $L_1^+|_S = \ker \psi|_S$, we have a well-defined holomorphic map $\rho : L_2^+|_S \to L_2^-|_S$ induced by $h$. There is also a holomorphic map $\xi : L_1^+|_S \to L_1^-|_S$ induced by $h$. Thus we have natural elements $\rho \in H^0(S, L^{\mu}(k))$ and $\xi \in H^0(S, L^{-\mu}(k))$. It also easily follows from the definition that $\rho \otimes \xi = \phi|_S$, and therefore the divisors of both $\rho$ and $\xi$ are subsets of $S \cap P$. $\rho$ and $\xi$ are interchanged by the real structure, and therefore the same is true about their divisors. It follows that the divisors of $\rho$ and $\xi$ are disjoint and have equal cardinality. Thus we can define the spectral data for a generic singular monopole to consist of

(i) A spectral curve $S$, which is a real compact curve in the linear system $O(2n)$ such that $S \cap P$ consists of $2nk$ disjoint points.
(ii) A splitting $S \cap P = Q^+ \cup Q^-$ into sets of equal cardinality interchanged by $\tau$.
(iii) A section $\rho$ of $L^{\mu}(k)|_S$ with divisor $Q^+$ and a section $\xi$ of $L^{-\mu}(k)|_S$ with divisor $Q^-$. $\rho$ and $\xi$ are interchanged by the real structure.

The condition (iii) is a constraint on $S$. It implies that $\rho$ and $\xi$ satisfy

$$\rho\,\xi = \prod_\alpha (\eta - P_\alpha(\zeta)). \tag{7}$$
For nonsingular monopoles it reduces to the requirement that $L^{\mu}|_S$ is trivial, as in Ref. [10]. As a consequence of (iii), $L^{2\mu}|_S\,[Q^- - Q^+]$ is trivial. Recall that the spectral data for nonsingular SU(2) monopoles satisfy an additional constraint, the "vanishing theorem" of Ref. [12]. It says that $L^{z\mu}(n-2)$ is nontrivial for $z \in (0, 1)$. A natural guess for the analogue of this condition in our case is

(iv) $L^{z\mu}(n-2)\,[-Q^+]$ is nontrivial for $z \in (0, 1)$.

We already mentioned a close connection of the spectral data for singular U(2) monopoles and those for nonsingular SU(3) monopoles [11] with the largest Higgs vev set to $+\infty$. Consequently, one can obtain the condition (iv) from the "vanishing theorem" of Ref. [11] by taking the appropriate limit. A direct derivation of (iv) should also be possible. Arguments very similar to those in Ref. [10] show that the spectral data determine the singular monopole uniquely. A natural question is whether there is a one-to-one correspondence between singular U(2) monopoles and spectral data defined by (i)-(iv). The answer was positive for nonsingular SU(2) monopoles [12], so it is highly plausible that the same is true in the present case. Presumably a proper proof of this can be achieved by converting the spectral data into solutions of Nahm equations, and then reconstructing singular monopoles by an inverse Nahm transform [12,11].
3. Twistor Space for Singular Monopoles

Having established the correspondence between singular U(2) monopoles and algebraic data on $T$, we now proceed to construct the twistor space $Z_n$ for the moduli space of a singular monopole with nonabelian charge $n$. We follow the method of Ref. [13]. For fixed $\zeta = \zeta_0$ every point in $Z_n$ yields a spectral curve $S$ which intersects the fiber of $T$ over $\zeta_0$ at $n$ points. Thus we have a projection $Z_n \to \oplus_{j=1}^{n} O(2j) = Y_n$. Concretely, if $S$ is given by $\eta^n + \eta_1 \eta^{n-1} + \cdots + \eta_n = 0$, the corresponding point in $Y_n$ is $(\eta_1, \ldots, \eta_n)$. Now consider an $n$-fold cover of $Y_n$,

$$X_n = \{ (\eta, \eta_1, \ldots, \eta_n) \in O(2) \oplus Y_n \mid \eta^n + \eta_1 \eta^{n-1} + \cdots + \eta_n = 0 \}.$$

There are two natural projections $\pi_1 : X_n \to T$ and $\pi_2 : X_n \to Y_n$. Using these projections, we get a rank $n$ bundle $V^+$ over $Y_n$ as a direct image sheaf $V^+ = \pi_{2*} \pi_1^* L^{\mu}(k)$. Similarly, we get a rank $n$ bundle $V^- = \pi_{2*} \pi_1^* L^{-\mu}(k)$. For any point in $Z_n$ we have a section $\rho$ of $L^{\mu}(k)|_S$ and a section $\xi$ of $L^{-\mu}(k)|_S$. Therefore, there is an inclusion $Z_n \subset V^+ \oplus V^-$. To describe this inclusion more concretely, we must rewrite the condition (iii) in terms of sections of $V^{\pm}$. The result is as follows. Let $U$ be a $(2n+1)$-dimensional subvariety in $\mathbb{C}^{3n+1}$ with coordinates $(\zeta, \eta_1, \ldots, \eta_n, \rho_0, \ldots, \rho_{n-1}, \xi_0, \ldots, \xi_{n-1})$ defined by

$$(\rho_0 + \rho_1 \eta + \cdots + \rho_{n-1} \eta^{n-1})(\xi_0 + \xi_1 \eta + \cdots + \xi_{n-1} \eta^{n-1}) = \prod_\alpha (\eta - P_\alpha(\zeta)) \mod \eta^n + \eta_1 \eta^{n-1} + \cdots + \eta_n = 0. \tag{8}$$

Take two copies of $U$ and glue them together over $\zeta \neq 0, \infty$ by

$$\tilde\zeta = \zeta^{-1}, \qquad \tilde\eta_j = \zeta^{-2j} \eta_j, \quad j = 1, \ldots, n, \tag{9}$$

$$\tilde\rho_0 + \tilde\rho_1 \tilde\eta + \cdots + \tilde\rho_{n-1} \tilde\eta^{n-1} = e^{-\mu\eta/\zeta} \zeta^{-k} (\rho_0 + \rho_1 \eta + \cdots + \rho_{n-1} \eta^{n-1}),$$

$$\tilde\xi_0 + \tilde\xi_1 \tilde\eta + \cdots + \tilde\xi_{n-1} \tilde\eta^{n-1} = e^{\mu\eta/\zeta} \zeta^{-k} (\xi_0 + \xi_1 \eta + \cdots + \xi_{n-1} \eta^{n-1}),$$

all modulo $\eta^n + \eta_1 \eta^{n-1} + \cdots + \eta_n = 0$. The resulting $(2n+1)$-dimensional variety is $Z_n$, the twistor space of singular monopoles with nonabelian charge $n$. To reconstruct the hyperkähler metric from the twistor space one has to find a holomorphic section of $\Lambda^2 T_F^* \otimes O(2)$, where $T_F^*$ is the cotangent bundle of the fiber of $Z_n$. Upon restriction to any fiber of $Z_n$ this section must be closed and nondegenerate. An obvious choice (the same as in Ref. [13]) is

$$\omega = 4 \sum_{j=1}^{n} \frac{d\rho(\beta_j) \wedge d\beta_j}{\rho(\beta_j)},$$

where $\beta_j$, $j = 1, \ldots, n$, are the roots of

$$\eta^n + \eta_1 \eta^{n-1} + \cdots + \eta_n = 0. \tag{10}$$
4. Moduli Space $M_1$ of n = 1 Monopole

Specializing the formulas of the previous section to $n = 1$, we get that the twistor space $Z_1$ is a hypersurface in the total space of $L^{\mu}(k) \oplus L^{-\mu}(k)$,

$$\rho_0 \xi_0 = \prod_{\alpha=1}^{k} (\eta - P_\alpha(\zeta)), \tag{11}$$

where $\rho_0 \in L^{\mu}(k)$, $\xi_0 \in L^{-\mu}(k)$, and $\eta \in O(2)$. Obviously, for fixed $\zeta$ this is a resolution of $\mathbb{C}^2/\mathbb{Z}_k$, so the corresponding hyperkähler metric is an $A_{k-1}$ gravitational instanton. In fact, it is well known what the metric is: it is the multi-Taub-NUT metric with $k$ centers. In the remainder of this section we rederive this result using the Legendre transform method of Refs. [14,8,9]. This will serve as a warm-up for the discussion of $D_k$ ALF metrics in the next section.

First we find the real holomorphic sections of the twistor space $Z_1$. This amounts to solving Eq. (11) with $\rho_0$, $\xi_0$, and $\eta$ now regarded as holomorphic sections of the appropriate bundles. Recalling that $\eta = a\zeta^2 + 2b\zeta - \bar a$ and $P_\alpha(\zeta) = a_\alpha \zeta^2 + 2 b_\alpha \zeta - \bar a_\alpha$ with $b, b_\alpha \in \mathbb{R}$, one gets in the patch $V_0$

$$\rho_0 = A\, e^{+\mu(b + a\zeta)} \prod_{\alpha=1}^{k} (\zeta - u_\alpha), \qquad \xi_0 = B\, e^{-\mu(b + a\zeta)} \prod_{\alpha=1}^{k} (\zeta - v_\alpha),$$

with $AB = \prod_\alpha (a - a_\alpha)$. Here $u_\alpha$ and $v_\alpha$ are the roots of the equation $\eta(\zeta) = P_\alpha(\zeta)$,

$$u_\alpha = \frac{-(b - b_\alpha) - \Delta_\alpha}{a - a_\alpha}, \qquad v_\alpha = \frac{-(b - b_\alpha) + \Delta_\alpha}{a - a_\alpha}, \tag{12}$$

with $\Delta_\alpha = \sqrt{(b - b_\alpha)^2 + |a - a_\alpha|^2} > 0$. (The ambiguity in the sign of $\Delta_\alpha$ is fixed by requiring that the hyperkähler metric on this family of sections be everywhere nonsingular. This is equivalent to asking that the normal bundle of every section in the family is $O(1) \oplus O(1)$.) Since the real structure must interchange $\rho_0$ and $\xi_0$, we get

$$B \bar B = \prod_\alpha (b - b_\alpha + \Delta_\alpha). \tag{13}$$

Thus we have a family of solutions to Eq. (11) parametrized by $\mathrm{Re}\, a$, $\mathrm{Im}\, a$, $b$, and $\mathrm{Arg}\, B$.

Having found the real holomorphic sections, we compute the Kähler potential. The twistor space $Z_1$ is fibered over $\mathbb{P}^1$ with an intermediate projection $Z_1 \to O(2) \to \mathbb{P}^1$. In the above $\zeta$ and $\eta$ are coordinates on the base and the fiber of $O(2)$, respectively. The holomorphic 2-form $\omega \in \Lambda^2 T^* \otimes O(2)$ is given by

$$\omega = 4\, d\eta \wedge \frac{d\rho}{\rho}. \tag{14}$$
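As a quick numerical sanity check of the roots in Eq. (12), the following Python sketch evaluates $u_\alpha$ and $v_\alpha$ for sample parameters, confirms that they solve $\eta(\zeta) = P_\alpha(\zeta)$, and confirms that they are exchanged by the antipodal map $\zeta \mapsto -1/\bar\zeta$. The function names and sample values are illustrative assumptions, not part of the original text.

```python
import math


def section_roots(a, b, a_alpha, b_alpha):
    """Return (u, v), the roots of eta(zeta) = P_alpha(zeta) as written in Eq. (12).

    eta(zeta)     = a*zeta**2 + 2*b*zeta - conj(a)
    P_alpha(zeta) = a_alpha*zeta**2 + 2*b_alpha*zeta - conj(a_alpha)
    """
    A = complex(a) - complex(a_alpha)   # assumed nonzero
    B = b - b_alpha                     # real by assumption
    delta = math.sqrt(B * B + abs(A) ** 2)   # Delta_alpha > 0
    u = (-B - delta) / A
    v = (-B + delta) / A
    return u, v


def eta_minus_p(zeta, a, b, a_alpha, b_alpha):
    """Evaluate eta(zeta) - P_alpha(zeta)."""
    A = complex(a) - complex(a_alpha)
    B = b - b_alpha
    return A * zeta ** 2 + 2 * B * zeta - A.conjugate()


if __name__ == "__main__":
    params = (1 + 2j, 0.5, 0.3 - 1j, -0.2)   # hypothetical sample values
    u, v = section_roots(*params)
    print("residual at u:", abs(eta_minus_p(u, *params)))
    print("residual at v:", abs(eta_minus_p(v, *params)))
    print("v + 1/conj(u):", abs(v + 1 / u.conjugate()))
```

All three printed quantities should vanish up to rounding error; the last one reflects the fact that the two roots of a real section are antipodal.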
Fig. 1. The contour $\Gamma_\alpha$ enclosing $u_\alpha$ and $v_\alpha$
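For $\zeta \neq \infty$ we can choose $\eta(\zeta)$ and $\chi = 2 \log(\rho/\xi)$ as two coordinates on the moduli space $M_1$ holomorphic with respect to the complex structure defined by $\zeta$. The coordinates in the patch $\zeta \neq 0$ are related to these as

$$\eta' = \eta/\zeta^2, \qquad \chi' = \chi - 4\mu\eta/\zeta. \tag{15}$$

The second equation here follows from $\rho_0$ and $\xi_0$ being sections of $L^{\mu}(k)$ and $L^{-\mu}(k)$. In terms of these coordinates

$$\omega = d\eta \wedge d\chi = \zeta^2\, d\eta' \wedge d\chi'. \tag{16}$$

Following Ref. [9] we define an auxiliary function $\hat f$ and a contour $C$ by the equation

$$\oint_C \frac{d\zeta}{\zeta^{j}}\, \hat f = \oint_0 \frac{d\zeta}{\zeta^{j}}\, \chi + \oint_\infty \frac{d\zeta}{\zeta^{j}}\, \chi' = \left( \oint_0 + \oint_\infty \right) \frac{d\zeta}{\zeta^{j}}\, \chi - 4\mu \oint_\infty \frac{d\zeta}{\zeta^{j+1}}\, \eta \tag{17}$$

for any integer $j$. Here and in what follows the integrals $\oint_0$ and $\oint_\infty$ are taken along small positively oriented contours around the respective points. This implies that in the first of these integrals the contour runs counterclockwise, while in the second one it runs clockwise. Substituting an explicit expression for $\chi$ we find

$$\oint_C \frac{d\zeta}{\zeta^{j}}\, \hat f = \sum_\alpha \oint_{\Gamma_\alpha} \frac{d\zeta}{\zeta^{j}}\, 2 \log(\eta(\zeta) - P_\alpha(\zeta)) + 4\mu \oint_0 \frac{d\zeta}{\zeta^{j+1}}\, \eta. \tag{18}$$

Here $\Gamma_\alpha$ is a figure-eight-shaped contour enclosing $u_\alpha$ and $v_\alpha$ (see Fig. 1). We define a function $G(\eta, \zeta)$ by $\partial G/\partial \eta = \hat f$. According to Ref. [9] the Legendre transform of the Kähler potential is given by

$$F(a, b) = \frac{1}{2\pi i} \oint_C \frac{d\zeta}{\zeta^2}\, G(\eta, \zeta). \tag{19}$$

Using Eq. (18) we find

$$F(a, b) = \frac{2\mu}{2\pi i} \oint_0 \frac{d\zeta}{\zeta^3}\, \eta^2 + \sum_{\alpha=1}^{k} \frac{1}{2\pi i} \oint_{\Gamma_\alpha} \frac{d\zeta}{\zeta^2}\, 2 (\eta - P_\alpha) \log(\eta - P_\alpha). \tag{20}$$

The Kähler potential $K$ is the Legendre transform of $F$:

$$K(a, \bar a, t, \bar t) = F - b\,(t + \bar t), \qquad \frac{\partial F}{\partial b} = t + \bar t. \tag{21}$$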
It is a well known fact that the metric corresponding to Eq. (20) is the multi-Taub-NUT metric with k centers [9,14]. This is in agreement with string theory predictions [7].
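The first term of Eq. (20) is a residue that can be evaluated by hand: with $\eta = a\zeta^2 + 2b\zeta - \bar a$, the coefficient of $\zeta^2$ in $\eta^2$ is $4b^2 - 2|a|^2$, so the term equals $2\mu(4b^2 - 2|a|^2)$. The Python sketch below checks this by discretizing the small contour around $\zeta = 0$ as a circle. The helper names, the number of sample points, and the sample parameters are assumptions made for illustration.

```python
import cmath
import math


def contour_integral(f, n_points=256, radius=1.0):
    """Approximate the integral of f(zeta) d(zeta) over a counterclockwise circle around 0."""
    total = 0j
    for k in range(n_points):
        theta = 2 * math.pi * k / n_points
        zeta = radius * cmath.exp(1j * theta)
        total += f(zeta) * 1j * zeta   # d(zeta) = i * zeta * d(theta)
    return total * (2 * math.pi / n_points)


def first_term_of_F(a, b, mu):
    """Evaluate (2*mu / (2*pi*i)) * contour integral of eta**2 / zeta**3 around zeta = 0."""
    a = complex(a)

    def eta(zeta):
        return a * zeta ** 2 + 2 * b * zeta - a.conjugate()

    integral = contour_integral(lambda z: eta(z) ** 2 / z ** 3)
    return 2 * mu / (2j * math.pi) * integral


if __name__ == "__main__":
    a, b, mu = 0.3 + 0.4j, 0.5, 1.5          # hypothetical sample values
    numeric = first_term_of_F(a, b, mu)
    closed_form = 2 * mu * (4 * b ** 2 - 2 * abs(a) ** 2)
    print(numeric, closed_form)
```

Because the integrand is a Laurent polynomial, the equally spaced rule on the circle reproduces the residue up to rounding error; the logarithmic terms of Eq. (20) are not treated here.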
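5. Moduli Space of Centered n = 2 Monopole

5.1. Twistor space $Z_2^0$ of centered n = 2 monopole. For $n = 2$ the moduli space $M_2$ is 8-dimensional and admits a triholomorphic U(1) action. We define the centered moduli space $M_2^0$ to be the hyperkähler quotient of $M_2$ with respect to this U(1) (at zero level). The U(1) action on $M_2$ lifts to a $\mathbb{C}^*$ action on $Z_2$. It acts by $\rho_j \to \lambda \rho_j$, $\xi_j \to \lambda^{-1} \xi_j$. The corresponding moment map is $\eta_1$, as can be easily seen from the expression for $\omega$. Thus $Z_2^0$, the twistor space of $M_2^0$, is the $\mathbb{C}^*$ quotient of the subvariety $\eta_1 = 0 = \tilde\eta_1$ in $Z_2$. We first investigate one coordinate patch of $Z_2^0$. Let us denote $\psi_1 = \rho_0 \xi_0$, $\psi_2 = \rho_1 \xi_1$, $\psi_3 = \frac{1}{2}(\rho_0 \xi_1 + \rho_1 \xi_0)$, $\psi_4 = \frac{1}{2}(\rho_0 \xi_1 - \rho_1 \xi_0)$. The variables $\psi_i$ are invariant with respect to the $\mathbb{C}^*$ action and satisfy

$$\psi_1 \psi_2 = \psi_3^2 - \psi_4^2,$$
$$\psi_1 - \eta_2 \psi_2 + 2\sqrt{-\eta_2}\, \psi_3 = \prod_\alpha \left( \sqrt{-\eta_2} - P_\alpha(\zeta) \right), \tag{22}$$
$$\psi_1 - \eta_2 \psi_2 - 2\sqrt{-\eta_2}\, \psi_3 = \prod_\alpha \left( -\sqrt{-\eta_2} - P_\alpha(\zeta) \right).$$

These equations define a three-dimensional subvariety $U^0$ in $\mathbb{C}^6$ with coordinates $(\zeta, \eta_2, \psi_1, \ldots, \psi_4)$. Geometric invariant theory tells us that $Z_2^0$ can be obtained by gluing together two copies of $U^0$ over $\zeta \neq 0, \infty$. The transition functions can be computed from Eq. (9):

$$\tilde\zeta = \zeta^{-1}, \qquad \tilde\eta_2 = \zeta^{-4} \eta_2,$$
$$\tilde\psi_1 = \frac{\zeta^{-2k}}{2} \left[ \psi_1 - \eta_2 \psi_2 + \cos\gamma\, (\psi_1 + \eta_2 \psi_2) - 2 \psi_4 \sqrt{\eta_2}\, \sin\gamma \right],$$
$$\tilde\psi_2 = \frac{\zeta^{4-2k}}{2\eta_2} \left[ -(\psi_1 - \eta_2 \psi_2) + \cos\gamma\, (\psi_1 + \eta_2 \psi_2) - 2 \psi_4 \sqrt{\eta_2}\, \sin\gamma \right], \tag{23}$$
$$\tilde\psi_3 = \zeta^{2-2k} \psi_3,$$
$$\tilde\psi_4 = \zeta^{2-2k} \left[ (\psi_1 + \eta_2 \psi_2) \frac{\sin\gamma}{2\sqrt{\eta_2}} + \psi_4 \cos\gamma \right],$$

where $\gamma = 2\mu\sqrt{\eta_2}/\zeta$. From this explicit description of $Z_2^0$ one can see that for any $\zeta$ the fiber of $Z_2^0$ is a resolution of the $D_k$ singularity. Indeed, combining Eqs. (22) we see that the fiber of $U^0$ over $\zeta$ is biholomorphic to a hypersurface in $\mathbb{C}^3$ (with coordinates $(\eta_2, \psi_2, \psi_4)$) given by

$$\psi_4^2 + \eta_2 \psi_2^2 + \psi_2 Q(\eta_2) - R(\eta_2)^2 = 0,$$

where $Q(\eta_2)$, $R(\eta_2)$ are polynomials in $\eta_2$ defined by

$$2 Q(\eta_2) = \prod_\alpha \left( \sqrt{-\eta_2} - P_\alpha(\zeta) \right) + \prod_\alpha \left( -\sqrt{-\eta_2} - P_\alpha(\zeta) \right),$$
$$4 \sqrt{-\eta_2}\, R(\eta_2) = \prod_\alpha \left( \sqrt{-\eta_2} - P_\alpha(\zeta) \right) - \prod_\alpha \left( -\sqrt{-\eta_2} - P_\alpha(\zeta) \right). \tag{24}$$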
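The transition functions of Eq. (23) can be checked numerically against the gluing rule of Eq. (9) with $\eta_1 = 0$. The Python sketch below applies Eq. (9) to sample values of $(\rho_0, \rho_1, \xi_0, \xi_1)$ on the two branches $\eta = \pm\sqrt{-\eta_2}$, forms the invariants $\psi_i$ before and after gluing, and compares the result with Eq. (23). All function names and sample values are hypothetical and serve only as an illustration.

```python
import cmath


def glue(zeta, eta2, rho, xi, mu, k):
    """Apply Eq. (9) with eta1 = 0 to rho = (rho0, rho1) and xi = (xi0, xi1)."""
    eta = cmath.sqrt(-eta2)   # one branch of eta**2 = -eta2; the other is -eta

    def transform(c0, c1, sign):
        # Values of the transformed polynomial on the two branches.
        plus = cmath.exp(sign * mu * eta / zeta) * zeta ** (-k) * (c0 + c1 * eta)
        minus = cmath.exp(-sign * mu * eta / zeta) * zeta ** (-k) * (c0 - c1 * eta)
        # Read off new coefficients using eta_tilde = eta / zeta**2.
        new0 = (plus + minus) / 2
        new1 = (plus - minus) / (2 * eta / zeta ** 2)
        return new0, new1

    return transform(rho[0], rho[1], -1), transform(xi[0], xi[1], +1)


def invariants(rho, xi):
    """Return (psi1, psi2, psi3, psi4) built from rho and xi."""
    r0, r1 = rho
    x0, x1 = xi
    return (r0 * x0, r1 * x1, (r0 * x1 + r1 * x0) / 2, (r0 * x1 - r1 * x0) / 2)


def transition_23(zeta, eta2, psi, mu, k):
    """Evaluate the right-hand sides of Eq. (23)."""
    p1, p2, p3, p4 = psi
    s = cmath.sqrt(eta2)
    g = 2 * mu * s / zeta
    common = cmath.cos(g) * (p1 + eta2 * p2) - 2 * p4 * s * cmath.sin(g)
    t1 = zeta ** (-2 * k) / 2 * ((p1 - eta2 * p2) + common)
    t2 = zeta ** (4 - 2 * k) / (2 * eta2) * (-(p1 - eta2 * p2) + common)
    t3 = zeta ** (2 - 2 * k) * p3
    t4 = zeta ** (2 - 2 * k) * ((p1 + eta2 * p2) * cmath.sin(g) / (2 * s) + p4 * cmath.cos(g))
    return t1, t2, t3, t4


if __name__ == "__main__":
    zeta, eta2, mu, k = 0.7 + 0.4j, 1.3 - 0.8j, 0.8, 3      # sample values
    rho, xi = (0.5 + 1j, -0.3 + 0.2j), (1.1 - 0.4j, 0.6 + 0.9j)
    direct = invariants(*glue(zeta, eta2, rho, xi, mu, k))
    formula = transition_23(zeta, eta2, invariants(rho, xi), mu, k)
    for d, f in zip(direct, formula):
        print(abs(d - f))
```

The check does not impose the constraint of Eq. (22); it only tests that the invariants transform as stated, which is an algebraic identity in $\rho_j$, $\xi_j$.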
Furthermore, these formulas imply that if all points $p_1, \ldots, p_k$ are distinct, the manifold $M_2^0$ is a smooth complex manifold in any of its complex structures. Since the 2-form $\omega$ is smooth as well, we conclude that $M_2^0$ is a smooth hyperkähler manifold. The smoothness of $M_2^0$ is also in agreement with string theory predictions. Indeed, as explained in Ref. [7], the space $M_2^0$ is the Coulomb branch of $N = 4$, $D = 3$ SU(2) gauge theory with $k$ fundamental hypermultiplets, with $p_\alpha$ being hypermultiplet masses. When the $p_\alpha$ are all distinct, the theory has no Higgs branch, and therefore the Coulomb branch is smooth everywhere. When some masses become equal, the Higgs branch emerges, and the Coulomb branch develops an orbifold singularity at the point where it meets the Higgs branch. Thus we expect that when some of the $p_\alpha$ coincide, or equivalently, when some of the $\ell_\alpha$ are bigger than 1, the manifold $M_2^0$ has orbifold singularities. In Ref. [5] the same manifold $Z_2^0$ arose as the twistor space of the moduli space of a system of ordinary differential equations (so-called Nahm equations). This is of course a consequence of a general correspondence between solutions of Bogomolny equations and Nahm equations [15,12,11]. Thus Ref. [5] provides an equivalent construction of $D_k$ ALF metrics.

5.2. Real holomorphic sections of $Z_2^0$. The discussion of Sect. 2 implies that a real holomorphic section of the uncentered twistor space $Z_2$ is a triplet $(S, \rho, \xi)$, where $S$ is the spectral curve in $T$ given by $\eta^2 + \eta_1 \eta + \eta_2 = 0$, and $\rho$ and $\xi$ are holomorphic sections of $L^{\mu}(k)|_S$ and $L^{-\mu}(k)|_S$ satisfying the condition (iii) of Sect. 2. Then, as explained in Sect. 3, the real holomorphic sections of $Z_2^0$ are obtained by setting $\eta_1 = 0$ and modding by the $\mathbb{C}^*$ action $\rho \to \lambda\rho$, $\xi \to \lambda^{-1}\xi$. In this subsection we find the explicit form of the real holomorphic sections of $Z_2^0$. The curve $\eta^2 + \eta_2 = 0$ is either elliptic or a union of two $\mathbb{CP}^1$'s. The former case is generic, while the latter occurs at a submanifold of the moduli space. Intuitively the latter case corresponds to the situation when the two nonabelian monopoles are on top of each other. It suffices to consider the elliptic case. By an SO(3) rotation

$$\zeta = \frac{a \tilde\zeta + b}{-\bar b \tilde\zeta + \bar a}, \qquad \eta = \frac{\tilde\eta}{(-\bar b \tilde\zeta + \bar a)^2}, \qquad |a|^2 + |b|^2 = 1, \tag{25}$$

we can always bring the elliptic curve $\eta^2 = -\eta_2(\zeta)$ to the form

$$\tilde\eta^2 = 4 k_1^2 \left( \tilde\zeta^3 - 3 k_2 \tilde\zeta^2 - \tilde\zeta \right), \qquad k_1 > 0, \quad k_2 \in \mathbb{R}. \tag{26}$$

It follows that the discriminant $\Delta > 0$, and therefore the lattice defined by the curve $S$ is rectangular. We denote this lattice $\Theta$ and its real and imaginary periods by $2\omega$ and $2\omega'$, respectively. We parametrized $S$ by five real parameters: the Euler angles of the SO(3) rotation and a pair of real numbers $k_1$ and $k_2$. We will see in a moment that the condition (iii) imposes one real constraint on them, so we will obtain a four-parameter family of real sections, as required. To write explicitly a section of $L^{\mu}(k)|_S$, we will use the standard "flat" parameter $u$ on the elliptic curve defined modulo $\Theta$, in terms of which $\tilde\eta = k_1 \wp'(u)$, $\tilde\zeta = \wp(u) + k_2$. Here $\wp(u)$ is the Weierstrass elliptic function. In terms of $u$ the real structure acts by $u \to -u + \omega + \omega'$.
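The positivity of the discriminant can be made explicit. Substituting $\tilde\zeta = \wp + k_2$ into the cubic of Eq. (26) removes the quadratic term and gives the Weierstrass form $4\wp^3 - g_2 \wp - g_3$ with $g_2 = 4(3k_2^2 + 1)$ and $g_3 = 4k_2(2k_2^2 + 1)$, so that $g_2^3 - 27 g_3^2 = 16(9k_2^2 + 4) > 0$. These expressions for $g_2$ and $g_3$ are derived here and are not quoted from the text; the short Python sketch below verifies them with exact rational arithmetic, using hypothetical function names.

```python
from fractions import Fraction


def weierstrass_invariants(k2):
    """Return (g2, g3) for the cubic 4*(z**3 - 3*k2*z**2 - z) after the shift z = p + k2."""
    k2 = Fraction(k2)
    g2 = 4 * (3 * k2 ** 2 + 1)
    g3 = 4 * k2 * (2 * k2 ** 2 + 1)
    return g2, g3


def cubic_in_zeta(z, k2):
    """Right-hand side of Eq. (26) divided by k1**2."""
    return 4 * (z ** 3 - 3 * k2 * z ** 2 - z)


def cubic_in_p(p, k2):
    """Weierstrass normal form 4*p**3 - g2*p - g3."""
    g2, g3 = weierstrass_invariants(k2)
    return 4 * p ** 3 - g2 * p - g3


def discriminant(k2):
    """Discriminant g2**3 - 27*g3**2 of the Weierstrass cubic."""
    g2, g3 = weierstrass_invariants(k2)
    return g2 ** 3 - 27 * g3 ** 2


if __name__ == "__main__":
    for k2 in (Fraction(-3, 2), Fraction(0), Fraction(2, 7)):
        for p in (Fraction(-1), Fraction(1, 3), Fraction(5)):
            assert cubic_in_zeta(p + k2, k2) == cubic_in_p(p, k2)
        print(k2, discriminant(k2), 16 * (9 * k2 ** 2 + 4))
```

A positive discriminant means the cubic has three distinct real roots, which is the standard condition for a rectangular period lattice.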
A section of $L^{\mu}(k)|_S$ can be thought of as a pair of functions $f_1, f_2$ on $S$ such that $f_1$ is holomorphic everywhere except $\zeta = \infty$, $f_2$ is holomorphic everywhere except $\zeta = 0$, and for $\zeta \neq 0, \infty$

$$f_2(\zeta) = \zeta^{-k} \exp(-\mu\eta/\zeta)\, f_1(\zeta).$$

The point $\zeta = \infty$ corresponds to two points $u_\infty, -u_\infty$ on $S$ defined by $\wp(u_\infty) + k_2 = \bar a/\bar b$. Furthermore, condition (iii) implies that the divisor of $f_1$ is $Q^+$. Let us recall that $Q = Q^+ \cup Q^- = \cup_\alpha Q_\alpha$, where $Q_\alpha = S \cap P_\alpha$, $\alpha = 1, \ldots, k$. Thus $Q_\alpha$ consists of solutions of a system of two equations

$$\eta = P_\alpha(\zeta), \qquad \eta^2 = -\eta_2(\zeta).$$

Obviously, this defines four points on the elliptic curve $S$. Because of the real structure, these four points split into two pairs whose members are interchanged by $\tau$. $Q^+$ includes one point from each pair (for all $\alpha$), $Q^-$ includes the rest. There is a $4^k$-fold ambiguity involved in the splitting $Q = Q^+ \cup Q^-$. It can be fixed, in principle, by requiring that the normal bundle of every section of the twistor space in the family that we are considering is $O(1) \oplus O(1)$. Let us denote the "flat" coordinates of points in $Q^+$ by $u_\alpha, u'_\alpha$, $\alpha = 1, \ldots, k$, and those in $Q^-$ by $v_\alpha, v'_\alpha$, $\alpha = 1, \ldots, k$. By definition, $v_\alpha = -u_\alpha + \omega + \omega' \pmod{\Theta}$, $v'_\alpha = -u'_\alpha + \omega + \omega' \pmod{\Theta}$. We fix the mod $\Theta$ ambiguity by requiring that $u_\alpha, u'_\alpha, v_\alpha, v'_\alpha$ are in the fundamental rectangle of $\Theta$. In this notation a section of $L^{\mu}(k)|_S$ is given by

$$f_1 \sim \exp\left( -\mu k_1 \left( \zeta_W(u + u_\infty) + \zeta_W(u - u_\infty) \right) + C u \right) \prod_\alpha \frac{\sigma(u - u_\alpha)\, \sigma(u - u'_\alpha)}{\sigma(u - u_\infty)\, \sigma(u + u_\infty)}. \tag{27}$$

Here $\zeta_W(u)$ and $\sigma(u)$ are Weierstrass quasielliptic functions (we denote the Weierstrass $\zeta$-function by $\zeta_W(u)$ to avoid confusion with the affine coordinate $\zeta$ on the $\mathbb{P}^1$ of complex structures), and $C$ is a constant. Similarly, a section of $L^{-\mu}(k)|_S$ with the divisor $Q^-$ is represented by a pair of functions $g_1, g_2$ related by

$$g_2(\zeta) = \zeta^{-k} \exp(\mu\eta/\zeta)\, g_1(\zeta).$$

Explicitly $g_1$ is given by

$$g_1 \sim \exp\left( \mu k_1 \left( \zeta_W(u + u_\infty) + \zeta_W(u - u_\infty) \right) + D u \right) \prod_\alpha \frac{\sigma(u - v_\alpha)\, \sigma(u - v'_\alpha)}{\sigma(u - u_\infty)\, \sigma(u + u_\infty)}, \tag{28}$$

where $D$ is another constant. In general $f_1$ and $g_1$ are quasiperiodic with periods $2\omega$ and $2\omega'$. The condition (iii) is equivalent to asking that $f_1$ and $g_1$ be doubly periodic. One can see that the latter can be achieved by adjusting $C$ and $D$ if and only if

$$2\mu k_1 + \sum_\alpha (u_\alpha + u'_\alpha) \in \Theta, \qquad 2\mu k_1 - \sum_\alpha (v_\alpha + v'_\alpha) \in \Theta. \tag{29}$$

Recalling that $k_1$ is real and positive, we conclude that there exist integers $m, m', p, p'$ and a real number $x \in (0, 2\omega]$ such that $\sum_\alpha (u_\alpha + u'_\alpha) = -x + 2m\omega + 2m'\omega'$, $\sum_\alpha (v_\alpha + v'_\alpha) = x + 2p\omega + 2p'\omega'$. Then Eqs. (29) together with the condition (iv) imply $2\mu k_1 = x$. Then for $f_1$ and $g_1$ to be doubly periodic one has to set

$$C = 2m\, \zeta_W(\omega) + 2m'\, \zeta_W(\omega'), \qquad D = 2p\, \zeta_W(\omega) + 2p'\, \zeta_W(\omega'). \tag{30}$$
Let us notice for future use that

$$\log f_1(u + \omega) - \log f_1(u) = -2\pi i\, m', \qquad \log f_1(u + \omega') - \log f_1(u) = 2\pi i\, m,$$
$$\log g_1(u + \omega) - \log g_1(u) = -2\pi i\, p', \qquad \log g_1(u + \omega') - \log g_1(u) = 2\pi i\, p. \tag{31}$$

Equation (30) is a transcendental equation on $k_1$, $k_2$, and the SO(3) rotation required to bring $S$ to the standard form Eq. (26). It reduces the number of real parameters in the equation of the curve from 5 to 4. Thus we have a four-parameter family of real sections of $Z_2^0$.

5.3. The Kähler potential of the centered n = 2 moduli space. Having found a four-parameter family of real holomorphic sections of $Z_2^0$ we now would like to compute the corresponding hyperkähler metric. Since $Z_2^0$ has an intermediate holomorphic projection on $O(4)$, we can use the method of Ref. [9] to write down the Legendre transform of the Kähler potential. The existence of the projection is equivalent to saying that $\eta_2$ is a holomorphic coordinate on $Z_2^0$. The holomorphic 2-form $\omega$ in the patch $\zeta \neq \infty$ can be written as

$$\omega = d\eta_2 \wedge d \sum_{\text{branches}} \frac{1}{\eta} \log \frac{f_1}{g_1} \equiv d\eta_2 \wedge d\chi.$$

Here $f_1$ and $\eta = \sqrt{-\eta_2}$ are regarded as double-valued functions of $\zeta \in \mathbb{P}^1 \setminus \{\infty\}$, and the sum is over the two branches of the cover $S \to \mathbb{P}^1$. Similarly, in the patch $\zeta \neq 0$ we can write $\omega' = d\eta'_2 \wedge d\chi'$. On the overlap we have the relations

$$\omega' = \zeta^{-2} \omega, \qquad \eta'_2 = \zeta^{-4} \eta_2, \qquad \chi' = \zeta^2 \chi - 4\mu\zeta. \tag{32}$$
Following Ref. [9], we would like to find a (multi-valued) function $\hat f(\eta, \zeta)$ and a contour $C$ on the double cover $S \to \mathbb{P}^1$ such that

$$\oint_C \frac{d\zeta}{\zeta^{j}}\, \hat f(\eta, \zeta) = \oint_0 \frac{d\zeta}{\zeta^{j-2}}\, \chi + \oint_\infty \frac{d\zeta}{\zeta^{j}}\, \chi'$$

for any integer $j$. Here the contours of integration on the RHS are small positively oriented loops around $\zeta = 0$ and $\zeta = \infty$. To find $\hat f$ we substitute the explicit expressions for $\chi$ and $\chi'$ and rewrite the integral on the RHS as an integral in the $u$-plane. Then the RHS becomes

$$\oint \frac{du}{k_1}\, \zeta(u)^{-j+2} \log \frac{f_1(u)}{g_1(u)} + 4\mu \oint_0 \frac{d\zeta}{\zeta^{j-1}}, \tag{33}$$

where the contour in the first integral consists of four small positively oriented loops around four preimages of the points $\zeta = 0$ and $\zeta = \infty$ in the fundamental rectangle of the lattice $\Theta$. We denote these points $u_0$, $u'_0 = 2(\omega + \omega') - u_0$, $u_\infty$, $u'_\infty = 2(\omega + \omega') - u_\infty$. Besides these four points the only other branch points of $\log f_1(u)/g_1(u)$ in the fundamental rectangle are $u_\alpha, u'_\alpha, v_\alpha, v'_\alpha$, $\alpha = 1, \ldots, k$. As for $\zeta(u)$, it is elliptic. Then we can rewrite Eq. (33) as

$$\oint_{\text{bdry}} \frac{du}{k_1}\, \zeta(u)^{-j+2} \log \frac{f_1(u)}{g_1(u)} + \sum_\alpha \oint_{A_\alpha + A'_\alpha} \frac{du}{k_1}\, \zeta(u)^{-j+2} \log \frac{f_1(u)}{g_1(u)} + 4\mu \oint_0 \frac{d\zeta}{\zeta^{j-1}}, \tag{34}$$

where the contour in the first integral runs along the boundary of the fundamental rectangle, while $A_\alpha$ and $A'_\alpha$ enclose the pairs of points $u_\alpha, v_\alpha$ and $u'_\alpha, v'_\alpha$, respectively (see Fig. 2).

Fig. 2. Integration contours in Eq. (33). Only one of the contours $A_\alpha$ and one of the contours $A'_\alpha$ are shown

Using Eqs. (31) the integral over the boundary can be simplified to

$$-2\pi i \oint_{(m-p,\, m'-p')} \frac{du}{k_1}\, \zeta(u)^{-j+2},$$

where the contour $(m - p, m' - p')$ winds $m - p$ times around the real cycle and $m' - p'$ times around the imaginary cycle. Recalling the explicit form of $f_1(u)$ and $g_1(u)$, we can rewrite the integral over $A_\alpha + A'_\alpha$ as

$$\oint_{B_\alpha + B'_\alpha} \frac{du}{k_1}\, \zeta(u)^{-j+2} \log \sigma(u - u_\alpha)\, \sigma(u - u'_\alpha)\, \sigma(u - v_\alpha)\, \sigma(u - v'_\alpha), \tag{35}$$

where the contours $B_\alpha$ and $B'_\alpha$ are figure-eight-shaped contours shown in Fig. 3. On the other hand, it can be easily seen that

$$\eta(u) - P_\alpha(\zeta(u)) \sim e^{u(C+D)}\, \frac{\sigma(u - u_\alpha)\, \sigma(u - u'_\alpha)\, \sigma(u - v_\alpha)\, \sigma(u - v'_\alpha)}{\sigma(u - u_\infty)^2\, \sigma(u + u_\infty)^2}.$$
Fig. 3. The contours $B_\alpha$ and $B'_\alpha$
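Since neither $u_\infty$ nor $u'_\infty$ are enclosed by the contour $B_\alpha + B'_\alpha$, the integral Eq. (35) is equal to

$$\oint_{B_\alpha + B'_\alpha} \frac{du}{k_1}\, \zeta(u)^{-j+2} \log(\eta(u) - P_\alpha(\zeta(u))).$$

Collecting all of this together we get

$$\oint_C \frac{d\zeta}{\zeta^{j}}\, \hat f(\eta, \zeta) = -2\pi i \oint_{(m-p,\, m'-p')} \frac{d\zeta}{\eta}\, \zeta^{-j+2} + \sum_\alpha \oint_{C_\alpha + C'_\alpha} \frac{d\zeta}{\eta}\, \zeta^{-j+2} \log(\eta - P_\alpha(\zeta)) + 4\mu \oint_0 \frac{d\zeta}{\zeta^{j-1}}. \tag{36}$$

Here all the functions are regarded as functions on the double cover of the $\zeta$-plane, and the contours $C_\alpha, C'_\alpha$ are the images of $B_\alpha, B'_\alpha$ under the map $u \mapsto \zeta$. We now define a function $G(\eta, \zeta)$ by $\partial G/\partial \eta = -2\eta\zeta^{-2} \hat f$. According to Ref. [9] the Legendre transform of the Kähler potential is given by

$$F = \frac{1}{2\pi i} \oint \frac{d\zeta}{\zeta^2}\, G(\eta, \zeta).$$

Hence we can read off $F$:

$$F = -\frac{1}{2\pi i} \oint_0 d\zeta\, \frac{4\mu\eta^2}{\zeta^3} + \oint_{(m-p,\, m'-p')} d\zeta\, \frac{2\eta}{\zeta^2} - \sum_\alpha \frac{1}{2\pi i} \oint_{C_\alpha + C'_\alpha} \frac{d\zeta}{\zeta^2}\, 2 (\eta - P_\alpha(\zeta)) \log(\eta - P_\alpha(\zeta)). \tag{37}$$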
$F$ may be regarded as a function of the coefficients of $\eta_2(\zeta) = z + v\zeta + w\zeta^2 - \bar v \zeta^3 + \bar z \zeta^4$. Since $w$ is real, $F$ depends on 5 real parameters. These parameters are subject to one transcendental constraint expressed by Eq. (30). (This constraint implies $\partial F/\partial w = 0$.) Thus we may think of $w$ as an implicit function of $z$ and $v$. The Kähler potential $K(z, \bar z, u, \bar u)$ is the Legendre transform of $F$:

$$K(z, \bar z, u, \bar u) = F(z, \bar z, v, \bar v, w) - uv - \bar u \bar v, \qquad \frac{\partial F}{\partial v} = u, \quad \frac{\partial F}{\partial \bar v} = \bar u.$$

Equation (37) agrees with a conjecture by Chalmers [16]. We already saw in Sect. 5 that $M_2^0$ is a resolution of the $D_k$ singularity. Now we can check that it is ALF. To this end we take the limit $k_1 \to +\infty$. Equation (30) implies that in this limit $\omega \to \infty$, while $\omega'$ stays finite. Thus the curve $S$ degenerates: $\eta_2(\zeta) \to -(P(\zeta))^2$, where $P(\zeta)$ is a real section of $T$. It is easy to see that in this limit $F$ reduces to the Taub-NUT form (see Sect. 4):

$$F \sim \frac{1}{2\pi i} \oint_0 d\zeta \left( -\frac{4\mu P(\zeta)^2}{\zeta^3} + K\, \frac{P(\zeta) \log P(\zeta)}{\zeta^2} \right), \tag{38}$$

where $K$ is an integer depending on the limiting behavior of $u_\alpha, u'_\alpha, v_\alpha, v'_\alpha$. Therefore asymptotically the metric on $M_2^0$ has the Taub-NUT form. With some more work it should be possible to compute the integer $K$ as well. Note also that if we set $\mu = 0$, then the metric becomes ALE. Kronheimer proved [2] that the $D_k$ ALE metric is essentially unique. Thus we have obtained the Legendre transform of the Kähler potential for the $D_k$ metrics of Ref. [1]. It would be interesting to obtain a similar representation for the $E_k$ ALE metrics.

References
1. Kronheimer, P.B.: The Construction of ALE Spaces as Hyper-Kähler Quotients. J. Differ. Geom. 29, 665–683 (1989)
2. Kronheimer, P.B.: A Torelli-type theorem for gravitational instantons. J. Differ. Geom. 29, 685–697 (1989)
3. Gibbons, G.W. and Hawking, S.W.: Gravitational Multi-instantons. Phys. Lett. B 78, 430–432 (1978)
4. Sen, A.: A Note on Enhanced Gauge Symmetries in M- and String Theory. JHEP 09, 1 (1997). hep-th/9707123
5. Cherkis, S.A. and Kapustin, A.: Dk Gravitational Instantons and Nahm Equations. hep-th/9803112
6. Kronheimer, P.B.: Monopoles and Taub-NUT Metrics. M. Sc. Thesis, Oxford, 1985
7. Cherkis, S.A. and Kapustin, A.: Singular Monopoles and Supersymmetric Gauge Theories in Three Dimensions. hep-th/9711145
8. Lindström, U. and Roček, M.: Commun. Math. Phys. 115, 21 (1988)
9. Ivanov, I.T. and Roček, M.: Supersymmetric Sigma Models, Twistors, and the Atiyah–Hitchin Metric. Commun. Math. Phys. 182, 291–302 (1996)
10. Hitchin, N.J.: Monopoles and Geodesics. Commun. Math. Phys. 83, 579–602 (1982)
11. Hurtubise, J. and Murray, M.K.: On the Construction of Monopoles for the Classical Groups. Commun. Math. Phys. 122, 35–89 (1989)
12. Hitchin, N.J.: On the Construction of Monopoles. Commun. Math. Phys. 89, 145–190 (1983)
13. Atiyah, M. and Hitchin, N.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, NJ: Princeton Univ. Press, 1988
14. Hitchin, N.J., Karlhede, A., Lindström, U. and Roček, M.: Hyperkähler Metrics and Supersymmetry. Commun. Math. Phys. 108, 535–589 (1987)
15. Nahm, W.: Self-dual monopoles and calorons. In: Lecture Notes in Physics 201, G. Denardo et al. (eds.), Berlin–Heidelberg–New York: Springer, 1984
16. Chalmers, G.: The Implicit Metric on a Deformation of the Atiyah–Hitchin Manifold. hep-th/9709082; Multi-monopole Moduli Spaces for SU(N) Gauge Group. hep-th/9605182

Communicated by G. Felder
Commun. Math. Phys. 203, 729 – 741 (1999)
Communications in
Mathematical Physics
© Springer-Verlag 1999
Structure of Shocks in Burgers Turbulence with Stable Noise Initial Data Jean Bertoin Laboratoire de Probabilités, Université Pierre et Marie Curie, 4, Place Jussieu, F-75252 Paris Cedex 05, France. E-mail:
[email protected] Received: 28 September 1998 / Accepted: 12 January 1999
Abstract: Burgers equation can be used as a simplified model for hydrodynamic turbulence. The purpose of this paper is to study the structure of the shocks for the inviscid equation in dimension 1 when the initial velocity is given by a stable Lévy noise with index α ∈ (1/2, 2]. We prove that Lagrangian regular points exist (i.e. there are fluid particles that have not participated in shocks at any time between 0 and t) if and only if α ≤ 1 and the noise is not completely asymmetric, and that otherwise the shock structure is discrete. Moreover, in the Cauchy case α = 1, we show that there are no rarefaction intervals, i.e. at time t > 0, there are fluid particles in any non-empty open interval.

1. Introduction

Burgers has introduced the equation

$$\partial_t u + \partial_x (u^2/2) = \varepsilon\, \partial^2_{xx} u$$

as a simple model of hydrodynamic turbulence for compressible fluids, where the parameter ε > 0 describes the viscosity of the fluid, and the solution is meant to represent the velocity of a fluid particle located at x at time t. Roughly, when the viscosity tends to 0, the dynamic of the system of particles corresponds to completely inelastic shocks, in the sense that if two (clumps of) particles collide at a given time, then they form a larger clump of particles in such a way that mass and momentum are preserved. Although it is known that this is not an accurate model for turbulence, Burgers equation is still widely used in physical problems such as, for instance, the study of shock wave formation in compressible fluids, or that of the formation of large clusters in the universe, or also as a simplified version of more elaborate models of turbulence (e.g. the Navier–Stokes equation). To present this work as simply as possible, it is convenient to use the fluid particles picture that has just been sketched, describe first informally results in this setting, and postpone the mathematical rigor to the next sections.
There is an abundant literature on the inviscid Burgers equation (that is the limit as the viscosity ε goes to 0) in dimension 1, with random initial data. See in particular [1, 2,4,6,7,12,15,16,18–20] and references therein. An interesting problem in this field is to obtain qualitative results on the shock structure. To that end, recall that a so-called Lagrangian regular point at time t can be viewed as the initial location of a particle that has not participated in shocks induced by the turbulence at any time between 0 and t, and that a rarefaction interval is an interval that contains no fluid particles at time t. Sinai [19] has proven that when the initial velocity is given by a Brownian motion, then the set of Lagrangian regular points has Hausdorff dimension 1/2 and that there are no rarefaction intervals. When the initial velocity is a Gaussian white noise, Avellaneda and E [1] have shown that the shock structure is discrete, in the sense that at time t > 0, there are no Lagrangian regular points and only finitely many clumps of particles are left in a given compact set. Quite recently, numerical simulations led Janicki and Woyczynski [12] to the conjecture that when the initial velocity is a stable Lévy process of index α ∈ (1, 2], the Hausdorff dimension of Lagrangian regular points is 1/α (this conjecture has been proven mathematically in [4] when the Lévy process has no positive jumps). We consider here the case when the initial velocity is given by a stable Lévy noise. Specifically, if we introduce the initial potential ψ(·, 0), which is formally defined by ∂x ψ(x, 0) = −u(x, 0), then the process ψ(·, 0) has independent and homogeneous increments and its one-dimensional distributions are stable laws with index α ∈ (1/2, 2]. This situation naturally appears as a limit in a large class of renormalized potentials, see [5]. 
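To make the role of the initial potential concrete, here is a minimal Python sketch of the zero-viscosity solution on a grid. It assumes the standard variational (Hopf–Cole) representation, which is not stated in this section: with $\partial_x \psi(x, 0) = -u(x, 0)$, the point $a(x, t)$ maximizes $\psi(a, 0) - (x - a)^2/(2t)$ and the velocity is $u(x, t) = (x - a(x, t))/t$. The function name, the grid, and the deterministic test potential are illustrative assumptions; a stable Lévy potential would be substituted for `potential0` in an actual simulation.

```python
def inviscid_burgers(potential0, grid, x, t):
    """Return (a, u) at position x and time t > 0.

    a is the grid point maximizing potential0(a) - (x - a)**2 / (2*t),
    i.e. the initial location of the particle found at x at time t
    (assumed Hopf-Cole formula), and u = (x - a) / t is its velocity.
    """
    best_a = max(grid, key=lambda a: potential0(a) - (x - a) ** 2 / (2 * t))
    return best_a, (x - best_a) / t


if __name__ == "__main__":
    # Deterministic example: u(x, 0) = -x, i.e. potential a**2 / 2.
    # Before the shock time t = 1 the exact solution is u(x, t) = -x / (1 - t).
    grid = [i / 100 for i in range(-400, 401)]
    a, u = inviscid_burgers(lambda y: y * y / 2, grid, x=0.5, t=0.5)
    print(a, u)
```

In this picture a shock at time $t$ shows up as a jump of $x \mapsto a(x, t)$: all initial locations skipped by the jump have merged into one clump.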
It is easy to show that in this framework, Lagrangian regular points are exceptional, in the sense that for each fixed point x ∈ R, the probability that x is regular is always zero. We will prove a much more precise result. For α > 1, and also for 1/2 < α < 1 if the noise is completely asymmetric (that is if the initial potential is a monotone increasing or decreasing process), the shock structure is discrete a.s. Nonetheless, if α ∈ (1/2, 1] and if the noise is not completely asymmetric, then a.s. there are Lagrangian regular points. The Cauchy case α = 1 is especially interesting from the mathematical point of view. We will show that it is the only one for which there are no rarefaction intervals, that is every non-empty open interval contains fluid particles at time t > 0. Informally, this study suggests that for α > 1, the shocks induced by Burgers turbulence are numerous and strong enough to involve every single fluid particle at any time t > 0 and to create only finitely many clusters on any given compact interval. For α ∈ (1/2, 1], the initial data is not as rough. However in the completely asymmetric case, the monotonicity of the initial potential implies that all the particles are moving in the same direction, and this explains why again the shock structure is discrete. On the other hand, when the noise is not completely asymmetric, the monotonicity is lost and thanks to compensations that occur when clumps of particles with opposite velocity collide, some exceptional particles are not involved in the turbulence. The rest of this paper is organized as follows. The next section is devoted to the formal presentation of the notions. Our results are then stated and proven in Sect. 3. In the case α ∈ (1/2, 1], the proofs essentially rely on known sample path properties of stable Lévy processes which have been obtained in the 70’s by Fristedt, Hawkes, Monrad and Silverstein. 
The argument to establish that the shock structure is discrete when α ∈ (1, 2] is less direct; it requires some material on fluctuation theory for Lévy processes.
Shocks in Burgers Turbulence with Stable Noise Initial Data
2. Preliminary

2.1. Some basic features on Burgers equation. In this subsection, we review some classical material on the inviscid Burgers equation that can be found for instance in [18] or [19]. From the works of Hopf [11] and Cole [8], it is known that given an initial velocity, Burgers equation with viscosity ε > 0 possesses a unique solution u_ε, and that u_ε converges as ε → 0+ to a solution u_0 = u to the inviscid equation, which is usually referred to as the Hopf–Cole (or entropic) solution. The Hopf–Cole solution has a simple expression in terms of potential functions. If we introduce ψ by u = −∂x ψ, then the potential at time t is expressed in terms of the Legendre transform of the function a → ψ(a, 0) − a²/(2t):
\[ \psi(x,t) = \sup_{a\in\mathbb{R}} \left\{ \psi(a,0) - \frac{(x-a)^2}{2t} \right\}. \tag{1} \]
Of course, we implicitly supposed that
\[ \psi(a,0) = o(a^2) \quad \text{as } |a| \to \infty, \tag{2} \]
so that the quantity in (1) is finite. Note also that the formula (1) makes sense whenever the initial velocity u(·, 0) = −∂x ψ(·, 0) is the derivative (in the sense of Schwartz) of a function. To that end, we shall merely assume that the initial potential ψ(·, 0) has only discontinuities of the first kind, i.e. there exist left and right limits at each point; and it will then be convenient to work with the version that is right-continuous. For the sake of simplicity, we shall focus on time t = 1 in the sequel. Of course, our results are valid at any positive time; this can be easily checked by a simple scaling argument.

The structure of the shocks in Burgers turbulence is conveniently described in terms of the inverse Lagrangian function which we now introduce. We denote by a(x) the largest location a at which the supremum in (1) is reached, i.e.
\[ a(x) = \sup\left\{ a \in \mathbb{R} : \sup_{b \ge a} \left( \psi(b,0) - \frac{(x-b)^2}{2} \right) = \psi(x,1) \right\}. \]
We will frequently make use of the fact that, as ψ(·, 0) is continuous to the right and possesses limits to the left, one has for all a ∈ R, ψ(a(x), 0) − (x − a(x))²/2 ≥ ψ(a, 0) − (x − a)²/2 when ψ(·, 0) is continuous or makes an upwards jump at a(x), whereas ψ(a(x)−, 0) − (x − a(x))²/2 ≥ ψ(a, 0) − (x − a)²/2 when ψ(·, 0) makes a downwards jump at a(x). We stress that the inverse Lagrangian function x → a(x) is right-continuous and increasing. Its right-continuous inverse a → x(a), which is given by x(a) = inf {y ∈ R : a(y) > a}, is called the Lagrangian function; alternatively, it can be viewed as the (right) derivative of the convex hull of the function a → −ψ(a, 0) + a²/2. From the point of view of
hydrodynamic turbulence, the Lagrangian function describes the position at time 1 of the fluid particle initially located at a. We see that if a discontinuity of the inverse Lagrangian function occurs at some point x, i.e.
\[ \lim_{y \to x-} a(y) := a(x-) < a(x), \]
then the Lagrangian function is constant on the interval [a(x−), a(x)), which means that at time 1, there is a clump located at x which is formed by all the particles that were initially in the interval [a(x−), a(x)). Similarly, if the inverse Lagrangian function stays constant on some interval [x, y), then the Lagrangian function never takes values in the open interval (x, y), which means that at time 1, there are no fluid particles in (x, y). This motivates the following definition.

We first introduce the closed range of the inverse Lagrangian function, A = {y = a(x) or y = a(x−) for some x ∈ R}. The open set R − A has a canonical decomposition into disjoint open intervals of the type (a(x−), a(x)); their closures [a(x−), a(x)] are called the shock intervals. A Lagrangian shock point is a point that belongs to some shock interval. A Lagrangian regular point is a point in A that is isolated neither to its left nor to its right in A. We thus have a natural partition of R into the set of Lagrangian regular points and the set of Lagrangian shock points. From the point of view of hydrodynamic turbulence for compressible fluids, a Lagrangian shock point (respectively, a Lagrangian regular point) represents the initial location of a particle that belongs to some clump at time 1 (respectively, that has not been involved in the shocks induced by the turbulence before time 1). One says that the shock structure is discrete if A is a discrete set. This means that there are only finitely many shock intervals in a given compact set and there exist no Lagrangian regular points. Finally, one calls (x, y) a rarefaction interval if the inverse Lagrangian function stays constant on [x, y).

2.2. Stable Lévy processes. We saw in the preceding subsection that it is easier to study Burgers turbulence using potentials than velocities.
To that end, it is more convenient to discuss the initial data in terms of a stable Lévy process rather than in terms of its derivative, namely a stable Lévy noise. In this subsection, we briefly review material in this field that will be useful in the sequel, and refer to [3] and [17] for much more on this topic. Let S = (S_s, s ∈ R) denote a generic stable Lévy process indexed by the real line. This means that S has independent and homogeneous increments and fulfills the scaling property
\[ S_{ks} \overset{\text{law}}{=} k^{1/\alpha} S_s, \qquad \forall k > 0, \]
where α ∈ (0, 2] is known as the index. Note that S0 = 0 a.s. We will always consider the version of S for which the sample paths are right-continuous and have limits to the left, a.s. It is plain from the scaling property that (2) cannot hold unless α > 1/2, and conversely, it is easy to check that (2) is fulfilled if α > 1/2. This explains why we will restrict our attention to that case in the next section. On the other hand, the case when S is a constant drift is trivial and will be implicitly excluded in the sequel. First, recall that for α < 1, S is a pure jump process with bounded variation a.s., and its derivative dS· in the sense of Stieltjes is a mixture of Dirac point masses whose
law can be described in terms of a certain Poisson measure. In particular, S is monotone increasing if and only if it has only positive jumps; one then says that S is a subordinator. One calls S completely asymmetric if either S or −S is a subordinator. If S is not completely asymmetric, then it can be expressed as the difference of two independent stable subordinators. Second, for α ≥ 1, S has unbounded variation and its derivative in the sense of Schwartz is no longer a signed measure. In particular it is not a monotone process even when it only has positive jumps. When α = 1, S is called a Cauchy process; it can be expressed as the sum of a symmetric Cauchy process and a deterministic drift; it always possesses both positive jumps and negative jumps.

Known properties about the growth of S will play a major role in this study. The literature mostly concerns the growth at the right of points. Because the time-reversed process Ŝ_s = S_{(−s)−} is again a stable Lévy process with the same law as −S, results at the left follow immediately; and sometimes it will be convenient for us to use the two-sided version of a result that appears as one-sided in the references. One of the first results in that field concerns the upper rate of growth at a fixed point. It has been proved by Khintchine, see Theorem VIII.5 in [3] for an accessible reference.

Lemma 1 (Khintchine). Let S be a stable process with index α ∈ (0, 2]; suppose that −S is not a subordinator. For every β > 0, we have with probability one
\[ \limsup_{h \to 0+} \frac{S_h}{h^{\beta}} = 0 \text{ or } \infty \]
according as α < 1/β or α ≥ 1/β.

Next, we present a uniform result on the lower rate of growth for stable subordinators that can be found in an even stronger form in Fristedt [9] and Hawkes [10].

Lemma 2 (Fristedt and Hawkes). Let S = (S_s, s ∈ R) be a stable subordinator with index α ∈ (0, 1). With probability one, we have for all s ∈ R,
\[ \liminf_{h \to 0+} \frac{S_{s+h} - S_s}{h^{1/\alpha}} < \infty, \qquad \liminf_{h \to 0+} \frac{S_{s-} - S_{s-h}}{h^{1/\alpha}} < \infty, \tag{i} \]
and
\[ \liminf_{h \to 0+} \frac{S_{s+h} - S_s}{h^{1/\alpha} |\log h|^{1 - 1/\alpha}} > 0. \tag{ii} \]
The final result concerns the behavior near local extrema; see Theorem 7.3 in Monrad and Silverstein [14].

Lemma 3 (Monrad and Silverstein). Let S be a stable process with index α ∈ (0, 2], which is not completely asymmetric if α < 1, and let f : (0, ∞) → (0, ∞) be an increasing function. With probability one, we have for any time µ at which S reaches a local maximum
\[ \liminf_{h \to 0+} \frac{S_\mu - S_{\mu \pm h}}{h^{1/\alpha} f(h)} = 0 \text{ or } \infty \]
according as the integral \( \int_0^1 t^{-1} f(t)\, dt \) diverges or converges.
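The notions of this section can be explored numerically on a grid. The sketch below is an illustration under stated assumptions, not a construction taken from the paper: it samples increments of a symmetric stable process with the Chambers–Mallows–Stuck representation (a standard sampling device for the symmetric case, so completely asymmetric noises are not covered), builds a discretized potential through the scaling property, and evaluates the inverse Lagrangian function at time 1 by brute-force maximization over the grid. The function names are hypothetical, and restricting the supremum in (1) to a finite grid introduces truncation effects near the ends of the grid.

```python
import math
import random


def symmetric_stable(alpha, rng):
    """One draw from a standard symmetric alpha-stable law, 0 < alpha <= 2."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = max(rng.expovariate(1.0), 1e-300)  # guard against an exact zero
    if alpha == 1.0:
        return math.tan(u)  # Cauchy case
    return (math.sin(alpha * u) / math.cos(u) ** (1 / alpha)
            * (math.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))


def stable_potential(alpha, grid, rng):
    """Values of a discretized potential on an increasing grid.

    The potential is normalized to 0 at grid[0]; an additive constant
    does not change the location of the maximum in (1).
    """
    psi = [0.0]
    for a_prev, a_next in zip(grid, grid[1:]):
        step = a_next - a_prev
        # Scaling property: an increment over length h is h**(1/alpha) times a unit draw.
        psi.append(psi[-1] + step ** (1 / alpha) * symmetric_stable(alpha, rng))
    return psi


def inverse_lagrangian(grid, psi, x):
    """Largest grid point a maximizing psi(a, 0) - (x - a)**2 / 2."""
    best_value, best_a = -math.inf, None
    for a, p in zip(grid, psi):
        value = p - (x - a) ** 2 / 2
        if value >= best_value:  # '>=' keeps the largest maximizer
            best_value, best_a = value, a
    return best_a
```

Evaluating `inverse_lagrangian` over a range of x and collecting the distinct values gives a discrete approximation of the set A; stretches of x on which the value does not change correspond to (approximate) rarefaction intervals.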
3. Statements and Proofs

We suppose throughout this section that the initial potential is a stable Lévy process, i.e. ψ(a, 0) = S_a. Our purpose is to describe the shock structure depending on the value of the index α.

3.1. The completely asymmetric case when α ∈ (1/2, 1). In this subsection, we establish that the shock structure is discrete when α ∈ (1/2, 1) in the completely asymmetric case.

Theorem 1. Suppose that the initial potential ψ(·, 0) is a completely asymmetric stable Lévy process with index α ∈ (1/2, 1). Then the shock structure is discrete a.s.

For the sake of simplicity, we will focus on the case when the initial potential is monotone increasing, i.e. is a stable subordinator. The monotone decreasing case is similar and therefore omitted. The study relies heavily on the uniform result on the rate of growth stated in Lemma 2. We first point out that the set of jump points of the initial potential, J = {y ∈ R : ψ(y, 0) ≠ ψ(y−, 0)}, contains the closed range of the inverse Lagrangian function, A.

Lemma 4. With probability one, we have A ⊆ J.

Proof. Because α < 1, we see from Lemma 2(i) that a.s., for any point y that is not a jump of the initial potential ψ(·, 0) = S·, one has
\[ \liminf_{h \to 0+} \frac{\psi(y+h,0) - \psi(y,0)}{h} = \liminf_{h \to 0+} \frac{\psi(y,0) - \psi(y-h,0)}{h} = 0. \tag{3} \]
If y ∉ J can be expressed as y = a(x) or y = a(x−) for some x ∈ R, then the function a → ψ(a, 0) − (x − a)²/2 reaches its maximum at y. By (3), this can only happen if x = y. On the other hand, because α > 1/2, we see from Lemma 2(ii) that
\[ \lim_{h \to 0+} \frac{\psi(y+h,0) - \psi(y,0)}{h^2} = \infty, \]
which entails that y cannot be the location of the maximum of the function a → ψ(a, 0) − (y − a)²/2. □

We are now able to establish Theorem 1.

Proof. We have to show that A is a discrete set a.s. Pick an arbitrary y ∈ A. Because ψ(·, 0) makes a positive jump at y, it is easy to check that y is isolated on its left in A. Note that the model of hydrodynamic turbulence makes this property completely obvious as at time t = 0, the particle located at y has a negative momentum, and thus it instantaneously collides with those that are immediately to its left. So all that we need is to check that y is also isolated on its right in A. To that end, recall that the set of jump points J can be expressed as the values taken by a countable family of stopping times, and by the strong Markov property, the path behavior of ψ(·, 0) after a stopping time is the same as after the origin. We know from
Lemma 1 that a stable process with index α < 1 has derivative zero at the origin, so with probability one, we have
\[ \lim_{h \to 0+} \frac{\psi(y+h,0) - \psi(y,0)}{h} = 0. \tag{4} \]
Suppose y is a point of accumulation in A, i.e. there is a decreasing sequence y_n = a(x_n) converging to y. In particular, we have
\[ \psi(y_n,0) - \frac{(x_n - y_n)^2}{2} \ge \psi(y,0) - \frac{(x_n - y)^2}{2}, \]
that is
\[ \frac{\psi(y_n,0) - \psi(y,0)}{y_n - y} \ge \frac{y_n + y - 2x_n}{2}, \]
and therefore
\[ \limsup_{h \to 0+} \frac{\psi(y+h,0) - \psi(y,0)}{h} \ge y - x, \]
where x denotes the limit of the decreasing sequence x_n. By (4), we must have y ≤ x. On the other hand, we know that y = a(x) by the right-continuity of the inverse Lagrangian function, so y must be the location of a maximum of the function a → ψ(a, 0) − (x − a)²/2. Again by (4), the right-derivative at y of this function is x − y ≥ 0, which forces x = y. So y must be the location of a maximum of the function a → ψ(a, 0) − (y − a)²/2, and we see from Lemma 2(ii) that this is impossible, except on an event of probability zero. We conclude that y is isolated on its right in A. □

3.2. The non-completely asymmetric case when α ∈ (1/2, 1]. We suppose here that α ∈ (1/2, 1] and that the noise is not completely asymmetric; we shall first prove that Lagrangian regular points are exceptional.

Theorem 2. Suppose that the initial potential ψ(·, 0) is a stable Lévy process with index α ∈ (1/2, 1] that is not completely asymmetric. Then for every fixed x ∈ R, the probability that x is Lagrangian regular equals zero.

Next, we will establish the existence of Lagrangian regular points.

Theorem 3. Suppose that the initial potential ψ(·, 0) is a stable Lévy process with index α ∈ (1/2, 1] that is not completely asymmetric. Then the probability that there exist Lagrangian regular points is one.

We first consider Theorem 2; by stationarity, we may suppose x = 0. The argument relies on a property that is intuitively obvious from the point of view of hydrodynamic turbulence. For α < 1, because at time t = 0 the velocity of a fluid particle is either 0 or proportional to a Dirac point mass, and as in the latter case the particle instantaneously collides with some of its neighbors, a fluid particle which has not been involved in shocks up to time t must have the same location as at the origin of time. (Of course, one has to be careful with such an informal argument: the result becomes false in the Cauchy case α = 1.)

Lemma 5. Suppose α < 1.
With probability one, if r is a Lagrangian regular point, then r = a(r) (or equivalently r = x(r)).
Proof. Recall from Lemma 1 that a.s.
\[ \lim_{h \to 0+} \frac{\psi(h,0)}{h} = 0 \quad \text{and} \quad \limsup_{h \to 0+} \frac{\psi(h,0)}{h^2} = \infty. \]
A slight variation of the proof of Theorem 1 then shows that a point r in the closure of the range of the inverse Lagrangian function, which is also a jump point of ψ(·, 0), is necessarily isolated in A, and therefore is not Lagrangian regular. Suppose r = a(x) is a Lagrangian regular point, say with x > r; and recall that the initial potential can be expressed in the form ψ(·, 0) = S⁽¹⁾(·) − S⁽²⁾(·), where S⁽¹⁾ and S⁽²⁾ are two independent stable subordinators. As ψ(·, 0) is continuous at r, we have for every h > 0,
\[ \psi(r,0) - \frac{(x-r)^2}{2} \ge \psi(r+h,0) - \frac{(x-r-h)^2}{2}, \]
that is S⁽¹⁾(r + h) − S⁽¹⁾(r) ≤ S⁽²⁾(r + h) − S⁽²⁾(r) − h(2x − 2r − h)/2. By Lemma 2(i), we may pick a sequence h_n → 0+ such that S⁽²⁾(r + h_n) − S⁽²⁾(r) = o(h_n). But then, as x > r, we would have S⁽¹⁾(r + h_n) − S⁽¹⁾(r) < 0 when n is sufficiently large, which is impossible. One proves similarly (working at the left of r) that x < r is impossible. □

We are now able to prove Theorem 2.

Proof. In the Cauchy case α = 1, we know from Lemma 1 that lim sup_{h→0+} ψ(h, 0)/h = ∞ a.s., and since ψ(·, 0) is continuous at 0, this entails that, with probability one, 0 is not Lagrangian regular. In the case α ∈ (1/2, 1), we know from Lemma 5 that if 0 is Lagrangian regular, we must have a(0) = 0. On the other hand, we know from Lemma 1 that lim sup_{h→0+} ψ(h, 0)/h² = ∞ a.s., and this entails that 0 cannot be the location of a maximum of a → ψ(a, 0) − a²/2 a.s. We conclude that the probability that 0 is Lagrangian regular is zero. □

We next turn our attention to Theorem 3, and to that end, we shall prove that local maxima of ψ(·, 0) have a positive probability of being Lagrangian regular. It is easy to deduce that Lagrangian regular points exist with probability one.

Proof. Let µ be the (a.s. unique) location of the maximum of ψ(·, 0) on [0, 1]. It is well-known that 0 < µ < 1 a.s. We deduce from Lemma 3 that
\[ \lim_{h \to 0+} \frac{\psi(\mu,0) - \psi(\mu \pm h,0)}{h} = 0 \quad \text{a.s.} \]
It follows that if Γ : [0, 1] → R denotes the concave hull of the restriction of ψ(·, 0) to [0, 1], that is if Γ is the smallest concave function with Γ(a) ≥ ψ(a, 0) for every a ∈ [0, 1], then its derivative γ = Γ′ is continuous at µ and γ(µ + h) < γ(µ) = 0 < γ(µ − h) for every sufficiently small h > 0. This implies that the support of the Stieltjes measure −dγ contains µ, and more precisely µ is neither isolated to the left nor to the right in Supp(−dγ). Then pick any x ∈ Supp(−dγ) arbitrarily close to µ. Clearly, the graph of Γ touches that of ψ(·, 0) at x, so we must have Γ(x) = ψ(x, 0) or Γ(x) = ψ(x−, 0). In both cases, x is the location of a maximum of a → ψ(a, 0) − γ(x)a on [0, 1], and a fortiori it is the unique location of the maximum of a → ψ(a, 0) − (x − γ(x) − a)²/2 on [0, 1]. Plainly, µ is also the unique location of the maximum of a → ψ(a, 0) − (µ − a)²/2 on [0, 1]. Because ψ(µ, 0) > max(ψ(0, 0), ψ(1, 0)), there is a positive probability that the preceding two maxima are global (i.e. on R) and not only local (i.e. on [0, 1]). We conclude that with positive probability, µ ∈ A and is neither isolated on its right nor on its left, and therefore is a Lagrangian regular point. □

3.3. The case α ∈ (1, 2]. Our aim in this subsection is to establish that the shock structure is discrete when α ∈ (1, 2]. In the Gaussian case α = 2, this has been proven first by Avellaneda and E [1]. Their approach relied crucially on the Girsanov theorem which enables one to add a parabolic drift to the standard Brownian motion. This cannot be done in the stable case α < 2, so we will use a completely different argument.

Theorem 4. Suppose that the initial potential ψ(·, 0) is a stable Lévy process with index α ∈ (1, 2]. Then the shock structure is discrete a.s.

Because the random set A is stationary (i.e. its law is invariant by translation), we have to prove that Card([1, 2] ∩ A) < ∞ a.s. It is easy to verify that the probability that a(x) ∈ [1, 2] for some x with |x| > n goes to zero as n → ∞ (see [5] for the rate of decay), so it suffices in fact to establish that for each fixed n,
\[ \mathrm{Card}\,\{ a(x) \in [1,2] : |x| \le n \} < \infty \quad \text{a.s.} \tag{5} \]
We first point out that when α ≥ 1, a jump point of ψ(·, 0) cannot be in A (we stress that the argument also applies in the Cauchy case α = 1).

Lemma 6. Suppose that ψ(·, 0) is a stable Lévy process with index α ∈ [1, 2). Then with probability one, ψ(·, 0) is continuous at every point in A.

Proof. Recall from Lemma 1 that
\[ \limsup_{h \to 0+} \frac{\psi(h,0) - \psi(0,0)}{h} = \limsup_{h \to 0+} \frac{\psi(-h,0) - \psi(0,0)}{h} = \infty \quad \text{a.s.} \]
By the strong Markov property (and time-reversal), we thus have with probability one
\[ \limsup_{h \to 0+} \frac{\psi(y+h,0) - \psi(y,0)}{h} = \limsup_{h \to 0+} \frac{\psi(y-h,0) - \psi(y-,0)}{h} = \infty \]
for all jump points y ∈ J . If y = a(x) or y = a(x−) for some x, and if y ∈ J is the point of, say, a positive jump of ψ(·, 0), then we have for every h > 0, ψ(y, 0) − (x − y)2 /2 ≥ ψ(y + h, 0) − (x − y − h)2 /2.
Therefore we would have
\[ \limsup_{h \to 0+} \frac{\psi(y+h,0) - \psi(y,0)}{h} \le y - x, \]
which is impossible, except on an event with probability zero. The case of a negative jump is similar, working now at the left of the jump. □

We resume our analysis of the shock structure with the observation that if a point y ∈ [1, 2] can be expressed as y = a(x) for some x ∈ [−n, n], then, as ψ(·, 0) is continuous at y, one has
\[ \psi(y \pm h, 0) < \psi(y,0) + 2nh \quad \text{for every } h \in (0, 2], \tag{6} \]
that is one can touch from above the graph of ψ(·, 0) on [y − 2, y + 2], using a vertical cone centered at y with vertices of slope ±2n. We shall estimate the probability that the preceding behavior occurs at some point y in a small interval [a, a + ε] ⊆ [1, 2]. By stationarity, this probability does not depend on the choice of a, so we may focus on the case a = 1. Let us analyze this situation from a “dynamical” point of view, i.e. considering the process X = (ψ(a, 0) + 2na, a ≥ 0) and thinking of the variable a ≥ 0 as time. Let us denote by τ the first instant after time 1 at which X reaches a new maximum, and set Ya = Xτ +a − Xτ − 4na for a ≥ 0. If (6) holds for some y ∈ [1, 1 + ε], then we must have τ ∈ [1, 1 + ε]; moreover Y cannot reach a new maximum on the time interval [1 + ε − τ, 3 − τ ], and a fortiori not on [ε, 1 + ε]. Observe also from the strong Markov property that (Xa , 0 ≤ a ≤ τ ) and Y are independent. The key step is thus provided by the following lemma that we will prove in a while. Lemma 7. There is a finite constant K > 0 such that P (X reaches a new maximum on [1, 1 + ε]) × P (Y does not reach a new maximum on [ε, 1 + ε]) ≤ εK. Indeed, it then follows from the preceding analysis that P ((6) holds for some y ∈ [1, 1 + ε]) ≤ εK, which entails by Tonelli’s theorem (assuming for simplicity that 1/ε is an integer) E (Card {k = 0, · · · , 1/ε : (6) holds for some y ∈ [1 + kε, 1 + (k + 1)ε]}) ≤ 2K, and a fortiori (5). We now proceed to the proof of Lemma 7. Proof. For simplicity, write Sa = ψ(τ + a, 0) − ψ(τ, 0), so S = (Sa , a ≥ 0) is a stable process with index α and Ya = Sa − 2na. The set of times M when Y reaches a new maximum is known as the ascending ladder time set; it is a regenerative set (i.e. the range of a subordinator). See [3], Sect. VI.1. In this setting, Y does not reach a new maximum on [ε, 1 + ε] only if M ∩ (ε, 1 + ε] = ∅. 
On the other hand, it is well-known that the probability of the latter event can be bounded from above by
\[ P\left( M \cap (\varepsilon, 1+\varepsilon] = \emptyset \right) \le \bar{\Pi}(1)\, U(\varepsilon), \]
where Π̄ denotes the tail of the Lévy measure associated to the regenerative set M and U the renewal function. This can be seen for instance from Proposition III.2 in [3]. As Π̄(1) is just a constant number, we only need an estimate of U(ε) as ε → 0+. We know from Proposition III.1 of [3] that U(ε) = O(1/Φ(1/ε)), where Φ is the Laplace exponent of the ascending ladder time set of Y. Moreover, we know from fluctuation theory (cf. [3, p. 166]) that
\[ \Phi(q) = \exp\left( \int_0^\infty \left( e^{-s} - e^{-qs} \right) s^{-1} P(Y_s \ge 0)\, ds \right). \]
Note that by the scaling property, P(S_s ≥ 0) = ρ does not depend on s (this quantity is known as the positivity parameter of the stable process S), and thus
\[ P(Y_s \ge 0) = P(S_s \ge 0) - P(0 \le S_s < 2ns) = \rho - P\left( 0 \le S_1 \le 2n s^{1-1/\alpha} \right). \]
On the one hand, we may use the classical identity
\[ \exp\left( \rho \int_0^\infty \left( e^{-s} - e^{-qs} \right) s^{-1}\, ds \right) = q^{\rho}. \]
On the other hand, the existence of bounded density for the stable law entails that as s → 0+,
\[ P\left( 0 \le S_1 \le 2n s^{1-1/\alpha} \right) = O\left( s^{1-1/\alpha} \right), \]
which in turn ensures that
\[ \int_0^\infty e^{-s} s^{-1} P\left( 0 \le S_1 \le 2n s^{1-1/\alpha} \right) ds < \infty. \]
Putting the pieces together, we get
\[ P(Y \text{ does not reach a new maximum on } [\varepsilon, 1+\varepsilon]) = O(U(\varepsilon)) = O(1/\Phi(1/\varepsilon)) = O(\varepsilon^{\rho}). \]
We then consider the event that X reaches a new supremum on [1, 1 + ε]. If we introduce the time-reversed process X̂_a = X_{(1+ε−a)−} − X_{1+ε}, then the foregoing event occurs if and only if X̂ does not reach a new maximum on [ε, 1 + ε]. On the other hand, X̂ has the same law as the process (Ŝ_a − 2na, a ≥ 0), where Ŝ = −S. Plainly, one has ρ̂ := P(Ŝ_1 ≥ 0) = 1 − P(S_1 > 0) = 1 − ρ, and we deduce from above that
\[ P(X \text{ reaches a new maximum on } [1, 1+\varepsilon]) = O(\varepsilon^{1-\rho}). \]
This completes the proof of the lemma, and therefore that of Theorem 4. □
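The classical identity used in the proof of Lemma 7 is a Frullani integral: the integral of (e^{−s} − e^{−qs})/s over (0, ∞) equals log q, so that its exponential raised to the power ρ is q^ρ. The following sketch checks this numerically with a plain midpoint rule; the function name `frullani`, the truncation point and the number of steps are illustrative choices, not quantities from the paper.

```python
import math


def frullani(q, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of
    (exp(-s) - exp(-q*s)) / s over (0, upper), for q > 0.

    The midpoint rule never evaluates the integrand at s = 0, where it
    has the finite limit q - 1; the tail beyond `upper` is neglected.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        total += (math.exp(-s) - math.exp(-q * s)) / s
    return total * h


def phi_pure_stable(q, rho):
    """exp(rho * Frullani integral), to be compared with q**rho."""
    return math.exp(rho * frullani(q))
```

For instance, `phi_pure_stable(5.0, 0.3)` can be compared with `5.0 ** 0.3`; with q = 1/ε this is the mechanism behind the O(ε^ρ) estimate, up to the bounded correction coming from the drift term.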
3.4. Absence of rarefaction intervals in the Cauchy case. Plainly, rarefaction intervals exist when the shock structure is discrete. It is also easy to see that rarefaction intervals exist for α ∈ (1/2, 1) in the non-completely asymmetric case. More precisely, denote by y the first positive location of a jump greater than 1 of ψ(·, 0). Recall that ψ(·, 0) has right-derivative zero at y; it follows that the probability that y is the unique location of the maximum of the function a → ψ(a, 0) − (x − a)²/2 is positive provided that x < y and (x − y)² < 2. Then the same argument as in the proof of Theorem 1 shows that y is isolated in A, both on its left and on its right. By the right-continuity of the inverse Lagrangian function, we see that the set {x : a(x) = y} contains a non-empty open interval.

We now consider the Cauchy case α = 1, that is we suppose that ψ(a, 0) = C_a + da, where C is a symmetric Cauchy process and d ∈ R a drift coefficient. The purpose of this subsection is to prove the absence of rarefaction intervals, that is that the locations of the fluid particles at time 1 form an everywhere dense set a.s.

Theorem 5. Suppose that the initial potential ψ(·, 0) is a Cauchy process. Then with probability one there are no rarefaction intervals.

Proof. Recall from Lemma 6 that jump times of ψ(·, 0) do not belong to A, a.s. Now suppose (x, x′) is a rarefaction interval, that is a(·) stays constant on [x, x′); denote its value by y. As y is not a jump time of ψ(·, 0), we have for all h > 0,
\[ \psi(y,0) - \frac{(x-y)^2}{2} \ge \psi(y-h,0) - \frac{(x-y+h)^2}{2}, \]
\[ \psi(y,0) - \frac{(x'-y)^2}{2} \ge \psi(y+h,0) - \frac{(x'-y-h)^2}{2}. \]
We deduce that
\[ \liminf_{h \to 0+} \frac{\psi(y,0) - \psi(y-h,0)}{h} \ge y - x \quad \text{and} \quad \limsup_{h \to 0+} \frac{\psi(y+h,0) - \psi(y,0)}{h} \le y - x'. \]
As x < x′, we may thus find a rational number q ∈ (y − x′, y − x). Then y is the location of a local maximum of a → ψ^{(q)}(a, 0) := ψ(a, 0) − qa and moreover
\[ \liminf_{h \to 0+} \frac{\psi^{(q)}(y,0) - \psi^{(q)}(y+h,0)}{h} > 0. \tag{7} \]
On the other hand, the family (ψ^{(s)}(·, 0), s ∈ Q) is a countable family of Cauchy processes. For each of these processes, we can invoke Lemma 3 to see that with probability one, for any s ∈ Q and any location µ of a local maximum for ψ^{(s)}(·, 0),
\[ \liminf_{h \to 0+} \frac{\psi^{(s)}(\mu,0) - \psi^{(s)}(\mu+h,0)}{h} = 0. \]
We conclude that (7) is impossible, except on an event of probability zero, and therefore there are no rarefaction intervals a.s. □
The absence of rarefaction intervals means that the Lagrangian function a → x(a) is continuous. On the other hand, it only increases on the set of Lagrangian regular points, and it follows from Theorem 2 and Tonelli’s theorem that the latter has Lebesgue measure zero a.s. In the terminology used by Sinai [19], one says that the Lagrangian function is a complete devil staircase.
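The staircase shape of the Lagrangian function can be inspected on a discretized potential through the convex hull characterization recalled in Sect. 2.1: x(a) is the right derivative of the convex hull of a → −ψ(a, 0) + a²/2. The sketch below is a grid approximation under that characterization, using a standard monotone-chain lower hull; the function names are hypothetical, the grid must be strictly increasing with at least two points, and the finite grid truncates the supremum in (1), which distorts values near the ends.

```python
def lower_convex_hull(points):
    """Lower convex hull of points given with strictly increasing abscissas."""
    hull = []
    for p in points:
        # Drop the last vertex while it lies on or above the chord hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def lagrangian_function(grid, psi):
    """Map each grid point a (except the last) to the slope of the hull
    segment starting at or covering a, a grid approximation of x(a)."""
    points = [(a, a * a / 2 - p) for a, p in zip(grid, psi)]
    hull = lower_convex_hull(points)
    result = {}
    j = 0
    for a in grid[:-1]:
        # Advance to the hull segment [hull[j], hull[j + 1]) that contains a.
        while hull[j + 1][0] <= a:
            j += 1
        (x1, y1), (x2, y2) = hull[j], hull[j + 1]
        result[a] = (y2 - y1) / (x2 - x1)
    return result
```

Grid points that share the same slope belong to the same clump at time 1, while a jump between consecutive slopes corresponds to a gap with no particles at the grid resolution.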
References

1. Avellaneda, M. and E, W.: Statistical properties of shocks in Burgers turbulence. Commun. Math. Phys. 172, 13–38 (1995)
2. Avellaneda, M.: Statistical properties of shocks in Burgers turbulence, II: Tail probabilities for velocities, shock-strengths and rarefaction intervals. Commun. Math. Phys. 169, 45–59 (1995)
3. Bertoin, J.: Lévy processes. Cambridge: Cambridge University Press, 1996
4. Bertoin, J.: The inviscid Burgers equation with Brownian initial velocity. Commun. Math. Phys. 193, 397–406 (1998)
5. Bertoin, J.: Large deviation estimates in Burgers turbulence with stable noise initial data. J. Stat. Phys. 91, No. 3/4, 655–667 (1998)
6. Burgers, J.M.: The nonlinear diffusion equation. Dordrecht: Reidel, 1974
7. Chorin, A.J.: Lectures on turbulence theory. Boston: Publish or Perish, 1975
8. Cole, J.D.: On a quasi linear parabolic equation occurring in aerodynamics. Quart. Appl. Math. 9, 225–236 (1951)
9. Fristedt, B.E.: Uniform local behavior of stable subordinators. Ann. Probab. 7, 1003–1013 (1979)
10. Hawkes, J.: A lower Lipschitz condition for the stable subordinator. Z. Wahrscheinlichkeitstheorie verw. Gebiete 17, 23–32 (1971)
11. Hopf, E.: The partial differential equation u_t + u u_x = µ u_xx. Comm. Pure Appl. Math. 3, 201–230 (1950)
12. Janicki, A.W. and Woyczynski, W.A.: Hausdorff dimension of regular points in stochastic flows with Lévy α-stable initial data. J. Stat. Phys. 86, 277–299 (1997)
13. Molchanov, S.A., Surgailis, D. and Woyczynski, W.A.: Hyperbolic asymptotics in Burgers' turbulence and extremal processes. Commun. Math. Phys. 168, 209–226 (1995)
14. Monrad, D. and Silverstein, M.L.: Stable processes: Sample function growth at a local minimum. Z. Wahrscheinlichkeitstheorie verw. Gebiete 49, 177–210 (1979)
15. Ryan, R.: Large-deviation analysis of Burgers turbulence with white-noise initial data. Comm. Pure Appl. Math. 51, 47–75 (1998)
16. Ryan, R.: The statistics of Burgers turbulence initialized with fractional Brownian noise data. Commun. Math. Phys. 191, 71–86 (1998)
17. Samorodnitsky, G. and Taqqu, M.S.: Stable non-Gaussian random processes: stochastic models with infinite variance. London: Chapman and Hall, 1994
18. She, Z.S., Aurell, E. and Frisch, U.: The inviscid Burgers equation with initial data of Brownian type. Commun. Math. Phys. 148, 623–641 (1992)
19. Sinai, Ya.: Statistics of shocks in solution of inviscid Burgers equation. Commun. Math. Phys. 148, 601–621 (1992)
20. Woyczynski, W.A.: Göttingen Lectures on Burgers-KPZ turbulence. Lecture Notes in Maths, Berlin–Heidelberg–New York: Springer, to appear

Communicated by Ya. G. Sinai